Then he writes a little program called Mugshot that will take a snapshot from the pinhole camera every five seconds
or so, and compare it to the previous snapshot, and, if the difference is large enough, save it to a file. An encrypted
file with a meaningless, random name. Mugshot opens no windows and produces no output of its own, so the only
way you can tell it's running is by typing the UNIX command
and hitting the return key. Then the system will spew out a long list of running processes, and Mugshot will show up somewhere in that list.
—Neal Stephenson, Cryptonomicon, 1999
Process behavior is observable both within programs and outside of them. Today's studio will focus on debugging with the GNU debugger, which can be used to influence and observe details of processes as they run in Linux.
In this studio, you will:
Please complete the required exercises below, as well as any optional enrichment exercises that you wish to complete. We encourage you to work in groups of 2 or 3 people on each studio (groups may change from studio to studio), though you may complete any studio by yourself if you prefer.
As you work through these exercises, please record your answers, and when finished upload them along with the relevant source code to the appropriate spot on Canvas.
Make sure that the name of each person who worked on these exercises is listed in the first answer, and make sure you number each of your responses so it is easy to match your responses with each exercise.
As the answer to the first exercise, list the names of the people who worked together on this studio.
Download and install the GNU debugger on your Raspberry Pi, if it is not installed already, via:
sudo apt-get install gdb
Now also download a source file and either an ARM object file or an x86_64 object file, which we will explore with gdb. Compile and link them with whichever of the following commands matches your platform:
gcc -g gdb-matrix.c gdb-thread-arm32.o -o prog -lpthread
gcc -g gdb-matrix.c gdb-thread-x86_64.o -o prog -lpthread
Finally, run this program via
./prog and notice how the program
terminates. Run it several times to confirm that it terminates in the same fashion every time. As the answer to this exercise, copy and
paste a piece of the output showing how the program terminates.
We're going to explore the use of breakpoints and watchpoints via gdb. Breakpoints allow us to interrupt execution of the program when a specific instruction address is reached, while watchpoints allow us to interrupt execution when a certain memory address is modified. Breakpoints and watchpoints are both extremely powerful, and we are just scratching the surface of what can be done with them.
Look at the gdb-matrix.c program. As you can see, it
allocates a 2-dimensional array of size
NR_ROWS x NR_COLS, sets
each cell to a specific value, and then iterates through the array, checking
that each cell still has its original value and printing it to the screen. However,
based on the previous exercise, we see that a certain cell is failing.
To explore what's happening, modify the program to print out the memory address
of the individual matrix cell that's failing. Print this address after the initialization of the
matrix values but before the call to
Recompile and link your program. Now, so that we can actually see the memory
address, we'll run our program with gdb and set a breakpoint on the thread
(gdb) b pthread_create
(Note that b is shorthand for break, the gdb command that sets a breakpoint.) Now we will start our program by typing:
Once the breakpoint is reached, you can kill the gdb execution with:
As the answer to this question, show the output of the breakpoint command, and show the memory address of the failing matrix cell, which you should see once the breakpoint is reached.
Now that we have a memory address of the failing cell, we'd like to see when its value is changing. The main thread clearly is not modifying it, but we don't know about the newly created pthread, as it is executing code for which we only have a binary file, a situation that could occur in the real world if a bug exists in, say, a shared library used by your program. What we'll do now is set up a watchpoint to determine when the value of the cell changes.
Again run your program with gdb and set a breakpoint on
look at the value printed for the failing matrix cell, and set a watchpoint on it with:
(gdb) watch *0xaddr
where 0xaddr is replaced with the address of the failing cell. Note that
you'll need the leading * before the address, which tells gdb to watch the
memory at that address rather than treat it as a constant value. Now, with the watchpoint
installed, continue execution of your program by typing:
which is short for "continue". If this works correctly, the watchpoint should fire, showing you the address and name of the function that's running when the cell's value changes. As the answer to this exercise, write the name of the function that modifies the cell's contents, and show the new value that it writes.
The approach we've used is useful, but it requires a print statement to get the address of the failing matrix cell, and a breakpoint so that we can see the print statement. Such prints are sometimes cumbersome and can get lost when the program itself outputs a lot of data. The good news is that watchpoints allow us to query this dynamically at runtime without a hardcoded address. Modify your program to declare a pointer that we will dynamically set to the failing cell's address as follows:
static int * watch_addr = 0;
Next, comment out your print statement, and add a new line that assigns the address of the failing matrix cell to the watch_addr variable. Then install a watchpoint on this variable with:
(gdb) watch watch_addr
and run the program. Once the program hits the first watchpoint, the new value
indicates that the line of code you added to set
watch_addr was hit. gdb
outputs the value in base 10 rather than binary. This new value is the address that we
want to watch, so add a second watchpoint with:
(gdb) watch *addr
where addr is the address that was assigned to
watch_addr. Now continue your program
until you see the value corrupted again, and as the answer to this exercise, show the code that
you wrote to set
Finally, we can go one step further and eliminate the need to manually specify a watchpoint address.
We can do this by telling gdb to treat
watch_addr as a pointer to an integer,
and to watch for changes to the integer that it points to. The command is:
(gdb) watch *(int *)watch_addr
which has familiar C-style syntax.
Because watch_addr is initialized to 0 and only later updated to the address of the corrupted cell,
gdb also needs to be aware of the update, so that it can watch the new address when the value of
watch_addr is changed.
To do so, you will need to additionally have gdb watch
allowing it to propagate the update to the first watchpoint you set:
(gdb) watch watch_addr
With both watchpoints installed, run the
program in gdb, and continue through the first two breaks, which are
triggered by
watch_addr being set to the address of the failing cell,
and then by the cell being set to its initial value by your program.
(Depending on how you've written your program, you might also need to continue through a break between those two,
which may be triggered when the cell is allocated and set to 0.)
You should then get a third break when the value is corrupted.
Finally, we can actually use yet another feature of gdb to "fix" the value of the cell to have the correct contents after it gets corrupted. We can do this by getting the address that we want to modify and changing its value:
(gdb) set *(int *)watch_addr = the_value_we_want
replacing the_value_we_want as desired. If we then continue,
the program should run to completion.
As the answer to this exercise, show the last few lines of program output once it runs to completion.
The ps utility displays information about active processes.
A single run of the utility provides a list of current processes,
and information about them.
For more information, see the online man page
or issue the command:
man 1 ps
On your Raspberry Pi, write a C program that generates significant CPU usage, e.g. one that runs a simple integer or floating point operation indefinitely in a loop. Compile and run your program.
With your program still running, open another terminal window,
and run the
ps utility with the appropriate flags to have it display your program's CPU utilization.
Then use the kill command to send a
SIGKILL signal to terminate your program.
As the answer to this exercise, please show (1) the command you issued,
(2) the column headers displayed by the
and (3) the entry for your program.
The top utility provides a dynamic real-time view of a running system,
including CPU and memory activity, as well as a list of processes or threads,
their individual statuses, and usage information for each one.
The htop utility is similar to
but provides utilization details for each CPU core,
an enhanced user interface (including the ability to scroll),
and the ability to suspend and kill processes.
Run sudo apt-get install htop to install the
htop utility on your Raspberry Pi. Open up a terminal window and run
sudo htop to show all the userspace processes that are running (by default
htop doesn't show kernel threads, but you can change that in the tool's settings if you'd like).
For more information, see the online man page or issue the command:
man 1 htop
Run your program from the previous exercise in one terminal window of your Raspberry Pi,
and htop in another terminal window.
Observe what the utility shows you about (1) the CPU usage of the system in general and
(2) the CPU usage of your running process.
Then use htop to kill your process.
As the answer to this exercise, please explain how
can help identify a process that is using extensive CPU resources,
and possibly slowing down your system.
Please also explain how it compares to
and which utility you would prefer to use to identify and kill a misbehaving process (and why).
In order to provide easier access to kernel and process information,
Linux provides the proc pseudo-filesystem.
This file system, by default, resides under the
and contains individual subdirectories for each process running on the system,
with multiple files in each subdirectory that expose kernel information about that process,
allowing for monitoring and introspection (and in some cases, modification)
with normal file I/O system calls.
On Linux, the
utilities use the proc pseudo-filesystem to display information.
Modify the program that generates heavy CPU usage so that, before it enters its loop, it prints its PID to standard output. Compile and run it on your Raspberry Pi.
In another terminal window, navigate to the
Try to find the following information about the process,
which can be reported by the
by direct inspection of the files in the
Hint: Look at the man page entry for proc(5)
(or run the command
man 5 proc), especially at the entries for the files
For example, the
/proc/[pid]/stat file includes the
fields, which report the time (in clock ticks; divide by sysconf(_SC_CLK_TCK) to convert to seconds) the process has been scheduled in user and system mode,
as well as the
starttime field, which reports the time after system boot (also in clock ticks) at which the process started.
This can be compared to the contents of
/proc/uptime (which reports a value in seconds)
to determine the elapsed time of the process, which, in combination with the running time,
can allow you to estimate CPU utilization.
As the answer to this exercise, please write the values you found for each statistic listed above, and explain how you retrieved each value.
/proc/[pid]/mem interface, if you did any of the corresponding optional enrichment exercises
The /proc filesystem contains a file,
under each process's corresponding subfolder,
which is a binary image representing the virtual memory of the process.
This allows a process's memory to be modified,
which is useful, e.g., while debugging.
Modify the gdb-matrix program again,
so that it immediately prints its
PID when it enters
Compile and run it with gdb,
obtaining the address of the failing cell,
and running the program until you see the value corrupted, as before.
This time, however, do not fix the value using gdb.
Instead, you will fix it by writing to the process's corresponding
Write another C program that, as command-line arguments, takes a PID and a memory address.
It should open the
/proc/[pid]/mem file for the given PID,
printing an appropriate error message if it cannot open the file.
The program should then use the address to compute the page number and page offset
for this address in the process's virtual address space.
It should then mmap the given page,
printing an appropriate error message if the mapping fails,
assign the mapped memory to a pointer,
and read the memory at the page offset as an integer value.
Finally, it should print the value of this integer.
For help with the
mmap system call,
see the online man page
or issue the command:
man 2 mmap
In a new terminal window, compile and run your program, passing in the address of the failing cell as reported by gdb. Compare the value of the cell, as reported by your program, to the value reported by gdb.
As the answer to this exercise, please show the output of your program, as well as the command issued to gdb (and its output) to show the value of the failing cell.
Modify your program that accesses a process's
so that it takes an optional third command-line argument.
This argument should be an integer, which it then writes to the address supplied by the second argument.
If the argument is not present, your program should proceed as before (only printing the value at the address).
If the argument is present, but cannot be represented as an integer,
your program should print an appropriate error message and exit.
Run your program in a new terminal window
(with the gdb-matrix program running in
gdb as before),
and use your new program to fix the value of the cell to have the correct contents after it gets corrupted.
Continue from the watchpoint in gdb, and verify that your program runs to completion.
As the answer to this exercise, show the last few lines of program output once it runs to completion.
Log into the school's Linux Lab Cluster,
and modify your inotify program from the Observing File System Events studio
to print its PID when it enters
Run it as you did in the previous studio, supplying it with one filename via a command line argument, and another via standard input.
In another terminal window, open its corresponding
Use the ls utility to list the files in that directory. Each file corresponds to one of the process's file descriptors. Use the
cat utility to print the contents of each file.
As the answer to this exercise, please show (1) the contents of the files corresponding to the inotify file descriptors printed by your program, and (2) what the information tells you about the files and the inotify watch.
Take your program from the previous exercises that opens the
/proc/[pid]/mem file for a given PID,
and copy it to the Linux Lab cluster (or to
shell.cec.wustl.edu), and compile it there.
Use the ps utility to identify a process belonging to another user (e.g. one of your partners for this studio),
and use your program to attempt to read from its virtual memory (e.g. from address
As the answer to this exercise, please tell us what happened when you tried to use your program in this way. Was it successful? What error message did it provide? What are the security implications of this kind of access to process memory?
Page updated Saturday, January 1, 2022, by Marion Sudvarg and Chris Gill.