Then he writes a little program called Mugshot that will take a snapshot from the pinhole camera every five seconds
or so, and compare it to the previous snapshot, and, if the difference is large enough, save it to a file. An encrypted
file with a meaningless, random name. Mugshot opens no windows and produces no output of its own, so the only
way you can tell it's running is by typing the UNIX command
ps
and hitting the return key. Then the system will spew out a long list of running processes, and Mugshot will show up
somewhere in that list.
—Neal Stephenson, Cryptonomicon, 1999
Process behavior is observable both within programs and outside of them. Today's studio will focus on debugging with the GNU debugger, which can be used to influence and observe details of processes as they run in Linux.
In this studio, you will:
Please complete the required exercises below, as well as any optional enrichment exercises that you wish to complete. We encourage you to please work in groups of 2 or 3 people on each studio (and the groups are allowed to change from studio to studio) though if you would prefer to complete any studio by yourself that is allowed.
As you work through these exercises, please record your answers, and when finished upload them along with the relevant source code to the appropriate spot on Canvas.
Make sure that the name of each person who worked on these exercises is listed in the first answer, and make sure you number each of your responses so it is easy to match your responses with each exercise.
As the answer to the first exercise, list the names of the people who worked together on this studio.
Download and install the GNU debugger on your Raspberry Pi, if it is not installed already, via:
sudo apt-get install gdb
Now also download a source file and an ARM object file or x86_64 object file that we will explore with gdb. Compile and link them via:
gcc -g gdb-matrix.c gdb-thread-arm32.o -o prog -lpthread
or
gcc -g gdb-matrix.c gdb-thread-x86_64.o -o prog -lpthread
Finally, run this program via ./prog and notice how the program terminates. Run it several times to confirm that it terminates in the same fashion every time. As the answer to this exercise, copy and paste a piece of the output showing how the program terminates.
We're going to explore the use of breakpoints and watchpoints via gdb. Breakpoints allow us to interrupt execution of the program when a specific instruction address is reached, while watchpoints allow us to interrupt execution when a certain memory address is modified. Breakpoints and watchpoints are both extremely powerful, and we are just scratching the surface of what can be done with them.
Examine the gdb-matrix.c program. As you can see, it allocates a 2-dimensional array of size NR_ROWS x NR_COLS, sets each cell to a specific value, and then iterates through the array, checking that each cell still holds its original value and printing it to the screen. However, based on the previous exercise, we see that the check fails for a certain cell.
To explore what's happening, modify the program to print out the memory address of the individual matrix cell that's failing. Print this address after the initialization of the matrix values but before the call to pthread_create.
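As a minimal sketch of the print statement described above, the helper below prints a cell's address with the %p conversion. The name print_cell_address and the choice to pass the cell by pointer are assumptions made for illustration; the actual matrix variable and the failing indices come from gdb-matrix.c and your earlier output.

```c
#include <stdio.h>

/* Hypothetical helper: print the address of one matrix cell.
 * The real gdb-matrix.c may name and lay out its matrix differently. */
static int print_cell_address(const int *cell, int row, int col)
{
    /* %p expects a void pointer, so cast explicitly */
    int n = printf("cell [%d][%d] is at address %p\n",
                   row, col, (const void *)cell);
    fflush(stdout); /* make sure the line is visible before gdb stops us */
    return n;       /* number of characters printed, negative on error */
}
```

A call such as print_cell_address(&matrix[r][c], r, c), placed just before pthread_create, would show the address once the breakpoint is reached (matrix, r, and c are placeholders here).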
Recompile and link your program. So that we can actually see the memory address, we'll now run the program with gdb and set a breakpoint on the thread creation function:
gdb ./prog
(gdb) b pthread_create
(Note that b is shorthand for the break command, which sets a breakpoint.) Now we will start our program by typing:
(gdb) run
Once the breakpoint is reached, you can exit gdb with:
(gdb) quit
As the answer to this question, show the output of the breakpoint command, and show the memory address of the failing matrix cell, which you should see once the breakpoint is reached.
Now that we have a memory address of the failing cell, we'd like to see when its value is changing. The main thread clearly is not modifying it, but we don't know about the newly created pthread, as it is executing code for which we only have a binary file, a situation that could occur in the real world if a bug exists in, say, a shared library used by your program. What we'll do now is set up a watchpoint to determine when the value of the cell changes.
Again run your program with gdb and set a breakpoint on pthread_create. Now, look at the address printed for the failing matrix cell, and set a watchpoint on it with:
(gdb) watch *0xaddr
where 0xaddr is replaced with the address of the failing cell. Note that you'll need to add the leading * before the address, which tells gdb to watch the memory at that address rather than treat it as a constant value. Now, with the watchpoint installed, continue execution of your program by typing:
(gdb) c
which is short for "continue". If this works correctly, the watchpoint should fire, showing you the address and name of the function that's running when the cell's value changes. As the answer to this exercise, write the name of the function that modifies the cell's contents, and show the new value that it writes.
The approach we've used is useful, but it requires a print statement to get the address of the failing matrix cell, and a breakpoint so that we can see the print statement. Such prints are sometimes cumbersome and can get lost when the program itself outputs a lot of data. The good news is that watchpoints allow us to find this address dynamically at runtime, without a hardcoded address. Modify your program to declare a pointer that we will dynamically set to the failing cell's address as follows:
static int * watch_addr = 0;
Next, comment out your print statement, and add a new line to assign the address of the failing matrix cell to the watch_addr variable. Then install a watchpoint on this variable with:
(gdb) watch watch_addr
and run the program. When the program hits this first watchpoint, the change in value indicates that the line of code you added to set watch_addr was executed. gdb prints pointer values in hexadecimal. The new value is the address that we want to watch, so add a second watchpoint with:
(gdb) watch *addr
where addr is the address that was assigned to watch_addr. Now continue your program until you see the value corrupted again. As the answer to this exercise, show the code that you wrote to set watch_addr.
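One possible shape for that assignment, as a sketch only: the helper name remember_cell is hypothetical, and in your program the assignment may simply be a single line placed after the matrix is initialized.

```c
/* Pointer that gdb will watch; file scope so gdb can resolve it by name.
 * Declared exactly as in the studio text. */
static int *watch_addr = 0;

/* Hypothetical helper: record the address of the cell we suspect.
 * The store to watch_addr is what triggers "watch watch_addr" in gdb. */
static void remember_cell(int *cell)
{
    watch_addr = cell;
}
```

Because the program is built with -g and no optimization flags, the store to watch_addr is kept even though the program itself never reads the pointer back.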
Finally, we can go one step further and eliminate the need to manually specify a watchpoint address. We can do this by telling gdb to treat watch_addr as a pointer to an integer, and to watch for when the value at the address it points to changes. We can do this with:
(gdb) watch *(int *)watch_addr
which has familiar C-style syntax.
Because watch_addr is initialized to 0, then later updated to the address of the corrupted cell, gdb also needs to be aware of the update, so that it can watch the new address when the value of watch_addr changes. To do so, you will additionally need to have gdb watch watch_addr itself, allowing it to propagate the update to the first watchpoint you set:
(gdb) watch watch_addr
With both watchpoints installed, run the program in gdb, and continue through the first two breaks, as these are triggered in response to watch_addr being set to the failing cell's address, then the cell being set to its initial value by your program. (Depending on how you've written your program, you might also need to continue through a break between those two, which may be triggered when the cell is allocated and set to 0.) You should then get a third break when the value is corrupted.
Finally, we can actually use yet another feature of gdb to "fix" the value of the cell to have the correct contents after it gets corrupted. We can do this by getting the address that we want to modify and changing its value:
(gdb) set *(int *)watch_addr = the_value_we_want
filling in the_value_we_want as desired. If we then continue, the program should run to completion.
As the answer to this exercise, show the last few lines of program output once it runs to completion.
The ps utility displays information about active processes. A single run of the utility provides a list of current processes, and information about them. For more information, see the online man page or issue the command:
man 1 ps
On your Raspberry Pi, write a C program that generates significant CPU usage, e.g. one that runs a simple integer or floating point operation indefinitely in a loop. Compile and run your program. With your program still running, open another terminal window, and run the ps utility with the appropriate flags to have it display your program's CPU utilization. Use the kill command to send a SIGKILL signal to terminate your program.
As the answer to this exercise, please show (1) the command you issued, (2) the column headers displayed by the ps utility, and (3) the entry for your program.
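The CPU-bound loop this exercise asks for could be sketched as follows. The function name burn_cpu and its iteration-count parameter are assumptions added here so that the loop can also be run for a bounded number of steps; the exercise itself only requires a loop that runs indefinitely.

```c
/* Hypothetical CPU-bound helper: spins on integer additions.
 * iterations == 0 means "loop forever", as the exercise asks;
 * a nonzero count bounds the loop so it can be tried out safely. */
static unsigned long burn_cpu(unsigned long iterations)
{
    /* volatile keeps the compiler from optimizing the loop away */
    volatile unsigned long counter = 0;

    for (unsigned long i = 0; iterations == 0 || i < iterations; i++)
        counter += 1;

    return counter;
}
```

Calling burn_cpu(0) from main() gives the indefinite loop, which keeps one core busy until the process is terminated from another terminal.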
The top utility provides a dynamic real-time view of a running system, including CPU and memory activity, as well as a list of processes or threads, their individual statuses, and usage information for each one. The htop utility is similar to top, but provides utilization details for each CPU core, an enhanced user interface (including the ability to scroll), and the ability to suspend and kill processes.
Run sudo apt-get install htop to install the htop utility on your Raspberry Pi. Open up a terminal window and run sudo htop to show all the userspace processes that are running (by default htop doesn't show kernel threads, but you can change that in the tool's settings if you'd like).
For more information, see the online man page or issue the command:
man 1 htop
Run your program from the previous exercise in one terminal window of your Raspberry Pi, and run htop in another terminal window. Observe what the utility shows you about (1) the CPU usage of the system in general and (2) the CPU usage of your running process. Use htop to kill your process.
As the answer to this exercise, please explain how htop can help identify a process that is using extensive CPU resources, and possibly slowing down your system. Please also explain how it compares to ps, and which utility you would prefer to use to identify and kill a misbehaving process (and why).
In order to provide easier access to kernel and process information, Linux provides the proc pseudo-filesystem. This file system, by default, resides under the /proc directory and contains individual subdirectories for each process running on the system, with multiple files in each subdirectory that expose kernel information about that process, allowing for monitoring and introspection (and in some cases, modification) with normal file I/O system calls. On Linux, the ps, top, and htop utilities use the proc pseudo-filesystem to display information.
Modify the program that generates heavy CPU usage so that, before it enters its loop, it prints its PID to standard output. Compile and run it on your Raspberry Pi.
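A small sketch of the PID printout, using the standard getpid() call. The helper name print_pid and the exact message format are illustrative choices, not part of the studio.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Print this process's PID and flush, so the line appears
 * before the busy loop starts. */
static pid_t print_pid(void)
{
    pid_t pid = getpid();
    printf("PID: %ld\n", (long)pid); /* pid_t has no printf conversion of its own */
    fflush(stdout);
    return pid;
}
```

The printed number is the name of the process's subdirectory under /proc.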
In another terminal window, navigate to the /proc/[pid] directory. Try to find the following information about the process, which can be reported by the ps and htop utilities, by direct inspection of the files in the /proc directory.
Hint: Look at the man page entry for proc(5) (or run the command man 5 proc), especially at the entries for the files /proc/[pid]/status, /proc/[pid]/stat, and /proc/uptime.
For example, the /proc/[pid]/stat file includes the utime and stime fields, which report the time the process has been scheduled in user and system mode, as well as the starttime field, which reports the time after system boot at which the process started. All three are measured in clock ticks (divide by sysconf(_SC_CLK_TCK) to convert to seconds). The start time can be compared to the contents of /proc/uptime (which reports a value in seconds) to determine the elapsed time of the process, which, in combination with the running time, can allow you to estimate CPU utilization.
As the answer to this exercise, please write the values you found for each statistic listed above, and explain how you retrieved each value.
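The estimate described above (running time divided by elapsed time) can be sketched as follows. This assumes the field positions documented in proc(5): utime is field 14, stime is field 15, and starttime is field 22, all expressed in clock ticks. The struct and function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Fields of interest from /proc/[pid]/stat, all in clock ticks. */
struct proc_times {
    unsigned long utime;          /* field 14: time in user mode   */
    unsigned long stime;          /* field 15: time in kernel mode */
    unsigned long long starttime; /* field 22: start, since boot   */
};

/* Parse one stat line. Returns 0 on success, -1 on malformed input. */
static int parse_stat_line(const char *line, struct proc_times *out)
{
    /* comm (field 2) may contain spaces or ')', so resume after the last ')' */
    const char *p = strrchr(line, ')');
    if (p == NULL)
        return -1;
    p++;

    /* The first token after ')' is field 3. */
    for (int field = 3; field <= 22; field++) {
        while (*p == ' ')
            p++;
        if (*p == '\0')
            return -1; /* line ended before field 22 */
        if (field == 14)
            out->utime = strtoul(p, NULL, 10);
        else if (field == 15)
            out->stime = strtoul(p, NULL, 10);
        else if (field == 22)
            out->starttime = strtoull(p, NULL, 10);
        while (*p != ' ' && *p != '\0')
            p++; /* skip to the end of this token */
    }
    return 0;
}

/* CPU utilization in percent over the lifetime of the process. */
static double cpu_percent(const struct proc_times *t,
                          double uptime_s, long ticks_per_s)
{
    double running_s = (double)(t->utime + t->stime) / ticks_per_s;
    double elapsed_s = uptime_s - (double)t->starttime / ticks_per_s;
    if (elapsed_s <= 0.0)
        return 0.0;
    return 100.0 * running_s / elapsed_s;
}
```

In a full program, the line would be read from /proc/[pid]/stat, uptime_s would be the first number in /proc/uptime, and ticks_per_s would come from sysconf(_SC_CLK_TCK). For example, 400 ticks of running time at 100 ticks per second over 8 seconds of elapsed time works out to 50 percent.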
/proc/[pid]/mem interface, if you did any of the corresponding optional enrichment exercises
The /proc filesystem contains a file, mem, under each process's corresponding subfolder, which is a binary image representing the virtual memory of the process. This allows a process's memory to be modified, which is useful, e.g., while debugging.
Modify the gdb-matrix program again, so that it immediately prints its PID when it enters main(). Compile and run it with gdb, obtaining the address of the failing cell, and running the program until you see the value corrupted, as before. This time, however, do not fix the value using gdb. Instead, you will fix it by writing to the process's corresponding /proc/[pid]/mem file.
Write another C program that, as command-line arguments, takes a PID and a memory address. It should open the /proc/[pid]/mem file for the given PID, printing an appropriate error message if it cannot open the file. The program should then, from the address, compute the page number and page offset for this address in the process's virtual address space. It should mmap the given page, printing an appropriate error message if the mapping fails, assign the mapped memory to a pointer, then read the memory at the page offset as an integer value. It should then print the value of this integer.
For help with the mmap system call, see the online man page or issue the command:
man 2 mmap
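The page arithmetic described above can be sketched as follows, with the page size taken from sysconf(_SC_PAGESIZE) at run time. Note that this sketch reads the integer with pread at the target address rather than with mmap; that is a substitution made here to keep the example short, not the method the exercise prescribes. All function names are hypothetical.

```c
#define _FILE_OFFSET_BITS 64 /* keep off_t wide enough for high addresses on 32-bit */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Index of the page that contains addr. */
static uintptr_t page_number(uintptr_t addr, long page_size)
{
    return addr / (uintptr_t)page_size;
}

/* Byte offset of addr within its page. */
static uintptr_t page_offset(uintptr_t addr, long page_size)
{
    return addr % (uintptr_t)page_size;
}

/* Read one int from the memory of process pid at address addr.
 * Returns 0 on success, -1 on error (after printing a message). */
static int read_int_at(pid_t pid, uintptr_t addr, int *out)
{
    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/mem", (long)pid);

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    /* In the mem file, the file offset is the virtual address itself. */
    ssize_t n = pread(fd, out, sizeof *out, (off_t)addr);
    if (n < 0)
        perror("pread");
    else if (n != (ssize_t)sizeof *out)
        fprintf(stderr, "short read from %s\n", path);
    close(fd);

    return n == (ssize_t)sizeof *out ? 0 : -1;
}
```

With a 4096-byte page, address 0x12345 falls in page 0x12 at offset 0x345. Depending on the system's ptrace restrictions, opening another process's mem file may require running as root.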
In a new terminal window, compile and run your program, passing in the address of the failing cell as reported by gdb. Compare the value of the cell, as reported by your program, to the value reported by gdb.
As the answer to this exercise, please show the output of your program, as well as the command issued to gdb (and its output) to show the value of the failing cell.
Modify your program that accesses a process's mem file so that it takes an optional third command-line argument. This argument should be an integer, which it then writes to the address supplied by the second argument. If the argument is not present, your program should proceed as before (only printing the value at the address). If the argument is present, but cannot be represented as an integer, your program should print an appropriate error message and exit.
Run your program in a new terminal window (with the gdb-matrix program running in gdb as before), and use your new program to fix the value of the cell to have the correct contents after it gets corrupted. Continue from the watchpoint in gdb, and verify that your program runs to completion.
As the answer to this exercise, show the last few lines of program output once it runs to completion.
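The check that the optional third argument "can be represented as an integer" could be sketched with strtol as follows. The helper name parse_int_arg is hypothetical, and accepting hexadecimal input through base 0 is a design choice made here, not a requirement of the exercise.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Returns 0 and stores the value if s is a valid int, -1 otherwise. */
static int parse_int_arg(const char *s, int *out)
{
    char *end = NULL;

    errno = 0;
    long v = strtol(s, &end, 0);  /* base 0 accepts decimal, 0x hex, and 0 octal */

    if (end == s || *end != '\0') /* empty string or trailing junk */
        return -1;
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;                /* does not fit in an int */

    *out = (int)v;
    return 0;
}
```

When parse_int_arg(argv[3], &value) returns -1, the program can print its error message and exit; otherwise it goes on to write value to the target address.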
Log into the school's Linux Lab Cluster, and modify your inotify program from the Observing File System Events studio to print its PID when it enters main().
Run it as you did in the previous studio, supplying it with one filename via a command line argument, and another via standard input.
In another terminal window, navigate to its corresponding /proc/[pid]/fdinfo directory, and use the ls utility to list the files in that directory. Each file corresponds to one of the process's file descriptors. Use the cat utility to print the contents of each file.
As the answer to this exercise, please show (1) the contents of the files corresponding to the inotify file descriptors printed by your program, and (2) what the information tells you about the files and the inotify watch.
Take your program from the previous exercises that opens the /proc/[pid]/mem file for a given PID, and copy it to the Linux Lab cluster (or to shell.cec.wustl.edu), and compile it there. Use the ps utility to identify a process belonging to another user (e.g. one of your partners for this studio), and use your program to attempt to read from its virtual memory (e.g. from address 0x10).
As the answer to this exercise, please tell us what happened when you tried to use your program in this way. Was it successful? What error message did it provide? What are the security implications of this kind of access to process memory?
Page updated Saturday, January 1, 2022, by Marion Sudvarg and Chris Gill.