CSE 422S - Operating Systems Organization

CSE 422S: Studio 15

VFS Layer

All filesystems rely on the VFS to enable them not only to coexist, but also to interoperate.

Robert Love, Linux Kernel Development, 3rd Edition, Chapter 13, p. 261.

The virtual filesystem (VFS) layer allows a wide range of filesystems to be used within Linux, even if their implementation details vary significantly. Each filesystem is required to implement a common set of abstractions, which in turn allows Linux to handle them in a uniform manner. It also standardizes how a process views a filesystem, so that even kernel threads and other specialized processes can interact with different filesystems in a common and portable (at least within Linux) manner.

In this studio, you will:

Write a simple kernel module that accesses the filesystem mounted on your Raspberry Pi, via a kernel thread's process descriptor (task_struct).
Extend that kernel module to explore some of the VFS data structures, including directory entries for the current working directory and the root directory among others.
Extend your kernel module further to do the same for a userspace task.

Please complete the required exercises below, as well as any optional enrichment exercises that you wish to complete. We encourage you to work in groups of 2 or 3 people on each studio (groups may change from studio to studio), though you may complete any studio by yourself if you prefer.

As you work through these exercises, please record your answers, and when finished upload them along with the relevant source code to the appropriate spot on Canvas. If you work in a group, only one of you should upload the answers (and any other files requested) for the studio. If you need to resubmit (e.g., if a studio is marked incomplete when we grade it), the same person who submitted the studio originally should do so.

Make sure that the name of each person who worked on these exercises is listed in the first answer, and make sure you number each of your responses so it is easy to match your responses with each exercise.

Required Exercises

As the answer to the first exercise, list the names of the people who worked together on this studio.
Write a kernel module that spawns a single kernel thread. That thread should use the current macro to access its own process descriptor (struct task_struct declared in include/linux/sched.h) and print out the values (i.e., the addresses they contain) of three of its task_struct's fields to the system log: fs, files, and nsproxy.
These fields give a process direct access into the virtual filesystem. Respectively, these fields are pointers to the process's filesystem structure (struct fs_struct, declared in include/linux/fs_struct.h), its open file table structure (struct files_struct, declared in include/linux/fdtable.h), and its namespace proxy structure (struct nsproxy, a struct declared in include/linux/nsproxy.h that wraps the pointer to the mnt_namespace struct described in the textbook).
Compile your module and load it on your Raspberry Pi, examine the system log to see your module's output, and then unload it. As the answer to this exercise, please show the lines of the system log that contain your module's output, including the values of the three pointers.
Modify your code so that the kernel thread uses the fs field of its process descriptor to access two fields of the process's filesystem structure (struct fs_struct): pwd and root.
These fields are path structures (struct path, declared in include/linux/path.h) for the process's current working directory and the root directory, respectively. Each of these path structures contains two fields: mnt, which points to a VFS mount structure (struct vfsmount, declared in include/linux/mount.h), and dentry, which points to a directory entry structure (struct dentry, declared in include/linux/dcache.h).
Modify your module so that its kernel thread prints out the values (the addresses of the locations they point to) of both of those path structures' dentry fields. Your module should also check if the values of those pointers differ; if so, print out the strings in the d_iname fields of the directory entry structures to which they each point. Otherwise, if they point to the same directory entry structure, print out the string for its d_iname field.
Compile your module and load it on your Raspberry Pi, examine the system log to see your module's output, and then unload it. As the answer to this exercise, please explain, based on your module's output to the system log, whether or not (and why or why not) you think the process's current working directory is the same as its root directory.
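The pointer chain described above (task_struct to fs_struct to path to dentry) can be modeled in ordinary user-space C. The following is only a sketch under stated assumptions: the structs are stripped-down stand-ins that keep just the fields named in this exercise, and describe_dirs is a hypothetical helper that writes into a buffer instead of the system log. It is not kernel code and it is not a solution to the exercise.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified user-space stand-ins; the real kernel structs have many more fields. */
struct dentry      { char d_iname[32]; };
struct vfsmount    { int unused; };
struct path        { struct vfsmount *mnt; struct dentry *dentry; };
struct fs_struct   { struct path root, pwd; };
struct task_struct { struct fs_struct *fs; };

/*
 * Hypothetical helper: follow task->fs to the pwd and root dentries and
 * describe them in out. Returns 1 if both paths share one dentry, else 0.
 */
static int describe_dirs(const struct task_struct *task, char *out, size_t outsz)
{
    const struct dentry *pwd, *root;

    assert(task != NULL && task->fs != NULL);
    pwd = task->fs->pwd.dentry;
    root = task->fs->root.dentry;

    if (pwd != root) {
        /* Different dentry objects: report both names. */
        snprintf(out, outsz, "pwd=%s root=%s", pwd->d_iname, root->d_iname);
        return 0;
    }
    /* Same dentry object: one name describes both. */
    snprintf(out, outsz, "pwd and root are both %s", root->d_iname);
    return 1;
}
```

In a kernel module the starting point would be the current macro rather than a hand-built task_struct, and the comparison is on the dentry pointers themselves, not on the name strings.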
You will now modify your module so that it prints the names of all of the files and directories that are within the root directory. To do so, its kernel thread will have to traverse the list of directory entries whose head is in the d_subdirs field of the directory entry structure to which the path struct for the root directory points. Because d_subdirs links to the first child entry, once you have reached that entry you will then have to traverse the list of its siblings, which are linked together through the d_child field of each directory entry structure.
Special functions are needed to traverse Linux kernel data structures, as described in Chapter 6 of the LKD course textbook. Review the discussion and examples in that chapter, and use the appropriate functions in your module's kernel thread to iterate over each child entry. To do so, provide the d_subdirs field as the pointer to the list head, and give d_child as the name of the list member that links the child entries together. For each child entry, your kernel thread should print the value of its d_iname to the kernel log.
Compile your module and load it on your Raspberry Pi, examine the system log to see your module's output, and then unload it. Then, in a terminal window on your Raspberry Pi, list the contents of the root directory using the command:
ls -l /
As the answer to this exercise, please show the output from your module that contains the names of the entries in the root directory. Does this differ from the output of ls?
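The relationship between d_subdirs (the list head in the parent) and d_child (the link embedded in each child) is easier to see in a small model. The sketch below is user-space C, not kernel code: list_head, container_of, and list_for_each_entry are minimal re-implementations written for illustration, the dentry struct keeps only the fields discussed here, and dentry_init and list_children are hypothetical helpers. It also assumes a kernel version in which struct dentry has d_subdirs and d_child, as this studio describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Minimal user-space stand-ins for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Recover the enclosing struct from a pointer to its embedded list_head. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

#define list_for_each_entry(pos, head, member)                          \
    for (pos = container_of((head)->next, __typeof__(*pos), member);    \
         &pos->member != (head);                                        \
         pos = container_of(pos->member.next, __typeof__(*pos), member))

/* Simplified dentry: only the fields this exercise uses. */
struct dentry {
    char d_iname[32];
    struct list_head d_child;    /* links this entry into its parent's d_subdirs */
    struct list_head d_subdirs;  /* head of this entry's list of children */
};

/* Hypothetical helper: set up an entry and attach it to parent (if any). */
static void dentry_init(struct dentry *d, const char *name, struct dentry *parent)
{
    assert(strlen(name) < sizeof(d->d_iname));
    strcpy(d->d_iname, name);
    INIT_LIST_HEAD(&d->d_child);
    INIT_LIST_HEAD(&d->d_subdirs);
    if (parent)
        list_add_tail(&d->d_child, &parent->d_subdirs);
}

/* Append each child's name to out, space-separated; returns the child count. */
static int list_children(struct dentry *parent, char *out, size_t outsz)
{
    struct dentry *child;
    size_t used = 0;
    int count = 0;

    assert(outsz > 0);
    out[0] = '\0';
    /* Head is the parent's d_subdirs; d_child names the link inside each child. */
    list_for_each_entry(child, &parent->d_subdirs, d_child) {
        int n = snprintf(out + used, outsz - used, "%s%s",
                         count ? " " : "", child->d_iname);
        if (n < 0 || (size_t)n >= outsz - used) {
            out[used] = '\0';    /* drop a partially written name */
            break;
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

The iterator is handed two different things: the head that lives in the parent, and the name of the member that lives in each child. In the kernel the same macro comes from include/linux/list.h, and the module would print each name to the log rather than collect names in a buffer.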
Now, modify your module so that, as its kernel thread traverses the list of directory entries in the d_subdirs field of the root directory entry, it only prints the value of an entry's d_iname field if that entry's d_subdirs list is non-empty.
Compile your module and load it on your Raspberry Pi, examine the system log to see your module's output, and then unload it. Again, compare the output to the output of the command:
ls -l /
As the answer to this exercise, please show the output from your module. This time, does it differ from the output of ls? Please explain why you think there is, or is not, a difference.
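The non-empty check can be sketched in the same user-space style. Again this is a model under assumptions, not kernel code: the list primitives are simplified re-implementations (the kernel provides list_empty in include/linux/list.h), the dentry struct is a stand-in, and dentry_init and list_children_with_entries are hypothetical names. The model says nothing about which entries a real kernel would report; that is for you to observe in the system log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Minimal user-space stand-ins for the kernel's list primitives. */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* A list is empty when its head points back at itself. */
static int list_empty(const struct list_head *head) { return head->next == head; }

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

#define list_for_each_entry(pos, head, member)                          \
    for (pos = container_of((head)->next, __typeof__(*pos), member);    \
         &pos->member != (head);                                        \
         pos = container_of(pos->member.next, __typeof__(*pos), member))

/* Simplified dentry: only the fields this exercise uses. */
struct dentry {
    char d_iname[32];
    struct list_head d_child;
    struct list_head d_subdirs;
};

/* Hypothetical helper: set up an entry and attach it to parent (if any). */
static void dentry_init(struct dentry *d, const char *name, struct dentry *parent)
{
    assert(strlen(name) < sizeof(d->d_iname));
    strcpy(d->d_iname, name);
    INIT_LIST_HEAD(&d->d_child);
    INIT_LIST_HEAD(&d->d_subdirs);
    if (parent)
        list_add_tail(&d->d_child, &parent->d_subdirs);
}

/* Collect only the children whose own d_subdirs list is non-empty. */
static int list_children_with_entries(struct dentry *parent, char *out, size_t outsz)
{
    struct dentry *child;
    size_t used = 0;
    int count = 0;

    assert(outsz > 0);
    out[0] = '\0';
    list_for_each_entry(child, &parent->d_subdirs, d_child) {
        int n;

        if (list_empty(&child->d_subdirs))
            continue;            /* no child entries linked under this one */
        n = snprintf(out + used, outsz - used, "%s%s",
                     count ? " " : "", child->d_iname);
        if (n < 0 || (size_t)n >= outsz - used) {
            out[used] = '\0';    /* drop a partially written name */
            break;
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

The only change from a plain traversal is the list_empty test on each child's own d_subdirs head before its name is reported.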

Things to turn in

- The answers to the above exercises (and answers to any optional exercises you did)
- Your kernel module source code file for the required exercises
- The kernel module source code file(s), if any, for optional enrichment exercises you did

Optional Enrichment Exercises

Make a copy of your kernel module and modify that module's kernel thread so that it does a full (recursive) depth-first traversal of the mounted filesystem, starting at the root directory. When it reaches the directory entry structure for a non-directory file, it should simply print out a line to the system log with the file's name. When it reaches the directory entry structure for a directory, it should print out that directory's name and then recursively explore that directory before visiting any of the other directory entry structures within its parent directory. As the answer to this exercise please show a fragment from the system log that demonstrates depth-first traversal of the filesystem.
Make a copy of your kernel module and modify that module's kernel thread so that it does a full (recursive) breadth-first traversal of the mounted filesystem, starting at the root directory. The system log messages for this version should print out all of the directory entries within the root directory, then all the directory entries for the first subdirectory of the root directory, then all the directory entries for the second subdirectory of the root directory, etc. As the answer to this exercise, please show a fragment from the system log that demonstrates breadth-first traversal of the filesystem.
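The difference in visiting order between the two enrichment traversals can be sketched on a small in-memory tree. This is a user-space model under stated assumptions, not kernel code: the list primitives and the dentry struct are simplified stand-ins, and namebuf, emit, dentry_init, walk_dfs, walk_bfs, and MAX_QUEUE are hypothetical names introduced for illustration. Both walks start at the root and report the entries beneath it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Minimal user-space stand-ins for the kernel's list primitives. */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

#define list_for_each_entry(pos, head, member)                          \
    for (pos = container_of((head)->next, __typeof__(*pos), member);    \
         &pos->member != (head);                                        \
         pos = container_of(pos->member.next, __typeof__(*pos), member))

struct dentry {
    char d_iname[32];
    struct list_head d_child;
    struct list_head d_subdirs;
};

static void dentry_init(struct dentry *d, const char *name, struct dentry *parent)
{
    assert(strlen(name) < sizeof(d->d_iname));
    strcpy(d->d_iname, name);
    INIT_LIST_HEAD(&d->d_child);
    INIT_LIST_HEAD(&d->d_subdirs);
    if (parent)
        list_add_tail(&d->d_child, &parent->d_subdirs);
}

/* Hypothetical output sink standing in for the system log. */
struct namebuf { char text[256]; size_t used; };

static void emit(struct namebuf *b, const char *name)
{
    int n = snprintf(b->text + b->used, sizeof(b->text) - b->used,
                     "%s%s", b->used ? " " : "", name);
    if (n > 0 && (size_t)n < sizeof(b->text) - b->used)
        b->used += (size_t)n;
    else
        b->text[b->used] = '\0';    /* drop output that did not fit */
}

/* Depth-first: descend into each child before visiting its next sibling. */
static void walk_dfs(struct dentry *dir, struct namebuf *b)
{
    struct dentry *child;

    list_for_each_entry(child, &dir->d_subdirs, d_child) {
        emit(b, child->d_iname);
        walk_dfs(child, b);
    }
}

#define MAX_QUEUE 64

/* Breadth-first: report all of a directory's entries, then queue them. */
static void walk_bfs(struct dentry *root, struct namebuf *b)
{
    struct dentry *queue[MAX_QUEUE];
    size_t head = 0, tail = 0;
    struct dentry *child;

    queue[tail++] = root;
    while (head < tail) {
        struct dentry *dir = queue[head++];

        list_for_each_entry(child, &dir->d_subdirs, d_child) {
            emit(b, child->d_iname);
            if (tail < MAX_QUEUE)   /* entries beyond the queue size are not expanded */
                queue[tail++] = child;
        }
    }
}
```

For a tree with /bin/ls and /etc/passwd, walk_dfs visits bin, ls, etc, passwd, while walk_bfs visits bin, etc, ls, passwd. Kernel stacks are small compared with user-space stacks, so a kernel version of the recursive walk would need to keep its per-call footprint small, and a kernel version of the queue would need a different storage strategy than this fixed-size array.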
