due 2022-11-20
In this lab, we will add the file system support to mCertiKOS. We have already implemented an IDE disk driver and some parts of the file system for you. You will extend the mCertiKOS scheduler with the ability to make threads sleep and wake threads up; this is necessary for the implementation of a more efficient file system. Then you will be asked to implement various parts of the file system across multiple abstraction layers.
* In this lab, you will be reading many file system related code we have (partially) implemented for you. Please START EARLY.
In this and future labs you will progressively build up your kernel.
We will also provide you with some additional source.
To fetch that source,
use Git to commit changes you've made since handing in lab 4 (if any),
fetch the latest version of the course repository.
Then create a local branch called lab5
based on our lab5
branch, origin/lab5
. Note that for this assignment, you
SHOULD NOT merge the lab5 branch with your previous lab4 branch.
We have provided all the code you would need to get started.
$ cd ~/cpsc422/mcertikos
$ /c/cs422/apps/syncrepo.sh lab5
lab5 not set up yet, fetching new branch from release
remote: Counting objects: 105, done.
remote: Compressing objects: 100% (104/104), done.
remote: Total 105 (delta 55), reused 1 (delta 0)
Receiving objects: 100% (105/105), 39.69 KiB | 0 bytes/s, done.
Resolving deltas: 100% (55/55), completed with 46 local objects.
From /c/cs422/repo/mcertikos
8edaa5e..4b96a41 lab4 -> release/lab4
* [new branch] lab5 -> release/lab5
create new branch lab5
Switched to a new branch 'lab5'
set up push path
Counting objects: 105, done.
Delta compression using up to 20 threads.
Compressing objects: 100% (49/49), done.
Writing objects: 100% (105/105), 39.69 KiB | 0 bytes/s, done.
Total 105 (delta 55), reused 105 (delta 55)
To /c/cs422/SUBMIT/lab/netid.git/
* [new branch] lab5 -> lab5
Branch lab5 set up to track remote branch lab5 from origin.
$ git pull
Already up-to-date
Include the following information in the file
called README
in the mcertikos
directory:
who you have worked with; whether you coded this assignment together,
and if not, who worked on which part;
and brief description of what you have implemented;
and anything else you would like us to know. When you are ready to hand in your lab code,
add the new files to git, commit your changes, and then run git push
in
the mcertikos
.
$ git commit -am "finished lab5"
[lab5 a823de9] finished lab5
4 files changed, 87 insertions(+), 10 deletions(-)
$ git push
Exercise 1
The implementation of the file system in mCertiKOS is based on the file system in xv6. Read the Chapter 6, "File system", of the xv6 book carefully to get the overall idea of the file system implementation in xv6.
In user/fstest/fstest.c
, we have implemented a user process
that tests various aspects of the file system.
Once you have implemented required modules in this lab, you should expect to see
the following test outputs from the fstest
program (some tests
are big and may run a little bit slowly):
*As in the previous assignments, the provided tests will not pass until you implement required modules.
*******usertests starting*******
=====test file usertests.ran does not exists=====
=====test file usertests.ran does not exists: ok
=====small file test=====
creat small succeeded; ok, fd: 0
writes ok
open small succeeded ok
read succeeded ok
=====small file test ok=====
=====big files test=====
=====big files ok=====
=====many creates, followed by unlink test=====
=====many creates, followed by unlink; ok=====
=====rmdot test=====
=====rmdot ok=====
=====fourteen test=====
=====fourteen ok=====
=====bigfile test=====
=====bigfile test ok=====
=====subdir test=====
=====subdir ok=====
=====linktest=====
=====linktest ok=====
=====unlinkread test=====
=====unlinkread ok=====
=====dir vs file=====
=====dir vs file OK=====
=====empty file name=====
=====empty file name OK=====
=====bigdir test=====
=====bigdir ok=====
*******end of tests*******
To manipulate the files, we need a disk to store the files and directories.
We cannot directly use a linux disk partition because linux and mCertiKOS implement different
disk and file system structures. Instead, we have created a disk image
newfs/certikos_disk_new.img
which can be fed into the qemu as an additional disk.
DO NOT modify this file, since you will repeatedly use it as the empty disk.
That image file is already structured in a way that implements the mCertiKOS file system
structures and represents a disk with no files in it. To use the disk, you just need to
copy the file into the project root directory and rename it to certikos_disk.img
.
The line 125
of the Makefile
already does this job for you whenever
you run make qemu
or make qemu-nox
.
This means whenever you starts mCertiKOS, you will already start with a disk with no files.
This does not mean that our file system is not persistent. When you exit from qemu, all of
your new changes are already saved into certikos_disk.img
. We just manually
overwrite it with the empty one so that the above test programs can pass on this empty disk.
Once you are done with this lab, feel free to comment out the line in the Makefile and play
with the persistent disk.
Before we dive into the disk driver and file system implementation, we
first have to add more scheduler support to mCertiKOS. We will
implement two functions thread_sleep
and thread_wakeup
that allow a thread to wait on some
resources by temporarily going into sleep, which can be waken up later
when the corresponding resource is ready. For example, disk writes
are asynchronous. When you write some blocks into disk, even though
the function immediately returns, the disk requires some time to
finish the operation. One naive implementation of a disk write would
be to simply busy-wait until the operation isuering complete, by
repeatedly querying the disk to see whether the operation is complete.
As you can imagine, the downside of this approach is the
performance. You basically waste many CPU cycles by doing this
unnecessary busy waiting. Instead, mCertiKOS takes the interrupt
driven approach. When mCertiKOS finds that the operation is not
finished, it will put the current thread into sleep. We configure the
hard disk so that it will issue a special device hardware interrupt
when it finishes the disk write operation. Then in the corresponding
interrupt handler in mCertiKOS, we wake up all the threads that were
waiting upon this signal. This brings us better performance as now
the scheduler can schedule and run another thread while that
particular thread is in sleep.
On the other hand, this synchronization mechanism can be used in many
different scenarios. Thus we need to record the particular reason a
thread is put into sleep so that later we only wake up the relevant
threads. The object that implements this kind of connections is called
a channel. Each channel should be uniquely identifiable by an
identifier. In mCertiKOS, we utilize the fact that each variable in
the kernel has a unique physical address. Thus, you can allow threads
to sleep on address of arbitrary objects in the kernel. If you
checkout the new TCB
structure
in kern/PTCBIntro/PTCBIntro.c
, it now has a new attribute
called channel
indicating which address your thread is
currently sleeping on. We use the value 0
to indicate
that the thread is not waiting on any object.
In mCertiKOS, we use locks to protect the critical sections in the
kernel. When you put the current thread into sleep, you have to
release the relevant locks so that other thread can enter the critical
sections. However, you cannot release it before you call the
sleep. This will allow other threads to enter the critical section
even before you execute sleep, which is not what we want. Instead, we
should first acquire the scheduler lock before we release the other
lock to make sure no other threads can enter the scheduler and they
can only be able to access the critical section after I go to
sleep. This means that the lock release operation has to be performed
within
thread_sleep
. This is why the thread_sleep
in
kern/thread/PThread/PThread.c
takes a lock as the second argument.
Exercise 2
Based on the information above, implement
thread_sleep
, andthread_wakeup
inkern/thread/PThread/PThread.c
. Inthread_sleep
you need to make sure the following:
- The lock is released only after you acquired the scheduler lock.
- The scheduler lock is released before you switch to another thread and reacquired after the switch returns.
- The scheduler lock is released and the lock is reacquired before you leave the function. (how should they be ordered?)
Exercise 3
As described above, when mCertiKOS receives the interrupt signal (
IRQ_IDE1
) from the disk, we should notify the disk driver to wake up all the threads waiting for this signal. The corresponding interrupt handleride_intr
is implemented inkern/dev/disk/ide.c
. Add another case in the interrupt handler to handle this disk interrupt by callingide_intr
. After that, you should also callintr_eoi
to send the "end of interrupt" signal back to the disk so that it knows we have already received and handled the interrupt. Unfortunately, the disk sometimes also generates additional spurious interrupt signals (IRQ_IDE2
). If we do not handle this signal, based on the implementation of the current interrupt handler, it will go to the default interrupt handler which immediately callsintr_eoi
and returns. This may mislead the disk to believe that we have already handled the disk interrupt signals correctly. Add another case to handle this spurious interrupts by simply returning without callingintr_eoi
.
kern/dev/disk/ide.c
.
Exercise 4
Skim through the code to try to get the general idea in the implementation. You will find the "Drivers" section on page 45 of the xv6 book helpful.
The file system in mCertiKOS (kern/fs/
) is implementated in eight layers.
We will examine the layers one by one, filling in some missing implementations.
A buffer cache has two jobs:
Files: bufcache.h
, bufcache.c
.
Dependent files:
params.h
-- file system parameters.kern/lib/buf.h
-- definition of buffer data structure.kern/dev/disk/ide.h
-- IDE disk driver, which reads and writes blocks on an
IDE hard drive.Exercise 5
Skim through the code to try to get the general idea in the implementation. It is OK if you do not fully understand all the details.
This layer allocates raw disk blocks. The file system divides the disk blocks into several sections. Block 0 holds the boot sector, so it cannot used by the file system. Block 1 is called the superblock. It contains metadata about the file system (the file system size in blocks, the number of data blocks, the number of inodes, and the number of blocks in the log). Blocks starting at 2 hold inodes, with multiple inodes per block. After those come bitmap blocks tracking which data blocks are in use. Most of the remaining blocks are data blocks; each is either marked free in the bitmap block, or holds content for a file or directory. The blocks at the end of the disk hold the logging layer’s log.
Files: block.h
, block.c
Dependent files:
stat.h
-- defines the file statistics data structure.dinode.h
-- defines the on-disk inode data structure, containing a size and
an array of block numbers.Exercise 6
Skim through the code to try to get the general idea in the implementation. It is OK if you do not fully understand all the details.
Simple version logging mechanism for crash recovery. It allows higher layers to wrap multi-step updates to several blocks in a single transaction, to ensure that blocks are updated atomically.
Files: log.h
, log.c
Exercise 7
Skim through the code to try to get the general idea in the implementation. It is OK if you do not fully understand all the details.
Inode allocator, reading, writing, metadata. This layer provides unnamed files, each represented using an inode and a sequence of blocks holding the file's data.
Files: inode.h
, inode.c
Dependent files:
stat.h
-- defines the file statistics data structureExercise 8
Skim through the code to try to get the general idea in the implementation. It is OK if you do not fully understand all the details.
A directory is an inode with special contents (i.e., a list of other
inodes). Its inode has type T_DIR and its data is a sequence of directory
entries. Each entry is a struct dirent
(see kern/fs/dir.h
),
which contains a name and an inode number.
Files: dir.h
, dir.c
Exercise 9
Implement
dir_lookup
anddir_link
inkern/fs/dir.c
.In
dir_lookup
, you are given an inode pointer (dp
) for the current directory, the name of the subdirectory to look up, and a pointer (poff
) to write found byte offset to. You should loop through all the subdirectory entries ofdp
by callinginode_read
, and find the one that is allocated (theinum
is not 0) and has the matching name. Then you write the offset of the found inode to the addresspoff
and return the pointer to the inode that the subdirectory entry corresponds to by callinginode_get
with the inode number in the found subdirectory entry. If not found, it should return 0 (null pointer). Note that thesize
attribute in the inode structure represents the total sizes, in bytes, of the subdirectory entries, not the number of subdirectory entries. Also, theinode_read
takes an offset in bytes, not the index of the subdirectory entry.In
dir_link
, you are given an inode pointer (dp
) for the current directory, and the name and inode number for the new subdirectory you are going to link. The first thing you should do is to check whether that subdirectory already exists in the current directory (withdir_lookup
above), and if so, callinode_put
on the inode returned by thedir_lookup
and return -1. Recall thatdir_lookup
callsinode_get
in the end, which creates a reference to the in-memory inode structure and increases the reference counter of that inode. Thus, in the case of error, we should callinode_put
to drop that unnecessary reference we have created. After the error check, you can loop through the subdirectory structures ofdp
to find the first unallocated inode (one with theinum
field being 0). Then copy the name and inum to the subdirectory structure found and issue ainode_write
with the new subdirectory entry and the offset found. You can assume there will always be an unallocated subdirectory entry (or you can also callKERN_PANIC
with appropriate error message if not). The function should return 0 in the case of success.
This layer provides paths like /kern/fs/file.c for convenient naming, and
resolves them with recursive lookup. Path name lookup involves a succession of
calls to dir_lookup
, one for each path component. The function
namei
evaluates path
and returns the corresponding
inode. The function nameiparent
is just a variant: it stops before
the last element, returning the inode of the parent directory and copying the
final element into name. Both call the generalized function namex
to do the real work.
The function namex
starts by deciding where the path evaluation
begins. If the path begins with a slash, evaluation begins at the root;
otherwise, the current directory. Then it has to use skipelem
to
consider each element of the path in turn. Each iteration of the loop must look
up name in the current inode ip
.
Files: path.h
, path.c
Exercise 10
Based on the information above, implement the functions
skipelem
andnamex
.The function
skipelem(path,name)
copies the next path element from path into name. We will use it later innamex
to evaluate the succesive elements in the path. It should have the following behavior:
skipelem("a/bb/c", name)
="bb/c"
, setting name to"a"
skipelem("///a//bb", name)
="bb"
, setting name to"a"
skipelem("a", name)
=""
, setting name to"a"
skipelem("", name)
=skipelem("////", name)
=0
For the implementation, first ignore all leading slash characters in the path by advancing the path pointer until we find a non-slash character. Check that there is something left to process in the new path, and return 0 otherwise. We want to obtain the first group of non-slash characters in the path, and copy them to name. As before, we can advance the pointer until the desired position, but we do not have to forget to record a copy of the original path from where to obtain the first path element. With these two pointers we can obtain the length
len
of this first path element using pointer arithmetic. We only have to copylen
characters to name. Remember to check the caselen >= DIRSIZ
, and to put 0 at the end of the name string. Finally, as we did at the beginning, ignore the trailing slashes in path.For the implementation of
namex
, first take a look at the skeleton code. You will have to process one element inpath
at a time with the help ofskipelem
.Make sure that:
- Each iteration begins by locking
ip
and checking that it is a directory. If not, the lookup fails. (Lockingip
is necessary not becauseip->type
can change underfoot —it can’t— but because untilinode_lock
runs,ip->type
is not guaranteed to have been loaded from disk.)- If
ip
's type is not a directory, return 0.- If the call is for a parent inode (that is, when
nameiparent
is true) and this is the last path element (first character in path is'\0'
), the loop stops early, as per the definition ofnameiparent
. We need to copy the final path element intoname
, sonamex
need only return the unlockedip
.- Then, the loop has to look for the path element using
next = dir_lookup(ip, name, 0)
and prepare for the next iteration by settingip = next
. When the loop runs out of path elements, it returnsip
.- Use
inode_put
every time the inode pointerip
has to be released.- Access to the inode pointer
ip
under evaluation has to be protected with a lock. For every element of the path, lockip
after obtaining the element, and unlock it before returning fromnamex
. The functioninode_unlockput
may be useful here, which is a shortcut to unlock and put.Check the comments on the code for more details.
Each thread has its own table of open files, or file descriptors. A file descriptor is represented by a struct file, which is a wrapper around an inode.
Files: file.h
, file.c
Exercise 11
Read the code carefully and try to understand the code.
This layer is for file system system calls. Most of these syscall functions are just wrappers around the functions in the file descriptor layer, taking care of correctly fetching the syscall arguments.
Files: sysfile.h
, sysfile.c
Dependent files:
fcntl.h
-- defines file modesRecall that the implementation of the system call handlers you have been working on
only had integer arguments. (The system call handler sys_puts
for printf passes the pointer
to the string constant in the user space, but we have previously implemented the code
for you.) When the kernel receives a pointer from a user, you have to be very careful.
A user may pass in a bad pointer. If you allow the user to pass in a pointer to arbitrary
address (which is possible in C), the user may pass a pointer pointing to kernel data structures
or kernel stack, making the kernel crash or even worse, taking control over the kernel.
Thus, on each system call handler receiving pointer arguments, we should always check that
the pointer does not point to kernel part of the data in memory. Check the implementation of
sys_puts
in kern/trap/TSyscall/TSyscall.c
as an example.
We should always insist that the users pass in not only the pointer, but also the length of the content.
Assume that we have sys_open
system call that passes a pointer to the string containing file name.
It may be attempting to just pass in that pointer, and in the kernel you keep reading it until you find '\0'
.
However, this is very dangerous. The user can actually pass a valid pointer in the user memory, which is close
to the kernel-user memory boundary. If the user did not write any '\0'
in the user memory,
then you will continue to read the kernel part of the memory, thus possibly leaking some secure kernel information
to the user, e.g., through error messages. Instead we always insist the user to pass in both the pointer and
the length of the content. Then we check that the entire region is located in the user part of the memory.
On the other hand, e.g., for sys_open
, it would not be a good user experience if you always
have to pass in both the name of the file and the length of the name, since the length can simply be computed
from the name. Instead, we still expose a nice system call interface to the user, but when you actually
traps into the kernel (user/include/syscall.h
), we automatically add the length of the string
as an extra argument. The length of the string can be computed using our user level string library function
strlen
defined in user/lib/string.c
.
The pointer passed from the user is a pointer to the user's virual memory space, while
in mCertiKOS, kernel always have identity page map, thus directly manipulating the
physical address. This means, when the data in the user space is stored in multiple continuous pages,
it may no longer be mapped to continuous physical pages. Thus it would be very cumbersome to directly
manipulate the data in the user space. Instead, we insist that all the system call handlers only manipulate
in-kernel data structures. For example, when the user passes a pointer to some data for read, we create a local
in-kernel buffer, and copy the data from the user into the buffer. If the user passes some pointer for write purpose,
we let the handlers manipulate a in-kernel buffer instead, and at the end, we copy the content from the in-kernel
buffer to the address the pointer points to. The functions that copies data from and to the user's memory
are already implemented in kern/lib/pmap.c
. They already take care of the multiple-page issue
and copy data from and to multiple non-contiguous pages if necessary. As our kernel stack is restricted
to one page, any in-kernel buffer that is large should better be defined at the global scope, to prevent
the kernel stack overflow.
We should limit the length of the data that can be passed from the user. Currently, we restrict the length of the file name to 128 characters and the size of the file buffer to 10000 bytes. If the user passes in a lengh that exceeds its limit, we should reject it and raise appropriate error.
Exercise 12
In
user/include/syscall.h
, we have some system calls that takes a string constant as arguments but does not pass its length to the kernel. Those are,sys_link
,sys_unlink
,sys_open
,sys_mkdir
, andsys_chdir
. Modify their implementations to pass in extra arguments for the lengths of corresponding string inputs. Note that you should not change the signatures of those functions, instead, pass in extra arguments to theint
inline assembly command. Then modify corresponding system call handlers inkern/fs/sysfile.c
so that they take the extra length arguments and copy only required data of given length instead of copying 128 characters every time. Don't forget the trailing '\0' in the end of the string.
Exercise 13
Implement the following system call handlers in
kern/fs/sysfile.c
related to the file system, considering all the issues explained above. Note that the system call handlers should never go wrong. Thus, you should consider all possible sources of errors and return appropriate error codes to the user. Checkkern/thread/PTCBIntro/PTCBIntro.c
to see how to manipulate the set of open files for each thread.
fdalloc
(forsys_open
)sys_read
sys_write
sys_close
sys_fstat
Exercise 14
Read the rest of the code we have provided for you, and understand what they are doing.
Exercise 15
Though we only booted all the processes on one CPU in our test setup, our file system implemention should already support concurrency. However, those in-kernel buffers defined at the global level (e.g., the buffers for file read/write) can be simultaneously accessed by multiple processes running on different CPUs. Add appropriate spinlocks to protect the relevant critical sections by preventing others to access the buffers when you are manipulating them.
Now we have implemented a file system support in the mCertiKOS. In this part, you are asked to implement a unix-like shell on top of your file system implementation.
Exercise 16
Implement a user process called
shell
, and spawn the new process inkern/init/init.c
instead of the previousfstest
. Feel free to reuse any of the existing user processes, or create a new one.
Exercise 17
Implement a unix-like shell in the new user process. You need to first implement necessary system calls to read appropriate input from the user. Feel free to reuse some of the existing functions in
kern/dev/console.c
. Then implement at least the following set of commands:The commands should support both file and directory types, which means that you need to support the
ls
pwd
cd
cp
mv
rm
mkdir
cat
touch
(creating an empty file)- A command of your choice to write a string into a file, creating the file or overwritting the existing contents if necessary. An example could be
"some content" > file
.- A command of your choice to append a string into a file. It should fail if the file does not exist. An example could be
"some content" >> file
.-r
option forcp
andrm
. Otherwise, you do not need to implement the command line options or the permissions. Though not required, you are welcome to implement the set of options that you think is necessary. Clearly document what you have implemented, and include an interesting sample run in the README that covers all the commands and options.
Recall that the global buffers we have allocated for the system calls in kern/fs/sysfile.c
are not locked. Thus it suffers from the potential race conditions where multiple processes may try to
write to or read from the global buffer. As a bonus (thus optional) exercise, you are asked to
change the file system implementation to support full concurrency.
Bonus Exercise
We need to really understand the problem before we can solve it. In your
README
, clearly explain the potential issues that could occur under the current implementation. Note that the file system implementation itself is already properly locked. The issue comes from the unlocked kernel I/O buffers we defined inkern/fs/sysfile.c
.Change
kern/fs/sysfile.c
(and other appropriate files) to solve the problem. You should solve the issue by using locks in appropriate places. Simply duplicating the buffer for each thread is not acceptable. Note that our file system implementation is interrupt driven, where a particular call may get blocked, and the thread may be put into sleep until some particular interrupt is triggered. Simply wrapping the whole system call handlers with lock acquire and release statements would not work, as we should not allow any thread to hold a lock when it is put into sleep. Thus, we definitely need a more fine-grained solution here.You should create a test setup to demonstrate the issues in the old implementation and show that your new implementation indeed solves all of them. Clearly document what you have done, your test setup, and a sample run in the
README
file. The amount of bonus point you receive for this exercise depends on your understanding of the issue, the correctness of your implementation, your test setup, and the quality of your implementation.
This completes the lab.
Don't forget editing your README
file.
Add each of the files that you created to the git repository with the command git add /path/to/the/new/file
.
After this, you should not see any of the files you want to submit under the "Untracked files" section when you run
git status
. Commit your changes
and type git push
in the mcertikos
directory to submit your code.