CS422/522 Lab 6: Advanced OS Projects
due 2022-12-13

Introduction

For this assignment, you can choose one project among the list below. The grading will depend on the correctness and thoroughness of your implementation. Each of the projects can get you full credit if you prove to us that you did a thorough research and a substantial amount of coding.

Instructions

You will create a branch lab6 and push it to the origin remote.

$ cd ~/cpsc422/mcertikos
$ git checkout lab5-solution
$ git checkout -b lab6
$ git push --set-upstream origin lab6
$ # You can simply use git push afterwards to publish your code

Project 1: Video Mode

Change the bootloader to switch the video mode to a graphic mode. Implement text output (in the console driver) by having a font built in the kernel (as a simple array of bytes).

Expose the VGA memory through a memory mapping, add system calls to turn on and off a "video" mode inside the kernel. When video mode is enabled, map the VGA buffer into userspace. Your driver should support at least 4 colors and a resolution of 640x480.

Implement a little user program that does cool things in video mode. For example, write a user program to display a screensaver application or any other interesting animation, and add some control to allow the user to interact with the program with keyboard inputs.

Useful links:

Project 2: UNIX Shell

The shell of the previous assignment is only a mock of a real UNIX shell. This project makes it a lot more interesting and has you go down the same design decisions UNIX creators went through.

Split the different commands implemented in your shell into different user processes.
Unify console IO, IPC, and FS read/writes.

Create a new kernel abstraction of "file descriptor" (represent them in userspace as integers). Two special file descriptors are the console input and the console output. Other file descriptors can either be a pipe between two processes, a write sink to a file, or a read source from a file.

Add a new system call to create a file descriptor in one of the 3 modes above. Have the kernel initialize the console in and console out file descriptors.

Once you have your new abstraction, remove the readline, puts produce, and consume system calls. They are now functionally subsumed by two new system calls read and write that read from the input channel and write to the output channel of the process.

The input/output channels of a process are specified in extra arguments you will add to spawn, the shell is started with the console in and console out channels.

Finally, make the shell able to parse redirections from and to files and other processes. Demonstrate the full functionality of your new shell with pipelines like:
```
cat file | rot13 > file.encrypted
```
(Extra credit) Have the shell start processes from disk instead of having user processes built into the kernel. For this, you will have to change spawn or create a new system call to start a new process from a file stored on the disk. You might also want to change the default qemu disk file to store some ELF files to start.

Project 3: Debugger

Read on the guts of debuggers, then implement a system call API to have a debugging mechanism. Your debugger process should be able to spawn another process, place a few breakpoints in it, dump some interesting memory contents while the debuggee is stopped, and resume the execution.

Your debugger will run in 3 phases:

Read a bunch of addresses to stop at in the debuggee process.
Start the debuggee (for example with a start command).
When a breakpoint is hit, control goes back to the debugger and the user can either type commands to dump the contents of the memory (dump 0x12345678) or resume the execution (continue).
Back to 3.

Give us a sample run in the README together with an explanation as to why it is interesting.

Project 4: Scheduler

Read the literature on dynamic priority-based schedulers. Implement one and showcase its new scheduling features on an example with multiple processes running.

Your new scheduler also needs to implement a load balancing mechanism to make sure that all the CPUs of the computer are fully utilized.

Detail your scheduling algorithm extensively in the README, and justify your comments by providing comprehensive benchmark results. Your scheduler implementation should be cache efficient. Do some research on the multi-core cache hierarchy models and illustrate clearly how each of your design decisions impacts the cache performance. Back up your claim with concrete benchmarks.

You can study the Plan 9 scheduler as it provides a simple base to work upon and the code is clear.

Project 5: Audio Support

Study how qemu supports the audio, then enable appropriate audio devices in qemu, and implement related drivers. Then expose reasonable API to the user level so that user programs can manipulate the sound, and implement some interesting user applications. For example, you can implement a simple piano application with limited notes, and then implement another application to play a self-composed song with provided notes. Make sure that the sound is of reasonable quality. Try to decode and play an audio file stored in the hard disk. You can use some open source decoders if you like.

Project 6: In-kernel Cryptographic Framework

Cryptographic functionalities play fundamental roles in many security applications and protocols. Many software that needs authentication, authorization, identification, integrity checking, secrecy protection, and attestation regards cryptography transformation as its building blocks.

Usually, crypto functions are implemented and provided by user-level libraries, e.g., GnuTLS, OpenSSL, and Crypto++. However, cryptographic computation might be very intensive and may require hardware assistance and better protection. That is why the majority of commodity OS have the crypto library to be part of the kernel. In this way, not only the hardware crypto accelerators that usually require privileged access could be employed, but also a better interface could be provided.

A typical example is the OpenBSD Cryptographic Framework (OCF)[1], which is already ported to Linux and Apache HTTP Server.

In Linux, the kernel crypto API[2] is exposed to the user space as a special network socket AF_ALG. A user could bind this socket with options .salg_type and .salg_name to specify the required crypto transformers, and then uses system calls such as send(), recv(), read(), and write() to interact with the transformer.


/* define a ALG socket with sha1 transformer */
struct sockaddr_alg sa = {
    .salg_family = AF_ALG,
    .salg_type = "hash", /* this selects the hash logic in the kernel */
    .salg_name = "sha1"  /* this is the cipher name */
};

/* create a socket */
fd = socket(AF_ALG, SOCK_SEQPACKET, 0);

/* bind with the socket (start the network connection) */
bind(fd, (struct sockaddr *)&sa, sizeof(sa));

/* send the message for digesting */
send(fd, buf, len, flags);

The interface is further wrapped into a user-level library, namely, libkcapi.

In this project, you are required to implement an in-kernel crypto framework that:

Utilize the hardware provided crypto functions, such as x86 AES / AES-NI instruction set, random number generators, SHA extensions, or the similar functionality from ARM cryptographic extensions[3].
Provide flexible framework to allow further extension of new ciphers/transformers.
Note: You do not necessarily need to provide the exact same socket-based interface to select the transformers/ciphers. However, your syscall interface should not bind to a specific cryptographic algorithms.
Enable the multiple user access for the same hardware crypto accelerators. (context save / restore)
At least provide the following crypto interface:
- SHA-2: SHA256
- AES
- Random number generator
Implement a user-level application to demonstrate how to use the provided crypto interface.

Note: Please talk to us if you prefer to work on the ARM version of mCertiKOS.

References:

[1] A. D. Keromytis, J. L. Wright, and T. de Raadt, “The design of the OpenBSD cryptographic framework,” 2003.
[2] S. Mueller, M. Vasut, "Linux kernel crypto API", https://www.kernel.org/doc/html/v4.17/crypto/index.html#linux-kernel-crypto-api.
[3] ARM, "ARM Cortex-A53 MPCore Processor Technical Reference Manual - Chapter 2.4.1", 2014.

If multiple user threads open one file, the inconsistency may occur due to the lack of mutual exclusion. One solution to avoid this issue is to provide the file lock. In Linux, the file lock is called flock, which provides the functions of applying or removing an advisory lock on an open file. The details of the flock could be found in its manual page:

man 2 flock

A file could be shared in 2 styles:

Exclusive access: when the file is locked, all the attempts to access this file would be put into the waiting list. Those threads will not proceed until the lock is released.
Shared access: shared access is used to simulate the multiple readers one writer situation, while if all the threads hold the shared lock, they could access the file without any conflicts, but if one of them upgrade its lock into the exclusive lock, only the one holds the exclusive lock could access to the file, others have to wait.

Usage example[1]:


/* acquire shared lock */
if (flock(fd, LOCK_SH) == -1) {
    exit(1);
}

/* acquire shared lock in non-blocking mode */
if (flock(fd, LOCK_SH | LOCK_NB) == -1) {
  exit(1);
}

/* non-atomically upgrade to exclusive lock */
if (flock(fd, LOCK_EX) == -1) {
    exit(1);
}

/* release lock */
/* lock is also released automatically when close() is called or process exits */
if (flock(fd, LOCK_UN) == -1) {
    exit(1);
}

In this project, you need to implement the syscall flock, which provides the features that are similar to Linux flock. You need to prepare a user application to demonstrate the using of flock to achieve synchronization between multiple threads.

References:

[1] V. Gaydov, “File locking in Linux”, https://gavv.github.io/articles/file-locks/, 2016.

Project 8: Advanced Synchronization - Wait for Multiple Objects

In many cases, a user thread may wait (either polling or sleeping) on multiple synchronization objects. For example, monitor multiple IPC channels, serial or network input, disk DMA completion, and waiting the notifications from other threads/devices. A mechanism that could combine multiple notification sources into one syscall is sometimes quite useful.

In Linux, this scalable I/O event notification function is called epoll (man 7 epoll). It monitors multiple file descriptors to see if I/O is possible on any of them.

Using epoll is constituted by the following three steps [1]:

Create an epoll object by epoll_create().
Register notification sources that needs to be waited simultaneously with epoll_ctl().
Wait on the epoll object by epoll_wait().

Another similar interface is implemented by FreeBSD, called kqueue, and integrated by MacOS and other BSD like OS. The tutorial of using kqueue could be found in [2].

In this project, you need to implement several system calls that perform similar functions as epoll or kqueue. You need to abstract notification sources, and link them into a new synchronization object. The waiting function should provide both polling mode and sleeping mode. In addition, you need to design a user application to demonstrate the usage of your syscall interfaces.

Note: In this project, the waiting objects does not necessarily need to be file descriptors. You could design your own representation of notification sources.

References:

[1] C. Sridharan, “The method to epoll's madness”, https://medium.com/@copyconstruct/the-method-to-epolls-madness-d9d2d6378642, 2016.
[2] NetBSD, "kqueue tutorial", https://wiki.netbsd.org/tutorials/kqueue_tutorial/, 2015.
[3] Corbet, "The kevent interface", https://lwn.net/Articles/172844/, Lwn.net, 2006.
[4] Dan Kegel, "The C10K problem", http://www.kegel.com/c10k.html, 2014.

Project 9: Signal

Signal is a POSIX inter-process communication (IPC) functionality that allows a user to register several user-level call-back functions, and when the kernel needs to notify the user that some events occur, it could select one of the call-back functions to let the user run it.

A typical signal using scenario is the shell command kill. When user issue command:

kill -num pid

to the kernel, the thread #pid will receive a signal numbered by #num.

When a user thread start, it may use syscall sigaction() to register a call-back function with the specified signal number for preparing to receive the upcoming signals:


struct sigaction {
  void (*sa_handler)(int);
  void (*sa_sigaction)(int, siginfo_t *, void *);
  sigset_t sa_mask;
  int sa_flags;
  void (*sa_restorer)(void);
};

Alternatively, a user thread may use syscall pause() to sleep until a signal is delivered. A full list of signal related syscalls could be found with man 7 signal.

In this project, you are required to implement a set of syscalls that is similar to the POSIX signal and show that a user thread could be properly killed or notified with the signal mechanism. In addition, you need to equip the shell in lab5 with additional commands, such as kill and trap to demonstrate your signal implementation.

References:

[1] H. Arora, “Linux signals fundamentals - Part I and II”, https://www.thegeekstuff.com/2012/03/linux-signals-fundamentals/, 2012.

Project 10: Advanced Memory Allocation

The memory allocation in mCertiKOS is primitive. It only allows the user to allocate one 4K page upon a page fault. In many cases, allocating pages that are consecutive in both virtual and physical addresses are useful, especially for those applications that require to perform operations with the DMA engine. Another benefit of allocating consecutive multiple pages at a time is to allow further combining the pages into a super-page. A super-page (e.g., Page Size Extension in x86) allows the MMU to reduce the TLB entries and then further reduce the page table switch overhead.

In this project, you are required to implement an improved page allocation system that allows the user to:

Allocate physically consecutive pages
Allocate super pages
Free pages

with syscall brk(). Besides, the page allocation system should be robust and could adequately handle the page fragment issue, and also has a decent performance.

Note: Please talk to us if you prefer to work on the ARM version of mCertiKOS.

Project 11: Advanced Synchronization: User-level Mutex

Mutex is an essential building block to implement more sophisticated synchronization objects. A few syscalls, e.g., select(), send(), read(), rely on mutexes to achieve kernel-level mutual exclusion and events notification. In many scenarios, the user threads also need mutexes for implementing higher-level synchronization, especially for those applications that have shared memory communications.

A futex (short for "fast userspace mutex", man 7 futex) is a kernel-level object that provides particular interfaces that users can rely on to implement user-level synchronization. A futex is fast because, in most cases, it will not be used until there is lock contention. And it is lightweight, because it is essentially a wait queue equipped with some atomic state.

In Linux, futex is provided by the futex() system call:


  int futex(int *uaddr, int futex_op, int val,
    const struct timespec *timeout, ...);

The argument uaddr points to the atomic state that needs to be guarded, while the futex_op could be:

WAIT: if *uaddr == val, the current thread goes to sleep.
WAKE: wakes up the number of #val threads that are waiting on the change of the futex.
CMP_REQUEUE: an upgraded version of WAIT that avoids the thundering herd problem.

In this project, you need to implement a futex that at least contains the above three functions. In order to show that the futex is practical, you also need to implement several user-level synchronization objects, such as Semaphores, FIFO Bounded Buffer Queues, and other constructs (e.g. list of higher-level synchronization objects).

References:

[1] D. Hart, “A futex overview and update”, https://lwn.net/Articles/360699/, Lwn.net, 2009.
[2] E. Bendersky, "Basics of Futexes", https://eli.thegreenplace.net/2018/basics-of-futexes/, 2018.

Project 12: Pub/Sub IPC

The topic-based messaging is increasingly popular in modern software architectures, especially for those embedded applications. Systems based on the pub/sub communication middleware, such as ROS, OpenDDS, and Atom enjoy high robustness from this loosely-coupled software interaction. In the pub/sub IPC subsystem, messages are broadcasted by publishers, and all the subscribers will receive the same message at once. No receiver could block the sending of the message so that the system is free from communication deadlock.

The pub/sub IPC in ROS is used as follows:

publisher:


  /* publish a topic named chatter and with a maximum of 1000 messages buffer */
  ros::Publisher chatter_pub = n.advertise("chatter", 1000);

  /* publisher publish a message */
  chatter_pub.publish(msg);

subscriber:


  /* subscriber subscribes to a topic named chatter, it will queue 1000 messages, and the old message will be discarded if a new message arrives. Upon a new message arrives, the subscriber will be activated and execute the callback function: chatterCallback */
  ros::Subscriber sub = n.subscribe("chatter", 1000, chatterCallback);

/* the callback function */
void chatterCallback(const std_msgs::String::ConstPtr& msg) {
  ...
}

For convenience, the pub/sub IPC in ROS is implemented on top of TCP/IP stack, even on the same OS kernel. The broadcasting is actually performed by the publisher iteratively sending out the TCP packets. However, there is no reason to forbid the OS to provide the pub/sub IPC syscalls.

In this project, you are required to design and implement the pub/sub IPC that resembles the ROS topic-based communication system. You only need to consider that all the threads (nodes) on the same OS.

References:

[1] Wikipedia, “Publish–subscribe pattern”, https://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern, 2019.
[2] ROS, "Writing a Simple Publisher and Subscriber (C++)".

Project 13: Networking

Note: Please talk to us for the detailed instructions.

Project 14: Resource Limitation and Bandwidth Control

Note: Please talk to us for the detailed instructions.

Other Projects

We are open to propositions. The above list simply provides some sample projects to give you an idea about the scope. Even if you choose one of the projects in the list, you do not have follow the guidelines exactly, as long as you demonstrate that the project is as interesting and challenging as the sample projects.

If you wish to make a proposal, please send us a mail with a detailed description of your proposal. We will evaluate it and get back to you quickly.

CS422/522 Lab 6: Advanced OS Projectsdue 2022-12-13

Introduction

Instructions

Project 1: Video Mode

Project 2: UNIX Shell

Project 3: Debugger

Project 4: Scheduler

Project 5: Audio Support

Project 6: In-kernel Cryptographic Framework

Project 7: Advanced Synchronization - File Sharing

Project 8: Advanced Synchronization - Wait for Multiple Objects

Project 9: Signal

Project 10: Advanced Memory Allocation

Project 11: Advanced Synchronization: User-level Mutex

Project 12: Pub/Sub IPC

Project 13: Networking

Project 14: Resource Limitation and Bandwidth Control

Other Projects

CS422/522 Lab 6: Advanced OS Projects
due 2022-12-13