CS376: Operating Systems - Interprocess Communication via a Mailbox (100 Points)

Assignment Goals

The goals of this assignment are:
  1. To copy data between user and kernel memory space
  2. To manipulate process state to block and unblock processes until events occur
  3. To implement an interprocess communication mechanism via a mailbox
  4. To implement kernel linked lists and dynamic memory allocation

Background Reading and References

Please refer to the following readings and examples offering templates to help get you started:

The Assignment

You will write a kernel module or component that enables us, via syscalls, signals, or other mechanism, to communicate between two processes. You will implement a mailbox with send and receive capabilities. receive shall be a blocking call, which blocks until a message is received.

Implement the following functions:

  • void mysend(pid_t pid, size_t n, __user char* buf)
  • long myreceive(size_t n, __user char* buf)

Here, pid is the pid you wish to send a message to. Look up the task_struct corresponding to that pid. I recommend writing a function to find a task_struct by its pid to help you:

struct task_struct *find_task_by_pid(pid_t pid) {
    struct task_struct *task;
    for_each_process(task) {
        if (task->pid == pid) {
            return task;
        }
    }
    return NULL;
}

You will create (perhaps in include/linux/sched.h) a struct containing a char* message and an int size of the message. Let’s call it mymessage. Add a linked list to task_struct of these mymessage data structures. You can initialize it in do_fork. Similarly, add a wait_queue_head_t to task_struct in order to wait for a message to arrive, and initialize this in do_fork as well.

Receive Syscall

When receiving, n specifices the number of bytes to send, and the maximum number of bytes to read. buf is the buffer to send, and is a pointer to a char buffer that myreceive will assume has been properly malloc‘ed.

Send Syscall

For this project, you are only required to start out by receiving messages that have already been sent. That is, a call to mysend before you call myreceive should result in the message being missed. However, you can implement your calls in one or more of two ways to resolve this:

  1. Have mysend block until myreceive receives the message.
  2. For extra credit (and in addition to the option above), allow mysend not to block but instead queue the message so that a late call to myreceive will not block, but rather deliver one of the queued messages. In this case, if myreceive is called and no message is available, return 0 as the number of bytes read rather than blocking. You can specify a parameter to myreceive that indicates if it should block or use the queue, or write a separate version of myreceive.

Testing Your Syscalls

To test, write a program that forks a child and sends a char* message to that child. The child can receive and print the message.

What to Do

You can make modifications directly to sched.c and sched.h to add the logic for mysend and myreceive system calls, including any necessary data structures for message queuing. I suggest the following workflow:

Add a mymessage data structure in sched.h

Define a structure to represent a message and a queue to hold messages for a process. This struct should contain the pid_t of the sender, the size_t size of the message, and a char* to hold the address of the message itself.

Create a Kernel Linked List of these mymessage structures

Add a struct list_head to the mymessage structure, and create a message_queue that contains a struct list_head and a wait_queue_head_t for blocking on myreceive.

This data structure might look like this:

struct mymessage {
    pid_t sender_pid;
    size_t size;
    char* msg;
    struct list_head list;
};

struct message_queue {
    struct list_head messages;
    wait_queue_head_t wait;
};

Modify the struct task_struct in sched.h to include a pointer to your struct message_queue structure called msg_queue

You can add this pointer anywhere within the struct. Initialize these in do_fork like this, after the call to get a task_struct* p returned from copy_process, and inside the error check statement if(!IS_ERR(p)) that ensures that this process was established correctly:

// These statements assume you've named your variables as we have above; adjust as needed
p->msg_queue = kmalloc(sizeof(struct message_queue), GFP_KERNEL);
INIT_LIST_HEAD(&(p->msg_queue->messages));
init_waitqueue_head(&(p->msg_queue->wait));

Create System Calls for mysend and myreceive in sched.c

These will contain the logic to append a message to a destination task’s message queue (denoted by the pid), and to receive a message from your own current message queue from a particular specified pid (or from any process if the pid parameter is -1). myreceive returns the number of bytes read, or -1 on error.

Here are the function prototypes for mysend and myreceive:

asmlinkage void sys_mysend(pid_t pid, int n, const char __user *buf);
asmlinkage long sys_myreceive(int n, char __user *buf);

Register your System Calls

Add entries for your two system calls to the syscall table, as before. As usual, copy your function prototypes to include/linux/syscalls.h, and add an entry to arch/i386/kernel/syscall_table.S with an entry .long sys_mysyscall with the name of each of your syscalls. To call these syscalls by name, add the following snippet to include/asm-x86_64/unistd.h for each syscall:

#define __NR_mysyscall ^^^ //(where ^^^ is one more than the highest number in the list).
__SYSCALL(__NR_mysyscall, sys_mysyscall)

Alternatively, you can plan to call your syscalls by number (for example, syscall(123, param1, param2);), and omit the __SYSCALL macro above.

Implement the System Calls

The default logic for myreceive is to block until a message is received. You can add the task to the message queue wait list on myreceive if no message is available, and release a waiting process (if there is one) when calling mysend. Here, we describe the implementation of your new syscalls in detail.

Copying Between Userspace and Kernel Space

Before we begin, note that the mysend and myreceive buf pointers are addresses that exist in user-space, not in kernel space! The __user tags in the function prototypes indicate that the pointer is to a userspace address, and must be brought to/from the kernel using copy_to_user and copy_from_user. You can kmalloc space in kernel memory to copy data between userspace virtual addresses and physical addresses for access by the kernel. You can then put that kernel physical pointer onto the message queue when you send, and move that data from this kernel message queue back to user space on receive. Read this article for details on how to use these functions, or see the example below taken from that article. Note that this is a model template from which you can write your two system calls, but it is intended to demonstrate the copy_to_user (and, analogously, the copy_from_user) functions. It is not an exact copy of the code you will write.

// From: https://wiki.tldp.org/static/kernel_user_space_howto.html#Implementation-8
// GNU Free Documentation License

#include <linux/linkage.h>
#include <asm/uaccess.h>
asmlinkage long sys_mysyscall(char __user *buf, int len)
{
    char msg [200]; // I'd malloc this and put the pointer in a task_struct queue for persistence!
    if(strlen_user(buf) > 200)
        return -EINVAL;
    copy_from_user(msg, buf, strlen_user(buf)); // copy the data to a kernel buffer
    printk("mysyscall: %s\n", msg);
    copy_to_user(buf, "hello world", strlen("hello world")+1); // put "hello world" back into their user buffer
    return strlen("hello world")+1;
}      

Implementing mysend

  1. kmalloc a message structure to be added to some process’ task_struct. This is kernel memory, so you will pass GFP_KERNEL as the second parameter to kmalloc. Don’t roget to malloc enough room to fill the char* msg inside the message struct as well. Set the other fields of the message.

  2. Find the task_struct corresponding to your target pid, and use list_add_tail to pass the address of the list_head in your struct and the address of the list_head in the target process message queue (in their task_struct).

  3. Finally, use wake_up_interruptible with the address of the wait_queue_head_t from the target process’ message queue (in their task_struct) to wake up a process (if one exists) that may be waiting on your message.

Implementing myreceive

  1. Begin by waiting for a message using wait_event_interruptible. This function takes a wait_queue_head_t, and a condition to wait upon. In this case, the condition is to wait until the list is not empty. You can use the !list_empty function, which takes a list_head. Take these from the current process.

  2. Iterate over the list using list_for_each_entry_safe, which is a macro that expands to a for loop, and takes as parameters two pointers to your message struct (the first one is the one you will use, and the second one is for temporary use by the macro), the address of your list_head message queue, and the word list.

  3. If there is a message on the linked list, copy its contents to your userspace using copy_to_user (which takes the __user buf you are passed via the syscall, the kernel buffer you are copying, and the number of bytes to copy (which you kept in your message structure during mysend!).

  4. Finally, use list_del to remove the item from the list (this takes the address of the list_head of the message structure you’re removing).

Once a message is found, your syscall has finished and you can break out of the loop.

Thread-safe linked list operations

Manipulating kernel linked lists may not be thread-safe. You can use rcu_read_lock() and rcu_read_unlock() to get atomic access to those structures. Try to use these only when necessary (i.e., not for the entire function!). list_for_each_safe is thread-safe by itself.

Don’t forget to be robust

If you call kmalloc, check that the return value is not NULL. If any call returns an error state like 0 or -1 as appropriate for the call, clean up your memory with kfree and exit gracefully by returning a negative number as well. Either way, be sure to kfree everything you kmalloc when you are able to do so.

Test Your Program from User Space

  1. Write a user program that forks a child.

  2. The child should create a buffer and call myreceive with that buffer. The child can print the message it receives and exit.

  3. The parent should call mysend and send a message to the child by its pid. The parent can wait for the child after calling mysend (and then terminate itself).

Call your syscalls by number using the syscall method and compile your program with gcc as follows:

gcc testSysCall.c

Submission

In your submission, please include answers to any questions asked on the assignment page in your README file. If you wrote code as part of this assignment, please describe your design, approach, and implementation in your README file as well. Finally, include answers to the following questions:
  • Describe what you did, how you did it, what challenges you encountered, and how you solved them.
  • Please answer any questions found throughout the narrative of this assignment.
  • If collaboration with a buddy was permitted, did you work with a buddy on this assignment? If so, who? If not, do you certify that this submission represents your own original work?
  • Please identify any and all portions of your submission that were not originally written by you (for example, code originally written by your buddy, or anything taken or adapted from a non-classroom resource). It is always OK to use your textbook and instructor notes; however, you are certifying that any portions not designated as coming from an outside person or source are your own original work.
  • Approximately how many hours it took you to finish this assignment (I will not judge you for this at all...I am simply using it to gauge if the assignments are too easy or hard)?
  • Your overall impression of the assignment. Did you love it, hate it, or were you neutral? One word answers are fine, but if you have any suggestions for the future let me know.
  • Using the grading specifications on this page, discuss briefly the grade you would give yourself and why. Discuss each item in the grading specification.
  • Any other concerns that you have. For instance, if you have a bug that you were unable to solve but you made progress, write that here. The more you articulate the problem the more partial credit you will receive (it is fine to leave this blank).

Assignment Rubric

Description Pre-Emerging (< 50%) Beginning (50%) Progressing (85%) Proficient (100%)
Modifications to the Kernel Data Structure (10%) No modifications made or modifications completely break the system functionality. Incomplete or incorrect modifications to the kernel data structure, with major issues. Adequate kernel data structure modifications with minor issues or missing documentation. Comprehensive and correct modifications with clear documentation and rationale for changes.
Copying Data Between Kernel and User Space (10%) No data copying between kernel and user space is present or it completely fails. Copying of data is error-prone, with potential security or stability risks. Data copying is functional but may have minor inefficiencies or lack some safety checks. Efficient and error-free copying of data, with safety checks and validation.
Data-level Thread Safety and Atomic Instruction Code Blocks with Locking Techniques (10%) No consideration for thread safety, leading to severe system instability. Insufficient thread safety measures, with a high risk of race conditions or deadlocks. Most operations are thread-safe, with some potential for race conditions or deadlocks. Thread safety ensured with atomic operations and appropriate use of locking mechanisms.
System Call Implementation (15%) System calls are not implemented or are completely non-functional. System calls are implemented with significant flaws or misunderstanding of requirements. System calls are mostly correct but may have minor issues with edge cases or error handling. System calls are implemented correctly with proper parameter handling and error checking.
System Call Integration with the Operating System (10%) System calls are not integrated, rendering the system calls unusable within the OS. Poor integration of system calls, causing instability or incorrect behavior in the OS. System call integration works, with some minor inconsistencies or deviations from standards. Seamless integration, with system calls functioning as expected within the OS environment.
User Space Test Program (10%) No user space test program is provided, or it fails to execute any meaningful tests. Test program is inadequate, with unclear output and failure to cover necessary scenarios. Test program covers most scenarios but may miss some edge cases or lack detailed output. Comprehensive test program covering all cases, with documentation and clear output.
Robust and Defensive Code Implementation (10%) Code is not robust, with no attempt to handle errors or edge cases. Code lacks robustness, with frequent errors or crashes under edge cases. Code is generally robust but may not handle all edge cases or lacks some defensive measures. Code is robust, handles all edge cases, and includes defensive programming practices.
Condition Variable Implementation (15%) No implementation of condition variables, or implementation causes system failure. Condition variables are implemented incorrectly, leading to inefficient or incorrect operations. Condition variables are used appropriately, but there may be minor inefficiencies or delays. Correct use of condition variables, with efficient handling of wait and signal operations.
Code Documentation and README (10%) Both the README and documentation are lacking. Either the README or documentation are lacking. A README and code documentation are provided. A README and code documentation are provided and appropriate.

Please refer to the Style Guide for code quality examples and guidelines.