Friday, November 12, 2010

kernel/proc.c - sys_call() internals

I gave some idea about sys_call() in the preceding diagrams. sys_call() does a number of checks before it allows a message or notification to be passed from one process to another. These are all part of the security mechanisms in the kernel that keep processes from doing things they are not allowed to do and from corrupting each other's memory maps. Assuming that a process has just made a system call, here is the list of checks that happen:

  1. Does the process have the privileges to make this particular system call? (the priv structure for this process in the process table is looked up to determine this)
  2. In the case of a 'send' or 'notify', is the destination a valid process? In the case of a 'receive' is the source a valid process? Note that a caller can specify 'ANY' as its source in which case it is looking for a message or notification from any process, so this check is unnecessary for that case. 'ECHO' is also a special case where this check doesn't make sense.
  3. Does the message pointer point to a valid region in memory i.e. is it really within the memory map limits of the calling process? Again, the memory map limits are available in the memmap struct for this process in the process table.
  4. If the request is to 'send' to some process, is the caller allowed to send to the destination process?
  5. Is the destination process running?
  6. Has system shutdown started?
If all the checks above have passed then call one of the following functions: mini_notify(), mini_receive() or mini_send() depending upon the nature of the request.

Refer to the 'Message Primitives and Process States' diagrams (preceding posts) for details on what happens within the above functions.

Comments welcome.

Monday, November 1, 2010

Process scheduling in Minix



The diagram above shows the state transitions for any process in Minix. A running process can become blocked when it makes a system call (which translates into the INT assembly instruction, also known as a software interrupt). Once blocked, it has to become ready before it can be run again. The scheduling algorithm only picks ready processes for execution.

The kernel maintains 16 queues of runnable processes. These queues are numbered 0 (highest priority queue) to 15 (lowest priority queue). Different processes have different priorities and the priority of a process may change during its execution. The changed priority is reflected in the fact that the process is moved to a different scheduling queue.

The clock and system tasks in Layer 1 of Minix have the highest priority. The device drivers in Layer 2 get lower priority but they are all not equal. Server processes (process manager, file system etc.) get lower priority than device drivers. User processes get lower priority than the server processes. Refer to this post for more on these layers.

Scheduling is round robin in each queue. If a running process uses up its quantum, it is moved to the tail of its queue and given a new quantum (Transition 4 in the diagram above). When a blocked process is awakened it is put at the head of its queue if it had any part of its quantum left when it blocked (Transition 2 in the diagram above). But it does not get a new quantum. It gets only as much quantum as it was left with when it blocked.

Whenever a running process becomes blocked or a runnable or running process is killed by a signal, it is removed from the scheduling queues. What this means is that the scheduling queues contain only ready or runnable processes.

Given this structure, the rule is very simple: Find the highest priority queue that is not empty and pick and continue running the process that is at the head of that queue.

So here is the chain of events at a high level: Process A is running -> Hardware and software interrupts occur -> interrupt handling code runs -> messages or notifications are sent or received -> the message processing results in process state transitions which causes changes to the scheduling queues in the kernel -> so Process B gets selected for execution.

We have discussed in detail how context switching happens and how the scheduling logic works. It is clear that the state transitions in the above diagram really drive the context switching. But what really causes these state transitions? The answer lies in the messaging layer of Minix. That is the final missing piece of the puzzle and I will discuss it with another diagram in an upcoming post.

Comments welcome.