Linux Kernel: 2.6



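Query the system for the number of clock ticks per second (this reports the user-visible tick rate, USER_HZ, which is not necessarily the kernel's internal HZ):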

getconf CLK_TCK
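The same value can also be read from a program. A minimal sketch, assuming a Unix-like system where Python's `os.sysconf` is available; the variable name `ticks` is just for illustration:

```python
import os

# SC_CLK_TCK is the sysconf name behind `getconf CLK_TCK`.
ticks = os.sysconf("SC_CLK_TCK")
print(f"clock ticks per second: {ticks}")
```

On many Linux systems this prints 100, but the value is configuration dependent, so it is better to query it than to hard-code it.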


The following are my notes from some recent research on the scheduler in the 2.6 version of the Linux kernel.

Please note that some of the text is copied straight over from the reference below. This will be cleaned up.


re: http://www-128.ibm.com/developerworks/linux/library/l-scheduler/

files: /usr/src/linux/kernel/sched.c


Implemented by: Ingo Molnar

The 2.6 scheduler is dynamic, supports load balancing, and selects the next task in constant time, O(1); the previous scheduler was O(n).

  • 2.4 used a single runqueue shared by all CPUs.
  • In 2.6, each CPU has its own runqueue. The active runqueue is made up of 140 priority lists, each serviced in FIFO order.
    • The first 100 lists are reserved for real-time tasks.
    • The last 40 lists are for user tasks.
  • There is also an expired runqueue. A task is moved there once it has used up its entire time slice on the active runqueue.
  • The 2.6 scheduler doesn't use a single lock for scheduling; instead, it has a lock on each runqueue. This allows all CPUs to schedule tasks without contention from other CPUs.
  • When tasks are created in an SMP system, they're placed on a given CPU's runqueue. In the general case, you can't know when a task will be short-lived or when it will run for a long time. Therefore, the initial allocation of tasks to CPUs is likely suboptimal.
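The runqueue layout described in the list above can be sketched as a toy model. This is not kernel code: the class names `PrioArray` and `RunQueue`, the bitmap of non-empty lists, and the swap of the active and expired arrays when the active one drains are assumptions made for illustration, following the general description in the referenced article. The point is that picking the next task never scans all tasks; it only finds the first non-empty priority list.

```python
from collections import deque

MAX_RT_PRIO = 100  # lists 0..99: real-time tasks
MAX_PRIO = 140     # lists 100..139: user tasks


class PrioArray:
    """140 FIFO priority lists plus a bitmap marking the non-empty ones."""

    def __init__(self):
        self.queues = [deque() for _ in range(MAX_PRIO)]
        self.bitmap = 0  # bit p is set while queues[p] holds a task

    def enqueue(self, task, prio):
        self.queues[prio].append(task)
        self.bitmap |= 1 << prio

    def dequeue_highest(self):
        """Pop the oldest task from the lowest-numbered non-empty list."""
        if self.bitmap == 0:
            return None
        prio = (self.bitmap & -self.bitmap).bit_length() - 1  # lowest set bit
        task = self.queues[prio].popleft()
        if not self.queues[prio]:
            self.bitmap &= ~(1 << prio)
        return task, prio


class RunQueue:
    """Per-CPU runqueue holding an active and an expired array."""

    def __init__(self):
        self.active = PrioArray()
        self.expired = PrioArray()

    def pick_next(self):
        # When the active array is drained, the two arrays trade places.
        if self.active.bitmap == 0:
            self.active, self.expired = self.expired, self.active
        return self.active.dequeue_highest()

    def expire(self, task, prio):
        """Park a task that has used up its time slice."""
        self.expired.enqueue(task, prio)


rq = RunQueue()
rq.active.enqueue("user-a", 120)
rq.active.enqueue("rt", 10)
rq.active.enqueue("user-b", 120)
print(rq.pick_next())  # the real-time task is chosen first
```

Because the number of priority lists is fixed at 140, the cost of finding the next task does not grow with the number of runnable tasks.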

To maintain a balanced workload across CPUs, work can be redistributed, taking work from an overloaded CPU and giving it to an underloaded one. The Linux 2.6 scheduler provides this functionality by using load balancing. Every 200ms, a processor checks to see whether the CPU loads are unbalanced; if they are, the processor performs a cross-CPU balancing of tasks.
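The idea of taking work from an overloaded CPU and giving it to an underloaded one can be sketched as follows. This is a simplified model, not the kernel's algorithm: the function `balance`, the use of queue length as the measure of load, and the `threshold` parameter are all assumptions for illustration. Only the 200 ms interval comes from the text above.

```python
BALANCE_INTERVAL_MS = 200  # check interval quoted in the notes above


def balance(runqueues, threshold=1):
    """Move tasks from the longest queue to the shortest one until their
    lengths differ by at most `threshold`. Returns the number of tasks moved.

    `runqueues` is a list of lists, one list of tasks per CPU.
    """
    if threshold < 1:
        # With threshold 0 a single odd task would bounce back and forth forever.
        raise ValueError("threshold must be at least 1")
    migrated = 0
    while True:
        busiest = max(runqueues, key=len)
        idlest = min(runqueues, key=len)
        if len(busiest) - len(idlest) <= threshold:
            return migrated
        idlest.append(busiest.pop())  # the migrated task starts with a cold cache
        migrated += 1


cpus = [["t1", "t2", "t3", "t4", "t5", "t6"], [], ["t7", "t8"]]
moved = balance(cpus)
print(moved, [len(q) for q in cpus])
```

In this model a scheduler would call `balance` once per `BALANCE_INTERVAL_MS`; each migration has the cache cost discussed below, which is why the loop stops as soon as the queues are close enough.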

A negative aspect of this process is that the new CPU's cache is cold for a migrated task: the task's data has to be pulled into that cache again before it runs at full speed.

Remember that a CPU's cache is local (on-chip) memory that offers much faster access than the system's main memory. If a task is executed on a CPU, and data associated with the task is brought into that CPU's local cache, the cache is considered hot for the task. If no data for a task is located in the CPU's local cache, then the cache is considered cold for that task.

This cost is unfortunate, but keeping all of the CPUs busy makes up for the cache being cold for a migrated task.


Excellent set of slides by Greg Kroah-Hartman on the Linux Kernel
