Thursday, October 21, 2010

Interrupts

Bottom halves and deferring work

Basic idea
ISR must run fast, with current IRQ disabled (and maybe others too),
cannot sleep.
But servicing an interrupt might require lots of work and may require
sleeping.
Defer work to after the ISR finishes.
Only put in ISR:
time-sensitive work
work related directly to the hardware device that interrupted
work that must exclude another interrupt from same device
Can delay either until all ISRs are done (bottom half) or a given time
(kernel timer, Chapter 9)

History
Initially, Linux only had what we will call "BH": static list of 32
bottom halves, globally exclusive. BHs were removed in 2.5.
Later, "task queues" were added, but not lightweight enough for
networking. Task queues were removed in 2.5 (some became tasklets,
others work queues)
2.3 series introduced "softirqs" and "tasklets", which still remain.
2.5 series introduced "work queues", which still remain.

Softirqs
Purpose: high-frequency, highly-threaded work
Representation:
Statically allocated at compile time.
32-entry array: struct softirq_action softirq_vec[32]
in kernel/softirq.c:44
Only 6 are used: HI_SOFTIRQ and friends.
raising softirq:
raise_softirq(number) or raise_softirq_irqoff(number)
sets bit in softirq_pending(cpu)
pending softirqs are checked
during return from interrupt
by ksoftirqd kernel thread
explicitly in a few places, such as networking code
pending softirqs are invoked by do_softirq() -> __do_softirq() [trace]
-> theSoftirq->action(theSoftirq)
Handler execution:
Handler can be preempted by ISR (interrupts are enabled)
Other cpus can run softirqs at the same time, even the same one.
So shared data must be locked.
Typically, softirqs use per-processor data to avoid needing locking.
Handler must not sleep (no semaphores, for instance)
to add a new softirq
Place it after HI_SOFTIRQ and before TASKLET_SOFTIRQ.
register by open_softirq(number, action, data)
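As a sketch, registering and raising a hypothetical FOO_SOFTIRQ using the interface described above (the three-argument open_softirq() matches older 2.6 kernels; FOO_SOFTIRQ, foo_action, and foo_init are invented names for illustration):

```c
#include <linux/interrupt.h>

/* Step 1 (not shown): add FOO_SOFTIRQ to the softirq enum in
 * <linux/interrupt.h>, after HI_SOFTIRQ and before TASKLET_SOFTIRQ. */

/* Runs with interrupts enabled, possibly concurrently on other cpus
 * (even this same softirq), so it must use per-processor or locked
 * data, and must not sleep. */
static void foo_action(struct softirq_action *a)
{
        /* ... the deferred work ... */
}

static int __init foo_init(void)
{
        open_softirq(FOO_SOFTIRQ, foo_action, NULL);
        return 0;
}

/* Later, typically at the end of the ISR, mark it pending: */
static void foo_isr_tail(void)
{
        raise_softirq(FOO_SOFTIRQ);
}
```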

Tasklets
Purpose: Most cases of deferred work. Preferable to softirqs.
Two categories, represented by TASKLET_SOFTIRQ and HI_SOFTIRQ
Representation: tasklet_struct
field state is one of TASKLET_STATE_SCHED, TASKLET_STATE_RUN.
(latter only used on SMP)
field count: disables tasklet if > 0.
raising (here called "scheduling") tasklet tl:
tasklet_schedule(tl) or tasklet_hi_schedule(tl)
appends to list tasklet_vec or tasklet_hi_vec
kernel/softirq.c:202
then raises appropriate softirq.
if tasklet scheduled again before it runs, it runs only once.
if tasklet already running, it is scheduled to run again.
tasklet only runs on cpu that scheduled it.
associated handler for TASKLET_SOFTIRQ or HI_SOFTIRQ is
tasklet_action() and tasklet_hi_action()
established in softirq_init()
trace tasklet_action()
[disabled or currently running tasklets are not dropped:
tasklet_action() puts them back on the list and raises the
softirq again, so they run later]
tasklet execution
only one of any given tasklet runs at a time, but others can run
simultaneously (on other cpus)
must not sleep (no semaphores, for instance)
Handler can be preempted by ISR (interrupts are enabled)
shared data (with other tasklets or interrupts) must be locked
to add a new tasklet
static
DECLARE_TASKLET(name, func, data)
DECLARE_TASKLET_DISABLED(name, func, data)
dynamic
tasklet_init(tl, func, data)
The func should be void func(unsigned long data)
to disable and enable a tasklet
tasklet_disable(tl) tasklet_disable_nosync(tl)
ordinary disable doesn't return until tl finishes, if running.
tasklet_enable(tl) -- re-enables (or enables for the first time)
tasklet_kill(tl) -- unschedules from its pending queue.
may sleep, so don't call tasklet_kill() from interrupt context.
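Putting the tasklet pieces together, a minimal sketch (my_tasklet, my_handler, and the two-argument ISR form are assumptions for illustration, not from the notes):

```c
#include <linux/interrupt.h>

/* Runs in softirq context: interrupts enabled, must not sleep. */
static void my_handler(unsigned long data)
{
        /* ... the deferred work ... */
}

/* Statically declared and enabled (count starts at 0). */
DECLARE_TASKLET(my_tasklet, my_handler, 0);

static irqreturn_t my_isr(int irq, void *dev)
{
        /* ... acknowledge the hardware quickly ... */
        tasklet_schedule(&my_tasklet);  /* defer the rest */
        return IRQ_HANDLED;
}

static void my_teardown(void)
{
        tasklet_kill(&my_tasklet);  /* may sleep: process context only */
}
```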

ksoftirqd/n
Purpose: handle softirqs that reactivate themselves so frequently
that processing them inline would starve interactive tasks.
Rejected alternative: have the kernel ignore reactivated softirqs
until the next interrupt (which starves the softirqs instead).
Each cpu c has a kernel task called ksoftirqd/c running ksoftirqd().
[trace]
Awakened by do_softirq() when a softirq reactivates itself.

Work queues
Deferred work that runs in process context, in a kernel task.
Such work is scheduled, initially holds no locks, and may sleep.
eg to allocate memory, obtain a semaphore, perform blocking I/O.
But must not access user space (the worker is a kernel thread,
with no user address space of its own).
Alternative: special-purpose kernel task (deprecated).
Representation (see book 100)
Each type of work queue has its own workqueue_struct
which has one queue per cpu, called cpu_workqueue_struct
the queue is linked by field worklist.
Individual units of work (w) are of struct work_struct,
linked via field entry
running one instance of worker_thread() per cpu.
worker_thread() [trace] is a loop -> run_workqueue() [trace]
The generic kernel task is events/c
"events" is one type of work queue, created in init_workqueues()
Usually use "events". But XFS creates 2 new types of work queue.
To create work w
static
DECLARE_WORK(w, func, data)
dynamic
INIT_WORK(w, func, data)
The func should be void func(void *data)
To create a new type of work queue q
q = create_workqueue(name)
To schedule or cancel work w; where q not given, it is "events".
schedule_work(w) queue_work(q, w)
schedule_delayed_work(w, delay) queue_delayed_work(q, w, delay)
will not run until at least delay ticks have elapsed
flush_scheduled_work() flush_workqueue(q)
sleeps until all work on q completed
cancel_delayed_work(w) (from whatever q it is on)
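A sketch using the default "events" queue, assuming the older work-queue interface described above (DECLARE_WORK with a data argument and a void func(void *) handler); my_work, my_work_func, my_defer, and my_teardown are invented names:

```c
#include <linux/workqueue.h>

/* Runs in process context: may sleep, take semaphores, allocate with
 * GFP_KERNEL, do blocking I/O -- but must not touch user space. */
static void my_work_func(void *data)
{
        /* ... the deferred work ... */
}

DECLARE_WORK(my_work, my_work_func, NULL);

static void my_defer(void)
{
        schedule_work(&my_work);  /* run as soon as possible, or: */
        /* schedule_delayed_work(&my_work, 5 * HZ);  after ~5 seconds */
}

static void my_teardown(void)
{
        cancel_delayed_work(&my_work);
        flush_scheduled_work();  /* sleeps until pending work completes */
}
```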

Which bottom half to use?
If it must sleep, use work queues.
If it must scale well for multiple cpus and run at high frequency,
use softirqs and per-cpu data structures.
Otherwise, use tasklets.

Disabling bottom halves
Needed if KSPC code and a bottom half share data.
Only need to disable softirqs and tasklets.
Generally also need to grab a lock.
local_bh_disable() local_bh_enable()
Both work only on local processor.
They may be nested.
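The usual idiom combines both steps: spin_lock_bh() does a local_bh_disable() plus spin_lock(), so the bottom half stays off the local cpu while the lock excludes other cpus. A sketch (my_lock, shared_count, and bump are invented):

```c
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(my_lock);
static int shared_count;   /* shared with a tasklet */

/* Called from process context. */
static void bump(void)
{
        spin_lock_bh(&my_lock);    /* disable local BHs + take lock */
        shared_count++;
        spin_unlock_bh(&my_lock);  /* release lock + re-enable BHs */
}
```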
