Root/
1 | RCU on Uniprocessor Systems |
2 | |
3 | |
4 | A common misconception is that, on UP systems, the call_rcu() primitive |
5 | may immediately invoke its function. The basis of this misconception |
6 | is that since there is only one CPU, it should not be necessary to |
7 | wait for anything else to get done, since there are no other CPUs for |
8 | anything else to be happening on. Although this approach will -sort- -of- |
9 | work a surprising amount of the time, it is a very bad idea in general. |
10 | This document presents three examples that demonstrate exactly how bad |
11 | an idea this is. |
12 | |
13 | |
14 | Example 1: softirq Suicide |
15 | |
16 | Suppose that an RCU-based algorithm scans a linked list containing |
17 | elements A, B, and C in process context, and can delete elements from |
18 | this same list in softirq context. Suppose that the process-context scan |
19 | is referencing element B when it is interrupted by softirq processing, |
20 | which deletes element B, and then invokes call_rcu() to free element B |
21 | after a grace period. |
22 | |
23 | Now, if call_rcu() were to directly invoke its arguments, then upon return |
24 | from softirq, the list scan would find itself referencing a newly freed |
25 | element B. This situation can greatly decrease the life expectancy of |
26 | your kernel. |
27 | |
28 | This same problem can occur if call_rcu() is invoked from a hardware |
29 | interrupt handler. |
30 | |
31 | |
32 | Example 2: Function-Call Fatality |
33 | |
34 | Of course, one could avert the suicide described in the preceding example |
35 | by having call_rcu() directly invoke its arguments only if it was called |
36 | from process context. However, this can fail in a similar manner. |
37 | |
38 | Suppose that an RCU-based algorithm again scans a linked list containing |
39 | elements A, B, and C in process contexts, but that it invokes a function |
40 | on each element as it is scanned. Suppose further that this function |
41 | deletes element B from the list, then passes it to call_rcu() for deferred |
42 | freeing. This may be a bit unconventional, but it is perfectly legal |
43 | RCU usage, since call_rcu() must wait for a grace period to elapse. |
44 | Therefore, in this case, allowing call_rcu() to immediately invoke |
45 | its arguments would cause it to fail to make the fundamental guarantee |
46 | underlying RCU, namely that call_rcu() defers invoking its arguments until |
47 | all RCU read-side critical sections currently executing have completed. |
48 | |
49 | Quick Quiz #1: why is it -not- legal to invoke synchronize_rcu() in |
50 | this case? |
51 | |
52 | |
53 | Example 3: Death by Deadlock |
54 | |
55 | Suppose that call_rcu() is invoked while holding a lock, and that the |
56 | callback function must acquire this same lock. In this case, if |
57 | call_rcu() were to directly invoke the callback, the result would |
58 | be self-deadlock. |
59 | |
60 | In some cases, it would possible to restructure to code so that |
61 | the call_rcu() is delayed until after the lock is released. However, |
62 | there are cases where this can be quite ugly: |
63 | |
64 | 1. If a number of items need to be passed to call_rcu() within |
65 | the same critical section, then the code would need to create |
66 | a list of them, then traverse the list once the lock was |
67 | released. |
68 | |
69 | 2. In some cases, the lock will be held across some kernel API, |
70 | so that delaying the call_rcu() until the lock is released |
71 | requires that the data item be passed up via a common API. |
72 | It is far better to guarantee that callbacks are invoked |
73 | with no locks held than to have to modify such APIs to allow |
74 | arbitrary data items to be passed back up through them. |
75 | |
76 | If call_rcu() directly invokes the callback, painful locking restrictions |
77 | or API changes would be required. |
78 | |
79 | Quick Quiz #2: What locking restriction must RCU callbacks respect? |
80 | |
81 | |
82 | Summary |
83 | |
84 | Permitting call_rcu() to immediately invoke its arguments breaks RCU, |
85 | even on a UP system. So do not do it! Even on a UP system, the RCU |
86 | infrastructure -must- respect grace periods, and -must- invoke callbacks |
87 | from a known environment in which no locks are held. |
88 | |
89 | It -is- safe for synchronize_sched() and synchronize_rcu_bh() to return |
90 | immediately on an UP system. It is also safe for synchronize_rcu() |
91 | to return immediately on UP systems, except when running preemptable |
92 | RCU. |
93 | |
94 | Quick Quiz #3: Why can't synchronize_rcu() return immediately on |
95 | UP systems running preemptable RCU? |
96 | |
97 | |
98 | Answer to Quick Quiz #1: |
99 | Why is it -not- legal to invoke synchronize_rcu() in this case? |
100 | |
101 | Because the calling function is scanning an RCU-protected linked |
102 | list, and is therefore within an RCU read-side critical section. |
103 | Therefore, the called function has been invoked within an RCU |
104 | read-side critical section, and is not permitted to block. |
105 | |
106 | Answer to Quick Quiz #2: |
107 | What locking restriction must RCU callbacks respect? |
108 | |
109 | Any lock that is acquired within an RCU callback must be |
110 | acquired elsewhere using an _irq variant of the spinlock |
111 | primitive. For example, if "mylock" is acquired by an |
112 | RCU callback, then a process-context acquisition of this |
113 | lock must use something like spin_lock_irqsave() to |
114 | acquire the lock. |
115 | |
116 | If the process-context code were to simply use spin_lock(), |
117 | then, since RCU callbacks can be invoked from softirq context, |
118 | the callback might be called from a softirq that interrupted |
119 | the process-context critical section. This would result in |
120 | self-deadlock. |
121 | |
122 | This restriction might seem gratuitous, since very few RCU |
123 | callbacks acquire locks directly. However, a great many RCU |
124 | callbacks do acquire locks -indirectly-, for example, via |
125 | the kfree() primitive. |
126 | |
127 | Answer to Quick Quiz #3: |
128 | Why can't synchronize_rcu() return immediately on UP systems |
129 | running preemptable RCU? |
130 | |
131 | Because some other task might have been preempted in the middle |
132 | of an RCU read-side critical section. If synchronize_rcu() |
133 | simply immediately returned, it would prematurely signal the |
134 | end of the grace period, which would come as a nasty shock to |
135 | that other thread when it started running again. |
136 |
Branches:
ben-wpan
ben-wpan-stefan
javiroman/ks7010
jz-2.6.34
jz-2.6.34-rc5
jz-2.6.34-rc6
jz-2.6.34-rc7
jz-2.6.35
jz-2.6.36
jz-2.6.37
jz-2.6.38
jz-2.6.39
jz-3.0
jz-3.1
jz-3.11
jz-3.12
jz-3.13
jz-3.15
jz-3.16
jz-3.18-dt
jz-3.2
jz-3.3
jz-3.4
jz-3.5
jz-3.6
jz-3.6-rc2-pwm
jz-3.9
jz-3.9-clk
jz-3.9-rc8
jz47xx
jz47xx-2.6.38
master
Tags:
od-2011-09-04
od-2011-09-18
v2.6.34-rc5
v2.6.34-rc6
v2.6.34-rc7
v3.9