Paravirt_ops on IA64
====================
21 May 2008, Isaku Yamahata <yamahata@valinux.co.jp>


Introduction
------------
This document aims to make paravirt_ops/IA64 easier to maintain and to
encourage people to use it.

paravirt_ops (pv_ops for short) is the virtualization support mechanism
of the Linux kernel on x86. Several approaches to virtualization support
were proposed there, and paravirt_ops is the one that was adopted.
IA64 now has several virtualization technologies of its own, such as
kvm/IA64, xen/IA64 and many academic IA64 hypervisors, so it makes
sense to add a generic virtualization infrastructure to Linux/IA64.


What is paravirt_ops?
---------------------
paravirt_ops was developed on x86 to provide virtualization support
through an API rather than an ABI. It allows each hypervisor to override,
at the API level, the operations that matter to it, and it allows a
single kernel binary to run on every supported execution environment,
including the native machine.
Essentially, paravirt_ops is a set of function pointers representing
operations that correspond to low level sensitive instructions and to
high level functionality in various areas. One significant difference
from an ordinary function pointer table is that it allows optimization
by binary patching, because some of these operations are very
performance sensitive and the overhead of an indirect call is not
negligible. With binary patching, an indirect C function call can be
turned into a direct C function call or into in-place execution, which
eliminates the overhead.

The operations of paravirt_ops are therefore classified into three
categories.
- simple indirect call
  These operations correspond to high level functionality, so the
  overhead of an indirect call isn't very important.

- indirect call which allows optimization with binary patch
  These operations usually correspond to low level instructions. They
  are called frequently and are performance critical, so the overhead
  of the call is very important.

- a set of macros for hand written assembly code
  Hand written assembly code (.S files) also needs paravirtualization
  because it includes sensitive instructions or because some of its
  code paths are very performance critical.
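
The function pointer table described above can be sketched in plain C as
follows. This is a minimal illustration, not the kernel's code: the names
demo_pv_ops, demo_read_timer and demo_hypervisor_init are hypothetical,
and binary patching is not shown, so every call here stays a simple
indirect call (the first category).

```c
/* Hypothetical operation table: one slot per overridable operation. */
struct demo_pv_ops {
	const char *name;
	unsigned long (*read_timer)(void);
};

/* Native implementation; on real hardware this would be an instruction. */
static unsigned long demo_native_read_timer(void)
{
	return 100;
}

/* Hypervisor implementation; stands in for a hypercall or shared page. */
static unsigned long demo_hv_read_timer(void)
{
	return 200;
}

/* The table starts out pointing at the native implementations. */
static struct demo_pv_ops demo_pv_ops = {
	.name = "native",
	.read_timer = demo_native_read_timer,
};

/* A hypervisor port overrides only the slots it cares about. */
static void demo_hypervisor_init(void)
{
	demo_pv_ops.name = "demo-hypervisor";
	demo_pv_ops.read_timer = demo_hv_read_timer;
}

/* Callers always go through the table, so one binary serves both cases. */
static unsigned long demo_read_timer(void)
{
	return demo_pv_ops.read_timer();
}
```

Callers use demo_read_timer() without knowing which environment they run
in. Binary patching would replace the indirect call inside it with a
direct call or with the instruction itself.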


The relation to the IA64 machine vector
---------------------------------------
Linux/IA64 has the IA64 machine vector functionality, which allows the
kernel to switch implementations (e.g. initialization, ipi, dma api...)
depending on the platform it is running on.
Some implementations can be replaced very easily by defining a new
machine vector, so another approach to virtualization support would be
to enhance the machine vector functionality.
But the paravirt_ops approach was taken because
- virtualization support needs wider coverage than the machine vector
  provides, e.g. low level instruction paravirtualization. It must be
  initialized very early, before platform detection.

- virtualization support needs more functionality, such as binary patch.
  The calling overhead is probably not very large compared to the
  emulation overhead of virtualization. In the native case, however,
  the overhead should be eliminated completely.
  A single kernel binary should run in each environment, including
  native, and the overhead of paravirt_ops in the native environment
  should be as small as possible.

- for full virtualization technology, e.g. KVM/IA64 or the
  Xen/IA64 HVM domain, the result would be
  (the emulated platform machine vector, probably dig) + (pv_ops).
  This means that the virtualization support layer should be under
  the machine vector layer.

It might be better to move some function pointers from paravirt_ops to
the machine vector. In fact, the Xen domU case uses both pv_ops and the
machine vector.
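
The layering argued for above, with pv_ops underneath the machine vector,
might look roughly like the following sketch. All names (demo_pv_cpu_ops,
demo_machvec, demo_early_pv_init, demo_platform_detect) and the
arithmetic are hypothetical. The only point is that the lower table is
usable before platform detection and that a platform hook can call
through it.

```c
/* Lower layer: hypothetical pv_ops-style table of low level operations. */
struct demo_pv_cpu_ops {
	int (*read_reg)(void);
};

/* Upper layer: hypothetical machine-vector-style table of platform hooks. */
struct demo_machvec {
	const char *name;
	int (*send_ipi)(int cpu);
};

static int demo_native_read_reg(void) { return 1; }
static int demo_hv_read_reg(void)     { return 2; }

/* Statically initialized, so it is valid before platform detection. */
static struct demo_pv_cpu_ops demo_pv_cpu_ops = {
	.read_reg = demo_native_read_reg,
};

/* Hook of the emulated platform; it goes through the lower layer. */
static int demo_dig_send_ipi(int cpu)
{
	return cpu * 10 + demo_pv_cpu_ops.read_reg();
}

/* Filled in later, once the platform is known. */
static struct demo_machvec demo_machvec;

/* Step 1: very early, choose the low level operations. */
static void demo_early_pv_init(int on_hypervisor)
{
	if (on_hypervisor)
		demo_pv_cpu_ops.read_reg = demo_hv_read_reg;
}

/* Step 2: later, select the machine vector for the detected platform. */
static void demo_platform_detect(void)
{
	demo_machvec.name = "dig";
	demo_machvec.send_ipi = demo_dig_send_ipi;
}
```

A fully virtualized guest would run both steps: the machine vector of
the emulated platform on top, with the overridden low level operations
underneath.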


IA64 paravirt_ops
-----------------
This section discusses the concrete paravirt_ops.
Because of the architectural differences between ia64 and x86, the
resulting set of functions is very different from the x86 pv_ops.

- C function pointer tables
  They are not very performance critical, so a simple C indirect
  function call is acceptable. The following structures are defined at
  this moment. For details see linux/include/asm-ia64/paravirt.h
  - struct pv_info
    This structure describes the execution environment.
  - struct pv_init_ops
    This structure describes the various initialization hooks.
  - struct pv_iosapic_ops
    This structure describes hooks to iosapic operations.
  - struct pv_irq_ops
    This structure describes hooks to irq related operations.
  - struct pv_time_ops
    This structure describes hooks to steal time accounting.

- a set of indirect calls which need optimization
  Currently this class of functions corresponds to a subset of IA64
  intrinsics. The optimization with binary patch isn't implemented yet.
  struct pv_cpu_ops is defined. For details see
  linux/include/asm-ia64/paravirt_privop.h
  Mostly they correspond to ia64 intrinsics 1-to-1.
  Caveat: Currently they are defined as C indirect function pointers,
  but in order to support binary patch optimization, they will be
  changed to use GCC extended inline assembly code.

- a set of macros for hand written assembly code (.S files)
  For maintainability, the approach taken for .S files is to keep a
  single source file and compile it multiple times with different macro
  definitions. Each pv_ops instance must define those macros in order
  to compile.
  The important thing here is that sensitive but non-privileged
  instructions must be paravirtualized, and that some privileged
  instructions also need paravirtualization for reasonable performance.
  Developers who modify .S files must be aware of that. At this moment
  a simple checker is implemented to detect paravirtualization breakage,
  but it doesn't cover all the cases.

  Sometimes this set of macros is called pv_cpu_asm_op, but there is no
  corresponding structure in the source code.
  Those macros mostly correspond 1:1 to a subset of privileged
  instructions. See linux/include/asm-ia64/native/inst.h.
  Some functions written in assembly also need to be overridden, so
  each pv_ops instance has to define some macros for them. Again see
  linux/include/asm-ia64/native/inst.h.
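
The "single source, compiled multiple times" approach for .S files can
be illustrated with the C preprocessor. The sketch below is written in C
rather than assembly so that it is self-contained, and every name in it
(DEMO_PV_HYPERVISOR, DEMO_READ_PSR, DEMO_ENTRY_NAME and the demo_*
functions) is hypothetical. They are not the macros defined in inst.h.

```c
/*
 * Each pv_ops instance supplies its own macro definitions. Here the
 * choice is made with -DDEMO_PV_HYPERVISOR on the compiler command line.
 */
#ifdef DEMO_PV_HYPERVISOR
#define DEMO_READ_PSR()  demo_hv_read_psr()
#define DEMO_ENTRY_NAME  demo_hv_entry
#else
#define DEMO_READ_PSR()  demo_native_read_psr()
#define DEMO_ENTRY_NAME  demo_native_entry
#endif

/* Stands in for a privileged instruction executed directly. */
static inline unsigned long demo_native_read_psr(void)
{
	return 0x10;
}

/* Stands in for the paravirtualized replacement of that instruction. */
static inline unsigned long demo_hv_read_psr(void)
{
	return 0x20;
}

/*
 * Shared body, written once. The function name is also a macro, so the
 * objects from the two compilations can be linked into one kernel.
 */
unsigned long DEMO_ENTRY_NAME(void)
{
	return DEMO_READ_PSR() | 0x1;
}
```

Compiling the file once without and once with -DDEMO_PV_HYPERVISOR gives
two objects from the same source, one defining demo_native_entry and one
defining demo_hv_entry.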


Those structures must be initialized very early, before start_kernel,
probably in head.S using multiple entry points or some other trick.
For the native case implementation see linux/arch/ia64/kernel/paravirt.c.
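
One way to read the early initialization requirement is sketched below,
assuming a statically initialized table with native defaults and one
entry point per environment. The structure demo_pv_info, its fields and
the entry point names are hypothetical and are not taken from paravirt.c
or head.S.

```c
/* Hypothetical description of the execution environment. */
struct demo_pv_info {
	int paravirt_enabled;
	const char *name;
};

/*
 * Static initializer with native defaults: the table is valid before
 * any initialization code has run.
 */
static struct demo_pv_info demo_pv_info = {
	.paravirt_enabled = 0,
	.name = "bare hardware",
};

/* Common start routine; it can already rely on demo_pv_info. */
static int demo_start_kernel(void)
{
	return demo_pv_info.paravirt_enabled;
}

/* Entry point used on real hardware: nothing to override. */
int demo_native_entry(void)
{
	return demo_start_kernel();
}

/* Entry point used under a hypervisor: override first, then continue. */
int demo_hv_entry(void)
{
	demo_pv_info.paravirt_enabled = 1;
	demo_pv_info.name = "demo hypervisor";
	return demo_start_kernel();
}
```

Whichever entry point is taken, the common start routine sees a fully
initialized table.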