1 | SAS Layer |
2 | --------- |
3 | |
4 | The SAS Layer is a management infrastructure which manages |
5 | SAS LLDDs. It sits between SCSI Core and SAS LLDDs. The |
6 | layout is as follows: while SCSI Core is concerned with |
7 | SAM/SPC issues, and a SAS LLDD+sequencer is concerned with |
8 | phy/OOB/link management, the SAS layer is concerned with: |
9 | |
10 | * SAS Phy/Port/HA event management (LLDD generates, |
11 | SAS Layer processes), |
12 | * SAS Port management (creation/destruction), |
13 | * SAS Domain discovery and revalidation, |
14 | * SAS Domain device management, |
15 | * SCSI Host registration/unregistration, |
16 | * Device registration with SCSI Core (SAS) or libata |
17 | (SATA), and |
18 | * Expander management and exporting expander control |
19 | to user space. |
20 | |
21 | A SAS LLDD is a PCI device driver. It is concerned with |
22 | phy/OOB management and vendor-specific tasks, and generates |
23 | events to the SAS layer. |
24 | |
25 | The SAS Layer does most SAS tasks as outlined in the SAS 1.1 |
26 | spec. |
27 | |
28 | The sas_ha_struct describes the SAS LLDD to the SAS layer. |
29 | Most of it is used by the SAS Layer but a few fields need to |
30 | be initialized by the LLDDs. |
31 | |
32 | After initializing your hardware, from the probe() function |
33 | you call sas_register_ha(). It will register your LLDD with |
34 | the SCSI subsystem, creating a SCSI host and it will |
35 | register your SAS driver with the sysfs SAS tree it creates. |
36 | It will then return. Then you enable your phys to actually |
37 | start OOB (at which point your driver will start calling the |
38 | notify_* event callbacks). |
39 | |
40 | Structure descriptions: |
41 | |
42 | struct sas_phy -------------------- |
43 | Normally this is statically embedded in your driver's |
44 | phy structure: |
45 | struct my_phy { |
46 | blah; |
47 | struct sas_phy sas_phy; |
48 | bleh; |
49 | }; |
50 | And then all the phys are an array of my_phy in your HA |
51 | struct (shown below). |
52 | |
53 | Then as you go along and initialize your phys you also |
54 | initialize the sas_phy struct, along with your own |
55 | phy structure. |
56 | |
57 | In general, the phys are managed by the LLDD and the ports |
58 | are managed by the SAS layer. So the phys are initialized |
59 | and updated by the LLDD and the ports are initialized and |
60 | updated by the SAS layer. |
61 | |
62 | There is a scheme where the LLDD can read and write certain |
63 | fields which the SAS layer can only read, and vice versa. |
64 | The idea is to avoid unnecessary locking. |
65 | |
66 | enabled -- must be set (0/1) |
67 | id -- must be set [0,MAX_PHYS) |
68 | class, proto, type, role, oob_mode, linkrate -- must be set |
69 | oob_mode -- you set this when OOB has finished and then notify |
70 | the SAS Layer. |
71 | |
72 | sas_addr -- this normally points to an array holding the sas |
73 | address of the phy, possibly somewhere in your my_phy |
74 | struct. |
75 | |
76 | attached_sas_addr -- set this when you (LLDD) receive an |
77 | IDENTIFY frame or a FIS frame, _before_ notifying the SAS |
78 | layer. The idea is that sometimes the LLDD may want to fake |
79 | or provide a different SAS address on that phy/port and this |
80 | allows it to do this. At best you should copy the sas |
81 | address from the IDENTIFY frame or maybe generate a SAS |
82 | address for SATA directly attached devices. The Discover |
83 | process may later change this. |
84 | |
85 | frame_rcvd -- this is where you copy the IDENTIFY/FIS frame |
86 | when you get it; you lock, copy, set frame_rcvd_size and |
87 | unlock the lock, and then call the event. It is a pointer |
88 | since there's no way to know your hw frame size _exactly_, |
89 | so you define the actual array in your phy struct and let |
90 | this pointer point to it. You copy the frame from your |
91 | DMAable memory to that area while holding the lock. |
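
The lock, copy, set size, unlock, notify sequence described above
can be sketched in user space as follows. This is only an
illustration under stated assumptions: the structures are reduced
stand-ins for the real ones, the lock is modelled as a plain flag
rather than a spinlock, and my_phy_init(), my_frame_received(),
record_port_event() and MY_FRAME_MAX are hypothetical names. The
same pattern applies to sas_prim.

```c
#include <string.h>

enum port_event { PORTE_BYTES_DMAED };

struct sas_phy;

/* Reduced stand-in: only the callback used in this sketch. */
struct sas_ha_struct {
	void (*notify_port_event)(struct sas_phy *, enum port_event);
};

/* Reduced stand-in for struct sas_phy. */
struct sas_phy {
	int frame_rcvd_locked;          /* stand-in for a real lock */
	unsigned char *frame_rcvd;      /* points into the LLDD's phy struct */
	int frame_rcvd_size;
	struct sas_ha_struct *ha;
};

#define MY_FRAME_MAX 64                 /* hypothetical hardware frame size */

struct my_phy {
	struct sas_phy sas_phy;
	unsigned char frame_buf[MY_FRAME_MAX];  /* the actual array */
};

/* Stand-in for the SAS layer's handler; records what it was told. */
static int seen_frame_size = -1;

static void record_port_event(struct sas_phy *phy, enum port_event ev)
{
	if (ev == PORTE_BYTES_DMAED)
		seen_frame_size = phy->frame_rcvd_size;
}

static void my_phy_init(struct my_phy *phy, struct sas_ha_struct *ha)
{
	memset(phy, 0, sizeof(*phy));
	phy->sas_phy.frame_rcvd = phy->frame_buf;
	phy->sas_phy.ha = ha;
}

/* Returns 0 on success, -1 if the frame does not fit the buffer. */
static int my_frame_received(struct my_phy *phy, const void *dma_buf, int len)
{
	struct sas_phy *sp = &phy->sas_phy;

	if (len < 0 || len > MY_FRAME_MAX)
		return -1;

	sp->frame_rcvd_locked = 1;                      /* lock */
	memcpy(sp->frame_rcvd, dma_buf, (size_t)len);   /* copy */
	sp->frame_rcvd_size = len;                      /* set size */
	sp->frame_rcvd_locked = 0;                      /* unlock */

	/* Only now tell the SAS layer that a frame is available. */
	sp->ha->notify_port_event(sp, PORTE_BYTES_DMAED);
	return 0;
}
```

The point of the ordering is that the SAS layer never sees the event
before the frame and its size are both in place.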
92 | |
93 | sas_prim -- this is where primitives go when they're |
94 | received. See sas.h. Grab the lock, set the primitive, |
95 | release the lock, notify. |
96 | |
97 | port -- this points to the sas_port this phy is part of, |
98 | if the phy belongs to a port. It is set by the SAS Layer; |
99 | the LLDD only reads it. |
100 | |
101 | ha -- may be set; the SAS layer sets it anyway. |
102 | |
103 | lldd_phy -- you should set this to point to your phy so you |
104 | can find your way around faster when the SAS layer calls one |
105 | of your callbacks and passes you a phy. If the sas_phy is |
106 | embedded you can also use container_of -- whatever you |
107 | prefer. |
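
The two ways of getting from a sas_phy back to your own phy
structure can be sketched as follows. This is a user-space
illustration, not kernel code: container_of is re-defined here
without the kernel's type checking, struct sas_phy is reduced to
two fields, and hw_index and the two helper functions are
hypothetical names.

```c
#include <stddef.h>

/* User-space stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Reduced stand-in for struct sas_phy. */
struct sas_phy {
	int id;
	void *lldd_phy;          /* back pointer set by the LLDD */
};

struct my_phy {
	int hw_index;            /* hypothetical driver-private field */
	struct sas_phy sas_phy;  /* embedded, as described above */
};

/* Option A: follow the back pointer stored at initialization time. */
static struct my_phy *my_phy_from_lldd_phy(struct sas_phy *sas_phy)
{
	return sas_phy->lldd_phy;
}

/* Option B: compute the enclosing structure from the member address. */
static struct my_phy *my_phy_from_container(struct sas_phy *sas_phy)
{
	return container_of(sas_phy, struct my_phy, sas_phy);
}
```

Option B only works when the sas_phy really is embedded in my_phy;
option A also works if it is allocated separately.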
108 | |
109 | |
110 | struct sas_port -------------------- |
111 | The LLDD doesn't set any fields of this struct -- it only |
112 | reads them. They should be self explanatory. |
113 | |
114 | phy_mask is 32 bit, this should be enough for now, as I |
115 | haven't heard of a HA having more than 8 phys. |
116 | |
117 | lldd_port -- I haven't found a use for that -- maybe other |
118 | LLDDs which wish to have an internal port representation can |
119 | make use of this. |
120 | |
121 | |
122 | struct sas_ha_struct -------------------- |
123 | It normally is statically declared in your own LLDD |
124 | structure describing your adapter: |
125 | struct my_sas_ha { |
126 | blah; |
127 | struct sas_ha_struct sas_ha; |
128 | struct my_phy phys[MAX_PHYS]; |
129 | struct sas_port sas_ports[MAX_PHYS]; /* (1) */ |
130 | bleh; |
131 | }; |
132 | |
133 | (1) If your LLDD doesn't have its own port representation. |
134 | |
135 | What needs to be initialized (sample function given below). |
136 | |
137 | pcidev |
138 | sas_addr -- since the SAS layer doesn't want to mess with |
139 | memory allocation, etc, this points to statically |
140 | allocated array somewhere (say in your host adapter |
141 | structure) and holds the SAS address of the host |
142 | adapter as given by you or the manufacturer, etc. |
143 | sas_port |
144 | sas_phy -- an array of pointers to structures. (see |
145 | note above on sas_addr). |
146 | These must be set. See more notes below. |
147 | num_phys -- the number of phys present in the sas_phy array, |
148 | and the number of ports present in the sas_port |
149 | array. There can be a maximum of num_phys ports (one |
150 | per phy) so we drop the num_ports, and only use |
151 | num_phys. |
152 | |
153 | The event interface: |
154 | |
155 | /* LLDD calls these to notify the class of an event. */ |
156 | void (*notify_ha_event)(struct sas_ha_struct *, enum ha_event); |
157 | void (*notify_port_event)(struct sas_phy *, enum port_event); |
158 | void (*notify_phy_event)(struct sas_phy *, enum phy_event); |
159 | |
160 | When sas_register_ha() returns, those are set and can be |
161 | called by the LLDD to notify the SAS layer of such |
162 | events. |
163 | |
164 | The port notification: |
165 | |
166 | /* The class calls these to notify the LLDD of an event. */ |
167 | void (*lldd_port_formed)(struct sas_phy *); |
168 | void (*lldd_port_deformed)(struct sas_phy *); |
169 | |
170 | If the LLDD wants notification when a port has been formed |
171 | or deformed it sets those to a function satisfying the type. |
172 | |
173 | A SAS LLDD should also implement at least one of the Task |
174 | Management Functions (TMFs) described in SAM: |
175 | |
176 | /* Task Management Functions. Must be called from process context. */ |
177 | int (*lldd_abort_task)(struct sas_task *); |
178 | int (*lldd_abort_task_set)(struct domain_device *, u8 *lun); |
179 | int (*lldd_clear_aca)(struct domain_device *, u8 *lun); |
180 | int (*lldd_clear_task_set)(struct domain_device *, u8 *lun); |
181 | int (*lldd_I_T_nexus_reset)(struct domain_device *); |
182 | int (*lldd_lu_reset)(struct domain_device *, u8 *lun); |
183 | int (*lldd_query_task)(struct sas_task *); |
184 | |
185 | For more information please read SAM from T10.org. |
186 | |
187 | Port and Adapter management: |
188 | |
189 | /* Port and Adapter management */ |
190 | int (*lldd_clear_nexus_port)(struct sas_port *); |
191 | int (*lldd_clear_nexus_ha)(struct sas_ha_struct *); |
192 | |
193 | A SAS LLDD should implement at least one of those. |
194 | |
195 | Phy management: |
196 | |
197 | /* Phy management */ |
198 | int (*lldd_control_phy)(struct sas_phy *, enum phy_func); |
199 | |
200 | lldd_ha -- set this to point to your HA struct. You can also |
201 | use container_of if you embedded it as shown above. |
202 | |
203 | A sample initialization and registration function |
204 | can look like this (called last thing from probe()) |
205 | *but* before you enable the phys to do OOB: |
206 | |
207 | static int register_sas_ha(struct my_sas_ha *my_ha) |
208 | { |
209 | int i; |
210 | static struct sas_phy *sas_phys[MAX_PHYS]; |
211 | static struct sas_port *sas_ports[MAX_PHYS]; |
212 | |
213 | my_ha->sas_ha.sas_addr = &my_ha->sas_addr[0]; |
214 | |
215 | for (i = 0; i < MAX_PHYS; i++) { |
216 | sas_phys[i] = &my_ha->phys[i].sas_phy; |
217 | sas_ports[i] = &my_ha->sas_ports[i]; |
218 | } |
219 | |
220 | my_ha->sas_ha.sas_phy = sas_phys; |
221 | my_ha->sas_ha.sas_port = sas_ports; |
222 | my_ha->sas_ha.num_phys = MAX_PHYS; |
223 | |
224 | my_ha->sas_ha.lldd_port_formed = my_port_formed; |
225 | |
226 | my_ha->sas_ha.lldd_dev_found = my_dev_found; |
227 | my_ha->sas_ha.lldd_dev_gone = my_dev_gone; |
228 | |
229 | my_ha->sas_ha.lldd_max_execute_num = lldd_max_execute_num; /* (1) */ |
230 | |
231 | my_ha->sas_ha.lldd_queue_size = ha_can_queue; |
232 | my_ha->sas_ha.lldd_execute_task = my_execute_task; |
233 | |
234 | my_ha->sas_ha.lldd_abort_task = my_abort_task; |
235 | my_ha->sas_ha.lldd_abort_task_set = my_abort_task_set; |
236 | my_ha->sas_ha.lldd_clear_aca = my_clear_aca; |
237 | my_ha->sas_ha.lldd_clear_task_set = my_clear_task_set; |
238 | my_ha->sas_ha.lldd_I_T_nexus_reset = NULL; /* (2) */ |
239 | my_ha->sas_ha.lldd_lu_reset = my_lu_reset; |
240 | my_ha->sas_ha.lldd_query_task = my_query_task; |
241 | |
242 | my_ha->sas_ha.lldd_clear_nexus_port = my_clear_nexus_port; |
243 | my_ha->sas_ha.lldd_clear_nexus_ha = my_clear_nexus_ha; |
244 | |
245 | my_ha->sas_ha.lldd_control_phy = my_control_phy; |
246 | |
247 | return sas_register_ha(&my_ha->sas_ha); |
248 | } |
249 | |
250 | (1) This is normally a LLDD parameter, something along the |
251 | lines of a task collector. What it tells the SAS Layer is |
252 | whether the SAS layer should run in Direct Mode (default: |
253 | value 0 or 1) or Task Collector Mode (value greater than 1). |
254 | |
255 | In Direct Mode, the SAS Layer calls Execute Task as soon as |
256 | it has a command to send to the SDS, _and_ this is a single |
257 | command, i.e. not linked. |
258 | |
259 | Some hardware (e.g. aic94xx) has the capability to DMA more |
260 | than one task at a time (interrupt) from host memory. Task |
261 | Collector Mode is an optional feature for HAs which support |
262 | this in their hardware. (Again, it is completely optional |
263 | even if your hardware supports it.) |
264 | |
265 | In Task Collector Mode, the SAS Layer would do _natural_ |
266 | coalescing of tasks and at the appropriate moment it would |
267 | call your driver to DMA more than one task in a single HA |
268 | interrupt. DBMS may want to use this by insmod/modprobe |
269 | setting the lldd_max_execute_num to something greater than |
270 | 1. |
271 | |
272 | (2) SAS 1.1 does not define I_T Nexus Reset TMF. |
273 | |
274 | Events |
275 | ------ |
276 | |
277 | Events are _the only way_ a SAS LLDD notifies the SAS layer |
278 | of anything. There is no other method or way for a LLDD to tell |
279 | the SAS layer of anything happening internally or in the SAS |
280 | domain. |
281 | |
282 | Phy events: |
283 | PHYE_LOSS_OF_SIGNAL, (C) |
284 | PHYE_OOB_DONE, |
285 | PHYE_OOB_ERROR, (C) |
286 | PHYE_SPINUP_HOLD. |
287 | |
288 | Port events, passed on a _phy_: |
289 | PORTE_BYTES_DMAED, (M) |
290 | PORTE_BROADCAST_RCVD, (E) |
291 | PORTE_LINK_RESET_ERR, (C) |
292 | PORTE_TIMER_EVENT, (C) |
293 | PORTE_HARD_RESET. |
294 | |
295 | Host Adapter event: |
296 | HAE_RESET |
297 | |
298 | A SAS LLDD should be able to generate |
299 | - at least one event from group C (choice), |
300 | - events marked M (mandatory) are mandatory (only one), |
301 | - events marked E (expander) if it wants the SAS layer |
302 | to handle domain revalidation (only one such). |
303 | - Unmarked events are optional. |
304 | |
305 | Meaning: |
306 | |
307 | HAE_RESET -- when your HA got internal error and was reset. |
308 | |
309 | PORTE_BYTES_DMAED -- on receiving an IDENTIFY/FIS frame |
310 | PORTE_BROADCAST_RCVD -- on receiving a primitive |
311 | PORTE_LINK_RESET_ERR -- timer expired, loss of signal, loss |
312 | of DWS, etc. (*) |
313 | PORTE_TIMER_EVENT -- DWS reset timeout timer expired (*) |
314 | PORTE_HARD_RESET -- Hard Reset primitive received. |
315 | |
316 | PHYE_LOSS_OF_SIGNAL -- the device is gone (*) |
317 | PHYE_OOB_DONE -- OOB went fine and oob_mode is valid |
318 | PHYE_OOB_ERROR -- Error while doing OOB, the device probably |
319 | got disconnected. (*) |
320 | PHYE_SPINUP_HOLD -- SATA is present, COMWAKE not sent. |
321 | |
322 | (*) should set/clear the appropriate fields in the phy, |
323 | or alternatively call the inlined sas_phy_disconnected() |
324 | which is just a helper, from their tasklet. |
325 | |
326 | The Execute Command SCSI RPC: |
327 | |
328 | int (*lldd_execute_task)(struct sas_task *, int num, |
329 | unsigned long gfp_flags); |
330 | |
331 | Used to queue a task to the SAS LLDD. @task is the task(s) to |
332 | be executed. @num should be the number of tasks being |
333 | queued at this function call (they are linked via |
334 | task::list), @gfp_flags should be the gfp mask defining the |
335 | context of the caller. |
336 | |
337 | This function should implement the Execute Command SCSI RPC, |
338 | or if you're sending a SCSI Task as linked commands, you |
339 | should also use this function. |
340 | |
341 | That is, when lldd_execute_task() is called, the command(s) |
342 | go out on the transport *immediately*. There is *no* |
343 | queuing of any sort and at any level in a SAS LLDD. |
344 | |
345 | The use of task::list is two-fold, one for linked commands, |
346 | the other discussed below. |
347 | |
348 | It is possible to queue up more than one task at a time, by |
349 | initializing the list element of struct sas_task, and |
350 | passing the number of tasks enlisted in this manner in num. |
351 | |
352 | Returns: -SAS_QUEUE_FULL, -ENOMEM, nothing was queued; |
353 | 0, the task(s) were queued. |
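
Passing several tasks in one call, with the all-or-nothing return
described above, can be sketched in user space like this. It is an
illustration under assumptions, not the real interface: the list
helpers are minimal re-implementations of the kernel ones, struct
sas_task is reduced to what the sketch needs, the gfp_flags argument
is dropped, the value of SAS_QUEUE_FULL is a placeholder, and
MY_HA_SLOTS, my_slots_in_use and the sent field are hypothetical.

```c
#include <stddef.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Minimal circular doubly linked list, modelled on the kernel's. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Reduced stand-in for struct sas_task. */
struct sas_task {
	struct list_head list;
	int sent;                /* hypothetical: set once handed to hardware */
};

#define SAS_QUEUE_FULL 1         /* placeholder value for this sketch */
#define MY_HA_SLOTS    4         /* hypothetical hardware capacity */

static int my_slots_in_use;

/* Either all num tasks go out, or none does. */
static int my_execute_task(struct sas_task *task, int num)
{
	struct sas_task *t = task;
	int i;

	if (my_slots_in_use + num > MY_HA_SLOTS)
		return -SAS_QUEUE_FULL;   /* nothing was queued */

	for (i = 0; i < num; i++) {
		t->sent = 1;              /* goes out on the transport now */
		t = container_of(t->list.next, struct sas_task, list);
	}
	my_slots_in_use += num;
	return 0;                         /* the task(s) were queued */
}
```

A caller initializes the list element of the first task and links
the others to it before passing the first task and the count.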
354 | |
355 | If you want to pass num > 1, then either |
356 | A) you're the only caller of this function and keep track |
357 | of what you've queued to the LLDD, or |
358 | B) you know what you're doing and have a strategy of |
359 | retrying. |
360 | |
361 | As opposed to queuing one task at a time (function call), |
362 | batch queuing of tasks, by having num > 1, greatly |
363 | simplifies LLDD code, sequencer code, and _hardware design_, |
364 | and has some performance advantages in certain situations |
365 | (DBMS). |
366 | |
367 | The LLDD advertises if it can take more than one command at |
368 | a time at lldd_execute_task(), by setting the |
369 | lldd_max_execute_num parameter (controlled by "collector" |
370 | module parameter in aic94xx SAS LLDD). |
371 | |
372 | You should leave this to the default 1, unless you know what |
373 | you're doing. |
374 | |
375 | This is a function of the LLDD, which the SAS layer can |
376 | cater to. |
377 | |
378 | int lldd_queue_size |
379 | The host adapter's queue size. This is the maximum |
380 | number of commands the lldd can have pending to domain |
381 | devices on behalf of all upper layers submitting through |
382 | lldd_execute_task(). |
383 | |
384 | You really want to set this to something (much) larger than |
385 | 1. |
386 | |
387 | This _really_ has absolutely nothing to do with queuing. |
388 | There is no queuing in SAS LLDDs. |
389 | |
390 | struct sas_task { |
391 | dev -- the device this task is destined to |
392 | list -- must be initialized (INIT_LIST_HEAD) |
393 | task_proto -- _one_ of enum sas_proto |
394 | scatter -- pointer to scatter gather list array |
395 | num_scatter -- number of elements in scatter |
396 | total_xfer_len -- total number of bytes expected to be transferred |
397 | data_dir -- PCI_DMA_... |
398 | task_done -- callback when the task has finished execution |
399 | }; |
400 | |
401 | When an external entity, i.e. an entity other than the LLDD or the |
402 | SAS Layer, wants to work with a struct domain_device, it |
403 | _must_ call kobject_get() when getting a handle on the |
404 | device and kobject_put() when it is done with the device. |
405 | |
406 | This does two things: |
407 | A) implements proper kfree() for the device; |
408 | B) increments/decrements the kref for all players: |
409 | domain_device |
410 | all domain_device's ... (if past an expander) |
411 | port |
412 | host adapter |
413 | pci device |
414 | and up the ladder, etc. |
415 | |
416 | DISCOVERY |
417 | --------- |
418 | |
419 | The sysfs tree has the following purposes: |
420 | a) It shows you the physical layout of the SAS domain at |
421 | the current time, i.e. how the domain looks in the |
422 | physical world right now. |
423 | b) Shows some device parameters _at_discovery_time_. |
424 | |
425 | This is a link to the tree(1) program, very useful in |
426 | viewing the SAS domain: |
427 | ftp://mama.indstate.edu/linux/tree/ |
428 | I expect user space applications to actually create a |
429 | graphical interface of this. |
430 | |
431 | That is, the sysfs domain tree doesn't show or keep state if |
432 | you, e.g., change the READY LED MEANING |
433 | setting, but it does show you the current connection status |
434 | of the domain device. |
435 | |
436 | Keeping internal device state changes is the responsibility of |
437 | upper layers (Command set drivers) and user space. |
438 | |
439 | When a device or devices are unplugged from the domain, this |
440 | is reflected in the sysfs tree immediately, and the device(s) |
441 | removed from the system. |
442 | |
443 | The structure domain_device describes any device in the SAS |
444 | domain. It is completely managed by the SAS layer. A task |
445 | points to a domain device, this is how the SAS LLDD knows |
446 | where to send the task(s) to. A SAS LLDD only reads the |
447 | contents of the domain_device structure, but it never creates |
448 | or destroys one. |
449 | |
450 | Expander management from User Space |
451 | ----------------------------------- |
452 | |
453 | In each expander directory in sysfs, there is a file called |
454 | "smp_portal". It is a binary sysfs attribute file, which |
455 | implements an SMP portal (Note: this is *NOT* an SMP port), |
456 | to which user space applications can send SMP requests and |
457 | receive SMP responses. |
458 | |
459 | Functionality is deceptively simple: |
460 | |
461 | 1. Build the SMP frame you want to send. The format and layout |
462 | is described in the SAS spec. Leave the CRC field equal to 0. |
463 | open(2) |
464 | 2. Open the expander's SMP portal sysfs file in RW mode. |
465 | write(2) |
466 | 3. Write the frame you built in 1. |
467 | read(2) |
468 | 4. Read the amount of data you expect to receive for the frame you built. |
469 | If you receive a different amount of data than you expected, |
470 | then there was some kind of error. |
471 | close(2) |
472 | This whole process is shown in detail in the function do_smp_func() |
473 | and its callers, in the file "expander_conf.c". |
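
For orientation, the four steps above map onto plain POSIX calls
roughly as in the sketch below. This is not the code from
"expander_conf.c"; smp_portal_xfer() is a hypothetical helper, and
the caller is assumed to have built the request frame and to know
the response size it expects.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send one SMP request through an expander's SMP portal file and read
 * the response. Returns 0 on success, -1 on any error, including a
 * short write or a response of unexpected length.
 */
static int smp_portal_xfer(const char *portal_path,
			   const void *req, size_t req_len,
			   void *resp, size_t resp_len)
{
	ssize_t n;
	int ret = -1;
	int fd;

	fd = open(portal_path, O_RDWR);          /* step 2: open in RW mode */
	if (fd < 0)
		return -1;

	n = write(fd, req, req_len);             /* step 3: write the frame */
	if (n == (ssize_t)req_len) {
		n = read(fd, resp, resp_len);    /* step 4: read the response */
		if (n == (ssize_t)resp_len)
			ret = 0;                 /* got exactly what we expected */
	}

	close(fd);
	return ret;
}
```

Treating any size mismatch as an error follows step 4 above.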
474 | |
475 | The kernel functionality is implemented in the file |
476 | "sas_expander.c". |
477 | |
478 | The program "expander_conf.c" implements this. It takes one |
479 | argument, the sysfs file name of the SMP portal to the |
480 | expander, and gives expander information, including routing |
481 | tables. |
482 | |
483 | The SMP portal gives you complete control of the expander, |
484 | so please be careful. |
485 |