DCCP protocol
=============


Contents
========

- Introduction
- Missing features
- Socket options
- Sysctl variables
- IOCTLS
- Notes

Introduction
============

Datagram Congestion Control Protocol (DCCP) is an unreliable,
connection-oriented protocol designed to solve issues present in UDP and TCP,
particularly for real-time and multimedia (streaming) traffic.
It is divided into a base protocol (RFC 4340) and pluggable congestion control
modules called CCIDs (congestion control IDs). As with pluggable TCP congestion
control, at least one CCID needs to be enabled in order for the protocol to
function properly. In the Linux implementation, this is the TCP-like CCID2
(RFC 4341). Additional CCIDs, such as the TCP-friendly CCID3 (RFC 4342), are
optional.
For a brief introduction to CCIDs and suggestions for choosing a CCID to match
given applications, see section 10 of RFC 4340.

DCCP is a Proposed Standard (RFC 2026), and the homepage for DCCP as a protocol
is at http://www.ietf.org/html.charters/dccp-charter.html

Missing features
================

The Linux DCCP implementation does not currently support all the features that
are specified in RFCs 4340 to 4342.

The known bugs are listed at:
http://linux-net.osdl.org/index.php/TODO#DCCP

For more up-to-date versions of the DCCP implementation, please consider using
the experimental DCCP test tree; instructions for checking this out are at:
http://linux-net.osdl.org/index.php/DCCP_Testing#Experimental_DCCP_source_tree


Socket options
==============

DCCP_SOCKOPT_SERVICE sets the service code. The specification mandates the use
of service codes (RFC 4340, sec. 8.1.2); if this socket option is not set, the
socket falls back to 0 (which means that no meaningful service code is
present). On active sockets this is set before connect(); specifying more than
one code has no effect (all subsequent service codes are ignored). The case is
different for passive sockets, where multiple service codes (up to 32) can be
set before calling bind().
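
As an illustration of the active-socket case, the following is a minimal
sketch rather than a reference implementation. The helper names
dccp_set_service() and dccp_client_socket() are hypothetical, and the sketch
assumes that the kernel expects the 32-bit service code in network byte order
(hence htonl()). The SOL_DCCP fallback value is the one used by the kernel
headers, for C libraries that do not define it.

```c
#include <arpa/inet.h>   /* htonl() */
#include <netinet/in.h>  /* IPPROTO_DCCP */
#include <stdint.h>
#include <sys/socket.h>  /* socket(), setsockopt(), SOCK_DCCP */
#include <unistd.h>      /* close() */
#include <linux/dccp.h>  /* DCCP_SOCKOPT_SERVICE */

#ifndef SOL_DCCP
#define SOL_DCCP 269     /* fallback: value used by the kernel headers */
#endif

/* Hypothetical helper: set a single service code on a DCCP socket.
 * Assumption: the code is passed in network byte order.
 * Returns 0 on success, -1 with errno set on failure. */
int dccp_set_service(int fd, uint32_t service_code)
{
    uint32_t code_be = htonl(service_code);

    return setsockopt(fd, SOL_DCCP, DCCP_SOCKOPT_SERVICE,
                      &code_be, sizeof(code_be));
}

/* Hypothetical helper: create an IPv4 DCCP socket and set its service
 * code, so that the caller only has to connect() afterwards.
 * Returns the socket descriptor, or -1 if DCCP is unavailable. */
int dccp_client_socket(uint32_t service_code)
{
    int fd = socket(AF_INET, SOCK_DCCP, IPPROTO_DCCP);

    if (fd < 0)
        return -1;
    if (dccp_set_service(fd, service_code) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A client would call dccp_client_socket() with its service code and then
connect() on the returned descriptor; a return value of -1 typically means
that the running kernel has no DCCP support.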

DCCP_SOCKOPT_GET_CUR_MPS is read-only and retrieves the current maximum packet
size (application payload size) in bytes; see RFC 4340, section 14.
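
Querying the value could look like the sketch below. The helper name
dccp_get_cur_mps() is hypothetical, and the sketch assumes that the option
value is returned as an `int'.

```c
#include <sys/socket.h>  /* getsockopt(), socklen_t */
#include <linux/dccp.h>  /* DCCP_SOCKOPT_GET_CUR_MPS */

#ifndef SOL_DCCP
#define SOL_DCCP 269     /* fallback: value used by the kernel headers */
#endif

/* Hypothetical helper: return the current maximum packet size in bytes,
 * or -1 with errno set if the query fails. */
int dccp_get_cur_mps(int fd)
{
    int mps = 0;
    socklen_t len = sizeof(mps);

    if (getsockopt(fd, SOL_DCCP, DCCP_SOCKOPT_GET_CUR_MPS, &mps, &len) < 0)
        return -1;
    return mps;
}
```

An application can use the result to keep each datagram within the size that
the connection currently supports.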

DCCP_SOCKOPT_AVAILABLE_CCIDS is also read-only and returns the list of CCIDs
supported by the endpoint. The option value is an array of type uint8_t whose
size is passed as option length. The minimum array size is 4 elements; the
value returned in the optlen argument always reflects the true number of
built-in CCIDs.
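
A possible way to read the list is sketched below. The helper name
dccp_available_ccids() is hypothetical. Because optlen reports the true
number of built-in CCIDs, the sketch returns that count and leaves it to the
caller to clamp it to the buffer capacity.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>  /* getsockopt(), socklen_t */
#include <linux/dccp.h>  /* DCCP_SOCKOPT_AVAILABLE_CCIDS */

#ifndef SOL_DCCP
#define SOL_DCCP 269     /* fallback: value used by the kernel headers */
#endif

/* Hypothetical helper: copy the list of built-in CCIDs into ccids[].
 * cap is the capacity of ccids[] in elements and should be at least 4.
 * Returns the number of built-in CCIDs reported through optlen, or -1
 * with errno set. Only the first min(result, cap) entries are valid. */
int dccp_available_ccids(int fd, uint8_t *ccids, size_t cap)
{
    socklen_t len = (socklen_t)cap;

    if (getsockopt(fd, SOL_DCCP, DCCP_SOCKOPT_AVAILABLE_CCIDS,
                   ccids, &len) < 0)
        return -1;
    return (int)len;
}
```

A caller would typically pass a small fixed array, for example
`uint8_t ccids[8]', and iterate over the valid entries to check whether a
desired CCID is present.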

DCCP_SOCKOPT_CCID is write-only and sets both the TX and RX CCIDs at the same
time, combining the operation of the next two socket options. This option is
preferable to the latter two, since applications will often use the same type
of CCID for both directions, and mixed use of CCIDs is not currently well
understood. This socket option takes as argument at least one uint8_t value, or
an array of uint8_t values, which must match available CCIDs (see above). CCIDs
must be registered on the socket before calling connect() or listen().
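
Registering a CCID preference list could be sketched as follows. The helper
name dccp_set_ccid_prefs() is hypothetical, and the list {3, 2} in the usage
note is only an example.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>  /* setsockopt(), socklen_t */
#include <linux/dccp.h>  /* DCCP_SOCKOPT_CCID */

#ifndef SOL_DCCP
#define SOL_DCCP 269     /* fallback: value used by the kernel headers */
#endif

/* Hypothetical helper: set the same CCID preference list for the TX and
 * RX directions. prefs[] holds n CCID numbers (n >= 1). Call this before
 * connect() or listen(). Returns 0 on success, -1 with errno set. */
int dccp_set_ccid_prefs(int fd, const uint8_t *prefs, size_t n)
{
    return setsockopt(fd, SOL_DCCP, DCCP_SOCKOPT_CCID,
                      prefs, (socklen_t)n);
}
```

For example, an application that would like CCID3 but can live with CCID2
might pass `const uint8_t prefs[] = { 3, 2 };' together with sizeof(prefs).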

DCCP_SOCKOPT_TX_CCID is read/write. It returns the current CCID (if set) or sets
the preference list for the TX CCID, using the same format as DCCP_SOCKOPT_CCID.
Please note that the getsockopt argument type here is `int', not uint8_t.

DCCP_SOCKOPT_RX_CCID is analogous to DCCP_SOCKOPT_TX_CCID, but for the RX CCID.

DCCP_SOCKOPT_SERVER_TIMEWAIT enables the server (listening socket) to hold
timewait state when closing the connection (RFC 4340, 8.3). The usual case is
that the closing server sends a CloseReq, whereupon the client holds timewait
state. When this boolean socket option is on, the server sends a Close instead
and will enter TIMEWAIT. This option must be set after accept() returns.

DCCP_SOCKOPT_SEND_CSCOV and DCCP_SOCKOPT_RECV_CSCOV are used for setting the
partial checksum coverage (RFC 4340, sec. 9.2). The default is that checksums
always cover the entire packet and that only fully covered application data is
accepted by the receiver. Hence, when using this feature on the sender, it must
also be enabled at the receiver, with a suitable choice of CsCov.

DCCP_SOCKOPT_SEND_CSCOV sets the sender checksum coverage. Values in the
        range 0..15 are acceptable. The default setting is 0 (full coverage);
        values in the range 1..15 indicate partial coverage.
DCCP_SOCKOPT_RECV_CSCOV is for the receiver and has a different meaning: it
        sets a threshold, where again values 0..15 are acceptable. The default
        of 0 means that all packets with a partial coverage will be discarded.
        Values in the range 1..15 indicate that packets with at least this
        coverage value are also acceptable. The higher the number, the more
        restrictive this setting (see [RFC 4340, sec. 9.2.1]). Partial coverage
        settings are inherited by the child socket after accept().
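
The two rules above can be restated as small pure functions. This is a sketch
of the arithmetic only, not the kernel code: the function names are
hypothetical, and the coverage formula (a CsCov of 1..15 covers the header,
options and the first (CsCov-1)*4 bytes of application data) is taken from
RFC 4340, sec. 9.2.

```c
#include <stdbool.h>
#include <stddef.h>

/* Number of application-data bytes protected by the checksum for a given
 * sender CsCov value (0..15). CsCov 0 means full coverage; CsCov 1..15
 * covers the first (CsCov - 1) * 4 payload bytes, capped here at the
 * payload length. */
size_t cscov_covered_payload(unsigned int cscov, size_t payload_len)
{
    size_t covered;

    if (cscov == 0)
        return payload_len;
    covered = (size_t)(cscov - 1) * 4;
    return covered < payload_len ? covered : payload_len;
}

/* Receiver-side check: is a packet carrying pkt_cscov acceptable when the
 * receiver threshold (DCCP_SOCKOPT_RECV_CSCOV) is min_cscov? Fully covered
 * packets always pass; partially covered ones pass only if the threshold
 * is non-zero and the packet's coverage value is at least the threshold. */
bool cscov_acceptable(unsigned int pkt_cscov, unsigned int min_cscov)
{
    if (pkt_cscov == 0)
        return true;
    return min_cscov != 0 && pkt_cscov >= min_cscov;
}
```

For instance, a sender value of 3 protects the first 8 payload bytes, and a
receiver threshold of 3 accepts packets sent with CsCov 0 or 3..15 while
discarding those sent with CsCov 1 or 2.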

The following two options apply to CCID 3 exclusively and are getsockopt()-only.
In either case, a TFRC info struct (defined in <linux/tfrc.h>) is returned.
DCCP_SOCKOPT_CCID_RX_INFO
        Returns a `struct tfrc_rx_info' in optval; the optval buffer and
        optlen must both be at least sizeof(struct tfrc_rx_info).
DCCP_SOCKOPT_CCID_TX_INFO
        Returns a `struct tfrc_tx_info' in optval; the optval buffer and
        optlen must both be at least sizeof(struct tfrc_tx_info).

On unidirectional connections it is useful to close the unused half-connection
via shutdown (SHUT_WR or SHUT_RD): this will reduce per-packet processing costs.

Sysctl variables
================
Several DCCP default parameters can be managed by the following sysctls
(sysctl net.dccp.default or /proc/sys/net/dccp/default):

request_retries
        The number of active connection initiation retries (the number of
        Requests minus one) before timing out. In addition, it also governs
        the behaviour of the other, passive side: this variable also sets
        the number of times DCCP repeats sending a Response when the initial
        handshake does not progress from RESPOND to OPEN (i.e. when no Ack
        is received after the initial Request). This value should be greater
        than 0; a value less than 10 is suggested. Analogue of
        tcp_syn_retries.

retries1
        How often a DCCP Response is retransmitted until the listening DCCP
        side considers its connecting peer dead. Analogue of tcp_retries1.

retries2
        The number of times a general DCCP packet is retransmitted. This has
        importance for retransmitted acknowledgments and feature negotiation;
        data packets are never retransmitted. Analogue of tcp_retries2.

tx_ccid = 2
        Default CCID for the sender-receiver half-connection. Depending on the
        choice of CCID, the Send Ack Vector feature is enabled automatically.

rx_ccid = 2
        Default CCID for the receiver-sender half-connection; see tx_ccid.

seq_window = 100
        The initial sequence window (sec. 7.5.2) of the sender. This influences
        the local ackno validity and the remote seqno validity windows (7.5.1).

tx_qlen = 5
        The size of the transmit buffer in packets. A value of 0 corresponds
        to an unbounded transmit buffer.

sync_ratelimit = 125 ms
        The timeout between subsequent DCCP-Sync packets sent in response to
        sequence-invalid packets on the same socket (RFC 4340, 7.5.4). The unit
        of this parameter is milliseconds; a value of 0 disables rate-limiting.
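
The current defaults can also be read programmatically through the /proc
path given above. The following is a minimal sketch; the helper name
dccp_read_sysctl() is hypothetical, and it assumes that each variable is
exposed as a single integer in its own file.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: read one integer-valued DCCP default, such as
 * "tx_ccid" or "seq_window", from /proc/sys/net/dccp/default/<name>.
 * Returns 0 and stores the value on success; returns -1 if the file is
 * missing (for example when DCCP is not loaded) or cannot be parsed. */
int dccp_read_sysctl(const char *name, long *value)
{
    char path[128];
    FILE *f;
    int n, ok;

    n = snprintf(path, sizeof(path), "/proc/sys/net/dccp/default/%s", name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%ld", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

Writing a new default works the same way in the other direction (or via
sysctl -w) and normally requires root privileges.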

IOCTLS
======
FIONREAD
        Works as in udp(7): returns in the `int' argument pointer the size of
        the next pending datagram in bytes, or 0 when no datagram is pending.
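
A receiver could use this ioctl to size its buffer before reading, as in the
sketch below. The helper name dccp_next_datagram_size() is hypothetical.

```c
#include <sys/ioctl.h>   /* ioctl(), FIONREAD */

/* Hypothetical helper: return the size in bytes of the next pending
 * datagram, 0 if none is pending, or -1 with errno set on error. */
int dccp_next_datagram_size(int fd)
{
    int pending = 0;

    if (ioctl(fd, FIONREAD, &pending) < 0)
        return -1;
    return pending;
}
```

A positive result can be used as the buffer length for the following recv()
call, so that the datagram is read without truncation.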

Notes
=====

DCCP does not currently traverse many NAT devices successfully. This is
because, as in TCP and UDP, the checksum covers the pseudo-header, so a NAT
that rewrites addresses without updating the DCCP checksum invalidates the
packet. Linux NAT support for DCCP has been added.