wireproto: add frame flag to denote payloads as CBOR...
Gregory Szorc -
r37315:e9aadee6 default
@@ -1,1638 +1,1640
1 1 The Mercurial wire protocol is a request-response based protocol
2 2 with multiple wire representations.
3 3
4 4 Each request is modeled as a command name, a dictionary of arguments, and
5 5 optional raw input. Command arguments and their types are intrinsic
6 6 properties of commands. So is the response type of the command. This means
7 7 clients can't always send arbitrary arguments to servers and servers can't
8 8 return multiple response types.
9 9
10 10 The protocol is synchronous and does not support multiplexing (concurrent
11 11 commands).
12 12
13 13 Handshake
14 14 =========
15 15
16 16 It is required or common for clients to perform a *handshake* when connecting
17 17 to a server. The handshake serves the following purposes:
18 18
19 19 * Negotiates protocol/transport level options
20 20 * Allows the client to learn about server capabilities to influence
21 21 future requests
22 22 * Ensures the underlying transport channel is in a *clean* state
23 23
24 24 An important goal of the handshake is to allow clients to use more modern
25 25 wire protocol features. By default, clients must assume they are talking
26 26 to an old version of Mercurial server (possibly even the very first
27 27 implementation). So, clients should not attempt to call or utilize modern
28 28 wire protocol features until they have confirmation that the server
29 29 supports them. The handshake implementation is designed to allow both
30 30 ends to utilize the latest set of features and capabilities with as
31 31 few round trips as possible.
32 32
33 33 The handshake mechanism varies by transport and protocol and is documented
34 34 in the sections below.
35 35
36 36 HTTP Protocol
37 37 =============
38 38
39 39 Handshake
40 40 ---------
41 41
42 42 The client sends a ``capabilities`` command request (``?cmd=capabilities``)
43 43 as soon as HTTP requests may be issued.
44 44
45 45 The server responds with a capabilities string, which the client parses to
46 46 learn about the server's abilities.
47 47
48 48 HTTP Version 1 Transport
49 49 ------------------------
50 50
51 51 Commands are issued as HTTP/1.0 or HTTP/1.1 requests. Commands are
52 52 sent to the base URL of the repository with the command name sent in
53 53 the ``cmd`` query string parameter. e.g.
54 54 ``https://example.com/repo?cmd=capabilities``. The HTTP method is ``GET``
55 55 or ``POST`` depending on the command and whether there is a request
56 56 body.
57 57
58 58 Command arguments can be sent multiple ways.
59 59
60 60 The simplest is part of the URL query string using ``x-www-form-urlencoded``
61 61 encoding (see Python's ``urllib.urlencode()``). However, many servers impose
62 62 length limitations on the URL. So this mechanism is typically only used if
63 63 the server doesn't support other mechanisms.
64 64
65 65 If the server supports the ``httpheader`` capability, command arguments can
66 66 be sent in HTTP request headers named ``X-HgArg-<N>`` where ``<N>`` is an
67 67 integer starting at 1. A ``x-www-form-urlencoded`` representation of the
68 68 arguments is obtained. This full string is then split into chunks and sent
69 69 in numbered ``X-HgArg-<N>`` headers. The maximum length of each HTTP header
70 70 is defined by the server in the ``httpheader`` capability value, which defaults
71 71 to ``1024``. The server reassembles the encoded arguments string by
72 72 concatenating the ``X-HgArg-<N>`` headers then URL decodes them into a
73 73 dictionary.
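
The header-based encoding described above can be sketched in Python as follows. This is a minimal illustration, not Mercurial's implementation: the function names are hypothetical, and the sketch assumes the size limit applies to each header value alone (whether the header name and separators count toward the limit is not covered here).

```python
from urllib.parse import urlencode, parse_qsl


def encode_args_as_headers(args, max_value_len=1024):
    """Split URL-encoded arguments across numbered X-HgArg-<N> headers."""
    # Sorting only makes the output deterministic for this example.
    encoded = urlencode(sorted(args.items()))
    chunks = [
        encoded[i:i + max_value_len]
        for i in range(0, len(encoded), max_value_len)
    ]
    # Header numbering starts at 1.
    return {"X-HgArg-%d" % (n + 1): chunk for n, chunk in enumerate(chunks)}


def decode_args_from_headers(headers):
    """Concatenate X-HgArg-<N> headers in order, then URL decode them."""
    parts = []
    n = 1
    while "X-HgArg-%d" % n in headers:
        parts.append(headers["X-HgArg-%d" % n])
        n += 1
    return dict(parse_qsl("".join(parts), keep_blank_values=True))
```

Because the chunks are concatenated before decoding, a split that falls in the middle of a percent escape is harmless.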
74 74
75 75 The list of ``X-HgArg-<N>`` headers should be added to the ``Vary`` request
76 76 header to instruct caches to take these headers into consideration when caching
77 77 requests.
78 78
79 79 If the server supports the ``httppostargs`` capability, the client
80 80 may send command arguments in the HTTP request body as part of an
81 81 HTTP POST request. The command arguments will be URL encoded just like
82 82 they would for sending them via HTTP headers. However, no splitting is
83 83 performed: the raw arguments are included in the HTTP request body.
84 84
85 85 The client sends a ``X-HgArgs-Post`` header with the string length of the
86 86 encoded arguments data. Additional data may be included in the HTTP
87 87 request body immediately following the argument data. The offset of the
88 88 non-argument data is defined by the ``X-HgArgs-Post`` header. The
89 89 ``X-HgArgs-Post`` header is not required if there is no argument data.
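
A minimal Python sketch of the ``httppostargs`` body layout follows. The helper names are hypothetical; the sketch only illustrates how the ``X-HgArgs-Post`` value marks the boundary between the encoded arguments and any additional data.

```python
from urllib.parse import urlencode, parse_qsl


def build_post_request(args, extra_data=b""):
    """Return (headers, body) with arguments placed first in the body."""
    # urlencode() output is always ASCII because of percent-encoding.
    encoded = urlencode(sorted(args.items())).encode("ascii")
    headers = {}
    if encoded:
        # The header is not required when there is no argument data.
        headers["X-HgArgs-Post"] = str(len(encoded))
    return headers, encoded + extra_data


def split_post_body(headers, body):
    """Split a request body into (arguments, non-argument data)."""
    arg_len = int(headers.get("X-HgArgs-Post", "0"))
    args = dict(parse_qsl(body[:arg_len].decode("ascii"),
                          keep_blank_values=True))
    return args, body[arg_len:]
```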
90 90
91 91 Additional command data can be sent as part of the HTTP request body. The
92 92 default ``Content-Type`` when sending data is ``application/mercurial-0.1``.
93 93 A ``Content-Length`` header is currently always sent.
94 94
95 95 Example HTTP requests::
96 96
97 97 GET /repo?cmd=capabilities
98 98 X-HgArg-1: foo=bar&baz=hello%20world
99 99
100 100 The request media type should be chosen based on server support. If the
101 101 ``httpmediatype`` server capability is present, the client should send
102 102 the newest mutually supported media type. If this capability is absent,
103 103 the client must assume the server only supports the
104 104 ``application/mercurial-0.1`` media type.
105 105
106 106 The ``Content-Type`` HTTP response header identifies the response as coming
107 107 from Mercurial and can also be used to signal an error has occurred.
108 108
109 109 The ``application/mercurial-*`` media types indicate a generic Mercurial
110 110 data type.
111 111
112 112 The ``application/mercurial-0.1`` media type is raw Mercurial data. It is the
113 113 predecessor of the format below.
114 114
115 115 The ``application/mercurial-0.2`` media type is compression framed Mercurial
116 116 data. The first byte of the payload indicates the length of the compression
117 117 format identifier that follows. Next are N bytes indicating the compression
118 118 format. e.g. ``zlib``. The remaining bytes are compressed according to that
119 119 compression format. The decompressed data behaves the same as with
120 120 ``application/mercurial-0.1``.
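
The ``application/mercurial-0.2`` framing can be illustrated with a short Python sketch. The function name is hypothetical, and only the ``zlib`` identifier mentioned above is handled; any other compression format is rejected here purely to keep the example small.

```python
import zlib


def decode_mercurial_02(payload):
    """Decode a compression-framed body into raw Mercurial data."""
    # First byte: length of the compression format identifier.
    name_len = payload[0]
    # Next N bytes: the identifier itself, e.g. b"zlib".
    name = payload[1:1 + name_len]
    # Everything after the identifier is the compressed data.
    compressed = payload[1 + name_len:]
    if name == b"zlib":
        return zlib.decompress(compressed)
    raise ValueError("unsupported compression format: %r" % name)
```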
121 121
122 122 The ``application/hg-error`` media type indicates a generic error occurred.
123 123 The content of the HTTP response body typically holds text describing the
124 124 error.
125 125
126 126 The ``application/hg-changegroup`` media type indicates a changegroup response
127 127 type.
128 128
129 129 Clients also accept the ``text/plain`` media type. All other media
130 130 types should cause the client to error.
131 131
132 132 Behavior of media types is further described in the ``Content Negotiation``
133 133 section below.
134 134
135 135 Clients should issue a ``User-Agent`` request header that identifies the client.
136 136 The server should not use the ``User-Agent`` for feature detection.
137 137
138 138 A command returning a ``string`` response issues a
139 139 ``application/mercurial-0.*`` media type and the HTTP response body contains
140 140 the raw string value (after compression decoding, if used). A
141 141 ``Content-Length`` header is typically issued, but not required.
142 142
143 143 A command returning a ``stream`` response issues a
144 144 ``application/mercurial-0.*`` media type and the HTTP response is typically
145 145 using *chunked transfer* (``Transfer-Encoding: chunked``).
146 146
147 147 HTTP Version 2 Transport
148 148 ------------------------
149 149
150 150 **Experimental - feature under active development**
151 151
152 152 Version 2 of the HTTP protocol is exposed under the ``/api/*`` URL space.
153 153 Its final API name is not yet formalized.
154 154
155 155 Commands are triggered by sending HTTP POST requests against URLs of the
156 156 form ``<permission>/<command>``, where ``<permission>`` is ``ro`` or
157 157 ``rw``, meaning read-only and read-write, respectively, and ``<command>``
158 158 is a named wire protocol command.
159 159
160 160 Non-POST request methods MUST be rejected by the server with an HTTP
161 161 405 response.
162 162
163 163 Commands that modify repository state in meaningful ways MUST NOT be
164 164 exposed under the ``ro`` URL prefix. All available commands MUST be
165 165 available under the ``rw`` URL prefix.
166 166
167 167 Server administrators MAY implement blanket HTTP authentication keyed
168 168 off the URL prefix. For example, a server may require authentication
169 169 for all ``rw/*`` URLs and let unauthenticated requests to ``ro/*``
170 170 URL proceed. A server MAY issue an HTTP 401, 403, or 407 response
171 171 in accordance with RFC 7235. Clients SHOULD recognize the HTTP Basic
172 172 (RFC 7617) and Digest (RFC 7616) authentication schemes. Clients SHOULD
173 173 make an attempt to recognize unknown schemes using the
174 174 ``WWW-Authenticate`` response header on a 401 response, as defined by
175 175 RFC 7235.
176 176
177 177 Read-only commands are accessible under ``rw/*`` URLs so clients can
178 178 signal the intent of the operation very early in the connection
179 179 lifecycle. For example, a ``push`` operation - which consists of
180 180 various read-only commands mixed with at least one read-write command -
181 181 can perform all commands against ``rw/*`` URLs so that any server-side
182 182 authentication requirements are discovered upon attempting the first
183 183 command - not potentially several commands into the exchange. This
184 184 allows clients to fail faster or prompt for credentials as soon as the
185 185 exchange takes place. This provides a better end-user experience.
186 186
187 187 Requests to unknown commands or URLs result in an HTTP 404.
188 188 TODO formally define response type, how error is communicated, etc.
189 189
190 190 HTTP request and response bodies use the *Unified Frame-Based Protocol*
191 191 (defined below) for media exchange. The entirety of the HTTP message
192 192 body is 0 or more frames as defined by this protocol.
193 193
194 194 Clients and servers MUST advertise the ``TBD`` media type via the
195 195 ``Content-Type`` request and response headers. In addition, clients MUST
196 196 advertise this media type value in their ``Accept`` request header in all
197 197 requests.
198 198 TODO finalize the media type. For now, it is defined in wireprotoserver.py.
199 199
200 200 Servers receiving requests without an ``Accept`` header SHOULD respond with
201 201 an HTTP 406.
202 202
203 203 Servers receiving requests with an invalid ``Content-Type`` header SHOULD
204 204 respond with an HTTP 415.
205 205
206 206 The command to run is specified in the POST payload as defined by the
207 207 *Unified Frame-Based Protocol*. This is redundant with data already
208 208 encoded in the URL. This is by design, so server operators can have
209 209 better understanding about server activity from looking merely at
210 210 HTTP access logs.
211 211
212 212 In most circumstances, the command specified in the URL MUST match
213 213 the command specified in the frame-based payload or the server will
214 214 respond with an error. The exception to this is the special
215 215 ``multirequest`` URL. (See below.) In addition, HTTP requests
216 216 are limited to one command invocation. The exception is the special
217 217 ``multirequest`` URL.
218 218
219 219 The ``multirequest`` command endpoints (``ro/multirequest`` and
220 220 ``rw/multirequest``) are special in that they allow the execution of
221 221 *any* command and allow the execution of multiple commands. If the
222 222 HTTP request issues multiple commands across multiple frames, all
223 223 issued commands will be processed by the server. Per the defined
224 224 behavior of the *Unified Frame-Based Protocol*, commands may be
225 225 issued interleaved and responses may come back in a different order
226 226 than they were issued. Clients MUST be able to deal with this.
227 227
228 228 SSH Protocol
229 229 ============
230 230
231 231 Handshake
232 232 ---------
233 233
234 234 For all clients, the handshake consists of the client sending 1 or more
235 235 commands to the server using version 1 of the transport. Servers respond
236 236 to commands they know how to respond to and send an empty response (``0\n``)
237 237 for unknown commands (per standard behavior of version 1 of the transport).
238 238 Clients then typically look for a response to the newest sent command to
239 239 determine which transport version to use and what the available features for
240 240 the connection and server are.
241 241
242 242 Preceding any response from client-issued commands, the server may print
243 243 non-protocol output. It is common for SSH servers to print banners, message
244 244 of the day announcements, etc. when clients connect. It is assumed that any
245 245 such *banner* output will precede any Mercurial server output. So clients
246 246 must be prepared to handle server output on initial connect that isn't
247 247 in response to any client-issued command and doesn't conform to Mercurial's
248 248 wire protocol. This *banner* output should only be on stdout. However,
249 249 some servers may send output on stderr.
250 250
251 251 Pre 0.9.1 clients issue a ``between`` command with the ``pairs`` argument
252 252 having the value
253 253 ``0000000000000000000000000000000000000000-0000000000000000000000000000000000000000``.
254 254
255 255 The ``between`` command has been supported since the original Mercurial
256 256 SSH server. Requesting the empty range will return a ``\n`` string response,
257 257 which will be encoded as ``1\n\n`` (value length of ``1`` followed by a newline
258 258 followed by the value, which happens to be a newline).
259 259
260 260 For pre 0.9.1 clients and all servers, the exchange looks like::
261 261
262 262 c: between\n
263 263 c: pairs 81\n
264 264 c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
265 265 s: 1\n
266 266 s: \n
267 267
268 268 0.9.1+ clients send a ``hello`` command (with no arguments) before the
269 269 ``between`` command. The response to this command allows clients to
270 270 discover server capabilities and settings.
271 271
272 272 An example exchange between 0.9.1+ clients and a ``hello`` aware server looks
273 273 like::
274 274
275 275 c: hello\n
276 276 c: between\n
277 277 c: pairs 81\n
278 278 c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
279 279 s: 324\n
280 280 s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
281 281 s: 1\n
282 282 s: \n
283 283
284 284 And a similar scenario but with servers sending a banner on connect::
285 285
286 286 c: hello\n
287 287 c: between\n
288 288 c: pairs 81\n
289 289 c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
290 290 s: welcome to the server\n
291 291 s: if you find any issues, email someone@somewhere.com\n
292 292 s: 324\n
293 293 s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
294 294 s: 1\n
295 295 s: \n
296 296
297 297 Note that output from the ``hello`` command is terminated by a ``\n``. This is
298 298 part of the response payload and not part of the wire protocol adding a newline
299 299 after responses. In other words, the length of the response contains the
300 300 trailing ``\n``.
301 301
302 302 Clients supporting version 2 of the SSH transport send a line beginning
303 303 with ``upgrade`` before the ``hello`` and ``between`` commands. The line
304 304 (which isn't a well-formed command line because it doesn't consist of a
305 305 single command name) serves to both communicate the client's intent to
306 306 switch to transport version 2 (transports are version 1 by default) as
307 307 well as to advertise the client's transport-level capabilities so the
308 308 server may satisfy that request immediately.
309 309
310 310 The upgrade line has the form:
311 311
312 312 upgrade <token> <transport capabilities>
313 313
314 314 That is the literal string ``upgrade`` followed by a space, followed by
315 315 a randomly generated string, followed by a space, followed by a string
316 316 denoting the client's transport capabilities.
317 317
318 318 The token can be anything. However, a random UUID is recommended. (Use
319 319 of version 4 UUIDs is recommended because version 1 UUIDs can leak the
320 320 client's MAC address.)
321 321
322 322 The transport capabilities string is a URL/percent encoded string
323 323 containing key-value pairs defining the client's transport-level
324 324 capabilities. The following capabilities are defined:
325 325
326 326 proto
327 327 A comma-delimited list of transport protocol versions the client
328 328 supports. e.g. ``ssh-v2``.
329 329
330 330 If the server does not recognize the ``upgrade`` line, it should issue
331 331 an empty response and continue processing the ``hello`` and ``between``
332 332 commands. Here is an example handshake between a version 2 aware client
333 333 and a non version 2 aware server:
334 334
335 335 c: upgrade 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a proto=ssh-v2
336 336 c: hello\n
337 337 c: between\n
338 338 c: pairs 81\n
339 339 c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
340 340 s: 0\n
341 341 s: 324\n
342 342 s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
343 343 s: 1\n
344 344 s: \n
345 345
346 346 (The initial ``0\n`` line from the server indicates an empty response to
347 347 the unknown ``upgrade ..`` command/line.)
348 348
349 349 If the server recognizes the ``upgrade`` line and is willing to satisfy that
350 350 upgrade request, it replies with a payload of the following form:
351 351
352 352 upgraded <token> <transport name>\n
353 353
354 354 This line is the literal string ``upgraded``, a space, the token that was
355 355 specified by the client in its ``upgrade ...`` request line, a space, and the
356 356 name of the transport protocol that was chosen by the server. The transport
357 357 name MUST match one of the names the client specified in the ``proto`` field
358 358 of its ``upgrade ...`` request line.
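
The ``upgrade`` request line and the check of the ``upgraded`` reply can be sketched in Python as shown here. The function names are hypothetical, and the sketch assumes the ``upgrade`` line is terminated by ``\n`` like other command lines.

```python
import uuid
from urllib.parse import urlencode


def build_upgrade_line(protocols=("ssh-v2",)):
    """Return (token, line) for a client upgrade request."""
    # A version 4 UUID is random and does not leak the MAC address.
    token = str(uuid.uuid4())
    # Transport capabilities are URL/percent encoded key-value pairs.
    caps = urlencode({"proto": ",".join(protocols)})
    # Assumption: the line ends with a newline.
    return token, ("upgrade %s %s\n" % (token, caps)).encode("ascii")


def parse_upgraded_line(line, token, protocols=("ssh-v2",)):
    """Return the chosen transport name, or None for any other line."""
    parts = line.rstrip(b"\n").split(b" ")
    if len(parts) != 3 or parts[0] != b"upgraded":
        return None
    # The echoed token distinguishes the reply from banner output.
    if parts[1] != token.encode("ascii"):
        return None
    name = parts[2].decode("ascii", "replace")
    # The server must pick a name the client advertised.
    return name if name in protocols else None
```

A client would call ``parse_upgraded_line()`` on each line it reads after connecting; a ``None`` result means the line is banner output or belongs to the version 1 exchange.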
359 359
360 360 If a server issues an ``upgraded`` response, it MUST also read and ignore
361 361 the lines associated with the ``hello`` and ``between`` command requests
362 362 that were issued by the client. It is assumed that the negotiated transport
363 363 will respond with equivalent requested information following the transport
364 364 handshake.
365 365
366 366 All data following the ``\n`` terminating the ``upgraded`` line is the
367 367 domain of the negotiated transport. It is common for the data immediately
368 368 following to contain additional metadata about the state of the transport and
369 369 the server. However, this isn't strictly speaking part of the transport
370 370 handshake and isn't covered by this section.
371 371
372 372 Here is an example handshake between a version 2 aware client and a version
373 373 2 aware server:
374 374
375 375 c: upgrade 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a proto=ssh-v2
376 376 c: hello\n
377 377 c: between\n
378 378 c: pairs 81\n
379 379 c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
380 380 s: upgraded 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a ssh-v2\n
381 381 s: <additional transport specific data>
382 382
383 383 The client-issued token that is echoed in the response provides a more
384 384 resilient mechanism for differentiating *banner* output from Mercurial
385 385 output. In version 1, properly formatted banner output could get confused
386 386 for Mercurial server output. By submitting a randomly generated token
387 387 that is then present in the response, the client can look for that token
388 388 in response lines and have reasonable certainty that the line did not
389 389 originate from a *banner* message.
390 390
391 391 SSH Version 1 Transport
392 392 -----------------------
393 393
394 394 The SSH transport (version 1) is a custom text-based protocol suitable for
395 395 use over any bi-directional stream transport. It is most commonly used with
396 396 SSH.
397 397
398 398 A SSH transport server can be started with ``hg serve --stdio``. The stdin,
399 399 stderr, and stdout file descriptors of the started process are used to exchange
400 400 data. When Mercurial connects to a remote server over SSH, it actually starts
401 401 a ``hg serve --stdio`` process on the remote server.
402 402
403 403 Commands are issued by sending the command name followed by a trailing newline
404 404 ``\n`` to the server. e.g. ``capabilities\n``.
405 405
406 406 Command arguments are sent in the following format::
407 407
408 408 <argument> <length>\n<value>
409 409
410 410 That is, the argument string name followed by a space followed by the
411 411 integer length of the value (expressed as a string) followed by a newline
412 412 (``\n``) followed by the raw argument value.
413 413
414 414 Dictionary arguments are encoded differently::
415 415
416 416 <argument> <# elements>\n
417 417 <key1> <length1>\n<value1>
418 418 <key2> <length2>\n<value2>
419 419 ...
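
The two argument encodings above can be combined into one small Python sketch. The function name is hypothetical; values are treated as raw byte strings, and a ``dict`` value triggers the dictionary encoding.

```python
def encode_command(name, args):
    """Serialize a command name and its arguments for the SSH transport."""
    out = [name + b"\n"]
    for key, value in args.items():
        if isinstance(value, dict):
            # Dictionary argument: element count, then each key/value pair.
            out.append(b"%s %d\n" % (key, len(value)))
            for subkey, subvalue in value.items():
                out.append(b"%s %d\n%s" % (subkey, len(subvalue), subvalue))
        else:
            # Plain argument: name, value length, newline, raw value.
            out.append(b"%s %d\n%s" % (key, len(value), value))
    return b"".join(out)
```

For the handshake's ``between`` request, ``encode_command(b"between", {b"pairs": pairs})`` with the 81 byte empty range yields ``between\npairs 81\n`` followed by the value.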
420 420
421 421 Non-argument data is sent immediately after the final argument value. It is
422 422 encoded in chunks::
423 423
424 424 <length>\n<data>
425 425
426 426 Each command declares a list of supported arguments and their types. If a
427 427 client sends an unknown argument to the server, the server should abort
428 428 immediately. The special argument ``*`` in a command's definition indicates
429 429 that all argument names are allowed.
430 430
431 431 The definition of supported arguments and types is initially made when a
432 432 new command is implemented. The client and server must initially independently
433 433 agree on the arguments and their types. This initial set of arguments can be
434 434 supplemented through the presence of *capabilities* advertised by the server.
435 435
436 436 Each command has a defined expected response type.
437 437
438 438 A ``string`` response type is a length framed value. The response consists of
439 439 the string encoded integer length of a value followed by a newline (``\n``)
440 440 followed by the value. Empty values are allowed (and are represented as
441 441 ``0\n``).
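
A minimal Python sketch of the ``string`` response framing, with hypothetical helper names:

```python
import io


def encode_string_response(value):
    """Frame a value as <length>\\n<value>."""
    return b"%d\n%s" % (len(value), value)


def read_string_response(stream):
    """Read one length-framed value from a binary stream."""
    length = int(stream.readline().rstrip(b"\n"))
    # Note: on an unbuffered pipe, read() may return fewer bytes than
    # requested; a robust reader would loop until `length` bytes arrive.
    return stream.read(length)


# The empty-range ``between`` reply from the handshake is b"1\n\n".
stream = io.BytesIO(encode_string_response(b"\n") + encode_string_response(b""))
first = read_string_response(stream)   # the single newline value
second = read_string_response(stream)  # the empty value
```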
442 442
443 443 A ``stream`` response type consists of raw bytes of data. There is no framing.
444 444
445 445 A generic error response type is also supported. It consists of an error
446 446 message written to ``stderr`` followed by ``\n-\n``. In addition, ``\n`` is
447 447 written to ``stdout``.
448 448
449 449 If the server receives an unknown command, it will send an empty ``string``
450 450 response.
451 451
452 452 The server terminates if it receives an empty command (a ``\n`` character).
453 453
454 454 SSH Version 2 Transport
455 455 -----------------------
456 456
457 457 **Experimental and under development**
458 458
459 459 Version 2 of the SSH transport behaves identically to version 1 of the SSH
460 460 transport with the exception of handshake semantics. See above for how
461 461 version 2 of the SSH transport is negotiated.
462 462
463 463 Immediately following the ``upgraded`` line signaling a switch to version
464 464 2 of the SSH protocol, the server automatically sends additional details
465 465 about the capabilities of the remote server. This has the form:
466 466
467 467 <integer length of value>\n
468 468 capabilities: ...\n
469 469
470 470 e.g.
471 471
472 472 s: upgraded 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a ssh-v2\n
473 473 s: 240\n
474 474 s: capabilities: known getbundle batch ...\n
475 475
476 476 Following capabilities advertisement, the peers communicate using version
477 477 1 of the SSH transport.
478 478
479 479 Unified Frame-Based Protocol
480 480 ============================
481 481
482 482 **Experimental and under development**
483 483
484 484 The *Unified Frame-Based Protocol* is a communications protocol between
485 485 Mercurial peers. The protocol aims to be mostly transport agnostic
486 486 (works similarly on HTTP, SSH, etc).
487 487
488 488 To operate the protocol, a bi-directional, half-duplex pipe supporting
489 489 ordered sends and receives is required. That is, each peer has one pipe
490 490 for sending data and another for receiving.
491 491
492 492 All data is read and written in atomic units called *frames*. These
493 493 are conceptually similar to TCP packets. Higher-level functionality
494 494 is built on the exchange and processing of frames.
495 495
496 496 All frames are associated with a *stream*. A *stream* provides a
497 497 unidirectional grouping of frames. Streams facilitate two goals:
498 498 content encoding and parallelism. There is a dedicated section on
499 499 streams below.
500 500
501 501 The protocol is request-response based: the client issues requests to
502 502 the server, which issues replies to those requests. Server-initiated
503 503 messaging is not currently supported, but this specification carves
504 504 out room to implement it.
505 505
506 506 All frames are associated with a numbered request. Frames can thus
507 507 be logically grouped by their request ID.
508 508
509 509 Frames begin with an 8 octet header followed by a variable length
510 510 payload::
511 511
512 512 +------------------------------------------------+
513 513 | Length (24) |
514 514 +--------------------------------+---------------+
515 515 | Request ID (16) | Stream ID (8) |
516 516 +------------------+-------------+---------------+
517 517 | Stream Flags (8) |
518 518 +-----------+------+
519 519 | Type (4) |
520 520 +-----------+
521 521 | Flags (4) |
522 522 +===========+===================================================|
523 523 | Frame Payload (0...) ...
524 524 +---------------------------------------------------------------+
525 525
526 526 The length of the frame payload is expressed as an unsigned 24 bit
527 527 little endian integer. Values larger than 65535 MUST NOT be used unless
528 528 given permission by the server as part of the negotiated capabilities
529 529 during the handshake. The frame header is not part of the advertised
530 530 frame length. The payload length is the over-the-wire length. If there
531 531 is content encoding applied to the payload as part of the frame's stream,
532 532 the length is the output of that content encoding, not the input.
533 533
534 534 The 16-bit ``Request ID`` field denotes the integer request identifier,
535 535 stored as an unsigned little endian integer. Odd numbered requests are
536 536 client-initiated. Even numbered requests are server-initiated. This
537 537 refers to where the *request* was initiated - not where the *frame* was
538 538 initiated, so servers will send frames with odd ``Request ID`` in
539 539 response to client-initiated requests. Implementations are advised to
540 540 start ordering request identifiers at ``1`` and ``0``, increment by
541 541 ``2``, and wrap around if all available numbers have been exhausted.
542 542
543 543 The 8-bit ``Stream ID`` field denotes the stream that the frame is
544 544 associated with. Frames belonging to a stream may have content
545 545 encoding applied and the receiver may need to decode the raw frame
546 546 payload to obtain the original data. Odd numbered IDs are
547 547 client-initiated. Even numbered IDs are server-initiated.
548 548
549 549 The 8-bit ``Stream Flags`` field defines stream processing semantics.
550 550 See the section on streams below.
551 551
552 552 The 4-bit ``Type`` field denotes the type of frame being sent.
553 553
554 554 The 4-bit ``Flags`` field defines special, per-type attributes for
555 555 the frame.
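
The 8 octet header layout can be sketched in Python as shown here. The function names are hypothetical, and placing ``Type`` in the high 4 bits of the final octet (with ``Flags`` in the low 4 bits) is an assumption drawn from the field order in the diagram, not something this section states explicitly.

```python
import struct


def pack_frame_header(length, request_id, stream_id, stream_flags,
                      frame_type, flags):
    """Build the 8 octet frame header."""
    if not 0 <= length < (1 << 24):
        raise ValueError("length must fit in 24 bits")
    if not 0 <= frame_type < 16 or not 0 <= flags < 16:
        raise ValueError("type and flags must each fit in 4 bits")
    # 24-bit little endian length: pack as 32 bits, drop the top octet.
    header = struct.pack("<I", length)[:3]
    # Assumption: type occupies the high nibble, flags the low nibble.
    header += struct.pack("<HBBB", request_id, stream_id, stream_flags,
                          (frame_type << 4) | flags)
    return header


def unpack_frame_header(header):
    """Return (length, request_id, stream_id, stream_flags, type, flags)."""
    length = int.from_bytes(header[0:3], "little")
    request_id, stream_id, stream_flags, typeflags = struct.unpack(
        "<HBBB", header[3:8])
    return (length, request_id, stream_id, stream_flags,
            typeflags >> 4, typeflags & 0x0F)
```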
556 556
557 557 The sections below define the frame types and their behavior.
558 558
559 559 Command Request (``0x01``)
560 560 --------------------------
561 561
562 562 This frame contains a request to run a command.
563 563
564 564 The payload consists of a CBOR map defining the command request. The
565 565 bytestring keys of that map are:
566 566
567 567 name
568 568 Name of the command that should be executed (bytestring).
569 569 args
570 570 Map of bytestring keys to various value types containing the named
571 571 arguments to this command.
572 572
573 573 Each command defines its own set of argument names and their expected
574 574 types.
575 575
576 576 This frame type MUST ONLY be sent from clients to servers: it is illegal
577 577 for a server to send this frame to a client.
578 578
579 579 The following flag values are defined for this type:
580 580
581 581 0x01
582 582 New command request. When set, this frame represents the beginning
583 583 of a new request to run a command. The ``Request ID`` attached to this
584 584 frame MUST NOT be active.
585 585 0x02
586 586 Command request continuation. When set, this frame is a continuation
587 587 from a previous command request frame for its ``Request ID``. This
588 588 flag is set when the CBOR data for a command request does not fit
589 589 in a single frame.
590 590 0x04
591 591 Additional frames expected. When set, the command request didn't fit
592 592 into a single frame and additional CBOR data follows in a subsequent
593 593 frame.
594 594 0x08
595 595 Command data frames expected. When set, command data frames are
596 596 expected to follow the final command request frame for this request.
597 597
598 598 ``0x01`` MUST be set on the initial command request frame for a
599 599 ``Request ID``.
600 600
601 601 ``0x01`` or ``0x02`` MUST be set to indicate this frame's role in
602 602 a series of command request frames.
603 603
604 604 If command data frames are to be sent, ``0x08`` MUST be set on ALL
605 605 command request frames.
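
As an illustration of how the flag values from the list above combine, the following Python sketch splits an already CBOR-encoded command request across frames and computes the flags for each one. The function and constant names are hypothetical, and CBOR encoding itself is left out.

```python
FLAG_NEW = 0x01           # new command request
FLAG_CONTINUATION = 0x02  # continuation of a previous request frame
FLAG_MORE = 0x04          # additional request frames follow
FLAG_EXPECT_DATA = 0x08   # command data frames follow the request


def command_request_frames(encoded_request, max_payload, expect_data=False):
    """Return a list of (flags, payload) pairs for one command request."""
    chunks = [
        encoded_request[i:i + max_payload]
        for i in range(0, len(encoded_request), max_payload)
    ] or [b""]
    frames = []
    for index, chunk in enumerate(chunks):
        # Exactly one of "new" or "continuation" marks the frame's role.
        flags = FLAG_NEW if index == 0 else FLAG_CONTINUATION
        if index < len(chunks) - 1:
            flags |= FLAG_MORE
        if expect_data:
            # Set on every request frame when command data will follow.
            flags |= FLAG_EXPECT_DATA
        frames.append((flags, chunk))
    return frames
```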
606 606
607 607 Command Data (``0x03``)
608 608 -----------------------
609 609
610 610 This frame contains raw data for a command.
611 611
612 612 Most commands can be executed by specifying arguments. However,
613 613 arguments have an upper bound to their length. For commands that
614 614 accept data that is beyond this length or whose length isn't known
615 615 when the command is initially sent, they will need to stream
616 616 arbitrary data to the server. This frame type facilitates the sending
617 617 of this data.
618 618
619 619 The payload of this frame type consists of a stream of raw data to be
620 620 consumed by the command handler on the server. The format of the data
621 621 is command specific.
622 622
623 623 The following flag values are defined for this type:
624 624
625 625 0x01
626 626 Command data continuation. When set, the data for this command
627 627 continues into a subsequent frame.
628 628
629 629 0x02
630 630 End of data. When set, command data has been fully sent to the
631 631 server. The command has been fully issued and no new data for this
632 632 command will be sent. The next frame will belong to a new command.
633 633
634 Bytes Response Data (``0x04``)
635 ------------------------------
634 Response Data (``0x04``)
635 ------------------------
636 636
637 This frame contains raw bytes response data to an issued command.
637 This frame contains raw response data to an issued command.
638 638
639 639 The following flag values are defined for this type:
640 640
641 641 0x01
642 Data continuation. When set, an additional frame containing raw
643 response data will follow.
642 Data continuation. When set, an additional frame containing response data
643 will follow.
644 644 0x02
645 End of data. When sent, the response data has been fully sent and
645 End of data. When set, the response data has been fully sent and
646 646 no additional frames for this response will be sent.
647 0x04
648 CBOR data. When set, the frame payload consists of CBOR data.
647 649
648 650 The ``0x01`` flag is mutually exclusive with the ``0x02`` flag.
649 651
650 652 Error Response (``0x05``)
651 653 -------------------------
652 654
653 655 An error occurred when processing a request. This could indicate
654 656 a protocol-level failure or an application level failure depending
655 657 on the flags for this message type.
656 658
657 659 The payload for this type is an error message that should be
658 660 displayed to the user.
659 661
660 662 The following flag values are defined for this type:
661 663
662 664 0x01
663 665 The error occurred at the transport/protocol level. If set, the
664 666 connection should be closed.
665 667 0x02
666 668 The error occurred at the application level. e.g. invalid command.
667 669
668 670 Human Output Side-Channel (``0x06``)
669 671 ------------------------------------
670 672
671 673 This frame contains a message that is intended to be displayed to
672 674 people. Whereas most frames communicate machine readable data, this
673 675 frame communicates textual data that is intended to be shown to
674 676 humans.
675 677
676 678 The frame consists of a series of *formatting requests*. Each formatting
677 679 request consists of a formatting string, arguments for that formatting
678 680 string, and labels to apply to that formatting string.
679 681
680 682 A formatting string is a printf()-like string that allows variable
681 683 substitution within the string. Labels allow the rendered text to be
682 684 *decorated*. Assuming use of the canonical Mercurial code base, a
683 685 formatting string can be the input to the ``i18n._`` function. This
684 686 allows messages emitted from the server to be localized. So even if
685 687 the server has different i18n settings, people could see messages in
686 688 their *native* settings. Similarly, the use of labels allows
687 689 decorations like coloring and underlining to be applied using the
688 690 client's configured rendering settings.
689 691
690 692 Formatting strings are similar to ``printf()`` strings or how
691 693 Python's ``%`` operator works. The only supported formatting sequences
692 694 are ``%s`` and ``%%``. ``%s`` will be replaced by whatever the string
693 695 at that position resolves to. ``%%`` will be replaced by ``%``. All
694 696 other 2-byte sequences beginning with ``%`` represent a literal
695 697 ``%`` followed by that character. However, future versions of the
696 698 wire protocol reserve the right to allow clients to opt in to receiving
697 699 formatting strings with additional formatters, hence why ``%%`` is
698 700 required to represent the literal ``%``.
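The substitution rules above can be sketched as follows. This is an illustration only: ``render`` is a hypothetical helper, and how to handle a ``%s`` with no remaining argument is not specified here, so the sketch simply lets Python raise ``IndexError`` in that case.

```python
def render(fmt, args):
    """Expand %s and %% in fmt; keep every other %X sequence literally."""
    out = []
    remaining = list(args)
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch == '%' and i + 1 < len(fmt):
            nxt = fmt[i + 1]
            if nxt == 's':
                # %s consumes the next argument in order.
                out.append(remaining.pop(0))
            elif nxt == '%':
                out.append('%')
            else:
                # Any other 2-byte sequence is a literal % plus that character.
                out.append('%' + nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return ''.join(out)
```

With these rules, ``render('%s is 100%% done (%d)', ['clone'])`` yields ``clone is 100% done (%d)``.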
699 701
700 702 The raw frame consists of a series of data structures representing
701 703 textual atoms to print. Each atom begins with a struct defining the
702 704 size of the data that follows:
703 705
704 706 * A 16-bit little endian unsigned integer denoting the length of the
705 707 formatting string.
706 708 * An 8-bit unsigned integer denoting the number of label strings
707 709 that follow.
708 710 * An 8-bit unsigned integer denoting the number of formatting string
709 711 argument strings that follow.
710 712 * An array of 8-bit unsigned integers denoting the lengths of
711 713 *labels* data.
712 714 * An array of 16-bit unsigned integers denoting the lengths of
713 715 formatting string arguments.
714 716 * The formatting string, encoded as UTF-8.
715 717 * 0 or more ASCII strings defining labels to apply to this atom.
716 718 * 0 or more UTF-8 strings that will be used as arguments to the
717 719 formatting string.
718 720
719 721 TODO use ASCII for formatting string.
720 722
721 723 All data to be printed MUST be encoded into a single frame: this frame
722 724 does not support spanning data across multiple frames.
723 725
724 726 All textual data encoded in these frames is assumed to be line delimited.
725 727 The last atom in the frame SHOULD end with a newline (``\n``). If it
726 728 doesn't, clients MAY add a newline to facilitate immediate printing.
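The atom layout described in this section can be sketched with Python's ``struct`` module. This is a sketch under stated assumptions, not the canonical implementation: it assumes the array of 16-bit integers holds the lengths of the formatting string arguments and is little endian like the leading length field. ``encode_atom`` and ``decode_atom`` are hypothetical names.

```python
import struct


def encode_atom(fmt, labels=(), args=()):
    """Serialize one textual atom: header, length arrays, then the strings."""
    fmt_b = fmt.encode('utf-8')
    labels_b = [label.encode('ascii') for label in labels]
    args_b = [arg.encode('utf-8') for arg in args]
    parts = [
        # 16-bit format length, 8-bit label count, 8-bit argument count.
        struct.pack('<HBB', len(fmt_b), len(labels_b), len(args_b)),
        # 8-bit length per label.
        struct.pack('<%dB' % len(labels_b), *map(len, labels_b)),
        # 16-bit length per argument (endianness is an assumption).
        struct.pack('<%dH' % len(args_b), *map(len, args_b)),
        fmt_b,
    ]
    return b''.join(parts + labels_b + args_b)


def decode_atom(data, offset=0):
    """Parse one atom at offset; return (fmt, labels, args, next_offset)."""
    fmt_len, n_labels, n_args = struct.unpack_from('<HBB', data, offset)
    offset += 4
    label_lens = struct.unpack_from('<%dB' % n_labels, data, offset)
    offset += n_labels
    arg_lens = struct.unpack_from('<%dH' % n_args, data, offset)
    offset += 2 * n_args

    def take(n):
        nonlocal offset
        chunk = data[offset:offset + n]
        offset += n
        return chunk

    fmt = take(fmt_len).decode('utf-8')
    labels = [take(n).decode('ascii') for n in label_lens]
    args = [take(n).decode('utf-8') for n in arg_lens]
    return fmt, labels, args, offset
```

A frame payload holding several atoms could be read by calling ``decode_atom`` repeatedly, passing the returned offset back in until it reaches the end of the payload.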
727 729
728 730 Progress Update (``0x07``)
729 731 --------------------------
730 732
731 733 This frame holds the progress of an operation on the peer. Consumption
732 734 of these frames allows clients to display progress bars, estimated
733 735 completion times, etc.
734 736
735 737 Each frame defines the progress of a single operation on the peer. The
736 738 payload consists of a CBOR map with the following bytestring keys:
737 739
738 740 topic
739 741 Topic name (string)
740 742 pos
741 743 Current numeric position within the topic (integer)
742 744 total
743 745 Total/end numeric position of this topic (unsigned integer)
744 746 label (optional)
745 747 Unit label (string)
746 748 item (optional)
747 749 Item name (string)
748 750
749 751 Progress state is created when a frame is received referencing a
750 752 *topic* that isn't currently tracked. Progress tracking for that
751 753 *topic* is finished when a frame is received reporting the current
752 754 position of that topic as ``-1``.
753 755
754 756 Multiple *topics* may be active at any given time.
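The topic lifecycle above can be sketched as a small state tracker. The sketch assumes the CBOR payload has already been decoded by some CBOR library into a ``dict`` with bytestring keys; ``ProgressTracker`` is a hypothetical name.

```python
class ProgressTracker:
    """Track the progress topics that are currently active."""

    def __init__(self):
        # Maps topic name to the most recent state reported for it.
        self.active = {}

    def update(self, payload):
        topic = payload[b'topic']
        if payload[b'pos'] == -1:
            # A position of -1 finishes tracking for this topic.
            self.active.pop(topic, None)
            return
        # An unknown topic creates new state; a known one is updated.
        self.active[topic] = {
            'pos': payload[b'pos'],
            'total': payload[b'total'],
            'label': payload.get(b'label'),
            'item': payload.get(b'item'),
        }
```

Because state is keyed by topic, several topics can be tracked at once without interfering with each other.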
755 757
756 758 Rendering of progress information is not mandated or governed by this
757 759 specification: implementations MAY render progress information however
758 760 they see fit, including not at all.
759 761
760 762 The string data describing the topic SHOULD be static strings to
761 763 facilitate receivers localizing that string data. The emitter
762 764 MUST normalize all string data to valid UTF-8 and receivers SHOULD
763 765 validate that received data conforms to UTF-8. The topic name
764 766 SHOULD be ASCII.
765 767
766 768 Stream Encoding Settings (``0x08``)
767 769 -----------------------------------
768 770
769 771 This frame type holds information defining the content encoding
770 772 settings for a *stream*.
771 773
772 774 This frame type is likely consumed by the protocol layer and is not
773 775 passed on to applications.
774 776
775 777 This frame type MUST ONLY occur on frames having the *Beginning of Stream*
776 778 ``Stream Flag`` set.
777 779
778 780 The payload of this frame defines what content encoding has (possibly)
779 781 been applied to the payloads of subsequent frames in this stream.
780 782
781 783 The payload begins with an 8-bit integer defining the length of the
782 784 encoding *profile*, followed by the string name of that profile, which
783 785 must be an ASCII string. All bytes that follow can be used by that
784 786 profile for supplemental settings definitions. See the section below
785 787 on defined encoding profiles.
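A minimal sketch of parsing this payload follows. ``parse_encoding_settings`` is a hypothetical helper, and the profile name used in the usage note is made up for illustration, since no profiles are defined yet.

```python
def parse_encoding_settings(payload):
    """Split a payload into (profile name, supplemental settings bytes)."""
    if not payload:
        raise ValueError('empty payload')
    # The first byte is the length of the profile name.
    name_len = payload[0]
    name = payload[1:1 + name_len]
    if len(name) != name_len:
        raise ValueError('payload shorter than declared profile length')
    # The profile name must be ASCII; remaining bytes belong to the profile.
    return name.decode('ascii'), payload[1 + name_len:]
```

For instance, ``parse_encoding_settings(b'\x04zstd\x03')`` returns the name ``zstd`` and the single supplemental byte ``b'\x03'``.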
786 788
787 789 Stream States and Flags
788 790 -----------------------
789 791
790 792 Streams can be in two states: *open* and *closed*. An *open* stream
791 793 is active and frames attached to that stream could arrive at any time.
792 794 A *closed* stream is not active. If a frame attached to a *closed*
793 795 stream arrives, that frame MUST have an appropriate stream flag
794 796 set indicating beginning of stream. All streams are in the *closed*
795 797 state by default.
796 798
797 799 The ``Stream Flags`` field denotes a set of bit flags for defining
798 800 the relationship of this frame within a stream. The following flags
799 801 are defined:
800 802
801 803 0x01
802 804 Beginning of stream. The first frame in the stream MUST set this
803 805 flag. When received, the ``Stream ID`` this frame is attached to
804 806 becomes ``open``.
805 807
806 808 0x02
807 809 End of stream. The last frame in a stream MUST set this flag. When
808 810 received, the ``Stream ID`` this frame is attached to becomes
809 811 ``closed``. Any content encoding context associated with this stream
810 812 can be destroyed after processing the payload of this frame.
811 813
812 814 0x04
813 815 Apply content encoding. When set, any content encoding settings
814 816 defined by the stream should be applied when attempting to read
815 817 the frame. When not set, the frame payload isn't encoded.
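The stream state rules above can be sketched as follows. The names are hypothetical. The sketch tolerates a beginning-of-stream flag on an already open stream, since that case is not addressed here.

```python
STREAM_FLAG_BEGIN = 0x01
STREAM_FLAG_END = 0x02
STREAM_FLAG_ENCODED = 0x04


class StreamTracker:
    """Track which stream IDs are open; all streams start closed."""

    def __init__(self):
        self.open_streams = set()

    def on_frame(self, stream_id, stream_flags):
        """Update state for a frame; return True if its payload is encoded."""
        if stream_id not in self.open_streams:
            # A frame on a closed stream must carry beginning of stream.
            if not stream_flags & STREAM_FLAG_BEGIN:
                raise ValueError('frame received on closed stream')
            self.open_streams.add(stream_id)
        if stream_flags & STREAM_FLAG_END:
            # Any encoding context for the stream can be dropped here.
            self.open_streams.discard(stream_id)
        return bool(stream_flags & STREAM_FLAG_ENCODED)
```

A single frame may set both ``0x01`` and ``0x02``, in which case the stream is opened and closed by the same frame.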
816 818
817 819 Streams
818 820 -------
819 821
820 822 Streams - along with ``Request IDs`` - facilitate grouping of frames.
821 823 But the purpose of each is quite different and the groupings they
822 824 constitute are independent.
823 825
824 826 A ``Request ID`` is essentially a tag. It tells you which logical
825 827 request a frame is associated with.
826 828
827 829 A *stream* is a sequence of frames grouped for the express purpose
828 830 of applying a stateful encoding or for denoting sub-groups of frames.
829 831
830 832 Unlike ``Request ID``s which span the request and response, a stream
831 833 is unidirectional and stream IDs are independent from client to
832 834 server.
833 835
834 836 There is no strict hierarchical relationship between ``Request IDs``
835 837 and *streams*. A stream can contain frames having multiple
836 838 ``Request IDs``. Frames belonging to the same ``Request ID`` can
837 839 span multiple streams.
838 840
839 841 One goal of streams is to facilitate content encoding. A stream can
840 842 define an encoding to be applied to frame payloads. For example, the
841 843 payload transmitted over the wire may contain output from a
842 844 zstandard compression operation and the receiving end may decompress
843 845 that payload to obtain the original data.
844 846
845 847 The other goal of streams is to facilitate concurrent execution. For
846 848 example, a server could spawn 4 threads to service a request that can
847 849 be easily parallelized. Each of those 4 threads could write into its
848 850 own stream. Those streams could then in turn be delivered to 4 threads
849 851 on the receiving end, with each thread consuming its stream in near
850 852 isolation. The *main* thread on both ends merely does I/O and
851 853 encodes/decodes frame headers: the bulk of the work is done by worker
852 854 threads.
853 855
854 856 In addition, since content encoding is defined per stream, each
855 857 *worker thread* could perform potentially CPU bound work concurrently
856 858 with other threads. This approach of applying encoding at the
857 859 sub-protocol / stream level eliminates a potential resource constraint
858 860 on the protocol stream as a whole (it is common for the throughput of
859 861 a compression engine to be smaller than the throughput of a network).
860 862
861 863 Having multiple streams - each with their own encoding settings - also
862 864 facilitates the use of advanced data compression techniques. For
863 865 example, a transmitter could see that it is generating data faster
864 866 or slower than the receiving end is consuming it and adjust its
865 867 compression settings to trade CPU for compression ratio accordingly.
866 868
867 869 While streams can define a content encoding, not all frames within
868 870 that stream must use that content encoding. This can be useful when
869 871 data is being served from caches and being derived dynamically. A
870 872 cache could hold pre-compressed data so the server doesn't have to
871 873 recompress it. The ability to pick and choose which frames are
872 874 compressed allows servers to easily send data to the wire without
873 875 involving potentially expensive encoding overhead.
874 876
875 877 Content Encoding Profiles
876 878 -------------------------
877 879
878 880 Streams can have named content encoding *profiles* associated with
879 881 them. A profile defines a shared understanding of content encoding
880 882 settings and behavior.
881 883
882 884 The following profiles are defined:
883 885
884 886 TBD
885 887
886 888 Issuing Commands
887 889 ----------------
888 890
889 891 A client can request that a remote run a command by sending it
890 892 frames defining that command. This logical stream is composed of
891 893 1 or more ``Command Request`` frames and 0 or more ``Command Data``
892 894 frames.
893 895
894 896 All frames composing a single command request MUST be associated with
895 897 the same ``Request ID``.
896 898
897 899 Clients MAY send additional command requests without waiting on the
898 900 response to a previous command request. If they do so, they MUST ensure
899 901 that the ``Request ID`` field of outbound frames does not conflict
900 902 with that of an active ``Request ID`` whose response has not yet been
901 903 fully received.
902 904
903 905 Servers MAY respond to commands in a different order than they were
904 906 sent over the wire. Clients MUST be prepared to deal with this. Servers
905 907 also MAY start executing commands in a different order than they were
906 908 received, or MAY execute multiple commands concurrently.
907 909
908 910 If there is a dependency between commands or a race condition between
909 911 commands executing (e.g. a read-only command that depends on the results
910 912 of a command that mutates the repository), then clients MUST NOT send
911 913 frames issuing a command until a response to all dependent commands has
912 914 been received.
913 915 TODO think about whether we should express dependencies between commands
914 916 to avoid roundtrip latency.
915 917
916 918 A command is defined by a command name, 0 or more command arguments,
917 919 and optional command data.
918 920
919 921 Arguments are the recommended mechanism for transferring fixed sets of
920 922 parameters to a command. Data is appropriate for transferring variable
921 923 data. Thinking in terms of HTTP, arguments would be headers and data
922 924 would be the message body.
923 925
924 926 It is recommended for servers to delay the dispatch of a command
925 927 until all arguments have been received. Servers MAY impose limits on the
926 928 maximum argument size.
927 929 TODO define failure mechanism.
928 930
929 931 Servers MAY dispatch to commands immediately once argument data
930 932 is available or delay until command data is received in full.
931 933
932 934 Capabilities
933 935 ============
934 936
935 937 Servers advertise supported wire protocol features. This allows clients to
936 938 probe for server features before blindly calling a command or passing a
937 939 specific argument.
938 940
939 941 The server's features are exposed via a *capabilities* string. This is a
940 942 space-delimited string of tokens/features. Some features are single words
941 943 like ``lookup`` or ``batch``. Others are complicated key-value pairs
942 944 advertising sub-features. e.g. ``httpheader=2048``. When complex, non-word
943 945 values are used, each feature name can define its own encoding of sub-values.
944 946 Comma-delimited and ``x-www-form-urlencoded`` values are common.
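A minimal sketch of splitting a capabilities string into its tokens follows. ``parse_capabilities`` is a hypothetical helper. It leaves each value undecoded because every feature defines its own sub-value encoding.

```python
def parse_capabilities(caps):
    """Map each capability name to its raw value, or None if it has none."""
    result = {}
    for token in caps.split(' '):
        if not token:
            continue
        # Only the first "=" separates the name from its value.
        name, sep, value = token.partition('=')
        result[name] = value if sep else None
    return result
```

For example, ``parse_capabilities('lookup batch httpheader=2048')`` maps ``lookup`` and ``batch`` to ``None`` and ``httpheader`` to the string ``2048``.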
945 947
946 948 The following sections document capabilities defined by the canonical Mercurial server
947 949 implementation.
948 950
949 951 batch
950 952 -----
951 953
952 954 Whether the server supports the ``batch`` command.
953 955
954 956 This capability/command was introduced in Mercurial 1.9 (released July 2011).
955 957
956 958 branchmap
957 959 ---------
958 960
959 961 Whether the server supports the ``branchmap`` command.
960 962
961 963 This capability/command was introduced in Mercurial 1.3 (released July 2009).
962 964
963 965 bundle2-exp
964 966 -----------
965 967
966 968 Precursor to ``bundle2`` capability that was used before bundle2 was a
967 969 stable feature.
968 970
969 971 This capability was introduced in Mercurial 3.0 behind an experimental
970 972 flag. This capability should not be observed in the wild.
971 973
972 974 bundle2
973 975 -------
974 976
975 977 Indicates whether the server supports the ``bundle2`` data exchange format.
976 978
977 979 The value of the capability is a URL quoted, newline (``\n``) delimited
978 980 list of keys or key-value pairs.
979 981
980 982 A key is simply a URL encoded string.
981 983
982 984 A key-value pair is a URL encoded key separated from a URL encoded value by
983 985 an ``=``. If the value is a list, elements are delimited by a ``,`` after
984 986 URL encoding.
985 987
986 988 For example, say we have the values::
987 989
988 990 {'HG20': [], 'changegroup': ['01', '02'], 'digests': ['sha1', 'sha512']}
989 991
990 992 We would first construct a string::
991 993
992 994 HG20\nchangegroup=01,02\ndigests=sha1,sha512
993 995
994 996 We would then URL quote this string::
995 997
996 998 HG20%0Achangegroup%3D01%2C02%0Adigests%3Dsha1%2Csha512
997 999
998 1000 This capability was introduced in Mercurial 3.4 (released May 2015).
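The encoding steps in this section can be sketched with ``urllib.parse``. The function names are hypothetical, and sorting the keys is an assumption made here to keep the output deterministic.

```python
from urllib.parse import quote, unquote


def encode_bundle2_caps(caps):
    """Encode a dict of key -> list of values as a bundle2 capability value."""
    lines = []
    for key in sorted(caps):
        values = caps[key]
        line = quote(key)
        if values:
            # List elements are joined by "," after URL encoding each one.
            line += '=' + ','.join(quote(v) for v in values)
        lines.append(line)
    # The newline-delimited string is then URL quoted as a whole.
    return quote('\n'.join(lines))


def decode_bundle2_caps(value):
    """Reverse encode_bundle2_caps."""
    caps = {}
    for line in unquote(value).split('\n'):
        if not line:
            continue
        key, sep, vals = line.partition('=')
        caps[unquote(key)] = [unquote(v) for v in vals.split(',')] if sep else []
    return caps
```

Applied to the example values from this section, ``encode_bundle2_caps`` produces ``HG20%0Achangegroup%3D01%2C02%0Adigests%3Dsha1%2Csha512``.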
999 1001
1000 1002 changegroupsubset
1001 1003 -----------------
1002 1004
1003 1005 Whether the server supports the ``changegroupsubset`` command.
1004 1006
1005 1007 This capability was introduced in Mercurial 0.9.2 (released December
1006 1008 2006).
1007 1009
1008 1010 This capability was introduced at the same time as the ``lookup``
1009 1011 capability/command.
1010 1012
1011 1013 compression
1012 1014 -----------
1013 1015
1014 1016 Declares support for negotiating compression formats.
1015 1017
1016 1018 Presence of this capability indicates the server supports dynamic selection
1017 1019 of compression formats based on the client request.
1018 1020
1019 1021 Servers advertising this capability are required to support the
1020 1022 ``application/mercurial-0.2`` media type in response to commands returning
1021 1023 streams. Servers may support this media type on any command.
1022 1024
1023 1025 The value of the capability is a comma-delimited list of strings declaring
1024 1026 supported compression formats. The order of the compression formats is in
1025 1027 server-preferred order, most preferred first.
1026 1028
1027 1029 The identifiers used by the official Mercurial distribution are:
1028 1030
1029 1031 bzip2
1030 1032 bzip2
1031 1033 none
1032 1034 uncompressed / raw data
1033 1035 zlib
1034 1036 zlib (no gzip header)
1035 1037 zstd
1036 1038 zstd
1037 1039
1038 1040 This capability was introduced in Mercurial 4.1 (released February 2017).
1039 1041
1040 1042 getbundle
1041 1043 ---------
1042 1044
1043 1045 Whether the server supports the ``getbundle`` command.
1044 1046
1045 1047 This capability was introduced in Mercurial 1.9 (released July 2011).
1046 1048
1047 1049 httpheader
1048 1050 ----------
1049 1051
1050 1052 Whether the server supports receiving command arguments via HTTP request
1051 1053 headers.
1052 1054
1053 1055 The value of the capability is an integer describing the max header
1054 1056 length that clients should send. Clients should ignore any content after a
1055 1057 comma in the value, as this is reserved for future use.
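A client-side reading of this value might look like the sketch below. ``parse_httpheader`` is a hypothetical helper.

```python
def parse_httpheader(value):
    """Return the max header length, ignoring anything after a comma."""
    # Content after the first comma is reserved for future use.
    return int(value.split(',', 1)[0])
```

Both ``parse_httpheader('2048')`` and ``parse_httpheader('2048,anything')`` return ``2048``.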
1056 1058
1057 1059 This capability was introduced in Mercurial 1.9 (released July 2011).
1058 1060
1059 1061 httpmediatype
1060 1062 -------------
1061 1063
1062 1064 Indicates which HTTP media types (``Content-Type`` header) the server is
1063 1065 capable of receiving and sending.
1064 1066
1065 1067 The value of the capability is a comma-delimited list of strings identifying
1066 1068 support for media type and transmission direction. The following strings may
1067 1069 be present:
1068 1070
1069 1071 0.1rx
1070 1072 Indicates server support for receiving ``application/mercurial-0.1`` media
1071 1073 types.
1072 1074
1073 1075 0.1tx
1074 1076 Indicates server support for sending ``application/mercurial-0.1`` media
1075 1077 types.
1076 1078
1077 1079 0.2rx
1078 1080 Indicates server support for receiving ``application/mercurial-0.2`` media
1079 1081 types.
1080 1082
1081 1083 0.2tx
1082 1084 Indicates server support for sending ``application/mercurial-0.2`` media
1083 1085 types.
1084 1086
1085 1087 minrx=X
1086 1088 Minimum media type version the server is capable of receiving. Value is a
1087 1089 string like ``0.2``.
1088 1090
1089 1091 This capability can be used by servers to limit connections from legacy
1090 1092 clients not using the latest supported media type. However, only clients
1091 1093 with knowledge of this capability will know to consult this value. This
1092 1094 capability is present so the client may issue a more user-friendly error
1093 1095 when the server has locked out a legacy client.
1094 1096
1095 1097 mintx=X
1096 1098 Minimum media type version the server is capable of sending. Value is a
1097 1099 string like ``0.1``.
1098 1100
1099 1101 Servers advertising support for the ``application/mercurial-0.2`` media type
1100 1102 should also advertise the ``compression`` capability.
1101 1103
1102 1104 This capability was introduced in Mercurial 4.1 (released February 2017).
1103 1105
1104 1106 httppostargs
1105 1107 ------------
1106 1108
1107 1109 **Experimental**
1108 1110
1109 1111 Indicates that the server supports and prefers clients send command arguments
1110 1112 via an HTTP POST request as part of the request body.
1111 1113
1112 1114 This capability was introduced in Mercurial 3.8 (released May 2016).
1113 1115
1114 1116 known
1115 1117 -----
1116 1118
1117 1119 Whether the server supports the ``known`` command.
1118 1120
1119 1121 This capability/command was introduced in Mercurial 1.9 (released July 2011).
1120 1122
1121 1123 lookup
1122 1124 ------
1123 1125
1124 1126 Whether the server supports the ``lookup`` command.
1125 1127
1126 1128 This capability was introduced in Mercurial 0.9.2 (released December
1127 1129 2006).
1128 1130
1129 1131 This capability was introduced at the same time as the ``changegroupsubset``
1130 1132 capability/command.
1131 1133
1132 1134 pushkey
1133 1135 -------
1134 1136
1135 1137 Whether the server supports the ``pushkey`` and ``listkeys`` commands.
1136 1138
1137 1139 This capability was introduced in Mercurial 1.6 (released July 2010).
1138 1140
1139 1141 standardbundle
1140 1142 --------------
1141 1143
1142 1144 **Unsupported**
1143 1145
1144 1146 This capability was introduced during the Mercurial 0.9.2 development cycle in
1145 1147 2006. It was never present in a release, as it was replaced by the ``unbundle``
1146 1148 capability. This capability should not be encountered in the wild.
1147 1149
1148 1150 stream-preferred
1149 1151 ----------------
1150 1152
1151 1153 If present the server prefers that clients clone using the streaming clone
1152 1154 protocol (``hg clone --stream``) rather than the standard
1153 1155 changegroup/bundle based protocol.
1154 1156
1155 1157 This capability was introduced in Mercurial 2.2 (released May 2012).
1156 1158
1157 1159 streamreqs
1158 1160 ----------
1159 1161
1160 1162 Indicates whether the server supports *streaming clones* and the *requirements*
1161 1163 that clients must support to receive it.
1162 1164
1163 1165 If present, the server supports the ``stream_out`` command, which transmits
1164 1166 raw revlogs from the repository instead of changegroups. This provides a faster
1165 1167 cloning mechanism at the expense of more bandwidth used.
1166 1168
1167 1169 The value of this capability is a comma-delimited list of repo format
1168 1170 *requirements*. These are requirements that impact the reading of data in
1169 1171 the ``.hg/store`` directory. An example value is
1170 1172 ``streamreqs=generaldelta,revlogv1`` indicating the server repo requires
1171 1173 the ``revlogv1`` and ``generaldelta`` requirements.
1172 1174
1173 1175 If the only format requirement is ``revlogv1``, the server may expose the
1174 1176 ``stream`` capability instead of the ``streamreqs`` capability.
1175 1177
1176 1178 This capability was introduced in Mercurial 1.7 (released November 2010).
1177 1179
1178 1180 stream
1179 1181 ------
1180 1182
1181 1183 Whether the server supports *streaming clones* from ``revlogv1`` repos.
1182 1184
1183 1185 If present, the server supports the ``stream_out`` command, which transmits
1184 1186 raw revlogs from the repository instead of changegroups. This provides a faster
1185 1187 cloning mechanism at the expense of more bandwidth used.
1186 1188
1187 1189 This capability was introduced in Mercurial 0.9.1 (released July 2006).
1188 1190
1189 1191 When initially introduced, the value of the capability was the numeric
1190 1192 revlog revision. e.g. ``stream=1``. This indicates the changegroup is using
1191 1193 ``revlogv1``. This simple integer value wasn't powerful enough, so the
1192 1194 ``streamreqs`` capability was invented to handle cases where the repo
1193 1195 requirements have more than just ``revlogv1``. Newer servers omit the
1194 1196 ``=1`` since it was the only value supported and the value of ``1`` can
1195 1197 be implied by clients.
1196 1198
1197 1199 unbundlehash
1198 1200 ------------
1199 1201
1200 1202 Whether the ``unbundle`` command supports receiving a hash of all the
1201 1203 heads instead of a list.
1202 1204
1203 1205 For more, see the documentation for the ``unbundle`` command.
1204 1206
1205 1207 This capability was introduced in Mercurial 1.9 (released July 2011).
1206 1208
1207 1209 unbundle
1208 1210 --------
1209 1211
1210 1212 Whether the server supports pushing via the ``unbundle`` command.
1211 1213
1212 1214 This capability/command has been present since Mercurial 0.9.1 (released
1213 1215 July 2006).
1214 1216
1215 1217 Mercurial 0.9.2 (released December 2006) added values to the capability
1216 1218 indicating which bundle types the server supports receiving. This value is a
1217 1219 comma-delimited list. e.g. ``HG10GZ,HG10BZ,HG10UN``. The order of values
1218 1220 reflects the priority/preference of that type, where the first value is the
1219 1221 most preferred type.
1220 1222
1221 1223 Content Negotiation
1222 1224 ===================
1223 1225
1224 1226 The wire protocol has some mechanisms to help peers determine what content
1225 1227 types and encoding the other side will accept. Historically, these mechanisms
1226 1228 have been built into commands themselves because most commands only send a
1227 1229 well-defined response type and only certain commands needed to support
1228 1230 functionality like compression.
1229 1231
1230 1232 Currently, only the HTTP version 1 transport supports content negotiation
1231 1233 at the protocol layer.
1232 1234
1233 1235 HTTP requests advertise supported response formats via the ``X-HgProto-<N>``
1234 1236 request header, where ``<N>`` is an integer starting at 1 allowing the logical
1235 1237 value to span multiple headers. This value consists of a list of
1236 1238 space-delimited parameters. Each parameter denotes a feature or capability.
1237 1239
1238 1240 The following parameters are defined:
1239 1241
1240 1242 0.1
1241 1243 Indicates the client supports receiving ``application/mercurial-0.1``
1242 1244 responses.
1243 1245
1244 1246 0.2
1245 1247 Indicates the client supports receiving ``application/mercurial-0.2``
1246 1248 responses.
1247 1249
1248 1250 comp
1249 1251 Indicates compression formats the client can decode. Value is a list of
1250 1252 comma delimited strings identifying compression formats ordered from
1251 1253 most preferential to least preferential. e.g. ``comp=zstd,zlib,none``.
1252 1254
1253 1255 This parameter does not have an effect if only the ``0.1`` parameter
1254 1256 is defined, as support for ``application/mercurial-0.2`` or greater is
1255 1257 required to use arbitrary compression formats.
1256 1258
1257 1259 If this parameter is not advertised, the server interprets this as
1258 1260 equivalent to ``zlib,none``.
1259 1261
1260 1262 Clients may choose to only send this header if the ``httpmediatype``
1261 1263 server capability is present, as currently all server-side features
1262 1264 consulting this header require the client to opt in to new protocol features
1263 1265 advertised via the ``httpmediatype`` capability.
1264 1266
1265 1267 A server that doesn't receive an ``X-HgProto-<N>`` header should infer a
1266 1268 value of ``0.1``. This is compatible with legacy clients.
1267 1269
1268 1270 A server receiving a request indicating support for multiple media type
1269 1271 versions may respond with any of the supported media types. Not all servers
1270 1272 may support all media types on all commands.
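The negotiation rules above can be sketched from the server's point of view. This is an illustration under assumptions: header values are joined by plain concatenation in order of ``<N>``, header names in the ``dict`` are spelled exactly as shown, and ``parse_hgproto`` is a hypothetical helper.

```python
def parse_hgproto(headers):
    """Return (media type versions, compression formats) for a request."""
    chunks = []
    n = 1
    # Collect X-HgProto-1, X-HgProto-2, ... until one is missing.
    while 'X-HgProto-%d' % n in headers:
        chunks.append(headers['X-HgProto-%d' % n])
        n += 1
    # No header at all is treated as a value of "0.1".
    value = ''.join(chunks) or '0.1'

    params = value.split(' ')
    media_types = sorted(p for p in params if p in ('0.1', '0.2'))
    # An unadvertised comp parameter is equivalent to "zlib,none".
    comp = ['zlib', 'none']
    for p in params:
        if p.startswith('comp='):
            comp = p[len('comp='):].split(',')
    return media_types, comp
```

A legacy client sending no header is therefore seen as supporting only ``0.1`` with ``zlib`` and ``none`` compression.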
1271 1273
1272 1274 Commands
1273 1275 ========
1274 1276
1275 1277 This section contains a list of all wire protocol commands implemented by
1276 1278 the canonical Mercurial server.
1277 1279
1278 1280 batch
1279 1281 -----
1280 1282
1281 1283 Issue multiple commands while sending a single command request. The purpose
1282 1284 of this command is to allow a client to issue multiple commands while avoiding
1283 1285 multiple round trips to the server, thereby enabling commands to complete
1284 1286 more quickly.
1285 1287
1286 1288 The command accepts a ``cmds`` argument that contains a list of commands to
1287 1289 execute.
1288 1290
1289 1291 The value of ``cmds`` is a ``;`` delimited list of strings. Each string has the
1290 1292 form ``<command> <arguments>``. That is, the command name followed by a space
1291 1293 followed by an argument string.
1292 1294
1293 1295 The argument string is a ``,`` delimited list of ``<key>=<value>`` values
1294 1296 corresponding to command arguments. Both the argument name and value are
1295 1297 escaped using a special substitution map::
1296 1298
1297 1299 : -> :c
1298 1300 , -> :o
1299 1301 ; -> :s
1300 1302 = -> :e
1301 1303
1302 1304 The response type for this command is ``string``. The value contains a
1303 1305 ``;`` delimited list of responses for each requested command. Each value
1304 1306 in this list is escaped using the same substitution map used for arguments.
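The escaping scheme and the ``cmds`` format can be sketched as follows. All function names are hypothetical. The order of the replacements matters: ``:`` must be escaped first and unescaped last so that the other sequences are not misread.

```python
def escape_arg(s):
    """Apply the substitution map; ":" goes first so it is not re-escaped."""
    return (s.replace(':', ':c').replace(',', ':o')
            .replace(';', ':s').replace('=', ':e'))


def unescape_arg(s):
    """Reverse escape_arg; ":c" goes last."""
    return (s.replace(':e', '=').replace(':s', ';')
            .replace(':o', ',').replace(':c', ':'))


def build_batch_cmds(commands):
    """Build the cmds value from a list of (name, {argument: value}) pairs."""
    parts = []
    for name, args in commands:
        argstr = ','.join(
            '%s=%s' % (escape_arg(k), escape_arg(v)) for k, v in args.items())
        # Each entry is the command name, a space, then the argument string.
        parts.append('%s %s' % (name, argstr))
    return ';'.join(parts)


def split_batch_response(response):
    """Split the ";" delimited response and unescape each value."""
    return [unescape_arg(r) for r in response.split(';')]
```

Because every ``;`` inside a value is escaped as ``:s``, splitting the response on ``;`` is unambiguous.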
1305 1307
1306 1308 If an error occurs, the generic error response may be sent.
1307 1309
1308 1310 between
1309 1311 -------
1310 1312
1311 1313 (Legacy command used for discovery in old clients)
1312 1314
1313 1315 Obtain nodes between pairs of nodes.
1314 1316
1315 1317 The ``pairs`` argument contains a space-delimited list of ``-`` delimited
1316 1318 hex node pairs. e.g.::
1317 1319
1318 1320 a072279d3f7fd3a4aa7ffa1a5af8efc573e1c896-6dc58916e7c070f678682bfe404d2e2d68291a18
1319 1321
1320 1322 Return type is a ``string``. Value consists of lines corresponding to each
1321 1323 requested range. Each line contains a space-delimited list of hex nodes.
1322 1324 A newline ``\n`` terminates each line, including the last one.
1323 1325
1324 1326 branchmap
1325 1327 ---------
1326 1328
1327 1329 Obtain heads in named branches.
1328 1330
1329 1331 Accepts no arguments. Return type is a ``string``.
1330 1332
1331 1333 Return value contains lines with URL encoded branch names followed by a space
1332 1334 followed by a space-delimited list of hex nodes of heads on that branch.
1333 1335 e.g.::
1334 1336
1335 1337 default a072279d3f7fd3a4aa7ffa1a5af8efc573e1c896 6dc58916e7c070f678682bfe404d2e2d68291a18
1336 1338 stable baae3bf31522f41dd5e6d7377d0edd8d1cf3fccc
1337 1339
1338 1340 There is no trailing newline.
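A client could parse this response along the lines of the sketch below. ``parse_branchmap`` is a hypothetical helper.

```python
from urllib.parse import unquote


def parse_branchmap(response):
    """Map each decoded branch name to its list of hex head nodes."""
    branches = {}
    for line in response.split('\n'):
        if not line:
            continue
        # The URL encoded name contains no raw space, so the first space
        # separates it from the list of heads.
        name, _, nodes = line.partition(' ')
        branches[unquote(name)] = nodes.split(' ')
    return branches
```

Applied to the example response above, this yields two heads for ``default`` and one head for ``stable``.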
1339 1341
1340 1342 branches
1341 1343 --------
1342 1344
1343 1345 (Legacy command used for discovery in old clients. Clients with ``getbundle``
1344 1346 use the ``known`` and ``heads`` commands instead.)
1345 1347
1346 1348 Obtain ancestor changesets of specific nodes back to a branch point.
1347 1349
1348 1350 Despite the name, this command has nothing to do with Mercurial named branches.
1349 1351 Instead, it is related to DAG branches.
1350 1352
1351 1353 The command accepts a ``nodes`` argument, which is a string of space-delimited
1352 1354 hex nodes.
1353 1355
1354 1356 For each node requested, the server will find the first ancestor node that is
1355 1357 a DAG root or is a merge.
1356 1358
1357 1359 Return type is a ``string``. Return value contains lines with result data for
1358 1360 each requested node. Each line contains space-delimited nodes followed by a
1359 1361 newline (``\n``). The 4 nodes reported on each line correspond to the requested
1360 1362 node, the ancestor node found, and its 2 parent nodes (which may be the null
1361 1363 node).
1362 1364
1363 1365 capabilities
1364 1366 ------------
1365 1367
1366 1368 Obtain the capabilities string for the repo.
1367 1369
1368 1370 Unlike the ``hello`` command, the capabilities string is not prefixed.
1369 1371 There is no trailing newline.
1370 1372
1371 1373 This command does not accept any arguments. Return type is a ``string``.
1372 1374
1373 1375 This command was introduced in Mercurial 0.9.1 (released July 2006).
1374 1376
1375 1377 changegroup
1376 1378 -----------
1377 1379
1378 1380 (Legacy command: use ``getbundle`` instead)
1379 1381
1380 1382 Obtain a changegroup version 1 with data for changesets that are
1381 1383 descendants of client-specified changesets.
1382 1384
1383 1385 The ``roots`` argument contains a list of space-delimited hex nodes.
1384 1386
1385 1387 The server responds with a changegroup version 1 containing all
1386 1388 changesets between the requested root/base nodes and the repo's head nodes
1387 1389 at the time of the request.
1388 1390
1389 1391 The return type is a ``stream``.
1390 1392
1391 1393 changegroupsubset
1392 1394 -----------------
1393 1395
1394 1396 (Legacy command: use ``getbundle`` instead)
1395 1397
1396 1398 Obtain a changegroup version 1 with data for changesets between
1397 1399 client-specified base and head nodes.
1398 1400
1399 1401 The ``bases`` argument contains a list of space-delimited hex nodes.
1400 1402 The ``heads`` argument contains a list of space-delimited hex nodes.
1401 1403
1402 1404 The server responds with a changegroup version 1 containing all
1403 1405 changesets between the requested base and head nodes at the time of the
1404 1406 request.
1405 1407
1406 1408 The return type is a ``stream``.
1407 1409
1408 1410 clonebundles
1409 1411 ------------
1410 1412
1411 1413 Obtains a manifest of bundle URLs available to seed clones.
1412 1414
1413 1415 Each returned line contains a URL followed by metadata. See the
1414 1416 documentation in the ``clonebundles`` extension for more.
1415 1417
1416 1418 The return type is a ``string``.
1417 1419
1418 1420 getbundle
1419 1421 ---------
1420 1422
1421 1423 Obtain a bundle containing repository data.
1422 1424
1423 1425 This command accepts the following arguments:
1424 1426
1425 1427 heads
1426 1428 List of space-delimited hex nodes of heads to retrieve.
1427 1429 common
1428 1430 List of space-delimited hex nodes that the client has in common with the
1429 1431 server.
1430 1432 obsmarkers
1431 1433 Boolean indicating whether to include obsolescence markers as part
1432 1434 of the response. Only works with bundle2.
1433 1435 bundlecaps
1434 1436 Comma-delimited set of strings defining client bundle capabilities.
1435 1437 listkeys
1436 1438 Comma-delimited list of strings of ``pushkey`` namespaces. For each
1437 1439 namespace listed, a bundle2 part will be included with the content of
1438 1440 that namespace.
1439 1441 cg
1440 1442 Boolean indicating whether changegroup data is requested.
1441 1443 cbattempted
1442 1444 Boolean indicating whether the client attempted to use the *clone bundles*
1443 1445 feature before performing this request.
1444 1446 bookmarks
1445 1447 Boolean indicating whether bookmark data is requested.
1446 1448 phases
1447 1449 Boolean indicating whether phases data is requested.
1448 1450
1449 1451 The return type on success is a ``stream`` where the value is a bundle.
1450 1452 On the HTTP version 1 transport, the response is zlib compressed.
1451 1453
1452 1454 If an error occurs, a generic error response can be sent.
1453 1455
1454 1456 Unless the client sends a false value for the ``cg`` argument, the returned
1455 1457 bundle contains a changegroup with the nodes between the specified ``common``
1456 1458 and ``heads`` nodes. Depending on the command arguments, the type and content
1457 1459 of the returned bundle can vary significantly.
1458 1460
1459 1461 The default behavior is for the server to send a raw changegroup version
1460 1462 ``01`` response.
1461 1463
1462 1464 If the ``bundlecaps`` provided by the client contain a value beginning
1463 1465 with ``HG2``, a bundle2 will be returned. The bundle2 data may contain
1464 1466 additional repository data, such as ``pushkey`` namespace values.
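The delimiters and the ``HG2`` rule described above can be sketched as follows. Both helper names are hypothetical, the nodes are placeholders, and ``HG20`` is used only as an illustrative capability value that begins with ``HG2``. The encoding of the boolean arguments is not shown because it is not specified in this section.

```python
def encode_getbundle_lists(heads, common, bundlecaps):
    """Encode the list-valued arguments using the delimiters listed above."""
    return {
        b"heads": b" ".join(heads),
        b"common": b" ".join(common),
        b"bundlecaps": b",".join(sorted(bundlecaps)),
    }


def expects_bundle2(bundlecaps_value):
    """Return True if any bundle capability begins with ``HG2``."""
    return any(cap.startswith(b"HG2") for cap in bundlecaps_value.split(b","))


args = encode_getbundle_lists(
    heads=[b"a" * 40, b"b" * 40],
    common=[b"c" * 40],
    bundlecaps={b"HG20"},
)
```

A client that calls ``expects_bundle2(args[b"bundlecaps"])`` before sending the request knows whether to parse the reply as a bundle2 or as a raw changegroup.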
1465 1467
1466 1468 heads
1467 1469 -----
1468 1470
1469 1471 Returns a list of space-delimited hex nodes of repository heads followed
1470 1472 by a newline. e.g.
1471 1473 ``a9eeb3adc7ddb5006c088e9eda61791c777cbf7c 31f91a3da534dc849f0d6bfc00a395a97cf218a1\n``
1472 1474
1473 1475 This command does not accept any arguments. The return type is a ``string``.
1474 1476
1475 1477 hello
1476 1478 -----
1477 1479
1478 1480 Returns lines describing interesting things about the server in an RFC-822
1479 1481 like format.
1480 1482
1481 1483 Currently, the only line defines the server capabilities. It has the form::
1482 1484
1483 1485 capabilities: <value>
1484 1486
1485 1487 See above for more about the capabilities string.
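A client might split the ``hello`` response into fields as sketched below. ``parse_hello`` is a hypothetical helper, and the capability names in the sample are placeholders rather than a real server reply. Unknown keys are kept so that lines added in future versions are not lost.

```python
def parse_hello(response):
    """Parse ``key: value`` lines into a dict of bytes to bytes."""
    fields = {}
    for line in response.split(b"\n"):
        if not line:
            continue
        key, sep, value = line.partition(b": ")
        if not sep:
            raise ValueError("malformed line: %r" % line)
        fields[key] = value
    return fields


# The capability names here are placeholders, not a real server's reply.
fields = parse_hello(b"capabilities: first-capability second-capability\n")
```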
1486 1488
1487 1489 SSH clients typically issue this command as soon as a connection is
1488 1490 established.
1489 1491
1490 1492 This command does not accept any arguments. The return type is a ``string``.
1491 1493
1492 1494 This command was introduced in Mercurial 0.9.1 (released July 2006).
1493 1495
1494 1496 listkeys
1495 1497 --------
1496 1498
1497 1499 List values in a specified ``pushkey`` namespace.
1498 1500
1499 1501 The ``namespace`` argument defines the pushkey namespace to operate on.
1500 1502
1501 1503 The return type is a ``string``. The value is an encoded dictionary of keys.
1502 1504
1503 1505 Key-value pairs are delimited by newlines (``\n``). Within each line, keys and
1504 1506 values are separated by a tab (``\t``). Keys and values are both strings.
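The encoding above is simple enough to decode in a few lines. The following is a minimal sketch with a hypothetical helper name and made-up keys. It splits each line on the first tab only, so a value may be empty.

```python
def decode_listkeys(response):
    """Decode a ``listkeys`` response into a dict of bytes to bytes."""
    result = {}
    for line in response.split(b"\n"):
        if not line:
            continue
        # A line without a tab raises ValueError on unpacking.
        key, value = line.split(b"\t", 1)
        result[key] = value
    return result


example = (
    b"feature-x\t273ce12ad8f155317b2c078ec75a4eba507f1fba\n"
    b"feature-y\tbaae3bf31522f41dd5e6d7377d0edd8d1cf3fccc"
)
keys = decode_listkeys(example)
```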
1505 1507
1506 1508 lookup
1507 1509 ------
1508 1510
1509 1511 Try to resolve a value to a known repository revision.
1510 1512
1511 1513 The ``key`` argument is converted from bytes to an
1512 1514 ``encoding.localstr`` instance then passed into
1513 1515 ``localrepository.__getitem__`` in an attempt to resolve it.
1514 1516
1515 1517 The return type is a ``string``.
1516 1518
1517 1519 Upon successful resolution, returns ``1 <hex node>\n``. On failure,
1518 1520 returns ``0 <error string>\n``. e.g.::
1519 1521
1520 1522 1 273ce12ad8f155317b2c078ec75a4eba507f1fba\n
1521 1523
1522 1524 0 unknown revision 'foo'\n
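The two response shapes can be told apart by the leading digit. A minimal sketch, using a hypothetical ``parse_lookup`` helper:

```python
def parse_lookup(response):
    """Return (True, hex node) on success or (False, error string) on failure."""
    status, sep, rest = response.rstrip(b"\n").partition(b" ")
    if not sep or status not in (b"0", b"1"):
        raise ValueError("malformed lookup response: %r" % response)
    return status == b"1", rest


ok, node = parse_lookup(b"1 273ce12ad8f155317b2c078ec75a4eba507f1fba\n")
failed, message = parse_lookup(b"0 unknown revision 'foo'\n")
```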
1523 1525
1524 1526 known
1525 1527 -----
1526 1528
1527 1529 Determine whether multiple nodes are known.
1528 1530
1529 1531 The ``nodes`` argument is a list of space-delimited hex nodes to check
1530 1532 for existence.
1531 1533
1532 1534 The return type is a ``string``.
1533 1535
1534 1536 Returns a string consisting of ``0`` and ``1`` characters indicating whether nodes
1535 1537 are known. If the Nth node specified in the ``nodes`` argument is known,
1536 1538 a ``1`` will be returned at byte offset N. If the node isn't known, ``0``
1537 1539 will be present at byte offset N.
1538 1540
1539 1541 There is no trailing newline.
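Pairing the response with the request is a matter of indexing. The sketch below uses a hypothetical ``parse_known`` helper and placeholder nodes, and assumes byte offsets are counted from 0 in the order the nodes were sent.

```python
def parse_known(nodes, response):
    """Map each requested node to True (known) or False (unknown)."""
    if len(response) != len(nodes):
        raise ValueError("expected one flag per requested node")
    known = {}
    for index, node in enumerate(nodes):
        # Slice rather than index so the flag stays a bytes object.
        flag = response[index:index + 1]
        if flag not in (b"0", b"1"):
            raise ValueError("unexpected flag %r" % flag)
        known[node] = flag == b"1"
    return known


nodes = [b"a" * 40, b"b" * 40, b"c" * 40]
result = parse_known(nodes, b"101")
```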
1540 1542
1541 1543 pushkey
1542 1544 -------
1543 1545
1544 1546 Set a value using the ``pushkey`` protocol.
1545 1547
1546 1548 Accepts arguments ``namespace``, ``key``, ``old``, and ``new``, which
1547 1549 correspond to the pushkey namespace to operate on, the key within that
1548 1550 namespace to change, the old value (which may be empty), and the new value.
1549 1551 All arguments are string types.
1550 1552
1551 1553 The return type is a ``string``. The value depends on the transport protocol.
1552 1554
1553 1555 The SSH version 1 transport sends a string encoded integer followed by a
1554 1556 newline (``\n``) which indicates operation result. The server may send
1555 1557 additional output on the ``stderr`` stream that should be displayed to the
1556 1558 user.
1557 1559
1558 1560 The HTTP version 1 transport sends a string encoded integer followed by a
1559 1561 newline followed by additional server output that should be displayed to
1560 1562 the user. This may include output from hooks, etc.
1561 1563
1562 1564 The integer result varies by namespace. ``0`` means an error has occurred
1563 1565 and there should be additional output to display to the user.
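For the HTTP version 1 transport, the response described above can be split into its result and its output as sketched here. ``parse_pushkey_http`` is a hypothetical helper and the sample output text is made up. Because the meaning of non-zero results varies by namespace, the sketch only flags the ``0`` case.

```python
def parse_pushkey_http(response):
    """Split a response into (integer result, server output bytes)."""
    first, sep, output = response.partition(b"\n")
    if not sep:
        raise ValueError("missing newline after result integer")
    return int(first), output


result, output = parse_pushkey_http(b"1\nremote: hook output\n")
errored = result == 0
```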
1564 1566
1565 1567 stream_out
1566 1568 ----------
1567 1569
1568 1570 Obtain *streaming clone* data.
1569 1571
1570 1572 The return type is either a ``string`` or a ``stream``, depending on
1571 1573 whether the request was fulfilled properly.
1572 1574
1573 1575 A return value of ``1\n`` indicates the server is not configured to serve
1574 1576 this data. A client that sees this response probably did not verify that the
1575 1577 ``stream`` capability is set before making the request.
1576 1578
1577 1579 A return value of ``2\n`` indicates the server was unable to lock the
1578 1580 repository to generate data.
1579 1581
1580 1582 All other responses are a ``stream`` of bytes. The first line of this data
1581 1583 contains 2 space-delimited integers corresponding to the path count and
1582 1584 payload size, respectively::
1583 1585
1584 1586 <path count> <payload size>\n
1585 1587
1586 1588 The ``<payload size>`` is the total size of path data: it does not include
1587 1589 the size of the per-path header lines.
1588 1590
1589 1591 Following that header are ``<path count>`` entries. Each entry consists of a
1590 1592 line with metadata followed by raw revlog data. The line consists of::
1591 1593
1592 1594 <store path>\0<size>\n
1593 1595
1594 1596 The ``<store path>`` is the encoded store path of the data that follows.
1595 1597 ``<size>`` is the amount of data for this store path/revlog that follows the
1596 1598 newline.
1597 1599
1598 1600 There is no trailer to indicate end of data. Instead, the client should stop
1599 1601 reading after ``<path count>`` entries are consumed.
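The layout above could be consumed roughly as follows. This is a sketch that follows the description in this section only. ``read_stream_out`` is a hypothetical helper, the store paths and data are made up, and a real client would write each entry to disk instead of collecting it in memory.

```python
import io


def read_stream_out(fh):
    """Read a ``stream_out`` response, returning [(store path, data), ...]."""
    first = fh.readline()
    if first == b"1\n":
        raise RuntimeError("server is not configured to serve stream data")
    if first == b"2\n":
        raise RuntimeError("server was unable to lock the repository")

    pathcount, payloadsize = (int(v) for v in first.split(b" "))

    entries = []
    for _ in range(pathcount):
        header = fh.readline()
        path, size = header.rstrip(b"\n").split(b"\0", 1)
        size = int(size)
        data = fh.read(size)
        if len(data) != size:
            raise ValueError("short read for %r" % path)
        entries.append((path, data))

    # The payload size counts path data only, not the per-path header lines.
    if sum(len(data) for _, data in entries) != payloadsize:
        raise ValueError("payload size mismatch")
    return entries


body = b"2 7\n" + b"data/a.i\x003\n" + b"abc" + b"data/b.i\x004\n" + b"defg"
entries = read_stream_out(io.BytesIO(body))
```

Stopping after ``pathcount`` entries matters because nothing in the data marks its end.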
1600 1602
1601 1603 unbundle
1602 1604 --------
1603 1605
1604 1606 Send a bundle containing data (usually changegroup data) to the server.
1605 1607
1606 1608 Accepts the argument ``heads``, which is a space-delimited list of hex nodes
1607 1609 corresponding to server repository heads observed by the client. This is used
1608 1610 to detect race conditions and abort push operations before a server performs
1609 1611 too much work or a client transfers too much data.
1610 1612
1611 1613 The request payload consists of a bundle to be applied to the repository,
1612 1614 as if :hg:`unbundle` were called.
1613 1615
1614 1616 In most scenarios, a special ``push response`` type is returned. This type
1615 1617 contains an integer describing the change in heads as a result of the
1616 1618 operation. A value of ``0`` indicates nothing changed. ``1`` means the number
1617 1619 of heads remained the same. Values ``2`` and larger are 1 greater than the
1618 1620 number of added heads, e.g. ``3`` means 2 heads were added. Negative values
1619 1621 are offset by 1 in the other direction, e.g. ``-2`` means there is 1 fewer
1620 1622 head.
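Because the integer is offset by 1 in both directions, it helps to turn it back into a signed change in head count. The sketch below uses a hypothetical ``head_delta`` helper. A value of ``-1`` is not described above, so rejecting it is an assumption of this sketch.

```python
def head_delta(value):
    """Return the signed change in head count, or None if nothing changed."""
    if value == 0:
        return None  # nothing changed
    if value == -1:
        # Not described in the text above; rejecting it is an assumption.
        raise ValueError("unexpected push response value: -1")
    if value > 0:
        return value - 1  # 1 -> 0 heads added, 3 -> 2 heads added
    return value + 1  # -2 -> 1 fewer head


deltas = [head_delta(v) for v in (0, 1, 3, -2)]
```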
1621 1623
1622 1624 The encoding of the ``push response`` type varies by transport.
1623 1625
1624 1626 For the SSH version 1 transport, this type is composed of 2 ``string``
1625 1627 responses: an empty response (``0\n``) followed by the integer result value.
1626 1628 e.g. ``1\n2``. So the full response might be ``0\n1\n2``.
1627 1629
1628 1630 For the HTTP version 1 transport, the response is a ``string`` type composed
1629 1631 of an integer result value followed by a newline (``\n``) followed by string
1630 1632 content holding server output that should be displayed on the client (output
1631 1633 from hooks, etc.).
1632 1634
1633 1635 In some cases, the server may respond with a ``bundle2`` bundle. In this
1634 1636 case, the response type is ``stream``. For the HTTP version 1 transport, the
1635 1637 response is zlib compressed.
1636 1638
1637 1639 The server may also respond with a generic error type, which contains a string
1638 1640 indicating the failure.
@@ -1,883 +1,885
1 1 # wireprotoframing.py - unified framing protocol for wire protocol
2 2 #
3 3 # Copyright 2018 Gregory Szorc <gregory.szorc@gmail.com>
4 4 #
5 5 # This software may be used and distributed according to the terms of the
6 6 # GNU General Public License version 2 or any later version.
7 7
8 8 # This file contains functionality to support the unified frame-based wire
9 9 # protocol. For details about the protocol, see
10 10 # `hg help internals.wireprotocol`.
11 11
12 12 from __future__ import absolute_import
13 13
14 14 import struct
15 15
16 16 from .i18n import _
17 17 from .thirdparty import (
18 18 attr,
19 19 cbor,
20 20 )
21 21 from . import (
22 22 error,
23 23 util,
24 24 )
25 25 from .utils import (
26 26 stringutil,
27 27 )
28 28
29 29 FRAME_HEADER_SIZE = 8
30 30 DEFAULT_MAX_FRAME_SIZE = 32768
31 31
32 32 STREAM_FLAG_BEGIN_STREAM = 0x01
33 33 STREAM_FLAG_END_STREAM = 0x02
34 34 STREAM_FLAG_ENCODING_APPLIED = 0x04
35 35
36 36 STREAM_FLAGS = {
37 37 b'stream-begin': STREAM_FLAG_BEGIN_STREAM,
38 38 b'stream-end': STREAM_FLAG_END_STREAM,
39 39 b'encoded': STREAM_FLAG_ENCODING_APPLIED,
40 40 }
41 41
42 42 FRAME_TYPE_COMMAND_REQUEST = 0x01
43 43 FRAME_TYPE_COMMAND_DATA = 0x03
44 44 FRAME_TYPE_BYTES_RESPONSE = 0x04
45 45 FRAME_TYPE_ERROR_RESPONSE = 0x05
46 46 FRAME_TYPE_TEXT_OUTPUT = 0x06
47 47 FRAME_TYPE_PROGRESS = 0x07
48 48 FRAME_TYPE_STREAM_SETTINGS = 0x08
49 49
50 50 FRAME_TYPES = {
51 51 b'command-request': FRAME_TYPE_COMMAND_REQUEST,
52 52 b'command-data': FRAME_TYPE_COMMAND_DATA,
53 53 b'bytes-response': FRAME_TYPE_BYTES_RESPONSE,
54 54 b'error-response': FRAME_TYPE_ERROR_RESPONSE,
55 55 b'text-output': FRAME_TYPE_TEXT_OUTPUT,
56 56 b'progress': FRAME_TYPE_PROGRESS,
57 57 b'stream-settings': FRAME_TYPE_STREAM_SETTINGS,
58 58 }
59 59
60 60 FLAG_COMMAND_REQUEST_NEW = 0x01
61 61 FLAG_COMMAND_REQUEST_CONTINUATION = 0x02
62 62 FLAG_COMMAND_REQUEST_MORE_FRAMES = 0x04
63 63 FLAG_COMMAND_REQUEST_EXPECT_DATA = 0x08
64 64
65 65 FLAGS_COMMAND_REQUEST = {
66 66 b'new': FLAG_COMMAND_REQUEST_NEW,
67 67 b'continuation': FLAG_COMMAND_REQUEST_CONTINUATION,
68 68 b'more': FLAG_COMMAND_REQUEST_MORE_FRAMES,
69 69 b'have-data': FLAG_COMMAND_REQUEST_EXPECT_DATA,
70 70 }
71 71
72 72 FLAG_COMMAND_DATA_CONTINUATION = 0x01
73 73 FLAG_COMMAND_DATA_EOS = 0x02
74 74
75 75 FLAGS_COMMAND_DATA = {
76 76 b'continuation': FLAG_COMMAND_DATA_CONTINUATION,
77 77 b'eos': FLAG_COMMAND_DATA_EOS,
78 78 }
79 79
80 80 FLAG_BYTES_RESPONSE_CONTINUATION = 0x01
81 81 FLAG_BYTES_RESPONSE_EOS = 0x02
82 FLAG_BYTES_RESPONSE_CBOR = 0x04
82 83
83 84 FLAGS_BYTES_RESPONSE = {
84 85 b'continuation': FLAG_BYTES_RESPONSE_CONTINUATION,
85 86 b'eos': FLAG_BYTES_RESPONSE_EOS,
87 b'cbor': FLAG_BYTES_RESPONSE_CBOR,
86 88 }
87 89
88 90 FLAG_ERROR_RESPONSE_PROTOCOL = 0x01
89 91 FLAG_ERROR_RESPONSE_APPLICATION = 0x02
90 92
91 93 FLAGS_ERROR_RESPONSE = {
92 94 b'protocol': FLAG_ERROR_RESPONSE_PROTOCOL,
93 95 b'application': FLAG_ERROR_RESPONSE_APPLICATION,
94 96 }
95 97
96 98 # Maps frame types to their available flags.
97 99 FRAME_TYPE_FLAGS = {
98 100 FRAME_TYPE_COMMAND_REQUEST: FLAGS_COMMAND_REQUEST,
99 101 FRAME_TYPE_COMMAND_DATA: FLAGS_COMMAND_DATA,
100 102 FRAME_TYPE_BYTES_RESPONSE: FLAGS_BYTES_RESPONSE,
101 103 FRAME_TYPE_ERROR_RESPONSE: FLAGS_ERROR_RESPONSE,
102 104 FRAME_TYPE_TEXT_OUTPUT: {},
103 105 FRAME_TYPE_PROGRESS: {},
104 106 FRAME_TYPE_STREAM_SETTINGS: {},
105 107 }
106 108
107 109 ARGUMENT_RECORD_HEADER = struct.Struct(r'<HH')
108 110
109 111 def humanflags(mapping, value):
110 112 """Convert a numeric flags value to a human value, using a mapping table."""
111 113 flags = []
112 114 for val, name in sorted({v: k for k, v in mapping.iteritems()}.iteritems()):
113 115 if value & val:
114 116 flags.append(name)
115 117
116 118 return b'|'.join(flags)
117 119
118 120 @attr.s(slots=True)
119 121 class frameheader(object):
120 122 """Represents the data in a frame header."""
121 123
122 124 length = attr.ib()
123 125 requestid = attr.ib()
124 126 streamid = attr.ib()
125 127 streamflags = attr.ib()
126 128 typeid = attr.ib()
127 129 flags = attr.ib()
128 130
129 131 @attr.s(slots=True, repr=False)
130 132 class frame(object):
131 133 """Represents a parsed frame."""
132 134
133 135 requestid = attr.ib()
134 136 streamid = attr.ib()
135 137 streamflags = attr.ib()
136 138 typeid = attr.ib()
137 139 flags = attr.ib()
138 140 payload = attr.ib()
139 141
140 142 def __repr__(self):
141 143 typename = '<unknown>'
142 144 for name, value in FRAME_TYPES.iteritems():
143 145 if value == self.typeid:
144 146 typename = name
145 147 break
146 148
147 149 return ('frame(size=%d; request=%d; stream=%d; streamflags=%s; '
148 150 'type=%s; flags=%s)' % (
149 151 len(self.payload), self.requestid, self.streamid,
150 152 humanflags(STREAM_FLAGS, self.streamflags), typename,
151 153 humanflags(FRAME_TYPE_FLAGS.get(self.typeid, {}), self.flags)))
152 154
153 155 def makeframe(requestid, streamid, streamflags, typeid, flags, payload):
154 156 """Assemble a frame into a byte array."""
155 157 # TODO assert size of payload.
156 158 frame = bytearray(FRAME_HEADER_SIZE + len(payload))
157 159
158 160 # 24 bits length
159 161 # 16 bits request id
160 162 # 8 bits stream id
161 163 # 8 bits stream flags
162 164 # 4 bits type
163 165 # 4 bits flags
164 166
165 167 l = struct.pack(r'<I', len(payload))
166 168 frame[0:3] = l[0:3]
167 169 struct.pack_into(r'<HBB', frame, 3, requestid, streamid, streamflags)
168 170 frame[7] = (typeid << 4) | flags
169 171 frame[8:] = payload
170 172
171 173 return frame
172 174
173 175 def makeframefromhumanstring(s):
174 176 """Create a frame from a human readable string
175 177
176 178 DANGER: NOT SAFE TO USE WITH UNTRUSTED INPUT BECAUSE OF POTENTIAL
177 179 eval() USAGE. DO NOT USE IN CORE.
178 180
179 181 Strings have the form:
180 182
181 183 <request-id> <stream-id> <stream-flags> <type> <flags> <payload>
182 184
183 185 This can be used by user-facing applications and tests for creating
184 186 frames easily without having to type out a bunch of constants.
185 187
186 188 Request ID and stream IDs are integers.
187 189
188 190 Stream flags, frame type, and flags can be specified by integer or
189 191 named constant.
190 192
191 193 Flags can be delimited by `|` to bitwise OR them together.
192 194
193 195 If the payload begins with ``cbor:``, the following string will be
194 196 evaluated as Python code and the resulting object will be fed into
195 197 a CBOR encoder. Otherwise, the payload is interpreted as a Python
196 198 byte string literal.
197 199 """
198 200 fields = s.split(b' ', 5)
199 201 requestid, streamid, streamflags, frametype, frameflags, payload = fields
200 202
201 203 requestid = int(requestid)
202 204 streamid = int(streamid)
203 205
204 206 finalstreamflags = 0
205 207 for flag in streamflags.split(b'|'):
206 208 if flag in STREAM_FLAGS:
207 209 finalstreamflags |= STREAM_FLAGS[flag]
208 210 else:
209 211 finalstreamflags |= int(flag)
210 212
211 213 if frametype in FRAME_TYPES:
212 214 frametype = FRAME_TYPES[frametype]
213 215 else:
214 216 frametype = int(frametype)
215 217
216 218 finalflags = 0
217 219 validflags = FRAME_TYPE_FLAGS[frametype]
218 220 for flag in frameflags.split(b'|'):
219 221 if flag in validflags:
220 222 finalflags |= validflags[flag]
221 223 else:
222 224 finalflags |= int(flag)
223 225
224 226 if payload.startswith(b'cbor:'):
225 227 payload = cbor.dumps(stringutil.evalpython(payload[5:]), canonical=True)
226 228
227 229 else:
228 230 payload = stringutil.unescapestr(payload)
229 231
230 232 return makeframe(requestid=requestid, streamid=streamid,
231 233 streamflags=finalstreamflags, typeid=frametype,
232 234 flags=finalflags, payload=payload)
233 235
234 236 def parseheader(data):
235 237 """Parse a unified framing protocol frame header from a buffer.
236 238
237 239 The header is expected to be in the buffer at offset 0 and the
238 240 buffer is expected to be large enough to hold a full header.
239 241 """
240 242 # 24 bits payload length (little endian)
241 243 # 16 bits request ID
242 244 # 8 bits stream ID
243 245 # 8 bits stream flags
244 246 # 4 bits frame type
245 247 # 4 bits frame flags
246 248 # ... payload
247 249 framelength = data[0] + 256 * data[1] + 65536 * data[2]
248 250 requestid, streamid, streamflags = struct.unpack_from(r'<HBB', data, 3)
249 251 typeflags = data[7]
250 252
251 253 frametype = (typeflags & 0xf0) >> 4
252 254 frameflags = typeflags & 0x0f
253 255
254 256 return frameheader(framelength, requestid, streamid, streamflags,
255 257 frametype, frameflags)
256 258
257 259 def readframe(fh):
258 260 """Read a unified framing protocol frame from a file object.
259 261
260 262 Returns a ``frame`` instance for the decoded frame or None if no
261 263 frame is available. May raise if a malformed frame is
262 264 seen.
263 265 """
264 266 header = bytearray(FRAME_HEADER_SIZE)
265 267
266 268 readcount = fh.readinto(header)
267 269
268 270 if readcount == 0:
269 271 return None
270 272
271 273 if readcount != FRAME_HEADER_SIZE:
272 274 raise error.Abort(_('received incomplete frame: got %d bytes: %s') %
273 275 (readcount, header))
274 276
275 277 h = parseheader(header)
276 278
277 279 payload = fh.read(h.length)
278 280 if len(payload) != h.length:
279 281 raise error.Abort(_('frame length error: expected %d; got %d') %
280 282 (h.length, len(payload)))
281 283
282 284 return frame(h.requestid, h.streamid, h.streamflags, h.typeid, h.flags,
283 285 payload)
284 286
285 287 def createcommandframes(stream, requestid, cmd, args, datafh=None,
286 288 maxframesize=DEFAULT_MAX_FRAME_SIZE):
287 289 """Create frames necessary to transmit a request to run a command.
288 290
289 291 This is a generator of bytearrays. Each item represents a frame
290 292 ready to be sent over the wire to a peer.
291 293 """
292 294 data = {b'name': cmd}
293 295 if args:
294 296 data[b'args'] = args
295 297
296 298 data = cbor.dumps(data, canonical=True)
297 299
298 300 offset = 0
299 301
300 302 while True:
301 303 flags = 0
302 304
303 305 # Must set new or continuation flag.
304 306 if not offset:
305 307 flags |= FLAG_COMMAND_REQUEST_NEW
306 308 else:
307 309 flags |= FLAG_COMMAND_REQUEST_CONTINUATION
308 310
309 311 # The expect-data flag is set on all frames.
310 312 if datafh:
311 313 flags |= FLAG_COMMAND_REQUEST_EXPECT_DATA
312 314
313 315 payload = data[offset:offset + maxframesize]
314 316 offset += len(payload)
315 317
316 318 if len(payload) == maxframesize and offset < len(data):
317 319 flags |= FLAG_COMMAND_REQUEST_MORE_FRAMES
318 320
319 321 yield stream.makeframe(requestid=requestid,
320 322 typeid=FRAME_TYPE_COMMAND_REQUEST,
321 323 flags=flags,
322 324 payload=payload)
323 325
324 326 if not (flags & FLAG_COMMAND_REQUEST_MORE_FRAMES):
325 327 break
326 328
327 329 if datafh:
328 330 while True:
329 331 data = datafh.read(DEFAULT_MAX_FRAME_SIZE)
330 332
331 333 done = False
332 334 if len(data) == DEFAULT_MAX_FRAME_SIZE:
333 335 flags = FLAG_COMMAND_DATA_CONTINUATION
334 336 else:
335 337 flags = FLAG_COMMAND_DATA_EOS
336 338 assert datafh.read(1) == b''
337 339 done = True
338 340
339 341 yield stream.makeframe(requestid=requestid,
340 342 typeid=FRAME_TYPE_COMMAND_DATA,
341 343 flags=flags,
342 344 payload=data)
343 345
344 346 if done:
345 347 break
346 348
347 349 def createbytesresponseframesfrombytes(stream, requestid, data,
348 350 maxframesize=DEFAULT_MAX_FRAME_SIZE):
349 351 """Create a raw frame to send a bytes response from static bytes input.
350 352
351 353 Returns a generator of bytearrays.
352 354 """
353 355
354 356 # Simple case of a single frame.
355 357 if len(data) <= maxframesize:
356 358 yield stream.makeframe(requestid=requestid,
357 359 typeid=FRAME_TYPE_BYTES_RESPONSE,
358 360 flags=FLAG_BYTES_RESPONSE_EOS,
359 361 payload=data)
360 362 return
361 363
362 364 offset = 0
363 365 while True:
364 366 chunk = data[offset:offset + maxframesize]
365 367 offset += len(chunk)
366 368 done = offset == len(data)
367 369
368 370 if done:
369 371 flags = FLAG_BYTES_RESPONSE_EOS
370 372 else:
371 373 flags = FLAG_BYTES_RESPONSE_CONTINUATION
372 374
373 375 yield stream.makeframe(requestid=requestid,
374 376 typeid=FRAME_TYPE_BYTES_RESPONSE,
375 377 flags=flags,
376 378 payload=chunk)
377 379
378 380 if done:
379 381 break
380 382
381 383 def createerrorframe(stream, requestid, msg, protocol=False, application=False):
382 384 # TODO properly handle frame size limits.
383 385 assert len(msg) <= DEFAULT_MAX_FRAME_SIZE
384 386
385 387 flags = 0
386 388 if protocol:
387 389 flags |= FLAG_ERROR_RESPONSE_PROTOCOL
388 390 if application:
389 391 flags |= FLAG_ERROR_RESPONSE_APPLICATION
390 392
391 393 yield stream.makeframe(requestid=requestid,
392 394 typeid=FRAME_TYPE_ERROR_RESPONSE,
393 395 flags=flags,
394 396 payload=msg)
395 397
396 398 def createtextoutputframe(stream, requestid, atoms):
397 399 """Create a text output frame to render text to people.
398 400
399 401 ``atoms`` is a 3-tuple of (formatting string, args, labels).
400 402
401 403 The formatting string contains ``%s`` tokens to be replaced by the
402 404 corresponding indexed entry in ``args``. ``labels`` is an iterable of
403 405 formatters to be applied at rendering time. In terms of the ``ui``
404 406 class, each atom corresponds to a ``ui.write()``.
405 407 """
406 408 bytesleft = DEFAULT_MAX_FRAME_SIZE
407 409 atomchunks = []
408 410
409 411 for (formatting, args, labels) in atoms:
410 412 if len(args) > 255:
411 413 raise ValueError('cannot use more than 255 formatting arguments')
412 414 if len(labels) > 255:
413 415 raise ValueError('cannot use more than 255 labels')
414 416
415 417 # TODO look for localstr, other types here?
416 418
417 419 if not isinstance(formatting, bytes):
418 420 raise ValueError('must use bytes formatting strings')
419 421 for arg in args:
420 422 if not isinstance(arg, bytes):
421 423 raise ValueError('must use bytes for arguments')
422 424 for label in labels:
423 425 if not isinstance(label, bytes):
424 426 raise ValueError('must use bytes for labels')
425 427
426 428 # Formatting string must be UTF-8.
427 429 formatting = formatting.decode(r'utf-8', r'replace').encode(r'utf-8')
428 430
429 431 # Arguments must be UTF-8.
430 432 args = [a.decode(r'utf-8', r'replace').encode(r'utf-8') for a in args]
431 433
432 434 # Labels must be ASCII.
433 435 labels = [l.decode(r'ascii', r'strict').encode(r'ascii')
434 436 for l in labels]
435 437
436 438 if len(formatting) > 65535:
437 439 raise ValueError('formatting string cannot be longer than 64k')
438 440
439 441 if any(len(a) > 65535 for a in args):
440 442 raise ValueError('argument string cannot be longer than 64k')
441 443
442 444 if any(len(l) > 255 for l in labels):
443 445 raise ValueError('label string cannot be longer than 255 bytes')
444 446
445 447 chunks = [
446 448 struct.pack(r'<H', len(formatting)),
447 449 struct.pack(r'<BB', len(labels), len(args)),
448 450 struct.pack(r'<' + r'B' * len(labels), *map(len, labels)),
449 451 struct.pack(r'<' + r'H' * len(args), *map(len, args)),
450 452 ]
451 453 chunks.append(formatting)
452 454 chunks.extend(labels)
453 455 chunks.extend(args)
454 456
455 457 atom = b''.join(chunks)
456 458 atomchunks.append(atom)
457 459 bytesleft -= len(atom)
458 460
459 461 if bytesleft < 0:
460 462 raise ValueError('cannot encode data in a single frame')
461 463
462 464 yield stream.makeframe(requestid=requestid,
463 465 typeid=FRAME_TYPE_TEXT_OUTPUT,
464 466 flags=0,
465 467 payload=b''.join(atomchunks))
466 468
467 469 class stream(object):
468 470 """Represents a logical unidirectional series of frames."""
469 471
470 472 def __init__(self, streamid, active=False):
471 473 self.streamid = streamid
472 474 self._active = active
473 475
474 476 def makeframe(self, requestid, typeid, flags, payload):
475 477 """Create a frame to be sent out over this stream.
476 478
477 479 Only returns the frame instance. Does not actually send it.
478 480 """
479 481 streamflags = 0
480 482 if not self._active:
481 483 streamflags |= STREAM_FLAG_BEGIN_STREAM
482 484 self._active = True
483 485
484 486 return makeframe(requestid, self.streamid, streamflags, typeid, flags,
485 487 payload)
486 488
487 489 def ensureserverstream(stream):
488 490 if stream.streamid % 2:
489 491 raise error.ProgrammingError('server should only write to even '
490 492 'numbered streams; %d is not even' %
491 493 stream.streamid)
492 494
493 495 class serverreactor(object):
494 496 """Holds state of a server handling frame-based protocol requests.
495 497
496 498 This class is the "brain" of the unified frame-based protocol server
497 499 component. While the protocol is stateless from the perspective of
498 500 requests/commands, something needs to track which frames have been
499 501 received, what frames to expect, etc. This class is that thing.
500 502
501 503 Instances are modeled as a state machine of sorts. Instances are also
502 504 reactionary to external events. The point of this class is to encapsulate
503 505 the state of the connection and the exchange of frames, not to perform
504 506 work. Instead, callers tell this class when something occurs, like a
505 507 frame arriving. If that activity is worthy of a follow-up action (say
506 508 *run a command*), the return value of that handler will say so.
507 509
508 510 I/O and CPU intensive operations are purposefully delegated outside of
509 511 this class.
510 512
511 513 Consumers are expected to tell instances when events occur. They do so by
512 514 calling the various ``on*`` methods. These methods return a 2-tuple
513 515 describing any follow-up action(s) to take. The first element is the
514 516 name of an action to perform. The second is a data structure (usually
515 517 a dict) specific to that action that contains more information. e.g.
516 518 if the server wants to send frames back to the client, the data structure
517 519 will contain a reference to those frames.
518 520
519 521 Valid actions that consumers can be instructed to take are:
520 522
521 523 sendframes
522 524 Indicates that frames should be sent to the client. The ``framegen``
523 525 key contains a generator of frames that should be sent. The server
524 526 assumes that all frames are sent to the client.
525 527
526 528 error
527 529 Indicates that an error occurred. Consumer should probably abort.
528 530
529 531 runcommand
530 532 Indicates that the consumer should run a wire protocol command. Details
531 533 of the command to run are given in the data structure.
532 534
533 535 wantframe
534 536 Indicates that nothing of interest happened and the server is waiting on
535 537 more frames from the client before anything interesting can be done.
536 538
537 539 noop
538 540 Indicates no additional action is required.
539 541
540 542 Known Issues
541 543 ------------
542 544
543 545 There are no limits to the number of partially received commands or their
544 546 size. A malicious client could stream command request data and exhaust the
545 547 server's memory.
546 548
547 549 Partially received commands are not acted upon when end of input is
548 550 reached. Should the server error if it receives a partial request?
549 551 Should the client send a message to abort a partially transmitted request
550 552 to facilitate graceful shutdown?
551 553
552 554 Active requests that haven't been responded to aren't tracked. This means
553 555 that if we receive a command and instruct its dispatch, another command
554 556 with its request ID can come in over the wire and there will be a race
555 557 between who responds to what.
556 558 """
557 559
558 560 def __init__(self, deferoutput=False):
559 561 """Construct a new server reactor.
560 562
561 563 ``deferoutput`` can be used to indicate that no output frames should be
562 564 instructed to be sent until input has been exhausted. In this mode,
563 565 events that would normally generate output frames (such as a command
564 566 response being ready) will instead defer instructing the consumer to
565 567 send those frames. This is useful for half-duplex transports where the
566 568 sender cannot receive until all data has been transmitted.
567 569 """
568 570 self._deferoutput = deferoutput
569 571 self._state = 'idle'
570 572 self._nextoutgoingstreamid = 2
571 573 self._bufferedframegens = []
572 574 # stream id -> stream instance for all active streams from the client.
573 575 self._incomingstreams = {}
574 576 self._outgoingstreams = {}
575 577 # request id -> state dict for a command that is actively being received.
576 578 self._receivingcommands = {}
577 579 # Request IDs that have been received and are actively being processed.
578 580 # Once all output for a request has been sent, it is removed from this
579 581 # set.
580 582 self._activecommands = set()
581 583
582 584 def onframerecv(self, frame):
583 585 """Process a frame that has been received off the wire.
584 586
585 587 Returns a 2-tuple of ``(action, meta)`` that details what action,
586 588 if any, the consumer should take next.
587 589 """
588 590 if not frame.streamid % 2:
589 591 self._state = 'errored'
590 592 return self._makeerrorresult(
591 593 _('received frame with even numbered stream ID: %d') %
592 594 frame.streamid)
593 595
594 596 if frame.streamid not in self._incomingstreams:
595 597 if not frame.streamflags & STREAM_FLAG_BEGIN_STREAM:
596 598 self._state = 'errored'
597 599 return self._makeerrorresult(
598 600 _('received frame on unknown inactive stream without '
599 601 'beginning of stream flag set'))
600 602
601 603 self._incomingstreams[frame.streamid] = stream(frame.streamid)
602 604
603 605 if frame.streamflags & STREAM_FLAG_ENCODING_APPLIED:
604 606 # TODO handle decoding frames
605 607 self._state = 'errored'
606 608 raise error.ProgrammingError('support for decoding stream payloads '
607 609 'not yet implemented')
608 610
609 611 if frame.streamflags & STREAM_FLAG_END_STREAM:
610 612 del self._incomingstreams[frame.streamid]
611 613
612 614 handlers = {
613 615 'idle': self._onframeidle,
614 616 'command-receiving': self._onframecommandreceiving,
615 617 'errored': self._onframeerrored,
616 618 }
617 619
618 620 meth = handlers.get(self._state)
619 621 if not meth:
620 622 raise error.ProgrammingError('unhandled state: %s' % self._state)
621 623
622 624 return meth(frame)
623 625
624 626 def onbytesresponseready(self, stream, requestid, data):
625 627 """Signal that a bytes response is ready to be sent to the client.
626 628
627 629 The raw bytes response is passed as an argument.
628 630 """
629 631 ensureserverstream(stream)
630 632
631 633 def sendframes():
632 634 for frame in createbytesresponseframesfrombytes(stream, requestid,
633 635 data):
634 636 yield frame
635 637
636 638 self._activecommands.remove(requestid)
637 639
638 640 result = sendframes()
639 641
640 642 if self._deferoutput:
641 643 self._bufferedframegens.append(result)
642 644 return 'noop', {}
643 645 else:
644 646 return 'sendframes', {
645 647 'framegen': result,
646 648 }
647 649
648 650 def oninputeof(self):
649 651 """Signals that end of input has been received.
650 652
651 653 No more frames will be received. All pending activity should be
652 654 completed.
653 655 """
654 656 # TODO should we do anything about in-flight commands?
655 657
656 658 if not self._deferoutput or not self._bufferedframegens:
657 659 return 'noop', {}
658 660
659 661 # If we buffered all our responses, emit those.
660 662 def makegen():
661 663 for gen in self._bufferedframegens:
662 664 for frame in gen:
663 665 yield frame
664 666
665 667 return 'sendframes', {
666 668 'framegen': makegen(),
667 669 }
668 670
669 671 def onapplicationerror(self, stream, requestid, msg):
670 672 ensureserverstream(stream)
671 673
672 674 return 'sendframes', {
673 675 'framegen': createerrorframe(stream, requestid, msg,
674 676 application=True),
675 677 }
676 678
677 679 def makeoutputstream(self):
678 680 """Create a stream to be used for sending data to the client."""
679 681 streamid = self._nextoutgoingstreamid
680 682 self._nextoutgoingstreamid += 2
681 683
682 684 s = stream(streamid)
683 685 self._outgoingstreams[streamid] = s
684 686
685 687 return s
686 688
687 689 def _makeerrorresult(self, msg):
688 690 return 'error', {
689 691 'message': msg,
690 692 }
691 693
692 694 def _makeruncommandresult(self, requestid):
693 695 entry = self._receivingcommands[requestid]
694 696
695 697 if not entry['requestdone']:
696 698 self._state = 'errored'
697 699 raise error.ProgrammingError('should not be called without '
698 700 'requestdone set')
699 701
700 702 del self._receivingcommands[requestid]
701 703
702 704 if self._receivingcommands:
703 705 self._state = 'command-receiving'
704 706 else:
705 707 self._state = 'idle'
706 708
707 709 # Decode the payloads as CBOR.
708 710 entry['payload'].seek(0)
709 711 request = cbor.load(entry['payload'])
710 712
711 713 if b'name' not in request:
712 714 self._state = 'errored'
713 715 return self._makeerrorresult(
714 716 _('command request missing "name" field'))
715 717
716 718 if b'args' not in request:
717 719 request[b'args'] = {}
718 720
719 721 assert requestid not in self._activecommands
720 722 self._activecommands.add(requestid)
721 723
722 724 return 'runcommand', {
723 725 'requestid': requestid,
724 726 'command': request[b'name'],
725 727 'args': request[b'args'],
726 728 'data': entry['data'].getvalue() if entry['data'] else None,
727 729 }
728 730
729 731 def _makewantframeresult(self):
730 732 return 'wantframe', {
731 733 'state': self._state,
732 734 }
733 735
734 736 def _validatecommandrequestframe(self, frame):
735 737 new = frame.flags & FLAG_COMMAND_REQUEST_NEW
736 738 continuation = frame.flags & FLAG_COMMAND_REQUEST_CONTINUATION
737 739
738 740 if new and continuation:
739 741 self._state = 'errored'
740 742 return self._makeerrorresult(
741 743 _('received command request frame with both new and '
742 744 'continuation flags set'))
743 745
744 746 if not new and not continuation:
745 747 self._state = 'errored'
746 748 return self._makeerrorresult(
747 749 _('received command request frame with neither new nor '
748 750 'continuation flags set'))
749 751
750 752 def _onframeidle(self, frame):
751 753 # The only frame type that should be received in this state is a
752 754 # command request.
753 755 if frame.typeid != FRAME_TYPE_COMMAND_REQUEST:
754 756 self._state = 'errored'
755 757 return self._makeerrorresult(
756 758 _('expected command request frame; got %d') % frame.typeid)
757 759
758 760 res = self._validatecommandrequestframe(frame)
759 761 if res:
760 762 return res
761 763
762 764 if frame.requestid in self._receivingcommands:
763 765 self._state = 'errored'
764 766 return self._makeerrorresult(
765 767 _('request with ID %d already received') % frame.requestid)
766 768
767 769 if frame.requestid in self._activecommands:
768 770 self._state = 'errored'
769 771 return self._makeerrorresult(
770 772 _('request with ID %d is already active') % frame.requestid)
771 773
772 774 new = frame.flags & FLAG_COMMAND_REQUEST_NEW
773 775 moreframes = frame.flags & FLAG_COMMAND_REQUEST_MORE_FRAMES
774 776 expectingdata = frame.flags & FLAG_COMMAND_REQUEST_EXPECT_DATA
775 777
776 778 if not new:
777 779 self._state = 'errored'
778 780 return self._makeerrorresult(
779 781 _('received command request frame without new flag set'))
780 782
781 783 payload = util.bytesio()
782 784 payload.write(frame.payload)
783 785
784 786 self._receivingcommands[frame.requestid] = {
785 787 'payload': payload,
786 788 'data': None,
787 789 'requestdone': not moreframes,
788 790 'expectingdata': bool(expectingdata),
789 791 }
790 792
791 793 # This is the final frame for this request. Dispatch it.
792 794 if not moreframes and not expectingdata:
793 795 return self._makeruncommandresult(frame.requestid)
794 796
795 797 assert moreframes or expectingdata
796 798 self._state = 'command-receiving'
797 799 return self._makewantframeresult()
798 800
799 801 def _onframecommandreceiving(self, frame):
800 802 if frame.typeid == FRAME_TYPE_COMMAND_REQUEST:
801 803 # Process new command requests as such.
802 804 if frame.flags & FLAG_COMMAND_REQUEST_NEW:
803 805 return self._onframeidle(frame)
804 806
805 807 res = self._validatecommandrequestframe(frame)
806 808 if res:
807 809 return res
808 810
809 811 # All other frames should be related to a command that is currently
810 812 # receiving but is not active.
811 813 if frame.requestid in self._activecommands:
812 814 self._state = 'errored'
813 815 return self._makeerrorresult(
814 816 _('received frame for request that is still active: %d') %
815 817 frame.requestid)
816 818
817 819 if frame.requestid not in self._receivingcommands:
818 820 self._state = 'errored'
819 821 return self._makeerrorresult(
820 822 _('received frame for request that is not receiving: %d') %
821 823 frame.requestid)
822 824
823 825 entry = self._receivingcommands[frame.requestid]
824 826
825 827 if frame.typeid == FRAME_TYPE_COMMAND_REQUEST:
826 828 moreframes = frame.flags & FLAG_COMMAND_REQUEST_MORE_FRAMES
827 829 expectingdata = bool(frame.flags & FLAG_COMMAND_REQUEST_EXPECT_DATA)
828 830
829 831 if entry['requestdone']:
830 832 self._state = 'errored'
831 833 return self._makeerrorresult(
832 834 _('received command request frame when request frames '
833 835 'were supposedly done'))
834 836
835 837 if expectingdata != entry['expectingdata']:
836 838 self._state = 'errored'
837 839 return self._makeerrorresult(
838 840 _('mismatch between expect data flag and previous frame'))
839 841
840 842 entry['payload'].write(frame.payload)
841 843
842 844 if not moreframes:
843 845 entry['requestdone'] = True
844 846
845 847 if not moreframes and not expectingdata:
846 848 return self._makeruncommandresult(frame.requestid)
847 849
848 850 return self._makewantframeresult()
849 851
850 852 elif frame.typeid == FRAME_TYPE_COMMAND_DATA:
851 853 if not entry['expectingdata']:
852 854 self._state = 'errored'
853 855 return self._makeerrorresult(_(
854 856 'received command data frame for request that is not '
855 857 'expecting data: %d') % frame.requestid)
856 858
857 859 if entry['data'] is None:
858 860 entry['data'] = util.bytesio()
859 861
860 862 return self._handlecommanddataframe(frame, entry)
861 863 else:
862 864 self._state = 'errored'
863 865 return self._makeerrorresult(_(
864 866 'received unexpected frame type: %d') % frame.typeid)
865 867
866 868 def _handlecommanddataframe(self, frame, entry):
867 869 assert frame.typeid == FRAME_TYPE_COMMAND_DATA
868 870
869 871 # TODO support streaming data instead of buffering it.
870 872 entry['data'].write(frame.payload)
871 873
872 874 if frame.flags & FLAG_COMMAND_DATA_CONTINUATION:
873 875 return self._makewantframeresult()
874 876 elif frame.flags & FLAG_COMMAND_DATA_EOS:
875 877 entry['data'].seek(0)
876 878 return self._makeruncommandresult(frame.requestid)
877 879 else:
878 880 self._state = 'errored'
879 881 return self._makeerrorresult(_('command data frame without '
880 882 'flags'))
881 883
882 884 def _onframeerrored(self, frame):
883 885 return self._makeerrorresult(_('server already errored'))
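
The following is a minimal sketch, not part of this changeset, of how a consumer of the reactor might dispatch on the ``(action, meta)`` results its methods return. Only the action names (``sendframes``, ``runcommand``, ``error``, ``wantframe``, ``noop``) and the meta keys are taken from the code above. The function ``handleresult`` and the ``sent`` and ``ran`` lists are hypothetical names used for illustration, and raising ``RuntimeError`` on ``error`` is an assumption about what a consumer would want to do.

```python
def handleresult(result, sent, ran):
    """Dispatch one (action, meta) result from the reactor.

    ``sent`` collects frames that would be written to the transport and
    ``ran`` collects commands that would be dispatched. Both are
    hypothetical stand-ins for real I/O and command execution.
    """
    action, meta = result

    if action == 'sendframes':
        # meta['framegen'] is a generator of frames; drain it.
        sent.extend(meta['framegen'])
    elif action == 'runcommand':
        # A full command request has been received and decoded.
        ran.append((meta['requestid'], meta['command'], meta['args']))
    elif action == 'error':
        # The reactor has entered its errored state.
        raise RuntimeError(meta['message'])
    elif action in ('wantframe', 'noop'):
        # Nothing to do until more input arrives.
        pass
    else:
        raise ValueError('unhandled action: %s' % action)

    return action
```

A consumer would presumably call something like this once per ``onframerecv()`` result, and once more with the result of ``oninputeof()`` so that frames buffered under ``deferoutput`` are flushed.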