wireproto: define frame to represent progress updates...
Gregory Szorc -
r37307:b0041036 default
@@ -1,1612 +1,1650
1 1 The Mercurial wire protocol is a request-response based protocol
2 2 with multiple wire representations.
3 3
4 4 Each request is modeled as a command name, a dictionary of arguments, and
5 5 optional raw input. Command arguments and their types are intrinsic
6 6 properties of commands. So is the response type of the command. This means
7 7 clients can't always send arbitrary arguments to servers and servers can't
8 8 return multiple response types.
9 9
10 10 The protocol is synchronous and does not support multiplexing (concurrent
11 11 commands).
12 12
13 13 Handshake
14 14 =========
15 15
16 16 It is required or common for clients to perform a *handshake* when connecting
17 17 to a server. The handshake serves the following purposes:
18 18
19 19 * Negotiating protocol/transport level options
20 20 * Allowing the client to learn about server capabilities to influence
21 21 future requests
22 22 * Ensuring the underlying transport channel is in a *clean* state
23 23
24 24 An important goal of the handshake is to allow clients to use more modern
25 25 wire protocol features. By default, clients must assume they are talking
26 26 to an old version of Mercurial server (possibly even the very first
27 27 implementation). So, clients should not attempt to call or utilize modern
28 28 wire protocol features until they have confirmation that the server
29 29 supports them. The handshake implementation is designed to allow both
30 30 ends to utilize the latest set of features and capabilities with as
31 31 few round trips as possible.
32 32
33 33 The handshake mechanism varies by transport and protocol and is documented
34 34 in the sections below.
35 35
36 36 HTTP Protocol
37 37 =============
38 38
39 39 Handshake
40 40 ---------
41 41
42 42 The client sends a ``capabilities`` command request (``?cmd=capabilities``)
43 43 as soon as HTTP requests may be issued.
44 44
45 45 The server responds with a capabilities string, which the client parses to
46 46 learn about the server's abilities.
47 47
48 48 HTTP Version 1 Transport
49 49 ------------------------
50 50
51 51 Commands are issued as HTTP/1.0 or HTTP/1.1 requests. Commands are
52 52 sent to the base URL of the repository with the command name sent in
53 53 the ``cmd`` query string parameter. e.g.
54 54 ``https://example.com/repo?cmd=capabilities``. The HTTP method is ``GET``
55 55 or ``POST`` depending on the command and whether there is a request
56 56 body.
57 57
58 58 Command arguments can be sent multiple ways.
59 59
60 60 The simplest is part of the URL query string using ``x-www-form-urlencoded``
61 61 encoding (see Python's ``urllib.urlencode()``). However, many servers impose
62 62 length limitations on the URL. So this mechanism is typically only used if
63 63 the server doesn't support other mechanisms.
64 64
65 65 If the server supports the ``httpheader`` capability, command arguments can
66 66 be sent in HTTP request headers named ``X-HgArg-<N>`` where ``<N>`` is an
67 67 integer starting at 1. A ``x-www-form-urlencoded`` representation of the
68 68 arguments is obtained. This full string is then split into chunks and sent
69 69 in numbered ``X-HgArg-<N>`` headers. The maximum length of each HTTP header
70 70 is defined by the server in the ``httpheader`` capability value, which defaults
71 71 to ``1024``. The server reassembles the encoded arguments string by
72 72 concatenating the ``X-HgArg-<N>`` headers then URL decodes them into a
73 73 dictionary.
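The header splitting described above can be sketched as follows. This is a minimal illustration, not Mercurial's implementation: the function names are hypothetical, the limit is assumed to apply to the header value only, and Python 3's ``urllib.parse`` stands in for the Python 2 ``urllib.urlencode()`` named above.

```python
from urllib.parse import parse_qsl, urlencode


def encode_args_as_headers(args, max_value_len=1024):
    """Split URL-encoded arguments across numbered X-HgArg-<N> headers.

    Assumption: max_value_len bounds the header value only.
    """
    encoded = urlencode(sorted(args.items()))
    chunks = [encoded[i:i + max_value_len]
              for i in range(0, len(encoded), max_value_len)]
    # Header numbering starts at 1.
    return {"X-HgArg-%d" % n: chunk for n, chunk in enumerate(chunks, 1)}


def decode_args_from_headers(headers):
    """Concatenate X-HgArg-<N> headers in order, then URL decode them."""
    parts = []
    n = 1
    while "X-HgArg-%d" % n in headers:
        parts.append(headers["X-HgArg-%d" % n])
        n += 1
    return dict(parse_qsl("".join(parts), keep_blank_values=True))
```

Because the server concatenates the chunks before decoding, a chunk boundary may fall in the middle of a percent escape without affecting the result.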
74 74
75 75 The list of ``X-HgArg-<N>`` headers should be added to the ``Vary`` request
76 76 header to instruct caches to take these headers into consideration when caching
77 77 requests.
78 78
79 79 If the server supports the ``httppostargs`` capability, the client
80 80 may send command arguments in the HTTP request body as part of an
81 81 HTTP POST request. The command arguments will be URL encoded just like
82 82 they would for sending them via HTTP headers. However, no splitting is
83 83 performed: the raw arguments are included in the HTTP request body.
84 84
85 85 The client sends a ``X-HgArgs-Post`` header with the string length of the
86 86 encoded arguments data. Additional data may be included in the HTTP
87 87 request body immediately following the argument data. The offset of the
88 88 non-argument data is defined by the ``X-HgArgs-Post`` header. The
89 89 ``X-HgArgs-Post`` header is not required if there is no argument data.
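As a rough sketch of the ``httppostargs`` layout described above, the following helpers build and split a request body. The function names are hypothetical and header lookup is simplified to an exact-case dictionary key.

```python
from urllib.parse import parse_qsl, urlencode


def build_post_request(args, data=b""):
    """Return (headers, body) with arguments first and extra data after."""
    encoded = urlencode(sorted(args.items())).encode("ascii")
    headers = {}
    if encoded:
        # The header carries the length of the encoded argument data,
        # which is also the offset of any non-argument data.
        headers["X-HgArgs-Post"] = str(len(encoded))
    return headers, encoded + data


def split_post_request(headers, body):
    """Return (args, data) from a request built as above."""
    # A missing header means there is no argument data.
    arg_len = int(headers.get("X-HgArgs-Post", "0"))
    encoded = body[:arg_len].decode("ascii")
    args = dict(parse_qsl(encoded, keep_blank_values=True))
    return args, body[arg_len:]
```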
90 90
91 91 Additional command data can be sent as part of the HTTP request body. The
92 92 default ``Content-Type`` when sending data is ``application/mercurial-0.1``.
93 93 A ``Content-Length`` header is currently always sent.
94 94
95 95 Example HTTP requests::
96 96
97 97 GET /repo?cmd=capabilities
98 98 X-HgArg-1: foo=bar&baz=hello%20world
99 99
100 100 The request media type should be chosen based on server support. If the
101 101 ``httpmediatype`` server capability is present, the client should send
102 102 the newest mutually supported media type. If this capability is absent,
103 103 the client must assume the server only supports the
104 104 ``application/mercurial-0.1`` media type.
105 105
106 106 The ``Content-Type`` HTTP response header identifies the response as coming
107 107 from Mercurial and can also be used to signal an error has occurred.
108 108
109 109 The ``application/mercurial-*`` media types indicate a generic Mercurial
110 110 data type.
111 111
112 112 The ``application/mercurial-0.1`` media type is raw Mercurial data. It is the
113 113 predecessor of the format below.
114 114
115 115 The ``application/mercurial-0.2`` media type is compression framed Mercurial
116 116 data. The first byte of the payload indicates the length of the compression
117 117 format identifier that follows. Next are N bytes indicating the compression
118 118 format. e.g. ``zlib``. The remaining bytes are compressed according to that
119 119 compression format. The decompressed data behaves the same as with
120 120 ``application/mercurial-0.1``.
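The ``application/mercurial-0.2`` framing can be illustrated with a short sketch. The helper names are hypothetical, and only the ``zlib`` identifier mentioned above is handled.

```python
import zlib


def encode_media_02(data, engine=b"zlib"):
    """Prefix compressed data with a length byte and the format name."""
    if engine != b"zlib":
        raise ValueError("this sketch only knows zlib")
    return bytes([len(engine)]) + engine + zlib.compress(data)


def decode_media_02(payload):
    """Read the compression format identifier, then decompress the rest."""
    name_len = payload[0]                 # first byte: identifier length
    engine = payload[1:1 + name_len]      # next N bytes: identifier
    compressed = payload[1 + name_len:]   # remainder: compressed data
    if engine != b"zlib":
        raise ValueError("unsupported compression format: %r" % engine)
    return zlib.decompress(compressed)
```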
121 121
122 122 The ``application/hg-error`` media type indicates a generic error occurred.
123 123 The content of the HTTP response body typically holds text describing the
124 124 error.
125 125
126 126 The ``application/hg-changegroup`` media type indicates a changegroup response
127 127 type.
128 128
129 129 Clients also accept the ``text/plain`` media type. All other media
130 130 types should cause the client to error.
131 131
132 132 Behavior of media types is further described in the ``Content Negotiation``
133 133 section below.
134 134
135 135 Clients should issue a ``User-Agent`` request header that identifies the client.
136 136 The server should not use the ``User-Agent`` for feature detection.
137 137
138 138 A command returning a ``string`` response issues a
139 139 ``application/mercurial-0.*`` media type and the HTTP response body contains
140 140 the raw string value (after compression decoding, if used). A
141 141 ``Content-Length`` header is typically issued, but not required.
142 142
143 143 A command returning a ``stream`` response issues a
144 144 ``application/mercurial-0.*`` media type and the HTTP response is typically
145 145 using *chunked transfer* (``Transfer-Encoding: chunked``).
146 146
147 147 HTTP Version 2 Transport
148 148 ------------------------
149 149
150 150 **Experimental - feature under active development**
151 151
152 152 Version 2 of the HTTP protocol is exposed under the ``/api/*`` URL space.
153 153 Its final API name is not yet formalized.
154 154
155 155 Commands are triggered by sending HTTP POST requests against URLs of the
156 156 form ``<permission>/<command>``, where ``<permission>`` is ``ro`` or
157 157 ``rw``, meaning read-only and read-write, respectively, and ``<command>``
158 158 is a named wire protocol command.
159 159
160 160 Non-POST request methods MUST be rejected by the server with an HTTP
161 161 405 response.
162 162
163 163 Commands that modify repository state in meaningful ways MUST NOT be
164 164 exposed under the ``ro`` URL prefix. All available commands MUST be
165 165 available under the ``rw`` URL prefix.
166 166
167 167 Server administrators MAY implement blanket HTTP authentication keyed
168 168 off the URL prefix. For example, a server may require authentication
169 169 for all ``rw/*`` URLs and let unauthenticated requests to ``ro/*``
170 170 URLs proceed. A server MAY issue an HTTP 401, 403, or 407 response
171 171 in accordance with RFC 7235. Clients SHOULD recognize the HTTP Basic
172 172 (RFC 7617) and Digest (RFC 7616) authentication schemes. Clients SHOULD
173 173 make an attempt to recognize unknown schemes using the
174 174 ``WWW-Authenticate`` response header on a 401 response, as defined by
175 175 RFC 7235.
176 176
177 177 Read-only commands are accessible under ``rw/*`` URLs so clients can
178 178 signal the intent of the operation very early in the connection
179 179 lifecycle. For example, a ``push`` operation - which consists of
180 180 various read-only commands mixed with at least one read-write command -
181 181 can perform all commands against ``rw/*`` URLs so that any server-side
182 182 authentication requirements are discovered upon attempting the first
183 183 command - not potentially several commands into the exchange. This
184 184 allows clients to fail faster or prompt for credentials as soon as the
185 185 exchange takes place. This provides a better end-user experience.
186 186
187 187 Requests to unknown commands or URLs result in an HTTP 404.
188 188 TODO formally define response type, how error is communicated, etc.
189 189
190 190 HTTP request and response bodies use the *Unified Frame-Based Protocol*
191 191 (defined below) for media exchange. The entirety of the HTTP message
192 192 body is 0 or more frames as defined by this protocol.
193 193
194 194 Clients and servers MUST advertise the ``TBD`` media type via the
195 195 ``Content-Type`` request and response headers. In addition, clients MUST
196 196 advertise this media type value in their ``Accept`` request header in all
197 197 requests.
198 198 TODO finalize the media type. For now, it is defined in wireprotoserver.py.
199 199
200 200 Servers receiving requests without an ``Accept`` header SHOULD respond with
201 201 an HTTP 406.
202 202
203 203 Servers receiving requests with an invalid ``Content-Type`` header SHOULD
204 204 respond with an HTTP 415.
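The status codes named above could be selected on the server along these lines. This is a sketch under stated assumptions: the function name is hypothetical, the order of the checks is a guess, header names are assumed to be normalized already, and the media type is passed in as a parameter because it is not yet finalized.

```python
def preflight_status(method, headers, media_type):
    """Return an HTTP error status for a version 2 request, or None."""
    if method != "POST":
        return 405  # non-POST methods MUST be rejected
    if "Accept" not in headers:
        return 406  # missing Accept header
    if headers.get("Content-Type") != media_type:
        return 415  # invalid Content-Type header
    return None     # request may proceed to frame processing
```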
205 205
206 206 The command to run is specified in the POST payload as defined by the
207 207 *Unified Frame-Based Protocol*. This is redundant with data already
208 208 encoded in the URL. This is by design, so server operators can have
209 209 better understanding about server activity from looking merely at
210 210 HTTP access logs.
211 211
212 212 In most circumstances, the command specified in the URL MUST match
213 213 the command specified in the frame-based payload or the server will
214 214 respond with an error. The exception to this is the special
215 215 ``multirequest`` URL. (See below.) In addition, HTTP requests
216 216 are limited to one command invocation. The exception is the special
217 217 ``multirequest`` URL.
218 218
219 219 The ``multirequest`` command endpoints (``ro/multirequest`` and
220 220 ``rw/multirequest``) are special in that they allow the execution of
221 221 *any* command and allow the execution of multiple commands. If the
222 222 HTTP request issues multiple commands across multiple frames, all
223 223 issued commands will be processed by the server. Per the defined
224 224 behavior of the *Unified Frame-Based Protocol*, commands may be
225 225 issued interleaved and responses may come back in a different order
226 226 than they were issued. Clients MUST be able to deal with this.
227 227
228 228 SSH Protocol
229 229 ============
230 230
231 231 Handshake
232 232 ---------
233 233
234 234 For all clients, the handshake consists of the client sending 1 or more
235 235 commands to the server using version 1 of the transport. Servers respond
236 236 to commands they know how to respond to and send an empty response (``0\n``)
237 237 for unknown commands (per standard behavior of version 1 of the transport).
238 238 Clients then typically look for a response to the newest sent command to
239 239 determine which transport version to use and what the available features for
240 240 the connection and server are.
241 241
242 242 Preceding any response from client-issued commands, the server may print
243 243 non-protocol output. It is common for SSH servers to print banners, message
244 244 of the day announcements, etc. when clients connect. It is assumed that any
245 245 such *banner* output will precede any Mercurial server output. So clients
246 246 must be prepared to handle server output on initial connect that isn't
247 247 in response to any client-issued command and doesn't conform to Mercurial's
248 248 wire protocol. This *banner* output should only be on stdout. However,
249 249 some servers may send output on stderr.
250 250
251 251 Pre 0.9.1 clients issue a ``between`` command with the ``pairs`` argument
252 252 having the value
253 253 ``0000000000000000000000000000000000000000-0000000000000000000000000000000000000000``.
254 254
255 255 The ``between`` command has been supported since the original Mercurial
256 256 SSH server. Requesting the empty range will return a ``\n`` string response,
257 257 which will be encoded as ``1\n\n`` (value length of ``1`` followed by a newline
258 258 followed by the value, which happens to be a newline).
259 259
260 260 For pre 0.9.1 clients and all servers, the exchange looks like::
261 261
262 262 c: between\n
263 263 c: pairs 81\n
264 264 c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
265 265 s: 1\n
266 266 s: \n
267 267
268 268 0.9.1+ clients send a ``hello`` command (with no arguments) before the
269 269 ``between`` command. The response to this command allows clients to
270 270 discover server capabilities and settings.
271 271
272 272 An example exchange between 0.9.1+ clients and a ``hello`` aware server looks
273 273 like::
274 274
275 275 c: hello\n
276 276 c: between\n
277 277 c: pairs 81\n
278 278 c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
279 279 s: 324\n
280 280 s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
281 281 s: 1\n
282 282 s: \n
283 283
284 284 And a similar scenario but with servers sending a banner on connect::
285 285
286 286 c: hello\n
287 287 c: between\n
288 288 c: pairs 81\n
289 289 c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
290 290 s: welcome to the server\n
291 291 s: if you find any issues, email someone@somewhere.com\n
292 292 s: 324\n
293 293 s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
294 294 s: 1\n
295 295 s: \n
296 296
297 297 Note that output from the ``hello`` command is terminated by a ``\n``. This is
298 298 part of the response payload and not part of the wire protocol adding a newline
299 299 after responses. In other words, the length of the response contains the
300 300 trailing ``\n``.
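A client might extract capability names from the ``hello`` payload roughly as follows. The function name is hypothetical, and the sketch assumes the payload has the ``capabilities: ...`` shape shown in the examples above.

```python
def parse_hello_payload(payload):
    """Return the set of capability tokens from a hello response payload.

    payload is the response value, including its trailing newline.
    """
    prefix = b"capabilities:"
    capabilities = set()
    for line in payload.splitlines():
        if line.startswith(prefix):
            # Tokens are separated by spaces after the prefix.
            capabilities.update(line[len(prefix):].split())
    return capabilities
```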
301 301
302 302 Clients supporting version 2 of the SSH transport send a line beginning
303 303 with ``upgrade`` before the ``hello`` and ``between`` commands. The line
304 304 (which isn't a well-formed command line because it doesn't consist of a
305 305 single command name) serves to both communicate the client's intent to
306 306 switch to transport version 2 (transports are version 1 by default) as
307 307 well as to advertise the client's transport-level capabilities so the
308 308 server may satisfy that request immediately.
309 309
310 310 The upgrade line has the form:
311 311
312 312 upgrade <token> <transport capabilities>
313 313
314 314 That is the literal string ``upgrade`` followed by a space, followed by
315 315 a randomly generated string, followed by a space, followed by a string
316 316 denoting the client's transport capabilities.
317 317
318 318 The token can be anything. However, a random UUID is recommended. (Use
319 319 of version 4 UUIDs is recommended because version 1 UUIDs can leak the
320 320 client's MAC address.)
321 321
322 322 The transport capabilities string is a URL/percent encoded string
323 323 containing key-value pairs defining the client's transport-level
324 324 capabilities. The following capabilities are defined:
325 325
326 326 proto
327 327 A comma-delimited list of transport protocol versions the client
328 328 supports. e.g. ``ssh-v2``.
329 329
330 330 If the server does not recognize the ``upgrade`` line, it should issue
331 331 an empty response and continue processing the ``hello`` and ``between``
332 332 commands. Here is an example handshake between a version 2 aware client
333 333 and a non version 2 aware server:
334 334
335 335 c: upgrade 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a proto=ssh-v2
336 336 c: hello\n
337 337 c: between\n
338 338 c: pairs 81\n
339 339 c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
340 340 s: 0\n
341 341 s: 324\n
342 342 s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
343 343 s: 1\n
344 344 s: \n
345 345
346 346 (The initial ``0\n`` line from the server indicates an empty response to
347 347 the unknown ``upgrade ..`` command/line.)
348 348
349 349 If the server recognizes the ``upgrade`` line and is willing to satisfy that
350 350 upgrade request, it replies with a payload of the following form:
351 351
352 352 upgraded <token> <transport name>\n
353 353
354 354 This line is the literal string ``upgraded``, a space, the token that was
355 355 specified by the client in its ``upgrade ...`` request line, a space, and the
356 356 name of the transport protocol that was chosen by the server. The transport
357 357 name MUST match one of the names the client specified in the ``proto`` field
358 358 of its ``upgrade ...`` request line.
359 359
360 360 If a server issues an ``upgraded`` response, it MUST also read and ignore
361 361 the lines associated with the ``hello`` and ``between`` command requests
362 362 that were issued by the client. It is assumed that the negotiated transport
363 363 will respond with equivalent requested information following the transport
364 364 handshake.
365 365
366 366 All data following the ``\n`` terminating the ``upgraded`` line is the
367 367 domain of the negotiated transport. It is common for the data immediately
368 368 following to contain additional metadata about the state of the transport and
369 369 the server. However, this isn't strictly speaking part of the transport
370 370 handshake and isn't covered by this section.
371 371
372 372 Here is an example handshake between a version 2 aware client and a version
373 373 2 aware server:
374 374
375 375 c: upgrade 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a proto=ssh-v2
376 376 c: hello\n
377 377 c: between\n
378 378 c: pairs 81\n
379 379 c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
380 380 s: upgraded 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a ssh-v2\n
381 381 s: <additional transport specific data>
382 382
383 383 The client-issued token that is echoed in the response provides a more
384 384 resilient mechanism for differentiating *banner* output from Mercurial
385 385 output. In version 1, properly formatted banner output could get confused
386 386 for Mercurial server output. By submitting a randomly generated token
387 387 that is then present in the response, the client can look for that token
388 388 in response lines and have reasonable certainty that the line did not
389 389 originate from a *banner* message.
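The token exchange above can be sketched as a pair of helpers. The names are hypothetical, and the sketch assumes the ``upgrade`` line is terminated by a newline like the other command lines.

```python
import uuid
from urllib.parse import urlencode


def make_upgrade_line(protocols):
    """Return (token, line) for an upgrade request."""
    token = str(uuid.uuid4())  # version 4 UUID, as recommended
    capabilities = urlencode({"proto": ",".join(protocols)})
    line = "upgrade %s %s\n" % (token, capabilities)
    return token, line.encode("ascii")


def parse_upgraded_line(line, token, protocols):
    """Return the chosen transport name, or None if the line is not a
    valid reply to our upgrade request (e.g. banner output)."""
    parts = line.rstrip(b"\n").split(b" ")
    if len(parts) != 3 or parts[0] != b"upgraded":
        return None
    if parts[1] != token.encode("ascii"):
        return None  # token mismatch: not a reply to this request
    name = parts[2].decode("ascii")
    # The server MUST pick one of the names the client offered.
    return name if name in protocols else None
```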
390 390
391 391 SSH Version 1 Transport
392 392 -----------------------
393 393
394 394 The SSH transport (version 1) is a custom text-based protocol suitable for
395 395 use over any bi-directional stream transport. It is most commonly used with
396 396 SSH.
397 397
398 398 A SSH transport server can be started with ``hg serve --stdio``. The stdin,
399 399 stderr, and stdout file descriptors of the started process are used to exchange
400 400 data. When Mercurial connects to a remote server over SSH, it actually starts
401 401 a ``hg serve --stdio`` process on the remote server.
402 402
403 403 Commands are issued by sending the command name followed by a trailing newline
404 404 ``\n`` to the server. e.g. ``capabilities\n``.
405 405
406 406 Command arguments are sent in the following format::
407 407
408 408 <argument> <length>\n<value>
409 409
410 410 That is, the argument string name followed by a space followed by the
411 411 integer length of the value (expressed as a string) followed by a newline
412 412 (``\n``) followed by the raw argument value.
413 413
414 414 Dictionary arguments are encoded differently::
415 415
416 416 <argument> <# elements>\n
417 417 <key1> <length1>\n<value1>
418 418 <key2> <length2>\n<value2>
419 419 ...
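The argument encodings above can be sketched as follows. The helper names are hypothetical, and the sketch assumes the line breaks between dictionary entries in the listing are presentational, so each value is followed directly by the next key.

```python
def encode_argument(name, value):
    """Encode one argument; dict values use the dictionary form."""
    if isinstance(value, dict):
        out = b"%s %d\n" % (name, len(value))
        for key, item in value.items():
            out += b"%s %d\n%s" % (key, len(item), item)
        return out
    return b"%s %d\n%s" % (name, len(value), value)


def encode_command(command, args):
    """Encode a command name followed by its arguments."""
    encoded_args = b"".join(encode_argument(k, v) for k, v in args.items())
    return command + b"\n" + encoded_args
```

For the handshake's ``between`` command, ``encode_command(b"between", {b"pairs": pairs})`` with the 81-byte empty range yields ``between\npairs 81\n`` followed by the value, matching the exchanges shown earlier.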
420 420
421 421 Non-argument data is sent immediately after the final argument value. It is
422 422 encoded in chunks::
423 423
424 424 <length>\n<data>
425 425
426 426 Each command declares a list of supported arguments and their types. If a
427 427 client sends an unknown argument to the server, the server should abort
428 428 immediately. The special argument ``*`` in a command's definition indicates
429 429 that all argument names are allowed.
430 430
431 431 The definition of supported arguments and types is initially made when a
432 432 new command is implemented. The client and server must initially independently
433 433 agree on the arguments and their types. This initial set of arguments can be
434 434 supplemented through the presence of *capabilities* advertised by the server.
435 435
436 436 Each command has a defined expected response type.
437 437
438 438 A ``string`` response type is a length framed value. The response consists of
439 439 the string encoded integer length of a value followed by a newline (``\n``)
440 440 followed by the value. Empty values are allowed (and are represented as
441 441 ``0\n``).
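A minimal sketch of the ``string`` response framing, using hypothetical helper names and a file-like object for the incoming byte stream:

```python
import io


def write_string_response(value):
    """Frame a value as <length>\\n<value>."""
    return b"%d\n%s" % (len(value), value)


def read_string_response(stream):
    """Read one length-framed value from a binary file-like object."""
    length = int(stream.readline().decode("ascii").strip())
    value = stream.read(length)
    if len(value) != length:
        raise ValueError("stream ended before the full value was read")
    return value
```

With this framing, the newline value returned for the empty ``between`` range is written as ``1\n\n`` and an empty value as ``0\n``.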
442 442
443 443 A ``stream`` response type consists of raw bytes of data. There is no framing.
444 444
445 445 A generic error response type is also supported. It consists of an error
446 446 message written to ``stderr`` followed by ``\n-\n``. In addition, ``\n`` is
447 447 written to ``stdout``.
448 448
449 449 If the server receives an unknown command, it will send an empty ``string``
450 450 response.
451 451
452 452 The server terminates if it receives an empty command (a ``\n`` character).
453 453
454 454 SSH Version 2 Transport
455 455 -----------------------
456 456
457 457 **Experimental and under development**
458 458
459 459 Version 2 of the SSH transport behaves identically to version 1 of the SSH
460 460 transport with the exception of handshake semantics. See above for how
461 461 version 2 of the SSH transport is negotiated.
462 462
463 463 Immediately following the ``upgraded`` line signaling a switch to version
464 464 2 of the SSH protocol, the server automatically sends additional details
465 465 about the capabilities of the remote server. This has the form:
466 466
467 467 <integer length of value>\n
468 468 capabilities: ...\n
469 469
470 470 e.g.
471 471
472 472 s: upgraded 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a ssh-v2\n
473 473 s: 240\n
474 474 s: capabilities: known getbundle batch ...\n
475 475
476 476 Following capabilities advertisement, the peers communicate using version
477 477 1 of the SSH transport.
478 478
479 479 Unified Frame-Based Protocol
480 480 ============================
481 481
482 482 **Experimental and under development**
483 483
484 484 The *Unified Frame-Based Protocol* is a communications protocol between
485 485 Mercurial peers. The protocol aims to be mostly transport agnostic
486 486 (works similarly on HTTP, SSH, etc).
487 487
488 488 To operate the protocol, a bi-directional, half-duplex pipe supporting
489 489 ordered sends and receives is required. That is, each peer has one pipe
490 490 for sending data and another for receiving.
491 491
492 492 All data is read and written in atomic units called *frames*. These
493 493 are conceptually similar to TCP packets. Higher-level functionality
494 494 is built on the exchange and processing of frames.
495 495
496 496 All frames are associated with a *stream*. A *stream* provides a
497 497 unidirectional grouping of frames. Streams facilitate two goals:
498 498 content encoding and parallelism. There is a dedicated section on
499 499 streams below.
500 500
501 501 The protocol is request-response based: the client issues requests to
502 502 the server, which issues replies to those requests. Server-initiated
503 503 messaging is not currently supported, but this specification carves
504 504 out room to implement it.
505 505
506 506 All frames are associated with a numbered request. Frames can thus
507 507 be logically grouped by their request ID.
508 508
509 509 Frames begin with an 8 octet header followed by a variable length
510 510 payload::
511 511
512 512 +------------------------------------------------+
513 513 |                  Length (24)                   |
514 514 +--------------------------------+---------------+
515 515 |        Request ID (16)         | Stream ID (8) |
516 516 +------------------+-------------+---------------+
517 517 | Stream Flags (8) |
518 518 +-----------+------+
519 519 | Type (4)  |
520 520 +-----------+
521 521 | Flags (4) |
522 522 +===========+===================================================|
523 523 |                     Frame Payload (0...)                    ...
524 524 +---------------------------------------------------------------+
525 525
526 526 The length of the frame payload is expressed as an unsigned 24 bit
527 527 little endian integer. Values larger than 65535 MUST NOT be used unless
528 528 given permission by the server as part of the negotiated capabilities
529 529 during the handshake. The frame header is not part of the advertised
530 530 frame length. The payload length is the over-the-wire length. If there
531 531 is content encoding applied to the payload as part of the frame's stream,
532 532 the length is the output of that content encoding, not the input.
533 533
534 534 The 16-bit ``Request ID`` field denotes the integer request identifier,
535 535 stored as an unsigned little endian integer. Odd numbered requests are
536 536 client-initiated. Even numbered requests are server-initiated. This
537 537 refers to where the *request* was initiated - not where the *frame* was
538 538 initiated, so servers will send frames with odd ``Request ID`` in
539 539 response to client-initiated requests. Implementations are advised to
540 540 start ordering request identifiers at ``1`` and ``0``, increment by
541 541 ``2``, and wrap around if all available numbers have been exhausted.
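The suggested numbering scheme can be sketched with a generator. The function name is hypothetical, and the sketch does not check whether a reused identifier still belongs to an in-flight request.

```python
import itertools


def request_ids(first):
    """Yield 16-bit request IDs starting at first, stepping by 2.

    Use first=1 for client-initiated requests and first=0 for
    server-initiated ones. IDs wrap around once exhausted.
    """
    return itertools.cycle(range(first, 1 << 16, 2))
```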
542 542
543 543 The 8-bit ``Stream ID`` field denotes the stream that the frame is
544 544 associated with. Frames belonging to a stream may have content
545 545 encoding applied and the receiver may need to decode the raw frame
546 546 payload to obtain the original data. Odd numbered IDs are
547 547 client-initiated. Even numbered IDs are server-initiated.
548 548
549 549 The 8-bit ``Stream Flags`` field defines stream processing semantics.
550 550 See the section on streams below.
551 551
552 552 The 4-bit ``Type`` field denotes the type of frame being sent.
553 553
554 554 The 4-bit ``Flags`` field defines special, per-type attributes for
555 555 the frame.
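The header layout can be illustrated with Python's ``struct`` module. The function names are hypothetical. One detail is an assumption: the text does not say which half of the final octet holds ``Type``, so this sketch places it in the high 4 bits and ``Flags`` in the low 4 bits.

```python
import struct


def make_frame(request_id, stream_id, stream_flags, frame_type, flags,
               payload):
    """Build an 8 octet header followed by the payload."""
    if len(payload) >= 1 << 24:
        raise ValueError("payload does not fit a 24-bit length")
    if not (0 <= frame_type < 16 and 0 <= flags < 16):
        raise ValueError("type and flags are 4-bit fields")
    # 24-bit little endian length: low 3 bytes of a 32-bit value.
    header = struct.pack("<I", len(payload))[:3]
    header += struct.pack("<HBB", request_id, stream_id, stream_flags)
    # Assumed nibble order: type high, flags low.
    header += bytes([(frame_type << 4) | flags])
    return header + payload


def parse_frame_header(header):
    """Return (length, request_id, stream_id, stream_flags, type, flags)."""
    length = int.from_bytes(header[0:3], "little")
    request_id, stream_id, stream_flags, type_flags = struct.unpack(
        "<HBBB", header[3:8])
    return (length, request_id, stream_id, stream_flags,
            type_flags >> 4, type_flags & 0x0F)
```

The length field counts only the payload, so a frame occupies the header's 8 octets plus ``length`` octets on the wire.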
556 556
557 557 The sections below define the frame types and their behavior.
558 558
559 559 Command Request (``0x01``)
560 560 --------------------------
561 561
562 562 This frame contains a request to run a command.
563 563
564 564 The name of the command to run constitutes the entirety of the frame
565 565 payload.
566 566
567 567 This frame type MUST ONLY be sent from clients to servers: it is illegal
568 568 for a server to send this frame to a client.
569 569
570 570 The following flag values are defined for this type:
571 571
572 572 0x01
573 573 End of command data. When set, the client will not send any command
574 574 arguments or additional command data. When set, the command has been
575 575 fully issued and the server has the full context to process the command.
576 576 The next frame issued by the client is not part of this command.
577 577 0x02
578 578 Command argument frames expected. When set, the client will send
579 579 *Command Argument* frames containing command argument data.
580 580 0x04
581 581 Command data frames expected. When set, the client will send
582 582 *Command Data* frames containing a raw stream of data for this
583 583 command.
584 584
585 585 The ``0x01`` flag is mutually exclusive with both the ``0x02`` and ``0x04``
586 586 flags.
587 587
588 588 Command Argument (``0x02``)
589 589 ---------------------------
590 590
591 591 This frame contains a named argument for a command.
592 592
593 593 The frame type MUST ONLY be sent from clients to servers: it is illegal
594 594 for a server to send this frame to a client.
595 595
596 596 The payload consists of:
597 597
598 598 * A 16-bit little endian integer denoting the length of the
599 599 argument name.
600 600 * A 16-bit little endian integer denoting the length of the
601 601 argument value.
602 602 * N bytes of ASCII data containing the argument name.
603 603 * N bytes of binary data containing the argument value.
604 604
605 605 The payload MUST hold the entirety of the 32-bit header and the
606 606 argument name. The argument value MAY span multiple frames. If this
607 607 occurs, the appropriate frame flag should be set to indicate this.
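For an argument that fits in a single frame, the payload layout above can be sketched as follows. The helper names are hypothetical, the length fields are assumed to be unsigned, and values spanning multiple frames are not handled.

```python
import struct


def encode_argument_payload(name, value):
    """Pack name length, value length, name, then value."""
    return struct.pack("<HH", len(name), len(value)) + name + value


def decode_argument_payload(payload):
    """Return (name, value) from a single-frame argument payload."""
    name_len, value_len = struct.unpack_from("<HH", payload)
    name_end = 4 + name_len  # the two length fields take 4 bytes
    name = payload[4:name_end]
    value = payload[name_end:name_end + value_len]
    return name, value
```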
608 608
609 609 The following flag values are defined for this type:
610 610
611 611 0x01
612 612 Argument data continuation. When set, the data for this argument did
613 613 not fit in a single frame and the next frame will contain additional
614 614 argument data.
615 615
616 616 0x02
617 617 End of arguments data. When set, the client will not send any more
618 618 command arguments for the command this frame is associated with.
619 619 The next frame issued by the client will be command data or
620 620 belong to a separate request.
621 621
622 622 Command Data (``0x03``)
623 623 -----------------------
624 624
625 625 This frame contains raw data for a command.
626 626
627 627 Most commands can be executed by specifying arguments. However,
628 628 arguments have an upper bound to their length. Commands that
629 629 accept data beyond this length, or whose data length isn't known
630 630 when the command is initially sent, need to stream arbitrary
631 631 data to the server. This frame type facilitates the sending
632 632 of this data.
633 633
634 634 The payload of this frame type consists of a stream of raw data to be
635 635 consumed by the command handler on the server. The format of the data
636 636 is command specific.
637 637
638 638 The following flag values are defined for this type:
639 639
640 640 0x01
641 641 Command data continuation. When set, the data for this command
642 642 continues into a subsequent frame.
643 643
644 644 0x02
645 645 End of data. When set, command data has been fully sent to the
646 646 server. The command has been fully issued and no new data for this
647 647 command will be sent. The next frame will belong to a new command.
648 648
649 649 Bytes Response Data (``0x04``)
650 650 ------------------------------
651 651
652 652 This frame contains raw bytes response data to an issued command.
653 653
654 654 The following flag values are defined for this type:
655 655
656 656 0x01
657 657 Data continuation. When set, an additional frame containing raw
658 658 response data will follow.
659 659 0x02
660 660 End of data. When set, the response data has been fully sent and
661 661 no additional frames for this response will be sent.
662 662
663 663 The ``0x01`` flag is mutually exclusive with the ``0x02`` flag.
664 664
665 665 Error Response (``0x05``)
666 666 -------------------------
667 667
668 668 An error occurred when processing a request. This could indicate
669 669 a protocol-level failure or an application level failure depending
670 670 on the flags for this message type.
671 671
672 672 The payload for this type is an error message that should be
673 673 displayed to the user.
674 674
675 675 The following flag values are defined for this type:
676 676
677 677 0x01
678 678 The error occurred at the transport/protocol level. If set, the
679 679 connection should be closed.
680 680 0x02
681 681 The error occurred at the application level. e.g. invalid command.
682 682
683 683 Human Output Side-Channel (``0x06``)
684 684 ------------------------------------
685 685
686 686 This frame contains a message that is intended to be displayed to
687 687 people. Whereas most frames communicate machine readable data, this
688 688 frame communicates textual data that is intended to be shown to
689 689 humans.
690 690
691 691 The frame consists of a series of *formatting requests*. Each formatting
692 692 request consists of a formatting string, arguments for that formatting
693 693 string, and labels to apply to that formatting string.
694 694
695 695 A formatting string is a printf()-like string that allows variable
696 696 substitution within the string. Labels allow the rendered text to be
697 697 *decorated*. Assuming use of the canonical Mercurial code base, a
698 698 formatting string can be the input to the ``i18n._`` function. This
699 699 allows messages emitted from the server to be localized. So even if
700 700 the server has different i18n settings, people could see messages in
701 701 their *native* settings. Similarly, the use of labels allows
702 702 decorations like coloring and underlining to be applied using the
703 703 client's configured rendering settings.
704 704
705 705 Formatting strings are similar to ``printf()`` strings or how
706 706 Python's ``%`` operator works. The only supported formatting sequences
707 707 are ``%s`` and ``%%``. ``%s`` will be replaced by whatever the string
708 708 at that position resolves to. ``%%`` will be replaced by ``%``. All
709 709 other 2-byte sequences beginning with ``%`` represent a literal
710 710 ``%`` followed by that character. However, future versions of the
711 711 wire protocol reserve the right to allow clients to opt in to receiving
712 712 formatting strings with additional formatters, hence why ``%%`` is
713 713 required to represent the literal ``%``.
714 714
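The substitution rules above can be sketched in Python as follows. This is an illustrative sketch, not the canonical implementation: the function name ``render_format`` is hypothetical, and rendering a missing argument as an empty string is an assumption, since the text does not say what happens when there are fewer arguments than ``%s`` sequences.

```python
def render_format(fmt, args):
    """Expand ``%s`` and ``%%``; leave any other ``%x`` pair untouched."""
    out = []
    remaining = iter(args)
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        nxt = fmt[i + 1] if i + 1 < len(fmt) else ""
        if ch == "%" and nxt == "s":
            # Assumption: a missing argument renders as an empty string.
            out.append(next(remaining, ""))
            i += 2
        elif ch == "%" and nxt == "%":
            out.append("%")
            i += 2
        else:
            # Ordinary characters, and a "%" that starts an unsupported
            # sequence: the "%" stays literal and the character after it
            # is handled on the next iteration.
            out.append(ch)
            i += 1
    return "".join(out)
```

For example, ``render_format("%s of %s (100%%) %d\n", ["3", "4"])`` leaves the unsupported ``%d`` in place and collapses ``%%`` to a single ``%``.
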
715 715 The raw frame consists of a series of data structures representing
716 716 textual atoms to print. Each atom begins with a struct defining the
717 717 size of the data that follows:
718 718
719 719 * A 16-bit little endian unsigned integer denoting the length of the
720 720 formatting string.
721 721 * An 8-bit unsigned integer denoting the number of label strings
722 722 that follow.
723 723 * An 8-bit unsigned integer denoting the number of formatting string
724 724 argument strings that follow.
725 725 * An array of 8-bit unsigned integers denoting the lengths of
726 726 *labels* data.
727 727 * An array of 16-bit unsigned integers denoting the lengths of
728 728 formatting string arguments.
729 729 * The formatting string, encoded as UTF-8.
730 730 * 0 or more ASCII strings defining labels to apply to this atom.
731 731 * 0 or more UTF-8 strings that will be used as arguments to the
732 732 formatting string.
733 733
734 734 TODO use ASCII for formatting string.
735 735
736 736 All data to be printed MUST be encoded into a single frame: this frame
737 737 does not support spanning data across multiple frames.
738 738
739 739 All textual data encoded in these frames is assumed to be line delimited.
740 740 The last atom in the frame SHOULD end with a newline (``\n``). If it
741 741 doesn't, clients MAY add a newline to facilitate immediate printing.
742 742
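The atom layout described above can be sketched with Python's ``struct`` module. The helper names ``encode_atom`` and ``decode_atom`` are hypothetical. The text only states little endian byte order for the formatting string length, so using little endian for the 16-bit argument lengths as well is an assumption of this sketch.

```python
import struct


def encode_atom(formatting, labels=(), args=()):
    """Serialize one atom: sizes first, then the string data."""
    header = struct.pack("<HBB", len(formatting), len(labels), len(args))
    label_lengths = struct.pack("<%dB" % len(labels), *map(len, labels))
    # Assumption: argument lengths use the same little endian order.
    arg_lengths = struct.pack("<%dH" % len(args), *map(len, args))
    return b"".join(
        [header, label_lengths, arg_lengths, formatting, *labels, *args]
    )


def decode_atom(data, offset=0):
    """Parse one atom; return (formatting, labels, args, next_offset)."""
    fmt_len, n_labels, n_args = struct.unpack_from("<HBB", data, offset)
    offset += 4
    label_lens = struct.unpack_from("<%dB" % n_labels, data, offset)
    offset += n_labels
    arg_lens = struct.unpack_from("<%dH" % n_args, data, offset)
    offset += 2 * n_args
    pieces = []
    for length in (fmt_len, *label_lens, *arg_lens):
        pieces.append(data[offset:offset + length])
        offset += length
    return pieces[0], pieces[1:1 + n_labels], pieces[1 + n_labels:], offset
```

Because ``decode_atom`` returns the offset of the next atom, a frame payload holding several atoms can be consumed by calling it in a loop until the offset reaches the end of the payload.
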
743 Progress Update (``0x07``)
744 --------------------------
745
746 This frame holds the progress of an operation on the peer. Consumption
747 of these frames allows clients to display progress bars, estimated
748 completion times, etc.
749
750 Each frame defines the progress of a single operation on the peer. The
751 payload consists of a CBOR map with the following bytestring keys:
752
753 topic
754 Topic name (string)
755 pos
756 Current numeric position within the topic (integer)
757 total
758 Total/end numeric position of this topic (unsigned integer)
759 label (optional)
760 Unit label (string)
761 item (optional)
762 Item name (string)
763
764 Progress state is created when a frame is received referencing a
765 *topic* that isn't currently tracked. Progress tracking for that
766 *topic* is finished when a frame is received reporting the current
767 position of that topic as ``-1``.
768
769 Multiple *topics* may be active at any given time.
770
771 Rendering of progress information is not mandated or governed by this
772 specification: implementations MAY render progress information however
773 they see fit, including not at all.
774
775 The string data describing the topic SHOULD be static strings to
776 facilitate receivers localizing that string data. The emitter
777 MUST normalize all string data to valid UTF-8 and receivers SHOULD
778 validate that received data conforms to UTF-8. The topic name
779 SHOULD be ASCII.
780
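The topic lifecycle described above can be sketched as a small tracker. This sketch assumes the CBOR payload has already been decoded into a Python ``dict`` with bytestring keys by a CBOR library (not shown). The class name ``ProgressTracker`` is hypothetical.

```python
class ProgressTracker:
    """Track active progress topics from decoded Progress Update payloads."""

    def __init__(self):
        # Maps topic name -> latest known state for that topic.
        self.active = {}

    def update(self, payload):
        topic = payload[b"topic"]
        if payload[b"pos"] == -1:
            # A position of -1 finishes tracking for this topic.
            self.active.pop(topic, None)
            return
        # An unknown topic creates state; a known topic is refreshed.
        self.active[topic] = {
            "pos": payload[b"pos"],
            "total": payload[b"total"],
            "label": payload.get(b"label"),
            "item": payload.get(b"item"),
        }
```

Several topics can be present in ``active`` at once, which matches the statement that multiple topics may be active at any given time. How, or whether, the state is rendered is left to the caller.
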
743 781 Stream Encoding Settings (``0x08``)
744 782 -----------------------------------
745 783
746 784 This frame type holds information defining the content encoding
747 785 settings for a *stream*.
748 786
749 787 This frame type is likely consumed by the protocol layer and is not
750 788 passed on to applications.
751 789
752 790 This frame type MUST ONLY occur on frames having the *Beginning of Stream*
753 791 ``Stream Flag`` set.
754 792
755 793 The payload of this frame defines what content encoding has (possibly)
756 794 been applied to the payloads of subsequent frames in this stream.
757 795
758 796 The payload begins with an 8-bit integer defining the length of the
759 797 encoding *profile*, followed by the string name of that profile, which
760 798 must be an ASCII string. All bytes that follow can be used by that
761 799 profile for supplemental settings definitions. See the section below
762 800 on defined encoding profiles.
763 801
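The payload layout described above can be sketched as follows. The function name ``parse_encoding_settings`` is hypothetical, and the profile name used in the usage note is made up for illustration, since no profiles are defined yet.

```python
def parse_encoding_settings(payload):
    """Split a payload into (profile name, supplemental settings bytes)."""
    if not payload:
        raise ValueError("payload is missing the profile length byte")
    name_len = payload[0]
    if len(payload) < 1 + name_len:
        raise ValueError("payload is shorter than the declared profile name")
    # The profile name must be ASCII; decoding raises on other bytes.
    profile = payload[1:1 + name_len].decode("ascii")
    return profile, payload[1 + name_len:]
```

For example, ``parse_encoding_settings(b"\x07example\x01\x02")`` yields the hypothetical profile name ``"example"`` and the two trailing bytes as profile-specific settings.
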
764 802 Stream States and Flags
765 803 -----------------------
766 804
767 805 Streams can be in two states: *open* and *closed*. An *open* stream
768 806 is active and frames attached to that stream could arrive at any time.
769 807 A *closed* stream is not active. If a frame attached to a *closed*
770 808 stream arrives, that frame MUST have an appropriate stream flag
771 809 set indicating beginning of stream. All streams are in the *closed*
772 810 state by default.
773 811
774 812 The ``Stream Flags`` field denotes a set of bit flags for defining
775 813 the relationship of this frame within a stream. The following flags
776 814 are defined:
777 815
778 816 0x01
779 817 Beginning of stream. The first frame in the stream MUST set this
780 818 flag. When received, the ``Stream ID`` this frame is attached to
781 819 becomes ``open``.
782 820
783 821 0x02
784 822 End of stream. The last frame in a stream MUST set this flag. When
785 823 received, the ``Stream ID`` this frame is attached to becomes
786 824 ``closed``. Any content encoding context associated with this stream
787 825 can be destroyed after processing the payload of this frame.
788 826
789 827 0x04
790 828 Apply content encoding. When set, any content encoding settings
791 829 defined by the stream should be applied when attempting to read
792 830 the frame. When not set, the frame payload isn't encoded.
793 831
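The stream state rules above can be sketched as a small state table kept by the receiver. The names ``StreamTable`` and ``on_frame`` are hypothetical. The text does not say what happens when a *Beginning of Stream* flag arrives on a stream that is already open, so this sketch simply tolerates it.

```python
STREAM_FLAG_BEGIN = 0x01
STREAM_FLAG_END = 0x02
STREAM_FLAG_ENCODED = 0x04


class StreamTable:
    """Track which stream IDs are currently open on one direction."""

    def __init__(self):
        # All streams are closed by default.
        self.open_streams = set()

    def on_frame(self, stream_id, flags):
        """Update state; return True if the payload is content encoded."""
        if stream_id not in self.open_streams:
            if not flags & STREAM_FLAG_BEGIN:
                raise ValueError("frame on closed stream lacks begin flag")
            self.open_streams.add(stream_id)
        if flags & STREAM_FLAG_END:
            # Any encoding context can be destroyed after this frame.
            self.open_streams.discard(stream_id)
        return bool(flags & STREAM_FLAG_ENCODED)
```

A frame carrying both ``0x01`` and ``0x02`` opens and closes the stream in one step, which is how a single-frame stream would look under this model.
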
794 832 Streams
795 833 -------
796 834
797 835 Streams - along with ``Request IDs`` - facilitate grouping of frames.
798 836 But the purpose of each is quite different and the groupings they
799 837 constitute are independent.
800 838
801 839 A ``Request ID`` is essentially a tag. It tells you which logical
802 840 request a frame is associated with.
803 841
804 842 A *stream* is a sequence of frames grouped for the express purpose
805 843 of applying a stateful encoding or for denoting sub-groups of frames.
806 844
807 845 Unlike ``Request ID``s which span the request and response, a stream
808 846 is unidirectional and stream IDs are independent from client to
809 847 server.
810 848
811 849 There is no strict hierarchical relationship between ``Request IDs``
812 850 and *streams*. A stream can contain frames having multiple
813 851 ``Request IDs``. Frames belonging to the same ``Request ID`` can
814 852 span multiple streams.
815 853
816 854 One goal of streams is to facilitate content encoding. A stream can
817 855 define an encoding to be applied to frame payloads. For example, the
818 856 payload transmitted over the wire may contain output from a
819 857 zstandard compression operation and the receiving end may decompress
820 858 that payload to obtain the original data.
821 859
822 860 The other goal of streams is to facilitate concurrent execution. For
823 861 example, a server could spawn 4 threads to service a request that can
824 862 be easily parallelized. Each of those 4 threads could write into its
825 863 own stream. Those streams could then in turn be delivered to 4 threads
826 864 on the receiving end, with each thread consuming its stream in near
827 865 isolation. The *main* thread on both ends merely does I/O and
828 866 encodes/decodes frame headers: the bulk of the work is done by worker
829 867 threads.
830 868
831 869 In addition, since content encoding is defined per stream, each
832 870 *worker thread* could perform potentially CPU bound work concurrently
833 871 with other threads. This approach of applying encoding at the
834 872 sub-protocol / stream level eliminates a potential resource constraint
835 873 on the protocol stream as a whole (it is common for the throughput of
836 874 a compression engine to be smaller than the throughput of a network).
837 875
838 876 Having multiple streams - each with their own encoding settings - also
839 877 facilitates the use of advanced data compression techniques. For
840 878 example, a transmitter could see that it is generating data faster
841 879 or slower than the receiving end is consuming it and adjust its
842 880 compression settings to trade CPU for compression ratio accordingly.
843 881
844 882 While streams can define a content encoding, not all frames within
845 883 that stream must use that content encoding. This can be useful when
846 884 data is being served from caches and being derived dynamically. A
847 885 cache could hold pre-compressed data so the server doesn't have to
848 886 recompress it. The ability to pick and choose which frames are
849 887 compressed allows servers to easily send data to the wire without
850 888 involving potentially expensive encoding overhead.
851 889
852 890 Content Encoding Profiles
853 891 -------------------------
854 892
855 893 Streams can have named content encoding *profiles* associated with
856 894 them. A profile defines a shared understanding of content encoding
857 895 settings and behavior.
858 896
859 897 The following profiles are defined:
860 898
861 899 TBD
862 900
863 901 Issuing Commands
864 902 ----------------
865 903
866 904 A client can request that a remote run a command by sending it
867 905 frames defining that command. This logical stream is composed of
868 906 1 ``Command Request`` frame, 0 or more ``Command Argument`` frames,
869 907 and 0 or more ``Command Data`` frames.
870 908
871 909 All frames composing a single command request MUST be associated with
872 910 the same ``Request ID``.
873 911
874 912 Clients MAY send additional command requests without waiting on the
875 913 response to a previous command request. If they do so, they MUST ensure
876 914 that the ``Request ID`` field of outbound frames does not conflict
877 915 with that of an active ``Request ID`` whose response has not yet been
878 916 fully received.
879 917
880 918 Servers MAY respond to commands in a different order than they were
881 919 sent over the wire. Clients MUST be prepared to deal with this. Servers
882 920 also MAY start executing commands in a different order than they were
883 921 received, or MAY execute multiple commands concurrently.
884 922
885 923 If there is a dependency between commands or a race condition between
886 924 commands executing (e.g. a read-only command that depends on the results
887 925 of a command that mutates the repository), then clients MUST NOT send
888 926 frames issuing a command until a response to all dependent commands has
889 927 been received.
890 928 TODO think about whether we should express dependencies between commands
891 929 to avoid roundtrip latency.
892 930
893 931 Argument frames are the recommended mechanism for transferring fixed
894 932 sets of parameters to a command. Data frames are appropriate for
895 933 transferring variable data. A similar comparison would be to HTTP:
896 934 argument frames are headers and the message body is data frames.
897 935
898 936 It is recommended for servers to delay the dispatch of a command
899 937 until all argument frames for that command have been received. Servers
900 938 MAY impose limits on the maximum argument size.
901 939 TODO define failure mechanism.
902 940
903 941 Servers MAY dispatch to commands immediately once argument data
904 942 is available or delay until command data is received in full.
905 943
906 944 Capabilities
907 945 ============
908 946
909 947 Servers advertise supported wire protocol features. This allows clients to
910 948 probe for server features before blindly calling a command or passing a
911 949 specific argument.
912 950
913 951 The server's features are exposed via a *capabilities* string. This is a
914 952 space-delimited string of tokens/features. Some features are single words
915 953 like ``lookup`` or ``batch``. Others are complicated key-value pairs
916 954 advertising sub-features. e.g. ``httpheader=2048``. When complex, non-word
917 955 values are used, each feature name can define its own encoding of sub-values.
918 956 Comma-delimited and ``x-www-form-urlencoded`` values are common.
919 957
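A client could split the capabilities string into a lookup table along these lines. This is a minimal sketch: the function name ``parse_capabilities`` is hypothetical, and values are left as raw strings because each feature defines its own encoding of sub-values.

```python
def parse_capabilities(caps):
    """Map each feature name to its raw value, or None for bare words."""
    result = {}
    for token in caps.split(" "):
        if not token:
            continue  # tolerate repeated spaces
        name, sep, value = token.partition("=")
        result[name] = value if sep else None
    return result
```

Checking for a feature is then a dictionary membership test, for example ``"batch" in parse_capabilities(caps)``.
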
920 958 The following sections document capabilities defined by the canonical Mercurial server
921 959 implementation.
922 960
923 961 batch
924 962 -----
925 963
926 964 Whether the server supports the ``batch`` command.
927 965
928 966 This capability/command was introduced in Mercurial 1.9 (released July 2011).
929 967
930 968 branchmap
931 969 ---------
932 970
933 971 Whether the server supports the ``branchmap`` command.
934 972
935 973 This capability/command was introduced in Mercurial 1.3 (released July 2009).
936 974
937 975 bundle2-exp
938 976 -----------
939 977
940 978 Precursor to ``bundle2`` capability that was used before bundle2 was a
941 979 stable feature.
942 980
943 981 This capability was introduced in Mercurial 3.0 behind an experimental
944 982 flag. This capability should not be observed in the wild.
945 983
946 984 bundle2
947 985 -------
948 986
949 987 Indicates whether the server supports the ``bundle2`` data exchange format.
950 988
951 989 The value of the capability is a URL quoted, newline (``\n``) delimited
952 990 list of keys or key-value pairs.
953 991
954 992 A key is simply a URL encoded string.
955 993
956 994 A key-value pair is a URL encoded key separated from a URL encoded value by
957 995 an ``=``. If the value is a list, elements are delimited by a ``,`` after
958 996 URL encoding.
959 997
960 998 For example, say we have the values::
961 999
962 1000 {'HG20': [], 'changegroup': ['01', '02'], 'digests': ['sha1', 'sha512']}
963 1001
964 1002 We would first construct a string::
965 1003
966 1004 HG20\nchangegroup=01,02\ndigests=sha1,sha512
967 1005
968 1006 We would then URL quote this string::
969 1007
970 1008 HG20%0Achangegroup%3D01%2C02%0Adigests%3Dsha1%2Csha512
971 1009
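The encoding steps above can be reproduced with Python's standard ``urllib.parse.quote``. The function name ``encode_bundle2_caps`` is hypothetical, and sorting the keys is an assumption made here to get deterministic output; the text does not mandate an order.

```python
from urllib.parse import quote


def encode_bundle2_caps(caps):
    """Encode a dict of key -> list of values as a bundle2 capability value."""
    lines = []
    for key in sorted(caps):  # assumption: sorted for stable output
        line = quote(key)
        values = [quote(v) for v in caps[key]]
        if values:
            line += "=" + ",".join(values)
        lines.append(line)
    # The newline-delimited blob is URL quoted as a whole.
    return quote("\n".join(lines))
```

Applied to the example values above, this produces the same quoted string shown in the text.
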
972 1010 This capability was introduced in Mercurial 3.4 (released May 2015).
973 1011
974 1012 changegroupsubset
975 1013 -----------------
976 1014
977 1015 Whether the server supports the ``changegroupsubset`` command.
978 1016
979 1017 This capability was introduced in Mercurial 0.9.2 (released December
980 1018 2006).
981 1019
982 1020 This capability was introduced at the same time as the ``lookup``
983 1021 capability/command.
984 1022
985 1023 compression
986 1024 -----------
987 1025
988 1026 Declares support for negotiating compression formats.
989 1027
990 1028 Presence of this capability indicates the server supports dynamic selection
991 1029 of compression formats based on the client request.
992 1030
993 1031 Servers advertising this capability are required to support the
994 1032 ``application/mercurial-0.2`` media type in response to commands returning
995 1033 streams. Servers may support this media type on any command.
996 1034
997 1035 The value of the capability is a comma-delimited list of strings declaring
998 1036 supported compression formats. The order of the compression formats is in
999 1037 server-preferred order, most preferred first.
1000 1038
1001 1039 The identifiers used by the official Mercurial distribution are:
1002 1040
1003 1041 bzip2
1004 1042 bzip2
1005 1043 none
1006 1044 uncompressed / raw data
1007 1045 zlib
1008 1046 zlib (no gzip header)
1009 1047 zstd
1010 1048 zstd
1011 1049
1012 1050 This capability was introduced in Mercurial 4.1 (released February 2017).
1013 1051
1014 1052 getbundle
1015 1053 ---------
1016 1054
1017 1055 Whether the server supports the ``getbundle`` command.
1018 1056
1019 1057 This capability was introduced in Mercurial 1.9 (released July 2011).
1020 1058
1021 1059 httpheader
1022 1060 ----------
1023 1061
1024 1062 Whether the server supports receiving command arguments via HTTP request
1025 1063 headers.
1026 1064
1027 1065 The value of the capability is an integer describing the max header
1028 1066 length that clients should send. Clients should ignore any content after a
1029 1067 comma in the value, as this is reserved for future use.
1030 1068
1031 1069 This capability was introduced in Mercurial 1.9 (released July 2011).
1032 1070
1033 1071 httpmediatype
1034 1072 -------------
1035 1073
1036 1074 Indicates which HTTP media types (``Content-Type`` header) the server is
1037 1075 capable of receiving and sending.
1038 1076
1039 1077 The value of the capability is a comma-delimited list of strings identifying
1040 1078 support for media type and transmission direction. The following strings may
1041 1079 be present:
1042 1080
1043 1081 0.1rx
1044 1082 Indicates server support for receiving ``application/mercurial-0.1`` media
1045 1083 types.
1046 1084
1047 1085 0.1tx
1048 1086 Indicates server support for sending ``application/mercurial-0.1`` media
1049 1087 types.
1050 1088
1051 1089 0.2rx
1052 1090 Indicates server support for receiving ``application/mercurial-0.2`` media
1053 1091 types.
1054 1092
1055 1093 0.2tx
1056 1094 Indicates server support for sending ``application/mercurial-0.2`` media
1057 1095 types.
1058 1096
1059 1097 minrx=X
1060 1098 Minimum media type version the server is capable of receiving. Value is a
1061 1099 string like ``0.2``.
1062 1100
1063 1101 This capability can be used by servers to limit connections from legacy
1064 1102 clients not using the latest supported media type. However, only clients
1065 1103 with knowledge of this capability will know to consult this value. This
1066 1104 capability is present so the client may issue a more user-friendly error
1067 1105 when the server has locked out a legacy client.
1068 1106
1069 1107 mintx=X
1070 1108 Minimum media type version the server is capable of sending. Value is a
1071 1109 string like ``0.1``.
1072 1110
1073 1111 Servers advertising support for the ``application/mercurial-0.2`` media type
1074 1112 should also advertise the ``compression`` capability.
1075 1113
1076 1114 This capability was introduced in Mercurial 4.1 (released February 2017).
1077 1115
1078 1116 httppostargs
1079 1117 ------------
1080 1118
1081 1119 **Experimental**
1082 1120
1083 1121 Indicates that the server supports and prefers clients send command arguments
1084 1122 via a HTTP POST request as part of the request body.
1085 1123
1086 1124 This capability was introduced in Mercurial 3.8 (released May 2016).
1087 1125
1088 1126 known
1089 1127 -----
1090 1128
1091 1129 Whether the server supports the ``known`` command.
1092 1130
1093 1131 This capability/command was introduced in Mercurial 1.9 (released July 2011).
1094 1132
1095 1133 lookup
1096 1134 ------
1097 1135
1098 1136 Whether the server supports the ``lookup`` command.
1099 1137
1100 1138 This capability was introduced in Mercurial 0.9.2 (released December
1101 1139 2006).
1102 1140
1103 1141 This capability was introduced at the same time as the ``changegroupsubset``
1104 1142 capability/command.
1105 1143
1106 1144 pushkey
1107 1145 -------
1108 1146
1109 1147 Whether the server supports the ``pushkey`` and ``listkeys`` commands.
1110 1148
1111 1149 This capability was introduced in Mercurial 1.6 (released July 2010).
1112 1150
1113 1151 standardbundle
1114 1152 --------------
1115 1153
1116 1154 **Unsupported**
1117 1155
1118 1156 This capability was introduced during the Mercurial 0.9.2 development cycle in
1119 1157 2006. It was never present in a release, as it was replaced by the ``unbundle``
1120 1158 capability. This capability should not be encountered in the wild.
1121 1159
1122 1160 stream-preferred
1123 1161 ----------------
1124 1162
1125 1163 If present, the server prefers that clients clone using the streaming clone
1126 1164 protocol (``hg clone --stream``) rather than the standard
1127 1165 changegroup/bundle based protocol.
1128 1166
1129 1167 This capability was introduced in Mercurial 2.2 (released May 2012).
1130 1168
1131 1169 streamreqs
1132 1170 ----------
1133 1171
1134 1172 Indicates whether the server supports *streaming clones* and the *requirements*
1135 1173 that clients must support to receive it.
1136 1174
1137 1175 If present, the server supports the ``stream_out`` command, which transmits
1138 1176 raw revlogs from the repository instead of changegroups. This provides a faster
1139 1177 cloning mechanism at the expense of more bandwidth used.
1140 1178
1141 1179 The value of this capability is a comma-delimited list of repo format
1142 1180 *requirements*. These are requirements that impact the reading of data in
1143 1181 the ``.hg/store`` directory. An example value is
1144 1182 ``streamreqs=generaldelta,revlogv1`` indicating the server repo requires
1145 1183 the ``revlogv1`` and ``generaldelta`` requirements.
1146 1184
1147 1185 If the only format requirement is ``revlogv1``, the server may expose the
1148 1186 ``stream`` capability instead of the ``streamreqs`` capability.
1149 1187
1150 1188 This capability was introduced in Mercurial 1.7 (released November 2010).
1151 1189
1152 1190 stream
1153 1191 ------
1154 1192
1155 1193 Whether the server supports *streaming clones* from ``revlogv1`` repos.
1156 1194
1157 1195 If present, the server supports the ``stream_out`` command, which transmits
1158 1196 raw revlogs from the repository instead of changegroups. This provides a faster
1159 1197 cloning mechanism at the expense of more bandwidth used.
1160 1198
1161 1199 This capability was introduced in Mercurial 0.9.1 (released July 2006).
1162 1200
1163 1201 When initially introduced, the value of the capability was the numeric
1164 1202 revlog revision. e.g. ``stream=1``. This indicates the changegroup is using
1165 1203 ``revlogv1``. This simple integer value wasn't powerful enough, so the
1166 1204 ``streamreqs`` capability was invented to handle cases where the repo
1167 1205 requirements have more than just ``revlogv1``. Newer servers omit the
1168 1206 ``=1`` since it was the only value supported and the value of ``1`` can
1169 1207 be implied by clients.
1170 1208
1171 1209 unbundlehash
1172 1210 ------------
1173 1211
1174 1212 Whether the ``unbundle`` command supports receiving a hash of all the
1175 1213 heads instead of a list.
1176 1214
1177 1215 For more, see the documentation for the ``unbundle`` command.
1178 1216
1179 1217 This capability was introduced in Mercurial 1.9 (released July 2011).
1180 1218
1181 1219 unbundle
1182 1220 --------
1183 1221
1184 1222 Whether the server supports pushing via the ``unbundle`` command.
1185 1223
1186 1224 This capability/command has been present since Mercurial 0.9.1 (released
1187 1225 July 2006).
1188 1226
1189 1227 Mercurial 0.9.2 (released December 2006) added values to the capability
1190 1228 indicating which bundle types the server supports receiving. This value is a
1191 1229 comma-delimited list. e.g. ``HG10GZ,HG10BZ,HG10UN``. The order of values
1192 1230 reflects the priority/preference of that type, where the first value is the
1193 1231 most preferred type.
1194 1232
1195 1233 Content Negotiation
1196 1234 ===================
1197 1235
1198 1236 The wire protocol has some mechanisms to help peers determine what content
1199 1237 types and encoding the other side will accept. Historically, these mechanisms
1200 1238 have been built into commands themselves because most commands only send a
1201 1239 well-defined response type and only certain commands needed to support
1202 1240 functionality like compression.
1203 1241
1204 1242 Currently, only the HTTP version 1 transport supports content negotiation
1205 1243 at the protocol layer.
1206 1244
1207 1245 HTTP requests advertise supported response formats via the ``X-HgProto-<N>``
1208 1246 request header, where ``<N>`` is an integer starting at 1 allowing the logical
1209 1247 value to span multiple headers. This value consists of a list of
1210 1248 space-delimited parameters. Each parameter denotes a feature or capability.
1211 1249
1212 1250 The following parameters are defined:
1213 1251
1214 1252 0.1
1215 1253 Indicates the client supports receiving ``application/mercurial-0.1``
1216 1254 responses.
1217 1255
1218 1256 0.2
1219 1257 Indicates the client supports receiving ``application/mercurial-0.2``
1220 1258 responses.
1221 1259
1222 1260 comp
1223 1261 Indicates compression formats the client can decode. Value is a list of
1224 1262 comma delimited strings identifying compression formats ordered from
1225 1263 most preferential to least preferential. e.g. ``comp=zstd,zlib,none``.
1226 1264
1227 1265 This parameter does not have an effect if only the ``0.1`` parameter
1228 1266 is defined, as support for ``application/mercurial-0.2`` or greater is
1229 1267 required to use arbitrary compression formats.
1230 1268
1231 1269 If this parameter is not advertised, the server interprets this as
1232 1270 equivalent to ``zlib,none``.
1233 1271
1234 1272 Clients may choose to only send this header if the ``httpmediatype``
1235 1273 server capability is present, as currently all server-side features
1236 1274 consulting this header require the client to opt in to new protocol features
1237 1275 advertised via the ``httpmediatype`` capability.
1238 1276
1239 1277 A server that doesn't receive an ``X-HgProto-<N>`` header should infer a
1240 1278 value of ``0.1``. This is compatible with legacy clients.
1241 1279
1242 1280 A server receiving a request indicating support for multiple media type
1243 1281 versions may respond with any of the supported media types. Not all servers
1244 1282 may support all media types on all commands.
1245 1283
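A server-side reading of the ``X-HgProto-<N>`` headers could look like the sketch below. The function name ``parse_hgproto`` is hypothetical. Two simplifications are assumptions of this sketch: the numbered header values are concatenated directly to rebuild the logical value, and header names are looked up with exact case in a plain ``dict``, whereas real HTTP header names are case-insensitive.

```python
def parse_hgproto(headers):
    """Return (supported media type versions, compression preference list)."""
    chunks = []
    n = 1
    # Collect X-HgProto-1, X-HgProto-2, ... until one is missing.
    while "X-HgProto-%d" % n in headers:
        chunks.append(headers["X-HgProto-%d" % n])
        n += 1
    # No header at all is treated as a value of "0.1".
    value = "".join(chunks) or "0.1"

    media_types = set()
    comp = None
    for param in value.split(" "):
        if param in ("0.1", "0.2"):
            media_types.add(param)
        elif param.startswith("comp="):
            comp = param[len("comp="):].split(",")
    if comp is None:
        # An unadvertised "comp" is equivalent to "zlib,none".
        comp = ["zlib", "none"]
    return media_types, comp
```

Unknown parameters are ignored, so a server using this approach would keep working when clients advertise parameters it does not know about.
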
1246 1284 Commands
1247 1285 ========
1248 1286
1249 1287 This section contains a list of all wire protocol commands implemented by
1250 1288 the canonical Mercurial server.
1251 1289
1252 1290 batch
1253 1291 -----
1254 1292
1255 1293 Issue multiple commands while sending a single command request. The purpose
1256 1294 of this command is to allow a client to issue multiple commands while avoiding
1257 1295 multiple round trips to the server, thereby enabling commands to complete
1258 1296 more quickly.
1259 1297
1260 1298 The command accepts a ``cmds`` argument that contains a list of commands to
1261 1299 execute.
1262 1300
1263 1301 The value of ``cmds`` is a ``;`` delimited list of strings. Each string has the
1264 1302 form ``<command> <arguments>``. That is, the command name followed by a space
1265 1303 followed by an argument string.
1266 1304
1267 1305 The argument string is a ``,`` delimited list of ``<key>=<value>`` values
1268 1306 corresponding to command arguments. Both the argument name and value are
1269 1307 escaped using a special substitution map::
1270 1308
1271 1309 : -> :c
1272 1310 , -> :o
1273 1311 ; -> :s
1274 1312 = -> :e
1275 1313
1276 1314 The response type for this command is ``string``. The value contains a
1277 1315 ``;`` delimited list of responses for each requested command. Each value
1278 1316 in this list is escaped using the same substitution map used for arguments.
1279 1317
1280 1318 If an error occurs, the generic error response may be sent.
1281 1319
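The ``cmds`` encoding and the substitution map above can be sketched as follows. The helper names are hypothetical. Note the ordering: ``:`` is escaped first and unescaped last, so that colons introduced by the other substitutions are not escaped a second time.

```python
_ESCAPES = [(":", ":c"), (",", ":o"), (";", ":s"), ("=", ":e")]


def escape(value):
    """Apply the substitution map; ':' must be handled first."""
    for raw, esc in _ESCAPES:
        value = value.replace(raw, esc)
    return value


def unescape(value):
    """Reverse the substitution map; ':c' must be handled last."""
    for raw, esc in reversed(_ESCAPES):
        value = value.replace(esc, raw)
    return value


def encode_batch(commands):
    """Build ``cmds`` from a list of (command name, argument dict) pairs."""
    parts = []
    for name, args in commands:
        argstr = ",".join(
            "%s=%s" % (escape(k), escape(v)) for k, v in args.items()
        )
        parts.append("%s %s" % (name, argstr))
    return ";".join(parts)


def decode_batch_response(response):
    """Split a batch response into one unescaped result per command."""
    return [unescape(part) for part in response.split(";")]
```

In this sketch a command without arguments is encoded as its name followed by a space and an empty argument string.
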
1282 1320 between
1283 1321 -------
1284 1322
1285 1323 (Legacy command used for discovery in old clients)
1286 1324
1287 1325 Obtain nodes between pairs of nodes.
1288 1326
1289 1327 The ``pairs`` argument contains a space-delimited list of ``-`` delimited
1290 1328 hex node pairs. e.g.::
1291 1329
1292 1330 a072279d3f7fd3a4aa7ffa1a5af8efc573e1c896-6dc58916e7c070f678682bfe404d2e2d68291a18
1293 1331
1294 1332 Return type is a ``string``. Value consists of lines corresponding to each
1295 1333 requested range. Each line contains a space-delimited list of hex nodes.
1296 1334 A newline ``\n`` terminates each line, including the last one.
1297 1335
1298 1336 branchmap
1299 1337 ---------
1300 1338
1301 1339 Obtain heads in named branches.
1302 1340
1303 1341 Accepts no arguments. Return type is a ``string``.
1304 1342
1305 1343 Return value contains lines with URL encoded branch names followed by a space
1306 1344 followed by a space-delimited list of hex nodes of heads on that branch.
1307 1345 e.g.::
1308 1346
1309 1347 default a072279d3f7fd3a4aa7ffa1a5af8efc573e1c896 6dc58916e7c070f678682bfe404d2e2d68291a18
1310 1348 stable baae3bf31522f41dd5e6d7377d0edd8d1cf3fccc
1311 1349
1312 1350 There is no trailing newline.
1313 1351
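A client could parse the response described above along these lines. The function name ``parse_branchmap`` is hypothetical, and the sketch assumes the response has already been decoded to text.

```python
from urllib.parse import unquote


def parse_branchmap(response):
    """Map each branch name to the list of hex head nodes on that branch."""
    branches = {}
    for line in response.split("\n"):
        if not line:
            continue  # tolerate an empty response
        name, *nodes = line.split(" ")
        # Branch names are URL encoded on the wire.
        branches[unquote(name)] = nodes
    return branches
```

For the example response above, the result maps ``default`` to two head nodes and ``stable`` to one.
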
1314 1352 branches
1315 1353 --------
1316 1354
1317 1355 (Legacy command used for discovery in old clients. Clients with ``getbundle``
1318 1356 use the ``known`` and ``heads`` commands instead.)
1319 1357
1320 1358 Obtain ancestor changesets of specific nodes back to a branch point.
1321 1359
1322 1360 Despite the name, this command has nothing to do with Mercurial named branches.
1323 1361 Instead, it is related to DAG branches.
1324 1362
1325 1363 The command accepts a ``nodes`` argument, which is a string of space-delimited
1326 1364 hex nodes.
1327 1365
1328 1366 For each node requested, the server will find the first ancestor node that is
1329 1367 a DAG root or is a merge.
1330 1368
1331 1369 Return type is a ``string``. Return value contains lines with result data for
1332 1370 each requested node. Each line contains space-delimited nodes followed by a
1333 1371 newline (``\n``). The 4 nodes reported on each line correspond to the requested
1334 1372 node, the ancestor node found, and its 2 parent nodes (which may be the null
1335 1373 node).
1336 1374
1337 1375 capabilities
1338 1376 ------------
1339 1377
1340 1378 Obtain the capabilities string for the repo.
1341 1379
1342 1380 Unlike the ``hello`` command, the capabilities string is not prefixed.
1343 1381 There is no trailing newline.
1344 1382
1345 1383 This command does not accept any arguments. Return type is a ``string``.
1346 1384
1347 1385 This command was introduced in Mercurial 0.9.1 (released July 2006).
1348 1386
1349 1387 changegroup
1350 1388 -----------
1351 1389
1352 1390 (Legacy command: use ``getbundle`` instead)
1353 1391
1354 1392 Obtain a changegroup version 1 with data for changesets that are
1355 1393 descendants of client-specified changesets.
1356 1394
1357 1395 The ``roots`` argument contains a list of space-delimited hex nodes.
1358 1396
1359 1397 The server responds with a changegroup version 1 containing all
1360 1398 changesets between the requested root/base nodes and the repo's head nodes
1361 1399 at the time of the request.
1362 1400
1363 1401 The return type is a ``stream``.
1364 1402
1365 1403 changegroupsubset
1366 1404 -----------------
1367 1405
1368 1406 (Legacy command: use ``getbundle`` instead)
1369 1407
1370 1408 Obtain a changegroup version 1 with data for changesets between
1371 1409 client-specified base and head nodes.
1372 1410
1373 1411 The ``bases`` argument contains a list of space-delimited hex nodes.
1374 1412 The ``heads`` argument contains a list of space-delimited hex nodes.
1375 1413
1376 1414 The server responds with a changegroup version 1 containing all
1377 1415 changesets between the requested base and head nodes at the time of the
1378 1416 request.
1379 1417
1380 1418 The return type is a ``stream``.
1381 1419
1382 1420 clonebundles
1383 1421 ------------
1384 1422
1385 1423 Obtains a manifest of bundle URLs available to seed clones.
1386 1424
1387 1425 Each returned line contains a URL followed by metadata. See the
1388 1426 documentation in the ``clonebundles`` extension for more.
1389 1427
1390 1428 The return type is a ``string``.
1391 1429
1392 1430 getbundle
1393 1431 ---------
1394 1432
1395 1433 Obtain a bundle containing repository data.
1396 1434
1397 1435 This command accepts the following arguments:
1398 1436
1399 1437 heads
1400 1438 List of space-delimited hex nodes of heads to retrieve.
1401 1439 common
1402 1440 List of space-delimited hex nodes that the client has in common with the
1403 1441 server.
1404 1442 obsmarkers
1405 1443 Boolean indicating whether to include obsolescence markers as part
1406 1444 of the response. Only works with bundle2.
1407 1445 bundlecaps
1408 1446 Comma-delimited set of strings defining client bundle capabilities.
1409 1447 listkeys
1410 1448 Comma-delimited list of strings of ``pushkey`` namespaces. For each
1411 1449 namespace listed, a bundle2 part will be included with the content of
1412 1450 that namespace.
1413 1451 cg
1414 1452 Boolean indicating whether changegroup data is requested.
1415 1453 cbattempted
1416 1454 Boolean indicating whether the client attempted to use the *clone bundles*
1417 1455 feature before performing this request.
1418 1456 bookmarks
1419 1457 Boolean indicating whether bookmark data is requested.
1420 1458 phases
1421 1459 Boolean indicating whether phases data is requested.
1422 1460
1423 1461 The return type on success is a ``stream`` where the value is a bundle.
1424 1462 On the HTTP version 1 transport, the response is zlib compressed.
1425 1463
1426 1464 If an error occurs, a generic error response can be sent.
1427 1465
1428 1466 Unless the client sends a false value for the ``cg`` argument, the returned
1429 1467 bundle contains a changegroup with the nodes between the specified ``common``
1430 1468 and ``heads`` nodes. Depending on the command arguments, the type and content
1431 1469 of the returned bundle can vary significantly.
1432 1470
1433 1471 The default behavior is for the server to send a raw changegroup version
1434 1472 ``01`` response.
1435 1473
1436 1474 If the ``bundlecaps`` provided by the client contain a value beginning
1437 1475 with ``HG2``, a bundle2 will be returned. The bundle2 data may contain
1438 1476 additional repository data, such as ``pushkey`` namespace values.
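The ``bundlecaps`` rule above amounts to a prefix check over a comma-delimited list. A minimal sketch, assuming the raw argument value is available as a byte string (the helper name ``wantsbundle2`` is hypothetical):

```python
def wantsbundle2(bundlecaps):
    """Return True if any client bundle capability begins with ``HG2``.

    Hypothetical helper for illustration only.
    """
    # ``bundlecaps`` is a comma-delimited set of strings; an empty value
    # means the client advertised nothing.
    caps = bundlecaps.split(b',') if bundlecaps else []
    return any(cap.startswith(b'HG2') for cap in caps)
```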
1439 1477
1440 1478 heads
1441 1479 -----
1442 1480
1443 1481 Returns a list of space-delimited hex nodes of repository heads followed
1444 1482 by a newline. e.g.
1445 1483 ``a9eeb3adc7ddb5006c088e9eda61791c777cbf7c 31f91a3da534dc849f0d6bfc00a395a97cf218a1\n``
1446 1484
1447 1485 This command does not accept any arguments. The return type is a ``string``.
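A client might split this response as follows. This is only a sketch; ``parseheads`` is a hypothetical name.

```python
def parseheads(data):
    """Split a ``heads`` response into a list of hex node byte strings.

    Hypothetical helper for illustration only.
    """
    # The response ends with a single newline, which is dropped first.
    line = data.rstrip(b'\n')
    return line.split(b' ') if line else []
```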
1448 1486
1449 1487 hello
1450 1488 -----
1451 1489
1452 1490 Returns lines describing interesting things about the server in an RFC-822
1453 1491 like format.
1454 1492
1455 1493 Currently, the only line defines the server capabilities. It has the form::
1456 1494
1457 1495 capabilities: <value>
1458 1496
1459 1497 See above for more about the capabilities string.
1460 1498
1461 1499 SSH clients typically issue this command as soon as a connection is
1462 1500 established.
1463 1501
1464 1502 This command does not accept any arguments. The return type is a ``string``.
1465 1503
1466 1504 This command was introduced in Mercurial 0.9.1 (released July 2006).
1467 1505
1468 1506 listkeys
1469 1507 --------
1470 1508
1471 1509 List values in a specified ``pushkey`` namespace.
1472 1510
1473 1511 The ``namespace`` argument defines the pushkey namespace to operate on.
1474 1512
1475 1513 The return type is a ``string``. The value is an encoded dictionary of keys.
1476 1514
1477 1515 Key-value pairs are delimited by newlines (``\n``). Within each line, keys and
1478 1516 values are separated by a tab (``\t``). Keys and values are both strings.
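The encoded dictionary can be decoded line by line. The sketch below follows only the layout described here (newline-delimited pairs, tab-separated key and value). It does not handle any escaping of keys or values that a real implementation may apply, and ``decodelistkeys`` is a hypothetical name.

```python
def decodelistkeys(data):
    """Decode a ``listkeys`` response into a dict of bytes -> bytes.

    Hypothetical helper for illustration only; escaping is not handled.
    """
    result = {}
    for line in data.split(b'\n'):
        if not line:
            # Tolerate an empty response or a trailing newline.
            continue
        # Only the first tab separates the key from the value.
        key, value = line.split(b'\t', 1)
        result[key] = value
    return result
```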
1479 1517
1480 1518 lookup
1481 1519 ------
1482 1520
1483 1521 Try to resolve a value to a known repository revision.
1484 1522
1485 1523 The ``key`` argument is converted from bytes to an
1486 1524 ``encoding.localstr`` instance then passed into
1487 1525 ``localrepository.__getitem__`` in an attempt to resolve it.
1488 1526
1489 1527 The return type is a ``string``.
1490 1528
1491 1529 Upon successful resolution, returns ``1 <hex node>\n``. On failure,
1492 1530 returns ``0 <error string>\n``. e.g.::
1493 1531
1494 1532 1 273ce12ad8f155317b2c078ec75a4eba507f1fba\n
1495 1533
1496 1534 0 unknown revision 'foo'\n
1497 1535
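A client could interpret the two response forms as sketched below. The function name ``parselookup`` and the ``(success, value)`` return shape are assumptions made for this example.

```python
def parselookup(data):
    """Interpret a ``lookup`` response.

    Returns ``(True, hexnode)`` on success and ``(False, message)`` on
    failure. Hypothetical helper for illustration only.
    """
    # Drop the trailing newline, then split the status digit from the rest.
    status, _sep, rest = data.rstrip(b'\n').partition(b' ')
    if status == b'1':
        return True, rest
    if status == b'0':
        return False, rest
    raise ValueError('unexpected lookup status: %r' % status)
```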
1498 1536 known
1499 1537 -----
1500 1538
1501 1539 Determine whether multiple nodes are known.
1502 1540
1503 1541 The ``nodes`` argument is a list of space-delimited hex nodes to check
1504 1542 for existence.
1505 1543
1506 1544 The return type is ``string``.
1507 1545
1508 1546 Returns a string consisting of ``0``s and ``1``s indicating whether nodes
1509 1547 are known. If the Nth node specified in the ``nodes`` argument is known,
1510 1548 a ``1`` will be returned at byte offset N. If the node isn't known, ``0``
1511 1549 will be present at byte offset N.
1512 1550
1513 1551 There is no trailing newline.
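Because the response is positional, a client has to pair it with the nodes it asked about. A minimal sketch, with ``parseknown`` as a hypothetical helper name:

```python
def parseknown(nodes, data):
    """Pair requested nodes with the ``known`` response.

    ``nodes`` is the list of hex nodes sent in the request, in order.
    Returns a list of booleans of the same length.
    Hypothetical helper for illustration only.
    """
    if len(data) != len(nodes):
        raise ValueError('expected %d result bytes, got %d'
                         % (len(nodes), len(data)))
    # Slice rather than index so each element stays a byte string.
    return [data[i:i + 1] == b'1' for i in range(len(nodes))]
```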
1514 1552
1515 1553 pushkey
1516 1554 -------
1517 1555
1518 1556 Set a value using the ``pushkey`` protocol.
1519 1557
1520 1558 Accepts arguments ``namespace``, ``key``, ``old``, and ``new``, which
1521 1559 correspond to the pushkey namespace to operate on, the key within that
1522 1560 namespace to change, the old value (which may be empty), and the new value.
1523 1561 All arguments are string types.
1524 1562
1525 1563 The return type is a ``string``. The value depends on the transport protocol.
1526 1564
1527 1565 The SSH version 1 transport sends a string encoded integer followed by a
1528 1566 newline (``\n``) which indicates operation result. The server may send
1529 1567 additional output on the ``stderr`` stream that should be displayed to the
1530 1568 user.
1531 1569
1532 1570 The HTTP version 1 transport sends a string encoded integer followed by a
1533 1571 newline followed by additional server output that should be displayed to
1534 1572 the user. This may include output from hooks, etc.
1535 1573
1536 1574 The integer result varies by namespace. ``0`` means an error has occurred
1537 1575 and there should be additional output to display to the user.
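For the HTTP version 1 transport, the result integer and the server output can be separated at the first newline. This is a sketch under that assumption only; ``parsepushkeyhttp`` is a hypothetical name.

```python
def parsepushkeyhttp(data):
    """Split an HTTP ``pushkey`` response into (result, output).

    Hypothetical helper for illustration only.
    """
    # The integer comes first, then a newline, then any server output.
    code, _sep, output = data.partition(b'\n')
    return int(code), output
```

A result of ``0`` signals an error, in which case ``output`` should be shown to the user.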
1538 1576
1539 1577 stream_out
1540 1578 ----------
1541 1579
1542 1580 Obtain *streaming clone* data.
1543 1581
1544 1582 The return type is either a ``string`` or a ``stream``, depending on
1545 1583 whether the request was fulfilled properly.
1546 1584
1547 1585 A return value of ``1\n`` indicates the server is not configured to serve
1548 1586 this data. If this is seen by the client, they may not have verified the
1549 1587 ``stream`` capability is set before making the request.
1550 1588
1551 1589 A return value of ``2\n`` indicates the server was unable to lock the
1552 1590 repository to generate data.
1553 1591
1554 1592 All other responses are a ``stream`` of bytes. The first line of this data
1555 1593 contains 2 space-delimited integers corresponding to the path count and
1556 1594 payload size, respectively::
1557 1595
1558 1596 <path count> <payload size>\n
1559 1597
1560 1598 The ``<payload size>`` is the total size of path data: it does not include
1561 1599 the size of the per-path header lines.
1562 1600
1563 1601 Following that header are ``<path count>`` entries. Each entry consists of a
1564 1602 line with metadata followed by raw revlog data. The line consists of::
1565 1603
1566 1604 <store path>\0<size>\n
1567 1605
1568 1606 The ``<store path>`` is the encoded store path of the data that follows.
1569 1607 ``<size>`` is the amount of data for this store path/revlog that follows the
1570 1608 newline.
1571 1609
1572 1610 There is no trailer to indicate end of data. Instead, the client should stop
1573 1611 reading after ``<path count>`` entries are consumed.
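The stream layout above can be consumed with a small generator. This sketch assumes ``fh`` is a binary file-like object positioned at the start of the response. The name ``readstreamout`` is hypothetical, and reading each entry fully into memory is a simplification.

```python
import io


def readstreamout(fh):
    """Yield (store path, data) pairs from a ``stream_out`` response.

    Hypothetical helper for illustration only.
    """
    first = fh.readline()
    if first == b'1\n':
        raise ValueError('server is not configured to serve stream data')
    if first == b'2\n':
        raise ValueError('server could not lock the repository')

    # <path count> <payload size>\n
    pathcount, payloadsize = (int(v) for v in first.rstrip(b'\n').split(b' '))

    # There is no trailer, so stop after exactly ``pathcount`` entries.
    for _ in range(pathcount):
        # <store path>\0<size>\n
        header = fh.readline().rstrip(b'\n')
        path, size = header.split(b'\0', 1)
        size = int(size)
        data = fh.read(size)
        if len(data) != size:
            raise ValueError('truncated data for %r' % path)
        yield path, data
```

A usage example with an in-memory stream:

```python
raw = b'2 7\n' + b'data/a.i\x003\nabc' + b'data/b.i\x004\nwxyz'
entries = list(readstreamout(io.BytesIO(raw)))
```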
1574 1612
1575 1613 unbundle
1576 1614 --------
1577 1615
1578 1616 Send a bundle containing data (usually changegroup data) to the server.
1579 1617
1580 1618 Accepts the argument ``heads``, which is a space-delimited list of hex nodes
1581 1619 corresponding to server repository heads observed by the client. This is used
1582 1620 to detect race conditions and abort push operations before a server performs
1583 1621 too much work or a client transfers too much data.
1584 1622
1585 1623 The request payload consists of a bundle to be applied to the repository,
1586 1624 as if :hg:`unbundle` were called.
1587 1625
1588 1626 In most scenarios, a special ``push response`` type is returned. This type
1589 1627 contains an integer describing the change in heads as a result of the
1590 1628 operation. A value of ``0`` indicates nothing changed. ``1`` means the number
1591 1629 of heads remained the same. Values ``2`` and larger indicate the number of
1592 1630 added heads plus 1. e.g. ``3`` means 2 heads were added. Negative values
1593 1631 indicate the number of fewer heads, also off by 1. e.g. ``-2`` means there
1594 1632 is 1 fewer head.
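The off-by-one encoding of the integer is easy to misread, so here is a minimal sketch that turns it into a description. It follows the examples given above (``3`` means 2 heads added, ``-2`` means 1 fewer head). The function name is hypothetical, and treating ``-1`` as invalid is an assumption, since that value is not described.

```python
def describepushresult(value):
    """Describe the integer carried by a ``push response``.

    Hypothetical helper for illustration only.
    """
    if value == 0:
        return 'nothing changed'
    if value == 1:
        return 'number of heads unchanged'
    if value >= 2:
        # 2 -> 1 head added, 3 -> 2 heads added, ...
        return '%d head(s) added' % (value - 1)
    if value <= -2:
        # -2 -> 1 fewer head, -3 -> 2 fewer heads, ...
        return '%d head(s) removed' % (-value - 1)
    raise ValueError('unexpected push result: %d' % value)
```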
1595 1633
1596 1634 The encoding of the ``push response`` type varies by transport.
1597 1635
1598 1636 For the SSH version 1 transport, this type is composed of 2 ``string``
1599 1637 responses: an empty response (``0\n``) followed by the integer result value.
1600 1638 e.g. ``1\n2``. So the full response might be ``0\n1\n2``.
1601 1639
1602 1640 For the HTTP version 1 transport, the response is a ``string`` type composed
1603 1641 of an integer result value followed by a newline (``\n``) followed by string
1604 1642 content holding server output that should be displayed on the client (output
1605 1643 from hooks, etc.).
1606 1644
1607 1645 In some cases, the server may respond with a ``bundle2`` bundle. In this
1608 1646 case, the response type is ``stream``. For the HTTP version 1 transport, the
1609 1647 response is zlib compressed.
1610 1648
1611 1649 The server may also respond with a generic error type, which contains a string
1612 1650 indicating the failure.
@@ -1,838 +1,841
1 1 # wireprotoframing.py - unified framing protocol for wire protocol
2 2 #
3 3 # Copyright 2018 Gregory Szorc <gregory.szorc@gmail.com>
4 4 #
5 5 # This software may be used and distributed according to the terms of the
6 6 # GNU General Public License version 2 or any later version.
7 7
8 8 # This file contains functionality to support the unified frame-based wire
9 9 # protocol. For details about the protocol, see
10 10 # `hg help internals.wireprotocol`.
11 11
12 12 from __future__ import absolute_import
13 13
14 14 import struct
15 15
16 16 from .i18n import _
17 17 from .thirdparty import (
18 18 attr,
19 19 cbor,
20 20 )
21 21 from . import (
22 22 error,
23 23 util,
24 24 )
25 25 from .utils import (
26 26 stringutil,
27 27 )
28 28
29 29 FRAME_HEADER_SIZE = 8
30 30 DEFAULT_MAX_FRAME_SIZE = 32768
31 31
32 32 STREAM_FLAG_BEGIN_STREAM = 0x01
33 33 STREAM_FLAG_END_STREAM = 0x02
34 34 STREAM_FLAG_ENCODING_APPLIED = 0x04
35 35
36 36 STREAM_FLAGS = {
37 37 b'stream-begin': STREAM_FLAG_BEGIN_STREAM,
38 38 b'stream-end': STREAM_FLAG_END_STREAM,
39 39 b'encoded': STREAM_FLAG_ENCODING_APPLIED,
40 40 }
41 41
42 42 FRAME_TYPE_COMMAND_NAME = 0x01
43 43 FRAME_TYPE_COMMAND_ARGUMENT = 0x02
44 44 FRAME_TYPE_COMMAND_DATA = 0x03
45 45 FRAME_TYPE_BYTES_RESPONSE = 0x04
46 46 FRAME_TYPE_ERROR_RESPONSE = 0x05
47 47 FRAME_TYPE_TEXT_OUTPUT = 0x06
48 FRAME_TYPE_PROGRESS = 0x07
48 49 FRAME_TYPE_STREAM_SETTINGS = 0x08
49 50
50 51 FRAME_TYPES = {
51 52 b'command-name': FRAME_TYPE_COMMAND_NAME,
52 53 b'command-argument': FRAME_TYPE_COMMAND_ARGUMENT,
53 54 b'command-data': FRAME_TYPE_COMMAND_DATA,
54 55 b'bytes-response': FRAME_TYPE_BYTES_RESPONSE,
55 56 b'error-response': FRAME_TYPE_ERROR_RESPONSE,
56 57 b'text-output': FRAME_TYPE_TEXT_OUTPUT,
58 b'progress': FRAME_TYPE_PROGRESS,
57 59 b'stream-settings': FRAME_TYPE_STREAM_SETTINGS,
58 60 }
59 61
60 62 FLAG_COMMAND_NAME_EOS = 0x01
61 63 FLAG_COMMAND_NAME_HAVE_ARGS = 0x02
62 64 FLAG_COMMAND_NAME_HAVE_DATA = 0x04
63 65
64 66 FLAGS_COMMAND = {
65 67 b'eos': FLAG_COMMAND_NAME_EOS,
66 68 b'have-args': FLAG_COMMAND_NAME_HAVE_ARGS,
67 69 b'have-data': FLAG_COMMAND_NAME_HAVE_DATA,
68 70 }
69 71
70 72 FLAG_COMMAND_ARGUMENT_CONTINUATION = 0x01
71 73 FLAG_COMMAND_ARGUMENT_EOA = 0x02
72 74
73 75 FLAGS_COMMAND_ARGUMENT = {
74 76 b'continuation': FLAG_COMMAND_ARGUMENT_CONTINUATION,
75 77 b'eoa': FLAG_COMMAND_ARGUMENT_EOA,
76 78 }
77 79
78 80 FLAG_COMMAND_DATA_CONTINUATION = 0x01
79 81 FLAG_COMMAND_DATA_EOS = 0x02
80 82
81 83 FLAGS_COMMAND_DATA = {
82 84 b'continuation': FLAG_COMMAND_DATA_CONTINUATION,
83 85 b'eos': FLAG_COMMAND_DATA_EOS,
84 86 }
85 87
86 88 FLAG_BYTES_RESPONSE_CONTINUATION = 0x01
87 89 FLAG_BYTES_RESPONSE_EOS = 0x02
88 90
89 91 FLAGS_BYTES_RESPONSE = {
90 92 b'continuation': FLAG_BYTES_RESPONSE_CONTINUATION,
91 93 b'eos': FLAG_BYTES_RESPONSE_EOS,
92 94 }
93 95
94 96 FLAG_ERROR_RESPONSE_PROTOCOL = 0x01
95 97 FLAG_ERROR_RESPONSE_APPLICATION = 0x02
96 98
97 99 FLAGS_ERROR_RESPONSE = {
98 100 b'protocol': FLAG_ERROR_RESPONSE_PROTOCOL,
99 101 b'application': FLAG_ERROR_RESPONSE_APPLICATION,
100 102 }
101 103
102 104 # Maps frame types to their available flags.
103 105 FRAME_TYPE_FLAGS = {
104 106 FRAME_TYPE_COMMAND_NAME: FLAGS_COMMAND,
105 107 FRAME_TYPE_COMMAND_ARGUMENT: FLAGS_COMMAND_ARGUMENT,
106 108 FRAME_TYPE_COMMAND_DATA: FLAGS_COMMAND_DATA,
107 109 FRAME_TYPE_BYTES_RESPONSE: FLAGS_BYTES_RESPONSE,
108 110 FRAME_TYPE_ERROR_RESPONSE: FLAGS_ERROR_RESPONSE,
109 111 FRAME_TYPE_TEXT_OUTPUT: {},
112 FRAME_TYPE_PROGRESS: {},
110 113 FRAME_TYPE_STREAM_SETTINGS: {},
111 114 }
112 115
113 116 ARGUMENT_FRAME_HEADER = struct.Struct(r'<HH')
114 117
115 118 @attr.s(slots=True)
116 119 class frameheader(object):
117 120 """Represents the data in a frame header."""
118 121
119 122 length = attr.ib()
120 123 requestid = attr.ib()
121 124 streamid = attr.ib()
122 125 streamflags = attr.ib()
123 126 typeid = attr.ib()
124 127 flags = attr.ib()
125 128
126 129 @attr.s(slots=True)
127 130 class frame(object):
128 131 """Represents a parsed frame."""
129 132
130 133 requestid = attr.ib()
131 134 streamid = attr.ib()
132 135 streamflags = attr.ib()
133 136 typeid = attr.ib()
134 137 flags = attr.ib()
135 138 payload = attr.ib()
136 139
137 140 def makeframe(requestid, streamid, streamflags, typeid, flags, payload):
138 141 """Assemble a frame into a byte array."""
139 142 # TODO assert size of payload.
140 143 frame = bytearray(FRAME_HEADER_SIZE + len(payload))
141 144
142 145 # 24 bits length
143 146 # 16 bits request id
144 147 # 8 bits stream id
145 148 # 8 bits stream flags
146 149 # 4 bits type
147 150 # 4 bits flags
148 151
149 152 l = struct.pack(r'<I', len(payload))
150 153 frame[0:3] = l[0:3]
151 154 struct.pack_into(r'<HBB', frame, 3, requestid, streamid, streamflags)
152 155 frame[7] = (typeid << 4) | flags
153 156 frame[8:] = payload
154 157
155 158 return frame
156 159
157 160 def makeframefromhumanstring(s):
158 161 """Create a frame from a human readable string
159 162
160 163 DANGER: NOT SAFE TO USE WITH UNTRUSTED INPUT BECAUSE OF POTENTIAL
161 164 eval() USAGE. DO NOT USE IN CORE.
162 165
163 166 Strings have the form:
164 167
165 168 <request-id> <stream-id> <stream-flags> <type> <flags> <payload>
166 169
167 170 This can be used by user-facing applications and tests for creating
168 171 frames easily without having to type out a bunch of constants.
169 172
170 173 Request ID and stream IDs are integers.
171 174
172 175 Stream flags, frame type, and flags can be specified by integer or
173 176 named constant.
174 177
175 178 Flags can be delimited by `|` to bitwise OR them together.
176 179
177 180 If the payload begins with ``cbor:``, the following string will be
178 181 evaluated as Python code and the resulting object will be fed into
179 182 a CBOR encoder. Otherwise, the payload is interpreted as a Python
180 183 byte string literal.
181 184 """
182 185 fields = s.split(b' ', 5)
183 186 requestid, streamid, streamflags, frametype, frameflags, payload = fields
184 187
185 188 requestid = int(requestid)
186 189 streamid = int(streamid)
187 190
188 191 finalstreamflags = 0
189 192 for flag in streamflags.split(b'|'):
190 193 if flag in STREAM_FLAGS:
191 194 finalstreamflags |= STREAM_FLAGS[flag]
192 195 else:
193 196 finalstreamflags |= int(flag)
194 197
195 198 if frametype in FRAME_TYPES:
196 199 frametype = FRAME_TYPES[frametype]
197 200 else:
198 201 frametype = int(frametype)
199 202
200 203 finalflags = 0
201 204 validflags = FRAME_TYPE_FLAGS[frametype]
202 205 for flag in frameflags.split(b'|'):
203 206 if flag in validflags:
204 207 finalflags |= validflags[flag]
205 208 else:
206 209 finalflags |= int(flag)
207 210
208 211 if payload.startswith(b'cbor:'):
209 212 payload = cbor.dumps(stringutil.evalpython(payload[5:]), canonical=True)
210 213
211 214 else:
212 215 payload = stringutil.unescapestr(payload)
213 216
214 217 return makeframe(requestid=requestid, streamid=streamid,
215 218 streamflags=finalstreamflags, typeid=frametype,
216 219 flags=finalflags, payload=payload)
217 220
218 221 def parseheader(data):
219 222 """Parse a unified framing protocol frame header from a buffer.
220 223
221 224 The header is expected to be in the buffer at offset 0 and the
222 225 buffer is expected to be large enough to hold a full header.
223 226 """
224 227 # 24 bits payload length (little endian)
225 228 # 16 bits request ID
226 229 # 8 bits stream ID
227 230 # 8 bits stream flags
228 231 # 4 bits frame type
229 232 # 4 bits frame flags
230 233 # ... payload
231 234 framelength = data[0] + 256 * data[1] + 65536 * data[2]
232 235 requestid, streamid, streamflags = struct.unpack_from(r'<HBB', data, 3)
233 236 typeflags = data[7]
234 237
235 238 frametype = (typeflags & 0xf0) >> 4
236 239 frameflags = typeflags & 0x0f
237 240
238 241 return frameheader(framelength, requestid, streamid, streamflags,
239 242 frametype, frameflags)
240 243
241 244 def readframe(fh):
242 245 """Read a unified framing protocol frame from a file object.
243 246
244 247 Returns a ``frame`` instance for the decoded frame or None if no
245 248 frame is available. May raise if a malformed frame is
246 249 seen.
247 250 """
248 251 header = bytearray(FRAME_HEADER_SIZE)
249 252
250 253 readcount = fh.readinto(header)
251 254
252 255 if readcount == 0:
253 256 return None
254 257
255 258 if readcount != FRAME_HEADER_SIZE:
256 259 raise error.Abort(_('received incomplete frame: got %d bytes: %s') %
257 260 (readcount, header))
258 261
259 262 h = parseheader(header)
260 263
261 264 payload = fh.read(h.length)
262 265 if len(payload) != h.length:
263 266 raise error.Abort(_('frame length error: expected %d; got %d') %
264 267 (h.length, len(payload)))
265 268
266 269 return frame(h.requestid, h.streamid, h.streamflags, h.typeid, h.flags,
267 270 payload)
268 271
269 272 def createcommandframes(stream, requestid, cmd, args, datafh=None):
270 273 """Create frames necessary to transmit a request to run a command.
271 274
272 275 This is a generator of bytearrays. Each item represents a frame
273 276 ready to be sent over the wire to a peer.
274 277 """
275 278 flags = 0
276 279 if args:
277 280 flags |= FLAG_COMMAND_NAME_HAVE_ARGS
278 281 if datafh:
279 282 flags |= FLAG_COMMAND_NAME_HAVE_DATA
280 283
281 284 if not flags:
282 285 flags |= FLAG_COMMAND_NAME_EOS
283 286
284 287 yield stream.makeframe(requestid=requestid, typeid=FRAME_TYPE_COMMAND_NAME,
285 288 flags=flags, payload=cmd)
286 289
287 290 for i, k in enumerate(sorted(args)):
288 291 v = args[k]
289 292 last = i == len(args) - 1
290 293
291 294 # TODO handle splitting of argument values across frames.
292 295 payload = bytearray(ARGUMENT_FRAME_HEADER.size + len(k) + len(v))
293 296 offset = 0
294 297 ARGUMENT_FRAME_HEADER.pack_into(payload, offset, len(k), len(v))
295 298 offset += ARGUMENT_FRAME_HEADER.size
296 299 payload[offset:offset + len(k)] = k
297 300 offset += len(k)
298 301 payload[offset:offset + len(v)] = v
299 302
300 303 flags = FLAG_COMMAND_ARGUMENT_EOA if last else 0
301 304 yield stream.makeframe(requestid=requestid,
302 305 typeid=FRAME_TYPE_COMMAND_ARGUMENT,
303 306 flags=flags,
304 307 payload=payload)
305 308
306 309 if datafh:
307 310 while True:
308 311 data = datafh.read(DEFAULT_MAX_FRAME_SIZE)
309 312
310 313 done = False
311 314 if len(data) == DEFAULT_MAX_FRAME_SIZE:
312 315 flags = FLAG_COMMAND_DATA_CONTINUATION
313 316 else:
314 317 flags = FLAG_COMMAND_DATA_EOS
315 318 assert datafh.read(1) == b''
316 319 done = True
317 320
318 321 yield stream.makeframe(requestid=requestid,
319 322 typeid=FRAME_TYPE_COMMAND_DATA,
320 323 flags=flags,
321 324 payload=data)
322 325
323 326 if done:
324 327 break
325 328
326 329 def createbytesresponseframesfrombytes(stream, requestid, data,
327 330 maxframesize=DEFAULT_MAX_FRAME_SIZE):
328 331 """Create a raw frame to send a bytes response from static bytes input.
329 332
330 333 Returns a generator of bytearrays.
331 334 """
332 335
333 336 # Simple case of a single frame.
334 337 if len(data) <= maxframesize:
335 338 yield stream.makeframe(requestid=requestid,
336 339 typeid=FRAME_TYPE_BYTES_RESPONSE,
337 340 flags=FLAG_BYTES_RESPONSE_EOS,
338 341 payload=data)
339 342 return
340 343
341 344 offset = 0
342 345 while True:
343 346 chunk = data[offset:offset + maxframesize]
344 347 offset += len(chunk)
345 348 done = offset == len(data)
346 349
347 350 if done:
348 351 flags = FLAG_BYTES_RESPONSE_EOS
349 352 else:
350 353 flags = FLAG_BYTES_RESPONSE_CONTINUATION
351 354
352 355 yield stream.makeframe(requestid=requestid,
353 356 typeid=FRAME_TYPE_BYTES_RESPONSE,
354 357 flags=flags,
355 358 payload=chunk)
356 359
357 360 if done:
358 361 break
359 362
360 363 def createerrorframe(stream, requestid, msg, protocol=False, application=False):
361 364 # TODO properly handle frame size limits.
362 365 assert len(msg) <= DEFAULT_MAX_FRAME_SIZE
363 366
364 367 flags = 0
365 368 if protocol:
366 369 flags |= FLAG_ERROR_RESPONSE_PROTOCOL
367 370 if application:
368 371 flags |= FLAG_ERROR_RESPONSE_APPLICATION
369 372
370 373 yield stream.makeframe(requestid=requestid,
371 374 typeid=FRAME_TYPE_ERROR_RESPONSE,
372 375 flags=flags,
373 376 payload=msg)
374 377
375 378 def createtextoutputframe(stream, requestid, atoms):
376 379 """Create a text output frame to render text to people.
377 380
378 381 ``atoms`` is a 3-tuple of (formatting string, args, labels).
379 382
380 383 The formatting string contains ``%s`` tokens to be replaced by the
381 384 corresponding indexed entry in ``args``. ``labels`` is an iterable of
382 385 formatters to be applied at rendering time. In terms of the ``ui``
383 386 class, each atom corresponds to a ``ui.write()``.
384 387 """
385 388 bytesleft = DEFAULT_MAX_FRAME_SIZE
386 389 atomchunks = []
387 390
388 391 for (formatting, args, labels) in atoms:
389 392 if len(args) > 255:
390 393 raise ValueError('cannot use more than 255 formatting arguments')
391 394 if len(labels) > 255:
392 395 raise ValueError('cannot use more than 255 labels')
393 396
394 397 # TODO look for localstr, other types here?
395 398
396 399 if not isinstance(formatting, bytes):
397 400 raise ValueError('must use bytes formatting strings')
398 401 for arg in args:
399 402 if not isinstance(arg, bytes):
400 403 raise ValueError('must use bytes for arguments')
401 404 for label in labels:
402 405 if not isinstance(label, bytes):
403 406 raise ValueError('must use bytes for labels')
404 407
405 408 # Formatting string must be UTF-8.
406 409 formatting = formatting.decode(r'utf-8', r'replace').encode(r'utf-8')
407 410
408 411 # Arguments must be UTF-8.
409 412 args = [a.decode(r'utf-8', r'replace').encode(r'utf-8') for a in args]
410 413
411 414 # Labels must be ASCII.
412 415 labels = [l.decode(r'ascii', r'strict').encode(r'ascii')
413 416 for l in labels]
414 417
415 418 if len(formatting) > 65535:
416 419 raise ValueError('formatting string cannot be longer than 64k')
417 420
418 421 if any(len(a) > 65535 for a in args):
419 422 raise ValueError('argument string cannot be longer than 64k')
420 423
421 424 if any(len(l) > 255 for l in labels):
422 425 raise ValueError('label string cannot be longer than 255 bytes')
423 426
424 427 chunks = [
425 428 struct.pack(r'<H', len(formatting)),
426 429 struct.pack(r'<BB', len(labels), len(args)),
427 430 struct.pack(r'<' + r'B' * len(labels), *map(len, labels)),
428 431 struct.pack(r'<' + r'H' * len(args), *map(len, args)),
429 432 ]
430 433 chunks.append(formatting)
431 434 chunks.extend(labels)
432 435 chunks.extend(args)
433 436
434 437 atom = b''.join(chunks)
435 438 atomchunks.append(atom)
436 439 bytesleft -= len(atom)
437 440
438 441 if bytesleft < 0:
439 442 raise ValueError('cannot encode data in a single frame')
440 443
441 444 yield stream.makeframe(requestid=requestid,
442 445 typeid=FRAME_TYPE_TEXT_OUTPUT,
443 446 flags=0,
444 447 payload=b''.join(atomchunks))
445 448
446 449 class stream(object):
447 450 """Represents a logical unidirectional series of frames."""
448 451
449 452 def __init__(self, streamid, active=False):
450 453 self.streamid = streamid
451 454 self._active = active
452 455
453 456 def makeframe(self, requestid, typeid, flags, payload):
454 457 """Create a frame to be sent out over this stream.
455 458
456 459 Only returns the frame instance. Does not actually send it.
457 460 """
458 461 streamflags = 0
459 462 if not self._active:
460 463 streamflags |= STREAM_FLAG_BEGIN_STREAM
461 464 self._active = True
462 465
463 466 return makeframe(requestid, self.streamid, streamflags, typeid, flags,
464 467 payload)
465 468
466 469 def ensureserverstream(stream):
467 470 if stream.streamid % 2:
468 471 raise error.ProgrammingError('server should only write to even '
469 472 'numbered streams; %d is not even' %
470 473 stream.streamid)
471 474
472 475 class serverreactor(object):
473 476 """Holds state of a server handling frame-based protocol requests.
474 477
475 478 This class is the "brain" of the unified frame-based protocol server
476 479 component. While the protocol is stateless from the perspective of
477 480 requests/commands, something needs to track which frames have been
478 481 received, what frames to expect, etc. This class is that thing.
479 482
480 483 Instances are modeled as a state machine of sorts. Instances are also
481 484 reactionary to external events. The point of this class is to encapsulate
482 485 the state of the connection and the exchange of frames, not to perform
483 486 work. Instead, callers tell this class when something occurs, like a
484 487 frame arriving. If that activity is worthy of a follow-up action (say
485 488 *run a command*), the return value of that handler will say so.
486 489
487 490 I/O and CPU intensive operations are purposefully delegated outside of
488 491 this class.
489 492
490 493 Consumers are expected to tell instances when events occur. They do so by
491 494 calling the various ``on*`` methods. These methods return a 2-tuple
492 495 describing any follow-up action(s) to take. The first element is the
493 496 name of an action to perform. The second is a data structure (usually
494 497 a dict) specific to that action that contains more information. e.g.
495 498 if the server wants to send frames back to the client, the data structure
496 499 will contain a reference to those frames.
497 500
498 501 Valid actions that consumers can be instructed to take are:
499 502
500 503 sendframes
501 504 Indicates that frames should be sent to the client. The ``framegen``
502 505 key contains a generator of frames that should be sent. The server
503 506 assumes that all frames are sent to the client.
504 507
505 508 error
506 509 Indicates that an error occurred. Consumer should probably abort.
507 510
508 511 runcommand
509 512 Indicates that the consumer should run a wire protocol command. Details
510 513 of the command to run are given in the data structure.
511 514
512 515 wantframe
513 516 Indicates that nothing of interest happened and the server is waiting on
514 517 more frames from the client before anything interesting can be done.
515 518
516 519 noop
517 520 Indicates no additional action is required.
518 521
519 522 Known Issues
520 523 ------------
521 524
522 525 There are no limits to the number of partially received commands or their
523 526 size. A malicious client could stream command request data and exhaust the
524 527 server's memory.
525 528
526 529 Partially received commands are not acted upon when end of input is
527 530 reached. Should the server error if it receives a partial request?
528 531 Should the client send a message to abort a partially transmitted request
529 532 to facilitate graceful shutdown?
530 533
531 534 Active requests that haven't been responded to aren't tracked. This means
532 535 that if we receive a command and instruct its dispatch, another command
533 536 with its request ID can come in over the wire and there will be a race
534 537 between who responds to what.
535 538 """
536 539
537 540 def __init__(self, deferoutput=False):
538 541 """Construct a new server reactor.
539 542
540 543 ``deferoutput`` can be used to indicate that no output frames should be
541 544 instructed to be sent until input has been exhausted. In this mode,
542 545 events that would normally generate output frames (such as a command
543 546 response being ready) will instead defer instructing the consumer to
544 547 send those frames. This is useful for half-duplex transports where the
545 548 sender cannot receive until all data has been transmitted.
546 549 """
547 550 self._deferoutput = deferoutput
548 551 self._state = 'idle'
549 552 self._nextoutgoingstreamid = 2
550 553 self._bufferedframegens = []
551 554 # stream id -> stream instance for all active streams from the client.
552 555 self._incomingstreams = {}
553 556 self._outgoingstreams = {}
554 557 # request id -> dict of commands that are actively being received.
555 558 self._receivingcommands = {}
556 559 # Request IDs that have been received and are actively being processed.
557 560 # Once all output for a request has been sent, it is removed from this
558 561 # set.
559 562 self._activecommands = set()
560 563
561 564 def onframerecv(self, frame):
562 565 """Process a frame that has been received off the wire.
563 566
564 567 Returns a 2-tuple of (action, data) that details what action,
565 568 if any, the consumer should take next.
566 569 """
567 570 if not frame.streamid % 2:
568 571 self._state = 'errored'
569 572 return self._makeerrorresult(
570 573 _('received frame with even numbered stream ID: %d') %
571 574 frame.streamid)
572 575
573 576 if frame.streamid not in self._incomingstreams:
574 577 if not frame.streamflags & STREAM_FLAG_BEGIN_STREAM:
575 578 self._state = 'errored'
576 579 return self._makeerrorresult(
577 580 _('received frame on unknown inactive stream without '
578 581 'beginning of stream flag set'))
579 582
580 583 self._incomingstreams[frame.streamid] = stream(frame.streamid)
581 584
582 585 if frame.streamflags & STREAM_FLAG_ENCODING_APPLIED:
583 586 # TODO handle decoding frames
584 587 self._state = 'errored'
585 588 raise error.ProgrammingError('support for decoding stream payloads '
586 589 'not yet implemented')
587 590
588 591 if frame.streamflags & STREAM_FLAG_END_STREAM:
589 592 del self._incomingstreams[frame.streamid]
590 593
591 594 handlers = {
592 595 'idle': self._onframeidle,
593 596 'command-receiving': self._onframecommandreceiving,
594 597 'errored': self._onframeerrored,
595 598 }
596 599
597 600 meth = handlers.get(self._state)
598 601 if not meth:
599 602 raise error.ProgrammingError('unhandled state: %s' % self._state)
600 603
601 604 return meth(frame)
602 605
603 606 def onbytesresponseready(self, stream, requestid, data):
604 607 """Signal that a bytes response is ready to be sent to the client.
605 608
606 609 The raw bytes response is passed as an argument.
607 610 """
608 611 ensureserverstream(stream)
609 612
610 613 def sendframes():
611 614 for frame in createbytesresponseframesfrombytes(stream, requestid,
612 615 data):
613 616 yield frame
614 617
615 618 self._activecommands.remove(requestid)
616 619
617 620 result = sendframes()
618 621
619 622 if self._deferoutput:
620 623 self._bufferedframegens.append(result)
621 624 return 'noop', {}
622 625 else:
623 626 return 'sendframes', {
624 627 'framegen': result,
625 628 }
626 629
627 630 def oninputeof(self):
628 631 """Signals that end of input has been received.
629 632
630 633 No more frames will be received. All pending activity should be
631 634 completed.
632 635 """
633 636 # TODO should we do anything about in-flight commands?
634 637
635 638 if not self._deferoutput or not self._bufferedframegens:
636 639 return 'noop', {}
637 640
638 641 # If we buffered all our responses, emit those.
639 642 def makegen():
640 643 for gen in self._bufferedframegens:
641 644 for frame in gen:
642 645 yield frame
643 646
644 647 return 'sendframes', {
645 648 'framegen': makegen(),
646 649 }
647 650
648 651 def onapplicationerror(self, stream, requestid, msg):
649 652 ensureserverstream(stream)
650 653
651 654 return 'sendframes', {
652 655 'framegen': createerrorframe(stream, requestid, msg,
653 656 application=True),
654 657 }
655 658
656 659 def makeoutputstream(self):
657 660 """Create a stream to be used for sending data to the client."""
658 661 streamid = self._nextoutgoingstreamid
659 662 self._nextoutgoingstreamid += 2
660 663
661 664 s = stream(streamid)
662 665 self._outgoingstreams[streamid] = s
663 666
664 667 return s
665 668
666 669 def _makeerrorresult(self, msg):
667 670 return 'error', {
668 671 'message': msg,
669 672 }
670 673
671 674 def _makeruncommandresult(self, requestid):
672 675 entry = self._receivingcommands[requestid]
673 676 del self._receivingcommands[requestid]
674 677
675 678 if self._receivingcommands:
676 679 self._state = 'command-receiving'
677 680 else:
678 681 self._state = 'idle'
679 682
680 683 assert requestid not in self._activecommands
681 684 self._activecommands.add(requestid)
682 685
683 686 return 'runcommand', {
684 687 'requestid': requestid,
685 688 'command': entry['command'],
686 689 'args': entry['args'],
687 690 'data': entry['data'].getvalue() if entry['data'] else None,
688 691 }
689 692
690 693 def _makewantframeresult(self):
691 694 return 'wantframe', {
692 695 'state': self._state,
693 696 }
694 697
695 698 def _onframeidle(self, frame):
696 699 # The only frame type that should be received in this state is a
697 700 # command request.
698 701 if frame.typeid != FRAME_TYPE_COMMAND_NAME:
699 702 self._state = 'errored'
700 703 return self._makeerrorresult(
701 704 _('expected command frame; got %d') % frame.typeid)
702 705
703 706 if frame.requestid in self._receivingcommands:
704 707 self._state = 'errored'
705 708 return self._makeerrorresult(
706 709 _('request with ID %d already received') % frame.requestid)
707 710
708 711 if frame.requestid in self._activecommands:
709 712 self._state = 'errored'
710 713 return self._makeerrorresult((
711 714 _('request with ID %d is already active') % frame.requestid))
712 715
713 716 expectingargs = bool(frame.flags & FLAG_COMMAND_NAME_HAVE_ARGS)
714 717 expectingdata = bool(frame.flags & FLAG_COMMAND_NAME_HAVE_DATA)
715 718
716 719 self._receivingcommands[frame.requestid] = {
717 720 'command': frame.payload,
718 721 'args': {},
719 722 'data': None,
720 723 'expectingargs': expectingargs,
721 724 'expectingdata': expectingdata,
722 725 }
723 726
724 727 if frame.flags & FLAG_COMMAND_NAME_EOS:
725 728 return self._makeruncommandresult(frame.requestid)
726 729
727 730 if expectingargs or expectingdata:
728 731 self._state = 'command-receiving'
729 732 return self._makewantframeresult()
730 733 else:
731 734 self._state = 'errored'
732 735 return self._makeerrorresult(_('missing frame flags on '
733 736 'command frame'))
734 737
735 738 def _onframecommandreceiving(self, frame):
736 739 # It could be a new command request. Process it as such.
737 740 if frame.typeid == FRAME_TYPE_COMMAND_NAME:
738 741 return self._onframeidle(frame)
739 742
740 743 # All other frames should be related to a command that is currently
741 744 # receiving but is not active.
742 745 if frame.requestid in self._activecommands:
743 746 self._state = 'errored'
744 747 return self._makeerrorresult(
745 748 _('received frame for request that is still active: %d') %
746 749 frame.requestid)
747 750
748 751 if frame.requestid not in self._receivingcommands:
749 752 self._state = 'errored'
750 753 return self._makeerrorresult(
751 754 _('received frame for request that is not receiving: %d') %
752 755 frame.requestid)
753 756
754 757 entry = self._receivingcommands[frame.requestid]
755 758
756 759 if frame.typeid == FRAME_TYPE_COMMAND_ARGUMENT:
757 760 if not entry['expectingargs']:
758 761 self._state = 'errored'
759 762 return self._makeerrorresult(_(
760 763 'received command argument frame for request that is not '
761 764 'expecting arguments: %d') % frame.requestid)
762 765
763 766 return self._handlecommandargsframe(frame, entry)
764 767
765 768 elif frame.typeid == FRAME_TYPE_COMMAND_DATA:
766 769 if not entry['expectingdata']:
767 770 self._state = 'errored'
768 771 return self._makeerrorresult(_(
769 772 'received command data frame for request that is not '
770 773 'expecting data: %d') % frame.requestid)
771 774
772 775 if entry['data'] is None:
773 776 entry['data'] = util.bytesio()
774 777
775 778 return self._handlecommanddataframe(frame, entry)
776 779
777 780 def _handlecommandargsframe(self, frame, entry):
778 781 # The frame and state of command should have already been validated.
779 782 assert frame.typeid == FRAME_TYPE_COMMAND_ARGUMENT
780 783
781 784 offset = 0
782 785 namesize, valuesize = ARGUMENT_FRAME_HEADER.unpack_from(frame.payload)
783 786 offset += ARGUMENT_FRAME_HEADER.size
784 787
785 788 # The argument name MUST fit inside the frame.
786 789 argname = bytes(frame.payload[offset:offset + namesize])
787 790 offset += namesize
788 791
789 792 if len(argname) != namesize:
790 793 self._state = 'errored'
791 794 return self._makeerrorresult(_('malformed argument frame: '
792 795 'partial argument name'))
793 796
794 797 argvalue = bytes(frame.payload[offset:])
795 798
796 799 # Argument value spans multiple frames. Record our active state
797 800 # and wait for the next frame.
798 801 if frame.flags & FLAG_COMMAND_ARGUMENT_CONTINUATION:
799 802 raise error.ProgrammingError('not yet implemented')
800 803
801 804 # Common case: the argument value is completely contained in this
802 805 # frame.
803 806
804 807 if len(argvalue) != valuesize:
805 808 self._state = 'errored'
806 809 return self._makeerrorresult(_('malformed argument frame: '
807 810 'partial argument value'))
808 811
809 812 entry['args'][argname] = argvalue
810 813
811 814 if frame.flags & FLAG_COMMAND_ARGUMENT_EOA:
812 815 if entry['expectingdata']:
813 816 # TODO signal request to run a command once we don't
814 817 # buffer data frames.
815 818 return self._makewantframeresult()
816 819 else:
817 820 return self._makeruncommandresult(frame.requestid)
818 821 else:
819 822 return self._makewantframeresult()
820 823
821 824 def _handlecommanddataframe(self, frame, entry):
822 825 assert frame.typeid == FRAME_TYPE_COMMAND_DATA
823 826
824 827 # TODO support streaming data instead of buffering it.
825 828 entry['data'].write(frame.payload)
826 829
827 830 if frame.flags & FLAG_COMMAND_DATA_CONTINUATION:
828 831 return self._makewantframeresult()
829 832 elif frame.flags & FLAG_COMMAND_DATA_EOS:
830 833 entry['data'].seek(0)
831 834 return self._makeruncommandresult(frame.requestid)
832 835 else:
833 836 self._state = 'errored'
834 837 return self._makeerrorresult(_('command data frame without '
835 838 'flags'))
836 839
837 840 def _onframeerrored(self, frame):
838 841 return self._makeerrorresult(_('server already errored'))
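
The ``deferoutput`` mode described in the constructor docstring can be illustrated in isolation. The following is a minimal sketch, not the actual reactor: ``ToyReactor`` and ``drain`` are hypothetical names, frames are stood in for by plain tuples, and all stream and state handling is omitted. It only shows how responses are buffered as generators and released once end of input is signalled.

```python
import itertools


class ToyReactor(object):
    """Hypothetical stand-in mimicking only the deferred-output behavior."""

    def __init__(self, deferoutput=False):
        self._deferoutput = deferoutput
        self._bufferedframegens = []

    def onbytesresponseready(self, requestid, data):
        def sendframes():
            # The real code yields frame objects; a tuple stands in here.
            yield (requestid, data)

        result = sendframes()

        if self._deferoutput:
            # Hold on to the generator; nothing is sent yet.
            self._bufferedframegens.append(result)
            return 'noop', {}

        return 'sendframes', {'framegen': result}

    def oninputeof(self):
        if not self._deferoutput or not self._bufferedframegens:
            return 'noop', {}

        # Emit every buffered response, in the order it became ready.
        gens = self._bufferedframegens
        return 'sendframes', {
            'framegen': itertools.chain.from_iterable(gens),
        }


def drain(result):
    """Consumer side: collect frames only when instructed to send them."""
    action, meta = result
    if action == 'sendframes':
        return list(meta['framegen'])
    return []
```

With ``deferoutput=True``, a consumer following this sketch sends nothing while responses become ready and sends everything after ``oninputeof()``, which is the behavior a half-duplex transport needs.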