internals: extract wire protocol version 2 commands to standalone doc...
Gregory Szorc -
r39470:dc61a67c default
@@ -51,6 +51,7 b''
51 <File Id="internals.requirements.txt" Name="requirements.txt" />
51 <File Id="internals.requirements.txt" Name="requirements.txt" />
52 <File Id="internals.revlogs.txt" Name="revlogs.txt" />
52 <File Id="internals.revlogs.txt" Name="revlogs.txt" />
53 <File Id="internals.wireprotocol.txt" Name="wireprotocol.txt" />
53 <File Id="internals.wireprotocol.txt" Name="wireprotocol.txt" />
54 <File Id="internals.wireprotocolv2.txt" Name="wireprotocolv2.txt" />
54 </Component>
55 </Component>
55 </Directory>
56 </Directory>
56
57
@@ -1379,6 +1379,9 b' Commands'
1379 This section contains a list of all wire protocol commands implemented by
1379 This section contains a list of all wire protocol commands implemented by
1380 the canonical Mercurial server.
1380 the canonical Mercurial server.
1381
1381
1382 See :hg:`help internals.wireprotocolv2` for information on commands exposed
1383 to the frame-based protocol.
1384
1382 batch
1385 batch
1383 -----
1386 -----
1384
1387
@@ -1750,164 +1753,3 b' response is zlib compressed.'
1750
1753
1751 The server may also respond with a generic error type, which contains a string
1754 The server may also respond with a generic error type, which contains a string
1752 indicating the failure.
1755 indicating the failure.
1753
1754 Frame-Based Protocol Commands
1755 =============================
1756
1757 **Experimental and under active development**
1758
1759 This section documents the wire protocol commands exposed to transports
1760 using the frame-based protocol. The set of commands exposed through
1761 these transports is distinct from the set of commands exposed to legacy
1762 transports.
1763
1764 The frame-based protocol uses CBOR to encode command execution requests.
1765 All command arguments must be mapped to a specific CBOR data type or to a
1766 set of CBOR data types.
1767
1768 The response to many commands is also CBOR. There is no common response
1769 format: each command defines its own response format.
1770
1771 TODO require node type be specified, as N bytes of binary node value
1772 could be ambiguous once SHA-1 is replaced.
1773
1774 branchmap
1775 ---------
1776
1777 Obtain heads in named branches.
1778
1779 Receives no arguments.
1780
1781 The response is a map with bytestring keys defining the branch name.
1782 Values are arrays of bytestring defining raw changeset nodes.
1783
1784 capabilities
1785 ------------
1786
1787 Obtain the server's capabilities.
1788
1789 Receives no arguments.
1790
1791 This command is typically called only as part of the handshake during
1792 initial connection establishment.
1793
1794 The response is a map with bytestring keys defining server information.
1795
1796 The defined keys are:
1797
1798 commands
1799 A map defining available wire protocol commands on this server.
1800
1801 Keys in the map are the names of commands that can be invoked. Values
1802 are maps defining information about that command. The bytestring keys
1803 are:
1804
1805 args
1806 A map of argument names and their expected types.
1807
1808 Types are defined as a representative value for the expected type.
1809 e.g. an argument expecting a boolean type will have its value
1810 set to true. An integer type will have its value set to 42. The
1811 actual values are arbitrary and may not have meaning.
1812 permissions
1813 An array of permissions required to execute this command.
1814
1815 compression
1816 An array of maps defining available compression format support.
1817
1818 The array is sorted from most preferred to least preferred.
1819
1820 Each entry has the following bytestring keys:
1821
1822 name
1823 Name of the compression engine. e.g. ``zstd`` or ``zlib``.
1824
1825 framingmediatypes
1826 An array of bytestrings defining the supported framing protocol
1827 media types. Servers will not accept media types not in this list.
1828
1829 rawrepoformats
1830 An array of storage formats the repository is using. This set of
1831 requirements can be used to determine whether a client can read a
1832 *raw* copy of file data available.
1833
1834 heads
1835 -----
1836
1837 Obtain DAG heads in the repository.
1838
1839 The command accepts the following arguments:
1840
1841 publiconly (optional)
1842 (boolean) If set, operate on the DAG for public phase changesets only.
1843 Non-public (i.e. draft) phase DAG heads will not be returned.
1844
1845 The response is a CBOR array of bytestrings defining changeset nodes
1846 of DAG heads. The array can be empty if the repository is empty or no
1847 changesets satisfied the request.
1848
1849 TODO consider exposing phase of heads in response
1850
1851 known
1852 -----
1853
1854 Determine whether a series of changeset nodes is known to the server.
1855
1856 The command accepts the following arguments:
1857
1858 nodes
1859 (array of bytestrings) List of changeset nodes whose presence to
1860 query.
1861
1862 The response is a bytestring where each byte contains a 0 or 1 for the
1863 corresponding requested node at the same index.
1864
1865 TODO use a bit array for even more compact response
1866
1867 listkeys
1868 --------
1869
1870 List values in a specified ``pushkey`` namespace.
1871
1872 The command receives the following arguments:
1873
1874 namespace
1875 (bytestring) Pushkey namespace to query.
1876
1877 The response is a map with bytestring keys and values.
1878
1879 TODO consider using binary to represent nodes in certain pushkey namespaces.
1880
1881 lookup
1882 ------
1883
1884 Try to resolve a value to a changeset revision.
1885
1886 Unlike ``known`` which operates on changeset nodes, lookup operates on
1887 node fragments and other names that a user may use.
1888
1889 The command receives the following arguments:
1890
1891 key
1892 (bytestring) Value to try to resolve.
1893
1894 On success, returns a bytestring containing the resolved node.
1895
1896 pushkey
1897 -------
1898
1899 Set a value using the ``pushkey`` protocol.
1900
1901 The command receives the following arguments:
1902
1903 namespace
1904 (bytestring) Pushkey namespace to operate on.
1905 key
1906 (bytestring) The pushkey key to set.
1907 old
1908 (bytestring) Old value for this key.
1909 new
1910 (bytestring) New value for this key.
1911
1912 TODO consider using binary to represent nodes in certain pushkey namespaces.
1913 TODO better define response type and meaning.
This diff has been collapsed as it changes many lines (1773 lines changed).
@@ -1,1758 +1,5 b''
1 The Mercurial wire protocol is a request-response based protocol
1 Wire Protocol Version 2
2 with multiple wire representations.
2 =======================
3
4 Each request is modeled as a command name, a dictionary of arguments, and
5 optional raw input. Command arguments and their types are intrinsic
6 properties of commands. So is the response type of the command. This means
7 clients can't always send arbitrary arguments to servers and servers can't
8 return multiple response types.
9
10 The protocol is synchronous and does not support multiplexing (concurrent
11 commands).
12
13 Handshake
14 =========
15
16 It is required or common for clients to perform a *handshake* when connecting
17 to a server. The handshake serves the following purposes:
18
19 * Negotiating protocol/transport level options
20 * Allowing the client to learn about server capabilities to influence
21 future requests
22 * Ensuring the underlying transport channel is in a *clean* state
23
24 An important goal of the handshake is to allow clients to use more modern
25 wire protocol features. By default, clients must assume they are talking
26 to an old version of Mercurial server (possibly even the very first
27 implementation). So, clients should not attempt to call or utilize modern
28 wire protocol features until they have confirmation that the server
29 supports them. The handshake implementation is designed to allow both
30 ends to utilize the latest set of features and capabilities with as
31 few round trips as possible.
32
33 The handshake mechanism varies by transport and protocol and is documented
34 in the sections below.
35
36 HTTP Protocol
37 =============
38
39 Handshake
40 ---------
41
42 The client sends a ``capabilities`` command request (``?cmd=capabilities``)
43 as soon as HTTP requests may be issued.
44
45 By default, the server responds with a version 1 capabilities string, which
46 the client parses to learn about the server's abilities. The ``Content-Type``
47 for this response is ``application/mercurial-0.1`` or
48 ``application/mercurial-0.2`` depending on whether the client advertised
49 support for version ``0.2`` in its request. (Clients aren't supposed to
50 advertise support for ``0.2`` until the capabilities response indicates
51 the server's support for that media type. However, a client could
52 conceivably cache this metadata and issue the capabilities request in such
53 a way to elicit an ``application/mercurial-0.2`` response.)
54
55 Clients wishing to switch to a newer API service may send an
56 ``X-HgUpgrade-<X>`` header containing a space-delimited list of API service
57 names the client is capable of speaking. The request MUST also include an
58 ``X-HgProto-<X>`` header advertising a known serialization format for the
59 response. ``cbor`` is currently the only defined serialization format.
60
61 If the request contains these headers, the response ``Content-Type`` MAY
62 be for a different media type. e.g. ``application/mercurial-cbor`` if the
63 client advertises support for CBOR.
64
65 The response MUST be deserializable to a map with the following keys:
66
67 apibase
68 URL path to API services, relative to the repository root. e.g. ``api/``.
69
70 apis
71 A map of API service names to API descriptors. An API descriptor contains
72 more details about that API. In the case of the HTTP Version 2 Transport,
73 it will be the normal response to a ``capabilities`` command.
74
75 Only the services advertised by the client that are also available on
76 the server are advertised.
77
78 v1capabilities
79 The capabilities string that would be returned by a version 1 response.
80
81 The client can then inspect the server-advertised APIs and decide which
82 API to use, including continuing to use the HTTP Version 1 Transport.
83
84 HTTP Version 1 Transport
85 ------------------------
86
87 Commands are issued as HTTP/1.0 or HTTP/1.1 requests. Commands are
88 sent to the base URL of the repository with the command name sent in
89 the ``cmd`` query string parameter. e.g.
90 ``https://example.com/repo?cmd=capabilities``. The HTTP method is ``GET``
91 or ``POST`` depending on the command and whether there is a request
92 body.
93
94 Command arguments can be sent multiple ways.
95
96 The simplest is part of the URL query string using ``x-www-form-urlencoded``
97 encoding (see Python's ``urllib.urlencode()``). However, many servers impose
98 length limitations on the URL. So this mechanism is typically only used if
99 the server doesn't support other mechanisms.
100
101 If the server supports the ``httpheader`` capability, command arguments can
102 be sent in HTTP request headers named ``X-HgArg-<N>`` where ``<N>`` is an
103 integer starting at 1. A ``x-www-form-urlencoded`` representation of the
104 arguments is obtained. This full string is then split into chunks and sent
105 in numbered ``X-HgArg-<N>`` headers. The maximum length of each HTTP header
106 is defined by the server in the ``httpheader`` capability value, which defaults
107 to ``1024``. The server reassembles the encoded arguments string by
108 concatenating the ``X-HgArg-<N>`` headers then URL decodes them into a
109 dictionary.
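The header-based argument encoding described above can be sketched as follows. This is a minimal illustration, not the Mercurial implementation: the helper names are hypothetical, and the sketch assumes the length limit applies to the header value only (the text does not say whether the header name counts toward the limit).

```python
from urllib.parse import urlencode, parse_qsl


def encode_args_in_headers(args, max_value_len=1024):
    """Split URL-encoded arguments across numbered X-HgArg-<N> headers.

    Assumption: max_value_len limits only the header value.
    """
    encoded = urlencode(args)
    headers = []
    for n, start in enumerate(range(0, len(encoded), max_value_len), start=1):
        chunk = encoded[start:start + max_value_len]
        headers.append(("X-HgArg-%d" % n, chunk))
    return headers


def decode_args_from_headers(headers):
    """Reassemble the chunks in numeric order, then URL decode them."""
    ordered = sorted(headers, key=lambda h: int(h[0].rsplit("-", 1)[1]))
    encoded = "".join(value for _name, value in ordered)
    return dict(parse_qsl(encoded, keep_blank_values=True))
```

Because the chunks are concatenated before decoding, a split that falls in the middle of an escape sequence is harmless.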
110
111 The list of ``X-HgArg-<N>`` headers should be added to the ``Vary`` request
112 header to instruct caches to take these headers into consideration when caching
113 requests.
114
115 If the server supports the ``httppostargs`` capability, the client
116 may send command arguments in the HTTP request body as part of an
117 HTTP POST request. The command arguments will be URL encoded just like
118 they would for sending them via HTTP headers. However, no splitting is
119 performed: the raw arguments are included in the HTTP request body.
120
121 The client sends a ``X-HgArgs-Post`` header with the string length of the
122 encoded arguments data. Additional data may be included in the HTTP
123 request body immediately following the argument data. The offset of the
124 non-argument data is defined by the ``X-HgArgs-Post`` header. The
125 ``X-HgArgs-Post`` header is not required if there is no argument data.
126
127 Additional command data can be sent as part of the HTTP request body. The
128 default ``Content-Type`` when sending data is ``application/mercurial-0.1``.
129 A ``Content-Length`` header is currently always sent.
130
131 Example HTTP requests::
132
133 GET /repo?cmd=capabilities
134 X-HgArg-1: foo=bar&baz=hello%20world
135
136 The request media type should be chosen based on server support. If the
137 ``httpmediatype`` server capability is present, the client should send
138 the newest mutually supported media type. If this capability is absent,
139 the client must assume the server only supports the
140 ``application/mercurial-0.1`` media type.
141
142 The ``Content-Type`` HTTP response header identifies the response as coming
143 from Mercurial and can also be used to signal an error has occurred.
144
145 The ``application/mercurial-*`` media types indicate a generic Mercurial
146 data type.
147
148 The ``application/mercurial-0.1`` media type is raw Mercurial data. It is the
149 predecessor of the format below.
150
151 The ``application/mercurial-0.2`` media type is compression framed Mercurial
152 data. The first byte of the payload indicates the length of the compression
153 format identifier that follows. Next are N bytes indicating the compression
154 format. e.g. ``zlib``. The remaining bytes are compressed according to that
155 compression format. The decompressed data behaves the same as with
156 ``application/mercurial-0.1``.
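As an illustration of the ``application/mercurial-0.2`` framing just described, here is a small sketch. The function names are hypothetical, only ``zlib`` is handled, and it assumes the compressed body is an ordinary zlib stream as produced by Python's ``zlib`` module.

```python
import zlib


def encode_mercurial_02(data, engine=b"zlib"):
    """Frame data as: 1 length byte, engine name, compressed payload."""
    if engine != b"zlib":
        raise ValueError("this sketch only knows the zlib engine")
    return bytes([len(engine)]) + engine + zlib.compress(data)


def decode_mercurial_02(payload):
    """Return (engine name, decompressed data) from a framed payload."""
    name_len = payload[0]                 # first byte: identifier length
    engine = payload[1:1 + name_len]      # next N bytes: compression format
    body = payload[1 + name_len:]         # remainder: compressed data
    if engine != b"zlib":
        raise ValueError("unsupported compression format: %r" % engine)
    return engine, zlib.decompress(body)
```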
157
158 The ``application/hg-error`` media type indicates a generic error occurred.
159 The content of the HTTP response body typically holds text describing the
160 error.
161
162 The ``application/mercurial-cbor`` media type indicates a CBOR payload
163 and should be interpreted as identical to ``application/cbor``.
164
165 Behavior of media types is further described in the ``Content Negotiation``
166 section below.
167
168 Clients should issue a ``User-Agent`` request header that identifies the client.
169 The server should not use the ``User-Agent`` for feature detection.
170
171 A command returning a ``string`` response issues a
172 ``application/mercurial-0.*`` media type and the HTTP response body contains
173 the raw string value (after compression decoding, if used). A
174 ``Content-Length`` header is typically issued, but not required.
175
176 A command returning a ``stream`` response issues a
177 ``application/mercurial-0.*`` media type and the HTTP response is typically
178 using *chunked transfer* (``Transfer-Encoding: chunked``).
179
180 HTTP Version 2 Transport
181 ------------------------
182
183 **Experimental - feature under active development**
184
185 Version 2 of the HTTP protocol is exposed under the ``/api/*`` URL space.
186 Its final API name is not yet formalized.
187
188 Commands are triggered by sending HTTP POST requests against URLs of the
189 form ``<permission>/<command>``, where ``<permission>`` is ``ro`` or
190 ``rw``, meaning read-only and read-write, respectively and ``<command>``
191 is a named wire protocol command.
192
193 Non-POST request methods MUST be rejected by the server with an HTTP
194 405 response.
195
196 Commands that modify repository state in meaningful ways MUST NOT be
197 exposed under the ``ro`` URL prefix. All available commands MUST be
198 available under the ``rw`` URL prefix.
199
200 Server administrators MAY implement blanket HTTP authentication keyed
201 off the URL prefix. For example, a server may require authentication
202 for all ``rw/*`` URLs and let unauthenticated requests to ``ro/*``
203 URLs proceed. A server MAY issue an HTTP 401, 403, or 407 response
204 in accordance with RFC 7235. Clients SHOULD recognize the HTTP Basic
205 (RFC 7617) and Digest (RFC 7616) authentication schemes. Clients SHOULD
206 make an attempt to recognize unknown schemes using the
207 ``WWW-Authenticate`` response header on a 401 response, as defined by
208 RFC 7235.
209
210 Read-only commands are accessible under ``rw/*`` URLs so clients can
211 signal the intent of the operation very early in the connection
212 lifecycle. For example, a ``push`` operation - which consists of
213 various read-only commands mixed with at least one read-write command -
214 can perform all commands against ``rw/*`` URLs so that any server-side
215 authentication requirements are discovered upon attempting the first
216 command - not potentially several commands into the exchange. This
217 allows clients to fail faster or prompt for credentials as soon as the
218 exchange takes place. This provides a better end-user experience.
219
220 Requests to unknown commands or URLs result in an HTTP 404.
221 TODO formally define response type, how error is communicated, etc.
222
223 HTTP request and response bodies use the *Unified Frame-Based Protocol*
224 (defined below) for media exchange. The entirety of the HTTP message
225 body is 0 or more frames as defined by this protocol.
226
227 Clients and servers MUST advertise the ``TBD`` media type via the
228 ``Content-Type`` request and response headers. In addition, clients MUST
229 advertise this media type value in their ``Accept`` request header in all
230 requests.
231 TODO finalize the media type. For now, it is defined in wireprotoserver.py.
232
233 Servers receiving requests without an ``Accept`` header SHOULD respond with
234 an HTTP 406.
235
236 Servers receiving requests with an invalid ``Content-Type`` header SHOULD
237 respond with an HTTP 415.
238
239 The command to run is specified in the POST payload as defined by the
240 *Unified Frame-Based Protocol*. This is redundant with data already
241 encoded in the URL. This is by design, so server operators can have
242 better understanding about server activity from looking merely at
243 HTTP access logs.
244
245 In most circumstances, the command specified in the URL MUST match
246 the command specified in the frame-based payload or the server will
247 respond with an error. The exception to this is the special
248 ``multirequest`` URL. (See below.) In addition, HTTP requests
249 are limited to one command invocation. The exception is the special
250 ``multirequest`` URL.
251
252 The ``multirequest`` command endpoints (``ro/multirequest`` and
253 ``rw/multirequest``) are special in that they allow the execution of
254 *any* command and allow the execution of multiple commands. If the
255 HTTP request issues multiple commands across multiple frames, all
256 issued commands will be processed by the server. Per the defined
257 behavior of the *Unified Frame-Based Protocol*, commands may be
258 issued interleaved and responses may come back in a different order
259 than they were issued. Clients MUST be able to deal with this.
260
261 SSH Protocol
262 ============
263
264 Handshake
265 ---------
266
267 For all clients, the handshake consists of the client sending 1 or more
268 commands to the server using version 1 of the transport. Servers respond
269 to commands they know how to respond to and send an empty response (``0\n``)
270 for unknown commands (per standard behavior of version 1 of the transport).
271 Clients then typically look for a response to the newest sent command to
272 determine which transport version to use and what the available features for
273 the connection and server are.
274
275 Preceding any response from client-issued commands, the server may print
276 non-protocol output. It is common for SSH servers to print banners, message
277 of the day announcements, etc when clients connect. It is assumed that any
278 such *banner* output will precede any Mercurial server output. So clients
279 must be prepared to handle server output on initial connect that isn't
280 in response to any client-issued command and doesn't conform to Mercurial's
281 wire protocol. This *banner* output should only be on stdout. However,
282 some servers may send output on stderr.
283
284 Pre 0.9.1 clients issue a ``between`` command with the ``pairs`` argument
285 having the value
286 ``0000000000000000000000000000000000000000-0000000000000000000000000000000000000000``.
287
288 The ``between`` command has been supported since the original Mercurial
289 SSH server. Requesting the empty range will return a ``\n`` string response,
290 which will be encoded as ``1\n\n`` (value length of ``1`` followed by a newline
291 followed by the value, which happens to be a newline).
292
293 For pre 0.9.1 clients and all servers, the exchange looks like::
294
295 c: between\n
296 c: pairs 81\n
297 c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
298 s: 1\n
299 s: \n
300
301 0.9.1+ clients send a ``hello`` command (with no arguments) before the
302 ``between`` command. The response to this command allows clients to
303 discover server capabilities and settings.
304
305 An example exchange between 0.9.1+ clients and a ``hello`` aware server looks
306 like::
307
308 c: hello\n
309 c: between\n
310 c: pairs 81\n
311 c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
312 s: 324\n
313 s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
314 s: 1\n
315 s: \n
316
317 And a similar scenario but with servers sending a banner on connect::
318
319 c: hello\n
320 c: between\n
321 c: pairs 81\n
322 c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
323 s: welcome to the server\n
324 s: if you find any issues, email someone@somewhere.com\n
325 s: 324\n
326 s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
327 s: 1\n
328 s: \n
329
330 Note that output from the ``hello`` command is terminated by a ``\n``. This is
331 part of the response payload and not part of the wire protocol adding a newline
332 after responses. In other words, the length of the response contains the
333 trailing ``\n``.
334
335 Clients supporting version 2 of the SSH transport send a line beginning
336 with ``upgrade`` before the ``hello`` and ``between`` commands. The line
337 (which isn't a well-formed command line because it doesn't consist of a
338 single command name) serves to both communicate the client's intent to
339 switch to transport version 2 (transports are version 1 by default) as
340 well as to advertise the client's transport-level capabilities so the
341 server may satisfy that request immediately.
342
343 The upgrade line has the form:
344
345 upgrade <token> <transport capabilities>
346
347 That is the literal string ``upgrade`` followed by a space, followed by
348 a randomly generated string, followed by a space, followed by a string
349 denoting the client's transport capabilities.
350
351 The token can be anything. However, a random UUID is recommended. (Use
352 of version 4 UUIDs is recommended because version 1 UUIDs can leak the
353 client's MAC address.)
354
355 The transport capabilities string is a URL/percent encoded string
356 containing key-value pairs defining the client's transport-level
357 capabilities. The following capabilities are defined:
358
359 proto
360 A comma-delimited list of transport protocol versions the client
361 supports. e.g. ``ssh-v2``.
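A client might build the ``upgrade`` line along these lines. This is a sketch only: ``build_upgrade_line`` is a hypothetical helper, and the line terminator is left to the caller because the text above does not spell it out.

```python
import uuid
from urllib.parse import urlencode


def build_upgrade_line(protos=("ssh-v2",)):
    """Return (token, line) for an SSH transport upgrade request."""
    # A version 4 (random) UUID avoids leaking the client's MAC address.
    token = str(uuid.uuid4())
    # Transport capabilities are URL encoded key-value pairs; ``proto``
    # is a comma-delimited list of supported transport versions.
    caps = urlencode({"proto": ",".join(protos)})
    return token, "upgrade %s %s" % (token, caps)
```

The client keeps the returned token so it can later match it against the server's ``upgraded`` reply.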
362
363 If the server does not recognize the ``upgrade`` line, it should issue
364 an empty response and continue processing the ``hello`` and ``between``
365 commands. Here is an example handshake between a version 2 aware client
366 and a non version 2 aware server:
367
368 c: upgrade 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a proto=ssh-v2
369 c: hello\n
370 c: between\n
371 c: pairs 81\n
372 c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
373 s: 0\n
374 s: 324\n
375 s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
376 s: 1\n
377 s: \n
378
379 (The initial ``0\n`` line from the server indicates an empty response to
380 the unknown ``upgrade ..`` command/line.)
381
382 If the server recognizes the ``upgrade`` line and is willing to satisfy that
383 upgrade request, it replies with a payload of the following form:
384
385 upgraded <token> <transport name>\n
386
387 This line is the literal string ``upgraded``, a space, the token that was
388 specified by the client in its ``upgrade ...`` request line, a space, and the
389 name of the transport protocol that was chosen by the server. The transport
390 name MUST match one of the names the client specified in the ``proto`` field
391 of its ``upgrade ...`` request line.
392
393 If a server issues an ``upgraded`` response, it MUST also read and ignore
394 the lines associated with the ``hello`` and ``between`` command requests
395 that were issued by the client. It is assumed that the negotiated transport
396 will respond with equivalent requested information following the transport
397 handshake.
398
399 All data following the ``\n`` terminating the ``upgraded`` line is the
400 domain of the negotiated transport. It is common for the data immediately
401 following to contain additional metadata about the state of the transport and
402 the server. However, this isn't strictly speaking part of the transport
403 handshake and isn't covered by this section.
404
405 Here is an example handshake between a version 2 aware client and a version
406 2 aware server:
407
408 c: upgrade 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a proto=ssh-v2
409 c: hello\n
410 c: between\n
411 c: pairs 81\n
412 c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
413 s: upgraded 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a ssh-v2\n
414 s: <additional transport specific data>
415
416 The client-issued token that is echoed in the response provides a more
417 resilient mechanism for differentiating *banner* output from Mercurial
418 output. In version 1, properly formatted banner output could get confused
419 for Mercurial server output. By submitting a randomly generated token
420 that is then present in the response, the client can look for that token
421 in response lines and have reasonable certainty that the line did not
422 originate from a *banner* message.
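The token check can be sketched as below. The helper name and its return convention are assumptions made for illustration; the sketch simply skips every line that is not an ``upgraded`` line carrying the client's token, which is how banner output gets ignored.

```python
def find_upgraded_line(lines, token, offered_protos):
    """Scan server output for ``upgraded <token> <transport name>\\n``.

    Returns (index, transport name) for the first matching line, or None
    if the server never acknowledged the upgrade (e.g. a version 1 server).
    """
    prefix = b"upgraded " + token + b" "
    for index, line in enumerate(lines):
        if line.startswith(prefix) and line.endswith(b"\n"):
            name = line[len(prefix):-1]
            # The server MUST pick one of the names the client offered.
            if name not in offered_protos:
                raise ValueError("server chose unoffered transport %r" % name)
            return index, name
    return None
```

Everything after the matching line belongs to the negotiated transport and is not interpreted here.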
423
424 SSH Version 1 Transport
425 -----------------------
426
427 The SSH transport (version 1) is a custom text-based protocol suitable for
428 use over any bi-directional stream transport. It is most commonly used with
429 SSH.
430
431 An SSH transport server can be started with ``hg serve --stdio``. The stdin,
432 stderr, and stdout file descriptors of the started process are used to exchange
433 data. When Mercurial connects to a remote server over SSH, it actually starts
434 a ``hg serve --stdio`` process on the remote server.
435
436 Commands are issued by sending the command name followed by a trailing newline
437 ``\n`` to the server. e.g. ``capabilities\n``.
438
439 Command arguments are sent in the following format::
440
441 <argument> <length>\n<value>
442
443 That is, the argument string name followed by a space followed by the
444 integer length of the value (expressed as a string) followed by a newline
445 (``\n``) followed by the raw argument value.
446
447 Dictionary arguments are encoded differently::
448
449 <argument> <# elements>\n
450 <key1> <length1>\n<value1>
451 <key2> <length2>\n<value2>
452 ...
453
454 Non-argument data is sent immediately after the final argument value. It is
455 encoded in chunks::
456
457 <length>\n<data>
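The argument encodings above can be sketched in a few lines. ``encode_command`` is a hypothetical helper, not part of Mercurial's API; it assumes names and values are already bytes and that lengths are counted in bytes.

```python
def encode_command(name, args):
    """Serialize a version 1 SSH transport command and its arguments."""
    out = [name + b"\n"]
    for key, value in args.items():
        if isinstance(value, dict):
            # Dictionary argument: element count, then each key/value pair.
            out.append(b"%s %d\n" % (key, len(value)))
            for subkey, subvalue in value.items():
                out.append(b"%s %d\n%s" % (subkey, len(subvalue), subvalue))
        else:
            # Plain argument: <argument> <length>\n<value>
            out.append(b"%s %d\n%s" % (key, len(value), value))
    return b"".join(out)
```

Encoding the ``between`` handshake command with the empty range reproduces the ``pairs 81`` line shown in the handshake examples.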
458
459 Each command declares a list of supported arguments and their types. If a
460 client sends an unknown argument to the server, the server should abort
461 immediately. The special argument ``*`` in a command's definition indicates
462 that all argument names are allowed.
463
464 The definition of supported arguments and types is initially made when a
465 new command is implemented. The client and server must initially independently
466 agree on the arguments and their types. This initial set of arguments can be
467 supplemented through the presence of *capabilities* advertised by the server.
468
469 Each command has a defined expected response type.
470
471 A ``string`` response type is a length framed value. The response consists of
472 the string encoded integer length of a value followed by a newline (``\n``)
473 followed by the value. Empty values are allowed (and are represented as
474 ``0\n``).
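Reading a ``string`` response could look like the following sketch. The function name is hypothetical and the stream is assumed to be any binary file-like object, such as the server's stdout pipe.

```python
import io


def read_string_response(stream):
    """Read one length-framed ``string`` response from a binary stream."""
    line = stream.readline()
    if not line.endswith(b"\n"):
        raise ValueError("truncated length line: %r" % line)
    length = int(line.strip())     # string-encoded integer length
    value = stream.read(length)    # the value follows the newline
    if len(value) != length:
        raise ValueError("truncated response value")
    return value
```

For example, the ``between`` handshake reply ``1\n\n`` decodes to a single newline, and ``0\n`` decodes to an empty value.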
475
476 A ``stream`` response type consists of raw bytes of data. There is no framing.
477
478 A generic error response type is also supported. It consists of an error
479 message written to ``stderr`` followed by ``\n-\n``. In addition, ``\n`` is
480 written to ``stdout``.
481
482 If the server receives an unknown command, it will send an empty ``string``
483 response.
484
485 The server terminates if it receives an empty command (a ``\n`` character).
486
487 If the server announces support for the ``protocaps`` capability, the client
488 should issue a ``protocaps`` command after the initial handshake to announce
489 its own capabilities. The client capabilities are persistent.
490
491 SSH Version 2 Transport
492 -----------------------
493
494 **Experimental and under development**
495
496 Version 2 of the SSH transport behaves identically to version 1 of the SSH
497 transport with the exception of handshake semantics. See above for how
498 version 2 of the SSH transport is negotiated.
499
500 Immediately following the ``upgraded`` line signaling a switch to version
501 2 of the SSH protocol, the server automatically sends additional details
502 about the capabilities of the remote server. This has the form:
503
504 <integer length of value>\n
505 capabilities: ...\n
506
507 e.g.
508
509 s: upgraded 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a ssh-v2\n
510 s: 240\n
511 s: capabilities: known getbundle batch ...\n
512
513 Following capabilities advertisement, the peers communicate using version
514 1 of the SSH transport.
515
516 Unified Frame-Based Protocol
517 ============================
518
519 **Experimental and under development**
520
521 The *Unified Frame-Based Protocol* is a communications protocol between
522 Mercurial peers. The protocol aims to be mostly transport agnostic
523 (works similarly on HTTP, SSH, etc).
524
525 To operate the protocol, a bi-directional, half-duplex pipe supporting
526 ordered sends and receives is required. That is, each peer has one pipe
527 for sending data and another for receiving.
528
529 All data is read and written in atomic units called *frames*. These
530 are conceptually similar to TCP packets. Higher-level functionality
531 is built on the exchange and processing of frames.
532
533 All frames are associated with a *stream*. A *stream* provides a
534 unidirectional grouping of frames. Streams facilitate two goals:
535 content encoding and parallelism. There is a dedicated section on
536 streams below.
537
538 The protocol is request-response based: the client issues requests to
539 the server, which issues replies to those requests. Server-initiated
540 messaging is not currently supported, but this specification carves
541 out room to implement it.
542
543 All frames are associated with a numbered request. Frames can thus
544 be logically grouped by their request ID.
545
546 Frames begin with an 8 octet header followed by a variable length
547 payload::
548
549 +------------------------------------------------+
550 | Length (24) |
551 +--------------------------------+---------------+
552 | Request ID (16) | Stream ID (8) |
553 +------------------+-------------+---------------+
554 | Stream Flags (8) |
555 +-----------+------+
556 | Type (4) |
557 +-----------+
558 | Flags (4) |
559 +===========+===================================================|
560 | Frame Payload (0...) ...
561 +---------------------------------------------------------------+
562
563 The length of the frame payload is expressed as an unsigned 24 bit
564 little endian integer. Values larger than 65535 MUST NOT be used unless
565 given permission by the server as part of the negotiated capabilities
566 during the handshake. The frame header is not part of the advertised
567 frame length. The payload length is the over-the-wire length. If there
568 is content encoding applied to the payload as part of the frame's stream,
569 the length is the output of that content encoding, not the input.
570
571 The 16-bit ``Request ID`` field denotes the integer request identifier,
572 stored as an unsigned little endian integer. Odd numbered requests are
573 client-initiated. Even numbered requests are server-initiated. This
574 refers to where the *request* was initiated - not where the *frame* was
575 initiated, so servers will send frames with odd ``Request ID`` in
576 response to client-initiated requests. Implementations are advised to
577 start ordering request identifiers at ``1`` and ``0``, increment by
578 ``2``, and wrap around if all available numbers have been exhausted.
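
The numbering advice above can be sketched as a small generator. This is only an illustration: the name ``request_ids`` is hypothetical, and a real peer would additionally need to skip identifiers that are still active before reusing them after a wrap.

```python
import itertools
from typing import Iterator


def request_ids(first: int) -> Iterator[int]:
    """Yield 16-bit request identifiers, stepping by 2 and wrapping.

    Start at 1 for client-initiated (odd) requests or at 0 for
    server-initiated (even) requests.
    """
    if first not in (0, 1):
        raise ValueError("first must be 0 (server) or 1 (client)")
    rid = first
    while True:
        yield rid
        # Masking to 16 bits preserves parity: 65535 wraps to 1, 65534 to 0.
        rid = (rid + 2) & 0xFFFF


# The first three client-initiated identifiers.
first_three = list(itertools.islice(request_ids(1), 3))
```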
579
580 The 8-bit ``Stream ID`` field denotes the stream that the frame is
581 associated with. Frames belonging to a stream may have content
582 encoding applied and the receiver may need to decode the raw frame
583 payload to obtain the original data. Odd numbered IDs are
584 client-initiated. Even numbered IDs are server-initiated.
585
586 The 8-bit ``Stream Flags`` field defines stream processing semantics.
587 See the section on streams below.
588
589 The 4-bit ``Type`` field denotes the type of frame being sent.
590
591 The 4-bit ``Flags`` field defines special, per-type attributes for
592 the frame.
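
As a rough illustration of the header layout described above, the following sketch packs and parses the 8 octet header. The names ``FrameHeader``, ``pack_frame_header`` and ``parse_frame_header`` are hypothetical. It also assumes that ``Type`` occupies the high 4 bits and ``Flags`` the low 4 bits of the final octet, which is how the diagram orders them; the text itself does not state the nibble order.

```python
import struct
from typing import NamedTuple

HEADER_SIZE = 8  # 24 + 16 + 8 + 8 + 4 + 4 bits


class FrameHeader(NamedTuple):
    length: int        # payload length on the wire (after content encoding)
    request_id: int
    stream_id: int
    stream_flags: int
    frame_type: int
    frame_flags: int


def pack_frame_header(h: FrameHeader) -> bytes:
    if not 0 <= h.length < (1 << 24):
        raise ValueError("length must fit in 24 bits")
    if not (0 <= h.frame_type <= 0x0F and 0 <= h.frame_flags <= 0x0F):
        raise ValueError("type and flags must each fit in 4 bits")
    # Assumption: Type is the high nibble, Flags the low nibble.
    type_flags = (h.frame_type << 4) | h.frame_flags
    return h.length.to_bytes(3, "little") + struct.pack(
        "<HBBB", h.request_id, h.stream_id, h.stream_flags, type_flags
    )


def parse_frame_header(data: bytes) -> FrameHeader:
    if len(data) < HEADER_SIZE:
        raise ValueError("a frame header is 8 octets")
    length = int.from_bytes(data[:3], "little")
    request_id, stream_id, stream_flags, type_flags = struct.unpack_from(
        "<HBBB", data, 3
    )
    return FrameHeader(
        length, request_id, stream_id, stream_flags,
        type_flags >> 4, type_flags & 0x0F,
    )
```

The advertised length excludes the header itself, so a reader would consume ``HEADER_SIZE`` octets, parse them, and then read exactly ``length`` further octets as the payload.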
593
594 The sections below define the frame types and their behavior.
595
596 Command Request (``0x01``)
597 --------------------------
598
599 This frame contains a request to run a command.
600
601 The payload consists of a CBOR map defining the command request. The
602 bytestring keys of that map are:
603
604 name
605 Name of the command that should be executed (bytestring).
606 args
607 Map of bytestring keys to various value types containing the named
608 arguments to this command.
609
610 Each command defines its own set of argument names and their expected
611 types.
612
613 This frame type MUST ONLY be sent from clients to servers: it is illegal
614 for a server to send this frame to a client.
615
616 The following flag values are defined for this type:
617
618 0x01
619 New command request. When set, this frame represents the beginning
620 of a new request to run a command. The ``Request ID`` attached to this
621 frame MUST NOT be active.
622 0x02
623 Command request continuation. When set, this frame is a continuation
624 from a previous command request frame for its ``Request ID``. This
625 flag is set when the CBOR data for a command request does not fit
626 in a single frame.
627 0x04
628 Additional frames expected. When set, the command request didn't fit
629 into a single frame and additional CBOR data follows in a subsequent
630 frame.
631 0x08
632 Command data frames expected. When set, command data frames are
633 expected to follow the final command request frame for this request.
634
635 ``0x01`` MUST be set on the initial command request frame for a
636 ``Request ID``.
637
638 ``0x01`` or ``0x02`` MUST be set to indicate this frame's role in
639 a series of command request frames.
640
641 If command data frames are to be sent, ``0x08`` MUST be set on ALL
642 command request frames.
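
The flag rules above can be sketched as a helper that splits already-encoded CBOR bytes across frames and computes the flags for each one. The function name and the ``max_payload`` parameter are hypothetical, and CBOR encoding is assumed to have happened elsewhere.

```python
from typing import List, Tuple

FLAG_NEW = 0x01            # new command request
FLAG_CONTINUATION = 0x02   # continuation of a previous request frame
FLAG_MORE_FRAMES = 0x04    # additional request frames follow
FLAG_EXPECT_DATA = 0x08    # command data frames follow the request


def command_request_frames(
    cbor_payload: bytes, max_payload: int, expect_data: bool
) -> List[Tuple[int, bytes]]:
    """Return (flags, payload chunk) pairs for one command request."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    chunks = [
        cbor_payload[i:i + max_payload]
        for i in range(0, len(cbor_payload), max_payload)
    ] or [b""]
    frames = []
    for index, chunk in enumerate(chunks):
        # Exactly one of 0x01 / 0x02 marks the frame's role in the series.
        flags = FLAG_NEW if index == 0 else FLAG_CONTINUATION
        if index < len(chunks) - 1:
            flags |= FLAG_MORE_FRAMES
        if expect_data:
            # 0x08 is set on ALL request frames when data will follow.
            flags |= FLAG_EXPECT_DATA
        frames.append((flags, chunk))
    return frames
```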
643
644 Command Data (``0x02``)
645 -----------------------
646
647 This frame contains raw data for a command.
648
649 Most commands can be executed by specifying arguments. However,
650 arguments have an upper bound to their length. For commands that
651 accept data that is beyond this length or whose length isn't known
652 when the command is initially sent, they will need to stream
653 arbitrary data to the server. This frame type facilitates the sending
654 of this data.
655
656 The payload of this frame type consists of a stream of raw data to be
657 consumed by the command handler on the server. The format of the data
658 is command specific.
659
660 The following flag values are defined for this type:
661
662 0x01
663 Command data continuation. When set, the data for this command
664 continues into a subsequent frame.
665
666 0x02
667 End of data. When set, command data has been fully sent to the
668 server. The command has been fully issued and no new data for this
669 command will be sent. The next frame will belong to a new command.
670
671 Command Response Data (``0x03``)
672 --------------------------------
673
674 This frame contains response data to an issued command.
675
676 Response data ALWAYS consists of a series of 1 or more CBOR encoded
677 values. A CBOR value may be using indefinite length encoding. And the
678 bytes constituting the value may span several frames.
679
680 The following flag values are defined for this type:
681
682 0x01
683 Data continuation. When set, an additional frame containing response data
684 will follow.
685 0x02
686 End of data. When set, the response data has been fully sent and
687 no additional frames for this response will be sent.
688
689 The ``0x01`` flag is mutually exclusive with the ``0x02`` flag.
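
Because a CBOR value may span several frames, a receiver would typically buffer payloads until the end-of-data flag arrives before decoding. A minimal sketch, assuming frames for a single request are supplied in order as ``(flags, payload)`` pairs (the function name is hypothetical):

```python
from typing import Iterable, Tuple

FLAG_DATA_CONTINUATION = 0x01
FLAG_END_OF_DATA = 0x02


def collect_response(frames: Iterable[Tuple[int, bytes]]) -> bytes:
    """Concatenate response payloads up to and including the final frame."""
    buffered = bytearray()
    for flags, payload in frames:
        if flags & FLAG_DATA_CONTINUATION and flags & FLAG_END_OF_DATA:
            raise ValueError("0x01 and 0x02 are mutually exclusive")
        buffered += payload
        if flags & FLAG_END_OF_DATA:
            # The result holds 1 or more CBOR values, ready for decoding.
            return bytes(buffered)
    raise ValueError("response ended without an end-of-data frame")
```

A streaming CBOR decoder could consume the buffer incrementally instead; buffering the whole response is simply the easiest thing to show.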
690
691 Error Occurred (``0x05``)
692 -------------------------
693
694 Some kind of error occurred.
695
696 There are 3 general kinds of failures that can occur:
697
698 * Command error encountered before any response issued
699 * Command error encountered after a response was issued
700 * Protocol or stream level error
701
702 This frame type is used to capture the latter cases. (The general
703 command error case is handled by the leading CBOR map in
704 ``Command Response`` frames.)
705
706 The payload of this frame contains a CBOR map detailing the error. That
707 map has the following bytestring keys:
708
709 type
710 (bytestring) The overall type of error encountered. Can be one of the
711 following values:
712
713 protocol
714 A protocol-level error occurred. This typically means someone
715 is violating the framing protocol semantics and the server is
716 refusing to proceed.
717
718 server
719 A server-level error occurred. This typically indicates some kind of
720 logic error on the server, likely the fault of the server.
721
722 command
723 A command-level error, likely the fault of the client.
724
725 message
726 (array of maps) A richly formatted message that is intended for
727 human consumption. See the ``Human Output Side-Channel`` frame
728 section for a description of the format of this data structure.
729
730 Human Output Side-Channel (``0x06``)
731 ------------------------------------
732
733 This frame contains a message that is intended to be displayed to
734 people. Whereas most frames communicate machine readable data, this
735 frame communicates textual data that is intended to be shown to
736 humans.
737
738 The frame consists of a series of *formatting requests*. Each formatting
739 request consists of a formatting string, arguments for that formatting
740 string, and labels to apply to that formatting string.
741
742 A formatting string is a printf()-like string that allows variable
743 substitution within the string. Labels allow the rendered text to be
744 *decorated*. Assuming use of the canonical Mercurial code base, a
745 formatting string can be the input to the ``i18n._`` function. This
746 allows messages emitted from the server to be localized. So even if
747 the server has different i18n settings, people could see messages in
748 their *native* settings. Similarly, the use of labels allows
749 decorations like coloring and underlining to be applied using the
750 client's configured rendering settings.
751
752 Formatting strings are similar to ``printf()`` strings or how
753 Python's ``%`` operator works. The only supported formatting sequences
754 are ``%s`` and ``%%``. ``%s`` will be replaced by whatever the string
755 at that position resolves to. ``%%`` will be replaced by ``%``. All
756 other 2-byte sequences beginning with ``%`` represent a literal
757 ``%`` followed by that character. However, future versions of the
758 wire protocol reserve the right to allow clients to opt in to receiving
759 formatting strings with additional formatters, hence why ``%%`` is
760 required to represent the literal ``%``.
761
762 The frame payload consists of a CBOR array of CBOR maps. Each map
763 defines an *atom* of text data to print. Each *atom* has the following
764 bytestring keys:
765
766 msg
767 (bytestring) The formatting string. Content MUST be ASCII.
768 args (optional)
769 Array of bytestrings defining arguments to the formatting string.
770 labels (optional)
771 Array of bytestrings defining labels to apply to this atom.
772
773 All data to be printed MUST be encoded into a single frame: this frame
774 does not support spanning data across multiple frames.
775
776 All textual data encoded in these frames is assumed to be line delimited.
777 The last atom in the frame SHOULD end with a newline (``\n``). If it
778 doesn't, clients MAY add a newline to facilitate immediate printing.
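
The formatting rules above can be sketched as follows. This is an illustration only: the function names are hypothetical, labels are ignored, localization is skipped, and the atoms are assumed to have been decoded from CBOR into Python dicts with bytestring keys.

```python
from typing import Dict, List, Sequence


def render_format(msg: bytes, args: Sequence[bytes] = ()) -> bytes:
    """Expand %s and %% in msg; every other sequence is kept literally."""
    out = bytearray()
    remaining = list(args)
    i = 0
    while i < len(msg):
        pair = msg[i:i + 2]
        if pair == b"%s":
            if not remaining:
                raise ValueError("not enough arguments for formatting string")
            out += remaining.pop(0)
            i += 2
        elif pair == b"%%":
            out += b"%"
            i += 2
        else:
            # Covers ordinary bytes and a "%" followed by any other
            # character, which stands for a literal "%" plus that character.
            out += msg[i:i + 1]
            i += 1
    return bytes(out)


def render_atoms(atoms: List[Dict[bytes, object]]) -> bytes:
    """Render every atom in a frame and ensure a trailing newline."""
    text = b"".join(
        render_format(atom[b"msg"], atom.get(b"args", ())) for atom in atoms
    )
    # Clients MAY add a newline if the last atom lacks one.
    if not text.endswith(b"\n"):
        text += b"\n"
    return text
```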
779
780 Progress Update (``0x07``)
781 --------------------------
782
783 This frame holds the progress of an operation on the peer. Consumption
784 of these frames allows clients to display progress bars, estimated
785 completion times, etc.
786
787 Each frame defines the progress of a single operation on the peer. The
788 payload consists of a CBOR map with the following bytestring keys:
789
790 topic
791 Topic name (string)
792 pos
793 Current numeric position within the topic (integer)
794 total
795 Total/end numeric position of this topic (unsigned integer)
796 label (optional)
797 Unit label (string)
798 item (optional)
799 Item name (string)
800
801 Progress state is created when a frame is received referencing a
802 *topic* that isn't currently tracked. Progress tracking for that
803 *topic* is finished when a frame is received reporting the current
804 position of that topic as ``-1``.
805
806 Multiple *topics* may be active at any given time.
807
808 Rendering of progress information is not mandated or governed by this
809 specification: implementations MAY render progress information however
810 they see fit, including not at all.
811
812 The string data describing the topic SHOULD be static strings to
813 facilitate receivers localizing that string data. The emitter
814 MUST normalize all string data to valid UTF-8 and receivers SHOULD
815 validate that received data conforms to UTF-8. The topic name
816 SHOULD be ASCII.
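
A receiver's bookkeeping for the topic lifecycle described above might look like the following sketch. The ``ProgressTracker`` class is hypothetical and assumes each frame's CBOR map has already been decoded into a dict with bytestring keys.

```python
from typing import Dict, Tuple


class ProgressTracker:
    """Track active progress topics from decoded Progress Update frames."""

    def __init__(self) -> None:
        # topic name -> (current position, total)
        self.active: Dict[str, Tuple[int, int]] = {}

    def update(self, frame: dict) -> None:
        topic = frame[b"topic"]
        pos = frame[b"pos"]
        if pos == -1:
            # A position of -1 finishes tracking for this topic.
            self.active.pop(topic, None)
        else:
            # An unknown topic implicitly creates new progress state.
            self.active[topic] = (pos, frame[b"total"])
```

Several topics can be held in ``active`` at once, matching the statement that multiple topics may be active at any given time.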
817
818 Stream Encoding Settings (``0x08``)
819 -----------------------------------
820
821 This frame type holds information defining the content encoding
822 settings for a *stream*.
823
824 This frame type is likely consumed by the protocol layer and is not
825 passed on to applications.
826
827 This frame type MUST ONLY occur on frames having the *Beginning of Stream*
828 ``Stream Flag`` set.
829
830 The payload of this frame defines what content encoding has (possibly)
831 been applied to the payloads of subsequent frames in this stream.
832
833 The payload begins with an 8-bit integer defining the length of the
834 encoding *profile*, followed by the string name of that profile, which
835 must be an ASCII string. All bytes that follow can be used by that
836 profile for supplemental settings definitions. See the section below
837 on defined encoding profiles.
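
The payload layout described above (length octet, profile name, then profile-specific bytes) can be split apart as in this sketch. The function name is hypothetical, and since no profiles are defined yet, any profile name used with it is purely illustrative.

```python
from typing import Tuple


def parse_stream_settings(payload: bytes) -> Tuple[str, bytes]:
    """Split a Stream Encoding Settings payload into (profile, settings)."""
    if not payload:
        raise ValueError("payload must start with a length octet")
    name_length = payload[0]
    if len(payload) < 1 + name_length:
        raise ValueError("payload is shorter than the declared profile name")
    # The profile name must be ASCII; decoding raises on anything else.
    profile = payload[1:1 + name_length].decode("ascii")
    # Everything after the name belongs to the profile's own settings.
    return profile, payload[1 + name_length:]
```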
838
839 Stream States and Flags
840 -----------------------
841
842 Streams can be in two states: *open* and *closed*. An *open* stream
843 is active and frames attached to that stream could arrive at any time.
844 A *closed* stream is not active. If a frame attached to a *closed*
845 stream arrives, that frame MUST have an appropriate stream flag
846 set indicating beginning of stream. All streams are in the *closed*
847 state by default.
848
849 The ``Stream Flags`` field denotes a set of bit flags for defining
850 the relationship of this frame within a stream. The following flags
851 are defined:
852
853 0x01
854 Beginning of stream. The first frame in the stream MUST set this
855 flag. When received, the ``Stream ID`` this frame is attached to
856 becomes ``open``.
857
858 0x02
859 End of stream. The last frame in a stream MUST set this flag. When
860 received, the ``Stream ID`` this frame is attached to becomes
861 ``closed``. Any content encoding context associated with this stream
862 can be destroyed after processing the payload of this frame.
863
864 0x04
865 Apply content encoding. When set, any content encoding settings
866 defined by the stream should be applied when attempting to read
867 the frame. When not set, the frame payload isn't encoded.
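
The open/closed transitions above can be sketched as a small state tracker. ``StreamTracker`` is a hypothetical name. Because streams are unidirectional, a peer would presumably keep one tracker per direction. How to treat a beginning-of-stream flag on an already open stream is not specified above, so this sketch simply tolerates it.

```python
STREAM_FLAG_BEGIN = 0x01
STREAM_FLAG_END = 0x02
STREAM_FLAG_ENCODED = 0x04


class StreamTracker:
    """Track which stream IDs are open for one direction of traffic."""

    def __init__(self) -> None:
        self.open_streams = set()  # all streams are closed by default

    def on_frame(self, stream_id: int, stream_flags: int) -> bool:
        """Update state; return True if the payload is content encoded."""
        if stream_id not in self.open_streams:
            # A frame on a closed stream must begin that stream.
            if not stream_flags & STREAM_FLAG_BEGIN:
                raise ValueError("frame received on closed stream %d" % stream_id)
            self.open_streams.add(stream_id)
        if stream_flags & STREAM_FLAG_END:
            # Encoding context may be destroyed once this payload is handled.
            self.open_streams.discard(stream_id)
        return bool(stream_flags & STREAM_FLAG_ENCODED)
```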
868
869 Streams
870 -------
871
872 Streams - along with ``Request IDs`` - facilitate grouping of frames.
873 But the purpose of each is quite different and the groupings they
874 constitute are independent.
875
876 A ``Request ID`` is essentially a tag. It tells you which logical
877 request a frame is associated with.
878
879 A *stream* is a sequence of frames grouped for the express purpose
880 of applying a stateful encoding or for denoting sub-groups of frames.
881
882 Unlike ``Request ID``s which span the request and response, a stream
883 is unidirectional and stream IDs are independent from client to
884 server.
885
886 There is no strict hierarchical relationship between ``Request IDs``
887 and *streams*. A stream can contain frames having multiple
888 ``Request IDs``. Frames belonging to the same ``Request ID`` can
889 span multiple streams.
890
891 One goal of streams is to facilitate content encoding. A stream can
892 define an encoding to be applied to frame payloads. For example, the
893 payload transmitted over the wire may contain output from a
894 zstandard compression operation and the receiving end may decompress
895 that payload to obtain the original data.
896
897 The other goal of streams is to facilitate concurrent execution. For
898 example, a server could spawn 4 threads to service a request that can
899 be easily parallelized. Each of those 4 threads could write into its
900 own stream. Those streams could then in turn be delivered to 4 threads
901 on the receiving end, with each thread consuming its stream in near
902 isolation. The *main* thread on both ends merely does I/O and
903 encodes/decodes frame headers: the bulk of the work is done by worker
904 threads.
905
906 In addition, since content encoding is defined per stream, each
907 *worker thread* could perform potentially CPU bound work concurrently
908 with other threads. This approach of applying encoding at the
909 sub-protocol / stream level eliminates a potential resource constraint
910 on the protocol stream as a whole (it is common for the throughput of
911 a compression engine to be smaller than the throughput of a network).
912
913 Having multiple streams - each with their own encoding settings - also
914 facilitates the use of advanced data compression techniques. For
915 example, a transmitter could see that it is generating data faster
916 or slower than the receiving end is consuming it and adjust its
917 compression settings to trade CPU for compression ratio accordingly.
918
919 While streams can define a content encoding, not all frames within
920 that stream must use that content encoding. This can be useful when
921 data is being served from caches and being derived dynamically. A
922 cache could hold pre-compressed data so the server doesn't have to
923 recompress it. The ability to pick and choose which frames are
924 compressed allows servers to easily send data to the wire without
925 involving potentially expensive encoding overhead.
926
927 Content Encoding Profiles
928 -------------------------
929
930 Streams can have named content encoding *profiles* associated with
931 them. A profile defines a shared understanding of content encoding
932 settings and behavior.
933
934 The following profiles are defined:
935
936 TBD
937
938 Command Protocol
939 ----------------
940
941 A client can request that a remote run a command by sending it
942 frames defining that command. This logical stream is composed of
943 1 or more ``Command Request`` frames and 0 or more ``Command Data``
944 frames.
945
946 All frames composing a single command request MUST be associated with
947 the same ``Request ID``.
948
949 Clients MAY send additional command requests without waiting on the
950 response to a previous command request. If they do so, they MUST ensure
951 that the ``Request ID`` field of outbound frames does not conflict
952 with that of an active ``Request ID`` whose response has not yet been
953 fully received.
954
955 Servers MAY respond to commands in a different order than they were
956 sent over the wire. Clients MUST be prepared to deal with this. Servers
957 also MAY start executing commands in a different order than they were
958 received, or MAY execute multiple commands concurrently.
959
960 If there is a dependency between commands or a race condition between
961 commands executing (e.g. a read-only command that depends on the results
962 of a command that mutates the repository), then clients MUST NOT send
963 frames issuing a command until a response to all dependent commands has
964 been received.
965 TODO think about whether we should express dependencies between commands
966 to avoid roundtrip latency.
967
968 A command is defined by a command name, 0 or more command arguments,
969 and optional command data.
970
971 Arguments are the recommended mechanism for transferring fixed sets of
972 parameters to a command. Data is appropriate for transferring variable
973 data. Thinking in terms of HTTP, arguments would be headers and data
974 would be the message body.
975
976 It is recommended for servers to delay the dispatch of a command
977 until all arguments have been received. Servers MAY impose limits on the
978 maximum argument size.
979 TODO define failure mechanism.
980
981 Servers MAY dispatch to commands immediately once argument data
982 is available or delay until command data is received in full.
983
984 Once a ``Command Request`` frame is sent, a client must be prepared to
985 receive any of the following frames associated with that request:
986 ``Command Response``, ``Error Occurred``, ``Human Output Side-Channel``,
987 ``Progress Update``.
988
989 The *main* response for a command will be in ``Command Response`` frames.
990 The payloads of these frames consist of 1 or more CBOR encoded values.
991 The first CBOR value on the first ``Command Response`` frame is special
992 and denotes the overall status of the command. This CBOR map contains
993 the following bytestring keys:
994
995 status
996 (bytestring) A well-defined message containing the overall status of
997 this command request. The following values are defined:
998
999 ok
1000 The command was received successfully and its response follows.
1001 error
1002 There was an error processing the command. More details about the
1003 error are encoded in the ``error`` key.
1004
1005 error (optional)
1006 A map containing information about an encountered error. The map has the
1007 following keys:
1008
1009 message
1010 (array of maps) A message describing the error. The message uses the
1011 same format as those in the ``Human Output Side-Channel`` frame.
1012
1013 Capabilities
1014 ============
1015
1016 Servers advertise supported wire protocol features. This allows clients to
1017 probe for server features before blindly calling a command or passing a
1018 specific argument.
1019
1020 The server's features are exposed via a *capabilities* string. This is a
1021 space-delimited string of tokens/features. Some features are single words
1022 like ``lookup`` or ``batch``. Others are complicated key-value pairs
1023 advertising sub-features. e.g. ``httpheader=2048``. When complex, non-word
1024 values are used, each feature name can define its own encoding of sub-values.
1025 Comma-delimited and ``x-www-form-urlencoded`` values are common.
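
A client might split the capabilities string into a lookup table along these lines. This is a sketch with a hypothetical function name; it leaves each value as the raw token text, since every feature defines its own sub-value encoding.

```python
from typing import Dict, Optional


def parse_capabilities(caps: str) -> Dict[str, Optional[str]]:
    """Map each capability name to its raw value, or None for bare words."""
    parsed: Dict[str, Optional[str]] = {}
    for token in caps.split(" "):
        if not token:
            continue
        # Only the first "=" separates name from value; the value itself
        # may contain further "=" characters in its own encoding.
        name, sep, value = token.partition("=")
        parsed[name] = value if sep else None
    return parsed
```

With that table, probing for a feature before calling a command reduces to a membership test such as ``"batch" in parsed``.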
1026
1027 The following sections document capabilities defined by the canonical Mercurial server
1028 implementation.
1029
1030 batch
1031 -----
1032
1033 Whether the server supports the ``batch`` command.
1034
1035 This capability/command was introduced in Mercurial 1.9 (released July 2011).
1036
1037 branchmap
1038 ---------
1039
1040 Whether the server supports the ``branchmap`` command.
1041
1042 This capability/command was introduced in Mercurial 1.3 (released July 2009).
1043
1044 bundle2-exp
1045 -----------
1046
1047 Precursor to ``bundle2`` capability that was used before bundle2 was a
1048 stable feature.
1049
1050 This capability was introduced in Mercurial 3.0 behind an experimental
1051 flag. This capability should not be observed in the wild.
1052
1053 bundle2
1054 -------
1055
1056 Indicates whether the server supports the ``bundle2`` data exchange format.
1057
1058 The value of the capability is a URL quoted, newline (``\n``) delimited
1059 list of keys or key-value pairs.
1060
1061 A key is simply a URL encoded string.
1062
1063 A key-value pair is a URL encoded key separated from a URL encoded value by
1064 an ``=``. If the value is a list, elements are delimited by a ``,`` after
1065 URL encoding.
1066
1067 For example, say we have the values::
1068
1069 {'HG20': [], 'changegroup': ['01', '02'], 'digests': ['sha1', 'sha512']}
1070
1071 We would first construct a string::
1072
1073 HG20\nchangegroup=01,02\ndigests=sha1,sha512
1074
1075 We would then URL quote this string::
1076
1077 HG20%0Achangegroup%3D01%2C02%0Adigests%3Dsha1%2Csha512
1078
1079 This capability was introduced in Mercurial 3.4 (released May 2015).
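
The encoding steps above can be reproduced with the standard library. The function name is hypothetical, and sorting the keys is an assumption made here only so that the output is deterministic.

```python
from typing import Dict, List
from urllib.parse import quote


def encode_bundle2_caps(caps: Dict[str, List[str]]) -> str:
    """Encode bundle2 capabilities into the advertised capability value."""
    lines = []
    for key in sorted(caps):  # assumption: sorted for stable output
        entry = quote(key)
        values = [quote(value) for value in caps[key]]
        if values:
            # List elements are joined with "," after URL encoding.
            entry += "=" + ",".join(values)
        lines.append(entry)
    # The newline-delimited list is then URL quoted as a whole.
    return quote("\n".join(lines))
```

Applied to the example values above, this yields the same string as shown in the final step.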
1080
1081 changegroupsubset
1082 -----------------
1083
1084 Whether the server supports the ``changegroupsubset`` command.
1085
1086 This capability was introduced in Mercurial 0.9.2 (released December
1087 2006).
1088
1089 This capability was introduced at the same time as the ``lookup``
1090 capability/command.
1091
1092 compression
1093 -----------
1094
1095 Declares support for negotiating compression formats.
1096
1097 Presence of this capability indicates the server supports dynamic selection
1098 of compression formats based on the client request.
1099
1100 Servers advertising this capability are required to support the
1101 ``application/mercurial-0.2`` media type in response to commands returning
1102 streams. Servers may support this media type on any command.
1103
1104 The value of the capability is a comma-delimited list of strings declaring
1105 supported compression formats. The order of the compression formats is in
1106 server-preferred order, most preferred first.
1107
1108 The identifiers used by the official Mercurial distribution are:
1109
1110 bzip2
1111 bzip2
1112 none
1113 uncompressed / raw data
1114 zlib
1115 zlib (no gzip header)
1116 zstd
1117 zstd
1118
1119 This capability was introduced in Mercurial 4.1 (released February 2017).
1120
1121 getbundle
1122 ---------
1123
1124 Whether the server supports the ``getbundle`` command.
1125
1126 This capability was introduced in Mercurial 1.9 (released July 2011).
1127
1128 httpheader
1129 ----------
1130
1131 Whether the server supports receiving command arguments via HTTP request
1132 headers.
1133
1134 The value of the capability is an integer describing the max header
1135 length that clients should send. Clients should ignore any content after a
1136 comma in the value, as this is reserved for future use.
1137
1138 This capability was introduced in Mercurial 1.9 (released July 2011).
1139
1140 httpmediatype
1141 -------------
1142
1143 Indicates which HTTP media types (``Content-Type`` header) the server is
1144 capable of receiving and sending.
1145
1146 The value of the capability is a comma-delimited list of strings identifying
1147 support for media type and transmission direction. The following strings may
1148 be present:
1149
1150 0.1rx
1151 Indicates server support for receiving ``application/mercurial-0.1`` media
1152 types.
1153
1154 0.1tx
1155 Indicates server support for sending ``application/mercurial-0.1`` media
1156 types.
1157
1158 0.2rx
1159 Indicates server support for receiving ``application/mercurial-0.2`` media
1160 types.
1161
1162 0.2tx
1163 Indicates server support for sending ``application/mercurial-0.2`` media
1164 types.
1165
1166 minrx=X
1167 Minimum media type version the server is capable of receiving. Value is a
1168 string like ``0.2``.
1169
1170 This capability can be used by servers to limit connections from legacy
1171 clients not using the latest supported media type. However, only clients
1172 with knowledge of this capability will know to consult this value. This
1173 capability is present so the client may issue a more user-friendly error
1174 when the server has locked out a legacy client.
1175
1176 mintx=X
1177 Minimum media type version the server is capable of sending. Value is a
1178 string like ``0.1``.
1179
1180 Servers advertising support for the ``application/mercurial-0.2`` media type
1181 should also advertise the ``compression`` capability.
1182
1183 This capability was introduced in Mercurial 4.1 (released February 2017).
1184
1185 httppostargs
1186 ------------
1187
1188 **Experimental**
1189
1190 Indicates that the server supports and prefers clients send command arguments
1191 via an HTTP POST request as part of the request body.
1192
1193 This capability was introduced in Mercurial 3.8 (released May 2016).
1194
1195 known
1196 -----
1197
1198 Whether the server supports the ``known`` command.
1199
1200 This capability/command was introduced in Mercurial 1.9 (released July 2011).
1201
1202 lookup
1203 ------
1204
1205 Whether the server supports the ``lookup`` command.
1206
1207 This capability was introduced in Mercurial 0.9.2 (released December
1208 2006).
1209
1210 This capability was introduced at the same time as the ``changegroupsubset``
1211 capability/command.
1212
1213 partial-pull
1214 ------------
1215
1216 Indicates that the client can deal with partial answers to pull requests
1217 by repeating the request.
1218
1219 If this parameter is not advertised, the server will not send pull bundles.
1220
1221 This client capability was introduced in Mercurial 4.6.
1222
1223 protocaps
1224 ---------
1225
1226 Whether the server supports the ``protocaps`` command for SSH V1 transport.
1227
1228 This capability was introduced in Mercurial 4.6.
1229
1230 pushkey
1231 -------
1232
1233 Whether the server supports the ``pushkey`` and ``listkeys`` commands.
1234
1235 This capability was introduced in Mercurial 1.6 (released July 2010).
1236
1237 standardbundle
1238 --------------
1239
1240 **Unsupported**
1241
1242 This capability was introduced during the Mercurial 0.9.2 development cycle in
1243 2006. It was never present in a release, as it was replaced by the ``unbundle``
1244 capability. This capability should not be encountered in the wild.
1245
1246 stream-preferred
1247 ----------------
1248
1249 If present, the server prefers that clients clone using the streaming clone
1250 protocol (``hg clone --stream``) rather than the standard
1251 changegroup/bundle based protocol.
1252
1253 This capability was introduced in Mercurial 2.2 (released May 2012).
1254
1255 streamreqs
1256 ----------
1257
1258 Indicates whether the server supports *streaming clones* and the *requirements*
1259 that clients must support to receive it.
1260
1261 If present, the server supports the ``stream_out`` command, which transmits
1262 raw revlogs from the repository instead of changegroups. This provides a faster
1263 cloning mechanism at the expense of more bandwidth used.
1264
1265 The value of this capability is a comma-delimited list of repo format
1266 *requirements*. These are requirements that impact the reading of data in
1267 the ``.hg/store`` directory. An example value is
1268 ``streamreqs=generaldelta,revlogv1`` indicating the server repo requires
1269 the ``revlogv1`` and ``generaldelta`` requirements.
1270
1271 If the only format requirement is ``revlogv1``, the server may expose the
1272 ``stream`` capability instead of the ``streamreqs`` capability.
1273
1274 This capability was introduced in Mercurial 1.7 (released November 2010).
1275
1276 stream
1277 ------
1278
1279 Whether the server supports *streaming clones* from ``revlogv1`` repos.
1280
1281 If present, the server supports the ``stream_out`` command, which transmits
1282 raw revlogs from the repository instead of changegroups. This provides a faster
1283 cloning mechanism at the expense of more bandwidth used.
1284
1285 This capability was introduced in Mercurial 0.9.1 (released July 2006).
1286
1287 When initially introduced, the value of the capability was the numeric
1288 revlog revision. e.g. ``stream=1``. This indicates the changegroup is using
1289 ``revlogv1``. This simple integer value wasn't powerful enough, so the
1290 ``streamreqs`` capability was invented to handle cases where the repo
1291 requirements have more than just ``revlogv1``. Newer servers omit the
1292 ``=1`` since it was the only value supported and the value of ``1`` can
1293 be implied by clients.
1294
1295 unbundlehash
1296 ------------
1297
1298 Whether the ``unbundle`` command supports receiving a hash of all the
1299 heads instead of a list.
1300
1301 For more, see the documentation for the ``unbundle`` command.
1302
1303 This capability was introduced in Mercurial 1.9 (released July 2011).
1304
1305 unbundle
1306 --------
1307
1308 Whether the server supports pushing via the ``unbundle`` command.
1309
1310 This capability/command has been present since Mercurial 0.9.1 (released
1311 July 2006).
1312
1313 Mercurial 0.9.2 (released December 2006) added values to the capability
1314 indicating which bundle types the server supports receiving. This value is a
1315 comma-delimited list. e.g. ``HG10GZ,HG10BZ,HG10UN``. The order of values
1316 reflects the priority/preference of that type, where the first value is the
1317 most preferred type.
1318
1319 Content Negotiation
1320 ===================
1321
1322 The wire protocol has some mechanisms to help peers determine what content
1323 types and encoding the other side will accept. Historically, these mechanisms
1324 have been built into commands themselves because most commands only send a
1325 well-defined response type and only certain commands needed to support
1326 functionality like compression.
1327
1328 Currently, only the HTTP version 1 transport supports content negotiation
1329 at the protocol layer.
1330
1331 HTTP requests advertise supported response formats via the ``X-HgProto-<N>``
1332 request header, where ``<N>`` is an integer starting at 1 allowing the logical
1333 value to span multiple headers. This value consists of a list of
1334 space-delimited parameters. Each parameter denotes a feature or capability.
1335
1336 The following parameters are defined:
1337
1338 0.1
1339 Indicates the client supports receiving ``application/mercurial-0.1``
1340 responses.
1341
1342 0.2
1343 Indicates the client supports receiving ``application/mercurial-0.2``
1344 responses.
1345
1346 cbor
1347 Indicates the client supports receiving ``application/mercurial-cbor``
1348 responses.
1349
1350 (Only intended to be used with version 2 transports.)
1351
1352 comp
1353 Indicates compression formats the client can decode. Value is a list of
1354 comma-delimited strings identifying compression formats ordered from
1355 most preferential to least preferential. e.g. ``comp=zstd,zlib,none``.
1356
1357 This parameter does not have an effect if only the ``0.1`` parameter
1358 is defined, as support for ``application/mercurial-0.2`` or greater is
1359 required to use arbitrary compression formats.
1360
1361 If this parameter is not advertised, the server interprets this as
1362 equivalent to ``zlib,none``.
1363
1364 Clients may choose to only send this header if the ``httpmediatype``
1365 server capability is present, as currently all server-side features
1366 consulting this header require the client to opt in to new protocol features
1367 advertised via the ``httpmediatype`` capability.
1368
1369 A server that doesn't receive an ``X-HgProto-<N>`` header should infer a
1370 value of ``0.1``. This is compatible with legacy clients.
1371
1372 A server receiving a request indicating support for multiple media type
1373 versions may respond with any of the supported media types. Not all servers
1374 may support all media types on all commands.
1375
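As an illustration of the negotiation rules above, here is a minimal Python sketch of how a server might reassemble and interpret the ``X-HgProto-<N>`` headers. The function name ``parse_hgproto`` and the plain ``dict`` of headers are hypothetical. The sketch also assumes that header fragments are concatenated without a separator, which the text above does not specify.

```python
def parse_hgproto(headers):
    """Return (parameters, compression) from X-HgProto-<N> request headers.

    ``headers`` is assumed to be a plain dict keyed by exact header name.
    """
    chunks = []
    n = 1
    # <N> starts at 1; stop at the first missing header number.
    while True:
        value = headers.get("X-HgProto-%d" % n)
        if value is None:
            break
        chunks.append(value)
        n += 1

    # A missing header is treated as a value of "0.1" (legacy clients).
    # Assumption: fragments are joined directly, with no separator.
    logical = "".join(chunks) if chunks else "0.1"

    parameters = set()
    # Default when ``comp`` is not advertised.
    compression = ["zlib", "none"]
    for param in logical.split(" "):
        if not param:
            continue
        if param.startswith("comp="):
            compression = param[len("comp="):].split(",")
        else:
            parameters.add(param)
    return parameters, compression
```

A caller would still need to ignore ``compression`` when only ``0.1`` is present, as described above.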
1376 Commands
1377 ========
1378
1379 This section contains a list of all wire protocol commands implemented by
1380 the canonical Mercurial server.
1381
1382 batch
1383 -----
1384
1385 Issue multiple commands while sending a single command request. The purpose
1386 of this command is to allow a client to issue multiple commands while avoiding
1387 multiple round trips to the server, which lets commands complete
1388 more quickly.
1389
1390 The command accepts a ``cmds`` argument that contains a list of commands to
1391 execute.
1392
1393 The value of ``cmds`` is a ``;`` delimited list of strings. Each string has the
1394 form ``<command> <arguments>``. That is, the command name followed by a space
1395 followed by an argument string.
1396
1397 The argument string is a ``,`` delimited list of ``<key>=<value>`` values
1398 corresponding to command arguments. Both the argument name and value are
1399 escaped using a special substitution map::
1400
1401 : -> :c
1402 , -> :o
1403 ; -> :s
1404 = -> :e
1405
1406 The response type for this command is ``string``. The value contains a
1407 ``;`` delimited list of responses for each requested command. Each value
1408 in this list is escaped using the same substitution map used for arguments.
1409
1410 If an error occurs, the generic error response may be sent.
1411
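The substitution map and the ``cmds`` layout described above can be sketched in Python as follows. The helper names (``escape_arg``, ``unescape_arg``, ``encode_batch``, ``decode_batch_response``) are hypothetical, and the sketch works on ``str`` values for readability, whereas a real peer deals in bytes.

```python
# Order matters: ':' is escaped first so that the colons introduced by the
# later substitutions are not escaped a second time.
_ESCAPES = [(":", ":c"), (",", ":o"), (";", ":s"), ("=", ":e")]


def escape_arg(text):
    for raw, escaped in _ESCAPES:
        text = text.replace(raw, escaped)
    return text


def unescape_arg(text):
    # Undo the substitutions in reverse order, restoring ':' last.
    for raw, escaped in reversed(_ESCAPES):
        text = text.replace(escaped, raw)
    return text


def encode_batch(commands):
    """Build the ``cmds`` value from (name, {argument: value}) pairs."""
    parts = []
    for name, args in commands:
        argstr = ",".join(
            "%s=%s" % (escape_arg(key), escape_arg(value))
            for key, value in args.items()
        )
        parts.append("%s %s" % (name, argstr))
    return ";".join(parts)


def decode_batch_response(response):
    """Split a batch response into one unescaped result per command."""
    return [unescape_arg(part) for part in response.split(";")]
```

For example, ``encode_batch([("heads", {}), ("known", {"nodes": "a,b"})])`` yields ``heads ;known nodes=a:ob``.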
1412 between
1413 -------
1414
1415 (Legacy command used for discovery in old clients)
1416
1417 Obtain nodes between pairs of nodes.
1418
1419 The ``pairs`` argument contains a space-delimited list of ``-`` delimited
1420 hex node pairs. e.g.::
1421
1422 a072279d3f7fd3a4aa7ffa1a5af8efc573e1c896-6dc58916e7c070f678682bfe404d2e2d68291a18
1423
1424 Return type is a ``string``. Value consists of lines corresponding to each
1425 requested range. Each line contains a space-delimited list of hex nodes.
1426 A newline ``\n`` terminates each line, including the last one.
1427
1428 branchmap
1429 ---------
1430
1431 Obtain heads in named branches.
1432
1433 Accepts no arguments. Return type is a ``string``.
1434
1435 Return value contains lines with URL encoded branch names followed by a space
1436 followed by a space-delimited list of hex nodes of heads on that branch.
1437 e.g.::
1438
1439 default a072279d3f7fd3a4aa7ffa1a5af8efc573e1c896 6dc58916e7c070f678682bfe404d2e2d68291a18
1440 stable baae3bf31522f41dd5e6d7377d0edd8d1cf3fccc
1441
1442 There is no trailing newline.
1443
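A client could decode this response along the following lines. This is a minimal sketch: the function name ``parse_branchmap`` is hypothetical, and it assumes the response has already been decoded to ``str``.

```python
from urllib.parse import unquote


def parse_branchmap(response):
    """Map each branch name to the list of hex head nodes on that branch."""
    branches = {}
    for line in response.split("\n"):
        if not line:
            continue
        # First field is the URL encoded branch name, the rest are heads.
        name, *heads = line.split(" ")
        branches[unquote(name)] = heads
    return branches
```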
1444 branches
1445 --------
1446
1447 (Legacy command used for discovery in old clients. Clients with ``getbundle``
1448 use the ``known`` and ``heads`` commands instead.)
1449
1450 Obtain ancestor changesets of specific nodes back to a branch point.
1451
1452 Despite the name, this command has nothing to do with Mercurial named branches.
1453 Instead, it is related to DAG branches.
1454
1455 The command accepts a ``nodes`` argument, which is a string of space-delimited
1456 hex nodes.
1457
1458 For each node requested, the server will find the first ancestor node that is
1459 a DAG root or is a merge.
1460
1461 Return type is a ``string``. Return value contains lines with result data for
1462 each requested node. Each line contains space-delimited nodes followed by a
1463 newline (``\n``). The 4 nodes reported on each line correspond to the requested
1464 node, the ancestor node found, and its 2 parent nodes (which may be the null
1465 node).
1466
1467 capabilities
1468 ------------
1469
1470 Obtain the capabilities string for the repo.
1471
1472 Unlike the ``hello`` command, the capabilities string is not prefixed.
1473 There is no trailing newline.
1474
1475 This command does not accept any arguments. Return type is a ``string``.
1476
1477 This command was introduced in Mercurial 0.9.1 (released July 2006).
1478
1479 changegroup
1480 -----------
1481
1482 (Legacy command: use ``getbundle`` instead)
1483
1484 Obtain a changegroup version 1 with data for changesets that are
1485 descendants of client-specified changesets.
1486
1487 The ``roots`` argument contains a list of space-delimited hex nodes.
1488
1489 The server responds with a changegroup version 1 containing all
1490 changesets between the requested root/base nodes and the repo's head nodes
1491 at the time of the request.
1492
1493 The return type is a ``stream``.
1494
1495 changegroupsubset
1496 -----------------
1497
1498 (Legacy command: use ``getbundle`` instead)
1499
1500 Obtain a changegroup version 1 with data for changesets between
1501 client-specified base and head nodes.
1502
1503 The ``bases`` argument contains a list of space-delimited hex nodes.
1504 The ``heads`` argument contains a list of space-delimited hex nodes.
1505
1506 The server responds with a changegroup version 1 containing all
1507 changesets between the requested base and head nodes at the time of the
1508 request.
1509
1510 The return type is a ``stream``.
1511
1512 clonebundles
1513 ------------
1514
1515 Obtains a manifest of bundle URLs available to seed clones.
1516
1517 Each returned line contains a URL followed by metadata. See the
1518 documentation in the ``clonebundles`` extension for more.
1519
1520 The return type is a ``string``.
1521
1522 getbundle
1523 ---------
1524
1525 Obtain a bundle containing repository data.
1526
1527 This command accepts the following arguments:
1528
1529 heads
1530 List of space-delimited hex nodes of heads to retrieve.
1531 common
1532 List of space-delimited hex nodes that the client has in common with the
1533 server.
1534 obsmarkers
1535 Boolean indicating whether to include obsolescence markers as part
1536 of the response. Only works with bundle2.
1537 bundlecaps
1538 Comma-delimited set of strings defining client bundle capabilities.
1539 listkeys
1540 Comma-delimited list of strings of ``pushkey`` namespaces. For each
1541 namespace listed, a bundle2 part will be included with the content of
1542 that namespace.
1543 cg
1544 Boolean indicating whether changegroup data is requested.
1545 cbattempted
1546 Boolean indicating whether the client attempted to use the *clone bundles*
1547 feature before performing this request.
1548 bookmarks
1549 Boolean indicating whether bookmark data is requested.
1550 phases
1551 Boolean indicating whether phases data is requested.
1552
1553 The return type on success is a ``stream`` where the value is a bundle.
1554 On the HTTP version 1 transport, the response is zlib compressed.
1555
1556 If an error occurs, a generic error response can be sent.
1557
1558 Unless the client sends a false value for the ``cg`` argument, the returned
1559 bundle contains a changegroup with the nodes between the specified ``common``
1560 and ``heads`` nodes. Depending on the command arguments, the type and content
1561 of the returned bundle can vary significantly.
1562
1563 The default behavior is for the server to send a raw changegroup version
1564 ``01`` response.
1565
1566 If the ``bundlecaps`` provided by the client contain a value beginning
1567 with ``HG2``, a bundle2 will be returned. The bundle2 data may contain
1568 additional repository data, such as ``pushkey`` namespace values.
1569
1570 heads
1571 -----
1572
1573 Returns a list of space-delimited hex nodes of repository heads followed
1574 by a newline. e.g.
1575 ``a9eeb3adc7ddb5006c088e9eda61791c777cbf7c 31f91a3da534dc849f0d6bfc00a395a97cf218a1\n``
1576
1577 This command does not accept any arguments. The return type is a ``string``.
1578
1579 hello
1580 -----
1581
1582 Returns lines describing interesting things about the server in an RFC-822
1583 like format.
1584
1585 Currently, the only line defines the server capabilities. It has the form::
1586
1587 capabilities: <value>
1588
1589 See above for more about the capabilities string.
1590
1591 SSH clients typically issue this command as soon as a connection is
1592 established.
1593
1594 This command does not accept any arguments. The return type is a ``string``.
1595
1596 This command was introduced in Mercurial 0.9.1 (released July 2006).
1597
1598 listkeys
1599 --------
1600
1601 List values in a specified ``pushkey`` namespace.
1602
1603 The ``namespace`` argument defines the pushkey namespace to operate on.
1604
1605 The return type is a ``string``. The value is an encoded dictionary of keys.
1606
1607 Key-value pairs are delimited by newlines (``\n``). Within each line, keys and
1608 values are separated by a tab (``\t``). Keys and values are both strings.
1609
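The encoded dictionary can be decoded with a few lines of Python. This is a sketch only: ``decode_listkeys`` is a hypothetical name, and the response is assumed to be a ``str``.

```python
def decode_listkeys(response):
    """Decode a ``listkeys`` response into a dict of key -> value."""
    result = {}
    for line in response.split("\n"):
        if not line:
            continue
        # Split on the first tab only, so values may themselves contain tabs.
        key, value = line.split("\t", 1)
        result[key] = value
    return result
```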
1610 lookup
1611 ------
1612
1613 Try to resolve a value to a known repository revision.
1614
1615 The ``key`` argument is converted from bytes to an
1616 ``encoding.localstr`` instance then passed into
1617 ``localrepository.__getitem__`` in an attempt to resolve it.
1618
1619 The return type is a ``string``.
1620
1621 Upon successful resolution, returns ``1 <hex node>\n``. On failure,
1622 returns ``0 <error string>\n``. e.g.::
1623
1624 1 273ce12ad8f155317b2c078ec75a4eba507f1fba\n
1625
1626 0 unknown revision 'foo'\n
1627
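A client-side reading of the two response shapes might look like this sketch. ``parse_lookup`` is a hypothetical helper, and raising ``LookupError`` on failure is a choice made for the example, not something the protocol mandates.

```python
def parse_lookup(response):
    """Return the hex node on success, or raise LookupError on failure."""
    # Drop the trailing newline, then split the status flag from the rest.
    status, _, rest = response.rstrip("\n").partition(" ")
    if status == "1":
        return rest
    raise LookupError(rest)
```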
1628 known
1629 -----
1630
1631 Determine whether multiple nodes are known.
1632
1633 The ``nodes`` argument is a list of space-delimited hex nodes to check
1634 for existence.
1635
1636 The return type is ``string``.
1637
1638 Returns a string consisting of ``0`` and ``1`` characters indicating whether nodes
1639 are known. If the Nth node specified in the ``nodes`` argument is known,
1640 a ``1`` will be returned at byte offset N. If the node isn't known, ``0``
1641 will be present at byte offset N.
1642
1643 There is no trailing newline.
1644
1645 protocaps
1646 ---------
1647
1648 Notify the server about the client capabilities in the SSH V1 transport
1649 protocol.
1650
1651 The ``caps`` argument is a space-delimited list of capabilities.
1652
1653 The server will reply with the string ``OK``.
1654
1655 pushkey
1656 -------
1657
1658 Set a value using the ``pushkey`` protocol.
1659
1660 Accepts arguments ``namespace``, ``key``, ``old``, and ``new``, which
1661 correspond to the pushkey namespace to operate on, the key within that
1662 namespace to change, the old value (which may be empty), and the new value.
1663 All arguments are string types.
1664
1665 The return type is a ``string``. The value depends on the transport protocol.
1666
1667 The SSH version 1 transport sends a string encoded integer followed by a
1668 newline (``\n``) which indicates operation result. The server may send
1669 additional output on the ``stderr`` stream that should be displayed to the
1670 user.
1671
1672 The HTTP version 1 transport sends a string encoded integer followed by a
1673 newline followed by additional server output that should be displayed to
1674 the user. This may include output from hooks, etc.
1675
1676 The integer result varies by namespace. ``0`` means an error has occurred
1677 and there should be additional output to display to the user.
1678
1679 stream_out
1680 ----------
1681
1682 Obtain *streaming clone* data.
1683
1684 The return type is either a ``string`` or a ``stream``, depending on
1685 whether the request was fulfilled properly.
1686
1687 A return value of ``1\n`` indicates the server is not configured to serve
1688 this data. If this is seen by the client, they may not have verified the
1689 ``stream`` capability is set before making the request.
1690
1691 A return value of ``2\n`` indicates the server was unable to lock the
1692 repository to generate data.
1693
1694 All other responses are a ``stream`` of bytes. The first line of this data
1695 contains 2 space-delimited integers corresponding to the path count and
1696 payload size, respectively::
1697
1698 <path count> <payload size>\n
1699
1700 The ``<payload size>`` is the total size of path data: it does not include
1701 the size of the per-path header lines.
1702
1703 Following that header are ``<path count>`` entries. Each entry consists of a
1704 line with metadata followed by raw revlog data. The line consists of::
1705
1706 <store path>\0<size>\n
1707
1708 The ``<store path>`` is the encoded store path of the data that follows.
1709 ``<size>`` is the amount of data for this store path/revlog that follows the
1710 newline.
1711
1712 There is no trailer to indicate end of data. Instead, the client should stop
1713 reading after ``<path count>`` entries are consumed.
1714
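The framing described above can be consumed as in the following sketch. It assumes the response is available as a binary file-like object, and the name ``read_stream_out`` and the use of ``RuntimeError`` for the ``1\n`` and ``2\n`` replies are choices made for this example. For brevity it keeps each entry in memory, which a real client would likely avoid for large repositories.

```python
def read_stream_out(fh):
    """Read a ``stream_out`` response from a binary file object.

    Returns (payload_size, [(store_path, data), ...]).
    """
    first = fh.readline()
    if first == b"1\n":
        raise RuntimeError("server is not configured to serve this data")
    if first == b"2\n":
        raise RuntimeError("server was unable to lock the repository")

    # Header line: "<path count> <payload size>\n"
    count_field, size_field = first.rstrip(b"\n").split(b" ")
    path_count, payload_size = int(count_field), int(size_field)

    entries = []
    # There is no trailer: read exactly <path count> entries and stop.
    for _ in range(path_count):
        # Per-path line: "<store path>\0<size>\n"
        path, size_field = fh.readline().rstrip(b"\n").split(b"\0", 1)
        size = int(size_field)
        data = fh.read(size)
        if len(data) != size:
            raise EOFError("truncated data for %r" % path)
        entries.append((path, data))
    return payload_size, entries
```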
1715 unbundle
1716 --------
1717
1718 Send a bundle containing data (usually changegroup data) to the server.
1719
1720 Accepts the argument ``heads``, which is a space-delimited list of hex nodes
1721 corresponding to server repository heads observed by the client. This is used
1722 to detect race conditions and abort push operations before a server performs
1723 too much work or a client transfers too much data.
1724
1725 The request payload consists of a bundle to be applied to the repository,
1726 as if :hg:`unbundle` were called.
1727
1728 In most scenarios, a special ``push response`` type is returned. This type
1729 contains an integer describing the change in heads as a result of the
1730 operation. A value of ``0`` indicates nothing changed. ``1`` means the number
1731 of heads remained the same. Values ``2`` and larger indicate the number of
1732 added heads minus 1. e.g. ``3`` means 2 heads were added. Negative values
1733 indicate the number of fewer heads, also off by 1. e.g. ``-2`` means there
1734 is 1 fewer head.
1735
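The off-by-one encoding of the integer can be made concrete with a small helper. This is a sketch, and ``head_delta`` is a hypothetical name. The value ``-1`` is not covered by the description above, so the sketch simply applies the same arithmetic to it.

```python
def head_delta(result):
    """Translate a push response integer into a change in head count.

    Returns None when nothing changed, otherwise the signed number of
    heads added (positive) or removed (negative).
    """
    if result == 0:
        return None
    if result > 0:
        # 1 -> same number of heads, 2 -> one head added, and so on.
        return result - 1
    # -2 -> one fewer head, -3 -> two fewer heads, and so on.
    return result + 1
```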
1736 The encoding of the ``push response`` type varies by transport.
1737
1738 For the SSH version 1 transport, this type is composed of 2 ``string``
1739 responses: an empty response (``0\n``) followed by the integer result value.
1740 e.g. ``1\n2``. So the full response might be ``0\n1\n2``.
1741
1742 For the HTTP version 1 transport, the response is a ``string`` type composed
1743 of an integer result value followed by a newline (``\n``) followed by string
1744 content holding server output that should be displayed on the client (output
1745 from hooks, etc.).
1746
1747 In some cases, the server may respond with a ``bundle2`` bundle. In this
1748 case, the response type is ``stream``. For the HTTP version 1 transport, the
1749 response is zlib compressed.
1750
1751 The server may also respond with a generic error type, which contains a string
1752 indicating the failure.
1753
1754 Frame-Based Protocol Commands
1755 =============================
1756
3
1757 **Experimental and under active development**
4 **Experimental and under active development**
1758
5
@@ -1768,8 +15,20 b' types.'
1768 The response to many commands is also CBOR. There is no common response
15 The response to many commands is also CBOR. There is no common response
1769 format: each command defines its own response format.
16 format: each command defines its own response format.
1770
17
1771 TODO require node type be specified, as N bytes of binary node value
18 TODOs
1772 could be ambiguous once SHA-1 is replaced.
19 =====
20
21 * Add "node namespace" support to each command. In order to support
22 SHA-1 hash transition, we want servers to be able to expose different
23 "node namespaces" for the same data. Every command operating on nodes
24 should specify which "node namespace" it is operating on and responses
25 should encode the "node namespace" accordingly.
26
27 Commands
28 ========
29
30 The sections below detail all commands available to wire protocol version
31 2.
1773
32
1774 branchmap
33 branchmap
1775 ---------
34 ---------