wireprotocol.txt
Gregory Szorc

The Mercurial wire protocol is a request-response based protocol
with multiple wire representations.

Each request is modeled as a command name, a dictionary of arguments, and
optional raw input. Command arguments and their types are intrinsic
properties of commands. So is the response type of the command. This means
clients can't always send arbitrary arguments to servers and servers can't
return multiple response types.

The protocol is synchronous and does not support multiplexing (concurrent
commands).

Handshake
=========

It is required or common for clients to perform a *handshake* when connecting
to a server. The handshake serves the following purposes:

* Negotiating protocol/transport level options
* Allowing the client to learn about server capabilities to influence
  future requests
* Ensuring the underlying transport channel is in a *clean* state

An important goal of the handshake is to allow clients to use more modern
wire protocol features. By default, clients must assume they are talking
to an old version of Mercurial server (possibly even the very first
implementation). So, clients should not attempt to call or utilize modern
wire protocol features until they have confirmation that the server
supports them. The handshake implementation is designed to allow both
ends to utilize the latest set of features and capabilities with as
few round trips as possible.

The handshake mechanism varies by transport and protocol and is documented
in the sections below.

HTTP Protocol
=============

Handshake
---------

The client sends a ``capabilities`` command request (``?cmd=capabilities``)
as soon as HTTP requests may be issued.

By default, the server responds with a version 1 capabilities string, which
the client parses to learn about the server's abilities. The ``Content-Type``
for this response is ``application/mercurial-0.1`` or
``application/mercurial-0.2`` depending on whether the client advertised
support for version ``0.2`` in its request. (Clients aren't supposed to
advertise support for ``0.2`` until the capabilities response indicates
the server's support for that media type. However, a client could
conceivably cache this metadata and issue the capabilities request in such
a way as to elicit an ``application/mercurial-0.2`` response.)

Clients wishing to switch to a newer API service may send an
``X-HgUpgrade-<X>`` header containing a space-delimited list of API service
names the client is capable of speaking. The request MUST also include an
``X-HgProto-<X>`` header advertising a known serialization format for the
response. ``cbor`` is currently the only defined serialization format.

If the request contains these headers, the response ``Content-Type`` MAY
be for a different media type. e.g. ``application/mercurial-cbor`` if the
client advertises support for CBOR.

The response MUST be deserializable to a map with the following keys:

apibase
   URL path to API services, relative to the repository root. e.g. ``api/``.

apis
   A map of API service names to API descriptors. An API descriptor contains
   more details about that API. In the case of the HTTP Version 2 Transport,
   it will be the normal response to a ``capabilities`` command.

   Only the services advertised by the client that are also available on
   the server are advertised.

v1capabilities
   The capabilities string that would be returned by a version 1 response.

The client can then inspect the server-advertised APIs and decide which
API to use, including continuing to use the HTTP Version 1 Transport.

HTTP Version 1 Transport
------------------------

Commands are issued as HTTP/1.0 or HTTP/1.1 requests. Commands are
sent to the base URL of the repository with the command name sent in
the ``cmd`` query string parameter. e.g.
``https://example.com/repo?cmd=capabilities``. The HTTP method is ``GET``
or ``POST`` depending on the command and whether there is a request
body.

Command arguments can be sent multiple ways.

The simplest is part of the URL query string using ``x-www-form-urlencoded``
encoding (see Python's ``urllib.urlencode()``). However, many servers impose
length limitations on the URL. So this mechanism is typically only used if
the server doesn't support other mechanisms.

If the server supports the ``httpheader`` capability, command arguments can
be sent in HTTP request headers named ``X-HgArg-<N>`` where ``<N>`` is an
integer starting at 1. An ``x-www-form-urlencoded`` representation of the
arguments is obtained. This full string is then split into chunks and sent
in numbered ``X-HgArg-<N>`` headers. The maximum length of each HTTP header
is defined by the server in the ``httpheader`` capability value, which defaults
to ``1024``. The server reassembles the encoded arguments string by
concatenating the ``X-HgArg-<N>`` headers then URL decodes them into a
dictionary.
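
The header-based argument encoding described above can be sketched in Python
as follows. This is a minimal illustration, not Mercurial's implementation:
the function names are hypothetical, and it assumes the ``httpheader`` limit
applies to each header value alone (a real client may also need to account
for the length of the header name).

```python
from urllib.parse import parse_qs, quote, urlencode


def encode_args_as_headers(args, max_value_len=1024):
    """Split URL-encoded arguments across numbered X-HgArg-<N> headers.

    Assumption: max_value_len bounds the header value only.
    """
    encoded = urlencode(sorted(args.items()), quote_via=quote)
    headers = []
    starts = range(0, len(encoded), max_value_len)
    for n, start in enumerate(starts, start=1):
        headers.append(("X-HgArg-%d" % n, encoded[start:start + max_value_len]))
    return headers


def decode_args_from_headers(headers):
    """Reassemble the encoded string in header order, then URL decode it."""
    numbered = sorted(
        (int(name.rsplit("-", 1)[1]), value)
        for name, value in headers
        if name.lower().startswith("x-hgarg-")
    )
    encoded = "".join(value for _, value in numbered)
    parsed = parse_qs(encoded, keep_blank_values=True)
    # Each argument name is assumed to appear once.
    return {key: values[0] for key, values in parsed.items()}
```

Chunks may end in the middle of a percent escape. That is harmless because
the receiver concatenates all header values before decoding.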

The list of ``X-HgArg-<N>`` headers should be added to the ``Vary`` request
header to instruct caches to take these headers into consideration when caching
requests.

If the server supports the ``httppostargs`` capability, the client
may send command arguments in the HTTP request body as part of an
HTTP POST request. The command arguments will be URL encoded just like
they would for sending them via HTTP headers. However, no splitting is
performed: the raw arguments are included in the HTTP request body.

The client sends an ``X-HgArgs-Post`` header with the string length of the
encoded arguments data. Additional data may be included in the HTTP
request body immediately following the argument data. The offset of the
non-argument data is defined by the ``X-HgArgs-Post`` header. The
``X-HgArgs-Post`` header is not required if there is no argument data.
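
The ``httppostargs`` body layout described above can be sketched in Python as
follows. The helper names are hypothetical, and the length is assumed to be
counted in bytes of the ASCII-encoded argument string.

```python
from urllib.parse import parse_qs, quote, urlencode


def build_post_request(args, data=b""):
    """Return (headers, body) with URL-encoded args followed by raw data."""
    encoded = urlencode(sorted(args.items()), quote_via=quote).encode("ascii")
    headers = {}
    if encoded:
        # The header is only needed when argument data is present.
        headers["X-HgArgs-Post"] = str(len(encoded))
    return headers, encoded + data


def split_post_body(headers, body):
    """Server side: use X-HgArgs-Post as the offset of the non-argument data."""
    arg_len = int(headers.get("X-HgArgs-Post", "0"))
    raw_args, data = body[:arg_len], body[arg_len:]
    parsed = parse_qs(raw_args.decode("ascii"), keep_blank_values=True)
    # Each argument name is assumed to appear once.
    return {key: values[0] for key, values in parsed.items()}, data
```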

Additional command data can be sent as part of the HTTP request body. The
default ``Content-Type`` when sending data is ``application/mercurial-0.1``.
A ``Content-Length`` header is currently always sent.

Example HTTP requests::

    GET /repo?cmd=capabilities
    X-HgArg-1: foo=bar&baz=hello%20world

The request media type should be chosen based on server support. If the
``httpmediatype`` server capability is present, the client should send
the newest mutually supported media type. If this capability is absent,
the client must assume the server only supports the
``application/mercurial-0.1`` media type.

The ``Content-Type`` HTTP response header identifies the response as coming
from Mercurial and can also be used to signal an error has occurred.

The ``application/mercurial-*`` media types indicate a generic Mercurial
data type.

The ``application/mercurial-0.1`` media type is raw Mercurial data. It is the
predecessor of the format below.

The ``application/mercurial-0.2`` media type is compression framed Mercurial
data. The first byte of the payload indicates the length of the compression
format identifier that follows. Next are N bytes indicating the compression
format. e.g. ``zlib``. The remaining bytes are compressed according to that
compression format. The decompressed data behaves the same as with
``application/mercurial-0.1``.
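
The ``application/mercurial-0.2`` framing can be illustrated with a short
Python sketch. The function names are hypothetical and only ``zlib`` is
handled; it is assumed here that the ``zlib`` identifier denotes a standard
zlib stream as produced by Python's ``zlib`` module.

```python
import zlib


def encode_v02(data, engine=b"zlib"):
    """Frame data as: 1 length byte, engine name, compressed bytes."""
    if engine != b"zlib":
        raise ValueError("only zlib is handled in this sketch")
    return bytes([len(engine)]) + engine + zlib.compress(data)


def decode_v02(payload):
    """Read the compression identifier, then decompress the remainder."""
    if not payload:
        raise ValueError("empty payload")
    name_len = payload[0]
    name = payload[1:1 + name_len]
    if len(name) != name_len:
        raise ValueError("truncated compression identifier")
    body = payload[1 + name_len:]
    if name == b"zlib":
        return zlib.decompress(body)
    raise ValueError("unsupported compression format: %r" % name)
```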

The ``application/hg-error`` media type indicates a generic error occurred.
The content of the HTTP response body typically holds text describing the
error.

The ``application/mercurial-cbor`` media type indicates a CBOR payload
and should be interpreted as identical to ``application/cbor``.

Behavior of media types is further described in the ``Content Negotiation``
section below.

Clients should issue a ``User-Agent`` request header that identifies the client.
The server should not use the ``User-Agent`` for feature detection.

A command returning a ``string`` response issues an
``application/mercurial-0.*`` media type and the HTTP response body contains
the raw string value (after compression decoding, if used). A
``Content-Length`` header is typically issued, but not required.

A command returning a ``stream`` response issues an
``application/mercurial-0.*`` media type and the HTTP response is typically
using *chunked transfer* (``Transfer-Encoding: chunked``).

HTTP Version 2 Transport
------------------------

**Experimental - feature under active development**

Version 2 of the HTTP protocol is exposed under the ``/api/*`` URL space.
Its final API name is not yet formalized.

Commands are triggered by sending HTTP POST requests against URLs of the
form ``<permission>/<command>``, where ``<permission>`` is ``ro`` or
``rw``, meaning read-only and read-write, respectively, and ``<command>``
is a named wire protocol command.

Non-POST request methods MUST be rejected by the server with an HTTP
405 response.

Commands that modify repository state in meaningful ways MUST NOT be
exposed under the ``ro`` URL prefix. All available commands MUST be
available under the ``rw`` URL prefix.

Server administrators MAY implement blanket HTTP authentication keyed
off the URL prefix. For example, a server may require authentication
for all ``rw/*`` URLs and let unauthenticated requests to ``ro/*``
URLs proceed. A server MAY issue an HTTP 401, 403, or 407 response
in accordance with RFC 7235. Clients SHOULD recognize the HTTP Basic
(RFC 7617) and Digest (RFC 7616) authentication schemes. Clients SHOULD
make an attempt to recognize unknown schemes using the
``WWW-Authenticate`` response header on a 401 response, as defined by
RFC 7235.

Read-only commands are accessible under ``rw/*`` URLs so clients can
signal the intent of the operation very early in the connection
lifecycle. For example, a ``push`` operation - which consists of
various read-only commands mixed with at least one read-write command -
can perform all commands against ``rw/*`` URLs so that any server-side
authentication requirements are discovered upon attempting the first
command - not potentially several commands into the exchange. This
allows clients to fail faster or prompt for credentials as soon as the
exchange takes place. This provides a better end-user experience.

Requests to unknown commands or URLs result in an HTTP 404.

TODO formally define response type, how error is communicated, etc.

HTTP request and response bodies use the *Unified Frame-Based Protocol*
(defined below) for media exchange. The entirety of the HTTP message
body is 0 or more frames as defined by this protocol.

Clients and servers MUST advertise the ``TBD`` media type via the
``Content-Type`` request and response headers. In addition, clients MUST
advertise this media type value in their ``Accept`` request header in all
requests.

TODO finalize the media type. For now, it is defined in wireprotoserver.py.

Servers receiving requests without an ``Accept`` header SHOULD respond with
an HTTP 406.

Servers receiving requests with an invalid ``Content-Type`` header SHOULD
respond with an HTTP 415.

The command to run is specified in the POST payload as defined by the
*Unified Frame-Based Protocol*. This is redundant with data already
encoded in the URL. This is by design, so server operators can have
better understanding about server activity from looking merely at
HTTP access logs.

In most circumstances, the command specified in the URL MUST match
the command specified in the frame-based payload or the server will
respond with an error. The exception to this is the special
``multirequest`` URL. (See below.) In addition, HTTP requests
are limited to one command invocation. The exception is the special
``multirequest`` URL.

The ``multirequest`` command endpoints (``ro/multirequest`` and
``rw/multirequest``) are special in that they allow the execution of
*any* command and allow the execution of multiple commands. If the
HTTP request issues multiple commands across multiple frames, all
issued commands will be processed by the server. Per the defined
behavior of the *Unified Frame-Based Protocol*, commands may be
issued interleaved and responses may come back in a different order
than they were issued. Clients MUST be able to deal with this.

SSH Protocol
============

Handshake
---------

For all clients, the handshake consists of the client sending 1 or more
commands to the server using version 1 of the transport. Servers respond
to commands they know how to respond to and send an empty response (``0\n``)
for unknown commands (per standard behavior of version 1 of the transport).
Clients then typically look for a response to the newest sent command to
determine which transport version to use and what the available features for
the connection and server are.

Preceding any response from client-issued commands, the server may print
non-protocol output. It is common for SSH servers to print banners, message
of the day announcements, etc. when clients connect. It is assumed that any
such *banner* output will precede any Mercurial server output. So clients
must be prepared to handle server output on initial connect that isn't
in response to any client-issued command and doesn't conform to Mercurial's
wire protocol. This *banner* output should only be on stdout. However,
some servers may send output on stderr.

Pre 0.9.1 clients issue a ``between`` command with the ``pairs`` argument
having the value
``0000000000000000000000000000000000000000-0000000000000000000000000000000000000000``.

The ``between`` command has been supported since the original Mercurial
SSH server. Requesting the empty range will return a ``\n`` string response,
which will be encoded as ``1\n\n`` (value length of ``1`` followed by a newline
followed by the value, which happens to be a newline).

For pre 0.9.1 clients and all servers, the exchange looks like::

    c: between\n
    c: pairs 81\n
    c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
    s: 1\n
    s: \n

0.9.1+ clients send a ``hello`` command (with no arguments) before the
``between`` command. The response to this command allows clients to
discover server capabilities and settings.

An example exchange between 0.9.1+ clients and a ``hello`` aware server looks
like::

    c: hello\n
    c: between\n
    c: pairs 81\n
    c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
    s: 324\n
    s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
    s: 1\n
    s: \n

And a similar scenario but with servers sending a banner on connect::

    c: hello\n
    c: between\n
    c: pairs 81\n
    c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
    s: welcome to the server\n
    s: if you find any issues, email someone@somewhere.com\n
    s: 324\n
    s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
    s: 1\n
    s: \n

Note that output from the ``hello`` command is terminated by a ``\n``. This is
part of the response payload and not part of the wire protocol adding a newline
after responses. In other words, the length of the response contains the
trailing ``\n``.

Clients supporting version 2 of the SSH transport send a line beginning
with ``upgrade`` before the ``hello`` and ``between`` commands. The line
(which isn't a well-formed command line because it doesn't consist of a
single command name) serves to both communicate the client's intent to
switch to transport version 2 (transports are version 1 by default) as
well as to advertise the client's transport-level capabilities so the
server may satisfy that request immediately.

The upgrade line has the form:

    upgrade <token> <transport capabilities>

That is the literal string ``upgrade`` followed by a space, followed by
a randomly generated string, followed by a space, followed by a string
denoting the client's transport capabilities.

The token can be anything. However, a random UUID is recommended. (Use
of version 4 UUIDs is recommended because version 1 UUIDs can leak the
client's MAC address.)

The transport capabilities string is a URL/percent encoded string
containing key-value pairs defining the client's transport-level
capabilities. The following capabilities are defined:

proto
   A comma-delimited list of transport protocol versions the client
   supports. e.g. ``ssh-v2``.
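
Constructing the ``upgrade`` line can be sketched in Python as follows. The
function name is hypothetical, and the sketch assumes the line is terminated
by ``\n`` like other version 1 command lines.

```python
import uuid
from urllib.parse import quote, urlencode


def build_upgrade_line(protos=("ssh-v2",)):
    """Return (token, line) for an SSH transport upgrade request."""
    # A version 4 UUID is random, so it does not leak the MAC address.
    token = str(uuid.uuid4())
    # The capabilities string is percent encoded key-value pairs.
    caps = urlencode({"proto": ",".join(protos)}, quote_via=quote)
    # Assumption: the line ends with a newline.
    line = ("upgrade %s %s\n" % (token, caps)).encode("ascii")
    return token, line
```

The client keeps the returned token so it can later be matched against the
token echoed by the server.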

If the server does not recognize the ``upgrade`` line, it should issue
an empty response and continue processing the ``hello`` and ``between``
commands. Here is an example handshake between a version 2 aware client
and a non version 2 aware server:

    c: upgrade 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a proto=ssh-v2
    c: hello\n
    c: between\n
    c: pairs 81\n
    c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
    s: 0\n
    s: 324\n
    s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
    s: 1\n
    s: \n

(The initial ``0\n`` line from the server indicates an empty response to
the unknown ``upgrade ..`` command/line.)

If the server recognizes the ``upgrade`` line and is willing to satisfy that
upgrade request, it replies with a payload of the following form:

    upgraded <token> <transport name>\n

This line is the literal string ``upgraded``, a space, the token that was
specified by the client in its ``upgrade ...`` request line, a space, and the
name of the transport protocol that was chosen by the server. The transport
name MUST match one of the names the client specified in the ``proto`` field
of its ``upgrade ...`` request line.
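
A client-side check of the ``upgraded`` line might look like the following
Python sketch. The function name and the choice to return ``None`` for
non-matching lines (such as banner output) are illustrative assumptions.

```python
def parse_upgraded_line(line, expected_token, offered_protos):
    """Return the chosen transport name, or None if this is not our reply.

    Raises ValueError if the server picked a transport we did not offer.
    """
    if not line.endswith(b"\n"):
        return None
    parts = line[:-1].split(b" ")
    if len(parts) != 3 or parts[0] != b"upgraded":
        return None
    token, transport = parts[1], parts[2]
    # Only the echoed token distinguishes a real reply from banner output.
    if token != expected_token.encode("ascii"):
        return None
    name = transport.decode("ascii")
    if name not in offered_protos:
        raise ValueError("server chose unoffered transport: %s" % name)
    return name
```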

If a server issues an ``upgraded`` response, it MUST also read and ignore
the lines associated with the ``hello`` and ``between`` command requests
that were issued by the client. It is assumed that the negotiated transport
will respond with equivalent requested information following the transport
handshake.

All data following the ``\n`` terminating the ``upgraded`` line is the
domain of the negotiated transport. It is common for the data immediately
following to contain additional metadata about the state of the transport and
the server. However, this isn't strictly speaking part of the transport
handshake and isn't covered by this section.

Here is an example handshake between a version 2 aware client and a version
2 aware server:

    c: upgrade 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a proto=ssh-v2
    c: hello\n
    c: between\n
    c: pairs 81\n
    c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
    s: upgraded 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a ssh-v2\n
    s: <additional transport specific data>

The client-issued token that is echoed in the response provides a more
resilient mechanism for differentiating *banner* output from Mercurial
output. In version 1, properly formatted banner output could get confused
for Mercurial server output. By submitting a randomly generated token
that is then present in the response, the client can look for that token
in response lines and have reasonable certainty that the line did not
originate from a *banner* message.

SSH Version 1 Transport
-----------------------

The SSH transport (version 1) is a custom text-based protocol suitable for
use over any bi-directional stream transport. It is most commonly used with
SSH.

An SSH transport server can be started with ``hg serve --stdio``. The stdin,
stderr, and stdout file descriptors of the started process are used to exchange
data. When Mercurial connects to a remote server over SSH, it actually starts
a ``hg serve --stdio`` process on the remote server.

Commands are issued by sending the command name followed by a trailing newline
``\n`` to the server. e.g. ``capabilities\n``.

Command arguments are sent in the following format::

    <argument> <length>\n<value>

That is, the argument string name followed by a space followed by the
integer length of the value (expressed as a string) followed by a newline
(``\n``) followed by the raw argument value.
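
The command and simple argument encoding above can be sketched in Python as
follows. The function name is hypothetical; argument values are taken as raw
bytes and dictionary arguments are not handled.

```python
def encode_command(name, args=None):
    """Encode a version 1 SSH command with simple (non-dictionary) arguments."""
    out = [name.encode("ascii"), b"\n"]
    for key, value in (args or {}).items():
        # <argument> <length>\n<value>, with no trailing newline after value.
        out.append(b"%s %d\n%s" % (key.encode("ascii"), len(value), value))
    return b"".join(out)
```

For example, the ``between`` request used in the handshake is produced by
passing ``{"pairs": value}`` where ``value`` is the 81 byte null range.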

Dictionary arguments are encoded differently::

    <argument> <# elements>\n
    <key1> <length1>\n<value1>
    <key2> <length2>\n<value2>
    ...

Non-argument data is sent immediately after the final argument value. It is
encoded in chunks::

    <length>\n<data>

Each command declares a list of supported arguments and their types. If a
client sends an unknown argument to the server, the server should abort
immediately. The special argument ``*`` in a command's definition indicates
that all argument names are allowed.

The definition of supported arguments and types is initially made when a
new command is implemented. The client and server must initially independently
agree on the arguments and their types. This initial set of arguments can be
supplemented through the presence of *capabilities* advertised by the server.

Each command has a defined expected response type.

A ``string`` response type is a length framed value. The response consists of
the string encoded integer length of a value followed by a newline (``\n``)
followed by the value. Empty values are allowed (and are represented as
``0\n``).
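
Reading a ``string`` response from a byte stream can be sketched in Python as
follows. The function name is hypothetical and ``stream`` is assumed to be a
binary file-like object such as the server's stdout pipe.

```python
import io


def read_string_response(stream):
    """Read one length framed ``string`` response and return its value."""
    line = stream.readline()
    if not line.endswith(b"\n"):
        raise ValueError("missing length line")
    length = int(line[:-1].decode("ascii"))
    value = stream.read(length)
    if len(value) != length:
        raise ValueError("truncated response")
    return value
```

Because the value is consumed by length, a value that is itself a newline
(as in the ``1\n\n`` reply to ``between``) is read correctly.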

A ``stream`` response type consists of raw bytes of data. There is no framing.

A generic error response type is also supported. It consists of an error
message written to ``stderr`` followed by ``\n-\n``. In addition, ``\n`` is
written to ``stdout``.

If the server receives an unknown command, it will send an empty ``string``
response.

The server terminates if it receives an empty command (a ``\n`` character).

If the server announces support for the ``protocaps`` capability, the client
should issue a ``protocaps`` command after the initial handshake to announce
its own capabilities. The client capabilities are persistent.

SSH Version 2 Transport
-----------------------

**Experimental and under development**

Version 2 of the SSH transport behaves identically to version 1 of the SSH
transport with the exception of handshake semantics. See above for how
version 2 of the SSH transport is negotiated.

Immediately following the ``upgraded`` line signaling a switch to version
2 of the SSH protocol, the server automatically sends additional details
about the capabilities of the remote server. This has the form:

    <integer length of value>\n
    capabilities: ...\n

e.g.

    s: upgraded 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a ssh-v2\n
    s: 240\n
    s: capabilities: known getbundle batch ...\n

Following capabilities advertisement, the peers communicate using version
1 of the SSH transport.

Unified Frame-Based Protocol
============================

**Experimental and under development**

The *Unified Frame-Based Protocol* is a communications protocol between
Mercurial peers. The protocol aims to be mostly transport agnostic
(works similarly on HTTP, SSH, etc).

To operate the protocol, a bi-directional, half-duplex pipe supporting
ordered sends and receives is required. That is, each peer has one pipe
for sending data and another for receiving.

All data is read and written in atomic units called *frames*. These
are conceptually similar to TCP packets. Higher-level functionality
is built on the exchange and processing of frames.

All frames are associated with a *stream*. A *stream* provides a
unidirectional grouping of frames. Streams facilitate two goals:
content encoding and parallelism. There is a dedicated section on
streams below.

The protocol is request-response based: the client issues requests to
the server, which issues replies to those requests. Server-initiated
messaging is not currently supported, but this specification carves
out room to implement it.

All frames are associated with a numbered request. Frames can thus
be logically grouped by their request ID.

Frames begin with an 8 octet header followed by a variable length
payload::

    +------------------------------------------------+
    |                 Length (24)                    |
    +--------------------------------+---------------+
    |         Request ID (16)        | Stream ID (8) |
    +------------------+-------------+---------------+
    | Stream Flags (8) |
    +-----------+------+
    | Type (4)  |
    +-----------+
    | Flags (4) |
    +===========+===================================================|
    |                     Frame Payload (0...)                    ...
    +---------------------------------------------------------------+

The length of the frame payload is expressed as an unsigned 24 bit
little endian integer. Values larger than 65535 MUST NOT be used unless
given permission by the server as part of the negotiated capabilities
during the handshake. The frame header is not part of the advertised
frame length. The payload length is the over-the-wire length. If there
is content encoding applied to the payload as part of the frame's stream,
the length is the output of that content encoding, not the input.

The 16-bit ``Request ID`` field denotes the integer request identifier,
stored as an unsigned little endian integer. Odd numbered requests are
client-initiated. Even numbered requests are server-initiated. This
refers to where the *request* was initiated - not where the *frame* was
initiated, so servers will send frames with odd ``Request ID`` in
response to client-initiated requests. Implementations are advised to
start ordering request identifiers at ``1`` and ``0``, increment by
``2``, and wrap around if all available numbers have been exhausted.

The 8-bit ``Stream ID`` field denotes the stream that the frame is
associated with. Frames belonging to a stream may have content
encoding applied and the receiver may need to decode the raw frame
payload to obtain the original data. Odd numbered IDs are
client-initiated. Even numbered IDs are server-initiated.

The 8-bit ``Stream Flags`` field defines stream processing semantics.
See the section on streams below.

The 4-bit ``Type`` field denotes the type of frame being sent.

The 4-bit ``Flags`` field defines special, per-type attributes for
the frame.
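
The header layout above can be sketched in Python as follows. The function
names are hypothetical. The text does not state how ``Type`` and ``Flags``
share the final octet; this sketch assumes ``Type`` occupies the upper 4 bits
and ``Flags`` the lower 4 bits, following the order in which the fields are
listed.

```python
import struct

HEADER_SIZE = 8


def pack_frame(request_id, stream_id, stream_flags, frame_type, flags, payload):
    """Serialize one frame: 8 octet header followed by the payload."""
    if len(payload) >= 1 << 24:
        raise ValueError("payload does not fit a 24-bit length")
    if not (0 <= frame_type < 16 and 0 <= flags < 16):
        raise ValueError("type and flags are 4-bit fields")
    header = bytearray(HEADER_SIZE)
    # 24-bit little endian payload length (header not included).
    header[0:3] = len(payload).to_bytes(3, "little")
    # 16-bit little endian request ID, 8-bit stream ID, 8-bit stream flags.
    struct.pack_into("<HBB", header, 3, request_id, stream_id, stream_flags)
    # Assumption: type in the high nibble, flags in the low nibble.
    header[7] = (frame_type << 4) | flags
    return bytes(header) + payload


def unpack_header(header):
    """Return (length, request_id, stream_id, stream_flags, type, flags)."""
    length = int.from_bytes(header[0:3], "little")
    request_id, stream_id, stream_flags, type_flags = struct.unpack_from(
        "<HBBB", header, 3
    )
    return (length, request_id, stream_id, stream_flags,
            type_flags >> 4, type_flags & 0x0F)
```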

The sections below define the frame types and their behavior.

Command Request (``0x01``)
--------------------------

This frame contains a request to run a command.

The payload consists of a CBOR map defining the command request. The
bytestring keys of that map are:

name
   Name of the command that should be executed (bytestring).
args
   Map of bytestring keys to various value types containing the named
   arguments to this command.

   Each command defines its own set of argument names and their expected
   types.

This frame type MUST ONLY be sent from clients to servers: it is illegal
for a server to send this frame to a client.

The following flag values are defined for this type:

0x01
   New command request. When set, this frame represents the beginning
   of a new request to run a command. The ``Request ID`` attached to this
   frame MUST NOT be active.
0x02
   Command request continuation. When set, this frame is a continuation
   from a previous command request frame for its ``Request ID``. This
   flag is set when the CBOR data for a command request does not fit
   in a single frame.
0x04
   Additional frames expected. When set, the command request didn't fit
   into a single frame and additional CBOR data follows in a subsequent
   frame.
0x08
   Command data frames expected. When set, command data frames are
   expected to follow the final command request frame for this request.

``0x01`` MUST be set on the initial command request frame for a
``Request ID``.

``0x01`` or ``0x02`` MUST be set to indicate this frame's role in
a series of command request frames.

If command data frames are to be sent, ``0x08`` MUST be set on ALL
command request frames.
Gregory Szorc
|
r37069 | |||
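As an illustration of the flag rules for command request frames, the
following Python sketch computes the flags value for each frame in a
series. The function and constant names are hypothetical and are not
part of Mercurial; the values come from the flag list above, with
``0x08`` taken as the "command data frames expected" flag.

```python
# Flag values from the Command Request flag list. Names are illustrative.
FLAG_NEW = 0x01           # new command request
FLAG_CONTINUATION = 0x02  # continuation of an earlier request frame
FLAG_MORE_FRAMES = 0x04   # additional request frames follow
FLAG_EXPECT_DATA = 0x08   # command data frames follow the request


def command_request_flags(frame_count, expect_data=False):
    """Return the flags value for each frame of one command request."""
    if frame_count < 1:
        raise ValueError("a command request needs at least one frame")
    flags = []
    for index in range(frame_count):
        # The first frame starts the request; later ones continue it.
        value = FLAG_NEW if index == 0 else FLAG_CONTINUATION
        if index < frame_count - 1:
            value |= FLAG_MORE_FRAMES
        if expect_data:
            # Set on every frame when command data will be sent.
            value |= FLAG_EXPECT_DATA
        flags.append(value)
    return flags
```

For a request whose CBOR data spans three frames and carries no command
data, this yields ``0x05``, ``0x06`` and ``0x02``.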
Command Data (``0x03``) | ||||
----------------------- | ||||
This frame contains raw data for a command. | ||||
Most commands can be executed by specifying arguments alone. However,
arguments have an upper bound on their length. Commands that accept
data exceeding this bound, or whose data length isn't known when the
command is initially sent, need to stream arbitrary data to the
server. This frame type facilitates the sending of this data.
The payload of this frame type consists of a stream of raw data to be | ||||
consumed by the command handler on the server. The format of the data | ||||
is command specific. | ||||
The following flag values are defined for this type: | ||||
0x01 | ||||
Command data continuation. When set, the data for this command | ||||
continues into a subsequent frame. | ||||
0x02 | ||||
End of data. When set, command data has been fully sent to the | ||||
server. The command has been fully issued and no new data for this | ||||
command will be sent. The next frame will belong to a new command. | ||||
Response Data (``0x04``)
------------------------

This frame contains raw response data to an issued command.

The following flag values are defined for this type:

0x01
   Data continuation. When set, an additional frame containing response data
   will follow.
0x02
   End of data. When set, the response data has been fully sent and
   no additional frames for this response will be sent.
0x04
   CBOR data. When set, the frame payload consists of CBOR data.

The ``0x01`` flag is mutually exclusive with the ``0x02`` flag.

Error Response (``0x05``) | ||||
------------------------- | ||||
An error occurred when processing a request. This could indicate | ||||
a protocol-level failure or an application level failure depending | ||||
on the flags for this message type. | ||||
The payload for this type is an error message that should be | ||||
displayed to the user. | ||||
The following flag values are defined for this type: | ||||
0x01 | ||||
The error occurred at the transport/protocol level. If set, the | ||||
connection should be closed. | ||||
0x02 | ||||
The error occurred at the application level. e.g. invalid command. | ||||
Human Output Side-Channel (``0x06``)
------------------------------------
This frame contains a message that is intended to be displayed to | ||||
people. Whereas most frames communicate machine readable data, this | ||||
frame communicates textual data that is intended to be shown to | ||||
humans. | ||||
The frame consists of a series of *formatting requests*. Each formatting | ||||
request consists of a formatting string, arguments for that formatting | ||||
string, and labels to apply to that formatting string. | ||||
A formatting string is a printf()-like string that allows variable | ||||
substitution within the string. Labels allow the rendered text to be | ||||
*decorated*. Assuming use of the canonical Mercurial code base, a | ||||
formatting string can be the input to the ``i18n._`` function. This | ||||
allows messages emitted from the server to be localized. So even if | ||||
the server has different i18n settings, people could see messages in | ||||
their *native* settings. Similarly, the use of labels allows | ||||
decorations like coloring and underlining to be applied using the | ||||
client's configured rendering settings. | ||||
Formatting strings are similar to ``printf()`` strings or how | ||||
Python's ``%`` operator works. The only supported formatting sequences | ||||
are ``%s`` and ``%%``. ``%s`` will be replaced by whatever the string | ||||
at that position resolves to. ``%%`` will be replaced by ``%``. All | ||||
other 2-byte sequences beginning with ``%`` represent a literal | ||||
``%`` followed by that character. However, future versions of the | ||||
wire protocol reserve the right to allow clients to opt in to receiving | ||||
formatting strings with additional formatters, hence why ``%%`` is | ||||
required to represent the literal ``%``. | ||||
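The substitution rules for formatting strings can be sketched in Python
as follows. This is a minimal illustration, not the Mercurial
implementation: the function name ``render_format`` is hypothetical, and
raising ``ValueError`` when arguments run out is a choice made for the
example rather than behavior defined by this document.

```python
def render_format(msg, args=()):
    """Expand ``%s`` and ``%%`` in a bytes formatting string."""
    out = bytearray()
    remaining = iter(args)
    i = 0
    while i < len(msg):
        ch = msg[i:i + 1]
        nxt = msg[i + 1:i + 2]
        if ch == b"%" and nxt == b"s":
            # Substitute the next positional argument.
            try:
                out += next(remaining)
            except StopIteration:
                raise ValueError("not enough arguments") from None
            i += 2
        elif ch == b"%" and nxt == b"%":
            out += b"%"
            i += 2
        else:
            # Any other byte is copied through, including a "%" that
            # precedes an unrecognized character.
            out += ch
            i += 1
    return bytes(out)
```

With this sketch, ``render_format(b"%s%% done\n", [b"50"])`` produces
``b"50% done\n"``, while ``b"100%d"`` is returned unchanged.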
The frame payload consists of a CBOR array of CBOR maps. Each map
defines an *atom* of text data to print. Each *atom* has the following
bytestring keys:

msg
   (bytestring) The formatting string. Content MUST be ASCII.
args (optional)
   Array of bytestrings defining arguments to the formatting string.
labels (optional)
   Array of bytestrings defining labels to apply to this atom.

All data to be printed MUST be encoded into a single frame: this frame
does not support spanning data across multiple frames.

All textual data encoded in these frames is assumed to be line delimited.
The last atom in the frame SHOULD end with a newline (``\n``). If it
doesn't, clients MAY add a newline to facilitate immediate printing.

Progress Update (``0x07``)
--------------------------
This frame holds the progress of an operation on the peer. Consumption | ||||
of these frames allows clients to display progress bars, estimated | ||||
completion times, etc. | ||||
Each frame defines the progress of a single operation on the peer. The | ||||
payload consists of a CBOR map with the following bytestring keys: | ||||
topic | ||||
Topic name (string) | ||||
pos | ||||
Current numeric position within the topic (integer) | ||||
total | ||||
Total/end numeric position of this topic (unsigned integer) | ||||
label (optional) | ||||
Unit label (string) | ||||
item (optional) | ||||
Item name (string) | ||||
Progress state is created when a frame is received referencing a | ||||
*topic* that isn't currently tracked. Progress tracking for that | ||||
*topic* is finished when a frame is received reporting the current | ||||
position of that topic as ``-1``. | ||||
Multiple *topics* may be active at any given time. | ||||
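The topic lifecycle described above can be sketched as a small tracker.
The sketch assumes the frame payload has already been decoded from CBOR
into a Python ``dict`` with bytestring keys; the class name
``ProgressTracker`` is hypothetical.

```python
class ProgressTracker:
    """Track active progress topics from decoded Progress Update frames."""

    def __init__(self):
        # Maps topic name to its latest (pos, total) pair.
        self.active = {}

    def update(self, frame):
        topic = frame[b"topic"]
        pos = frame[b"pos"]
        if pos == -1:
            # A position of -1 finishes tracking for this topic.
            self.active.pop(topic, None)
            return
        # An unseen topic creates new state; a known one is updated.
        self.active[topic] = (pos, frame[b"total"])
```

Because state is keyed by topic, several topics can be tracked at the
same time, and a renderer is free to display or ignore ``active``.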
Rendering of progress information is not mandated or governed by this | ||||
specification: implementations MAY render progress information however | ||||
they see fit, including not at all. | ||||
The string data describing the topic SHOULD be static strings to | ||||
facilitate receivers localizing that string data. The emitter | ||||
MUST normalize all string data to valid UTF-8 and receivers SHOULD | ||||
validate that received data conforms to UTF-8. The topic name | ||||
SHOULD be ASCII. | ||||
Stream Encoding Settings (``0x08``)
-----------------------------------
This frame type holds information defining the content encoding | ||||
settings for a *stream*. | ||||
This frame type is likely consumed by the protocol layer and is not | ||||
passed on to applications. | ||||
This frame type MUST ONLY occur on frames having the *Beginning of Stream* | ||||
``Stream Flag`` set. | ||||
The payload of this frame defines what content encoding has (possibly) | ||||
been applied to the payloads of subsequent frames in this stream. | ||||
The payload begins with an 8-bit integer defining the length of the | ||||
encoding *profile*, followed by the string name of that profile, which | ||||
must be an ASCII string. All bytes that follow can be used by that | ||||
profile for supplemental settings definitions. See the section below | ||||
on defined encoding profiles. | ||||
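The payload layout described above can be parsed with a few lines of
Python. This is a sketch only: the function name is hypothetical, and
the profile name ``zstd`` in the usage note is a made-up example, since
no profiles are defined by this document yet.

```python
def parse_encoding_settings(payload):
    """Split a Stream Encoding Settings payload into (profile, extra)."""
    if not payload:
        raise ValueError("empty payload")
    # The first byte is an 8-bit length for the profile name.
    name_len = payload[0]
    name = payload[1:1 + name_len]
    if len(name) != name_len:
        raise ValueError("truncated profile name")
    # The profile name must be ASCII; decode() raises otherwise.
    profile = name.decode("ascii")
    # Everything after the name belongs to the profile itself.
    return profile, payload[1 + name_len:]
```

For example, ``parse_encoding_settings(b"\x04zstd\x01\x02")`` returns
``("zstd", b"\x01\x02")``.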
Stream States and Flags | ||||
----------------------- | ||||
Streams can be in two states: *open* and *closed*. An *open* stream | ||||
is active and frames attached to that stream could arrive at any time. | ||||
A *closed* stream is not active. If a frame attached to a *closed* | ||||
stream arrives, that frame MUST have an appropriate stream flag | ||||
set indicating beginning of stream. All streams are in the *closed* | ||||
state by default. | ||||
The ``Stream Flags`` field denotes a set of bit flags for defining | ||||
the relationship of this frame within a stream. The following flags | ||||
are defined: | ||||
0x01 | ||||
Beginning of stream. The first frame in the stream MUST set this | ||||
flag. When received, the ``Stream ID`` this frame is attached to | ||||
becomes ``open``. | ||||
0x02 | ||||
End of stream. The last frame in a stream MUST set this flag. When | ||||
received, the ``Stream ID`` this frame is attached to becomes | ||||
``closed``. Any content encoding context associated with this stream | ||||
can be destroyed after processing the payload of this frame. | ||||
0x04 | ||||
Apply content encoding. When set, any content encoding settings | ||||
defined by the stream should be applied when attempting to read | ||||
the frame. When not set, the frame payload isn't encoded. | ||||
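The open/closed state transitions can be sketched as follows. The class
and constant names are hypothetical. How a receiver should react to a
beginning-of-stream flag on a stream that is already open is not stated
here, so the sketch simply ignores that case.

```python
STREAM_FLAG_BEGIN = 0x01    # beginning of stream
STREAM_FLAG_END = 0x02      # end of stream
STREAM_FLAG_ENCODED = 0x04  # apply content encoding


class StreamTracker:
    """Track which stream IDs are open on the receiving side."""

    def __init__(self):
        # All streams are closed by default.
        self.open_streams = set()

    def on_frame(self, stream_id, stream_flags):
        """Update state; return True if the payload is content encoded."""
        if stream_id not in self.open_streams:
            if not stream_flags & STREAM_FLAG_BEGIN:
                raise ValueError("frame on closed stream lacks begin flag")
            self.open_streams.add(stream_id)
        if stream_flags & STREAM_FLAG_END:
            # Encoding context for this stream can be discarded now.
            self.open_streams.discard(stream_id)
        return bool(stream_flags & STREAM_FLAG_ENCODED)
```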
Streams | ||||
------- | ||||
Streams - along with ``Request IDs`` - facilitate grouping of frames. | ||||
But the purpose of each is quite different and the groupings they | ||||
constitute are independent. | ||||
A ``Request ID`` is essentially a tag. It tells you which logical | ||||
request a frame is associated with. | ||||
A *stream* is a sequence of frames grouped for the express purpose | ||||
of applying a stateful encoding or for denoting sub-groups of frames. | ||||
Unlike ``Request ID``s which span the request and response, a stream | ||||
is unidirectional and stream IDs are independent from client to | ||||
server. | ||||
There is no strict hierarchical relationship between ``Request IDs`` | ||||
and *streams*. A stream can contain frames having multiple | ||||
``Request IDs``. Frames belonging to the same ``Request ID`` can | ||||
span multiple streams. | ||||
One goal of streams is to facilitate content encoding. A stream can | ||||
define an encoding to be applied to frame payloads. For example, the | ||||
payload transmitted over the wire may contain output from a | ||||
zstandard compression operation and the receiving end may decompress | ||||
that payload to obtain the original data. | ||||
The other goal of streams is to facilitate concurrent execution. For | ||||
example, a server could spawn 4 threads to service a request that can | ||||
be easily parallelized. Each of those 4 threads could write into its | ||||
own stream. Those streams could then in turn be delivered to 4 threads | ||||
on the receiving end, with each thread consuming its stream in near | ||||
isolation. The *main* thread on both ends merely does I/O and | ||||
encodes/decodes frame headers: the bulk of the work is done by worker | ||||
threads. | ||||
In addition, since content encoding is defined per stream, each | ||||
*worker thread* could perform potentially CPU bound work concurrently | ||||
with other threads. This approach of applying encoding at the | ||||
sub-protocol / stream level eliminates a potential resource constraint | ||||
on the protocol stream as a whole (it is common for the throughput of | ||||
a compression engine to be smaller than the throughput of a network). | ||||
Having multiple streams - each with their own encoding settings - also
facilitates the use of advanced data compression techniques. For
example, a transmitter could see that it is generating data faster
or slower than the receiving end is consuming it and adjust its
compression settings to trade CPU for compression ratio accordingly.
While streams can define a content encoding, not all frames within
that stream must use that content encoding. This can be useful when
data is being served from caches and being derived dynamically. A
cache could hold pre-compressed data so the server doesn't have to
recompress it. The ability to pick and choose which frames are
compressed allows servers to easily send data to the wire without
involving potentially expensive encoding overhead.
Content Encoding Profiles | ||||
------------------------- | ||||
Streams can have named content encoding *profiles* associated with | ||||
them. A profile defines a shared understanding of content encoding | ||||
settings and behavior. | ||||
The following profiles are defined: | ||||
TBD | ||||
Issuing Commands
----------------

A client can request that a remote run a command by sending it
frames defining that command. This logical stream is composed of
1 or more ``Command Request`` frames and 0 or more ``Command Data``
frames.

All frames composing a single command request MUST be associated with
the same ``Request ID``.

Clients MAY send additional command requests without waiting on the | ||||
response to a previous command request. If they do so, they MUST ensure | ||||
that the ``Request ID`` field of outbound frames does not conflict | ||||
with that of an active ``Request ID`` whose response has not yet been | ||||
fully received. | ||||
Servers MAY respond to commands in a different order than they were | ||||
sent over the wire. Clients MUST be prepared to deal with this. Servers | ||||
also MAY start executing commands in a different order than they were | ||||
received, or MAY execute multiple commands concurrently. | ||||
If there is a dependency between commands or a race condition between | ||||
commands executing (e.g. a read-only command that depends on the results | ||||
of a command that mutates the repository), then clients MUST NOT send | ||||
frames issuing a command until a response to all dependent commands has | ||||
been received. | ||||
TODO think about whether we should express dependencies between commands | ||||
to avoid roundtrip latency. | ||||
A command is defined by a command name, 0 or more command arguments,
and optional command data.

Arguments are the recommended mechanism for transferring fixed sets of
parameters to a command. Data is appropriate for transferring variable
data. Thinking in terms of HTTP, arguments would be headers and data
would be the message body.

It is recommended for servers to delay the dispatch of a command
until all arguments have been received. Servers MAY impose limits on the
maximum argument size.

TODO define failure mechanism.

Servers MAY dispatch to commands immediately once argument data
is available or delay until command data is received in full.

Capabilities
============
Servers advertise supported wire protocol features. This allows clients to | ||||
probe for server features before blindly calling a command or passing a | ||||
specific argument. | ||||
The server's features are exposed via a *capabilities* string. This is a | ||||
space-delimited string of tokens/features. Some features are single words | ||||
like ``lookup`` or ``batch``. Others are complicated key-value pairs | ||||
advertising sub-features. e.g. ``httpheader=2048``. When complex, non-word | ||||
values are used, each feature name can define its own encoding of sub-values. | ||||
Comma-delimited and ``x-www-form-urlencoded`` values are common. | ||||
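A client might split the capabilities string into a lookup table along
these lines. This is a minimal sketch with a hypothetical function name;
it leaves each value as an undecoded string because every feature
defines its own sub-value encoding.

```python
def parse_capabilities(caps):
    """Map each capability name to its raw value, or None if it has none."""
    result = {}
    for token in caps.split(" "):
        if not token:
            continue
        # Only the first "=" separates the name from the value.
        name, sep, value = token.partition("=")
        result[name] = value if sep else None
    return result
```

For example, ``parse_capabilities("lookup batch httpheader=2048")``
returns ``{"lookup": None, "batch": None, "httpheader": "2048"}``.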
The following sections document capabilities defined by the canonical
Mercurial server implementation.
batch | ||||
----- | ||||
Whether the server supports the ``batch`` command. | ||||
This capability/command was introduced in Mercurial 1.9 (released July 2011). | ||||
branchmap | ||||
--------- | ||||
Whether the server supports the ``branchmap`` command. | ||||
This capability/command was introduced in Mercurial 1.3 (released July 2009). | ||||
bundle2-exp | ||||
----------- | ||||
Precursor to ``bundle2`` capability that was used before bundle2 was a | ||||
stable feature. | ||||
This capability was introduced in Mercurial 3.0 behind an experimental | ||||
flag. This capability should not be observed in the wild. | ||||
bundle2 | ||||
------- | ||||
Indicates whether the server supports the ``bundle2`` data exchange format. | ||||
The value of the capability is a URL quoted, newline (``\n``) delimited | ||||
list of keys or key-value pairs. | ||||
A key is simply a URL encoded string. | ||||
A key-value pair is a URL encoded key separated from a URL encoded value by | ||||
an ``=``. If the value is a list, elements are delimited by a ``,`` after | ||||
URL encoding. | ||||
For example, say we have the values:: | ||||
{'HG20': [], 'changegroup': ['01', '02'], 'digests': ['sha1', 'sha512']} | ||||
We would first construct a string:: | ||||
HG20\nchangegroup=01,02\ndigests=sha1,sha512 | ||||
We would then URL quote this string:: | ||||
HG20%0Achangegroup%3D01%2C02%0Adigests%3Dsha1%2Csha512 | ||||
This capability was introduced in Mercurial 3.4 (released May 2015). | ||||
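The ``bundle2`` encoding steps can be reproduced with Python's standard
library. This is a sketch under assumptions: the function name is
hypothetical, and emitting keys in sorted order is a choice made here to
get a stable result, not something this document requires.

```python
from urllib.parse import quote


def encode_bundle2_caps(caps):
    """Encode a dict of bundle2 capabilities into the capability value."""
    lines = []
    for key in sorted(caps):
        entry = quote(key)
        values = caps[key]
        if values:
            # List elements are URL encoded, then joined with ",".
            entry += "=" + ",".join(quote(v) for v in values)
        lines.append(entry)
    # The newline-delimited string is URL quoted as a whole.
    return quote("\n".join(lines))
```

Given the example values from this section, the function returns
``HG20%0Achangegroup%3D01%2C02%0Adigests%3Dsha1%2Csha512``.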
changegroupsubset | ||||
----------------- | ||||
Whether the server supports the ``changegroupsubset`` command. | ||||
This capability was introduced in Mercurial 0.9.2 (released December | ||||
2006). | ||||
This capability was introduced at the same time as the ``lookup`` | ||||
capability/command. | ||||
compression
-----------
Declares support for negotiating compression formats. | ||||
Presence of this capability indicates the server supports dynamic selection | ||||
of compression formats based on the client request. | ||||
Servers advertising this capability are required to support the | ||||
``application/mercurial-0.2`` media type in response to commands returning | ||||
streams. Servers may support this media type on any command. | ||||
The value of the capability is a comma-delimited list of strings declaring | ||||
supported compression formats. The order of the compression formats is in | ||||
server-preferred order, most preferred first. | ||||
The identifiers used by the official Mercurial distribution are:

bzip2
   bzip2
none
   uncompressed / raw data
zlib
   zlib (no gzip header)
zstd
   zstd

This capability was introduced in Mercurial 4.1 (released February 2017).
getbundle
---------
Whether the server supports the ``getbundle`` command. | ||||
This capability was introduced in Mercurial 1.9 (released July 2011). | ||||
httpheader | ||||
---------- | ||||
Whether the server supports receiving command arguments via HTTP request | ||||
headers. | ||||
The value of the capability is an integer describing the max header | ||||
length that clients should send. Clients should ignore any content after a | ||||
comma in the value, as this is reserved for future use. | ||||
This capability was introduced in Mercurial 1.9 (released July 2011). | ||||
httpmediatype
-------------
Indicates which HTTP media types (``Content-Type`` header) the server is | ||||
capable of receiving and sending. | ||||
The value of the capability is a comma-delimited list of strings identifying | ||||
support for media type and transmission direction. The following strings may | ||||
be present: | ||||
0.1rx | ||||
Indicates server support for receiving ``application/mercurial-0.1`` media | ||||
types. | ||||
0.1tx | ||||
Indicates server support for sending ``application/mercurial-0.1`` media | ||||
types. | ||||
0.2rx | ||||
Indicates server support for receiving ``application/mercurial-0.2`` media | ||||
types. | ||||
0.2tx | ||||
Indicates server support for sending ``application/mercurial-0.2`` media | ||||
types. | ||||
minrx=X | ||||
Minimum media type version the server is capable of receiving. Value is a | ||||
string like ``0.2``. | ||||
This capability can be used by servers to limit connections from legacy | ||||
clients not using the latest supported media type. However, only clients | ||||
with knowledge of this capability will know to consult this value. This | ||||
capability is present so the client may issue a more user-friendly error | ||||
when the server has locked out a legacy client. | ||||
mintx=X | ||||
Minimum media type version the server is capable of sending. Value is a | ||||
string like ``0.1``. | ||||
Servers advertising support for the ``application/mercurial-0.2`` media type | ||||
should also advertise the ``compression`` capability. | ||||
This capability was introduced in Mercurial 4.1 (released February 2017). | ||||
httppostargs
------------
**Experimental** | ||||
Indicates that the server supports and prefers that clients send command
arguments via an HTTP POST request as part of the request body.
This capability was introduced in Mercurial 3.8 (released May 2016). | ||||
known | ||||
----- | ||||
Whether the server supports the ``known`` command. | ||||
This capability/command was introduced in Mercurial 1.9 (released July 2011). | ||||
lookup | ||||
------ | ||||
Whether the server supports the ``lookup`` command. | ||||
This capability was introduced in Mercurial 0.9.2 (released December | ||||
2006). | ||||
This capability was introduced at the same time as the ``changegroupsubset`` | ||||
capability/command. | ||||
partial-pull
------------
Indicates that the client can deal with partial answers to pull requests | ||||
by repeating the request. | ||||
If this parameter is not advertised, the server will not send pull bundles. | ||||
This client capability was introduced in Mercurial 4.6. | ||||
protocaps
---------
Whether the server supports the ``protocaps`` command for SSH V1 transport. | ||||
This capability was introduced in Mercurial 4.6. | ||||
pushkey
-------
Whether the server supports the ``pushkey`` and ``listkeys`` commands. | ||||
This capability was introduced in Mercurial 1.6 (released July 2010). | ||||
standardbundle | ||||
-------------- | ||||
**Unsupported** | ||||
This capability was introduced during the Mercurial 0.9.2 development cycle in | ||||
2006. It was never present in a release, as it was replaced by the ``unbundle`` | ||||
capability. This capability should not be encountered in the wild. | ||||
stream-preferred | ||||
---------------- | ||||
If present, the server prefers that clients clone using the streaming clone
protocol (``hg clone --stream``) rather than the standard
changegroup/bundle based protocol.
This capability was introduced in Mercurial 2.2 (released May 2012). | ||||
streamreqs | ||||
---------- | ||||
Indicates whether the server supports *streaming clones* and the *requirements* | ||||
that clients must support to receive it. | ||||
If present, the server supports the ``stream_out`` command, which transmits | ||||
raw revlogs from the repository instead of changegroups. This provides a faster | ||||
cloning mechanism at the expense of more bandwidth used. | ||||
The value of this capability is a comma-delimited list of repo format | ||||
*requirements*. These are requirements that impact the reading of data in | ||||
the ``.hg/store`` directory. An example value is | ||||
``streamreqs=generaldelta,revlogv1`` indicating the server repo requires | ||||
the ``revlogv1`` and ``generaldelta`` requirements. | ||||
If the only format requirement is ``revlogv1``, the server may expose the | ||||
``stream`` capability instead of the ``streamreqs`` capability. | ||||
This capability was introduced in Mercurial 1.7 (released November 2010). | ||||
stream | ||||
------ | ||||
Whether the server supports *streaming clones* from ``revlogv1`` repos. | ||||
If present, the server supports the ``stream_out`` command, which transmits | ||||
raw revlogs from the repository instead of changegroups. This provides a faster | ||||
cloning mechanism at the expense of more bandwidth used. | ||||
This capability was introduced in Mercurial 0.9.1 (released July 2006). | ||||
When initially introduced, the value of the capability was the numeric | ||||
revlog revision. e.g. ``stream=1``. This indicates the changegroup is using | ||||
``revlogv1``. This simple integer value wasn't powerful enough, so the | ||||
``streamreqs`` capability was invented to handle cases where the repo | ||||
requirements have more than just ``revlogv1``. Newer servers omit the | ||||
``=1`` since it was the only value supported and the value of ``1`` can | ||||
be implied by clients. | ||||
unbundlehash | ||||
------------ | ||||
Whether the ``unbundle`` command supports receiving a hash of all the
heads instead of a list.
For more, see the documentation for the ``unbundle`` command. | ||||
This capability was introduced in Mercurial 1.9 (released July 2011). | ||||
unbundle | ||||
-------- | ||||
Whether the server supports pushing via the ``unbundle`` command. | ||||
This capability/command has been present since Mercurial 0.9.1 (released | ||||
July 2006). | ||||
Mercurial 0.9.2 (released December 2006) added values to the capability | ||||
indicating which bundle types the server supports receiving. This value is a | ||||
comma-delimited list. e.g. ``HG10GZ,HG10BZ,HG10UN``. The order of values | ||||
reflects the priority/preference of that type, where the first value is the | ||||
most preferred type. | ||||
Content Negotiation
===================
The wire protocol has some mechanisms to help peers determine what content | ||||
types and encoding the other side will accept. Historically, these mechanisms | ||||
have been built into commands themselves because most commands only send a | ||||
well-defined response type and only certain commands needed to support | ||||
functionality like compression. | ||||
Currently, only the HTTP version 1 transport supports content negotiation
at the protocol layer.
HTTP requests advertise supported response formats via the ``X-HgProto-<N>`` | ||||
request header, where ``<N>`` is an integer starting at 1 allowing the logical | ||||
value to span multiple headers. This value consists of a list of | ||||
space-delimited parameters. Each parameter denotes a feature or capability. | ||||
The following parameters are defined: | ||||
0.1
   Indicates the client supports receiving ``application/mercurial-0.1``
   responses.
0.2
   Indicates the client supports receiving ``application/mercurial-0.2``
   responses.
cbor
   Indicates the client supports receiving ``application/mercurial-cbor``
   responses.

   (Only intended to be used with version 2 transports.)
comp
   Indicates compression formats the client can decode. Value is a list of
   comma delimited strings identifying compression formats ordered from
   most preferential to least preferential. e.g. ``comp=zstd,zlib,none``.

   This parameter does not have an effect if only the ``0.1`` parameter
   is defined, as support for ``application/mercurial-0.2`` or greater is
   required to use arbitrary compression formats.

   If this parameter is not advertised, the server interprets this as
   equivalent to ``zlib,none``.

Clients may choose to only send this header if the ``httpmediatype`` | ||||
server capability is present, as currently all server-side features | ||||
consulting this header require the client to opt in to new protocol features | ||||
advertised via the ``httpmediatype`` capability. | ||||
A server that doesn't receive an ``X-HgProto-<N>`` header should infer a | ||||
value of ``0.1``. This is compatible with legacy clients. | ||||
A server receiving a request indicating support for multiple media type | ||||
versions may respond with any of the supported media types. Not all servers | ||||
may support all media types on all commands. | ||||
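A server-side reading of the ``X-HgProto-<N>`` headers might look like
the sketch below. It rests on assumptions that should be checked against
a real implementation: the numbered header values are concatenated
directly, without a separator, to rebuild the logical value, and header
names are matched with exact case. The function name is hypothetical.

```python
def parse_hgproto(headers):
    """Return (media_types, compression_formats) from request headers."""
    chunks = []
    n = 1
    # Collect X-HgProto-1, X-HgProto-2, ... until one is missing.
    while f"X-HgProto-{n}" in headers:
        chunks.append(headers[f"X-HgProto-{n}"])
        n += 1
    # No header at all is treated like a legacy client sending "0.1".
    value = "".join(chunks) or "0.1"
    media_types = set()
    comp = ["zlib", "none"]  # default when "comp" is not advertised
    for param in value.split(" "):
        if param.startswith("comp="):
            comp = param[len("comp="):].split(",")
        elif param:
            media_types.add(param)
    return media_types, comp
```

A request carrying ``X-HgProto-1: 0.1 0.2 comp=zstd,zlib,none`` yields
the media types ``0.1`` and ``0.2`` and the compression preference list
``zstd``, ``zlib``, ``none``.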
Commands
========
This section contains a list of all wire protocol commands implemented by | ||||
the canonical Mercurial server. | ||||
batch | ||||
----- | ||||
Issue multiple commands while sending a single command request. This
allows a client to avoid multiple round trips to the server, so the
commands complete sooner.
The command accepts a ``cmds`` argument that contains a list of commands to | ||||
execute. | ||||
The value of ``cmds`` is a ``;`` delimited list of strings. Each string has the | ||||
form ``<command> <arguments>``. That is, the command name followed by a space | ||||
followed by an argument string. | ||||
The argument string is a ``,`` delimited list of ``<key>=<value>`` values | ||||
corresponding to command arguments. Both the argument name and value are | ||||
escaped using a special substitution map:: | ||||
: -> :c | ||||
, -> :o | ||||
; -> :s | ||||
= -> :e | ||||
The response type for this command is ``string``. The value contains a | ||||
``;`` delimited list of responses for each requested command. Each value | ||||
in this list is escaped using the same substitution map used for arguments. | ||||
If an error occurs, the generic error response may be sent. | ||||
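The ``batch`` encoding can be sketched in Python as shown below. The
function names are hypothetical. The replacement order matters: ``:``
is escaped first and unescaped last so that escape sequences are never
misread.

```python
def escape_arg(text):
    """Apply the batch substitution map to an argument name or value."""
    return (text.replace(":", ":c").replace(",", ":o")
                .replace(";", ":s").replace("=", ":e"))


def unescape_arg(text):
    """Reverse escape_arg; ":c" must be handled last."""
    return (text.replace(":e", "=").replace(":s", ";")
                .replace(":o", ",").replace(":c", ":"))


def encode_batch(commands):
    """Build the ``cmds`` value from (name, {argument: value}) pairs."""
    parts = []
    for name, args in commands:
        arg_str = ",".join(
            f"{escape_arg(k)}={escape_arg(v)}" for k, v in args.items())
        parts.append(f"{name} {arg_str}")
    return ";".join(parts)


def decode_batch_response(response):
    """Split a batch response into one unescaped result per command."""
    return [unescape_arg(part) for part in response.split(";")]
```

For example, ``encode_batch([("known", {"nodes": "a=b;c"})])`` returns
``known nodes=a:eb:sc``.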
between | ||||
------- | ||||
(Legacy command used for discovery in old clients) | ||||
Obtain nodes between pairs of nodes. | ||||
The ``pairs`` argument contains a space-delimited list of ``-`` delimited
hex node pairs. e.g.::
a072279d3f7fd3a4aa7ffa1a5af8efc573e1c896-6dc58916e7c070f678682bfe404d2e2d68291a18 | ||||
Return type is a ``string``. Value consists of lines corresponding to each | ||||
requested range. Each line contains a space-delimited list of hex nodes. | ||||
A newline ``\n`` terminates each line, including the last one. | ||||
branchmap | ||||
--------- | ||||
Obtain heads in named branches. | ||||
Accepts no arguments. Return type is a ``string``. | ||||
Return value contains lines with URL encoded branch names followed by a space | ||||
followed by a space-delimited list of hex nodes of heads on that branch. | ||||
e.g.:: | ||||
default a072279d3f7fd3a4aa7ffa1a5af8efc573e1c896 6dc58916e7c070f678682bfe404d2e2d68291a18 | ||||
stable baae3bf31522f41dd5e6d7377d0edd8d1cf3fccc | ||||
There is no trailing newline. | ||||
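A client could turn a ``branchmap`` response into a dictionary with a
sketch like the following. The function name is hypothetical, and the
response is assumed to have been decoded to text already.

```python
from urllib.parse import unquote


def parse_branchmap(response):
    """Map each branch name to the list of hex head nodes on it."""
    branches = {}
    for line in response.split("\n"):
        if not line:
            continue
        # First field is the URL encoded branch name; the rest are nodes.
        name, *nodes = line.split(" ")
        branches[unquote(name)] = nodes
    return branches
```

Applied to the example response above, this gives two heads for
``default`` and one for ``stable``.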
branches | ||||
-------- | ||||
(Legacy command used for discovery in old clients. Clients with ``getbundle``
use the ``known`` and ``heads`` commands instead.)

Obtain ancestor changesets of specific nodes back to a branch point.
Despite the name, this command has nothing to do with Mercurial named branches. | ||||
Instead, it is related to DAG branches. | ||||
The command accepts a ``nodes`` argument, which is a string of space-delimited | ||||
hex nodes. | ||||
For each node requested, the server will find the first ancestor node that is | ||||
a DAG root or is a merge. | ||||
Return type is a ``string``. Return value contains lines with result data for | ||||
each requested node. Each line contains space-delimited nodes followed by a | ||||
newline (``\n``). The 4 nodes reported on each line correspond to the requested | ||||
node, the ancestor node found, and its 2 parent nodes (which may be the null | ||||
node). | ||||
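The ``branches`` response can be parsed line by line, as in this
sketch. The names ``BranchInfo`` and ``parse_branches`` are
hypothetical, and the response is assumed to have been decoded to text.

```python
from collections import namedtuple

# One result line: requested node, ancestor found, and its two parents.
BranchInfo = namedtuple("BranchInfo", "requested ancestor parent1 parent2")


def parse_branches(response):
    """Parse a ``branches`` response into a list of BranchInfo tuples."""
    results = []
    for line in response.splitlines():
        nodes = line.split(" ")
        if len(nodes) != 4:
            raise ValueError("expected 4 nodes per line")
        results.append(BranchInfo(*nodes))
    return results
```

Real responses carry full 40-character hex nodes; any line that does
not hold exactly four fields is rejected by the sketch.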
capabilities | ||||
------------ | ||||
Obtain the capabilities string for the repo. | ||||
Unlike the ``hello`` command, the capabilities string is not prefixed. | ||||
There is no trailing newline. | ||||
This command does not accept any arguments. Return type is a ``string``. | ||||
This command was introduced in Mercurial 0.9.1 (released July 2006).

changegroup
-----------
(Legacy command: use ``getbundle`` instead) | ||||
Obtain a changegroup version 1 with data for changesets that are | ||||
descendants of client-specified changesets. | ||||
The ``roots`` argument contains a list of space-delimited hex nodes.
The server responds with a changegroup version 1 containing all | ||||
changesets between the requested root/base nodes and the repo's head nodes | ||||
at the time of the request. | ||||
The return type is a ``stream``. | ||||
changegroupsubset | ||||
----------------- | ||||
(Legacy command: use ``getbundle`` instead) | ||||
Obtain a changegroup version 1 with data for changesets between
client-specified base and head nodes.
The ``bases`` argument contains a list of space-delimited hex nodes. | ||||
The ``heads`` argument contains a list of space-delimited hex nodes. | ||||
The server responds with a changegroup version 1 containing all | ||||
changesets between the requested base and head nodes at the time of the | ||||
request. | ||||
The return type is a ``stream``. | ||||
clonebundles | ||||
------------ | ||||
Obtains a manifest of bundle URLs available to seed clones. | ||||
Each returned line contains a URL followed by metadata. See the | ||||
documentation in the ``clonebundles`` extension for more. | ||||
The return type is a ``string``. | ||||
getbundle | ||||
--------- | ||||
Obtain a bundle containing repository data. | ||||
This command accepts the following arguments: | ||||
heads | ||||
List of space-delimited hex nodes of heads to retrieve. | ||||
common | ||||
List of space-delimited hex nodes that the client has in common with the | ||||
server. | ||||
obsmarkers | ||||
Boolean indicating whether to include obsolescence markers as part | ||||
of the response. Only works with bundle2. | ||||
bundlecaps | ||||
Comma-delimited set of strings defining client bundle capabilities. | ||||
listkeys | ||||
Comma-delimited list of strings of ``pushkey`` namespaces. For each | ||||
namespace listed, a bundle2 part will be included with the content of | ||||
that namespace. | ||||
cg | ||||
Boolean indicating whether changegroup data is requested. | ||||
cbattempted | ||||
Boolean indicating whether the client attempted to use the *clone bundles* | ||||
feature before performing this request. | ||||
bookmarks
Boolean indicating whether bookmark data is requested.
phases
Boolean indicating whether phases data is requested.

The return type on success is a ``stream`` where the value is a bundle.
On the HTTP version 1 transport, the response is zlib compressed.

If an error occurs, a generic error response can be sent. | ||||
Unless the client sends a false value for the ``cg`` argument, the returned | ||||
bundle contains a changegroup with the nodes between the specified ``common`` | ||||
and ``heads`` nodes. Depending on the command arguments, the type and content | ||||
of the returned bundle can vary significantly. | ||||
The default behavior is for the server to send a raw changegroup version | ||||
``01`` response. | ||||
If the ``bundlecaps`` provided by the client contain a value beginning | ||||
with ``HG2``, a bundle2 will be returned. The bundle2 data may contain | ||||
additional repository data, such as ``pushkey`` namespace values. | ||||
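As a rough sketch of the list encodings and the ``HG2`` rule described above, a client might build part of the argument dictionary and predict the response flavor like this. Both function names are hypothetical, the boolean arguments are left out because their wire encoding is not described in this section, and ``HG20`` is used only as an example capability value.

```python
def build_getbundle_args(heads, common, bundlecaps=()):
    """Encode the list-valued ``getbundle`` arguments as strings."""
    return {
        "heads": " ".join(heads),    # space-delimited hex nodes
        "common": " ".join(common),  # space-delimited hex nodes
        # Comma-delimited set; sorted only to make the output deterministic.
        "bundlecaps": ",".join(sorted(bundlecaps)),
    }


def expects_bundle2(bundlecaps):
    """Return True if the server should answer with a bundle2."""
    # Any capability beginning with "HG2" requests a bundle2; otherwise
    # the default is a raw changegroup version 01.
    return any(cap.startswith("HG2") for cap in bundlecaps)
```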
heads | ||||
----- | ||||
Returns a list of space-delimited hex nodes of repository heads followed | ||||
by a newline. e.g. | ||||
``a9eeb3adc7ddb5006c088e9eda61791c777cbf7c 31f91a3da534dc849f0d6bfc00a395a97cf218a1\n`` | ||||
This command does not accept any arguments. The return type is a ``string``. | ||||
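Decoding this response is a one-line split once the trailing newline is removed. The helper below is a hypothetical sketch that treats the response as ``str``.

```python
def parse_heads(data):
    """Return the list of hex head nodes from a ``heads`` response."""
    nodes = data.rstrip("\n")
    return nodes.split(" ") if nodes else []
```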
hello | ||||
----- | ||||
Returns lines describing interesting things about the server in an RFC-822 | ||||
like format. | ||||
Currently, the only line defines the server capabilities. It has the form:: | ||||
capabilities: <value> | ||||
See above for more about the capabilities string. | ||||
SSH clients typically issue this command as soon as a connection is | ||||
established. | ||||
This command does not accept any arguments. The return type is a ``string``. | ||||
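A client could read the ``hello`` response with a small key/value parser such as the one below. This is a sketch under the assumption that each line has the form ``<key>: <value>``; ``parse_hello`` is a hypothetical name and the capabilities value is returned unparsed.

```python
def parse_hello(data):
    """Parse RFC-822 like ``key: value`` lines into a dictionary."""
    fields = {}
    for line in data.splitlines():
        if not line:
            continue
        # Split on the first colon only, since values may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed hello line: {line!r}")
        fields[key.strip()] = value.strip()
    return fields
```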
This command was introduced in Mercurial 0.9.1 (released July 2006).

listkeys
--------
List values in a specified ``pushkey`` namespace. | ||||
The ``namespace`` argument defines the pushkey namespace to operate on. | ||||
The return type is a ``string``. The value is an encoded dictionary of keys. | ||||
Key-value pairs are delimited by newlines (``\n``). Within each line, keys and | ||||
values are separated by a tab (``\t``). Keys and values are both strings. | ||||
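The encoded dictionary can be decoded as sketched below. ``parse_listkeys`` is a hypothetical helper, and the response is assumed to be available as ``str``.

```python
def parse_listkeys(data):
    """Decode a ``listkeys`` response into a dict of key -> value."""
    result = {}
    for line in data.split("\n"):
        if not line:
            continue
        # Keys and values are separated by a single tab.
        key, _, value = line.partition("\t")
        result[key] = value
    return result
```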
lookup | ||||
------ | ||||
Try to resolve a value to a known repository revision. | ||||
The ``key`` argument is converted from bytes to an | ||||
``encoding.localstr`` instance then passed into | ||||
``localrepository.__getitem__`` in an attempt to resolve it. | ||||
The return type is a ``string``. | ||||
Upon successful resolution, returns ``1 <hex node>\n``. On failure, | ||||
returns ``0 <error string>\n``. e.g.:: | ||||
1 273ce12ad8f155317b2c078ec75a4eba507f1fba\n | ||||
0 unknown revision 'foo'\n | ||||
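A client might turn the two response forms into a return value or an exception, as in the sketch below. The function name and the choice of ``LookupError`` are assumptions made for illustration.

```python
def parse_lookup(data):
    """Return the resolved hex node, or raise LookupError on failure."""
    status, _, rest = data.rstrip("\n").partition(" ")
    if status == "1":
        return rest
    if status == "0":
        # On failure, the remainder of the line is the error string.
        raise LookupError(rest)
    raise ValueError(f"unexpected lookup response: {data!r}")
```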
known | ||||
----- | ||||
Determine whether multiple nodes are known. | ||||
The ``nodes`` argument is a list of space-delimited hex nodes to check | ||||
for existence. | ||||
The return type is ``string``. | ||||
Returns a string consisting of ``0``s and ``1``s indicating whether nodes | ||||
are known. If the Nth node specified in the ``nodes`` argument is known, | ||||
a ``1`` will be returned at byte offset N. If the node isn't known, ``0`` | ||||
will be present at byte offset N. | ||||
There is no trailing newline. | ||||
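Since the Nth character of the response corresponds to the Nth requested node, the result can be paired back up with the request. The following is a minimal sketch with a hypothetical function name, treating the response as ``str``.

```python
def parse_known(nodes, data):
    """Map each requested node to True (known) or False (unknown)."""
    if len(data) != len(nodes):
        raise ValueError("expected exactly one flag per requested node")
    if set(data) - {"0", "1"}:
        raise ValueError("response may only contain '0' and '1'")
    return {node: flag == "1" for node, flag in zip(nodes, data)}
```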

protocaps
---------
Notify the server about the client capabilities in the SSH V1 transport | ||||
protocol. | ||||
The ``caps`` argument is a space-delimited list of capabilities. | ||||
The server will reply with the string ``OK``. | ||||

pushkey
-------
Set a value using the ``pushkey`` protocol. | ||||
Accepts arguments ``namespace``, ``key``, ``old``, and ``new``, which | ||||
correspond to the pushkey namespace to operate on, the key within that | ||||
namespace to change, the old value (which may be empty), and the new value. | ||||
All arguments are string types. | ||||
The return type is a ``string``. The value depends on the transport protocol. | ||||
The SSH version 1 transport sends a string-encoded integer followed by a
newline (``\n``) which indicates the operation result. The server may send
additional output on the ``stderr`` stream that should be displayed to the
user.

The HTTP version 1 transport sends a string-encoded integer followed by a
newline followed by additional server output that should be displayed to
the user. This may include output from hooks, etc.

The integer result varies by namespace. ``0`` means an error has occurred | ||||
and there should be additional output to display to the user. | ||||
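For the HTTP version 1 form of the response, the integer and the server output can be separated at the first newline. The helper below is a hypothetical sketch operating on ``str`` data.

```python
def parse_pushkey_http(data):
    """Split an HTTP ``pushkey`` response into (result, server_output)."""
    first_line, _, output = data.partition("\n")
    # A result of 0 signals an error; the output should be shown to the user.
    return int(first_line), output
```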
stream_out | ||||
---------- | ||||
Obtain *streaming clone* data. | ||||
The return type is either a ``string`` or a ``stream``, depending on | ||||
whether the request was fulfilled properly. | ||||
A return value of ``1\n`` indicates the server is not configured to serve | ||||
this data. If this is seen by the client, they may not have verified the | ||||
``stream`` capability is set before making the request. | ||||
A return value of ``2\n`` indicates the server was unable to lock the | ||||
repository to generate data. | ||||
All other responses are a ``stream`` of bytes. The first line of this data | ||||
contains 2 space-delimited integers corresponding to the path count and | ||||
payload size, respectively:: | ||||
<path count> <payload size>\n | ||||
The ``<payload size>`` is the total size of path data: it does not include | ||||
the size of the per-path header lines. | ||||
Following that header are ``<path count>`` entries. Each entry consists of a | ||||
line with metadata followed by raw revlog data. The line consists of:: | ||||
<store path>\0<size>\n | ||||
The ``<store path>`` is the encoded store path of the data that follows. | ||||
``<size>`` is the amount of data for this store path/revlog that follows the | ||||
newline. | ||||
There is no trailer to indicate end of data. Instead, the client should stop | ||||
reading after ``<path count>`` entries are consumed. | ||||
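The framing described above can be read with a loop like the following. This is a simplified sketch, not Mercurial's implementation: ``read_stream_out`` is a hypothetical name, each entry is held fully in memory (a real client would stream large revlogs to disk), and errors are reported with ``RuntimeError`` purely for illustration.

```python
import io


def read_stream_out(fh):
    """Read a ``stream_out`` response from a binary file-like object.

    Returns (payload_size, [(store_path, data), ...]).
    """
    first = fh.readline()
    if first == b"1\n":
        raise RuntimeError("server is not configured to serve stream data")
    if first == b"2\n":
        raise RuntimeError("server was unable to lock the repository")

    # Header line: "<path count> <payload size>\n"
    path_count, payload_size = (int(v) for v in first.split(b" "))

    entries = []
    for _ in range(path_count):
        # Per-entry line: "<store path>\0<size>\n"
        header = fh.readline().rstrip(b"\n")
        path, _, size = header.partition(b"\0")
        data = fh.read(int(size))
        if len(data) != int(size):
            raise EOFError("stream ended before entry was complete")
        entries.append((path, data))
    # There is no trailer: stop once <path count> entries are consumed.
    return payload_size, entries


# Usage with an in-memory stream holding two entries (4 + 3 = 7 bytes).
example = io.BytesIO(b"2 7\ndata/a.i\x004\nabcd00changelog.i\x003\nxyz")
size, files = read_stream_out(example)
```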
unbundle | ||||
-------- | ||||
Send a bundle containing data (usually changegroup data) to the server. | ||||
Accepts the argument ``heads``, which is a space-delimited list of hex nodes | ||||
corresponding to server repository heads observed by the client. This is used | ||||
to detect race conditions and abort push operations before a server performs | ||||
too much work or a client transfers too much data. | ||||
The request payload consists of a bundle to be applied to the repository,
as if :hg:`unbundle` were called.
In most scenarios, a special ``push response`` type is returned. This type
contains an integer describing the change in heads as a result of the
operation. A value of ``0`` indicates nothing changed. ``1`` means the number
of heads remained the same. Values ``2`` and larger are the number of added
heads plus 1. e.g. ``3`` means 2 heads were added. Negative values are the
negated number of removed heads, also offset by 1. e.g. ``-2`` means there
is 1 fewer head.
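The off-by-one encoding of the integer can be undone as sketched here. ``decode_push_result`` is a hypothetical name, and ``-1`` is rejected because the description above does not assign it a meaning.

```python
def decode_push_result(value):
    """Return (changed, head_delta) for a ``push response`` integer."""
    if value == 0:
        return (False, 0)          # nothing changed
    if value == 1:
        return (True, 0)           # changed, same number of heads
    if value >= 2:
        return (True, value - 1)   # e.g. 3 -> 2 heads added
    if value <= -2:
        return (True, value + 1)   # e.g. -2 -> 1 fewer head
    raise ValueError("-1 is not a defined push result")
```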
The encoding of the ``push response`` type varies by transport. | ||||
For the SSH version 1 transport, this type is composed of 2 ``string``
responses: an empty response (``0\n``) followed by the integer result value.
e.g. ``1\n2``. So the full response might be ``0\n1\n2``.

For the HTTP version 1 transport, the response is a ``string`` type composed
of an integer result value followed by a newline (``\n``) followed by string
content holding server output that should be displayed on the client (output
from hooks, etc).

In some cases, the server may respond with a ``bundle2`` bundle. In this
case, the response type is ``stream``. For the HTTP version 1 transport, the
response is zlib compressed.

The server may also respond with a generic error type, which contains a string
indicating the failure.

Frame-Based Protocol Commands | ||||
============================= | ||||
**Experimental and under active development** | ||||
This section documents the wire protocol commands exposed to transports | ||||
using the frame-based protocol. The set of commands exposed through | ||||
these transports is distinct from the set of commands exposed to legacy | ||||
transports. | ||||
The frame-based protocol uses CBOR to encode command execution requests. | ||||
All command arguments must be mapped to a specific CBOR data type or set
of CBOR data types.
The response to many commands is also CBOR. There is no common response | ||||
format: each command defines its own response format. | ||||
TODO require node type be specified, as N bytes of binary node value | ||||
could be ambiguous once SHA-1 is replaced. | ||||

branchmap
---------
Obtain heads in named branches. | ||||
Receives no arguments. | ||||
The response is a map with bytestring keys defining the branch name. | ||||
Values are arrays of bytestring defining raw changeset nodes. | ||||

capabilities
------------
Obtain the server's capabilities. | ||||
Receives no arguments. | ||||
This command is typically called only as part of the handshake during | ||||
initial connection establishment. | ||||
The response is a map with bytestring keys defining server information. | ||||
The defined keys are: | ||||
commands | ||||
A map defining available wire protocol commands on this server. | ||||
Keys in the map are the names of commands that can be invoked. Values | ||||
are maps defining information about that command. The bytestring keys | ||||
are: | ||||
args | ||||
A map of argument names and their expected types.
Types are defined as a representative value for the expected type.
e.g. an argument expecting a boolean type will have its value
set to true. An integer type will have its value set to 42. The
actual values are arbitrary and may not have meaning.
permissions
An array of permissions required to execute this command.
An array of permissions required to execute this command. | ||||
compression | ||||
An array of maps defining available compression format support. | ||||
The array is sorted from most preferred to least preferred. | ||||
Each entry has the following bytestring keys: | ||||
name | ||||
Name of the compression engine. e.g. ``zstd`` or ``zlib``. | ||||
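To make the shape of this map concrete, the decoded response might look like the Python structure below once the CBOR has been parsed. Everything in ``EXAMPLE_CAPABILITIES`` is illustrative: the ``pull`` permission value is an assumption, and ``pick_compression`` is a hypothetical helper showing how a client could honor the server's preference order.

```python
# Hypothetical decoded ``capabilities`` response (CBOR maps -> dicts,
# CBOR bytestrings -> bytes). Values are examples, not a real server reply.
EXAMPLE_CAPABILITIES = {
    b"commands": {
        b"heads": {
            # A boolean argument is advertised with a representative value.
            b"args": {b"publiconly": True},
            b"permissions": [b"pull"],  # assumed permission name
        },
    },
    # Sorted from most preferred to least preferred.
    b"compression": [{b"name": b"zstd"}, {b"name": b"zlib"}],
}


def pick_compression(capabilities, supported):
    """Return the server's most preferred engine the client also supports."""
    for entry in capabilities.get(b"compression", []):
        if entry[b"name"] in supported:
            return entry[b"name"]
    return None
```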

heads
-----
Obtain DAG heads in the repository. | ||||
The command accepts the following arguments: | ||||
publiconly (optional) | ||||
(boolean) If set, operate on the DAG for public phase changesets only. | ||||
Non-public (i.e. draft) phase DAG heads will not be returned. | ||||
The response is a CBOR array of bytestrings defining changeset nodes | ||||
of DAG heads. The array can be empty if the repository is empty or no | ||||
changesets satisfied the request. | ||||
TODO consider exposing phase of heads in response | ||||
known | ||||
----- | ||||
Determine whether a series of changeset nodes is known to the server. | ||||
The command accepts the following arguments: | ||||
nodes | ||||
(array of bytestrings) List of changeset nodes whose presence to | ||||
query. | ||||
The response is a bytestring where each byte contains a 0 or 1 for the | ||||
corresponding requested node at the same index. | ||||
TODO use a bit array for even more compact response | ||||
listkeys | ||||
-------- | ||||
List values in a specified ``pushkey`` namespace. | ||||
The command receives the following arguments: | ||||
namespace | ||||
(bytestring) Pushkey namespace to query. | ||||
The response is a map with bytestring keys and values. | ||||
TODO consider using binary to represent nodes in certain pushkey namespaces. | ||||

lookup
------
Try to resolve a value to a changeset revision. | ||||
Unlike ``known`` which operates on changeset nodes, lookup operates on | ||||
node fragments and other names that a user may use. | ||||
The command receives the following arguments: | ||||
key | ||||
(bytestring) Value to try to resolve. | ||||
On success, returns a bytestring containing the resolved node. | ||||

pushkey
-------
Set a value using the ``pushkey`` protocol. | ||||
The command receives the following arguments: | ||||
namespace | ||||
(bytestring) Pushkey namespace to operate on. | ||||
key | ||||
(bytestring) The pushkey key to set. | ||||
old | ||||
(bytestring) Old value for this key. | ||||
new | ||||
(bytestring) New value for this key. | ||||
TODO consider using binary to represent nodes in certain pushkey namespaces.
TODO better define response type and meaning. | ||||