|
|
The Mercurial wire protocol is a request-response based protocol
|
|
|
with multiple wire representations.
|
|
|
|
|
|
Each request is modeled as a command name, a dictionary of arguments, and
|
|
|
optional raw input. Command arguments and their types are intrinsic
|
|
|
properties of commands. So is the response type of the command. This means
|
|
|
clients can't always send arbitrary arguments to servers and servers can't
|
|
|
return multiple response types.
|
|
|
|
|
|
The protocol is synchronous and does not support multiplexing (concurrent
|
|
|
commands).
|
|
|
|
|
|
Handshake
|
|
|
=========
|
|
|
|
|
|
It is required or common for clients to perform a *handshake* when connecting
|
|
|
to a server. The handshake serves the following purposes:
|
|
|
|
|
|
* Negotiates protocol/transport level options
|
|
|
* Allows the client to learn about server capabilities to influence
|
|
|
future requests
|
|
|
* Ensures the underlying transport channel is in a *clean* state
|
|
|
|
|
|
An important goal of the handshake is to allow clients to use more modern
|
|
|
wire protocol features. By default, clients must assume they are talking
|
|
|
to an old version of Mercurial server (possibly even the very first
|
|
|
implementation). So, clients should not attempt to call or utilize modern
|
|
|
wire protocol features until they have confirmation that the server
|
|
|
supports them. The handshake implementation is designed to allow both
|
|
|
ends to utilize the latest set of features and capabilities with as
|
|
|
few round trips as possible.
|
|
|
|
|
|
The handshake mechanism varies by transport and protocol and is documented
|
|
|
in the sections below.
|
|
|
|
|
|
HTTP Protocol
|
|
|
=============
|
|
|
|
|
|
Handshake
|
|
|
---------
|
|
|
|
|
|
The client sends a ``capabilities`` command request (``?cmd=capabilities``)
|
|
|
as soon as HTTP requests may be issued.
|
|
|
|
|
|
The server responds with a capabilities string, which the client parses to
|
|
|
learn about the server's abilities.
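
As a rough illustration of the parsing step, the sketch below assumes the
capabilities string is a space-separated list of tokens, where each token is
either a bare name or a ``name=value`` pair with a percent-encoded value. That
layout is an assumption made for illustration (this section does not define
it), and ``parse_capabilities`` is a hypothetical helper name.

```python
from urllib.parse import unquote


def parse_capabilities(raw):
    """Parse a capabilities string into a dict (assumed token layout)."""
    caps = {}
    for token in raw.split():
        name, sep, value = token.partition("=")
        # Bare tokens are recorded as True; name=value tokens keep the value.
        caps[name] = unquote(value) if sep else True
    return caps


caps = parse_capabilities("lookup branchmap httpheader=1024")
```

A client could then test ``"httpheader" in caps`` before choosing how to send
command arguments.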
|
|
|
|
|
|
HTTP Version 1 Transport
|
|
|
------------------------
|
|
|
|
|
|
Commands are issued as HTTP/1.0 or HTTP/1.1 requests. Commands are
|
|
|
sent to the base URL of the repository with the command name sent in
|
|
|
the ``cmd`` query string parameter, e.g.
|
|
|
``https://example.com/repo?cmd=capabilities``. The HTTP method is ``GET``
|
|
|
or ``POST`` depending on the command and whether there is a request
|
|
|
body.
|
|
|
|
|
|
Command arguments can be sent multiple ways.
|
|
|
|
|
|
The simplest is part of the URL query string using ``x-www-form-urlencoded``
|
|
|
encoding (see Python's ``urllib.urlencode()``). However, many servers impose
|
|
|
length limitations on the URL. So this mechanism is typically only used if
|
|
|
the server doesn't support other mechanisms.
|
|
|
|
|
|
If the server supports the ``httpheader`` capability, command arguments can
|
|
|
be sent in HTTP request headers named ``X-HgArg-<N>`` where ``<N>`` is an
|
|
|
integer starting at 1. A ``x-www-form-urlencoded`` representation of the
|
|
|
arguments is obtained. This full string is then split into chunks and sent
|
|
|
in numbered ``X-HgArg-<N>`` headers. The maximum length of each HTTP header
|
|
|
is defined by the server in the ``httpheader`` capability value, which defaults
|
|
|
to ``1024``. The server reassembles the encoded arguments string by
|
|
|
concatenating the ``X-HgArg-<N>`` headers then URL decodes them into a
|
|
|
dictionary.
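
The split-and-reassemble procedure described above can be sketched as follows.
This is a minimal illustration, not Mercurial's implementation: the helper
names are hypothetical, and it assumes the limit applies to the whole header
line, so it reserves room for a name such as ``X-HgArg-000`` plus the ``: ``
separator. It uses Python 3's ``urllib.parse`` module.

```python
from urllib.parse import parse_qsl, urlencode


def split_args_into_headers(args, max_header_len=1024):
    """Return a list of (header name, value chunk) pairs."""
    encoded = urlencode(sorted(args.items()))
    # Assumption: the limit covers the full "name: value" header line.
    reserve = len("X-HgArg-000: ")
    chunk_len = max_header_len - reserve
    if chunk_len <= 0:
        raise ValueError("header limit too small to carry any data")
    headers = []
    for start in range(0, len(encoded), chunk_len):
        name = "X-HgArg-%d" % (len(headers) + 1)  # numbering starts at 1
        headers.append((name, encoded[start:start + chunk_len]))
    return headers


def join_headers_into_args(headers):
    """Server side: concatenate chunks in numeric order, then URL decode."""
    ordered = sorted(headers, key=lambda h: int(h[0].rsplit("-", 1)[1]))
    encoded = "".join(value for _, value in ordered)
    return dict(parse_qsl(encoded, keep_blank_values=True))


example_args = {"foo": "bar", "baz": "hello world"}
example_headers = split_args_into_headers(example_args, max_header_len=20)
```

Because decoding happens only after concatenation, a chunk boundary may fall
in the middle of a percent escape without harm.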
|
|
|
|
|
|
The list of ``X-HgArg-<N>`` headers should be added to the ``Vary`` request
|
|
|
header to instruct caches to take these headers into consideration when caching
|
|
|
requests.
|
|
|
|
|
|
If the server supports the ``httppostargs`` capability, the client
|
|
|
may send command arguments in the HTTP request body as part of an
|
|
|
HTTP POST request. The command arguments will be URL encoded just like
|
|
|
they would for sending them via HTTP headers. However, no splitting is
|
|
|
performed: the raw arguments are included in the HTTP request body.
|
|
|
|
|
|
The client sends a ``X-HgArgs-Post`` header with the string length of the
|
|
|
encoded arguments data. Additional data may be included in the HTTP
|
|
|
request body immediately following the argument data. The offset of the
|
|
|
non-argument data is defined by the ``X-HgArgs-Post`` header. The
|
|
|
``X-HgArgs-Post`` header is not required if there is no argument data.
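
A minimal sketch of how a request body could be laid out under
``httppostargs``, and how a server could locate the non-argument data. The
function names are hypothetical and the header dictionary is a simplification
of a real HTTP client API.

```python
from urllib.parse import urlencode


def build_post_request(args, extra_data=b""):
    """Return (headers, body) with arguments first, then any extra data."""
    encoded = urlencode(sorted(args.items())).encode("ascii")
    headers = {"Content-Type": "application/mercurial-0.1"}
    if encoded:
        # The header carries the length of the encoded arguments, which is
        # also the offset of any data that follows them.
        headers["X-HgArgs-Post"] = str(len(encoded))
    body = encoded + extra_data
    headers["Content-Length"] = str(len(body))
    return headers, body


def split_post_body(headers, body):
    """Server side: separate the encoded arguments from trailing data."""
    offset = int(headers.get("X-HgArgs-Post", "0"))
    return body[:offset], body[offset:]


post_headers, post_body = build_post_request({"a": "1"}, b"RAW")
```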
|
|
|
|
|
|
Additional command data can be sent as part of the HTTP request body. The
|
|
|
default ``Content-Type`` when sending data is ``application/mercurial-0.1``.
|
|
|
A ``Content-Length`` header is currently always sent.
|
|
|
|
|
|
Example HTTP requests::
|
|
|
|
|
|
GET /repo?cmd=capabilities
|
|
|
X-HgArg-1: foo=bar&baz=hello%20world
|
|
|
|
|
|
The request media type should be chosen based on server support. If the
|
|
|
``httpmediatype`` server capability is present, the client should send
|
|
|
the newest mutually supported media type. If this capability is absent,
|
|
|
the client must assume the server only supports the
|
|
|
``application/mercurial-0.1`` media type.
|
|
|
|
|
|
The ``Content-Type`` HTTP response header identifies the response as coming
|
|
|
from Mercurial and can also be used to signal an error has occurred.
|
|
|
|
|
|
The ``application/mercurial-*`` media types indicate a generic Mercurial
|
|
|
data type.
|
|
|
|
|
|
The ``application/mercurial-0.1`` media type is raw Mercurial data. It is the
|
|
|
predecessor of the format below.
|
|
|
|
|
|
The ``application/mercurial-0.2`` media type is compression framed Mercurial
|
|
|
data. The first byte of the payload indicates the length of the compression
|
|
|
format identifier that follows. Next are N bytes indicating the compression
|
|
|
format. e.g. ``zlib``. The remaining bytes are compressed according to that
|
|
|
compression format. The decompressed data behaves the same as with
|
|
|
``application/mercurial-0.1``.
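
The framing described for ``application/mercurial-0.2`` can be sketched as
below. The helper names are hypothetical, and only ``zlib`` is handled; a real
client would dispatch on whatever compression formats it supports.

```python
import zlib


def encode_v02(data, engine=b"zlib"):
    """Prefix compressed data with a 1-byte name length and the name."""
    if engine != b"zlib":
        raise ValueError("this sketch only knows zlib")
    return bytes([len(engine)]) + engine + zlib.compress(data)


def decode_v02(payload):
    """Read the compression identifier, then decompress the remainder."""
    name_len = payload[0]
    name = payload[1:1 + name_len]
    compressed = payload[1 + name_len:]
    if name != b"zlib":
        raise ValueError("unsupported compression format: %r" % name)
    return zlib.decompress(compressed)


framed = encode_v02(b"raw mercurial data")
```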
|
|
|
|
|
|
The ``application/hg-error`` media type indicates a generic error occurred.
|
|
|
The content of the HTTP response body typically holds text describing the
|
|
|
error.
|
|
|
|
|
|
The ``application/hg-changegroup`` media type indicates a changegroup response
|
|
|
type.
|
|
|
|
|
|
Clients also accept the ``text/plain`` media type. All other media
|
|
|
types should cause the client to error.
|
|
|
|
|
|
Behavior of media types is further described in the ``Content Negotiation``
|
|
|
section below.
|
|
|
|
|
|
Clients should issue a ``User-Agent`` request header that identifies the client.
|
|
|
The server should not use the ``User-Agent`` for feature detection.
|
|
|
|
|
|
A command returning a ``string`` response issues a
|
|
|
``application/mercurial-0.*`` media type and the HTTP response body contains
|
|
|
the raw string value (after compression decoding, if used). A
|
|
|
``Content-Length`` header is typically issued, but not required.
|
|
|
|
|
|
A command returning a ``stream`` response issues a
|
|
|
``application/mercurial-0.*`` media type and the HTTP response is typically
|
|
|
using *chunked transfer* (``Transfer-Encoding: chunked``).
|
|
|
|
|
|
HTTP Version 2 Transport
|
|
|
------------------------
|
|
|
|
|
|
**Experimental - feature under active development**
|
|
|
|
|
|
Version 2 of the HTTP protocol is exposed under the ``/api/*`` URL space.
|
|
|
Its final API name is not yet formalized.
|
|
|
|
|
|
Commands are triggered by sending HTTP POST requests against URLs of the
|
|
|
form ``<permission>/<command>``, where ``<permission>`` is ``ro`` or
|
|
|
``rw``, meaning read-only and read-write, respectively, and ``<command>``
|
|
|
is a named wire protocol command.
|
|
|
|
|
|
Non-POST request methods MUST be rejected by the server with an HTTP
|
|
|
405 response.
|
|
|
|
|
|
Commands that modify repository state in meaningful ways MUST NOT be
|
|
|
exposed under the ``ro`` URL prefix. All available commands MUST be
|
|
|
available under the ``rw`` URL prefix.
|
|
|
|
|
|
Server administrators MAY implement blanket HTTP authentication keyed
|
|
|
off the URL prefix. For example, a server may require authentication
|
|
|
for all ``rw/*`` URLs and let unauthenticated requests to ``ro/*``
|
|
|
URLs proceed. A server MAY issue an HTTP 401, 403, or 407 response
|
|
|
in accordance with RFC 7235. Clients SHOULD recognize the HTTP Basic
|
|
|
(RFC 7617) and Digest (RFC 7616) authentication schemes. Clients SHOULD
|
|
|
make an attempt to recognize unknown schemes using the
|
|
|
``WWW-Authenticate`` response header on a 401 response, as defined by
|
|
|
RFC 7235.
|
|
|
|
|
|
Read-only commands are accessible under ``rw/*`` URLs so clients can
|
|
|
signal the intent of the operation very early in the connection
|
|
|
lifecycle. For example, a ``push`` operation - which consists of
|
|
|
various read-only commands mixed with at least one read-write command -
|
|
|
can perform all commands against ``rw/*`` URLs so that any server-side
|
|
|
authentication requirements are discovered upon attempting the first
|
|
|
command - not potentially several commands into the exchange. This
|
|
|
allows clients to fail faster or prompt for credentials as soon as the
|
|
|
exchange takes place. This provides a better end-user experience.
|
|
|
|
|
|
Requests to unknown commands or URLs result in an HTTP 404.
|
|
|
TODO formally define response type, how error is communicated, etc.
|
|
|
|
|
|
HTTP request and response bodies use the *Unified Frame-Based Protocol*
|
|
|
(defined below) for media exchange. The entirety of the HTTP message
|
|
|
body is 0 or more frames as defined by this protocol.
|
|
|
|
|
|
Clients and servers MUST advertise the ``TBD`` media type via the
|
|
|
``Content-Type`` request and response headers. In addition, clients MUST
|
|
|
advertise this media type value in their ``Accept`` request header in all
|
|
|
requests.
|
|
|
TODO finalize the media type. For now, it is defined in wireprotoserver.py.
|
|
|
|
|
|
Servers receiving requests without an ``Accept`` header SHOULD respond with
|
|
|
an HTTP 406.
|
|
|
|
|
|
Servers receiving requests with an invalid ``Content-Type`` header SHOULD
|
|
|
respond with an HTTP 415.
|
|
|
|
|
|
The command to run is specified in the POST payload as defined by the
|
|
|
*Unified Frame-Based Protocol*. This is redundant with data already
|
|
|
encoded in the URL. This is by design, so server operators can have
|
|
|
better understanding about server activity from looking merely at
|
|
|
HTTP access logs.
|
|
|
|
|
|
In most circumstances, the command specified in the URL MUST match
|
|
|
the command specified in the frame-based payload or the server will
|
|
|
respond with an error. The exception to this is the special
|
|
|
``multirequest`` URL. (See below.) In addition, HTTP requests
|
|
|
are limited to one command invocation. The exception is the special
|
|
|
``multirequest`` URL.
|
|
|
|
|
|
The ``multirequest`` command endpoints (``ro/multirequest`` and
|
|
|
``rw/multirequest``) are special in that they allow the execution of
|
|
|
*any* command and allow the execution of multiple commands. If the
|
|
|
HTTP request issues multiple commands across multiple frames, all
|
|
|
issued commands will be processed by the server. Per the defined
|
|
|
behavior of the *Unified Frame-Based Protocol*, commands may be
|
|
|
issued interleaved and responses may come back in a different order
|
|
|
than they were issued. Clients MUST be able to deal with this.
|
|
|
|
|
|
SSH Protocol
|
|
|
============
|
|
|
|
|
|
Handshake
|
|
|
---------
|
|
|
|
|
|
For all clients, the handshake consists of the client sending 1 or more
|
|
|
commands to the server using version 1 of the transport. Servers respond
|
|
|
to commands they know how to respond to and send an empty response (``0\n``)
|
|
|
for unknown commands (per standard behavior of version 1 of the transport).
|
|
|
Clients then typically look for a response to the newest sent command to
|
|
|
determine which transport version to use and what the available features for
|
|
|
the connection and server are.
|
|
|
|
|
|
Preceding any response from client-issued commands, the server may print
|
|
|
non-protocol output. It is common for SSH servers to print banners, message
|
|
|
of the day announcements, etc. when clients connect. It is assumed that any
|
|
|
such *banner* output will precede any Mercurial server output. So clients
|
|
|
must be prepared to handle server output on initial connect that isn't
|
|
|
in response to any client-issued command and doesn't conform to Mercurial's
|
|
|
wire protocol. This *banner* output should only be on stdout. However,
|
|
|
some servers may send output on stderr.
|
|
|
|
|
|
Pre 0.9.1 clients issue a ``between`` command with the ``pairs`` argument
|
|
|
having the value
|
|
|
``0000000000000000000000000000000000000000-0000000000000000000000000000000000000000``.
|
|
|
|
|
|
The ``between`` command has been supported since the original Mercurial
|
|
|
SSH server. Requesting the empty range will return a ``\n`` string response,
|
|
|
which will be encoded as ``1\n\n`` (value length of ``1`` followed by a newline
|
|
|
followed by the value, which happens to be a newline).
|
|
|
|
|
|
For pre 0.9.1 clients and all servers, the exchange looks like::
|
|
|
|
|
|
c: between\n
|
|
|
c: pairs 81\n
|
|
|
c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
|
|
|
s: 1\n
|
|
|
s: \n
|
|
|
|
|
|
0.9.1+ clients send a ``hello`` command (with no arguments) before the
|
|
|
``between`` command. The response to this command allows clients to
|
|
|
discover server capabilities and settings.
|
|
|
|
|
|
An example exchange between 0.9.1+ clients and a ``hello`` aware server looks
|
|
|
like::
|
|
|
|
|
|
c: hello\n
|
|
|
c: between\n
|
|
|
c: pairs 81\n
|
|
|
c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
|
|
|
s: 324\n
|
|
|
s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
|
|
|
s: 1\n
|
|
|
s: \n
|
|
|
|
|
|
And a similar scenario but with servers sending a banner on connect::
|
|
|
|
|
|
c: hello\n
|
|
|
c: between\n
|
|
|
c: pairs 81\n
|
|
|
c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
|
|
|
s: welcome to the server\n
|
|
|
s: if you find any issues, email someone@somewhere.com\n
|
|
|
s: 324\n
|
|
|
s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
|
|
|
s: 1\n
|
|
|
s: \n
|
|
|
|
|
|
Note that output from the ``hello`` command is terminated by a ``\n``. This is
|
|
|
part of the response payload and not part of the wire protocol adding a newline
|
|
|
after responses. In other words, the length of the response contains the
|
|
|
trailing ``\n``.
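
The bytes a version 1 client writes during the handshake can be assembled as
in the sketch below. ``build_handshake`` is a hypothetical helper; the argument
length of ``81`` falls out of two 40-character null node IDs joined by ``-``.

```python
NULL_NODE = "0" * 40  # 40 hex digits, the null node ID


def build_handshake(include_hello=True):
    """Return the handshake request bytes for a version 1 SSH client."""
    pairs = "%s-%s" % (NULL_NODE, NULL_NODE)
    parts = []
    if include_hello:
        parts.append("hello\n")  # sent by 0.9.1+ clients only
    parts.append("between\n")
    parts.append("pairs %d\n" % len(pairs))  # argument name and value length
    parts.append(pairs)  # raw value, no trailing newline
    return "".join(parts).encode("ascii")


legacy_handshake = build_handshake(include_hello=False)
modern_handshake = build_handshake()
```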
|
|
|
|
|
|
Clients supporting version 2 of the SSH transport send a line beginning
|
|
|
with ``upgrade`` before the ``hello`` and ``between`` commands. The line
|
|
|
(which isn't a well-formed command line because it doesn't consist of a
|
|
|
single command name) serves to both communicate the client's intent to
|
|
|
switch to transport version 2 (transports are version 1 by default) as
|
|
|
well as to advertise the client's transport-level capabilities so the
|
|
|
server may satisfy that request immediately.
|
|
|
|
|
|
The upgrade line has the form:
|
|
|
|
|
|
upgrade <token> <transport capabilities>
|
|
|
|
|
|
That is the literal string ``upgrade`` followed by a space, followed by
|
|
|
a randomly generated string, followed by a space, followed by a string
|
|
|
denoting the client's transport capabilities.
|
|
|
|
|
|
The token can be anything. However, a random UUID is recommended. (Use
|
|
|
of version 4 UUIDs is recommended because version 1 UUIDs can leak the
|
|
|
client's MAC address.)
|
|
|
|
|
|
The transport capabilities string is a URL/percent encoded string
|
|
|
containing key-value pairs defining the client's transport-level
|
|
|
capabilities. The following capabilities are defined:
|
|
|
|
|
|
proto
|
|
|
A comma-delimited list of transport protocol versions the client
|
|
|
supports. e.g. ``ssh-v2``.
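
A client could build the upgrade line roughly as follows. This is a sketch
under assumptions: ``build_upgrade_line`` is a hypothetical name, and the line
is terminated with ``\n`` on the assumption that it is sent as an ordinary
line ahead of the ``hello`` command.

```python
import uuid
from urllib.parse import urlencode


def build_upgrade_line(protocols=("ssh-v2",), token=None):
    """Return (token, upgrade line) for the given transport names."""
    if token is None:
        token = str(uuid.uuid4())  # version 4 UUID, as recommended
    # The capabilities string is URL encoded key-value pairs.
    capabilities = urlencode({"proto": ",".join(protocols)})
    return token, "upgrade %s %s\n" % (token, capabilities)


fixed_token, fixed_line = build_upgrade_line(token="abc")
random_token, random_line = build_upgrade_line()
```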
|
|
|
|
|
|
If the server does not recognize the ``upgrade`` line, it should issue
|
|
|
an empty response and continue processing the ``hello`` and ``between``
|
|
|
commands. Here is an example handshake between a version 2 aware client
|
|
|
and a non version 2 aware server:
|
|
|
|
|
|
c: upgrade 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a proto=ssh-v2
|
|
|
c: hello\n
|
|
|
c: between\n
|
|
|
c: pairs 81\n
|
|
|
c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
|
|
|
s: 0\n
|
|
|
s: 324\n
|
|
|
s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
|
|
|
s: 1\n
|
|
|
s: \n
|
|
|
|
|
|
(The initial ``0\n`` line from the server indicates an empty response to
|
|
|
the unknown ``upgrade ..`` command/line.)
|
|
|
|
|
|
If the server recognizes the ``upgrade`` line and is willing to satisfy that
|
|
|
upgrade request, it replies with a payload of the following form:
|
|
|
|
|
|
upgraded <token> <transport name>\n
|
|
|
|
|
|
This line is the literal string ``upgraded``, a space, the token that was
|
|
|
specified by the client in its ``upgrade ...`` request line, a space, and the
|
|
|
name of the transport protocol that was chosen by the server. The transport
|
|
|
name MUST match one of the names the client specified in the ``proto`` field
|
|
|
of its ``upgrade ...`` request line.
|
|
|
|
|
|
If a server issues an ``upgraded`` response, it MUST also read and ignore
|
|
|
the lines associated with the ``hello`` and ``between`` command requests
|
|
|
that were issued by the client. It is assumed that the negotiated transport
|
|
|
will respond with equivalent requested information following the transport
|
|
|
handshake.
|
|
|
|
|
|
All data following the ``\n`` terminating the ``upgraded`` line is the
|
|
|
domain of the negotiated transport. It is common for the data immediately
|
|
|
following to contain additional metadata about the state of the transport and
|
|
|
the server. However, this isn't strictly speaking part of the transport
|
|
|
handshake and isn't covered by this section.
|
|
|
|
|
|
Here is an example handshake between a version 2 aware client and a version
|
|
|
2 aware server:
|
|
|
|
|
|
c: upgrade 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a proto=ssh-v2
|
|
|
c: hello\n
|
|
|
c: between\n
|
|
|
c: pairs 81\n
|
|
|
c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
|
|
|
s: upgraded 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a ssh-v2\n
|
|
|
s: <additional transport specific data>
|
|
|
|
|
|
The client-issued token that is echoed in the response provides a more
|
|
|
resilient mechanism for differentiating *banner* output from Mercurial
|
|
|
output. In version 1, properly formatted banner output could get confused
|
|
|
for Mercurial server output. By submitting a randomly generated token
|
|
|
that is then present in the response, the client can look for that token
|
|
|
in response lines and have reasonable certainty that the line did not
|
|
|
originate from a *banner* message.
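
The token check can be sketched as a scan over the lines received from the
server. The function name and the choice to raise ``ValueError`` for a
transport the client never offered are illustrative assumptions.

```python
def find_upgraded_transport(lines, token, offered):
    """Return the negotiated transport name, or None if no upgrade is seen.

    Lines that do not carry the expected token are treated as banner or
    other non-upgrade output and skipped.
    """
    for line in lines:
        parts = line.rstrip("\n").split(" ")
        if len(parts) == 3 and parts[0] == "upgraded" and parts[1] == token:
            if parts[2] not in offered:
                raise ValueError("server chose a transport we did not offer")
            return parts[2]
    return None


server_lines = [
    "welcome to the server\n",
    "upgraded some-other-token ssh-v2\n",  # banner text imitating a reply
    "upgraded 2e82ab3f ssh-v2\n",
]
chosen = find_upgraded_transport(server_lines, "2e82ab3f", {"ssh-v2"})
```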
|
|
|
|
|
|
SSH Version 1 Transport
|
|
|
-----------------------
|
|
|
|
|
|
The SSH transport (version 1) is a custom text-based protocol suitable for
|
|
|
use over any bi-directional stream transport. It is most commonly used with
|
|
|
SSH.
|
|
|
|
|
|
A SSH transport server can be started with ``hg serve --stdio``. The stdin,
|
|
|
stderr, and stdout file descriptors of the started process are used to exchange
|
|
|
data. When Mercurial connects to a remote server over SSH, it actually starts
|
|
|
a ``hg serve --stdio`` process on the remote server.
|
|
|
|
|
|
Commands are issued by sending the command name followed by a trailing newline
|
|
|
``\n`` to the server. e.g. ``capabilities\n``.
|
|
|
|
|
|
Command arguments are sent in the following format::
|
|
|
|
|
|
<argument> <length>\n<value>
|
|
|
|
|
|
That is, the argument string name followed by a space followed by the
|
|
|
integer length of the value (expressed as a string) followed by a newline
|
|
|
(``\n``) followed by the raw argument value.
|
|
|
|
|
|
Dictionary arguments are encoded differently::
|
|
|
|
|
|
<argument> <# elements>\n
|
|
|
<key1> <length1>\n<value1>
|
|
|
<key2> <length2>\n<value2>
|
|
|
...
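
The two argument encodings above can be sketched together. The helper names
are hypothetical, and the sketch works on ``bytes`` throughout because
argument values are raw data.

```python
def encode_argument(name, value):
    """Encode one argument as: name, space, length, newline, raw value."""
    return b"%s %d\n%s" % (name, len(value), value)


def encode_command(command, args):
    """Encode a command name followed by its arguments."""
    parts = [command + b"\n"]
    for name, value in args.items():
        if isinstance(value, dict):
            # Dictionary arguments: element count first, then each entry.
            parts.append(b"%s %d\n" % (name, len(value)))
            for key, item in value.items():
                parts.append(encode_argument(key, item))
        else:
            parts.append(encode_argument(name, value))
    return b"".join(parts)


null_range = b"0" * 40 + b"-" + b"0" * 40
between_request = encode_command(b"between", {b"pairs": null_range})
dict_request = encode_command(b"cmd", {b"opts": {b"k": b"vv"}})
```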
|
|
|
|
|
|
Non-argument data is sent immediately after the final argument value. It is
|
|
|
encoded in chunks::
|
|
|
|
|
|
<length>\n<data>
|
|
|
|
|
|
Each command declares a list of supported arguments and their types. If a
|
|
|
client sends an unknown argument to the server, the server should abort
|
|
|
immediately. The special argument ``*`` in a command's definition indicates
|
|
|
that all argument names are allowed.
|
|
|
|
|
|
The definition of supported arguments and types is initially made when a
|
|
|
new command is implemented. The client and server must initially independently
|
|
|
agree on the arguments and their types. This initial set of arguments can be
|
|
|
supplemented through the presence of *capabilities* advertised by the server.
|
|
|
|
|
|
Each command has a defined expected response type.
|
|
|
|
|
|
A ``string`` response type is a length framed value. The response consists of
|
|
|
the string encoded integer length of a value followed by a newline (``\n``)
|
|
|
followed by the value. Empty values are allowed (and are represented as
|
|
|
``0\n``).
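
Reading a ``string`` response from a byte stream can be sketched as below.
``read_string_response`` is a hypothetical helper, and ``io.BytesIO`` stands
in for the server's stdout pipe.

```python
import io


def read_string_response(stream):
    """Read one length-framed value from a binary file-like object."""
    line = stream.readline()
    if not line.endswith(b"\n"):
        raise ValueError("truncated length line")
    length = int(line.strip().decode("ascii"))
    value = stream.read(length)
    if len(value) != length:
        raise ValueError("truncated value")
    return value


# The response to the empty-range ``between`` request, then an empty response.
pipe = io.BytesIO(b"1\n\n0\n")
first = read_string_response(pipe)
second = read_string_response(pipe)
```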
|
|
|
|
|
|
A ``stream`` response type consists of raw bytes of data. There is no framing.
|
|
|
|
|
|
A generic error response type is also supported. It consists of an error
|
|
|
message written to ``stderr`` followed by ``\n-\n``. In addition, ``\n`` is
|
|
|
written to ``stdout``.
|
|
|
|
|
|
If the server receives an unknown command, it will send an empty ``string``
|
|
|
response.
|
|
|
|
|
|
The server terminates if it receives an empty command (a ``\n`` character).
|
|
|
|
|
|
SSH Version 2 Transport
|
|
|
-----------------------
|
|
|
|
|
|
**Experimental and under development**
|
|
|
|
|
|
Version 2 of the SSH transport behaves identically to version 1 of the SSH
|
|
|
transport with the exception of handshake semantics. See above for how
|
|
|
version 2 of the SSH transport is negotiated.
|
|
|
|
|
|
Immediately following the ``upgraded`` line signaling a switch to version
|
|
|
2 of the SSH protocol, the server automatically sends additional details
|
|
|
about the capabilities of the remote server. This has the form:
|
|
|
|
|
|
<integer length of value>\n
|
|
|
capabilities: ...\n
|
|
|
|
|
|
e.g.
|
|
|
|
|
|
s: upgraded 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a ssh-v2\n
|
|
|
s: 240\n
|
|
|
s: capabilities: known getbundle batch ...\n
|
|
|
|
|
|
Following capabilities advertisement, the peers communicate using version
|
|
|
1 of the SSH transport.
|
|
|
|
|
|
Unified Frame-Based Protocol
|
|
|
============================
|
|
|
|
|
|
**Experimental and under development**
|
|
|
|
|
|
The *Unified Frame-Based Protocol* is a communications protocol between
|
|
|
Mercurial peers. The protocol aims to be mostly transport agnostic
|
|
|
(works similarly on HTTP, SSH, etc).
|
|
|
|
|
|
To operate the protocol, a bi-directional, half-duplex pipe supporting
|
|
|
ordered sends and receives is required. That is, each peer has one pipe
|
|
|
for sending data and another for receiving.
|
|
|
|
|
|
All data is read and written in atomic units called *frames*. These
|
|
|
are conceptually similar to TCP packets. Higher-level functionality
|
|
|
is built on the exchange and processing of frames.
|
|
|
|
|
|
All frames are associated with a *stream*. A *stream* provides a
|
|
|
unidirectional grouping of frames. Streams facilitate two goals:
|
|
|
content encoding and parallelism. There is a dedicated section on
|
|
|
streams below.
|
|
|
|
|
|
The protocol is request-response based: the client issues requests to
|
|
|
the server, which issues replies to those requests. Server-initiated
|
|
|
messaging is not currently supported, but this specification carves
|
|
|
out room to implement it.
|
|
|
|
|
|
All frames are associated with a numbered request. Frames can thus
|
|
|
be logically grouped by their request ID.
|
|
|
|
|
|
Frames begin with an 8 octet header followed by a variable length
|
|
|
payload::
|
|
|
|
|
|
    +------------------------------------------------+
    |                 Length (24)                    |
    +--------------------------------+---------------+
    |         Request ID (16)        | Stream ID (8) |
    +------------------+-------------+---------------+
    | Stream Flags (8) |
    +-----------+------+
    | Type (4)  |
    +-----------+
    | Flags (4) |
    +===========+===================================================|
    |                     Frame Payload (0...)                    ...
    +---------------------------------------------------------------+
|
|
|
|
|
|
The length of the frame payload is expressed as an unsigned 24 bit
|
|
|
little endian integer. Values larger than 65535 MUST NOT be used unless
|
|
|
given permission by the server as part of the negotiated capabilities
|
|
|
during the handshake. The frame header is not part of the advertised
|
|
|
frame length. The payload length is the over-the-wire length. If there
|
|
|
is content encoding applied to the payload as part of the frame's stream,
|
|
|
the length is the output of that content encoding, not the input.
|
|
|
|
|
|
The 16-bit ``Request ID`` field denotes the integer request identifier,
|
|
|
stored as an unsigned little endian integer. Odd numbered requests are
|
|
|
client-initiated. Even numbered requests are server-initiated. This
|
|
|
refers to where the *request* was initiated - not where the *frame* was
|
|
|
initiated, so servers will send frames with odd ``Request ID`` in
|
|
|
response to client-initiated requests. Implementations are advised to
|
|
|
start ordering request identifiers at ``1`` and ``0``, increment by
|
|
|
``2``, and wrap around if all available numbers have been exhausted.
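
The suggested numbering for client-initiated requests can be sketched as a
generator. The name is hypothetical, and the sketch does not check whether a
reused identifier is still in flight after wrapping.

```python
def client_request_ids():
    """Yield odd request IDs 1, 3, 5, ... wrapping within 16 bits."""
    request_id = 1
    while True:
        yield request_id
        request_id += 2
        if request_id > 0xFFFF:
            request_id = 1  # all odd 16-bit values used; start over


id_source = client_request_ids()
first_ids = [next(id_source) for _ in range(3)]
```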
|
|
|
|
|
|
The 8-bit ``Stream ID`` field denotes the stream that the frame is
|
|
|
associated with. Frames belonging to a stream may have content
|
|
|
encoding applied and the receiver may need to decode the raw frame
|
|
|
payload to obtain the original data. Odd numbered IDs are
|
|
|
client-initiated. Even numbered IDs are server-initiated.
|
|
|
|
|
|
The 8-bit ``Stream Flags`` field defines stream processing semantics.
|
|
|
See the section on streams below.
|
|
|
|
|
|
The 4-bit ``Type`` field denotes the type of frame being sent.
|
|
|
|
|
|
The 4-bit ``Flags`` field defines special, per-type attributes for
|
|
|
the frame.
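
A sketch of packing and unpacking the 8 octet header described above. The
function names are hypothetical, and placing ``Type`` in the high 4 bits of
the final octet with ``Flags`` in the low 4 bits is an assumption based on the
field order in the diagram.

```python
import struct

FRAME_HEADER_SIZE = 8


def pack_frame(request_id, stream_id, stream_flags, frame_type, flags, payload):
    """Return header plus payload as bytes."""
    if len(payload) > 0xFFFFFF:
        raise ValueError("payload does not fit in a 24-bit length")
    if not (0 <= frame_type <= 0x0F and 0 <= flags <= 0x0F):
        raise ValueError("type and flags are 4-bit fields")
    header = bytearray(FRAME_HEADER_SIZE)
    # 24-bit little endian length: low 3 bytes of a 32-bit value.
    header[0:3] = struct.pack("<I", len(payload))[0:3]
    struct.pack_into("<HBB", header, 3, request_id, stream_id, stream_flags)
    header[7] = (frame_type << 4) | flags  # assumed nibble order
    return bytes(header) + payload


def unpack_header(header):
    """Return (length, request_id, stream_id, stream_flags, type, flags)."""
    length = int.from_bytes(header[0:3], "little")
    request_id, stream_id, stream_flags, type_flags = struct.unpack_from(
        "<HBBB", header, 3
    )
    return (length, request_id, stream_id, stream_flags,
            type_flags >> 4, type_flags & 0x0F)


frame = pack_frame(1, 1, 0, 0x01, 0x01, b"capabilities")
```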
|
|
|
|
|
|
The sections below define the frame types and their behavior.
|
|
|
|
|
|
Command Request (``0x01``)
|
|
|
--------------------------
|
|
|
|
|
|
This frame contains a request to run a command.
|
|
|
|
|
|
The name of the command to run constitutes the entirety of the frame
|
|
|
payload.
|
|
|
|
|
|
This frame type MUST ONLY be sent from clients to servers: it is illegal
|
|
|
for a server to send this frame to a client.
|
|
|
|
|
|
The following flag values are defined for this type:
|
|
|
|
|
|
0x01
|
|
|
End of command data. When set, the client will not send any command
|
|
|
arguments or additional command data. When set, the command has been
|
|
|
fully issued and the server has the full context to process the command.
|
|
|
The next frame issued by the client is not part of this command.
|
|
|
0x02
|
|
|
Command argument frames expected. When set, the client will send
|
|
|
*Command Argument* frames containing command argument data.
|
|
|
0x04
|
|
|
Command data frames expected. When set, the client will send
|
|
|
*Command Data* frames containing a raw stream of data for this
|
|
|
command.
|
|
|
|
|
|
The ``0x01`` flag is mutually exclusive with both the ``0x02`` and ``0x04``
|
|
|
flags.
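
The flag rules for this frame type can be checked with a few bit operations.
The constant and function names below are hypothetical labels for the values
defined above.

```python
COMMAND_REQUEST_EOS = 0x01          # end of command data
COMMAND_REQUEST_HAVE_ARGS = 0x02    # Command Argument frames will follow
COMMAND_REQUEST_HAVE_DATA = 0x04    # Command Data frames will follow


def check_command_request_flags(flags):
    """Return flags unchanged if the combination is legal, else raise."""
    known = (COMMAND_REQUEST_EOS | COMMAND_REQUEST_HAVE_ARGS
             | COMMAND_REQUEST_HAVE_DATA)
    if flags & ~known:
        raise ValueError("unknown flag bits set")
    more_expected = COMMAND_REQUEST_HAVE_ARGS | COMMAND_REQUEST_HAVE_DATA
    if (flags & COMMAND_REQUEST_EOS) and (flags & more_expected):
        raise ValueError("0x01 cannot be combined with 0x02 or 0x04")
    return flags
```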
|
|
|
|
|
|
Command Argument (``0x02``)
|
|
|
---------------------------
|
|
|
|
|
|
This frame contains a named argument for a command.
|
|
|
|
|
|
The frame type MUST ONLY be sent from clients to servers: it is illegal
|
|
|
for a server to send this frame to a client.
|
|
|
|
|
|
The payload consists of:
|
|
|
|
|
|
* A 16-bit little endian integer denoting the length of the
|
|
|
argument name.
|
|
|
* A 16-bit little endian integer denoting the length of the
|
|
|
argument value.
|
|
|
* N bytes of ASCII data containing the argument name.
|
|
|
* N bytes of binary data containing the argument value.
|
|
|
|
|
|
The payload MUST hold the entirety of the 32-bit header and the
|
|
|
argument name. The argument value MAY span multiple frames. If this
|
|
|
occurs, the appropriate frame flag should be set to indicate this.
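
The payload layout can be sketched with ``struct``. The helper names are
hypothetical, and the sketch only covers the case where the whole argument
value fits in one frame.

```python
import struct


def encode_command_argument(name, value):
    """Return the payload: two 16-bit lengths, then the name, then the value."""
    name_bytes = name.encode("ascii")
    if len(name_bytes) > 0xFFFF or len(value) > 0xFFFF:
        raise ValueError("name or value too long for a 16-bit length")
    header = struct.pack("<HH", len(name_bytes), len(value))
    return header + name_bytes + value


def decode_command_argument(payload):
    """Inverse of encode_command_argument for a single-frame argument."""
    name_len, value_len = struct.unpack_from("<HH", payload, 0)
    name = payload[4:4 + name_len].decode("ascii")
    value = payload[4 + name_len:4 + name_len + value_len]
    return name, value


argument_payload = encode_command_argument("key", b"value")
```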
|
|
|
|
|
|
The following flag values are defined for this type:
|
|
|
|
|
|
0x01
|
|
|
Argument data continuation. When set, the data for this argument did
|
|
|
not fit in a single frame and the next frame will contain additional
|
|
|
argument data.
|
|
|
|
|
|
0x02
|
|
|
End of arguments data. When set, the client will not send any more
|
|
|
command arguments for the command this frame is associated with.
|
|
|
The next frame issued by the client will be command data or
|
|
|
belong to a separate request.
|
|
|
|
|
|
Command Data (``0x03``)
|
|
|
-----------------------
|
|
|
|
|
|
This frame contains raw data for a command.
|
|
|
|
|
|
Most commands can be executed by specifying arguments. However,
|
|
|
arguments have an upper bound to their length. For commands that
|
|
|
accept data that is beyond this length or whose length isn't known
|
|
|
when the command is initially sent, they will need to stream
|
|
|
arbitrary data to the server. This frame type facilitates the sending
|
|
|
of this data.
|
|
|
|
|
|
The payload of this frame type consists of a stream of raw data to be
|
|
|
consumed by the command handler on the server. The format of the data
|
|
|
is command specific.
|
|
|
|
|
|
The following flag values are defined for this type:
|
|
|
|
|
|
0x01
|
|
|
Command data continuation. When set, the data for this command
|
|
|
continues into a subsequent frame.
|
|
|
|
|
|
0x02
|
|
|
End of data. When set, command data has been fully sent to the
|
|
|
server. The command has been fully issued and no new data for this
|
|
|
command will be sent. The next frame will belong to a new command.
|
|
|
|
|
|
Bytes Response Data (``0x04``)
|
|
|
------------------------------
|
|
|
|
|
|
This frame contains raw bytes response data to an issued command.
|
|
|
|
|
|
The following flag values are defined for this type:
|
|
|
|
|
|
0x01
|
|
|
Data continuation. When set, an additional frame containing raw
|
|
|
response data will follow.
|
|
|
0x02
|
|
|
End of data. When set, the response data has been fully sent and
|
|
|
no additional frames for this response will be sent.
|
|
|
|
|
|
The ``0x01`` flag is mutually exclusive with the ``0x02`` flag.
|
|
|
|
|
|
Error Response (``0x05``)
|
|
|
-------------------------
|
|
|
|
|
|
An error occurred when processing a request. This could indicate
|
|
|
a protocol-level failure or an application level failure depending
|
|
|
on the flags for this message type.
|
|
|
|
|
|
The payload for this type is an error message that should be
|
|
|
displayed to the user.
|
|
|
|
|
|
The following flag values are defined for this type:
|
|
|
|
|
|
0x01
|
|
|
The error occurred at the transport/protocol level. If set, the
|
|
|
connection should be closed.
|
|
|
0x02
|
|
|
The error occurred at the application level. e.g. invalid command.
|
|
|
|
|
|
Human Output Side-Channel (``0x06``)
|
|
|
------------------------------------
|
|
|
|
|
|
This frame contains a message that is intended to be displayed to
|
|
|
people. Whereas most frames communicate machine readable data, this
|
|
|
frame communicates textual data that is intended to be shown to
|
|
|
humans.
|
|
|
|
|
|
The frame consists of a series of *formatting requests*. Each formatting
|
|
|
request consists of a formatting string, arguments for that formatting
|
|
|
string, and labels to apply to that formatting string.
|
|
|
|
|
|
A formatting string is a printf()-like string that allows variable
|
|
|
substitution within the string. Labels allow the rendered text to be
|
|
|
*decorated*. Assuming use of the canonical Mercurial code base, a
|
|
|
formatting string can be the input to the ``i18n._`` function. This
|
|
|
allows messages emitted from the server to be localized. So even if
|
|
|
the server has different i18n settings, people could see messages in
|
|
|
their *native* settings. Similarly, the use of labels allows
|
|
|
decorations like coloring and underlining to be applied using the
|
|
|
client's configured rendering settings.
|
|
|
|
|
|
Formatting strings are similar to ``printf()`` strings or how
|
|
|
Python's ``%`` operator works. The only supported formatting sequences
|
|
|
are ``%s`` and ``%%``. ``%s`` will be replaced by whatever the string
|
|
|
at that position resolves to. ``%%`` will be replaced by ``%``. All
|
|
|
other 2-byte sequences beginning with ``%`` represent a literal
|
|
|
``%`` followed by that character. However, future versions of the
|
|
|
wire protocol reserve the right to allow clients to opt in to receiving
|
|
|
formatting strings with additional formatters, hence why ``%%`` is
|
|
|
required to represent the literal ``%``.
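The substitution rules above can be sketched as follows. This is a minimal illustration, not the canonical implementation: the function name ``render_format`` is hypothetical, it operates on already-decoded text rather than raw bytes, and raising ``ValueError`` when arguments run out is an assumption (the text above does not define that case).

```python
def render_format(fmt, args):
    """Expand only %s and %%; any other %X pair is kept as literal text."""
    out = []
    arg_index = 0
    i = 0
    while i < len(fmt):
        pair = fmt[i:i + 2]
        if pair == "%s":
            # Assumption: running out of arguments is treated as an error.
            if arg_index >= len(args):
                raise ValueError("not enough arguments for format string")
            out.append(args[arg_index])
            arg_index += 1
            i += 2
        elif pair == "%%":
            out.append("%")
            i += 2
        else:
            # Covers ordinary characters and a literal % before any other byte.
            out.append(fmt[i])
            i += 1
    return "".join(out)
```

For instance, ``render_format("%s: 100%% of %d", ["push"])`` leaves ``%d`` untouched because it is not a supported formatter.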
|
|
|
|
|
|
The raw frame consists of a series of data structures representing
|
|
|
textual atoms to print. Each atom begins with a struct defining the
|
|
|
size of the data that follows:
|
|
|
|
|
|
* A 16-bit little endian unsigned integer denoting the length of the
|
|
|
formatting string.
|
|
|
* An 8-bit unsigned integer denoting the number of label strings
|
|
|
that follow.
|
|
|
* An 8-bit unsigned integer denoting the number of formatting string
|
|
|
argument strings that follow.
|
|
|
* An array of 8-bit unsigned integers denoting the lengths of
|
|
|
*labels* data.
|
|
|
* An array of 16-bit unsigned integers denoting the lengths of
|
|
|
formatting string arguments.
|
|
|
* The formatting string, encoded as UTF-8.
|
|
|
* 0 or more ASCII strings defining labels to apply to this atom.
|
|
|
* 0 or more UTF-8 strings that will be used as arguments to the
|
|
|
formatting string.
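The atom layout listed above can be sketched with Python's ``struct`` module. The helper names ``encode_atom`` and ``decode_atom`` are hypothetical, and the sketch assumes the array of 16-bit argument lengths is little endian like the leading length field (the list above does not state its byte order).

```python
import struct


def encode_atom(fmt, labels, args):
    """Serialize one atom: header, length arrays, then the string data."""
    fmt_b = fmt.encode("utf-8")
    label_bs = [label.encode("ascii") for label in labels]
    arg_bs = [arg.encode("utf-8") for arg in args]
    header = struct.pack("<HBB", len(fmt_b), len(label_bs), len(arg_bs))
    header += struct.pack("<%dB" % len(label_bs), *[len(b) for b in label_bs])
    # Assumption: argument lengths use the same little endian byte order.
    header += struct.pack("<%dH" % len(arg_bs), *[len(b) for b in arg_bs])
    return header + fmt_b + b"".join(label_bs) + b"".join(arg_bs)


def decode_atom(buf, offset=0):
    """Return ((fmt, labels, args), next_offset) for the atom at offset."""
    fmt_len, n_labels, n_args = struct.unpack_from("<HBB", buf, offset)
    offset += 4
    label_lens = struct.unpack_from("<%dB" % n_labels, buf, offset)
    offset += n_labels
    arg_lens = struct.unpack_from("<%dH" % n_args, buf, offset)
    offset += 2 * n_args

    def take(size):
        nonlocal offset
        chunk = buf[offset:offset + size]
        if len(chunk) != size:
            raise ValueError("truncated atom")
        offset += size
        return chunk

    fmt = take(fmt_len).decode("utf-8")
    labels = [take(n).decode("ascii") for n in label_lens]
    args = [take(n).decode("utf-8") for n in arg_lens]
    return (fmt, labels, args), offset
```

Because ``decode_atom`` returns the offset of the next atom, a caller can loop over a frame payload until the offset reaches its end.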
|
|
|
|
|
|
TODO use ASCII for formatting string.
|
|
|
|
|
|
All data to be printed MUST be encoded into a single frame: this frame
|
|
|
does not support spanning data across multiple frames.
|
|
|
|
|
|
All textual data encoded in these frames is assumed to be line delimited.
|
|
|
The last atom in the frame SHOULD end with a newline (``\n``). If it
|
|
|
doesn't, clients MAY add a newline to facilitate immediate printing.
|
|
|
|
|
|
Stream Encoding Settings (``0x08``)
|
|
|
-----------------------------------
|
|
|
|
|
|
This frame type holds information defining the content encoding
|
|
|
settings for a *stream*.
|
|
|
|
|
|
This frame type is likely consumed by the protocol layer and is not
|
|
|
passed on to applications.
|
|
|
|
|
|
This frame type MUST ONLY occur on frames having the *Beginning of Stream*
|
|
|
``Stream Flag`` set.
|
|
|
|
|
|
The payload of this frame defines what content encoding has (possibly)
|
|
|
been applied to the payloads of subsequent frames in this stream.
|
|
|
|
|
|
The payload begins with an 8-bit integer defining the length of the
|
|
|
encoding *profile*, followed by the string name of that profile, which
|
|
|
must be an ASCII string. All bytes that follow can be used by that
|
|
|
profile for supplemental settings definitions. See the section below
|
|
|
on defined encoding profiles.
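A minimal sketch of splitting this payload into its profile name and supplemental settings follows. The function name ``parse_encoding_settings`` is hypothetical, and the profile name used in the usage note is only a placeholder since no profiles are defined yet.

```python
def parse_encoding_settings(payload):
    """Split a payload into (profile name, remaining settings bytes)."""
    if not payload:
        raise ValueError("empty payload")
    name_len = payload[0]  # 8-bit length prefix
    name = payload[1:1 + name_len]
    if len(name) != name_len:
        raise ValueError("truncated profile name")
    # The profile name must be ASCII; decoding fails loudly otherwise.
    return name.decode("ascii"), payload[1 + name_len:]
```

With a placeholder profile, ``parse_encoding_settings(b"\x04demo\x03")`` yields the name ``demo`` and one trailing settings byte for the profile to interpret.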
|
|
|
|
|
|
Stream States and Flags
|
|
|
-----------------------
|
|
|
|
|
|
Streams can be in two states: *open* and *closed*. An *open* stream
|
|
|
is active and frames attached to that stream could arrive at any time.
|
|
|
A *closed* stream is not active. If a frame attached to a *closed*
|
|
|
stream arrives, that frame MUST have an appropriate stream flag
|
|
|
set indicating beginning of stream. All streams are in the *closed*
|
|
|
state by default.
|
|
|
|
|
|
The ``Stream Flags`` field denotes a set of bit flags for defining
|
|
|
the relationship of this frame within a stream. The following flags
|
|
|
are defined:
|
|
|
|
|
|
0x01
|
|
|
Beginning of stream. The first frame in the stream MUST set this
|
|
|
flag. When received, the ``Stream ID`` this frame is attached to
|
|
|
becomes ``open``.
|
|
|
|
|
|
0x02
|
|
|
End of stream. The last frame in a stream MUST set this flag. When
|
|
|
received, the ``Stream ID`` this frame is attached to becomes
|
|
|
``closed``. Any content encoding context associated with this stream
|
|
|
can be destroyed after processing the payload of this frame.
|
|
|
|
|
|
0x04
|
|
|
Apply content encoding. When set, any content encoding settings
|
|
|
defined by the stream should be applied when attempting to read
|
|
|
the frame. When not set, the frame payload isn't encoded.
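The state transitions driven by these flags could be tracked as in the sketch below. The class and constant names are hypothetical; only the numeric flag values come from the definitions above. The sketch ignores a repeated beginning-of-stream flag on an already open stream, since the text does not say how that case is handled.

```python
STREAM_FLAG_BEGIN = 0x01    # beginning of stream
STREAM_FLAG_END = 0x02      # end of stream
STREAM_FLAG_ENCODED = 0x04  # apply content encoding


class StreamTracker:
    """Track which stream IDs are open as frames arrive."""

    def __init__(self):
        self._open = set()

    def is_open(self, stream_id):
        return stream_id in self._open

    def on_frame(self, stream_id, flags):
        """Update stream state; return True if the payload is encoded."""
        if stream_id not in self._open:
            # A frame on a closed stream must announce the stream start.
            if not flags & STREAM_FLAG_BEGIN:
                raise ValueError("frame received on closed stream")
            self._open.add(stream_id)
        if flags & STREAM_FLAG_END:
            self._open.discard(stream_id)
        return bool(flags & STREAM_FLAG_ENCODED)
```

A single frame may set both the beginning and end flags, in which case the stream opens and closes within one call.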
|
|
|
|
|
|
Streams
|
|
|
-------
|
|
|
|
|
|
Streams - along with ``Request IDs`` - facilitate grouping of frames.
|
|
|
But the purpose of each is quite different and the groupings they
|
|
|
constitute are independent.
|
|
|
|
|
|
A ``Request ID`` is essentially a tag. It tells you which logical
|
|
|
request a frame is associated with.
|
|
|
|
|
|
A *stream* is a sequence of frames grouped for the express purpose
|
|
|
of applying a stateful encoding or for denoting sub-groups of frames.
|
|
|
|
|
|
Unlike a ``Request ID``, which spans the request and response, a stream
|
|
|
is unidirectional and stream IDs are independent from client to
|
|
|
server.
|
|
|
|
|
|
There is no strict hierarchical relationship between ``Request IDs``
|
|
|
and *streams*. A stream can contain frames having multiple
|
|
|
``Request IDs``. Frames belonging to the same ``Request ID`` can
|
|
|
span multiple streams.
|
|
|
|
|
|
One goal of streams is to facilitate content encoding. A stream can
|
|
|
define an encoding to be applied to frame payloads. For example, the
|
|
|
payload transmitted over the wire may contain output from a
|
|
|
zstandard compression operation and the receiving end may decompress
|
|
|
that payload to obtain the original data.
|
|
|
|
|
|
The other goal of streams is to facilitate concurrent execution. For
|
|
|
example, a server could spawn 4 threads to service a request that can
|
|
|
be easily parallelized. Each of those 4 threads could write into its
|
|
|
own stream. Those streams could then in turn be delivered to 4 threads
|
|
|
on the receiving end, with each thread consuming its stream in near
|
|
|
isolation. The *main* thread on both ends merely does I/O and
|
|
|
encodes/decodes frame headers: the bulk of the work is done by worker
|
|
|
threads.
|
|
|
|
|
|
In addition, since content encoding is defined per stream, each
|
|
|
*worker thread* could perform potentially CPU bound work concurrently
|
|
|
with other threads. This approach of applying encoding at the
|
|
|
sub-protocol / stream level eliminates a potential resource constraint
|
|
|
on the protocol stream as a whole (it is common for the throughput of
|
|
|
a compression engine to be lower than the throughput of a network).
|
|
|
|
|
|
Having multiple streams - each with their own encoding settings - also
|
|
|
facilitates the use of advanced data compression techniques. For
|
|
|
example, a transmitter could see that it is generating data faster
|
|
|
or slower than the receiving end is consuming it and adjust its
|
|
|
compression settings to trade CPU for compression ratio accordingly.
|
|
|
|
|
|
While streams can define a content encoding, not all frames within
|
|
|
that stream must use that content encoding. This can be useful when
|
|
|
data is being served from caches and being derived dynamically. A
|
|
|
cache could hold pre-compressed data so the server doesn't have to
|
|
|
recompress it. The ability to pick and choose which frames are
|
|
|
compressed allows servers to easily send data to the wire without
|
|
|
involving potentially expensive encoding overhead.
|
|
|
|
|
|
Content Encoding Profiles
|
|
|
-------------------------
|
|
|
|
|
|
Streams can have named content encoding *profiles* associated with
|
|
|
them. A profile defines a shared understanding of content encoding
|
|
|
settings and behavior.
|
|
|
|
|
|
The following profiles are defined:
|
|
|
|
|
|
TBD
|
|
|
|
|
|
Issuing Commands
|
|
|
----------------
|
|
|
|
|
|
A client can request that a remote run a command by sending it
|
|
|
frames defining that command. This logical stream is composed of
|
|
|
1 ``Command Request`` frame, 0 or more ``Command Argument`` frames,
|
|
|
and 0 or more ``Command Data`` frames.
|
|
|
|
|
|
All frames composing a single command request MUST be associated with
|
|
|
the same ``Request ID``.
|
|
|
|
|
|
Clients MAY send additional command requests without waiting on the
|
|
|
response to a previous command request. If they do so, they MUST ensure
|
|
|
that the ``Request ID`` field of outbound frames does not conflict
|
|
|
with that of an active ``Request ID`` whose response has not yet been
|
|
|
fully received.
|
|
|
|
|
|
Servers MAY respond to commands in a different order than they were
|
|
|
sent over the wire. Clients MUST be prepared to deal with this. Servers
|
|
|
also MAY start executing commands in a different order than they were
|
|
|
received, or MAY execute multiple commands concurrently.
|
|
|
|
|
|
If there is a dependency between commands or a race condition between
|
|
|
commands executing (e.g. a read-only command that depends on the results
|
|
|
of a command that mutates the repository), then clients MUST NOT send
|
|
|
frames issuing a command until a response to all dependent commands has
|
|
|
been received.
|
|
|
TODO think about whether we should express dependencies between commands
|
|
|
to avoid roundtrip latency.
|
|
|
|
|
|
Argument frames are the recommended mechanism for transferring fixed
|
|
|
sets of parameters to a command. Data frames are appropriate for
|
|
|
transferring variable data. A similar comparison would be to HTTP:
|
|
|
argument frames are headers and the message body is data frames.
|
|
|
|
|
|
It is recommended for servers to delay the dispatch of a command
|
|
|
until all argument frames for that command have been received. Servers
|
|
|
MAY impose limits on the maximum argument size.
|
|
|
TODO define failure mechanism.
|
|
|
|
|
|
Servers MAY dispatch to commands immediately once argument data
|
|
|
is available or delay until command data is received in full.
|
|
|
|
|
|
Capabilities
|
|
|
============
|
|
|
|
|
|
Servers advertise supported wire protocol features. This allows clients to
|
|
|
probe for server features before blindly calling a command or passing a
|
|
|
specific argument.
|
|
|
|
|
|
The server's features are exposed via a *capabilities* string. This is a
|
|
|
space-delimited string of tokens/features. Some features are single words
|
|
|
like ``lookup`` or ``batch``. Others are complicated key-value pairs
|
|
|
advertising sub-features. e.g. ``httpheader=2048``. When complex, non-word
|
|
|
values are used, each feature name can define its own encoding of sub-values.
|
|
|
Comma-delimited and ``x-www-form-urlencoded`` values are common.
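A client might split the capabilities string into a lookup table along these lines. The function name ``parse_capabilities`` is hypothetical, and the sketch deliberately leaves each value undecoded because every feature defines its own sub-value encoding.

```python
def parse_capabilities(caps):
    """Map each capability name to its raw value (None for bare words)."""
    result = {}
    for token in caps.split(" "):
        if not token:
            continue  # tolerate repeated or trailing spaces
        name, sep, value = token.partition("=")
        result[name] = value if sep else None
    return result
```

A caller can then test ``"batch" in parsed`` before issuing the ``batch`` command, or read ``parsed["httpheader"]`` and decode it according to that capability's rules.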
|
|
|
|
|
|
The following sections document capabilities defined by the canonical Mercurial server
|
|
|
implementation.
|
|
|
|
|
|
batch
|
|
|
-----
|
|
|
|
|
|
Whether the server supports the ``batch`` command.
|
|
|
|
|
|
This capability/command was introduced in Mercurial 1.9 (released July 2011).
|
|
|
|
|
|
branchmap
|
|
|
---------
|
|
|
|
|
|
Whether the server supports the ``branchmap`` command.
|
|
|
|
|
|
This capability/command was introduced in Mercurial 1.3 (released July 2009).
|
|
|
|
|
|
bundle2-exp
|
|
|
-----------
|
|
|
|
|
|
Precursor to ``bundle2`` capability that was used before bundle2 was a
|
|
|
stable feature.
|
|
|
|
|
|
This capability was introduced in Mercurial 3.0 behind an experimental
|
|
|
flag. This capability should not be observed in the wild.
|
|
|
|
|
|
bundle2
|
|
|
-------
|
|
|
|
|
|
Indicates whether the server supports the ``bundle2`` data exchange format.
|
|
|
|
|
|
The value of the capability is a URL quoted, newline (``\n``) delimited
|
|
|
list of keys or key-value pairs.
|
|
|
|
|
|
A key is simply a URL encoded string.
|
|
|
|
|
|
A key-value pair is a URL encoded key separated from a URL encoded value by
|
|
|
an ``=``. If the value is a list, elements are delimited by a ``,`` after
|
|
|
URL encoding.
|
|
|
|
|
|
For example, say we have the values::
|
|
|
|
|
|
{'HG20': [], 'changegroup': ['01', '02'], 'digests': ['sha1', 'sha512']}
|
|
|
|
|
|
We would first construct a string::
|
|
|
|
|
|
HG20\nchangegroup=01,02\ndigests=sha1,sha512
|
|
|
|
|
|
We would then URL quote this string::
|
|
|
|
|
|
HG20%0Achangegroup%3D01%2C02%0Adigests%3Dsha1%2Csha512
|
|
|
|
|
|
This capability was introduced in Mercurial 3.4 (released May 2015).
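The two-step encoding of the ``bundle2`` capability value shown above can be reproduced with the standard ``urllib.parse`` module. The function names are hypothetical and this is a sketch of the described steps, not the canonical implementation.

```python
from urllib.parse import quote, unquote


def encode_bundle2_caps(caps):
    """Encode {key: [values]} as a URL quoted, newline delimited list."""
    lines = []
    for key, values in caps.items():
        line = quote(key)
        if values:
            # List elements are joined with "," after URL encoding each one.
            line += "=" + ",".join(quote(value) for value in values)
        lines.append(line)
    # The newline delimited string is then URL quoted as a whole.
    return quote("\n".join(lines))


def decode_bundle2_caps(blob):
    """Reverse encode_bundle2_caps."""
    caps = {}
    for line in unquote(blob).split("\n"):
        if not line:
            continue
        key, sep, values = line.partition("=")
        caps[unquote(key)] = [unquote(v) for v in values.split(",")] if values else []
    return caps
```

Encoding the example dictionary above with ``encode_bundle2_caps`` produces the same quoted string as the worked example.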
|
|
|
|
|
|
changegroupsubset
|
|
|
-----------------
|
|
|
|
|
|
Whether the server supports the ``changegroupsubset`` command.
|
|
|
|
|
|
This capability was introduced in Mercurial 0.9.2 (released December
|
|
|
2006).
|
|
|
|
|
|
This capability was introduced at the same time as the ``lookup``
|
|
|
capability/command.
|
|
|
|
|
|
compression
|
|
|
-----------
|
|
|
|
|
|
Declares support for negotiating compression formats.
|
|
|
|
|
|
Presence of this capability indicates the server supports dynamic selection
|
|
|
of compression formats based on the client request.
|
|
|
|
|
|
Servers advertising this capability are required to support the
|
|
|
``application/mercurial-0.2`` media type in response to commands returning
|
|
|
streams. Servers may support this media type on any command.
|
|
|
|
|
|
The value of the capability is a comma-delimited list of strings declaring
|
|
|
supported compression formats. The order of the compression formats is in
|
|
|
server-preferred order, most preferred first.
|
|
|
|
|
|
The identifiers used by the official Mercurial distribution are:
|
|
|
|
|
|
bzip2
|
|
|
bzip2
|
|
|
none
|
|
|
uncompressed / raw data
|
|
|
zlib
|
|
|
zlib (no gzip header)
|
|
|
zstd
|
|
|
zstd
|
|
|
|
|
|
This capability was introduced in Mercurial 4.1 (released February 2017).
|
|
|
|
|
|
getbundle
|
|
|
---------
|
|
|
|
|
|
Whether the server supports the ``getbundle`` command.
|
|
|
|
|
|
This capability was introduced in Mercurial 1.9 (released July 2011).
|
|
|
|
|
|
httpheader
|
|
|
----------
|
|
|
|
|
|
Whether the server supports receiving command arguments via HTTP request
|
|
|
headers.
|
|
|
|
|
|
The value of the capability is an integer describing the max header
|
|
|
length that clients should send. Clients should ignore any content after a
|
|
|
comma in the value, as this is reserved for future use.
|
|
|
|
|
|
This capability was introduced in Mercurial 1.9 (released July 2011).
|
|
|
|
|
|
httpmediatype
|
|
|
-------------
|
|
|
|
|
|
Indicates which HTTP media types (``Content-Type`` header) the server is
|
|
|
capable of receiving and sending.
|
|
|
|
|
|
The value of the capability is a comma-delimited list of strings identifying
|
|
|
support for media type and transmission direction. The following strings may
|
|
|
be present:
|
|
|
|
|
|
0.1rx
|
|
|
Indicates server support for receiving ``application/mercurial-0.1`` media
|
|
|
types.
|
|
|
|
|
|
0.1tx
|
|
|
Indicates server support for sending ``application/mercurial-0.1`` media
|
|
|
types.
|
|
|
|
|
|
0.2rx
|
|
|
Indicates server support for receiving ``application/mercurial-0.2`` media
|
|
|
types.
|
|
|
|
|
|
0.2tx
|
|
|
Indicates server support for sending ``application/mercurial-0.2`` media
|
|
|
types.
|
|
|
|
|
|
minrx=X
|
|
|
Minimum media type version the server is capable of receiving. Value is a
|
|
|
string like ``0.2``.
|
|
|
|
|
|
This capability can be used by servers to limit connections from legacy
|
|
|
clients not using the latest supported media type. However, only clients
|
|
|
with knowledge of this capability will know to consult this value. This
|
|
|
capability is present so the client may issue a more user-friendly error
|
|
|
when the server has locked out a legacy client.
|
|
|
|
|
|
mintx=X
|
|
|
Minimum media type version the server is capable of sending. Value is a
|
|
|
string like ``0.1``.
|
|
|
|
|
|
Servers advertising support for the ``application/mercurial-0.2`` media type
|
|
|
should also advertise the ``compression`` capability.
|
|
|
|
|
|
This capability was introduced in Mercurial 4.1 (released February 2017).
|
|
|
|
|
|
httppostargs
|
|
|
------------
|
|
|
|
|
|
**Experimental**
|
|
|
|
|
|
Indicates that the server supports and prefers clients send command arguments
|
|
|
via an HTTP POST request as part of the request body.
|
|
|
|
|
|
This capability was introduced in Mercurial 3.8 (released May 2016).
|
|
|
|
|
|
known
|
|
|
-----
|
|
|
|
|
|
Whether the server supports the ``known`` command.
|
|
|
|
|
|
This capability/command was introduced in Mercurial 1.9 (released July 2011).
|
|
|
|
|
|
lookup
|
|
|
------
|
|
|
|
|
|
Whether the server supports the ``lookup`` command.
|
|
|
|
|
|
This capability was introduced in Mercurial 0.9.2 (released December
|
|
|
2006).
|
|
|
|
|
|
This capability was introduced at the same time as the ``changegroupsubset``
|
|
|
capability/command.
|
|
|
|
|
|
pushkey
|
|
|
-------
|
|
|
|
|
|
Whether the server supports the ``pushkey`` and ``listkeys`` commands.
|
|
|
|
|
|
This capability was introduced in Mercurial 1.6 (released July 2010).
|
|
|
|
|
|
standardbundle
|
|
|
--------------
|
|
|
|
|
|
**Unsupported**
|
|
|
|
|
|
This capability was introduced during the Mercurial 0.9.2 development cycle in
|
|
|
2006. It was never present in a release, as it was replaced by the ``unbundle``
|
|
|
capability. This capability should not be encountered in the wild.
|
|
|
|
|
|
stream-preferred
|
|
|
----------------
|
|
|
|
|
|
If present, the server prefers that clients clone using the streaming clone
|
|
|
protocol (``hg clone --stream``) rather than the standard
|
|
|
changegroup/bundle based protocol.
|
|
|
|
|
|
This capability was introduced in Mercurial 2.2 (released May 2012).
|
|
|
|
|
|
streamreqs
|
|
|
----------
|
|
|
|
|
|
Indicates whether the server supports *streaming clones* and the *requirements*
|
|
|
that clients must support to receive it.
|
|
|
|
|
|
If present, the server supports the ``stream_out`` command, which transmits
|
|
|
raw revlogs from the repository instead of changegroups. This provides a faster
|
|
|
cloning mechanism at the expense of more bandwidth used.
|
|
|
|
|
|
The value of this capability is a comma-delimited list of repo format
|
|
|
*requirements*. These are requirements that impact the reading of data in
|
|
|
the ``.hg/store`` directory. An example value is
|
|
|
``streamreqs=generaldelta,revlogv1`` indicating the server repo requires
|
|
|
the ``revlogv1`` and ``generaldelta`` requirements.
|
|
|
|
|
|
If the only format requirement is ``revlogv1``, the server may expose the
|
|
|
``stream`` capability instead of the ``streamreqs`` capability.
|
|
|
|
|
|
This capability was introduced in Mercurial 1.7 (released November 2010).
|
|
|
|
|
|
stream
|
|
|
------
|
|
|
|
|
|
Whether the server supports *streaming clones* from ``revlogv1`` repos.
|
|
|
|
|
|
If present, the server supports the ``stream_out`` command, which transmits
|
|
|
raw revlogs from the repository instead of changegroups. This provides a faster
|
|
|
cloning mechanism at the expense of more bandwidth used.
|
|
|
|
|
|
This capability was introduced in Mercurial 0.9.1 (released July 2006).
|
|
|
|
|
|
When initially introduced, the value of the capability was the numeric
|
|
|
revlog revision. e.g. ``stream=1``. This indicates the changegroup is using
|
|
|
``revlogv1``. This simple integer value wasn't powerful enough, so the
|
|
|
``streamreqs`` capability was invented to handle cases where the repo
|
|
|
requirements have more than just ``revlogv1``. Newer servers omit the
|
|
|
``=1`` since it was the only value supported and the value of ``1`` can
|
|
|
be implied by clients.
|
|
|
|
|
|
unbundlehash
|
|
|
------------
|
|
|
|
|
|
Whether the ``unbundle`` command supports receiving a hash of all the
|
|
|
heads instead of a list.
|
|
|
|
|
|
For more, see the documentation for the ``unbundle`` command.
|
|
|
|
|
|
This capability was introduced in Mercurial 1.9 (released July 2011).
|
|
|
|
|
|
unbundle
|
|
|
--------
|
|
|
|
|
|
Whether the server supports pushing via the ``unbundle`` command.
|
|
|
|
|
|
This capability/command has been present since Mercurial 0.9.1 (released
|
|
|
July 2006).
|
|
|
|
|
|
Mercurial 0.9.2 (released December 2006) added values to the capability
|
|
|
indicating which bundle types the server supports receiving. This value is a
|
|
|
comma-delimited list. e.g. ``HG10GZ,HG10BZ,HG10UN``. The order of values
|
|
|
reflects the priority/preference of that type, where the first value is the
|
|
|
most preferred type.
|
|
|
|
|
|
Content Negotiation
|
|
|
===================
|
|
|
|
|
|
The wire protocol has some mechanisms to help peers determine what content
|
|
|
types and encoding the other side will accept. Historically, these mechanisms
|
|
|
have been built into commands themselves because most commands only send a
|
|
|
well-defined response type and only certain commands needed to support
|
|
|
functionality like compression.
|
|
|
|
|
|
Currently, only the HTTP version 1 transport supports content negotiation
|
|
|
at the protocol layer.
|
|
|
|
|
|
HTTP requests advertise supported response formats via the ``X-HgProto-<N>``
|
|
|
request header, where ``<N>`` is an integer starting at 1 allowing the logical
|
|
|
value to span multiple headers. This value consists of a list of
|
|
|
space-delimited parameters. Each parameter denotes a feature or capability.
|
|
|
|
|
|
The following parameters are defined:
|
|
|
|
|
|
0.1
|
|
|
Indicates the client supports receiving ``application/mercurial-0.1``
|
|
|
responses.
|
|
|
|
|
|
0.2
|
|
|
Indicates the client supports receiving ``application/mercurial-0.2``
|
|
|
responses.
|
|
|
|
|
|
comp
|
|
|
Indicates compression formats the client can decode. Value is a list of
|
|
|
comma delimited strings identifying compression formats ordered from
|
|
|
most preferential to least preferential. e.g. ``comp=zstd,zlib,none``.
|
|
|
|
|
|
This parameter does not have an effect if only the ``0.1`` parameter
|
|
|
is defined, as support for ``application/mercurial-0.2`` or greater is
|
|
|
required to use arbitrary compression formats.
|
|
|
|
|
|
If this parameter is not advertised, the server interprets this as
|
|
|
equivalent to ``zlib,none``.
|
|
|
|
|
|
Clients may choose to only send this header if the ``httpmediatype``
|
|
|
server capability is present, as currently all server-side features
|
|
|
consulting this header require the client to opt in to new protocol features
|
|
|
advertised via the ``httpmediatype`` capability.
|
|
|
|
|
|
A server that doesn't receive an ``X-HgProto-<N>`` header should infer a
|
|
|
value of ``0.1``. This is compatible with legacy clients.
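A server-side sketch of reading these headers follows. The function name ``parse_hgproto`` is hypothetical. It assumes the numbered header values are concatenated without a separator to rebuild the logical value, and that ``headers`` is a plain dictionary keyed by the exact header spelling; a real HTTP stack would match header names case-insensitively.

```python
def parse_hgproto(headers):
    """Return (media type versions, compression formats) for a request."""
    chunks = []
    n = 1
    while "X-HgProto-%d" % n in headers:
        chunks.append(headers["X-HgProto-%d" % n])
        n += 1
    # No header at all is treated as the legacy value "0.1".
    value = "".join(chunks) or "0.1"

    media_types = []
    comp = ["zlib", "none"]  # default when "comp" is not advertised
    for param in value.split(" "):
        if param.startswith("comp="):
            comp = param[len("comp="):].split(",")
        elif param in ("0.1", "0.2"):
            media_types.append(param)
    return media_types, comp
```

Unknown parameters are skipped, which keeps the sketch tolerant of parameters added later.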
|
|
|
|
|
|
A server receiving a request indicating support for multiple media type
|
|
|
versions may respond with any of the supported media types. Not all servers
|
|
|
may support all media types on all commands.
|
|
|
|
|
|
Commands
|
|
|
========
|
|
|
|
|
|
This section contains a list of all wire protocol commands implemented by
|
|
|
the canonical Mercurial server.
|
|
|
|
|
|
batch
|
|
|
-----
|
|
|
|
|
|
Issue multiple commands while sending a single command request. The purpose
|
|
|
of this command is to allow a client to issue multiple commands while avoiding
|
|
|
multiple round trips to the server, thereby enabling commands to complete
|
|
|
more quickly.
|
|
|
|
|
|
The command accepts a ``cmds`` argument that contains a list of commands to
|
|
|
execute.
|
|
|
|
|
|
The value of ``cmds`` is a ``;`` delimited list of strings. Each string has the
|
|
|
form ``<command> <arguments>``. That is, the command name followed by a space
|
|
|
followed by an argument string.
|
|
|
|
|
|
The argument string is a ``,`` delimited list of ``<key>=<value>`` values
|
|
|
corresponding to command arguments. Both the argument name and value are
|
|
|
escaped using a special substitution map::
|
|
|
|
|
|
: -> :c
|
|
|
, -> :o
|
|
|
; -> :s
|
|
|
= -> :e
|
|
|
|
|
|
The response type for this command is ``string``. The value contains a
|
|
|
``;`` delimited list of responses for each requested command. Each value
|
|
|
in this list is escaped using the same substitution map used for arguments.
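The escaping and framing rules for ``batch`` can be sketched as below. All function names are hypothetical. Unescaping is done in a single regular-expression pass so that an escaped ``:`` is never re-interpreted as the start of another escape sequence.

```python
import re

_ESCAPES = {":": ":c", ",": ":o", ";": ":s", "=": ":e"}
_UNESCAPES = {escaped: raw for raw, escaped in _ESCAPES.items()}


def escape_arg(text):
    """Apply the substitution map to one argument name or value."""
    return "".join(_ESCAPES.get(ch, ch) for ch in text)


def unescape_arg(text):
    """Reverse escape_arg in one left-to-right pass."""
    return re.sub(r":[cose]", lambda m: _UNESCAPES[m.group(0)], text)


def encode_batch(commands):
    """Build the ``cmds`` value from a list of (name, {arg: value}) pairs."""
    parts = []
    for name, args in commands:
        argstr = ",".join(
            escape_arg(key) + "=" + escape_arg(value)
            for key, value in args.items()
        )
        parts.append(name + " " + argstr)
    return ";".join(parts)


def decode_batch_response(response):
    """Split the response on ``;`` and unescape each command's result."""
    return [unescape_arg(item) for item in response.split(";")]
```

The responses come back in the same ``;`` delimited form, so the caller pairs each decoded item with the command at the same position in the request.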
|
|
|
|
|
|
If an error occurs, the generic error response may be sent.
|
|
|
|
|
|
between
|
|
|
-------
|
|
|
|
|
|
(Legacy command used for discovery in old clients)
|
|
|
|
|
|
Obtain nodes between pairs of nodes.
|
|
|
|
|
|
The ``pairs`` argument contains a space-delimited list of ``-`` delimited
|
|
|
hex node pairs. e.g.::
|
|
|
|
|
|
a072279d3f7fd3a4aa7ffa1a5af8efc573e1c896-6dc58916e7c070f678682bfe404d2e2d68291a18
|
|
|
|
|
|
Return type is a ``string``. Value consists of lines corresponding to each
|
|
|
requested range. Each line contains a space-delimited list of hex nodes.
|
|
|
A newline ``\n`` terminates each line, including the last one.
|
|
|
|
|
|
branchmap
|
|
|
---------
|
|
|
|
|
|
Obtain heads in named branches.
|
|
|
|
|
|
Accepts no arguments. Return type is a ``string``.
|
|
|
|
|
|
Return value contains lines with URL encoded branch names followed by a space
|
|
|
followed by a space-delimited list of hex nodes of heads on that branch.
|
|
|
e.g.::
|
|
|
|
|
|
default a072279d3f7fd3a4aa7ffa1a5af8efc573e1c896 6dc58916e7c070f678682bfe404d2e2d68291a18
|
|
|
stable baae3bf31522f41dd5e6d7377d0edd8d1cf3fccc
|
|
|
|
|
|
There is no trailing newline.
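A client could turn the ``branchmap`` response into a dictionary as sketched here. The function name ``parse_branchmap`` is hypothetical; branch names are URL decoded with the standard ``urllib.parse.unquote``.

```python
from urllib.parse import unquote


def parse_branchmap(response):
    """Map each decoded branch name to its list of head nodes (hex)."""
    branches = {}
    for line in response.split("\n"):
        if not line:
            continue
        name, *nodes = line.split(" ")
        branches[unquote(name)] = nodes
    return branches
```

Applied to the example response above, this yields two heads for ``default`` and one for ``stable``.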
|
|
|
|
|
|
branches
|
|
|
--------
|
|
|
|
|
|
(Legacy command used for discovery in old clients. Clients with ``getbundle``
|
|
|
use the ``known`` and ``heads`` commands instead.)
|
|
|
|
|
|
Obtain ancestor changesets of specific nodes back to a branch point.
|
|
|
|
|
|
Despite the name, this command has nothing to do with Mercurial named branches.
|
|
|
Instead, it is related to DAG branches.
|
|
|
|
|
|
The command accepts a ``nodes`` argument, which is a string of space-delimited
|
|
|
hex nodes.
|
|
|
|
|
|
For each node requested, the server will find the first ancestor node that is
|
|
|
a DAG root or is a merge.
|
|
|
|
|
|
Return type is a ``string``. Return value contains lines with result data for
|
|
|
each requested node. Each line contains space-delimited nodes followed by a
|
|
|
newline (``\n``). The 4 nodes reported on each line correspond to the requested
|
|
|
node, the ancestor node found, and its 2 parent nodes (which may be the null
|
|
|
node).
|
|
|
|
|
|
capabilities
|
|
|
------------
|
|
|
|
|
|
Obtain the capabilities string for the repo.
|
|
|
|
|
|
Unlike the ``hello`` command, the capabilities string is not prefixed.
|
|
|
There is no trailing newline.
|
|
|
|
|
|
This command does not accept any arguments. Return type is a ``string``.
|
|
|
|
|
|
This command was introduced in Mercurial 0.9.1 (released July 2006).
|
|
|
|
|
|
changegroup
|
|
|
-----------
|
|
|
|
|
|
(Legacy command: use ``getbundle`` instead)
|
|
|
|
|
|
Obtain a changegroup version 1 with data for changesets that are
|
|
|
descendants of client-specified changesets.
|
|
|
|
|
|
The ``roots`` argument contains a list of space-delimited hex nodes.
|
|
|
|
|
|
The server responds with a changegroup version 1 containing all
|
|
|
changesets between the requested root/base nodes and the repo's head nodes
|
|
|
at the time of the request.
|
|
|
|
|
|
The return type is a ``stream``.
|
|
|
|
|
|
changegroupsubset
|
|
|
-----------------
|
|
|
|
|
|
(Legacy command: use ``getbundle`` instead)
|
|
|
|
|
|
Obtain a changegroup version 1 with data for changesets between
|
|
|
client specified base and head nodes.
|
|
|
|
|
|
The ``bases`` argument contains a list of space-delimited hex nodes.
|
|
|
The ``heads`` argument contains a list of space-delimited hex nodes.
|
|
|
|
|
|
The server responds with a changegroup version 1 containing all
|
|
|
changesets between the requested base and head nodes at the time of the
|
|
|
request.
|
|
|
|
|
|
The return type is a ``stream``.
|
|
|
|
|
|
clonebundles
|
|
|
------------
|
|
|
|
|
|
Obtains a manifest of bundle URLs available to seed clones.
|
|
|
|
|
|
Each returned line contains a URL followed by metadata. See the
|
|
|
documentation in the ``clonebundles`` extension for more.
|
|
|
|
|
|
The return type is a ``string``.
|
|
|
|
|
|
getbundle
|
|
|
---------
|
|
|
|
|
|
Obtain a bundle containing repository data.
|
|
|
|
|
|
This command accepts the following arguments:
|
|
|
|
|
|
heads
|
|
|
List of space-delimited hex nodes of heads to retrieve.
|
|
|
common
|
|
|
List of space-delimited hex nodes that the client has in common with the
|
|
|
server.
|
|
|
obsmarkers
|
|
|
Boolean indicating whether to include obsolescence markers as part
|
|
|
of the response. Only works with bundle2.
|
|
|
bundlecaps
|
|
|
Comma-delimited set of strings defining client bundle capabilities.
|
|
|
listkeys
|
|
|
Comma-delimited list of strings of ``pushkey`` namespaces. For each
|
|
|
namespace listed, a bundle2 part will be included with the content of
|
|
|
that namespace.
|
|
|
cg
|
|
|
Boolean indicating whether changegroup data is requested.
|
|
|
cbattempted
|
|
|
Boolean indicating whether the client attempted to use the *clone bundles*
|
|
|
feature before performing this request.
|
|
|
bookmarks
|
|
|
Boolean indicating whether bookmark data is requested.
|
|
|
phases
|
|
|
Boolean indicating whether phases data is requested.
|
|
|
|
|
|
The return type on success is a ``stream`` where the value is a bundle.
|
|
|
On the HTTP version 1 transport, the response is zlib compressed.
|
|
|
|
|
|
If an error occurs, a generic error response can be sent.
|
|
|
|
|
|
Unless the client sends a false value for the ``cg`` argument, the returned
|
|
|
bundle contains a changegroup with the nodes between the specified ``common``
|
|
|
and ``heads`` nodes. Depending on the command arguments, the type and content
|
|
|
of the returned bundle can vary significantly.
|
|
|
|
|
|
The default behavior is for the server to send a raw changegroup version
|
|
|
``01`` response.
|
|
|
|
|
|
If the ``bundlecaps`` provided by the client contain a value beginning
|
|
|
with ``HG2``, a bundle2 will be returned. The bundle2 data may contain
|
|
|
additional repository data, such as ``pushkey`` namespace values.
|
|
|
|
|
|
heads
|
|
|
-----
|
|
|
|
|
|
Returns a list of space-delimited hex nodes of repository heads followed
|
|
|
by a newline. e.g.
|
|
|
``a9eeb3adc7ddb5006c088e9eda61791c777cbf7c 31f91a3da534dc849f0d6bfc00a395a97cf218a1\n``
|
|
|
|
|
|
This command does not accept any arguments. The return type is a ``string``.
|
|
|
|
|
|
hello
|
|
|
-----
|
|
|
|
|
|
Returns lines describing interesting things about the server in an RFC-822
|
|
|
like format.
|
|
|
|
|
|
Currently, the only line defines the server capabilities. It has the form::
|
|
|
|
|
|
capabilities: <value>
|
|
|
|
|
|
See above for more about the capabilities string.
|
|
|
|
|
|
SSH clients typically issue this command as soon as a connection is
|
|
|
established.
|
|
|
|
|
|
This command does not accept any arguments. The return type is a ``string``.
|
|
|
|
|
|
This command was introduced in Mercurial 0.9.1 (released July 2006).
|
|
|
|
|
|
listkeys
|
|
|
--------
|
|
|
|
|
|
List values in a specified ``pushkey`` namespace.
|
|
|
|
|
|
The ``namespace`` argument defines the pushkey namespace to operate on.
|
|
|
|
|
|
The return type is a ``string``. The value is an encoded dictionary of keys.
|
|
|
|
|
|
Key-value pairs are delimited by newlines (``\n``). Within each line, keys and
|
|
|
values are separated by a tab (``\t``). Keys and values are both strings.
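Decoding the ``listkeys`` response is a matter of splitting on newlines and tabs, as in this sketch. The function name ``parse_listkeys`` is hypothetical, and the sketch assumes keys and values carry no further escaping, which the description above does not address.

```python
def parse_listkeys(response):
    """Decode newline delimited, tab separated key-value pairs."""
    result = {}
    for line in response.split("\n"):
        if not line:
            continue
        key, _, value = line.partition("\t")
        result[key] = value
    return result
```

An empty response decodes to an empty dictionary, which a client can treat as an empty namespace.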
|
|
|
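The dictionary encoding described above can be decoded with a few lines of
Python. This is a sketch, not Mercurial's own decoder: ``decode_listkeys`` is
a hypothetical name, and the sample keys and values are made up for
illustration.

```python
from typing import Dict


def decode_listkeys(response: bytes) -> Dict[bytes, bytes]:
    """Decode newline-delimited ``key<TAB>value`` pairs into a dict."""
    result = {}
    for line in response.split(b"\n"):
        if not line:
            # Skip the empty piece after a trailing newline.
            continue
        key, sep, value = line.partition(b"\t")
        if not sep:
            raise ValueError("missing tab separator in %r" % line)
        result[key] = value
    return result


# Hypothetical response for a namespace holding two keys.
keys = decode_listkeys(
    b"stable\t273ce12ad8f155317b2c078ec75a4eba507f1fba\n"
    b"feature\ta9eeb3adc7ddb5006c088e9eda61791c777cbf7c\n"
)
```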
|
|
|
lookup
------

Try to resolve a value to a known repository revision.

The ``key`` argument is converted from bytes to an
``encoding.localstr`` instance then passed into
``localrepository.__getitem__`` in an attempt to resolve it.

The return type is a ``string``.

Upon successful resolution, returns ``1 <hex node>\n``. On failure,
returns ``0 <error string>\n``. e.g.::

    1 273ce12ad8f155317b2c078ec75a4eba507f1fba\n

    0 unknown revision 'foo'\n
|
|
|
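A client can separate the status digit from the payload as sketched below.
``parse_lookup`` is a hypothetical helper, and raising ``LookupError`` on
failure is a design choice of this sketch rather than something the protocol
prescribes.

```python
def parse_lookup(response: bytes) -> str:
    """Return the hex node from a ``lookup`` response, or raise on failure."""
    line = response.decode("utf-8", "replace")
    if line.endswith("\n"):
        line = line[:-1]
    status, sep, payload = line.partition(" ")
    if sep and status == "1":
        return payload  # the resolved hex node
    if sep and status == "0":
        raise LookupError(payload)  # the server's error string
    raise ValueError("malformed lookup response: %r" % response)


node = parse_lookup(b"1 273ce12ad8f155317b2c078ec75a4eba507f1fba\n")
```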
|
|
|
known
-----

Determine whether multiple nodes are known.

The ``nodes`` argument is a list of space-delimited hex nodes to check
for existence.

The return type is a ``string``.

Returns a string consisting of ``0`` and ``1`` characters indicating whether
nodes are known. If the Nth node specified in the ``nodes`` argument is known,
a ``1`` will be returned at byte offset N (counting from zero). If the node
isn't known, ``0`` will be present at byte offset N.

There is no trailing newline.
|
|
|
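Pairing each requested node with its byte in the response might look like
this. The helper name ``parse_known`` is hypothetical, and the sketch assumes
the response has exactly one byte per requested node.

```python
from typing import Dict, List


def parse_known(nodes: List[str], response: bytes) -> Dict[str, bool]:
    """Map each requested node to whether the server reported it as known."""
    if len(response) != len(nodes):
        raise ValueError("expected one response byte per requested node")
    result = {}
    # Iterating over bytes yields integers, so compare against ord() values.
    for node, flag in zip(nodes, response):
        if flag == ord("1"):
            result[node] = True
        elif flag == ord("0"):
            result[node] = False
        else:
            raise ValueError("unexpected byte in known response: %r" % flag)
    return result


requested = [
    "a9eeb3adc7ddb5006c088e9eda61791c777cbf7c",
    "31f91a3da534dc849f0d6bfc00a395a97cf218a1",
]
nodes_argument = " ".join(requested)  # value sent as the ``nodes`` argument
known = parse_known(requested, b"10")
```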
|
|
|
pushkey
-------

Set a value using the ``pushkey`` protocol.

Accepts arguments ``namespace``, ``key``, ``old``, and ``new``, which
correspond to the pushkey namespace to operate on, the key within that
namespace to change, the old value (which may be empty), and the new value.
All arguments are string types.

The return type is a ``string``. The value depends on the transport protocol.

The SSH version 1 transport sends a string-encoded integer followed by a
newline (``\n``), which indicates the operation result. The server may send
additional output on the ``stderr`` stream that should be displayed to the
user.

The HTTP version 1 transport sends a string-encoded integer followed by a
newline followed by additional server output that should be displayed to
the user. This may include output from hooks, etc.

The integer result varies by namespace. ``0`` means an error has occurred
and there should be additional output to display to the user.
|
|
|
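For the HTTP version 1 transport, the result integer and the server output
can be split at the first newline. A minimal sketch, with
``parse_pushkey_http`` as a hypothetical helper name and a made-up sample of
server output:

```python
from typing import Tuple


def parse_pushkey_http(response: bytes) -> Tuple[int, str]:
    """Split an HTTP ``pushkey`` response into (result, server output)."""
    first_line, sep, output = response.partition(b"\n")
    if not sep:
        raise ValueError("pushkey response is missing the newline")
    result = int(first_line.decode("ascii"))
    # Everything after the first newline is output meant for the user.
    return result, output.decode("utf-8", "replace")


result, output = parse_pushkey_http(b"0\nremote: hook rejected the update\n")
if result == 0:
    # ``0`` signals an error; the output should be shown to the user.
    print(output, end="")
```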
|
|
|
stream_out
----------

Obtain *streaming clone* data.

The return type is either a ``string`` or a ``stream``, depending on
whether the request was fulfilled properly.

A return value of ``1\n`` indicates the server is not configured to serve
this data. A client that sees this response probably did not verify that the
``stream`` capability is set before making the request.

A return value of ``2\n`` indicates the server was unable to lock the
repository to generate data.

All other responses are a ``stream`` of bytes. The first line of this data
contains 2 space-delimited integers corresponding to the path count and
payload size, respectively::

    <path count> <payload size>\n

The ``<payload size>`` is the total size of path data: it does not include
the size of the per-path header lines.

Following that header are ``<path count>`` entries. Each entry consists of a
line with metadata followed by raw revlog data. The line consists of::

    <store path>\0<size>\n

The ``<store path>`` is the encoded store path of the data that follows.
``<size>`` is the amount of data for this store path/revlog that follows the
newline.

There is no trailer to indicate end of data. Instead, the client should stop
reading after ``<path count>`` entries are consumed.
|
|
|
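The stream layout described above can be consumed with a small reader. This
is a sketch under the rules stated in this section only. The names
``StreamCloneError`` and ``iter_stream_out`` are hypothetical, the sample
store paths and data are made up, and each entry is held in memory for
brevity where a real client would write it to disk incrementally.

```python
import io
from typing import BinaryIO, Iterator, Tuple


class StreamCloneError(Exception):
    """Raised when the server answers with ``1\\n`` or ``2\\n``."""


def iter_stream_out(fh: BinaryIO) -> Iterator[Tuple[bytes, bytes]]:
    """Yield (store path, raw data) pairs from a ``stream_out`` response."""
    first = fh.readline()
    if first == b"1\n":
        raise StreamCloneError("server is not configured to serve this data")
    if first == b"2\n":
        raise StreamCloneError("server was unable to lock the repository")
    counts = first.rstrip(b"\n").split(b" ")
    if len(counts) != 2:
        raise ValueError("malformed header line: %r" % first)
    path_count, payload_size = int(counts[0]), int(counts[1])

    consumed = 0
    for _ in range(path_count):
        meta = fh.readline()
        if not meta.endswith(b"\n"):
            raise EOFError("truncated entry header")
        store_path, sep, size = meta[:-1].partition(b"\0")
        if not sep:
            raise ValueError("entry header lacks a NUL separator")
        data = fh.read(int(size))
        if len(data) != int(size):
            raise EOFError("truncated data for %r" % store_path)
        consumed += len(data)
        yield store_path, data
    # The payload size counts path data only, not the header lines.
    if consumed != payload_size:
        raise ValueError("payload size does not match the data received")


sample = b"2 11\n" + b"00changelog.i\x005\nhello" + b"data/foo.i\x006\nworld!"
entries = list(iter_stream_out(io.BytesIO(sample)))
```

Because there is no trailer, the loop is driven entirely by ``<path count>``.
The final size comparison is an extra sanity check, not something the
protocol requires.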
|
|
|
unbundle
--------

Send a bundle containing data (usually changegroup data) to the server.

Accepts the argument ``heads``, which is a space-delimited list of hex nodes
corresponding to server repository heads observed by the client. This is used
to detect race conditions and abort push operations before a server performs
too much work or a client transfers too much data.

The request payload consists of a bundle to be applied to the repository,
as if :hg:`unbundle` were called.

In most scenarios, a special ``push response`` type is returned. This type
contains an integer describing the change in heads as a result of the
operation. A value of ``0`` indicates nothing changed. ``1`` means the number
of heads remained the same. Values ``2`` and larger indicate the number of
added heads plus 1. e.g. ``3`` means 2 heads were added. Negative values
indicate the number of removed heads, also offset by 1. e.g. ``-2`` means
there is 1 fewer head.

The encoding of the ``push response`` type varies by transport.

For the SSH version 1 transport, this type is composed of 2 ``string``
responses: an empty response (``0\n``) followed by the integer result value,
e.g. ``1\n2``. So the full response might be ``0\n1\n2``.

For the HTTP version 1 transport, the response is a ``string`` type composed
of an integer result value followed by a newline (``\n``) followed by string
content holding server output that should be displayed on the client (output
from hooks, etc.).

In some cases, the server may respond with a ``bundle2`` bundle. In this
case, the response type is ``stream``. For the HTTP version 1 transport, the
response is zlib compressed.

The server may also respond with a generic error type, which contains a string
indicating the failure.
|
|
|
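The ``push response`` integer and its SSH version 1 encoding can be
interpreted as sketched below. Both helper names are hypothetical. The sketch
also assumes each SSH ``string`` response is framed as a decimal byte length,
a newline, and then the data, which matches the ``0\n1\n2`` example above.

```python
from typing import Optional, Tuple


def head_delta(value: int) -> Optional[int]:
    """Translate a push response integer into a change in head count.

    Returns None when nothing changed (``0``). Otherwise returns the signed
    number of heads added (positive) or removed (negative).
    """
    if value == 0:
        return None
    if value > 0:
        return value - 1  # 1 -> 0 (unchanged), 3 -> 2 heads added
    # -2 -> -1 (one fewer head). The text does not describe -1; this
    # formula maps it to 0.
    return value + 1


def read_ssh_string(buf: bytes, offset: int = 0) -> Tuple[bytes, int]:
    """Read one length-prefixed string; return (data, next offset)."""
    newline = buf.index(b"\n", offset)
    length = int(buf[offset:newline].decode("ascii"))
    start = newline + 1
    return buf[start:start + length], start + length


raw = b"0\n1\n2"
empty, offset = read_ssh_string(raw)           # the empty first response
result, offset = read_ssh_string(raw, offset)  # the integer result value
delta = head_delta(int(result.decode("ascii")))
```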
|