**Experimental and under development**

This document describes Mercurial's transport-agnostic remote procedure
call (RPC) protocol, which is used to perform interactions with remote
servers. This protocol is also referred to as ``hgrpc``.

The protocol has the following high-level features:

* Concurrent request and response support (multiple commands can be issued
  simultaneously and responses can be streamed simultaneously).
* Supports half-duplex and full-duplex connections.
* All data is transmitted within *frames*, which have a well-defined
  header and encode their length.
* Side-channels for sending progress updates and printing output. Text
  output from the remote can be localized locally.
* Support for simultaneous and long-lived compression streams, even across
  requests.
* Uses CBOR for data exchange.

The protocol is not specific to Mercurial and could be used by other
applications.

High-level Overview
===================

To operate the protocol, a bi-directional, half-duplex pipe supporting
ordered sends and receives is required. That is, each peer has one pipe
for sending data and another for receiving. Full-duplex pipes are also
supported.

All data is read and written in atomic units called *frames*. These
are conceptually similar to TCP packets. Higher-level functionality
is built on the exchange and processing of frames.

All frames are associated with a *stream*. A *stream* provides a
unidirectional grouping of frames. Streams facilitate two goals:
content encoding and parallelism. There is a dedicated section on
streams below.

The protocol is request-response based: the client issues requests to
the server, which issues replies to those requests. Server-initiated
messaging is not currently supported, but this specification carves
out room to implement it.

All frames are associated with a numbered request. Frames can thus
be logically grouped by their request ID.

Frames
======

Frames begin with an 8-octet header followed by a variable-length
payload::

    +------------------------------------------------+
    |                 Length (24)                    |
    +--------------------------------+---------------+
    |         Request ID (16)        | Stream ID (8) |
    +------------------+-------------+---------------+
    | Stream Flags (8) |
    +-----------+------+
    | Type (4)  |
    +-----------+
    | Flags (4) |
    +===========+===================================================|
    |                     Frame Payload (0...)                    ...
    +---------------------------------------------------------------+

The length of the frame payload is expressed as an unsigned 24-bit
little-endian integer. Values larger than 65535 MUST NOT be used unless
given permission by the server as part of the negotiated capabilities
during the handshake. The frame header is not part of the advertised
frame length. The payload length is the over-the-wire length. If there
is content encoding applied to the payload as part of the frame's stream,
the length is the output of that content encoding, not the input.

The 16-bit ``Request ID`` field denotes the integer request identifier,
stored as an unsigned little-endian integer. Odd numbered requests are
client-initiated. Even numbered requests are server-initiated. This
refers to where the *request* was initiated - not where the *frame* was
initiated, so servers will send frames with odd ``Request ID`` in
response to client-initiated requests. Implementations are advised to
start ordering request identifiers at ``1`` (client) and ``0`` (server),
increment by ``2``, and wrap around if all available numbers have been
exhausted.

The 8-bit ``Stream ID`` field denotes the stream that the frame is
associated with. Frames belonging to a stream may have content
encoding applied and the receiver may need to decode the raw frame
payload to obtain the original data. Odd numbered IDs are
client-initiated. Even numbered IDs are server-initiated.

The 8-bit ``Stream Flags`` field defines stream processing semantics.
See the section on streams below.

The 4-bit ``Type`` field denotes the type of frame being sent.

The 4-bit ``Flags`` field defines special, per-type attributes for
the frame.

The sections below define the frame types and their behavior.

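The header layout described above can be sketched in Python as follows. This
is a minimal illustration, not the reference implementation: the helper names
``pack_frame`` and ``parse_header`` are hypothetical, and the placement of
``Type`` in the high 4 bits and ``Flags`` in the low 4 bits of the final header
octet is an assumption read off the field order in the diagram.

```python
import struct

# Everything after the 3-byte length: request ID (u16), stream ID (u8),
# stream flags (u8), and one byte shared by type and flags.
_HEADER_TAIL = struct.Struct("<HBBB")
HEADER_SIZE = 8


def pack_frame(request_id, stream_id, stream_flags, frame_type, flags, payload=b""):
    """Return the 8-octet header followed by the payload."""
    if len(payload) >= 1 << 24:
        raise ValueError("payload length does not fit in 24 bits")
    if not (0 <= frame_type <= 0x0F and 0 <= flags <= 0x0F):
        raise ValueError("type and flags are 4-bit fields")
    length = len(payload).to_bytes(3, "little")
    # Assumption: type occupies the high nibble, flags the low nibble.
    tail = _HEADER_TAIL.pack(
        request_id, stream_id, stream_flags, (frame_type << 4) | flags
    )
    return length + tail + bytes(payload)


def parse_header(header):
    """Decode an 8-octet frame header into a dict of its fields."""
    if len(header) != HEADER_SIZE:
        raise ValueError("a frame header is exactly 8 octets")
    request_id, stream_id, stream_flags, type_flags = _HEADER_TAIL.unpack(header[3:])
    return {
        "length": int.from_bytes(header[:3], "little"),
        "request_id": request_id,
        "stream_id": stream_id,
        "stream_flags": stream_flags,
        "type": type_flags >> 4,
        "flags": type_flags & 0x0F,
    }
```

A receiver would read exactly 8 octets, call ``parse_header``, and then read
``length`` further octets to obtain the (possibly content-encoded) payload.
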
Command Request (``0x01``)
--------------------------

This frame contains a request to run a command.

The payload consists of a CBOR map defining the command request. The
bytestring keys of that map are:

name
   Name of the command that should be executed (bytestring).
args
   Map of bytestring keys to various value types containing the named
   arguments to this command.

   Each command defines its own set of argument names and their expected
   types.

This frame type MUST ONLY be sent from clients to servers: it is illegal
for a server to send this frame to a client.

The following flag values are defined for this type:

0x01
   New command request. When set, this frame represents the beginning
   of a new request to run a command. The ``Request ID`` attached to this
   frame MUST NOT be active.
0x02
   Command request continuation. When set, this frame is a continuation
   from a previous command request frame for its ``Request ID``. This
   flag is set when the CBOR data for a command request does not fit
   in a single frame.
0x04
   Additional frames expected. When set, the command request didn't fit
   into a single frame and additional CBOR data follows in a subsequent
   frame.
0x08
   Command data frames expected. When set, command data frames are
   expected to follow the final command request frame for this request.

``0x01`` MUST be set on the initial command request frame for a
``Request ID``.

``0x01`` or ``0x02`` MUST be set to indicate this frame's role in
a series of command request frames.

If command data frames are to be sent, ``0x08`` MUST be set on ALL
command request frames.

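The flag rules above can be sketched as a small Python generator that encodes
a command request and splits it across frames. This is an illustration under
stated assumptions: the CBOR encoder is a deliberately tiny stand-in that
handles only unsigned integers, bytestrings, and maps; the constant names, the
``command_request_frames`` helper, and the ``heads`` command used in the usage
note are hypothetical.

```python
FLAG_NEW = 0x01           # first frame of a command request
FLAG_CONTINUATION = 0x02  # later frame of the same request
FLAG_MORE_FRAMES = 0x04   # more command request frames follow
FLAG_EXPECT_DATA = 0x08   # command data frames will follow


def _cbor_head(major, value):
    """Encode a CBOR initial byte plus its argument (big-endian)."""
    if value < 24:
        return bytes([(major << 5) | value])
    for extra, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if value < 1 << (8 * size):
            return bytes([(major << 5) | extra]) + value.to_bytes(size, "big")
    raise ValueError("value too large for CBOR")


def cbor_encode(obj):
    """Minimal CBOR encoder: unsigned ints, bytestrings, and maps only."""
    if isinstance(obj, int) and not isinstance(obj, bool) and obj >= 0:
        return _cbor_head(0, obj)
    if isinstance(obj, bytes):
        return _cbor_head(2, len(obj)) + obj
    if isinstance(obj, dict):
        out = _cbor_head(5, len(obj))
        for key, value in obj.items():
            out += cbor_encode(key) + cbor_encode(value)
        return out
    raise TypeError("unsupported type: %r" % type(obj))


def command_request_frames(name, args, max_payload=65535, have_data=False):
    """Yield (flags, payload) pairs for one command request."""
    encoded = cbor_encode({b"name": name, b"args": args})
    chunks = [encoded[i:i + max_payload] for i in range(0, len(encoded), max_payload)]
    for index, chunk in enumerate(chunks):
        flags = FLAG_NEW if index == 0 else FLAG_CONTINUATION
        if index < len(chunks) - 1:
            flags |= FLAG_MORE_FRAMES
        if have_data:
            flags |= FLAG_EXPECT_DATA  # must be set on ALL request frames
        yield flags, chunk
```

For example, ``list(command_request_frames(b"heads", {}))`` produces a single
frame whose flags are ``0x01``, while a smaller ``max_payload`` produces a
first frame with ``0x01 | 0x04`` followed by continuation frames.
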
Command Data (``0x02``)
-----------------------

This frame contains raw data for a command.

Most commands can be executed by specifying arguments. However,
arguments have an upper bound to their length. Commands that accept
data beyond this length, or whose data length isn't known when the
command is initially sent, need to stream arbitrary data to the server.
This frame type facilitates the sending of this data.

The payload of this frame type consists of a stream of raw data to be
consumed by the command handler on the server. The format of the data
is command specific.

The following flag values are defined for this type:

0x01
   Command data continuation. When set, the data for this command
   continues into a subsequent frame.

0x02
   End of data. When set, command data has been fully sent to the
   server. The command has been fully issued and no new data for this
   command will be sent. The next frame will belong to a new command.

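As a rough sketch of how a sender might apply these two flags, the Python
generator below splits raw command data into frame payloads. The names are
hypothetical, and emitting a single empty end-of-data frame when there is no
data at all is an assumption; the text above does not address that case.

```python
DATA_CONTINUATION = 0x01  # more command data follows in a later frame
DATA_EOS = 0x02           # command data has been fully sent


def command_data_frames(data, max_payload=65535):
    """Yield (flags, payload) pairs carrying ``data`` for one command."""
    chunks = [data[i:i + max_payload] for i in range(0, len(data), max_payload)]
    if not chunks:
        # Assumption: signal end of data with one empty frame.
        chunks = [b""]
    last = len(chunks) - 1
    for index, chunk in enumerate(chunks):
        yield (DATA_EOS if index == last else DATA_CONTINUATION), chunk
```

Every frame except the last carries ``0x01``; the last carries ``0x02``.
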
Command Response Data (``0x03``)
--------------------------------

This frame contains response data to an issued command.

Response data ALWAYS consists of a series of 1 or more CBOR encoded
values. A CBOR value may use indefinite-length encoding, and the
bytes constituting the value may span several frames.

The following flag values are defined for this type:

0x01
   Data continuation. When set, an additional frame containing response data
   will follow.
0x02
   End of data. When set, the response data has been fully sent and
   no additional frames for this response will be sent.

The ``0x01`` flag is mutually exclusive with the ``0x02`` flag.

Error Occurred (``0x05``)
-------------------------

Some kind of error occurred.

There are 3 general kinds of failures that can occur:

* Command error encountered before any response issued
* Command error encountered after a response was issued
* Protocol or stream level error

This frame type is used to capture the latter two cases. (The general
command error case is handled by the leading CBOR map in
``Command Response`` frames.)

The payload of this frame contains a CBOR map detailing the error. That
map has the following bytestring keys:

type
   (bytestring) The overall type of error encountered. Can be one of the
   following values:

   protocol
      A protocol-level error occurred. This typically means someone
      is violating the framing protocol semantics and the server is
      refusing to proceed.

   server
      A server-level error occurred. This typically indicates some kind of
      logic error on the server, likely the fault of the server.

   command
      A command-level error, likely the fault of the client.

message
   (array of maps) A richly formatted message that is intended for
   human consumption. See the ``Human Output Side-Channel`` frame
   section for a description of the format of this data structure.

Human Output Side-Channel (``0x06``)
------------------------------------

This frame contains a message that is intended to be displayed to
people. Whereas most frames communicate machine readable data, this
frame communicates textual data that is intended to be shown to
humans.

The frame consists of a series of *formatting requests*. Each formatting
request consists of a formatting string, arguments for that formatting
string, and labels to apply to that formatting string.

A formatting string is a printf()-like string that allows variable
substitution within the string. Labels allow the rendered text to be
*decorated*. Assuming use of the canonical Mercurial code base, a
formatting string can be the input to the ``i18n._`` function. This
allows messages emitted from the server to be localized. So even if
the server has different i18n settings, people could see messages in
their *native* settings. Similarly, the use of labels allows
decorations like coloring and underlining to be applied using the
client's configured rendering settings.

Formatting strings are similar to ``printf()`` strings or how
Python's ``%`` operator works. The only supported formatting sequences
are ``%s`` and ``%%``. ``%s`` will be replaced by whatever the string
at that position resolves to. ``%%`` will be replaced by ``%``. All
other 2-byte sequences beginning with ``%`` represent a literal
``%`` followed by that character. However, future versions of the
wire protocol reserve the right to allow clients to opt in to receiving
formatting strings with additional formatters, hence why ``%%`` is
required to represent the literal ``%``.

The frame payload consists of a CBOR array of CBOR maps. Each map
defines an *atom* of text data to print. Each *atom* has the following
bytestring keys:

msg
   (bytestring) The formatting string. Content MUST be ASCII.
args (optional)
   Array of bytestrings defining arguments to the formatting string.
labels (optional)
   Array of bytestrings defining labels to apply to this atom.

All data to be printed MUST be encoded into a single frame: this frame
does not support spanning data across multiple frames.

All textual data encoded in these frames is assumed to be line delimited.
The last atom in the frame SHOULD end with a newline (``\n``). If it
doesn't, clients MAY add a newline to facilitate immediate printing.

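The formatting rules above can be sketched in Python as follows. This is an
illustration only: ``render_format`` and ``render_atoms`` are hypothetical
names, the atoms are assumed to be already decoded from CBOR into dicts with
bytestring keys, and both localization and labels are deliberately left out.

```python
def render_format(msg, args=()):
    """Expand ``%s`` and ``%%`` in a formatting string given as bytes."""
    out = bytearray()
    pending = iter(args)
    i = 0
    while i < len(msg):
        pair = msg[i:i + 2]
        if pair == b"%s":
            try:
                out += next(pending)
            except StopIteration:
                raise ValueError("not enough arguments for formatting string") from None
            i += 2
        elif pair == b"%%":
            out += b"%"
            i += 2
        else:
            # Any other byte, including a "%" that starts an unsupported
            # sequence, is copied through literally.
            out += msg[i:i + 1]
            i += 1
    return bytes(out)


def render_atoms(atoms):
    """Render every atom of one frame and make the result line delimited."""
    text = b"".join(
        render_format(atom[b"msg"], atom.get(b"args", ())) for atom in atoms
    )
    if text and not text.endswith(b"\n"):
        text += b"\n"  # clients MAY add the missing newline
    return text
```

A real client would pass each ``msg`` through its localization function before
substitution and apply ``labels`` when writing the rendered text.
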
Progress Update (``0x07``)
--------------------------

This frame holds the progress of an operation on the peer. Consumption
of these frames allows clients to display progress bars, estimated
completion times, etc.

Each frame defines the progress of a single operation on the peer. The
payload consists of a CBOR map with the following bytestring keys:

topic
   Topic name (string)
pos
   Current numeric position within the topic (integer)
total
   Total/end numeric position of this topic (unsigned integer)
label (optional)
   Unit label (string)
item (optional)
   Item name (string)

Progress state is created when a frame is received referencing a
*topic* that isn't currently tracked. Progress tracking for that
*topic* is finished when a frame is received reporting the current
position of that topic as ``-1``.

Multiple *topics* may be active at any given time.

Rendering of progress information is not mandated or governed by this
specification: implementations MAY render progress information however
they see fit, including not at all.

The string data describing the topic SHOULD be static strings to
facilitate receivers localizing that string data. The emitter
MUST normalize all string data to valid UTF-8 and receivers SHOULD
validate that received data conforms to UTF-8. The topic name
SHOULD be ASCII.

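The topic lifecycle described above can be sketched as a small Python class.
The class name ``ProgressTracker`` and the choice to store ``(pos, total)``
tuples are assumptions made for illustration; the payload is assumed to be
already decoded from CBOR into a dict with bytestring keys.

```python
class ProgressTracker:
    """Track the active progress topics reported by a peer."""

    def __init__(self):
        # Maps topic name -> (pos, total) for every active topic.
        self.active = {}

    def update(self, payload):
        """Apply one decoded Progress Update payload."""
        topic = payload[b"topic"]
        pos = payload[b"pos"]
        if pos == -1:
            # A position of -1 finishes tracking for this topic.
            self.active.pop(topic, None)
            return
        # An unknown topic is created implicitly; a known one is updated.
        self.active[topic] = (pos, payload.get(b"total"))
```

Because state is keyed by topic, several topics can be tracked at once.
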
Stream Encoding Settings (``0x08``)
-----------------------------------

This frame type holds information defining the content encoding
settings for a *stream*.

This frame type is likely consumed by the protocol layer and is not
passed on to applications.

This frame type MUST ONLY occur on frames having the *Beginning of Stream*
``Stream Flag`` set.

The payload of this frame defines what content encoding has (possibly)
been applied to the payloads of subsequent frames in this stream.

The payload begins with an 8-bit integer defining the length of the
encoding *profile*, followed by the string name of that profile, which
must be an ASCII string. All bytes that follow can be used by that
profile for supplemental settings definitions. See the section below
on defined encoding profiles.

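A minimal Python sketch of parsing this payload is shown below. The function
name ``parse_encoding_settings`` is hypothetical, and the ``zstd`` profile
name used in the usage note is purely illustrative, since this document does
not yet define any profiles.

```python
def parse_encoding_settings(payload):
    """Split a Stream Encoding Settings payload into (profile, settings)."""
    if not payload:
        raise ValueError("payload is missing the profile length octet")
    name_length = payload[0]
    name = payload[1:1 + name_length]
    if len(name) != name_length:
        raise ValueError("payload is shorter than the declared profile length")
    # The profile name must be ASCII; decode() raises UnicodeDecodeError if not.
    profile = name.decode("ascii")
    # Everything after the name belongs to the profile itself.
    return profile, bytes(payload[1 + name_length:])
```

For example, ``parse_encoding_settings(b"\x04zstd\x03")`` splits the payload
into the profile name and one trailing octet of profile-specific settings.
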
Stream States and Flags
=======================

Streams can be in two states: *open* and *closed*. An *open* stream
is active and frames attached to that stream could arrive at any time.
A *closed* stream is not active. If a frame attached to a *closed*
stream arrives, that frame MUST have an appropriate stream flag
set indicating beginning of stream. All streams are in the *closed*
state by default.

The ``Stream Flags`` field denotes a set of bit flags for defining
the relationship of this frame within a stream. The following flags
are defined:

0x01
   Beginning of stream. The first frame in the stream MUST set this
   flag. When received, the ``Stream ID`` this frame is attached to
   becomes ``open``.

0x02
   End of stream. The last frame in a stream MUST set this flag. When
   received, the ``Stream ID`` this frame is attached to becomes
   ``closed``. Any content encoding context associated with this stream
   can be destroyed after processing the payload of this frame.

0x04
   Apply content encoding. When set, any content encoding settings
   defined by the stream should be applied when attempting to read
   the frame. When not set, the frame payload isn't encoded.

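The open/closed state machine above can be sketched in Python as follows. The
constant and class names are hypothetical, and silently accepting a
beginning-of-stream flag on a stream that is already open is an assumption,
since the text does not say how that case should be handled.

```python
STREAM_BEGIN = 0x01    # beginning of stream
STREAM_END = 0x02      # end of stream
STREAM_ENCODED = 0x04  # apply content encoding to this frame


class StreamTracker:
    """Track which stream IDs are open on the receiving side."""

    def __init__(self):
        # All streams are closed by default, so only open ones are stored.
        self.open_streams = set()

    def on_frame(self, stream_id, stream_flags):
        """Update state for one frame; return True if its payload is encoded."""
        if stream_id not in self.open_streams:
            if not stream_flags & STREAM_BEGIN:
                raise ValueError("frame on a closed stream lacks beginning-of-stream")
            self.open_streams.add(stream_id)
        if stream_flags & STREAM_END:
            # Any encoding context for the stream can be discarded after
            # this frame's payload has been processed.
            self.open_streams.discard(stream_id)
        return bool(stream_flags & STREAM_ENCODED)
```

A stream whose only frame sets both ``0x01`` and ``0x02`` opens and closes in
a single step.
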
Streams
=======

Streams - along with ``Request IDs`` - facilitate grouping of frames.
But the purpose of each is quite different and the groupings they
constitute are independent.

A ``Request ID`` is essentially a tag. It tells you which logical
request a frame is associated with.

A *stream* is a sequence of frames grouped for the express purpose
of applying a stateful encoding or for denoting sub-groups of frames.

Unlike a ``Request ID``, which spans the request and response, a stream
is unidirectional and stream IDs are independent from client to
server.

There is no strict hierarchical relationship between ``Request IDs``
and *streams*. A stream can contain frames having multiple
``Request IDs``. Frames belonging to the same ``Request ID`` can
span multiple streams.

One goal of streams is to facilitate content encoding. A stream can
define an encoding to be applied to frame payloads. For example, the
payload transmitted over the wire may contain output from a
zstandard compression operation and the receiving end may decompress
that payload to obtain the original data.

The other goal of streams is to facilitate concurrent execution. For
example, a server could spawn 4 threads to service a request that can
be easily parallelized. Each of those 4 threads could write into its
own stream. Those streams could then in turn be delivered to 4 threads
on the receiving end, with each thread consuming its stream in near
isolation. The *main* thread on both ends merely does I/O and
encodes/decodes frame headers: the bulk of the work is done by worker
threads.

In addition, since content encoding is defined per stream, each
*worker thread* could perform potentially CPU bound work concurrently
with other threads. This approach of applying encoding at the
sub-protocol / stream level eliminates a potential resource constraint
on the protocol stream as a whole (it is common for the throughput of
a compression engine to be smaller than the throughput of a network).

Having multiple streams - each with their own encoding settings - also
facilitates the use of advanced data compression techniques. For
example, a transmitter could see that it is generating data faster
or slower than the receiving end is consuming it and adjust its
compression settings to trade CPU for compression ratio accordingly.

While streams can define a content encoding, not all frames within
that stream must use that content encoding. This can be useful when
some data is served from caches and other data is derived dynamically.
A cache could hold pre-compressed data so the server doesn't have to
recompress it. The ability to pick and choose which frames are
compressed allows servers to easily send data to the wire without
involving potentially expensive encoding overhead.

Content Encoding Profiles
=========================

Streams can have named content encoding *profiles* associated with
them. A profile defines a shared understanding of content encoding
settings and behavior.

The following profiles are defined:

TBD

Command Protocol
================

A client can request that a remote run a command by sending it
frames defining that command. This logical stream is composed of
1 or more ``Command Request`` frames and 0 or more ``Command Data``
frames.

All frames composing a single command request MUST be associated with
the same ``Request ID``.

Clients MAY send additional command requests without waiting on the
response to a previous command request. If they do so, they MUST ensure
that the ``Request ID`` field of outbound frames does not conflict
with that of an active ``Request ID`` whose response has not yet been
fully received.

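One way a client might satisfy this requirement is sketched below in Python.
It follows the numbering advice from the ``Frames`` section (start at ``1``,
increment by ``2``, wrap around) and skips identifiers that are still active.
The class name ``ClientRequestIds`` and its methods are hypothetical.

```python
class ClientRequestIds:
    """Hand out odd 16-bit request IDs that are not currently active."""

    MAX_ID = 0xFFFF

    def __init__(self):
        self._next = 1
        self.active = set()

    def allocate(self):
        """Return an unused odd request ID and mark it active."""
        # There are 32768 odd values in a 16-bit field; try each at most once.
        for _ in range((self.MAX_ID + 1) // 2):
            candidate = self._next
            self._next += 2
            if self._next > self.MAX_ID:
                self._next = 1  # wrap around
            if candidate not in self.active:
                self.active.add(candidate)
                return candidate
        raise RuntimeError("all client request IDs are in use")

    def release(self, request_id):
        """Mark a request ID reusable once its response is fully received."""
        self.active.discard(request_id)
```

A client would call ``allocate`` before sending the first ``Command Request``
frame and ``release`` after the final response frame for that request arrives.
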
Servers MAY respond to commands in a different order than they were
sent over the wire. Clients MUST be prepared to deal with this. Servers
also MAY start executing commands in a different order than they were
received, or MAY execute multiple commands concurrently.

If there is a dependency between commands or a race condition between
commands executing (e.g. a read-only command that depends on the results
of a command that mutates the repository), then clients MUST NOT send
frames issuing a command until a response to all dependent commands has
been received.
TODO think about whether we should express dependencies between commands
to avoid roundtrip latency.

A command is defined by a command name, 0 or more command arguments,
and optional command data.

Arguments are the recommended mechanism for transferring fixed sets of
parameters to a command. Data is appropriate for transferring variable
data. Thinking in terms of HTTP, arguments would be headers and data
would be the message body.

It is recommended for servers to delay the dispatch of a command
until all arguments have been received. Servers MAY impose limits on the
maximum argument size.
TODO define failure mechanism.

Servers MAY dispatch to commands immediately once argument data
is available or delay until command data is received in full.

Once a ``Command Request`` frame is sent, a client must be prepared to
receive any of the following frames associated with that request:
``Command Response``, ``Error Occurred``, ``Human Output Side-Channel``,
``Progress Update``.

The *main* response for a command will be in ``Command Response`` frames.
The payloads of these frames consist of 1 or more CBOR encoded values.
The first CBOR value on the first ``Command Response`` frame is special
and denotes the overall status of the command. This CBOR map contains
the following bytestring keys:

status
   (bytestring) A well-defined message containing the overall status of
   this command request. The following values are defined:

   ok
      The command was received successfully and its response follows.
   error
      There was an error processing the command. More details about the
      error are encoded in the ``error`` key.

error (optional)
   A map containing information about an encountered error. The map has the
   following keys:

   message
      (array of maps) A message describing the error. The message uses the
      same format as those in the ``Human Output Side-Channel`` frame.

TODO formalize when error frames can be seen and how errors can be
recognized midway through a command response.

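As a rough illustration of how a client might interpret the leading status
map, the Python sketch below inspects the first decoded CBOR value of a
response. ``CommandError`` and ``check_status`` are hypothetical names, the map
is assumed to be already decoded into a dict with bytestring keys, and
rejecting unknown status values is an assumption rather than something this
document specifies.

```python
class CommandError(Exception):
    """Raised when the status map reports ``error``."""

    def __init__(self, atoms):
        super().__init__("command failed")
        # Message atoms in the Human Output Side-Channel format.
        self.atoms = atoms


def check_status(first_value):
    """Validate the first CBOR value of a command response."""
    status = first_value.get(b"status")
    if status == b"ok":
        return
    if status == b"error":
        atoms = first_value.get(b"error", {}).get(b"message", [])
        raise CommandError(atoms)
    # Assumption: anything else is treated as a malformed response.
    raise ValueError("unrecognized command status: %r" % (status,))
```

On success the remaining CBOR values of the response carry the command's
actual result.
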