wireprotov2: implement commands as a generator of objects Previously, wire protocol version 2 inherited version 1's model of having separate types to represent the results of different wire protocol commands. As I implemented more powerful commands in future commits, I found I was using a common pattern of returning a special type to hold a generator. This meant the command function required a closure to do most of the work. That made logic flow more difficult to follow. I also noticed that many commands were effectively a sequence of objects to be CBOR encoded. I think it makes sense to define version 2 commands as generators. This way, commands can simply emit the data structures they wish to send to the client. This eliminates the need for a closure in command functions and removes encoding from the bodies of commands. As part of this commit, the handling of response objects has been moved into the serverreactor class. This puts the reactor in the driver's seat with regards to CBOR encoding and error handling. Having error handling in the function that emits frames is particularly important because exceptions in that function can lead to things getting in a bad state: I'm fairly certain that uncaught exceptions in the frame generator were causing deadlocks. I also introduced a dedicated error type for explicit error reporting in command handlers. This will be used in subsequent commits. There's still a bit of work to be done here, especially around formalizing the error handling "protocol." I've added yet another TODO to track this so we don't forget. Test output changed because we're using generators and no longer know we are at the end of the data until we hit the end of the generator. This means we can't emit the end-of-stream flag until we've exhausted the generator. Hence the introduction of 0-sized end-of-stream frames. Differential Revision: https://phab.mercurial-scm.org/D4472

File last commit:

r39595:07b58266 default
wireprotocolrpc.txt
Gregory Szorc
**Experimental and under development**
This document describes Mercurial's transport-agnostic remote procedure
call (RPC) protocol which is used to perform interactions with remote
servers. This protocol is also referred to as ``hgrpc``.
The protocol has the following high-level features:
* Concurrent request and response support (multiple commands can be issued
simultaneously and responses can be streamed simultaneously).
* Supports half-duplex and full-duplex connections.
* All data is transmitted within *frames*, which have a well-defined
header and encode their length.
* Side-channels for sending progress updates and printing output. Text
output from the remote can be localized locally.
* Support for simultaneous and long-lived compression streams, even across
requests.
* Uses CBOR for data exchange.
The protocol is not specific to Mercurial and could be used by other
applications.
High-level Overview
===================
To operate the protocol, a bi-directional, half-duplex pipe supporting
ordered sends and receives is required. That is, each peer has one pipe
for sending data and another for receiving. Full-duplex pipes are also
supported.
All data is read and written in atomic units called *frames*. These
are conceptually similar to TCP packets. Higher-level functionality
is built on the exchange and processing of frames.
All frames are associated with a *stream*. A *stream* provides a
unidirectional grouping of frames. Streams facilitate two goals:
content encoding and parallelism. There is a dedicated section on
streams below.
The protocol is request-response based: the client issues requests to
the server, which issues replies to those requests. Server-initiated
messaging is not currently supported, but this specification carves
out room to implement it.
All frames are associated with a numbered request. Frames can thus
be logically grouped by their request ID.
Frames
======
Frames begin with an 8 octet header followed by a variable length
payload::

    +------------------------------------------------+
    |                  Length (24)                   |
    +--------------------------------+---------------+
    |        Request ID (16)         | Stream ID (8) |
    +------------------+-------------+---------------+
    | Stream Flags (8) |
    +-----------+------+
    | Type (4)  |
    +-----------+
    | Flags (4) |
    +===========+===================================================|
    |                     Frame Payload (0...)                    ...
    +---------------------------------------------------------------+

The length of the frame payload is expressed as an unsigned 24 bit
little endian integer. Values larger than 65535 MUST NOT be used unless
given permission by the server as part of the negotiated capabilities
during the handshake. The frame header is not part of the advertised
frame length. The payload length is the over-the-wire length. If there
is content encoding applied to the payload as part of the frame's stream,
the length is the output of that content encoding, not the input.
The 16-bit ``Request ID`` field denotes the integer request identifier,
stored as an unsigned little endian integer. Odd numbered requests are
client-initiated. Even numbered requests are server-initiated. This
refers to where the *request* was initiated - not where the *frame* was
initiated, so servers will send frames with odd ``Request ID`` in
response to client-initiated requests. Implementations are advised to
start ordering request identifiers at ``1`` and ``0``, increment by
``2``, and wrap around if all available numbers have been exhausted.
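The numbering advice above can be sketched as a small generator. This is an illustration only, not code from Mercurial: the names ``client_request_ids`` and ``is_client_initiated`` are hypothetical, and the sketch does not check whether a wrapped-around ID is still active, which a real client would need to do.

```python
import itertools

MAX_REQUEST_ID = 0xFFFF  # Request ID is a 16-bit unsigned field.


def client_request_ids():
    """Yield odd request IDs 1, 3, 5, ..., 65535, then wrap back to 1."""
    while True:
        for request_id in range(1, MAX_REQUEST_ID + 1, 2):
            yield request_id


def is_client_initiated(request_id):
    """Odd request IDs denote client-initiated requests."""
    return request_id % 2 == 1


first_three = list(itertools.islice(client_request_ids(), 3))
```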
The 8-bit ``Stream ID`` field denotes the stream that the frame is
associated with. Frames belonging to a stream may have content
encoding applied and the receiver may need to decode the raw frame
payload to obtain the original data. Odd numbered IDs are
client-initiated. Even numbered IDs are server-initiated.
The 8-bit ``Stream Flags`` field defines stream processing semantics.
See the section on streams below.
The 4-bit ``Type`` field denotes the type of frame being sent.
The 4-bit ``Flags`` field defines special, per-type attributes for
the frame.
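As a rough illustration of the header layout described above, the following sketch packs and parses the 8 octet header with Python's ``struct`` module. The function names are hypothetical. The text does not state how ``Type`` and ``Flags`` share the final octet; this sketch assumes ``Type`` occupies the high 4 bits and ``Flags`` the low 4 bits. It also does not enforce the 65535 byte payload limit that applies unless the server grants a larger one.

```python
import struct

FRAME_HEADER_SIZE = 8


def pack_frame(request_id, stream_id, stream_flags, frame_type, flags, payload):
    """Return header + payload as bytes."""
    if len(payload) > 0xFFFFFF:
        raise ValueError("payload does not fit in a 24-bit length")
    if not (0 <= frame_type <= 0xF and 0 <= flags <= 0xF):
        raise ValueError("type and flags must each fit in 4 bits")
    header = bytearray(FRAME_HEADER_SIZE)
    # 24-bit little endian length: the low 3 bytes of a 32-bit value.
    header[0:3] = struct.pack("<I", len(payload))[0:3]
    # 16-bit request ID, 8-bit stream ID, 8-bit stream flags.
    struct.pack_into("<HBB", header, 3, request_id, stream_id, stream_flags)
    # Assumption: type in the high nibble, flags in the low nibble.
    header[7] = (frame_type << 4) | flags
    return bytes(header) + payload


def parse_header(data):
    """Decode the first 8 bytes of data into a dict of header fields."""
    if len(data) < FRAME_HEADER_SIZE:
        raise ValueError("need at least 8 bytes for a frame header")
    length = int.from_bytes(data[0:3], "little")
    request_id, stream_id, stream_flags, type_flags = struct.unpack_from(
        "<HBBB", data, 3
    )
    return {
        "length": length,
        "request_id": request_id,
        "stream_id": stream_id,
        "stream_flags": stream_flags,
        "type": type_flags >> 4,
        "flags": type_flags & 0x0F,
    }


frame = pack_frame(1, 2, 0x01, 0x01, 0x01, b"hello")
header = parse_header(frame)
```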
The sections below define the frame types and their behavior.
Command Request (``0x01``)
--------------------------
This frame contains a request to run a command.
The payload consists of a CBOR map defining the command request. The
bytestring keys of that map are:
name
Name of the command that should be executed (bytestring).
args
Map of bytestring keys to various value types containing the named
arguments to this command.
Each command defines its own set of argument names and their expected
types.
This frame type MUST ONLY be sent from clients to servers: it is illegal
for a server to send this frame to a client.
The following flag values are defined for this type:
0x01
New command request. When set, this frame represents the beginning
of a new request to run a command. The ``Request ID`` attached to this
frame MUST NOT be active.
0x02
Command request continuation. When set, this frame is a continuation
from a previous command request frame for its ``Request ID``. This
flag is set when the CBOR data for a command request does not fit
in a single frame.
0x04
Additional frames expected. When set, the command request didn't fit
into a single frame and additional CBOR data follows in a subsequent
frame.
0x08
Command data frames expected. When set, command data frames are
expected to follow the final command request frame for this request.
``0x01`` MUST be set on the initial command request frame for a
``Request ID``.
``0x01`` or ``0x02`` MUST be set to indicate this frame's role in
a series of command request frames.
If command data frames are to be sent, ``0x08`` MUST be set on ALL
command request frames.
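The flag rules above can be sketched as follows. The helper takes an already CBOR-encoded request (encoding is out of scope here) and splits it into ``(flags, payload)`` pairs. The constant and function names are hypothetical and chosen for this example only.

```python
FLAG_COMMAND_REQUEST_NEW = 0x01
FLAG_COMMAND_REQUEST_CONTINUATION = 0x02
FLAG_COMMAND_REQUEST_MORE_FRAMES = 0x04
FLAG_COMMAND_REQUEST_EXPECT_DATA = 0x08


def command_request_frames(encoded_request, max_payload, expect_data=False):
    """Split encoded CBOR bytes into (flags, payload) pairs."""
    if max_payload < 1:
        raise ValueError("max_payload must be positive")
    if not encoded_request:
        raise ValueError("a command request needs a non-empty CBOR map")
    chunks = [
        encoded_request[i:i + max_payload]
        for i in range(0, len(encoded_request), max_payload)
    ]
    frames = []
    for index, chunk in enumerate(chunks):
        # 0x01 on the initial frame, 0x02 on every later frame.
        if index == 0:
            flags = FLAG_COMMAND_REQUEST_NEW
        else:
            flags = FLAG_COMMAND_REQUEST_CONTINUATION
        # 0x04 whenever more command request frames follow.
        if index < len(chunks) - 1:
            flags |= FLAG_COMMAND_REQUEST_MORE_FRAMES
        # 0x08 on ALL command request frames when data frames will follow.
        if expect_data:
            flags |= FLAG_COMMAND_REQUEST_EXPECT_DATA
        frames.append((flags, chunk))
    return frames


split = command_request_frames(b"abcdef", 4, expect_data=True)
single = command_request_frames(b"abc", 10)
```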
Command Data (``0x02``)
-----------------------
This frame contains raw data for a command.
Most commands can be executed by specifying arguments. However,
arguments have an upper bound on their length. Commands that accept
data exceeding this bound, or data whose length isn't known when the
command is initially sent, need to stream arbitrary data to the
server. This frame type facilitates the sending of this data.
The payload of this frame type consists of a stream of raw data to be
consumed by the command handler on the server. The format of the data
is command specific.
The following flag values are defined for this type:
0x01
Command data continuation. When set, the data for this command
continues into a subsequent frame.
0x02
End of data. When set, command data has been fully sent to the
server. The command has been fully issued and no new data for this
command will be sent. The next frame will belong to a new command.
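Because the length of command data may not be known up front, a sender can look one chunk ahead so that only the final chunk carries the end-of-data flag. The sketch below illustrates that idea; the names are hypothetical, and emitting an empty end-of-data frame for empty input is an assumption of this example rather than something the text above specifies.

```python
FLAG_COMMAND_DATA_CONTINUATION = 0x01
FLAG_COMMAND_DATA_EOS = 0x02


def command_data_frames(chunks):
    """Yield (flags, payload) pairs for an iterable of byte chunks."""
    iterator = iter(chunks)
    try:
        previous = next(iterator)
    except StopIteration:
        # Assumption: signal end of data with an empty payload.
        yield (FLAG_COMMAND_DATA_EOS, b"")
        return
    for chunk in iterator:
        # Another chunk exists, so the held-back one is a continuation.
        yield (FLAG_COMMAND_DATA_CONTINUATION, previous)
        previous = chunk
    yield (FLAG_COMMAND_DATA_EOS, previous)


three = list(command_data_frames([b"a", b"b", b"c"]))
one = list(command_data_frames(iter([b"x"])))
```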
Command Response Data (``0x03``)
--------------------------------
This frame contains response data to an issued command.
Response data ALWAYS consists of a series of 1 or more CBOR encoded
values. A CBOR value may use indefinite length encoding, and the
bytes constituting a value may span several frames.
The following flag values are defined for this type:
0x01
Data continuation. When set, an additional frame containing response data
will follow.
0x02
End of data. When set, the response data has been fully sent and
no additional frames for this response will be sent.
The ``0x01`` flag is mutually exclusive with the ``0x02`` flag.
Error Occurred (``0x05``)
-------------------------
Some kind of error occurred.
There are 3 general kinds of failures that can occur:
* Command error encountered before any response issued
* Command error encountered after a response was issued
* Protocol or stream level error
This frame type is used to capture the latter two cases. (The first
case, a command error encountered before any response is issued, is
handled by the leading CBOR map in ``Command Response Data`` frames.)
The payload of this frame contains a CBOR map detailing the error. That
map has the following bytestring keys:
type
(bytestring) The overall type of error encountered. Can be one of the
following values:
protocol
A protocol-level error occurred. This typically means someone
is violating the framing protocol semantics and the server is
refusing to proceed.
server
A server-level error occurred. This typically indicates some kind of
logic error on the server, likely the fault of the server.
command
A command-level error, likely the fault of the client.
message
(array of maps) A richly formatted message that is intended for
human consumption. See the ``Human Output Side-Channel`` frame
section for a description of the format of this data structure.
Human Output Side-Channel (``0x06``)
------------------------------------
This frame contains a message that is intended to be displayed to
people. Whereas most frames communicate machine readable data, this
frame communicates textual data that is intended to be shown to
humans.
The frame consists of a series of *formatting requests*. Each formatting
request consists of a formatting string, arguments for that formatting
string, and labels to apply to that formatting string.
A formatting string is a printf()-like string that allows variable
substitution within the string. Labels allow the rendered text to be
*decorated*. Assuming use of the canonical Mercurial code base, a
formatting string can be the input to the ``i18n._`` function. This
allows messages emitted from the server to be localized. So even if
the server has different i18n settings, people could see messages in
their *native* settings. Similarly, the use of labels allows
decorations like coloring and underlining to be applied using the
client's configured rendering settings.
Formatting strings are similar to ``printf()`` strings or how
Python's ``%`` operator works. The only supported formatting sequences
are ``%s`` and ``%%``. ``%s`` will be replaced by whatever the string
at that position resolves to. ``%%`` will be replaced by ``%``. All
other 2-byte sequences beginning with ``%`` represent a literal
``%`` followed by that character. However, future versions of the
wire protocol reserve the right to allow clients to opt in to receiving
formatting strings with additional formatters, hence why ``%%`` is
required to represent the literal ``%``.
The frame payload consists of a CBOR array of CBOR maps. Each map
defines an *atom* of text data to print. Each *atom* has the following
bytestring keys:
msg
(bytestring) The formatting string. Content MUST be ASCII.
args (optional)
Array of bytestrings defining arguments to the formatting string.
labels (optional)
Array of bytestrings defining labels to apply to this atom.
All data to be printed MUST be encoded into a single frame: this frame
does not support spanning data across multiple frames.
All textual data encoded in these frames is assumed to be line delimited.
The last atom in the frame SHOULD end with a newline (``\n``). If it
doesn't, clients MAY add a newline to facilitate immediate printing.
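The substitution rules above can be sketched as a small renderer that operates on already-decoded atoms (CBOR decoding, localization, and label handling are left out). The function names are hypothetical, and raising an error when ``%s`` has no matching argument is a choice made for this example.

```python
def render_format(msg, args):
    """Expand %s and %% in msg; other %-sequences are kept literally."""
    out = bytearray()
    remaining = list(args)
    i = 0
    while i < len(msg):
        pair = msg[i:i + 2]
        if pair == b"%s":
            if not remaining:
                raise ValueError("not enough arguments for format string")
            out += remaining.pop(0)
            i += 2
        elif pair == b"%%":
            out += b"%"
            i += 2
        else:
            # Covers plain bytes and a literal % followed by another byte.
            out += msg[i:i + 1]
            i += 1
    return bytes(out)


def render_atoms(atoms):
    """Render a list of atom maps, adding a trailing newline if missing."""
    text = b"".join(
        render_format(atom[b"msg"], atom.get(b"args", [])) for atom in atoms
    )
    if not text.endswith(b"\n"):
        text += b"\n"
    return text


rendered = render_atoms([{b"msg": b"100%% of %s done, %d", b"args": [b"files"]}])
```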
Progress Update (``0x07``)
--------------------------
This frame holds the progress of an operation on the peer. Consumption
of these frames allows clients to display progress bars, estimated
completion times, etc.
Each frame defines the progress of a single operation on the peer. The
payload consists of a CBOR map with the following bytestring keys:
topic
Topic name (string)
pos
Current numeric position within the topic (integer)
total
Total/end numeric position of this topic (unsigned integer)
label (optional)
Unit label (string)
item (optional)
Item name (string)
Progress state is created when a frame is received referencing a
*topic* that isn't currently tracked. Progress tracking for that
*topic* is finished when a frame is received reporting the current
position of that topic as ``-1``.
Multiple *topics* may be active at any given time.
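A receiver might track progress state along these lines. This is a sketch operating on an already-decoded CBOR map; the class name ``ProgressTracker`` and its internal dictionary layout are hypothetical.

```python
class ProgressTracker:
    """Track the latest progress state for each active topic."""

    def __init__(self):
        self.topics = {}

    def update(self, frame_map):
        topic = frame_map[b"topic"]
        pos = frame_map[b"pos"]
        if pos == -1:
            # A position of -1 finishes tracking for this topic.
            self.topics.pop(topic, None)
            return
        # The first frame for an unknown topic creates its state.
        self.topics[topic] = {
            "pos": pos,
            "total": frame_map[b"total"],
            "label": frame_map.get(b"label"),
            "item": frame_map.get(b"item"),
        }


tracker = ProgressTracker()
tracker.update({b"topic": "files", b"pos": 1, b"total": 10})
tracker.update({b"topic": "manifests", b"pos": 5, b"total": 5, b"label": "entries"})
active_before = sorted(tracker.topics)
tracker.update({b"topic": "files", b"pos": -1, b"total": 10})
active_after = sorted(tracker.topics)
```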
Rendering of progress information is not mandated or governed by this
specification: implementations MAY render progress information however
they see fit, including not at all.
The string data describing the topic SHOULD be static strings to
facilitate receivers localizing that string data. The emitter
MUST normalize all string data to valid UTF-8 and receivers SHOULD
validate that received data conforms to UTF-8. The topic name
SHOULD be ASCII.
Stream Encoding Settings (``0x08``)
-----------------------------------
This frame type holds information defining the content encoding
settings for a *stream*.
This frame type is likely consumed by the protocol layer and is not
passed on to applications.
This frame type MUST ONLY occur on frames having the *Beginning of Stream*
``Stream Flag`` set.
The payload of this frame defines what content encoding has (possibly)
been applied to the payloads of subsequent frames in this stream.
The payload begins with an 8-bit integer defining the length of the
encoding *profile*, followed by the string name of that profile, which
must be an ASCII string. All bytes that follow can be used by that
profile for supplemental settings definitions. See the section below
on defined encoding profiles.
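Parsing this payload could look like the sketch below. The function name is hypothetical, and since no profiles are defined yet, the profile name ``example`` used here is a placeholder rather than a real profile.

```python
def parse_stream_settings(payload):
    """Return (profile_name, supplemental_bytes) from a settings payload."""
    if not payload:
        raise ValueError("payload must contain a length byte")
    name_length = payload[0]
    name = payload[1:1 + name_length]
    if len(name) != name_length:
        raise ValueError("payload is shorter than the declared name length")
    # decode() raises UnicodeDecodeError if the name is not ASCII.
    profile = name.decode("ascii")
    return profile, payload[1 + name_length:]


profile, extra = parse_stream_settings(b"\x07example\x01\x02")
```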
Stream States and Flags
=======================
Streams can be in two states: *open* and *closed*. An *open* stream
is active and frames attached to that stream could arrive at any time.
A *closed* stream is not active. If a frame attached to a *closed*
stream arrives, that frame MUST have an appropriate stream flag
set indicating beginning of stream. All streams are in the *closed*
state by default.
The ``Stream Flags`` field denotes a set of bit flags for defining
the relationship of this frame within a stream. The following flags
are defined:
0x01
Beginning of stream. The first frame in the stream MUST set this
flag. When received, the ``Stream ID`` this frame is attached to
becomes ``open``.
0x02
End of stream. The last frame in a stream MUST set this flag. When
received, the ``Stream ID`` this frame is attached to becomes
``closed``. Any content encoding context associated with this stream
can be destroyed after processing the payload of this frame.
0x04
Apply content encoding. When set, any content encoding settings
defined by the stream should be applied when attempting to read
the frame. When not set, the frame payload isn't encoded.
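The open and closed states can be sketched as a small tracker keyed by ``Stream ID``. The names are hypothetical. The text above does not say how to treat a beginning-of-stream flag on an already open stream, so this sketch simply ignores that case.

```python
STREAM_FLAG_BEGIN = 0x01
STREAM_FLAG_END = 0x02
STREAM_FLAG_ENCODING_APPLIED = 0x04


class StreamTracker:
    """Track which stream IDs are open for one direction of traffic."""

    def __init__(self):
        self.open_streams = set()

    def on_frame(self, stream_id, stream_flags):
        """Update state; return True if the payload is content encoded."""
        if stream_id not in self.open_streams:
            # Streams start closed; the first frame must open the stream.
            if not stream_flags & STREAM_FLAG_BEGIN:
                raise ValueError("frame on closed stream lacks begin flag")
            self.open_streams.add(stream_id)
        if stream_flags & STREAM_FLAG_END:
            self.open_streams.discard(stream_id)
        return bool(stream_flags & STREAM_FLAG_ENCODING_APPLIED)


tracker = StreamTracker()
opened_encoded = tracker.on_frame(2, STREAM_FLAG_BEGIN)
middle_encoded = tracker.on_frame(2, STREAM_FLAG_ENCODING_APPLIED)
tracker.on_frame(2, STREAM_FLAG_END)
try:
    tracker.on_frame(2, 0x00)
    rejected = False
except ValueError:
    rejected = True
```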
Streams
=======
Streams - along with ``Request IDs`` - facilitate grouping of frames.
But the purpose of each is quite different and the groupings they
constitute are independent.
A ``Request ID`` is essentially a tag. It tells you which logical
request a frame is associated with.
A *stream* is a sequence of frames grouped for the express purpose
of applying a stateful encoding or for denoting sub-groups of frames.
Unlike ``Request IDs``, which span the request and response, a stream
is unidirectional and stream IDs are independent from client to
server.
There is no strict hierarchical relationship between ``Request IDs``
and *streams*. A stream can contain frames having multiple
``Request IDs``. Frames belonging to the same ``Request ID`` can
span multiple streams.
One goal of streams is to facilitate content encoding. A stream can
define an encoding to be applied to frame payloads. For example, the
payload transmitted over the wire may contain output from a
zstandard compression operation and the receiving end may decompress
that payload to obtain the original data.
The other goal of streams is to facilitate concurrent execution. For
example, a server could spawn 4 threads to service a request that can
be easily parallelized. Each of those 4 threads could write into its
own stream. Those streams could then in turn be delivered to 4 threads
on the receiving end, with each thread consuming its stream in near
isolation. The *main* thread on both ends merely does I/O and
encodes/decodes frame headers: the bulk of the work is done by worker
threads.
In addition, since content encoding is defined per stream, each
*worker thread* could perform potentially CPU bound work concurrently
with other threads. This approach of applying encoding at the
sub-protocol / stream level eliminates a potential resource constraint
on the protocol stream as a whole (it is common for the throughput of
a compression engine to be smaller than the throughput of a network).
Having multiple streams - each with their own encoding settings - also
facilitates the use of advanced data compression techniques. For
example, a transmitter could see that it is generating data faster
or slower than the receiving end is consuming it and adjust its
compression settings to trade CPU for compression ratio accordingly.
While streams can define a content encoding, not all frames within
that stream must use that content encoding. This can be useful when
data is being served from caches and being derived dynamically. A
cache could hold pre-compressed data so the server doesn't have to
recompress it. The ability to pick and choose which frames are
compressed allows servers to easily send data to the wire without
involving potentially expensive encoding overhead.
Content Encoding Profiles
=========================
Streams can have named content encoding *profiles* associated with
them. A profile defines a shared understanding of content encoding
settings and behavior.
The following profiles are defined:
TBD
Command Protocol
================
A client can request that a remote run a command by sending it
frames defining that command. This logical stream is composed of
1 or more ``Command Request`` frames and 0 or more ``Command Data``
frames.
All frames composing a single command request MUST be associated with
the same ``Request ID``.
Clients MAY send additional command requests without waiting on the
response to a previous command request. If they do so, they MUST ensure
that the ``Request ID`` field of outbound frames does not conflict
with that of an active ``Request ID`` whose response has not yet been
fully received.
Servers MAY respond to commands in a different order than they were
sent over the wire. Clients MUST be prepared to deal with this. Servers
also MAY start executing commands in a different order than they were
received, or MAY execute multiple commands concurrently.
If there is a dependency between commands or a race condition between
commands executing (e.g. a read-only command that depends on the results
of a command that mutates the repository), then clients MUST NOT send
frames issuing a command until a response to all dependent commands has
been received.
TODO think about whether we should express dependencies between commands
to avoid roundtrip latency.
A command is defined by a command name, 0 or more command arguments,
and optional command data.
Arguments are the recommended mechanism for transferring fixed sets of
parameters to a command. Data is appropriate for transferring variable
data. Thinking in terms of HTTP, arguments would be headers and data
would be the message body.
It is recommended for servers to delay the dispatch of a command
until all arguments have been received. Servers MAY impose limits on the
maximum argument size.
TODO define failure mechanism.
Servers MAY dispatch to commands immediately once argument data
is available or delay until command data is received in full.
Once a ``Command Request`` frame is sent, a client must be prepared to
receive any of the following frames associated with that request:
``Command Response Data``, ``Error Occurred``,
``Human Output Side-Channel``, ``Progress Update``.
The *main* response for a command will be in ``Command Response Data``
frames. The payloads of these frames consist of 1 or more CBOR encoded
values. The first CBOR value on the first ``Command Response Data``
frame is special
and denotes the overall status of the command. This CBOR map contains
the following bytestring keys:
status
(bytestring) A well-defined message containing the overall status of
this command request. The following values are defined:
ok
The command was received successfully and its response follows.
error
There was an error processing the command. More details about the
error are encoded in the ``error`` key.
error (optional)
A map containing information about an encountered error. The map has the
following keys:
message
(array of maps) A message describing the error. The message uses the
same format as those in the ``Human Output Side-Channel`` frame.
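A client could inspect the leading status map roughly as follows. This sketch works on an already-decoded CBOR map; ``CommandError`` and ``check_status`` are hypothetical names for this example, and treating an unrecognized status as an error is an assumption.

```python
class CommandError(Exception):
    """Raised when the leading status map reports an error."""

    def __init__(self, message_atoms):
        super().__init__("command failed")
        # Same structure as Human Output Side-Channel atoms.
        self.message_atoms = message_atoms


def check_status(first_value):
    """Raise if the first response value does not report success."""
    status = first_value[b"status"]
    if status == b"ok":
        return
    if status == b"error":
        error = first_value.get(b"error", {})
        raise CommandError(error.get(b"message", []))
    raise ValueError("unrecognized status: %r" % (status,))


check_status({b"status": b"ok"})
try:
    check_status(
        {b"status": b"error", b"error": {b"message": [{b"msg": b"bad input"}]}}
    )
    atoms = None
except CommandError as exc:
    atoms = exc.message_atoms
```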
TODO formalize when error frames can be seen and how errors can be
recognized midway through a command response.