upstream/mercurial-mirror Commit - r40161:e2fe1074

wireprotov2: update stream encoding specification...

Gregory Szorc -

r40161:e2fe1074 default

mercurial/help/internals/wireprotocolrpc.txt

0 +99 -8

              **Experimental and under development**
              This document describes Mercurial's transport-agnostic remote procedure
              call (RPC) protocol which is used to perform interactions with remote
              servers. This protocol is also referred to as ``hgrpc``.
              The protocol has the following high-level features:
              * Concurrent request and response support (multiple commands can be issued
                simultaneously and responses can be streamed simultaneously).
              * Supports half-duplex and full-duplex connections.
              * All data is transmitted within *frames*, which have a well-defined
                header and encode their length.
              * Side-channels for sending progress updates and printing output. Text
                output from the remote can be localized by the receiver.
              * Support for simultaneous and long-lived compression streams, even across
                requests.
              * Uses CBOR for data exchange.
              The protocol is not specific to Mercurial and could be used by other
              applications.
              High-level Overview
              ===================
              To operate the protocol, a bi-directional, half-duplex pipe supporting
              ordered sends and receives is required. That is, each peer has one pipe
              for sending data and another for receiving. Full-duplex pipes are also
              supported.
              All data is read and written in atomic units called *frames*. These
              are conceptually similar to TCP packets. Higher-level functionality
              is built on the exchange and processing of frames.
              All frames are associated with a *stream*. A *stream* provides a
              unidirectional grouping of frames. Streams facilitate two goals:
              content encoding and parallelism. There is a dedicated section on
              streams below.
              The protocol is request-response based: the client issues requests to
              the server, which issues replies to those requests. Server-initiated
              messaging is not currently supported, but this specification carves
              out room to implement it.
              All frames are associated with a numbered request. Frames can thus
              be logically grouped by their request ID.
              Frames
              ======
              Frames begin with an 8 octet header followed by a variable length
              payload::
                  +------------------------------------------------+
                  |                 Length (24)                    |
                  +--------------------------------+---------------+
                  |         Request ID (16)        | Stream ID (8) |
                  +------------------+-------------+---------------+
                  | Stream Flags (8) |
                  +-----------+------+
                  | Type (4)  |
                  +-----------+
                  | Flags (4) |
                  +===========+===================================================|
                  |                     Frame Payload (0...)                    ...
                  +---------------------------------------------------------------+
              The length of the frame payload is expressed as an unsigned 24 bit
              little endian integer. Values larger than 65535 MUST NOT be used unless
              given permission by the server as part of the negotiated capabilities
              during the handshake. The frame header is not part of the advertised
              frame length. The payload length is the over-the-wire length. If there
              is content encoding applied to the payload as part of the frame's stream,
              the length is the output of that content encoding, not the input.
              The 16-bit ``Request ID`` field denotes the integer request identifier,
              stored as an unsigned little endian integer. Odd numbered requests are
              client-initiated. Even numbered requests are server-initiated. This
              refers to where the *request* was initiated - not where the *frame* was
              initiated, so servers will send frames with odd ``Request ID`` in
              response to client-initiated requests. Implementations are advised to
              start ordering request identifiers at ``1`` and ``0``, increment by
              ``2``, and wrap around if all available numbers have been exhausted.
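              As an illustration of the advice above, a client could allocate request
              identifiers along these lines. This is a minimal sketch, not part of the
              protocol: the ``ClientRequestIds`` class and its method names are
              hypothetical, and skipping identifiers that are still active is one
              possible policy.

```python
class ClientRequestIds:
    """Hands out odd 16-bit request IDs for client-initiated requests."""

    MAX_ID = 0xFFFF  # Request ID is an unsigned 16-bit field.

    def __init__(self):
        self._next = 1
        self.active = set()

    def allocate(self):
        # Try every odd value at most once before giving up.
        for _ in range((self.MAX_ID + 1) // 2):
            candidate = self._next
            self._next += 2
            if self._next > self.MAX_ID:
                self._next = 1  # Wrap around once the space is exhausted.
            if candidate not in self.active:
                self.active.add(candidate)
                return candidate
        raise RuntimeError("all client request IDs are active")

    def release(self, request_id):
        # Call once the response for this request was fully received.
        self.active.discard(request_id)
```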
              The 8-bit ``Stream ID`` field denotes the stream that the frame is
              associated with. Frames belonging to a stream may have content
              encoding applied and the receiver may need to decode the raw frame
              payload to obtain the original data. Odd numbered IDs are
              client-initiated. Even numbered IDs are server-initiated.
              The 8-bit ``Stream Flags`` field defines stream processing semantics.
              See the section on streams below.
              The 4-bit ``Type`` field denotes the type of frame being sent.
              The 4-bit ``Flags`` field defines special, per-type attributes for
              the frame.
              The sections below define the frame types and their behavior.
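              The header layout described above can be sketched with Python's
              ``struct`` module. The function names are hypothetical. The sketch
              assumes the ``Type`` field occupies the high 4 bits of the final header
              octet and ``Flags`` the low 4 bits, following the order of the diagram;
              the text above does not state the bit order explicitly.

```python
import struct

HEADER_SIZE = 8
# Request ID (16), Stream ID (8), Stream Flags (8), Type+Flags (8), little endian.
_TAIL = struct.Struct("<HBBB")


def pack_frame(request_id, stream_id, stream_flags, frame_type, flags,
               payload, max_payload=65535):
    """Return the 8 octet header followed by ``payload``."""
    # 65535 is the default limit; a larger one needs server permission.
    if len(payload) > min(max_payload, 0xFFFFFF):
        raise ValueError("payload too large for this frame")
    if not (0 <= frame_type <= 0xF and 0 <= flags <= 0xF):
        raise ValueError("type and flags are 4-bit fields")
    header = len(payload).to_bytes(3, "little")
    # Assumption: type in the high nibble, flags in the low nibble.
    header += _TAIL.pack(request_id, stream_id, stream_flags,
                         (frame_type << 4) | flags)
    return header + payload


def parse_header(data):
    """Decode the first 8 octets of ``data`` into a dict of header fields."""
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least 8 octets")
    length = int.from_bytes(data[0:3], "little")
    request_id, stream_id, stream_flags, typeflags = _TAIL.unpack_from(data, 3)
    return {
        "length": length,
        "request_id": request_id,
        "stream_id": stream_id,
        "stream_flags": stream_flags,
        "type": typeflags >> 4,
        "flags": typeflags & 0x0F,
    }
```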
              Command Request (``0x01``)
              --------------------------
              This frame contains a request to run a command.
              The payload consists of a CBOR map defining the command request. The
              bytestring keys of that map are:
              name
                 Name of the command that should be executed (bytestring).
              args
                 Map of bytestring keys to various value types containing the named
                 arguments to this command.
                 Each command defines its own set of argument names and their expected
                 types.
              redirect (optional)
                 (map) Advertises client support for following response *redirects*.
                 This map has the following bytestring keys:
                 targets
                    (array of bytestring) List of named redirect targets supported by
                    this client. The names come from the targets advertised by the
                    server's *capabilities* message.
                 hashes
                    (array of bytestring) List of preferred hashing algorithms that can
                    be used for content integrity verification.
                 See the *Content Redirects* section below for more on content redirects.
              This frame type MUST ONLY be sent from clients to servers: it is illegal
              for a server to send this frame to a client.
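              To make the payload format concrete, the following sketch encodes a
              command request map with a deliberately tiny CBOR encoder covering only
              unsigned integers, bytestrings, arrays and maps. The helper names, the
              command name ``example`` and its argument are hypothetical; a real
              implementation would more likely use a full CBOR library.

```python
def _head(major, n):
    """Encode a CBOR initial byte plus argument for major type ``major``."""
    if n < 24:
        return bytes([(major << 5) | n])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([(major << 5) | info]) + n.to_bytes(size, "big")
    raise ValueError("integer too large for CBOR")


def cbor_encode(value):
    """Encode unsigned ints, bytes, lists and dicts (definite length only)."""
    if isinstance(value, int) and not isinstance(value, bool) and value >= 0:
        return _head(0, value)
    if isinstance(value, bytes):
        return _head(2, len(value)) + value
    if isinstance(value, list):
        return _head(4, len(value)) + b"".join(cbor_encode(v) for v in value)
    if isinstance(value, dict):
        body = b"".join(cbor_encode(k) + cbor_encode(v)
                        for k, v in value.items())
        return _head(5, len(value)) + body
    raise TypeError("unsupported type: %r" % type(value))


# Hypothetical command name and arguments, for illustration only.
request = {b"name": b"example", b"args": {b"key": b"value"}}
payload = cbor_encode(request)
```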
              The following flag values are defined for this type:
              0x01
                 New command request. When set, this frame represents the beginning
                 of a new request to run a command. The ``Request ID`` attached to this
                 frame MUST NOT be active.
              0x02
                 Command request continuation. When set, this frame is a continuation
                 from a previous command request frame for its ``Request ID``. This
                 flag is set when the CBOR data for a command request does not fit
                 in a single frame.
              0x04
                 Additional frames expected. When set, the command request didn't fit
                 into a single frame and additional CBOR data follows in a subsequent
                 frame.
              0x08
                 Command data frames expected. When set, command data frames are
                 expected to follow the final command request frame for this request.
              ``0x01`` MUST be set on the initial command request frame for a
              ``Request ID``.
              ``0x01`` or ``0x02`` MUST be set to indicate this frame's role in
              a series of command request frames.
              If command data frames are to be sent, ``0x08`` MUST be set on ALL
              command request frames.
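              The flag rules above can be sketched as a function that splits an
              already encoded command request into per-frame ``(flags, payload)``
              pairs. The function and constant names are hypothetical, and
              ``max_payload`` stands for whatever frame size limit applies.

```python
FLAG_NEW = 0x01
FLAG_CONTINUATION = 0x02
FLAG_MORE_FRAMES = 0x04
FLAG_EXPECT_DATA = 0x08


def command_request_frames(encoded, max_payload, expect_data=False):
    """Split CBOR bytes into (flags, payload) pairs for Command Request frames."""
    if not encoded:
        raise ValueError("a command request needs a non-empty payload")
    if max_payload < 1:
        raise ValueError("max_payload must be positive")
    chunks = [encoded[i:i + max_payload]
              for i in range(0, len(encoded), max_payload)]
    frames = []
    for index, chunk in enumerate(chunks):
        # The first frame starts the request; later ones continue it.
        flags = FLAG_NEW if index == 0 else FLAG_CONTINUATION
        if index < len(chunks) - 1:
            flags |= FLAG_MORE_FRAMES
        if expect_data:
            # Must be set on ALL command request frames.
            flags |= FLAG_EXPECT_DATA
        frames.append((flags, chunk))
    return frames
```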
              Command Data (``0x02``)
              -----------------------
              This frame contains raw data for a command.
              Most commands can be executed by specifying arguments. However,
              arguments have an upper bound to their length. Commands that accept
              data beyond this length, or data whose length isn't known when the
              command is initially sent, need to stream arbitrary data to the
              server. This frame type facilitates the sending of this data.
              The payload of this frame type consists of a stream of raw data to be
              consumed by the command handler on the server. The format of the data
              is command specific.
              The following flag values are defined for this type:
              0x01
                 Command data continuation. When set, the data for this command
                 continues into a subsequent frame.
              0x02
                 End of data. When set, command data has been fully sent to the
                 server. The command has been fully issued and no new data for this
                 command will be sent. The next frame will belong to a new command.
              Command Response Data (``0x03``)
              --------------------------------
              This frame contains response data to an issued command.
              Response data ALWAYS consists of a series of 1 or more CBOR encoded
              values. A CBOR value may use indefinite length encoding, and the
              bytes constituting the value may span several frames.
              The following flag values are defined for this type:
              0x01
                 Data continuation. When set, an additional frame containing response data
                 will follow.
              0x02
                 End of data. When set, the response data has been fully sent and
                 no additional frames for this response will be sent.
              The ``0x01`` flag is mutually exclusive with the ``0x02`` flag.
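              A receiver might collect response payloads as sketched below before
              handing the bytes to a CBOR decoder. The ``ResponseAssembler`` class is
              hypothetical; buffering everything is the simplest approach, and a real
              client could instead feed each payload to an incremental decoder.

```python
FLAG_CONTINUATION = 0x01
FLAG_END_OF_DATA = 0x02


class ResponseAssembler:
    """Buffers Command Response Data payloads for one request."""

    def __init__(self):
        self._chunks = []
        self.done = False

    def on_frame(self, flags, payload):
        if self.done:
            raise ValueError("frame received after end of data")
        if flags & FLAG_CONTINUATION and flags & FLAG_END_OF_DATA:
            raise ValueError("0x01 and 0x02 are mutually exclusive")
        self._chunks.append(payload)
        if flags & FLAG_END_OF_DATA:
            self.done = True

    def data(self):
        """Return all bytes received so far."""
        return b"".join(self._chunks)
```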
              Error Occurred (``0x05``)
              -------------------------
              Some kind of error occurred.
              There are 3 general kinds of failures that can occur:
              * Command error encountered before any response issued
              * Command error encountered after a response was issued
              * Protocol or stream level error
              This frame type is used to capture the latter cases. (The general
              command error case is handled by the leading CBOR map in
              ``Command Response`` frames.)
              The payload of this frame contains a CBOR map detailing the error. That
              map has the following bytestring keys:
              type
                 (bytestring) The overall type of error encountered. Can be one of the
                 following values:
                 protocol
                    A protocol-level error occurred. This typically means someone
                    is violating the framing protocol semantics and the server is
                    refusing to proceed.
                 server
                    A server-level error occurred. This typically indicates some kind of
                    logic error on the server, likely the fault of the server.
                 command
                    A command-level error, likely the fault of the client.
              message
                 (array of maps) A richly formatted message that is intended for
                 human consumption. See the ``Human Output Side-Channel`` frame
                 section for a description of the format of this data structure.
              Human Output Side-Channel (``0x06``)
              ------------------------------------
              This frame contains a message that is intended to be displayed to
              people. Whereas most frames communicate machine readable data, this
              frame communicates textual data that is intended to be shown to
              humans.
              The frame consists of a series of *formatting requests*. Each formatting
              request consists of a formatting string, arguments for that formatting
              string, and labels to apply to that formatting string.
              A formatting string is a printf()-like string that allows variable
              substitution within the string. Labels allow the rendered text to be
              *decorated*. Assuming use of the canonical Mercurial code base, a
              formatting string can be the input to the ``i18n._`` function. This
              allows messages emitted from the server to be localized. So even if
              the server has different i18n settings, people could see messages in
              their *native* settings. Similarly, the use of labels allows
              decorations like coloring and underlining to be applied using the
              client's configured rendering settings.
              Formatting strings are similar to ``printf()`` strings or how
              Python's ``%`` operator works. The only supported formatting sequences
              are ``%s`` and ``%%``. ``%s`` will be replaced by whatever the string
              at that position resolves to. ``%%`` will be replaced by ``%``. All
              other 2-byte sequences beginning with ``%`` represent a literal
              ``%`` followed by that character. However, future versions of the
              wire protocol reserve the right to allow clients to opt in to receiving
              formatting strings with additional formatters, hence why ``%%`` is
              required to represent the literal ``%``.
              The frame payload consists of a CBOR array of CBOR maps. Each map
              defines an *atom* of text data to print. Each *atom* has the following
              bytestring keys:
              msg
                 (bytestring) The formatting string. Content MUST be ASCII.
              args (optional)
                 Array of bytestrings defining arguments to the formatting string.
              labels (optional)
                 Array of bytestrings defining labels to apply to this atom.
              All data to be printed MUST be encoded into a single frame: this frame
              does not support spanning data across multiple frames.
              All textual data encoded in these frames is assumed to be line delimited.
              The last atom in the frame SHOULD end with a newline (``\n``). If it
              doesn't, clients MAY add a newline to facilitate immediate printing.
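              The formatting rules above can be sketched as follows. The function
              names are hypothetical, the atoms are assumed to be already decoded from
              CBOR into Python dicts with bytestring keys, and labels and localization
              are ignored for brevity.

```python
def render_atom(msg, args=()):
    """Expand ``%s`` and ``%%`` in ``msg``; other ``%x`` pairs stay literal."""
    out = bytearray()
    remaining = list(args)
    i = 0
    while i < len(msg):
        pair = msg[i:i + 2]
        if pair == b"%s":
            if not remaining:
                raise ValueError("not enough arguments for format string")
            out += remaining.pop(0)
            i += 2
        elif pair == b"%%":
            out += b"%"
            i += 2
        else:
            # Ordinary byte, or a "%" that starts no supported sequence.
            out += msg[i:i + 1]
            i += 1
    return bytes(out)


def render_frame(atoms):
    """Render every atom in a frame and ensure a trailing newline."""
    text = b"".join(render_atom(a[b"msg"], a.get(b"args", ())) for a in atoms)
    if text and not text.endswith(b"\n"):
        text += b"\n"  # Clients MAY add the missing newline.
    return text
```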
              Progress Update (``0x07``)
              --------------------------
              This frame holds the progress of an operation on the peer. Consumption
              of these frames allows clients to display progress bars, estimated
              completion times, etc.
              Each frame defines the progress of a single operation on the peer. The
              payload consists of a CBOR map with the following bytestring keys:
              topic
                 Topic name (string)
              pos
                 Current numeric position within the topic (integer)
              total
                 Total/end numeric position of this topic (unsigned integer)
              label (optional)
                 Unit label (string)
              item (optional)
                 Item name (string)
              Progress state is created when a frame is received referencing a
              *topic* that isn't currently tracked. Progress tracking for that
              *topic* is finished when a frame is received reporting the current
              position of that topic as ``-1``.
              Multiple *topics* may be active at any given time.
              Rendering of progress information is not mandated or governed by this
              specification: implementations MAY render progress information however
              they see fit, including not at all.
              The string data describing the topic SHOULD be static strings to
              facilitate receivers localizing that string data. The emitter
              MUST normalize all string data to valid UTF-8 and receivers SHOULD
              validate that received data conforms to UTF-8. The topic name
              SHOULD be ASCII.
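              Topic tracking as described in this section could look roughly like the
              sketch below. The ``ProgressTracker`` class is hypothetical and the
              payload is assumed to be already decoded from CBOR into a dict with
              bytestring keys.

```python
class ProgressTracker:
    """Tracks the latest state of every active progress topic."""

    def __init__(self):
        self.topics = {}

    def on_frame(self, update):
        topic = update[b"topic"]
        if update[b"pos"] == -1:
            # A position of -1 finishes tracking for this topic.
            self.topics.pop(topic, None)
            return
        # State is created implicitly the first time a topic is seen.
        self.topics[topic] = {
            "pos": update[b"pos"],
            "total": update[b"total"],
            "label": update.get(b"label"),
            "item": update.get(b"item"),
        }
```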
-             Stream Encoding Settings (``0x08``)
+             Sender Protocol Settings (``0x08``)
+             -----------------------------------
+             This frame type advertises the sender's support for various protocol and
+             stream level features. The data advertised in this frame is used to influence
+             subsequent behavior of the current frame exchange channel.
+             The frame payload consists of a CBOR map. It may contain the following
+             bytestring keys:
+             contentencodings
+                (array of bytestring) A list of content encodings supported by the
+                sender, in order of most to least preferred.
+                Peers are allowed to encode stream data using any of the listed
+                encodings.
+                See the ``Content Encoding Profiles`` section for an enumeration
+                of supported content encodings.
+                If not defined, the value is assumed to be a list with the single value
+                ``identity``, meaning only the no-op encoding is supported.
+                Senders MAY filter the set of advertised encodings against what they
+                know the receiver supports (e.g. if the receiver advertised encodings
+                via the capabilities descriptor). However, doing so will prevent
+                servers from gaining an understanding of the aggregate capabilities
+                of clients. So clients are discouraged from doing so.
+             When this frame is not sent/received, the receiver assumes default values
+             for all keys.
+             If encountered, this frame type MUST be sent before any other frame type
+             in a channel.
+             The following flag values are defined for this frame type:
+             0x01
+                Data continuation. When set, an additional frame containing more protocol
+                settings immediately follows.
+             0x02
+                End of data. When set, the protocol settings data has been completely
+                sent.
+             The ``0x01`` flag is mutually exclusive with the ``0x02`` flag.
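              One way a peer might act on the ``contentencodings`` key is sketched
              below. The function names are hypothetical and the settings map is
              assumed to be already decoded from CBOR. Picking the first advertised
              encoding that is supported locally is one possible policy, not something
              this specification mandates.

```python
def advertised_encodings(settings):
    """Return the sender's encodings, most preferred first."""
    # A missing key means only the no-op ``identity`` encoding is supported.
    return list(settings.get(b"contentencodings", [b"identity"]))


def choose_encoding(peer_settings, locally_supported):
    """Pick the peer's most preferred encoding that we can also produce."""
    for name in advertised_encodings(peer_settings):
        if name in locally_supported:
            return name
    # ``identity`` must be supported by all peers, so it is a safe fallback.
    return b"identity"
```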
+             Stream Encoding Settings (``0x09``)
              -----------------------------------
              This frame type holds information defining the content encoding
              settings for a *stream*.
              This frame type is likely consumed by the protocol layer and is not
              passed on to applications.
              This frame type MUST ONLY occur on frames having the *Beginning of Stream*
              ``Stream Flag`` set.
              The payload of this frame defines what content encoding has (possibly)
              been applied to the payloads of subsequent frames in this stream.
-             The payload begins with an 8-bit integer defining the length of the
-             encoding *profile*, followed by the string name of that profile, which
-             must be an ASCII string. All bytes that follow can be used by that
-             profile for supplemental settings definitions. See the section below
-             on defined encoding profiles.
+             The payload consists of a series of CBOR values. The first value is a
+             bytestring denoting the content encoding profile of the data in this
+             stream. Subsequent CBOR values supplement this simple value in a
+             profile-specific manner. See the ``Content Encoding Profiles`` section
+             for more.
+             In the absence of this frame on a stream, it is assumed the stream is
+             using the ``identity`` content encoding.
+             The following flag values are defined for this frame type:
+             0x01
+                Data continuation. When set, an additional frame containing more encoding
+                settings immediately follows.
+             0x02
+                End of data. When set, the encoding settings data has been completely
+                sent.
+             The ``0x01`` flag is mutually exclusive with the ``0x02`` flag.
              Stream States and Flags
              =======================
              Streams can be in two states: *open* and *closed*. An *open* stream
              is active and frames attached to that stream could arrive at any time.
              A *closed* stream is not active. If a frame attached to a *closed*
              stream arrives, that frame MUST have an appropriate stream flag
              set indicating beginning of stream. All streams are in the *closed*
              state by default.
              The ``Stream Flags`` field denotes a set of bit flags for defining
              the relationship of this frame within a stream. The following flags
              are defined:
              0x01
                 Beginning of stream. The first frame in the stream MUST set this
                 flag. When received, the ``Stream ID`` this frame is attached to
                 becomes ``open``.
              0x02
                 End of stream. The last frame in a stream MUST set this flag. When
                 received, the ``Stream ID`` this frame is attached to becomes
                 ``closed``. Any content encoding context associated with this stream
                 can be destroyed after processing the payload of this frame.
              0x04
                 Apply content encoding. When set, any content encoding settings
                 defined by the stream should be applied when attempting to read
                 the frame. When not set, the frame payload isn't encoded.
+             TODO consider making stream opening and closing communicated via
+             explicit frame types (e.g. a "stream state change" frame) rather than
+             flags on all frames. This would make stream state changes more explicit,
+             as they could only occur on specific frame types.
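              The stream state rules in this section can be sketched as a small
              tracker. The ``StreamTracker`` class is hypothetical. How to treat a
              *Beginning of Stream* flag on a stream that is already open is not
              specified here, so the sketch simply tolerates it.

```python
STREAM_BEGIN = 0x01
STREAM_END = 0x02
STREAM_ENCODED = 0x04


class StreamTracker:
    """Tracks which stream IDs are open for one direction of a channel."""

    def __init__(self):
        # All streams are closed by default.
        self.open_streams = set()

    def on_frame(self, stream_id, stream_flags):
        """Update state; return True if the payload has encoding applied."""
        if stream_id not in self.open_streams:
            if not stream_flags & STREAM_BEGIN:
                raise ValueError("frame on closed stream lacks begin flag")
            self.open_streams.add(stream_id)
        if stream_flags & STREAM_END:
            # Any encoding context for this stream can be dropped
            # once this frame's payload has been processed.
            self.open_streams.discard(stream_id)
        return bool(stream_flags & STREAM_ENCODED)
```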
              Streams
              =======
              Streams - along with ``Request IDs`` - facilitate grouping of frames.
              But the purpose of each is quite different and the groupings they
              constitute are independent.
              A ``Request ID`` is essentially a tag. It tells you which logical
              request a frame is associated with.
              A *stream* is a sequence of frames grouped for the express purpose
              of applying a stateful encoding or for denoting sub-groups of frames.
              Unlike ``Request ID``s which span the request and response, a stream
              is unidirectional and stream IDs are independent from client to
              server.
              There is no strict hierarchical relationship between ``Request IDs``
              and *streams*. A stream can contain frames having multiple
              ``Request IDs``. Frames belonging to the same ``Request ID`` can
              span multiple streams.
              One goal of streams is to facilitate content encoding. A stream can
              define an encoding to be applied to frame payloads. For example, the
              payload transmitted over the wire may contain output from a
              zstandard compression operation and the receiving end may decompress
              that payload to obtain the original data.
              The other goal of streams is to facilitate concurrent execution. For
              example, a server could spawn 4 threads to service a request that can
              be easily parallelized. Each of those 4 threads could write into its
              own stream. Those streams could then in turn be delivered to 4 threads
              on the receiving end, with each thread consuming its stream in near
              isolation. The *main* thread on both ends merely does I/O and
              encodes/decodes frame headers: the bulk of the work is done by worker
              threads.
              In addition, since content encoding is defined per stream, each
              *worker thread* could perform potentially CPU bound work concurrently
              with other threads. This approach of applying encoding at the
              sub-protocol / stream level eliminates a potential resource constraint
              on the protocol stream as a whole (it is common for the throughput of
              a compression engine to be smaller than the throughput of a network).
              Having multiple streams - each with their own encoding settings - also
              facilitates the use of advanced data compression techniques. For
              example, a transmitter could see that it is generating data faster
              or slower than the receiving end is consuming it and adjust its
              compression settings to trade CPU for compression ratio accordingly.
              While streams can define a content encoding, not all frames within
              that stream must use that content encoding. This can be useful when
              some data is served from caches and other data is derived dynamically.
              A cache could store pre-compressed data so the server doesn't have to
              recompress it. The ability to pick and choose which frames are
              compressed allows servers to easily send data to the wire without
              involving potentially expensive encoding overhead.
              Content Encoding Profiles
              =========================
              Streams can have named content encoding *profiles* associated with
              them. A profile defines a shared understanding of content encoding
              settings and behavior.
-             The following profiles are defined:
+             Profiles are described in the following sections.
+             identity
+             --------
+             The ``identity`` profile is a no-op encoding: the encoded bytes are
+             exactly the input bytes.
+             This profile MUST be supported by all peers.
+             In the absence of an identified profile, the ``identity`` profile is
+             assumed.
-             TBD
+             zstd-8mb
+             --------
+             Zstandard encoding (RFC 8478). Zstandard is a fast and effective lossless
+             compression format.
+             This profile allows decompressor window sizes of up to 8 MB.
+             zlib
+             ----
+             zlib compressed data (RFC 1950). zlib is a widely-used and supported
+             lossless compression format.
+             It isn't as fast as zstandard and it is recommended to use zstandard instead,
+             if possible.
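              As an illustration of a stateful, per-stream encoding, the sketch below
              uses Python's standard ``zlib`` module for the ``zlib`` profile. The
              class names are hypothetical. Issuing a sync flush after every frame is
              an assumption made here so that each frame payload can be decoded as
              soon as it arrives; this document does not specify flushing behavior.

```python
import zlib


class ZlibStreamEncoder:
    """Compresses frame payloads using one zlib context per stream."""

    def __init__(self):
        self._compressor = zlib.compressobj()

    def encode(self, payload):
        # Z_SYNC_FLUSH emits everything consumed so far while keeping the
        # compression context alive for later frames on the same stream.
        data = self._compressor.compress(payload)
        return data + self._compressor.flush(zlib.Z_SYNC_FLUSH)


class ZlibStreamDecoder:
    """Decompresses frame payloads produced by ``ZlibStreamEncoder``."""

    def __init__(self):
        self._decompressor = zlib.decompressobj()

    def decode(self, payload):
        return self._decompressor.decompress(payload)
```

              Because the context persists across frames, later frames can reference
              data seen in earlier ones, which is what makes long-lived compression
              streams worthwhile.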
              Command Protocol
              ================
              A client can request that a remote run a command by sending it
              frames defining that command. This logical stream is composed of
              1 or more ``Command Request`` frames and 0 or more ``Command Data``
              frames.
              All frames composing a single command request MUST be associated with
              the same ``Request ID``.
              Clients MAY send additional command requests without waiting on the
              response to a previous command request. If they do so, they MUST ensure
              that the ``Request ID`` field of outbound frames does not conflict
              with that of an active ``Request ID`` whose response has not yet been
              fully received.
              Servers MAY respond to commands in a different order than they were
              sent over the wire. Clients MUST be prepared to deal with this. Servers
              also MAY start executing commands in a different order than they were
              received, or MAY execute multiple commands concurrently.
              If there is a dependency between commands or a race condition between
              commands executing (e.g. a read-only command that depends on the results
              of a command that mutates the repository), then clients MUST NOT send
              frames issuing a command until a response to all dependent commands has
              been received.
              TODO think about whether we should express dependencies between commands
              to avoid roundtrip latency.
              A command is defined by a command name, 0 or more command arguments,
              and optional command data.
              Arguments are the recommended mechanism for transferring fixed sets of
              parameters to a command. Data is appropriate for transferring variable
              data. Thinking in terms of HTTP, arguments would be headers and data
              would be the message body.
              It is recommended for servers to delay the dispatch of a command
              until all arguments have been received. Servers MAY impose limits on the
              maximum argument size.
              TODO define failure mechanism.
              Servers MAY dispatch to commands immediately once argument data
              is available or delay until command data is received in full.
              Once a ``Command Request`` frame is sent, a client must be prepared to
              receive any of the following frames associated with that request:
              ``Command Response``, ``Error Response``, ``Human Output Side-Channel``,
              ``Progress Update``.
              The *main* response for a command will be in ``Command Response`` frames.
              The payloads of these frames consist of 1 or more CBOR encoded values.
              The first CBOR value on the first ``Command Response`` frame is special
              and denotes the overall status of the command. This CBOR map contains
              the following bytestring keys:
              status
                 (bytestring) A well-defined message containing the overall status of
                 this command request. The following values are defined:
                 ok
                    The command was received successfully and its response follows.
                 error
                    There was an error processing the command. More details about the
                    error are encoded in the ``error`` key.
                 redirect
                    The response for this command is available elsewhere. Details on
                    where are in the ``location`` key.
              error (optional)
                 A map containing information about an encountered error. The map has the
                 following keys:
                 message
                    (array of maps) A message describing the error. The message uses the
                    same format as those in the ``Human Output Side-Channel`` frame.
              location (optional)
                 (map) Presence indicates that a *content redirect* has occurred. The map
                 provides the external location of the content.
                 This map contains the following bytestring keys:
                 url
                    (bytestring) URL from which this content may be requested.
                 mediatype
                    (bytestring) The media type for the fetched content. e.g.
                    ``application/mercurial-*``.
                    In some transports, this value is also advertised by the transport.
                    e.g. as the ``Content-Type`` HTTP header.
                 size (optional)
                    (unsigned integer) Total size of remote object in bytes. This is
                    the raw size of the entity that will be fetched, minus any
                    non-Mercurial protocol encoding (e.g. HTTP content or transfer
                    encoding.)
                 fullhashes (optional)
                    (array of arrays) Content hashes for the entire payload. Each entry
                    is an array of bytestrings containing the hash name and the hash value.
                 fullhashseed (optional)
                    (bytestring) Optional seed value to feed into hasher for full content
                    hash verification.
                 serverdercerts (optional)
                    (array of bytestring) DER encoded x509 certificates for the server. When
                    defined, clients MAY validate that the x509 certificate on the target
                    server exactly matches the certificate used here.
                 servercadercerts (optional)
                    (array of bytestring) DER encoded x509 certificates for the certificate
              authority of the target server. When defined, clients MAY validate that
              the x509 certificate on the target server was signed by a CA certificate
              in this set.
                 # TODO support for giving client an x509 certificate pair to be used as a
                 # client certificate.
                 # TODO support common authentication mechanisms (e.g. HTTP basic/digest
                 # auth).
                 # TODO support custom authentication mechanisms. This likely requires
                 # server to advertise required auth mechanism so client can filter.
                 # TODO support chained hashes. e.g. hash for each 1MB segment so client
                 # can iteratively validate data without having to consume all of it first.
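The ``fullhashes`` and ``fullhashseed`` keys above could be checked along the following lines once the redirected content has been fetched. This is a minimal sketch, not the reference implementation. It assumes (the specification does not say) that hash names map to ``hashlib`` algorithm names, that hash values are raw digests, and that the seed is fed into the hasher before the content. The helper name ``verify_full_hashes`` is hypothetical.

```python
import hashlib


def verify_full_hashes(content, fullhashes, fullhashseed=b""):
    """Return True if every recognised hash in ``fullhashes`` matches.

    ASSUMPTIONS (not stated by the specification): hash names are hashlib
    algorithm names, hash values are raw digests, and the seed is fed into
    the hasher before the content.
    """
    checked = 0
    for name, expected in fullhashes:
        try:
            hasher = hashlib.new(name.decode("ascii"))
        except ValueError:
            # Unknown or undecodable algorithm name: skip this entry.
            continue
        hasher.update(fullhashseed)
        hasher.update(content)
        if hasher.digest() != expected:
            return False
        checked += 1
    # Require that at least one hash was actually verified.
    return checked > 0
```

Skipping unknown algorithms while still requiring one successful check is a design choice made for this sketch; a stricter client could reject any entry it cannot verify.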
              TODO formalize when error frames can be seen and how errors can be
              recognized midway through a command response.
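As an illustration of the status map described above, a client might dispatch on the ``status`` key roughly as follows. This is a sketch under stated assumptions: the map has already been CBOR-decoded into a Python dict with bytestring keys, and the function name ``classify_response`` and its return shape are hypothetical.

```python
def classify_response(status_map):
    """Interpret the first CBOR value of a command response.

    ``status_map`` is assumed to be already decoded into a dict with
    bytestring keys. Returns a hypothetical (kind, detail) tuple.
    """
    status = status_map.get(b"status")
    if status == b"ok":
        # The response data follows in the remaining CBOR values.
        return ("ok", None)
    if status == b"error":
        # ``error`` is optional in the map, so tolerate its absence.
        return ("error", status_map.get(b"error", {}).get(b"message"))
    if status == b"redirect":
        location = status_map.get(b"location")
        if not location or b"url" not in location:
            raise ValueError("redirect response without a location URL")
        return ("redirect", location[b"url"])
    raise ValueError("unknown status: %r" % (status,))
```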
              Content Redirects
              =================
              Servers have the ability to respond to ANY command request with a
              *redirect* to another location. Such a response is referred to as a *redirect
              response*. (This feature is conceptually similar to HTTP redirects, but is
              more powerful.)
              A *redirect response* MUST NOT be issued unless the client advertises support
              for a redirect *target*.
              Clients advertise support for *redirect responses* after looking at the server's
              *capabilities* data, which is fetched during initial server connection
              handshake. The server's capabilities data advertises named *targets* for
              potential redirects.
              Each target is described by a protocol name, connection and protocol features,
              etc. The server also advertises target-agnostic redirect settings, such as
              which hash algorithms are supported for content integrity checking. (See
              the documentation for the *capabilities* command for more.)
              Clients examine the set of advertised redirect targets for compatibility.
              When sending a command request, the client advertises the set of redirect
              target names it is willing to follow, along with some other settings influencing
              behavior.
              For example, say the server is advertising a ``cdn`` redirect target that
              requires SNI and TLS 1.2. If the client supports those features, it will
              send command requests stating that the ``cdn`` target is acceptable to use.
              But if the client doesn't support SNI or TLS 1.2 (or maybe it encountered an
              error using this target from a previous request), then it omits this target
              name.
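The ``cdn`` example above can be sketched as a small filtering step on the client. This is illustrative only. It assumes each advertised target is a dict with a ``name`` key plus optional ``snirequired`` and ``tlsversions`` keys; those key names and the function ``acceptable_targets`` are assumptions made for this sketch, and the *capabilities* command documentation defines the real layout.

```python
def acceptable_targets(advertised, client_has_sni, client_tls_versions,
                       failed=()):
    """Return the names of advertised redirect targets this client can use.

    ASSUMPTION: each target is a dict with a ``name`` key plus optional
    ``snirequired`` and ``tlsversions`` keys (illustrative names).
    """
    names = []
    for target in advertised:
        name = target[b"name"]
        if name in failed:
            # Omit targets that produced an error on an earlier request.
            continue
        if target.get(b"snirequired") and not client_has_sni:
            continue
        required_tls = target.get(b"tlsversions")
        if required_tls and not set(required_tls) & set(client_tls_versions):
            # No TLS version in common with the target.
            continue
        names.append(name)
    return names
```

The returned names are what the client would list in its command request as targets it is willing to follow.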
              If the client advertises support for a redirect target, the server MAY
              replace the normal, inline response data with a *redirect response*:
              one where the initial CBOR map has a ``status`` key with value ``redirect``.
              The *redirect response* at a minimum advertises the URL where the response
              can be retrieved.
              The *redirect response* MAY also advertise additional details about that
              content and how to retrieve it. Notably, the response may contain the
              x509 public certificates for the server being redirected to or the
              certificate authority that signed that server's certificate. Unless the
              client has existing settings that offer stronger trust validation than what
              the server advertises, the client SHOULD use the server-provided certificates
              when validating the connection to the remote server in place of any default
              connection verification checks. This is because certificates coming from
              the server SHOULD establish a stronger chain of trust than what the default
              certificate validation mechanism in most environments provides. (By default,
              certificate validation ensures the signer of the cert chains up to a set of
              trusted root certificates. And if an explicit certificate or CA certificate
              is presented, that greatly reduces the set of certificates that will be
              recognized as valid, thus reducing the potential for a "bad" certificate
              to be used and trusted.)
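The exact-match check that ``serverdercerts`` permits could look like the sketch below. It assumes the caller has already obtained the peer's DER-encoded certificate, for example from ``ssl.SSLSocket.getpeercert(binary_form=True)``; the helper name ``peer_cert_matches`` is hypothetical, and validating against ``servercadercerts`` is not covered here.

```python
import hmac


def peer_cert_matches(peer_der_cert, serverdercerts):
    """Return True if the peer's DER certificate equals a pinned one.

    ``serverdercerts`` is the array of DER-encoded certificates taken
    from the ``location`` map of a redirect response.
    """
    return any(hmac.compare_digest(peer_der_cert, pinned)
               for pinned in serverdercerts)
```

An empty ``serverdercerts`` array never matches, so a client using this check fails closed.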

mercurial/wireprotoframing.py

0 +21 -2

              # wireprotoframing.py - unified framing protocol for wire protocol
              #
              # Copyright 2018 Gregory Szorc <gregory.szorc@gmail.com>
              #
              # This software may be used and distributed according to the terms of the
              # GNU General Public License version 2 or any later version.
              # This file contains functionality to support the unified frame-based wire
              # protocol. For details about the protocol, see
              # `hg help internals.wireprotocol`.
              from __future__ import absolute_import
              import collections
              import struct
              from .i18n import _
              from .thirdparty import (
                  attr,
              )
              from . import (
                  encoding,
                  error,
                  pycompat,
                  util,
                  wireprototypes,
              )
              from .utils import (
                  cborutil,
                  stringutil,
              )
              FRAME_HEADER_SIZE = 8
              DEFAULT_MAX_FRAME_SIZE = 32768
              STREAM_FLAG_BEGIN_STREAM = 0x01
              STREAM_FLAG_END_STREAM = 0x02
              STREAM_FLAG_ENCODING_APPLIED = 0x04
              STREAM_FLAGS = {
                  b'stream-begin': STREAM_FLAG_BEGIN_STREAM,
                  b'stream-end': STREAM_FLAG_END_STREAM,
                  b'encoded': STREAM_FLAG_ENCODING_APPLIED,
              }
              FRAME_TYPE_COMMAND_REQUEST = 0x01
              FRAME_TYPE_COMMAND_DATA = 0x02
              FRAME_TYPE_COMMAND_RESPONSE = 0x03
              FRAME_TYPE_ERROR_RESPONSE = 0x05
              FRAME_TYPE_TEXT_OUTPUT = 0x06
              FRAME_TYPE_PROGRESS = 0x07
-             FRAME_TYPE_STREAM_SETTINGS = 0x08
+             FRAME_TYPE_SENDER_PROTOCOL_SETTINGS = 0x08
+             FRAME_TYPE_STREAM_SETTINGS = 0x09
              FRAME_TYPES = {
                  b'command-request': FRAME_TYPE_COMMAND_REQUEST,
                  b'command-data': FRAME_TYPE_COMMAND_DATA,
                  b'command-response': FRAME_TYPE_COMMAND_RESPONSE,
                  b'error-response': FRAME_TYPE_ERROR_RESPONSE,
                  b'text-output': FRAME_TYPE_TEXT_OUTPUT,
                  b'progress': FRAME_TYPE_PROGRESS,
+                 b'sender-protocol-settings': FRAME_TYPE_SENDER_PROTOCOL_SETTINGS,
                  b'stream-settings': FRAME_TYPE_STREAM_SETTINGS,
              }
              FLAG_COMMAND_REQUEST_NEW = 0x01
              FLAG_COMMAND_REQUEST_CONTINUATION = 0x02
              FLAG_COMMAND_REQUEST_MORE_FRAMES = 0x04
              FLAG_COMMAND_REQUEST_EXPECT_DATA = 0x08
              FLAGS_COMMAND_REQUEST = {
                  b'new': FLAG_COMMAND_REQUEST_NEW,
                  b'continuation': FLAG_COMMAND_REQUEST_CONTINUATION,
                  b'more': FLAG_COMMAND_REQUEST_MORE_FRAMES,
                  b'have-data': FLAG_COMMAND_REQUEST_EXPECT_DATA,
              }
              FLAG_COMMAND_DATA_CONTINUATION = 0x01
              FLAG_COMMAND_DATA_EOS = 0x02
              FLAGS_COMMAND_DATA = {
                  b'continuation': FLAG_COMMAND_DATA_CONTINUATION,
                  b'eos': FLAG_COMMAND_DATA_EOS,
              }
              FLAG_COMMAND_RESPONSE_CONTINUATION = 0x01
              FLAG_COMMAND_RESPONSE_EOS = 0x02
              FLAGS_COMMAND_RESPONSE = {
                  b'continuation': FLAG_COMMAND_RESPONSE_CONTINUATION,
                  b'eos': FLAG_COMMAND_RESPONSE_EOS,
              }
+             FLAG_SENDER_PROTOCOL_SETTINGS_CONTINUATION = 0x01
+             FLAG_SENDER_PROTOCOL_SETTINGS_EOS = 0x02
+             FLAGS_SENDER_PROTOCOL_SETTINGS = {
+                 b'continuation': FLAG_SENDER_PROTOCOL_SETTINGS_CONTINUATION,
+                 b'eos': FLAG_SENDER_PROTOCOL_SETTINGS_EOS,
+             }
+             FLAG_STREAM_ENCODING_SETTINGS_CONTINUATION = 0x01
+             FLAG_STREAM_ENCODING_SETTINGS_EOS = 0x02
+             FLAGS_STREAM_ENCODING_SETTINGS = {
+                 b'continuation': FLAG_STREAM_ENCODING_SETTINGS_CONTINUATION,
+                 b'eos': FLAG_STREAM_ENCODING_SETTINGS_EOS,
+             }
              # Maps frame types to their available flags.
              FRAME_TYPE_FLAGS = {
                  FRAME_TYPE_COMMAND_REQUEST: FLAGS_COMMAND_REQUEST,
                  FRAME_TYPE_COMMAND_DATA: FLAGS_COMMAND_DATA,
                  FRAME_TYPE_COMMAND_RESPONSE: FLAGS_COMMAND_RESPONSE,
                  FRAME_TYPE_ERROR_RESPONSE: {},
                  FRAME_TYPE_TEXT_OUTPUT: {},
                  FRAME_TYPE_PROGRESS: {},
-                 FRAME_TYPE_STREAM_SETTINGS: {},
+                 FRAME_TYPE_SENDER_PROTOCOL_SETTINGS: FLAGS_SENDER_PROTOCOL_SETTINGS,
+                 FRAME_TYPE_STREAM_SETTINGS: FLAGS_STREAM_ENCODING_SETTINGS,
              }
              ARGUMENT_RECORD_HEADER = struct.Struct(r'<HH')
              def humanflags(mapping, value):
                  """Convert a numeric flags value to a human value, using a mapping table."""
                  namemap = {v: k for k, v in mapping.iteritems()}
                  flags = []
                  val = 1
                  while value >= val:
                      if value & val:
                          flags.append(namemap.get(val, '<unknown 0x%02x>' % val))
                      val <<= 1
                  return b'|'.join(flags)
              @attr.s(slots=True)
              class frameheader(object):
                  """Represents the data in a frame header."""
                  length = attr.ib()
                  requestid = attr.ib()
                  streamid = attr.ib()
                  streamflags = attr.ib()
                  typeid = attr.ib()
                  flags = attr.ib()
              @attr.s(slots=True, repr=False)
              class frame(object):
                  """Represents a parsed frame."""
                  requestid = attr.ib()
                  streamid = attr.ib()
                  streamflags = attr.ib()
                  typeid = attr.ib()
                  flags = attr.ib()
                  payload = attr.ib()
                  @encoding.strmethod
                  def __repr__(self):
                      typename = '<unknown 0x%02x>' % self.typeid
                      for name, value in FRAME_TYPES.iteritems():
                          if value == self.typeid:
                              typename = name
                              break
                      return ('frame(size=%d; request=%d; stream=%d; streamflags=%s; '
                              'type=%s; flags=%s)' % (
                          len(self.payload), self.requestid, self.streamid,
                          humanflags(STREAM_FLAGS, self.streamflags), typename,
                          humanflags(FRAME_TYPE_FLAGS.get(self.typeid, {}), self.flags)))
              def makeframe(requestid, streamid, streamflags, typeid, flags, payload):
                  """Assemble a frame into a byte array."""
                  # TODO assert size of payload.
                  frame = bytearray(FRAME_HEADER_SIZE + len(payload))
                  # 24 bits length
                  # 16 bits request id
                  # 8 bits stream id
                  # 8 bits stream flags
                  # 4 bits type
                  # 4 bits flags
                  l = struct.pack(r'<I', len(payload))
                  frame[0:3] = l[0:3]
                  struct.pack_into(r'<HBB', frame, 3, requestid, streamid, streamflags)
                  frame[7] = (typeid << 4) | flags
                  frame[8:] = payload
                  return frame
              def makeframefromhumanstring(s):
                  """Create a frame from a human readable string
                  Strings have the form:
                      <request-id> <stream-id> <stream-flags> <type> <flags> <payload>
                  This can be used by user-facing applications and tests for creating
                  frames easily without having to type out a bunch of constants.
                  Request ID and stream IDs are integers.
                  Stream flags, frame type, and flags can be specified by integer or
                  named constant.
                  Flags can be delimited by `|` to bitwise OR them together.
                  If the payload begins with ``cbor:``, the following string will be
                  evaluated as Python literal and the resulting object will be fed into
                  a CBOR encoder. Otherwise, the payload is interpreted as a Python
                  byte string literal.
                  """
                  fields = s.split(b' ', 5)
                  requestid, streamid, streamflags, frametype, frameflags, payload = fields
                  requestid = int(requestid)
                  streamid = int(streamid)
                  finalstreamflags = 0
                  for flag in streamflags.split(b'|'):
                      if flag in STREAM_FLAGS:
                          finalstreamflags |= STREAM_FLAGS[flag]
                      else:
                          finalstreamflags |= int(flag)
                  if frametype in FRAME_TYPES:
                      frametype = FRAME_TYPES[frametype]
                  else:
                      frametype = int(frametype)
                  finalflags = 0
                  validflags = FRAME_TYPE_FLAGS[frametype]
                  for flag in frameflags.split(b'|'):
                      if flag in validflags:
                          finalflags |= validflags[flag]
                      else:
                          finalflags |= int(flag)
                  if payload.startswith(b'cbor:'):
                      payload = b''.join(cborutil.streamencode(
                          stringutil.evalpythonliteral(payload[5:])))
                  else:
                      payload = stringutil.unescapestr(payload)
                  return makeframe(requestid=requestid, streamid=streamid,
                                   streamflags=finalstreamflags, typeid=frametype,
                                   flags=finalflags, payload=payload)
              def parseheader(data):
                  """Parse a unified framing protocol frame header from a buffer.
                  The header is expected to be in the buffer at offset 0 and the
                  buffer is expected to be large enough to hold a full header.
                  """
                  # 24 bits payload length (little endian)
                  # 16 bits request ID
                  # 8 bits stream ID
                  # 8 bits stream flags
                  # 4 bits frame type
                  # 4 bits frame flags
                  # ... payload
                  framelength = data[0] + 256 * data[1] + 65536 * data[2]
                  requestid, streamid, streamflags = struct.unpack_from(r'<HBB', data, 3)
                  typeflags = data[7]
                  frametype = (typeflags & 0xf0) >> 4
                  frameflags = typeflags & 0x0f
                  return frameheader(framelength, requestid, streamid, streamflags,
                                     frametype, frameflags)
              def readframe(fh):
                  """Read a unified framing protocol frame from a file object.
                  Returns a ``frame`` instance for the decoded frame or None if no
                  frame is available. May raise if a malformed frame is seen.
                  """
                  header = bytearray(FRAME_HEADER_SIZE)
                  readcount = fh.readinto(header)
                  if readcount == 0:
                      return None
                  if readcount != FRAME_HEADER_SIZE:
                      raise error.Abort(_('received incomplete frame: got %d bytes: %s') %
                                        (readcount, header))
                  h = parseheader(header)
                  payload = fh.read(h.length)
                  if len(payload) != h.length:
                      raise error.Abort(_('frame length error: expected %d; got %d') %
                                        (h.length, len(payload)))
                  return frame(h.requestid, h.streamid, h.streamflags, h.typeid, h.flags,
                               payload)
              def createcommandframes(stream, requestid, cmd, args, datafh=None,
                                      maxframesize=DEFAULT_MAX_FRAME_SIZE,
                                      redirect=None):
                  """Create frames necessary to transmit a request to run a command.
                  This is a generator of bytearrays. Each item represents a frame
                  ready to be sent over the wire to a peer.
                  """
                  data = {b'name': cmd}
                  if args:
                      data[b'args'] = args
                  if redirect:
                      data[b'redirect'] = redirect
                  data = b''.join(cborutil.streamencode(data))
                  offset = 0
                  while True:
                      flags = 0
                      # Must set new or continuation flag.
                      if not offset:
                          flags |= FLAG_COMMAND_REQUEST_NEW
                      else:
                          flags |= FLAG_COMMAND_REQUEST_CONTINUATION
                      # The expect-data flag is set on every frame when data follows.
                      if datafh:
                          flags |= FLAG_COMMAND_REQUEST_EXPECT_DATA
                      payload = data[offset:offset + maxframesize]
                      offset += len(payload)
                      if len(payload) == maxframesize and offset < len(data):
                          flags |= FLAG_COMMAND_REQUEST_MORE_FRAMES
                      yield stream.makeframe(requestid=requestid,
                                             typeid=FRAME_TYPE_COMMAND_REQUEST,
                                             flags=flags,
                                             payload=payload)
                      if not (flags & FLAG_COMMAND_REQUEST_MORE_FRAMES):
                          break
                  if datafh:
                      while True:
                          data = datafh.read(DEFAULT_MAX_FRAME_SIZE)
                          done = False
                          if len(data) == DEFAULT_MAX_FRAME_SIZE:
                              flags = FLAG_COMMAND_DATA_CONTINUATION
                          else:
                              flags = FLAG_COMMAND_DATA_EOS
                              assert datafh.read(1) == b''
                              done = True
                          yield stream.makeframe(requestid=requestid,
                                                 typeid=FRAME_TYPE_COMMAND_DATA,
                                                 flags=flags,
                                                 payload=data)
                          if done:
                              break
              def createcommandresponseframesfrombytes(stream, requestid, data,
                                                       maxframesize=DEFAULT_MAX_FRAME_SIZE):
                  """Create a raw frame to send a bytes response from static bytes input.
                  Returns a generator of bytearrays.
                  """
                  # Automatically send the overall CBOR response map.
                  overall = b''.join(cborutil.streamencode({b'status': b'ok'}))
                  if len(overall) > maxframesize:
                      raise error.ProgrammingError('not yet implemented')
                  # Simple case where we can fit the full response in a single frame.
                  if len(overall) + len(data) <= maxframesize:
                      flags = FLAG_COMMAND_RESPONSE_EOS
                      yield stream.makeframe(requestid=requestid,
                                             typeid=FRAME_TYPE_COMMAND_RESPONSE,
                                             flags=flags,
                                             payload=overall + data)
                      return
                  # It's easier to send the overall CBOR map in its own frame than to track
                  # offsets.
                  yield stream.makeframe(requestid=requestid,
                                         typeid=FRAME_TYPE_COMMAND_RESPONSE,
                                         flags=FLAG_COMMAND_RESPONSE_CONTINUATION,
                                         payload=overall)
                  offset = 0
                  while True:
                      chunk = data[offset:offset + maxframesize]
                      offset += len(chunk)
                      done = offset == len(data)
                      if done:
                          flags = FLAG_COMMAND_RESPONSE_EOS
                      else:
                          flags = FLAG_COMMAND_RESPONSE_CONTINUATION
                      yield stream.makeframe(requestid=requestid,
                                             typeid=FRAME_TYPE_COMMAND_RESPONSE,
                                             flags=flags,
                                             payload=chunk)
                      if done:
                          break
              def createbytesresponseframesfromgen(stream, requestid, gen,
                                                   maxframesize=DEFAULT_MAX_FRAME_SIZE):
                  """Generator of frames from a generator of byte chunks.
                  This assumes that another frame will follow whatever this emits. i.e.
                  this always emits the continuation flag and never emits the end-of-stream
                  flag.
                  """
                  cb = util.chunkbuffer(gen)
                  flags = FLAG_COMMAND_RESPONSE_CONTINUATION
                  while True:
                      chunk = cb.read(maxframesize)
                      if not chunk:
                          break
                      yield stream.makeframe(requestid=requestid,
                                             typeid=FRAME_TYPE_COMMAND_RESPONSE,
                                             flags=flags,
                                             payload=chunk)
                      flags |= FLAG_COMMAND_RESPONSE_CONTINUATION
              def createcommandresponseokframe(stream, requestid):
                  overall = b''.join(cborutil.streamencode({b'status': b'ok'}))
                  return stream.makeframe(requestid=requestid,
                                          typeid=FRAME_TYPE_COMMAND_RESPONSE,
                                          flags=FLAG_COMMAND_RESPONSE_CONTINUATION,
                                          payload=overall)
              def createcommandresponseeosframe(stream, requestid):
                  """Create an empty payload frame representing command end-of-stream."""
                  return stream.makeframe(requestid=requestid,
                                          typeid=FRAME_TYPE_COMMAND_RESPONSE,
                                          flags=FLAG_COMMAND_RESPONSE_EOS,
                                          payload=b'')
              def createalternatelocationresponseframe(stream, requestid, location):
                  data = {
                      b'status': b'redirect',
                      b'location': {
                          b'url': location.url,
                          b'mediatype': location.mediatype,
                      }
                  }
                  for a in (r'size', r'fullhashes', r'fullhashseed', r'serverdercerts',
                            r'servercadercerts'):
                      value = getattr(location, a)
                      if value is not None:
                          data[b'location'][pycompat.bytestr(a)] = value
                  return stream.makeframe(requestid=requestid,
                                          typeid=FRAME_TYPE_COMMAND_RESPONSE,
                                          flags=FLAG_COMMAND_RESPONSE_CONTINUATION,
                                          payload=b''.join(cborutil.streamencode(data)))
              def createcommanderrorresponse(stream, requestid, message, args=None):
                  # TODO should this be using a list of {'msg': ..., 'args': {}} so atom
                  # formatting works consistently?
                  m = {
                      b'status': b'error',
                      b'error': {
                          b'message': message,
                      }
                  }
                  if args:
                      m[b'error'][b'args'] = args
                  overall = b''.join(cborutil.streamencode(m))
                  yield stream.makeframe(requestid=requestid,
                                         typeid=FRAME_TYPE_COMMAND_RESPONSE,
                                         flags=FLAG_COMMAND_RESPONSE_EOS,
                                         payload=overall)
              def createerrorframe(stream, requestid, msg, errtype):
                  # TODO properly handle frame size limits.
                  assert len(msg) <= DEFAULT_MAX_FRAME_SIZE
                  payload = b''.join(cborutil.streamencode({
                      b'type': errtype,
                      b'message': [{b'msg': msg}],
                  }))
                  yield stream.makeframe(requestid=requestid,
                                         typeid=FRAME_TYPE_ERROR_RESPONSE,
                                         flags=0,
                                         payload=payload)
              def createtextoutputframe(stream, requestid, atoms,
                                        maxframesize=DEFAULT_MAX_FRAME_SIZE):
                  """Create a text output frame to render text to people.
                  ``atoms`` is an iterable of 3-tuples of (formatting string, args, labels).
                  The formatting string contains ``%s`` tokens to be replaced by the
                  corresponding indexed entry in ``args``. ``labels`` is an iterable of
                  formatters to be applied at rendering time. In terms of the ``ui``
                  class, each atom corresponds to a ``ui.write()``.
                  """
                  atomdicts = []
                  for (formatting, args, labels) in atoms:
                      # TODO look for localstr, other types here?
                      if not isinstance(formatting, bytes):
                          raise ValueError('must use bytes formatting strings')
                      for arg in args:
                          if not isinstance(arg, bytes):
                              raise ValueError('must use bytes for arguments')
                      for label in labels:
                          if not isinstance(label, bytes):
                              raise ValueError('must use bytes for labels')
                      # Formatting string must be ASCII.
                      formatting = formatting.decode(r'ascii', r'replace').encode(r'ascii')
                      # Arguments must be UTF-8.
                      args = [a.decode(r'utf-8', r'replace').encode(r'utf-8') for a in args]
                      # Labels must be ASCII.
                      labels = [l.decode(r'ascii', r'strict').encode(r'ascii')
                                for l in labels]
                      atom = {b'msg': formatting}
                      if args:
                          atom[b'args'] = args
                      if labels:
                          atom[b'labels'] = labels
                      atomdicts.append(atom)
                  payload = b''.join(cborutil.streamencode(atomdicts))
                  if len(payload) > maxframesize:
                      raise ValueError('cannot encode data in a single frame')
                  yield stream.makeframe(requestid=requestid,
                                         typeid=FRAME_TYPE_TEXT_OUTPUT,
                                         flags=0,
                                         payload=payload)
              class bufferingcommandresponseemitter(object):
                  """Helper object to emit command response frames intelligently.
                  Raw command response data is likely emitted in chunks much smaller
                  than what can fit in a single frame. This class exists to buffer
                  chunks until enough data is available to fit in a single frame.
                  TODO we'll need something like this when compression is supported.
                  So it might make sense to implement this functionality at the stream
                  level.
                  """
                  def __init__(self, stream, requestid, maxframesize=DEFAULT_MAX_FRAME_SIZE):
                      self._stream = stream
                      self._requestid = requestid
                      self._maxsize = maxframesize
                      self._chunks = []
                      self._chunkssize = 0
                  def send(self, data):
                      """Send new data for emission.
                      Is a generator of new frames that were derived from the new input.
                      If the special input ``None`` is received, flushes all buffered
                      data to frames.
                      """
                      if data is None:
                          for frame in self._flush():
                              yield frame
                          return
                      # There is a ton of potential to do more complicated things here.
                      # Our immediate goal is to coalesce small chunks into big frames,
                      # not achieve the fewest number of frames possible. So we go with
                      # a simple implementation:
                      #
                      # * If a chunk is too large for a frame, we flush and emit frames
                      #   for the new chunk.
                      # * If a chunk can be buffered without total buffered size limits
                      #   being exceeded, we do that.
                      # * If a chunk causes us to go over our buffering limit, we flush
                      #   and then buffer the new chunk.
                      if len(data) > self._maxsize:
                          for frame in self._flush():
                              yield frame
                          # Now emit frames for the big chunk.
                          offset = 0
                          while True:
                              chunk = data[offset:offset + self._maxsize]
                              offset += len(chunk)
                              yield self._stream.makeframe(
                                  self._requestid,
                                  typeid=FRAME_TYPE_COMMAND_RESPONSE,
                                  flags=FLAG_COMMAND_RESPONSE_CONTINUATION,
                                  payload=chunk)
                              if offset == len(data):
                                  return
                      # If we don't have enough to constitute a full frame, buffer and
                      # return.
                      if len(data) + self._chunkssize < self._maxsize:
                          self._chunks.append(data)
                          self._chunkssize += len(data)
                          return
                      # Else flush what we have and buffer the new chunk. We could do
                      # something more intelligent here, like break the chunk. Let's
                      # keep things simple for now.
                      for frame in self._flush():
                          yield frame
                      self._chunks.append(data)
                      self._chunkssize = len(data)
                  def _flush(self):
                      payload = b''.join(self._chunks)
                      assert len(payload) <= self._maxsize
                      self._chunks[:] = []
                      self._chunkssize = 0
                      yield self._stream.makeframe(
                          self._requestid,
                          typeid=FRAME_TYPE_COMMAND_RESPONSE,
                          flags=FLAG_COMMAND_RESPONSE_CONTINUATION,
                          payload=payload)
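The buffering policy of ``bufferingcommandresponseemitter`` can be illustrated in isolation. The following is a minimal sketch under stated assumptions, not the Mercurial implementation: ``CoalescingBuffer`` and ``coalesce`` are hypothetical names, the sketch yields raw payload bytes instead of frames, and it uses ``yield from`` (Python 3) where the code above spells out the loops.

```python
class CoalescingBuffer:
    """Hypothetical stand-in for the emitter above; yields payloads, not frames."""

    def __init__(self, maxsize):
        self._maxsize = maxsize
        self._chunks = []
        self._size = 0

    def send(self, data):
        # ``None`` means "flush whatever is buffered".
        if data is None:
            yield from self._flush()
            return

        # A chunk larger than one frame: flush, then slice it into pieces.
        if len(data) > self._maxsize:
            yield from self._flush()
            for offset in range(0, len(data), self._maxsize):
                yield data[offset:offset + self._maxsize]
            return

        # The chunk fits alongside what is already buffered: keep buffering.
        if len(data) + self._size < self._maxsize:
            self._chunks.append(data)
            self._size += len(data)
            return

        # Otherwise flush the buffer and start a new one with this chunk.
        yield from self._flush()
        self._chunks.append(data)
        self._size = len(data)

    def _flush(self):
        # Mirrors the code above: a payload is yielded even if it is empty.
        payload = b''.join(self._chunks)
        self._chunks.clear()
        self._size = 0
        yield payload


def coalesce(chunks, maxsize):
    """Feed all chunks through the buffer and collect the emitted payloads."""
    buf = CoalescingBuffer(maxsize)
    out = []
    for chunk in chunks:
        out.extend(buf.send(chunk))
    out.extend(buf.send(None))
    return out
```

With a maximum size of 4, ``coalesce([b'a', b'b', b'c'], 4)`` merges the three small chunks into one payload, while a 6-byte chunk is split into a 4-byte and a 2-byte payload.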
              class stream(object):
                  """Represents a logical unidirectional series of frames."""
                  def __init__(self, streamid, active=False):
                      self.streamid = streamid
                      self._active = active
                  def makeframe(self, requestid, typeid, flags, payload):
                      """Create a frame to be sent out over this stream.
                      Only returns the frame instance. Does not actually send it.
                      """
                      streamflags = 0
                      if not self._active:
                          streamflags |= STREAM_FLAG_BEGIN_STREAM
                          self._active = True
                      return makeframe(requestid, self.streamid, streamflags, typeid, flags,
                                       payload)
              def ensureserverstream(stream):
                  """Raise ``ProgrammingError`` unless ``stream`` has an even stream ID."""
                  if stream.streamid % 2:
                      raise error.ProgrammingError('server should only write to even '
                                                   'numbered streams; %d is not even' %
                                                   stream.streamid)
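The reactors in this file split stream IDs by parity: the server writes only to even-numbered streams (allocated from 2 in steps of 2) and rejects incoming frames on even streams, while the client uses odd IDs. A small sketch of that convention follows; ``stream_ids`` and ``is_server_stream`` are hypothetical helpers, not part of Mercurial.

```python
import itertools


def stream_ids(server):
    """Return an iterator of stream IDs for one side of the connection.

    Assumption for illustration: servers count 2, 4, 6, ... and clients
    count 1, 3, 5, ..., matching the parity checks in this file.
    """
    return itertools.count(2 if server else 1, 2)


def is_server_stream(streamid):
    """Return True if ``streamid`` belongs to the server (even) side."""
    return streamid % 2 == 0
```

Because the two ranges never overlap, either peer can open a new stream without coordinating with the other.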
              class serverreactor(object):
                  """Holds state of a server handling frame-based protocol requests.
                  This class is the "brain" of the unified frame-based protocol server
                  component. While the protocol is stateless from the perspective of
                  requests/commands, something needs to track which frames have been
                  received, what frames to expect, etc. This class is that thing.
                  Instances are modeled as a state machine of sorts. Instances are also
                  reactive to external events. The point of this class is to encapsulate
                  the state of the connection and the exchange of frames, not to perform
                  work. Instead, callers tell this class when something occurs, like a
                  frame arriving. If that activity is worthy of a follow-up action (say
                  *run a command*), the return value of that handler will say so.
                  I/O and CPU intensive operations are purposefully delegated outside of
                  this class.
                  Consumers are expected to tell instances when events occur. They do so by
                  calling the various ``on*`` methods. These methods return a 2-tuple
                  describing any follow-up action(s) to take. The first element is the
                  name of an action to perform. The second is a data structure (usually
                  a dict) specific to that action that contains more information. For example,
                  if the server wants to send frames back to the client, the data structure
                  will contain a reference to those frames.
                  Valid actions that consumers can be instructed to take are:
                  sendframes
                     Indicates that frames should be sent to the client. The ``framegen``
                     key contains a generator of frames that should be sent. The server
                     assumes that all frames are sent to the client.
                  error
                     Indicates that an error occurred. Consumer should probably abort.
                  runcommand
                     Indicates that the consumer should run a wire protocol command. Details
                     of the command to run are given in the data structure.
                  wantframe
                     Indicates that nothing of interest happened and the server is waiting on
                     more frames from the client before anything interesting can be done.
                  noop
                     Indicates no additional action is required.
                  Known Issues
                  ------------
                  There are no limits to the number of partially received commands or their
                  size. A malicious client could stream command request data and exhaust the
                  server's memory.
                  Partially received commands are not acted upon when end of input is
                  reached. Should the server error if it receives a partial request?
                  Should the client send a message to abort a partially transmitted request
                  to facilitate graceful shutdown?
                  Active requests that haven't been responded to aren't tracked. This means
                  that if we receive a command and instruct its dispatch, another command
                  with its request ID can come in over the wire and there will be a race
                  between who responds to what.
                  """
                  def __init__(self, deferoutput=False):
                      """Construct a new server reactor.
                      ``deferoutput`` indicates that the consumer should not be instructed to
                      send any output frames until input has been exhausted. In this mode,
                      events that would normally generate output frames (such as a command
                      response being ready) will instead defer instructing the consumer to
                      send those frames. This is useful for half-duplex transports where the
                      sender cannot receive until all data has been transmitted.
                      """
                      self._deferoutput = deferoutput
                      self._state = 'idle'
                      self._nextoutgoingstreamid = 2
                      self._bufferedframegens = []
                      # stream id -> stream instance for all active streams from the client.
                      self._incomingstreams = {}
                      self._outgoingstreams = {}
                      # request id -> dict of commands that are actively being received.
                      self._receivingcommands = {}
                      # Request IDs that have been received and are actively being processed.
                      # Once all output for a request has been sent, it is removed from this
                      # set.
                      self._activecommands = set()
                  def onframerecv(self, frame):
                      """Process a frame that has been received off the wire.
                      Returns a 2-tuple of (action, data) describing what action, if any,
                      the consumer should take next.
                      """
                      if not frame.streamid % 2:
                          self._state = 'errored'
                          return self._makeerrorresult(
                              _('received frame with even numbered stream ID: %d') %
                                frame.streamid)
                      if frame.streamid not in self._incomingstreams:
                          if not frame.streamflags & STREAM_FLAG_BEGIN_STREAM:
                              self._state = 'errored'
                              return self._makeerrorresult(
                                  _('received frame on unknown inactive stream without '
                                    'beginning of stream flag set'))
                          self._incomingstreams[frame.streamid] = stream(frame.streamid)
                      if frame.streamflags & STREAM_FLAG_ENCODING_APPLIED:
                          # TODO handle decoding frames
                          self._state = 'errored'
                          raise error.ProgrammingError('support for decoding stream payloads '
                                                       'not yet implemented')
                      if frame.streamflags & STREAM_FLAG_END_STREAM:
                          del self._incomingstreams[frame.streamid]
                      handlers = {
                          'idle': self._onframeidle,
                          'command-receiving': self._onframecommandreceiving,
                          'errored': self._onframeerrored,
                      }
                      meth = handlers.get(self._state)
                      if not meth:
                          raise error.ProgrammingError('unhandled state: %s' % self._state)
                      return meth(frame)
                  def oncommandresponseready(self, stream, requestid, data):
                      """Signal that a bytes response is ready to be sent to the client.
                      The raw bytes response is passed as an argument.
                      """
                      ensureserverstream(stream)
                      def sendframes():
                          for frame in createcommandresponseframesfrombytes(stream, requestid,
                                                                            data):
                              yield frame
                          self._activecommands.remove(requestid)
                      return self._handlesendframes(sendframes())
                  def oncommandresponsereadyobjects(self, stream, requestid, objs):
                      """Signal that objects are ready to be sent to the client.
                      ``objs`` is an iterable of objects (typically a generator) that will
                      be encoded via CBOR and added to frames, which will be sent to the
                      client.
                      """
                      ensureserverstream(stream)
                      # We need to take care over exception handling. Uncaught exceptions
                      # when generating frames could lead to premature end of the frame
                      # stream and the possibility of the server or client process getting
                      # in a bad state.
                      #
                      # Keep in mind that if ``objs`` is a generator, advancing it could
                      # raise exceptions that originated in e.g. wire protocol command
                      # functions. That is why we differentiate between exceptions raised
                      # when iterating versus other exceptions that occur.
                      #
                      # In all cases, when the function finishes, the request is fully
                      # handled and no new frames for it should be seen.
                      def sendframes():
                          emitted = False
                          alternatelocationsent = False
                          emitter = bufferingcommandresponseemitter(stream, requestid)
                          while True:
                              try:
                                  o = next(objs)
                              except StopIteration:
                                  for frame in emitter.send(None):
                                      yield frame
                                  if emitted:
                                      yield createcommandresponseeosframe(stream, requestid)
                                  break
                              except error.WireprotoCommandError as e:
                                  for frame in createcommanderrorresponse(
                                      stream, requestid, e.message, e.messageargs):
                                      yield frame
                                  break
                              except Exception as e:
                                  for frame in createerrorframe(
                                      stream, requestid, '%s' % stringutil.forcebytestr(e),
                                      errtype='server'):
                                      yield frame
                                  break
                              try:
                                  # Alternate location responses can only be the first and
                                  # only object in the output stream.
                                  if isinstance(o, wireprototypes.alternatelocationresponse):
                                      if emitted:
                                          raise error.ProgrammingError(
                                              'alternatelocationresponse seen after initial '
                                              'output object')
                                      yield createalternatelocationresponseframe(
                                          stream, requestid, o)
                                      alternatelocationsent = True
                                      emitted = True
                                      continue
                                  if alternatelocationsent:
                                      raise error.ProgrammingError(
                                          'object follows alternatelocationresponse')
                                  if not emitted:
                                      yield createcommandresponseokframe(stream, requestid)
                                      emitted = True
                                  # Objects emitted by command functions can be serializable
                                  # data structures or special types.
                                  # TODO consider extracting the content normalization to a
                                  # standalone function, as it may be useful for e.g. cachers.
                                  # A pre-encoded object is sent directly to the emitter.
                                  if isinstance(o, wireprototypes.encodedresponse):
                                      for frame in emitter.send(o.data):
                                          yield frame
                                  # A regular object is CBOR encoded.
                                  else:
                                      for chunk in cborutil.streamencode(o):
                                          for frame in emitter.send(chunk):
                                              yield frame
                              except Exception as e:
                                  for frame in createerrorframe(
                                      stream, requestid, '%s' % stringutil.forcebytestr(e),
                                      errtype='server'):
                                      yield frame
                                  break
                          self._activecommands.remove(requestid)
                      return self._handlesendframes(sendframes())
                  def oninputeof(self):
                      """Signals that end of input has been received.
                      No more frames will be received. All pending activity should be
                      completed.
                      """
                      # TODO should we do anything about in-flight commands?
                      if not self._deferoutput or not self._bufferedframegens:
                          return 'noop', {}
                      # If we buffered all our responses, emit those.
                      def makegen():
                          for gen in self._bufferedframegens:
                              for frame in gen:
                                  yield frame
                      return 'sendframes', {
                          'framegen': makegen(),
                      }
                  def _handlesendframes(self, framegen):
                      if self._deferoutput:
                          self._bufferedframegens.append(framegen)
                          return 'noop', {}
                      else:
                          return 'sendframes', {
                              'framegen': framegen,
                          }
                  def onservererror(self, stream, requestid, msg):
                      """Signal that a server error occurred while handling a request."""
                      ensureserverstream(stream)
                      def sendframes():
                          for frame in createerrorframe(stream, requestid, msg,
                                                        errtype='server'):
                              yield frame
                          self._activecommands.remove(requestid)
                      return self._handlesendframes(sendframes())
                  def oncommanderror(self, stream, requestid, message, args=None):
                      """Called when a command encountered an error before sending output."""
                      ensureserverstream(stream)
                      def sendframes():
                          for frame in createcommanderrorresponse(stream, requestid, message,
                                                                  args):
                              yield frame
                          self._activecommands.remove(requestid)
                      return self._handlesendframes(sendframes())
                  def makeoutputstream(self):
                      """Create a stream to be used for sending data to the client."""
                      streamid = self._nextoutgoingstreamid
                      self._nextoutgoingstreamid += 2
                      s = stream(streamid)
                      self._outgoingstreams[streamid] = s
                      return s
                  def _makeerrorresult(self, msg):
                      return 'error', {
                          'message': msg,
                      }
                  def _makeruncommandresult(self, requestid):
                      entry = self._receivingcommands[requestid]
                      if not entry['requestdone']:
                          self._state = 'errored'
                          raise error.ProgrammingError('should not be called without '
                                                       'requestdone set')
                      del self._receivingcommands[requestid]
                      if self._receivingcommands:
                          self._state = 'command-receiving'
                      else:
                          self._state = 'idle'
                      # Decode the payloads as CBOR.
                      entry['payload'].seek(0)
                      request = cborutil.decodeall(entry['payload'].getvalue())[0]
                      if b'name' not in request:
                          self._state = 'errored'
                          return self._makeerrorresult(
                              _('command request missing "name" field'))
                      if b'args' not in request:
                          request[b'args'] = {}
                      assert requestid not in self._activecommands
                      self._activecommands.add(requestid)
                      return 'runcommand', {
                          'requestid': requestid,
                          'command': request[b'name'],
                          'args': request[b'args'],
                          'redirect': request.get(b'redirect'),
                          'data': entry['data'].getvalue() if entry['data'] else None,
                      }
                  def _makewantframeresult(self):
                      return 'wantframe', {
                          'state': self._state,
                      }
                  def _validatecommandrequestframe(self, frame):
                      new = frame.flags & FLAG_COMMAND_REQUEST_NEW
                      continuation = frame.flags & FLAG_COMMAND_REQUEST_CONTINUATION
                      if new and continuation:
                          self._state = 'errored'
                          return self._makeerrorresult(
                              _('received command request frame with both new and '
                                'continuation flags set'))
                      if not new and not continuation:
                          self._state = 'errored'
                          return self._makeerrorresult(
                              _('received command request frame with neither new nor '
                                'continuation flags set'))
                  def _onframeidle(self, frame):
                      # The only frame type that should be received in this state is a
                      # command request.
                      if frame.typeid != FRAME_TYPE_COMMAND_REQUEST:
                          self._state = 'errored'
                          return self._makeerrorresult(
                              _('expected command request frame; got %d') % frame.typeid)
                      res = self._validatecommandrequestframe(frame)
                      if res:
                          return res
                      if frame.requestid in self._receivingcommands:
                          self._state = 'errored'
                          return self._makeerrorresult(
                              _('request with ID %d already received') % frame.requestid)
                      if frame.requestid in self._activecommands:
                          self._state = 'errored'
                          return self._makeerrorresult(
                              _('request with ID %d is already active') % frame.requestid)
                      new = frame.flags & FLAG_COMMAND_REQUEST_NEW
                      moreframes = frame.flags & FLAG_COMMAND_REQUEST_MORE_FRAMES
                      expectingdata = frame.flags & FLAG_COMMAND_REQUEST_EXPECT_DATA
                      if not new:
                          self._state = 'errored'
                          return self._makeerrorresult(
                              _('received command request frame without new flag set'))
                      payload = util.bytesio()
                      payload.write(frame.payload)
                      self._receivingcommands[frame.requestid] = {
                          'payload': payload,
                          'data': None,
                          'requestdone': not moreframes,
                          'expectingdata': bool(expectingdata),
                      }
                      # This is the final frame for this request. Dispatch it.
                      if not moreframes and not expectingdata:
                          return self._makeruncommandresult(frame.requestid)
                      assert moreframes or expectingdata
                      self._state = 'command-receiving'
                      return self._makewantframeresult()
                  def _onframecommandreceiving(self, frame):
                      if frame.typeid == FRAME_TYPE_COMMAND_REQUEST:
                          # Process new command requests as such.
                          if frame.flags & FLAG_COMMAND_REQUEST_NEW:
                              return self._onframeidle(frame)
                          res = self._validatecommandrequestframe(frame)
                          if res:
                              return res
                      # All other frames should be related to a command that is currently
                      # receiving but is not active.
                      if frame.requestid in self._activecommands:
                          self._state = 'errored'
                          return self._makeerrorresult(
                              _('received frame for request that is still active: %d') %
                              frame.requestid)
                      if frame.requestid not in self._receivingcommands:
                          self._state = 'errored'
                          return self._makeerrorresult(
                              _('received frame for request that is not receiving: %d') %
                                frame.requestid)
                      entry = self._receivingcommands[frame.requestid]
                      if frame.typeid == FRAME_TYPE_COMMAND_REQUEST:
                          moreframes = frame.flags & FLAG_COMMAND_REQUEST_MORE_FRAMES
                          expectingdata = bool(frame.flags & FLAG_COMMAND_REQUEST_EXPECT_DATA)
                          if entry['requestdone']:
                              self._state = 'errored'
                              return self._makeerrorresult(
                                  _('received command request frame when request frames '
                                    'were supposedly done'))
                          if expectingdata != entry['expectingdata']:
                              self._state = 'errored'
                              return self._makeerrorresult(
                                  _('mismatch between expect data flag and previous frame'))
                          entry['payload'].write(frame.payload)
                          if not moreframes:
                              entry['requestdone'] = True
                          if not moreframes and not expectingdata:
                              return self._makeruncommandresult(frame.requestid)
                          return self._makewantframeresult()
                      elif frame.typeid == FRAME_TYPE_COMMAND_DATA:
                          if not entry['expectingdata']:
                              self._state = 'errored'
                              return self._makeerrorresult(_(
                                  'received command data frame for request that is not '
                                  'expecting data: %d') % frame.requestid)
                          if entry['data'] is None:
                              entry['data'] = util.bytesio()
                          return self._handlecommanddataframe(frame, entry)
                      else:
                          self._state = 'errored'
                          return self._makeerrorresult(_(
                              'received unexpected frame type: %d') % frame.typeid)
                  def _handlecommanddataframe(self, frame, entry):
                      assert frame.typeid == FRAME_TYPE_COMMAND_DATA
                      # TODO support streaming data instead of buffering it.
                      entry['data'].write(frame.payload)
                      if frame.flags & FLAG_COMMAND_DATA_CONTINUATION:
                          return self._makewantframeresult()
                      elif frame.flags & FLAG_COMMAND_DATA_EOS:
                          entry['data'].seek(0)
                          return self._makeruncommandresult(frame.requestid)
                      else:
                          self._state = 'errored'
                          return self._makeerrorresult(_('command data frame without '
                                                         'flags'))
                  def _onframeerrored(self, frame):
                      return self._makeerrorresult(_('server already errored'))
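The ``on*`` methods of ``serverreactor`` return an (action, data) 2-tuple that the consumer is expected to act on. One way a consumer might dispatch on those results is sketched below. ``handle_result``, ``send`` and ``run`` are hypothetical names chosen for illustration; the action names and dict keys are the ones used in the class above, and the ``redirect`` and ``data`` keys of ``runcommand`` are ignored for brevity.

```python
def handle_result(result, send, run):
    """Act on an (action, data) tuple returned by a reactor method.

    ``send`` is a callable that writes one frame; ``run`` is a callable
    that executes a command. Both are assumptions of this sketch.
    """
    action, meta = result

    if action == 'sendframes':
        # The reactor assumes every frame in the generator gets sent.
        for frame in meta['framegen']:
            send(frame)
    elif action == 'runcommand':
        run(meta['requestid'], meta['command'], meta['args'])
    elif action in ('wantframe', 'noop'):
        # Nothing to do until more frames arrive.
        pass
    elif action == 'error':
        raise RuntimeError(meta['message'])
    else:
        raise ValueError('unhandled action: %s' % action)
```

Keeping the dispatch outside the reactor matches the design described in the class docstring: the reactor tracks protocol state, and the consumer performs the I/O and runs the commands.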
              class commandrequest(object):
                  """Represents a request to run a command."""
                  def __init__(self, requestid, name, args, datafh=None, redirect=None):
                      self.requestid = requestid
                      self.name = name
                      self.args = args
                      self.datafh = datafh
                      self.redirect = redirect
                      self.state = 'pending'
              class clientreactor(object):
                  """Holds state of a client issuing frame-based protocol requests.
                  This is like ``serverreactor`` but for client-side state.
                  Each instance is bound to the lifetime of a connection. For persistent
                  connection transports using e.g. TCP sockets and speaking the raw
                  framing protocol, there will be a single instance for the lifetime of
                  the TCP socket. For transports where there are multiple discrete
                  interactions (say tunneled within an HTTP request), there will be a
                  separate instance for each distinct interaction.
                  """
                  def __init__(self, hasmultiplesend=False, buffersends=True):
                      """Create a new instance.
                      ``hasmultiplesend`` indicates whether multiple sends are supported
                      by the transport. When True, it is possible to send commands immediately
                      instead of buffering until the caller signals an intent to finish a
                      send operation.
                      ``buffersends`` indicates whether sends should be buffered until the
                      last request has been issued.
                      """
                      self._hasmultiplesend = hasmultiplesend
                      self._buffersends = buffersends
                      self._canissuecommands = True
                      self._cansend = True
                      self._nextrequestid = 1
                      # We only support a single outgoing stream for now.
                      self._outgoingstream = stream(1)
                      self._pendingrequests = collections.deque()
                      self._activerequests = {}
                      self._incomingstreams = {}
                  def callcommand(self, name, args, datafh=None, redirect=None):
                      """Request that a command be executed.
                      Receives the command name, a dict of arguments to pass to the command,
                      and an optional file object containing the raw data for the command.
                      Returns a 3-tuple of (request, action, action data).
                      """
                      if not self._canissuecommands:
                          raise error.ProgrammingError('cannot issue new commands')
                      requestid = self._nextrequestid
                      self._nextrequestid += 2
                      request = commandrequest(requestid, name, args, datafh=datafh,
                                               redirect=redirect)
                      if self._buffersends:
                          self._pendingrequests.append(request)
                          return request, 'noop', {}
                      else:
                          if not self._cansend:
                              raise error.ProgrammingError('sends cannot be performed on '
                                                           'this instance')
                          if not self._hasmultiplesend:
                              self._cansend = False
                              self._canissuecommands = False
                          return request, 'sendframes', {
                              'framegen': self._makecommandframes(request),
                          }
                  def flushcommands(self):
                      """Request that all queued commands be sent.
                      If any commands are buffered, this will instruct the caller to send
                      them over the wire. If no commands are buffered it instructs the client
                      to no-op.
                      If instances aren't configured for multiple sends, no new command
                      requests are allowed after this is called.
                      """
                      if not self._pendingrequests:
                          return 'noop', {}
                      if not self._cansend:
                          raise error.ProgrammingError('sends cannot be performed on this '
                                                       'instance')
                      # If the instance only allows sending once, mark that we have fired
                      # our one shot.
                      if not self._hasmultiplesend:
                          self._canissuecommands = False
                          self._cansend = False
                      def makeframes():
                          while self._pendingrequests:
                              request = self._pendingrequests.popleft()
                              for frame in self._makecommandframes(request):
                                  yield frame
                      return 'sendframes', {
                          'framegen': makeframes(),
                      }
                  def _makecommandframes(self, request):
                      """Emit frames to issue a command request.
                      As a side-effect, update request accounting to reflect its changed
                      state.
                      """
                      self._activerequests[request.requestid] = request
                      request.state = 'sending'
                      res = createcommandframes(self._outgoingstream,
                                                request.requestid,
                                                request.name,
                                                request.args,
                                                datafh=request.datafh,
                                                redirect=request.redirect)
                      for frame in res:
                          yield frame
                      request.state = 'sent'
                  def onframerecv(self, frame):
                      """Process a frame that has been received off the wire.
                      Returns a 2-tuple of (action, meta) describing further action the
                      caller needs to take as a result of receiving this frame.
                      """
                      if frame.streamid % 2:
                          return 'error', {
                              'message': (
                                  _('received frame with odd numbered stream ID: %d') %
                                  frame.streamid),
                          }
                      if frame.streamid not in self._incomingstreams:
                          if not frame.streamflags & STREAM_FLAG_BEGIN_STREAM:
                              return 'error', {
                                  'message': _('received frame on unknown stream '
                                               'without beginning of stream flag set'),
                              }
                          self._incomingstreams[frame.streamid] = stream(frame.streamid)
                      if frame.streamflags & STREAM_FLAG_ENCODING_APPLIED:
                          raise error.ProgrammingError('support for decoding stream '
                                                       'payloads not yet implemented')
                      if frame.streamflags & STREAM_FLAG_END_STREAM:
                          del self._incomingstreams[frame.streamid]
                      if frame.requestid not in self._activerequests:
                          return 'error', {
                              'message': (_('received frame for inactive request ID: %d') %
                                          frame.requestid),
                          }
                      request = self._activerequests[frame.requestid]
                      request.state = 'receiving'
                      handlers = {
                          FRAME_TYPE_COMMAND_RESPONSE: self._oncommandresponseframe,
                          FRAME_TYPE_ERROR_RESPONSE: self._onerrorresponseframe,
                      }
                      meth = handlers.get(frame.typeid)
                      if not meth:
                          raise error.ProgrammingError('unhandled frame type: %d' %
                                                       frame.typeid)
                      return meth(request, frame)
                  def _oncommandresponseframe(self, request, frame):
                      if frame.flags & FLAG_COMMAND_RESPONSE_EOS:
                          request.state = 'received'
                          del self._activerequests[request.requestid]
                      return 'responsedata', {
                          'request': request,
                          'expectmore': frame.flags & FLAG_COMMAND_RESPONSE_CONTINUATION,
                          'eos': frame.flags & FLAG_COMMAND_RESPONSE_EOS,
                          'data': frame.payload,
                      }
                  def _onerrorresponseframe(self, request, frame):
                      request.state = 'errored'
                      del self._activerequests[request.requestid]
                      # The payload should be a CBOR map.
                      m = cborutil.decodeall(frame.payload)[0]
                      return 'error', {
                          'request': request,
                          'type': m['type'],
                          'message': m['message'],
                      }
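
The buffer-then-flush behaviour of ``callcommand()`` and ``flushcommands()``
above can be illustrated with a small stand-alone sketch. Everything in it is
an assumption made for illustration: ``MiniReactor`` and ``drive`` are
hypothetical names, requests are plain tuples rather than ``commandrequest``
objects, and no real frames are encoded. It only mirrors the state flags and
the odd-numbered request ID allocation shown in the code above.

```python
import collections


class MiniReactor:
    """Toy stand-in for the client reactor; not Mercurial's API."""

    def __init__(self, hasmultiplesend=False, buffersends=True):
        self._hasmultiplesend = hasmultiplesend
        self._buffersends = buffersends
        self._canissuecommands = True
        self._cansend = True
        self._nextrequestid = 1
        self._pendingrequests = collections.deque()

    def _markoneshot(self):
        # A transport without multiple sends gets exactly one send.
        if not self._hasmultiplesend:
            self._canissuecommands = False
            self._cansend = False

    def callcommand(self, name, args):
        if not self._canissuecommands:
            raise RuntimeError('cannot issue new commands')
        requestid = self._nextrequestid
        # IDs advance by 2 from 1, so client request IDs stay odd.
        self._nextrequestid += 2
        request = (requestid, name, args)
        if self._buffersends:
            self._pendingrequests.append(request)
            return request, 'noop', {}
        if not self._cansend:
            raise RuntimeError('sends cannot be performed on this instance')
        self._markoneshot()
        # The tuple itself stands in for the frames of the request.
        return request, 'sendframes', {'framegen': iter([request])}

    def flushcommands(self):
        if not self._pendingrequests:
            return 'noop', {}
        if not self._cansend:
            raise RuntimeError('sends cannot be performed on this instance')
        self._markoneshot()

        def makeframes():
            while self._pendingrequests:
                yield self._pendingrequests.popleft()

        return 'sendframes', {'framegen': makeframes()}


def drive(reactor, commands):
    """Issue commands, flush, and collect what would go on the wire."""
    sent = []
    for name, args in commands:
        _request, action, meta = reactor.callcommand(name, args)
        if action == 'sendframes':
            sent.extend(meta['framegen'])
    action, meta = reactor.flushcommands()
    if action == 'sendframes':
        sent.extend(meta['framegen'])
    return sent
```

With the default arguments, ``drive(MiniReactor(), [('heads', {}), ('known', {})])``
queues both requests and emits them only when ``flushcommands()`` is reached;
a further ``callcommand()`` on the same instance then raises, because the
single permitted send has been used.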
