upstream/mercurial-mirror Commit - r29864:f0d47aca

help: document wire protocol "handshake" protocol...

Gregory Szorc -

r29864:f0d47aca default

mercurial/help/internals/wireprotocol.txt

0 +47 0

              The Mercurial wire protocol is a request-response based protocol
              with multiple wire representations.
              Each request is modeled as a command name, a dictionary of arguments, and
              optional raw input. Command arguments and their types are intrinsic
              properties of commands. So is the response type of the command. This means
              clients can't always send arbitrary arguments to servers and servers can't
              return multiple response types.
              The protocol is synchronous and does not support multiplexing (concurrent
              commands).
              Transport Protocols
              ===================
              HTTP Transport
              --------------
              Commands are issued as HTTP/1.0 or HTTP/1.1 requests. Commands are
              sent to the base URL of the repository with the command name sent in
              the ``cmd`` query string parameter. e.g.
              ``https://example.com/repo?cmd=capabilities``. The HTTP method is ``GET``
              or ``POST`` depending on the command and whether there is a request
              body.
              Command arguments can be sent multiple ways.
              The simplest is part of the URL query string using ``x-www-form-urlencoded``
              encoding (see Python's ``urllib.urlencode()``). However, many servers impose
              length limitations on the URL. So this mechanism is typically only used if
              the server doesn't support other mechanisms.
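As a rough illustration of the query string mechanism, the Python 3 sketch below builds a command URL. The helper name ``command_url`` and the ``key`` argument passed to ``lookup`` are assumptions made for the example, not part of Mercurial's API. ``urllib.parse.urlencode()`` is the Python 3 home of ``urllib.urlencode()``.

```python
from urllib.parse import urlencode


def command_url(base_url, command, args=None):
    """Return the request URL for a command (hypothetical helper).

    The command name goes in the ``cmd`` query parameter; any arguments
    follow as additional form-urlencoded parameters.
    """
    query = [("cmd", command)]
    # Sort arguments so the resulting URL is deterministic.
    query.extend(sorted((args or {}).items()))
    return base_url + "?" + urlencode(query)


print(command_url("https://example.com/repo", "capabilities"))
print(command_url("https://example.com/repo", "lookup", {"key": "tip"}))
```

Because everything ends up in the URL, long argument values can exceed server URL length limits, which is the motivation for the header and POST body mechanisms.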
              If the server supports the ``httpheader`` capability, command arguments can
              be sent in HTTP request headers named ``X-HgArg-<N>`` where ``<N>`` is an
              integer starting at 1. An ``x-www-form-urlencoded`` representation of the
              arguments is obtained. This full string is then split into chunks and sent
              in numbered ``X-HgArg-<N>`` headers. The maximum length of each HTTP header
              is defined by the server in the ``httpheader`` capability value, which defaults
              to ``1024``. The server reassembles the encoded arguments string by
              concatenating the ``X-HgArg-<N>`` headers then URL decodes them into a
              dictionary.
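The split-and-reassemble steps above can be sketched in Python 3 as follows. The function names are hypothetical, and the sketch assumes the length limit applies to the header value only; a real client may also need to leave room for the header name within the advertised limit.

```python
from urllib.parse import urlencode, parse_qsl


def encode_header_args(args, max_header_len=1024):
    """Split urlencoded arguments into numbered X-HgArg-<N> headers."""
    encoded = urlencode(sorted(args.items()))
    headers = []
    starts = range(0, len(encoded), max_header_len)
    for n, start in enumerate(starts, start=1):
        chunk = encoded[start:start + max_header_len]
        headers.append(("X-HgArg-%d" % n, chunk))
    return headers


def decode_header_args(headers):
    """Reassemble the chunks in numeric order and URL decode them."""
    ordered = sorted(headers, key=lambda h: int(h[0].rsplit("-", 1)[1]))
    encoded = "".join(value for _, value in ordered)
    return dict(parse_qsl(encoded, keep_blank_values=True))


args = {"foo": "bar", "baz": "hello world"}
headers = encode_header_args(args, max_header_len=10)
assert decode_header_args(headers) == args
```

A chunk boundary may fall in the middle of a percent escape. That is harmless here because decoding only happens after the chunks are concatenated.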
              The list of ``X-HgArg-<N>`` headers should be added to the ``Vary`` request
              header to instruct caches to take these headers into consideration when caching
              requests.
              If the server supports the ``httppostargs`` capability, the client
              may send command arguments in the HTTP request body as part of an
              HTTP POST request. The command arguments will be URL encoded just like
              they would for sending them via HTTP headers. However, no splitting is
              performed: the raw arguments are included in the HTTP request body.
              The client sends an ``X-HgArgs-Post`` header with the string length of the
              encoded arguments data. Additional data may be included in the HTTP
              request body immediately following the argument data. The offset of the
              non-argument data is defined by the ``X-HgArgs-Post`` header. The
              ``X-HgArgs-Post`` header is not required if there is no argument data.
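A minimal Python 3 sketch of this body layout is shown below. The helper names are hypothetical, and the header handling is reduced to a plain dictionary; it only illustrates how ``X-HgArgs-Post`` marks the boundary between argument data and any additional data.

```python
from urllib.parse import urlencode, parse_qsl


def build_post_request(args, extra_data=b""):
    """Return (headers, body) with arguments placed first in the body."""
    encoded = urlencode(sorted(args.items())).encode("ascii")
    headers = {"Content-Length": str(len(encoded) + len(extra_data))}
    if encoded:
        # The header is only needed when there is argument data.
        headers["X-HgArgs-Post"] = str(len(encoded))
    return headers, encoded + extra_data


def split_post_body(headers, body):
    """Server side: separate argument data from the data that follows."""
    arg_len = int(headers.get("X-HgArgs-Post", "0"))
    arg_text = body[:arg_len].decode("ascii")
    args = dict(parse_qsl(arg_text, keep_blank_values=True))
    return args, body[arg_len:]


headers, body = build_post_request({"key": "value"}, b"\x00raw")
assert split_post_body(headers, body) == ({"key": "value"}, b"\x00raw")
```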
              Additional command data can be sent as part of the HTTP request body. The
              default ``Content-Type`` when sending data is ``application/mercurial-0.1``.
              A ``Content-Length`` header is currently always sent.
              Example HTTP requests::
                  GET /repo?cmd=capabilities
                  X-HgArg-1: foo=bar&baz=hello%20world
              The ``Content-Type`` HTTP response header identifies the response as coming
              from Mercurial and can also be used to signal an error has occurred.
              The ``application/mercurial-0.1`` media type indicates a generic Mercurial
              response. It matches the media type sent by the client.
              The ``application/hg-error`` media type indicates a generic error occurred.
              The content of the HTTP response body typically holds text describing the
              error.
              The ``application/hg-changegroup`` media type indicates a changegroup response
              type.
              Clients also accept the ``text/plain`` media type. All other media
              types should cause the client to error.
              Clients should issue a ``User-Agent`` request header that identifies the client.
              The server should not use the ``User-Agent`` for feature detection.
              A command returning a ``string`` response issues the
              ``application/mercurial-0.1`` media type and the HTTP response body contains
              the raw string value. A ``Content-Length`` header is typically issued.
              A command returning a ``stream`` response issues the
              ``application/mercurial-0.1`` media type and the HTTP response is typically
              using *chunked transfer* (``Transfer-Encoding: chunked``).
              SSH Transport
              -------------
              The SSH transport is a custom text-based protocol suitable for use over any
              bi-directional stream transport. It is most commonly used with SSH.
              An SSH transport server can be started with ``hg serve --stdio``. The stdin,
              stderr, and stdout file descriptors of the started process are used to exchange
              data. When Mercurial connects to a remote server over SSH, it actually starts
              a ``hg serve --stdio`` process on the remote server.
              Commands are issued by sending the command name followed by a trailing newline
              ``\n`` to the server. e.g. ``capabilities\n``.
              Command arguments are sent in the following format::
                  <argument> <length>\n<value>
              That is, the argument string name followed by a space followed by the
              integer length of the value (expressed as a string) followed by a newline
              (``\n``) followed by the raw argument value.
              Dictionary arguments are encoded differently::
                  <argument> <# elements>\n
                  <key1> <length1>\n<value1>
                  <key2> <length2>\n<value2>
                  ...
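To make the framing concrete, here is a small Python 3 sketch that serializes a command name and its arguments in the formats described above. The function name and the ``example`` command are hypothetical; argument values are assumed to already be ``bytes``.

```python
def encode_ssh_command(name, args):
    """Serialize a command and its arguments for the SSH transport."""
    out = [name.encode("ascii") + b"\n"]
    for key, value in args.items():
        key_bytes = key.encode("ascii")
        if isinstance(value, dict):
            # Dictionary argument: element count, then each framed entry.
            out.append(b"%s %d\n" % (key_bytes, len(value)))
            for sub_key, sub_value in value.items():
                sub_key_bytes = sub_key.encode("ascii")
                out.append(b"%s %d\n%s" % (sub_key_bytes, len(sub_value), sub_value))
        else:
            # Plain argument: name, value length, newline, raw value.
            out.append(b"%s %d\n%s" % (key_bytes, len(value), value))
    return b"".join(out)


print(encode_ssh_command("example", {"arg": b"value"}))
print(encode_ssh_command("example", {"*": {"a": b"xy"}}))
```

Note that no delimiter follows a value: the receiver relies entirely on the declared length to know where the value ends.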
              Non-argument data is sent immediately after the final argument value. It is
              encoded in chunks::
                  <length>\n<data>
              Each command declares a list of supported arguments and their types. If a
              client sends an unknown argument to the server, the server should abort
              immediately. The special argument ``*`` in a command's definition indicates
              that all argument names are allowed.
              The definition of supported arguments and types is initially made when a
              new command is implemented. The client and server must initially independently
              agree on the arguments and their types. This initial set of arguments can be
              supplemented through the presence of *capabilities* advertised by the server.
              Each command has a defined expected response type.
              A ``string`` response type is a length framed value. The response consists of
              the string encoded integer length of a value followed by a newline (``\n``)
              followed by the value. Empty values are allowed (and are represented as
              ``0\n``).
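Reading a length-framed ``string`` response can be sketched in Python 3 as below. The function name is hypothetical, and ``stream`` is assumed to be any binary file-like object such as the server's stdout pipe.

```python
import io


def read_string_response(stream):
    """Read one length-framed ``string`` response from a binary stream."""
    length_line = stream.readline()
    if not length_line.endswith(b"\n"):
        raise ValueError("truncated length line")
    length = int(length_line.decode("ascii").strip())
    value = stream.read(length)
    if len(value) != length:
        raise ValueError("truncated value")
    return value


# An empty value is framed as "0\n".
assert read_string_response(io.BytesIO(b"0\n")) == b""
# A value consisting of a single newline is framed as "1\n\n".
assert read_string_response(io.BytesIO(b"1\n\n")) == b"\n"
```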
              A ``stream`` response type consists of raw bytes of data. There is no framing.
              A generic error response type is also supported. It consists of an error
              message written to ``stderr`` followed by ``\n-\n``. In addition, ``\n`` is
              written to ``stdout``.
              If the server receives an unknown command, it will send an empty ``string``
              response.
              The server terminates if it receives an empty command (a ``\n`` character).
              Capabilities
              ============
              Servers advertise supported wire protocol features. This allows clients to
              probe for server features before blindly calling a command or passing a
              specific argument.
              The server's features are exposed via a *capabilities* string. This is a
              space-delimited string of tokens/features. Some features are single words
              like ``lookup`` or ``batch``. Others are complicated key-value pairs
              advertising sub-features. e.g. ``httpheader=2048``. When complex, non-word
              values are used, each feature name can define its own encoding of sub-values.
              Comma-delimited and ``x-www-form-urlencoded`` values are common.
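A client-side parse of the capabilities string might look like the following Python 3 sketch. The function names are hypothetical. Values are left as raw strings because each feature defines its own sub-value encoding; the ``httpheader`` helper applies the rule, documented for that capability, of ignoring anything after a comma.

```python
def parse_capabilities(caps):
    """Map each advertised feature name to its raw value (or None)."""
    result = {}
    for token in caps.split():
        name, sep, value = token.partition("=")
        result[name] = value if sep else None
    return result


def max_header_length(capabilities):
    """Return the advertised httpheader length, or None if unsupported."""
    value = capabilities.get("httpheader")
    if value is None:
        return None
    # Content after a comma is reserved for future use.
    return int(value.split(",")[0])


caps = parse_capabilities("lookup batch httpheader=2048")
print(caps)
print(max_header_length(caps))
```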
              The following sections document capabilities defined by the canonical Mercurial server
              implementation.
              batch
              -----
              Whether the server supports the ``batch`` command.
              This capability/command was introduced in Mercurial 1.9 (released July 2011).
              branchmap
              ---------
              Whether the server supports the ``branchmap`` command.
              This capability/command was introduced in Mercurial 1.3 (released July 2009).
              bundle2-exp
              -----------
              Precursor to ``bundle2`` capability that was used before bundle2 was a
              stable feature.
              This capability was introduced in Mercurial 3.0 behind an experimental
              flag. This capability should not be observed in the wild.
              bundle2
              -------
              Indicates whether the server supports the ``bundle2`` data exchange format.
              The value of the capability is a URL quoted, newline (``\n``) delimited
              list of keys or key-value pairs.
              A key is simply a URL encoded string.
              A key-value pair is a URL encoded key separated from a URL encoded value by
              an ``=``. If the value is a list, elements are delimited by a ``,`` after
              URL encoding.
              For example, say we have the values::
                {'HG20': [], 'changegroup': ['01', '02'], 'digests': ['sha1', 'sha512']}
              We would first construct a string::
                HG20\nchangegroup=01,02\ndigests=sha1,sha512
              We would then URL quote this string::
                HG20%0Achangegroup%3D01%2C02%0Adigests%3Dsha1%2Csha512
              This capability was introduced in Mercurial 3.4 (released May 2015).
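The two-level encoding of the ``bundle2`` value can be reproduced with Python 3's ``urllib.parse``. This is a sketch with hypothetical function names, not Mercurial's own implementation; it relies on ``quote()`` escaping ``=``, ``,`` and newlines by default.

```python
from urllib.parse import quote, unquote


def encode_bundle2_caps(caps):
    """Encode a {key: [values]} mapping as a bundle2 capability value."""
    lines = []
    for key, values in caps.items():
        line = quote(key)
        if values:
            line += "=" + ",".join(quote(v) for v in values)
        lines.append(line)
    # The newline-delimited string is URL quoted as a whole.
    return quote("\n".join(lines))


def decode_bundle2_caps(blob):
    """Reverse encode_bundle2_caps()."""
    caps = {}
    for line in unquote(blob).split("\n"):
        if not line:
            continue
        key, sep, rest = line.partition("=")
        values = [unquote(v) for v in rest.split(",")] if sep else []
        caps[unquote(key)] = values
    return caps


caps = {"HG20": [], "changegroup": ["01", "02"], "digests": ["sha1", "sha512"]}
encoded = encode_bundle2_caps(caps)
assert encoded == "HG20%0Achangegroup%3D01%2C02%0Adigests%3Dsha1%2Csha512"
assert decode_bundle2_caps(encoded) == caps
```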
              changegroupsubset
              -----------------
              Whether the server supports the ``changegroupsubset`` command.
              This capability was introduced in Mercurial 0.9.2 (released December
              2006).
              This capability was introduced at the same time as the ``lookup``
              capability/command.
              getbundle
              ---------
              Whether the server supports the ``getbundle`` command.
              This capability was introduced in Mercurial 1.9 (released July 2011).
              httpheader
              ----------
              Whether the server supports receiving command arguments via HTTP request
              headers.
              The value of the capability is an integer describing the max header
              length that clients should send. Clients should ignore any content after a
              comma in the value, as this is reserved for future use.
              This capability was introduced in Mercurial 1.9 (released July 2011).
              httppostargs
              ------------
              **Experimental**
              Indicates that the server supports and prefers clients send command arguments
              via an HTTP POST request as part of the request body.
              This capability was introduced in Mercurial 3.8 (released May 2016).
              known
              -----
              Whether the server supports the ``known`` command.
              This capability/command was introduced in Mercurial 1.9 (released July 2011).
              lookup
              ------
              Whether the server supports the ``lookup`` command.
              This capability was introduced in Mercurial 0.9.2 (released December
              2006).
              This capability was introduced at the same time as the ``changegroupsubset``
              capability/command.
              pushkey
              -------
              Whether the server supports the ``pushkey`` and ``listkeys`` commands.
              This capability was introduced in Mercurial 1.6 (released July 2010).
              standardbundle
              --------------
              **Unsupported**
              This capability was introduced during the Mercurial 0.9.2 development cycle in
              2006. It was never present in a release, as it was replaced by the ``unbundle``
              capability. This capability should not be encountered in the wild.
              stream-preferred
              ----------------
              If present the server prefers that clients clone using the streaming clone
              protocol (``hg clone --uncompressed``) rather than the standard
              changegroup/bundle based protocol.
              This capability was introduced in Mercurial 2.2 (released May 2012).
              streamreqs
              ----------
              Indicates whether the server supports *streaming clones* and the *requirements*
              that clients must support to receive it.
              If present, the server supports the ``stream_out`` command, which transmits
              raw revlogs from the repository instead of changegroups. This provides a faster
              cloning mechanism at the expense of more bandwidth used.
              The value of this capability is a comma-delimited list of repo format
              *requirements*. These are requirements that impact the reading of data in
              the ``.hg/store`` directory. An example value is
              ``streamreqs=generaldelta,revlogv1`` indicating the server repo requires
              the ``revlogv1`` and ``generaldelta`` requirements.
              If the only format requirement is ``revlogv1``, the server may expose the
              ``stream`` capability instead of the ``streamreqs`` capability.
              This capability was introduced in Mercurial 1.7 (released November 2010).
              stream
              ------
              Whether the server supports *streaming clones* from ``revlogv1`` repos.
              If present, the server supports the ``stream_out`` command, which transmits
              raw revlogs from the repository instead of changegroups. This provides a faster
              cloning mechanism at the expense of more bandwidth used.
              This capability was introduced in Mercurial 0.9.1 (released July 2006).
              When initially introduced, the value of the capability was the numeric
              revlog revision. e.g. ``stream=1``. This indicates the changegroup is using
              ``revlogv1``. This simple integer value wasn't powerful enough, so the
              ``streamreqs`` capability was invented to handle cases where the repo
              requirements have more than just ``revlogv1``. Newer servers omit the
              ``=1`` since it was the only value supported and the value of ``1`` can
              be implied by clients.
              unbundlehash
              ------------
              Whether the ``unbundle`` command supports receiving a hash of all the
              heads instead of a list.
              For more, see the documentation for the ``unbundle`` command.
              This capability was introduced in Mercurial 1.9 (released July 2011).
              unbundle
              --------
              Whether the server supports pushing via the ``unbundle`` command.
              This capability/command has been present since Mercurial 0.9.1 (released
              July 2006).
              Mercurial 0.9.2 (released December 2006) added values to the capability
              indicating which bundle types the server supports receiving. This value is a
              comma-delimited list. e.g. ``HG10GZ,HG10BZ,HG10UN``. The order of values
              reflects the priority/preference of that type, where the first value is the
              most preferred type.
+             Handshake Protocol
+             ==================
+             While not explicitly required, it is common for clients to perform a
+             *handshake* when connecting to a server. The handshake accomplishes 2 things:
+             * Obtaining capabilities and other server features
+             * Flushing extra server output (e.g. SSH servers may print extra text
+               when connecting that may confuse the wire protocol)
+             This isn't a traditional *handshake* as far as network protocols go because
+             there is no persistent state as a result of the handshake: the handshake is
+             simply the issuing of commands and commands are stateless.
+             The canonical clients perform a capabilities lookup at connection establishment
+             time. This is because clients must assume a server only supports the features
+             of the original Mercurial server implementation until proven otherwise (from
+             advertised capabilities). Nearly every server running today supports features
+             that weren't present in the original Mercurial server implementation. Rather
+             than wait until an operation needs to consult capabilities, the client
+             issues the lookup at connection start to avoid any delay later.
+             For HTTP servers, the client sends a ``capabilities`` command request as
+             soon as the connection is established. The server responds with a capabilities
+             string, which the client parses.
+             For SSH servers, the client sends the ``hello`` command (no arguments)
+             and a ``between`` command with the ``pairs`` argument having the value
+             ``0000000000000000000000000000000000000000-0000000000000000000000000000000000000000``.
+             The ``between`` command has been supported since the original Mercurial
+             server. Requesting the empty range will return a ``\n`` string response,
+             which will be encoded as ``1\n\n`` (value length of ``1`` followed by a newline
+             followed by the value, which happens to be a newline).
+             The ``hello`` command was later introduced. Servers supporting it will issue
+             a response to that command before sending the ``1\n\n`` response to the
+             ``between`` command. Servers not supporting ``hello`` will send an empty
+             response (``0\n``).
+             In addition to the expected output from the ``hello`` and ``between`` commands,
+             servers may also send other output, such as *message of the day (MOTD)*
+             announcements. Clients assume servers will send this output before the
+             Mercurial server replies to the client-issued commands. So any server output
+             not conforming to the expected command responses is assumed to be not related
+             to Mercurial and can be ignored.
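The SSH handshake described above can be sketched in Python 3 as follows. The names ``handshake_request`` and ``read_handshake_output`` are hypothetical, and the reader is deliberately simplistic: it treats the first ``1\n`` line followed by an empty line as the ``between`` reply and returns everything received before it, leaving it to the caller to separate any ``hello`` response from unrelated output such as an MOTD. The ``max_lines`` guard is an assumption added to avoid reading forever from a misbehaving server.

```python
import io

# The empty range passed as the ``pairs`` argument.
NULL_PAIR = b"0" * 40 + b"-" + b"0" * 40


def handshake_request():
    """Bytes for ``hello`` followed by ``between`` on the empty range."""
    return b"hello\nbetween\npairs %d\n%s" % (len(NULL_PAIR), NULL_PAIR)


def read_handshake_output(stream, max_lines=500):
    """Collect server output lines that precede the ``1\\n\\n`` reply."""
    lines = []
    for _ in range(max_lines):
        line = stream.readline()
        if not line:
            raise EOFError("stream closed before the between reply")
        if line == b"\n" and lines and lines[-1] == b"1\n":
            # Drop the "1\n" length line; the rest came before the reply.
            return lines[:-1]
        lines.append(line)
    raise ValueError("no between reply found within max_lines lines")


# Simulated server output: a banner, a made-up framed reply, then "1\n\n".
fake = io.BytesIO(b"Welcome to example host\n4\nabc\n1\n\n")
print(read_handshake_output(fake))
```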
