Mercurial uses Concise Binary Object Representation (CBOR)
(RFC 7049) for various data formats.

This document describes the subset of CBOR that Mercurial uses and
gives recommendations for appropriate use of CBOR within Mercurial.

Type Limitations
================

Major types 0 and 1 (unsigned integers and negative integers) MUST be
fully supported.

Major type 2 (byte strings) MUST be fully supported. However, there
are limitations around the use of indefinite-length byte strings.
(See "Indefinite-Length Byte Strings" below.)

Major type 3 (text strings) is NOT supported.

Major type 4 (arrays) MUST be supported. However, values are limited
to the set of types described in the "Container Types" section below,
and indefinite-length arrays are NOT supported.

Major type 5 (maps) MUST be supported. However, keys and values are
limited to the sets of types described in the "Container Types" section
below, and indefinite-length maps are NOT supported.

Major type 6 (semantic tagging of major types) can be used with the
following semantic tag values:

258
   Mathematical finite set. Suitable for representing Python's
   ``set`` type.

All other semantic tag values are not allowed.

Major type 7 (simple data types) can be used with the following
type values:

20
   False
21
   True
22
   Null
31
   Break stop code (for indefinite-length items).

All other simple data type values (including every value requiring the
1-byte extension) are disallowed.
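
The type rules above can be checked from the head of each data item. The
following is a minimal Python sketch, not Mercurial's implementation:
``read_head`` and ``head_allowed`` are hypothetical names, and the check
looks only at a single item head. It does not enforce the nesting and
container rules, which would need a separate pass:

```python
import struct

ALLOWED_TAGS = {258}                 # mathematical finite set
ALLOWED_SIMPLE = {20, 21, 22, 31}    # false, true, null, break


def read_head(data):
    """Return (major, info, value) for the data item head at data[0].

    value is None for indefinite-length markers and the break code.
    """
    initial = data[0]
    major, info = initial >> 5, initial & 0x1F
    if info < 24:
        return major, info, info     # value stored in the initial byte
    if info == 31:
        return major, info, None     # indefinite length or break
    if info > 27:
        raise ValueError("reserved additional information: %d" % info)
    size = 1 << (info - 24)          # 1, 2, 4 or 8 following bytes
    fmt = {1: ">B", 2: ">H", 4: ">I", 8: ">Q"}[size]
    return major, info, struct.unpack_from(fmt, data, 1)[0]


def head_allowed(major, info, value):
    """Return True if this head is permitted by the subset (sketch only)."""
    if major in (0, 1):
        return info != 31            # integers have no indefinite form
    if major == 2:
        return True                  # nesting rules are checked elsewhere
    if major == 3:
        return False                 # text strings are not supported
    if major in (4, 5):
        return info != 31            # no indefinite-length arrays or maps
    if major == 6:
        return value in ALLOWED_TAGS
    # Major type 7: floats (info 25-27) and the 1-byte extension (info 24)
    # are rejected because they are not in the allowed list.
    return info in ALLOWED_SIMPLE
```

For example, ``head_allowed(*read_head(b"\xd9\x01\x02"))`` accepts the
three-byte encoding of tag 258, while the same call on ``b"\x9f"`` (the
start of an indefinite-length array) rejects it.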

Indefinite-Length Byte Strings
==============================

Indefinite-length byte strings (major type 2) are allowed. However,
they MUST NOT occur inside a container type (such as an array or map).
That is, they can only occur as the top-level element in a stream of
values.

Encoders and decoders SHOULD *stream* indefinite-length byte strings.
That is, an encoder or decoder SHOULD NOT buffer the entirety of a long
byte string value when indefinite-length byte strings are being used
if it can be avoided. Mercurial MAY use extremely long indefinite-length
byte strings, and buffering the source or destination value could lead to
memory exhaustion.

Chunks in an indefinite-length byte string SHOULD NOT exceed 2^20
bytes (1 MiB).
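
As an illustration of the streaming and chunk-size recommendations, here
is a minimal Python sketch of an encoder that never holds more than one
chunk in memory. ``encode_head`` and ``stream_indefinite_bytes`` are
hypothetical names chosen for this example, not Mercurial APIs, and the
source is assumed to be any object with a ``read(size)`` method:

```python
import io
import struct

MAX_CHUNK = 2 ** 20  # recommended upper bound for a single chunk


def encode_head(major, value):
    """Encode a data item head with a definite length or value."""
    prefix = major << 5
    if value < 24:
        return struct.pack(">B", prefix | value)
    if value < 2 ** 8:
        return struct.pack(">BB", prefix | 24, value)
    if value < 2 ** 16:
        return struct.pack(">BH", prefix | 25, value)
    if value < 2 ** 32:
        return struct.pack(">BI", prefix | 26, value)
    return struct.pack(">BQ", prefix | 27, value)


def stream_indefinite_bytes(fh, chunk_size=MAX_CHUNK):
    """Yield an indefinite-length byte string built from fh, chunk by chunk."""
    if not 0 < chunk_size <= MAX_CHUNK:
        raise ValueError("chunk_size must be between 1 and 2^20")
    yield b"\x5f"                      # begin indefinite-length byte string
    while True:
        chunk = fh.read(chunk_size)
        if not chunk:
            break
        yield encode_head(2, len(chunk))  # each chunk is definite-length
        yield chunk
    yield b"\xff"                      # break stop code


# Small demonstration with an artificially tiny chunk size.
encoded = b"".join(stream_indefinite_bytes(io.BytesIO(b"hello world"), 4))
```

Because the function is a generator, a caller can write each yielded
piece to a socket or file as it is produced instead of joining the
pieces as the demonstration does.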

Container Types
===============

Mercurial may use the array (major type 4), map (major type 5), and
set (semantic tag 258 plus major type 4 array) container types.

An array may contain any supported type as values, except
indefinite-length byte strings.

A map MUST only use the following types as keys:

* unsigned integers (major type 0)
* negative integers (major type 1)
* byte strings (major type 2) (but not indefinite-length byte strings)
* false (simple type 20)
* true (simple type 21)
* null (simple type 22)

A map MUST only use the following types as values:

* all types supported as map keys
* arrays
* maps
* sets

A set may only use the following types as values:

* all types supported as map keys

It is recommended that keys in maps and values in sets and arrays all
be of a uniform type.
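
The key and value rules can be expressed as a recursive check over
Python values before they are encoded. This is only a sketch under
stated assumptions: ``is_scalar`` and ``check_value`` are hypothetical
names, and the mapping of Python types to CBOR types (``int`` to major
types 0 and 1, ``bytes`` to major type 2, ``list`` and ``tuple`` to
arrays, ``dict`` to maps, ``set`` and ``frozenset`` to tag 258) is an
assumption made for illustration:

```python
def is_scalar(value):
    """True for values usable as map keys and set members."""
    if value is None or isinstance(value, bytes):
        return True
    # bool is a subclass of int, so True and False are accepted here.
    if isinstance(value, int):
        # CBOR major types 0 and 1 cover -2^64 through 2^64 - 1.
        return -2 ** 64 <= value < 2 ** 64
    return False


def check_value(value):
    """Return True if value only uses types the container rules allow."""
    if is_scalar(value):
        return True
    if isinstance(value, (list, tuple)):
        return all(check_value(v) for v in value)
    if isinstance(value, (set, frozenset)):
        return all(is_scalar(v) for v in value)
    if isinstance(value, dict):
        return all(
            is_scalar(k) and check_value(v) for k, v in value.items()
        )
    return False  # str, float and anything else is outside the subset
```

With these assumptions, ``check_value({b"key": [1, 2, {b"a": None}]})``
passes, while a map with a text string value such as
``{b"key": "text"}`` is rejected.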

Avoiding Large Byte Strings
===========================

The use of large byte strings is discouraged, especially in scenarios
where the total size of the byte string may be unbounded for some inputs
(e.g. when representing the content of a tracked file). It is highly
recommended to use indefinite-length byte strings for these purposes.

Since indefinite-length byte strings cannot be nested within an outer
container (such as an array or map), to associate a large byte string
with another data structure, it is recommended to use an array or
map followed immediately by an indefinite-length byte string. For example,
instead of the following map::

   {
      "key1": "value1",
      "key2": "value2",
      "long_value": "some very large value...",
   }

Use a map followed by a byte string::

   {
      "key1": "value1",
      "key2": "value2",
      "value_follows": True,
   }
   <BEGIN INDEFINITE-LENGTH BYTE STRING>
      "some very large value"
      "..."
   <END INDEFINITE-LENGTH BYTE STRING>
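
The map-then-byte-string layout can be sketched in Python as follows.
The helper names (``head``, ``encode_bytes``, ``encode_map_then_stream``)
are hypothetical and not part of Mercurial. The sketch assumes that
keys and values are ``bytes`` (text strings are not supported) and that
the caller supplies chunks no longer than 2^20 bytes:

```python
import struct


def head(major, value):
    """Encode a data item head with a definite length or value."""
    prefix = major << 5
    if value < 24:
        return struct.pack(">B", prefix | value)
    if value < 2 ** 8:
        return struct.pack(">BB", prefix | 24, value)
    if value < 2 ** 16:
        return struct.pack(">BH", prefix | 25, value)
    if value < 2 ** 32:
        return struct.pack(">BI", prefix | 26, value)
    return struct.pack(">BQ", prefix | 27, value)


def encode_bytes(data):
    """Encode a definite-length byte string (major type 2)."""
    return head(2, len(data)) + data


def encode_map_then_stream(meta, chunks):
    """Yield a small map, then the large value as a top-level byte string."""
    yield head(5, len(meta) + 1)          # map with one extra marker key
    for key, value in meta.items():
        yield encode_bytes(key)
        yield encode_bytes(value)
    yield encode_bytes(b"value_follows")
    yield b"\xf5"                         # simple value 21: true
    yield b"\x5f"                         # begin indefinite-length bytes
    for chunk in chunks:
        if chunk:                         # skip empty chunks
            yield encode_bytes(chunk)
    yield b"\xff"                         # break stop code


meta = {b"key1": b"value1", b"key2": b"value2"}
stream = b"".join(
    encode_map_then_stream(meta, [b"some very large value", b"..."])
)
```

A decoder reading this stream sees two top-level values: the map, whose
``value_follows`` entry signals that more data is coming, and then the
byte string, which it can consume chunk by chunk.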