Gregory Szorc
Mercurial uses Concise Binary Object Representation (CBOR)
(RFC 7049) for various data formats.

This document describes the subset of CBOR that Mercurial uses and
gives recommendations for appropriate use of CBOR within Mercurial.

Type Limitations
================

Major types 0 and 1 (unsigned integers and negative integers) MUST be
fully supported.

Major type 2 (byte strings) MUST be fully supported. However, there
are limitations around the use of indefinite-length byte strings.
(See below.)

Major type 3 (text strings) is NOT supported.

Major type 4 (arrays) MUST be supported. However, values are limited
to the set of types described in the "Container Types" section below.
Indefinite-length arrays are NOT supported.

Major type 5 (maps) MUST be supported. However, keys and values are
limited to the sets of types described in the "Container Types" section
below. Indefinite-length maps are NOT supported.

Major type 6 (semantic tagging of major types) can be used with the
following semantic tag values:

258
   Mathematical finite set. Suitable for representing Python's
   ``set`` type.

All other semantic tag values are not allowed.
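
As an illustration of the tag 258 rule above, the following is a minimal
sketch of how a set could be serialized as semantic tag 258 wrapping a
definite-length array. The helper names ``encode_head`` and
``encode_uint_set`` are hypothetical and are not part of Mercurial's API;
the sketch handles only sets of unsigned integers, and the sort exists
only to make the output reproducible.

```python
import struct


def encode_head(major, value):
    """Encode a CBOR initial byte plus argument using the shortest form."""
    if value < 24:
        return struct.pack(">B", (major << 5) | value)
    if value < 0x100:
        return struct.pack(">BB", (major << 5) | 24, value)
    if value < 0x10000:
        return struct.pack(">BH", (major << 5) | 25, value)
    if value < 0x100000000:
        return struct.pack(">BI", (major << 5) | 26, value)
    return struct.pack(">BQ", (major << 5) | 27, value)


def encode_uint_set(values):
    """Encode a set of unsigned integers as tag 258 plus an array."""
    out = encode_head(6, 258)            # major type 6, tag value 258
    out += encode_head(4, len(values))   # definite-length array header
    for v in sorted(values):             # sorted only for stable output
        out += encode_head(0, v)         # major type 0: unsigned integer
    return out
```

For example, ``encode_uint_set({1, 2, 3})`` produces the bytes
``d9 01 02 83 01 02 03``: the two-byte tag argument 258, an array header
for three items, then the three integers.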

Major type 7 (simple data types) can be used with the following
type values:

20
   False
21
   True
22
   Null
31
   Break stop code (for indefinite-length items).

All other simple data type values (including every value requiring the
1 byte extension) are disallowed.
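
The type limitations above can be partly checked from the initial byte of
each CBOR item. The following is a rough sketch, not Mercurial's decoder:
``classify_initial_byte`` is a hypothetical helper that splits the byte
into major type and additional information and rejects what this subset
disallows. Checking that a semantic tag is 258 would additionally require
reading the tag's argument, which this sketch does not do.

```python
MAJOR_NAMES = {
    0: "unsigned integer",
    1: "negative integer",
    2: "byte string",
    4: "array",
    5: "map",
    6: "semantic tag",
}

# The only major type 7 values permitted by this subset.
SIMPLE_NAMES = {20: "false", 21: "true", 22: "null", 31: "break"}


def classify_initial_byte(byte):
    """Return a name for the item type or raise ValueError if disallowed."""
    major, info = byte >> 5, byte & 0x1F
    if major == 3:
        raise ValueError("text strings (major type 3) are not supported")
    if major == 7:
        if info not in SIMPLE_NAMES:
            raise ValueError("simple value %d is not allowed" % info)
        return SIMPLE_NAMES[info]
    if 28 <= info <= 30:
        raise ValueError("reserved additional information %d" % info)
    if info == 31 and major != 2:
        # Only byte strings may use the indefinite-length form.
        raise ValueError("indefinite length is only allowed for byte strings")
    return MAJOR_NAMES[major]
```

With this sketch, ``0xf5`` classifies as ``true`` and ``0x5f`` (start of
an indefinite-length byte string) as a byte string, while ``0x9f``
(indefinite-length array) and ``0x61`` (text string) raise ``ValueError``.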

Indefinite-Length Byte Strings
==============================

Indefinite-length byte strings (major type 2) are allowed. However,
they MUST NOT occur inside a container type (such as an array or map).
That is, they can only occur as the "top-most" element in a stream of
values.

Encoders and decoders SHOULD *stream* indefinite-length byte strings.
That is, an encoder or decoder SHOULD NOT buffer the entirety of a long
byte string value when indefinite-length byte strings are being used
if it can be avoided. Mercurial MAY use extremely long indefinite-length
byte strings and buffering the source or destination value COULD lead to
memory exhaustion.

Chunks in an indefinite-length byte string SHOULD NOT exceed 2^20
bytes (1 MiB).
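
The streaming recommendation above can be sketched as a generator that
reads a file-like object in bounded chunks and yields each encoded piece
as soon as it is available. This is an illustrative sketch only; the names
``stream_indefinite_bytes`` and ``encode_to_bytes`` and the default chunk
size of 65536 are assumptions, not Mercurial's implementation.

```python
import io
import struct

MAX_CHUNK = 2 ** 20  # recommended upper bound on chunk size


def encode_bytes_head(length):
    """Header for a definite-length byte string chunk (major type 2)."""
    if length < 24:
        return struct.pack(">B", 0x40 | length)
    if length < 0x100:
        return struct.pack(">BB", 0x58, length)
    if length < 0x10000:
        return struct.pack(">BH", 0x59, length)
    return struct.pack(">BI", 0x5A, length)  # enough for chunks up to 2^20


def stream_indefinite_bytes(source, chunk_size=65536):
    """Yield the encoding of an indefinite-length byte string piecewise."""
    if not 0 < chunk_size <= MAX_CHUNK:
        raise ValueError("chunk_size must be between 1 and 2^20")
    yield b"\x5f"                      # begin indefinite-length byte string
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        yield encode_bytes_head(len(chunk)) + chunk
    yield b"\xff"                      # break stop code


def encode_to_bytes(data, chunk_size=65536):
    """Convenience wrapper for small inputs; it buffers, so avoid for large data."""
    return b"".join(stream_indefinite_bytes(io.BytesIO(data), chunk_size))
```

A caller would typically write each yielded piece straight to a socket or
file rather than joining them, so that memory use stays bounded by the
chunk size.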

Container Types
===============

Mercurial may use the array (major type 4), map (major type 5), and
set (semantic tag 258 plus major type 4 array) container types.

An array may contain any supported type as values.

A map MUST only use the following types as keys:

* unsigned integers (major type 0)
* negative integers (major type 1)
* byte strings (major type 2) (but not indefinite-length byte strings)
* false (simple type 20)
* true (simple type 21)
* null (simple type 22)

A map MUST only use the following types as values:

* all types supported as map keys
* arrays
* maps
* sets

A set may only use the following types as values:

* all types supported as map keys

It is recommended that keys in maps and values in sets and arrays all
be of a uniform type.
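
The key and value rules above can be expressed as a small recursive check
over Python values before encoding. This is a sketch under assumptions:
``is_scalar`` and ``is_valid_value`` are hypothetical names, Python
``bytes`` stands in for a definite-length byte string, ``list`` for an
array, ``dict`` for a map, and ``set``/``frozenset`` for a tagged set. The
integer range check reflects what CBOR major types 0 and 1 can represent.

```python
def is_scalar(value):
    """True for the types allowed as map keys and set members."""
    if value is None or isinstance(value, (bool, bytes)):
        return True
    if isinstance(value, int):
        # Major types 0 and 1 together cover -2^64 .. 2^64 - 1.
        return -(2 ** 64) <= value < 2 ** 64
    return False


def is_valid_value(value):
    """True if the value only uses types from the supported CBOR subset."""
    if is_scalar(value):
        return True
    if isinstance(value, list):
        return all(is_valid_value(v) for v in value)
    if isinstance(value, dict):
        return all(
            is_scalar(k) and is_valid_value(v) for k, v in value.items()
        )
    if isinstance(value, (set, frozenset)):
        return all(is_scalar(v) for v in value)
    return False  # str, float, tuple, and everything else
```

Under these assumptions, ``{b"key": [1, {2, 3}], None: True}`` passes,
while ``{"key": 1}`` fails because text strings are not supported and
``[1.5]`` fails because floating point values are not in the subset.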

Avoiding Large Byte Strings
===========================

The use of large byte strings is discouraged, especially in scenarios where
the total size of the byte string may be unbounded for some inputs (e.g. when
representing the content of a tracked file). It is highly recommended to use
indefinite-length byte strings for these purposes.

Since indefinite-length byte strings cannot be nested within an outer
container (such as an array or map), to associate a large byte string
with another data structure, it is recommended to use an array or
map followed immediately by an indefinite-length byte string. For example,
instead of the following map::

  {
    "key1": "value1",
    "key2": "value2",
    "long_value": "some very large value...",
  }

Use a map followed by a byte string::

  {
    "key1": "value1",
    "key2": "value2",
    "value_follows": True,
  }
  <BEGIN INDEFINITE-LENGTH BYTE STRING>
  "some very large value"
  "..."
  <END INDEFINITE-LENGTH BYTE STRING>
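
The map-then-byte-string layout above could be produced along the
following lines. This is a hedged sketch rather than Mercurial's encoder:
``frames`` is a hypothetical generator, map keys are byte strings (text
strings are outside the supported subset), and map values are limited to
byte strings and ``True`` to keep the example short.

```python
import struct

MAX_CHUNK = 2 ** 20  # recommended upper bound on chunk size


def head(major, n):
    """Shortest-form CBOR head for a major type and argument below 2^32."""
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 0x100:
        return bytes([(major << 5) | 24, n])
    if n < 0x10000:
        return struct.pack(">BH", (major << 5) | 25, n)
    return struct.pack(">BI", (major << 5) | 26, n)


def bstr(data):
    """Definite-length byte string (major type 2)."""
    return head(2, len(data)) + data


def frames(meta, chunks):
    """Yield a definite-length map, then an indefinite-length byte string."""
    yield head(5, len(meta))           # map header with the pair count
    for key, value in meta.items():
        yield bstr(key)
        if value is True:
            yield b"\xf5"              # simple value 21 (true)
        elif isinstance(value, bytes):
            yield bstr(value)
        else:
            raise TypeError("this sketch only encodes bytes and True")
    yield b"\x5f"                      # begin indefinite-length byte string
    for chunk in chunks:
        if len(chunk) > MAX_CHUNK:
            raise ValueError("chunk exceeds 2^20 bytes")
        yield bstr(chunk)
    yield b"\xff"                      # break stop code
```

Because ``frames`` is a generator, ``chunks`` can itself be an iterator
over file content, so the large value is never held in memory as a whole.
A decoder reading this stream would see the ``value_follows`` key in the
map and then consume the byte string that immediately follows it.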