internals: document CBOR utilization...
Gregory Szorc -
r39446:2fe21c65 default
@@ -0,0 +1,130 b''
1 Mercurial uses Concise Binary Object Representation (CBOR)
2 (RFC 7049) for various data formats.
3
4 This document describes the subset of CBOR that Mercurial uses and
5 gives recommendations for appropriate use of CBOR within Mercurial.
6
7 Type Limitations
8 ================
9
10 Major types 0 and 1 (unsigned integers and negative integers) MUST be
11 fully supported.
12
13 Major type 2 (byte strings) MUST be fully supported. However, there
14 are limitations around the use of indefinite-length byte strings.
15 (See below.)
16
17 Major type 3 (text strings) is NOT supported.
18
19 Major type 4 (arrays) MUST be supported. However, values are limited
20 to the set of types described in the "Container Types" section below,
21 and indefinite-length arrays are NOT supported.
22
23 Major type 5 (maps) MUST be supported. However, keys and values are limited
24 to the sets of types described in the "Container Types" section below,
25 and indefinite-length maps are NOT supported.
26
27 Major type 6 (semantic tagging of major types) can be used with the
28 following semantic tag values:
29
30 258
31 Mathematical finite set. Suitable for representing Python's
32 ``set`` type.
33
34 All other semantic tag values are not allowed.
35
36 Major type 7 (simple data types) can be used with the following
37 type values:
38
39 20
40 False
41 21
42 True
43 22
44 Null
45 31
46 Break stop code (for indefinite-length items).
47
48 All other simple data type values (including every value requiring the
49 1 byte extension) are disallowed.
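
As an illustration of the rules above, the following is a minimal sketch (not Mercurial's actual encoder) of how the permitted simple values and a tag 258 set could be serialized. The function names are hypothetical, the set is restricted to unsigned integers to keep the example short, and members are sorted only to make the output deterministic; the rules above do not require an ordering.

```python
import struct


def encode_head(major, argument):
    """Encode the initial byte and argument for a CBOR major type."""
    if argument < 0:
        raise ValueError('argument must be non-negative')
    if argument < 24:
        return struct.pack('>B', major << 5 | argument)
    if argument < 2 ** 8:
        return struct.pack('>BB', major << 5 | 24, argument)
    if argument < 2 ** 16:
        return struct.pack('>BH', major << 5 | 25, argument)
    if argument < 2 ** 32:
        return struct.pack('>BI', major << 5 | 26, argument)
    return struct.pack('>BQ', major << 5 | 27, argument)


def encode_simple(value):
    """Encode one of the three permitted simple values (major type 7)."""
    if value is False:
        return b'\xf4'  # simple value 20
    if value is True:
        return b'\xf5'  # simple value 21
    if value is None:
        return b'\xf6'  # simple value 22
    raise ValueError('unsupported simple value: %r' % (value,))


def encode_uint_set(values):
    """Encode unsigned ints as semantic tag 258 wrapping a definite-length array."""
    members = sorted(values)
    parts = [encode_head(6, 258), encode_head(4, len(members))]
    parts.extend(encode_head(0, member) for member in members)
    return b''.join(parts)
```

Here ``encode_uint_set({3, 1, 2})`` produces the tag header ``d9 01 02`` followed by the three-element array ``83 01 02 03``.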
50
51 Indefinite-Length Byte Strings
52 ==============================
53
54 Indefinite-length byte strings (major type 2) are allowed. However,
55 they MUST NOT occur inside a container type (such as an array or map).
56 That is, they can only occur as the "top-most" element in a stream of
57 values.
58
59 Encoders and decoders SHOULD *stream* indefinite-length byte strings.
60 That is, an encoder or decoder SHOULD NOT buffer the entirety of a long
61 byte string value when indefinite-length byte strings are being used,
62 if it can be avoided. Mercurial MAY use extremely long indefinite-length
63 byte strings, and buffering the source or destination value COULD lead to
64 memory exhaustion.
65
66 Chunks in an indefinite-length byte string SHOULD NOT exceed 2^20
67 bytes (1 MiB).
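
The streaming and chunk-size recommendations above can be sketched as a generator that never holds more than one chunk in memory. This is an illustrative sketch rather than Mercurial's implementation: ``stream_bytestring`` and ``encode_length`` are hypothetical names, and the input is assumed to be any iterable of ``bytes`` objects.

```python
import struct

MAX_CHUNK = 2 ** 20  # recommended upper bound on a single chunk


def encode_length(major, length):
    """Encode the initial byte and length argument for a CBOR major type."""
    if length < 24:
        return struct.pack('>B', major << 5 | length)
    if length < 2 ** 8:
        return struct.pack('>BB', major << 5 | 24, length)
    if length < 2 ** 16:
        return struct.pack('>BH', major << 5 | 25, length)
    if length < 2 ** 32:
        return struct.pack('>BI', major << 5 | 26, length)
    return struct.pack('>BQ', major << 5 | 27, length)


def stream_bytestring(chunks, max_chunk=MAX_CHUNK):
    """Yield an indefinite-length byte string piece by piece."""
    yield b'\x5f'  # major type 2 with additional information 31
    for chunk in chunks:
        # Split oversized input so no emitted chunk exceeds max_chunk.
        # Empty input chunks produce no output.
        for offset in range(0, len(chunk), max_chunk):
            piece = chunk[offset:offset + max_chunk]
            yield encode_length(2, len(piece)) + piece
    yield b'\xff'  # break stop code
```

A caller would typically write each yielded piece to a socket or file as it is produced, for example ``for piece in stream_bytestring(iter(lambda: fh.read(65536), b'')): out.write(piece)``, where ``fh`` and ``out`` are open binary file objects.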
68
69 Container Types
70 ===============
71
72 Mercurial may use the array (major type 4), map (major type 5), and
73 set (semantic tag 258 plus major type 4 array) container types.
74
75 An array may contain any supported type as values.
76
77 A map MUST only use the following types as keys:
78
79 * unsigned integers (major type 0)
80 * negative integers (major type 1)
81 * byte strings (major type 2) (but not indefinite-length byte strings)
82 * false (simple type 20)
83 * true (simple type 21)
84 * null (simple type 22)
85
86 A map MUST only use the following types as values:
87
88 * all types supported as map keys
89 * arrays
90 * maps
91 * sets
92
93 A set may only use the following types as values:
94
95 * all types supported as map keys
96
97 It is recommended that keys in maps and values in sets and arrays all
98 be of a uniform type.
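
The container rules above can be expressed as a small validator run before encoding. This is a sketch under stated assumptions, not code from Mercurial: it assumes CBOR arrays, maps, and sets are represented by Python ``list``, ``dict``, and ``set``/``frozenset`` values, and the names ``is_scalar`` and ``check_value`` are hypothetical. It does not check integer range or the uniform-type recommendation.

```python
def is_scalar(value):
    """Return True for the types allowed as map keys and set members."""
    # bool is a subclass of int, so True and False are covered as well.
    return value is None or isinstance(value, (int, bytes))


def check_value(value):
    """Raise ValueError if value falls outside the documented subset."""
    if is_scalar(value):
        return
    if isinstance(value, (set, frozenset)):
        # Sets may only hold the types permitted as map keys.
        for member in value:
            if not is_scalar(member):
                raise ValueError('unsupported set member: %r' % (member,))
    elif isinstance(value, list):
        # Arrays may hold any supported type, including nested containers.
        for item in value:
            check_value(item)
    elif isinstance(value, dict):
        for key, item in value.items():
            if not is_scalar(key):
                raise ValueError('unsupported map key: %r' % (key,))
            check_value(item)
    else:
        # Text strings (str) and floats end up here, among others.
        raise ValueError('unsupported type: %s' % type(value).__name__)
```

For example, ``check_value({b'key': [1, -2, {b'a', b'b'}]})`` returns without error, while ``check_value({'key': 1})`` raises ``ValueError`` because the key is a text string.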
99
100 Avoiding Large Byte Strings
101 ===========================
102
103 The use of large byte strings is discouraged, especially in scenarios where
104 the total size of the byte string may be unbounded for some inputs (e.g. when
105 representing the content of a tracked file). It is highly recommended to use
106 indefinite-length byte strings for these purposes.
107
108 Since indefinite-length byte strings cannot be nested within an outer
109 container (such as an array or map), to associate a large byte string
110 with another data structure, it is recommended to use an array or
111 map followed immediately by an indefinite-length byte string. For example,
112 instead of the following map::
113
114     {
115         "key1": "value1",
116         "key2": "value2",
117         "long_value": "some very large value...",
118     }
119
120 Use a map followed by a byte string::
121
122     {
123         "key1": "value1",
124         "key2": "value2",
125         "value_follows": True,
126     }
127     <BEGIN INDEFINITE-LENGTH BYTE STRING>
128     "some very large value"
129     "..."
130     <END INDEFINITE-LENGTH BYTE STRING>
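
The "map followed by a byte string" layout recommended above could be produced along these lines. This is a hedged sketch, not Mercurial's encoder: the helper names are hypothetical, only the handful of types needed for the example are handled, and keys are written as Python ``bytes`` because text strings are not part of the supported subset.

```python
import struct


def encode_head(major, argument):
    """Encode the initial byte and argument for a CBOR major type."""
    if argument < 24:
        return struct.pack('>B', major << 5 | argument)
    if argument < 2 ** 8:
        return struct.pack('>BB', major << 5 | 24, argument)
    if argument < 2 ** 16:
        return struct.pack('>BH', major << 5 | 25, argument)
    if argument < 2 ** 32:
        return struct.pack('>BI', major << 5 | 26, argument)
    return struct.pack('>BQ', major << 5 | 27, argument)


def encode_item(value):
    """Encode bool, None, int, bytes, or a dict of those (definite length)."""
    # Identity checks come first because bool is a subclass of int.
    if value is False:
        return b'\xf4'
    if value is True:
        return b'\xf5'
    if value is None:
        return b'\xf6'
    if isinstance(value, int):
        if value >= 0:
            return encode_head(0, value)
        return encode_head(1, -1 - value)
    if isinstance(value, bytes):
        return encode_head(2, len(value)) + value
    if isinstance(value, dict):
        parts = [encode_head(5, len(value))]
        for key, item in value.items():
            parts.append(encode_item(key))
            parts.append(encode_item(item))
        return b''.join(parts)
    raise ValueError('unsupported type: %s' % type(value).__name__)


def encode_with_payload(meta, chunks):
    """Yield a metadata map, then the payload as a top-level byte string."""
    yield encode_item(meta)
    yield b'\x5f'  # begin indefinite-length byte string
    for chunk in chunks:
        if chunk:
            yield encode_head(2, len(chunk)) + chunk
    yield b'\xff'  # break stop code
```

Because the payload is a separate top-level value, a decoder can read the small map first, see ``value_follows``, and then stream the byte string chunk by chunk without buffering it.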
@@ -43,6 +43,7 b''
43 <Component Id="help.internals" Guid="$(var.help.internals.guid)" Win64='$(var.IsX64)'>
44 <File Id="internals.bundle2.txt" Name="bundle2.txt" />
45 <File Id="internals.bundles.txt" Name="bundles.txt" KeyPath="yes" />
46 <File Id="internals.cbor.txt" Name="cbor.txt" />
47 <File Id="internals.censor.txt" Name="censor.txt" />
48 <File Id="internals.changegroups.txt" Name="changegroups.txt" />
49 <File Id="internals.config.txt" Name="config.txt" />
@@ -205,6 +205,8 b' internalstable = sorted(['
205 loaddoc('bundle2', subdir='internals')),
206 (['bundles'], _('Bundles'),
207 loaddoc('bundles', subdir='internals')),
208 (['cbor'], _('CBOR'),
209 loaddoc('cbor', subdir='internals')),
210 (['censor'], _('Censor'),
211 loaddoc('censor', subdir='internals')),
212 (['changegroups'], _('Changegroups'),
@@ -1010,6 +1010,7 b' internals topic renders index of availab'
1010
1011 bundle2 Bundle2
1012 bundles Bundles
1013 cbor CBOR
1014 censor Censor
1015 changegroups Changegroups
1016 config Config Registrar
@@ -3294,6 +3295,13 b' Sub-topic indexes rendered properly'
3295 Bundles
3296 </td></tr>
3297 <tr><td>
3298 <a href="/help/internals.cbor">
3299 cbor
3300 </a>
3301 </td><td>
3302 CBOR
3303 </td></tr>
3304 <tr><td>
3305 <a href="/help/internals.censor">
3306 censor
3307 </a>