help: fix internals.changegroups...
Kyle Lippincott -
r31213:9f169b7f default
@@ -1,35 +1,49 b''
1 Changegroups are representations of repository revlog data, specifically
1 Changegroups are representations of repository revlog data, specifically
2 the changelog, manifest, and filelogs.
2 the changelog data, root/flat manifest data, treemanifest data, and
3 filelogs.
3
4
4 There are 3 versions of changegroups: ``1``, ``2``, and ``3``. From a
5 There are 3 versions of changegroups: ``1``, ``2``, and ``3``. From a
5 high-level, versions ``1`` and ``2`` are almost exactly the same, with
6 high-level, versions ``1`` and ``2`` are almost exactly the same, with the
6 the only difference being a header on entries in the changeset
7 only difference being an additional item in the *delta header*. Version
7 segment. Version ``3`` adds support for exchanging treemanifests and
8 ``3`` adds support for revlog flags in the *delta header* and optionally
8 includes revlog flags in the delta header.
9 exchanging treemanifests (enabled by setting an option on the
10 ``changegroup`` part in the bundle2).
9
11
10 Changegroups consists of 3 logical segments::
12 Changegroups when not exchanging treemanifests consist of 3 logical
13 segments::
11
14
12 +---------------------------------+
15 +---------------------------------+
13 | | | |
16 | | | |
14 | changeset | manifest | filelogs |
17 | changeset | manifest | filelogs |
15 | | | |
18 | | | |
19 | | | |
16 +---------------------------------+
20 +---------------------------------+
17
21
22 When exchanging treemanifests, there are 4 logical segments::
23
24 +-------------------------------------------------+
25 | | | | |
26 | changeset | root | treemanifests | filelogs |
27 | | manifest | | |
28 | | | | |
29 +-------------------------------------------------+
30
18 The principle building block of each segment is a *chunk*. A *chunk*
31 The principle building block of each segment is a *chunk*. A *chunk*
19 is a framed piece of data::
32 is a framed piece of data::
20
33
21 +---------------------------------------+
34 +---------------------------------------+
22 | | |
35 | | |
23 | length | data |
36 | length | data |
24 | (32 bits) | <length> bytes |
37 | (4 bytes) | (<length - 4> bytes) |
25 | | |
38 | | |
26 +---------------------------------------+
39 +---------------------------------------+
27
40
28 Each chunk starts with a 32-bit big-endian signed integer indicating
41 All integers are big-endian signed integers. Each chunk starts with a 32-bit
29 the length of the raw data that follows.
42 integer indicating the length of the entire chunk (including the length field
43 itself).
30
44
31 There is a special case chunk that has 0 length (``0x00000000``). We
45 There is a special case chunk that has a value of 0 for the length
32 call this an *empty chunk*.
46 (``0x00000000``). We call this an *empty chunk*.
33
47
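Review note: the revised framing rule (the length counts its own 4 bytes, and a length of 0 marks an empty chunk) can be sketched as below. This is only an illustration of the text above, not Mercurial's own reader; the names `read_chunk`, `write_chunk`, and `EMPTY_CHUNK` are hypothetical, and rejecting nonzero lengths below 4 is this sketch's choice.

```python
import io
import struct

# An empty chunk is a bare length field holding 0.
EMPTY_CHUNK = struct.pack(">i", 0)


def write_chunk(payload):
    """Frame payload as a chunk; the length field counts itself (4 bytes)."""
    return struct.pack(">i", len(payload) + 4) + payload


def read_chunk(stream):
    """Return the next chunk's data, or None when an empty chunk is read."""
    raw = stream.read(4)
    if len(raw) != 4:
        raise EOFError("truncated chunk length")
    (length,) = struct.unpack(">i", raw)  # 32-bit big-endian signed integer
    if length == 0:
        return None
    if length < 4:
        # Assumption of this sketch: a nonzero length that cannot cover
        # its own length field is treated as malformed.
        raise ValueError("invalid chunk length %d" % length)
    data = stream.read(length - 4)
    if len(data) != length - 4:
        raise EOFError("truncated chunk data")
    return data
```

With this framing, a 5-byte payload is written with a length field of 9, which is the off-by-4 difference this change documents.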
34 Delta Groups
48 Delta Groups
35 ============
49 ============
@@ -43,26 +57,27 b' to signal the end of the delta group::'
43 +------------------------------------------------------------------------+
57 +------------------------------------------------------------------------+
44 | | | | | |
58 | | | | | |
45 | chunk0 length | chunk0 data | chunk1 length | chunk1 data | 0x0 |
59 | chunk0 length | chunk0 data | chunk1 length | chunk1 data | 0x0 |
46 | (32 bits) | (various) | (32 bits) | (various) | (32 bits) |
60 | (4 bytes) | (various) | (4 bytes) | (various) | (4 bytes) |
47 | | | | | |
61 | | | | | |
48 +------------------------------------------------------------+-----------+
62 +------------------------------------------------------------------------+
49
63
50 Each *chunk*'s data consists of the following::
64 Each *chunk*'s data consists of the following::
51
65
52 +-----------------------------------------+
66 +---------------------------------------+
53 | | | |
67 | | |
54 | delta header | mdiff header | delta |
68 | delta header | delta data |
55 | (various) | (12 bytes) | (various) |
69 | (various by version) | (various) |
56 | | | |
70 | | |
57 +-----------------------------------------+
71 +---------------------------------------+
58
72
59 The *length* field is the byte length of the remaining 3 logical pieces
73 The *delta data* is a series of *delta*s that describe a diff from an existing
60 of data. The *delta* is a diff from an existing entry in the changelog.
74 entry (either that the recipient already has, or previously specified in the
75 bundlei/changegroup).
61
76
62 The *delta header* is different between versions ``1``, ``2``, and
77 The *delta header* is different between versions ``1``, ``2``, and
63 ``3`` of the changegroup format.
78 ``3`` of the changegroup format.
64
79
65 Version 1::
80 Version 1 (headerlen=80)::
66
81
67 +------------------------------------------------------+
82 +------------------------------------------------------+
68 | | | | |
83 | | | | |
@@ -71,7 +86,7 b' Version 1::'
71 | | | | |
86 | | | | |
72 +------------------------------------------------------+
87 +------------------------------------------------------+
73
88
74 Version 2::
89 Version 2 (headerlen=100)::
75
90
76 +------------------------------------------------------------------+
91 +------------------------------------------------------------------+
77 | | | | | |
92 | | | | | |
@@ -80,30 +95,35 b' Version 2::'
80 | | | | | |
95 | | | | | |
81 +------------------------------------------------------------------+
96 +------------------------------------------------------------------+
82
97
83 Version 3::
98 Version 3 (headerlen=102)::
84
99
85 +------------------------------------------------------------------------------+
100 +------------------------------------------------------------------------------+
86 | | | | | | |
101 | | | | | | |
87 | node | p1 node | p2 node | base node | link node | flags |
102 | node | p1 node | p2 node | base node | link node | flags |
88 | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (2 bytes) |
103 | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (2 bytes) |
89 | | | | | | |
104 | | | | | | |
90 +------------------------------------------------------------------------------+
105 +------------------------------------------------------------------------------+
91
106
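Review note: as a check on `headerlen=102`, the version 3 layout shown above (five 20-byte nodes followed by a 2-byte flags field) can be unpacked as in this sketch. The names `V3_DELTA_HEADER`, `DeltaHeaderV3`, and `split_v3_chunk` are hypothetical, and reading the flags as an unsigned 16-bit value is an assumption made here because it is a bit field. Versions 1 and 2 are left out because their field tables are not shown in this hunk.

```python
import struct
from collections import namedtuple

# Five 20-byte nodes plus a 2-byte flags field, big-endian, no padding.
# Assumption: flags are read as unsigned ("H").
V3_DELTA_HEADER = struct.Struct(">20s20s20s20s20sH")

DeltaHeaderV3 = namedtuple("DeltaHeaderV3", "node p1 p2 base link flags")


def split_v3_chunk(chunk_data):
    """Split one chunk's data into (delta header, delta data) for version 3."""
    if len(chunk_data) < V3_DELTA_HEADER.size:
        raise ValueError("chunk data shorter than the delta header")
    header = DeltaHeaderV3(*V3_DELTA_HEADER.unpack_from(chunk_data))
    # Everything after the header is delta data, that is
    # chunklen - 4 - headerlen bytes of the framed chunk.
    return header, chunk_data[V3_DELTA_HEADER.size:]
```

`V3_DELTA_HEADER.size` evaluates to 102, matching the header length stated in the new heading.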
92 The *mdiff header* consists of 3 32-bit big-endian signed integers
107 The *delta data* consists of ``chunklen - 4 - headerlen`` bytes, which contain a
93 describing offsets at which to apply the following delta content::
108 series of *delta*s, densely packed (no separators). These deltas describe a diff
109 from an existing entry (either that the recipient already has, or previously
110 specified in the bundle/changegroup). The format is described more fully in
111 ``hg help internals.bdiff``, but briefly:
94
112
95 +-------------------------------------+
113 +---------------------------------------------------------------+
96 | | | |
114 | | | | |
97 | offset | old length | new length |
115 | start offset | end offset | new length | content |
98 | (32 bits) | (32 bits) | (32 bits) |
116 | (4 bytes) | (4 bytes) | (4 bytes) | (<new length> bytes) |
99 | | | |
117 | | | | |
100 +-------------------------------------+
118 +---------------------------------------------------------------+
119
120 Please note that the length field in the delta data does *not* include itself.
101
121
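Review note: a minimal sketch of how the densely packed deltas described above could be applied to a base text. It assumes that the start and end offsets refer to positions in the base text and that deltas appear in ascending, non-overlapping order; the help text here does not state this, so treat it as an assumption and see ``hg help internals.bdiff`` for the authoritative description. The name `apply_delta_data` is hypothetical.

```python
import struct


def apply_delta_data(base, delta_data):
    """Apply a series of (start, end, new length, content) deltas to base."""
    pieces = []
    pos = 0   # read position inside delta_data
    last = 0  # end of the base region already consumed
    while pos < len(delta_data):
        if len(delta_data) - pos < 12:
            raise ValueError("truncated delta fields")
        start, end, new_len = struct.unpack_from(">iii", delta_data, pos)
        pos += 12  # three 4-byte fields; new_len does not include them
        if new_len < 0 or not (last <= start <= end <= len(base)):
            raise ValueError("delta offsets out of range or out of order")
        content = delta_data[pos:pos + new_len]
        if len(content) != new_len:
            raise ValueError("truncated delta content")
        pos += new_len
        pieces.append(base[last:start])  # unchanged bytes before the delta
        pieces.append(content)           # replacement for base[start:end]
        last = end
    pieces.append(base[last:])           # unchanged tail of the base
    return b"".join(pieces)
```

Because the new length field excludes itself, each delta occupies `12 + new length` bytes, which is why the reader advances by 12 before slicing the content.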
102 In version 1, the delta is always applied against the previous node from
122 In version 1, the delta is always applied against the previous node from
103 the changegroup or the first parent if this is the first entry in the
123 the changegroup or the first parent if this is the first entry in the
104 changegroup.
124 changegroup.
105
125
106 In version 2, the delta base node is encoded in the entry in the
126 In version 2 and up, the delta base node is encoded in the entry in the
107 changegroup. This allows the delta to be expressed against any parent,
127 changegroup. This allows the delta to be expressed against any parent,
108 which can result in smaller deltas and more efficient encoding of data.
128 which can result in smaller deltas and more efficient encoding of data.
109
129
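Review note: the base selection rule in the two paragraphs above can be expressed as a small helper. This is a sketch of the described behaviour only; `pick_delta_base` and its parameter names are hypothetical.

```python
def pick_delta_base(version, p1_node, previous_node=None, base_node=None):
    """Return the node a chunk's delta applies against.

    version:       changegroup version (1, 2, or 3)
    p1_node:       first parent from the delta header
    previous_node: node of the preceding entry in this delta group, or
                   None if this is the first entry
    base_node:     base node from the delta header (versions 2 and up)
    """
    if version == 1:
        # Version 1 has no base field: use the previous entry, or the
        # first parent when this is the first entry in the changegroup.
        return p1_node if previous_node is None else previous_node
    if base_node is None:
        raise ValueError("version %d requires an explicit base node" % version)
    return base_node
```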
@@ -111,43 +131,58 b' Changeset Segment'
111 =================
131 =================
112
132
113 The *changeset segment* consists of a single *delta group* holding
133 The *changeset segment* consists of a single *delta group* holding
114 changelog data. It is followed by an *empty chunk* to denote the
134 changelog data. The *empty chunk* at the end of the *delta group* denotes
115 boundary to the *manifests segment*.
135 the boundary to the *manifest segment*.
116
136
117 Manifest Segment
137 Manifest Segment
118 ================
138 ================
119
139
120 The *manifest segment* consists of a single *delta group* holding
140 The *manifest segment* consists of a single *delta group* holding manifest
121 manifest data. It is followed by an *empty chunk* to denote the boundary
141 data. If treemanifests are in use, it contains only the manifest for the
122 to the *filelogs segment*.
142 root directory of the repository. Otherwise, it contains the entire
143 manifest data. The *empty chunk* at the end of the *delta group* denotes
144 the boundary to the next segment (either the *treemanifests segment* or the
145 *filelogs segment*, depending on version and the request options).
146
147 Treemanifests Segment
148 ---------------------
149
150 The *treemanifests segment* only exists in changegroup version ``3``, and
151 only if the 'treemanifest' param is part of the bundle2 changegroup part
152 (it is not possible to use changegroup version 3 outside of bundle2).
153 Aside from the filenames in the *treemanifests segment* containing a
154 trailing ``/`` character, it behaves identically to the *filelogs segment*
155 (see below). The final sub-segment is followed by an *empty chunk* (logically,
156 a sub-segment with filename size 0). This denotes the boundary to the
157 *filelogs segment*.
123
158
124 Filelogs Segment
159 Filelogs Segment
125 ================
160 ================
126
161
127 The *filelogs* segment consists of multiple sub-segments, each
162 The *filelogs segment* consists of multiple sub-segments, each
128 corresponding to an individual file whose data is being described::
163 corresponding to an individual file whose data is being described::
129
164
130 +--------------------------------------+
165 +--------------------------------------------------+
131 | | | | |
166 | | | | | |
132 | filelog0 | filelog1 | filelog2 | ... |
167 | filelog0 | filelog1 | filelog2 | ... | 0x0 |
133 | | | | |
168 | | | | | (4 bytes) |
134 +--------------------------------------+
169 | | | | | |
170 +--------------------------------------------------+
135
171
136 In version ``3`` of the changegroup format, filelogs may include
172 The final filelog sub-segment is followed by an *empty chunk* (logically,
137 directory logs when treemanifests are in use. directory logs are
173 a sub-segment with filename size 0). This denotes the end of the segment
138 identified by having a trailing '/' on their filename (see below).
174 and of the overall changegroup.
139
140 The final filelog sub-segment is followed by an *empty chunk* to denote
141 the end of the segment and the overall changegroup.
142
175
143 Each filelog sub-segment consists of the following::
176 Each filelog sub-segment consists of the following::
144
177
145 +------------------------------------------+
178 +------------------------------------------------------+
146 | | | |
179 | | | |
147 | filename size | filename | delta group |
180 | filename length | filename | delta group |
148 | (32 bits) | (various) | (various) |
181 | (4 bytes) | (<length - 4> bytes) | (various) |
149 | | | |
182 | | | |
150 +------------------------------------------+
183 +------------------------------------------------------+
151
184
152 That is, a *chunk* consisting of the filename (not terminated or padded)
185 That is, a *chunk* consisting of the filename (not terminated or padded)
153 followed by N chunks constituting the *delta group* for this file.
186 followed by N chunks constituting the *delta group* for this file. The
187 *empty chunk* at the end of each *delta group* denotes the boundary to the
188 next filelog sub-segment.
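Review note: the nesting of empty chunks in the filelogs segment (one ends each delta group, one more ends the segment) is easy to misread, so here is a self-contained sketch of a reader. It only splits the segment into filenames and raw chunk data; all function names are hypothetical and this is not Mercurial's unbundler.

```python
import io
import struct


def _read_chunk(stream):
    """Return the next chunk's data, or None for an empty chunk."""
    raw = stream.read(4)
    if len(raw) != 4:
        raise EOFError("truncated chunk length")
    (length,) = struct.unpack(">i", raw)
    if length == 0:
        return None
    if length < 4:
        raise ValueError("invalid chunk length %d" % length)
    data = stream.read(length - 4)  # the length field counts itself
    if len(data) != length - 4:
        raise EOFError("truncated chunk data")
    return data


def read_delta_group(stream):
    """Collect chunk data until the empty chunk that ends the delta group."""
    chunks = []
    while True:
        data = _read_chunk(stream)
        if data is None:
            return chunks
        chunks.append(data)


def read_filelogs_segment(stream):
    """Return [(filename, [chunk data, ...]), ...] for a filelogs segment."""
    filelogs = []
    while True:
        filename = _read_chunk(stream)
        if filename is None:  # empty chunk in filename position: segment end
            return filelogs
        filelogs.append((filename, read_delta_group(stream)))
```

Per the Treemanifests Segment text above, the same loop would also read that segment, where the names carry a trailing ``/``.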
@@ -952,37 +952,51 b' sub-topics can be accessed'
952 """"""""""""
952 """"""""""""
953
953
954 Changegroups are representations of repository revlog data, specifically
954 Changegroups are representations of repository revlog data, specifically
955 the changelog, manifest, and filelogs.
955 the changelog data, root/flat manifest data, treemanifest data, and
956 filelogs.
956
957
957 There are 3 versions of changegroups: "1", "2", and "3". From a high-
958 There are 3 versions of changegroups: "1", "2", and "3". From a high-
958 level, versions "1" and "2" are almost exactly the same, with the only
959 level, versions "1" and "2" are almost exactly the same, with the only
959 difference being a header on entries in the changeset segment. Version "3"
960 difference being an additional item in the *delta header*. Version "3"
960 adds support for exchanging treemanifests and includes revlog flags in the
961 adds support for revlog flags in the *delta header* and optionally
961 delta header.
962 exchanging treemanifests (enabled by setting an option on the
962
963 "changegroup" part in the bundle2).
963 Changegroups consists of 3 logical segments:
964
965 Changegroups when not exchanging treemanifests consist of 3 logical
966 segments:
964
967
965 +---------------------------------+
968 +---------------------------------+
966 | | | |
969 | | | |
967 | changeset | manifest | filelogs |
970 | changeset | manifest | filelogs |
968 | | | |
971 | | | |
972 | | | |
969 +---------------------------------+
973 +---------------------------------+
970
974
975 When exchanging treemanifests, there are 4 logical segments:
976
977 +-------------------------------------------------+
978 | | | | |
979 | changeset | root | treemanifests | filelogs |
980 | | manifest | | |
981 | | | | |
982 +-------------------------------------------------+
983
971 The principle building block of each segment is a *chunk*. A *chunk* is a
984 The principle building block of each segment is a *chunk*. A *chunk* is a
972 framed piece of data:
985 framed piece of data:
973
986
974 +---------------------------------------+
987 +---------------------------------------+
975 | | |
988 | | |
976 | length | data |
989 | length | data |
977 | (32 bits) | <length> bytes |
990 | (4 bytes) | (<length - 4> bytes) |
978 | | |
991 | | |
979 +---------------------------------------+
992 +---------------------------------------+
980
993
981 Each chunk starts with a 32-bit big-endian signed integer indicating the
994 All integers are big-endian signed integers. Each chunk starts with a
982 length of the raw data that follows.
995 32-bit integer indicating the length of the entire chunk (including the
983
996 length field itself).
984 There is a special case chunk that has 0 length ("0x00000000"). We call
997
985 this an *empty chunk*.
998 There is a special case chunk that has a value of 0 for the length
999 ("0x00000000"). We call this an *empty chunk*.
986
1000
987 Delta Groups
1001 Delta Groups
988 ============
1002 ============
@@ -996,26 +1010,27 b' sub-topics can be accessed'
996 +------------------------------------------------------------------------+
1010 +------------------------------------------------------------------------+
997 | | | | | |
1011 | | | | | |
998 | chunk0 length | chunk0 data | chunk1 length | chunk1 data | 0x0 |
1012 | chunk0 length | chunk0 data | chunk1 length | chunk1 data | 0x0 |
999 | (32 bits) | (various) | (32 bits) | (various) | (32 bits) |
1013 | (4 bytes) | (various) | (4 bytes) | (various) | (4 bytes) |
1000 | | | | | |
1014 | | | | | |
1001 +------------------------------------------------------------+-----------+
1015 +------------------------------------------------------------------------+
1002
1016
1003 Each *chunk*'s data consists of the following:
1017 Each *chunk*'s data consists of the following:
1004
1018
1005 +-----------------------------------------+
1019 +---------------------------------------+
1006 | | | |
1020 | | |
1007 | delta header | mdiff header | delta |
1021 | delta header | delta data |
1008 | (various) | (12 bytes) | (various) |
1022 | (various by version) | (various) |
1009 | | | |
1023 | | |
1010 +-----------------------------------------+
1024 +---------------------------------------+
1011
1025
1012 The *length* field is the byte length of the remaining 3 logical pieces of
1026 The *delta data* is a series of *delta*s that describe a diff from an
1013 data. The *delta* is a diff from an existing entry in the changelog.
1027 existing entry (either that the recipient already has, or previously
1028 specified in the bundlei/changegroup).
1014
1029
1015 The *delta header* is different between versions "1", "2", and "3" of the
1030 The *delta header* is different between versions "1", "2", and "3" of the
1016 changegroup format.
1031 changegroup format.
1017
1032
1018 Version 1:
1033 Version 1 (headerlen=80):
1019
1034
1020 +------------------------------------------------------+
1035 +------------------------------------------------------+
1021 | | | | |
1036 | | | | |
@@ -1024,7 +1039,7 b' sub-topics can be accessed'
1024 | | | | |
1039 | | | | |
1025 +------------------------------------------------------+
1040 +------------------------------------------------------+
1026
1041
1027 Version 2:
1042 Version 2 (headerlen=100):
1028
1043
1029 +------------------------------------------------------------------+
1044 +------------------------------------------------------------------+
1030 | | | | | |
1045 | | | | | |
@@ -1033,30 +1048,36 b' sub-topics can be accessed'
1033 | | | | | |
1048 | | | | | |
1034 +------------------------------------------------------------------+
1049 +------------------------------------------------------------------+
1035
1050
1036 Version 3:
1051 Version 3 (headerlen=102):
1037
1052
1038 +------------------------------------------------------------------------------+
1053 +------------------------------------------------------------------------------+
1039 | | | | | | |
1054 | | | | | | |
1040 | node | p1 node | p2 node | base node | link node | flags |
1055 | node | p1 node | p2 node | base node | link node | flags |
1041 | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (2 bytes) |
1056 | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (2 bytes) |
1042 | | | | | | |
1057 | | | | | | |
1043 +------------------------------------------------------------------------------+
1058 +------------------------------------------------------------------------------+
1044
1059
1045 The *mdiff header* consists of 3 32-bit big-endian signed integers
1060 The *delta data* consists of "chunklen - 4 - headerlen" bytes, which
1046 describing offsets at which to apply the following delta content:
1061 contain a series of *delta*s, densely packed (no separators). These deltas
1047
1062 describe a diff from an existing entry (either that the recipient already
1048 +-------------------------------------+
1063 has, or previously specified in the bundle/changegroup). The format is
1049 | | | |
1064 described more fully in "hg help internals.bdiff", but briefly:
1050 | offset | old length | new length |
1065
1051 | (32 bits) | (32 bits) | (32 bits) |
1066 +---------------------------------------------------------------+ |
1052 | | | |
1067 | | | | | start offset | end
1053 +-------------------------------------+
1068 offset | new length | content | | (4 bytes) | (4
1069 bytes) | (4 bytes) | (<new length> bytes) | | |
1070 | | |
1071 +---------------------------------------------------------------+
1072
1073 Please note that the length field in the delta data does *not* include
1074 itself.
1054
1075
1055 In version 1, the delta is always applied against the previous node from
1076 In version 1, the delta is always applied against the previous node from
1056 the changegroup or the first parent if this is the first entry in the
1077 the changegroup or the first parent if this is the first entry in the
1057 changegroup.
1078 changegroup.
1058
1079
1059 In version 2, the delta base node is encoded in the entry in the
1080 In version 2 and up, the delta base node is encoded in the entry in the
1060 changegroup. This allows the delta to be expressed against any parent,
1081 changegroup. This allows the delta to be expressed against any parent,
1061 which can result in smaller deltas and more efficient encoding of data.
1082 which can result in smaller deltas and more efficient encoding of data.
1062
1083
@@ -1064,46 +1085,61 b' sub-topics can be accessed'
1064 =================
1085 =================
1065
1086
1066 The *changeset segment* consists of a single *delta group* holding
1087 The *changeset segment* consists of a single *delta group* holding
1067 changelog data. It is followed by an *empty chunk* to denote the boundary
1088 changelog data. The *empty chunk* at the end of the *delta group* denotes
1068 to the *manifests segment*.
1089 the boundary to the *manifest segment*.
1069
1090
1070 Manifest Segment
1091 Manifest Segment
1071 ================
1092 ================
1072
1093
1073 The *manifest segment* consists of a single *delta group* holding manifest
1094 The *manifest segment* consists of a single *delta group* holding manifest
1074 data. It is followed by an *empty chunk* to denote the boundary to the
1095 data. If treemanifests are in use, it contains only the manifest for the
1075 *filelogs segment*.
1096 root directory of the repository. Otherwise, it contains the entire
1097 manifest data. The *empty chunk* at the end of the *delta group* denotes
1098 the boundary to the next segment (either the *treemanifests segment* or
1099 the *filelogs segment*, depending on version and the request options).
1100
1101 Treemanifests Segment
1102 ---------------------
1103
1104 The *treemanifests segment* only exists in changegroup version "3", and
1105 only if the 'treemanifest' param is part of the bundle2 changegroup part
1106 (it is not possible to use changegroup version 3 outside of bundle2).
1107 Aside from the filenames in the *treemanifests segment* containing a
1108 trailing "/" character, it behaves identically to the *filelogs segment*
1109 (see below). The final sub-segment is followed by an *empty chunk*
1110 (logically, a sub-segment with filename size 0). This denotes the boundary
1111 to the *filelogs segment*.
1076
1112
1077 Filelogs Segment
1113 Filelogs Segment
1078 ================
1114 ================
1079
1115
1080 The *filelogs* segment consists of multiple sub-segments, each
1116 The *filelogs segment* consists of multiple sub-segments, each
1081 corresponding to an individual file whose data is being described:
1117 corresponding to an individual file whose data is being described:
1082
1118
1083 +--------------------------------------+
1119 +--------------------------------------------------+
1084 | | | | |
1120 | | | | | |
1085 | filelog0 | filelog1 | filelog2 | ... |
1121 | filelog0 | filelog1 | filelog2 | ... | 0x0 |
1086 | | | | |
1122 | | | | | (4 bytes) |
1087 +--------------------------------------+
1123 | | | | | |
1088
1124 +--------------------------------------------------+
1089 In version "3" of the changegroup format, filelogs may include directory
1125
1090 logs when treemanifests are in use. directory logs are identified by
1126 The final filelog sub-segment is followed by an *empty chunk* (logically,
1091 having a trailing '/' on their filename (see below).
1127 a sub-segment with filename size 0). This denotes the end of the segment
1092
1128 and of the overall changegroup.
1093 The final filelog sub-segment is followed by an *empty chunk* to denote
1094 the end of the segment and the overall changegroup.
1095
1129
1096 Each filelog sub-segment consists of the following:
1130 Each filelog sub-segment consists of the following:
1097
1131
1098 +------------------------------------------+
1132 +------------------------------------------------------+
1099 | | | |
1133 | | | |
1100 | filename size | filename | delta group |
1134 | filename length | filename | delta group |
1101 | (32 bits) | (various) | (various) |
1135 | (4 bytes) | (<length - 4> bytes) | (various) |
1102 | | | |
1136 | | | |
1103 +------------------------------------------+
1137 +------------------------------------------------------+
1104
1138
1105 That is, a *chunk* consisting of the filename (not terminated or padded)
1139 That is, a *chunk* consisting of the filename (not terminated or padded)
1106 followed by N chunks constituting the *delta group* for this file.
1140 followed by N chunks constituting the *delta group* for this file. The
1141 *empty chunk* at the end of each *delta group* denotes the boundary to the
1142 next filelog sub-segment.
1107
1143
1108 Test list of commands with command with no help text
1144 Test list of commands with command with no help text
1109
1145
@@ -3031,26 +3067,41 b' Sub-topic topics rendered properly'
3031 <h1>Changegroups</h1>
3067 <h1>Changegroups</h1>
3032 <p>
3068 <p>
3033 Changegroups are representations of repository revlog data, specifically
3069 Changegroups are representations of repository revlog data, specifically
3034 the changelog, manifest, and filelogs.
3070 the changelog data, root/flat manifest data, treemanifest data, and
3071 filelogs.
3035 </p>
3072 </p>
3036 <p>
3073 <p>
3037 There are 3 versions of changegroups: &quot;1&quot;, &quot;2&quot;, and &quot;3&quot;. From a
3074 There are 3 versions of changegroups: &quot;1&quot;, &quot;2&quot;, and &quot;3&quot;. From a
3038 high-level, versions &quot;1&quot; and &quot;2&quot; are almost exactly the same, with
3075 high-level, versions &quot;1&quot; and &quot;2&quot; are almost exactly the same, with the
3039 the only difference being a header on entries in the changeset
3076 only difference being an additional item in the *delta header*. Version
3040 segment. Version &quot;3&quot; adds support for exchanging treemanifests and
3077 &quot;3&quot; adds support for revlog flags in the *delta header* and optionally
3041 includes revlog flags in the delta header.
3078 exchanging treemanifests (enabled by setting an option on the
3079 &quot;changegroup&quot; part in the bundle2).
3042 </p>
3080 </p>
3043 <p>
3081 <p>
3044 Changegroups consists of 3 logical segments:
3082 Changegroups when not exchanging treemanifests consist of 3 logical
3083 segments:
3045 </p>
3084 </p>
3046 <pre>
3085 <pre>
3047 +---------------------------------+
3086 +---------------------------------+
3048 | | | |
3087 | | | |
3049 | changeset | manifest | filelogs |
3088 | changeset | manifest | filelogs |
3050 | | | |
3089 | | | |
3090 | | | |
3051 +---------------------------------+
3091 +---------------------------------+
3052 </pre>
3092 </pre>
3053 <p>
3093 <p>
3094 When exchanging treemanifests, there are 4 logical segments:
3095 </p>
3096 <pre>
3097 +-------------------------------------------------+
3098 | | | | |
3099 | changeset | root | treemanifests | filelogs |
3100 | | manifest | | |
3101 | | | | |
3102 +-------------------------------------------------+
3103 </pre>
3104 <p>
3054 The principle building block of each segment is a *chunk*. A *chunk*
3105 The principle building block of each segment is a *chunk*. A *chunk*
3055 is a framed piece of data:
3106 is a framed piece of data:
3056 </p>
3107 </p>
@@ -3058,17 +3109,18 b' Sub-topic topics rendered properly'
3058 +---------------------------------------+
3109 +---------------------------------------+
3059 | | |
3110 | | |
3060 | length | data |
3111 | length | data |
3061 | (32 bits) | &lt;length&gt; bytes |
3112 | (4 bytes) | (&lt;length - 4&gt; bytes) |
3062 | | |
3113 | | |
3063 +---------------------------------------+
3114 +---------------------------------------+
3064 </pre>
3115 </pre>
3065 <p>
3116 <p>
3066 Each chunk starts with a 32-bit big-endian signed integer indicating
3117 All integers are big-endian signed integers. Each chunk starts with a 32-bit
3067 the length of the raw data that follows.
3118 integer indicating the length of the entire chunk (including the length field
3119 itself).
3068 </p>
3120 </p>
3069 <p>
3121 <p>
3070 There is a special case chunk that has 0 length (&quot;0x00000000&quot;). We
3122 There is a special case chunk that has a value of 0 for the length
3071 call this an *empty chunk*.
3123 (&quot;0x00000000&quot;). We call this an *empty chunk*.
3072 </p>
3124 </p>
3073 <h2>Delta Groups</h2>
3125 <h2>Delta Groups</h2>
3074 <p>
3126 <p>
@@ -3083,31 +3135,32 b' Sub-topic topics rendered properly'
3083 +------------------------------------------------------------------------+
3135 +------------------------------------------------------------------------+
3084 | | | | | |
3136 | | | | | |
3085 | chunk0 length | chunk0 data | chunk1 length | chunk1 data | 0x0 |
3137 | chunk0 length | chunk0 data | chunk1 length | chunk1 data | 0x0 |
3086 | (32 bits) | (various) | (32 bits) | (various) | (32 bits) |
3138 | (4 bytes) | (various) | (4 bytes) | (various) | (4 bytes) |
3087 | | | | | |
3139 | | | | | |
3088 +------------------------------------------------------------+-----------+
3140 +------------------------------------------------------------------------+
3089 </pre>
3141 </pre>
3090 <p>
3142 <p>
3091 Each *chunk*'s data consists of the following:
3143 Each *chunk*'s data consists of the following:
3092 </p>
3144 </p>
3093 <pre>
3145 <pre>
3094 +-----------------------------------------+
3146 +---------------------------------------+
3095 | | | |
3147 | | |
3096 | delta header | mdiff header | delta |
3148 | delta header | delta data |
3097 | (various) | (12 bytes) | (various) |
3149 | (various by version) | (various) |
3098 | | | |
3150 | | |
3099 +-----------------------------------------+
3151 +---------------------------------------+
3100 </pre>
3152 </pre>
3101 <p>
3153 <p>
3102 The *length* field is the byte length of the remaining 3 logical pieces
3154 The *delta data* is a series of *delta*s that describe a diff from an existing
3103 of data. The *delta* is a diff from an existing entry in the changelog.
3155 entry (either that the recipient already has, or previously specified in the
3156 bundlei/changegroup).
3104 </p>
3157 </p>
3105 <p>
3158 <p>
3106 The *delta header* is different between versions &quot;1&quot;, &quot;2&quot;, and
3159 The *delta header* is different between versions &quot;1&quot;, &quot;2&quot;, and
3107 &quot;3&quot; of the changegroup format.
3160 &quot;3&quot; of the changegroup format.
3108 </p>
3161 </p>
3109 <p>
3162 <p>
3110 Version 1:
3163 Version 1 (headerlen=80):
3111 </p>
3164 </p>
3112 <pre>
3165 <pre>
3113 +------------------------------------------------------+
3166 +------------------------------------------------------+
@@ -3118,7 +3171,7 b' Sub-topic topics rendered properly'
3118 +------------------------------------------------------+
3171 +------------------------------------------------------+
3119 </pre>
3172 </pre>
3120 <p>
3173 <p>
3121 Version 2:
3174 Version 2 (headerlen=100):
3122 </p>
3175 </p>
3123 <pre>
3176 <pre>
3124 +------------------------------------------------------------------+
3177 +------------------------------------------------------------------+
@@ -3129,85 +3182,104 b' Sub-topic topics rendered properly'
3129 +------------------------------------------------------------------+
3182 +------------------------------------------------------------------+
3130 </pre>
3183 </pre>
3131 <p>
3184 <p>
3132 Version 3:
3185 Version 3 (headerlen=102):
3133 </p>
3186 </p>
3134 <pre>
3187 <pre>
3135 +------------------------------------------------------------------------------+
3188 +------------------------------------------------------------------------------+
3136 | | | | | | |
3189 | | | | | | |
3137 | node | p1 node | p2 node | base node | link node | flags |
3190 | node | p1 node | p2 node | base node | link node | flags |
3138 | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (2 bytes) |
3191 | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (2 bytes) |
3139 | | | | | | |
3192 | | | | | | |
3140 +------------------------------------------------------------------------------+
3193 +------------------------------------------------------------------------------+
3141 </pre>
3194 </pre>
3142 <p>
3195 <p>
3143 The *mdiff header* consists of 3 32-bit big-endian signed integers
3196 The *delta data* consists of &quot;chunklen - 4 - headerlen&quot; bytes, which contain a
3144 describing offsets at which to apply the following delta content:
3197 series of *delta*s, densely packed (no separators). These deltas describe a diff
3198 from an existing entry (either that the recipient already has, or previously
3199 specified in the bundle/changegroup). The format is described more fully in
3200 &quot;hg help internals.bdiff&quot;, but briefly:
3145 </p>
3201 </p>
3146 <pre>
3202 <p>
3147 +-------------------------------------+
3203 +---------------------------------------------------------------+
3148 | | | |
3204 | | | | |
3149 | offset | old length | new length |
3205 | start offset | end offset | new length | content |
3150 | (32 bits) | (32 bits) | (32 bits) |
3206 | (4 bytes) | (4 bytes) | (4 bytes) | (&lt;new length&gt; bytes) |
3151 | | | |
3207 | | | | |
3152 +-------------------------------------+
3208 +---------------------------------------------------------------+
3153 </pre>
3209 </p>
3210 <p>
3211 Please note that the length field in the delta data does *not* include itself.
3212 </p>
3154 <p>
3213 <p>
3155 In version 1, the delta is always applied against the previous node from
3214 In version 1, the delta is always applied against the previous node from
3156 the changegroup or the first parent if this is the first entry in the
3215 the changegroup or the first parent if this is the first entry in the
3157 changegroup.
3216 changegroup.
3158 </p>
3217 </p>
3159 <p>
3218 <p>
3160 In version 2, the delta base node is encoded in the entry in the
3219 In version 2 and up, the delta base node is encoded in the entry in the
3161 changegroup. This allows the delta to be expressed against any parent,
3220 changegroup. This allows the delta to be expressed against any parent,
3162 which can result in smaller deltas and more efficient encoding of data.
3221 which can result in smaller deltas and more efficient encoding of data.
3163 </p>
3222 </p>
3164 <h2>Changeset Segment</h2>
3223 <h2>Changeset Segment</h2>
3165 <p>
3224 <p>
3166 The *changeset segment* consists of a single *delta group* holding
3225 The *changeset segment* consists of a single *delta group* holding
3167 changelog data. It is followed by an *empty chunk* to denote the
3226 changelog data. The *empty chunk* at the end of the *delta group* denotes
3168 boundary to the *manifests segment*.
3227 the boundary to the *manifest segment*.
3169 </p>
3228 </p>
3170 <h2>Manifest Segment</h2>
3229 <h2>Manifest Segment</h2>
3171 <p>
3230 <p>
3172 The *manifest segment* consists of a single *delta group* holding
3231 The *manifest segment* consists of a single *delta group* holding manifest
3173 manifest data. It is followed by an *empty chunk* to denote the boundary
3232 data. If treemanifests are in use, it contains only the manifest for the
3174 to the *filelogs segment*.
3233 root directory of the repository. Otherwise, it contains the entire
3234 manifest data. The *empty chunk* at the end of the *delta group* denotes
3235 the boundary to the next segment (either the *treemanifests segment* or the
3236 *filelogs segment*, depending on version and the request options).
3237 </p>
3238 <h3>Treemanifests Segment</h3>
3239 <p>
3240 The *treemanifests segment* only exists in changegroup version &quot;3&quot;, and
3241 only if the 'treemanifest' param is part of the bundle2 changegroup part
3242 (it is not possible to use changegroup version 3 outside of bundle2).
3243 Aside from the filenames in the *treemanifests segment* containing a
3244 trailing &quot;/&quot; character, it behaves identically to the *filelogs segment*
3245 (see below). The final sub-segment is followed by an *empty chunk* (logically,
3246 a sub-segment with filename size 0). This denotes the boundary to the
3247 *filelogs segment*.
3175 </p>
3248 </p>
3176 <h2>Filelogs Segment</h2>
3249 <h2>Filelogs Segment</h2>
3177 <p>
3250 <p>
3178 The *filelogs* segment consists of multiple sub-segments, each
3251 The *filelogs segment* consists of multiple sub-segments, each
3179 corresponding to an individual file whose data is being described:
3252 corresponding to an individual file whose data is being described:
3180 </p>
3253 </p>
3181 <pre>
3254 <pre>
3182 +--------------------------------------+
3255 +--------------------------------------------------+
3183 | | | | |
3256 | | | | | |
3184 | filelog0 | filelog1 | filelog2 | ... |
3257 | filelog0 | filelog1 | filelog2 | ... | 0x0 |
3185 | | | | |
3258 | | | | | (4 bytes) |
3186 +--------------------------------------+
3259 | | | | | |
3260 +--------------------------------------------------+
3187 </pre>
3261 </pre>
3188 <p>
3262 <p>
3189 In version &quot;3&quot; of the changegroup format, filelogs may include
3263 The final filelog sub-segment is followed by an *empty chunk* (logically,
3190 directory logs when treemanifests are in use. directory logs are
3264 a sub-segment with filename size 0). This denotes the end of the segment
3191 identified by having a trailing '/' on their filename (see below).
3265 and of the overall changegroup.
3192 </p>
3193 <p>
3194 The final filelog sub-segment is followed by an *empty chunk* to denote
3195 the end of the segment and the overall changegroup.
3196 </p>
3266 </p>
3197 <p>
3267 <p>
3198 Each filelog sub-segment consists of the following:
3268 Each filelog sub-segment consists of the following:
3199 </p>
3269 </p>
3200 <pre>
3270 <pre>
3201 +------------------------------------------+
3271 +------------------------------------------------------+
3202 | | | |
3272 | | | |
3203 | filename size | filename | delta group |
3273 | filename length | filename | delta group |
3204 | (32 bits) | (various) | (various) |
3274 | (4 bytes) | (&lt;length - 4&gt; bytes) | (various) |
3205 | | | |
3275 | | | |
3206 +------------------------------------------+
3276 +------------------------------------------------------+
3207 </pre>
3277 </pre>
3208 <p>
3278 <p>
3209 That is, a *chunk* consisting of the filename (not terminated or padded)
3279 That is, a *chunk* consisting of the filename (not terminated or padded)
3210 followed by N chunks constituting the *delta group* for this file.
3280 followed by N chunks constituting the *delta group* for this file. The
3281 *empty chunk* at the end of each *delta group* denotes the boundary to the
3282 next filelog sub-segment.
3211 </p>
3283 </p>
3212
3284
3213 </div>
3285 </div>
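
The revised help text above pins down the header length for each changegroup version (80, 100, and 102 bytes) and describes the delta data as densely packed (start offset, end offset, new length, content) records. The following is a minimal Python sketch of that layout, not Mercurial's own implementation. The names `DELTA_HEADERS`, `HUNK_HEADER`, `split_chunk`, `iter_hunks`, and `apply_delta` are hypothetical. The field order for versions 1 and 2 is an assumption inferred from the stated header lengths; only the version 3 layout is shown in full in this hunk. The sketch also assumes the three integers are big-endian and signed, as the previous wording of the help stated, and that all offsets refer to the base text with hunks in ascending order.

```python
import struct

# Header sizes match the help text (80, 100, 102 bytes). The field order for
# versions 1 and 2 is an assumption; version 3 follows the diagram above.
DELTA_HEADERS = {
    1: struct.Struct(">20s20s20s20s"),      # node, p1, p2, link node
    2: struct.Struct(">20s20s20s20s20s"),   # node, p1, p2, base node, link node
    3: struct.Struct(">20s20s20s20s20sH"),  # as version 2, plus 2-byte flags
}

# start offset, end offset, new length: three 4-byte big-endian integers
# (signedness assumed from the earlier wording of the help text).
HUNK_HEADER = struct.Struct(">lll")


def split_chunk(version, payload):
    """Split a chunk payload (the bytes after the 4-byte chunk length)
    into its delta header fields and the remaining delta data."""
    header = DELTA_HEADERS[version]
    fields = header.unpack_from(payload, 0)
    return fields, payload[header.size:]


def iter_hunks(delta_data):
    """Yield (start, end, content) for each densely packed delta."""
    pos = 0
    while pos < len(delta_data):
        start, end, length = HUNK_HEADER.unpack_from(delta_data, pos)
        pos += HUNK_HEADER.size
        # The length counts only the content, not the 12-byte hunk header.
        yield start, end, delta_data[pos:pos + length]
        pos += length


def apply_delta(base, delta_data):
    """Replace base[start:end] with content for every hunk, in order."""
    out = []
    last = 0
    for start, end, content in iter_hunks(delta_data):
        out.append(base[last:start])
        out.append(content)
        last = end
    out.append(base[last:])
    return b"".join(out)
```

For example, a single hunk built with `HUNK_HEADER.pack(6, 11, 5) + b"there"` replaces bytes 6 through 10 of `b"hello world"`. Because each length field excludes itself, a reader advances by 12 bytes plus the new length per hunk.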