help: fix internals.changegroups...
Kyle Lippincott -
r31213:9f169b7f default
@@ -1,35 +1,49 b''
1 1 Changegroups are representations of repository revlog data, specifically
2 the changelog, manifest, and filelogs.
2 the changelog data, root/flat manifest data, treemanifest data, and
3 filelogs.
3 4
4 5 There are 3 versions of changegroups: ``1``, ``2``, and ``3``. From a
5 high-level, versions ``1`` and ``2`` are almost exactly the same, with
6 the only difference being a header on entries in the changeset
7 segment. Version ``3`` adds support for exchanging treemanifests and
8 includes revlog flags in the delta header.
6 high-level, versions ``1`` and ``2`` are almost exactly the same, with the
7 only difference being an additional item in the *delta header*. Version
8 ``3`` adds support for revlog flags in the *delta header* and optionally
9 exchanging treemanifests (enabled by setting an option on the
10 ``changegroup`` part in the bundle2).
9 11
10 Changegroups consists of 3 logical segments::
12 Changegroups when not exchanging treemanifests consist of 3 logical
13 segments::
11 14
12 15 +---------------------------------+
13 16 | | | |
14 17 | changeset | manifest | filelogs |
15 18 | | | |
19 | | | |
16 20 +---------------------------------+
17 21
22 When exchanging treemanifests, there are 4 logical segments::
23
24 +-------------------------------------------------+
25 | | | | |
26 | changeset | root | treemanifests | filelogs |
27 | | manifest | | |
28 | | | | |
29 +-------------------------------------------------+
30
18 31 The principle building block of each segment is a *chunk*. A *chunk*
19 32 is a framed piece of data::
20 33
21 34 +---------------------------------------+
22 35 | | |
23 36 | length | data |
24 | (32 bits) | <length> bytes |
37 | (4 bytes) | (<length - 4> bytes) |
25 38 | | |
26 39 +---------------------------------------+
27 40
28 Each chunk starts with a 32-bit big-endian signed integer indicating
29 the length of the raw data that follows.
41 All integers are big-endian signed integers. Each chunk starts with a 32-bit
42 integer indicating the length of the entire chunk (including the length field
43 itself).
30 44
31 There is a special case chunk that has 0 length (``0x00000000``). We
32 call this an *empty chunk*.
45 There is a special case chunk that has a value of 0 for the length
46 (``0x00000000``). We call this an *empty chunk*.
33 47
34 48 Delta Groups
35 49 ============
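Note on the hunk above: the revised text says the 32-bit length counts the whole chunk, including the 4-byte length field, and that a length of 0 marks an *empty chunk*. A minimal Python sketch of that framing follows. The names `read_chunk` and `frame_chunk` are hypothetical and are not Mercurial's API. Rejecting lengths 1 through 4 is an assumption; the help text does not say how such values should be handled.

```python
import io
import struct


def read_chunk(stream):
    """Return the data of the next chunk, or None for an empty chunk."""
    raw = stream.read(4)
    if len(raw) < 4:
        raise EOFError("stream ended inside a chunk length field")
    # 32-bit big-endian signed integer, as stated in the help text.
    (length,) = struct.unpack(">l", raw)
    if length == 0:
        return None  # empty chunk (0x00000000)
    if length <= 4:
        # Assumption: a non-zero length that leaves no room for data is invalid.
        raise ValueError("invalid chunk length: %d" % length)
    data = stream.read(length - 4)  # the length includes its own 4 bytes
    if len(data) != length - 4:
        raise EOFError("stream ended inside chunk data")
    return data


def frame_chunk(data):
    """Prefix data with a length field that counts itself."""
    return struct.pack(">l", len(data) + 4) + data
```

Framing the 3-byte payload `b"abc"` therefore yields a length field of 7, not 3, which is the off-by-four the old wording invited.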
@@ -43,26 +57,27 b' to signal the end of the delta group::'
43 57 +------------------------------------------------------------------------+
44 58 | | | | | |
45 59 | chunk0 length | chunk0 data | chunk1 length | chunk1 data | 0x0 |
46 | (32 bits) | (various) | (32 bits) | (various) | (32 bits) |
60 | (4 bytes) | (various) | (4 bytes) | (various) | (4 bytes) |
47 61 | | | | | |
48 +------------------------------------------------------------+-----------+
62 +------------------------------------------------------------------------+
49 63
50 64 Each *chunk*'s data consists of the following::
51 65
52 +-----------------------------------------+
53 | | | |
54 | delta header | mdiff header | delta |
55 | (various) | (12 bytes) | (various) |
56 | | | |
57 +-----------------------------------------+
66 +---------------------------------------+
67 | | |
68 | delta header | delta data |
69 | (various by version) | (various) |
70 | | |
71 +---------------------------------------+
58 72
59 The *length* field is the byte length of the remaining 3 logical pieces
60 of data. The *delta* is a diff from an existing entry in the changelog.
73 The *delta data* is a series of *delta*s that describe a diff from an existing
74 entry (either that the recipient already has, or previously specified in the
75 bundle/changegroup).
61 76
62 77 The *delta header* is different between versions ``1``, ``2``, and
63 78 ``3`` of the changegroup format.
64 79
65 Version 1::
80 Version 1 (headerlen=80)::
66 81
67 82 +------------------------------------------------------+
68 83 | | | | |
@@ -71,7 +86,7 b' Version 1::'
71 86 | | | | |
72 87 +------------------------------------------------------+
73 88
74 Version 2::
89 Version 2 (headerlen=100)::
75 90
76 91 +------------------------------------------------------------------+
77 92 | | | | | |
@@ -80,30 +95,35 b' Version 2::'
80 95 | | | | | |
81 96 +------------------------------------------------------------------+
82 97
83 Version 3::
98 Version 3 (headerlen=102)::
84 99
85 100 +------------------------------------------------------------------------------+
86 101 | | | | | | |
87 | node | p1 node | p2 node | base node | link node | flags |
102 | node | p1 node | p2 node | base node | link node | flags |
88 103 | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (2 bytes) |
89 104 | | | | | | |
90 105 +------------------------------------------------------------------------------+
91 106
92 The *mdiff header* consists of 3 32-bit big-endian signed integers
93 describing offsets at which to apply the following delta content::
107 The *delta data* consists of ``chunklen - 4 - headerlen`` bytes, which contain a
108 series of *delta*s, densely packed (no separators). These deltas describe a diff
109 from an existing entry (either that the recipient already has, or previously
110 specified in the bundle/changegroup). The format is described more fully in
111 ``hg help internals.bdiff``, but briefly:
94 112
95 +-------------------------------------+
96 | | | |
97 | offset | old length | new length |
98 | (32 bits) | (32 bits) | (32 bits) |
99 | | | |
100 +-------------------------------------+
113 +---------------------------------------------------------------+
114 | | | | |
115 | start offset | end offset | new length | content |
116 | (4 bytes) | (4 bytes) | (4 bytes) | (<new length> bytes) |
117 | | | | |
118 +---------------------------------------------------------------+
119
120 Please note that the length field in the delta data does *not* include itself.
101 121
102 122 In version 1, the delta is always applied against the previous node from
103 123 the changegroup or the first parent if this is the first entry in the
104 124 changegroup.
105 125
106 In version 2, the delta base node is encoded in the entry in the
126 In version 2 and up, the delta base node is encoded in the entry in the
107 127 changegroup. This allows the delta to be expressed against any parent,
108 128 which can result in smaller deltas and more efficient encoding of data.
109 129
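Note on the hunk above: the new wording gives fixed header sizes per version (80, 100, 102) and describes the *delta data* as densely packed fragments of `start offset`, `end offset`, `new length`, and `content`. The sketch below splits a chunk's data into header and fragments and applies the fragments to a base text. All function names are hypothetical. The field order for versions 1 and 2 is not visible in these hunks and is assumed to be the version 3 layout without the trailing fields. The application rule (offsets refer to the base text, fragments arrive in ascending order) is also an assumption; `hg help internals.bdiff` is the authority.

```python
import struct

# Assumed layouts; only the version 3 field order appears in the hunk above.
HEADER_FORMATS = {
    1: (">20s20s20s20s", ("node", "p1", "p2", "linknode")),
    2: (">20s20s20s20s20s", ("node", "p1", "p2", "basenode", "linknode")),
    # Flags are read as unsigned here because they act as a bit field.
    3: (">20s20s20s20s20sH",
        ("node", "p1", "p2", "basenode", "linknode", "flags")),
}


def split_chunk(version, data):
    """Split chunk data into (header dict, delta data bytes)."""
    fmt, names = HEADER_FORMATS[version]
    headerlen = struct.calcsize(fmt)  # 80, 100 or 102
    header = dict(zip(names, struct.unpack(fmt, data[:headerlen])))
    return header, data[headerlen:]


def iter_fragments(delta_data):
    """Yield (start, end, content) for each densely packed fragment."""
    pos = 0
    while pos < len(delta_data):
        start, end, newlen = struct.unpack_from(">lll", delta_data, pos)
        pos += 12  # three 4-byte integers
        content = delta_data[pos:pos + newlen]  # newlen excludes the header
        if len(content) != newlen:
            raise ValueError("truncated delta fragment")
        pos += newlen
        yield start, end, content


def apply_fragments(base, delta_data):
    """Replace base[start:end] with content for every fragment."""
    out = []
    last = 0
    for start, end, content in iter_fragments(delta_data):
        out.append(base[last:start])  # unchanged bytes before the fragment
        out.append(content)
        last = end
    out.append(base[last:])  # unchanged tail
    return b"".join(out)
```

Unlike the chunk length, the `new length` field here does not count itself, which is why the fragment header is skipped with a fixed 12 bytes before slicing `content`.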
@@ -111,43 +131,58 b' Changeset Segment'
111 131 =================
112 132
113 133 The *changeset segment* consists of a single *delta group* holding
114 changelog data. It is followed by an *empty chunk* to denote the
115 boundary to the *manifests segment*.
134 changelog data. The *empty chunk* at the end of the *delta group* denotes
135 the boundary to the *manifest segment*.
116 136
117 137 Manifest Segment
118 138 ================
119 139
120 The *manifest segment* consists of a single *delta group* holding
121 manifest data. It is followed by an *empty chunk* to denote the boundary
122 to the *filelogs segment*.
140 The *manifest segment* consists of a single *delta group* holding manifest
141 data. If treemanifests are in use, it contains only the manifest for the
142 root directory of the repository. Otherwise, it contains the entire
143 manifest data. The *empty chunk* at the end of the *delta group* denotes
144 the boundary to the next segment (either the *treemanifests segment* or the
145 *filelogs segment*, depending on version and the request options).
146
147 Treemanifests Segment
148 ---------------------
149
150 The *treemanifests segment* only exists in changegroup version ``3``, and
151 only if the 'treemanifest' param is part of the bundle2 changegroup part
152 (it is not possible to use changegroup version 3 outside of bundle2).
153 Aside from the filenames in the *treemanifests segment* containing a
154 trailing ``/`` character, it behaves identically to the *filelogs segment*
155 (see below). The final sub-segment is followed by an *empty chunk* (logically,
156 a sub-segment with filename size 0). This denotes the boundary to the
157 *filelogs segment*.
123 158
124 159 Filelogs Segment
125 160 ================
126 161
127 The *filelogs* segment consists of multiple sub-segments, each
162 The *filelogs segment* consists of multiple sub-segments, each
128 163 corresponding to an individual file whose data is being described::
129 164
130 +--------------------------------------+
131 | | | | |
132 | filelog0 | filelog1 | filelog2 | ... |
133 | | | | |
134 +--------------------------------------+
165 +--------------------------------------------------+
166 | | | | | |
167 | filelog0 | filelog1 | filelog2 | ... | 0x0 |
168 | | | | | (4 bytes) |
169 | | | | | |
170 +--------------------------------------------------+
135 171
136 In version ``3`` of the changegroup format, filelogs may include
137 directory logs when treemanifests are in use. directory logs are
138 identified by having a trailing '/' on their filename (see below).
139
140 The final filelog sub-segment is followed by an *empty chunk* to denote
141 the end of the segment and the overall changegroup.
172 The final filelog sub-segment is followed by an *empty chunk* (logically,
173 a sub-segment with filename size 0). This denotes the end of the segment
174 and of the overall changegroup.
142 175
143 176 Each filelog sub-segment consists of the following::
144 177
145 +------------------------------------------+
146 | | | |
147 | filename size | filename | delta group |
148 | (32 bits) | (various) | (various) |
149 | | | |
150 +------------------------------------------+
178 +------------------------------------------------------+
179 | | | |
180 | filename length | filename | delta group |
181 | (4 bytes) | (<length - 4> bytes) | (various) |
182 | | | |
183 +------------------------------------------------------+
151 184
152 185 That is, a *chunk* consisting of the filename (not terminated or padded)
153 followed by N chunks constituting the *delta group* for this file.
186 followed by N chunks constituting the *delta group* for this file. The
187 *empty chunk* at the end of each *delta group* denotes the boundary to the
188 next filelog sub-segment.
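Note on the segment descriptions above: every *delta group* ends with an *empty chunk*, and the filelogs (and treemanifests) segment ends with one more *empty chunk* where the next filename would be. A rough Python sketch of walking a whole changegroup under those rules follows. The function names and the `treemanifests` argument are hypothetical; in practice that choice comes from the bundle2 part parameters, which this sketch does not parse. Chunks are returned as raw bytes without decoding delta headers.

```python
import io
import struct


def read_chunk(stream):
    """Return the next chunk's data, or None for an empty chunk."""
    raw = stream.read(4)
    if len(raw) < 4:
        raise EOFError("stream ended inside a chunk length field")
    (length,) = struct.unpack(">l", raw)
    if length == 0:
        return None
    if length <= 4:
        # Assumption: lengths 1..4 cannot describe a valid chunk.
        raise ValueError("invalid chunk length: %d" % length)
    data = stream.read(length - 4)
    if len(data) != length - 4:
        raise EOFError("stream ended inside chunk data")
    return data


def read_delta_group(stream):
    """Collect chunks until the empty chunk that closes the group."""
    chunks = []
    while True:
        chunk = read_chunk(stream)
        if chunk is None:
            return chunks
        chunks.append(chunk)


def read_named_groups(stream):
    """Read (filename, delta group) pairs until an empty filename chunk."""
    groups = []
    while True:
        name = read_chunk(stream)
        if name is None:
            return groups
        groups.append((name, read_delta_group(stream)))


def read_changegroup(stream, treemanifests=False):
    """Walk the 3 (or, with treemanifests, 4) logical segments in order."""
    result = {
        "changeset": read_delta_group(stream),
        "manifest": read_delta_group(stream),
    }
    if treemanifests:
        # Same shape as filelogs; names carry a trailing "/".
        result["treemanifests"] = read_named_groups(stream)
    result["filelogs"] = read_named_groups(stream)
    return result
```

A changegroup with one file thus ends with two consecutive empty chunks: one closing that file's delta group and one closing the filelogs segment.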
@@ -952,37 +952,51 b' sub-topics can be accessed'
952 952 """"""""""""
953 953
954 954 Changegroups are representations of repository revlog data, specifically
955 the changelog, manifest, and filelogs.
955 the changelog data, root/flat manifest data, treemanifest data, and
956 filelogs.
956 957
957 958 There are 3 versions of changegroups: "1", "2", and "3". From a high-
958 959 level, versions "1" and "2" are almost exactly the same, with the only
959 difference being a header on entries in the changeset segment. Version "3"
960 adds support for exchanging treemanifests and includes revlog flags in the
961 delta header.
962
963 Changegroups consists of 3 logical segments:
960 difference being an additional item in the *delta header*. Version "3"
961 adds support for revlog flags in the *delta header* and optionally
962 exchanging treemanifests (enabled by setting an option on the
963 "changegroup" part in the bundle2).
964
965 Changegroups when not exchanging treemanifests consist of 3 logical
966 segments:
964 967
965 968 +---------------------------------+
966 969 | | | |
967 970 | changeset | manifest | filelogs |
968 971 | | | |
972 | | | |
969 973 +---------------------------------+
970 974
975 When exchanging treemanifests, there are 4 logical segments:
976
977 +-------------------------------------------------+
978 | | | | |
979 | changeset | root | treemanifests | filelogs |
980 | | manifest | | |
981 | | | | |
982 +-------------------------------------------------+
983
971 984 The principle building block of each segment is a *chunk*. A *chunk* is a
972 985 framed piece of data:
973 986
974 987 +---------------------------------------+
975 988 | | |
976 989 | length | data |
977 | (32 bits) | <length> bytes |
990 | (4 bytes) | (<length - 4> bytes) |
978 991 | | |
979 992 +---------------------------------------+
980 993
981 Each chunk starts with a 32-bit big-endian signed integer indicating the
982 length of the raw data that follows.
983
984 There is a special case chunk that has 0 length ("0x00000000"). We call
985 this an *empty chunk*.
994 All integers are big-endian signed integers. Each chunk starts with a
995 32-bit integer indicating the length of the entire chunk (including the
996 length field itself).
997
998 There is a special case chunk that has a value of 0 for the length
999 ("0x00000000"). We call this an *empty chunk*.
986 1000
987 1001 Delta Groups
988 1002 ============
@@ -996,26 +1010,27 b' sub-topics can be accessed'
996 1010 +------------------------------------------------------------------------+
997 1011 | | | | | |
998 1012 | chunk0 length | chunk0 data | chunk1 length | chunk1 data | 0x0 |
999 | (32 bits) | (various) | (32 bits) | (various) | (32 bits) |
1013 | (4 bytes) | (various) | (4 bytes) | (various) | (4 bytes) |
1000 1014 | | | | | |
1001 +------------------------------------------------------------+-----------+
1015 +------------------------------------------------------------------------+
1002 1016
1003 1017 Each *chunk*'s data consists of the following:
1004 1018
1005 +-----------------------------------------+
1006 | | | |
1007 | delta header | mdiff header | delta |
1008 | (various) | (12 bytes) | (various) |
1009 | | | |
1010 +-----------------------------------------+
1011
1012 The *length* field is the byte length of the remaining 3 logical pieces of
1013 data. The *delta* is a diff from an existing entry in the changelog.
1019 +---------------------------------------+
1020 | | |
1021 | delta header | delta data |
1022 | (various by version) | (various) |
1023 | | |
1024 +---------------------------------------+
1025
1026 The *delta data* is a series of *delta*s that describe a diff from an
1027 existing entry (either that the recipient already has, or previously
1028 specified in the bundle/changegroup).
1014 1029
1015 1030 The *delta header* is different between versions "1", "2", and "3" of the
1016 1031 changegroup format.
1017 1032
1018 Version 1:
1033 Version 1 (headerlen=80):
1019 1034
1020 1035 +------------------------------------------------------+
1021 1036 | | | | |
@@ -1024,7 +1039,7 b' sub-topics can be accessed'
1024 1039 | | | | |
1025 1040 +------------------------------------------------------+
1026 1041
1027 Version 2:
1042 Version 2 (headerlen=100):
1028 1043
1029 1044 +------------------------------------------------------------------+
1030 1045 | | | | | |
@@ -1033,30 +1048,36 b' sub-topics can be accessed'
1033 1048 | | | | | |
1034 1049 +------------------------------------------------------------------+
1035 1050
1036 Version 3:
1051 Version 3 (headerlen=102):
1037 1052
1038 1053 +------------------------------------------------------------------------------+
1039 1054 | | | | | | |
1040 | node | p1 node | p2 node | base node | link node | flags |
1055 | node | p1 node | p2 node | base node | link node | flags |
1041 1056 | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (2 bytes) |
1042 1057 | | | | | | |
1043 1058 +------------------------------------------------------------------------------+
1044 1059
1045 The *mdiff header* consists of 3 32-bit big-endian signed integers
1046 describing offsets at which to apply the following delta content:
1047
1048 +-------------------------------------+
1049 | | | |
1050 | offset | old length | new length |
1051 | (32 bits) | (32 bits) | (32 bits) |
1052 | | | |
1053 +-------------------------------------+
1060 The *delta data* consists of "chunklen - 4 - headerlen" bytes, which
1061 contain a series of *delta*s, densely packed (no separators). These deltas
1062 describe a diff from an existing entry (either that the recipient already
1063 has, or previously specified in the bundle/changegroup). The format is
1064 described more fully in "hg help internals.bdiff", but briefly:
1065
1066 +---------------------------------------------------------------+ |
1067 | | | | | start offset | end
1068 offset | new length | content | | (4 bytes) | (4
1069 bytes) | (4 bytes) | (<new length> bytes) | | |
1070 | | |
1071 +---------------------------------------------------------------+
1072
1073 Please note that the length field in the delta data does *not* include
1074 itself.
1054 1075
1055 1076 In version 1, the delta is always applied against the previous node from
1056 1077 the changegroup or the first parent if this is the first entry in the
1057 1078 changegroup.
1058 1079
1059 In version 2, the delta base node is encoded in the entry in the
1080 In version 2 and up, the delta base node is encoded in the entry in the
1060 1081 changegroup. This allows the delta to be expressed against any parent,
1061 1082 which can result in smaller deltas and more efficient encoding of data.
1062 1083
@@ -1064,46 +1085,61 b' sub-topics can be accessed'
1064 1085 =================
1065 1086
1066 1087 The *changeset segment* consists of a single *delta group* holding
1067 changelog data. It is followed by an *empty chunk* to denote the boundary
1068 to the *manifests segment*.
1088 changelog data. The *empty chunk* at the end of the *delta group* denotes
1089 the boundary to the *manifest segment*.
1069 1090
1070 1091 Manifest Segment
1071 1092 ================
1072 1093
1073 1094 The *manifest segment* consists of a single *delta group* holding manifest
1074 data. It is followed by an *empty chunk* to denote the boundary to the
1075 *filelogs segment*.
1095 data. If treemanifests are in use, it contains only the manifest for the
1096 root directory of the repository. Otherwise, it contains the entire
1097 manifest data. The *empty chunk* at the end of the *delta group* denotes
1098 the boundary to the next segment (either the *treemanifests segment* or
1099 the *filelogs segment*, depending on version and the request options).
1100
1101 Treemanifests Segment
1102 ---------------------
1103
1104 The *treemanifests segment* only exists in changegroup version "3", and
1105 only if the 'treemanifest' param is part of the bundle2 changegroup part
1106 (it is not possible to use changegroup version 3 outside of bundle2).
1107 Aside from the filenames in the *treemanifests segment* containing a
1108 trailing "/" character, it behaves identically to the *filelogs segment*
1109 (see below). The final sub-segment is followed by an *empty chunk*
1110 (logically, a sub-segment with filename size 0). This denotes the boundary
1111 to the *filelogs segment*.
1076 1112
1077 1113 Filelogs Segment
1078 1114 ================
1079 1115
1080 The *filelogs* segment consists of multiple sub-segments, each
1116 The *filelogs segment* consists of multiple sub-segments, each
1081 1117 corresponding to an individual file whose data is being described:
1082 1118
1083 +--------------------------------------+
1084 | | | | |
1085 | filelog0 | filelog1 | filelog2 | ... |
1086 | | | | |
1087 +--------------------------------------+
1088
1089 In version "3" of the changegroup format, filelogs may include directory
1090 logs when treemanifests are in use. directory logs are identified by
1091 having a trailing '/' on their filename (see below).
1092
1093 The final filelog sub-segment is followed by an *empty chunk* to denote
1094 the end of the segment and the overall changegroup.
1119 +--------------------------------------------------+
1120 | | | | | |
1121 | filelog0 | filelog1 | filelog2 | ... | 0x0 |
1122 | | | | | (4 bytes) |
1123 | | | | | |
1124 +--------------------------------------------------+
1125
1126 The final filelog sub-segment is followed by an *empty chunk* (logically,
1127 a sub-segment with filename size 0). This denotes the end of the segment
1128 and of the overall changegroup.
1095 1129
1096 1130 Each filelog sub-segment consists of the following:
1097 1131
1098 +------------------------------------------+
1099 | | | |
1100 | filename size | filename | delta group |
1101 | (32 bits) | (various) | (various) |
1102 | | | |
1103 +------------------------------------------+
1132 +------------------------------------------------------+
1133 | | | |
1134 | filename length | filename | delta group |
1135 | (4 bytes) | (<length - 4> bytes) | (various) |
1136 | | | |
1137 +------------------------------------------------------+
1104 1138
1105 1139 That is, a *chunk* consisting of the filename (not terminated or padded)
1106 followed by N chunks constituting the *delta group* for this file.
1140 followed by N chunks constituting the *delta group* for this file. The
1141 *empty chunk* at the end of each *delta group* denotes the boundary to the
1142 next filelog sub-segment.
1107 1143
1108 1144 Test list of commands with command with no help text
1109 1145
@@ -3031,26 +3067,41 b' Sub-topic topics rendered properly'
3031 3067 <h1>Changegroups</h1>
3032 3068 <p>
3033 3069 Changegroups are representations of repository revlog data, specifically
3034 the changelog, manifest, and filelogs.
3070 the changelog data, root/flat manifest data, treemanifest data, and
3071 filelogs.
3035 3072 </p>
3036 3073 <p>
3037 3074 There are 3 versions of changegroups: &quot;1&quot;, &quot;2&quot;, and &quot;3&quot;. From a
3038 high-level, versions &quot;1&quot; and &quot;2&quot; are almost exactly the same, with
3039 the only difference being a header on entries in the changeset
3040 segment. Version &quot;3&quot; adds support for exchanging treemanifests and
3041 includes revlog flags in the delta header.
3075 high-level, versions &quot;1&quot; and &quot;2&quot; are almost exactly the same, with the
3076 only difference being an additional item in the *delta header*. Version
3077 &quot;3&quot; adds support for revlog flags in the *delta header* and optionally
3078 exchanging treemanifests (enabled by setting an option on the
3079 &quot;changegroup&quot; part in the bundle2).
3042 3080 </p>
3043 3081 <p>
3044 Changegroups consists of 3 logical segments:
3082 Changegroups when not exchanging treemanifests consist of 3 logical
3083 segments:
3045 3084 </p>
3046 3085 <pre>
3047 3086 +---------------------------------+
3048 3087 | | | |
3049 3088 | changeset | manifest | filelogs |
3050 3089 | | | |
3090 | | | |
3051 3091 +---------------------------------+
3052 3092 </pre>
3053 3093 <p>
3094 When exchanging treemanifests, there are 4 logical segments:
3095 </p>
3096 <pre>
3097 +-------------------------------------------------+
3098 | | | | |
3099 | changeset | root | treemanifests | filelogs |
3100 | | manifest | | |
3101 | | | | |
3102 +-------------------------------------------------+
3103 </pre>
3104 <p>
3054 3105 The principle building block of each segment is a *chunk*. A *chunk*
3055 3106 is a framed piece of data:
3056 3107 </p>
@@ -3058,17 +3109,18 b' Sub-topic topics rendered properly'
3058 3109 +---------------------------------------+
3059 3110 | | |
3060 3111 | length | data |
3061 | (32 bits) | &lt;length&gt; bytes |
3112 | (4 bytes) | (&lt;length - 4&gt; bytes) |
3062 3113 | | |
3063 3114 +---------------------------------------+
3064 3115 </pre>
3065 3116 <p>
3066 Each chunk starts with a 32-bit big-endian signed integer indicating
3067 the length of the raw data that follows.
3117 All integers are big-endian signed integers. Each chunk starts with a 32-bit
3118 integer indicating the length of the entire chunk (including the length field
3119 itself).
3068 3120 </p>
3069 3121 <p>
3070 There is a special case chunk that has 0 length (&quot;0x00000000&quot;). We
3071 call this an *empty chunk*.
3122 There is a special case chunk that has a value of 0 for the length
3123 (&quot;0x00000000&quot;). We call this an *empty chunk*.
3072 3124 </p>
3073 3125 <h2>Delta Groups</h2>
3074 3126 <p>
@@ -3083,31 +3135,32 b' Sub-topic topics rendered properly'
3083 3135 +------------------------------------------------------------------------+
3084 3136 | | | | | |
3085 3137 | chunk0 length | chunk0 data | chunk1 length | chunk1 data | 0x0 |
3086 | (32 bits) | (various) | (32 bits) | (various) | (32 bits) |
3138 | (4 bytes) | (various) | (4 bytes) | (various) | (4 bytes) |
3087 3139 | | | | | |
3088 +------------------------------------------------------------+-----------+
3140 +------------------------------------------------------------------------+
3089 3141 </pre>
3090 3142 <p>
3091 3143 Each *chunk*'s data consists of the following:
3092 3144 </p>
3093 3145 <pre>
3094 +-----------------------------------------+
3095 | | | |
3096 | delta header | mdiff header | delta |
3097 | (various) | (12 bytes) | (various) |
3098 | | | |
3099 +-----------------------------------------+
3146 +---------------------------------------+
3147 | | |
3148 | delta header | delta data |
3149 | (various by version) | (various) |
3150 | | |
3151 +---------------------------------------+
3100 3152 </pre>
3101 3153 <p>
3102 The *length* field is the byte length of the remaining 3 logical pieces
3103 of data. The *delta* is a diff from an existing entry in the changelog.
3154 The *delta data* is a series of *delta*s that describe a diff from an existing
3155 entry (either that the recipient already has, or previously specified in the
3156 bundle/changegroup).
3104 3157 </p>
3105 3158 <p>
3106 3159 The *delta header* is different between versions &quot;1&quot;, &quot;2&quot;, and
3107 3160 &quot;3&quot; of the changegroup format.
3108 3161 </p>
3109 3162 <p>
3110 Version 1:
3163 Version 1 (headerlen=80):
3111 3164 </p>
3112 3165 <pre>
3113 3166 +------------------------------------------------------+
@@ -3118,7 +3171,7 b' Sub-topic topics rendered properly'
3118 3171 +------------------------------------------------------+
3119 3172 </pre>
3120 3173 <p>
3121 Version 2:
3174 Version 2 (headerlen=100):
3122 3175 </p>
3123 3176 <pre>
3124 3177 +------------------------------------------------------------------+
@@ -3129,85 +3182,104 b' Sub-topic topics rendered properly'
3129 3182 +------------------------------------------------------------------+
3130 3183 </pre>
3131 3184 <p>
3132 Version 3:
3185 Version 3 (headerlen=102):
3133 3186 </p>
3134 3187 <pre>
3135 3188 +------------------------------------------------------------------------------+
3136 3189 | | | | | | |
3137 | node | p1 node | p2 node | base node | link node | flags |
3190 | node | p1 node | p2 node | base node | link node | flags |
3138 3191 | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (2 bytes) |
3139 3192 | | | | | | |
3140 3193 +------------------------------------------------------------------------------+
3141 3194 </pre>
3142 3195 <p>
3143 The *mdiff header* consists of 3 32-bit big-endian signed integers
3144 describing offsets at which to apply the following delta content:
3196 The *delta data* consists of &quot;chunklen - 4 - headerlen&quot; bytes, which contain a
3197 series of *delta*s, densely packed (no separators). These deltas describe a diff
3198 from an existing entry (either that the recipient already has, or previously
3199 specified in the bundle/changegroup). The format is described more fully in
3200 &quot;hg help internals.bdiff&quot;, but briefly:
3145 3201 </p>
3146 <pre>
3147 +-------------------------------------+
3148 | | | |
3149 | offset | old length | new length |
3150 | (32 bits) | (32 bits) | (32 bits) |
3151 | | | |
3152 +-------------------------------------+
3153 </pre>
3202 <p>
3203 +---------------------------------------------------------------+
3204 | | | | |
3205 | start offset | end offset | new length | content |
3206 | (4 bytes) | (4 bytes) | (4 bytes) | (&lt;new length&gt; bytes) |
3207 | | | | |
3208 +---------------------------------------------------------------+
3209 </p>
3210 <p>
3211 Please note that the length field in the delta data does *not* include itself.
3212 </p>
3154 3213 <p>
3155 3214 In version 1, the delta is always applied against the previous node from
3156 3215 the changegroup or the first parent if this is the first entry in the
3157 3216 changegroup.
3158 3217 </p>
3159 3218 <p>
3160 In version 2, the delta base node is encoded in the entry in the
3219 In version 2 and up, the delta base node is encoded in the entry in the
3161 3220 changegroup. This allows the delta to be expressed against any parent,
3162 3221 which can result in smaller deltas and more efficient encoding of data.
3163 3222 </p>
3164 3223 <h2>Changeset Segment</h2>
3165 3224 <p>
3166 3225 The *changeset segment* consists of a single *delta group* holding
3167 changelog data. It is followed by an *empty chunk* to denote the
3168 boundary to the *manifests segment*.
3226 changelog data. The *empty chunk* at the end of the *delta group* denotes
3227 the boundary to the *manifest segment*.
3169 3228 </p>
3170 3229 <h2>Manifest Segment</h2>
3171 3230 <p>
3172 The *manifest segment* consists of a single *delta group* holding
3173 manifest data. It is followed by an *empty chunk* to denote the boundary
3174 to the *filelogs segment*.
3231 The *manifest segment* consists of a single *delta group* holding manifest
3232 data. If treemanifests are in use, it contains only the manifest for the
3233 root directory of the repository. Otherwise, it contains the entire
3234 manifest data. The *empty chunk* at the end of the *delta group* denotes
3235 the boundary to the next segment (either the *treemanifests segment* or the
3236 *filelogs segment*, depending on version and the request options).
3237 </p>
3238 <h3>Treemanifests Segment</h3>
3239 <p>
3240 The *treemanifests segment* only exists in changegroup version &quot;3&quot;, and
3241 only if the 'treemanifest' param is part of the bundle2 changegroup part
3242 (it is not possible to use changegroup version 3 outside of bundle2).
3243 Aside from the filenames in the *treemanifests segment* containing a
3244 trailing &quot;/&quot; character, it behaves identically to the *filelogs segment*
3245 (see below). The final sub-segment is followed by an *empty chunk* (logically,
3246 a sub-segment with filename size 0). This denotes the boundary to the
3247 *filelogs segment*.
3175 3248 </p>
3176 3249 <h2>Filelogs Segment</h2>
3177 3250 <p>
3178 The *filelogs* segment consists of multiple sub-segments, each
3251 The *filelogs segment* consists of multiple sub-segments, each
3179 3252 corresponding to an individual file whose data is being described:
3180 3253 </p>
3181 3254 <pre>
3182 +--------------------------------------+
3183 | | | | |
3184 | filelog0 | filelog1 | filelog2 | ... |
3185 | | | | |
3186 +--------------------------------------+
3255 +--------------------------------------------------+
3256 | | | | | |
3257 | filelog0 | filelog1 | filelog2 | ... | 0x0 |
3258 | | | | | (4 bytes) |
3259 | | | | | |
3260 +--------------------------------------------------+
3187 3261 </pre>
3188 3262 <p>
3189 In version &quot;3&quot; of the changegroup format, filelogs may include
3190 directory logs when treemanifests are in use. directory logs are
3191 identified by having a trailing '/' on their filename (see below).
3192 </p>
3193 <p>
3194 The final filelog sub-segment is followed by an *empty chunk* to denote
3195 the end of the segment and the overall changegroup.
3263 The final filelog sub-segment is followed by an *empty chunk* (logically,
3264 a sub-segment with filename size 0). This denotes the end of the segment
3265 and of the overall changegroup.
3196 3266 </p>
3197 3267 <p>
3198 3268 Each filelog sub-segment consists of the following:
3199 3269 </p>
3200 3270 <pre>
3201 +------------------------------------------+
3202 | | | |
3203 | filename size | filename | delta group |
3204 | (32 bits) | (various) | (various) |
3205 | | | |
3206 +------------------------------------------+
3271 +------------------------------------------------------+
3272 | | | |
3273 | filename length | filename | delta group |
3274 | (4 bytes) | (&lt;length - 4&gt; bytes) | (various) |
3275 | | | |
3276 +------------------------------------------------------+
3207 3277 </pre>
3208 3278 <p>
3209 3279 That is, a *chunk* consisting of the filename (not terminated or padded)
3210 followed by N chunks constituting the *delta group* for this file.
3280 followed by N chunks constituting the *delta group* for this file. The
3281 *empty chunk* at the end of each *delta group* denotes the boundary to the
3282 next filelog sub-segment.
3211 3283 </p>
3212 3284
3213 3285 </div>