@@ -1,35 +1,49 b'' | |||
|
1 | 1 | Changegroups are representations of repository revlog data, specifically |
|
2 | the changelog, manifest, and filelogs. | |
|
2 | the changelog data, root/flat manifest data, treemanifest data, and | |
|
3 | filelogs. | |
|
3 | 4 | |
|
4 | 5 | There are 3 versions of changegroups: ``1``, ``2``, and ``3``. From a |
|
5 | high-level, versions ``1`` and ``2`` are almost exactly the same, with | |
|
6 |
|
|
|
7 | segment. Version ``3`` adds support for exchanging treemanifests and | |
|
8 | includes revlog flags in the delta header. | |
|
6 | high-level, versions ``1`` and ``2`` are almost exactly the same, with the | |
|
7 | only difference being an additional item in the *delta header*. Version | |
|
8 | ``3`` adds support for revlog flags in the *delta header* and optionally | |
|
9 | exchanging treemanifests (enabled by setting an option on the | |
|
10 | ``changegroup`` part in the bundle2). | |
|
9 | 11 | |
|
10 | Changegroups consists of 3 logical segments:: | |
|
12 | Changegroups when not exchanging treemanifests consist of 3 logical | |
|
13 | segments:: | |
|
11 | 14 | |
|
12 | 15 | +---------------------------------+ |
|
13 | 16 | | | | | |
|
14 | 17 | | changeset | manifest | filelogs | |
|
15 | 18 | | | | | |
|
19 | | | | | | |
|
16 | 20 | +---------------------------------+ |
|
17 | 21 | |
|
22 | When exchanging treemanifests, there are 4 logical segments:: | |
|
23 | ||
|
24 | +-------------------------------------------------+ | |
|
25 | | | | | | | |
|
26 | | changeset | root | treemanifests | filelogs | | |
|
27 | | | manifest | | | | |
|
28 | | | | | | | |
|
29 | +-------------------------------------------------+ | |
|
30 | ||
|
18 | 31 | The principal building block of each segment is a *chunk*. A *chunk* |
|
19 | 32 | is a framed piece of data:: |
|
20 | 33 | |
|
21 | 34 | +---------------------------------------+ |
|
22 | 35 | | | | |
|
23 | 36 | | length | data | |
|
24 |
| ( |
|
|
37 | | (4 bytes) | (<length - 4> bytes) | | |
|
25 | 38 | | | | |
|
26 | 39 | +---------------------------------------+ |
|
27 | 40 | |
|
28 | Each chunk starts with a 32-bit big-endian signed integer indicating | |
|
29 | the length of the raw data that follows. | |
|
41 | All integers are big-endian signed integers. Each chunk starts with a 32-bit | |
|
42 | integer indicating the length of the entire chunk (including the length field | |
|
43 | itself). | |
|
30 | 44 | |
|
31 |
There is a special case chunk that has 0 length |
|
|
32 | call this an *empty chunk*. | |
|
45 | There is a special case chunk that has a value of 0 for the length | |
|
46 | (``0x00000000``). We call this an *empty chunk*. | |
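
The chunk framing described in the added lines above can be sketched in Python. This is only an illustration, not Mercurial's own reader: the names `read_chunk`, `write_chunk` and `EMPTY_CHUNK` are hypothetical, and rejecting lengths between 1 and 4 is an assumption about how a careful reader would treat a length that cannot cover its own 4-byte field.

```python
import io
import struct


def read_chunk(stream):
    """Return the payload of the next chunk, or None for an empty chunk."""
    raw = stream.read(4)
    if len(raw) != 4:
        raise EOFError("truncated chunk length")
    # 32-bit big-endian signed integer; it counts the whole chunk,
    # including these 4 bytes.
    (length,) = struct.unpack(">i", raw)
    if length == 0:
        return None  # empty chunk (0x00000000)
    if length <= 4:
        # Assumption: a non-zero length that cannot cover its own
        # length field is treated as malformed.
        raise ValueError("invalid chunk length: %d" % length)
    data = stream.read(length - 4)
    if len(data) != length - 4:
        raise EOFError("truncated chunk data")
    return data


def write_chunk(payload):
    """Frame a payload: the length field includes itself."""
    return struct.pack(">i", len(payload) + 4) + payload


EMPTY_CHUNK = struct.pack(">i", 0)

stream = io.BytesIO(write_chunk(b"hello") + EMPTY_CHUNK)
first = read_chunk(stream)   # payload of the framed chunk
second = read_chunk(stream)  # the empty chunk
```

Note that a 5-byte payload is framed with a length of 9, which is the behavioural change this hunk documents: the length field counts itself.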
|
33 | 47 | |
|
34 | 48 | Delta Groups |
|
35 | 49 | ============ |
@@ -43,26 +57,27 b' to signal the end of the delta group::' | |||
|
43 | 57 | +------------------------------------------------------------------------+ |
|
44 | 58 | | | | | | | |
|
45 | 59 | | chunk0 length | chunk0 data | chunk1 length | chunk1 data | 0x0 | |
|
46 |
| ( |
|
|
60 | | (4 bytes) | (various) | (4 bytes) | (various) | (4 bytes) | | |
|
47 | 61 | | | | | | | |
|
48 |
+------------------------------------------------------------ |
|
|
62 | +------------------------------------------------------------------------+ | |
|
49 | 63 | |
|
50 | 64 | Each *chunk*'s data consists of the following:: |
|
51 | 65 | |
|
52 |
+--------------------------------------- |
|
|
53 |
| |
|
|
54 |
| delta header |
|
|
55 | | (various) | (12 bytes) | (various) | | |
|
56 |
| |
|
|
57 |
+--------------------------------------- |
|
|
66 | +---------------------------------------+ | |
|
67 | | | | | |
|
68 | | delta header | delta data | | |
|
69 | | (various by version) | (various) | | |
|
70 | | | | | |
|
71 | +---------------------------------------+ | |
|
58 | 72 | |
|
59 | The *length* field is the byte length of the remaining 3 logical pieces | |
|
60 | of data. The *delta* is a diff from an existing entry in the changelog. | |
|
73 | The *delta data* is a series of *delta*s that describe a diff from an existing | |
|
74 | entry (either that the recipient already has, or previously specified in the | |
|
75 | bundle/changegroup). | |
|
61 | 76 | |
|
62 | 77 | The *delta header* is different between versions ``1``, ``2``, and |
|
63 | 78 | ``3`` of the changegroup format. |
|
64 | 79 | |
|
65 | Version 1:: | |
|
80 | Version 1 (headerlen=80):: | |
|
66 | 81 | |
|
67 | 82 | +------------------------------------------------------+ |
|
68 | 83 | | | | | | |
@@ -71,7 +86,7 b' Version 1::' | |||
|
71 | 86 | | | | | | |
|
72 | 87 | +------------------------------------------------------+ |
|
73 | 88 | |
|
74 | Version 2:: | |
|
89 | Version 2 (headerlen=100):: | |
|
75 | 90 | |
|
76 | 91 | +------------------------------------------------------------------+ |
|
77 | 92 | | | | | | | |
@@ -80,7 +95,7 b' Version 2::' | |||
|
80 | 95 | | | | | | | |
|
81 | 96 | +------------------------------------------------------------------+ |
|
82 | 97 | |
|
83 | Version 3:: | |
|
98 | Version 3 (headerlen=102):: | |
|
84 | 99 | |
|
85 | 100 | +------------------------------------------------------------------------------+ |
|
86 | 101 | | | | | | | | |
@@ -89,21 +104,26 b' Version 3::' | |||
|
89 | 104 | | | | | | | | |
|
90 | 105 | +------------------------------------------------------------------------------+ |
|
91 | 106 | |
|
92 | The *mdiff header* consists of 3 32-bit big-endian signed integers | |
|
93 | describing offsets at which to apply the following delta content:: | |
|
107 | The *delta data* consists of ``chunklen - 4 - headerlen`` bytes, which contain a | |
|
108 | series of *delta*s, densely packed (no separators). These deltas describe a diff | |
|
109 | from an existing entry (either that the recipient already has, or previously | |
|
110 | specified in the bundle/changegroup). The format is described more fully in | |
|
111 | ``hg help internals.bdiff``, but briefly:: | |
|
94 | 112 | |
|
95 | +-------------------------------------+ | |
|
96 | | | | | | |
|
97 |
| offset |
|
|
98 | | (32 bits) | (32 bits) | (32 bits) | | |
|
99 | | | | | | |
|
100 | +-------------------------------------+ | |
|
113 | +---------------------------------------------------------------+ | |
|
114 | | | | | | | |
|
115 | | start offset | end offset | new length | content | | |
|
116 | | (4 bytes) | (4 bytes) | (4 bytes) | (<new length> bytes) | | |
|
117 | | | | | | | |
|
118 | +---------------------------------------------------------------+ | |
|
119 | ||
|
120 | Please note that the length field in the delta data does *not* include itself. | |
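
As a rough illustration of the delta layout in the table above, the following Python sketch applies a densely packed series of deltas to a base text. The function name `apply_deltas` is hypothetical, and the sketch assumes that the offsets refer to positions in the base text and that the deltas arrive sorted and non-overlapping; `hg help internals.bdiff` is the authority on those details.

```python
import struct


def apply_deltas(base, delta_data):
    """Apply packed (start, end, new length, content) deltas to base."""
    out = []
    pos = 0   # cursor into delta_data
    last = 0  # end of the most recently replaced span in base
    while pos < len(delta_data):
        # Three 32-bit big-endian signed integers, 12 bytes in total.
        start, end, new_len = struct.unpack_from(">iii", delta_data, pos)
        pos += 12
        # new_len counts only the content, not the 12-byte header.
        content = delta_data[pos:pos + new_len]
        if len(content) != new_len:
            raise ValueError("truncated delta content")
        pos += new_len
        out.append(base[last:start])  # untouched bytes before this delta
        out.append(content)           # replacement for base[start:end]
        last = end
    out.append(base[last:])           # untouched tail of the base text
    return b"".join(out)


base = b"hello world"
deltas = (
    struct.pack(">iii", 0, 0, 3) + b">> "       # pure insertion at offset 0
    + struct.pack(">iii", 6, 11, 5) + b"there"  # replace "world"
)
result = apply_deltas(base, deltas)
```

The two deltas are concatenated with no separator, which is what "densely packed" means here: the only way to find the second delta is to read the first one's new length.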
|
101 | 121 | |
|
102 | 122 | In version 1, the delta is always applied against the previous node from |
|
103 | 123 | the changegroup or the first parent if this is the first entry in the |
|
104 | 124 | changegroup. |
|
105 | 125 | |
|
106 | In version 2, the delta base node is encoded in the entry in the | |
|
126 | In version 2 and up, the delta base node is encoded in the entry in the | |
|
107 | 127 | changegroup. This allows the delta to be expressed against any parent, |
|
108 | 128 | which can result in smaller deltas and more efficient encoding of data. |
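
The header lengths quoted in this change (80, 100 and 102 bytes) can be cross-checked with a small Python sketch. The field order in the comments is an assumption, since the header diagrams are elided from this excerpt: it takes 20-byte node identifiers, with version 2 adding the delta base node and version 3 adding a 2-byte flags field. `DELTA_HEADER_FORMATS` and `split_delta_chunk` are hypothetical names.

```python
import struct

# Assumed field layout per version; only the total sizes (80, 100, 102)
# are stated in the text above.
DELTA_HEADER_FORMATS = {
    1: ">20s20s20s20s",      # node, p1, p2, linknode
    2: ">20s20s20s20s20s",   # node, p1, p2, delta base, linknode
    3: ">20s20s20s20s20sH",  # as version 2, plus a 2-byte flags field
}


def split_delta_chunk(version, chunk_data):
    """Split a chunk payload into (header fields, delta data)."""
    fmt = DELTA_HEADER_FORMATS[version]
    headerlen = struct.calcsize(fmt)
    if len(chunk_data) < headerlen:
        raise ValueError("chunk shorter than delta header")
    header = struct.unpack_from(fmt, chunk_data)
    # chunk_data is already chunklen - 4 bytes long, so whatever follows
    # the header is the chunklen - 4 - headerlen bytes of delta data.
    return header, chunk_data[headerlen:]


sizes = {v: struct.calcsize(f) for v, f in DELTA_HEADER_FORMATS.items()}

node, p1, p2, base, link = (bytes([i]) * 20 for i in range(1, 6))
header, delta_data = split_delta_chunk(
    2, node + p1 + p2 + base + link + b"payload"
)
```

Under this assumed layout, a version 1 reader has no delta base field to consult, which matches the rule above that version 1 deltas always apply against the previous node or the first parent.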
|
109 | 129 | |
@@ -111,43 +131,58 b' Changeset Segment' | |||
|
111 | 131 | ================= |
|
112 | 132 | |
|
113 | 133 | The *changeset segment* consists of a single *delta group* holding |
|
114 | changelog data. It is followed by an *empty chunk* to denote the | |
|
115 |
boundary to the *manifest |
|
|
134 | changelog data. The *empty chunk* at the end of the *delta group* denotes | |
|
135 | the boundary to the *manifest segment*. | |
|
116 | 136 | |
|
117 | 137 | Manifest Segment |
|
118 | 138 | ================ |
|
119 | 139 | |
|
120 | The *manifest segment* consists of a single *delta group* holding | |
|
121 | manifest data. It is followed by an *empty chunk* to denote the boundary | |
|
122 | to the *filelogs segment*. | |
|
140 | The *manifest segment* consists of a single *delta group* holding manifest | |
|
141 | data. If treemanifests are in use, it contains only the manifest for the | |
|
142 | root directory of the repository. Otherwise, it contains the entire | |
|
143 | manifest data. The *empty chunk* at the end of the *delta group* denotes | |
|
144 | the boundary to the next segment (either the *treemanifests segment* or the | |
|
145 | *filelogs segment*, depending on version and the request options). | |
|
146 | ||
|
147 | Treemanifests Segment | |
|
148 | --------------------- | |
|
149 | ||
|
150 | The *treemanifests segment* only exists in changegroup version ``3``, and | |
|
151 | only if the 'treemanifest' param is part of the bundle2 changegroup part | |
|
152 | (it is not possible to use changegroup version 3 outside of bundle2). | |
|
153 | Aside from the filenames in the *treemanifests segment* containing a | |
|
154 | trailing ``/`` character, it behaves identically to the *filelogs segment* | |
|
155 | (see below). The final sub-segment is followed by an *empty chunk* (logically, | |
|
156 | a sub-segment with filename size 0). This denotes the boundary to the | |
|
157 | *filelogs segment*. | |
|
123 | 158 | |
|
124 | 159 | Filelogs Segment |
|
125 | 160 | ================ |
|
126 | 161 | |
|
127 |
The *filelogs |
|
|
162 | The *filelogs segment* consists of multiple sub-segments, each | |
|
128 | 163 | corresponding to an individual file whose data is being described:: |
|
129 | 164 | |
|
130 | +--------------------------------------+ | |
|
131 | | | | | | | |
|
132 | | filelog0 | filelog1 | filelog2 | ... | | |
|
133 | | | | | | | |
|
134 | +--------------------------------------+ | |
|
165 | +--------------------------------------------------+ | |
|
166 | | | | | | | | |
|
167 | | filelog0 | filelog1 | filelog2 | ... | 0x0 | | |
|
168 | | | | | | (4 bytes) | | |
|
169 | | | | | | | | |
|
170 | +--------------------------------------------------+ | |
|
135 | 171 | |
|
136 | In version ``3`` of the changegroup format, filelogs may include | |
|
137 | directory logs when treemanifests are in use. directory logs are | |
|
138 | identified by having a trailing '/' on their filename (see below). | |
|
139 | ||
|
140 | The final filelog sub-segment is followed by an *empty chunk* to denote | |
|
141 | the end of the segment and the overall changegroup. | |
|
172 | The final filelog sub-segment is followed by an *empty chunk* (logically, | |
|
173 | a sub-segment with filename size 0). This denotes the end of the segment | |
|
174 | and of the overall changegroup. | |
|
142 | 175 | |
|
143 | 176 | Each filelog sub-segment consists of the following:: |
|
144 | 177 | |
|
145 | +------------------------------------------+ | |
|
178 | +------------------------------------------------------+ | |
|
146 | 179 | | | | | |
|
147 |
| filename |
|
|
148 |
| |
|
|
180 | | filename length | filename | delta group | | |
|
181 | | (4 bytes) | (<length - 4> bytes) | (various) | | |
|
149 | 182 | | | | | |
|
150 | +------------------------------------------+ | |
|
183 | +------------------------------------------------------+ | |
|
151 | 184 | |
|
152 | 185 | That is, a *chunk* consisting of the filename (not terminated or padded) |
|
153 | followed by N chunks constituting the *delta group* for this file. | |
|
186 | followed by N chunks constituting the *delta group* for this file. The | |
|
187 | *empty chunk* at the end of each *delta group* denotes the boundary to the | |
|
188 | next filelog sub-segment. |
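
Putting the segments together, a reader for the overall layout might look like the Python sketch below. It is a simplified illustration under stated assumptions: all function names are hypothetical, chunk payloads are returned undecoded, and the `treemanifests` argument stands in for the bundle2 part parameter that the real format uses to signal the extra segment.

```python
import io
import struct


def read_chunk(stream):
    """Return the next chunk payload, or None for an empty chunk."""
    (length,) = struct.unpack(">i", stream.read(4))
    if length == 0:
        return None
    return stream.read(length - 4)  # length includes its own 4 bytes


def read_delta_group(stream):
    """Collect chunks until the empty chunk that ends the group."""
    chunks = []
    while True:
        chunk = read_chunk(stream)
        if chunk is None:
            return chunks
        chunks.append(chunk)


def read_named_groups(stream):
    """Read (filename chunk, delta group) pairs until an empty filename."""
    groups = {}
    while True:
        name = read_chunk(stream)
        if name is None:  # logically, a sub-segment with filename size 0
            return groups
        groups[name] = read_delta_group(stream)


def read_changegroup(stream, treemanifests=False):
    result = {
        "changeset": read_delta_group(stream),
        "manifest": read_delta_group(stream),
    }
    if treemanifests:
        # Same shape as filelogs; names carry a trailing "/".
        result["treemanifests"] = read_named_groups(stream)
    result["filelogs"] = read_named_groups(stream)
    return result


def chunk(payload):
    return struct.pack(">i", len(payload) + 4) + payload


END = struct.pack(">i", 0)
raw = (
    chunk(b"c0") + END                                # changeset segment
    + chunk(b"m0") + END                              # manifest segment
    + chunk(b"a.txt") + chunk(b"f0") + chunk(b"f1") + END  # one filelog
    + END                                             # end of filelogs
)
parsed = read_changegroup(io.BytesIO(raw))
```

The sample stream ends with two consecutive empty chunks: the first closes the delta group of `a.txt`, the second closes the filelogs segment and the changegroup as a whole.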
@@ -952,37 +952,51 b' sub-topics can be accessed' | |||
|
952 | 952 | """""""""""" |
|
953 | 953 | |
|
954 | 954 | Changegroups are representations of repository revlog data, specifically |
|
955 | the changelog, manifest, and filelogs. | |
|
955 | the changelog data, root/flat manifest data, treemanifest data, and | |
|
956 | filelogs. | |
|
956 | 957 | |
|
957 | 958 | There are 3 versions of changegroups: "1", "2", and "3". From a high- |
|
958 | 959 | level, versions "1" and "2" are almost exactly the same, with the only |
|
959 |
difference being a |
|
|
960 | adds support for exchanging treemanifests and includes revlog flags in the | |
|
961 | delta header. | |
|
962 | ||
|
963 | Changegroups consists of 3 logical segments: | |
|
960 | difference being an additional item in the *delta header*. Version "3" | |
|
961 | adds support for revlog flags in the *delta header* and optionally | |
|
962 | exchanging treemanifests (enabled by setting an option on the | |
|
963 | "changegroup" part in the bundle2). | |
|
964 | ||
|
965 | Changegroups when not exchanging treemanifests consist of 3 logical | |
|
966 | segments: | |
|
964 | 967 | |
|
965 | 968 | +---------------------------------+ |
|
966 | 969 | | | | | |
|
967 | 970 | | changeset | manifest | filelogs | |
|
968 | 971 | | | | | |
|
972 | | | | | | |
|
969 | 973 | +---------------------------------+ |
|
970 | 974 | |
|
975 | When exchanging treemanifests, there are 4 logical segments: | |
|
976 | ||
|
977 | +-------------------------------------------------+ | |
|
978 | | | | | | | |
|
979 | | changeset | root | treemanifests | filelogs | | |
|
980 | | | manifest | | | | |
|
981 | | | | | | | |
|
982 | +-------------------------------------------------+ | |
|
983 | ||
|
971 | 984 | The principal building block of each segment is a *chunk*. A *chunk* is a |
|
972 | 985 | framed piece of data: |
|
973 | 986 | |
|
974 | 987 | +---------------------------------------+ |
|
975 | 988 | | | | |
|
976 | 989 | | length | data | |
|
977 |
| ( |
|
|
990 | | (4 bytes) | (<length - 4> bytes) | | |
|
978 | 991 | | | | |
|
979 | 992 | +---------------------------------------+ |
|
980 | 993 | |
|
981 | Each chunk starts with a 32-bit big-endian signed integer indicating the | |
|
982 | length of the raw data that follows. | |
|
983 | ||
|
984 | There is a special case chunk that has 0 length ("0x00000000"). We call | |
|
985 | this an *empty chunk*. | |
|
994 | All integers are big-endian signed integers. Each chunk starts with a | |
|
995 | 32-bit integer indicating the length of the entire chunk (including the | |
|
996 | length field itself). | |
|
997 | ||
|
998 | There is a special case chunk that has a value of 0 for the length | |
|
999 | ("0x00000000"). We call this an *empty chunk*. | |
|
986 | 1000 | |
|
987 | 1001 | Delta Groups |
|
988 | 1002 | ============ |
@@ -996,26 +1010,27 b' sub-topics can be accessed' | |||
|
996 | 1010 | +------------------------------------------------------------------------+ |
|
997 | 1011 | | | | | | | |
|
998 | 1012 | | chunk0 length | chunk0 data | chunk1 length | chunk1 data | 0x0 | |
|
999 |
| ( |
|
|
1013 | | (4 bytes) | (various) | (4 bytes) | (various) | (4 bytes) | | |
|
1000 | 1014 | | | | | | | |
|
1001 |
+------------------------------------------------------------ |
|
|
1015 | +------------------------------------------------------------------------+ | |
|
1002 | 1016 | |
|
1003 | 1017 | Each *chunk*'s data consists of the following: |
|
1004 | 1018 | |
|
1005 |
+-------------------------------------- |
|
|
1006 |
| | | |
|
|
1007 |
| delta header |
|
|
1008 |
| (various) | |
|
|
1009 |
| | | |
|
|
1010 |
+-------------------------------------- |
|
|
1011 | ||
|
1012 | The *length* field is the byte length of the remaining 3 logical pieces of | |
|
1013 | data. The *delta* is a diff from an existing entry in the changelog. | |
|
1019 | +---------------------------------------+ | |
|
1020 | | | | | |
|
1021 | | delta header | delta data | | |
|
1022 | | (various by version) | (various) | | |
|
1023 | | | | | |
|
1024 | +---------------------------------------+ | |
|
1025 | ||
|
1026 | The *delta data* is a series of *delta*s that describe a diff from an | |
|
1027 | existing entry (either that the recipient already has, or previously | |
|
1028 | specified in the bundle/changegroup). | |
|
1014 | 1029 | |
|
1015 | 1030 | The *delta header* is different between versions "1", "2", and "3" of the |
|
1016 | 1031 | changegroup format. |
|
1017 | 1032 | |
|
1018 | Version 1: | |
|
1033 | Version 1 (headerlen=80): | |
|
1019 | 1034 | |
|
1020 | 1035 | +------------------------------------------------------+ |
|
1021 | 1036 | | | | | | |
@@ -1024,7 +1039,7 b' sub-topics can be accessed' | |||
|
1024 | 1039 | | | | | | |
|
1025 | 1040 | +------------------------------------------------------+ |
|
1026 | 1041 | |
|
1027 | Version 2: | |
|
1042 | Version 2 (headerlen=100): | |
|
1028 | 1043 | |
|
1029 | 1044 | +------------------------------------------------------------------+ |
|
1030 | 1045 | | | | | | | |
@@ -1033,7 +1048,7 b' sub-topics can be accessed' | |||
|
1033 | 1048 | | | | | | | |
|
1034 | 1049 | +------------------------------------------------------------------+ |
|
1035 | 1050 | |
|
1036 | Version 3: | |
|
1051 | Version 3 (headerlen=102): | |
|
1037 | 1052 | |
|
1038 | 1053 | +------------------------------------------------------------------------------+ |
|
1039 | 1054 | | | | | | | | |
@@ -1042,21 +1057,27 b' sub-topics can be accessed' | |||
|
1042 | 1057 | | | | | | | | |
|
1043 | 1058 | +------------------------------------------------------------------------------+ |
|
1044 | 1059 | |
|
1045 | The *mdiff header* consists of 3 32-bit big-endian signed integers | |
|
1046 | describing offsets at which to apply the following delta content: | |
|
1047 | ||
|
1048 | +-------------------------------------+ | |
|
1049 | | | | | | |
|
1050 | | offset | old length | new length | | |
|
1051 | | (32 bits) | (32 bits) | (32 bits) | | |
|
1052 |
|
|
|
1053 | +-------------------------------------+ | |
|
1060 | The *delta data* consists of "chunklen - 4 - headerlen" bytes, which | |
|
1061 | contain a series of *delta*s, densely packed (no separators). These deltas | |
|
1062 | describe a diff from an existing entry (either that the recipient already | |
|
1063 | has, or previously specified in the bundle/changegroup). The format is | |
|
1064 | described more fully in "hg help internals.bdiff", but briefly: | |
|
1065 | ||
|
1066 | +---------------------------------------------------------------+ | |
|
1067 | | | | | | |
|
1068 | | start offset | end offset | new length | content | | |
|
1069 | | (4 bytes) | (4 bytes) | (4 bytes) | (<new length> bytes) | | |
|
1070 | | | | | | |
|
1071 | +---------------------------------------------------------------+ | |
|
1072 | ||
|
1073 | Please note that the length field in the delta data does *not* include | |
|
1074 | itself. | |
|
1054 | 1075 | |
|
1055 | 1076 | In version 1, the delta is always applied against the previous node from |
|
1056 | 1077 | the changegroup or the first parent if this is the first entry in the |
|
1057 | 1078 | changegroup. |
|
1058 | 1079 | |
|
1059 | In version 2, the delta base node is encoded in the entry in the | |
|
1080 | In version 2 and up, the delta base node is encoded in the entry in the | |
|
1060 | 1081 | changegroup. This allows the delta to be expressed against any parent, |
|
1061 | 1082 | which can result in smaller deltas and more efficient encoding of data. |
|
1062 | 1083 | |
@@ -1064,46 +1085,61 b' sub-topics can be accessed' | |||
|
1064 | 1085 | ================= |
|
1065 | 1086 | |
|
1066 | 1087 | The *changeset segment* consists of a single *delta group* holding |
|
1067 | changelog data. It is followed by an *empty chunk* to denote the boundary | |
|
1068 |
to the *manifest |
|
|
1088 | changelog data. The *empty chunk* at the end of the *delta group* denotes | |
|
1089 | the boundary to the *manifest segment*. | |
|
1069 | 1090 | |
|
1070 | 1091 | Manifest Segment |
|
1071 | 1092 | ================ |
|
1072 | 1093 | |
|
1073 | 1094 | The *manifest segment* consists of a single *delta group* holding manifest |
|
1074 | data. It is followed by an *empty chunk* to denote the boundary to the | |
|
1075 | *filelogs segment*. | |
|
1095 | data. If treemanifests are in use, it contains only the manifest for the | |
|
1096 | root directory of the repository. Otherwise, it contains the entire | |
|
1097 | manifest data. The *empty chunk* at the end of the *delta group* denotes | |
|
1098 | the boundary to the next segment (either the *treemanifests segment* or | |
|
1099 | the *filelogs segment*, depending on version and the request options). | |
|
1100 | ||
|
1101 | Treemanifests Segment | |
|
1102 | --------------------- | |
|
1103 | ||
|
1104 | The *treemanifests segment* only exists in changegroup version "3", and | |
|
1105 | only if the 'treemanifest' param is part of the bundle2 changegroup part | |
|
1106 | (it is not possible to use changegroup version 3 outside of bundle2). | |
|
1107 | Aside from the filenames in the *treemanifests segment* containing a | |
|
1108 | trailing "/" character, it behaves identically to the *filelogs segment* | |
|
1109 | (see below). The final sub-segment is followed by an *empty chunk* | |
|
1110 | (logically, a sub-segment with filename size 0). This denotes the boundary | |
|
1111 | to the *filelogs segment*. | |
|
1076 | 1112 | |
|
1077 | 1113 | Filelogs Segment |
|
1078 | 1114 | ================ |
|
1079 | 1115 | |
|
1080 |
The *filelogs |
|
|
1116 | The *filelogs segment* consists of multiple sub-segments, each | |
|
1081 | 1117 | corresponding to an individual file whose data is being described: |
|
1082 | 1118 | |
|
1083 | +--------------------------------------+ | |
|
1084 | | | | | | | |
|
1085 | | filelog0 | filelog1 | filelog2 | ... | | |
|
1086 | | | | | | | |
|
1087 | +--------------------------------------+ | |
|
1088 | ||
|
1089 | In version "3" of the changegroup format, filelogs may include directory | |
|
1090 | logs when treemanifests are in use. directory logs are identified by | |
|
1091 | having a trailing '/' on their filename (see below). | |
|
1092 | ||
|
1093 | The final filelog sub-segment is followed by an *empty chunk* to denote | |
|
1094 | the end of the segment and the overall changegroup. | |
|
1119 | +--------------------------------------------------+ | |
|
1120 | | | | | | | | |
|
1121 | | filelog0 | filelog1 | filelog2 | ... | 0x0 | | |
|
1122 | | | | | | (4 bytes) | | |
|
1123 | | | | | | | | |
|
1124 | +--------------------------------------------------+ | |
|
1125 | ||
|
1126 | The final filelog sub-segment is followed by an *empty chunk* (logically, | |
|
1127 | a sub-segment with filename size 0). This denotes the end of the segment | |
|
1128 | and of the overall changegroup. | |
|
1095 | 1129 | |
|
1096 | 1130 | Each filelog sub-segment consists of the following: |
|
1097 | 1131 | |
|
1098 | +------------------------------------------+ | |
|
1132 | +------------------------------------------------------+ | |
|
1099 | 1133 | | | | | |
|
1100 |
| filename |
|
|
1101 |
| ( |
|
|
1134 | | filename length | filename | delta group | | |
|
1135 | | (4 bytes) | (<length - 4> bytes) | (various) | | |
|
1102 | 1136 | | | | | |
|
1103 | +------------------------------------------+ | |
|
1137 | +------------------------------------------------------+ | |
|
1104 | 1138 | |
|
1105 | 1139 | That is, a *chunk* consisting of the filename (not terminated or padded) |
|
1106 | followed by N chunks constituting the *delta group* for this file. | |
|
1140 | followed by N chunks constituting the *delta group* for this file. The | |
|
1141 | *empty chunk* at the end of each *delta group* denotes the boundary to the | |
|
1142 | next filelog sub-segment. | |
|
1107 | 1143 | |
|
1108 | 1144 | Test list of commands with command with no help text |
|
1109 | 1145 | |
@@ -3031,26 +3067,41 b' Sub-topic topics rendered properly' | |||
|
3031 | 3067 | <h1>Changegroups</h1> |
|
3032 | 3068 | <p> |
|
3033 | 3069 | Changegroups are representations of repository revlog data, specifically |
|
3034 | the changelog, manifest, and filelogs. | |
|
3070 | the changelog data, root/flat manifest data, treemanifest data, and | |
|
3071 | filelogs. | |
|
3035 | 3072 | </p> |
|
3036 | 3073 | <p> |
|
3037 | 3074 | There are 3 versions of changegroups: "1", "2", and "3". From a |
|
3038 | high-level, versions "1" and "2" are almost exactly the same, with | |
|
3039 |
|
|
|
3040 | segment. Version "3" adds support for exchanging treemanifests and | |
|
3041 | includes revlog flags in the delta header. | |
|
3075 | high-level, versions "1" and "2" are almost exactly the same, with the | |
|
3076 | only difference being an additional item in the *delta header*. Version | |
|
3077 | "3" adds support for revlog flags in the *delta header* and optionally | |
|
3078 | exchanging treemanifests (enabled by setting an option on the | |
|
3079 | "changegroup" part in the bundle2). | |
|
3042 | 3080 | </p> |
|
3043 | 3081 | <p> |
|
3044 |
Changegroups consist |
|
|
3082 | Changegroups when not exchanging treemanifests consist of 3 logical | |
|
3083 | segments: | |
|
3045 | 3084 | </p> |
|
3046 | 3085 | <pre> |
|
3047 | 3086 | +---------------------------------+ |
|
3048 | 3087 | | | | | |
|
3049 | 3088 | | changeset | manifest | filelogs | |
|
3050 | 3089 | | | | | |
|
3090 | | | | | | |
|
3051 | 3091 | +---------------------------------+ |
|
3052 | 3092 | </pre> |
|
3053 | 3093 | <p> |
|
3094 | When exchanging treemanifests, there are 4 logical segments: | |
|
3095 | </p> | |
|
3096 | <pre> | |
|
3097 | +-------------------------------------------------+ | |
|
3098 | | | | | | | |
|
3099 | | changeset | root | treemanifests | filelogs | | |
|
3100 | | | manifest | | | | |
|
3101 | | | | | | | |
|
3102 | +-------------------------------------------------+ | |
|
3103 | </pre> | |
|
3104 | <p> | |
|
3054 | 3105 | The principal building block of each segment is a *chunk*. A *chunk* |
|
3055 | 3106 | is a framed piece of data: |
|
3056 | 3107 | </p> |
@@ -3058,17 +3109,18 b' Sub-topic topics rendered properly' | |||
|
3058 | 3109 | +---------------------------------------+ |
|
3059 | 3110 | | | | |
|
3060 | 3111 | | length | data | |
|
3061 |
| ( |
|
|
3112 | | (4 bytes) | (<length - 4> bytes) | | |
|
3062 | 3113 | | | | |
|
3063 | 3114 | +---------------------------------------+ |
|
3064 | 3115 | </pre> |
|
3065 | 3116 | <p> |
|
3066 | Each chunk starts with a 32-bit big-endian signed integer indicating | |
|
3067 | the length of the raw data that follows. | |
|
3117 | All integers are big-endian signed integers. Each chunk starts with a 32-bit | |
|
3118 | integer indicating the length of the entire chunk (including the length field | |
|
3119 | itself). | |
|
3068 | 3120 | </p> |
|
3069 | 3121 | <p> |
|
3070 |
There is a special case chunk that has 0 length |
|
|
3071 | call this an *empty chunk*. | |
|
3122 | There is a special case chunk that has a value of 0 for the length | |
|
3123 | ("0x00000000"). We call this an *empty chunk*. | |
|
3072 | 3124 | </p> |
|
3073 | 3125 | <h2>Delta Groups</h2> |
|
3074 | 3126 | <p> |
@@ -3083,31 +3135,32 b' Sub-topic topics rendered properly' | |||
|
3083 | 3135 | +------------------------------------------------------------------------+ |
|
3084 | 3136 | | | | | | | |
|
3085 | 3137 | | chunk0 length | chunk0 data | chunk1 length | chunk1 data | 0x0 | |
|
3086 |
| ( |
|
|
3138 | | (4 bytes) | (various) | (4 bytes) | (various) | (4 bytes) | | |
|
3087 | 3139 | | | | | | | |
|
3088 |
+------------------------------------------------------------ |
|
|
3140 | +------------------------------------------------------------------------+ | |
|
3089 | 3141 | </pre> |
|
3090 | 3142 | <p> |
|
3091 | 3143 | Each *chunk*'s data consists of the following: |
|
3092 | 3144 | </p> |
|
3093 | 3145 | <pre> |
|
3094 |
+--------------------------------------- |
|
|
3095 |
| |
|
|
3096 |
| delta header |
|
|
3097 | | (various) | (12 bytes) | (various) | | |
|
3098 |
| |
|
|
3099 |
+--------------------------------------- |
|
|
3146 | +---------------------------------------+ | |
|
3147 | | | | | |
|
3148 | | delta header | delta data | | |
|
3149 | | (various by version) | (various) | | |
|
3150 | | | | | |
|
3151 | +---------------------------------------+ | |
|
3100 | 3152 | </pre> |
|
3101 | 3153 | <p> |
|
3102 | The *length* field is the byte length of the remaining 3 logical pieces | |
|
3103 | of data. The *delta* is a diff from an existing entry in the changelog. | |
|
3154 | The *delta data* is a series of *delta*s that describe a diff from an existing | |
|
3155 | entry (either that the recipient already has, or previously specified in the | |
|
3156 | bundle/changegroup). | |
|
3104 | 3157 | </p> |
|
3105 | 3158 | <p> |
|
3106 | 3159 | The *delta header* is different between versions "1", "2", and |
|
3107 | 3160 | "3" of the changegroup format. |
|
3108 | 3161 | </p> |
|
3109 | 3162 | <p> |
|
3110 | Version 1: | |
|
3163 | Version 1 (headerlen=80): | |
|
3111 | 3164 | </p> |
|
3112 | 3165 | <pre> |
|
3113 | 3166 | +------------------------------------------------------+ |
@@ -3118,7 +3171,7 b' Sub-topic topics rendered properly' | |||
|
3118 | 3171 | +------------------------------------------------------+ |
|
3119 | 3172 | </pre> |
|
3120 | 3173 | <p> |
|
3121 | Version 2: | |
|
3174 | Version 2 (headerlen=100): | |
|
3122 | 3175 | </p> |
|
3123 | 3176 | <pre> |
|
3124 | 3177 | +------------------------------------------------------------------+ |
@@ -3129,7 +3182,7 b' Sub-topic topics rendered properly' | |||
|
3129 | 3182 | +------------------------------------------------------------------+ |
|
3130 | 3183 | </pre> |
|
3131 | 3184 | <p> |
|
3132 | Version 3: | |
|
3185 | Version 3 (headerlen=102): | |
|
3133 | 3186 | </p> |
|
3134 | 3187 | <pre> |
|
3135 | 3188 | +------------------------------------------------------------------------------+ |
@@ -3140,74 +3193,93 b' Sub-topic topics rendered properly' | |||
|
3140 | 3193 | +------------------------------------------------------------------------------+ |
|
3141 | 3194 | </pre> |
|
3142 | 3195 | <p> |
|
3143 | The *mdiff header* consists of 3 32-bit big-endian signed integers | |
|
3144 | describing offsets at which to apply the following delta content: | |
|
3196 | The *delta data* consists of "chunklen - 4 - headerlen" bytes, which contain a | |
|
3197 | series of *delta*s, densely packed (no separators). These deltas describe a diff | |
|
3198 | from an existing entry (either that the recipient already has, or previously | |
|
3199 | specified in the bundle/changegroup). The format is described more fully in | |
|
3200 | "hg help internals.bdiff", but briefly: | |
|
3145 | 3201 | </p> |
|
3146 |
<p |
|
|
3147 | +-------------------------------------+ | |
|
3148 | | | | | | |
|
3149 |
| offset |
|
|
3150 | | (32 bits) | (32 bits) | (32 bits) | | |
|
3151 | | | | | | |
|
3152 | +-------------------------------------+ | |
|
3153 |
</p |
|
|
3202 | <pre> | |
|
3203 | +---------------------------------------------------------------+ | |
|
3204 | | | | | | | |
|
3205 | | start offset | end offset | new length | content | | |
|
3206 | | (4 bytes) | (4 bytes) | (4 bytes) | (<new length> bytes) | | |
|
3207 | | | | | | | |
|
3208 | +---------------------------------------------------------------+ | |
|
3209 | </pre> | |
|
3210 | <p> | |
|
3211 | Please note that the length field in the delta data does *not* include itself. | |
|
3212 | </p> | |
|
3154 | 3213 | <p> |
|
3155 | 3214 | In version 1, the delta is always applied against the previous node from |
|
3156 | 3215 | the changegroup or the first parent if this is the first entry in the |
|
3157 | 3216 | changegroup. |
|
3158 | 3217 | </p> |
|
3159 | 3218 | <p> |
|
3160 | In version 2, the delta base node is encoded in the entry in the | |
|
3219 | In version 2 and up, the delta base node is encoded in the entry in the | |
|
3161 | 3220 | changegroup. This allows the delta to be expressed against any parent, |
|
3162 | 3221 | which can result in smaller deltas and more efficient encoding of data. |
|
3163 | 3222 | </p> |
|
3164 | 3223 | <h2>Changeset Segment</h2> |
|
3165 | 3224 | <p> |
|
3166 | 3225 | The *changeset segment* consists of a single *delta group* holding |
|
3167 | changelog data. |
|
3168 | boundary to the *manifest |
|
3226 | changelog data. The *empty chunk* at the end of the *delta group* denotes | |
|
3227 | the boundary to the *manifest segment*. | |
|
3169 | 3228 | </p> |
|
3170 | 3229 | <h2>Manifest Segment</h2> |
|
3171 | 3230 | <p> |
|
3172 | The *manifest segment* consists of a single *delta group* holding | |
|
3173 | manifest data. It is followed by an *empty chunk* to denote the boundary | |
|
3174 | to the *filelogs segment*. | |
|
3231 | The *manifest segment* consists of a single *delta group* holding manifest | |
|
3232 | data. If treemanifests are in use, it contains only the manifest for the | |
|
3233 | root directory of the repository. Otherwise, it contains the entire | |
|
3234 | manifest data. The *empty chunk* at the end of the *delta group* denotes | |
|
3235 | the boundary to the next segment (either the *treemanifests segment* or the | |
|
3236 | *filelogs segment*, depending on version and the request options). | |
|
3237 | </p> | |
|
3238 | <h3>Treemanifests Segment</h3> | |
|
3239 | <p> | |
|
3240 | The *treemanifests segment* only exists in changegroup version "3", and | |
|
3241 | only if the 'treemanifest' param is part of the bundle2 changegroup part | |
|
3242 | (it is not possible to use changegroup version 3 outside of bundle2). | |
|
3243 | Aside from the filenames in the *treemanifests segment* containing a | |
|
3244 | trailing "/" character, it behaves identically to the *filelogs segment* | |
|
3245 | (see below). The final sub-segment is followed by an *empty chunk* (logically, | |
|
3246 | a sub-segment with filename size 0). This denotes the boundary to the | |
|
3247 | *filelogs segment*. | |
|
3175 | 3248 | </p> |
|
3176 | 3249 | <h2>Filelogs Segment</h2> |
|
3177 | 3250 | <p> |
|
3178 | The *filelogs |
|
3251 | The *filelogs segment* consists of multiple sub-segments, each | |
|
3179 | 3252 | corresponding to an individual file whose data is being described: |
|
3180 | 3253 | </p> |
|
3181 | 3254 | <pre> |
|
3182 | +--------------------------------------+ | |
|
3183 | | | | | | | |
|
3184 | | filelog0 | filelog1 | filelog2 | ... | | |
|
3185 | | | | | | | |
|
3186 | +--------------------------------------+ | |
|
3255 | +--------------------------------------------------+ | |
|
3256 | | | | | | | | |
|
3257 | | filelog0 | filelog1 | filelog2 | ... | 0x0 | | |
|
3258 | | | | | | (4 bytes) | | |
|
3259 | | | | | | | | |
|
3260 | +--------------------------------------------------+ | |
|
3187 | 3261 | </pre> |
|
3188 | 3262 | <p> |
|
3189 | In version "3" of the changegroup format, filelogs may include | |
|
3190 | directory logs when treemanifests are in use. directory logs are | |
|
3191 | identified by having a trailing '/' on their filename (see below). | |
|
3192 | </p> | |
|
3193 | <p> | |
|
3194 | The final filelog sub-segment is followed by an *empty chunk* to denote | |
|
3195 | the end of the segment and the overall changegroup. | |
|
3263 | The final filelog sub-segment is followed by an *empty chunk* (logically, | |
|
3264 | a sub-segment with filename size 0). This denotes the end of the segment | |
|
3265 | and of the overall changegroup. | |
|
3196 | 3266 | </p> |
|
3197 | 3267 | <p> |
|
3198 | 3268 | Each filelog sub-segment consists of the following: |
|
3199 | 3269 | </p> |
|
3200 | 3270 | <pre> |
|
3201 | +------------------------------------------+ | |
|
3271 | +------------------------------------------------------+ | |
|
3202 | 3272 | | | | | |
|
3203 | | filename |
|
3204 | | |
|
3273 | | filename length | filename | delta group | | |
|
3274 | | (4 bytes) | (<length - 4> bytes) | (various) | | |
|
3205 | 3275 | | | | | |
|
3206 | +------------------------------------------+ | |
|
3276 | +------------------------------------------------------+ | |
|
3207 | 3277 | </pre> |
|
3208 | 3278 | <p> |
|
3209 | 3279 | That is, a *chunk* consisting of the filename (not terminated or padded) |
|
3210 | followed by N chunks constituting the *delta group* for this file. | |
|
3280 | followed by N chunks constituting the *delta group* for this file. The | |
|
3281 | *empty chunk* at the end of each *delta group* denotes the boundary to the | |
|
3282 | next filelog sub-segment. | |
|
3211 | 3283 | </p> |
|
3212 | 3284 | |
|
3213 | 3285 | </div> |
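
Review aid: the new help text describes the *delta data* as a densely packed series of deltas, each with a start offset, an end offset, a new length (which does not count its own 4 bytes), and the replacement content. The following is a minimal Python sketch of decoding and applying that layout, not Mercurial's own implementation. The function names `iter_deltas` and `apply_deltas` are hypothetical, and the sketch assumes the three header fields are unsigned big-endian 32-bit integers and that deltas are ordered by offset against the original base text; the section above does not state either point explicitly.

```python
import struct

# Assumption: each delta header is three big-endian unsigned 32-bit integers
# (start offset, end offset, new length), as laid out in the table above.
_DELTA_HEADER = struct.Struct(">III")


def iter_deltas(delta_data):
    """Yield (start, end, content) for each densely packed delta."""
    pos = 0
    while pos < len(delta_data):
        start, end, new_len = _DELTA_HEADER.unpack_from(delta_data, pos)
        pos += _DELTA_HEADER.size
        # The length field does not include itself: it counts content only.
        content = delta_data[pos:pos + new_len]
        if len(content) != new_len:
            raise ValueError("truncated delta content")
        pos += new_len
        yield start, end, content


def apply_deltas(base, delta_data):
    """Replace base[start:end] with content for every delta, in order."""
    out = []
    last = 0
    for start, end, content in iter_deltas(delta_data):
        out.append(base[last:start])  # unchanged bytes before this delta
        out.append(content)           # replacement bytes
        last = end                    # skip the replaced range in the base
    out.append(base[last:])           # unchanged tail
    return b"".join(out)
```

For example, a single delta with start offset 6, end offset 11, and content `b"there"` applied to `b"hello world"` would replace the final word. A delta whose start and end offsets are equal acts as a pure insertion.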