internals: typo pass on the dirstate-v2 help file...
Raphaël Gomès -
r49143:62486073 stable
@@ -1,616 +1,616 b''
1 The *dirstate* is what Mercurial uses internally to track
1 The *dirstate* is what Mercurial uses internally to track
2 the state of files in the working directory,
2 the state of files in the working directory,
3 such as set by commands like `hg add` and `hg rm`.
3 such as set by commands like `hg add` and `hg rm`.
4 It also contains some cached data that help make `hg status` faster.
4 It also contains some cached data that help make `hg status` faster.
5 The name refers both to `.hg/dirstate` on the filesystem
5 The name refers both to `.hg/dirstate` on the filesystem
6 and the corresponding data structure in memory while a Mercurial process
6 and the corresponding data structure in memory while a Mercurial process
7 is running.
7 is running.
8
8
9 The original file format, retroactively dubbed `dirstate-v1`,
9 The original file format, retroactively dubbed `dirstate-v1`,
10 is described at https://www.mercurial-scm.org/wiki/DirState.
10 is described at https://www.mercurial-scm.org/wiki/DirState.
11 It is made of a flat sequence of unordered variable-size entries,
11 It is made of a flat sequence of unordered variable-size entries,
12 so accessing any information in it requires parsing all of it.
12 so accessing any information in it requires parsing all of it.
13 Similarly, saving changes requires rewriting the entire file.
13 Similarly, saving changes requires rewriting the entire file.
14
14
15 The newer `dirsate-v2` file format is designed to fix these limitations
15 The newer `dirstate-v2` file format is designed to fix these limitations
16 and make `hg status` faster.
16 and make `hg status` faster.
17
17
18 User guide
18 User guide
19 ==========
19 ==========
20
20
21 Compatibility
21 Compatibility
22 -------------
22 -------------
23
23
24 The file format is experimental and may still change.
24 The file format is experimental and may still change.
25 Different versions of Mercurial may not be compatible with each other
25 Different versions of Mercurial may not be compatible with each other
26 when working on a local repository that uses this format.
26 when working on a local repository that uses this format.
27 When using an incompatible version with the experimental format,
27 When using an incompatible version with the experimental format,
28 anything can happen including data corruption.
28 anything can happen including data corruption.
29
29
30 Since the dirstate is entirely local and not relevant to the wire protocol,
30 Since the dirstate is entirely local and not relevant to the wire protocol,
31 `dirstate-v2` does not affect compatibility with remote Mercurial versions.
31 `dirstate-v2` does not affect compatibility with remote Mercurial versions.
32
32
33 When `share-safe` is enabled, different repositories sharing the same store
33 When `share-safe` is enabled, different repositories sharing the same store
34 can use different dirstate formats.
34 can use different dirstate formats.
35
35
36 Enabling `dirsate-v2` for new local repositories
36 Enabling `dirstate-v2` for new local repositories
37 ------------------------------------------------
37 ------------------------------------------------
38
38
39 When creating a new local repository such as with `hg init` or `hg clone`,
39 When creating a new local repository such as with `hg init` or `hg clone`,
40 the `exp-rc-dirstate-v2` boolean in the `format` configuration section
40 the `exp-rc-dirstate-v2` boolean in the `format` configuration section
41 controls whether to use this file format.
41 controls whether to use this file format.
42 This is disabled by default as of this writing.
42 This is disabled by default as of this writing.
43 To enable it for a single repository, run for example::
43 To enable it for a single repository, run for example::
44
44
45 $ hg init my-project --config format.exp-rc-dirstate-v2=1
45 $ hg init my-project --config format.exp-rc-dirstate-v2=1
46
46
47 Checking the format of an existing local repsitory
47 Checking the format of an existing local repository
48 --------------------------------------------------
48 --------------------------------------------------
49
49
50 The `debugformat` commands prints information about
50 The `debugformat` commands prints information about
51 which of multiple optional formats are used in the current repository,
51 which of multiple optional formats are used in the current repository,
52 including `dirstate-v2`::
52 including `dirstate-v2`::
53
53
54 $ hg debugformat
54 $ hg debugformat
55 format-variant repo
55 format-variant repo
56 fncache: yes
56 fncache: yes
57 dirstate-v2: yes
57 dirstate-v2: yes
58 […]
58 […]
59
59
60 Upgrading or downgrading an existing local repository
60 Upgrading or downgrading an existing local repository
61 -----------------------------------------------------
61 -----------------------------------------------------
62
62
63 The `debugupgrade` command does various upgrades or downgrades
63 The `debugupgrade` command does various upgrades or downgrades
64 on a local repository
64 on a local repository
65 based on the current Mercurial version and on configuration.
65 based on the current Mercurial version and on configuration.
66 The same `format.exp-rc-dirstate-v2` configuration is used again.
66 The same `format.exp-rc-dirstate-v2` configuration is used again.
67
67
68 Example to upgrade::
68 Example to upgrade::
69
69
70 $ hg debugupgrade --config format.exp-rc-dirstate-v2=1
70 $ hg debugupgrade --config format.exp-rc-dirstate-v2=1
71
71
72 Example to downgrade to `dirstate-v1`::
72 Example to downgrade to `dirstate-v1`::
73
73
74 $ hg debugupgrade --config format.exp-rc-dirstate-v2=0
74 $ hg debugupgrade --config format.exp-rc-dirstate-v2=0
75
75
76 Both of this commands do nothing but print a list of proposed changes,
76 Both of this commands do nothing but print a list of proposed changes,
77 which may include changes unrelated to the dirstate.
77 which may include changes unrelated to the dirstate.
78 Those other changes are controlled by their own configuration keys.
78 Those other changes are controlled by their own configuration keys.
79 Add `--run` to a command to actually apply the proposed changes.
79 Add `--run` to a command to actually apply the proposed changes.
80
80
81 Backups of `.hg/requires` and `.hg/dirstate` are created
81 Backups of `.hg/requires` and `.hg/dirstate` are created
82 in a `.hg/upgradebackup.*` directory.
82 in a `.hg/upgradebackup.*` directory.
83 If something goes wrong, restoring those files should undo the change.
83 If something goes wrong, restoring those files should undo the change.
84
84
85 Note that upgrading affects compatibility with older versions of Mercurial
85 Note that upgrading affects compatibility with older versions of Mercurial
86 as noted above.
86 as noted above.
87 This can be relevant when a repository’s files are on a USB drive
87 This can be relevant when a repository’s files are on a USB drive
88 or some other removable media, or shared over the network, etc.
88 or some other removable media, or shared over the network, etc.
89
89
90 Internal filesystem representation
90 Internal filesystem representation
91 ==================================
91 ==================================
92
92
93 Requirements file
93 Requirements file
94 -----------------
94 -----------------
95
95
96 The `.hg/requires` file indicates which of various optional file formats
96 The `.hg/requires` file indicates which of various optional file formats
97 are used by a given repository.
97 are used by a given repository.
98 Mercurial aborts when seeing a requirement it does not know about,
98 Mercurial aborts when seeing a requirement it does not know about,
99 which avoids older version accidentally messing up a respository
99 which avoids older version accidentally messing up a repository
100 that uses a format that was introduced later.
100 that uses a format that was introduced later.
101 For versions that do support a format, the presence or absence of
101 For versions that do support a format, the presence or absence of
102 the corresponding requirement indicates whether to use that format.
102 the corresponding requirement indicates whether to use that format.
103
103
104 When the file contains a `dirstate-v2` line,
104 When the file contains a `dirstate-v2` line,
105 the `dirstate-v2` format is used.
105 the `dirstate-v2` format is used.
106 With no such line `dirstate-v1` is used.
106 With no such line `dirstate-v1` is used.
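
The requirement check described above can be sketched in a few lines of Python. This is only an illustration, not Mercurial's actual code: the function name `uses_dirstate_v2` is hypothetical, and it assumes the requirement appears on a line of its own exactly as `dirstate-v2`, as the text states.

```python
def uses_dirstate_v2(requires_text: str) -> bool:
    """Return True if the text of `.hg/requires` lists `dirstate-v2`.

    Hypothetical helper: the caller is assumed to have read the file already.
    """
    # Each requirement sits on its own line, so compare whole lines only
    # rather than searching for a substring.
    return "dirstate-v2" in requires_text.splitlines()
```

With no such line the function returns `False`, which corresponds to `dirstate-v1`.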
107
107
108 High level description
108 High level description
109 ----------------------
109 ----------------------
110
110
111 Whereas `dirstate-v1` uses a single `.hg/disrtate` file,
111 Whereas `dirstate-v1` uses a single `.hg/dirstate` file,
112 in `dirstate-v2` that file is a "docket" file
112 in `dirstate-v2` that file is a "docket" file
113 that only contains some metadata
113 that only contains some metadata
114 and points to separate data file named `.hg/dirstate.{ID}`,
114 and points to separate data file named `.hg/dirstate.{ID}`,
115 where `{ID}` is a random identifier.
115 where `{ID}` is a random identifier.
116
116
117 This separation allows making data files append-only
117 This separation allows making data files append-only
118 and therefore safer to memory-map.
118 and therefore safer to memory-map.
119 Creating a new data file (occasionally to clean up unused data)
119 Creating a new data file (occasionally to clean up unused data)
120 can be done with a different ID
120 can be done with a different ID
121 without disrupting another Mercurial process
121 without disrupting another Mercurial process
122 that could still be using the previous data file.
122 that could still be using the previous data file.
123
123
124 Both files have a format designed to reduce the need for parsing,
124 Both files have a format designed to reduce the need for parsing,
125 by using fixed-size binary components as much as possible.
125 by using fixed-size binary components as much as possible.
126 For data that is not fixed-size,
126 For data that is not fixed-size,
127 references to other parts of a file can be made by storing "pseudo-pointers":
127 references to other parts of a file can be made by storing "pseudo-pointers":
128 integers counted in bytes from the start of a file.
128 integers counted in bytes from the start of a file.
129 For read-only access no data structure is needed,
129 For read-only access no data structure is needed,
130 only a bytes buffer (possibly memory-mapped directly from the filesystem)
130 only a bytes buffer (possibly memory-mapped directly from the filesystem)
131 with specific parts read on demand.
131 with specific parts read on demand.
132
132
133 The data file contains "nodes" organized in a tree.
133 The data file contains "nodes" organized in a tree.
134 Each node represents a file or directory inside the working directory
134 Each node represents a file or directory inside the working directory
135 or its parent changeset.
135 or its parent changeset.
136 This tree has the same structure as the filesystem,
136 This tree has the same structure as the filesystem,
137 so a node representing a directory has child nodes representing
137 so a node representing a directory has child nodes representing
138 the files and subdirectories contained directly in that directory.
138 the files and subdirectories contained directly in that directory.
139
139
140 The docket file format
140 The docket file format
141 ----------------------
141 ----------------------
142
142
143 This is implemented in `rust/hg-core/src/dirstate_tree/on_disk.rs`
143 This is implemented in `rust/hg-core/src/dirstate_tree/on_disk.rs`
144 and `mercurial/dirstateutils/docket.py`.
144 and `mercurial/dirstateutils/docket.py`.
145
145
146 Components of the docket file are found at fixed offsets,
146 Components of the docket file are found at fixed offsets,
147 counted in bytes from the start of the file:
147 counted in bytes from the start of the file:
148
148
149 * Offset 0:
149 * Offset 0:
150 The 12-bytes marker string "dirstate-v2\n" ending with a newline character.
150 The 12-bytes marker string "dirstate-v2\n" ending with a newline character.
151 This makes it easier to tell a dirstate-v2 file from a dirstate-v1 file,
151 This makes it easier to tell a dirstate-v2 file from a dirstate-v1 file,
152 although it is not strictly necessary
152 although it is not strictly necessary
153 since `.hg/requires` determines which format to use.
153 since `.hg/requires` determines which format to use.
154
154
155 * Offset 12:
155 * Offset 12:
156 The changeset node ID on the first parent of the working directory,
156 The changeset node ID on the first parent of the working directory,
157 as up to 32 binary bytes.
157 as up to 32 binary bytes.
158 If a node ID is shorter (20 bytes for SHA-1),
158 If a node ID is shorter (20 bytes for SHA-1),
159 it is start-aligned and the rest of the bytes are set to zero.
159 it is start-aligned and the rest of the bytes are set to zero.
160
160
161 * Offset 44:
161 * Offset 44:
162 The changeset node ID on the second parent of the working directory,
162 The changeset node ID on the second parent of the working directory,
163 or all zeros if there isn’t one.
163 or all zeros if there isn’t one.
164 Also 32 binary bytes.
164 Also 32 binary bytes.
165
165
166 * Offset 76:
166 * Offset 76:
167 Tree metadata on 44 bytes, described below.
167 Tree metadata on 44 bytes, described below.
168 Its separation in this documentation from the rest of the docket
168 Its separation in this documentation from the rest of the docket
169 reflects a detail of the current implementation.
169 reflects a detail of the current implementation.
170 Since tree metadata is also made of fields at fixed offsets, those could
170 Since tree metadata is also made of fields at fixed offsets, those could
171 be inlined here by adding 76 bytes to each offset.
171 be inlined here by adding 76 bytes to each offset.
172
172
173 * Offset 120:
173 * Offset 120:
174 The used size of the data file, as a 32-bit big-endian integer.
174 The used size of the data file, as a 32-bit big-endian integer.
175 The actual size of the data file may be larger
175 The actual size of the data file may be larger
176 (if another Mercurial processis in appending to it
176 (if another Mercurial process is appending to it
177 but has not updated the docket yet).
177 but has not updated the docket yet).
178 That extra data must be ignored.
178 That extra data must be ignored.
179
179
180 * Offset 124:
180 * Offset 124:
181 The length of the data file identifier, as a 8-bit integer.
181 The length of the data file identifier, as a 8-bit integer.
182
182
183 * Offset 125:
183 * Offset 125:
184 The data file identifier.
184 The data file identifier.
185
185
186 * Any additional data is current ignored, and dropped when updating the file.
186 * Any additional data is current ignored, and dropped when updating the file.
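
As a rough illustration of the fixed offsets listed above, the docket header can be decoded with a single `struct` format in Python. This is a minimal sketch under stated assumptions, not the code in `docket.py`: the names `MARKER`, `DOCKET_HEADER` and `parse_docket` and the returned dictionary keys are made up for this example.

```python
import struct

# Illustrative names; the layout follows the offsets listed above.
MARKER = b"dirstate-v2\n"
# 12s marker, 32s first parent, 32s second parent, 44s tree metadata,
# L used size of the data file, B length of the data file identifier.
DOCKET_HEADER = struct.Struct(">12s32s32s44sLB")


def parse_docket(buf: bytes) -> dict:
    """Decode the fixed-offset part of a docket, then the identifier."""
    marker, p1, p2, tree_metadata, data_size, id_len = DOCKET_HEADER.unpack_from(buf, 0)
    if marker != MARKER:
        raise ValueError("missing dirstate-v2 marker")
    id_start = DOCKET_HEADER.size  # 125, where the identifier begins
    data_id = buf[id_start:id_start + id_len]
    if len(data_id) != id_len:
        raise ValueError("docket is truncated")
    # Anything after the identifier is ignored, as described above.
    return {
        "p1": p1,
        "p2": p2,
        "tree_metadata": tree_metadata,
        "data_size": data_size,
        "data_id": data_id,
    }
```

The parent node IDs are returned as the raw 32 bytes; a caller working with SHA-1 would keep only the first 20.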
187
187
188 Tree metadata in the docket file
188 Tree metadata in the docket file
189 --------------------------------
189 --------------------------------
190
190
191 Tree metadata is similarly made of components at fixed offsets.
191 Tree metadata is similarly made of components at fixed offsets.
192 These offsets are counted in bytes from the start of tree metadata,
192 These offsets are counted in bytes from the start of tree metadata,
193 which is 76 bytes after the start of the docket file.
193 which is 76 bytes after the start of the docket file.
194
194
195 This metadata can be thought of as the singular root of the tree
195 This metadata can be thought of as the singular root of the tree
196 formed by nodes in the data file.
196 formed by nodes in the data file.
197
197
198 * Offset 0:
198 * Offset 0:
199 Pseudo-pointer to the start of root nodes,
199 Pseudo-pointer to the start of root nodes,
200 counted in bytes from the start of the data file,
200 counted in bytes from the start of the data file,
201 as a 32-bit big-endian integer.
201 as a 32-bit big-endian integer.
202 These nodes describe files and directories found directly
202 These nodes describe files and directories found directly
203 at the root of the working directory.
203 at the root of the working directory.
204
204
205 * Offset 4:
205 * Offset 4:
206 Number of root nodes, as a 32-bit big-endian integer.
206 Number of root nodes, as a 32-bit big-endian integer.
207
207
208 * Offset 8:
208 * Offset 8:
209 Total number of nodes in the entire tree that "have a dirstate entry",
209 Total number of nodes in the entire tree that "have a dirstate entry",
210 as a 32-bit big-endian integer.
210 as a 32-bit big-endian integer.
211 Those nodes represent files that would be present at all in `dirstate-v1`.
211 Those nodes represent files that would be present at all in `dirstate-v1`.
212 This is typically less than the total number of nodes.
212 This is typically less than the total number of nodes.
213 This counter is used to implement `len(dirstatemap)`.
213 This counter is used to implement `len(dirstatemap)`.
214
214
215 * Offset 12:
215 * Offset 12:
216 Number of nodes in the entire tree that have a copy source,
216 Number of nodes in the entire tree that have a copy source,
217 as a 32-bit big-endian integer.
217 as a 32-bit big-endian integer.
218 At the next commit, these files are recorded
218 At the next commit, these files are recorded
219 as having been copied or moved/renamed from that source.
219 as having been copied or moved/renamed from that source.
220 (A move is recorded as a copy and separate removal of the source.)
220 (A move is recorded as a copy and separate removal of the source.)
221 This counter is used to implement `len(dirstatemap.copymap)`.
221 This counter is used to implement `len(dirstatemap.copymap)`.
222
222
223 * Offset 16:
223 * Offset 16:
224 An estimation of how many bytes of the data file
224 An estimation of how many bytes of the data file
225 (within its used size) are unused, as a 32-bit big-endian integer.
225 (within its used size) are unused, as a 32-bit big-endian integer.
226 When appending to an existing data file,
226 When appending to an existing data file,
227 some existing nodes or paths can be unreachable from the new root
227 some existing nodes or paths can be unreachable from the new root
228 but they still take up space.
228 but they still take up space.
229 This counter is used to decide when to write a new data file from scratch
229 This counter is used to decide when to write a new data file from scratch
230 instead of appending to an existing one,
230 instead of appending to an existing one,
231 in order to get rid of that unreachable data
231 in order to get rid of that unreachable data
232 and avoid unbounded file size growth.
232 and avoid unbounded file size growth.
233
233
234 * Offset 20:
234 * Offset 20:
235 These four bytes are currently ignored
235 These four bytes are currently ignored
236 and reset to zero when updating a docket file.
236 and reset to zero when updating a docket file.
237 This is an attempt at forward compatibility:
237 This is an attempt at forward compatibility:
238 future Mercurial versions could use this as a bit field
238 future Mercurial versions could use this as a bit field
239 to indicate that a dirstate has additional data or constraints.
239 to indicate that a dirstate has additional data or constraints.
240 Finding a dirstate file with the relevant bit unset indicates that
240 Finding a dirstate file with the relevant bit unset indicates that
241 it was written by a then-older version
241 it was written by a then-older version
242 which is not aware of that future change.
242 which is not aware of that future change.
243
243
244 * Offset 24:
244 * Offset 24:
245 Either 20 zero bytes, or a SHA-1 hash as 20 binary bytes.
245 Either 20 zero bytes, or a SHA-1 hash as 20 binary bytes.
246 When present, the hash is of ignore patterns
246 When present, the hash is of ignore patterns
247 that were used for some previous run of the `status` algorithm.
247 that were used for some previous run of the `status` algorithm.
248
248
249 * (Offset 44: end of tree metadata)
249 * (Offset 44: end of tree metadata)
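
The 44 bytes of tree metadata can be decoded the same way. The sketch below assumes the field order and sizes listed above; the class and field names are hypothetical and chosen only for readability.

```python
import struct
from typing import NamedTuple, Optional

# Five 32-bit big-endian integers, four ignored bytes, 20 hash bytes: 44 bytes.
TREE_METADATA = struct.Struct(">LLLLL4s20s")


class TreeMetadata(NamedTuple):
    root_nodes_start: int          # pseudo-pointer into the data file
    root_nodes_count: int
    nodes_with_entry_count: int    # backs len(dirstatemap)
    nodes_with_copy_source_count: int  # backs len(dirstatemap.copymap)
    unused_bytes: int              # estimate, within the used size
    ignore_patterns_hash: Optional[bytes]  # None when stored as 20 zero bytes


def parse_tree_metadata(raw: bytes) -> TreeMetadata:
    """Decode exactly 44 bytes of tree metadata."""
    start, count, entries, copies, unused, _ignored, ignore_hash = TREE_METADATA.unpack(raw)
    return TreeMetadata(
        start, count, entries, copies, unused,
        ignore_hash if any(ignore_hash) else None,
    )
```

In a full reader, `raw` would be the slice at offsets 76 to 120 of the docket file.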
250
250
251 Optional hash of ignore patterns
251 Optional hash of ignore patterns
252 --------------------------------
252 --------------------------------
253
253
254 The implementation of `status` at `rust/hg-core/src/dirstate_tree/status.rs`
254 The implementation of `status` at `rust/hg-core/src/dirstate_tree/status.rs`
255 has been optimized such that its run time is dominated by calls
255 has been optimized such that its run time is dominated by calls
256 to `stat` for reading the filesystem metadata of a file or directory,
256 to `stat` for reading the filesystem metadata of a file or directory,
257 and to `readdir` for listing the contents of a directory.
257 and to `readdir` for listing the contents of a directory.
258 In some cases the algorithm can skip calls to `readdir`
258 In some cases the algorithm can skip calls to `readdir`
259 (saving significant time)
259 (saving significant time)
260 because the dirstate already contains enough of the relevant information
260 because the dirstate already contains enough of the relevant information
261 to build the correct `status` results.
261 to build the correct `status` results.
262
262
263 The default configuration of `hg status` is to list unknown files
263 The default configuration of `hg status` is to list unknown files
264 but not ignored files.
264 but not ignored files.
265 In this case, it matters for the `readdir`-skipping optimization
265 In this case, it matters for the `readdir`-skipping optimization
266 if a given file used to be ignored but became unknown
266 if a given file used to be ignored but became unknown
267 because `.hgignore` changed.
267 because `.hgignore` changed.
268 To detect the possibility of such a change,
268 To detect the possibility of such a change,
269 the tree metadata contains an optional hash of all ignore patterns.
269 the tree metadata contains an optional hash of all ignore patterns.
270
270
271 We define:
271 We define:
272
272
273 * "Root" ignore files as:
273 * "Root" ignore files as:
274
274
275 - `.hgignore` at the root of the repository if it exists
275 - `.hgignore` at the root of the repository if it exists
276 - And all files from `ui.ignore.*` config.
276 - And all files from `ui.ignore.*` config.
277
277
278 This set of files is sorted by the string representation of their path.
278 This set of files is sorted by the string representation of their path.
279
279
280 * The "expanded contents" of an ignore files is the byte string made
280 * The "expanded contents" of an ignore files is the byte string made
281 by the concatenation of its contents followed by the "expanded contents"
281 by the concatenation of its contents followed by the "expanded contents"
282 of other files included with `include:` or `subinclude:` directives,
282 of other files included with `include:` or `subinclude:` directives,
283 in inclusion order. This definition is recursive, as included files can
283 in inclusion order. This definition is recursive, as included files can
284 themselves include more files.
284 themselves include more files.
285
285
286 This hash is defined as the SHA-1 of the concatenation (in sorted
286 This hash is defined as the SHA-1 of the concatenation (in sorted
287 order) of the "expanded contents" of each "root" ignore file.
287 order) of the "expanded contents" of each "root" ignore file.
288 (Note that computing this does not require actually concatenating
288 (Note that computing this does not require actually concatenating
289 into a single contiguous byte sequence.
289 into a single contiguous byte sequence.
290 Instead a SHA-1 hasher object can be created
290 Instead a SHA-1 hasher object can be created
291 and fed separate chunks one by one.)
291 and fed separate chunks one by one.)
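
The incremental hashing mentioned above could look like the following Python sketch. It assumes the caller has already computed the "expanded contents" of each "root" ignore file (resolving `include:` and `subinclude:` directives is left out), and that sorting the paths as byte strings matches the intended ordering. The function name is hypothetical.

```python
import hashlib
from typing import Mapping


def ignore_patterns_hash(expanded_contents: Mapping[bytes, bytes]) -> bytes:
    """SHA-1 over the expanded contents of root ignore files, in path order.

    `expanded_contents` maps each root ignore file path to its already
    expanded contents (an assumption of this sketch).
    """
    hasher = hashlib.sha1()
    for path in sorted(expanded_contents):
        # Feed chunks one by one instead of building one large byte string.
        hasher.update(expanded_contents[path])
    return hasher.digest()
```

The result is 20 bytes long, which fits the field at offset 24 of the tree metadata.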
292
292
293 The data file format
293 The data file format
294 --------------------
294 --------------------
295
295
296 This is implemented in `rust/hg-core/src/dirstate_tree/on_disk.rs`
296 This is implemented in `rust/hg-core/src/dirstate_tree/on_disk.rs`
297 and `mercurial/dirstateutils/v2.py`.
297 and `mercurial/dirstateutils/v2.py`.
298
298
299 The data file contains two types of data: paths and nodes.
299 The data file contains two types of data: paths and nodes.
300
300
301 Paths and nodes can be organized in any order in the file, except that sibling
301 Paths and nodes can be organized in any order in the file, except that sibling
302 nodes must be next to each other and sorted by their path.
302 nodes must be next to each other and sorted by their path.
303 Contiguity lets the parent refer to them all
303 Contiguity lets the parent refer to them all
304 by their count and a single pseudo-pointer,
304 by their count and a single pseudo-pointer,
305 instead of storing one pseudo-pointer per child node.
305 instead of storing one pseudo-pointer per child node.
306 Sorting allows using binary seach to find a child node with a given name
306 Sorting allows using binary search to find a child node with a given name
307 in `O(log(n))` byte sequence comparisons.
307 in `O(log(n))` byte sequence comparisons.
308
308
309 The current implemention writes paths and child node before a given node
309 The current implementation writes paths and child node before a given node
310 for ease of figuring out the value of pseudo-pointers by the time the are to be
310 for ease of figuring out the value of pseudo-pointers by the time the are to be
311 written, but this is not an obligation and readers must not rely on it.
311 written, but this is not an obligation and readers must not rely on it.
312
312
313 A path is stored as a byte string anywhere in the file, without delimiter.
313 A path is stored as a byte string anywhere in the file, without delimiter.
314 It is refered to by one or more node by a pseudo-pointer to its start, and its
314 It is referred to by one or more node by a pseudo-pointer to its start, and its
315 length in bytes. Since there is no delimiter,
315 length in bytes. Since there is no delimiter,
316 when a path is a substring of another the same bytes could be reused,
316 when a path is a substring of another the same bytes could be reused,
317 although the implementation does not exploit this as of this writing.
317 although the implementation does not exploit this as of this writing.
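
Reading a path back from the data file then amounts to slicing the buffer. A minimal sketch, with a hypothetical helper name, assuming `data` holds only the used portion of the data file:

```python
def read_path(data: bytes, start: int, length: int) -> bytes:
    """Return the path stored at pseudo-pointer `start` with the given length."""
    end = start + length
    if end > len(data):
        raise ValueError("path extends past the end of the data file")
    # No delimiter is stored, so the length alone marks where the path stops.
    return data[start:end]
```

Because there is no delimiter, two nodes may point into the same bytes: with `data = b"dir/file.txt"`, both `read_path(data, 0, 12)` and `read_path(data, 0, 3)` are valid reads.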
318
318
319 A node is stored on 43 bytes with components at fixed offsets. Paths and
319 A node is stored on 43 bytes with components at fixed offsets. Paths and
320 child nodes relevant to a node are stored externally and referenced though
320 child nodes relevant to a node are stored externally and referenced though
321 pseudo-pointers.
321 pseudo-pointers.
322
322
323 All integers are stored in big-endian. All pseudo-pointers are 32-bit integers
323 All integers are stored in big-endian. All pseudo-pointers are 32-bit integers
324 counting bytes from the start of the data file. Path lengths and positions
324 counting bytes from the start of the data file. Path lengths and positions
325 are 16-bit integers, also counted in bytes.
325 are 16-bit integers, also counted in bytes.
326
326
327 Node components are:
327 Node components are:
328
328
329 * Offset 0:
329 * Offset 0:
330 Pseudo-pointer to the full path of this node,
330 Pseudo-pointer to the full path of this node,
331 from the working directory root.
331 from the working directory root.
332
332
333 * Offset 4:
333 * Offset 4:
334 Length of the full path.
334 Length of the full path.
335
335
336 * Offset 6:
336 * Offset 6:
337 Position of the last `/` path separator within the full path,
337 Position of the last `/` path separator within the full path,
338 in bytes from the start of the full path,
338 in bytes from the start of the full path,
339 or zero if there isn’t one.
339 or zero if there isn’t one.
340 The part of the full path after this position is the "base name".
340 The part of the full path after this position is the "base name".
341 Since sibling nodes have the same parent, only their base name vary
341 Since sibling nodes have the same parent, only their base name vary
342 and needs to be considered when doing binary search to find a given path.
342 and needs to be considered when doing binary search to find a given path.
343
343
344 * Offset 8:
344 * Offset 8:
345 Pseudo-pointer to the "copy source" path for this node,
345 Pseudo-pointer to the "copy source" path for this node,
346 or zero if there is no copy source.
346 or zero if there is no copy source.
347
347
348 * Offset 12:
348 * Offset 12:
349 Length of the copy source path, or zero if there isn’t one.
349 Length of the copy source path, or zero if there isn’t one.
350
350
351 * Offset 14:
351 * Offset 14:
352 Pseudo-pointer to the start of child nodes.
352 Pseudo-pointer to the start of child nodes.
353
353
354 * Offset 18:
354 * Offset 18:
355 Number of child nodes, as a 32-bit integer.
355 Number of child nodes, as a 32-bit integer.
356 They occupy 43 times this number of bytes
356 They occupy 43 times this number of bytes
357 (not counting space for paths, and further descendants).
357 (not counting space for paths, and further descendants).
358
358
359 * Offset 22:
359 * Offset 22:
360 Number as a 32-bit integer of descendant nodes in this subtree,
360 Number as a 32-bit integer of descendant nodes in this subtree,
361 not including this node itself,
361 not including this node itself,
362 that "have a dirstate entry".
362 that "have a dirstate entry".
363 Those nodes represent files that would be present at all in `dirstate-v1`.
363 Those nodes represent files that would be present at all in `dirstate-v1`.
364 This is typically less than the total number of descendants.
364 This is typically less than the total number of descendants.
365 This counter is used to implement `has_dir`.
365 This counter is used to implement `has_dir`.
366
366
367 * Offset 26:
367 * Offset 26:
368 Number as a 32-bit integer of descendant nodes in this subtree,
368 Number as a 32-bit integer of descendant nodes in this subtree,
369 not including this node itself,
369 not including this node itself,
370 that represent files tracked in the working directory.
370 that represent files tracked in the working directory.
371 (For example, `hg rm` makes a file untracked.)
371 (For example, `hg rm` makes a file untracked.)
372 This counter is used to implement `has_tracked_dir`.
372 This counter is used to implement `has_tracked_dir`.
373
373
374 * Offset 30:
374 * Offset 30:
375 A `flags` fields that packs some boolean values as bits of a 16-bit integer.
375 A `flags` fields that packs some boolean values as bits of a 16-bit integer.
376 Starting from least-significant, bit masks are::
376 Starting from least-significant, bit masks are::
377
377
378 WDIR_TRACKED = 1 << 0
378 WDIR_TRACKED = 1 << 0
379 P1_TRACKED = 1 << 1
379 P1_TRACKED = 1 << 1
380 P2_INFO = 1 << 2
380 P2_INFO = 1 << 2
381 MODE_EXEC_PERM = 1 << 3
381 MODE_EXEC_PERM = 1 << 3
382 MODE_IS_SYMLINK = 1 << 4
382 MODE_IS_SYMLINK = 1 << 4
383 HAS_FALLBACK_EXEC = 1 << 5
383 HAS_FALLBACK_EXEC = 1 << 5
384 FALLBACK_EXEC = 1 << 6
384 FALLBACK_EXEC = 1 << 6
385 HAS_FALLBACK_SYMLINK = 1 << 7
385 HAS_FALLBACK_SYMLINK = 1 << 7
386 FALLBACK_SYMLINK = 1 << 8
386 FALLBACK_SYMLINK = 1 << 8
387 EXPECTED_STATE_IS_MODIFIED = 1 << 9
387 EXPECTED_STATE_IS_MODIFIED = 1 << 9
388 HAS_MODE_AND_SIZE = 1 << 10
388 HAS_MODE_AND_SIZE = 1 << 10
389 HAS_MTIME = 1 << 11
389 HAS_MTIME = 1 << 11
390 MTIME_SECOND_AMBIGUOUS = 1 << 12
390 MTIME_SECOND_AMBIGUOUS = 1 << 12
391 DIRECTORY = 1 << 13
391 DIRECTORY = 1 << 13
392 ALL_UNKNOWN_RECORDED = 1 << 14
392 ALL_UNKNOWN_RECORDED = 1 << 14
393 ALL_IGNORED_RECORDED = 1 << 15
393 ALL_IGNORED_RECORDED = 1 << 15
394
394
395 The meaning of each bit is described below.
395 The meaning of each bit is described below.
396
396
397 Other bits are unset.
397 Other bits are unset.
398 They may be assigned meaning in the future,
398 They may be assigned meaning in the future,
399 with the limitation that Mercurial versions that pre-date such meaning
399 with the limitation that Mercurial versions that pre-date such meaning
400 will always reset those bits to unset when writing nodes.
400 will always reset those bits to unset when writing nodes.
401 (A new node is written for any mutation in its subtree,
401 (A new node is written for any mutation in its subtree,
402 leaving the bytes of the old node unreachable
402 leaving the bytes of the old node unreachable
403 until the data file is rewritten entirely.)
403 until the data file is rewritten entirely.)
404
404
405 * Offset 32:
405 * Offset 32:
406 A `size` field described below, as a 32-bit integer.
406 A `size` field described below, as a 32-bit integer.
407 Unlike in dirstate-v1, negative values are not used.
407 Unlike in dirstate-v1, negative values are not used.
408
408
409 * Offset 36:
409 * Offset 36:
410 The seconds component of an `mtime` field described below,
410 The seconds component of an `mtime` field described below,
411 as a 32-bit integer.
411 as a 32-bit integer.
412 Unlike in dirstate-v1, negative values are not used.
412 Unlike in dirstate-v1, negative values are not used.
413 When `mtime` is used, this is the number of seconds since the Unix epoch
413 When `mtime` is used, this is the number of seconds since the Unix epoch
414 truncated to its lower 31 bits.
414 truncated to its lower 31 bits.
415
415
416 * Offset 40:
416 * Offset 40:
417 The nanoseconds component of an `mtime` field described below,
417 The nanoseconds component of an `mtime` field described below,
418 as a 32-bit integer.
418 as a 32-bit integer.
419 When `mtime` is used,
419 When `mtime` is used,
420 this is the number of nanoseconds since `mtime.seconds`,
420 this is the number of nanoseconds since `mtime.seconds`,
421 always stritctly less than one billion.
421 always strictly less than one billion.
422
422
423 This may be zero if more precision is not available.
423 This may be zero if more precision is not available.
424 (This can happen because of limitations in any of Mercurial, Python,
424 (This can happen because of limitations in any of Mercurial, Python,
425 libc, the operating system, …)
425 libc, the operating system, …)
426
426
427 When comparing two mtimes and either has this component set to zero,
427 When comparing two mtimes and either has this component set to zero,
428 the sub-second precision of both should be ignored.
428 the sub-second precision of both should be ignored.
429 False positives when checking mtime equality due to clock resolution
429 False positives when checking mtime equality due to clock resolution
430 are always possible and the status algorithm needs to deal with them,
430 are always possible and the status algorithm needs to deal with them,
431 but having too many false negatives could be harmful too.
431 but having too many false negatives could be harmful too.
432
432
433 * (Offset 44: end of this node)
433 * (Offset 44: end of this node)
434
434
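As an illustration of the fixed-size fields listed above, here is a minimal Python sketch that decodes bytes 26 to 44 of a node. The name `parse_node_tail` and the returned dictionary are hypothetical, and big-endian byte order is an assumption that this part of the text does not state.

```python
import struct

# Fields at offsets 26..44 of a node, in the order listed above:
# u32 tracked descendants, u16 flags, u32 size,
# u32 mtime seconds, u32 mtime nanoseconds.
# Big-endian (">") is an assumption of this sketch.
NODE_TAIL = struct.Struct(">IHIII")
NODE_TAIL_OFFSET = 26
NODE_SIZE = 44


def parse_node_tail(node: bytes) -> dict:
    """Decode the last 18 bytes of a 44-byte node (hypothetical helper)."""
    if len(node) != NODE_SIZE:
        raise ValueError("expected a 44-byte node")
    tracked, flags, size, secs, nanos = NODE_TAIL.unpack_from(
        node, NODE_TAIL_OFFSET
    )
    return {
        "tracked_descendants": tracked,
        "flags": flags,
        "size": size,
        "mtime_seconds": secs,
        "mtime_nanoseconds": nanos,
    }
```

The format string covers 4 + 2 + 4 + 4 + 4 = 18 bytes, which matches the distance between offset 26 and the end of the node at offset 44.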
435 The meaning of the boolean values packed in `flags` is:
435 The meaning of the boolean values packed in `flags` is:
436
436
437 `WDIR_TRACKED`
437 `WDIR_TRACKED`
438 Set if the working directory contains a tracked file at this node’s path.
438 Set if the working directory contains a tracked file at this node’s path.
439 This is typically set and unset by `hg add` and `hg rm`.
439 This is typically set and unset by `hg add` and `hg rm`.
440
440
441 `P1_TRACKED`
441 `P1_TRACKED`
442 Set if the working directory’s first parent changeset
442 Set if the working directory’s first parent changeset
443 (whose node identifier is found in tree metadata)
443 (whose node identifier is found in tree metadata)
444 contains a tracked file at this node’s path.
444 contains a tracked file at this node’s path.
445 This is a cache to reduce manifest lookups.
445 This is a cache to reduce manifest lookups.
446
446
447 `P2_INFO`
447 `P2_INFO`
448 Set if the file has been involved in some merge operation.
448 Set if the file has been involved in some merge operation.
449 Either because it was actually merged,
449 Either because it was actually merged,
450 or because the second parent (p2) version was ahead,
450 or because the second parent (p2) version was ahead,
451 or because some rename moved it there.
451 or because some rename moved it there.
452 In any of these cases `hg status` will want it displayed as modified.
452 In any of these cases `hg status` will want it displayed as modified.
453
453
454 Files that would be mentioned at all in the `dirstate-v1` file format
454 Files that would be mentioned at all in the `dirstate-v1` file format
455 have a node with at least one of the above three bits set in `dirstate-v2`.
455 have a node with at least one of the above three bits set in `dirstate-v2`.
456 Let’s call these files "tracked anywhere",
456 Let’s call these files "tracked anywhere",
457 and "untracked" the nodes with all three of these bits unset.
457 and "untracked" the nodes with all three of these bits unset.
458 Untracked nodes are typically for directories:
458 Untracked nodes are typically for directories:
459 they hold child nodes and form the tree structure.
459 they hold child nodes and form the tree structure.
460 Additional untracked nodes may also exist.
460 Additional untracked nodes may also exist.
461 Although implementations should strive to clean up nodes
461 Although implementations should strive to clean up nodes
462 that are entirely unused, other untracked nodes may also exist.
462 that are entirely unused, other untracked nodes may also exist.
463 For example, a future version of Mercurial might in some cases
463 For example, a future version of Mercurial might in some cases
464 add nodes for untracked files and/or ignored files in the working directory
464 add nodes for untracked files and/or ignored files in the working directory
465 in order to optimize `hg status`
465 in order to optimize `hg status`
466 by enabling it to skip `readdir` in more cases.
466 by enabling it to skip `readdir` in more cases.
467
467
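The "tracked anywhere" versus "untracked" distinction above amounts to a mask test on the three lowest flag bits. A minimal Python sketch, in which the names `TRACKED_ANYWHERE_MASK` and `is_tracked_anywhere` are hypothetical:

```python
# Bit masks copied from the `flags` list above.
WDIR_TRACKED = 1 << 0
P1_TRACKED = 1 << 1
P2_INFO = 1 << 2

# Hypothetical helper mask: any of these bits makes a node "tracked anywhere".
TRACKED_ANYWHERE_MASK = WDIR_TRACKED | P1_TRACKED | P2_INFO


def is_tracked_anywhere(flags: int) -> bool:
    """Return True if at least one of the three tracking bits is set."""
    return (flags & TRACKED_ANYWHERE_MASK) != 0
```

A node for which this returns False is "untracked" in the sense used here, which is typically a directory node.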
468 `HAS_MODE_AND_SIZE`
468 `HAS_MODE_AND_SIZE`
469 Must be unset for untracked nodes.
469 Must be unset for untracked nodes.
470 For files tracked anywhere, if this is set:
470 For files tracked anywhere, if this is set:
471 - The `size` field is the expected file size,
471 - The `size` field is the expected file size,
472 in bytes, truncated to its lower 31 bits.
472 in bytes, truncated to its lower 31 bits.
473 - The expected execute permission for the file’s owner
473 - The expected execute permission for the file’s owner
474 is given by `MODE_EXEC_PERM`
474 is given by `MODE_EXEC_PERM`
475 - The expected file type is given by `MODE_IS_SYMLINK`:
475 - The expected file type is given by `MODE_IS_SYMLINK`:
476 a symbolic link if set, or a normal file if unset.
476 a symbolic link if set, or a normal file if unset.
477 If this is unset the expected size, permission, and file type are unknown.
477 If this is unset the expected size, permission, and file type are unknown.
478 The `size` field is unused (set to zero).
478 The `size` field is unused (set to zero).
479
479
480 `HAS_MTIME`
480 `HAS_MTIME`
481 The node contains a "valid" last modification time in the `mtime` field.
481 The node contains a "valid" last modification time in the `mtime` field.
482
482
483
483
484 It means the `mtime` was already strictly in the past when observed,
484 It means the `mtime` was already strictly in the past when observed,
485 meaning that later changes cannot happen in the same clock tick
485 meaning that later changes cannot happen in the same clock tick
486 and must cause a different modification time
486 and must cause a different modification time
487 (unless the system clock jumps back and we get unlucky,
487 (unless the system clock jumps back and we get unlucky,
488 which is not impossible but deemed unlikely enough).
488 which is not impossible but deemed unlikely enough).
489
489
490 This means that if `std::fs::symlink_metadata` later reports
490 This means that if `std::fs::symlink_metadata` later reports
491 the same modification time
491 the same modification time
492 and ignored patterns haven’t changed,
492 and ignored patterns haven’t changed,
493 we can assume the node to be unchanged on disk.
493 we can assume the node to be unchanged on disk.
494
494
495 The `mtime` field can then be used to skip more expensive lookup when
495 The `mtime` field can then be used to skip more expensive lookup when
496 checking the status of "tracked" nodes.
496 checking the status of "tracked" nodes.
497
497
498 It can also be set for nodes where `DIRECTORY` is set.
498 It can also be set for nodes where `DIRECTORY` is set.
499 See `DIRECTORY` documentation for details.
499 See `DIRECTORY` documentation for details.
500
500
501 `DIRECTORY`
501 `DIRECTORY`
502 When set, this entry will match a directory that exists or existed on the
502 When set, this entry will match a directory that exists or existed on the
503 file system.
503 file system.
504
504
505 * When `HAS_MTIME` is set a directory has been seen on the file system and
505 * When `HAS_MTIME` is set a directory has been seen on the file system and
506 `mtime` matches its last modificiation time. However, `HAS_MTIME` not being set
506 `mtime` matches its last modification time. However, `HAS_MTIME` not
507 does not indicate the lack of directory on the file system.
507 being set does not indicate the lack of directory on the file system.
508
508
509 * When not tracked anywhere, this node does not represent an ignored or
509 * When not tracked anywhere, this node does not represent an ignored or
510 unknown file on disk.
510 unknown file on disk.
511
511
512 If `HAS_MTIME` is set
512 If `HAS_MTIME` is set
513 and `mtime` matches the last modification time of the directory on disk,
513 and `mtime` matches the last modification time of the directory on disk,
514 the directory is unchanged
514 the directory is unchanged
515 and we can skip calling `std::fs::read_dir` again for this directory,
515 and we can skip calling `std::fs::read_dir` again for this directory,
516 and iterate child dirstate nodes instead.
516 and iterate child dirstate nodes instead.
517 (as long as `ALL_UNKNOWN_RECORDED` and `ALL_IGNORED_RECORDED` are taken
517 (as long as `ALL_UNKNOWN_RECORDED` and `ALL_IGNORED_RECORDED` are taken
518 into account)
518 into account)
519
519
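The directory shortcut described above can be sketched as a pure decision function. This is one possible reading of the text, not Mercurial's actual implementation: the name `can_skip_read_dir` and its parameters are hypothetical, and mtimes are compared by plain equality for simplicity.

```python
# Bit masks copied from the `flags` list above.
HAS_MTIME = 1 << 11
DIRECTORY = 1 << 13
ALL_UNKNOWN_RECORDED = 1 << 14
ALL_IGNORED_RECORDED = 1 << 15


def can_skip_read_dir(flags, cached_mtime, disk_mtime,
                      list_unknown, list_ignored):
    """Decide whether child dirstate nodes can stand in for a directory read.

    `cached_mtime` and `disk_mtime` are (seconds, nanoseconds) tuples.
    `list_unknown` / `list_ignored` say whether the caller wants those
    kinds of files reported.
    """
    # The cached mtime is only meaningful for a directory node that has one.
    if not (flags & DIRECTORY and flags & HAS_MTIME):
        return False
    # A different mtime means the directory changed on disk.
    if cached_mtime != disk_mtime:
        return False
    # Listing unknown or ignored files requires that all of them were recorded.
    if list_unknown and not flags & ALL_UNKNOWN_RECORDED:
        return False
    if list_ignored and not flags & ALL_IGNORED_RECORDED:
        return False
    return True
```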
520 `MODE_EXEC_PERM`
520 `MODE_EXEC_PERM`
521 Must be unset if `HAS_MODE_AND_SIZE` is unset.
521 Must be unset if `HAS_MODE_AND_SIZE` is unset.
522 If `HAS_MODE_AND_SIZE` is set,
522 If `HAS_MODE_AND_SIZE` is set,
523 this indicates whether the file’s owner is expected
523 this indicates whether the file’s owner is expected
524 to have execute permission.
524 to have execute permission.
525
525
526 Beware that on systems without filesystem support for this information, the value
526 Beware that on systems without filesystem support for this information, the value
527 stored in the dirstate might be wrong and should not be relied on.
527 stored in the dirstate might be wrong and should not be relied on.
528
528
529 `MODE_IS_SYMLINK`
529 `MODE_IS_SYMLINK`
530 Must be unset if `HAS_MODE_AND_SIZE` is unset.
530 Must be unset if `HAS_MODE_AND_SIZE` is unset.
531 If `HAS_MODE_AND_SIZE` is set,
531 If `HAS_MODE_AND_SIZE` is set,
532 this indicates whether the file is expected to be a symlink
532 this indicates whether the file is expected to be a symlink
533 as opposed to a normal file.
533 as opposed to a normal file.
534
534
535 Beware that on systems without filesystem support for this information, the value
535 Beware that on systems without filesystem support for this information, the value
536 stored in the dirstate might be wrong and should not be relied on.
536 stored in the dirstate might be wrong and should not be relied on.
537
537
538 `EXPECTED_STATE_IS_MODIFIED`
538 `EXPECTED_STATE_IS_MODIFIED`
539 Must be unset for untracked nodes.
539 Must be unset for untracked nodes.
540 For:
540 For:
541 - a file tracked anywhere
541 - a file tracked anywhere
542 - that has expected metadata (`HAS_MODE_AND_SIZE` and `HAS_MTIME`)
542 - that has expected metadata (`HAS_MODE_AND_SIZE` and `HAS_MTIME`)
543 - if that metadata matches
543 - if that metadata matches
544 metadata found in the working directory with `stat`
544 metadata found in the working directory with `stat`
545 This bit indicates the status of the file.
545 This bit indicates the status of the file.
546 If set, the status is modified. If unset, it is clean.
546 If set, the status is modified. If unset, it is clean.
547
547
548 In cases where `hg status` needs to read the contents of a file
548 In cases where `hg status` needs to read the contents of a file
549 because metadata is ambiguous, this bit lets it record the result
549 because metadata is ambiguous, this bit lets it record the result
550 if the result is modified so that a future run of `hg status`
550 if the result is modified so that a future run of `hg status`
551 does not need to do the same again.
551 does not need to do the same again.
552 It is valid to never set this bit,
552 It is valid to never set this bit,
553 and consider expected metadata ambiguous if it is set.
553 and consider expected metadata ambiguous if it is set.
554
554
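The conditions under which `EXPECTED_STATE_IS_MODIFIED` is meaningful can be sketched as follows. The function name `cached_status` and the `metadata_matches` parameter are hypothetical; the parameter stands for the result of comparing the cached size, mode, and `mtime` with what `stat` reports.

```python
from typing import Optional

# Bit masks copied from the `flags` list above.
WDIR_TRACKED = 1 << 0
P1_TRACKED = 1 << 1
P2_INFO = 1 << 2
EXPECTED_STATE_IS_MODIFIED = 1 << 9
HAS_MODE_AND_SIZE = 1 << 10
HAS_MTIME = 1 << 11


def cached_status(flags: int, metadata_matches: bool) -> Optional[str]:
    """Return "modified" or "clean" when the cached bit is usable, else None.

    None means the caller has to fall back to a more expensive check,
    such as reading the file contents.
    """
    if not flags & (WDIR_TRACKED | P1_TRACKED | P2_INFO):
        return None  # untracked node: the bit must be unset and is unused
    if not (flags & HAS_MODE_AND_SIZE and flags & HAS_MTIME):
        return None  # no expected metadata to compare against
    if not metadata_matches:
        return None  # the file changed since the metadata was recorded
    return "modified" if flags & EXPECTED_STATE_IS_MODIFIED else "clean"
```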
555 `ALL_UNKNOWN_RECORDED`
555 `ALL_UNKNOWN_RECORDED`
556 If set, all "unknown" children existing on disk (at the time of the last
556 If set, all "unknown" children existing on disk (at the time of the last
557 status) have been recorded and the `mtime` associated with
557 status) have been recorded and the `mtime` associated with
558 `DIRECTORY` can be used for optimization even when "unknown" files
558 `DIRECTORY` can be used for optimization even when "unknown" files
559 are listed.
559 are listed.
560
560
561 Note that the number of recorded "unknown" children can still be zero if none
561 Note that the number of recorded "unknown" children can still be zero if none
562 were present.
562 were present.
563
563
564 Also note that having this flag unset does not imply that no "unknown"
564 Also note that having this flag unset does not imply that no "unknown"
565 children have been recorded. Some might be present, but there is no garantee
565 children have been recorded. Some might be present, but there is
566 that is will be all of them.
566 no guarantee that is will be all of them.
567
567
568 `ALL_IGNORED_RECORDED`
568 `ALL_IGNORED_RECORDED`
569 If set, all "ignored" children existing on disk (at the time of the last
569 If set, all "ignored" children existing on disk (at the time of the last
570 status) have been recorded and the `mtime` associated with
570 status) have been recorded and the `mtime` associated with
571 `DIRECTORY` can be used for optimization even when "ignored" files
571 `DIRECTORY` can be used for optimization even when "ignored" files
572 are listed.
572 are listed.
573
573
574 Note that the number of recorded "ignored" children can still be zero if none
574 Note that the number of recorded "ignored" children can still be zero if none
575 were present.
575 were present.
576
576
577 Also note that having this flag unset does not imply that no "ignored"
577 Also note that having this flag unset does not imply that no "ignored"
578 children have been recorded. Some might be present, but there is no garantee
578 children have been recorded. Some might be present, but there is
579 that is will be all of them.
579 no guarantee that is will be all of them.
580
580
581 `HAS_FALLBACK_EXEC`
581 `HAS_FALLBACK_EXEC`
582 If this flag is set, the entry carries "fallback" information for the
582 If this flag is set, the entry carries "fallback" information for the
583 executable bit in the `FALLBACK_EXEC` flag.
583 executable bit in the `FALLBACK_EXEC` flag.
584
584
585 Fallback information can be stored in the dirstate to keep track of
585 Fallback information can be stored in the dirstate to keep track of
586 a filesystem attribute tracked by Mercurial when the underlying file
586 a filesystem attribute tracked by Mercurial when the underlying file
587 system or operating system does not support that property (e.g.
587 system or operating system does not support that property (e.g.
588 Windows).
588 Windows).
589
589
590 `FALLBACK_EXEC`
590 `FALLBACK_EXEC`
591 Should be ignored if `HAS_FALLBACK_EXEC` is unset. If set the file for this
591 Should be ignored if `HAS_FALLBACK_EXEC` is unset. If set the file for this
592 entry should be considered executable if that information cannot be
592 entry should be considered executable if that information cannot be
593 extracted from the file system. If unset it should be considered
593 extracted from the file system. If unset it should be considered
594 non-executable instead.
594 non-executable instead.
595
595
596 `HAS_FALLBACK_SYMLINK`
596 `HAS_FALLBACK_SYMLINK`
597 If this flag is set, the entry carries "fallback" information for symbolic
597 If this flag is set, the entry carries "fallback" information for symbolic
598 link status in the `FALLBACK_SYMLINK` flag.
598 link status in the `FALLBACK_SYMLINK` flag.
599
599
600 Fallback information can be stored in the dirstate to keep track of
600 Fallback information can be stored in the dirstate to keep track of
601 a filesystem attribute tracked by Mercurial when the underlying file
601 a filesystem attribute tracked by Mercurial when the underlying file
602 system or operating system does not support that property (e.g.
602 system or operating system does not support that property (e.g.
603 Windows).
603 Windows).
604
604
605 `FALLBACK_SYMLINK`
605 `FALLBACK_SYMLINK`
606 Should be ignored if `HAS_FALLBACK_SYMLINK` is unset. If set the file for
606 Should be ignored if `HAS_FALLBACK_SYMLINK` is unset. If set the file for
607 this entry should be considered a symlink if that information cannot be
607 this entry should be considered a symlink if that information cannot be
608 extracted from the file system. If unset it should be considered a normal
608 extracted from the file system. If unset it should be considered a normal
609 file instead.
609 file instead.
610
610
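The four fallback flags follow the same pattern, which can be sketched with a single helper. The name `resolve_with_fallback` is hypothetical, and passing `None` to mean "the file system cannot provide this attribute" is a convention of this sketch only.

```python
from typing import Optional

# Bit masks copied from the `flags` list above.
HAS_FALLBACK_EXEC = 1 << 5
FALLBACK_EXEC = 1 << 6
HAS_FALLBACK_SYMLINK = 1 << 7
FALLBACK_SYMLINK = 1 << 8


def resolve_with_fallback(flags: int, has_bit: int, value_bit: int,
                          fs_value: Optional[bool]) -> Optional[bool]:
    """Prefer the file system's answer, else the recorded fallback.

    Returns None when neither source provides the information.
    """
    if fs_value is not None:
        return fs_value
    if flags & has_bit:
        return bool(flags & value_bit)
    # The value bit is ignored when the corresponding HAS_ bit is unset.
    return None
```

For example, `resolve_with_fallback(flags, HAS_FALLBACK_EXEC, FALLBACK_EXEC, None)` gives the executable status to assume on a system that cannot report it.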
611 `MTIME_SECOND_AMBIGUOUS`
611 `MTIME_SECOND_AMBIGUOUS`
612 This flag is relevant only when `HAS_MTIME` is set. When set, the
612 This flag is relevant only when `HAS_MTIME` is set. When set, the
613 `mtime` stored in the entry is only valid for comparison with timestamps
613 `mtime` stored in the entry is only valid for comparison with timestamps
614 that have nanosecond information. If the available timestamp does not carry
614 that have nanosecond information. If the available timestamp does not carry
615 nanosecond information, the `mtime` should be ignored and no optimisation
615 nanosecond information, the `mtime` should be ignored and no optimization
616 can be applied.
616 can be applied.
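Combining this flag with the comparison rule given for the nanoseconds field, an mtime equality check might look like the Python sketch below. The name `mtime_likely_equal` is hypothetical, timestamps are `(seconds, nanoseconds)` tuples, and zero nanoseconds is treated as "no sub-second information", following the text.

```python
# Bit masks copied from the `flags` list above.
HAS_MTIME = 1 << 11
MTIME_SECOND_AMBIGUOUS = 1 << 12

# Stored seconds are truncated to their lower 31 bits.
SECONDS_MASK = 0x7FFFFFFF


def mtime_likely_equal(flags, cached, observed):
    """Compare a cached mtime with one observed on disk.

    `cached` comes from the node; `observed` comes from the file system.
    """
    if not flags & HAS_MTIME:
        return False  # no valid cached mtime to compare with
    cached_secs, cached_nanos = cached
    observed_secs, observed_nanos = observed
    if cached_secs != (observed_secs & SECONDS_MASK):
        return False
    if cached_nanos == 0 or observed_nanos == 0:
        # Sub-second precision is missing on one side: compare seconds only,
        # unless the cached mtime is flagged as ambiguous at that precision.
        return not flags & MTIME_SECOND_AMBIGUOUS
    return cached_nanos == observed_nanos
```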