@@ -1,616 +1,616 b'' | |||
|
1 | 1 | The *dirstate* is what Mercurial uses internally to track |
|
2 | 2 | the state of files in the working directory, |
|
3 | 3 | such as set by commands like `hg add` and `hg rm`. |
|
4 | 4 | It also contains some cached data that help make `hg status` faster. |
|
5 | 5 | The name refers both to `.hg/dirstate` on the filesystem |
|
6 | 6 | and the corresponding data structure in memory while a Mercurial process |
|
7 | 7 | is running. |
|
8 | 8 | |
|
9 | 9 | The original file format, retroactively dubbed `dirstate-v1`, |
|
10 | 10 | is described at https://www.mercurial-scm.org/wiki/DirState. |
|
11 | 11 | It is made of a flat sequence of unordered variable-size entries, |
|
12 | 12 | so accessing any information in it requires parsing all of it. |
|
13 | 13 | Similarly, saving changes requires rewriting the entire file. |
|
14 | 14 | |
|
15 | The newer `dirsate-v2` file format is designed to fix these limitations | |
|
15 | The newer `dirstate-v2` file format is designed to fix these limitations | |
|
16 | 16 | and make `hg status` faster. |
|
17 | 17 | |
|
18 | 18 | User guide |
|
19 | 19 | ========== |
|
20 | 20 | |
|
21 | 21 | Compatibility |
|
22 | 22 | ------------- |
|
23 | 23 | |
|
24 | 24 | The file format is experimental and may still change. |
|
25 | 25 | Different versions of Mercurial may not be compatible with each other |
|
26 | 26 | when working on a local repository that uses this format. |
|
27 | 27 | When using an incompatible version with the experimental format, |
|
28 | 28 | anything can happen including data corruption. |
|
29 | 29 | |
|
30 | 30 | Since the dirstate is entirely local and not relevant to the wire protocol, |
|
31 | 31 | `dirstate-v2` does not affect compatibility with remote Mercurial versions. |
|
32 | 32 | |
|
33 | 33 | When `share-safe` is enabled, different repositories sharing the same store |
|
34 | 34 | can use different dirstate formats. |
|
35 | 35 | |
|
36 | Enabling `dirsate-v2` for new local repositories | |
|
36 | Enabling `dirstate-v2` for new local repositories | |
|
37 | 37 | ------------------------------------------------ |
|
38 | 38 | |
|
39 | 39 | When creating a new local repository such as with `hg init` or `hg clone`, |
|
40 | 40 | the `exp-rc-dirstate-v2` boolean in the `format` configuration section |
|
41 | 41 | controls whether to use this file format. |
|
42 | 42 | This is disabled by default as of this writing. |
|
43 | 43 | To enable it for a single repository, run for example:: |
|
44 | 44 | |
|
45 | 45 | $ hg init my-project --config format.exp-rc-dirstate-v2=1 |
|
46 | 46 | |
|
47 | Checking the format of an existing local repsitory | |
|
47 | Checking the format of an existing local repository | |
|
48 | 48 | -------------------------------------------------- |
|
49 | 49 | |
|
50 | 50 | The `debugformat` command prints information about |
|
51 | 51 | which of multiple optional formats are used in the current repository, |
|
52 | 52 | including `dirstate-v2`:: |
|
53 | 53 | |
|
54 | 54 | $ hg debugformat |
|
55 | 55 | format-variant repo |
|
56 | 56 | fncache: yes |
|
57 | 57 | dirstate-v2: yes |
|
58 | 58 | […] |
|
59 | 59 | |
|
60 | 60 | Upgrading or downgrading an existing local repository |
|
61 | 61 | ----------------------------------------------------- |
|
62 | 62 | |
|
63 | 63 | The `debugupgrade` command does various upgrades or downgrades |
|
64 | 64 | on a local repository |
|
65 | 65 | based on the current Mercurial version and on configuration. |
|
66 | 66 | The same `format.exp-rc-dirstate-v2` configuration is used again. |
|
67 | 67 | |
|
68 | 68 | Example to upgrade:: |
|
69 | 69 | |
|
70 | 70 | $ hg debugupgrade --config format.exp-rc-dirstate-v2=1 |
|
71 | 71 | |
|
72 | 72 | Example to downgrade to `dirstate-v1`:: |
|
73 | 73 | |
|
74 | 74 | $ hg debugupgrade --config format.exp-rc-dirstate-v2=0 |
|
75 | 75 | |
|
76 | 76 | Both of these commands do nothing but print a list of proposed changes, |
|
77 | 77 | which may include changes unrelated to the dirstate. |
|
78 | 78 | Those other changes are controlled by their own configuration keys. |
|
79 | 79 | Add `--run` to a command to actually apply the proposed changes. |
|
80 | 80 | |
|
81 | 81 | Backups of `.hg/requires` and `.hg/dirstate` are created |
|
82 | 82 | in a `.hg/upgradebackup.*` directory. |
|
83 | 83 | If something goes wrong, restoring those files should undo the change. |
|
84 | 84 | |
|
85 | 85 | Note that upgrading affects compatibility with older versions of Mercurial |
|
86 | 86 | as noted above. |
|
87 | 87 | This can be relevant when a repository’s files are on a USB drive |
|
88 | 88 | or some other removable media, or shared over the network, etc. |
|
89 | 89 | |
|
90 | 90 | Internal filesystem representation |
|
91 | 91 | ================================== |
|
92 | 92 | |
|
93 | 93 | Requirements file |
|
94 | 94 | ----------------- |
|
95 | 95 | |
|
96 | 96 | The `.hg/requires` file indicates which of various optional file formats |
|
97 | 97 | are used by a given repository. |
|
98 | 98 | Mercurial aborts when seeing a requirement it does not know about, |
|
99 | which avoids older versions accidentally messing up a repository | |
|
100 | 100 | that uses a format that was introduced later. |
|
101 | 101 | For versions that do support a format, the presence or absence of |
|
102 | 102 | the corresponding requirement indicates whether to use that format. |
|
103 | 103 | |
|
104 | 104 | When the file contains a `dirstate-v2` line, |
|
105 | 105 | the `dirstate-v2` format is used. |
|
106 | 106 | With no such line, `dirstate-v1` is used. |
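As a rough illustration of the rule above, a reader could pick the dirstate format by scanning the lines of `.hg/requires`. The helper name `dirstate_format` is made up for this sketch and is not part of Mercurial's API.

```python
def dirstate_format(requires_text: str) -> str:
    """Return "dirstate-v2" if the requirements list it, else "dirstate-v1"."""
    # Each requirement sits on its own line of `.hg/requires`.
    requirements = {line.strip() for line in requires_text.splitlines()}
    if "dirstate-v2" in requirements:
        return "dirstate-v2"
    # No such line: the original format applies.
    return "dirstate-v1"
```

The function takes the file contents as text rather than a path, so it can be tried without a repository on disk.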
|
107 | 107 | |
|
108 | 108 | High level description |
|
109 | 109 | ---------------------- |
|
110 | 110 | |
|
111 | Whereas `dirstate-v1` uses a single `.hg/dirstate` file, | |
|
112 | 112 | in `dirstate-v2` that file is a "docket" file |
|
113 | 113 | that only contains some metadata |
|
114 | 114 | and points to a separate data file named `.hg/dirstate.{ID}`, |
|
115 | 115 | where `{ID}` is a random identifier. |
|
116 | 116 | |
|
117 | 117 | This separation allows making data files append-only |
|
118 | 118 | and therefore safer to memory-map. |
|
119 | 119 | Creating a new data file (occasionally to clean up unused data) |
|
120 | 120 | can be done with a different ID |
|
121 | 121 | without disrupting another Mercurial process |
|
122 | 122 | that could still be using the previous data file. |
|
123 | 123 | |
|
124 | 124 | Both files have a format designed to reduce the need for parsing, |
|
125 | 125 | by using fixed-size binary components as much as possible. |
|
126 | 126 | For data that is not fixed-size, |
|
127 | 127 | references to other parts of a file can be made by storing "pseudo-pointers": |
|
128 | 128 | integers counted in bytes from the start of a file. |
|
129 | 129 | For read-only access no data structure is needed, |
|
130 | 130 | only a bytes buffer (possibly memory-mapped directly from the filesystem) |
|
131 | 131 | with specific parts read on demand. |
|
132 | 132 | |
|
133 | 133 | The data file contains "nodes" organized in a tree. |
|
134 | 134 | Each node represents a file or directory inside the working directory |
|
135 | 135 | or its parent changeset. |
|
136 | 136 | This tree has the same structure as the filesystem, |
|
137 | 137 | so a node representing a directory has child nodes representing |
|
138 | 138 | the files and subdirectories contained directly in that directory. |
|
139 | 139 | |
|
140 | 140 | The docket file format |
|
141 | 141 | ---------------------- |
|
142 | 142 | |
|
143 | 143 | This is implemented in `rust/hg-core/src/dirstate_tree/on_disk.rs` |
|
144 | 144 | and `mercurial/dirstateutils/docket.py`. |
|
145 | 145 | |
|
146 | 146 | Components of the docket file are found at fixed offsets, |
|
147 | 147 | counted in bytes from the start of the file: |
|
148 | 148 | |
|
149 | 149 | * Offset 0: |
|
150 | 150 | The 12-byte marker string "dirstate-v2\n" ending with a newline character. |
|
151 | 151 | This makes it easier to tell a dirstate-v2 file from a dirstate-v1 file, |
|
152 | 152 | although it is not strictly necessary |
|
153 | 153 | since `.hg/requires` determines which format to use. |
|
154 | 154 | |
|
155 | 155 | * Offset 12: |
|
156 | 156 | The changeset node ID of the first parent of the working directory, |
|
157 | 157 | as up to 32 binary bytes. |
|
158 | 158 | If a node ID is shorter (20 bytes for SHA-1), |
|
159 | 159 | it is start-aligned and the rest of the bytes are set to zero. |
|
160 | 160 | |
|
161 | 161 | * Offset 44: |
|
162 | 162 | The changeset node ID of the second parent of the working directory, |
|
163 | 163 | or all zeros if there isn’t one. |
|
164 | 164 | Also 32 binary bytes. |
|
165 | 165 | |
|
166 | 166 | * Offset 76: |
|
167 | 167 | Tree metadata on 44 bytes, described below. |
|
168 | 168 | Its separation in this documentation from the rest of the docket |
|
169 | 169 | reflects a detail of the current implementation. |
|
170 | 170 | Since tree metadata is also made of fields at fixed offsets, those could |
|
171 | 171 | be inlined here by adding 76 bytes to each offset. |
|
172 | 172 | |
|
173 | 173 | * Offset 120: |
|
174 | 174 | The used size of the data file, as a 32-bit big-endian integer. |
|
175 | 175 | The actual size of the data file may be larger |
|
176 | (if another Mercurial process is appending to it | |
|
177 | 177 | but has not updated the docket yet). |
|
178 | 178 | That extra data must be ignored. |
|
179 | 179 | |
|
180 | 180 | * Offset 124: |
|
181 | 181 | The length of the data file identifier, as an 8-bit integer. |
|
182 | 182 | |
|
183 | 183 | * Offset 125: |
|
184 | 184 | The data file identifier. |
|
185 | 185 | |
|
186 | 186 | * Any additional data is currently ignored, and dropped when updating the file. |
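The fixed offsets listed above map directly onto a `struct` format string. The following is a minimal sketch rather than Mercurial's actual code: the names `DOCKET_HEADER` and `parse_docket` and the returned dictionary keys are assumptions made for illustration.

```python
import struct

V2_MARKER = b"dirstate-v2\n"

# marker (12) + p1 (32) + p2 (32) + tree metadata (44)
# + used data size (32-bit big-endian) + identifier length (8-bit)
DOCKET_HEADER = struct.Struct(">12s32s32s44sLB")


def parse_docket(docket: bytes) -> dict:
    """Split a docket into its fixed-offset components (illustrative only)."""
    if len(docket) < DOCKET_HEADER.size:
        raise ValueError("docket too short")
    marker, p1, p2, tree_metadata, data_size, id_len = (
        DOCKET_HEADER.unpack_from(docket, 0)
    )
    if marker != V2_MARKER:
        raise ValueError("not a dirstate-v2 docket")
    start = DOCKET_HEADER.size  # offset 125
    data_file_id = docket[start:start + id_len]
    if len(data_file_id) != id_len:
        raise ValueError("truncated data file identifier")
    # Anything after the identifier is ignored, as described above.
    return {
        "p1": p1,  # kept as raw 32 bytes, zero-padded for shorter hashes
        "p2": p2,
        "tree_metadata": tree_metadata,
        "data_size": data_size,
        "data_file_id": data_file_id,
    }
```

A reader would then open `.hg/dirstate.{ID}` using the returned identifier and consider only the first `data_size` bytes of that file.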
|
187 | 187 | |
|
188 | 188 | Tree metadata in the docket file |
|
189 | 189 | -------------------------------- |
|
190 | 190 | |
|
191 | 191 | Tree metadata is similarly made of components at fixed offsets. |
|
192 | 192 | These offsets are counted in bytes from the start of tree metadata, |
|
193 | 193 | which is 76 bytes after the start of the docket file. |
|
194 | 194 | |
|
195 | 195 | This metadata can be thought of as the singular root of the tree |
|
196 | 196 | formed by nodes in the data file. |
|
197 | 197 | |
|
198 | 198 | * Offset 0: |
|
199 | 199 | Pseudo-pointer to the start of root nodes, |
|
200 | 200 | counted in bytes from the start of the data file, |
|
201 | 201 | as a 32-bit big-endian integer. |
|
202 | 202 | These nodes describe files and directories found directly |
|
203 | 203 | at the root of the working directory. |
|
204 | 204 | |
|
205 | 205 | * Offset 4: |
|
206 | 206 | Number of root nodes, as a 32-bit big-endian integer. |
|
207 | 207 | |
|
208 | 208 | * Offset 8: |
|
209 | 209 | Total number of nodes in the entire tree that "have a dirstate entry", |
|
210 | 210 | as a 32-bit big-endian integer. |
|
211 | 211 | Those nodes represent files that would be present at all in `dirstate-v1`. |
|
212 | 212 | This is typically less than the total number of nodes. |
|
213 | 213 | This counter is used to implement `len(dirstatemap)`. |
|
214 | 214 | |
|
215 | 215 | * Offset 12: |
|
216 | 216 | Number of nodes in the entire tree that have a copy source, |
|
217 | 217 | as a 32-bit big-endian integer. |
|
218 | 218 | At the next commit, these files are recorded |
|
219 | 219 | as having been copied or moved/renamed from that source. |
|
220 | 220 | (A move is recorded as a copy and separate removal of the source.) |
|
221 | 221 | This counter is used to implement `len(dirstatemap.copymap)`. |
|
222 | 222 | |
|
223 | 223 | * Offset 16: |
|
224 | 224 | An estimation of how many bytes of the data file |
|
225 | 225 | (within its used size) are unused, as a 32-bit big-endian integer. |
|
226 | 226 | When appending to an existing data file, |
|
227 | 227 | some existing nodes or paths can be unreachable from the new root |
|
228 | 228 | but they still take up space. |
|
229 | 229 | This counter is used to decide when to write a new data file from scratch |
|
230 | 230 | instead of appending to an existing one, |
|
231 | 231 | in order to get rid of that unreachable data |
|
232 | 232 | and avoid unbounded file size growth. |
|
233 | 233 | |
|
234 | 234 | * Offset 20: |
|
235 | 235 | These four bytes are currently ignored |
|
236 | 236 | and reset to zero when updating a docket file. |
|
237 | 237 | This is an attempt at forward compatibility: |
|
238 | 238 | future Mercurial versions could use this as a bit field |
|
239 | 239 | to indicate that a dirstate has additional data or constraints. |
|
240 | 240 | Finding a dirstate file with the relevant bit unset indicates that |
|
241 | 241 | it was written by a then-older version |
|
242 | 242 | which is not aware of that future change. |
|
243 | 243 | |
|
244 | 244 | * Offset 24: |
|
245 | 245 | Either 20 zero bytes, or a SHA-1 hash as 20 binary bytes. |
|
246 | 246 | When present, the hash is of ignore patterns |
|
247 | 247 | that were used for some previous run of the `status` algorithm. |
|
248 | 248 | |
|
249 | 249 | * (Offset 44: end of tree metadata) |
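The tree metadata fields can be unpacked the same way. This sketch assumes the whole docket is available as a bytes object; the names `TREE_METADATA` and `parse_tree_metadata` and the dictionary keys are hypothetical.

```python
import struct

# Five 32-bit big-endian integers, 4 unused bytes, then a 20-byte hash slot.
TREE_METADATA = struct.Struct(">LLLLL4s20s")
TREE_METADATA_OFFSET = 76  # position of tree metadata inside the docket


def parse_tree_metadata(docket: bytes) -> dict:
    """Decode the 44 bytes of tree metadata found in a docket."""
    (
        root_nodes_start,
        root_nodes_count,
        nodes_with_entry_count,
        nodes_with_copy_source_count,
        unreachable_bytes,
        _unused,
        ignore_hash,
    ) = TREE_METADATA.unpack_from(docket, TREE_METADATA_OFFSET)
    return {
        "root_nodes_start": root_nodes_start,
        "root_nodes_count": root_nodes_count,
        "nodes_with_entry_count": nodes_with_entry_count,
        "nodes_with_copy_source_count": nodes_with_copy_source_count,
        "unreachable_bytes": unreachable_bytes,
        # Twenty zero bytes mean that no hash was recorded.
        "ignore_patterns_hash": ignore_hash if any(ignore_hash) else None,
    }
```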
|
250 | 250 | |
|
251 | 251 | Optional hash of ignore patterns |
|
252 | 252 | -------------------------------- |
|
253 | 253 | |
|
254 | 254 | The implementation of `status` at `rust/hg-core/src/dirstate_tree/status.rs` |
|
255 | 255 | has been optimized such that its run time is dominated by calls |
|
256 | 256 | to `stat` for reading the filesystem metadata of a file or directory, |
|
257 | 257 | and to `readdir` for listing the contents of a directory. |
|
258 | 258 | In some cases the algorithm can skip calls to `readdir` |
|
259 | 259 | (saving significant time) |
|
260 | 260 | because the dirstate already contains enough of the relevant information |
|
261 | 261 | to build the correct `status` results. |
|
262 | 262 | |
|
263 | 263 | The default configuration of `hg status` is to list unknown files |
|
264 | 264 | but not ignored files. |
|
265 | 265 | In this case, it matters for the `readdir`-skipping optimization |
|
266 | 266 | if a given file used to be ignored but became unknown |
|
267 | 267 | because `.hgignore` changed. |
|
268 | 268 | To detect the possibility of such a change, |
|
269 | 269 | the tree metadata contains an optional hash of all ignore patterns. |
|
270 | 270 | |
|
271 | 271 | We define: |
|
272 | 272 | |
|
273 | 273 | * "Root" ignore files as: |
|
274 | 274 | |
|
275 | 275 | - `.hgignore` at the root of the repository if it exists |
|
276 | 276 | - And all files from `ui.ignore.*` config. |
|
277 | 277 | |
|
278 | 278 | This set of files is sorted by the string representation of their path. |
|
279 | 279 | |
|
280 | 280 | * The "expanded contents" of an ignore file is the byte string made |
|
281 | 281 | by the concatenation of its contents followed by the "expanded contents" |
|
282 | 282 | of other files included with `include:` or `subinclude:` directives, |
|
283 | 283 | in inclusion order. This definition is recursive, as included files can |
|
284 | 284 | themselves include more files. |
|
285 | 285 | |
|
286 | 286 | This hash is defined as the SHA-1 of the concatenation (in sorted |
|
287 | 287 | order) of the "expanded contents" of each "root" ignore file. |
|
288 | 288 | (Note that computing this does not require actually concatenating |
|
289 | 289 | into a single contiguous byte sequence. |
|
290 | 290 | Instead a SHA-1 hasher object can be created |
|
291 | 291 | and fed separate chunks one by one.) |
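The incremental hashing described above could look like the sketch below. It makes several simplifying assumptions that are not from the text: ignore files live in an in-memory dict `files` mapping a path (as bytes) to its contents, the target of an `include:` or `subinclude:` line is used directly as a key in that dict (real path resolution is not modeled), directives are only recognized at the very start of a line, and there is no protection against inclusion cycles. All function names are hypothetical.

```python
import hashlib


def _included_paths(contents: bytes):
    """Yield the targets of `include:` / `subinclude:` lines, in order."""
    for line in contents.splitlines():
        for directive in (b"include:", b"subinclude:"):
            if line.startswith(directive):
                yield line[len(directive):].strip()
                break


def _feed_expanded(hasher, path: bytes, files: dict) -> None:
    """Feed the "expanded contents" of one ignore file to the hasher."""
    contents = files[path]
    hasher.update(contents)
    # Included files follow the including file, in inclusion order.
    for included in _included_paths(contents):
        _feed_expanded(hasher, included, files)


def ignore_patterns_hash(root_paths, files: dict) -> bytes:
    """SHA-1 over the expanded contents of all "root" ignore files."""
    hasher = hashlib.sha1()
    for path in sorted(root_paths):
        _feed_expanded(hasher, path, files)
    return hasher.digest()
```

Feeding chunks one by one gives the same digest as hashing a single concatenated byte string, so no large intermediate buffer is needed.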
|
292 | 292 | |
|
293 | 293 | The data file format |
|
294 | 294 | -------------------- |
|
295 | 295 | |
|
296 | 296 | This is implemented in `rust/hg-core/src/dirstate_tree/on_disk.rs` |
|
297 | 297 | and `mercurial/dirstateutils/v2.py`. |
|
298 | 298 | |
|
299 | 299 | The data file contains two types of data: paths and nodes. |
|
300 | 300 | |
|
301 | 301 | Paths and nodes can be organized in any order in the file, except that sibling |
|
302 | 302 | nodes must be next to each other and sorted by their path. |
|
303 | 303 | Contiguity lets the parent refer to them all |
|
304 | 304 | by their count and a single pseudo-pointer, |
|
305 | 305 | instead of storing one pseudo-pointer per child node. |
|
306 | Sorting allows using binary seach to find a child node with a given name | |
|
306 | Sorting allows using binary search to find a child node with a given name | |
|
307 | 307 | in `O(log(n))` byte sequence comparisons. |
|
308 | 308 | |
|
309 | The current implemention writes paths and child node before a given node | |
|
309 | The current implementation writes paths and child nodes before a given node | |

310 | 310 | for ease of figuring out the value of pseudo-pointers by the time they are to be |
|
311 | 311 | written, but this is not an obligation and readers must not rely on it. |
|
312 | 312 | |
|
313 | 313 | A path is stored as a byte string anywhere in the file, without delimiter. |
|
314 | It is refered to by one or more node by a pseudo-pointer to its start, and its | |
|
314 | It is referred to by one or more nodes by a pseudo-pointer to its start, and its | |
|
315 | 315 | length in bytes. Since there is no delimiter, |
|
316 | 316 | when a path is a substring of another the same bytes could be reused, |
|
317 | 317 | although the implementation does not exploit this as of this writing. |
|
318 | 318 | |
|
319 | 319 | A node is stored on 44 bytes with components at fixed offsets. Paths and |

320 | 320 | child nodes relevant to a node are stored externally and referenced through |
|
321 | 321 | pseudo-pointers. |
|
322 | 322 | |
|
323 | 323 | All integers are stored in big-endian. All pseudo-pointers are 32-bit integers |
|
324 | 324 | counting bytes from the start of the data file. Path lengths and positions |
|
325 | 325 | are 16-bit integers, also counted in bytes. |
|
326 | 326 | |
|
327 | 327 | Node components are: |
|
328 | 328 | |
|
329 | 329 | * Offset 0: |
|
330 | 330 | Pseudo-pointer to the full path of this node, |
|
331 | 331 | from the working directory root. |
|
332 | 332 | |
|
333 | 333 | * Offset 4: |
|
334 | 334 | Length of the full path. |
|
335 | 335 | |
|
336 | 336 | * Offset 6: |
|
337 | 337 | Position of the last `/` path separator within the full path, |
|
338 | 338 | in bytes from the start of the full path, |
|
339 | 339 | or zero if there isn’t one. |
|
340 | 340 | The part of the full path after this position is the "base name". |
|
341 | 341 | Since sibling nodes have the same parent, only their base names vary |

342 | 342 | and need to be considered when doing binary search to find a given path. |
|
343 | 343 | |
|
344 | 344 | * Offset 8: |
|
345 | 345 | Pseudo-pointer to the "copy source" path for this node, |
|
346 | 346 | or zero if there is no copy source. |
|
347 | 347 | |
|
348 | 348 | * Offset 12: |
|
349 | 349 | Length of the copy source path, or zero if there isn’t one. |
|
350 | 350 | |
|
351 | 351 | * Offset 14: |
|
352 | 352 | Pseudo-pointer to the start of child nodes. |
|
353 | 353 | |
|
354 | 354 | * Offset 18: |
|
355 | 355 | Number of child nodes, as a 32-bit integer. |
|
356 | 356 | They occupy 44 times this number of bytes |
|
357 | 357 | (not counting space for paths, and further descendants). |
|
358 | 358 | |
|
359 | 359 | * Offset 22: |
|
360 | 360 | Number as a 32-bit integer of descendant nodes in this subtree, |
|
361 | 361 | not including this node itself, |
|
362 | 362 | that "have a dirstate entry". |
|
363 | 363 | Those nodes represent files that would be present at all in `dirstate-v1`. |
|
364 | 364 | This is typically less than the total number of descendants. |
|
365 | 365 | This counter is used to implement `has_dir`. |
|
366 | 366 | |
|
367 | 367 | * Offset 26: |
|
368 | 368 | Number as a 32-bit integer of descendant nodes in this subtree, |
|
369 | 369 | not including this node itself, |
|
370 | 370 | that represent files tracked in the working directory. |
|
371 | 371 | (For example, `hg rm` makes a file untracked.) |
|
372 | 372 | This counter is used to implement `has_tracked_dir`. |
|
373 | 373 | |
|
374 | 374 | * Offset 30: |
|
375 | 375 | A `flags` field that packs some boolean values as bits of a 16-bit integer. |
|
376 | 376 | Starting from least-significant, bit masks are:: |
|
377 | 377 | |
|
378 | 378 | WDIR_TRACKED = 1 << 0 |
|
379 | 379 | P1_TRACKED = 1 << 1 |
|
380 | 380 | P2_INFO = 1 << 2 |
|
381 | 381 | MODE_EXEC_PERM = 1 << 3 |
|
382 | 382 | MODE_IS_SYMLINK = 1 << 4 |
|
383 | 383 | HAS_FALLBACK_EXEC = 1 << 5 |
|
384 | 384 | FALLBACK_EXEC = 1 << 6 |
|
385 | 385 | HAS_FALLBACK_SYMLINK = 1 << 7 |
|
386 | 386 | FALLBACK_SYMLINK = 1 << 8 |
|
387 | 387 | EXPECTED_STATE_IS_MODIFIED = 1 << 9 |
|
388 | 388 | HAS_MODE_AND_SIZE = 1 << 10 |
|
389 | 389 | HAS_MTIME = 1 << 11 |
|
390 | 390 | MTIME_SECOND_AMBIGUOUS = 1 << 12 |
|
391 | 391 | DIRECTORY = 1 << 13 |
|
392 | 392 | ALL_UNKNOWN_RECORDED = 1 << 14 |
|
393 | 393 | ALL_IGNORED_RECORDED = 1 << 15 |
|
394 | 394 | |
|
395 | 395 | The meaning of each bit is described below. |
|
396 | 396 | |
|
397 | 397 | Other bits are unset. |
|
398 | 398 | They may be assigned meaning in the future, |
|
399 | 399 | with the limitation that Mercurial versions that pre-date such meaning |
|
400 | 400 | will always reset those bits to unset when writing nodes. |
|
401 | 401 | (A new node is written for any mutation in its subtree, |
|
402 | 402 | leaving the bytes of the old node unreachable |
|
403 | 403 | until the data file is rewritten entirely.) |
|
404 | 404 | |
|
405 | 405 | * Offset 32: |
|
406 | 406 | A `size` field described below, as a 32-bit integer. |
|
407 | 407 | Unlike in dirstate-v1, negative values are not used. |
|
408 | 408 | |
|
409 | 409 | * Offset 36: |
|
410 | 410 | The seconds component of an `mtime` field described below, |
|
411 | 411 | as a 32-bit integer. |
|
412 | 412 | Unlike in dirstate-v1, negative values are not used. |
|
413 | 413 | When `mtime` is used, this is the number of seconds since the Unix epoch |
|
414 | 414 | truncated to its lower 31 bits. |
|
415 | 415 | |
|
416 | 416 | * Offset 40: |
|
417 | 417 | The nanoseconds component of an `mtime` field described below, |
|
418 | 418 | as a 32-bit integer. |
|
419 | 419 | When `mtime` is used, |
|
420 | 420 | this is the number of nanoseconds since `mtime.seconds`, |
|
421 | always strictly less than one billion. | |
|
422 | 422 | |
|
423 | 423 | This may be zero if more precision is not available. |
|
424 | 424 | (This can happen because of limitations in any of Mercurial, Python, |
|
425 | 425 | libc, the operating system, …) |
|
426 | 426 | |
|
427 | 427 | When comparing two mtimes and either has this component set to zero, |
|
428 | 428 | the sub-second precision of both should be ignored. |
|
429 | 429 | False positives when checking mtime equality due to clock resolution |
|
430 | 430 | are always possible and the status algorithm needs to deal with them, |
|
431 | 431 | but having too many false negatives could be harmful too. |
|
432 | 432 | |
|
433 | 433 | * (Offset 44: end of this node) |
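The field offsets listed above add up to 44 bytes per node, which the sketch below encodes as a single `struct` format. It also shows how contiguous, sorted siblings allow a binary search through one pseudo-pointer and a count. The names `NODE`, `read_node` and `find_child` are assumptions for illustration. For simplicity the search compares full paths; since siblings share their parent's prefix, this gives the same ordering as comparing base names.

```python
import struct

# path ptr, path len, last-separator position, copy source ptr, copy source len,
# children ptr, children count, descendants with entry, tracked descendants,
# flags, size, mtime seconds, mtime nanoseconds
NODE = struct.Struct(">LHHLHLLLLHLLL")


def read_node(data: bytes, offset: int) -> dict:
    """Decode the node stored at `offset` in the data file."""
    (path_start, path_len, _separator_pos, copy_start, copy_len,
     children_start, children_count, _with_entry, _tracked,
     flags, size, mtime_s, mtime_ns) = NODE.unpack_from(data, offset)
    return {
        # Paths have no delimiter: slice by pseudo-pointer and length.
        "path": data[path_start:path_start + path_len],
        "copy_source": (
            data[copy_start:copy_start + copy_len] if copy_start else None
        ),
        "children_start": children_start,
        "children_count": children_count,
        "flags": flags,
        "size": size,
        "mtime": (mtime_s, mtime_ns),
    }


def find_child(data: bytes, children_start: int, children_count: int,
               full_path: bytes):
    """Binary search among contiguous sibling nodes sorted by path."""
    low, high = 0, children_count
    while low < high:
        mid = (low + high) // 2
        node = read_node(data, children_start + mid * NODE.size)
        if node["path"] < full_path:
            low = mid + 1
        elif node["path"] > full_path:
            high = mid
        else:
            return node
    return None
```

Starting from the root node pointer and count in the tree metadata, repeated calls to `find_child` with successively longer paths would walk down the tree one directory level at a time.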
|
434 | 434 | |
|
435 | 435 | The meaning of the boolean values packed in `flags` is: |
|
436 | 436 | |
|
437 | 437 | `WDIR_TRACKED` |
|
438 | 438 | Set if the working directory contains a tracked file at this node’s path. |
|
439 | 439 | This is typically set and unset by `hg add` and `hg rm`. |
|
440 | 440 | |
|
441 | 441 | `P1_TRACKED` |
|
442 | 442 | Set if the working directory’s first parent changeset |
|
443 | 443 | (whose node identifier is found in the docket file) |
|
444 | 444 | contains a tracked file at this node’s path. |
|
445 | 445 | This is a cache to reduce manifest lookups. |
|
446 | 446 | |
|
447 | 447 | `P2_INFO` |
|
448 | 448 | Set if the file has been involved in some merge operation. |
|
449 | 449 | Either because it was actually merged, |
|
450 | 450 | or because the version in the second parent (p2) was ahead, |

451 | 451 | or because some rename moved it there. |

452 | 452 | In any case `hg status` will want it displayed as modified. |
|
453 | 453 | |
|
454 | 454 | Files that would be mentioned at all in the `dirstate-v1` file format |
|
455 | 455 | have a node with at least one of the above three bits set in `dirstate-v2`. |
|
456 | 456 | Let’s call these files "tracked anywhere", |
|
457 | 457 | and "untracked" the nodes with all three of these bits unset. |
|
458 | 458 | Untracked nodes are typically for directories: |
|
459 | 459 | they hold child nodes and form the tree structure. |
|
460 | 460 | Additional untracked nodes may also exist. |
|
461 | 461 | Although implementations should strive to clean up nodes |
|
462 | 462 | that are entirely unused, other untracked nodes may also exist. |
|
463 | 463 | For example, a future version of Mercurial might in some cases |
|
464 | 464 | add nodes for untracked files and/or ignored files in the working directory |
|
465 | 465 | in order to optimize `hg status` |
|
466 | 466 | by enabling it to skip `readdir` in more cases. |
|
467 | 467 | |
|
468 | 468 | `HAS_MODE_AND_SIZE` |
|
469 | 469 | Must be unset for untracked nodes. |
|
470 | 470 | For files tracked anywhere, if this is set: |
|
471 | 471 | - The `size` field is the expected file size, |
|
472 | 472 | in bytes, truncated to its lower 31 bits. |
|
473 | 473 | - The expected execute permission for the file’s owner |
|
474 | 474 | is given by `MODE_EXEC_PERM` |
|
475 | 475 | - The expected file type is given by `MODE_IS_SYMLINK`: |
|
476 | 476 | a symbolic link if set, or a normal file if unset. |
|
477 | 477 | If this is unset the expected size, permission, and file type are unknown. |
|
478 | 478 | The `size` field is unused (set to zero). |
|
479 | 479 | |
|
480 | 480 | `HAS_MTIME` |
|
481 | 481 | The node contains a "valid" last modification time in the `mtime` field. |
|
482 | 482 | |
|
483 | 483 | |
|
484 | 484 | It means the `mtime` was already strictly in the past when observed, |
|
485 | 485 | meaning that later changes cannot happen in the same clock tick |
|
486 | 486 | and must cause a different modification time |
|
487 | 487 | (unless the system clock jumps back and we get unlucky, |
|
488 | 488 | which is not impossible but deemed unlikely enough). |
|
489 | 489 | |
|
490 | 490 | This means that if `std::fs::symlink_metadata` later reports |
|
491 | 491 | the same modification time |
|
492 | 492 | and ignored patterns haven’t changed, |
|
493 | 493 | we can assume the node to be unchanged on disk. |
|
494 | 494 | |
|
495 | 495 | The `mtime` field can then be used to skip more expensive lookup when |
|
496 | 496 | checking the status of "tracked" nodes. |
|
497 | 497 | |
|
498 | 498 | It can also be set for a node where `DIRECTORY` is set. |
|
499 | 499 | See `DIRECTORY` documentation for details. |
|
500 | 500 | |
|
501 | 501 | `DIRECTORY` |
|
502 | 502 | When set, this entry will match a directory that exists or existed on the |
|
503 | 503 | file system. |
|
504 | 504 | |
|
505 | 505 | * When `HAS_MTIME` is set a directory has been seen on the file system and |
|
506 | `mtime` matches its last modification time. However, `HAS_MTIME` not | |

507 | being set does not indicate the lack of directory on the file system. | |
|
508 | 508 | |
|
509 | 509 | * When not tracked anywhere, this node does not represent an ignored or |
|
510 | 510 | unknown file on disk. |
|
511 | 511 | |
|
512 | 512 | If `HAS_MTIME` is set |
|
513 | 513 | and `mtime` matches the last modification time of the directory on disk, |
|
514 | 514 | the directory is unchanged |
|
515 | 515 | and we can skip calling `std::fs::read_dir` again for this directory, |
|
516 | 516 | and iterate child dirstate nodes instead. |
|
517 | 517 | (as long as `ALL_UNKNOWN_RECORDED` and `ALL_IGNORED_RECORDED` are taken |
|
518 | 518 | into account) |
|
519 | 519 | |
|
520 | 520 | `MODE_EXEC_PERM` |
|
521 | 521 | Must be unset if `HAS_MODE_AND_SIZE` is unset. |
|
522 | 522 | If `HAS_MODE_AND_SIZE` is set, |
|
523 | 523 | this indicates whether the file’s owner is expected |
|
524 | 524 | to have execute permission. |
|
525 | 525 | |
|
526 | 526 | Beware that on systems without filesystem support for this information, the value |
|
527 | 527 | stored in the dirstate might be wrong and should not be relied on. |
|
528 | 528 | |
|
529 | 529 | `MODE_IS_SYMLINK` |
|
530 | 530 | Must be unset if `HAS_MODE_AND_SIZE` is unset. |
|
531 | 531 | If `HAS_MODE_AND_SIZE` is set, |
|
532 | 532 | this indicates whether the file is expected to be a symlink |
|
533 | 533 | as opposed to a normal file. |
|
534 | 534 | |
|
535 | 535 | Beware that on systems without filesystem support for this information, the value |
|
536 | 536 | stored in the dirstate might be wrong and should not be relied on. |
|
537 | 537 | |
|
538 | 538 | `EXPECTED_STATE_IS_MODIFIED` |
|
539 | 539 | Must be unset for untracked nodes. |
|
540 | 540 | For: |
|
541 | 541 | - a file tracked anywhere |
|
542 | 542 | - that has expected metadata (`HAS_MODE_AND_SIZE` and `HAS_MTIME`) |
|
543 | 543 | - if that metadata matches |
|
544 | 544 | metadata found in the working directory with `stat` |
|
545 | 545 | This bit indicates the status of the file. |
|
546 | 546 | If set, the status is modified. If unset, it is clean. |
|
547 | 547 | |
|
548 | 548 | In cases where `hg status` needs to read the contents of a file |
|
549 | 549 | because metadata is ambiguous, this bit lets it record the result |
|
550 | 550 | if the result is modified so that a future run of `hg status` |
|
551 | 551 | does not need to do the same again. |
|
552 | 552 | It is valid to never set this bit, |
|
553 | 553 | and consider expected metadata ambiguous if it is set. |
|
554 | 554 | |
|
555 | 555 | `ALL_UNKNOWN_RECORDED` |
|
556 | 556 | If set, all "unknown" children existing on disk (at the time of the last |
|
557 | 557 | status) have been recorded and the `mtime` associated with |
|
558 | 558 | `DIRECTORY` can be used for optimization even when "unknown" files |
|
559 | 559 | are listed. |
|
560 | 560 | |
|
561 | 561 | Note that the number of recorded "unknown" children can still be zero if none |

562 | 562 | were present. |
|
563 | 563 | |
|
564 | 564 | Also note that having this flag unset does not imply that no "unknown" |
|
565 | children have been recorded. Some might be present, but there is | |

566 | no guarantee that it will be all of them. | |
|
567 | 567 | |
|
568 | 568 | `ALL_IGNORED_RECORDED` |
|
569 | 569 | If set, all "ignored" children existing on disk (at the time of the last |
|
570 | 570 | status) have been recorded and the `mtime` associated with |
|
571 | 571 | `DIRECTORY` can be used for optimization even when "ignored" files |
|
572 | 572 | are listed. |
|
573 | 573 | |
|
574 | 574 | Note that the number of recorded "ignored" children can still be zero if none |

575 | 575 | were present. |
|
576 | 576 | |
|
577 | 577 | Also note that having this flag unset does not imply that no "ignored" |
|
578 | children have been recorded. Some might be present, but there is | |

579 | no guarantee that it will be all of them. | |
|
580 | 580 | |
|
581 | 581 | `HAS_FALLBACK_EXEC` |
|
582 | 582 | If this flag is set, the entry carries "fallback" information for the |
|
583 | 583 | executable bit in the `FALLBACK_EXEC` flag. |
|
584 | 584 | |
|
585 | 585 | Fallback information can be stored in the dirstate to keep track of |
|
586 | 586 | a filesystem attribute tracked by Mercurial when the underlying file |

587 | 587 | system or operating system does not support that property (e.g. |
|
588 | 588 | Windows). |
|
589 | 589 | |
|
590 | 590 | `FALLBACK_EXEC` |
|
591 | 591 | Should be ignored if `HAS_FALLBACK_EXEC` is unset. If set the file for this |
|
592 | 592 | entry should be considered executable if that information cannot be |
|
593 | 593 | extracted from the file system. If unset it should be considered |
|
594 | 594 | non-executable instead. |
|
595 | 595 | |
|
596 | 596 | `HAS_FALLBACK_SYMLINK` |
|
597 | 597 | If this flag is set, the entry carries "fallback" information for symbolic |
|
598 | 598 | link status in the `FALLBACK_SYMLINK` flag. |
|
599 | 599 | |
|
600 | 600 | Fallback information can be stored in the dirstate to keep track of |
|
601 | 601 | a filesystem attribute tracked by Mercurial when the underlying file |

602 | 602 | system or operating system does not support that property (e.g. |
|
603 | 603 | Windows). |
|
604 | 604 | |
|
605 | 605 | `FALLBACK_SYMLINK` |
|
606 | 606 | Should be ignored if `HAS_FALLBACK_SYMLINK` is unset. If set the file for |
|
607 | 607 | this entry should be considered a symlink if that information cannot be |
|
608 | 608 | extracted from the file system. If unset it should be considered a normal |
|
609 | 609 | file instead. |
|
610 | 610 | |
|
611 | 611 | `MTIME_SECOND_AMBIGUOUS` |
|
612 | 612 | This flag is relevant only when `HAS_MTIME` is set. When set, the |

613 | 613 | `mtime` stored in the entry is only valid for comparison with timestamps |

614 | 614 | that have nanosecond information. If the available timestamp does not carry |

615 | nanosecond information, the `mtime` should be ignored and no optimization | |
|
616 | 616 | can be applied. |
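To make the flag semantics more concrete, here is a small sketch that mirrors the bit masks listed earlier and applies two of the rules from this section: the "tracked anywhere" test and the mtime comparison. The class and function names are hypothetical. The sketch treats an observed nanosecond value of zero as "no nanosecond information", and it conservatively rejects the comparison whenever `MTIME_SECOND_AMBIGUOUS` is set and either side lacks sub-second precision; both choices are assumptions rather than a statement of what Mercurial does.

```python
from enum import IntFlag


class Flags(IntFlag):
    WDIR_TRACKED = 1 << 0
    P1_TRACKED = 1 << 1
    P2_INFO = 1 << 2
    MODE_EXEC_PERM = 1 << 3
    MODE_IS_SYMLINK = 1 << 4
    HAS_FALLBACK_EXEC = 1 << 5
    FALLBACK_EXEC = 1 << 6
    HAS_FALLBACK_SYMLINK = 1 << 7
    FALLBACK_SYMLINK = 1 << 8
    EXPECTED_STATE_IS_MODIFIED = 1 << 9
    HAS_MODE_AND_SIZE = 1 << 10
    HAS_MTIME = 1 << 11
    MTIME_SECOND_AMBIGUOUS = 1 << 12
    DIRECTORY = 1 << 13
    ALL_UNKNOWN_RECORDED = 1 << 14
    ALL_IGNORED_RECORDED = 1 << 15


TRACKED_ANYWHERE = Flags.WDIR_TRACKED | Flags.P1_TRACKED | Flags.P2_INFO


def is_tracked_anywhere(flags: int) -> bool:
    """True if at least one of the three tracking bits is set."""
    return bool(flags & TRACKED_ANYWHERE)


def mtime_likely_equal(flags: int, stored_s: int, stored_ns: int,
                       observed_s: int, observed_ns: int) -> bool:
    """Compare a stored mtime with one freshly read from the filesystem."""
    if not flags & Flags.HAS_MTIME:
        return False  # no valid stored mtime to compare against
    # Stored seconds are truncated to their lower 31 bits.
    if stored_s != observed_s & 0x7FFFFFFF:
        return False
    if stored_ns == 0 or observed_ns == 0:
        # Sub-second precision is ignored on both sides, unless the stored
        # value is only meaningful with nanosecond information.
        return not flags & Flags.MTIME_SECOND_AMBIGUOUS
    return stored_ns == observed_ns
```

A `False` result only means that the shortcut cannot be taken; the status algorithm would then fall back to a more expensive check.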