dirstate-v2: Document flags/mode/size/mtime fields of tree nodes...
Simon Sapin -
r49002:77fc340a default
@@ -371,6 +371,114 b' Node components are:'
371 (For example, `hg rm` makes a file untracked.)
371 (For example, `hg rm` makes a file untracked.)
372 This counter is used to implement `has_tracked_dir`.
372 This counter is used to implement `has_tracked_dir`.
373
373
374 * Offset 30 and more:
374 * Offset 30:
375 **TODO:** docs not written yet
375 Some boolean values packed as bits of a single byte.
376 as this part of the format might be changing soon.
376 Starting from least-significant, bit masks are::
377
378 WDIR_TRACKED = 1 << 0
379 P1_TRACKED = 1 << 1
380 P2_INFO = 1 << 2
381 HAS_MODE_AND_SIZE = 1 << 3
382 HAS_MTIME = 1 << 4
383
384 Other bits are unset. The meanings of these bits are:
385
386 `WDIR_TRACKED`
387 Set if the working directory contains a tracked file at this node’s path.
388 This is typically set and unset by `hg add` and `hg rm`.
389
390 `P1_TRACKED`
391 Set if the working directory’s first parent changeset
392 (whose node identifier is found in tree metadata)
393 contains a tracked file at this node’s path.
394 This is a cache to reduce manifest lookups.
395
396 `P2_INFO`
397 Set if the file has been involved in some merge operation.
398 Either because it was actually merged,
399 or because the version in the second parent (p2) was ahead,
400 or because some rename moved it there.
401 In any of these cases `hg status` will want it displayed as modified.
402
403 Files that would be mentioned at all in the `dirstate-v1` file format
404 have a node with at least one of the above three bits set in `dirstate-v2`.
405 Let’s call these files "tracked anywhere",
406 and "untracked" the nodes with all three of these bits unset.
407 Untracked nodes are typically for directories:
408 they hold child nodes and form the tree structure.
410 Although implementations should strive to clean up nodes
411 that are entirely unused, other untracked nodes may also exist.
412 For example, a future version of Mercurial might in some cases
413 add nodes for untracked files and/or ignored files in the working directory
414 in order to optimize `hg status`
415 by enabling it to skip `readdir` in more cases.
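The distinction between "tracked anywhere" and "untracked" nodes can be sketched in a few lines of Python. This is only an illustration of the bit masks listed above; the names `TRACKED_ANYWHERE`, `is_tracked_anywhere` and `flag_names` are hypothetical helpers, not part of Mercurial.

```python
# Bit masks copied from the list above.
WDIR_TRACKED = 1 << 0
P1_TRACKED = 1 << 1
P2_INFO = 1 << 2
HAS_MODE_AND_SIZE = 1 << 3
HAS_MTIME = 1 << 4

# Hypothetical name: the three bits that make a node "tracked anywhere".
TRACKED_ANYWHERE = WDIR_TRACKED | P1_TRACKED | P2_INFO

FLAG_NAMES = [
    (WDIR_TRACKED, "WDIR_TRACKED"),
    (P1_TRACKED, "P1_TRACKED"),
    (P2_INFO, "P2_INFO"),
    (HAS_MODE_AND_SIZE, "HAS_MODE_AND_SIZE"),
    (HAS_MTIME, "HAS_MTIME"),
]


def is_tracked_anywhere(flags):
    """Return True if at least one of the three tracking bits is set."""
    return (flags & TRACKED_ANYWHERE) != 0


def flag_names(flags):
    """List the names of the bits set in a flags byte, least significant first."""
    return [name for mask, name in FLAG_NAMES if flags & mask]
```

A node whose flags byte is `HAS_MTIME` alone is "untracked" in this sense, which is the case later interpreted as a cached directory modification time.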
416
417 When a node is for a file tracked anywhere,
418 the rest of the node data is three fields:
419
420 * Offset 31:
421 If `HAS_MODE_AND_SIZE` is unset, four zero bytes.
422 Otherwise, a 32-bit integer for the Unix mode (as in `stat_result.st_mode`)
423 expected for this file to be considered clean.
424 Only the `S_IXUSR` bit (owner has execute permission) is considered.
425
426 * Offset 35:
427 If `HAS_MTIME` is unset, four zero bytes.
428 Otherwise, a 32-bit integer for expected modified time of the file
429 (as in `stat_result.st_mtime`),
430 truncated to its 31 least-significant bits.
431 Unlike in dirstate-v1, negative values are not used.
432
433 * Offset 39:
434 If `HAS_MODE_AND_SIZE` is unset, four zero bytes.
435 Otherwise, a 32-bit integer for expected size of the file
436 truncated to its 31 least-significant bits.
437 Unlike in dirstate-v1, negative values are not used.
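As a rough sketch, the three fields above could be written and read back as follows. It assumes integers are stored big-endian, which this section does not state, and it assumes the caller has already checked that the node is for a file tracked anywhere. The function names are hypothetical.

```python
import stat
import struct

HAS_MODE_AND_SIZE = 1 << 3
HAS_MTIME = 1 << 4

NODE_SIZE = 43       # from "(Offset 43: end of this node)"
FLAGS_OFFSET = 30
FIELDS_OFFSET = 31   # mode at 31, mtime at 35, size at 39
MASK_31_BITS = 0x7FFFFFFF


def truncate_31(value):
    """Keep only the 31 least-significant bits, so the result is never negative."""
    return value & MASK_31_BITS


def pack_file_fields(flags, mode, mtime, size):
    """Build bytes 30..43 of a node (flags byte plus the three fields).

    Assumption: big-endian byte order.
    """
    if not flags & HAS_MODE_AND_SIZE:
        mode = size = 0  # four zero bytes each
    if not flags & HAS_MTIME:
        mtime = 0        # four zero bytes
    return bytes([flags]) + struct.pack(
        ">III", mode & 0xFFFFFFFF, truncate_31(mtime), truncate_31(size)
    )


def parse_file_fields(node):
    """Read the fields of a 43-byte node; fields that are not present are None."""
    if len(node) < NODE_SIZE:
        raise ValueError("node is shorter than 43 bytes")
    flags = node[FLAGS_OFFSET]
    mode, mtime, size = struct.unpack_from(">III", node, FIELDS_OFFSET)
    result = {"executable": None, "size": None, "mtime": None}
    if flags & HAS_MODE_AND_SIZE:
        # Only the owner-execute bit of the mode is considered.
        result["executable"] = bool(mode & stat.S_IXUSR)
        result["size"] = size
    if flags & HAS_MTIME:
        result["mtime"] = mtime
    return result
```

Because both `mtime` and `size` are truncated to 31 bits, a comparison against a fresh `stat` result would need to truncate the observed values the same way before comparing.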
438
439 If an untracked node has `HAS_MTIME` *unset*, this space is unused:
440
441 * Offset 31:
442 12 bytes set to zero
443
444 If an untracked node has `HAS_MTIME` *set*,
445 what follows is the modification time of a directory
446 represented with separated second and sub-second components
447 since the Unix epoch:
448
449 * Offset 31:
450 The number of seconds as a signed (two’s complement) 64-bit integer.
451
452 * Offset 39:
453 The number of nanoseconds as a 32-bit integer.
454 Always greater than or equal to zero, and strictly less than a billion.
455 Increasing this component makes the modification time
456 go forward or backward in time depending
457 on the sign of the integral seconds component.
458 (Note: this is buggy because there is no negative zero integer,
459 but will be changed soon.)
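The sign convention for the sub-second component can be illustrated with a small sketch. It again assumes big-endian storage (not stated in this section), and the function names are hypothetical.

```python
import struct

NANOS_PER_SECOND = 1_000_000_000
MTIME_OFFSET = 31  # seconds at offset 31, nanoseconds at offset 39


def parse_directory_mtime(node):
    """Read (seconds, nanoseconds) from an untracked node with HAS_MTIME set.

    Assumption: big-endian byte order.
    """
    seconds, nanos = struct.unpack_from(">qI", node, MTIME_OFFSET)
    if nanos >= NANOS_PER_SECOND:
        raise ValueError("nanoseconds must be strictly less than a billion")
    return seconds, nanos


def to_epoch_nanoseconds(seconds, nanos):
    """Combine both components into one signed count of nanoseconds.

    The sub-second part moves the time away from the epoch: forward when
    the seconds are non-negative, backward when they are negative.
    """
    if seconds < 0:
        return seconds * NANOS_PER_SECOND - nanos
    return seconds * NANOS_PER_SECOND + nanos
```

With this convention a seconds value of zero always counts forward, so a time less than one second before the epoch has no representation. That is the "no negative zero" problem mentioned in the note above.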
460
461 The presence of a directory modification time means that at some point,
462 this path in the working directory was observed:
463
464 - To be a directory
465 - With the given modification time
466 - That time was already strictly in the past when observed,
467 meaning that later changes cannot happen in the same clock tick
468 and must cause a different modification time
469 (unless the system clock jumps back and we get unlucky,
470 which is not impossible but deemed unlikely enough).
471 - All direct children of this directory
472 (as returned by `std::fs::read_dir`)
473 either have a corresponding dirstate node,
474 or are ignored by ignore patterns whose hash is in tree metadata.
475
476 This means that if `std::fs::symlink_metadata` later reports
477 the same modification time
478 and ignored patterns haven’t changed,
479 a run of status that is not listing ignored files
480 can skip calling `std::fs::read_dir` again for this directory,
481 and iterate child dirstate nodes instead.
482
483
484 * (Offset 43: end of this node)
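The condition under which status may skip `read_dir` for a directory, as described above, can be summarized as a small predicate. This is a simplified sketch and not Mercurial's implementation: the function and parameter names are hypothetical, and the cached values are assumed to have been read from the dirstate node and the tree metadata beforehand.

```python
def can_skip_read_dir(
    cached_mtime,          # (seconds, nanoseconds) from the node, or None
    current_mtime,         # (seconds, nanoseconds) observed on disk now
    cached_ignore_hash,    # ignore patterns hash stored in tree metadata
    current_ignore_hash,   # hash of the ignore patterns in effect now
    listing_ignored,       # True if this status run lists ignored files
):
    """Return True if child dirstate nodes can be iterated instead of
    listing the directory on disk."""
    if listing_ignored:
        # Ignored files may not have dirstate nodes, so the directory
        # has to be read to find them.
        return False
    if cached_mtime is None:
        # No directory modification time was recorded for this node.
        return False
    return (
        cached_mtime == current_mtime
        and cached_ignore_hash == current_ignore_hash
    )
```

The predicate is deliberately conservative: any mismatch, or a request to list ignored files, falls back to reading the directory.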
@@ -55,7 +55,7 b' class DirstateItem(object):'
55 - p1_tracked: is the file tracked in working copy first parent
55 - p1_tracked: is the file tracked in working copy first parent
56 - p2_info: the file has been involved in some merge operation. Either
56 - p2_info: the file has been involved in some merge operation. Either
57 because it was actually merged, or because the p2 version was
57 because it was actually merged, or because the p2 version was
58 ahead, or because some renamed moved it there. In either case
58 ahead, or because some rename moved it there. In either case
59 `hg status` will want it displayed as modified.
59 `hg status` will want it displayed as modified.
60
60
61 # about the file state expected from p1 manifest:
61 # about the file state expected from p1 manifest:
@@ -64,44 +64,24 b" pub struct Docket<'on_disk> {"
64 uuid: &'on_disk [u8],
64 uuid: &'on_disk [u8],
65 }
65 }
66
66
67 /// Fields are documented in the *Tree metadata in the docket file*
68 /// section of `mercurial/helptext/internals/dirstate-v2.txt`
67 #[derive(BytesCast)]
69 #[derive(BytesCast)]
68 #[repr(C)]
70 #[repr(C)]
69 struct TreeMetadata {
71 struct TreeMetadata {
70 root_nodes: ChildNodes,
72 root_nodes: ChildNodes,
71 nodes_with_entry_count: Size,
73 nodes_with_entry_count: Size,
72 nodes_with_copy_source_count: Size,
74 nodes_with_copy_source_count: Size,
73
74 /// How many bytes of this data file are not used anymore
75 unreachable_bytes: Size,
75 unreachable_bytes: Size,
76
77 /// Current version always sets these bytes to zero when creating or
78 /// updating a dirstate. Future versions could assign some bits to signal
79 /// for example "the version that last wrote/updated this dirstate did so
80 /// in such and such way that can be relied on by versions that know to."
81 unused: [u8; 4],
76 unused: [u8; 4],
82
77
83 /// If non-zero, a hash of ignore files that were used for some previous
78 /// See *Optional hash of ignore patterns* section of
84 /// run of the `status` algorithm.
79 /// `mercurial/helptext/internals/dirstate-v2.txt`
85 ///
86 /// We define:
87 ///
88 /// * "Root" ignore files are `.hgignore` at the root of the repository if
89 /// it exists, and files from `ui.ignore.*` config. This set of files is
90 /// then sorted by the string representation of their path.
91 /// * The "expanded contents" of an ignore files is the byte string made
92 /// by concatenating its contents with the "expanded contents" of other
93 /// files included with `include:` or `subinclude:` files, in inclusion
94 /// order. This definition is recursive, as included files can
95 /// themselves include more files.
96 ///
97 /// This hash is defined as the SHA-1 of the concatenation (in sorted
98 /// order) of the "expanded contents" of each "root" ignore file.
99 /// (Note that computing this does not require actually concatenating byte
100 /// strings into contiguous memory, instead SHA-1 hashing can be done
101 /// incrementally.)
102 ignore_patterns_hash: IgnorePatternsHash,
80 ignore_patterns_hash: IgnorePatternsHash,
103 }
81 }
104
82
83 /// Fields are documented in the *The data file format*
84 /// section of `mercurial/helptext/internals/dirstate-v2.txt`
105 #[derive(BytesCast)]
85 #[derive(BytesCast)]
106 #[repr(C)]
86 #[repr(C)]
107 pub(super) struct Node {
87 pub(super) struct Node {
@@ -114,45 +94,6 b' pub(super) struct Node {'
114 children: ChildNodes,
94 children: ChildNodes,
115 pub(super) descendants_with_entry_count: Size,
95 pub(super) descendants_with_entry_count: Size,
116 pub(super) tracked_descendants_count: Size,
96 pub(super) tracked_descendants_count: Size,
117
118 /// Depending on the bits in `flags`:
119 ///
120 /// * If any of `WDIR_TRACKED`, `P1_TRACKED`, or `P2_INFO` are set, the
121 /// node has an entry.
122 ///
123 /// - If `HAS_MODE_AND_SIZE` is set, `data.mode` and `data.size` are
124 /// meaningful. Otherwise they are set to zero
125 /// - If `HAS_MTIME` is set, `data.mtime` is meaningful. Otherwise it is
126 /// set to zero.
127 ///
128 /// * If none of `WDIR_TRACKED`, `P1_TRACKED`, `P2_INFO`, or `HAS_MTIME`
129 /// are set, the node does not have an entry and `data` is set to all
130 /// zeros.
131 ///
132 /// * If none of `WDIR_TRACKED`, `P1_TRACKED`, `P2_INFO` are set, but
133 /// `HAS_MTIME` is set, the bytes of `data` should instead be
134 /// interpreted as the `Timestamp` for the mtime of a cached directory.
135 ///
136 /// The presence of this combination of flags means that at some point,
137 /// this path in the working directory was observed:
138 ///
139 /// - To be a directory
140 /// - With the modification time as given by `Timestamp`
141 /// - That timestamp was already strictly in the past when observed,
142 /// meaning that later changes cannot happen in the same clock tick
143 /// and must cause a different modification time (unless the system
144 /// clock jumps back and we get unlucky, which is not impossible but
145 /// but deemed unlikely enough).
146 /// - All direct children of this directory (as returned by
147 /// `std::fs::read_dir`) either have a corresponding dirstate node, or
148 /// are ignored by ignore patterns whose hash is in
149 /// `TreeMetadata::ignore_patterns_hash`.
150 ///
151 /// This means that if `std::fs::symlink_metadata` later reports the
152 /// same modification time and ignored patterns haven’t changed, a run
153 /// of status that is not listing ignored files can skip calling
154 /// `std::fs::read_dir` again for this directory, iterate child
155 /// dirstate nodes instead.
156 flags: Flags,
97 flags: Flags,
157 data: Entry,
98 data: Entry,
158 }
99 }