dirstate-v2: Add storage space for nanoseconds precision in file mtimes...
Simon Sapin -
r49033:308d9c24 default
@@ -372,7 +372,7 b' Node components are:'
372 This counter is used to implement `has_tracked_dir`.
372 This counter is used to implement `has_tracked_dir`.
373
373
374 * Offset 30:
374 * Offset 30:
375 Some boolean values packed as bits of a single byte.
375 A single `flags` byte that packs some boolean values as bits.
376 Starting from least-significant, bit masks are::
376 Starting from least-significant, bit masks are::
377
377
378 WDIR_TRACKED = 1 << 0
378 WDIR_TRACKED = 1 << 0
@@ -381,110 +381,116 b' Node components are:'
381 HAS_MODE_AND_SIZE = 1 << 3
381 HAS_MODE_AND_SIZE = 1 << 3
382 HAS_MTIME = 1 << 4
382 HAS_MTIME = 1 << 4
383 MODE_EXEC_PERM = 1 << 5
383 MODE_EXEC_PERM = 1 << 5
384 MODE_IS_SYMLINK = 1 << 7
384 MODE_IS_SYMLINK = 1 << 6
385
386
387 Other bits are unset. The meaning of these bits are:
388
389 `WDIR_TRACKED`
390 Set if the working directory contains a tracked file at this node’s path.
391 This is typically set and unset by `hg add` and `hg rm`.
392
393 `P1_TRACKED`
394 set if the working directory’s first parent changeset
395 (whose node identifier is found in tree metadata)
396 contains a tracked file at this node’s path.
397 This is a cache to reduce manifest lookups.
398
399 `P2_INFO`
400 Set if the file has been involved in some merge operation.
401 Either because it was actually merged,
402 or because the version in the second parent p2 version was ahead,
403 or because some rename moved it there.
404 In either case `hg status` will want it displayed as modified.
405
385
406 Files that would be mentioned at all in the `dirstate-v1` file format
386 The meaning of each bit is described below.
407 have a node with at least one of the above three bits set in `dirstate-v2`.
387 Other bits are unset.
408 Let’s call these files "tracked anywhere",
409 and "untracked" the nodes with all three of these bits unset.
410 Untracked nodes are typically for directories:
411 they hold child nodes and form the tree structure.
412 Additional untracked nodes may also exist.
413 Although implementations should strive to clean up nodes
414 that are entirely unused, other untracked nodes may also exist.
415 For example, a future version of Mercurial might in some cases
416 add nodes for untracked files or/and ignored files in the working directory
417 in order to optimize `hg status`
418 by enabling it to skip `readdir` in more cases.
419
388
420 When a node is for a file tracked anywhere:
389 * Offset 31:
421 - If `HAS_MODE_AND_SIZE` is set, the file is expected
390 A `size` field described below, as a 32-bit integer.
422 to be a symbolic link or a normal file based on `MODE_IS_SYMLINK`.
391 Unlike in dirstate-v1, negative values are not used.
423 - If `HAS_MODE_AND_SIZE` is set, the file’s owner is expected
424 to have execute permission or not based on `MODE_EXEC_PERM`.
425 - If `HAS_MODE_AND_SIZE` is unset,
426 the expected type of file and permission are unknown.
427 The rest of the node data is three fields:
428
429 * Offset 31:
430 4 unused bytes, set to zero
431
392
432 * Offset 35:
393 * Offset 35:
433 If `HAS_MODE_AND_SIZE` is unset, four zero bytes.
394 The seconds component of an `mtime` field described below,
434 Otherwise, a 32-bit integer for expected size of the file
395 as a 32-bit integer.
435 truncated to its 31 least-significant bits.
396 Unlike in dirstate-v1, negative values are not used.
436 Unlike in dirstate-v1, negative values are not used.
437
438 * Offset 39:
439 If `HAS_MTIME` is unset, four zero bytes.
440 Otherwise, a 32-bit integer for expected modified time of the file
441 (as in `stat_result.st_mtime`),
442 truncated to its 31 least-significant bits.
443 Unlike in dirstate-v1, negative values are not used.
444
445 If an untracked node `HAS_MTIME` *unset*, this space is unused:
446
447 * Offset 31:
448 12 unused bytes, set to zero
449
450 If an untracked node `HAS_MTIME` *set*,
451 what follows is the modification time of a directory
452 represented similarly to the C `timespec` struct:
453
454 * Offset 31:
455 4 unused bytes, set to zero
456
397
457 * Offset 35:
398 * Offset 39:
458 The number of seconds elapsed since the Unix epoch,
399 The nanoseconds component of an `mtime` field described below,
459 truncated to its lower 31 bits,
400 as a 32-bit integer.
460 as a 32-bit integer.
461
462 * Offset 39:
463 The sub-second number of nanoseconds elapsed since the Unix epoch,
464 as 32-bit integer.
465 Always greater than or equal to zero, and strictly less than a billion.
466
467 The presence of a directory modification time means that at some point,
468 this path in the working directory was observed:
469
470 - To be a directory
471 - With the given modification time
472 - That time was already strictly in the past when observed,
473 meaning that later changes cannot happen in the same clock tick
474 and must cause a different modification time
475 (unless the system clock jumps back and we get unlucky,
476 which is not impossible but deemed unlikely enough).
477 - All direct children of this directory
478 (as returned by `std::fs::read_dir`)
479 either have a corresponding dirstate node,
480 or are ignored by ignore patterns whose hash is in tree metadata.
481
482 This means that if `std::fs::symlink_metadata` later reports
483 the same modification time
484 and ignored patterns haven’t changed,
485 a run of status that is not listing ignored files
486 can skip calling `std::fs::read_dir` again for this directory,
487 and iterate child dirstate nodes instead.
488
489
401
490 * (Offset 43: end of this node)
402 * (Offset 43: end of this node)
403
404 The meaning of the boolean values packed in `flags` is:
405
406 `WDIR_TRACKED`
407 Set if the working directory contains a tracked file at this node’s path.
408 This is typically set and unset by `hg add` and `hg rm`.
409
410 `P1_TRACKED`
411 Set if the working directory’s first parent changeset
412 (whose node identifier is found in tree metadata)
413 contains a tracked file at this node’s path.
414 This is a cache to reduce manifest lookups.
415
416 `P2_INFO`
417 Set if the file has been involved in some merge operation.
418 Either because it was actually merged,
419 or because the version in the second parent (p2) was ahead,
420 or because some rename moved it there.
421 In any of these cases `hg status` will want it displayed as modified.
422
423 Files that would be mentioned at all in the `dirstate-v1` file format
424 have a node with at least one of the above three bits set in `dirstate-v2`.
425 Let’s call these files "tracked anywhere",
426 and "untracked" the nodes with all three of these bits unset.
427 Untracked nodes are typically for directories:
428 they hold child nodes and form the tree structure.
429 Additional untracked nodes may also exist.
430 Although implementations should strive to clean up nodes
431 that are entirely unused, other untracked nodes may also exist.
432 For example, a future version of Mercurial might in some cases
433 add nodes for untracked files or/and ignored files in the working directory
434 in order to optimize `hg status`
435 by enabling it to skip `readdir` in more cases.
436
437 `HAS_MODE_AND_SIZE`
438 Must be unset for untracked nodes.
439 For files tracked anywhere, if this is set:
440 - The `size` field is the expected file size,
441 in bytes, truncated to its lower 31 bits,
442 for the file to be clean.
443 - The expected execute permission for the file’s owner
444 is given by `MODE_EXEC_PERM`
445 - The expected file type is given by `MODE_IS_SYMLINK`:
446 a symbolic link if set, or a normal file if unset.
447 If this is unset the expected size, permission, and file type are unknown.
448 The `size` field is unused (set to zero).
449
450 `HAS_MTIME`
451 If unset, the `mtime` field is unused (set to zero).
452 If set, it contains a timestamp represented as
453 - the number of seconds since the Unix epoch,
454 truncated to its lower 31 bits.
455 - and the number of nanoseconds since `mtime.seconds`,
456 always strictly less than one billion.
457 This may be zero if more precision is not available.
458 (This can happen because of limitations in any of Mercurial, Python,
459 libc, the operating system, …)
460
461 If set for a file tracked anywhere,
462 `mtime` is the expected modification time for the file to be clean.
463
464 If set for an untracked node, at some point,
465 this path in the working directory was observed:
466
467 - To be a directory
468 - With the modification time given in `mtime`
469 - That time was already strictly in the past when observed,
470 meaning that later changes cannot happen in the same clock tick
471 and must cause a different modification time
472 (unless the system clock jumps back and we get unlucky,
473 which is not impossible but deemed unlikely enough).
474 - All direct children of this directory
475 (as returned by `std::fs::read_dir`)
476 either have a corresponding dirstate node,
477 or are ignored by ignore patterns whose hash is in tree metadata.
478
479 This means that if `std::fs::symlink_metadata` later reports
480 the same modification time
481 and ignored patterns haven’t changed,
482 a run of status that is not listing ignored files
483 can skip calling `std::fs::read_dir` again for this directory,
484 and iterate child dirstate nodes instead.
485
486 `MODE_EXEC_PERM`
487 Must be unset if `HAS_MODE_AND_SIZE` is unset.
488 If `HAS_MODE_AND_SIZE` is set,
489 this indicates whether the file’s owner is expected
490 to have execute permission.
491
492 `MODE_IS_SYMLINK`
493 Must be unset if `HAS_MODE_AND_SIZE` is unset.
494 If `HAS_MODE_AND_SIZE` is set,
495 this indicates whether the file is expected to be a symlink
496 as opposed to a normal file.
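The layout documented on the new side of this hunk can be illustrated with a small decoder. This is only a sketch, not Mercurial's parser: `NodeTail`, `be_u32`, and `decode_node_tail` are hypothetical names, the integers are assumed to be big-endian (as the `U32Be` fields in the Rust hunks suggest), and only the bit masks visible in this diff are used.

```rust
// Bit masks as listed on the new side of the documentation hunk.
const HAS_MODE_AND_SIZE: u8 = 1 << 3;
const HAS_MTIME: u8 = 1 << 4;
const MODE_EXEC_PERM: u8 = 1 << 5;
const MODE_IS_SYMLINK: u8 = 1 << 6;

/// Hypothetical decoded view of bytes 30..43 of a node (not a Mercurial type).
#[derive(Debug, PartialEq)]
struct NodeTail {
    /// `size` field at offset 31, only meaningful if HAS_MODE_AND_SIZE is set.
    size: Option<u32>,
    /// (seconds, nanoseconds) at offsets 35 and 39, only if HAS_MTIME is set.
    mtime: Option<(u32, u32)>,
    exec_perm: bool,
    is_symlink: bool,
}

/// Read a big-endian 32-bit integer starting at `offset`.
fn be_u32(bytes: &[u8], offset: usize) -> u32 {
    u32::from_be_bytes([
        bytes[offset],
        bytes[offset + 1],
        bytes[offset + 2],
        bytes[offset + 3],
    ])
}

/// Decode the `flags`, `size` and `mtime` fields of one 43-byte node.
fn decode_node_tail(node: &[u8; 43]) -> NodeTail {
    let flags = node[30];
    let has_mode_and_size = flags & HAS_MODE_AND_SIZE != 0;
    NodeTail {
        size: if has_mode_and_size {
            Some(be_u32(node, 31))
        } else {
            None
        },
        mtime: if flags & HAS_MTIME != 0 {
            Some((be_u32(node, 35), be_u32(node, 39)))
        } else {
            None
        },
        // The MODE_* bits must be unset when HAS_MODE_AND_SIZE is unset,
        // so they are ignored in that case.
        exec_perm: has_mode_and_size && flags & MODE_EXEC_PERM != 0,
        is_symlink: has_mode_and_size && flags & MODE_IS_SYMLINK != 0,
    }
}
```

Returning `Option` for `size` and `mtime` mirrors the rule that those fields are unused (zero) when the corresponding flag is unset, so a stored zero is not mistaken for real data.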
@@ -97,7 +97,8 b' pub(super) struct Node {'
97 pub(super) descendants_with_entry_count: Size,
97 pub(super) descendants_with_entry_count: Size,
98 pub(super) tracked_descendants_count: Size,
98 pub(super) tracked_descendants_count: Size,
99 flags: Flags,
99 flags: Flags,
100 data: Entry,
100 size: U32Be,
101 mtime: PackedTruncatedTimestamp,
101 }
102 }
102
103
103 bitflags! {
104 bitflags! {
@@ -110,23 +111,14 b' bitflags! {'
110 const HAS_MODE_AND_SIZE = 1 << 3;
111 const HAS_MODE_AND_SIZE = 1 << 3;
111 const HAS_MTIME = 1 << 4;
112 const HAS_MTIME = 1 << 4;
112 const MODE_EXEC_PERM = 1 << 5;
113 const MODE_EXEC_PERM = 1 << 5;
113 const MODE_IS_SYMLINK = 1 << 7;
114 const MODE_IS_SYMLINK = 1 << 6;
114 }
115 }
115 }
116 }
116
117
117 #[derive(BytesCast, Copy, Clone, Debug)]
118 #[repr(C)]
119 struct Entry {
120 _padding: U32Be,
121 size: U32Be,
122 mtime: U32Be,
123 }
124
125 /// Duration since the Unix epoch
118 /// Duration since the Unix epoch
126 #[derive(BytesCast, Copy, Clone)]
119 #[derive(BytesCast, Copy, Clone)]
127 #[repr(C)]
120 #[repr(C)]
128 struct PackedTimestamp {
121 struct PackedTruncatedTimestamp {
129 _padding: U32Be,
130 truncated_seconds: U32Be,
122 truncated_seconds: U32Be,
131 nanoseconds: U32Be,
123 nanoseconds: U32Be,
132 }
124 }
@@ -329,7 +321,7 b' impl Node {'
329 ) -> Result<Option<TruncatedTimestamp>, DirstateV2ParseError> {
321 ) -> Result<Option<TruncatedTimestamp>, DirstateV2ParseError> {
330 Ok(
322 Ok(
331 if self.flags.contains(Flags::HAS_MTIME) && !self.has_entry() {
323 if self.flags.contains(Flags::HAS_MTIME) && !self.has_entry() {
332 Some(self.data.as_timestamp()?)
324 Some(self.mtime.try_into()?)
333 } else {
325 } else {
334 None
326 None
335 },
327 },
@@ -356,12 +348,12 b' impl Node {'
356 let p1_tracked = self.flags.contains(Flags::P1_TRACKED);
348 let p1_tracked = self.flags.contains(Flags::P1_TRACKED);
357 let p2_info = self.flags.contains(Flags::P2_INFO);
349 let p2_info = self.flags.contains(Flags::P2_INFO);
358 let mode_size = if self.flags.contains(Flags::HAS_MODE_AND_SIZE) {
350 let mode_size = if self.flags.contains(Flags::HAS_MODE_AND_SIZE) {
359 Some((self.synthesize_unix_mode(), self.data.size.into()))
351 Some((self.synthesize_unix_mode(), self.size.into()))
360 } else {
352 } else {
361 None
353 None
362 };
354 };
363 let mtime = if self.flags.contains(Flags::HAS_MTIME) {
355 let mtime = if self.flags.contains(Flags::HAS_MTIME) {
364 Some(self.data.mtime.into())
356 Some(self.mtime.truncated_seconds.into())
365 } else {
357 } else {
366 None
358 None
367 };
359 };
@@ -407,10 +399,10 b' impl Node {'
407 tracked_descendants_count: self.tracked_descendants_count.get(),
399 tracked_descendants_count: self.tracked_descendants_count.get(),
408 })
400 })
409 }
401 }
410 }
411
402
412 impl Entry {
403 fn from_dirstate_entry(
413 fn from_dirstate_entry(entry: &DirstateEntry) -> (Flags, Self) {
404 entry: &DirstateEntry,
405 ) -> (Flags, U32Be, PackedTruncatedTimestamp) {
414 let (wdir_tracked, p1_tracked, p2_info, mode_size_opt, mtime_opt) =
406 let (wdir_tracked, p1_tracked, p2_info, mode_size_opt, mtime_opt) =
415 entry.v2_data();
407 entry.v2_data();
416 // TODO: convert through raw flag bits instead?
408 // TODO: convert through raw flag bits instead?
@@ -418,53 +410,26 b' impl Entry {'
418 flags.set(Flags::WDIR_TRACKED, wdir_tracked);
410 flags.set(Flags::WDIR_TRACKED, wdir_tracked);
419 flags.set(Flags::P1_TRACKED, p1_tracked);
411 flags.set(Flags::P1_TRACKED, p1_tracked);
420 flags.set(Flags::P2_INFO, p2_info);
412 flags.set(Flags::P2_INFO, p2_info);
421 let (size, mtime);
413 let size = if let Some((m, s)) = mode_size_opt {
422 if let Some((m, s)) = mode_size_opt {
423 let exec_perm = m & libc::S_IXUSR != 0;
414 let exec_perm = m & libc::S_IXUSR != 0;
424 let is_symlink = m & libc::S_IFMT == libc::S_IFLNK;
415 let is_symlink = m & libc::S_IFMT == libc::S_IFLNK;
425 flags.set(Flags::MODE_EXEC_PERM, exec_perm);
416 flags.set(Flags::MODE_EXEC_PERM, exec_perm);
426 flags.set(Flags::MODE_IS_SYMLINK, is_symlink);
417 flags.set(Flags::MODE_IS_SYMLINK, is_symlink);
427 size = s;
418 flags.insert(Flags::HAS_MODE_AND_SIZE);
428 flags.insert(Flags::HAS_MODE_AND_SIZE)
419 s.into()
429 } else {
420 } else {
430 size = 0;
421 0.into()
431 }
432 if let Some(m) = mtime_opt {
433 mtime = m;
434 flags.insert(Flags::HAS_MTIME);
435 } else {
436 mtime = 0;
437 }
438 let raw_entry = Entry {
439 _padding: 0.into(),
440 size: size.into(),
441 mtime: mtime.into(),
442 };
422 };
443 (flags, raw_entry)
423 let mtime = if let Some(m) = mtime_opt {
444 }
424 flags.insert(Flags::HAS_MTIME);
445
425 PackedTruncatedTimestamp {
446 fn from_timestamp(timestamp: TruncatedTimestamp) -> Self {
426 truncated_seconds: m.into(),
447 let packed = PackedTimestamp {
427 nanoseconds: 0.into(),
448 _padding: 0.into(),
428 }
449 truncated_seconds: timestamp.truncated_seconds().into(),
429 } else {
450 nanoseconds: timestamp.nanoseconds().into(),
430 PackedTruncatedTimestamp::null()
451 };
431 };
452 // Safety: both types implement the `ByteCast` trait, so we could
432 (flags, size, mtime)
453 // safely use `as_bytes` and `from_bytes` to do this conversion. Using
454 // `transmute` instead makes the compiler check that the two types
455 // have the same size, which eliminates the error case of
456 // `from_bytes`.
457 unsafe { std::mem::transmute::<PackedTimestamp, Entry>(packed) }
458 }
459
460 fn as_timestamp(self) -> Result<TruncatedTimestamp, DirstateV2ParseError> {
461 // Safety: same as above in `from_timestamp`
462 let packed =
463 unsafe { std::mem::transmute::<Entry, PackedTimestamp>(self) };
464 TruncatedTimestamp::from_already_truncated(
465 packed.truncated_seconds.get(),
466 packed.nanoseconds.get(),
467 )
468 }
433 }
469 }
434 }
470
435
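The mode handling in `from_dirstate_entry` above reduces a Unix mode to two flag bits. The sketch below shows that mapping on its own. It is an illustration only: `mode_flags` is a hypothetical helper, and the `S_*` constants are written out with their usual Linux values instead of being taken from the `libc` crate as in the diff.

```rust
// Usual Linux values of the POSIX mode constants (assumption: the real code
// takes these from the `libc` crate, which may differ per platform).
const S_IFMT: u32 = 0o170000; // mask for the file-type bits
const S_IFLNK: u32 = 0o120000; // file type: symbolic link
const S_IXUSR: u32 = 0o100; // owner has execute permission

// Bit masks from the new side of this diff.
const MODE_EXEC_PERM: u8 = 1 << 5;
const MODE_IS_SYMLINK: u8 = 1 << 6;

/// Hypothetical helper: derive the two MODE_* flag bits from a Unix mode.
fn mode_flags(mode: u32) -> u8 {
    let mut flags = 0;
    if mode & S_IXUSR != 0 {
        flags |= MODE_EXEC_PERM;
    }
    if mode & S_IFMT == S_IFLNK {
        flags |= MODE_IS_SYMLINK;
    }
    flags
}
```

For example, a regular executable file with mode `0o100755` yields only `MODE_EXEC_PERM`, while a symlink with mode `0o120777` yields both bits.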
@@ -610,20 +575,17 b" impl Writer<'_, '_> {"
610 };
575 };
611 on_disk_nodes.push(match node {
576 on_disk_nodes.push(match node {
612 NodeRef::InMemory(path, node) => {
577 NodeRef::InMemory(path, node) => {
613 let (flags, data) = match &node.data {
578 let (flags, size, mtime) = match &node.data {
614 dirstate_map::NodeData::Entry(entry) => {
579 dirstate_map::NodeData::Entry(entry) => {
615 Entry::from_dirstate_entry(entry)
580 Node::from_dirstate_entry(entry)
616 }
581 }
617 dirstate_map::NodeData::CachedDirectory { mtime } => {
582 dirstate_map::NodeData::CachedDirectory { mtime } => {
618 (Flags::HAS_MTIME, Entry::from_timestamp(*mtime))
583 (Flags::HAS_MTIME, 0.into(), (*mtime).into())
619 }
584 }
620 dirstate_map::NodeData::None => (
585 dirstate_map::NodeData::None => (
621 Flags::empty(),
586 Flags::empty(),
622 Entry {
587 0.into(),
623 _padding: 0.into(),
588 PackedTruncatedTimestamp::null(),
624 size: 0.into(),
625 mtime: 0.into(),
626 },
627 ),
589 ),
628 };
590 };
629 Node {
591 Node {
@@ -641,7 +603,8 b" impl Writer<'_, '_> {"
641 .tracked_descendants_count
603 .tracked_descendants_count
642 .into(),
604 .into(),
643 flags,
605 flags,
644 data,
606 size,
607 mtime,
645 }
608 }
646 }
609 }
647 NodeRef::OnDisk(node) => Node {
610 NodeRef::OnDisk(node) => Node {
@@ -725,3 +688,33 b' fn path_len_from_usize(x: usize) -> Path'
725 .expect("dirstate-v2 path length overflow")
688 .expect("dirstate-v2 path length overflow")
726 .into()
689 .into()
727 }
690 }
691
692 impl From<TruncatedTimestamp> for PackedTruncatedTimestamp {
693 fn from(timestamp: TruncatedTimestamp) -> Self {
694 Self {
695 truncated_seconds: timestamp.truncated_seconds().into(),
696 nanoseconds: timestamp.nanoseconds().into(),
697 }
698 }
699 }
700
701 impl TryFrom<PackedTruncatedTimestamp> for TruncatedTimestamp {
702 type Error = DirstateV2ParseError;
703
704 fn try_from(
705 timestamp: PackedTruncatedTimestamp,
706 ) -> Result<Self, Self::Error> {
707 Self::from_already_truncated(
708 timestamp.truncated_seconds.get(),
709 timestamp.nanoseconds.get(),
710 )
711 }
712 }
713 impl PackedTruncatedTimestamp {
714 fn null() -> Self {
715 Self {
716 truncated_seconds: 0.into(),
717 nanoseconds: 0.into(),
718 }
719 }
720 }
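The `From`/`TryFrom` pair added at the end of this diff can be mimicked without Mercurial's types. The following is a minimal sketch under stated assumptions: `Timestamp`, `Packed`, and `InvalidTimestamp` are hypothetical stand-ins, and the validation performed here (seconds fit in 31 bits, nanoseconds strictly below one billion) follows the documentation text; what `from_already_truncated` actually checks is not shown in this diff.

```rust
use std::convert::TryFrom;

const NSEC_PER_SEC: u32 = 1_000_000_000;
const LOWER_31_BITS: u32 = 0x7FFF_FFFF;

/// Hypothetical in-memory timestamp (stand-in for `TruncatedTimestamp`).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Timestamp {
    truncated_seconds: u32,
    nanoseconds: u32,
}

/// Hypothetical on-disk form: two big-endian u32 values, 8 bytes in total.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Packed([u8; 8]);

/// Hypothetical error type (stand-in for `DirstateV2ParseError`).
#[derive(Debug, PartialEq)]
struct InvalidTimestamp;

impl Timestamp {
    /// Keep only the lower 31 bits of the seconds, as the format requires.
    fn new(seconds: i64, nanoseconds: u32) -> Self {
        Timestamp {
            truncated_seconds: (seconds as u32) & LOWER_31_BITS,
            nanoseconds,
        }
    }
}

impl From<Timestamp> for Packed {
    fn from(ts: Timestamp) -> Self {
        let s = ts.truncated_seconds.to_be_bytes();
        let n = ts.nanoseconds.to_be_bytes();
        Packed([s[0], s[1], s[2], s[3], n[0], n[1], n[2], n[3]])
    }
}

impl TryFrom<Packed> for Timestamp {
    type Error = InvalidTimestamp;

    fn try_from(p: Packed) -> Result<Self, Self::Error> {
        let b = p.0;
        let seconds = u32::from_be_bytes([b[0], b[1], b[2], b[3]]);
        let nanoseconds = u32::from_be_bytes([b[4], b[5], b[6], b[7]]);
        // Assumed validation: reject values the format says cannot occur.
        if seconds & !LOWER_31_BITS != 0 || nanoseconds >= NSEC_PER_SEC {
            return Err(InvalidTimestamp);
        }
        Ok(Timestamp { truncated_seconds: seconds, nanoseconds })
    }
}
```

Packing is infallible while unpacking is fallible because on-disk bytes may be corrupt, which is the same asymmetry the diff expresses with `From` in one direction and `TryFrom` in the other.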