dirstate-v2: hash the source of the ignore patterns as well...
Raphaël Gomès
r50453:363923bd stable
@@ -1,616 +1,624
The *dirstate* is what Mercurial uses internally to track
the state of files in the working directory,
such as set by commands like `hg add` and `hg rm`.
It also contains some cached data that help make `hg status` faster.
The name refers both to `.hg/dirstate` on the filesystem
and the corresponding data structure in memory while a Mercurial process
is running.

The original file format, retroactively dubbed `dirstate-v1`,
is described at https://www.mercurial-scm.org/wiki/DirState.
It is made of a flat sequence of unordered variable-size entries,
so accessing any information in it requires parsing all of it.
Similarly, saving changes requires rewriting the entire file.

The newer `dirstate-v2` file format is designed to fix these limitations
and make `hg status` faster.

User guide
==========

Compatibility
-------------

The file format is experimental and may still change.
Different versions of Mercurial may not be compatible with each other
when working on a local repository that uses this format.
When using an incompatible version with the experimental format,
anything can happen, including data corruption.

Since the dirstate is entirely local and not relevant to the wire protocol,
`dirstate-v2` does not affect compatibility with remote Mercurial versions.

When `share-safe` is enabled, different repositories sharing the same store
can use different dirstate formats.

Enabling `dirstate-v2` for new local repositories
-------------------------------------------------

When creating a new local repository such as with `hg init` or `hg clone`,
the `use-dirstate-v2` boolean in the `format` configuration section
controls whether to use this file format.
This is disabled by default as of this writing.
To enable it for a single repository, run for example::

  $ hg init my-project --config format.use-dirstate-v2=1

Checking the format of an existing local repository
----------------------------------------------------

The `debugformat` command prints information about
which of multiple optional formats are used in the current repository,
including `dirstate-v2`::

  $ hg debugformat
  format-variant repo
  fncache: yes
  dirstate-v2: yes
  […]


Upgrading or downgrading an existing local repository
------------------------------------------------------

The `debugupgrade` command does various upgrades or downgrades
on a local repository
based on the current Mercurial version and on configuration.
The same `format.use-dirstate-v2` configuration is used again.

Example to upgrade::

  $ hg debugupgrade --config format.use-dirstate-v2=1

Example to downgrade to `dirstate-v1`::

  $ hg debugupgrade --config format.use-dirstate-v2=0

Both of these commands do nothing but print a list of proposed changes,
which may include changes unrelated to the dirstate.
Those other changes are controlled by their own configuration keys.
Add `--run` to a command to actually apply the proposed changes.

Backups of `.hg/requires` and `.hg/dirstate` are created
in a `.hg/upgradebackup.*` directory.
If something goes wrong, restoring those files should undo the change.

Note that upgrading affects compatibility with older versions of Mercurial
as noted above.
This can be relevant when a repository’s files are on a USB drive
or some other removable media, or shared over the network, etc.

Internal filesystem representation
==================================

Requirements file
-----------------

The `.hg/requires` file indicates which of various optional file formats
are used by a given repository.
Mercurial aborts when seeing a requirement it does not know about,
which avoids older versions accidentally messing up a repository
that uses a format that was introduced later.
For versions that do support a format, the presence or absence of
the corresponding requirement indicates whether to use that format.

When the file contains a `dirstate-v2` line,
the `dirstate-v2` format is used.
With no such line, `dirstate-v1` is used.

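For illustration, a repository created with `format.use-dirstate-v2=1`
might have a requirements file such as the following; the `dirstate-v2`
line is the relevant one here, and the exact set of other entries varies
with Mercurial version and configuration::

  $ cat .hg/requires
  dirstate-v2
  dotencode
  fncache
  generaldelta
  revlogv1
  sparserevlog
  store
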
High level description
----------------------

Whereas `dirstate-v1` uses a single `.hg/dirstate` file,
in `dirstate-v2` that file is a "docket" file
that only contains some metadata
and points to a separate data file named `.hg/dirstate.{ID}`,
where `{ID}` is a random identifier.

This separation allows making data files append-only
and therefore safer to memory-map.
Creating a new data file (occasionally to clean up unused data)
can be done with a different ID
without disrupting another Mercurial process
that could still be using the previous data file.

Both files have a format designed to reduce the need for parsing,
by using fixed-size binary components as much as possible.
For data that is not fixed-size,
references to other parts of a file can be made by storing "pseudo-pointers":
integers counted in bytes from the start of a file.
For read-only access no data structure is needed,
only a bytes buffer (possibly memory-mapped directly from the filesystem)
with specific parts read on demand.

The data file contains "nodes" organized in a tree.
Each node represents a file or directory inside the working directory
or its parent changeset.
This tree has the same structure as the filesystem,
so a node representing a directory has child nodes representing
the files and subdirectories contained directly in that directory.

The docket file format
----------------------

This is implemented in `rust/hg-core/src/dirstate_tree/on_disk.rs`
and `mercurial/dirstateutils/docket.py`.

Components of the docket file are found at fixed offsets,
counted in bytes from the start of the file:

* Offset 0:
  The 12-byte marker string "dirstate-v2\n", ending with a newline character.
  This makes it easier to tell a dirstate-v2 file from a dirstate-v1 file,
  although it is not strictly necessary
  since `.hg/requires` determines which format to use.

* Offset 12:
  The changeset node ID of the first parent of the working directory,
  as up to 32 binary bytes.
  If a node ID is shorter (20 bytes for SHA-1),
  it is start-aligned and the rest of the bytes are set to zero.

* Offset 44:
  The changeset node ID of the second parent of the working directory,
  or all zeros if there isn’t one.
  Also 32 binary bytes.

* Offset 76:
  Tree metadata on 44 bytes, described below.
  Its separation in this documentation from the rest of the docket
  reflects a detail of the current implementation.
  Since tree metadata is also made of fields at fixed offsets, those could
  be inlined here by adding 76 bytes to each offset.

* Offset 120:
  The used size of the data file, as a 32-bit big-endian integer.
  The actual size of the data file may be larger
  (if another Mercurial process is appending to it
  but has not updated the docket yet).
  That extra data must be ignored.

* Offset 124:
  The length of the data file identifier, as an 8-bit integer.

* Offset 125:
  The data file identifier.

* Any additional data is currently ignored, and dropped when updating the file.

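As an illustration of these fixed offsets, here is a minimal sketch, in
Rust and with hypothetical names, of decoding the docket header from a
bytes buffer (the actual reader lives in
`rust/hg-core/src/dirstate_tree/on_disk.rs`)::

  use std::convert::TryInto;

  /// Simplified, hypothetical view of the docket header described above.
  #[derive(Debug)]
  struct DocketSketch<'a> {
      parent_1: &'a [u8],      // 32 bytes, zero-padded node ID
      parent_2: &'a [u8],      // 32 bytes, zero-padded node ID
      tree_metadata: &'a [u8], // 44 bytes, interpreted separately
      data_used_size: u32,     // meaningful length of the data file
      data_file_id: &'a [u8],  // suffix of the `.hg/dirstate.{ID}` name
  }

  fn parse_docket(bytes: &[u8]) -> Result<DocketSketch<'_>, &'static str> {
      if bytes.len() < 125 || &bytes[..12] != b"dirstate-v2\n" {
          return Err("not a dirstate-v2 docket");
      }
      let used_size = u32::from_be_bytes(bytes[120..124].try_into().unwrap());
      let id_len = bytes[124] as usize;
      let id = bytes.get(125..125 + id_len).ok_or("truncated docket")?;
      // Anything after the identifier is ignored, as described above.
      Ok(DocketSketch {
          parent_1: &bytes[12..44],
          parent_2: &bytes[44..76],
          tree_metadata: &bytes[76..120],
          data_used_size: used_size,
          data_file_id: id,
      })
  }
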
Tree metadata in the docket file
--------------------------------

Tree metadata is similarly made of components at fixed offsets.
These offsets are counted in bytes from the start of tree metadata,
which is 76 bytes after the start of the docket file.

This metadata can be thought of as the singular root of the tree
formed by nodes in the data file.

* Offset 0:
  Pseudo-pointer to the start of root nodes,
  counted in bytes from the start of the data file,
  as a 32-bit big-endian integer.
  These nodes describe files and directories found directly
  at the root of the working directory.

* Offset 4:
  Number of root nodes, as a 32-bit big-endian integer.

* Offset 8:
  Total number of nodes in the entire tree that "have a dirstate entry",
  as a 32-bit big-endian integer.
  Those nodes represent files that would be present at all in `dirstate-v1`.
  This is typically less than the total number of nodes.
  This counter is used to implement `len(dirstatemap)`.

* Offset 12:
  Number of nodes in the entire tree that have a copy source,
  as a 32-bit big-endian integer.
  At the next commit, these files are recorded
  as having been copied or moved/renamed from that source.
  (A move is recorded as a copy and separate removal of the source.)
  This counter is used to implement `len(dirstatemap.copymap)`.

* Offset 16:
  An estimation of how many bytes of the data file
  (within its used size) are unused, as a 32-bit big-endian integer.
  When appending to an existing data file,
  some existing nodes or paths can be unreachable from the new root
  but they still take up space.
  This counter is used to decide when to write a new data file from scratch
  instead of appending to an existing one,
  in order to get rid of that unreachable data
  and avoid unbounded file size growth.

* Offset 20:
  These four bytes are currently ignored
  and reset to zero when updating a docket file.
  This is an attempt at forward compatibility:
  future Mercurial versions could use this as a bit field
  to indicate that a dirstate has additional data or constraints.
  Finding a dirstate file with the relevant bit unset indicates that
  it was written by a then-older version
  which is not aware of that future change.

* Offset 24:
  Either 20 zero bytes, or a SHA-1 hash as 20 binary bytes.
  When present, the hash is of ignore patterns
  that were used for some previous run of the `status` algorithm.

* (Offset 44: end of tree metadata)

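Continuing the sketch above, the 44 bytes of tree metadata could be decoded
like this (hypothetical names again, not the actual implementation)::

  use std::convert::TryInto;

  /// Simplified view of the 44-byte tree metadata block described above.
  #[derive(Debug)]
  struct TreeMetadataSketch {
      root_nodes_start: u32,          // pseudo-pointer into the data file
      root_nodes_count: u32,
      nodes_with_entry: u32,          // backs `len(dirstatemap)`
      nodes_with_copy_source: u32,    // backs `len(dirstatemap.copymap)`
      unreachable_bytes: u32,         // estimation used to trigger rewrites
      ignore_patterns_hash: [u8; 20], // all zeros when absent
  }

  fn parse_tree_metadata(meta: &[u8; 44]) -> TreeMetadataSketch {
      let be32 = |r: std::ops::Range<usize>| {
          u32::from_be_bytes(meta[r].try_into().unwrap())
      };
      TreeMetadataSketch {
          root_nodes_start: be32(0..4),
          root_nodes_count: be32(4..8),
          nodes_with_entry: be32(8..12),
          nodes_with_copy_source: be32(12..16),
          unreachable_bytes: be32(16..20),
          // bytes 20..24 are the reserved, currently-ignored bit field
          ignore_patterns_hash: meta[24..44].try_into().unwrap(),
      }
  }
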
Optional hash of ignore patterns
--------------------------------

The implementation of `status` at `rust/hg-core/src/dirstate_tree/status.rs`
has been optimized such that its run time is dominated by calls
to `stat` for reading the filesystem metadata of a file or directory,
and to `readdir` for listing the contents of a directory.
In some cases the algorithm can skip calls to `readdir`
(saving significant time)
because the dirstate already contains enough of the relevant information
to build the correct `status` results.

The default configuration of `hg status` is to list unknown files
but not ignored files.
In this case, it matters for the `readdir`-skipping optimization
if a given file used to be ignored but became unknown
because `.hgignore` changed.
To detect the possibility of such a change,
the tree metadata contains an optional hash of all ignore patterns.

We define:

* "Root" ignore files as:

  - `.hgignore` at the root of the repository, if it exists
  - and all files from the `ui.ignore.*` config.

  This set of files is sorted by the string representation of their path.

* The "expanded contents" of an ignore file as the byte string made
  by the concatenation of its contents followed by the "expanded contents"
  of other files included with `include:` or `subinclude:` directives,
  in inclusion order. This definition is recursive, as included files can
  themselves include more files.

* "filepath" as the bytes of the ignore file path
  relative to the root of the repository if inside the repository,
  or the untouched path as defined in the configuration.

This hash is defined as the SHA-1 of the following line format::

  <filepath> <sha1 of the "expanded contents">\n

for each "root" ignore file, in sorted order.

(Note that computing this does not require actually concatenating
into a single contiguous byte sequence.
Instead a SHA-1 hasher object can be created
and fed separate chunks one by one.)

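A minimal sketch of this computation, assuming the sorted list of "root"
ignore files has already been expanded, could look like this (hypothetical
signature, shown only to restate the definition; the real code is in
`rust/hg-core/src/dirstate_tree/status.rs`)::

  use sha1::{Digest, Sha1};

  /// `files` pairs each "root" ignore file's "filepath" with its
  /// "expanded contents", already in sorted order.
  fn ignore_patterns_hash(files: &[(Vec<u8>, Vec<u8>)]) -> [u8; 20] {
      let mut hasher = Sha1::new();
      for (filepath, expanded_contents) in files {
          // <filepath> <sha1 of the "expanded contents">\n
          hasher.update(filepath);
          hasher.update(b" ");
          hasher.update(Sha1::digest(expanded_contents));
          hasher.update(b"\n");
      }
      let mut hash = [0u8; 20];
      hash.copy_from_slice(hasher.finalize().as_slice());
      hash
  }
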
The data file format
--------------------

This is implemented in `rust/hg-core/src/dirstate_tree/on_disk.rs`
and `mercurial/dirstateutils/v2.py`.

The data file contains two types of data: paths and nodes.

Paths and nodes can be organized in any order in the file, except that sibling
nodes must be next to each other and sorted by their path.
Contiguity lets the parent refer to them all
by their count and a single pseudo-pointer,
instead of storing one pseudo-pointer per child node.
Sorting allows using binary search to find a child node with a given name
in `O(log(n))` byte sequence comparisons.

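As an illustration of that lookup, a simplified binary search over a
contiguous, sorted run of fixed-size sibling nodes could look like the
following; `base_name_of` is a hypothetical helper standing in for the path
resolution described below::

  use std::cmp::Ordering;

  /// `nodes` is the byte range designated by the parent's child
  /// pseudo-pointer and count, `node_size` is the fixed per-node size.
  fn find_child<'a>(
      nodes: &'a [u8],
      node_size: usize,
      base_name_of: impl Fn(&[u8]) -> &[u8],
      wanted: &[u8],
  ) -> Option<&'a [u8]> {
      let (mut low, mut high) = (0, nodes.len() / node_size);
      while low < high {
          let mid = (low + high) / 2;
          let node = &nodes[mid * node_size..(mid + 1) * node_size];
          match base_name_of(node).cmp(&wanted) {
              Ordering::Equal => return Some(node),
              Ordering::Less => low = mid + 1,
              Ordering::Greater => high = mid,
          }
      }
      None
  }
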
The current implementation writes paths and child nodes before a given node
for ease of figuring out the value of pseudo-pointers by the time they are to
be written, but this is not an obligation and readers must not rely on it.

A path is stored as a byte string anywhere in the file, without delimiter.
It is referred to by one or more nodes by a pseudo-pointer to its start, and
its length in bytes. Since there is no delimiter,
when a path is a substring of another the same bytes could be reused,
although the implementation does not exploit this as of this writing.

A node is stored on 44 bytes with components at fixed offsets. Paths and
child nodes relevant to a node are stored externally and referenced through
pseudo-pointers.

All integers are stored in big-endian. All pseudo-pointers are 32-bit integers
counting bytes from the start of the data file. Path lengths and positions
are 16-bit integers, also counted in bytes.

Node components are:

* Offset 0:
  Pseudo-pointer to the full path of this node,
  from the working directory root.

* Offset 4:
  Length of the full path.

* Offset 6:
  Position of the last `/` path separator within the full path,
  in bytes from the start of the full path,
  or zero if there isn’t one.
  The part of the full path after this position is the "base name".
  Since sibling nodes have the same parent, only their base names vary
  and need to be considered when doing binary search to find a given path.

* Offset 8:
  Pseudo-pointer to the "copy source" path for this node,
  or zero if there is no copy source.

* Offset 12:
  Length of the copy source path, or zero if there isn’t one.

* Offset 14:
  Pseudo-pointer to the start of child nodes.

* Offset 18:
  Number of child nodes, as a 32-bit integer.
  They occupy 44 times this number of bytes
  (not counting space for paths, and further descendants).

* Offset 22:
  Number as a 32-bit integer of descendant nodes in this subtree,
  not including this node itself,
  that "have a dirstate entry".
  Those nodes represent files that would be present at all in `dirstate-v1`.
  This is typically less than the total number of descendants.
  This counter is used to implement `has_dir`.

* Offset 26:
  Number as a 32-bit integer of descendant nodes in this subtree,
  not including this node itself,
  that represent files tracked in the working directory.
  (For example, `hg rm` makes a file untracked.)
  This counter is used to implement `has_tracked_dir`.

* Offset 30:
  A `flags` field that packs some boolean values as bits of a 16-bit integer.
  Starting from least-significant, bit masks are::

    WDIR_TRACKED = 1 << 0
    P1_TRACKED = 1 << 1
    P2_INFO = 1 << 2
    MODE_EXEC_PERM = 1 << 3
    MODE_IS_SYMLINK = 1 << 4
    HAS_FALLBACK_EXEC = 1 << 5
    FALLBACK_EXEC = 1 << 6
    HAS_FALLBACK_SYMLINK = 1 << 7
    FALLBACK_SYMLINK = 1 << 8
    EXPECTED_STATE_IS_MODIFIED = 1 << 9
    HAS_MODE_AND_SIZE = 1 << 10
    HAS_MTIME = 1 << 11
    MTIME_SECOND_AMBIGUOUS = 1 << 12
    DIRECTORY = 1 << 13
    ALL_UNKNOWN_RECORDED = 1 << 14
    ALL_IGNORED_RECORDED = 1 << 15

  The meaning of each bit is described below.

  Other bits are unset.
  They may be assigned meaning in the future,
  with the limitation that Mercurial versions that pre-date such meaning
  will always reset those bits to unset when writing nodes.
  (A new node is written for any mutation in its subtree,
  leaving the bytes of the old node unreachable
  until the data file is rewritten entirely.)

* Offset 32:
  A `size` field described below, as a 32-bit integer.
  Unlike in dirstate-v1, negative values are not used.

* Offset 36:
  The seconds component of an `mtime` field described below,
  as a 32-bit integer.
  Unlike in dirstate-v1, negative values are not used.
  When `mtime` is used, this is the number of seconds since the Unix epoch
  truncated to its lower 31 bits.

* Offset 40:
  The nanoseconds component of an `mtime` field described below,
  as a 32-bit integer.
  When `mtime` is used,
  this is the number of nanoseconds since `mtime.seconds`,
  always strictly less than one billion.

  This may be zero if more precision is not available.
  (This can happen because of limitations in any of Mercurial, Python,
  libc, the operating system, …)

  When comparing two mtimes and either has this component set to zero,
  the sub-second precision of both should be ignored.
  False positives when checking mtime equality due to clock resolution
  are always possible and the status algorithm needs to deal with them,
  but having too many false negatives could be harmful too.

* (Offset 44: end of this node)

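Putting the offsets above together, a simplified sketch of decoding one node
could look like this (hypothetical names; the actual code is in
`rust/hg-core/src/dirstate_tree/on_disk.rs`)::

  use std::convert::TryInto;

  /// Simplified view of one fixed-size node as described above.
  #[derive(Debug)]
  struct NodeSketch {
      path_start: u32,        // pseudo-pointer to the full path
      path_len: u16,
      base_name_start: u16,   // position of the last `/` in the full path
      copy_source_start: u32, // zero if there is no copy source
      copy_source_len: u16,
      children_start: u32,    // pseudo-pointer to the first child node
      children_count: u32,
      descendants_with_entry: u32,
      tracked_descendants: u32,
      flags: u16,             // bit masks listed above
      size: u32,
      mtime_seconds: u32,
      mtime_nanoseconds: u32,
  }

  fn parse_node(node: &[u8; 44]) -> NodeSketch {
      let be32 = |r: std::ops::Range<usize>| {
          u32::from_be_bytes(node[r].try_into().unwrap())
      };
      let be16 = |r: std::ops::Range<usize>| {
          u16::from_be_bytes(node[r].try_into().unwrap())
      };
      NodeSketch {
          path_start: be32(0..4),
          path_len: be16(4..6),
          base_name_start: be16(6..8),
          copy_source_start: be32(8..12),
          copy_source_len: be16(12..14),
          children_start: be32(14..18),
          children_count: be32(18..22),
          descendants_with_entry: be32(22..26),
          tracked_descendants: be32(26..30),
          flags: be16(30..32),
          size: be32(32..36),
          mtime_seconds: be32(36..40),
          mtime_nanoseconds: be32(40..44),
      }
  }
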
The meaning of the boolean values packed in `flags` is:

`WDIR_TRACKED`
  Set if the working directory contains a tracked file at this node’s path.
  This is typically set and unset by `hg add` and `hg rm`.

`P1_TRACKED`
  Set if the working directory’s first parent changeset
  (whose node identifier is found in tree metadata)
  contains a tracked file at this node’s path.
  This is a cache to reduce manifest lookups.

`P2_INFO`
  Set if the file has been involved in some merge operation.
  Either because it was actually merged,
  or because the version in the second parent (p2) was ahead,
  or because some rename moved it there.
  In either case `hg status` will want it displayed as modified.

Files that would be mentioned at all in the `dirstate-v1` file format
have a node with at least one of the above three bits set in `dirstate-v2`.
Let’s call these files "tracked anywhere",
and "untracked" the nodes with all three of these bits unset.
Untracked nodes are typically for directories:
they hold child nodes and form the tree structure.
Although implementations should strive to clean up nodes
that are entirely unused, additional untracked nodes may also exist.
For example, a future version of Mercurial might in some cases
add nodes for untracked files and/or ignored files in the working directory
in order to optimize `hg status`
by enabling it to skip `readdir` in more cases.

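In other words, using the bit masks listed earlier, the distinction can be
expressed as a simple flag test (illustrative sketch only)::

  const WDIR_TRACKED: u16 = 1 << 0;
  const P1_TRACKED: u16 = 1 << 1;
  const P2_INFO: u16 = 1 << 2;

  /// A node is "tracked anywhere" when at least one of the three bits is
  /// set; with all three unset it is an "untracked" node, typically a
  /// directory holding child nodes.
  fn tracked_anywhere(flags: u16) -> bool {
      flags & (WDIR_TRACKED | P1_TRACKED | P2_INFO) != 0
  }
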
`HAS_MODE_AND_SIZE`
  Must be unset for untracked nodes.
  For files tracked anywhere, if this is set:

  - The `size` field is the expected file size,
    in bytes, truncated to its lower 31 bits.
  - The expected execute permission for the file’s owner
    is given by `MODE_EXEC_PERM`.
  - The expected file type is given by `MODE_IS_SYMLINK`:
    a symbolic link if set, or a normal file if unset.

  If this is unset, the expected size, permission, and file type are unknown.
  The `size` field is unused (set to zero).

`HAS_MTIME`
  The node contains a "valid" last modification time in the `mtime` field.

  It means the `mtime` was already strictly in the past when observed,
  meaning that later changes cannot happen in the same clock tick
  and must cause a different modification time
  (unless the system clock jumps back and we get unlucky,
  which is not impossible but deemed unlikely enough).

  This means that if `std::fs::symlink_metadata` later reports
  the same modification time
  and ignored patterns haven’t changed,
  we can assume the node to be unchanged on disk.

  The `mtime` field can then be used to skip more expensive lookup when
  checking the status of "tracked" nodes.

  It can also be set for nodes where `DIRECTORY` is set.
  See the `DIRECTORY` documentation for details.

`DIRECTORY`
  When set, this entry will match a directory that exists or existed on the
  file system.

  * When `HAS_MTIME` is set, a directory has been seen on the file system and
    `mtime` matches its last modification time. However, `HAS_MTIME` not
    being set does not indicate the lack of a directory on the file system.

  * When not tracked anywhere, this node does not represent an ignored or
    unknown file on disk.

  If `HAS_MTIME` is set
  and `mtime` matches the last modification time of the directory on disk,
  the directory is unchanged
  and we can skip calling `std::fs::read_dir` again for this directory,
  and iterate child dirstate nodes instead
  (as long as `ALL_UNKNOWN_RECORDED` and `ALL_IGNORED_RECORDED` are taken
  into account).

`MODE_EXEC_PERM`
  Must be unset if `HAS_MODE_AND_SIZE` is unset.
  If `HAS_MODE_AND_SIZE` is set,
  this indicates whether the file’s owner is expected
  to have execute permission.

  Beware that on systems without filesystem support for this information,
  the value stored in the dirstate might be wrong and should not be relied on.

`MODE_IS_SYMLINK`
  Must be unset if `HAS_MODE_AND_SIZE` is unset.
  If `HAS_MODE_AND_SIZE` is set,
  this indicates whether the file is expected to be a symlink
  as opposed to a normal file.

  Beware that on systems without filesystem support for this information,
  the value stored in the dirstate might be wrong and should not be relied on.

`EXPECTED_STATE_IS_MODIFIED`
  Must be unset for untracked nodes.
  For:

  - a file tracked anywhere,
  - that has expected metadata (`HAS_MODE_AND_SIZE` and `HAS_MTIME`),
  - if that metadata matches
    metadata found in the working directory with `stat`,

  this bit indicates the status of the file.
  If set, the status is modified. If unset, it is clean.

  In cases where `hg status` needs to read the contents of a file
  because metadata is ambiguous, this bit lets it record the result
  if the result is modified so that a future run of `hg status`
  does not need to do the same again.
  It is valid to never set this bit,
  and consider expected metadata ambiguous if it is set.

`ALL_UNKNOWN_RECORDED`
  If set, all "unknown" children existing on disk (at the time of the last
  status) have been recorded and the `mtime` associated with
  `DIRECTORY` can be used for optimization even when "unknown" files
  are listed.

  Note that the number of recorded "unknown" children can still be zero
  if none were present.

  Also note that having this flag unset does not imply that no "unknown"
  children have been recorded. Some might be present, but there is
  no guarantee that it will be all of them.

`ALL_IGNORED_RECORDED`
  If set, all "ignored" children existing on disk (at the time of the last
  status) have been recorded and the `mtime` associated with
  `DIRECTORY` can be used for optimization even when "ignored" files
  are listed.

  Note that the number of recorded "ignored" children can still be zero
  if none were present.

  Also note that having this flag unset does not imply that no "ignored"
  children have been recorded. Some might be present, but there is
  no guarantee that it will be all of them.

`HAS_FALLBACK_EXEC`
  If this flag is set, the entry carries "fallback" information for the
  executable bit in the `FALLBACK_EXEC` flag.

  Fallback information can be stored in the dirstate to keep track of
  filesystem attributes tracked by Mercurial when the underlying file
  system or operating system does not support that property (e.g.
  Windows).

`FALLBACK_EXEC`
  Should be ignored if `HAS_FALLBACK_EXEC` is unset. If set, the file for this
  entry should be considered executable if that information cannot be
  extracted from the file system. If unset, it should be considered
  non-executable instead.

`HAS_FALLBACK_SYMLINK`
  If this flag is set, the entry carries "fallback" information for symbolic
  link status in the `FALLBACK_SYMLINK` flag.

  Fallback information can be stored in the dirstate to keep track of
  filesystem attributes tracked by Mercurial when the underlying file
  system or operating system does not support that property (e.g.
  Windows).

`FALLBACK_SYMLINK`
  Should be ignored if `HAS_FALLBACK_SYMLINK` is unset. If set, the file for
  this entry should be considered a symlink if that information cannot be
  extracted from the file system. If unset, it should be considered a normal
  file instead.

`MTIME_SECOND_AMBIGUOUS`
  This flag is relevant only when `HAS_MTIME` is set. When set, the
  `mtime` stored in the entry is only valid for comparison with timestamps
  that have nanosecond information. If the available timestamp does not carry
  nanosecond information, the `mtime` should be ignored and no optimization
  can be applied.
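
To summarize, here is a hedged sketch tying together the `DIRECTORY`,
`ALL_UNKNOWN_RECORDED`/`ALL_IGNORED_RECORDED` and `MTIME_SECOND_AMBIGUOUS`
rules described above, using the bit masks listed earlier (hypothetical
helpers; the real logic lives in `rust/hg-core/src/dirstate_tree/status.rs`)::

  const HAS_MTIME: u16 = 1 << 11;
  const MTIME_SECOND_AMBIGUOUS: u16 = 1 << 12;
  const DIRECTORY: u16 = 1 << 13;
  const ALL_UNKNOWN_RECORDED: u16 = 1 << 14;
  const ALL_IGNORED_RECORDED: u16 = 1 << 15;

  /// Compare an `mtime` stored in a node with one observed on disk,
  /// ignoring sub-second precision when either side lacks it, and refusing
  /// to trust an `MTIME_SECOND_AMBIGUOUS` entry when the observed timestamp
  /// carries no nanosecond information.
  fn cached_mtime_matches(
      flags: u16,
      cached: (u32, u32),  // (seconds, nanoseconds) from the node
      on_disk: (u32, u32), // (seconds, nanoseconds) from the filesystem
  ) -> bool {
      if flags & MTIME_SECOND_AMBIGUOUS != 0 && on_disk.1 == 0 {
          return false; // no optimization can be applied
      }
      let seconds_match = cached.0 == on_disk.0;
      let subseconds_match =
          cached.1 == 0 || on_disk.1 == 0 || cached.1 == on_disk.1;
      seconds_match && subseconds_match
  }

  /// Decide whether `readdir` can be skipped for a directory node: its
  /// cached mtime must still match the directory on disk, and the children
  /// categories that this run lists must have been fully recorded.
  fn can_skip_readdir(
      flags: u16,
      cached: (u32, u32),
      on_disk: (u32, u32),
      listing_unknown: bool,
      listing_ignored: bool,
  ) -> bool {
      flags & DIRECTORY != 0
          && flags & HAS_MTIME != 0
          && cached_mtime_matches(flags, cached, on_disk)
          && (!listing_unknown || flags & ALL_UNKNOWN_RECORDED != 0)
          && (!listing_ignored || flags & ALL_IGNORED_RECORDED != 0)
  }
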
@@ -1,913 +1,931
use crate::dirstate::entry::TruncatedTimestamp;
use crate::dirstate::status::IgnoreFnType;
use crate::dirstate::status::StatusPath;
use crate::dirstate_tree::dirstate_map::BorrowedPath;
use crate::dirstate_tree::dirstate_map::ChildNodesRef;
use crate::dirstate_tree::dirstate_map::DirstateMap;
use crate::dirstate_tree::dirstate_map::DirstateVersion;
use crate::dirstate_tree::dirstate_map::NodeRef;
use crate::dirstate_tree::on_disk::DirstateV2ParseError;
use crate::matchers::get_ignore_function;
use crate::matchers::Matcher;
use crate::utils::files::get_bytes_from_os_string;
use crate::utils::files::get_bytes_from_path;
use crate::utils::files::get_path_from_bytes;
use crate::utils::hg_path::HgPath;
use crate::BadMatch;
use crate::DirstateStatus;
use crate::HgPathBuf;
use crate::HgPathCow;
use crate::PatternFileWarning;
use crate::StatusError;
use crate::StatusOptions;
use micro_timer::timed;
use once_cell::sync::OnceCell;
use rayon::prelude::*;
use sha1::{Digest, Sha1};
use std::borrow::Cow;
use std::io;
use std::path::Path;
use std::path::PathBuf;
use std::sync::Mutex;
use std::time::SystemTime;

33 /// Returns the status of the working directory compared to its parent
34 /// Returns the status of the working directory compared to its parent
34 /// changeset.
35 /// changeset.
35 ///
36 ///
36 /// This algorithm is based on traversing the filesystem tree (`fs` in function
37 /// This algorithm is based on traversing the filesystem tree (`fs` in function
37 /// and variable names) and dirstate tree at the same time. The core of this
38 /// and variable names) and dirstate tree at the same time. The core of this
38 /// traversal is the recursive `traverse_fs_directory_and_dirstate` function
39 /// traversal is the recursive `traverse_fs_directory_and_dirstate` function
39 /// and its use of `itertools::merge_join_by`. When reaching a path that only
40 /// and its use of `itertools::merge_join_by`. When reaching a path that only
40 /// exists in one of the two trees, depending on information requested by
41 /// exists in one of the two trees, depending on information requested by
41 /// `options` we may need to traverse the remaining subtree.
42 /// `options` we may need to traverse the remaining subtree.
42 #[timed]
43 #[timed]
43 pub fn status<'dirstate>(
44 pub fn status<'dirstate>(
44 dmap: &'dirstate mut DirstateMap,
45 dmap: &'dirstate mut DirstateMap,
45 matcher: &(dyn Matcher + Sync),
46 matcher: &(dyn Matcher + Sync),
46 root_dir: PathBuf,
47 root_dir: PathBuf,
47 ignore_files: Vec<PathBuf>,
48 ignore_files: Vec<PathBuf>,
48 options: StatusOptions,
49 options: StatusOptions,
49 ) -> Result<(DirstateStatus<'dirstate>, Vec<PatternFileWarning>), StatusError>
50 ) -> Result<(DirstateStatus<'dirstate>, Vec<PatternFileWarning>), StatusError>
50 {
51 {
51 // Force the global rayon threadpool to not exceed 16 concurrent threads.
52 // Force the global rayon threadpool to not exceed 16 concurrent threads.
52 // This is a stop-gap measure until we figure out why using more than 16
53 // This is a stop-gap measure until we figure out why using more than 16
53 // threads makes `status` slower for each additional thread.
54 // threads makes `status` slower for each additional thread.
54 // We use `ok()` in case the global threadpool has already been
55 // We use `ok()` in case the global threadpool has already been
55 // instantiated in `rhg` or some other caller.
56 // instantiated in `rhg` or some other caller.
56 // TODO find the underlying cause and fix it, then remove this.
57 // TODO find the underlying cause and fix it, then remove this.
57 rayon::ThreadPoolBuilder::new()
58 rayon::ThreadPoolBuilder::new()
58 .num_threads(16)
59 .num_threads(16)
59 .build_global()
60 .build_global()
60 .ok();
61 .ok();
61
62
62 let (ignore_fn, warnings, patterns_changed): (IgnoreFnType, _, _) =
63 let (ignore_fn, warnings, patterns_changed): (IgnoreFnType, _, _) =
63 if options.list_ignored || options.list_unknown {
64 if options.list_ignored || options.list_unknown {
64 let (ignore_fn, warnings, changed) = match dmap.dirstate_version {
65 let (ignore_fn, warnings, changed) = match dmap.dirstate_version {
65 DirstateVersion::V1 => {
66 DirstateVersion::V1 => {
66 let (ignore_fn, warnings) = get_ignore_function(
67 let (ignore_fn, warnings) = get_ignore_function(
67 ignore_files,
68 ignore_files,
68 &root_dir,
69 &root_dir,
69 &mut |_pattern_bytes| {},
70 &mut |_source, _pattern_bytes| {},
70 )?;
71 )?;
71 (ignore_fn, warnings, None)
72 (ignore_fn, warnings, None)
72 }
73 }
73 DirstateVersion::V2 => {
74 DirstateVersion::V2 => {
74 let mut hasher = Sha1::new();
75 let mut hasher = Sha1::new();
75 let (ignore_fn, warnings) = get_ignore_function(
76 let (ignore_fn, warnings) = get_ignore_function(
76 ignore_files,
77 ignore_files,
77 &root_dir,
78 &root_dir,
78 &mut |pattern_bytes| hasher.update(pattern_bytes),
79 &mut |source, pattern_bytes| {
80 // If inside the repo, use the relative version to
81 // make it deterministic inside tests.
82 // The performance hit should be negligible.
83 let source = source
84 .strip_prefix(&root_dir)
85 .unwrap_or(source);
86 let source = get_bytes_from_path(source);
87
88 let mut subhasher = Sha1::new();
89 subhasher.update(pattern_bytes);
90 let patterns_hash = subhasher.finalize();
91
92 hasher.update(source);
93 hasher.update(b" ");
94 hasher.update(patterns_hash);
95 hasher.update(b"\n");
96 },
79 )?;
97 )?;
80 let new_hash = *hasher.finalize().as_ref();
98 let new_hash = *hasher.finalize().as_ref();
81 let changed = new_hash != dmap.ignore_patterns_hash;
99 let changed = new_hash != dmap.ignore_patterns_hash;
82 dmap.ignore_patterns_hash = new_hash;
100 dmap.ignore_patterns_hash = new_hash;
83 (ignore_fn, warnings, Some(changed))
101 (ignore_fn, warnings, Some(changed))
84 }
102 }
85 };
103 };
86 (ignore_fn, warnings, changed)
104 (ignore_fn, warnings, changed)
87 } else {
105 } else {
88 (Box::new(|&_| true), vec![], None)
106 (Box::new(|&_| true), vec![], None)
89 };
107 };
90
108
91 let filesystem_time_at_status_start =
109 let filesystem_time_at_status_start =
92 filesystem_now(&root_dir).ok().map(TruncatedTimestamp::from);
110 filesystem_now(&root_dir).ok().map(TruncatedTimestamp::from);
93
111
94 // If the repository is under the current directory, prefer using a
112 // If the repository is under the current directory, prefer using a
95 // relative path, so the kernel needs to traverse fewer directories in every
113 // relative path, so the kernel needs to traverse fewer directories in every
96 // call to `read_dir` or `symlink_metadata`.
114 // call to `read_dir` or `symlink_metadata`.
97 // This is effective in the common case where the current directory is the
115 // This is effective in the common case where the current directory is the
98 // repository root.
116 // repository root.
99
117
100 // TODO: Better yet would be to use libc functions like `openat` and
118 // TODO: Better yet would be to use libc functions like `openat` and
101 // `fstatat` to remove such repeated traversals entirely, but the standard
119 // `fstatat` to remove such repeated traversals entirely, but the standard
102 // library does not provide APIs based on those.
120 // library does not provide APIs based on those.
103 // Maybe with a crate like https://crates.io/crates/openat instead?
121 // Maybe with a crate like https://crates.io/crates/openat instead?
104 let root_dir = if let Some(relative) = std::env::current_dir()
122 let root_dir = if let Some(relative) = std::env::current_dir()
105 .ok()
123 .ok()
106 .and_then(|cwd| root_dir.strip_prefix(cwd).ok())
124 .and_then(|cwd| root_dir.strip_prefix(cwd).ok())
107 {
125 {
108 relative
126 relative
109 } else {
127 } else {
110 &root_dir
128 &root_dir
111 };
129 };
112
130
113 let outcome = DirstateStatus {
131 let outcome = DirstateStatus {
114 filesystem_time_at_status_start,
132 filesystem_time_at_status_start,
115 ..Default::default()
133 ..Default::default()
116 };
134 };
117 let common = StatusCommon {
135 let common = StatusCommon {
118 dmap,
136 dmap,
119 options,
137 options,
120 matcher,
138 matcher,
121 ignore_fn,
139 ignore_fn,
122 outcome: Mutex::new(outcome),
140 outcome: Mutex::new(outcome),
123 ignore_patterns_have_changed: patterns_changed,
141 ignore_patterns_have_changed: patterns_changed,
124 new_cacheable_directories: Default::default(),
142 new_cacheable_directories: Default::default(),
125 outdated_cached_directories: Default::default(),
143 outdated_cached_directories: Default::default(),
126 filesystem_time_at_status_start,
144 filesystem_time_at_status_start,
127 };
145 };
128 let is_at_repo_root = true;
146 let is_at_repo_root = true;
129 let hg_path = &BorrowedPath::OnDisk(HgPath::new(""));
147 let hg_path = &BorrowedPath::OnDisk(HgPath::new(""));
130 let has_ignored_ancestor = HasIgnoredAncestor::create(None, hg_path);
148 let has_ignored_ancestor = HasIgnoredAncestor::create(None, hg_path);
131 let root_cached_mtime = None;
149 let root_cached_mtime = None;
132 let root_dir_metadata = None;
150 let root_dir_metadata = None;
133 // If the path we have for the repository root is a symlink, do follow it.
151 // If the path we have for the repository root is a symlink, do follow it.
134 // (As opposed to symlinks within the working directory which are not
152 // (As opposed to symlinks within the working directory which are not
135 // followed, using `std::fs::symlink_metadata`.)
153 // followed, using `std::fs::symlink_metadata`.)
136 common.traverse_fs_directory_and_dirstate(
154 common.traverse_fs_directory_and_dirstate(
137 &has_ignored_ancestor,
155 &has_ignored_ancestor,
138 dmap.root.as_ref(),
156 dmap.root.as_ref(),
139 hg_path,
157 hg_path,
140 &root_dir,
158 &root_dir,
141 root_dir_metadata,
159 root_dir_metadata,
142 root_cached_mtime,
160 root_cached_mtime,
143 is_at_repo_root,
161 is_at_repo_root,
144 )?;
162 )?;
145 let mut outcome = common.outcome.into_inner().unwrap();
163 let mut outcome = common.outcome.into_inner().unwrap();
146 let new_cacheable = common.new_cacheable_directories.into_inner().unwrap();
164 let new_cacheable = common.new_cacheable_directories.into_inner().unwrap();
147 let outdated = common.outdated_cached_directories.into_inner().unwrap();
165 let outdated = common.outdated_cached_directories.into_inner().unwrap();
148
166
149 outcome.dirty = common.ignore_patterns_have_changed == Some(true)
167 outcome.dirty = common.ignore_patterns_have_changed == Some(true)
150 || !outdated.is_empty()
168 || !outdated.is_empty()
151 || (!new_cacheable.is_empty()
169 || (!new_cacheable.is_empty()
152 && dmap.dirstate_version == DirstateVersion::V2);
170 && dmap.dirstate_version == DirstateVersion::V2);
153
171
154 // Remove outdated mtimes before adding new mtimes, in case a given
172 // Remove outdated mtimes before adding new mtimes, in case a given
155 // directory is in both lists
173 // directory is in both lists
156 for path in &outdated {
174 for path in &outdated {
157 dmap.clear_cached_mtime(path)?;
175 dmap.clear_cached_mtime(path)?;
158 }
176 }
159 for (path, mtime) in &new_cacheable {
177 for (path, mtime) in &new_cacheable {
160 dmap.set_cached_mtime(path, *mtime)?;
178 dmap.set_cached_mtime(path, *mtime)?;
161 }
179 }
162
180
163 Ok((outcome, warnings))
181 Ok((outcome, warnings))
164 }
182 }
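// Illustrative sketch (not part of the original file): the shape of the
// dirstate-v2 ignore hash computed in `status()` above. For each ignore file
// the callback feeds `<source> <sha1(patterns)>\n` into an outer SHA-1, so
// that editing a pattern file *or* switching to a different pattern file both
// change `ignore_patterns_hash` and invalidate cached directory mtimes. The
// helper below only mirrors that scheme; its name and test data are made up
// for illustration.
#[cfg(test)]
mod ignore_hash_sketch {
    use sha1::{Digest, Sha1};

    /// Combine `(source, patterns)` pairs the same way the V2 branch above
    /// does: hash each file's patterns separately, then feed
    /// `source + b" " + patterns_hash + b"\n"` into one outer digest.
    fn combined_ignore_hash(files: &[(&[u8], &[u8])]) -> Vec<u8> {
        let mut hasher = Sha1::new();
        for (source, patterns) in files {
            let mut subhasher = Sha1::new();
            subhasher.update(*patterns);
            hasher.update(*source);
            hasher.update(b" ");
            hasher.update(subhasher.finalize());
            hasher.update(b"\n");
        }
        hasher.finalize().to_vec()
    }

    #[test]
    fn moving_patterns_to_another_file_changes_the_hash() {
        let patterns: &[u8] = b"*.orig\n*.rej\n";
        let root: &[u8] = b".hgignore";
        let nested: &[u8] = b"subdir/.hgignore";
        assert_ne!(
            combined_ignore_hash(&[(root, patterns)]),
            combined_ignore_hash(&[(nested, patterns)])
        );
    }
}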
165
183
166 /// Bag of random things needed by various parts of the algorithm. Reduces the
184 /// Bag of random things needed by various parts of the algorithm. Reduces the
167 /// number of parameters passed to functions.
185 /// number of parameters passed to functions.
168 struct StatusCommon<'a, 'tree, 'on_disk: 'tree> {
186 struct StatusCommon<'a, 'tree, 'on_disk: 'tree> {
169 dmap: &'tree DirstateMap<'on_disk>,
187 dmap: &'tree DirstateMap<'on_disk>,
170 options: StatusOptions,
188 options: StatusOptions,
171 matcher: &'a (dyn Matcher + Sync),
189 matcher: &'a (dyn Matcher + Sync),
172 ignore_fn: IgnoreFnType<'a>,
190 ignore_fn: IgnoreFnType<'a>,
173 outcome: Mutex<DirstateStatus<'on_disk>>,
191 outcome: Mutex<DirstateStatus<'on_disk>>,
174 /// New timestamps of directories to be used for caching their readdirs
192 /// New timestamps of directories to be used for caching their readdirs
175 new_cacheable_directories:
193 new_cacheable_directories:
176 Mutex<Vec<(Cow<'on_disk, HgPath>, TruncatedTimestamp)>>,
194 Mutex<Vec<(Cow<'on_disk, HgPath>, TruncatedTimestamp)>>,
177 /// Used to invalidate the readdir cache of directories
195 /// Used to invalidate the readdir cache of directories
178 outdated_cached_directories: Mutex<Vec<Cow<'on_disk, HgPath>>>,
196 outdated_cached_directories: Mutex<Vec<Cow<'on_disk, HgPath>>>,
179
197
180 /// Whether ignore files like `.hgignore` have changed since the previous
198 /// Whether ignore files like `.hgignore` have changed since the previous
181 /// time a `status()` call wrote their hash to the dirstate. `None` means
199 /// time a `status()` call wrote their hash to the dirstate. `None` means
182 /// we don’t know as this run doesn’t list either ignored or unknown files
200 /// we don’t know as this run doesn’t list either ignored or unknown files
183 /// and therefore isn’t reading `.hgignore`.
201 /// and therefore isn’t reading `.hgignore`.
184 ignore_patterns_have_changed: Option<bool>,
202 ignore_patterns_have_changed: Option<bool>,
185
203
186 /// The current time at the start of the `status()` algorithm, as measured
204 /// The current time at the start of the `status()` algorithm, as measured
187 /// and possibly truncated by the filesystem.
205 /// and possibly truncated by the filesystem.
188 filesystem_time_at_status_start: Option<TruncatedTimestamp>,
206 filesystem_time_at_status_start: Option<TruncatedTimestamp>,
189 }
207 }
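// Illustrative sketch (not part of the original file): the collect-then-apply
// pattern used with the `Mutex<Vec<_>>` fields above. Rayon worker threads
// push into the shared vectors during traversal; once the parallel phase is
// over, `status()` takes them back with `into_inner()` and applies them to
// the dirstate single-threaded. Assumes only the `rayon` crate already
// imported at the top of this file.
#[cfg(test)]
mod collect_then_apply_sketch {
    use rayon::prelude::*;
    use std::sync::Mutex;

    #[test]
    fn parallel_collect_then_drain() {
        let results: Mutex<Vec<u32>> = Mutex::new(Vec::new());
        (0u32..100).into_par_iter().for_each(|n| {
            if n % 2 == 0 {
                results.lock().unwrap().push(n);
            }
        });
        // The parallel phase is done: take ownership back and use the data
        // without further locking, as `status()` does with
        // `new_cacheable_directories` and `outdated_cached_directories`.
        let mut collected = results.into_inner().unwrap();
        collected.sort_unstable();
        assert_eq!(collected.len(), 50);
        assert_eq!(collected[0], 0);
    }
}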
190
208
191 enum Outcome {
209 enum Outcome {
192 Modified,
210 Modified,
193 Added,
211 Added,
194 Removed,
212 Removed,
195 Deleted,
213 Deleted,
196 Clean,
214 Clean,
197 Ignored,
215 Ignored,
198 Unknown,
216 Unknown,
199 Unsure,
217 Unsure,
200 }
218 }
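// Illustrative sketch (not part of the original file): how these outcomes
// line up with the one-letter codes printed by `hg status`. `Unsure` has no
// letter of its own: it marks files whose size/mtime checks were
// inconclusive and whose contents still need to be compared against the
// parent changeset before they are reported as modified or clean.
#[allow(dead_code)]
fn status_letter(outcome: &Outcome) -> Option<char> {
    match outcome {
        Outcome::Modified => Some('M'),
        Outcome::Added => Some('A'),
        Outcome::Removed => Some('R'),
        Outcome::Deleted => Some('!'),
        Outcome::Clean => Some('C'),
        Outcome::Ignored => Some('I'),
        Outcome::Unknown => Some('?'),
        Outcome::Unsure => None,
    }
}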
201
219
202 /// Lazy computation of whether a given path has an hg-ignored
220 /// Lazy computation of whether a given path has an hg-ignored
203 /// ancestor.
221 /// ancestor.
204 struct HasIgnoredAncestor<'a> {
222 struct HasIgnoredAncestor<'a> {
205 /// `path` and `parent` constitute the inputs to the computation,
223 /// `path` and `parent` constitute the inputs to the computation,
206 /// `cache` stores the outcome.
224 /// `cache` stores the outcome.
207 path: &'a HgPath,
225 path: &'a HgPath,
208 parent: Option<&'a HasIgnoredAncestor<'a>>,
226 parent: Option<&'a HasIgnoredAncestor<'a>>,
209 cache: OnceCell<bool>,
227 cache: OnceCell<bool>,
210 }
228 }
211
229
212 impl<'a> HasIgnoredAncestor<'a> {
230 impl<'a> HasIgnoredAncestor<'a> {
213 fn create(
231 fn create(
214 parent: Option<&'a HasIgnoredAncestor<'a>>,
232 parent: Option<&'a HasIgnoredAncestor<'a>>,
215 path: &'a HgPath,
233 path: &'a HgPath,
216 ) -> HasIgnoredAncestor<'a> {
234 ) -> HasIgnoredAncestor<'a> {
217 Self {
235 Self {
218 path,
236 path,
219 parent,
237 parent,
220 cache: OnceCell::new(),
238 cache: OnceCell::new(),
221 }
239 }
222 }
240 }
223
241
224 fn force<'b>(&self, ignore_fn: &IgnoreFnType<'b>) -> bool {
242 fn force<'b>(&self, ignore_fn: &IgnoreFnType<'b>) -> bool {
225 match self.parent {
243 match self.parent {
226 None => false,
244 None => false,
227 Some(parent) => {
245 Some(parent) => {
228 *(parent.cache.get_or_init(|| {
246 *(parent.cache.get_or_init(|| {
229 parent.force(ignore_fn) || ignore_fn(&self.path)
247 parent.force(ignore_fn) || ignore_fn(&self.path)
230 }))
248 }))
231 }
249 }
232 }
250 }
233 }
251 }
234 }
252 }
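// Illustrative sketch (not part of the original file): the memoization
// pattern behind `HasIgnoredAncestor::force` above. `OnceCell::get_or_init`
// runs its closure at most once, so each directory's "is an ancestor
// ignored?" answer is computed lazily and then shared by every descendant
// that asks again.
#[cfg(test)]
mod once_cell_memoization_sketch {
    use once_cell::sync::OnceCell;
    use std::sync::atomic::{AtomicUsize, Ordering};

    #[test]
    fn closure_runs_at_most_once() {
        let calls = AtomicUsize::new(0);
        let cache: OnceCell<bool> = OnceCell::new();
        let expensive = || {
            // Stand-in for calling `ignore_fn(path)` on an ancestor.
            calls.fetch_add(1, Ordering::SeqCst);
            true
        };
        assert!(*cache.get_or_init(expensive));
        assert!(*cache.get_or_init(expensive));
        assert_eq!(calls.load(Ordering::SeqCst), 1);
    }
}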
235
253
236 impl<'a, 'tree, 'on_disk> StatusCommon<'a, 'tree, 'on_disk> {
254 impl<'a, 'tree, 'on_disk> StatusCommon<'a, 'tree, 'on_disk> {
237 fn push_outcome(
255 fn push_outcome(
238 &self,
256 &self,
239 which: Outcome,
257 which: Outcome,
240 dirstate_node: &NodeRef<'tree, 'on_disk>,
258 dirstate_node: &NodeRef<'tree, 'on_disk>,
241 ) -> Result<(), DirstateV2ParseError> {
259 ) -> Result<(), DirstateV2ParseError> {
242 let path = dirstate_node
260 let path = dirstate_node
243 .full_path_borrowed(self.dmap.on_disk)?
261 .full_path_borrowed(self.dmap.on_disk)?
244 .detach_from_tree();
262 .detach_from_tree();
245 let copy_source = if self.options.list_copies {
263 let copy_source = if self.options.list_copies {
246 dirstate_node
264 dirstate_node
247 .copy_source_borrowed(self.dmap.on_disk)?
265 .copy_source_borrowed(self.dmap.on_disk)?
248 .map(|source| source.detach_from_tree())
266 .map(|source| source.detach_from_tree())
249 } else {
267 } else {
250 None
268 None
251 };
269 };
252 self.push_outcome_common(which, path, copy_source);
270 self.push_outcome_common(which, path, copy_source);
253 Ok(())
271 Ok(())
254 }
272 }
255
273
256 fn push_outcome_without_copy_source(
274 fn push_outcome_without_copy_source(
257 &self,
275 &self,
258 which: Outcome,
276 which: Outcome,
259 path: &BorrowedPath<'_, 'on_disk>,
277 path: &BorrowedPath<'_, 'on_disk>,
260 ) {
278 ) {
261 self.push_outcome_common(which, path.detach_from_tree(), None)
279 self.push_outcome_common(which, path.detach_from_tree(), None)
262 }
280 }
263
281
264 fn push_outcome_common(
282 fn push_outcome_common(
265 &self,
283 &self,
266 which: Outcome,
284 which: Outcome,
267 path: HgPathCow<'on_disk>,
285 path: HgPathCow<'on_disk>,
268 copy_source: Option<HgPathCow<'on_disk>>,
286 copy_source: Option<HgPathCow<'on_disk>>,
269 ) {
287 ) {
270 let mut outcome = self.outcome.lock().unwrap();
288 let mut outcome = self.outcome.lock().unwrap();
271 let vec = match which {
289 let vec = match which {
272 Outcome::Modified => &mut outcome.modified,
290 Outcome::Modified => &mut outcome.modified,
273 Outcome::Added => &mut outcome.added,
291 Outcome::Added => &mut outcome.added,
274 Outcome::Removed => &mut outcome.removed,
292 Outcome::Removed => &mut outcome.removed,
275 Outcome::Deleted => &mut outcome.deleted,
293 Outcome::Deleted => &mut outcome.deleted,
276 Outcome::Clean => &mut outcome.clean,
294 Outcome::Clean => &mut outcome.clean,
277 Outcome::Ignored => &mut outcome.ignored,
295 Outcome::Ignored => &mut outcome.ignored,
278 Outcome::Unknown => &mut outcome.unknown,
296 Outcome::Unknown => &mut outcome.unknown,
279 Outcome::Unsure => &mut outcome.unsure,
297 Outcome::Unsure => &mut outcome.unsure,
280 };
298 };
281 vec.push(StatusPath { path, copy_source });
299 vec.push(StatusPath { path, copy_source });
282 }
300 }
283
301
284 fn read_dir(
302 fn read_dir(
285 &self,
303 &self,
286 hg_path: &HgPath,
304 hg_path: &HgPath,
287 fs_path: &Path,
305 fs_path: &Path,
288 is_at_repo_root: bool,
306 is_at_repo_root: bool,
289 ) -> Result<Vec<DirEntry>, ()> {
307 ) -> Result<Vec<DirEntry>, ()> {
290 DirEntry::read_dir(fs_path, is_at_repo_root)
308 DirEntry::read_dir(fs_path, is_at_repo_root)
291 .map_err(|error| self.io_error(error, hg_path))
309 .map_err(|error| self.io_error(error, hg_path))
292 }
310 }
293
311
294 fn io_error(&self, error: std::io::Error, hg_path: &HgPath) {
312 fn io_error(&self, error: std::io::Error, hg_path: &HgPath) {
295 let errno = error.raw_os_error().expect("expected real OS error");
313 let errno = error.raw_os_error().expect("expected real OS error");
296 self.outcome
314 self.outcome
297 .lock()
315 .lock()
298 .unwrap()
316 .unwrap()
299 .bad
317 .bad
300 .push((hg_path.to_owned().into(), BadMatch::OsError(errno)))
318 .push((hg_path.to_owned().into(), BadMatch::OsError(errno)))
301 }
319 }
302
320
303 fn check_for_outdated_directory_cache(
321 fn check_for_outdated_directory_cache(
304 &self,
322 &self,
305 dirstate_node: &NodeRef<'tree, 'on_disk>,
323 dirstate_node: &NodeRef<'tree, 'on_disk>,
306 ) -> Result<bool, DirstateV2ParseError> {
324 ) -> Result<bool, DirstateV2ParseError> {
307 if self.ignore_patterns_have_changed == Some(true)
325 if self.ignore_patterns_have_changed == Some(true)
308 && dirstate_node.cached_directory_mtime()?.is_some()
326 && dirstate_node.cached_directory_mtime()?.is_some()
309 {
327 {
310 self.outdated_cached_directories.lock().unwrap().push(
328 self.outdated_cached_directories.lock().unwrap().push(
311 dirstate_node
329 dirstate_node
312 .full_path_borrowed(self.dmap.on_disk)?
330 .full_path_borrowed(self.dmap.on_disk)?
313 .detach_from_tree(),
331 .detach_from_tree(),
314 );
332 );
315 return Ok(true);
333 return Ok(true);
316 }
334 }
317 Ok(false)
335 Ok(false)
318 }
336 }
319
337
320 /// If this returns true, we can get accurate results by only using
338 /// If this returns true, we can get accurate results by only using
321 /// `symlink_metadata` for child nodes that exist in the dirstate, without
339 /// `symlink_metadata` for child nodes that exist in the dirstate, without
322 /// needing to call `read_dir`.
340 /// needing to call `read_dir`.
323 fn can_skip_fs_readdir(
341 fn can_skip_fs_readdir(
324 &self,
342 &self,
325 directory_metadata: Option<&std::fs::Metadata>,
343 directory_metadata: Option<&std::fs::Metadata>,
326 cached_directory_mtime: Option<TruncatedTimestamp>,
344 cached_directory_mtime: Option<TruncatedTimestamp>,
327 ) -> bool {
345 ) -> bool {
328 if !self.options.list_unknown && !self.options.list_ignored {
346 if !self.options.list_unknown && !self.options.list_ignored {
329 // All states that we care about listing have corresponding
347 // All states that we care about listing have corresponding
330 // dirstate entries.
348 // dirstate entries.
331 // This happens for example with `hg status -mard`.
349 // This happens for example with `hg status -mard`.
332 return true;
350 return true;
333 }
351 }
334 if !self.options.list_ignored
352 if !self.options.list_ignored
335 && self.ignore_patterns_have_changed == Some(false)
353 && self.ignore_patterns_have_changed == Some(false)
336 {
354 {
337 if let Some(cached_mtime) = cached_directory_mtime {
355 if let Some(cached_mtime) = cached_directory_mtime {
338 // The dirstate contains a cached mtime for this directory, set
356 // The dirstate contains a cached mtime for this directory, set
339 // by a previous run of the `status` algorithm which found this
357 // by a previous run of the `status` algorithm which found this
340 // directory eligible for `read_dir` caching.
358 // directory eligible for `read_dir` caching.
341 if let Some(meta) = directory_metadata {
359 if let Some(meta) = directory_metadata {
342 if cached_mtime
360 if cached_mtime
343 .likely_equal_to_mtime_of(meta)
361 .likely_equal_to_mtime_of(meta)
344 .unwrap_or(false)
362 .unwrap_or(false)
345 {
363 {
346 // The mtime of that directory has not changed
364 // The mtime of that directory has not changed
347 // since then, which means that the results of
365 // since then, which means that the results of
348 // `read_dir` should also be unchanged.
366 // `read_dir` should also be unchanged.
349 return true;
367 return true;
350 }
368 }
351 }
369 }
352 }
370 }
353 }
371 }
354 false
372 false
355 }
373 }
356
374
357 /// Returns whether all child entries of the filesystem directory have a
375 /// Returns whether all child entries of the filesystem directory have a
358 /// corresponding dirstate node or are ignored.
376 /// corresponding dirstate node or are ignored.
359 fn traverse_fs_directory_and_dirstate<'ancestor>(
377 fn traverse_fs_directory_and_dirstate<'ancestor>(
360 &self,
378 &self,
361 has_ignored_ancestor: &'ancestor HasIgnoredAncestor<'ancestor>,
379 has_ignored_ancestor: &'ancestor HasIgnoredAncestor<'ancestor>,
362 dirstate_nodes: ChildNodesRef<'tree, 'on_disk>,
380 dirstate_nodes: ChildNodesRef<'tree, 'on_disk>,
363 directory_hg_path: &BorrowedPath<'tree, 'on_disk>,
381 directory_hg_path: &BorrowedPath<'tree, 'on_disk>,
364 directory_fs_path: &Path,
382 directory_fs_path: &Path,
365 directory_metadata: Option<&std::fs::Metadata>,
383 directory_metadata: Option<&std::fs::Metadata>,
366 cached_directory_mtime: Option<TruncatedTimestamp>,
384 cached_directory_mtime: Option<TruncatedTimestamp>,
367 is_at_repo_root: bool,
385 is_at_repo_root: bool,
368 ) -> Result<bool, DirstateV2ParseError> {
386 ) -> Result<bool, DirstateV2ParseError> {
369 if self.can_skip_fs_readdir(directory_metadata, cached_directory_mtime)
387 if self.can_skip_fs_readdir(directory_metadata, cached_directory_mtime)
370 {
388 {
371 dirstate_nodes
389 dirstate_nodes
372 .par_iter()
390 .par_iter()
373 .map(|dirstate_node| {
391 .map(|dirstate_node| {
374 let fs_path = directory_fs_path.join(get_path_from_bytes(
392 let fs_path = directory_fs_path.join(get_path_from_bytes(
375 dirstate_node.base_name(self.dmap.on_disk)?.as_bytes(),
393 dirstate_node.base_name(self.dmap.on_disk)?.as_bytes(),
376 ));
394 ));
377 match std::fs::symlink_metadata(&fs_path) {
395 match std::fs::symlink_metadata(&fs_path) {
378 Ok(fs_metadata) => self.traverse_fs_and_dirstate(
396 Ok(fs_metadata) => self.traverse_fs_and_dirstate(
379 &fs_path,
397 &fs_path,
380 &fs_metadata,
398 &fs_metadata,
381 dirstate_node,
399 dirstate_node,
382 has_ignored_ancestor,
400 has_ignored_ancestor,
383 ),
401 ),
384 Err(e) if e.kind() == std::io::ErrorKind::NotFound => {
402 Err(e) if e.kind() == std::io::ErrorKind::NotFound => {
385 self.traverse_dirstate_only(dirstate_node)
403 self.traverse_dirstate_only(dirstate_node)
386 }
404 }
387 Err(error) => {
405 Err(error) => {
388 let hg_path =
406 let hg_path =
389 dirstate_node.full_path(self.dmap.on_disk)?;
407 dirstate_node.full_path(self.dmap.on_disk)?;
390 Ok(self.io_error(error, hg_path))
408 Ok(self.io_error(error, hg_path))
391 }
409 }
392 }
410 }
393 })
411 })
394 .collect::<Result<_, _>>()?;
412 .collect::<Result<_, _>>()?;
395
413
396 // We don’t know, so conservatively say this isn’t the case
414 // We don’t know, so conservatively say this isn’t the case
397 let children_all_have_dirstate_node_or_are_ignored = false;
415 let children_all_have_dirstate_node_or_are_ignored = false;
398
416
399 return Ok(children_all_have_dirstate_node_or_are_ignored);
417 return Ok(children_all_have_dirstate_node_or_are_ignored);
400 }
418 }
401
419
402 let mut fs_entries = if let Ok(entries) = self.read_dir(
420 let mut fs_entries = if let Ok(entries) = self.read_dir(
403 directory_hg_path,
421 directory_hg_path,
404 directory_fs_path,
422 directory_fs_path,
405 is_at_repo_root,
423 is_at_repo_root,
406 ) {
424 ) {
407 entries
425 entries
408 } else {
426 } else {
409 // Treat an unreadable directory (typically because of insufficient
427 // Treat an unreadable directory (typically because of insufficient
410 // permissions) like an empty directory. `self.read_dir` has
428 // permissions) like an empty directory. `self.read_dir` has
411 // already called `self.io_error` so a warning will be emitted.
429 // already called `self.io_error` so a warning will be emitted.
412 Vec::new()
430 Vec::new()
413 };
431 };
414
432
415 // `merge_join_by` requires both its input iterators to be sorted:
433 // `merge_join_by` requires both its input iterators to be sorted:
416
434
417 let dirstate_nodes = dirstate_nodes.sorted();
435 let dirstate_nodes = dirstate_nodes.sorted();
418 // `sort_unstable_by_key` doesn’t allow keys borrowing from the value:
436 // `sort_unstable_by_key` doesn’t allow keys borrowing from the value:
419 // https://github.com/rust-lang/rust/issues/34162
437 // https://github.com/rust-lang/rust/issues/34162
420 fs_entries.sort_unstable_by(|e1, e2| e1.base_name.cmp(&e2.base_name));
438 fs_entries.sort_unstable_by(|e1, e2| e1.base_name.cmp(&e2.base_name));
421
439
422 // Propagate here any error that would happen inside the comparison
440 // Propagate here any error that would happen inside the comparison
423 // callback below
441 // callback below
424 for dirstate_node in &dirstate_nodes {
442 for dirstate_node in &dirstate_nodes {
425 dirstate_node.base_name(self.dmap.on_disk)?;
443 dirstate_node.base_name(self.dmap.on_disk)?;
426 }
444 }
427 itertools::merge_join_by(
445 itertools::merge_join_by(
428 dirstate_nodes,
446 dirstate_nodes,
429 &fs_entries,
447 &fs_entries,
430 |dirstate_node, fs_entry| {
448 |dirstate_node, fs_entry| {
431 // This `unwrap` never panics because we already propagated
449 // This `unwrap` never panics because we already propagated
432 // those errors above
450 // those errors above
433 dirstate_node
451 dirstate_node
434 .base_name(self.dmap.on_disk)
452 .base_name(self.dmap.on_disk)
435 .unwrap()
453 .unwrap()
436 .cmp(&fs_entry.base_name)
454 .cmp(&fs_entry.base_name)
437 },
455 },
438 )
456 )
439 .par_bridge()
457 .par_bridge()
440 .map(|pair| {
458 .map(|pair| {
441 use itertools::EitherOrBoth::*;
459 use itertools::EitherOrBoth::*;
442 let has_dirstate_node_or_is_ignored;
460 let has_dirstate_node_or_is_ignored;
443 match pair {
461 match pair {
444 Both(dirstate_node, fs_entry) => {
462 Both(dirstate_node, fs_entry) => {
445 self.traverse_fs_and_dirstate(
463 self.traverse_fs_and_dirstate(
446 &fs_entry.full_path,
464 &fs_entry.full_path,
447 &fs_entry.metadata,
465 &fs_entry.metadata,
448 dirstate_node,
466 dirstate_node,
449 has_ignored_ancestor,
467 has_ignored_ancestor,
450 )?;
468 )?;
451 has_dirstate_node_or_is_ignored = true
469 has_dirstate_node_or_is_ignored = true
452 }
470 }
453 Left(dirstate_node) => {
471 Left(dirstate_node) => {
454 self.traverse_dirstate_only(dirstate_node)?;
472 self.traverse_dirstate_only(dirstate_node)?;
455 has_dirstate_node_or_is_ignored = true;
473 has_dirstate_node_or_is_ignored = true;
456 }
474 }
457 Right(fs_entry) => {
475 Right(fs_entry) => {
458 has_dirstate_node_or_is_ignored = self.traverse_fs_only(
476 has_dirstate_node_or_is_ignored = self.traverse_fs_only(
459 has_ignored_ancestor.force(&self.ignore_fn),
477 has_ignored_ancestor.force(&self.ignore_fn),
460 directory_hg_path,
478 directory_hg_path,
461 fs_entry,
479 fs_entry,
462 )
480 )
463 }
481 }
464 }
482 }
465 Ok(has_dirstate_node_or_is_ignored)
483 Ok(has_dirstate_node_or_is_ignored)
466 })
484 })
467 .try_reduce(|| true, |a, b| Ok(a && b))
485 .try_reduce(|| true, |a, b| Ok(a && b))
468 }
486 }
469
487
470 fn traverse_fs_and_dirstate<'ancestor>(
488 fn traverse_fs_and_dirstate<'ancestor>(
471 &self,
489 &self,
472 fs_path: &Path,
490 fs_path: &Path,
473 fs_metadata: &std::fs::Metadata,
491 fs_metadata: &std::fs::Metadata,
474 dirstate_node: NodeRef<'tree, 'on_disk>,
492 dirstate_node: NodeRef<'tree, 'on_disk>,
475 has_ignored_ancestor: &'ancestor HasIgnoredAncestor<'ancestor>,
493 has_ignored_ancestor: &'ancestor HasIgnoredAncestor<'ancestor>,
476 ) -> Result<(), DirstateV2ParseError> {
494 ) -> Result<(), DirstateV2ParseError> {
477 let outdated_dircache =
495 let outdated_dircache =
478 self.check_for_outdated_directory_cache(&dirstate_node)?;
496 self.check_for_outdated_directory_cache(&dirstate_node)?;
479 let hg_path = &dirstate_node.full_path_borrowed(self.dmap.on_disk)?;
497 let hg_path = &dirstate_node.full_path_borrowed(self.dmap.on_disk)?;
480 let file_type = fs_metadata.file_type();
498 let file_type = fs_metadata.file_type();
481 let file_or_symlink = file_type.is_file() || file_type.is_symlink();
499 let file_or_symlink = file_type.is_file() || file_type.is_symlink();
482 if !file_or_symlink {
500 if !file_or_symlink {
483 // If we previously had a file here, it was removed (with
501 // If we previously had a file here, it was removed (with
484 // `hg rm` or similar) or deleted before it could be
502 // `hg rm` or similar) or deleted before it could be
485 // replaced by a directory or something else.
503 // replaced by a directory or something else.
486 self.mark_removed_or_deleted_if_file(&dirstate_node)?;
504 self.mark_removed_or_deleted_if_file(&dirstate_node)?;
487 }
505 }
488 if file_type.is_dir() {
506 if file_type.is_dir() {
489 if self.options.collect_traversed_dirs {
507 if self.options.collect_traversed_dirs {
490 self.outcome
508 self.outcome
491 .lock()
509 .lock()
492 .unwrap()
510 .unwrap()
493 .traversed
511 .traversed
494 .push(hg_path.detach_from_tree())
512 .push(hg_path.detach_from_tree())
495 }
513 }
496 let is_ignored = HasIgnoredAncestor::create(
514 let is_ignored = HasIgnoredAncestor::create(
497 Some(&has_ignored_ancestor),
515 Some(&has_ignored_ancestor),
498 hg_path,
516 hg_path,
499 );
517 );
500 let is_at_repo_root = false;
518 let is_at_repo_root = false;
501 let children_all_have_dirstate_node_or_are_ignored = self
519 let children_all_have_dirstate_node_or_are_ignored = self
502 .traverse_fs_directory_and_dirstate(
520 .traverse_fs_directory_and_dirstate(
503 &is_ignored,
521 &is_ignored,
504 dirstate_node.children(self.dmap.on_disk)?,
522 dirstate_node.children(self.dmap.on_disk)?,
505 hg_path,
523 hg_path,
506 fs_path,
524 fs_path,
507 Some(fs_metadata),
525 Some(fs_metadata),
508 dirstate_node.cached_directory_mtime()?,
526 dirstate_node.cached_directory_mtime()?,
509 is_at_repo_root,
527 is_at_repo_root,
510 )?;
528 )?;
511 self.maybe_save_directory_mtime(
529 self.maybe_save_directory_mtime(
512 children_all_have_dirstate_node_or_are_ignored,
530 children_all_have_dirstate_node_or_are_ignored,
513 fs_metadata,
531 fs_metadata,
514 dirstate_node,
532 dirstate_node,
515 outdated_dircache,
533 outdated_dircache,
516 )?
534 )?
517 } else {
535 } else {
518 if file_or_symlink && self.matcher.matches(&hg_path) {
536 if file_or_symlink && self.matcher.matches(&hg_path) {
519 if let Some(entry) = dirstate_node.entry()? {
537 if let Some(entry) = dirstate_node.entry()? {
520 if !entry.any_tracked() {
538 if !entry.any_tracked() {
521 // Forward-compat if we start tracking unknown/ignored
539 // Forward-compat if we start tracking unknown/ignored
522 // files for caching reasons
540 // files for caching reasons
523 self.mark_unknown_or_ignored(
541 self.mark_unknown_or_ignored(
524 has_ignored_ancestor.force(&self.ignore_fn),
542 has_ignored_ancestor.force(&self.ignore_fn),
525 &hg_path,
543 &hg_path,
526 );
544 );
527 }
545 }
528 if entry.added() {
546 if entry.added() {
529 self.push_outcome(Outcome::Added, &dirstate_node)?;
547 self.push_outcome(Outcome::Added, &dirstate_node)?;
530 } else if entry.removed() {
548 } else if entry.removed() {
531 self.push_outcome(Outcome::Removed, &dirstate_node)?;
549 self.push_outcome(Outcome::Removed, &dirstate_node)?;
532 } else if entry.modified() {
550 } else if entry.modified() {
533 self.push_outcome(Outcome::Modified, &dirstate_node)?;
551 self.push_outcome(Outcome::Modified, &dirstate_node)?;
534 } else {
552 } else {
535 self.handle_normal_file(&dirstate_node, fs_metadata)?;
553 self.handle_normal_file(&dirstate_node, fs_metadata)?;
536 }
554 }
537 } else {
555 } else {
538 // `node.entry.is_none()` indicates a "directory"
556 // `node.entry.is_none()` indicates a "directory"
539 // node, but the filesystem has a file
557 // node, but the filesystem has a file
540 self.mark_unknown_or_ignored(
558 self.mark_unknown_or_ignored(
541 has_ignored_ancestor.force(&self.ignore_fn),
559 has_ignored_ancestor.force(&self.ignore_fn),
542 hg_path,
560 hg_path,
543 );
561 );
544 }
562 }
545 }
563 }
546
564
547 for child_node in dirstate_node.children(self.dmap.on_disk)?.iter()
565 for child_node in dirstate_node.children(self.dmap.on_disk)?.iter()
548 {
566 {
549 self.traverse_dirstate_only(child_node)?
567 self.traverse_dirstate_only(child_node)?
550 }
568 }
551 }
569 }
552 Ok(())
570 Ok(())
553 }
571 }
554
572
555 /// Save directory mtime if applicable.
573 /// Save directory mtime if applicable.
556 ///
574 ///
557 /// `outdated_directory_cache` is `true` if we've just invalidated the
575 /// `outdated_directory_cache` is `true` if we've just invalidated the
558 /// cache for this directory in `check_for_outdated_directory_cache`,
576 /// cache for this directory in `check_for_outdated_directory_cache`,
559 /// which forces the update.
577 /// which forces the update.
560 fn maybe_save_directory_mtime(
578 fn maybe_save_directory_mtime(
561 &self,
579 &self,
562 children_all_have_dirstate_node_or_are_ignored: bool,
580 children_all_have_dirstate_node_or_are_ignored: bool,
563 directory_metadata: &std::fs::Metadata,
581 directory_metadata: &std::fs::Metadata,
564 dirstate_node: NodeRef<'tree, 'on_disk>,
582 dirstate_node: NodeRef<'tree, 'on_disk>,
565 outdated_directory_cache: bool,
583 outdated_directory_cache: bool,
566 ) -> Result<(), DirstateV2ParseError> {
584 ) -> Result<(), DirstateV2ParseError> {
567 if !children_all_have_dirstate_node_or_are_ignored {
585 if !children_all_have_dirstate_node_or_are_ignored {
568 return Ok(());
586 return Ok(());
569 }
587 }
570 // All filesystem directory entries from `read_dir` have a
588 // All filesystem directory entries from `read_dir` have a
571 // corresponding node in the dirstate, so we can reconstitute the
589 // corresponding node in the dirstate, so we can reconstitute the
572 // names of those entries without calling `read_dir` again.
590 // names of those entries without calling `read_dir` again.
573
591
574 // TODO: use let-else here and below when available:
592 // TODO: use let-else here and below when available:
575 // https://github.com/rust-lang/rust/issues/87335
593 // https://github.com/rust-lang/rust/issues/87335
576 let status_start = if let Some(status_start) =
594 let status_start = if let Some(status_start) =
577 &self.filesystem_time_at_status_start
595 &self.filesystem_time_at_status_start
578 {
596 {
579 status_start
597 status_start
580 } else {
598 } else {
581 return Ok(());
599 return Ok(());
582 };
600 };
583
601
584 // Although the Rust standard library’s `SystemTime` type
602 // Although the Rust standard library’s `SystemTime` type
585 // has nanosecond precision, the times reported for a
603 // has nanosecond precision, the times reported for a
586 // directory’s (or file’s) modified time may have lower
604 // directory’s (or file’s) modified time may have lower
587 // resolution based on the filesystem (for example ext3
605 // resolution based on the filesystem (for example ext3
588 // only stores integer seconds), kernel (see
606 // only stores integer seconds), kernel (see
589 // https://stackoverflow.com/a/14393315/1162888), etc.
607 // https://stackoverflow.com/a/14393315/1162888), etc.
590 let directory_mtime = if let Ok(option) =
608 let directory_mtime = if let Ok(option) =
591 TruncatedTimestamp::for_reliable_mtime_of(
609 TruncatedTimestamp::for_reliable_mtime_of(
592 directory_metadata,
610 directory_metadata,
593 status_start,
611 status_start,
594 ) {
612 ) {
595 if let Some(directory_mtime) = option {
613 if let Some(directory_mtime) = option {
596 directory_mtime
614 directory_mtime
597 } else {
615 } else {
598 // The directory was modified too recently,
616 // The directory was modified too recently,
599 // don’t cache its `read_dir` results.
617 // don’t cache its `read_dir` results.
600 //
618 //
601 // 1. A change to this directory (direct child was
619 // 1. A change to this directory (direct child was
602 // added or removed) causes its mtime to be set
620 // added or removed) causes its mtime to be set
603 // (possibly truncated) to `directory_mtime`
621 // (possibly truncated) to `directory_mtime`
604 // 2. This `status` algorithm calls `read_dir`
622 // 2. This `status` algorithm calls `read_dir`
605 // 3. Another change is made to the same directory, such
623 // 3. Another change is made to the same directory, such
606 // that calling `read_dir` again would give
624 // that calling `read_dir` again would give
607 // different results, but soon enough after 1. that
625 // different results, but soon enough after 1. that
608 // the mtime stays the same
626 // the mtime stays the same
609 //
627 //
610 // On a system where the time resolution is poor, this
628 // On a system where the time resolution is poor, this
611 // scenario is not unlikely if all three steps are caused
629 // scenario is not unlikely if all three steps are caused
612 // by the same script.
630 // by the same script.
613 return Ok(());
631 return Ok(());
614 }
632 }
615 } else {
633 } else {
616 // OS/libc does not support mtime?
634 // OS/libc does not support mtime?
617 return Ok(());
635 return Ok(());
618 };
636 };
619 // We’ve observed (through `status_start`) that time has
637 // We’ve observed (through `status_start`) that time has
620 // “progressed” since `directory_mtime`, so any further
638 // “progressed” since `directory_mtime`, so any further
621 // change to this directory is extremely likely to cause a
639 // change to this directory is extremely likely to cause a
622 // different mtime.
640 // different mtime.
623 //
641 //
624 // Having the same mtime again is not entirely impossible
642 // Having the same mtime again is not entirely impossible
625 // since the system clock is not monotonic. It could jump
643 // since the system clock is not monotonic. It could jump
626 // backward to some point before `directory_mtime`, then a
644 // backward to some point before `directory_mtime`, then a
627 // directory change could potentially happen during exactly
645 // directory change could potentially happen during exactly
628 // the wrong tick.
646 // the wrong tick.
629 //
647 //
630 // We deem this scenario (unlike the previous one) to be
648 // We deem this scenario (unlike the previous one) to be
631 // unlikely enough in practice.
649 // unlikely enough in practice.
632
650
633 let is_up_to_date = if let Some(cached) =
651 let is_up_to_date = if let Some(cached) =
634 dirstate_node.cached_directory_mtime()?
652 dirstate_node.cached_directory_mtime()?
635 {
653 {
636 !outdated_directory_cache && cached.likely_equal(directory_mtime)
654 !outdated_directory_cache && cached.likely_equal(directory_mtime)
637 } else {
655 } else {
638 false
656 false
639 };
657 };
640 if !is_up_to_date {
658 if !is_up_to_date {
641 let hg_path = dirstate_node
659 let hg_path = dirstate_node
642 .full_path_borrowed(self.dmap.on_disk)?
660 .full_path_borrowed(self.dmap.on_disk)?
643 .detach_from_tree();
661 .detach_from_tree();
644 self.new_cacheable_directories
662 self.new_cacheable_directories
645 .lock()
663 .lock()
646 .unwrap()
664 .unwrap()
647 .push((hg_path, directory_mtime))
665 .push((hg_path, directory_mtime))
648 }
666 }
649 Ok(())
667 Ok(())
650 }
668 }
651
669
652 /// A file that is clean in the dirstate was found in the filesystem
670 /// A file that is clean in the dirstate was found in the filesystem
653 fn handle_normal_file(
671 fn handle_normal_file(
654 &self,
672 &self,
655 dirstate_node: &NodeRef<'tree, 'on_disk>,
673 dirstate_node: &NodeRef<'tree, 'on_disk>,
656 fs_metadata: &std::fs::Metadata,
674 fs_metadata: &std::fs::Metadata,
657 ) -> Result<(), DirstateV2ParseError> {
675 ) -> Result<(), DirstateV2ParseError> {
658 // Keep the low 31 bits
676 // Keep the low 31 bits
659 fn truncate_u64(value: u64) -> i32 {
677 fn truncate_u64(value: u64) -> i32 {
660 (value & 0x7FFF_FFFF) as i32
678 (value & 0x7FFF_FFFF) as i32
661 }
679 }
662
680
663 let entry = dirstate_node
681 let entry = dirstate_node
664 .entry()?
682 .entry()?
665 .expect("handle_normal_file called with entry-less node");
683 .expect("handle_normal_file called with entry-less node");
666 let mode_changed =
684 let mode_changed =
667 || self.options.check_exec && entry.mode_changed(fs_metadata);
685 || self.options.check_exec && entry.mode_changed(fs_metadata);
668 let size = entry.size();
686 let size = entry.size();
669 let size_changed = size != truncate_u64(fs_metadata.len());
687 let size_changed = size != truncate_u64(fs_metadata.len());
670 if size >= 0 && size_changed && fs_metadata.file_type().is_symlink() {
688 if size >= 0 && size_changed && fs_metadata.file_type().is_symlink() {
671 // issue6456: Size returned may be longer due to encryption
689 // issue6456: Size returned may be longer due to encryption
672 // on EXT-4 fscrypt. TODO maybe only do it on EXT4?
690 // on EXT-4 fscrypt. TODO maybe only do it on EXT4?
673 self.push_outcome(Outcome::Unsure, dirstate_node)?
691 self.push_outcome(Outcome::Unsure, dirstate_node)?
674 } else if dirstate_node.has_copy_source()
692 } else if dirstate_node.has_copy_source()
675 || entry.is_from_other_parent()
693 || entry.is_from_other_parent()
676 || (size >= 0 && (size_changed || mode_changed()))
694 || (size >= 0 && (size_changed || mode_changed()))
677 {
695 {
678 self.push_outcome(Outcome::Modified, dirstate_node)?
696 self.push_outcome(Outcome::Modified, dirstate_node)?
679 } else {
697 } else {
680 let mtime_looks_clean;
698 let mtime_looks_clean;
681 if let Some(dirstate_mtime) = entry.truncated_mtime() {
699 if let Some(dirstate_mtime) = entry.truncated_mtime() {
682 let fs_mtime = TruncatedTimestamp::for_mtime_of(fs_metadata)
700 let fs_mtime = TruncatedTimestamp::for_mtime_of(fs_metadata)
683 .expect("OS/libc does not support mtime?");
701 .expect("OS/libc does not support mtime?");
684 // There might be a change in the future if for example the
702 // There might be a change in the future if for example the
685 // internal clock drifts while the process runs, but this is a
703 // internal clock drifts while the process runs, but this is a
686 // case where the issues the user would face
704 // case where the issues the user would face
687 // would be a lot worse and there is nothing we
705 // would be a lot worse and there is nothing we
688 // can really do.
706 // can really do.
689 mtime_looks_clean = fs_mtime.likely_equal(dirstate_mtime)
707 mtime_looks_clean = fs_mtime.likely_equal(dirstate_mtime)
690 } else {
708 } else {
691 // No mtime in the dirstate entry
709 // No mtime in the dirstate entry
692 mtime_looks_clean = false
710 mtime_looks_clean = false
693 };
711 };
694 if !mtime_looks_clean {
712 if !mtime_looks_clean {
695 self.push_outcome(Outcome::Unsure, dirstate_node)?
713 self.push_outcome(Outcome::Unsure, dirstate_node)?
696 } else if self.options.list_clean {
714 } else if self.options.list_clean {
697 self.push_outcome(Outcome::Clean, dirstate_node)?
715 self.push_outcome(Outcome::Clean, dirstate_node)?
698 }
716 }
699 }
717 }
700 Ok(())
718 Ok(())
701 }
719 }
702
720
703 /// A node in the dirstate tree has no corresponding filesystem entry
721 /// A node in the dirstate tree has no corresponding filesystem entry
704 fn traverse_dirstate_only(
722 fn traverse_dirstate_only(
705 &self,
723 &self,
706 dirstate_node: NodeRef<'tree, 'on_disk>,
724 dirstate_node: NodeRef<'tree, 'on_disk>,
707 ) -> Result<(), DirstateV2ParseError> {
725 ) -> Result<(), DirstateV2ParseError> {
708 self.check_for_outdated_directory_cache(&dirstate_node)?;
726 self.check_for_outdated_directory_cache(&dirstate_node)?;
709 self.mark_removed_or_deleted_if_file(&dirstate_node)?;
727 self.mark_removed_or_deleted_if_file(&dirstate_node)?;
710 dirstate_node
728 dirstate_node
711 .children(self.dmap.on_disk)?
729 .children(self.dmap.on_disk)?
712 .par_iter()
730 .par_iter()
713 .map(|child_node| self.traverse_dirstate_only(child_node))
731 .map(|child_node| self.traverse_dirstate_only(child_node))
714 .collect()
732 .collect()
715 }
733 }
716
734
717 /// A node in the dirstate tree has no corresponding *file* on the
735 /// A node in the dirstate tree has no corresponding *file* on the
718 /// filesystem
736 /// filesystem
719 ///
737 ///
720 /// Does nothing on a "directory" node
738 /// Does nothing on a "directory" node
721 fn mark_removed_or_deleted_if_file(
739 fn mark_removed_or_deleted_if_file(
722 &self,
740 &self,
723 dirstate_node: &NodeRef<'tree, 'on_disk>,
741 dirstate_node: &NodeRef<'tree, 'on_disk>,
724 ) -> Result<(), DirstateV2ParseError> {
742 ) -> Result<(), DirstateV2ParseError> {
725 if let Some(entry) = dirstate_node.entry()? {
743 if let Some(entry) = dirstate_node.entry()? {
726 if !entry.any_tracked() {
744 if !entry.any_tracked() {
727 // Future-compat for when we start storing ignored and unknown
745 // Future-compat for when we start storing ignored and unknown
728 // files for caching reasons
746 // files for caching reasons
729 return Ok(());
747 return Ok(());
730 }
748 }
731 let path = dirstate_node.full_path(self.dmap.on_disk)?;
749 let path = dirstate_node.full_path(self.dmap.on_disk)?;
732 if self.matcher.matches(path) {
750 if self.matcher.matches(path) {
733 if entry.removed() {
751 if entry.removed() {
734 self.push_outcome(Outcome::Removed, dirstate_node)?
752 self.push_outcome(Outcome::Removed, dirstate_node)?
735 } else {
753 } else {
736 self.push_outcome(Outcome::Deleted, &dirstate_node)?
754 self.push_outcome(Outcome::Deleted, &dirstate_node)?
737 }
755 }
738 }
756 }
739 }
757 }
740 Ok(())
758 Ok(())
741 }
759 }
742
760
743 /// Something in the filesystem has no corresponding dirstate node
761 /// Something in the filesystem has no corresponding dirstate node
744 ///
762 ///
745 /// Returns whether that path is ignored
763 /// Returns whether that path is ignored
746 fn traverse_fs_only(
764 fn traverse_fs_only(
747 &self,
765 &self,
748 has_ignored_ancestor: bool,
766 has_ignored_ancestor: bool,
749 directory_hg_path: &HgPath,
767 directory_hg_path: &HgPath,
750 fs_entry: &DirEntry,
768 fs_entry: &DirEntry,
751 ) -> bool {
769 ) -> bool {
752 let hg_path = directory_hg_path.join(&fs_entry.base_name);
770 let hg_path = directory_hg_path.join(&fs_entry.base_name);
753 let file_type = fs_entry.metadata.file_type();
771 let file_type = fs_entry.metadata.file_type();
754 let file_or_symlink = file_type.is_file() || file_type.is_symlink();
772 let file_or_symlink = file_type.is_file() || file_type.is_symlink();
755 if file_type.is_dir() {
773 if file_type.is_dir() {
756 let is_ignored =
774 let is_ignored =
757 has_ignored_ancestor || (self.ignore_fn)(&hg_path);
775 has_ignored_ancestor || (self.ignore_fn)(&hg_path);
758 let traverse_children = if is_ignored {
776 let traverse_children = if is_ignored {
759 // Descendants of an ignored directory are all ignored
777 // Descendants of an ignored directory are all ignored
760 self.options.list_ignored
778 self.options.list_ignored
761 } else {
779 } else {
762 // Descendants of an unknown directory may be either unknown or
780 // Descendants of an unknown directory may be either unknown or
763 // ignored
781 // ignored
764 self.options.list_unknown || self.options.list_ignored
782 self.options.list_unknown || self.options.list_ignored
765 };
783 };
766 if traverse_children {
784 if traverse_children {
767 let is_at_repo_root = false;
785 let is_at_repo_root = false;
768 if let Ok(children_fs_entries) = self.read_dir(
786 if let Ok(children_fs_entries) = self.read_dir(
769 &hg_path,
787 &hg_path,
770 &fs_entry.full_path,
788 &fs_entry.full_path,
771 is_at_repo_root,
789 is_at_repo_root,
772 ) {
790 ) {
773 children_fs_entries.par_iter().for_each(|child_fs_entry| {
791 children_fs_entries.par_iter().for_each(|child_fs_entry| {
774 self.traverse_fs_only(
792 self.traverse_fs_only(
775 is_ignored,
793 is_ignored,
776 &hg_path,
794 &hg_path,
777 child_fs_entry,
795 child_fs_entry,
778 );
796 );
779 })
797 })
780 }
798 }
781 if self.options.collect_traversed_dirs {
799 if self.options.collect_traversed_dirs {
782 self.outcome.lock().unwrap().traversed.push(hg_path.into())
800 self.outcome.lock().unwrap().traversed.push(hg_path.into())
783 }
801 }
784 }
802 }
785 is_ignored
803 is_ignored
786 } else {
804 } else {
787 if file_or_symlink {
805 if file_or_symlink {
788 if self.matcher.matches(&hg_path) {
806 if self.matcher.matches(&hg_path) {
789 self.mark_unknown_or_ignored(
807 self.mark_unknown_or_ignored(
790 has_ignored_ancestor,
808 has_ignored_ancestor,
791 &BorrowedPath::InMemory(&hg_path),
809 &BorrowedPath::InMemory(&hg_path),
792 )
810 )
793 } else {
811 } else {
794 // We haven’t computed whether this path is ignored. It
812 // We haven’t computed whether this path is ignored. It
795 // might not be, and a future run of status might have a
813 // might not be, and a future run of status might have a
796 // different matcher that matches it. So treat it as not
814 // different matcher that matches it. So treat it as not
797 // ignored. That is, inhibit readdir caching of the parent
815 // ignored. That is, inhibit readdir caching of the parent
798 // directory.
816 // directory.
799 false
817 false
800 }
818 }
801 } else {
819 } else {
802 // This is neither a directory, a plain file, nor a symlink.
820 // This is neither a directory, a plain file, nor a symlink.
803 // Treat it like an ignored file.
821 // Treat it like an ignored file.
804 true
822 true
805 }
823 }
806 }
824 }
807 }
825 }
808
826
809 /// Returns whether that path is ignored
827 /// Returns whether that path is ignored
810 fn mark_unknown_or_ignored(
828 fn mark_unknown_or_ignored(
811 &self,
829 &self,
812 has_ignored_ancestor: bool,
830 has_ignored_ancestor: bool,
813 hg_path: &BorrowedPath<'_, 'on_disk>,
831 hg_path: &BorrowedPath<'_, 'on_disk>,
814 ) -> bool {
832 ) -> bool {
815 let is_ignored = has_ignored_ancestor || (self.ignore_fn)(&hg_path);
833 let is_ignored = has_ignored_ancestor || (self.ignore_fn)(&hg_path);
816 if is_ignored {
834 if is_ignored {
817 if self.options.list_ignored {
835 if self.options.list_ignored {
818 self.push_outcome_without_copy_source(
836 self.push_outcome_without_copy_source(
819 Outcome::Ignored,
837 Outcome::Ignored,
820 hg_path,
838 hg_path,
821 )
839 )
822 }
840 }
823 } else {
841 } else {
824 if self.options.list_unknown {
842 if self.options.list_unknown {
825 self.push_outcome_without_copy_source(
843 self.push_outcome_without_copy_source(
826 Outcome::Unknown,
844 Outcome::Unknown,
827 hg_path,
845 hg_path,
828 )
846 )
829 }
847 }
830 }
848 }
831 is_ignored
849 is_ignored
832 }
850 }
833 }
851 }
834
852
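// Illustrative sketch (not part of the original file): the core pairing
// trick used by `traverse_fs_directory_and_dirstate` above.
// `itertools::merge_join_by` walks two *sorted* sequences in lockstep and
// reports, for each name, whether it exists in both, only in the dirstate,
// or only on the filesystem -- which is exactly the three-way dispatch the
// status algorithm performs. Assumes the `itertools` crate already used in
// this file.
#[cfg(test)]
mod merge_join_by_sketch {
    use itertools::EitherOrBoth::{Both, Left, Right};

    #[test]
    fn pairs_two_sorted_sequences() {
        let dirstate_names = ["a", "b", "d"]; // known to the dirstate
        let fs_names = ["b", "c", "d"]; // found by `read_dir`
        let merged: Vec<String> =
            itertools::merge_join_by(dirstate_names, fs_names, |d, f| d.cmp(f))
                .map(|pair| match pair {
                    Both(name, _) => format!("both:{}", name),
                    Left(name) => format!("dirstate-only:{}", name),
                    Right(name) => format!("fs-only:{}", name),
                })
                .collect();
        assert_eq!(
            merged,
            ["dirstate-only:a", "both:b", "fs-only:c", "both:d"]
        );
    }
}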
835 struct DirEntry {
853 struct DirEntry {
836 base_name: HgPathBuf,
854 base_name: HgPathBuf,
837 full_path: PathBuf,
855 full_path: PathBuf,
838 metadata: std::fs::Metadata,
856 metadata: std::fs::Metadata,
839 }
857 }
840
858
841 impl DirEntry {
859 impl DirEntry {
842 /// Returns **unsorted** entries in the given directory, with name and
860 /// Returns **unsorted** entries in the given directory, with name and
843 /// metadata.
861 /// metadata.
844 ///
862 ///
845 /// If a `.hg` sub-directory is encountered:
863 /// If a `.hg` sub-directory is encountered:
846 ///
864 ///
847 /// * At the repository root, ignore that sub-directory
865 /// * At the repository root, ignore that sub-directory
848 /// * Elsewhere, we’re listing the content of a sub-repo. Return an empty
866 /// * Elsewhere, we’re listing the content of a sub-repo. Return an empty
849 /// list instead.
867 /// list instead.
850 fn read_dir(path: &Path, is_at_repo_root: bool) -> io::Result<Vec<Self>> {
868 fn read_dir(path: &Path, is_at_repo_root: bool) -> io::Result<Vec<Self>> {
851 // `read_dir` returns a "not found" error for the empty path
869 // `read_dir` returns a "not found" error for the empty path
852 let at_cwd = path == Path::new("");
870 let at_cwd = path == Path::new("");
853 let read_dir_path = if at_cwd { Path::new(".") } else { path };
871 let read_dir_path = if at_cwd { Path::new(".") } else { path };
854 let mut results = Vec::new();
872 let mut results = Vec::new();
855 for entry in read_dir_path.read_dir()? {
873 for entry in read_dir_path.read_dir()? {
856 let entry = entry?;
874 let entry = entry?;
857 let metadata = match entry.metadata() {
875 let metadata = match entry.metadata() {
858 Ok(v) => v,
876 Ok(v) => v,
859 Err(e) => {
877 Err(e) => {
860 // race with file deletion?
878 // race with file deletion?
861 if e.kind() == std::io::ErrorKind::NotFound {
879 if e.kind() == std::io::ErrorKind::NotFound {
862 continue;
880 continue;
863 } else {
881 } else {
864 return Err(e);
882 return Err(e);
865 }
883 }
866 }
884 }
867 };
885 };
868 let file_name = entry.file_name();
886 let file_name = entry.file_name();
869 // FIXME don't do this when cached
887 // FIXME don't do this when cached
870 if file_name == ".hg" {
888 if file_name == ".hg" {
871 if is_at_repo_root {
889 if is_at_repo_root {
872 // Skip the repo’s own .hg (might be a symlink)
890 // Skip the repo’s own .hg (might be a symlink)
873 continue;
891 continue;
874 } else if metadata.is_dir() {
892 } else if metadata.is_dir() {
875 // A .hg sub-directory at another location means a subrepo,
893 // A .hg sub-directory at another location means a subrepo,
876 // skip it entirely.
894 // skip it entirely.
877 return Ok(Vec::new());
895 return Ok(Vec::new());
878 }
896 }
879 }
897 }
880 let full_path = if at_cwd {
898 let full_path = if at_cwd {
881 file_name.clone().into()
899 file_name.clone().into()
882 } else {
900 } else {
883 entry.path()
901 entry.path()
884 };
902 };
885 let base_name = get_bytes_from_os_string(file_name).into();
903 let base_name = get_bytes_from_os_string(file_name).into();
886 results.push(DirEntry {
904 results.push(DirEntry {
887 base_name,
905 base_name,
888 full_path,
906 full_path,
889 metadata,
907 metadata,
890 })
908 })
891 }
909 }
892 Ok(results)
910 Ok(results)
893 }
911 }
894 }
912 }
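// Illustrative sketch (not part of the original file): the "tolerate entries
// that vanish between `read_dir` and `metadata`" pattern used by
// `DirEntry::read_dir` above, written against the standard library only. A
// file deleted concurrently is silently skipped instead of failing the whole
// directory listing.
#[allow(dead_code)]
fn list_surviving_entries(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut paths = Vec::new();
    for entry in dir.read_dir()? {
        let entry = entry?;
        match entry.metadata() {
            Ok(_) => paths.push(entry.path()),
            // Race with file deletion: the entry is already gone, skip it.
            Err(e) if e.kind() == io::ErrorKind::NotFound => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(paths)
}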
895
913
896 /// Return the `mtime` of a temporary file newly-created in the `.hg` directory
914 /// Return the `mtime` of a temporary file newly-created in the `.hg` directory
897 /// of the given repository.
915 /// of the given repository.
898 ///
916 ///
899 /// This is similar to `SystemTime::now()`, with the result truncated to the
917 /// This is similar to `SystemTime::now()`, with the result truncated to the
900 /// same time resolution as other files’ modification times. Using `.hg`
918 /// same time resolution as other files’ modification times. Using `.hg`
901 /// instead of the system’s default temporary directory (such as `/tmp`) makes
919 /// instead of the system’s default temporary directory (such as `/tmp`) makes
902 /// it more likely the temporary file is in the same disk partition as contents
920 /// it more likely the temporary file is in the same disk partition as contents
903 /// of the working directory, which can matter since different filesystems may
921 /// of the working directory, which can matter since different filesystems may
904 /// store timestamps with different resolutions.
922 /// store timestamps with different resolutions.
905 ///
923 ///
906 /// This may fail, typically if we lack write permissions. In that case we
924 /// This may fail, typically if we lack write permissions. In that case we
907 /// should continue the `status()` algorithm anyway and consider the current
925 /// should continue the `status()` algorithm anyway and consider the current
908 /// date/time to be unknown.
926 /// date/time to be unknown.
909 fn filesystem_now(repo_root: &Path) -> Result<SystemTime, io::Error> {
927 fn filesystem_now(repo_root: &Path) -> Result<SystemTime, io::Error> {
910 tempfile::tempfile_in(repo_root.join(".hg"))?
928 tempfile::tempfile_in(repo_root.join(".hg"))?
911 .metadata()?
929 .metadata()?
912 .modified()
930 .modified()
913 }
931 }
@@ -1,706 +1,706
1 // filepatterns.rs
1 // filepatterns.rs
2 //
2 //
3 // Copyright 2019 Raphaël Gomès <rgomes@octobus.net>
3 // Copyright 2019 Raphaël Gomès <rgomes@octobus.net>
4 //
4 //
5 // This software may be used and distributed according to the terms of the
5 // This software may be used and distributed according to the terms of the
6 // GNU General Public License version 2 or any later version.
6 // GNU General Public License version 2 or any later version.
7
7
8 //! Handling of Mercurial-specific patterns.
8 //! Handling of Mercurial-specific patterns.
9
9
10 use crate::{
10 use crate::{
11 utils::{
11 utils::{
12 files::{canonical_path, get_bytes_from_path, get_path_from_bytes},
12 files::{canonical_path, get_bytes_from_path, get_path_from_bytes},
13 hg_path::{path_to_hg_path_buf, HgPathBuf, HgPathError},
13 hg_path::{path_to_hg_path_buf, HgPathBuf, HgPathError},
14 SliceExt,
14 SliceExt,
15 },
15 },
16 FastHashMap, PatternError,
16 FastHashMap, PatternError,
17 };
17 };
18 use lazy_static::lazy_static;
18 use lazy_static::lazy_static;
19 use regex::bytes::{NoExpand, Regex};
19 use regex::bytes::{NoExpand, Regex};
20 use std::ops::Deref;
20 use std::ops::Deref;
21 use std::path::{Path, PathBuf};
21 use std::path::{Path, PathBuf};
22 use std::vec::Vec;
22 use std::vec::Vec;
23
23
24 lazy_static! {
24 lazy_static! {
25 static ref RE_ESCAPE: Vec<Vec<u8>> = {
25 static ref RE_ESCAPE: Vec<Vec<u8>> = {
26 let mut v: Vec<Vec<u8>> = (0..=255).map(|byte| vec![byte]).collect();
26 let mut v: Vec<Vec<u8>> = (0..=255).map(|byte| vec![byte]).collect();
27 let to_escape = b"()[]{}?*+-|^$\\.&~# \t\n\r\x0b\x0c";
27 let to_escape = b"()[]{}?*+-|^$\\.&~# \t\n\r\x0b\x0c";
28 for byte in to_escape {
28 for byte in to_escape {
29 v[*byte as usize].insert(0, b'\\');
29 v[*byte as usize].insert(0, b'\\');
30 }
30 }
31 v
31 v
32 };
32 };
33 }
33 }
34
34
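// Illustrative sketch (not part of the original file): what the RE_ESCAPE
// table above is for. Every byte of a literal is replaced by its table entry
// (the byte itself, or the byte prefixed with a backslash), which yields
// bytes that can be spliced into a regular expression safely. The real
// escaping sites live elsewhere in this file; this helper only makes the
// table concrete.
#[cfg(test)]
mod re_escape_sketch {
    use super::RE_ESCAPE;

    fn escape_literal(literal: &[u8]) -> Vec<u8> {
        let table: &Vec<Vec<u8>> = &RE_ESCAPE;
        literal
            .iter()
            .flat_map(|byte| table[*byte as usize].clone())
            .collect()
    }

    #[test]
    fn escapes_regex_metacharacters() {
        assert_eq!(escape_literal(b"a.b"), b"a\\.b".to_vec());
        assert_eq!(escape_literal(b"x*y+z"), b"x\\*y\\+z".to_vec());
    }
}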
35 /// These are matched in order
35 /// These are matched in order
36 const GLOB_REPLACEMENTS: &[(&[u8], &[u8])] =
36 const GLOB_REPLACEMENTS: &[(&[u8], &[u8])] =
37 &[(b"*/", b"(?:.*/)?"), (b"*", b".*"), (b"", b"[^/]*")];
37 &[(b"*/", b"(?:.*/)?"), (b"*", b".*"), (b"", b"[^/]*")];
38
38
39 /// Appended to the regexp of globs
39 /// Appended to the regexp of globs
40 const GLOB_SUFFIX: &[u8; 7] = b"(?:/|$)";
40 const GLOB_SUFFIX: &[u8; 7] = b"(?:/|$)";
41
41
42 #[derive(Debug, Clone, PartialEq, Eq)]
42 #[derive(Debug, Clone, PartialEq, Eq)]
43 pub enum PatternSyntax {
43 pub enum PatternSyntax {
44 /// A regular expression
44 /// A regular expression
45 Regexp,
45 Regexp,
46 /// Glob that matches at the front of the path
46 /// Glob that matches at the front of the path
47 RootGlob,
47 RootGlob,
48 /// Glob that matches at any suffix of the path (still anchored at
48 /// Glob that matches at any suffix of the path (still anchored at
49 /// slashes)
49 /// slashes)
50 Glob,
50 Glob,
51 /// A path relative to repository root, which is matched recursively
51 /// A path relative to repository root, which is matched recursively
52 Path,
52 Path,
53 /// A path relative to cwd
53 /// A path relative to cwd
54 RelPath,
54 RelPath,
55 /// An unrooted glob (*.rs matches Rust files in all dirs)
55 /// An unrooted glob (*.rs matches Rust files in all dirs)
56 RelGlob,
56 RelGlob,
57 /// A regexp that needn't match the start of a name
57 /// A regexp that needn't match the start of a name
58 RelRegexp,
58 RelRegexp,
59 /// A path relative to repository root, which is matched non-recursively
59 /// A path relative to repository root, which is matched non-recursively
60 /// (will not match subdirectories)
60 /// (will not match subdirectories)
61 RootFiles,
61 RootFiles,
62 /// A file of patterns to read and include
62 /// A file of patterns to read and include
63 Include,
63 Include,
64 /// A file of patterns to match against files under the same directory
64 /// A file of patterns to match against files under the same directory
65 SubInclude,
65 SubInclude,
66 /// SubInclude with the result of parsing the included file
66 /// SubInclude with the result of parsing the included file
67 ///
67 ///
68 /// Note: there is no ExpandedInclude because that expansion can be done
68 /// Note: there is no ExpandedInclude because that expansion can be done
69 /// in place by replacing the Include pattern by the included patterns.
69 /// in place by replacing the Include pattern by the included patterns.
70 /// SubInclude requires more handling.
70 /// SubInclude requires more handling.
71 ///
71 ///
72 /// Note: `Box` is used to minimize size impact on other enum variants
72 /// Note: `Box` is used to minimize size impact on other enum variants
73 ExpandedSubInclude(Box<SubInclude>),
73 ExpandedSubInclude(Box<SubInclude>),
74 }
74 }
75
75
76 /// Transforms a glob pattern into a regex
76 /// Transforms a glob pattern into a regex
77 fn glob_to_re(pat: &[u8]) -> Vec<u8> {
77 fn glob_to_re(pat: &[u8]) -> Vec<u8> {
78 let mut input = pat;
78 let mut input = pat;
79 let mut res: Vec<u8> = vec![];
79 let mut res: Vec<u8> = vec![];
80 let mut group_depth = 0;
80 let mut group_depth = 0;
81
81
82 while let Some((c, rest)) = input.split_first() {
82 while let Some((c, rest)) = input.split_first() {
83 input = rest;
83 input = rest;
84
84
85 match c {
85 match c {
86 b'*' => {
86 b'*' => {
87 for (source, repl) in GLOB_REPLACEMENTS {
87 for (source, repl) in GLOB_REPLACEMENTS {
88 if let Some(rest) = input.drop_prefix(source) {
88 if let Some(rest) = input.drop_prefix(source) {
89 input = rest;
89 input = rest;
90 res.extend(*repl);
90 res.extend(*repl);
91 break;
91 break;
92 }
92 }
93 }
93 }
94 }
94 }
95 b'?' => res.extend(b"."),
95 b'?' => res.extend(b"."),
96 b'[' => {
96 b'[' => {
97 match input.iter().skip(1).position(|b| *b == b']') {
97 match input.iter().skip(1).position(|b| *b == b']') {
98 None => res.extend(b"\\["),
98 None => res.extend(b"\\["),
99 Some(end) => {
99 Some(end) => {
100 // Account for the one we skipped
100 // Account for the one we skipped
101 let end = end + 1;
101 let end = end + 1;
102
102
103 res.extend(b"[");
103 res.extend(b"[");
104
104
105 for (i, b) in input[..end].iter().enumerate() {
105 for (i, b) in input[..end].iter().enumerate() {
106 if *b == b'!' && i == 0 {
106 if *b == b'!' && i == 0 {
107 res.extend(b"^")
107 res.extend(b"^")
108 } else if *b == b'^' && i == 0 {
108 } else if *b == b'^' && i == 0 {
109 res.extend(b"\\^")
109 res.extend(b"\\^")
110 } else if *b == b'\\' {
110 } else if *b == b'\\' {
111 res.extend(b"\\\\")
111 res.extend(b"\\\\")
112 } else {
112 } else {
113 res.push(*b)
113 res.push(*b)
114 }
114 }
115 }
115 }
116 res.extend(b"]");
116 res.extend(b"]");
117 input = &input[end + 1..];
117 input = &input[end + 1..];
118 }
118 }
119 }
119 }
120 }
120 }
121 b'{' => {
121 b'{' => {
122 group_depth += 1;
122 group_depth += 1;
123 res.extend(b"(?:")
123 res.extend(b"(?:")
124 }
124 }
125 b'}' if group_depth > 0 => {
125 b'}' if group_depth > 0 => {
126 group_depth -= 1;
126 group_depth -= 1;
127 res.extend(b")");
127 res.extend(b")");
128 }
128 }
129 b',' if group_depth > 0 => res.extend(b"|"),
129 b',' if group_depth > 0 => res.extend(b"|"),
130 b'\\' => {
130 b'\\' => {
131 let c = {
131 let c = {
132 if let Some((c, rest)) = input.split_first() {
132 if let Some((c, rest)) = input.split_first() {
133 input = rest;
133 input = rest;
134 c
134 c
135 } else {
135 } else {
136 c
136 c
137 }
137 }
138 };
138 };
139 res.extend(&RE_ESCAPE[*c as usize])
139 res.extend(&RE_ESCAPE[*c as usize])
140 }
140 }
141 _ => res.extend(&RE_ESCAPE[*c as usize]),
141 _ => res.extend(&RE_ESCAPE[*c as usize]),
142 }
142 }
143 }
143 }
144 res
144 res
145 }
145 }
146
146
147 fn escape_pattern(pattern: &[u8]) -> Vec<u8> {
147 fn escape_pattern(pattern: &[u8]) -> Vec<u8> {
148 pattern
148 pattern
149 .iter()
149 .iter()
150 .flat_map(|c| RE_ESCAPE[*c as usize].clone())
150 .flat_map(|c| RE_ESCAPE[*c as usize].clone())
151 .collect()
151 .collect()
152 }
152 }
153
153
154 pub fn parse_pattern_syntax(
154 pub fn parse_pattern_syntax(
155 kind: &[u8],
155 kind: &[u8],
156 ) -> Result<PatternSyntax, PatternError> {
156 ) -> Result<PatternSyntax, PatternError> {
157 match kind {
157 match kind {
158 b"re:" => Ok(PatternSyntax::Regexp),
158 b"re:" => Ok(PatternSyntax::Regexp),
159 b"path:" => Ok(PatternSyntax::Path),
159 b"path:" => Ok(PatternSyntax::Path),
160 b"relpath:" => Ok(PatternSyntax::RelPath),
160 b"relpath:" => Ok(PatternSyntax::RelPath),
161 b"rootfilesin:" => Ok(PatternSyntax::RootFiles),
161 b"rootfilesin:" => Ok(PatternSyntax::RootFiles),
162 b"relglob:" => Ok(PatternSyntax::RelGlob),
162 b"relglob:" => Ok(PatternSyntax::RelGlob),
163 b"relre:" => Ok(PatternSyntax::RelRegexp),
163 b"relre:" => Ok(PatternSyntax::RelRegexp),
164 b"glob:" => Ok(PatternSyntax::Glob),
164 b"glob:" => Ok(PatternSyntax::Glob),
165 b"rootglob:" => Ok(PatternSyntax::RootGlob),
165 b"rootglob:" => Ok(PatternSyntax::RootGlob),
166 b"include:" => Ok(PatternSyntax::Include),
166 b"include:" => Ok(PatternSyntax::Include),
167 b"subinclude:" => Ok(PatternSyntax::SubInclude),
167 b"subinclude:" => Ok(PatternSyntax::SubInclude),
168 _ => Err(PatternError::UnsupportedSyntax(
168 _ => Err(PatternError::UnsupportedSyntax(
169 String::from_utf8_lossy(kind).to_string(),
169 String::from_utf8_lossy(kind).to_string(),
170 )),
170 )),
171 }
171 }
172 }
172 }
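// Illustrative sketch, not part of the original file: the `kind` argument is
// expected to include the trailing colon, exactly as the prefixes appear in
// pattern files.
#[cfg(test)]
#[test]
fn parse_pattern_syntax_example() {
    assert_eq!(parse_pattern_syntax(b"glob:").unwrap(), PatternSyntax::Glob);
    assert_eq!(
        parse_pattern_syntax(b"rootglob:").unwrap(),
        PatternSyntax::RootGlob
    );
    assert!(parse_pattern_syntax(b"bogus:").is_err());
}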
173
173
174 /// Builds the regex that corresponds to the given pattern.
174 /// Builds the regex that corresponds to the given pattern.
175 /// If within a `syntax: regexp` context, returns the pattern unchanged;
175 /// If within a `syntax: regexp` context, returns the pattern unchanged;
176 /// otherwise, returns the corresponding regex.
176 /// otherwise, returns the corresponding regex.
177 fn _build_single_regex(entry: &IgnorePattern) -> Vec<u8> {
177 fn _build_single_regex(entry: &IgnorePattern) -> Vec<u8> {
178 let IgnorePattern {
178 let IgnorePattern {
179 syntax, pattern, ..
179 syntax, pattern, ..
180 } = entry;
180 } = entry;
181 if pattern.is_empty() {
181 if pattern.is_empty() {
182 return vec![];
182 return vec![];
183 }
183 }
184 match syntax {
184 match syntax {
185 PatternSyntax::Regexp => pattern.to_owned(),
185 PatternSyntax::Regexp => pattern.to_owned(),
186 PatternSyntax::RelRegexp => {
186 PatternSyntax::RelRegexp => {
187 // The `regex` crate accepts `**` while `re2` and Python's `re`
187 // The `regex` crate accepts `**` while `re2` and Python's `re`
188 // do not. Checking for `*` correctly triggers the same error in all
188 // do not. Checking for `*` correctly triggers the same error in all
189 // engines.
189 // engines.
190 if pattern[0] == b'^'
190 if pattern[0] == b'^'
191 || pattern[0] == b'*'
191 || pattern[0] == b'*'
192 || pattern.starts_with(b".*")
192 || pattern.starts_with(b".*")
193 {
193 {
194 return pattern.to_owned();
194 return pattern.to_owned();
195 }
195 }
196 [&b".*"[..], pattern].concat()
196 [&b".*"[..], pattern].concat()
197 }
197 }
198 PatternSyntax::Path | PatternSyntax::RelPath => {
198 PatternSyntax::Path | PatternSyntax::RelPath => {
199 if pattern == b"." {
199 if pattern == b"." {
200 return vec![];
200 return vec![];
201 }
201 }
202 [escape_pattern(pattern).as_slice(), b"(?:/|$)"].concat()
202 [escape_pattern(pattern).as_slice(), b"(?:/|$)"].concat()
203 }
203 }
204 PatternSyntax::RootFiles => {
204 PatternSyntax::RootFiles => {
205 let mut res = if pattern == b"." {
205 let mut res = if pattern == b"." {
206 vec![]
206 vec![]
207 } else {
207 } else {
208 // Pattern is a directory name.
208 // Pattern is a directory name.
209 [escape_pattern(pattern).as_slice(), b"/"].concat()
209 [escape_pattern(pattern).as_slice(), b"/"].concat()
210 };
210 };
211
211
212 // Anything after the pattern must be a non-directory.
212 // Anything after the pattern must be a non-directory.
213 res.extend(b"[^/]+$");
213 res.extend(b"[^/]+$");
214 res
214 res
215 }
215 }
216 PatternSyntax::RelGlob => {
216 PatternSyntax::RelGlob => {
217 let glob_re = glob_to_re(pattern);
217 let glob_re = glob_to_re(pattern);
218 if let Some(rest) = glob_re.drop_prefix(b"[^/]*") {
218 if let Some(rest) = glob_re.drop_prefix(b"[^/]*") {
219 [b".*", rest, GLOB_SUFFIX].concat()
219 [b".*", rest, GLOB_SUFFIX].concat()
220 } else {
220 } else {
221 [b"(?:.*/)?", glob_re.as_slice(), GLOB_SUFFIX].concat()
221 [b"(?:.*/)?", glob_re.as_slice(), GLOB_SUFFIX].concat()
222 }
222 }
223 }
223 }
224 PatternSyntax::Glob | PatternSyntax::RootGlob => {
224 PatternSyntax::Glob | PatternSyntax::RootGlob => {
225 [glob_to_re(pattern).as_slice(), GLOB_SUFFIX].concat()
225 [glob_to_re(pattern).as_slice(), GLOB_SUFFIX].concat()
226 }
226 }
227 PatternSyntax::Include
227 PatternSyntax::Include
228 | PatternSyntax::SubInclude
228 | PatternSyntax::SubInclude
229 | PatternSyntax::ExpandedSubInclude(_) => unreachable!(),
229 | PatternSyntax::ExpandedSubInclude(_) => unreachable!(),
230 }
230 }
231 }
231 }
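// Illustrative sketch, not part of the original file: a `path:` pattern is
// escaped literally and given the `(?:/|$)` suffix, so it matches the path
// itself and anything below it.
#[cfg(test)]
#[test]
fn build_path_syntax_regex_example() {
    let regex = _build_single_regex(&IgnorePattern::new(
        PatternSyntax::Path,
        b"foo/bar",
        Path::new(""),
    ));
    assert_eq!(regex, b"foo/bar(?:/|$)".to_vec());
}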
232
232
233 const GLOB_SPECIAL_CHARACTERS: [u8; 7] =
233 const GLOB_SPECIAL_CHARACTERS: [u8; 7] =
234 [b'*', b'?', b'[', b']', b'{', b'}', b'\\'];
234 [b'*', b'?', b'[', b']', b'{', b'}', b'\\'];
235
235
236 /// TODO support other platforms
236 /// TODO support other platforms
237 #[cfg(unix)]
237 #[cfg(unix)]
238 pub fn normalize_path_bytes(bytes: &[u8]) -> Vec<u8> {
238 pub fn normalize_path_bytes(bytes: &[u8]) -> Vec<u8> {
239 if bytes.is_empty() {
239 if bytes.is_empty() {
240 return b".".to_vec();
240 return b".".to_vec();
241 }
241 }
242 let sep = b'/';
242 let sep = b'/';
243
243
244 let mut initial_slashes = bytes.iter().take_while(|b| **b == sep).count();
244 let mut initial_slashes = bytes.iter().take_while(|b| **b == sep).count();
245 if initial_slashes > 2 {
245 if initial_slashes > 2 {
246 // POSIX allows one or two initial slashes, but treats three or more
246 // POSIX allows one or two initial slashes, but treats three or more
247 // as a single slash.
247 // as a single slash.
248 initial_slashes = 1;
248 initial_slashes = 1;
249 }
249 }
250 let components = bytes
250 let components = bytes
251 .split(|b| *b == sep)
251 .split(|b| *b == sep)
252 .filter(|c| !(c.is_empty() || c == b"."))
252 .filter(|c| !(c.is_empty() || c == b"."))
253 .fold(vec![], |mut acc, component| {
253 .fold(vec![], |mut acc, component| {
254 if component != b".."
254 if component != b".."
255 || (initial_slashes == 0 && acc.is_empty())
255 || (initial_slashes == 0 && acc.is_empty())
256 || (!acc.is_empty() && acc[acc.len() - 1] == b"..")
256 || (!acc.is_empty() && acc[acc.len() - 1] == b"..")
257 {
257 {
258 acc.push(component)
258 acc.push(component)
259 } else if !acc.is_empty() {
259 } else if !acc.is_empty() {
260 acc.pop();
260 acc.pop();
261 }
261 }
262 acc
262 acc
263 });
263 });
264 let mut new_bytes = components.join(&sep);
264 let mut new_bytes = components.join(&sep);
265
265
266 if initial_slashes > 0 {
266 if initial_slashes > 0 {
267 let mut buf: Vec<_> = (0..initial_slashes).map(|_| sep).collect();
267 let mut buf: Vec<_> = (0..initial_slashes).map(|_| sep).collect();
268 buf.extend(new_bytes);
268 buf.extend(new_bytes);
269 new_bytes = buf;
269 new_bytes = buf;
270 }
270 }
271 if new_bytes.is_empty() {
271 if new_bytes.is_empty() {
272 b".".to_vec()
272 b".".to_vec()
273 } else {
273 } else {
274 new_bytes
274 new_bytes
275 }
275 }
276 }
276 }
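// Illustrative examples, not part of the original file: the normalization
// mirrors Python's `os.path.normpath` on byte strings (duplicate slashes and
// `.` components are dropped, `..` is resolved lexically, and more than two
// leading slashes collapse to one).
#[cfg(all(test, unix))]
#[test]
fn normalize_path_bytes_example() {
    assert_eq!(
        normalize_path_bytes(b"foo//bar/./baz"),
        b"foo/bar/baz".to_vec()
    );
    assert_eq!(normalize_path_bytes(b"foo/../bar"), b"bar".to_vec());
    assert_eq!(normalize_path_bytes(b""), b".".to_vec());
    assert_eq!(normalize_path_bytes(b"///a//b/"), b"/a/b".to_vec());
}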
277
277
278 /// Wrapper function to `_build_single_regex` that short-circuits 'exact' globs
278 /// Wrapper function to `_build_single_regex` that short-circuits 'exact' globs
279 /// that don't need to be transformed into a regex.
279 /// that don't need to be transformed into a regex.
280 pub fn build_single_regex(
280 pub fn build_single_regex(
281 entry: &IgnorePattern,
281 entry: &IgnorePattern,
282 ) -> Result<Option<Vec<u8>>, PatternError> {
282 ) -> Result<Option<Vec<u8>>, PatternError> {
283 let IgnorePattern {
283 let IgnorePattern {
284 pattern, syntax, ..
284 pattern, syntax, ..
285 } = entry;
285 } = entry;
286 let pattern = match syntax {
286 let pattern = match syntax {
287 PatternSyntax::RootGlob
287 PatternSyntax::RootGlob
288 | PatternSyntax::Path
288 | PatternSyntax::Path
289 | PatternSyntax::RelGlob
289 | PatternSyntax::RelGlob
290 | PatternSyntax::RootFiles => normalize_path_bytes(&pattern),
290 | PatternSyntax::RootFiles => normalize_path_bytes(&pattern),
291 PatternSyntax::Include | PatternSyntax::SubInclude => {
291 PatternSyntax::Include | PatternSyntax::SubInclude => {
292 return Err(PatternError::NonRegexPattern(entry.clone()))
292 return Err(PatternError::NonRegexPattern(entry.clone()))
293 }
293 }
294 _ => pattern.to_owned(),
294 _ => pattern.to_owned(),
295 };
295 };
296 if *syntax == PatternSyntax::RootGlob
296 if *syntax == PatternSyntax::RootGlob
297 && !pattern.iter().any(|b| GLOB_SPECIAL_CHARACTERS.contains(b))
297 && !pattern.iter().any(|b| GLOB_SPECIAL_CHARACTERS.contains(b))
298 {
298 {
299 Ok(None)
299 Ok(None)
300 } else {
300 } else {
301 let mut entry = entry.clone();
301 let mut entry = entry.clone();
302 entry.pattern = pattern;
302 entry.pattern = pattern;
303 Ok(Some(_build_single_regex(&entry)))
303 Ok(Some(_build_single_regex(&entry)))
304 }
304 }
305 }
305 }
306
306
307 lazy_static! {
307 lazy_static! {
308 static ref SYNTAXES: FastHashMap<&'static [u8], &'static [u8]> = {
308 static ref SYNTAXES: FastHashMap<&'static [u8], &'static [u8]> = {
309 let mut m = FastHashMap::default();
309 let mut m = FastHashMap::default();
310
310
311 m.insert(b"re".as_ref(), b"relre:".as_ref());
311 m.insert(b"re".as_ref(), b"relre:".as_ref());
312 m.insert(b"regexp".as_ref(), b"relre:".as_ref());
312 m.insert(b"regexp".as_ref(), b"relre:".as_ref());
313 m.insert(b"glob".as_ref(), b"relglob:".as_ref());
313 m.insert(b"glob".as_ref(), b"relglob:".as_ref());
314 m.insert(b"rootglob".as_ref(), b"rootglob:".as_ref());
314 m.insert(b"rootglob".as_ref(), b"rootglob:".as_ref());
315 m.insert(b"include".as_ref(), b"include:".as_ref());
315 m.insert(b"include".as_ref(), b"include:".as_ref());
316 m.insert(b"subinclude".as_ref(), b"subinclude:".as_ref());
316 m.insert(b"subinclude".as_ref(), b"subinclude:".as_ref());
317 m.insert(b"path".as_ref(), b"path:".as_ref());
317 m.insert(b"path".as_ref(), b"path:".as_ref());
318 m.insert(b"rootfilesin".as_ref(), b"rootfilesin:".as_ref());
318 m.insert(b"rootfilesin".as_ref(), b"rootfilesin:".as_ref());
319 m
319 m
320 };
320 };
321 }
321 }
322
322
323 #[derive(Debug)]
323 #[derive(Debug)]
324 pub enum PatternFileWarning {
324 pub enum PatternFileWarning {
325 /// (file path, syntax bytes)
325 /// (file path, syntax bytes)
326 InvalidSyntax(PathBuf, Vec<u8>),
326 InvalidSyntax(PathBuf, Vec<u8>),
327 /// File path
327 /// File path
328 NoSuchFile(PathBuf),
328 NoSuchFile(PathBuf),
329 }
329 }
330
330
331 pub fn parse_pattern_file_contents(
331 pub fn parse_pattern_file_contents(
332 lines: &[u8],
332 lines: &[u8],
333 file_path: &Path,
333 file_path: &Path,
334 default_syntax_override: Option<&[u8]>,
334 default_syntax_override: Option<&[u8]>,
335 warn: bool,
335 warn: bool,
336 ) -> Result<(Vec<IgnorePattern>, Vec<PatternFileWarning>), PatternError> {
336 ) -> Result<(Vec<IgnorePattern>, Vec<PatternFileWarning>), PatternError> {
337 let comment_regex = Regex::new(r"((?:^|[^\\])(?:\\\\)*)#.*").unwrap();
337 let comment_regex = Regex::new(r"((?:^|[^\\])(?:\\\\)*)#.*").unwrap();
338
338
339 #[allow(clippy::trivial_regex)]
339 #[allow(clippy::trivial_regex)]
340 let comment_escape_regex = Regex::new(r"\\#").unwrap();
340 let comment_escape_regex = Regex::new(r"\\#").unwrap();
341 let mut inputs: Vec<IgnorePattern> = vec![];
341 let mut inputs: Vec<IgnorePattern> = vec![];
342 let mut warnings: Vec<PatternFileWarning> = vec![];
342 let mut warnings: Vec<PatternFileWarning> = vec![];
343
343
344 let mut current_syntax =
344 let mut current_syntax =
345 default_syntax_override.unwrap_or(b"relre:".as_ref());
345 default_syntax_override.unwrap_or(b"relre:".as_ref());
346
346
347 for (line_number, mut line) in lines.split(|c| *c == b'\n').enumerate() {
347 for (line_number, mut line) in lines.split(|c| *c == b'\n').enumerate() {
348 let line_number = line_number + 1;
348 let line_number = line_number + 1;
349
349
350 let line_buf;
350 let line_buf;
351 if line.contains(&b'#') {
351 if line.contains(&b'#') {
352 if let Some(cap) = comment_regex.captures(line) {
352 if let Some(cap) = comment_regex.captures(line) {
353 line = &line[..cap.get(1).unwrap().end()]
353 line = &line[..cap.get(1).unwrap().end()]
354 }
354 }
355 line_buf = comment_escape_regex.replace_all(line, NoExpand(b"#"));
355 line_buf = comment_escape_regex.replace_all(line, NoExpand(b"#"));
356 line = &line_buf;
356 line = &line_buf;
357 }
357 }
358
358
359 let mut line = line.trim_end();
359 let mut line = line.trim_end();
360
360
361 if line.is_empty() {
361 if line.is_empty() {
362 continue;
362 continue;
363 }
363 }
364
364
365 if let Some(syntax) = line.drop_prefix(b"syntax:") {
365 if let Some(syntax) = line.drop_prefix(b"syntax:") {
366 let syntax = syntax.trim();
366 let syntax = syntax.trim();
367
367
368 if let Some(rel_syntax) = SYNTAXES.get(syntax) {
368 if let Some(rel_syntax) = SYNTAXES.get(syntax) {
369 current_syntax = rel_syntax;
369 current_syntax = rel_syntax;
370 } else if warn {
370 } else if warn {
371 warnings.push(PatternFileWarning::InvalidSyntax(
371 warnings.push(PatternFileWarning::InvalidSyntax(
372 file_path.to_owned(),
372 file_path.to_owned(),
373 syntax.to_owned(),
373 syntax.to_owned(),
374 ));
374 ));
375 }
375 }
376 continue;
376 continue;
377 }
377 }
378
378
379 let mut line_syntax: &[u8] = &current_syntax;
379 let mut line_syntax: &[u8] = &current_syntax;
380
380
381 for (s, rels) in SYNTAXES.iter() {
381 for (s, rels) in SYNTAXES.iter() {
382 if let Some(rest) = line.drop_prefix(rels) {
382 if let Some(rest) = line.drop_prefix(rels) {
383 line_syntax = rels;
383 line_syntax = rels;
384 line = rest;
384 line = rest;
385 break;
385 break;
386 }
386 }
387 if let Some(rest) = line.drop_prefix(&[s, &b":"[..]].concat()) {
387 if let Some(rest) = line.drop_prefix(&[s, &b":"[..]].concat()) {
388 line_syntax = rels;
388 line_syntax = rels;
389 line = rest;
389 line = rest;
390 break;
390 break;
391 }
391 }
392 }
392 }
393
393
394 inputs.push(IgnorePattern::new(
394 inputs.push(IgnorePattern::new(
395 parse_pattern_syntax(&line_syntax).map_err(|e| match e {
395 parse_pattern_syntax(&line_syntax).map_err(|e| match e {
396 PatternError::UnsupportedSyntax(syntax) => {
396 PatternError::UnsupportedSyntax(syntax) => {
397 PatternError::UnsupportedSyntaxInFile(
397 PatternError::UnsupportedSyntaxInFile(
398 syntax,
398 syntax,
399 file_path.to_string_lossy().into(),
399 file_path.to_string_lossy().into(),
400 line_number,
400 line_number,
401 )
401 )
402 }
402 }
403 _ => e,
403 _ => e,
404 })?,
404 })?,
405 &line,
405 &line,
406 file_path,
406 file_path,
407 ));
407 ));
408 }
408 }
409 Ok((inputs, warnings))
409 Ok((inputs, warnings))
410 }
410 }
411
411
412 pub fn read_pattern_file(
412 pub fn read_pattern_file(
413 file_path: &Path,
413 file_path: &Path,
414 warn: bool,
414 warn: bool,
415 inspect_pattern_bytes: &mut impl FnMut(&[u8]),
415 inspect_pattern_bytes: &mut impl FnMut(&Path, &[u8]),
416 ) -> Result<(Vec<IgnorePattern>, Vec<PatternFileWarning>), PatternError> {
416 ) -> Result<(Vec<IgnorePattern>, Vec<PatternFileWarning>), PatternError> {
417 match std::fs::read(file_path) {
417 match std::fs::read(file_path) {
418 Ok(contents) => {
418 Ok(contents) => {
419 inspect_pattern_bytes(&contents);
419 inspect_pattern_bytes(file_path, &contents);
420 parse_pattern_file_contents(&contents, file_path, None, warn)
420 parse_pattern_file_contents(&contents, file_path, None, warn)
421 }
421 }
422 Err(e) if e.kind() == std::io::ErrorKind::NotFound => Ok((
422 Err(e) if e.kind() == std::io::ErrorKind::NotFound => Ok((
423 vec![],
423 vec![],
424 vec![PatternFileWarning::NoSuchFile(file_path.to_owned())],
424 vec![PatternFileWarning::NoSuchFile(file_path.to_owned())],
425 )),
425 )),
426 Err(e) => Err(e.into()),
426 Err(e) => Err(e.into()),
427 }
427 }
428 }
428 }
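// Context for the signature change above: `inspect_pattern_bytes` now also
// receives the pattern file's path, so callers can hash the *source* of the
// ignore patterns in addition to their contents. A minimal standalone sketch
// of such a digest; std's `DefaultHasher` is only an illustrative stand-in,
// not the hash Mercurial itself uses.
#[cfg(test)]
#[test]
fn hash_pattern_source_example() {
    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};

    // Same shape as the `FnMut(&Path, &[u8])` callback expected above.
    fn digest(path: &Path, bytes: &[u8]) -> u64 {
        let mut hasher = DefaultHasher::new();
        path.hash(&mut hasher);
        bytes.hash(&mut hasher);
        hasher.finish()
    }

    let contents: &[u8] = b"syntax: glob\n*.o\n";
    // Moving identical patterns to a different file changes the digest.
    assert_ne!(
        digest(Path::new(".hgignore"), contents),
        digest(Path::new("other/.hgignore"), contents)
    );
}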
429
429
430 /// Represents an entry in an "ignore" file.
430 /// Represents an entry in an "ignore" file.
431 #[derive(Debug, Eq, PartialEq, Clone)]
431 #[derive(Debug, Eq, PartialEq, Clone)]
432 pub struct IgnorePattern {
432 pub struct IgnorePattern {
433 pub syntax: PatternSyntax,
433 pub syntax: PatternSyntax,
434 pub pattern: Vec<u8>,
434 pub pattern: Vec<u8>,
435 pub source: PathBuf,
435 pub source: PathBuf,
436 }
436 }
437
437
438 impl IgnorePattern {
438 impl IgnorePattern {
439 pub fn new(syntax: PatternSyntax, pattern: &[u8], source: &Path) -> Self {
439 pub fn new(syntax: PatternSyntax, pattern: &[u8], source: &Path) -> Self {
440 Self {
440 Self {
441 syntax,
441 syntax,
442 pattern: pattern.to_owned(),
442 pattern: pattern.to_owned(),
443 source: source.to_owned(),
443 source: source.to_owned(),
444 }
444 }
445 }
445 }
446 }
446 }
447
447
448 pub type PatternResult<T> = Result<T, PatternError>;
448 pub type PatternResult<T> = Result<T, PatternError>;
449
449
450 /// Wrapper for `read_pattern_file` that also recursively expands `include:`
450 /// Wrapper for `read_pattern_file` that also recursively expands `include:`
451 /// and `subinclude:` patterns.
451 /// and `subinclude:` patterns.
452 ///
452 ///
453 /// The former are expanded in place, while `PatternSyntax::ExpandedSubInclude`
453 /// The former are expanded in place, while `PatternSyntax::ExpandedSubInclude`
454 /// is used for the latter to form a tree of patterns.
454 /// is used for the latter to form a tree of patterns.
455 pub fn get_patterns_from_file(
455 pub fn get_patterns_from_file(
456 pattern_file: &Path,
456 pattern_file: &Path,
457 root_dir: &Path,
457 root_dir: &Path,
458 inspect_pattern_bytes: &mut impl FnMut(&[u8]),
458 inspect_pattern_bytes: &mut impl FnMut(&Path, &[u8]),
459 ) -> PatternResult<(Vec<IgnorePattern>, Vec<PatternFileWarning>)> {
459 ) -> PatternResult<(Vec<IgnorePattern>, Vec<PatternFileWarning>)> {
460 let (patterns, mut warnings) =
460 let (patterns, mut warnings) =
461 read_pattern_file(pattern_file, true, inspect_pattern_bytes)?;
461 read_pattern_file(pattern_file, true, inspect_pattern_bytes)?;
462 let patterns = patterns
462 let patterns = patterns
463 .into_iter()
463 .into_iter()
464 .flat_map(|entry| -> PatternResult<_> {
464 .flat_map(|entry| -> PatternResult<_> {
465 Ok(match &entry.syntax {
465 Ok(match &entry.syntax {
466 PatternSyntax::Include => {
466 PatternSyntax::Include => {
467 let inner_include =
467 let inner_include =
468 root_dir.join(get_path_from_bytes(&entry.pattern));
468 root_dir.join(get_path_from_bytes(&entry.pattern));
469 let (inner_pats, inner_warnings) = get_patterns_from_file(
469 let (inner_pats, inner_warnings) = get_patterns_from_file(
470 &inner_include,
470 &inner_include,
471 root_dir,
471 root_dir,
472 inspect_pattern_bytes,
472 inspect_pattern_bytes,
473 )?;
473 )?;
474 warnings.extend(inner_warnings);
474 warnings.extend(inner_warnings);
475 inner_pats
475 inner_pats
476 }
476 }
477 PatternSyntax::SubInclude => {
477 PatternSyntax::SubInclude => {
478 let mut sub_include = SubInclude::new(
478 let mut sub_include = SubInclude::new(
479 &root_dir,
479 &root_dir,
480 &entry.pattern,
480 &entry.pattern,
481 &entry.source,
481 &entry.source,
482 )?;
482 )?;
483 let (inner_patterns, inner_warnings) =
483 let (inner_patterns, inner_warnings) =
484 get_patterns_from_file(
484 get_patterns_from_file(
485 &sub_include.path,
485 &sub_include.path,
486 &sub_include.root,
486 &sub_include.root,
487 inspect_pattern_bytes,
487 inspect_pattern_bytes,
488 )?;
488 )?;
489 sub_include.included_patterns = inner_patterns;
489 sub_include.included_patterns = inner_patterns;
490 warnings.extend(inner_warnings);
490 warnings.extend(inner_warnings);
491 vec![IgnorePattern {
491 vec![IgnorePattern {
492 syntax: PatternSyntax::ExpandedSubInclude(Box::new(
492 syntax: PatternSyntax::ExpandedSubInclude(Box::new(
493 sub_include,
493 sub_include,
494 )),
494 )),
495 ..entry
495 ..entry
496 }]
496 }]
497 }
497 }
498 _ => vec![entry],
498 _ => vec![entry],
499 })
499 })
500 })
500 })
501 .flatten()
501 .flatten()
502 .collect();
502 .collect();
503
503
504 Ok((patterns, warnings))
504 Ok((patterns, warnings))
505 }
505 }
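// Illustrative usage sketch, not part of the original file: expand a root
// ignore file plus everything it `include:`s or `subinclude:`s, recording
// which pattern files were actually read through the new callback.
#[allow(dead_code)]
fn example_expand_ignore(
    root: &Path,
) -> PatternResult<(Vec<IgnorePattern>, Vec<PathBuf>)> {
    let mut sources: Vec<PathBuf> = Vec::new();
    let (patterns, _warnings) = get_patterns_from_file(
        &root.join(".hgignore"),
        root,
        &mut |path: &Path, _bytes: &[u8]| sources.push(path.to_owned()),
    )?;
    // `sources` now lists `.hgignore` and every included pattern file, in the
    // order they were read.
    Ok((patterns, sources))
}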
506
506
507 /// Holds all the information needed to handle a `subinclude:` pattern.
507 /// Holds all the information needed to handle a `subinclude:` pattern.
508 #[derive(Debug, PartialEq, Eq, Clone)]
508 #[derive(Debug, PartialEq, Eq, Clone)]
509 pub struct SubInclude {
509 pub struct SubInclude {
510 /// Will be used for repository (hg) paths that start with this prefix.
510 /// Will be used for repository (hg) paths that start with this prefix.
511 /// It is relative to the current working directory, so comparing against
511 /// It is relative to the current working directory, so comparing against
512 /// repository paths is painless.
512 /// repository paths is painless.
513 pub prefix: HgPathBuf,
513 pub prefix: HgPathBuf,
514 /// The file itself, containing the patterns
514 /// The file itself, containing the patterns
515 pub path: PathBuf,
515 pub path: PathBuf,
516 /// Folder in the filesystem where this applies
516 /// Folder in the filesystem where this applies
517 pub root: PathBuf,
517 pub root: PathBuf,
518
518
519 pub included_patterns: Vec<IgnorePattern>,
519 pub included_patterns: Vec<IgnorePattern>,
520 }
520 }
521
521
522 impl SubInclude {
522 impl SubInclude {
523 pub fn new(
523 pub fn new(
524 root_dir: &Path,
524 root_dir: &Path,
525 pattern: &[u8],
525 pattern: &[u8],
526 source: &Path,
526 source: &Path,
527 ) -> Result<SubInclude, HgPathError> {
527 ) -> Result<SubInclude, HgPathError> {
528 let normalized_source =
528 let normalized_source =
529 normalize_path_bytes(&get_bytes_from_path(source));
529 normalize_path_bytes(&get_bytes_from_path(source));
530
530
531 let source_root = get_path_from_bytes(&normalized_source);
531 let source_root = get_path_from_bytes(&normalized_source);
532 let source_root =
532 let source_root =
533 source_root.parent().unwrap_or_else(|| source_root.deref());
533 source_root.parent().unwrap_or_else(|| source_root.deref());
534
534
535 let path = source_root.join(get_path_from_bytes(pattern));
535 let path = source_root.join(get_path_from_bytes(pattern));
536 let new_root = path.parent().unwrap_or_else(|| path.deref());
536 let new_root = path.parent().unwrap_or_else(|| path.deref());
537
537
538 let prefix = canonical_path(root_dir, root_dir, new_root)?;
538 let prefix = canonical_path(root_dir, root_dir, new_root)?;
539
539
540 Ok(Self {
540 Ok(Self {
541 prefix: path_to_hg_path_buf(prefix).and_then(|mut p| {
541 prefix: path_to_hg_path_buf(prefix).and_then(|mut p| {
542 if !p.is_empty() {
542 if !p.is_empty() {
543 p.push_byte(b'/');
543 p.push_byte(b'/');
544 }
544 }
545 Ok(p)
545 Ok(p)
546 })?,
546 })?,
547 path: path.to_owned(),
547 path: path.to_owned(),
548 root: new_root.to_owned(),
548 root: new_root.to_owned(),
549 included_patterns: Vec::new(),
549 included_patterns: Vec::new(),
550 })
550 })
551 }
551 }
552 }
552 }
553
553
554 /// Separate and pre-process subincludes from other patterns for the "ignore"
554 /// Separate and pre-process subincludes from other patterns for the "ignore"
555 /// phase.
555 /// phase.
556 pub fn filter_subincludes(
556 pub fn filter_subincludes(
557 ignore_patterns: Vec<IgnorePattern>,
557 ignore_patterns: Vec<IgnorePattern>,
558 ) -> Result<(Vec<Box<SubInclude>>, Vec<IgnorePattern>), HgPathError> {
558 ) -> Result<(Vec<Box<SubInclude>>, Vec<IgnorePattern>), HgPathError> {
559 let mut subincludes = vec![];
559 let mut subincludes = vec![];
560 let mut others = vec![];
560 let mut others = vec![];
561
561
562 for pattern in ignore_patterns {
562 for pattern in ignore_patterns {
563 if let PatternSyntax::ExpandedSubInclude(sub_include) = pattern.syntax
563 if let PatternSyntax::ExpandedSubInclude(sub_include) = pattern.syntax
564 {
564 {
565 subincludes.push(sub_include);
565 subincludes.push(sub_include);
566 } else {
566 } else {
567 others.push(pattern)
567 others.push(pattern)
568 }
568 }
569 }
569 }
570 Ok((subincludes, others))
570 Ok((subincludes, others))
571 }
571 }
572
572
573 #[cfg(test)]
573 #[cfg(test)]
574 mod tests {
574 mod tests {
575 use super::*;
575 use super::*;
576 use pretty_assertions::assert_eq;
576 use pretty_assertions::assert_eq;
577
577
578 #[test]
578 #[test]
579 fn escape_pattern_test() {
579 fn escape_pattern_test() {
580 let untouched =
580 let untouched =
581 br#"!"%',/0123456789:;<=>@ABCDEFGHIJKLMNOPQRSTUVWXYZ_`abcdefghijklmnopqrstuvwxyz"#;
581 br#"!"%',/0123456789:;<=>@ABCDEFGHIJKLMNOPQRSTUVWXYZ_`abcdefghijklmnopqrstuvwxyz"#;
582 assert_eq!(escape_pattern(untouched), untouched.to_vec());
582 assert_eq!(escape_pattern(untouched), untouched.to_vec());
583 // All escape codes
583 // All escape codes
584 assert_eq!(
584 assert_eq!(
585 escape_pattern(br#"()[]{}?*+-|^$\\.&~# \t\n\r\v\f"#),
585 escape_pattern(br#"()[]{}?*+-|^$\\.&~# \t\n\r\v\f"#),
586 br#"\(\)\[\]\{\}\?\*\+\-\|\^\$\\\\\.\&\~\#\ \\t\\n\\r\\v\\f"#
586 br#"\(\)\[\]\{\}\?\*\+\-\|\^\$\\\\\.\&\~\#\ \\t\\n\\r\\v\\f"#
587 .to_vec()
587 .to_vec()
588 );
588 );
589 }
589 }
590
590
591 #[test]
591 #[test]
592 fn glob_test() {
592 fn glob_test() {
593 assert_eq!(glob_to_re(br#"?"#), br#"."#);
593 assert_eq!(glob_to_re(br#"?"#), br#"."#);
594 assert_eq!(glob_to_re(br#"*"#), br#"[^/]*"#);
594 assert_eq!(glob_to_re(br#"*"#), br#"[^/]*"#);
595 assert_eq!(glob_to_re(br#"**"#), br#".*"#);
595 assert_eq!(glob_to_re(br#"**"#), br#".*"#);
596 assert_eq!(glob_to_re(br#"**/a"#), br#"(?:.*/)?a"#);
596 assert_eq!(glob_to_re(br#"**/a"#), br#"(?:.*/)?a"#);
597 assert_eq!(glob_to_re(br#"a/**/b"#), br#"a/(?:.*/)?b"#);
597 assert_eq!(glob_to_re(br#"a/**/b"#), br#"a/(?:.*/)?b"#);
598 assert_eq!(glob_to_re(br#"[a*?!^][^b][!c]"#), br#"[a*?!^][\^b][^c]"#);
598 assert_eq!(glob_to_re(br#"[a*?!^][^b][!c]"#), br#"[a*?!^][\^b][^c]"#);
599 assert_eq!(glob_to_re(br#"{a,b}"#), br#"(?:a|b)"#);
599 assert_eq!(glob_to_re(br#"{a,b}"#), br#"(?:a|b)"#);
600 assert_eq!(glob_to_re(br#".\*\?"#), br#"\.\*\?"#);
600 assert_eq!(glob_to_re(br#".\*\?"#), br#"\.\*\?"#);
601 }
601 }
602
602
603 #[test]
603 #[test]
604 fn test_parse_pattern_file_contents() {
604 fn test_parse_pattern_file_contents() {
605 let lines = b"syntax: glob\n*.elc";
605 let lines = b"syntax: glob\n*.elc";
606
606
607 assert_eq!(
607 assert_eq!(
608 parse_pattern_file_contents(
608 parse_pattern_file_contents(
609 lines,
609 lines,
610 Path::new("file_path"),
610 Path::new("file_path"),
611 None,
611 None,
612 false
612 false
613 )
613 )
614 .unwrap()
614 .unwrap()
615 .0,
615 .0,
616 vec![IgnorePattern::new(
616 vec![IgnorePattern::new(
617 PatternSyntax::RelGlob,
617 PatternSyntax::RelGlob,
618 b"*.elc",
618 b"*.elc",
619 Path::new("file_path")
619 Path::new("file_path")
620 )],
620 )],
621 );
621 );
622
622
623 let lines = b"syntax: include\nsyntax: glob";
623 let lines = b"syntax: include\nsyntax: glob";
624
624
625 assert_eq!(
625 assert_eq!(
626 parse_pattern_file_contents(
626 parse_pattern_file_contents(
627 lines,
627 lines,
628 Path::new("file_path"),
628 Path::new("file_path"),
629 None,
629 None,
630 false
630 false
631 )
631 )
632 .unwrap()
632 .unwrap()
633 .0,
633 .0,
634 vec![]
634 vec![]
635 );
635 );
636 let lines = b"glob:**.o";
636 let lines = b"glob:**.o";
637 assert_eq!(
637 assert_eq!(
638 parse_pattern_file_contents(
638 parse_pattern_file_contents(
639 lines,
639 lines,
640 Path::new("file_path"),
640 Path::new("file_path"),
641 None,
641 None,
642 false
642 false
643 )
643 )
644 .unwrap()
644 .unwrap()
645 .0,
645 .0,
646 vec![IgnorePattern::new(
646 vec![IgnorePattern::new(
647 PatternSyntax::RelGlob,
647 PatternSyntax::RelGlob,
648 b"**.o",
648 b"**.o",
649 Path::new("file_path")
649 Path::new("file_path")
650 )]
650 )]
651 );
651 );
652 }
652 }
653
653
654 #[test]
654 #[test]
655 fn test_build_single_regex() {
655 fn test_build_single_regex() {
656 assert_eq!(
656 assert_eq!(
657 build_single_regex(&IgnorePattern::new(
657 build_single_regex(&IgnorePattern::new(
658 PatternSyntax::RelGlob,
658 PatternSyntax::RelGlob,
659 b"rust/target/",
659 b"rust/target/",
660 Path::new("")
660 Path::new("")
661 ))
661 ))
662 .unwrap(),
662 .unwrap(),
663 Some(br"(?:.*/)?rust/target(?:/|$)".to_vec()),
663 Some(br"(?:.*/)?rust/target(?:/|$)".to_vec()),
664 );
664 );
665 assert_eq!(
665 assert_eq!(
666 build_single_regex(&IgnorePattern::new(
666 build_single_regex(&IgnorePattern::new(
667 PatternSyntax::Regexp,
667 PatternSyntax::Regexp,
668 br"rust/target/\d+",
668 br"rust/target/\d+",
669 Path::new("")
669 Path::new("")
670 ))
670 ))
671 .unwrap(),
671 .unwrap(),
672 Some(br"rust/target/\d+".to_vec()),
672 Some(br"rust/target/\d+".to_vec()),
673 );
673 );
674 }
674 }
675
675
676 #[test]
676 #[test]
677 fn test_build_single_regex_shortcut() {
677 fn test_build_single_regex_shortcut() {
678 assert_eq!(
678 assert_eq!(
679 build_single_regex(&IgnorePattern::new(
679 build_single_regex(&IgnorePattern::new(
680 PatternSyntax::RootGlob,
680 PatternSyntax::RootGlob,
681 b"",
681 b"",
682 Path::new("")
682 Path::new("")
683 ))
683 ))
684 .unwrap(),
684 .unwrap(),
685 None,
685 None,
686 );
686 );
687 assert_eq!(
687 assert_eq!(
688 build_single_regex(&IgnorePattern::new(
688 build_single_regex(&IgnorePattern::new(
689 PatternSyntax::RootGlob,
689 PatternSyntax::RootGlob,
690 b"whatever",
690 b"whatever",
691 Path::new("")
691 Path::new("")
692 ))
692 ))
693 .unwrap(),
693 .unwrap(),
694 None,
694 None,
695 );
695 );
696 assert_eq!(
696 assert_eq!(
697 build_single_regex(&IgnorePattern::new(
697 build_single_regex(&IgnorePattern::new(
698 PatternSyntax::RootGlob,
698 PatternSyntax::RootGlob,
699 b"*.o",
699 b"*.o",
700 Path::new("")
700 Path::new("")
701 ))
701 ))
702 .unwrap(),
702 .unwrap(),
703 Some(br"[^/]*\.o(?:/|$)".to_vec()),
703 Some(br"[^/]*\.o(?:/|$)".to_vec()),
704 );
704 );
705 }
705 }
706 }
706 }
@@ -1,1688 +1,1688
1 // matchers.rs
1 // matchers.rs
2 //
2 //
3 // Copyright 2019 Raphaël Gomès <rgomes@octobus.net>
3 // Copyright 2019 Raphaël Gomès <rgomes@octobus.net>
4 //
4 //
5 // This software may be used and distributed according to the terms of the
5 // This software may be used and distributed according to the terms of the
6 // GNU General Public License version 2 or any later version.
6 // GNU General Public License version 2 or any later version.
7
7
8 //! Structs and types for matching files and directories.
8 //! Structs and types for matching files and directories.
9
9
10 use crate::{
10 use crate::{
11 dirstate::dirs_multiset::DirsChildrenMultiset,
11 dirstate::dirs_multiset::DirsChildrenMultiset,
12 filepatterns::{
12 filepatterns::{
13 build_single_regex, filter_subincludes, get_patterns_from_file,
13 build_single_regex, filter_subincludes, get_patterns_from_file,
14 PatternFileWarning, PatternResult,
14 PatternFileWarning, PatternResult,
15 },
15 },
16 utils::{
16 utils::{
17 files::find_dirs,
17 files::find_dirs,
18 hg_path::{HgPath, HgPathBuf},
18 hg_path::{HgPath, HgPathBuf},
19 Escaped,
19 Escaped,
20 },
20 },
21 DirsMultiset, DirstateMapError, FastHashMap, IgnorePattern, PatternError,
21 DirsMultiset, DirstateMapError, FastHashMap, IgnorePattern, PatternError,
22 PatternSyntax,
22 PatternSyntax,
23 };
23 };
24
24
25 use crate::dirstate::status::IgnoreFnType;
25 use crate::dirstate::status::IgnoreFnType;
26 use crate::filepatterns::normalize_path_bytes;
26 use crate::filepatterns::normalize_path_bytes;
27 use std::borrow::ToOwned;
27 use std::borrow::ToOwned;
28 use std::collections::HashSet;
28 use std::collections::HashSet;
29 use std::fmt::{Display, Error, Formatter};
29 use std::fmt::{Display, Error, Formatter};
30 use std::iter::FromIterator;
30 use std::iter::FromIterator;
31 use std::ops::Deref;
31 use std::ops::Deref;
32 use std::path::{Path, PathBuf};
32 use std::path::{Path, PathBuf};
33
33
34 use micro_timer::timed;
34 use micro_timer::timed;
35
35
36 #[derive(Debug, PartialEq)]
36 #[derive(Debug, PartialEq)]
37 pub enum VisitChildrenSet {
37 pub enum VisitChildrenSet {
38 /// Don't visit anything
38 /// Don't visit anything
39 Empty,
39 Empty,
40 /// Only visit this directory
40 /// Only visit this directory
41 This,
41 This,
42 /// Visit this directory and these subdirectories
42 /// Visit this directory and these subdirectories
43 /// TODO Should we implement a `NonEmptyHashSet`?
43 /// TODO Should we implement a `NonEmptyHashSet`?
44 Set(HashSet<HgPathBuf>),
44 Set(HashSet<HgPathBuf>),
45 /// Visit this directory and all subdirectories
45 /// Visit this directory and all subdirectories
46 Recursive,
46 Recursive,
47 }
47 }
48
48
49 pub trait Matcher: core::fmt::Debug {
49 pub trait Matcher: core::fmt::Debug {
50 /// Explicitly listed files
50 /// Explicitly listed files
51 fn file_set(&self) -> Option<&HashSet<HgPathBuf>>;
51 fn file_set(&self) -> Option<&HashSet<HgPathBuf>>;
52 /// Returns whether `filename` is in `file_set`
52 /// Returns whether `filename` is in `file_set`
53 fn exact_match(&self, filename: &HgPath) -> bool;
53 fn exact_match(&self, filename: &HgPath) -> bool;
54 /// Returns whether `filename` is matched by this matcher
54 /// Returns whether `filename` is matched by this matcher
55 fn matches(&self, filename: &HgPath) -> bool;
55 fn matches(&self, filename: &HgPath) -> bool;
56 /// Decides whether a directory should be visited based on whether it
56 /// Decides whether a directory should be visited based on whether it
57 /// has potential matches in it or one of its subdirectories, and
57 /// has potential matches in it or one of its subdirectories, and
58 /// potentially lists which subdirectories of that directory should be
58 /// potentially lists which subdirectories of that directory should be
59 /// visited. This is based on the match's primary, included, and excluded
59 /// visited. This is based on the match's primary, included, and excluded
60 /// patterns.
60 /// patterns.
61 ///
61 ///
62 /// # Example
62 /// # Example
63 ///
63 ///
64 /// Assuming matchers `['path:foo/bar', 'rootfilesin:qux']`, we would
64 /// Assuming matchers `['path:foo/bar', 'rootfilesin:qux']`, we would
65 /// return the following values (assuming the implementation of
65 /// return the following values (assuming the implementation of
66 /// visit_children_set is capable of recognizing this; some implementations
66 /// visit_children_set is capable of recognizing this; some implementations
67 /// are not).
67 /// are not).
68 ///
68 ///
69 /// ```text
69 /// ```text
71 /// '' -> {'foo', 'qux'}
71 /// '' -> {'foo', 'qux'}
72 /// 'baz' -> set()
72 /// 'baz' -> set()
73 /// 'foo' -> {'bar'}
73 /// 'foo' -> {'bar'}
74 /// // Ideally this would be `Recursive`, but since the prefix nature of
74 /// // Ideally this would be `Recursive`, but since the prefix nature of
75 /// // matchers is applied to the entire matcher, we have to downgrade this
75 /// // matchers is applied to the entire matcher, we have to downgrade this
76 /// // to `This` due to the (yet to be implemented in Rust) non-prefix
76 /// // to `This` due to the (yet to be implemented in Rust) non-prefix
77 /// // `RootFilesIn`-kind matcher being mixed in.
77 /// // `RootFilesIn`-kind matcher being mixed in.
78 /// 'foo/bar' -> 'this'
78 /// 'foo/bar' -> 'this'
79 /// 'qux' -> 'this'
79 /// 'qux' -> 'this'
80 /// ```
80 /// ```
81 /// # Important
81 /// # Important
82 ///
82 ///
83 /// Most matchers do not know if they're representing files or
83 /// Most matchers do not know if they're representing files or
84 /// directories. They see `['path:dir/f']` and don't know whether `f` is a
84 /// directories. They see `['path:dir/f']` and don't know whether `f` is a
85 /// file or a directory, so `visit_children_set('dir')` for most matchers
85 /// file or a directory, so `visit_children_set('dir')` for most matchers
86 /// will return `HashSet{ HgPath { "f" } }`, but if the matcher knows it's
86 /// will return `HashSet{ HgPath { "f" } }`, but if the matcher knows it's
87 /// a file (like the yet to be implemented in Rust `ExactMatcher` does),
87 /// a file (like the yet to be implemented in Rust `ExactMatcher` does),
88 /// it may return `VisitChildrenSet::This`.
88 /// it may return `VisitChildrenSet::This`.
89 /// Do not rely on the return being a `HashSet` indicating that there are
89 /// Do not rely on the return being a `HashSet` indicating that there are
90 /// no files in this dir to investigate (or equivalently that if there are
90 /// no files in this dir to investigate (or equivalently that if there are
91 /// files to investigate in 'dir' that it will always return
91 /// files to investigate in 'dir' that it will always return
92 /// `VisitChildrenSet::This`).
92 /// `VisitChildrenSet::This`).
93 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet;
93 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet;
94 /// Matcher will match everything and `file_set()` will be empty:
94 /// Matcher will match everything and `file_set()` will be empty:
95 /// optimization might be possible.
95 /// optimization might be possible.
96 fn matches_everything(&self) -> bool;
96 fn matches_everything(&self) -> bool;
97 /// Matcher will match exactly the files in `file_set()`: optimization
97 /// Matcher will match exactly the files in `file_set()`: optimization
98 /// might be possible.
98 /// might be possible.
99 fn is_exact(&self) -> bool;
99 fn is_exact(&self) -> bool;
100 }
100 }
101
101
102 /// Matches everything.
102 /// Matches everything.
103 ///```
103 ///```
104 /// use hg::{ matchers::{Matcher, AlwaysMatcher}, utils::hg_path::HgPath };
104 /// use hg::{ matchers::{Matcher, AlwaysMatcher}, utils::hg_path::HgPath };
105 ///
105 ///
106 /// let matcher = AlwaysMatcher;
106 /// let matcher = AlwaysMatcher;
107 ///
107 ///
108 /// assert_eq!(matcher.matches(HgPath::new(b"whatever")), true);
108 /// assert_eq!(matcher.matches(HgPath::new(b"whatever")), true);
109 /// assert_eq!(matcher.matches(HgPath::new(b"b.txt")), true);
109 /// assert_eq!(matcher.matches(HgPath::new(b"b.txt")), true);
110 /// assert_eq!(matcher.matches(HgPath::new(b"main.c")), true);
110 /// assert_eq!(matcher.matches(HgPath::new(b"main.c")), true);
111 /// assert_eq!(matcher.matches(HgPath::new(br"re:.*\.c$")), true);
111 /// assert_eq!(matcher.matches(HgPath::new(br"re:.*\.c$")), true);
112 /// ```
112 /// ```
113 #[derive(Debug)]
113 #[derive(Debug)]
114 pub struct AlwaysMatcher;
114 pub struct AlwaysMatcher;
115
115
116 impl Matcher for AlwaysMatcher {
116 impl Matcher for AlwaysMatcher {
117 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
117 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
118 None
118 None
119 }
119 }
120 fn exact_match(&self, _filename: &HgPath) -> bool {
120 fn exact_match(&self, _filename: &HgPath) -> bool {
121 false
121 false
122 }
122 }
123 fn matches(&self, _filename: &HgPath) -> bool {
123 fn matches(&self, _filename: &HgPath) -> bool {
124 true
124 true
125 }
125 }
126 fn visit_children_set(&self, _directory: &HgPath) -> VisitChildrenSet {
126 fn visit_children_set(&self, _directory: &HgPath) -> VisitChildrenSet {
127 VisitChildrenSet::Recursive
127 VisitChildrenSet::Recursive
128 }
128 }
129 fn matches_everything(&self) -> bool {
129 fn matches_everything(&self) -> bool {
130 true
130 true
131 }
131 }
132 fn is_exact(&self) -> bool {
132 fn is_exact(&self) -> bool {
133 false
133 false
134 }
134 }
135 }
135 }
136
136
137 /// Matches nothing.
137 /// Matches nothing.
138 #[derive(Debug)]
138 #[derive(Debug)]
139 pub struct NeverMatcher;
139 pub struct NeverMatcher;
140
140
141 impl Matcher for NeverMatcher {
141 impl Matcher for NeverMatcher {
142 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
142 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
143 None
143 None
144 }
144 }
145 fn exact_match(&self, _filename: &HgPath) -> bool {
145 fn exact_match(&self, _filename: &HgPath) -> bool {
146 false
146 false
147 }
147 }
148 fn matches(&self, _filename: &HgPath) -> bool {
148 fn matches(&self, _filename: &HgPath) -> bool {
149 false
149 false
150 }
150 }
151 fn visit_children_set(&self, _directory: &HgPath) -> VisitChildrenSet {
151 fn visit_children_set(&self, _directory: &HgPath) -> VisitChildrenSet {
152 VisitChildrenSet::Empty
152 VisitChildrenSet::Empty
153 }
153 }
154 fn matches_everything(&self) -> bool {
154 fn matches_everything(&self) -> bool {
155 false
155 false
156 }
156 }
157 fn is_exact(&self) -> bool {
157 fn is_exact(&self) -> bool {
158 true
158 true
159 }
159 }
160 }
160 }
161
161
162 /// Matches the input files exactly. They are interpreted as paths, not
162 /// Matches the input files exactly. They are interpreted as paths, not
163 /// patterns.
163 /// patterns.
164 ///
164 ///
165 ///```
165 ///```
166 /// use hg::{ matchers::{Matcher, FileMatcher}, utils::hg_path::{HgPath, HgPathBuf} };
166 /// use hg::{ matchers::{Matcher, FileMatcher}, utils::hg_path::{HgPath, HgPathBuf} };
167 ///
167 ///
168 /// let files = vec![HgPathBuf::from_bytes(b"a.txt"), HgPathBuf::from_bytes(br"re:.*\.c$")];
168 /// let files = vec![HgPathBuf::from_bytes(b"a.txt"), HgPathBuf::from_bytes(br"re:.*\.c$")];
169 /// let matcher = FileMatcher::new(files).unwrap();
169 /// let matcher = FileMatcher::new(files).unwrap();
170 ///
170 ///
171 /// assert_eq!(matcher.matches(HgPath::new(b"a.txt")), true);
171 /// assert_eq!(matcher.matches(HgPath::new(b"a.txt")), true);
172 /// assert_eq!(matcher.matches(HgPath::new(b"b.txt")), false);
172 /// assert_eq!(matcher.matches(HgPath::new(b"b.txt")), false);
173 /// assert_eq!(matcher.matches(HgPath::new(b"main.c")), false);
173 /// assert_eq!(matcher.matches(HgPath::new(b"main.c")), false);
174 /// assert_eq!(matcher.matches(HgPath::new(br"re:.*\.c$")), true);
174 /// assert_eq!(matcher.matches(HgPath::new(br"re:.*\.c$")), true);
175 /// ```
175 /// ```
176 #[derive(Debug)]
176 #[derive(Debug)]
177 pub struct FileMatcher {
177 pub struct FileMatcher {
178 files: HashSet<HgPathBuf>,
178 files: HashSet<HgPathBuf>,
179 dirs: DirsMultiset,
179 dirs: DirsMultiset,
180 }
180 }
181
181
182 impl FileMatcher {
182 impl FileMatcher {
183 pub fn new(files: Vec<HgPathBuf>) -> Result<Self, DirstateMapError> {
183 pub fn new(files: Vec<HgPathBuf>) -> Result<Self, DirstateMapError> {
184 let dirs = DirsMultiset::from_manifest(&files)?;
184 let dirs = DirsMultiset::from_manifest(&files)?;
185 Ok(Self {
185 Ok(Self {
186 files: HashSet::from_iter(files.into_iter()),
186 files: HashSet::from_iter(files.into_iter()),
187 dirs,
187 dirs,
188 })
188 })
189 }
189 }
190 fn inner_matches(&self, filename: &HgPath) -> bool {
190 fn inner_matches(&self, filename: &HgPath) -> bool {
191 self.files.contains(filename.as_ref())
191 self.files.contains(filename.as_ref())
192 }
192 }
193 }
193 }
194
194
195 impl Matcher for FileMatcher {
195 impl Matcher for FileMatcher {
196 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
196 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
197 Some(&self.files)
197 Some(&self.files)
198 }
198 }
199 fn exact_match(&self, filename: &HgPath) -> bool {
199 fn exact_match(&self, filename: &HgPath) -> bool {
200 self.inner_matches(filename)
200 self.inner_matches(filename)
201 }
201 }
202 fn matches(&self, filename: &HgPath) -> bool {
202 fn matches(&self, filename: &HgPath) -> bool {
203 self.inner_matches(filename)
203 self.inner_matches(filename)
204 }
204 }
205 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
205 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
206 if self.files.is_empty() || !self.dirs.contains(&directory) {
206 if self.files.is_empty() || !self.dirs.contains(&directory) {
207 return VisitChildrenSet::Empty;
207 return VisitChildrenSet::Empty;
208 }
208 }
209 let mut candidates: HashSet<HgPathBuf> =
209 let mut candidates: HashSet<HgPathBuf> =
210 self.dirs.iter().cloned().collect();
210 self.dirs.iter().cloned().collect();
211
211
212 candidates.extend(self.files.iter().cloned());
212 candidates.extend(self.files.iter().cloned());
213 candidates.remove(HgPath::new(b""));
213 candidates.remove(HgPath::new(b""));
214
214
215 if !directory.as_ref().is_empty() {
215 if !directory.as_ref().is_empty() {
216 let directory = [directory.as_ref().as_bytes(), b"/"].concat();
216 let directory = [directory.as_ref().as_bytes(), b"/"].concat();
217 candidates = candidates
217 candidates = candidates
218 .iter()
218 .iter()
219 .filter_map(|c| {
219 .filter_map(|c| {
220 if c.as_bytes().starts_with(&directory) {
220 if c.as_bytes().starts_with(&directory) {
221 Some(HgPathBuf::from_bytes(
221 Some(HgPathBuf::from_bytes(
222 &c.as_bytes()[directory.len()..],
222 &c.as_bytes()[directory.len()..],
223 ))
223 ))
224 } else {
224 } else {
225 None
225 None
226 }
226 }
227 })
227 })
228 .collect();
228 .collect();
229 }
229 }
230
230
231 // `self.dirs` includes all of the directories, recursively, so if
231 // `self.dirs` includes all of the directories, recursively, so if
232 // we're attempting to match 'foo/bar/baz.txt', it'll have '', 'foo',
232 // we're attempting to match 'foo/bar/baz.txt', it'll have '', 'foo',
233 // 'foo/bar' in it. Thus we can safely ignore a candidate that has a
233 // 'foo/bar' in it. Thus we can safely ignore a candidate that has a
234 // '/' in it, indicating it's for a subdir-of-a-subdir; the immediate
234 // '/' in it, indicating it's for a subdir-of-a-subdir; the immediate
235 // subdir will be in there without a slash.
235 // subdir will be in there without a slash.
236 VisitChildrenSet::Set(
236 VisitChildrenSet::Set(
237 candidates
237 candidates
238 .into_iter()
238 .into_iter()
239 .filter_map(|c| {
239 .filter_map(|c| {
240 if c.bytes().all(|b| *b != b'/') {
240 if c.bytes().all(|b| *b != b'/') {
241 Some(c)
241 Some(c)
242 } else {
242 } else {
243 None
243 None
244 }
244 }
245 })
245 })
246 .collect(),
246 .collect(),
247 )
247 )
248 }
248 }
249 fn matches_everything(&self) -> bool {
249 fn matches_everything(&self) -> bool {
250 false
250 false
251 }
251 }
252 fn is_exact(&self) -> bool {
252 fn is_exact(&self) -> bool {
253 true
253 true
254 }
254 }
255 }
255 }
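// Illustrative check, not part of the original file: `visit_children_set`
// narrows traversal to immediate children, so for the single file
// "dir/subdir/f.txt", visiting "dir" only suggests "subdir".
#[cfg(test)]
#[test]
fn file_matcher_visit_children_set_example() {
    let files = vec![HgPathBuf::from_bytes(b"dir/subdir/f.txt")];
    let matcher = FileMatcher::new(files).unwrap();
    let expected: HashSet<HgPathBuf> =
        [HgPathBuf::from_bytes(b"subdir")].iter().cloned().collect();
    assert_eq!(
        matcher.visit_children_set(HgPath::new(b"dir")),
        VisitChildrenSet::Set(expected)
    );
}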
256
256
257 /// Matches files that are included in the ignore rules.
257 /// Matches files that are included in the ignore rules.
258 /// ```
258 /// ```
259 /// use hg::{
259 /// use hg::{
260 /// matchers::{IncludeMatcher, Matcher},
260 /// matchers::{IncludeMatcher, Matcher},
261 /// IgnorePattern,
261 /// IgnorePattern,
262 /// PatternSyntax,
262 /// PatternSyntax,
263 /// utils::hg_path::HgPath
263 /// utils::hg_path::HgPath
264 /// };
264 /// };
265 /// use std::path::Path;
265 /// use std::path::Path;
266 ///
266 ///
267 /// let ignore_patterns =
267 /// let ignore_patterns =
268 /// vec![IgnorePattern::new(PatternSyntax::RootGlob, b"this*", Path::new(""))];
268 /// vec![IgnorePattern::new(PatternSyntax::RootGlob, b"this*", Path::new(""))];
269 /// let matcher = IncludeMatcher::new(ignore_patterns).unwrap();
269 /// let matcher = IncludeMatcher::new(ignore_patterns).unwrap();
270 ///
270 ///
271 /// assert_eq!(matcher.matches(HgPath::new(b"testing")), false);
271 /// assert_eq!(matcher.matches(HgPath::new(b"testing")), false);
272 /// assert_eq!(matcher.matches(HgPath::new(b"this should work")), true);
272 /// assert_eq!(matcher.matches(HgPath::new(b"this should work")), true);
273 /// assert_eq!(matcher.matches(HgPath::new(b"this also")), true);
273 /// assert_eq!(matcher.matches(HgPath::new(b"this also")), true);
274 /// assert_eq!(matcher.matches(HgPath::new(b"but not this")), false);
274 /// assert_eq!(matcher.matches(HgPath::new(b"but not this")), false);
275 /// ```
275 /// ```
276 pub struct IncludeMatcher<'a> {
276 pub struct IncludeMatcher<'a> {
277 patterns: Vec<u8>,
277 patterns: Vec<u8>,
278 match_fn: IgnoreFnType<'a>,
278 match_fn: IgnoreFnType<'a>,
279 /// Whether all the patterns match a prefix (i.e. recursively)
279 /// Whether all the patterns match a prefix (i.e. recursively)
280 prefix: bool,
280 prefix: bool,
281 roots: HashSet<HgPathBuf>,
281 roots: HashSet<HgPathBuf>,
282 dirs: HashSet<HgPathBuf>,
282 dirs: HashSet<HgPathBuf>,
283 parents: HashSet<HgPathBuf>,
283 parents: HashSet<HgPathBuf>,
284 }
284 }
285
285
286 impl core::fmt::Debug for IncludeMatcher<'_> {
286 impl core::fmt::Debug for IncludeMatcher<'_> {
287 fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
287 fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
288 f.debug_struct("IncludeMatcher")
288 f.debug_struct("IncludeMatcher")
289 .field("patterns", &String::from_utf8_lossy(&self.patterns))
289 .field("patterns", &String::from_utf8_lossy(&self.patterns))
290 .field("prefix", &self.prefix)
290 .field("prefix", &self.prefix)
291 .field("roots", &self.roots)
291 .field("roots", &self.roots)
292 .field("dirs", &self.dirs)
292 .field("dirs", &self.dirs)
293 .field("parents", &self.parents)
293 .field("parents", &self.parents)
294 .finish()
294 .finish()
295 }
295 }
296 }
296 }
297
297
298 impl<'a> Matcher for IncludeMatcher<'a> {
298 impl<'a> Matcher for IncludeMatcher<'a> {
299 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
299 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
300 None
300 None
301 }
301 }
302
302
303 fn exact_match(&self, _filename: &HgPath) -> bool {
303 fn exact_match(&self, _filename: &HgPath) -> bool {
304 false
304 false
305 }
305 }
306
306
307 fn matches(&self, filename: &HgPath) -> bool {
307 fn matches(&self, filename: &HgPath) -> bool {
308 (self.match_fn)(filename.as_ref())
308 (self.match_fn)(filename.as_ref())
309 }
309 }
310
310
311 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
311 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
312 let dir = directory.as_ref();
312 let dir = directory.as_ref();
313 if self.prefix && self.roots.contains(dir) {
313 if self.prefix && self.roots.contains(dir) {
314 return VisitChildrenSet::Recursive;
314 return VisitChildrenSet::Recursive;
315 }
315 }
316 if self.roots.contains(HgPath::new(b""))
316 if self.roots.contains(HgPath::new(b""))
317 || self.roots.contains(dir)
317 || self.roots.contains(dir)
318 || self.dirs.contains(dir)
318 || self.dirs.contains(dir)
319 || find_dirs(dir).any(|parent_dir| self.roots.contains(parent_dir))
319 || find_dirs(dir).any(|parent_dir| self.roots.contains(parent_dir))
320 {
320 {
321 return VisitChildrenSet::This;
321 return VisitChildrenSet::This;
322 }
322 }
323
323
324 if self.parents.contains(directory.as_ref()) {
324 if self.parents.contains(directory.as_ref()) {
325 let multiset = self.get_all_parents_children();
325 let multiset = self.get_all_parents_children();
326 if let Some(children) = multiset.get(dir) {
326 if let Some(children) = multiset.get(dir) {
327 return VisitChildrenSet::Set(
327 return VisitChildrenSet::Set(
328 children.into_iter().map(HgPathBuf::from).collect(),
328 children.into_iter().map(HgPathBuf::from).collect(),
329 );
329 );
330 }
330 }
331 }
331 }
332 VisitChildrenSet::Empty
332 VisitChildrenSet::Empty
333 }
333 }
334
334
335 fn matches_everything(&self) -> bool {
335 fn matches_everything(&self) -> bool {
336 false
336 false
337 }
337 }
338
338
339 fn is_exact(&self) -> bool {
339 fn is_exact(&self) -> bool {
340 false
340 false
341 }
341 }
342 }
342 }
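// Illustrative sketch (editor's example, not from the upstream file): how a
// traversal such as the status walk might interpret `visit_children_set`.
// `_sketch_should_descend` is a hypothetical helper used nowhere else; it only
// spells out the pruning semantics of each variant using items from this
// module.
#[cfg(test)]
fn _sketch_should_descend(
    matcher: &impl Matcher,
    directory: &HgPath,
    child: &HgPath,
) -> bool {
    match matcher.visit_children_set(directory) {
        // Nothing interesting anywhere below `directory`: prune the subtree.
        VisitChildrenSet::Empty => false,
        // Everything below `directory` is relevant: descend unconditionally.
        VisitChildrenSet::Recursive => true,
        // `directory` itself needs inspection, so look at every child.
        VisitChildrenSet::This => true,
        // Only the named children are worth descending into.
        VisitChildrenSet::Set(children) => {
            children.contains(&child.to_owned())
        }
    }
}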
343
343
344 /// The union of multiple matchers. Will match if any of the matchers match.
344 /// The union of multiple matchers. Will match if any of the matchers match.
345 #[derive(Debug)]
345 #[derive(Debug)]
346 pub struct UnionMatcher {
346 pub struct UnionMatcher {
347 matchers: Vec<Box<dyn Matcher + Sync>>,
347 matchers: Vec<Box<dyn Matcher + Sync>>,
348 }
348 }
349
349
350 impl Matcher for UnionMatcher {
350 impl Matcher for UnionMatcher {
351 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
351 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
352 None
352 None
353 }
353 }
354
354
355 fn exact_match(&self, _filename: &HgPath) -> bool {
355 fn exact_match(&self, _filename: &HgPath) -> bool {
356 false
356 false
357 }
357 }
358
358
359 fn matches(&self, filename: &HgPath) -> bool {
359 fn matches(&self, filename: &HgPath) -> bool {
360 self.matchers.iter().any(|m| m.matches(filename))
360 self.matchers.iter().any(|m| m.matches(filename))
361 }
361 }
362
362
363 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
363 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
364 let mut result = HashSet::new();
364 let mut result = HashSet::new();
365 let mut this = false;
365 let mut this = false;
366 for matcher in self.matchers.iter() {
366 for matcher in self.matchers.iter() {
367 let visit = matcher.visit_children_set(directory);
367 let visit = matcher.visit_children_set(directory);
368 match visit {
368 match visit {
369 VisitChildrenSet::Empty => continue,
369 VisitChildrenSet::Empty => continue,
370 VisitChildrenSet::This => {
370 VisitChildrenSet::This => {
371 this = true;
371 this = true;
372 // Don't break, we might have an 'all' in here.
372 // Don't break, we might have an 'all' in here.
373 continue;
373 continue;
374 }
374 }
375 VisitChildrenSet::Set(set) => {
375 VisitChildrenSet::Set(set) => {
376 result.extend(set);
376 result.extend(set);
377 }
377 }
378 VisitChildrenSet::Recursive => {
378 VisitChildrenSet::Recursive => {
379 return visit;
379 return visit;
380 }
380 }
381 }
381 }
382 }
382 }
383 if this {
383 if this {
384 return VisitChildrenSet::This;
384 return VisitChildrenSet::This;
385 }
385 }
386 if result.is_empty() {
386 if result.is_empty() {
387 VisitChildrenSet::Empty
387 VisitChildrenSet::Empty
388 } else {
388 } else {
389 VisitChildrenSet::Set(result)
389 VisitChildrenSet::Set(result)
390 }
390 }
391 }
391 }
392
392
393 fn matches_everything(&self) -> bool {
393 fn matches_everything(&self) -> bool {
394 // TODO Maybe if all are AlwaysMatcher?
394 // TODO Maybe if all are AlwaysMatcher?
395 false
395 false
396 }
396 }
397
397
398 fn is_exact(&self) -> bool {
398 fn is_exact(&self) -> bool {
399 false
399 false
400 }
400 }
401 }
401 }
402
402
403 impl UnionMatcher {
403 impl UnionMatcher {
404 pub fn new(matchers: Vec<Box<dyn Matcher + Sync>>) -> Self {
404 pub fn new(matchers: Vec<Box<dyn Matcher + Sync>>) -> Self {
405 Self { matchers }
405 Self { matchers }
406 }
406 }
407 }
407 }
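// Illustrative sketch (editor's example, not from the upstream file): a
// `UnionMatcher` matches as soon as any of its sub-matchers does, shown here
// with two single-pattern `IncludeMatcher`s. The concrete paths are made up
// for illustration.
#[cfg(test)]
#[test]
fn sketch_union_matcher_matches_any() {
    let m1 = IncludeMatcher::new(vec![IgnorePattern::new(
        PatternSyntax::RelPath,
        b"dir/subdir",
        Path::new(""),
    )])
    .unwrap();
    let m2 = IncludeMatcher::new(vec![IgnorePattern::new(
        PatternSyntax::RelPath,
        b"other",
        Path::new(""),
    )])
    .unwrap();
    let matcher = UnionMatcher::new(vec![Box::new(m1), Box::new(m2)]);
    assert!(matcher.matches(HgPath::new(b"dir/subdir/file.txt")));
    assert!(matcher.matches(HgPath::new(b"other/file.txt")));
    assert!(!matcher.matches(HgPath::new(b"folder/file.txt")));
}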
408
408
409 #[derive(Debug)]
409 #[derive(Debug)]
410 pub struct IntersectionMatcher {
410 pub struct IntersectionMatcher {
411 m1: Box<dyn Matcher + Sync>,
411 m1: Box<dyn Matcher + Sync>,
412 m2: Box<dyn Matcher + Sync>,
412 m2: Box<dyn Matcher + Sync>,
413 files: Option<HashSet<HgPathBuf>>,
413 files: Option<HashSet<HgPathBuf>>,
414 }
414 }
415
415
416 impl Matcher for IntersectionMatcher {
416 impl Matcher for IntersectionMatcher {
417 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
417 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
418 self.files.as_ref()
418 self.files.as_ref()
419 }
419 }
420
420
421 fn exact_match(&self, filename: &HgPath) -> bool {
421 fn exact_match(&self, filename: &HgPath) -> bool {
422 self.files.as_ref().map_or(false, |f| f.contains(filename))
422 self.files.as_ref().map_or(false, |f| f.contains(filename))
423 }
423 }
424
424
425 fn matches(&self, filename: &HgPath) -> bool {
425 fn matches(&self, filename: &HgPath) -> bool {
426 self.m1.matches(filename) && self.m2.matches(filename)
426 self.m1.matches(filename) && self.m2.matches(filename)
427 }
427 }
428
428
429 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
429 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
430 let m1_set = self.m1.visit_children_set(directory);
430 let m1_set = self.m1.visit_children_set(directory);
431 if m1_set == VisitChildrenSet::Empty {
431 if m1_set == VisitChildrenSet::Empty {
432 return VisitChildrenSet::Empty;
432 return VisitChildrenSet::Empty;
433 }
433 }
434 let m2_set = self.m2.visit_children_set(directory);
434 let m2_set = self.m2.visit_children_set(directory);
435 if m2_set == VisitChildrenSet::Empty {
435 if m2_set == VisitChildrenSet::Empty {
436 return VisitChildrenSet::Empty;
436 return VisitChildrenSet::Empty;
437 }
437 }
438
438
439 if m1_set == VisitChildrenSet::Recursive {
439 if m1_set == VisitChildrenSet::Recursive {
440 return m2_set;
440 return m2_set;
441 } else if m2_set == VisitChildrenSet::Recursive {
441 } else if m2_set == VisitChildrenSet::Recursive {
442 return m1_set;
442 return m1_set;
443 }
443 }
444
444
445 match (&m1_set, &m2_set) {
445 match (&m1_set, &m2_set) {
446 (VisitChildrenSet::Recursive, _) => m2_set,
446 (VisitChildrenSet::Recursive, _) => m2_set,
447 (_, VisitChildrenSet::Recursive) => m1_set,
447 (_, VisitChildrenSet::Recursive) => m1_set,
448 (VisitChildrenSet::This, _) | (_, VisitChildrenSet::This) => {
448 (VisitChildrenSet::This, _) | (_, VisitChildrenSet::This) => {
449 VisitChildrenSet::This
449 VisitChildrenSet::This
450 }
450 }
451 (VisitChildrenSet::Set(m1), VisitChildrenSet::Set(m2)) => {
451 (VisitChildrenSet::Set(m1), VisitChildrenSet::Set(m2)) => {
452 let set: HashSet<_> = m1.intersection(&m2).cloned().collect();
452 let set: HashSet<_> = m1.intersection(&m2).cloned().collect();
453 if set.is_empty() {
453 if set.is_empty() {
454 VisitChildrenSet::Empty
454 VisitChildrenSet::Empty
455 } else {
455 } else {
456 VisitChildrenSet::Set(set)
456 VisitChildrenSet::Set(set)
457 }
457 }
458 }
458 }
459 _ => unreachable!(),
459 _ => unreachable!(),
460 }
460 }
461 }
461 }
462
462
463 fn matches_everything(&self) -> bool {
463 fn matches_everything(&self) -> bool {
464 self.m1.matches_everything() && self.m2.matches_everything()
464 self.m1.matches_everything() && self.m2.matches_everything()
465 }
465 }
466
466
467 fn is_exact(&self) -> bool {
467 fn is_exact(&self) -> bool {
468 self.m1.is_exact() || self.m2.is_exact()
468 self.m1.is_exact() || self.m2.is_exact()
469 }
469 }
470 }
470 }
471
471
472 impl IntersectionMatcher {
472 impl IntersectionMatcher {
473 pub fn new(
473 pub fn new(
474 mut m1: Box<dyn Matcher + Sync>,
474 mut m1: Box<dyn Matcher + Sync>,
475 mut m2: Box<dyn Matcher + Sync>,
475 mut m2: Box<dyn Matcher + Sync>,
476 ) -> Self {
476 ) -> Self {
477 let files = if m1.is_exact() || m2.is_exact() {
477 let files = if m1.is_exact() || m2.is_exact() {
478 if !m1.is_exact() {
478 if !m1.is_exact() {
479 std::mem::swap(&mut m1, &mut m2);
479 std::mem::swap(&mut m1, &mut m2);
480 }
480 }
481 m1.file_set().map(|m1_files| {
481 m1.file_set().map(|m1_files| {
482 m1_files.iter().cloned().filter(|f| m2.matches(f)).collect()
482 m1_files.iter().cloned().filter(|f| m2.matches(f)).collect()
483 })
483 })
484 } else {
484 } else {
485 None
485 None
486 };
486 };
487 Self { m1, m2, files }
487 Self { m1, m2, files }
488 }
488 }
489 }
489 }
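// Illustrative sketch (editor's example, not from the upstream file): an
// `IntersectionMatcher` only matches paths accepted by *both* operands, here
// a `RelPath` include restricted further by a glob on the file name.
#[cfg(test)]
#[test]
fn sketch_intersection_matcher_matches_both() {
    let m1 = Box::new(
        IncludeMatcher::new(vec![IgnorePattern::new(
            PatternSyntax::RelPath,
            b"dir",
            Path::new(""),
        )])
        .unwrap(),
    );
    let m2 = Box::new(
        IncludeMatcher::new(vec![IgnorePattern::new(
            PatternSyntax::Glob,
            b"dir/*.rs",
            Path::new(""),
        )])
        .unwrap(),
    );
    let matcher = IntersectionMatcher::new(m1, m2);
    assert!(matcher.matches(HgPath::new(b"dir/lib.rs")));
    assert!(!matcher.matches(HgPath::new(b"dir/README.txt")));
}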
490
490
491 #[derive(Debug)]
491 #[derive(Debug)]
492 pub struct DifferenceMatcher {
492 pub struct DifferenceMatcher {
493 base: Box<dyn Matcher + Sync>,
493 base: Box<dyn Matcher + Sync>,
494 excluded: Box<dyn Matcher + Sync>,
494 excluded: Box<dyn Matcher + Sync>,
495 files: Option<HashSet<HgPathBuf>>,
495 files: Option<HashSet<HgPathBuf>>,
496 }
496 }
497
497
498 impl Matcher for DifferenceMatcher {
498 impl Matcher for DifferenceMatcher {
499 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
499 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
500 self.files.as_ref()
500 self.files.as_ref()
501 }
501 }
502
502
503 fn exact_match(&self, filename: &HgPath) -> bool {
503 fn exact_match(&self, filename: &HgPath) -> bool {
504 self.files.as_ref().map_or(false, |f| f.contains(filename))
504 self.files.as_ref().map_or(false, |f| f.contains(filename))
505 }
505 }
506
506
507 fn matches(&self, filename: &HgPath) -> bool {
507 fn matches(&self, filename: &HgPath) -> bool {
508 self.base.matches(filename) && !self.excluded.matches(filename)
508 self.base.matches(filename) && !self.excluded.matches(filename)
509 }
509 }
510
510
511 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
511 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
512 let excluded_set = self.excluded.visit_children_set(directory);
512 let excluded_set = self.excluded.visit_children_set(directory);
513 if excluded_set == VisitChildrenSet::Recursive {
513 if excluded_set == VisitChildrenSet::Recursive {
514 return VisitChildrenSet::Empty;
514 return VisitChildrenSet::Empty;
515 }
515 }
516 let base_set = self.base.visit_children_set(directory);
516 let base_set = self.base.visit_children_set(directory);
517 // Possible values for base: 'recursive', 'this', set(...), set()
517 // Possible values for base: 'recursive', 'this', set(...), set()
518 // Possible values for excluded: 'this', set(...), set()
518 // Possible values for excluded: 'this', set(...), set()
519 // If excluded has nothing under here that we care about, return base,
519 // If excluded has nothing under here that we care about, return base,
520 // even if it's 'recursive'.
520 // even if it's 'recursive'.
521 if excluded_set == VisitChildrenSet::Empty {
521 if excluded_set == VisitChildrenSet::Empty {
522 return base_set;
522 return base_set;
523 }
523 }
524 match base_set {
524 match base_set {
525 VisitChildrenSet::This | VisitChildrenSet::Recursive => {
525 VisitChildrenSet::This | VisitChildrenSet::Recursive => {
526 // Never return 'recursive' here if excluded_set is any kind of
526 // Never return 'recursive' here if excluded_set is any kind of
527 // non-empty (either 'this' or set(foo)), since excluded might
527 // non-empty (either 'this' or set(foo)), since excluded might
528 // return set() for a subdirectory.
528 // return set() for a subdirectory.
529 VisitChildrenSet::This
529 VisitChildrenSet::This
530 }
530 }
531 set => {
531 set => {
532 // Possible values for base: set(...), set()
532 // Possible values for base: set(...), set()
533 // Possible values for excluded: 'this', set(...)
533 // Possible values for excluded: 'this', set(...)
534 // We ignore excluded set results. They're possibly incorrect:
534 // We ignore excluded set results. They're possibly incorrect:
535 // base = path:dir/subdir
535 // base = path:dir/subdir
536 // excluded=rootfilesin:dir,
536 // excluded=rootfilesin:dir,
537 // visit_children_set(''):
537 // visit_children_set(''):
538 // base returns {'dir'}, excluded returns {'dir'}, if we
538 // base returns {'dir'}, excluded returns {'dir'}, if we
539 // subtracted we'd return set(), which is *not* correct, we
539 // subtracted we'd return set(), which is *not* correct, we
540 // still need to visit 'dir'!
540 // still need to visit 'dir'!
541 set
541 set
542 }
542 }
543 }
543 }
544 }
544 }
545
545
546 fn matches_everything(&self) -> bool {
546 fn matches_everything(&self) -> bool {
547 false
547 false
548 }
548 }
549
549
550 fn is_exact(&self) -> bool {
550 fn is_exact(&self) -> bool {
551 self.base.is_exact()
551 self.base.is_exact()
552 }
552 }
553 }
553 }
554
554
555 impl DifferenceMatcher {
555 impl DifferenceMatcher {
556 pub fn new(
556 pub fn new(
557 base: Box<dyn Matcher + Sync>,
557 base: Box<dyn Matcher + Sync>,
558 excluded: Box<dyn Matcher + Sync>,
558 excluded: Box<dyn Matcher + Sync>,
559 ) -> Self {
559 ) -> Self {
560 let base_is_exact = base.is_exact();
560 let base_is_exact = base.is_exact();
561 let base_files = base.file_set().map(ToOwned::to_owned);
561 let base_files = base.file_set().map(ToOwned::to_owned);
562 let mut new = Self {
562 let mut new = Self {
563 base,
563 base,
564 excluded,
564 excluded,
565 files: None,
565 files: None,
566 };
566 };
567 if base_is_exact {
567 if base_is_exact {
568 new.files = base_files.map(|files| {
568 new.files = base_files.map(|files| {
569 files.iter().cloned().filter(|f| new.matches(f)).collect()
569 files.iter().cloned().filter(|f| new.matches(f)).collect()
570 });
570 });
571 }
571 }
572 new
572 new
573 }
573 }
574 }
574 }
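// Illustrative sketch (editor's example, not from the upstream file): a
// `DifferenceMatcher` keeps whatever `base` matches except what `excluded`
// matches, i.e. "dir minus dir/subdir" below.
#[cfg(test)]
#[test]
fn sketch_difference_matcher_excludes() {
    let base = Box::new(
        IncludeMatcher::new(vec![IgnorePattern::new(
            PatternSyntax::RelPath,
            b"dir",
            Path::new(""),
        )])
        .unwrap(),
    );
    let excluded = Box::new(
        IncludeMatcher::new(vec![IgnorePattern::new(
            PatternSyntax::RelPath,
            b"dir/subdir",
            Path::new(""),
        )])
        .unwrap(),
    );
    let matcher = DifferenceMatcher::new(base, excluded);
    assert!(matcher.matches(HgPath::new(b"dir/file.txt")));
    assert!(!matcher.matches(HgPath::new(b"dir/subdir/file.txt")));
}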
575
575
576 /// Returns a function that matches an `HgPath` against the given regex
576 /// Returns a function that matches an `HgPath` against the given regex
577 /// pattern.
577 /// pattern.
578 ///
578 ///
579 /// This can fail when the pattern is invalid or not supported by the
579 /// This can fail when the pattern is invalid or not supported by the
580 /// underlying engine (the `regex` crate), for instance anything with
580 /// underlying engine (the `regex` crate), for instance anything with
581 /// back-references.
581 /// back-references.
582 #[timed]
582 #[timed]
583 fn re_matcher(
583 fn re_matcher(
584 pattern: &[u8],
584 pattern: &[u8],
585 ) -> PatternResult<impl Fn(&HgPath) -> bool + Sync> {
585 ) -> PatternResult<impl Fn(&HgPath) -> bool + Sync> {
586 use std::io::Write;
586 use std::io::Write;
587
587
588 // The `regex` crate adds `.*` to the start and end of expressions if there
588 // The `regex` crate adds `.*` to the start and end of expressions if there
589 // are no anchors, so add the start anchor.
589 // are no anchors, so add the start anchor.
590 let mut escaped_bytes = vec![b'^', b'(', b'?', b':'];
590 let mut escaped_bytes = vec![b'^', b'(', b'?', b':'];
591 for byte in pattern {
591 for byte in pattern {
592 if *byte > 127 {
592 if *byte > 127 {
593 write!(escaped_bytes, "\\x{:x}", *byte).unwrap();
593 write!(escaped_bytes, "\\x{:x}", *byte).unwrap();
594 } else {
594 } else {
595 escaped_bytes.push(*byte);
595 escaped_bytes.push(*byte);
596 }
596 }
597 }
597 }
598 escaped_bytes.push(b')');
598 escaped_bytes.push(b')');
599
599
600 // Avoid the cost of UTF8 checking
600 // Avoid the cost of UTF8 checking
601 //
601 //
602 // # Safety
602 // # Safety
603 // This is safe because we escaped all non-ASCII bytes.
603 // This is safe because we escaped all non-ASCII bytes.
604 let pattern_string = unsafe { String::from_utf8_unchecked(escaped_bytes) };
604 let pattern_string = unsafe { String::from_utf8_unchecked(escaped_bytes) };
605 let re = regex::bytes::RegexBuilder::new(&pattern_string)
605 let re = regex::bytes::RegexBuilder::new(&pattern_string)
606 .unicode(false)
606 .unicode(false)
607 // Big repos with big `.hgignore` files will hit the default DFA size
607 // Big repos with big `.hgignore` files will hit the default DFA size
608 // limit and incur a significant performance hit; one such repo's
608 // limit and incur a significant performance hit; one such repo's
609 // `hg status` took multiple *minutes*.
609 // `hg status` took multiple *minutes*.
610 .dfa_size_limit(50 * (1 << 20))
610 .dfa_size_limit(50 * (1 << 20))
611 .build()
611 .build()
612 .map_err(|e| PatternError::UnsupportedSyntax(e.to_string()))?;
612 .map_err(|e| PatternError::UnsupportedSyntax(e.to_string()))?;
613
613
614 Ok(move |path: &HgPath| re.is_match(path.as_bytes()))
614 Ok(move |path: &HgPath| re.is_match(path.as_bytes()))
615 }
615 }
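// Illustrative sketch (editor's example, not from the upstream file): because
// bytes above 127 are hex-escaped before the pattern is handed to the `regex`
// crate, a pattern containing raw UTF-8 bytes still compiles and matches
// byte-for-byte.
#[cfg(test)]
#[test]
fn sketch_re_matcher_non_ascii_bytes() {
    // b"caf\xc3\xa9" is "café" encoded as UTF-8.
    let match_fn = re_matcher(b"caf\xc3\xa9").unwrap();
    assert!(match_fn(HgPath::new(b"caf\xc3\xa9")));
    assert!(!match_fn(HgPath::new(b"cafe")));
}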
616
616
617 /// Returns the regex pattern and a function that matches an `HgPath` against
617 /// Returns the regex pattern and a function that matches an `HgPath` against
618 /// said regex formed by the given ignore patterns.
618 /// said regex formed by the given ignore patterns.
619 fn build_regex_match<'a, 'b>(
619 fn build_regex_match<'a, 'b>(
620 ignore_patterns: &'a [IgnorePattern],
620 ignore_patterns: &'a [IgnorePattern],
621 ) -> PatternResult<(Vec<u8>, IgnoreFnType<'b>)> {
621 ) -> PatternResult<(Vec<u8>, IgnoreFnType<'b>)> {
622 let mut regexps = vec![];
622 let mut regexps = vec![];
623 let mut exact_set = HashSet::new();
623 let mut exact_set = HashSet::new();
624
624
625 for pattern in ignore_patterns {
625 for pattern in ignore_patterns {
626 if let Some(re) = build_single_regex(pattern)? {
626 if let Some(re) = build_single_regex(pattern)? {
627 regexps.push(re);
627 regexps.push(re);
628 } else {
628 } else {
629 let exact = normalize_path_bytes(&pattern.pattern);
629 let exact = normalize_path_bytes(&pattern.pattern);
630 exact_set.insert(HgPathBuf::from_bytes(&exact));
630 exact_set.insert(HgPathBuf::from_bytes(&exact));
631 }
631 }
632 }
632 }
633
633
634 let full_regex = regexps.join(&b'|');
634 let full_regex = regexps.join(&b'|');
635
635
636 // An empty pattern would cause the regex engine to incorrectly match the
636 // An empty pattern would cause the regex engine to incorrectly match the
637 // (empty) root directory
637 // (empty) root directory
638 let func = if !regexps.is_empty() {
638 let func = if !regexps.is_empty() {
639 let matcher = re_matcher(&full_regex)?;
639 let matcher = re_matcher(&full_regex)?;
640 let func = move |filename: &HgPath| {
640 let func = move |filename: &HgPath| {
641 exact_set.contains(filename) || matcher(filename)
641 exact_set.contains(filename) || matcher(filename)
642 };
642 };
643 Box::new(func) as IgnoreFnType
643 Box::new(func) as IgnoreFnType
644 } else {
644 } else {
645 let func = move |filename: &HgPath| exact_set.contains(filename);
645 let func = move |filename: &HgPath| exact_set.contains(filename);
646 Box::new(func) as IgnoreFnType
646 Box::new(func) as IgnoreFnType
647 };
647 };
648
648
649 Ok((full_regex, func))
649 Ok((full_regex, func))
650 }
650 }
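// Illustrative sketch (editor's example, not from the upstream file),
// assuming `build_single_regex` keeps returning `None` for literal `RootGlob`
// patterns (which is what the `exact_set` branch above exists for): such
// patterns are matched by set lookup and contribute nothing to the combined
// regex.
#[cfg(test)]
#[test]
fn sketch_build_regex_match_exact_set() {
    let (regex, match_fn) = build_regex_match(&[IgnorePattern::new(
        PatternSyntax::RootGlob,
        b"literal/path.txt",
        Path::new(""),
    )])
    .unwrap();
    // No regex is needed for a purely literal pattern.
    assert!(regex.is_empty());
    assert!(match_fn(HgPath::new(b"literal/path.txt")));
    assert!(!match_fn(HgPath::new(b"literal/other.txt")));
}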
651
651
652 /// Returns roots and directories corresponding to each pattern.
652 /// Returns roots and directories corresponding to each pattern.
653 ///
653 ///
654 /// This calculates the roots and directories exactly matching the patterns and
654 /// This calculates the roots and directories exactly matching the patterns and
655 /// returns a tuple of (roots, dirs). It does not return other directories
655 /// returns a tuple of (roots, dirs). It does not return other directories
656 /// which may also need to be considered, like the parent directories.
656 /// which may also need to be considered, like the parent directories.
657 fn roots_and_dirs(
657 fn roots_and_dirs(
658 ignore_patterns: &[IgnorePattern],
658 ignore_patterns: &[IgnorePattern],
659 ) -> (Vec<HgPathBuf>, Vec<HgPathBuf>) {
659 ) -> (Vec<HgPathBuf>, Vec<HgPathBuf>) {
660 let mut roots = Vec::new();
660 let mut roots = Vec::new();
661 let mut dirs = Vec::new();
661 let mut dirs = Vec::new();
662
662
663 for ignore_pattern in ignore_patterns {
663 for ignore_pattern in ignore_patterns {
664 let IgnorePattern {
664 let IgnorePattern {
665 syntax, pattern, ..
665 syntax, pattern, ..
666 } = ignore_pattern;
666 } = ignore_pattern;
667 match syntax {
667 match syntax {
668 PatternSyntax::RootGlob | PatternSyntax::Glob => {
668 PatternSyntax::RootGlob | PatternSyntax::Glob => {
669 let mut root = HgPathBuf::new();
669 let mut root = HgPathBuf::new();
670 for p in pattern.split(|c| *c == b'/') {
670 for p in pattern.split(|c| *c == b'/') {
671 if p.iter().any(|c| match *c {
671 if p.iter().any(|c| match *c {
672 b'[' | b'{' | b'*' | b'?' => true,
672 b'[' | b'{' | b'*' | b'?' => true,
673 _ => false,
673 _ => false,
674 }) {
674 }) {
675 break;
675 break;
676 }
676 }
677 root.push(HgPathBuf::from_bytes(p).as_ref());
677 root.push(HgPathBuf::from_bytes(p).as_ref());
678 }
678 }
679 roots.push(root);
679 roots.push(root);
680 }
680 }
681 PatternSyntax::Path | PatternSyntax::RelPath => {
681 PatternSyntax::Path | PatternSyntax::RelPath => {
682 let pat = HgPath::new(if pattern == b"." {
682 let pat = HgPath::new(if pattern == b"." {
683 &[] as &[u8]
683 &[] as &[u8]
684 } else {
684 } else {
685 pattern
685 pattern
686 });
686 });
687 roots.push(pat.to_owned());
687 roots.push(pat.to_owned());
688 }
688 }
689 PatternSyntax::RootFiles => {
689 PatternSyntax::RootFiles => {
690 let pat = if pattern == b"." {
690 let pat = if pattern == b"." {
691 &[] as &[u8]
691 &[] as &[u8]
692 } else {
692 } else {
693 pattern
693 pattern
694 };
694 };
695 dirs.push(HgPathBuf::from_bytes(pat));
695 dirs.push(HgPathBuf::from_bytes(pat));
696 }
696 }
697 _ => {
697 _ => {
698 roots.push(HgPathBuf::new());
698 roots.push(HgPathBuf::new());
699 }
699 }
700 }
700 }
701 }
701 }
702 (roots, dirs)
702 (roots, dirs)
703 }
703 }
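// Illustrative sketch (editor's example, not from the upstream file): unlike
// globs and paths, a `rootfilesin:` pattern contributes to `dirs` (matched
// non-recursively) rather than to `roots`.
#[cfg(test)]
#[test]
fn sketch_roots_and_dirs_rootfilesin() {
    let pats = vec![IgnorePattern::new(
        PatternSyntax::RootFiles,
        b"some/dir",
        Path::new(""),
    )];
    let (roots, dirs) = roots_and_dirs(&pats);
    assert!(roots.is_empty());
    assert_eq!(dirs, vec![HgPathBuf::from_bytes(b"some/dir")]);
}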
704
704
705 /// Paths extracted from patterns
705 /// Paths extracted from patterns
706 #[derive(Debug, PartialEq)]
706 #[derive(Debug, PartialEq)]
707 struct RootsDirsAndParents {
707 struct RootsDirsAndParents {
708 /// Directories to match recursively
708 /// Directories to match recursively
709 pub roots: HashSet<HgPathBuf>,
709 pub roots: HashSet<HgPathBuf>,
710 /// Directories to match non-recursively
710 /// Directories to match non-recursively
711 pub dirs: HashSet<HgPathBuf>,
711 pub dirs: HashSet<HgPathBuf>,
712 /// Directories implicitly required in order to reach items in either roots or dirs
712 /// Directories implicitly required in order to reach items in either roots or dirs
713 pub parents: HashSet<HgPathBuf>,
713 pub parents: HashSet<HgPathBuf>,
714 }
714 }
715
715
716 /// Extract roots, dirs and parents from patterns.
716 /// Extract roots, dirs and parents from patterns.
717 fn roots_dirs_and_parents(
717 fn roots_dirs_and_parents(
718 ignore_patterns: &[IgnorePattern],
718 ignore_patterns: &[IgnorePattern],
719 ) -> PatternResult<RootsDirsAndParents> {
719 ) -> PatternResult<RootsDirsAndParents> {
720 let (roots, dirs) = roots_and_dirs(ignore_patterns);
720 let (roots, dirs) = roots_and_dirs(ignore_patterns);
721
721
722 let mut parents = HashSet::new();
722 let mut parents = HashSet::new();
723
723
724 parents.extend(
724 parents.extend(
725 DirsMultiset::from_manifest(&dirs)
725 DirsMultiset::from_manifest(&dirs)
726 .map_err(|e| match e {
726 .map_err(|e| match e {
727 DirstateMapError::InvalidPath(e) => e,
727 DirstateMapError::InvalidPath(e) => e,
728 _ => unreachable!(),
728 _ => unreachable!(),
729 })?
729 })?
730 .iter()
730 .iter()
731 .map(ToOwned::to_owned),
731 .map(ToOwned::to_owned),
732 );
732 );
733 parents.extend(
733 parents.extend(
734 DirsMultiset::from_manifest(&roots)
734 DirsMultiset::from_manifest(&roots)
735 .map_err(|e| match e {
735 .map_err(|e| match e {
736 DirstateMapError::InvalidPath(e) => e,
736 DirstateMapError::InvalidPath(e) => e,
737 _ => unreachable!(),
737 _ => unreachable!(),
738 })?
738 })?
739 .iter()
739 .iter()
740 .map(ToOwned::to_owned),
740 .map(ToOwned::to_owned),
741 );
741 );
742
742
743 Ok(RootsDirsAndParents {
743 Ok(RootsDirsAndParents {
744 roots: HashSet::from_iter(roots),
744 roots: HashSet::from_iter(roots),
745 dirs: HashSet::from_iter(dirs),
745 dirs: HashSet::from_iter(dirs),
746 parents,
746 parents,
747 })
747 })
748 }
748 }
749
749
750 /// Returns a function that checks whether a given file (in the general sense)
750 /// Returns a function that checks whether a given file (in the general sense)
751 /// should be matched.
751 /// should be matched.
752 fn build_match<'a, 'b>(
752 fn build_match<'a, 'b>(
753 ignore_patterns: Vec<IgnorePattern>,
753 ignore_patterns: Vec<IgnorePattern>,
754 ) -> PatternResult<(Vec<u8>, IgnoreFnType<'b>)> {
754 ) -> PatternResult<(Vec<u8>, IgnoreFnType<'b>)> {
755 let mut match_funcs: Vec<IgnoreFnType<'b>> = vec![];
755 let mut match_funcs: Vec<IgnoreFnType<'b>> = vec![];
756 // For debugging and printing
756 // For debugging and printing
757 let mut patterns = vec![];
757 let mut patterns = vec![];
758
758
759 let (subincludes, ignore_patterns) = filter_subincludes(ignore_patterns)?;
759 let (subincludes, ignore_patterns) = filter_subincludes(ignore_patterns)?;
760
760
761 if !subincludes.is_empty() {
761 if !subincludes.is_empty() {
762 // Build prefix-based matcher functions for subincludes
762 // Build prefix-based matcher functions for subincludes
763 let mut submatchers = FastHashMap::default();
763 let mut submatchers = FastHashMap::default();
764 let mut prefixes = vec![];
764 let mut prefixes = vec![];
765
765
766 for sub_include in subincludes {
766 for sub_include in subincludes {
767 let matcher = IncludeMatcher::new(sub_include.included_patterns)?;
767 let matcher = IncludeMatcher::new(sub_include.included_patterns)?;
768 let match_fn =
768 let match_fn =
769 Box::new(move |path: &HgPath| matcher.matches(path));
769 Box::new(move |path: &HgPath| matcher.matches(path));
770 prefixes.push(sub_include.prefix.clone());
770 prefixes.push(sub_include.prefix.clone());
771 submatchers.insert(sub_include.prefix.clone(), match_fn);
771 submatchers.insert(sub_include.prefix.clone(), match_fn);
772 }
772 }
773
773
774 let match_subinclude = move |filename: &HgPath| {
774 let match_subinclude = move |filename: &HgPath| {
775 for prefix in prefixes.iter() {
775 for prefix in prefixes.iter() {
776 if let Some(rel) = filename.relative_to(prefix) {
776 if let Some(rel) = filename.relative_to(prefix) {
777 if (submatchers[prefix])(rel) {
777 if (submatchers[prefix])(rel) {
778 return true;
778 return true;
779 }
779 }
780 }
780 }
781 }
781 }
782 false
782 false
783 };
783 };
784
784
785 match_funcs.push(Box::new(match_subinclude));
785 match_funcs.push(Box::new(match_subinclude));
786 }
786 }
787
787
788 if !ignore_patterns.is_empty() {
788 if !ignore_patterns.is_empty() {
789 // Either do dumb matching if all patterns are rootfiles, or match
789 // Either do dumb matching if all patterns are rootfiles, or match
790 // with a regex.
790 // with a regex.
791 if ignore_patterns
791 if ignore_patterns
792 .iter()
792 .iter()
793 .all(|k| k.syntax == PatternSyntax::RootFiles)
793 .all(|k| k.syntax == PatternSyntax::RootFiles)
794 {
794 {
795 let dirs: HashSet<_> = ignore_patterns
795 let dirs: HashSet<_> = ignore_patterns
796 .iter()
796 .iter()
797 .map(|k| k.pattern.to_owned())
797 .map(|k| k.pattern.to_owned())
798 .collect();
798 .collect();
799 let mut dirs_vec: Vec<_> = dirs.iter().cloned().collect();
799 let mut dirs_vec: Vec<_> = dirs.iter().cloned().collect();
800
800
801 let match_func = move |path: &HgPath| -> bool {
801 let match_func = move |path: &HgPath| -> bool {
802 let path = path.as_bytes();
802 let path = path.as_bytes();
803 let i = path.iter().rposition(|a| *a == b'/');
803 let i = path.iter().rposition(|a| *a == b'/');
804 let dir = if let Some(i) = i {
804 let dir = if let Some(i) = i {
805 &path[..i]
805 &path[..i]
806 } else {
806 } else {
807 b"."
807 b"."
808 };
808 };
809 dirs.contains(dir.deref())
809 dirs.contains(dir.deref())
810 };
810 };
811 match_funcs.push(Box::new(match_func));
811 match_funcs.push(Box::new(match_func));
812
812
813 patterns.extend(b"rootfilesin: ");
813 patterns.extend(b"rootfilesin: ");
814 dirs_vec.sort();
814 dirs_vec.sort();
815 patterns.extend(dirs_vec.escaped_bytes());
815 patterns.extend(dirs_vec.escaped_bytes());
816 } else {
816 } else {
817 let (new_re, match_func) = build_regex_match(&ignore_patterns)?;
817 let (new_re, match_func) = build_regex_match(&ignore_patterns)?;
818 patterns = new_re;
818 patterns = new_re;
819 match_funcs.push(match_func)
819 match_funcs.push(match_func)
820 }
820 }
821 }
821 }
822
822
823 Ok(if match_funcs.len() == 1 {
823 Ok(if match_funcs.len() == 1 {
824 (patterns, match_funcs.remove(0))
824 (patterns, match_funcs.remove(0))
825 } else {
825 } else {
826 (
826 (
827 patterns,
827 patterns,
828 Box::new(move |f: &HgPath| -> bool {
828 Box::new(move |f: &HgPath| -> bool {
829 match_funcs.iter().any(|match_func| match_func(f))
829 match_funcs.iter().any(|match_func| match_func(f))
830 }),
830 }),
831 )
831 )
832 })
832 })
833 }
833 }
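// Illustrative sketch (editor's example, not from the upstream file): when
// every pattern is `rootfilesin:`, `build_match` takes the directory-lookup
// shortcut above instead of compiling a regex, so only files sitting directly
// in one of the listed directories match.
#[cfg(test)]
#[test]
fn sketch_build_match_rootfilesin_only() {
    let (_patterns, match_fn) = build_match(vec![IgnorePattern::new(
        PatternSyntax::RootFiles,
        b"dir",
        Path::new(""),
    )])
    .unwrap();
    assert!(match_fn(HgPath::new(b"dir/file.txt")));
    assert!(!match_fn(HgPath::new(b"dir/subdir/file.txt")));
    assert!(!match_fn(HgPath::new(b"file.txt")));
}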
834
834
835 /// Parses all "ignore" files with their recursive includes and returns an
835 /// Parses all "ignore" files with their recursive includes and returns an
836 /// `IncludeMatcher` that tells whether a given file (in the general sense)
836 /// `IncludeMatcher` that tells whether a given file (in the general sense)
837 /// should be ignored.
837 /// should be ignored.
838 pub fn get_ignore_matcher<'a>(
838 pub fn get_ignore_matcher<'a>(
839 mut all_pattern_files: Vec<PathBuf>,
839 mut all_pattern_files: Vec<PathBuf>,
840 root_dir: &Path,
840 root_dir: &Path,
841 inspect_pattern_bytes: &mut impl FnMut(&[u8]),
841 inspect_pattern_bytes: &mut impl FnMut(&Path, &[u8]),
842 ) -> PatternResult<(IncludeMatcher<'a>, Vec<PatternFileWarning>)> {
842 ) -> PatternResult<(IncludeMatcher<'a>, Vec<PatternFileWarning>)> {
843 let mut all_patterns = vec![];
843 let mut all_patterns = vec![];
844 let mut all_warnings = vec![];
844 let mut all_warnings = vec![];
845
845
846 // Sort to make the ordering of calls to `inspect_pattern_bytes`
846 // Sort to make the ordering of calls to `inspect_pattern_bytes`
847 // deterministic even if the ordering of `all_pattern_files` is not (such
847 // deterministic even if the ordering of `all_pattern_files` is not (such
848 // as when the iteration order of a Python dict or Rust HashMap is involved).
848 // as when the iteration order of a Python dict or Rust HashMap is involved).
849 // Sort by "string" representation instead of the default ordering by
849 // Sort by "string" representation instead of the default ordering by
850 // component (which uses a Rust-specific definition of a component).
850 // component (which uses a Rust-specific definition of a component).
851 all_pattern_files
851 all_pattern_files
852 .sort_unstable_by(|a, b| a.as_os_str().cmp(b.as_os_str()));
852 .sort_unstable_by(|a, b| a.as_os_str().cmp(b.as_os_str()));
853
853
854 for pattern_file in &all_pattern_files {
854 for pattern_file in &all_pattern_files {
855 let (patterns, warnings) = get_patterns_from_file(
855 let (patterns, warnings) = get_patterns_from_file(
856 pattern_file,
856 pattern_file,
857 root_dir,
857 root_dir,
858 inspect_pattern_bytes,
858 inspect_pattern_bytes,
859 )?;
859 )?;
860
860
861 all_patterns.extend(patterns.to_owned());
861 all_patterns.extend(patterns.to_owned());
862 all_warnings.extend(warnings);
862 all_warnings.extend(warnings);
863 }
863 }
864 let matcher = IncludeMatcher::new(all_patterns)?;
864 let matcher = IncludeMatcher::new(all_patterns)?;
865 Ok((matcher, all_warnings))
865 Ok((matcher, all_warnings))
866 }
866 }
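// Illustrative sketch (editor's example, not from the upstream file): with the
// new callback signature introduced by this changeset, callers can mix the
// *source* of each pattern file (its path) into whatever digest they keep, not
// just the pattern bytes. The real consumer lives in the dirstate-v2/status
// code; a std `DefaultHasher` stands in for the actual hash here, and the
// function name is hypothetical.
#[cfg(test)]
fn _sketch_hash_ignore_sources(
    pattern_files: Vec<PathBuf>,
    root_dir: &Path,
) -> PatternResult<u64> {
    use std::hash::{Hash, Hasher};
    let mut hasher = std::collections::hash_map::DefaultHasher::new();
    let (_matcher, _warnings) = get_ignore_matcher(
        pattern_files,
        root_dir,
        &mut |source: &Path, bytes: &[u8]| {
            // Hash where the patterns came from as well as their contents.
            source.hash(&mut hasher);
            bytes.hash(&mut hasher);
        },
    )?;
    Ok(hasher.finish())
}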
867
867
868 /// Parses all "ignore" files with their recursive includes and returns a
868 /// Parses all "ignore" files with their recursive includes and returns a
869 /// function that checks whether a given file (in the general sense) should be
869 /// function that checks whether a given file (in the general sense) should be
870 /// ignored.
870 /// ignored.
871 pub fn get_ignore_function<'a>(
871 pub fn get_ignore_function<'a>(
872 all_pattern_files: Vec<PathBuf>,
872 all_pattern_files: Vec<PathBuf>,
873 root_dir: &Path,
873 root_dir: &Path,
874 inspect_pattern_bytes: &mut impl FnMut(&[u8]),
874 inspect_pattern_bytes: &mut impl FnMut(&Path, &[u8]),
875 ) -> PatternResult<(IgnoreFnType<'a>, Vec<PatternFileWarning>)> {
875 ) -> PatternResult<(IgnoreFnType<'a>, Vec<PatternFileWarning>)> {
876 let res =
876 let res =
877 get_ignore_matcher(all_pattern_files, root_dir, inspect_pattern_bytes);
877 get_ignore_matcher(all_pattern_files, root_dir, inspect_pattern_bytes);
878 res.map(|(matcher, all_warnings)| {
878 res.map(|(matcher, all_warnings)| {
879 let res: IgnoreFnType<'a> =
879 let res: IgnoreFnType<'a> =
880 Box::new(move |path: &HgPath| matcher.matches(path));
880 Box::new(move |path: &HgPath| matcher.matches(path));
881
881
882 (res, all_warnings)
882 (res, all_warnings)
883 })
883 })
884 }
884 }
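// Illustrative sketch (editor's example, not from the upstream file): the
// function form is simply the matcher wrapped into a boxed predicate, which is
// convenient for callers that only ever ask "is this path ignored?". The
// helper name is hypothetical.
#[cfg(test)]
fn _sketch_is_ignored(
    pattern_files: Vec<PathBuf>,
    root_dir: &Path,
    path: &HgPath,
) -> PatternResult<bool> {
    let (is_ignored, _warnings) = get_ignore_function(
        pattern_files,
        root_dir,
        // Ignore the pattern bytes; only the resulting predicate is wanted.
        &mut |_source: &Path, _bytes: &[u8]| {},
    )?;
    Ok(is_ignored(path))
}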
885
885
886 impl<'a> IncludeMatcher<'a> {
886 impl<'a> IncludeMatcher<'a> {
887 pub fn new(ignore_patterns: Vec<IgnorePattern>) -> PatternResult<Self> {
887 pub fn new(ignore_patterns: Vec<IgnorePattern>) -> PatternResult<Self> {
888 let RootsDirsAndParents {
888 let RootsDirsAndParents {
889 roots,
889 roots,
890 dirs,
890 dirs,
891 parents,
891 parents,
892 } = roots_dirs_and_parents(&ignore_patterns)?;
892 } = roots_dirs_and_parents(&ignore_patterns)?;
893 let prefix = ignore_patterns.iter().all(|k| match k.syntax {
893 let prefix = ignore_patterns.iter().all(|k| match k.syntax {
894 PatternSyntax::Path | PatternSyntax::RelPath => true,
894 PatternSyntax::Path | PatternSyntax::RelPath => true,
895 _ => false,
895 _ => false,
896 });
896 });
897 let (patterns, match_fn) = build_match(ignore_patterns)?;
897 let (patterns, match_fn) = build_match(ignore_patterns)?;
898
898
899 Ok(Self {
899 Ok(Self {
900 patterns,
900 patterns,
901 match_fn,
901 match_fn,
902 prefix,
902 prefix,
903 roots,
903 roots,
904 dirs,
904 dirs,
905 parents,
905 parents,
906 })
906 })
907 }
907 }
908
908
909 fn get_all_parents_children(&self) -> DirsChildrenMultiset {
909 fn get_all_parents_children(&self) -> DirsChildrenMultiset {
910 // TODO cache
910 // TODO cache
911 let thing = self
911 let thing = self
912 .dirs
912 .dirs
913 .iter()
913 .iter()
914 .chain(self.roots.iter())
914 .chain(self.roots.iter())
915 .chain(self.parents.iter());
915 .chain(self.parents.iter());
916 DirsChildrenMultiset::new(thing, Some(&self.parents))
916 DirsChildrenMultiset::new(thing, Some(&self.parents))
917 }
917 }
918
918
919 pub fn debug_get_patterns(&self) -> &[u8] {
919 pub fn debug_get_patterns(&self) -> &[u8] {
920 self.patterns.as_ref()
920 self.patterns.as_ref()
921 }
921 }
922 }
922 }
923
923
924 impl<'a> Display for IncludeMatcher<'a> {
924 impl<'a> Display for IncludeMatcher<'a> {
925 fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error> {
925 fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error> {
926 // XXX What about exact matches?
926 // XXX What about exact matches?
927 // I'm not sure it's worth it to clone the HashSet and keep it
927 // I'm not sure it's worth it to clone the HashSet and keep it
928 // around just in case someone wants to display the matcher, plus
928 // around just in case someone wants to display the matcher, plus
929 // it's going to be unreadable after a few entries, but we need to
929 // it's going to be unreadable after a few entries, but we need to
930 // inform in this display that exact matches are being used and are
930 // inform in this display that exact matches are being used and are
931 // (on purpose) missing from the `includes`.
931 // (on purpose) missing from the `includes`.
932 write!(
932 write!(
933 f,
933 f,
934 "IncludeMatcher(includes='{}')",
934 "IncludeMatcher(includes='{}')",
935 String::from_utf8_lossy(&self.patterns.escaped_bytes())
935 String::from_utf8_lossy(&self.patterns.escaped_bytes())
936 )
936 )
937 }
937 }
938 }
938 }
939
939
940 #[cfg(test)]
940 #[cfg(test)]
941 mod tests {
941 mod tests {
942 use super::*;
942 use super::*;
943 use pretty_assertions::assert_eq;
943 use pretty_assertions::assert_eq;
944 use std::path::Path;
944 use std::path::Path;
945
945
946 #[test]
946 #[test]
947 fn test_roots_and_dirs() {
947 fn test_roots_and_dirs() {
948 let pats = vec![
948 let pats = vec![
949 IgnorePattern::new(PatternSyntax::Glob, b"g/h/*", Path::new("")),
949 IgnorePattern::new(PatternSyntax::Glob, b"g/h/*", Path::new("")),
950 IgnorePattern::new(PatternSyntax::Glob, b"g/h", Path::new("")),
950 IgnorePattern::new(PatternSyntax::Glob, b"g/h", Path::new("")),
951 IgnorePattern::new(PatternSyntax::Glob, b"g*", Path::new("")),
951 IgnorePattern::new(PatternSyntax::Glob, b"g*", Path::new("")),
952 ];
952 ];
953 let (roots, dirs) = roots_and_dirs(&pats);
953 let (roots, dirs) = roots_and_dirs(&pats);
954
954
955 assert_eq!(
955 assert_eq!(
956 roots,
956 roots,
957 vec!(
957 vec!(
958 HgPathBuf::from_bytes(b"g/h"),
958 HgPathBuf::from_bytes(b"g/h"),
959 HgPathBuf::from_bytes(b"g/h"),
959 HgPathBuf::from_bytes(b"g/h"),
960 HgPathBuf::new()
960 HgPathBuf::new()
961 ),
961 ),
962 );
962 );
963 assert_eq!(dirs, vec!());
963 assert_eq!(dirs, vec!());
964 }
964 }
965
965
966 #[test]
966 #[test]
967 fn test_roots_dirs_and_parents() {
967 fn test_roots_dirs_and_parents() {
968 let pats = vec![
968 let pats = vec![
969 IgnorePattern::new(PatternSyntax::Glob, b"g/h/*", Path::new("")),
969 IgnorePattern::new(PatternSyntax::Glob, b"g/h/*", Path::new("")),
970 IgnorePattern::new(PatternSyntax::Glob, b"g/h", Path::new("")),
970 IgnorePattern::new(PatternSyntax::Glob, b"g/h", Path::new("")),
971 IgnorePattern::new(PatternSyntax::Glob, b"g*", Path::new("")),
971 IgnorePattern::new(PatternSyntax::Glob, b"g*", Path::new("")),
972 ];
972 ];
973
973
974 let mut roots = HashSet::new();
974 let mut roots = HashSet::new();
975 roots.insert(HgPathBuf::from_bytes(b"g/h"));
975 roots.insert(HgPathBuf::from_bytes(b"g/h"));
976 roots.insert(HgPathBuf::new());
976 roots.insert(HgPathBuf::new());
977
977
978 let dirs = HashSet::new();
978 let dirs = HashSet::new();
979
979
980 let mut parents = HashSet::new();
980 let mut parents = HashSet::new();
981 parents.insert(HgPathBuf::new());
981 parents.insert(HgPathBuf::new());
982 parents.insert(HgPathBuf::from_bytes(b"g"));
982 parents.insert(HgPathBuf::from_bytes(b"g"));
983
983
984 assert_eq!(
984 assert_eq!(
985 roots_dirs_and_parents(&pats).unwrap(),
985 roots_dirs_and_parents(&pats).unwrap(),
986 RootsDirsAndParents {
986 RootsDirsAndParents {
987 roots,
987 roots,
988 dirs,
988 dirs,
989 parents
989 parents
990 }
990 }
991 );
991 );
992 }
992 }
993
993
994 #[test]
994 #[test]
995 fn test_filematcher_visit_children_set() {
995 fn test_filematcher_visit_children_set() {
996 // Visitchildrenset
996 // Visitchildrenset
997 let files = vec![HgPathBuf::from_bytes(b"dir/subdir/foo.txt")];
997 let files = vec![HgPathBuf::from_bytes(b"dir/subdir/foo.txt")];
998 let matcher = FileMatcher::new(files).unwrap();
998 let matcher = FileMatcher::new(files).unwrap();
999
999
1000 let mut set = HashSet::new();
1000 let mut set = HashSet::new();
1001 set.insert(HgPathBuf::from_bytes(b"dir"));
1001 set.insert(HgPathBuf::from_bytes(b"dir"));
1002 assert_eq!(
1002 assert_eq!(
1003 matcher.visit_children_set(HgPath::new(b"")),
1003 matcher.visit_children_set(HgPath::new(b"")),
1004 VisitChildrenSet::Set(set)
1004 VisitChildrenSet::Set(set)
1005 );
1005 );
1006
1006
1007 let mut set = HashSet::new();
1007 let mut set = HashSet::new();
1008 set.insert(HgPathBuf::from_bytes(b"subdir"));
1008 set.insert(HgPathBuf::from_bytes(b"subdir"));
1009 assert_eq!(
1009 assert_eq!(
1010 matcher.visit_children_set(HgPath::new(b"dir")),
1010 matcher.visit_children_set(HgPath::new(b"dir")),
1011 VisitChildrenSet::Set(set)
1011 VisitChildrenSet::Set(set)
1012 );
1012 );
1013
1013
1014 let mut set = HashSet::new();
1014 let mut set = HashSet::new();
1015 set.insert(HgPathBuf::from_bytes(b"foo.txt"));
1015 set.insert(HgPathBuf::from_bytes(b"foo.txt"));
1016 assert_eq!(
1016 assert_eq!(
1017 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1017 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1018 VisitChildrenSet::Set(set)
1018 VisitChildrenSet::Set(set)
1019 );
1019 );
1020
1020
1021 assert_eq!(
1021 assert_eq!(
1022 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1022 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1023 VisitChildrenSet::Empty
1023 VisitChildrenSet::Empty
1024 );
1024 );
1025 assert_eq!(
1025 assert_eq!(
1026 matcher.visit_children_set(HgPath::new(b"dir/subdir/foo.txt")),
1026 matcher.visit_children_set(HgPath::new(b"dir/subdir/foo.txt")),
1027 VisitChildrenSet::Empty
1027 VisitChildrenSet::Empty
1028 );
1028 );
1029 assert_eq!(
1029 assert_eq!(
1030 matcher.visit_children_set(HgPath::new(b"folder")),
1030 matcher.visit_children_set(HgPath::new(b"folder")),
1031 VisitChildrenSet::Empty
1031 VisitChildrenSet::Empty
1032 );
1032 );
1033 }
1033 }
1034
1034
1035 #[test]
1035 #[test]
1036 fn test_filematcher_visit_children_set_files_and_dirs() {
1036 fn test_filematcher_visit_children_set_files_and_dirs() {
1037 let files = vec![
1037 let files = vec![
1038 HgPathBuf::from_bytes(b"rootfile.txt"),
1038 HgPathBuf::from_bytes(b"rootfile.txt"),
1039 HgPathBuf::from_bytes(b"a/file1.txt"),
1039 HgPathBuf::from_bytes(b"a/file1.txt"),
1040 HgPathBuf::from_bytes(b"a/b/file2.txt"),
1040 HgPathBuf::from_bytes(b"a/b/file2.txt"),
1041 // No file in a/b/c
1041 // No file in a/b/c
1042 HgPathBuf::from_bytes(b"a/b/c/d/file4.txt"),
1042 HgPathBuf::from_bytes(b"a/b/c/d/file4.txt"),
1043 ];
1043 ];
1044 let matcher = FileMatcher::new(files).unwrap();
1044 let matcher = FileMatcher::new(files).unwrap();
1045
1045
1046 let mut set = HashSet::new();
1046 let mut set = HashSet::new();
1047 set.insert(HgPathBuf::from_bytes(b"a"));
1047 set.insert(HgPathBuf::from_bytes(b"a"));
1048 set.insert(HgPathBuf::from_bytes(b"rootfile.txt"));
1048 set.insert(HgPathBuf::from_bytes(b"rootfile.txt"));
1049 assert_eq!(
1049 assert_eq!(
1050 matcher.visit_children_set(HgPath::new(b"")),
1050 matcher.visit_children_set(HgPath::new(b"")),
1051 VisitChildrenSet::Set(set)
1051 VisitChildrenSet::Set(set)
1052 );
1052 );
1053
1053
1054 let mut set = HashSet::new();
1054 let mut set = HashSet::new();
1055 set.insert(HgPathBuf::from_bytes(b"b"));
1055 set.insert(HgPathBuf::from_bytes(b"b"));
1056 set.insert(HgPathBuf::from_bytes(b"file1.txt"));
1056 set.insert(HgPathBuf::from_bytes(b"file1.txt"));
1057 assert_eq!(
1057 assert_eq!(
1058 matcher.visit_children_set(HgPath::new(b"a")),
1058 matcher.visit_children_set(HgPath::new(b"a")),
1059 VisitChildrenSet::Set(set)
1059 VisitChildrenSet::Set(set)
1060 );
1060 );
1061
1061
1062 let mut set = HashSet::new();
1062 let mut set = HashSet::new();
1063 set.insert(HgPathBuf::from_bytes(b"c"));
1063 set.insert(HgPathBuf::from_bytes(b"c"));
1064 set.insert(HgPathBuf::from_bytes(b"file2.txt"));
1064 set.insert(HgPathBuf::from_bytes(b"file2.txt"));
1065 assert_eq!(
1065 assert_eq!(
1066 matcher.visit_children_set(HgPath::new(b"a/b")),
1066 matcher.visit_children_set(HgPath::new(b"a/b")),
1067 VisitChildrenSet::Set(set)
1067 VisitChildrenSet::Set(set)
1068 );
1068 );
1069
1069
1070 let mut set = HashSet::new();
1070 let mut set = HashSet::new();
1071 set.insert(HgPathBuf::from_bytes(b"d"));
1071 set.insert(HgPathBuf::from_bytes(b"d"));
1072 assert_eq!(
1072 assert_eq!(
1073 matcher.visit_children_set(HgPath::new(b"a/b/c")),
1073 matcher.visit_children_set(HgPath::new(b"a/b/c")),
1074 VisitChildrenSet::Set(set)
1074 VisitChildrenSet::Set(set)
1075 );
1075 );
1076 let mut set = HashSet::new();
1076 let mut set = HashSet::new();
1077 set.insert(HgPathBuf::from_bytes(b"file4.txt"));
1077 set.insert(HgPathBuf::from_bytes(b"file4.txt"));
1078 assert_eq!(
1078 assert_eq!(
1079 matcher.visit_children_set(HgPath::new(b"a/b/c/d")),
1079 matcher.visit_children_set(HgPath::new(b"a/b/c/d")),
1080 VisitChildrenSet::Set(set)
1080 VisitChildrenSet::Set(set)
1081 );
1081 );
1082
1082
1083 assert_eq!(
1083 assert_eq!(
1084 matcher.visit_children_set(HgPath::new(b"a/b/c/d/e")),
1084 matcher.visit_children_set(HgPath::new(b"a/b/c/d/e")),
1085 VisitChildrenSet::Empty
1085 VisitChildrenSet::Empty
1086 );
1086 );
1087 assert_eq!(
1087 assert_eq!(
1088 matcher.visit_children_set(HgPath::new(b"folder")),
1088 matcher.visit_children_set(HgPath::new(b"folder")),
1089 VisitChildrenSet::Empty
1089 VisitChildrenSet::Empty
1090 );
1090 );
1091 }
1091 }
1092
1092
1093 #[test]
1093 #[test]
1094 fn test_includematcher() {
1094 fn test_includematcher() {
1095 // VisitchildrensetPrefix
1095 // VisitchildrensetPrefix
1096 let matcher = IncludeMatcher::new(vec![IgnorePattern::new(
1096 let matcher = IncludeMatcher::new(vec![IgnorePattern::new(
1097 PatternSyntax::RelPath,
1097 PatternSyntax::RelPath,
1098 b"dir/subdir",
1098 b"dir/subdir",
1099 Path::new(""),
1099 Path::new(""),
1100 )])
1100 )])
1101 .unwrap();
1101 .unwrap();
1102
1102
1103 let mut set = HashSet::new();
1103 let mut set = HashSet::new();
1104 set.insert(HgPathBuf::from_bytes(b"dir"));
1104 set.insert(HgPathBuf::from_bytes(b"dir"));
1105 assert_eq!(
1105 assert_eq!(
1106 matcher.visit_children_set(HgPath::new(b"")),
1106 matcher.visit_children_set(HgPath::new(b"")),
1107 VisitChildrenSet::Set(set)
1107 VisitChildrenSet::Set(set)
1108 );
1108 );
1109
1109
1110 let mut set = HashSet::new();
1110 let mut set = HashSet::new();
1111 set.insert(HgPathBuf::from_bytes(b"subdir"));
1111 set.insert(HgPathBuf::from_bytes(b"subdir"));
1112 assert_eq!(
1112 assert_eq!(
1113 matcher.visit_children_set(HgPath::new(b"dir")),
1113 matcher.visit_children_set(HgPath::new(b"dir")),
1114 VisitChildrenSet::Set(set)
1114 VisitChildrenSet::Set(set)
1115 );
1115 );
1116 assert_eq!(
1116 assert_eq!(
1117 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1117 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1118 VisitChildrenSet::Recursive
1118 VisitChildrenSet::Recursive
1119 );
1119 );
1120 // OPT: This should probably be 'all' if its parent is?
1120 // OPT: This should probably be 'all' if its parent is?
1121 assert_eq!(
1121 assert_eq!(
1122 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1122 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1123 VisitChildrenSet::This
1123 VisitChildrenSet::This
1124 );
1124 );
1125 assert_eq!(
1125 assert_eq!(
1126 matcher.visit_children_set(HgPath::new(b"folder")),
1126 matcher.visit_children_set(HgPath::new(b"folder")),
1127 VisitChildrenSet::Empty
1127 VisitChildrenSet::Empty
1128 );
1128 );
1129
1129
1130 // VisitchildrensetRootfilesin
1130 // VisitchildrensetRootfilesin
1131 let matcher = IncludeMatcher::new(vec![IgnorePattern::new(
1131 let matcher = IncludeMatcher::new(vec![IgnorePattern::new(
1132 PatternSyntax::RootFiles,
1132 PatternSyntax::RootFiles,
1133 b"dir/subdir",
1133 b"dir/subdir",
1134 Path::new(""),
1134 Path::new(""),
1135 )])
1135 )])
1136 .unwrap();
1136 .unwrap();
1137
1137
1138 let mut set = HashSet::new();
1138 let mut set = HashSet::new();
1139 set.insert(HgPathBuf::from_bytes(b"dir"));
1139 set.insert(HgPathBuf::from_bytes(b"dir"));
1140 assert_eq!(
1140 assert_eq!(
1141 matcher.visit_children_set(HgPath::new(b"")),
1141 matcher.visit_children_set(HgPath::new(b"")),
1142 VisitChildrenSet::Set(set)
1142 VisitChildrenSet::Set(set)
1143 );
1143 );
1144
1144
1145 let mut set = HashSet::new();
1145 let mut set = HashSet::new();
1146 set.insert(HgPathBuf::from_bytes(b"subdir"));
1146 set.insert(HgPathBuf::from_bytes(b"subdir"));
1147 assert_eq!(
1147 assert_eq!(
1148 matcher.visit_children_set(HgPath::new(b"dir")),
1148 matcher.visit_children_set(HgPath::new(b"dir")),
1149 VisitChildrenSet::Set(set)
1149 VisitChildrenSet::Set(set)
1150 );
1150 );
1151
1151
1152 assert_eq!(
1152 assert_eq!(
1153 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1153 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1154 VisitChildrenSet::This
1154 VisitChildrenSet::This
1155 );
1155 );
1156 assert_eq!(
1156 assert_eq!(
1157 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1157 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1158 VisitChildrenSet::Empty
1158 VisitChildrenSet::Empty
1159 );
1159 );
1160 assert_eq!(
1160 assert_eq!(
1161 matcher.visit_children_set(HgPath::new(b"folder")),
1161 matcher.visit_children_set(HgPath::new(b"folder")),
1162 VisitChildrenSet::Empty
1162 VisitChildrenSet::Empty
1163 );
1163 );
1164
1164
1165 // VisitchildrensetGlob
1165 // VisitchildrensetGlob
1166 let matcher = IncludeMatcher::new(vec![IgnorePattern::new(
1166 let matcher = IncludeMatcher::new(vec![IgnorePattern::new(
1167 PatternSyntax::Glob,
1167 PatternSyntax::Glob,
1168 b"dir/z*",
1168 b"dir/z*",
1169 Path::new(""),
1169 Path::new(""),
1170 )])
1170 )])
1171 .unwrap();
1171 .unwrap();
1172
1172
1173 let mut set = HashSet::new();
1173 let mut set = HashSet::new();
1174 set.insert(HgPathBuf::from_bytes(b"dir"));
1174 set.insert(HgPathBuf::from_bytes(b"dir"));
1175 assert_eq!(
1175 assert_eq!(
1176 matcher.visit_children_set(HgPath::new(b"")),
1176 matcher.visit_children_set(HgPath::new(b"")),
1177 VisitChildrenSet::Set(set)
1177 VisitChildrenSet::Set(set)
1178 );
1178 );
1179 assert_eq!(
1179 assert_eq!(
1180 matcher.visit_children_set(HgPath::new(b"folder")),
1180 matcher.visit_children_set(HgPath::new(b"folder")),
1181 VisitChildrenSet::Empty
1181 VisitChildrenSet::Empty
1182 );
1182 );
1183 assert_eq!(
1183 assert_eq!(
1184 matcher.visit_children_set(HgPath::new(b"dir")),
1184 matcher.visit_children_set(HgPath::new(b"dir")),
1185 VisitChildrenSet::This
1185 VisitChildrenSet::This
1186 );
1186 );
1187 // OPT: these should probably be set().
1187 // OPT: these should probably be set().
1188 assert_eq!(
1188 assert_eq!(
1189 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1189 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1190 VisitChildrenSet::This
1190 VisitChildrenSet::This
1191 );
1191 );
1192 assert_eq!(
1192 assert_eq!(
1193 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1193 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1194 VisitChildrenSet::This
1194 VisitChildrenSet::This
1195 );
1195 );
1196
1196
1197 // Test multiple patterns
1197 // Test multiple patterns
1198 let matcher = IncludeMatcher::new(vec![
1198 let matcher = IncludeMatcher::new(vec![
1199 IgnorePattern::new(PatternSyntax::RelPath, b"foo", Path::new("")),
1199 IgnorePattern::new(PatternSyntax::RelPath, b"foo", Path::new("")),
1200 IgnorePattern::new(PatternSyntax::Glob, b"g*", Path::new("")),
1200 IgnorePattern::new(PatternSyntax::Glob, b"g*", Path::new("")),
1201 ])
1201 ])
1202 .unwrap();
1202 .unwrap();
1203
1203
1204 assert_eq!(
1204 assert_eq!(
1205 matcher.visit_children_set(HgPath::new(b"")),
1205 matcher.visit_children_set(HgPath::new(b"")),
1206 VisitChildrenSet::This
1206 VisitChildrenSet::This
1207 );
1207 );
1208
1208
1209 // Test multiple patterns
1209 // Test multiple patterns
1210 let matcher = IncludeMatcher::new(vec![IgnorePattern::new(
1210 let matcher = IncludeMatcher::new(vec![IgnorePattern::new(
1211 PatternSyntax::Glob,
1211 PatternSyntax::Glob,
1212 b"**/*.exe",
1212 b"**/*.exe",
1213 Path::new(""),
1213 Path::new(""),
1214 )])
1214 )])
1215 .unwrap();
1215 .unwrap();
1216
1216
1217 assert_eq!(
1217 assert_eq!(
1218 matcher.visit_children_set(HgPath::new(b"")),
1218 matcher.visit_children_set(HgPath::new(b"")),
1219 VisitChildrenSet::This
1219 VisitChildrenSet::This
1220 );
1220 );
1221 }
1221 }
1222
1222
1223 #[test]
1223 #[test]
1224 fn test_unionmatcher() {
1224 fn test_unionmatcher() {
1225 // Path + Rootfiles
1225 // Path + Rootfiles
1226 let m1 = IncludeMatcher::new(vec![IgnorePattern::new(
1226 let m1 = IncludeMatcher::new(vec![IgnorePattern::new(
1227 PatternSyntax::RelPath,
1227 PatternSyntax::RelPath,
1228 b"dir/subdir",
1228 b"dir/subdir",
1229 Path::new(""),
1229 Path::new(""),
1230 )])
1230 )])
1231 .unwrap();
1231 .unwrap();
1232 let m2 = IncludeMatcher::new(vec![IgnorePattern::new(
1232 let m2 = IncludeMatcher::new(vec![IgnorePattern::new(
1233 PatternSyntax::RootFiles,
1233 PatternSyntax::RootFiles,
1234 b"dir",
1234 b"dir",
1235 Path::new(""),
1235 Path::new(""),
1236 )])
1236 )])
1237 .unwrap();
1237 .unwrap();
1238 let matcher = UnionMatcher::new(vec![Box::new(m1), Box::new(m2)]);
1238 let matcher = UnionMatcher::new(vec![Box::new(m1), Box::new(m2)]);
1239
1239
1240 let mut set = HashSet::new();
1240 let mut set = HashSet::new();
1241 set.insert(HgPathBuf::from_bytes(b"dir"));
1241 set.insert(HgPathBuf::from_bytes(b"dir"));
1242 assert_eq!(
1242 assert_eq!(
1243 matcher.visit_children_set(HgPath::new(b"")),
1243 matcher.visit_children_set(HgPath::new(b"")),
1244 VisitChildrenSet::Set(set)
1244 VisitChildrenSet::Set(set)
1245 );
1245 );
1246 assert_eq!(
1246 assert_eq!(
1247 matcher.visit_children_set(HgPath::new(b"dir")),
1247 matcher.visit_children_set(HgPath::new(b"dir")),
1248 VisitChildrenSet::This
1248 VisitChildrenSet::This
1249 );
1249 );
1250 assert_eq!(
1250 assert_eq!(
1251 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1251 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1252 VisitChildrenSet::Recursive
1252 VisitChildrenSet::Recursive
1253 );
1253 );
1254 assert_eq!(
1254 assert_eq!(
1255 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1255 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1256 VisitChildrenSet::Empty
1256 VisitChildrenSet::Empty
1257 );
1257 );
1258 assert_eq!(
1258 assert_eq!(
1259 matcher.visit_children_set(HgPath::new(b"folder")),
1259 matcher.visit_children_set(HgPath::new(b"folder")),
1260 VisitChildrenSet::Empty
1260 VisitChildrenSet::Empty
1261 );
1261 );
1262 assert_eq!(
1262 assert_eq!(
1263 matcher.visit_children_set(HgPath::new(b"folder")),
1263 matcher.visit_children_set(HgPath::new(b"folder")),
1264 VisitChildrenSet::Empty
1264 VisitChildrenSet::Empty
1265 );
1265 );
1266
1266
1267 // OPT: These next two could be 'all' instead of 'this'.
1267 // OPT: These next two could be 'all' instead of 'this'.
1268 assert_eq!(
1268 assert_eq!(
1269 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1269 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1270 VisitChildrenSet::This
1270 VisitChildrenSet::This
1271 );
1271 );
1272 assert_eq!(
1272 assert_eq!(
1273 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1273 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1274 VisitChildrenSet::This
1274 VisitChildrenSet::This
1275 );
1275 );
1276
1276
1277 // Path + unrelated Path
1277 // Path + unrelated Path
1278 let m1 = IncludeMatcher::new(vec![IgnorePattern::new(
1278 let m1 = IncludeMatcher::new(vec![IgnorePattern::new(
1279 PatternSyntax::RelPath,
1279 PatternSyntax::RelPath,
1280 b"dir/subdir",
1280 b"dir/subdir",
1281 Path::new(""),
1281 Path::new(""),
1282 )])
1282 )])
1283 .unwrap();
1283 .unwrap();
1284 let m2 = IncludeMatcher::new(vec![IgnorePattern::new(
1284 let m2 = IncludeMatcher::new(vec![IgnorePattern::new(
1285 PatternSyntax::RelPath,
1285 PatternSyntax::RelPath,
1286 b"folder",
1286 b"folder",
1287 Path::new(""),
1287 Path::new(""),
1288 )])
1288 )])
1289 .unwrap();
1289 .unwrap();
1290 let matcher = UnionMatcher::new(vec![Box::new(m1), Box::new(m2)]);
1290 let matcher = UnionMatcher::new(vec![Box::new(m1), Box::new(m2)]);
1291
1291
1292 let mut set = HashSet::new();
1292 let mut set = HashSet::new();
1293 set.insert(HgPathBuf::from_bytes(b"folder"));
1293 set.insert(HgPathBuf::from_bytes(b"folder"));
1294 set.insert(HgPathBuf::from_bytes(b"dir"));
1294 set.insert(HgPathBuf::from_bytes(b"dir"));
1295 assert_eq!(
1295 assert_eq!(
1296 matcher.visit_children_set(HgPath::new(b"")),
1296 matcher.visit_children_set(HgPath::new(b"")),
1297 VisitChildrenSet::Set(set)
1297 VisitChildrenSet::Set(set)
1298 );
1298 );
1299 let mut set = HashSet::new();
1299 let mut set = HashSet::new();
1300 set.insert(HgPathBuf::from_bytes(b"subdir"));
1300 set.insert(HgPathBuf::from_bytes(b"subdir"));
1301 assert_eq!(
1301 assert_eq!(
1302 matcher.visit_children_set(HgPath::new(b"dir")),
1302 matcher.visit_children_set(HgPath::new(b"dir")),
1303 VisitChildrenSet::Set(set)
1303 VisitChildrenSet::Set(set)
1304 );
1304 );
1305
1305
1306 assert_eq!(
1306 assert_eq!(
1307 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1307 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1308 VisitChildrenSet::Recursive
1308 VisitChildrenSet::Recursive
1309 );
1309 );
1310 assert_eq!(
1310 assert_eq!(
1311 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1311 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1312 VisitChildrenSet::Empty
1312 VisitChildrenSet::Empty
1313 );
1313 );
1314
1314
1315 assert_eq!(
1315 assert_eq!(
1316 matcher.visit_children_set(HgPath::new(b"folder")),
1316 matcher.visit_children_set(HgPath::new(b"folder")),
1317 VisitChildrenSet::Recursive
1317 VisitChildrenSet::Recursive
1318 );
1318 );
1319 // OPT: These next two could be 'all' instead of 'this'.
1319 // OPT: These next two could be 'all' instead of 'this'.
1320 assert_eq!(
1320 assert_eq!(
1321 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1321 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1322 VisitChildrenSet::This
1322 VisitChildrenSet::This
1323 );
1323 );
1324 assert_eq!(
1324 assert_eq!(
1325 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1325 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1326 VisitChildrenSet::This
1326 VisitChildrenSet::This
1327 );
1327 );
1328
1328
1329 // Path + subpath
1329 // Path + subpath
1330 let m1 = IncludeMatcher::new(vec![IgnorePattern::new(
1330 let m1 = IncludeMatcher::new(vec![IgnorePattern::new(
1331 PatternSyntax::RelPath,
1331 PatternSyntax::RelPath,
1332 b"dir/subdir/x",
1332 b"dir/subdir/x",
1333 Path::new(""),
1333 Path::new(""),
1334 )])
1334 )])
1335 .unwrap();
1335 .unwrap();
1336 let m2 = IncludeMatcher::new(vec![IgnorePattern::new(
1336 let m2 = IncludeMatcher::new(vec![IgnorePattern::new(
1337 PatternSyntax::RelPath,
1337 PatternSyntax::RelPath,
1338 b"dir/subdir",
1338 b"dir/subdir",
1339 Path::new(""),
1339 Path::new(""),
1340 )])
1340 )])
1341 .unwrap();
1341 .unwrap();
1342 let matcher = UnionMatcher::new(vec![Box::new(m1), Box::new(m2)]);
1342 let matcher = UnionMatcher::new(vec![Box::new(m1), Box::new(m2)]);
1343
1343
1344 let mut set = HashSet::new();
1344 let mut set = HashSet::new();
1345 set.insert(HgPathBuf::from_bytes(b"dir"));
1345 set.insert(HgPathBuf::from_bytes(b"dir"));
1346 assert_eq!(
1346 assert_eq!(
1347 matcher.visit_children_set(HgPath::new(b"")),
1347 matcher.visit_children_set(HgPath::new(b"")),
1348 VisitChildrenSet::Set(set)
1348 VisitChildrenSet::Set(set)
1349 );
1349 );
1350 let mut set = HashSet::new();
1350 let mut set = HashSet::new();
1351 set.insert(HgPathBuf::from_bytes(b"subdir"));
1351 set.insert(HgPathBuf::from_bytes(b"subdir"));
1352 assert_eq!(
1352 assert_eq!(
1353 matcher.visit_children_set(HgPath::new(b"dir")),
1353 matcher.visit_children_set(HgPath::new(b"dir")),
1354 VisitChildrenSet::Set(set)
1354 VisitChildrenSet::Set(set)
1355 );
1355 );
1356
1356
1357 assert_eq!(
1357 assert_eq!(
1358 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1358 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1359 VisitChildrenSet::Recursive
1359 VisitChildrenSet::Recursive
1360 );
1360 );
1361 assert_eq!(
1361 assert_eq!(
1362 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1362 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1363 VisitChildrenSet::Empty
1363 VisitChildrenSet::Empty
1364 );
1364 );
1365
1365
1366 assert_eq!(
1366 assert_eq!(
1367 matcher.visit_children_set(HgPath::new(b"folder")),
1367 matcher.visit_children_set(HgPath::new(b"folder")),
1368 VisitChildrenSet::Empty
1368 VisitChildrenSet::Empty
1369 );
1369 );
1370 assert_eq!(
1370 assert_eq!(
1371 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1371 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1372 VisitChildrenSet::Recursive
1372 VisitChildrenSet::Recursive
1373 );
1373 );
1374 // OPT: this should probably be 'all' not 'this'.
1374 // OPT: this should probably be 'all' not 'this'.
1375 assert_eq!(
1375 assert_eq!(
1376 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1376 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1377 VisitChildrenSet::This
1377 VisitChildrenSet::This
1378 );
1378 );
1379 }
1379 }
1380
1380
1381 #[test]
1381 #[test]
1382 fn test_intersectionmatcher() {
1382 fn test_intersectionmatcher() {
1383 // Include path + Include rootfiles
1383 // Include path + Include rootfiles
1384 let m1 = Box::new(
1384 let m1 = Box::new(
1385 IncludeMatcher::new(vec![IgnorePattern::new(
1385 IncludeMatcher::new(vec![IgnorePattern::new(
1386 PatternSyntax::RelPath,
1386 PatternSyntax::RelPath,
1387 b"dir/subdir",
1387 b"dir/subdir",
1388 Path::new(""),
1388 Path::new(""),
1389 )])
1389 )])
1390 .unwrap(),
1390 .unwrap(),
1391 );
1391 );
1392 let m2 = Box::new(
1392 let m2 = Box::new(
1393 IncludeMatcher::new(vec![IgnorePattern::new(
1393 IncludeMatcher::new(vec![IgnorePattern::new(
1394 PatternSyntax::RootFiles,
1394 PatternSyntax::RootFiles,
1395 b"dir",
1395 b"dir",
1396 Path::new(""),
1396 Path::new(""),
1397 )])
1397 )])
1398 .unwrap(),
1398 .unwrap(),
1399 );
1399 );
1400 let matcher = IntersectionMatcher::new(m1, m2);
1400 let matcher = IntersectionMatcher::new(m1, m2);
1401
1401
1402 let mut set = HashSet::new();
1402 let mut set = HashSet::new();
1403 set.insert(HgPathBuf::from_bytes(b"dir"));
1403 set.insert(HgPathBuf::from_bytes(b"dir"));
1404 assert_eq!(
1404 assert_eq!(
1405 matcher.visit_children_set(HgPath::new(b"")),
1405 matcher.visit_children_set(HgPath::new(b"")),
1406 VisitChildrenSet::Set(set)
1406 VisitChildrenSet::Set(set)
1407 );
1407 );
1408 assert_eq!(
1408 assert_eq!(
1409 matcher.visit_children_set(HgPath::new(b"dir")),
1409 matcher.visit_children_set(HgPath::new(b"dir")),
1410 VisitChildrenSet::This
1410 VisitChildrenSet::This
1411 );
1411 );
1412 assert_eq!(
1412 assert_eq!(
1413 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1413 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1414 VisitChildrenSet::Empty
1414 VisitChildrenSet::Empty
1415 );
1415 );
1416 assert_eq!(
1416 assert_eq!(
1417 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1417 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1418 VisitChildrenSet::Empty
1418 VisitChildrenSet::Empty
1419 );
1419 );
1420 assert_eq!(
1420 assert_eq!(
1421 matcher.visit_children_set(HgPath::new(b"folder")),
1421 matcher.visit_children_set(HgPath::new(b"folder")),
1422 VisitChildrenSet::Empty
1422 VisitChildrenSet::Empty
1423 );
1423 );
1424 assert_eq!(
1424 assert_eq!(
1425 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1425 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1426 VisitChildrenSet::Empty
1426 VisitChildrenSet::Empty
1427 );
1427 );
1428 assert_eq!(
1428 assert_eq!(
1429 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1429 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1430 VisitChildrenSet::Empty
1430 VisitChildrenSet::Empty
1431 );
1431 );
1432
1432
1433 // Non intersecting paths
1433 // Non intersecting paths
1434 let m1 = Box::new(
1434 let m1 = Box::new(
1435 IncludeMatcher::new(vec![IgnorePattern::new(
1435 IncludeMatcher::new(vec![IgnorePattern::new(
1436 PatternSyntax::RelPath,
1436 PatternSyntax::RelPath,
1437 b"dir/subdir",
1437 b"dir/subdir",
1438 Path::new(""),
1438 Path::new(""),
1439 )])
1439 )])
1440 .unwrap(),
1440 .unwrap(),
1441 );
1441 );
1442 let m2 = Box::new(
1442 let m2 = Box::new(
1443 IncludeMatcher::new(vec![IgnorePattern::new(
1443 IncludeMatcher::new(vec![IgnorePattern::new(
1444 PatternSyntax::RelPath,
1444 PatternSyntax::RelPath,
1445 b"folder",
1445 b"folder",
1446 Path::new(""),
1446 Path::new(""),
1447 )])
1447 )])
1448 .unwrap(),
1448 .unwrap(),
1449 );
1449 );
1450 let matcher = IntersectionMatcher::new(m1, m2);
1450 let matcher = IntersectionMatcher::new(m1, m2);
1451
1451
1452 assert_eq!(
1452 assert_eq!(
1453 matcher.visit_children_set(HgPath::new(b"")),
1453 matcher.visit_children_set(HgPath::new(b"")),
1454 VisitChildrenSet::Empty
1454 VisitChildrenSet::Empty
1455 );
1455 );
1456 assert_eq!(
1456 assert_eq!(
1457 matcher.visit_children_set(HgPath::new(b"dir")),
1457 matcher.visit_children_set(HgPath::new(b"dir")),
1458 VisitChildrenSet::Empty
1458 VisitChildrenSet::Empty
1459 );
1459 );
1460 assert_eq!(
1460 assert_eq!(
1461 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1461 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1462 VisitChildrenSet::Empty
1462 VisitChildrenSet::Empty
1463 );
1463 );
1464 assert_eq!(
1464 assert_eq!(
1465 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1465 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1466 VisitChildrenSet::Empty
1466 VisitChildrenSet::Empty
1467 );
1467 );
1468 assert_eq!(
1468 assert_eq!(
1469 matcher.visit_children_set(HgPath::new(b"folder")),
1469 matcher.visit_children_set(HgPath::new(b"folder")),
1470 VisitChildrenSet::Empty
1470 VisitChildrenSet::Empty
1471 );
1471 );
1472 assert_eq!(
1472 assert_eq!(
1473 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1473 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1474 VisitChildrenSet::Empty
1474 VisitChildrenSet::Empty
1475 );
1475 );
1476 assert_eq!(
1476 assert_eq!(
1477 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1477 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1478 VisitChildrenSet::Empty
1478 VisitChildrenSet::Empty
1479 );
1479 );
1480
1480
1481 // Nested paths
1481 // Nested paths
1482 let m1 = Box::new(
1482 let m1 = Box::new(
1483 IncludeMatcher::new(vec![IgnorePattern::new(
1483 IncludeMatcher::new(vec![IgnorePattern::new(
1484 PatternSyntax::RelPath,
1484 PatternSyntax::RelPath,
1485 b"dir/subdir/x",
1485 b"dir/subdir/x",
1486 Path::new(""),
1486 Path::new(""),
1487 )])
1487 )])
1488 .unwrap(),
1488 .unwrap(),
1489 );
1489 );
1490 let m2 = Box::new(
1490 let m2 = Box::new(
1491 IncludeMatcher::new(vec![IgnorePattern::new(
1491 IncludeMatcher::new(vec![IgnorePattern::new(
1492 PatternSyntax::RelPath,
1492 PatternSyntax::RelPath,
1493 b"dir/subdir",
1493 b"dir/subdir",
1494 Path::new(""),
1494 Path::new(""),
1495 )])
1495 )])
1496 .unwrap(),
1496 .unwrap(),
1497 );
1497 );
1498 let matcher = IntersectionMatcher::new(m1, m2);
1498 let matcher = IntersectionMatcher::new(m1, m2);
1499
1499
1500 let mut set = HashSet::new();
1500 let mut set = HashSet::new();
1501 set.insert(HgPathBuf::from_bytes(b"dir"));
1501 set.insert(HgPathBuf::from_bytes(b"dir"));
1502 assert_eq!(
1502 assert_eq!(
1503 matcher.visit_children_set(HgPath::new(b"")),
1503 matcher.visit_children_set(HgPath::new(b"")),
1504 VisitChildrenSet::Set(set)
1504 VisitChildrenSet::Set(set)
1505 );
1505 );
1506
1506
1507 let mut set = HashSet::new();
1507 let mut set = HashSet::new();
1508 set.insert(HgPathBuf::from_bytes(b"subdir"));
1508 set.insert(HgPathBuf::from_bytes(b"subdir"));
1509 assert_eq!(
1509 assert_eq!(
1510 matcher.visit_children_set(HgPath::new(b"dir")),
1510 matcher.visit_children_set(HgPath::new(b"dir")),
1511 VisitChildrenSet::Set(set)
1511 VisitChildrenSet::Set(set)
1512 );
1512 );
1513 let mut set = HashSet::new();
1513 let mut set = HashSet::new();
1514 set.insert(HgPathBuf::from_bytes(b"x"));
1514 set.insert(HgPathBuf::from_bytes(b"x"));
1515 assert_eq!(
1515 assert_eq!(
1516 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1516 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1517 VisitChildrenSet::Set(set)
1517 VisitChildrenSet::Set(set)
1518 );
1518 );
1519 assert_eq!(
1519 assert_eq!(
1520 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1520 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1521 VisitChildrenSet::Empty
1521 VisitChildrenSet::Empty
1522 );
1522 );
1523 assert_eq!(
1523 assert_eq!(
1524 matcher.visit_children_set(HgPath::new(b"folder")),
1524 matcher.visit_children_set(HgPath::new(b"folder")),
1525 VisitChildrenSet::Empty
1525 VisitChildrenSet::Empty
1526 );
1526 );
1527 assert_eq!(
1527 assert_eq!(
1528 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1528 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1529 VisitChildrenSet::Empty
1529 VisitChildrenSet::Empty
1530 );
1530 );
1531 // OPT: this should probably be 'all' not 'this'.
1531 // OPT: this should probably be 'all' not 'this'.
1532 assert_eq!(
1532 assert_eq!(
1533 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1533 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1534 VisitChildrenSet::This
1534 VisitChildrenSet::This
1535 );
1535 );
1536
1536
1537 // Diverging paths
1537 // Diverging paths
1538 let m1 = Box::new(
1538 let m1 = Box::new(
1539 IncludeMatcher::new(vec![IgnorePattern::new(
1539 IncludeMatcher::new(vec![IgnorePattern::new(
1540 PatternSyntax::RelPath,
1540 PatternSyntax::RelPath,
1541 b"dir/subdir/x",
1541 b"dir/subdir/x",
1542 Path::new(""),
1542 Path::new(""),
1543 )])
1543 )])
1544 .unwrap(),
1544 .unwrap(),
1545 );
1545 );
1546 let m2 = Box::new(
1546 let m2 = Box::new(
1547 IncludeMatcher::new(vec![IgnorePattern::new(
1547 IncludeMatcher::new(vec![IgnorePattern::new(
1548 PatternSyntax::RelPath,
1548 PatternSyntax::RelPath,
1549 b"dir/subdir/z",
1549 b"dir/subdir/z",
1550 Path::new(""),
1550 Path::new(""),
1551 )])
1551 )])
1552 .unwrap(),
1552 .unwrap(),
1553 );
1553 );
1554 let matcher = IntersectionMatcher::new(m1, m2);
1554 let matcher = IntersectionMatcher::new(m1, m2);
1555
1555
1556 // OPT: these next two could probably be Empty as well.
1556 // OPT: these next two could probably be Empty as well.
1557 let mut set = HashSet::new();
1557 let mut set = HashSet::new();
1558 set.insert(HgPathBuf::from_bytes(b"dir"));
1558 set.insert(HgPathBuf::from_bytes(b"dir"));
1559 assert_eq!(
1559 assert_eq!(
1560 matcher.visit_children_set(HgPath::new(b"")),
1560 matcher.visit_children_set(HgPath::new(b"")),
1561 VisitChildrenSet::Set(set)
1561 VisitChildrenSet::Set(set)
1562 );
1562 );
1563 // OPT: these next two could probably be Empty as well.
1563 // OPT: these next two could probably be Empty as well.
1564 let mut set = HashSet::new();
1564 let mut set = HashSet::new();
1565 set.insert(HgPathBuf::from_bytes(b"subdir"));
1565 set.insert(HgPathBuf::from_bytes(b"subdir"));
1566 assert_eq!(
1566 assert_eq!(
1567 matcher.visit_children_set(HgPath::new(b"dir")),
1567 matcher.visit_children_set(HgPath::new(b"dir")),
1568 VisitChildrenSet::Set(set)
1568 VisitChildrenSet::Set(set)
1569 );
1569 );
1570 assert_eq!(
1570 assert_eq!(
1571 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1571 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1572 VisitChildrenSet::Empty
1572 VisitChildrenSet::Empty
1573 );
1573 );
1574 assert_eq!(
1574 assert_eq!(
1575 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1575 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1576 VisitChildrenSet::Empty
1576 VisitChildrenSet::Empty
1577 );
1577 );
1578 assert_eq!(
1578 assert_eq!(
1579 matcher.visit_children_set(HgPath::new(b"folder")),
1579 matcher.visit_children_set(HgPath::new(b"folder")),
1580 VisitChildrenSet::Empty
1580 VisitChildrenSet::Empty
1581 );
1581 );
1582 assert_eq!(
1582 assert_eq!(
1583 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1583 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1584 VisitChildrenSet::Empty
1584 VisitChildrenSet::Empty
1585 );
1585 );
1586 assert_eq!(
1586 assert_eq!(
1587 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1587 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1588 VisitChildrenSet::Empty
1588 VisitChildrenSet::Empty
1589 );
1589 );
1590 }
1590 }
1591
1591
1592 #[test]
1592 #[test]
1593 fn test_differencematcher() {
1593 fn test_differencematcher() {
1594 // Two alwaysmatchers should function like a nevermatcher
1594 // Two alwaysmatchers should function like a nevermatcher
1595 let m1 = AlwaysMatcher;
1595 let m1 = AlwaysMatcher;
1596 let m2 = AlwaysMatcher;
1596 let m2 = AlwaysMatcher;
1597 let matcher = DifferenceMatcher::new(Box::new(m1), Box::new(m2));
1597 let matcher = DifferenceMatcher::new(Box::new(m1), Box::new(m2));
1598
1598
1599 for case in &[
1599 for case in &[
1600 &b""[..],
1600 &b""[..],
1601 b"dir",
1601 b"dir",
1602 b"dir/subdir",
1602 b"dir/subdir",
1603 b"dir/subdir/z",
1603 b"dir/subdir/z",
1604 b"dir/foo",
1604 b"dir/foo",
1605 b"dir/subdir/x",
1605 b"dir/subdir/x",
1606 b"folder",
1606 b"folder",
1607 ] {
1607 ] {
1608 assert_eq!(
1608 assert_eq!(
1609 matcher.visit_children_set(HgPath::new(case)),
1609 matcher.visit_children_set(HgPath::new(case)),
1610 VisitChildrenSet::Empty
1610 VisitChildrenSet::Empty
1611 );
1611 );
1612 }
1612 }
1613
1613
1614 // One always and one never should behave the same as an always
1614 // One always and one never should behave the same as an always
1615 let m1 = AlwaysMatcher;
1615 let m1 = AlwaysMatcher;
1616 let m2 = NeverMatcher;
1616 let m2 = NeverMatcher;
1617 let matcher = DifferenceMatcher::new(Box::new(m1), Box::new(m2));
1617 let matcher = DifferenceMatcher::new(Box::new(m1), Box::new(m2));
1618
1618
1619 for case in &[
1619 for case in &[
1620 &b""[..],
1620 &b""[..],
1621 b"dir",
1621 b"dir",
1622 b"dir/subdir",
1622 b"dir/subdir",
1623 b"dir/subdir/z",
1623 b"dir/subdir/z",
1624 b"dir/foo",
1624 b"dir/foo",
1625 b"dir/subdir/x",
1625 b"dir/subdir/x",
1626 b"folder",
1626 b"folder",
1627 ] {
1627 ] {
1628 assert_eq!(
1628 assert_eq!(
1629 matcher.visit_children_set(HgPath::new(case)),
1629 matcher.visit_children_set(HgPath::new(case)),
1630 VisitChildrenSet::Recursive
1630 VisitChildrenSet::Recursive
1631 );
1631 );
1632 }
1632 }
1633
1633
1634 // Two include matchers
1634 // Two include matchers
1635 let m1 = Box::new(
1635 let m1 = Box::new(
1636 IncludeMatcher::new(vec![IgnorePattern::new(
1636 IncludeMatcher::new(vec![IgnorePattern::new(
1637 PatternSyntax::RelPath,
1637 PatternSyntax::RelPath,
1638 b"dir/subdir",
1638 b"dir/subdir",
1639 Path::new("/repo"),
1639 Path::new("/repo"),
1640 )])
1640 )])
1641 .unwrap(),
1641 .unwrap(),
1642 );
1642 );
1643 let m2 = Box::new(
1643 let m2 = Box::new(
1644 IncludeMatcher::new(vec![IgnorePattern::new(
1644 IncludeMatcher::new(vec![IgnorePattern::new(
1645 PatternSyntax::RootFiles,
1645 PatternSyntax::RootFiles,
1646 b"dir",
1646 b"dir",
1647 Path::new("/repo"),
1647 Path::new("/repo"),
1648 )])
1648 )])
1649 .unwrap(),
1649 .unwrap(),
1650 );
1650 );
1651
1651
1652 let matcher = DifferenceMatcher::new(m1, m2);
1652 let matcher = DifferenceMatcher::new(m1, m2);
1653
1653
1654 let mut set = HashSet::new();
1654 let mut set = HashSet::new();
1655 set.insert(HgPathBuf::from_bytes(b"dir"));
1655 set.insert(HgPathBuf::from_bytes(b"dir"));
1656 assert_eq!(
1656 assert_eq!(
1657 matcher.visit_children_set(HgPath::new(b"")),
1657 matcher.visit_children_set(HgPath::new(b"")),
1658 VisitChildrenSet::Set(set)
1658 VisitChildrenSet::Set(set)
1659 );
1659 );
1660
1660
1661 let mut set = HashSet::new();
1661 let mut set = HashSet::new();
1662 set.insert(HgPathBuf::from_bytes(b"subdir"));
1662 set.insert(HgPathBuf::from_bytes(b"subdir"));
1663 assert_eq!(
1663 assert_eq!(
1664 matcher.visit_children_set(HgPath::new(b"dir")),
1664 matcher.visit_children_set(HgPath::new(b"dir")),
1665 VisitChildrenSet::Set(set)
1665 VisitChildrenSet::Set(set)
1666 );
1666 );
1667 assert_eq!(
1667 assert_eq!(
1668 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1668 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1669 VisitChildrenSet::Recursive
1669 VisitChildrenSet::Recursive
1670 );
1670 );
1671 assert_eq!(
1671 assert_eq!(
1672 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1672 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1673 VisitChildrenSet::Empty
1673 VisitChildrenSet::Empty
1674 );
1674 );
1675 assert_eq!(
1675 assert_eq!(
1676 matcher.visit_children_set(HgPath::new(b"folder")),
1676 matcher.visit_children_set(HgPath::new(b"folder")),
1677 VisitChildrenSet::Empty
1677 VisitChildrenSet::Empty
1678 );
1678 );
1679 assert_eq!(
1679 assert_eq!(
1680 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1680 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1681 VisitChildrenSet::This
1681 VisitChildrenSet::This
1682 );
1682 );
1683 assert_eq!(
1683 assert_eq!(
1684 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1684 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1685 VisitChildrenSet::This
1685 VisitChildrenSet::This
1686 );
1686 );
1687 }
1687 }
1688 }
1688 }
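Taken together, the tests above pin down the `visit_children_set` contract: `Empty` means nothing under the queried directory can match, `Recursive` means everything under it matches, `Set(children)` names the only child entries worth descending into, and `This` means the directory itself is relevant but each child still has to be checked individually. The following standalone sketch (not part of this change) exercises that contract with the same constructors the tests use; the `hg::matchers`, `hg::filepatterns` and `hg::utils::hg_path` module paths are assumed from hg-core and may differ between versions.

use std::path::Path;

use hg::filepatterns::{IgnorePattern, PatternSyntax};
use hg::matchers::{IncludeMatcher, Matcher, UnionMatcher, VisitChildrenSet};
use hg::utils::hg_path::HgPath;

fn main() {
    // Same pattern setup as the "Path + unrelated Path" test case above.
    let m1 = IncludeMatcher::new(vec![IgnorePattern::new(
        PatternSyntax::RelPath,
        b"dir/subdir",
        Path::new(""),
    )])
    .unwrap();
    let m2 = IncludeMatcher::new(vec![IgnorePattern::new(
        PatternSyntax::RelPath,
        b"folder",
        Path::new(""),
    )])
    .unwrap();
    let matcher = UnionMatcher::new(vec![Box::new(m1), Box::new(m2)]);

    for dir in &[
        &b""[..],
        b"dir",
        b"dir/subdir",
        b"dir/subdir/x",
        b"dir/foo",
        b"folder",
    ] {
        let name = String::from_utf8_lossy(dir);
        match matcher.visit_children_set(HgPath::new(dir)) {
            // Nothing under this directory can match: a walk skips the subtree.
            VisitChildrenSet::Empty => println!("{}: skip", name),
            // Everything under this directory matches: no further filtering.
            VisitChildrenSet::Recursive => println!("{}: take everything", name),
            // Only these named children can contain matches.
            VisitChildrenSet::Set(children) => {
                println!("{}: descend into {:?}", name, children)
            }
            // `This`: the directory itself is relevant, but each child must
            // still be checked one by one.
            other => println!("{}: {:?}", name, other),
        }
    }
}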
@@ -1,40 +1,40
1 use crate::error::CommandError;
1 use crate::error::CommandError;
2 use clap::SubCommand;
2 use clap::SubCommand;
3 use hg;
3 use hg;
4 use hg::matchers::get_ignore_matcher;
4 use hg::matchers::get_ignore_matcher;
5 use hg::StatusError;
5 use hg::StatusError;
6 use log::warn;
6 use log::warn;
7
7
8 pub const HELP_TEXT: &str = "
8 pub const HELP_TEXT: &str = "
9 Show effective hgignore patterns used by rhg.
9 Show effective hgignore patterns used by rhg.
10
10
11 This is a pure Rust version of `hg debugignore`.
11 This is a pure Rust version of `hg debugignore`.
12
12
13 Some options might be missing, check the list below.
13 Some options might be missing, check the list below.
14 ";
14 ";
15
15
16 pub fn args() -> clap::App<'static, 'static> {
16 pub fn args() -> clap::App<'static, 'static> {
17 SubCommand::with_name("debugignorerhg").about(HELP_TEXT)
17 SubCommand::with_name("debugignorerhg").about(HELP_TEXT)
18 }
18 }
19
19
20 pub fn run(invocation: &crate::CliInvocation) -> Result<(), CommandError> {
20 pub fn run(invocation: &crate::CliInvocation) -> Result<(), CommandError> {
21 let repo = invocation.repo?;
21 let repo = invocation.repo?;
22
22
23 let ignore_file = repo.working_directory_vfs().join(".hgignore"); // TODO hardcoded
23 let ignore_file = repo.working_directory_vfs().join(".hgignore"); // TODO hardcoded
24
24
25 let (ignore_matcher, warnings) = get_ignore_matcher(
25 let (ignore_matcher, warnings) = get_ignore_matcher(
26 vec![ignore_file],
26 vec![ignore_file],
27 &repo.working_directory_path().to_owned(),
27 &repo.working_directory_path().to_owned(),
28 &mut |_pattern_bytes| (),
28 &mut |_source, _pattern_bytes| (),
29 )
29 )
30 .map_err(|e| StatusError::from(e))?;
30 .map_err(|e| StatusError::from(e))?;
31
31
32 if !warnings.is_empty() {
32 if !warnings.is_empty() {
33 warn!("Pattern warnings: {:?}", &warnings);
33 warn!("Pattern warnings: {:?}", &warnings);
34 }
34 }
35
35
36 let patterns = ignore_matcher.debug_get_patterns();
36 let patterns = ignore_matcher.debug_get_patterns();
37 invocation.ui.write_stdout(patterns)?;
37 invocation.ui.write_stdout(patterns)?;
38 invocation.ui.write_stdout(b"\n")?;
38 invocation.ui.write_stdout(b"\n")?;
39 Ok(())
39 Ok(())
40 }
40 }
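The closure passed to `get_ignore_matcher` is where this change surfaces in rhg: it now receives the source of each chunk of ignore patterns in addition to the raw bytes, which is what lets callers (notably the dirstate-v2 docket hash) tell identical patterns coming from different files apart. A hypothetical variant of the command above, shown only as a sketch, reusing the same imports and assuming `source` is a `&Path`, could use that extra argument like this:

pub fn run_listing_sources(
    invocation: &crate::CliInvocation,
) -> Result<(), CommandError> {
    let repo = invocation.repo?;
    let ignore_file = repo.working_directory_vfs().join(".hgignore"); // TODO hardcoded

    // Collect one line per ignore file seen while building the matcher.
    let mut seen: Vec<String> = Vec::new();
    let (ignore_matcher, warnings) = get_ignore_matcher(
        vec![ignore_file],
        &repo.working_directory_path().to_owned(),
        &mut |source, pattern_bytes| {
            // `source` is the file these pattern bytes were read from
            // (for example an `include:`d or `subinclude:`d file).
            seen.push(format!(
                "{}: {} bytes of patterns",
                source.display(),
                pattern_bytes.len()
            ));
        },
    )
    .map_err(StatusError::from)?;

    if !warnings.is_empty() {
        warn!("Pattern warnings: {:?}", &warnings);
    }
    for line in &seen {
        invocation.ui.write_stdout(line.as_bytes())?;
        invocation.ui.write_stdout(b"\n")?;
    }
    invocation.ui.write_stdout(ignore_matcher.debug_get_patterns())?;
    invocation.ui.write_stdout(b"\n")?;
    Ok(())
}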
@@ -1,465 +1,471
1 #testcases dirstate-v1 dirstate-v2
1 #testcases dirstate-v1 dirstate-v2
2
2
3 #if dirstate-v2
3 #if dirstate-v2
4 $ cat >> $HGRCPATH << EOF
4 $ cat >> $HGRCPATH << EOF
5 > [format]
5 > [format]
6 > use-dirstate-v2=1
6 > use-dirstate-v2=1
7 > [storage]
7 > [storage]
8 > dirstate-v2.slow-path=allow
8 > dirstate-v2.slow-path=allow
9 > EOF
9 > EOF
10 #endif
10 #endif
11
11
12 $ hg init ignorerepo
12 $ hg init ignorerepo
13 $ cd ignorerepo
13 $ cd ignorerepo
14
14
15 debugignore with no hgignore should be deterministic:
15 debugignore with no hgignore should be deterministic:
16 $ hg debugignore
16 $ hg debugignore
17 <nevermatcher>
17 <nevermatcher>
18
18
19 Issue562: .hgignore requires newline at end:
19 Issue562: .hgignore requires newline at end:
20
20
21 $ touch foo
21 $ touch foo
22 $ touch bar
22 $ touch bar
23 $ touch baz
23 $ touch baz
24 $ cat > makeignore.py <<EOF
24 $ cat > makeignore.py <<EOF
25 > f = open(".hgignore", "w")
25 > f = open(".hgignore", "w")
26 > f.write("ignore\n")
26 > f.write("ignore\n")
27 > f.write("foo\n")
27 > f.write("foo\n")
28 > # No EOL here
28 > # No EOL here
29 > f.write("bar")
29 > f.write("bar")
30 > f.close()
30 > f.close()
31 > EOF
31 > EOF
32
32
33 $ "$PYTHON" makeignore.py
33 $ "$PYTHON" makeignore.py
34
34
35 Should display baz only:
35 Should display baz only:
36
36
37 $ hg status
37 $ hg status
38 ? baz
38 ? baz
39
39
40 $ rm foo bar baz .hgignore makeignore.py
40 $ rm foo bar baz .hgignore makeignore.py
41
41
42 $ touch a.o
42 $ touch a.o
43 $ touch a.c
43 $ touch a.c
44 $ touch syntax
44 $ touch syntax
45 $ mkdir dir
45 $ mkdir dir
46 $ touch dir/a.o
46 $ touch dir/a.o
47 $ touch dir/b.o
47 $ touch dir/b.o
48 $ touch dir/c.o
48 $ touch dir/c.o
49
49
50 $ hg add dir/a.o
50 $ hg add dir/a.o
51 $ hg commit -m 0
51 $ hg commit -m 0
52 $ hg add dir/b.o
52 $ hg add dir/b.o
53
53
54 $ hg status
54 $ hg status
55 A dir/b.o
55 A dir/b.o
56 ? a.c
56 ? a.c
57 ? a.o
57 ? a.o
58 ? dir/c.o
58 ? dir/c.o
59 ? syntax
59 ? syntax
60
60
61 $ echo "*.o" > .hgignore
61 $ echo "*.o" > .hgignore
62 $ hg status
62 $ hg status
63 abort: $TESTTMP/ignorerepo/.hgignore: invalid pattern (relre): *.o (glob)
63 abort: $TESTTMP/ignorerepo/.hgignore: invalid pattern (relre): *.o (glob)
64 [255]
64 [255]
65
65
66 $ echo 're:^(?!a).*\.o$' > .hgignore
66 $ echo 're:^(?!a).*\.o$' > .hgignore
67 $ hg status
67 $ hg status
68 A dir/b.o
68 A dir/b.o
69 ? .hgignore
69 ? .hgignore
70 ? a.c
70 ? a.c
71 ? a.o
71 ? a.o
72 ? syntax
72 ? syntax
73 #if rhg
73 #if rhg
74 $ hg status --config rhg.on-unsupported=abort
74 $ hg status --config rhg.on-unsupported=abort
75 unsupported feature: Unsupported syntax regex parse error:
75 unsupported feature: Unsupported syntax regex parse error:
76 ^(?:^(?!a).*\.o$)
76 ^(?:^(?!a).*\.o$)
77 ^^^
77 ^^^
78 error: look-around, including look-ahead and look-behind, is not supported
78 error: look-around, including look-ahead and look-behind, is not supported
79 [252]
79 [252]
80 #endif
80 #endif
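The unsupported-feature message above shows the pattern after rhg has wrapped it as `^(?:…)`, and its wording matches the `regex` crate's look-around error. A minimal standalone reproduction with that crate (an assumption for illustration only; rhg's real error plumbing wraps this differently) is:

fn main() {
    // The wrapped pattern shown in the rhg output above.
    let pattern = r"^(?:^(?!a).*\.o$)";
    match regex::Regex::new(pattern) {
        Ok(_) => println!("unexpectedly compiled"),
        // Prints the same "look-around ... is not supported" diagnostic.
        Err(err) => println!("{}", err),
    }
}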
81
81
82 Ensure given files are relative to cwd
82 Ensure given files are relative to cwd
83
83
84 $ echo "dir/.*\.o" > .hgignore
84 $ echo "dir/.*\.o" > .hgignore
85 $ hg status -i
85 $ hg status -i
86 I dir/c.o
86 I dir/c.o
87
87
88 $ hg debugignore dir/c.o dir/missing.o
88 $ hg debugignore dir/c.o dir/missing.o
89 dir/c.o is ignored
89 dir/c.o is ignored
90 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
90 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
91 dir/missing.o is ignored
91 dir/missing.o is ignored
92 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
92 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
93 $ cd dir
93 $ cd dir
94 $ hg debugignore c.o missing.o
94 $ hg debugignore c.o missing.o
95 c.o is ignored
95 c.o is ignored
96 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
96 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
97 missing.o is ignored
97 missing.o is ignored
98 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
98 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
99
99
100 For icasefs, inexact matches also work, except for missing files
100 For icasefs, inexact matches also work, except for missing files
101
101
102 #if icasefs
102 #if icasefs
103 $ hg debugignore c.O missing.O
103 $ hg debugignore c.O missing.O
104 c.o is ignored
104 c.o is ignored
105 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
105 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
106 missing.O is not ignored
106 missing.O is not ignored
107 #endif
107 #endif
108
108
109 $ cd ..
109 $ cd ..
110
110
111 $ echo ".*\.o" > .hgignore
111 $ echo ".*\.o" > .hgignore
112 $ hg status
112 $ hg status
113 A dir/b.o
113 A dir/b.o
114 ? .hgignore
114 ? .hgignore
115 ? a.c
115 ? a.c
116 ? syntax
116 ? syntax
117
117
118 Ensure that comments work:
118 Ensure that comments work:
119
119
120 $ touch 'foo#bar' 'quux#' 'quu0#'
120 $ touch 'foo#bar' 'quux#' 'quu0#'
121 #if no-windows
121 #if no-windows
122 $ touch 'baz\' 'baz\wat' 'ba0\#wat' 'ba1\\' 'ba1\\wat' 'quu0\'
122 $ touch 'baz\' 'baz\wat' 'ba0\#wat' 'ba1\\' 'ba1\\wat' 'quu0\'
123 #endif
123 #endif
124
124
125 $ cat <<'EOF' >> .hgignore
125 $ cat <<'EOF' >> .hgignore
126 > # full-line comment
126 > # full-line comment
127 > # whitespace-only comment line
127 > # whitespace-only comment line
128 > syntax# pattern, no whitespace, then comment
128 > syntax# pattern, no whitespace, then comment
129 > a.c # pattern, then whitespace, then comment
129 > a.c # pattern, then whitespace, then comment
130 > baz\\# # (escaped) backslash, then comment
130 > baz\\# # (escaped) backslash, then comment
131 > ba0\\\#w # (escaped) backslash, escaped comment character, then comment
131 > ba0\\\#w # (escaped) backslash, escaped comment character, then comment
132 > ba1\\\\# # (escaped) backslashes, then comment
132 > ba1\\\\# # (escaped) backslashes, then comment
133 > foo\#b # escaped comment character
133 > foo\#b # escaped comment character
134 > quux\## escaped comment character at end of name
134 > quux\## escaped comment character at end of name
135 > EOF
135 > EOF
136 $ hg status
136 $ hg status
137 A dir/b.o
137 A dir/b.o
138 ? .hgignore
138 ? .hgignore
139 ? quu0#
139 ? quu0#
140 ? quu0\ (no-windows !)
140 ? quu0\ (no-windows !)
141
141
142 $ cat <<'EOF' > .hgignore
142 $ cat <<'EOF' > .hgignore
143 > .*\.o
143 > .*\.o
144 > syntax: glob
144 > syntax: glob
145 > syntax# pattern, no whitespace, then comment
145 > syntax# pattern, no whitespace, then comment
146 > a.c # pattern, then whitespace, then comment
146 > a.c # pattern, then whitespace, then comment
147 > baz\\#* # (escaped) backslash, then comment
147 > baz\\#* # (escaped) backslash, then comment
148 > ba0\\\#w* # (escaped) backslash, escaped comment character, then comment
148 > ba0\\\#w* # (escaped) backslash, escaped comment character, then comment
149 > ba1\\\\#* # (escaped) backslashes, then comment
149 > ba1\\\\#* # (escaped) backslashes, then comment
150 > foo\#b* # escaped comment character
150 > foo\#b* # escaped comment character
151 > quux\## escaped comment character at end of name
151 > quux\## escaped comment character at end of name
152 > quu0[\#]# escaped comment character inside [...]
152 > quu0[\#]# escaped comment character inside [...]
153 > EOF
153 > EOF
154 $ hg status
154 $ hg status
155 A dir/b.o
155 A dir/b.o
156 ? .hgignore
156 ? .hgignore
157 ? ba1\\wat (no-windows !)
157 ? ba1\\wat (no-windows !)
158 ? baz\wat (no-windows !)
158 ? baz\wat (no-windows !)
159 ? quu0\ (no-windows !)
159 ? quu0\ (no-windows !)
160
160
161 $ rm 'foo#bar' 'quux#' 'quu0#'
161 $ rm 'foo#bar' 'quux#' 'quu0#'
162 #if no-windows
162 #if no-windows
163 $ rm 'baz\' 'baz\wat' 'ba0\#wat' 'ba1\\' 'ba1\\wat' 'quu0\'
163 $ rm 'baz\' 'baz\wat' 'ba0\#wat' 'ba1\\' 'ba1\\wat' 'quu0\'
164 #endif
164 #endif
165
165
166 Check that '^\.' does not ignore the root directory:
166 Check that '^\.' does not ignore the root directory:
167
167
168 $ echo "^\." > .hgignore
168 $ echo "^\." > .hgignore
169 $ hg status
169 $ hg status
170 A dir/b.o
170 A dir/b.o
171 ? a.c
171 ? a.c
172 ? a.o
172 ? a.o
173 ? dir/c.o
173 ? dir/c.o
174 ? syntax
174 ? syntax
175
175
176 Test that patterns from ui.ignore options are read:
176 Test that patterns from ui.ignore options are read:
177
177
178 $ echo > .hgignore
178 $ echo > .hgignore
179 $ cat >> $HGRCPATH << EOF
179 $ cat >> $HGRCPATH << EOF
180 > [ui]
180 > [ui]
181 > ignore.other = $TESTTMP/ignorerepo/.hg/testhgignore
181 > ignore.other = $TESTTMP/ignorerepo/.hg/testhgignore
182 > EOF
182 > EOF
183 $ echo "glob:**.o" > .hg/testhgignore
183 $ echo "glob:**.o" > .hg/testhgignore
184 $ hg status
184 $ hg status
185 A dir/b.o
185 A dir/b.o
186 ? .hgignore
186 ? .hgignore
187 ? a.c
187 ? a.c
188 ? syntax
188 ? syntax
189
189
190 empty out testhgignore
190 empty out testhgignore
191 $ echo > .hg/testhgignore
191 $ echo > .hg/testhgignore
192
192
193 Test relative ignore path (issue4473):
193 Test relative ignore path (issue4473):
194
194
195 $ cat >> $HGRCPATH << EOF
195 $ cat >> $HGRCPATH << EOF
196 > [ui]
196 > [ui]
197 > ignore.relative = .hg/testhgignorerel
197 > ignore.relative = .hg/testhgignorerel
198 > EOF
198 > EOF
199 $ echo "glob:*.o" > .hg/testhgignorerel
199 $ echo "glob:*.o" > .hg/testhgignorerel
200 $ cd dir
200 $ cd dir
201 $ hg status
201 $ hg status
202 A dir/b.o
202 A dir/b.o
203 ? .hgignore
203 ? .hgignore
204 ? a.c
204 ? a.c
205 ? syntax
205 ? syntax
206 $ hg debugignore
206 $ hg debugignore
207 <includematcher includes='.*\\.o(?:/|$)'>
207 <includematcher includes='.*\\.o(?:/|$)'>
208
208
209 $ cd ..
209 $ cd ..
210 $ echo > .hg/testhgignorerel
210 $ echo > .hg/testhgignorerel
211 $ echo "syntax: glob" > .hgignore
211 $ echo "syntax: glob" > .hgignore
212 $ echo "re:.*\.o" >> .hgignore
212 $ echo "re:.*\.o" >> .hgignore
213 $ hg status
213 $ hg status
214 A dir/b.o
214 A dir/b.o
215 ? .hgignore
215 ? .hgignore
216 ? a.c
216 ? a.c
217 ? syntax
217 ? syntax
218
218
219 $ echo "syntax: invalid" > .hgignore
219 $ echo "syntax: invalid" > .hgignore
220 $ hg status
220 $ hg status
221 $TESTTMP/ignorerepo/.hgignore: ignoring invalid syntax 'invalid'
221 $TESTTMP/ignorerepo/.hgignore: ignoring invalid syntax 'invalid'
222 A dir/b.o
222 A dir/b.o
223 ? .hgignore
223 ? .hgignore
224 ? a.c
224 ? a.c
225 ? a.o
225 ? a.o
226 ? dir/c.o
226 ? dir/c.o
227 ? syntax
227 ? syntax
228
228
229 $ echo "syntax: glob" > .hgignore
229 $ echo "syntax: glob" > .hgignore
230 $ echo "*.o" >> .hgignore
230 $ echo "*.o" >> .hgignore
231 $ hg status
231 $ hg status
232 A dir/b.o
232 A dir/b.o
233 ? .hgignore
233 ? .hgignore
234 ? a.c
234 ? a.c
235 ? syntax
235 ? syntax
236
236
237 $ echo "relglob:syntax*" > .hgignore
237 $ echo "relglob:syntax*" > .hgignore
238 $ hg status
238 $ hg status
239 A dir/b.o
239 A dir/b.o
240 ? .hgignore
240 ? .hgignore
241 ? a.c
241 ? a.c
242 ? a.o
242 ? a.o
243 ? dir/c.o
243 ? dir/c.o
244
244
245 $ echo "relglob:*" > .hgignore
245 $ echo "relglob:*" > .hgignore
246 $ hg status
246 $ hg status
247 A dir/b.o
247 A dir/b.o
248
248
249 $ cd dir
249 $ cd dir
250 $ hg status .
250 $ hg status .
251 A b.o
251 A b.o
252
252
253 $ hg debugignore
253 $ hg debugignore
254 <includematcher includes='.*(?:/|$)'>
254 <includematcher includes='.*(?:/|$)'>
255
255
256 $ hg debugignore b.o
256 $ hg debugignore b.o
257 b.o is ignored
257 b.o is ignored
258 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: '*') (glob)
258 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: '*') (glob)
259
259
260 $ cd ..
260 $ cd ..
261
261
262 Check patterns that match only the directory
262 Check patterns that match only the directory
263
263
264 "(fsmonitor !)" below assumes that fsmonitor is enabled with
264 "(fsmonitor !)" below assumes that fsmonitor is enabled with
265 "walk_on_invalidate = false" (default), which doesn't involve
265 "walk_on_invalidate = false" (default), which doesn't involve
266 re-walking whole repository at detection of .hgignore change.
266 re-walking whole repository at detection of .hgignore change.
267
267
268 $ echo "^dir\$" > .hgignore
268 $ echo "^dir\$" > .hgignore
269 $ hg status
269 $ hg status
270 A dir/b.o
270 A dir/b.o
271 ? .hgignore
271 ? .hgignore
272 ? a.c
272 ? a.c
273 ? a.o
273 ? a.o
274 ? dir/c.o (fsmonitor !)
274 ? dir/c.o (fsmonitor !)
275 ? syntax
275 ? syntax
276
276
277 Check recursive glob pattern matches no directories (dir/**/c.o matches dir/c.o)
277 Check recursive glob pattern matches no directories (dir/**/c.o matches dir/c.o)
278
278
279 $ echo "syntax: glob" > .hgignore
279 $ echo "syntax: glob" > .hgignore
280 $ echo "dir/**/c.o" >> .hgignore
280 $ echo "dir/**/c.o" >> .hgignore
281 $ touch dir/c.o
281 $ touch dir/c.o
282 $ mkdir dir/subdir
282 $ mkdir dir/subdir
283 $ touch dir/subdir/c.o
283 $ touch dir/subdir/c.o
284 $ hg status
284 $ hg status
285 A dir/b.o
285 A dir/b.o
286 ? .hgignore
286 ? .hgignore
287 ? a.c
287 ? a.c
288 ? a.o
288 ? a.o
289 ? syntax
289 ? syntax
290 $ hg debugignore a.c
290 $ hg debugignore a.c
291 a.c is not ignored
291 a.c is not ignored
292 $ hg debugignore dir/c.o
292 $ hg debugignore dir/c.o
293 dir/c.o is ignored
293 dir/c.o is ignored
294 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 2: 'dir/**/c.o') (glob)
294 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 2: 'dir/**/c.o') (glob)
295
295
296 Check rooted globs
296 Check rooted globs
297
297
298 $ hg purge --all --config extensions.purge=
298 $ hg purge --all --config extensions.purge=
299 $ echo "syntax: rootglob" > .hgignore
299 $ echo "syntax: rootglob" > .hgignore
300 $ echo "a/*.ext" >> .hgignore
300 $ echo "a/*.ext" >> .hgignore
301 $ for p in a b/a aa; do mkdir -p $p; touch $p/b.ext; done
301 $ for p in a b/a aa; do mkdir -p $p; touch $p/b.ext; done
302 $ hg status -A 'set:**.ext'
302 $ hg status -A 'set:**.ext'
303 ? aa/b.ext
303 ? aa/b.ext
304 ? b/a/b.ext
304 ? b/a/b.ext
305 I a/b.ext
305 I a/b.ext
306
306
307 Check using 'include:' in ignore file
307 Check using 'include:' in ignore file
308
308
309 $ hg purge --all --config extensions.purge=
309 $ hg purge --all --config extensions.purge=
310 $ touch foo.included
310 $ touch foo.included
311
311
312 $ echo ".*.included" > otherignore
312 $ echo ".*.included" > otherignore
313 $ hg status -I "include:otherignore"
313 $ hg status -I "include:otherignore"
314 ? foo.included
314 ? foo.included
315
315
316 $ echo "include:otherignore" >> .hgignore
316 $ echo "include:otherignore" >> .hgignore
317 $ hg status
317 $ hg status
318 A dir/b.o
318 A dir/b.o
319 ? .hgignore
319 ? .hgignore
320 ? otherignore
320 ? otherignore
321
321
322 Check recursive uses of 'include:'
322 Check recursive uses of 'include:'
323
323
324 $ echo "include:nested/ignore" >> otherignore
324 $ echo "include:nested/ignore" >> otherignore
325 $ mkdir nested nested/more
325 $ mkdir nested nested/more
326 $ echo "glob:*ignore" > nested/ignore
326 $ echo "glob:*ignore" > nested/ignore
327 $ echo "rootglob:a" >> nested/ignore
327 $ echo "rootglob:a" >> nested/ignore
328 $ touch a nested/a nested/more/a
328 $ touch a nested/a nested/more/a
329 $ hg status
329 $ hg status
330 A dir/b.o
330 A dir/b.o
331 ? nested/a
331 ? nested/a
332 ? nested/more/a
332 ? nested/more/a
333 $ rm a nested/a nested/more/a
333 $ rm a nested/a nested/more/a
334
334
335 $ cp otherignore goodignore
335 $ cp otherignore goodignore
336 $ echo "include:badignore" >> otherignore
336 $ echo "include:badignore" >> otherignore
337 $ hg status
337 $ hg status
338 skipping unreadable pattern file 'badignore': $ENOENT$
338 skipping unreadable pattern file 'badignore': $ENOENT$
339 A dir/b.o
339 A dir/b.o
340
340
341 $ mv goodignore otherignore
341 $ mv goodignore otherignore
342
342
343 Check using 'include:' while in a non-root directory
343 Check using 'include:' while in a non-root directory
344
344
345 $ cd ..
345 $ cd ..
346 $ hg -R ignorerepo status
346 $ hg -R ignorerepo status
347 A dir/b.o
347 A dir/b.o
348 $ cd ignorerepo
348 $ cd ignorerepo
349
349
350 Check including subincludes
350 Check including subincludes
351
351
352 $ hg revert -q --all
352 $ hg revert -q --all
353 $ hg purge --all --config extensions.purge=
353 $ hg purge --all --config extensions.purge=
354 $ echo ".hgignore" > .hgignore
354 $ echo ".hgignore" > .hgignore
355 $ mkdir dir1 dir2
355 $ mkdir dir1 dir2
356 $ touch dir1/file1 dir1/file2 dir2/file1 dir2/file2
356 $ touch dir1/file1 dir1/file2 dir2/file1 dir2/file2
357 $ echo "subinclude:dir2/.hgignore" >> .hgignore
357 $ echo "subinclude:dir2/.hgignore" >> .hgignore
358 $ echo "glob:file*2" > dir2/.hgignore
358 $ echo "glob:file*2" > dir2/.hgignore
359 $ hg status
359 $ hg status
360 ? dir1/file1
360 ? dir1/file1
361 ? dir1/file2
361 ? dir1/file2
362 ? dir2/file1
362 ? dir2/file1
363
363
364 Check including subincludes with other patterns
364 Check including subincludes with other patterns
365
365
366 $ echo "subinclude:dir1/.hgignore" >> .hgignore
366 $ echo "subinclude:dir1/.hgignore" >> .hgignore
367
367
368 $ mkdir dir1/subdir
368 $ mkdir dir1/subdir
369 $ touch dir1/subdir/file1
369 $ touch dir1/subdir/file1
370 $ echo "rootglob:f?le1" > dir1/.hgignore
370 $ echo "rootglob:f?le1" > dir1/.hgignore
371 $ hg status
371 $ hg status
372 ? dir1/file2
372 ? dir1/file2
373 ? dir1/subdir/file1
373 ? dir1/subdir/file1
374 ? dir2/file1
374 ? dir2/file1
375 $ rm dir1/subdir/file1
375 $ rm dir1/subdir/file1
376
376
377 $ echo "regexp:f.le1" > dir1/.hgignore
377 $ echo "regexp:f.le1" > dir1/.hgignore
378 $ hg status
378 $ hg status
379 ? dir1/file2
379 ? dir1/file2
380 ? dir2/file1
380 ? dir2/file1
381
381
382 Check multiple levels of sub-ignores
382 Check multiple levels of sub-ignores
383
383
384 $ touch dir1/subdir/subfile1 dir1/subdir/subfile3 dir1/subdir/subfile4
384 $ touch dir1/subdir/subfile1 dir1/subdir/subfile3 dir1/subdir/subfile4
385 $ echo "subinclude:subdir/.hgignore" >> dir1/.hgignore
385 $ echo "subinclude:subdir/.hgignore" >> dir1/.hgignore
386 $ echo "glob:subfil*3" >> dir1/subdir/.hgignore
386 $ echo "glob:subfil*3" >> dir1/subdir/.hgignore
387
387
388 $ hg status
388 $ hg status
389 ? dir1/file2
389 ? dir1/file2
390 ? dir1/subdir/subfile4
390 ? dir1/subdir/subfile4
391 ? dir2/file1
391 ? dir2/file1
392
392
393 Check include subignore at the same level
393 Check include subignore at the same level
394
394
395 $ mv dir1/subdir/.hgignore dir1/.hgignoretwo
395 $ mv dir1/subdir/.hgignore dir1/.hgignoretwo
396 $ echo "regexp:f.le1" > dir1/.hgignore
396 $ echo "regexp:f.le1" > dir1/.hgignore
397 $ echo "subinclude:.hgignoretwo" >> dir1/.hgignore
397 $ echo "subinclude:.hgignoretwo" >> dir1/.hgignore
398 $ echo "glob:file*2" > dir1/.hgignoretwo
398 $ echo "glob:file*2" > dir1/.hgignoretwo
399
399
400 $ hg status | grep file2
400 $ hg status | grep file2
401 [1]
401 [1]
402 $ hg debugignore dir1/file2
402 $ hg debugignore dir1/file2
403 dir1/file2 is ignored
403 dir1/file2 is ignored
404 (ignore rule in dir2/.hgignore, line 1: 'file*2')
404 (ignore rule in dir2/.hgignore, line 1: 'file*2')
405
405
406 #if windows
406 #if windows
407
407
408 Windows paths are accepted on input
408 Windows paths are accepted on input
409
409
410 $ rm dir1/.hgignore
410 $ rm dir1/.hgignore
411 $ echo "dir1/file*" >> .hgignore
411 $ echo "dir1/file*" >> .hgignore
412 $ hg debugignore "dir1\file2"
412 $ hg debugignore "dir1\file2"
413 dir1/file2 is ignored
413 dir1/file2 is ignored
414 (ignore rule in $TESTTMP\ignorerepo\.hgignore, line 4: 'dir1/file*')
414 (ignore rule in $TESTTMP\ignorerepo\.hgignore, line 4: 'dir1/file*')
415 $ hg up -qC .
415 $ hg up -qC .
416
416
417 #endif
417 #endif
418
418
419 #if dirstate-v2 rust
419 #if dirstate-v2 rust
420
420
421 Check the hash of ignore patterns written in the dirstate
421 Check the hash of ignore patterns written in the dirstate
422 This is an optimization that is only relevant when using the Rust extensions
422 This is an optimization that is only relevant when using the Rust extensions
423
423
424 $ cat_filename_and_hash () {
425 > for i in "$@"; do
426 > printf "$i "
427 > cat "$i" | "$TESTDIR"/f --raw-sha1 | sed 's/^raw-sha1=//'
428 > done
429 > }
424 $ hg status > /dev/null
430 $ hg status > /dev/null
425 $ cat .hg/testhgignore .hg/testhgignorerel .hgignore dir2/.hgignore dir1/.hgignore dir1/.hgignoretwo | $TESTDIR/f --sha1
431 $ cat_filename_and_hash .hg/testhgignore .hg/testhgignorerel .hgignore dir2/.hgignore dir1/.hgignore dir1/.hgignoretwo | $TESTDIR/f --sha1
426 sha1=6e315b60f15fb5dfa02be00f3e2c8f923051f5ff
432 sha1=c0beb296395d48ced8e14f39009c4ea6e409bfe6
427 $ hg debugstate --docket | grep ignore
433 $ hg debugstate --docket | grep ignore
428 ignore pattern hash: 6e315b60f15fb5dfa02be00f3e2c8f923051f5ff
434 ignore pattern hash: c0beb296395d48ced8e14f39009c4ea6e409bfe6
429
435
430 $ echo rel > .hg/testhgignorerel
436 $ echo rel > .hg/testhgignorerel
431 $ hg status > /dev/null
437 $ hg status > /dev/null
432 $ cat .hg/testhgignore .hg/testhgignorerel .hgignore dir2/.hgignore dir1/.hgignore dir1/.hgignoretwo | $TESTDIR/f --sha1
438 $ cat_filename_and_hash .hg/testhgignore .hg/testhgignorerel .hgignore dir2/.hgignore dir1/.hgignore dir1/.hgignoretwo | $TESTDIR/f --sha1
433 sha1=dea19cc7119213f24b6b582a4bae7b0cb063e34e
439 sha1=b8e63d3428ec38abc68baa27631516d5ec46b7fa
434 $ hg debugstate --docket | grep ignore
440 $ hg debugstate --docket | grep ignore
435 ignore pattern hash: dea19cc7119213f24b6b582a4bae7b0cb063e34e
441 ignore pattern hash: b8e63d3428ec38abc68baa27631516d5ec46b7fa
436 $ cd ..
442 $ cd ..
437
443
438 Check that the hash depends on the source of the hgignore patterns
444 Check that the hash depends on the source of the hgignore patterns
439 (otherwise the context is lost and things like subinclude are cached improperly)
445 (otherwise the context is lost and things like subinclude are cached improperly)
440
446
441 $ hg init ignore-collision
447 $ hg init ignore-collision
442 $ cd ignore-collision
448 $ cd ignore-collision
443 $ echo > .hg/testhgignorerel
449 $ echo > .hg/testhgignorerel
444
450
445 $ mkdir dir1/ dir1/subdir
451 $ mkdir dir1/ dir1/subdir
446 $ touch dir1/subdir/f dir1/subdir/ignored1
452 $ touch dir1/subdir/f dir1/subdir/ignored1
447 $ echo 'ignored1' > dir1/.hgignore
453 $ echo 'ignored1' > dir1/.hgignore
448
454
449 $ mkdir dir2 dir2/subdir
455 $ mkdir dir2 dir2/subdir
450 $ touch dir2/subdir/f dir2/subdir/ignored2
456 $ touch dir2/subdir/f dir2/subdir/ignored2
451 $ echo 'ignored2' > dir2/.hgignore
457 $ echo 'ignored2' > dir2/.hgignore
452 $ echo 'subinclude:dir2/.hgignore' >> .hgignore
458 $ echo 'subinclude:dir2/.hgignore' >> .hgignore
453 $ echo 'subinclude:dir1/.hgignore' >> .hgignore
459 $ echo 'subinclude:dir1/.hgignore' >> .hgignore
454
460
455 $ hg commit -Aqm_
461 $ hg commit -Aqm_
456
462
457 $ > dir1/.hgignore
463 $ > dir1/.hgignore
458 $ echo 'ignored' > dir2/.hgignore
464 $ echo 'ignored' > dir2/.hgignore
459 $ echo 'ignored1' >> dir2/.hgignore
465 $ echo 'ignored1' >> dir2/.hgignore
460 $ hg status
466 $ hg status
461 M dir1/.hgignore
467 M dir1/.hgignore
462 M dir2/.hgignore
468 M dir2/.hgignore
463 ? dir1/subdir/ignored1 (missing-correct-output !)
469 ? dir1/subdir/ignored1
464
470
465 #endif
471 #endif
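The collision scenario at the end is the point of this change: previously the docket's "ignore pattern hash" was computed over the concatenated pattern bytes only, so emptying dir1/.hgignore and moving its line into dir2/.hgignore produced the same hash and stale cached ignore results (the "missing-correct-output" line that this change removes). Hashing each source file name along with its contents, as the updated cat_filename_and_hash helper does on the shell side, breaks that collision. A small self-contained sketch of the idea, using the RustCrypto `sha1` crate purely for illustration and not Mercurial's actual hashing code, is:

use sha1::{Digest, Sha1};

// Hash a set of (source file, contents) pairs such that moving the same
// pattern text between files changes the result.
fn ignore_pattern_hash(files: &[(&str, &[u8])]) -> String {
    let mut hasher = Sha1::new();
    for (source, contents) in files {
        // Feeding the file name as well as the bytes is what distinguishes
        // "dir1 empty, dir2 has the pattern" from the swapped layout.
        hasher.update(source.as_bytes());
        hasher.update(b"\n");
        hasher.update(*contents);
    }
    hasher
        .finalize()
        .iter()
        .map(|byte| format!("{:02x}", byte))
        .collect()
}

fn main() {
    let before = ignore_pattern_hash(&[
        ("dir1/.hgignore", &b"ignored1\n"[..]),
        ("dir2/.hgignore", &b"ignored2\n"[..]),
    ]);
    let after = ignore_pattern_hash(&[
        ("dir1/.hgignore", &b""[..]),
        ("dir2/.hgignore", &b"ignored2\nignored1\n"[..]),
    ]);
    // Same pattern lines overall, distributed differently across files:
    // the hashes differ, so the cached ignore data is invalidated correctly.
    assert_ne!(before, after);
    println!("{}", before);
    println!("{}", after);
}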