dirstate-v2: hash the source of the ignore patterns as well...
Raphaël Gomès
r50453:363923bd stable
@@ -1,616 +1,624
The *dirstate* is what Mercurial uses internally to track
the state of files in the working directory,
such as set by commands like `hg add` and `hg rm`.
It also contains some cached data that help make `hg status` faster.
The name refers both to `.hg/dirstate` on the filesystem
and the corresponding data structure in memory while a Mercurial process
is running.

The original file format, retroactively dubbed `dirstate-v1`,
is described at https://www.mercurial-scm.org/wiki/DirState.
It is made of a flat sequence of unordered variable-size entries,
so accessing any information in it requires parsing all of it.
Similarly, saving changes requires rewriting the entire file.

The newer `dirstate-v2` file format is designed to fix these limitations
and make `hg status` faster.

User guide
==========

Compatibility
-------------

The file format is experimental and may still change.
Different versions of Mercurial may not be compatible with each other
when working on a local repository that uses this format.
When using an incompatible version with the experimental format,
anything can happen, including data corruption.

Since the dirstate is entirely local and not relevant to the wire protocol,
`dirstate-v2` does not affect compatibility with remote Mercurial versions.

When `share-safe` is enabled, different repositories sharing the same store
can use different dirstate formats.

Enabling `dirstate-v2` for new local repositories
-------------------------------------------------

When creating a new local repository such as with `hg init` or `hg clone`,
the `use-dirstate-v2` boolean in the `format` configuration section
controls whether to use this file format.
This is disabled by default as of this writing.
To enable it for a single repository, run for example::

  $ hg init my-project --config format.use-dirstate-v2=1

Checking the format of an existing local repository
----------------------------------------------------

The `debugformat` command prints information about
which of multiple optional formats are used in the current repository,
including `dirstate-v2`::

  $ hg debugformat
  format-variant repo
  fncache: yes
  dirstate-v2: yes
  […]


Upgrading or downgrading an existing local repository
------------------------------------------------------

The `debugupgrade` command does various upgrades or downgrades
on a local repository
based on the current Mercurial version and on configuration.
The same `format.use-dirstate-v2` configuration is used again.

Example to upgrade::

  $ hg debugupgrade --config format.use-dirstate-v2=1

Example to downgrade to `dirstate-v1`::

  $ hg debugupgrade --config format.use-dirstate-v2=0

Both of these commands do nothing but print a list of proposed changes,
which may include changes unrelated to the dirstate.
Those other changes are controlled by their own configuration keys.
Add `--run` to a command to actually apply the proposed changes.

Backups of `.hg/requires` and `.hg/dirstate` are created
in a `.hg/upgradebackup.*` directory.
If something goes wrong, restoring those files should undo the change.

Note that upgrading affects compatibility with older versions of Mercurial
as noted above.
This can be relevant when a repository’s files are on a USB drive
or some other removable media, or shared over the network, etc.

Internal filesystem representation
==================================

Requirements file
-----------------

The `.hg/requires` file indicates which of various optional file formats
are used by a given repository.
Mercurial aborts when seeing a requirement it does not know about,
which avoids older versions accidentally messing up a repository
that uses a format that was introduced later.
For versions that do support a format, the presence or absence of
the corresponding requirement indicates whether to use that format.

When the file contains a `dirstate-v2` line,
the `dirstate-v2` format is used.
With no such line, `dirstate-v1` is used.

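For illustration, a repository created with `format.use-dirstate-v2=1`
might have a requirements file such as the following; the `dirstate-v2`
line is the relevant one here, and the exact set of other entries varies
with Mercurial version and configuration::

  $ cat .hg/requires
  dirstate-v2
  dotencode
  fncache
  generaldelta
  revlogv1
  sparserevlog
  store
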
High level description
----------------------

Whereas `dirstate-v1` uses a single `.hg/dirstate` file,
in `dirstate-v2` that file is a "docket" file
that only contains some metadata
and points to a separate data file named `.hg/dirstate.{ID}`,
where `{ID}` is a random identifier.

This separation allows making data files append-only
and therefore safer to memory-map.
Creating a new data file (occasionally to clean up unused data)
can be done with a different ID
without disrupting another Mercurial process
that could still be using the previous data file.

Both files have a format designed to reduce the need for parsing,
by using fixed-size binary components as much as possible.
For data that is not fixed-size,
references to other parts of a file can be made by storing "pseudo-pointers":
integers counted in bytes from the start of a file.
For read-only access no data structure is needed,
only a bytes buffer (possibly memory-mapped directly from the filesystem)
with specific parts read on demand.

The data file contains "nodes" organized in a tree.
Each node represents a file or directory inside the working directory
or its parent changeset.
This tree has the same structure as the filesystem,
so a node representing a directory has child nodes representing
the files and subdirectories contained directly in that directory.

The docket file format
----------------------

This is implemented in `rust/hg-core/src/dirstate_tree/on_disk.rs`
and `mercurial/dirstateutils/docket.py`.

Components of the docket file are found at fixed offsets,
counted in bytes from the start of the file:

* Offset 0:
  The 12-byte marker string "dirstate-v2\n", ending with a newline character.
  This makes it easier to tell a dirstate-v2 file from a dirstate-v1 file,
  although it is not strictly necessary
  since `.hg/requires` determines which format to use.

* Offset 12:
  The changeset node ID of the first parent of the working directory,
  as up to 32 binary bytes.
  If a node ID is shorter (20 bytes for SHA-1),
  it is start-aligned and the rest of the bytes are set to zero.

* Offset 44:
  The changeset node ID of the second parent of the working directory,
  or all zeros if there isn’t one.
  Also 32 binary bytes.

* Offset 76:
  Tree metadata on 44 bytes, described below.
  Its separation in this documentation from the rest of the docket
  reflects a detail of the current implementation.
  Since tree metadata is also made of fields at fixed offsets, those could
  be inlined here by adding 76 bytes to each offset.

* Offset 120:
  The used size of the data file, as a 32-bit big-endian integer.
  The actual size of the data file may be larger
  (if another Mercurial process is appending to it
  but has not updated the docket yet).
  That extra data must be ignored.

* Offset 124:
  The length of the data file identifier, as an 8-bit integer.

* Offset 125:
  The data file identifier.

* Any additional data is currently ignored, and dropped when updating the file.

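As an illustration of these fixed offsets, here is a minimal sketch, in
Rust and with hypothetical names, of decoding the docket header from a
bytes buffer (the actual reader lives in
`rust/hg-core/src/dirstate_tree/on_disk.rs`)::

  use std::convert::TryInto;

  /// Simplified, hypothetical view of the docket header described above.
  #[derive(Debug)]
  struct DocketSketch<'a> {
      parent_1: &'a [u8],      // 32 bytes, zero-padded node ID
      parent_2: &'a [u8],      // 32 bytes, zero-padded node ID
      tree_metadata: &'a [u8], // 44 bytes, interpreted separately
      data_used_size: u32,     // meaningful length of the data file
      data_file_id: &'a [u8],  // suffix of the `.hg/dirstate.{ID}` name
  }

  fn parse_docket(bytes: &[u8]) -> Result<DocketSketch<'_>, &'static str> {
      if bytes.len() < 125 || &bytes[..12] != b"dirstate-v2\n" {
          return Err("not a dirstate-v2 docket");
      }
      let used_size = u32::from_be_bytes(bytes[120..124].try_into().unwrap());
      let id_len = bytes[124] as usize;
      let id = bytes.get(125..125 + id_len).ok_or("truncated docket")?;
      // Anything after the identifier is ignored, as described above.
      Ok(DocketSketch {
          parent_1: &bytes[12..44],
          parent_2: &bytes[44..76],
          tree_metadata: &bytes[76..120],
          data_used_size: used_size,
          data_file_id: id,
      })
  }
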
Tree metadata in the docket file
--------------------------------

Tree metadata is similarly made of components at fixed offsets.
These offsets are counted in bytes from the start of tree metadata,
which is 76 bytes after the start of the docket file.

This metadata can be thought of as the singular root of the tree
formed by nodes in the data file.

* Offset 0:
  Pseudo-pointer to the start of root nodes,
  counted in bytes from the start of the data file,
  as a 32-bit big-endian integer.
  These nodes describe files and directories found directly
  at the root of the working directory.

* Offset 4:
  Number of root nodes, as a 32-bit big-endian integer.

* Offset 8:
  Total number of nodes in the entire tree that "have a dirstate entry",
  as a 32-bit big-endian integer.
  Those nodes represent files that would be present at all in `dirstate-v1`.
  This is typically less than the total number of nodes.
  This counter is used to implement `len(dirstatemap)`.

* Offset 12:
  Number of nodes in the entire tree that have a copy source,
  as a 32-bit big-endian integer.
  At the next commit, these files are recorded
  as having been copied or moved/renamed from that source.
  (A move is recorded as a copy and separate removal of the source.)
  This counter is used to implement `len(dirstatemap.copymap)`.

* Offset 16:
  An estimation of how many bytes of the data file
  (within its used size) are unused, as a 32-bit big-endian integer.
  When appending to an existing data file,
  some existing nodes or paths can be unreachable from the new root
  but they still take up space.
  This counter is used to decide when to write a new data file from scratch
  instead of appending to an existing one,
  in order to get rid of that unreachable data
  and avoid unbounded file size growth.

* Offset 20:
  These four bytes are currently ignored
  and reset to zero when updating a docket file.
  This is an attempt at forward compatibility:
  future Mercurial versions could use this as a bit field
  to indicate that a dirstate has additional data or constraints.
  Finding a dirstate file with the relevant bit unset indicates that
  it was written by a then-older version
  which is not aware of that future change.

* Offset 24:
  Either 20 zero bytes, or a SHA-1 hash as 20 binary bytes.
  When present, the hash is of ignore patterns
  that were used for some previous run of the `status` algorithm.

* (Offset 44: end of tree metadata)

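Continuing the sketch above, the 44 bytes of tree metadata could be decoded
like this (hypothetical names again, not the actual implementation)::

  use std::convert::TryInto;

  /// Simplified view of the 44-byte tree metadata block described above.
  #[derive(Debug)]
  struct TreeMetadataSketch {
      root_nodes_start: u32,          // pseudo-pointer into the data file
      root_nodes_count: u32,
      nodes_with_entry: u32,          // backs `len(dirstatemap)`
      nodes_with_copy_source: u32,    // backs `len(dirstatemap.copymap)`
      unreachable_bytes: u32,         // estimation used to trigger rewrites
      ignore_patterns_hash: [u8; 20], // all zeros when absent
  }

  fn parse_tree_metadata(meta: &[u8; 44]) -> TreeMetadataSketch {
      let be32 = |r: std::ops::Range<usize>| {
          u32::from_be_bytes(meta[r].try_into().unwrap())
      };
      TreeMetadataSketch {
          root_nodes_start: be32(0..4),
          root_nodes_count: be32(4..8),
          nodes_with_entry: be32(8..12),
          nodes_with_copy_source: be32(12..16),
          unreachable_bytes: be32(16..20),
          // bytes 20..24 are the reserved, currently-ignored bit field
          ignore_patterns_hash: meta[24..44].try_into().unwrap(),
      }
  }
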
Optional hash of ignore patterns
--------------------------------

The implementation of `status` at `rust/hg-core/src/dirstate_tree/status.rs`
has been optimized such that its run time is dominated by calls
to `stat` for reading the filesystem metadata of a file or directory,
and to `readdir` for listing the contents of a directory.
In some cases the algorithm can skip calls to `readdir`
(saving significant time)
because the dirstate already contains enough of the relevant information
to build the correct `status` results.

The default configuration of `hg status` is to list unknown files
but not ignored files.
In this case, it matters for the `readdir`-skipping optimization
if a given file used to be ignored but became unknown
because `.hgignore` changed.
To detect the possibility of such a change,
the tree metadata contains an optional hash of all ignore patterns.

We define:

* "Root" ignore files as:

  - `.hgignore` at the root of the repository, if it exists
  - and all files from the `ui.ignore.*` config.

  This set of files is sorted by the string representation of their path.

* The "expanded contents" of an ignore file as the byte string made
  by the concatenation of its contents followed by the "expanded contents"
  of other files included with `include:` or `subinclude:` directives,
  in inclusion order. This definition is recursive, as included files can
  themselves include more files.

* "filepath" as the bytes of the ignore file path
  relative to the root of the repository if inside the repository,
  or the untouched path as defined in the configuration.

This hash is defined as the SHA-1 of the following line format::

  <filepath> <sha1 of the "expanded contents">\n

for each "root" ignore file, in sorted order.

(Note that computing this does not require actually concatenating
into a single contiguous byte sequence.
Instead a SHA-1 hasher object can be created
and fed separate chunks one by one.)

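A minimal sketch of this computation, assuming the sorted list of "root"
ignore files has already been expanded, could look like this (hypothetical
signature, shown only to restate the definition; the real code is in
`rust/hg-core/src/dirstate_tree/status.rs`)::

  use sha1::{Digest, Sha1};

  /// `files` pairs each "root" ignore file's "filepath" with its
  /// "expanded contents", already in sorted order.
  fn ignore_patterns_hash(files: &[(Vec<u8>, Vec<u8>)]) -> [u8; 20] {
      let mut hasher = Sha1::new();
      for (filepath, expanded_contents) in files {
          // <filepath> <sha1 of the "expanded contents">\n
          hasher.update(filepath);
          hasher.update(b" ");
          hasher.update(Sha1::digest(expanded_contents));
          hasher.update(b"\n");
      }
      let mut hash = [0u8; 20];
      hash.copy_from_slice(hasher.finalize().as_slice());
      hash
  }
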
The data file format
--------------------

This is implemented in `rust/hg-core/src/dirstate_tree/on_disk.rs`
and `mercurial/dirstateutils/v2.py`.

The data file contains two types of data: paths and nodes.

Paths and nodes can be organized in any order in the file, except that sibling
nodes must be next to each other and sorted by their path.
Contiguity lets the parent refer to them all
by their count and a single pseudo-pointer,
instead of storing one pseudo-pointer per child node.
Sorting allows using binary search to find a child node with a given name
in `O(log(n))` byte sequence comparisons.

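As an illustration of that lookup, a simplified binary search over a
contiguous, sorted run of fixed-size sibling nodes could look like the
following; `base_name_of` is a hypothetical helper standing in for the path
resolution described below::

  use std::cmp::Ordering;

  /// `nodes` is the byte range designated by the parent's child
  /// pseudo-pointer and count, `node_size` is the fixed per-node size.
  fn find_child<'a>(
      nodes: &'a [u8],
      node_size: usize,
      base_name_of: impl Fn(&[u8]) -> &[u8],
      wanted: &[u8],
  ) -> Option<&'a [u8]> {
      let (mut low, mut high) = (0, nodes.len() / node_size);
      while low < high {
          let mid = (low + high) / 2;
          let node = &nodes[mid * node_size..(mid + 1) * node_size];
          match base_name_of(node).cmp(&wanted) {
              Ordering::Equal => return Some(node),
              Ordering::Less => low = mid + 1,
              Ordering::Greater => high = mid,
          }
      }
      None
  }
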
The current implementation writes paths and child nodes before a given node
for ease of figuring out the value of pseudo-pointers by the time they are to
be written, but this is not an obligation and readers must not rely on it.

A path is stored as a byte string anywhere in the file, without delimiter.
It is referred to by one or more nodes by a pseudo-pointer to its start, and
its length in bytes. Since there is no delimiter,
when a path is a substring of another the same bytes could be reused,
although the implementation does not exploit this as of this writing.

A node is stored on 44 bytes with components at fixed offsets. Paths and
child nodes relevant to a node are stored externally and referenced through
pseudo-pointers.

All integers are stored in big-endian. All pseudo-pointers are 32-bit integers
counting bytes from the start of the data file. Path lengths and positions
are 16-bit integers, also counted in bytes.

Node components are:

* Offset 0:
  Pseudo-pointer to the full path of this node,
  from the working directory root.

* Offset 4:
  Length of the full path.

* Offset 6:
  Position of the last `/` path separator within the full path,
  in bytes from the start of the full path,
  or zero if there isn’t one.
  The part of the full path after this position is the "base name".
  Since sibling nodes have the same parent, only their base names vary
  and need to be considered when doing binary search to find a given path.

* Offset 8:
  Pseudo-pointer to the "copy source" path for this node,
  or zero if there is no copy source.

* Offset 12:
  Length of the copy source path, or zero if there isn’t one.

* Offset 14:
  Pseudo-pointer to the start of child nodes.

* Offset 18:
  Number of child nodes, as a 32-bit integer.
  They occupy 44 times this number of bytes
  (not counting space for paths, and further descendants).

* Offset 22:
  Number as a 32-bit integer of descendant nodes in this subtree,
  not including this node itself,
  that "have a dirstate entry".
  Those nodes represent files that would be present at all in `dirstate-v1`.
  This is typically less than the total number of descendants.
  This counter is used to implement `has_dir`.

* Offset 26:
  Number as a 32-bit integer of descendant nodes in this subtree,
  not including this node itself,
  that represent files tracked in the working directory.
  (For example, `hg rm` makes a file untracked.)
  This counter is used to implement `has_tracked_dir`.

* Offset 30:
  A `flags` field that packs some boolean values as bits of a 16-bit integer.
  Starting from least-significant, bit masks are::

    WDIR_TRACKED = 1 << 0
    P1_TRACKED = 1 << 1
    P2_INFO = 1 << 2
    MODE_EXEC_PERM = 1 << 3
    MODE_IS_SYMLINK = 1 << 4
    HAS_FALLBACK_EXEC = 1 << 5
    FALLBACK_EXEC = 1 << 6
    HAS_FALLBACK_SYMLINK = 1 << 7
    FALLBACK_SYMLINK = 1 << 8
    EXPECTED_STATE_IS_MODIFIED = 1 << 9
    HAS_MODE_AND_SIZE = 1 << 10
    HAS_MTIME = 1 << 11
    MTIME_SECOND_AMBIGUOUS = 1 << 12
    DIRECTORY = 1 << 13
    ALL_UNKNOWN_RECORDED = 1 << 14
    ALL_IGNORED_RECORDED = 1 << 15

  The meaning of each bit is described below.

  Other bits are unset.
  They may be assigned meaning in the future,
  with the limitation that Mercurial versions that pre-date such meaning
  will always reset those bits to unset when writing nodes.
  (A new node is written for any mutation in its subtree,
  leaving the bytes of the old node unreachable
  until the data file is rewritten entirely.)

* Offset 32:
  A `size` field described below, as a 32-bit integer.
  Unlike in dirstate-v1, negative values are not used.

* Offset 36:
  The seconds component of an `mtime` field described below,
  as a 32-bit integer.
  Unlike in dirstate-v1, negative values are not used.
  When `mtime` is used, this is the number of seconds since the Unix epoch
  truncated to its lower 31 bits.

* Offset 40:
  The nanoseconds component of an `mtime` field described below,
  as a 32-bit integer.
  When `mtime` is used,
  this is the number of nanoseconds since `mtime.seconds`,
  always strictly less than one billion.

  This may be zero if more precision is not available.
  (This can happen because of limitations in any of Mercurial, Python,
  libc, the operating system, …)

  When comparing two mtimes and either has this component set to zero,
  the sub-second precision of both should be ignored.
  False positives when checking mtime equality due to clock resolution
  are always possible and the status algorithm needs to deal with them,
  but having too many false negatives could be harmful too.

* (Offset 44: end of this node)

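Putting the offsets above together, a simplified sketch of decoding one node
could look like this (hypothetical names; the actual code is in
`rust/hg-core/src/dirstate_tree/on_disk.rs`)::

  use std::convert::TryInto;

  /// Simplified view of one fixed-size node as described above.
  #[derive(Debug)]
  struct NodeSketch {
      path_start: u32,        // pseudo-pointer to the full path
      path_len: u16,
      base_name_start: u16,   // position of the last `/` in the full path
      copy_source_start: u32, // zero if there is no copy source
      copy_source_len: u16,
      children_start: u32,    // pseudo-pointer to the first child node
      children_count: u32,
      descendants_with_entry: u32,
      tracked_descendants: u32,
      flags: u16,             // bit masks listed above
      size: u32,
      mtime_seconds: u32,
      mtime_nanoseconds: u32,
  }

  fn parse_node(node: &[u8; 44]) -> NodeSketch {
      let be32 = |r: std::ops::Range<usize>| {
          u32::from_be_bytes(node[r].try_into().unwrap())
      };
      let be16 = |r: std::ops::Range<usize>| {
          u16::from_be_bytes(node[r].try_into().unwrap())
      };
      NodeSketch {
          path_start: be32(0..4),
          path_len: be16(4..6),
          base_name_start: be16(6..8),
          copy_source_start: be32(8..12),
          copy_source_len: be16(12..14),
          children_start: be32(14..18),
          children_count: be32(18..22),
          descendants_with_entry: be32(22..26),
          tracked_descendants: be32(26..30),
          flags: be16(30..32),
          size: be32(32..36),
          mtime_seconds: be32(36..40),
          mtime_nanoseconds: be32(40..44),
      }
  }
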
The meaning of the boolean values packed in `flags` is:

`WDIR_TRACKED`
  Set if the working directory contains a tracked file at this node’s path.
  This is typically set and unset by `hg add` and `hg rm`.

`P1_TRACKED`
  Set if the working directory’s first parent changeset
  (whose node identifier is found in tree metadata)
  contains a tracked file at this node’s path.
  This is a cache to reduce manifest lookups.

`P2_INFO`
  Set if the file has been involved in some merge operation.
  Either because it was actually merged,
  or because the version in the second parent (p2) was ahead,
  or because some rename moved it there.
  In either case `hg status` will want it displayed as modified.

Files that would be mentioned at all in the `dirstate-v1` file format
have a node with at least one of the above three bits set in `dirstate-v2`.
Let’s call these files "tracked anywhere",
and "untracked" the nodes with all three of these bits unset.
Untracked nodes are typically for directories:
they hold child nodes and form the tree structure.
Although implementations should strive to clean up nodes
that are entirely unused, additional untracked nodes may also exist.
For example, a future version of Mercurial might in some cases
add nodes for untracked files and/or ignored files in the working directory
in order to optimize `hg status`
by enabling it to skip `readdir` in more cases.

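In other words, using the bit masks listed earlier, the distinction can be
expressed as a simple flag test (illustrative sketch only)::

  const WDIR_TRACKED: u16 = 1 << 0;
  const P1_TRACKED: u16 = 1 << 1;
  const P2_INFO: u16 = 1 << 2;

  /// A node is "tracked anywhere" when at least one of the three bits is
  /// set; with all three unset it is an "untracked" node, typically a
  /// directory holding child nodes.
  fn tracked_anywhere(flags: u16) -> bool {
      flags & (WDIR_TRACKED | P1_TRACKED | P2_INFO) != 0
  }
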
`HAS_MODE_AND_SIZE`
  Must be unset for untracked nodes.
  For files tracked anywhere, if this is set:

  - The `size` field is the expected file size,
    in bytes, truncated to its lower 31 bits.
  - The expected execute permission for the file’s owner
    is given by `MODE_EXEC_PERM`.
  - The expected file type is given by `MODE_IS_SYMLINK`:
    a symbolic link if set, or a normal file if unset.

  If this is unset, the expected size, permission, and file type are unknown.
  The `size` field is unused (set to zero).

`HAS_MTIME`
  The node contains a "valid" last modification time in the `mtime` field.

  It means the `mtime` was already strictly in the past when observed,
  meaning that later changes cannot happen in the same clock tick
  and must cause a different modification time
  (unless the system clock jumps back and we get unlucky,
  which is not impossible but deemed unlikely enough).

  This means that if `std::fs::symlink_metadata` later reports
  the same modification time
  and ignored patterns haven’t changed,
  we can assume the node to be unchanged on disk.

  The `mtime` field can then be used to skip more expensive lookup when
  checking the status of "tracked" nodes.

  It can also be set for nodes where `DIRECTORY` is set.
  See the `DIRECTORY` documentation for details.

`DIRECTORY`
  When set, this entry will match a directory that exists or existed on the
  file system.

  * When `HAS_MTIME` is set, a directory has been seen on the file system and
    `mtime` matches its last modification time. However, `HAS_MTIME` not
    being set does not indicate the lack of a directory on the file system.

  * When not tracked anywhere, this node does not represent an ignored or
    unknown file on disk.

  If `HAS_MTIME` is set
  and `mtime` matches the last modification time of the directory on disk,
  the directory is unchanged
  and we can skip calling `std::fs::read_dir` again for this directory,
  and iterate child dirstate nodes instead
  (as long as `ALL_UNKNOWN_RECORDED` and `ALL_IGNORED_RECORDED` are taken
  into account).

`MODE_EXEC_PERM`
  Must be unset if `HAS_MODE_AND_SIZE` is unset.
  If `HAS_MODE_AND_SIZE` is set,
  this indicates whether the file’s owner is expected
  to have execute permission.

  Beware that on systems without filesystem support for this information,
  the value stored in the dirstate might be wrong and should not be relied on.

`MODE_IS_SYMLINK`
  Must be unset if `HAS_MODE_AND_SIZE` is unset.
  If `HAS_MODE_AND_SIZE` is set,
  this indicates whether the file is expected to be a symlink
  as opposed to a normal file.

  Beware that on systems without filesystem support for this information,
  the value stored in the dirstate might be wrong and should not be relied on.

`EXPECTED_STATE_IS_MODIFIED`
  Must be unset for untracked nodes.
  For:

  - a file tracked anywhere,
  - that has expected metadata (`HAS_MODE_AND_SIZE` and `HAS_MTIME`),
  - if that metadata matches
    metadata found in the working directory with `stat`,

  this bit indicates the status of the file.
  If set, the status is modified. If unset, it is clean.

  In cases where `hg status` needs to read the contents of a file
  because metadata is ambiguous, this bit lets it record the result
  if the result is modified so that a future run of `hg status`
  does not need to do the same again.
  It is valid to never set this bit,
  and consider expected metadata ambiguous if it is set.

`ALL_UNKNOWN_RECORDED`
  If set, all "unknown" children existing on disk (at the time of the last
  status) have been recorded and the `mtime` associated with
  `DIRECTORY` can be used for optimization even when "unknown" files
  are listed.

  Note that the number of recorded "unknown" children can still be zero
  if none were present.

  Also note that having this flag unset does not imply that no "unknown"
  children have been recorded. Some might be present, but there is
  no guarantee that it will be all of them.

`ALL_IGNORED_RECORDED`
  If set, all "ignored" children existing on disk (at the time of the last
  status) have been recorded and the `mtime` associated with
  `DIRECTORY` can be used for optimization even when "ignored" files
  are listed.

  Note that the number of recorded "ignored" children can still be zero
  if none were present.

  Also note that having this flag unset does not imply that no "ignored"
  children have been recorded. Some might be present, but there is
  no guarantee that it will be all of them.

`HAS_FALLBACK_EXEC`
  If this flag is set, the entry carries "fallback" information for the
  executable bit in the `FALLBACK_EXEC` flag.

  Fallback information can be stored in the dirstate to keep track of
  filesystem attributes tracked by Mercurial when the underlying file
  system or operating system does not support that property (e.g.
  Windows).

`FALLBACK_EXEC`
  Should be ignored if `HAS_FALLBACK_EXEC` is unset. If set, the file for this
  entry should be considered executable if that information cannot be
  extracted from the file system. If unset, it should be considered
  non-executable instead.

`HAS_FALLBACK_SYMLINK`
  If this flag is set, the entry carries "fallback" information for symbolic
  link status in the `FALLBACK_SYMLINK` flag.

  Fallback information can be stored in the dirstate to keep track of
  filesystem attributes tracked by Mercurial when the underlying file
  system or operating system does not support that property (e.g.
  Windows).

`FALLBACK_SYMLINK`
  Should be ignored if `HAS_FALLBACK_SYMLINK` is unset. If set, the file for
  this entry should be considered a symlink if that information cannot be
  extracted from the file system. If unset, it should be considered a normal
  file instead.

`MTIME_SECOND_AMBIGUOUS`
  This flag is relevant only when `HAS_MTIME` is set. When set, the
  `mtime` stored in the entry is only valid for comparison with timestamps
  that have nanosecond information. If the available timestamp does not carry
  nanosecond information, the `mtime` should be ignored and no optimization
  can be applied.
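
To summarize, here is a hedged sketch tying together the `DIRECTORY`,
`ALL_UNKNOWN_RECORDED`/`ALL_IGNORED_RECORDED` and `MTIME_SECOND_AMBIGUOUS`
rules described above, using the bit masks listed earlier (hypothetical
helpers; the real logic lives in `rust/hg-core/src/dirstate_tree/status.rs`)::

  const HAS_MTIME: u16 = 1 << 11;
  const MTIME_SECOND_AMBIGUOUS: u16 = 1 << 12;
  const DIRECTORY: u16 = 1 << 13;
  const ALL_UNKNOWN_RECORDED: u16 = 1 << 14;
  const ALL_IGNORED_RECORDED: u16 = 1 << 15;

  /// Compare an `mtime` stored in a node with one observed on disk,
  /// ignoring sub-second precision when either side lacks it, and refusing
  /// to trust an `MTIME_SECOND_AMBIGUOUS` entry when the observed timestamp
  /// carries no nanosecond information.
  fn cached_mtime_matches(
      flags: u16,
      cached: (u32, u32),  // (seconds, nanoseconds) from the node
      on_disk: (u32, u32), // (seconds, nanoseconds) from the filesystem
  ) -> bool {
      if flags & MTIME_SECOND_AMBIGUOUS != 0 && on_disk.1 == 0 {
          return false; // no optimization can be applied
      }
      let seconds_match = cached.0 == on_disk.0;
      let subseconds_match =
          cached.1 == 0 || on_disk.1 == 0 || cached.1 == on_disk.1;
      seconds_match && subseconds_match
  }

  /// Decide whether `readdir` can be skipped for a directory node: its
  /// cached mtime must still match the directory on disk, and the children
  /// categories that this run lists must have been fully recorded.
  fn can_skip_readdir(
      flags: u16,
      cached: (u32, u32),
      on_disk: (u32, u32),
      listing_unknown: bool,
      listing_ignored: bool,
  ) -> bool {
      flags & DIRECTORY != 0
          && flags & HAS_MTIME != 0
          && cached_mtime_matches(flags, cached, on_disk)
          && (!listing_unknown || flags & ALL_UNKNOWN_RECORDED != 0)
          && (!listing_ignored || flags & ALL_IGNORED_RECORDED != 0)
  }
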
@@ -1,913 +1,931
use crate::dirstate::entry::TruncatedTimestamp;
use crate::dirstate::status::IgnoreFnType;
use crate::dirstate::status::StatusPath;
use crate::dirstate_tree::dirstate_map::BorrowedPath;
use crate::dirstate_tree::dirstate_map::ChildNodesRef;
use crate::dirstate_tree::dirstate_map::DirstateMap;
use crate::dirstate_tree::dirstate_map::DirstateVersion;
use crate::dirstate_tree::dirstate_map::NodeRef;
use crate::dirstate_tree::on_disk::DirstateV2ParseError;
use crate::matchers::get_ignore_function;
use crate::matchers::Matcher;
use crate::utils::files::get_bytes_from_os_string;
use crate::utils::files::get_bytes_from_path;
use crate::utils::files::get_path_from_bytes;
use crate::utils::hg_path::HgPath;
use crate::BadMatch;
use crate::DirstateStatus;
use crate::HgPathBuf;
use crate::HgPathCow;
use crate::PatternFileWarning;
use crate::StatusError;
use crate::StatusOptions;
use micro_timer::timed;
use once_cell::sync::OnceCell;
use rayon::prelude::*;
use sha1::{Digest, Sha1};
use std::borrow::Cow;
use std::io;
use std::path::Path;
use std::path::PathBuf;
use std::sync::Mutex;
use std::time::SystemTime;

33 /// Returns the status of the working directory compared to its parent
34 /// Returns the status of the working directory compared to its parent
34 /// changeset.
35 /// changeset.
35 ///
36 ///
36 /// This algorithm is based on traversing the filesystem tree (`fs` in function
37 /// This algorithm is based on traversing the filesystem tree (`fs` in function
37 /// and variable names) and dirstate tree at the same time. The core of this
38 /// and variable names) and dirstate tree at the same time. The core of this
38 /// traversal is the recursive `traverse_fs_directory_and_dirstate` function
39 /// traversal is the recursive `traverse_fs_directory_and_dirstate` function
39 /// and its use of `itertools::merge_join_by`. When reaching a path that only
40 /// and its use of `itertools::merge_join_by`. When reaching a path that only
40 /// exists in one of the two trees, depending on information requested by
41 /// exists in one of the two trees, depending on information requested by
41 /// `options` we may need to traverse the remaining subtree.
42 /// `options` we may need to traverse the remaining subtree.
42 #[timed]
43 #[timed]
43 pub fn status<'dirstate>(
44 pub fn status<'dirstate>(
44 dmap: &'dirstate mut DirstateMap,
45 dmap: &'dirstate mut DirstateMap,
45 matcher: &(dyn Matcher + Sync),
46 matcher: &(dyn Matcher + Sync),
46 root_dir: PathBuf,
47 root_dir: PathBuf,
47 ignore_files: Vec<PathBuf>,
48 ignore_files: Vec<PathBuf>,
48 options: StatusOptions,
49 options: StatusOptions,
49 ) -> Result<(DirstateStatus<'dirstate>, Vec<PatternFileWarning>), StatusError>
50 ) -> Result<(DirstateStatus<'dirstate>, Vec<PatternFileWarning>), StatusError>
50 {
51 {
51 // Force the global rayon threadpool to not exceed 16 concurrent threads.
52 // Force the global rayon threadpool to not exceed 16 concurrent threads.
52 // This is a stop-gap measure until we figure out why using more than 16
53 // This is a stop-gap measure until we figure out why using more than 16
53 // threads makes `status` slower for each additional thread.
54 // threads makes `status` slower for each additional thread.
54 // We use `ok()` in case the global threadpool has already been
55 // We use `ok()` in case the global threadpool has already been
55 // instantiated in `rhg` or some other caller.
56 // instantiated in `rhg` or some other caller.
56 // TODO find the underlying cause and fix it, then remove this.
57 // TODO find the underlying cause and fix it, then remove this.
57 rayon::ThreadPoolBuilder::new()
58 rayon::ThreadPoolBuilder::new()
58 .num_threads(16)
59 .num_threads(16)
59 .build_global()
60 .build_global()
60 .ok();
61 .ok();
61
62
62 let (ignore_fn, warnings, patterns_changed): (IgnoreFnType, _, _) =
63 let (ignore_fn, warnings, patterns_changed): (IgnoreFnType, _, _) =
63 if options.list_ignored || options.list_unknown {
64 if options.list_ignored || options.list_unknown {
64 let (ignore_fn, warnings, changed) = match dmap.dirstate_version {
65 let (ignore_fn, warnings, changed) = match dmap.dirstate_version {
65 DirstateVersion::V1 => {
66 DirstateVersion::V1 => {
66 let (ignore_fn, warnings) = get_ignore_function(
67 let (ignore_fn, warnings) = get_ignore_function(
67 ignore_files,
68 ignore_files,
68 &root_dir,
69 &root_dir,
69 &mut |_pattern_bytes| {},
70 &mut |_source, _pattern_bytes| {},
70 )?;
71 )?;
71 (ignore_fn, warnings, None)
72 (ignore_fn, warnings, None)
72 }
73 }
73 DirstateVersion::V2 => {
74 DirstateVersion::V2 => {
74 let mut hasher = Sha1::new();
75 let mut hasher = Sha1::new();
75 let (ignore_fn, warnings) = get_ignore_function(
76 let (ignore_fn, warnings) = get_ignore_function(
76 ignore_files,
77 ignore_files,
77 &root_dir,
78 &root_dir,
78 &mut |pattern_bytes| hasher.update(pattern_bytes),
79 &mut |source, pattern_bytes| {
80 // If inside the repo, use the relative version to
81 // make it deterministic inside tests.
82 // The performance hit should be negligible.
83 let source = source
84 .strip_prefix(&root_dir)
85 .unwrap_or(source);
86 let source = get_bytes_from_path(source);
87
88 let mut subhasher = Sha1::new();
89 subhasher.update(pattern_bytes);
90 let patterns_hash = subhasher.finalize();
91
92 hasher.update(source);
93 hasher.update(b" ");
94 hasher.update(patterns_hash);
95 hasher.update(b"\n");
96 },
79 )?;
97 )?;
80 let new_hash = *hasher.finalize().as_ref();
98 let new_hash = *hasher.finalize().as_ref();
81 let changed = new_hash != dmap.ignore_patterns_hash;
99 let changed = new_hash != dmap.ignore_patterns_hash;
82 dmap.ignore_patterns_hash = new_hash;
100 dmap.ignore_patterns_hash = new_hash;
83 (ignore_fn, warnings, Some(changed))
101 (ignore_fn, warnings, Some(changed))
84 }
102 }
85 };
103 };
86 (ignore_fn, warnings, changed)
104 (ignore_fn, warnings, changed)
87 } else {
105 } else {
88 (Box::new(|&_| true), vec![], None)
106 (Box::new(|&_| true), vec![], None)
89 };
107 };
90
108
91 let filesystem_time_at_status_start =
109 let filesystem_time_at_status_start =
92 filesystem_now(&root_dir).ok().map(TruncatedTimestamp::from);
110 filesystem_now(&root_dir).ok().map(TruncatedTimestamp::from);
93
111
94 // If the repository is under the current directory, prefer using a
112 // If the repository is under the current directory, prefer using a
95 // relative path, so the kernel needs to traverse fewer directories in every
113 // relative path, so the kernel needs to traverse fewer directories in every
96 // call to `read_dir` or `symlink_metadata`.
114 // call to `read_dir` or `symlink_metadata`.
97 // This is effective in the common case where the current directory is the
115 // This is effective in the common case where the current directory is the
98 // repository root.
116 // repository root.
99
117
100 // TODO: Better yet would be to use libc functions like `openat` and
118 // TODO: Better yet would be to use libc functions like `openat` and
101 // `fstatat` to remove such repeated traversals entirely, but the standard
119 // `fstatat` to remove such repeated traversals entirely, but the standard
102 // library does not provide APIs based on those.
120 // library does not provide APIs based on those.
103 // Maybe with a crate like https://crates.io/crates/openat instead?
121 // Maybe with a crate like https://crates.io/crates/openat instead?
104 let root_dir = if let Some(relative) = std::env::current_dir()
122 let root_dir = if let Some(relative) = std::env::current_dir()
105 .ok()
123 .ok()
106 .and_then(|cwd| root_dir.strip_prefix(cwd).ok())
124 .and_then(|cwd| root_dir.strip_prefix(cwd).ok())
107 {
125 {
108 relative
126 relative
109 } else {
127 } else {
110 &root_dir
128 &root_dir
111 };
129 };
112
130
113 let outcome = DirstateStatus {
131 let outcome = DirstateStatus {
114 filesystem_time_at_status_start,
132 filesystem_time_at_status_start,
115 ..Default::default()
133 ..Default::default()
116 };
134 };
117 let common = StatusCommon {
135 let common = StatusCommon {
118 dmap,
136 dmap,
119 options,
137 options,
120 matcher,
138 matcher,
121 ignore_fn,
139 ignore_fn,
122 outcome: Mutex::new(outcome),
140 outcome: Mutex::new(outcome),
123 ignore_patterns_have_changed: patterns_changed,
141 ignore_patterns_have_changed: patterns_changed,
124 new_cacheable_directories: Default::default(),
142 new_cacheable_directories: Default::default(),
125 outdated_cached_directories: Default::default(),
143 outdated_cached_directories: Default::default(),
126 filesystem_time_at_status_start,
144 filesystem_time_at_status_start,
127 };
145 };
128 let is_at_repo_root = true;
146 let is_at_repo_root = true;
129 let hg_path = &BorrowedPath::OnDisk(HgPath::new(""));
147 let hg_path = &BorrowedPath::OnDisk(HgPath::new(""));
130 let has_ignored_ancestor = HasIgnoredAncestor::create(None, hg_path);
148 let has_ignored_ancestor = HasIgnoredAncestor::create(None, hg_path);
131 let root_cached_mtime = None;
149 let root_cached_mtime = None;
132 let root_dir_metadata = None;
150 let root_dir_metadata = None;
133 // If the path we have for the repository root is a symlink, do follow it.
151 // If the path we have for the repository root is a symlink, do follow it.
134 // (As opposed to symlinks within the working directory which are not
152 // (As opposed to symlinks within the working directory which are not
135 // followed, using `std::fs::symlink_metadata`.)
153 // followed, using `std::fs::symlink_metadata`.)
136 common.traverse_fs_directory_and_dirstate(
154 common.traverse_fs_directory_and_dirstate(
137 &has_ignored_ancestor,
155 &has_ignored_ancestor,
138 dmap.root.as_ref(),
156 dmap.root.as_ref(),
139 hg_path,
157 hg_path,
140 &root_dir,
158 &root_dir,
141 root_dir_metadata,
159 root_dir_metadata,
142 root_cached_mtime,
160 root_cached_mtime,
143 is_at_repo_root,
161 is_at_repo_root,
144 )?;
162 )?;
145 let mut outcome = common.outcome.into_inner().unwrap();
163 let mut outcome = common.outcome.into_inner().unwrap();
146 let new_cacheable = common.new_cacheable_directories.into_inner().unwrap();
164 let new_cacheable = common.new_cacheable_directories.into_inner().unwrap();
147 let outdated = common.outdated_cached_directories.into_inner().unwrap();
165 let outdated = common.outdated_cached_directories.into_inner().unwrap();
148
166
149 outcome.dirty = common.ignore_patterns_have_changed == Some(true)
167 outcome.dirty = common.ignore_patterns_have_changed == Some(true)
150 || !outdated.is_empty()
168 || !outdated.is_empty()
151 || (!new_cacheable.is_empty()
169 || (!new_cacheable.is_empty()
152 && dmap.dirstate_version == DirstateVersion::V2);
170 && dmap.dirstate_version == DirstateVersion::V2);
153
171
154 // Remove outdated mtimes before adding new mtimes, in case a given
172 // Remove outdated mtimes before adding new mtimes, in case a given
155 // directory is in both lists
173 // directory is in both lists
156 for path in &outdated {
174 for path in &outdated {
157 dmap.clear_cached_mtime(path)?;
175 dmap.clear_cached_mtime(path)?;
158 }
176 }
159 for (path, mtime) in &new_cacheable {
177 for (path, mtime) in &new_cacheable {
160 dmap.set_cached_mtime(path, *mtime)?;
178 dmap.set_cached_mtime(path, *mtime)?;
161 }
179 }
162
180
163 Ok((outcome, warnings))
181 Ok((outcome, warnings))
164 }
182 }
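// Illustrative sketch (not part of the original file): the shape of the
// dirstate-v2 ignore hash computed in `status()` above. For each ignore file
// the callback feeds `<source> <sha1(patterns)>\n` into an outer SHA-1, so
// that editing a pattern file *or* switching to a different pattern file both
// change `ignore_patterns_hash` and invalidate cached directory mtimes. The
// helper below only mirrors that scheme; its name and test data are made up
// for illustration.
#[cfg(test)]
mod ignore_hash_sketch {
    use sha1::{Digest, Sha1};

    /// Combine `(source, patterns)` pairs the same way the V2 branch above
    /// does: hash each file's patterns separately, then feed
    /// `source + b" " + patterns_hash + b"\n"` into one outer digest.
    fn combined_ignore_hash(files: &[(&[u8], &[u8])]) -> Vec<u8> {
        let mut hasher = Sha1::new();
        for (source, patterns) in files {
            let mut subhasher = Sha1::new();
            subhasher.update(*patterns);
            hasher.update(*source);
            hasher.update(b" ");
            hasher.update(subhasher.finalize());
            hasher.update(b"\n");
        }
        hasher.finalize().to_vec()
    }

    #[test]
    fn moving_patterns_to_another_file_changes_the_hash() {
        let patterns: &[u8] = b"*.orig\n*.rej\n";
        let root: &[u8] = b".hgignore";
        let nested: &[u8] = b"subdir/.hgignore";
        assert_ne!(
            combined_ignore_hash(&[(root, patterns)]),
            combined_ignore_hash(&[(nested, patterns)])
        );
    }
}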
165
183
166 /// Bag of random things needed by various parts of the algorithm. Reduces the
184 /// Bag of random things needed by various parts of the algorithm. Reduces the
167 /// number of parameters passed to functions.
185 /// number of parameters passed to functions.
168 struct StatusCommon<'a, 'tree, 'on_disk: 'tree> {
186 struct StatusCommon<'a, 'tree, 'on_disk: 'tree> {
169 dmap: &'tree DirstateMap<'on_disk>,
187 dmap: &'tree DirstateMap<'on_disk>,
170 options: StatusOptions,
188 options: StatusOptions,
171 matcher: &'a (dyn Matcher + Sync),
189 matcher: &'a (dyn Matcher + Sync),
172 ignore_fn: IgnoreFnType<'a>,
190 ignore_fn: IgnoreFnType<'a>,
173 outcome: Mutex<DirstateStatus<'on_disk>>,
191 outcome: Mutex<DirstateStatus<'on_disk>>,
174 /// New timestamps of directories to be used for caching their readdirs
192 /// New timestamps of directories to be used for caching their readdirs
175 new_cacheable_directories:
193 new_cacheable_directories:
176 Mutex<Vec<(Cow<'on_disk, HgPath>, TruncatedTimestamp)>>,
194 Mutex<Vec<(Cow<'on_disk, HgPath>, TruncatedTimestamp)>>,
177 /// Used to invalidate the readdir cache of directories
195 /// Used to invalidate the readdir cache of directories
178 outdated_cached_directories: Mutex<Vec<Cow<'on_disk, HgPath>>>,
196 outdated_cached_directories: Mutex<Vec<Cow<'on_disk, HgPath>>>,
179
197
180 /// Whether ignore files like `.hgignore` have changed since the previous
198 /// Whether ignore files like `.hgignore` have changed since the previous
181 /// time a `status()` call wrote their hash to the dirstate. `None` means
199 /// time a `status()` call wrote their hash to the dirstate. `None` means
182 /// we don’t know as this run doesn’t list either ignored or unknown files
200 /// we don’t know as this run doesn’t list either ignored or unknown files
183 /// and therefore isn’t reading `.hgignore`.
201 /// and therefore isn’t reading `.hgignore`.
184 ignore_patterns_have_changed: Option<bool>,
202 ignore_patterns_have_changed: Option<bool>,
185
203
186 /// The current time at the start of the `status()` algorithm, as measured
204 /// The current time at the start of the `status()` algorithm, as measured
187 /// and possibly truncated by the filesystem.
205 /// and possibly truncated by the filesystem.
188 filesystem_time_at_status_start: Option<TruncatedTimestamp>,
206 filesystem_time_at_status_start: Option<TruncatedTimestamp>,
189 }
207 }
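// Illustrative sketch (not part of the original file): the collect-then-apply
// pattern used with the `Mutex<Vec<_>>` fields above. Rayon worker threads
// push into the shared vectors during traversal; once the parallel phase is
// over, `status()` takes them back with `into_inner()` and applies them to
// the dirstate single-threaded. Assumes only the `rayon` crate already
// imported at the top of this file.
#[cfg(test)]
mod collect_then_apply_sketch {
    use rayon::prelude::*;
    use std::sync::Mutex;

    #[test]
    fn parallel_collect_then_drain() {
        let results: Mutex<Vec<u32>> = Mutex::new(Vec::new());
        (0u32..100).into_par_iter().for_each(|n| {
            if n % 2 == 0 {
                results.lock().unwrap().push(n);
            }
        });
        // The parallel phase is done: take ownership back and use the data
        // without further locking, as `status()` does with
        // `new_cacheable_directories` and `outdated_cached_directories`.
        let mut collected = results.into_inner().unwrap();
        collected.sort_unstable();
        assert_eq!(collected.len(), 50);
        assert_eq!(collected[0], 0);
    }
}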
190
208
191 enum Outcome {
209 enum Outcome {
192 Modified,
210 Modified,
193 Added,
211 Added,
194 Removed,
212 Removed,
195 Deleted,
213 Deleted,
196 Clean,
214 Clean,
197 Ignored,
215 Ignored,
198 Unknown,
216 Unknown,
199 Unsure,
217 Unsure,
200 }
218 }
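// Illustrative sketch (not part of the original file): how these outcomes
// line up with the one-letter codes printed by `hg status`. `Unsure` has no
// letter of its own: it marks files whose size/mtime checks were
// inconclusive and whose contents still need to be compared against the
// parent changeset before they are reported as modified or clean.
#[allow(dead_code)]
fn status_letter(outcome: &Outcome) -> Option<char> {
    match outcome {
        Outcome::Modified => Some('M'),
        Outcome::Added => Some('A'),
        Outcome::Removed => Some('R'),
        Outcome::Deleted => Some('!'),
        Outcome::Clean => Some('C'),
        Outcome::Ignored => Some('I'),
        Outcome::Unknown => Some('?'),
        Outcome::Unsure => None,
    }
}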
201
219
202 /// Lazy computation of whether a given path has an hg-ignored
220 /// Lazy computation of whether a given path has an hg-ignored
203 /// ancestor.
221 /// ancestor.
204 struct HasIgnoredAncestor<'a> {
222 struct HasIgnoredAncestor<'a> {
205 /// `path` and `parent` constitute the inputs to the computation,
223 /// `path` and `parent` constitute the inputs to the computation,
206 /// `cache` stores the outcome.
224 /// `cache` stores the outcome.
207 path: &'a HgPath,
225 path: &'a HgPath,
208 parent: Option<&'a HasIgnoredAncestor<'a>>,
226 parent: Option<&'a HasIgnoredAncestor<'a>>,
209 cache: OnceCell<bool>,
227 cache: OnceCell<bool>,
210 }
228 }
211
229
212 impl<'a> HasIgnoredAncestor<'a> {
230 impl<'a> HasIgnoredAncestor<'a> {
213 fn create(
231 fn create(
214 parent: Option<&'a HasIgnoredAncestor<'a>>,
232 parent: Option<&'a HasIgnoredAncestor<'a>>,
215 path: &'a HgPath,
233 path: &'a HgPath,
216 ) -> HasIgnoredAncestor<'a> {
234 ) -> HasIgnoredAncestor<'a> {
217 Self {
235 Self {
218 path,
236 path,
219 parent,
237 parent,
220 cache: OnceCell::new(),
238 cache: OnceCell::new(),
221 }
239 }
222 }
240 }
223
241
224 fn force<'b>(&self, ignore_fn: &IgnoreFnType<'b>) -> bool {
242 fn force<'b>(&self, ignore_fn: &IgnoreFnType<'b>) -> bool {
225 match self.parent {
243 match self.parent {
226 None => false,
244 None => false,
227 Some(parent) => {
245 Some(parent) => {
228 *(parent.cache.get_or_init(|| {
246 *(parent.cache.get_or_init(|| {
229 parent.force(ignore_fn) || ignore_fn(&self.path)
247 parent.force(ignore_fn) || ignore_fn(&self.path)
230 }))
248 }))
231 }
249 }
232 }
250 }
233 }
251 }
234 }
252 }
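// Illustrative sketch (not part of the original file): the memoization
// pattern behind `HasIgnoredAncestor::force` above. `OnceCell::get_or_init`
// runs its closure at most once, so each directory's "is an ancestor
// ignored?" answer is computed lazily and then shared by every descendant
// that asks again.
#[cfg(test)]
mod once_cell_memoization_sketch {
    use once_cell::sync::OnceCell;
    use std::sync::atomic::{AtomicUsize, Ordering};

    #[test]
    fn closure_runs_at_most_once() {
        let calls = AtomicUsize::new(0);
        let cache: OnceCell<bool> = OnceCell::new();
        let expensive = || {
            // Stand-in for calling `ignore_fn(path)` on an ancestor.
            calls.fetch_add(1, Ordering::SeqCst);
            true
        };
        assert!(*cache.get_or_init(expensive));
        assert!(*cache.get_or_init(expensive));
        assert_eq!(calls.load(Ordering::SeqCst), 1);
    }
}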
235
253
236 impl<'a, 'tree, 'on_disk> StatusCommon<'a, 'tree, 'on_disk> {
254 impl<'a, 'tree, 'on_disk> StatusCommon<'a, 'tree, 'on_disk> {
237 fn push_outcome(
255 fn push_outcome(
238 &self,
256 &self,
239 which: Outcome,
257 which: Outcome,
240 dirstate_node: &NodeRef<'tree, 'on_disk>,
258 dirstate_node: &NodeRef<'tree, 'on_disk>,
241 ) -> Result<(), DirstateV2ParseError> {
259 ) -> Result<(), DirstateV2ParseError> {
242 let path = dirstate_node
260 let path = dirstate_node
243 .full_path_borrowed(self.dmap.on_disk)?
261 .full_path_borrowed(self.dmap.on_disk)?
244 .detach_from_tree();
262 .detach_from_tree();
245 let copy_source = if self.options.list_copies {
263 let copy_source = if self.options.list_copies {
246 dirstate_node
264 dirstate_node
247 .copy_source_borrowed(self.dmap.on_disk)?
265 .copy_source_borrowed(self.dmap.on_disk)?
248 .map(|source| source.detach_from_tree())
266 .map(|source| source.detach_from_tree())
249 } else {
267 } else {
250 None
268 None
251 };
269 };
252 self.push_outcome_common(which, path, copy_source);
270 self.push_outcome_common(which, path, copy_source);
253 Ok(())
271 Ok(())
254 }
272 }
255
273
256 fn push_outcome_without_copy_source(
274 fn push_outcome_without_copy_source(
257 &self,
275 &self,
258 which: Outcome,
276 which: Outcome,
259 path: &BorrowedPath<'_, 'on_disk>,
277 path: &BorrowedPath<'_, 'on_disk>,
260 ) {
278 ) {
261 self.push_outcome_common(which, path.detach_from_tree(), None)
279 self.push_outcome_common(which, path.detach_from_tree(), None)
262 }
280 }
263
281
264 fn push_outcome_common(
282 fn push_outcome_common(
265 &self,
283 &self,
266 which: Outcome,
284 which: Outcome,
267 path: HgPathCow<'on_disk>,
285 path: HgPathCow<'on_disk>,
268 copy_source: Option<HgPathCow<'on_disk>>,
286 copy_source: Option<HgPathCow<'on_disk>>,
269 ) {
287 ) {
270 let mut outcome = self.outcome.lock().unwrap();
288 let mut outcome = self.outcome.lock().unwrap();
271 let vec = match which {
289 let vec = match which {
272 Outcome::Modified => &mut outcome.modified,
290 Outcome::Modified => &mut outcome.modified,
273 Outcome::Added => &mut outcome.added,
291 Outcome::Added => &mut outcome.added,
274 Outcome::Removed => &mut outcome.removed,
292 Outcome::Removed => &mut outcome.removed,
275 Outcome::Deleted => &mut outcome.deleted,
293 Outcome::Deleted => &mut outcome.deleted,
276 Outcome::Clean => &mut outcome.clean,
294 Outcome::Clean => &mut outcome.clean,
277 Outcome::Ignored => &mut outcome.ignored,
295 Outcome::Ignored => &mut outcome.ignored,
278 Outcome::Unknown => &mut outcome.unknown,
296 Outcome::Unknown => &mut outcome.unknown,
279 Outcome::Unsure => &mut outcome.unsure,
297 Outcome::Unsure => &mut outcome.unsure,
280 };
298 };
281 vec.push(StatusPath { path, copy_source });
299 vec.push(StatusPath { path, copy_source });
282 }
300 }
283
301
284 fn read_dir(
302 fn read_dir(
285 &self,
303 &self,
286 hg_path: &HgPath,
304 hg_path: &HgPath,
287 fs_path: &Path,
305 fs_path: &Path,
288 is_at_repo_root: bool,
306 is_at_repo_root: bool,
289 ) -> Result<Vec<DirEntry>, ()> {
307 ) -> Result<Vec<DirEntry>, ()> {
290 DirEntry::read_dir(fs_path, is_at_repo_root)
308 DirEntry::read_dir(fs_path, is_at_repo_root)
291 .map_err(|error| self.io_error(error, hg_path))
309 .map_err(|error| self.io_error(error, hg_path))
292 }
310 }
293
311
294 fn io_error(&self, error: std::io::Error, hg_path: &HgPath) {
312 fn io_error(&self, error: std::io::Error, hg_path: &HgPath) {
295 let errno = error.raw_os_error().expect("expected real OS error");
313 let errno = error.raw_os_error().expect("expected real OS error");
296 self.outcome
314 self.outcome
297 .lock()
315 .lock()
298 .unwrap()
316 .unwrap()
299 .bad
317 .bad
300 .push((hg_path.to_owned().into(), BadMatch::OsError(errno)))
318 .push((hg_path.to_owned().into(), BadMatch::OsError(errno)))
301 }
319 }
302
320
303 fn check_for_outdated_directory_cache(
321 fn check_for_outdated_directory_cache(
304 &self,
322 &self,
305 dirstate_node: &NodeRef<'tree, 'on_disk>,
323 dirstate_node: &NodeRef<'tree, 'on_disk>,
306 ) -> Result<bool, DirstateV2ParseError> {
324 ) -> Result<bool, DirstateV2ParseError> {
307 if self.ignore_patterns_have_changed == Some(true)
325 if self.ignore_patterns_have_changed == Some(true)
308 && dirstate_node.cached_directory_mtime()?.is_some()
326 && dirstate_node.cached_directory_mtime()?.is_some()
309 {
327 {
310 self.outdated_cached_directories.lock().unwrap().push(
328 self.outdated_cached_directories.lock().unwrap().push(
311 dirstate_node
329 dirstate_node
312 .full_path_borrowed(self.dmap.on_disk)?
330 .full_path_borrowed(self.dmap.on_disk)?
313 .detach_from_tree(),
331 .detach_from_tree(),
314 );
332 );
315 return Ok(true);
333 return Ok(true);
316 }
334 }
317 Ok(false)
335 Ok(false)
318 }
336 }
319
337
320 /// If this returns true, we can get accurate results by only using
338 /// If this returns true, we can get accurate results by only using
321 /// `symlink_metadata` for child nodes that exist in the dirstate, without
339 /// `symlink_metadata` for child nodes that exist in the dirstate, without
322 /// needing to call `read_dir`.
340 /// needing to call `read_dir`.
323 fn can_skip_fs_readdir(
341 fn can_skip_fs_readdir(
324 &self,
342 &self,
325 directory_metadata: Option<&std::fs::Metadata>,
343 directory_metadata: Option<&std::fs::Metadata>,
326 cached_directory_mtime: Option<TruncatedTimestamp>,
344 cached_directory_mtime: Option<TruncatedTimestamp>,
327 ) -> bool {
345 ) -> bool {
328 if !self.options.list_unknown && !self.options.list_ignored {
346 if !self.options.list_unknown && !self.options.list_ignored {
329 // All states that we care about listing have corresponding
347 // All states that we care about listing have corresponding
330 // dirstate entries.
348 // dirstate entries.
331 // This happens for example with `hg status -mard`.
349 // This happens for example with `hg status -mard`.
332 return true;
350 return true;
333 }
351 }
334 if !self.options.list_ignored
352 if !self.options.list_ignored
335 && self.ignore_patterns_have_changed == Some(false)
353 && self.ignore_patterns_have_changed == Some(false)
336 {
354 {
337 if let Some(cached_mtime) = cached_directory_mtime {
355 if let Some(cached_mtime) = cached_directory_mtime {
338 // The dirstate contains a cached mtime for this directory, set
356 // The dirstate contains a cached mtime for this directory, set
339 // by a previous run of the `status` algorithm which found this
357 // by a previous run of the `status` algorithm which found this
340 // directory eligible for `read_dir` caching.
358 // directory eligible for `read_dir` caching.
341 if let Some(meta) = directory_metadata {
359 if let Some(meta) = directory_metadata {
342 if cached_mtime
360 if cached_mtime
343 .likely_equal_to_mtime_of(meta)
361 .likely_equal_to_mtime_of(meta)
344 .unwrap_or(false)
362 .unwrap_or(false)
345 {
363 {
346 // The mtime of that directory has not changed
364 // The mtime of that directory has not changed
347 // since then, which means that the results of
365 // since then, which means that the results of
348 // `read_dir` should also be unchanged.
366 // `read_dir` should also be unchanged.
349 return true;
367 return true;
350 }
368 }
351 }
369 }
352 }
370 }
353 }
371 }
354 false
372 false
355 }
373 }
356
374
357 /// Returns whether all child entries of the filesystem directory have a
375 /// Returns whether all child entries of the filesystem directory have a
358 /// corresponding dirstate node or are ignored.
376 /// corresponding dirstate node or are ignored.
359 fn traverse_fs_directory_and_dirstate<'ancestor>(
377 fn traverse_fs_directory_and_dirstate<'ancestor>(
360 &self,
378 &self,
361 has_ignored_ancestor: &'ancestor HasIgnoredAncestor<'ancestor>,
379 has_ignored_ancestor: &'ancestor HasIgnoredAncestor<'ancestor>,
362 dirstate_nodes: ChildNodesRef<'tree, 'on_disk>,
380 dirstate_nodes: ChildNodesRef<'tree, 'on_disk>,
363 directory_hg_path: &BorrowedPath<'tree, 'on_disk>,
381 directory_hg_path: &BorrowedPath<'tree, 'on_disk>,
364 directory_fs_path: &Path,
382 directory_fs_path: &Path,
365 directory_metadata: Option<&std::fs::Metadata>,
383 directory_metadata: Option<&std::fs::Metadata>,
366 cached_directory_mtime: Option<TruncatedTimestamp>,
384 cached_directory_mtime: Option<TruncatedTimestamp>,
367 is_at_repo_root: bool,
385 is_at_repo_root: bool,
368 ) -> Result<bool, DirstateV2ParseError> {
386 ) -> Result<bool, DirstateV2ParseError> {
369 if self.can_skip_fs_readdir(directory_metadata, cached_directory_mtime)
387 if self.can_skip_fs_readdir(directory_metadata, cached_directory_mtime)
370 {
388 {
371 dirstate_nodes
389 dirstate_nodes
372 .par_iter()
390 .par_iter()
373 .map(|dirstate_node| {
391 .map(|dirstate_node| {
374 let fs_path = directory_fs_path.join(get_path_from_bytes(
392 let fs_path = directory_fs_path.join(get_path_from_bytes(
375 dirstate_node.base_name(self.dmap.on_disk)?.as_bytes(),
393 dirstate_node.base_name(self.dmap.on_disk)?.as_bytes(),
376 ));
394 ));
377 match std::fs::symlink_metadata(&fs_path) {
395 match std::fs::symlink_metadata(&fs_path) {
378 Ok(fs_metadata) => self.traverse_fs_and_dirstate(
396 Ok(fs_metadata) => self.traverse_fs_and_dirstate(
379 &fs_path,
397 &fs_path,
380 &fs_metadata,
398 &fs_metadata,
381 dirstate_node,
399 dirstate_node,
382 has_ignored_ancestor,
400 has_ignored_ancestor,
383 ),
401 ),
384 Err(e) if e.kind() == std::io::ErrorKind::NotFound => {
402 Err(e) if e.kind() == std::io::ErrorKind::NotFound => {
385 self.traverse_dirstate_only(dirstate_node)
403 self.traverse_dirstate_only(dirstate_node)
386 }
404 }
387 Err(error) => {
405 Err(error) => {
388 let hg_path =
406 let hg_path =
389 dirstate_node.full_path(self.dmap.on_disk)?;
407 dirstate_node.full_path(self.dmap.on_disk)?;
390 Ok(self.io_error(error, hg_path))
408 Ok(self.io_error(error, hg_path))
391 }
409 }
392 }
410 }
393 })
411 })
394 .collect::<Result<_, _>>()?;
412 .collect::<Result<_, _>>()?;
395
413
396 // We don’t know, so conservatively say this isn’t the case
414 // We don’t know, so conservatively say this isn’t the case
397 let children_all_have_dirstate_node_or_are_ignored = false;
415 let children_all_have_dirstate_node_or_are_ignored = false;
398
416
399 return Ok(children_all_have_dirstate_node_or_are_ignored);
417 return Ok(children_all_have_dirstate_node_or_are_ignored);
400 }
418 }
401
419
402 let mut fs_entries = if let Ok(entries) = self.read_dir(
420 let mut fs_entries = if let Ok(entries) = self.read_dir(
403 directory_hg_path,
421 directory_hg_path,
404 directory_fs_path,
422 directory_fs_path,
405 is_at_repo_root,
423 is_at_repo_root,
406 ) {
424 ) {
407 entries
425 entries
408 } else {
426 } else {
409 // Treat an unreadable directory (typically because of insufficient
427 // Treat an unreadable directory (typically because of insufficient
410 // permissions) like an empty directory. `self.read_dir` has
428 // permissions) like an empty directory. `self.read_dir` has
411 // already called `self.io_error` so a warning will be emitted.
429 // already called `self.io_error` so a warning will be emitted.
412 Vec::new()
430 Vec::new()
413 };
431 };
414
432
415 // `merge_join_by` requires both its input iterators to be sorted:
433 // `merge_join_by` requires both its input iterators to be sorted:
416
434
417 let dirstate_nodes = dirstate_nodes.sorted();
435 let dirstate_nodes = dirstate_nodes.sorted();
418 // `sort_unstable_by_key` doesn’t allow keys borrowing from the value:
436 // `sort_unstable_by_key` doesn’t allow keys borrowing from the value:
419 // https://github.com/rust-lang/rust/issues/34162
437 // https://github.com/rust-lang/rust/issues/34162
420 fs_entries.sort_unstable_by(|e1, e2| e1.base_name.cmp(&e2.base_name));
438 fs_entries.sort_unstable_by(|e1, e2| e1.base_name.cmp(&e2.base_name));
421
439
422 // Propagate here any error that would happen inside the comparison
440 // Propagate here any error that would happen inside the comparison
423 // callback below
441 // callback below
424 for dirstate_node in &dirstate_nodes {
442 for dirstate_node in &dirstate_nodes {
425 dirstate_node.base_name(self.dmap.on_disk)?;
443 dirstate_node.base_name(self.dmap.on_disk)?;
426 }
444 }
427 itertools::merge_join_by(
445 itertools::merge_join_by(
428 dirstate_nodes,
446 dirstate_nodes,
429 &fs_entries,
447 &fs_entries,
430 |dirstate_node, fs_entry| {
448 |dirstate_node, fs_entry| {
431 // This `unwrap` never panics because we already propagated
449 // This `unwrap` never panics because we already propagated
432 // those errors above
450 // those errors above
433 dirstate_node
451 dirstate_node
434 .base_name(self.dmap.on_disk)
452 .base_name(self.dmap.on_disk)
435 .unwrap()
453 .unwrap()
436 .cmp(&fs_entry.base_name)
454 .cmp(&fs_entry.base_name)
437 },
455 },
438 )
456 )
439 .par_bridge()
457 .par_bridge()
440 .map(|pair| {
458 .map(|pair| {
441 use itertools::EitherOrBoth::*;
459 use itertools::EitherOrBoth::*;
442 let has_dirstate_node_or_is_ignored;
460 let has_dirstate_node_or_is_ignored;
443 match pair {
461 match pair {
444 Both(dirstate_node, fs_entry) => {
462 Both(dirstate_node, fs_entry) => {
445 self.traverse_fs_and_dirstate(
463 self.traverse_fs_and_dirstate(
446 &fs_entry.full_path,
464 &fs_entry.full_path,
447 &fs_entry.metadata,
465 &fs_entry.metadata,
448 dirstate_node,
466 dirstate_node,
449 has_ignored_ancestor,
467 has_ignored_ancestor,
450 )?;
468 )?;
451 has_dirstate_node_or_is_ignored = true
469 has_dirstate_node_or_is_ignored = true
452 }
470 }
453 Left(dirstate_node) => {
471 Left(dirstate_node) => {
454 self.traverse_dirstate_only(dirstate_node)?;
472 self.traverse_dirstate_only(dirstate_node)?;
455 has_dirstate_node_or_is_ignored = true;
473 has_dirstate_node_or_is_ignored = true;
456 }
474 }
457 Right(fs_entry) => {
475 Right(fs_entry) => {
458 has_dirstate_node_or_is_ignored = self.traverse_fs_only(
476 has_dirstate_node_or_is_ignored = self.traverse_fs_only(
459 has_ignored_ancestor.force(&self.ignore_fn),
477 has_ignored_ancestor.force(&self.ignore_fn),
460 directory_hg_path,
478 directory_hg_path,
461 fs_entry,
479 fs_entry,
462 )
480 )
463 }
481 }
464 }
482 }
465 Ok(has_dirstate_node_or_is_ignored)
483 Ok(has_dirstate_node_or_is_ignored)
466 })
484 })
467 .try_reduce(|| true, |a, b| Ok(a && b))
485 .try_reduce(|| true, |a, b| Ok(a && b))
468 }
486 }
469
487
470 fn traverse_fs_and_dirstate<'ancestor>(
488 fn traverse_fs_and_dirstate<'ancestor>(
471 &self,
489 &self,
472 fs_path: &Path,
490 fs_path: &Path,
473 fs_metadata: &std::fs::Metadata,
491 fs_metadata: &std::fs::Metadata,
474 dirstate_node: NodeRef<'tree, 'on_disk>,
492 dirstate_node: NodeRef<'tree, 'on_disk>,
475 has_ignored_ancestor: &'ancestor HasIgnoredAncestor<'ancestor>,
493 has_ignored_ancestor: &'ancestor HasIgnoredAncestor<'ancestor>,
476 ) -> Result<(), DirstateV2ParseError> {
494 ) -> Result<(), DirstateV2ParseError> {
477 let outdated_dircache =
495 let outdated_dircache =
478 self.check_for_outdated_directory_cache(&dirstate_node)?;
496 self.check_for_outdated_directory_cache(&dirstate_node)?;
479 let hg_path = &dirstate_node.full_path_borrowed(self.dmap.on_disk)?;
497 let hg_path = &dirstate_node.full_path_borrowed(self.dmap.on_disk)?;
480 let file_type = fs_metadata.file_type();
498 let file_type = fs_metadata.file_type();
481 let file_or_symlink = file_type.is_file() || file_type.is_symlink();
499 let file_or_symlink = file_type.is_file() || file_type.is_symlink();
482 if !file_or_symlink {
500 if !file_or_symlink {
483 // If we previously had a file here, it was removed (with
501 // If we previously had a file here, it was removed (with
484 // `hg rm` or similar) or deleted before it could be
502 // `hg rm` or similar) or deleted before it could be
485 // replaced by a directory or something else.
503 // replaced by a directory or something else.
486 self.mark_removed_or_deleted_if_file(&dirstate_node)?;
504 self.mark_removed_or_deleted_if_file(&dirstate_node)?;
487 }
505 }
488 if file_type.is_dir() {
506 if file_type.is_dir() {
489 if self.options.collect_traversed_dirs {
507 if self.options.collect_traversed_dirs {
490 self.outcome
508 self.outcome
491 .lock()
509 .lock()
492 .unwrap()
510 .unwrap()
493 .traversed
511 .traversed
494 .push(hg_path.detach_from_tree())
512 .push(hg_path.detach_from_tree())
495 }
513 }
496 let is_ignored = HasIgnoredAncestor::create(
514 let is_ignored = HasIgnoredAncestor::create(
497 Some(&has_ignored_ancestor),
515 Some(&has_ignored_ancestor),
498 hg_path,
516 hg_path,
499 );
517 );
500 let is_at_repo_root = false;
518 let is_at_repo_root = false;
501 let children_all_have_dirstate_node_or_are_ignored = self
519 let children_all_have_dirstate_node_or_are_ignored = self
502 .traverse_fs_directory_and_dirstate(
520 .traverse_fs_directory_and_dirstate(
503 &is_ignored,
521 &is_ignored,
504 dirstate_node.children(self.dmap.on_disk)?,
522 dirstate_node.children(self.dmap.on_disk)?,
505 hg_path,
523 hg_path,
506 fs_path,
524 fs_path,
507 Some(fs_metadata),
525 Some(fs_metadata),
508 dirstate_node.cached_directory_mtime()?,
526 dirstate_node.cached_directory_mtime()?,
509 is_at_repo_root,
527 is_at_repo_root,
510 )?;
528 )?;
511 self.maybe_save_directory_mtime(
529 self.maybe_save_directory_mtime(
512 children_all_have_dirstate_node_or_are_ignored,
530 children_all_have_dirstate_node_or_are_ignored,
513 fs_metadata,
531 fs_metadata,
514 dirstate_node,
532 dirstate_node,
515 outdated_dircache,
533 outdated_dircache,
516 )?
534 )?
517 } else {
535 } else {
518 if file_or_symlink && self.matcher.matches(&hg_path) {
536 if file_or_symlink && self.matcher.matches(&hg_path) {
519 if let Some(entry) = dirstate_node.entry()? {
537 if let Some(entry) = dirstate_node.entry()? {
520 if !entry.any_tracked() {
538 if !entry.any_tracked() {
521 // Forward-compat if we start tracking unknown/ignored
539 // Forward-compat if we start tracking unknown/ignored
522 // files for caching reasons
540 // files for caching reasons
523 self.mark_unknown_or_ignored(
541 self.mark_unknown_or_ignored(
524 has_ignored_ancestor.force(&self.ignore_fn),
542 has_ignored_ancestor.force(&self.ignore_fn),
525 &hg_path,
543 &hg_path,
526 );
544 );
527 }
545 }
528 if entry.added() {
546 if entry.added() {
529 self.push_outcome(Outcome::Added, &dirstate_node)?;
547 self.push_outcome(Outcome::Added, &dirstate_node)?;
530 } else if entry.removed() {
548 } else if entry.removed() {
531 self.push_outcome(Outcome::Removed, &dirstate_node)?;
549 self.push_outcome(Outcome::Removed, &dirstate_node)?;
532 } else if entry.modified() {
550 } else if entry.modified() {
533 self.push_outcome(Outcome::Modified, &dirstate_node)?;
551 self.push_outcome(Outcome::Modified, &dirstate_node)?;
534 } else {
552 } else {
535 self.handle_normal_file(&dirstate_node, fs_metadata)?;
553 self.handle_normal_file(&dirstate_node, fs_metadata)?;
536 }
554 }
537 } else {
555 } else {
538 // `node.entry.is_none()` indicates a "directory"
556 // `node.entry.is_none()` indicates a "directory"
539 // node, but the filesystem has a file
557 // node, but the filesystem has a file
540 self.mark_unknown_or_ignored(
558 self.mark_unknown_or_ignored(
541 has_ignored_ancestor.force(&self.ignore_fn),
559 has_ignored_ancestor.force(&self.ignore_fn),
542 hg_path,
560 hg_path,
543 );
561 );
544 }
562 }
545 }
563 }
546
564
547 for child_node in dirstate_node.children(self.dmap.on_disk)?.iter()
565 for child_node in dirstate_node.children(self.dmap.on_disk)?.iter()
548 {
566 {
549 self.traverse_dirstate_only(child_node)?
567 self.traverse_dirstate_only(child_node)?
550 }
568 }
551 }
569 }
552 Ok(())
570 Ok(())
553 }
571 }
554
572
555 /// Save directory mtime if applicable.
573 /// Save directory mtime if applicable.
556 ///
574 ///
557 /// `outdated_directory_cache` is `true` if we've just invalidated the
575 /// `outdated_directory_cache` is `true` if we've just invalidated the
558 /// cache for this directory in `check_for_outdated_directory_cache`,
576 /// cache for this directory in `check_for_outdated_directory_cache`,
559 /// which forces the update.
577 /// which forces the update.
560 fn maybe_save_directory_mtime(
578 fn maybe_save_directory_mtime(
561 &self,
579 &self,
562 children_all_have_dirstate_node_or_are_ignored: bool,
580 children_all_have_dirstate_node_or_are_ignored: bool,
563 directory_metadata: &std::fs::Metadata,
581 directory_metadata: &std::fs::Metadata,
564 dirstate_node: NodeRef<'tree, 'on_disk>,
582 dirstate_node: NodeRef<'tree, 'on_disk>,
565 outdated_directory_cache: bool,
583 outdated_directory_cache: bool,
566 ) -> Result<(), DirstateV2ParseError> {
584 ) -> Result<(), DirstateV2ParseError> {
567 if !children_all_have_dirstate_node_or_are_ignored {
585 if !children_all_have_dirstate_node_or_are_ignored {
568 return Ok(());
586 return Ok(());
569 }
587 }
570 // All filesystem directory entries from `read_dir` have a
588 // All filesystem directory entries from `read_dir` have a
571 // corresponding node in the dirstate, so we can reconstitute the
589 // corresponding node in the dirstate, so we can reconstitute the
572 // names of those entries without calling `read_dir` again.
590 // names of those entries without calling `read_dir` again.
573
591
574 // TODO: use let-else here and below when available:
592 // TODO: use let-else here and below when available:
575 // https://github.com/rust-lang/rust/issues/87335
593 // https://github.com/rust-lang/rust/issues/87335
576 let status_start = if let Some(status_start) =
594 let status_start = if let Some(status_start) =
577 &self.filesystem_time_at_status_start
595 &self.filesystem_time_at_status_start
578 {
596 {
579 status_start
597 status_start
580 } else {
598 } else {
581 return Ok(());
599 return Ok(());
582 };
600 };
583
601
584 // Although the Rust standard library’s `SystemTime` type
602 // Although the Rust standard library’s `SystemTime` type
585 // has nanosecond precision, the times reported for a
603 // has nanosecond precision, the times reported for a
586 // directory’s (or file’s) modified time may have lower
604 // directory’s (or file’s) modified time may have lower
587 // resolution based on the filesystem (for example ext3
605 // resolution based on the filesystem (for example ext3
588 // only stores integer seconds), kernel (see
606 // only stores integer seconds), kernel (see
589 // https://stackoverflow.com/a/14393315/1162888), etc.
607 // https://stackoverflow.com/a/14393315/1162888), etc.
590 let directory_mtime = if let Ok(option) =
608 let directory_mtime = if let Ok(option) =
591 TruncatedTimestamp::for_reliable_mtime_of(
609 TruncatedTimestamp::for_reliable_mtime_of(
592 directory_metadata,
610 directory_metadata,
593 status_start,
611 status_start,
594 ) {
612 ) {
595 if let Some(directory_mtime) = option {
613 if let Some(directory_mtime) = option {
596 directory_mtime
614 directory_mtime
597 } else {
615 } else {
598 // The directory was modified too recently,
616 // The directory was modified too recently,
599 // don’t cache its `read_dir` results.
617 // don’t cache its `read_dir` results.
600 //
618 //
601 // 1. A change to this directory (direct child was
619 // 1. A change to this directory (direct child was
602 // added or removed) causes its mtime to be set
620 // added or removed) causes its mtime to be set
603 // (possibly truncated) to `directory_mtime`
621 // (possibly truncated) to `directory_mtime`
604 // 2. This `status` algorithm calls `read_dir`
622 // 2. This `status` algorithm calls `read_dir`
605 // 3. Another change is made to the same directory, such
623 // 3. Another change is made to the same directory, such
606 // that calling `read_dir` again would give
624 // that calling `read_dir` again would give
607 // different results, but soon enough after 1. that
625 // different results, but soon enough after 1. that
608 // the mtime stays the same
626 // the mtime stays the same
609 //
627 //
610 // On a system where the time resolution is poor, this
628 // On a system where the time resolution is poor, this
611 // scenario is not unlikely if all three steps are caused
629 // scenario is not unlikely if all three steps are caused
612 // by the same script.
630 // by the same script.
613 return Ok(());
631 return Ok(());
614 }
632 }
615 } else {
633 } else {
616 // OS/libc does not support mtime?
634 // OS/libc does not support mtime?
617 return Ok(());
635 return Ok(());
618 };
636 };
619 // We’ve observed (through `status_start`) that time has
637 // We’ve observed (through `status_start`) that time has
620 // “progressed” since `directory_mtime`, so any further
638 // “progressed” since `directory_mtime`, so any further
621 // change to this directory is extremely likely to cause a
639 // change to this directory is extremely likely to cause a
622 // different mtime.
640 // different mtime.
623 //
641 //
624 // Having the same mtime again is not entirely impossible
642 // Having the same mtime again is not entirely impossible
625 // since the system clock is not monotonic. It could jump
643 // since the system clock is not monotonic. It could jump
626 // backward to some point before `directory_mtime`, then a
644 // backward to some point before `directory_mtime`, then a
627 // directory change could potentially happen during exactly
645 // directory change could potentially happen during exactly
628 // the wrong tick.
646 // the wrong tick.
629 //
647 //
630 // We deem this scenario (unlike the previous one) to be
648 // We deem this scenario (unlike the previous one) to be
631 // unlikely enough in practice.
649 // unlikely enough in practice.
632
650
633 let is_up_to_date = if let Some(cached) =
651 let is_up_to_date = if let Some(cached) =
634 dirstate_node.cached_directory_mtime()?
652 dirstate_node.cached_directory_mtime()?
635 {
653 {
636 !outdated_directory_cache && cached.likely_equal(directory_mtime)
654 !outdated_directory_cache && cached.likely_equal(directory_mtime)
637 } else {
655 } else {
638 false
656 false
639 };
657 };
640 if !is_up_to_date {
658 if !is_up_to_date {
641 let hg_path = dirstate_node
659 let hg_path = dirstate_node
642 .full_path_borrowed(self.dmap.on_disk)?
660 .full_path_borrowed(self.dmap.on_disk)?
643 .detach_from_tree();
661 .detach_from_tree();
644 self.new_cacheable_directories
662 self.new_cacheable_directories
645 .lock()
663 .lock()
646 .unwrap()
664 .unwrap()
647 .push((hg_path, directory_mtime))
665 .push((hg_path, directory_mtime))
648 }
666 }
649 Ok(())
667 Ok(())
650 }
668 }
651
669
652 /// A file that is clean in the dirstate was found in the filesystem
670 /// A file that is clean in the dirstate was found in the filesystem
653 fn handle_normal_file(
671 fn handle_normal_file(
654 &self,
672 &self,
655 dirstate_node: &NodeRef<'tree, 'on_disk>,
673 dirstate_node: &NodeRef<'tree, 'on_disk>,
656 fs_metadata: &std::fs::Metadata,
674 fs_metadata: &std::fs::Metadata,
657 ) -> Result<(), DirstateV2ParseError> {
675 ) -> Result<(), DirstateV2ParseError> {
658 // Keep the low 31 bits
676 // Keep the low 31 bits
659 fn truncate_u64(value: u64) -> i32 {
677 fn truncate_u64(value: u64) -> i32 {
660 (value & 0x7FFF_FFFF) as i32
678 (value & 0x7FFF_FFFF) as i32
661 }
679 }
662
680
663 let entry = dirstate_node
681 let entry = dirstate_node
664 .entry()?
682 .entry()?
665 .expect("handle_normal_file called with entry-less node");
683 .expect("handle_normal_file called with entry-less node");
666 let mode_changed =
684 let mode_changed =
667 || self.options.check_exec && entry.mode_changed(fs_metadata);
685 || self.options.check_exec && entry.mode_changed(fs_metadata);
668 let size = entry.size();
686 let size = entry.size();
669 let size_changed = size != truncate_u64(fs_metadata.len());
687 let size_changed = size != truncate_u64(fs_metadata.len());
670 if size >= 0 && size_changed && fs_metadata.file_type().is_symlink() {
688 if size >= 0 && size_changed && fs_metadata.file_type().is_symlink() {
671 // issue6456: Size returned may be longer due to encryption
689 // issue6456: Size returned may be longer due to encryption
672 // on EXT-4 fscrypt. TODO maybe only do it on EXT4?
690 // on EXT-4 fscrypt. TODO maybe only do it on EXT4?
673 self.push_outcome(Outcome::Unsure, dirstate_node)?
691 self.push_outcome(Outcome::Unsure, dirstate_node)?
674 } else if dirstate_node.has_copy_source()
692 } else if dirstate_node.has_copy_source()
675 || entry.is_from_other_parent()
693 || entry.is_from_other_parent()
676 || (size >= 0 && (size_changed || mode_changed()))
694 || (size >= 0 && (size_changed || mode_changed()))
677 {
695 {
678 self.push_outcome(Outcome::Modified, dirstate_node)?
696 self.push_outcome(Outcome::Modified, dirstate_node)?
679 } else {
697 } else {
680 let mtime_looks_clean;
698 let mtime_looks_clean;
681 if let Some(dirstate_mtime) = entry.truncated_mtime() {
699 if let Some(dirstate_mtime) = entry.truncated_mtime() {
682 let fs_mtime = TruncatedTimestamp::for_mtime_of(fs_metadata)
700 let fs_mtime = TruncatedTimestamp::for_mtime_of(fs_metadata)
683 .expect("OS/libc does not support mtime?");
701 .expect("OS/libc does not support mtime?");
684 // There might be a change in the future if for example the
702 // There might be a change in the future if for example the
685 // internal clock drifts while the process runs, but this is a
703 // internal clock drifts while the process runs, but this is a
686 // case where the issues the user would face
704 // case where the issues the user would face
687 // would be a lot worse and there is nothing we
705 // would be a lot worse and there is nothing we
688 // can really do.
706 // can really do.
689 mtime_looks_clean = fs_mtime.likely_equal(dirstate_mtime)
707 mtime_looks_clean = fs_mtime.likely_equal(dirstate_mtime)
690 } else {
708 } else {
691 // No mtime in the dirstate entry
709 // No mtime in the dirstate entry
692 mtime_looks_clean = false
710 mtime_looks_clean = false
693 };
711 };
694 if !mtime_looks_clean {
712 if !mtime_looks_clean {
695 self.push_outcome(Outcome::Unsure, dirstate_node)?
713 self.push_outcome(Outcome::Unsure, dirstate_node)?
696 } else if self.options.list_clean {
714 } else if self.options.list_clean {
697 self.push_outcome(Outcome::Clean, dirstate_node)?
715 self.push_outcome(Outcome::Clean, dirstate_node)?
698 }
716 }
699 }
717 }
700 Ok(())
718 Ok(())
701 }
719 }
702
720
703 /// A node in the dirstate tree has no corresponding filesystem entry
721 /// A node in the dirstate tree has no corresponding filesystem entry
704 fn traverse_dirstate_only(
722 fn traverse_dirstate_only(
705 &self,
723 &self,
706 dirstate_node: NodeRef<'tree, 'on_disk>,
724 dirstate_node: NodeRef<'tree, 'on_disk>,
707 ) -> Result<(), DirstateV2ParseError> {
725 ) -> Result<(), DirstateV2ParseError> {
708 self.check_for_outdated_directory_cache(&dirstate_node)?;
726 self.check_for_outdated_directory_cache(&dirstate_node)?;
709 self.mark_removed_or_deleted_if_file(&dirstate_node)?;
727 self.mark_removed_or_deleted_if_file(&dirstate_node)?;
710 dirstate_node
728 dirstate_node
711 .children(self.dmap.on_disk)?
729 .children(self.dmap.on_disk)?
712 .par_iter()
730 .par_iter()
713 .map(|child_node| self.traverse_dirstate_only(child_node))
731 .map(|child_node| self.traverse_dirstate_only(child_node))
714 .collect()
732 .collect()
715 }
733 }
716
734
717 /// A node in the dirstate tree has no corresponding *file* on the
735 /// A node in the dirstate tree has no corresponding *file* on the
718 /// filesystem
736 /// filesystem
719 ///
737 ///
720 /// Does nothing on a "directory" node
738 /// Does nothing on a "directory" node
721 fn mark_removed_or_deleted_if_file(
739 fn mark_removed_or_deleted_if_file(
722 &self,
740 &self,
723 dirstate_node: &NodeRef<'tree, 'on_disk>,
741 dirstate_node: &NodeRef<'tree, 'on_disk>,
724 ) -> Result<(), DirstateV2ParseError> {
742 ) -> Result<(), DirstateV2ParseError> {
725 if let Some(entry) = dirstate_node.entry()? {
743 if let Some(entry) = dirstate_node.entry()? {
726 if !entry.any_tracked() {
744 if !entry.any_tracked() {
727 // Future-compat for when we start storing ignored and unknown
745 // Future-compat for when we start storing ignored and unknown
728 // files for caching reasons
746 // files for caching reasons
729 return Ok(());
747 return Ok(());
730 }
748 }
731 let path = dirstate_node.full_path(self.dmap.on_disk)?;
749 let path = dirstate_node.full_path(self.dmap.on_disk)?;
732 if self.matcher.matches(path) {
750 if self.matcher.matches(path) {
733 if entry.removed() {
751 if entry.removed() {
734 self.push_outcome(Outcome::Removed, dirstate_node)?
752 self.push_outcome(Outcome::Removed, dirstate_node)?
735 } else {
753 } else {
736 self.push_outcome(Outcome::Deleted, &dirstate_node)?
754 self.push_outcome(Outcome::Deleted, &dirstate_node)?
737 }
755 }
738 }
756 }
739 }
757 }
740 Ok(())
758 Ok(())
741 }
759 }
742
760
743 /// Something in the filesystem has no corresponding dirstate node
761 /// Something in the filesystem has no corresponding dirstate node
744 ///
762 ///
745 /// Returns whether that path is ignored
763 /// Returns whether that path is ignored
746 fn traverse_fs_only(
764 fn traverse_fs_only(
747 &self,
765 &self,
748 has_ignored_ancestor: bool,
766 has_ignored_ancestor: bool,
749 directory_hg_path: &HgPath,
767 directory_hg_path: &HgPath,
750 fs_entry: &DirEntry,
768 fs_entry: &DirEntry,
751 ) -> bool {
769 ) -> bool {
752 let hg_path = directory_hg_path.join(&fs_entry.base_name);
770 let hg_path = directory_hg_path.join(&fs_entry.base_name);
753 let file_type = fs_entry.metadata.file_type();
771 let file_type = fs_entry.metadata.file_type();
754 let file_or_symlink = file_type.is_file() || file_type.is_symlink();
772 let file_or_symlink = file_type.is_file() || file_type.is_symlink();
755 if file_type.is_dir() {
773 if file_type.is_dir() {
756 let is_ignored =
774 let is_ignored =
757 has_ignored_ancestor || (self.ignore_fn)(&hg_path);
775 has_ignored_ancestor || (self.ignore_fn)(&hg_path);
758 let traverse_children = if is_ignored {
776 let traverse_children = if is_ignored {
759 // Descendants of an ignored directory are all ignored
777 // Descendants of an ignored directory are all ignored
760 self.options.list_ignored
778 self.options.list_ignored
761 } else {
779 } else {
762 // Descendants of an unknown directory may be either unknown or
780 // Descendants of an unknown directory may be either unknown or
763 // ignored
781 // ignored
764 self.options.list_unknown || self.options.list_ignored
782 self.options.list_unknown || self.options.list_ignored
765 };
783 };
766 if traverse_children {
784 if traverse_children {
767 let is_at_repo_root = false;
785 let is_at_repo_root = false;
768 if let Ok(children_fs_entries) = self.read_dir(
786 if let Ok(children_fs_entries) = self.read_dir(
769 &hg_path,
787 &hg_path,
770 &fs_entry.full_path,
788 &fs_entry.full_path,
771 is_at_repo_root,
789 is_at_repo_root,
772 ) {
790 ) {
773 children_fs_entries.par_iter().for_each(|child_fs_entry| {
791 children_fs_entries.par_iter().for_each(|child_fs_entry| {
774 self.traverse_fs_only(
792 self.traverse_fs_only(
775 is_ignored,
793 is_ignored,
776 &hg_path,
794 &hg_path,
777 child_fs_entry,
795 child_fs_entry,
778 );
796 );
779 })
797 })
780 }
798 }
781 if self.options.collect_traversed_dirs {
799 if self.options.collect_traversed_dirs {
782 self.outcome.lock().unwrap().traversed.push(hg_path.into())
800 self.outcome.lock().unwrap().traversed.push(hg_path.into())
783 }
801 }
784 }
802 }
785 is_ignored
803 is_ignored
786 } else {
804 } else {
787 if file_or_symlink {
805 if file_or_symlink {
788 if self.matcher.matches(&hg_path) {
806 if self.matcher.matches(&hg_path) {
789 self.mark_unknown_or_ignored(
807 self.mark_unknown_or_ignored(
790 has_ignored_ancestor,
808 has_ignored_ancestor,
791 &BorrowedPath::InMemory(&hg_path),
809 &BorrowedPath::InMemory(&hg_path),
792 )
810 )
793 } else {
811 } else {
794 // We haven’t computed whether this path is ignored. It
812 // We haven’t computed whether this path is ignored. It
795 // might not be, and a future run of status might have a
813 // might not be, and a future run of status might have a
796 // different matcher that matches it. So treat it as not
814 // different matcher that matches it. So treat it as not
797 // ignored. That is, inhibit readdir caching of the parent
815 // ignored. That is, inhibit readdir caching of the parent
798 // directory.
816 // directory.
799 false
817 false
800 }
818 }
801 } else {
819 } else {
802 // This is neither a directory, a plain file, nor a symlink.
820 // This is neither a directory, a plain file, nor a symlink.
803 // Treat it like an ignored file.
821 // Treat it like an ignored file.
804 true
822 true
805 }
823 }
806 }
824 }
807 }
825 }
808
826
809 /// Returns whether that path is ignored
827 /// Returns whether that path is ignored
810 fn mark_unknown_or_ignored(
828 fn mark_unknown_or_ignored(
811 &self,
829 &self,
812 has_ignored_ancestor: bool,
830 has_ignored_ancestor: bool,
813 hg_path: &BorrowedPath<'_, 'on_disk>,
831 hg_path: &BorrowedPath<'_, 'on_disk>,
814 ) -> bool {
832 ) -> bool {
815 let is_ignored = has_ignored_ancestor || (self.ignore_fn)(&hg_path);
833 let is_ignored = has_ignored_ancestor || (self.ignore_fn)(&hg_path);
816 if is_ignored {
834 if is_ignored {
817 if self.options.list_ignored {
835 if self.options.list_ignored {
818 self.push_outcome_without_copy_source(
836 self.push_outcome_without_copy_source(
819 Outcome::Ignored,
837 Outcome::Ignored,
820 hg_path,
838 hg_path,
821 )
839 )
822 }
840 }
823 } else {
841 } else {
824 if self.options.list_unknown {
842 if self.options.list_unknown {
825 self.push_outcome_without_copy_source(
843 self.push_outcome_without_copy_source(
826 Outcome::Unknown,
844 Outcome::Unknown,
827 hg_path,
845 hg_path,
828 )
846 )
829 }
847 }
830 }
848 }
831 is_ignored
849 is_ignored
832 }
850 }
833 }
851 }
834
852
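// Illustrative sketch (not part of the original file): the core pairing
// trick used by `traverse_fs_directory_and_dirstate` above.
// `itertools::merge_join_by` walks two *sorted* sequences in lockstep and
// reports, for each name, whether it exists in both, only in the dirstate,
// or only on the filesystem -- which is exactly the three-way dispatch the
// status algorithm performs. Assumes the `itertools` crate already used in
// this file.
#[cfg(test)]
mod merge_join_by_sketch {
    use itertools::EitherOrBoth::{Both, Left, Right};

    #[test]
    fn pairs_two_sorted_sequences() {
        let dirstate_names = ["a", "b", "d"]; // known to the dirstate
        let fs_names = ["b", "c", "d"]; // found by `read_dir`
        let merged: Vec<String> =
            itertools::merge_join_by(dirstate_names, fs_names, |d, f| d.cmp(f))
                .map(|pair| match pair {
                    Both(name, _) => format!("both:{}", name),
                    Left(name) => format!("dirstate-only:{}", name),
                    Right(name) => format!("fs-only:{}", name),
                })
                .collect();
        assert_eq!(
            merged,
            ["dirstate-only:a", "both:b", "fs-only:c", "both:d"]
        );
    }
}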
835 struct DirEntry {
853 struct DirEntry {
836 base_name: HgPathBuf,
854 base_name: HgPathBuf,
837 full_path: PathBuf,
855 full_path: PathBuf,
838 metadata: std::fs::Metadata,
856 metadata: std::fs::Metadata,
839 }
857 }
840
858
841 impl DirEntry {
859 impl DirEntry {
842 /// Returns **unsorted** entries in the given directory, with name and
860 /// Returns **unsorted** entries in the given directory, with name and
843 /// metadata.
861 /// metadata.
844 ///
862 ///
845 /// If a `.hg` sub-directory is encountered:
863 /// If a `.hg` sub-directory is encountered:
846 ///
864 ///
847 /// * At the repository root, ignore that sub-directory
865 /// * At the repository root, ignore that sub-directory
848 /// * Elsewhere, we’re listing the content of a sub-repo. Return an empty
866 /// * Elsewhere, we’re listing the content of a sub-repo. Return an empty
849 /// list instead.
867 /// list instead.
850 fn read_dir(path: &Path, is_at_repo_root: bool) -> io::Result<Vec<Self>> {
868 fn read_dir(path: &Path, is_at_repo_root: bool) -> io::Result<Vec<Self>> {
851 // `read_dir` returns a "not found" error for the empty path
869 // `read_dir` returns a "not found" error for the empty path
852 let at_cwd = path == Path::new("");
870 let at_cwd = path == Path::new("");
853 let read_dir_path = if at_cwd { Path::new(".") } else { path };
871 let read_dir_path = if at_cwd { Path::new(".") } else { path };
854 let mut results = Vec::new();
872 let mut results = Vec::new();
855 for entry in read_dir_path.read_dir()? {
873 for entry in read_dir_path.read_dir()? {
856 let entry = entry?;
874 let entry = entry?;
857 let metadata = match entry.metadata() {
875 let metadata = match entry.metadata() {
858 Ok(v) => v,
876 Ok(v) => v,
859 Err(e) => {
877 Err(e) => {
860 // race with file deletion?
878 // race with file deletion?
861 if e.kind() == std::io::ErrorKind::NotFound {
879 if e.kind() == std::io::ErrorKind::NotFound {
862 continue;
880 continue;
863 } else {
881 } else {
864 return Err(e);
882 return Err(e);
865 }
883 }
866 }
884 }
867 };
885 };
868 let file_name = entry.file_name();
886 let file_name = entry.file_name();
869 // FIXME don't do this when cached
887 // FIXME don't do this when cached
870 if file_name == ".hg" {
888 if file_name == ".hg" {
871 if is_at_repo_root {
889 if is_at_repo_root {
872 // Skip the repo’s own .hg (might be a symlink)
890 // Skip the repo’s own .hg (might be a symlink)
873 continue;
891 continue;
874 } else if metadata.is_dir() {
892 } else if metadata.is_dir() {
875 // A .hg sub-directory at another location means a subrepo,
893 // A .hg sub-directory at another location means a subrepo,
876 // skip it entirely.
894 // skip it entirely.
877 return Ok(Vec::new());
895 return Ok(Vec::new());
878 }
896 }
879 }
897 }
880 let full_path = if at_cwd {
898 let full_path = if at_cwd {
881 file_name.clone().into()
899 file_name.clone().into()
882 } else {
900 } else {
883 entry.path()
901 entry.path()
884 };
902 };
885 let base_name = get_bytes_from_os_string(file_name).into();
903 let base_name = get_bytes_from_os_string(file_name).into();
886 results.push(DirEntry {
904 results.push(DirEntry {
887 base_name,
905 base_name,
888 full_path,
906 full_path,
889 metadata,
907 metadata,
890 })
908 })
891 }
909 }
892 Ok(results)
910 Ok(results)
893 }
911 }
894 }
912 }
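// Illustrative sketch (not part of the original file): the "tolerate entries
// that vanish between `read_dir` and `metadata`" pattern used by
// `DirEntry::read_dir` above, written against the standard library only. A
// file deleted concurrently is silently skipped instead of failing the whole
// directory listing.
#[allow(dead_code)]
fn list_surviving_entries(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut paths = Vec::new();
    for entry in dir.read_dir()? {
        let entry = entry?;
        match entry.metadata() {
            Ok(_) => paths.push(entry.path()),
            // Race with file deletion: the entry is already gone, skip it.
            Err(e) if e.kind() == io::ErrorKind::NotFound => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(paths)
}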
895
913
896 /// Return the `mtime` of a temporary file newly-created in the `.hg` directory
914 /// Return the `mtime` of a temporary file newly-created in the `.hg` directory
897 /// of the given repository.
915 /// of the given repository.
898 ///
916 ///
899 /// This is similar to `SystemTime::now()`, with the result truncated to the
917 /// This is similar to `SystemTime::now()`, with the result truncated to the
900 /// same time resolution as other files’ modification times. Using `.hg`
918 /// same time resolution as other files’ modification times. Using `.hg`
901 /// instead of the system’s default temporary directory (such as `/tmp`) makes
919 /// instead of the system’s default temporary directory (such as `/tmp`) makes
902 /// it more likely the temporary file is in the same disk partition as contents
920 /// it more likely the temporary file is in the same disk partition as contents
903 /// of the working directory, which can matter since different filesystems may
921 /// of the working directory, which can matter since different filesystems may
904 /// store timestamps with different resolutions.
922 /// store timestamps with different resolutions.
905 ///
923 ///
906 /// This may fail, typically if we lack write permissions. In that case we
924 /// This may fail, typically if we lack write permissions. In that case we
907 /// should continue the `status()` algorithm anyway and consider the current
925 /// should continue the `status()` algorithm anyway and consider the current
908 /// date/time to be unknown.
926 /// date/time to be unknown.
909 fn filesystem_now(repo_root: &Path) -> Result<SystemTime, io::Error> {
927 fn filesystem_now(repo_root: &Path) -> Result<SystemTime, io::Error> {
910 tempfile::tempfile_in(repo_root.join(".hg"))?
928 tempfile::tempfile_in(repo_root.join(".hg"))?
911 .metadata()?
929 .metadata()?
912 .modified()
930 .modified()
913 }
931 }
@@ -1,706 +1,706
1 // filepatterns.rs
1 // filepatterns.rs
2 //
2 //
3 // Copyright 2019 Raphaël Gomès <rgomes@octobus.net>
3 // Copyright 2019 Raphaël Gomès <rgomes@octobus.net>
4 //
4 //
5 // This software may be used and distributed according to the terms of the
5 // This software may be used and distributed according to the terms of the
6 // GNU General Public License version 2 or any later version.
6 // GNU General Public License version 2 or any later version.
7
7
8 //! Handling of Mercurial-specific patterns.
8 //! Handling of Mercurial-specific patterns.
9
9
10 use crate::{
10 use crate::{
11 utils::{
11 utils::{
12 files::{canonical_path, get_bytes_from_path, get_path_from_bytes},
12 files::{canonical_path, get_bytes_from_path, get_path_from_bytes},
13 hg_path::{path_to_hg_path_buf, HgPathBuf, HgPathError},
13 hg_path::{path_to_hg_path_buf, HgPathBuf, HgPathError},
14 SliceExt,
14 SliceExt,
15 },
15 },
16 FastHashMap, PatternError,
16 FastHashMap, PatternError,
17 };
17 };
18 use lazy_static::lazy_static;
18 use lazy_static::lazy_static;
19 use regex::bytes::{NoExpand, Regex};
19 use regex::bytes::{NoExpand, Regex};
20 use std::ops::Deref;
20 use std::ops::Deref;
21 use std::path::{Path, PathBuf};
21 use std::path::{Path, PathBuf};
22 use std::vec::Vec;
22 use std::vec::Vec;
23
23
24 lazy_static! {
24 lazy_static! {
25 static ref RE_ESCAPE: Vec<Vec<u8>> = {
25 static ref RE_ESCAPE: Vec<Vec<u8>> = {
26 let mut v: Vec<Vec<u8>> = (0..=255).map(|byte| vec![byte]).collect();
26 let mut v: Vec<Vec<u8>> = (0..=255).map(|byte| vec![byte]).collect();
27 let to_escape = b"()[]{}?*+-|^$\\.&~# \t\n\r\x0b\x0c";
27 let to_escape = b"()[]{}?*+-|^$\\.&~# \t\n\r\x0b\x0c";
28 for byte in to_escape {
28 for byte in to_escape {
29 v[*byte as usize].insert(0, b'\\');
29 v[*byte as usize].insert(0, b'\\');
30 }
30 }
31 v
31 v
32 };
32 };
33 }
33 }
34
34
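// Illustrative sketch (not part of the original file): what the RE_ESCAPE
// table above is for. Every byte of a literal is replaced by its table entry
// (the byte itself, or the byte prefixed with a backslash), which yields
// bytes that can be spliced into a regular expression safely. The real
// escaping sites live elsewhere in this file; this helper only makes the
// table concrete.
#[cfg(test)]
mod re_escape_sketch {
    use super::RE_ESCAPE;

    fn escape_literal(literal: &[u8]) -> Vec<u8> {
        let table: &Vec<Vec<u8>> = &RE_ESCAPE;
        literal
            .iter()
            .flat_map(|byte| table[*byte as usize].clone())
            .collect()
    }

    #[test]
    fn escapes_regex_metacharacters() {
        assert_eq!(escape_literal(b"a.b"), b"a\\.b".to_vec());
        assert_eq!(escape_literal(b"x*y+z"), b"x\\*y\\+z".to_vec());
    }
}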
35 /// These are matched in order
35 /// These are matched in order
36 const GLOB_REPLACEMENTS: &[(&[u8], &[u8])] =
36 const GLOB_REPLACEMENTS: &[(&[u8], &[u8])] =
37 &[(b"*/", b"(?:.*/)?"), (b"*", b".*"), (b"", b"[^/]*")];
37 &[(b"*/", b"(?:.*/)?"), (b"*", b".*"), (b"", b"[^/]*")];
38
38
39 /// Appended to the regexp of globs
39 /// Appended to the regexp of globs
40 const GLOB_SUFFIX: &[u8; 7] = b"(?:/|$)";
40 const GLOB_SUFFIX: &[u8; 7] = b"(?:/|$)";
41
41
42 #[derive(Debug, Clone, PartialEq, Eq)]
42 #[derive(Debug, Clone, PartialEq, Eq)]
43 pub enum PatternSyntax {
43 pub enum PatternSyntax {
44 /// A regular expression
44 /// A regular expression
45 Regexp,
45 Regexp,
46 /// Glob that matches at the front of the path
46 /// Glob that matches at the front of the path
47 RootGlob,
47 RootGlob,
48 /// Glob that matches at any suffix of the path (still anchored at
48 /// Glob that matches at any suffix of the path (still anchored at
49 /// slashes)
49 /// slashes)
50 Glob,
50 Glob,
51 /// A path relative to repository root, which is matched recursively
51 /// A path relative to repository root, which is matched recursively
52 Path,
52 Path,
53 /// A path relative to cwd
53 /// A path relative to cwd
54 RelPath,
54 RelPath,
55 /// An unrooted glob (*.rs matches Rust files in all dirs)
55 /// An unrooted glob (*.rs matches Rust files in all dirs)
56 RelGlob,
56 RelGlob,
57 /// A regexp that needn't match the start of a name
57 /// A regexp that needn't match the start of a name
58 RelRegexp,
58 RelRegexp,
59 /// A path relative to repository root, which is matched non-recursively
59 /// A path relative to repository root, which is matched non-recursively
60 /// (will not match subdirectories)
60 /// (will not match subdirectories)
61 RootFiles,
61 RootFiles,
62 /// A file of patterns to read and include
62 /// A file of patterns to read and include
63 Include,
63 Include,
64 /// A file of patterns to match against files under the same directory
64 /// A file of patterns to match against files under the same directory
65 SubInclude,
65 SubInclude,
66 /// SubInclude with the result of parsing the included file
66 /// SubInclude with the result of parsing the included file
67 ///
67 ///
68 /// Note: there is no ExpandedInclude because that expansion can be done
68 /// Note: there is no ExpandedInclude because that expansion can be done
69 /// in place by replacing the Include pattern by the included patterns.
69 /// in place by replacing the Include pattern by the included patterns.
70 /// SubInclude requires more handling.
70 /// SubInclude requires more handling.
71 ///
71 ///
72 /// Note: `Box` is used to minimize size impact on other enum variants
72 /// Note: `Box` is used to minimize size impact on other enum variants
73 ExpandedSubInclude(Box<SubInclude>),
73 ExpandedSubInclude(Box<SubInclude>),
74 }
74 }
75
75
76 /// Transforms a glob pattern into a regex
76 /// Transforms a glob pattern into a regex
77 fn glob_to_re(pat: &[u8]) -> Vec<u8> {
77 fn glob_to_re(pat: &[u8]) -> Vec<u8> {
78 let mut input = pat;
78 let mut input = pat;
79 let mut res: Vec<u8> = vec![];
79 let mut res: Vec<u8> = vec![];
80 let mut group_depth = 0;
80 let mut group_depth = 0;
81
81
82 while let Some((c, rest)) = input.split_first() {
82 while let Some((c, rest)) = input.split_first() {
83 input = rest;
83 input = rest;
84
84
85 match c {
85 match c {
86 b'*' => {
86 b'*' => {
87 for (source, repl) in GLOB_REPLACEMENTS {
87 for (source, repl) in GLOB_REPLACEMENTS {
88 if let Some(rest) = input.drop_prefix(source) {
88 if let Some(rest) = input.drop_prefix(source) {
89 input = rest;
89 input = rest;
90 res.extend(*repl);
90 res.extend(*repl);
91 break;
91 break;
92 }
92 }
93 }
93 }
94 }
94 }
95 b'?' => res.extend(b"."),
95 b'?' => res.extend(b"."),
96 b'[' => {
96 b'[' => {
97 match input.iter().skip(1).position(|b| *b == b']') {
97 match input.iter().skip(1).position(|b| *b == b']') {
98 None => res.extend(b"\\["),
98 None => res.extend(b"\\["),
99 Some(end) => {
99 Some(end) => {
100 // Account for the one we skipped
100 // Account for the one we skipped
101 let end = end + 1;
101 let end = end + 1;
102
102
103 res.extend(b"[");
103 res.extend(b"[");
104
104
105 for (i, b) in input[..end].iter().enumerate() {
105 for (i, b) in input[..end].iter().enumerate() {
106 if *b == b'!' && i == 0 {
106 if *b == b'!' && i == 0 {
107 res.extend(b"^")
107 res.extend(b"^")
108 } else if *b == b'^' && i == 0 {
108 } else if *b == b'^' && i == 0 {
109 res.extend(b"\\^")
109 res.extend(b"\\^")
110 } else if *b == b'\\' {
110 } else if *b == b'\\' {
111 res.extend(b"\\\\")
111 res.extend(b"\\\\")
112 } else {
112 } else {
113 res.push(*b)
113 res.push(*b)
114 }
114 }
115 }
115 }
116 res.extend(b"]");
116 res.extend(b"]");
117 input = &input[end + 1..];
117 input = &input[end + 1..];
118 }
118 }
119 }
119 }
120 }
120 }
121 b'{' => {
121 b'{' => {
122 group_depth += 1;
122 group_depth += 1;
123 res.extend(b"(?:")
123 res.extend(b"(?:")
124 }
124 }
125 b'}' if group_depth > 0 => {
125 b'}' if group_depth > 0 => {
126 group_depth -= 1;
126 group_depth -= 1;
127 res.extend(b")");
127 res.extend(b")");
128 }
128 }
129 b',' if group_depth > 0 => res.extend(b"|"),
129 b',' if group_depth > 0 => res.extend(b"|"),
130 b'\\' => {
130 b'\\' => {
131 let c = {
131 let c = {
132 if let Some((c, rest)) = input.split_first() {
132 if let Some((c, rest)) = input.split_first() {
133 input = rest;
133 input = rest;
134 c
134 c
135 } else {
135 } else {
136 c
136 c
137 }
137 }
138 };
138 };
139 res.extend(&RE_ESCAPE[*c as usize])
139 res.extend(&RE_ESCAPE[*c as usize])
140 }
140 }
141 _ => res.extend(&RE_ESCAPE[*c as usize]),
141 _ => res.extend(&RE_ESCAPE[*c as usize]),
142 }
142 }
143 }
143 }
144 res
144 res
145 }
145 }
146
146
147 fn escape_pattern(pattern: &[u8]) -> Vec<u8> {
147 fn escape_pattern(pattern: &[u8]) -> Vec<u8> {
148 pattern
148 pattern
149 .iter()
149 .iter()
150 .flat_map(|c| RE_ESCAPE[*c as usize].clone())
150 .flat_map(|c| RE_ESCAPE[*c as usize].clone())
151 .collect()
151 .collect()
152 }
152 }
153
153
154 pub fn parse_pattern_syntax(
154 pub fn parse_pattern_syntax(
155 kind: &[u8],
155 kind: &[u8],
156 ) -> Result<PatternSyntax, PatternError> {
156 ) -> Result<PatternSyntax, PatternError> {
157 match kind {
157 match kind {
158 b"re:" => Ok(PatternSyntax::Regexp),
158 b"re:" => Ok(PatternSyntax::Regexp),
159 b"path:" => Ok(PatternSyntax::Path),
159 b"path:" => Ok(PatternSyntax::Path),
160 b"relpath:" => Ok(PatternSyntax::RelPath),
160 b"relpath:" => Ok(PatternSyntax::RelPath),
161 b"rootfilesin:" => Ok(PatternSyntax::RootFiles),
161 b"rootfilesin:" => Ok(PatternSyntax::RootFiles),
162 b"relglob:" => Ok(PatternSyntax::RelGlob),
162 b"relglob:" => Ok(PatternSyntax::RelGlob),
163 b"relre:" => Ok(PatternSyntax::RelRegexp),
163 b"relre:" => Ok(PatternSyntax::RelRegexp),
164 b"glob:" => Ok(PatternSyntax::Glob),
164 b"glob:" => Ok(PatternSyntax::Glob),
165 b"rootglob:" => Ok(PatternSyntax::RootGlob),
165 b"rootglob:" => Ok(PatternSyntax::RootGlob),
166 b"include:" => Ok(PatternSyntax::Include),
166 b"include:" => Ok(PatternSyntax::Include),
167 b"subinclude:" => Ok(PatternSyntax::SubInclude),
167 b"subinclude:" => Ok(PatternSyntax::SubInclude),
168 _ => Err(PatternError::UnsupportedSyntax(
168 _ => Err(PatternError::UnsupportedSyntax(
169 String::from_utf8_lossy(kind).to_string(),
169 String::from_utf8_lossy(kind).to_string(),
170 )),
170 )),
171 }
171 }
172 }
172 }
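// Illustrative sketch, not part of the original file: the `kind` argument is
// expected to include the trailing colon, exactly as the prefixes appear in
// pattern files.
#[cfg(test)]
#[test]
fn parse_pattern_syntax_example() {
    assert_eq!(parse_pattern_syntax(b"glob:").unwrap(), PatternSyntax::Glob);
    assert_eq!(
        parse_pattern_syntax(b"rootglob:").unwrap(),
        PatternSyntax::RootGlob
    );
    assert!(parse_pattern_syntax(b"bogus:").is_err());
}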
173
173
174 /// Builds the regex that corresponds to the given pattern.
174 /// Builds the regex that corresponds to the given pattern.
175 /// If within a `syntax: regexp` context, returns the pattern unchanged;
175 /// If within a `syntax: regexp` context, returns the pattern unchanged;
176 /// otherwise, returns the corresponding regex.
176 /// otherwise, returns the corresponding regex.
177 fn _build_single_regex(entry: &IgnorePattern) -> Vec<u8> {
177 fn _build_single_regex(entry: &IgnorePattern) -> Vec<u8> {
178 let IgnorePattern {
178 let IgnorePattern {
179 syntax, pattern, ..
179 syntax, pattern, ..
180 } = entry;
180 } = entry;
181 if pattern.is_empty() {
181 if pattern.is_empty() {
182 return vec![];
182 return vec![];
183 }
183 }
184 match syntax {
184 match syntax {
185 PatternSyntax::Regexp => pattern.to_owned(),
185 PatternSyntax::Regexp => pattern.to_owned(),
186 PatternSyntax::RelRegexp => {
186 PatternSyntax::RelRegexp => {
187 // The `regex` crate accepts `**` while `re2` and Python's `re`
187 // The `regex` crate accepts `**` while `re2` and Python's `re`
188 // do not. Checking for `*` correctly triggers the same error in all
188 // do not. Checking for `*` correctly triggers the same error in all
189 // engines.
189 // engines.
190 if pattern[0] == b'^'
190 if pattern[0] == b'^'
191 || pattern[0] == b'*'
191 || pattern[0] == b'*'
192 || pattern.starts_with(b".*")
192 || pattern.starts_with(b".*")
193 {
193 {
194 return pattern.to_owned();
194 return pattern.to_owned();
195 }
195 }
196 [&b".*"[..], pattern].concat()
196 [&b".*"[..], pattern].concat()
197 }
197 }
198 PatternSyntax::Path | PatternSyntax::RelPath => {
198 PatternSyntax::Path | PatternSyntax::RelPath => {
199 if pattern == b"." {
199 if pattern == b"." {
200 return vec![];
200 return vec![];
201 }
201 }
202 [escape_pattern(pattern).as_slice(), b"(?:/|$)"].concat()
202 [escape_pattern(pattern).as_slice(), b"(?:/|$)"].concat()
203 }
203 }
204 PatternSyntax::RootFiles => {
204 PatternSyntax::RootFiles => {
205 let mut res = if pattern == b"." {
205 let mut res = if pattern == b"." {
206 vec![]
206 vec![]
207 } else {
207 } else {
208 // Pattern is a directory name.
208 // Pattern is a directory name.
209 [escape_pattern(pattern).as_slice(), b"/"].concat()
209 [escape_pattern(pattern).as_slice(), b"/"].concat()
210 };
210 };
211
211
212 // Anything after the pattern must be a non-directory.
212 // Anything after the pattern must be a non-directory.
213 res.extend(b"[^/]+$");
213 res.extend(b"[^/]+$");
214 res
214 res
215 }
215 }
216 PatternSyntax::RelGlob => {
216 PatternSyntax::RelGlob => {
217 let glob_re = glob_to_re(pattern);
217 let glob_re = glob_to_re(pattern);
218 if let Some(rest) = glob_re.drop_prefix(b"[^/]*") {
218 if let Some(rest) = glob_re.drop_prefix(b"[^/]*") {
219 [b".*", rest, GLOB_SUFFIX].concat()
219 [b".*", rest, GLOB_SUFFIX].concat()
220 } else {
220 } else {
221 [b"(?:.*/)?", glob_re.as_slice(), GLOB_SUFFIX].concat()
221 [b"(?:.*/)?", glob_re.as_slice(), GLOB_SUFFIX].concat()
222 }
222 }
223 }
223 }
224 PatternSyntax::Glob | PatternSyntax::RootGlob => {
224 PatternSyntax::Glob | PatternSyntax::RootGlob => {
225 [glob_to_re(pattern).as_slice(), GLOB_SUFFIX].concat()
225 [glob_to_re(pattern).as_slice(), GLOB_SUFFIX].concat()
226 }
226 }
227 PatternSyntax::Include
227 PatternSyntax::Include
228 | PatternSyntax::SubInclude
228 | PatternSyntax::SubInclude
229 | PatternSyntax::ExpandedSubInclude(_) => unreachable!(),
229 | PatternSyntax::ExpandedSubInclude(_) => unreachable!(),
230 }
230 }
231 }
231 }
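// Illustrative sketch, not part of the original file: a `path:` pattern is
// escaped literally and given the `(?:/|$)` suffix, so it matches the path
// itself and anything below it.
#[cfg(test)]
#[test]
fn build_path_syntax_regex_example() {
    let regex = _build_single_regex(&IgnorePattern::new(
        PatternSyntax::Path,
        b"foo/bar",
        Path::new(""),
    ));
    assert_eq!(regex, b"foo/bar(?:/|$)".to_vec());
}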
232
232
233 const GLOB_SPECIAL_CHARACTERS: [u8; 7] =
233 const GLOB_SPECIAL_CHARACTERS: [u8; 7] =
234 [b'*', b'?', b'[', b']', b'{', b'}', b'\\'];
234 [b'*', b'?', b'[', b']', b'{', b'}', b'\\'];
235
235
236 /// TODO support other platforms
236 /// TODO support other platforms
237 #[cfg(unix)]
237 #[cfg(unix)]
238 pub fn normalize_path_bytes(bytes: &[u8]) -> Vec<u8> {
238 pub fn normalize_path_bytes(bytes: &[u8]) -> Vec<u8> {
239 if bytes.is_empty() {
239 if bytes.is_empty() {
240 return b".".to_vec();
240 return b".".to_vec();
241 }
241 }
242 let sep = b'/';
242 let sep = b'/';
243
243
244 let mut initial_slashes = bytes.iter().take_while(|b| **b == sep).count();
244 let mut initial_slashes = bytes.iter().take_while(|b| **b == sep).count();
245 if initial_slashes > 2 {
245 if initial_slashes > 2 {
246 // POSIX allows one or two initial slashes, but treats three or more
246 // POSIX allows one or two initial slashes, but treats three or more
247 // as a single slash.
247 // as a single slash.
248 initial_slashes = 1;
248 initial_slashes = 1;
249 }
249 }
250 let components = bytes
250 let components = bytes
251 .split(|b| *b == sep)
251 .split(|b| *b == sep)
252 .filter(|c| !(c.is_empty() || c == b"."))
252 .filter(|c| !(c.is_empty() || c == b"."))
253 .fold(vec![], |mut acc, component| {
253 .fold(vec![], |mut acc, component| {
254 if component != b".."
254 if component != b".."
255 || (initial_slashes == 0 && acc.is_empty())
255 || (initial_slashes == 0 && acc.is_empty())
256 || (!acc.is_empty() && acc[acc.len() - 1] == b"..")
256 || (!acc.is_empty() && acc[acc.len() - 1] == b"..")
257 {
257 {
258 acc.push(component)
258 acc.push(component)
259 } else if !acc.is_empty() {
259 } else if !acc.is_empty() {
260 acc.pop();
260 acc.pop();
261 }
261 }
262 acc
262 acc
263 });
263 });
264 let mut new_bytes = components.join(&sep);
264 let mut new_bytes = components.join(&sep);
265
265
266 if initial_slashes > 0 {
266 if initial_slashes > 0 {
267 let mut buf: Vec<_> = (0..initial_slashes).map(|_| sep).collect();
267 let mut buf: Vec<_> = (0..initial_slashes).map(|_| sep).collect();
268 buf.extend(new_bytes);
268 buf.extend(new_bytes);
269 new_bytes = buf;
269 new_bytes = buf;
270 }
270 }
271 if new_bytes.is_empty() {
271 if new_bytes.is_empty() {
272 b".".to_vec()
272 b".".to_vec()
273 } else {
273 } else {
274 new_bytes
274 new_bytes
275 }
275 }
276 }
276 }
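// Illustrative examples, not part of the original file: the normalization
// mirrors Python's `os.path.normpath` on byte strings (duplicate slashes and
// `.` components are dropped, `..` is resolved lexically, and more than two
// leading slashes collapse to one).
#[cfg(all(test, unix))]
#[test]
fn normalize_path_bytes_example() {
    assert_eq!(
        normalize_path_bytes(b"foo//bar/./baz"),
        b"foo/bar/baz".to_vec()
    );
    assert_eq!(normalize_path_bytes(b"foo/../bar"), b"bar".to_vec());
    assert_eq!(normalize_path_bytes(b""), b".".to_vec());
    assert_eq!(normalize_path_bytes(b"///a//b/"), b"/a/b".to_vec());
}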
277
277
278 /// Wrapper function to `_build_single_regex` that short-circuits 'exact' globs
278 /// Wrapper function to `_build_single_regex` that short-circuits 'exact' globs
279 /// that don't need to be transformed into a regex.
279 /// that don't need to be transformed into a regex.
280 pub fn build_single_regex(
280 pub fn build_single_regex(
281 entry: &IgnorePattern,
281 entry: &IgnorePattern,
282 ) -> Result<Option<Vec<u8>>, PatternError> {
282 ) -> Result<Option<Vec<u8>>, PatternError> {
283 let IgnorePattern {
283 let IgnorePattern {
284 pattern, syntax, ..
284 pattern, syntax, ..
285 } = entry;
285 } = entry;
286 let pattern = match syntax {
286 let pattern = match syntax {
287 PatternSyntax::RootGlob
287 PatternSyntax::RootGlob
288 | PatternSyntax::Path
288 | PatternSyntax::Path
289 | PatternSyntax::RelGlob
289 | PatternSyntax::RelGlob
290 | PatternSyntax::RootFiles => normalize_path_bytes(&pattern),
290 | PatternSyntax::RootFiles => normalize_path_bytes(&pattern),
291 PatternSyntax::Include | PatternSyntax::SubInclude => {
291 PatternSyntax::Include | PatternSyntax::SubInclude => {
292 return Err(PatternError::NonRegexPattern(entry.clone()))
292 return Err(PatternError::NonRegexPattern(entry.clone()))
293 }
293 }
294 _ => pattern.to_owned(),
294 _ => pattern.to_owned(),
295 };
295 };
296 if *syntax == PatternSyntax::RootGlob
296 if *syntax == PatternSyntax::RootGlob
297 && !pattern.iter().any(|b| GLOB_SPECIAL_CHARACTERS.contains(b))
297 && !pattern.iter().any(|b| GLOB_SPECIAL_CHARACTERS.contains(b))
298 {
298 {
299 Ok(None)
299 Ok(None)
300 } else {
300 } else {
301 let mut entry = entry.clone();
301 let mut entry = entry.clone();
302 entry.pattern = pattern;
302 entry.pattern = pattern;
303 Ok(Some(_build_single_regex(&entry)))
303 Ok(Some(_build_single_regex(&entry)))
304 }
304 }
305 }
305 }
306
306
307 lazy_static! {
307 lazy_static! {
308 static ref SYNTAXES: FastHashMap<&'static [u8], &'static [u8]> = {
308 static ref SYNTAXES: FastHashMap<&'static [u8], &'static [u8]> = {
309 let mut m = FastHashMap::default();
309 let mut m = FastHashMap::default();
310
310
311 m.insert(b"re".as_ref(), b"relre:".as_ref());
311 m.insert(b"re".as_ref(), b"relre:".as_ref());
312 m.insert(b"regexp".as_ref(), b"relre:".as_ref());
312 m.insert(b"regexp".as_ref(), b"relre:".as_ref());
313 m.insert(b"glob".as_ref(), b"relglob:".as_ref());
313 m.insert(b"glob".as_ref(), b"relglob:".as_ref());
314 m.insert(b"rootglob".as_ref(), b"rootglob:".as_ref());
314 m.insert(b"rootglob".as_ref(), b"rootglob:".as_ref());
315 m.insert(b"include".as_ref(), b"include:".as_ref());
315 m.insert(b"include".as_ref(), b"include:".as_ref());
316 m.insert(b"subinclude".as_ref(), b"subinclude:".as_ref());
316 m.insert(b"subinclude".as_ref(), b"subinclude:".as_ref());
317 m.insert(b"path".as_ref(), b"path:".as_ref());
317 m.insert(b"path".as_ref(), b"path:".as_ref());
318 m.insert(b"rootfilesin".as_ref(), b"rootfilesin:".as_ref());
318 m.insert(b"rootfilesin".as_ref(), b"rootfilesin:".as_ref());
319 m
319 m
320 };
320 };
321 }
321 }
322
322
323 #[derive(Debug)]
323 #[derive(Debug)]
324 pub enum PatternFileWarning {
324 pub enum PatternFileWarning {
325 /// (file path, syntax bytes)
325 /// (file path, syntax bytes)
326 InvalidSyntax(PathBuf, Vec<u8>),
326 InvalidSyntax(PathBuf, Vec<u8>),
327 /// File path
327 /// File path
328 NoSuchFile(PathBuf),
328 NoSuchFile(PathBuf),
329 }
329 }
330
330
331 pub fn parse_pattern_file_contents(
331 pub fn parse_pattern_file_contents(
332 lines: &[u8],
332 lines: &[u8],
333 file_path: &Path,
333 file_path: &Path,
334 default_syntax_override: Option<&[u8]>,
334 default_syntax_override: Option<&[u8]>,
335 warn: bool,
335 warn: bool,
336 ) -> Result<(Vec<IgnorePattern>, Vec<PatternFileWarning>), PatternError> {
336 ) -> Result<(Vec<IgnorePattern>, Vec<PatternFileWarning>), PatternError> {
337 let comment_regex = Regex::new(r"((?:^|[^\\])(?:\\\\)*)#.*").unwrap();
337 let comment_regex = Regex::new(r"((?:^|[^\\])(?:\\\\)*)#.*").unwrap();
338
338
339 #[allow(clippy::trivial_regex)]
339 #[allow(clippy::trivial_regex)]
340 let comment_escape_regex = Regex::new(r"\\#").unwrap();
340 let comment_escape_regex = Regex::new(r"\\#").unwrap();
341 let mut inputs: Vec<IgnorePattern> = vec![];
341 let mut inputs: Vec<IgnorePattern> = vec![];
342 let mut warnings: Vec<PatternFileWarning> = vec![];
342 let mut warnings: Vec<PatternFileWarning> = vec![];
343
343
344 let mut current_syntax =
344 let mut current_syntax =
345 default_syntax_override.unwrap_or(b"relre:".as_ref());
345 default_syntax_override.unwrap_or(b"relre:".as_ref());
346
346
347 for (line_number, mut line) in lines.split(|c| *c == b'\n').enumerate() {
347 for (line_number, mut line) in lines.split(|c| *c == b'\n').enumerate() {
348 let line_number = line_number + 1;
348 let line_number = line_number + 1;
349
349
350 let line_buf;
350 let line_buf;
351 if line.contains(&b'#') {
351 if line.contains(&b'#') {
352 if let Some(cap) = comment_regex.captures(line) {
352 if let Some(cap) = comment_regex.captures(line) {
353 line = &line[..cap.get(1).unwrap().end()]
353 line = &line[..cap.get(1).unwrap().end()]
354 }
354 }
355 line_buf = comment_escape_regex.replace_all(line, NoExpand(b"#"));
355 line_buf = comment_escape_regex.replace_all(line, NoExpand(b"#"));
356 line = &line_buf;
356 line = &line_buf;
357 }
357 }
358
358
359 let mut line = line.trim_end();
359 let mut line = line.trim_end();
360
360
361 if line.is_empty() {
361 if line.is_empty() {
362 continue;
362 continue;
363 }
363 }
364
364
365 if let Some(syntax) = line.drop_prefix(b"syntax:") {
365 if let Some(syntax) = line.drop_prefix(b"syntax:") {
366 let syntax = syntax.trim();
366 let syntax = syntax.trim();
367
367
368 if let Some(rel_syntax) = SYNTAXES.get(syntax) {
368 if let Some(rel_syntax) = SYNTAXES.get(syntax) {
369 current_syntax = rel_syntax;
369 current_syntax = rel_syntax;
370 } else if warn {
370 } else if warn {
371 warnings.push(PatternFileWarning::InvalidSyntax(
371 warnings.push(PatternFileWarning::InvalidSyntax(
372 file_path.to_owned(),
372 file_path.to_owned(),
373 syntax.to_owned(),
373 syntax.to_owned(),
374 ));
374 ));
375 }
375 }
376 continue;
376 continue;
377 }
377 }
378
378
379 let mut line_syntax: &[u8] = &current_syntax;
379 let mut line_syntax: &[u8] = &current_syntax;
380
380
381 for (s, rels) in SYNTAXES.iter() {
381 for (s, rels) in SYNTAXES.iter() {
382 if let Some(rest) = line.drop_prefix(rels) {
382 if let Some(rest) = line.drop_prefix(rels) {
383 line_syntax = rels;
383 line_syntax = rels;
384 line = rest;
384 line = rest;
385 break;
385 break;
386 }
386 }
387 if let Some(rest) = line.drop_prefix(&[s, &b":"[..]].concat()) {
387 if let Some(rest) = line.drop_prefix(&[s, &b":"[..]].concat()) {
388 line_syntax = rels;
388 line_syntax = rels;
389 line = rest;
389 line = rest;
390 break;
390 break;
391 }
391 }
392 }
392 }
393
393
394 inputs.push(IgnorePattern::new(
394 inputs.push(IgnorePattern::new(
395 parse_pattern_syntax(&line_syntax).map_err(|e| match e {
395 parse_pattern_syntax(&line_syntax).map_err(|e| match e {
396 PatternError::UnsupportedSyntax(syntax) => {
396 PatternError::UnsupportedSyntax(syntax) => {
397 PatternError::UnsupportedSyntaxInFile(
397 PatternError::UnsupportedSyntaxInFile(
398 syntax,
398 syntax,
399 file_path.to_string_lossy().into(),
399 file_path.to_string_lossy().into(),
400 line_number,
400 line_number,
401 )
401 )
402 }
402 }
403 _ => e,
403 _ => e,
404 })?,
404 })?,
405 &line,
405 &line,
406 file_path,
406 file_path,
407 ));
407 ));
408 }
408 }
409 Ok((inputs, warnings))
409 Ok((inputs, warnings))
410 }
410 }
411
411
412 pub fn read_pattern_file(
412 pub fn read_pattern_file(
413 file_path: &Path,
413 file_path: &Path,
414 warn: bool,
414 warn: bool,
415 inspect_pattern_bytes: &mut impl FnMut(&[u8]),
415 inspect_pattern_bytes: &mut impl FnMut(&Path, &[u8]),
416 ) -> Result<(Vec<IgnorePattern>, Vec<PatternFileWarning>), PatternError> {
416 ) -> Result<(Vec<IgnorePattern>, Vec<PatternFileWarning>), PatternError> {
417 match std::fs::read(file_path) {
417 match std::fs::read(file_path) {
418 Ok(contents) => {
418 Ok(contents) => {
419 inspect_pattern_bytes(&contents);
419 inspect_pattern_bytes(file_path, &contents);
420 parse_pattern_file_contents(&contents, file_path, None, warn)
420 parse_pattern_file_contents(&contents, file_path, None, warn)
421 }
421 }
422 Err(e) if e.kind() == std::io::ErrorKind::NotFound => Ok((
422 Err(e) if e.kind() == std::io::ErrorKind::NotFound => Ok((
423 vec![],
423 vec![],
424 vec![PatternFileWarning::NoSuchFile(file_path.to_owned())],
424 vec![PatternFileWarning::NoSuchFile(file_path.to_owned())],
425 )),
425 )),
426 Err(e) => Err(e.into()),
426 Err(e) => Err(e.into()),
427 }
427 }
428 }
428 }
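// Context for the signature change above: `inspect_pattern_bytes` now also
// receives the pattern file's path, so callers can hash the *source* of the
// ignore patterns in addition to their contents. A minimal standalone sketch
// of such a digest; std's `DefaultHasher` is only an illustrative stand-in,
// not the hash Mercurial itself uses.
#[cfg(test)]
#[test]
fn hash_pattern_source_example() {
    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};

    // Same shape as the `FnMut(&Path, &[u8])` callback expected above.
    fn digest(path: &Path, bytes: &[u8]) -> u64 {
        let mut hasher = DefaultHasher::new();
        path.hash(&mut hasher);
        bytes.hash(&mut hasher);
        hasher.finish()
    }

    let contents: &[u8] = b"syntax: glob\n*.o\n";
    // Moving identical patterns to a different file changes the digest.
    assert_ne!(
        digest(Path::new(".hgignore"), contents),
        digest(Path::new("other/.hgignore"), contents)
    );
}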
429
429
430 /// Represents an entry in an "ignore" file.
430 /// Represents an entry in an "ignore" file.
431 #[derive(Debug, Eq, PartialEq, Clone)]
431 #[derive(Debug, Eq, PartialEq, Clone)]
432 pub struct IgnorePattern {
432 pub struct IgnorePattern {
433 pub syntax: PatternSyntax,
433 pub syntax: PatternSyntax,
434 pub pattern: Vec<u8>,
434 pub pattern: Vec<u8>,
435 pub source: PathBuf,
435 pub source: PathBuf,
436 }
436 }
437
437
438 impl IgnorePattern {
438 impl IgnorePattern {
439 pub fn new(syntax: PatternSyntax, pattern: &[u8], source: &Path) -> Self {
439 pub fn new(syntax: PatternSyntax, pattern: &[u8], source: &Path) -> Self {
440 Self {
440 Self {
441 syntax,
441 syntax,
442 pattern: pattern.to_owned(),
442 pattern: pattern.to_owned(),
443 source: source.to_owned(),
443 source: source.to_owned(),
444 }
444 }
445 }
445 }
446 }
446 }
447
447
448 pub type PatternResult<T> = Result<T, PatternError>;
448 pub type PatternResult<T> = Result<T, PatternError>;
449
449
450 /// Wrapper for `read_pattern_file` that also recursively expands `include:`
450 /// Wrapper for `read_pattern_file` that also recursively expands `include:`
451 /// and `subinclude:` patterns.
451 /// and `subinclude:` patterns.
452 ///
452 ///
453 /// The former are expanded in place, while `PatternSyntax::ExpandedSubInclude`
453 /// The former are expanded in place, while `PatternSyntax::ExpandedSubInclude`
454 /// is used for the latter to form a tree of patterns.
454 /// is used for the latter to form a tree of patterns.
455 pub fn get_patterns_from_file(
455 pub fn get_patterns_from_file(
456 pattern_file: &Path,
456 pattern_file: &Path,
457 root_dir: &Path,
457 root_dir: &Path,
458 inspect_pattern_bytes: &mut impl FnMut(&[u8]),
458 inspect_pattern_bytes: &mut impl FnMut(&Path, &[u8]),
459 ) -> PatternResult<(Vec<IgnorePattern>, Vec<PatternFileWarning>)> {
459 ) -> PatternResult<(Vec<IgnorePattern>, Vec<PatternFileWarning>)> {
460 let (patterns, mut warnings) =
460 let (patterns, mut warnings) =
461 read_pattern_file(pattern_file, true, inspect_pattern_bytes)?;
461 read_pattern_file(pattern_file, true, inspect_pattern_bytes)?;
462 let patterns = patterns
462 let patterns = patterns
463 .into_iter()
463 .into_iter()
464 .flat_map(|entry| -> PatternResult<_> {
464 .flat_map(|entry| -> PatternResult<_> {
465 Ok(match &entry.syntax {
465 Ok(match &entry.syntax {
466 PatternSyntax::Include => {
466 PatternSyntax::Include => {
467 let inner_include =
467 let inner_include =
468 root_dir.join(get_path_from_bytes(&entry.pattern));
468 root_dir.join(get_path_from_bytes(&entry.pattern));
469 let (inner_pats, inner_warnings) = get_patterns_from_file(
469 let (inner_pats, inner_warnings) = get_patterns_from_file(
470 &inner_include,
470 &inner_include,
471 root_dir,
471 root_dir,
472 inspect_pattern_bytes,
472 inspect_pattern_bytes,
473 )?;
473 )?;
474 warnings.extend(inner_warnings);
474 warnings.extend(inner_warnings);
475 inner_pats
475 inner_pats
476 }
476 }
477 PatternSyntax::SubInclude => {
477 PatternSyntax::SubInclude => {
478 let mut sub_include = SubInclude::new(
478 let mut sub_include = SubInclude::new(
479 &root_dir,
479 &root_dir,
480 &entry.pattern,
480 &entry.pattern,
481 &entry.source,
481 &entry.source,
482 )?;
482 )?;
483 let (inner_patterns, inner_warnings) =
483 let (inner_patterns, inner_warnings) =
484 get_patterns_from_file(
484 get_patterns_from_file(
485 &sub_include.path,
485 &sub_include.path,
486 &sub_include.root,
486 &sub_include.root,
487 inspect_pattern_bytes,
487 inspect_pattern_bytes,
488 )?;
488 )?;
489 sub_include.included_patterns = inner_patterns;
489 sub_include.included_patterns = inner_patterns;
490 warnings.extend(inner_warnings);
490 warnings.extend(inner_warnings);
491 vec![IgnorePattern {
491 vec![IgnorePattern {
492 syntax: PatternSyntax::ExpandedSubInclude(Box::new(
492 syntax: PatternSyntax::ExpandedSubInclude(Box::new(
493 sub_include,
493 sub_include,
494 )),
494 )),
495 ..entry
495 ..entry
496 }]
496 }]
497 }
497 }
498 _ => vec![entry],
498 _ => vec![entry],
499 })
499 })
500 })
500 })
501 .flatten()
501 .flatten()
502 .collect();
502 .collect();
503
503
504 Ok((patterns, warnings))
504 Ok((patterns, warnings))
505 }
505 }
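// Illustrative usage sketch, not part of the original file: expand a root
// ignore file plus everything it `include:`s or `subinclude:`s, recording
// which pattern files were actually read through the new callback.
#[allow(dead_code)]
fn example_expand_ignore(
    root: &Path,
) -> PatternResult<(Vec<IgnorePattern>, Vec<PathBuf>)> {
    let mut sources: Vec<PathBuf> = Vec::new();
    let (patterns, _warnings) = get_patterns_from_file(
        &root.join(".hgignore"),
        root,
        &mut |path: &Path, _bytes: &[u8]| sources.push(path.to_owned()),
    )?;
    // `sources` now lists `.hgignore` and every included pattern file, in the
    // order they were read.
    Ok((patterns, sources))
}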
506
506
507 /// Holds all the information needed to handle a `subinclude:` pattern.
507 /// Holds all the information needed to handle a `subinclude:` pattern.
508 #[derive(Debug, PartialEq, Eq, Clone)]
508 #[derive(Debug, PartialEq, Eq, Clone)]
509 pub struct SubInclude {
509 pub struct SubInclude {
510 /// Will be used for repository (hg) paths that start with this prefix.
510 /// Will be used for repository (hg) paths that start with this prefix.
511 /// It is relative to the current working directory, so comparing against
511 /// It is relative to the current working directory, so comparing against
512 /// repository paths is painless.
512 /// repository paths is painless.
513 pub prefix: HgPathBuf,
513 pub prefix: HgPathBuf,
514 /// The file itself, containing the patterns
514 /// The file itself, containing the patterns
515 pub path: PathBuf,
515 pub path: PathBuf,
516 /// Folder in the filesystem where this applies
516 /// Folder in the filesystem where this applies
517 pub root: PathBuf,
517 pub root: PathBuf,
518
518
519 pub included_patterns: Vec<IgnorePattern>,
519 pub included_patterns: Vec<IgnorePattern>,
520 }
520 }
521
521
522 impl SubInclude {
522 impl SubInclude {
523 pub fn new(
523 pub fn new(
524 root_dir: &Path,
524 root_dir: &Path,
525 pattern: &[u8],
525 pattern: &[u8],
526 source: &Path,
526 source: &Path,
527 ) -> Result<SubInclude, HgPathError> {
527 ) -> Result<SubInclude, HgPathError> {
528 let normalized_source =
528 let normalized_source =
529 normalize_path_bytes(&get_bytes_from_path(source));
529 normalize_path_bytes(&get_bytes_from_path(source));
530
530
531 let source_root = get_path_from_bytes(&normalized_source);
531 let source_root = get_path_from_bytes(&normalized_source);
532 let source_root =
532 let source_root =
533 source_root.parent().unwrap_or_else(|| source_root.deref());
533 source_root.parent().unwrap_or_else(|| source_root.deref());
534
534
535 let path = source_root.join(get_path_from_bytes(pattern));
535 let path = source_root.join(get_path_from_bytes(pattern));
536 let new_root = path.parent().unwrap_or_else(|| path.deref());
536 let new_root = path.parent().unwrap_or_else(|| path.deref());
537
537
538 let prefix = canonical_path(root_dir, root_dir, new_root)?;
538 let prefix = canonical_path(root_dir, root_dir, new_root)?;
539
539
540 Ok(Self {
540 Ok(Self {
541 prefix: path_to_hg_path_buf(prefix).and_then(|mut p| {
541 prefix: path_to_hg_path_buf(prefix).and_then(|mut p| {
542 if !p.is_empty() {
542 if !p.is_empty() {
543 p.push_byte(b'/');
543 p.push_byte(b'/');
544 }
544 }
545 Ok(p)
545 Ok(p)
546 })?,
546 })?,
547 path: path.to_owned(),
547 path: path.to_owned(),
548 root: new_root.to_owned(),
548 root: new_root.to_owned(),
549 included_patterns: Vec::new(),
549 included_patterns: Vec::new(),
550 })
550 })
551 }
551 }
552 }
552 }
553
553
554 /// Separate and pre-process subincludes from other patterns for the "ignore"
554 /// Separate and pre-process subincludes from other patterns for the "ignore"
555 /// phase.
555 /// phase.
556 pub fn filter_subincludes(
556 pub fn filter_subincludes(
557 ignore_patterns: Vec<IgnorePattern>,
557 ignore_patterns: Vec<IgnorePattern>,
558 ) -> Result<(Vec<Box<SubInclude>>, Vec<IgnorePattern>), HgPathError> {
558 ) -> Result<(Vec<Box<SubInclude>>, Vec<IgnorePattern>), HgPathError> {
559 let mut subincludes = vec![];
559 let mut subincludes = vec![];
560 let mut others = vec![];
560 let mut others = vec![];
561
561
562 for pattern in ignore_patterns {
562 for pattern in ignore_patterns {
563 if let PatternSyntax::ExpandedSubInclude(sub_include) = pattern.syntax
563 if let PatternSyntax::ExpandedSubInclude(sub_include) = pattern.syntax
564 {
564 {
565 subincludes.push(sub_include);
565 subincludes.push(sub_include);
566 } else {
566 } else {
567 others.push(pattern)
567 others.push(pattern)
568 }
568 }
569 }
569 }
570 Ok((subincludes, others))
570 Ok((subincludes, others))
571 }
571 }
572
572
573 #[cfg(test)]
573 #[cfg(test)]
574 mod tests {
574 mod tests {
575 use super::*;
575 use super::*;
576 use pretty_assertions::assert_eq;
576 use pretty_assertions::assert_eq;
577
577
578 #[test]
578 #[test]
579 fn escape_pattern_test() {
579 fn escape_pattern_test() {
580 let untouched =
580 let untouched =
581 br#"!"%',/0123456789:;<=>@ABCDEFGHIJKLMNOPQRSTUVWXYZ_`abcdefghijklmnopqrstuvwxyz"#;
581 br#"!"%',/0123456789:;<=>@ABCDEFGHIJKLMNOPQRSTUVWXYZ_`abcdefghijklmnopqrstuvwxyz"#;
582 assert_eq!(escape_pattern(untouched), untouched.to_vec());
582 assert_eq!(escape_pattern(untouched), untouched.to_vec());
583 // All escape codes
583 // All escape codes
584 assert_eq!(
584 assert_eq!(
585 escape_pattern(br#"()[]{}?*+-|^$\\.&~# \t\n\r\v\f"#),
585 escape_pattern(br#"()[]{}?*+-|^$\\.&~# \t\n\r\v\f"#),
586 br#"\(\)\[\]\{\}\?\*\+\-\|\^\$\\\\\.\&\~\#\ \\t\\n\\r\\v\\f"#
586 br#"\(\)\[\]\{\}\?\*\+\-\|\^\$\\\\\.\&\~\#\ \\t\\n\\r\\v\\f"#
587 .to_vec()
587 .to_vec()
588 );
588 );
589 }
589 }
590
590
591 #[test]
591 #[test]
592 fn glob_test() {
592 fn glob_test() {
593 assert_eq!(glob_to_re(br#"?"#), br#"."#);
593 assert_eq!(glob_to_re(br#"?"#), br#"."#);
594 assert_eq!(glob_to_re(br#"*"#), br#"[^/]*"#);
594 assert_eq!(glob_to_re(br#"*"#), br#"[^/]*"#);
595 assert_eq!(glob_to_re(br#"**"#), br#".*"#);
595 assert_eq!(glob_to_re(br#"**"#), br#".*"#);
596 assert_eq!(glob_to_re(br#"**/a"#), br#"(?:.*/)?a"#);
596 assert_eq!(glob_to_re(br#"**/a"#), br#"(?:.*/)?a"#);
597 assert_eq!(glob_to_re(br#"a/**/b"#), br#"a/(?:.*/)?b"#);
597 assert_eq!(glob_to_re(br#"a/**/b"#), br#"a/(?:.*/)?b"#);
598 assert_eq!(glob_to_re(br#"[a*?!^][^b][!c]"#), br#"[a*?!^][\^b][^c]"#);
598 assert_eq!(glob_to_re(br#"[a*?!^][^b][!c]"#), br#"[a*?!^][\^b][^c]"#);
599 assert_eq!(glob_to_re(br#"{a,b}"#), br#"(?:a|b)"#);
599 assert_eq!(glob_to_re(br#"{a,b}"#), br#"(?:a|b)"#);
600 assert_eq!(glob_to_re(br#".\*\?"#), br#"\.\*\?"#);
600 assert_eq!(glob_to_re(br#".\*\?"#), br#"\.\*\?"#);
601 }
601 }
602
602
603 #[test]
603 #[test]
604 fn test_parse_pattern_file_contents() {
604 fn test_parse_pattern_file_contents() {
605 let lines = b"syntax: glob\n*.elc";
605 let lines = b"syntax: glob\n*.elc";
606
606
607 assert_eq!(
607 assert_eq!(
608 parse_pattern_file_contents(
608 parse_pattern_file_contents(
609 lines,
609 lines,
610 Path::new("file_path"),
610 Path::new("file_path"),
611 None,
611 None,
612 false
612 false
613 )
613 )
614 .unwrap()
614 .unwrap()
615 .0,
615 .0,
616 vec![IgnorePattern::new(
616 vec![IgnorePattern::new(
617 PatternSyntax::RelGlob,
617 PatternSyntax::RelGlob,
618 b"*.elc",
618 b"*.elc",
619 Path::new("file_path")
619 Path::new("file_path")
620 )],
620 )],
621 );
621 );
622
622
623 let lines = b"syntax: include\nsyntax: glob";
623 let lines = b"syntax: include\nsyntax: glob";
624
624
625 assert_eq!(
625 assert_eq!(
626 parse_pattern_file_contents(
626 parse_pattern_file_contents(
627 lines,
627 lines,
628 Path::new("file_path"),
628 Path::new("file_path"),
629 None,
629 None,
630 false
630 false
631 )
631 )
632 .unwrap()
632 .unwrap()
633 .0,
633 .0,
634 vec![]
634 vec![]
635 );
635 );
636 let lines = b"glob:**.o";
636 let lines = b"glob:**.o";
637 assert_eq!(
637 assert_eq!(
638 parse_pattern_file_contents(
638 parse_pattern_file_contents(
639 lines,
639 lines,
640 Path::new("file_path"),
640 Path::new("file_path"),
641 None,
641 None,
642 false
642 false
643 )
643 )
644 .unwrap()
644 .unwrap()
645 .0,
645 .0,
646 vec![IgnorePattern::new(
646 vec![IgnorePattern::new(
647 PatternSyntax::RelGlob,
647 PatternSyntax::RelGlob,
648 b"**.o",
648 b"**.o",
649 Path::new("file_path")
649 Path::new("file_path")
650 )]
650 )]
651 );
651 );
652 }
652 }
653
653
654 #[test]
654 #[test]
655 fn test_build_single_regex() {
655 fn test_build_single_regex() {
656 assert_eq!(
656 assert_eq!(
657 build_single_regex(&IgnorePattern::new(
657 build_single_regex(&IgnorePattern::new(
658 PatternSyntax::RelGlob,
658 PatternSyntax::RelGlob,
659 b"rust/target/",
659 b"rust/target/",
660 Path::new("")
660 Path::new("")
661 ))
661 ))
662 .unwrap(),
662 .unwrap(),
663 Some(br"(?:.*/)?rust/target(?:/|$)".to_vec()),
663 Some(br"(?:.*/)?rust/target(?:/|$)".to_vec()),
664 );
664 );
665 assert_eq!(
665 assert_eq!(
666 build_single_regex(&IgnorePattern::new(
666 build_single_regex(&IgnorePattern::new(
667 PatternSyntax::Regexp,
667 PatternSyntax::Regexp,
668 br"rust/target/\d+",
668 br"rust/target/\d+",
669 Path::new("")
669 Path::new("")
670 ))
670 ))
671 .unwrap(),
671 .unwrap(),
672 Some(br"rust/target/\d+".to_vec()),
672 Some(br"rust/target/\d+".to_vec()),
673 );
673 );
674 }
674 }
675
675
676 #[test]
676 #[test]
677 fn test_build_single_regex_shortcut() {
677 fn test_build_single_regex_shortcut() {
678 assert_eq!(
678 assert_eq!(
679 build_single_regex(&IgnorePattern::new(
679 build_single_regex(&IgnorePattern::new(
680 PatternSyntax::RootGlob,
680 PatternSyntax::RootGlob,
681 b"",
681 b"",
682 Path::new("")
682 Path::new("")
683 ))
683 ))
684 .unwrap(),
684 .unwrap(),
685 None,
685 None,
686 );
686 );
687 assert_eq!(
687 assert_eq!(
688 build_single_regex(&IgnorePattern::new(
688 build_single_regex(&IgnorePattern::new(
689 PatternSyntax::RootGlob,
689 PatternSyntax::RootGlob,
690 b"whatever",
690 b"whatever",
691 Path::new("")
691 Path::new("")
692 ))
692 ))
693 .unwrap(),
693 .unwrap(),
694 None,
694 None,
695 );
695 );
696 assert_eq!(
696 assert_eq!(
697 build_single_regex(&IgnorePattern::new(
697 build_single_regex(&IgnorePattern::new(
698 PatternSyntax::RootGlob,
698 PatternSyntax::RootGlob,
699 b"*.o",
699 b"*.o",
700 Path::new("")
700 Path::new("")
701 ))
701 ))
702 .unwrap(),
702 .unwrap(),
703 Some(br"[^/]*\.o(?:/|$)".to_vec()),
703 Some(br"[^/]*\.o(?:/|$)".to_vec()),
704 );
704 );
705 }
705 }
706 }
706 }
@@ -1,1688 +1,1688
1 // matchers.rs
1 // matchers.rs
2 //
2 //
3 // Copyright 2019 Raphaël Gomès <rgomes@octobus.net>
3 // Copyright 2019 Raphaël Gomès <rgomes@octobus.net>
4 //
4 //
5 // This software may be used and distributed according to the terms of the
5 // This software may be used and distributed according to the terms of the
6 // GNU General Public License version 2 or any later version.
6 // GNU General Public License version 2 or any later version.
7
7
8 //! Structs and types for matching files and directories.
8 //! Structs and types for matching files and directories.
9
9
10 use crate::{
10 use crate::{
11 dirstate::dirs_multiset::DirsChildrenMultiset,
11 dirstate::dirs_multiset::DirsChildrenMultiset,
12 filepatterns::{
12 filepatterns::{
13 build_single_regex, filter_subincludes, get_patterns_from_file,
13 build_single_regex, filter_subincludes, get_patterns_from_file,
14 PatternFileWarning, PatternResult,
14 PatternFileWarning, PatternResult,
15 },
15 },
16 utils::{
16 utils::{
17 files::find_dirs,
17 files::find_dirs,
18 hg_path::{HgPath, HgPathBuf},
18 hg_path::{HgPath, HgPathBuf},
19 Escaped,
19 Escaped,
20 },
20 },
21 DirsMultiset, DirstateMapError, FastHashMap, IgnorePattern, PatternError,
21 DirsMultiset, DirstateMapError, FastHashMap, IgnorePattern, PatternError,
22 PatternSyntax,
22 PatternSyntax,
23 };
23 };
24
24
25 use crate::dirstate::status::IgnoreFnType;
25 use crate::dirstate::status::IgnoreFnType;
26 use crate::filepatterns::normalize_path_bytes;
26 use crate::filepatterns::normalize_path_bytes;
27 use std::borrow::ToOwned;
27 use std::borrow::ToOwned;
28 use std::collections::HashSet;
28 use std::collections::HashSet;
29 use std::fmt::{Display, Error, Formatter};
29 use std::fmt::{Display, Error, Formatter};
30 use std::iter::FromIterator;
30 use std::iter::FromIterator;
31 use std::ops::Deref;
31 use std::ops::Deref;
32 use std::path::{Path, PathBuf};
32 use std::path::{Path, PathBuf};
33
33
34 use micro_timer::timed;
34 use micro_timer::timed;
35
35
36 #[derive(Debug, PartialEq)]
36 #[derive(Debug, PartialEq)]
37 pub enum VisitChildrenSet {
37 pub enum VisitChildrenSet {
38 /// Don't visit anything
38 /// Don't visit anything
39 Empty,
39 Empty,
40 /// Only visit this directory
40 /// Only visit this directory
41 This,
41 This,
42 /// Visit this directory and these subdirectories
42 /// Visit this directory and these subdirectories
43 /// TODO Should we implement a `NonEmptyHashSet`?
43 /// TODO Should we implement a `NonEmptyHashSet`?
44 Set(HashSet<HgPathBuf>),
44 Set(HashSet<HgPathBuf>),
45 /// Visit this directory and all subdirectories
45 /// Visit this directory and all subdirectories
46 Recursive,
46 Recursive,
47 }
47 }
48
48
49 pub trait Matcher: core::fmt::Debug {
49 pub trait Matcher: core::fmt::Debug {
50 /// Explicitly listed files
50 /// Explicitly listed files
51 fn file_set(&self) -> Option<&HashSet<HgPathBuf>>;
51 fn file_set(&self) -> Option<&HashSet<HgPathBuf>>;
52 /// Returns whether `filename` is in `file_set`
52 /// Returns whether `filename` is in `file_set`
53 fn exact_match(&self, filename: &HgPath) -> bool;
53 fn exact_match(&self, filename: &HgPath) -> bool;
54 /// Returns whether `filename` is matched by this matcher
54 /// Returns whether `filename` is matched by this matcher
55 fn matches(&self, filename: &HgPath) -> bool;
55 fn matches(&self, filename: &HgPath) -> bool;
56 /// Decides whether a directory should be visited based on whether it
56 /// Decides whether a directory should be visited based on whether it
57 /// has potential matches in it or one of its subdirectories, and
57 /// has potential matches in it or one of its subdirectories, and
58 /// potentially lists which subdirectories of that directory should be
58 /// potentially lists which subdirectories of that directory should be
59 /// visited. This is based on the match's primary, included, and excluded
59 /// visited. This is based on the match's primary, included, and excluded
60 /// patterns.
60 /// patterns.
61 ///
61 ///
62 /// # Example
62 /// # Example
63 ///
63 ///
64 /// Assuming matchers `['path:foo/bar', 'rootfilesin:qux']`, we would
64 /// Assuming matchers `['path:foo/bar', 'rootfilesin:qux']`, we would
65 /// return the following values (assuming the implementation of
65 /// return the following values (assuming the implementation of
66 /// visit_children_set is capable of recognizing this; some implementations
66 /// visit_children_set is capable of recognizing this; some implementations
67 /// are not).
67 /// are not).
68 ///
68 ///
69 /// ```text
69 /// ```text
71 /// '' -> {'foo', 'qux'}
71 /// '' -> {'foo', 'qux'}
72 /// 'baz' -> set()
72 /// 'baz' -> set()
73 /// 'foo' -> {'bar'}
73 /// 'foo' -> {'bar'}
74 /// // Ideally this would be `Recursive`, but since the prefix nature of
74 /// // Ideally this would be `Recursive`, but since the prefix nature of
75 /// // matchers is applied to the entire matcher, we have to downgrade this
75 /// // matchers is applied to the entire matcher, we have to downgrade this
76 /// // to `This` due to the (yet to be implemented in Rust) non-prefix
76 /// // to `This` due to the (yet to be implemented in Rust) non-prefix
77 /// // `RootFilesIn`-kind matcher being mixed in.
77 /// // `RootFilesIn`-kind matcher being mixed in.
78 /// 'foo/bar' -> 'this'
78 /// 'foo/bar' -> 'this'
79 /// 'qux' -> 'this'
79 /// 'qux' -> 'this'
80 /// ```
80 /// ```
81 /// # Important
81 /// # Important
82 ///
82 ///
83 /// Most matchers do not know if they're representing files or
83 /// Most matchers do not know if they're representing files or
84 /// directories. They see `['path:dir/f']` and don't know whether `f` is a
84 /// directories. They see `['path:dir/f']` and don't know whether `f` is a
85 /// file or a directory, so `visit_children_set('dir')` for most matchers
85 /// file or a directory, so `visit_children_set('dir')` for most matchers
86 /// will return `HashSet{ HgPath { "f" } }`, but if the matcher knows it's
86 /// will return `HashSet{ HgPath { "f" } }`, but if the matcher knows it's
87 /// a file (like the yet to be implemented in Rust `ExactMatcher` does),
87 /// a file (like the yet to be implemented in Rust `ExactMatcher` does),
88 /// it may return `VisitChildrenSet::This`.
88 /// it may return `VisitChildrenSet::This`.
89 /// Do not rely on the return being a `HashSet` indicating that there are
89 /// Do not rely on the return being a `HashSet` indicating that there are
90 /// no files in this dir to investigate (or equivalently that if there are
90 /// no files in this dir to investigate (or equivalently that if there are
91 /// files to investigate in 'dir' that it will always return
91 /// files to investigate in 'dir' that it will always return
92 /// `VisitChildrenSet::This`).
92 /// `VisitChildrenSet::This`).
93 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet;
93 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet;
94 /// Matcher will match everything and `file_set()` will be empty:
94 /// Matcher will match everything and `file_set()` will be empty:
95 /// optimization might be possible.
95 /// optimization might be possible.
96 fn matches_everything(&self) -> bool;
96 fn matches_everything(&self) -> bool;
97 /// Matcher will match exactly the files in `file_set()`: optimization
97 /// Matcher will match exactly the files in `file_set()`: optimization
98 /// might be possible.
98 /// might be possible.
99 fn is_exact(&self) -> bool;
99 fn is_exact(&self) -> bool;
100 }
100 }
101
101
102 /// Matches everything.
102 /// Matches everything.
103 ///```
103 ///```
104 /// use hg::{ matchers::{Matcher, AlwaysMatcher}, utils::hg_path::HgPath };
104 /// use hg::{ matchers::{Matcher, AlwaysMatcher}, utils::hg_path::HgPath };
105 ///
105 ///
106 /// let matcher = AlwaysMatcher;
106 /// let matcher = AlwaysMatcher;
107 ///
107 ///
108 /// assert_eq!(matcher.matches(HgPath::new(b"whatever")), true);
108 /// assert_eq!(matcher.matches(HgPath::new(b"whatever")), true);
109 /// assert_eq!(matcher.matches(HgPath::new(b"b.txt")), true);
109 /// assert_eq!(matcher.matches(HgPath::new(b"b.txt")), true);
110 /// assert_eq!(matcher.matches(HgPath::new(b"main.c")), true);
110 /// assert_eq!(matcher.matches(HgPath::new(b"main.c")), true);
111 /// assert_eq!(matcher.matches(HgPath::new(br"re:.*\.c$")), true);
111 /// assert_eq!(matcher.matches(HgPath::new(br"re:.*\.c$")), true);
112 /// ```
112 /// ```
113 #[derive(Debug)]
113 #[derive(Debug)]
114 pub struct AlwaysMatcher;
114 pub struct AlwaysMatcher;
115
115
116 impl Matcher for AlwaysMatcher {
116 impl Matcher for AlwaysMatcher {
117 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
117 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
118 None
118 None
119 }
119 }
120 fn exact_match(&self, _filename: &HgPath) -> bool {
120 fn exact_match(&self, _filename: &HgPath) -> bool {
121 false
121 false
122 }
122 }
123 fn matches(&self, _filename: &HgPath) -> bool {
123 fn matches(&self, _filename: &HgPath) -> bool {
124 true
124 true
125 }
125 }
126 fn visit_children_set(&self, _directory: &HgPath) -> VisitChildrenSet {
126 fn visit_children_set(&self, _directory: &HgPath) -> VisitChildrenSet {
127 VisitChildrenSet::Recursive
127 VisitChildrenSet::Recursive
128 }
128 }
129 fn matches_everything(&self) -> bool {
129 fn matches_everything(&self) -> bool {
130 true
130 true
131 }
131 }
132 fn is_exact(&self) -> bool {
132 fn is_exact(&self) -> bool {
133 false
133 false
134 }
134 }
135 }
135 }
136
136
137 /// Matches nothing.
137 /// Matches nothing.
138 #[derive(Debug)]
138 #[derive(Debug)]
139 pub struct NeverMatcher;
139 pub struct NeverMatcher;
140
140
141 impl Matcher for NeverMatcher {
141 impl Matcher for NeverMatcher {
142 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
142 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
143 None
143 None
144 }
144 }
145 fn exact_match(&self, _filename: &HgPath) -> bool {
145 fn exact_match(&self, _filename: &HgPath) -> bool {
146 false
146 false
147 }
147 }
148 fn matches(&self, _filename: &HgPath) -> bool {
148 fn matches(&self, _filename: &HgPath) -> bool {
149 false
149 false
150 }
150 }
151 fn visit_children_set(&self, _directory: &HgPath) -> VisitChildrenSet {
151 fn visit_children_set(&self, _directory: &HgPath) -> VisitChildrenSet {
152 VisitChildrenSet::Empty
152 VisitChildrenSet::Empty
153 }
153 }
154 fn matches_everything(&self) -> bool {
154 fn matches_everything(&self) -> bool {
155 false
155 false
156 }
156 }
157 fn is_exact(&self) -> bool {
157 fn is_exact(&self) -> bool {
158 true
158 true
159 }
159 }
160 }
160 }
161
161
162 /// Matches the input files exactly. They are interpreted as paths, not
162 /// Matches the input files exactly. They are interpreted as paths, not
163 /// patterns.
163 /// patterns.
164 ///
164 ///
165 ///```
165 ///```
166 /// use hg::{ matchers::{Matcher, FileMatcher}, utils::hg_path::{HgPath, HgPathBuf} };
166 /// use hg::{ matchers::{Matcher, FileMatcher}, utils::hg_path::{HgPath, HgPathBuf} };
167 ///
167 ///
168 /// let files = vec![HgPathBuf::from_bytes(b"a.txt"), HgPathBuf::from_bytes(br"re:.*\.c$")];
168 /// let files = vec![HgPathBuf::from_bytes(b"a.txt"), HgPathBuf::from_bytes(br"re:.*\.c$")];
169 /// let matcher = FileMatcher::new(files).unwrap();
169 /// let matcher = FileMatcher::new(files).unwrap();
170 ///
170 ///
171 /// assert_eq!(matcher.matches(HgPath::new(b"a.txt")), true);
171 /// assert_eq!(matcher.matches(HgPath::new(b"a.txt")), true);
172 /// assert_eq!(matcher.matches(HgPath::new(b"b.txt")), false);
172 /// assert_eq!(matcher.matches(HgPath::new(b"b.txt")), false);
173 /// assert_eq!(matcher.matches(HgPath::new(b"main.c")), false);
173 /// assert_eq!(matcher.matches(HgPath::new(b"main.c")), false);
174 /// assert_eq!(matcher.matches(HgPath::new(br"re:.*\.c$")), true);
174 /// assert_eq!(matcher.matches(HgPath::new(br"re:.*\.c$")), true);
175 /// ```
175 /// ```
176 #[derive(Debug)]
176 #[derive(Debug)]
177 pub struct FileMatcher {
177 pub struct FileMatcher {
178 files: HashSet<HgPathBuf>,
178 files: HashSet<HgPathBuf>,
179 dirs: DirsMultiset,
179 dirs: DirsMultiset,
180 }
180 }
181
181
182 impl FileMatcher {
182 impl FileMatcher {
183 pub fn new(files: Vec<HgPathBuf>) -> Result<Self, DirstateMapError> {
183 pub fn new(files: Vec<HgPathBuf>) -> Result<Self, DirstateMapError> {
184 let dirs = DirsMultiset::from_manifest(&files)?;
184 let dirs = DirsMultiset::from_manifest(&files)?;
185 Ok(Self {
185 Ok(Self {
186 files: HashSet::from_iter(files.into_iter()),
186 files: HashSet::from_iter(files.into_iter()),
187 dirs,
187 dirs,
188 })
188 })
189 }
189 }
190 fn inner_matches(&self, filename: &HgPath) -> bool {
190 fn inner_matches(&self, filename: &HgPath) -> bool {
191 self.files.contains(filename.as_ref())
191 self.files.contains(filename.as_ref())
192 }
192 }
193 }
193 }
194
194
195 impl Matcher for FileMatcher {
195 impl Matcher for FileMatcher {
196 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
196 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
197 Some(&self.files)
197 Some(&self.files)
198 }
198 }
199 fn exact_match(&self, filename: &HgPath) -> bool {
199 fn exact_match(&self, filename: &HgPath) -> bool {
200 self.inner_matches(filename)
200 self.inner_matches(filename)
201 }
201 }
202 fn matches(&self, filename: &HgPath) -> bool {
202 fn matches(&self, filename: &HgPath) -> bool {
203 self.inner_matches(filename)
203 self.inner_matches(filename)
204 }
204 }
205 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
205 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
206 if self.files.is_empty() || !self.dirs.contains(&directory) {
206 if self.files.is_empty() || !self.dirs.contains(&directory) {
207 return VisitChildrenSet::Empty;
207 return VisitChildrenSet::Empty;
208 }
208 }
209 let mut candidates: HashSet<HgPathBuf> =
209 let mut candidates: HashSet<HgPathBuf> =
210 self.dirs.iter().cloned().collect();
210 self.dirs.iter().cloned().collect();
211
211
212 candidates.extend(self.files.iter().cloned());
212 candidates.extend(self.files.iter().cloned());
213 candidates.remove(HgPath::new(b""));
213 candidates.remove(HgPath::new(b""));
214
214
215 if !directory.as_ref().is_empty() {
215 if !directory.as_ref().is_empty() {
216 let directory = [directory.as_ref().as_bytes(), b"/"].concat();
216 let directory = [directory.as_ref().as_bytes(), b"/"].concat();
217 candidates = candidates
217 candidates = candidates
218 .iter()
218 .iter()
219 .filter_map(|c| {
219 .filter_map(|c| {
220 if c.as_bytes().starts_with(&directory) {
220 if c.as_bytes().starts_with(&directory) {
221 Some(HgPathBuf::from_bytes(
221 Some(HgPathBuf::from_bytes(
222 &c.as_bytes()[directory.len()..],
222 &c.as_bytes()[directory.len()..],
223 ))
223 ))
224 } else {
224 } else {
225 None
225 None
226 }
226 }
227 })
227 })
228 .collect();
228 .collect();
229 }
229 }
230
230
231 // `self.dirs` includes all of the directories, recursively, so if
231 // `self.dirs` includes all of the directories, recursively, so if
232 // we're attempting to match 'foo/bar/baz.txt', it'll have '', 'foo',
232 // we're attempting to match 'foo/bar/baz.txt', it'll have '', 'foo',
233 // 'foo/bar' in it. Thus we can safely ignore a candidate that has a
233 // 'foo/bar' in it. Thus we can safely ignore a candidate that has a
234 // '/' in it, indicating it's for a subdir-of-a-subdir; the immediate
234 // '/' in it, indicating it's for a subdir-of-a-subdir; the immediate
235 // subdir will be in there without a slash.
235 // subdir will be in there without a slash.
236 VisitChildrenSet::Set(
236 VisitChildrenSet::Set(
237 candidates
237 candidates
238 .into_iter()
238 .into_iter()
239 .filter_map(|c| {
239 .filter_map(|c| {
240 if c.bytes().all(|b| *b != b'/') {
240 if c.bytes().all(|b| *b != b'/') {
241 Some(c)
241 Some(c)
242 } else {
242 } else {
243 None
243 None
244 }
244 }
245 })
245 })
246 .collect(),
246 .collect(),
247 )
247 )
248 }
248 }
249 fn matches_everything(&self) -> bool {
249 fn matches_everything(&self) -> bool {
250 false
250 false
251 }
251 }
252 fn is_exact(&self) -> bool {
252 fn is_exact(&self) -> bool {
253 true
253 true
254 }
254 }
255 }
255 }
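// Illustrative check, not part of the original file: `visit_children_set`
// narrows traversal to immediate children, so for the single file
// "dir/subdir/f.txt", visiting "dir" only suggests "subdir".
#[cfg(test)]
#[test]
fn file_matcher_visit_children_set_example() {
    let files = vec![HgPathBuf::from_bytes(b"dir/subdir/f.txt")];
    let matcher = FileMatcher::new(files).unwrap();
    let expected: HashSet<HgPathBuf> =
        [HgPathBuf::from_bytes(b"subdir")].iter().cloned().collect();
    assert_eq!(
        matcher.visit_children_set(HgPath::new(b"dir")),
        VisitChildrenSet::Set(expected)
    );
}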
256
256
257 /// Matches files that are included in the ignore rules.
257 /// Matches files that are included in the ignore rules.
258 /// ```
258 /// ```
259 /// use hg::{
259 /// use hg::{
260 /// matchers::{IncludeMatcher, Matcher},
260 /// matchers::{IncludeMatcher, Matcher},
261 /// IgnorePattern,
261 /// IgnorePattern,
262 /// PatternSyntax,
262 /// PatternSyntax,
263 /// utils::hg_path::HgPath
263 /// utils::hg_path::HgPath
264 /// };
264 /// };
265 /// use std::path::Path;
265 /// use std::path::Path;
266 ///
266 ///
267 /// let ignore_patterns =
267 /// let ignore_patterns =
268 /// vec![IgnorePattern::new(PatternSyntax::RootGlob, b"this*", Path::new(""))];
268 /// vec![IgnorePattern::new(PatternSyntax::RootGlob, b"this*", Path::new(""))];
269 /// let matcher = IncludeMatcher::new(ignore_patterns).unwrap();
269 /// let matcher = IncludeMatcher::new(ignore_patterns).unwrap();
270 ///
270 ///
271 /// assert_eq!(matcher.matches(HgPath::new(b"testing")), false);
271 /// assert_eq!(matcher.matches(HgPath::new(b"testing")), false);
272 /// assert_eq!(matcher.matches(HgPath::new(b"this should work")), true);
272 /// assert_eq!(matcher.matches(HgPath::new(b"this should work")), true);
273 /// assert_eq!(matcher.matches(HgPath::new(b"this also")), true);
273 /// assert_eq!(matcher.matches(HgPath::new(b"this also")), true);
274 /// assert_eq!(matcher.matches(HgPath::new(b"but not this")), false);
274 /// assert_eq!(matcher.matches(HgPath::new(b"but not this")), false);
275 /// ```
275 /// ```
276 pub struct IncludeMatcher<'a> {
276 pub struct IncludeMatcher<'a> {
277 patterns: Vec<u8>,
277 patterns: Vec<u8>,
278 match_fn: IgnoreFnType<'a>,
278 match_fn: IgnoreFnType<'a>,
279 /// Whether all the patterns match a prefix (i.e. recursively)
279 /// Whether all the patterns match a prefix (i.e. recursively)
280 prefix: bool,
280 prefix: bool,
281 roots: HashSet<HgPathBuf>,
281 roots: HashSet<HgPathBuf>,
282 dirs: HashSet<HgPathBuf>,
282 dirs: HashSet<HgPathBuf>,
283 parents: HashSet<HgPathBuf>,
283 parents: HashSet<HgPathBuf>,
284 }
284 }
285
285
286 impl core::fmt::Debug for IncludeMatcher<'_> {
286 impl core::fmt::Debug for IncludeMatcher<'_> {
287 fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
287 fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
288 f.debug_struct("IncludeMatcher")
288 f.debug_struct("IncludeMatcher")
289 .field("patterns", &String::from_utf8_lossy(&self.patterns))
289 .field("patterns", &String::from_utf8_lossy(&self.patterns))
290 .field("prefix", &self.prefix)
290 .field("prefix", &self.prefix)
291 .field("roots", &self.roots)
291 .field("roots", &self.roots)
292 .field("dirs", &self.dirs)
292 .field("dirs", &self.dirs)
293 .field("parents", &self.parents)
293 .field("parents", &self.parents)
294 .finish()
294 .finish()
295 }
295 }
296 }
296 }
297
297
298 impl<'a> Matcher for IncludeMatcher<'a> {
298 impl<'a> Matcher for IncludeMatcher<'a> {
299 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
299 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
300 None
300 None
301 }
301 }
302
302
303 fn exact_match(&self, _filename: &HgPath) -> bool {
303 fn exact_match(&self, _filename: &HgPath) -> bool {
304 false
304 false
305 }
305 }
306
306
307 fn matches(&self, filename: &HgPath) -> bool {
307 fn matches(&self, filename: &HgPath) -> bool {
308 (self.match_fn)(filename.as_ref())
308 (self.match_fn)(filename.as_ref())
309 }
309 }
310
310
311 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
311 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
312 let dir = directory.as_ref();
312 let dir = directory.as_ref();
313 if self.prefix && self.roots.contains(dir) {
313 if self.prefix && self.roots.contains(dir) {
314 return VisitChildrenSet::Recursive;
314 return VisitChildrenSet::Recursive;
315 }
315 }
316 if self.roots.contains(HgPath::new(b""))
316 if self.roots.contains(HgPath::new(b""))
317 || self.roots.contains(dir)
317 || self.roots.contains(dir)
318 || self.dirs.contains(dir)
318 || self.dirs.contains(dir)
319 || find_dirs(dir).any(|parent_dir| self.roots.contains(parent_dir))
319 || find_dirs(dir).any(|parent_dir| self.roots.contains(parent_dir))
320 {
320 {
321 return VisitChildrenSet::This;
321 return VisitChildrenSet::This;
322 }
322 }
323
323
324 if self.parents.contains(directory.as_ref()) {
324 if self.parents.contains(directory.as_ref()) {
325 let multiset = self.get_all_parents_children();
325 let multiset = self.get_all_parents_children();
326 if let Some(children) = multiset.get(dir) {
326 if let Some(children) = multiset.get(dir) {
327 return VisitChildrenSet::Set(
327 return VisitChildrenSet::Set(
328 children.into_iter().map(HgPathBuf::from).collect(),
328 children.into_iter().map(HgPathBuf::from).collect(),
329 );
329 );
330 }
330 }
331 }
331 }
332 VisitChildrenSet::Empty
332 VisitChildrenSet::Empty
333 }
333 }
334
334
335 fn matches_everything(&self) -> bool {
335 fn matches_everything(&self) -> bool {
336 false
336 false
337 }
337 }
338
338
339 fn is_exact(&self) -> bool {
339 fn is_exact(&self) -> bool {
340 false
340 false
341 }
341 }
342 }
342 }
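// Illustrative sketch (editor's example, not from the upstream file): how a
// traversal such as the status walk might interpret `visit_children_set`.
// `_sketch_should_descend` is a hypothetical helper used nowhere else; it only
// spells out the pruning semantics of each variant using items from this
// module.
#[cfg(test)]
fn _sketch_should_descend(
    matcher: &impl Matcher,
    directory: &HgPath,
    child: &HgPath,
) -> bool {
    match matcher.visit_children_set(directory) {
        // Nothing interesting anywhere below `directory`: prune the subtree.
        VisitChildrenSet::Empty => false,
        // Everything below `directory` is relevant: descend unconditionally.
        VisitChildrenSet::Recursive => true,
        // `directory` itself needs inspection, so look at every child.
        VisitChildrenSet::This => true,
        // Only the named children are worth descending into.
        VisitChildrenSet::Set(children) => {
            children.contains(&child.to_owned())
        }
    }
}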
343
343
344 /// The union of multiple matchers. Will match if any of the matchers match.
344 /// The union of multiple matchers. Will match if any of the matchers match.
345 #[derive(Debug)]
345 #[derive(Debug)]
346 pub struct UnionMatcher {
346 pub struct UnionMatcher {
347 matchers: Vec<Box<dyn Matcher + Sync>>,
347 matchers: Vec<Box<dyn Matcher + Sync>>,
348 }
348 }
349
349
350 impl Matcher for UnionMatcher {
350 impl Matcher for UnionMatcher {
351 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
351 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
352 None
352 None
353 }
353 }
354
354
355 fn exact_match(&self, _filename: &HgPath) -> bool {
355 fn exact_match(&self, _filename: &HgPath) -> bool {
356 false
356 false
357 }
357 }
358
358
359 fn matches(&self, filename: &HgPath) -> bool {
359 fn matches(&self, filename: &HgPath) -> bool {
360 self.matchers.iter().any(|m| m.matches(filename))
360 self.matchers.iter().any(|m| m.matches(filename))
361 }
361 }
362
362
363 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
363 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
364 let mut result = HashSet::new();
364 let mut result = HashSet::new();
365 let mut this = false;
365 let mut this = false;
366 for matcher in self.matchers.iter() {
366 for matcher in self.matchers.iter() {
367 let visit = matcher.visit_children_set(directory);
367 let visit = matcher.visit_children_set(directory);
368 match visit {
368 match visit {
369 VisitChildrenSet::Empty => continue,
369 VisitChildrenSet::Empty => continue,
370 VisitChildrenSet::This => {
370 VisitChildrenSet::This => {
371 this = true;
371 this = true;
372 // Don't break, we might have an 'all' in here.
372 // Don't break, we might have an 'all' in here.
373 continue;
373 continue;
374 }
374 }
375 VisitChildrenSet::Set(set) => {
375 VisitChildrenSet::Set(set) => {
376 result.extend(set);
376 result.extend(set);
377 }
377 }
378 VisitChildrenSet::Recursive => {
378 VisitChildrenSet::Recursive => {
379 return visit;
379 return visit;
380 }
380 }
381 }
381 }
382 }
382 }
383 if this {
383 if this {
384 return VisitChildrenSet::This;
384 return VisitChildrenSet::This;
385 }
385 }
386 if result.is_empty() {
386 if result.is_empty() {
387 VisitChildrenSet::Empty
387 VisitChildrenSet::Empty
388 } else {
388 } else {
389 VisitChildrenSet::Set(result)
389 VisitChildrenSet::Set(result)
390 }
390 }
391 }
391 }
392
392
393 fn matches_everything(&self) -> bool {
393 fn matches_everything(&self) -> bool {
394 // TODO Maybe if all are AlwaysMatcher?
394 // TODO Maybe if all are AlwaysMatcher?
395 false
395 false
396 }
396 }
397
397
398 fn is_exact(&self) -> bool {
398 fn is_exact(&self) -> bool {
399 false
399 false
400 }
400 }
401 }
401 }
402
402
403 impl UnionMatcher {
403 impl UnionMatcher {
404 pub fn new(matchers: Vec<Box<dyn Matcher + Sync>>) -> Self {
404 pub fn new(matchers: Vec<Box<dyn Matcher + Sync>>) -> Self {
405 Self { matchers }
405 Self { matchers }
406 }
406 }
407 }
407 }
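// Illustrative sketch (editor's example, not from the upstream file): a
// `UnionMatcher` matches as soon as any of its sub-matchers does, shown here
// with two single-pattern `IncludeMatcher`s. The concrete paths are made up
// for illustration.
#[cfg(test)]
#[test]
fn sketch_union_matcher_matches_any() {
    let m1 = IncludeMatcher::new(vec![IgnorePattern::new(
        PatternSyntax::RelPath,
        b"dir/subdir",
        Path::new(""),
    )])
    .unwrap();
    let m2 = IncludeMatcher::new(vec![IgnorePattern::new(
        PatternSyntax::RelPath,
        b"other",
        Path::new(""),
    )])
    .unwrap();
    let matcher = UnionMatcher::new(vec![Box::new(m1), Box::new(m2)]);
    assert!(matcher.matches(HgPath::new(b"dir/subdir/file.txt")));
    assert!(matcher.matches(HgPath::new(b"other/file.txt")));
    assert!(!matcher.matches(HgPath::new(b"folder/file.txt")));
}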
408
408
409 #[derive(Debug)]
409 #[derive(Debug)]
410 pub struct IntersectionMatcher {
410 pub struct IntersectionMatcher {
411 m1: Box<dyn Matcher + Sync>,
411 m1: Box<dyn Matcher + Sync>,
412 m2: Box<dyn Matcher + Sync>,
412 m2: Box<dyn Matcher + Sync>,
413 files: Option<HashSet<HgPathBuf>>,
413 files: Option<HashSet<HgPathBuf>>,
414 }
414 }
415
415
416 impl Matcher for IntersectionMatcher {
416 impl Matcher for IntersectionMatcher {
417 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
417 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
418 self.files.as_ref()
418 self.files.as_ref()
419 }
419 }
420
420
421 fn exact_match(&self, filename: &HgPath) -> bool {
421 fn exact_match(&self, filename: &HgPath) -> bool {
422 self.files.as_ref().map_or(false, |f| f.contains(filename))
422 self.files.as_ref().map_or(false, |f| f.contains(filename))
423 }
423 }
424
424
425 fn matches(&self, filename: &HgPath) -> bool {
425 fn matches(&self, filename: &HgPath) -> bool {
426 self.m1.matches(filename) && self.m2.matches(filename)
426 self.m1.matches(filename) && self.m2.matches(filename)
427 }
427 }
428
428
429 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
429 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
430 let m1_set = self.m1.visit_children_set(directory);
430 let m1_set = self.m1.visit_children_set(directory);
431 if m1_set == VisitChildrenSet::Empty {
431 if m1_set == VisitChildrenSet::Empty {
432 return VisitChildrenSet::Empty;
432 return VisitChildrenSet::Empty;
433 }
433 }
434 let m2_set = self.m2.visit_children_set(directory);
434 let m2_set = self.m2.visit_children_set(directory);
435 if m2_set == VisitChildrenSet::Empty {
435 if m2_set == VisitChildrenSet::Empty {
436 return VisitChildrenSet::Empty;
436 return VisitChildrenSet::Empty;
437 }
437 }
438
438
439 if m1_set == VisitChildrenSet::Recursive {
439 if m1_set == VisitChildrenSet::Recursive {
440 return m2_set;
440 return m2_set;
441 } else if m2_set == VisitChildrenSet::Recursive {
441 } else if m2_set == VisitChildrenSet::Recursive {
442 return m1_set;
442 return m1_set;
443 }
443 }
444
444
445 match (&m1_set, &m2_set) {
445 match (&m1_set, &m2_set) {
446 (VisitChildrenSet::Recursive, _) => m2_set,
446 (VisitChildrenSet::Recursive, _) => m2_set,
447 (_, VisitChildrenSet::Recursive) => m1_set,
447 (_, VisitChildrenSet::Recursive) => m1_set,
448 (VisitChildrenSet::This, _) | (_, VisitChildrenSet::This) => {
448 (VisitChildrenSet::This, _) | (_, VisitChildrenSet::This) => {
449 VisitChildrenSet::This
449 VisitChildrenSet::This
450 }
450 }
451 (VisitChildrenSet::Set(m1), VisitChildrenSet::Set(m2)) => {
451 (VisitChildrenSet::Set(m1), VisitChildrenSet::Set(m2)) => {
452 let set: HashSet<_> = m1.intersection(&m2).cloned().collect();
452 let set: HashSet<_> = m1.intersection(&m2).cloned().collect();
453 if set.is_empty() {
453 if set.is_empty() {
454 VisitChildrenSet::Empty
454 VisitChildrenSet::Empty
455 } else {
455 } else {
456 VisitChildrenSet::Set(set)
456 VisitChildrenSet::Set(set)
457 }
457 }
458 }
458 }
459 _ => unreachable!(),
459 _ => unreachable!(),
460 }
460 }
461 }
461 }
462
462
463 fn matches_everything(&self) -> bool {
463 fn matches_everything(&self) -> bool {
464 self.m1.matches_everything() && self.m2.matches_everything()
464 self.m1.matches_everything() && self.m2.matches_everything()
465 }
465 }
466
466
467 fn is_exact(&self) -> bool {
467 fn is_exact(&self) -> bool {
468 self.m1.is_exact() || self.m2.is_exact()
468 self.m1.is_exact() || self.m2.is_exact()
469 }
469 }
470 }
470 }
471
471
472 impl IntersectionMatcher {
472 impl IntersectionMatcher {
473 pub fn new(
473 pub fn new(
474 mut m1: Box<dyn Matcher + Sync>,
474 mut m1: Box<dyn Matcher + Sync>,
475 mut m2: Box<dyn Matcher + Sync>,
475 mut m2: Box<dyn Matcher + Sync>,
476 ) -> Self {
476 ) -> Self {
477 let files = if m1.is_exact() || m2.is_exact() {
477 let files = if m1.is_exact() || m2.is_exact() {
478 if !m1.is_exact() {
478 if !m1.is_exact() {
479 std::mem::swap(&mut m1, &mut m2);
479 std::mem::swap(&mut m1, &mut m2);
480 }
480 }
481 m1.file_set().map(|m1_files| {
481 m1.file_set().map(|m1_files| {
482 m1_files.iter().cloned().filter(|f| m2.matches(f)).collect()
482 m1_files.iter().cloned().filter(|f| m2.matches(f)).collect()
483 })
483 })
484 } else {
484 } else {
485 None
485 None
486 };
486 };
487 Self { m1, m2, files }
487 Self { m1, m2, files }
488 }
488 }
489 }
489 }
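// Illustrative sketch (editor's example, not from the upstream file): an
// `IntersectionMatcher` only matches paths accepted by *both* operands, here
// a `RelPath` include restricted further by a glob on the file name.
#[cfg(test)]
#[test]
fn sketch_intersection_matcher_matches_both() {
    let m1 = Box::new(
        IncludeMatcher::new(vec![IgnorePattern::new(
            PatternSyntax::RelPath,
            b"dir",
            Path::new(""),
        )])
        .unwrap(),
    );
    let m2 = Box::new(
        IncludeMatcher::new(vec![IgnorePattern::new(
            PatternSyntax::Glob,
            b"dir/*.rs",
            Path::new(""),
        )])
        .unwrap(),
    );
    let matcher = IntersectionMatcher::new(m1, m2);
    assert!(matcher.matches(HgPath::new(b"dir/lib.rs")));
    assert!(!matcher.matches(HgPath::new(b"dir/README.txt")));
}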
490
490
491 #[derive(Debug)]
491 #[derive(Debug)]
492 pub struct DifferenceMatcher {
492 pub struct DifferenceMatcher {
493 base: Box<dyn Matcher + Sync>,
493 base: Box<dyn Matcher + Sync>,
494 excluded: Box<dyn Matcher + Sync>,
494 excluded: Box<dyn Matcher + Sync>,
495 files: Option<HashSet<HgPathBuf>>,
495 files: Option<HashSet<HgPathBuf>>,
496 }
496 }
497
497
498 impl Matcher for DifferenceMatcher {
498 impl Matcher for DifferenceMatcher {
499 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
499 fn file_set(&self) -> Option<&HashSet<HgPathBuf>> {
500 self.files.as_ref()
500 self.files.as_ref()
501 }
501 }
502
502
503 fn exact_match(&self, filename: &HgPath) -> bool {
503 fn exact_match(&self, filename: &HgPath) -> bool {
504 self.files.as_ref().map_or(false, |f| f.contains(filename))
504 self.files.as_ref().map_or(false, |f| f.contains(filename))
505 }
505 }
506
506
507 fn matches(&self, filename: &HgPath) -> bool {
507 fn matches(&self, filename: &HgPath) -> bool {
508 self.base.matches(filename) && !self.excluded.matches(filename)
508 self.base.matches(filename) && !self.excluded.matches(filename)
509 }
509 }
510
510
511 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
511 fn visit_children_set(&self, directory: &HgPath) -> VisitChildrenSet {
512 let excluded_set = self.excluded.visit_children_set(directory);
512 let excluded_set = self.excluded.visit_children_set(directory);
513 if excluded_set == VisitChildrenSet::Recursive {
513 if excluded_set == VisitChildrenSet::Recursive {
514 return VisitChildrenSet::Empty;
514 return VisitChildrenSet::Empty;
515 }
515 }
516 let base_set = self.base.visit_children_set(directory);
516 let base_set = self.base.visit_children_set(directory);
517 // Possible values for base: 'recursive', 'this', set(...), set()
517 // Possible values for base: 'recursive', 'this', set(...), set()
518 // Possible values for excluded: 'this', set(...), set()
518 // Possible values for excluded: 'this', set(...), set()
519 // If excluded has nothing under here that we care about, return base,
519 // If excluded has nothing under here that we care about, return base,
520 // even if it's 'recursive'.
520 // even if it's 'recursive'.
521 if excluded_set == VisitChildrenSet::Empty {
521 if excluded_set == VisitChildrenSet::Empty {
522 return base_set;
522 return base_set;
523 }
523 }
524 match base_set {
524 match base_set {
525 VisitChildrenSet::This | VisitChildrenSet::Recursive => {
525 VisitChildrenSet::This | VisitChildrenSet::Recursive => {
526 // Never return 'recursive' here if excluded_set is any kind of
526 // Never return 'recursive' here if excluded_set is any kind of
527 // non-empty (either 'this' or set(foo)), since excluded might
527 // non-empty (either 'this' or set(foo)), since excluded might
528 // return set() for a subdirectory.
528 // return set() for a subdirectory.
529 VisitChildrenSet::This
529 VisitChildrenSet::This
530 }
530 }
531 set => {
531 set => {
532 // Possible values for base: set(...), set()
532 // Possible values for base: set(...), set()
533 // Possible values for excluded: 'this', set(...)
533 // Possible values for excluded: 'this', set(...)
534 // We ignore excluded set results. They're possibly incorrect:
534 // We ignore excluded set results. They're possibly incorrect:
535 // base = path:dir/subdir
535 // base = path:dir/subdir
536 // excluded=rootfilesin:dir,
536 // excluded=rootfilesin:dir,
537 // visit_children_set(''):
537 // visit_children_set(''):
538 // base returns {'dir'}, excluded returns {'dir'}, if we
538 // base returns {'dir'}, excluded returns {'dir'}, if we
539 // subtracted we'd return set(), which is *not* correct, we
539 // subtracted we'd return set(), which is *not* correct, we
540 // still need to visit 'dir'!
540 // still need to visit 'dir'!
541 set
541 set
542 }
542 }
543 }
543 }
544 }
544 }
545
545
546 fn matches_everything(&self) -> bool {
546 fn matches_everything(&self) -> bool {
547 false
547 false
548 }
548 }
549
549
550 fn is_exact(&self) -> bool {
550 fn is_exact(&self) -> bool {
551 self.base.is_exact()
551 self.base.is_exact()
552 }
552 }
553 }
553 }
554
554
555 impl DifferenceMatcher {
555 impl DifferenceMatcher {
556 pub fn new(
556 pub fn new(
557 base: Box<dyn Matcher + Sync>,
557 base: Box<dyn Matcher + Sync>,
558 excluded: Box<dyn Matcher + Sync>,
558 excluded: Box<dyn Matcher + Sync>,
559 ) -> Self {
559 ) -> Self {
560 let base_is_exact = base.is_exact();
560 let base_is_exact = base.is_exact();
561 let base_files = base.file_set().map(ToOwned::to_owned);
561 let base_files = base.file_set().map(ToOwned::to_owned);
562 let mut new = Self {
562 let mut new = Self {
563 base,
563 base,
564 excluded,
564 excluded,
565 files: None,
565 files: None,
566 };
566 };
567 if base_is_exact {
567 if base_is_exact {
568 new.files = base_files.map(|files| {
568 new.files = base_files.map(|files| {
569 files.iter().cloned().filter(|f| new.matches(f)).collect()
569 files.iter().cloned().filter(|f| new.matches(f)).collect()
570 });
570 });
571 }
571 }
572 new
572 new
573 }
573 }
574 }
574 }
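// Illustrative sketch (editor's example, not from the upstream file): a
// `DifferenceMatcher` keeps whatever `base` matches except what `excluded`
// matches, i.e. "dir minus dir/subdir" below.
#[cfg(test)]
#[test]
fn sketch_difference_matcher_excludes() {
    let base = Box::new(
        IncludeMatcher::new(vec![IgnorePattern::new(
            PatternSyntax::RelPath,
            b"dir",
            Path::new(""),
        )])
        .unwrap(),
    );
    let excluded = Box::new(
        IncludeMatcher::new(vec![IgnorePattern::new(
            PatternSyntax::RelPath,
            b"dir/subdir",
            Path::new(""),
        )])
        .unwrap(),
    );
    let matcher = DifferenceMatcher::new(base, excluded);
    assert!(matcher.matches(HgPath::new(b"dir/file.txt")));
    assert!(!matcher.matches(HgPath::new(b"dir/subdir/file.txt")));
}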
575
575
576 /// Returns a function that matches an `HgPath` against the given regex
576 /// Returns a function that matches an `HgPath` against the given regex
577 /// pattern.
577 /// pattern.
578 ///
578 ///
579 /// This can fail when the pattern is invalid or not supported by the
579 /// This can fail when the pattern is invalid or not supported by the
580 /// underlying engine (the `regex` crate), for instance anything with
580 /// underlying engine (the `regex` crate), for instance anything with
581 /// back-references.
581 /// back-references.
582 #[timed]
582 #[timed]
583 fn re_matcher(
583 fn re_matcher(
584 pattern: &[u8],
584 pattern: &[u8],
585 ) -> PatternResult<impl Fn(&HgPath) -> bool + Sync> {
585 ) -> PatternResult<impl Fn(&HgPath) -> bool + Sync> {
586 use std::io::Write;
586 use std::io::Write;
587
587
588 // The `regex` crate adds `.*` to the start and end of expressions if there
588 // The `regex` crate adds `.*` to the start and end of expressions if there
589 // are no anchors, so add the start anchor.
589 // are no anchors, so add the start anchor.
590 let mut escaped_bytes = vec![b'^', b'(', b'?', b':'];
590 let mut escaped_bytes = vec![b'^', b'(', b'?', b':'];
591 for byte in pattern {
591 for byte in pattern {
592 if *byte > 127 {
592 if *byte > 127 {
593 write!(escaped_bytes, "\\x{:x}", *byte).unwrap();
593 write!(escaped_bytes, "\\x{:x}", *byte).unwrap();
594 } else {
594 } else {
595 escaped_bytes.push(*byte);
595 escaped_bytes.push(*byte);
596 }
596 }
597 }
597 }
598 escaped_bytes.push(b')');
598 escaped_bytes.push(b')');
599
599
600 // Avoid the cost of UTF8 checking
600 // Avoid the cost of UTF8 checking
601 //
601 //
602 // # Safety
602 // # Safety
603 // This is safe because we escaped all non-ASCII bytes.
603 // This is safe because we escaped all non-ASCII bytes.
604 let pattern_string = unsafe { String::from_utf8_unchecked(escaped_bytes) };
604 let pattern_string = unsafe { String::from_utf8_unchecked(escaped_bytes) };
605 let re = regex::bytes::RegexBuilder::new(&pattern_string)
605 let re = regex::bytes::RegexBuilder::new(&pattern_string)
606 .unicode(false)
606 .unicode(false)
607 // Big repos with big `.hgignore` files will hit the default DFA size
607 // Big repos with big `.hgignore` files will hit the default DFA size
608 // limit and incur a significant performance hit; one such repo's
608 // limit and incur a significant performance hit; one such repo's
609 // `hg status` took multiple *minutes*.
609 // `hg status` took multiple *minutes*.
610 .dfa_size_limit(50 * (1 << 20))
610 .dfa_size_limit(50 * (1 << 20))
611 .build()
611 .build()
612 .map_err(|e| PatternError::UnsupportedSyntax(e.to_string()))?;
612 .map_err(|e| PatternError::UnsupportedSyntax(e.to_string()))?;
613
613
614 Ok(move |path: &HgPath| re.is_match(path.as_bytes()))
614 Ok(move |path: &HgPath| re.is_match(path.as_bytes()))
615 }
615 }
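// Illustrative sketch (editor's example, not from the upstream file): because
// bytes above 127 are hex-escaped before the pattern is handed to the `regex`
// crate, a pattern containing raw UTF-8 bytes still compiles and matches
// byte-for-byte.
#[cfg(test)]
#[test]
fn sketch_re_matcher_non_ascii_bytes() {
    // b"caf\xc3\xa9" is "café" encoded as UTF-8.
    let match_fn = re_matcher(b"caf\xc3\xa9").unwrap();
    assert!(match_fn(HgPath::new(b"caf\xc3\xa9")));
    assert!(!match_fn(HgPath::new(b"cafe")));
}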
616
616
617 /// Returns the regex pattern and a function that matches an `HgPath` against
617 /// Returns the regex pattern and a function that matches an `HgPath` against
618 /// said regex formed by the given ignore patterns.
618 /// said regex formed by the given ignore patterns.
619 fn build_regex_match<'a, 'b>(
619 fn build_regex_match<'a, 'b>(
620 ignore_patterns: &'a [IgnorePattern],
620 ignore_patterns: &'a [IgnorePattern],
621 ) -> PatternResult<(Vec<u8>, IgnoreFnType<'b>)> {
621 ) -> PatternResult<(Vec<u8>, IgnoreFnType<'b>)> {
622 let mut regexps = vec![];
622 let mut regexps = vec![];
623 let mut exact_set = HashSet::new();
623 let mut exact_set = HashSet::new();
624
624
625 for pattern in ignore_patterns {
625 for pattern in ignore_patterns {
626 if let Some(re) = build_single_regex(pattern)? {
626 if let Some(re) = build_single_regex(pattern)? {
627 regexps.push(re);
627 regexps.push(re);
628 } else {
628 } else {
629 let exact = normalize_path_bytes(&pattern.pattern);
629 let exact = normalize_path_bytes(&pattern.pattern);
630 exact_set.insert(HgPathBuf::from_bytes(&exact));
630 exact_set.insert(HgPathBuf::from_bytes(&exact));
631 }
631 }
632 }
632 }
633
633
634 let full_regex = regexps.join(&b'|');
634 let full_regex = regexps.join(&b'|');
635
635
636 // An empty pattern would cause the regex engine to incorrectly match the
636 // An empty pattern would cause the regex engine to incorrectly match the
637 // (empty) root directory
637 // (empty) root directory
638 let func = if !regexps.is_empty() {
638 let func = if !regexps.is_empty() {
639 let matcher = re_matcher(&full_regex)?;
639 let matcher = re_matcher(&full_regex)?;
640 let func = move |filename: &HgPath| {
640 let func = move |filename: &HgPath| {
641 exact_set.contains(filename) || matcher(filename)
641 exact_set.contains(filename) || matcher(filename)
642 };
642 };
643 Box::new(func) as IgnoreFnType
643 Box::new(func) as IgnoreFnType
644 } else {
644 } else {
645 let func = move |filename: &HgPath| exact_set.contains(filename);
645 let func = move |filename: &HgPath| exact_set.contains(filename);
646 Box::new(func) as IgnoreFnType
646 Box::new(func) as IgnoreFnType
647 };
647 };
648
648
649 Ok((full_regex, func))
649 Ok((full_regex, func))
650 }
650 }
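// Illustrative sketch (editor's example, not from the upstream file),
// assuming `build_single_regex` keeps returning `None` for literal `RootGlob`
// patterns (which is what the `exact_set` branch above exists for): such
// patterns are matched by set lookup and contribute nothing to the combined
// regex.
#[cfg(test)]
#[test]
fn sketch_build_regex_match_exact_set() {
    let (regex, match_fn) = build_regex_match(&[IgnorePattern::new(
        PatternSyntax::RootGlob,
        b"literal/path.txt",
        Path::new(""),
    )])
    .unwrap();
    // No regex is needed for a purely literal pattern.
    assert!(regex.is_empty());
    assert!(match_fn(HgPath::new(b"literal/path.txt")));
    assert!(!match_fn(HgPath::new(b"literal/other.txt")));
}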
651
651
652 /// Returns roots and directories corresponding to each pattern.
652 /// Returns roots and directories corresponding to each pattern.
653 ///
653 ///
654 /// This calculates the roots and directories exactly matching the patterns and
654 /// This calculates the roots and directories exactly matching the patterns and
655 /// returns a tuple of (roots, dirs). It does not return other directories
655 /// returns a tuple of (roots, dirs). It does not return other directories
656 /// which may also need to be considered, like the parent directories.
656 /// which may also need to be considered, like the parent directories.
657 fn roots_and_dirs(
657 fn roots_and_dirs(
658 ignore_patterns: &[IgnorePattern],
658 ignore_patterns: &[IgnorePattern],
659 ) -> (Vec<HgPathBuf>, Vec<HgPathBuf>) {
659 ) -> (Vec<HgPathBuf>, Vec<HgPathBuf>) {
660 let mut roots = Vec::new();
660 let mut roots = Vec::new();
661 let mut dirs = Vec::new();
661 let mut dirs = Vec::new();
662
662
663 for ignore_pattern in ignore_patterns {
663 for ignore_pattern in ignore_patterns {
664 let IgnorePattern {
664 let IgnorePattern {
665 syntax, pattern, ..
665 syntax, pattern, ..
666 } = ignore_pattern;
666 } = ignore_pattern;
667 match syntax {
667 match syntax {
668 PatternSyntax::RootGlob | PatternSyntax::Glob => {
668 PatternSyntax::RootGlob | PatternSyntax::Glob => {
669 let mut root = HgPathBuf::new();
669 let mut root = HgPathBuf::new();
670 for p in pattern.split(|c| *c == b'/') {
670 for p in pattern.split(|c| *c == b'/') {
671 if p.iter().any(|c| match *c {
671 if p.iter().any(|c| match *c {
672 b'[' | b'{' | b'*' | b'?' => true,
672 b'[' | b'{' | b'*' | b'?' => true,
673 _ => false,
673 _ => false,
674 }) {
674 }) {
675 break;
675 break;
676 }
676 }
677 root.push(HgPathBuf::from_bytes(p).as_ref());
677 root.push(HgPathBuf::from_bytes(p).as_ref());
678 }
678 }
679 roots.push(root);
679 roots.push(root);
680 }
680 }
681 PatternSyntax::Path | PatternSyntax::RelPath => {
681 PatternSyntax::Path | PatternSyntax::RelPath => {
682 let pat = HgPath::new(if pattern == b"." {
682 let pat = HgPath::new(if pattern == b"." {
683 &[] as &[u8]
683 &[] as &[u8]
684 } else {
684 } else {
685 pattern
685 pattern
686 });
686 });
687 roots.push(pat.to_owned());
687 roots.push(pat.to_owned());
688 }
688 }
689 PatternSyntax::RootFiles => {
689 PatternSyntax::RootFiles => {
690 let pat = if pattern == b"." {
690 let pat = if pattern == b"." {
691 &[] as &[u8]
691 &[] as &[u8]
692 } else {
692 } else {
693 pattern
693 pattern
694 };
694 };
695 dirs.push(HgPathBuf::from_bytes(pat));
695 dirs.push(HgPathBuf::from_bytes(pat));
696 }
696 }
697 _ => {
697 _ => {
698 roots.push(HgPathBuf::new());
698 roots.push(HgPathBuf::new());
699 }
699 }
700 }
700 }
701 }
701 }
702 (roots, dirs)
702 (roots, dirs)
703 }
703 }
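// Illustrative sketch (editor's example, not from the upstream file): unlike
// globs and paths, a `rootfilesin:` pattern contributes to `dirs` (matched
// non-recursively) rather than to `roots`.
#[cfg(test)]
#[test]
fn sketch_roots_and_dirs_rootfilesin() {
    let pats = vec![IgnorePattern::new(
        PatternSyntax::RootFiles,
        b"some/dir",
        Path::new(""),
    )];
    let (roots, dirs) = roots_and_dirs(&pats);
    assert!(roots.is_empty());
    assert_eq!(dirs, vec![HgPathBuf::from_bytes(b"some/dir")]);
}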
704
704
705 /// Paths extracted from patterns
705 /// Paths extracted from patterns
706 #[derive(Debug, PartialEq)]
706 #[derive(Debug, PartialEq)]
707 struct RootsDirsAndParents {
707 struct RootsDirsAndParents {
708 /// Directories to match recursively
708 /// Directories to match recursively
709 pub roots: HashSet<HgPathBuf>,
709 pub roots: HashSet<HgPathBuf>,
710 /// Directories to match non-recursively
710 /// Directories to match non-recursively
711 pub dirs: HashSet<HgPathBuf>,
711 pub dirs: HashSet<HgPathBuf>,
712 /// Directories implicitly required in order to reach items in either roots or dirs
712 /// Directories implicitly required in order to reach items in either roots or dirs
713 pub parents: HashSet<HgPathBuf>,
713 pub parents: HashSet<HgPathBuf>,
714 }
714 }
715
715
716 /// Extract roots, dirs and parents from patterns.
716 /// Extract roots, dirs and parents from patterns.
717 fn roots_dirs_and_parents(
717 fn roots_dirs_and_parents(
718 ignore_patterns: &[IgnorePattern],
718 ignore_patterns: &[IgnorePattern],
719 ) -> PatternResult<RootsDirsAndParents> {
719 ) -> PatternResult<RootsDirsAndParents> {
720 let (roots, dirs) = roots_and_dirs(ignore_patterns);
720 let (roots, dirs) = roots_and_dirs(ignore_patterns);
721
721
722 let mut parents = HashSet::new();
722 let mut parents = HashSet::new();
723
723
724 parents.extend(
724 parents.extend(
725 DirsMultiset::from_manifest(&dirs)
725 DirsMultiset::from_manifest(&dirs)
726 .map_err(|e| match e {
726 .map_err(|e| match e {
727 DirstateMapError::InvalidPath(e) => e,
727 DirstateMapError::InvalidPath(e) => e,
728 _ => unreachable!(),
728 _ => unreachable!(),
729 })?
729 })?
730 .iter()
730 .iter()
731 .map(ToOwned::to_owned),
731 .map(ToOwned::to_owned),
732 );
732 );
733 parents.extend(
733 parents.extend(
734 DirsMultiset::from_manifest(&roots)
734 DirsMultiset::from_manifest(&roots)
735 .map_err(|e| match e {
735 .map_err(|e| match e {
736 DirstateMapError::InvalidPath(e) => e,
736 DirstateMapError::InvalidPath(e) => e,
737 _ => unreachable!(),
737 _ => unreachable!(),
738 })?
738 })?
739 .iter()
739 .iter()
740 .map(ToOwned::to_owned),
740 .map(ToOwned::to_owned),
741 );
741 );
742
742
743 Ok(RootsDirsAndParents {
743 Ok(RootsDirsAndParents {
744 roots: HashSet::from_iter(roots),
744 roots: HashSet::from_iter(roots),
745 dirs: HashSet::from_iter(dirs),
745 dirs: HashSet::from_iter(dirs),
746 parents,
746 parents,
747 })
747 })
748 }
748 }
749
749
750 /// Returns a function that checks whether a given file (in the general sense)
750 /// Returns a function that checks whether a given file (in the general sense)
751 /// should be matched.
751 /// should be matched.
752 fn build_match<'a, 'b>(
752 fn build_match<'a, 'b>(
753 ignore_patterns: Vec<IgnorePattern>,
753 ignore_patterns: Vec<IgnorePattern>,
754 ) -> PatternResult<(Vec<u8>, IgnoreFnType<'b>)> {
754 ) -> PatternResult<(Vec<u8>, IgnoreFnType<'b>)> {
755 let mut match_funcs: Vec<IgnoreFnType<'b>> = vec![];
755 let mut match_funcs: Vec<IgnoreFnType<'b>> = vec![];
756 // For debugging and printing
756 // For debugging and printing
757 let mut patterns = vec![];
757 let mut patterns = vec![];
758
758
759 let (subincludes, ignore_patterns) = filter_subincludes(ignore_patterns)?;
759 let (subincludes, ignore_patterns) = filter_subincludes(ignore_patterns)?;
760
760
761 if !subincludes.is_empty() {
761 if !subincludes.is_empty() {
762 // Build prefix-based matcher functions for subincludes
762 // Build prefix-based matcher functions for subincludes
763 let mut submatchers = FastHashMap::default();
763 let mut submatchers = FastHashMap::default();
764 let mut prefixes = vec![];
764 let mut prefixes = vec![];
765
765
766 for sub_include in subincludes {
766 for sub_include in subincludes {
767 let matcher = IncludeMatcher::new(sub_include.included_patterns)?;
767 let matcher = IncludeMatcher::new(sub_include.included_patterns)?;
768 let match_fn =
768 let match_fn =
769 Box::new(move |path: &HgPath| matcher.matches(path));
769 Box::new(move |path: &HgPath| matcher.matches(path));
770 prefixes.push(sub_include.prefix.clone());
770 prefixes.push(sub_include.prefix.clone());
771 submatchers.insert(sub_include.prefix.clone(), match_fn);
771 submatchers.insert(sub_include.prefix.clone(), match_fn);
772 }
772 }
773
773
774 let match_subinclude = move |filename: &HgPath| {
774 let match_subinclude = move |filename: &HgPath| {
775 for prefix in prefixes.iter() {
775 for prefix in prefixes.iter() {
776 if let Some(rel) = filename.relative_to(prefix) {
776 if let Some(rel) = filename.relative_to(prefix) {
777 if (submatchers[prefix])(rel) {
777 if (submatchers[prefix])(rel) {
778 return true;
778 return true;
779 }
779 }
780 }
780 }
781 }
781 }
782 false
782 false
783 };
783 };
784
784
785 match_funcs.push(Box::new(match_subinclude));
785 match_funcs.push(Box::new(match_subinclude));
786 }
786 }
787
787
788 if !ignore_patterns.is_empty() {
788 if !ignore_patterns.is_empty() {
789 // Either do dumb matching if all patterns are rootfiles, or match
789 // Either do dumb matching if all patterns are rootfiles, or match
790 // with a regex.
790 // with a regex.
791 if ignore_patterns
791 if ignore_patterns
792 .iter()
792 .iter()
793 .all(|k| k.syntax == PatternSyntax::RootFiles)
793 .all(|k| k.syntax == PatternSyntax::RootFiles)
794 {
794 {
795 let dirs: HashSet<_> = ignore_patterns
795 let dirs: HashSet<_> = ignore_patterns
796 .iter()
796 .iter()
797 .map(|k| k.pattern.to_owned())
797 .map(|k| k.pattern.to_owned())
798 .collect();
798 .collect();
799 let mut dirs_vec: Vec<_> = dirs.iter().cloned().collect();
799 let mut dirs_vec: Vec<_> = dirs.iter().cloned().collect();
800
800
801 let match_func = move |path: &HgPath| -> bool {
801 let match_func = move |path: &HgPath| -> bool {
802 let path = path.as_bytes();
802 let path = path.as_bytes();
803 let i = path.iter().rposition(|a| *a == b'/');
803 let i = path.iter().rposition(|a| *a == b'/');
804 let dir = if let Some(i) = i {
804 let dir = if let Some(i) = i {
805 &path[..i]
805 &path[..i]
806 } else {
806 } else {
807 b"."
807 b"."
808 };
808 };
809 dirs.contains(dir.deref())
809 dirs.contains(dir.deref())
810 };
810 };
811 match_funcs.push(Box::new(match_func));
811 match_funcs.push(Box::new(match_func));
812
812
813 patterns.extend(b"rootfilesin: ");
813 patterns.extend(b"rootfilesin: ");
814 dirs_vec.sort();
814 dirs_vec.sort();
815 patterns.extend(dirs_vec.escaped_bytes());
815 patterns.extend(dirs_vec.escaped_bytes());
816 } else {
816 } else {
817 let (new_re, match_func) = build_regex_match(&ignore_patterns)?;
817 let (new_re, match_func) = build_regex_match(&ignore_patterns)?;
818 patterns = new_re;
818 patterns = new_re;
819 match_funcs.push(match_func)
819 match_funcs.push(match_func)
820 }
820 }
821 }
821 }
822
822
823 Ok(if match_funcs.len() == 1 {
823 Ok(if match_funcs.len() == 1 {
824 (patterns, match_funcs.remove(0))
824 (patterns, match_funcs.remove(0))
825 } else {
825 } else {
826 (
826 (
827 patterns,
827 patterns,
828 Box::new(move |f: &HgPath| -> bool {
828 Box::new(move |f: &HgPath| -> bool {
829 match_funcs.iter().any(|match_func| match_func(f))
829 match_funcs.iter().any(|match_func| match_func(f))
830 }),
830 }),
831 )
831 )
832 })
832 })
833 }
833 }
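// Illustrative sketch (editor's example, not from the upstream file): when
// every pattern is `rootfilesin:`, `build_match` takes the directory-lookup
// shortcut above instead of compiling a regex, so only files sitting directly
// in one of the listed directories match.
#[cfg(test)]
#[test]
fn sketch_build_match_rootfilesin_only() {
    let (_patterns, match_fn) = build_match(vec![IgnorePattern::new(
        PatternSyntax::RootFiles,
        b"dir",
        Path::new(""),
    )])
    .unwrap();
    assert!(match_fn(HgPath::new(b"dir/file.txt")));
    assert!(!match_fn(HgPath::new(b"dir/subdir/file.txt")));
    assert!(!match_fn(HgPath::new(b"file.txt")));
}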
834
834
835 /// Parses all "ignore" files with their recursive includes and returns an
835 /// Parses all "ignore" files with their recursive includes and returns an
836 /// `IncludeMatcher` that tells whether a given file (in the general sense)
836 /// `IncludeMatcher` that tells whether a given file (in the general sense)
837 /// should be ignored.
837 /// should be ignored.
838 pub fn get_ignore_matcher<'a>(
838 pub fn get_ignore_matcher<'a>(
839 mut all_pattern_files: Vec<PathBuf>,
839 mut all_pattern_files: Vec<PathBuf>,
840 root_dir: &Path,
840 root_dir: &Path,
841 inspect_pattern_bytes: &mut impl FnMut(&[u8]),
841 inspect_pattern_bytes: &mut impl FnMut(&Path, &[u8]),
842 ) -> PatternResult<(IncludeMatcher<'a>, Vec<PatternFileWarning>)> {
842 ) -> PatternResult<(IncludeMatcher<'a>, Vec<PatternFileWarning>)> {
843 let mut all_patterns = vec![];
843 let mut all_patterns = vec![];
844 let mut all_warnings = vec![];
844 let mut all_warnings = vec![];
845
845
846 // Sort to make the ordering of calls to `inspect_pattern_bytes`
846 // Sort to make the ordering of calls to `inspect_pattern_bytes`
847 // deterministic even if the ordering of `all_pattern_files` is not (such
847 // deterministic even if the ordering of `all_pattern_files` is not (such
848 // as when the iteration order of a Python dict or Rust HashMap is involved).
848 // as when the iteration order of a Python dict or Rust HashMap is involved).
849 // Sort by "string" representation instead of the default ordering by
849 // Sort by "string" representation instead of the default ordering by
850 // component (which uses a Rust-specific definition of a component).
850 // component (which uses a Rust-specific definition of a component).
851 all_pattern_files
851 all_pattern_files
852 .sort_unstable_by(|a, b| a.as_os_str().cmp(b.as_os_str()));
852 .sort_unstable_by(|a, b| a.as_os_str().cmp(b.as_os_str()));
853
853
854 for pattern_file in &all_pattern_files {
854 for pattern_file in &all_pattern_files {
855 let (patterns, warnings) = get_patterns_from_file(
855 let (patterns, warnings) = get_patterns_from_file(
856 pattern_file,
856 pattern_file,
857 root_dir,
857 root_dir,
858 inspect_pattern_bytes,
858 inspect_pattern_bytes,
859 )?;
859 )?;
860
860
861 all_patterns.extend(patterns.to_owned());
861 all_patterns.extend(patterns.to_owned());
862 all_warnings.extend(warnings);
862 all_warnings.extend(warnings);
863 }
863 }
864 let matcher = IncludeMatcher::new(all_patterns)?;
864 let matcher = IncludeMatcher::new(all_patterns)?;
865 Ok((matcher, all_warnings))
865 Ok((matcher, all_warnings))
866 }
866 }
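// Illustrative sketch (editor's example, not from the upstream file): with the
// new callback signature introduced by this changeset, callers can mix the
// *source* of each pattern file (its path) into whatever digest they keep, not
// just the pattern bytes. The real consumer lives in the dirstate-v2/status
// code; a std `DefaultHasher` stands in for the actual hash here, and the
// function name is hypothetical.
#[cfg(test)]
fn _sketch_hash_ignore_sources(
    pattern_files: Vec<PathBuf>,
    root_dir: &Path,
) -> PatternResult<u64> {
    use std::hash::{Hash, Hasher};
    let mut hasher = std::collections::hash_map::DefaultHasher::new();
    let (_matcher, _warnings) = get_ignore_matcher(
        pattern_files,
        root_dir,
        &mut |source: &Path, bytes: &[u8]| {
            // Hash where the patterns came from as well as their contents.
            source.hash(&mut hasher);
            bytes.hash(&mut hasher);
        },
    )?;
    Ok(hasher.finish())
}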
867
867
868 /// Parses all "ignore" files with their recursive includes and returns a
868 /// Parses all "ignore" files with their recursive includes and returns a
869 /// function that checks whether a given file (in the general sense) should be
869 /// function that checks whether a given file (in the general sense) should be
870 /// ignored.
870 /// ignored.
871 pub fn get_ignore_function<'a>(
871 pub fn get_ignore_function<'a>(
872 all_pattern_files: Vec<PathBuf>,
872 all_pattern_files: Vec<PathBuf>,
873 root_dir: &Path,
873 root_dir: &Path,
874 inspect_pattern_bytes: &mut impl FnMut(&[u8]),
874 inspect_pattern_bytes: &mut impl FnMut(&Path, &[u8]),
875 ) -> PatternResult<(IgnoreFnType<'a>, Vec<PatternFileWarning>)> {
875 ) -> PatternResult<(IgnoreFnType<'a>, Vec<PatternFileWarning>)> {
876 let res =
876 let res =
877 get_ignore_matcher(all_pattern_files, root_dir, inspect_pattern_bytes);
877 get_ignore_matcher(all_pattern_files, root_dir, inspect_pattern_bytes);
878 res.map(|(matcher, all_warnings)| {
878 res.map(|(matcher, all_warnings)| {
879 let res: IgnoreFnType<'a> =
879 let res: IgnoreFnType<'a> =
880 Box::new(move |path: &HgPath| matcher.matches(path));
880 Box::new(move |path: &HgPath| matcher.matches(path));
881
881
882 (res, all_warnings)
882 (res, all_warnings)
883 })
883 })
884 }
884 }
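// Illustrative sketch (editor's example, not from the upstream file): the
// function form is simply the matcher wrapped into a boxed predicate, which is
// convenient for callers that only ever ask "is this path ignored?". The
// helper name is hypothetical.
#[cfg(test)]
fn _sketch_is_ignored(
    pattern_files: Vec<PathBuf>,
    root_dir: &Path,
    path: &HgPath,
) -> PatternResult<bool> {
    let (is_ignored, _warnings) = get_ignore_function(
        pattern_files,
        root_dir,
        // Ignore the pattern bytes; only the resulting predicate is wanted.
        &mut |_source: &Path, _bytes: &[u8]| {},
    )?;
    Ok(is_ignored(path))
}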
885
885
886 impl<'a> IncludeMatcher<'a> {
886 impl<'a> IncludeMatcher<'a> {
887 pub fn new(ignore_patterns: Vec<IgnorePattern>) -> PatternResult<Self> {
887 pub fn new(ignore_patterns: Vec<IgnorePattern>) -> PatternResult<Self> {
888 let RootsDirsAndParents {
888 let RootsDirsAndParents {
889 roots,
889 roots,
890 dirs,
890 dirs,
891 parents,
891 parents,
892 } = roots_dirs_and_parents(&ignore_patterns)?;
892 } = roots_dirs_and_parents(&ignore_patterns)?;
893 let prefix = ignore_patterns.iter().all(|k| match k.syntax {
893 let prefix = ignore_patterns.iter().all(|k| match k.syntax {
894 PatternSyntax::Path | PatternSyntax::RelPath => true,
894 PatternSyntax::Path | PatternSyntax::RelPath => true,
895 _ => false,
895 _ => false,
896 });
896 });
897 let (patterns, match_fn) = build_match(ignore_patterns)?;
897 let (patterns, match_fn) = build_match(ignore_patterns)?;
898
898
899 Ok(Self {
899 Ok(Self {
900 patterns,
900 patterns,
901 match_fn,
901 match_fn,
902 prefix,
902 prefix,
903 roots,
903 roots,
904 dirs,
904 dirs,
905 parents,
905 parents,
906 })
906 })
907 }
907 }
908
908
909 fn get_all_parents_children(&self) -> DirsChildrenMultiset {
909 fn get_all_parents_children(&self) -> DirsChildrenMultiset {
910 // TODO cache
910 // TODO cache
911 let thing = self
911 let thing = self
912 .dirs
912 .dirs
913 .iter()
913 .iter()
914 .chain(self.roots.iter())
914 .chain(self.roots.iter())
915 .chain(self.parents.iter());
915 .chain(self.parents.iter());
916 DirsChildrenMultiset::new(thing, Some(&self.parents))
916 DirsChildrenMultiset::new(thing, Some(&self.parents))
917 }
917 }
918
918
919 pub fn debug_get_patterns(&self) -> &[u8] {
919 pub fn debug_get_patterns(&self) -> &[u8] {
920 self.patterns.as_ref()
920 self.patterns.as_ref()
921 }
921 }
922 }
922 }
923
923
924 impl<'a> Display for IncludeMatcher<'a> {
924 impl<'a> Display for IncludeMatcher<'a> {
925 fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error> {
925 fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error> {
926 // XXX What about exact matches?
926 // XXX What about exact matches?
927 // I'm not sure it's worth it to clone the HashSet and keep it
927 // I'm not sure it's worth it to clone the HashSet and keep it
928 // around just in case someone wants to display the matcher, plus
928 // around just in case someone wants to display the matcher, plus
929 // it's going to be unreadable after a few entries, but we need to
929 // it's going to be unreadable after a few entries, but we need to
930 // inform in this display that exact matches are being used and are
930 // inform in this display that exact matches are being used and are
931 // (on purpose) missing from the `includes`.
931 // (on purpose) missing from the `includes`.
932 write!(
932 write!(
933 f,
933 f,
934 "IncludeMatcher(includes='{}')",
934 "IncludeMatcher(includes='{}')",
935 String::from_utf8_lossy(&self.patterns.escaped_bytes())
935 String::from_utf8_lossy(&self.patterns.escaped_bytes())
936 )
936 )
937 }
937 }
938 }
938 }
939
939
940 #[cfg(test)]
940 #[cfg(test)]
941 mod tests {
941 mod tests {
942 use super::*;
942 use super::*;
943 use pretty_assertions::assert_eq;
943 use pretty_assertions::assert_eq;
944 use std::path::Path;
944 use std::path::Path;
945
945
946 #[test]
946 #[test]
947 fn test_roots_and_dirs() {
947 fn test_roots_and_dirs() {
948 let pats = vec![
948 let pats = vec![
949 IgnorePattern::new(PatternSyntax::Glob, b"g/h/*", Path::new("")),
949 IgnorePattern::new(PatternSyntax::Glob, b"g/h/*", Path::new("")),
950 IgnorePattern::new(PatternSyntax::Glob, b"g/h", Path::new("")),
950 IgnorePattern::new(PatternSyntax::Glob, b"g/h", Path::new("")),
951 IgnorePattern::new(PatternSyntax::Glob, b"g*", Path::new("")),
951 IgnorePattern::new(PatternSyntax::Glob, b"g*", Path::new("")),
952 ];
952 ];
953 let (roots, dirs) = roots_and_dirs(&pats);
953 let (roots, dirs) = roots_and_dirs(&pats);
954
954
955 assert_eq!(
955 assert_eq!(
956 roots,
956 roots,
957 vec!(
957 vec!(
958 HgPathBuf::from_bytes(b"g/h"),
958 HgPathBuf::from_bytes(b"g/h"),
959 HgPathBuf::from_bytes(b"g/h"),
959 HgPathBuf::from_bytes(b"g/h"),
960 HgPathBuf::new()
960 HgPathBuf::new()
961 ),
961 ),
962 );
962 );
963 assert_eq!(dirs, vec!());
963 assert_eq!(dirs, vec!());
964 }
964 }
965
965
966 #[test]
966 #[test]
967 fn test_roots_dirs_and_parents() {
967 fn test_roots_dirs_and_parents() {
968 let pats = vec![
968 let pats = vec![
969 IgnorePattern::new(PatternSyntax::Glob, b"g/h/*", Path::new("")),
969 IgnorePattern::new(PatternSyntax::Glob, b"g/h/*", Path::new("")),
970 IgnorePattern::new(PatternSyntax::Glob, b"g/h", Path::new("")),
970 IgnorePattern::new(PatternSyntax::Glob, b"g/h", Path::new("")),
971 IgnorePattern::new(PatternSyntax::Glob, b"g*", Path::new("")),
971 IgnorePattern::new(PatternSyntax::Glob, b"g*", Path::new("")),
972 ];
972 ];
973
973
974 let mut roots = HashSet::new();
974 let mut roots = HashSet::new();
975 roots.insert(HgPathBuf::from_bytes(b"g/h"));
975 roots.insert(HgPathBuf::from_bytes(b"g/h"));
976 roots.insert(HgPathBuf::new());
976 roots.insert(HgPathBuf::new());
977
977
978 let dirs = HashSet::new();
978 let dirs = HashSet::new();
979
979
980 let mut parents = HashSet::new();
980 let mut parents = HashSet::new();
981 parents.insert(HgPathBuf::new());
981 parents.insert(HgPathBuf::new());
982 parents.insert(HgPathBuf::from_bytes(b"g"));
982 parents.insert(HgPathBuf::from_bytes(b"g"));
983
983
984 assert_eq!(
984 assert_eq!(
985 roots_dirs_and_parents(&pats).unwrap(),
985 roots_dirs_and_parents(&pats).unwrap(),
986 RootsDirsAndParents {
986 RootsDirsAndParents {
987 roots,
987 roots,
988 dirs,
988 dirs,
989 parents
989 parents
990 }
990 }
991 );
991 );
992 }
992 }
993
993
994 #[test]
994 #[test]
995 fn test_filematcher_visit_children_set() {
995 fn test_filematcher_visit_children_set() {
996 // Visitchildrenset
996 // Visitchildrenset
997 let files = vec![HgPathBuf::from_bytes(b"dir/subdir/foo.txt")];
997 let files = vec![HgPathBuf::from_bytes(b"dir/subdir/foo.txt")];
998 let matcher = FileMatcher::new(files).unwrap();
998 let matcher = FileMatcher::new(files).unwrap();
999
999
1000 let mut set = HashSet::new();
1000 let mut set = HashSet::new();
1001 set.insert(HgPathBuf::from_bytes(b"dir"));
1001 set.insert(HgPathBuf::from_bytes(b"dir"));
1002 assert_eq!(
1002 assert_eq!(
1003 matcher.visit_children_set(HgPath::new(b"")),
1003 matcher.visit_children_set(HgPath::new(b"")),
1004 VisitChildrenSet::Set(set)
1004 VisitChildrenSet::Set(set)
1005 );
1005 );
1006
1006
1007 let mut set = HashSet::new();
1007 let mut set = HashSet::new();
1008 set.insert(HgPathBuf::from_bytes(b"subdir"));
1008 set.insert(HgPathBuf::from_bytes(b"subdir"));
1009 assert_eq!(
1009 assert_eq!(
1010 matcher.visit_children_set(HgPath::new(b"dir")),
1010 matcher.visit_children_set(HgPath::new(b"dir")),
1011 VisitChildrenSet::Set(set)
1011 VisitChildrenSet::Set(set)
1012 );
1012 );
1013
1013
1014 let mut set = HashSet::new();
1014 let mut set = HashSet::new();
1015 set.insert(HgPathBuf::from_bytes(b"foo.txt"));
1015 set.insert(HgPathBuf::from_bytes(b"foo.txt"));
1016 assert_eq!(
1016 assert_eq!(
1017 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1017 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1018 VisitChildrenSet::Set(set)
1018 VisitChildrenSet::Set(set)
1019 );
1019 );
1020
1020
1021 assert_eq!(
1021 assert_eq!(
1022 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1022 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1023 VisitChildrenSet::Empty
1023 VisitChildrenSet::Empty
1024 );
1024 );
1025 assert_eq!(
1025 assert_eq!(
1026 matcher.visit_children_set(HgPath::new(b"dir/subdir/foo.txt")),
1026 matcher.visit_children_set(HgPath::new(b"dir/subdir/foo.txt")),
1027 VisitChildrenSet::Empty
1027 VisitChildrenSet::Empty
1028 );
1028 );
1029 assert_eq!(
1029 assert_eq!(
1030 matcher.visit_children_set(HgPath::new(b"folder")),
1030 matcher.visit_children_set(HgPath::new(b"folder")),
1031 VisitChildrenSet::Empty
1031 VisitChildrenSet::Empty
1032 );
1032 );
1033 }
1033 }
1034
1034
1035 #[test]
1035 #[test]
1036 fn test_filematcher_visit_children_set_files_and_dirs() {
1036 fn test_filematcher_visit_children_set_files_and_dirs() {
1037 let files = vec![
1037 let files = vec![
1038 HgPathBuf::from_bytes(b"rootfile.txt"),
1038 HgPathBuf::from_bytes(b"rootfile.txt"),
1039 HgPathBuf::from_bytes(b"a/file1.txt"),
1039 HgPathBuf::from_bytes(b"a/file1.txt"),
1040 HgPathBuf::from_bytes(b"a/b/file2.txt"),
1040 HgPathBuf::from_bytes(b"a/b/file2.txt"),
1041 // No file in a/b/c
1041 // No file in a/b/c
1042 HgPathBuf::from_bytes(b"a/b/c/d/file4.txt"),
1042 HgPathBuf::from_bytes(b"a/b/c/d/file4.txt"),
1043 ];
1043 ];
1044 let matcher = FileMatcher::new(files).unwrap();
1044 let matcher = FileMatcher::new(files).unwrap();
1045
1045
1046 let mut set = HashSet::new();
1046 let mut set = HashSet::new();
1047 set.insert(HgPathBuf::from_bytes(b"a"));
1047 set.insert(HgPathBuf::from_bytes(b"a"));
1048 set.insert(HgPathBuf::from_bytes(b"rootfile.txt"));
1048 set.insert(HgPathBuf::from_bytes(b"rootfile.txt"));
1049 assert_eq!(
1049 assert_eq!(
1050 matcher.visit_children_set(HgPath::new(b"")),
1050 matcher.visit_children_set(HgPath::new(b"")),
1051 VisitChildrenSet::Set(set)
1051 VisitChildrenSet::Set(set)
1052 );
1052 );
1053
1053
1054 let mut set = HashSet::new();
1054 let mut set = HashSet::new();
1055 set.insert(HgPathBuf::from_bytes(b"b"));
1055 set.insert(HgPathBuf::from_bytes(b"b"));
1056 set.insert(HgPathBuf::from_bytes(b"file1.txt"));
1056 set.insert(HgPathBuf::from_bytes(b"file1.txt"));
1057 assert_eq!(
1057 assert_eq!(
1058 matcher.visit_children_set(HgPath::new(b"a")),
1058 matcher.visit_children_set(HgPath::new(b"a")),
1059 VisitChildrenSet::Set(set)
1059 VisitChildrenSet::Set(set)
1060 );
1060 );
1061
1061
1062 let mut set = HashSet::new();
1062 let mut set = HashSet::new();
1063 set.insert(HgPathBuf::from_bytes(b"c"));
1063 set.insert(HgPathBuf::from_bytes(b"c"));
1064 set.insert(HgPathBuf::from_bytes(b"file2.txt"));
1064 set.insert(HgPathBuf::from_bytes(b"file2.txt"));
1065 assert_eq!(
1065 assert_eq!(
1066 matcher.visit_children_set(HgPath::new(b"a/b")),
1066 matcher.visit_children_set(HgPath::new(b"a/b")),
1067 VisitChildrenSet::Set(set)
1067 VisitChildrenSet::Set(set)
1068 );
1068 );
1069
1069
1070 let mut set = HashSet::new();
1070 let mut set = HashSet::new();
1071 set.insert(HgPathBuf::from_bytes(b"d"));
1071 set.insert(HgPathBuf::from_bytes(b"d"));
1072 assert_eq!(
1072 assert_eq!(
1073 matcher.visit_children_set(HgPath::new(b"a/b/c")),
1073 matcher.visit_children_set(HgPath::new(b"a/b/c")),
1074 VisitChildrenSet::Set(set)
1074 VisitChildrenSet::Set(set)
1075 );
1075 );
1076 let mut set = HashSet::new();
1076 let mut set = HashSet::new();
1077 set.insert(HgPathBuf::from_bytes(b"file4.txt"));
1077 set.insert(HgPathBuf::from_bytes(b"file4.txt"));
1078 assert_eq!(
1078 assert_eq!(
1079 matcher.visit_children_set(HgPath::new(b"a/b/c/d")),
1079 matcher.visit_children_set(HgPath::new(b"a/b/c/d")),
1080 VisitChildrenSet::Set(set)
1080 VisitChildrenSet::Set(set)
1081 );
1081 );
1082
1082
1083 assert_eq!(
1083 assert_eq!(
1084 matcher.visit_children_set(HgPath::new(b"a/b/c/d/e")),
1084 matcher.visit_children_set(HgPath::new(b"a/b/c/d/e")),
1085 VisitChildrenSet::Empty
1085 VisitChildrenSet::Empty
1086 );
1086 );
1087 assert_eq!(
1087 assert_eq!(
1088 matcher.visit_children_set(HgPath::new(b"folder")),
1088 matcher.visit_children_set(HgPath::new(b"folder")),
1089 VisitChildrenSet::Empty
1089 VisitChildrenSet::Empty
1090 );
1090 );
1091 }
1091 }
1092
1092
1093 #[test]
1093 #[test]
1094 fn test_includematcher() {
1094 fn test_includematcher() {
1095 // VisitchildrensetPrefix
1095 // VisitchildrensetPrefix
1096 let matcher = IncludeMatcher::new(vec![IgnorePattern::new(
1096 let matcher = IncludeMatcher::new(vec![IgnorePattern::new(
1097 PatternSyntax::RelPath,
1097 PatternSyntax::RelPath,
1098 b"dir/subdir",
1098 b"dir/subdir",
1099 Path::new(""),
1099 Path::new(""),
1100 )])
1100 )])
1101 .unwrap();
1101 .unwrap();
1102
1102
1103 let mut set = HashSet::new();
1103 let mut set = HashSet::new();
1104 set.insert(HgPathBuf::from_bytes(b"dir"));
1104 set.insert(HgPathBuf::from_bytes(b"dir"));
1105 assert_eq!(
1105 assert_eq!(
1106 matcher.visit_children_set(HgPath::new(b"")),
1106 matcher.visit_children_set(HgPath::new(b"")),
1107 VisitChildrenSet::Set(set)
1107 VisitChildrenSet::Set(set)
1108 );
1108 );
1109
1109
1110 let mut set = HashSet::new();
1110 let mut set = HashSet::new();
1111 set.insert(HgPathBuf::from_bytes(b"subdir"));
1111 set.insert(HgPathBuf::from_bytes(b"subdir"));
1112 assert_eq!(
1112 assert_eq!(
1113 matcher.visit_children_set(HgPath::new(b"dir")),
1113 matcher.visit_children_set(HgPath::new(b"dir")),
1114 VisitChildrenSet::Set(set)
1114 VisitChildrenSet::Set(set)
1115 );
1115 );
1116 assert_eq!(
1116 assert_eq!(
1117 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1117 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1118 VisitChildrenSet::Recursive
1118 VisitChildrenSet::Recursive
1119 );
1119 );
1120 // OPT: This should probably be 'all' if its parent is?
1120 // OPT: This should probably be 'all' if its parent is?
1121 assert_eq!(
1121 assert_eq!(
1122 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1122 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1123 VisitChildrenSet::This
1123 VisitChildrenSet::This
1124 );
1124 );
1125 assert_eq!(
1125 assert_eq!(
1126 matcher.visit_children_set(HgPath::new(b"folder")),
1126 matcher.visit_children_set(HgPath::new(b"folder")),
1127 VisitChildrenSet::Empty
1127 VisitChildrenSet::Empty
1128 );
1128 );
1129
1129
1130 // VisitchildrensetRootfilesin
1130 // VisitchildrensetRootfilesin
1131 let matcher = IncludeMatcher::new(vec![IgnorePattern::new(
1131 let matcher = IncludeMatcher::new(vec![IgnorePattern::new(
1132 PatternSyntax::RootFiles,
1132 PatternSyntax::RootFiles,
1133 b"dir/subdir",
1133 b"dir/subdir",
1134 Path::new(""),
1134 Path::new(""),
1135 )])
1135 )])
1136 .unwrap();
1136 .unwrap();
1137
1137
1138 let mut set = HashSet::new();
1138 let mut set = HashSet::new();
1139 set.insert(HgPathBuf::from_bytes(b"dir"));
1139 set.insert(HgPathBuf::from_bytes(b"dir"));
1140 assert_eq!(
1140 assert_eq!(
1141 matcher.visit_children_set(HgPath::new(b"")),
1141 matcher.visit_children_set(HgPath::new(b"")),
1142 VisitChildrenSet::Set(set)
1142 VisitChildrenSet::Set(set)
1143 );
1143 );
1144
1144
1145 let mut set = HashSet::new();
1145 let mut set = HashSet::new();
1146 set.insert(HgPathBuf::from_bytes(b"subdir"));
1146 set.insert(HgPathBuf::from_bytes(b"subdir"));
1147 assert_eq!(
1147 assert_eq!(
1148 matcher.visit_children_set(HgPath::new(b"dir")),
1148 matcher.visit_children_set(HgPath::new(b"dir")),
1149 VisitChildrenSet::Set(set)
1149 VisitChildrenSet::Set(set)
1150 );
1150 );
1151
1151
1152 assert_eq!(
1152 assert_eq!(
1153 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1153 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1154 VisitChildrenSet::This
1154 VisitChildrenSet::This
1155 );
1155 );
1156 assert_eq!(
1156 assert_eq!(
1157 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1157 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1158 VisitChildrenSet::Empty
1158 VisitChildrenSet::Empty
1159 );
1159 );
1160 assert_eq!(
1160 assert_eq!(
1161 matcher.visit_children_set(HgPath::new(b"folder")),
1161 matcher.visit_children_set(HgPath::new(b"folder")),
1162 VisitChildrenSet::Empty
1162 VisitChildrenSet::Empty
1163 );
1163 );
1164
1164
1165 // VisitchildrensetGlob
1165 // VisitchildrensetGlob
1166 let matcher = IncludeMatcher::new(vec![IgnorePattern::new(
1166 let matcher = IncludeMatcher::new(vec![IgnorePattern::new(
1167 PatternSyntax::Glob,
1167 PatternSyntax::Glob,
1168 b"dir/z*",
1168 b"dir/z*",
1169 Path::new(""),
1169 Path::new(""),
1170 )])
1170 )])
1171 .unwrap();
1171 .unwrap();
1172
1172
1173 let mut set = HashSet::new();
1173 let mut set = HashSet::new();
1174 set.insert(HgPathBuf::from_bytes(b"dir"));
1174 set.insert(HgPathBuf::from_bytes(b"dir"));
1175 assert_eq!(
1175 assert_eq!(
1176 matcher.visit_children_set(HgPath::new(b"")),
1176 matcher.visit_children_set(HgPath::new(b"")),
1177 VisitChildrenSet::Set(set)
1177 VisitChildrenSet::Set(set)
1178 );
1178 );
1179 assert_eq!(
1179 assert_eq!(
1180 matcher.visit_children_set(HgPath::new(b"folder")),
1180 matcher.visit_children_set(HgPath::new(b"folder")),
1181 VisitChildrenSet::Empty
1181 VisitChildrenSet::Empty
1182 );
1182 );
1183 assert_eq!(
1183 assert_eq!(
1184 matcher.visit_children_set(HgPath::new(b"dir")),
1184 matcher.visit_children_set(HgPath::new(b"dir")),
1185 VisitChildrenSet::This
1185 VisitChildrenSet::This
1186 );
1186 );
1187 // OPT: these should probably be set().
1187 // OPT: these should probably be set().
1188 assert_eq!(
1188 assert_eq!(
1189 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1189 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1190 VisitChildrenSet::This
1190 VisitChildrenSet::This
1191 );
1191 );
1192 assert_eq!(
1192 assert_eq!(
1193 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1193 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1194 VisitChildrenSet::This
1194 VisitChildrenSet::This
1195 );
1195 );
1196
1196
1197 // Test multiple patterns
1197 // Test multiple patterns
1198 let matcher = IncludeMatcher::new(vec![
1198 let matcher = IncludeMatcher::new(vec![
1199 IgnorePattern::new(PatternSyntax::RelPath, b"foo", Path::new("")),
1199 IgnorePattern::new(PatternSyntax::RelPath, b"foo", Path::new("")),
1200 IgnorePattern::new(PatternSyntax::Glob, b"g*", Path::new("")),
1200 IgnorePattern::new(PatternSyntax::Glob, b"g*", Path::new("")),
1201 ])
1201 ])
1202 .unwrap();
1202 .unwrap();
1203
1203
1204 assert_eq!(
1204 assert_eq!(
1205 matcher.visit_children_set(HgPath::new(b"")),
1205 matcher.visit_children_set(HgPath::new(b"")),
1206 VisitChildrenSet::This
1206 VisitChildrenSet::This
1207 );
1207 );
1208
1208
1209 // Test multiple patterns
1209 // Test multiple patterns
1210 let matcher = IncludeMatcher::new(vec![IgnorePattern::new(
1210 let matcher = IncludeMatcher::new(vec![IgnorePattern::new(
1211 PatternSyntax::Glob,
1211 PatternSyntax::Glob,
1212 b"**/*.exe",
1212 b"**/*.exe",
1213 Path::new(""),
1213 Path::new(""),
1214 )])
1214 )])
1215 .unwrap();
1215 .unwrap();
1216
1216
1217 assert_eq!(
1217 assert_eq!(
1218 matcher.visit_children_set(HgPath::new(b"")),
1218 matcher.visit_children_set(HgPath::new(b"")),
1219 VisitChildrenSet::This
1219 VisitChildrenSet::This
1220 );
1220 );
1221 }
1221 }
1222
1222
1223 #[test]
1223 #[test]
1224 fn test_unionmatcher() {
1224 fn test_unionmatcher() {
1225 // Path + Rootfiles
1225 // Path + Rootfiles
1226 let m1 = IncludeMatcher::new(vec![IgnorePattern::new(
1226 let m1 = IncludeMatcher::new(vec![IgnorePattern::new(
1227 PatternSyntax::RelPath,
1227 PatternSyntax::RelPath,
1228 b"dir/subdir",
1228 b"dir/subdir",
1229 Path::new(""),
1229 Path::new(""),
1230 )])
1230 )])
1231 .unwrap();
1231 .unwrap();
1232 let m2 = IncludeMatcher::new(vec![IgnorePattern::new(
1232 let m2 = IncludeMatcher::new(vec![IgnorePattern::new(
1233 PatternSyntax::RootFiles,
1233 PatternSyntax::RootFiles,
1234 b"dir",
1234 b"dir",
1235 Path::new(""),
1235 Path::new(""),
1236 )])
1236 )])
1237 .unwrap();
1237 .unwrap();
1238 let matcher = UnionMatcher::new(vec![Box::new(m1), Box::new(m2)]);
1238 let matcher = UnionMatcher::new(vec![Box::new(m1), Box::new(m2)]);
1239
1239
1240 let mut set = HashSet::new();
1240 let mut set = HashSet::new();
1241 set.insert(HgPathBuf::from_bytes(b"dir"));
1241 set.insert(HgPathBuf::from_bytes(b"dir"));
1242 assert_eq!(
1242 assert_eq!(
1243 matcher.visit_children_set(HgPath::new(b"")),
1243 matcher.visit_children_set(HgPath::new(b"")),
1244 VisitChildrenSet::Set(set)
1244 VisitChildrenSet::Set(set)
1245 );
1245 );
1246 assert_eq!(
1246 assert_eq!(
1247 matcher.visit_children_set(HgPath::new(b"dir")),
1247 matcher.visit_children_set(HgPath::new(b"dir")),
1248 VisitChildrenSet::This
1248 VisitChildrenSet::This
1249 );
1249 );
1250 assert_eq!(
1250 assert_eq!(
1251 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1251 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1252 VisitChildrenSet::Recursive
1252 VisitChildrenSet::Recursive
1253 );
1253 );
1254 assert_eq!(
1254 assert_eq!(
1255 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1255 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1256 VisitChildrenSet::Empty
1256 VisitChildrenSet::Empty
1257 );
1257 );
1258 assert_eq!(
1258 assert_eq!(
1259 matcher.visit_children_set(HgPath::new(b"folder")),
1259 matcher.visit_children_set(HgPath::new(b"folder")),
1260 VisitChildrenSet::Empty
1260 VisitChildrenSet::Empty
1261 );
1261 );
1262 assert_eq!(
1262 assert_eq!(
1263 matcher.visit_children_set(HgPath::new(b"folder")),
1263 matcher.visit_children_set(HgPath::new(b"folder")),
1264 VisitChildrenSet::Empty
1264 VisitChildrenSet::Empty
1265 );
1265 );
1266
1266
1267 // OPT: These next two could be 'all' instead of 'this'.
1267 // OPT: These next two could be 'all' instead of 'this'.
1268 assert_eq!(
1268 assert_eq!(
1269 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1269 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1270 VisitChildrenSet::This
1270 VisitChildrenSet::This
1271 );
1271 );
1272 assert_eq!(
1272 assert_eq!(
1273 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1273 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1274 VisitChildrenSet::This
1274 VisitChildrenSet::This
1275 );
1275 );
1276
1276
1277 // Path + unrelated Path
1277 // Path + unrelated Path
1278 let m1 = IncludeMatcher::new(vec![IgnorePattern::new(
1278 let m1 = IncludeMatcher::new(vec![IgnorePattern::new(
1279 PatternSyntax::RelPath,
1279 PatternSyntax::RelPath,
1280 b"dir/subdir",
1280 b"dir/subdir",
1281 Path::new(""),
1281 Path::new(""),
1282 )])
1282 )])
1283 .unwrap();
1283 .unwrap();
1284 let m2 = IncludeMatcher::new(vec![IgnorePattern::new(
1284 let m2 = IncludeMatcher::new(vec![IgnorePattern::new(
1285 PatternSyntax::RelPath,
1285 PatternSyntax::RelPath,
1286 b"folder",
1286 b"folder",
1287 Path::new(""),
1287 Path::new(""),
1288 )])
1288 )])
1289 .unwrap();
1289 .unwrap();
1290 let matcher = UnionMatcher::new(vec![Box::new(m1), Box::new(m2)]);
1290 let matcher = UnionMatcher::new(vec![Box::new(m1), Box::new(m2)]);
1291
1291
1292 let mut set = HashSet::new();
1292 let mut set = HashSet::new();
1293 set.insert(HgPathBuf::from_bytes(b"folder"));
1293 set.insert(HgPathBuf::from_bytes(b"folder"));
1294 set.insert(HgPathBuf::from_bytes(b"dir"));
1294 set.insert(HgPathBuf::from_bytes(b"dir"));
1295 assert_eq!(
1295 assert_eq!(
1296 matcher.visit_children_set(HgPath::new(b"")),
1296 matcher.visit_children_set(HgPath::new(b"")),
1297 VisitChildrenSet::Set(set)
1297 VisitChildrenSet::Set(set)
1298 );
1298 );
1299 let mut set = HashSet::new();
1299 let mut set = HashSet::new();
1300 set.insert(HgPathBuf::from_bytes(b"subdir"));
1300 set.insert(HgPathBuf::from_bytes(b"subdir"));
1301 assert_eq!(
1301 assert_eq!(
1302 matcher.visit_children_set(HgPath::new(b"dir")),
1302 matcher.visit_children_set(HgPath::new(b"dir")),
1303 VisitChildrenSet::Set(set)
1303 VisitChildrenSet::Set(set)
1304 );
1304 );
1305
1305
1306 assert_eq!(
1306 assert_eq!(
1307 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1307 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1308 VisitChildrenSet::Recursive
1308 VisitChildrenSet::Recursive
1309 );
1309 );
1310 assert_eq!(
1310 assert_eq!(
1311 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1311 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1312 VisitChildrenSet::Empty
1312 VisitChildrenSet::Empty
1313 );
1313 );
1314
1314
1315 assert_eq!(
1315 assert_eq!(
1316 matcher.visit_children_set(HgPath::new(b"folder")),
1316 matcher.visit_children_set(HgPath::new(b"folder")),
1317 VisitChildrenSet::Recursive
1317 VisitChildrenSet::Recursive
1318 );
1318 );
1319 // OPT: These next two could be 'all' instead of 'this'.
1319 // OPT: These next two could be 'all' instead of 'this'.
1320 assert_eq!(
1320 assert_eq!(
1321 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1321 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1322 VisitChildrenSet::This
1322 VisitChildrenSet::This
1323 );
1323 );
1324 assert_eq!(
1324 assert_eq!(
1325 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1325 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1326 VisitChildrenSet::This
1326 VisitChildrenSet::This
1327 );
1327 );
1328
1328
1329 // Path + subpath
1329 // Path + subpath
1330 let m1 = IncludeMatcher::new(vec![IgnorePattern::new(
1330 let m1 = IncludeMatcher::new(vec![IgnorePattern::new(
1331 PatternSyntax::RelPath,
1331 PatternSyntax::RelPath,
1332 b"dir/subdir/x",
1332 b"dir/subdir/x",
1333 Path::new(""),
1333 Path::new(""),
1334 )])
1334 )])
1335 .unwrap();
1335 .unwrap();
1336 let m2 = IncludeMatcher::new(vec![IgnorePattern::new(
1336 let m2 = IncludeMatcher::new(vec![IgnorePattern::new(
1337 PatternSyntax::RelPath,
1337 PatternSyntax::RelPath,
1338 b"dir/subdir",
1338 b"dir/subdir",
1339 Path::new(""),
1339 Path::new(""),
1340 )])
1340 )])
1341 .unwrap();
1341 .unwrap();
1342 let matcher = UnionMatcher::new(vec![Box::new(m1), Box::new(m2)]);
1342 let matcher = UnionMatcher::new(vec![Box::new(m1), Box::new(m2)]);
1343
1343
1344 let mut set = HashSet::new();
1344 let mut set = HashSet::new();
1345 set.insert(HgPathBuf::from_bytes(b"dir"));
1345 set.insert(HgPathBuf::from_bytes(b"dir"));
1346 assert_eq!(
1346 assert_eq!(
1347 matcher.visit_children_set(HgPath::new(b"")),
1347 matcher.visit_children_set(HgPath::new(b"")),
1348 VisitChildrenSet::Set(set)
1348 VisitChildrenSet::Set(set)
1349 );
1349 );
1350 let mut set = HashSet::new();
1350 let mut set = HashSet::new();
1351 set.insert(HgPathBuf::from_bytes(b"subdir"));
1351 set.insert(HgPathBuf::from_bytes(b"subdir"));
1352 assert_eq!(
1352 assert_eq!(
1353 matcher.visit_children_set(HgPath::new(b"dir")),
1353 matcher.visit_children_set(HgPath::new(b"dir")),
1354 VisitChildrenSet::Set(set)
1354 VisitChildrenSet::Set(set)
1355 );
1355 );
1356
1356
1357 assert_eq!(
1357 assert_eq!(
1358 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1358 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1359 VisitChildrenSet::Recursive
1359 VisitChildrenSet::Recursive
1360 );
1360 );
1361 assert_eq!(
1361 assert_eq!(
1362 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1362 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1363 VisitChildrenSet::Empty
1363 VisitChildrenSet::Empty
1364 );
1364 );
1365
1365
1366 assert_eq!(
1366 assert_eq!(
1367 matcher.visit_children_set(HgPath::new(b"folder")),
1367 matcher.visit_children_set(HgPath::new(b"folder")),
1368 VisitChildrenSet::Empty
1368 VisitChildrenSet::Empty
1369 );
1369 );
1370 assert_eq!(
1370 assert_eq!(
1371 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1371 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1372 VisitChildrenSet::Recursive
1372 VisitChildrenSet::Recursive
1373 );
1373 );
1374 // OPT: this should probably be 'all' not 'this'.
1374 // OPT: this should probably be 'all' not 'this'.
1375 assert_eq!(
1375 assert_eq!(
1376 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1376 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1377 VisitChildrenSet::This
1377 VisitChildrenSet::This
1378 );
1378 );
1379 }
1379 }
1380
1380
1381 #[test]
1381 #[test]
1382 fn test_intersectionmatcher() {
1382 fn test_intersectionmatcher() {
1383 // Include path + Include rootfiles
1383 // Include path + Include rootfiles
1384 let m1 = Box::new(
1384 let m1 = Box::new(
1385 IncludeMatcher::new(vec![IgnorePattern::new(
1385 IncludeMatcher::new(vec![IgnorePattern::new(
1386 PatternSyntax::RelPath,
1386 PatternSyntax::RelPath,
1387 b"dir/subdir",
1387 b"dir/subdir",
1388 Path::new(""),
1388 Path::new(""),
1389 )])
1389 )])
1390 .unwrap(),
1390 .unwrap(),
1391 );
1391 );
1392 let m2 = Box::new(
1392 let m2 = Box::new(
1393 IncludeMatcher::new(vec![IgnorePattern::new(
1393 IncludeMatcher::new(vec![IgnorePattern::new(
1394 PatternSyntax::RootFiles,
1394 PatternSyntax::RootFiles,
1395 b"dir",
1395 b"dir",
1396 Path::new(""),
1396 Path::new(""),
1397 )])
1397 )])
1398 .unwrap(),
1398 .unwrap(),
1399 );
1399 );
1400 let matcher = IntersectionMatcher::new(m1, m2);
1400 let matcher = IntersectionMatcher::new(m1, m2);
1401
1401
1402 let mut set = HashSet::new();
1402 let mut set = HashSet::new();
1403 set.insert(HgPathBuf::from_bytes(b"dir"));
1403 set.insert(HgPathBuf::from_bytes(b"dir"));
1404 assert_eq!(
1404 assert_eq!(
1405 matcher.visit_children_set(HgPath::new(b"")),
1405 matcher.visit_children_set(HgPath::new(b"")),
1406 VisitChildrenSet::Set(set)
1406 VisitChildrenSet::Set(set)
1407 );
1407 );
1408 assert_eq!(
1408 assert_eq!(
1409 matcher.visit_children_set(HgPath::new(b"dir")),
1409 matcher.visit_children_set(HgPath::new(b"dir")),
1410 VisitChildrenSet::This
1410 VisitChildrenSet::This
1411 );
1411 );
1412 assert_eq!(
1412 assert_eq!(
1413 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1413 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1414 VisitChildrenSet::Empty
1414 VisitChildrenSet::Empty
1415 );
1415 );
1416 assert_eq!(
1416 assert_eq!(
1417 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1417 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1418 VisitChildrenSet::Empty
1418 VisitChildrenSet::Empty
1419 );
1419 );
1420 assert_eq!(
1420 assert_eq!(
1421 matcher.visit_children_set(HgPath::new(b"folder")),
1421 matcher.visit_children_set(HgPath::new(b"folder")),
1422 VisitChildrenSet::Empty
1422 VisitChildrenSet::Empty
1423 );
1423 );
1424 assert_eq!(
1424 assert_eq!(
1425 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1425 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1426 VisitChildrenSet::Empty
1426 VisitChildrenSet::Empty
1427 );
1427 );
1428 assert_eq!(
1428 assert_eq!(
1429 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1429 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1430 VisitChildrenSet::Empty
1430 VisitChildrenSet::Empty
1431 );
1431 );
1432
1432
1433 // Non intersecting paths
1433 // Non intersecting paths
1434 let m1 = Box::new(
1434 let m1 = Box::new(
1435 IncludeMatcher::new(vec![IgnorePattern::new(
1435 IncludeMatcher::new(vec![IgnorePattern::new(
1436 PatternSyntax::RelPath,
1436 PatternSyntax::RelPath,
1437 b"dir/subdir",
1437 b"dir/subdir",
1438 Path::new(""),
1438 Path::new(""),
1439 )])
1439 )])
1440 .unwrap(),
1440 .unwrap(),
1441 );
1441 );
1442 let m2 = Box::new(
1442 let m2 = Box::new(
1443 IncludeMatcher::new(vec![IgnorePattern::new(
1443 IncludeMatcher::new(vec![IgnorePattern::new(
1444 PatternSyntax::RelPath,
1444 PatternSyntax::RelPath,
1445 b"folder",
1445 b"folder",
1446 Path::new(""),
1446 Path::new(""),
1447 )])
1447 )])
1448 .unwrap(),
1448 .unwrap(),
1449 );
1449 );
1450 let matcher = IntersectionMatcher::new(m1, m2);
1450 let matcher = IntersectionMatcher::new(m1, m2);
1451
1451
1452 assert_eq!(
1452 assert_eq!(
1453 matcher.visit_children_set(HgPath::new(b"")),
1453 matcher.visit_children_set(HgPath::new(b"")),
1454 VisitChildrenSet::Empty
1454 VisitChildrenSet::Empty
1455 );
1455 );
1456 assert_eq!(
1456 assert_eq!(
1457 matcher.visit_children_set(HgPath::new(b"dir")),
1457 matcher.visit_children_set(HgPath::new(b"dir")),
1458 VisitChildrenSet::Empty
1458 VisitChildrenSet::Empty
1459 );
1459 );
1460 assert_eq!(
1460 assert_eq!(
1461 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1461 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1462 VisitChildrenSet::Empty
1462 VisitChildrenSet::Empty
1463 );
1463 );
1464 assert_eq!(
1464 assert_eq!(
1465 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1465 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1466 VisitChildrenSet::Empty
1466 VisitChildrenSet::Empty
1467 );
1467 );
1468 assert_eq!(
1468 assert_eq!(
1469 matcher.visit_children_set(HgPath::new(b"folder")),
1469 matcher.visit_children_set(HgPath::new(b"folder")),
1470 VisitChildrenSet::Empty
1470 VisitChildrenSet::Empty
1471 );
1471 );
1472 assert_eq!(
1472 assert_eq!(
1473 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1473 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1474 VisitChildrenSet::Empty
1474 VisitChildrenSet::Empty
1475 );
1475 );
1476 assert_eq!(
1476 assert_eq!(
1477 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1477 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1478 VisitChildrenSet::Empty
1478 VisitChildrenSet::Empty
1479 );
1479 );
1480
1480
1481 // Nested paths
1481 // Nested paths
1482 let m1 = Box::new(
1482 let m1 = Box::new(
1483 IncludeMatcher::new(vec![IgnorePattern::new(
1483 IncludeMatcher::new(vec![IgnorePattern::new(
1484 PatternSyntax::RelPath,
1484 PatternSyntax::RelPath,
1485 b"dir/subdir/x",
1485 b"dir/subdir/x",
1486 Path::new(""),
1486 Path::new(""),
1487 )])
1487 )])
1488 .unwrap(),
1488 .unwrap(),
1489 );
1489 );
1490 let m2 = Box::new(
1490 let m2 = Box::new(
1491 IncludeMatcher::new(vec![IgnorePattern::new(
1491 IncludeMatcher::new(vec![IgnorePattern::new(
1492 PatternSyntax::RelPath,
1492 PatternSyntax::RelPath,
1493 b"dir/subdir",
1493 b"dir/subdir",
1494 Path::new(""),
1494 Path::new(""),
1495 )])
1495 )])
1496 .unwrap(),
1496 .unwrap(),
1497 );
1497 );
1498 let matcher = IntersectionMatcher::new(m1, m2);
1498 let matcher = IntersectionMatcher::new(m1, m2);
1499
1499
1500 let mut set = HashSet::new();
1500 let mut set = HashSet::new();
1501 set.insert(HgPathBuf::from_bytes(b"dir"));
1501 set.insert(HgPathBuf::from_bytes(b"dir"));
1502 assert_eq!(
1502 assert_eq!(
1503 matcher.visit_children_set(HgPath::new(b"")),
1503 matcher.visit_children_set(HgPath::new(b"")),
1504 VisitChildrenSet::Set(set)
1504 VisitChildrenSet::Set(set)
1505 );
1505 );
1506
1506
1507 let mut set = HashSet::new();
1507 let mut set = HashSet::new();
1508 set.insert(HgPathBuf::from_bytes(b"subdir"));
1508 set.insert(HgPathBuf::from_bytes(b"subdir"));
1509 assert_eq!(
1509 assert_eq!(
1510 matcher.visit_children_set(HgPath::new(b"dir")),
1510 matcher.visit_children_set(HgPath::new(b"dir")),
1511 VisitChildrenSet::Set(set)
1511 VisitChildrenSet::Set(set)
1512 );
1512 );
1513 let mut set = HashSet::new();
1513 let mut set = HashSet::new();
1514 set.insert(HgPathBuf::from_bytes(b"x"));
1514 set.insert(HgPathBuf::from_bytes(b"x"));
1515 assert_eq!(
1515 assert_eq!(
1516 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1516 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1517 VisitChildrenSet::Set(set)
1517 VisitChildrenSet::Set(set)
1518 );
1518 );
1519 assert_eq!(
1519 assert_eq!(
1520 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1520 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1521 VisitChildrenSet::Empty
1521 VisitChildrenSet::Empty
1522 );
1522 );
1523 assert_eq!(
1523 assert_eq!(
1524 matcher.visit_children_set(HgPath::new(b"folder")),
1524 matcher.visit_children_set(HgPath::new(b"folder")),
1525 VisitChildrenSet::Empty
1525 VisitChildrenSet::Empty
1526 );
1526 );
1527 assert_eq!(
1527 assert_eq!(
1528 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1528 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1529 VisitChildrenSet::Empty
1529 VisitChildrenSet::Empty
1530 );
1530 );
1531 // OPT: this should probably be 'all' not 'this'.
1531 // OPT: this should probably be 'all' not 'this'.
1532 assert_eq!(
1532 assert_eq!(
1533 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1533 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1534 VisitChildrenSet::This
1534 VisitChildrenSet::This
1535 );
1535 );
1536
1536
1537 // Diverging paths
1537 // Diverging paths
1538 let m1 = Box::new(
1538 let m1 = Box::new(
1539 IncludeMatcher::new(vec![IgnorePattern::new(
1539 IncludeMatcher::new(vec![IgnorePattern::new(
1540 PatternSyntax::RelPath,
1540 PatternSyntax::RelPath,
1541 b"dir/subdir/x",
1541 b"dir/subdir/x",
1542 Path::new(""),
1542 Path::new(""),
1543 )])
1543 )])
1544 .unwrap(),
1544 .unwrap(),
1545 );
1545 );
1546 let m2 = Box::new(
1546 let m2 = Box::new(
1547 IncludeMatcher::new(vec![IgnorePattern::new(
1547 IncludeMatcher::new(vec![IgnorePattern::new(
1548 PatternSyntax::RelPath,
1548 PatternSyntax::RelPath,
1549 b"dir/subdir/z",
1549 b"dir/subdir/z",
1550 Path::new(""),
1550 Path::new(""),
1551 )])
1551 )])
1552 .unwrap(),
1552 .unwrap(),
1553 );
1553 );
1554 let matcher = IntersectionMatcher::new(m1, m2);
1554 let matcher = IntersectionMatcher::new(m1, m2);
1555
1555
1556 // OPT: these next two could probably be Empty as well.
1556 // OPT: these next two could probably be Empty as well.
1557 let mut set = HashSet::new();
1557 let mut set = HashSet::new();
1558 set.insert(HgPathBuf::from_bytes(b"dir"));
1558 set.insert(HgPathBuf::from_bytes(b"dir"));
1559 assert_eq!(
1559 assert_eq!(
1560 matcher.visit_children_set(HgPath::new(b"")),
1560 matcher.visit_children_set(HgPath::new(b"")),
1561 VisitChildrenSet::Set(set)
1561 VisitChildrenSet::Set(set)
1562 );
1562 );
1563 // OPT: these next two could probably be Empty as well.
1563 // OPT: these next two could probably be Empty as well.
1564 let mut set = HashSet::new();
1564 let mut set = HashSet::new();
1565 set.insert(HgPathBuf::from_bytes(b"subdir"));
1565 set.insert(HgPathBuf::from_bytes(b"subdir"));
1566 assert_eq!(
1566 assert_eq!(
1567 matcher.visit_children_set(HgPath::new(b"dir")),
1567 matcher.visit_children_set(HgPath::new(b"dir")),
1568 VisitChildrenSet::Set(set)
1568 VisitChildrenSet::Set(set)
1569 );
1569 );
1570 assert_eq!(
1570 assert_eq!(
1571 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1571 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1572 VisitChildrenSet::Empty
1572 VisitChildrenSet::Empty
1573 );
1573 );
1574 assert_eq!(
1574 assert_eq!(
1575 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1575 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1576 VisitChildrenSet::Empty
1576 VisitChildrenSet::Empty
1577 );
1577 );
1578 assert_eq!(
1578 assert_eq!(
1579 matcher.visit_children_set(HgPath::new(b"folder")),
1579 matcher.visit_children_set(HgPath::new(b"folder")),
1580 VisitChildrenSet::Empty
1580 VisitChildrenSet::Empty
1581 );
1581 );
1582 assert_eq!(
1582 assert_eq!(
1583 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1583 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1584 VisitChildrenSet::Empty
1584 VisitChildrenSet::Empty
1585 );
1585 );
1586 assert_eq!(
1586 assert_eq!(
1587 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1587 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1588 VisitChildrenSet::Empty
1588 VisitChildrenSet::Empty
1589 );
1589 );
1590 }
1590 }
1591
1591
1592 #[test]
1592 #[test]
1593 fn test_differencematcher() {
1593 fn test_differencematcher() {
1594 // Two alwaysmatchers should function like a nevermatcher
1594 // Two alwaysmatchers should function like a nevermatcher
1595 let m1 = AlwaysMatcher;
1595 let m1 = AlwaysMatcher;
1596 let m2 = AlwaysMatcher;
1596 let m2 = AlwaysMatcher;
1597 let matcher = DifferenceMatcher::new(Box::new(m1), Box::new(m2));
1597 let matcher = DifferenceMatcher::new(Box::new(m1), Box::new(m2));
1598
1598
1599 for case in &[
1599 for case in &[
1600 &b""[..],
1600 &b""[..],
1601 b"dir",
1601 b"dir",
1602 b"dir/subdir",
1602 b"dir/subdir",
1603 b"dir/subdir/z",
1603 b"dir/subdir/z",
1604 b"dir/foo",
1604 b"dir/foo",
1605 b"dir/subdir/x",
1605 b"dir/subdir/x",
1606 b"folder",
1606 b"folder",
1607 ] {
1607 ] {
1608 assert_eq!(
1608 assert_eq!(
1609 matcher.visit_children_set(HgPath::new(case)),
1609 matcher.visit_children_set(HgPath::new(case)),
1610 VisitChildrenSet::Empty
1610 VisitChildrenSet::Empty
1611 );
1611 );
1612 }
1612 }
1613
1613
1614 // One always and one never should behave the same as an always
1614 // One always and one never should behave the same as an always
1615 let m1 = AlwaysMatcher;
1615 let m1 = AlwaysMatcher;
1616 let m2 = NeverMatcher;
1616 let m2 = NeverMatcher;
1617 let matcher = DifferenceMatcher::new(Box::new(m1), Box::new(m2));
1617 let matcher = DifferenceMatcher::new(Box::new(m1), Box::new(m2));
1618
1618
1619 for case in &[
1619 for case in &[
1620 &b""[..],
1620 &b""[..],
1621 b"dir",
1621 b"dir",
1622 b"dir/subdir",
1622 b"dir/subdir",
1623 b"dir/subdir/z",
1623 b"dir/subdir/z",
1624 b"dir/foo",
1624 b"dir/foo",
1625 b"dir/subdir/x",
1625 b"dir/subdir/x",
1626 b"folder",
1626 b"folder",
1627 ] {
1627 ] {
1628 assert_eq!(
1628 assert_eq!(
1629 matcher.visit_children_set(HgPath::new(case)),
1629 matcher.visit_children_set(HgPath::new(case)),
1630 VisitChildrenSet::Recursive
1630 VisitChildrenSet::Recursive
1631 );
1631 );
1632 }
1632 }
1633
1633
1634 // Two include matchers
1634 // Two include matchers
1635 let m1 = Box::new(
1635 let m1 = Box::new(
1636 IncludeMatcher::new(vec![IgnorePattern::new(
1636 IncludeMatcher::new(vec![IgnorePattern::new(
1637 PatternSyntax::RelPath,
1637 PatternSyntax::RelPath,
1638 b"dir/subdir",
1638 b"dir/subdir",
1639 Path::new("/repo"),
1639 Path::new("/repo"),
1640 )])
1640 )])
1641 .unwrap(),
1641 .unwrap(),
1642 );
1642 );
1643 let m2 = Box::new(
1643 let m2 = Box::new(
1644 IncludeMatcher::new(vec![IgnorePattern::new(
1644 IncludeMatcher::new(vec![IgnorePattern::new(
1645 PatternSyntax::RootFiles,
1645 PatternSyntax::RootFiles,
1646 b"dir",
1646 b"dir",
1647 Path::new("/repo"),
1647 Path::new("/repo"),
1648 )])
1648 )])
1649 .unwrap(),
1649 .unwrap(),
1650 );
1650 );
1651
1651
1652 let matcher = DifferenceMatcher::new(m1, m2);
1652 let matcher = DifferenceMatcher::new(m1, m2);
1653
1653
1654 let mut set = HashSet::new();
1654 let mut set = HashSet::new();
1655 set.insert(HgPathBuf::from_bytes(b"dir"));
1655 set.insert(HgPathBuf::from_bytes(b"dir"));
1656 assert_eq!(
1656 assert_eq!(
1657 matcher.visit_children_set(HgPath::new(b"")),
1657 matcher.visit_children_set(HgPath::new(b"")),
1658 VisitChildrenSet::Set(set)
1658 VisitChildrenSet::Set(set)
1659 );
1659 );
1660
1660
1661 let mut set = HashSet::new();
1661 let mut set = HashSet::new();
1662 set.insert(HgPathBuf::from_bytes(b"subdir"));
1662 set.insert(HgPathBuf::from_bytes(b"subdir"));
1663 assert_eq!(
1663 assert_eq!(
1664 matcher.visit_children_set(HgPath::new(b"dir")),
1664 matcher.visit_children_set(HgPath::new(b"dir")),
1665 VisitChildrenSet::Set(set)
1665 VisitChildrenSet::Set(set)
1666 );
1666 );
1667 assert_eq!(
1667 assert_eq!(
1668 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1668 matcher.visit_children_set(HgPath::new(b"dir/subdir")),
1669 VisitChildrenSet::Recursive
1669 VisitChildrenSet::Recursive
1670 );
1670 );
1671 assert_eq!(
1671 assert_eq!(
1672 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1672 matcher.visit_children_set(HgPath::new(b"dir/foo")),
1673 VisitChildrenSet::Empty
1673 VisitChildrenSet::Empty
1674 );
1674 );
1675 assert_eq!(
1675 assert_eq!(
1676 matcher.visit_children_set(HgPath::new(b"folder")),
1676 matcher.visit_children_set(HgPath::new(b"folder")),
1677 VisitChildrenSet::Empty
1677 VisitChildrenSet::Empty
1678 );
1678 );
1679 assert_eq!(
1679 assert_eq!(
1680 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1680 matcher.visit_children_set(HgPath::new(b"dir/subdir/z")),
1681 VisitChildrenSet::This
1681 VisitChildrenSet::This
1682 );
1682 );
1683 assert_eq!(
1683 assert_eq!(
1684 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1684 matcher.visit_children_set(HgPath::new(b"dir/subdir/x")),
1685 VisitChildrenSet::This
1685 VisitChildrenSet::This
1686 );
1686 );
1687 }
1687 }
1688 }
1688 }
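Taken together, the tests above pin down the `visit_children_set` contract: `Empty` means nothing under the queried directory can match, `Recursive` means everything under it matches, `Set(children)` names the only child entries worth descending into, and `This` means the directory itself is relevant but each child still has to be checked individually. The following standalone sketch (not part of this change) exercises that contract with the same constructors the tests use; the `hg::matchers`, `hg::filepatterns` and `hg::utils::hg_path` module paths are assumed from hg-core and may differ between versions.

use std::path::Path;

use hg::filepatterns::{IgnorePattern, PatternSyntax};
use hg::matchers::{IncludeMatcher, Matcher, UnionMatcher, VisitChildrenSet};
use hg::utils::hg_path::HgPath;

fn main() {
    // Same pattern setup as the "Path + unrelated Path" test case above.
    let m1 = IncludeMatcher::new(vec![IgnorePattern::new(
        PatternSyntax::RelPath,
        b"dir/subdir",
        Path::new(""),
    )])
    .unwrap();
    let m2 = IncludeMatcher::new(vec![IgnorePattern::new(
        PatternSyntax::RelPath,
        b"folder",
        Path::new(""),
    )])
    .unwrap();
    let matcher = UnionMatcher::new(vec![Box::new(m1), Box::new(m2)]);

    for dir in &[
        &b""[..],
        b"dir",
        b"dir/subdir",
        b"dir/subdir/x",
        b"dir/foo",
        b"folder",
    ] {
        let name = String::from_utf8_lossy(dir);
        match matcher.visit_children_set(HgPath::new(dir)) {
            // Nothing under this directory can match: a walk skips the subtree.
            VisitChildrenSet::Empty => println!("{}: skip", name),
            // Everything under this directory matches: no further filtering.
            VisitChildrenSet::Recursive => println!("{}: take everything", name),
            // Only these named children can contain matches.
            VisitChildrenSet::Set(children) => {
                println!("{}: descend into {:?}", name, children)
            }
            // `This`: the directory itself is relevant, but each child must
            // still be checked one by one.
            other => println!("{}: {:?}", name, other),
        }
    }
}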
@@ -1,40 +1,40
1 use crate::error::CommandError;
1 use crate::error::CommandError;
2 use clap::SubCommand;
2 use clap::SubCommand;
3 use hg;
3 use hg;
4 use hg::matchers::get_ignore_matcher;
4 use hg::matchers::get_ignore_matcher;
5 use hg::StatusError;
5 use hg::StatusError;
6 use log::warn;
6 use log::warn;
7
7
8 pub const HELP_TEXT: &str = "
8 pub const HELP_TEXT: &str = "
9 Show effective hgignore patterns used by rhg.
9 Show effective hgignore patterns used by rhg.
10
10
11 This is a pure Rust version of `hg debugignore`.
11 This is a pure Rust version of `hg debugignore`.
12
12
13 Some options might be missing, check the list below.
13 Some options might be missing, check the list below.
14 ";
14 ";
15
15
16 pub fn args() -> clap::App<'static, 'static> {
16 pub fn args() -> clap::App<'static, 'static> {
17 SubCommand::with_name("debugignorerhg").about(HELP_TEXT)
17 SubCommand::with_name("debugignorerhg").about(HELP_TEXT)
18 }
18 }
19
19
20 pub fn run(invocation: &crate::CliInvocation) -> Result<(), CommandError> {
20 pub fn run(invocation: &crate::CliInvocation) -> Result<(), CommandError> {
21 let repo = invocation.repo?;
21 let repo = invocation.repo?;
22
22
23 let ignore_file = repo.working_directory_vfs().join(".hgignore"); // TODO hardcoded
23 let ignore_file = repo.working_directory_vfs().join(".hgignore"); // TODO hardcoded
24
24
25 let (ignore_matcher, warnings) = get_ignore_matcher(
25 let (ignore_matcher, warnings) = get_ignore_matcher(
26 vec![ignore_file],
26 vec![ignore_file],
27 &repo.working_directory_path().to_owned(),
27 &repo.working_directory_path().to_owned(),
28 &mut |_pattern_bytes| (),
28 &mut |_source, _pattern_bytes| (),
29 )
29 )
30 .map_err(|e| StatusError::from(e))?;
30 .map_err(|e| StatusError::from(e))?;
31
31
32 if !warnings.is_empty() {
32 if !warnings.is_empty() {
33 warn!("Pattern warnings: {:?}", &warnings);
33 warn!("Pattern warnings: {:?}", &warnings);
34 }
34 }
35
35
36 let patterns = ignore_matcher.debug_get_patterns();
36 let patterns = ignore_matcher.debug_get_patterns();
37 invocation.ui.write_stdout(patterns)?;
37 invocation.ui.write_stdout(patterns)?;
38 invocation.ui.write_stdout(b"\n")?;
38 invocation.ui.write_stdout(b"\n")?;
39 Ok(())
39 Ok(())
40 }
40 }
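The closure passed to `get_ignore_matcher` is where this change surfaces in rhg: it now receives the source of each chunk of ignore patterns in addition to the raw bytes, which is what lets callers (notably the dirstate-v2 docket hash) tell identical patterns coming from different files apart. A hypothetical variant of the command above, shown only as a sketch, reusing the same imports and assuming `source` is a `&Path`, could use that extra argument like this:

pub fn run_listing_sources(
    invocation: &crate::CliInvocation,
) -> Result<(), CommandError> {
    let repo = invocation.repo?;
    let ignore_file = repo.working_directory_vfs().join(".hgignore"); // TODO hardcoded

    // Collect one line per ignore file seen while building the matcher.
    let mut seen: Vec<String> = Vec::new();
    let (ignore_matcher, warnings) = get_ignore_matcher(
        vec![ignore_file],
        &repo.working_directory_path().to_owned(),
        &mut |source, pattern_bytes| {
            // `source` is the file these pattern bytes were read from
            // (for example an `include:`d or `subinclude:`d file).
            seen.push(format!(
                "{}: {} bytes of patterns",
                source.display(),
                pattern_bytes.len()
            ));
        },
    )
    .map_err(StatusError::from)?;

    if !warnings.is_empty() {
        warn!("Pattern warnings: {:?}", &warnings);
    }
    for line in &seen {
        invocation.ui.write_stdout(line.as_bytes())?;
        invocation.ui.write_stdout(b"\n")?;
    }
    invocation.ui.write_stdout(ignore_matcher.debug_get_patterns())?;
    invocation.ui.write_stdout(b"\n")?;
    Ok(())
}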
@@ -1,465 +1,471
1 #testcases dirstate-v1 dirstate-v2
1 #testcases dirstate-v1 dirstate-v2
2
2
3 #if dirstate-v2
3 #if dirstate-v2
4 $ cat >> $HGRCPATH << EOF
4 $ cat >> $HGRCPATH << EOF
5 > [format]
5 > [format]
6 > use-dirstate-v2=1
6 > use-dirstate-v2=1
7 > [storage]
7 > [storage]
8 > dirstate-v2.slow-path=allow
8 > dirstate-v2.slow-path=allow
9 > EOF
9 > EOF
10 #endif
10 #endif
11
11
12 $ hg init ignorerepo
12 $ hg init ignorerepo
13 $ cd ignorerepo
13 $ cd ignorerepo
14
14
15 debugignore with no hgignore should be deterministic:
15 debugignore with no hgignore should be deterministic:
16 $ hg debugignore
16 $ hg debugignore
17 <nevermatcher>
17 <nevermatcher>
18
18
19 Issue562: .hgignore requires newline at end:
19 Issue562: .hgignore requires newline at end:
20
20
21 $ touch foo
21 $ touch foo
22 $ touch bar
22 $ touch bar
23 $ touch baz
23 $ touch baz
24 $ cat > makeignore.py <<EOF
24 $ cat > makeignore.py <<EOF
25 > f = open(".hgignore", "w")
25 > f = open(".hgignore", "w")
26 > f.write("ignore\n")
26 > f.write("ignore\n")
27 > f.write("foo\n")
27 > f.write("foo\n")
28 > # No EOL here
28 > # No EOL here
29 > f.write("bar")
29 > f.write("bar")
30 > f.close()
30 > f.close()
31 > EOF
31 > EOF
32
32
33 $ "$PYTHON" makeignore.py
33 $ "$PYTHON" makeignore.py
34
34
35 Should display baz only:
35 Should display baz only:
36
36
37 $ hg status
37 $ hg status
38 ? baz
38 ? baz
39
39
40 $ rm foo bar baz .hgignore makeignore.py
40 $ rm foo bar baz .hgignore makeignore.py
41
41
42 $ touch a.o
42 $ touch a.o
43 $ touch a.c
43 $ touch a.c
44 $ touch syntax
44 $ touch syntax
45 $ mkdir dir
45 $ mkdir dir
46 $ touch dir/a.o
46 $ touch dir/a.o
47 $ touch dir/b.o
47 $ touch dir/b.o
48 $ touch dir/c.o
48 $ touch dir/c.o
49
49
50 $ hg add dir/a.o
50 $ hg add dir/a.o
51 $ hg commit -m 0
51 $ hg commit -m 0
52 $ hg add dir/b.o
52 $ hg add dir/b.o
53
53
54 $ hg status
54 $ hg status
55 A dir/b.o
55 A dir/b.o
56 ? a.c
56 ? a.c
57 ? a.o
57 ? a.o
58 ? dir/c.o
58 ? dir/c.o
59 ? syntax
59 ? syntax
60
60
61 $ echo "*.o" > .hgignore
61 $ echo "*.o" > .hgignore
62 $ hg status
62 $ hg status
63 abort: $TESTTMP/ignorerepo/.hgignore: invalid pattern (relre): *.o (glob)
63 abort: $TESTTMP/ignorerepo/.hgignore: invalid pattern (relre): *.o (glob)
64 [255]
64 [255]
65
65
66 $ echo 're:^(?!a).*\.o$' > .hgignore
66 $ echo 're:^(?!a).*\.o$' > .hgignore
67 $ hg status
67 $ hg status
68 A dir/b.o
68 A dir/b.o
69 ? .hgignore
69 ? .hgignore
70 ? a.c
70 ? a.c
71 ? a.o
71 ? a.o
72 ? syntax
72 ? syntax
73 #if rhg
73 #if rhg
74 $ hg status --config rhg.on-unsupported=abort
74 $ hg status --config rhg.on-unsupported=abort
75 unsupported feature: Unsupported syntax regex parse error:
75 unsupported feature: Unsupported syntax regex parse error:
76 ^(?:^(?!a).*\.o$)
76 ^(?:^(?!a).*\.o$)
77 ^^^
77 ^^^
78 error: look-around, including look-ahead and look-behind, is not supported
78 error: look-around, including look-ahead and look-behind, is not supported
79 [252]
79 [252]
80 #endif
80 #endif
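The unsupported-feature message above shows the pattern after rhg has wrapped it as `^(?:…)`, and its wording matches the `regex` crate's look-around error. A minimal standalone reproduction with that crate (an assumption for illustration only; rhg's real error plumbing wraps this differently) is:

fn main() {
    // The wrapped pattern shown in the rhg output above.
    let pattern = r"^(?:^(?!a).*\.o$)";
    match regex::Regex::new(pattern) {
        Ok(_) => println!("unexpectedly compiled"),
        // Prints the same "look-around ... is not supported" diagnostic.
        Err(err) => println!("{}", err),
    }
}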
81
81
82 Ensure given files are relative to cwd
82 Ensure given files are relative to cwd
83
83
84 $ echo "dir/.*\.o" > .hgignore
84 $ echo "dir/.*\.o" > .hgignore
85 $ hg status -i
85 $ hg status -i
86 I dir/c.o
86 I dir/c.o
87
87
88 $ hg debugignore dir/c.o dir/missing.o
88 $ hg debugignore dir/c.o dir/missing.o
89 dir/c.o is ignored
89 dir/c.o is ignored
90 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
90 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
91 dir/missing.o is ignored
91 dir/missing.o is ignored
92 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
92 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
93 $ cd dir
93 $ cd dir
94 $ hg debugignore c.o missing.o
94 $ hg debugignore c.o missing.o
95 c.o is ignored
95 c.o is ignored
96 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
96 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
97 missing.o is ignored
97 missing.o is ignored
98 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
98 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
99
99
100 For icasefs, inexact matches also work, except for missing files
100 For icasefs, inexact matches also work, except for missing files
101
101
102 #if icasefs
102 #if icasefs
103 $ hg debugignore c.O missing.O
103 $ hg debugignore c.O missing.O
104 c.o is ignored
104 c.o is ignored
105 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
105 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: 'dir/.*\.o') (glob)
106 missing.O is not ignored
106 missing.O is not ignored
107 #endif
107 #endif
108
108
109 $ cd ..
109 $ cd ..
110
110
111 $ echo ".*\.o" > .hgignore
111 $ echo ".*\.o" > .hgignore
112 $ hg status
112 $ hg status
113 A dir/b.o
113 A dir/b.o
114 ? .hgignore
114 ? .hgignore
115 ? a.c
115 ? a.c
116 ? syntax
116 ? syntax
117
117
118 Ensure that comments work:
118 Ensure that comments work:
119
119
120 $ touch 'foo#bar' 'quux#' 'quu0#'
120 $ touch 'foo#bar' 'quux#' 'quu0#'
121 #if no-windows
121 #if no-windows
122 $ touch 'baz\' 'baz\wat' 'ba0\#wat' 'ba1\\' 'ba1\\wat' 'quu0\'
122 $ touch 'baz\' 'baz\wat' 'ba0\#wat' 'ba1\\' 'ba1\\wat' 'quu0\'
123 #endif
123 #endif
124
124
125 $ cat <<'EOF' >> .hgignore
125 $ cat <<'EOF' >> .hgignore
126 > # full-line comment
126 > # full-line comment
127 > # whitespace-only comment line
127 > # whitespace-only comment line
128 > syntax# pattern, no whitespace, then comment
128 > syntax# pattern, no whitespace, then comment
129 > a.c # pattern, then whitespace, then comment
129 > a.c # pattern, then whitespace, then comment
130 > baz\\# # (escaped) backslash, then comment
130 > baz\\# # (escaped) backslash, then comment
131 > ba0\\\#w # (escaped) backslash, escaped comment character, then comment
131 > ba0\\\#w # (escaped) backslash, escaped comment character, then comment
132 > ba1\\\\# # (escaped) backslashes, then comment
132 > ba1\\\\# # (escaped) backslashes, then comment
133 > foo\#b # escaped comment character
133 > foo\#b # escaped comment character
134 > quux\## escaped comment character at end of name
134 > quux\## escaped comment character at end of name
135 > EOF
135 > EOF
136 $ hg status
136 $ hg status
137 A dir/b.o
137 A dir/b.o
138 ? .hgignore
138 ? .hgignore
139 ? quu0#
139 ? quu0#
140 ? quu0\ (no-windows !)
140 ? quu0\ (no-windows !)
141
141
142 $ cat <<'EOF' > .hgignore
142 $ cat <<'EOF' > .hgignore
143 > .*\.o
143 > .*\.o
144 > syntax: glob
144 > syntax: glob
145 > syntax# pattern, no whitespace, then comment
145 > syntax# pattern, no whitespace, then comment
146 > a.c # pattern, then whitespace, then comment
146 > a.c # pattern, then whitespace, then comment
147 > baz\\#* # (escaped) backslash, then comment
147 > baz\\#* # (escaped) backslash, then comment
148 > ba0\\\#w* # (escaped) backslash, escaped comment character, then comment
148 > ba0\\\#w* # (escaped) backslash, escaped comment character, then comment
149 > ba1\\\\#* # (escaped) backslashes, then comment
149 > ba1\\\\#* # (escaped) backslashes, then comment
150 > foo\#b* # escaped comment character
150 > foo\#b* # escaped comment character
151 > quux\## escaped comment character at end of name
151 > quux\## escaped comment character at end of name
152 > quu0[\#]# escaped comment character inside [...]
152 > quu0[\#]# escaped comment character inside [...]
153 > EOF
153 > EOF
154 $ hg status
154 $ hg status
155 A dir/b.o
155 A dir/b.o
156 ? .hgignore
156 ? .hgignore
157 ? ba1\\wat (no-windows !)
157 ? ba1\\wat (no-windows !)
158 ? baz\wat (no-windows !)
158 ? baz\wat (no-windows !)
159 ? quu0\ (no-windows !)
159 ? quu0\ (no-windows !)
160
160
161 $ rm 'foo#bar' 'quux#' 'quu0#'
161 $ rm 'foo#bar' 'quux#' 'quu0#'
162 #if no-windows
162 #if no-windows
163 $ rm 'baz\' 'baz\wat' 'ba0\#wat' 'ba1\\' 'ba1\\wat' 'quu0\'
163 $ rm 'baz\' 'baz\wat' 'ba0\#wat' 'ba1\\' 'ba1\\wat' 'quu0\'
164 #endif
164 #endif
165
165
166 Check that '^\.' does not ignore the root directory:
166 Check that '^\.' does not ignore the root directory:
167
167
168 $ echo "^\." > .hgignore
168 $ echo "^\." > .hgignore
169 $ hg status
169 $ hg status
170 A dir/b.o
170 A dir/b.o
171 ? a.c
171 ? a.c
172 ? a.o
172 ? a.o
173 ? dir/c.o
173 ? dir/c.o
174 ? syntax
174 ? syntax
175
175
176 Test that patterns from ui.ignore options are read:
176 Test that patterns from ui.ignore options are read:
177
177
178 $ echo > .hgignore
178 $ echo > .hgignore
179 $ cat >> $HGRCPATH << EOF
179 $ cat >> $HGRCPATH << EOF
180 > [ui]
180 > [ui]
181 > ignore.other = $TESTTMP/ignorerepo/.hg/testhgignore
181 > ignore.other = $TESTTMP/ignorerepo/.hg/testhgignore
182 > EOF
182 > EOF
183 $ echo "glob:**.o" > .hg/testhgignore
183 $ echo "glob:**.o" > .hg/testhgignore
184 $ hg status
184 $ hg status
185 A dir/b.o
185 A dir/b.o
186 ? .hgignore
186 ? .hgignore
187 ? a.c
187 ? a.c
188 ? syntax
188 ? syntax
189
189
190 empty out testhgignore
190 empty out testhgignore
191 $ echo > .hg/testhgignore
191 $ echo > .hg/testhgignore
192
192
193 Test relative ignore path (issue4473):
193 Test relative ignore path (issue4473):
194
194
195 $ cat >> $HGRCPATH << EOF
195 $ cat >> $HGRCPATH << EOF
196 > [ui]
196 > [ui]
197 > ignore.relative = .hg/testhgignorerel
197 > ignore.relative = .hg/testhgignorerel
198 > EOF
198 > EOF
199 $ echo "glob:*.o" > .hg/testhgignorerel
199 $ echo "glob:*.o" > .hg/testhgignorerel
200 $ cd dir
200 $ cd dir
201 $ hg status
201 $ hg status
202 A dir/b.o
202 A dir/b.o
203 ? .hgignore
203 ? .hgignore
204 ? a.c
204 ? a.c
205 ? syntax
205 ? syntax
206 $ hg debugignore
206 $ hg debugignore
207 <includematcher includes='.*\\.o(?:/|$)'>
207 <includematcher includes='.*\\.o(?:/|$)'>
208
208
209 $ cd ..
209 $ cd ..
210 $ echo > .hg/testhgignorerel
210 $ echo > .hg/testhgignorerel
211 $ echo "syntax: glob" > .hgignore
211 $ echo "syntax: glob" > .hgignore
212 $ echo "re:.*\.o" >> .hgignore
212 $ echo "re:.*\.o" >> .hgignore
213 $ hg status
213 $ hg status
214 A dir/b.o
214 A dir/b.o
215 ? .hgignore
215 ? .hgignore
216 ? a.c
216 ? a.c
217 ? syntax
217 ? syntax
218
218
219 $ echo "syntax: invalid" > .hgignore
219 $ echo "syntax: invalid" > .hgignore
220 $ hg status
220 $ hg status
221 $TESTTMP/ignorerepo/.hgignore: ignoring invalid syntax 'invalid'
221 $TESTTMP/ignorerepo/.hgignore: ignoring invalid syntax 'invalid'
222 A dir/b.o
222 A dir/b.o
223 ? .hgignore
223 ? .hgignore
224 ? a.c
224 ? a.c
225 ? a.o
225 ? a.o
226 ? dir/c.o
226 ? dir/c.o
227 ? syntax
227 ? syntax
228
228
229 $ echo "syntax: glob" > .hgignore
229 $ echo "syntax: glob" > .hgignore
230 $ echo "*.o" >> .hgignore
230 $ echo "*.o" >> .hgignore
231 $ hg status
231 $ hg status
232 A dir/b.o
232 A dir/b.o
233 ? .hgignore
233 ? .hgignore
234 ? a.c
234 ? a.c
235 ? syntax
235 ? syntax
236
236
237 $ echo "relglob:syntax*" > .hgignore
237 $ echo "relglob:syntax*" > .hgignore
238 $ hg status
238 $ hg status
239 A dir/b.o
239 A dir/b.o
240 ? .hgignore
240 ? .hgignore
241 ? a.c
241 ? a.c
242 ? a.o
242 ? a.o
243 ? dir/c.o
243 ? dir/c.o
244
244
245 $ echo "relglob:*" > .hgignore
245 $ echo "relglob:*" > .hgignore
246 $ hg status
246 $ hg status
247 A dir/b.o
247 A dir/b.o
248
248
249 $ cd dir
249 $ cd dir
250 $ hg status .
250 $ hg status .
251 A b.o
251 A b.o
252
252
253 $ hg debugignore
253 $ hg debugignore
254 <includematcher includes='.*(?:/|$)'>
254 <includematcher includes='.*(?:/|$)'>
255
255
256 $ hg debugignore b.o
256 $ hg debugignore b.o
257 b.o is ignored
257 b.o is ignored
258 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: '*') (glob)
258 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 1: '*') (glob)
259
259
260 $ cd ..
260 $ cd ..
261
261
262 Check patterns that match only the directory
262 Check patterns that match only the directory
263
263
264 "(fsmonitor !)" below assumes that fsmonitor is enabled with
264 "(fsmonitor !)" below assumes that fsmonitor is enabled with
265 "walk_on_invalidate = false" (default), which doesn't involve
265 "walk_on_invalidate = false" (default), which doesn't involve
266 re-walking whole repository at detection of .hgignore change.
266 re-walking whole repository at detection of .hgignore change.
267
267
268 $ echo "^dir\$" > .hgignore
268 $ echo "^dir\$" > .hgignore
269 $ hg status
269 $ hg status
270 A dir/b.o
270 A dir/b.o
271 ? .hgignore
271 ? .hgignore
272 ? a.c
272 ? a.c
273 ? a.o
273 ? a.o
274 ? dir/c.o (fsmonitor !)
274 ? dir/c.o (fsmonitor !)
275 ? syntax
275 ? syntax
276
276
277 Check recursive glob pattern matches no directories (dir/**/c.o matches dir/c.o)
277 Check recursive glob pattern matches no directories (dir/**/c.o matches dir/c.o)
278
278
279 $ echo "syntax: glob" > .hgignore
279 $ echo "syntax: glob" > .hgignore
280 $ echo "dir/**/c.o" >> .hgignore
280 $ echo "dir/**/c.o" >> .hgignore
281 $ touch dir/c.o
281 $ touch dir/c.o
282 $ mkdir dir/subdir
282 $ mkdir dir/subdir
283 $ touch dir/subdir/c.o
283 $ touch dir/subdir/c.o
284 $ hg status
284 $ hg status
285 A dir/b.o
285 A dir/b.o
286 ? .hgignore
286 ? .hgignore
287 ? a.c
287 ? a.c
288 ? a.o
288 ? a.o
289 ? syntax
289 ? syntax
290 $ hg debugignore a.c
290 $ hg debugignore a.c
291 a.c is not ignored
291 a.c is not ignored
292 $ hg debugignore dir/c.o
292 $ hg debugignore dir/c.o
293 dir/c.o is ignored
293 dir/c.o is ignored
294 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 2: 'dir/**/c.o') (glob)
294 (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 2: 'dir/**/c.o') (glob)
295
295
296 Check rooted globs
296 Check rooted globs
297
297
298 $ hg purge --all --config extensions.purge=
298 $ hg purge --all --config extensions.purge=
299 $ echo "syntax: rootglob" > .hgignore
299 $ echo "syntax: rootglob" > .hgignore
300 $ echo "a/*.ext" >> .hgignore
300 $ echo "a/*.ext" >> .hgignore
301 $ for p in a b/a aa; do mkdir -p $p; touch $p/b.ext; done
301 $ for p in a b/a aa; do mkdir -p $p; touch $p/b.ext; done
302 $ hg status -A 'set:**.ext'
302 $ hg status -A 'set:**.ext'
303 ? aa/b.ext
303 ? aa/b.ext
304 ? b/a/b.ext
304 ? b/a/b.ext
305 I a/b.ext
305 I a/b.ext
306
306
307 Check using 'include:' in ignore file
307 Check using 'include:' in ignore file
308
308
309 $ hg purge --all --config extensions.purge=
309 $ hg purge --all --config extensions.purge=
310 $ touch foo.included
310 $ touch foo.included
311
311
312 $ echo ".*.included" > otherignore
312 $ echo ".*.included" > otherignore
313 $ hg status -I "include:otherignore"
313 $ hg status -I "include:otherignore"
314 ? foo.included
314 ? foo.included
315
315
316 $ echo "include:otherignore" >> .hgignore
316 $ echo "include:otherignore" >> .hgignore
317 $ hg status
317 $ hg status
318 A dir/b.o
318 A dir/b.o
319 ? .hgignore
319 ? .hgignore
320 ? otherignore
320 ? otherignore
321
321
322 Check recursive uses of 'include:'
322 Check recursive uses of 'include:'
323
323
324 $ echo "include:nested/ignore" >> otherignore
324 $ echo "include:nested/ignore" >> otherignore
325 $ mkdir nested nested/more
325 $ mkdir nested nested/more
326 $ echo "glob:*ignore" > nested/ignore
326 $ echo "glob:*ignore" > nested/ignore
327 $ echo "rootglob:a" >> nested/ignore
327 $ echo "rootglob:a" >> nested/ignore
328 $ touch a nested/a nested/more/a
328 $ touch a nested/a nested/more/a
329 $ hg status
329 $ hg status
330 A dir/b.o
330 A dir/b.o
331 ? nested/a
331 ? nested/a
332 ? nested/more/a
332 ? nested/more/a
333 $ rm a nested/a nested/more/a
333 $ rm a nested/a nested/more/a
334
334
335 $ cp otherignore goodignore
335 $ cp otherignore goodignore
336 $ echo "include:badignore" >> otherignore
336 $ echo "include:badignore" >> otherignore
337 $ hg status
337 $ hg status
338 skipping unreadable pattern file 'badignore': $ENOENT$
338 skipping unreadable pattern file 'badignore': $ENOENT$
339 A dir/b.o
339 A dir/b.o
340
340
341 $ mv goodignore otherignore
341 $ mv goodignore otherignore
342
342
343 Check using 'include:' while in a non-root directory
343 Check using 'include:' while in a non-root directory
344
344
345 $ cd ..
345 $ cd ..
346 $ hg -R ignorerepo status
346 $ hg -R ignorerepo status
347 A dir/b.o
347 A dir/b.o
348 $ cd ignorerepo
348 $ cd ignorerepo
349
349
350 Check including subincludes
350 Check including subincludes
351
351
352 $ hg revert -q --all
352 $ hg revert -q --all
353 $ hg purge --all --config extensions.purge=
353 $ hg purge --all --config extensions.purge=
354 $ echo ".hgignore" > .hgignore
354 $ echo ".hgignore" > .hgignore
355 $ mkdir dir1 dir2
355 $ mkdir dir1 dir2
356 $ touch dir1/file1 dir1/file2 dir2/file1 dir2/file2
356 $ touch dir1/file1 dir1/file2 dir2/file1 dir2/file2
357 $ echo "subinclude:dir2/.hgignore" >> .hgignore
357 $ echo "subinclude:dir2/.hgignore" >> .hgignore
358 $ echo "glob:file*2" > dir2/.hgignore
358 $ echo "glob:file*2" > dir2/.hgignore
359 $ hg status
359 $ hg status
360 ? dir1/file1
360 ? dir1/file1
361 ? dir1/file2
361 ? dir1/file2
362 ? dir2/file1
362 ? dir2/file1
363
363
364 Check including subincludes with other patterns
364 Check including subincludes with other patterns
365
365
366 $ echo "subinclude:dir1/.hgignore" >> .hgignore
366 $ echo "subinclude:dir1/.hgignore" >> .hgignore
367
367
368 $ mkdir dir1/subdir
368 $ mkdir dir1/subdir
369 $ touch dir1/subdir/file1
369 $ touch dir1/subdir/file1
370 $ echo "rootglob:f?le1" > dir1/.hgignore
370 $ echo "rootglob:f?le1" > dir1/.hgignore
371 $ hg status
371 $ hg status
372 ? dir1/file2
372 ? dir1/file2
373 ? dir1/subdir/file1
373 ? dir1/subdir/file1
374 ? dir2/file1
374 ? dir2/file1
375 $ rm dir1/subdir/file1
375 $ rm dir1/subdir/file1
376
376
377 $ echo "regexp:f.le1" > dir1/.hgignore
377 $ echo "regexp:f.le1" > dir1/.hgignore
378 $ hg status
378 $ hg status
379 ? dir1/file2
379 ? dir1/file2
380 ? dir2/file1
380 ? dir2/file1
381
381
382 Check multiple levels of sub-ignores
382 Check multiple levels of sub-ignores
383
383
384 $ touch dir1/subdir/subfile1 dir1/subdir/subfile3 dir1/subdir/subfile4
384 $ touch dir1/subdir/subfile1 dir1/subdir/subfile3 dir1/subdir/subfile4
385 $ echo "subinclude:subdir/.hgignore" >> dir1/.hgignore
385 $ echo "subinclude:subdir/.hgignore" >> dir1/.hgignore
386 $ echo "glob:subfil*3" >> dir1/subdir/.hgignore
386 $ echo "glob:subfil*3" >> dir1/subdir/.hgignore
387
387
388 $ hg status
388 $ hg status
389 ? dir1/file2
389 ? dir1/file2
390 ? dir1/subdir/subfile4
390 ? dir1/subdir/subfile4
391 ? dir2/file1
391 ? dir2/file1
392
392
393 Check include subignore at the same level
393 Check include subignore at the same level
394
394
395 $ mv dir1/subdir/.hgignore dir1/.hgignoretwo
395 $ mv dir1/subdir/.hgignore dir1/.hgignoretwo
396 $ echo "regexp:f.le1" > dir1/.hgignore
396 $ echo "regexp:f.le1" > dir1/.hgignore
397 $ echo "subinclude:.hgignoretwo" >> dir1/.hgignore
397 $ echo "subinclude:.hgignoretwo" >> dir1/.hgignore
398 $ echo "glob:file*2" > dir1/.hgignoretwo
398 $ echo "glob:file*2" > dir1/.hgignoretwo
399
399
400 $ hg status | grep file2
400 $ hg status | grep file2
401 [1]
401 [1]
402 $ hg debugignore dir1/file2
402 $ hg debugignore dir1/file2
403 dir1/file2 is ignored
403 dir1/file2 is ignored
404 (ignore rule in dir2/.hgignore, line 1: 'file*2')
404 (ignore rule in dir2/.hgignore, line 1: 'file*2')
405
405
406 #if windows
406 #if windows
407
407
408 Windows paths are accepted on input
408 Windows paths are accepted on input
409
409
410 $ rm dir1/.hgignore
410 $ rm dir1/.hgignore
411 $ echo "dir1/file*" >> .hgignore
411 $ echo "dir1/file*" >> .hgignore
412 $ hg debugignore "dir1\file2"
412 $ hg debugignore "dir1\file2"
413 dir1/file2 is ignored
413 dir1/file2 is ignored
414 (ignore rule in $TESTTMP\ignorerepo\.hgignore, line 4: 'dir1/file*')
414 (ignore rule in $TESTTMP\ignorerepo\.hgignore, line 4: 'dir1/file*')
415 $ hg up -qC .
415 $ hg up -qC .
416
416
417 #endif
417 #endif
418
418
419 #if dirstate-v2 rust
419 #if dirstate-v2 rust
420
420
421 Check the hash of ignore patterns written in the dirstate
421 Check the hash of ignore patterns written in the dirstate
422 This is an optimization that is only relevant when using the Rust extensions
422 This is an optimization that is only relevant when using the Rust extensions
423
423
424 $ cat_filename_and_hash () {
425 > for i in "$@"; do
426 > printf "$i "
427 > cat "$i" | "$TESTDIR"/f --raw-sha1 | sed 's/^raw-sha1=//'
428 > done
429 > }
424 $ hg status > /dev/null
430 $ hg status > /dev/null
425 $ cat .hg/testhgignore .hg/testhgignorerel .hgignore dir2/.hgignore dir1/.hgignore dir1/.hgignoretwo | $TESTDIR/f --sha1
431 $ cat_filename_and_hash .hg/testhgignore .hg/testhgignorerel .hgignore dir2/.hgignore dir1/.hgignore dir1/.hgignoretwo | $TESTDIR/f --sha1
426 sha1=6e315b60f15fb5dfa02be00f3e2c8f923051f5ff
432 sha1=c0beb296395d48ced8e14f39009c4ea6e409bfe6
427 $ hg debugstate --docket | grep ignore
433 $ hg debugstate --docket | grep ignore
428 ignore pattern hash: 6e315b60f15fb5dfa02be00f3e2c8f923051f5ff
434 ignore pattern hash: c0beb296395d48ced8e14f39009c4ea6e409bfe6
429
435
430 $ echo rel > .hg/testhgignorerel
436 $ echo rel > .hg/testhgignorerel
431 $ hg status > /dev/null
437 $ hg status > /dev/null
432 $ cat .hg/testhgignore .hg/testhgignorerel .hgignore dir2/.hgignore dir1/.hgignore dir1/.hgignoretwo | $TESTDIR/f --sha1
438 $ cat_filename_and_hash .hg/testhgignore .hg/testhgignorerel .hgignore dir2/.hgignore dir1/.hgignore dir1/.hgignoretwo | $TESTDIR/f --sha1
433 sha1=dea19cc7119213f24b6b582a4bae7b0cb063e34e
439 sha1=b8e63d3428ec38abc68baa27631516d5ec46b7fa
434 $ hg debugstate --docket | grep ignore
440 $ hg debugstate --docket | grep ignore
435 ignore pattern hash: dea19cc7119213f24b6b582a4bae7b0cb063e34e
441 ignore pattern hash: b8e63d3428ec38abc68baa27631516d5ec46b7fa
436 $ cd ..
442 $ cd ..
437
443
438 Check that the hash depends on the source of the hgignore patterns
444 Check that the hash depends on the source of the hgignore patterns
439 (otherwise the context is lost and things like subinclude are cached improperly)
445 (otherwise the context is lost and things like subinclude are cached improperly)
440
446
441 $ hg init ignore-collision
447 $ hg init ignore-collision
442 $ cd ignore-collision
448 $ cd ignore-collision
443 $ echo > .hg/testhgignorerel
449 $ echo > .hg/testhgignorerel
444
450
445 $ mkdir dir1/ dir1/subdir
451 $ mkdir dir1/ dir1/subdir
446 $ touch dir1/subdir/f dir1/subdir/ignored1
452 $ touch dir1/subdir/f dir1/subdir/ignored1
447 $ echo 'ignored1' > dir1/.hgignore
453 $ echo 'ignored1' > dir1/.hgignore
448
454
449 $ mkdir dir2 dir2/subdir
455 $ mkdir dir2 dir2/subdir
450 $ touch dir2/subdir/f dir2/subdir/ignored2
456 $ touch dir2/subdir/f dir2/subdir/ignored2
451 $ echo 'ignored2' > dir2/.hgignore
457 $ echo 'ignored2' > dir2/.hgignore
452 $ echo 'subinclude:dir2/.hgignore' >> .hgignore
458 $ echo 'subinclude:dir2/.hgignore' >> .hgignore
453 $ echo 'subinclude:dir1/.hgignore' >> .hgignore
459 $ echo 'subinclude:dir1/.hgignore' >> .hgignore
454
460
455 $ hg commit -Aqm_
461 $ hg commit -Aqm_
456
462
457 $ > dir1/.hgignore
463 $ > dir1/.hgignore
458 $ echo 'ignored' > dir2/.hgignore
464 $ echo 'ignored' > dir2/.hgignore
459 $ echo 'ignored1' >> dir2/.hgignore
465 $ echo 'ignored1' >> dir2/.hgignore
460 $ hg status
466 $ hg status
461 M dir1/.hgignore
467 M dir1/.hgignore
462 M dir2/.hgignore
468 M dir2/.hgignore
463 ? dir1/subdir/ignored1 (missing-correct-output !)
469 ? dir1/subdir/ignored1
464
470
465 #endif
471 #endif
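The collision scenario at the end is the point of this change: previously the docket's "ignore pattern hash" was computed over the concatenated pattern bytes only, so emptying dir1/.hgignore and moving its line into dir2/.hgignore produced the same hash and stale cached ignore results (the "missing-correct-output" line that this change removes). Hashing each source file name along with its contents, as the updated cat_filename_and_hash helper does on the shell side, breaks that collision. A small self-contained sketch of the idea, using the RustCrypto `sha1` crate purely for illustration and not Mercurial's actual hashing code, is:

use sha1::{Digest, Sha1};

// Hash a set of (source file, contents) pairs such that moving the same
// pattern text between files changes the result.
fn ignore_pattern_hash(files: &[(&str, &[u8])]) -> String {
    let mut hasher = Sha1::new();
    for (source, contents) in files {
        // Feeding the file name as well as the bytes is what distinguishes
        // "dir1 empty, dir2 has the pattern" from the swapped layout.
        hasher.update(source.as_bytes());
        hasher.update(b"\n");
        hasher.update(*contents);
    }
    hasher
        .finalize()
        .iter()
        .map(|byte| format!("{:02x}", byte))
        .collect()
}

fn main() {
    let before = ignore_pattern_hash(&[
        ("dir1/.hgignore", &b"ignored1\n"[..]),
        ("dir2/.hgignore", &b"ignored2\n"[..]),
    ]);
    let after = ignore_pattern_hash(&[
        ("dir1/.hgignore", &b""[..]),
        ("dir2/.hgignore", &b"ignored2\nignored1\n"[..]),
    ]);
    // Same pattern lines overall, distributed differently across files:
    // the hashes differ, so the cached ignore data is invalidated correctly.
    assert_ne!(before, after);
    println!("{}", before);
    println!("{}", after);
}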