fix: prefetch file contents...
Rodrigo Damazio Bovendorp
r45615:263cf0f6 default draft
@@ -1,903 +1,926 @@
1 1 # fix - rewrite file content in changesets and working copy
2 2 #
3 3 # Copyright 2018 Google LLC.
4 4 #
5 5 # This software may be used and distributed according to the terms of the
6 6 # GNU General Public License version 2 or any later version.
7 7 """rewrite file content in changesets or working copy (EXPERIMENTAL)
8 8
9 9 Provides a command that runs configured tools on the contents of modified files,
10 10 writing back any fixes to the working copy or replacing changesets.
11 11
12 12 Here is an example configuration that causes :hg:`fix` to apply automatic
13 13 formatting fixes to modified lines in C++ code::
14 14
15 15 [fix]
16 16 clang-format:command=clang-format --assume-filename={rootpath}
17 17 clang-format:linerange=--lines={first}:{last}
18 18 clang-format:pattern=set:**.cpp or **.hpp
19 19
20 20 The :command suboption forms the first part of the shell command that will be
21 21 used to fix a file. The content of the file is passed on standard input, and the
22 22 fixed file content is expected on standard output. Any output on standard error
23 23 will be displayed as a warning. If the exit status is not zero, the file will
24 24 not be affected. A placeholder warning is displayed if there is a non-zero exit
25 25 status but no standard error output. Some values may be substituted into the
26 26 command::
27 27
28 28 {rootpath} The path of the file being fixed, relative to the repo root
29 29 {basename} The name of the file being fixed, without the directory path
30 30
31 31 If the :linerange suboption is set, the tool will only be run if there are
32 32 changed lines in a file. The value of this suboption is appended to the shell
33 33 command once for every range of changed lines in the file. Some values may be
34 34 substituted into the command::
35 35
36 36 {first} The 1-based line number of the first line in the modified range
37 37 {last} The 1-based line number of the last line in the modified range
38 38
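A minimal sketch of how the :command and :linerange templates could combine into one shell command, assuming one :linerange argument is appended per changed range. The `buildcommand` helper is hypothetical, shown only to illustrate the {rootpath}/{first}/{last} substitution described above; it is not Mercurial's internal implementation.

```python
# Hypothetical helper illustrating :command + :linerange expansion.
def buildcommand(command, linerange, ranges, rootpath):
    """Substitute {rootpath} in :command, then append one :linerange
    argument per changed (first, last) range."""
    parts = [command.replace('{rootpath}', rootpath)]
    for first, last in ranges:
        parts.append(
            linerange.replace('{first}', str(first)).replace('{last}', str(last))
        )
    return ' '.join(parts)

# Two changed ranges in src/main.cpp, using the clang-format config above.
cmd = buildcommand(
    'clang-format --assume-filename={rootpath}',
    '--lines={first}:{last}',
    [(1, 10), (25, 30)],
    'src/main.cpp',
)
```

With this sketch, `cmd` becomes `clang-format --assume-filename=src/main.cpp --lines=1:10 --lines=25:30`, matching the documented substitution rules.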
39 39 Deleted sections of a file will be ignored by :linerange, because there is no
40 40 corresponding line range in the version being fixed.
41 41
42 42 By default, tools that set :linerange will only be executed if there is at least
43 43 one changed line range. This is meant to prevent accidents like running a code
44 44 formatter in such a way that it unexpectedly reformats the whole file. If such a
45 45 tool needs to operate on unchanged files, it should set the :skipclean suboption
46 46 to false.
47 47
48 48 The :pattern suboption determines which files will be passed through each
49 49 configured tool. See :hg:`help patterns` for possible values. However, all
50 50 patterns are relative to the repo root, even if that text says they are relative
51 51 to the current working directory. If there are file arguments to :hg:`fix`, the
52 52 intersection of these patterns is used.
53 53
54 54 There is also a configurable limit for the maximum size of file that will be
55 55 processed by :hg:`fix`::
56 56
57 57 [fix]
58 58 maxfilesize = 2MB
59 59
60 60 Normally, execution of configured tools will continue after a failure (indicated
61 61 by a non-zero exit status). It can also be configured to abort after the first
62 62 such failure, so that no files will be affected if any tool fails. This abort
63 63 will also cause :hg:`fix` to exit with a non-zero status::
64 64
65 65 [fix]
66 66 failure = abort
67 67
68 68 When multiple tools are configured to affect a file, they execute in an order
69 69 defined by the :priority suboption. The priority suboption has a default value
70 70 of zero for each tool. Tools are executed in order of descending priority. The
71 71 execution order of tools with equal priority is unspecified. For example, you
72 72 could use the 'sort' and 'head' utilities to keep only the 10 smallest numbers
73 73 in a text file by ensuring that 'sort' runs before 'head'::
74 74
75 75 [fix]
76 76 sort:command = sort -n
77 77 head:command = head -n 10
78 78 sort:pattern = numbers.txt
79 79 head:pattern = numbers.txt
80 80 sort:priority = 2
81 81 head:priority = 1
82 82
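To see why the descending-priority order matters in the sort/head example above, here is a sketch that mimics the two shell tools in Python and applies them in priority order. The `sort_n` and `head_n10` functions are stand-ins for the real `sort -n` and `head -n 10` commands.

```python
# Stand-ins for 'sort -n' and 'head -n 10' acting on file content.
def sort_n(text):
    return '\n'.join(sorted(text.splitlines(), key=int)) + '\n'

def head_n10(text):
    return '\n'.join(text.splitlines()[:10]) + '\n'

# (priority, tool) pairs; higher priority runs first, as documented.
tools = [(2, sort_n), (1, head_n10)]
content = '\n'.join(str(n) for n in range(20, 0, -1)) + '\n'
for _, tool in sorted(tools, key=lambda t: t[0], reverse=True):
    content = tool(content)
```

Running sort first and head second leaves the 10 smallest numbers; reversing the priorities would instead keep the 10 that happened to be at the top of the file.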
83 83 To account for changes made by each tool, the line numbers used for incremental
84 84 formatting are recomputed before executing the next tool. So, each tool may see
85 85 different values for the arguments added by the :linerange suboption.
86 86
87 87 Each fixer tool is allowed to return some metadata in addition to the fixed file
88 88 content. The metadata must be placed before the file content on stdout,
89 89 separated from the file content by a zero byte. The metadata is parsed as a JSON
90 90 value (so, it should be UTF-8 encoded and contain no zero bytes). A fixer tool
91 91 is expected to produce this metadata encoding if and only if the :metadata
92 92 suboption is true::
93 93
94 94 [fix]
95 95 tool:command = tool --prepend-json-metadata
96 96 tool:metadata = true
97 97
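The metadata protocol above can be sketched from both sides: a fixer tool writes JSON metadata, a zero byte, then the fixed content on stdout, and the consumer splits on the first zero byte. The `emit`/`parse` helpers are illustrative, not Mercurial's internals; the parse side mirrors the documented `stdout.split(b'\0', 1)` behavior.

```python
import io
import json

def emit(metadata, fixedcontent, out):
    # Fixer-tool side: JSON metadata, a zero byte, then the content.
    out.write(json.dumps(metadata).encode('utf-8'))
    out.write(b'\0')
    out.write(fixedcontent)

def parse(stdout_bytes):
    # Consumer side: split on the first zero byte only.
    metadatajson, content = stdout_bytes.split(b'\0', 1)
    return json.loads(metadatajson), content

buf = io.BytesIO()
emit({'changed': 2}, b'fixed\n', buf)
meta, content = parse(buf.getvalue())
```

Because the split takes only the first zero byte, the fixed content itself may contain zero bytes; only the metadata must be zero-free, as the docstring requires.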
98 98 The metadata values are passed to hooks, which can be used to print summaries or
99 99 perform other post-fixing work. The supported hooks are::
100 100
101 101 "postfixfile"
102 102 Run once for each file in each revision where any fixer tools made changes
103 103 to the file content. Provides "$HG_REV" and "$HG_PATH" to identify the file,
104 104 and "$HG_METADATA" with a map of fixer names to metadata values from fixer
105 105 tools that affected the file. Fixer tools that didn't affect the file have a
106 106 value of None. Only fixer tools that executed are present in the metadata.
107 107
108 108 "postfix"
109 109 Run once after all files and revisions have been handled. Provides
110 110 "$HG_REPLACEMENTS" with information about what revisions were created and
111 111 made obsolete. Provides a boolean "$HG_WDIRWRITTEN" to indicate whether any
 112 112 files in the working copy were updated. Provides a map "$HG_METADATA"
113 113 mapping fixer tool names to lists of metadata values returned from
114 114 executions that modified a file. This aggregates the same metadata
115 115 previously passed to the "postfixfile" hook.
116 116
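An in-process Python hook could consume the per-file metadata described above. This is a hedged sketch: the keyword names mirror the HG_* variables, and the `summarize` helper is hypothetical, shown only to illustrate reading the fixer-name-to-metadata map.

```python
# Hypothetical "postfixfile" hook; summarize() is illustrative only.
def summarize(rev, path, metadata):
    # metadata maps fixer names to the JSON values they returned,
    # or None for tools that ran but reported nothing.
    ran = sorted(metadata)
    reported = sorted(name for name, value in metadata.items() if value is not None)
    return '%s@%s: ran=%s reported=%s' % (
        path, rev, ','.join(ran), ','.join(reported))

def postfixfile(ui, repo, rev=None, path=None, metadata=None, **kwargs):
    ui.status(summarize(rev, path, metadata or {}) + '\n')

line = summarize(5, 'a.py', {'black': {'changed': 2}, 'isort': None})
```

Here `line` distinguishes tools that executed from tools that actually returned metadata, matching the None convention documented for "postfixfile".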
117 117 Fixer tools are run in the repository's root directory. This allows them to read
118 118 configuration files from the working copy, or even write to the working copy.
119 119 The working copy is not updated to match the revision being fixed. In fact,
120 120 several revisions may be fixed in parallel. Writes to the working copy are not
121 121 amended into the revision being fixed; fixer tools should always write fixed
122 122 file content back to stdout as documented above.
123 123 """
124 124
125 125 from __future__ import absolute_import
126 126
127 127 import collections
128 128 import itertools
129 129 import os
130 130 import re
131 131 import subprocess
132 132
133 133 from mercurial.i18n import _
134 134 from mercurial.node import nullrev
135 135 from mercurial.node import wdirrev
136 136
137 137 from mercurial.utils import procutil
138 138
139 139 from mercurial import (
140 140 cmdutil,
141 141 context,
142 142 copies,
143 143 error,
144 144 match as matchmod,
145 145 mdiff,
146 146 merge,
147 147 mergestate as mergestatemod,
148 148 pycompat,
149 149 registrar,
150 150 rewriteutil,
151 151 scmutil,
152 152 util,
153 153 worker,
154 154 )
155 155
156 156 # Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
157 157 # extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
158 158 # be specifying the version(s) of Mercurial they are tested with, or
159 159 # leave the attribute unspecified.
160 160 testedwith = b'ships-with-hg-core'
161 161
162 162 cmdtable = {}
163 163 command = registrar.command(cmdtable)
164 164
165 165 configtable = {}
166 166 configitem = registrar.configitem(configtable)
167 167
168 168 # Register the suboptions allowed for each configured fixer, and default values.
169 169 FIXER_ATTRS = {
170 170 b'command': None,
171 171 b'linerange': None,
172 172 b'pattern': None,
173 173 b'priority': 0,
174 174 b'metadata': False,
175 175 b'skipclean': True,
176 176 b'enabled': True,
177 177 }
178 178
179 179 for key, default in FIXER_ATTRS.items():
180 180 configitem(b'fix', b'.*:%s$' % key, default=default, generic=True)
181 181
182 182 # A good default size allows most source code files to be fixed, but avoids
183 183 # letting fixer tools choke on huge inputs, which could be surprising to the
184 184 # user.
185 185 configitem(b'fix', b'maxfilesize', default=b'2MB')
186 186
187 187 # Allow fix commands to exit non-zero if an executed fixer tool exits non-zero.
 188 188 # This helps users write shell scripts that stop when a fixer tool signals a
189 189 # problem.
190 190 configitem(b'fix', b'failure', default=b'continue')
191 191
192 192
193 193 def checktoolfailureaction(ui, message, hint=None):
194 194 """Abort with 'message' if fix.failure=abort"""
195 195 action = ui.config(b'fix', b'failure')
196 196 if action not in (b'continue', b'abort'):
197 197 raise error.Abort(
198 198 _(b'unknown fix.failure action: %s') % (action,),
199 199 hint=_(b'use "continue" or "abort"'),
200 200 )
201 201 if action == b'abort':
202 202 raise error.Abort(message, hint=hint)
203 203
204 204
205 205 allopt = (b'', b'all', False, _(b'fix all non-public non-obsolete revisions'))
206 206 baseopt = (
207 207 b'',
208 208 b'base',
209 209 [],
210 210 _(
211 211 b'revisions to diff against (overrides automatic '
212 212 b'selection, and applies to every revision being '
213 213 b'fixed)'
214 214 ),
215 215 _(b'REV'),
216 216 )
217 217 revopt = (b'r', b'rev', [], _(b'revisions to fix (ADVANCED)'), _(b'REV'))
218 218 sourceopt = (
219 219 b's',
220 220 b'source',
221 221 [],
222 222 _(b'fix the specified revisions and their descendants'),
223 223 _(b'REV'),
224 224 )
225 225 wdiropt = (b'w', b'working-dir', False, _(b'fix the working directory'))
226 226 wholeopt = (b'', b'whole', False, _(b'always fix every line of a file'))
227 227 usage = _(b'[OPTION]... [FILE]...')
228 228
229 229
230 230 @command(
231 231 b'fix',
232 232 [allopt, baseopt, revopt, sourceopt, wdiropt, wholeopt],
233 233 usage,
234 234 helpcategory=command.CATEGORY_FILE_CONTENTS,
235 235 )
236 236 def fix(ui, repo, *pats, **opts):
237 237 """rewrite file content in changesets or working directory
238 238
239 239 Runs any configured tools to fix the content of files. Only affects files
240 240 with changes, unless file arguments are provided. Only affects changed lines
241 241 of files, unless the --whole flag is used. Some tools may always affect the
242 242 whole file regardless of --whole.
243 243
244 244 If revisions are specified with --rev, those revisions will be checked, and
245 245 they may be replaced with new revisions that have fixed file content. It is
246 246 desirable to specify all descendants of each specified revision, so that the
247 247 fixes propagate to the descendants. If all descendants are fixed at the same
248 248 time, no merging, rebasing, or evolution will be required.
249 249
250 250 If --working-dir is used, files with uncommitted changes in the working copy
251 251 will be fixed. If the checked-out revision is also fixed, the working
252 252 directory will update to the replacement revision.
253 253
254 254 When determining what lines of each file to fix at each revision, the whole
255 255 set of revisions being fixed is considered, so that fixes to earlier
256 256 revisions are not forgotten in later ones. The --base flag can be used to
257 257 override this default behavior, though it is not usually desirable to do so.
258 258 """
259 259 opts = pycompat.byteskwargs(opts)
260 260 cmdutil.check_at_most_one_arg(opts, b'all', b'source', b'rev')
261 261 cmdutil.check_incompatible_arguments(
262 262 opts, b'working_dir', [b'all', b'source']
263 263 )
264 264
265 265 with repo.wlock(), repo.lock(), repo.transaction(b'fix'):
266 266 revstofix = getrevstofix(ui, repo, opts)
267 267 basectxs = getbasectxs(repo, opts, revstofix)
268 268 workqueue, numitems = getworkqueue(
269 269 ui, repo, pats, opts, revstofix, basectxs
270 270 )
271 271 basepaths = getbasepaths(repo, opts, workqueue, basectxs)
272 272 fixers = getfixers(ui)
273 273
274 # Rather than letting each worker independently fetch the files
275 # (which also would add complications for shared/keepalive
276 # connections), prefetch them all first.
277 _prefetchfiles(repo, workqueue, basepaths)
278
274 279 # There are no data dependencies between the workers fixing each file
275 280 # revision, so we can use all available parallelism.
276 281 def getfixes(items):
277 282 for rev, path in items:
278 283 ctx = repo[rev]
279 284 olddata = ctx[path].data()
280 285 metadata, newdata = fixfile(
281 286 ui, repo, opts, fixers, ctx, path, basepaths, basectxs[rev]
282 287 )
283 288 # Don't waste memory/time passing unchanged content back, but
284 289 # produce one result per item either way.
285 290 yield (
286 291 rev,
287 292 path,
288 293 metadata,
289 294 newdata if newdata != olddata else None,
290 295 )
291 296
292 297 results = worker.worker(
293 298 ui, 1.0, getfixes, tuple(), workqueue, threadsafe=False
294 299 )
295 300
296 301 # We have to hold on to the data for each successor revision in memory
297 302 # until all its parents are committed. We ensure this by committing and
298 303 # freeing memory for the revisions in some topological order. This
299 304 # leaves a little bit of memory efficiency on the table, but also makes
300 305 # the tests deterministic. It might also be considered a feature since
301 306 # it makes the results more easily reproducible.
302 307 filedata = collections.defaultdict(dict)
303 308 aggregatemetadata = collections.defaultdict(list)
304 309 replacements = {}
305 310 wdirwritten = False
306 311 commitorder = sorted(revstofix, reverse=True)
307 312 with ui.makeprogress(
308 313 topic=_(b'fixing'), unit=_(b'files'), total=sum(numitems.values())
309 314 ) as progress:
310 315 for rev, path, filerevmetadata, newdata in results:
311 316 progress.increment(item=path)
312 317 for fixername, fixermetadata in filerevmetadata.items():
313 318 aggregatemetadata[fixername].append(fixermetadata)
314 319 if newdata is not None:
315 320 filedata[rev][path] = newdata
316 321 hookargs = {
317 322 b'rev': rev,
318 323 b'path': path,
319 324 b'metadata': filerevmetadata,
320 325 }
321 326 repo.hook(
322 327 b'postfixfile',
323 328 throw=False,
324 329 **pycompat.strkwargs(hookargs)
325 330 )
326 331 numitems[rev] -= 1
327 332 # Apply the fixes for this and any other revisions that are
328 333 # ready and sitting at the front of the queue. Using a loop here
329 334 # prevents the queue from being blocked by the first revision to
330 335 # be ready out of order.
331 336 while commitorder and not numitems[commitorder[-1]]:
332 337 rev = commitorder.pop()
333 338 ctx = repo[rev]
334 339 if rev == wdirrev:
335 340 writeworkingdir(repo, ctx, filedata[rev], replacements)
336 341 wdirwritten = bool(filedata[rev])
337 342 else:
338 343 replacerev(ui, repo, ctx, filedata[rev], replacements)
339 344 del filedata[rev]
340 345
341 346 cleanup(repo, replacements, wdirwritten)
342 347 hookargs = {
343 348 b'replacements': replacements,
344 349 b'wdirwritten': wdirwritten,
345 350 b'metadata': aggregatemetadata,
346 351 }
347 352 repo.hook(b'postfix', throw=True, **pycompat.strkwargs(hookargs))
348 353
349 354
350 355 def cleanup(repo, replacements, wdirwritten):
351 356 """Calls scmutil.cleanupnodes() with the given replacements.
352 357
353 358 "replacements" is a dict from nodeid to nodeid, with one key and one value
354 359 for every revision that was affected by fixing. This is slightly different
355 360 from cleanupnodes().
356 361
357 362 "wdirwritten" is a bool which tells whether the working copy was affected by
358 363 fixing, since it has no entry in "replacements".
359 364
360 365 Useful as a hook point for extending "hg fix" with output summarizing the
361 366 effects of the command, though we choose not to output anything here.
362 367 """
363 368 replacements = {
364 369 prec: [succ] for prec, succ in pycompat.iteritems(replacements)
365 370 }
366 371 scmutil.cleanupnodes(repo, replacements, b'fix', fixphase=True)
367 372
368 373
369 374 def getworkqueue(ui, repo, pats, opts, revstofix, basectxs):
 370 375 """Constructs the list of files to be fixed at specific revisions
371 376
372 377 It is up to the caller how to consume the work items, and the only
373 378 dependence between them is that replacement revisions must be committed in
374 379 topological order. Each work item represents a file in the working copy or
375 380 in some revision that should be fixed and written back to the working copy
376 381 or into a replacement revision.
377 382
378 383 Work items for the same revision are grouped together, so that a worker
379 384 pool starting with the first N items in parallel is likely to finish the
380 385 first revision's work before other revisions. This can allow us to write
381 386 the result to disk and reduce memory footprint. At time of writing, the
382 387 partition strategy in worker.py seems favorable to this. We also sort the
383 388 items by ascending revision number to match the order in which we commit
384 389 the fixes later.
385 390 """
386 391 workqueue = []
387 392 numitems = collections.defaultdict(int)
388 393 maxfilesize = ui.configbytes(b'fix', b'maxfilesize')
389 394 for rev in sorted(revstofix):
390 395 fixctx = repo[rev]
391 396 match = scmutil.match(fixctx, pats, opts)
392 397 for path in sorted(
393 398 pathstofix(ui, repo, pats, opts, match, basectxs[rev], fixctx)
394 399 ):
395 400 fctx = fixctx[path]
396 401 if fctx.islink():
397 402 continue
398 403 if fctx.size() > maxfilesize:
399 404 ui.warn(
400 405 _(b'ignoring file larger than %s: %s\n')
401 406 % (util.bytecount(maxfilesize), path)
402 407 )
403 408 continue
404 409 workqueue.append((rev, path))
405 410 numitems[rev] += 1
406 411 return workqueue, numitems
407 412
408 413
409 414 def getrevstofix(ui, repo, opts):
410 415 """Returns the set of revision numbers that should be fixed"""
411 416 if opts[b'all']:
412 417 revs = repo.revs(b'(not public() and not obsolete()) or wdir()')
413 418 elif opts[b'source']:
414 419 source_revs = scmutil.revrange(repo, opts[b'source'])
415 420 revs = set(repo.revs(b'%ld::', source_revs))
416 421 if wdirrev in source_revs:
417 422 # `wdir()::` is currently empty, so manually add wdir
418 423 revs.add(wdirrev)
419 424 if repo[b'.'].rev() in revs:
420 425 revs.add(wdirrev)
421 426 else:
422 427 revs = set(scmutil.revrange(repo, opts[b'rev']))
423 428 if opts.get(b'working_dir'):
424 429 revs.add(wdirrev)
425 430 for rev in revs:
426 431 checkfixablectx(ui, repo, repo[rev])
427 432 # Allow fixing only wdir() even if there's an unfinished operation
428 433 if not (len(revs) == 1 and wdirrev in revs):
429 434 cmdutil.checkunfinished(repo)
430 435 rewriteutil.precheck(repo, revs, b'fix')
431 436 if wdirrev in revs and list(
432 437 mergestatemod.mergestate.read(repo).unresolved()
433 438 ):
434 439 raise error.Abort(b'unresolved conflicts', hint=b"use 'hg resolve'")
435 440 if not revs:
436 441 raise error.Abort(
437 442 b'no changesets specified', hint=b'use --rev or --working-dir'
438 443 )
439 444 return revs
440 445
441 446
442 447 def checkfixablectx(ui, repo, ctx):
443 448 """Aborts if the revision shouldn't be replaced with a fixed one."""
444 449 if ctx.obsolete():
445 450 # It would be better to actually check if the revision has a successor.
446 451 allowdivergence = ui.configbool(
447 452 b'experimental', b'evolution.allowdivergence'
448 453 )
449 454 if not allowdivergence:
450 455 raise error.Abort(
451 456 b'fixing obsolete revision could cause divergence'
452 457 )
453 458
454 459
455 460 def pathstofix(ui, repo, pats, opts, match, basectxs, fixctx):
456 461 """Returns the set of files that should be fixed in a context
457 462
458 463 The result depends on the base contexts; we include any file that has
459 464 changed relative to any of the base contexts. Base contexts should be
460 465 ancestors of the context being fixed.
461 466 """
462 467 files = set()
463 468 for basectx in basectxs:
464 469 stat = basectx.status(
465 470 fixctx, match=match, listclean=bool(pats), listunknown=bool(pats)
466 471 )
467 472 files.update(
468 473 set(
469 474 itertools.chain(
470 475 stat.added, stat.modified, stat.clean, stat.unknown
471 476 )
472 477 )
473 478 )
474 479 return files
475 480
476 481
477 482
478 483 def lineranges(opts, path, basepaths, basectxs, fixctx, content2):
479 484 """Returns the set of line ranges that should be fixed in a file
480 485
481 486 Of the form [(10, 20), (30, 40)].
482 487
483 488 This depends on the given base contexts; we must consider lines that have
484 489 changed versus any of the base contexts, and whether the file has been
485 490 renamed versus any of them.
486 491
487 492 Another way to understand this is that we exclude line ranges that are
488 493 common to the file in all base contexts.
489 494 """
490 495 if opts.get(b'whole'):
491 496 # Return a range containing all lines. Rely on the diff implementation's
492 497 # idea of how many lines are in the file, instead of reimplementing it.
493 498 return difflineranges(b'', content2)
494 499
495 500 rangeslist = []
496 501 for basectx in basectxs:
497 502 basepath = basepaths.get((basectx.rev(), fixctx.rev(), path), path)
498 503
499 504 if basepath in basectx:
500 505 content1 = basectx[basepath].data()
501 506 else:
502 507 content1 = b''
503 508 rangeslist.extend(difflineranges(content1, content2))
504 509 return unionranges(rangeslist)
505 510
506 511
507 512 def getbasepaths(repo, opts, workqueue, basectxs):
508 513 if opts.get(b'whole'):
509 514 # Base paths will never be fetched for line range determination.
510 515 return {}
511 516
512 517 basepaths = {}
513 518 for rev, path in workqueue:
514 519 fixctx = repo[rev]
515 520 for basectx in basectxs[rev]:
516 521 basepath = copies.pathcopies(basectx, fixctx).get(path, path)
517 522 if basepath in basectx:
518 523 basepaths[(basectx.rev(), fixctx.rev(), path)] = basepath
519 524 return basepaths
520 525
521 526
522 527 def unionranges(rangeslist):
523 528 """Return the union of some closed intervals
524 529
525 530 >>> unionranges([])
526 531 []
527 532 >>> unionranges([(1, 100)])
528 533 [(1, 100)]
529 534 >>> unionranges([(1, 100), (1, 100)])
530 535 [(1, 100)]
531 536 >>> unionranges([(1, 100), (2, 100)])
532 537 [(1, 100)]
533 538 >>> unionranges([(1, 99), (1, 100)])
534 539 [(1, 100)]
535 540 >>> unionranges([(1, 100), (40, 60)])
536 541 [(1, 100)]
537 542 >>> unionranges([(1, 49), (50, 100)])
538 543 [(1, 100)]
539 544 >>> unionranges([(1, 48), (50, 100)])
540 545 [(1, 48), (50, 100)]
541 546 >>> unionranges([(1, 2), (3, 4), (5, 6)])
542 547 [(1, 6)]
543 548 """
544 549 rangeslist = sorted(set(rangeslist))
545 550 unioned = []
546 551 if rangeslist:
547 552 unioned, rangeslist = [rangeslist[0]], rangeslist[1:]
548 553 for a, b in rangeslist:
549 554 c, d = unioned[-1]
550 555 if a > d + 1:
551 556 unioned.append((a, b))
552 557 else:
553 558 unioned[-1] = (c, max(b, d))
554 559 return unioned
555 560
556 561
557 562 def difflineranges(content1, content2):
558 563 """Return list of line number ranges in content2 that differ from content1.
559 564
560 565 Line numbers are 1-based. The numbers are the first and last line contained
561 566 in the range. Single-line ranges have the same line number for the first and
562 567 last line. Excludes any empty ranges that result from lines that are only
563 568 present in content1. Relies on mdiff's idea of where the line endings are in
564 569 the string.
565 570
566 571 >>> from mercurial import pycompat
567 572 >>> lines = lambda s: b'\\n'.join([c for c in pycompat.iterbytestr(s)])
568 573 >>> difflineranges2 = lambda a, b: difflineranges(lines(a), lines(b))
569 574 >>> difflineranges2(b'', b'')
570 575 []
571 576 >>> difflineranges2(b'a', b'')
572 577 []
573 578 >>> difflineranges2(b'', b'A')
574 579 [(1, 1)]
575 580 >>> difflineranges2(b'a', b'a')
576 581 []
577 582 >>> difflineranges2(b'a', b'A')
578 583 [(1, 1)]
579 584 >>> difflineranges2(b'ab', b'')
580 585 []
581 586 >>> difflineranges2(b'', b'AB')
582 587 [(1, 2)]
583 588 >>> difflineranges2(b'abc', b'ac')
584 589 []
585 590 >>> difflineranges2(b'ab', b'aCb')
586 591 [(2, 2)]
587 592 >>> difflineranges2(b'abc', b'aBc')
588 593 [(2, 2)]
589 594 >>> difflineranges2(b'ab', b'AB')
590 595 [(1, 2)]
591 596 >>> difflineranges2(b'abcde', b'aBcDe')
592 597 [(2, 2), (4, 4)]
593 598 >>> difflineranges2(b'abcde', b'aBCDe')
594 599 [(2, 4)]
595 600 """
596 601 ranges = []
597 602 for lines, kind in mdiff.allblocks(content1, content2):
598 603 firstline, lastline = lines[2:4]
599 604 if kind == b'!' and firstline != lastline:
600 605 ranges.append((firstline + 1, lastline))
601 606 return ranges
602 607
603 608
604 609 def getbasectxs(repo, opts, revstofix):
605 610 """Returns a map of the base contexts for each revision
606 611
607 612 The base contexts determine which lines are considered modified when we
608 613 attempt to fix just the modified lines in a file. It also determines which
609 614 files we attempt to fix, so it is important to compute this even when
610 615 --whole is used.
611 616 """
612 617 # The --base flag overrides the usual logic, and we give every revision
613 618 # exactly the set of baserevs that the user specified.
614 619 if opts.get(b'base'):
615 620 baserevs = set(scmutil.revrange(repo, opts.get(b'base')))
616 621 if not baserevs:
617 622 baserevs = {nullrev}
618 623 basectxs = {repo[rev] for rev in baserevs}
619 624 return {rev: basectxs for rev in revstofix}
620 625
621 626 # Proceed in topological order so that we can easily determine each
622 627 # revision's baserevs by looking at its parents and their baserevs.
623 628 basectxs = collections.defaultdict(set)
624 629 for rev in sorted(revstofix):
625 630 ctx = repo[rev]
626 631 for pctx in ctx.parents():
627 632 if pctx.rev() in basectxs:
628 633 basectxs[rev].update(basectxs[pctx.rev()])
629 634 else:
630 635 basectxs[rev].add(pctx)
631 636 return basectxs
632 637
638 def _prefetchfiles(repo, workqueue, basepaths):
639 toprefetch = set()
640
641 # Prefetch the files that will be fixed.
642 for rev, path in workqueue:
643 if rev == wdirrev:
644 continue
645 toprefetch.add((rev, path))
646
647 # Prefetch the base contents for lineranges().
648 for (baserev, fixrev, path), basepath in basepaths.items():
649 toprefetch.add((baserev, basepath))
650
651 if toprefetch:
652 scmutil.prefetchfiles(repo, [
653 (rev, scmutil.matchfiles(repo, [path])) for rev, path in toprefetch
654 ])
655
633 656
634 657 def fixfile(ui, repo, opts, fixers, fixctx, path, basepaths, basectxs):
635 658 """Run any configured fixers that should affect the file in this context
636 659
637 660 Returns the file content that results from applying the fixers in some order
638 661 starting with the file's content in the fixctx. Fixers that support line
639 662 ranges will affect lines that have changed relative to any of the basectxs
640 663 (i.e. they will only avoid lines that are common to all basectxs).
641 664
642 665 A fixer tool's stdout will become the file's new content if and only if it
643 666 exits with code zero. The fixer tool's working directory is the repository's
644 667 root.
645 668 """
646 669 metadata = {}
647 670 newdata = fixctx[path].data()
648 671 for fixername, fixer in pycompat.iteritems(fixers):
649 672 if fixer.affects(opts, fixctx, path):
650 673 ranges = lineranges(
651 674 opts, path, basepaths, basectxs, fixctx, newdata)
652 675 command = fixer.command(ui, path, ranges)
653 676 if command is None:
654 677 continue
655 678 ui.debug(b'subprocess: %s\n' % (command,))
656 679 proc = subprocess.Popen(
657 680 procutil.tonativestr(command),
658 681 shell=True,
659 682 cwd=procutil.tonativestr(repo.root),
660 683 stdin=subprocess.PIPE,
661 684 stdout=subprocess.PIPE,
662 685 stderr=subprocess.PIPE,
663 686 )
664 687 stdout, stderr = proc.communicate(newdata)
665 688 if stderr:
666 689 showstderr(ui, fixctx.rev(), fixername, stderr)
667 690 newerdata = stdout
668 691 if fixer.shouldoutputmetadata():
669 692 try:
670 693 metadatajson, newerdata = stdout.split(b'\0', 1)
671 694 metadata[fixername] = pycompat.json_loads(metadatajson)
672 695 except ValueError:
673 696 ui.warn(
674 697 _(b'ignored invalid output from fixer tool: %s\n')
675 698 % (fixername,)
676 699 )
677 700 continue
678 701 else:
679 702 metadata[fixername] = None
680 703 if proc.returncode == 0:
681 704 newdata = newerdata
682 705 else:
683 706 if not stderr:
684 707 message = _(b'exited with status %d\n') % (proc.returncode,)
685 708 showstderr(ui, fixctx.rev(), fixername, message)
686 709 checktoolfailureaction(
687 710 ui,
688 711 _(b'no fixes will be applied'),
689 712 hint=_(
690 713 b'use --config fix.failure=continue to apply any '
691 714 b'successful fixes anyway'
692 715 ),
693 716 )
694 717 return metadata, newdata
695 718
696 719
697 720 def showstderr(ui, rev, fixername, stderr):
698 721 """Writes the lines of the stderr string as warnings on the ui
699 722
700 723 Uses the revision number and fixername to give more context to each line of
701 724 the error message. Doesn't include file names, since those take up a lot of
702 725 space and would tend to be included in the error message if they were
703 726 relevant.
704 727 """
705 728 for line in re.split(b'[\r\n]+', stderr):
706 729 if line:
707 730 ui.warn(b'[')
708 731 if rev is None:
709 732 ui.warn(_(b'wdir'), label=b'evolve.rev')
710 733 else:
711 734 ui.warn(b'%d' % rev, label=b'evolve.rev')
712 735 ui.warn(b'] %s: %s\n' % (fixername, line))
713 736
714 737
715 738 def writeworkingdir(repo, ctx, filedata, replacements):
716 739 """Write new content to the working copy and check out the new p1 if any
717 740
718 741 We check out a new revision if and only if we fixed something in both the
719 742 working directory and its parent revision. This avoids the need for a full
720 743 update/merge, and means that the working directory simply isn't affected
721 744 unless the --working-dir flag is given.
722 745
723 746 Directly updates the dirstate for the affected files.
724 747 """
725 748 for path, data in pycompat.iteritems(filedata):
726 749 fctx = ctx[path]
727 750 fctx.write(data, fctx.flags())
728 751 if repo.dirstate[path] == b'n':
729 752 repo.dirstate.normallookup(path)
730 753
731 754 oldparentnodes = repo.dirstate.parents()
732 755 newparentnodes = [replacements.get(n, n) for n in oldparentnodes]
733 756 if newparentnodes != oldparentnodes:
734 757 repo.setparents(*newparentnodes)
735 758
736 759
def replacerev(ui, repo, ctx, filedata, replacements):
    """Commit a new revision like the given one, but with file content changes

    "ctx" is the original revision to be replaced by a modified one.

    "filedata" is a dict that maps paths to their new file content. All other
    paths will be recreated from the original revision without changes.
    "filedata" may contain paths that didn't exist in the original revision;
    they will be added.

    "replacements" is a dict that maps a single node to a single node, and it is
    updated to indicate the original revision is replaced by the newly created
    one. No entry is added if the replacement's node already exists.

    The new revision has the same parents as the old one, unless those parents
    have already been replaced, in which case those replacements are the parents
    of this new revision. Thus, if revisions are replaced in topological order,
    there is no need to rebase them into the original topology later.
    """

    p1rev, p2rev = repo.changelog.parentrevs(ctx.rev())
    p1ctx, p2ctx = repo[p1rev], repo[p2rev]
    newp1node = replacements.get(p1ctx.node(), p1ctx.node())
    newp2node = replacements.get(p2ctx.node(), p2ctx.node())

    # We don't want to create a revision that has no changes from the original,
    # but we should if the original revision's parent has been replaced.
    # Otherwise, we would produce an orphan that needs no actual human
    # intervention to evolve. We can't rely on commit() to avoid creating the
    # un-needed revision because the extra field added below produces a new hash
    # regardless of file content changes.
    if (
        not filedata
        and p1ctx.node() not in replacements
        and p2ctx.node() not in replacements
    ):
        return

    extra = ctx.extra().copy()
    extra[b'fix_source'] = ctx.hex()

    wctx = context.overlayworkingctx(repo)
    wctx.setbase(repo[newp1node])
    merge.revert_to(ctx, wc=wctx)
    copies.graftcopies(wctx, ctx, ctx.p1())

    for path in filedata.keys():
        fctx = ctx[path]
        copysource = fctx.copysource()
        wctx.write(path, filedata[path], flags=fctx.flags())
        if copysource:
            wctx.markcopied(path, copysource)

    memctx = wctx.tomemctx(
        text=ctx.description(),
        branch=ctx.branch(),
        extra=extra,
        date=ctx.date(),
        parents=(newp1node, newp2node),
        user=ctx.user(),
    )

    sucnode = memctx.commit()
    prenode = ctx.node()
    if prenode == sucnode:
        ui.debug(b'node %s already existed\n' % (ctx.hex()))
    else:
        replacements[ctx.node()] = sucnode


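# The topological-order property described in replacerev's docstring can be
# illustrated with plain dicts: when a child is processed after its parent,
# the parent lookup in `replacements` already points at the rewritten node.
# The node names and the `replace` helper below are invented for illustration.

```python
replacements = {}

def replace(node, parent):
    # Mirror of replacerev's parent lookup: prefer the rewritten parent.
    newparent = replacements.get(parent, parent)
    newnode = b'fixed-' + node
    replacements[node] = newnode
    return newnode, newparent

# Process a two-revision stack parent-first (topological order).
_, pa = replace(b'a', b'root')
assert pa == b'root'          # 'root' itself was never replaced
_, pb = replace(b'b', b'a')
assert pb == b'fixed-a'       # child is parented on the rewritten 'a'
```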
def getfixers(ui):
    """Returns a map of configured fixer tools indexed by their names

    Each value is a Fixer object with methods that implement the behavior of the
    fixer's config suboptions. Does not validate the config values.
    """
    fixers = {}
    for name in fixernames(ui):
        enabled = ui.configbool(b'fix', name + b':enabled')
        command = ui.config(b'fix', name + b':command')
        pattern = ui.config(b'fix', name + b':pattern')
        linerange = ui.config(b'fix', name + b':linerange')
        priority = ui.configint(b'fix', name + b':priority')
        metadata = ui.configbool(b'fix', name + b':metadata')
        skipclean = ui.configbool(b'fix', name + b':skipclean')
        # Don't use a fixer if it has no pattern configured. It would be
        # dangerous to let it affect all files. It would be pointless to let it
        # affect no files. There is no reasonable subset of files to use as the
        # default.
        if command is None:
            ui.warn(
                _(b'fixer tool has no command configuration: %s\n') % (name,)
            )
        elif pattern is None:
            ui.warn(
                _(b'fixer tool has no pattern configuration: %s\n') % (name,)
            )
        elif not enabled:
            ui.debug(b'ignoring disabled fixer tool: %s\n' % (name,))
        else:
            fixers[name] = Fixer(
                command, pattern, linerange, priority, metadata, skipclean
            )
    return collections.OrderedDict(
        sorted(fixers.items(), key=lambda item: item[1]._priority, reverse=True)
    )


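# The ordering produced by getfixers() can be reproduced with plain data:
# fixers run highest priority first, and since sorted() is stable, tools with
# equal priority keep their original relative order. Tool names are invented.

```python
import collections

# Map tool name -> priority (stand-in for Fixer._priority).
fixers = {b'sorter': 2, b'formatter': 0, b'linter': 2}
ordered = collections.OrderedDict(
    sorted(fixers.items(), key=lambda item: item[1], reverse=True)
)
# Highest priority first; ties keep insertion order (stable sort).
assert list(ordered) == [b'sorter', b'linter', b'formatter']
```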
def fixernames(ui):
    """Returns the names of [fix] config options that have suboptions"""
    names = set()
    for k, v in ui.configitems(b'fix'):
        if b':' in k:
            names.add(k.split(b':', 1)[0])
    return names


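# fixernames() only needs the key split shown above. A standalone pass over
# example [fix] config items (the second entry is a plain option, not a fixer
# suboption, so it is skipped):

```python
items = [
    (b'clang-format:command', b'clang-format'),
    (b'maxfilesize', b'2MB'),  # no b':', so not a fixer suboption
    (b'clang-format:pattern', b'set:**.cpp'),
]
names = set()
for k, v in items:
    if b':' in k:
        names.add(k.split(b':', 1)[0])

# Both suboptions collapse to a single fixer name.
assert names == {b'clang-format'}
```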
class Fixer(object):
    """Wraps the raw config values for a fixer with methods"""

    def __init__(
        self, command, pattern, linerange, priority, metadata, skipclean
    ):
        self._command = command
        self._pattern = pattern
        self._linerange = linerange
        self._priority = priority
        self._metadata = metadata
        self._skipclean = skipclean

    def affects(self, opts, fixctx, path):
        """Should this fixer run on the file at the given path and context?"""
        repo = fixctx.repo()
        matcher = matchmod.match(
            repo.root, repo.root, [self._pattern], ctx=fixctx
        )
        return matcher(path)

    def shouldoutputmetadata(self):
        """Should the stdout of this fixer start with JSON and a null byte?"""
        return self._metadata

    def command(self, ui, path, ranges):
        """A shell command to use to invoke this fixer on the given file/lines

        May return None if there is no appropriate command to run for the given
        parameters.
        """
        expand = cmdutil.rendercommandtemplate
        parts = [
            expand(
                ui,
                self._command,
                {b'rootpath': path, b'basename': os.path.basename(path)},
            )
        ]
        if self._linerange:
            if self._skipclean and not ranges:
                # No line ranges to fix, so don't run the fixer.
                return None
            for first, last in ranges:
                parts.append(
                    expand(
                        ui, self._linerange, {b'first': first, b'last': last}
                    )
                )
        return b' '.join(parts)
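# Fixer.command() boils down to one expansion of :command plus one expansion
# of :linerange per changed range, joined with spaces. A minimal sketch using
# str.format in place of Mercurial's rendercommandtemplate; the tool name and
# flags are hypothetical examples, not part of this module.

```python
import os

command = 'clang-format --assume-filename={rootpath}'
linerange = '--lines={first}:{last}'
path = 'src/main.cpp'
ranges = [(1, 4), (10, 12)]

# Expand the base command, then append one :linerange expansion per range.
parts = [command.format(rootpath=path, basename=os.path.basename(path))]
for first, last in ranges:
    parts.append(linerange.format(first=first, last=last))

assert ' '.join(parts) == (
    'clang-format --assume-filename=src/main.cpp --lines=1:4 --lines=10:12'
)
```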