copies-rust: add smarter approach for merging small mapping with large mapping

The current approach (finding the smaller updated set) works well when the two mappings have similar sizes, but does a lot of unnecessary work when one side is much smaller than the other. This change handles those cases better. See the inline documentation for details.

It gives a sizeable boost to many of our slower cases:

Repo Case Source-Rev Dest-Rev # of revisions old time new time Difference Factor time per rev
---------------------------------------------------------------------------------------------------------------------------------------------------------------
mozilla-try x00000_revs_x_added_0_copies 6a320851d377 1ebb79acd503 : 363753 revs, 18.123103 s, 5.693818 s, -12.429285 s, × 0.3142, 15 µs/rev
mozilla-try x00000_revs_x_added_x_copies 5173c4b6f97c 95d83ee7242d : 362229 revs, 17.907312 s, 5.677655 s, -12.229657 s, × 0.3171, 15 µs/rev
mozilla-try x00000_revs_x000_added_x_copies 9126823d0e9c ca82787bb23c : 359344 revs, 17.684797 s, 5.563370 s, -12.121427 s, × 0.3146, 15 µs/rev
mozilla-try x00000_revs_x0000_added_x0000_copies 8d3fafa80d4b eb884023b810 : 192665 revs, 2.881471 s, 2.864099 s, -0.017372 s, × 0.9940, 14 µs/rev
mozilla-try x00000_revs_x00000_added_x000_copies 9b2a99adc05e 8e29777b48e6 : 382065 revs, 63.148971 s, 59.498652 s, -3.650319 s, × 0.9422, 155 µs/rev

Ideally, the im-rs object would have a `merge` method, but it does not (yet).

Full timing comparison below (there is one pathological case that becomes even worse, for unclear reasons).
Repo Case Source-Rev Dest-Rev # of revisions old time new time Difference Factor time per rev
---------------------------------------------------------------------------------------------------------------------------------------------------------------
mercurial x_revs_x_added_0_copies ad6b123de1c7 39cfcef4f463 : 1 revs, 0.000043 s, 0.000042 s, -0.000001 s, × 0.9767, 42 µs/rev
mercurial x_revs_x_added_x_copies 2b1c78674230 0c1d10351869 : 6 revs, 0.000105 s, 0.000104 s, -0.000001 s, × 0.9905, 17 µs/rev
mercurial x000_revs_x000_added_x_copies 81f8ff2a9bf2 dd3267698d84 : 1032 revs, 0.004895 s, 0.004913 s, +0.000018 s, × 1.0037, 4 µs/rev
pypy x_revs_x_added_0_copies aed021ee8ae8 099ed31b181b : 9 revs, 0.000194 s, 0.000191 s, -0.000003 s, × 0.9845, 21 µs/rev
pypy x_revs_x000_added_0_copies 4aa4e1f8e19a 359343b9ac0e : 1 revs, 0.000050 s, 0.000050 s, +0.000000 s, × 1.0000, 50 µs/rev
pypy x_revs_x_added_x_copies ac52eb7bbbb0 72e022663155 : 7 revs, 0.000115 s, 0.000112 s, -0.000003 s, × 0.9739, 16 µs/rev
pypy x_revs_x00_added_x_copies c3b14617fbd7 ace7255d9a26 : 1 revs, 0.000289 s, 0.000288 s, -0.000001 s, × 0.9965, 288 µs/rev
pypy x_revs_x000_added_x000_copies df6f7a526b60 a83dc6a2d56f : 6 revs, 0.010513 s, 0.010411 s, -0.000102 s, × 0.9903, 1735 µs/rev
pypy x000_revs_xx00_added_0_copies 89a76aede314 2f22446ff07e : 4785 revs, 0.051474 s, 0.052852 s, +0.001378 s, × 1.0268, 11 µs/rev
pypy x000_revs_x000_added_x_copies 8a3b5bfd266e 2c68e87c3efe : 6780 revs, 0.088086 s, 0.092828 s, +0.004742 s, × 1.0538, 13 µs/rev
pypy x000_revs_x000_added_x000_copies 89a76aede314 7b3dda341c84 : 5441 revs, 0.062176 s, 0.063269 s, +0.001093 s, × 1.0176, 11 µs/rev
pypy x0000_revs_x_added_0_copies d1defd0dc478 c9cb1334cc78 : 43645 revs, 0.720950 s, 0.711975 s, -0.008975 s, × 0.9876, 16 µs/rev
pypy x0000_revs_xx000_added_0_copies bf2c629d0071 4ffed77c095c : 2 revs, 0.012897 s, 0.012771 s, -0.000126 s, × 0.9902, 6385 µs/rev
pypy x0000_revs_xx000_added_x000_copies 08ea3258278e d9fa043f30c0 : 11316 revs, 0.121524 s, 0.124505 s, +0.002981 s, × 1.0245, 11 µs/rev
netbeans x_revs_x_added_0_copies fb0955ffcbcd a01e9239f9e7 : 2 revs, 0.000082 s, 0.000082 s, +0.000000 s, × 1.0000, 41 µs/rev
netbeans x_revs_x000_added_0_copies 6f360122949f 20eb231cc7d0 : 2 revs, 0.000109 s, 0.000111 s, +0.000002 s, × 1.0183, 55 µs/rev
netbeans x_revs_x_added_x_copies 1ada3faf6fb6 5a39d12eecf4 : 3 revs, 0.000175 s, 0.000171 s, -0.000004 s, × 0.9771, 57 µs/rev
netbeans x_revs_x00_added_x_copies 35be93ba1e2c 9eec5e90c05f : 9 revs, 0.000719 s, 0.000708 s, -0.000011 s, × 0.9847, 78 µs/rev
netbeans x000_revs_xx00_added_0_copies eac3045b4fdd 51d4ae7f1290 : 1421 revs, 0.010426 s, 0.010608 s, +0.000182 s, × 1.0175, 7 µs/rev
netbeans x000_revs_x000_added_x_copies e2063d266acd 6081d72689dc : 1533 revs, 0.015712 s, 0.015635 s, -0.000077 s, × 0.9951, 10 µs/rev
netbeans x000_revs_x000_added_x000_copies ff453e9fee32 411350406ec2 : 5750 revs, 0.077353 s, 0.072072 s, -0.005281 s, × 0.9317, 12 µs/rev
netbeans x0000_revs_xx000_added_x000_copies 588c2d1ced70 1aad62e59ddd : 66949 revs, 0.673930 s, 0.682732 s, +0.008802 s, × 1.0131, 10 µs/rev
mozilla-central x_revs_x_added_0_copies 3697f962bb7b 7015fcdd43a2 : 2 revs, 0.000089 s, 0.000090 s, +0.000001 s, × 1.0112, 45 µs/rev
mozilla-central x_revs_x000_added_0_copies dd390860c6c9 40d0c5bed75d : 8 revs, 0.000212 s, 0.000210 s, -0.000002 s, × 0.9906, 26 µs/rev
mozilla-central x_revs_x_added_x_copies 8d198483ae3b 14207ffc2b2f : 9 revs, 0.000183 s, 0.000182 s, -0.000001 s, × 0.9945, 20 µs/rev
mozilla-central x_revs_x00_added_x_copies 98cbc58cc6bc 446a150332c3 : 7 revs, 0.000595 s, 0.000594 s, -0.000001 s, × 0.9983, 84 µs/rev
mozilla-central x_revs_x000_added_x000_copies 3c684b4b8f68 0a5e72d1b479 : 3 revs, 0.003117 s, 0.003102 s, -0.000015 s, × 0.9952, 1034 µs/rev
mozilla-central x_revs_x0000_added_x0000_copies effb563bb7e5 c07a39dc4e80 : 6 revs, 0.060197 s, 0.060234 s, +0.000037 s, × 1.0006, 10039 µs/rev
mozilla-central x000_revs_xx00_added_0_copies 6100d773079a 04a55431795e : 1593 revs, 0.006379 s, 0.006300 s, -0.000079 s, × 0.9876, 3 µs/rev
mozilla-central x000_revs_x000_added_x_copies 9f17a6fc04f9 2d37b966abed : 41 revs, 0.005008 s, 0.004817 s, -0.000191 s, × 0.9619, 117 µs/rev
mozilla-central x000_revs_x000_added_x000_copies 7c97034feb78 4407bd0c6330 : 7839 revs, 0.065123 s, 0.065451 s, +0.000328 s, × 1.0050, 8 µs/rev
mozilla-central x0000_revs_xx000_added_0_copies 9eec5917337d 67118cc6dcad : 615 revs, 0.026404 s, 0.026282 s, -0.000122 s, × 0.9954, 42 µs/rev
mozilla-central x0000_revs_xx000_added_x000_copies f78c615a656c 96a38b690156 : 30263 revs, 0.203456 s, 0.206873 s, +0.003417 s, × 1.0168, 6 µs/rev
mozilla-central x00000_revs_x0000_added_x0000_copies 6832ae71433c 4c222a1d9a00 : 153721 revs, 1.929809 s, 1.935918 s, +0.006109 s, × 1.0032, 12 µs/rev
mozilla-central x00000_revs_x00000_added_x000_copies 76caed42cf7c 1daa622bbe42 : 204976 revs, 2.825064 s, 2.827320 s, +0.002256 s, × 1.0008, 13 µs/rev
mozilla-try x_revs_x_added_0_copies aaf6dde0deb8 9790f499805a : 2 revs, 0.000857 s, 0.000842 s, -0.000015 s, × 0.9825, 421 µs/rev
mozilla-try x_revs_x000_added_0_copies d8d0222927b4 5bb8ce8c7450 : 2 revs, 0.000870 s, 0.000870 s, +0.000000 s, × 1.0000, 435 µs/rev
mozilla-try x_revs_x_added_x_copies 092fcca11bdb 936255a0384a : 4 revs, 0.000161 s, 0.000165 s, +0.000004 s, × 1.0248, 41 µs/rev
mozilla-try x_revs_x00_added_x_copies b53d2fadbdb5 017afae788ec : 2 revs, 0.001147 s, 0.001145 s, -0.000002 s, × 0.9983, 572 µs/rev
mozilla-try x_revs_x000_added_x000_copies 20408ad61ce5 6f0ee96e21ad : 1 revs, 0.026640 s, 0.026500 s, -0.000140 s, × 0.9947, 26500 µs/rev
mozilla-try x_revs_x0000_added_x0000_copies effb563bb7e5 c07a39dc4e80 : 6 revs, 0.059849 s, 0.059407 s, -0.000442 s, × 0.9926, 9901 µs/rev
mozilla-try x000_revs_xx00_added_0_copies 6100d773079a 04a55431795e : 1593 revs, 0.006326 s, 0.006325 s, -0.000001 s, × 0.9998, 3 µs/rev
mozilla-try x000_revs_x000_added_x_copies 9f17a6fc04f9 2d37b966abed : 41 revs, 0.005188 s, 0.005171 s, -0.000017 s, × 0.9967, 126 µs/rev
mozilla-try x000_revs_x000_added_x000_copies 1346fd0130e4 4c65cbdabc1f : 6657 revs, 0.067633 s, 0.066837 s, -0.000796 s, × 0.9882, 10 µs/rev
mozilla-try x0000_revs_x_added_0_copies 63519bfd42ee a36a2a865d92 : 40314 revs, 0.306969 s, 0.314252 s, +0.007283 s, × 1.0237, 7 µs/rev
mozilla-try x0000_revs_x_added_x_copies 9fe69ff0762d bcabf2a78927 : 38690 revs, 0.293370 s, 0.304160 s, +0.010790 s, × 1.0368, 7 µs/rev
mozilla-try x0000_revs_xx000_added_x_copies 156f6e2674f2 4d0f2c178e66 : 8598 revs, 0.087159 s, 0.089223 s, +0.002064 s, × 1.0237, 10 µs/rev
mozilla-try x0000_revs_xx000_added_0_copies 9eec5917337d 67118cc6dcad : 615 revs, 0.027251 s, 0.026711 s, -0.000540 s, × 0.9802, 43 µs/rev
mozilla-try x0000_revs_xx000_added_x000_copies 89294cd501d9 7ccb2fc7ccb5 : 97052 revs, 3.010011 s, 3.243010 s, +0.232999 s, × 1.0774, 33 µs/rev
mozilla-try x0000_revs_x0000_added_x0000_copies e928c65095ed e951f4ad123a : 52031 revs, 0.753434 s, 0.756500 s, +0.003066 s, × 1.0041, 14 µs/rev
mozilla-try x00000_revs_x_added_0_copies 6a320851d377 1ebb79acd503 : 363753 revs, 18.123103 s, 5.693818 s, -12.429285 s, × 0.3142, 15 µs/rev
mozilla-try x00000_revs_x00000_added_0_copies dc8a3ca7010e d16fde900c9c : 34414 revs, 0.583206 s, 0.590904 s, +0.007698 s, × 1.0132, 17 µs/rev
mozilla-try x00000_revs_x_added_x_copies 5173c4b6f97c 95d83ee7242d : 362229 revs, 17.907312 s, 5.677655 s, -12.229657 s, × 0.3171, 15 µs/rev
mozilla-try x00000_revs_x000_added_x_copies 9126823d0e9c ca82787bb23c : 359344 revs, 17.684797 s, 5.563370 s, -12.121427 s, × 0.3146, 15 µs/rev
mozilla-try x00000_revs_x0000_added_x0000_copies 8d3fafa80d4b eb884023b810 : 192665 revs, 2.881471 s, 2.864099 s, -0.017372 s, × 0.9940, 14 µs/rev
mozilla-try x00000_revs_x00000_added_x0000_copies 1b661134e2ca 1ae03d022d6d : 228985 revs, 101.062002 s, 113.297287 s, +12.235285 s, × 1.1211, 494 µs/rev
mozilla-try x00000_revs_x00000_added_x000_copies 9b2a99adc05e 8e29777b48e6 : 382065 revs, 63.148971 s, 59.498652 s, -3.650319 s, × 0.9422, 155 µs/rev

Differential Revision: https://phab.mercurial-scm.org/D9491
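The derived columns of the timing table can be recomputed from the old time, the new time and the revision count. The sketch below does that for the first mozilla-try row that improved. The helper name `timing_summary` is made up for illustration, and the truncation of the per-revision figure is an inference from the published numbers (15.65 µs is shown as 15), not something the commit message states.

```python
def timing_summary(old_s, new_s, revs):
    """Derive the Difference, Factor and time-per-rev columns for one row."""
    return {
        "difference_s": new_s - old_s,   # negative means the new code is faster
        "factor": new_s / old_s,         # below 1.0 means the new code is faster
        # Assumption: the table appears to truncate rather than round.
        "us_per_rev": int(new_s / revs * 1e6),
    }


# mozilla-try x00000_revs_x_added_0_copies, values copied from the table.
row = timing_summary(18.123103, 5.693818, 363753)
print("%+.6f s, x %.4f, %d us/rev"
      % (row["difference_s"], row["factor"], row["us_per_rev"]))
```

A factor near 0.31 means the new code needs a little under a third of the old run time for that case.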

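The commit message defers the details of the "smarter approach" to the inline documentation, which is not reproduced here. As a rough illustration only, one common way to merge a small mapping into a large one is to iterate over the smaller side. The function `merge_mappings` and its `resolve` callback are hypothetical names, and this Python sketch is not the Rust implementation.

```python
def merge_mappings(major, minor, resolve):
    """Merge two dicts, iterating only over the smaller one.

    `resolve(major_value, minor_value)` picks the winner on conflicts.
    Note: dict() copies in O(n) here; a persistent map such as the ones
    im-rs provides shares structure, so the copy would be cheap there.
    """
    if len(minor) <= len(major):
        result = dict(major)
        for key, value in minor.items():
            if key in result:
                result[key] = resolve(result[key], value)
            else:
                result[key] = value
    else:
        result = dict(minor)
        for key, value in major.items():
            if key in result:
                result[key] = resolve(value, result[key])
            else:
                result[key] = value
    return result


merged = merge_mappings({"a": 1, "b": 2, "c": 3}, {"b": 20}, max)
```

With this shape the loop does work proportional to the smaller mapping, which is the situation the commit message says the previous approach handled poorly.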
File last commit:

r46554:89a2afe3 default
r46744:c94d013e default
check-code.py
1109 lines | 35.3 KiB | text/x-python | PythonLexer
Gregory Szorc
global: use python3 in shebangs...
r46434 #!/usr/bin/env python3
Matt Mackall
Introduce check-code.py...
r10281 #
# check-code - a style and portability checker for Mercurial
#
Matt Mackall
check-code: fix copyright date
r10290 # Copyright 2010 Matt Mackall <mpm@selenic.com>
Matt Mackall
Introduce check-code.py...
r10281 #
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
Simon Heimberg
check-code: explain what to do when a check-code rule mismatches...
r20241 """style and portability checker for Mercurial
when a rule triggers wrong, do one of the following (prefer one from top):
* do the work-around the rule suggests
* doublecheck that it is a false match
* improve the rule pattern
* add an ignore pattern to the rule (3rd arg) which matches your good line
timeless
py24: remove check-code py24 notation...
r28700 (you can append a short comment and match this, like: #re-raises)
Simon Heimberg
check-code: explain what to do when a check-code rule mismatches...
r20241 * change the pattern to a warning and list the exception in test-check-code-hg
* ONLY use no--check-code for skipping entire files from external sources
"""
Pulkit Goyal
check-code: use absolute_import and print_function
r28509 from __future__ import absolute_import, print_function
import glob
Thomas Arendsen Hein
check-code: check for gratuitous whitespace after Python keywords
r13074 import keyword
Matt Mackall
check-code: add a warnings level...
r10895 import optparse
Pulkit Goyal
check-code: use absolute_import and print_function
r28509 import os
import re
import sys
Augie Fackler
formatting: blacken the codebase...
r43346
timeless
check-code: handle py3 open divergence...
r29145 if sys.version_info[0] < 3:
    opentext = open
else:
Augie Fackler
formatting: blacken the codebase...
r43346
timeless
check-code: handle py3 open divergence...
r29145 def opentext(f):
Augie Fackler
contrib: have check-code look at files in latin1 instead of ascii...
r39091 return open(f, encoding='latin1')
Augie Fackler
formatting: blacken the codebase...
r43346
Simon Heimberg
check-code: introduce function for using re2 when available...
r19310 try:
timeless
check-code: handle range/xrange divergence
r29143 xrange
except NameError:
    xrange = range
try:
Simon Heimberg
check-code: introduce function for using re2 when available...
r19310 import re2
except ImportError:
    re2 = None
FUJIWARA Katsunori
contrib: make check-code.py check code fragments embedded in test scripts
r41992 import testparseutil
Augie Fackler
formatting: blacken the codebase...
r43346
Simon Heimberg
check-code: introduce function for using re2 when available...
r19310 def compilere(pat, multiline=False):
    if multiline:
        pat = '(?m)' + pat
    if re2:
        try:
            return re2.compile(pat)
        except re2.error:
            pass
    return re.compile(pat)
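The fallback in `compilere` can be exercised on its own. The sketch below restates the function so that it runs standalone, then compiles two patterns: a plain multiline one and one with a lookbehind taken from the shell rules further down. RE2 does not support lookbehind, so on a system where `re2` is installed the second pattern would be expected to take the `re` fallback path; that expectation is an assumption about the binding in use.

```python
import re

try:
    import re2  # optional third-party binding; often not installed
except ImportError:
    re2 = None


def compilere(pat, multiline=False):
    """Compile with re2 when possible, otherwise with the stdlib re."""
    if multiline:
        pat = '(?m)' + pat
    if re2:
        try:
            return re2.compile(pat)
        except re2.error:
            pass
    return re.compile(pat)


anchored = compilere(r'^foo', multiline=True)
lookbehind = compilere(r'(?<!hg )grep.* -a')
```

Either way the caller gets an object with the usual `search` method, so the rest of the checker does not need to know which engine compiled a given rule.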
Matt Mackall
Introduce check-code.py...
r10281
Augie Fackler
formatting: blacken the codebase...
r43346
FUJIWARA Katsunori
check-code: build translation table for repquote in global for efficiency...
r29398 # check "rules depending on implementation of repquote()" in each
# patterns (especially pypats), before changing around repquote()
Augie Fackler
formatting: blacken the codebase...
r43346 _repquotefixedmap = {
' ': ' ',
'\n': '\n',
'.': 'p',
':': 'q',
'%': '%',
'\\': 'b',
'*': 'A',
'+': 'P',
'-': 'M',
}
FUJIWARA Katsunori
check-code: build translation table for repquote in global for efficiency...
r29398 def _repquoteencodechr(i):
    if i > 255:
        return 'u'
    c = chr(i)
    if c in _repquotefixedmap:
        return _repquotefixedmap[c]
    if c.isalpha():
        return 'x'
    if c.isdigit():
        return 'n'
    return 'o'
Augie Fackler
formatting: blacken the codebase...
r43346
FUJIWARA Katsunori
check-code: build translation table for repquote in global for efficiency...
r29398 _repquotett = ''.join(_repquoteencodechr(i) for i in xrange(256))
Augie Fackler
formatting: blacken the codebase...
r43346
Matt Mackall
Introduce check-code.py...
r10281 def repquote(m):
Simon Heimberg
check-code: more replacement characters...
r19999 t = m.group('text')
FUJIWARA Katsunori
check-code: build translation table for repquote in global for efficiency...
r29398 t = t.translate(_repquotett)
Benoit Boissinot
check-code: improve quote detection regexp, add tests
r10722 return m.group('quote') + t + m.group('quote')
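To see what the translation table does to the body of a string literal, the table can be rebuilt and applied directly. This is a standalone sketch for Python 3, with `range` standing in for the `xrange` alias and shortened names (`fixedmap`, `encodechr`, `table`) chosen for the example.

```python
fixedmap = {
    ' ': ' ', '\n': '\n', '.': 'p', ':': 'q', '%': '%',
    '\\': 'b', '*': 'A', '+': 'P', '-': 'M',
}


def encodechr(i):
    """Map one code point to its placeholder character."""
    c = chr(i)
    if c in fixedmap:
        return fixedmap[c]
    if c.isalpha():
        return 'x'
    if c.isdigit():
        return 'n'
    return 'o'


# A 256-character string works as a str.translate() table because it is
# indexed by code point; code points above 255 are left unchanged.
table = ''.join(encodechr(i) for i in range(256))

masked = "hello world: 42%.".translate(table)
```

Letters collapse to `x` and digits to `n`, while spaces, newlines and a few punctuation marks keep a recognizable placeholder. That is what lets the rules marked "depending on implementation of repquote()" reason about the shape of a message without seeing its text.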
Matt Mackall
Introduce check-code.py...
r10281
Augie Fackler
formatting: blacken the codebase...
r43346
Benoit Boissinot
check-code: more tests and more robust python filtering
r10727 def reppython(m):
    comment = m.group('comment')
    if comment:
Mads Kiilerich
check-code: catch trailing space in comments
r18959 l = len(comment.rstrip())
        return "#" * l + comment[l:]
Benoit Boissinot
check-code: more tests and more robust python filtering
r10727 return repquote(m)
Matt Mackall
Introduce check-code.py...
r10281
Augie Fackler
formatting: blacken the codebase...
r43346
Matt Mackall
Introduce check-code.py...
r10281 def repcomment(m):
    return m.group(1) + "#" * len(m.group(2))
Augie Fackler
formatting: blacken the codebase...
r43346
Matt Mackall
Introduce check-code.py...
r10281 def repccomment(m):
    t = re.sub(r"((?<=\n) )|\S", "x", m.group(2))
    return m.group(1) + t + "*/"
Augie Fackler
formatting: blacken the codebase...
r43346
Matt Mackall
Introduce check-code.py...
r10281 def repcallspaces(m):
    t = re.sub(r"\n\s+", "\n", m.group(2))
    return m.group(1) + t
Augie Fackler
formatting: blacken the codebase...
r43346
Matt Mackall
Introduce check-code.py...
r10281 def repinclude(m):
    return m.group(1) + "<foo>"
Augie Fackler
formatting: blacken the codebase...
r43346
Matt Mackall
Introduce check-code.py...
r10281 def rephere(m):
    t = re.sub(r"\S", "x", m.group(2))
    return m.group(1) + t
testpats = [
Augie Fackler
formatting: blacken the codebase...
r43346 [
(r'\b(push|pop)d\b', "don't use 'pushd' or 'popd', use 'cd'"),
(r'\W\$?\(\([^\)\n]*\)\)', "don't use (()) or $(()), use 'expr'"),
(r'grep.*-q', "don't use 'grep -q', redirect to /dev/null"),
(r'(?<!hg )grep.* -a', "don't use 'grep -a', use in-line python"),
(r'sed.*-i', "don't use 'sed -i', use a temporary file"),
(r'\becho\b.*\\n', "don't use 'echo \\n', use printf"),
(r'echo -n', "don't use 'echo -n', use printf"),
(r'(^|\|\s*)\bwc\b[^|]*$\n(?!.*\(re\))', "filter wc output"),
(r'head -c', "don't use 'head -c', use 'dd'"),
(r'tail -n', "don't use the '-n' option to tail, just use '-<num>'"),
(r'sha1sum', "don't use sha1sum, use $TESTDIR/md5sum.py"),
(r'\bls\b.*-\w*R', "don't use 'ls -R', use 'find'"),
(r'printf.*[^\\]\\([1-9]|0\d)', r"don't use 'printf \NNN', use Python"),
(r'printf.*[^\\]\\x', "don't use printf \\x, use Python"),
(r'rm -rf \*', "don't use naked rm -rf, target a directory"),
(
r'\[[^\]]+==',
'[ foo == bar ] is a bashism, use [ foo = bar ] instead',
),
(
r'(^|\|\s*)grep (-\w\s+)*[^|]*[(|]\w',
"use egrep for extended grep syntax",
),
(r'(^|\|\s*)e?grep .*\\S', "don't use \\S in regular expression"),
(r'(?<!!)/bin/', "don't use explicit paths for tools"),
(r'#!.*/bash', "don't use bash in shebang, use sh"),
(r'[^\n]\Z', "no trailing newline"),
(r'export .*=', "don't export and assign at once"),
(r'^source\b', "don't use 'source', use '.'"),
(r'touch -d', "don't use 'touch -d', use 'touch -t' instead"),
(r'\bls +[^|\n-]+ +-', "options to 'ls' must come before filenames"),
(r'[^>\n]>\s*\$HGRCPATH', "don't overwrite $HGRCPATH, append to it"),
(r'^stop\(\)', "don't use 'stop' as a shell function name"),
(r'(\[|\btest\b).*-e ', "don't use 'test -e', use 'test -f'"),
(r'\[\[\s+[^\]]*\]\]', "don't use '[[ ]]', use '[ ]'"),
(r'^alias\b.*=', "don't use alias, use a function"),
(r'if\s*!', "don't use '!' to negate exit status"),
(r'/dev/u?random', "don't use entropy, use /dev/zero"),
(r'do\s*true;\s*done', "don't use true as loop body, use sleep 0"),
(
r'sed (-e )?\'(\d+|/[^/]*/)i(?!\\\n)',
"put a backslash-escaped newline after sed 'i' command",
),
(r'^diff *-\w*[uU].*$\n(^ \$ |^$)', "prefix diff -u/-U with cmp"),
(r'^\s+(if)? diff *-\w*[uU]', "prefix diff -u/-U with cmp"),
(r'[\s="`\']python\s(?!bindings)', "don't use 'python', use '$PYTHON'"),
(r'seq ', "don't use 'seq', use $TESTDIR/seq.py"),
(r'\butil\.Abort\b', "directly use error.Abort"),
(r'\|&', "don't use |&, use 2>&1"),
(r'\w = +\w', "only one space after = allowed"),
(
r'\bsed\b.*[^\\]\\n',
"don't use 'sed ... \\n', use a \\ and a newline",
),
(r'env.*-u', "don't use 'env -u VAR', use 'unset VAR'"),
(r'cp.* -r ', "don't use 'cp -r', use 'cp -R'"),
(r'grep.* -[ABC]', "don't use grep's context flags"),
(
r'find.*-printf',
"don't use 'find -printf', it doesn't exist on BSD find(1)",
),
(r'\$RANDOM ', "don't use bash-only $RANDOM to generate random values"),
],
# warnings
[
(r'^function', "don't use 'function', use old style"),
(r'^diff.*-\w*N', "don't use 'diff -N'"),
(r'\$PWD|\${PWD}', "don't use $PWD, use `pwd`"),
(r'^([^"\'\n]|("[^"\n]*")|(\'[^\'\n]*\'))*\^', "^ must be quoted"),
(r'kill (`|\$\()', "don't use kill, use killdaemons.py"),
],
Matt Mackall
Introduce check-code.py...
r10281 ]
testfilters = [
Jun Wu
check-code: forbid using bash in shebang...
r34062 (r"( *)(#([^!][^\n]*\S)?)", repcomment),
Matt Mackall
Introduce check-code.py...
r10281 (r"<<(\S+)((.|\n)*?\n\1)", rephere),
]
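The two shell filters mask comments and heredoc bodies before any rule runs, so that text inside them cannot trigger a rule such as `grep.*-q`. The sketch below applies them in list order with a hypothetical helper, `applyfilters`; the code that actually drives the filters lies outside this excerpt.

```python
import re


def repcomment(m):
    return m.group(1) + "#" * len(m.group(2))


def rephere(m):
    t = re.sub(r"\S", "x", m.group(2))
    return m.group(1) + t


testfilters = [
    (r"( *)(#([^!][^\n]*\S)?)", repcomment),
    (r"<<(\S+)((.|\n)*?\n\1)", rephere),
]


def applyfilters(text, filters):
    """Hypothetical driver: run each (pattern, replacement) pair in order."""
    for pat, repl in filters:
        text = re.sub(pat, repl, text)
    return text


script = "echo hi # grep -q here\ncat <<EOF\ngrep -q inside\nEOF\n"
filtered = applyfilters(script, testfilters)
```

Both replacement functions preserve newlines, so a match found in the filtered text still maps back to the right line of the original script.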
Matt Mackall
check-code: fix issues with finding patterns in unified tests, fix tests...
r15372 uprefix = r"^  \$ "
Matt Mackall
check-code: add some basic support for unified tests
r12364 utestpats = [
Augie Fackler
formatting: blacken the codebase...
r43346 [
(r'^(\S.*||  [$>] \S.*)[ \t]\n', "trailing whitespace on non-output"),
(
uprefix + r'.*\|\s*sed[^|>\n]*\n',
"use regex test output patterns instead of sed",
),
(uprefix + r'(true|exit 0)', "explicit zero exit unnecessary"),
(uprefix + r'.*(?<!\[)\$\?', "explicit exit code checks unnecessary"),
(
uprefix + r'.*\|\| echo.*(fail|error)',
"explicit exit code checks unnecessary",
),
(uprefix + r'set -e', "don't use set -e"),
(uprefix + r'(\s|fi\b|done\b)', "use > for continued lines"),
(
uprefix + r'.*:\.\S*/',
"x:.y in a path does not work on msys, rewrite "
"as x://.y, or see `hg log -k msys` for alternatives",
r'-\S+:\.|' '# no-msys', # -Rxxx
), # in test-pull.t which is skipped on windows
(
r'^  [^$>].*27\.0\.0\.1',
'use $LOCALIP not an explicit loopback address',
),
(
r'^  (?![>$] ).*\$LOCALIP.*[^)]$',
'mark $LOCALIP output lines with (glob) to help tests in BSD jails',
),
(
r'^  (cat|find): .*: \$ENOENT\$',
'use test -f to test for file existence',
),
(
r'^  diff -[^ -]*p',
"don't use (external) diff with -p for portability",
),
(r' readlink ', 'use readlink.py instead of readlink'),
(
r'^  [-+][-+][-+] .* [-+]0000 \(glob\)',
"glob timezone field in diff output for portability",
),
(
r'^  @@ -[0-9]+ [+][0-9]+,[0-9]+ @@',
"use '@@ -N* +N,n @@ (glob)' style chunk header for portability",
),
(
r'^  @@ -[0-9]+,[0-9]+ [+][0-9]+ @@',
"use '@@ -N,n +N* @@ (glob)' style chunk header for portability",
),
(
r'^  @@ -[0-9]+ [+][0-9]+ @@',
"use '@@ -N* +N* @@ (glob)' style chunk header for portability",
),
(
uprefix + r'hg( +-[^ ]+( +[^ ]+)?)* +extdiff'
r'( +(-[^ po-]+|--(?!program|option)[^ ]+|[^-][^ ]*))*$',
"use $RUNTESTDIR/pdiff via extdiff (or -o/-p for false-positives)",
),
],
# warnings
[
(
r'^  (?!.*\$LOCALIP)[^*?/\n]* \(glob\)$',
"glob match with no glob string (?, *, /, and $LOCALIP)",
),
],
Matt Mackall
check-code: add some basic support for unified tests
r12364 ]
Yuya Nishihara
check-code: allow tabs in heredoc
r35316 # transform plain test rules to unified test's
Mads Kiilerich
check-code: fix checking for sh style in .t tests...
r14203 for i in [0, 1]:
Pierre-Yves David
check-code: allow an escape pattern to be specified for testpattern...
r22101 for tp in testpats[i]:
        p = tp[0]
        m = tp[1]
Augie Fackler
cleanup: remove pointless r-prefixes on double-quoted strings...
r43809 if p.startswith('^'):
            p = "^  [$>] (%s)" % p[1:]
Mads Kiilerich
check-code: fix checking for sh style in .t tests...
r14203 else:
Augie Fackler
cleanup: remove pointless r-prefixes on double-quoted strings...
r43809 p = "^  [$>] .*(%s)" % p
Pierre-Yves David
check-code: allow an escape pattern to be specified for testpattern...
r22101 utestpats[i].append((p, m) + tp[2:])
Matt Mackall
check-code: add some basic support for unified tests
r12364
Yuya Nishihara
check-code: allow tabs in heredoc
r35316 # don't transform the following rules:
# "  > \t" and "  \t" should be allowed in unified tests
testpats[0].append((r'^( *)\t', "don't use tabs to indent"))
utestpats[0].append((r'^( ?)\t', "don't use tabs to indent"))
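The loop above rewrites each plain shell rule so that it only fires on command lines of a unified test. A small sketch of that rewrite, assuming the two-space prefix that Mercurial's `.t` files put before `$ ` and `> `; the helper name `to_unified` is invented for the example.

```python
import re


def to_unified(pattern):
    """Rewrite a shell-script rule to match .t command lines only."""
    if pattern.startswith('^'):
        # Anchored rules must match right after the "  $ " or "  > " prompt.
        return "^  [$>] (%s)" % pattern[1:]
    # Unanchored rules may match anywhere on a command line.
    return "^  [$>] .*(%s)" % pattern


anchored = to_unified(r'^source\b')
floating = to_unified(r'echo -n')

hit_command = re.search(anchored, '  $ source ./env.sh')
hit_continuation = re.search(floating, '  > echo -n foo')
miss_output = re.search(floating, '  echo -n foo')  # expected output line
```

Lines that hold expected output start with two spaces but no prompt character, so the rewritten rules skip them.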
Matt Mackall
check-code: add some basic support for unified tests
r12364 utestfilters = [
Idan Kamara
check-code: replace heredocs in unified tests...
r17711 (r"<<(\S+)((.|\n)*?\n  > \1)", rephere),
Jun Wu
check-code: forbid using bash in shebang...
r34062 (r"( +)(#([^!][^\n]*\S)?)", repcomment),
Matt Mackall
check-code: add some basic support for unified tests
r12364 ]
FUJIWARA Katsunori
contrib: split pypats list in check-code.py...
r41987 # common patterns to check *.py
commonpypats = [
Augie Fackler
formatting: blacken the codebase...
r43346 [
(r'\\$', 'Use () to wrap long lines in Python, not \\'),
(
r'^\s*def\s*\w+\s*\(.*,\s*\(',
"tuple parameter unpacking not available in Python 3+",
),
(
r'lambda\s*\(.*,.*\)',
"tuple parameter unpacking not available in Python 3+",
),
(r'(?<!def)\s+(cmp)\(', "cmp is not available in Python 3+"),
(r'(?<!\.)\breduce\s*\(.*', "reduce is not available in Python 3+"),
(
r'\bdict\(.*=',
'dict() is different in Py2 and 3 and is slower than {}',
'dict-from-generator',
),
(r'\.has_key\b', "dict.has_key is not available in Python 3+"),
(r'\s<>\s', '<> operator is not available in Python 3+, use !='),
(r'^\s*\t', "don't use tabs"),
(r'\S;\s*\n', "semicolon"),
(r'[^_]_\([ \t\n]*(?:"[^"]+"[ \t\n+]*)+%', "don't use % inside _()"),
(r"[^_]_\([ \t\n]*(?:'[^']+'[ \t\n+]*)+%", "don't use % inside _()"),
(r'(\w|\)),\w', "missing whitespace after ,"),
(r'(\w|\))[+/*\-<>]\w', "missing whitespace in expression"),
(r'\w\s=\s\s+\w', "gratuitous whitespace after ="),
(
(
# a line ending with a colon, potentially with trailing comments
r':([ \t]*#[^\n]*)?\n'
# one that is not a pass and not only a comment
r'(?P<indent>[ \t]+)[^#][^\n]+\n'
# more lines at the same indent level
r'((?P=indent)[^\n]+\n)*'
# a pass at the same indent level, which is bogus
r'(?P=indent)pass[ \t\n#]'
),
'omit superfluous pass',
),
(r'[^\n]\Z', "no trailing newline"),
(r'(\S[ \t]+|^[ \t]+)\n', "trailing whitespace"),
(
r'^\s+(self\.)?[A-Za-z][a-z0-9]+[A-Z]\w* = ',
"don't use camelcase in identifiers",
r'#.*camelcase-required',
),
(
r'^\s*(if|while|def|class|except|try)\s[^[\n]*:\s*[^\\n]#\s]+',
"linebreak after :",
),
(
r'class\s[^( \n]+:',
"old-style class, use class foo(object)",
r'#.*old-style',
),
(
r'class\s[^( \n]+\(\):',
"class foo() creates old style object, use class foo(object)",
r'#.*old-style',
),
(
r'\b(%s)\('
% '|'.join(k for k in keyword.kwlist if k not in ('print', 'exec')),
"Python keyword is not a function",
),
# (r'class\s[A-Z][^\(]*\((?!Exception)',
# "don't capitalize non-exception classes"),
# (r'in range\(', "use xrange"),
# (r'^\s*print\s+', "avoid using print in core and extensions"),
(r'[\x80-\xff]', "non-ASCII character literal"),
(r'("\')\.format\(', "str.format() has no bytes counterpart, use %"),
(
r'([\(\[][ \t]\S)|(\S[ \t][\)\]])',
"gratuitous whitespace in () or []",
),
# (r'\s\s=', "gratuitous whitespace before ="),
(
r'[^>< ](\+=|-=|!=|<>|<=|>=|<<=|>>=|%=)\S',
"missing whitespace around operator",
),
(
r'[^>< ](\+=|-=|!=|<>|<=|>=|<<=|>>=|%=)\s',
"missing whitespace around operator",
),
(
r'\s(\+=|-=|!=|<>|<=|>=|<<=|>>=|%=)\S',
"missing whitespace around operator",
),
(r'[^^+=*/!<>&| %-](\s=|=\s)[^= ]', "wrong whitespace around ="),
(
r'\([^()]*( =[^=]|[^<>!=]= )',
"no whitespace around = for named parameters",
),
(
r'raise [^,(]+, (\([^\)]+\)|[^,\(\)]+)$',
"don't use old-style two-argument raise, use Exception(message)",
),
(r' is\s+(not\s+)?["\'0-9-]', "object comparison with literal"),
(
r' [=!]=\s+(True|False|None)',
"comparison with singleton, use 'is' or 'is not' instead",
),
(
r'^\s*(while|if) [01]:',
"use True/False for constant Boolean expression",
),
(r'^\s*if False(:| +and)', 'Remove code instead of using `if False`'),
(
r'(?:(?<!def)\s+|\()hasattr\(',
'hasattr(foo, bar) is broken on py2, use util.safehasattr(foo, bar) '
'instead',
r'#.*hasattr-py3-only',
),
(r'opener\([^)]*\).read\(', "use opener.read() instead"),
(r'opener\([^)]*\).write\(', "use opener.write() instead"),
(r'(?i)descend[e]nt', "the proper spelling is descendAnt"),
(r'\.debug\(\_', "don't mark debug messages for translation"),
(r'\.strip\(\)\.split\(\)', "no need to strip before splitting"),
(r'^\s*except\s*:', "naked except clause", r'#.*re-raises'),
(
r'^\s*except\s([^\(,]+|\([^\)]+\))\s*,',
'legacy exception syntax; use "as" instead of ","',
),
(r'release\(.*wlock, .*lock\)', "wrong lock release order"),
(r'\bdef\s+__bool__\b', "__bool__ should be __nonzero__ in Python 2"),
(
r'os\.path\.join\(.*, *(""|\'\')\)',
"use pathutil.normasprefix(path) instead of os.path.join(path, '')",
),
(r'\s0[0-7]+\b', 'legacy octal syntax; use "0o" prefix instead of "0"'),
# XXX only catch mutable arguments on the first line of the definition
(r'def.*[( ]\w+=\{\}', "don't use mutable default arguments"),
(r'\butil\.Abort\b', "directly use error.Abort"),
(
r'^@(\w*\.)?cachefunc',
"module-level @cachefunc is risky, please avoid",
),
(
r'^import Queue',
"don't use Queue, use pycompat.queue.Queue + "
"pycompat.queue.Empty",
),
(
r'^import cStringIO',
"don't use cStringIO.StringIO, use util.stringio",
),
(r'^import urllib', "don't use urllib, use util.urlreq/util.urlerr"),
(
r'^import SocketServer',
"don't use SocketServer, use util.socketserver",
),
(r'^import urlparse', "don't use urlparse, use util.urlreq"),
(r'^import xmlrpclib', "don't use xmlrpclib, use util.xmlrpclib"),
(r'^import cPickle', "don't use cPickle, use util.pickle"),
(r'^import pickle', "don't use pickle, use util.pickle"),
(r'^import httplib', "don't use httplib, use util.httplib"),
(r'^import BaseHTTPServer', "use util.httpserver instead"),
(
r'^(from|import) mercurial\.(cext|pure|cffi)',
"use mercurial.policy.importmod instead",
),
(r'\.next\(\)', "don't use .next(), use next(...)"),
(
r'([a-z]*).revision\(\1\.node\(',
"don't convert rev to node before passing to revision(nodeorrev)",
),
(r'platform\.system\(\)', "don't use platform.system(), use pycompat"),
],
# warnings
[],
FUJIWARA Katsunori
contrib: split pypats list in check-code.py...
r41987 ]
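Each entry in these lists is a `(pattern, message)` tuple, optionally followed by an ignore pattern. The sketch below runs three of the rules over a short sample, one line at a time. The helper `check_lines` is hypothetical and much simpler than the real driver, which is outside this excerpt and applies the filters first.

```python
import re

# Three rules copied from the error list above.
rules = [
    (r'\.has_key\b', "dict.has_key is not available in Python 3+"),
    (r'(\w|\)),\w', "missing whitespace after ,"),
    (r'\s<>\s', '<> operator is not available in Python 3+, use !='),
]


def check_lines(source, rules):
    """Return (line number, message) for every rule that matches a line."""
    findings = []
    for lineno, line in enumerate(source.splitlines(True), 1):
        for pat, msg in rules:
            if re.search(pat, line):
                findings.append((lineno, msg))
    return findings


sample = "if d.has_key(k):\n    foo(a,b)\n"
findings = check_lines(sample, rules)
```

Keeping rules as plain data makes it cheap to reuse the same list for standalone `*.py` files and for Python fragments embedded in test scripts, which is what the split into `commonpypats` is for.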
# patterns to check normal *.py files
pypats = [
Augie Fackler
formatting: blacken the codebase...
r43346 [
# Ideally, these should be placed in "commonpypats" for
# consistency of coding rules in Mercurial source tree.
# But on the other hand, these are not so seriously required for
# python code fragments embedded in test scripts. Fixing test
# scripts for these patterns requires many changes, and has less
# profit than effort.
(r'raise Exception', "don't raise generic exceptions"),
(r'[\s\(](open|file)\([^)]*\)\.read\(', "use util.readfile() instead"),
(
r'[\s\(](open|file)\([^)]*\)\.write\(',
"use util.writefile() instead",
),
(
r'^[\s\(]*(open(er)?|file)\([^)]*\)(?!\.close\(\))',
"always assign an opened file to a variable, and close it afterwards",
),
(
r'[\s\(](open|file)\([^)]*\)\.(?!close\(\))',
"always assign an opened file to a variable, and close it afterwards",
),
(r':\n(    )*( ){1,3}[^ ]', "must indent 4 spaces"),
(r'^import atexit', "don't use atexit, use ui.atexit"),
# rules depending on implementation of repquote()
(
r' x+[xpqo%APM][\'"]\n\s+[\'"]x',
'string join across lines with no space',
),
(
r'''(?x)ui\.(status|progress|write|note|warn)\(
FUJIWARA Katsunori
check-code: detect "missing _() in ui message" more exactly...
r29397 [ \t\n#]*
(?# any strings/comments might precede a string, which
# contains translatable message)
Augie Fackler
contrib: fix check-code to be able to detect missing _() with bytestrings...
r43351 b?((['"]|\'\'\'|""")[ \npq%bAPMxno]*(['"]|\'\'\'|""")[ \t\n#]+)*
FUJIWARA Katsunori
check-code: detect "missing _() in ui message" more exactly...
r29397 (?# sequence consisting of below might precede translatable message
# - formatting string: "% 10s", "%05d", "% -3.2f", "%*s", "%%" ...
# - escaped character: "\\", "\n", "\0" ...
# - character other than '%', 'b' as '\', and 'x' as alphabet)
(['"]|\'\'\'|""")
((%([ n]?[PM]?([np]+|A))?x)|%%|b[bnx]|[ \nnpqAPMo])*x
(?# this regexp can't use [^...] style,
# because _preparepats forcibly adds "\n" into [^...],
# even though this regexp wants match it against "\n")''',
Augie Fackler
formatting: blacken the codebase...
r43346 "missing _() in ui message (use () to hide false-positives)",
),
]
+ commonpypats[0],
# warnings
[
# rules depending on implementation of repquote()
(r'(^| )pp +xxxxqq[ \n][^\n]', "add two newlines after '.. note::'"),
]
+ commonpypats[1],
Matt Mackall
Introduce check-code.py...
r10281 ]
FUJIWARA Katsunori
contrib: make check-code.py check code fragments embedded in test scripts
r41992 # patterns to check *.py for embedded ones in test script
embeddedpypats = [
Augie Fackler
formatting: blacken the codebase...
r43346 [] + commonpypats[0],
# warnings
[] + commonpypats[1],
FUJIWARA Katsunori
contrib: make check-code.py check code fragments embedded in test scripts
r41992 ]
FUJIWARA Katsunori
contrib: split pypats list in check-code.py...
r41987 # common filters to convert *.py
commonpyfilters = [
Augie Fackler
formatting: blacken the codebase...
r43346 (
r"""(?msx)(?P<comment>\#.*?$)|
Benoit Boissinot
check-code: more tests and more robust python filtering
r10727 ((?P<quote>('''|\"\"\"|(?<!')'(?!')|(?<!")"(?!")))
(?P<text>(([^\\]|\\.)*?))
Augie Fackler
formatting: blacken the codebase...
r43346 (?P=quote))""",
reppython,
),
Matt Mackall
Introduce check-code.py...
r10281 ]
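The Python filter handles comments and string literals in a single pass: the `comment` group is replaced by a run of `#`, and anything matched through the `quote` and `text` groups goes through `repquote`. The sketch below reassembles the pieces so that it runs standalone on Python 3. The names `pyfilter`, `_fixedmap`, `_encodechr` and `_table` are local to the example.

```python
import re

_fixedmap = {' ': ' ', '\n': '\n', '.': 'p', ':': 'q', '%': '%',
             '\\': 'b', '*': 'A', '+': 'P', '-': 'M'}


def _encodechr(i):
    c = chr(i)
    if c in _fixedmap:
        return _fixedmap[c]
    if c.isalpha():
        return 'x'
    if c.isdigit():
        return 'n'
    return 'o'


_table = ''.join(_encodechr(i) for i in range(256))


def repquote(m):
    return m.group('quote') + m.group('text').translate(_table) + m.group('quote')


def reppython(m):
    comment = m.group('comment')
    if comment:
        l = len(comment.rstrip())
        return "#" * l + comment[l:]
    return repquote(m)


pyfilter = r"""(?msx)(?P<comment>\#.*?$)|
((?P<quote>('''|\"\"\"|(?<!')'(?!')|(?<!")"(?!")))
(?P<text>(([^\\]|\\.)*?))
(?P=quote))"""

src = 'ui.write("abc: %d")  # say hi\n'
masked = re.sub(pyfilter, reppython, src)
```

The masked line keeps its length and its quote characters, so column-sensitive rules still line up with the original source.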
FUJIWARA Katsunori
contrib: split pypats list in check-code.py...
r41987 # filters to convert normal *.py files
Augie Fackler
formatting: blacken the codebase...
r43346 pyfilters = [] + commonpyfilters
FUJIWARA Katsunori
contrib: split pypats list in check-code.py...
r41987
Jun Wu
check-code: suggest pycompat.is(posix|windows|darwin)...
r34649 # non-filter patterns
pynfpats = [
[
Augie Fackler
formatting: blacken the codebase...
r43346 (r'pycompat\.osname\s*[=!]=\s*[\'"]nt[\'"]', "use pycompat.iswindows"),
(r'pycompat\.osname\s*[=!]=\s*[\'"]posix[\'"]', "use pycompat.isposix"),
(
r'pycompat\.sysplatform\s*[!=]=\s*[\'"]darwin[\'"]',
"use pycompat.isdarwin",
),
Jun Wu
check-code: suggest pycompat.is(posix|windows|darwin)...
r34649 ],
# warnings
[],
]
FUJIWARA Katsunori
contrib: make check-code.py check code fragments embedded in test scripts
r41992 # filters to convert *.py for embedded ones in test script
Augie Fackler
formatting: blacken the codebase...
r43346 embeddedpyfilters = [] + commonpyfilters
FUJIWARA Katsunori
contrib: make check-code.py check code fragments embedded in test scripts
r41992
Jun Wu
checkcode: enforce lowercase for extension docstring title...
r31602 # extension non-filter patterns
pyextnfpats = [
[(r'^"""\n?[A-Z]', "don't capitalize docstring title")],
# warnings
[],
]
Mads Kiilerich
check-code: check txt files for trailing whitespace
r18960 txtfilters = []
txtpats = [
Augie Fackler
formatting: blacken the codebase...
r43346 [
(r'\s$', 'trailing whitespace'),
('.. note::[ \n][^\n]', 'add two newlines after note::'),
],
[],
Mads Kiilerich
check-code: check txt files for trailing whitespace
r18960 ]
Matt Mackall
Introduce check-code.py...
r10281 cpats = [
Augie Fackler
formatting: blacken the codebase...
r43346 [
(r'//', "don't use //-style comments"),
(r'\S\t', "don't use tabs except for indent"),
(r'(\S[ \t]+|^[ \t]+)\n', "trailing whitespace"),
(r'(while|if|do|for)\(', "use space after while/if/do/for"),
(r'return\(', "return is not a function"),
(r' ;', "no space before ;"),
(r'[^;] \)', "no space before )"),
(r'[)][{]', "space between ) and {"),
(r'\w+\* \w+', "use int *foo, not int* foo"),
(r'\W\([^\)]+\) \w+', "use (int)foo, not (int) foo"),
(r'\w+ (\+\+|--)', "use foo++, not foo ++"),
(r'\w,\w', "missing whitespace after ,"),
(r'^[^#]\w[+/*]\w', "missing whitespace in expression"),
(r'\w\s=\s\s+\w', "gratuitous whitespace after ="),
(r'^#\s+\w', "use #foo, not # foo"),
(r'[^\n]\Z', "no trailing newline"),
(r'^\s*#import\b', "use only #include in standard C code"),
(r'strcpy\(', "don't use strcpy, use strlcpy or memcpy"),
(r'strcat\(', "don't use strcat"),
# rules depending on implementation of repquote()
],
# warnings
[
# rules depending on implementation of repquote()
],
Matt Mackall
Introduce check-code.py...
r10281 ]
cfilters = [
(r'(/\*)(((\*(?!/))|[^*])*)\*/', repccomment),
Benoit Boissinot
check-code: improve quote detection regexp, add tests
r10722 (r'''(?P<quote>(?<!")")(?P<text>([^"]|\\")+)"(?!")''', repquote),
Matt Mackall
Introduce check-code.py...
r10281 (r'''(#\s*include\s+<)([^>]+)>''', repinclude),
(r'(\()([^)]+\))', repcallspaces),
]
timeless
check-code: check for repo in revlog and ui in util
r14137 inutilpats = [
Augie Fackler
formating: upgrade to black 20.8b1...
r46554 [
(r'\bui\.', "don't use ui in util"),
],
Augie Fackler
formatting: blacken the codebase...
r43346 # warnings
[],
timeless
check-code: check for repo in revlog and ui in util
r14137 ]
inrevlogpats = [
Augie Fackler
formating: upgrade to black 20.8b1...
r46554 [
(r'\brepo\.', "don't use repo in revlog"),
],
Augie Fackler
formatting: blacken the codebase...
r43346 # warnings
[],
timeless
check-code: check for repo in revlog and ui in util
r14137 ]
Steven Brown
check-code: check for consistent usage of the websub filter in hgweb templates...
r21487 webtemplatefilters = []
webtemplatepats = [
Augie Fackler
formatting: blacken the codebase...
r43346 [],
[
(
r'{desc(\|(?!websub|firstline)[^\|]*)+}',
'follow desc keyword with either firstline or websub',
),
],
Steven Brown
check-code: check for consistent usage of the websub filter in hgweb templates...
r21487 ]
FUJIWARA Katsunori
contrib: check reference to old selenic.com domain...
r30246 allfilesfilters = []
allfilespats = [
Augie Fackler
formatting: blacken the codebase...
r43346 [
(
r'(http|https)://[a-zA-Z0-9./]*selenic.com/',
'use mercurial-scm.org domain URL',
),
(
r'mercurial@selenic\.com',
'use mercurial-scm.org domain for mercurial ML address',
),
(
r'mercurial-devel@selenic\.com',
'use mercurial-scm.org domain for mercurial-devel ML address',
),
],
# warnings
[],
FUJIWARA Katsunori
contrib: check reference to old selenic.com domain...
r30246 ]
Pulkit Goyal
py3: add warnings in check-code related to py3...
r30665 py3pats = [
Augie Fackler
formatting: blacken the codebase...
r43346 [
(
r'os\.environ',
"use encoding.environ instead (py3)",
r'#.*re-exports',
),
(r'os\.name', "use pycompat.osname instead (py3)"),
(r'os\.getcwd', "use encoding.getcwd instead (py3)", r'#.*re-exports'),
(r'os\.sep', "use pycompat.ossep instead (py3)"),
(r'os\.pathsep', "use pycompat.ospathsep instead (py3)"),
(r'os\.altsep', "use pycompat.osaltsep instead (py3)"),
(r'sys\.platform', "use pycompat.sysplatform instead (py3)"),
(r'getopt\.getopt', "use pycompat.getoptb instead (py3)"),
(r'os\.getenv', "use encoding.environ.get instead"),
(r'os\.setenv', "modifying the environ dict is not preferred"),
(r'(?<!pycompat\.)xrange', "use pycompat.xrange instead (py3)"),
],
# warnings
[],
Pulkit Goyal
py3: add warnings in check-code related to py3...
r30665 ]
Matt Mackall
Introduce check-code.py...
r10281 checks = [
Matt Mackall
check-code: look at shebang to identify Python scripts
r21222 ('python', r'.*\.(py|cgi)$', r'^#!.*python', pyfilters, pypats),
Jun Wu
check-code: suggest pycompat.is(posix|windows|darwin)...
r34649 ('python', r'.*\.(py|cgi)$', r'^#!.*python', [], pynfpats),
Jun Wu
checkcode: enforce lowercase for extension docstring title...
r31602 ('python', r'.*hgext.*\.py$', '', [], pyextnfpats),
Augie Fackler
formatting: blacken the codebase...
r43346 (
'python 3',
r'.*(hgext|mercurial)/(?!demandimport|policy|pycompat).*\.py',
'',
pyfilters,
py3pats,
),
Matt Mackall
check-code: look at shebang to identify Python scripts
r21222 ('test script', r'(.*/)?test-[^.~]*$', '', testfilters, testpats),
('c', r'.*\.[ch]$', '', cfilters, cpats),
('unified test', r'.*\.t$', '', utestfilters, utestpats),
Augie Fackler
formatting: blacken the codebase...
r43346 (
'layering violation repo in revlog',
r'mercurial/revlog\.py',
'',
pyfilters,
inrevlogpats,
),
(
'layering violation ui in util',
r'mercurial/util\.py',
'',
pyfilters,
inutilpats,
),
Matt Mackall
check-code: look at shebang to identify Python scripts
r21222 ('txt', r'.*\.txt$', '', txtfilters, txtpats),
Augie Fackler
formatting: blacken the codebase...
r43346 (
'web template',
r'mercurial/templates/.*\.tmpl',
'',
webtemplatefilters,
webtemplatepats,
),
('all except for .po', r'.*(?<!\.po)$', '', allfilesfilters, allfilespats),
Matt Mackall
Introduce check-code.py...
r10281 ]
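Each entry of `checks` pairs a filename regex with an optional "magic" regex for the file contents. A minimal sketch of how such an entry is selected, mirroring the condition used in `checkfile()`; the helper name `check_applies` is hypothetical and does not exist in check-code.py:

```python
import re


def check_applies(match, magic, filename, content):
    """Return True when a check entry should run against a file.

    The entry applies if the filename regex matches from the start of the
    name, or if a non-empty magic regex is found in the file content.
    """
    return bool(
        re.match(match, filename) or (magic and re.search(magic, content))
    )


# Patterns copied from the 'python' entry of the checks list
PY_MATCH = r'.*\.(py|cgi)$'
PY_MAGIC = r'^#!.*python'

print(check_applies(PY_MATCH, PY_MAGIC, 'setup.py', ''))
print(check_applies(PY_MATCH, PY_MAGIC, 'hg', '#!/usr/bin/env python\n'))
print(check_applies(r'.*\.txt$', '', 'hg', '#!/usr/bin/env python\n'))
```

An empty magic string, as used by most entries, disables the content test, so only the filename decides.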
FUJIWARA Katsunori
contrib: make check-code.py check code fragments embedded in test scripts
r41992 # (desc,
# func to pick up embedded code fragments,
# list of patterns to convert target files
# list of patterns to detect errors/warnings)
embeddedchecks = [
Augie Fackler
formatting: blacken the codebase...
r43346 (
'embedded python',
testparseutil.pyembedded,
embeddedpyfilters,
embeddedpypats,
)
FUJIWARA Katsunori
contrib: make check-code.py check code fragments embedded in test scripts
r41992 ]
Augie Fackler
formatting: blacken the codebase...
r43346
Simon Heimberg
check-code: only fix patterns once...
r19307 def _preparepats():
FUJIWARA Katsunori
contrib: refactor preparation logic for patterns of check-code.py...
r41988 def preparefailandwarn(failandwarn):
Simon Heimberg
check-code: only fix patterns once...
r19307 for pats in failandwarn:
for i, pseq in enumerate(pats):
# fix-up regexes for multi-line searches
Simon Heimberg
cleanup: drop unused variables and an unused import
r19378 p = pseq[0]
Augie Fackler
contrib: fix a subtle bug in check-code's regex rewriting...
r36975 # \s doesn't match \n (done in two steps)
# first, we replace \s that appears in a set already
p = re.sub(r'\[\\s', r'[ \\t', p)
# now we replace other \s instances.
p = re.sub(r'(?<!(\\|\[))\\s', r'[ \\t]', p)
Simon Heimberg
check-code: only fix patterns once...
r19307 # [^...] doesn't match newline
p = re.sub(r'(?<!\\)\[\^', r'[^\\n', p)
Simon Heimberg
check-code: compile all patterns on initialisation...
r19308 pats[i] = (re.compile(p, re.MULTILINE),) + pseq[1:]
FUJIWARA Katsunori
contrib: refactor preparation logic for patterns of check-code.py...
r41988
def preparefilters(filters):
Simon Heimberg
check-code: compile filters when loading
r19309 for i, flt in enumerate(filters):
filters[i] = re.compile(flt[0]), flt[1]
Simon Heimberg
check-code: only fix patterns once...
r19307
FUJIWARA Katsunori
contrib: make check-code.py check code fragments embedded in test scripts
r41992 for cs in (checks, embeddedchecks):
FUJIWARA Katsunori
contrib: refactor preparation logic for patterns of check-code.py...
r41988 for c in cs:
failandwarn = c[-1]
preparefailandwarn(failandwarn)
filters = c[-2]
preparefilters(filters)
Augie Fackler
formatting: blacken the codebase...
r43346
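The three `re.sub` calls in `preparefailandwarn()` can be tried in isolation. The sketch below reuses the same substitutions; the wrapper name `fixup_pattern` is hypothetical and only serves the illustration:

```python
import re


def fixup_pattern(p):
    """Rewrite a rule regex so that it does not match across newlines."""
    # \s at the start of a character set becomes " \t"
    p = re.sub(r'\[\\s', r'[ \\t', p)
    # every other \s becomes the set [ \t]
    p = re.sub(r'(?<!(\\|\[))\\s', r'[ \\t]', p)
    # negated sets get "\n" added so they exclude newlines too
    p = re.sub(r'(?<!\\)\[\^', r'[^\\n', p)
    return p


original = r'\w\s=\s\s+\w'  # the "gratuitous whitespace after =" rule from cpats
fixed = fixup_pattern(original)
print(fixed)  # \w[ \t]=[ \t][ \t]+\w

sample = "a =\n  b"
# The unmodified rule matches across the line break, the rewritten one does not.
print(bool(re.search(original, sample)))
print(bool(re.compile(fixed, re.MULTILINE).search(sample)))
```

The last substitution is also why the "missing _() in ui message" rule above avoids `[^...]` sets: any such set would get `\n` added to it.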
Pierre-Yves David
code-code: Add a logfunc argument to checkfile...
r10719 class norepeatlogger(object):
def __init__(self):
self._lastseen = None
Matt Mackall
check-code: add --blame switch
r11604 def log(self, fname, lineno, line, msg, blame):
Pierre-Yves David
code-code: Add a logfunc argument to checkfile...
r10719 """print error related to a given line of a given file.
The faulty line will also be printed but only once in the case
of multiple errors.
Matt Mackall
Introduce check-code.py...
r10281
Pierre-Yves David
code-code: Add a logfunc argument to checkfile...
r10719 :fname: filename
:lineno: line number
:line: actual content of the line
:msg: error message
"""
msgid = fname, lineno, line
if msgid != self._lastseen:
Matt Mackall
check-code: add --blame switch
r11604 if blame:
Pulkit Goyal
check-code: use absolute_import and print_function
r28509 print("%s:%d (%s):" % (fname, lineno, blame))
Matt Mackall
check-code: add --blame switch
r11604 else:
Pulkit Goyal
check-code: use absolute_import and print_function
r28509 print("%s:%d:" % (fname, lineno))
print(" > %s" % line)
Pierre-Yves David
code-code: Add a logfunc argument to checkfile...
r10719 self._lastseen = msgid
Pulkit Goyal
check-code: use absolute_import and print_function
r28509 print(" " + msg)
Pierre-Yves David
code-code: Add a logfunc argument to checkfile...
r10719
Augie Fackler
formatting: blacken the codebase...
r43346
Pierre-Yves David
code-code: Add a logfunc argument to checkfile...
r10719 _defaultlogger = norepeatlogger()
Augie Fackler
formatting: blacken the codebase...
r43346
Matt Mackall
check-code: add --blame switch
r11604 def getblame(f):
lines = []
for l in os.popen('hg annotate -un %s' % f):
start, line = l.split(':', 1)
user, rev = start.split()
lines.append((line[1:-1], user, rev))
return lines
Augie Fackler
formatting: blacken the codebase...
r43346
def checkfile(
f,
logfunc=_defaultlogger.log,
maxerr=None,
warnings=False,
blame=False,
debug=False,
lineno=True,
):
Pierre-Yves David
code-code: Add a logfunc argument to checkfile...
r10719 """checks style and portability of a given file
:f: filepath
:logfunc: function used to report error
logfunc(filename, linenumber, linecontent, errormessage)
Mads Kiilerich
fix trivial spelling errors
r17424 :maxerr: number of errors to display before aborting.
Mads Kiilerich
tests: keep track of all check-code.py warnings
r15873 Set to false (default) to report all errors
Pierre-Yves David
check-code: add a return value to checkfile function...
r10720
return True if no error is found, False otherwise.
Pierre-Yves David
code-code: Add a logfunc argument to checkfile...
r10719 """
Pierre-Yves David
check-code: add a return value to checkfile function...
r10720 result = True
Matt Mackall
check-code: look at shebang to identify Python scripts
r21222
try:
timeless
check-code: handle py3 open divergence...
r29145 with opentext(f) as fp:
try:
Martin von Zweigbergk
cleanup: delete lots of unused local variables...
r41401 pre = fp.read()
timeless
check-code: handle py3 open divergence...
r29145 except UnicodeDecodeError as e:
print("%s while reading %s" % (e, f))
return result
Gregory Szorc
global: mass rewrite to use modern exception syntax...
r25660 except IOError as e:
Pulkit Goyal
check-code: use absolute_import and print_function
r28509 print("Skipping %s, %s" % (f, str(e).split(':', 1)[0]))
Matt Mackall
check-code: look at shebang to identify Python scripts
r21222 return result
FUJIWARA Katsunori
contrib: factor out actual error check for file data of check-code.py...
r41989 # context information shared during a single checkfile() invocation
context = {'blamecache': None}
Matt Mackall
check-code: look at shebang to identify Python scripts
r21222 for name, match, magic, filters, pats in checks:
timeless
check-code: adding debug flag
r14135 if debug:
Pulkit Goyal
check-code: use absolute_import and print_function
r28509 print(name, f)
FUJIWARA Katsunori
check-code: examine magic pattern matching against contents of a file...
r28050 if not (re.match(match, f) or (magic and re.search(magic, pre))):
timeless
check-code: adding debug flag
r14135 if debug:
Augie Fackler
formatting: blacken the codebase...
r43346 print(
"Skipping %s for %s it doesn't match %s" % (name, match, f)
)
Matt Mackall
Introduce check-code.py...
r10281 continue
Simon Heimberg
check-code: concatenate "check-code" on compile time...
r19382 if "no-" "check-code" in pre:
timeless
check-code: improve test-check-code error diffs...
r27560 # If you're looking at this line, it's because a file has:
# no- check- code
# but the reason to output skipping is to make life for
# tests easier. So, instead of writing it with a normal
# spelling, we write it with the expected spelling from
# tests/test-check-code.t
Pulkit Goyal
check-code: use absolute_import and print_function
r28509 print("Skipping %s it has no-che?k-code (glob)" % f)
Augie Fackler
formatting: blacken the codebase...
r43346 return "Skip" # skip checking this file
FUJIWARA Katsunori
contrib: factor out actual error check for file data of check-code.py...
r41989
Augie Fackler
formatting: blacken the codebase...
r43346 fc = _checkfiledata(
name,
f,
pre,
filters,
pats,
context,
logfunc,
maxerr,
warnings,
blame,
debug,
lineno,
)
FUJIWARA Katsunori
contrib: change return value of file checking function of check-code.py...
r41990 if fc:
FUJIWARA Katsunori
contrib: factor out actual error check for file data of check-code.py...
r41989 result = False
FUJIWARA Katsunori
contrib: make check-code.py check code fragments embedded in test scripts
r41992 if f.endswith('.t') and "no-" "check-code" not in pre:
if debug:
Augie Fackler
formatting: blacken the codebase...
r43346 print("Checking embedded code in %s" % f)
FUJIWARA Katsunori
contrib: make check-code.py check code fragments embedded in test scripts
r41992
prelines = pre.splitlines()
embeddederros = []
for name, embedded, filters, pats in embeddedchecks:
# "reset curmax at each repetition" treats maxerr as "max
# number of errors in an actual file per entry of
# (embedded)checks"
curmaxerr = maxerr
for found in embedded(f, prelines, embeddederros):
filename, starts, ends, code = found
Augie Fackler
formatting: blacken the codebase...
r43346 fc = _checkfiledata(
name,
f,
code,
filters,
pats,
context,
logfunc,
curmaxerr,
warnings,
blame,
debug,
lineno,
offset=starts - 1,
)
FUJIWARA Katsunori
contrib: make check-code.py check code fragments embedded in test scripts
r41992 if fc:
result = False
if curmaxerr:
if fc >= curmaxerr:
break
curmaxerr -= fc
FUJIWARA Katsunori
contrib: factor out actual error check for file data of check-code.py...
r41989 return result
Augie Fackler
formatting: blacken the codebase...
r43346
def _checkfiledata(
name,
f,
filedata,
filters,
pats,
context,
logfunc,
maxerr,
warnings,
blame,
debug,
lineno,
offset=None,
):
FUJIWARA Katsunori
contrib: factor out actual error check for file data of check-code.py...
r41989 """Execute actual error check for file data
:name: of the checking category
:f: filepath
:filedata: content of a file
:filters: to be applied before checking
:pats: to detect errors
:context: a dict of information shared during a single checkfile() invocation
Valid keys: 'blamecache'.
:logfunc: function used to report error
logfunc(filename, linenumber, linecontent, errormessage)
:maxerr: number of errors to display before aborting, or False to
report all errors
:warnings: whether warning level checks should be applied
:blame: whether blame information should be displayed at error reporting
:debug: whether debug information should be displayed
:lineno: whether lineno should be displayed at error reporting
FUJIWARA Katsunori
contrib: add line offset information to file check function of check-code.py...
r41991 :offset: line number offset of 'filedata' in 'f' for checking
an embedded code fragment, or None (offset=0 is different
from offset=None)
FUJIWARA Katsunori
contrib: factor out actual error check for file data of check-code.py...
r41989
FUJIWARA Katsunori
contrib: change return value of file checking function of check-code.py...
r41990 returns number of detected errors.
FUJIWARA Katsunori
contrib: factor out actual error check for file data of check-code.py...
r41989 """
blamecache = context['blamecache']
FUJIWARA Katsunori
contrib: add line offset information to file check function of check-code.py...
r41991 if offset is None:
lineoffset = 0
else:
lineoffset = offset
FUJIWARA Katsunori
contrib: factor out actual error check for file data of check-code.py...
r41989
fc = 0
pre = post = filedata
Augie Fackler
formatting: blacken the codebase...
r43346 if True: # TODO: get rid of this redundant 'if' block
Matt Mackall
Introduce check-code.py...
r10281 for p, r in filters:
post = re.sub(p, r, post)
Augie Fackler
formatting: blacken the codebase...
r43346 nerrs = len(pats[0]) # the first nerrs elements are errors
Idan Kamara
check-code: separate warnings to avoid repetitive str.startswith
r14009 if warnings:
pats = pats[0] + pats[1]
else:
pats = pats[0]
Matt Mackall
Introduce check-code.py...
r10281 # print post # uncomment to show filtered version
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281
timeless
check-code: adding debug flag
r14135 if debug:
Pulkit Goyal
check-code: use absolute_import and print_function
r28509 print("Checking %s for %s" % (name, f))
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281
prelines = None
errors = []
Simon Heimberg
check-code: automatically preppend "warning: " to all warning messages...
r19422 for i, pat in enumerate(pats):
Brodie Rao
check-code: ignore naked excepts with a "re-raise" comment...
r16705 if len(pat) == 3:
p, msg, ignore = pat
else:
p, msg = pat
ignore = None
Simon Heimberg
check-code: prepend warning prefix only once, but for each warning...
r20005 if i >= nerrs:
msg = "warning: " + msg
Brodie Rao
check-code: ignore naked excepts with a "re-raise" comment...
r16705
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281 pos = 0
n = 0
Simon Heimberg
check-code: compile all patterns on initialisation...
r19308 for m in p.finditer(post):
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281 if prelines is None:
prelines = pre.splitlines()
postlines = post.splitlines(True)
start = m.start()
while n < len(postlines):
step = len(postlines[n])
if pos + step > start:
break
pos += step
n += 1
l = prelines[n]
Simon Heimberg
check-code: drop now unused check-code-ignore...
r20242 if ignore and re.search(ignore, l, re.MULTILINE):
Simon Heimberg
check-code: print debug output when an ignore pattern matches
r20243 if debug:
Augie Fackler
formatting: blacken the codebase...
r43346 print(
"Skipping %s for %s:%s (ignore pattern)"
% (name, f, (n + lineoffset))
)
Brodie Rao
check-code: ignore naked excepts with a "re-raise" comment...
r16705 continue
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281 bd = ""
if blame:
bd = 'working directory'
FUJIWARA Katsunori
contrib: factor out actual error check for file data of check-code.py...
r41989 if blamecache is None:
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281 blamecache = getblame(f)
FUJIWARA Katsunori
contrib: factor out actual error check for file data of check-code.py...
r41989 context['blamecache'] = blamecache
FUJIWARA Katsunori
contrib: add line offset information to file check function of check-code.py...
r41991 if (n + lineoffset) < len(blamecache):
bl, bu, br = blamecache[(n + lineoffset)]
if offset is None and bl == l:
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281 bd = '%s@%s' % (bu, br)
FUJIWARA Katsunori
contrib: add line offset information to file check function of check-code.py...
r41991 elif offset is not None and bl.endswith(l):
# "offset is not None" means "checking
# embedded code fragment". In this case,
# "l" does not have information about the
# beginning of an *original* line in the
# file (e.g. ' > ').
# Therefore, use "str.endswith()", and
# show "maybe" for a little loose
# examination.
bd = '%s@%s, maybe' % (bu, br)
Simon Heimberg
check-code: prepend warning prefix only once, but for each warning...
r20005
FUJIWARA Katsunori
contrib: add line offset information to file check function of check-code.py...
r41991 errors.append((f, lineno and (n + lineoffset + 1), l, msg, bd))
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281
errors.sort()
for e in errors:
logfunc(*e)
fc += 1
Mads Kiilerich
tests: keep track of all check-code.py warnings
r15873 if maxerr and fc >= maxerr:
Pulkit Goyal
check-code: use absolute_import and print_function
r28509 print(" (too many errors, giving up)")
Matt Mackall
Introduce check-code.py...
r10281 break
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281
FUJIWARA Katsunori
contrib: change return value of file checking function of check-code.py...
r41990 return fc
Pierre-Yves David
check-code: Add a ``checkfile`` function...
r10717
Augie Fackler
formatting: blacken the codebase...
r43346
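`_checkfiledata()` turns each match offset in the filtered text into a line index by walking the lines once, never rewinding between matches of the same pattern. A minimal sketch of that walk, assuming the filtered text keeps the same line layout as the original; the function name `match_line_indexes` is hypothetical:

```python
import re


def match_line_indexes(pattern, text):
    """Return the 0-based line index of every match of pattern in text."""
    lines = text.splitlines(True)  # keep line endings so lengths add up
    pos = 0  # offset of the start of line n
    n = 0  # current line index
    found = []
    for m in re.finditer(pattern, text, re.MULTILINE):
        start = m.start()
        # advance until the current line contains the match start
        while n < len(lines):
            step = len(lines[n])
            if pos + step > start:
                break
            pos += step
            n += 1
        found.append(n)
    return found


sample = "a = 1\nreturn(x)\nb\nreturn(y)\n"
print(match_line_indexes(r'return\(', sample))  # [1, 3]
```

Because matches arrive in increasing offset order, `pos` and `n` only move forward, so the cost per pattern stays linear in the number of lines.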
FUJIWARA Katsunori
check-code: factor out boot procedure into main...
r29568 def main():
Jun Wu
check-code: use "-" to specify a list of files from stdin...
r31824 parser = optparse.OptionParser("%prog [options] [files | -]")
Augie Fackler
formatting: blacken the codebase...
r43346 parser.add_option(
"-w",
"--warnings",
action="store_true",
help="include warning-level checks",
)
parser.add_option(
"-p", "--per-file", type="int", help="max warnings per file"
)
parser.add_option(
"-b",
"--blame",
action="store_true",
help="use annotate to generate blame info",
)
parser.add_option(
"", "--debug", action="store_true", help="show debug information"
)
parser.add_option(
"",
"--nolineno",
action="store_false",
dest='lineno',
help="don't show line numbers",
)
Matt Mackall
check-code: add a warnings level...
r10895
Augie Fackler
formatting: blacken the codebase...
r43346 parser.set_defaults(
per_file=15, warnings=False, blame=False, debug=False, lineno=True
)
Matt Mackall
check-code: add a warnings level...
r10895 (options, args) = parser.parse_args()
if len(args) == 0:
Pierre-Yves David
check-code: Only call check-code if __name__ = "__main__"....
r10716 check = glob.glob("*")
Jun Wu
check-code: use "-" to specify a list of files from stdin...
r31824 elif args == ['-']:
# read file list from stdin
check = sys.stdin.read().splitlines()
Pierre-Yves David
check-code: Only call check-code if __name__ = "__main__"....
r10716 else:
Matt Mackall
check-code: add a warnings level...
r10895 check = args
Matt Mackall
Introduce check-code.py...
r10281
FUJIWARA Katsunori
check-code: move fixing up regexp into main procedure...
r29569 _preparepats()
Mads Kiilerich
check-code: fix return code initialization...
r15544 ret = 0
Pierre-Yves David
check-code: Only call check-code if __name__ = "__main__"....
r10716 for f in check:
Augie Fackler
formatting: blacken the codebase...
r43346 if not checkfile(
f,
maxerr=options.per_file,
warnings=options.warnings,
blame=options.blame,
debug=options.debug,
lineno=options.lineno,
):
Alecs King
check-code: add exit status...
r11816 ret = 1
FUJIWARA Katsunori
check-code: factor out boot procedure into main...
r29568 return ret
Augie Fackler
formatting: blacken the codebase...
r43346
FUJIWARA Katsunori
check-code: factor out boot procedure into main...
r29568 if __name__ == "__main__":
sys.exit(main())