revlog: change generaldelta delta parent heuristic

The old generaldelta heuristic was "if p1 (or p2) was closer than the last full text, use it, otherwise use prev". This was problematic when a repo contained multiple branches that were very different. If commits to branch A were pushed, and the last full text was branch B, it would generate a fulltext. Then if branch B was pushed, it would generate another fulltext. The problem is that the last fulltext (and delta'ing against `prev` in general) has no correlation with the contents of the incoming revision, and therefore will always have degenerate cases.

According to the blame, that algorithm was chosen to minimize the chain length. Since there is already code that protects against that (the delta-vs-fulltext code), and since it has been improved since the original generaldelta algorithm went in (2011), I believe the chain length criterion will still be preserved.

The new algorithm always diffs against p1 (or p2 if it's closer), unless the resulting delta will fail the delta-vs-fulltext check, in which case we delta against prev.

Some before and after stats on manifest.d size.

internal large repo
  old heuristic - 2.0 GB
  new heuristic - 1.2 GB

mozilla-central
  old heuristic - 242 MB
  new heuristic - 261 MB

The regression in mozilla-central is due to the new heuristic choosing p2r as the delta base when it's closer to the tip. Switching the algorithm to always prefer p1r brings the size back down (242 MB). This is a result of the way in which mozilla does merges and pushes, and the result could easily swing the other direction in other repos (depending on whether they merge X into Y or Y into X), but will never be as degenerate as before.

A future patch will address the regression by introducing an optional, even more aggressive delta heuristic which will knock the mozilla manifest size down dramatically.
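The selection order described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the revlog code itself: `choose_delta_base`, `NULLREV`, and the `delta_size` callback are hypothetical names, and the "delta no larger than the full text" test is only a simplified stand-in for the real delta-vs-fulltext check.

```python
NULLREV = -1  # hypothetical marker meaning "no such revision"


def choose_delta_base(p1, p2, prev, delta_size, text_size):
    """Return the revision to delta against, or None to store a full text.

    delta_size is a hypothetical callable mapping a base revision to the
    size of the delta that would result from using it.
    """
    def is_good(base):
        # Simplified stand-in for the delta-vs-fulltext check (assumption):
        # accept a delta only if it is no larger than the full text.
        return base != NULLREV and delta_size(base) <= text_size

    # Prefer a parent: p1, or p2 when p2 is closer to the tip.
    parent = p2 if p2 != NULLREV and p2 > p1 else p1
    if is_good(parent):
        return parent
    # The parent delta failed the check, so fall back to prev.
    if is_good(prev):
        return prev
    # Nothing acceptable: store a full text.
    return None
```

For instance, with parents 3 and 7 and prev 9, the sketch picks 7 whenever the delta against 7 passes the check, regardless of how prev would have compared.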

File last commit:

r25661:20de1ace default
r26117:4dc5b51f default
check-code.py
573 lines | 20.3 KiB | text/x-python | PythonLexer
Matt Mackall
Introduce check-code.py...
r10281 #!/usr/bin/env python
#
# check-code - a style and portability checker for Mercurial
#
Matt Mackall
check-code: fix copyright date
r10290 # Copyright 2010 Matt Mackall <mpm@selenic.com>
Matt Mackall
Introduce check-code.py...
r10281 #
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
Simon Heimberg
check-code: explain what to do when a check-code rule mismatches...
r20241 """style and portability checker for Mercurial
when a rule triggers wrong, do one of the following (prefer one from top):
* do the work-around the rule suggests
* doublecheck that it is a false match
* improve the rule pattern
* add an ignore pattern to the rule (3rd arg) which matches your good line
(you can append a short comment and match this, like: #re-raises, # no-py24)
* change the pattern to a warning and list the exception in test-check-code-hg
* ONLY use no--check-code for skipping entire files from external sources
"""
Alecs King
check-code: add exit status...
r11816 import re, glob, os, sys
Thomas Arendsen Hein
check-code: check for gratuitous whitespace after Python keywords
r13074 import keyword
Matt Mackall
check-code: add a warnings level...
r10895 import optparse
Simon Heimberg
check-code: introduce function for using re2 when available...
r19310 try:
import re2
except ImportError:
re2 = None
def compilere(pat, multiline=False):
if multiline:
pat = '(?m)' + pat
if re2:
try:
return re2.compile(pat)
except re2.error:
pass
return re.compile(pat)
Matt Mackall
Introduce check-code.py...
r10281
def repquote(m):
Simon Heimberg
check-code: more replacement characters...
r19999 fromc = '.:'
tochr = 'pq'
def encodechr(i):
if i > 255:
return 'u'
c = chr(i)
if c in ' \n':
return c
if c.isalpha():
return 'x'
if c.isdigit():
return 'n'
try:
return tochr[fromc.find(c)]
except (ValueError, IndexError):
return 'o'
t = m.group('text')
tt = ''.join(encodechr(i) for i in xrange(256))
t = t.translate(tt)
Benoit Boissinot
check-code: improve quote detection regexp, add tests
r10722 return m.group('quote') + t + m.group('quote')
Matt Mackall
Introduce check-code.py...
r10281
Benoit Boissinot
check-code: more tests and more robust python filtering
r10727 def reppython(m):
comment = m.group('comment')
if comment:
Mads Kiilerich
check-code: catch trailing space in comments
r18959 l = len(comment.rstrip())
return "#" * l + comment[l:]
Benoit Boissinot
check-code: more tests and more robust python filtering
r10727 return repquote(m)
Matt Mackall
Introduce check-code.py...
r10281
def repcomment(m):
return m.group(1) + "#" * len(m.group(2))
def repccomment(m):
t = re.sub(r"((?<=\n) )|\S", "x", m.group(2))
return m.group(1) + t + "*/"
def repcallspaces(m):
t = re.sub(r"\n\s+", "\n", m.group(2))
return m.group(1) + t
def repinclude(m):
return m.group(1) + "<foo>"
def rephere(m):
t = re.sub(r"\S", "x", m.group(2))
return m.group(1) + t
testpats = [
Idan Kamara
check-code: separate warnings to avoid repetitive str.startswith
r14009 [
Mads Kiilerich
check-code: put grouping around regexps generated from testpats...
r16495 (r'pushd|popd', "don't use 'pushd' or 'popd', use 'cd'"),
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281 (r'\W\$?\(\([^\)\n]*\)\)', "don't use (()) or $(()), use 'expr'"),
Martin Geisler
check-code.py: make help strings consistent
r10374 (r'grep.*-q', "don't use 'grep -q', redirect to /dev/null"),
Danek Duvall
solaris: test cases can't use grep -a...
r19626 (r'(?<!hg )grep.*-a', "don't use 'grep -a', use in-line python"),
Matt Mackall
tests: remove sed -i from test-record
r16332 (r'sed.*-i', "don't use 'sed -i', use a temporary file"),
Mads Kiilerich
test-alias: adapt for Windows...
r16965 (r'\becho\b.*\\n', "don't use 'echo \\n', use printf"),
Martin Geisler
check-code: catch "echo -n" in tests
r11884 (r'echo -n', "don't use 'echo -n', use printf"),
Matt Mackall
test-revert.t: fix wc check-code false positive
r23134 (r'(^|\|\s*)\bwc\b[^|]*$\n(?!.*\(re\))', "filter wc output"),
Martin Geisler
check-code.py: make help strings consistent
r10374 (r'head -c', "don't use 'head -c', use 'dd'"),
Danek Duvall
solaris: tests can't use tail -n...
r19628 (r'tail -n', "don't use the '-n' option to tail, just use '-<num>'"),
Matt Mackall
tests: use md5sum.py instead of sha1sum, add check
r15389 (r'sha1sum', "don't use sha1sum, use $TESTDIR/md5sum.py"),
Martin Geisler
check-code.py: make help strings consistent
r10374 (r'ls.*-\w*R', "don't use 'ls -R', use 'find'"),
Simon Heimberg
check-code: do not warn on printf \\x or \\[1-9]...
r19380 (r'printf.*[^\\]\\([1-9]|0\d)', "don't use 'printf \NNN', use Python"),
(r'printf.*[^\\]\\x', "don't use printf \\x, use Python"),
Matt Mackall
Introduce check-code.py...
r10281 (r'\$\(.*\)', "don't use $(expr), use `expr`"),
(r'rm -rf \*', "don't use naked rm -rf, target a directory"),
Matt Mackall
check-code: fix issues with finding patterns in unified tests, fix tests...
r15372 (r'(^|\|\s*)grep (-\w\s+)*[^|]*[(|]\w',
Matt Mackall
Introduce check-code.py...
r10281 "use egrep for extended grep syntax"),
(r'/bin/', "don't use explicit paths for tools"),
(r'[^\n]\Z', "no trailing newline"),
Mads Kiilerich
test-merge-default and check-code.py: No "export x=x" in sh
r10658 (r'export.*=', "don't export and assign at once"),
Matt Mackall
check-code: fix issues with finding patterns in unified tests, fix tests...
r15372 (r'^source\b', "don't use 'source', use '.'"),
Dan Villiom Podlaski Christiansen
tests: compatibility fix....
r12367 (r'touch -d', "don't use 'touch -d', use 'touch -t' instead"),
Matt Mackall
tests: fix check-code detection of anchored expressions, fix echo -n usage
r15364 (r'ls +[^|\n-]+ +-', "options to 'ls' must come before filenames"),
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281 (r'[^>\n]>\s*\$HGRCPATH', "don't overwrite $HGRCPATH, append to it"),
Matt Mackall
check-code: fix issues with finding patterns in unified tests, fix tests...
r15372 (r'^stop\(\)', "don't use 'stop' as a shell function name"),
Mads Kiilerich
tests: don't use 'test -e'...
r15282 (r'(\[|\btest\b).*-e ', "don't use 'test -e', use 'test -f'"),
Yuya Nishihara
check-code: ban use of '[[ ]]' in tests
r25588 (r'\[\[\s+[^\]]*\]\]', "don't use '[[ ]]', use '[ ]'"),
Mads Kiilerich
tests: don't use alias...
r16013 (r'^alias\b.*=', "don't use alias, use a function"),
Mads Kiilerich
tests: solaris sh can not negate exit status with '!'
r16485 (r'if\s*!', "don't use '!' to negate exit status"),
Mads Kiilerich
tests: don't use /dev/urandom for largefiles testing...
r16494 (r'/dev/u?random', "don't use entropy, use /dev/zero"),
Mads Kiilerich
tests: use 'do sleep 0' instead of 'do true', also on first line of command...
r16496 (r'do\s*true;\s*done', "don't use true as loop body, use sleep 0"),
Mads Kiilerich
tests: avoid tab indent on all kinds of lines of sh commands
r16497 (r'^( *)\t', "don't use tabs to indent"),
Kevin Bullock
check-code: fix sed 'i' command rule newline matching...
r19083 (r'sed (-e )?\'(\d+|/[^/]*/)i(?!\\\n)',
Kevin Bullock
check-code: add a rule against a GNU sed-ism...
r19080 "put a backslash-escaped newline after sed 'i' command"),
Danek Duvall
solaris: diff -u emits "No differences encountered"...
r20598 (r'^diff *-\w*u.*$\n(^ \$ |^$)', "prefix diff -u with cmp"),
Matt Harbison
check-code: enforce the usage of 'seq.py' instead of 'seq'
r24362 (r'seq ', "don't use 'seq', use $TESTDIR/seq.py")
Idan Kamara
check-code: separate warnings to avoid repetitive str.startswith
r14009 ],
# warnings
Mads Kiilerich
tests: run most check-code sh checks on continued lines too...
r16672 [
(r'^function', "don't use 'function', use old style"),
(r'^diff.*-\w*N', "don't use 'diff -N'"),
Mads Kiilerich
tests: use `pwd` instead of ${PWD} in test-convert-git.t - because of Solaris
r18508 (r'\$PWD|\${PWD}', "don't use $PWD, use `pwd`"),
Mads Kiilerich
tests: run most check-code sh checks on continued lines too...
r16672 (r'^([^"\'\n]|("[^"\n]*")|(\'[^\'\n]*\'))*\^', "^ must be quoted"),
Kevin Bullock
check-code: warn to use killdaemons instead of kill `cat PIDFILE`...
r18575 (r'kill (`|\$\()', "don't use kill, use killdaemons.py")
Mads Kiilerich
tests: run most check-code sh checks on continued lines too...
r16672 ]
Matt Mackall
Introduce check-code.py...
r10281 ]
testfilters = [
(r"( *)(#([^\n]*\S)?)", repcomment),
(r"<<(\S+)((.|\n)*?\n\1)", rephere),
]
Simon Heimberg
check-code: extract windows glob warning message...
r18832 winglobmsg = "use (glob) to match Windows paths too"
Matt Mackall
check-code: fix issues with finding patterns in unified tests, fix tests...
r15372 uprefix = r"^ \$ "
Matt Mackall
check-code: add some basic support for unified tests
r12364 utestpats = [
Idan Kamara
check-code: separate warnings to avoid repetitive str.startswith
r14009 [
Mads Kiilerich
check-code: fix check for trailing whitespace on continued lines too...
r17347 (r'^(\S.*|| [$>] .*)[ \t]\n', "trailing whitespace on non-output"),
Mads Kiilerich
tests: unify the last sh tests...
r16673 (uprefix + r'.*\|\s*sed[^|>\n]*\n',
"use regex test output patterns instead of sed"),
Matt Mackall
check-code: add some basic support for unified tests
r12364 (uprefix + r'(true|exit 0)', "explicit zero exit unnecessary"),
Patrick Mezard
test-svn-subrepo: fix reference output for svn 1.7...
r15607 (uprefix + r'.*(?<!\[)\$\?', "explicit exit code checks unnecessary"),
Matt Mackall
check-code: add some basic support for unified tests
r12364 (uprefix + r'.*\|\| echo.*(fail|error)',
"explicit exit code checks unnecessary"),
(uprefix + r'set -e', "don't use set -e"),
Mads Kiilerich
check-code: check that '>' is used for continued lines...
r19873 (uprefix + r'(\s|fi\b|done\b)', "use > for continued lines"),
Simon Heimberg
tests: rewrite path in test-shelve.t for not being mangled on msys...
r20423 (uprefix + r'.*:\.\S*/', "x:.y in a path does not work on msys, rewrite "
"as x://.y, or see `hg log -k msys` for alternatives", r'-\S+:\.|' #-Rxxx
Matt Mackall
check-code: allow disabling msys path check
r24205 '# no-msys'), # in test-pull.t which is skipped on windows
Simon Heimberg
check-code: extract windows glob warning message...
r18832 (r'^ saved backup bundle to \$TESTTMP.*\.hg$', winglobmsg),
Bryan O'Sullivan
check-code: fix a check-code failure in check-code...
r18835 (r'^ changeset .* references (corrupted|missing) \$TESTTMP/.*[^)]$',
winglobmsg),
Simon Heimberg
check-code: document last ignore regexp...
r20014 (r'^ pulling from \$TESTTMP/.*[^)]$', winglobmsg,
'\$TESTTMP/unix-repo$'), # in test-issue1802.t which is skipped on windows
FUJIWARA Katsunori
check-code.py: avoid warning against "reverting subrepo ..." lines...
r23936 (r'^ reverting (?!subrepo ).*/.*[^)]$', winglobmsg),
Simon Heimberg
check-code: drop unneeded ignore patterns...
r20013 (r'^ cloning subrepo \S+/.*[^)]$', winglobmsg),
(r'^ pushing to \$TESTTMP/.*[^)]$', winglobmsg),
(r'^ pushing subrepo \S+/\S+ to.*[^)]$', winglobmsg),
Brendan Cully
tests: check path separator in moves
r19133 (r'^ moving \S+/.*[^)]$', winglobmsg),
Simon Heimberg
check-code: drop unneeded ignore patterns...
r20013 (r'^ no changes made to subrepo since.*/.*[^)]$', winglobmsg),
(r'^ .*: largefile \S+ not available from file:.*/.*[^)]$', winglobmsg),
Simon Heimberg
tests: lines with largefile .* file://$TESTTMP also match on windows...
r20471 (r'^ .*file://\$TESTTMP',
'write "file:/*/$TESTTMP" + (glob) to match on windows too'),
Danek Duvall
tests: cat error messages are different on Solaris
r21930 (r'^ (cat|find): .*: No such file or directory',
'use test -f to test for file existence'),
Idan Kamara
check-code: separate warnings to avoid repetitive str.startswith
r14009 ],
# warnings
Simon Heimberg
check-code: warn about line glob match with no glob character (?*/)
r18683 [
(r'^ [^*?/\n]* \(glob\)$',
Simon Heimberg
check-code: automatically preppend "warning: " to all warning messages...
r19422 "glob match with no glob character (?*/)"),
Simon Heimberg
check-code: warn about line glob match with no glob character (?*/)
r18683 ]
Matt Mackall
check-code: add some basic support for unified tests
r12364 ]
Mads Kiilerich
check-code: fix checking for sh style in .t tests...
r14203 for i in [0, 1]:
Pierre-Yves David
check-code: allow an escape pattern to be specified for testpattern...
r22101 for tp in testpats[i]:
p = tp[0]
m = tp[1]
Matt Mackall
check-code: fix issues with finding patterns in unified tests, fix tests...
r15372 if p.startswith(r'^'):
Mads Kiilerich
tests: run most check-code sh checks on continued lines too...
r16672 p = r"^ [$>] (%s)" % p[1:]
Mads Kiilerich
check-code: fix checking for sh style in .t tests...
r14203 else:
Mads Kiilerich
tests: run most check-code sh checks on continued lines too...
r16672 p = r"^ [$>] .*(%s)" % p
Pierre-Yves David
check-code: allow an escape pattern to be specified for testpattern...
r22101 utestpats[i].append((p, m) + tp[2:])
Matt Mackall
check-code: add some basic support for unified tests
r12364
utestfilters = [
Idan Kamara
check-code: replace heredocs in unified tests...
r17711 (r"<<(\S+)((.|\n)*?\n > \1)", rephere),
Matt Mackall
check-code: add some basic support for unified tests
r12364 (r"( *)(#([^\n]*\S)?)", repcomment),
]
Matt Mackall
Introduce check-code.py...
r10281 pypats = [
Idan Kamara
check-code: separate warnings to avoid repetitive str.startswith
r14009 [
Matt Mackall
check-code: check for argument passing py2.6ism
r20796 (r'\([^)]*\*\w[^()]+\w+=', "can't pass varargs with keyword in Py2.5"),
Renato Cunha
check-code: check for tuple parameter unpacking (missing in py3k)
r11568 (r'^\s*def\s*\w+\s*\(.*,\s*\(',
"tuple parameter unpacking not available in Python 3+"),
(r'lambda\s*\(.*,.*\)',
"tuple parameter unpacking not available in Python 3+"),
Augie Fackler
check-code: new rule to forbid imports of a.b on the same line as other imports...
r19793 (r'import (.+,[^.]+\.[^.]+|[^.]+\.[^.]+,)',
'2to3 can\'t always rewrite "import qux, foo.bar", '
'use "import foo.bar" on its own line instead.'),
Renato Cunha
check-code: added a check for calls to the builtin cmp function
r11764 (r'(?<!def)\s+(cmp)\(', "cmp is not available in Python 3+"),
Renato Cunha
check-code: added check for reduce usage
r11569 (r'\breduce\s*\(.*', "reduce is not available in Python 3+"),
Augie Fackler
check-code: disallow use of dict(key=value) construction...
r20688 (r'dict\(.*=', 'dict() is different in Py2 and 3 and is slower than {}',
'dict-from-generator'),
Martin Geisler
check-code: catch dict.has_key
r11602 (r'\.has_key\b', "dict.has_key is not available in Python 3+"),
Augie Fackler
check-code: disallow defunct <> operator...
r18183 (r'\s<>\s', '<> operator is not available in Python 3+, use !='),
Matt Mackall
Introduce check-code.py...
r10281 (r'^\s*\t', "don't use tabs"),
Matt Mackall
check-code: import some pylint checks
r10412 (r'\S;\s*\n', "semicolon"),
FUJIWARA Katsunori
check-code: detect "% inside _()" when there are leading whitespaces...
r21097 (r'[^_]_\([ \t\n]*(?:"[^"]+"[ \t\n+]*)+%', "don't use % inside _()"),
(r"[^_]_\([ \t\n]*(?:'[^']+'[ \t\n+]*)+%", "don't use % inside _()"),
Mads Kiilerich
check-code: there must also be whitespace between ')' and operator...
r18054 (r'(\w|\)),\w', "missing whitespace after ,"),
(r'(\w|\))[+/*\-<>]\w', "missing whitespace in expression"),
Mads Kiilerich
check-code: make 'missing whitespace in assignment' more aggressive...
r18055 (r'^\s+(\w|\.)+=\w[^,()\n]*$', "missing whitespace in assignment"),
Brodie Rao
check-code: promote 80+ character line warning to an error
r16702 (r'.{81}', "line too long"),
Matt Mackall
check-code: fix issues with finding patterns in unified tests, fix tests...
r15372 (r' x+[xo][\'"]\n\s+[\'"]x', 'string join across lines with no space'),
Matt Mackall
Introduce check-code.py...
r10281 (r'[^\n]\Z', "no trailing newline"),
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281 (r'(\S[ \t]+|^[ \t]+)\n', "trailing whitespace"),
Brodie Rao
cleanup: eradicate long lines
r16683 # (r'^\s+[^_ \n][^_. \n]+_[^_\n]+\s*=',
# "don't use underbars in identifiers"),
Matt Mackall
check-code: enable camelcase check, fix up problems
r15457 (r'^\s+(self\.)?[A-za-z][a-z0-9]+[A-Z]\w* = ',
"don't use camelcase in identifiers"),
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281 (r'^\s*(if|while|def|class|except|try)\s[^[\n]*:\s*[^\\n]#\s]+',
Matt Mackall
check-code: check thyself
r10286 "linebreak after :"),
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281 (r'class\s[^( \n]+:', "old-style class, use class foo(object)"),
(r'class\s[^( \n]+\(\):',
Pierre-Yves David
check-code: fix the error message about 'class foo():'...
r25140 "class foo() creates old style object, use class foo(object)"),
Pierre-Yves David
check-code: allow print and exec as a function...
r25028 (r'\b(%s)\(' % '|'.join(k for k in keyword.kwlist
if k not in ('print', 'exec')),
Thomas Arendsen Hein
check-code: single check for Python keywords used as a function...
r13076 "Python keyword is not a function"),
Matt Mackall
check-code: import some pylint checks
r10412 (r',]', "unneeded trailing ',' in list"),
Matt Mackall
Introduce check-code.py...
r10281 # (r'class\s[A-Z][^\(]*\((?!Exception)',
# "don't capitalize non-exception classes"),
# (r'in range\(', "use xrange"),
# (r'^\s*print\s+', "avoid using print in core and extensions"),
(r'[\x80-\xff]', "non-ASCII character literal"),
Matt Mackall
check-code: reintroduce str.format() ban for 3.x porting...
r25212 (r'("\')\.format\(', "str.format() has no bytes counterpart, use %"),
Thomas Arendsen Hein
check-code: check for gratuitous whitespace after Python keywords
r13074 (r'^\s*(%s)\s\s' % '|'.join(keyword.kwlist),
"gratuitous whitespace after Python keyword"),
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281 (r'([\(\[][ \t]\S)|(\S[ \t][\)\]])', "gratuitous whitespace in () or []"),
Matt Mackall
Introduce check-code.py...
r10281 # (r'\s\s=', "gratuitous whitespace before ="),
Pierre-Yves David
check-code: recognise %= as an operator
r17167 (r'[^>< ](\+=|-=|!=|<>|<=|>=|<<=|>>=|%=)\S',
Martin Geisler
check-code: reformat long lines
r11345 "missing whitespace around operator"),
Pierre-Yves David
check-code: recognise %= as an operator
r17167 (r'[^>< ](\+=|-=|!=|<>|<=|>=|<<=|>>=|%=)\s',
Martin Geisler
check-code: reformat long lines
r11345 "missing whitespace around operator"),
Pierre-Yves David
check-code: recognise %= as an operator
r17167 (r'\s(\+=|-=|!=|<>|<=|>=|<<=|>>=|%=)\S',
Martin Geisler
check-code: reformat long lines
r11345 "missing whitespace around operator"),
Pierre-Yves David
check-code: recognise %= as an operator
r17167 (r'[^^+=*/!<>&| %-](\s=|=\s)[^= ]',
Martin Geisler
check-code: reformat long lines
r11345 "wrong whitespace around ="),
Mads Kiilerich
check-code: check for spaces around = for named parameters
r19872 (r'\([^()]*( =[^=]|[^<>!=]= )',
"no whitespace around = for named parameters"),
Matt Mackall
check-code: two more rules...
r10451 (r'raise Exception', "don't raise generic exceptions"),
Augie Fackler
check-code: disallow two-argument form of raise...
r18180 (r'raise [^,(]+, (\([^\)]+\)|[^,\(\)]+)$',
"don't use old-style two-argument raise, use Exception(message)"),
Idan Kamara
check-code: separate warnings to avoid repetitive str.startswith
r14009 (r' is\s+(not\s+)?["\'0-9-]', "object comparison with literal"),
(r' [=!]=\s+(True|False|None)',
"comparison with singleton, use 'is' or 'is not' instead"),
Martin Geisler
check-code: flag 0/1 used as constant Boolean expression
r14494 (r'^\s*(while|if) [01]:',
"use True/False for constant Boolean expression"),
Patrick Mezard
mq: replace hasattr() with util.safehasattr(), update check-code.py
r16416 (r'(?:(?<!def)\s+|\()hasattr',
Augie Fackler
check-code: disallow use of hasattr()...
r14978 'hasattr(foo, bar) is broken, use util.safehasattr(foo, bar) instead'),
Dan Villiom Podlaski Christiansen
check-code: disallow calling opener(...).read() and opener(..).write()
r14169 (r'opener\([^)]*\).read\(',
"use opener.read() instead"),
(r'opener\([^)]*\).write\(',
"use opener.write() instead"),
(r'[\s\(](open|file)\([^)]*\)\.read\(',
"use util.readfile() instead"),
(r'[\s\(](open|file)\([^)]*\)\.write\(',
Simon Heimberg
check-code: fix an error message
r19981 "use util.writefile() instead"),
Dan Villiom Podlaski Christiansen
check-code: disallow calling opener(...).read() and opener(..).write()
r14169 (r'^[\s\(]*(open(er)?|file)\([^)]*\)',
"always assign an opened file to a variable, and close it afterwards"),
(r'[\s\(](open|file)\([^)]*\)\.',
"always assign an opened file to a variable, and close it afterwards"),
Mads Kiilerich
spelling: fixes from proofreading of spell checker issues
r23139 (r'(?i)descend[e]nt', "the proper spelling is descendAnt"),
Matt Mackall
check-code: don't mark debug messages for translation
r14709 (r'\.debug\(\_', "don't mark debug messages for translation"),
Martin Geisler
check-code: catch unnecessary s.strip().split() calls
r16590 (r'\.strip\(\)\.split\(\)', "no need to strip before splitting"),
Simon Heimberg
check-code: do not prepend "warning" to a failure message...
r18762 (r'^\s*except\s*:', "naked except clause", r'#.*re-raises'),
Gregory Szorc
check-code: detect legacy exception syntax...
r25661 (r'^\s*except\s([^\(,]+|\([^\)]+\))\s*,',
'legacy exception syntax; use "as" instead of ","'),
Mads Kiilerich
check-code: indent 4 spaces in py files
r17299 (r':\n( )*( ){1,3}[^ ]', "must indent 4 spaces"),
Matt Mackall
check-code: move i18n check from warning to error
r17957 (r'ui\.(status|progress|write|note|warn)\([\'\"]x',
"missing _() in ui message (use () to hide false-positives)"),
Matt Mackall
check-code: add check for lock release order
r19031 (r'release\(.*wlock, .*lock\)', "wrong lock release order"),
Yuya Nishihara
check-code: look for misuse of __bool__
r22448 (r'\b__bool__\b', "__bool__ should be __nonzero__ in Python 2"),
FUJIWARA Katsunori
check-code: check os.path.join(*, '') not working correctly with Python 2.7.9...
r24836 (r'os\.path\.join\(.*, *(""|\'\')\)',
"use pathutil.normasprefix(path) instead of os.path.join(path, '')"),
Gregory Szorc
check-code: detect legacy octal syntax...
r25659 (r'\s0[0-7]+\b', 'legacy octal syntax; use "0o" prefix instead of "0"'),
Idan Kamara
check-code: separate warnings to avoid repetitive str.startswith
r14009 ],
# warnings
[
Simon Heimberg
check-code: more replacement characters...
r19999 (r'(^| )pp +xxxxqq[ \n][^\n]', "add two newlines after '.. note::'"),
Idan Kamara
check-code: separate warnings to avoid repetitive str.startswith
r14009 ]
Matt Mackall
Introduce check-code.py...
r10281 ]
pyfilters = [
Benoit Boissinot
check-code: more tests and more robust python filtering
r10727 (r"""(?msx)(?P<comment>\#.*?$)|
((?P<quote>('''|\"\"\"|(?<!')'(?!')|(?<!")"(?!")))
(?P<text>(([^\\]|\\.)*?))
(?P=quote))""", reppython),
Matt Mackall
Introduce check-code.py...
r10281 ]
Mads Kiilerich
check-code: check txt files for trailing whitespace
r18960 txtfilters = []
txtpats = [
[
('\s$', 'trailing whitespace'),
Simon Heimberg
help: remove last occurrences of ".. note::" without two newlines...
r20532 ('.. note::[ \n][^\n]', 'add two newlines after note::')
Mads Kiilerich
check-code: check txt files for trailing whitespace
r18960 ],
[]
]
Matt Mackall
Introduce check-code.py...
r10281 cpats = [
Idan Kamara
check-code: separate warnings to avoid repetitive str.startswith
r14009 [
Matt Mackall
Introduce check-code.py...
r10281 (r'//', "don't use //-style comments"),
(r'^ ', "don't use spaces to indent"),
(r'\S\t', "don't use tabs except for indent"),
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281 (r'(\S[ \t]+|^[ \t]+)\n', "trailing whitespace"),
Brodie Rao
check-code: promote 80+ character line warning to an error
r16702 (r'.{81}', "line too long"),
Matt Mackall
Introduce check-code.py...
r10281 (r'(while|if|do|for)\(', "use space after while/if/do/for"),
(r'return\(', "return is not a function"),
(r' ;', "no space before ;"),
Laurent Charignon
check-code: in C code, prevent space before closing parenthesis
r24453 (r'[^;] \)', "no space before )"),
Matt Mackall
check-code: add bracket style check
r19745 (r'[)][{]', "space between ) and {"),
Matt Mackall
Introduce check-code.py...
r10281 (r'\w+\* \w+', "use int *foo, not int* foo"),
Matt Mackall
check-code: make casting style check more precise
r19731 (r'\W\([^\)]+\) \w+', "use (int)foo, not (int) foo"),
Matt Mackall
check-code: avoid false-positive on ++
r16413 (r'\w+ (\+\+|--)', "use foo++, not foo ++"),
Matt Mackall
Introduce check-code.py...
r10281 (r'\w,\w', "missing whitespace after ,"),
Matt Mackall
osutil: fix up check-code issues
r13736 (r'^[^#]\w[+/*]\w', "missing whitespace in expression"),
Matt Mackall
Introduce check-code.py...
r10281 (r'^#\s+\w', "use #foo, not # foo"),
(r'[^\n]\Z', "no trailing newline"),
Dan Villiom Podlaski Christiansen
osutil: replace #import with #include, and add a check for it
r13748 (r'^\s*#import\b', "use only #include in standard C code"),
Idan Kamara
check-code: separate warnings to avoid repetitive str.startswith
r14009 ],
# warnings
[]
Matt Mackall
Introduce check-code.py...
r10281 ]
cfilters = [
(r'(/\*)(((\*(?!/))|[^*])*)\*/', repccomment),
Benoit Boissinot
check-code: improve quote detection regexp, add tests
r10722 (r'''(?P<quote>(?<!")")(?P<text>([^"]|\\")+)"(?!")''', repquote),
Matt Mackall
Introduce check-code.py...
r10281 (r'''(#\s*include\s+<)([^>]+)>''', repinclude),
(r'(\()([^)]+\))', repcallspaces),
]
timeless
check-code: check for repo in revlog and ui in util
r14137 inutilpats = [
[
(r'\bui\.', "don't use ui in util"),
],
# warnings
[]
]
inrevlogpats = [
[
(r'\brepo\.', "don't use repo in revlog"),
],
# warnings
[]
]
Steven Brown
check-code: check for consistent usage of the websub filter in hgweb templates...
r21487 webtemplatefilters = []
webtemplatepats = [
[],
[
(r'{desc(\|(?!websub|firstline)[^\|]*)+}',
'follow desc keyword with either firstline or websub'),
]
]
Matt Mackall
Introduce check-code.py...
r10281 checks = [
Matt Mackall
check-code: look at shebang to identify Python scripts
r21222 ('python', r'.*\.(py|cgi)$', r'^#!.*python', pyfilters, pypats),
('test script', r'(.*/)?test-[^.~]*$', '', testfilters, testpats),
('c', r'.*\.[ch]$', '', cfilters, cpats),
('unified test', r'.*\.t$', '', utestfilters, utestpats),
('layering violation repo in revlog', r'mercurial/revlog\.py', '',
pyfilters, inrevlogpats),
('layering violation ui in util', r'mercurial/util\.py', '', pyfilters,
timeless
check-code: check for repo in revlog and ui in util
r14137 inutilpats),
Matt Mackall
check-code: look at shebang to identify Python scripts
r21222 ('txt', r'.*\.txt$', '', txtfilters, txtpats),
Steven Brown
check-code: check for consistent usage of the websub filter in hgweb templates...
r21487 ('web template', r'mercurial/templates/.*\.tmpl', '',
webtemplatefilters, webtemplatepats),
Matt Mackall
Introduce check-code.py...
r10281 ]
Simon Heimberg
check-code: only fix patterns once...
r19307 def _preparepats():
for c in checks:
failandwarn = c[-1]
for pats in failandwarn:
for i, pseq in enumerate(pats):
# fix-up regexes for multi-line searches
Simon Heimberg
cleanup: drop unused variables and an unused import
r19378 p = pseq[0]
Simon Heimberg
check-code: only fix patterns once...
r19307 # \s doesn't match \n
p = re.sub(r'(?<!\\)\\s', r'[ \\t]', p)
# [^...] doesn't match newline
p = re.sub(r'(?<!\\)\[\^', r'[^\\n', p)
Simon Heimberg
check-code: compile all patterns on initialisation...
r19308 pats[i] = (re.compile(p, re.MULTILINE),) + pseq[1:]
Matt Mackall
check-code: look at shebang to identify Python scripts
r21222 filters = c[3]
Simon Heimberg
check-code: compile filters when loading
r19309 for i, flt in enumerate(filters):
filters[i] = re.compile(flt[0]), flt[1]
Simon Heimberg
check-code: only fix patterns once...
r19307 _preparepats()
Pierre-Yves David
code-code: Add a logfunc argument to checkfile...
r10719 class norepeatlogger(object):
def __init__(self):
self._lastseen = None
Matt Mackall
check-code: add --blame switch
r11604 def log(self, fname, lineno, line, msg, blame):
Pierre-Yves David
code-code: Add a logfunc argument to checkfile...
r10719 """print error related to a given line of a given file.
The faulty line will also be printed but only once in the case
of multiple errors.
Matt Mackall
Introduce check-code.py...
r10281
Pierre-Yves David
code-code: Add a logfunc argument to checkfile...
r10719 :fname: filename
:lineno: line number
:line: actual content of the line
:msg: error message
"""
msgid = fname, lineno, line
if msgid != self._lastseen:
Matt Mackall
check-code: add --blame switch
r11604 if blame:
print "%s:%d (%s):" % (fname, lineno, blame)
else:
print "%s:%d:" % (fname, lineno)
Pierre-Yves David
code-code: Add a logfunc argument to checkfile...
r10719 print " > %s" % line
self._lastseen = msgid
print " " + msg
_defaultlogger = norepeatlogger()
Matt Mackall
check-code: add --blame switch
r11604 def getblame(f):
lines = []
for l in os.popen('hg annotate -un %s' % f):
start, line = l.split(':', 1)
user, rev = start.split()
lines.append((line[1:-1], user, rev))
return lines
def checkfile(f, logfunc=_defaultlogger.log, maxerr=None, warnings=False,
Mads Kiilerich
check-code: add --nolineno option for hiding line numbers...
r15502 blame=False, debug=False, lineno=True):
Pierre-Yves David
code-code: Add a logfunc argument to checkfile...
r10719 """checks style and portability of a given file
:f: filepath
:logfunc: function used to report error
logfunc(filename, linenumber, linecontent, errormessage)
Mads Kiilerich
fix trivial spelling errors
r17424 :maxerr: number of errors to display before aborting.
Mads Kiilerich
tests: keep track of all check-code.py warnings
r15873 Set to false (default) to report all errors
Pierre-Yves David
check-code: add a return value to checkfile function...
r10720
return True if no error is found, False otherwise.
Pierre-Yves David
code-code: Add a logfunc argument to checkfile...
r10719 """
Matt Mackall
check-code: add --blame switch
r11604 blamecache = None
Pierre-Yves David
check-code: add a return value to checkfile function...
r10720 result = True
Matt Mackall
check-code: look at shebang to identify Python scripts
r21222
try:
fp = open(f)
Gregory Szorc
global: mass rewrite to use modern exception syntax...
r25660 except IOError as e:
Matt Mackall
check-code: look at shebang to identify Python scripts
r21222 print "Skipping %s, %s" % (f, str(e).split(':', 1)[0])
return result
pre = post = fp.read()
fp.close()
for name, match, magic, filters, pats in checks:
timeless
check-code: adding debug flag
r14135 if debug:
print name, f
Matt Mackall
Introduce check-code.py...
r10281 fc = 0
Matt Mackall
check-code: look at shebang to identify Python scripts
r21222 if not (re.match(match, f) or (magic and re.search(magic, pre))):
timeless
check-code: adding debug flag
r14135 if debug:
print "Skipping %s for %s it doesn't match %s" % (
name, f, match)
Matt Mackall
Introduce check-code.py...
r10281 continue
Simon Heimberg
check-code: concatenate "check-code" on compile time...
r19382 if "no-" "check-code" in pre:
Simon Heimberg
check-code: always report when a file is skipped by "no-check-code"...
r20239 print "Skipping %s it has no-" "check-code" % f
return "Skip" # skip checking this file
Matt Mackall
Introduce check-code.py...
r10281 for p, r in filters:
post = re.sub(p, r, post)
Simon Heimberg
check-code: automatically preppend "warning: " to all warning messages...
r19422 nerrs = len(pats[0]) # nerr elements are errors
Idan Kamara
check-code: separate warnings to avoid repetitive str.startswith
r14009 if warnings:
pats = pats[0] + pats[1]
else:
pats = pats[0]
Matt Mackall
Introduce check-code.py...
r10281 # print post # uncomment to show filtered version
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281
timeless
check-code: adding debug flag
r14135 if debug:
print "Checking %s for %s" % (name, f)
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281
prelines = None
errors = []
Simon Heimberg
check-code: automatically preppend "warning: " to all warning messages...
r19422 for i, pat in enumerate(pats):
Brodie Rao
check-code: ignore naked excepts with a "re-raise" comment...
r16705 if len(pat) == 3:
p, msg, ignore = pat
else:
p, msg = pat
ignore = None
Simon Heimberg
check-code: prepend warning prefix only once, but for each warning...
r20005 if i >= nerrs:
msg = "warning: " + msg
Brodie Rao
check-code: ignore naked excepts with a "re-raise" comment...
r16705
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281 pos = 0
n = 0
Simon Heimberg
check-code: compile all patterns on initialisation...
r19308 for m in p.finditer(post):
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281 if prelines is None:
prelines = pre.splitlines()
postlines = post.splitlines(True)
start = m.start()
while n < len(postlines):
step = len(postlines[n])
if pos + step > start:
break
pos += step
n += 1
l = prelines[n]
Simon Heimberg
check-code: drop now unused check-code-ignore...
r20242 if ignore and re.search(ignore, l, re.MULTILINE):
Simon Heimberg
check-code: print debug output when an ignore pattern matches
r20243 if debug:
print "Skipping %s for %s:%s (ignore pattern)" % (
name, f, n)
Brodie Rao
check-code: ignore naked excepts with a "re-raise" comment...
r16705 continue
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281 bd = ""
if blame:
bd = 'working directory'
if not blamecache:
blamecache = getblame(f)
if n < len(blamecache):
bl, bu, br = blamecache[n]
if bl == l:
bd = '%s@%s' % (bu, br)
Simon Heimberg
check-code: prepend warning prefix only once, but for each warning...
r20005
Mads Kiilerich
check-code: add --nolineno option for hiding line numbers...
r15502 errors.append((f, lineno and n + 1, l, msg, bd))
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281 result = False
errors.sort()
for e in errors:
logfunc(*e)
fc += 1
Mads Kiilerich
tests: keep track of all check-code.py warnings
r15873 if maxerr and fc >= maxerr:
Matt Mackall
Introduce check-code.py...
r10281 print " (too many errors, giving up)"
break
Matt Mackall
check-code: support multiline matches like try/except/finally...
r15281
Pierre-Yves David
check-code: add a return value to checkfile function...
r10720 return result
Pierre-Yves David
check-code: Add a ``checkfile`` function...
r10717
Pierre-Yves David
check-code: Only call check-code if __name__ = "__main__"....
r10716 if __name__ == "__main__":
Matt Mackall
check-code: add a warnings level...
r10895 parser = optparse.OptionParser("%prog [options] [files]")
parser.add_option("-w", "--warnings", action="store_true",
help="include warning-level checks")
parser.add_option("-p", "--per-file", type="int",
help="max warnings per file")
Matt Mackall
check-code: add --blame switch
r11604 parser.add_option("-b", "--blame", action="store_true",
help="use annotate to generate blame info")
timeless
check-code: adding debug flag
r14135 parser.add_option("", "--debug", action="store_true",
help="show debug information")
Mads Kiilerich
check-code: add --nolineno option for hiding line numbers...
r15502 parser.add_option("", "--nolineno", action="store_false",
dest='lineno', help="don't show line numbers")
Matt Mackall
check-code: add a warnings level...
r10895
Mads Kiilerich
check-code: add --nolineno option for hiding line numbers...
r15502 parser.set_defaults(per_file=15, warnings=False, blame=False, debug=False,
lineno=True)
Matt Mackall
check-code: add a warnings level...
r10895 (options, args) = parser.parse_args()
if len(args) == 0:
Pierre-Yves David
check-code: Only call check-code if __name__ = "__main__"....
r10716 check = glob.glob("*")
else:
Matt Mackall
check-code: add a warnings level...
r10895 check = args
Matt Mackall
Introduce check-code.py...
r10281
Mads Kiilerich
check-code: fix return code initialization...
r15544 ret = 0
Pierre-Yves David
check-code: Only call check-code if __name__ = "__main__"....
r10716 for f in check:
Alecs King
check-code: add exit status...
r11816 if not checkfile(f, maxerr=options.per_file, warnings=options.warnings,
Mads Kiilerich
check-code: add --nolineno option for hiding line numbers...
r15502 blame=options.blame, debug=options.debug,
lineno=options.lineno):
Alecs King
check-code: add exit status...
r11816 ret = 1
sys.exit(ret)
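
The least obvious step in checkfile is how a regex match offset in the filtered text is turned back into a line number. The following is a minimal standalone sketch of that walk, written in Python 3 syntax rather than the Python 2 of the listing. The function name match_line_numbers and the sample input are hypothetical and are not part of check-code.py.

```python
import re


def match_line_numbers(pattern, text):
    """Return the 1-based line number on which each match of pattern starts.

    pattern is a compiled regex; text is the whole file content.
    """
    # Keep line endings so that the line lengths sum to character offsets.
    lines = text.splitlines(True)
    pos = 0  # character offset of the start of lines[n]
    n = 0    # index of the line currently under consideration
    found = []
    for m in pattern.finditer(text):
        start = m.start()
        # finditer yields matches in increasing offset order, so pos and n
        # only ever move forward and the file is walked once per pattern.
        while n < len(lines):
            step = len(lines[n])
            if pos + step > start:
                break  # lines[n] contains the start of the match
            pos += step
            n += 1
        found.append(n + 1)
    return found


sample = "try:\n    pass\nexcept:\n    pass\nexcept:\n"
naked_except = re.compile(r"^except:", re.MULTILINE)
print(match_line_numbers(naked_except, sample))
```

Because the filters replace text rather than remove it, line n of the filtered text is assumed to line up with line n of the original, which is why the listing can report prelines[n] for a match found in post.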