##// END OF EJS Templates
mdiff: speed up showfunc for large diffs...
mdiff: speed up showfunc for large diffs This addresses the following issues with showfunc: - Silly usage of regular expressions. - Doing str.rstrip() needlessly in an inner loop. - Doing catastrophic backtracking when trying to find a function line. Finding function text is now at worst O(n lines in the old file), and at best close to O(n hunks). Given a diff like this[1]: src/main/antlr3/uk/ac/cam/ch/wwmm/pregenerated/ChemicalChunker.g | 4 +- src/main/java/uk/ac/cam/ch/wwmm/pregenerated/ChemicalChunkerLexer.java | 2 +- src/main/java/uk/ac/cam/ch/wwmm/pregenerated/ChemicalChunkerParser.java | 29189 +++++---- 3 files changed, 14741 insertions(+), 14454 deletions(-) [1]: https://bitbucket.org/wwmm/chemicaltagger/changeset/d2bfbaecd4fc/raw Without this change, hg log --stat --config diff.showfunc=1 takes an absurdly long time to complete: CallCount Recursive Total(ms) Inline(ms) module:lineno(function) 32813 0 80.3546 40.6086 mercurial.mdiff:160(yieldhunk) +65062746 0 25.7227 25.7227 +<method 'match' of '_sre.SRE_Pattern' objects> +65062746 0 14.0221 14.0221 +<method 'rstrip' of 'str' objects> +1809 0 0.0009 0.0009 +mercurial.mdiff:148(contextend) +1809 0 0.0003 0.0003 +<len> 65062746 0 25.7227 25.7227 <method 'match' of '_sre.SRE_Pattern' objects> 65062763 0 14.0221 14.0221 <method 'rstrip' of 'str' objects> 543 0 0.1631 0.1631 <zlib.decompress> 3 0 0.0505 0.0505 <mercurial.bdiff.blocks> 31007 0 80.4564 0.0477 mercurial.mdiff:147(_unidiff) +32813 0 80.3546 40.6086 +mercurial.mdiff:160(yieldhunk) +3 0 0.0505 0.0505 +<mercurial.bdiff.blocks> +3618 0 0.0022 0.0022 +mercurial.mdiff:154(contextstart) +5427 0 0.0013 0.0013 +<len> +3 0 0.0001 0.0000 +re:188(compile) 1 0 80.8381 0.0322 mercurial.patch:1777(diffstatdata) +107499 0 0.0235 0.0235 +<method 'startswith' of 'str' objects> +31014 0 80.7820 0.0071 +mercurial.util:1284(iterlines) +3 0 0.0000 0.0000 +<method 'search' of '_sre.SRE_Pattern' objects> +4 0 0.0000 0.0000 +mercurial.patch:1783(addresult) +3 0 0.0000 0.0000 +<method 'group' of '_sre.SRE_Match' objects> 6 0 0.0444 0.0283 mercurial.mdiff:12(splitnewlines) +6 0 0.0160 0.0160 +<method 'split' of 'str' objects> 32 0 0.0246 0.0246 <method 'update' of '_hashlib.HASH' objects> 11 0 0.0236 0.0236 <method 'read' of 'file' objects> Time: real 80.880 secs (user 80.200+0.000 sys 0.380+0.000) With this change, it's almost as fast as not using showfunc at all: CallCount Recursive Total(ms) Inline(ms) module:lineno(function) 543 0 0.1699 0.1699 <zlib.decompress> 3 0 0.0501 0.0501 <mercurial.bdiff.blocks> 32813 0 0.0415 0.0348 mercurial.mdiff:161(yieldhunk) +70837 0 0.0058 0.0058 +<method 'isalnum' of 'str' objects> +1809 0 0.0006 0.0006 +mercurial.mdiff:148(contextend) +1809 0 0.0002 0.0002 +<len> 1 0 0.4879 0.0310 mercurial.patch:1777(diffstatdata) +107499 0 0.0230 0.0230 +<method 'startswith' of 'str' objects> +31014 0 0.4335 0.0065 +mercurial.util:1284(iterlines) +3 0 0.0000 0.0000 +<method 'search' of '_sre.SRE_Pattern' objects> +4 0 0.0000 0.0000 +mercurial.patch:1783(addresult) +1 0 0.0004 0.0000 +re:188(compile) 32 0 0.0293 0.0293 <method 'update' of '_hashlib.HASH' objects> 6 0 0.0427 0.0279 mercurial.mdiff:12(splitnewlines) +6 0 0.0147 0.0147 +<method 'split' of 'str' objects> 31007 0 0.1169 0.0235 mercurial.mdiff:147(_unidiff) +3 0 0.0501 0.0501 +<mercurial.bdiff.blocks> +32813 0 0.0415 0.0348 +mercurial.mdiff:161(yieldhunk) +3618 0 0.0012 0.0012 +mercurial.mdiff:154(contextstart) +5427 0 0.0006 0.0006 +<len> 107597 0 0.0230 0.0230 <method 'startswith' of 'str' objects> 16 0 0.0213 0.0213 <mercurial.mpatch.patches> 194 0 0.0149 0.0149 <method 'split' of 'str' objects> Time: real 0.530 secs (user 0.450+0.000 sys 0.070+0.000)

File last commit:

r14943:d3bb825d default
r15141:16dc9a32 default
Show More
fancyopts.py
117 lines | 3.3 KiB | text/x-python | PythonLexer
Martin Geisler
fancyopts: add copyright and license header
r8230 # fancyopts.py - better command line parsing
#
# Copyright 2005-2009 Matt Mackall <mpm@selenic.com> and others
#
# This software may be used and distributed according to the terms of the
Matt Mackall
Update license to GPLv2+
r10263 # GNU General Public License version 2 or any later version.
Martin Geisler
fancyopts: add copyright and license header
r8230
mark.williamson@cl.cam.ac.uk
A number of minor fixes to problems that pychecker found....
r667 import getopt
mpm@selenic.com
Add back links from file revisions to changeset revisions...
r0
Augie Fackler
fancyopts: Parse options that occur after arguments....
r7772 def gnugetopt(args, options, longoptions):
"""Parse options mostly like getopt.gnu_getopt.
This is different from getopt.gnu_getopt in that an argument of - will
become an argument of - instead of vanishing completely.
"""
extraargs = []
if '--' in args:
stopindex = args.index('--')
Matt Mackall
many, many trivial check-code fixups
r10282 extraargs = args[stopindex + 1:]
Augie Fackler
fancyopts: Parse options that occur after arguments....
r7772 args = args[:stopindex]
opts, parseargs = getopt.getopt(args, options, longoptions)
args = []
while parseargs:
arg = parseargs.pop(0)
if arg and arg[0] == '-' and len(arg) > 1:
parseargs.insert(0, arg)
topts, newparseargs = getopt.getopt(parseargs, options, longoptions)
opts = opts + topts
parseargs = newparseargs
else:
args.append(arg)
args.extend(extraargs)
return opts, args
def fancyopts(args, options, state, gnu=False):
Matt Mackall
fancyopts: lots of cleanups
r5638 """
read args, parse options, and store options in state
each option is a tuple of:
short option or ''
long option
default value
description
FUJIWARA Katsunori
help: show value requirement and multiple occurrence of options...
r11321 option value label(optional)
Matt Mackall
fancyopts: lots of cleanups
r5638
option types include:
boolean or none - option sets variable in state to true
string - parameter string is stored in state
list - parameter string is added to a list
integer - parameter strings is stored as int
function - call function with parameter
mpm@selenic.com
Add back links from file revisions to changeset revisions...
r0
Matt Mackall
fancyopts: lots of cleanups
r5638 non-option args are returned
"""
namelist = []
shortlist = ''
argmap = {}
defmap = {}
FUJIWARA Katsunori
help: show value requirement and multiple occurrence of options...
r11321 for option in options:
if len(option) == 5:
short, name, default, comment, dummy = option
else:
short, name, default, comment = option
Matt Mackall
fancyopts: lots of cleanups
r5638 # convert opts to getopt format
oname = name
name = name.replace('-', '_')
argmap['-' + short] = argmap['--' + oname] = name
defmap[name] = default
# copy defaults to state
if isinstance(default, list):
state[name] = default[:]
Augie Fackler
globally: use safehasattr(x, '__call__') instead of hasattr(x, '__call__')
r14943 elif getattr(default, '__call__', False):
Matt Mackall
fancyopts: lots of cleanups
r5638 state[name] = None
Thomas Arendsen Hein
fancyopts: Copy list arguments in command table before modifying....
r5093 else:
Matt Mackall
fancyopts: lots of cleanups
r5638 state[name] = default
mpm@selenic.com
Add back links from file revisions to changeset revisions...
r0
Matt Mackall
fancyopts: lots of cleanups
r5638 # does it take a parameter?
if not (default is None or default is True or default is False):
Matt Mackall
many, many trivial check-code fixups
r10282 if short:
short += ':'
if oname:
oname += '='
Matt Mackall
fancyopts: lots of cleanups
r5638 if short:
shortlist += short
if name:
namelist.append(oname)
# parse arguments
Augie Fackler
fancyopts: Parse options that occur after arguments....
r7772 if gnu:
parse = gnugetopt
else:
parse = getopt.getopt
opts, args = parse(args, shortlist, namelist)
mpm@selenic.com
Add back links from file revisions to changeset revisions...
r0
Matt Mackall
fancyopts: lots of cleanups
r5638 # transfer result to state
for opt, val in opts:
name = argmap[opt]
t = type(defmap[name])
if t is type(fancyopts):
state[name] = defmap[name](val)
elif t is type(1):
state[name] = int(val)
elif t is type(''):
state[name] = val
elif t is type([]):
state[name].append(val)
elif t is type(None) or t is type(False):
state[name] = True
mpm@selenic.com
Beginning of new command parsing interface...
r209
Matt Mackall
fancyopts: lots of cleanups
r5638 # return unparsed args
mpm@selenic.com
Add back links from file revisions to changeset revisions...
r0 return args