merge: perform background file closing in batchget

As 2fdbf22a1b63 demonstrated with stream clones, closing files on background threads on Windows can yield a significant speedup, because closing files that have been created or appended to is slow on Windows/NTFS.

Working directory updates can write thousands of files, so they are susceptible to excessive slowness on Windows due to slow file closes. This patch enables background file closing when performing working directory file writes.

The impact when performing an `hg up tip` on mozilla-central (136,357 files) from an empty working directory is significant:

Before: 535s (8:55)
After: 133s (2:13)
Delta: -402s (6:42)

That's a 4x speedup! By comparison, that same machine can perform the same operation in ~15s on Linux. So Windows went from ~35x to ~9x slower. Not bad, but there's still work to do.

As a reminder, background file closing is only activated on Windows because it is only beneficial on that platform. So this patch shouldn't change non-Windows behavior at all.

It's worth noting that non-Windows systems perform working directory updates with multiple processes. Unfortunately, worker.py doesn't yet support Windows, so there is still plenty of room for making working directory updates faster on Windows. Even if multiple processes are used on Windows, I believe background file closing will still provide a benefit, as individual processes will still be slowed down by the file close bottleneck (assuming the I/O system isn't saturated).
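
The idea of closing files on background threads can be sketched roughly as follows. This is only an illustration and not Mercurial's implementation: `BackgroundCloser`, `writefiles` and the `threadcount` parameter are hypothetical names made up for this sketch. The writer hands each open file to a queue and moves on, while a few worker threads absorb the slow `close()` calls.

```python
import os
import tempfile
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2


class BackgroundCloser(object):
    """Hypothetical helper: hand open files to worker threads for closing."""

    def __init__(self, threadcount=4):
        self._queue = queue.Queue()
        self._threads = []
        for _ in range(threadcount):
            t = threading.Thread(target=self._worker)
            t.daemon = True
            t.start()
            self._threads.append(t)

    def _worker(self):
        while True:
            fh = self._queue.get()
            if fh is None:      # sentinel: no more files to close
                break
            fh.close()

    def close(self, fh):
        """Schedule fh to be closed; returns immediately."""
        self._queue.put(fh)

    def finish(self):
        """Wait until every queued file has been closed."""
        for _ in self._threads:
            self._queue.put(None)
        for t in self._threads:
            t.join()


def writefiles(directory, contents, closer):
    """Write {name: data} into directory, deferring each close()."""
    for name, data in contents.items():
        fh = open(os.path.join(directory, name), 'wb')
        fh.write(data)
        closer.close(fh)
```

A caller would create one closer, pass it to `writefiles()`, and call `finish()` before relying on the files being fully written. Error handling for a failing `close()` is left out of the sketch.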

File last commit:

r28073:c4bec3c4 default
r28200:588695cc default
revsetbenchmarks.py
325 lines | 10.0 KiB | text/x-python | PythonLexer
/ contrib / revsetbenchmarks.py
Pierre-Yves David
revsetbenchmark: simplify and convert the script to python...
r20848 #!/usr/bin/env python
# Measure the performance of a list of revsets against multiple revisions
# defined by parameter. Checkout one by one and run perfrevset with every
# revset in the list to benchmark its performance.
#
Pierre-Yves David
revsetbenchmarks: fix argument parsing...
r25535 # You should run this from the root of your mercurial repository.
Pierre-Yves David
revsetbenchmark: simplify and convert the script to python...
r20848 #
Pierre-Yves David
revsetbenchmarks: fix argument parsing...
r25535 # call with --help for details
Pierre-Yves David
revsetbenchmark: simplify and convert the script to python...
r20848
import sys
Pierre-Yves David
revsetbenchmark: automatically finds the perf extension...
r21548 import os
Pierre-Yves David
revsetbenchmarks: parse perfrevset output into actual number...
r25530 import re
Pierre-Yves David
revsetbenchmarks: display relative change when meaningful...
r25539 import math
Durham Goode
revsetbenchmark: remove python 2.7 dependency...
r20893 from subprocess import check_call, Popen, CalledProcessError, STDOUT, PIPE
Pierre-Yves David
revsetbenchmark: use optparse to retrieve argument...
r21287 # cannot use argparse, python 2.7 only
from optparse import OptionParser
Pierre-Yves David
revsetbenchmarks: use combination variants in default set...
r25544 DEFAULTVARIANTS = ['plain', 'min', 'max', 'first', 'last',
'reverse', 'reverse+first', 'reverse+last',
'sort', 'sort+first', 'sort+last']
Pierre-Yves David
revsetbenchmarks: allow running multiple variants per revset...
r25540
Durham Goode
revsetbenchmark: remove python 2.7 dependency...
r20893 def check_output(*args, **kwargs):
kwargs.setdefault('stderr', PIPE)
kwargs.setdefault('stdout', PIPE)
proc = Popen(*args, **kwargs)
output, error = proc.communicate()
if proc.returncode != 0:
Pierre-Yves David
revsetbenchmark: fix error raising...
r21202 raise CalledProcessError(proc.returncode, ' '.join(args[0]))
Durham Goode
revsetbenchmark: remove python 2.7 dependency...
r20893 return output
Pierre-Yves David
revsetbenchmark: simplify and convert the script to python...
r20848
Pierre-Yves David
revsetbenchmark: convert update to proper subprocess call
r20850 def update(rev):
"""update the repo to a revision"""
try:
check_call(['hg', 'update', '--quiet', '--check', str(rev)])
Yuya Nishihara
revsetbenchmarks: run make after update so that C extensions are built
r26034 check_output(['make', 'local'],
stderr=None) # suppress output except for error/warning
Gregory Szorc
global: mass rewrite to use modern exception syntax...
r25660 except CalledProcessError as exc:
Pierre-Yves David
revsetbenchmark: convert update to proper subprocess call
r20850 print >> sys.stderr, 'update to revision %s failed, aborting' % rev
sys.exit(exc.returncode)
Pierre-Yves David
revsetbenchmarks: extract call to mercurial into a function...
r25528
def hg(cmd, repo=None):
"""run a mercurial command
<cmd> is the list of command + argument,
<repo> is an optional repository path to run this command in."""
fullcmd = ['./hg']
if repo is not None:
fullcmd += ['-R', repo]
fullcmd += ['--config',
'extensions.perf=' + os.path.join(contribdir, 'perf.py')]
fullcmd += cmd
return check_output(fullcmd, stderr=STDOUT)
Gregory Szorc
revsetbenchmarks: support benchmarking changectx loading...
r27073 def perf(revset, target=None, contexts=False):
Pierre-Yves David
revsetbenchmark: convert performance call to proper subprocess call
r20851 """run benchmark for this very revset"""
try:
Gregory Szorc
revsetbenchmarks: support benchmarking changectx loading...
r27073 args = ['perfrevset', revset]
if contexts:
args.append('--contexts')
output = hg(args, repo=target)
Pierre-Yves David
revsetbenchmarks: parse perfrevset output into actual number...
r25530 return parseoutput(output)
Gregory Szorc
global: mass rewrite to use modern exception syntax...
r25660 except CalledProcessError as exc:
Pierre-Yves David
revsetbenchmarks: improve error output in case of failure...
r25529 print >> sys.stderr, 'abort: cannot run revset benchmark: %s' % exc.cmd
Durham Goode
revsetbenchmark: handle exception case...
r28073 if getattr(exc, 'output', None) is None: # no output before 2.7
Mads Kiilerich
spelling: trivial spell checking
r26781 print >> sys.stderr, '(no output)'
Pierre-Yves David
revsetbenchmarks: improve error output in case of failure...
r25529 else:
print >> sys.stderr, exc.output
Pierre-Yves David
revsetbenchmark: do not abort on failure to run a revset...
r25646 return None
Pierre-Yves David
revsetbenchmark: convert performance call to proper subprocess call
r20851
Pierre-Yves David
revsetbenchmarks: parse perfrevset output into actual number...
r25530 outputre = re.compile(r'! wall (\d+\.\d+) comb (\d+\.\d+) user (\d+\.\d+) '
r'sys (\d+\.\d+) \(best of (\d+)\)')
def parseoutput(output):
"""parse a textual output into a dict
We cannot just use json because we want to compare with old
versions of Mercurial that may not support json output.
"""
match = outputre.search(output)
if not match:
print >> sys.stderr, 'abort: invalid output:'
print >> sys.stderr, output
sys.exit(1)
return {'comb': float(match.group(2)),
'count': int(match.group(5)),
'sys': float(match.group(4)),
'user': float(match.group(3)),
'wall': float(match.group(1)),
}
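
As an illustration of the parsing step, the sketch below applies a pattern of the same shape to a sample line. The sample line is made up for this example and is not real benchmark output. The capture groups follow the order of the fields in the line: wall, comb, user, sys, then the run count.

```python
import re

# Same shape as the script's pattern, with the decimal point escaped.
outputre = re.compile(r'! wall (\d+\.\d+) comb (\d+\.\d+) user (\d+\.\d+) '
                      r'sys (\d+\.\d+) \(best of (\d+)\)')

# Made-up sample line in the format the pattern expects.
sample = ('! wall 0.000123 comb 0.012000 user 0.010000 '
          'sys 0.002000 (best of 100)')

match = outputre.search(sample)
timing = {
    'wall': float(match.group(1)),
    'comb': float(match.group(2)),
    'user': float(match.group(3)),
    'sys': float(match.group(4)),
    'count': int(match.group(5)),
}
```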
Pierre-Yves David
revsetbenchmark: convert revision display to proper subprocesscall
r20852 def printrevision(rev):
"""print data about a revision"""
Pierre-Yves David
revsetbenchmarks: improve revision printing...
r25538 sys.stdout.write("Revision ")
Pierre-Yves David
revsetbenchmark: convert revision display to proper subprocesscall
r20852 sys.stdout.flush()
check_call(['hg', 'log', '--rev', str(rev), '--template',
Pierre-Yves David
revsetbenchmarks: also display tag when printing a revision...
r25546 '{if(tags, " ({tags})")} '
Pierre-Yves David
revsetbenchmarks: improve revision printing...
r25538 '{rev}:{node|short}: {desc|firstline}\n'])
Pierre-Yves David
revsetbenchmark: convert revision display to proper subprocesscall
r20852
Pierre-Yves David
revsetbenchmarks: ensure all indexes have the same width...
r25532 def idxwidth(nbidx):
"""return the max width of number used for index
Augie Fackler
revsetbenchmarks: clarify comment based on irc discussion
r25533 This is similar to log10(nbidx), but we use custom code here
because we start with zero and we'd rather not deal with all the
extra rounding business that log10 would imply.
"""
Pierre-Yves David
revsetbenchmarks: ensure all indexes have the same width...
r25532 nbidx -= 1 # starts at 0
idxwidth = 0
while nbidx:
idxwidth += 1
nbidx //= 10
if not idxwidth:
idxwidth = 1
return idxwidth
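
A small standalone copy of the width computation, shown with a few inputs and with the zero-padded mask that `printresult` builds from it. The local variable is renamed to `width` here only to avoid shadowing the function name.

```python
def idxwidth(nbidx):
    """Number of digits needed for the largest index below nbidx."""
    nbidx -= 1  # indexes start at 0, so the largest one is nbidx - 1
    width = 0
    while nbidx:
        width += 1
        nbidx //= 10
    if not width:
        width = 1
    return width


# 10 entries are numbered 0..9 (one digit), 11 entries need two digits.
widths = [idxwidth(n) for n in (1, 10, 11, 100, 101)]

# Build the same kind of mask as printresult for 11 entries.
mask = '%%0%ii) %%s' % idxwidth(11)
line = mask % (3, 'all()')
```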
Pierre-Yves David
revsetbenchmarks: display relative change when meaningful...
r25539 def getfactor(main, other, field, sensitivity=0.05):
"""return the relative factor between values for 'field' in main and other
Mads Kiilerich
spelling: trivial spell checking
r26781 Return None if the factor is insignificant (less than <sensitivity>
Pierre-Yves David
revsetbenchmarks: display relative change when meaningful...
r25539 variation)."""
factor = 1
if main is not None:
factor = other[field] / main[field]
low, high = 1 - sensitivity, 1 + sensitivity
if (low < factor < high):
return None
return factor
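
To make the sensitivity window concrete, here is a standalone copy of the function applied to made-up timing dicts. With the default sensitivity of 0.05, a ratio strictly between 0.95 and 1.05 is treated as noise and reported as None.

```python
def getfactor(main, other, field, sensitivity=0.05):
    """Ratio other/main for field, or None when within the noise window."""
    factor = 1
    if main is not None:
        factor = other[field] / main[field]
    low, high = 1 - sensitivity, 1 + sensitivity
    if low < factor < high:
        return None
    return factor


reference = {'wall': 1.0}                              # made-up timings
noise = getfactor(reference, {'wall': 1.03}, 'wall')   # 3% change: ignored
slower = getfactor(reference, {'wall': 2.0}, 'wall')   # twice as slow
faster = getfactor({'wall': 2.0}, {'wall': 1.0}, 'wall')
noref = getfactor(None, {'wall': 5.0}, 'wall')         # no reference yet
```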
def formatfactor(factor):
"""format a factor into a 4 char string
22%
156%
x2.4
x23
x789
x1e4
x5x7
"""
if factor is None:
return '    '
elif factor < 2:
return '%3i%%' % (factor * 100)
elif factor < 10:
return 'x%3.1f' % factor
elif factor < 1000:
return '%4s' % ('x%i' % factor)
else:
# keep the leading digit and the power of ten (e.g. x5x7 for 5e7)
order = int(math.log10(factor))
while 10 <= factor:
factor //= 10
return 'x%ix%i' % (factor, order)
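
The formatting tiers can be sketched as below. The handling of very large factors is an assumption: it prints the leading digit followed by the power of ten, which is how the `x5x7` docstring example reads, and it counts the exponent in a loop rather than calling `math.log`.

```python
def formatfactor(factor):
    """Format a factor into a short string, 4 characters in the usual cases."""
    if factor is None:
        return '    '
    elif factor < 2:
        return '%3i%%' % (factor * 100)       # percentage of the reference
    elif factor < 10:
        return 'x%3.1f' % factor              # one decimal place
    elif factor < 1000:
        return '%4s' % ('x%i' % factor)       # right-aligned integer factor
    else:
        # Assumed meaning: leading digit, then the power of ten.
        exponent = 0
        while factor >= 10:
            factor //= 10
            exponent += 1
        return 'x%ix%i' % (factor, exponent)


samples = [formatfactor(f) for f in (None, 0.25, 1.5, 2.4, 23, 789, 5e7)]
```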
Pierre-Yves David
revsetbenchmarks: display even more compact timing result...
r25541 def formattiming(value):
"""format a value to strictly 8 char, dropping some precision if needed"""
if value < 10**7:
return ('%.6f' % value)[:8]
else:
# value is HUGE very unlikely to happen (4+ month run)
return '%i' % value
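
A standalone copy of the truncation logic with a few made-up values, showing how precision is dropped from the right as the integer part grows.

```python
def formattiming(value):
    """Format a timing to 8 characters, dropping precision if needed."""
    if value < 10 ** 7:
        # six decimals, then keep only the first 8 characters
        return ('%.6f' % value)[:8]
    else:
        return '%i' % value


small = formattiming(0.000123)
medium = formattiming(12.3456789)
large = formattiming(1234567.891)
huge = formattiming(12345678.9)
```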
Pierre-Yves David
revsetbenchmarks: display relative change when meaningful...
r25539 _marker = object()
Pierre-Yves David
revsetbenchmarks: allow running multiple variants per revset...
r25540 def printresult(variants, idx, data, maxidx, verbose=False, reference=_marker):
Pierre-Yves David
revsetbenchmarks: factor out result output into a function...
r25531 """print a line of result to stdout"""
Pierre-Yves David
revsetbenchmarks: ensure all indexes have the same width...
r25532 mask = '%%0%ii) %%s' % idxwidth(maxidx)
Pierre-Yves David
revsetbenchmark: do not abort on failure to run a revset...
r25646
Pierre-Yves David
revsetbenchmarks: allow running multiple variants per revset...
r25540 out = []
for var in variants:
Pierre-Yves David
revsetbenchmark: do not abort on failure to run a revset...
r25646 if data[var] is None:
out.append('error   ')
out.append(' ' * 4)
continue
Pierre-Yves David
revsetbenchmarks: display even more compact timing result...
r25541 out.append(formattiming(data[var]['wall']))
Pierre-Yves David
revsetbenchmarks: allow running multiple variants per revset...
r25540 if reference is not _marker:
factor = None
if reference is not None:
factor = getfactor(reference[var], data[var], 'wall')
out.append(formatfactor(factor))
if verbose:
Pierre-Yves David
revsetbenchmarks: display even more compact timing result...
r25541 out.append(formattiming(data[var]['comb']))
out.append(formattiming(data[var]['user']))
out.append(formattiming(data[var]['sys']))
Pierre-Yves David
revsetbenchmarks: allow running multiple variants per revset...
r25540 out.append('%6d' % data[var]['count'])
Pierre-Yves David
revsetbenchmarks: use a more compact output format with a header...
r25534 print mask % (idx, ' '.join(out))
Pierre-Yves David
revsetbenchmarks: parse perfrevset output into actual number...
r25530
Pierre-Yves David
revsetbenchmarks: allow running multiple variants per revset...
r25540 def printheader(variants, maxidx, verbose=False, relative=False):
header = [' ' * (idxwidth(maxidx) + 1)]
for var in variants:
if not var:
var = 'iter'
Pierre-Yves David
revsetbenchmarks: display even more compact timing result...
r25541 if 8 < len(var):
var = var[:3] + '..' + var[-3:]
header.append('%-8s' % var)
Pierre-Yves David
revsetbenchmarks: allow running multiple variants per revset...
r25540 if relative:
header.append('    ')
if verbose:
Pierre-Yves David
revsetbenchmarks: display even more compact timing result...
r25541 header.append('%-8s' % 'comb')
header.append('%-8s' % 'user')
header.append('%-8s' % 'sys')
Pierre-Yves David
revsetbenchmarks: allow running multiple variants per revset...
r25540 header.append('%6s' % 'count')
Pierre-Yves David
revsetbenchmarks: use a more compact output format with a header...
r25534 print ' '.join(header)
Pierre-Yves David
revsetbenchmarks: parse perfrevset output into actual number...
r25530
Pierre-Yves David
revsetbenchmark: get revision to benchmark in a function...
r20853 def getrevs(spec):
"""get the list of rev matched by a revset"""
try:
out = check_output(['hg', 'log', '--template={rev}\n', '--rev', spec])
Gregory Szorc
global: mass rewrite to use modern exception syntax...
r25660 except CalledProcessError as exc:
Pierre-Yves David
revsetbenchmark: get revision to benchmark in a function...
r20853 print >> sys.stderr, "abort, can't get revision from %s" % spec
sys.exit(exc.returncode)
return [r for r in out.split() if r]
Pierre-Yves David
revsetbenchmarks: allow running multiple variants per revset...
r25540 def applyvariants(revset, variant):
if variant == 'plain':
return revset
Pierre-Yves David
revsetbenchmarks: support combining variants with "+"...
r25543 for var in variant.split('+'):
revset = '%s(%s)' % (var, revset)
return revset
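
For illustration, a standalone copy of the wrapping logic. Combined variants are applied left to right, so the last name in a `+` list ends up as the outermost function. The `all()` revset is only an example input.

```python
def applyvariants(revset, variant):
    """Wrap revset in the functions named by variant ('a+b' nests b(a(...)))."""
    if variant == 'plain':
        return revset
    for var in variant.split('+'):
        revset = '%s(%s)' % (var, revset)
    return revset


plain = applyvariants('all()', 'plain')
single = applyvariants('all()', 'min')
combined = applyvariants('all()', 'reverse+first')
```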
Pierre-Yves David
revsetbenchmarks: allow running multiple variants per revset...
r25540
Pierre-Yves David
revsetbenchmarks: add main documention for the script...
r25607 helptext="""This script will run multiple variants of provided revsets using
different revisions in your mercurial repository. After the benchmarks are run,
Mads Kiilerich
spelling: trivial spell checking
r26781 a summary output is provided. Use it to demonstrate speed improvements or
Pierre-Yves David
revsetbenchmarks: add main documention for the script...
r25607 pinpoint regressions. Revsets to run are specified in a file (or from stdin), one
revset per line. Lines starting with '#' will be ignored, allowing insertion of
comments."""
parser = OptionParser(usage="usage: %prog [options] <revs>",
description=helptext)
Pierre-Yves David
revsetbenchmark: use optparse to retrieve argument...
r21287 parser.add_option("-f", "--file",
Mads Kiilerich
spelling: fixes from proofreading of spell checker issues
r23139 help="read revset from FILE (stdin if omitted)",
Pierre-Yves David
revsetbenchmark: make it clear that revsets may be read from stdin
r22555 metavar="FILE")
Pierre-Yves David
revsetbenchmark: support for running on other repo...
r21549 parser.add_option("-R", "--repo",
help="run benchmark on REPO", metavar="REPO")
Pierre-Yves David
revsetbenchmark: use optparse to retrieve argument...
r21287
Pierre-Yves David
revsetbenchmarks: hide most timing under a --verbose flag...
r25537 parser.add_option("-v", "--verbose",
action='store_true',
help="display all timing data (not just best total time)")
Pierre-Yves David
revsetbenchmarks: allow running multiple variants per revset...
r25540 parser.add_option("", "--variants",
default=','.join(DEFAULTVARIANTS),
help="comma separated list of variants to test "
"(e.g. plain,min,sort) (plain = no modification)")
Gregory Szorc
revsetbenchmarks: support benchmarking changectx loading...
r27073 parser.add_option('', '--contexts',
action='store_true',
help='obtain changectx from results instead of integer revs')
Pierre-Yves David
revsetbenchmarks: allow running multiple variants per revset...
r25540
Pierre-Yves David
revsetbenchmark: use optparse to retrieve argument...
r21287 (options, args) = parser.parse_args()
Pierre-Yves David
revsetbenchmark: simplify and convert the script to python...
r20848
Pierre-Yves David
revsetbenchmarks: fix argument parsing...
r25535 if not args:
Pierre-Yves David
revsetbenchmark: use optparse to retrieve argument...
r21287 parser.print_help()
Pierre-Yves David
revsetbenchmark: add a usage message when no arguments are passed...
r21286 sys.exit(255)
Pierre-Yves David
revsetbenchmark: automatically finds the perf extension...
r21548 # the directory where both this script and the perf.py extension live.
contribdir = os.path.dirname(__file__)
Pierre-Yves David
revsetbenchmark: use optparse to retrieve argument...
r21287
Pierre-Yves David
revsetbenchmark: simplify and convert the script to python...
r20848 revsetsfile = sys.stdin
Pierre-Yves David
revsetbenchmark: use optparse to retrieve argument...
r21287 if options.file:
revsetsfile = open(options.file)
Pierre-Yves David
revsetbenchmark: simplify and convert the script to python...
r20848
Pierre-Yves David
revsetbenchmark: allow comments ('#' prefix) in the revset input
r22556 revsets = [l.strip() for l in revsetsfile if not l.startswith('#')]
Pierre-Yves David
revsetbenchmarks: ignore empty lines...
r25642 revsets = [l for l in revsets if l]
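
The two filtering passes can be tried on an in-memory file, as sketched below with made-up input. Note that the `#` test runs before stripping, so only lines whose first character is `#` are dropped; an indented comment line would be kept.

```python
import io

# Made-up revset file content: comments, a blank line, padded entries.
text = u"# fast ones\nall()\n\n  draft()  \n#slow ones\n  # indented\n"
revsetsfile = io.StringIO(text)

# Same two passes as the script: drop comment lines, then empty lines.
revsets = [l.strip() for l in revsetsfile if not l.startswith('#')]
revsets = [l for l in revsets if l]
```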
Pierre-Yves David
revsetbenchmark: simplify and convert the script to python...
r20848
print "Revsets to benchmark"
print "----------------------------"
for idx, rset in enumerate(revsets):
print "%i) %s" % (idx, rset)
print "----------------------------"
print
Pierre-Yves David
revsetbenchmarks: fix argument parsing...
r25535 revs = []
for a in args:
revs.extend(getrevs(a))
Pierre-Yves David
revsetbenchmark: simplify and convert the script to python...
r20848
Pierre-Yves David
revsetbenchmarks: allow running multiple variants per revset...
r25540 variants = options.variants.split(',')
Pierre-Yves David
revsetbenchmark: add a summary at the end of execution...
r20855 results = []
Pierre-Yves David
revsetbenchmark: simplify and convert the script to python...
r20848 for r in revs:
print "----------------------------"
Pierre-Yves David
revsetbenchmark: convert revision display to proper subprocesscall
r20852 printrevision(r)
Pierre-Yves David
revsetbenchmark: simplify and convert the script to python...
r20848 print "----------------------------"
Pierre-Yves David
revsetbenchmark: convert update to proper subprocess call
r20850 update(r)
Pierre-Yves David
revsetbenchmark: add a summary at the end of execution...
r20855 res = []
results.append(res)
Pierre-Yves David
revsetbenchmarks: allow running multiple variants per revset...
r25540 printheader(variants, len(revsets), verbose=options.verbose)
Pierre-Yves David
revsetbenchmark: simplify and convert the script to python...
r20848 for idx, rset in enumerate(revsets):
Pierre-Yves David
revsetbenchmarks: allow running multiple variants per revset...
r25540 varres = {}
for var in variants:
varrset = applyvariants(rset, var)
Gregory Szorc
revsetbenchmarks: support benchmarking changectx loading...
r27073 data = perf(varrset, target=options.repo, contexts=options.contexts)
Pierre-Yves David
revsetbenchmarks: allow running multiple variants per revset...
r25540 varres[var] = data
res.append(varres)
printresult(variants, idx, varres, len(revsets),
verbose=options.verbose)
Pierre-Yves David
revsetbenchmark: add a summary at the end of execution...
r20855 sys.stdout.flush()
Pierre-Yves David
revsetbenchmark: simplify and convert the script to python...
r20848 print "----------------------------"
Pierre-Yves David
revsetbenchmark: add a summary at the end of execution...
r20855
print """
Result by revset
================
"""
Pierre-Yves David
revsetbenchmarks: improve revision printing...
r25538 print 'Revision:'
Pierre-Yves David
revsetbenchmark: add a summary at the end of execution...
r20855 for idx, rev in enumerate(revs):
sys.stdout.write('%i) ' % idx)
sys.stdout.flush()
printrevision(rev)
print
print
for ridx, rset in enumerate(revsets):
print "revset #%i: %s" % (ridx, rset)
Pierre-Yves David
revsetbenchmarks: allow running multiple variants per revset...
r25540 printheader(variants, len(results), verbose=options.verbose, relative=True)
Pierre-Yves David
revsetbenchmarks: display relative change when meaningful...
r25539 ref = None
Pierre-Yves David
revsetbenchmark: add a summary at the end of execution...
r20855 for idx, data in enumerate(results):
Pierre-Yves David
revsetbenchmarks: allow running multiple variants per revset...
r25540 printresult(variants, idx, data[ridx], len(results),
verbose=options.verbose, reference=ref)
Pierre-Yves David
revsetbenchmarks: display relative change when meaningful...
r25539 ref = data[ridx]
Pierre-Yves David
revsetbenchmark: add a summary at the end of execution...
r20855 print