parsers: inline fields of dirstate values in C version

Previously, while unpacking the dirstate we'd create 3-4 new CPython objects
for most dirstate values:
- the state is a single character string, which is pooled by CPython
- the mode is a new object if it isn't 0 due to being in the lookup set
- the size is a new object if it is greater than 255
- the mtime is a new object if it isn't -1 due to being in the lookup set
- the tuple to contain them all

In some cases such as regular hg status, we actually look at all the objects.
In other cases like hg add, hg status for a subdirectory, or hg status with the
third-party hgwatchman enabled, we look at almost none of the objects.

This patch eliminates most object creation in these cases by defining a custom
C struct that is exposed to Python with an interface similar to a tuple. Only
when tuple elements are actually requested are the respective objects created.

The gains, where they're expected, are significant. The following tests are run
against a working copy with over 270,000 files.

parse_dirstate becomes significantly faster:

$ hg perfdirstate
before: wall 0.186437 comb 0.180000 user 0.160000 sys 0.020000 (best of 35)
after:  wall 0.093158 comb 0.100000 user 0.090000 sys 0.010000 (best of 95)

and as a result, several commands benefit:

$ time hg status  # with hgwatchman enabled
before: 0.42s user 0.14s system 99% cpu 0.563 total
after:  0.34s user 0.12s system 99% cpu 0.471 total

$ time hg add new-file
before: 0.85s user 0.18s system 99% cpu 1.033 total
after:  0.76s user 0.17s system 99% cpu 0.931 total

There is a slight regression in regular status performance, but this is fixed
in an upcoming patch.
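The idea in the commit message, creating field objects only when a tuple element is requested, can be sketched in Python. This is an illustration only, not the C struct from the patch: the `LazyEntry` class, the `_FIELDS` table, and the 13-byte record layout (1-byte state followed by three big-endian 32-bit integers) are all assumptions made for the example.

```python
import struct

# Hypothetical record layout, assumed for this sketch:
# (struct format, byte offset) for state, mode, size, mtime.
_FIELDS = (
    (">c", 0),  # state: single character
    (">l", 1),  # mode
    (">l", 5),  # size
    (">l", 9),  # mtime
)


class LazyEntry(object):
    """Tuple-like view over one packed record; fields are decoded on access."""

    __slots__ = ("_raw",)

    def __init__(self, raw):
        # Keep only the packed bytes; no per-field objects are created here.
        self._raw = raw

    def __len__(self):
        return len(_FIELDS)

    def __getitem__(self, index):
        # Decode just the requested field. An out-of-range index raises
        # IndexError, which is also what ends iteration and unpacking.
        fmt, offset = _FIELDS[index]
        return struct.unpack_from(fmt, self._raw, offset)[0]


raw = struct.pack(">clll", b"n", 0o100644, 1234, -1)
entry = LazyEntry(raw)

size = entry[2]                     # decodes only the size field
state, mode, size, mtime = entry    # tuple-style unpacking still works
```

A caller that never indexes the entry pays only for storing the raw bytes, which mirrors the case the commit message describes for hg add or a subdirectory status.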

File last commit:

r18894:ed46c2b9 default
r21809:e250b830 default
dicthelpers.py
55 lines | 1.6 KiB | text/x-python | PythonLexer
# dicthelpers.py - helper routines for Python dicts
#
# Copyright 2013 Facebook
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

def diff(d1, d2, default=None):
    '''Return all key-value pairs that are different between d1 and d2.

    This includes keys that are present in one dict but not the other, and
    keys whose values are different. The return value is a dict with values
    being pairs of values from d1 and d2 respectively, and missing values
    treated as default, so if a value is missing from one dict and the same as
    default in the other, it will not be returned.'''
    res = {}
    if d1 is d2:
        # same dict, so diff is empty
        return res

    for k1, v1 in d1.iteritems():
        v2 = d2.get(k1, default)
        if v1 != v2:
            res[k1] = (v1, v2)

    for k2 in d2:
        if k2 not in d1:
            v2 = d2[k2]
            if v2 != default:
                res[k2] = (default, v2)

    return res

def join(d1, d2, default=None):
    '''Return all key-value pairs from both d1 and d2.

    This is akin to an outer join in relational algebra. The return value is a
    dict with values being pairs of values from d1 and d2 respectively, and
    missing values represented as default.'''
    res = {}

    for k1, v1 in d1.iteritems():
        if k1 in d2:
            res[k1] = (v1, d2[k1])
        else:
            res[k1] = (v1, default)

    if d1 is d2:
        return res

    for k2 in d2:
        if k2 not in d1:
            res[k2] = (default, d2[k2])

    return res
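
A small worked example may help make the two docstrings concrete. The module above targets Python 2 (`iteritems`), so the sketch below restates both helpers compactly for Python 3. The set-based formulation and the sample dicts are assumptions made for illustration, not the module's own code, although they are intended to return the same results.

```python
def diff(d1, d2, default=None):
    # Restatement for illustration: keep keys whose values differ once
    # missing entries are replaced by `default`.
    pairs = ((k, (d1.get(k, default), d2.get(k, default)))
             for k in set(d1) | set(d2))
    return {k: (v1, v2) for k, (v1, v2) in pairs if v1 != v2}


def join(d1, d2, default=None):
    # Restatement for illustration: outer join over the union of keys.
    return {k: (d1.get(k, default), d2.get(k, default))
            for k in set(d1) | set(d2)}


old = {'a': 1, 'b': 2, 'c': None}
new = {'b': 3, 'd': 4, 'e': None}

# 'c' and 'e' are missing on one side and equal to the default (None) on
# the other, so diff() leaves them out.
assert diff(old, new) == {'a': (1, None), 'b': (2, 3), 'd': (None, 4)}

# join() keeps every key from either dict.
assert join(old, new) == {
    'a': (1, None),
    'b': (2, 3),
    'c': (None, None),
    'd': (None, 4),
    'e': (None, None),
}

# A different default changes which entries count as "missing".
assert diff({'x': 0}, {}, default=0) == {}
```

The last assertion shows why `default` matters: with `default=0`, a key that is absent from one dict and `0` in the other is not reported as a difference.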