parsers: inline fields of dirstate values in C version

Previously, while unpacking the dirstate we'd create 3-4 new CPython objects
for most dirstate values:

- the state is a single character string, which is pooled by CPython
- the mode is a new object if it isn't 0 due to being in the lookup set
- the size is a new object if it is greater than 255
- the mtime is a new object if it isn't -1 due to being in the lookup set
- the tuple to contain them all

In some cases such as regular hg status, we actually look at all the objects.
In other cases like hg add, hg status for a subdirectory, or hg status with the
third-party hgwatchman enabled, we look at almost none of the objects.

This patch eliminates most object creation in these cases by defining a custom
C struct that is exposed to Python with an interface similar to a tuple. Only
when tuple elements are actually requested are the respective objects created.

The gains, where they're expected, are significant. The following tests are run
against a working copy with over 270,000 files.

parse_dirstate becomes significantly faster:

$ hg perfdirstate
before: wall 0.186437 comb 0.180000 user 0.160000 sys 0.020000 (best of 35)
after:  wall 0.093158 comb 0.100000 user 0.090000 sys 0.010000 (best of 95)

and as a result, several commands benefit:

$ time hg status  # with hgwatchman enabled
before: 0.42s user 0.14s system 99% cpu 0.563 total
after:  0.34s user 0.12s system 99% cpu 0.471 total

$ time hg add new-file
before: 0.85s user 0.18s system 99% cpu 1.033 total
after:  0.76s user 0.17s system 99% cpu 0.931 total

There is a slight regression in regular status performance, but this is fixed
in an upcoming patch.
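The lazy, tuple-like access described in the commit message can be sketched in Python. This is only an illustration of the idea, not the C struct from the patch: the class name `LazyEntry`, the 13-byte packed layout (state, mode, size, mtime), and the sample values are all assumptions made for this example.

```python
import struct

# Assumed big-endian layout for one packed entry: a 1-byte state followed by
# three signed 32-bit fields (mode, size, mtime). Hypothetical, for illustration.
_CHAR = struct.Struct(">c")
_LONG = struct.Struct(">l")
_LAYOUT = ((_CHAR, 0), (_LONG, 1), (_LONG, 5), (_LONG, 9))


class LazyEntry(object):
    """Tuple-like view over packed bytes; fields are decoded only on access."""

    __slots__ = ("_buf", "_offset")

    def __init__(self, buf, offset=0):
        # Keep a reference to the raw bytes; no per-field objects yet.
        self._buf = buf
        self._offset = offset

    def __len__(self):
        return len(_LAYOUT)

    def __getitem__(self, index):
        # Integer indexes only. An out-of-range index raises IndexError,
        # which also lets iteration over the entry terminate.
        fmt, rel = _LAYOUT[index]
        return fmt.unpack_from(self._buf, self._offset + rel)[0]


# Hypothetical sample entry: state "n", mode 0o100644, size 1234, mtime -1.
raw = struct.pack(">clll", b"n", 0o100644, 1234, -1)
entry = LazyEntry(raw)
size = entry[2]          # decodes only the size field
as_tuple = tuple(entry)  # decodes all four fields on demand
```

A caller that only needs one field pays for one decoded object, while code that expects a 4-tuple can still index or iterate the entry.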

File last commit:

r18894:ed46c2b9 default
r21809:e250b830 default
dicthelpers.py
55 lines | 1.6 KiB | text/x-python | PythonLexer
Siddharth Agarwal
mercurial: implement diff and join for dicts...
# dicthelpers.py - helper routines for Python dicts
#
# Copyright 2013 Facebook
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

def diff(d1, d2, default=None):
    '''Return all key-value pairs that are different between d1 and d2.

    This includes keys that are present in one dict but not the other, and
    keys whose values are different. The return value is a dict with values
    being pairs of values from d1 and d2 respectively, and missing values
    treated as default, so if a value is missing from one dict and the same as
    default in the other, it will not be returned.'''
    res = {}
    if d1 is d2:
        # same dict, so diff is empty
        return res

    for k1, v1 in d1.iteritems():
        v2 = d2.get(k1, default)
        if v1 != v2:
            res[k1] = (v1, v2)

    for k2 in d2:
        if k2 not in d1:
            v2 = d2[k2]
            if v2 != default:
                res[k2] = (default, v2)

    return res

def join(d1, d2, default=None):
    '''Return all key-value pairs from both d1 and d2.

    This is akin to an outer join in relational algebra. The return value is a
    dict with values being pairs of values from d1 and d2 respectively, and
    missing values represented as default.'''
    res = {}

    for k1, v1 in d1.iteritems():
        if k1 in d2:
            res[k1] = (v1, d2[k1])
        else:
            res[k1] = (v1, default)

    if d1 is d2:
        return res

    for k2 in d2:
        if k2 not in d1:
            res[k2] = (default, d2[k2])

    return res
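A short usage sketch may help show how the two helpers differ. To keep it runnable on its own under Python 3, it includes ports of `diff` and `join` that use `items()` in place of the Python 2 `iteritems()` seen in the listing; the sample dicts `old` and `new` are hypothetical.

```python
def diff(d1, d2, default=None):
    """Python 3 port of diff(): only keys whose values differ are returned."""
    res = {}
    if d1 is d2:
        return res
    for k1, v1 in d1.items():
        v2 = d2.get(k1, default)
        if v1 != v2:
            res[k1] = (v1, v2)
    for k2 in d2:
        if k2 not in d1:
            v2 = d2[k2]
            if v2 != default:
                res[k2] = (default, v2)
    return res


def join(d1, d2, default=None):
    """Python 3 port of join(): every key from either dict is returned."""
    res = {}
    for k1, v1 in d1.items():
        if k1 in d2:
            res[k1] = (v1, d2[k1])
        else:
            res[k1] = (v1, default)
    if d1 is d2:
        return res
    for k2 in d2:
        if k2 not in d1:
            res[k2] = (default, d2[k2])
    return res


# Hypothetical sample data.
old = {"a": 1, "b": 2, "c": None}
new = {"b": 3, "d": 4, "e": None}

changed = diff(old, new)
merged = join(old, new)

print(changed)  # {'a': (1, None), 'b': (2, 3), 'd': (None, 4)}
print(merged)
```

In this sample, `diff` leaves out `"c"` and `"e"` because a missing value is treated as `default` (here `None`), so a key that is absent on one side and equal to `default` on the other does not count as a difference. `join` keeps both keys, pairing each with `None` for the missing side.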