##// END OF EJS Templates
obsolete: order of magnitude speedup in _computebumpedset...
obsolete: order of magnitude speedup in _computebumpedset Reminder: a changeset is said "bumped" if it tries to obsolete a immutable changeset. The previous algorithm for computing bumped changeset was: 1) Get all public changesets 2) Find all they successors 3) Search for stuff that are eligible for being "bumped" (mutable and non obsolete) The entry size of this algorithm is `O(len(public))` which is mostly the same as `O(len(repo))`. Even this this approach mean fewer obsolescence marker are traveled, this is not very scalable. The new algorithm is: 1) For each potential bumped changesets (non obsolete mutable) 2) iterate over precursors 3) if a precursors is public. changeset is bumped We travel more obsolescence marker, but the entry size is much smaller since the amount of potential bumped should remains mostly stable with time `O(1)`. On some confidential gigantic repo this move bumped computation from 15.19s to 0.46s (×33 speedup…). On "smaller" repo (mercurial, cubicweb's review) no significant gain were seen. The additional traversal of obsolescence marker is probably probably counter balance the advantage of it. Other optimisation could be done in the future (eg: sharing precursors cache for divergence detection)

File last commit:

r20190:d5d25e54 default
r20207:cd62532c default
Show More
branchmap.py
285 lines | 10.7 KiB | text/x-python | PythonLexer
Pierre-Yves David
branchmap: create a mercurial.branchmap module...
r18116 # branchmap.py - logic to computes, maintain and stores branchmap for local repo
#
# Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
Pierre-Yves David
branchmap: extract write logic from localrepo
r18117
Pierre-Yves David
branchmap: extract read logic from repo
r18118 from node import bin, hex, nullid, nullrev
Pierre-Yves David
branchmap: extract write logic from localrepo
r18117 import encoding
Augie Fackler
subsettable: move from repoview to branchmap, the only place it's used...
r20032 import util
Pierre-Yves David
branchmap: extract write logic from localrepo
r18117
Pierre-Yves David
branchmap: move the cache file name into a dedicated function...
r18185 def _filename(repo):
Pierre-Yves David
branchmap: use a different file name for filtered view of repo
r18187 """name of a branchcache file for a given repo or repoview"""
Brodie Rao
branchmap: cache open/closed branch head information...
r20185 filename = "cache/branch2"
Pierre-Yves David
branchmap: use a different file name for filtered view of repo
r18187 if repo.filtername:
filename = '%s-%s' % (filename, repo.filtername)
return filename
Pierre-Yves David
branchmap: move the cache file name into a dedicated function...
r18185
Pierre-Yves David
branchmap: extract read logic from repo
r18118 def read(repo):
try:
Pierre-Yves David
branchmap: move the cache file name into a dedicated function...
r18185 f = repo.opener(_filename(repo))
Pierre-Yves David
branchmap: extract read logic from repo
r18118 lines = f.read().split('\n')
f.close()
except (IOError, OSError):
Pierre-Yves David
branchmap: read return None in case of failure...
r18212 return None
Pierre-Yves David
branchmap: extract read logic from repo
r18118
try:
Pierre-Yves David
branchmap: read and write key part related to filtered revision...
r18184 cachekey = lines.pop(0).split(" ", 2)
last, lrev = cachekey[:2]
Pierre-Yves David
branchmap: extract read logic from repo
r18118 last, lrev = bin(last), int(lrev)
Pierre-Yves David
branchmap: read and write key part related to filtered revision...
r18184 filteredhash = None
if len(cachekey) > 2:
filteredhash = bin(cachekey[2])
partial = branchcache(tipnode=last, tiprev=lrev,
filteredhash=filteredhash)
Pierre-Yves David
branchmap: move validity logic in the object itself...
r18132 if not partial.validfor(repo):
Pierre-Yves David
branchmap: extract read logic from repo
r18118 # invalidate the cache
Pierre-Yves David
branchmap: improve invalid cache message when reading...
r18166 raise ValueError('tip differs')
Pierre-Yves David
branchmap: extract read logic from repo
r18118 for l in lines:
if not l:
continue
Brodie Rao
branchmap: cache open/closed branch head information...
r20185 node, state, label = l.split(" ", 2)
if state not in 'oc':
raise ValueError('invalid branch state')
Pierre-Yves David
branchmap: extract read logic from repo
r18118 label = encoding.tolocal(label.strip())
if not node in repo:
Pierre-Yves David
branchmap: improve invalid cache message when reading...
r18166 raise ValueError('node %s does not exist' % node)
Brodie Rao
branchmap: cache open/closed branch head information...
r20185 node = bin(node)
partial.setdefault(label, []).append(node)
if state == 'c':
partial._closednodes.add(node)
Pierre-Yves David
branchmap: extract read logic from repo
r18118 except KeyboardInterrupt:
raise
except Exception, inst:
if repo.ui.debugflag:
Pierre-Yves David
branchmap: report filtername when read fails...
r18188 msg = 'invalid branchheads cache'
if repo.filtername is not None:
msg += ' (%s)' % repo.filtername
msg += ': %s\n'
repo.ui.warn(msg % inst)
Pierre-Yves David
branchmap: read return None in case of failure...
r18212 partial = None
Pierre-Yves David
branchmap: add the tiprev (cache key) on the branchmap object...
r18126 return partial
Pierre-Yves David
branchmap: extract read logic from repo
r18118
Pierre-Yves David
branchmap: make update responsible to update the cache key...
r18130
Pierre-Yves David
branchmap: extract _updatebranchcache from repo
r18120
Augie Fackler
subsettable: move from repoview to branchmap, the only place it's used...
r20032 ### Nearest subset relation
# Nearest subset of filter X is a filter Y so that:
# * Y is included in X,
# * X - Y is as small as possible.
# This create and ordering used for branchmap purpose.
# the ordering may be partial
subsettable = {None: 'visible',
'visible': 'served',
'served': 'immutable',
'immutable': 'base'}
Pierre-Yves David
branchmap: extract updatebranchcache from repo
r18121 def updatecache(repo):
cl = repo.changelog
Pierre-Yves David
branchmap: enable caching for filtered version too...
r18189 filtername = repo.filtername
partial = repo._branchcaches.get(filtername)
Pierre-Yves David
branchmap: extract updatebranchcache from repo
r18121
Pierre-Yves David
branchmap: allow to use cache of subset...
r18234 revs = []
Pierre-Yves David
branchmap: move validity logic in the object itself...
r18132 if partial is None or not partial.validfor(repo):
Pierre-Yves David
branchmap: add the tiprev (cache key) on the branchmap object...
r18126 partial = read(repo)
Pierre-Yves David
branchmap: read return None in case of failure...
r18212 if partial is None:
Augie Fackler
subsettable: move from repoview to branchmap, the only place it's used...
r20032 subsetname = subsettable.get(filtername)
Pierre-Yves David
branchmap: allow to use cache of subset...
r18234 if subsetname is None:
partial = branchcache()
else:
subset = repo.filtered(subsetname)
partial = subset.branchmap().copy()
extrarevs = subset.changelog.filteredrevs - cl.filteredrevs
revs.extend(r for r in extrarevs if r <= partial.tiprev)
revs.extend(cl.revs(start=partial.tiprev + 1))
Pierre-Yves David
branchmap: drop `_cacheabletip` usage in `updatecache`...
r18218 if revs:
Pierre-Yves David
branchmap: pass revision insteads of changectx to the update function...
r18305 partial.update(repo, revs)
Pierre-Yves David
branchmap: make write a method on the branchmap object
r18128 partial.write(repo)
Pierre-Yves David
branchmap: display filtername when `updatebranch` fails to do its jobs...
r18451 assert partial.validfor(repo), filtername
Pierre-Yves David
branchmap: enable caching for filtered version too...
r18189 repo._branchcaches[repo.filtername] = partial
Pierre-Yves David
branchmap: store branchcache in a dedicated object...
r18124
class branchcache(dict):
Brodie Rao
branchmap: add documentation on the branchcache on-disk format
r20181 """A dict like object that hold branches heads cache.
This cache is used to avoid costly computations to determine all the
branch heads of a repo.
The cache is serialized on disk in the following format:
<tip hex node> <tip rev number> [optional filtered repo hex hash]
Brodie Rao
branchmap: cache open/closed branch head information...
r20185 <branch head hex node> <open/closed state> <branch name>
<branch head hex node> <open/closed state> <branch name>
Brodie Rao
branchmap: add documentation on the branchcache on-disk format
r20181 ...
The first line is used to check if the cache is still valid. If the
branch cache is for a filtered repo view, an optional third hash is
included that hashes the hashes of all filtered revisions.
Brodie Rao
branchmap: cache open/closed branch head information...
r20185
The open/closed state is represented by a single letter 'o' or 'c'.
This field can be used to avoid changelog reads when determining if a
branch head closes a branch or not.
Brodie Rao
branchmap: add documentation on the branchcache on-disk format
r20181 """
Pierre-Yves David
branchmap: store branchcache in a dedicated object...
r18124
Pierre-Yves David
branchmap: takes filtered revision in account for cache calculation...
r18168 def __init__(self, entries=(), tipnode=nullid, tiprev=nullrev,
Brodie Rao
branchmap: cache open/closed branch head information...
r20185 filteredhash=None, closednodes=None):
Pierre-Yves David
branchmap: add the tipnode (cache key) on the branchcache object...
r18125 super(branchcache, self).__init__(entries)
self.tipnode = tipnode
Pierre-Yves David
branchmap: add the tiprev (cache key) on the branchmap object...
r18126 self.tiprev = tiprev
Pierre-Yves David
branchmap: takes filtered revision in account for cache calculation...
r18168 self.filteredhash = filteredhash
Brodie Rao
branchmap: cache open/closed branch head information...
r20185 # closednodes is a set of nodes that close their branch. If the branch
# cache has been updated, it may contain nodes that are no longer
# heads.
if closednodes is None:
self._closednodes = set()
else:
self._closednodes = closednodes
Pierre-Yves David
branchmap: takes filtered revision in account for cache calculation...
r18168
def _hashfiltered(self, repo):
"""build hash of revision filtered in the current cache
Mads Kiilerich
spelling: fix some minor issues found by spell checker
r18644 Tracking tipnode and tiprev is not enough to ensure validity of the
Pierre-Yves David
branchmap: takes filtered revision in account for cache calculation...
r18168 cache as they do not help to distinct cache that ignored various
revision bellow tiprev.
To detect such difference, we build a cache of all ignored revisions.
"""
cl = repo.changelog
if not cl.filteredrevs:
return None
key = None
revs = sorted(r for r in cl.filteredrevs if r <= self.tiprev)
if revs:
s = util.sha1()
for rev in revs:
s.update('%s;' % rev)
key = s.digest()
return key
Pierre-Yves David
branchmap: make write a method on the branchmap object
r18128
Pierre-Yves David
branchmap: move validity logic in the object itself...
r18132 def validfor(self, repo):
Mads Kiilerich
spelling: fix some minor issues found by spell checker
r18644 """Is the cache content valid regarding a repo
Pierre-Yves David
branchmap: move validity logic in the object itself...
r18132
Mads Kiilerich
spelling: fix some minor issues found by spell checker
r18644 - False when cached tipnode is unknown or if we detect a strip.
Pierre-Yves David
branchmap: move validity logic in the object itself...
r18132 - True when cache is up to date or a subset of current repo."""
try:
Pierre-Yves David
branchmap: takes filtered revision in account for cache calculation...
r18168 return ((self.tipnode == repo.changelog.node(self.tiprev))
and (self.filteredhash == self._hashfiltered(repo)))
Pierre-Yves David
branchmap: move validity logic in the object itself...
r18132 except IndexError:
return False
Brodie Rao
branchmap: introduce branchtip() method
r20186 def _branchtip(self, heads):
tip = heads[-1]
closed = True
for h in reversed(heads):
if h not in self._closednodes:
tip = h
closed = False
break
return tip, closed
def branchtip(self, branch):
return self._branchtip(self[branch])[0]
Brodie Rao
branchmap: introduce branchheads() method
r20188 def branchheads(self, branch, closed=False):
heads = self[branch]
if not closed:
heads = [h for h in heads if h not in self._closednodes]
return heads
Brodie Rao
branchmap: introduce iterbranches() method
r20190 def iterbranches(self):
for bn, heads in self.iteritems():
yield (bn, heads) + self._branchtip(heads)
Pierre-Yves David
branchmap: add a copy method...
r18232 def copy(self):
"""return an deep copy of the branchcache object"""
Brodie Rao
branchmap: cache open/closed branch head information...
r20185 return branchcache(self, self.tipnode, self.tiprev, self.filteredhash,
self._closednodes)
Pierre-Yves David
branchmap: move validity logic in the object itself...
r18132
Pierre-Yves David
branchmap: make write a method on the branchmap object
r18128 def write(self, repo):
try:
Pierre-Yves David
branchmap: move the cache file name into a dedicated function...
r18185 f = repo.opener(_filename(repo), "w", atomictemp=True)
Pierre-Yves David
branchmap: read and write key part related to filtered revision...
r18184 cachekey = [hex(self.tipnode), str(self.tiprev)]
if self.filteredhash is not None:
cachekey.append(hex(self.filteredhash))
f.write(" ".join(cachekey) + '\n')
Mads Kiilerich
localrepo: store branchheads sorted
r18357 for label, nodes in sorted(self.iteritems()):
Pierre-Yves David
branchmap: make write a method on the branchmap object
r18128 for node in nodes:
Brodie Rao
branchmap: cache open/closed branch head information...
r20185 if node in self._closednodes:
state = 'c'
else:
state = 'o'
f.write("%s %s %s\n" % (hex(node), state,
encoding.fromlocal(label)))
Pierre-Yves David
branchmap: make write a method on the branchmap object
r18128 f.close()
Pierre-Yves David
branchmap: ignore Abort error while writing cache...
r18214 except (IOError, OSError, util.Abort):
# Abort may be raise by read only opener
Pierre-Yves David
branchmap: make write a method on the branchmap object
r18128 pass
Pierre-Yves David
branchmap: make update a method
r18131
Pierre-Yves David
branchmap: pass revision insteads of changectx to the update function...
r18305 def update(self, repo, revgen):
Pierre-Yves David
branchmap: make update a method
r18131 """Given a branchhead cache, self, that may have extra nodes or be
missing heads, and a generator of nodes that are at least a superset of
heads missing, this function updates self to be correct.
"""
cl = repo.changelog
# collect new branch entries
newbranches = {}
Brodie Rao
branchmap: cache open/closed branch head information...
r20185 getbranchinfo = cl.branchinfo
Pierre-Yves David
branchmap: Save changectx creation during update...
r18307 for r in revgen:
Brodie Rao
branchmap: cache open/closed branch head information...
r20185 branch, closesbranch = getbranchinfo(r)
node = cl.node(r)
newbranches.setdefault(branch, []).append(node)
if closesbranch:
self._closednodes.add(node)
Pierre-Yves David
branchmap: make update a method
r18131 # if older branchheads are reachable from new ones, they aren't
# really branchheads. Note checking parents is insufficient:
# 1 (branch a) -> 2 (branch b) -> 3 (branch a)
for branch, newnodes in newbranches.iteritems():
bheads = self.setdefault(branch, [])
# Remove candidate heads that no longer are in the repo (e.g., as
# the result of a strip that just happened). Avoid using 'node in
# self' here because that dives down into branchcache code somewhat
# recursively.
bheadrevs = [cl.rev(node) for node in bheads
if cl.hasnode(node)]
newheadrevs = [cl.rev(node) for node in newnodes
if cl.hasnode(node)]
ctxisnew = bheadrevs and min(newheadrevs) > max(bheadrevs)
# Remove duplicates - nodes that are in newheadrevs and are already
# in bheadrevs. This can happen if you strip a node whose parent
# was already a head (because they're on different branches).
bheadrevs = sorted(set(bheadrevs).union(newheadrevs))
# Starting from tip means fewer passes over reachable. If we know
# the new candidates are not ancestors of existing heads, we don't
# have to examine ancestors of existing heads
if ctxisnew:
iterrevs = sorted(newheadrevs)
else:
iterrevs = list(bheadrevs)
# This loop prunes out two kinds of heads - heads that are
# superseded by a head in newheadrevs, and newheadrevs that are not
# heads because an existing head is their descendant.
while iterrevs:
latest = iterrevs.pop()
if latest not in bheadrevs:
continue
ancestors = set(cl.ancestors([latest],
bheadrevs[0]))
if ancestors:
bheadrevs = [b for b in bheadrevs if b not in ancestors]
self[branch] = [cl.node(rev) for rev in bheadrevs]
tiprev = max(bheadrevs)
if tiprev > self.tiprev:
self.tipnode = cl.node(tiprev)
self.tiprev = tiprev
Pierre-Yves David
branchmap: remove the droppednodes logic...
r19838 if not self.validfor(repo):
Pierre-Yves David
branchmap: make update a method
r18131 # cache key are not valid anymore
self.tipnode = nullid
self.tiprev = nullrev
for heads in self.values():
tiprev = max(cl.rev(node) for node in heads)
if tiprev > self.tiprev:
self.tipnode = cl.node(tiprev)
self.tiprev = tiprev
Pierre-Yves David
branchmap: takes filtered revision in account for cache calculation...
r18168 self.filteredhash = self._hashfiltered(repo)