bundle: when verbose, show what takes up the space in the generated bundle

This is similar to the debugbundle command, but it gives a summary of the actual uncompressed number of bytes when creating the bundle. The numbers are only as useful as the bundle format is efficient. Hopefully bundle2 will make them a better indicator of actual entropy.

This is useful when accepting pull requests, to assess whether the repo size increase seems reasonable for the diff before pushing upstream. It has helped me catch large files that should have been committed as largefiles but were committed as regular files in intermediate changesets.

This output doesn't combine well with debug output, so it is only enabled when verbose without debug.

File last commit:

r23602:a4679a74 merge default
r23748:4ab66de4 default
manifest.py
287 lines | 10.0 KiB | text/x-python | PythonLexer
mpm@selenic.com
Break apart hg.py...
r1089 # manifest.py - manifest revision class for mercurial
#
Thomas Arendsen Hein
Updated copyright notices and add "and others" to "hg version"
r4635 # Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
mpm@selenic.com
Break apart hg.py...
r1089 #
Martin Geisler
updated license to be explicit about GPL version 2
r8225 # This software may be used and distributed according to the terms of the
Matt Mackall
Update license to GPLv2+
r10263 # GNU General Public License version 2 or any later version.
mpm@selenic.com
Break apart hg.py...
r1089
Matt Mackall
Simplify i18n imports
r3891 from i18n import _
Martin von Zweigbergk
manifest: for diff(), only iterate over files, not flags...
r22965 import mdiff, parsers, error, revlog, util
Simon Heimberg
separate import lines from mercurial and general python modules
r8312 import array, struct
mpm@selenic.com
Break apart hg.py...
r1089
Matt Mackall
Combine manifest dict and flags dict into a single object...
r2835 class manifestdict(dict):
Alexis S. L. Carvalho
Fix some bugs introduced during the manifest refactoring
r2857 def __init__(self, mapping=None, flags=None):
Matt Mackall
many, many trivial check-code fixups
r10282 if mapping is None:
            mapping = {}
        if flags is None:
            flags = {}
Matt Mackall
Add manifestflags class
r2831 dict.__init__(self, mapping)
Matt Mackall
Switch to simpler manifestdict
r2839 self._flags = flags
Augie Fackler
manifest: disallow setting the node id of an entry to None...
r23594 def __setitem__(self, k, v):
        assert v is not None
        dict.__setitem__(self, k, v)
Matt Mackall
manifestflags: eliminate remaining users of direct dict access
r2834 def flags(self, f):
Matt Mackall
Switch to simpler manifestdict
r2839 return self._flags.get(f, "")
Jesse Glick
localrepo: optimize internode status calls using withflags...
r16646 def withflags(self):
return set(self._flags.keys())
Augie Fackler
manifest: rename ambiguously-named set to setflag...
r22942 def setflag(self, f, flags):
"""Set the flags (symlink, executable) for path f."""
Matt Mackall
simplify flag handling...
r6743 self._flags[f] = flags
Matt Mackall
Add manifestflags class
r2831 def copy(self):
Benoit Boissinot
manifestdict: remove unnecessary dictionary copy...
r9416 return manifestdict(self, dict.copy(self._flags))
Siddharth Agarwal
manifestdict: add a new method to intersect with a set of files...
r21879 def intersectfiles(self, files):
        '''make a new manifestdict with the intersection of self with files
        The algorithm assumes that files is much smaller than self.'''
        ret = manifestdict()
        for fn in files:
            if fn in self:
                ret[fn] = self[fn]
                flags = self._flags.get(fn, None)
                if flags:
                    ret._flags[fn] = flags
        return ret
Martin von Zweigbergk
manifest: repurpose flagsdiff() into (node-and-flag)diff()...
r22964
Martin von Zweigbergk
manifest: add matches() method...
r23305 def matches(self, match):
        '''generate a new manifest filtered by the match argument'''
        if match.always():
            return self.copy()
        files = match.files()
        if (match.matchfn == match.exact or
            (not match.anypats() and util.all(fn in self for fn in files))):
            return self.intersectfiles(files)
        mf = self.copy()
        for fn in mf.keys():
            if not match(fn):
                del mf[fn]
        return mf
Martin von Zweigbergk
manifest: repurpose flagsdiff() into (node-and-flag)diff()...
r22964 def diff(self, m2):
'''Finds changes between the current manifest and m2. The result is
returned as a dict with filename as key and values of the form
Martin von Zweigbergk
manifest: transpose pair of pairs from diff()...
r22966 ((n1,fl1),(n2,fl2)), where n1/n2 is the nodeid in the current/other
Martin von Zweigbergk
manifest: for diff(), only iterate over files, not flags...
r22965 manifest and fl1/fl2 is the flag in the current/other manifest. Where
        the file does not exist, the nodeid will be None and the flags will be
        the empty string.'''
        diff = {}
        for fn, n1 in self.iteritems():
            fl1 = self._flags.get(fn, '')
            n2 = m2.get(fn, None)
            fl2 = m2._flags.get(fn, '')
            if n2 is None:
                fl2 = ''
            if n1 != n2 or fl1 != fl2:
Martin von Zweigbergk
manifest: transpose pair of pairs from diff()...
r22966 diff[fn] = ((n1, fl1), (n2, fl2))
Martin von Zweigbergk
manifest: for diff(), only iterate over files, not flags...
r22965
        for fn, n2 in m2.iteritems():
            if fn not in self:
                fl2 = m2._flags.get(fn, '')
Martin von Zweigbergk
manifest: transpose pair of pairs from diff()...
r22966 diff[fn] = ((None, ''), (n2, fl2))
Martin von Zweigbergk
manifest: for diff(), only iterate over files, not flags...
r22965
return diff
Matt Mackall
Add manifestflags class
r2831
Augie Fackler
manifest: move manifestdict-to-text encoding to manifest class...
r22929 def text(self):
Augie Fackler
manifest: add docstring to text() method
r22943 """Get the full data of this manifest as a bytestring."""
Augie Fackler
manifest: move manifestdict-to-text encoding to manifest class...
r22929 fl = sorted(self)
        _checkforbidden(fl)
        hex, flags = revlog.hex, self.flags
        # if this is changed to support newlines in filenames,
        # be sure to check the templates/ dir again (especially *-raw.tmpl)
        return ''.join("%s\0%s%s\n" % (f, hex(self[f]), flags(f)) for f in fl)
Augie Fackler
manifest: move checkforbidden to module-level...
r22408
Augie Fackler
manifest: add fastdelta method to manifestdict...
r22931 def fastdelta(self, base, changes):
"""Given a base manifest text as an array.array and a list of changes
        relative to that text, compute a delta that can be used by revlog.
        """
        delta = []
        dstart = None
        dend = None
        dline = [""]
        start = 0
        # zero copy representation of base as a buffer
        addbuf = util.buffer(base)
        # start with a readonly loop that finds the offset of
        # each line and creates the deltas
        for f, todelete in changes:
            # bs will either be the index of the item or the insert point
            start, end = _msearch(addbuf, f, start)
            if not todelete:
                l = "%s\0%s%s\n" % (f, revlog.hex(self[f]), self.flags(f))
            else:
                if start == end:
                    # item we want to delete was not found, error out
                    raise AssertionError(
                        _("failed to remove %s from manifest") % f)
                l = ""
            if dstart is not None and dstart <= start and dend >= start:
                if dend < end:
                    dend = end
                if l:
                    dline.append(l)
            else:
                if dstart is not None:
                    delta.append([dstart, dend, "".join(dline)])
                dstart = start
                dend = end
                dline = [l]
        if dstart is not None:
            delta.append([dstart, dend, "".join(dline)])
        # apply the delta to the base, and get a delta for addrevision
        deltatext, arraytext = _addlistdelta(base, delta)
        return arraytext, deltatext
Augie Fackler
manifest: move _search to module level and rename to _msearch...
r22930 def _msearch(m, s, lo=0, hi=None):
    '''return a tuple (start, end) that says where to find s within m.
    If the string is found m[start:end] are the line containing
    that string. If start == end the string was not found and
    they indicate the proper sorted insertion point.
    m should be a buffer or a string
    s is a string'''
    def advance(i, c):
        while i < lenm and m[i] != c:
            i += 1
        return i
    if not s:
        return (lo, lo)
    lenm = len(m)
    if not hi:
        hi = lenm
    while lo < hi:
        mid = (lo + hi) // 2
        start = mid
        while start > 0 and m[start - 1] != '\n':
            start -= 1
        end = advance(start, '\0')
        if m[start:end] < s:
            # we know that after the null there are 40 bytes of sha1
            # this translates to the bisect lo = mid + 1
            lo = advance(end + 40, '\n') + 1
        else:
            # this translates to the bisect hi = mid
            hi = start
    end = advance(lo, '\0')
    found = m[lo:end]
    if s == found:
        # we know that after the null there are 40 bytes of sha1
        end = advance(end + 40, '\n')
        return (lo, end + 1)
    else:
        return (lo, lo)
Augie Fackler
manifest: mark addlistdelta and checkforbidden as module-private
r22415 def _checkforbidden(l):
Augie Fackler
manifest: move checkforbidden to module-level...
r22408 """Check filenames for illegal characters."""
    for f in l:
        if '\n' in f or '\r' in f:
            raise error.RevlogError(
                _("'\\n' and '\\r' disallowed in filenames: %r") % f)
Augie Fackler
manifest: move addlistdelta to module-level...
r22409 # apply the changes collected during the bisect loop to our addlist
# return a delta suitable for addrevision
Augie Fackler
manifest: mark addlistdelta and checkforbidden as module-private
r22415 def _addlistdelta(addlist, x):
Augie Fackler
manifest: move addlistdelta to module-level...
r22409 # for large addlist arrays, building a new array is cheaper
    # than repeatedly modifying the existing one
    currentposition = 0
    newaddlist = array.array('c')
    for start, end, content in x:
        newaddlist += addlist[currentposition:start]
        if content:
            newaddlist += array.array('c', content)
        currentposition = end
    newaddlist += addlist[currentposition:]
    deltatext = "".join(struct.pack(">lll", start, end, len(content))
                        + content for start, end, content in x)
    return deltatext, newaddlist
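The delta that `_addlistdelta` returns can be read as a sequence of hunks, each a 12-byte big-endian header (start offset, end offset, length of the replacement) followed by the replacement bytes. The following is a minimal standalone sketch of that idea, not the code from this file: it is written for Python 3 with `bytes` instead of `array.array('c')`, and the name `apply_hunks` and the sample data are hypothetical.

```python
import struct


def apply_hunks(base, hunks):
    """Apply sorted, non-overlapping (start, end, content) hunks to base.

    Returns (delta, new_text), mirroring the return order of the
    _addlistdelta function shown above. Hypothetical helper for illustration.
    """
    pieces = []
    position = 0
    for start, end, content in hunks:
        # Copy the untouched region before this hunk, then the replacement.
        pieces.append(base[position:start])
        pieces.append(content)
        position = end
    # Copy whatever follows the last hunk.
    pieces.append(base[position:])

    # Each hunk is serialized as three big-endian 32-bit integers
    # (start, end, replacement length) followed by the replacement bytes.
    delta = b"".join(
        struct.pack(">lll", start, end, len(content)) + content
        for start, end, content in hunks
    )
    return delta, b"".join(pieces)


# Sample data (made up): two manifest-style lines, each 43 bytes long.
line_a = b"a\x00" + b"1" * 40 + b"\n"
line_b = b"b\x00" + b"2" * 40 + b"\n"
line_c = b"c\x00" + b"3" * 40 + b"\n"
base = line_a + line_c

# Insert line_b between the two existing lines (start == end means insert).
delta, new_text = apply_hunks(base, [(len(line_a), len(line_a), line_b)])
header = struct.unpack(">lll", delta[:12])
```

Building the result from slices in one pass follows the same reasoning as the comment in `_addlistdelta`: for large inputs, assembling a new buffer is cheaper than repeatedly modifying the existing one.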
Augie Fackler
manifest: move manifest parsing to module-level...
r22786 def _parse(lines):
    mfdict = manifestdict()
    parsers.parse_manifest(mfdict, mfdict._flags, lines)
    return mfdict
Augie Fackler
manifest: move addlistdelta to module-level...
r22409
Matt Mackall
revlog: kill from-style imports...
r7634 class manifest(revlog.revlog):
Matt Mackall
revlog: simplify revlog version handling...
r4258 def __init__(self, opener):
Durham Goode
manifest: increase lrucache from 3 to 4...
r20075 # we expect to deal with not more than four revs at a time,
# during a commit --amend
self._mancache = util.lrucachedict(4)
Matt Mackall
revlog: kill from-style imports...
r7634 revlog.revlog.__init__(self, opener, "00manifest.i")
mpm@selenic.com
Break apart hg.py...
r1089
Brendan Cully
Abstract manifest block parsing.
r3196 def readdelta(self, node):
Matt Mackall
revlog: remove delta function
r7362 r = self.rev(node)
Augie Fackler
manifest: move manifest parsing to module-level...
r22786 return _parse(mdiff.patchtext(self.revdiff(self.deltaparent(r), r)))
Thomas Arendsen Hein
Whitespace/Tab cleanup
r3223
Matt Mackall
manifest: add readfast method
r13711 def readfast(self, node):
        '''use the faster of readdelta or read'''
        r = self.rev(node)
Sune Foldager
revlog: compute correct deltaparent in the deltaparent function...
r14208 deltaparent = self.deltaparent(r)
if deltaparent != revlog.nullrev and deltaparent in self.parentrevs(r):
Matt Mackall
manifest: add readfast method
r13711 return self.readdelta(node)
return self.read(node)
mpm@selenic.com
Break apart hg.py...
r1089 def read(self, node):
Matt Mackall
revlog: kill from-style imports...
r7634 if node == revlog.nullid:
return manifestdict() # don't upset local cache
Siddharth Agarwal
manifest: use a size 3 LRU cache to store parsed manifests...
r18604 if node in self._mancache:
return self._mancache[node][0]
mpm@selenic.com
Break apart hg.py...
r1089 text = self.revision(node)
Benoit Boissinot
manifest: simplify cache handling, use a unique cache
r9414 arraytext = array.array('c', text)
Augie Fackler
manifest: move manifest parsing to module-level...
r22786 mapping = _parse(text)
Siddharth Agarwal
manifest: use a size 3 LRU cache to store parsed manifests...
r18604 self._mancache[node] = (mapping, arraytext)
Matt Mackall
Combine manifest dict and flags dict into a single object...
r2835 return mapping
mpm@selenic.com
Break apart hg.py...
r1089
Vadim Gelfer
fix parsing of tags. make parse errors useful. add new tag tests....
r2320 def find(self, node, f):
'''look up entry for a single file efficiently.
Alexis S. L. Carvalho
fix manifest.find
r4159 return (node, flags) pair if found, (None, None) if not.'''
Siddharth Agarwal
manifest: use a size 3 LRU cache to store parsed manifests...
r18604 if node in self._mancache:
            mapping = self._mancache[node][0]
            return mapping.get(f), mapping.flags(f)
Vadim Gelfer
fix parsing of tags. make parse errors useful. add new tag tests....
r2320 text = self.revision(node)
Augie Fackler
manifest: move _search to module level and rename to _msearch...
r22930 start, end = _msearch(text, f)
Vadim Gelfer
fix parsing of tags. make parse errors useful. add new tag tests....
r2320 if start == end:
            return None, None
        l = text[start:end]
        f, n = l.split('\0')
Matt Mackall
revlog: kill from-style imports...
r7634 return revlog.bin(n[:40]), n[40:-1]
Vadim Gelfer
fix parsing of tags. make parse errors useful. add new tag tests....
r2320
Augie Fackler
manifest: simplify manifest.add() by making args required...
r22787 def add(self, map, transaction, link, p1, p2, added, removed):
Augie Fackler
manifest: rearrange add() method and add comments for clarity...
r22788 if p1 in self._mancache:
            # If our first parent is in the manifest cache, we can
            # compute a delta here using properties we know about the
            # manifest up-front, which may save time later for the
            # revlog layer.
mpm@selenic.com
Break apart hg.py...
r1089
Augie Fackler
manifest: mark addlistdelta and checkforbidden as module-private
r22415 _checkforbidden(added)
mpm@selenic.com
Break apart hg.py...
r1089 # combine the changed lists into one list for sorting
Benoit Boissinot
manifest.add(): cleanup worklist construction and iteration
r9415 work = [(x, False) for x in added]
work.extend((x, True) for x in removed)
Mads Kiilerich
avoid using abbreviations that look like spelling errors
r17428 # this could use heapq.merge() (from Python 2.6+) or equivalent
Benoit Boissinot
manifest.add(): cleanup worklist construction and iteration
r9415 # since the lists are already sorted
mpm@selenic.com
Break apart hg.py...
r1089 work.sort()
Augie Fackler
manifest: add fastdelta method to manifestdict...
r22931 arraytext, deltatext = map.fastdelta(self._mancache[p1][1], work)
cachedelta = self.rev(p1), deltatext
Matt Mackall
util: don't mess with builtins to emulate buffer()
r15657 text = util.buffer(arraytext)
Augie Fackler
manifest: rearrange add() method and add comments for clarity...
r22788 else:
            # The first parent manifest isn't already loaded, so we'll
            # just encode a fulltext of the manifest and pass that
            # through to the revlog layer, and let it handle the delta
            # process.
Augie Fackler
manifest: move manifestdict-to-text encoding to manifest class...
r22929 text = map.text()
Augie Fackler
manifest: rearrange add() method and add comments for clarity...
r22788 arraytext = array.array('c', text)
cachedelta = None
mason@suse.com
Optimize manifest.add...
r1534
Benoit Boissinot
manifest/revlog: do not let the revlog cache mutable objects...
r9420 n = self.addrevision(text, transaction, link, p1, p2, cachedelta)
Siddharth Agarwal
manifest: use a size 3 LRU cache to store parsed manifests...
r18604 self._mancache[n] = (map, arraytext)
mpm@selenic.com
Break apart hg.py...
r1089
return n
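For readers following the listing, the on-disk manifest text produced by `text()` and parsed in `find()` is one line per file: the path, a NUL byte, 40 hex digits of the node id, optional flags, and a newline, with lines sorted by path. The sketch below illustrates that layout in standalone Python 3. It is an assumption-laden illustration rather than Mercurial's implementation: the names `encode_manifest` and `find_entry` are hypothetical, and the lookup is a plain linear scan instead of the binary search done by `_msearch`.

```python
from binascii import hexlify, unhexlify


def encode_manifest(entries):
    """Encode {path: (20-byte node, flags)} as manifest-style text.

    Hypothetical helper: one "path\\0<40 hex digits><flags>\\n" line per
    file, sorted by path, following the format string used in text().
    """
    return b"".join(
        path + b"\0" + hexlify(node) + flags + b"\n"
        for path, (node, flags) in sorted(entries.items())
    )


def find_entry(text, path):
    """Return (node, flags) for path, or (None, None) if it is absent.

    Linear scan for clarity; the listing above uses a binary search.
    """
    for line in text.splitlines():
        name, rest = line.split(b"\0")
        if name == path:
            # The first 40 bytes after the NUL are the hex node id,
            # anything after that is the flags string.
            return unhexlify(rest[:40]), rest[40:]
    return None, None


# Sample data (made up): one executable file and one regular file.
entries = {
    b"foo": (b"\x11" * 20, b"x"),
    b"bar": (b"\x22" * 20, b""),
}
text = encode_manifest(entries)
```

Because the lines are sorted and the node id always occupies exactly 40 bytes after the NUL, a real lookup can bisect on byte offsets without parsing the whole text, which is what `_msearch` relies on.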