##// END OF EJS Templates
branchmap-v3: filter topo heads using node for performance reason...
branchmap-v3: filter topo heads using node for performance reason The branchmap currently contains heads as nodeid. If we build a set of revnum with the topological heads, we need to turn the nodeid in the branchmap to revnum to be able to check if they are topo-heads. That nodeid → revnum lookup is "expensive" and adds up to something noticeable if you do it hundreds of thousand of time. Instead we turn all the topo-heads revnums into nodes and build a set. So we can directly test membership of the nodeids stored in the branchmap. That is much faster. Ideally we would have revnum in the branchmap and could directly test revnum against a revnum set and that would be even faster. However that's an adventure for another time. Without this change, the branchmap format "v3" was significantly slower than the "v2" format. With this changes, some of that gap is recovered With rust + persistent nodemap, this overhead was smaller because the extra lookup did not had to to build the nodemap from scratch. In addition the mozilla-unified repository is able to use the "pure_top" mode of branchmap v3, so it was not really affected by this. Future changeset will work of the remaining of the performance gap. ### benchmark.name = hg.command.unbundle # bin-env-vars.hg.py-re2-module = default # benchmark.variants.issue6528 = disabled # benchmark.variants.resource-usage = default # benchmark.variants.reuse-external-delta-parent = yes # benchmark.variants.revs = any-1-extra-rev # benchmark.variants.source = unbundle # benchmark.variants.validate = default # benchmark.variants.verbosity = quiet ## data-env-vars.name = netbeans-2018-08-01-zstd-sparse-revlog # bin-env-vars.hg.flavor = default branch-v2: 0.233711 ~~~~~ branch-v3 before: 0.380994 (+63.02%, +0.15) branch-v3 after: 0.368769 (+57.79%, +0.14) # bin-env-vars.hg.flavor = rust branch-v2: 0.235230 ~~~~~ branch-v3 before: 0.385060 (+63.70%, +0.15) branch-v3 after: 0.372460 (+58.34%, +0.14) ## data-env-vars.name = netbeans-2018-08-01-ds2-pnm # bin-env-vars.hg.flavor = rust branch-v2: 0.255586 ~~~~~ branch-v3 before: 0.317524 (+24.23%, +0.06) branch-v3 after: 0.318907 (+24.78%, +0.06) ## data-env-vars.name = mozilla-central-2024-03-22-zstd-sparse-revlog # bin-env-vars.hg.flavor = default branch-v2: 0.339010 ~~~~~ branch-v3 before: 0.410007 (+20.94%, +0.07) branch-v3 after: 0.349752 (+3.17%, +0.01) # bin-env-vars.hg.flavor = rust branch-v2: 0.346525 ~~~~~ branch-v3 before: 0.410428 (+18.44%, +0.06) branch-v3 after: 0.354300 (+2.24%, +0.01) ## data-env-vars.name = mozilla-central-2024-03-22-ds2-pnm # bin-env-vars.hg.flavor = rust branch-v2: 0.380202 ~~~~~ branch-v3 before: 0.393871 (+3.60%, +0.01) branch-v3 after: 0.396293 (+4.23%, +0.02) ## data-env-vars.name = mozilla-unified-2024-03-22-zstd-sparse-revlog # bin-env-vars.hg.flavor = default branch-v2: 0.412165 ~~~~~ branch-v3 before: 0.438105 (+6.29%, +0.03) branch-v3 after: 0.424769 (+3.06%, +0.01) # bin-env-vars.hg.flavor = rust branch-v2: 0.412397 ~~~~~ branch-v3 before: 0.438405 (+6.31%, +0.03) branch-v3 after: 0.421796 (+2.28%, +0.01) ## data-env-vars.name = mozilla-unified-2024-03-22-ds2-pnm # bin-env-vars.hg.flavor = rust branch-v2: 0.429501 ~~~~~ branch-v3 before: 0.452692 (+5.40%, +0.02) branch-v3 after: 0.443849 (+3.34%, +0.01) ## data-env-vars.name = mozilla-try-2024-03-26-zstd-sparse-revlog # bin-env-vars.hg.flavor = default branch-v2: 3.403171 ~~~~~ branch-v3 before: 6.562345 (+92.83%, +3.16) branch-v3 after: 6.234055 (+83.18%, +2.83) # bin-env-vars.hg.flavor = rust branch-v2: 3.454876 ~~~~~ branch-v3 before: 6.160248 (+78.31%, +2.71) branch-v3 after: 6.307813 (+82.58%, +2.85) ## data-env-vars.name = mozilla-try-2024-03-26-ds2-pnm # bin-env-vars.hg.flavor = rust branch-v2: 3.465435 ~~~~~ branch-v3 before: 5.381648 (+55.30%, +1.92) branch-v3 after: 5.176076 (+49.36%, +1.71)

File last commit:

r52756:f4733654 default
r52869:41b8892a default
Show More
mpatch.py
144 lines | 3.5 KiB | text/x-python | PythonLexer
Martin Geisler
pure Python implementation of mpatch.c
r7699 # mpatch.py - Python implementation of mpatch.c
#
Raphaël Gomès
contributor: change mentions of mpm to olivia...
r47575 # Copyright 2009 Olivia Mackall <olivia@selenic.com> and others
Martin Geisler
pure Python implementation of mpatch.c
r7699 #
Martin Geisler
updated license to be explicit about GPL version 2
r8225 # This software may be used and distributed according to the terms of the
Matt Mackall
Update license to GPLv2+
r10263 # GNU General Public License version 2 or any later version.
Martin Geisler
pure Python implementation of mpatch.c
r7699
Matt Harbison
typing: add `from __future__ import annotations` to most files...
r52756 from __future__ import annotations
Gregory Szorc
mpatch: use absolute_import...
r27337
Gregory Szorc
py3: use io.BytesIO directly...
r49728 import io
Martin Geisler
pure/mpatch: use StringIO instead of mmap (issue1493)...
r7775 import struct
Gregory Szorc
mpatch: use absolute_import...
r27337
Matt Harbison
typing: add type hints to mpatch implementations...
r50494 from typing import (
List,
Tuple,
)
Augie Fackler
formatting: blacken the codebase...
r43346
Gregory Szorc
py3: use io.BytesIO directly...
r49728 stringio = io.BytesIO
Martin Geisler
pure Python implementation of mpatch.c
r7699
Augie Fackler
formatting: blacken the codebase...
r43346
timeless
mpatch: unify mpatchError (issue5182)...
r28782 class mpatchError(Exception):
Augie Fackler
formating: upgrade to black 20.8b1...
r46554 """error raised when a delta cannot be decoded"""
timeless
mpatch: unify mpatchError (issue5182)...
r28782
Augie Fackler
formatting: blacken the codebase...
r43346
Martin Geisler
pure Python implementation of mpatch.c
r7699 # This attempts to apply a series of patches in time proportional to
# the total size of the patches, rather than patches * len(text). This
# means rather than shuffling strings around, we shuffle around
# pointers to fragments with fragment lists.
#
# When the fragment lists get too long, we collapse them. To do this
# efficiently, we do all our operations inside a buffer created by
# mmap and simply use memmove. This avoids creating a bunch of large
# temporary string buffers.
Augie Fackler
formatting: blacken the codebase...
r43346
Matt Harbison
typing: add type hints to mpatch implementations...
r50494 def _pull(
dst: List[Tuple[int, int]], src: List[Tuple[int, int]], l: int
) -> None: # pull l bytes from src
Augie Fackler
mpatch: move pull() method to top level...
r28587 while l:
f = src.pop()
Augie Fackler
formatting: blacken the codebase...
r43346 if f[0] > l: # do we need to split?
Augie Fackler
mpatch: move pull() method to top level...
r28587 src.append((f[0] - l, f[1] + l))
dst.append((l, f[1]))
return
dst.append(f)
l -= f[0]
Augie Fackler
formatting: blacken the codebase...
r43346
Matt Harbison
typing: add type hints to mpatch implementations...
r50494 def _move(m: stringio, dest: int, src: int, count: int) -> None:
Augie Fackler
mpatch: un-nest the move() method...
r28588 """move count bytes from src to dest
The file pointer is left at the end of dest.
"""
m.seek(src)
buf = m.read(count)
m.seek(dest)
m.write(buf)
Augie Fackler
formatting: blacken the codebase...
r43346
Matt Harbison
typing: add type hints to mpatch implementations...
r50494 def _collect(
m: stringio, buf: int, list: List[Tuple[int, int]]
) -> Tuple[int, int]:
Augie Fackler
mpatch: move collect() to module level...
r28589 start = buf
for l, p in reversed(list):
_move(m, buf, p, l)
buf += l
return (buf - start, start)
Augie Fackler
formatting: blacken the codebase...
r43346
Matt Harbison
typing: add type hints to mpatch implementations...
r50494 def patches(a: bytes, bins: List[bytes]) -> bytes:
Matt Mackall
many, many trivial check-code fixups
r10282 if not bins:
return a
Martin Geisler
pure Python implementation of mpatch.c
r7699
plens = [len(x) for x in bins]
pl = sum(plens)
bl = len(a) + pl
Augie Fackler
formatting: blacken the codebase...
r43346 tl = bl + bl + pl # enough for the patches and two working texts
Martin Geisler
pure Python implementation of mpatch.c
r7699 b1, b2 = 0, bl
Matt Mackall
many, many trivial check-code fixups
r10282 if not tl:
return a
Martin Geisler
pure Python implementation of mpatch.c
r7699
timeless
pycompat: switch to util.stringio for py3 compat
r28861 m = stringio()
Martin Geisler
pure Python implementation of mpatch.c
r7699
# load our original text
m.write(a)
frags = [(len(a), b1)]
# copy all the patches into our segment so we can memmove from them
pos = b2 + bl
m.seek(pos)
Alex Gaynor
style: never put multiple statements on one line...
r34436 for p in bins:
m.write(p)
Martin Geisler
pure Python implementation of mpatch.c
r7699
for plen in plens:
# if our list gets too long, execute it
if len(frags) > 128:
b2, b1 = b1, b2
Augie Fackler
mpatch: move collect() to module level...
r28589 frags = [_collect(m, b1, frags)]
Martin Geisler
pure Python implementation of mpatch.c
r7699
new = []
end = pos + plen
last = 0
while pos < end:
Martin Geisler
pure/mpatch: use StringIO instead of mmap (issue1493)...
r7775 m.seek(pos)
timeless
mpatch: unify mpatchError (issue5182)...
r28782 try:
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 p1, p2, l = struct.unpack(b">lll", m.read(12))
timeless
mpatch: unify mpatchError (issue5182)...
r28782 except struct.error:
Matt Harbison
pure: stringify builtin exception messages...
r52634 raise mpatchError("patch cannot be decoded")
Augie Fackler
formatting: blacken the codebase...
r43346 _pull(new, frags, p1 - last) # what didn't change
_pull([], frags, p2 - p1) # what got deleted
new.append((l, pos + 12)) # what got added
Martin Geisler
pure Python implementation of mpatch.c
r7699 pos += l + 12
last = p2
Augie Fackler
formatting: blacken the codebase...
r43346 frags.extend(reversed(new)) # what was left at the end
Martin Geisler
pure Python implementation of mpatch.c
r7699
Augie Fackler
mpatch: move collect() to module level...
r28589 t = _collect(m, b2, frags)
Martin Geisler
pure Python implementation of mpatch.c
r7699
Martin Geisler
pure/mpatch: use StringIO instead of mmap (issue1493)...
r7775 m.seek(t[1])
return m.read(t[0])
Martin Geisler
pure Python implementation of mpatch.c
r7699
Augie Fackler
formatting: blacken the codebase...
r43346
Matt Harbison
typing: add type hints to mpatch implementations...
r50494 def patchedsize(orig: int, delta: bytes) -> int:
Martin Geisler
pure Python implementation of mpatch.c
r7699 outlen, last, bin = 0, 0, 0
binend = len(delta)
data = 12
while data <= binend:
Augie Fackler
formatting: blacken the codebase...
r43346 decode = delta[bin : bin + 12]
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 start, end, length = struct.unpack(b">lll", decode)
Martin Geisler
pure Python implementation of mpatch.c
r7699 if start > end:
break
bin = data + length
data = bin + 12
outlen += start - last
last = end
outlen += length
if bin != binend:
Matt Harbison
pure: stringify builtin exception messages...
r52634 raise mpatchError("patch cannot be decoded")
Martin Geisler
pure Python implementation of mpatch.c
r7699
outlen += orig - last
return outlen