exchangev2: fetch manifest revisions

Now that the server has support for retrieving manifest data, we can implement the client bits to call it.

We teach the changeset fetching code to capture the manifest revisions that are encountered on incoming changesets. We then feed this into a new function which filters out known manifests and then batches up manifest data requests to the server.

This is different from the previous wire protocol in a few notable ways.

First, the client fetches manifest data separately and explicitly. Before, we'd ask the server for data pertaining to some changesets (via a "getbundle" command) and manifests (and files) would be sent automatically. Providing an API for looking up just manifest data separately gives clients much more flexibility for manifest management. For example, a client may choose to only fetch manifest data on demand instead of prefetching it (i.e. partial clone).

Second, we send N commands to the server for manifest retrieval instead of 1. This property has a few nice side-effects. One is that the deterministic nature of the requests lends itself to server-side caching. For example, say the remote has 50,000 manifests. If the server is configured to cache responses and the client asks for all of the data in a single request, each time a new commit arrives, you will have a cache miss and need to regenerate all outgoing data. But if you make N requests requesting 10,000 manifests each, a new commit will still yield cache hits on the initial, unchanged manifest batches/requests.

A derived benefit from these properties is that resumable clone is conceptually simpler to implement. When making a monolithic request for all of the repository data, recovering from an interrupted clone is hard because the server was in the driver's seat and was maintaining state about all the data that needed to be transferred. With the client driving fetching, the client can persist the set of unfetched entities and retry/resume a fetch if something goes wrong. Or we can fetch all data N changesets at a time and slowly build up a repository. This approach is drastically easier to implement when we have server APIs exposing low-level repository primitives (such as manifests and files).

We don't yet support tree manifests. But it should be possible to implement that with the existing wire protocol command.

Differential Revision: https://phab.mercurial-scm.org/D4489
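
The filter-then-batch step described in the commit message can be illustrated with a small sketch. This is not the code from the change itself: the function name `batch_manifest_requests`, the `is_known` callback and the integer stand-ins for manifest nodes are all hypothetical, chosen only to show why deterministic batches tend to stay cache-friendly when new commits arrive.

```python
def batch_manifest_requests(incoming, is_known, batch_size=10000):
    """Group the manifest nodes we still need into fixed-size request batches.

    incoming   -- manifest nodes in the order they were seen on changesets
    is_known   -- hypothetical callback, True if the node is already stored
    batch_size -- how many nodes to ask for per request (illustrative default)
    """
    seen = set()
    wanted = []
    for node in incoming:
        # Skip duplicates and anything the local store already has.
        if node in seen or is_known(node):
            continue
        seen.add(node)
        wanted.append(node)

    # Slice in a stable order so the same history yields the same batches.
    return [wanted[i:i + batch_size]
            for i in range(0, len(wanted), batch_size)]


if __name__ == "__main__":
    # Integers stand in for manifest nodes; nothing is known locally.
    before = batch_manifest_requests(range(25), lambda n: False, batch_size=10)
    after = batch_manifest_requests(range(26), lambda n: False, batch_size=10)
    # Only the trailing batch differs once one more manifest exists.
    print(before[:2] == after[:2])
```

Because the leading batches are identical before and after the extra manifest appears, a server that caches responses keyed on the request would only need to regenerate the final batch, assuming the client always walks the history in the same order.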

File last commit:

r38806:e7aa113b default
r39674:d292328e default
treediscovery.py
174 lines | 5.6 KiB | text/x-python | PythonLexer
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164 # discovery.py - protocol changeset discovery functions
#
# Copyright 2010 Matt Mackall <mpm@selenic.com>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
Gregory Szorc
treediscovery: use absolute_import
r25987 from __future__ import absolute_import
Martin von Zweigbergk
util: drop alias for collections.deque...
r25113 import collections
Gregory Szorc
treediscovery: use absolute_import
r25987
from .i18n import _
from .node import (
nullid,
short,
)
from . import (
error,
Gregory Szorc
global: use pycompat.xrange()...
r38806 pycompat,
Gregory Szorc
treediscovery: use absolute_import
r25987 )
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164
def findcommonincoming(repo, remote, heads=None, force=False):
"""Return a tuple (common, fetch, heads) used to identify the common
subset of nodes between repo and remote.
"common" is a list of (at least) the heads of the common subset.
"fetch" is a list of roots of the nodes that would be incoming, to be
supplied to changegroupsubset.
"heads" is either the supplied heads, or else the remote's heads.
"""
Pierre-Yves David
discovery: stop using nodemap for membership testing...
r20225 knownnode = repo.changelog.hasnode
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164 search = []
fetch = set()
seen = set()
seenbranch = set()
base = set()
if not heads:
Gregory Szorc
treediscovery: switch to command executor interface...
r37652 with remote.commandexecutor() as e:
heads = e.callcommand('heads', {}).result()
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164
if repo.changelog.tip() == nullid:
base.add(nullid)
if heads != [nullid]:
return [nullid], [nullid], list(heads)
Peter Arrenbrecht
treediscovery: fix regression when run against older repos (issue2793)...
r14199 return [nullid], [], heads
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164
# assume we're closer to the tip than the root
# and start by examining the heads
repo.ui.status(_("searching for changes\n"))
unknown = []
for h in heads:
Pierre-Yves David
discovery: stop using nodemap for membership testing...
r20225 if not knownnode(h):
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164 unknown.append(h)
else:
base.add(h)
Peter Arrenbrecht
treediscovery: fix regression when run against older repos (issue2793)...
r14199 if not unknown:
return list(base), [], list(heads)
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164 req = set(unknown)
reqcnt = 0
Martin von Zweigbergk
treediscovery: use progress helper...
r38419 progress = repo.ui.makeprogress(_('searching'), unit=_('queries'))
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164
# search through remote branches
# a 'branch' here is a linear segment of history, with four parts:
# head, root, first parent, second parent
# (a branch always has two parents (or none) by definition)
Gregory Szorc
treediscovery: switch to command executor interface...
r37652 with remote.commandexecutor() as e:
branches = e.callcommand('branches', {'nodes': unknown}).result()
unknown = collections.deque(branches)
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164 while unknown:
r = []
while unknown:
Bryan O'Sullivan
cleanup: use the deque type where appropriate...
r16803 n = unknown.popleft()
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164 if n[0] in seen:
continue
repo.ui.debug("examining %s:%s\n"
% (short(n[0]), short(n[1])))
if n[0] == nullid: # found the end of the branch
pass
elif n in seenbranch:
repo.ui.debug("branch already found\n")
continue
Pierre-Yves David
discovery: stop using nodemap for membership testing...
r20225 elif n[1] and knownnode(n[1]): # do we know the base?
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164 repo.ui.debug("found incomplete branch %s:%s\n"
% (short(n[0]), short(n[1])))
search.append(n[0:2]) # schedule branch range for scanning
seenbranch.add(n)
else:
if n[1] not in seen and n[1] not in fetch:
Pierre-Yves David
discovery: stop using nodemap for membership testing...
r20225 if knownnode(n[2]) and knownnode(n[3]):
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164 repo.ui.debug("found new changeset %s\n" %
short(n[1]))
fetch.add(n[1]) # earliest unknown
for p in n[2:4]:
Pierre-Yves David
discovery: stop using nodemap for membership testing...
r20225 if knownnode(p):
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164 base.add(p) # latest known
for p in n[2:4]:
Pierre-Yves David
discovery: stop using nodemap for membership testing...
r20225 if p not in req and not knownnode(p):
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164 r.append(p)
req.add(p)
seen.add(n[0])
if r:
reqcnt += 1
Martin von Zweigbergk
treediscovery: use progress helper...
r38419 progress.increment()
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164 repo.ui.debug("request %d: %s\n" %
(reqcnt, " ".join(map(short, r))))
Gregory Szorc
global: use pycompat.xrange()...
r38806 for p in pycompat.xrange(0, len(r), 10):
Gregory Szorc
treediscovery: switch to command executor interface...
r37652 with remote.commandexecutor() as e:
branches = e.callcommand('branches', {
'nodes': r[p:p + 10],
}).result()
for b in branches:
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164 repo.ui.debug("received %s:%s\n" %
(short(b[0]), short(b[1])))
unknown.append(b)
# do binary search on the branches we found
while search:
newsearch = []
reqcnt += 1
Martin von Zweigbergk
treediscovery: use progress helper...
r38419 progress.increment()
Gregory Szorc
treediscovery: switch to command executor interface...
r37652
with remote.commandexecutor() as e:
between = e.callcommand('between', {'pairs': search}).result()
for n, l in zip(search, between):
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164 l.append(n[1])
p = n[0]
f = 1
for i in l:
repo.ui.debug("narrowing %d:%d %s\n" % (f, len(l), short(i)))
Pierre-Yves David
discovery: stop using nodemap for membership testing...
r20225 if knownnode(i):
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164 if f <= 2:
repo.ui.debug("found new branch changeset %s\n" %
short(p))
fetch.add(p)
base.add(i)
else:
repo.ui.debug("narrowed branch search to %s:%s\n"
% (short(p), short(i)))
newsearch.append((p, i))
break
p, f = i, f * 2
search = newsearch
# sanity check our fetch list
for f in fetch:
Pierre-Yves David
discovery: stop using nodemap for membership testing...
r20225 if knownnode(f):
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164 raise error.RepoError(_("already have changeset ")
+ short(f[:4]))
base = list(base)
if base == [nullid]:
if force:
repo.ui.warn(_("warning: repository is unrelated\n"))
else:
Pierre-Yves David
error: get Abort from 'error' instead of 'util'...
r26587 raise error.Abort(_("repository is unrelated"))
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164
repo.ui.debug("found new changesets starting at " +
" ".join([short(f) for f in fetch]) + "\n")
Martin von Zweigbergk
treediscovery: use progress helper...
r38419 progress.complete()
Peter Arrenbrecht
discovery: add new set-based discovery...
r14164 repo.ui.debug("%d total queries\n" % reqcnt)
return base, list(fetch), heads
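
The "binary search on the branches we found" loop above can be hard to follow in isolation, so here is a simplified, self-contained sketch of the same narrowing idea. It makes several assumptions that are not part of the listing: history is modelled as a single linear chain of integers (the parent of `n` is `n - 1`), the hypothetical `between()` helper imitates the wire command by sampling at distances 1, 2, 4, ... below the top of the range, and `known` is a plain callback rather than `repo.changelog.hasnode`.

```python
def between(top, bottom):
    """Sample a linear range at distances 1, 2, 4, ... below top.

    Assumed stand-in for the 'between' wire command; bottom is excluded.
    """
    samples = []
    dist = 1
    while top - dist > bottom:
        samples.append(top - dist)
        dist *= 2
    return samples


def find_boundary(head, root, known):
    """Return (first_unknown, last_known, queries) for one linear segment.

    head must be unknown locally and root must be known, mirroring the
    'incomplete branch' case in the listing.
    """
    if known(head) or not known(root):
        raise ValueError("expected an unknown head and a known root")

    search = [(head, root)]
    queries = 0
    while search:
        top, bottom = search.pop()
        queries += 1
        samples = between(top, bottom)
        samples.append(bottom)  # the known end always terminates the scan
        prev, f = top, 1
        for node in samples:
            if known(node):
                if f <= 2:
                    # prev and node are adjacent: the boundary is found.
                    return prev, node, queries
                # Otherwise narrow the range and sample it again.
                search.append((prev, node))
                break
            prev, f = node, f * 2
    raise AssertionError("unreachable when root is known")


if __name__ == "__main__":
    # Revisions 0..337 are known locally; the remote head is 1000.
    print(find_boundary(1000, 0, lambda n: n <= 337))
```

Each round costs one query and shrinks the range between the last unknown sample and the first known one, which is why the number of round trips grows roughly with the logarithm of the segment length rather than with its size.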