upstream/mercurial-mirror Files · mercurial/setdiscovery.py


File last commit: r35867:5cfdf613 default
r35994:48a3a928 default
setdiscovery.py | 267 lines | 9.4 KiB | text/x-python

# setdiscovery.py - improved discovery of common nodeset for mercurial
#
# Copyright 2010 Benoit Boissinot <bboissin@gmail.com>
# and Peter Arrenbrecht <peter@arrenbrecht.ch>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

"""
The algorithm works in the following way. You have two repositories: local
and remote. They both contain a DAG of changelists.

The goal of the discovery protocol is to find one set of nodes, *common*:
the set of nodes shared by local and remote.

One of the issues with the original protocol was latency: it could
potentially require lots of roundtrips to discover that the local repo was a
subset of remote (which is a very common case, you usually have few changes
compared to upstream, while upstream probably had lots of development).

The new protocol only requires one interface for the remote repo: `known()`,
which given a set of changelists tells you if they are present in the DAG.

The algorithm then works as follows:

 - We will be using three sets, `common`, `missing`, `unknown`. Originally
   all nodes are in `unknown`.
 - Take a sample from `unknown`, call `remote.known(sample)`
   - For each node that remote knows, move it and all its ancestors to `common`
   - For each node that remote doesn't know, move it and all its descendants
     to `missing`
 - Iterate until `unknown` is empty

There are a couple of optimizations. The first is that instead of starting
with a random sample of `unknown`, you start by sending all heads; in the
case where the local repo is a subset, you have computed the answer in one
round trip.

Then you can do something similar to the bisecting strategy used when
finding faulty changesets. Instead of random samples, you can try picking
nodes that will maximize the number of nodes that will be
classified with it (since all ancestors or descendants will be marked as well).
"""
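The loop described in the docstring above can be sketched on a toy DAG. This is a minimal illustration under stated assumptions, not the code in this module: the DAG is a plain `{node: [parents]}` dict, the remote is modelled as a membership test, and the names `ancestors`, `descendants`, `discover` and `remote_known` are hypothetical. It uses plain random sampling and none of the optimizations.

```python
import random


def ancestors(parents, node):
    """Return node plus all of its ancestors in a {node: [parents]} DAG."""
    seen, stack = set(), [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents[n])
    return seen


def descendants(parents, node):
    """Return node plus every node that has it as an ancestor."""
    return {n for n in parents if node in ancestors(parents, n)}


def discover(parents, remote_known, samplesize=2, seed=0):
    """Classify every local node as common or missing; count round trips."""
    rng = random.Random(seed)
    common, missing, unknown = set(), set(), set(parents)
    roundtrips = 0
    while unknown:
        # One query to the remote per loop iteration.
        sample = rng.sample(sorted(unknown), min(samplesize, len(unknown)))
        roundtrips += 1
        for node in sample:
            if remote_known(node):
                common |= ancestors(parents, node)
            else:
                missing |= descendants(parents, node)
        unknown -= common | missing
    return common, missing, roundtrips


# Local history is the linear chain 0 <- 1 <- 2 <- 3 <- 4; remote has 0..2.
local = {0: [], 1: [0], 2: [1], 3: [2], 4: [3]}
remote_nodes = {0, 1, 2}
common, missing, roundtrips = discover(local, remote_nodes.__contains__)
print(sorted(common), sorted(missing), roundtrips)
```

Whatever the random choices, the final classification is the same; only the number of round trips depends on which nodes happen to be sampled.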

from __future__ import absolute_import

import collections
import random

from .i18n import _
from .node import (
    nullid,
    nullrev,
)
from . import (
    dagutil,
    error,
    util,
)

def _updatesample(dag, nodes, sample, quicksamplesize=0):
    """update an existing sample to match the expected size

    The sample is updated with nodes exponentially distant from each head of the
    <nodes> set. (H~1, H~2, H~4, H~8, etc).

    If a target size is specified, the sampling will stop once this size is
    reached. Otherwise sampling will happen until roots of the <nodes> set are
    reached.

    :dag: a dag object from dagutil
    :nodes:  set of nodes we want to discover (if None, assume the whole dag)
    :sample: a sample to update
    :quicksamplesize: optional target size of the sample"""
    # if nodes is empty we scan the entire graph
    if nodes:
        heads = dag.headsetofconnecteds(nodes)
    else:
        heads = dag.heads()
    dist = {}
    visit = collections.deque(heads)
    seen = set()
    factor = 1
    while visit:
        curr = visit.popleft()
        if curr in seen:
            continue
        d = dist.setdefault(curr, 1)
        if d > factor:
            factor *= 2
        if d == factor:
            sample.add(curr)
            if quicksamplesize and (len(sample) >= quicksamplesize):
                return
        seen.add(curr)
        for p in dag.parents(curr):
            if not nodes or p in nodes:
                dist.setdefault(p, d + 1)
                visit.append(p)
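To see which nodes the breadth-first walk in `_updatesample` selects, the sketch below reruns the same loop on a linear chain. It assumes a plain `{node: [parents]}` dict in place of the `dagutil` object; `exponential_sample`, `chain` and `maxsize` are hypothetical names used only for illustration. The head itself counts as distance 1, so the picked nodes sit at distances 1, 2, 4, 8 and 16.

```python
import collections


def exponential_sample(parents, heads, maxsize=0):
    """Walk from heads toward roots, keeping nodes at distance 1, 2, 4, 8..."""
    sample = set()
    dist = {}
    visit = collections.deque(heads)
    seen = set()
    factor = 1
    while visit:
        curr = visit.popleft()
        if curr in seen:
            continue
        d = dist.setdefault(curr, 1)
        if d > factor:
            factor *= 2
        if d == factor:
            sample.add(curr)
            # Stop early once the optional size cap is reached.
            if maxsize and len(sample) >= maxsize:
                return sample
        seen.add(curr)
        for p in parents[curr]:
            dist.setdefault(p, d + 1)
            visit.append(p)
    return sample


# Linear chain 0 <- 1 <- ... <- 15, with 15 as the only head.
chain = {n: ([n - 1] if n else []) for n in range(16)}
picked = exponential_sample(chain, [15])
capped = exponential_sample(chain, [15], maxsize=3)
print(sorted(picked))  # [0, 8, 12, 14, 15]
print(sorted(capped))  # [12, 14, 15]
```

Because the gaps double at each step, a chain of n nodes contributes only about log2(n) nodes to the sample, which is what keeps a single query small.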

def _takequicksample(dag, nodes, size):
    """takes a quick sample of size <size>

    It is meant for initial sampling and focuses on querying heads and close
    ancestors of heads.

    :dag: a dag object
    :nodes: set of nodes to discover
    :size: the maximum size of the sample"""
    sample = dag.headsetofconnecteds(nodes)
    if size <= len(sample):
        return _limitsample(sample, size)
    _updatesample(dag, None, sample, quicksamplesize=size)
    return sample

def _takefullsample(dag, nodes, size):
    """takes a full sample of size <size>

    It samples at exponentially growing distances from both the heads and the
    roots of <nodes>, then trims the result to <size> or pads it with randomly
    chosen nodes."""
    sample = dag.headsetofconnecteds(nodes)
    # update from heads
    _updatesample(dag, nodes, sample)
    # update from roots
    _updatesample(dag.inverse(), nodes, sample)
    assert sample
    sample = _limitsample(sample, size)
    if len(sample) < size:
        more = size - len(sample)
        sample.update(random.sample(list(nodes - sample), more))
    return sample

def _limitsample(sample, desiredlen):
    """return a random subset of sample of at most desiredlen item"""
    if len(sample) > desiredlen:
        sample = set(random.sample(sample, desiredlen))
    return sample
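A note on portability, offered as a sketch and not as the upstream code: `_limitsample` may be handed a set, and from Python 3.11 onward `random.sample` only accepts a sequence. The variant below converts to a list first. The names `limit_sample` and `heads` are illustrative.

```python
import random


def limit_sample(sample, desiredlen):
    """Return sample unchanged if small enough, else a random subset."""
    if len(sample) > desiredlen:
        # list() keeps this working on Python 3.11+, where random.sample
        # no longer accepts a set as its population.
        sample = set(random.sample(list(sample), desiredlen))
    return sample


heads = set(range(10))
small = limit_sample(heads, 20)  # already within the limit: returned as-is
capped = limit_sample(heads, 3)  # random subset of exactly 3 elements
print(len(small), len(capped))  # 10 3
```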

def findcommonheads(ui, local, remote, heads=None,
                    initialsamplesize=100,
                    fullsamplesize=200,
                    abortwhenunrelated=True,
                    ancestorsof=None):
    '''Return a tuple (common, anyincoming, remoteheads) used to identify
    missing nodes from or in remote.
    '''
    start = util.timer()

    roundtrips = 0
    cl = local.changelog
    localsubset = None
    if ancestorsof is not None:
        rev = local.changelog.rev
        localsubset = [rev(n) for n in ancestorsof]
    dag = dagutil.revlogdag(cl, localsubset=localsubset)

    # early exit if we know all the specified remote heads already
    ui.debug("query 1; heads\n")
    roundtrips += 1
    ownheads = dag.heads()
    sample = _limitsample(ownheads, initialsamplesize)
    # indices between sample and externalized version must match
    sample = list(sample)
    if heads:
        srvheadhashes = heads
        yesno = remote.known(dag.externalizeall(sample))
    else:
        batch = remote.iterbatch()
        batch.heads()
        batch.known(dag.externalizeall(sample))
        batch.submit()
        srvheadhashes, yesno = batch.results()

    if cl.tip() == nullid:
        if srvheadhashes != [nullid]:
            return [nullid], True, srvheadhashes
        return [nullid], False, []

    # start actual discovery (we note this before the next "if" for
    # compatibility reasons)
    ui.status(_("searching for changes\n"))

    srvheads = dag.internalizeall(srvheadhashes, filterunknown=True)
    if len(srvheads) == len(srvheadhashes):
        ui.debug("all remote heads known locally\n")
        return (srvheadhashes, False, srvheadhashes,)

    if sample and len(ownheads) <= initialsamplesize and all(yesno):
        ui.note(_("all local heads known remotely\n"))
        ownheadhashes = dag.externalizeall(ownheads)
        return (ownheadhashes, True, srvheadhashes,)

    # full blown discovery

    # own nodes I know we both know
    # treat remote heads (and maybe own heads) as a first implicit sample
    # response
    common = cl.incrementalmissingrevs(srvheads)
    commoninsample = set(n for i, n in enumerate(sample) if yesno[i])
    common.addbases(commoninsample)
    # own nodes where I don't know if remote knows them
    undecided = set(common.missingancestors(ownheads))
    # own nodes I know remote lacks
    missing = set()

    full = False
    while undecided:

        if sample:
            missinginsample = [n for i, n in enumerate(sample) if not yesno[i]]
            missing.update(dag.descendantset(missinginsample, missing))

            undecided.difference_update(missing)

        if not undecided:
            break

        if full or common.hasbases():
            if full:
                ui.note(_("sampling from both directions\n"))
            else:
                ui.debug("taking initial sample\n")
            samplefunc = _takefullsample
            targetsize = fullsamplesize
        else:
            # use even cheaper initial sample
            ui.debug("taking quick initial sample\n")
            samplefunc = _takequicksample
            targetsize = initialsamplesize
        if len(undecided) < targetsize:
            sample = list(undecided)
        else:
            sample = samplefunc(dag, undecided, targetsize)
            sample = _limitsample(sample, targetsize)

        roundtrips += 1
        ui.progress(_('searching'), roundtrips, unit=_('queries'))
        ui.debug("query %i; still undecided: %i, sample size is: %i\n"
                 % (roundtrips, len(undecided), len(sample)))
        # indices between sample and externalized version must match
        sample = list(sample)
        yesno = remote.known(dag.externalizeall(sample))
        full = True

        if sample:
            commoninsample = set(n for i, n in enumerate(sample) if yesno[i])
            common.addbases(commoninsample)
            common.removeancestorsfrom(undecided)

    # heads(common) == heads(common.bases) since common represents common.bases
    # and all its ancestors
    result = dag.headsetofconnecteds(common.bases)
    # common.bases can include nullrev, but our contract requires us to not
    # return any heads in that case, so discard that
    result.discard(nullrev)
    elapsed = util.timer() - start
    ui.progress(_('searching'), None)
    ui.debug("%d total queries in %.4fs\n" % (roundtrips, elapsed))
    msg = ('found %d common and %d unknown server heads,'
           ' %d roundtrips in %.4fs\n')
    missing = set(result) - set(srvheads)
    ui.log('discovery', msg, len(result), len(missing), roundtrips,
           elapsed)

    if not result and srvheadhashes != [nullid]:
        if abortwhenunrelated:
            raise error.Abort(_("repository is unrelated"))
        else:
            ui.warn(_("warning: repository is unrelated\n"))
        return ({nullid}, True, srvheadhashes,)

    anyincoming = (srvheadhashes != [nullid])
    return dag.externalizeall(result), anyincoming, srvheadhashes
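The comment "indices between sample and externalized version must match" in `findcommonheads` reflects that `known()` answers positionally, with one boolean per queried node. The sketch below illustrates why the sample is frozen into a list before the query. `split_by_known` and `fake_known` are hypothetical helpers, and `fake_known` merely stands in for `remote.known()`.

```python
def split_by_known(sample, known):
    """Partition sample using the positional yes/no answers from known()."""
    sample = list(sample)  # freeze one ordering before querying
    yesno = known(sample)  # one boolean per queried node, in the same order
    common = {n for i, n in enumerate(sample) if yesno[i]}
    missing = {n for i, n in enumerate(sample) if not yesno[i]}
    return common, missing


remote_nodes = {"a", "b"}


def fake_known(nodes):
    """Hypothetical stand-in for remote.known(): a list of booleans."""
    return [n in remote_nodes for n in nodes]


common, missing = split_by_known({"a", "b", "c"}, fake_known)
print(sorted(common), sorted(missing))  # ['a', 'b'] ['c']
```

If the query and the later `enumerate` iterated over two different orderings of the same nodes, the booleans could be attributed to the wrong nodes, so a single list is built once and used for both.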


Peter Arrenbrecht discovery: add new set-based discovery...	r14164	# setdiscovery.py - improved discovery of common nodeset for mercurial
		#
		# Copyright 2010 Benoit Boissinot <bboissin@gmail.com>
		# and Peter Arrenbrecht <peter@arrenbrecht.ch>
		#
		# This software may be used and distributed according to the terms of the
		# GNU General Public License version 2 or any later version.
Olle Lundberg setdiscovery: document algorithms used...	r20656	"""
		Algorithm works in the following way. You have two repository: local and
		remote. They both contains a DAG of changelists.

		The goal of the discovery protocol is to find one set of node common,
		the set of nodes shared by local and remote.

		One of the issue with the original protocol was latency, it could
		potentially require lots of roundtrips to discover that the local repo was a
		subset of remote (which is a very common case, you usually have few changes
		compared to upstream, while upstream probably had lots of development).

		The new protocol only requires one interface for the remote repo: `known()`,
		which given a set of changelists tells you if they are present in the DAG.

		The algorithm then works as follow:

		- We will be using three sets, `common`, `missing`, `unknown`. Originally
		all nodes are in `unknown`.
		- Take a sample from `unknown`, call `remote.known(sample)`
		- For each node that remote knows, move it and all its ancestors to `common`
		- For each node that remote doesn't know, move it and all its descendants
		to `missing`
		- Iterate until `unknown` is empty

		There are a couple optimizations, first is instead of starting with a random
		sample of missing, start by sending all heads, in the case where the local
		repo is a subset, you computed the answer in one round trip.

		Then you can do something similar to the bisecting strategy used when
		finding faulty changesets. Instead of random samples, you can try picking
		nodes that will maximize the number of nodes that will be
		classified with it (since all ancestors or descendants will be marked as well).
		"""
Peter Arrenbrecht discovery: add new set-based discovery...	r14164
Gregory Szorc setdiscovery: use absolute_import	r25973	from __future__ import absolute_import

Martin von Zweigbergk util: drop alias for collections.deque...	r25113	import collections
Augie Fackler cleanup: move stdlib imports to their own import statement...	r20034	import random
Gregory Szorc setdiscovery: use absolute_import	r25973
		from .i18n import _
		from .node import (
		nullid,
		nullrev,
		)
		from . import (
		dagutil,
Pierre-Yves David error: get Abort from 'error' instead of 'util'...	r26587	error,
marmoute discovery: include timing in the debug output...	r32712	util,
Gregory Szorc setdiscovery: use absolute_import	r25973	)
Peter Arrenbrecht discovery: add new set-based discovery...	r14164
Pierre-Yves David setdiscovery: drop the 'always' argument to '_updatesample'...	r23814	def _updatesample(dag, nodes, sample, quicksamplesize=0):
Pierre-Yves David setdiscovery: document the '_updatesample' function...	r23809	"""update an existing sample to match the expected size

		The sample is updated with nodes exponentially distant from each head of the
		<nodes> set. (H~1, H~2, H~4, H~8, etc).

		If a target size is specified, the sampling will stop once this size is
		reached. Otherwise sampling will happen until roots of the <nodes> set are
		reached.

		:dag: a dag object from dagutil
		:nodes: set of nodes we want to discover (if None, assume the whole dag)
		:sample: a sample to update
		:quicksamplesize: optional target size of the sample"""
Peter Arrenbrecht discovery: add new set-based discovery...	r14164	# if nodes is empty we scan the entire graph
		if nodes:
		heads = dag.headsetofconnecteds(nodes)
		else:
		heads = dag.heads()
		dist = {}
Martin von Zweigbergk util: drop alias for collections.deque...	r25113	visit = collections.deque(heads)
Peter Arrenbrecht discovery: add new set-based discovery...	r14164	seen = set()
		factor = 1
		while visit:
		curr = visit.popleft()
		if curr in seen:
		continue
		d = dist.setdefault(curr, 1)
		if d > factor:
		factor *= 2
		if d == factor:
Pierre-Yves David setdiscovery: drop the 'always' argument to '_updatesample'...	r23814	sample.add(curr)
		if quicksamplesize and (len(sample) >= quicksamplesize):
		return
Peter Arrenbrecht discovery: add new set-based discovery...	r14164	seen.add(curr)
		for p in dag.parents(curr):
		if not nodes or p in nodes:
		dist.setdefault(p, d + 1)
		visit.append(p)

Pierre-Yves David setdiscovery: drop unused 'initial' argument for '_takequicksample'...	r23806	def _takequicksample(dag, nodes, size):
Pierre-Yves David setdiscovery: document '_takequicksample'	r23816	"""takes a quick sample of size <size>

		It is meant for initial sampling and focuses on querying heads and close
		ancestors of heads.

		:dag: a dag object
		:nodes: set of nodes to discover
		:size: the maximum size of the sample"""
Pierre-Yves David setdiscovery: drop '_setupsample' usage in '_takequicksample'...	r23815	sample = dag.headsetofconnecteds(nodes)
		if size <= len(sample):
		return _limitsample(sample, size)
Pierre-Yves David setdiscovery: drop the 'always' argument to '_updatesample'...	r23814	_updatesample(dag, None, sample, quicksamplesize=size)
Peter Arrenbrecht discovery: add new set-based discovery...	r14164	return sample

def _takefullsample(dag, nodes, size):
    sample = dag.headsetofconnecteds(nodes)
    # update from heads
    _updatesample(dag, nodes, sample)
    # update from roots
    _updatesample(dag.inverse(), nodes, sample)
    assert sample
    sample = _limitsample(sample, size)
    if len(sample) < size:
        more = size - len(sample)
        sample.update(random.sample(list(nodes - sample), more))
    return sample

def _limitsample(sample, desiredlen):
    """return a random subset of sample of at most desiredlen items"""
    if len(sample) > desiredlen:
        # random.sample needs a sequence, so materialize the set first
        sample = set(random.sample(list(sample), desiredlen))
    return sample
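
# Illustration only, not part of this module: a small standalone sketch of the
# size-capping behaviour described in the docstring above. The name
# `limit_sample` and the toy inputs are hypothetical.

```python
import random


def limit_sample(sample, desiredlen):
    """Return `sample` unchanged if small enough, else a random subset."""
    if len(sample) > desiredlen:
        # sorted() gives random.sample the sequence it requires
        sample = set(random.sample(sorted(sample), desiredlen))
    return sample


big = set(range(1000))
capped = limit_sample(big, 100)      # 100 randomly chosen members of big
untouched = limit_sample({1, 2, 3}, 100)  # already small, returned as-is
```

# The cap bounds the size of each "known" query regardless of how many
# candidate nodes the caller collected.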

def findcommonheads(ui, local, remote, heads=None,
                    initialsamplesize=100,
                    fullsamplesize=200,
                    abortwhenunrelated=True,
                    ancestorsof=None):
    '''Return a tuple (common, anyincoming, remoteheads) used to identify
    missing nodes from or in remote.
    '''
    start = util.timer()

    roundtrips = 0
    cl = local.changelog
    localsubset = None
    if ancestorsof is not None:
        rev = local.changelog.rev
        localsubset = [rev(n) for n in ancestorsof]
    dag = dagutil.revlogdag(cl, localsubset=localsubset)

    # early exit if we know all the specified remote heads already
    ui.debug("query 1; heads\n")
    roundtrips += 1
    ownheads = dag.heads()
    sample = _limitsample(ownheads, initialsamplesize)
    # indices between sample and externalized version must match
    sample = list(sample)
    if heads:
        srvheadhashes = heads
        yesno = remote.known(dag.externalizeall(sample))
    else:
        batch = remote.iterbatch()
        batch.heads()
        batch.known(dag.externalizeall(sample))
        batch.submit()
        srvheadhashes, yesno = batch.results()

    if cl.tip() == nullid:
        if srvheadhashes != [nullid]:
            return [nullid], True, srvheadhashes
        return [nullid], False, []

    # start actual discovery (we note this before the next "if" for
    # compatibility reasons)
    ui.status(_("searching for changes\n"))

    srvheads = dag.internalizeall(srvheadhashes, filterunknown=True)
    if len(srvheads) == len(srvheadhashes):
        ui.debug("all remote heads known locally\n")
        return (srvheadhashes, False, srvheadhashes,)

    if sample and len(ownheads) <= initialsamplesize and all(yesno):
        ui.note(_("all local heads known remotely\n"))
        ownheadhashes = dag.externalizeall(ownheads)
        return (ownheadhashes, True, srvheadhashes,)

    # full blown discovery

    # own nodes I know we both know
    # treat remote heads (and maybe own heads) as a first implicit sample
    # response
    common = cl.incrementalmissingrevs(srvheads)
    commoninsample = set(n for i, n in enumerate(sample) if yesno[i])
    common.addbases(commoninsample)
    # own nodes where I don't know if remote knows them
    undecided = set(common.missingancestors(ownheads))
    # own nodes I know remote lacks
    missing = set()

    full = False
    while undecided:

        if sample:
            missinginsample = [n for i, n in enumerate(sample) if not yesno[i]]
            missing.update(dag.descendantset(missinginsample, missing))

            undecided.difference_update(missing)

        if not undecided:
            break

        if full or common.hasbases():
            if full:
                ui.note(_("sampling from both directions\n"))
            else:
                ui.debug("taking initial sample\n")
            samplefunc = _takefullsample
            targetsize = fullsamplesize
        else:
            # use even cheaper initial sample
            ui.debug("taking quick initial sample\n")
            samplefunc = _takequicksample
            targetsize = initialsamplesize
        if len(undecided) < targetsize:
            sample = list(undecided)
        else:
            sample = samplefunc(dag, undecided, targetsize)
            sample = _limitsample(sample, targetsize)

        roundtrips += 1
        ui.progress(_('searching'), roundtrips, unit=_('queries'))
        ui.debug("query %i; still undecided: %i, sample size is: %i\n"
                 % (roundtrips, len(undecided), len(sample)))
        # indices between sample and externalized version must match
        sample = list(sample)
        yesno = remote.known(dag.externalizeall(sample))
        full = True

        if sample:
            commoninsample = set(n for i, n in enumerate(sample) if yesno[i])
            common.addbases(commoninsample)
            common.removeancestorsfrom(undecided)

    # heads(common) == heads(common.bases) since common represents common.bases
    # and all its ancestors
    result = dag.headsetofconnecteds(common.bases)
    # common.bases can include nullrev, but our contract requires us to not
    # return any heads in that case, so discard that
    result.discard(nullrev)
    elapsed = util.timer() - start
    ui.progress(_('searching'), None)
    ui.debug("%d total queries in %.4fs\n" % (roundtrips, elapsed))
    msg = ('found %d common and %d unknown server heads,'
           ' %d roundtrips in %.4fs\n')
    missing = set(result) - set(srvheads)
    ui.log('discovery', msg, len(result), len(missing), roundtrips,
           elapsed)

    if not result and srvheadhashes != [nullid]:
        if abortwhenunrelated:
            raise error.Abort(_("repository is unrelated"))
        else:
            ui.warn(_("warning: repository is unrelated\n"))
        return ({nullid}, True, srvheadhashes,)

    anyincoming = (srvheadhashes != [nullid])
    return dag.externalizeall(result), anyincoming, srvheadhashes
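
# Illustration only, not part of this module: a simplified, self-contained
# sketch of the sample/query loop in findcommonheads above. It is NOT the
# algorithm as implemented here. It assumes plain Python sets instead of
# incrementalmissingrevs, uniform random sampling instead of the head- and
# root-biased samplers, and a hypothetical `remote_known` callback standing
# in for the "known" wire command. All names below are made up for the
# example.

```python
import random


def ancestors(parents, nodes):
    """Return `nodes` plus all of their ancestors."""
    seen = set()
    stack = list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, ()))
    return seen


def descendants(parents, nodes):
    """Return `nodes` plus every node descending from one of them."""
    result = set(nodes)
    changed = True
    while changed:
        changed = False
        for child, ps in parents.items():
            if child not in result and any(p in result for p in ps):
                result.add(child)
                changed = True
    return result


def find_common_heads(parents, remote_known, samplesize=4, rng=random):
    """Classify every local node as common or missing; return common heads."""
    undecided = set(parents)
    common = set()
    missing = set()
    roundtrips = 0
    while undecided:
        pool = sorted(undecided)
        if len(pool) <= samplesize:
            sample = pool
        else:
            sample = rng.sample(pool, samplesize)
        roundtrips += 1
        yesno = remote_known(sample)
        for node, known in zip(sample, yesno):
            if known:
                # remote has the node, hence all of its ancestors
                common |= ancestors(parents, [node])
            else:
                # remote lacks the node, hence all of its descendants
                missing |= descendants(parents, [node])
        undecided -= common
        undecided -= missing
    # heads of common: common nodes with no common child
    haschild = {p for c in common for p in parents.get(c, ())}
    return common - haschild, roundtrips


# Local history: 0 <- 1 <- 2 <- 3 <- 4 <- 5, plus a branch 2 <- 6 <- 7.
local_parents = {0: [], 1: [0], 2: [1], 3: [2], 4: [3], 5: [4],
                 6: [2], 7: [6]}
remote_nodes = {0, 1, 2, 3, 6}


def remote_known(nodes):
    return [n in remote_nodes for n in nodes]


heads, roundtrips = find_common_heads(local_parents, remote_known,
                                      rng=random.Random(0))
```

# Each answer classifies more than the sampled node: a "yes" settles every
# ancestor and a "no" settles every descendant, so the undecided set shrinks
# on every roundtrip.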