##// END OF EJS Templates
python-compat: adapt to Python 3.11 BC breakage with `random.sample`...
Raphaël Gomès -
r50542:c217d94c stable
parent child Browse files
Show More
@@ -1,526 +1,526 b''
1 # setdiscovery.py - improved discovery of common nodeset for mercurial
1 # setdiscovery.py - improved discovery of common nodeset for mercurial
2 #
2 #
3 # Copyright 2010 Benoit Boissinot <bboissin@gmail.com>
3 # Copyright 2010 Benoit Boissinot <bboissin@gmail.com>
4 # and Peter Arrenbrecht <peter@arrenbrecht.ch>
4 # and Peter Arrenbrecht <peter@arrenbrecht.ch>
5 #
5 #
6 # This software may be used and distributed according to the terms of the
6 # This software may be used and distributed according to the terms of the
7 # GNU General Public License version 2 or any later version.
7 # GNU General Public License version 2 or any later version.
8 """
8 """
9 Algorithm works in the following way. You have two repository: local and
9 Algorithm works in the following way. You have two repository: local and
10 remote. They both contains a DAG of changelists.
10 remote. They both contains a DAG of changelists.
11
11
12 The goal of the discovery protocol is to find one set of node *common*,
12 The goal of the discovery protocol is to find one set of node *common*,
13 the set of nodes shared by local and remote.
13 the set of nodes shared by local and remote.
14
14
15 One of the issue with the original protocol was latency, it could
15 One of the issue with the original protocol was latency, it could
16 potentially require lots of roundtrips to discover that the local repo was a
16 potentially require lots of roundtrips to discover that the local repo was a
17 subset of remote (which is a very common case, you usually have few changes
17 subset of remote (which is a very common case, you usually have few changes
18 compared to upstream, while upstream probably had lots of development).
18 compared to upstream, while upstream probably had lots of development).
19
19
20 The new protocol only requires one interface for the remote repo: `known()`,
20 The new protocol only requires one interface for the remote repo: `known()`,
21 which given a set of changelists tells you if they are present in the DAG.
21 which given a set of changelists tells you if they are present in the DAG.
22
22
23 The algorithm then works as follow:
23 The algorithm then works as follow:
24
24
25 - We will be using three sets, `common`, `missing`, `unknown`. Originally
25 - We will be using three sets, `common`, `missing`, `unknown`. Originally
26 all nodes are in `unknown`.
26 all nodes are in `unknown`.
27 - Take a sample from `unknown`, call `remote.known(sample)`
27 - Take a sample from `unknown`, call `remote.known(sample)`
28 - For each node that remote knows, move it and all its ancestors to `common`
28 - For each node that remote knows, move it and all its ancestors to `common`
29 - For each node that remote doesn't know, move it and all its descendants
29 - For each node that remote doesn't know, move it and all its descendants
30 to `missing`
30 to `missing`
31 - Iterate until `unknown` is empty
31 - Iterate until `unknown` is empty
32
32
33 There are a couple optimizations, first is instead of starting with a random
33 There are a couple optimizations, first is instead of starting with a random
34 sample of missing, start by sending all heads, in the case where the local
34 sample of missing, start by sending all heads, in the case where the local
35 repo is a subset, you computed the answer in one round trip.
35 repo is a subset, you computed the answer in one round trip.
36
36
37 Then you can do something similar to the bisecting strategy used when
37 Then you can do something similar to the bisecting strategy used when
38 finding faulty changesets. Instead of random samples, you can try picking
38 finding faulty changesets. Instead of random samples, you can try picking
39 nodes that will maximize the number of nodes that will be
39 nodes that will maximize the number of nodes that will be
40 classified with it (since all ancestors or descendants will be marked as well).
40 classified with it (since all ancestors or descendants will be marked as well).
41 """
41 """
42
42
43
43
44 import collections
44 import collections
45 import random
45 import random
46
46
47 from .i18n import _
47 from .i18n import _
48 from .node import nullrev
48 from .node import nullrev
49 from . import (
49 from . import (
50 error,
50 error,
51 policy,
51 policy,
52 util,
52 util,
53 )
53 )
54
54
55
55
def _updatesample(revs, heads, sample, parentfn, quicksamplesize=0):
    """update an existing sample to match the expected size

    The sample is updated with revs exponentially distant from each head of the
    <revs> set. (H~1, H~2, H~4, H~8, etc).

    If a target size is specified, the sampling will stop once this size is
    reached. Otherwise sampling will happen until roots of the <revs> set are
    reached.

    :revs: set of revs we want to discover (if None, assume the whole dag)
    :heads: set of DAG head revs
    :sample: a sample to update (mutated in place)
    :parentfn: a callable to resolve parents for a revision
    :quicksamplesize: optional target size of the sample"""
    dist = {}
    visit = collections.deque(heads)
    seen = set()
    factor = 1
    while visit:
        curr = visit.popleft()
        if curr in seen:
            continue
        # distance of a head from itself is 1, so the first sampled revs
        # are the heads themselves, then H~1, H~2, H~4, ...
        d = dist.setdefault(curr, 1)
        if d > factor:
            factor *= 2
        if d == factor:
            sample.add(curr)
            if quicksamplesize and (len(sample) >= quicksamplesize):
                return
        seen.add(curr)

        for p in parentfn(curr):
            if p != nullrev and (not revs or p in revs):
                dist.setdefault(p, d + 1)
                visit.append(p)
92
92
93
93
94 def _limitsample(sample, desiredlen, randomize=True):
94 def _limitsample(sample, desiredlen, randomize=True):
95 """return a random subset of sample of at most desiredlen item.
95 """return a random subset of sample of at most desiredlen item.
96
96
97 If randomize is False, though, a deterministic subset is returned.
97 If randomize is False, though, a deterministic subset is returned.
98 This is meant for integration tests.
98 This is meant for integration tests.
99 """
99 """
100 if len(sample) <= desiredlen:
100 if len(sample) <= desiredlen:
101 return sample
101 return sample
102 sample = list(sample)
102 if randomize:
103 if randomize:
103 return set(random.sample(sample, desiredlen))
104 return set(random.sample(sample, desiredlen))
104 sample = list(sample)
105 sample.sort()
105 sample.sort()
106 return set(sample[:desiredlen])
106 return set(sample[:desiredlen])
107
107
108
108
class partialdiscovery:
    """an object representing ongoing discovery

    Feed with data from the remote repository, this object keep track of the
    current set of changeset in various states:

    - common: revs also known remotely
    - undecided: revs we don't have information on yet
    - missing: revs missing remotely
    (all tracked revisions are known locally)
    """

    def __init__(self, repo, targetheads, respectsize, randomize=True):
        self._repo = repo
        self._targetheads = targetheads
        self._common = repo.changelog.incrementalmissingrevs()
        self._undecided = None
        self.missing = set()
        self._childrenmap = None
        self._respectsize = respectsize
        self.randomize = randomize

    def addcommons(self, commons):
        """register nodes known as common"""
        self._common.addbases(commons)
        if self._undecided is not None:
            self._common.removeancestorsfrom(self._undecided)

    def addmissings(self, missings):
        """register some nodes as missing"""
        newmissing = self._repo.revs(b'%ld::%ld', missings, self.undecided)
        if newmissing:
            self.missing.update(newmissing)
            self.undecided.difference_update(newmissing)

    def addinfo(self, sample):
        """consume an iterable of (rev, known) tuples"""
        common = set()
        missing = set()
        for rev, known in sample:
            if known:
                common.add(rev)
            else:
                missing.add(rev)
        if common:
            self.addcommons(common)
        if missing:
            self.addmissings(missing)

    def hasinfo(self):
        """return True is we have any clue about the remote state"""
        return self._common.hasbases()

    def iscomplete(self):
        """True if all the necessary data have been gathered"""
        return self._undecided is not None and not self._undecided

    @property
    def undecided(self):
        # lazily computed: the initial undecided set is everything not an
        # ancestor of the already-known common bases
        if self._undecided is not None:
            return self._undecided
        self._undecided = set(self._common.missingancestors(self._targetheads))
        return self._undecided

    def stats(self):
        return {
            'undecided': len(self.undecided),
        }

    def commonheads(self):
        """the heads of the known common set"""
        # heads(common) == heads(common.bases) since common represents
        # common.bases and all its ancestors
        return self._common.basesheads()

    def _parentsgetter(self):
        getrev = self._repo.changelog.index.__getitem__

        def getparents(r):
            # index entries store the two parent revs at positions 5 and 6
            return getrev(r)[5:7]

        return getparents

    def _childrengetter(self):

        if self._childrenmap is not None:
            # During discovery, the `undecided` set keep shrinking.
            # Therefore, the map computed for an iteration N will be
            # valid for iteration N+1. Instead of computing the same
            # data over and over we cached it the first time.
            return self._childrenmap.__getitem__

        # _updatesample() essentially does interaction over revisions to look
        # up their children. This lookup is expensive and doing it in a loop is
        # quadratic. We precompute the children for all relevant revisions and
        # make the lookup in _updatesample() a simple dict lookup.
        self._childrenmap = children = {}

        parentrevs = self._parentsgetter()
        revs = self.undecided

        for rev in sorted(revs):
            # Always ensure revision has an entry so we don't need to worry
            # about missing keys.
            children[rev] = []
            for prev in parentrevs(rev):
                if prev == nullrev:
                    continue
                c = children.get(prev)
                if c is not None:
                    c.append(rev)
        return children.__getitem__

    def takequicksample(self, headrevs, size):
        """takes a quick sample of size <size>

        It is meant for initial sampling and focuses on querying heads and close
        ancestors of heads.

        :headrevs: set of head revisions in local DAG to consider
        :size: the maximum size of the sample"""
        revs = self.undecided
        if len(revs) <= size:
            return list(revs)
        sample = set(self._repo.revs(b'heads(%ld)', revs))

        if len(sample) >= size:
            return _limitsample(sample, size, randomize=self.randomize)

        _updatesample(
            None, headrevs, sample, self._parentsgetter(), quicksamplesize=size
        )
        return sample

    def takefullsample(self, headrevs, size):
        """take a sample of <size> revisions, biased towards "significant" revs

        The sample mixes heads and roots of the undecided set so each query
        classifies as many revisions as possible.

        :headrevs: set of head revisions in local DAG to consider
        :size: the maximum size of the sample"""
        revs = self.undecided
        if len(revs) <= size:
            return list(revs)
        repo = self._repo
        sample = set(repo.revs(b'heads(%ld)', revs))
        parentrevs = self._parentsgetter()

        # update from heads
        revsheads = sample.copy()
        _updatesample(revs, revsheads, sample, parentrevs)

        # update from roots
        revsroots = set(repo.revs(b'roots(%ld)', revs))
        childrenrevs = self._childrengetter()
        _updatesample(revs, revsroots, sample, childrenrevs)
        assert sample

        if not self._respectsize:
            size = max(size, min(len(revsroots), len(revsheads)))

        sample = _limitsample(sample, size, randomize=self.randomize)
        if len(sample) < size:
            # top up with arbitrary undecided revs to reach the target size
            more = size - len(sample)
            takefrom = list(revs - sample)
            if self.randomize:
                sample.update(random.sample(takefrom, more))
            else:
                takefrom.sort()
                sample.update(takefrom[:more])
        return sample
274
274
275
275
# Keep a reference to the pure-Python implementation, then let the Rust
# extension (when available) override `partialdiscovery` with its faster
# PartialDiscovery; `policy.importrust` falls back to the default otherwise.
pure_partialdiscovery = partialdiscovery

partialdiscovery = policy.importrust(
    'discovery', member='PartialDiscovery', default=partialdiscovery
)
281
281
282
282
def findcommonheads(
    ui,
    local,
    remote,
    abortwhenunrelated=True,
    ancestorsof=None,
    audit=None,
):
    """Return a tuple (common, anyincoming, remoteheads) used to identify
    missing nodes from or in remote.

    The audit argument is an optional dictionnary that a caller can pass. it
    will be updated with extra data about the discovery, this is useful for
    debug.
    """

    samplegrowth = float(ui.config(b'devel', b'discovery.grow-sample.rate'))

    if audit is not None:
        audit[b'total-queries'] = 0

    start = util.timer()

    roundtrips = 0
    cl = local.changelog
    clnode = cl.node
    clrev = cl.rev

    if ancestorsof is not None:
        ownheads = [clrev(n) for n in ancestorsof]
    else:
        ownheads = [rev for rev in cl.headrevs() if rev != nullrev]

    initial_head_exchange = ui.configbool(b'devel', b'discovery.exchange-heads')
    initialsamplesize = ui.configint(b'devel', b'discovery.sample-size.initial')
    fullsamplesize = ui.configint(b'devel', b'discovery.sample-size')
    # We also ask remote about all the local heads. That set can be arbitrarily
    # large, so we used to limit it size to `initialsamplesize`. We no longer
    # do as it proved counter productive. The skipped heads could lead to a
    # large "undecided" set, slower to be clarified than if we asked the
    # question for all heads right away.
    #
    # We are already fetching all server heads using the `heads` commands,
    # sending a equivalent number of heads the other way should not have a
    # significant impact. In addition, it is very likely that we are going to
    # have to issue "known" request for an equivalent amount of revisions in
    # order to decide if theses heads are common or missing.
    #
    # find a detailled analysis below.
    #
    # Case A: local and server both has few heads
    #
    #     Ownheads is below initialsamplesize, limit would not have any effect.
    #
    # Case B: local has few heads and server has many
    #
    #     Ownheads is below initialsamplesize, limit would not have any effect.
    #
    # Case C: local and server both has many heads
    #
    #     We now transfert some more data, but not significantly more than is
    #     already transfered to carry the server heads.
    #
    # Case D: local has many heads, server has few
    #
    #   D.1 local heads are mostly known remotely
    #
    #     All the known head will have be part of a `known` request at some
    #     point for the discovery to finish. Sending them all earlier is
    #     actually helping.
    #
    #     (This case is fairly unlikely, it requires the numerous heads to all
    #     be merged server side in only a few heads)
    #
    #   D.2 local heads are mostly missing remotely
    #
    #     To determine that the heads are missing, we'll have to issue `known`
    #     request for them or one of their ancestors. This amount of `known`
    #     request will likely be in the same order of magnitude than the amount
    #     of local heads.
    #
    #     The only case where we can be more efficient using `known` request on
    #     ancestors are case were all the "missing" local heads are based on a
    #     few changeset, also "missing".  This means we would have a "complex"
    #     graph (with many heads) attached to, but very independant to a the
    #     "simple" graph on the server. This is a fairly usual case and have
    #     not been met in the wild so far.
    if initial_head_exchange:
        if remote.limitedarguments:
            sample = _limitsample(ownheads, initialsamplesize)
            # indices between sample and externalized version must match
            sample = list(sample)
        else:
            sample = ownheads

        ui.debug(b"query 1; heads\n")
        roundtrips += 1
        with remote.commandexecutor() as e:
            fheads = e.callcommand(b'heads', {})
            if audit is not None:
                audit[b'total-queries'] += len(sample)
            fknown = e.callcommand(
                b'known',
                {
                    b'nodes': [clnode(r) for r in sample],
                },
            )

        srvheadhashes, yesno = fheads.result(), fknown.result()

        if audit is not None:
            audit[b'total-roundtrips'] = 1

        if cl.tiprev() == nullrev:
            # empty local repo: the answer only depends on the server side
            if srvheadhashes != [cl.nullid]:
                return [cl.nullid], True, srvheadhashes
            return [cl.nullid], False, []
    else:
        # we still need the remote head for the function return
        with remote.commandexecutor() as e:
            fheads = e.callcommand(b'heads', {})
        srvheadhashes = fheads.result()

    # start actual discovery (we note this before the next "if" for
    # compatibility reasons)
    ui.status(_(b"searching for changes\n"))

    knownsrvheads = []  # revnos of remote heads that are known locally
    for node in srvheadhashes:
        if node == cl.nullid:
            continue

        try:
            knownsrvheads.append(clrev(node))
        # Catches unknown and filtered nodes.
        except error.LookupError:
            continue

    if initial_head_exchange:
        # early exit if we know all the specified remote heads already
        if len(knownsrvheads) == len(srvheadhashes):
            ui.debug(b"all remote heads known locally\n")
            return srvheadhashes, False, srvheadhashes

        if len(sample) == len(ownheads) and all(yesno):
            ui.note(_(b"all local changesets known remotely\n"))
            ownheadhashes = [clnode(r) for r in ownheads]
            return ownheadhashes, True, srvheadhashes

    # full blown discovery

    # if the server has a limit to its arguments size, we can't grow the sample.
    configbool = local.ui.configbool
    grow_sample = configbool(b'devel', b'discovery.grow-sample')
    grow_sample = grow_sample and not remote.limitedarguments

    dynamic_sample = configbool(b'devel', b'discovery.grow-sample.dynamic')
    hard_limit_sample = not (dynamic_sample or remote.limitedarguments)

    randomize = ui.configbool(b'devel', b'discovery.randomize')
    if cl.index.rust_ext_compat:
        pd = partialdiscovery
    else:
        pd = pure_partialdiscovery
    disco = pd(local, ownheads, hard_limit_sample, randomize=randomize)
    if initial_head_exchange:
        # treat remote heads (and maybe own heads) as a first implicit sample
        # response
        disco.addcommons(knownsrvheads)
        disco.addinfo(zip(sample, yesno))

    full = not initial_head_exchange
    progress = ui.makeprogress(_(b'searching'), unit=_(b'queries'))
    while not disco.iscomplete():

        if full or disco.hasinfo():
            if full:
                ui.note(_(b"sampling from both directions\n"))
            else:
                ui.debug(b"taking initial sample\n")
            samplefunc = disco.takefullsample
            targetsize = fullsamplesize
            if grow_sample:
                fullsamplesize = int(fullsamplesize * samplegrowth)
        else:
            # use even cheaper initial sample
            ui.debug(b"taking quick initial sample\n")
            samplefunc = disco.takequicksample
            targetsize = initialsamplesize
        sample = samplefunc(ownheads, targetsize)

        roundtrips += 1
        progress.update(roundtrips)
        stats = disco.stats()
        ui.debug(
            b"query %i; still undecided: %i, sample size is: %i\n"
            % (roundtrips, stats['undecided'], len(sample))
        )

        # indices between sample and externalized version must match
        sample = list(sample)

        with remote.commandexecutor() as e:
            if audit is not None:
                audit[b'total-queries'] += len(sample)
            yesno = e.callcommand(
                b'known',
                {
                    b'nodes': [clnode(r) for r in sample],
                },
            ).result()

        full = True

        disco.addinfo(zip(sample, yesno))

    result = disco.commonheads()
    elapsed = util.timer() - start
    progress.complete()
    ui.debug(b"%d total queries in %.4fs\n" % (roundtrips, elapsed))
    msg = (
        b'found %d common and %d unknown server heads,'
        b' %d roundtrips in %.4fs\n'
    )
    missing = set(result) - set(knownsrvheads)
    ui.log(b'discovery', msg, len(result), len(missing), roundtrips, elapsed)

    if audit is not None:
        audit[b'total-roundtrips'] = roundtrips

    if not result and srvheadhashes != [cl.nullid]:
        if abortwhenunrelated:
            raise error.Abort(_(b"repository is unrelated"))
        else:
            ui.warn(_(b"warning: repository is unrelated\n"))
        return (
            {cl.nullid},
            True,
            srvheadhashes,
        )

    anyincoming = srvheadhashes != [cl.nullid]
    result = {clnode(r) for r in result}
    return result, anyincoming, srvheadhashes
General Comments 0
You need to be logged in to leave comments. Login now