clone-bundles: garbage collect older bundle when generating new ones...
marmoute -
r51300:971dc236 default
@@ -1,832 +1,903 b''
1 1 # This software may be used and distributed according to the terms of the
2 2 # GNU General Public License version 2 or any later version.
3 3
4 4 """advertise pre-generated bundles to seed clones
5 5
6 6 "clonebundles" is a server-side extension used to advertise the existence
7 7 of pre-generated, externally hosted bundle files to clients that are
8 8 cloning so that cloning can be faster, more reliable, and require fewer
9 9 resources on the server. "pullbundles" is a related feature for sending
10 10 pre-generated bundle files to clients as part of pull operations.
11 11
12 12 Cloning can be a CPU and I/O intensive operation on servers. Traditionally,
13 13 the server, in response to a client's request to clone, dynamically generates
14 14 a bundle containing the entire repository content and sends it to the client.
15 15 There is no caching on the server and the server will have to redundantly
16 16 generate the same outgoing bundle in response to each clone request. For
17 17 servers with large repositories or with high clone volume, the load from
18 18 clones can make scaling the server challenging and costly.
19 19
20 20 This extension provides server operators the ability to offload
21 21 potentially expensive clone load to an external service. Pre-generated
22 22 bundles also allow using more CPU intensive compression, reducing the
23 23 effective bandwidth requirements.
24 24
25 25 Here's how clone bundles work:
26 26
27 27 1. A server operator establishes a mechanism for making bundle files available
28 28 on a hosting service where Mercurial clients can fetch them.
29 29 2. A manifest file listing available bundle URLs and some optional metadata
30 30 is added to the Mercurial repository on the server.
31 31 3. A client initiates a clone against a clone bundles aware server.
32 32 4. The client sees the server is advertising clone bundles and fetches the
33 33 manifest listing available bundles.
34 34 5. The client filters and sorts the available bundles based on what it
35 35 supports and prefers.
36 36 6. The client downloads and applies an available bundle from the
37 37 server-specified URL.
38 38 7. The client reconnects to the original server and performs the equivalent
39 39 of :hg:`pull` to retrieve all repository data not in the bundle. (The
40 40 repository could have been updated between when the bundle was created
41 41 and when the client started the clone.) This may use "pullbundles".
42 42
43 43 Instead of the server generating full repository bundles for every clone
44 44 request, it generates full bundles once and they are subsequently reused to
45 45 bootstrap new clones. The server may still transfer data at clone time.
46 46 However, this is only data that has been added/changed since the bundle was
47 47 created. For large, established repositories, this can reduce server load for
48 48 clones to less than 1% of original.
49 49
50 50 Here's how pullbundles work:
51 51
52 52 1. A manifest file listing available bundles and describing the revisions
53 53 is added to the Mercurial repository on the server.
54 54 2. A new-enough client informs the server that it supports partial pulls
55 55 and initiates a pull.
56 56 3. If the server has pull bundles enabled and sees the client advertising
57 57 partial pulls, it checks for a matching pull bundle in the manifest.
58 58 A bundle matches if the format is supported by the client, the client
59 59 has the required revisions already and needs something from the bundle.
60 60 4. If there is at least one matching bundle, the server sends it to the client.
61 61 5. The client applies the bundle and notices that the server reply was
62 62 incomplete. It initiates another pull.
63 63
64 64 To work, this extension requires the following of server operators:
65 65
66 66 * Generating bundle files of repository content (typically periodically,
67 67 such as once per day).
68 68 * Clone bundles: A file server that clients have network access to and that
69 69 Python knows how to talk to through its normal URL handling facility
70 70 (typically an HTTP/HTTPS server).
71 71 * A process for keeping the bundles manifest in sync with available bundle
72 72 files.
73 73
74 74 Strictly speaking, using a static file hosting server isn't required: a server
75 75 operator could use a dynamic service for retrieving bundle data. However,
76 76 static file hosting services are simple and scalable and should be sufficient
77 77 for most needs.
78 78
79 79 Bundle files can be generated with the :hg:`bundle` command. Typically
80 80 :hg:`bundle --all` is used to produce a bundle of the entire repository.
81 81
82 82 :hg:`debugcreatestreamclonebundle` can be used to produce a special
83 83 *streaming clonebundle*. These are bundle files that are extremely efficient
84 84 to produce and consume (read: fast). However, they are larger than
85 85 traditional bundle formats and require that clients support the exact set
86 86 of repository data store formats in use by the repository that created them.
87 87 Typically, a newer server can serve data that is compatible with older clients.
88 88 However, *streaming clone bundles* don't have this guarantee. **Server
89 89 operators need to be aware that newer versions of Mercurial may produce
90 90 streaming clone bundles incompatible with older Mercurial versions.**
91 91
92 92 A server operator is responsible for creating a ``.hg/clonebundles.manifest``
93 93 file containing the list of available bundle files suitable for seeding
94 94 clones. If this file does not exist, the repository will not advertise the
95 95 existence of clone bundles when clients connect. For pull bundles,
96 96 ``.hg/pullbundles.manifest`` is used.
97 97
98 98 The manifest file contains a newline (\\n) delimited list of entries.
99 99
100 100 Each line in this file defines an available bundle. Lines have the format:
101 101
102 102 <URL> [<key>=<value>[ <key>=<value>]]
103 103
104 104 That is, a URL followed by an optional, space-delimited list of key=value
105 105 pairs describing additional properties of this bundle. Both keys and values
106 106 are URI encoded.
107 107
108 108 For pull bundles, the URL is a path under the ``.hg`` directory of the
109 109 repository.
110 110
111 111 Keys in UPPERCASE are reserved for use by Mercurial and are defined below.
112 112 All non-uppercase keys can be used by site installations. An example use
113 113 for custom properties is to use the *datacenter* attribute to define which
114 114 data center a file is hosted in. Clients could then prefer a server in the
115 115 data center closest to them.
116 116
117 117 The following reserved keys are currently defined:
118 118
119 119 BUNDLESPEC
120 120 A "bundle specification" string that describes the type of the bundle.
121 121
122 122 These are string values that are accepted by the "--type" argument of
123 123 :hg:`bundle`.
124 124
125 125 The values are parsed in strict mode, which means they must be of the
126 126 "<compression>-<type>" form. See
127 127 mercurial.exchange.parsebundlespec() for more details.
128 128
129 129 :hg:`debugbundle --spec` can be used to print the bundle specification
130 130 string for a bundle file. The output of this command can be used verbatim
131 131 for the value of ``BUNDLESPEC`` (it is already escaped).
132 132
133 133 Clients will automatically filter out specifications that are unknown or
134 134 unsupported so they won't attempt to download something that likely won't
135 135 apply.
136 136
137 137 The actual value doesn't impact client behavior beyond filtering:
138 138 clients will still sniff the bundle type from the header of downloaded
139 139 files.
140 140
141 141 **Use of this key is highly recommended**, as it allows clients to
142 142 easily skip unsupported bundles. If this key is not defined, an old
143 143 client may attempt to apply a bundle that it is incapable of reading.
144 144
145 145 REQUIRESNI
146 146 Whether Server Name Indication (SNI) is required to connect to the URL.
147 147 SNI allows servers to use multiple certificates on the same IP. It is
148 148 somewhat common in CDNs and other hosting providers. Older Python
149 149 versions do not support SNI. Defining this attribute enables clients
150 150 with older Python versions to filter this entry without experiencing
151 151 an opaque SSL failure at connection time.
152 152
153 153 If this is defined, it is important to advertise a non-SNI fallback
154 154 URL or clients running old Python releases may not be able to clone
155 155 with the clonebundles facility.
156 156
157 157 Value should be "true".
158 158
159 159 REQUIREDRAM
160 160 Value specifies expected memory requirements to decode the payload.
161 161 Values can have suffixes for common bytes sizes. e.g. "64MB".
162 162
163 163 This key is often used with zstd-compressed bundles using a high
164 164 compression level / window size, which can require 100+ MB of memory
165 165 to decode.
166 166
167 167 heads
168 168 Used for pull bundles. This contains the ``;`` separated changeset
169 169 hashes of the heads of the bundle content.
170 170
171 171 bases
172 172 Used for pull bundles. This contains the ``;`` separated changeset
173 173 hashes of the roots of the bundle content. This can be skipped if
174 174 the bundle was created without ``--base``.
175 175
176 176 Manifests can contain multiple entries. Assuming metadata is defined, clients
177 177 will filter entries from the manifest that they don't support. The remaining
178 178 entries are optionally sorted by client preferences
179 179 (``ui.clonebundleprefers`` config option). The client then attempts
180 180 to fetch the bundle at the first URL in the remaining list.
181 181
182 182 **Errors when downloading a bundle will fail the entire clone operation:
183 183 clients do not automatically fall back to a traditional clone.** The reason
184 184 for this is that if a server is using clone bundles, it is probably doing so
185 185 because the feature is necessary to help it scale. In other words, there
186 186 is an assumption that clone load will be offloaded to another service and
187 187 that the Mercurial server isn't responsible for serving this clone load.
188 188 If that other service experiences issues and clients start mass falling back to
189 189 the original Mercurial server, the added clone load could overwhelm the server
190 190 due to unexpected load and effectively take it offline. Not having clients
191 191 automatically fall back to cloning from the original server mitigates this
192 192 scenario.
193 193
194 194 Because there is no automatic Mercurial server fallback on failure of the
195 195 bundle hosting service, it is important for server operators to view the bundle
196 196 hosting service as an extension of the Mercurial server in terms of
197 197 availability and service level agreements: if the bundle hosting service goes
198 198 down, so does the ability for clients to clone. Note: clients will see a
199 199 message informing them how to bypass the clone bundles facility when a failure
200 200 occurs. So server operators should prepare for some people to follow these
201 201 instructions when a failure occurs, thus driving more load to the original
202 202 Mercurial server when the bundle hosting service fails.
203 203
204 204
205 205 auto-generation of clone bundles
206 206 --------------------------------
207 207
208 208 It is possible to set Mercurial to automatically re-generate clone bundles when
209 209 new content is available.
210 210
211 211 Mercurial will take care of the process asynchronously. The defined list of
212 bundle type will be generated, uploaded, and advertised.
212 bundle-type will be generated, uploaded, and advertised. Older bundles will get
213 decommissioned as newer ones replace them.
213 214
214 215 Bundles Generation:
215 216 ...................
216 217
217 218 The extension can generate multiple variants of the clone bundle. Each
218 219 different variant will be defined by the "bundle-spec" they use::
219 220
220 221 [clone-bundles]
221 222 auto-generate.formats= zstd-v2, gzip-v2
222 223
223 224 See `hg help bundlespec` for details about available options.
224 225
225 226 Bundles Upload and Serving:
226 227 ...........................
227 228
228 229 The generated bundles need to be made available to users through a "public" URL.
229 230 This should be done through the `clone-bundles.upload-command` configuration. The
230 231 value of this command should be a shell command. It will have access to the
231 232 bundle file path through the `$HGCB_BUNDLE_PATH` variable, and to the expected
232 233 basename in the "public" URL through the `$HGCB_BUNDLE_BASENAME` variable::
233 234
234 235 [clone-bundles]
235 236 upload-command=sftp put $HGCB_BUNDLE_PATH \
236 237 sftp://bundles.host/clone-bundles/$HGCB_BUNDLE_BASENAME
237 238
239 If the file was already uploaded, the command must still succeed.
240
238 241 After upload, the file should be available at a URL defined by
239 242 `clone-bundles.url-template`::
240 243
241 244 [clone-bundles]
242 245 url-template=https://bundles.host/cache/clone-bundles/{basename}
246
247 Old bundles cleanup:
248 ....................
249
250 When new bundles are generated, the older ones are no longer necessary and can
251 be removed from storage. This is done through the `clone-bundles.delete-command`
252 configuration. The command is given the URL of the artifact to delete through
253 `$HGCB_BUNDLE_URL` and its basename through `$HGCB_BASENAME`::
254
255 [clone-bundles]
256 delete-command=sftp rm sftp://bundles.host/clone-bundles/$HGCB_BASENAME
257
258 If the file was already deleted, the command must still succeed.
243 259 """
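
Reviewer note: the manifest format described in the docstring above (a URL followed by space-delimited, URI-encoded key=value pairs) can be illustrated with a small standalone parser. This is only a sketch under stated assumptions: `parse_manifest_line` and `parse_manifest` are hypothetical names, the URLs are made up, and Mercurial's own parsing is not reproduced here.

```python
from urllib.parse import unquote


def parse_manifest_line(line):
    """Split one manifest line into (url, attributes).

    Hypothetical helper for illustration; not Mercurial's parser.
    """
    fields = line.split()
    url, raw_attrs = fields[0], fields[1:]
    attrs = {}
    for raw in raw_attrs:
        # Keys and values are URI encoded, so decode both sides.
        key, _, value = raw.partition("=")
        attrs[unquote(key)] = unquote(value)
    return url, attrs


def parse_manifest(text):
    """Parse newline-delimited manifest content, skipping empty lines."""
    return [parse_manifest_line(l) for l in text.splitlines() if l.strip()]


# Made-up entries: one with reserved (UPPERCASE) keys only, one that also
# carries a site-specific lowercase key.
example = (
    "https://bundles.host/full-zstd.hg BUNDLESPEC=zstd-v2 REQUIRESNI=true\n"
    "https://bundles.host/full-gzip.hg BUNDLESPEC=gzip-v2 datacenter=eu%2Dwest\n"
)
entries = parse_manifest(example)
```

A client-side filter would then drop entries whose `BUNDLESPEC` it cannot read before sorting the remainder by preference.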
244 260
245 261
246 262 import os
247 263 import weakref
248 264
249 265 from mercurial.i18n import _
250 266
251 267 from mercurial import (
252 268 bundlecaches,
253 269 commands,
254 270 error,
255 271 extensions,
256 272 localrepo,
257 273 lock,
258 274 node,
259 275 registrar,
260 276 util,
261 277 wireprotov1server,
262 278 )
263 279
264 280
265 281 from mercurial.utils import (
266 282 procutil,
267 283 )
268 284
269 285 testedwith = b'ships-with-hg-core'
270 286
271 287
272 288 def capabilities(orig, repo, proto):
273 289 caps = orig(repo, proto)
274 290
275 291 # Only advertise if a manifest exists. This does add some I/O to requests.
276 292 # But this should be cheaper than a wasted network round trip due to
277 293 # missing file.
278 294 if repo.vfs.exists(bundlecaches.CB_MANIFEST_FILE):
279 295 caps.append(b'clonebundles')
280 296
281 297 return caps
282 298
283 299
284 300 def extsetup(ui):
285 301 extensions.wrapfunction(wireprotov1server, b'_capabilities', capabilities)
286 302
287 303
288 304 # logic for bundle auto-generation
289 305
290 306
291 307 configtable = {}
292 308 configitem = registrar.configitem(configtable)
293 309
294 310 cmdtable = {}
295 311 command = registrar.command(cmdtable)
296 312
297 313 configitem(b'clone-bundles', b'auto-generate.formats', default=list)
298 314
299 315
300 316 configitem(b'clone-bundles', b'upload-command', default=None)
301 317
318 configitem(b'clone-bundles', b'delete-command', default=None)
319
302 320 configitem(b'clone-bundles', b'url-template', default=None)
303 321
304 322 configitem(b'devel', b'debug.clonebundles', default=False)
305 323
306 324
307 325 # category for the post-close transaction hooks
308 326 CAT_POSTCLOSE = b"clonebundles-autobundles"
309 327
310 328 # template for bundle file names
311 329 BUNDLE_MASK = (
312 330 b"full-%(bundle_type)s-%(revs)d_revs-%(tip_short)s_tip-%(op_id)s.hg"
313 331 )
314 332
315 333
316 334 # file in .hg/ used to track clonebundles being auto-generated
317 335 AUTO_GEN_FILE = b'clonebundles.auto-gen'
318 336
319 337
320 338 class BundleBase(object):
321 339 """represents the core of properties that matters for us in a bundle
322 340
323 341 :bundle_type: the bundlespec (see hg help bundlespec)
324 342 :revs: the number of revisions in the repo at bundle creation time
325 343 :tip_rev: the rev-num of the tip revision
326 344 :tip_node: the node id of the tip-most revision in the bundle
327 345
328 346 :ready: True if the bundle is ready to be served
329 347 """
330 348
331 349 ready = False
332 350
333 351 def __init__(self, bundle_type, revs, tip_rev, tip_node):
334 352 self.bundle_type = bundle_type
335 353 self.revs = revs
336 354 self.tip_rev = tip_rev
337 355 self.tip_node = tip_node
338 356
339 357 def valid_for(self, repo):
340 358 """is this bundle applicable to the current repository
341 359
342 360 This is useful for detecting bundles made irrelevant by stripping.
343 361 """
344 362 tip_node = node.bin(self.tip_node)
345 363 return repo.changelog.index.get_rev(tip_node) == self.tip_rev
346 364
347 365 def __eq__(self, other):
348 366 left = (self.ready, self.bundle_type, self.tip_rev, self.tip_node)
349 367 right = (other.ready, other.bundle_type, other.tip_rev, other.tip_node)
350 368 return left == right
351 369
352 370 def __ne__(self, other):
353 371 return not self == other
354 372
355 373 def __cmp__(self, other):
356 374 if self == other:
357 375 return 0
358 376 return -1
359 377
360 378
361 379 class RequestedBundle(BundleBase):
362 380 """A bundle that should be generated.
363 381
364 382 Additional attributes compared to BundleBase
365 383 :head_revs: list of head revisions (as rev-num)
366 384 :op_id: a "unique" identifier for the operation triggering the change
367 385 """
368 386
369 387 def __init__(self, bundle_type, revs, tip_rev, tip_node, head_revs, op_id):
370 388 self.head_revs = head_revs
371 389 self.op_id = op_id
372 390 super(RequestedBundle, self).__init__(
373 391 bundle_type,
374 392 revs,
375 393 tip_rev,
376 394 tip_node,
377 395 )
378 396
379 397 @property
380 398 def suggested_filename(self):
381 399 """A filename that can be used for the generated bundle"""
382 400 data = {
383 401 b'bundle_type': self.bundle_type,
384 402 b'revs': self.revs,
385 403 b'heads': self.head_revs,
386 404 b'tip_rev': self.tip_rev,
387 405 b'tip_node': self.tip_node,
388 406 b'tip_short': self.tip_node[:12],
389 407 b'op_id': self.op_id,
390 408 }
391 409 return BUNDLE_MASK % data
392 410
393 411 def generate_bundle(self, repo, file_path):
394 412 """generate the bundle at `file_path`"""
395 413 commands.bundle(
396 414 repo.ui,
397 415 repo,
398 416 file_path,
399 417 base=[b"null"],
400 418 rev=self.head_revs,
401 419 type=self.bundle_type,
402 420 quiet=True,
403 421 )
404 422
405 423 def generating(self, file_path, hostname=None, pid=None):
406 424 """return a GeneratingBundle object from this object"""
407 425 if pid is None:
408 426 pid = os.getpid()
409 427 if hostname is None:
410 428 hostname = lock._getlockprefix()
411 429 return GeneratingBundle(
412 430 self.bundle_type,
413 431 self.revs,
414 432 self.tip_rev,
415 433 self.tip_node,
416 434 hostname,
417 435 pid,
418 436 file_path,
419 437 )
420 438
421 439
422 440 class GeneratingBundle(BundleBase):
423 441 """A bundle being generated
424 442
425 443 extra attributes compared to BundleBase:
426 444
427 445 :hostname: the hostname of the machine generating the bundle
428 446 :pid: the pid of the process generating the bundle
429 447 :filepath: the target filename of the bundle
430 448
431 449 These attributes exist to help detect stalled generation processes.
432 450 """
433 451
434 452 ready = False
435 453
436 454 def __init__(
437 455 self, bundle_type, revs, tip_rev, tip_node, hostname, pid, filepath
438 456 ):
439 457 self.hostname = hostname
440 458 self.pid = pid
441 459 self.filepath = filepath
442 460 super(GeneratingBundle, self).__init__(
443 461 bundle_type, revs, tip_rev, tip_node
444 462 )
445 463
446 464 @classmethod
447 465 def from_line(cls, line):
448 466 """create an object by deserializing a line from AUTO_GEN_FILE"""
449 467 assert line.startswith(b'PENDING-v1 ')
450 468 (
451 469 __,
452 470 bundle_type,
453 471 revs,
454 472 tip_rev,
455 473 tip_node,
456 474 hostname,
457 475 pid,
458 476 filepath,
459 477 ) = line.split()
460 478 hostname = util.urlreq.unquote(hostname)
461 479 filepath = util.urlreq.unquote(filepath)
462 480 revs = int(revs)
463 481 tip_rev = int(tip_rev)
464 482 pid = int(pid)
465 483 return cls(
466 484 bundle_type, revs, tip_rev, tip_node, hostname, pid, filepath
467 485 )
468 486
469 487 def to_line(self):
470 488 """serialize the object to include as a line in AUTO_GEN_FILE"""
471 489 templ = b"PENDING-v1 %s %d %d %s %s %d %s"
472 490 data = (
473 491 self.bundle_type,
474 492 self.revs,
475 493 self.tip_rev,
476 494 self.tip_node,
477 495 util.urlreq.quote(self.hostname),
478 496 self.pid,
479 497 util.urlreq.quote(self.filepath),
480 498 )
481 499 return templ % data
482 500
483 501 def __eq__(self, other):
484 502 if not super(GeneratingBundle, self).__eq__(other):
485 503 return False
486 504 left = (self.hostname, self.pid, self.filepath)
487 505 right = (other.hostname, other.pid, other.filepath)
488 506 return left == right
489 507
490 508 def uploaded(self, url, basename):
491 509 """return a GeneratedBundle from this object"""
492 510 return GeneratedBundle(
493 511 self.bundle_type,
494 512 self.revs,
495 513 self.tip_rev,
496 514 self.tip_node,
497 515 url,
498 516 basename,
499 517 )
500 518
501 519
502 520 class GeneratedBundle(BundleBase):
503 521 """A bundle that is done being generated and can be served
504 522
505 523 extra attributes compared to BundleBase:
506 524
507 525 :file_url: the url where the bundle is available.
508 526 :basename: the "basename" used to upload (useful for deletion)
509 527
510 528 These attributes exist to generate a bundle manifest
511 529 (.hg/clonebundles.manifest)
512 530 """
513 531
514 532 ready = True
515 533
516 534 def __init__(
517 535 self, bundle_type, revs, tip_rev, tip_node, file_url, basename
518 536 ):
519 537 self.file_url = file_url
520 538 self.basename = basename
521 539 super(GeneratedBundle, self).__init__(
522 540 bundle_type, revs, tip_rev, tip_node
523 541 )
524 542
525 543 @classmethod
526 544 def from_line(cls, line):
527 545 """create an object by deserializing a line from AUTO_GEN_FILE"""
528 546 assert line.startswith(b'DONE-v1 ')
529 547 (
530 548 __,
531 549 bundle_type,
532 550 revs,
533 551 tip_rev,
534 552 tip_node,
535 553 file_url,
536 554 basename,
537 555 ) = line.split()
538 556 revs = int(revs)
539 557 tip_rev = int(tip_rev)
540 558 file_url = util.urlreq.unquote(file_url)
541 559 return cls(bundle_type, revs, tip_rev, tip_node, file_url, basename)
542 560
543 561 def to_line(self):
544 562 """serialize the object to include as a line in AUTO_GEN_FILE"""
545 563 templ = b"DONE-v1 %s %d %d %s %s %s"
546 564 data = (
547 565 self.bundle_type,
548 566 self.revs,
549 567 self.tip_rev,
550 568 self.tip_node,
551 569 util.urlreq.quote(self.file_url),
552 570 self.basename,
553 571 )
554 572 return templ % data
555 573
556 574 def manifest_line(self):
557 575 """serialize the object to include as a line in clonebundles.manifest"""
558 576 templ = b"%s BUNDLESPEC=%s REQUIRESNI=true"
559 577 return templ % (self.file_url, self.bundle_type)
560 578
561 579 def __eq__(self, other):
562 580 if not super(GeneratedBundle, self).__eq__(other):
563 581 return False
564 582 return self.file_url == other.file_url
565 583
566 584
567 585 def parse_auto_gen(content):
568 586 """parse the AUTO_GEN_FILE to return a list of Bundle objects"""
569 587 bundles = []
570 588 for line in content.splitlines():
571 589 if line.startswith(b'PENDING-v1 '):
572 590 bundles.append(GeneratingBundle.from_line(line))
573 591 elif line.startswith(b'DONE-v1 '):
574 592 bundles.append(GeneratedBundle.from_line(line))
575 593 return bundles
576 594
577 595
578 596 def dumps_auto_gen(bundles):
579 597 """serialize a list of Bundle objects as AUTO_GEN_FILE content"""
580 598 lines = []
581 599 for b in bundles:
582 600 lines.append(b"%s\n" % b.to_line())
583 601 lines.sort()
584 602 return b"".join(lines)
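
Reviewer note: the `DONE-v1` line format handled by `GeneratedBundle.to_line` and `from_line` can be exercised in isolation. The sketch below is an assumption-laden stand-in: it uses `str` and `urllib.parse` directly instead of Mercurial's bytes-based `util.urlreq`, the function names are hypothetical, and the record values are invented.

```python
from urllib.parse import quote, unquote


def done_to_line(bundle_type, revs, tip_rev, tip_node, file_url, basename):
    """Serialize one finished-bundle record as a single state-file line."""
    # Quoting the URL guarantees it holds no whitespace, so the line can
    # later be split on whitespace without ambiguity.
    return "DONE-v1 %s %d %d %s %s %s" % (
        bundle_type,
        revs,
        tip_rev,
        tip_node,
        quote(file_url),
        basename,
    )


def done_from_line(line):
    """Inverse of done_to_line: rebuild the record from one line."""
    assert line.startswith("DONE-v1 ")
    _, bundle_type, revs, tip_rev, tip_node, file_url, basename = line.split()
    return (
        bundle_type,
        int(revs),
        int(tip_rev),
        tip_node,
        unquote(file_url),
        basename,
    )


# Invented record; the space in the URL shows why quoting matters.
record = (
    "zstd-v2",
    120,
    119,
    "971dc236aaaa",
    "https://bundles.host/clone bundles/full.hg",
    "full.hg",
)
line = done_to_line(*record)
roundtrip = done_from_line(line)
```

Note that in this sketch, as in the diff, only the URL is quoted: a basename containing whitespace would break the split.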
585 603
586 604
587 605 def read_auto_gen(repo):
588 606 """read the AUTO_GEN_FILE for <repo> and return a list of Bundle objects"""
589 607 data = repo.vfs.tryread(AUTO_GEN_FILE)
590 608 if not data:
591 609 return []
592 610 return parse_auto_gen(data)
593 611
594 612
595 613 def write_auto_gen(repo, bundles):
596 614 """write a list of Bundle objects into the repo's AUTO_GEN_FILE"""
597 615 assert repo._cb_lock_ref is not None
598 616 data = dumps_auto_gen(bundles)
599 617 with repo.vfs(AUTO_GEN_FILE, mode=b'wb', atomictemp=True) as f:
600 618 f.write(data)
601 619
602 620
603 621 def generate_manifest(bundles):
604 622 """serialize a list of Bundle objects as clone bundle manifest content"""
605 623 bundles = list(bundles)
606 624 bundles.sort(key=lambda b: b.bundle_type)
607 625 lines = []
608 626 for b in bundles:
609 627 lines.append(b"%s\n" % b.manifest_line())
610 628 return b"".join(lines)
611 629
612 630
613 631 def update_ondisk_manifest(repo):
614 632 """update the clonebundle manifest with latest url"""
615 633 with repo.clonebundles_lock():
616 634 bundles = read_auto_gen(repo)
617 635
618 636 per_types = {}
619 637 for b in bundles:
620 638 if not (b.ready and b.valid_for(repo)):
621 639 continue
622 640 current = per_types.get(b.bundle_type)
623 641 if current is not None and current.revs >= b.revs:
624 642 continue
625 643 per_types[b.bundle_type] = b
626 644 manifest = generate_manifest(per_types.values())
627 645 with repo.vfs(
628 646 bundlecaches.CB_MANIFEST_FILE, mode=b"wb", atomictemp=True
629 647 ) as f:
630 648 f.write(manifest)
631 649
632 650
633 651 def update_bundle_list(repo, new_bundles=(), del_bundles=()):
634 652 """modify the repo's AUTO_GEN_FILE
635 653
636 654 This method also regenerates the clone bundle manifest when needed"""
637 655 with repo.clonebundles_lock():
638 656 bundles = read_auto_gen(repo)
639 657 if del_bundles:
640 658 bundles = [b for b in bundles if b not in del_bundles]
641 659 new_bundles = [b for b in new_bundles if b not in bundles]
642 660 bundles.extend(new_bundles)
643 661 write_auto_gen(repo, bundles)
644 662 all_changed = []
645 663 all_changed.extend(new_bundles)
646 664 all_changed.extend(del_bundles)
647 665 if any(b.ready for b in all_changed):
648 666 update_ondisk_manifest(repo)
649 667
650 668
651 669 def cleanup_tmp_bundle(repo, target):
652 670 """remove a GeneratingBundle file and entry"""
653 671 assert not target.ready
654 672 with repo.clonebundles_lock():
655 673 repo.vfs.tryunlink(target.filepath)
656 674 update_bundle_list(repo, del_bundles=[target])
657 675
658 676
659 677 def finalize_one_bundle(repo, target):
660 678 """upload a generated bundle and advertise it in the clonebundles.manifest"""
661 679 with repo.clonebundles_lock():
662 680 bundles = read_auto_gen(repo)
663 681 if target in bundles and target.valid_for(repo):
664 682 result = upload_bundle(repo, target)
665 683 update_bundle_list(repo, new_bundles=[result])
666 684 cleanup_tmp_bundle(repo, target)
667 685
668 686
687 def find_outdated_bundles(repo, bundles):
688 """finds outdated bundles"""
689 olds = []
690 per_types = {}
691 for b in bundles:
692 if not b.valid_for(repo):
693 olds.append(b)
694 continue
695 l = per_types.setdefault(b.bundle_type, [])
696 l.append(b)
697 for key in sorted(per_types):
698 all = per_types[key]
699 if len(all) > 1:
700 all.sort(key=lambda b: b.revs, reverse=True)
701 olds.extend(all[1:])
702 return olds
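
Reviewer note: the selection rule in `find_outdated_bundles` (discard anything no longer valid for the repository, then keep only the bundle with the most revisions for each bundle type) can be checked without a repository. In this sketch, `Bundle` is a hypothetical stand-in whose `valid` field replaces the `valid_for(repo)` call, and `find_outdated` is an illustrative name.

```python
from collections import namedtuple

# Stand-in for the bundle classes: `valid` replaces valid_for(repo).
Bundle = namedtuple("Bundle", ["bundle_type", "revs", "valid"])


def find_outdated(bundles):
    """Return the bundles that can be garbage collected."""
    olds = []
    per_type = {}
    for b in bundles:
        if not b.valid:
            # e.g. the tip it was built from has been stripped
            olds.append(b)
            continue
        per_type.setdefault(b.bundle_type, []).append(b)
    for key in sorted(per_type):
        # Newest first; everything after the first entry is superseded.
        candidates = sorted(per_type[key], key=lambda b: b.revs, reverse=True)
        olds.extend(candidates[1:])
    return olds


bundles = [
    Bundle("zstd-v2", 100, True),
    Bundle("zstd-v2", 120, True),
    Bundle("gzip-v2", 120, True),
    Bundle("gzip-v2", 90, False),
]
outdated = find_outdated(bundles)
```

With this input, the invalid `gzip-v2` bundle and the older `zstd-v2` bundle are reported, while the newest valid bundle of each type is kept.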
703
704
705 def collect_garbage(repo):
706 """find outdated bundles and delete them"""
707 with repo.clonebundles_lock():
708 bundles = read_auto_gen(repo)
709 olds = find_outdated_bundles(repo, bundles)
710 for o in olds:
711 delete_bundle(repo, o)
712 update_bundle_list(repo, del_bundles=olds)
713
714
669 715 def upload_bundle(repo, bundle):
670 716 """upload the result of a GeneratingBundle and return a GeneratedBundle
671 717
672 718 The upload is done using the `clone-bundles.upload-command`
673 719 """
674 720 cmd = repo.ui.config(b'clone-bundles', b'upload-command')
675 721 url = repo.ui.config(b'clone-bundles', b'url-template')
676 722 basename = repo.vfs.basename(bundle.filepath)
677 723 filepath = procutil.shellquote(bundle.filepath)
678 724 variables = {
679 725 b'HGCB_BUNDLE_PATH': filepath,
680 726 b'HGCB_BUNDLE_BASENAME': basename,
681 727 }
682 728 env = procutil.shellenviron(environ=variables)
683 729 ret = repo.ui.system(cmd, environ=env)
684 730 if ret:
685 731 raise error.Abort(b"command returned status %d: %s" % (ret, cmd))
686 732 url = (
687 733 url.decode('utf8')
688 734 .format(basename=basename.decode('utf8'))
689 735 .encode('utf8')
690 736 )
691 737 return bundle.uploaded(url, basename)
692 738
693 739
740 def delete_bundle(repo, bundle):
741 """delete a bundle from storage"""
742 assert bundle.ready
743 msg = b'clone-bundles: deleting bundle %s\n'
744 msg %= bundle.basename
745 if repo.ui.configbool(b'devel', b'debug.clonebundles'):
746 repo.ui.write(msg)
747 else:
748 repo.ui.debug(msg)
749
750 cmd = repo.ui.config(b'clone-bundles', b'delete-command')
751 variables = {
752 b'HGCB_BUNDLE_URL': bundle.file_url,
753 b'HGCB_BASENAME': bundle.basename,
754 }
755 env = procutil.shellenviron(environ=variables)
756 ret = repo.ui.system(cmd, environ=env)
757 if ret:
758 raise error.Abort(b"command returned status %d: %s" % (ret, cmd))
759
760
694 761 def auto_bundle_needed_actions(repo, bundles, op_id):
695 762 """find the list of bundles that need action
696 763
697 764 returns a tuple: the list of RequestedBundle objects that need to be generated
698 765 and uploaded, and the list of outdated bundles that can be deleted."""
699 766 create_bundles = []
767 delete_bundles = []
700 768 repo = repo.filtered(b"immutable")
701 769 targets = repo.ui.configlist(b'clone-bundles', b'auto-generate.formats')
702 770 revs = len(repo.changelog)
703 771 generic_data = {
704 772 'revs': revs,
705 773 'head_revs': repo.changelog.headrevs(),
706 774 'tip_rev': repo.changelog.tiprev(),
707 775 'tip_node': node.hex(repo.changelog.tip()),
708 776 'op_id': op_id,
709 777 }
710 778 for t in targets:
711 779 data = generic_data.copy()
712 780 data['bundle_type'] = t
713 781 b = RequestedBundle(**data)
714 782 create_bundles.append(b)
715 return create_bundles
783 delete_bundles.extend(find_outdated_bundles(repo, bundles))
784 return create_bundles, delete_bundles
716 785
717 786
718 787 def start_one_bundle(repo, bundle):
719 788 """start the generation of a single bundle file
720 789
721 790 the `bundle` argument should be a RequestedBundle object.
722 791
723 792 This data is passed to the `debugmakeclonebundles` "as is".
724 793 """
725 794 data = util.pickle.dumps(bundle)
726 795 cmd = [procutil.hgexecutable(), b'--cwd', repo.path, INTERNAL_CMD]
727 796 env = procutil.shellenviron()
728 797 msg = b'clone-bundles: starting bundle generation: %s\n'
729 798 stdout = None
730 799 stderr = None
731 800 waits = []
732 801 record_wait = None
733 802 if repo.ui.configbool(b'devel', b'debug.clonebundles'):
734 803 stdout = procutil.stdout
735 804 stderr = procutil.stderr
736 805 repo.ui.write(msg % bundle.bundle_type)
737 806 record_wait = waits.append
738 807 else:
739 808 repo.ui.debug(msg % bundle.bundle_type)
740 809 bg = procutil.runbgcommand
741 810 bg(
742 811 cmd,
743 812 env,
744 813 stdin_bytes=data,
745 814 stdout=stdout,
746 815 stderr=stderr,
747 816 record_wait=record_wait,
748 817 )
749 818 for f in waits:
750 819 f()
751 820
752 821
753 822 INTERNAL_CMD = b'debug::internal-make-clone-bundles'
754 823
755 824
756 825 @command(INTERNAL_CMD, [], b'')
757 826 def debugmakeclonebundles(ui, repo):
758 827 """Internal command to auto-generate debug bundles"""
759 828 requested_bundle = util.pickle.load(procutil.stdin)
760 829 procutil.stdin.close()
761 830
831 collect_garbage(repo)
832
762 833 fname = requested_bundle.suggested_filename
763 834 fpath = repo.vfs.makedirs(b'tmp-bundles')
764 835 fpath = repo.vfs.join(b'tmp-bundles', fname)
765 836 bundle = requested_bundle.generating(fpath)
766 837 update_bundle_list(repo, new_bundles=[bundle])
767 838
768 839 requested_bundle.generate_bundle(repo, fpath)
769 840
770 841 repo.invalidate()
771 842 finalize_one_bundle(repo, bundle)
772 843
773 844
774 845 def make_auto_bundler(source_repo):
775 846 reporef = weakref.ref(source_repo)
776 847
777 848 def autobundle(tr):
778 849 repo = reporef()
779 850 assert repo is not None
780 851 bundles = read_auto_gen(repo)
781 new = auto_bundle_needed_actions(repo, bundles, b"%d_txn" % id(tr))
852 new, __ = auto_bundle_needed_actions(repo, bundles, b"%d_txn" % id(tr))
782 853 for data in new:
783 854 start_one_bundle(repo, data)
784 855 return None
785 856
786 857 return autobundle
787 858
788 859
789 860 def reposetup(ui, repo):
790 861 """install the two pieces needed for automatic clonebundle generation
791 862
792 863 - add a "post-close" hook that fires bundling when needed
793 864 - introduce a clone-bundle lock to let multiple processes meddle with the
794 865 state files.
795 866 """
796 867 if not repo.local():
797 868 return
798 869
799 870 class autobundlesrepo(repo.__class__):
800 871 def transaction(self, *args, **kwargs):
801 872 tr = super(autobundlesrepo, self).transaction(*args, **kwargs)
802 873 targets = repo.ui.configlist(
803 874 b'clone-bundles', b'auto-generate.formats'
804 875 )
805 876 if targets:
806 877 tr.addpostclose(CAT_POSTCLOSE, make_auto_bundler(self))
807 878 return tr
808 879
809 880 @localrepo.unfilteredmethod
810 881 def clonebundles_lock(self, wait=True):
811 882 '''Lock the repository file related to clone bundles'''
812 883 if not util.safehasattr(self, '_cb_lock_ref'):
813 884 self._cb_lock_ref = None
814 885 l = self._currentlock(self._cb_lock_ref)
815 886 if l is not None:
816 887 l.lock()
817 888 return l
818 889
819 890 l = self._lock(
820 891 vfs=self.vfs,
821 892 lockname=b"clonebundleslock",
822 893 wait=wait,
823 894 releasefn=None,
824 895 acquirefn=None,
825 896 desc=_(b'repository %s') % self.origroot,
826 897 )
827 898 self._cb_lock_ref = weakref.ref(l)
828 899 return l
829 900
830 901 repo._wlockfreeprefix.add(AUTO_GEN_FILE)
831 902 repo._wlockfreeprefix.add(bundlecaches.CB_MANIFEST_FILE)
832 903 repo.__class__ = autobundlesrepo
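Note on the hand-off between `start_one_bundle` and `debugmakeclonebundles`: the parent pickles the request and feeds it to a child process on stdin, and the child unpickles it and closes stdin before doing any work. The following is a minimal standalone sketch of that pattern using only the Python standard library. The names `CHILD_SOURCE` and `hand_off` and the dict-shaped request are assumptions made for illustration, not part of the extension. Unlike `procutil.runbgcommand`, this sketch waits for the child so that its output can be inspected.

```python
import pickle
import subprocess
import sys

# Child program: stands in for the internal command. It unpickles a single
# request from stdin and reports the bundle type it was asked to build.
CHILD_SOURCE = (
    "import pickle, sys\n"
    "request = pickle.load(sys.stdin.buffer)\n"
    "sys.stdin.close()\n"
    "sys.stdout.write('generating %s\\n' % request['bundle_type'])\n"
)


def hand_off(request):
    """Send `request` to a child process as pickled bytes on stdin.

    Returns the single line the child printed, without the newline.
    """
    payload = pickle.dumps(request)
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input=payload,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("ascii").strip()


if __name__ == "__main__":
    # Hypothetical request; the real code pickles a RequestedBundle object.
    print(hand_off({"bundle_type": "v2", "op_id": "demo_txn"}))
```

Pickle is only reasonable here because both ends are the same trusted program; it should not be used for data that crosses a trust boundary.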
@@ -1,70 +1,96 b''
1 1
2 2 #require no-reposimplestore no-chg
3 3
4 4 initial setup
5 5
6 6 $ hg init server
7 7 $ cat >> server/.hg/hgrc << EOF
8 8 > [extensions]
9 9 > clonebundles =
10 10 >
11 11 > [clone-bundles]
12 12 > auto-generate.formats = v2
13 13 > upload-command = cp "\$HGCB_BUNDLE_PATH" "$TESTTMP"/final-upload/
14 > delete-command = rm -f "$TESTTMP/final-upload/\$HGCB_BASENAME"
14 15 > url-template = file://$TESTTMP/final-upload/{basename}
15 16 >
16 17 > [devel]
17 18 > debug.clonebundles=yes
18 19 > EOF
19 20
20 21 $ mkdir final-upload
21 22 $ hg clone server client
22 23 updating to branch default
23 24 0 files updated, 0 files merged, 0 files removed, 0 files unresolved
24 25 $ cd client
25 26
26 27 Test bundles are generated on push
27 28 ==================================
28 29
29 30 $ touch foo
30 31 $ hg -q commit -A -m 'add foo'
31 32 $ touch bar
32 33 $ hg -q commit -A -m 'add bar'
33 34 $ hg push
34 35 pushing to $TESTTMP/server
35 36 searching for changes
36 37 adding changesets
37 38 adding manifests
38 39 adding file changes
39 40 2 changesets found
40 41 added 2 changesets with 2 changes to 2 files
41 42 clone-bundles: starting bundle generation: v2
42 43 $ cat ../server/.hg/clonebundles.manifest
43 44 file:/*/$TESTTMP/final-upload/full-v2-2_revs-aaff8d2ffbbf_tip-*_txn.hg BUNDLESPEC=v2 REQUIRESNI=true (glob)
44 45 $ ls -1 ../final-upload
45 46 full-v2-2_revs-aaff8d2ffbbf_tip-*_txn.hg (glob)
46 47 $ ls -1 ../server/.hg/tmp-bundles
47 48
48 49 Newer bundles are generated with more pushes
49 50 --------------------------------------------
50 51
51 52 $ touch baz
52 53 $ hg -q commit -A -m 'add baz'
53 54 $ touch buz
54 55 $ hg -q commit -A -m 'add buz'
55 56 $ hg push
56 57 pushing to $TESTTMP/server
57 58 searching for changes
58 59 adding changesets
59 60 adding manifests
60 61 adding file changes
61 62 4 changesets found
62 63 added 2 changesets with 2 changes to 2 files
63 64 clone-bundles: starting bundle generation: v2
64 65
65 66 $ cat ../server/.hg/clonebundles.manifest
66 67 file:/*/$TESTTMP/final-upload/full-v2-4_revs-6427147b985a_tip-*_txn.hg BUNDLESPEC=v2 REQUIRESNI=true (glob)
67 68 $ ls -1 ../final-upload
68 69 full-v2-2_revs-aaff8d2ffbbf_tip-*_txn.hg (glob)
69 70 full-v2-4_revs-6427147b985a_tip-*_txn.hg (glob)
70 71 $ ls -1 ../server/.hg/tmp-bundles
72
73 Older bundles are cleaned up with more pushes
74 ---------------------------------------------
75
76 $ touch faz
77 $ hg -q commit -A -m 'add faz'
78 $ touch fuz
79 $ hg -q commit -A -m 'add fuz'
80 $ hg push
81 pushing to $TESTTMP/server
82 searching for changes
83 adding changesets
84 adding manifests
85 adding file changes
86 clone-bundles: deleting bundle full-v2-2_revs-aaff8d2ffbbf_tip-*_txn.hg (glob)
87 6 changesets found
88 added 2 changesets with 2 changes to 2 files
89 clone-bundles: starting bundle generation: v2
90
91 $ cat ../server/.hg/clonebundles.manifest
92 file:/*/$TESTTMP/final-upload/full-v2-6_revs-b1010e95ea00_tip-*_txn.hg BUNDLESPEC=v2 REQUIRESNI=true (glob)
93 $ ls -1 ../final-upload
94 full-v2-4_revs-6427147b985a_tip-*_txn.hg (glob)
95 full-v2-6_revs-b1010e95ea00_tip-*_txn.hg (glob)
96 $ ls -1 ../server/.hg/tmp-bundles
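Note on the cleanup test: the three pushes above leave `final-upload` holding first one bundle, then two, then two again after the oldest is deleted, while the manifest always advertises only the newest. The toy model below reproduces those listings under one assumed rule: at the start of each generation, any uploaded bundle that the manifest no longer advertises is deleted. The actual selection logic lives in `find_outdated_bundles` and `collect_garbage`, which are not shown in this part of the diff, so `ToyBundleHost` and its rule are hypothetical and may differ from the real implementation.

```python
class ToyBundleHost:
    """Hypothetical stand-in for the upload directory plus the manifest."""

    def __init__(self):
        self.uploaded = []    # file names present in final-upload/
        self.advertised = []  # file names listed in clonebundles.manifest

    def push(self, new_bundle):
        """Simulate one push; return the names deleted during cleanup."""
        # Assumed rule: drop uploads that the manifest no longer advertises.
        deleted = [b for b in self.uploaded if b not in self.advertised]
        self.uploaded = [b for b in self.uploaded if b in self.advertised]
        # Then publish the new bundle and advertise only that one.
        self.uploaded.append(new_bundle)
        self.advertised = [new_bundle]
        return deleted


def replay():
    """Replay the three pushes from the test with shortened bundle names."""
    host = ToyBundleHost()
    log = []
    for name in ("2_revs", "4_revs", "6_revs"):
        deleted = host.push(name)
        log.append((deleted, list(host.uploaded), list(host.advertised)))
    return log


if __name__ == "__main__":
    for deleted, uploaded, advertised in replay():
        print(deleted, uploaded, advertised)
```

Under this rule the previously advertised bundle survives one extra generation, which matches `full-v2-4_revs-...` still being present after the third push.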