clonebundles: fix display of auto-generate.on-change lines
Mathias De Mare -
r51628:2b059812 stable
@@ -1,1082 +1,1082 b''
1 1 # This software may be used and distributed according to the terms of the
2 2 # GNU General Public License version 2 or any later version.
3 3
4 4 """advertise pre-generated bundles to seed clones
5 5
6 6 "clonebundles" is a server-side extension used to advertise the existence
7 7 of pre-generated, externally hosted bundle files to clients that are
8 8 cloning so that cloning can be faster, more reliable, and require fewer
9 9 resources on the server. "pullbundles" is a related feature for sending
10 10 pre-generated bundle files to clients as part of pull operations.
11 11
12 12 Cloning can be a CPU and I/O intensive operation on servers. Traditionally,
13 13 the server, in response to a client's request to clone, dynamically generates
14 14 a bundle containing the entire repository content and sends it to the client.
15 15 There is no caching on the server and the server will have to redundantly
16 16 generate the same outgoing bundle in response to each clone request. For
17 17 servers with large repositories or with high clone volume, the load from
18 18 clones can make scaling the server challenging and costly.
19 19
20 20 This extension provides server operators the ability to offload
21 21 potentially expensive clone load to an external service. Pre-generated
22 22 bundles also allow using more CPU intensive compression, reducing the
23 23 effective bandwidth requirements.
24 24
25 25 Here's how clone bundles work:
26 26
27 27 1. A server operator establishes a mechanism for making bundle files available
28 28 on a hosting service where Mercurial clients can fetch them.
29 29 2. A manifest file listing available bundle URLs and some optional metadata
30 30 is added to the Mercurial repository on the server.
31 31 3. A client initiates a clone against a clone bundles aware server.
32 32 4. The client sees the server is advertising clone bundles and fetches the
33 33 manifest listing available bundles.
34 34 5. The client filters and sorts the available bundles based on what it
35 35 supports and prefers.
36 36 6. The client downloads and applies an available bundle from the
37 37 server-specified URL.
38 38 7. The client reconnects to the original server and performs the equivalent
39 39 of :hg:`pull` to retrieve all repository data not in the bundle. (The
40 40 repository could have been updated between when the bundle was created
41 41 and when the client started the clone.) This may use "pullbundles".
42 42
43 43 Instead of the server generating full repository bundles for every clone
44 44 request, it generates full bundles once and they are subsequently reused to
45 45 bootstrap new clones. The server may still transfer data at clone time.
46 46 However, this is only data that has been added/changed since the bundle was
47 47 created. For large, established repositories, this can reduce server load for
48 48 clones to less than 1% of original.
49 49
50 50 Here's how pullbundles work:
51 51
52 52 1. A manifest file listing available bundles and describing the revisions
53 53 is added to the Mercurial repository on the server.
54 54 2. A new-enough client informs the server that it supports partial pulls
55 55 and initiates a pull.
56 56 3. If the server has pull bundles enabled and sees the client advertising
57 57 partial pulls, it checks for a matching pull bundle in the manifest.
58 58 A bundle matches if the format is supported by the client, the client
59 59 has the required revisions already and needs something from the bundle.
60 60 4. If there is at least one matching bundle, the server sends it to the client.
61 61 5. The client applies the bundle and notices that the server reply was
62 62 incomplete. It initiates another pull.
63 63
64 64 To work, this extension requires the following of server operators:
65 65
66 66 * Generating bundle files of repository content (typically periodically,
67 67 such as once per day).
68 68 * Clone bundles: A file server that clients have network access to and that
69 69 Python knows how to talk to through its normal URL handling facility
70 70 (typically an HTTP/HTTPS server).
71 71 * A process for keeping the bundles manifest in sync with available bundle
72 72 files.
73 73
74 74 Strictly speaking, using a static file hosting server isn't required: a server
75 75 operator could use a dynamic service for retrieving bundle data. However,
76 76 static file hosting services are simple and scalable and should be sufficient
77 77 for most needs.
78 78
79 79 Bundle files can be generated with the :hg:`bundle` command. Typically
80 80 :hg:`bundle --all` is used to produce a bundle of the entire repository.
81 81
82 82 The bundlespec option `stream` (see :hg:`help bundlespec`)
83 83 can be used to produce a special *streaming clonebundle*, typically using
84 84 :hg:`bundle --all --type="none-streamv2"`.
85 85 These are bundle files that are extremely efficient
86 86 to produce and consume (read: fast). However, they are larger than
87 87 traditional bundle formats and require that clients support the exact set
88 88 of repository data store formats in use by the repository that created them.
89 89 Typically, a newer server can serve data that is compatible with older clients.
90 90 However, *streaming clone bundles* don't have this guarantee. **Server
91 91 operators need to be aware that newer versions of Mercurial may produce
92 92 streaming clone bundles incompatible with older Mercurial versions.**
93 93
94 94 A server operator is responsible for creating a ``.hg/clonebundles.manifest``
95 95 file containing the list of available bundle files suitable for seeding
96 96 clones. If this file does not exist, the repository will not advertise the
97 97 existence of clone bundles when clients connect. For pull bundles,
98 98 ``.hg/pullbundles.manifest`` is used.
99 99
100 100 The manifest file contains a newline (\\n) delimited list of entries.
101 101
102 102 Each line in this file defines an available bundle. Lines have the format:
103 103
104 104 <URL> [<key>=<value>[ <key>=<value>]]
105 105
106 106 That is, a URL followed by an optional, space-delimited list of key=value
107 107 pairs describing additional properties of this bundle. Both keys and values
108 108 are URI encoded.
109 109
110 110 For pull bundles, the URL is a path under the ``.hg`` directory of the
111 111 repository.
112 112
113 113 Keys in UPPERCASE are reserved for use by Mercurial and are defined below.
114 114 All non-uppercase keys can be used by site installations. An example use
115 115 for custom properties is to use the *datacenter* attribute to define which
116 116 data center a file is hosted in. Clients could then prefer a server in the
117 117 data center closest to them.
118 118
119 119 The following reserved keys are currently defined:
120 120
121 121 BUNDLESPEC
122 122 A "bundle specification" string that describes the type of the bundle.
123 123
124 124 These are string values that are accepted by the "--type" argument of
125 125 :hg:`bundle`.
126 126
127 127 The values are parsed in strict mode, which means they must be of the
128 128 "<compression>-<type>" form. See
129 129 mercurial.exchange.parsebundlespec() for more details.
130 130
131 131 :hg:`debugbundle --spec` can be used to print the bundle specification
132 132 string for a bundle file. The output of this command can be used verbatim
133 133 for the value of ``BUNDLESPEC`` (it is already escaped).
134 134
135 135 Clients will automatically filter out specifications that are unknown or
136 136 unsupported so they won't attempt to download something that likely won't
137 137 apply.
138 138
139 139 The actual value doesn't impact client behavior beyond filtering:
140 140 clients will still sniff the bundle type from the header of downloaded
141 141 files.
142 142
143 143 **Use of this key is highly recommended**, as it allows clients to
144 144 easily skip unsupported bundles. If this key is not defined, an old
145 145 client may attempt to apply a bundle that it is incapable of reading.
146 146
147 147 REQUIRESNI
148 148 Whether Server Name Indication (SNI) is required to connect to the URL.
149 149 SNI allows servers to use multiple certificates on the same IP. It is
150 150 somewhat common in CDNs and other hosting providers. Older Python
151 151 versions do not support SNI. Defining this attribute enables clients
152 152 with older Python versions to filter this entry without experiencing
153 153 an opaque SSL failure at connection time.
154 154
155 155 If this is defined, it is important to advertise a non-SNI fallback
156 156 URL or clients running old Python releases may not be able to clone
157 157 with the clonebundles facility.
158 158
159 159 Value should be "true".
160 160
161 161 REQUIREDRAM
162 162 Value specifies expected memory requirements to decode the payload.
163 163 Values can have suffixes for common byte sizes, e.g. "64MB".
164 164
165 165 This key is often used with zstd-compressed bundles using a high
166 166 compression level / window size, which can require 100+ MB of memory
167 167 to decode.
168 168
169 169 heads
170 170 Used for pull bundles. This contains the ``;`` separated changeset
171 171 hashes of the heads of the bundle content.
172 172
173 173 bases
174 174 Used for pull bundles. This contains the ``;`` separated changeset
175 175 hashes of the roots of the bundle content. This can be skipped if
176 176 the bundle was created without ``--base``.
177 177
178 178 Manifests can contain multiple entries. Assuming metadata is defined, clients
179 179 will filter entries from the manifest that they don't support. The remaining
180 180 entries are optionally sorted by client preferences
181 181 (``ui.clonebundleprefers`` config option). The client then attempts
182 182 to fetch the bundle at the first URL in the remaining list.
183 183
184 184 **Errors when downloading a bundle will fail the entire clone operation:
185 185 clients do not automatically fall back to a traditional clone.** The reason
186 186 for this is that if a server is using clone bundles, it is probably doing so
187 187 because the feature is necessary to help it scale. In other words, there
188 188 is an assumption that clone load will be offloaded to another service and
189 189 that the Mercurial server isn't responsible for serving this clone load.
190 190 If that other service experiences issues and clients start mass falling back to
191 191 the original Mercurial server, the added clone load could overwhelm the server
192 192 due to unexpected load and effectively take it offline. Not having clients
193 193 automatically fall back to cloning from the original server mitigates this
194 194 scenario.
195 195
196 196 Because there is no automatic Mercurial server fallback on failure of the
197 197 bundle hosting service, it is important for server operators to view the bundle
198 198 hosting service as an extension of the Mercurial server in terms of
199 199 availability and service level agreements: if the bundle hosting service goes
200 200 down, so does the ability for clients to clone. Note: clients will see a
201 201 message informing them how to bypass the clone bundles facility when a failure
202 202 occurs. So server operators should prepare for some people to follow these
203 203 instructions when a failure occurs, thus driving more load to the original
204 204 Mercurial server when the bundle hosting service fails.
205 205
206 206
207 207 inline clonebundles
208 208 -------------------
209 209
210 210 It is possible to transmit clonebundles inline in case repositories are
211 211 accessed over SSH. This avoids having to set up an external HTTPS server
212 212 and results in the same access control as already present for the SSH setup.
213 213
214 214 Inline clonebundles should be placed into the `.hg/bundle-cache` directory.
215 215 A clonebundle at `.hg/bundle-cache/mybundle.bundle` is referred to
216 216 in the `clonebundles.manifest` file as `peer-bundle-cache://mybundle.bundle`.
217 217
218 218
219 219 auto-generation of clone bundles
220 220 --------------------------------
221 221
222 222 It is possible to set Mercurial to automatically re-generate clone bundles when
223 223 enough new content is available.
224 224
225 225 Mercurial will take care of the process asynchronously. The configured list of
226 226 bundle types will be generated, uploaded, and advertised. Older bundles will get
227 227 decommissioned as newer ones replace them.
228 228
229 229 Bundles Generation:
230 230 ...................
231 231
232 232 The extension can generate multiple variants of the clone bundle. Each
233 233 variant is defined by the "bundle-spec" it uses::
234 234
235 235 [clone-bundles]
236 236 auto-generate.formats= zstd-v2, gzip-v2
237 237
238 238 See `hg help bundlespec` for details about available options.
239 239
240 240 By default, new bundles are generated when 5% of the repository contents or at
241 241 least 1000 revisions are not contained in the cached bundles. These thresholds can
242 242 be controlled by the `clone-bundles.trigger.below-bundled-ratio` option
243 243 (default 0.95) and the `clone-bundles.trigger.revs` option (default 1000)::
244 244
245 245 [clone-bundles]
246 246 trigger.below-bundled-ratio=0.95
247 247 trigger.revs=1000
248 248
249 249 This logic can be manually triggered using the `admin::clone-bundles-refresh`
250 250 command, or automatically on each repository change if
251 `clone-bundles.auto-generate.on-change` is set to `yes`.
251 `clone-bundles.auto-generate.on-change` is set to `yes`::
252 252
253 253 [clone-bundles]
254 254 auto-generate.on-change=yes
255 255 auto-generate.formats= zstd-v2, gzip-v2
256 256
257 257 Automatic Inline serving
258 258 ........................
259 259
260 260 The simplest way to serve the generated bundle is through the Mercurial
261 261 protocol. However, it is not the most efficient, as requests will still be served
262 262 by that main server. It is useful in cases where authentication is complex or
263 263 when an efficient mirror system is already in use anyway. See the `inline
264 264 clonebundles` section above for details about inline clonebundles.
265 265
266 266 To automatically serve generated bundle through inline clonebundle, simply set
267 267 the following option::
268 268
269 269 auto-generate.serve-inline=yes
270 270
271 271 Enabling this option disables the managed upload and serving explained below.
272 272
273 273 Bundles Upload and Serving:
274 274 ...........................
275 275
276 276 This is the most efficient way to serve automatically generated clone bundles,
277 277 but requires some setup.
278 278
279 279 The generated bundles need to be made available to users through a "public" URL.
280 280 This should be done through the `clone-bundles.upload-command` configuration. The
281 281 value of this option should be a shell command. It will have access to the
282 282 bundle file path through the `$HGCB_BUNDLE_PATH` variable, and to the expected
283 283 basename in the "public" URL through the `$HGCB_BUNDLE_BASENAME` variable::
284 284
285 285 [clone-bundles]
286 286 upload-command=sftp put $HGCB_BUNDLE_PATH \
287 287 sftp://bundles.host/clone-bundles/$HGCB_BUNDLE_BASENAME
288 288
289 289 If the file was already uploaded, the command must still succeed.
290 290
291 291 After upload, the file should be available at a URL defined by
292 292 `clone-bundles.url-template`::
293 293
294 294 [clone-bundles]
295 295 url-template=https://bundles.host/cache/clone-bundles/{basename}
296 296
297 297 Old bundles cleanup:
298 298 ....................
299 299
300 300 When new bundles are generated, the older ones are no longer necessary and can
301 301 be removed from storage. This is done through the `clone-bundles.delete-command`
302 302 configuration. The command is given the URL of the artifact to delete through
303 303 the `$HGCB_BUNDLE_URL` environment variable and its basename through `$HGCB_BASENAME`::
304 304
305 305 [clone-bundles]
306 306 delete-command=sftp rm sftp://bundles.host/clone-bundles/$HGCB_BASENAME
307 307
308 308 If the file was already deleted, the command must still succeed.
309 309 """
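Reviewer note: the manifest line format documented in the docstring above (`<URL> [<key>=<value>[ <key>=<value>]]`, with URI-encoded keys and values) can be illustrated with a minimal sketch. `parse_manifest_line` and the example URL are hypothetical and are not Mercurial's own parser, which lives in `mercurial.bundlecaches` and works on bytes.

```python
from urllib.parse import unquote


def parse_manifest_line(line):
    """Split one manifest line into (url, attributes).

    Hypothetical helper for illustration only; not Mercurial's parser.
    """
    fields = line.split()
    if not fields:
        return None
    url, attrs = fields[0], {}
    for field in fields[1:]:
        key, _, value = field.partition("=")
        # Both keys and values are URI encoded in the manifest.
        attrs[unquote(key)] = unquote(value)
    return url, attrs


# Assumed example entry: two reserved (UPPERCASE) keys and one site-specific key.
example = (
    "https://bundles.example/full.hg "
    "BUNDLESPEC=zstd-v2 REQUIREDRAM=64MB datacenter=eu%2Dwest"
)
url, attrs = parse_manifest_line(example)
```

The lowercase `datacenter` key is the kind of site-defined property the docstring mentions; a client could use it to prefer a nearby host.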
310 310
311 311
312 312 import os
313 313 import weakref
314 314
315 315 from mercurial.i18n import _
316 316
317 317 from mercurial import (
318 318 bundlecaches,
319 319 commands,
320 320 error,
321 321 extensions,
322 322 localrepo,
323 323 lock,
324 324 node,
325 325 registrar,
326 326 util,
327 327 wireprotov1server,
328 328 )
329 329
330 330
331 331 from mercurial.utils import (
332 332 procutil,
333 333 )
334 334
335 335 testedwith = b'ships-with-hg-core'
336 336
337 337
338 338 def capabilities(orig, repo, proto):
339 339 caps = orig(repo, proto)
340 340
341 341 # Only advertise if a manifest exists. This does add some I/O to requests.
342 342 # But this should be cheaper than a wasted network round trip due to
343 343 # missing file.
344 344 if repo.vfs.exists(bundlecaches.CB_MANIFEST_FILE):
345 345 caps.append(b'clonebundles')
346 346 caps.append(b'clonebundles_manifest')
347 347
348 348 return caps
349 349
350 350
351 351 def extsetup(ui):
352 352 extensions.wrapfunction(wireprotov1server, b'_capabilities', capabilities)
353 353
354 354
355 355 # logic for bundle auto-generation
356 356
357 357
358 358 configtable = {}
359 359 configitem = registrar.configitem(configtable)
360 360
361 361 cmdtable = {}
362 362 command = registrar.command(cmdtable)
363 363
364 364 configitem(b'clone-bundles', b'auto-generate.on-change', default=False)
365 365 configitem(b'clone-bundles', b'auto-generate.formats', default=list)
366 366 configitem(b'clone-bundles', b'auto-generate.serve-inline', default=False)
367 367 configitem(b'clone-bundles', b'trigger.below-bundled-ratio', default=0.95)
368 368 configitem(b'clone-bundles', b'trigger.revs', default=1000)
369 369
370 370 configitem(b'clone-bundles', b'upload-command', default=None)
371 371
372 372 configitem(b'clone-bundles', b'delete-command', default=None)
373 373
374 374 configitem(b'clone-bundles', b'url-template', default=None)
375 375
376 376 configitem(b'devel', b'debug.clonebundles', default=False)
377 377
378 378
379 379 # category for the post-close transaction hooks
380 380 CAT_POSTCLOSE = b"clonebundles-autobundles"
381 381
382 382 # template for bundle file names
383 383 BUNDLE_MASK = (
384 384 b"full-%(bundle_type)s-%(revs)d_revs-%(tip_short)s_tip-%(op_id)s.hg"
385 385 )
386 386
387 387
388 388 # file in .hg/ used to track clonebundles being auto-generated
389 389 AUTO_GEN_FILE = b'clonebundles.auto-gen'
390 390
391 391
392 392 class BundleBase(object):
393 393 """represents the core properties that matter to us in a bundle
394 394
395 395 :bundle_type: the bundlespec (see hg help bundlespec)
396 396 :revs: the number of revisions in the repo at bundle creation time
397 397 :tip_rev: the rev-num of the tip revision
398 398 :tip_node: the node id of the tip-most revision in the bundle
399 399
400 400 :ready: True if the bundle is ready to be served
401 401 """
402 402
403 403 ready = False
404 404
405 405 def __init__(self, bundle_type, revs, tip_rev, tip_node):
406 406 self.bundle_type = bundle_type
407 407 self.revs = revs
408 408 self.tip_rev = tip_rev
409 409 self.tip_node = tip_node
410 410
411 411 def valid_for(self, repo):
412 412 """is this bundle applicable to the current repository
413 413
414 414 This is useful for detecting bundles made irrelevant by stripping.
415 415 """
416 416 tip_node = node.bin(self.tip_node)
417 417 return repo.changelog.index.get_rev(tip_node) == self.tip_rev
418 418
419 419 def __eq__(self, other):
420 420 left = (self.ready, self.bundle_type, self.tip_rev, self.tip_node)
421 421 right = (other.ready, other.bundle_type, other.tip_rev, other.tip_node)
422 422 return left == right
423 423
424 424 def __ne__(self, other):
425 425 return not self == other
426 426
427 427 def __cmp__(self, other):
428 428 if self == other:
429 429 return 0
430 430 return -1
431 431
432 432
433 433 class RequestedBundle(BundleBase):
434 434 """A bundle that should be generated.
435 435
436 436 Additional attributes compared to BundleBase
437 437 :head_revs: list of head revisions (as rev-num)
438 438 :op_id: a "unique" identifier for the operation triggering the change
439 439 """
440 440
441 441 def __init__(self, bundle_type, revs, tip_rev, tip_node, head_revs, op_id):
442 442 self.head_revs = head_revs
443 443 self.op_id = op_id
444 444 super(RequestedBundle, self).__init__(
445 445 bundle_type,
446 446 revs,
447 447 tip_rev,
448 448 tip_node,
449 449 )
450 450
451 451 @property
452 452 def suggested_filename(self):
453 453 """A filename that can be used for the generated bundle"""
454 454 data = {
455 455 b'bundle_type': self.bundle_type,
456 456 b'revs': self.revs,
457 457 b'heads': self.head_revs,
458 458 b'tip_rev': self.tip_rev,
459 459 b'tip_node': self.tip_node,
460 460 b'tip_short': self.tip_node[:12],
461 461 b'op_id': self.op_id,
462 462 }
463 463 return BUNDLE_MASK % data
464 464
465 465 def generate_bundle(self, repo, file_path):
466 466 """generate the bundle at `file_path`"""
467 467 commands.bundle(
468 468 repo.ui,
469 469 repo,
470 470 file_path,
471 471 base=[b"null"],
472 472 rev=self.head_revs,
473 473 type=self.bundle_type,
474 474 quiet=True,
475 475 )
476 476
477 477 def generating(self, file_path, hostname=None, pid=None):
478 478 """return a GeneratingBundle object from this object"""
479 479 if pid is None:
480 480 pid = os.getpid()
481 481 if hostname is None:
482 482 hostname = lock._getlockprefix()
483 483 return GeneratingBundle(
484 484 self.bundle_type,
485 485 self.revs,
486 486 self.tip_rev,
487 487 self.tip_node,
488 488 hostname,
489 489 pid,
490 490 file_path,
491 491 )
492 492
493 493
494 494 class GeneratingBundle(BundleBase):
495 495 """A bundle being generated
496 496
497 497 extra attributes compared to BundleBase:
498 498
499 499 :hostname: the hostname of the machine generating the bundle
500 500 :pid: the pid of the process generating the bundle
501 501 :filepath: the target filename of the bundle
502 502
503 503 These attributes exist to help detect stalled generation processes.
504 504 """
505 505
506 506 ready = False
507 507
508 508 def __init__(
509 509 self, bundle_type, revs, tip_rev, tip_node, hostname, pid, filepath
510 510 ):
511 511 self.hostname = hostname
512 512 self.pid = pid
513 513 self.filepath = filepath
514 514 super(GeneratingBundle, self).__init__(
515 515 bundle_type, revs, tip_rev, tip_node
516 516 )
517 517
518 518 @classmethod
519 519 def from_line(cls, line):
520 520 """create an object by deserializing a line from AUTO_GEN_FILE"""
521 521 assert line.startswith(b'PENDING-v1 ')
522 522 (
523 523 __,
524 524 bundle_type,
525 525 revs,
526 526 tip_rev,
527 527 tip_node,
528 528 hostname,
529 529 pid,
530 530 filepath,
531 531 ) = line.split()
532 532 hostname = util.urlreq.unquote(hostname)
533 533 filepath = util.urlreq.unquote(filepath)
534 534 revs = int(revs)
535 535 tip_rev = int(tip_rev)
536 536 pid = int(pid)
537 537 return cls(
538 538 bundle_type, revs, tip_rev, tip_node, hostname, pid, filepath
539 539 )
540 540
541 541 def to_line(self):
542 542 """serialize the object to include as a line in AUTO_GEN_FILE"""
543 543 templ = b"PENDING-v1 %s %d %d %s %s %d %s"
544 544 data = (
545 545 self.bundle_type,
546 546 self.revs,
547 547 self.tip_rev,
548 548 self.tip_node,
549 549 util.urlreq.quote(self.hostname),
550 550 self.pid,
551 551 util.urlreq.quote(self.filepath),
552 552 )
553 553 return templ % data
554 554
555 555 def __eq__(self, other):
556 556 if not super(GeneratingBundle, self).__eq__(other):
557 557 return False
558 558 left = (self.hostname, self.pid, self.filepath)
559 559 right = (other.hostname, other.pid, other.filepath)
560 560 return left == right
561 561
562 562 def uploaded(self, url, basename):
563 563 """return a GeneratedBundle from this object"""
564 564 return GeneratedBundle(
565 565 self.bundle_type,
566 566 self.revs,
567 567 self.tip_rev,
568 568 self.tip_node,
569 569 url,
570 570 basename,
571 571 )
572 572
573 573
574 574 class GeneratedBundle(BundleBase):
575 575 """A bundle that is done being generated and can be served
576 576
577 577 extra attributes compared to BundleBase:
578 578
579 579 :file_url: the url where the bundle is available.
580 580 :basename: the "basename" used to upload (useful for deletion)
581 581
582 582 These attributes exist to generate a bundle manifest
583 583 (.hg/clonebundles.manifest)
584 584 """
585 585
586 586 ready = True
587 587
588 588 def __init__(
589 589 self, bundle_type, revs, tip_rev, tip_node, file_url, basename
590 590 ):
591 591 self.file_url = file_url
592 592 self.basename = basename
593 593 super(GeneratedBundle, self).__init__(
594 594 bundle_type, revs, tip_rev, tip_node
595 595 )
596 596
597 597 @classmethod
598 598 def from_line(cls, line):
599 599 """create an object by deserializing a line from AUTO_GEN_FILE"""
600 600 assert line.startswith(b'DONE-v1 ')
601 601 (
602 602 __,
603 603 bundle_type,
604 604 revs,
605 605 tip_rev,
606 606 tip_node,
607 607 file_url,
608 608 basename,
609 609 ) = line.split()
610 610 revs = int(revs)
611 611 tip_rev = int(tip_rev)
612 612 file_url = util.urlreq.unquote(file_url)
613 613 return cls(bundle_type, revs, tip_rev, tip_node, file_url, basename)
614 614
615 615 def to_line(self):
616 616 """serialize the object to include as a line in AUTO_GEN_FILE"""
617 617 templ = b"DONE-v1 %s %d %d %s %s %s"
618 618 data = (
619 619 self.bundle_type,
620 620 self.revs,
621 621 self.tip_rev,
622 622 self.tip_node,
623 623 util.urlreq.quote(self.file_url),
624 624 self.basename,
625 625 )
626 626 return templ % data
627 627
628 628 def manifest_line(self):
629 629 """serialize the object to include as a line in clonebundles.manifest"""
630 630 templ = b"%s BUNDLESPEC=%s"
631 631 if self.file_url.startswith(b'http'):
632 632 templ += b" REQUIRESNI=true"
633 633 return templ % (self.file_url, self.bundle_type)
634 634
635 635 def __eq__(self, other):
636 636 if not super(GeneratedBundle, self).__eq__(other):
637 637 return False
638 638 return self.file_url == other.file_url
639 639
640 640
641 641 def parse_auto_gen(content):
642 642 """parse the AUTO_GEN_FILE to return a list of Bundle objects"""
643 643 bundles = []
644 644 for line in content.splitlines():
645 645 if line.startswith(b'PENDING-v1 '):
646 646 bundles.append(GeneratingBundle.from_line(line))
647 647 elif line.startswith(b'DONE-v1 '):
648 648 bundles.append(GeneratedBundle.from_line(line))
649 649 return bundles
650 650
651 651
652 652 def dumps_auto_gen(bundles):
653 653 """serialize a list of Bundle objects as AUTO_GEN_FILE content"""
654 654 lines = []
655 655 for b in bundles:
656 656 lines.append(b"%s\n" % b.to_line())
657 657 lines.sort()
658 658 return b"".join(lines)
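Reviewer note: the `DONE-v1` record format used by `GeneratedBundle.to_line` / `from_line` can be sketched as a standalone round trip. This is a simplified illustration that uses `str` and `urllib.parse` in place of Mercurial's bytes and `util.urlreq`; the function names and sample values are hypothetical.

```python
from urllib.parse import quote, unquote

# Same field layout as the DONE-v1 template above, but with str, not bytes.
DONE_TEMPLATE = "DONE-v1 %s %d %d %s %s %s"


def done_to_line(bundle_type, revs, tip_rev, tip_node, file_url, basename):
    """Serialize one finished bundle; the URL is quoted so it holds no spaces."""
    return DONE_TEMPLATE % (
        bundle_type, revs, tip_rev, tip_node, quote(file_url), basename
    )


def done_from_line(line):
    """Inverse of done_to_line: split on whitespace, then decode each field."""
    assert line.startswith("DONE-v1 ")
    _, bundle_type, revs, tip_rev, tip_node, file_url, basename = line.split()
    return (
        bundle_type, int(revs), int(tip_rev), tip_node,
        unquote(file_url), basename,
    )


# Assumed sample record; the URL deliberately contains a space.
record = (
    "zstd-v2", 150, 149, "a" * 40,
    "https://bundles.example/cache/full bundle.hg", "full-bundle.hg",
)
line = done_to_line(*record)
parsed = done_from_line(line)
```

Quoting the URL is what makes the whitespace split safe. The basename is written unquoted, as in the code above, so it is assumed to contain no whitespace.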
659 659
660 660
661 661 def read_auto_gen(repo):
662 662 """read the AUTO_GEN_FILE for <repo> and return a list of Bundle objects"""
663 663 data = repo.vfs.tryread(AUTO_GEN_FILE)
664 664 if not data:
665 665 return []
666 666 return parse_auto_gen(data)
667 667
668 668
669 669 def write_auto_gen(repo, bundles):
670 670 """write a list of Bundle objects into the repo's AUTO_GEN_FILE"""
671 671 assert repo._cb_lock_ref is not None
672 672 data = dumps_auto_gen(bundles)
673 673 with repo.vfs(AUTO_GEN_FILE, mode=b'wb', atomictemp=True) as f:
674 674 f.write(data)
675 675
676 676
677 677 def generate_manifest(bundles):
678 678 """serialize a list of Bundle objects as clonebundles manifest content"""
679 679 bundles = list(bundles)
680 680 bundles.sort(key=lambda b: b.bundle_type)
681 681 lines = []
682 682 for b in bundles:
683 683 lines.append(b"%s\n" % b.manifest_line())
684 684 return b"".join(lines)
685 685
686 686
687 687 def update_ondisk_manifest(repo):
688 688 """update the clonebundle manifest with latest url"""
689 689 with repo.clonebundles_lock():
690 690 bundles = read_auto_gen(repo)
691 691
692 692 per_types = {}
693 693 for b in bundles:
694 694 if not (b.ready and b.valid_for(repo)):
695 695 continue
696 696 current = per_types.get(b.bundle_type)
697 697 if current is not None and current.revs >= b.revs:
698 698 continue
699 699 per_types[b.bundle_type] = b
700 700 manifest = generate_manifest(per_types.values())
701 701 with repo.vfs(
702 702 bundlecaches.CB_MANIFEST_FILE, mode=b"wb", atomictemp=True
703 703 ) as f:
704 704 f.write(manifest)
705 705
706 706
707 707 def update_bundle_list(repo, new_bundles=(), del_bundles=()):
708 708 """modify the repo's AUTO_GEN_FILE
709 709
710 710 This method also regenerates the clone bundle manifest when needed"""
711 711 with repo.clonebundles_lock():
712 712 bundles = read_auto_gen(repo)
713 713 if del_bundles:
714 714 bundles = [b for b in bundles if b not in del_bundles]
715 715 new_bundles = [b for b in new_bundles if b not in bundles]
716 716 bundles.extend(new_bundles)
717 717 write_auto_gen(repo, bundles)
718 718 all_changed = []
719 719 all_changed.extend(new_bundles)
720 720 all_changed.extend(del_bundles)
721 721 if any(b.ready for b in all_changed):
722 722 update_ondisk_manifest(repo)
723 723
724 724
725 725 def cleanup_tmp_bundle(repo, target):
726 726 """remove a GeneratingBundle file and entry"""
727 727 assert not target.ready
728 728 with repo.clonebundles_lock():
729 729 repo.vfs.tryunlink(target.filepath)
730 730 update_bundle_list(repo, del_bundles=[target])
731 731
732 732
733 733 def finalize_one_bundle(repo, target):
734 734 """upload a generated bundle and advertise it in the clonebundles.manifest"""
735 735 with repo.clonebundles_lock():
736 736 bundles = read_auto_gen(repo)
737 737 if target in bundles and target.valid_for(repo):
738 738 result = upload_bundle(repo, target)
739 739 update_bundle_list(repo, new_bundles=[result])
740 740 cleanup_tmp_bundle(repo, target)
741 741
742 742
743 743 def find_outdated_bundles(repo, bundles):
744 744 """finds outdated bundles"""
745 745 olds = []
746 746 per_types = {}
747 747 for b in bundles:
748 748 if not b.valid_for(repo):
749 749 olds.append(b)
750 750 continue
751 751 l = per_types.setdefault(b.bundle_type, [])
752 752 l.append(b)
753 753 for key in sorted(per_types):
754 754 all = per_types[key]
755 755 if len(all) > 1:
756 756 all.sort(key=lambda b: b.revs, reverse=True)
757 757 olds.extend(all[1:])
758 758 return olds
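Reviewer note: the selection rule in `find_outdated_bundles` (every invalid bundle is outdated; among valid bundles of the same type only the one with the most revisions is kept) can be checked with a small self-contained sketch. The `Bundle` tuple and its `valid` flag are stand-ins for `BundleBase` and `valid_for(repo)`, chosen here for illustration.

```python
from collections import namedtuple

# Stand-in for BundleBase; `valid` replaces the valid_for(repo) check.
Bundle = namedtuple("Bundle", "bundle_type revs valid")


def find_outdated(bundles):
    """Return bundles that are invalid or superseded by a larger one."""
    olds = []
    per_type = {}
    for b in bundles:
        if not b.valid:
            olds.append(b)
            continue
        per_type.setdefault(b.bundle_type, []).append(b)
    for key in sorted(per_type):
        # Keep the bundle covering the most revisions, drop the rest.
        group = sorted(per_type[key], key=lambda b: b.revs, reverse=True)
        olds.extend(group[1:])
    return olds


bundles = [
    Bundle("zstd-v2", 100, True),
    Bundle("zstd-v2", 150, True),
    Bundle("gzip-v2", 150, True),
    Bundle("gzip-v2", 90, False),  # e.g. invalidated by a strip
]
outdated = find_outdated(bundles)
```

Here the invalid `gzip-v2` bundle and the smaller `zstd-v2` bundle are reported, while the two 150-revision bundles stay advertised.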
759 759
760 760
761 761 def collect_garbage(repo):
762 762 """finds outdated bundles and get them deleted"""
763 763 with repo.clonebundles_lock():
764 764 bundles = read_auto_gen(repo)
765 765 olds = find_outdated_bundles(repo, bundles)
766 766 for o in olds:
767 767 delete_bundle(repo, o)
768 768 update_bundle_list(repo, del_bundles=olds)
769 769
770 770
771 771 def upload_bundle(repo, bundle):
772 772 """upload the result of a GeneratingBundle and return a GeneratedBundle
773 773
774 774 The upload is done using the `clone-bundles.upload-command`
775 775 """
776 776 inline = repo.ui.configbool(b'clone-bundles', b'auto-generate.serve-inline')
777 777 basename = repo.vfs.basename(bundle.filepath)
778 778 if inline:
779 779 dest_dir = repo.vfs.join(bundlecaches.BUNDLE_CACHE_DIR)
780 780 repo.vfs.makedirs(dest_dir)
781 781 dest = repo.vfs.join(dest_dir, basename)
782 782 util.copyfiles(bundle.filepath, dest, hardlink=True)
783 783 url = bundlecaches.CLONEBUNDLESCHEME + basename
784 784 return bundle.uploaded(url, basename)
785 785 else:
786 786 cmd = repo.ui.config(b'clone-bundles', b'upload-command')
787 787 url = repo.ui.config(b'clone-bundles', b'url-template')
788 788 filepath = procutil.shellquote(bundle.filepath)
789 789 variables = {
790 790 b'HGCB_BUNDLE_PATH': filepath,
791 791 b'HGCB_BUNDLE_BASENAME': basename,
792 792 }
793 793 env = procutil.shellenviron(environ=variables)
794 794 ret = repo.ui.system(cmd, environ=env)
795 795 if ret:
796 796 raise error.Abort(b"command returned status %d: %s" % (ret, cmd))
797 797 url = (
798 798 url.decode('utf8')
799 799 .format(basename=basename.decode('utf8'))
800 800 .encode('utf8')
801 801 )
802 802 return bundle.uploaded(url, basename)
803 803
804 804
805 805 def delete_bundle(repo, bundle):
806 806 """delete a bundle from storage"""
807 807 assert bundle.ready
808 808
809 809 inline = bundle.file_url.startswith(bundlecaches.CLONEBUNDLESCHEME)
810 810
811 811 if inline:
812 812 msg = b'clone-bundles: deleting inline bundle %s\n'
813 813 else:
814 814 msg = b'clone-bundles: deleting bundle %s\n'
815 815 msg %= bundle.basename
816 816 if repo.ui.configbool(b'devel', b'debug.clonebundles'):
817 817 repo.ui.write(msg)
818 818 else:
819 819 repo.ui.debug(msg)
820 820
821 821 if inline:
822 822 inline_path = repo.vfs.join(
823 823 bundlecaches.BUNDLE_CACHE_DIR,
824 824 bundle.basename,
825 825 )
826 826 util.tryunlink(inline_path)
827 827 else:
828 828 cmd = repo.ui.config(b'clone-bundles', b'delete-command')
829 829 variables = {
830 830 b'HGCB_BUNDLE_URL': bundle.file_url,
831 831 b'HGCB_BASENAME': bundle.basename,
832 832 }
833 833 env = procutil.shellenviron(environ=variables)
834 834 ret = repo.ui.system(cmd, environ=env)
835 835 if ret:
836 836 raise error.Abort(b"command returned status %d: %s" % (ret, cmd))
837 837
838 838
839 839 def auto_bundle_needed_actions(repo, bundles, op_id):
840 840 """find the list of bundles that need action
841 841
842 842 returns two lists: the RequestedBundle objects that need to be generated
843 843 and uploaded, and the outdated bundles that need to be deleted."""
844 844 create_bundles = []
845 845 delete_bundles = []
846 846 repo = repo.filtered(b"immutable")
847 847 targets = repo.ui.configlist(b'clone-bundles', b'auto-generate.formats')
848 848 ratio = float(
849 849 repo.ui.config(b'clone-bundles', b'trigger.below-bundled-ratio')
850 850 )
851 851 abs_revs = repo.ui.configint(b'clone-bundles', b'trigger.revs')
852 852 revs = len(repo.changelog)
853 853 generic_data = {
854 854 'revs': revs,
855 855 'head_revs': repo.changelog.headrevs(),
856 856 'tip_rev': repo.changelog.tiprev(),
857 857 'tip_node': node.hex(repo.changelog.tip()),
858 858 'op_id': op_id,
859 859 }
860 860 for t in targets:
861 861 t = bundlecaches.parsebundlespec(repo, t, strict=False).as_spec()
862 862 if new_bundle_needed(repo, bundles, ratio, abs_revs, t, revs):
863 863 data = generic_data.copy()
864 864 data['bundle_type'] = t
865 865 b = RequestedBundle(**data)
866 866 create_bundles.append(b)
867 867 delete_bundles.extend(find_outdated_bundles(repo, bundles))
868 868 return create_bundles, delete_bundles
869 869
870 870
871 871 def new_bundle_needed(repo, bundles, ratio, abs_revs, bundle_type, revs):
872 872 """consider the current cached content and trigger new bundles if needed"""
873 873 threshold = max((revs * ratio), (revs - abs_revs))
874 874 for b in bundles:
875 875 if not b.valid_for(repo) or b.bundle_type != bundle_type:
876 876 continue
877 877 if b.revs > threshold:
878 878 return False
879 879 return True
880 880
881 881
882 882 def start_one_bundle(repo, bundle):
883 883 """start the generation of a single bundle file
884 884
885 885 the `bundle` argument should be a RequestedBundle object.
886 886
887 887 This data is passed to the `debugmakeclonebundles` command "as is".
888 888 """
889 889 data = util.pickle.dumps(bundle)
890 890 cmd = [procutil.hgexecutable(), b'--cwd', repo.path, INTERNAL_CMD]
891 891 env = procutil.shellenviron()
892 892 msg = b'clone-bundles: starting bundle generation: %s\n'
893 893 stdout = None
894 894 stderr = None
895 895 waits = []
896 896 record_wait = None
897 897 if repo.ui.configbool(b'devel', b'debug.clonebundles'):
898 898 stdout = procutil.stdout
899 899 stderr = procutil.stderr
900 900 repo.ui.write(msg % bundle.bundle_type)
901 901 record_wait = waits.append
902 902 else:
903 903 repo.ui.debug(msg % bundle.bundle_type)
904 904 bg = procutil.runbgcommand
905 905 bg(
906 906 cmd,
907 907 env,
908 908 stdin_bytes=data,
909 909 stdout=stdout,
910 910 stderr=stderr,
911 911 record_wait=record_wait,
912 912 )
913 913 for f in waits:
914 914 f()
915 915
916 916
917 917 INTERNAL_CMD = b'debug::internal-make-clone-bundles'
918 918
919 919
920 920 @command(INTERNAL_CMD, [], b'')
921 921 def debugmakeclonebundles(ui, repo):
922 922 """Internal command to auto-generate clone bundles"""
923 923 requested_bundle = util.pickle.load(procutil.stdin)
924 924 procutil.stdin.close()
925 925
926 926 collect_garbage(repo)
927 927
928 928 fname = requested_bundle.suggested_filename
929 929 fpath = repo.vfs.makedirs(b'tmp-bundles')
930 930 fpath = repo.vfs.join(b'tmp-bundles', fname)
931 931 bundle = requested_bundle.generating(fpath)
932 932 update_bundle_list(repo, new_bundles=[bundle])
933 933
934 934 requested_bundle.generate_bundle(repo, fpath)
935 935
936 936 repo.invalidate()
937 937 finalize_one_bundle(repo, bundle)
938 938
939 939
940 940 def make_auto_bundler(source_repo):
941 941 reporef = weakref.ref(source_repo)
942 942
943 943 def autobundle(tr):
944 944 repo = reporef()
945 945 assert repo is not None
946 946 bundles = read_auto_gen(repo)
947 947 new, __ = auto_bundle_needed_actions(repo, bundles, b"%d_txn" % id(tr))
948 948 for data in new:
949 949 start_one_bundle(repo, data)
950 950 return None
951 951
952 952 return autobundle
953 953
954 954
955 955 def reposetup(ui, repo):
956 956 """install the two pieces needed for automatic clonebundle generation
957 957
958 958 - add a "post-close" hook that fires bundling when needed
959 959 - introduce a clone-bundle lock to let multiple processes meddle with the
960 960 state files.
961 961 """
962 962 if not repo.local():
963 963 return
964 964
965 965 class autobundlesrepo(repo.__class__):
966 966 def transaction(self, *args, **kwargs):
967 967 tr = super(autobundlesrepo, self).transaction(*args, **kwargs)
968 968 enabled = repo.ui.configbool(
969 969 b'clone-bundles',
970 970 b'auto-generate.on-change',
971 971 )
972 972 targets = repo.ui.configlist(
973 973 b'clone-bundles', b'auto-generate.formats'
974 974 )
975 975 if enabled and targets:
976 976 tr.addpostclose(CAT_POSTCLOSE, make_auto_bundler(self))
977 977 return tr
978 978
979 979 @localrepo.unfilteredmethod
980 980 def clonebundles_lock(self, wait=True):
981 981 '''Lock the repository file related to clone bundles'''
982 982 if not util.safehasattr(self, '_cb_lock_ref'):
983 983 self._cb_lock_ref = None
984 984 l = self._currentlock(self._cb_lock_ref)
985 985 if l is not None:
986 986 l.lock()
987 987 return l
988 988
989 989 l = self._lock(
990 990 vfs=self.vfs,
991 991 lockname=b"clonebundleslock",
992 992 wait=wait,
993 993 releasefn=None,
994 994 acquirefn=None,
995 995 desc=_(b'repository %s') % self.origroot,
996 996 )
997 997 self._cb_lock_ref = weakref.ref(l)
998 998 return l
999 999
1000 1000 repo._wlockfreeprefix.add(AUTO_GEN_FILE)
1001 1001 repo._wlockfreeprefix.add(bundlecaches.CB_MANIFEST_FILE)
1002 1002 repo.__class__ = autobundlesrepo
1003 1003
1004 1004
1005 1005 @command(
1006 1006 b'admin::clone-bundles-refresh',
1007 1007 [
1008 1008 (
1009 1009 b'',
1010 1010 b'background',
1011 1011 False,
1012 1012 _(b'start bundle generation in the background'),
1013 1013 ),
1014 1014 ],
1015 1015 b'',
1016 1016 )
1017 1017 def cmd_admin_clone_bundles_refresh(
1018 1018 ui,
1019 1019 repo: localrepo.localrepository,
1020 1020 background=False,
1021 1021 ):
1022 1022 """generate clone bundles according to the configuration
1023 1023
1024 1024 This runs the logic for automatic generation, removing outdated bundles and
1025 1025 generating new ones if necessary. See :hg:`help -e clone-bundles` for
1026 1026 details about how to configure this feature.
1027 1027 """
1028 1028 debug = repo.ui.configbool(b'devel', b'debug.clonebundles')
1029 1029 bundles = read_auto_gen(repo)
1030 1030 op_id = b"%d_acbr" % os.getpid()
1031 1031 create, delete = auto_bundle_needed_actions(repo, bundles, op_id)
1032 1032
1033 1033 # if some bundles are scheduled for creation in the background, they will
1034 1034 # deal with garbage collection too, so no need to synchronously do it.
1035 1035 #
1036 1036 # However if no bundles are scheduled for creation, we need to explicitly do
1037 1037 # it here.
1038 1038 if not (background and create):
1039 1039 # we clean up outdated bundles before generating new ones to keep the
1040 1040 # last two versions of the bundle around for a while and avoid having to
1041 1041 # deal with clients that just got served a manifest.
1042 1042 for o in delete:
1043 1043 delete_bundle(repo, o)
1044 1044 update_bundle_list(repo, del_bundles=delete)
1045 1045
1046 1046 if create:
1047 1047 fpath = repo.vfs.makedirs(b'tmp-bundles')
1048 1048
1049 1049 if background:
1050 1050 for requested_bundle in create:
1051 1051 start_one_bundle(repo, requested_bundle)
1052 1052 else:
1053 1053 for requested_bundle in create:
1054 1054 if debug:
1055 1055 msg = b'clone-bundles: starting bundle generation: %s\n'
1056 1056 repo.ui.write(msg % requested_bundle.bundle_type)
1057 1057 fname = requested_bundle.suggested_filename
1058 1058 fpath = repo.vfs.join(b'tmp-bundles', fname)
1059 1059 generating_bundle = requested_bundle.generating(fpath)
1060 1060 update_bundle_list(repo, new_bundles=[generating_bundle])
1061 1061 requested_bundle.generate_bundle(repo, fpath)
1062 1062 result = upload_bundle(repo, generating_bundle)
1063 1063 update_bundle_list(repo, new_bundles=[result])
1064 1064 update_ondisk_manifest(repo)
1065 1065 cleanup_tmp_bundle(repo, generating_bundle)
1066 1066
1067 1067
1068 1068 @command(b'admin::clone-bundles-clear', [], b'')
1069 1069 def cmd_admin_clone_bundles_clear(ui, repo: localrepo.localrepository):
1070 1070 """remove existing clone bundle caches
1071 1071
1072 1072 See `hg help admin::clone-bundles-refresh` for details on how to regenerate
1073 1073 them.
1074 1074
1075 1075 This command will only affect bundles currently available; it will not
1076 1076 affect bundles being asynchronously generated.
1077 1077 """
1078 1078 bundles = read_auto_gen(repo)
1079 1079 delete = [b for b in bundles if b.ready]
1080 1080 for o in delete:
1081 1081 delete_bundle(repo, o)
1082 1082 update_bundle_list(repo, del_bundles=delete)
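
The selection logic of `new_bundle_needed` and `find_outdated_bundles` can be tried out in isolation. The block below is a minimal sketch under stated assumptions, not the extension's code: `Bundle` is a hypothetical stand-in that keeps only the `bundle_type` and `revs` fields, the `valid_for(repo)` check is left out, bundle types are plain strings rather than bytes, and the sample values are made up for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for the extension's bundle objects (assumption:
# only the two fields used by the selection logic are modelled).
Bundle = namedtuple("Bundle", ["bundle_type", "revs"])


def new_bundle_needed(bundles, ratio, abs_revs, bundle_type, revs):
    """Return True when no bundle of this type covers enough revisions."""
    # A bundle is recent enough if it covers more than `ratio` of the
    # repository and is fewer than `abs_revs` revisions behind.
    threshold = max(revs * ratio, revs - abs_revs)
    for b in bundles:
        if b.bundle_type != bundle_type:
            continue
        if b.revs > threshold:
            return False
    return True


def find_outdated_bundles(bundles):
    """Keep the bundle with the most revisions per type; return the rest."""
    per_type = {}
    for b in bundles:
        per_type.setdefault(b.bundle_type, []).append(b)
    olds = []
    for key in sorted(per_type):
        candidates = sorted(per_type[key], key=lambda b: b.revs, reverse=True)
        olds.extend(candidates[1:])
    return olds


# Made-up sample data: a repository with 1000 revisions.
bundles = [
    Bundle("zstd-v2", 900),
    Bundle("zstd-v2", 990),
    Bundle("gzip-v2", 500),
]
# threshold = max(1000 * 0.95, 1000 - 100) = 950
print(new_bundle_needed(bundles, 0.95, 100, "zstd-v2", 1000))
print(new_bundle_needed(bundles, 0.95, 100, "gzip-v2", 1000))
print(find_outdated_bundles(bundles))
```

With these values the `zstd-v2` bundle at 990 revisions is above the threshold, so no new one is requested, while the `gzip-v2` bundle at 500 revisions is not. The older `zstd-v2` bundle is the only one reported as outdated, because a newer bundle of the same type exists.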