sparse-revlog: set max delta chain length to one thousand

The new snapshot system used in the sparse-revlog case has given us a
small size benefit so far. However, its most important property is that
it gracefully handles a harder limit on delta chain length.

Long delta chains have a very detrimental impact on read (and write)
performance in revlog. Being able to shorten them provides a great
boost. However, shortening delta chains used to result in a
significantly lower compression ratio. The intermediate snapshots
effectively suppress most of this effect (even all of it in some cases).

# Effect on the test repository

The repository we use for tests is not "realistic" but can still show
this in action using an unreasonably low chain limit.

Limiting the chain length shows a sizeable increase, but it stays under
control: +6% for limit=15; +15% for limit=10. Without the snapshot
system the increase is significantly bigger: +45% for limit=15; +80%
for limit=10.

Even when slightly larger than without a delta chain limit, the
resulting size is still smaller than before we started doing snapshots.

Here is a table for comparison. (Since the repository is not branchy,
the initial sparse-revlog version does not bring much benefit compared
to the non-sparse one):

    chain length limit    |       none |    limit=15 |    limit=10 |
    without sparse-revlog | 62 818 987 | 112 664 615 | 131 222 574 |
    without snapshot      | 74 365 490 | 108 211 410 | 133 857 764 |
    with snapshot         | 59 230 936 |  63 002 924 |  68 415 329 |

# Effect On Real Life Repositories

The series provides significant benefits on all kinds of repositories.

Using `hg debugupgraderepo -o redeltaparent --run`, we recomputed delta
chains for various repositories with different settings:

- delta chain length: unlimited or 1000 limit
- sparse-revlog: enabled or disabled
- this series: applied or not applied

We can observe multiple types of effect:

- On very branchy repositories:
  * The delta chain limit has a low impact on the repo size.
  * Intermediate snapshots greatly reduce manifest size:
    - pypy: -80%
    - netbeans: -95%
  * The delta chain limit is effective, without a size impact:
    - netbeans average: 613 -> 282
    - private #1 average: 1 068 -> 307

- On more linear repositories:
  * Intermediate snapshots limit the impact of the delta chain limit:
    - mozilla: without the series: +360%
               with the series:    +25%
  * The delta chain limit provides a large improvement:
    - mozilla's average chain length: unlimited: 15 338
                                      limited:      469
  * Despite the chain length limit, the manifest size is reduced:
    - mercurial: -25%
    - mozilla: -30%

It is clear that the use of chains of intermediate snapshots provides
large benefits both in storage size and in delta chain quality. We
should now switch our effort toward making sure the write performance
is acceptable. Then, `sparse-revlog` will be a suitable format for all
new repositories.

# Raw Statistic

* no-sparse: general delta repository not using sparse-revlog
* no-snapshot: sparse-revlog repository not using this series
* snapshot: sparse-revlog repository using this series

mercurial

    Manifest Size:

        limit       | none        | 1000
        ------------|-------------|------------
        no-sparse   |   8 021 373 |   8 199 366
        no-snapshot |   8 103 561 |   8 259 719
        snapshot    |   6 137 116 |   6 126 433

    Manifest Chain length data

        limit       ||       none        ||       1000        ||
        value       || average | max     || average | max     ||
        ------------||---------|---------||---------|---------||
        no-sparse   ||     307 |    1456 ||     279 |    1000 ||
        no-snapshot ||     312 |    1456 ||     283 |    1000 ||
        snapshot    ||     248 |    1208 ||     241 |    1000 ||

    Full Store Size

        limit       | none        | 1000
        ------------|-------------|------------
        no-sparse   |  51 013 198 |  51 201 574
        no-snapshot |  50 930 795 |  51 141 006
        snapshot    |  48 072 037 |  48 093 572

pypy

    Manifest Size:

        limit       | none        | 1000
        ------------|-------------|------------
        no-sparse   | 193 987 784 | 193 987 784
        no-snapshot | 163 171 745 | 163 312 229
        snapshot    |  34 605 900 |  34 600 750

    Manifest Chain length data

        limit       ||       none        ||       1000        ||
        value       || average | max     || average | max     ||
        ------------||---------|---------||---------|---------||
        no-sparse   ||     101 |     692 ||     101 |     692 ||
        no-snapshot ||     151 |    1307 ||     148 |    1000 ||
        snapshot    ||     128 |    1309 ||     125 |    1000 ||

    Full Store Size

        limit       | none        | 1000
        ------------|-------------|------------
        no-sparse   | 495 931 473 | 495 931 473
        no-snapshot | 465 441 017 | 465 581 501
        snapshot    | 355 467 301 | 355 472 451

Mozilla

    Manifest Size:

        limit       | none           | 1000
        ------------|----------------|---------------
        no-sparse   |    416 757 148 |  1 869 009 668
        no-snapshot |    401 592 370 |  1 843 493 795
        snapshot    |    224 359 521 |    284 615 500

    Manifest Chain length data

        limit       ||       none        ||       1000        ||
        value       || average | max     || average | max     ||
        ------------||---------|---------||---------|---------||
        no-sparse   ||  15 333 |  58 980 ||     468 |   1 000 ||
        no-snapshot ||  15 336 |  58 980 ||     469 |   1 000 ||
        snapshot    ||  15 338 |  58 983 ||     469 |   1 000 ||

    Full Store Size

        limit       | none           | 1000
        ------------|----------------|---------------
        no-sparse   |  2 712 477 887 |  4 164 995 451
        no-snapshot |  2 698 887 835 |  4 141 054 304
        snapshot    |  2 518 130 385 |  2 578 587 596

Netbeans

    Manifest Size:

        limit       | none           | 1000
        ------------|----------------|---------------
        no-sparse   |  4 766 794 101 |  4 870 642 687
        no-snapshot |  4 334 806 082 |  4 428 681 309
        snapshot    |    232 659 666 |    240 330 665

    Manifest Chain length data

        limit       ||       none        ||       1000        ||
        value       || average | max     || average | max     ||
        ------------||---------|---------||---------|---------||
        no-sparse   ||     597 |   6 802 ||     254 |   1 000 ||
        no-snapshot ||     648 |   6 802 ||     305 |   1 000 ||
        snapshot    ||     613 |   6 804 ||     282 |   1 000 ||

    Full Store Size

        limit       | none           | 1000
        ------------|----------------|---------------
        no-sparse   |  5 807 347 998 |  5 911 196 584
        no-snapshot |  5 375 398 602 |  5 469 273 829
        snapshot    |  1 282 519 928 |  1 290 190 927

Private repo #1

    Manifest Size:

        limit       | none            | 1000
        ------------|-----------------|----------------
        no-sparse   |  41 389 010 840 |  41 398 162 091
        no-snapshot |   9 737 319 435 |  10 223 773 150
        snapshot    |     744 215 807 |     747 961 822

    Manifest Chain length data

        limit       ||       none        ||       1000        ||
        value       || average | max     || average | max     ||
        ------------||---------|---------||---------|---------||
        no-sparse   ||     245 |   8 885 ||      81 |   1 000 ||
        no-snapshot ||   1 225 |   8 885 ||     336 |   1 000 ||
        snapshot    ||   1 068 |   7 909 ||     307 |   1 000 ||

    Full Store Size

        limit       | none            | 1000
        ------------|-----------------|----------------
        no-sparse   |  49 646 065 126 |  49 655 216 377
        no-snapshot |  17 924 862 856 |  18 411 316 571
        snapshot    |   9 009 024 710 |   9 012 770 725

Private repo #2

    We currently have less data available for that repository.

    * Before is a sparse-revlog repository without this series
    * After is a sparse-revlog repository with this series + 1000
      chain limit

    Manifest Size:

        Before: 1 531 485 040 bytes
        After:  1 091 422 451 bytes

    Manifest Chain:

        Before: 2 218 avg; 6 575 Max
        After:    442 avg; 1 000 Max

    Full Store Size

        Before: 15 203 955 615
        After:   8 207 180 693
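
The percentages quoted in the summary can be rechecked against the raw statistics. The following is a minimal Python sketch, not part of the original changeset; the helper name `pct_change` and the dictionary layout are illustrative assumptions, and the byte counts are copied from the Mozilla "Manifest Size" table. It yields values close to the rounded "+360%" and "+25%" figures quoted for mozilla.

```python
def pct_change(before: int, after: int) -> float:
    """Relative size change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Mozilla manifest sizes in bytes, copied from the raw statistics.
# Each tuple is (no chain limit, chain limit of 1000).
mozilla_manifest = {
    "no-snapshot": (401_592_370, 1_843_493_795),
    "snapshot": (224_359_521, 284_615_500),
}

for name, (unlimited, limited) in mozilla_manifest.items():
    # Cost of enabling the chain limit, with and without the series.
    print(f"{name}: {pct_change(unlimited, limited):+.1f}%")
```

The same helper can be applied to any other pair of cells in the tables, for example to compare the `no-snapshot` and `snapshot` rows of a single column.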

File last commit:

r35166:27196b7f stable
r39542:b66ea3fc default
subrepos.txt
Subrepositories let you nest external repositories or projects into a
parent Mercurial repository, and make commands operate on them as a
group.

Mercurial currently supports Mercurial, Git, and Subversion
subrepositories.

Subrepositories are made of three components:

1. Nested repository checkouts. They can appear anywhere in the
   parent working directory.

2. Nested repository references. They are defined in ``.hgsub``, which
   should be placed in the root of the working directory, and
   tell where the subrepository checkouts come from. Mercurial
   subrepositories are referenced like::

     path/to/nested = https://example.com/nested/repo/path

   Git and Subversion subrepos are also supported::

     path/to/nested = [git]git://example.com/nested/repo/path
     path/to/nested = [svn]https://example.com/nested/trunk/path

   where ``path/to/nested`` is the checkout location relative to the
   parent Mercurial root, and ``https://example.com/nested/repo/path``
   is the source repository path. The source can also reference a
   filesystem path.

   Note that ``.hgsub`` does not exist by default in Mercurial
   repositories; you have to create it and add it to the parent
   repository before using subrepositories.

3. Nested repository states. They are defined in ``.hgsubstate``, which
   is placed in the root of the working directory, and
   capture whatever information is required to restore the
   subrepositories to the state they were in when committed in a
   parent repository changeset. Mercurial automatically records the
   nested repositories' states when committing in the parent
   repository.

   .. note::

      The ``.hgsubstate`` file should not be edited manually.
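
To make the ``.hgsub`` entry syntax described above concrete, here is a minimal Python sketch that splits entries into checkout path, subrepository kind, and source. It is an illustration only, not Mercurial's own parser: the names ``SubrepoRef`` and ``parse_hgsub`` are hypothetical, the sample paths are made up, treating ``#`` lines as comments is an assumption, and anything other than plain ``path = source`` lines is rejected.

```python
import re
from typing import List, NamedTuple


class SubrepoRef(NamedTuple):
    path: str    # checkout location relative to the parent root
    kind: str    # "hg" when no "[kind]" prefix is given, else e.g. "git", "svn"
    source: str  # where the checkout comes from


# Optional "[kind]" prefix in front of the source, as in "[git]git://...".
_KIND_PREFIX = re.compile(r"^\[(?P<kind>[^\]]+)\](?P<source>.*)$")


def parse_hgsub(text: str) -> List[SubrepoRef]:
    """Parse plain 'path = source' entries (sketch, see assumptions above)."""
    refs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and (assumed) comment lines
        path, sep, source = line.partition("=")
        if not sep:
            raise ValueError(f"not a 'path = source' entry: {line!r}")
        path, source = path.strip(), source.strip()
        match = _KIND_PREFIX.match(source)
        if match:
            kind, source = match.group("kind"), match.group("source").strip()
        else:
            kind = "hg"  # no prefix: a Mercurial subrepository
        refs.append(SubrepoRef(path, kind, source))
    return refs


sample = """\
libs/a = https://example.com/nested/repo/path
libs/b = [git]git://example.com/nested/repo/path
libs/c = [svn]https://example.com/nested/trunk/path
"""

for ref in parse_hgsub(sample):
    print(ref.path, ref.kind, ref.source)
```

Only the first ``=`` on a line separates the path from the source, so sources that themselves contain ``=`` are kept whole.
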

Adding a Subrepository
======================

If ``.hgsub`` does not exist, create it and add it to the parent
repository. Clone or check out the external project where you want it
to live in the parent repository. Edit ``.hgsub`` and add the
subrepository entry as described above. At this point, the
subrepository is tracked and the next commit will record its state in
``.hgsubstate`` and bind it to the committed changeset.

Synchronizing a Subrepository
=============================

Subrepos do not automatically track the latest changeset of their
sources. Instead, they are updated to the changeset that corresponds
to the changeset checked out in the top-level repository. This is so
developers always get a consistent set of compatible code and
libraries when they update.

Thus, updating subrepos is a manual process. Simply check out the
target subrepo at the desired revision, test in the top-level repo,
then commit in the parent repository to record the new combination.

Deleting a Subrepository
========================

To remove a subrepository from the parent repository, delete its
reference from ``.hgsub``, then remove its files.

Interaction with Mercurial Commands
===================================

:add: add does not recurse in subrepos unless -S/--subrepos is
    specified. However, if you specify the full path of a file in a
    subrepo, it will be added even without -S/--subrepos specified.
    Subversion subrepositories are currently silently
    ignored.

:addremove: addremove does not recurse into subrepos unless
    -S/--subrepos is specified. However, if you specify the full
    path of a directory in a subrepo, addremove will be performed on
    it even without -S/--subrepos being specified. Git and
    Subversion subrepositories will print a warning and continue.

:archive: archive does not recurse in subrepositories unless
    -S/--subrepos is specified.

:cat: Git subrepositories only support exact file matches.
    Subversion subrepositories are currently ignored.

:commit: commit creates a consistent snapshot of the state of the
    entire project and its subrepositories. If any subrepositories
    have been modified, Mercurial will abort. Mercurial can be made
    to instead commit all modified subrepositories by specifying
    -S/--subrepos, or setting "ui.commitsubrepos=True" in a
    configuration file (see :hg:`help config`). After there are no
    longer any modified subrepositories, it records their state and
    finally commits it in the parent repository. The --addremove
    option also honors the -S/--subrepos option. However, Git and
    Subversion subrepositories will print a warning and abort.

:diff: diff does not recurse in subrepos unless -S/--subrepos is
    specified. Changes are displayed as usual, on the subrepository
    elements. Subversion subrepositories are currently silently ignored.

:files: files does not recurse into subrepos unless -S/--subrepos is
    specified. However, if you specify the full path of a file or
    directory in a subrepo, it will be displayed even without
    -S/--subrepos being specified. Git and Subversion subrepositories
    are currently silently ignored.

:forget: forget currently only handles exact file matches in subrepos.
    Git and Subversion subrepositories are currently silently ignored.

:incoming: incoming does not recurse in subrepos unless -S/--subrepos
    is specified. Git and Subversion subrepositories are currently
    silently ignored.

:outgoing: outgoing does not recurse in subrepos unless -S/--subrepos
    is specified. Git and Subversion subrepositories are currently
    silently ignored.

:pull: pull is not recursive since it is not clear what to pull prior
    to running :hg:`update`. Listing and retrieving all
    subrepository changes referenced by the parent repository's pulled
    changesets is expensive at best, impossible in the Subversion
    case.

:push: Mercurial will automatically push all subrepositories first
    when the parent repository is being pushed. This ensures new
    subrepository changes are available when referenced by top-level
    repositories. Push is a no-op for Subversion subrepositories.

:serve: serve does not recurse into subrepositories unless
    -S/--subrepos is specified. Git and Subversion subrepositories
    are currently silently ignored.

:status: status does not recurse into subrepositories unless
    -S/--subrepos is specified. Subrepository changes are displayed as
    regular Mercurial changes on the subrepository
    elements. Subversion subrepositories are currently silently
    ignored.

:remove: remove does not recurse into subrepositories unless
    -S/--subrepos is specified. However, if you specify a file or
    directory path in a subrepo, it will be removed even without
    -S/--subrepos. Git and Subversion subrepositories are currently
    silently ignored.

:update: update restores the subrepos to the state they were
    originally committed in the target changeset. If the recorded
    changeset is not available in the current subrepository, Mercurial
    will pull it in first before updating. This means that updating
    can require network access when using subrepositories.

Remapping Subrepositories Sources
=================================

A subrepository source location may change during a project's life,
invalidating references stored in the parent repository history. To
fix this, rewriting rules can be defined in the parent repository's
``hgrc`` file or in the Mercurial configuration. See the ``[subpaths]``
section in hgrc(5) for more details.
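
As an illustration of what such a rewriting rule does, the Python sketch below applies ``pattern = replacement`` pairs to a source location, where the pattern is a regular expression and the replacement may refer to matched groups. This is a simplified model rather than Mercurial's implementation; the function name ``remap_source`` and the example hosts are hypothetical, and hgrc(5) remains the reference for the exact matching rules.

```python
import re
from typing import List, Tuple


def remap_source(source: str, rules: List[Tuple[str, str]]) -> str:
    """Apply each (pattern, replacement) rule once, in order (sketch)."""
    for pattern, replacement in rules:
        # count=1: rewrite only the first match of each rule.
        source = re.sub(pattern, replacement, source, count=1)
    return source


# Hypothetical rule: the project moved from old.example.com to
# new.example.com, under an extra "hg/" prefix.
rules = [
    (r"https://old\.example\.com/(.*)", r"https://new.example.com/hg/\1"),
]

print(remap_source("https://old.example.com/nested/repo/path", rules))
```

Sources that match no rule are returned unchanged, so references that were never moved keep working.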