copies-rust: move is_ancestor caching within the rust code

Now that the OrdMap merging is fast, smaller things start to matter.

We move the caching of `is_ancestor` calls within the Rust code. This avoids round-trips to Python and helps us shave more time off our slower cases:

Repo Cases Source-Rev Dest-Rev Old-Time New-Time Difference Factor
------------------------------------------------------------------------------------------------------------------------------------
pypy x0000_revs_x_added_0_copies d1defd0dc478 c9cb1334cc78 : 2.780174 s, 2.137894 s, -0.642280 s, × 0.7690
mozilla-try x0000_revs_xx000_added_x000_copies 89294cd501d9 7ccb2fc7ccb5 : 9.843481 s, 8.100385 s, -1.743096 s, × 0.8229

Note: I would happily have used native code for the ancestors computation; however, I failed (without trying hard) to create a Rust version that goes as fast as the current C version.

Below are full tables for:

- this change compared to the previous change
- this change compared to filelog performance

Repo Cases Source-Rev Dest-Rev Old-Time New-Time Difference Factor
------------------------------------------------------------------------------------------------------------------------------------
mercurial x_revs_x_added_0_copies ad6b123de1c7 39cfcef4f463 : 0.000049 s, 0.000047 s, -0.000002 s, × 0.9592
mercurial x_revs_x_added_x_copies 2b1c78674230 0c1d10351869 : 0.000182 s, 0.000181 s, -0.000001 s, × 0.9945
mercurial x000_revs_x000_added_x_copies 81f8ff2a9bf2 dd3267698d84 : 0.005872 s, 0.005852 s, -0.000020 s, × 0.9966
pypy x_revs_x_added_0_copies aed021ee8ae8 099ed31b181b : 0.000229 s, 0.000229 s, +0.000000 s, × 1.0000
pypy x_revs_x000_added_0_copies 4aa4e1f8e19a 359343b9ac0e : 0.000058 s, 0.000058 s, +0.000000 s, × 1.0000
pypy x_revs_x_added_x_copies ac52eb7bbbb0 72e022663155 : 0.000148 s, 0.000146 s, -0.000002 s, × 0.9865
pypy x_revs_x00_added_x_copies c3b14617fbd7 ace7255d9a26 : 0.001205 s, 0.001206 s, +0.000001 s, × 1.0008
pypy x_revs_x000_added_x000_copies df6f7a526b60 a83dc6a2d56f : 0.025662 s, 0.025275 s, -0.000387 s, × 0.9849
pypy x000_revs_xx00_added_0_copies 89a76aede314 2f22446ff07e : 0.080113 s, 0.080303 s, +0.000190 s, × 1.0024
pypy x000_revs_x000_added_x_copies 8a3b5bfd266e 2c68e87c3efe : 0.153030 s, 0.152641 s, -0.000389 s, × 0.9975
pypy x000_revs_x000_added_x000_copies 89a76aede314 7b3dda341c84 : 0.098774 s, 0.099107 s, +0.000333 s, × 1.0034
pypy x0000_revs_x_added_0_copies d1defd0dc478 c9cb1334cc78 : 2.780174 s, 2.137894 s, -0.642280 s, × 0.7690
pypy x0000_revs_xx000_added_0_copies bf2c629d0071 4ffed77c095c : 0.022218 s, 0.022202 s, -0.000016 s, × 0.9993
pypy x0000_revs_xx000_added_x000_copies 08ea3258278e d9fa043f30c0 : 0.252125 s, 0.228946 s, -0.023179 s, × 0.9081
netbeans x_revs_x_added_0_copies fb0955ffcbcd a01e9239f9e7 : 0.000186 s, 0.000186 s, +0.000000 s, × 1.0000
netbeans x_revs_x000_added_0_copies 6f360122949f 20eb231cc7d0 : 0.000133 s, 0.000133 s, +0.000000 s, × 1.0000
netbeans x_revs_x_added_x_copies 1ada3faf6fb6 5a39d12eecf4 : 0.000320 s, 0.000320 s, +0.000000 s, × 1.0000
netbeans x_revs_x00_added_x_copies 35be93ba1e2c 9eec5e90c05f : 0.001336 s, 0.001339 s, +0.000003 s, × 1.0022
netbeans x000_revs_xx00_added_0_copies eac3045b4fdd 51d4ae7f1290 : 0.015573 s, 0.015694 s, +0.000121 s, × 1.0078
netbeans x000_revs_x000_added_x_copies e2063d266acd 6081d72689dc : 0.018667 s, 0.018457 s, -0.000210 s, × 0.9888
netbeans x000_revs_x000_added_x000_copies ff453e9fee32 411350406ec2 : 0.112534 s, 0.111691 s, -0.000843 s, × 0.9925
netbeans x0000_revs_xx000_added_x000_copies 588c2d1ced70 1aad62e59ddd : 1.231869 s, 1.166017 s, -0.065852 s, × 0.9465
mozilla-central x_revs_x_added_0_copies 3697f962bb7b 7015fcdd43a2 : 0.000197 s, 0.000197 s, +0.000000 s, × 1.0000
mozilla-central x_revs_x000_added_0_copies dd390860c6c9 40d0c5bed75d : 0.000637 s, 0.000626 s, -0.000011 s, × 0.9827
mozilla-central x_revs_x_added_x_copies 8d198483ae3b 14207ffc2b2f : 0.000303 s, 0.000303 s, +0.000000 s, × 1.0000
mozilla-central x_revs_x00_added_x_copies 98cbc58cc6bc 446a150332c3 : 0.001663 s, 0.001679 s, +0.000016 s, × 1.0096
mozilla-central x_revs_x000_added_x000_copies 3c684b4b8f68 0a5e72d1b479 : 0.007008 s, 0.006947 s, -0.000061 s, × 0.9913
mozilla-central x_revs_x0000_added_x0000_copies effb563bb7e5 c07a39dc4e80 : 0.127385 s, 0.133070 s, +0.005685 s, × 1.0446
mozilla-central x000_revs_xx00_added_0_copies 6100d773079a 04a55431795e : 0.008740 s, 0.008705 s, -0.000035 s, × 0.9960
mozilla-central x000_revs_x000_added_x_copies 9f17a6fc04f9 2d37b966abed : 0.005783 s, 0.005913 s, +0.000130 s, × 1.0225
mozilla-central x000_revs_x000_added_x000_copies 7c97034feb78 4407bd0c6330 : 0.102184 s, 0.101373 s, -0.000811 s, × 0.9921
mozilla-central x0000_revs_xx000_added_0_copies 9eec5917337d 67118cc6dcad : 0.046220 s, 0.046526 s, +0.000306 s, × 1.0066
mozilla-central x0000_revs_xx000_added_x000_copies f78c615a656c 96a38b690156 : 0.315271 s, 0.313954 s, -0.001317 s, × 0.9958
mozilla-central x00000_revs_x0000_added_x0000_copies 6832ae71433c 4c222a1d9a00 : 3.478747 s, 3.367395 s, -0.111352 s, × 0.9680
mozilla-central x00000_revs_x00000_added_x000_copies 76caed42cf7c 1daa622bbe42 : 4.766435 s, 4.691820 s, -0.074615 s, × 0.9843
mozilla-try x_revs_x_added_0_copies aaf6dde0deb8 9790f499805a : 0.001214 s, 0.001199 s, -0.000015 s, × 0.9876
mozilla-try x_revs_x000_added_0_copies d8d0222927b4 5bb8ce8c7450 : 0.001221 s, 0.001216 s, -0.000005 s, × 0.9959
mozilla-try x_revs_x_added_x_copies 092fcca11bdb 936255a0384a : 0.000613 s, 0.000613 s, +0.000000 s, × 1.0000
mozilla-try x_revs_x00_added_x_copies b53d2fadbdb5 017afae788ec : 0.001904 s, 0.001906 s, +0.000002 s, × 1.0011
mozilla-try x_revs_x000_added_x000_copies 20408ad61ce5 6f0ee96e21ad : 0.093000 s, 0.092766 s, -0.000234 s, × 0.9975
mozilla-try x_revs_x0000_added_x0000_copies effb563bb7e5 c07a39dc4e80 : 0.132194 s, 0.136074 s, +0.003880 s, × 1.0294
mozilla-try x000_revs_xx00_added_0_copies 6100d773079a 04a55431795e : 0.009069 s, 0.009067 s, -0.000002 s, × 0.9998
mozilla-try x000_revs_x000_added_x_copies 9f17a6fc04f9 2d37b966abed : 0.006169 s, 0.006243 s, +0.000074 s, × 1.0120
mozilla-try x000_revs_x000_added_x000_copies 1346fd0130e4 4c65cbdabc1f : 0.115540 s, 0.114463 s, -0.001077 s, × 0.9907
mozilla-try x0000_revs_x_added_0_copies 63519bfd42ee a36a2a865d92 : 0.435381 s, 0.433683 s, -0.001698 s, × 0.9961
mozilla-try x0000_revs_x_added_x_copies 9fe69ff0762d bcabf2a78927 : 0.415461 s, 0.411278 s, -0.004183 s, × 0.9899
mozilla-try x0000_revs_xx000_added_x_copies 156f6e2674f2 4d0f2c178e66 : 0.155946 s, 0.155133 s, -0.000813 s, × 0.9948
mozilla-try x0000_revs_xx000_added_0_copies 9eec5917337d 67118cc6dcad : 0.048521 s, 0.048933 s, +0.000412 s, × 1.0085
mozilla-try x0000_revs_xx000_added_x000_copies 89294cd501d9 7ccb2fc7ccb5 : 9.843481 s, 8.100385 s, -1.743096 s, × 0.8229
mozilla-try x0000_revs_x0000_added_x0000_copies e928c65095ed e951f4ad123a : 1.465128 s, 1.446720 s, -0.018408 s, × 0.9874
mozilla-try x00000_revs_x00000_added_0_copies dc8a3ca7010e d16fde900c9c : 1.374283 s, 1.369537 s, -0.004746 s, × 0.9965
mozilla-try x00000_revs_x0000_added_x0000_copies 8d3fafa80d4b eb884023b810 : 5.255158 s, 5.186079 s, -0.069079 s, × 0.9869

Repo Case Source-Rev Dest-Rev filelog sidedata Difference Factor
--------------------------------------------------------------------------------------------------------------------------------------
mercurial x_revs_x_added_0_copies ad6b123de1c7 39cfcef4f463 : 0.000892 s, 0.000047 s, -0.000845 s, × 0.052691
mercurial x_revs_x_added_x_copies 2b1c78674230 0c1d10351869 : 0.001823 s, 0.000181 s, -0.001642 s, × 0.099287
mercurial x000_revs_x000_added_x_copies 81f8ff2a9bf2 dd3267698d84 : 0.018063 s, 0.005852 s, -0.012211 s, × 0.323977
pypy x_revs_x_added_0_copies aed021ee8ae8 099ed31b181b : 0.001505 s, 0.000229 s, -0.001276 s, × 0.152159
pypy x_revs_x000_added_0_copies 4aa4e1f8e19a 359343b9ac0e : 0.205895 s, 0.000058 s, -0.205837 s, × 0.000282
pypy x_revs_x_added_x_copies ac52eb7bbbb0 72e022663155 : 0.017021 s, 0.000146 s, -0.016875 s, × 0.008578
pypy x_revs_x00_added_x_copies c3b14617fbd7 ace7255d9a26 : 0.019422 s, 0.001206 s, -0.018216 s, × 0.062095
pypy x_revs_x000_added_x000_copies df6f7a526b60 a83dc6a2d56f : 0.767740 s, 0.025275 s, -0.742465 s, × 0.032921
pypy x000_revs_xx00_added_0_copies 89a76aede314 2f22446ff07e : 1.188515 s, 0.080303 s, -1.108212 s, × 0.067566
pypy x000_revs_x000_added_x_copies 8a3b5bfd266e 2c68e87c3efe : 1.251968 s, 0.152641 s, -1.099327 s, × 0.121921
pypy x000_revs_x000_added_x000_copies 89a76aede314 7b3dda341c84 : 1.616799 s, 0.099107 s, -1.517692 s, × 0.061298
pypy x0000_revs_x_added_0_copies d1defd0dc478 c9cb1334cc78 : 0.001057 s, 2.137894 s, +2.136837 s, × 2022.605487
pypy x0000_revs_xx000_added_0_copies bf2c629d0071 4ffed77c095c : 1.069485 s, 0.022202 s, -1.047283 s, × 0.020760
pypy x0000_revs_xx000_added_x000_copies 08ea3258278e d9fa043f30c0 : 1.350162 s, 0.228946 s, -1.121216 s, × 0.169569
netbeans x_revs_x_added_0_copies fb0955ffcbcd a01e9239f9e7 : 0.028008 s, 0.000186 s, -0.027822 s, × 0.006641
netbeans x_revs_x000_added_0_copies 6f360122949f 20eb231cc7d0 : 0.132281 s, 0.000133 s, -0.132148 s, × 0.001005
netbeans x_revs_x_added_x_copies 1ada3faf6fb6 5a39d12eecf4 : 0.025311 s, 0.000320 s, -0.024991 s, × 0.012643
netbeans x_revs_x00_added_x_copies 35be93ba1e2c 9eec5e90c05f : 0.052957 s, 0.001339 s, -0.051618 s, × 0.025285
netbeans x000_revs_xx00_added_0_copies eac3045b4fdd 51d4ae7f1290 : 0.038011 s, 0.015694 s, -0.022317 s, × 0.412880
netbeans x000_revs_x000_added_x_copies e2063d266acd 6081d72689dc : 0.198639 s, 0.018457 s, -0.180182 s, × 0.092917
netbeans x000_revs_x000_added_x000_copies ff453e9fee32 411350406ec2 : 0.955713 s, 0.111691 s, -0.844022 s, × 0.116867
netbeans x0000_revs_xx000_added_x000_copies 588c2d1ced70 1aad62e59ddd : 3.838886 s, 1.166017 s, -2.672869 s, × 0.303738
mozilla-central x_revs_x_added_0_copies 3697f962bb7b 7015fcdd43a2 : 0.024548 s, 0.000197 s, -0.024351 s, × 0.008025
mozilla-central x_revs_x000_added_0_copies dd390860c6c9 40d0c5bed75d : 0.143394 s, 0.000626 s, -0.142768 s, × 0.004366
mozilla-central x_revs_x_added_x_copies 8d198483ae3b 14207ffc2b2f : 0.026046 s, 0.000303 s, -0.025743 s, × 0.011633
mozilla-central x_revs_x00_added_x_copies 98cbc58cc6bc 446a150332c3 : 0.085440 s, 0.001679 s, -0.083761 s, × 0.019651
mozilla-central x_revs_x000_added_x000_copies 3c684b4b8f68 0a5e72d1b479 : 0.195656 s, 0.006947 s, -0.188709 s, × 0.035506
mozilla-central x_revs_x0000_added_x0000_copies effb563bb7e5 c07a39dc4e80 : 2.190874 s, 0.133070 s, -2.057804 s, × 0.060738
mozilla-central x000_revs_xx00_added_0_copies 6100d773079a 04a55431795e : 0.090208 s, 0.008705 s, -0.081503 s, × 0.096499
mozilla-central x000_revs_x000_added_x_copies 9f17a6fc04f9 2d37b966abed : 0.747367 s, 0.005913 s, -0.741454 s, × 0.007912
mozilla-central x000_revs_x000_added_x000_copies 7c97034feb78 4407bd0c6330 : 1.152863 s, 0.101373 s, -1.051490 s, × 0.087932
mozilla-central x0000_revs_xx000_added_0_copies 9eec5917337d 67118cc6dcad : 6.598336 s, 0.046526 s, -6.551810 s, × 0.007051
mozilla-central x0000_revs_xx000_added_x000_copies f78c615a656c 96a38b690156 : 3.255015 s, 0.313954 s, -2.941061 s, × 0.096452
mozilla-central x00000_revs_x0000_added_x0000_copies 6832ae71433c 4c222a1d9a00 : 15.668041 s, 3.367395 s, -12.300646 s, × 0.214921
mozilla-central x00000_revs_x00000_added_x000_copies 76caed42cf7c 1daa622bbe42 : 20.439638 s, 4.691820 s, -15.747818 s, × 0.229545
mozilla-try x_revs_x_added_0_copies aaf6dde0deb8 9790f499805a : 0.080923 s, 0.001199 s, -0.079724 s, × 0.014817
mozilla-try x_revs_x000_added_0_copies d8d0222927b4 5bb8ce8c7450 : 0.498456 s, 0.001216 s, -0.497240 s, × 0.002440
mozilla-try x_revs_x_added_x_copies 092fcca11bdb 936255a0384a : 0.020798 s, 0.000613 s, -0.020185 s, × 0.029474
mozilla-try x_revs_x00_added_x_copies b53d2fadbdb5 017afae788ec : 0.226930 s, 0.001906 s, -0.225024 s, × 0.008399
mozilla-try x_revs_x000_added_x000_copies 20408ad61ce5 6f0ee96e21ad : 1.113005 s, 0.092766 s, -1.020239 s, × 0.083347
mozilla-try x_revs_x0000_added_x0000_copies effb563bb7e5 c07a39dc4e80 : 2.230671 s, 0.136074 s, -2.094597 s, × 0.061001
mozilla-try x000_revs_xx00_added_0_copies 6100d773079a 04a55431795e : 0.089672 s, 0.009067 s, -0.080605 s, × 0.101113
mozilla-try x000_revs_x000_added_x_copies 9f17a6fc04f9 2d37b966abed : 0.740221 s, 0.006243 s, -0.733978 s, × 0.008434
mozilla-try x000_revs_x000_added_x000_copies 1346fd0130e4 4c65cbdabc1f : 1.185881 s, 0.114463 s, -1.071418 s, × 0.096521
mozilla-try x0000_revs_x_added_0_copies 63519bfd42ee a36a2a865d92 : 0.086072 s, 0.433683 s, +0.347611 s, × 5.038607
mozilla-try x0000_revs_x_added_x_copies 9fe69ff0762d bcabf2a78927 : 0.081321 s, 0.411278 s, +0.329957 s, × 5.057464
mozilla-try x0000_revs_xx000_added_x_copies 156f6e2674f2 4d0f2c178e66 : 7.528370 s, 0.155133 s, -7.373237 s, × 0.020606
mozilla-try x0000_revs_xx000_added_0_copies 9eec5917337d 67118cc6dcad : 6.757368 s, 0.048933 s, -6.708435 s, × 0.007241
mozilla-try x0000_revs_xx000_added_x000_copies 89294cd501d9 7ccb2fc7ccb5 : 7.643752 s, 8.100385 s, +0.456633 s, × 1.059739
mozilla-try x0000_revs_x0000_added_x0000_copies e928c65095ed e951f4ad123a : 9.704242 s, 1.446720 s, -8.257522 s, × 0.149081
mozilla-try x00000_revs_x_added_0_copies 6a320851d377 1ebb79acd503 : 0.092845 s, killed
mozilla-try x00000_revs_x00000_added_0_copies dc8a3ca7010e d16fde900c9c : 26.626870 s, 1.369537 s, -25.257333 s, × 0.051434
mozilla-try x00000_revs_x_added_x_copies 5173c4b6f97c 95d83ee7242d : 0.092953 s, killed
mozilla-try x00000_revs_x000_added_x_copies 9126823d0e9c ca82787bb23c : 0.227131 s, killed
mozilla-try x00000_revs_x0000_added_x0000_copies 8d3fafa80d4b eb884023b810 : 18.884666 s, 5.186079 s, -13.698587 s, × 0.274619
mozilla-try x00000_revs_x00000_added_x0000_copies 1b661134e2ca 1ae03d022d6d : 21.451622 s, killed
mozilla-try x00000_revs_x00000_added_x000_copies 9b2a99adc05e 8e29777b48e6 : 25.152558 s, killed

Differential Revision: https://phab.mercurial-scm.org/D9303
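The Difference and Factor columns in the tables above appear to be the new timing minus the old one, and the new timing divided by the old one. A minimal sketch that checks this reading against the pypy row quoted at the top of the message; `compare_timings` is a hypothetical helper, not part of Mercurial or its benchmark tooling.

```python
# Hypothetical helper (an assumption for illustration, not Mercurial code):
# recompute the "Difference" and "Factor" columns of one benchmark row.
def compare_timings(old_seconds, new_seconds):
    difference = new_seconds - old_seconds  # negative means faster
    factor = new_seconds / old_seconds      # below 1.0 means faster
    return difference, factor


# pypy x0000_revs_x_added_0_copies: 2.780174 s before, 2.137894 s after
difference, factor = compare_timings(2.780174, 2.137894)
print("%+.6f s, x %.4f" % (difference, factor))
```

A factor below 1.0 therefore means the new code is faster; the rows with a factor far above 1.0 in the filelog comparison are the cases where the sidedata algorithm is still slower.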

File last commit:

r46586:8b99c473 default
copies.py
1144 lines | 39.4 KiB | text/x-python | PythonLexer
Matt Mackall
copies: move findcopies code to its own module...
r6274 # copies.py - copy detection for Mercurial
#
# Copyright 2008 Matt Mackall <mpm@selenic.com>
#
Martin Geisler
updated license to be explicit about GPL version 2
r8225 # This software may be used and distributed according to the terms of the
Matt Mackall
Update license to GPLv2+
r10263 # GNU General Public License version 2 or any later version.
Matt Mackall
copies: move findcopies code to its own module...
r6274
Gregory Szorc
copies: use absolute_import
r25924 from __future__ import absolute_import
Pulkit Goyal
copytrace: move fast heuristic copytracing algorithm to core...
r34180 import collections
import os
Matt Mackall
copies: move findcopies code to its own module...
r6274
Pulkit Goyal
copies: add a config to limit the number of candidates to check in heuristics...
r34847 from .i18n import _
sidedatacopies: deal with upgrading and downgrading to that format...
r43418
Gregory Szorc
copies: use absolute_import
r25924 from . import (
Yuya Nishihara
copies: use intersectmatchers() in non-merge p1 optimization...
r33869 match as matchmod,
Durham Goode
copies: optimize forward copy detection logic for rebases...
r28000 node,
Gregory Szorc
copies: use absolute_import
r25924 pathutil,
copies: use the rust code for `combine_changeset_copies`...
r46576 policy,
Gregory Szorc
py3: finish porting iteritems() to pycompat and remove source transformer...
r43376 pycompat,
Gregory Szorc
copies: use absolute_import
r25924 util,
)
sidedatacopies: deal with upgrading and downgrading to that format...
r43418
Augie Fackler
formatting: blacken the codebase...
r43346 from .utils import stringutil
copies: return None instead of ChangingFiles when relevant...
r46264 from .revlogutils import flagutil
copies: use the rust code for `combine_changeset_copies`...
r46576 rustmod = policy.importrust("copy_tracing")
Gregory Szorc
copies: use absolute_import
r25924
Martin von Zweigbergk
copies: inline _chainandfilter() to prepare for next patch...
r42796 def _filter(src, dst, t):
"""filters out invalid copies after chaining"""
Martin von Zweigbergk
copies: document cases in _chain()...
r42413
Martin von Zweigbergk
copies: inline _chainandfilter() to prepare for next patch...
r42796 # When _chain()'ing copies in 'a' (from 'src' via some other commit 'mid')
# with copies in 'b' (from 'mid' to 'dst'), we can get the different cases
# in the following table (not including trivial cases). For example, case 2
# is where a file existed in 'src' and remained under that name in 'mid' and
Martin von Zweigbergk
copies: document cases in _chain()...
r42413 # then was renamed between 'mid' and 'dst'.
#
# case src mid dst result
# 1 x y - -
# 2 x y y x->y
# 3 x y x -
# 4 x y z x->z
# 5 - x y -
# 6 x x y x->y
Martin von Zweigbergk
copies: split up _chain() in naive chaining and filtering steps...
r42565 #
# _chain() takes care of chaining the copies in 'a' and 'b', but it
# cannot tell the difference between cases 1 and 2, between 3 and 4, or
# between 5 and 6, so it includes all cases in its result.
# Cases 1, 3, and 5 are then removed by _filter().
Martin von Zweigbergk
copies: document cases in _chain()...
r42413
Martin von Zweigbergk
copies: split up _chain() in naive chaining and filtering steps...
r42565 for k, v in list(t.items()):
        # remove copies from files that didn't exist
        if v not in src:
            del t[k]
        # remove criss-crossed copies
        elif k in src and v in dst:
            del t[k]
        # remove copies to files that were then removed
        elif k not in dst:
            del t[k]
Augie Fackler
formatting: blacken the codebase...
r43346
copies: expand `_chain` variable name to make the function easier to read...
r44223 def _chain(prefix, suffix):
    """chain two sets of copies 'prefix' and 'suffix'"""
    result = prefix.copy()
    for key, value in pycompat.iteritems(suffix):
        result[key] = prefix.get(value, value)
    return result
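To make the case table in `_filter()` concrete, here is a small standalone sketch of the chain-then-filter sequence. It assumes plain dicts mapping destination name to source name and uses Python sets of file names in place of the real `src` and `dst` contexts; `chain` and `filter_copies` are illustrative copies of the logic above, not the module's functions.

```python
def chain(prefix, suffix):
    """naively chain {dst: src} copies 'prefix' (src->mid) and 'suffix' (mid->dst)"""
    result = prefix.copy()
    for key, value in suffix.items():
        result[key] = prefix.get(value, value)
    return result


def filter_copies(src, dst, copies):
    """drop invalid entries; src and dst are sets of file names (an assumption)"""
    for k, v in list(copies.items()):
        if v not in src:
            del copies[k]  # source file did not exist
        elif k in src and v in dst:
            del copies[k]  # criss-crossed copy
        elif k not in dst:
            del copies[k]  # destination was later removed
    return copies


# case 4: x renamed to y, then y renamed to z
case4 = filter_copies({"x"}, {"z"}, chain({"y": "x"}, {"z": "y"}))
# case 3: x renamed to y, then y renamed back to x
case3 = filter_copies({"x"}, {"x"}, chain({"y": "x"}, {"x": "y"}))
print(case4)  # {'z': 'x'}
print(case3)  # {}
```

The naive chaining step produces an intermediate `y` entry in both cases; it is the filtering step that removes it, which is why the two steps are kept separate.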
Matt Mackall
copies: rewrite copy detection for non-merge users...
r15775
Augie Fackler
formatting: blacken the codebase...
r43346
copies: drop the findlimit logic...
r43470 def _tracefile(fctx, am, basemf):
Martin von Zweigbergk
copies: consistently use """ for docstrings...
r35422 """return the path of the ancestor of fctx present in ancestor
pathcopies: give up any optimization based on `introrev`...
r43469 manifest am
Note: we used to try and stop after a given limit; however, checking if that
limit is reached turned out to be very expensive. We are better off
disabling that feature."""
Matt Mackall
copies: rewrite copy detection for non-merge users...
r15775
for f in fctx.ancestors():
Martin von Zweigbergk
copies: return only path from _tracefile() since that's all caller needs...
r42751 path = f.path()
if am.get(path, None) == f.filenode():
return path
Martin von Zweigbergk
copies: follow copies across merge base without source file (issue6163)...
r42798 if basemf and basemf.get(path, None) == f.filenode():
return path
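The lookup performed by `_tracefile()` can be illustrated with stand-in objects. This is a sketch under stated assumptions: `FakeFileCtx` is a hypothetical stub exposing only `path()`, `filenode()` and `ancestors()`, and manifests are plain dicts mapping path to file node.

```python
class FakeFileCtx:
    """hypothetical stand-in for a file context (illustration only)"""

    def __init__(self, path, filenode, ancestors=()):
        self._path = path
        self._filenode = filenode
        self._ancestors = list(ancestors)

    def path(self):
        return self._path

    def filenode(self):
        return self._filenode

    def ancestors(self):
        return iter(self._ancestors)


def tracefile(fctx, am, basemf=None):
    """return the path of the first ancestor of fctx found in manifest am"""
    for f in fctx.ancestors():
        path = f.path()
        if am.get(path) == f.filenode():
            return path
        if basemf and basemf.get(path) == f.filenode():
            return path
    return None


old = FakeFileCtx("a.txt", "node1")
new = FakeFileCtx("b.txt", "node2", ancestors=[old])
print(tracefile(new, {"a.txt": "node1"}))  # a.txt
print(tracefile(new, {"a.txt": "other"}))  # None
```

The match is on both path and file node, so an unrelated file that merely shares the old name in the ancestor manifest is not reported as the copy source.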
Matt Mackall
copies: rewrite copy detection for non-merge users...
r15775
Augie Fackler
formatting: blacken the codebase...
r43346
Martin von Zweigbergk
copies: respect narrowmatcher in "parent -> working dir" case...
r41918 def _dirstatecopies(repo, match=None):
ds = repo.dirstate
Matt Mackall
copies: rewrite copy detection for non-merge users...
r15775 c = ds.copies().copy()
Pulkit Goyal
py3: explicitly convert dict.keys() and dict.items() into a list...
r34350 for k in list(c):
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 if ds[k] not in b'anm' or (match and not match(k)):
Matt Mackall
copies: rewrite copy detection for non-merge users...
r15775 del c[k]
return c
Augie Fackler
formatting: blacken the codebase...
r43346
Durham Goode
copies: add matcher parameter to copy logic...
r24782 def _computeforwardmissing(a, b, match=None):
Durham Goode
copy: move _forwardcopies file logic to a function...
r24011 """Computes which files are in b but not a.
This is its own function so extensions can easily wrap this call to see what
files _forwardcopies is about to process.
"""
Durham Goode
copies: add matcher parameter to copy logic...
r24782 ma = a.manifest()
mb = b.manifest()
Durham Goode
copies: remove use of manifest.matches...
r31256 return mb.filesnotin(ma, match=match)
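As a rough illustration of what "files in b but not a" means here, the following sketch models manifests as plain dicts and the matcher as an optional predicate. `files_not_in` is a hypothetical stand-in written for this example; it only approximates the behaviour the docstring describes.

```python
def files_not_in(mb, ma, match=None):
    """return the set of paths present in mb but absent from ma"""
    missing = set(mb) - set(ma)
    if match is not None:
        # keep only the paths the (assumed) matcher predicate accepts
        missing = {path for path in missing if match(path)}
    return missing


ma = {"README": "n1", "src/a.py": "n2"}
mb = {"README": "n1", "src/b.py": "n3", "docs/intro.txt": "n4"}

print(sorted(files_not_in(mb, ma)))
print(sorted(files_not_in(mb, ma, match=lambda p: p.startswith("src/"))))
```

Only these "missing" files are candidates for copy tracing, which is why narrowing the matcher (as the non-merge optimization in `_committedforwardcopies()` does) reduces the work.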
Durham Goode
copy: move _forwardcopies file logic to a function...
r24011
Augie Fackler
formatting: blacken the codebase...
r43346
Martin von Zweigbergk
copies: extract function for deciding whether to use changeset-centric algos...
r42284 def usechangesetcentricalgo(repo):
"""Checks if we should use changeset-centric copy algorithms"""
sidedatacopies: read rename information from sidedata...
r43416 if repo.filecopiesmode == b'changeset-sidedata':
return True
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 readfrom = repo.ui.config(b'experimental', b'copies.read-from')
changesetsource = (b'changeset-only', b'compatibility')
copies: expand the logic of usechangesetcentricalgo...
r43290 return readfrom in changesetsource
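The decision made by `usechangesetcentricalgo()` reduces to two checks, sketched below as a pure function. The function name and the idea of passing the two settings as arguments are assumptions for illustration; the byte-string values are the ones that appear in the code above.

```python
CHANGESET_SOURCES = (b'changeset-only', b'compatibility')


def use_changeset_centric(filecopiesmode, read_from):
    """mirror the two checks: repository storage mode first, then the config"""
    if filecopiesmode == b'changeset-sidedata':
        return True
    return read_from in CHANGESET_SOURCES


print(use_changeset_centric(b'changeset-sidedata', None))  # True
print(use_changeset_centric(None, b'compatibility'))       # True
print(use_changeset_centric(None, b'something-else'))      # False
```

The storage mode takes precedence: a repository that stores copies in sidedata uses the changeset-centric algorithm whatever the experimental config says.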
Martin von Zweigbergk
copies: extract function for deciding whether to use changeset-centric algos...
r42284
Augie Fackler
formatting: blacken the codebase...
r43346
Martin von Zweigbergk
copies: follow copies across merge base without source file (issue6163)...
r42798 def _committedforwardcopies(a, b, base, match):
Martin von Zweigbergk
copies: extract method for getting non-wdir forward copies...
r35423 """Like _forwardcopies(), but b.rev() cannot be None (working copy)"""
Mads Kiilerich
diff: search beyond ancestor when detecting renames...
r20294 # files might have to be traced back to the fctx parent of the last
# one-side-only changeset, but not further back than that
Boris Feld
copies: add a devel debug mode to trace what copy tracing does...
r40093 repo = a._repo
Martin von Zweigbergk
copies: do copy tracing based on ctx.p[12]copies() if configured...
r41922
Martin von Zweigbergk
copies: extract function for deciding whether to use changeset-centric algos...
r42284 if usechangesetcentricalgo(repo):
Martin von Zweigbergk
copies: do copy tracing based on ctx.p[12]copies() if configured...
r41922 return _changesetforwardcopies(a, b, match)
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 debug = repo.ui.debugflag and repo.ui.configbool(b'devel', b'debug.copies')
Boris Feld
copies: add a devel debug mode to trace what copy tracing does...
r40093 dbg = repo.ui.debug
if debug:
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 dbg(b'debug.copies: looking into rename from %s to %s\n' % (a, b))
Mads Kiilerich
diff: search beyond ancestor when detecting renames...
r20294 am = a.manifest()
Martin von Zweigbergk
copies: follow copies across merge base without source file (issue6163)...
r42798 basemf = None if base is None else base.manifest()
Mads Kiilerich
diff: search beyond ancestor when detecting renames...
r20294
Matt Mackall
copies: rewrite copy detection for non-merge users...
r15775 # find where new files came from
# we currently don't try to find where old files went, too expensive
# this means we can miss a case like 'hg rm b; hg cp a b'
cm = {}
Durham Goode
copies: optimize forward copy detection logic for rebases...
r28000
# Computing the forward missing is quite expensive on large manifests, since
# it compares the entire manifests. We can optimize it in the common use
# case of computing what copies are in a commit versus its parent (like
# during a rebase or histedit). Note, we exclude merge commits from this
# optimization, since the ctx.files() for a merge commit is not correct for
# this comparison.
forwardmissingmatch = match
Yuya Nishihara
copies: use intersectmatchers() in non-merge p1 optimization...
r33869 if b.p1() == a and b.p2().node() == node.nullid:
Martin von Zweigbergk
copies: remove dependency on scmutil by directly using match.exact()...
r42102 filesmatcher = matchmod.exact(b.files())
Yuya Nishihara
copies: use intersectmatchers() in non-merge p1 optimization...
r33869 forwardmissingmatch = matchmod.intersectmatchers(match, filesmatcher)
Durham Goode
copies: optimize forward copy detection logic for rebases...
r28000 missing = _computeforwardmissing(a, b, match=forwardmissingmatch)
Pierre-Yves David
_adjustlinkrev: reuse ancestors set during rename detection (issue4514)...
r23980 ancestrycontext = a._repo.changelog.ancestors([b.rev()], inclusive=True)
Boris Feld
copies: add a devel debug mode to trace what copy tracing does...
r40093
if debug:
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 dbg(b'debug.copies: missing files to search: %d\n' % len(missing))
Boris Feld
copies: add a devel debug mode to trace what copy tracing does...
r40093
Martin von Zweigbergk
copies: process files in deterministic order for stable tests...
r42396 for f in sorted(missing):
Boris Feld
copies: add a devel debug mode to trace what copy tracing does...
r40093 if debug:
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 dbg(b'debug.copies: tracing file: %s\n' % f)
Pierre-Yves David
_adjustlinkrev: reuse ancestors set during rename detection (issue4514)...
r23980 fctx = b[f]
fctx._ancestrycontext = ancestrycontext
Boris Feld
copies: add a devel debug mode to trace what copy tracing does...
r40093
Boris Feld
copies: add time information to the debug information
r40094 if debug:
start = util.timer()
copies: drop the findlimit logic...
r43470 opath = _tracefile(fctx, am, basemf)
Martin von Zweigbergk
copies: return only path from _tracefile() since that's all caller needs...
r42751 if opath:
Boris Feld
copies: add a devel debug mode to trace what copy tracing does...
r40093 if debug:
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 dbg(b'debug.copies: rename of: %s\n' % opath)
Martin von Zweigbergk
copies: return only path from _tracefile() since that's all caller needs...
r42751 cm[f] = opath
Boris Feld
copies: add time information to the debug information
r40094 if debug:
Augie Fackler
formatting: blacken the codebase...
r43346 dbg(
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 b'debug.copies: time: %f seconds\n'
Augie Fackler
formatting: blacken the codebase...
r43346 % (util.timer() - start)
)
Martin von Zweigbergk
copies: extract method for getting non-wdir forward copies...
r35423 return cm
Augie Fackler
formatting: blacken the codebase...
r43346
copies: rename some function to the new naming scheme...
r46199 def _revinfo_getter(repo):
copies: directly pass a changes object to the copy tracing code...
r46217 """returns a function that returns the following data given a <rev>
copies: extract data extraction into a `revinfo` function...
r43549
* p1: revision number of first parent
* p2: revision number of second parent
copies: directly pass a changes object to the copy tracing code...
r46217 * changes: a ChangingFiles object
copies: extract data extraction into a `revinfo` function...
r43549 """
cl = repo.changelog
parents = cl.parentrevs
copies: return None instead of ChangingFiles when relevant...
r46264 flags = cl.flags
HASCOPIESINFO = flagutil.REVIDX_HASCOPIESINFO
copies: extract data extraction into a `revinfo` function...
r43549
copies: use dedicated `_revinfo_getter` function and call...
r46215 changelogrevision = cl.changelogrevision
sidedatacopies: directly fetch copies information from sidedata...
r43551
copies: use dedicated `_revinfo_getter` function and call...
r46215 # A small cache to avoid doing the work twice for merges
#
# In the vast majority of cases, if we ask information for a revision
# about 1 parent, we'll later ask it for the other. So it makes sense to
# keep the information around when reaching the first parent of a merge
# and drop it after it was provided for the second parent.
#
# There are cases where only one parent of the merge will be walked. It
# happens when the "destination" of the copy tracing is a descendant of a
# new root, not common with the "source". In that case, we will only walk
# through merge parents that are descendants of changesets common
# between "source" and "destination".
#
# With the current implementation, if such changesets have copy
# information, we'll keep them in memory until the end of
# _changesetforwardcopies. We don't expect the case to be frequent
# enough to matter.
#
# In addition, it would be possible to reach a pathological case, where
# many first parents are met before any second parent is reached. In
# that case the cache could grow. If this ever becomes an issue one can
# safely introduce a maximum cache size. This would trade extra CPU/IO
# time to save memory.
merge_caches = {}
sidedatacopies: only fetch information once for merge...
r43595
copies: use dedicated `_revinfo_getter` function and call...
r46215 def revinfo(rev):
p1, p2 = parents(rev)
value = None
copies: no longer change the sidedata flag...
r46216 e = merge_caches.pop(rev, None)
if e is not None:
return e
copies: return None instead of ChangingFiles when relevant...
r46264 changes = None
if flags(rev) & HASCOPIESINFO:
changes = changelogrevision(rev).changes
value = (p1, p2, changes)
copies: no longer change the sidedata flag...
r46216 if p1 != node.nullrev and p2 != node.nullrev:
# XXX in some cases we over-cache, IGNORE
copies: directly pass a changes object to the copy tracing code...
r46217 merge_caches[rev] = value
copies: use dedicated `_revinfo_getter` function and call...
r46215 return value
copies: extract data extraction into a `revinfo` function...
r43549
return revinfo
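The merge cache described in the comment above can be sketched in isolation. This is a simplified model, not the module's code: `parents` is a plain dict, `load_changes` is a hypothetical stand-in for the expensive `changelogrevision(rev).changes` lookup, and the `HASCOPIESINFO` flag check is left out.

```python
NULLREV = -1


def make_revinfo(parents, load_changes):
    merge_caches = {}

    def revinfo(rev):
        cached = merge_caches.pop(rev, None)
        if cached is not None:
            return cached  # second request for a merge: reuse, then forget
        p1, p2 = parents[rev]
        value = (p1, p2, load_changes(rev))
        if p1 != NULLREV and p2 != NULLREV:
            merge_caches[rev] = value  # only merges are asked for twice
        return value

    return revinfo


loaded = []


def load_changes(rev):
    loaded.append(rev)  # record each expensive lookup
    return "changes-%d" % rev


parents = {0: (-1, -1), 1: (0, -1), 2: (0, -1), 3: (1, 2)}
revinfo = make_revinfo(parents, load_changes)
revinfo(3)
revinfo(3)  # served from the cache and evicted
revinfo(1)
revinfo(1)  # not a merge, so loaded again
print(loaded)  # [3, 1, 1]
```

Because the entry is popped on the second request, the cache stays small as long as both parents of each merge end up being walked.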
copies: cache the ancestor checking call when tracing copy...
r46504 def cached_is_ancestor(is_ancestor):
    """return a cached version of is_ancestor"""
    cache = {}

    def _is_ancestor(anc, desc):
        if anc > desc:
            return False
        elif anc == desc:
            return True
        key = (anc, desc)
        ret = cache.get(key)
        if ret is None:
            ret = cache[key] = is_ancestor(anc, desc)
        return ret

    return _is_ancestor
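A short usage sketch shows what the wrapper saves. It repeats the wrapper so the example is self-contained, and `slow_is_ancestor` is a hypothetical stand-in for `cl.isancestorrev` that assumes a toy linear history. The two shortcuts rely on parents always having lower revision numbers than their children.

```python
def cached_is_ancestor(is_ancestor):
    """return a cached version of is_ancestor (copied from above for the demo)"""
    cache = {}

    def _is_ancestor(anc, desc):
        if anc > desc:
            return False  # an ancestor never has a higher revision number
        elif anc == desc:
            return True
        key = (anc, desc)
        ret = cache.get(key)
        if ret is None:
            ret = cache[key] = is_ancestor(anc, desc)
        return ret

    return _is_ancestor


calls = []


def slow_is_ancestor(anc, desc):
    calls.append((anc, desc))  # record each real (expensive) query
    return True  # toy linear history: every lower revision is an ancestor


is_anc = cached_is_ancestor(slow_is_ancestor)
results = [is_anc(1, 5), is_anc(1, 5), is_anc(5, 1), is_anc(3, 3)]
print(results)  # [True, True, False, True]
print(calls)    # [(1, 5)]
```

Only the first `(1, 5)` query reaches the underlying function; the repeat is a cache hit and the other two are answered by the revision-number shortcuts.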
Martin von Zweigbergk
copies: do copy tracing based on ctx.p[12]copies() if configured...
r41922 def _changesetforwardcopies(a, b, match):
Martin von Zweigbergk
copies: fix crash on in changeset-centric tracing from commit to itself...
r42868 if a.rev() in (node.nullrev, b.rev()):
Martin von Zweigbergk
copies: do copy tracing based on ctx.p[12]copies() if configured...
r41922 return {}
copies: use an unfiltered repository for the changeset centric algorithm...
r43550 repo = a.repo().unfiltered()
Martin von Zweigbergk
copies: do copy tracing based on ctx.p[12]copies() if configured...
r41922 children = {}
copies: extract data extraction into a `revinfo` function...
r43549
Martin von Zweigbergk
copies: do copy tracing based on ctx.p[12]copies() if configured...
r41922 cl = repo.changelog
copies-rust: move is_ancestor caching within the rust code...
r46586 isancestor = cl.isancestorrev
Martin von Zweigbergk
copies: do copy tracing based on ctx.p[12]copies() if configured...
r41922 missingrevs = cl.findmissingrevs(common=[a.rev()], heads=[b.rev()])
copies: compute the exact set of revision to walk...
r43593 mrset = set(missingrevs)
roots = set()
Martin von Zweigbergk
copies: do copy tracing based on ctx.p[12]copies() if configured...
r41922 for r in missingrevs:
        for p in cl.parentrevs(r):
            if p == node.nullrev:
                continue
            if p not in children:
                children[p] = [r]
            else:
                children[p].append(r)
copies: compute the exact set of revision to walk...
r43593 if p not in mrset:
roots.add(p)
if not roots:
# no common revision to track copies from
return {}
min_root = min(roots)
Martin von Zweigbergk
copies: do copy tracing based on ctx.p[12]copies() if configured...
r41922
copies: compute the exact set of revision to walk...
r43593 from_head = set(
cl.reachableroots(min_root, [b.rev()], list(roots), includepath=True)
)
iterrevs = set(from_head)
iterrevs &= mrset
iterrevs.update(roots)
iterrevs.remove(b.rev())
copies: split the combination of the copies mapping in its own function...
r44225 revs = sorted(iterrevs)
copies: make two version of the changeset centric algorithm...
r46214
if repo.filecopiesmode == b'changeset-sidedata':
copies: use dedicated `_revinfo_getter` function and call...
r46215 revinfo = _revinfo_getter(repo)
copies: make two version of the changeset centric algorithm...
r46214 return _combine_changeset_copies(
revs, children, b.rev(), revinfo, match, isancestor
)
else:
copies: use dedicated `_revinfo_getter` function and call...
r46215 revinfo = _revinfo_getter_extra(repo)
copies: make two version of the changeset centric algorithm...
r46214 return _combine_changeset_copies_extra(
revs, children, b.rev(), revinfo, match, isancestor
)
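The parent walk at the start of `_changesetforwardcopies` can be sketched on a toy graph as follows. This is only an illustration: `build_children_and_roots` and `TOY_PARENTS` are hypothetical names, `parentrevs` is any callable returning a `(p1, p2)` pair, and `setdefault` is used in place of the explicit `if`/`else` of the original.

```python
NULLREV = -1  # same convention as node.nullrev


def build_children_and_roots(missingrevs, parentrevs):
    """Return ({parent: [children]}, roots) for the revisions to walk."""
    mrset = set(missingrevs)
    children = {}
    roots = set()
    for r in missingrevs:
        for p in parentrevs(r):
            if p == NULLREV:
                continue
            children.setdefault(p, []).append(r)
            if p not in mrset:
                # parent lies outside the missing set: it is a starting point
                roots.add(p)
    return children, roots


# Toy graph: 0 - 1 - 2 - 4, with 3 branching from 1 and merged into 4.
TOY_PARENTS = {
    0: (-1, -1),
    1: (0, -1),
    2: (1, -1),
    3: (1, -1),
    4: (2, 3),
}

# Tracing from a=1 to b=4: revisions 2, 3 and 4 are "missing" from a.
children, roots = build_children_and_roots([2, 3, 4], TOY_PARENTS.__getitem__)
```

Here revision 1 is the only root, and revision 4 appears as a child of both 2 and 3, which is what later triggers the merge handling.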
copies: split the combination of the copies mapping in its own function...
r44225
copies: rename some function to the new naming scheme...
r46199 def _combine_changeset_copies(
copies: fix the changeset based algorithm regarding merge...
r45252 revs, children, targetrev, revinfo, match, isancestor
):
copies: split the combination of the copies mapping in its own function...
r44225 """combine the copies information for each item of iterrevs
revs: sorted iterable of revisions to visit
children: a {parent: [children]} mapping.
targetrev: the final copies destination revision (not in iterrevs)
revinfo(rev): a function that returns (p1, p2, changes); changes may be None
match: a matcher
It returns the aggregated copies information for `targetrev`.
"""
copies: use the rust code for `combine_changeset_copies`...
r46576
alwaysmatch = match.always()
if rustmod is not None and alwaysmatch:
return rustmod.combine_changeset_copies(
list(revs), children, targetrev, revinfo, isancestor
)
copies-rust: move is_ancestor caching within the rust code...
r46586 isancestor = cached_is_ancestor(isancestor)
copies: do not initialize the dictionary with root in changeset copies...
r44224 all_copies = {}
copies: split the combination of the copies mapping in its own function...
r44225 for r in revs:
copies: do not initialize the dictionary with root in changeset copies...
r44224 copies = all_copies.pop(r, None)
if copies is None:
# this is a root
copies = {}
Martin von Zweigbergk
copies: avoid unnecessary copying of copy dict...
r42687 for i, c in enumerate(children[r]):
copies: directly pass a changes object to the copy tracing code...
r46217 p1, p2, changes = revinfo(c)
copies: return None instead of ChangingFiles when relevant...
r46264 childcopies = {}
copies: avoid instancing more changectx to access parent revisions...
r43548 if r == p1:
Martin von Zweigbergk
copies: do copy tracing based on ctx.p[12]copies() if configured...
r41922 parent = 1
copies: return None instead of ChangingFiles when relevant...
r46264 if changes is not None:
childcopies = changes.copied_from_p1
Martin von Zweigbergk
copies: do copy tracing based on ctx.p[12]copies() if configured...
r41922 else:
copies: avoid instancing more changectx to access parent revisions...
r43548 assert r == p2
Martin von Zweigbergk
copies: do copy tracing based on ctx.p[12]copies() if configured...
r41922 parent = 2
copies: return None instead of ChangingFiles when relevant...
r46264 if changes is not None:
childcopies = changes.copied_from_p2
Martin von Zweigbergk
copies: avoid calling matcher if matcher.always()...
r42688 if not alwaysmatch:
Augie Fackler
formatting: blacken the codebase...
r43346 childcopies = {
dst: src for dst, src in childcopies.items() if match(dst)
}
copies: move from a copy on branchpoint to a copy on write approach...
r43594 newcopies = copies
Martin von Zweigbergk
copies: avoid reusing the same variable for two different copy dicts...
r42714 if childcopies:
copies: fix the changeset based algorithm regarding merge...
r45252 newcopies = copies.copy()
for dest, source in pycompat.iteritems(childcopies):
prev = copies.get(source)
if prev is not None and prev[1] is not None:
source = prev[1]
newcopies[dest] = (c, source)
copies: move from a copy on branchpoint to a copy on write approach...
r43594 assert newcopies is not copies
copies: return None instead of ChangingFiles when relevant...
r46264 if changes is not None:
for f in changes.removed:
if f in newcopies:
if newcopies is copies:
# copy on write to avoid affecting potential other
# branches. when there are no other branches, this
# could be avoided.
newcopies = copies.copy()
newcopies[f] = (c, None)
copies: simplify the handling of merges...
r43546 othercopies = all_copies.get(c)
if othercopies is None:
all_copies[c] = newcopies
else:
# we are the second parent to work on c, we need to merge our
# work with the other.
#
# In case of conflict, parent 1 takes precedence over parent 2.
# This is an arbitrary choice made anew when implementing
# changeset based copies. It was made without regard to
# potential filelog related behavior.
if parent == 1:
copies: fix the changeset based algorithm regarding merge...
r45252 _merge_copies_dict(
copies: directly pass a changes object to the copy tracing code...
r46217 othercopies, newcopies, isancestor, changes
copies: fix the changeset based algorithm regarding merge...
r45252 )
copies: simplify the handling of merges...
r43546 else:
copies: fix the changeset based algorithm regarding merge...
r45252 _merge_copies_dict(
copies: directly pass a changes object to the copy tracing code...
r46217 newcopies, othercopies, isancestor, changes
copies: fix the changeset based algorithm regarding merge...
r45252 )
copies: simplify the handling of merges...
r43546 all_copies[c] = newcopies
copies: fix the changeset based algorithm regarding merge...
r45252
final_copies = {}
for dest, (tt, source) in all_copies[targetrev].items():
if source is not None:
final_copies[dest] = source
return final_copies
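The copy-on-write chaining step and the final filtering can be tried in isolation with plain dictionaries. The sketch below is a simplified illustration, not the function itself: `chain_child_copies` and `finalize` are hypothetical helper names, and values are `(rev, source)` tuples as in the loop above.

```python
def chain_child_copies(copies, childcopies, child_rev):
    """Apply one child's copy records on top of its parent's mapping."""
    newcopies = copies
    if childcopies:
        # copy on write: the parent's mapping may be shared with other branches
        newcopies = copies.copy()
        for dest, source in childcopies.items():
            prev = copies.get(source)
            if prev is not None and prev[1] is not None:
                # the source was itself a copy: follow it to the original
                source = prev[1]
            newcopies[dest] = (child_rev, source)
    return newcopies


def finalize(copies):
    """Drop the revision markers and the entries whose source is None."""
    return {
        dest: source
        for dest, (_rev, source) in copies.items()
        if source is not None
    }


at_rev1 = {b'b': (1, b'a')}                              # rev 1 copied a -> b
at_rev2 = chain_child_copies(at_rev1, {b'c': b'b'}, 2)   # rev 2 copied b -> c

# A later removal of b is recorded as a (rev, None) marker.
with_removed = dict(at_rev2)
with_removed[b'b'] = (3, None)
```

After chaining, `c` points at `a` rather than at the intermediate `b`, and the parent's dictionary is left untouched.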
copies: directly pass a changes object to the copy tracing code...
r46217 def _merge_copies_dict(minor, major, isancestor, changes):
copies: fix the changeset based algorithm regarding merge...
r45252 """merge two copies-mapping together, minor and major
In case of conflict, value from "major" will be picked.
- `isancestor(low_rev, high_rev)`: callable returning True if `low_rev` is an
ancestor of `high_rev`,
- `changes`: the changes object of the current revision (may be None); its
`merged` and `salvaged` sets are consulted to resolve conflicts,
"""
for dest, value in major.items():
other = minor.get(dest)
if other is None:
minor[dest] = value
else:
new_tt = value[0]
other_tt = other[0]
if value[1] == other[1]:
continue
# content from "major" wins, unless it is older
# than the branch point or there is a merge
salvaged: properly deal with salvaged file during copy tracing...
r46262 if new_tt == other_tt:
minor[dest] = value
copies: return None instead of ChangingFiles when relevant...
r46264 elif (
changes is not None
and value[1] is None
and dest in changes.salvaged
):
salvaged: properly deal with salvaged file during copy tracing...
r46262 pass
copies: return None instead of ChangingFiles when relevant...
r46264 elif (
changes is not None
and other[1] is None
and dest in changes.salvaged
):
salvaged: properly deal with salvaged file during copy tracing...
r46262 minor[dest] = value
copies: move `merged` testing sooner...
r46265 elif changes is not None and dest in changes.merged:
salvaged: properly deal with salvaged file during copy tracing...
r46262 minor[dest] = value
copies: move `merged` testing sooner...
r46265 elif not isancestor(new_tt, other_tt):
copies: make sure deleted copy info do not overwriting unrelated ones...
r46394 if value[1] is not None:
minor[dest] = value
elif isancestor(other_tt, new_tt):
minor[dest] = value
Martin von Zweigbergk
copies: do copy tracing based on ctx.p[12]copies() if configured...
r41922
Augie Fackler
formatting: blacken the codebase...
r43346
copies: use dedicated `_revinfo_getter` function and call...
r46215 def _revinfo_getter_extra(repo):
"""return a function that returns multiple data given a <rev>
* p1: revision number of first parent
* p2: revision number of second parent
* p1copies: mapping of copies from p1
* p2copies: mapping of copies from p2
* removed: a list of removed files
* ismerged: a callback to know if file was merged in that revision
"""
cl = repo.changelog
parents = cl.parentrevs
def get_ismerged(rev):
ctx = repo[rev]
def ismerged(path):
if path not in ctx.files():
return False
fctx = ctx[path]
parents = fctx._filelog.parents(fctx._filenode)
nb_parents = 0
for n in parents:
if n != node.nullid:
nb_parents += 1
return nb_parents >= 2
return ismerged
def revinfo(rev):
p1, p2 = parents(rev)
ctx = repo[rev]
p1copies, p2copies = ctx._copies
removed = ctx.filesremoved()
return p1, p2, p1copies, p2copies, removed, get_ismerged(rev)
return revinfo
copies: make two version of the changeset centric algorithm...
r46214 def _combine_changeset_copies_extra(
revs, children, targetrev, revinfo, match, isancestor
):
"""version of `_combine_changeset_copies` that works with the Google
specific "extra" based storage for copy information"""
all_copies = {}
alwaysmatch = match.always()
for r in revs:
copies = all_copies.pop(r, None)
if copies is None:
# this is a root
copies = {}
for i, c in enumerate(children[r]):
p1, p2, p1copies, p2copies, removed, ismerged = revinfo(c)
if r == p1:
parent = 1
childcopies = p1copies
else:
assert r == p2
parent = 2
childcopies = p2copies
if not alwaysmatch:
childcopies = {
dst: src for dst, src in childcopies.items() if match(dst)
}
newcopies = copies
if childcopies:
newcopies = copies.copy()
for dest, source in pycompat.iteritems(childcopies):
prev = copies.get(source)
if prev is not None and prev[1] is not None:
source = prev[1]
newcopies[dest] = (c, source)
assert newcopies is not copies
for f in removed:
if f in newcopies:
if newcopies is copies:
# copy on write to avoid affecting potential other
# branches. when there are no other branches, this
# could be avoided.
newcopies = copies.copy()
newcopies[f] = (c, None)
othercopies = all_copies.get(c)
if othercopies is None:
all_copies[c] = newcopies
else:
# we are the second parent to work on c, we need to merge our
# work with the other.
#
# In case of conflict, parent 1 takes precedence over parent 2.
# This is an arbitrary choice made anew when implementing
# changeset based copies. It was made without regard to
# potential filelog related behavior.
if parent == 1:
_merge_copies_dict_extra(
othercopies, newcopies, isancestor, ismerged
)
else:
_merge_copies_dict_extra(
newcopies, othercopies, isancestor, ismerged
)
all_copies[c] = newcopies
final_copies = {}
for dest, (tt, source) in all_copies[targetrev].items():
if source is not None:
final_copies[dest] = source
return final_copies
def _merge_copies_dict_extra(minor, major, isancestor, ismerged):
"""version of `_merge_copies_dict` that works with the Google
specific "extra" based storage for copy information"""
for dest, value in major.items():
other = minor.get(dest)
if other is None:
minor[dest] = value
else:
new_tt = value[0]
other_tt = other[0]
if value[1] == other[1]:
continue
# content from "major" wins, unless it is older
# than the branch point or there is a merge
if (
new_tt == other_tt
or not isancestor(new_tt, other_tt)
or ismerged(dest)
):
minor[dest] = value
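To see how the conflict rule in `_merge_copies_dict_extra` behaves, the sketch below replays it on toy data. It assumes a purely linear history (so `anc <= desc` stands in for the real ancestry check) and a file that is never merged; `merge_copies_sketch`, `linear` and `never_merged` are hypothetical names.

```python
def merge_copies_sketch(minor, major, isancestor, ismerged):
    """Merge `major` into `minor` in place, following the rule above."""
    for dest, value in major.items():
        other = minor.get(dest)
        if other is None:
            minor[dest] = value
            continue
        new_tt, other_tt = value[0], other[0]
        if value[1] == other[1]:
            # both sides agree on the source, nothing to resolve
            continue
        if (
            new_tt == other_tt
            or not isancestor(new_tt, other_tt)
            or ismerged(dest)
        ):
            minor[dest] = value


def linear(anc, desc):
    return anc <= desc  # toy ancestry check for a linear history


def never_merged(path):
    return False


# "major" information recorded at rev 3 is older than minor's rev 5: kept as is.
older = {b'f': (5, b'a')}
merge_copies_sketch(older, {b'f': (3, b'b')}, linear, never_merged)

# "major" information recorded at rev 5 is newer than minor's rev 3: it wins.
newer = {b'f': (3, b'a')}
merge_copies_sketch(newer, {b'f': (5, b'b'), b'g': (4, b'x')}, linear, never_merged)
```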
Martin von Zweigbergk
copies: follow copies across merge base without source file (issue6163)...
r42798 def _forwardcopies(a, b, base=None, match=None):
Martin von Zweigbergk
copies: extract method for getting non-wdir forward copies...
r35423 """find {dst@b: src@a} copy mapping where a is an ancestor of b"""
Martin von Zweigbergk
copies: follow copies across merge base without source file (issue6163)...
r42798 if base is None:
base = a
Martin von Zweigbergk
narrow: make copies.pathcopies() filter with narrowspec again...
r40487 match = a.repo().narrowmatch(match)
Martin von Zweigbergk
copies: extract method for getting non-wdir forward copies...
r35423 # check for working copy
if b.rev() is None:
Martin von Zweigbergk
copies: follow copies across merge base without source file (issue6163)...
r42798 cm = _committedforwardcopies(a, b.p1(), base, match)
Martin von Zweigbergk
copies: group wdir-handling in one place...
r35424 # combine copies from dirstate if necessary
Martin von Zweigbergk
copies: inline _chainandfilter() to prepare for next patch...
r42796 copies = _chain(cm, _dirstatecopies(b._repo, match))
Martin von Zweigbergk
copies: remove most early returns from pathcopies() and _forwardcopies()...
r42795 else:
Augie Fackler
formatting: blacken the codebase...
r43346 copies = _committedforwardcopies(a, b, base, match)
Martin von Zweigbergk
copies: remove most early returns from pathcopies() and _forwardcopies()...
r42795 return copies
Matt Mackall
copies: rewrite copy detection for non-merge users...
r15775
Augie Fackler
formatting: blacken the codebase...
r43346
Martin von Zweigbergk
copies: make _backwardrenames() filter out copies by destination...
r41919 def _backwardrenames(a, b, match):
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 if a._repo.ui.config(b'experimental', b'copytrace') == b'off':
Durham Goode
copy: add flag for disabling copy tracing...
r26013 return {}
Siddharth Agarwal
copies: do not track backward copies, only renames (issue3739)...
r18136 # Even though we're not taking copies into account, 1:n rename situations
# can still exist (e.g. hg cp a b; hg mv a c). In those cases we
# arbitrarily pick one of the renames.
Martin von Zweigbergk
copies: make _backwardrenames() filter out copies by destination...
r41919 # We don't want to pass in "match" here, since that would filter
# the destination by it. Since we're reversing the copies, we want
# to filter the source instead.
Matt Mackall
copies: rewrite copy detection for non-merge users...
r15775 f = _forwardcopies(b, a)
r = {}
Gregory Szorc
py3: finish porting iteritems() to pycompat and remove source transformer...
r43376 for k, v in sorted(pycompat.iteritems(f)):
Martin von Zweigbergk
copies: make _backwardrenames() filter out copies by destination...
r41919 if match and not match(v):
continue
Siddharth Agarwal
copies: do not track backward copies, only renames (issue3739)...
r18136 # remove copies
if v in a:
continue
Matt Mackall
copies: rewrite copy detection for non-merge users...
r15775 r[v] = k
return r
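The inversion performed by `_backwardrenames` can be illustrated with plain dictionaries. In this sketch, `backward_renames_sketch` is a hypothetical name, `forward` plays the role of the `_forwardcopies(b, a)` result, and `files_in_a` is a plain set standing in for membership tests on context `a`.

```python
def backward_renames_sketch(forward, files_in_a, match=None):
    """Invert a {dst: src} mapping, keeping only real renames."""
    r = {}
    for k, v in sorted(forward.items()):
        if match and not match(v):
            continue
        if v in files_in_a:
            # the source still exists in `a`: a copy, not a rename
            continue
        r[v] = k
    return r


# `hg cp a b; hg mv a c`: one source, two destinations, source gone.
one_to_n = backward_renames_sketch(
    {b'b': b'a', b'c': b'a'}, files_in_a={b'b', b'c'}
)

# Plain copy x -> y where x is still present: not reported.
plain_copy = backward_renames_sketch({b'y': b'x'}, files_in_a={b'x', b'y'})
```

Because the items are visited in sorted order, the last destination wins in the 1:n case, which is the arbitrary pick mentioned in the comment above.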
Augie Fackler
formatting: blacken the codebase...
r43346
Durham Goode
copies: add matcher parameter to copy logic...
r24782 def pathcopies(x, y, match=None):
Martin von Zweigbergk
copies: consistently use """ for docstrings...
r35422 """find {dst@y: src@x} copy mapping for directed compare"""
Boris Feld
copies: add a devel debug mode to trace what copy tracing does...
r40093 repo = x._repo
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 debug = repo.ui.debugflag and repo.ui.configbool(b'devel', b'debug.copies')
Boris Feld
copies: add a devel debug mode to trace what copy tracing does...
r40093 if debug:
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 repo.ui.debug(
b'debug.copies: searching copies from %s to %s\n' % (x, y)
)
Matt Mackall
copies: rewrite copy detection for non-merge users...
r15775 if x == y or not x or not y:
return {}
Martin von Zweigbergk
copies: avoid filtering by short-circuit dirstate-only copies earlier...
r44749 if y.rev() is None and x == y.p1():
if debug:
repo.ui.debug(b'debug.copies: search mode: dirstate\n')
# short-circuit to avoid issues with merge states
return _dirstatecopies(repo, match)
Matt Mackall
copies: rewrite copy detection for non-merge users...
r15775 a = y.ancestor(x)
if a == x:
Boris Feld
copies: add a devel debug mode to trace what copy tracing does...
r40093 if debug:
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 repo.ui.debug(b'debug.copies: search mode: forward\n')
Martin von Zweigbergk
copies: remove most early returns from pathcopies() and _forwardcopies()...
r42795 copies = _forwardcopies(x, y, match=match)
elif a == y:
Boris Feld
copies: add a devel debug mode to trace what copy tracing does...
r40093 if debug:
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 repo.ui.debug(b'debug.copies: search mode: backward\n')
Martin von Zweigbergk
copies: remove most early returns from pathcopies() and _forwardcopies()...
r42795 copies = _backwardrenames(x, y, match=match)
else:
if debug:
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 repo.ui.debug(b'debug.copies: search mode: combined\n')
Martin von Zweigbergk
copies: follow copies across merge base without source file (issue6163)...
r42798 base = None
if a.rev() != node.nullrev:
base = x
Augie Fackler
formatting: blacken the codebase...
r43346 copies = _chain(
_backwardrenames(x, a, match=match),
_forwardcopies(a, y, base, match=match),
)
Martin von Zweigbergk
copies: filter invalid copies only at end of pathcopies() (issue6163)...
r42797 _filter(x, y, copies)
Martin von Zweigbergk
copies: remove most early returns from pathcopies() and _forwardcopies()...
r42795 return copies
Matt Mackall
copies: split the copies api for "normal" and merge cases (API)
r15774
Augie Fackler
formatting: blacken the codebase...
r43346
Pierre-Yves David
mergecopies: rename 'ca' to 'base'...
r30186 def mergecopies(repo, c1, c2, base):
Matt Mackall
copies: move findcopies code to its own module...
r6274 """
Martin von Zweigbergk
copies: move comment about implementation of mergecopies() to end...
r42287 Finds moves and copies between context c1 and c2 that are relevant for
Pulkit Goyal
copytrace: move the default copytracing algorithm in a new function...
r34080 merging. 'base' will be used as the merge base.
Copytracing is used in commands like rebase, merge, unshelve, etc to merge
files that were moved/copied in one merge parent and modified in another.
For example:
Pulkit Goyal
copies: add more details to the documentation of mergecopies()...
r33821
  o          ---> 4 another commit
  |
  |   o      ---> 3 commit that modifies a.txt
  |  /
  o /        ---> 2 commit that moves a.txt to b.txt
  |/
  o          ---> 1 merge base
If we try to rebase revision 3 on revision 4, since there is no a.txt in
revision 4, and if the user has copytrace disabled, we print the following
message:
```other changed <file> which local deleted```
Martin von Zweigbergk
copies: define a type to return from mergecopies()...
r44681 Returns a tuple where:
Matt Mackall
copies: add docstring for mergecopies
r16168
Martin von Zweigbergk
copies: define a type to return from mergecopies()...
r44681 "branch_copies" an instance of branch_copies.
Siddharth Agarwal
copies: separate moves via directory renames from explicit copies...
r18134
Matt Mackall
copies: add docstring for mergecopies
r16168 "diverge" is a mapping of source name -> list of destination names
for divergent renames.
Thomas Arendsen Hein
merge: warn about file deleted in one branch and renamed in other (issue3074)...
r16794
Martin von Zweigbergk
copies: move comment about implementation of mergecopies() to end...
r42287 This function calls different copytracing algorithms based on config.
Matt Mackall
copies: move findcopies code to its own module...
r6274 """
# avoid silly behavior for update from empty dir
Matt Mackall
copies: teach symmetric difference about working revisions...
r6430 if not c1 or not c2 or c1 == c2:
Martin von Zweigbergk
merge: start using the per-side copy dicts...
r44682 return branch_copies(), branch_copies(), {}
Matt Mackall
copies: move findcopies code to its own module...
r6274
Martin von Zweigbergk
copies: respect narrowmatcher in "parent -> working dir" case...
r41918 narrowmatch = c1.repo().narrowmatch()
Matt Mackall
copies: teach copies about dirstate.copies...
r6646 # avoid silly behavior for parent -> working dir
Matt Mackall
misc: replace .parents()[0] with p1()
r13878 if c2.node() is None and c1.node() == repo.dirstate.p1():
Martin von Zweigbergk
merge: start using the per-side copy dicts...
r44682 return (
branch_copies(_dirstatecopies(repo, narrowmatch)),
branch_copies(),
{},
)
Matt Mackall
copies: teach copies about dirstate.copies...
r6646
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 copytracing = repo.ui.config(b'experimental', b'copytrace')
Martin von Zweigbergk
copies: move check for experimental.copytrace==<falsy> earlier...
r42411 if stringutil.parsebool(copytracing) is False:
# stringutil.parsebool() returns None when it is unable to parse the
# value, so copytracing stays enabled in such cases
Martin von Zweigbergk
merge: start using the per-side copy dicts...
r44682 return branch_copies(), branch_copies(), {}
Pulkit Goyal
copytrace: move the default copytracing algorithm in a new function...
r34080
Martin von Zweigbergk
copies: ignore heuristics copytracing when using changeset-centric algos...
r42412 if usechangesetcentricalgo(repo):
# The heuristics don't make sense when we need changeset-centric algos
return _fullcopytracing(repo, c1, c2, base)
Durham Goode
copy: add flag for disabling copy tracing...
r26013 # Copy trace disabling is explicitly below the node == p1 logic above
# because the logic above is required for a simple copy to be kept across a
# rebase.
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 if copytracing == b'heuristics':
Yuya Nishihara
copytrace: use ctx.mutable() instead of adhoc constant of non-public phases
r34365 # Do full copytracing if only non-public revisions are involved as
# that will be fast enough and will also cover the copies which could
# be missed by heuristics
Pulkit Goyal
copytrace: add a a new config to limit the number of drafts in heuristics...
r34312 if _isfullcopytraceable(repo, c1, base):
Pulkit Goyal
copytrace: use the full copytracing method if only drafts are involved...
r34289 return _fullcopytracing(repo, c1, c2, base)
Pulkit Goyal
copytrace: move fast heuristic copytracing algorithm to core...
r34180 return _heuristicscopytracing(repo, c1, c2, base)
Pulkit Goyal
copytrace: move the default copytracing algorithm in a new function...
r34080 else:
return _fullcopytracing(repo, c1, c2, base)
Durham Goode
copy: add flag for disabling copy tracing...
r26013
Augie Fackler
formatting: blacken the codebase...
r43346
Pulkit Goyal
copytrace: add a a new config to limit the number of drafts in heuristics...
r34312 def _isfullcopytraceable(repo, c1, base):
Augie Fackler
formating: upgrade to black 20.8b1...
r46554 """Checks if base, source and destination are all non-public branches,
Yuya Nishihara
copytrace: use ctx.mutable() instead of adhoc constant of non-public phases
r34365 if yes let's use the full copytrace algorithm for increased capabilities
since it will be fast enough.
Pulkit Goyal
copies: add docs for config `experimental.copytrace.sourcecommitlimit`...
r34517
`experimental.copytrace.sourcecommitlimit` can be used to set a limit for
the number of changesets from c1 to base, such that if the number of changesets
exceeds the limit, the full copytracing algorithm won't be used.
Pulkit Goyal
copytrace: use the full copytracing method if only drafts are involved...
r34289 """
Pulkit Goyal
copytrace: add a a new config to limit the number of drafts in heuristics...
r34312 if c1.rev() is None:
c1 = c1.p1()
Yuya Nishihara
copytrace: use ctx.mutable() instead of adhoc constant of non-public phases
r34365 if c1.mutable() and base.mutable():
Augie Fackler
formatting: blacken the codebase...
r43346 sourcecommitlimit = repo.ui.configint(
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 b'experimental', b'copytrace.sourcecommitlimit'
Augie Fackler
formatting: blacken the codebase...
r43346 )
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 commits = len(repo.revs(b'%d::%d', base.rev(), c1.rev()))
Pulkit Goyal
copytrace: add a a new config to limit the number of drafts in heuristics...
r34312 return commits < sourcecommitlimit
Pulkit Goyal
copytrace: use the full copytracing method if only drafts are involved...
r34289 return False
Augie Fackler
formatting: blacken the codebase...
r43346
def _checksinglesidecopies(
src, dsts1, m1, m2, mb, c2, base, copy, renamedelete
):
Martin von Zweigbergk
copies: calculate mergecopies() based on pathcopies()...
r42408 if src not in m2:
# deleted on side 2
if src not in m1:
# renamed on side 1, deleted on side 2
renamedelete[src] = dsts1
Martin von Zweigbergk
copies: fix crash when copy source is not in graft base...
r44691 elif src not in mb:
# Work around the "short-circuit to avoid issues with merge states"
# thing in pathcopies(): pathcopies(x, y) can return a copy where the
# destination doesn't exist in y.
pass
flags: account for flag change when tracking rename relevant to merge...
r45396 elif mb[src] != m2[src] and not _related(c2[src], base[src]):
return
elif mb[src] != m2[src] or mb.flags(src) != m2.flags(src):
Martin von Zweigbergk
copies: calculate mergecopies() based on pathcopies()...
r42408 # modified on side 2
for dst in dsts1:
Martin von Zweigbergk
merge: when rename was made on both sides, use ancestor as merge base...
r44714 copy[dst] = src
Martin von Zweigbergk
copies: calculate mergecopies() based on pathcopies()...
r42408
Augie Fackler
formatting: blacken the codebase...
r43346
Martin von Zweigbergk
copies: define a type to return from mergecopies()...
r44681 class branch_copies(object):
"""Information about copies made on one side of a merge/graft.
"copy" is a mapping from destination name -> source name,
where source is in c1 and destination is in c2 or vice-versa.
"movewithdir" is a mapping from source name -> destination name,
where the file at source present in one context but not the other
needs to be moved to destination by the merge process, because the
other context moved the directory it is in.
"renamedelete" is a mapping of source name -> list of destination
names for files deleted in c1 that were renamed in c2 or vice-versa.
"dirmove" is a mapping of detected source dir -> destination dir renames.
This is needed for handling changes to new files previously grafted into
renamed directories.
"""
def __init__(
self, copy=None, renamedelete=None, dirmove=None, movewithdir=None
):
self.copy = {} if copy is None else copy
self.renamedelete = {} if renamedelete is None else renamedelete
self.dirmove = {} if dirmove is None else dirmove
self.movewithdir = {} if movewithdir is None else movewithdir
Martin von Zweigbergk
copies: implement __repr__ on branch_copies for debugging...
r45528 def __repr__(self):
Augie Fackler
formating: upgrade to black 20.8b1...
r46554 return '<branch_copies\n copy=%r\n renamedelete=%r\n dirmove=%r\n movewithdir=%r\n>' % (
self.copy,
self.renamedelete,
self.dirmove,
self.movewithdir,
Martin von Zweigbergk
copies: implement __repr__ on branch_copies for debugging...
r45528 )
Martin von Zweigbergk
copies: define a type to return from mergecopies()...
r44681
Pulkit Goyal
copytrace: move the default copytracing algorithm in a new function...
r34080 def _fullcopytracing(repo, c1, c2, base):
Augie Fackler
formating: upgrade to black 20.8b1...
r46554 """The full copytracing algorithm which finds all the new files that were
Pulkit Goyal
copytrace: move the default copytracing algorithm in a new function...
r34080 added from merge base up to the top commit and for each file it checks if
this file was copied from another file.
This is pretty slow when a lot of changesets are involved but will track all
the copies.
"""
Matt Mackall
copies: move findcopies code to its own module...
r6274 m1 = c1.manifest()
m2 = c2.manifest()
Pierre-Yves David
mergecopies: rename 'ca' to 'base'...
r30186 mb = base.manifest()
Matt Mackall
copies: move findcopies code to its own module...
r6274
Martin von Zweigbergk
copies: calculate mergecopies() based on pathcopies()...
r42408 copies1 = pathcopies(base, c1)
copies2 = pathcopies(base, c2)
Martin von Zweigbergk
copies: move early return in mergecopies() earlier...
r44622 if not (copies1 or copies2):
Martin von Zweigbergk
merge: start using the per-side copy dicts...
r44682 return branch_copies(), branch_copies(), {}
Martin von Zweigbergk
copies: move early return in mergecopies() earlier...
r44622
Martin von Zweigbergk
copies: calculate mergecopies() based on pathcopies()...
r42408 inversecopies1 = {}
inversecopies2 = {}
for dst, src in copies1.items():
inversecopies1.setdefault(src, []).append(dst)
for dst, src in copies2.items():
inversecopies2.setdefault(src, []).append(dst)
Martin von Zweigbergk
copies: make mergecopies() distinguish between copies on each side...
r44657 copy1 = {}
copy2 = {}
Martin von Zweigbergk
copies: calculate mergecopies() based on pathcopies()...
r42408 diverge = {}
Martin von Zweigbergk
copies: make mergecopies() distinguish between copies on each side...
r44657 renamedelete1 = {}
renamedelete2 = {}
Martin von Zweigbergk
copies: calculate mergecopies() based on pathcopies()...
r42408 allsources = set(inversecopies1) | set(inversecopies2)
for src in allsources:
dsts1 = inversecopies1.get(src)
dsts2 = inversecopies2.get(src)
if dsts1 and dsts2:
# copied/renamed on both sides
if src not in m1 and src not in m2:
# renamed on both sides
dsts1 = set(dsts1)
dsts2 = set(dsts2)
# If there's some overlap in the rename destinations, we
# consider it not divergent. For example, if side 1 copies 'a'
# to 'b' and 'c' and deletes 'a', and side 2 copies 'a' to 'c'
# and 'd' and deletes 'a'.
if dsts1 & dsts2:
Augie Fackler
formatting: blacken the codebase...
r43346 for dst in dsts1 & dsts2:
Martin von Zweigbergk
copies: make mergecopies() distinguish between copies on each side...
r44657 copy1[dst] = src
copy2[dst] = src
Martin von Zweigbergk
copies: calculate mergecopies() based on pathcopies()...
r42408 else:
diverge[src] = sorted(dsts1 | dsts2)
elif src in m1 and src in m2:
# copied on both sides
dsts1 = set(dsts1)
dsts2 = set(dsts2)
Augie Fackler
formatting: blacken the codebase...
r43346 for dst in dsts1 & dsts2:
Martin von Zweigbergk
copies: make mergecopies() distinguish between copies on each side...
r44657 copy1[dst] = src
copy2[dst] = src
Martin von Zweigbergk
copies: calculate mergecopies() based on pathcopies()...
r42408 # TODO: Handle cases where it was renamed on one side and copied
# on the other side
elif dsts1:
# copied/renamed only on side 1
Augie Fackler
formatting: blacken the codebase...
r43346 _checksinglesidecopies(
Martin von Zweigbergk
copies: make mergecopies() distinguish between copies on each side...
r44657 src, dsts1, m1, m2, mb, c2, base, copy1, renamedelete1
Augie Fackler
formatting: blacken the codebase...
r43346 )
Martin von Zweigbergk
copies: calculate mergecopies() based on pathcopies()...
r42408 elif dsts2:
# copied/renamed only on side 2
Augie Fackler
formatting: blacken the codebase...
r43346 _checksinglesidecopies(
Martin von Zweigbergk
copies: make mergecopies() distinguish between copies on each side...
r44657 src, dsts2, m2, m1, mb, c1, base, copy2, renamedelete2
Augie Fackler
formatting: blacken the codebase...
r43346 )
Martin von Zweigbergk
copies: calculate mergecopies() based on pathcopies()...
r42408
Matt Mackall
copies: group bothnew with other sets
r26659 # find interesting file sets from manifests
Martin von Zweigbergk
narrow: move copies overrides to core...
r40002 addedinm1 = m1.filesnotin(mb, repo.narrowmatch())
addedinm2 = m2.filesnotin(mb, repo.narrowmatch())
Martin von Zweigbergk
copies: inline _computenonoverlap() in mergecopies()...
r42409 u1 = sorted(addedinm1 - addedinm2)
u2 = sorted(addedinm2 - addedinm1)
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 header = b" unmatched files in %s"
Martin von Zweigbergk
copies: inline _computenonoverlap() in mergecopies()...
r42409 if u1:
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 repo.ui.debug(b"%s:\n %s\n" % (header % b'local', b"\n ".join(u1)))
Martin von Zweigbergk
copies: inline _computenonoverlap() in mergecopies()...
r42409 if u2:
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 repo.ui.debug(b"%s:\n %s\n" % (header % b'other', b"\n ".join(u2)))
Matt Mackall
copies: move findcopies code to its own module...
r6274
Martin von Zweigbergk
copies: move early return for "no copies" case a little earlier...
r42342 if repo.ui.debugflag:
Martin von Zweigbergk
copies: avoid calculating debug-only stuff without --debug...
r44623 renamedeleteset = set()
divergeset = set()
for dsts in diverge.values():
divergeset.update(dsts)
Martin von Zweigbergk
copies: make mergecopies() distinguish between copies on each side...
r44657 for dsts in renamedelete1.values():
renamedeleteset.update(dsts)
for dsts in renamedelete2.values():
Martin von Zweigbergk
copies: avoid calculating debug-only stuff without --debug...
r44623 renamedeleteset.update(dsts)
Augie Fackler
formatting: blacken the codebase...
r43346 repo.ui.debug(
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 b" all copies found (* = to merge, ! = divergent, "
b"% = renamed and deleted):\n"
Augie Fackler
formatting: blacken the codebase...
r43346 )
Martin von Zweigbergk
copies: print debug information about copies per side/branch...
r44679 for side, copies in ((b"local", copies1), (b"remote", copies2)):
if not copies:
continue
repo.ui.debug(b" on %s side:\n" % side)
for f in sorted(copies):
note = b""
if f in copy1 or f in copy2:
note += b"*"
if f in divergeset:
note += b"!"
if f in renamedeleteset:
note += b"%"
repo.ui.debug(
b" src: '%s' -> dst: '%s' %s\n" % (copies[f], f, note)
)
Martin von Zweigbergk
copies: avoid calculating debug-only stuff without --debug...
r44623 del renamedeleteset
del divergeset
Matt Mackall
copies: move findcopies code to its own module...
r6274
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 repo.ui.debug(b" checking for directory renames\n")
Matt Mackall
copies: move findcopies code to its own module...
r6274
Martin von Zweigbergk
copies: make mergecopies() distinguish between copies on each side...
r44657 dirmove1, movewithdir2 = _dir_renames(repo, c1, copy1, copies1, u2)
dirmove2, movewithdir1 = _dir_renames(repo, c2, copy2, copies2, u1)
Martin von Zweigbergk
copies: extract function for finding directory renames...
r44624
Martin von Zweigbergk
merge: start using the per-side copy dicts...
r44682 branch_copies1 = branch_copies(copy1, renamedelete1, dirmove1, movewithdir1)
branch_copies2 = branch_copies(copy2, renamedelete2, dirmove2, movewithdir2)
Martin von Zweigbergk
copies: make mergecopies() distinguish between copies on each side...
r44657
Martin von Zweigbergk
merge: start using the per-side copy dicts...
r44682 return branch_copies1, branch_copies2, diverge
Martin von Zweigbergk
copies: extract function for finding directory renames...
r44624
Martin von Zweigbergk
copies: make mergecopies() distinguish between copies on each side...
r44657 def _dir_renames(repo, ctx, copy, fullcopy, addedfiles):
"""Finds moved directories and files that should move with them.
ctx: the context for one of the sides
copy: files copied on the same side (as ctx)
fullcopy: files copied on the same side (as ctx), including those that
merge.manifestmerge() won't care about
addedfiles: added files on the other side (compared to ctx)
"""
Matt Mackall
copies: move findcopies code to its own module...
r6274 # generate a directory move map
Martin von Zweigbergk
copies: make mergecopies() distinguish between copies on each side...
r44657 d = ctx.dirs()
Matt Mackall
copies: re-include root directory in directory rename detection (issue3511)
r17055 invalid = set()
Matt Mackall
copies: move findcopies code to its own module...
r6274 dirmove = {}
# examine each file copy for a potential directory move, which is
# when all the files in a directory are moved to a new directory
Gregory Szorc
py3: finish porting iteritems() to pycompat and remove source transformer...
r43376 for dst, src in pycompat.iteritems(fullcopy):
Durham Goode
copies: switch to using pathutil.dirname...
r25282 dsrc, ddst = pathutil.dirname(src), pathutil.dirname(dst)
Matt Mackall
copies: move findcopies code to its own module...
r6274 if dsrc in invalid:
# already seen to be uninteresting
continue
Martin von Zweigbergk
copies: make mergecopies() distinguish between copies on each side...
r44657 elif dsrc in d and ddst in d:
Matt Mackall
copies: move findcopies code to its own module...
r6274 # directory wasn't entirely moved locally
Kyle Lippincott
copies: correctly skip directories that have already been considered...
r39299 invalid.add(dsrc)
elif dsrc in dirmove and dirmove[dsrc] != ddst:
Matt Mackall
copies: move findcopies code to its own module...
r6274 # files from the same directory moved to two different places
Kyle Lippincott
copies: correctly skip directories that have already been considered...
r39299 invalid.add(dsrc)
Matt Mackall
copies: move findcopies code to its own module...
r6274 else:
# looks good so far
Kyle Lippincott
copies: correctly skip directories that have already been considered...
r39299 dirmove[dsrc] = ddst
Matt Mackall
copies: move findcopies code to its own module...
r6274
for i in invalid:
if i in dirmove:
del dirmove[i]
Martin von Zweigbergk
copies: make mergecopies() distinguish between copies on each side...
r44657 del d, invalid
Matt Mackall
copies: move findcopies code to its own module...
r6274
if not dirmove:
Martin von Zweigbergk
copies: extract function for finding directory renames...
r44624 return {}, {}
Matt Mackall
copies: move findcopies code to its own module...
r6274
Gregory Szorc
py3: finish porting iteritems() to pycompat and remove source transformer...
r43376 dirmove = {k + b"/": v + b"/" for k, v in pycompat.iteritems(dirmove)}
Kyle Lippincott
copies: correctly skip directories that have already been considered...
r39299
Matt Mackall
copies: move findcopies code to its own module...
r6274 for d in dirmove:
Augie Fackler
formatting: blacken the codebase...
r43346 repo.ui.debug(
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 b" discovered dir src: '%s' -> dst: '%s'\n" % (d, dirmove[d])
Augie Fackler
formatting: blacken the codebase...
r43346 )
Matt Mackall
copies: move findcopies code to its own module...
r6274
Pierre-Yves David
checkcopies: move 'movewithdir' initialisation right before its usage...
r30183 movewithdir = {}
Matt Mackall
copies: move findcopies code to its own module...
r6274 # check unaccounted nonoverlapping files against directory moves
Martin von Zweigbergk
copies: make mergecopies() distinguish between copies on each side...
r44657 for f in addedfiles:
Matt Mackall
copies: move findcopies code to its own module...
r6274 if f not in fullcopy:
for d in dirmove:
if f.startswith(d):
# new file added in a directory that was moved, move it
Augie Fackler
formatting: blacken the codebase...
r43346 df = dirmove[d] + f[len(d) :]
Matt Mackall
copies: don't double-detect items in the directory copy check
r6426 if df not in copy:
Siddharth Agarwal
copies: separate moves via directory renames from explicit copies...
r18134 movewithdir[f] = df
Augie Fackler
formatting: blacken the codebase...
r43346 repo.ui.debug(
Martin von Zweigbergk
cleanup: join string literals that are already on one line...
r43387 b" pending file src: '%s' -> dst: '%s'\n"
Augie Fackler
formatting: blacken the codebase...
r43346 % (f, df)
)
Matt Mackall
copies: move findcopies code to its own module...
r6274 break
Martin von Zweigbergk
copies: extract function for finding directory renames...
r44624 return dirmove, movewithdir
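The directory-rename detection in `_dir_renames()` can be tried outside Mercurial with a small standalone sketch. This is an illustration only, not the upstream code: it uses `str` paths instead of bytes, the names `find_dir_moves` and `move_with_dir` are hypothetical, `existing_dirs` stands in for `ctx.dirs()`, and the `if df not in copy` check is left out for brevity.

```python
import posixpath


def find_dir_moves(fullcopy, existing_dirs):
    """Build a {source_dir/: dest_dir/} map from a {dst: src} copy dict.

    existing_dirs is an assumed stand-in for ctx.dirs(): the directories
    that still exist on the side being examined.
    """
    invalid = set()
    dirmove = {}
    for dst, src in fullcopy.items():
        dsrc, ddst = posixpath.dirname(src), posixpath.dirname(dst)
        if dsrc in invalid:
            continue  # already known not to be a whole-directory move
        elif dsrc in existing_dirs and ddst in existing_dirs:
            invalid.add(dsrc)  # source directory was not entirely moved
        elif dsrc in dirmove and dirmove[dsrc] != ddst:
            invalid.add(dsrc)  # files went to two different directories
        else:
            dirmove[dsrc] = ddst  # looks good so far
    for d in invalid:
        dirmove.pop(d, None)
    # a trailing slash makes the prefix test below match whole components
    return {k + "/": v + "/" for k, v in dirmove.items()}


def move_with_dir(dirmove, addedfiles, fullcopy):
    """Map files added under a moved directory to their new location."""
    moved = {}
    for f in addedfiles:
        if f in fullcopy:
            continue  # already tracked as an explicit copy
        for d, target in dirmove.items():
            if f.startswith(d):
                moved[f] = target + f[len(d):]
                break
    return moved


copies = {"b/x": "a/x", "b/y": "a/y"}
dirmove = find_dir_moves(copies, existing_dirs={"b"})
pending = move_with_dir(dirmove, ["a/z"], copies)
```

With every file of `a/` copied to `b/` and `a/` no longer present, a file added as `a/z` on the other side is scheduled to follow the directory to `b/z`.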
Durham Goode
copies: refactor checkcopies() into a top level method...
r19178
Augie Fackler
formatting: blacken the codebase...
r43346
Pulkit Goyal
copytrace: move fast heuristic copytracing algorithm to core...
r34180 def _heuristicscopytracing(repo, c1, c2, base):
Augie Fackler
formating: upgrade to black 20.8b1...
r46554 """Fast copytracing using filename heuristics
Pulkit Goyal
copytrace: move fast heuristic copytracing algorithm to core...
r34180
Assumes that moves or renames are of the following two types:
1) Inside a directory only (same directory name but different filenames)
2) Move from one directory to another
(same filenames but different directory names)
Works only when there are no merge commits in the "source branch".
The source branch is the commits from base up to c2, not including base.
If a merge is involved, it falls back to _fullcopytracing().
Can be used by setting the following config:
[experimental]
copytrace = heuristics
Pulkit Goyal
copies: add a config to limit the number of candidates to check in heuristics...
r34847
In some cases the copy/move candidates found by heuristics can be very large
in number and that will make the algorithm slow. The number of possible
candidates to check can be limited by using the config
`experimental.copytrace.movecandidateslimit` which defaults to 100.
Pulkit Goyal
copytrace: move fast heuristic copytracing algorithm to core...
r34180 """
if c1.rev() is None:
c1 = c1.p1()
if c2.rev() is None:
c2 = c2.p1()
changedfiles = set()
m1 = c1.manifest()
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 if not repo.revs(b'%d::%d', base.rev(), c2.rev()):
Pulkit Goyal
copytrace: move fast heuristic copytracing algorithm to core...
r34180 # If base is not in c2 branch, we switch to fullcopytracing
Augie Fackler
formatting: blacken the codebase...
r43346 repo.ui.debug(
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 b"switching to full copytracing as base is not "
b"an ancestor of c2\n"
Augie Fackler
formatting: blacken the codebase...
r43346 )
Pulkit Goyal
copytrace: move fast heuristic copytracing algorithm to core...
r34180 return _fullcopytracing(repo, c1, c2, base)
ctx = c2
while ctx != base:
if len(ctx.parents()) == 2:
# To keep things simple let's not handle merges
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 repo.ui.debug(b"switching to full copytracing because of merges\n")
Pulkit Goyal
copytrace: move fast heuristic copytracing algorithm to core...
r34180 return _fullcopytracing(repo, c1, c2, base)
changedfiles.update(ctx.files())
ctx = ctx.p1()
Martin von Zweigbergk
merge: start using the per-side copy dicts...
r44682 copies2 = {}
Pulkit Goyal
copytrace: move fast heuristic copytracing algorithm to core...
r34180 cp = _forwardcopies(base, c2)
Gregory Szorc
py3: finish porting iteritems() to pycompat and remove source transformer...
r43376 for dst, src in pycompat.iteritems(cp):
Pulkit Goyal
copytrace: move fast heuristic copytracing algorithm to core...
r34180 if src in m1:
Martin von Zweigbergk
merge: start using the per-side copy dicts...
r44682 copies2[dst] = src
Pulkit Goyal
copytrace: move fast heuristic copytracing algorithm to core...
r34180
# file is missing if it isn't present in the destination, but is present in
# the base and present in the source.
# Presence in the base is important to exclude added files, presence in the
# source is important to exclude removed files.
Augie Fackler
py3: use list comprehensions instead of filter where we need to eagerly filter...
r36364 filt = lambda f: f not in m1 and f in base and f in c2
missingfiles = [f for f in changedfiles if filt(f)]
Pulkit Goyal
copytrace: move fast heuristic copytracing algorithm to core...
r34180
Martin von Zweigbergk
merge: start using the per-side copy dicts...
r44682 copies1 = {}
Pulkit Goyal
copytrace: move fast heuristic copytracing algorithm to core...
r34180 if missingfiles:
basenametofilename = collections.defaultdict(list)
dirnametofilename = collections.defaultdict(list)
for f in m1.filesnotin(base.manifest()):
basename = os.path.basename(f)
dirname = os.path.dirname(f)
basenametofilename[basename].append(f)
dirnametofilename[dirname].append(f)
for f in missingfiles:
basename = os.path.basename(f)
dirname = os.path.dirname(f)
samebasename = basenametofilename[basename]
samedirname = dirnametofilename[dirname]
movecandidates = samebasename + samedirname
# f is guaranteed to be present in c2, that's why
# c2.filectx(f) won't fail
f2 = c2.filectx(f)
Pulkit Goyal
copies: add a config to limit the number of candidates to check in heuristics...
r34847 # we can have a lot of candidates which can slow down the heuristics
# config value to limit the number of candidate moves to check
Augie Fackler
formatting: blacken the codebase...
r43346 maxcandidates = repo.ui.configint(
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 b'experimental', b'copytrace.movecandidateslimit'
Augie Fackler
formatting: blacken the codebase...
r43346 )
Pulkit Goyal
copies: add a config to limit the number of candidates to check in heuristics...
r34847
if len(movecandidates) > maxcandidates:
Augie Fackler
formatting: blacken the codebase...
r43346 repo.ui.status(
_(
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 b"skipping copytracing for '%s', more "
b"candidates than the limit: %d\n"
Augie Fackler
formatting: blacken the codebase...
r43346 )
% (f, len(movecandidates))
)
Pulkit Goyal
copies: add a config to limit the number of candidates to check in heuristics...
r34847 continue
Pulkit Goyal
copytrace: move fast heuristic copytracing algorithm to core...
r34180 for candidate in movecandidates:
f1 = c1.filectx(candidate)
Gábor Stefanik
copies: clean up _related logic...
r37410 if _related(f1, f2):
Pulkit Goyal
copytrace: move fast heuristic copytracing algorithm to core...
r34180 # if there are a few related copies then we'll merge
# changes into all of them. This matches the behaviour
# of upstream copytracing
Martin von Zweigbergk
merge: start using the per-side copy dicts...
r44682 copies1[candidate] = f
Pulkit Goyal
copytrace: move fast heuristic copytracing algorithm to core...
r34180
Martin von Zweigbergk
merge: start using the per-side copy dicts...
r44682 return branch_copies(copies1), branch_copies(copies2), {}
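The candidate selection used by the heuristic can be sketched on plain path lists. This is a simplified illustration under stated assumptions: `move_candidates` is a hypothetical helper, paths are `str` rather than bytes, `addedfiles` plays the role of `m1.filesnotin(base.manifest())`, and the `_related()` check that the real function applies to each candidate is not performed here.

```python
import collections
import posixpath


def move_candidates(missingfiles, addedfiles, limit=100):
    """Group added files by basename and by directory, then look up
    candidates for each missing file.

    Files with more candidates than `limit` are skipped, mirroring the
    role of experimental.copytrace.movecandidateslimit.
    """
    by_basename = collections.defaultdict(list)
    by_dirname = collections.defaultdict(list)
    for f in addedfiles:
        by_basename[posixpath.basename(f)].append(f)
        by_dirname[posixpath.dirname(f)].append(f)

    result = {}
    for f in missingfiles:
        same_base = by_basename.get(posixpath.basename(f), [])
        same_dir = by_dirname.get(posixpath.dirname(f), [])
        candidates = same_base + same_dir
        if len(candidates) > limit:
            continue  # too many candidates, skip copytracing for f
        result[f] = candidates
    return result


added = ["lib/util.py", "src/helpers.py", "docs/readme.txt"]
found = move_candidates(["src/util.py"], added)
```

Here `src/util.py` gets two candidates: `lib/util.py` (same filename, different directory) and `src/helpers.py` (same directory, different filename).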
Pulkit Goyal
copytrace: move fast heuristic copytracing algorithm to core...
r34180
Augie Fackler
formatting: blacken the codebase...
r43346
Gábor Stefanik
copies: clean up _related logic...
r37410 def _related(f1, f2):
Pierre-Yves David
checkcopies: extract the '_related' closure...
r30138 """return True if f1 and f2 filectx have a common ancestor
Walk back to common ancestor to see if the two files originate
from the same file. Since workingfilectx's rev() is None it messes
up the integer comparison logic, hence the pre-step check for
None (f1 and f2 can only be workingfilectx's initially).
"""
if f1 == f2:
Augie Fackler
formatting: blacken the codebase...
r43346 return True # a match
Pierre-Yves David
checkcopies: extract the '_related' closure...
r30138
g1, g2 = f1.ancestors(), f2.ancestors()
try:
f1r, f2r = f1.linkrev(), f2.linkrev()
if f1r is None:
f1 = next(g1)
if f2r is None:
f2 = next(g2)
while True:
f1r, f2r = f1.linkrev(), f2.linkrev()
if f1r > f2r:
f1 = next(g1)
elif f2r > f1r:
f2 = next(g2)
Augie Fackler
formatting: blacken the codebase...
r43346 else: # f1 and f2 point to files in the same linkrev
return f1 == f2 # true if they point to the same file
Pierre-Yves David
checkcopies: extract the '_related' closure...
r30138 except StopIteration:
return False
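The linkrev walk in `_related()` can be modelled without filectx objects. In this sketch, which is an assumption-laden simplification rather than the upstream implementation, each file history is a list of `(linkrev, file_id)` tuples ordered from newest to oldest, the starting revision is included in the list, and the special handling of a `None` linkrev for working-copy files is omitted.

```python
def related(chain1, chain2):
    """Return True if two histories meet at the same file revision.

    Each chain is a newest-to-oldest list of (linkrev, file_id) tuples,
    a hypothetical stand-in for a filectx and its ancestors().
    """
    it1, it2 = iter(chain1), iter(chain2)
    try:
        a, b = next(it1), next(it2)
        while True:
            if a[0] > b[0]:
                a = next(it1)  # step back the side with the newer linkrev
            elif b[0] > a[0]:
                b = next(it2)
            else:
                # same linkrev: related only if it is the same file revision
                return a == b
    except StopIteration:
        # one history ran out before the two met
        return False


renamed = [(10, "new.txt"), (4, "old.txt")]
original = [(7, "old.txt"), (4, "old.txt")]
unrelated = [(7, "other.txt"), (5, "other.txt")]
```

Stepping back only the side with the higher linkrev means each history is traversed at most once before the walk either meets or runs out.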
Augie Fackler
formatting: blacken the codebase...
r43346
Martin von Zweigbergk
graftcopies: remove `skip` and `repo` arguments...
r44551 def graftcopies(wctx, ctx, base):
Martin von Zweigbergk
graftcopies: document why the function is useful at all...
r44552 """reproduce copies between base and ctx in the wctx
Unlike mergecopies(), this function will only consider copies between base
and ctx; it will ignore copies between base and wctx. Also unlike
mergecopies(), this function will apply copies to the working copy (instead
of just returning information about the copies). That makes it cheaper
(especially in the common case of base==ctx.p1()) and useful also when
experimental.copytrace=off.
merge.update() will have already marked most copies, but it will only
mark copies if it thinks the source files are related (see
merge._related()). It will also not mark copies if the file wasn't modified
on the local side. This function adds the copies that were "missed"
by merge.update().
"""
Martin von Zweigbergk
graftcopies: use _filter() for filtering out invalid copies...
r44550 new_copies = pathcopies(base, ctx)
_filter(wctx.p1(), wctx, new_copies)
for dst, src in pycompat.iteritems(new_copies):
wctx[dst].markcopied(src)