upstream/mercurial-mirror Commit - r46770:fce2f20a

copies-rust: start recording overwrite as they happens...

copies-rust: start recording overwrite as they happens If a revision has information overwriting data from another revision, the overwriting revision is a descendant of the overwritten one. So we could warm the Oracle cache with such information to avoid potential future `is_ancestors` call. This provide us with a large speedup in the most expensive cases: Repo Case Source-Rev Dest-Rev # of revisions old time new time Difference Factor time per rev --------------------------------------------------------------------------------------------------------------------------------------------------------------- mozilla-try x00000_revs_x00000_added_x0000_copies 1ae03d022d6d : 228985 revs, 41.113063 s, 36.001255 s, -5.111808 s, × 0.8757, 157 µs/rev mozilla-try x00000_revs_x00000_added_x000_copies 8e29777b48e6 : 382065 revs, 27.891612 s, 14.340641 s, -13.550971 s, × 0.5142, 37 µs/rev Full comparison below: Repo Case Source-Rev Dest-Rev # of revisions old time new time Difference Factor time per rev --------------------------------------------------------------------------------------------------------------------------------------------------------------- mercurial x_revs_x_added_0_copies 39cfcef4f463 : 1 revs, 0.000042 s, 0.000042 s, +0.000000 s, × 1.0000, 42 µs/rev mercurial x_revs_x_added_x_copies 0c1d10351869 : 6 revs, 0.000114 s, 0.000109 s, -0.000005 s, × 0.9561, 18 µs/rev mercurial x000_revs_x000_added_x_copies dd3267698d84 : 1032 revs, 0.004934 s, 0.004953 s, +0.000019 s, × 1.0039, 4 µs/rev pypy x_revs_x_added_0_copies 099ed31b181b : 9 revs, 0.000195 s, 0.000237 s, +0.000042 s, × 1.2154, 26 µs/rev pypy x_revs_x000_added_0_copies 359343b9ac0e : 1 revs, 0.000050 s, 0.000050 s, +0.000000 s, × 1.0000, 50 µs/rev pypy x_revs_x_added_x_copies 72e022663155 : 7 revs, 0.000113 s, 0.000113 s, +0.000000 s, × 1.0000, 16 µs/rev pypy x_revs_x00_added_x_copies ace7255d9a26 : 1 revs, 0.6f1f4a s, 0.6f1f4a s, +0.000000 s, × 1.0000, 322 µs/rev pypy x_revs_x000_added_x000_copies a83dc6a2d56f : 6 revs, 0.010788 s, 0.010702 s, -0.000086 s, × 0.9920, 1783 µs/rev pypy x000_revs_xx00_added_0_copies 2f22446ff07e : 4785 revs, 0.050880 s, 0.050504 s, -0.000376 s, × 0.9926, 10 µs/rev pypy x000_revs_x000_added_x_copies 2c68e87c3efe : 6780 revs, 0.081760 s, 0.080159 s, -0.001601 s, × 0.9804, 11 µs/rev pypy x000_revs_x000_added_x000_copies 7b3dda341c84 : 5441 revs, 0.061382 s, 0.060058 s, -0.001324 s, × 0.9784, 11 µs/rev pypy x0000_revs_x_added_0_copies c9cb1334cc78 : 43645 revs, 0.585802 s, 0.536950 s, -0.048852 s, × 0.9166, 12 µs/rev pypy x0000_revs_xx000_added_0_copies 4ffed77c095c : 2 revs, 0.012803 s, 0.012868 s, +0.000065 s, × 1.0051, 6434 µs/rev pypy x0000_revs_xx000_added_x000_copies d9fa043f30c0 : 11316 revs, 0.113558 s, 0.112806 s, -0.000752 s, × 0.9934, 9 µs/rev netbeans x_revs_x_added_0_copies a01e9239f9e7 : 2 revs, 0.000085 s, 0.000084 s, -0.000001 s, × 0.9882, 42 µs/rev netbeans x_revs_x000_added_0_copies 20eb231cc7d0 : 2 revs, 0.000106 s, 0.000106 s, +0.000000 s, × 1.0000, 53 µs/rev netbeans x_revs_x_added_x_copies 5a39d12eecf4 : 3 revs, 0.000175 s, 0.000174 s, -0.000001 s, × 0.9943, 58 µs/rev netbeans x_revs_x00_added_x_copies 9eec5e90c05f : 9 revs, 0.000721 s, 0.000726 s, +0.000005 s, × 1.0069, 80 µs/rev netbeans x000_revs_xx00_added_0_copies 51d4ae7f1290 : 1421 revs, 0.010127 s, 0.010105 s, -0.000022 s, × 0.9978, 7 µs/rev netbeans x000_revs_x000_added_x_copies 6081d72689dc : 1533 revs, 0.015616 s, 0.015748 s, +0.000132 s, × 1.0085, 10 µs/rev netbeans x000_revs_x000_added_x000_copies 411350406ec2 : 5750 revs, 0.061341 s, 0.060357 s, -0.000984 s, × 0.9840, 10 µs/rev netbeans x0000_revs_xx000_added_x000_copies 1aad62e59ddd : 66949 revs, 0.542214 s, 0.499356 s, -0.042858 s, × 0.9210, 7 µs/rev mozilla-central x_revs_x_added_0_copies 7015fcdd43a2 : 2 revs, 0.000089 s, 0.000092 s, +0.000003 s, × 1.0337, 46 µs/rev mozilla-central x_revs_x000_added_0_copies 40d0c5bed75d : 8 revs, 0.000279 s, 0.000279 s, +0.000000 s, × 1.0000, 34 µs/rev mozilla-central x_revs_x_added_x_copies 14207ffc2b2f : 9 revs, 0.000184 s, 0.000186 s, +0.000002 s, × 1.0109, 20 µs/rev mozilla-central x_revs_x00_added_x_copies 446a150332c3 : 7 revs, 0.000661 s, 0.000660 s, -0.000001 s, × 0.9985, 94 µs/rev mozilla-central x_revs_x000_added_x000_copies 0a5e72d1b479 : 3 revs, 0.003377 s, 0.003372 s, -0.000005 s, × 0.9985, 1124 µs/rev mozilla-central x_revs_x0000_added_x0000_copies c07a39dc4e80 : 6 revs, 0.070508 s, 0.070294 s, -0.000214 s, × 0.9970, 11715 µs/rev mozilla-central x000_revs_xx00_added_0_copies 04a55431795e : 1593 revs, 0.006576 s, 0.006545 s, -0.000031 s, × 0.9953, 4 µs/rev mozilla-central x000_revs_x000_added_x_copies 2d37b966abed : 41 revs, 0.004809 s, 0.004998 s, +0.000189 s, × 1.0393, 121 µs/rev mozilla-central x000_revs_x000_added_x000_copies 4407bd0c6330 : 7839 revs, 0.064872 s, 0.063348 s, -0.001524 s, × 0.9765, 8 µs/rev mozilla-central x0000_revs_xx000_added_0_copies 67118cc6dcad : 615 revs, 0.026142 s, 0.026154 s, +0.000012 s, × 1.0005, 42 µs/rev mozilla-central x0000_revs_xx000_added_x000_copies 96a38b690156 : 30263 revs, 0.203956 s, 0.199063 s, -0.004893 s, × 0.9760, 6 µs/rev mozilla-central x00000_revs_x0000_added_x0000_copies 4c222a1d9a00 : 153721 revs, 1.763853 s, 1.277320 s, -0.486533 s, × 0.7242, 8 µs/rev mozilla-central x00000_revs_x00000_added_x000_copies 1daa622bbe42 : 204976 revs, 2.609761 s, 1.698794 s, -0.910967 s, × 0.6509, 8 µs/rev mozilla-try x_revs_x_added_0_copies 9790f499805a : 2 revs, 0.000847 s, 0.000842 s, -0.000005 s, × 0.9941, 421 µs/rev mozilla-try x_revs_x000_added_0_copies 5bb8ce8c7450 : 2 revs, 0.000867 s, 0.000865 s, -0.000002 s, × 0.9977, 432 µs/rev mozilla-try x_revs_x_added_x_copies 936255a0384a : 4 revs, 0.000161 s, 0.000160 s, -0.000001 s, × 0.9938, 40 µs/rev mozilla-try x_revs_x00_added_x_copies 017afae788ec : 2 revs, 0.001131 s, 0.001122 s, -0.000009 s, × 0.9920, 561 µs/rev mozilla-try x_revs_x000_added_x000_copies 6f0ee96e21ad : 1 revs, 0.033114 s, 0.032743 s, -0.000371 s, × 0.9888, 32743 µs/rev mozilla-try x_revs_x0000_added_x0000_copies c07a39dc4e80 : 6 revs, 0.071092 s, 0.071529 s, +0.000437 s, × 1.0061, 11921 µs/rev mozilla-try x000_revs_xx00_added_0_copies 04a55431795e : 1593 revs, 0.006554 s, 0.006593 s, +0.000039 s, × 1.0060, 4 µs/rev mozilla-try x000_revs_x000_added_x_copies 2d37b966abed : 41 revs, 0.005160 s, 0.005311 s, +0.000151 s, × 1.0293, 129 µs/rev mozilla-try x000_revs_x000_added_x000_copies 4c65cbdabc1f : 6657 revs, 0.065063 s, 0.063063 s, -0.002000 s, × 0.9693, 9 µs/rev mozilla-try x0000_revs_x_added_0_copies a36a2a865d92 : 40314 revs, 0.297118 s, 0.312363 s, +0.015245 s, × 1.0513, 7 µs/rev mozilla-try x0000_revs_x_added_x_copies bcabf2a78927 : 38690 revs, 0.284002 s, 0.283106 s, -0.000896 s, × 0.9968, 7 µs/rev mozilla-try x0000_revs_xx000_added_x_copies 4d0f2c178e66 : 8598 revs, 0.086311 s, 0.083817 s, -0.002494 s, × 0.9711, 9 µs/rev mozilla-try x0000_revs_xx000_added_0_copies 67118cc6dcad : 615 revs, 0.026738 s, 0.026516 s, -0.000222 s, × 0.9917, 43 µs/rev mozilla-try x0000_revs_xx000_added_x000_copies 7ccb2fc7ccb5 : 97052 revs, 1.514270 s, 1.304865 s, -0.209405 s, × 0.8617, 13 µs/rev mozilla-try x0000_revs_x0000_added_x0000_copies e951f4ad123a : 52031 revs, 0.735875 s, 0.681088 s, -0.054787 s, × 0.9255, 13 µs/rev mozilla-try x00000_revs_x_added_0_copies 1ebb79acd503 : 363753 revs, 4.843329 s, 4.454320 s, -0.389009 s, × 0.9197, 12 µs/rev mozilla-try x00000_revs_x00000_added_0_copies d16fde900c9c : 34414 revs, 0.591752 s, 0.567913 s, -0.023839 s, × 0.9597, 16 µs/rev mozilla-try x00000_revs_x_added_x_copies 95d83ee7242d : 362229 revs, 4.760563 s, 4.547043 s, -0.213520 s, × 0.9551, 12 µs/rev mozilla-try x00000_revs_x000_added_x_copies ca82787bb23c : 359344 revs, 4.751942 s, 4.378579 s, -0.373363 s, × 0.9214, 12 µs/rev mozilla-try x00000_revs_x0000_added_x0000_copies eb884023b810 : 192665 revs, 2.605014 s, 1.703622 s, -0.901392 s, × 0.6540, 8 µs/rev mozilla-try x00000_revs_x00000_added_x0000_copies 1ae03d022d6d : 228985 revs, 41.113063 s, 36.001255 s, -5.111808 s, × 0.8757, 157 µs/rev mozilla-try x00000_revs_x00000_added_x000_copies 8e29777b48e6 : 382065 revs, 27.891612 s, 14.340641 s, -13.550971 s, × 0.5142, 37 µs/rev Differential Revision: https://phab.mercurial-scm.org/D9497

marmoute -

r46770:fce2f20a default

parent child

Collapse all files

rust/hg-core/src/copy_tracing.rs

0 +10 -1

              use crate::utils::hg_path::HgPath;
              use crate::utils::hg_path::HgPathBuf;
              use crate::Revision;
              use crate::NULL_REVISION;
              use im_rc::ordmap::DiffItem;
              use im_rc::ordmap::Entry;
              use im_rc::ordmap::OrdMap;
              use std::cmp::Ordering;
              use std::collections::HashMap;
              use std::convert::TryInto;
              pub type PathCopies = HashMap<HgPathBuf, HgPathBuf>;
              type PathToken = usize;
              #[derive(Clone, Debug, PartialEq, Copy)]
              struct TimeStampedPathCopy {
                  /// revision at which the copy information was added
                  rev: Revision,
                  /// the copy source, (Set to None in case of deletion of the associated
                  /// key)
                  path: Option<PathToken>,
              }
              /// maps CopyDestination to Copy Source (+ a "timestamp" for the operation)
              type TimeStampedPathCopies = OrdMap<PathToken, TimeStampedPathCopy>;
              /// hold parent 1, parent 2 and relevant files actions.
              pub type RevInfo<'a> = (Revision, Revision, ChangedFiles<'a>);
              /// represent the files affected by a changesets
              ///
              /// This hold a subset of mercurial.metadata.ChangingFiles as we do not need
              /// all the data categories tracked by it.
              /// This hold a subset of mercurial.metadata.ChangingFiles as we do not need
              /// all the data categories tracked by it.
              pub struct ChangedFiles<'a> {
                  nb_items: u32,
                  index: &'a [u8],
                  data: &'a [u8],
              }
              /// Represent active changes that affect the copy tracing.
              enum Action<'a> {
                  /// The parent ? children edge is removing a file
                  ///
                  /// (actually, this could be the edge from the other parent, but it does
                  /// not matters)
                  Removed(&'a HgPath),
                  /// The parent ? children edge introduce copy information between (dest,
                  /// source)
                  Copied(&'a HgPath, &'a HgPath),
              }
              /// This express the possible "special" case we can get in a merge
              ///
              /// See mercurial/metadata.py for details on these values.
              #[derive(PartialEq)]
              enum MergeCase {
                  /// Merged: file had history on both side that needed to be merged
                  Merged,
                  /// Salvaged: file was candidate for deletion, but survived the merge
                  Salvaged,
                  /// Normal: Not one of the two cases above
                  Normal,
              }
              type FileChange<'a> = (u8, &'a HgPath, &'a HgPath);
              const EMPTY: &[u8] = b"";
              const COPY_MASK: u8 = 3;
              const P1_COPY: u8 = 2;
              const P2_COPY: u8 = 3;
              const ACTION_MASK: u8 = 28;
              const REMOVED: u8 = 12;
              const MERGED: u8 = 8;
              const SALVAGED: u8 = 16;
              impl<'a> ChangedFiles<'a> {
                  const INDEX_START: usize = 4;
                  const ENTRY_SIZE: u32 = 9;
                  const FILENAME_START: u32 = 1;
                  const COPY_SOURCE_START: u32 = 5;
                  pub fn new(data: &'a [u8]) -> Self {
                      assert!(
                          data.len() >= 4,
                          "data size ({}) is too small to contain the header (4)",
                          data.len()
                      );
                      let nb_items_raw: [u8; 4] = (&data[0..=3])
                          .try_into()
                          .expect("failed to turn 4 bytes into 4 bytes");
                      let nb_items = u32::from_be_bytes(nb_items_raw);
                      let index_size = (nb_items * Self::ENTRY_SIZE) as usize;
                      let index_end = Self::INDEX_START + index_size;
                      assert!(
                          data.len() >= index_end,
                          "data size ({}) is too small to fit the index_data ({})",
                          data.len(),
                          index_end
                      );
                      let ret = ChangedFiles {
                          nb_items,
                          index: &data[Self::INDEX_START..index_end],
                          data: &data[index_end..],
                      };
                      let max_data = ret.filename_end(nb_items - 1) as usize;
                      assert!(
                          ret.data.len() >= max_data,
                          "data size ({}) is too small to fit all data ({})",
                          data.len(),
                          index_end + max_data
                      );
                      ret
                  }
                  pub fn new_empty() -> Self {
                      ChangedFiles {
                          nb_items: 0,
                          index: EMPTY,
                          data: EMPTY,
                      }
                  }
                  /// internal function to return an individual entry at a given index
                  fn entry(&'a self, idx: u32) -> FileChange<'a> {
                      if idx >= self.nb_items {
                          panic!(
                              "index for entry is higher that the number of file {} >= {}",
                              idx, self.nb_items
                          )
                      }
                      let flags = self.flags(idx);
                      let filename = self.filename(idx);
                      let copy_idx = self.copy_idx(idx);
                      let copy_source = self.filename(copy_idx);
                      (flags, filename, copy_source)
                  }
                  /// internal function to return the filename of the entry at a given index
                  fn filename(&self, idx: u32) -> &HgPath {
                      let filename_start;
                      if idx == 0 {
                          filename_start = 0;
                      } else {
                          filename_start = self.filename_end(idx - 1)
                      }
                      let filename_end = self.filename_end(idx);
                      let filename_start = filename_start as usize;
                      let filename_end = filename_end as usize;
                      HgPath::new(&self.data[filename_start..filename_end])
                  }
                  /// internal function to return the flag field of the entry at a given
                  /// index
                  fn flags(&self, idx: u32) -> u8 {
                      let idx = idx as usize;
                      self.index[idx * (Self::ENTRY_SIZE as usize)]
                  }
                  /// internal function to return the end of a filename part at a given index
                  fn filename_end(&self, idx: u32) -> u32 {
                      let start = (idx * Self::ENTRY_SIZE) + Self::FILENAME_START;
                      let end = (idx * Self::ENTRY_SIZE) + Self::COPY_SOURCE_START;
                      let start = start as usize;
                      let end = end as usize;
                      let raw = (&self.index[start..end])
                          .try_into()
                          .expect("failed to turn 4 bytes into 4 bytes");
                      u32::from_be_bytes(raw)
                  }
                  /// internal function to return index of the copy source of the entry at a
                  /// given index
                  fn copy_idx(&self, idx: u32) -> u32 {
                      let start = (idx * Self::ENTRY_SIZE) + Self::COPY_SOURCE_START;
                      let end = (idx + 1) * Self::ENTRY_SIZE;
                      let start = start as usize;
                      let end = end as usize;
                      let raw = (&self.index[start..end])
                          .try_into()
                          .expect("failed to turn 4 bytes into 4 bytes");
                      u32::from_be_bytes(raw)
                  }
                  /// Return an iterator over all the `Action` in this instance.
                  fn iter_actions(&self, parent: Parent) -> ActionsIterator {
                      ActionsIterator {
                          changes: &self,
                          parent: parent,
                          current: 0,
                      }
                  }
                  /// return the MergeCase value associated with a filename
                  fn get_merge_case(&self, path: &HgPath) -> MergeCase {
                      if self.nb_items == 0 {
                          return MergeCase::Normal;
                      }
                      let mut low_part = 0;
                      let mut high_part = self.nb_items;
                      while low_part < high_part {
                          let cursor = (low_part + high_part - 1) / 2;
                          let (flags, filename, _source) = self.entry(cursor);
                          match path.cmp(filename) {
                              Ordering::Less => low_part = cursor + 1,
                              Ordering::Greater => high_part = cursor,
                              Ordering::Equal => {
                                  return match flags & ACTION_MASK {
                                      MERGED => MergeCase::Merged,
                                      SALVAGED => MergeCase::Salvaged,
                                      _ => MergeCase::Normal,
                                  };
                              }
                          }
                      }
                      MergeCase::Normal
                  }
              }
              /// A struct responsible for answering "is X ancestors of Y" quickly
              ///
              /// The structure will delegate ancestors call to a callback, and cache the
              /// result.
              #[derive(Debug)]
              struct AncestorOracle<'a, A: Fn(Revision, Revision) -> bool> {
                  inner: &'a A,
                  pairs: HashMap<(Revision, Revision), bool>,
              }
              impl<'a, A: Fn(Revision, Revision) -> bool> AncestorOracle<'a, A> {
                  fn new(func: &'a A) -> Self {
                      Self {
                          inner: func,
                          pairs: HashMap::default(),
                      }
                  }
+                 fn record_overwrite(&mut self, anc: Revision, desc: Revision) {
+                     self.pairs.insert((anc, desc), true);
+                 }
                  /// returns `true` if `anc` is an ancestors of `desc`, `false` otherwise
                  fn is_overwrite(&mut self, anc: Revision, desc: Revision) -> bool {
                      if anc > desc {
                          false
                      } else if anc == desc {
                          true
                      } else {
                          if let Some(b) = self.pairs.get(&(anc, desc)) {
                              *b
                          } else {
                              let b = (self.inner)(anc, desc);
                              self.pairs.insert((anc, desc), b);
                              b
                          }
                      }
                  }
              }
              struct ActionsIterator<'a> {
                  changes: &'a ChangedFiles<'a>,
                  parent: Parent,
                  current: u32,
              }
              impl<'a> Iterator for ActionsIterator<'a> {
                  type Item = Action<'a>;
                  fn next(&mut self) -> Option<Action<'a>> {
                      let copy_flag = match self.parent {
                          Parent::FirstParent => P1_COPY,
                          Parent::SecondParent => P2_COPY,
                      };
                      while self.current < self.changes.nb_items {
                          let (flags, file, source) = self.changes.entry(self.current);
                          self.current += 1;
                          if (flags & ACTION_MASK) == REMOVED {
                              return Some(Action::Removed(file));
                          }
                          let copy = flags & COPY_MASK;
                          if copy == copy_flag {
                              return Some(Action::Copied(file, source));
                          }
                      }
                      return None;
                  }
              }
              /// A small struct whose purpose is to ensure lifetime of bytes referenced in
              /// ChangedFiles
              ///
              /// It is passed to the RevInfoMaker callback who can assign any necessary
              /// content to the `data` attribute. The copy tracing code is responsible for
              /// keeping the DataHolder alive at least as long as the ChangedFiles object.
              pub struct DataHolder<D> {
                  /// RevInfoMaker callback should assign data referenced by the
                  /// ChangedFiles struct it return to this attribute. The DataHolder
                  /// lifetime will be at least as long as the ChangedFiles one.
                  pub data: Option<D>,
              }
              pub type RevInfoMaker<'a, D> =
                  Box<dyn for<'r> Fn(Revision, &'r mut DataHolder<D>) -> RevInfo<'r> + 'a>;
              /// enum used to carry information about the parent → child currently processed
              #[derive(Copy, Clone, Debug)]
              enum Parent {
                  /// The `p1(x) → x` edge
                  FirstParent,
                  /// The `p2(x) → x` edge
                  SecondParent,
              }
              /// A small "tokenizer" responsible of turning full HgPath into lighter
              /// PathToken
              ///
              /// Dealing with small object, like integer is much faster, so HgPath input are
              /// turned into integer "PathToken" and converted back in the end.
              #[derive(Clone, Debug, Default)]
              struct TwoWayPathMap {
                  token: HashMap<HgPathBuf, PathToken>,
                  path: Vec<HgPathBuf>,
              }
              impl TwoWayPathMap {
                  fn tokenize(&mut self, path: &HgPath) -> PathToken {
                      match self.token.get(path) {
                          Some(a) => *a,
                          None => {
                              let a = self.token.len();
                              let buf = path.to_owned();
                              self.path.push(buf.clone());
                              self.token.insert(buf, a);
                              a
                          }
                      }
                  }
                  fn untokenize(&self, token: PathToken) -> &HgPathBuf {
                      assert!(token < self.path.len(), format!("Unknown token: {}", token));
                      &self.path[token]
                  }
              }
              /// Same as mercurial.copies._combine_changeset_copies, but in Rust.
              ///
              /// Arguments are:
              ///
              /// revs: all revisions to be considered
              /// children: a {parent ? [childrens]} mapping
              /// target_rev: the final revision we are combining copies to
              /// rev_info(rev): callback to get revision information:
              ///   * first parent
              ///   * second parent
              ///   * ChangedFiles
              /// isancestors(low_rev, high_rev): callback to check if a revision is an
              ///                                 ancestor of another
              pub fn combine_changeset_copies<A: Fn(Revision, Revision) -> bool, D>(
                  revs: Vec<Revision>,
                  mut children_count: HashMap<Revision, usize>,
                  target_rev: Revision,
                  rev_info: RevInfoMaker<D>,
                  is_ancestor: &A,
              ) -> PathCopies {
                  let mut all_copies = HashMap::new();
                  let mut oracle = AncestorOracle::new(is_ancestor);
                  let mut path_map = TwoWayPathMap::default();
                  for rev in revs {
                      let mut d: DataHolder<D> = DataHolder { data: None };
                      let (p1, p2, changes) = rev_info(rev, &mut d);
                      // We will chain the copies information accumulated for the parent with
                      // the individual copies information the curent revision.  Creating a
                      // new TimeStampedPath for each `rev` → `children` vertex.
                      let mut copies: Option<TimeStampedPathCopies> = None;
                      if p1 != NULL_REVISION {
                          // Retrieve data computed in a previous iteration
                          let parent_copies = get_and_clean_parent_copies(
                              &mut all_copies,
                              &mut children_count,
                              p1,
                          );
                          if let Some(parent_copies) = parent_copies {
                              // combine it with data for that revision
                              let vertex_copies = add_from_changes(
                                  &mut path_map,
+                                 &mut oracle,
                                  &parent_copies,
                                  &changes,
                                  Parent::FirstParent,
                                  rev,
                              );
                              // keep that data around for potential later combination
                              copies = Some(vertex_copies);
                          }
                      }
                      if p2 != NULL_REVISION {
                          // Retrieve data computed in a previous iteration
                          let parent_copies = get_and_clean_parent_copies(
                              &mut all_copies,
                              &mut children_count,
                              p2,
                          );
                          if let Some(parent_copies) = parent_copies {
                              // combine it with data for that revision
                              let vertex_copies = add_from_changes(
                                  &mut path_map,
+                                 &mut oracle,
                                  &parent_copies,
                                  &changes,
                                  Parent::SecondParent,
                                  rev,
                              );
                              copies = match copies {
                                  None => Some(vertex_copies),
                                  // Merge has two parents needs to combines their copy
                                  // information.
                                  //
                                  // If we got data from both parents, We need to combine
                                  // them.
                                  Some(copies) => Some(merge_copies_dict(
                                      &path_map,
                                      vertex_copies,
                                      copies,
                                      &changes,
                                      &mut oracle,
                                  )),
                              };
                          }
                      }
                      match copies {
                          Some(copies) => {
                              all_copies.insert(rev, copies);
                          }
                          _ => {}
                      }
                  }
                  // Drop internal information (like the timestamp) and return the final
                  // mapping.
                  let tt_result = all_copies
                      .remove(&target_rev)
                      .expect("target revision was not processed");
                  let mut result = PathCopies::default();
                  for (dest, tt_source) in tt_result {
                      if let Some(path) = tt_source.path {
                          let path_dest = path_map.untokenize(dest).to_owned();
                          let path_path = path_map.untokenize(path).to_owned();
                          result.insert(path_dest, path_path);
                      }
                  }
                  result
              }
              /// fetch previous computed information
              ///
              /// If no other children are expected to need this information, we drop it from
              /// the cache.
              ///
              /// If parent is not part of the set we are expected to walk, return None.
              fn get_and_clean_parent_copies(
                  all_copies: &mut HashMap<Revision, TimeStampedPathCopies>,
                  children_count: &mut HashMap<Revision, usize>,
                  parent_rev: Revision,
              ) -> Option<TimeStampedPathCopies> {
                  let count = children_count.get_mut(&parent_rev)?;
                  *count -= 1;
                  if *count == 0 {
                      match all_copies.remove(&parent_rev) {
                          Some(c) => Some(c),
                          None => Some(TimeStampedPathCopies::default()),
                      }
                  } else {
                      match all_copies.get(&parent_rev) {
                          Some(c) => Some(c.clone()),
                          None => Some(TimeStampedPathCopies::default()),
                      }
                  }
              }
              /// Combine ChangedFiles with some existing PathCopies information and return
              /// the result
-             fn add_from_changes(
+             fn add_from_changes<A: Fn(Revision, Revision) -> bool>(
                  path_map: &mut TwoWayPathMap,
+                 oracle: &mut AncestorOracle<A>,
                  base_copies: &TimeStampedPathCopies,
                  changes: &ChangedFiles,
                  parent: Parent,
                  current_rev: Revision,
              ) -> TimeStampedPathCopies {
                  let mut copies = base_copies.clone();
                  for action in changes.iter_actions(parent) {
                      match action {
                          Action::Copied(path_dest, path_source) => {
                              let dest = path_map.tokenize(path_dest);
                              let source = path_map.tokenize(path_source);
                              let entry;
                              if let Some(v) = base_copies.get(&source) {
                                  entry = match &v.path {
                                      Some(path) => Some((*(path)).to_owned()),
                                      None => Some(source.to_owned()),
                                  }
                              } else {
                                  entry = Some(source.to_owned());
                              }
                              // Each new entry is introduced by the children, we
                              // record this information as we will need it to take
                              // the right decision when merging conflicting copy
                              // information. See merge_copies_dict for details.
                              match copies.entry(dest) {
                                  Entry::Vacant(slot) => {
                                      let ttpc = TimeStampedPathCopy {
                                          rev: current_rev,
                                          path: entry,
                                      };
                                      slot.insert(ttpc);
                                  }
                                  Entry::Occupied(mut slot) => {
                                      let mut ttpc = slot.get_mut();
+                                     oracle.record_overwrite(ttpc.rev, current_rev);
                                      ttpc.rev = current_rev;
                                      ttpc.path = entry;
                                  }
                              }
                          }
                          Action::Removed(deleted_path) => {
                              // We must drop copy information for removed file.
                              //
                              // We need to explicitly record them as dropped to
                              // propagate this information when merging two
                              // TimeStampedPathCopies object.
                              let deleted = path_map.tokenize(deleted_path);
                              copies.entry(deleted).and_modify(|old| {
+                                 oracle.record_overwrite(old.rev, current_rev);
                                  old.rev = current_rev;
                                  old.path = None;
                              });
                          }
                      }
                  }
                  copies
              }
              /// merge two copies-mapping together, minor and major
              ///
              /// In case of conflict, value from "major" will be picked, unless in some
              /// cases. See inline documentation for details.
              fn merge_copies_dict<A: Fn(Revision, Revision) -> bool>(
                  path_map: &TwoWayPathMap,
                  mut minor: TimeStampedPathCopies,
                  mut major: TimeStampedPathCopies,
                  changes: &ChangedFiles,
                  oracle: &mut AncestorOracle<A>,
              ) -> TimeStampedPathCopies {
                  // This closure exist as temporary help while multiple developper are
                  // actively working on this code. Feel free to re-inline it once this
                  // code is more settled.
                  let mut cmp_value =
                      |dest: &PathToken,
                       src_minor: &TimeStampedPathCopy,
                       src_major: &TimeStampedPathCopy| {
                          compare_value(
                              path_map, changes, oracle, dest, src_minor, src_major,
                          )
                      };
                  if minor.is_empty() {
                      major
                  } else if major.is_empty() {
                      minor
                  } else if minor.len() * 2 < major.len() {
                      // Lets says we are merging two TimeStampedPathCopies instance A and B.
                      //
                      // If A contains N items, the merge result will never contains more
                      // than N values differents than the one in A
                      //
                      // If B contains M items, with M > N, the merge result will always
                      // result in a minimum of M - N value differents than the on in
                      // A
                      //
                      // As a result, if N < (M-N), we know that simply iterating over A will
                      // yield less difference than iterating over the difference
                      // between A and B.
                      //
                      // This help performance a lot in case were a tiny
                      // TimeStampedPathCopies is merged with a much larger one.
                      for (dest, src_minor) in minor {
                          let src_major = major.get(&dest);
                          match src_major {
                              None => major.insert(dest, src_minor),
                              Some(src_major) => {
                                  match cmp_value(&dest, &src_minor, src_major) {
                                      MergePick::Any | MergePick::Major => None,
                                      MergePick::Minor => major.insert(dest, src_minor),
                                  }
                              }
                          };
                      }
                      major
                  } else if major.len() * 2 < minor.len() {
                      // This use the same rational than the previous block.
                      // (Check previous block documentation for details.)
                      for (dest, src_major) in major {
                          let src_minor = minor.get(&dest);
                          match src_minor {
                              None => minor.insert(dest, src_major),
                              Some(src_minor) => {
                                  match cmp_value(&dest, src_minor, &src_major) {
                                      MergePick::Any | MergePick::Minor => None,
                                      MergePick::Major => minor.insert(dest, src_major),
                                  }
                              }
                          };
                      }
                      minor
                  } else {
                      let mut override_minor = Vec::new();
                      let mut override_major = Vec::new();
                      let mut to_major = |k: &PathToken, v: &TimeStampedPathCopy| {
                          override_major.push((k.clone(), v.clone()))
                      };
                      let mut to_minor = |k: &PathToken, v: &TimeStampedPathCopy| {
                          override_minor.push((k.clone(), v.clone()))
                      };
                      // The diff function leverage detection of the identical subpart if
                      // minor and major has some common ancestors. This make it very
                      // fast is most case.
                      //
                      // In case where the two map are vastly different in size, the current
                      // approach is still slowish because the iteration will iterate over
                      // all the "exclusive" content of the larger on. This situation can be
                      // frequent when the subgraph of revision we are processing has a lot
                      // of roots. Each roots adding they own fully new map to the mix (and
                      // likely a small map, if the path from the root to the "main path" is
                      // small.
                      //
                      // We could do better by detecting such situation and processing them
                      // differently.
                      for d in minor.diff(&major) {
                          match d {
                              DiffItem::Add(k, v) => to_minor(k, v),
                              DiffItem::Remove(k, v) => to_major(k, v),
                              DiffItem::Update { old, new } => {
                                  let (dest, src_major) = new;
                                  let (_, src_minor) = old;
                                  match cmp_value(dest, src_minor, src_major) {
                                      MergePick::Major => to_minor(dest, src_major),
                                      MergePick::Minor => to_major(dest, src_minor),
                                      // If the two entry are identical, no need to do
                                      // anything (but diff should not have yield them)
                                      MergePick::Any => unreachable!(),
                                  }
                              }
                          };
                      }
                      let updates;
                      let mut result;
                      if override_major.is_empty() {
                          result = major
                      } else if override_minor.is_empty() {
                          result = minor
                      } else {
                          if override_minor.len() < override_major.len() {
                              updates = override_minor;
                              result = minor;
                          } else {
                              updates = override_major;
                              result = major;
                          }
                          for (k, v) in updates {
                              result.insert(k, v);
                          }
                      }
                      result
                  }
              }
              /// represent the side that should prevail when merging two
              /// TimeStampedPathCopies
              enum MergePick {
                  /// The "major" (p1) side prevails
                  Major,
                  /// The "minor" (p2) side prevails
                  Minor,
                  /// Any side could be used (because they are the same)
                  Any,
              }
              /// decide which side prevails in case of conflicting values
              #[allow(clippy::if_same_then_else)]
              fn compare_value<A: Fn(Revision, Revision) -> bool>(
                  path_map: &TwoWayPathMap,
                  changes: &ChangedFiles,
                  oracle: &mut AncestorOracle<A>,
                  dest: &PathToken,
                  src_minor: &TimeStampedPathCopy,
                  src_major: &TimeStampedPathCopy,
              ) -> MergePick {
                  if src_major.path == src_minor.path {
                      // we have the same value, but from other source;
                      if src_major.rev == src_minor.rev {
                          // If the two entry are identical, they are both valid
                          MergePick::Any
                      } else if oracle.is_overwrite(src_major.rev, src_minor.rev) {
                          MergePick::Minor
                      } else {
                          MergePick::Major
                      }
                  } else if src_major.rev == src_minor.rev {
                      // We cannot get copy information for both p1 and p2 in the
                      // same rev. So this is the same value.
                      unreachable!(
                          "conflict information from p1 and p2 in the same revision"
                      );
                  } else {
                      let dest_path = path_map.untokenize(*dest);
                      let action = changes.get_merge_case(dest_path);
                      if src_major.path.is_none() && action == MergeCase::Salvaged {
                          // If the file is "deleted" in the major side but was
                          // salvaged by the merge, we keep the minor side alive
                          MergePick::Minor
                      } else if src_minor.path.is_none() && action == MergeCase::Salvaged {
                          // If the file is "deleted" in the minor side but was
                          // salvaged by the merge, unconditionnaly preserve the
                          // major side.
                          MergePick::Major
                      } else if action == MergeCase::Merged {
                          // If the file was actively merged, copy information
                          // from each side might conflict.  The major side will
                          // win such conflict.
                          MergePick::Major
                      } else if oracle.is_overwrite(src_major.rev, src_minor.rev) {
                          // If the minor side is strictly newer than the major
                          // side, it should be kept.
                          MergePick::Minor
                      } else if src_major.path.is_some() {
                          // without any special case, the "major" value win
                          // other the "minor" one.
                          MergePick::Major
                      } else if oracle.is_overwrite(src_minor.rev, src_major.rev) {
                          // the "major" rev is a direct ancestors of "minor",
                          // any different value should
                          // overwrite
                          MergePick::Major
                      } else {
                          // major version is None (so the file was deleted on
                          // that branch) and that branch is independant (neither
                          // minor nor major is an ancestors of the other one.)
                          // We preserve the new
                          // information about the new file.
                          MergePick::Minor
                      }
                  }
              }

General Comments 0

Write
Preview

You need to be logged in to leave comments. Login now

No TODOs yet

	Site-wide shortcuts
/	Use quick search box
g h	Goto home page
g g	Goto my private gists page
g G	Goto my public gists page
g 0-9	Goto bookmarked items from 0-9
n r	New repository page
n g	New gist page

	Repositories
g s	Goto summary page
g c	Goto changelog page
g f	Goto files page
g F	Goto files page with file search activated
g p	Goto pull requests page
g o	Goto repository settings
g O	Goto repository access permissions settings
t s	Toggle sidebar on some pages