upstream/mercurial-mirror Files · mercurial/mdiff.py

parsers: inline fields of dirstate values in C version

Previously, while unpacking the dirstate we'd create 3-4 new CPython objects for most dirstate values:

- the state is a single character string, which is pooled by CPython
- the mode is a new object if it isn't 0 due to being in the lookup set
- the size is a new object if it is greater than 255
- the mtime is a new object if it isn't -1 due to being in the lookup set
- the tuple to contain them all

In some cases such as regular hg status, we actually look at all the objects. In other cases like hg add, hg status for a subdirectory, or hg status with the third-party hgwatchman enabled, we look at almost none of the objects.

This patch eliminates most object creation in these cases by defining a custom C struct that is exposed to Python with an interface similar to a tuple. Only when tuple elements are actually requested are the respective objects created.

The gains, where they're expected, are significant. The following tests are run against a working copy with over 270,000 files.

parse_dirstate becomes significantly faster:

$ hg perfdirstate
before: wall 0.186437 comb 0.180000 user 0.160000 sys 0.020000 (best of 35)
after:  wall 0.093158 comb 0.100000 user 0.090000 sys 0.010000 (best of 95)

and as a result, several commands benefit:

$ time hg status  # with hgwatchman enabled
before: 0.42s user 0.14s system 99% cpu 0.563 total
after:  0.34s user 0.12s system 99% cpu 0.471 total

$ time hg add new-file
before: 0.85s user 0.18s system 99% cpu 1.033 total
after:  0.76s user 0.17s system 99% cpu 0.931 total

There is a slight regression in regular status performance, but this is fixed in an upcoming patch.

Stephen Lee

File last commit:

r21790:3fbef7ac default


                r21809:e250b830

default

mdiff.py | 362 lines | 11.4 KiB | text/x-python

        mpm@selenic.com
    
mdiff.py: kill #! line, add copyright notice...

              r239
            
      # mdiff.py - diff and patch routines for mercurial

      #

        Vadim Gelfer
    
update copyrights.

              r2859
            
      # Copyright 2005, 2006 Matt Mackall <mpm@selenic.com>

        mpm@selenic.com
    
mdiff.py: kill #! line, add copyright notice...

              r239
            
      #

        Martin Geisler
    
updated license to be explicit about GPL version 2

              r8225
            
      # This software may be used and distributed according to the terms of the

        Matt Mackall
    
Update license to GPLv2+

              r10263
            
      # GNU General Public License version 2 or any later version.

        mpm@selenic.com
    
mdiff.py: kill #! line, add copyright notice...

              r239
            
        Patrick Mezard
    
Let --unified default to diff.unified (issue 1076)

              r6467
            
      from i18n import _

        Augie Fackler
    
cleanup: move stdlib imports to their own import statement...

              r20034
            
      import bdiff, mpatch, util, base85

      import re, struct, zlib

        mpm@selenic.com
    
Add back links from file revisions to changeset revisions...

              r0
            
        Vadim Gelfer
    
fix speed regression in mdiff caused by line split bugfix.

              r2251
            
      def splitnewlines(text):

        Vadim Gelfer
    
fix diffs containing embedded "\r"....

              r2248
            
          '''like str.splitlines, but only split on newlines.'''

        Vadim Gelfer
    
fix speed regression in mdiff caused by line split bugfix.

              r2251
            
          lines = [l + '\n' for l in text.split('\n')]

          if lines:

              if lines[-1] == '\n':

                  lines.pop()

              else:

                  lines[-1] = lines[-1][:-1]

          return lines
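
The difference from str.splitlines is easiest to see on text with an embedded "\r". The following is a small sketch in Python 3 syntax (the file itself is Python 2); the function body is the one above, and the sample string is made up for the example.

```python
def splitnewlines(text):
    '''like str.splitlines, but only split on newlines.'''
    lines = [l + '\n' for l in text.split('\n')]
    if lines:
        if lines[-1] == '\n':
            # text ended with '\n' (or was empty): drop the spurious last entry
            lines.pop()
        else:
            # text did not end with '\n': undo the newline we appended
            lines[-1] = lines[-1][:-1]
    return lines

sample = 'a\rb\nc'
print(splitnewlines(sample))     # ['a\rb\n', 'c']
print(sample.splitlines(True))   # ['a\r', 'b\n', 'c']
```

The carriage return stays inside the first line with splitnewlines, whereas str.splitlines treats it as a line boundary.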

        Vadim Gelfer
    
fix diffs containing embedded "\r"....

              r2248
            
        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
      class diffopts(object):

          '''context is the number of context lines

          text treats all files as text

          showfunc enables diff -p output

        Brendan Cully
    
Add diff --git option

              r2907
            
          git enables the git extended patch format

        Stephen Darnell
    
Add -D/--nodates options to hg diff/export that removes dates from diff headers...

              r3199
            
          nodates removes dates from diff headers

        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
          ignorews ignores all whitespace changes in the diff

          ignorewsamount ignores changes in the amount of whitespace

        Patrick Mezard
    
patch: support diff data loss detection and upgrade...

              r10189
            
          ignoreblanklines ignores changes whose lines are all blank

          upgrade generates git diffs to avoid data loss

          '''

        Thomas Arendsen Hein
    
Show revisions in diffs like CVS, based on a patch from Goffredo Baroncelli....

              r396
            
        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
          defaults = {

              'context': 3,

              'text': False,

        Matt Mackall
    
diff: don't show function name by default...

              r5863
            
              'showfunc': False,

        Brendan Cully
    
Add diff --git option

              r2907
            
              'git': False,

        Stephen Darnell
    
Add -D/--nodates options to hg diff/export that removes dates from diff headers...

              r3199
            
              'nodates': False,

        Stephen Lee
    
diff: add nobinary config to suppress git-style binary diffs

              r21790
            
              'nobinary': False,

        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
              'ignorews': False,

              'ignorewsamount': False,

              'ignoreblanklines': False,

        Patrick Mezard
    
patch: support diff data loss detection and upgrade...

              r10189
            
              'upgrade': False,

        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
              }

          __slots__ = defaults.keys()

          def __init__(self, **opts):

              for k in self.__slots__:

                  v = opts.get(k)

                  if v is None:

                      v = self.defaults[k]

                  setattr(self, k, v)

        Patrick Mezard
    
Let --unified default to diff.unified (issue 1076)

              r6467
            
              try:

                  self.context = int(self.context)

              except ValueError:

                  raise util.Abort(_('diff context lines count must be '

                                     'an integer, not %r') % self.context)

        Patrick Mezard
    
mq: preserve --git flag when merging patches...

              r10185
            
          def copy(self, **kwargs):

              opts = dict((k, getattr(self, k)) for k in self.defaults)

              opts.update(kwargs)

              return diffopts(**opts)

        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
      defaultopts = diffopts()
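
As a rough illustration of the defaults-plus-__slots__ pattern used by diffopts, here is a trimmed-down sketch in Python 3 syntax. The class name DiffOpts, the reduced option set, and the use of ValueError in place of util.Abort are assumptions made so the example runs on its own; they are not part of mdiff.py.

```python
class DiffOpts(object):
    # Reduced option set, for illustration only.
    defaults = {'context': 3, 'git': False, 'ignorews': False}
    # One slot per option: assigning any other attribute raises AttributeError.
    __slots__ = tuple(defaults)

    def __init__(self, **opts):
        for k in self.__slots__:
            v = opts.get(k)
            if v is None:
                # Missing or None means "use the default".
                v = self.defaults[k]
            setattr(self, k, v)
        try:
            self.context = int(self.context)
        except ValueError:
            # The original raises util.Abort here; ValueError is a stand-in.
            raise ValueError('diff context lines count must be '
                             'an integer, not %r' % (self.context,))

    def copy(self, **kwargs):
        # Start from the current values, then apply the overrides.
        opts = dict((k, getattr(self, k)) for k in self.defaults)
        opts.update(kwargs)
        return DiffOpts(**opts)

base = DiffOpts(context='5')
gitopts = base.copy(git=True)
print(base.context, base.git, gitopts.git)   # 5 False True
```

Note that keyword arguments that are not option names are silently ignored by __init__, since it only looks up the known keys.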

        Patrick Mezard
    
mdiff: fix diff -b/B/w on mixed whitespace hunks (issue127)...

              r9827
            
      def wsclean(opts, text, blank=True):

        Matt Mackall
    
diff: correctly handle combinations of whitespace options

              r4878
            
          if opts.ignorews:

        Patrick Mezard
    
mdiff: replace wscleanup() regexps with C loops...

              r15530
            
              text = bdiff.fixws(text, 1)

        Matt Mackall
    
diff: correctly handle combinations of whitespace options

              r4878
            
          elif opts.ignorewsamount:

        Patrick Mezard
    
mdiff: replace wscleanup() regexps with C loops...

              r15530
            
              text = bdiff.fixws(text, 0)

        Patrick Mezard
    
mdiff: fix diff -b/B/w on mixed whitespace hunks (issue127)...

              r9827
            
          if blank and opts.ignoreblanklines:

        Patrick Mezard
    
diff: --ignore-blank-lines was too enthusiastic...

              r15509
            
              text = re.sub('\n+', '\n', text).strip('\n')

        Matt Mackall
    
diff: correctly handle combinations of whitespace options

              r4878
            
          return text
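
To make the effect of the three whitespace options concrete, here is a sketch in Python 3 syntax. bdiff.fixws is a C function and is not reproduced here; the fixws below is a hypothetical pure-Python stand-in that assumes mode 1 deletes every run of spaces, tabs and carriage returns, and mode 0 collapses each run to one space and drops whitespace before a newline. The SimpleNamespace object stands in for a diffopts instance.

```python
import re
from types import SimpleNamespace

def fixws(text, allws):
    # Hypothetical approximation of bdiff.fixws (assumption, see lead-in).
    if allws:
        return re.sub(r'[ \t\r]+', '', text)
    text = re.sub(r'[ \t\r]+', ' ', text)
    return text.replace(' \n', '\n')

def wsclean(opts, text, blank=True):
    if opts.ignorews:
        text = fixws(text, 1)
    elif opts.ignorewsamount:
        text = fixws(text, 0)
    if blank and opts.ignoreblanklines:
        # Collapse runs of newlines and trim leading/trailing ones.
        text = re.sub('\n+', '\n', text).strip('\n')
    return text

opts = SimpleNamespace(ignorews=False, ignorewsamount=True,
                       ignoreblanklines=True)
print(repr(wsclean(opts, 'a  b \n\n\nc\n')))          # 'a b\nc'
print(repr(wsclean(opts, 'a  b \n\n\nc\n', False)))   # 'a b\n\n\nc\n'
```

Passing blank=False skips the blank-line step even when ignoreblanklines is set.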

        Patrick Mezard
    
annotate: support diff whitespace filtering flags (issue3030)...

              r15528
            
      def splitblock(base1, lines1, base2, lines2, opts):

          # The input lines match except for interwoven blank lines. We

          # transform it into a sequence of matching blocks and blank blocks.

          lines1 = [(wsclean(opts, l) and 1 or 0) for l in lines1]

          lines2 = [(wsclean(opts, l) and 1 or 0) for l in lines2]

          s1, e1 = 0, len(lines1)

          s2, e2 = 0, len(lines2)

          while s1 < e1 or s2 < e2:

              i1, i2, btype = s1, s2, '='

              if (i1 >= e1 or lines1[i1] == 0

                  or i2 >= e2 or lines2[i2] == 0):

                  # Consume the block of blank lines

                  btype = '~'

                  while i1 < e1 and lines1[i1] == 0:

                      i1 += 1

                  while i2 < e2 and lines2[i2] == 0:

                      i2 += 1

              else:

                  # Consume the matching lines

                  while i1 < e1 and lines1[i1] == 1 and lines2[i2] == 1:

                      i1 += 1

                      i2 += 1

              yield [base1 + s1, base1 + i1, base2 + s2, base2 + i2], btype

              s1 = i1

              s2 = i2
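
A small worked example may help in reading splitblock. The sketch below is a Python 3 port in which the wsclean(opts, l) test is replaced by a caller-supplied predicate; the isblank parameter and the sample lines are assumptions introduced for the example.

```python
def splitblock(base1, lines1, base2, lines2, isblank):
    # 1 marks a line with content, 0 marks a blank line.
    flags1 = [0 if isblank(l) else 1 for l in lines1]
    flags2 = [0 if isblank(l) else 1 for l in lines2]
    s1, e1 = 0, len(flags1)
    s2, e2 = 0, len(flags2)
    while s1 < e1 or s2 < e2:
        i1, i2, btype = s1, s2, '='
        if (i1 >= e1 or flags1[i1] == 0
            or i2 >= e2 or flags2[i2] == 0):
            # Consume the block of blank lines on either side.
            btype = '~'
            while i1 < e1 and flags1[i1] == 0:
                i1 += 1
            while i2 < e2 and flags2[i2] == 0:
                i2 += 1
        else:
            # Consume the run of lines with content on both sides.
            while i1 < e1 and flags1[i1] == 1 and flags2[i2] == 1:
                i1 += 1
                i2 += 1
        yield [base1 + s1, base1 + i1, base2 + s2, base2 + i2], btype
        s1 = i1
        s2 = i2

old = ['a\n', '\n', 'b\n']
new = ['a\n', 'b\n']
for block in splitblock(0, old, 0, new, lambda l: not l.strip('\n')):
    print(block)
# ([0, 1, 0, 1], '=')
# ([1, 2, 1, 1], '~')
# ([2, 3, 1, 2], '=')
```

As the comment in the original says, the function relies on the two inputs being equal once blank lines are removed; it does not check this itself.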

      def allblocks(text1, text2, opts=None, lines1=None, lines2=None, refine=False):

        Patrick Mezard
    
mdiff: make diffblocks() return all blocks, matching and changed...

              r15526
            
          """Return (block, type) tuples, where block is an mdiff.blocks

          line entry. type is '=' for blocks matching exactly one another

          (bdiff blocks), '!' for non-matching blocks and '~' for blocks

        Patrick Mezard
    
annotate: support diff whitespace filtering flags (issue3030)...

              r15528
            
          matching only after having filtered blank lines. If refine is True,

          then '~' blocks are refined and are only made of blank lines.

        Patrick Mezard
    
mdiff: make diffblocks() return all blocks, matching and changed...

              r15526
            
          lines1 and lines2 are text1 and text2 split with splitnewlines() if

          they are already available.

        Patrick Mezard
    
mdiff: extract blocks whitespace normalization in diffblocks()...

              r15525
            
          """

          if opts is None:

              opts = defaultopts

          if opts.ignorews or opts.ignorewsamount:

              text1 = wsclean(opts, text1, False)

              text2 = wsclean(opts, text2, False)

          diff = bdiff.blocks(text1, text2)

          for i, s1 in enumerate(diff):

              # The first match is special.

              # we've either found a match starting at line 0 or a match later

              # in the file.  If it starts later, old and new below will both be

              # empty and we'll continue to the next match.

              if i > 0:

                  s = diff[i - 1]

              else:

                  s = [0, 0, 0, 0]

              s = [s[1], s1[0], s[3], s1[2]]

              # bdiff sometimes gives huge matches past eof, this check eats them,

              # and deals with the special first match case described above

        Patrick Mezard
    
mdiff: split lines in allblocks() only when necessary...

              r15529
            
              if s[0] != s[1] or s[2] != s[3]:

        Patrick Mezard
    
mdiff: make diffblocks() return all blocks, matching and changed...

              r15526
            
                  type = '!'

                  if opts.ignoreblanklines:

        Patrick Mezard
    
mdiff: split lines in allblocks() only when necessary...

              r15529
            
                      if lines1 is None:

                          lines1 = splitnewlines(text1)

                      if lines2 is None:

                          lines2 = splitnewlines(text2)

                      old = wsclean(opts, "".join(lines1[s[0]:s[1]]))

                      new = wsclean(opts, "".join(lines2[s[2]:s[3]]))

                      if old == new:

        Patrick Mezard
    
mdiff: make diffblocks() return all blocks, matching and changed...

              r15526
            
                          type = '~'

                  yield s, type

              yield s1, '='
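
The core of allblocks, turning the gaps between consecutive matching blocks into '!' blocks, can be sketched without the bdiff C extension. The example below is in Python 3 syntax and uses difflib.SequenceMatcher as a stand-in for bdiff.blocks, which is an assumption: the two matchers may pick different matches on some inputs. Whitespace options are left out, and the helper name matching_blocks is made up for the example.

```python
import difflib

def matching_blocks(lines1, lines2):
    # Stand-in for bdiff.blocks: (a1, a2, b1, b2) ranges of equal lines,
    # ending with a zero-length sentinel block at the end of both inputs.
    sm = difflib.SequenceMatcher(None, lines1, lines2, autojunk=False)
    return [(i, i + n, j, j + n) for i, j, n in sm.get_matching_blocks()]

def allblocks(lines1, lines2):
    prev = (0, 0, 0, 0)
    for cur in matching_blocks(lines1, lines2):
        # The region between the previous match and this one.
        gap = (prev[1], cur[0], prev[3], cur[2])
        if gap[0] != gap[1] or gap[2] != gap[3]:
            yield gap, '!'
        yield cur, '='
        prev = cur

old = ['a\n', 'b\n', 'c\n']
new = ['a\n', 'x\n', 'c\n']
for block in allblocks(old, new):
    print(block)
# ((0, 1, 0, 1), '=')
# ((1, 2, 1, 2), '!')
# ((2, 3, 2, 3), '=')
# ((3, 3, 3, 3), '=')
```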

        Patrick Mezard
    
mdiff: extract blocks whitespace normalization in diffblocks()...

              r15525
            
        Guillermo Pérez
    
diff: unify calls to diffline...

              r17940
            
      def unidiff(a, ad, b, bd, fn1, fn2, opts=defaultopts):

        Patrick Mezard
    
mdiff: fix diff header generation for files with spaces (issue3357)...

              r16362
            
          def datetag(date, fn=None):

        Alexis S. L. Carvalho
    
git patches: correct handling of filenames with spaces...

              r4679
            
              if not opts.git and not opts.nodates:

                  return '\t%s\n' % date

        Patrick Mezard
    
mdiff: fix diff header generation for files with spaces (issue3357)...

              r16362
            
              if fn and ' ' in fn:

        Alexis S. L. Carvalho
    
git patches: correct handling of filenames with spaces...

              r4679
            
                  return '\t\n'

              return '\n'

        Brendan Cully
    
Remove dates from git export file lines - they confuse git-apply

              r3026
            
        Matt Mackall
    
many, many trivial check-code fixups

              r10282
            
          if not a and not b:

              return ""

        Matt Mackall
    
Clean up mdiff imports

              r1379
            
          epoch = util.datestr((0, 0))

        mpm@selenic.com
    
Attempt to make diff deal with null sources properly...

              r264
            
        Mads Kiilerich
    
diff: always use / in paths in diff...

              r15437
            
          fn1 = util.pconvert(fn1)

          fn2 = util.pconvert(fn2)

        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
          if not opts.text and (util.binary(a) or util.binary(b)):

        Martin Geisler
    
mdiff: compare content of binary files directly...

              r6871
            
              if a and b and len(a) == len(b) and a == b:

        tailgunner@smtp.ru
    
Don't lie that "binary file has changed"...

              r4103
            
                  return ""

        Dustin Sallings
    
Use both the from and to name in mdiff.unidiff....

              r5482
            
              l = ['Binary file %s has changed\n' % fn1]

        Thomas Arendsen Hein
    
Fix diff against an empty file (issue124) and add a test for this.

              r1723
            
          elif not a:

        Vadim Gelfer
    
fix speed regression in mdiff caused by line split bugfix.

              r2251
            
              b = splitnewlines(b)

        Thomas Arendsen Hein
    
Fix diff against an empty file (issue124) and add a test for this.

              r1723
            
              if a is None:

        Patrick Mezard
    
mdiff: fix diff header generation for files with spaces (issue3357)...

              r16362
            
                  l1 = '--- /dev/null%s' % datetag(epoch)

        Thomas Arendsen Hein
    
Fix diff against an empty file (issue124) and add a test for this.

              r1723
            
              else:

        Patrick Mezard
    
mdiff: fix diff header generation for files with spaces (issue3357)...

              r16362
            
                  l1 = "--- %s%s" % ("a/" + fn1, datetag(ad, fn1))

              l2 = "+++ %s%s" % ("b/" + fn2, datetag(bd, fn2))

        mpm@selenic.com
    
Attempt to make diff deal with null sources properly...

              r264
            
              l3 = "@@ -0,0 +1,%d @@\n" % len(b)

              l = [l1, l2, l3] + ["+" + e for e in b]

        Thomas Arendsen Hein
    
Fix diff against an empty file (issue124) and add a test for this.

              r1723
            
          elif not b:

        Vadim Gelfer
    
fix speed regression in mdiff caused by line split bugfix.

              r2251
            
              a = splitnewlines(a)

        Patrick Mezard
    
mdiff: fix diff header generation for files with spaces (issue3357)...

              r16362
            
              l1 = "--- %s%s" % ("a/" + fn1, datetag(ad, fn1))

        Thomas Arendsen Hein
    
Fix diff against an empty file (issue124) and add a test for this.

              r1723
            
              if b is None:

        Patrick Mezard
    
mdiff: fix diff header generation for files with spaces (issue3357)...

              r16362
            
                  l2 = '+++ /dev/null%s' % datetag(epoch)

        Thomas Arendsen Hein
    
Fix diff against an empty file (issue124) and add a test for this.

              r1723
            
              else:

        Patrick Mezard
    
mdiff: fix diff header generation for files with spaces (issue3357)...

              r16362
            
                  l2 = "+++ %s%s" % ("b/" + fn2, datetag(bd, fn2))

        mpm@selenic.com
    
Attempt to make diff deal with null sources properly...

              r264
            
              l3 = "@@ -1,%d +0,0 @@\n" % len(a)

              l = [l1, l2, l3] + ["-" + e for e in a]

          else:

        Vadim Gelfer
    
fix speed regression in mdiff caused by line split bugfix.

              r2251
            
              al = splitnewlines(a)

              bl = splitnewlines(b)

        Benoit Boissinot
    
remove header handling out of mdiff.bunidiff, rename it

              r10614
            
              l = list(_unidiff(a, b, al, bl, opts=opts))

        Matt Mackall
    
many, many trivial check-code fixups

              r10282
            
              if not l:

                  return ""

        Benoit Boissinot
    
remove header handling out of mdiff.bunidiff, rename it

              r10614
            
        Patrick Mezard
    
mdiff: fix diff header generation for files with spaces (issue3357)...

              r16362
            
              l.insert(0, "--- a/%s%s" % (fn1, datetag(ad, fn1)))

              l.insert(1, "+++ b/%s%s" % (fn2, datetag(bd, fn2)))

        mpm@selenic.com
    
hg diff: fix missing final newline bug

              r170
            
          for ln in xrange(len(l)):

              if l[ln][-1] != '\n':

                  l[ln] += "\n\ No newline at end of file\n"

        mpm@selenic.com
    
Add back links from file revisions to changeset revisions...

              r0
            
          return "".join(l)
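
The final loop of unidiff, which marks lines lacking a trailing newline, can be tried in isolation. This is a sketch in Python 3 syntax; the helper name marknoeol is made up for the example. The backslash is written as "\\" here because Python 3 warns about the unrecognized "\ " escape that the Python 2 source relies on.

```python
def marknoeol(lines):
    # Append the conventional marker after any diff line without a newline.
    out = []
    for line in lines:
        if not line.endswith('\n'):
            line += "\n\\ No newline at end of file\n"
        out.append(line)
    return out

print(''.join(marknoeol(['+first\n', '+last'])), end='')
# +first
# +last
# \ No newline at end of file
```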

        Benoit Boissinot
    
remove header handling out of mdiff.bunidiff, rename it

              r10614
            
      # creates a headerless unified diff

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
      # t1 and t2 are the text to be diffed

      # l1 and l2 are the text broken up into lines

        Benoit Boissinot
    
remove header handling out of mdiff.bunidiff, rename it

              r10614
            
      def _unidiff(t1, t2, l1, l2, opts=defaultopts):

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
          def contextend(l, len):

        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
              ret = l + opts.context

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
              if ret > len:

                  ret = len

              return ret

          def contextstart(l):

        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
              ret = l - opts.context

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
              if ret < 0:

                  return 0

              return ret

        Brodie Rao
    
mdiff: speed up showfunc for large diffs...

              r15141
            
          lastfunc = [0, '']

        Benoit Boissinot
    
remove header handling out of mdiff.bunidiff, rename it

              r10614
            
          def yieldhunk(hunk):

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
              (astart, a2, bstart, b2, delta) = hunk

              aend = contextend(a2, len(l1))

              alen = aend - astart

              blen = b2 - bstart + aend - a2

              func = ""

        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
              if opts.showfunc:

        Brodie Rao
    
mdiff: speed up showfunc for large diffs...

              r15141
            
                  lastpos, func = lastfunc

                  # walk backwards from the start of the context up to the start of

                  # the previous hunk context until we find a line starting with an

                  # alphanumeric char.

                  for i in xrange(astart - 1, lastpos - 1, -1):

                      if l1[i][0].isalnum():

                          func = ' ' + l1[i].rstrip()[:40]

                          lastfunc[1] = func

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
                          break

        Brodie Rao
    
mdiff: speed up showfunc for large diffs...

              r15141
            
                  # by recording this hunk's starting point as the next place to

                  # start looking for function lines, we avoid reading any line in

                  # the file more than once.

                  lastfunc[0] = astart

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
        Nicolas Venegas
    
mdiff/patch: fix bad hunk handling for unified diffs with zero context...

              r15462
            
              # zero-length hunk ranges report their start line as one less

              if alen:

                  astart += 1

              if blen:

                  bstart += 1

              yield "@@ -%d,%d +%d,%d @@%s\n" % (astart, alen,

                                                 bstart, blen, func)

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
              for x in delta:

                  yield x

              for x in xrange(a2, aend):

                  yield ' ' + l1[x]

          # bdiff.blocks gives us the matching sequences in the files.  The loop

          # below finds the spaces between those matching sequences and translates

          # them into diff output.

          #

          hunk = None

        Patrick Mezard
    
mdiff: adjust hunk offsets with --ignore-blank-lines (issue3234)...

              r16089
            
          ignoredlines = 0

        Patrick Mezard
    
mdiff: make diffblocks() return all blocks, matching and changed...

              r15526
            
          for s, stype in allblocks(t1, t2, opts, l1, l2):

        Patrick Mezard
    
mdiff: adjust hunk offsets with --ignore-blank-lines (issue3234)...

              r16089
            
              a1, a2, b1, b2 = s

        Patrick Mezard
    
mdiff: make diffblocks() return all blocks, matching and changed...

              r15526
            
              if stype != '!':

        Patrick Mezard
    
mdiff: adjust hunk offsets with --ignore-blank-lines (issue3234)...

              r16089
            
                  if stype == '~':

                      # The diff context lines are based on t1 content. When

                      # blank lines are ignored, the new lines offsets must

                      # be adjusted as if equivalent blocks ('~') had the

                      # same sizes on both sides.

                      ignoredlines += (b2 - b1) - (a2 - a1)

        Patrick Mezard
    
mdiff: make diffblocks() return all blocks, matching and changed...

              r15526
            
                  continue

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
              delta = []

              old = l1[a1:a2]

              new = l2[b1:b2]

        Patrick Mezard
    
mdiff: adjust hunk offsets with --ignore-blank-lines (issue3234)...

              r16089
            
              b1 -= ignoredlines

              b2 -= ignoredlines

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
              astart = contextstart(a1)

              bstart = contextstart(b1)

              prev = None

              if hunk:

                  # join with the previous hunk if it falls inside the context

        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
                  if astart < hunk[1] + opts.context + 1:

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
                      prev = hunk

                      astart = hunk[1]

                      bstart = hunk[3]

                  else:

        Benoit Boissinot
    
remove header handling out of mdiff.bunidiff, rename it

              r10614
            
                      for x in yieldhunk(hunk):

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
                          yield x

              if prev:

                  # we've joined the previous hunk, record the new ending points.

                  hunk[1] = a2

                  hunk[3] = b2

                  delta = hunk[4]

              else:

                  # create a new hunk

        Matt Mackall
    
many, many trivial check-code fixups

              r10282
            
                  hunk = [astart, a2, bstart, b2, delta]

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
        Matt Mackall
    
many, many trivial check-code fixups

              r10282
            
              delta[len(delta):] = [' ' + x for x in l1[astart:a1]]

              delta[len(delta):] = ['-' + x for x in old]

              delta[len(delta):] = ['+' + x for x in new]

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
          if hunk:

        Benoit Boissinot
    
remove header handling out of mdiff.bunidiff, rename it

              r10614
            
              for x in yieldhunk(hunk):

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
                  yield x
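
The context clamping and the hunk header arithmetic in _unidiff can be checked by hand with a few standalone helpers. The sketch below is in Python 3 syntax; the function signatures (explicit context and length arguments, and the name hunkheader) are assumptions made so the helpers do not depend on the enclosing closure.

```python
def contextstart(l, context):
    # First context line before a change, clamped at the start of the file.
    return max(0, l - context)

def contextend(l, length, context):
    # One past the last context line after a change, clamped at end of file.
    return min(length, l + context)

def hunkheader(astart, alen, bstart, blen, func=''):
    # Offsets come in 0-based; unified diff wants 1-based starts, except
    # that zero-length ranges report their start line as one less.
    if alen:
        astart += 1
    if blen:
        bstart += 1
    return "@@ -%d,%d +%d,%d @@%s\n" % (astart, alen, bstart, blen, func)

print(contextstart(1, 3), contextend(8, 10, 3))   # 0 10
print(hunkheader(0, 0, 0, 3), end='')             # @@ -0,0 +1,3 @@
print(hunkheader(4, 2, 4, 3), end='')             # @@ -5,2 +5,3 @@
```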

        Guillermo Pérez <bisho at fb.com>
    
diff: move b85diff to mdiff module...

              r17939
            
      def b85diff(to, tn):

          '''print base85-encoded binary diff'''

          def fmtline(line):

              l = len(line)

              if l <= 26:

                  l = chr(ord('A') + l - 1)

              else:

                  l = chr(l - 26 + ord('a') - 1)

              return '%c%s\n' % (l, base85.b85encode(line, True))

          def chunk(text, csize=52):

              l = len(text)

              i = 0

              while i < l:

                  yield text[i:i + csize]

                  i += csize

        Guillermo Pérez
    
diff: move index header generation to patch...

              r17946
            
          if to is None:

              to = ''

          if tn is None:

              tn = ''

          if to == tn:

              return ''

        Guillermo Pérez <bisho at fb.com>
    
diff: move b85diff to mdiff module...

              r17939
            
          # TODO: deltas

        Guillermo Pérez
    
diff: move index header generation to patch...

              r17946
            
          ret = []

          ret.append('GIT binary patch\n')

          ret.append('literal %s\n' % len(tn))

        Guillermo Pérez <bisho at fb.com>
    
diff: move b85diff to mdiff module...

              r17939
            
          for l in chunk(zlib.compress(tn)):

              ret.append(fmtline(l))

          ret.append('\n')

        Guillermo Pérez
    
diff: move index header generation to patch...

              r17946
            
        Guillermo Pérez <bisho at fb.com>
    
diff: move b85diff to mdiff module...

              r17939
            
          return ''.join(ret)
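
The line format produced by b85diff is a length character followed by the base85 text: 'A' to 'Z' stand for 1 to 26 bytes and 'a' to 'z' for 27 to 52. Here is a sketch in Python 3 syntax that uses the standard library's base64.b85encode in place of Mercurial's base85 module; treating the two as interchangeable is an assumption of this example, as is the helper name literalpatch.

```python
import base64
import zlib

def fmtline(line):
    # Prefix each encoded line with a character giving its byte length.
    n = len(line)
    if n <= 26:
        prefix = chr(ord('A') + n - 1)
    else:
        prefix = chr(n - 26 + ord('a') - 1)
    return prefix + base64.b85encode(line, pad=True).decode('ascii') + '\n'

def chunk(data, csize=52):
    # Yield successive pieces of at most csize bytes.
    for i in range(0, len(data), csize):
        yield data[i:i + csize]

def literalpatch(new):
    # "literal" form only: the whole new content, zlib-compressed.
    out = ['GIT binary patch\n', 'literal %d\n' % len(new)]
    out.extend(fmtline(c) for c in chunk(zlib.compress(new)))
    out.append('\n')
    return ''.join(out)

print(literalpatch(b'hello'), end='')
```

As the TODO in the original notes, only literal patches are emitted, not deltas.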

        mpm@selenic.com
    
Add a function to return the new text from a binary diff

              r120
            
      def patchtext(bin):

          pos = 0

          t = []

          while pos < len(bin):

              p1, p2, l = struct.unpack(">lll", bin[pos:pos + 12])

              pos += 12

              t.append(bin[pos:pos + l])

              pos += l

          return "".join(t)

        mpm@selenic.com
    
Add back links from file revisions to changeset revisions...

              r0
            
      def patch(a, bin):

        Benoit Boissinot
    
mdiff.patch(): add a special case for when the base text is empty...

              r12025
            
          if len(a) == 0:

              # skip over trivial delta header

        Matt Mackall
    
util: don't mess with builtins to emulate buffer()

              r15657
            
              return util.buffer(bin, 12)

        Matt Mackall
    
Clean up mdiff imports

              r1379
            
          return mpatch.patches(a, [bin])

        mpm@selenic.com
    
Start using bdiff for generating deltas...

              r432
            
        Alexis S. L. Carvalho
    
add mdiff.get_matching_blocks

              r4361
            
      # similar to difflib.SequenceMatcher.get_matching_blocks

      def get_matching_blocks(a, b):

          return [(d[0], d[2], d[1] - d[0]) for d in bdiff.blocks(a, b)]

        Matt Mackall
    
revlog: generate trivial deltas against null revision...

              r5367
            
      def trivialdiffheader(length):

          return struct.pack(">lll", 0, 0, length)
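
patchtext and trivialdiffheader both work on the same binary delta layout: a sequence of fragments, each a 12-byte big-endian header of three signed 32-bit integers followed by that many bytes of replacement data. The sketch below is in Python 3 syntax, using bytes. The applydelta function is a simplified illustration, written on the assumption that the first two header fields are the start and end offsets of the replaced range in the base text; it is not the mpatch implementation.

```python
import struct

def trivialdiffheader(length):
    # Header for a delta that replaces the empty range [0, 0) with new data.
    return struct.pack(">lll", 0, 0, length)

def patchtext(delta):
    # Collect only the new data carried by each fragment.
    pos = 0
    out = []
    while pos < len(delta):
        start, end, length = struct.unpack(">lll", delta[pos:pos + 12])
        pos += 12
        out.append(delta[pos:pos + length])
        pos += length
    return b"".join(out)

def applydelta(base, delta):
    # Illustrative only: copy unchanged base ranges between fragments.
    pos = 0
    last = 0
    out = []
    while pos < len(delta):
        start, end, length = struct.unpack(">lll", delta[pos:pos + 12])
        pos += 12
        out.append(base[last:start])
        out.append(delta[pos:pos + length])
        pos += length
        last = end
    out.append(base[last:])
    return b"".join(out)

delta = struct.pack(">lll", 6, 11, 5) + b'there'
print(applydelta(b'hello world', delta))                   # b'hello there'
print(patchtext(trivialdiffheader(3) + b'abc'))            # b'abc'
```

This also shows why patch() can return the delta minus its first 12 bytes when the base text is empty: such a delta is a trivial header followed by the full new text.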

        Matt Mackall
    
Clean up mdiff imports

              r1379
            
      patches = mpatch.patches

        mason@suse.com
    
Fill in the uncompressed size during revlog.addgroup...

              r2078
            
      patchedsize = mpatch.patchedsize

        mpm@selenic.com
    
Start using bdiff for generating deltas...

              r432
            
      textdiff = bdiff.bdiff
