upstream/mercurial-mirror Files · mercurial/mdiff.py

mdiff: replace wscleanup() regexps with C loops

On my system it reduces:

  hg annotate -w mercurial/commands.py

from 36s to less than 8s, to be compared with 6.3s when run without whitespace options.

Patrick Mezard

File last commit: r15530:eeac5e17 default

mercurial/mdiff.py | 333 lines | 10.6 KiB | text/x-python

        mpm@selenic.com
    
mdiff.py: kill #! line, add copyright notice...

              r239
            
      # mdiff.py - diff and patch routines for mercurial

      #

        Vadim Gelfer
    
update copyrights.

              r2859
            
      # Copyright 2005, 2006 Matt Mackall <mpm@selenic.com>

        mpm@selenic.com
    
mdiff.py: kill #! line, add copyright notice...

              r239
            
      #

        Martin Geisler
    
updated license to be explicit about GPL version 2

              r8225
            
      # This software may be used and distributed according to the terms of the

        Matt Mackall
    
Update license to GPLv2+

              r10263
            
      # GNU General Public License version 2 or any later version.

        mpm@selenic.com
    
mdiff.py: kill #! line, add copyright notice...

              r239
            
        Patrick Mezard
    
Let --unified default to diff.unified (issue 1076)

              r6467
            
      from i18n import _

        Simon Heimberg
    
separate import lines from mercurial and general python modules

              r8312
            
      import bdiff, mpatch, util

      import re, struct

        mpm@selenic.com
    
Add back links from file revisions to changeset revisions...

              r0
            
        Vadim Gelfer
    
fix speed regression in mdiff caused by line split bugfix.

              r2251
            
      def splitnewlines(text):

        Vadim Gelfer
    
fix diffs containing embedded "\r"....

              r2248
            
          '''like str.splitlines, but only split on newlines.'''

        Vadim Gelfer
    
fix speed regression in mdiff caused by line split bugfix.

              r2251
            
          lines = [l + '\n' for l in text.split('\n')]

          if lines:

              if lines[-1] == '\n':

                  lines.pop()

              else:

                  lines[-1] = lines[-1][:-1]

          return lines
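
The difference from str.splitlines shows up when the text contains a bare carriage return. The sketch below restates the function so that it runs standalone and compares the two approaches; the sample string is made up for illustration.

```python
def splitnewlines(text):
    # Like str.splitlines(True), but split only on '\n'.
    lines = [l + '\n' for l in text.split('\n')]
    if lines:
        if lines[-1] == '\n':
            # text ended with a newline: drop the spurious last entry
            lines.pop()
        else:
            # text did not end with a newline: undo the '\n' added above
            lines[-1] = lines[-1][:-1]
    return lines

sample = "one\rstill one\ntwo\nno newline"  # hypothetical input
print(splitnewlines(sample))      # the '\r' stays inside the first line
print(sample.splitlines(True))    # str.splitlines also breaks at '\r'
```

Keeping a bare "\r" inside a line matters for diffs, since splitting there would change the line count that hunk headers report.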

        Vadim Gelfer
    
fix diffs containing embedded "\r"....

              r2248
            
        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
      class diffopts(object):

          '''context is the number of context lines

          text treats all files as text

          showfunc enables diff -p output

        Brendan Cully
    
Add diff --git option

              r2907
            
          git enables the git extended patch format

        Stephen Darnell
    
Add -D/--nodates options to hg diff/export that removes dates from diff headers...

              r3199
            
          nodates removes dates from diff headers

        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
          ignorews ignores all whitespace changes in the diff

          ignorewsamount ignores changes in the amount of whitespace

        Patrick Mezard
    
patch: support diff data loss detection and upgrade...

              r10189
            
          ignoreblanklines ignores changes whose lines are all blank

          upgrade generates git diffs to avoid data loss

          '''

        Thomas Arendsen Hein
    
Show revisions in diffs like CVS, based on a patch from Goffredo Baroncelli....

              r396
            
        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
          defaults = {

              'context': 3,

              'text': False,

        Matt Mackall
    
diff: don't show function name by default...

              r5863
            
              'showfunc': False,

        Brendan Cully
    
Add diff --git option

              r2907
            
              'git': False,

        Stephen Darnell
    
Add -D/--nodates options to hg diff/export that removes dates from diff headers...

              r3199
            
              'nodates': False,

        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
              'ignorews': False,

              'ignorewsamount': False,

              'ignoreblanklines': False,

        Patrick Mezard
    
patch: support diff data loss detection and upgrade...

              r10189
            
              'upgrade': False,

        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
              }

          __slots__ = defaults.keys()

          def __init__(self, **opts):

              for k in self.__slots__:

                  v = opts.get(k)

                  if v is None:

                      v = self.defaults[k]

                  setattr(self, k, v)

        Patrick Mezard
    
Let --unified default to diff.unified (issue 1076)

              r6467
            
              try:

                  self.context = int(self.context)

              except ValueError:

                  raise util.Abort(_('diff context lines count must be '

                                     'an integer, not %r') % self.context)

        Patrick Mezard
    
mq: preserve --git flag when merging patches...

              r10185
            
          def copy(self, **kwargs):

              opts = dict((k, getattr(self, k)) for k in self.defaults)

              opts.update(kwargs)

              return diffopts(**opts)
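
The defaults/__slots__ pattern used by diffopts can be tried in isolation. This is a reduced sketch, not the class itself: DiffOptsSketch and its three-key option set are hypothetical, and ValueError stands in for util.Abort so the example has no Mercurial dependency.

```python
class DiffOptsSketch(object):
    # Hypothetical, reduced option set; the real class has more keys.
    defaults = {
        'context': 3,
        'git': False,
        'ignorews': False,
    }

    # One slot per option: assigning to a misspelled attribute name
    # raises AttributeError instead of silently creating it.
    __slots__ = tuple(defaults)

    def __init__(self, **opts):
        for k in self.__slots__:
            v = opts.get(k)
            if v is None:
                # missing or explicitly None: fall back to the default
                v = self.defaults[k]
            setattr(self, k, v)
        try:
            self.context = int(self.context)
        except ValueError:
            # The original raises util.Abort here.
            raise ValueError('diff context lines count must be '
                             'an integer, not %r' % self.context)

    def copy(self, **kwargs):
        # Start from the current values, then apply the overrides.
        opts = dict((k, getattr(self, k)) for k in self.defaults)
        opts.update(kwargs)
        return DiffOptsSketch(**opts)

opts = DiffOptsSketch(context='5', git=True)
plain = opts.copy(git=False)
print(opts.context, opts.git, plain.git)
```

Note that a value of None is treated as "not given", so callers can pass through unset command line options unchanged.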

        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
      defaultopts = diffopts()

        Patrick Mezard
    
mdiff: fix diff -b/B/w on mixed whitespace hunks (issue127)...

              r9827
            
      def wsclean(opts, text, blank=True):

        Matt Mackall
    
diff: correctly handle combinations of whitespace options

              r4878
            
          if opts.ignorews:

        Patrick Mezard
    
mdiff: replace wscleanup() regexps with C loops...

              r15530
            
              text = bdiff.fixws(text, 1)

        Matt Mackall
    
diff: correctly handle combinations of whitespace options

              r4878
            
          elif opts.ignorewsamount:

        Patrick Mezard
    
mdiff: replace wscleanup() regexps with C loops...

              r15530
            
              text = bdiff.fixws(text, 0)

        Patrick Mezard
    
mdiff: fix diff -b/B/w on mixed whitespace hunks (issue127)...

              r9827
            
          if blank and opts.ignoreblanklines:

        Patrick Mezard
    
diff: --ignore-blank-lines was too enthusiastic...

              r15509
            
              text = re.sub('\n+', '\n', text).strip('\n')

        Matt Mackall
    
diff: correctly handle combinations of whitespace options

              r4878
            
          return text
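
bdiff.fixws is implemented in C and is not shown on this page. The sketch below models it with regexps under a stated assumption: that mode 1 deletes every run of spaces, tabs and carriage returns, and mode 0 collapses each run to one space and strips whitespace before a newline. fixws_sketch and wsclean_sketch are hypothetical names; only the blank-line handling is taken directly from the code above.

```python
import re

def fixws_sketch(text, allws):
    # Assumed behaviour of bdiff.fixws, modelled with regexps.
    if allws:
        # drop every run of spaces, tabs and carriage returns
        return re.sub('[ \t\r]+', '', text)
    # collapse each run to one space, then strip trailing whitespace
    text = re.sub('[ \t\r]+', ' ', text)
    return text.replace(' \n', '\n')

def wsclean_sketch(text, ignorews=False, ignorewsamount=False,
                   ignoreblanklines=False, blank=True):
    if ignorews:
        text = fixws_sketch(text, 1)
    elif ignorewsamount:
        text = fixws_sketch(text, 0)
    if blank and ignoreblanklines:
        # squeeze runs of newlines, then drop leading and trailing ones
        text = re.sub('\n+', '\n', text).strip('\n')
    return text

print(repr(wsclean_sketch("a  b\t c\r\n", ignorews=True)))
print(repr(wsclean_sketch("a  b \t\n", ignorewsamount=True)))
print(repr(wsclean_sketch("a\n\n\nb\n\n", ignoreblanklines=True)))
```

The blank=False switch lets a caller normalize whitespace while keeping every newline, so line numbers in the cleaned text still line up with the original.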

        Patrick Mezard
    
annotate: support diff whitespace filtering flags (issue3030)...

              r15528
            
      def splitblock(base1, lines1, base2, lines2, opts):

          # The input lines match except for interwoven blank lines. We

          # transform it into a sequence of matching blocks and blank blocks.

          lines1 = [(wsclean(opts, l) and 1 or 0) for l in lines1]

          lines2 = [(wsclean(opts, l) and 1 or 0) for l in lines2]

          s1, e1 = 0, len(lines1)

          s2, e2 = 0, len(lines2)

          while s1 < e1 or s2 < e2:

              i1, i2, btype = s1, s2, '='

              if (i1 >= e1 or lines1[i1] == 0

                  or i2 >= e2 or lines2[i2] == 0):

                  # Consume the block of blank lines

                  btype = '~'

                  while i1 < e1 and lines1[i1] == 0:

                      i1 += 1

                  while i2 < e2 and lines2[i2] == 0:

                      i2 += 1

              else:

                  # Consume the matching lines

                  while i1 < e1 and lines1[i1] == 1 and lines2[i2] == 1:

                      i1 += 1

                      i2 += 1

              yield [base1 + s1, base1 + i1, base2 + s2, base2 + i2], btype

              s1 = i1

              s2 = i2
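
A small run makes the block splitting easier to follow. The sketch below copies the loop but replaces the wsclean() call with l.strip('\n'), which should be equivalent for a single line when only ignoreblanklines is set; the function name and the sample lines are hypothetical.

```python
def splitblock_sketch(base1, lines1, base2, lines2):
    # 1 marks a line with content, 0 a blank line.
    lines1 = [1 if l.strip('\n') else 0 for l in lines1]
    lines2 = [1 if l.strip('\n') else 0 for l in lines2]
    s1, e1 = 0, len(lines1)
    s2, e2 = 0, len(lines2)
    while s1 < e1 or s2 < e2:
        i1, i2, btype = s1, s2, '='
        if (i1 >= e1 or lines1[i1] == 0
            or i2 >= e2 or lines2[i2] == 0):
            # consume the run of blank lines on either side
            btype = '~'
            while i1 < e1 and lines1[i1] == 0:
                i1 += 1
            while i2 < e2 and lines2[i2] == 0:
                i2 += 1
        else:
            # consume the run of matching non-blank lines
            while i1 < e1 and lines1[i1] == 1 and lines2[i2] == 1:
                i1 += 1
                i2 += 1
        yield [base1 + s1, base1 + i1, base2 + s2, base2 + i2], btype
        s1 = i1
        s2 = i2

old = ["a\n", "\n", "b\n"]   # hypothetical block with one blank line
new = ["a\n", "b\n"]
for block, btype in splitblock_sketch(10, old, 20, new):
    print(block, btype)
```

The function relies on its precondition: both sides must hold the same non-blank lines in the same order, otherwise the inner matching loop can index past the end of the second list.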

      def allblocks(text1, text2, opts=None, lines1=None, lines2=None, refine=False):

        Patrick Mezard
    
mdiff: make diffblocks() return all blocks, matching and changed...

              r15526
            
          """Return (block, type) tuples, where block is an mdiff.blocks

          line entry. type is '=' for blocks matching exactly one another

          (bdiff blocks), '!' for non-matching blocks and '~' for blocks

        Patrick Mezard
    
annotate: support diff whitespace filtering flags (issue3030)...

              r15528
            
          matching only after having filtered blank lines. If refine is True,

          then '~' blocks are refined and are only made of blank lines.

        Patrick Mezard
    
mdiff: make diffblocks() return all blocks, matching and changed...

              r15526
            
          lines1 and lines2 are text1 and text2 split with splitnewlines() if

          they are already available.

        Patrick Mezard
    
mdiff: extract blocks whitespace normalization in diffblocks()...

              r15525
            
          """

          if opts is None:

              opts = defaultopts

          if opts.ignorews or opts.ignorewsamount:

              text1 = wsclean(opts, text1, False)

              text2 = wsclean(opts, text2, False)

          diff = bdiff.blocks(text1, text2)

          for i, s1 in enumerate(diff):

              # The first match is special.

              # we've either found a match starting at line 0 or a match later

              # in the file.  If it starts later, old and new below will both be

              # empty and we'll continue to the next match.

              if i > 0:

                  s = diff[i - 1]

              else:

                  s = [0, 0, 0, 0]

              s = [s[1], s1[0], s[3], s1[2]]

              # bdiff sometimes gives huge matches past eof, this check eats them,

              # and deals with the special first match case described above

        Patrick Mezard
    
mdiff: split lines in allblocks() only when necessary...

              r15529
            
              if s[0] != s[1] or s[2] != s[3]:

        Patrick Mezard
    
mdiff: make diffblocks() return all blocks, matching and changed...

              r15526
            
                  type = '!'

                  if opts.ignoreblanklines:

        Patrick Mezard
    
mdiff: split lines in allblocks() only when necessary...

              r15529
            
                      if lines1 is None:

                          lines1 = splitnewlines(text1)

                      if lines2 is None:

                          lines2 = splitnewlines(text2)

                      old = wsclean(opts, "".join(lines1[s[0]:s[1]]))

                      new = wsclean(opts, "".join(lines2[s[2]:s[3]]))

                      if old == new:

        Patrick Mezard
    
mdiff: make diffblocks() return all blocks, matching and changed...

              r15526
            
                          type = '~'

                  yield s, type

              yield s1, '='
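
The core of allblocks() is the step that turns a list of matching blocks into alternating changed and matching blocks. The sketch below isolates that step. The input list is a hand-written stand-in for bdiff.blocks() output, and it assumes the list ends with a zero-length block at the end of both texts; whitespace and blank-line handling are left out.

```python
def gaps_and_matches(matching):
    # Each block is [a1, a2, b1, b2]: lines a1..a2 of the old text
    # match lines b1..b2 of the new text (end-exclusive).
    prev = [0, 0, 0, 0]
    for cur in matching:
        # The gap runs from the end of the previous match to the
        # start of the current one, on both sides.
        gap = [prev[1], cur[0], prev[3], cur[2]]
        if gap[0] != gap[1] or gap[2] != gap[3]:
            yield gap, '!'
        yield cur, '='
        prev = cur

# Stand-in for comparing a/b/c with a/x/c: lines 0 and 2 match.
matching = [[0, 1, 0, 1], [2, 3, 2, 3], [3, 3, 3, 3]]
result = list(gaps_and_matches(matching))
for block, btype in result:
    print(block, btype)
```

An empty gap, where both ranges have zero length, is skipped. That covers the first match starting at line 0 as well as adjacent matches.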

        Patrick Mezard
    
mdiff: extract blocks whitespace normalization in diffblocks()...

              r15525
            
        Dirkjan Ochtman
    
patch/diff: use a separate function to write the first line of a file diff

              r7200
            
      def diffline(revs, a, b, opts):

          parts = ['diff']

          if opts.git:

              parts.append('--git')

          if revs and not opts.git:

              parts.append(' '.join(["-r %s" % rev for rev in revs]))

          if opts.git:

              parts.append('a/%s' % a)

              parts.append('b/%s' % b)

          else:

              parts.append(a)

          return ' '.join(parts) + '\n'
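
To see the two header styles side by side, the function can be run with a minimal options object. Opts is a hypothetical stand-in for diffopts that carries only the git flag, and the revision and file names are made up.

```python
from collections import namedtuple

Opts = namedtuple('Opts', 'git')  # hypothetical stand-in for diffopts

def diffline(revs, a, b, opts):
    parts = ['diff']
    if opts.git:
        parts.append('--git')
    if revs and not opts.git:
        parts.append(' '.join(["-r %s" % rev for rev in revs]))
    if opts.git:
        # git style names both sides, prefixed with a/ and b/
        parts.append('a/%s' % a)
        parts.append('b/%s' % b)
    else:
        # plain style names only the first file
        parts.append(a)
    return ' '.join(parts) + '\n'

print(diffline(['1a2b', '3c4d'], 'foo.py', 'foo.py', Opts(git=False)))
print(diffline(['1a2b'], 'old.py', 'new.py', Opts(git=True)))
```

In git mode the revisions are dropped from the line, which is why only that mode can express a rename in the header.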

        Thomas Arendsen Hein
    
Remove trailing space

              r7204
            
        Dustin Sallings
    
Use both the from and to name in mdiff.unidiff....

              r5482
            
      def unidiff(a, ad, b, bd, fn1, fn2, r=None, opts=defaultopts):

        Alexis S. L. Carvalho
    
git patches: correct handling of filenames with spaces...

              r4679
            
          def datetag(date, addtab=True):

              if not opts.git and not opts.nodates:

                  return '\t%s\n' % date

        Dustin Sallings
    
Use both the from and to name in mdiff.unidiff....

              r5482
            
              if addtab and ' ' in fn1:

        Alexis S. L. Carvalho
    
git patches: correct handling of filenames with spaces...

              r4679
            
                  return '\t\n'

              return '\n'

        Brendan Cully
    
Remove dates from git export file lines - they confuse git-apply

              r3026
            
        Matt Mackall
    
many, many trivial check-code fixups

              r10282
            
          if not a and not b:

              return ""

        Matt Mackall
    
Clean up mdiff imports

              r1379
            
          epoch = util.datestr((0, 0))

        mpm@selenic.com
    
Attempt to make diff deal with null sources properly...

              r264
            
        Mads Kiilerich
    
diff: always use / in paths in diff...

              r15437
            
          fn1 = util.pconvert(fn1)

          fn2 = util.pconvert(fn2)

        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
          if not opts.text and (util.binary(a) or util.binary(b)):

        Martin Geisler
    
mdiff: compare content of binary files directly...

              r6871
            
              if a and b and len(a) == len(b) and a == b:

        tailgunner@smtp.ru
    
Don't lie that "binary file has changed"...

              r4103
            
                  return ""

        Dustin Sallings
    
Use both the from and to name in mdiff.unidiff....

              r5482
            
              l = ['Binary file %s has changed\n' % fn1]

        Thomas Arendsen Hein
    
Fix diff against an empty file (issue124) and add a test for this.

              r1723
            
          elif not a:

        Vadim Gelfer
    
fix speed regression in mdiff caused by line split bugfix.

              r2251
            
              b = splitnewlines(b)

        Thomas Arendsen Hein
    
Fix diff against an empty file (issue124) and add a test for this.

              r1723
            
              if a is None:

        Alexis S. L. Carvalho
    
git patches: correct handling of filenames with spaces...

              r4679
            
                  l1 = '--- /dev/null%s' % datetag(epoch, False)

        Thomas Arendsen Hein
    
Fix diff against an empty file (issue124) and add a test for this.

              r1723
            
              else:

        Dustin Sallings
    
Use both the from and to name in mdiff.unidiff....

              r5482
            
                  l1 = "--- %s%s" % ("a/" + fn1, datetag(ad))

              l2 = "+++ %s%s" % ("b/" + fn2, datetag(bd))

        mpm@selenic.com
    
Attempt to make diff deal with null sources properly...

              r264
            
              l3 = "@@ -0,0 +1,%d @@\n" % len(b)

              l = [l1, l2, l3] + ["+" + e for e in b]

        Thomas Arendsen Hein
    
Fix diff against an empty file (issue124) and add a test for this.

              r1723
            
          elif not b:

        Vadim Gelfer
    
fix speed regression in mdiff caused by line split bugfix.

              r2251
            
              a = splitnewlines(a)

        Dustin Sallings
    
Use both the from and to name in mdiff.unidiff....

              r5482
            
              l1 = "--- %s%s" % ("a/" + fn1, datetag(ad))

        Thomas Arendsen Hein
    
Fix diff against an empty file (issue124) and add a test for this.

              r1723
            
              if b is None:

        Alexis S. L. Carvalho
    
git patches: correct handling of filenames with spaces...

              r4679
            
                  l2 = '+++ /dev/null%s' % datetag(epoch, False)

        Thomas Arendsen Hein
    
Fix diff against an empty file (issue124) and add a test for this.

              r1723
            
              else:

        Dustin Sallings
    
Use both the from and to name in mdiff.unidiff....

              r5482
            
                  l2 = "+++ %s%s" % ("b/" + fn2, datetag(bd))

        mpm@selenic.com
    
Attempt to make diff deal with null sources properly...

              r264
            
              l3 = "@@ -1,%d +0,0 @@\n" % len(a)

              l = [l1, l2, l3] + ["-" + e for e in a]

          else:

        Vadim Gelfer
    
fix speed regression in mdiff caused by line split bugfix.

              r2251
            
              al = splitnewlines(a)

              bl = splitnewlines(b)

        Benoit Boissinot
    
remove header handling out of mdiff.bunidiff, rename it

              r10614
            
              l = list(_unidiff(a, b, al, bl, opts=opts))

        Matt Mackall
    
many, many trivial check-code fixups

              r10282
            
              if not l:

                  return ""

        Benoit Boissinot
    
remove header handling out of mdiff.bunidiff, rename it

              r10614
            
              l.insert(0, "--- a/%s%s" % (fn1, datetag(ad)))

              l.insert(1, "+++ b/%s%s" % (fn2, datetag(bd)))

        mpm@selenic.com
    
hg diff: fix missing final newline bug

              r170
            
          for ln in xrange(len(l)):

              if l[ln][-1] != '\n':

                  l[ln] += "\n\ No newline at end of file\n"

        Thomas Arendsen Hein
    
Show revisions in diffs like CVS, based on a patch from Goffredo Baroncelli....

              r396
            
          if r:

        Dirkjan Ochtman
    
patch/diff: use a separate function to write the first line of a file diff

              r7200
            
              l.insert(0, diffline(r, fn1, fn2, opts))

        Thomas Arendsen Hein
    
Show revisions in diffs like CVS, based on a patch from Goffredo Baroncelli....

              r396
            
        mpm@selenic.com
    
Add back links from file revisions to changeset revisions...

              r0
            
          return "".join(l)
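
The branch for a missing old side is the simplest path through unidiff() and shows the header layout and the missing-newline marker. The sketch below reproduces only that branch, assuming dates are suppressed (as with the git or nodates options); added_file_diff is a hypothetical helper, not part of the module.

```python
def added_file_diff(fn, text):
    # Split on '\n' only, keeping line endings (as splitnewlines does).
    lines = [l + '\n' for l in text.split('\n')]
    if lines[-1] == '\n':
        lines.pop()
    else:
        lines[-1] = lines[-1][:-1]
    out = ['--- /dev/null\n',
           '+++ b/%s\n' % fn,
           '@@ -0,0 +1,%d @@\n' % len(lines)]
    out += ['+' + l for l in lines]
    # Any line lacking a final newline gets the standard marker.
    for i in range(len(out)):
        if out[i][-1] != '\n':
            out[i] += '\n\\ No newline at end of file\n'
    return ''.join(out)

print(added_file_diff('f.txt', 'x\ny'))
```

The removed-file branch mirrors this with '-' prefixes and a "@@ -1,N +0,0 @@" range.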

        Benoit Boissinot
    
remove header handling out of mdiff.bunidiff, rename it

              r10614
            
      # creates a headerless unified diff

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
      # t1 and t2 are the text to be diffed

      # l1 and l2 are the text broken up into lines

        Benoit Boissinot
    
remove header handling out of mdiff.bunidiff, rename it

              r10614
            
      def _unidiff(t1, t2, l1, l2, opts=defaultopts):

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
          def contextend(l, len):

        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
              ret = l + opts.context

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
              if ret > len:

                  ret = len

              return ret

          def contextstart(l):

        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
              ret = l - opts.context

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
              if ret < 0:

                  return 0

              return ret

        Brodie Rao
    
mdiff: speed up showfunc for large diffs...

              r15141
            
          lastfunc = [0, '']

        Benoit Boissinot
    
remove header handling out of mdiff.bunidiff, rename it

              r10614
            
          def yieldhunk(hunk):

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
              (astart, a2, bstart, b2, delta) = hunk

              aend = contextend(a2, len(l1))

              alen = aend - astart

              blen = b2 - bstart + aend - a2

              func = ""

        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
              if opts.showfunc:

        Brodie Rao
    
mdiff: speed up showfunc for large diffs...

              r15141
            
                  lastpos, func = lastfunc

                  # walk backwards from the start of the context up to the start of

                  # the previous hunk context until we find a line starting with an

                  # alphanumeric char.

                  for i in xrange(astart - 1, lastpos - 1, -1):

                      if l1[i][0].isalnum():

                          func = ' ' + l1[i].rstrip()[:40]

                          lastfunc[1] = func

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
                          break

        Brodie Rao
    
mdiff: speed up showfunc for large diffs...

              r15141
            
                  # by recording this hunk's starting point as the next place to

                  # start looking for function lines, we avoid reading any line in

                  # the file more than once.

                  lastfunc[0] = astart

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
        Nicolas Venegas
    
mdiff/patch: fix bad hunk handling for unified diffs with zero context...

              r15462
            
              # zero-length hunk ranges report their start line as one less

              if alen:

                  astart += 1

              if blen:

                  bstart += 1

              yield "@@ -%d,%d +%d,%d @@%s\n" % (astart, alen,

                                                 bstart, blen, func)

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
              for x in delta:

                  yield x

              for x in xrange(a2, aend):

                  yield ' ' + l1[x]

          # bdiff.blocks gives us the matching sequences in the files.  The loop

          # below finds the spaces between those matching sequences and translates

          # them into diff output.

          #

          hunk = None

        Patrick Mezard
    
mdiff: make diffblocks() return all blocks, matching and changed...

              r15526
            
          for s, stype in allblocks(t1, t2, opts, l1, l2):

              if stype != '!':

                  continue

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
              delta = []

        Patrick Mezard
    
mdiff: extract blocks whitespace normalization in diffblocks()...

              r15525
            
              a1, a2, b1, b2 = s

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
              old = l1[a1:a2]

              new = l2[b1:b2]

              astart = contextstart(a1)

              bstart = contextstart(b1)

              prev = None

              if hunk:

                  # join with the previous hunk if it falls inside the context

        Vadim Gelfer
    
refactor text diff/patch code....

              r2874
            
                  if astart < hunk[1] + opts.context + 1:

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
                      prev = hunk

                      astart = hunk[1]

                      bstart = hunk[3]

                  else:

        Benoit Boissinot
    
remove header handling out of mdiff.bunidiff, rename it

              r10614
            
                      for x in yieldhunk(hunk):

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
                          yield x

              if prev:

                  # we've joined the previous hunk, record the new ending points.

                  hunk[1] = a2

                  hunk[3] = b2

                  delta = hunk[4]

              else:

                  # create a new hunk

        Matt Mackall
    
many, many trivial check-code fixups

              r10282
            
                  hunk = [astart, a2, bstart, b2, delta]

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
        Matt Mackall
    
many, many trivial check-code fixups

              r10282
            
              delta[len(delta):] = [' ' + x for x in l1[astart:a1]]

              delta[len(delta):] = ['-' + x for x in old]

              delta[len(delta):] = ['+' + x for x in new]

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
          if hunk:

        Benoit Boissinot
    
remove header handling out of mdiff.bunidiff, rename it

              r10614
            
              for x in yieldhunk(hunk):

        mason@suse.com
    
Add new bdiff based unidiff generation.

              r1637
            
                  yield x
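
Two small pieces of arithmetic in _unidiff() are easy to get wrong: clamping the context to the file bounds, and converting 0-based starts to the numbers printed in the hunk header. The sketch below restates both as standalone helpers; hunk_header and context_window are hypothetical names.

```python
def hunk_header(astart, alen, bstart, blen, func=''):
    # Starts are 0-based internally. Unified diffs count from 1,
    # except that a zero-length range keeps the unshifted start.
    if alen:
        astart += 1
    if blen:
        bstart += 1
    return "@@ -%d,%d +%d,%d @@%s\n" % (astart, alen, bstart, blen, func)

def context_window(a1, a2, total, context=3):
    # Same clamping as contextstart() and contextend().
    return max(a1 - context, 0), min(a2 + context, total)

print(hunk_header(0, 3, 0, 4))    # ordinary hunk at the top of the file
print(hunk_header(2, 0, 2, 1))    # pure insertion after line 2
print(context_window(1, 2, 10))   # clamped at the start of the file
```

The zero-length case matters mostly with zero context lines, where a pure insertion or deletion leaves one side of the hunk empty.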

        mpm@selenic.com
    
Add a function to return the new text from a binary diff

              r120
            
      def patchtext(bin):

          pos = 0

          t = []

          while pos < len(bin):

              p1, p2, l = struct.unpack(">lll", bin[pos:pos + 12])

              pos += 12

              t.append(bin[pos:pos + l])

              pos += l

          return "".join(t)
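
patchtext() reads a binary delta as a sequence of fragments, each a 12-byte header of three big-endian 32-bit integers followed by that many bytes of new data. The sketch below builds such a delta by hand and extracts its data. It is written for byte strings so it also runs on Python 3, and it treats the first two header fields as start and end offsets in the base text, which is an assumption not stated on this page.

```python
import struct

def patchtext(delta):
    pos = 0
    chunks = []
    while pos < len(delta):
        # header: (assumed) start, end, then length of the new data
        p1, p2, l = struct.unpack(">lll", delta[pos:pos + 12])
        pos += 12
        chunks.append(delta[pos:pos + l])
        pos += l
    return b"".join(chunks)

# Hypothetical delta with two fragments.
delta = (struct.pack(">lll", 0, 0, 5) + b"hello" +
         struct.pack(">lll", 10, 12, 2) + b"hi")
print(patchtext(delta))
```

The result is only the data the delta adds, not the patched text, since the base text is never consulted.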

        mpm@selenic.com
    
Add back links from file revisions to changeset revisions...

              r0
            
      def patch(a, bin):

        Benoit Boissinot
    
mdiff.patch(): add a special case for when the base text is empty...

              r12025
            
          if len(a) == 0:

              # skip over trivial delta header

              return buffer(bin, 12)

        Matt Mackall
    
Clean up mdiff imports

              r1379
            
          return mpatch.patches(a, [bin])

        mpm@selenic.com
    
Start using bdiff for generating deltas...

              r432
            
        Alexis S. L. Carvalho
    
add mdiff.get_matching_blocks

              r4361
            
      # similar to difflib.SequenceMatcher.get_matching_blocks

      def get_matching_blocks(a, b):

          return [(d[0], d[2], d[1] - d[0]) for d in bdiff.blocks(a, b)]
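
The conversion from [a1, a2, b1, b2] blocks to difflib-style triples can be checked against difflib itself. In the sketch below the block list is a hand-written stand-in for bdiff.blocks() output on the two sample sequences, and to_difflib_triples is a hypothetical name for the list comprehension above.

```python
import difflib

def to_difflib_triples(blocks):
    # [a1, a2, b1, b2] -> (start in a, start in b, length)
    return [(d[0], d[2], d[1] - d[0]) for d in blocks]

blocks = [[0, 1, 0, 1], [2, 3, 2, 3], [3, 3, 3, 3]]  # hand-written
triples = to_difflib_triples(blocks)

a = ["a\n", "b\n", "c\n"]
b = ["a\n", "x\n", "c\n"]
reference = difflib.SequenceMatcher(None, a, b).get_matching_blocks()
print(triples)
print([tuple(m) for m in reference])
```

Both lists end with a zero-length block at the end of the inputs, the sentinel that difflib documents for get_matching_blocks().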

        Matt Mackall
    
revlog: generate trivial deltas against null revision...

              r5367
            
      def trivialdiffheader(length):

          return struct.pack(">lll", 0, 0, length)
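
The trivial header explains the shortcut in patch(): a delta against an empty base is one fragment, (0, 0, length) followed by the full text, so skipping 12 bytes yields the result. The sketch below shows this with a pure-Python delta application. apply_delta is a hypothetical stand-in for mpatch.patches with a single delta, and it assumes the header fields are start and end offsets in the base text followed by the data length.

```python
import struct

def trivialdiffheader(length):
    return struct.pack(">lll", 0, 0, length)

def apply_delta(a, delta):
    out = []
    last = 0   # offset in a up to which text has been consumed
    pos = 0
    while pos < len(delta):
        start, end, length = struct.unpack(">lll", delta[pos:pos + 12])
        pos += 12
        out.append(a[last:start])            # unchanged text before it
        out.append(delta[pos:pos + length])  # replacement data
        pos += length
        last = end
    out.append(a[last:])                     # unchanged tail
    return b"".join(out)

full = trivialdiffheader(3) + b"abc"
print(apply_delta(b"", full))                # same bytes as full[12:]
print(apply_delta(b"hello world",
                  struct.pack(">lll", 6, 11, 5) + b"there"))
```

The real implementation is in C and can combine several deltas in one call; this sketch handles one delta whose fragments are sorted and do not overlap.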

        Matt Mackall
    
Clean up mdiff imports

              r1379
            
      patches = mpatch.patches

        mason@suse.com
    
Fill in the uncompressed size during revlog.addgroup...

              r2078
            
      patchedsize = mpatch.patchedsize

        mpm@selenic.com
    
Start using bdiff for generating deltas...

              r432
            
      textdiff = bdiff.bdiff


mpm@selenic.com mdiff.py: kill #! line, add copyright notice...	r239	# mdiff.py - diff and patch routines for mercurial
		#
Vadim Gelfer update copyrights.	r2859	# Copyright 2005, 2006 Matt Mackall <mpm@selenic.com>
mpm@selenic.com mdiff.py: kill #! line, add copyright notice...	r239	#
Martin Geisler updated license to be explicit about GPL version 2	r8225	# This software may be used and distributed according to the terms of the
Matt Mackall Update license to GPLv2+	r10263	# GNU General Public License version 2 or any later version.
mpm@selenic.com mdiff.py: kill #! line, add copyright notice...	r239
Patrick Mezard Let --unified default to diff.unified (issue 1076)	r6467	from i18n import _
Simon Heimberg separate import lines from mercurial and general python modules	r8312	import bdiff, mpatch, util
		import re, struct
mpm@selenic.com Add back links from file revisions to changeset revisions...	r0
Vadim Gelfer fix speed regression in mdiff caused by line split bugfix.	r2251	def splitnewlines(text):
Vadim Gelfer fix diffs containing embedded "\r"....	r2248	'''like str.splitlines, but only split on newlines.'''
Vadim Gelfer fix speed regression in mdiff caused by line split bugfix.	r2251	lines = [l + '\n' for l in text.split('\n')]
		if lines:
		if lines[-1] == '\n':
		lines.pop()
		else:
		lines[-1] = lines[-1][:-1]
		return lines
Vadim Gelfer fix diffs containing embedded "\r"....	r2248
Vadim Gelfer refactor text diff/patch code....	r2874	class diffopts(object):
		    '''context is the number of context lines
		    text treats all files as text
		    showfunc enables diff -p output
Brendan Cully Add diff --git option	r2907	    git enables the git extended patch format
Stephen Darnell Add -D/--nodates options to hg diff/export that removes dates from diff headers...	r3199	    nodates removes dates from diff headers
Vadim Gelfer refactor text diff/patch code....	r2874	    ignorews ignores all whitespace changes in the diff
		    ignorewsamount ignores changes in the amount of whitespace
Patrick Mezard patch: support diff data loss detection and upgrade...	r10189	    ignoreblanklines ignores changes whose lines are all blank
		    upgrade generates git diffs to avoid data loss
		    '''
Thomas Arendsen Hein Show revisions in diffs like CVS, based on a patch from Goffredo Baroncelli....	r396
Vadim Gelfer refactor text diff/patch code....	r2874	    defaults = {
		        'context': 3,
		        'text': False,
Matt Mackall diff: don't show function name by default...	r5863	        'showfunc': False,
Brendan Cully Add diff --git option	r2907	        'git': False,
Stephen Darnell Add -D/--nodates options to hg diff/export that removes dates from diff headers...	r3199	        'nodates': False,
Vadim Gelfer refactor text diff/patch code....	r2874	        'ignorews': False,
		        'ignorewsamount': False,
		        'ignoreblanklines': False,
Patrick Mezard patch: support diff data loss detection and upgrade...	r10189	        'upgrade': False,
Vadim Gelfer refactor text diff/patch code....	r2874	        }

		    __slots__ = defaults.keys()

		    def __init__(self, **opts):
		        for k in self.__slots__:
		            v = opts.get(k)
		            if v is None:
		                v = self.defaults[k]
		            setattr(self, k, v)

Patrick Mezard Let --unified default to diff.unified (issue 1076)	r6467	        try:
		            self.context = int(self.context)
		        except ValueError:
		            raise util.Abort(_('diff context lines count must be '
		                               'an integer, not %r') % self.context)

Patrick Mezard mq: preserve --git flag when merging patches...	r10185	    def copy(self, **kwargs):
		        opts = dict((k, getattr(self, k)) for k in self.defaults)
		        opts.update(kwargs)
		        return diffopts(**opts)

Vadim Gelfer refactor text diff/patch code....	r2874	defaultopts = diffopts()
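
The option-container pattern used by diffopts can be tried in isolation. The following is a minimal sketch, not the Mercurial class: the name DiffOptsSketch is hypothetical, only three of the options are kept, and ValueError stands in for util.Abort so the example has no Mercurial dependency.

```python
class DiffOptsSketch(object):
    """Trimmed-down stand-in for diffopts; only three options are kept."""

    defaults = {
        'context': 3,
        'git': False,
        'ignorews': False,
    }

    # One slot per option, so assigning a misspelled option name fails.
    __slots__ = tuple(defaults)

    def __init__(self, **opts):
        for k in self.__slots__:
            v = opts.get(k)
            if v is None:  # None means "not set": fall back to the default
                v = self.defaults[k]
            setattr(self, k, v)
        try:
            self.context = int(self.context)
        except ValueError:
            # The original raises util.Abort; ValueError is a stand-in.
            raise ValueError('diff context lines count must be '
                             'an integer, not %r' % self.context)

    def copy(self, **kwargs):
        # Snapshot the current values, then apply the overrides.
        opts = dict((k, getattr(self, k)) for k in self.defaults)
        opts.update(kwargs)
        return DiffOptsSketch(**opts)


base = DiffOptsSketch(context='5', git=None)
gitopts = base.copy(git=True)
```

Here base.context ends up as the integer 5 and base.git falls back to False, while gitopts differs from base only in its git flag.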

Patrick Mezard mdiff: fix diff -b/B/w on mixed whitespace hunks (issue127)...	r9827	def wsclean(opts, text, blank=True):
Matt Mackall diff: correctly handle combinations of whitespace options	r4878	    if opts.ignorews:
Patrick Mezard mdiff: replace wscleanup() regexps with C loops...	r15530	        text = bdiff.fixws(text, 1)
Matt Mackall diff: correctly handle combinations of whitespace options	r4878	    elif opts.ignorewsamount:
Patrick Mezard mdiff: replace wscleanup() regexps with C loops...	r15530	        text = bdiff.fixws(text, 0)
Patrick Mezard mdiff: fix diff -b/B/w on mixed whitespace hunks (issue127)...	r9827	    if blank and opts.ignoreblanklines:
Patrick Mezard diff: --ignore-blank-lines was too enthusiastic...	r15509	        text = re.sub('\n+', '\n', text).strip('\n')
Matt Mackall diff: correctly handle combinations of whitespace options	r4878	    return text
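
The blank-line normalization in wsclean() is plain Python and can be checked on its own. A small sketch follows; the helper name collapse_blank_lines is hypothetical, and the bdiff.fixws() whitespace handling is deliberately left out because it lives in a C extension.

```python
import re


def collapse_blank_lines(text):
    # Same expression as in wsclean(): squeeze every run of newlines into
    # one, then drop the newlines left at either end of the text.
    return re.sub('\n+', '\n', text).strip('\n')


old = "a\n\n\nb\n"
new = "\na\nb\n\n"
same = collapse_blank_lines(old) == collapse_blank_lines(new)
```

Both inputs normalize to the same two-line text, which is why a hunk that only adds or removes empty lines can be classified as '~' rather than '!'.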

Patrick Mezard annotate: support diff whitespace filtering flags (issue3030)...	r15528	def splitblock(base1, lines1, base2, lines2, opts):
		    # The input lines match except for interwoven blank lines. We
		    # transform them into a sequence of matching blocks and blank blocks.
		    lines1 = [(wsclean(opts, l) and 1 or 0) for l in lines1]
		    lines2 = [(wsclean(opts, l) and 1 or 0) for l in lines2]
		    s1, e1 = 0, len(lines1)
		    s2, e2 = 0, len(lines2)
		    while s1 < e1 or s2 < e2:
		        i1, i2, btype = s1, s2, '='
		        if (i1 >= e1 or lines1[i1] == 0
		            or i2 >= e2 or lines2[i2] == 0):
		            # Consume the block of blank lines
		            btype = '~'
		            while i1 < e1 and lines1[i1] == 0:
		                i1 += 1
		            while i2 < e2 and lines2[i2] == 0:
		                i2 += 1
		        else:
		            # Consume the matching lines
		            while i1 < e1 and lines1[i1] == 1 and lines2[i2] == 1:
		                i1 += 1
		                i2 += 1
		        yield [base1 + s1, base1 + i1, base2 + s2, base2 + i2], btype
		        s1 = i1
		        s2 = i2
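
To see what splitblock() yields, here is a hedged sketch of the same loop with the wsclean() call replaced by a simple emptiness test. The name splitblock_sketch and the blank-line test are assumptions made to keep the example self-contained; the real function takes an opts argument.

```python
def splitblock_sketch(base1, lines1, base2, lines2):
    # 1 marks a line with content, 0 a blank one. The original derives these
    # flags from wsclean(opts, line); a bare newline test is used here.
    flags1 = [1 if l.strip('\n') else 0 for l in lines1]
    flags2 = [1 if l.strip('\n') else 0 for l in lines2]
    s1, e1 = 0, len(flags1)
    s2, e2 = 0, len(flags2)
    while s1 < e1 or s2 < e2:
        i1, i2, btype = s1, s2, '='
        if (i1 >= e1 or flags1[i1] == 0
            or i2 >= e2 or flags2[i2] == 0):
            # Consume the block of blank lines on both sides
            btype = '~'
            while i1 < e1 and flags1[i1] == 0:
                i1 += 1
            while i2 < e2 and flags2[i2] == 0:
                i2 += 1
        else:
            # Consume the matching lines
            while i1 < e1 and flags1[i1] == 1 and flags2[i2] == 1:
                i1 += 1
                i2 += 1
        yield [base1 + s1, base1 + i1, base2 + s2, base2 + i2], btype
        s1 = i1
        s2 = i2


# One side has an extra blank line between 'a' and 'b'.
blocks = list(splitblock_sketch(10, ['a\n', '\n', 'b\n'],
                                20, ['a\n', 'b\n']))
```

The result alternates '=' blocks for the shared lines with a '~' block covering only the extra blank line, with all indexes shifted by the two base offsets.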

		def allblocks(text1, text2, opts=None, lines1=None, lines2=None, refine=False):
Patrick Mezard mdiff: make diffblocks() return all blocks, matching and changed...	r15526	    """Return (block, type) tuples, where block is an mdiff.blocks
		    line entry. type is '=' for blocks matching exactly one another
		    (bdiff blocks), '!' for non-matching blocks and '~' for blocks
Patrick Mezard annotate: support diff whitespace filtering flags (issue3030)...	r15528	    matching only after having filtered blank lines. If refine is True,
		    then '~' blocks are refined and are only made of blank lines.
Patrick Mezard mdiff: make diffblocks() return all blocks, matching and changed...	r15526	    lines1 and lines2 are text1 and text2 split with splitnewlines(), if
		    they are already available.
Patrick Mezard mdiff: extract blocks whitespace normalization in diffblocks()...	r15525	    """
		    if opts is None:
		        opts = defaultopts
		    if opts.ignorews or opts.ignorewsamount:
		        text1 = wsclean(opts, text1, False)
		        text2 = wsclean(opts, text2, False)
		    diff = bdiff.blocks(text1, text2)
		    for i, s1 in enumerate(diff):
		        # The first match is special.
		        # We've either found a match starting at line 0 or a match later
		        # in the file. If it starts later, old and new below will both be
		        # empty and we'll continue to the next match.
		        if i > 0:
		            s = diff[i - 1]
		        else:
		            s = [0, 0, 0, 0]
		        s = [s[1], s1[0], s[3], s1[2]]

		        # bdiff sometimes gives huge matches past eof, this check eats them,
		        # and deals with the special first match case described above
Patrick Mezard mdiff: split lines in allblocks() only when necessary...	r15529	        if s[0] != s[1] or s[2] != s[3]:
Patrick Mezard mdiff: make diffblocks() return all blocks, matching and changed...	r15526	            type = '!'
		            if opts.ignoreblanklines:
Patrick Mezard mdiff: split lines in allblocks() only when necessary...	r15529	                if lines1 is None:
		                    lines1 = splitnewlines(text1)
		                if lines2 is None:
		                    lines2 = splitnewlines(text2)
		                old = wsclean(opts, "".join(lines1[s[0]:s[1]]))
		                new = wsclean(opts, "".join(lines2[s[2]:s[3]]))
		                if old == new:
Patrick Mezard mdiff: make diffblocks() return all blocks, matching and changed...	r15526	                    type = '~'
		            yield s, type
		        yield s1, '='
Patrick Mezard mdiff: extract blocks whitespace normalization in diffblocks()...	r15525
Dirkjan Ochtman patch/diff: use a separate function to write the first line of a file diff	r7200	def diffline(revs, a, b, opts):
		    parts = ['diff']
		    if opts.git:
		        parts.append('--git')
		    if revs and not opts.git:
		        parts.append(' '.join(["-r %s" % rev for rev in revs]))
		    if opts.git:
		        parts.append('a/%s' % a)
		        parts.append('b/%s' % b)
		    else:
		        parts.append(a)
		    return ' '.join(parts) + '\n'
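
The two header shapes produced by diffline() can be illustrated as follows. This is a sketch under one assumption: a plain boolean git parameter replaces the opts object, and the revision strings are made up for the example.

```python
def diffline_sketch(revs, a, b, git):
    # Same branching as diffline(), with opts.git passed in as a boolean.
    parts = ['diff']
    if git:
        parts.append('--git')
    if revs and not git:
        parts.append(' '.join(["-r %s" % rev for rev in revs]))
    if git:
        parts.append('a/%s' % a)
        parts.append('b/%s' % b)
    else:
        parts.append(a)
    return ' '.join(parts) + '\n'


plain = diffline_sketch(['1a2b', '3c4d'], 'foo.py', 'foo.py', git=False)
gitstyle = diffline_sketch(['1a2b', '3c4d'], 'foo.py', 'bar.py', git=True)
```

In plain mode the revisions appear and only the first file name is printed; in git mode the revisions are dropped and both names get their a/ and b/ prefixes.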
Thomas Arendsen Hein Remove trailing space	r7204
Dustin Sallings Use both the from and to name in mdiff.unidiff....	r5482	def unidiff(a, ad, b, bd, fn1, fn2, r=None, opts=defaultopts):
Alexis S. L. Carvalho git patches: correct handling of filenames with spaces...	r4679	    def datetag(date, addtab=True):
		        if not opts.git and not opts.nodates:
		            return '\t%s\n' % date
Dustin Sallings Use both the from and to name in mdiff.unidiff....	r5482	        if addtab and ' ' in fn1:
Alexis S. L. Carvalho git patches: correct handling of filenames with spaces...	r4679	            return '\t\n'
		        return '\n'
Brendan Cully Remove dates from git export file lines - they confuse git-apply	r3026
Matt Mackall many, many trivial check-code fixups	r10282	    if not a and not b:
		        return ""
Matt Mackall Clean up mdiff imports	r1379	    epoch = util.datestr((0, 0))
mpm@selenic.com Attempt to make diff deal with null sources properly...	r264
Mads Kiilerich diff: always use / in paths in diff...	r15437	    fn1 = util.pconvert(fn1)
		    fn2 = util.pconvert(fn2)

Vadim Gelfer refactor text diff/patch code....	r2874	    if not opts.text and (util.binary(a) or util.binary(b)):
Martin Geisler mdiff: compare content of binary files directly...	r6871	        if a and b and len(a) == len(b) and a == b:
tailgunner@smtp.ru Don't lie that "binary file has changed"...	r4103	            return ""
Dustin Sallings Use both the from and to name in mdiff.unidiff....	r5482	        l = ['Binary file %s has changed\n' % fn1]
Thomas Arendsen Hein Fix diff against an empty file (issue124) and add a test for this.	r1723	    elif not a:
Vadim Gelfer fix speed regression in mdiff caused by line split bugfix.	r2251	        b = splitnewlines(b)
Thomas Arendsen Hein Fix diff against an empty file (issue124) and add a test for this.	r1723	        if a is None:
Alexis S. L. Carvalho git patches: correct handling of filenames with spaces...	r4679	            l1 = '--- /dev/null%s' % datetag(epoch, False)
Thomas Arendsen Hein Fix diff against an empty file (issue124) and add a test for this.	r1723	        else:
Dustin Sallings Use both the from and to name in mdiff.unidiff....	r5482	            l1 = "--- %s%s" % ("a/" + fn1, datetag(ad))
		        l2 = "+++ %s%s" % ("b/" + fn2, datetag(bd))
mpm@selenic.com Attempt to make diff deal with null sources properly...	r264	        l3 = "@@ -0,0 +1,%d @@\n" % len(b)
		        l = [l1, l2, l3] + ["+" + e for e in b]
Thomas Arendsen Hein Fix diff against an empty file (issue124) and add a test for this.	r1723	    elif not b:
Vadim Gelfer fix speed regression in mdiff caused by line split bugfix.	r2251	        a = splitnewlines(a)
Dustin Sallings Use both the from and to name in mdiff.unidiff....	r5482	        l1 = "--- %s%s" % ("a/" + fn1, datetag(ad))
Thomas Arendsen Hein Fix diff against an empty file (issue124) and add a test for this.	r1723	        if b is None:
Alexis S. L. Carvalho git patches: correct handling of filenames with spaces...	r4679	            l2 = '+++ /dev/null%s' % datetag(epoch, False)
Thomas Arendsen Hein Fix diff against an empty file (issue124) and add a test for this.	r1723	        else:
Dustin Sallings Use both the from and to name in mdiff.unidiff....	r5482	            l2 = "+++ %s%s" % ("b/" + fn2, datetag(bd))
mpm@selenic.com Attempt to make diff deal with null sources properly...	r264	        l3 = "@@ -1,%d +0,0 @@\n" % len(a)
		        l = [l1, l2, l3] + ["-" + e for e in a]
		    else:
Vadim Gelfer fix speed regression in mdiff caused by line split bugfix.	r2251	        al = splitnewlines(a)
		        bl = splitnewlines(b)
Benoit Boissinot remove header handling out of mdiff.bunidiff, rename it	r10614	        l = list(_unidiff(a, b, al, bl, opts=opts))
Matt Mackall many, many trivial check-code fixups	r10282	        if not l:
		            return ""
Benoit Boissinot remove header handling out of mdiff.bunidiff, rename it	r10614
		        l.insert(0, "--- a/%s%s" % (fn1, datetag(ad)))
		        l.insert(1, "+++ b/%s%s" % (fn2, datetag(bd)))
mpm@selenic.com hg diff: fix missing final newline bug	r170
		    for ln in xrange(len(l)):
		        if l[ln][-1] != '\n':
		            l[ln] += "\n\ No newline at end of file\n"

Thomas Arendsen Hein Show revisions in diffs like CVS, based on a patch from Goffredo Baroncelli....	r396	    if r:
Dirkjan Ochtman patch/diff: use a separate function to write the first line of a file diff	r7200	        l.insert(0, diffline(r, fn1, fn2, opts))
Thomas Arendsen Hein Show revisions in diffs like CVS, based on a patch from Goffredo Baroncelli....	r396
mpm@selenic.com Add back links from file revisions to changeset revisions...	r0	    return "".join(l)

Benoit Boissinot remove header handling out of mdiff.bunidiff, rename it	r10614	# creates a headerless unified diff
mason@suse.com Add new bdiff based unidiff generation.	r1637	# t1 and t2 are the text to be diffed
		# l1 and l2 are the text broken up into lines
Benoit Boissinot remove header handling out of mdiff.bunidiff, rename it	r10614	def _unidiff(t1, t2, l1, l2, opts=defaultopts):
mason@suse.com Add new bdiff based unidiff generation.	r1637	    def contextend(l, len):
Vadim Gelfer refactor text diff/patch code....	r2874	        ret = l + opts.context
mason@suse.com Add new bdiff based unidiff generation.	r1637	        if ret > len:
		            ret = len
		        return ret

		    def contextstart(l):
Vadim Gelfer refactor text diff/patch code....	r2874	        ret = l - opts.context
mason@suse.com Add new bdiff based unidiff generation.	r1637	        if ret < 0:
		            return 0
		        return ret

Brodie Rao mdiff: speed up showfunc for large diffs...	r15141	    lastfunc = [0, '']
Benoit Boissinot remove header handling out of mdiff.bunidiff, rename it	r10614	    def yieldhunk(hunk):
mason@suse.com Add new bdiff based unidiff generation.	r1637	        (astart, a2, bstart, b2, delta) = hunk
		        aend = contextend(a2, len(l1))
		        alen = aend - astart
		        blen = b2 - bstart + aend - a2

		        func = ""
Vadim Gelfer refactor text diff/patch code....	r2874	        if opts.showfunc:
Brodie Rao mdiff: speed up showfunc for large diffs...	r15141	            lastpos, func = lastfunc
		            # walk backwards from the start of the context up to the start of
		            # the previous hunk context until we find a line starting with an
		            # alphanumeric char.
		            for i in xrange(astart - 1, lastpos - 1, -1):
		                if l1[i][0].isalnum():
		                    func = ' ' + l1[i].rstrip()[:40]
		                    lastfunc[1] = func
mason@suse.com Add new bdiff based unidiff generation.	r1637	                    break
Brodie Rao mdiff: speed up showfunc for large diffs...	r15141	            # by recording this hunk's starting point as the next place to
		            # start looking for function lines, we avoid reading any line in
		            # the file more than once.
		            lastfunc[0] = astart
mason@suse.com Add new bdiff based unidiff generation.	r1637
Nicolas Venegas mdiff/patch: fix bad hunk handling for unified diffs with zero context...	r15462	        # zero-length hunk ranges report their start line as one less
		        if alen:
		            astart += 1
		        if blen:
		            bstart += 1

		        yield "@@ -%d,%d +%d,%d @@%s\n" % (astart, alen,
		                                           bstart, blen, func)
mason@suse.com Add new bdiff based unidiff generation.	r1637	        for x in delta:
		            yield x
		        for x in xrange(a2, aend):
		            yield ' ' + l1[x]

		    # bdiff.blocks gives us the matching sequences in the files. The loop
		    # below finds the spaces between those matching sequences and translates
		    # them into diff output.
		    #
		    hunk = None
Patrick Mezard mdiff: make diffblocks() return all blocks, matching and changed...	r15526	    for s, stype in allblocks(t1, t2, opts, l1, l2):
		        if stype != '!':
		            continue
mason@suse.com Add new bdiff based unidiff generation.	r1637	        delta = []
Patrick Mezard mdiff: extract blocks whitespace normalization in diffblocks()...	r15525	        a1, a2, b1, b2 = s
mason@suse.com Add new bdiff based unidiff generation.	r1637	        old = l1[a1:a2]
		        new = l2[b1:b2]

		        astart = contextstart(a1)
		        bstart = contextstart(b1)
		        prev = None
		        if hunk:
		            # join with the previous hunk if it falls inside the context
Vadim Gelfer refactor text diff/patch code....	r2874	            if astart < hunk[1] + opts.context + 1:
mason@suse.com Add new bdiff based unidiff generation.	r1637	                prev = hunk
		                astart = hunk[1]
		                bstart = hunk[3]
		            else:
Benoit Boissinot remove header handling out of mdiff.bunidiff, rename it	r10614	                for x in yieldhunk(hunk):
mason@suse.com Add new bdiff based unidiff generation.	r1637	                    yield x
		        if prev:
		            # we've joined the previous hunk, record the new ending points.
		            hunk[1] = a2
		            hunk[3] = b2
		            delta = hunk[4]
		        else:
		            # create a new hunk
Matt Mackall many, many trivial check-code fixups	r10282	            hunk = [astart, a2, bstart, b2, delta]
mason@suse.com Add new bdiff based unidiff generation.	r1637
Matt Mackall many, many trivial check-code fixups	r10282	        delta[len(delta):] = [' ' + x for x in l1[astart:a1]]
		        delta[len(delta):] = ['-' + x for x in old]
		        delta[len(delta):] = ['+' + x for x in new]
mason@suse.com Add new bdiff based unidiff generation.	r1637
		    if hunk:
Benoit Boissinot remove header handling out of mdiff.bunidiff, rename it	r10614	        for x in yieldhunk(hunk):
mason@suse.com Add new bdiff based unidiff generation.	r1637	            yield x
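
The start-line adjustment in yieldhunk() ("zero-length hunk ranges report their start line as one less") is easy to misread, so here is a small sketch of just the header arithmetic. The function name hunkheader is hypothetical; the inputs are the 0-based starts and lengths that yieldhunk() computes before formatting.

```python
def hunkheader(astart, alen, bstart, blen, func=''):
    # Internal starts are 0-based. A non-empty range is reported 1-based,
    # while an empty range keeps its 0-based value, i.e. "one less".
    if alen:
        astart += 1
    if blen:
        bstart += 1
    return "@@ -%d,%d +%d,%d @@%s\n" % (astart, alen, bstart, blen, func)


added_to_empty = hunkheader(0, 0, 0, 3)
removed_only = hunkheader(4, 2, 4, 0)
with_func = hunkheader(0, 3, 0, 3, ' def f():')
```

The first case reproduces the "@@ -0,0 +1,%d @@" form that unidiff() writes by hand when the old text is empty.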

mpm@selenic.com Add a function to return the new text from a binary diff	r120	def patchtext(bin):
		    pos = 0
		    t = []
		    while pos < len(bin):
		        p1, p2, l = struct.unpack(">lll", bin[pos:pos + 12])
		        pos += 12
		        t.append(bin[pos:pos + l])
		        pos += l
		    return "".join(t)

mpm@selenic.com Add back links from file revisions to changeset revisions...	r0	def patch(a, bin):
Benoit Boissinot mdiff.patch(): add a special case for when the base text is empty...	r12025	    if len(a) == 0:
		        # skip over trivial delta header
		        return buffer(bin, 12)
Matt Mackall Clean up mdiff imports	r1379	    return mpatch.patches(a, [bin])
mpm@selenic.com Start using bdiff for generating deltas...	r432
Alexis S. L. Carvalho add mdiff.get_matching_blocks	r4361	# similar to difflib.SequenceMatcher.get_matching_blocks
		def get_matching_blocks(a, b):
		    return [(d[0], d[2], d[1] - d[0]) for d in bdiff.blocks(a, b)]
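
The tuple reshuffling done by get_matching_blocks() can be compared against difflib directly. The sketch below is an illustration only: bdiff is a C extension and is not called, so the list hypothetical_blocks is a hand-written guess at what bdiff.blocks() would return for these inputs, including the closing zero-length block. Note also that difflib is fed line lists here, whereas bdiff.blocks() takes the two texts.

```python
import difflib


def to_matching_blocks(blocks):
    # Same comprehension as get_matching_blocks(), minus the bdiff call:
    # (a1, a2, b1, b2) becomes (a1, b1, size).
    return [(d[0], d[2], d[1] - d[0]) for d in blocks]


a = ['x\n', 'y\n', 'z\n']
b = ['x\n', 'q\n', 'z\n']

# Hypothetical bdiff-style blocks: lines 0 and 2 match, and a zero-length
# sentinel block closes the list.
hypothetical_blocks = [(0, 1, 0, 1), (2, 3, 2, 3), (3, 3, 3, 3)]

converted = to_matching_blocks(hypothetical_blocks)
reference = [tuple(m) for m in
             difflib.SequenceMatcher(None, a, b).get_matching_blocks()]
```

For this input the converted list and the difflib result agree, which is the sense in which the function is "similar to" difflib's.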

Matt Mackall revlog: generate trivial deltas against null revision...	r5367	def trivialdiffheader(length):
		    return struct.pack(">lll", 0, 0, length)
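
As read from patchtext() and trivialdiffheader(), a binary delta is a sequence of fragments, each made of a 12-byte header (three big-endian 32-bit integers) followed by the new data. The sketch below round-trips one such fragment. It is written for Python 3 bytes, whereas the original targets Python 2 strings, and the _sketch names are hypothetical.

```python
import struct


def trivialdiffheader_sketch(length):
    # Header of a delta against an empty base: 0, 0, then the data length.
    return struct.pack(">lll", 0, 0, length)


def patchtext_sketch(delta):
    # Walk the fragments and keep only the data that follows each header.
    pos = 0
    chunks = []
    while pos < len(delta):
        p1, p2, length = struct.unpack(">lll", delta[pos:pos + 12])
        pos += 12
        chunks.append(delta[pos:pos + length])
        pos += length
    return b"".join(chunks)


payload = b"hello\n"
delta = trivialdiffheader_sketch(len(payload)) + payload

recovered = patchtext_sketch(delta)
# patch() with an empty base simply skips the 12 header bytes; memoryview
# plays the role of Python 2's buffer() here.
skipped = bytes(memoryview(delta)[12:])
```

Both paths give back the payload, which shows why patch() can skip mpatch entirely when the base text is empty and the delta is a single trivial fragment.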

Matt Mackall Clean up mdiff imports	r1379	patches = mpatch.patches
mason@suse.com Fill in the uncompressed size during revlog.addgroup...	r2078	patchedsize = mpatch.patchedsize
mpm@selenic.com Start using bdiff for generating deltas...	r432	textdiff = bdiff.bdiff