upstream/mercurial-mirror Files · hgext/highlight/highlight.py

revlog: move revision verification out of verify

File revision verification is performing low-level checks of file storage, namely that flags are appropriate and revision data can be resolved.

Since these checks are somewhat revlog-specific and may not be appropriate for alternate storage backends, this commit moves those checks from verify.py to revlog.py.

Because we're now emitting warnings/errors that apply to specific revisions, we taught the iverifyproblem interface to expose the problematic node and to report this node in verify output. This was necessary to prevent unwanted test changes.

After this change, revlog.verifyintegrity() and file verify code in verify.py both iterate over revisions and resolve their fulltext. But they do so in separate loops. (verify.py needs to resolve fulltexts as part of calling renamed() - at least when using revlogs.)

This should add overhead. But on the mozilla-unified repo:

$ hg verify
before: time: real 700.640 secs (user 585.520+0.000 sys 23.480+0.000)
after:  time: real 682.380 secs (user 570.370+0.000 sys 22.240+0.000)

I'm not sure what's going on. Maybe avoiding the filelog attribute proxies shaved off enough time to offset the losses? Maybe fulltext resolution has less overhead than I thought?

I've left a comment indicating the potential for optimization. But because it doesn't produce a performance regression on a large repository, I'm not going to worry about it.

Differential Revision: https://phab.mercurial-scm.org/D4745

Yuya Nishihara

File last commit: r38402:23dc901c default

r39908:733db72f default

highlight.py | 97 lines | 3.0 KiB | text/x-python | PythonLexer

        Martin Geisler
    
highlight: add copyright and license header

              r8251
            
      # highlight.py - highlight extension implementation file
      #
      #  Copyright 2007-2009 Adam Hupp <adam@hupp.org> and others
      #
      # This software may be used and distributed according to the terms of the

        Matt Mackall
    
Update license to GPLv2+

              r10263
            
      # GNU General Public License version 2 or any later version.

        Patrick Mezard
    
highlight: split code to improve startup times

              r6938
            
      #
      # The original module was split in an interface and an implementation
      # file to defer pygments loading and speedup extension setup.

        Pulkit Goyal
    
py3: make files use absolute_import and print_function...

              r29485
            
      from __future__ import absolute_import

        Patrick Mezard
    
highlight: split code to improve startup times

              r6938
            
      from mercurial import demandimport

        Gregory Szorc
    
demandimport: make module ignores a set (API)...

              r37862
            
      demandimport.IGNORES.update(['pkgutil', 'pkg_resources', '__main__'])

        Pulkit Goyal
    
py3: make files use absolute_import and print_function...

              r29485
            
      from mercurial import (
          encoding,

        Yuya Nishihara
    
stringutil: bulk-replace call sites to point to new module...

              r37102
            
      )

      from mercurial.utils import (
          stringutil,

        Pulkit Goyal
    
py3: make files use absolute_import and print_function...

              r29485
            
      )

        Patrick Mezard
    
highlight: split code to improve startup times

              r6938
            
        Augie Fackler
    
highlight: put pygments import inside demandimport.deactivated...

              r32908
            
      with demandimport.deactivated():
          import pygments
          import pygments.formatters
          import pygments.lexers

        Augie Fackler
    
highlight: eagerly discover plugin lexers while demandimport is off...

              r35330
            
          import pygments.plugin

        Augie Fackler
    
highlight: put pygments import inside demandimport.deactivated...

              r32908
            
          import pygments.util

        Augie Fackler
    
highlight: eagerly discover plugin lexers while demandimport is off...

              r35330
            
          for unused in pygments.plugin.find_plugin_lexers():
              pass

        Pulkit Goyal
    
py3: make files use absolute_import and print_function...

              r29485
            
      highlight = pygments.highlight
      ClassNotFound = pygments.util.ClassNotFound
      guess_lexer = pygments.lexers.guess_lexer
      guess_lexer_for_filename = pygments.lexers.guess_lexer_for_filename
      TextLexer = pygments.lexers.TextLexer
      HtmlFormatter = pygments.formatters.HtmlFormatter

        Patrick Mezard
    
highlight: split code to improve startup times

              r6938
            
      SYNTAX_CSS = ('\n<link rel="stylesheet" href="{url}highlightcss" '
                    'type="text/css" />')

        Gregory Szorc
    
highlight: add option to prevent content-only based fallback...

              r26680
            
      def pygmentize(field, fctx, style, tmpl, guessfilenameonly=False):

        Patrick Mezard
    
highlight: split code to improve startup times

              r6938
            
          # append a <link ...> to the syntax highlighting css

        Yuya Nishihara
    
highlight: get around tmpl.load() which now returns a parsed tree...

              r38402
            
          tmpl.load('header')
          old_header = tmpl.cache['header']

        Patrick Mezard
    
highlight: split code to improve startup times

              r6938
            
          if SYNTAX_CSS not in old_header:

        timeless
    
cleanup: remove superfluous space after space after equals (python)

              r27637
            
              new_header = old_header + SYNTAX_CSS

        Patrick Mezard
    
highlight: split code to improve startup times

              r6938
            
              tmpl.cache['header'] = new_header

          text = fctx.data()

        Yuya Nishihara
    
stringutil: bulk-replace call sites to point to new module...

              r37102
            
          if stringutil.binary(text):

        Patrick Mezard
    
highlight: split code to improve startup times

              r6938
            
              return

        Matt Mackall
    
highlight: ignore Unicode's extra linebreaks (issue4291)...

              r23613
            
          # str.splitlines() != unicode.splitlines() because "reasons"
          for c in "\x0c\x1c\x1d\x1e":
              if c in text:
                  text = text.replace(c, '')

        Yuya Nishihara
    
highlight: fixes garbled text in non-UTF-8 environment...

              r9424
            
          # Pygments is best used with Unicode strings:
          # <http://pygments.org/docs/unicode/>
          text = text.decode(encoding.encoding, 'replace')

        Christian Ebert
    
highlight: convert text to local before passing to pygmentize (issue1341)...

              r7120
            
        Patrick Mezard
    
highlight: split code to improve startup times

              r6938
            
          # To get multi-line strings right, we can't format line-by-line
          try:

        Alexander Plavin
    
highlight: fix page layout with empty first and last lines...

              r19169
            
              lexer = guess_lexer_for_filename(fctx.path(), text[:1024],
                                               stripnl=False)

        Patrick Mezard
    
highlight: split code to improve startup times

              r6938
            
          except (ClassNotFound, ValueError):

        Gregory Szorc
    
highlight: add option to prevent content-only based fallback...

              r26680
            
              # guess_lexer will return a lexer if *any* lexer matches. There is
              # no way to specify a minimum match score. This can give a high rate of
              # false positives on files with an unknown filename pattern.
              if guessfilenameonly:
                  return

        Patrick Mezard
    
highlight: split code to improve startup times

              r6938
            
              try:

        Alexander Plavin
    
highlight: fix page layout with empty first and last lines...

              r19169
            
                  lexer = guess_lexer(text[:1024], stripnl=False)

        Patrick Mezard
    
highlight: split code to improve startup times

              r6938
            
              except (ClassNotFound, ValueError):

        av6
    
highlight: exit early on textual and unknown files (issue3005)...

              r25899
            
                  # Don't highlight unknown files
                  return

          # Don't highlight text files
          if isinstance(lexer, TextLexer):
              return

        Patrick Mezard
    
highlight: split code to improve startup times

              r6938
            
        av6
    
highlight: produce correct markup when there's a blank line just before EOF...

              r25867
            
          formatter = HtmlFormatter(nowrap=True, style=style)

        Patrick Mezard
    
highlight: split code to improve startup times

              r6938
            
          colorized = highlight(text, lexer, formatter)

        Yuya Nishihara
    
highlight: fixes garbled text in non-UTF-8 environment...

              r9424
            
          coloriter = (s.encode(encoding.encoding, 'replace')
                       for s in colorized.splitlines())

        Patrick Mezard
    
highlight: split code to improve startup times

              r6938
            
        Augie Fackler
    
highlight: adjust to attribute being private...

              r38378
            
          tmpl._filters['colorize'] = lambda x: next(coloriter)

        Patrick Mezard
    
highlight: split code to improve startup times

              r6938
            
          oldl = tmpl.cache[field]
          newl = oldl.replace('line|escape', 'line|colorize')
          tmpl.cache[field] = newl
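
The closing lines of pygmentize() replace the template's per-line `escape` filter with a `colorize` filter that ignores its argument and returns the next line of the already-highlighted text. The sketch below illustrates that pattern on its own. It assumes the template engine applies the filter once per source line, in order. `fake_highlight` and `render` are hypothetical stand-ins for `pygments.highlight` and the template machinery; they are not Mercurial or Pygments APIs.

```python
def fake_highlight(text):
    """Hypothetical whole-text highlighter: marks every line that touches a
    triple-quoted string, which needs state carried across lines."""
    out = []
    in_string = False
    for line in text.splitlines():
        started_inside = in_string
        # An odd number of triple quotes on a line opens or closes a string.
        if line.count('"""') % 2:
            in_string = not in_string
        touched = started_inside or in_string
        out.append("<s>%s</s>" % line if touched else line)
    return "\n".join(out)


def render(lines, line_filter):
    """Hypothetical template step: apply the filter to each line in order."""
    return [line_filter(line) for line in lines]


source = 'x = """first\nsecond"""\ny = 1\n'

# Highlight the whole text once, then hand the result out line by line.
coloriter = iter(fake_highlight(source).splitlines())

# Like the extension's lambda, the filter discards its input and returns
# the next pre-highlighted line instead.
colorize = lambda line: next(coloriter)

rendered = render(source.splitlines(), colorize)
for line in rendered:
    print(line)
```

The pattern only stays aligned if the highlighted text splits into the same number of lines as the template iterates over, which is presumably why the function strips the extra Unicode line-break characters before decoding.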

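
The comment in pygmentize() about `str.splitlines()` differing from `unicode.splitlines()` can be reproduced in isolation. The sketch below uses Python 3, where `bytes` plays the role of the Python 2 `str` returned by `fctx.data()`. `strip_extra_breaks` is a hypothetical helper that mirrors the loop in the function; it is not part of the extension.

```python
# The four characters stripped in pygmentize() before decoding.
EXTRA_BREAKS = "\x0c\x1c\x1d\x1e"


def strip_extra_breaks(text):
    """Remove characters that only the Unicode splitlines() treats as breaks."""
    for c in EXTRA_BREAKS:
        if c in text:
            text = text.replace(c, "")
    return text


raw = b"one\x0ctwo\nthree\n"
decoded = raw.decode("latin-1")

# bytes.splitlines() splits only on \n, \r and \r\n.
byte_lines = raw.splitlines()

# str.splitlines() also splits on form feed and the separator controls.
text_lines = decoded.splitlines()

# After stripping, the Unicode line count matches the byte line count again.
cleaned_lines = strip_extra_breaks(decoded).splitlines()

print(len(byte_lines), len(text_lines), len(cleaned_lines))  # prints: 2 3 2
```

Note that this sketch strips after decoding, whereas the extension strips the byte string first; for these four ASCII control characters the result is the same.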