upstream/mercurial-mirror Files · mercurial/py3kcompat.py

parsers: inline fields of dirstate values in C version

Previously, while unpacking the dirstate we'd create 3-4 new CPython objects for most dirstate values:

- the state is a single character string, which is pooled by CPython
- the mode is a new object if it isn't 0 due to being in the lookup set
- the size is a new object if it is greater than 255
- the mtime is a new object if it isn't -1 due to being in the lookup set
- the tuple to contain them all

In some cases such as regular hg status, we actually look at all the objects. In other cases like hg add, hg status for a subdirectory, or hg status with the third-party hgwatchman enabled, we look at almost none of the objects.

This patch eliminates most object creation in these cases by defining a custom C struct that is exposed to Python with an interface similar to a tuple. Only when tuple elements are actually requested are the respective objects created.

The gains, where they're expected, are significant. The following tests are run against a working copy with over 270,000 files.

parse_dirstate becomes significantly faster:

  $ hg perfdirstate
  before: wall 0.186437 comb 0.180000 user 0.160000 sys 0.020000 (best of 35)
  after:  wall 0.093158 comb 0.100000 user 0.090000 sys 0.010000 (best of 95)

and as a result, several commands benefit:

  $ time hg status # with hgwatchman enabled
  before: 0.42s user 0.14s system 99% cpu 0.563 total
  after:  0.34s user 0.12s system 99% cpu 0.471 total

  $ time hg add new-file
  before: 0.85s user 0.18s system 99% cpu 1.033 total
  after:  0.76s user 0.17s system 99% cpu 0.931 total

There is a slight regression in regular status performance, but this is fixed in an upcoming patch.
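The lazy, tuple-like access described in the commit message can be illustrated in pure Python. This is only a sketch of the idea, not the C implementation from the patch: the class name `LazyEntry`, the helper names, and the 13-byte record layout (a 1-byte state followed by three big-endian signed 32-bit integers) are assumptions made for the example.

```python
import struct

# Assumed layout, for illustration only: state (1 byte), then mode, size and
# mtime as big-endian signed 32-bit integers. ">" disables alignment padding.
_RECORD = struct.Struct(">clll")

# (format, byte offset) for each of the four tuple positions.
_LAYOUT = ((">c", 0), (">l", 1), (">l", 5), (">l", 9))


class LazyEntry:
    """Hypothetical tuple-like view that decodes a field only when asked."""

    __slots__ = ("_raw",)

    def __init__(self, raw):
        if len(raw) != _RECORD.size:
            raise ValueError("expected %d bytes" % _RECORD.size)
        self._raw = raw  # keep the packed bytes; create no field objects yet

    def __len__(self):
        return len(_LAYOUT)

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("index must be an integer")
        if index < 0:
            index += len(_LAYOUT)
        if not 0 <= index < len(_LAYOUT):
            raise IndexError("entry index out of range")
        fmt, offset = _LAYOUT[index]
        # The Python object for this field is created here, on demand.
        return struct.unpack_from(fmt, self._raw, offset)[0]


raw = _RECORD.pack(b"n", 0o100644, 1234, -1)
entry = LazyEntry(raw)
size = entry[2]            # only the size field is decoded
as_tuple = tuple(entry)    # iteration decodes every field
```

Callers that never index into the entry pay only for storing the packed bytes, which is the effect the commit message describes for hg add and narrow hg status runs.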

Matt Mackall

File last commit: r21292:a7a9d84f default

r21809:e250b830 default

py3kcompat.py | 65 lines | 2.1 KiB | text/x-python

      # py3kcompat.py - compatibility definitions for running hg in py3k
      #
      # Copyright 2010 Renato Cunha <renatoc@gmail.com>
      #
      # This software may be used and distributed according to the terms of the
      # GNU General Public License version 2 or any later version.

      import builtins

      from numbers import Number

      def bytesformatter(format, args):
          '''Custom implementation of a formatter for bytestrings.

          This function currently relies on the string formatter to do the
          formatting and always returns bytes objects.

          >>> bytesformatter(20, 10)
          0
          >>> bytesformatter('unicode %s, %s!', ('string', 'foo'))
          b'unicode string, foo!'
          >>> bytesformatter(b'test %s', 'me')
          b'test me'
          >>> bytesformatter('test %s', 'me')
          b'test me'
          >>> bytesformatter(b'test %s', b'me')
          b'test me'
          >>> bytesformatter('test %s', b'me')
          b'test me'
          >>> bytesformatter('test %d: %s', (1, b'result'))
          b'test 1: result'
          '''
          # The current implementation just converts from bytes to unicode, does
          # what's needed and then converts the results back to bytes.
          # Another alternative is to use the Python C API implementation.
          if isinstance(format, Number):
              # If the fixer erroneously passes a number remainder operation to
              # bytesformatter, we just return the correct operation
              return format % args
          if isinstance(format, bytes):
              format = format.decode('utf-8', 'surrogateescape')
          if isinstance(args, bytes):
              args = args.decode('utf-8', 'surrogateescape')
          if isinstance(args, tuple):
              newargs = []
              for arg in args:
                  if isinstance(arg, bytes):
                      arg = arg.decode('utf-8', 'surrogateescape')
                  newargs.append(arg)
              args = tuple(newargs)
          ret = format % args
          return ret.encode('utf-8', 'surrogateescape')
      builtins.bytesformatter = bytesformatter

      origord = builtins.ord
      def fakeord(char):
          if isinstance(char, int):
              return char
          return origord(char)
      builtins.ord = fakeord

      if __name__ == '__main__':
          import doctest
          doctest.testmod()
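
The two techniques in this file can be tried in isolation, without patching `builtins`. The sketch below is a standalone illustration rather than part of py3kcompat.py; `format_bytes` and `int_safe_ord` are hypothetical names. It shows that the `surrogateescape` round trip preserves bytes that are not valid UTF-8, and that an int-tolerant `ord` accepts the integers produced by indexing a bytes object in Python 3. It also assumes Python 3.5 or later for the native `bytes % args` comparison.

```python
def format_bytes(fmt, arg):
    """Format bytes via str, using surrogateescape to keep undecodable bytes."""
    text_fmt = fmt.decode("utf-8", "surrogateescape")
    text_arg = arg.decode("utf-8", "surrogateescape")
    # Undecodable bytes become lone surrogates and are restored on encode.
    return (text_fmt % text_arg).encode("utf-8", "surrogateescape")


def int_safe_ord(char):
    """Return ints unchanged; otherwise defer to the built-in ord()."""
    if isinstance(char, int):
        return char
    return ord(char)


raw = b"caf\xe9"  # Latin-1 encoded text, not valid UTF-8
result = format_bytes(b"name: %s", raw)

# Python 3.5+ supports %-formatting on bytes directly, for comparison.
native = b"name: %s" % raw

# Indexing bytes yields an int in Python 3, which plain ord() would reject.
first = int_safe_ord(b"abc"[0])
```

Decoding with the default `strict` handler instead would raise `UnicodeDecodeError` on `raw`, which is why the file passes `surrogateescape` on both the decode and the encode side.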

