upstream/mercurial-mirror Files · mercurial/lsprof.py

revlog: change generaldelta delta parent heuristic

The old generaldelta heuristic was "if p1 (or p2) was closer than the last full text, use it, otherwise use prev". This was problematic when a repo contained multiple branches that were very different. If commits to branch A were pushed while the last full text belonged to branch B, it would generate a full text. Then if branch B was pushed, it would generate another full text. The problem is that the last full text (and delta'ing against `prev` in general) has no correlation with the contents of the incoming revision, and therefore will always have degenerate cases.

According to the blame, that algorithm was chosen to minimize the chain length. Since there is already code that protects against that (the delta-vs-fulltext code), and since it has been improved since the original generaldelta algorithm went in (2011), I believe the chain length criterion will still be preserved.

The new algorithm always diffs against p1 (or p2 if it's closer), unless the resulting delta would fail the delta-vs-fulltext check, in which case we delta against prev.

Some before and after stats on manifest.d size:

internal large repo
old heuristic - 2.0 GB
new heuristic - 1.2 GB

mozilla-central
old heuristic - 242 MB
new heuristic - 261 MB

The regression in mozilla-central is due to the new heuristic choosing p2r as the delta base when it's closer to the tip. Switching the algorithm to always prefer p1r brings the size back down (242 MB). This is a result of the way in which mozilla does merges and pushes, and the result could easily swing the other direction in other repos (depending on whether they merge X into Y or Y into X), but will never be as degenerate as before.

A future patch will address the regression by introducing an optional, even more aggressive delta heuristic which will knock the mozilla manifest size down dramatically.
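The new heuristic described in the commit message can be sketched roughly as follows. This is an illustration only, not the revlog code: `choose_delta_base`, `NULLREV`, and the `passes_check` callback (standing in for the delta-vs-fulltext check) are hypothetical names, "closer" is read as "higher revision number, closer to the tip", and returning `None` to mean "store a full text" when `prev` also fails the check is an assumption.

```python
NULLREV = -1  # assumed sentinel meaning "no such revision"


def choose_delta_base(p1, p2, prev, passes_check):
    """Return the revision to delta against, or None to store a full text.

    p1, p2 and prev are revision numbers; passes_check(rev) is a
    hypothetical stand-in for the delta-vs-fulltext check.
    """
    # Prefer a parent: p1 by default, p2 when it is closer to the tip.
    parent = p2 if p2 != NULLREV and p2 > p1 else p1
    if parent != NULLREV and passes_check(parent):
        return parent
    # The parent delta failed the check: fall back to the previous revision.
    if prev != NULLREV and passes_check(prev):
        return prev
    # Assumption: if nothing passes, a full text is stored instead.
    return None


# p2 (rev 8) is closer to the tip than p1 (rev 5), so it is chosen.
print(choose_delta_base(5, 8, 9, lambda rev: True))
```

Because the base is chosen from the revision's own parents first, the choice is tied to the content of the incoming revision rather than to whichever branch happened to be written last.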

Mads Kiilerich

File last commit:

r18642:a40d608e default


                r26117:4dc5b51f

default

lsprof.py | 109 lines | 3.6 KiB | text/x-python | PythonLexer
            
        Vadim Gelfer
    
add --lsprof option. 3x faster than --profile, more useful output....

              r2422
            
      import sys

        Joel Rosdahl
    
Remove unused imports

              r6212
            
      from _lsprof import Profiler, profiler_entry

        Vadim Gelfer
    
add --lsprof option. 3x faster than --profile, more useful output....

              r2422
            
      __all__ = ['profile', 'Stats']

      def profile(f, *args, **kwds):

          """XXX docstring"""

          p = Profiler()

        Dirkjan Ochtman
    
updating lsprof.py from remote repository

              r5992
            
          p.enable(subcalls=True, builtins=True)

        Vadim Gelfer
    
add --lsprof option. 3x faster than --profile, more useful output....

              r2422
            
          try:

        Dirkjan Ochtman
    
updating lsprof.py from remote repository

              r5992
            
              f(*args, **kwds)

        Vadim Gelfer
    
add --lsprof option. 3x faster than --profile, more useful output....

              r2422
            
          finally:

              p.disable()

        Dirkjan Ochtman
    
updating lsprof.py from remote repository

              r5992
            
          return Stats(p.getstats())

        Vadim Gelfer
    
add --lsprof option. 3x faster than --profile, more useful output....

              r2422
            
      class Stats(object):

          """XXX docstring"""

          def __init__(self, data):

              self.data = data

          def sort(self, crit="inlinetime"):

              """XXX docstring"""

              if crit not in profiler_entry.__dict__:

        Peter Ruibal
    
use Exception(args)-style raising consistently (py3k compatibility)

              r7008
            
                  raise ValueError("Can't sort by %s" % crit)

        Alejandro Santos
    
compat: use 'key' argument instead of 'cmp' when sorting a list

              r9032
            
              self.data.sort(key=lambda x: getattr(x, crit), reverse=True)

        Vadim Gelfer
    
add --lsprof option. 3x faster than --profile, more useful output....

              r2422
            
              for e in self.data:

                  if e.calls:

        Alejandro Santos
    
compat: use 'key' argument instead of 'cmp' when sorting a list

              r9032
            
                      e.calls.sort(key=lambda x: getattr(x, crit), reverse=True)

        Vadim Gelfer
    
add --lsprof option. 3x faster than --profile, more useful output....

              r2422
            
          def pprint(self, top=None, file=None, limit=None, climit=None):

              """XXX docstring"""

              if file is None:

                  file = sys.stdout

              d = self.data

              if top is not None:

                  d = d[:top]

        Dirkjan Ochtman
    
updating lsprof.py from remote repository

              r5992
            
              cols = "% 12s %12s %11.4f %11.4f   %s\n"

              hcols = "% 12s %12s %12s %12s %s\n"

        Bryan O'Sullivan
    
lsprof: report units correctly

              r16804
            
              file.write(hcols % ("CallCount", "Recursive", "Total(s)",

                                  "Inline(s)", "module:lineno(function)"))

        Vadim Gelfer
    
add --lsprof option. 3x faster than --profile, more useful output....

              r2422
            
              count = 0

              for e in d:

        Dirkjan Ochtman
    
updating lsprof.py from remote repository

              r5992
            
                  file.write(cols % (e.callcount, e.reccallcount, e.totaltime,

        Vadim Gelfer
    
add --lsprof option. 3x faster than --profile, more useful output....

              r2422
            
                                     e.inlinetime, label(e.code)))

                  count += 1

                  if limit is not None and count == limit:

                      return

                  ccount = 0

        Matt Mackall
    
profile: add undocumented config options for profiler output

              r16263
            
                  if climit and e.calls:

        Vadim Gelfer
    
add --lsprof option. 3x faster than --profile, more useful output....

              r2422
            
                      for se in e.calls:

        Mads Kiilerich
    
profiling: replace '+' markup of nested lines with indentation...

              r18642
            
                          file.write(cols % (se.callcount, se.reccallcount,

        Vadim Gelfer
    
add --lsprof option. 3x faster than --profile, more useful output....

              r2422
            
                                             se.totaltime, se.inlinetime,

        Mads Kiilerich
    
profiling: replace '+' markup of nested lines with indentation...

              r18642
            
                                             "    %s" % label(se.code)))

        Vadim Gelfer
    
add --lsprof option. 3x faster than --profile, more useful output....

              r2422
            
                          count += 1

                          ccount += 1

                          if limit is not None and count == limit:

                              return

                          if climit is not None and ccount == climit:

                              break

          def freeze(self):

              """Replace all references to code objects with string

              descriptions; this makes it possible to pickle the instance."""

              # this code is probably rather ickier than it needs to be!

              for i in range(len(self.data)):

                  e = self.data[i]

                  if not isinstance(e.code, str):

                      self.data[i] = type(e)((label(e.code),) + e[1:])

        Dirkjan Ochtman
    
updating lsprof.py from remote repository

              r5992
            
                  if e.calls:

                      for j in range(len(e.calls)):

                          se = e.calls[j]

                          if not isinstance(se.code, str):

                              e.calls[j] = type(se)((label(se.code),) + se[1:])

        Vadim Gelfer
    
add --lsprof option. 3x faster than --profile, more useful output....

              r2422
            
      _fn2mod = {}

      def label(code):

          if isinstance(code, str):

              return code

          try:

              mname = _fn2mod[code.co_filename]

          except KeyError:

        Dirkjan Ochtman
    
lsprof: make profile not die when imported modules changes (issue1774)

              r9314
            
              for k, v in list(sys.modules.iteritems()):

        Vadim Gelfer
    
add --lsprof option. 3x faster than --profile, more useful output....

              r2422
            
                  if v is None:

                      continue

        Augie Fackler
    
lsprof: use getattr instead of hasattr

              r14959
            
                  if not isinstance(getattr(v, '__file__', None), str):

        Vadim Gelfer
    
add --lsprof option. 3x faster than --profile, more useful output....

              r2422
            
                      continue

                  if v.__file__.startswith(code.co_filename):

                      mname = _fn2mod[code.co_filename] = k

                      break

              else:

        Benoit Boissinot
    
fix spaces/identation issues

              r10339
            
                  mname = _fn2mod[code.co_filename] = '<%s>' % code.co_filename

        Vadim Gelfer
    
add --lsprof option. 3x faster than --profile, more useful output....

              r2422
            
          return '%s:%d(%s)' % (mname, code.co_firstlineno, code.co_name)

      if __name__ == '__main__':

          import os

          sys.argv = sys.argv[1:]

          if not sys.argv:

              print >> sys.stderr, "usage: lsprof.py <script> <arguments...>"

              sys.exit(2)

          sys.path.insert(0, os.path.abspath(os.path.dirname(sys.argv[0])))

          stats = profile(execfile, sys.argv[0], globals(), locals())

          stats.sort()

          stats.pprint()
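The listing above is Python 2 code (`print >>`, `iteritems`, `execfile`). As a rough usage sketch under Python 3, the same enable/run/disable/getstats pattern can be tried with the standard library's `cProfile.Profile`, which is built on the same `_lsprof` extension. The `busy` function is a hypothetical workload, and this `profile` returns a plain sorted list rather than the `Stats` class from the file.

```python
import cProfile


def busy(n):
    """Hypothetical workload to profile."""
    return sum(i * i for i in range(n))


def profile(f, *args, **kwds):
    """Run f under the profiler; return raw entries sorted by inline time."""
    p = cProfile.Profile()
    p.enable(subcalls=True, builtins=True)
    try:
        f(*args, **kwds)
    finally:
        p.disable()
    return sorted(p.getstats(), key=lambda e: e.inlinetime, reverse=True)


entries = profile(busy, 10000)
for e in entries[:3]:
    # e.code is a code object for Python functions and a string for builtins.
    name = e.code if isinstance(e.code, str) else e.code.co_name
    print("%12s %12s %11.4f %11.4f   %s"
          % (e.callcount, e.reccallcount, e.totaltime, e.inlinetime, name))
```

Each entry carries the same fields that `pprint` formats in the file: `callcount`, `reccallcount`, `totaltime`, `inlinetime`, and `code`.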
