upstream/mercurial-mirror Files · mercurial/lsprof.py


snapshot: search for unrelated but reusable full-snapshot

# New Strategy Step: Reusing Snapshot Outside Of Parents' Chain

If no suitable bases were found in the parent's chains, see if we could reuse a full snapshot not directly related to the current revision.

Such a search can be expensive, so we only search for snapshots appended to the revlog *after* the bases used by the parents of the current revision (the ones we just tested). We assume the parent's bases were created because the previous snapshots were unsuitable, so there are low odds they would be useful now.

This search gives a chance to reuse a delta chain unrelated to the current revision. Without this reuse, topological branches would keep reopening new full chains, creating more and more snapshots as the repository grows.

In repositories with many topological branches, the lack of delta reuse can create too many snapshots, reducing overall compression to nothing. This results in a very large repository and other usability issues.

For now, we still focus on creating level-1 snapshots. However, this principle will play a large part in how we avoid snapshot explosion once we have more snapshot levels.

# Effects On The Test Repository

In the test repository we created, we can see the beneficial effect of such reuse. We need very few level-0 snapshots and the overall revlog size has decreased.

The `hg debugrevlog` call shows a "lvl-2" snapshot. It comes from the existing delta logic using the `prev` revision (revlog's tip) as the base. In this specific case, it turns out the tip was a level-1 snapshot. This is a coincidence that can be ignored.

Finding and testing against all these unrelated snapshots can have a performance impact at write time. We currently focus on building good delta chains. Performance concerns will be dealt with later in another series.
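The search window described in the commit message can be illustrated with a small sketch. This is not the revlog implementation: the function name `candidate_snapshots` and its arguments are hypothetical, revisions are modeled as plain integers in append order, and the only rule encoded is the one stated in the message (consider only full snapshots appended after the bases already tried for the parents).

```python
def candidate_snapshots(snapshot_revs, parent_bases, current_rev):
    """Return full-snapshot revisions worth testing as a delta base.

    snapshot_revs: revision numbers of existing full snapshots (hypothetical input)
    parent_bases:  bases of the parents' chains that were already tested
    current_rev:   revision number being added
    """
    # Snapshots at or before the parents' bases are assumed unsuitable,
    # so the search starts strictly after the newest of those bases.
    floor = max(parent_bases) if parent_bases else -1
    return [r for r in sorted(snapshot_revs) if floor < r < current_rev]


# Snapshots 0 and 4 are skipped (not newer than base 4), 15 is not yet reachable.
print(candidate_snapshots({0, 4, 9, 15}, [4, 2], 12))
```

Limiting the window this way trades completeness for write-time cost, which matches the performance concern raised at the end of the message.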

Augie Fackler

File last commit: r35854:d4e5b265 default

r39529:3ca144f1 default

lsprof.py | 121 lines | 4.0 KiB | text/x-python | PythonLexer

      from __future__ import absolute_import, print_function

      import _lsprof
      import sys

      Profiler = _lsprof.Profiler

      # PyPy doesn't expose profiler_entry from the module.
      profiler_entry = getattr(_lsprof, 'profiler_entry', None)

      __all__ = ['profile', 'Stats']

      def profile(f, *args, **kwds):
          """Call f(*args, **kwds) under the profiler and return a Stats."""
          p = Profiler()
          p.enable(subcalls=True, builtins=True)
          try:
              f(*args, **kwds)
          finally:
              # Always stop profiling, even if f raises.
              p.disable()
          return Stats(p.getstats())

      class Stats(object):
          """Container for the entries returned by Profiler.getstats()."""

          def __init__(self, data):
              self.data = data

          def sort(self, crit=r"inlinetime"):
              """Sort entries and their sub-calls by field crit, descending.

              Raises ValueError if crit is not a known entry field."""
              # profiler_entry is None when running under PyPy.
              if profiler_entry:
                  if crit not in profiler_entry.__dict__:
                      raise ValueError("Can't sort by %s" % crit)
              elif self.data and not getattr(self.data[0], crit, None):
                  raise ValueError("Can't sort by %s" % crit)
              self.data.sort(key=lambda x: getattr(x, crit), reverse=True)
              for e in self.data:
                  if e.calls:
                      e.calls.sort(key=lambda x: getattr(x, crit), reverse=True)

          def pprint(self, top=None, file=None, limit=None, climit=None):
              """Write the entries as a table to file (default: sys.stdout).

              top keeps only the first top entries, limit caps the total
              number of rows written, and climit caps the sub-call rows
              written per entry (no sub-calls are shown when it is unset)."""
              if file is None:
                  file = sys.stdout
              d = self.data
              if top is not None:
                  d = d[:top]
              cols = "% 12s %12s %11.4f %11.4f   %s\n"
              hcols = "% 12s %12s %12s %12s %s\n"
              file.write(hcols % ("CallCount", "Recursive", "Total(s)",
                                  "Inline(s)", "module:lineno(function)"))
              count = 0
              for e in d:
                  file.write(cols % (e.callcount, e.reccallcount, e.totaltime,
                                     e.inlinetime, label(e.code)))
                  count += 1
                  if limit is not None and count == limit:
                      return
                  ccount = 0
                  if climit and e.calls:
                      for se in e.calls:
                          file.write(cols % (se.callcount, se.reccallcount,
                                             se.totaltime, se.inlinetime,
                                             "    %s" % label(se.code)))
                          count += 1
                          ccount += 1
                          if limit is not None and count == limit:
                              return
                          if climit is not None and ccount == climit:
                              break

          def freeze(self):
              """Replace all references to code objects with string
              descriptions; this makes it possible to pickle the instance."""
              # this code is probably rather ickier than it needs to be!
              for i in range(len(self.data)):
                  e = self.data[i]
                  if not isinstance(e.code, str):
                      self.data[i] = type(e)((label(e.code),) + e[1:])
                  if e.calls:
                      for j in range(len(e.calls)):
                          se = e.calls[j]
                          if not isinstance(se.code, str):
                              e.calls[j] = type(se)((label(se.code),) + se[1:])

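      # Cache mapping a code object's co_filename to a module name.
      _fn2mod = {}

      def label(code):
          """Return 'module:lineno(function)' for a code object.

          Strings (already-frozen labels, builtins) are returned as is."""
          if isinstance(code, str):
              return code
          try:
              mname = _fn2mod[code.co_filename]
          except KeyError:
              # items() behaves the same on Python 2 and 3; the list() copy
              # guards against sys.modules changing during iteration.
              for k, v in list(sys.modules.items()):
                  if v is None:
                      continue
                  if not isinstance(getattr(v, '__file__', None), str):
                      continue
                  if v.__file__.startswith(code.co_filename):
                      mname = _fn2mod[code.co_filename] = k
                      break
              else:
                  mname = _fn2mod[code.co_filename] = '<%s>' % code.co_filename
          return '%s:%d(%s)' % (mname, code.co_firstlineno, code.co_name)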

      if __name__ == '__main__':
          import os
          sys.argv = sys.argv[1:]
          if not sys.argv:
              print("usage: lsprof.py <script> <arguments...>", file=sys.stderr)
              sys.exit(2)
          sys.path.insert(0, os.path.abspath(os.path.dirname(sys.argv[0])))
          # execfile is a Python 2 builtin, so this entry point is Python 2 only.
          stats = profile(execfile, sys.argv[0], globals(), locals())
          stats.sort()
          stats.pprint()
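The flow in `profile()` and `Stats.sort()` can be tried without Mercurial, since `_lsprof` is the C profiler backend that ships with CPython (it also backs `cProfile`). The following is a minimal sketch, assuming Python 3 on CPython; `busy` and `run_profiled` are hypothetical names introduced only for this illustration.

```python
import _lsprof  # CPython's C profiler backend, as imported by lsprof.py


def busy(n):
    """Hypothetical workload: sum of squares below n."""
    return sum(i * i for i in range(n))


def run_profiled(f, *args, **kwds):
    """Same steps as profile(): enable, call, always disable, collect."""
    p = _lsprof.Profiler()
    p.enable(subcalls=True, builtins=True)
    try:
        f(*args, **kwds)
    finally:
        p.disable()
    return p.getstats()


entries = run_profiled(busy, 10000)

# Same ordering as Stats.sort() with the default crit="inlinetime".
entries.sort(key=lambda e: e.inlinetime, reverse=True)

# e.code is a code object for Python functions and a str for builtins,
# which is the distinction label() and freeze() rely on.
names = [e.code if isinstance(e.code, str) else e.code.co_name
         for e in entries]

for e, name in zip(entries, names):
    print("%8d %11.4f %11.4f  %s" % (e.callcount, e.totaltime,
                                     e.inlinetime, name))
```

The timings printed vary from run to run; only the set of fields (`callcount`, `reccallcount`, `totaltime`, `inlinetime`, `code`, `calls`) is stable.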

