compression: introduce a `storage.revlog.zlib.level` configuration

This option controls the zlib compression level used when compressing revlog
chunks.

This is also a good excuse to pave the way for a similar configuration option
for the zstd compression engine. Having a dedicated option for each compression
algorithm is useful because they don't support the same range of values.

Using a higher zlib compression level increases CPU consumption at compression
time, but does not directly affect decompression time. However, dealing with
smaller compressed chunks can directly help decompression and indirectly help
other revlog logic.

I ran some basic tests on repositories using different levels. I am using the
mercurial, pypy, netbeans and mozilla-central clones from our benchmark suite.
All tested repositories use sparse-revlog and had all their deltas recomputed.

The compression level has a small effect on the repository size (about 10%
variation over the total range). My quick analysis is that revlogs mostly store
small deltas, which are not much affected by the compression level. So the
variation probably comes mostly from better compression of the snapshot
revisions, and snapshot revisions only represent a small portion of the
repository content.

I also made some basic timing measurements. The "read" timings are gathered
using a simple run of `hg perfrevlogrevisions`, the "write" timings using
`hg perfrevlogwrite` (restricted to the last 5000 revisions for netbeans and
mozilla-central). The timings were gathered on a generic machine (not one of
our performance-locked machines), so small variations might not be meaningful.
However, large trends remain relevant.

Keep in mind that these numbers are not pure compression/decompression time.
They also involve the full revlog logic. In particular, the difference in chunk
size has an impact on the delta chain structure, affecting performance when
writing or reading them.

On read/write performance, the compression level has a bigger impact.
Counter-intuitively, the higher compression levels improve "write" performance
for the large repositories in our tested setting. Maybe this is because the
last 5000 delta chains end up having a very different shape in this specific
spot? Or maybe it is because of a more general trend of better delta chains
thanks to the smaller chunks and snapshots.

This series does not intend to change the default compression level. However,
these results call for a deeper analysis of this performance difference in the
future.

Full data
=========

repo       level  .hg/store size  00manifest.d        read       write
----------------------------------------------------------------------
mercurial      1      49,402,813     5,963,475    0.170159   53.250304
mercurial      6      47,197,397     5,875,730    0.182820   56.264320
mercurial      9      47,121,596     5,849,781    0.189219   56.293612
pypy           1     370,830,572    28,462,425    2.679217  460.721984
pypy           6     340,112,317    27,648,747    2.768691  467.537158
pypy           9     338,360,736    27,639,003    2.763495  476.589918
netbeans       1   1,281,847,810   165,495,457  122.477027  520.560316
netbeans       6   1,205,284,353   159,161,207  139.876147  715.930400
netbeans       9   1,197,135,671   155,034,586  141.620281  678.297064
mozilla        1   2,775,497,186   298,527,987  147.867662  751.263721
mozilla        6   2,596,856,420   286,597,671  170.572118  987.056093
mozilla        9   2,587,542,494   287,018,264  163.622338  739.803002
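As a rough illustration of the size/CPU trade-off discussed in this message, the sketch below compresses one payload at levels 1, 6 and 9 using Python's standard `zlib` module. The payload and the helper name `compare_levels` are invented for the example. This is not how the numbers in the table were produced (those come from the `hg perf*` commands and include the full revlog logic), so absolute figures will differ.

```python
import time
import zlib


def compare_levels(data, levels=(1, 6, 9)):
    """Return {level: (compressed_size, seconds)} for each zlib level."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        # Decompression takes no level argument: every level round-trips.
        assert zlib.decompress(compressed) == data
        results[level] = (len(compressed), elapsed)
    return results


if __name__ == "__main__":
    # Hypothetical payload standing in for a revlog snapshot.
    payload = b"".join(
        b"line %d of some tracked file\n" % i for i in range(20000)
    )
    for level, (size, seconds) in sorted(compare_levels(payload).items()):
        print("level %d: %d bytes in %.6f s" % (level, size, seconds))
```

On small inputs such as individual deltas, the size difference between levels tends to be minor, which is consistent with the observation above that most of the variation comes from snapshots.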

blackbox.py
# blackbox.py - log repository events to a file for post-mortem debugging
#
# Copyright 2010 Nicolas Dumazet
# Copyright 2013 Facebook, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

"""log repository events to a blackbox for debugging

Logs event information to .hg/blackbox.log to help debug and diagnose problems.
The events that get logged can be configured via the blackbox.track config key.

Examples::

  [blackbox]
  track = *
  # dirty is *EXPENSIVE* (slow);
  # each log entry indicates `+` if the repository is dirty, like :hg:`id`.
  dirty = True
  # record the source of log messages
  logsource = True

  [blackbox]
  track = command, commandfinish, commandexception, exthook, pythonhook

  [blackbox]
  track = incoming

  [blackbox]
  # limit the size of a log file
  maxsize = 1.5 MB
  # rotate up to N log files when the current one gets too big
  maxfiles = 3

  [blackbox]
  # Include microseconds in log entries with %f (see Python function
  # datetime.datetime.strftime)
  date-format = '%Y-%m-%d @ %H:%M:%S.%f'

"""

from __future__ import absolute_import

import re

from mercurial.i18n import _
from mercurial.node import hex

from mercurial import (
    encoding,
    loggingutil,
    registrar,
)
from mercurial.utils import (
    dateutil,
    procutil,
)

# Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
# extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
# be specifying the version(s) of Mercurial they are tested with, or
# leave the attribute unspecified.
testedwith = 'ships-with-hg-core'

cmdtable = {}
command = registrar.command(cmdtable)

configtable = {}
configitem = registrar.configitem(configtable)

configitem('blackbox', 'dirty',
    default=False,
)
configitem('blackbox', 'maxsize',
    default='1 MB',
)
configitem('blackbox', 'logsource',
    default=False,
)
configitem('blackbox', 'maxfiles',
    default=7,
)
configitem('blackbox', 'track',
    default=lambda: ['*'],
)
configitem('blackbox', 'date-format',
    default='%Y/%m/%d %H:%M:%S',
)

_lastlogger = loggingutil.proxylogger()

class blackboxlogger(object):
    def __init__(self, ui, repo):
        self._repo = repo
        self._trackedevents = set(ui.configlist('blackbox', 'track'))
        self._maxfiles = ui.configint('blackbox', 'maxfiles')
        self._maxsize = ui.configbytes('blackbox', 'maxsize')
        self._inlog = False

    def tracked(self, event):
        return b'*' in self._trackedevents or event in self._trackedevents

    def log(self, ui, event, msg, opts):
        # self._log() -> ctx.dirty() may create new subrepo instance, which
        # ui is derived from baseui. So the recursion guard in ui.log()
        # doesn't work as it's local to the ui instance.
        if self._inlog:
            return
        self._inlog = True
        try:
            self._log(ui, event, msg, opts)
        finally:
            self._inlog = False

    def _log(self, ui, event, msg, opts):
        default = ui.configdate('devel', 'default-date')
        date = dateutil.datestr(default, ui.config('blackbox', 'date-format'))
        user = procutil.getuser()
        pid = '%d' % procutil.getpid()
        changed = ''
        ctx = self._repo[None]
        parents = ctx.parents()
        rev = ('+'.join([hex(p.node()) for p in parents]))
        if (ui.configbool('blackbox', 'dirty') and
            ctx.dirty(missing=True, merge=False, branch=False)):
            changed = '+'
        if ui.configbool('blackbox', 'logsource'):
            src = ' [%s]' % event
        else:
            src = ''
        try:
            fmt = '%s %s @%s%s (%s)%s> %s'
            args = (date, user, rev, changed, pid, src, msg)
            with loggingutil.openlogfile(
                    ui, self._repo.vfs, name='blackbox.log',
                    maxfiles=self._maxfiles, maxsize=self._maxsize) as fp:
                fp.write(fmt % args)
        except (IOError, OSError) as err:
            # deactivate this to avoid failed logging again
            self._trackedevents.clear()
            ui.debug('warning: cannot write to blackbox.log: %s\n' %
                     encoding.strtolocal(err.strerror))
            return
        _lastlogger.logger = self

def uipopulate(ui):
    ui.setlogger(b'blackbox', _lastlogger)

def reposetup(ui, repo):
    # During 'hg pull' a httppeer repo is created to represent the remote repo.
    # It doesn't have a .hg directory to put a blackbox in, so we don't do
    # the blackbox setup for it.
    if not repo.local():
        return

    # Since blackbox.log is stored in the repo directory, the logger should be
    # instantiated per repository.
    logger = blackboxlogger(ui, repo)
    ui.setlogger(b'blackbox', logger)

    # Set _lastlogger even if ui.log is not called. This gives blackbox a
    # fallback place to log
    if _lastlogger.logger is None:
        _lastlogger.logger = logger

    repo._wlockfreeprefix.add('blackbox.log')

@command('blackbox',
    [('l', 'limit', 10, _('the number of events to show')),
    ],
    _('hg blackbox [OPTION]...'),
    helpcategory=command.CATEGORY_MAINTENANCE,
    helpbasic=True)
def blackbox(ui, repo, *revs, **opts):
    '''view the recent repository events
    '''

    if not repo.vfs.exists('blackbox.log'):
        return

    limit = opts.get(r'limit')
    fp = repo.vfs('blackbox.log', 'r')
    lines = fp.read().split('\n')

    count = 0
    output = []
    for line in reversed(lines):
        if count >= limit:
            break

        # count the commands by matching lines like: 2013/01/23 19:13:36 root>
        if re.match(br'^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} .*> .*', line):
            count += 1
        output.append(line)

    ui.status('\n'.join(reversed(output)))
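
To make the entry format concrete, here is a small standalone sketch that builds one line with the same format string as `_log()` and selects recent entries the way the `blackbox` command does. The helper names `format_entry` and `last_entries` and all sample values are invented for illustration, and plain Python 3 `str` is used instead of Mercurial's byte strings, so this approximates the logic rather than reproducing the extension's code.

```python
import re

# Same layout as in _log(): date, user, revision(s), dirty marker,
# pid, optional source, message.
FMT = '%s %s @%s%s (%s)%s> %s'

# Same pattern as in the blackbox command, as a str pattern here.
ENTRY_RE = re.compile(r'^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} .*> .*')


def format_entry(date, user, rev, changed, pid, src, msg):
    """Build one log line from already-formatted fields."""
    return FMT % (date, user, rev, changed, pid, src, msg)


def last_entries(lines, limit):
    """Return the lines of the last `limit` entries, oldest first.

    Lines that do not start a new entry (continuation lines) are kept
    but not counted, mirroring the loop in the blackbox command.
    """
    count = 0
    output = []
    for line in reversed(lines):
        if count >= limit:
            break
        if ENTRY_RE.match(line):
            count += 1
        output.append(line)
    return list(reversed(output))


if __name__ == "__main__":
    entry = format_entry('2013/01/23 19:13:36', 'root', 'abc123', '+',
                         '4242', '', 'pull')
    print(entry)
    print(last_entries([entry, 'continuation line', entry], 1))
```

One consequence visible in this sketch: the pattern only recognizes timestamps in the default `%Y/%m/%d %H:%M:%S` layout, so entries written with a custom `date-format` would not be counted as entry starts.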