repair: migrate revlogs during upgrade

Our next step for in-place upgrade is to migrate store data. Revlogs are the biggest source of data within the store and a store is useless without them, so we implement their migration first.

Our strategy for migrating revlogs is to walk the store and call `revlog.clone()` on each revlog. There are some minor complications.

Because revlogs have different storage options (e.g. changelog has generaldelta and delta chains disabled), we need to obtain the correct class of revlog so inserted data is encoded properly for its type.

Various attempts were made at implementing progress indicators that didn't lead to frustration from false "it's almost done" indicators.

I initially used a single progress bar based on number of revlogs. However, this quickly churned through all filelogs, got to 99% then effectively froze at 99.99% when it got to the manifest.

So I converted the progress bar to total revision count. This was a little bit better. But the manifest was still significantly slower than filelogs and it took forever to process the last few percent.

I then tried both revision/chunk bytes and raw bytes as the denominator. This had the opposite effect: because so much data is in manifests, it would churn through filelogs without showing much progress. When it got to manifests, it would fill in 90+% of the progress bar.

I finally gave up having a unified progress bar and instead implemented 3 progress bars: 1 for filelog revisions, 1 for manifest revisions, and 1 for changelog revisions. I added extra messages indicating the total number of revisions of each so users know there are more progress bars coming.

I also added extra messages before and after each stage to give extra details about what is happening. Strictly speaking, this isn't necessary. But the numbers are impressive. For example, when converting a non-generaldelta mozilla-central repository, the messages you see are:

   migrating 2475593 total revisions (1833043 in filelogs, 321156 in manifests, 321394 in changelog)
   migrating 1.67 GB in store; 2508 GB tracked data
   migrating 267868 filelogs containing 1833043 revisions (1.09 GB in store; 57.3 GB tracked data)
   finished migrating 1833043 filelog revisions across 267868 filelogs; change in size: -415776 bytes
   migrating 1 manifests containing 321156 revisions (518 MB in store; 2451 GB tracked data)

That "2508 GB" figure really blew me away. I had no clue that the raw tracked data in mozilla-central was that large. Granted, 2451 GB is in the manifest and "only" 57.3 GB is in filelogs. But still.

It's worth noting that gratuitous loading of source revlogs in order to display numbers and progress bars does serve a purpose: it ensures we can open all source revlogs. We don't want to spend several minutes copying revlogs only to encounter a permissions error or similar later.

As part of this commit, we also add swapping of the store directory to the upgrade function. After revlogs are converted, we move the old store into the backup directory then move the temporary repo's store into the old store's location. On well-behaved systems, this should be 2 atomic operations and the window of inconsistency should be very narrow.

There are still a few improvements to be made to store copying and upgrading. But this commit gets the bulk of the work out of the way.
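
The store swap described in the commit message comes down to two directory renames. The following is a minimal sketch of that idea, not Mercurial's upgrade code: the helper name `swap_store`, its parameters, and the `store` name inside the backup directory are assumptions made for illustration.

```python
import os


def swap_store(repo_store, new_store, backup_dir):
    """Replace repo_store with new_store, keeping the old copy in backup_dir.

    Hypothetical helper for illustration only; not part of Mercurial.
    """
    backup_store = os.path.join(backup_dir, "store")
    # Step 1: move the current store out of the way. Between this rename
    # and the next one the repository has no store directory at all,
    # which is the "window of inconsistency" the message refers to.
    os.rename(repo_store, backup_store)
    # Step 2: move the freshly migrated store into the old location.
    os.rename(new_store, repo_store)
    return backup_store
```

Each `os.rename` call is a single operation when source and destination live on the same filesystem, which is why keeping the temporary store and the backup directory next to the real store keeps the window small.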

File last commit:

r30669:10b17ed9 default
r30779:38aa1ca9 default
i18n.py
109 lines | 3.6 KiB | text/x-python | PythonLexer
Martin Geisler
put license and copyright info into comment blocks
r8226 # i18n.py - internationalization support for mercurial
#
# Copyright 2005, 2006 Matt Mackall <mpm@selenic.com>
#
# This software may be used and distributed according to the terms of the
Matt Mackall
Update license to GPLv2+
r10263 # GNU General Public License version 2 or any later version.
Benoit Boissinot
i18n first part: make '_' available for files who need it
r1400
Gregory Szorc
i18n: use absolute_import
r25955 from __future__ import absolute_import
import gettext as gettextmod
import locale
import os
import sys
Pulkit Goyal
py3: convert to unicode to pass into encode()...
r30050 from . import (
    encoding,
    pycompat,
)
Martin Geisler
i18n: lookup .mo files in private locale/ directory...
r7650
# modelled after templater.templatepath:
Augie Fackler
i18n: use getattr instead of hasattr...
r14975 if getattr(sys, 'frozen', None) is not None:
Pulkit Goyal
py3: replace sys.executable with pycompat.sysexecutable...
r30669     module = pycompat.sysexecutable
Martin Geisler
i18n: lookup .mo files in private locale/ directory...
r7650 else:
    module = __file__
timeless
py3: handle ugettext + unicode in i18n
r28674 try:
    unicode
except NameError:
    unicode = str
Martin Geisler
i18n: lookup .mo files in private locale/ directory...
r7650
Yuya Nishihara
i18n: detect UI language without POSIX-style locale variable on Windows (BC)...
r21987 _languages = None
Pulkit Goyal
py3: replace os.name with pycompat.osname (part 1 of 2)...
r30639 if (pycompat.osname == 'nt'
Yuya Nishihara
py3: make i18n use encoding.environ
r30035     and 'LANGUAGE' not in encoding.environ
    and 'LC_ALL' not in encoding.environ
    and 'LC_MESSAGES' not in encoding.environ
    and 'LANG' not in encoding.environ):
Yuya Nishihara
i18n: detect UI language without POSIX-style locale variable on Windows (BC)...
r21987     # Try to detect UI language by "User Interface Language Management" API
    # if no locale variables are set. Note that locale.getdefaultlocale()
    # uses GetLocaleInfo(), which may be different from UI language.
    # (See http://msdn.microsoft.com/en-us/library/dd374098(v=VS.85).aspx )
    try:
        import ctypes
        langid = ctypes.windll.kernel32.GetUserDefaultUILanguage()
        _languages = [locale.windows_locale[langid]]
    except (ImportError, AttributeError, KeyError):
        # ctypes not found or unknown langid
        pass
Mads Kiilerich
i18n: use datapath for i18n like for templates and help...
r22638 _ugettext = None
def setdatapath(datapath):
Pulkit Goyal
py3: make util.datapath a bytes variable...
r30301     datapath = pycompat.fsdecode(datapath)
Augie Fackler
i18n: make the locale directory name the same string type as the datapath
r30085     localedir = os.path.join(datapath, pycompat.sysstr('locale'))
Mads Kiilerich
i18n: use datapath for i18n like for templates and help...
r22638     t = gettextmod.translation('hg', localedir, _languages, fallback=True)
    global _ugettext
timeless
py3: handle ugettext + unicode in i18n
r28674     try:
        _ugettext = t.ugettext
    except AttributeError:
        _ugettext = t.gettext
Martin Geisler
i18n: encode output in user's local encoding...
r7651
Augie Fackler
i18n: cache the result of every gettext call...
r23031 _msgcache = {}
Martin Geisler
i18n: encode output in user's local encoding...
r7651 def gettext(message):
    """Translate message.
    The message is looked up in the catalog to get a Unicode string,
    which is encoded in the local encoding before being returned.
    Important: message is restricted to characters in the encoding
    given by sys.getdefaultencoding() which is most likely 'ascii'.
    """
    # If message is None, t.ugettext will return u'None' as the
    # translation whereas our callers expect us to return None.
Mads Kiilerich
i18n: use datapath for i18n like for templates and help...
r22638     if message is None or not _ugettext:
Martin Geisler
i18n: encode output in user's local encoding...
r7651         return message
Augie Fackler
i18n: cache the result of every gettext call...
r23031     if message not in _msgcache:
        if type(message) is unicode:
            # goofy unicode docstrings in test
            paragraphs = message.split(u'\n\n')
        else:
            paragraphs = [p.decode("ascii") for p in message.split('\n\n')]
        # Be careful not to translate the empty string -- it holds the
        # meta data of the .po file.
Gregory Szorc
i18n: use unicode literal...
r29415         u = u'\n\n'.join([p and _ugettext(p) or u'' for p in paragraphs])
Augie Fackler
i18n: cache the result of every gettext call...
r23031         try:
            # encoding.tolocal cannot be used since it will first try to
            # decode the Unicode string. Calling u.decode(enc) really
            # means u.encode(sys.getdefaultencoding()).decode(enc). Since
            # the Python encoding defaults to 'ascii', this fails if the
            # translated string use non-ASCII characters.
Pulkit Goyal
py3: convert to unicode to pass into encode()...
r30050             encodingstr = pycompat.sysstr(encoding.encoding)
            _msgcache[message] = u.encode(encodingstr, "replace")
Augie Fackler
i18n: cache the result of every gettext call...
r23031         except LookupError:
            # An unknown encoding results in a LookupError.
            _msgcache[message] = message
    return _msgcache[message]
Martin Geisler
i18n: encode output in user's local encoding...
r7651
Brodie Rao
HGPLAIN: allow exceptions to plain mode, like i18n, via HGPLAINEXCEPT...
r13849 def _plain():
Yuya Nishihara
py3: make i18n use encoding.environ
r30035     if ('HGPLAIN' not in encoding.environ
        and 'HGPLAINEXCEPT' not in encoding.environ):
Brodie Rao
HGPLAIN: allow exceptions to plain mode, like i18n, via HGPLAINEXCEPT...
r13849         return False
Yuya Nishihara
py3: make i18n use encoding.environ
r30035     exceptions = encoding.environ.get('HGPLAINEXCEPT', '').strip().split(',')
Brodie Rao
HGPLAIN: allow exceptions to plain mode, like i18n, via HGPLAINEXCEPT...
r13849     return 'i18n' not in exceptions
if _plain():
Brodie Rao
ui: add HGPLAIN environment variable for easier scripting...
r10455     _ = lambda message: message
else:
    _ = gettext
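
Whether `_` translates or passes messages through depends only on two environment variables. The standalone sketch below restates the decision made in `_plain()`, using a plain dictionary in place of `encoding.environ` so it can be tried outside Mercurial. The function name `i18n_disabled` and its `environ` parameter are assumptions for illustration.

```python
def i18n_disabled(environ):
    """Return True when translation should be bypassed (plain mode).

    Hypothetical restatement of _plain() that takes the environment
    as a dictionary instead of reading encoding.environ.
    """
    # Neither variable set: normal mode, translations stay enabled.
    if "HGPLAIN" not in environ and "HGPLAINEXCEPT" not in environ:
        return False
    # Plain mode is on; translation survives only if 'i18n' is listed
    # as an exception. The whole value is stripped, not each entry.
    exceptions = environ.get("HGPLAINEXCEPT", "").strip().split(",")
    return "i18n" not in exceptions


print(i18n_disabled({}))
print(i18n_disabled({"HGPLAIN": "1"}))
print(i18n_disabled({"HGPLAIN": "1", "HGPLAINEXCEPT": "i18n"}))
```

Because only the full string is stripped, a value such as `color, i18n` (with a space after the comma) produces the entry ` i18n`, which does not match, so translation stays disabled in that case.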