phases: fix performance regression with Python 2 Unlike Python 3, xrange doesn't support efficient "in" and uses a linear time scan instead. Expand the condition to handle it fast. Differential Revision: https://phab.mercurial-scm.org/D9072

File last commit:

r44181:66210a20 stable
r46124:497271da default
win32mbcs.py
221 lines | 7.0 KiB | text/x-python | PythonLexer
Shun-ichi Goto
Update win32mbcs extension...
r6887 # win32mbcs.py -- MBCS filename support for Mercurial
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846 #
# Copyright (c) 2008 Shun-ichi Goto <shunichi.goto@gmail.com>
#
Shun-ichi GOTO
win32mbcs: Add configuration to specify path encoding...
r10050 # Version: 0.3
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846 # Author: Shun-ichi Goto <shunichi.goto@gmail.com>
#
Martin Geisler
updated license to be explicit about GPL version 2
r8225 # This software may be used and distributed according to the terms of the
Matt Mackall
Update license to GPLv2+
r10263 # GNU General Public License version 2 or any later version.
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846 #
Martin Geisler
add blank line after copyright notices and after header
r8228
Dirkjan Ochtman
extensions: fix up description lines some more
r8932 '''allow the use of MBCS paths with problematic encodings
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846
Martin Geisler
win32mbcs: word-wrap help texts at 70 characters
r8001 Some MBCS encodings are not good for some path operations (e.g.
splitting paths, case conversion) on their encoded bytes. We call
such an encoding (e.g. shift_jis and big5) a "problematic encoding".
This extension can be used to fix the issue with those encodings by
Martin Geisler
win32mbcs: capitalize Unicode
r8665 wrapping some functions to convert to Unicode string before path
Martin Geisler
win32mbcs: word-wrap help texts at 70 characters
r8001 operation.
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846
Martin Geisler
fixed typos found in translatable strings...
r8668 This extension is useful for:
Martin Geisler
win32mbcs: fix formatting of lists with proper reST markup
r9216
- Japanese Windows users using shift_jis encoding.
- Chinese Windows users using big5 encoding.
- All users who use a repository with one of the problematic encodings
on a case-insensitive file system.
Shun-ichi Goto
Update win32mbcs extension...
r6887
This extension is not needed for:
Martin Geisler
win32mbcs: fix formatting of lists with proper reST markup
r9216
- Any user who uses only ASCII characters in paths.
- Any user who does not use any of the problematic encodings.
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846
Shun-ichi Goto
Update win32mbcs extension...
r6887 Note that there are some limitations on using this extension:
Martin Geisler
win32mbcs: fix formatting of lists with proper reST markup
r9216
- You should use a single encoding in one repository.
Shun-ichi GOTO
win32mbcs: use extsetup() to wrap functions only once....
r13067 - If the repository path ends with 0x5c, .hg/hgrc cannot be read.
Javi Merino
win32mbcs: Fix typo in documentation...
r13330 - win32mbcs is not compatible with fixutf8 extension.
Shun-ichi GOTO
win32mbcs: Add configuration to specify path encoding...
r10050
Martin Geisler
win32mbcs: fix typos and reST syntax
r10067 By default, win32mbcs uses encoding.encoding decided by Mercurial.
You can specify the encoding with the config option::
Shun-ichi Goto
Update win32mbcs extension...
r6887
Shun-ichi GOTO
win32mbcs: Add configuration to specify path encoding...
r10050 [win32mbcs]
encoding = sjis
Martin Geisler
win32mbcs: fix typos and reST syntax
r10067 This is useful for users who want to commit with UTF-8 log messages.
Cédric Duval
extensions: improve the consistency of synopses...
r8894 '''
timeless
win32mbcs: use absolute_import
r28417 from __future__ import absolute_import
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846
timeless
win32mbcs: use absolute_import
r28417 import os
import sys
Yuya Nishihara
py3: move up symbol imports to enforce import-checker rules...
r29205 from mercurial.i18n import _
Gregory Szorc
py3: manually import getattr where it is needed...
r43359 from mercurial.pycompat import getattr, setattr
timeless
win32mbcs: use absolute_import
r28417 from mercurial import (
encoding,
error,
Pulkit Goyal
py3: replace os.sep with pycompat.ossep (part 4 of 4)
r30616 pycompat,
Boris Feld
configitems: register the 'win32mbcs.encoding' config
r34181 registrar,
timeless
win32mbcs: use absolute_import
r28417 )
Augie Fackler
extensions: change magic "shipped with hg" string...
r29841 # Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
Augie Fackler
extensions: document that `testedwith = 'internal'` is special...
r25186 # extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
# be specifying the version(s) of Mercurial they are tested with, or
# leave the attribute unspecified.
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 testedwith = b'ships-with-hg-core'
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846
Boris Feld
configitems: register the 'win32mbcs.encoding' config
r34181 configtable = {}
configitem = registrar.configitem(configtable)
# Encoding.encoding may be updated by --encoding option.
# Use a lambda to delay the resolution.
Augie Fackler
formatting: blacken the codebase...
r43346 configitem(
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 b'win32mbcs', b'encoding', default=lambda: encoding.encoding,
Boris Feld
configitems: register the 'win32mbcs.encoding' config
r34181 )
Augie Fackler
formatting: blacken the codebase...
r43346 _encoding = None # see extsetup
Shun-ichi GOTO
win32mbcs: Add configuration to specify path encoding...
r10050
Shun-ichi Goto
Update win32mbcs extension...
r6887 def decode(arg):
Matt Harbison
win32mbcs: fix a `str` type conditional for py3...
r44181 if isinstance(arg, bytes):
Shun-ichi GOTO
win32mbcs: Add configuration to specify path encoding...
r10050 uarg = arg.decode(_encoding)
if arg == uarg.encode(_encoding):
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 return uarg
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 raise UnicodeError(b"Not local encoding")
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 elif isinstance(arg, tuple):
return tuple(map(decode, arg))
elif isinstance(arg, list):
return list(map(decode, arg))
Shun-ichi GOTO
win32mbcs: wrapper supports keyword arguments and dict result....
r9131 elif isinstance(arg, dict):
for k, v in arg.items():
arg[k] = decode(v)
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 return arg
Shun-ichi Goto
Update win32mbcs extension...
r6887
Augie Fackler
formatting: blacken the codebase...
r43346
Shun-ichi Goto
Update win32mbcs extension...
r6887 def encode(arg):
Pulkit Goyal
py3: replace `unicode` with pycompat.unicode...
r38332 if isinstance(arg, pycompat.unicode):
Shun-ichi GOTO
win32mbcs: Add configuration to specify path encoding...
r10050 return arg.encode(_encoding)
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 elif isinstance(arg, tuple):
return tuple(map(encode, arg))
elif isinstance(arg, list):
return list(map(encode, arg))
Shun-ichi GOTO
win32mbcs: wrapper supports keyword arguments and dict result....
r9131 elif isinstance(arg, dict):
for k, v in arg.items():
arg[k] = encode(v)
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 return arg
Shun-ichi Goto
Update win32mbcs extension...
r6887
Augie Fackler
formatting: blacken the codebase...
r43346
Shun-ichi GOTO
win32mbcs: add special wrapper for osutil.listdir()....
r9132 def appendsep(s):
# ensure the path ends with os.sep, appending it if necessary.
try:
us = decode(s)
except UnicodeError:
us = s
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 if us and us[-1] not in b':/\\':
Pulkit Goyal
py3: replace os.sep with pycompat.ossep (part 4 of 4)
r30616 s += pycompat.ossep
Shun-ichi GOTO
win32mbcs: add special wrapper for osutil.listdir()....
r9132 return s
Shun-ichi GOTO
win32mbcs: add reversing wrapper for some unicode-incompatible functions....
r17798
def basewrapper(func, argtype, enc, dec, args, kwds):
# check whether already converted, then call original
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 for arg in args:
Shun-ichi GOTO
win32mbcs: add reversing wrapper for some unicode-incompatible functions....
r17798 if isinstance(arg, argtype):
Shun-ichi GOTO
win32mbcs: wrapper supports keyword arguments and dict result....
r9131 return func(*args, **kwds)
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 try:
Shun-ichi GOTO
win32mbcs: add reversing wrapper for some unicode-incompatible functions....
r17798 # convert string arguments, call func, then convert back the
# return value.
return enc(func(*dec(args), **dec(kwds)))
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 except UnicodeError:
Augie Fackler
formatting: blacken the codebase...
r43346 raise error.Abort(
Martin von Zweigbergk
cleanup: join string literals that are already on one line...
r43387 _(b"[win32mbcs] filename conversion failed with %s encoding\n")
Augie Fackler
formatting: blacken the codebase...
r43346 % _encoding
)
Shun-ichi Goto
Update win32mbcs extension...
r6887
Shun-ichi GOTO
win32mbcs: add reversing wrapper for some unicode-incompatible functions....
r17798 def wrapper(func, args, kwds):
Pulkit Goyal
py3: replace `unicode` with pycompat.unicode...
r38332 return basewrapper(func, pycompat.unicode, encode, decode, args, kwds)
Shun-ichi GOTO
win32mbcs: add reversing wrapper for some unicode-incompatible functions....
r17798
def reversewrapper(func, args, kwds):
return basewrapper(func, str, decode, encode, args, kwds)
Augie Fackler
formatting: blacken the codebase...
r43346
Shun-ichi GOTO
win32mbcs: add special wrapper for osutil.listdir()....
r9132 def wrapperforlistdir(func, args, kwds):
# Ensure 'path' argument ends with os.sep to avoid
# misinterpreting the trailing 0x5c of an MBCS 2nd byte as a path separator.
if args:
args = list(args)
args[0] = appendsep(args[0])
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 if 'path' in kwds:
kwds['path'] = appendsep(kwds['path'])
Shun-ichi GOTO
win32mbcs: add special wrapper for osutil.listdir()....
r9132 return func(*args, **kwds)
Augie Fackler
formatting: blacken the codebase...
r43346
Shun-ichi GOTO
win32mbcs: add special wrapper for osutil.listdir()....
r9132 def wrapname(name, wrapper):
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 module, name = name.rsplit(b'.', 1)
Brodie Rao
win32mbcs: look up modules using sys.modules (issue1729)...
r9098 module = sys.modules[module]
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 func = getattr(module, name)
Augie Fackler
formatting: blacken the codebase...
r43346
Shun-ichi GOTO
win32mbcs: wrapper supports keyword arguments and dict result....
r9131 def f(*args, **kwds):
return wrapper(func, args, kwds)
Augie Fackler
formatting: blacken the codebase...
r43346
Augie Fackler
win32mbcs: drop code that was catering to Python 2.3 and earlier
r30476 f.__name__ = func.__name__
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 setattr(module, name, f)
Shun-ichi Goto
Update win32mbcs extension...
r6887
Augie Fackler
formatting: blacken the codebase...
r43346
Shun-ichi Goto
Update win32mbcs extension...
r6887 # List of functions to be wrapped.
# NOTE: os.path.dirname() and os.path.basename() are safe because
# they use the result of os.path.split()
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 funcs = b'''os.path.join os.path.split os.path.splitext
Martin von Zweigbergk
util: rename checkcase() to fscasesensitive() (API)...
r29889 os.path.normpath os.makedirs mercurial.util.endswithsep
mercurial.util.splitpath mercurial.util.fscasesensitive
Shun-ichi GOTO
win32mbcs: wrap two more functions to be wrapped....
r14841 mercurial.util.fspath mercurial.util.pconvert mercurial.util.normpath
Shun-ichi GOTO
win32mbcs: wrap util.split()...
r19383 mercurial.util.checkwinfilename mercurial.util.checkosfilename
mercurial.util.split'''
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846
Shun-ichi GOTO
win32mbcs: add reversing wrapper for some unicode-incompatible functions....
r17798 # These functions must be called with local encoded strings
# because they expect their arguments to be local encoded strings and
# misbehave when given unicode strings.
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 rfuncs = b'''mercurial.encoding.upper mercurial.encoding.lower
FUJIWARA Katsunori
win32mbcs: avoid unintentional failure at colorization...
r32566 mercurial.util._filenamebytestr'''
Shun-ichi GOTO
win32mbcs: add reversing wrapper for some unicode-incompatible functions....
r17798
FUJIWARA Katsunori
win32mbcs: allow win32mbcs extension to be enabled on cygwin platform...
r15724 # List of Windows specific functions to be wrapped.
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 winfuncs = b'''os.path.splitunc'''
FUJIWARA Katsunori
win32mbcs: allow win32mbcs extension to be enabled on cygwin platform...
r15724
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846 # codec and alias names of sjis and big5 to be faked.
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 problematic_encodings = b'''big5 big5-tw csbig5 big5hkscs big5-hkscs
Shun-ichi Goto
Update win32mbcs extension...
r6887 hkscs cp932 932 ms932 mskanji ms-kanji shift_jis csshiftjis shiftjis
sjis s_jis shift_jis_2004 shiftjis2004 sjis_2004 sjis2004
Shun-ichi GOTO
Add cp950 as problematic encoding which is used in chinese windows.
r8714 shift_jisx0213 shiftjisx0213 sjisx0213 s_jisx0213 950 cp950 ms950 '''
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846
Augie Fackler
formatting: blacken the codebase...
r43346
Shun-ichi GOTO
win32mbcs: use extsetup() to wrap functions only once....
r13067 def extsetup(ui):
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 # TODO: decide use of config section for this extension
Augie Fackler
formatting: blacken the codebase...
r43346 if (not os.path.supports_unicode_filenames) and (
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 pycompat.sysplatform != b'cygwin'
Augie Fackler
formatting: blacken the codebase...
r43346 ):
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 ui.warn(_(b"[win32mbcs] cannot activate on this platform.\n"))
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 return
Shun-ichi GOTO
win32mbcs: Add configuration to specify path encoding...
r10050 # determine encoding for filename
global _encoding
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 _encoding = ui.config(b'win32mbcs', b'encoding')
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 # fake is only for relevant environment.
Shun-ichi GOTO
win32mbcs: Add configuration to specify path encoding...
r10050 if _encoding.lower() in problematic_encodings.split():
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 for f in funcs.split():
Shun-ichi GOTO
win32mbcs: add special wrapper for osutil.listdir()....
r9132 wrapname(f, wrapper)
Jun Wu
codemod: use pycompat.iswindows...
r34646 if pycompat.iswindows:
FUJIWARA Katsunori
win32mbcs: allow win32mbcs extension to be enabled on cygwin platform...
r15724 for f in winfuncs.split():
wrapname(f, wrapper)
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 wrapname(b"mercurial.util.listdir", wrapperforlistdir)
wrapname(b"mercurial.windows.listdir", wrapperforlistdir)
Shun-ichi GOTO
win32mbcs: add reversing wrapper for some unicode-incompatible functions....
r17798 # wrap functions to be called with local byte string arguments
for f in rfuncs.split():
wrapname(f, reversewrapper)
Shun-ichi GOTO
win32mbcs: use extsetup() to wrap functions only once....
r13067 # Check sys.argv manually instead of using ui.debug() because
# command line options are not yet applied when
# extensions.loadall() is called.
Augie Fackler
formatting: byteify all mercurial/ and hgext/ string literals...
r43347 if b'--debug' in sys.argv:
formatting: run black on all file again...
r43364 ui.writenoi18n(
b"[win32mbcs] activated with encoding: %s\n" % _encoding
)
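
The module docstring says that byte-level path operations misbehave with "problematic" encodings and that the extension works around this by converting to Unicode before the path operation. The following standalone sketch illustrates that idea outside Mercurial. The helper names `naive_split` and `safe_split` are hypothetical and not part of win32mbcs; the sketch assumes the cp932 codec and uses `ntpath` so that Windows path rules apply on any platform.

```python
import ntpath

ENC = "cp932"  # one of the encodings listed in problematic_encodings

# U+8868 encodes to the two bytes 0x95 0x5C in cp932. The second byte has
# the same value as the ASCII backslash, the Windows path separator.
name = u"\u8868.txt".encode(ENC)
path = b"C:\\work\\" + name


def naive_split(path_bytes):
    # Byte-level split on the last 0x5C. This cuts the two-byte
    # character in half because its trailing byte looks like a separator.
    head, _sep, tail = path_bytes.rpartition(b"\\")
    return head, tail


def safe_split(path_bytes, enc=ENC):
    # Decode first, verify the round trip (the same check decode() in the
    # extension performs), split on characters, then encode the parts back.
    text = path_bytes.decode(enc)
    if text.encode(enc) != path_bytes:
        raise UnicodeError("Not local encoding")
    head, tail = ntpath.split(text)
    return head.encode(enc), tail.encode(enc)


print(naive_split(path))  # the file name is split inside the character
print(safe_split(path))   # directory and file name are separated correctly
```

The round-trip check matters because a byte string that does not survive decode and re-encode unchanged is probably not in the configured encoding, and converting it would silently alter the path.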