crecord: use raw string for regular expression

\s emits a SyntaxWarning in Python 3.8. Use a raw string to avoid escaping the \.

Differential Revision: https://phab.mercurial-scm.org/D5819

File last commit:

r38332:79dd61a4 default
r41676:7a90ff8c default
win32mbcs.py
206 lines | 6.9 KiB | text/x-python | PythonLexer
Shun-ichi Goto
Update win32mbcs extension...
r6887 # win32mbcs.py -- MBCS filename support for Mercurial
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846 #
# Copyright (c) 2008 Shun-ichi Goto <shunichi.goto@gmail.com>
#
Shun-ichi GOTO
win32mbcs: Add configuration to specify path encoding...
r10050 # Version: 0.3
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846 # Author: Shun-ichi Goto <shunichi.goto@gmail.com>
#
Martin Geisler
updated license to be explicit about GPL version 2
r8225 # This software may be used and distributed according to the terms of the
Matt Mackall
Update license to GPLv2+
r10263 # GNU General Public License version 2 or any later version.
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846 #
Martin Geisler
add blank line after copyright notices and after header
r8228
Dirkjan Ochtman
extensions: fix up description lines some more
r8932 '''allow the use of MBCS paths with problematic encodings
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846
Martin Geisler
win32mbcs: word-wrap help texts at 70 characters
r8001 Some MBCS encodings are not good for some path operations (e.g.
splitting paths, case conversion) on their encoded bytes. We call
such an encoding (e.g. shift_jis or big5) a "problematic encoding".
This extension can be used to fix the issue with those encodings by
Martin Geisler
win32mbcs: capitalize Unicode
r8665 wrapping some functions to convert to Unicode string before path
Martin Geisler
win32mbcs: word-wrap help texts at 70 characters
r8001 operation.
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846
Martin Geisler
fixed typos found in translatable strings...
r8668 This extension is useful for:
Martin Geisler
win32mbcs: fix formatting of lists with proper reST markup
r9216
- Japanese Windows users using shift_jis encoding.
- Chinese Windows users using big5 encoding.
- All users who use a repository with one of the problematic encodings
on a case-insensitive file system.
Shun-ichi Goto
Update win32mbcs extension...
r6887
This extension is not needed for:
Martin Geisler
win32mbcs: fix formatting of lists with proper reST markup
r9216
- Any user who uses only ASCII characters in paths.
- Any user who does not use any of the problematic encodings.
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846
Shun-ichi Goto
Update win32mbcs extension...
r6887 Note that there are some limitations on using this extension:
Martin Geisler
win32mbcs: fix formatting of lists with proper reST markup
r9216
- You should use a single encoding in one repository.
Shun-ichi GOTO
win32mbcs: use extsetup() to wrap functions only once....
r13067 - If the repository path ends with 0x5c, .hg/hgrc cannot be read.
Javi Merino
win32mbcs: Fix typo in documentation...
r13330 - win32mbcs is not compatible with the fixutf8 extension.
Shun-ichi GOTO
win32mbcs: Add configuration to specify path encoding...
r10050
Martin Geisler
win32mbcs: fix typos and reST syntax
r10067 By default, win32mbcs uses encoding.encoding as decided by Mercurial.
You can specify the encoding with a config option::
Shun-ichi Goto
Update win32mbcs extension...
r6887
Shun-ichi GOTO
win32mbcs: Add configuration to specify path encoding...
r10050 [win32mbcs]
encoding = sjis
Martin Geisler
win32mbcs: fix typos and reST syntax
r10067 This is useful for users who want to commit with UTF-8 log messages.
Cédric Duval
extensions: improve the consistency of synopses...
r8894 '''
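To make the "problematic encoding" issue from the docstring concrete, here is a minimal sketch that is not part of the extension. It assumes Python 3 and uses the standard `ntpath` module so that Windows path rules apply on any platform; the sample path is made up for illustration.

```python
import ntpath

# U+8868 encodes to b'\x95\x5c' in cp932; the second byte has the
# same value as the ASCII backslash.
name = '\u8868'
raw_name = name.encode('cp932')

upath = 'C:\\data\\' + name        # hypothetical path
bpath = upath.encode('cp932')

# Byte-wise splitting treats the trailing 0x5c as a path separator,
# so the file name is lost.
bytes_head, bytes_tail = ntpath.split(bpath)

# Splitting the decoded string keeps the character intact.
text_head, text_tail = ntpath.split(upath)

print(raw_name, bytes_tail, text_tail)
```

This is the reason the extension decodes arguments to Unicode before calling path functions and encodes the result afterwards.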
timeless
win32mbcs: use absolute_import
r28417 from __future__ import absolute_import
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846
timeless
win32mbcs: use absolute_import
r28417 import os
import sys
Yuya Nishihara
py3: move up symbol imports to enforce import-checker rules...
r29205 from mercurial.i18n import _
timeless
win32mbcs: use absolute_import
r28417 from mercurial import (
encoding,
error,
Pulkit Goyal
py3: replace os.sep with pycompat.ossep (part 4 of 4)
r30616 pycompat,
Boris Feld
configitems: register the 'win32mbcs.encoding' config
r34181 registrar,
timeless
win32mbcs: use absolute_import
r28417 )
Augie Fackler
extensions: change magic "shipped with hg" string...
r29841 # Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
Augie Fackler
extensions: document that `testedwith = 'internal'` is special...
r25186 # extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
# be specifying the version(s) of Mercurial they are tested with, or
# leave the attribute unspecified.
Augie Fackler
extensions: change magic "shipped with hg" string...
r29841 testedwith = 'ships-with-hg-core'
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846
Boris Feld
configitems: register the 'win32mbcs.encoding' config
r34181 configtable = {}
configitem = registrar.configitem(configtable)
# encoding.encoding may be updated by the --encoding option.
# Use a lambda to delay the resolution.
configitem('win32mbcs', 'encoding',
default=lambda: encoding.encoding,
)
Shun-ichi GOTO
win32mbcs: use extsetup() to wrap functions only once....
r13067 _encoding = None # see extsetup
Shun-ichi GOTO
win32mbcs: Add configuration to specify path encoding...
r10050
Shun-ichi Goto
Update win32mbcs extension...
r6887 def decode(arg):
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 if isinstance(arg, str):
Shun-ichi GOTO
win32mbcs: Add configuration to specify path encoding...
r10050 uarg = arg.decode(_encoding)
if arg == uarg.encode(_encoding):
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 return uarg
raise UnicodeError("Not local encoding")
elif isinstance(arg, tuple):
return tuple(map(decode, arg))
elif isinstance(arg, list):
return list(map(decode, arg))
Shun-ichi GOTO
win32mbcs: wrapper supports keyword arguments and dict result....
r9131 elif isinstance(arg, dict):
for k, v in arg.items():
arg[k] = decode(v)
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 return arg
Shun-ichi Goto
Update win32mbcs extension...
r6887
def encode(arg):
Pulkit Goyal
py3: replace `unicode` with pycompat.unicode...
r38332 if isinstance(arg, pycompat.unicode):
Shun-ichi GOTO
win32mbcs: Add configuration to specify path encoding...
r10050 return arg.encode(_encoding)
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 elif isinstance(arg, tuple):
return tuple(map(encode, arg))
elif isinstance(arg, list):
return list(map(encode, arg))
Shun-ichi GOTO
win32mbcs: wrapper supports keyword arguments and dict result....
r9131 elif isinstance(arg, dict):
for k, v in arg.items():
arg[k] = encode(v)
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 return arg
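The round-trip check in decode() and the recursive handling of containers could be sketched in Python 3 terms roughly as follows. This is an illustration under assumptions, not the extension's code: `decode_strict` and its `enc` parameter are hypothetical names, `bytes` stands in for the Python 2 `str`, and the dict branch builds a new dict instead of mutating its argument.

```python
def decode_strict(arg, enc):
    """Decode bytes (recursively inside containers) only if they round-trip."""
    if isinstance(arg, bytes):
        text = arg.decode(enc)       # raises UnicodeDecodeError on bad input
        if text.encode(enc) != arg:  # reject values that do not round-trip
            raise UnicodeError('not in local encoding')
        return text
    if isinstance(arg, tuple):
        return tuple(decode_strict(a, enc) for a in arg)
    if isinstance(arg, list):
        return [decode_strict(a, enc) for a in arg]
    if isinstance(arg, dict):
        return {k: decode_strict(v, enc) for k, v in arg.items()}
    return arg                       # anything else passes through unchanged


args = (b'dir', [b'\x95\x5c'], {'path': b'a.txt'}, 42)
print(decode_strict(args, 'cp932'))
```

Raising UnicodeError when the bytes cannot be decoded or do not round-trip lets a caller such as basewrapper() report one conversion failure rather than operate on a mangled path.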
Shun-ichi Goto
Update win32mbcs extension...
r6887
Shun-ichi GOTO
win32mbcs: add special wrapper for osutil.listdir()....
r9132 def appendsep(s):
# ensure the path ends with os.sep, appending it if necessary.
try:
us = decode(s)
except UnicodeError:
us = s
if us and us[-1] not in ':/\\':
Pulkit Goyal
py3: replace os.sep with pycompat.ossep (part 4 of 4)
r30616 s += pycompat.ossep
Shun-ichi GOTO
win32mbcs: add special wrapper for osutil.listdir()....
r9132 return s
Shun-ichi GOTO
win32mbcs: add reversing wrapper for some unicode-incompatible functions....
r17798
def basewrapper(func, argtype, enc, dec, args, kwds):
# check if already converted, then call original
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 for arg in args:
Shun-ichi GOTO
win32mbcs: add reversing wrapper for some unicode-incompatible functions....
r17798 if isinstance(arg, argtype):
Shun-ichi GOTO
win32mbcs: wrapper supports keyword arguments and dict result....
r9131 return func(*args, **kwds)
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 try:
Shun-ichi GOTO
win32mbcs: add reversing wrapper for some unicode-incompatible functions....
r17798 # convert string arguments, call func, then convert back the
# return value.
return enc(func(*dec(args), **dec(kwds)))
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 except UnicodeError:
Pierre-Yves David
error: get Abort from 'error' instead of 'util'...
r26587 raise error.Abort(_("[win32mbcs] filename conversion failed with"
Shun-ichi GOTO
win32mbcs: Add configuration to specify path encoding...
r10050 " %s encoding\n") % (_encoding))
Shun-ichi Goto
Update win32mbcs extension...
r6887
Shun-ichi GOTO
win32mbcs: add reversing wrapper for some unicode-incompatible functions....
r17798 def wrapper(func, args, kwds):
Pulkit Goyal
py3: replace `unicode` with pycompat.unicode...
r38332 return basewrapper(func, pycompat.unicode, encode, decode, args, kwds)
Shun-ichi GOTO
win32mbcs: add reversing wrapper for some unicode-incompatible functions....
r17798
def reversewrapper(func, args, kwds):
return basewrapper(func, str, decode, encode, args, kwds)
Shun-ichi GOTO
win32mbcs: add special wrapper for osutil.listdir()....
r9132 def wrapperforlistdir(func, args, kwds):
# Ensure the 'path' argument ends with os.sep to avoid misinterpreting
# a trailing 0x5c (2nd byte of an MBCS character) as a path separator.
if args:
args = list(args)
args[0] = appendsep(args[0])
Nicolas Dumazet
use 'x in dict' instead of 'dict.has_key(x)'...
r9391 if 'path' in kwds:
Shun-ichi GOTO
win32mbcs: add special wrapper for osutil.listdir()....
r9132 kwds['path'] = appendsep(kwds['path'])
return func(*args, **kwds)
def wrapname(name, wrapper):
Brodie Rao
win32mbcs: look up modules using sys.modules (issue1729)...
r9098 module, name = name.rsplit('.', 1)
module = sys.modules[module]
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 func = getattr(module, name)
Shun-ichi GOTO
win32mbcs: wrapper supports keyword arguments and dict result....
r9131 def f(*args, **kwds):
return wrapper(func, args, kwds)
Augie Fackler
win32mbcs: drop code that was catering to Python 2.3 and earlier
r30476 f.__name__ = func.__name__
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 setattr(module, name, f)
Shun-ichi Goto
Update win32mbcs extension...
r6887
# List of functions to be wrapped.
# NOTE: os.path.dirname() and os.path.basename() are safe because
# they use the result of os.path.split()
funcs = '''os.path.join os.path.split os.path.splitext
Martin von Zweigbergk
util: rename checkcase() to fscasesensitive() (API)...
r29889 os.path.normpath os.makedirs mercurial.util.endswithsep
mercurial.util.splitpath mercurial.util.fscasesensitive
Shun-ichi GOTO
win32mbcs: wrap two more functions to be wrapped....
r14841 mercurial.util.fspath mercurial.util.pconvert mercurial.util.normpath
Shun-ichi GOTO
win32mbcs: wrap util.split()...
r19383 mercurial.util.checkwinfilename mercurial.util.checkosfilename
mercurial.util.split'''
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846
Shun-ichi GOTO
win32mbcs: add reversing wrapper for some unicode-incompatible functions....
r17798 # These functions must be called with locally encoded strings
# because they expect their arguments to be locally encoded and cause
# problems with unicode strings.
FUJIWARA Katsunori
win32mbcs: wrap underlying pycompat.bytestr to use checkwinfilename safely...
r32244 rfuncs = '''mercurial.encoding.upper mercurial.encoding.lower
FUJIWARA Katsunori
win32mbcs: avoid unintentional failure at colorization...
r32566 mercurial.util._filenamebytestr'''
Shun-ichi GOTO
win32mbcs: add reversing wrapper for some unicode-incompatible functions....
r17798
FUJIWARA Katsunori
win32mbcs: allow win32mbcs extension to be enabled on cygwin platform...
r15724 # List of Windows specific functions to be wrapped.
winfuncs = '''os.path.splitunc'''
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846 # codec and alias names of sjis and big5 to be faked.
Shun-ichi Goto
Update win32mbcs extension...
r6887 problematic_encodings = '''big5 big5-tw csbig5 big5hkscs big5-hkscs
hkscs cp932 932 ms932 mskanji ms-kanji shift_jis csshiftjis shiftjis
sjis s_jis shift_jis_2004 shiftjis2004 sjis_2004 sjis2004
Shun-ichi GOTO
Add cp950 as problematic encoding which is used in chinese windows.
r8714 shift_jisx0213 shiftjisx0213 sjisx0213 s_jisx0213 950 cp950 ms950 '''
Shun-ichi GOTO
New extension to support problematic MBCS on Windows....
r5846
Shun-ichi GOTO
win32mbcs: use extsetup() to wrap functions only once....
r13067 def extsetup(ui):
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 # TODO: decide use of config section for this extension
FUJIWARA Katsunori
win32mbcs: allow win32mbcs extension to be enabled on cygwin platform...
r15724 if ((not os.path.supports_unicode_filenames) and
Pulkit Goyal
py3: replace sys.platform with pycompat.sysplatform (part 2 of 2)
r30642 (pycompat.sysplatform != 'cygwin')):
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 ui.warn(_("[win32mbcs] cannot activate on this platform.\n"))
return
Shun-ichi GOTO
win32mbcs: Add configuration to specify path encoding...
r10050 # determine encoding for filename
global _encoding
Boris Feld
configitems: register the 'win32mbcs.encoding' config
r34181 _encoding = ui.config('win32mbcs', 'encoding')
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 # faking is only needed in relevant environments.
Shun-ichi GOTO
win32mbcs: Add configuration to specify path encoding...
r10050 if _encoding.lower() in problematic_encodings.split():
Peter Arrenbrecht
cleanup: whitespace cleanup
r7877 for f in funcs.split():
Shun-ichi GOTO
win32mbcs: add special wrapper for osutil.listdir()....
r9132 wrapname(f, wrapper)
Jun Wu
codemod: use pycompat.iswindows...
r34646 if pycompat.iswindows:
FUJIWARA Katsunori
win32mbcs: allow win32mbcs extension to be enabled on cygwin platform...
r15724 for f in winfuncs.split():
wrapname(f, wrapper)
Yuya Nishihara
osutil: proxy through util (and platform) modules (API)...
r32203 wrapname("mercurial.util.listdir", wrapperforlistdir)
wrapname("mercurial.windows.listdir", wrapperforlistdir)
Shun-ichi GOTO
win32mbcs: add reversing wrapper for some unicode-incompatible functions....
r17798 # wrap functions to be called with local byte string arguments
for f in rfuncs.split():
wrapname(f, reversewrapper)
Shun-ichi GOTO
win32mbcs: use extsetup() to wrap functions only once....
r13067 # Check sys.argv manually instead of using ui.debug() because
# command line options are not yet applied when
# extensions.loadall() is called.
if '--debug' in sys.argv:
FUJIWARA Katsunori
check-code: detect "missing _() in ui message" more exactly...
r29397 ui.write(("[win32mbcs] activated with encoding: %s\n")
Shun-ichi GOTO
win32mbcs: use extsetup() to wrap functions only once....
r13067 % _encoding)
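The activation test in extsetup() is a literal, lowercased membership check against the alias list. A minimal sketch of that check follows, using a shortened subset of the aliases and a hypothetical helper name `needs_wrapping`.

```python
# A shortened, illustrative subset of the alias list in the extension.
PROBLEMATIC = frozenset('big5 cp932 shift_jis sjis cp950'.split())


def needs_wrapping(enc):
    """Return True if `enc` names one of the listed problematic encodings."""
    return enc.lower() in PROBLEMATIC


for candidate in ('Shift_JIS', 'shift-jis', 'utf-8'):
    print(candidate, needs_wrapping(candidate))
```

Because the comparison is literal, a spelling that is not in the list (for example `shift-jis` with a hyphen) does not match, so the configured encoding name should be one of the listed aliases.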