eol.py
"""automatically manage newlines in repository files

This extension allows you to manage the type of line endings (CRLF or
LF) that are used in the repository and in the local working
directory. That way you can get CRLF line endings on Windows and LF on
Unix/Mac, thereby letting everybody use their OS native line endings.

The extension reads its configuration from a versioned ``.hgeol``
configuration file found in the root of the working directory. The
``.hgeol`` file uses the same syntax as all other Mercurial
configuration files. It uses two sections, ``[patterns]`` and
``[repository]``.

The ``[patterns]`` section specifies how line endings should be
converted between the working directory and the repository. The format is
specified by a file pattern. The first match is used, so put more
specific patterns first. The available line endings are ``LF``,
``CRLF``, and ``BIN``.

Files with the declared format of ``CRLF`` or ``LF`` are always
checked out and stored in the repository in that format and files
declared to be binary (``BIN``) are left unchanged. Additionally,
``native`` is an alias for checking out in the platform's default line
ending: ``LF`` on Unix (including Mac OS X) and ``CRLF`` on
Windows. Note that ``BIN`` (do nothing to line endings) is Mercurial's
default behavior; it is only needed if you need to override a later,
more general pattern.

The optional ``[repository]`` section specifies the line endings to
use for files stored in the repository. It has a single setting,
``native``, which determines the storage line endings for files
declared as ``native`` in the ``[patterns]`` section. It can be set to
``LF`` or ``CRLF``. The default is ``LF``. For example, this means
that on Windows, files configured as ``native`` (``CRLF`` by default)
will be converted to ``LF`` when stored in the repository. Files
declared as ``LF``, ``CRLF``, or ``BIN`` in the ``[patterns]`` section
are always stored as-is in the repository.

Example versioned ``.hgeol`` file::

  [patterns]
  **.py = native
  **.vcproj = CRLF
  **.txt = native
  Makefile = LF
  **.jpg = BIN

  [repository]
  native = LF

.. note::

   The rules will first apply when files are touched in the working
   directory, e.g. by updating to null and back to tip to touch all files.

The extension uses an optional ``[eol]`` section read from both the
normal Mercurial configuration files and the ``.hgeol`` file, with the
latter overriding the former. You can use that section to control the
overall behavior. There are three settings:

- ``eol.native`` (default ``os.linesep``) can be set to ``LF`` or
  ``CRLF`` to override the default interpretation of ``native`` for
  checkout. This can be used with :hg:`archive` on Unix, say, to
  generate an archive where files have line endings for Windows.

- ``eol.only-consistent`` (default True) can be set to False to make
  the extension convert files with inconsistent EOLs. Inconsistent
  means that there is both ``CRLF`` and ``LF`` present in the file.
  Such files are normally not touched under the assumption that they
  have mixed EOLs on purpose.

- ``eol.fix-trailing-newline`` (default False) can be set to True to
  ensure that converted files end with an EOL character (either ``\\n``
  or ``\\r\\n`` as per the configured patterns).

The extension provides ``cleverencode:`` and ``cleverdecode:`` filters
like the deprecated win32text extension does. This means that you can
disable win32text and enable eol and your filters will still work. You
only need these filters until you have prepared a ``.hgeol`` file.

The ``win32text.forbid*`` hooks provided by the win32text extension
have been unified into a single hook named ``eol.checkheadshook``. The
hook will look up the expected line endings from the ``.hgeol`` file,
which means you must migrate to a ``.hgeol`` file first before using
the hook. ``eol.checkheadshook`` only checks heads; intermediate
invalid revisions will be pushed. To forbid them completely, use the
``eol.checkallhook`` hook. These hooks are best used as
``pretxnchangegroup`` hooks.

See :hg:`help patterns` for more information about the glob patterns
used.
"""

from __future__ import absolute_import

import os
import re
from mercurial.i18n import _
from mercurial import (
    config,
    error,
    extensions,
    match,
    util,
)

# Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
# extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
# be specifying the version(s) of Mercurial they are tested with, or
# leave the attribute unspecified.
testedwith = 'ships-with-hg-core'

# Matches a lone LF, i.e., one that is not part of CRLF.
singlelf = re.compile('(^|[^\r])\n')
# Matches a single EOL which can either be a CRLF where repeated CR
# are removed or a LF. We do not care about old Macintosh files, so a
# stray CR is an error.
eolre = re.compile('\r*\n')


def inconsistenteol(data):
    return '\r\n' in data and singlelf.search(data)

def tolf(s, params, ui, **kwargs):
    """Filter to convert to LF EOLs."""
    if util.binary(s):
        return s
    if ui.configbool('eol', 'only-consistent', True) and inconsistenteol(s):
        return s
    if (ui.configbool('eol', 'fix-trailing-newline', False)
        and s and s[-1] != '\n'):
        s = s + '\n'
    return eolre.sub('\n', s)

def tocrlf(s, params, ui, **kwargs):
    """Filter to convert to CRLF EOLs."""
    if util.binary(s):
        return s
    if ui.configbool('eol', 'only-consistent', True) and inconsistenteol(s):
        return s
    if (ui.configbool('eol', 'fix-trailing-newline', False)
        and s and s[-1] != '\n'):
        s = s + '\n'
    return eolre.sub('\r\n', s)

def isbinary(s, params):
    """Filter to do nothing with the file."""
    return s

filters = {
    'to-lf': tolf,
    'to-crlf': tocrlf,
    'is-binary': isbinary,
    # The following provide backwards compatibility with win32text
    'cleverencode:': tolf,
    'cleverdecode:': tocrlf
}
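
The conversion filters above can be tried outside Mercurial. The following is a minimal sketch, assuming Python 3 and plain `str` input; it reuses the two regular expressions from the module, but `convert` and its keyword arguments are hypothetical stand-ins for `tolf`/`tocrlf` and the `eol.only-consistent` and `eol.fix-trailing-newline` settings, and the binary-file check is left out.

```python
import re

# Same patterns as in the module above.
singlelf = re.compile('(^|[^\r])\n')   # a lone LF, not part of CRLF
eolre = re.compile('\r*\n')            # one EOL: LF, optionally preceded by CRs


def inconsistenteol(data):
    """Return True if data contains both CRLF and lone LF line endings."""
    return '\r\n' in data and bool(singlelf.search(data))


def convert(data, eol, only_consistent=True, fix_trailing_newline=False):
    """Hypothetical helper: rewrite every line ending in data to eol."""
    if only_consistent and inconsistenteol(data):
        # Mixed line endings are assumed to be intentional; leave them alone.
        return data
    if fix_trailing_newline and data and data[-1] != '\n':
        data = data + '\n'
    return eolre.sub(eol, data)


if __name__ == '__main__':
    print(repr(convert('a\r\nb\r\n', '\n')))
    print(repr(convert('a\r\nb\n', '\n')))
```

With the defaults, a file that mixes `CRLF` and `LF` is returned unchanged, which mirrors the documented behavior of `eol.only-consistent`.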

class eolfile(object):
    def __init__(self, ui, root, data):
        self._decode = {'LF': 'to-lf', 'CRLF': 'to-crlf', 'BIN': 'is-binary'}
        self._encode = {'LF': 'to-lf', 'CRLF': 'to-crlf', 'BIN': 'is-binary'}

        self.cfg = config.config()
        # Our files should not be touched. The pattern must be
        # inserted first to override a '** = native' pattern.
        self.cfg.set('patterns', '.hg*', 'BIN', 'eol')
        # We can then parse the user's patterns.
        self.cfg.parse('.hgeol', data)

        isrepolf = self.cfg.get('repository', 'native') != 'CRLF'
        self._encode['NATIVE'] = isrepolf and 'to-lf' or 'to-crlf'
        iswdlf = ui.config('eol', 'native', os.linesep) in ('LF', '\n')
        self._decode['NATIVE'] = iswdlf and 'to-lf' or 'to-crlf'

        include = []
        exclude = []
        self.patterns = []
        for pattern, style in self.cfg.items('patterns'):
            key = style.upper()
            if key == 'BIN':
                exclude.append(pattern)
            else:
                include.append(pattern)
            m = match.match(root, '', [pattern])
            self.patterns.append((pattern, key, m))
        # This will match the files for which we need to care
        # about inconsistent newlines.
        self.match = match.match(root, '', [], include, exclude)

    def copytoui(self, ui):
        for pattern, key, m in self.patterns:
            try:
                ui.setconfig('decode', pattern, self._decode[key], 'eol')
                ui.setconfig('encode', pattern, self._encode[key], 'eol')
            except KeyError:
                ui.warn(_("ignoring unknown EOL style '%s' from %s\n")
                        % (key, self.cfg.source('patterns', pattern)))
        # eol.only-consistent can be specified in ~/.hgrc or .hgeol
        for k, v in self.cfg.items('eol'):
            ui.setconfig('eol', k, v, 'eol')

    def checkrev(self, repo, ctx, files):
        failed = []
        for f in (files or ctx.files()):
            if f not in ctx:
                continue
            for pattern, key, m in self.patterns:
                if not m(f):
                    continue
                target = self._encode[key]
                data = ctx[f].data()
                if (target == "to-lf" and "\r\n" in data
                    or target == "to-crlf" and singlelf.search(data)):
                    failed.append((f, target, str(ctx)))
                break
        return failed
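
The "first match wins" rule used by `checkrev` (and described for `[patterns]` in the module docstring) can be illustrated on its own. This is only a sketch, assuming Python 3: `fnmatch` stands in for Mercurial's matcher, whose glob semantics differ in detail, and `PATTERNS` and `styleof` are hypothetical names.

```python
from fnmatch import fnmatchcase

# Ordered (pattern, style) pairs, as they might appear in [patterns].
# The '.hg*' entry comes first, like the built-in rule in eolfile.__init__.
PATTERNS = [
    ('.hg*', 'BIN'),
    ('**.vcproj', 'CRLF'),
    ('Makefile', 'LF'),
    ('**', 'native'),
]


def styleof(path, patterns=PATTERNS):
    """Return the upper-cased style of the first pattern matching path."""
    for pattern, style in patterns:
        if fnmatchcase(path, pattern):
            return style.upper()
    # No pattern applies: the file is left alone.
    return None


if __name__ == '__main__':
    for name in ('.hgeol', 'proj/app.vcproj', 'Makefile', 'src/main.py'):
        print(name, styleof(name))
```

Because the catch-all `**` entry is last, the more specific entries above it take precedence, which is why the docstring advises putting specific patterns first.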

def parseeol(ui, repo, nodes):
    try:
        for node in nodes:
            try:
                if node is None:
                    # Cannot use workingctx.data() since it would load
                    # and cache the filters before we configure them.
                    data = repo.wvfs('.hgeol').read()
                else:
                    data = repo[node]['.hgeol'].data()
                return eolfile(ui, repo.root, data)
            except (IOError, LookupError):
                pass
    except error.ParseError as inst:
        ui.warn(_("warning: ignoring .hgeol file due to parse error "
                  "at %s: %s\n") % (inst.args[1], inst.args[0]))
    return None

def _checkhook(ui, repo, node, headsonly):
    # Get revisions to check and touched files at the same time
    files = set()
    revs = set()
    for rev in xrange(repo[node].rev(), len(repo)):
        revs.add(rev)
        if headsonly:
            ctx = repo[rev]
            files.update(ctx.files())
            for pctx in ctx.parents():
                revs.discard(pctx.rev())
    failed = []
    for rev in revs:
        ctx = repo[rev]
        eol = parseeol(ui, repo, [ctx.node()])
        if eol:
            failed.extend(eol.checkrev(repo, ctx, files))

    if failed:
        eols = {'to-lf': 'CRLF', 'to-crlf': 'LF'}
        msgs = []
        for f, target, node in sorted(failed):
            msgs.append(_(" %s in %s should not have %s line endings") %
                        (f, node, eols[target]))
        raise error.Abort(_("end-of-line check failed:\n") + "\n".join(msgs))
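
The way `_checkhook` narrows the incoming revisions down to heads can be sketched without a repository. The sketch below assumes Python 3 and models history as a plain list in which `parents[rev]` holds the parent revision numbers of `rev`; `revstocheck` is a hypothetical name. Like the hook, it relies on parents always having lower revision numbers than their children.

```python
def revstocheck(parents, firstrev, headsonly):
    """Return the set of revisions a check would visit.

    parents[rev] is the list of parent revision numbers of rev.
    firstrev is the first incoming revision.
    """
    revs = set()
    for rev in range(firstrev, len(parents)):
        revs.add(rev)
        if headsonly:
            # A revision that has a child among the visited revisions
            # is not a head, so drop it again.
            for p in parents[rev]:
                revs.discard(p)
    return revs


if __name__ == '__main__':
    # 0 <- 1 <- 2, 1 <- 3, and 4 merges 2 and 3.
    history = [[], [0], [1], [1], [2, 3]]
    print(sorted(revstocheck(history, 1, headsonly=True)))
    print(sorted(revstocheck(history, 1, headsonly=False)))
```

With `headsonly=True` only the merge revision remains, which matches the docstring's warning that `eol.checkheadshook` lets intermediate invalid revisions through.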

def checkallhook(ui, repo, node, hooktype, **kwargs):
    """verify that files have expected EOLs"""
    _checkhook(ui, repo, node, False)

def checkheadshook(ui, repo, node, hooktype, **kwargs):
    """verify that files have expected EOLs"""
    _checkhook(ui, repo, node, True)

# "checkheadshook" used to be called "hook"
hook = checkheadshook

def preupdate(ui, repo, hooktype, parent1, parent2):
    repo.loadeol([parent1])
    return False

def uisetup(ui):
    ui.setconfig('hooks', 'preupdate.eol', preupdate, 'eol')

def extsetup(ui):
    try:
        extensions.find('win32text')
        ui.warn(_("the eol extension is incompatible with the "
                  "win32text extension\n"))
    except KeyError:
        pass

def reposetup(ui, repo):
    uisetup(repo.ui)

    if not repo.local():
        return
    for name, fn in filters.iteritems():
        repo.adddatafilter(name, fn)

    ui.setconfig('patch', 'eol', 'auto', 'eol')

    class eolrepo(repo.__class__):

        def loadeol(self, nodes):
            eol = parseeol(self.ui, self, nodes)
            if eol is None:
                return None
            eol.copytoui(self.ui)
            return eol.match

        def _hgcleardirstate(self):
            self._eolmatch = self.loadeol([None, 'tip'])
            if not self._eolmatch:
                self._eolmatch = util.never
                return

            oldeol = None
            try:
                cachemtime = os.path.getmtime(self.vfs.join("eol.cache"))
            except OSError:
                cachemtime = 0
            else:
                olddata = self.vfs.read("eol.cache")
                if olddata:
                    oldeol = eolfile(self.ui, self.root, olddata)

            try:
                eolmtime = os.path.getmtime(self.wjoin(".hgeol"))
            except OSError:
                eolmtime = 0

            if eolmtime > cachemtime:
                self.ui.debug("eol: detected change in .hgeol\n")

                hgeoldata = self.wvfs.read('.hgeol')
                neweol = eolfile(self.ui, self.root, hgeoldata)

                wlock = None
                try:
                    wlock = self.wlock()
                    for f in self.dirstate:
                        if self.dirstate[f] != 'n':
                            continue
                        if oldeol is not None:
                            if not oldeol.match(f) and not neweol.match(f):
                                continue
                            oldkey = None
                            for pattern, key, m in oldeol.patterns:
                                if m(f):
                                    oldkey = key
                                    break
                            newkey = None
                            for pattern, key, m in neweol.patterns:
                                if m(f):
                                    newkey = key
                                    break
                            if oldkey == newkey:
                                continue
                        # all normal files need to be looked at again since
                        # the new .hgeol file specifies a different filter
                        self.dirstate.normallookup(f)
                    # Write the cache to update mtime and cache .hgeol
                    with self.vfs("eol.cache", "w") as f:
                        f.write(hgeoldata)
                except error.LockUnavailable:
                    # If we cannot lock the repository and clear the
                    # dirstate, then a commit might not see all files
                    # as modified. But if we cannot lock the
                    # repository, then we can also not make a commit,
                    # so ignore the error.
                    pass
                finally:
                    if wlock is not None:
                        wlock.release()

        def commitctx(self, ctx, haserror=False):
            for f in sorted(ctx.added() + ctx.modified()):
                if not self._eolmatch(f):
                    continue
                fctx = ctx[f]
                if fctx is None:
                    continue
                data = fctx.data()
                if util.binary(data):
                    # We should not abort here, since the user should
                    # be able to say "** = native" to automatically
                    # have all non-binary files taken care of.
                    continue
                if inconsistenteol(data):
                    raise error.Abort(_("inconsistent newline style "
                                       "in %s\n") % f)
            return super(eolrepo, self).commitctx(ctx, haserror)
    repo.__class__ = eolrepo
    repo._hgcleardirstate()
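
`reposetup` extends one particular repository object by subclassing its current class at runtime and then reassigning `repo.__class__`. The following is a minimal sketch of that technique in isolation, assuming Python 3; `Repo`, `checkedrepo`, and the `.bad` suffix rule are hypothetical and have nothing to do with Mercurial's real classes.

```python
class Repo(object):
    """Hypothetical stand-in for a repository object."""

    def __init__(self):
        self.committed = []

    def commitctx(self, files):
        self.committed.extend(files)
        return len(self.committed)


def reposetup(repo):
    # Subclass whatever class the object has right now, so wrappers
    # installed earlier by other code stay in the inheritance chain.
    class checkedrepo(repo.__class__):
        def commitctx(self, files):
            for f in files:
                if f.endswith('.bad'):
                    raise ValueError('refusing to commit %s' % f)
            return super(checkedrepo, self).commitctx(files)

    # Only this instance changes class; other Repo objects are untouched.
    repo.__class__ = checkedrepo


if __name__ == '__main__':
    repo = Repo()
    reposetup(repo)
    print(repo.commitctx(['a.txt']))
```

Subclassing `repo.__class__` rather than a fixed base class is what lets several such wrappers stack, each calling the previous one through `super()`.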