eol: extract parsing error handling in parseeol()
Patrick Mezard -
r13614:40d0cf79 default
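
The control flow this changeset moves into `parseeol()` can be sketched in isolation as follows. This is only an illustration under stated assumptions, not Mercurial code: `ParseError`, `parse_first`, `loaders`, `parse` and `warn` are hypothetical stand-ins for `error.ParseError`, `parseeol()`, the list of nodes, the `eolfile` constructor and `ui.warn`. The first source that can be read wins, a missing source is skipped, and a parse error is reported once in a single place.

```python
class ParseError(Exception):
    """Hypothetical stand-in for mercurial.error.ParseError."""


def parse_first(loaders, parse, warn):
    """Return parse(data) for the first loader that yields data.

    A loader raising IOError or LookupError is skipped. A ParseError
    raised by parse() is reported through warn() and ends the search.
    """
    try:
        for load in loaders:
            try:
                return parse(load())
            except (IOError, LookupError):
                continue
    except ParseError as inst:
        warn("warning: ignoring .hgeol file due to parse error: %s\n" % inst)
    return None


def missing():
    # Simulates a working directory without a .hgeol file.
    raise IOError("no .hgeol in working directory")


def from_tip():
    # Simulates reading .hgeol from the tip revision.
    return "[patterns]\n** = native\n"


def broken_parse(data):
    # Simulates a syntactically invalid .hgeol file.
    raise ParseError("unexpected line")


warnings = []
result = parse_first([missing, from_tip], str.upper, warnings.append)
failed = parse_first([from_tip], broken_parse, warnings.append)
```

With this shape, callers no longer need their own `try`/`except` around the loading call, which is the duplication the diff removes from `preupdate()` and `_hgcleardirstate()`.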
@@ -1,303 +1,299 @@
 """automatically manage newlines in repository files

 This extension allows you to manage the type of line endings (CRLF or
 LF) that are used in the repository and in the local working
 directory. That way you can get CRLF line endings on Windows and LF on
 Unix/Mac, thereby letting everybody use their OS native line endings.

 The extension reads its configuration from a versioned ``.hgeol``
 configuration file found in the root of the working copy. The
 ``.hgeol`` file use the same syntax as all other Mercurial
 configuration files. It uses two sections, ``[patterns]`` and
 ``[repository]``.

 The ``[patterns]`` section specifies how line endings should be
 converted between the working copy and the repository. The format is
 specified by a file pattern. The first match is used, so put more
 specific patterns first. The available line endings are ``LF``,
 ``CRLF``, and ``BIN``.

 Files with the declared format of ``CRLF`` or ``LF`` are always
 checked out and stored in the repository in that format and files
 declared to be binary (``BIN``) are left unchanged. Additionally,
 ``native`` is an alias for checking out in the platform's default line
 ending: ``LF`` on Unix (including Mac OS X) and ``CRLF`` on
 Windows. Note that ``BIN`` (do nothing to line endings) is Mercurial's
 default behaviour; it is only needed if you need to override a later,
 more general pattern.

 The optional ``[repository]`` section specifies the line endings to
 use for files stored in the repository. It has a single setting,
 ``native``, which determines the storage line endings for files
 declared as ``native`` in the ``[patterns]`` section. It can be set to
 ``LF`` or ``CRLF``. The default is ``LF``. For example, this means
 that on Windows, files configured as ``native`` (``CRLF`` by default)
 will be converted to ``LF`` when stored in the repository. Files
 declared as ``LF``, ``CRLF``, or ``BIN`` in the ``[patterns]`` section
 are always stored as-is in the repository.

 Example versioned ``.hgeol`` file::

   [patterns]
   **.py = native
   **.vcproj = CRLF
   **.txt = native
   Makefile = LF
   **.jpg = BIN

   [repository]
   native = LF

 .. note::
    The rules will first apply when files are touched in the working
    copy, e.g. by updating to null and back to tip to touch all files.

 The extension uses an optional ``[eol]`` section in your hgrc file
 (not the ``.hgeol`` file) for settings that control the overall
 behavior. There are two settings:

 - ``eol.native`` (default ``os.linesep``) can be set to ``LF`` or
   ``CRLF`` to override the default interpretation of ``native`` for
   checkout. This can be used with :hg:`archive` on Unix, say, to
   generate an archive where files have line endings for Windows.

 - ``eol.only-consistent`` (default True) can be set to False to make
   the extension convert files with inconsistent EOLs. Inconsistent
   means that there is both ``CRLF`` and ``LF`` present in the file.
   Such files are normally not touched under the assumption that they
   have mixed EOLs on purpose.

 The extension provides ``cleverencode:`` and ``cleverdecode:`` filters
 like the deprecated win32text extension does. This means that you can
 disable win32text and enable eol and your filters will still work. You
 only need to these filters until you have prepared a ``.hgeol`` file.

 The ``win32text.forbid*`` hooks provided by the win32text extension
 have been unified into a single hook named ``eol.hook``. The hook will
 lookup the expected line endings from the ``.hgeol`` file, which means
 you must migrate to a ``.hgeol`` file first before using the hook.
 Remember to enable the eol extension in the repository where you
 install the hook.

 See :hg:`help patterns` for more information about the glob patterns
 used.
 """

 from mercurial.i18n import _
 from mercurial import util, config, extensions, match, error
 import re, os

 # Matches a lone LF, i.e., one that is not part of CRLF.
 singlelf = re.compile('(^|[^\r])\n')
 # Matches a single EOL which can either be a CRLF where repeated CR
 # are removed or a LF. We do not care about old Machintosh files, so a
 # stray CR is an error.
 eolre = re.compile('\r*\n')


 def inconsistenteol(data):
     return '\r\n' in data and singlelf.search(data)

 def tolf(s, params, ui, **kwargs):
     """Filter to convert to LF EOLs."""
     if util.binary(s):
         return s
     if ui.configbool('eol', 'only-consistent', True) and inconsistenteol(s):
         return s
     return eolre.sub('\n', s)

 def tocrlf(s, params, ui, **kwargs):
     """Filter to convert to CRLF EOLs."""
     if util.binary(s):
         return s
     if ui.configbool('eol', 'only-consistent', True) and inconsistenteol(s):
         return s
     return eolre.sub('\r\n', s)

 def isbinary(s, params):
     """Filter to do nothing with the file."""
     return s

 filters = {
     'to-lf': tolf,
     'to-crlf': tocrlf,
     'is-binary': isbinary,
     # The following provide backwards compatibility with win32text
     'cleverencode:': tolf,
     'cleverdecode:': tocrlf
 }

 class eolfile(object):
     def __init__(self, ui, root, data):
         self._decode = {'LF': 'to-lf', 'CRLF': 'to-crlf', 'BIN': 'is-binary'}
         self._encode = {'LF': 'to-lf', 'CRLF': 'to-crlf', 'BIN': 'is-binary'}

         self.cfg = config.config()
         # Our files should not be touched. The pattern must be
         # inserted first override a '** = native' pattern.
         self.cfg.set('patterns', '.hg*', 'BIN')
         # We can then parse the user's patterns.
         self.cfg.parse('.hgeol', data)

         isrepolf = self.cfg.get('repository', 'native') != 'CRLF'
         self._encode['NATIVE'] = isrepolf and 'to-lf' or 'to-crlf'
         iswdlf = ui.config('eol', 'native', os.linesep) in ('LF', '\n')
         self._decode['NATIVE'] = iswdlf and 'to-lf' or 'to-crlf'

         include = []
         exclude = []
         for pattern, style in self.cfg.items('patterns'):
             key = style.upper()
             if key == 'BIN':
                 exclude.append(pattern)
             else:
                 include.append(pattern)
         # This will match the files for which we need to care
         # about inconsistent newlines.
         self.match = match.match(root, '', [], include, exclude)

     def setfilters(self, ui):
         for pattern, style in self.cfg.items('patterns'):
             key = style.upper()
             try:
                 ui.setconfig('decode', pattern, self._decode[key])
                 ui.setconfig('encode', pattern, self._encode[key])
             except KeyError:
                 ui.warn(_("ignoring unknown EOL style '%s' from %s\n")
                         % (style, self.cfg.source('patterns', pattern)))

-def parseeol(ui, repo, node=None):
+def parseeol(ui, repo, nodes):
     try:
-        if node is None:
-            # Cannot use workingctx.data() since it would load
-            # and cache the filters before we configure them.
-            data = repo.wfile('.hgeol').read()
-        else:
-            data = repo[node]['.hgeol'].data()
-        return eolfile(ui, repo.root, data)
-    except (IOError, LookupError):
-        return None
+        for node in nodes:
+            try:
+                if node is None:
+                    # Cannot use workingctx.data() since it would load
+                    # and cache the filters before we configure them.
+                    data = repo.wfile('.hgeol').read()
+                else:
+                    data = repo[node]['.hgeol'].data()
+                return eolfile(ui, repo.root, data)
+            except (IOError, LookupError):
+                pass
+    except error.ParseError, inst:
+        ui.warn(_("warning: ignoring .hgeol file due to parse error "
+                  "at %s: %s\n") % (inst.args[1], inst.args[0]))
+    return None

 def hook(ui, repo, node, hooktype, **kwargs):
     """verify that files have expected EOLs"""
     files = set()
     for rev in xrange(repo[node].rev(), len(repo)):
         files.update(repo[rev].files())
     tip = repo['tip']
     for f in files:
         if f not in tip:
             continue
         for pattern, target in ui.configitems('encode'):
             if match.match(repo.root, '', [pattern])(f):
                 data = tip[f].data()
                 if target == "to-lf" and "\r\n" in data:
                     raise util.Abort(_("%s should not have CRLF line endings")
                                      % f)
                 elif target == "to-crlf" and singlelf.search(data):
                     raise util.Abort(_("%s should not have LF line endings")
                                      % f)
                 # Ignore other rules for this file
                 break


 def preupdate(ui, repo, hooktype, parent1, parent2):
     #print "preupdate for %s: %s -> %s" % (repo.root, parent1, parent2)
-    try:
-        repo.loadeol(parent1)
-    except error.ParseError, inst:
-        ui.warn(_("warning: ignoring .hgeol file due to parse error "
-                  "at %s: %s\n") % (inst.args[1], inst.args[0]))
+    repo.loadeol([parent1])
     return False

 def uisetup(ui):
     ui.setconfig('hooks', 'preupdate.eol', preupdate)

 def extsetup(ui):
     try:
         extensions.find('win32text')
         raise util.Abort(_("the eol extension is incompatible with the "
                            "win32text extension"))
     except KeyError:
         pass


 def reposetup(ui, repo):
     uisetup(repo.ui)
     #print "reposetup for", repo.root

     if not repo.local():
         return
     for name, fn in filters.iteritems():
         repo.adddatafilter(name, fn)

     ui.setconfig('patch', 'eol', 'auto')

     class eolrepo(repo.__class__):

-        def loadeol(self, node=None):
-            eol = parseeol(self.ui, self, node)
+        def loadeol(self, nodes):
+            eol = parseeol(self.ui, self, nodes)
             if eol is None:
                 return None
             eol.setfilters(self.ui)
             return eol.match

         def _hgcleardirstate(self):
-            try:
-                self._eolfile = (self.loadeol() or self.loadeol('tip'))
-            except error.ParseError, inst:
-                ui.warn(_("warning: ignoring .hgeol file due to parse error "
-                          "at %s: %s\n") % (inst.args[1], inst.args[0]))
-                self._eolfile = None
-
+            self._eolfile = self.loadeol([None, 'tip'])
             if not self._eolfile:
                 self._eolfile = util.never
                 return

             try:
                 cachemtime = os.path.getmtime(self.join("eol.cache"))
             except OSError:
                 cachemtime = 0

             try:
                 eolmtime = os.path.getmtime(self.wjoin(".hgeol"))
             except OSError:
                 eolmtime = 0

             if eolmtime > cachemtime:
                 ui.debug("eol: detected change in .hgeol\n")
                 wlock = None
                 try:
                     wlock = self.wlock()
                     for f in self.dirstate:
                         if self.dirstate[f] == 'n':
                             # all normal files need to be looked at
                             # again since the new .hgeol file might no
                             # longer match a file it matched before
                             self.dirstate.normallookup(f)
                     # Touch the cache to update mtime.
                     self.opener("eol.cache", "w").close()
                     wlock.release()
                 except error.LockUnavailable:
                     # If we cannot lock the repository and clear the
                     # dirstate, then a commit might not see all files
                     # as modified. But if we cannot lock the
                     # repository, then we can also not make a commit,
                     # so ignore the error.
                     pass

         def commitctx(self, ctx, error=False):
             for f in sorted(ctx.added() + ctx.modified()):
                 if not self._eolfile(f):
                     continue
                 data = ctx[f].data()
                 if util.binary(data):
                     # We should not abort here, since the user should
                     # be able to say "** = native" to automatically
                     # have all non-binary files taken care of.
                     continue
                 if inconsistenteol(data):
                     raise util.Abort(_("inconsistent newline style "
                                        "in %s\n" % f))
             return super(eolrepo, self).commitctx(ctx, error)
     repo.__class__ = eolrepo
     repo._hgcleardirstate()
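
The line-ending helpers that appear as unchanged context in this diff (`singlelf`, `eolre`, `inconsistenteol`, `tolf`, `tocrlf`) can be tried outside Mercurial with a small Python 3 sketch. It makes several assumptions that differ from the extension: it works on `str` rather than Python 2 byte strings, the `only_consistent` keyword argument stands in for the `ui.configbool('eol', 'only-consistent', True)` lookup, and the `util.binary()` check is left out.

```python
import re

# Same patterns as in the diff, written as raw strings.
# A lone LF: one at the start of the data or not preceded by CR.
singlelf = re.compile(r'(^|[^\r])\n')
# One EOL: an LF, optionally preceded by any number of CRs.
eolre = re.compile(r'\r*\n')


def inconsistenteol(data):
    """Return True if data contains both CRLF and lone LF line endings."""
    return '\r\n' in data and singlelf.search(data) is not None


def tolf(data, only_consistent=True):
    """Convert every EOL to LF, leaving mixed files alone by default."""
    if only_consistent and inconsistenteol(data):
        return data
    return eolre.sub('\n', data)


def tocrlf(data, only_consistent=True):
    """Convert every EOL to CRLF, leaving mixed files alone by default."""
    if only_consistent and inconsistenteol(data):
        return data
    return eolre.sub('\r\n', data)


unix_text = "a\nb\n"
dos_text = "a\r\nb\r\n"
mixed_text = "a\r\nb\n"

converted_to_lf = tolf(dos_text)
converted_to_crlf = tocrlf(unix_text)
untouched = tolf(mixed_text)
forced = tolf(mixed_text, only_consistent=False)
```

Because `eolre` swallows repeated CRs, converting text that is already CRLF with `tocrlf` leaves it unchanged rather than doubling the CR.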