eol: explain why reading .hgeol from the working dir is special
Patrick Mezard -
r13611:358924b0 default
@@ -1,298 +1,300 b''
1 """automatically manage newlines in repository files
1 """automatically manage newlines in repository files
2
2
3 This extension allows you to manage the type of line endings (CRLF or
3 This extension allows you to manage the type of line endings (CRLF or
4 LF) that are used in the repository and in the local working
4 LF) that are used in the repository and in the local working
5 directory. That way you can get CRLF line endings on Windows and LF on
5 directory. That way you can get CRLF line endings on Windows and LF on
6 Unix/Mac, thereby letting everybody use their OS native line endings.
6 Unix/Mac, thereby letting everybody use their OS native line endings.
7
7
8 The extension reads its configuration from a versioned ``.hgeol``
8 The extension reads its configuration from a versioned ``.hgeol``
9 configuration file found in the root of the working copy. The
9 configuration file found in the root of the working copy. The
10 ``.hgeol`` file uses the same syntax as all other Mercurial
10 ``.hgeol`` file uses the same syntax as all other Mercurial
11 configuration files. It uses two sections, ``[patterns]`` and
11 configuration files. It uses two sections, ``[patterns]`` and
12 ``[repository]``.
12 ``[repository]``.
13
13
14 The ``[patterns]`` section specifies how line endings should be
14 The ``[patterns]`` section specifies how line endings should be
15 converted between the working copy and the repository. The format is
15 converted between the working copy and the repository. The format is
16 specified by a file pattern. The first match is used, so put more
16 specified by a file pattern. The first match is used, so put more
17 specific patterns first. The available line endings are ``LF``,
17 specific patterns first. The available line endings are ``LF``,
18 ``CRLF``, and ``BIN``.
18 ``CRLF``, and ``BIN``.
19
19
20 Files with the declared format of ``CRLF`` or ``LF`` are always
20 Files with the declared format of ``CRLF`` or ``LF`` are always
21 checked out and stored in the repository in that format and files
21 checked out and stored in the repository in that format and files
22 declared to be binary (``BIN``) are left unchanged. Additionally,
22 declared to be binary (``BIN``) are left unchanged. Additionally,
23 ``native`` is an alias for checking out in the platform's default line
23 ``native`` is an alias for checking out in the platform's default line
24 ending: ``LF`` on Unix (including Mac OS X) and ``CRLF`` on
24 ending: ``LF`` on Unix (including Mac OS X) and ``CRLF`` on
25 Windows. Note that ``BIN`` (do nothing to line endings) is Mercurial's
25 Windows. Note that ``BIN`` (do nothing to line endings) is Mercurial's
26 default behaviour; it is only needed if you need to override a later,
26 default behaviour; it is only needed if you need to override a later,
27 more general pattern.
27 more general pattern.
28
28
29 The optional ``[repository]`` section specifies the line endings to
29 The optional ``[repository]`` section specifies the line endings to
30 use for files stored in the repository. It has a single setting,
30 use for files stored in the repository. It has a single setting,
31 ``native``, which determines the storage line endings for files
31 ``native``, which determines the storage line endings for files
32 declared as ``native`` in the ``[patterns]`` section. It can be set to
32 declared as ``native`` in the ``[patterns]`` section. It can be set to
33 ``LF`` or ``CRLF``. The default is ``LF``. For example, this means
33 ``LF`` or ``CRLF``. The default is ``LF``. For example, this means
34 that on Windows, files configured as ``native`` (``CRLF`` by default)
34 that on Windows, files configured as ``native`` (``CRLF`` by default)
35 will be converted to ``LF`` when stored in the repository. Files
35 will be converted to ``LF`` when stored in the repository. Files
36 declared as ``LF``, ``CRLF``, or ``BIN`` in the ``[patterns]`` section
36 declared as ``LF``, ``CRLF``, or ``BIN`` in the ``[patterns]`` section
37 are always stored as-is in the repository.
37 are always stored as-is in the repository.
38
38
39 Example versioned ``.hgeol`` file::
39 Example versioned ``.hgeol`` file::
40
40
41 [patterns]
41 [patterns]
42 **.py = native
42 **.py = native
43 **.vcproj = CRLF
43 **.vcproj = CRLF
44 **.txt = native
44 **.txt = native
45 Makefile = LF
45 Makefile = LF
46 **.jpg = BIN
46 **.jpg = BIN
47
47
48 [repository]
48 [repository]
49 native = LF
49 native = LF
50
50
51 .. note::
51 .. note::
52 The rules will first apply when files are touched in the working
52 The rules will first apply when files are touched in the working
53 copy, e.g. by updating to null and back to tip to touch all files.
53 copy, e.g. by updating to null and back to tip to touch all files.
54
54
55 The extension uses an optional ``[eol]`` section in your hgrc file
55 The extension uses an optional ``[eol]`` section in your hgrc file
56 (not the ``.hgeol`` file) for settings that control the overall
56 (not the ``.hgeol`` file) for settings that control the overall
57 behavior. There are two settings:
57 behavior. There are two settings:
58
58
59 - ``eol.native`` (default ``os.linesep``) can be set to ``LF`` or
59 - ``eol.native`` (default ``os.linesep``) can be set to ``LF`` or
60 ``CRLF`` to override the default interpretation of ``native`` for
60 ``CRLF`` to override the default interpretation of ``native`` for
61 checkout. This can be used with :hg:`archive` on Unix, say, to
61 checkout. This can be used with :hg:`archive` on Unix, say, to
62 generate an archive where files have line endings for Windows.
62 generate an archive where files have line endings for Windows.
63
63
64 - ``eol.only-consistent`` (default True) can be set to False to make
64 - ``eol.only-consistent`` (default True) can be set to False to make
65 the extension convert files with inconsistent EOLs. Inconsistent
65 the extension convert files with inconsistent EOLs. Inconsistent
66 means that there is both ``CRLF`` and ``LF`` present in the file.
66 means that there is both ``CRLF`` and ``LF`` present in the file.
67 Such files are normally not touched under the assumption that they
67 Such files are normally not touched under the assumption that they
68 have mixed EOLs on purpose.
68 have mixed EOLs on purpose.
69
69
70 The extension provides ``cleverencode:`` and ``cleverdecode:`` filters
70 The extension provides ``cleverencode:`` and ``cleverdecode:`` filters
71 like the deprecated win32text extension does. This means that you can
71 like the deprecated win32text extension does. This means that you can
72 disable win32text and enable eol and your filters will still work. You
72 disable win32text and enable eol and your filters will still work. You
73 only need these filters until you have prepared a ``.hgeol`` file.
73 only need these filters until you have prepared a ``.hgeol`` file.
74
74
75 The ``win32text.forbid*`` hooks provided by the win32text extension
75 The ``win32text.forbid*`` hooks provided by the win32text extension
76 have been unified into a single hook named ``eol.hook``. The hook will
76 have been unified into a single hook named ``eol.hook``. The hook will
77 look up the expected line endings from the ``.hgeol`` file, which means
77 look up the expected line endings from the ``.hgeol`` file, which means
78 you must migrate to a ``.hgeol`` file first before using the hook.
78 you must migrate to a ``.hgeol`` file first before using the hook.
79 Remember to enable the eol extension in the repository where you
79 Remember to enable the eol extension in the repository where you
80 install the hook.
80 install the hook.
81
81
82 See :hg:`help patterns` for more information about the glob patterns
82 See :hg:`help patterns` for more information about the glob patterns
83 used.
83 used.
84 """
84 """
85
85
86 from mercurial.i18n import _
86 from mercurial.i18n import _
87 from mercurial import util, config, extensions, match, error
87 from mercurial import util, config, extensions, match, error
88 import re, os
88 import re, os
89
89
90 # Matches a lone LF, i.e., one that is not part of CRLF.
90 # Matches a lone LF, i.e., one that is not part of CRLF.
91 singlelf = re.compile('(^|[^\r])\n')
91 singlelf = re.compile('(^|[^\r])\n')
92 # Matches a single EOL which can either be a CRLF where repeated CR
92 # Matches a single EOL which can either be a CRLF where repeated CR
93 # are removed or a LF. We do not care about old Macintosh files, so a
93 # are removed or a LF. We do not care about old Macintosh files, so a
94 # stray CR is an error.
94 # stray CR is an error.
95 eolre = re.compile('\r*\n')
95 eolre = re.compile('\r*\n')
96
96
97
97
98 def inconsistenteol(data):
98 def inconsistenteol(data):
99 return '\r\n' in data and singlelf.search(data)
99 return '\r\n' in data and singlelf.search(data)
100
100
101 def tolf(s, params, ui, **kwargs):
101 def tolf(s, params, ui, **kwargs):
102 """Filter to convert to LF EOLs."""
102 """Filter to convert to LF EOLs."""
103 if util.binary(s):
103 if util.binary(s):
104 return s
104 return s
105 if ui.configbool('eol', 'only-consistent', True) and inconsistenteol(s):
105 if ui.configbool('eol', 'only-consistent', True) and inconsistenteol(s):
106 return s
106 return s
107 return eolre.sub('\n', s)
107 return eolre.sub('\n', s)
108
108
109 def tocrlf(s, params, ui, **kwargs):
109 def tocrlf(s, params, ui, **kwargs):
110 """Filter to convert to CRLF EOLs."""
110 """Filter to convert to CRLF EOLs."""
111 if util.binary(s):
111 if util.binary(s):
112 return s
112 return s
113 if ui.configbool('eol', 'only-consistent', True) and inconsistenteol(s):
113 if ui.configbool('eol', 'only-consistent', True) and inconsistenteol(s):
114 return s
114 return s
115 return eolre.sub('\r\n', s)
115 return eolre.sub('\r\n', s)
116
116
117 def isbinary(s, params):
117 def isbinary(s, params):
118 """Filter to do nothing with the file."""
118 """Filter to do nothing with the file."""
119 return s
119 return s
120
120
121 filters = {
121 filters = {
122 'to-lf': tolf,
122 'to-lf': tolf,
123 'to-crlf': tocrlf,
123 'to-crlf': tocrlf,
124 'is-binary': isbinary,
124 'is-binary': isbinary,
125 # The following provide backwards compatibility with win32text
125 # The following provide backwards compatibility with win32text
126 'cleverencode:': tolf,
126 'cleverencode:': tolf,
127 'cleverdecode:': tocrlf
127 'cleverdecode:': tocrlf
128 }
128 }
129
129
130
130
131 def hook(ui, repo, node, hooktype, **kwargs):
131 def hook(ui, repo, node, hooktype, **kwargs):
132 """verify that files have expected EOLs"""
132 """verify that files have expected EOLs"""
133 files = set()
133 files = set()
134 for rev in xrange(repo[node].rev(), len(repo)):
134 for rev in xrange(repo[node].rev(), len(repo)):
135 files.update(repo[rev].files())
135 files.update(repo[rev].files())
136 tip = repo['tip']
136 tip = repo['tip']
137 for f in files:
137 for f in files:
138 if f not in tip:
138 if f not in tip:
139 continue
139 continue
140 for pattern, target in ui.configitems('encode'):
140 for pattern, target in ui.configitems('encode'):
141 if match.match(repo.root, '', [pattern])(f):
141 if match.match(repo.root, '', [pattern])(f):
142 data = tip[f].data()
142 data = tip[f].data()
143 if target == "to-lf" and "\r\n" in data:
143 if target == "to-lf" and "\r\n" in data:
144 raise util.Abort(_("%s should not have CRLF line endings")
144 raise util.Abort(_("%s should not have CRLF line endings")
145 % f)
145 % f)
146 elif target == "to-crlf" and singlelf.search(data):
146 elif target == "to-crlf" and singlelf.search(data):
147 raise util.Abort(_("%s should not have LF line endings")
147 raise util.Abort(_("%s should not have LF line endings")
148 % f)
148 % f)
149 # Ignore other rules for this file
149 # Ignore other rules for this file
150 break
150 break
151
151
152
152
153 def preupdate(ui, repo, hooktype, parent1, parent2):
153 def preupdate(ui, repo, hooktype, parent1, parent2):
154 #print "preupdate for %s: %s -> %s" % (repo.root, parent1, parent2)
154 #print "preupdate for %s: %s -> %s" % (repo.root, parent1, parent2)
155 try:
155 try:
156 repo.readhgeol(parent1)
156 repo.readhgeol(parent1)
157 except error.ParseError, inst:
157 except error.ParseError, inst:
158 ui.warn(_("warning: ignoring .hgeol file due to parse error "
158 ui.warn(_("warning: ignoring .hgeol file due to parse error "
159 "at %s: %s\n") % (inst.args[1], inst.args[0]))
159 "at %s: %s\n") % (inst.args[1], inst.args[0]))
160 return False
160 return False
161
161
162 def uisetup(ui):
162 def uisetup(ui):
163 ui.setconfig('hooks', 'preupdate.eol', preupdate)
163 ui.setconfig('hooks', 'preupdate.eol', preupdate)
164
164
165 def extsetup(ui):
165 def extsetup(ui):
166 try:
166 try:
167 extensions.find('win32text')
167 extensions.find('win32text')
168 raise util.Abort(_("the eol extension is incompatible with the "
168 raise util.Abort(_("the eol extension is incompatible with the "
169 "win32text extension"))
169 "win32text extension"))
170 except KeyError:
170 except KeyError:
171 pass
171 pass
172
172
173
173
174 def reposetup(ui, repo):
174 def reposetup(ui, repo):
175 uisetup(repo.ui)
175 uisetup(repo.ui)
176 #print "reposetup for", repo.root
176 #print "reposetup for", repo.root
177
177
178 if not repo.local():
178 if not repo.local():
179 return
179 return
180 for name, fn in filters.iteritems():
180 for name, fn in filters.iteritems():
181 repo.adddatafilter(name, fn)
181 repo.adddatafilter(name, fn)
182
182
183 ui.setconfig('patch', 'eol', 'auto')
183 ui.setconfig('patch', 'eol', 'auto')
184
184
185 class eolrepo(repo.__class__):
185 class eolrepo(repo.__class__):
186
186
187 _decode = {'LF': 'to-lf', 'CRLF': 'to-crlf', 'BIN': 'is-binary'}
187 _decode = {'LF': 'to-lf', 'CRLF': 'to-crlf', 'BIN': 'is-binary'}
188 _encode = {'LF': 'to-lf', 'CRLF': 'to-crlf', 'BIN': 'is-binary'}
188 _encode = {'LF': 'to-lf', 'CRLF': 'to-crlf', 'BIN': 'is-binary'}
189
189
190 def readhgeol(self, node=None, data=None):
190 def readhgeol(self, node=None, data=None):
191 if data is None:
191 if data is None:
192 try:
192 try:
193 if node is None:
193 if node is None:
194 # Cannot use workingctx.data() since it would load
195 # and cache the filters before we configure them.
194 data = self.wfile('.hgeol').read()
196 data = self.wfile('.hgeol').read()
195 else:
197 else:
196 data = self[node]['.hgeol'].data()
198 data = self[node]['.hgeol'].data()
197 except (IOError, LookupError):
199 except (IOError, LookupError):
198 return None
200 return None
199
201
200 if self.ui.config('eol', 'native', os.linesep) in ('LF', '\n'):
202 if self.ui.config('eol', 'native', os.linesep) in ('LF', '\n'):
201 self._decode['NATIVE'] = 'to-lf'
203 self._decode['NATIVE'] = 'to-lf'
202 else:
204 else:
203 self._decode['NATIVE'] = 'to-crlf'
205 self._decode['NATIVE'] = 'to-crlf'
204
206
205 eol = config.config()
207 eol = config.config()
206 # Our files should not be touched. The pattern must be
208 # Our files should not be touched. The pattern must be
207 # inserted first to override a '** = native' pattern.
209 # inserted first to override a '** = native' pattern.
208 eol.set('patterns', '.hg*', 'BIN')
210 eol.set('patterns', '.hg*', 'BIN')
209 # We can then parse the user's patterns.
211 # We can then parse the user's patterns.
210 eol.parse('.hgeol', data)
212 eol.parse('.hgeol', data)
211
213
212 if eol.get('repository', 'native') == 'CRLF':
214 if eol.get('repository', 'native') == 'CRLF':
213 self._encode['NATIVE'] = 'to-crlf'
215 self._encode['NATIVE'] = 'to-crlf'
214 else:
216 else:
215 self._encode['NATIVE'] = 'to-lf'
217 self._encode['NATIVE'] = 'to-lf'
216
218
217 for pattern, style in eol.items('patterns'):
219 for pattern, style in eol.items('patterns'):
218 key = style.upper()
220 key = style.upper()
219 try:
221 try:
220 self.ui.setconfig('decode', pattern, self._decode[key])
222 self.ui.setconfig('decode', pattern, self._decode[key])
221 self.ui.setconfig('encode', pattern, self._encode[key])
223 self.ui.setconfig('encode', pattern, self._encode[key])
222 except KeyError:
224 except KeyError:
223 self.ui.warn(_("ignoring unknown EOL style '%s' from %s\n")
225 self.ui.warn(_("ignoring unknown EOL style '%s' from %s\n")
224 % (style, eol.source('patterns', pattern)))
226 % (style, eol.source('patterns', pattern)))
225
227
226 include = []
228 include = []
227 exclude = []
229 exclude = []
228 for pattern, style in eol.items('patterns'):
230 for pattern, style in eol.items('patterns'):
229 key = style.upper()
231 key = style.upper()
230 if key == 'BIN':
232 if key == 'BIN':
231 exclude.append(pattern)
233 exclude.append(pattern)
232 else:
234 else:
233 include.append(pattern)
235 include.append(pattern)
234
236
235 # This will match the files for which we need to care
237 # This will match the files for which we need to care
236 # about inconsistent newlines.
238 # about inconsistent newlines.
237 return match.match(self.root, '', [], include, exclude)
239 return match.match(self.root, '', [], include, exclude)
238
240
239 def _hgcleardirstate(self):
241 def _hgcleardirstate(self):
240 try:
242 try:
241 self._eolfile = self.readhgeol() or self.readhgeol('tip')
243 self._eolfile = self.readhgeol() or self.readhgeol('tip')
242 except error.ParseError, inst:
244 except error.ParseError, inst:
243 ui.warn(_("warning: ignoring .hgeol file due to parse error "
245 ui.warn(_("warning: ignoring .hgeol file due to parse error "
244 "at %s: %s\n") % (inst.args[1], inst.args[0]))
246 "at %s: %s\n") % (inst.args[1], inst.args[0]))
245 self._eolfile = None
247 self._eolfile = None
246
248
247 if not self._eolfile:
249 if not self._eolfile:
248 self._eolfile = util.never
250 self._eolfile = util.never
249 return
251 return
250
252
251 try:
253 try:
252 cachemtime = os.path.getmtime(self.join("eol.cache"))
254 cachemtime = os.path.getmtime(self.join("eol.cache"))
253 except OSError:
255 except OSError:
254 cachemtime = 0
256 cachemtime = 0
255
257
256 try:
258 try:
257 eolmtime = os.path.getmtime(self.wjoin(".hgeol"))
259 eolmtime = os.path.getmtime(self.wjoin(".hgeol"))
258 except OSError:
260 except OSError:
259 eolmtime = 0
261 eolmtime = 0
260
262
261 if eolmtime > cachemtime:
263 if eolmtime > cachemtime:
262 ui.debug("eol: detected change in .hgeol\n")
264 ui.debug("eol: detected change in .hgeol\n")
263 wlock = None
265 wlock = None
264 try:
266 try:
265 wlock = self.wlock()
267 wlock = self.wlock()
266 for f in self.dirstate:
268 for f in self.dirstate:
267 if self.dirstate[f] == 'n':
269 if self.dirstate[f] == 'n':
268 # all normal files need to be looked at
270 # all normal files need to be looked at
269 # again since the new .hgeol file might no
271 # again since the new .hgeol file might no
270 # longer match a file it matched before
272 # longer match a file it matched before
271 self.dirstate.normallookup(f)
273 self.dirstate.normallookup(f)
272 # Touch the cache to update mtime.
274 # Touch the cache to update mtime.
273 self.opener("eol.cache", "w").close()
275 self.opener("eol.cache", "w").close()
274 wlock.release()
276 wlock.release()
275 except error.LockUnavailable:
277 except error.LockUnavailable:
276 # If we cannot lock the repository and clear the
278 # If we cannot lock the repository and clear the
277 # dirstate, then a commit might not see all files
279 # dirstate, then a commit might not see all files
278 # as modified. But if we cannot lock the
280 # as modified. But if we cannot lock the
279 # repository, then we can also not make a commit,
281 # repository, then we can also not make a commit,
280 # so ignore the error.
282 # so ignore the error.
281 pass
283 pass
282
284
283 def commitctx(self, ctx, error=False):
285 def commitctx(self, ctx, error=False):
284 for f in sorted(ctx.added() + ctx.modified()):
286 for f in sorted(ctx.added() + ctx.modified()):
285 if not self._eolfile(f):
287 if not self._eolfile(f):
286 continue
288 continue
287 data = ctx[f].data()
289 data = ctx[f].data()
288 if util.binary(data):
290 if util.binary(data):
289 # We should not abort here, since the user should
291 # We should not abort here, since the user should
290 # be able to say "** = native" to automatically
292 # be able to say "** = native" to automatically
291 # have all non-binary files taken care of.
293 # have all non-binary files taken care of.
292 continue
294 continue
293 if inconsistenteol(data):
295 if inconsistenteol(data):
294 raise util.Abort(_("inconsistent newline style "
296 raise util.Abort(_("inconsistent newline style "
295 "in %s\n") % f)
297 "in %s\n") % f)
296 return super(eolrepo, self).commitctx(ctx, error)
298 return super(eolrepo, self).commitctx(ctx, error)
297 repo.__class__ = eolrepo
299 repo.__class__ = eolrepo
298 repo._hgcleardirstate()
300 repo._hgcleardirstate()
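
For readers who want to try the conversion rules outside Mercurial, here is a minimal standalone sketch of what the `tolf`/`tocrlf` filters and `inconsistenteol` do. It is written for Python 3 and operates on `str`, whereas the extension itself is Python 2 code working on byte strings. The names `convert`, `inconsistent_eol` and the `only_consistent` parameter are illustrative, not part of the extension, and the NUL check is only an assumed stand-in for `util.binary`.

```python
import re

# Same two patterns as in the extension.
# A lone LF: an LF at the start of the data or not preceded by CR.
SINGLE_LF = re.compile(r'(^|[^\r])\n')
# One EOL: an LF preceded by any number of CRs (so CRCRLF collapses too).
EOL = re.compile(r'\r*\n')


def inconsistent_eol(data):
    """Return True if the data mixes CRLF and lone-LF line endings."""
    return '\r\n' in data and SINGLE_LF.search(data) is not None


def convert(data, eol, only_consistent=True):
    """Rewrite every line ending in data to eol ('\\n' or '\\r\\n').

    Hypothetical helper combining tolf and tocrlf. Data that looks
    binary is returned unchanged; the NUL test is an assumption about
    what util.binary checks. Mixed-EOL data is left alone unless
    only_consistent is False, mirroring eol.only-consistent.
    """
    if '\0' in data:
        return data
    if only_consistent and inconsistent_eol(data):
        return data
    return EOL.sub(eol, data)


if __name__ == '__main__':
    print(repr(convert('a\r\nb\r\n', '\n')))
    print(repr(convert('a\r\nb\n', '\n')))
    print(repr(convert('a\r\nb\n', '\n', only_consistent=False)))
```

With the default `only_consistent=True`, a file that mixes CRLF and LF passes through untouched, which matches the documented behaviour of leaving such files alone on the assumption that the mix is intentional.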