pycompat: provide 'ispy3' constant...
Yuya Nishihara
r30030:0f6d6fdd default
@@ -1,1621 +1,1622 @@
# bundle2.py - generic container format to transmit arbitrary data.
#
# Copyright 2013 Facebook, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
"""Handling of the new bundle2 format

The goal of bundle2 is to act as an atomic packet to transmit a set of
payloads in an application agnostic way. It consists of a sequence of "parts"
that will be handed to and processed by the application layer.


General format architecture
===========================

The format is structured as follows

- magic string
- stream level parameters
- payload parts (any number)
- end of stream marker.

The binary format
============================

All numbers are unsigned and big-endian.

stream level parameters
------------------------

The binary format is as follows

:params size: int32

  The total number of bytes used by the parameters

:params value: arbitrary number of bytes

  A blob of `params size` bytes containing the serialized version of all
  stream level parameters.

  The blob contains a space separated list of parameters. Parameters with a
  value are stored in the form `<name>=<value>`. Both name and value are
  urlquoted.

  Empty names are forbidden.

  Names MUST start with a letter. If this first letter is lower case, the
  parameter is advisory and can be safely ignored. However, when the first
  letter is capital, the parameter is mandatory and the bundling process MUST
  stop if it is not able to process it.

  Stream parameters use a simple textual format for two main reasons:

  - Stream level parameters should remain simple and we want to discourage
    any crazy usage.
  - Textual data allows easy human inspection of a bundle2 header in case of
    trouble.

  Any application level options MUST go into a bundle2 part instead.

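For illustration, a blob carrying one advisory and one (hypothetical)
mandatory parameter could look like this::

    compression=GZ Checksum=sha1

`compression` starts with a lower case letter, so a reader may safely ignore
it; `Checksum` is capitalized, so a reader that does not know it MUST abort.
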
Payload part
------------------------

The binary format is as follows

:header size: int32

  The total number of bytes used by the part header. When the header is empty
  (size = 0) this is interpreted as the end of stream marker.

:header:

  The header defines how to interpret the part. It contains two pieces of
  data: the part type, and the part parameters.

  The part type is used to route to an application level handler that can
  interpret the payload.

  Part parameters are passed to the application level handler. They are
  meant to convey information that will help the application level object
  interpret the part payload.

  The binary format of the header is as follows

  :typesize: (one byte)

  :parttype: alphanumerical part name (restricted to [a-zA-Z0-9_:-]*)

  :partid: A 32-bit integer (unique in the bundle) that can be used to refer
           to this part.

  :parameters:

    A part's parameters may have arbitrary content, the binary structure is::

        <mandatory-count><advisory-count><param-sizes><param-data>

    :mandatory-count: 1 byte, number of mandatory parameters

    :advisory-count: 1 byte, number of advisory parameters

    :param-sizes:

      N couples of bytes, where N is the total number of parameters. Each
      couple contains (<size-of-key>, <size-of-value>) for one parameter.

    :param-data:

      A blob of bytes from which each parameter key and value can be
      retrieved using the list of size couples stored in the previous
      field.

      Mandatory parameters come first, then the advisory ones.

      Each parameter's key MUST be unique within the part.

:payload:

  payload is a series of `<chunksize><chunkdata>`.

  `chunksize` is an int32, `chunkdata` are plain bytes (as much as
  `chunksize` says). The payload part is concluded by a zero size chunk.

  The current implementation always produces either zero or one chunk.
  This is an implementation limitation that will ultimately be lifted.

  `chunksize` can be negative to trigger special case processing. No such
  processing is in place yet.

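For illustration, a header for a part of type ``output`` carrying a single
mandatory parameter ``in-reply-to=1`` would be laid out as::

    <typesize: 6> "output" <partid: int32>
    <mandatory-count: 1> <advisory-count: 0>
    <size-of-key: 11> <size-of-value: 1>
    "in-reply-to" "1"

(The part id value is hypothetical; it is assigned by the bundler.)
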
Bundle processing
============================

Each part is processed in order using a "part handler". Handlers are
registered for a certain part type.

The matching of a part to its handler is case insensitive. The case of the
part type is used to know if a part is mandatory or advisory. If the part
type contains any uppercase character it is considered mandatory. When no
handler is known for a mandatory part, the process is aborted and an
exception is raised. If the part is advisory and no handler is known, the
part is ignored. When the process is aborted, the full bundle is still read
from the stream to keep the channel usable. But none of the parts read after
an abort are processed. In the future, dropping the stream may become an
option for channels we do not care to preserve.
"""

from __future__ import absolute_import

import errno
import re
import string
import struct
import sys

from .i18n import _
from . import (
    changegroup,
    error,
    obsolete,
    pushkey,
    pycompat,
    tags,
    url,
    util,
)

urlerr = util.urlerr
urlreq = util.urlreq

_pack = struct.pack
_unpack = struct.unpack

_fstreamparamsize = '>i'
_fpartheadersize = '>i'
_fparttypesize = '>B'
_fpartid = '>I'
_fpayloadsize = '>i'
_fpartparamcount = '>BB'

preferedchunksize = 4096

_parttypeforbidden = re.compile('[^a-zA-Z0-9_:-]')

def outdebug(ui, message):
    """debug regarding output stream (bundling)"""
    if ui.configbool('devel', 'bundle2.debug', False):
        ui.debug('bundle2-output: %s\n' % message)

def indebug(ui, message):
    """debug on input stream (unbundling)"""
    if ui.configbool('devel', 'bundle2.debug', False):
        ui.debug('bundle2-input: %s\n' % message)

def validateparttype(parttype):
    """raise ValueError if a parttype contains invalid characters"""
    if _parttypeforbidden.search(parttype):
        raise ValueError(parttype)

def _makefpartparamsizes(nbparams):
    """return a struct format to read part parameter sizes

    The number of parameters is variable so we need to build that format
    dynamically.
    """
    return '>'+('BB'*nbparams)

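# For example, _makefpartparamsizes(2) returns '>BBBB'; a minimal sketch of
# how the size couples of two parameters would then be decoded:
#
#   >>> import struct
#   >>> struct.unpack('>BBBB', '\x02\x03\x01\x00')
#   (2, 3, 1, 0)
#
# i.e. a 2-byte key with a 3-byte value, then a 1-byte key with an empty
# value.
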
parthandlermapping = {}

def parthandler(parttype, params=()):
    """decorator that registers a function as a bundle2 part handler

    eg::

        @parthandler('myparttype', ('mandatory', 'param', 'handled'))
        def myparttypehandler(...):
            '''process a part of type "my part".'''
            ...
    """
    validateparttype(parttype)
    def _decorator(func):
        lparttype = parttype.lower() # enforce lower case matching.
        assert lparttype not in parthandlermapping
        parthandlermapping[lparttype] = func
        func.params = frozenset(params)
        return func
    return _decorator

class unbundlerecords(object):
    """keep record of what happens during an unbundle

    New records are added using `records.add('cat', obj)`, where 'cat' is a
    category of record and obj is an arbitrary object.

    `records['cat']` will return all entries of this category 'cat'.

    Iterating on the object itself will yield `('category', obj)` tuples
    for all entries.

    All iterations happen in chronological order.
    """

    def __init__(self):
        self._categories = {}
        self._sequences = []
        self._replies = {}

    def add(self, category, entry, inreplyto=None):
        """add a new record of a given category.

        The entry can then be retrieved in the list returned by
        self['category']."""
        self._categories.setdefault(category, []).append(entry)
        self._sequences.append((category, entry))
        if inreplyto is not None:
            self.getreplies(inreplyto).add(category, entry)

    def getreplies(self, partid):
        """get the records that are replies to a specific part"""
        return self._replies.setdefault(partid, unbundlerecords())

    def __getitem__(self, cat):
        return tuple(self._categories.get(cat, ()))

    def __iter__(self):
        return iter(self._sequences)

    def __len__(self):
        return len(self._sequences)

    def __nonzero__(self):
        return bool(self._sequences)

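# Illustrative use of the record object (hypothetical category names):
#
#   >>> recs = unbundlerecords()
#   >>> recs.add('changegroup', 'first')
#   >>> recs.add('changegroup', 'second', inreplyto=0)
#   >>> recs['changegroup']
#   ('first', 'second')
#   >>> recs.getreplies(0)['changegroup']
#   ('second',)
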
class bundleoperation(object):
    """an object that represents a single bundling process

    Its purpose is to carry unbundle-related objects and states.

    A new object should be created at the beginning of each bundle processing.
    The object is to be returned by the processing function.

    The object currently has very little content; it will ultimately contain:
    * an access to the repo the bundle is applied to,
    * a ui object,
    * a way to retrieve a transaction to add changes to the repo,
    * a way to record the result of processing each part,
    * a way to construct a bundle response when applicable.
    """

    def __init__(self, repo, transactiongetter, captureoutput=True):
        self.repo = repo
        self.ui = repo.ui
        self.records = unbundlerecords()
        self.gettransaction = transactiongetter
        self.reply = None
        self.captureoutput = captureoutput

class TransactionUnavailable(RuntimeError):
    pass

def _notransaction():
    """default method to get a transaction while processing a bundle

    Raise an exception to highlight the fact that no transaction was expected
    to be created"""
    raise TransactionUnavailable()

def applybundle(repo, unbundler, tr, source=None, url=None, op=None):
    # transform me into unbundler.apply() as soon as the freeze is lifted
    tr.hookargs['bundle2'] = '1'
    if source is not None and 'source' not in tr.hookargs:
        tr.hookargs['source'] = source
    if url is not None and 'url' not in tr.hookargs:
        tr.hookargs['url'] = url
    return processbundle(repo, unbundler, lambda: tr, op=op)

def processbundle(repo, unbundler, transactiongetter=None, op=None):
    """This function processes a bundle, applying its effects to/from a repo

    It iterates over each part then searches for and uses the proper handling
    code to process the part. Parts are processed in order.

    This is a very early version of this function that will be strongly
    reworked before final usage.

    An unknown mandatory part will abort the process.

    It is temporarily possible to provide a prebuilt bundleoperation to the
    function. This is used to ensure output is properly propagated in case of
    an error during the unbundling. This output capturing part will likely be
    reworked and this ability will probably go away in the process.
    """
    if op is None:
        if transactiongetter is None:
            transactiongetter = _notransaction
        op = bundleoperation(repo, transactiongetter)
    # todo:
    # - replace this with an init function soon.
    # - exception catching
    unbundler.params
    if repo.ui.debugflag:
        msg = ['bundle2-input-bundle:']
        if unbundler.params:
            msg.append(' %i params' % len(unbundler.params))
        if op.gettransaction is None:
            msg.append(' no-transaction')
        else:
            msg.append(' with-transaction')
        msg.append('\n')
        repo.ui.debug(''.join(msg))
    iterparts = enumerate(unbundler.iterparts())
    part = None
    nbpart = 0
    try:
        for nbpart, part in iterparts:
            _processpart(op, part)
    except Exception as exc:
        for nbpart, part in iterparts:
            # consume the bundle content
            part.seek(0, 2)
        # Small hack to let caller code distinguish exceptions from bundle2
        # processing from exceptions raised when processing the old format.
        # This is mostly needed to handle different return codes to unbundle
        # according to the type of bundle. We should probably clean up or
        # drop this return code craziness in a future version.
        exc.duringunbundle2 = True
        salvaged = []
        replycaps = None
        if op.reply is not None:
            salvaged = op.reply.salvageoutput()
            replycaps = op.reply.capabilities
        exc._replycaps = replycaps
        exc._bundle2salvagedoutput = salvaged
        raise
    finally:
        repo.ui.debug('bundle2-input-bundle: %i parts total\n' % nbpart)

    return op

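# Typical driving code, sketched under the assumption that 'tr' is an open
# transaction on 'repo' (this mirrors what applybundle() does above):
#
#   op = processbundle(repo, unbundler, transactiongetter=lambda: tr)
#   for record in op.records['changegroup']:
#       pass  # inspect the result of each changegroup part
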
def _processpart(op, part):
    """process a single part from a bundle

    The part is guaranteed to have been fully consumed when the function exits
    (even if an exception is raised)."""
    status = 'unknown' # used by debug output
    hardabort = False
    try:
        try:
            handler = parthandlermapping.get(part.type)
            if handler is None:
                status = 'unsupported-type'
                raise error.BundleUnknownFeatureError(parttype=part.type)
            indebug(op.ui, 'found a handler for part %r' % part.type)
            unknownparams = part.mandatorykeys - handler.params
            if unknownparams:
                unknownparams = list(unknownparams)
                unknownparams.sort()
                status = 'unsupported-params (%s)' % unknownparams
                raise error.BundleUnknownFeatureError(parttype=part.type,
                                                      params=unknownparams)
            status = 'supported'
        except error.BundleUnknownFeatureError as exc:
            if part.mandatory: # mandatory parts
                raise
            indebug(op.ui, 'ignoring unsupported advisory part %s' % exc)
            return # skip to part processing
        finally:
            if op.ui.debugflag:
                msg = ['bundle2-input-part: "%s"' % part.type]
                if not part.mandatory:
                    msg.append(' (advisory)')
                nbmp = len(part.mandatorykeys)
                nbap = len(part.params) - nbmp
                if nbmp or nbap:
                    msg.append(' (params:')
                    if nbmp:
                        msg.append(' %i mandatory' % nbmp)
                    if nbap:
                        msg.append(' %i advisory' % nbap)
                    msg.append(')')
                msg.append(' %s\n' % status)
                op.ui.debug(''.join(msg))

        # handler is called outside the above try block so that we don't
        # risk catching KeyErrors from anything other than the
        # parthandlermapping lookup (any KeyError raised by handler()
        # itself represents a defect of a different variety).
        output = None
        if op.captureoutput and op.reply is not None:
            op.ui.pushbuffer(error=True, subproc=True)
            output = ''
        try:
            handler(op, part)
        finally:
            if output is not None:
                output = op.ui.popbuffer()
            if output:
                outpart = op.reply.newpart('output', data=output,
                                           mandatory=False)
                outpart.addparam('in-reply-to', str(part.id), mandatory=False)
    # If exiting or interrupted, do not attempt to seek the stream in the
    # finally block below. This makes abort faster.
    except (SystemExit, KeyboardInterrupt):
        hardabort = True
        raise
    finally:
        # consume the part content to not corrupt the stream.
        if not hardabort:
            part.seek(0, 2)

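# A complete (hypothetical) part handler, following the contract enforced
# above: 'verses' is declared so the mandatory-parameter check passes, and
# the result is recorded for callers of processbundle():
#
#   @parthandler('test:song', ('verses',))
#   def songhandler(op, part):
#       op.ui.write('%s verses\n' % part.params['verses'])
#       op.records.add('test:song', part.params['verses'])
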

def decodecaps(blob):
    """decode a bundle2 caps bytes blob into a dictionary

    The blob is a list of capabilities (one per line)
    Capabilities may have values using a line of the form::

        capability=value1,value2,value3

    The values are always a list."""
    caps = {}
    for line in blob.splitlines():
        if not line:
            continue
        if '=' not in line:
            key, vals = line, ()
        else:
            key, vals = line.split('=', 1)
            vals = vals.split(',')
        key = urlreq.unquote(key)
        vals = [urlreq.unquote(v) for v in vals]
        caps[key] = vals
    return caps

def encodecaps(caps):
    """encode a bundle2 caps dictionary into a bytes blob"""
    chunks = []
    for ca in sorted(caps):
        vals = caps[ca]
        ca = urlreq.quote(ca)
        vals = [urlreq.quote(v) for v in vals]
        if vals:
            ca = "%s=%s" % (ca, ','.join(vals))
        chunks.append(ca)
    return '\n'.join(chunks)

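# Round-trip sketch of the two helpers above (hypothetical capabilities):
#
#   >>> blob = encodecaps({'HG20': (), 'changegroup': ['01', '02']})
#   >>> blob
#   'HG20\nchangegroup=01,02'
#   >>> decodecaps(blob) == {'HG20': [], 'changegroup': ['01', '02']}
#   True
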
bundletypes = {
    "": ("", None),       # only when using unbundle on ssh and old http servers
                          # since the unification ssh accepts a header but there
                          # is no capability signaling it.
    "HG20": (), # special-cased below
    "HG10UN": ("HG10UN", None),
    "HG10BZ": ("HG10", 'BZ'),
    "HG10GZ": ("HG10GZ", 'GZ'),
}

# hgweb uses this list to communicate its preferred type
bundlepriority = ['HG10GZ', 'HG10BZ', 'HG10UN']

class bundle20(object):
    """represent an outgoing bundle2 container

    Use the `addparam` method to add stream level parameters and `newpart` to
    populate it. Then call `getchunks` to retrieve all the binary chunks of
    data that compose the bundle2 container."""

    _magicstring = 'HG20'

    def __init__(self, ui, capabilities=()):
        self.ui = ui
        self._params = []
        self._parts = []
        self.capabilities = dict(capabilities)
        self._compressor = util.compressors[None]()

    def setcompression(self, alg):
        """setup core part compression to <alg>"""
        if alg is None:
            return
        assert not any(n.lower() == 'compression' for n, v in self._params)
        self.addparam('Compression', alg)
        self._compressor = util.compressors[alg]()

    @property
    def nbparts(self):
        """total number of parts added to the bundler"""
        return len(self._parts)

    # methods used to define the bundle2 content
    def addparam(self, name, value=None):
        """add a stream level parameter"""
        if not name:
            raise ValueError('empty parameter name')
        if name[0] not in string.letters:
            raise ValueError('non letter first character: %r' % name)
        self._params.append((name, value))

    def addpart(self, part):
        """add a new part to the bundle2 container

        Parts contain the actual application payload."""
        assert part.id is None
        part.id = len(self._parts) # very cheap counter
        self._parts.append(part)

    def newpart(self, typeid, *args, **kwargs):
        """create a new part and add it to the container

        The part is directly added to the container. For now, this means
        that any failure to properly initialize the part after calling
        ``newpart`` should result in a failure of the whole bundling process.

        You can still fall back to manually creating and adding one if you
        need better control."""
        part = bundlepart(typeid, *args, **kwargs)
        self.addpart(part)
        return part

    # methods used to generate the bundle2 stream
    def getchunks(self):
        if self.ui.debugflag:
            msg = ['bundle2-output-bundle: "%s",' % self._magicstring]
            if self._params:
                msg.append(' (%i params)' % len(self._params))
            msg.append(' %i parts total\n' % len(self._parts))
            self.ui.debug(''.join(msg))
        outdebug(self.ui, 'start emission of %s stream' % self._magicstring)
        yield self._magicstring
        param = self._paramchunk()
        outdebug(self.ui, 'bundle parameter: %s' % param)
        yield _pack(_fstreamparamsize, len(param))
        if param:
            yield param
        # starting compression
        for chunk in self._getcorechunk():
            yield self._compressor.compress(chunk)
        yield self._compressor.flush()

    def _paramchunk(self):
        """return an encoded version of all stream parameters"""
        blocks = []
        for par, value in self._params:
            par = urlreq.quote(par)
            if value is not None:
                value = urlreq.quote(value)
                par = '%s=%s' % (par, value)
            blocks.append(par)
        return ' '.join(blocks)

    def _getcorechunk(self):
        """yield chunks for the core part of the bundle

        (all but headers and parameters)"""
        outdebug(self.ui, 'start of parts')
        for part in self._parts:
            outdebug(self.ui, 'bundle part: "%s"' % part.type)
            for chunk in part.getchunks(ui=self.ui):
                yield chunk
        outdebug(self.ui, 'end of bundle')
        yield _pack(_fpartheadersize, 0)


    def salvageoutput(self):
        """return a list with a copy of all output parts in the bundle

        This is meant to be used during error handling to make sure we
        preserve server output"""
        salvaged = []
        for part in self._parts:
            if part.type.startswith('output'):
                salvaged.append(part.copy())
        return salvaged

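# Minimal bundling sketch (assumes 'ui' is a Mercurial ui object):
#
#   bundler = bundle20(ui)
#   part = bundler.newpart('output', data='hello', mandatory=False)
#   raw = ''.join(bundler.getchunks())
#
# 'raw' now starts with the 'HG20' magic string and can be fed back to
# getunbundler() below.
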

class unpackermixin(object):
    """A mixin to extract bytes and struct data from a stream"""

    def __init__(self, fp):
        self._fp = fp
        self._seekable = (util.safehasattr(fp, 'seek') and
                          util.safehasattr(fp, 'tell'))

    def _unpack(self, format):
        """unpack this struct format from the stream"""
        data = self._readexact(struct.calcsize(format))
        return _unpack(format, data)

    def _readexact(self, size):
        """read exactly <size> bytes from the stream"""
        return changegroup.readexactly(self._fp, size)

    def seek(self, offset, whence=0):
        """move the underlying file pointer"""
        if self._seekable:
            return self._fp.seek(offset, whence)
        else:
            raise NotImplementedError(_('File pointer is not seekable'))

    def tell(self):
        """return the file offset, or None if file is not seekable"""
        if self._seekable:
            try:
                return self._fp.tell()
            except IOError as e:
                if e.errno == errno.ESPIPE:
                    self._seekable = False
                else:
                    raise
        return None

    def close(self):
        """close underlying file"""
        if util.safehasattr(self._fp, 'close'):
            return self._fp.close()

def getunbundler(ui, fp, magicstring=None):
    """return a valid unbundler object for a given magicstring"""
    if magicstring is None:
        magicstring = changegroup.readexactly(fp, 4)
    magic, version = magicstring[0:2], magicstring[2:4]
    if magic != 'HG':
        raise error.Abort(_('not a Mercurial bundle'))
    unbundlerclass = formatmap.get(version)
    if unbundlerclass is None:
        raise error.Abort(_('unknown bundle version %s') % version)
    unbundler = unbundlerclass(ui, fp)
    indebug(ui, 'start processing of %s stream' % magicstring)
    return unbundler

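# Reading-side sketch (hypothetical file name, 'ui' assumed available):
#
#   fp = open('some-bundle.hg2', 'rb')
#   unbundler = getunbundler(ui, fp)
#   for part in unbundler.iterparts():
#       ui.write('part: %s\n' % part.type)
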
class unbundle20(unpackermixin):
    """interpret a bundle2 stream

    This class is fed with a binary stream and yields parts through its
    `iterparts` method."""

    _magicstring = 'HG20'

    def __init__(self, ui, fp):
        """If header is specified, we do not read it out of the stream."""
        self.ui = ui
        self._decompressor = util.decompressors[None]
        self._compressed = None
        super(unbundle20, self).__init__(fp)

    @util.propertycache
    def params(self):
        """dictionary of stream level parameters"""
        indebug(self.ui, 'reading bundle2 stream parameters')
        params = {}
        paramssize = self._unpack(_fstreamparamsize)[0]
        if paramssize < 0:
            raise error.BundleValueError('negative bundle param size: %i'
                                         % paramssize)
        if paramssize:
            params = self._readexact(paramssize)
            params = self._processallparams(params)
        return params

    def _processallparams(self, paramsblock):
        """process the parameter block and return a dict of all parameters"""
        params = util.sortdict()
        for p in paramsblock.split(' '):
            p = p.split('=', 1)
            p = [urlreq.unquote(i) for i in p]
            if len(p) < 2:
                p.append(None)
            self._processparam(*p)
            params[p[0]] = p[1]
        return params


    def _processparam(self, name, value):
        """process a parameter, applying its effect if needed

        Parameters starting with a lower case letter are advisory and will be
        ignored when unknown. For those starting with an upper case letter,
        this function will raise a BundleUnknownFeatureError when unknown.

        Note: no options are currently supported. Any input will either be
        ignored or fail.
        """
        if not name:
            raise ValueError('empty parameter name')
        if name[0] not in string.letters:
            raise ValueError('non letter first character: %r' % name)
        try:
            handler = b2streamparamsmap[name.lower()]
        except KeyError:
            if name[0].islower():
                indebug(self.ui, "ignoring unknown parameter %r" % name)
            else:
                raise error.BundleUnknownFeatureError(params=(name,))
        else:
            handler(self, name, value)

    def _forwardchunks(self):
        """utility to transfer a bundle2 as binary

        This is made necessary by the fact that the 'getbundle' command over
        'ssh' has no way to know when the reply ends, relying on the bundle
        being interpreted to find its end. This is terrible and we are sorry,
        but we needed to move forward to get general delta enabled.
        """
        yield self._magicstring
        assert 'params' not in vars(self)
        paramssize = self._unpack(_fstreamparamsize)[0]
        if paramssize < 0:
            raise error.BundleValueError('negative bundle param size: %i'
                                         % paramssize)
        yield _pack(_fstreamparamsize, paramssize)
        if paramssize:
            params = self._readexact(paramssize)
            self._processallparams(params)
            yield params
        assert self._decompressor is util.decompressors[None]
        # From there, payload might need to be decompressed
        self._fp = self._decompressor(self._fp)
        emptycount = 0
        while emptycount < 2:
            # so we can brainlessly loop
            assert _fpartheadersize == _fpayloadsize
            size = self._unpack(_fpartheadersize)[0]
            yield _pack(_fpartheadersize, size)
            if size:
                emptycount = 0
            else:
                emptycount += 1
                continue
            if size == flaginterrupt:
                continue
            elif size < 0:
                raise error.BundleValueError('negative chunk size: %i' % size)
            yield self._readexact(size)


    def iterparts(self):
        """yield all parts contained in the stream"""
        # make sure params have been loaded
        self.params
        # From there, the payload needs to be decompressed
        self._fp = self._decompressor(self._fp)
        indebug(self.ui, 'start extraction of bundle2 parts')
        headerblock = self._readpartheader()
        while headerblock is not None:
            part = unbundlepart(self.ui, headerblock, self._fp)
            yield part
            part.seek(0, 2)
            headerblock = self._readpartheader()
        indebug(self.ui, 'end of bundle2 stream')

    def _readpartheader(self):
        """reads a part header size and return the bytes blob

        returns None if empty"""
        headersize = self._unpack(_fpartheadersize)[0]
        if headersize < 0:
            raise error.BundleValueError('negative part header size: %i'
                                         % headersize)
        indebug(self.ui, 'part header size: %i' % headersize)
        if headersize:
            return self._readexact(headersize)
        return None

    def compressed(self):
        self.params # load params
        return self._compressed

formatmap = {'20': unbundle20}

b2streamparamsmap = {}

def b2streamparamhandler(name):
    """register a handler for a stream level parameter"""
    def decorator(func):
        assert name not in b2streamparamsmap
        b2streamparamsmap[name] = func
        return func
    return decorator

@b2streamparamhandler('compression')
def processcompression(unbundler, param, value):
    """read compression parameter and install payload decompression"""
    if value not in util.decompressors:
        raise error.BundleUnknownFeatureError(params=(param,),
                                              values=(value,))
    unbundler._decompressor = util.decompressors[value]
    if value is not None:
        unbundler._compressed = True

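# A hypothetical handler for an additional advisory stream parameter would
# follow the same pattern as processcompression above:
#
#   @b2streamparamhandler('origin')
#   def processorigin(unbundler, param, value):
#       indebug(unbundler.ui, 'bundle originates from %s' % value)
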

class bundlepart(object):
    """A bundle2 part contains application level payload

    The part `type` is used to route the part to the application level
    handler.

    The part payload is contained in ``part.data``. It could be raw bytes or a
    generator of byte chunks.

    You can add parameters to the part using the ``addparam`` method.
    Parameters can be either mandatory (default) or advisory. Remote side
    should be able to safely ignore the advisory ones.

    Neither data nor parameters can be modified after generation has begun.
    """

    def __init__(self, parttype, mandatoryparams=(), advisoryparams=(),
                 data='', mandatory=True):
        validateparttype(parttype)
        self.id = None
        self.type = parttype
        self._data = data
        self._mandatoryparams = list(mandatoryparams)
        self._advisoryparams = list(advisoryparams)
        # checking for duplicated entries
        self._seenparams = set()
        for pname, __ in self._mandatoryparams + self._advisoryparams:
            if pname in self._seenparams:
                raise RuntimeError('duplicated params: %s' % pname)
            self._seenparams.add(pname)
        # status of the part's generation:
        # - None: not started,
        # - False: currently generated,
        # - True: generation done.
        self._generated = None
        self.mandatory = mandatory

    def copy(self):
        """return a copy of the part

        The new part has the very same content but no partid assigned yet.
        Parts with generated data cannot be copied."""
        assert not util.safehasattr(self.data, 'next')
        return self.__class__(self.type, self._mandatoryparams,
                              self._advisoryparams, self._data, self.mandatory)

    # methods used to define the part content
    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, data):
        if self._generated is not None:
            raise error.ReadOnlyPartError('part is being generated')
        self._data = data

    @property
    def mandatoryparams(self):
        # make it an immutable tuple to force people through ``addparam``
        return tuple(self._mandatoryparams)

    @property
    def advisoryparams(self):
        # make it an immutable tuple to force people through ``addparam``
        return tuple(self._advisoryparams)

    def addparam(self, name, value='', mandatory=True):
        if self._generated is not None:
            raise error.ReadOnlyPartError('part is being generated')
        if name in self._seenparams:
            raise ValueError('duplicated params: %s' % name)
        self._seenparams.add(name)
        params = self._advisoryparams
        if mandatory:
            params = self._mandatoryparams
        params.append((name, value))

907
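    # A minimal usage sketch (illustrative only, not executed anywhere; the
    # 'output' part type and the sink function are assumptions made for the
    # example):
    #
    #   part = bundlepart('output', data='hello')
    #   part.addparam('verbosity', 'debug', mandatory=False)
    #   for chunk in part.getchunks(ui=ui):
    #       sendtopeer(chunk)
    #
    # Once getchunks() has started, assigning to ``data`` or calling
    # ``addparam`` raises error.ReadOnlyPartError, per the guards above.
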
    # methods used to generate the bundle2 stream
    def getchunks(self, ui):
        if self._generated is not None:
            raise RuntimeError('part can only be consumed once')
        self._generated = False

        if ui.debugflag:
            msg = ['bundle2-output-part: "%s"' % self.type]
            if not self.mandatory:
                msg.append(' (advisory)')
            nbmp = len(self.mandatoryparams)
            nbap = len(self.advisoryparams)
            if nbmp or nbap:
                msg.append(' (params:')
                if nbmp:
                    msg.append(' %i mandatory' % nbmp)
                if nbap:
                    msg.append(' %i advisory' % nbap)
                msg.append(')')
            if not self.data:
                msg.append(' empty payload')
            elif util.safehasattr(self.data, 'next'):
                msg.append(' streamed payload')
            else:
                msg.append(' %i bytes payload' % len(self.data))
            msg.append('\n')
            ui.debug(''.join(msg))

        #### header
        if self.mandatory:
            parttype = self.type.upper()
        else:
            parttype = self.type.lower()
        outdebug(ui, 'part %s: "%s"' % (self.id, parttype))
        ## parttype
        header = [_pack(_fparttypesize, len(parttype)),
                  parttype, _pack(_fpartid, self.id),
                  ]
        ## parameters
        # count
        manpar = self.mandatoryparams
        advpar = self.advisoryparams
        header.append(_pack(_fpartparamcount, len(manpar), len(advpar)))
        # size
        parsizes = []
        for key, value in manpar:
            parsizes.append(len(key))
            parsizes.append(len(value))
        for key, value in advpar:
            parsizes.append(len(key))
            parsizes.append(len(value))
        paramsizes = _pack(_makefpartparamsizes(len(parsizes) // 2), *parsizes)
        header.append(paramsizes)
        # key, value
        for key, value in manpar:
            header.append(key)
            header.append(value)
        for key, value in advpar:
            header.append(key)
            header.append(value)
        ## finalize header
        headerchunk = ''.join(header)
        outdebug(ui, 'header chunk size: %i' % len(headerchunk))
        yield _pack(_fpartheadersize, len(headerchunk))
        yield headerchunk
        ## payload
        try:
            for chunk in self._payloadchunks():
                outdebug(ui, 'payload chunk size: %i' % len(chunk))
                yield _pack(_fpayloadsize, len(chunk))
                yield chunk
        except GeneratorExit:
            # GeneratorExit means that nobody is listening for our
            # results anyway, so just bail quickly rather than trying
            # to produce an error part.
            ui.debug('bundle2-generatorexit\n')
            raise
        except BaseException as exc:
            # backup exception data for later
            ui.debug('bundle2-input-stream-interrupt: encoding exception %s'
                     % exc)
            exc_info = sys.exc_info()
            msg = 'unexpected error: %s' % exc
            interpart = bundlepart('error:abort', [('message', msg)],
                                   mandatory=False)
            interpart.id = 0
            yield _pack(_fpayloadsize, -1)
            for chunk in interpart.getchunks(ui=ui):
                yield chunk
            outdebug(ui, 'closing payload chunk')
            # abort current part payload
            yield _pack(_fpayloadsize, 0)
            if pycompat.ispy3:
                raise exc_info[0](exc_info[1]).with_traceback(exc_info[2])
            else:
                exec("""raise exc_info[0], exc_info[1], exc_info[2]""")
        # end of payload
        outdebug(ui, 'closing payload chunk')
        yield _pack(_fpayloadsize, 0)
        self._generated = True
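    # For reference, the byte layout emitted above is (a sketch, with field
    # widths given by the struct formats defined earlier in this module):
    #
    #   header size | type size | type | part id | #mandatory #advisory
    #   | param sizes | param keys and values
    #
    # followed by zero or more (chunk size, chunk) payload records and a
    # terminating zero-size chunk; a negative size flags an interruption.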

    def _payloadchunks(self):
        """yield chunks of the part payload

        Exists to handle the different methods to provide data to a part."""
        # we only support fixed size data now.
        # This will be improved in the future.
        if util.safehasattr(self.data, 'next'):
            buff = util.chunkbuffer(self.data)
            chunk = buff.read(preferedchunksize)
            while chunk:
                yield chunk
                chunk = buff.read(preferedchunksize)
        elif len(self.data):
            yield self.data


flaginterrupt = -1

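# On the wire the interrupt marker is just a payload-size field equal to
# flaginterrupt. For instance, assuming the usual '>i' value of _fpayloadsize
# defined earlier in this file, a producer emits:
#
#   struct.pack('>i', -1)   # == '\xff\xff\xff\xff'
#
# then a complete out-of-band part, before resuming or aborting the
# interrupted payload. The consumer side reacts in
# unbundlepart._payloadchunks by invoking the handler below.
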
class interrupthandler(unpackermixin):
    """read one part and process it with restricted capability

    This allows an exception raised on the producer side during part
    iteration to be transmitted while the consumer is reading a part.

    Parts processed in this manner only have access to a ui object."""

    def __init__(self, ui, fp):
        super(interrupthandler, self).__init__(fp)
        self.ui = ui

    def _readpartheader(self):
        """read a part header size and return the bytes blob

        returns None if empty"""
        headersize = self._unpack(_fpartheadersize)[0]
        if headersize < 0:
            raise error.BundleValueError('negative part header size: %i'
                                         % headersize)
        indebug(self.ui, 'part header size: %i\n' % headersize)
        if headersize:
            return self._readexact(headersize)
        return None

    def __call__(self):
        self.ui.debug('bundle2-input-stream-interrupt:'
                      ' opening out of band context\n')
        indebug(self.ui, 'bundle2 stream interruption, looking for a part.')
        headerblock = self._readpartheader()
        if headerblock is None:
            indebug(self.ui, 'no part found during interruption.')
            return
        part = unbundlepart(self.ui, headerblock, self._fp)
        op = interruptoperation(self.ui)
        _processpart(op, part)
        self.ui.debug('bundle2-input-stream-interrupt:'
                      ' closing out of band context\n')

class interruptoperation(object):
    """A limited operation to be used by part handlers during interruption

    It only has access to a ui object.
    """

    def __init__(self, ui):
        self.ui = ui
        self.reply = None
        self.captureoutput = False

    @property
    def repo(self):
        raise RuntimeError('no repo access from stream interruption')

    def gettransaction(self):
        raise TransactionUnavailable('no repo access from stream interruption')

class unbundlepart(unpackermixin):
    """a bundle part read from a bundle"""

    def __init__(self, ui, header, fp):
        super(unbundlepart, self).__init__(fp)
        self.ui = ui
        # unbundle state attr
        self._headerdata = header
        self._headeroffset = 0
        self._initialized = False
        self.consumed = False
        # part data
        self.id = None
        self.type = None
        self.mandatoryparams = None
        self.advisoryparams = None
        self.params = None
        self.mandatorykeys = ()
        self._payloadstream = None
        self._readheader()
        self._mandatory = None
        self._chunkindex = [] # (payload, file) position tuples for chunk starts
        self._pos = 0

    def _fromheader(self, size):
        """return the next <size> bytes from the header"""
        offset = self._headeroffset
        data = self._headerdata[offset:(offset + size)]
        self._headeroffset = offset + size
        return data

    def _unpackheader(self, format):
        """read given format from header

        This automatically computes the size of the format to read."""
        data = self._fromheader(struct.calcsize(format))
        return _unpack(format, data)

    def _initparams(self, mandatoryparams, advisoryparams):
        """internal function to set up all logic related parameters"""
        # make it read only to prevent people touching it by mistake.
        self.mandatoryparams = tuple(mandatoryparams)
        self.advisoryparams = tuple(advisoryparams)
        # user friendly UI
        self.params = util.sortdict(self.mandatoryparams)
        self.params.update(self.advisoryparams)
        self.mandatorykeys = frozenset(p[0] for p in mandatoryparams)

    def _payloadchunks(self, chunknum=0):
        '''seek to specified chunk and start yielding data'''
        if len(self._chunkindex) == 0:
            assert chunknum == 0, 'Must start with chunk 0'
            self._chunkindex.append((0, super(unbundlepart, self).tell()))
        else:
            assert chunknum < len(self._chunkindex), \
                   'Unknown chunk %d' % chunknum
            super(unbundlepart, self).seek(self._chunkindex[chunknum][1])

        pos = self._chunkindex[chunknum][0]
        payloadsize = self._unpack(_fpayloadsize)[0]
        indebug(self.ui, 'payload chunk size: %i' % payloadsize)
        while payloadsize:
            if payloadsize == flaginterrupt:
                # interruption detection, the handler will now read a
                # single part and process it.
                interrupthandler(self.ui, self._fp)()
            elif payloadsize < 0:
                msg = 'negative payload chunk size: %i' % payloadsize
                raise error.BundleValueError(msg)
            else:
                result = self._readexact(payloadsize)
                chunknum += 1
                pos += payloadsize
                if chunknum == len(self._chunkindex):
                    self._chunkindex.append((pos,
                                             super(unbundlepart, self).tell()))
                yield result
            payloadsize = self._unpack(_fpayloadsize)[0]
            indebug(self.ui, 'payload chunk size: %i' % payloadsize)

    def _findchunk(self, pos):
        '''for a given payload position, return a chunk number and offset'''
        for chunk, (ppos, fpos) in enumerate(self._chunkindex):
            if ppos == pos:
                return chunk, 0
            elif ppos > pos:
                return chunk - 1, pos - self._chunkindex[chunk - 1][0]
        raise ValueError('Unknown chunk')

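    # Bookkeeping sketch: after two 10-byte chunks have been read, and with a
    # 4-byte '>i' size prefix per chunk (an assumption matching _fpayloadsize),
    # self._chunkindex would be
    #
    #   [(0, h), (10, h + 14), (20, h + 28)]
    #
    # where h is the file offset of the first size prefix. _findchunk(15)
    # then returns (1, 5): payload position 15 lives in chunk 1, offset 5.
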
    def _readheader(self):
        """read the header and setup the object"""
        typesize = self._unpackheader(_fparttypesize)[0]
        self.type = self._fromheader(typesize)
        indebug(self.ui, 'part type: "%s"' % self.type)
        self.id = self._unpackheader(_fpartid)[0]
        indebug(self.ui, 'part id: "%s"' % self.id)
        # extract mandatory bit from type
        self.mandatory = (self.type != self.type.lower())
        self.type = self.type.lower()
        ## reading parameters
        # param count
        mancount, advcount = self._unpackheader(_fpartparamcount)
        indebug(self.ui, 'part parameters: %i' % (mancount + advcount))
        # param size
        fparamsizes = _makefpartparamsizes(mancount + advcount)
        paramsizes = self._unpackheader(fparamsizes)
        # make it a list of pairs again
        paramsizes = zip(paramsizes[::2], paramsizes[1::2])
        # split mandatory from advisory
        mansizes = paramsizes[:mancount]
        advsizes = paramsizes[mancount:]
        # retrieve param value
        manparams = []
        for key, value in mansizes:
            manparams.append((self._fromheader(key), self._fromheader(value)))
        advparams = []
        for key, value in advsizes:
            advparams.append((self._fromheader(key), self._fromheader(value)))
        self._initparams(manparams, advparams)
        ## part payload
        self._payloadstream = util.chunkbuffer(self._payloadchunks())
        # we have read the header data, record it
        self._initialized = True

    def read(self, size=None):
        """read payload data"""
        if not self._initialized:
            self._readheader()
        if size is None:
            data = self._payloadstream.read()
        else:
            data = self._payloadstream.read(size)
        self._pos += len(data)
        if size is None or len(data) < size:
            if not self.consumed and self._pos:
                self.ui.debug('bundle2-input-part: total payload size %i\n'
                              % self._pos)
            self.consumed = True
        return data

    def tell(self):
        return self._pos

    def seek(self, offset, whence=0):
        if whence == 0:
            newpos = offset
        elif whence == 1:
            newpos = self._pos + offset
        elif whence == 2:
            if not self.consumed:
                self.read()
            newpos = self._chunkindex[-1][0] - offset
        else:
            raise ValueError('Unknown whence value: %r' % (whence,))

        if newpos > self._chunkindex[-1][0] and not self.consumed:
            self.read()
        if not 0 <= newpos <= self._chunkindex[-1][0]:
            raise ValueError('Offset out of range')

        if self._pos != newpos:
            chunk, internaloffset = self._findchunk(newpos)
            self._payloadstream = util.chunkbuffer(self._payloadchunks(chunk))
            adjust = self.read(internaloffset)
            if len(adjust) != internaloffset:
                raise error.Abort(_('seek failed'))
            self._pos = newpos

# These are only the static capabilities.
# Check the 'getrepocaps' function for the rest.
capabilities = {'HG20': (),
                'error': ('abort', 'unsupportedcontent', 'pushraced',
                          'pushkey'),
                'listkeys': (),
                'pushkey': (),
                'digests': tuple(sorted(util.DIGESTS.keys())),
                'remote-changegroup': ('http', 'https'),
                'hgtagsfnodes': (),
               }

def getrepocaps(repo, allowpushback=False):
    """return the bundle2 capabilities for a given repo

    Exists to allow extensions (like evolution) to mutate the capabilities.
    """
    caps = capabilities.copy()
    caps['changegroup'] = tuple(sorted(
        changegroup.supportedincomingversions(repo)))
    if obsolete.isenabled(repo, obsolete.exchangeopt):
        supportedformat = tuple('V%i' % v for v in obsolete.formats)
        caps['obsmarkers'] = supportedformat
    if allowpushback:
        caps['pushback'] = ()
    return caps

def bundle2caps(remote):
    """return the bundle capabilities of a peer as a dict"""
    raw = remote.capable('bundle2')
    if not raw and raw != '':
        return {}
    capsblob = urlreq.unquote(remote.capable('bundle2'))
    return decodecaps(capsblob)

def obsmarkersversion(caps):
    """extract the list of supported obsmarkers versions from a bundle2caps
    dict"""
    obscaps = caps.get('obsmarkers', ())
    return [int(c[1:]) for c in obscaps if c.startswith('V')]

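# A hedged example of what a decoded capabilities dict can look like and how
# the helper above consumes it (the values are illustrative, not canonical):
def _obsmarkerscapsexample():
    caps = {'HG20': (),
            'changegroup': ('01', '02'),
            'obsmarkers': ('V0', 'V1')}
    assert obsmarkersversion(caps) == [0, 1]
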
def writebundle(ui, cg, filename, bundletype, vfs=None, compression=None):
    """Write a bundle file and return its filename.

    Existing files will not be overwritten.
    If no filename is specified, a temporary file is created.
    bz2 compression can be turned off.
    The bundle file will be deleted in case of errors.
    """

    if bundletype == "HG20":
        bundle = bundle20(ui)
        bundle.setcompression(compression)
        part = bundle.newpart('changegroup', data=cg.getchunks())
        part.addparam('version', cg.version)
        if 'clcount' in cg.extras:
            part.addparam('nbchanges', str(cg.extras['clcount']),
                          mandatory=False)
        chunkiter = bundle.getchunks()
    else:
        # compression argument is only for the bundle2 case
        assert compression is None
        if cg.version != '01':
            raise error.Abort(_('old bundle types only support v1 '
                                'changegroups'))
        header, comp = bundletypes[bundletype]
        if comp not in util.compressors:
            raise error.Abort(_('unknown stream compression type: %s')
                              % comp)
        z = util.compressors[comp]()
        subchunkiter = cg.getchunks()
        def chunkiter():
            yield header
            for chunk in subchunkiter:
                yield z.compress(chunk)
            yield z.flush()
        chunkiter = chunkiter()

    # parse the changegroup data, otherwise we will block
    # in case of sshrepo because we don't know the end of the stream
    return changegroup.writechunks(ui, chunkiter, filename, vfs=vfs)

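# Example call (a sketch; 'cg' stands for an existing changegroup object and
# the filename and compression are assumptions, not the only valid values):
#
#   fname = writebundle(ui, cg, 'backup.hg', 'HG20', compression='BZ')
#
# For the old bundle types, `bundletypes` maps a name such as 'HG10BZ' to a
# (header, compression) pair, and `compression` must stay None.
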
@parthandler('changegroup', ('version', 'nbchanges', 'treemanifest'))
def handlechangegroup(op, inpart):
    """apply a changegroup part on the repo

    This is a very early implementation that will see massive rework before
    being inflicted on any end-user.
    """
    # Make sure we trigger a transaction creation
    #
    # The addchangegroup function will get a transaction object by itself, but
    # we need to make sure we trigger the creation of a transaction object used
    # for the whole processing scope.
    op.gettransaction()
    unpackerversion = inpart.params.get('version', '01')
    # We should raise an appropriate exception here
    cg = changegroup.getunbundler(unpackerversion, inpart, None)
    # the source and url passed here are overwritten by the ones contained in
    # the transaction.hookargs argument. So 'bundle2' is a placeholder
    nbchangesets = None
    if 'nbchanges' in inpart.params:
        nbchangesets = int(inpart.params.get('nbchanges'))
    if ('treemanifest' in inpart.params and
        'treemanifest' not in op.repo.requirements):
        if len(op.repo.changelog) != 0:
            raise error.Abort(_(
                "bundle contains tree manifests, but local repo is "
                "non-empty and does not use tree manifests"))
        op.repo.requirements.add('treemanifest')
        op.repo._applyopenerreqs()
        op.repo._writerequirements()
    ret = cg.apply(op.repo, 'bundle2', 'bundle2', expectedtotal=nbchangesets)
    op.records.add('changegroup', {'return': ret})
    if op.reply is not None:
        # This is definitely not the final form of this
        # return. But one needs to start somewhere.
        part = op.reply.newpart('reply:changegroup', mandatory=False)
        part.addparam('in-reply-to', str(inpart.id), mandatory=False)
        part.addparam('return', '%i' % ret, mandatory=False)
    assert not inpart.read()

_remotechangegroupparams = tuple(['url', 'size', 'digests'] +
    ['digest:%s' % k for k in util.DIGESTS.keys()])
@parthandler('remote-changegroup', _remotechangegroupparams)
def handleremotechangegroup(op, inpart):
    """apply a bundle10 on the repo, given a url and validation information

    All the information about the remote bundle to import is given as
    parameters. The parameters include:
      - url: the url to the bundle10.
      - size: the bundle10 file size. It is used to validate that what was
        retrieved by the client matches the server knowledge about the bundle.
      - digests: a space separated list of the digest types provided as
        parameters.
      - digest:<digest-type>: the hexadecimal representation of the digest with
        that name. Like the size, it is used to validate that what was
        retrieved by the client matches what the server knows about the bundle.

    When multiple digest types are given, all of them are checked.
    """
    try:
        raw_url = inpart.params['url']
    except KeyError:
        raise error.Abort(_('remote-changegroup: missing "%s" param') % 'url')
    parsed_url = util.url(raw_url)
    if parsed_url.scheme not in capabilities['remote-changegroup']:
        raise error.Abort(_('remote-changegroup does not support %s urls') %
                          parsed_url.scheme)

    try:
        size = int(inpart.params['size'])
    except ValueError:
        raise error.Abort(_('remote-changegroup: invalid value for param "%s"')
                          % 'size')
    except KeyError:
        raise error.Abort(_('remote-changegroup: missing "%s" param') % 'size')

    digests = {}
    for typ in inpart.params.get('digests', '').split():
        param = 'digest:%s' % typ
        try:
            value = inpart.params[param]
        except KeyError:
            raise error.Abort(_('remote-changegroup: missing "%s" param') %
                              param)
        digests[typ] = value

    real_part = util.digestchecker(url.open(op.ui, raw_url), size, digests)

    # Make sure we trigger a transaction creation
    #
    # The addchangegroup function will get a transaction object by itself, but
    # we need to make sure we trigger the creation of a transaction object used
    # for the whole processing scope.
    op.gettransaction()
    from . import exchange
    cg = exchange.readbundle(op.repo.ui, real_part, raw_url)
    if not isinstance(cg, changegroup.cg1unpacker):
        raise error.Abort(_('%s: not a bundle version 1.0') %
                          util.hidepassword(raw_url))
    ret = cg.apply(op.repo, 'bundle2', 'bundle2')
    op.records.add('changegroup', {'return': ret})
    if op.reply is not None:
        # This is definitely not the final form of this
        # return. But one needs to start somewhere.
        part = op.reply.newpart('reply:changegroup')
        part.addparam('in-reply-to', str(inpart.id), mandatory=False)
        part.addparam('return', '%i' % ret, mandatory=False)
    try:
        real_part.validate()
    except error.Abort as e:
        raise error.Abort(_('bundle at %s is corrupted:\n%s') %
                          (util.hidepassword(raw_url), str(e)))
    assert not inpart.read()

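# Parameter encoding sketch for the part above (the values are made up for
# illustration; digest names must match entries in util.DIGESTS):
#
#   url=https://hg.example.com/bundle.hg
#   size=12345
#   digests=sha1
#   digest:sha1=da39a3ee5e6b4b0d3255bfef95601890afd80709
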
@parthandler('reply:changegroup', ('return', 'in-reply-to'))
def handlereplychangegroup(op, inpart):
    ret = int(inpart.params['return'])
    replyto = int(inpart.params['in-reply-to'])
    op.records.add('changegroup', {'return': ret}, replyto)

@parthandler('check:heads')
def handlecheckheads(op, inpart):
    """check that the heads of the repo did not change

    This is used to detect a push race when using unbundle.
    This replaces the "heads" argument of unbundle."""
    h = inpart.read(20)
    heads = []
    while len(h) == 20:
        heads.append(h)
        h = inpart.read(20)
    assert not h
    # Trigger a transaction so that we are guaranteed to have the lock now.
    if op.ui.configbool('experimental', 'bundle2lazylocking'):
        op.gettransaction()
    if sorted(heads) != sorted(op.repo.heads()):
        raise error.PushRaced('repository changed while pushing - '
                              'please try again')

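# Client-side counterpart (a sketch mirroring the read loop above): the
# payload is simply the repo's 20-byte binary heads, concatenated.
def _encodecheckheads(heads):
    """pack an iterable of 20-byte head nodes into a check:heads payload"""
    assert all(len(h) == 20 for h in heads)
    return ''.join(heads)
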
@parthandler('output')
def handleoutput(op, inpart):
    """forward output captured on the server to the client"""
    for line in inpart.read().splitlines():
        op.ui.status(_('remote: %s\n') % line)

@parthandler('replycaps')
def handlereplycaps(op, inpart):
    """Notify that a reply bundle should be created

    The payload contains the capabilities information for the reply"""
    caps = decodecaps(inpart.read())
    if op.reply is None:
        op.reply = bundle20(op.ui, caps)

class AbortFromPart(error.Abort):
    """Sub-class of Abort that denotes an error from a bundle2 part."""

@parthandler('error:abort', ('message', 'hint'))
def handleerrorabort(op, inpart):
    """Used to transmit abort error over the wire"""
    raise AbortFromPart(inpart.params['message'],
                        hint=inpart.params.get('hint'))

@parthandler('error:pushkey', ('namespace', 'key', 'new', 'old', 'ret',
                               'in-reply-to'))
def handleerrorpushkey(op, inpart):
    """Used to transmit failure of a mandatory pushkey over the wire"""
    kwargs = {}
    for name in ('namespace', 'key', 'new', 'old', 'ret'):
        value = inpart.params.get(name)
        if value is not None:
            kwargs[name] = value
    raise error.PushkeyFailed(inpart.params['in-reply-to'], **kwargs)

@parthandler('error:unsupportedcontent', ('parttype', 'params'))
def handleerrorunsupportedcontent(op, inpart):
    """Used to transmit unknown content error over the wire"""
    kwargs = {}
    parttype = inpart.params.get('parttype')
    if parttype is not None:
        kwargs['parttype'] = parttype
    params = inpart.params.get('params')
    if params is not None:
        kwargs['params'] = params.split('\0')

    raise error.BundleUnknownFeatureError(**kwargs)

@parthandler('error:pushraced', ('message',))
def handleerrorpushraced(op, inpart):
    """Used to transmit push race error over the wire"""
    raise error.ResponseError(_('push failed:'), inpart.params['message'])

@parthandler('listkeys', ('namespace',))
def handlelistkeys(op, inpart):
    """retrieve pushkey namespace content stored in a bundle2"""
    namespace = inpart.params['namespace']
    r = pushkey.decodekeys(inpart.read())
    op.records.add('listkeys', (namespace, r))

@parthandler('pushkey', ('namespace', 'key', 'old', 'new'))
def handlepushkey(op, inpart):
    """process a pushkey request"""
    dec = pushkey.decode
    namespace = dec(inpart.params['namespace'])
    key = dec(inpart.params['key'])
    old = dec(inpart.params['old'])
    new = dec(inpart.params['new'])
    # Grab the transaction to ensure that we have the lock before performing
    # the pushkey.
    if op.ui.configbool('experimental', 'bundle2lazylocking'):
        op.gettransaction()
    ret = op.repo.pushkey(namespace, key, old, new)
    record = {'namespace': namespace,
              'key': key,
              'old': old,
              'new': new}
    op.records.add('pushkey', record)
    if op.reply is not None:
        rpart = op.reply.newpart('reply:pushkey')
        rpart.addparam('in-reply-to', str(inpart.id), mandatory=False)
        rpart.addparam('return', '%i' % ret, mandatory=False)
    if inpart.mandatory and not ret:
        kwargs = {}
        for key in ('namespace', 'key', 'new', 'old', 'ret'):
            if key in inpart.params:
                kwargs[key] = inpart.params[key]
        raise error.PushkeyFailed(partid=str(inpart.id), **kwargs)

@parthandler('reply:pushkey', ('return', 'in-reply-to'))
def handlepushkeyreply(op, inpart):
    """retrieve the result of a pushkey request"""
    ret = int(inpart.params['return'])
    partid = int(inpart.params['in-reply-to'])
    op.records.add('pushkey', {'return': ret}, partid)

@parthandler('obsmarkers')
def handleobsmarker(op, inpart):
    """add a stream of obsmarkers to the repo"""
    tr = op.gettransaction()
    markerdata = inpart.read()
    if op.ui.config('experimental', 'obsmarkers-exchange-debug', False):
        op.ui.write(('obsmarker-exchange: %i bytes received\n')
                    % len(markerdata))
    # The mergemarkers call will crash if marker creation is not enabled.
    # We want to avoid this if the part is advisory.
    if not inpart.mandatory and op.repo.obsstore.readonly:
        op.repo.ui.debug('ignoring obsolescence markers, feature not enabled\n')
        return
    new = op.repo.obsstore.mergemarkers(tr, markerdata)
    if new:
        op.repo.ui.status(_('%i new obsolescence markers\n') % new)
    op.records.add('obsmarkers', {'new': new})
    if op.reply is not None:
        rpart = op.reply.newpart('reply:obsmarkers')
        rpart.addparam('in-reply-to', str(inpart.id), mandatory=False)
        rpart.addparam('new', '%i' % new, mandatory=False)


@parthandler('reply:obsmarkers', ('new', 'in-reply-to'))
def handleobsmarkerreply(op, inpart):
    """retrieve the result of an obsmarkers request"""
    ret = int(inpart.params['new'])
    partid = int(inpart.params['in-reply-to'])
    op.records.add('obsmarkers', {'new': ret}, partid)

@parthandler('hgtagsfnodes')
def handlehgtagsfnodes(op, inpart):
    """Applies .hgtags fnodes cache entries to the local repo.

    Payload is pairs of 20 byte changeset nodes and filenodes.
    """
    # Grab the transaction so we ensure that we have the lock at this point.
    if op.ui.configbool('experimental', 'bundle2lazylocking'):
        op.gettransaction()
    cache = tags.hgtagsfnodescache(op.repo.unfiltered())

    count = 0
    while True:
        node = inpart.read(20)
        fnode = inpart.read(20)
        if len(node) < 20 or len(fnode) < 20:
            op.ui.debug('ignoring incomplete received .hgtags fnodes data\n')
            break
        cache.setfnode(node, fnode)
        count += 1

    cache.write()
    op.ui.debug('applied %i hgtags fnodes cache entries\n' % count)
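
# Producer-side counterpart (a sketch; the consumer loop above reads the
# payload back 40 bytes at a time):
def _encodehgtagsfnodes(pairs):
    """pack an iterable of (node, fnode) 20-byte pairs into a payload blob"""
    return ''.join(node + fnode for node, fnode in pairs)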
@@ -1,579 +1,579 b''
# encoding.py - character transcoding support for Mercurial
#
# Copyright 2005-2009 Matt Mackall <mpm@selenic.com> and others
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

from __future__ import absolute_import

import array
import locale
import os
import unicodedata

from . import (
    error,
    pycompat,
)

if pycompat.ispy3:
    unichr = chr

# These unicode characters are ignored by HFS+ (Apple Technote 1150,
# "Unicode Subtleties"), so we need to ignore them in some places for
# sanity.
_ignore = [unichr(int(x, 16)).encode("utf-8") for x in
           "200c 200d 200e 200f 202a 202b 202c 202d 202e "
           "206a 206b 206c 206d 206e 206f feff".split()]
# verify the next function will work
if pycompat.ispy3:
    assert set(i[0] for i in _ignore) == set([ord(b'\xe2'), ord(b'\xef')])
else:
    assert set(i[0] for i in _ignore) == set(["\xe2", "\xef"])

def hfsignoreclean(s):
    """Remove codepoints ignored by HFS+ from s.

    >>> hfsignoreclean(u'.h\u200cg'.encode('utf-8'))
    '.hg'
    >>> hfsignoreclean(u'.h\ufeffg'.encode('utf-8'))
    '.hg'
    """
    if "\xe2" in s or "\xef" in s:
        for c in _ignore:
            s = s.replace(c, '')
    return s

48 def _getpreferredencoding():
48 def _getpreferredencoding():
49 '''
49 '''
50 On darwin, getpreferredencoding ignores the locale environment and
50 On darwin, getpreferredencoding ignores the locale environment and
51 always returns mac-roman. http://bugs.python.org/issue6202 fixes this
51 always returns mac-roman. http://bugs.python.org/issue6202 fixes this
52 for Python 2.7 and up. This is the same corrected code for earlier
52 for Python 2.7 and up. This is the same corrected code for earlier
53 Python versions.
53 Python versions.
54
54
55 However, we can't use a version check for this method, as some distributions
55 However, we can't use a version check for this method, as some distributions
56 patch Python to fix this. Instead, we use it as a 'fixer' for the mac-roman
56 patch Python to fix this. Instead, we use it as a 'fixer' for the mac-roman
57 encoding, as it is unlikely that this encoding is the actually expected.
57 encoding, as it is unlikely that this encoding is the actually expected.
58 '''
58 '''
59 try:
59 try:
60 locale.CODESET
60 locale.CODESET
61 except AttributeError:
61 except AttributeError:
62 # Fall back to parsing environment variables :-(
62 # Fall back to parsing environment variables :-(
63 return locale.getdefaultlocale()[1]
63 return locale.getdefaultlocale()[1]
64
64
65 oldloc = locale.setlocale(locale.LC_CTYPE)
65 oldloc = locale.setlocale(locale.LC_CTYPE)
66 locale.setlocale(locale.LC_CTYPE, "")
66 locale.setlocale(locale.LC_CTYPE, "")
67 result = locale.nl_langinfo(locale.CODESET)
67 result = locale.nl_langinfo(locale.CODESET)
68 locale.setlocale(locale.LC_CTYPE, oldloc)
68 locale.setlocale(locale.LC_CTYPE, oldloc)
69
69
70 return result
70 return result
71
71
72 _encodingfixers = {
72 _encodingfixers = {
73 '646': lambda: 'ascii',
73 '646': lambda: 'ascii',
74 'ANSI_X3.4-1968': lambda: 'ascii',
74 'ANSI_X3.4-1968': lambda: 'ascii',
75 'mac-roman': _getpreferredencoding
75 'mac-roman': _getpreferredencoding
76 }
76 }
77
77
78 try:
78 try:
79 encoding = os.environ.get("HGENCODING")
79 encoding = os.environ.get("HGENCODING")
80 if not encoding:
80 if not encoding:
81 encoding = locale.getpreferredencoding() or 'ascii'
81 encoding = locale.getpreferredencoding() or 'ascii'
82 encoding = _encodingfixers.get(encoding, lambda: encoding)()
82 encoding = _encodingfixers.get(encoding, lambda: encoding)()
83 except locale.Error:
83 except locale.Error:
84 encoding = 'ascii'
84 encoding = 'ascii'
85 encodingmode = os.environ.get("HGENCODINGMODE", "strict")
85 encodingmode = os.environ.get("HGENCODINGMODE", "strict")
86 fallbackencoding = 'ISO-8859-1'
86 fallbackencoding = 'ISO-8859-1'
87
87
88 class localstr(str):
88 class localstr(str):
89 '''This class allows strings that are unmodified to be
89 '''This class allows strings that are unmodified to be
90 round-tripped to the local encoding and back'''
90 round-tripped to the local encoding and back'''
91 def __new__(cls, u, l):
91 def __new__(cls, u, l):
92 s = str.__new__(cls, l)
92 s = str.__new__(cls, l)
93 s._utf8 = u
93 s._utf8 = u
94 return s
94 return s
95 def __hash__(self):
95 def __hash__(self):
96 return hash(self._utf8) # avoid collisions in local string space
96 return hash(self._utf8) # avoid collisions in local string space
97
97
98 def tolocal(s):
98 def tolocal(s):
99 """
99 """
100 Convert a string from internal UTF-8 to local encoding
100 Convert a string from internal UTF-8 to local encoding
101
101
102 All internal strings should be UTF-8 but some repos before the
102 All internal strings should be UTF-8 but some repos before the
103 implementation of locale support may contain latin1 or possibly
103 implementation of locale support may contain latin1 or possibly
104 other character sets. We attempt to decode everything strictly
104 other character sets. We attempt to decode everything strictly
105 using UTF-8, then Latin-1, and failing that, we use UTF-8 and
105 using UTF-8, then Latin-1, and failing that, we use UTF-8 and
106 replace unknown characters.
106 replace unknown characters.
107
107
108 The localstr class is used to cache the known UTF-8 encoding of
108 The localstr class is used to cache the known UTF-8 encoding of
109 strings next to their local representation to allow lossless
109 strings next to their local representation to allow lossless
110 round-trip conversion back to UTF-8.
110 round-trip conversion back to UTF-8.
111
111
112 >>> u = 'foo: \\xc3\\xa4' # utf-8
112 >>> u = 'foo: \\xc3\\xa4' # utf-8
113 >>> l = tolocal(u)
113 >>> l = tolocal(u)
114 >>> l
114 >>> l
115 'foo: ?'
115 'foo: ?'
116 >>> fromlocal(l)
116 >>> fromlocal(l)
117 'foo: \\xc3\\xa4'
117 'foo: \\xc3\\xa4'
118 >>> u2 = 'foo: \\xc3\\xa1'
118 >>> u2 = 'foo: \\xc3\\xa1'
119 >>> d = { l: 1, tolocal(u2): 2 }
119 >>> d = { l: 1, tolocal(u2): 2 }
120 >>> len(d) # no collision
120 >>> len(d) # no collision
121 2
121 2
122 >>> 'foo: ?' in d
122 >>> 'foo: ?' in d
123 False
123 False
124 >>> l1 = 'foo: \\xe4' # historical latin1 fallback
124 >>> l1 = 'foo: \\xe4' # historical latin1 fallback
125 >>> l = tolocal(l1)
125 >>> l = tolocal(l1)
126 >>> l
126 >>> l
127 'foo: ?'
127 'foo: ?'
128 >>> fromlocal(l) # magically in utf-8
128 >>> fromlocal(l) # magically in utf-8
129 'foo: \\xc3\\xa4'
129 'foo: \\xc3\\xa4'
130 """
130 """
131
131
132 try:
132 try:
133 try:
133 try:
134 # make sure string is actually stored in UTF-8
134 # make sure string is actually stored in UTF-8
135 u = s.decode('UTF-8')
135 u = s.decode('UTF-8')
136 if encoding == 'UTF-8':
136 if encoding == 'UTF-8':
137 # fast path
137 # fast path
138 return s
138 return s
139 r = u.encode(encoding, "replace")
139 r = u.encode(encoding, "replace")
140 if u == r.decode(encoding):
140 if u == r.decode(encoding):
141 # r is a safe, non-lossy encoding of s
141 # r is a safe, non-lossy encoding of s
142 return r
142 return r
143 return localstr(s, r)
143 return localstr(s, r)
144 except UnicodeDecodeError:
144 except UnicodeDecodeError:
145 # we should only get here if we're looking at an ancient changeset
145 # we should only get here if we're looking at an ancient changeset
146 try:
146 try:
147 u = s.decode(fallbackencoding)
147 u = s.decode(fallbackencoding)
148 r = u.encode(encoding, "replace")
148 r = u.encode(encoding, "replace")
149 if u == r.decode(encoding):
149 if u == r.decode(encoding):
150 # r is a safe, non-lossy encoding of s
150 # r is a safe, non-lossy encoding of s
151 return r
151 return r
152 return localstr(u.encode('UTF-8'), r)
152 return localstr(u.encode('UTF-8'), r)
153 except UnicodeDecodeError:
153 except UnicodeDecodeError:
154 u = s.decode("utf-8", "replace") # last ditch
154 u = s.decode("utf-8", "replace") # last ditch
155 return u.encode(encoding, "replace") # can't round-trip
155 return u.encode(encoding, "replace") # can't round-trip
156 except LookupError as k:
156 except LookupError as k:
157 raise error.Abort(k, hint="please check your locale settings")
157 raise error.Abort(k, hint="please check your locale settings")
158
158
159 def fromlocal(s):
159 def fromlocal(s):
160 """
160 """
161 Convert a string from the local character encoding to UTF-8
161 Convert a string from the local character encoding to UTF-8
162
162
163 We attempt to decode strings using the encoding mode set by
163 We attempt to decode strings using the encoding mode set by
164 HGENCODINGMODE, which defaults to 'strict'. In this mode, unknown
164 HGENCODINGMODE, which defaults to 'strict'. In this mode, unknown
165 characters will cause an error message. Other modes include
165 characters will cause an error message. Other modes include
166 'replace', which replaces unknown characters with a special
166 'replace', which replaces unknown characters with a special
167 Unicode character, and 'ignore', which drops the character.
167 Unicode character, and 'ignore', which drops the character.
168 """
168 """
169
169
170 # can we do a lossless round-trip?
170 # can we do a lossless round-trip?
171 if isinstance(s, localstr):
171 if isinstance(s, localstr):
172 return s._utf8
172 return s._utf8
173
173
174 try:
174 try:
175 return s.decode(encoding, encodingmode).encode("utf-8")
175 return s.decode(encoding, encodingmode).encode("utf-8")
176 except UnicodeDecodeError as inst:
176 except UnicodeDecodeError as inst:
177 sub = s[max(0, inst.start - 10):inst.start + 10]
177 sub = s[max(0, inst.start - 10):inst.start + 10]
178 raise error.Abort("decoding near '%s': %s!" % (sub, inst))
178 raise error.Abort("decoding near '%s': %s!" % (sub, inst))
179 except LookupError as k:
179 except LookupError as k:
180 raise error.Abort(k, hint="please check your locale settings")
180 raise error.Abort(k, hint="please check your locale settings")
181
181
182 # How to treat ambiguous-width characters. Set to 'wide' to treat as wide.
182 # How to treat ambiguous-width characters. Set to 'wide' to treat as wide.
183 wide = (os.environ.get("HGENCODINGAMBIGUOUS", "narrow") == "wide"
183 wide = (os.environ.get("HGENCODINGAMBIGUOUS", "narrow") == "wide"
184 and "WFA" or "WF")
184 and "WFA" or "WF")
185
185
186 def colwidth(s):
186 def colwidth(s):
187 "Find the column width of a string for display in the local encoding"
187 "Find the column width of a string for display in the local encoding"
188 return ucolwidth(s.decode(encoding, 'replace'))
188 return ucolwidth(s.decode(encoding, 'replace'))
189
189
190 def ucolwidth(d):
190 def ucolwidth(d):
191 "Find the column width of a Unicode string for display"
191 "Find the column width of a Unicode string for display"
192 eaw = getattr(unicodedata, 'east_asian_width', None)
192 eaw = getattr(unicodedata, 'east_asian_width', None)
193 if eaw is not None:
193 if eaw is not None:
194 return sum([eaw(c) in wide and 2 or 1 for c in d])
194 return sum([eaw(c) in wide and 2 or 1 for c in d])
195 return len(d)
195 return len(d)
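
ucolwidth counts display columns rather than characters: East Asian wide ('W') and fullwidth ('F') codepoints occupy two terminal cells, and HGENCODINGAMBIGUOUS=wide adds the ambiguous ('A') class via the `wide` string above. A hedged standalone illustration of that difference (not part of the module):

    # -*- coding: utf-8 -*-
    import unicodedata

    def displaywidth(u, ambiguouswide=False):
        """Count terminal columns the way ucolwidth does: 'W'/'F' codepoints
        are two cells wide, optionally treating 'A' (ambiguous) as wide too."""
        wide = "WFA" if ambiguouswide else "WF"
        return sum(2 if unicodedata.east_asian_width(c) in wide else 1
                   for c in u)

    assert displaywidth(u'hg') == 2            # ASCII: one column per character
    assert displaywidth(u'\u3042\u3044') == 4  # two hiragana fill four columns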

 def getcols(s, start, c):
     '''Use colwidth to find a c-column substring of s starting at byte
     index start'''
     for x in xrange(start + c, len(s)):
         t = s[start:x]
         if colwidth(t) == c:
             return t

 def trim(s, width, ellipsis='', leftside=False):
     """Trim string 's' to at most 'width' columns (including 'ellipsis').

     If 'leftside' is True, left side of string 's' is trimmed.
     'ellipsis' is always placed at trimmed side.

     >>> ellipsis = '+++'
     >>> from . import encoding
     >>> encoding.encoding = 'utf-8'
     >>> t= '1234567890'
     >>> print trim(t, 12, ellipsis=ellipsis)
     1234567890
     >>> print trim(t, 10, ellipsis=ellipsis)
     1234567890
     >>> print trim(t, 8, ellipsis=ellipsis)
     12345+++
     >>> print trim(t, 8, ellipsis=ellipsis, leftside=True)
     +++67890
     >>> print trim(t, 8)
     12345678
     >>> print trim(t, 8, leftside=True)
     34567890
     >>> print trim(t, 3, ellipsis=ellipsis)
     +++
     >>> print trim(t, 1, ellipsis=ellipsis)
     +
     >>> u = u'\u3042\u3044\u3046\u3048\u304a' # 2 x 5 = 10 columns
     >>> t = u.encode(encoding.encoding)
     >>> print trim(t, 12, ellipsis=ellipsis)
     \xe3\x81\x82\xe3\x81\x84\xe3\x81\x86\xe3\x81\x88\xe3\x81\x8a
     >>> print trim(t, 10, ellipsis=ellipsis)
     \xe3\x81\x82\xe3\x81\x84\xe3\x81\x86\xe3\x81\x88\xe3\x81\x8a
     >>> print trim(t, 8, ellipsis=ellipsis)
     \xe3\x81\x82\xe3\x81\x84+++
     >>> print trim(t, 8, ellipsis=ellipsis, leftside=True)
     +++\xe3\x81\x88\xe3\x81\x8a
     >>> print trim(t, 5)
     \xe3\x81\x82\xe3\x81\x84
     >>> print trim(t, 5, leftside=True)
     \xe3\x81\x88\xe3\x81\x8a
     >>> print trim(t, 4, ellipsis=ellipsis)
     +++
     >>> print trim(t, 4, ellipsis=ellipsis, leftside=True)
     +++
     >>> t = '\x11\x22\x33\x44\x55\x66\x77\x88\x99\xaa' # invalid byte sequence
     >>> print trim(t, 12, ellipsis=ellipsis)
     \x11\x22\x33\x44\x55\x66\x77\x88\x99\xaa
     >>> print trim(t, 10, ellipsis=ellipsis)
     \x11\x22\x33\x44\x55\x66\x77\x88\x99\xaa
     >>> print trim(t, 8, ellipsis=ellipsis)
     \x11\x22\x33\x44\x55+++
     >>> print trim(t, 8, ellipsis=ellipsis, leftside=True)
     +++\x66\x77\x88\x99\xaa
     >>> print trim(t, 8)
     \x11\x22\x33\x44\x55\x66\x77\x88
     >>> print trim(t, 8, leftside=True)
     \x33\x44\x55\x66\x77\x88\x99\xaa
     >>> print trim(t, 3, ellipsis=ellipsis)
     +++
     >>> print trim(t, 1, ellipsis=ellipsis)
     +
     """
     try:
         u = s.decode(encoding)
     except UnicodeDecodeError:
         if len(s) <= width: # trimming is not needed
             return s
         width -= len(ellipsis)
         if width <= 0: # no enough room even for ellipsis
             return ellipsis[:width + len(ellipsis)]
         if leftside:
             return ellipsis + s[-width:]
         return s[:width] + ellipsis

     if ucolwidth(u) <= width: # trimming is not needed
         return s

     width -= len(ellipsis)
     if width <= 0: # no enough room even for ellipsis
         return ellipsis[:width + len(ellipsis)]

     if leftside:
         uslice = lambda i: u[i:]
         concat = lambda s: ellipsis + s
     else:
         uslice = lambda i: u[:-i]
         concat = lambda s: s + ellipsis
     for i in xrange(1, len(u)):
         usub = uslice(i)
         if ucolwidth(usub) <= width:
             return concat(usub.encode(encoding))
     return ellipsis # no enough room for multi-column characters

 def _asciilower(s):
     '''convert a string to lowercase if ASCII

     Raises UnicodeDecodeError if non-ASCII characters are found.'''
     s.decode('ascii')
     return s.lower()

 def asciilower(s):
     # delay importing avoids cyclic dependency around "parsers" in
     # pure Python build (util => i18n => encoding => parsers => util)
     from . import parsers
     impl = getattr(parsers, 'asciilower', _asciilower)
     global asciilower
     asciilower = impl
     return impl(s)

 def _asciiupper(s):
     '''convert a string to uppercase if ASCII

     Raises UnicodeDecodeError if non-ASCII characters are found.'''
     s.decode('ascii')
     return s.upper()

 def asciiupper(s):
     # delay importing avoids cyclic dependency around "parsers" in
     # pure Python build (util => i18n => encoding => parsers => util)
     from . import parsers
     impl = getattr(parsers, 'asciiupper', _asciiupper)
     global asciiupper
     asciiupper = impl
     return impl(s)
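
asciilower and asciiupper above use a self-replacing function idiom: the first call imports parsers, picks the C implementation if one is available, and rebinds the module-level name to it, so every later call dispatches straight to the chosen implementation with no import or getattr cost. A hedged sketch of the bare pattern (probeimpl stands in for the real resolver):

    def probeimpl():
        # stand-in for "prefer the C extension, fall back to pure Python"
        return str.lower

    def fold(s):
        """First call resolves an implementation, then rebinds the
        module-level name so subsequent calls skip this function entirely."""
        impl = probeimpl()
        global fold
        fold = impl
        return impl(s)

    assert fold('MiXeD') == 'mixed'   # first call resolves and rebinds
    assert fold('ABC') == 'abc'       # now served by the chosen implementation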

 def lower(s):
     "best-effort encoding-aware case-folding of local string s"
     try:
         return asciilower(s)
     except UnicodeDecodeError:
         pass
     try:
         if isinstance(s, localstr):
             u = s._utf8.decode("utf-8")
         else:
             u = s.decode(encoding, encodingmode)

         lu = u.lower()
         if u == lu:
             return s # preserve localstring
         return lu.encode(encoding)
     except UnicodeError:
         return s.lower() # we don't know how to fold this except in ASCII
     except LookupError as k:
         raise error.Abort(k, hint="please check your locale settings")

 def upper(s):
     "best-effort encoding-aware case-folding of local string s"
     try:
         return asciiupper(s)
     except UnicodeDecodeError:
         return upperfallback(s)

 def upperfallback(s):
     try:
         if isinstance(s, localstr):
             u = s._utf8.decode("utf-8")
         else:
             u = s.decode(encoding, encodingmode)

         uu = u.upper()
         if u == uu:
             return s # preserve localstring
         return uu.encode(encoding)
     except UnicodeError:
         return s.upper() # we don't know how to fold this except in ASCII
     except LookupError as k:
         raise error.Abort(k, hint="please check your locale settings")

 class normcasespecs(object):
     '''what a platform's normcase does to ASCII strings

     This is specified per platform, and should be consistent with what normcase
     on that platform actually does.

     lower: normcase lowercases ASCII strings
     upper: normcase uppercases ASCII strings
     other: the fallback function should always be called

     This should be kept in sync with normcase_spec in util.h.'''
     lower = -1
     upper = 1
     other = 0

 _jsonmap = []
 _jsonmap.extend("\\u%04x" % x for x in range(32))
 _jsonmap.extend(chr(x) for x in range(32, 127))
 _jsonmap.append('\\u007f')
 _jsonmap[0x09] = '\\t'
 _jsonmap[0x0a] = '\\n'
 _jsonmap[0x22] = '\\"'
 _jsonmap[0x5c] = '\\\\'
 _jsonmap[0x08] = '\\b'
 _jsonmap[0x0c] = '\\f'
 _jsonmap[0x0d] = '\\r'
 _paranoidjsonmap = _jsonmap[:]
 _paranoidjsonmap[0x3c] = '\\u003c' # '<' (e.g. escape "</script>")
 _paranoidjsonmap[0x3e] = '\\u003e' # '>'
 _jsonmap.extend(chr(x) for x in range(128, 256))

 def jsonescape(s, paranoid=False):
     '''returns a string suitable for JSON

     JSON is problematic for us because it doesn't support non-Unicode
     bytes. To deal with this, we take the following approach:

     - localstr objects are converted back to UTF-8
     - valid UTF-8/ASCII strings are passed as-is
     - other strings are converted to UTF-8b surrogate encoding
     - apply JSON-specified string escaping

     (escapes are doubled in these tests)

     >>> jsonescape('this is a test')
     'this is a test'
     >>> jsonescape('escape characters: \\0 \\x0b \\x7f')
     'escape characters: \\\\u0000 \\\\u000b \\\\u007f'
     >>> jsonescape('escape characters: \\t \\n \\r \\" \\\\')
     'escape characters: \\\\t \\\\n \\\\r \\\\" \\\\\\\\'
     >>> jsonescape('a weird byte: \\xdd')
     'a weird byte: \\xed\\xb3\\x9d'
     >>> jsonescape('utf-8: caf\\xc3\\xa9')
     'utf-8: caf\\xc3\\xa9'
     >>> jsonescape('')
     ''

     If paranoid, non-ascii and common troublesome characters are also escaped.
     This is suitable for web output.

     >>> jsonescape('escape boundary: \\x7e \\x7f \\xc2\\x80', paranoid=True)
     'escape boundary: ~ \\\\u007f \\\\u0080'
     >>> jsonescape('a weird byte: \\xdd', paranoid=True)
     'a weird byte: \\\\udcdd'
     >>> jsonescape('utf-8: caf\\xc3\\xa9', paranoid=True)
     'utf-8: caf\\\\u00e9'
     >>> jsonescape('non-BMP: \\xf0\\x9d\\x84\\x9e', paranoid=True)
     'non-BMP: \\\\ud834\\\\udd1e'
     >>> jsonescape('<foo@example.org>', paranoid=True)
     '\\\\u003cfoo@example.org\\\\u003e'
     '''

     if paranoid:
         jm = _paranoidjsonmap
     else:
         jm = _jsonmap

     u8chars = toutf8b(s)
     try:
         return ''.join(jm[x] for x in bytearray(u8chars)) # fast path
     except IndexError:
         pass
     # non-BMP char is represented as UTF-16 surrogate pair
     u16codes = array.array('H', u8chars.decode('utf-8').encode('utf-16'))
     u16codes.pop(0) # drop BOM
     return ''.join(jm[x] if x < 128 else '\\u%04x' % x for x in u16codes)

 _utf8len = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 3, 4]

 def getutf8char(s, pos):
     '''get the next full utf-8 character in the given string, starting at pos

     Raises a UnicodeError if the given location does not start a valid
     utf-8 character.
     '''

     # find how many bytes to attempt decoding from first nibble
     l = _utf8len[ord(s[pos]) >> 4]
     if not l: # ascii
         return s[pos]

     c = s[pos:pos + l]
     # validate with attempted decode
     c.decode("utf-8")
     return c
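
The _utf8len table above is indexed by the high nibble of a sequence's first byte: leading bytes 0xC0-0xDF, 0xE0-0xEF and 0xF0-0xFF announce 2-, 3- and 4-byte sequences, zero flags single-byte ASCII, and continuation bytes (0x80-0xBF) land on 1 and then fail the validating decode. A hedged worked check (Python 2 byte-string semantics, as in this module):

    _utf8len = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 3, 4]

    assert _utf8len[ord('a') >> 4] == 0      # plain ASCII: single byte
    assert _utf8len[ord('\xc3') >> 4] == 2   # leading byte of a 2-byte sequence
    assert _utf8len[ord('\xe3') >> 4] == 3   # e.g. hiragana, 3-byte sequence
    assert _utf8len[ord('\xf0') >> 4] == 4   # non-BMP, 4-byte sequence
    # a lone continuation byte maps to length 1; the decode check rejects it
    try:
        '\x82'.decode('utf-8')
    except UnicodeDecodeError:
        pass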

 def toutf8b(s):
     '''convert a local, possibly-binary string into UTF-8b

     This is intended as a generic method to preserve data when working
     with schemes like JSON and XML that have no provision for
     arbitrary byte strings. As Mercurial often doesn't know
     what encoding data is in, we use so-called UTF-8b.

     If a string is already valid UTF-8 (or ASCII), it passes unmodified.
     Otherwise, unsupported bytes are mapped to UTF-16 surrogate range,
     uDC00-uDCFF.

     Principles of operation:

     - ASCII and UTF-8 data successfully round-trips and is understood
       by Unicode-oriented clients
     - filenames and file contents in arbitrary other encodings can have
       be round-tripped or recovered by clueful clients
     - local strings that have a cached known UTF-8 encoding (aka
       localstr) get sent as UTF-8 so Unicode-oriented clients get the
       Unicode data they want
     - because we must preserve UTF-8 bytestring in places such as
       filenames, metadata can't be roundtripped without help

     (Note: "UTF-8b" often refers to decoding a mix of valid UTF-8 and
     arbitrary bytes into an internal Unicode format that can be
     re-encoded back into the original. Here we are exposing the
     internal surrogate encoding as a UTF-8 string.)
     '''

     if "\xed" not in s:
         if isinstance(s, localstr):
             return s._utf8
         try:
             s.decode('utf-8')
             return s
         except UnicodeDecodeError:
             pass

     r = ""
     pos = 0
     l = len(s)
     while pos < l:
         try:
             c = getutf8char(s, pos)
             if "\xed\xb0\x80" <= c <= "\xed\xb3\xbf":
                 # have to re-escape existing U+DCxx characters
                 c = unichr(0xdc00 + ord(s[pos])).encode('utf-8')
                 pos += 1
             else:
                 pos += len(c)
         except UnicodeDecodeError:
             c = unichr(0xdc00 + ord(s[pos])).encode('utf-8')
             pos += 1
         r += c
     return r

 def fromutf8b(s):
     '''Given a UTF-8b string, return a local, possibly-binary string.

     return the original binary string. This
     is a round-trip process for strings like filenames, but metadata
     that's was passed through tolocal will remain in UTF-8.

     >>> roundtrip = lambda x: fromutf8b(toutf8b(x)) == x
     >>> m = "\\xc3\\xa9\\x99abcd"
     >>> toutf8b(m)
     '\\xc3\\xa9\\xed\\xb2\\x99abcd'
     >>> roundtrip(m)
     True
     >>> roundtrip("\\xc2\\xc2\\x80")
     True
     >>> roundtrip("\\xef\\xbf\\xbd")
     True
     >>> roundtrip("\\xef\\xef\\xbf\\xbd")
     True
     >>> roundtrip("\\xf1\\x80\\x80\\x80\\x80")
     True
     '''

     # fast path - look for uDxxx prefixes in s
     if "\xed" not in s:
         return s

     # We could do this with the unicode type but some Python builds
     # use UTF-16 internally (issue5031) which causes non-BMP code
     # points to be escaped. Instead, we use our handy getutf8char
     # helper again to walk the string without "decoding" it.

     r = ""
     pos = 0
     l = len(s)
     while pos < l:
         c = getutf8char(s, pos)
         pos += len(c)
         # unescape U+DCxx characters
         if "\xed\xb0\x80" <= c <= "\xed\xb3\xbf":
             c = chr(ord(c.decode("utf-8")) & 0xff)
         r += c
     return r
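
To make the surrogate mapping in toutf8b/fromutf8b concrete: a byte that cannot be decoded, say 0x99, is replaced by the codepoint U+DC99, whose UTF-8 encoding is the three bytes \xed\xb2\x99, and fromutf8b masks the low byte back out. A hedged standalone check under Python 2 semantics (where encoding a lone surrogate is permitted):

    raw = '\x99'                                 # not valid UTF-8 on its own
    escaped = unichr(0xdc00 + ord(raw)).encode('utf-8')
    assert escaped == '\xed\xb2\x99'             # matches the toutf8b doctest
    # and the reverse mapping used by fromutf8b:
    assert chr(ord(escaped.decode('utf-8')) & 0xff) == raw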
@@ -1,166 +1,168 @@
 # pycompat.py - portability shim for python 3
 #
 # This software may be used and distributed according to the terms of the
 # GNU General Public License version 2 or any later version.

 """Mercurial portability shim for python 3.

 This contains aliases to hide python version-specific details from the core.
 """

 from __future__ import absolute_import

 import sys

-if sys.version_info[0] < 3:
+ispy3 = (sys.version_info[0] >= 3)
+
+if not ispy3:
     import cPickle as pickle
     import cStringIO as io
     import httplib
     import Queue as _queue
     import SocketServer as socketserver
     import urlparse
     import xmlrpclib
 else:
     import http.client as httplib
     import io
     import pickle
     import queue as _queue
     import socketserver
     import urllib.parse as urlparse
     import xmlrpc.client as xmlrpclib

-if sys.version_info[0] >= 3:
+if ispy3:
     import builtins
     import functools

     def _wrapattrfunc(f):
         @functools.wraps(f)
         def w(object, name, *args):
             if isinstance(name, bytes):
                 name = name.decode(u'utf-8')
             return f(object, name, *args)
         return w

     # these wrappers are automagically imported by hgloader
     delattr = _wrapattrfunc(builtins.delattr)
     getattr = _wrapattrfunc(builtins.getattr)
     hasattr = _wrapattrfunc(builtins.hasattr)
     setattr = _wrapattrfunc(builtins.setattr)
     xrange = builtins.range
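
The _wrapattrfunc wrappers above exist because, as the comment notes, hgloader feeds this code to Python 3 where string literals have become bytes, and the builtin attribute functions reject bytes names there. A hedged illustration of the failure being papered over (standalone; wrappedgetattr is an illustrative name):

    def wrappedgetattr(obj, name, *args):
        # decode bytes attribute names so callers may pass either type
        if isinstance(name, bytes):
            name = name.decode(u'utf-8')
        return getattr(obj, name, *args)

    class C(object):
        rev = 42

    # plain getattr(C, b'rev') raises TypeError on Python 3
    # ("attribute name must be string"); the wrapper accepts both spellings:
    assert wrappedgetattr(C, b'rev') == 42
    assert wrappedgetattr(C, 'rev') == 42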

 stringio = io.StringIO
 empty = _queue.Empty
 queue = _queue.Queue

 class _pycompatstub(object):
     def __init__(self):
         self._aliases = {}

     def _registeraliases(self, origin, items):
         """Add items that will be populated at the first access"""
         self._aliases.update((item.replace('_', '').lower(), (origin, item))
                              for item in items)

     def __getattr__(self, name):
         try:
             origin, item = self._aliases[name]
         except KeyError:
             raise AttributeError(name)
         self.__dict__[name] = obj = getattr(origin, item)
         return obj
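
_pycompatstub is a lazy alias namespace: _registeraliases only records (module, attribute) pairs under normalized names, and __getattr__ resolves a name on first access and caches the result in the instance __dict__, after which ordinary attribute lookup bypasses __getattr__ entirely. A hedged standalone sketch of the same pattern:

    import errno

    class lazynamespace(object):
        def __init__(self):
            self._aliases = {}

        def register(self, origin, items):
            # record (module, attribute) pairs under lowercased names
            self._aliases.update((item.lower(), (origin, item))
                                 for item in items)

        def __getattr__(self, name):
            try:
                origin, item = self._aliases[name]
            except KeyError:
                raise AttributeError(name)
            value = getattr(origin, item)
            self.__dict__[name] = value  # cache: next lookup skips __getattr__
            return value

    ns = lazynamespace()
    ns.register(errno, ['ENOENT'])
    assert ns.enoent == errno.ENOENT   # resolved lazily, then cached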

 httpserver = _pycompatstub()
 urlreq = _pycompatstub()
 urlerr = _pycompatstub()
-if sys.version_info[0] < 3:
+if not ispy3:
     import BaseHTTPServer
     import CGIHTTPServer
     import SimpleHTTPServer
     import urllib2
     import urllib
     urlreq._registeraliases(urllib, (
         "addclosehook",
         "addinfourl",
         "ftpwrapper",
         "pathname2url",
         "quote",
         "splitattr",
         "splitpasswd",
         "splitport",
         "splituser",
         "unquote",
         "url2pathname",
         "urlencode",
     ))
     urlreq._registeraliases(urllib2, (
         "AbstractHTTPHandler",
         "BaseHandler",
         "build_opener",
         "FileHandler",
         "FTPHandler",
         "HTTPBasicAuthHandler",
         "HTTPDigestAuthHandler",
         "HTTPHandler",
         "HTTPPasswordMgrWithDefaultRealm",
         "HTTPSHandler",
         "install_opener",
         "ProxyHandler",
         "Request",
         "urlopen",
     ))
     urlerr._registeraliases(urllib2, (
         "HTTPError",
         "URLError",
     ))
     httpserver._registeraliases(BaseHTTPServer, (
         "HTTPServer",
         "BaseHTTPRequestHandler",
     ))
     httpserver._registeraliases(SimpleHTTPServer, (
         "SimpleHTTPRequestHandler",
     ))
     httpserver._registeraliases(CGIHTTPServer, (
         "CGIHTTPRequestHandler",
     ))

 else:
     import urllib.request
     urlreq._registeraliases(urllib.request, (
         "AbstractHTTPHandler",
         "addclosehook",
         "addinfourl",
         "BaseHandler",
         "build_opener",
         "FileHandler",
         "FTPHandler",
         "ftpwrapper",
         "HTTPHandler",
         "HTTPSHandler",
         "install_opener",
         "pathname2url",
         "HTTPBasicAuthHandler",
         "HTTPDigestAuthHandler",
         "HTTPPasswordMgrWithDefaultRealm",
         "ProxyHandler",
         "quote",
         "Request",
         "splitattr",
         "splitpasswd",
         "splitport",
         "splituser",
         "unquote",
         "url2pathname",
         "urlopen",
     ))
     import urllib.error
     urlerr._registeraliases(urllib.error, (
         "HTTPError",
         "URLError",
     ))
     import http.server
     httpserver._registeraliases(http.server, (
         "HTTPServer",
         "BaseHTTPRequestHandler",
         "SimpleHTTPRequestHandler",
         "CGIHTTPRequestHandler",
     ))
1 # util.py - Mercurial utility functions and platform specific implementations
1 # util.py - Mercurial utility functions and platform specific implementations
2 #
2 #
3 # Copyright 2005 K. Thananchayan <thananck@yahoo.com>
3 # Copyright 2005 K. Thananchayan <thananck@yahoo.com>
4 # Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
4 # Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
5 # Copyright 2006 Vadim Gelfer <vadim.gelfer@gmail.com>
5 # Copyright 2006 Vadim Gelfer <vadim.gelfer@gmail.com>
6 #
6 #
7 # This software may be used and distributed according to the terms of the
7 # This software may be used and distributed according to the terms of the
8 # GNU General Public License version 2 or any later version.
8 # GNU General Public License version 2 or any later version.
9
9
10 """Mercurial utility functions and platform specific implementations.
10 """Mercurial utility functions and platform specific implementations.
11
11
12 This contains helper routines that are independent of the SCM core and
12 This contains helper routines that are independent of the SCM core and
13 hide platform-specific details from the core.
13 hide platform-specific details from the core.
14 """
14 """
15
15
16 from __future__ import absolute_import
16 from __future__ import absolute_import
17
17
18 import bz2
18 import bz2
19 import calendar
19 import calendar
20 import collections
20 import collections
21 import datetime
21 import datetime
22 import errno
22 import errno
23 import gc
23 import gc
24 import hashlib
24 import hashlib
25 import imp
25 import imp
26 import os
26 import os
27 import re as remod
27 import re as remod
28 import shutil
28 import shutil
29 import signal
29 import signal
30 import socket
30 import socket
31 import subprocess
31 import subprocess
32 import sys
32 import sys
33 import tempfile
33 import tempfile
34 import textwrap
34 import textwrap
35 import time
35 import time
36 import traceback
36 import traceback
37 import zlib
37 import zlib
38
38
39 from . import (
39 from . import (
40 encoding,
40 encoding,
41 error,
41 error,
42 i18n,
42 i18n,
43 osutil,
43 osutil,
44 parsers,
44 parsers,
45 pycompat,
45 pycompat,
46 )
46 )
47
47
48 for attr in (
48 for attr in (
49 'empty',
49 'empty',
50 'httplib',
50 'httplib',
51 'httpserver',
51 'httpserver',
52 'pickle',
52 'pickle',
53 'queue',
53 'queue',
54 'urlerr',
54 'urlerr',
55 'urlparse',
55 'urlparse',
56 # we do import urlreq, but we do it outside the loop
56 # we do import urlreq, but we do it outside the loop
57 #'urlreq',
57 #'urlreq',
58 'stringio',
58 'stringio',
59 'socketserver',
59 'socketserver',
60 'xmlrpclib',
60 'xmlrpclib',
61 ):
61 ):
62 globals()[attr] = getattr(pycompat, attr)
62 globals()[attr] = getattr(pycompat, attr)
63
63
64 # This line is to make pyflakes happy:
64 # This line is to make pyflakes happy:
65 urlreq = pycompat.urlreq
65 urlreq = pycompat.urlreq
66
66
67 if os.name == 'nt':
67 if os.name == 'nt':
68 from . import windows as platform
68 from . import windows as platform
69 else:
69 else:
70 from . import posix as platform
70 from . import posix as platform
71
71
72 _ = i18n._
72 _ = i18n._
73
73
74 bindunixsocket = platform.bindunixsocket
74 bindunixsocket = platform.bindunixsocket
75 cachestat = platform.cachestat
75 cachestat = platform.cachestat
76 checkexec = platform.checkexec
76 checkexec = platform.checkexec
77 checklink = platform.checklink
77 checklink = platform.checklink
78 copymode = platform.copymode
78 copymode = platform.copymode
79 executablepath = platform.executablepath
79 executablepath = platform.executablepath
80 expandglobs = platform.expandglobs
80 expandglobs = platform.expandglobs
81 explainexit = platform.explainexit
81 explainexit = platform.explainexit
82 findexe = platform.findexe
82 findexe = platform.findexe
83 gethgcmd = platform.gethgcmd
83 gethgcmd = platform.gethgcmd
84 getuser = platform.getuser
84 getuser = platform.getuser
85 getpid = os.getpid
85 getpid = os.getpid
86 groupmembers = platform.groupmembers
86 groupmembers = platform.groupmembers
87 groupname = platform.groupname
87 groupname = platform.groupname
88 hidewindow = platform.hidewindow
88 hidewindow = platform.hidewindow
89 isexec = platform.isexec
89 isexec = platform.isexec
90 isowner = platform.isowner
90 isowner = platform.isowner
91 localpath = platform.localpath
91 localpath = platform.localpath
92 lookupreg = platform.lookupreg
92 lookupreg = platform.lookupreg
93 makedir = platform.makedir
93 makedir = platform.makedir
94 nlinks = platform.nlinks
94 nlinks = platform.nlinks
95 normpath = platform.normpath
95 normpath = platform.normpath
96 normcase = platform.normcase
96 normcase = platform.normcase
97 normcasespec = platform.normcasespec
97 normcasespec = platform.normcasespec
98 normcasefallback = platform.normcasefallback
98 normcasefallback = platform.normcasefallback
99 openhardlinks = platform.openhardlinks
99 openhardlinks = platform.openhardlinks
100 oslink = platform.oslink
100 oslink = platform.oslink
101 parsepatchoutput = platform.parsepatchoutput
101 parsepatchoutput = platform.parsepatchoutput
102 pconvert = platform.pconvert
102 pconvert = platform.pconvert
103 poll = platform.poll
103 poll = platform.poll
104 popen = platform.popen
104 popen = platform.popen
105 posixfile = platform.posixfile
105 posixfile = platform.posixfile
106 quotecommand = platform.quotecommand
106 quotecommand = platform.quotecommand
107 readpipe = platform.readpipe
107 readpipe = platform.readpipe
108 rename = platform.rename
108 rename = platform.rename
109 removedirs = platform.removedirs
109 removedirs = platform.removedirs
110 samedevice = platform.samedevice
110 samedevice = platform.samedevice
111 samefile = platform.samefile
111 samefile = platform.samefile
112 samestat = platform.samestat
112 samestat = platform.samestat
113 setbinary = platform.setbinary
113 setbinary = platform.setbinary
114 setflags = platform.setflags
114 setflags = platform.setflags
115 setsignalhandler = platform.setsignalhandler
115 setsignalhandler = platform.setsignalhandler
116 shellquote = platform.shellquote
116 shellquote = platform.shellquote
117 spawndetached = platform.spawndetached
117 spawndetached = platform.spawndetached
118 split = platform.split
118 split = platform.split
119 sshargs = platform.sshargs
119 sshargs = platform.sshargs
120 statfiles = getattr(osutil, 'statfiles', platform.statfiles)
120 statfiles = getattr(osutil, 'statfiles', platform.statfiles)
121 statisexec = platform.statisexec
121 statisexec = platform.statisexec
122 statislink = platform.statislink
122 statislink = platform.statislink
123 termwidth = platform.termwidth
123 termwidth = platform.termwidth
124 testpid = platform.testpid
124 testpid = platform.testpid
125 umask = platform.umask
125 umask = platform.umask
126 unlink = platform.unlink
126 unlink = platform.unlink
127 unlinkpath = platform.unlinkpath
127 unlinkpath = platform.unlinkpath
128 username = platform.username
128 username = platform.username
129
129
130 # Python compatibility
130 # Python compatibility
131
131
132 _notset = object()
132 _notset = object()
133
133
134 # disable Python's problematic floating point timestamps (issue4836)
134 # disable Python's problematic floating point timestamps (issue4836)
135 # (Python hypocritically says you shouldn't change this behavior in
135 # (Python hypocritically says you shouldn't change this behavior in
136 # libraries, and sure enough Mercurial is not a library.)
136 # libraries, and sure enough Mercurial is not a library.)
137 os.stat_float_times(False)
137 os.stat_float_times(False)
138
138
139 def safehasattr(thing, attr):
139 def safehasattr(thing, attr):
140 return getattr(thing, attr, _notset) is not _notset
140 return getattr(thing, attr, _notset) is not _notset
141
141
142 DIGESTS = {
142 DIGESTS = {
143 'md5': hashlib.md5,
143 'md5': hashlib.md5,
144 'sha1': hashlib.sha1,
144 'sha1': hashlib.sha1,
145 'sha512': hashlib.sha512,
145 'sha512': hashlib.sha512,
146 }
146 }
147 # List of digest types from strongest to weakest
147 # List of digest types from strongest to weakest
148 DIGESTS_BY_STRENGTH = ['sha512', 'sha1', 'md5']
148 DIGESTS_BY_STRENGTH = ['sha512', 'sha1', 'md5']
149
149
150 for k in DIGESTS_BY_STRENGTH:
150 for k in DIGESTS_BY_STRENGTH:
151 assert k in DIGESTS
151 assert k in DIGESTS
152
152
153 class digester(object):
153 class digester(object):
154 """helper to compute digests.
154 """helper to compute digests.
155
155
156 This helper can be used to compute one or more digests given their name.
156 This helper can be used to compute one or more digests given their name.
157
157
158 >>> d = digester(['md5', 'sha1'])
158 >>> d = digester(['md5', 'sha1'])
159 >>> d.update('foo')
159 >>> d.update('foo')
160 >>> [k for k in sorted(d)]
160 >>> [k for k in sorted(d)]
161 ['md5', 'sha1']
161 ['md5', 'sha1']
162 >>> d['md5']
162 >>> d['md5']
163 'acbd18db4cc2f85cedef654fccc4a4d8'
163 'acbd18db4cc2f85cedef654fccc4a4d8'
164 >>> d['sha1']
164 >>> d['sha1']
165 '0beec7b5ea3f0fdbc95d0dd47f3c5bc275da8a33'
165 '0beec7b5ea3f0fdbc95d0dd47f3c5bc275da8a33'
166 >>> digester.preferred(['md5', 'sha1'])
166 >>> digester.preferred(['md5', 'sha1'])
167 'sha1'
167 'sha1'
168 """
168 """
169
169
170 def __init__(self, digests, s=''):
170 def __init__(self, digests, s=''):
171 self._hashes = {}
171 self._hashes = {}
172 for k in digests:
172 for k in digests:
173 if k not in DIGESTS:
173 if k not in DIGESTS:
174 raise Abort(_('unknown digest type: %s') % k)
174 raise Abort(_('unknown digest type: %s') % k)
175 self._hashes[k] = DIGESTS[k]()
175 self._hashes[k] = DIGESTS[k]()
176 if s:
176 if s:
177 self.update(s)
177 self.update(s)
178
178
179 def update(self, data):
179 def update(self, data):
180 for h in self._hashes.values():
180 for h in self._hashes.values():
181 h.update(data)
181 h.update(data)
182
182
183 def __getitem__(self, key):
183 def __getitem__(self, key):
184 if key not in DIGESTS:
184 if key not in DIGESTS:
185 raise Abort(_('unknown digest type: %s') % k)
185 raise Abort(_('unknown digest type: %s') % k)
186 return self._hashes[key].hexdigest()
186 return self._hashes[key].hexdigest()
187
187
188 def __iter__(self):
188 def __iter__(self):
189 return iter(self._hashes)
189 return iter(self._hashes)
190
190
191 @staticmethod
191 @staticmethod
192 def preferred(supported):
192 def preferred(supported):
193 """returns the strongest digest type in both supported and DIGESTS."""
193 """returns the strongest digest type in both supported and DIGESTS."""
194
194
195 for k in DIGESTS_BY_STRENGTH:
195 for k in DIGESTS_BY_STRENGTH:
196 if k in supported:
196 if k in supported:
197 return k
197 return k
198 return None
198 return None
199
199
200 class digestchecker(object):
200 class digestchecker(object):
201 """file handle wrapper that additionally checks content against a given
201 """file handle wrapper that additionally checks content against a given
202 size and digests.
202 size and digests.
203
203
204 d = digestchecker(fh, size, {'md5': '...'})
204 d = digestchecker(fh, size, {'md5': '...'})
205
205
206 When multiple digests are given, all of them are validated.
206 When multiple digests are given, all of them are validated.
207 """
207 """
208
208
209 def __init__(self, fh, size, digests):
209 def __init__(self, fh, size, digests):
210 self._fh = fh
210 self._fh = fh
211 self._size = size
211 self._size = size
212 self._got = 0
212 self._got = 0
213 self._digests = dict(digests)
213 self._digests = dict(digests)
214 self._digester = digester(self._digests.keys())
214 self._digester = digester(self._digests.keys())
215
215
216 def read(self, length=-1):
216 def read(self, length=-1):
217 content = self._fh.read(length)
217 content = self._fh.read(length)
218 self._digester.update(content)
218 self._digester.update(content)
219 self._got += len(content)
219 self._got += len(content)
220 return content
220 return content
221
221
222 def validate(self):
222 def validate(self):
223 if self._size != self._got:
223 if self._size != self._got:
224 raise Abort(_('size mismatch: expected %d, got %d') %
224 raise Abort(_('size mismatch: expected %d, got %d') %
225 (self._size, self._got))
225 (self._size, self._got))
226 for k, v in self._digests.items():
226 for k, v in self._digests.items():
227 if v != self._digester[k]:
227 if v != self._digester[k]:
228 # i18n: first parameter is a digest name
228 # i18n: first parameter is a digest name
229 raise Abort(_('%s mismatch: expected %s, got %s') %
229 raise Abort(_('%s mismatch: expected %s, got %s') %
230 (k, v, self._digester[k]))
230 (k, v, self._digester[k]))
231
231
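A hedged usage sketch of digestchecker's contract: wrap a readable file object with the expected size and hex digests, drain it through read(), and validation only passes when both the byte count and every digest match. The standalone analogue below mirrors what the class does internally, using only hashlib and io:

    import hashlib
    import io

    payload = b'some bundle content'
    expected = {'sha1': hashlib.sha1(payload).hexdigest()}

    # what digestchecker does internally: hash while reading, then compare
    fh = io.BytesIO(payload)
    h, got = hashlib.sha1(), 0
    while True:
        chunk = fh.read(4096)
        if not chunk:
            break
        h.update(chunk)
        got += len(chunk)
    assert got == len(payload)                # the size check
    assert h.hexdigest() == expected['sha1']  # the digest check
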
 try:
     buffer = buffer
 except NameError:
-    if sys.version_info[0] < 3:
+    if not pycompat.ispy3:
         def buffer(sliceable, offset=0):
             return sliceable[offset:]
     else:
         def buffer(sliceable, offset=0):
             return memoryview(sliceable)[offset:]

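The Python 3 branch of the buffer shim above returns an offset memoryview rather than a sliced copy, so it stays zero-copy like the old builtin. A hedged standalone check of that behavior:

    data = bytearray(b'0123456789')
    view = memoryview(data)[4:]      # an offset view, not a copy
    assert view.tobytes() == b'456789'
    data[4] = ord(b'X')              # a write to the source shows through
    assert view.tobytes() == b'X56789'
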
242 closefds = os.name == 'posix'
242 closefds = os.name == 'posix'
243
243
244 _chunksize = 4096
244 _chunksize = 4096
245
245
246 class bufferedinputpipe(object):
246 class bufferedinputpipe(object):
247 """a manually buffered input pipe
247 """a manually buffered input pipe
248
248
249 Python will not let us use buffered IO and lazy reading with 'polling' at
249 Python will not let us use buffered IO and lazy reading with 'polling' at
250 the same time. We cannot probe the buffer state, and select will not
250 the same time. We cannot probe the buffer state, and select will not
251 report data as ready to read if it is already buffered.
251 report data as ready to read if it is already buffered.
252
252
253 This class lets us work around that by implementing its own buffering
253 This class lets us work around that by implementing its own buffering
254 (allowing efficient readline) while offering a way to know if the buffer
254 (allowing efficient readline) while offering a way to know if the buffer
255 is empty from the outside (so the buffer can cooperate with polling).
255 is empty from the outside (so the buffer can cooperate with polling).
256
256
257 This class lives in the 'util' module because it makes use of the 'os'
257 This class lives in the 'util' module because it makes use of the 'os'
258 module from the python stdlib.
258 module from the python stdlib.
259 """
259 """
260
260
261 def __init__(self, input):
261 def __init__(self, input):
262 self._input = input
262 self._input = input
263 self._buffer = []
263 self._buffer = []
264 self._eof = False
264 self._eof = False
265 self._lenbuf = 0
265 self._lenbuf = 0
266
266
267 @property
267 @property
268 def hasbuffer(self):
268 def hasbuffer(self):
269 """True is any data is currently buffered
269 """True is any data is currently buffered
270
270
271 This is used externally as a pre-step for polling IO. If there is
271 This is used externally as a pre-step for polling IO. If there is
272 already buffered data then no polling should be set up."""
272 already buffered data then no polling should be set up."""
273 return bool(self._buffer)
273 return bool(self._buffer)
274
274
275 @property
275 @property
276 def closed(self):
276 def closed(self):
277 return self._input.closed
277 return self._input.closed
278
278
279 def fileno(self):
279 def fileno(self):
280 return self._input.fileno()
280 return self._input.fileno()
281
281
282 def close(self):
282 def close(self):
283 return self._input.close()
283 return self._input.close()
284
284
285 def read(self, size):
285 def read(self, size):
286 while (not self._eof) and (self._lenbuf < size):
286 while (not self._eof) and (self._lenbuf < size):
287 self._fillbuffer()
287 self._fillbuffer()
288 return self._frombuffer(size)
288 return self._frombuffer(size)
289
289
290 def readline(self, *args, **kwargs):
290 def readline(self, *args, **kwargs):
291 if 1 < len(self._buffer):
291 if 1 < len(self._buffer):
292 # this should not happen because both read and readline end with a
292 # this should not happen because both read and readline end with a
293 # _frombuffer call that collapses it.
293 # _frombuffer call that collapses it.
294 self._buffer = [''.join(self._buffer)]
294 self._buffer = [''.join(self._buffer)]
295 self._lenbuf = len(self._buffer[0])
295 self._lenbuf = len(self._buffer[0])
296 lfi = -1
296 lfi = -1
297 if self._buffer:
297 if self._buffer:
298 lfi = self._buffer[-1].find('\n')
298 lfi = self._buffer[-1].find('\n')
299 while (not self._eof) and lfi < 0:
299 while (not self._eof) and lfi < 0:
300 self._fillbuffer()
300 self._fillbuffer()
301 if self._buffer:
301 if self._buffer:
302 lfi = self._buffer[-1].find('\n')
302 lfi = self._buffer[-1].find('\n')
303 size = lfi + 1
303 size = lfi + 1
304 if lfi < 0: # end of file
304 if lfi < 0: # end of file
305 size = self._lenbuf
305 size = self._lenbuf
306 elif 1 < len(self._buffer):
306 elif 1 < len(self._buffer):
307 # we need to take previous chunks into account
307 # we need to take previous chunks into account
308 size += self._lenbuf - len(self._buffer[-1])
308 size += self._lenbuf - len(self._buffer[-1])
309 return self._frombuffer(size)
309 return self._frombuffer(size)
310
310
311 def _frombuffer(self, size):
311 def _frombuffer(self, size):
312 """return at most 'size' data from the buffer
312 """return at most 'size' data from the buffer
313
313
314 The data are removed from the buffer."""
314 The data are removed from the buffer."""
315 if size == 0 or not self._buffer:
315 if size == 0 or not self._buffer:
316 return ''
316 return ''
317 buf = self._buffer[0]
317 buf = self._buffer[0]
318 if 1 < len(self._buffer):
318 if 1 < len(self._buffer):
319 buf = ''.join(self._buffer)
319 buf = ''.join(self._buffer)
320
320
321 data = buf[:size]
321 data = buf[:size]
322 buf = buf[len(data):]
322 buf = buf[len(data):]
323 if buf:
323 if buf:
324 self._buffer = [buf]
324 self._buffer = [buf]
325 self._lenbuf = len(buf)
325 self._lenbuf = len(buf)
326 else:
326 else:
327 self._buffer = []
327 self._buffer = []
328 self._lenbuf = 0
328 self._lenbuf = 0
329 return data
329 return data
330
330
331 def _fillbuffer(self):
331 def _fillbuffer(self):
332 """read data to the buffer"""
332 """read data to the buffer"""
333 data = os.read(self._input.fileno(), _chunksize)
333 data = os.read(self._input.fileno(), _chunksize)
334 if not data:
334 if not data:
335 self._eof = True
335 self._eof = True
336 else:
336 else:
337 self._lenbuf += len(data)
337 self._lenbuf += len(data)
338 self._buffer.append(data)
338 self._buffer.append(data)
339
339
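# Editor's aside -- a sketch of the polling pattern described above,
# assuming a POSIX pipe and a hypothetical process() consumer: consult
# hasbuffer before select(), since data already sitting in the manual
# buffer would never wake the selector.
def _demo_poll(rawpipe, process):
    import select
    pipe = bufferedinputpipe(rawpipe)
    while not pipe.closed:
        if not pipe.hasbuffer:
            ready = select.select([pipe], [], [], 1.0)[0]
            if not ready:
                break  # nothing arrived within the timeout
        line = pipe.readline()
        if not line:
            break  # EOF
        process(line)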
340 def popen2(cmd, env=None, newlines=False):
340 def popen2(cmd, env=None, newlines=False):
341 # Setting bufsize to -1 lets the system decide the buffer size.
341 # Setting bufsize to -1 lets the system decide the buffer size.
342 # The default for bufsize is 0, meaning unbuffered. This leads to
342 # The default for bufsize is 0, meaning unbuffered. This leads to
343 # poor performance on Mac OS X: http://bugs.python.org/issue4194
343 # poor performance on Mac OS X: http://bugs.python.org/issue4194
344 p = subprocess.Popen(cmd, shell=True, bufsize=-1,
344 p = subprocess.Popen(cmd, shell=True, bufsize=-1,
345 close_fds=closefds,
345 close_fds=closefds,
346 stdin=subprocess.PIPE, stdout=subprocess.PIPE,
346 stdin=subprocess.PIPE, stdout=subprocess.PIPE,
347 universal_newlines=newlines,
347 universal_newlines=newlines,
348 env=env)
348 env=env)
349 return p.stdin, p.stdout
349 return p.stdin, p.stdout
350
350
351 def popen3(cmd, env=None, newlines=False):
351 def popen3(cmd, env=None, newlines=False):
352 stdin, stdout, stderr, p = popen4(cmd, env, newlines)
352 stdin, stdout, stderr, p = popen4(cmd, env, newlines)
353 return stdin, stdout, stderr
353 return stdin, stdout, stderr
354
354
355 def popen4(cmd, env=None, newlines=False, bufsize=-1):
355 def popen4(cmd, env=None, newlines=False, bufsize=-1):
356 p = subprocess.Popen(cmd, shell=True, bufsize=bufsize,
356 p = subprocess.Popen(cmd, shell=True, bufsize=bufsize,
357 close_fds=closefds,
357 close_fds=closefds,
358 stdin=subprocess.PIPE, stdout=subprocess.PIPE,
358 stdin=subprocess.PIPE, stdout=subprocess.PIPE,
359 stderr=subprocess.PIPE,
359 stderr=subprocess.PIPE,
360 universal_newlines=newlines,
360 universal_newlines=newlines,
361 env=env)
361 env=env)
362 return p.stdin, p.stdout, p.stderr, p
362 return p.stdin, p.stdout, p.stderr, p
363
363
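# Editor's aside -- minimal usage sketch, POSIX `tr` assumed: popen4
# exposes all three streams plus the Popen object, so callers can
# collect the exit status themselves.
def _demo_popen4():
    stdin, stdout, stderr, p = popen4('tr a-z A-Z')
    stdin.write('hello\n')
    stdin.close()
    out = stdout.read()       # 'HELLO\n'
    err = stderr.read()       # ''
    return out, err, p.wait()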
364 def version():
364 def version():
365 """Return version information if available."""
365 """Return version information if available."""
366 try:
366 try:
367 from . import __version__
367 from . import __version__
368 return __version__.version
368 return __version__.version
369 except ImportError:
369 except ImportError:
370 return 'unknown'
370 return 'unknown'
371
371
372 def versiontuple(v=None, n=4):
372 def versiontuple(v=None, n=4):
373 """Parses a Mercurial version string into an N-tuple.
373 """Parses a Mercurial version string into an N-tuple.
374
374
375 The version string to be parsed is specified with the ``v`` argument.
375 The version string to be parsed is specified with the ``v`` argument.
376 If it isn't defined, the current Mercurial version string will be parsed.
376 If it isn't defined, the current Mercurial version string will be parsed.
377
377
378 ``n`` can be 2, 3, or 4. Here is how some version strings map to
378 ``n`` can be 2, 3, or 4. Here is how some version strings map to
379 returned values:
379 returned values:
380
380
381 >>> v = '3.6.1+190-df9b73d2d444'
381 >>> v = '3.6.1+190-df9b73d2d444'
382 >>> versiontuple(v, 2)
382 >>> versiontuple(v, 2)
383 (3, 6)
383 (3, 6)
384 >>> versiontuple(v, 3)
384 >>> versiontuple(v, 3)
385 (3, 6, 1)
385 (3, 6, 1)
386 >>> versiontuple(v, 4)
386 >>> versiontuple(v, 4)
387 (3, 6, 1, '190-df9b73d2d444')
387 (3, 6, 1, '190-df9b73d2d444')
388
388
389 >>> versiontuple('3.6.1+190-df9b73d2d444+20151118')
389 >>> versiontuple('3.6.1+190-df9b73d2d444+20151118')
390 (3, 6, 1, '190-df9b73d2d444+20151118')
390 (3, 6, 1, '190-df9b73d2d444+20151118')
391
391
392 >>> v = '3.6'
392 >>> v = '3.6'
393 >>> versiontuple(v, 2)
393 >>> versiontuple(v, 2)
394 (3, 6)
394 (3, 6)
395 >>> versiontuple(v, 3)
395 >>> versiontuple(v, 3)
396 (3, 6, None)
396 (3, 6, None)
397 >>> versiontuple(v, 4)
397 >>> versiontuple(v, 4)
398 (3, 6, None, None)
398 (3, 6, None, None)
399
399
400 >>> v = '3.9-rc'
400 >>> v = '3.9-rc'
401 >>> versiontuple(v, 2)
401 >>> versiontuple(v, 2)
402 (3, 9)
402 (3, 9)
403 >>> versiontuple(v, 3)
403 >>> versiontuple(v, 3)
404 (3, 9, None)
404 (3, 9, None)
405 >>> versiontuple(v, 4)
405 >>> versiontuple(v, 4)
406 (3, 9, None, 'rc')
406 (3, 9, None, 'rc')
407
407
408 >>> v = '3.9-rc+2-02a8fea4289b'
408 >>> v = '3.9-rc+2-02a8fea4289b'
409 >>> versiontuple(v, 2)
409 >>> versiontuple(v, 2)
410 (3, 9)
410 (3, 9)
411 >>> versiontuple(v, 3)
411 >>> versiontuple(v, 3)
412 (3, 9, None)
412 (3, 9, None)
413 >>> versiontuple(v, 4)
413 >>> versiontuple(v, 4)
414 (3, 9, None, 'rc+2-02a8fea4289b')
414 (3, 9, None, 'rc+2-02a8fea4289b')
415 """
415 """
416 if not v:
416 if not v:
417 v = version()
417 v = version()
418 parts = remod.split(r'[\+-]', v, 1)
418 parts = remod.split(r'[\+-]', v, 1)
419 if len(parts) == 1:
419 if len(parts) == 1:
420 vparts, extra = parts[0], None
420 vparts, extra = parts[0], None
421 else:
421 else:
422 vparts, extra = parts
422 vparts, extra = parts
423
423
424 vints = []
424 vints = []
425 for i in vparts.split('.'):
425 for i in vparts.split('.'):
426 try:
426 try:
427 vints.append(int(i))
427 vints.append(int(i))
428 except ValueError:
428 except ValueError:
429 break
429 break
430 # (3, 6) -> (3, 6, None)
430 # (3, 6) -> (3, 6, None)
431 while len(vints) < 3:
431 while len(vints) < 3:
432 vints.append(None)
432 vints.append(None)
433
433
434 if n == 2:
434 if n == 2:
435 return (vints[0], vints[1])
435 return (vints[0], vints[1])
436 if n == 3:
436 if n == 3:
437 return (vints[0], vints[1], vints[2])
437 return (vints[0], vints[1], vints[2])
438 if n == 4:
438 if n == 4:
439 return (vints[0], vints[1], vints[2], extra)
439 return (vints[0], vints[1], vints[2], extra)
440
440
441 # used by parsedate
441 # used by parsedate
442 defaultdateformats = (
442 defaultdateformats = (
443 '%Y-%m-%dT%H:%M:%S', # the 'real' ISO8601
443 '%Y-%m-%dT%H:%M:%S', # the 'real' ISO8601
444 '%Y-%m-%dT%H:%M', # without seconds
444 '%Y-%m-%dT%H:%M', # without seconds
445 '%Y-%m-%dT%H%M%S', # another awful but legal variant without :
445 '%Y-%m-%dT%H%M%S', # another awful but legal variant without :
446 '%Y-%m-%dT%H%M', # without seconds
446 '%Y-%m-%dT%H%M', # without seconds
447 '%Y-%m-%d %H:%M:%S', # our common legal variant
447 '%Y-%m-%d %H:%M:%S', # our common legal variant
448 '%Y-%m-%d %H:%M', # without seconds
448 '%Y-%m-%d %H:%M', # without seconds
449 '%Y-%m-%d %H%M%S', # without :
449 '%Y-%m-%d %H%M%S', # without :
450 '%Y-%m-%d %H%M', # without seconds
450 '%Y-%m-%d %H%M', # without seconds
451 '%Y-%m-%d %I:%M:%S%p',
451 '%Y-%m-%d %I:%M:%S%p',
452 '%Y-%m-%d %H:%M',
452 '%Y-%m-%d %H:%M',
453 '%Y-%m-%d %I:%M%p',
453 '%Y-%m-%d %I:%M%p',
454 '%Y-%m-%d',
454 '%Y-%m-%d',
455 '%m-%d',
455 '%m-%d',
456 '%m/%d',
456 '%m/%d',
457 '%m/%d/%y',
457 '%m/%d/%y',
458 '%m/%d/%Y',
458 '%m/%d/%Y',
459 '%a %b %d %H:%M:%S %Y',
459 '%a %b %d %H:%M:%S %Y',
460 '%a %b %d %I:%M:%S%p %Y',
460 '%a %b %d %I:%M:%S%p %Y',
461 '%a, %d %b %Y %H:%M:%S', # GNU coreutils "/bin/date --rfc-2822"
461 '%a, %d %b %Y %H:%M:%S', # GNU coreutils "/bin/date --rfc-2822"
462 '%b %d %H:%M:%S %Y',
462 '%b %d %H:%M:%S %Y',
463 '%b %d %I:%M:%S%p %Y',
463 '%b %d %I:%M:%S%p %Y',
464 '%b %d %H:%M:%S',
464 '%b %d %H:%M:%S',
465 '%b %d %I:%M:%S%p',
465 '%b %d %I:%M:%S%p',
466 '%b %d %H:%M',
466 '%b %d %H:%M',
467 '%b %d %I:%M%p',
467 '%b %d %I:%M%p',
468 '%b %d %Y',
468 '%b %d %Y',
469 '%b %d',
469 '%b %d',
470 '%H:%M:%S',
470 '%H:%M:%S',
471 '%I:%M:%S%p',
471 '%I:%M:%S%p',
472 '%H:%M',
472 '%H:%M',
473 '%I:%M%p',
473 '%I:%M%p',
474 )
474 )
475
475
476 extendeddateformats = defaultdateformats + (
476 extendeddateformats = defaultdateformats + (
477 "%Y",
477 "%Y",
478 "%Y-%m",
478 "%Y-%m",
479 "%b",
479 "%b",
480 "%b %Y",
480 "%b %Y",
481 )
481 )
482
482
483 def cachefunc(func):
483 def cachefunc(func):
484 '''cache the result of function calls'''
484 '''cache the result of function calls'''
485 # XXX doesn't handle keywords args
485 # XXX doesn't handle keywords args
486 if func.__code__.co_argcount == 0:
486 if func.__code__.co_argcount == 0:
487 cache = []
487 cache = []
488 def f():
488 def f():
489 if len(cache) == 0:
489 if len(cache) == 0:
490 cache.append(func())
490 cache.append(func())
491 return cache[0]
491 return cache[0]
492 return f
492 return f
493 cache = {}
493 cache = {}
494 if func.__code__.co_argcount == 1:
494 if func.__code__.co_argcount == 1:
495 # we gain a small amount of time because
495 # we gain a small amount of time because
496 # we don't need to pack/unpack the list
496 # we don't need to pack/unpack the list
497 def f(arg):
497 def f(arg):
498 if arg not in cache:
498 if arg not in cache:
499 cache[arg] = func(arg)
499 cache[arg] = func(arg)
500 return cache[arg]
500 return cache[arg]
501 else:
501 else:
502 def f(*args):
502 def f(*args):
503 if args not in cache:
503 if args not in cache:
504 cache[args] = func(*args)
504 cache[args] = func(*args)
505 return cache[args]
505 return cache[args]
506
506
507 return f
507 return f
508
508
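# Editor's aside -- usage sketch: cachefunc memoizes by positional
# arguments only (keyword arguments are unsupported, per the XXX above).
def _demo_cachefunc():
    calls = []
    @cachefunc
    def square(x):
        calls.append(x)
        return x * x
    square(3)
    square(3)                 # second call is served from the cache
    assert calls == [3]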
509 class sortdict(dict):
509 class sortdict(dict):
510 '''a simple insertion-ordered dictionary'''
510 '''a simple insertion-ordered dictionary'''
511 def __init__(self, data=None):
511 def __init__(self, data=None):
512 self._list = []
512 self._list = []
513 if data:
513 if data:
514 self.update(data)
514 self.update(data)
515 def copy(self):
515 def copy(self):
516 return sortdict(self)
516 return sortdict(self)
517 def __setitem__(self, key, val):
517 def __setitem__(self, key, val):
518 if key in self:
518 if key in self:
519 self._list.remove(key)
519 self._list.remove(key)
520 self._list.append(key)
520 self._list.append(key)
521 dict.__setitem__(self, key, val)
521 dict.__setitem__(self, key, val)
522 def __iter__(self):
522 def __iter__(self):
523 return self._list.__iter__()
523 return self._list.__iter__()
524 def update(self, src):
524 def update(self, src):
525 if isinstance(src, dict):
525 if isinstance(src, dict):
526 src = src.iteritems()
526 src = src.iteritems()
527 for k, v in src:
527 for k, v in src:
528 self[k] = v
528 self[k] = v
529 def clear(self):
529 def clear(self):
530 dict.clear(self)
530 dict.clear(self)
531 self._list = []
531 self._list = []
532 def items(self):
532 def items(self):
533 return [(k, self[k]) for k in self._list]
533 return [(k, self[k]) for k in self._list]
534 def __delitem__(self, key):
534 def __delitem__(self, key):
535 dict.__delitem__(self, key)
535 dict.__delitem__(self, key)
536 self._list.remove(key)
536 self._list.remove(key)
537 def pop(self, key, *args, **kwargs):
537 def pop(self, key, *args, **kwargs):
538 try:
538 try:
539 self._list.remove(key)
539 self._list.remove(key)
540 except ValueError:
540 except ValueError:
541 pass
541 pass
542 return dict.pop(self, key, *args, **kwargs)
542 return dict.pop(self, key, *args, **kwargs)
543 def keys(self):
543 def keys(self):
544 return self._list
544 return self._list
545 def iterkeys(self):
545 def iterkeys(self):
546 return self._list.__iter__()
546 return self._list.__iter__()
547 def iteritems(self):
547 def iteritems(self):
548 for k in self._list:
548 for k in self._list:
549 yield k, self[k]
549 yield k, self[k]
550 def insert(self, index, key, val):
550 def insert(self, index, key, val):
551 self._list.insert(index, key)
551 self._list.insert(index, key)
552 dict.__setitem__(self, key, val)
552 dict.__setitem__(self, key, val)
553 def __repr__(self):
553 def __repr__(self):
554 if not self:
554 if not self:
555 return '%s()' % self.__class__.__name__
555 return '%s()' % self.__class__.__name__
556 return '%s(%r)' % (self.__class__.__name__, self.items())
556 return '%s(%r)' % (self.__class__.__name__, self.items())
557
557
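# Editor's aside -- behavior sketch: sortdict keeps keys in insertion
# order, and re-assigning an existing key moves it to the end.
def _demo_sortdict():
    d = sortdict()
    d['b'] = 1
    d['a'] = 2
    d['b'] = 3                          # 'b' moves to the end
    assert d.keys() == ['a', 'b']
    assert d.items() == [('a', 2), ('b', 3)]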
558 class _lrucachenode(object):
558 class _lrucachenode(object):
559 """A node in a doubly linked list.
559 """A node in a doubly linked list.
560
560
561 Holds a reference to nodes on either side as well as a key-value
561 Holds a reference to nodes on either side as well as a key-value
562 pair for the dictionary entry.
562 pair for the dictionary entry.
563 """
563 """
564 __slots__ = ('next', 'prev', 'key', 'value')
564 __slots__ = ('next', 'prev', 'key', 'value')
565
565
566 def __init__(self):
566 def __init__(self):
567 self.next = None
567 self.next = None
568 self.prev = None
568 self.prev = None
569
569
570 self.key = _notset
570 self.key = _notset
571 self.value = None
571 self.value = None
572
572
573 def markempty(self):
573 def markempty(self):
574 """Mark the node as emptied."""
574 """Mark the node as emptied."""
575 self.key = _notset
575 self.key = _notset
576
576
577 class lrucachedict(object):
577 class lrucachedict(object):
578 """Dict that caches most recent accesses and sets.
578 """Dict that caches most recent accesses and sets.
579
579
580 The dict consists of an actual backing dict - indexed by original
580 The dict consists of an actual backing dict - indexed by original
581 key - and a doubly linked circular list defining the order of entries in
581 key - and a doubly linked circular list defining the order of entries in
582 the cache.
582 the cache.
583
583
584 The head node is the newest entry in the cache. If the cache is full,
584 The head node is the newest entry in the cache. If the cache is full,
585 we recycle head.prev and make it the new head. Cache accesses result in
585 we recycle head.prev and make it the new head. Cache accesses result in
586 the node being moved to before the existing head and being marked as the
586 the node being moved to before the existing head and being marked as the
587 new head node.
587 new head node.
588 """
588 """
589 def __init__(self, max):
589 def __init__(self, max):
590 self._cache = {}
590 self._cache = {}
591
591
592 self._head = head = _lrucachenode()
592 self._head = head = _lrucachenode()
593 head.prev = head
593 head.prev = head
594 head.next = head
594 head.next = head
595 self._size = 1
595 self._size = 1
596 self._capacity = max
596 self._capacity = max
597
597
598 def __len__(self):
598 def __len__(self):
599 return len(self._cache)
599 return len(self._cache)
600
600
601 def __contains__(self, k):
601 def __contains__(self, k):
602 return k in self._cache
602 return k in self._cache
603
603
604 def __iter__(self):
604 def __iter__(self):
605 # We don't have to iterate in cache order, but why not.
605 # We don't have to iterate in cache order, but why not.
606 n = self._head
606 n = self._head
607 for i in range(len(self._cache)):
607 for i in range(len(self._cache)):
608 yield n.key
608 yield n.key
609 n = n.next
609 n = n.next
610
610
611 def __getitem__(self, k):
611 def __getitem__(self, k):
612 node = self._cache[k]
612 node = self._cache[k]
613 self._movetohead(node)
613 self._movetohead(node)
614 return node.value
614 return node.value
615
615
616 def __setitem__(self, k, v):
616 def __setitem__(self, k, v):
617 node = self._cache.get(k)
617 node = self._cache.get(k)
618 # Replace existing value and mark as newest.
618 # Replace existing value and mark as newest.
619 if node is not None:
619 if node is not None:
620 node.value = v
620 node.value = v
621 self._movetohead(node)
621 self._movetohead(node)
622 return
622 return
623
623
624 if self._size < self._capacity:
624 if self._size < self._capacity:
625 node = self._addcapacity()
625 node = self._addcapacity()
626 else:
626 else:
627 # Grab the last/oldest item.
627 # Grab the last/oldest item.
628 node = self._head.prev
628 node = self._head.prev
629
629
630 # At capacity. Kill the old entry.
630 # At capacity. Kill the old entry.
631 if node.key is not _notset:
631 if node.key is not _notset:
632 del self._cache[node.key]
632 del self._cache[node.key]
633
633
634 node.key = k
634 node.key = k
635 node.value = v
635 node.value = v
636 self._cache[k] = node
636 self._cache[k] = node
637 # And mark it as newest entry. No need to adjust order since it
637 # And mark it as newest entry. No need to adjust order since it
638 # is already self._head.prev.
638 # is already self._head.prev.
639 self._head = node
639 self._head = node
640
640
641 def __delitem__(self, k):
641 def __delitem__(self, k):
642 node = self._cache.pop(k)
642 node = self._cache.pop(k)
643 node.markempty()
643 node.markempty()
644
644
645 # Temporarily mark as newest item before re-adjusting head to make
645 # Temporarily mark as newest item before re-adjusting head to make
646 # this node the oldest item.
646 # this node the oldest item.
647 self._movetohead(node)
647 self._movetohead(node)
648 self._head = node.next
648 self._head = node.next
649
649
650 # Additional dict methods.
650 # Additional dict methods.
651
651
652 def get(self, k, default=None):
652 def get(self, k, default=None):
653 try:
653 try:
654 return self._cache[k].value
654 return self._cache[k].value
655 except KeyError:
655 except KeyError:
656 return default
656 return default
657
657
658 def clear(self):
658 def clear(self):
659 n = self._head
659 n = self._head
660 while n.key is not _notset:
660 while n.key is not _notset:
661 n.markempty()
661 n.markempty()
662 n = n.next
662 n = n.next
663
663
664 self._cache.clear()
664 self._cache.clear()
665
665
666 def copy(self):
666 def copy(self):
667 result = lrucachedict(self._capacity)
667 result = lrucachedict(self._capacity)
668 n = self._head.prev
668 n = self._head.prev
669 # Iterate in oldest-to-newest order, so the copy has the right ordering
669 # Iterate in oldest-to-newest order, so the copy has the right ordering
670 for i in range(len(self._cache)):
670 for i in range(len(self._cache)):
671 result[n.key] = n.value
671 result[n.key] = n.value
672 n = n.prev
672 n = n.prev
673 return result
673 return result
674
674
675 def _movetohead(self, node):
675 def _movetohead(self, node):
676 """Mark a node as the newest, making it the new head.
676 """Mark a node as the newest, making it the new head.
677
677
678 When a node is accessed, it becomes the freshest entry in the LRU
678 When a node is accessed, it becomes the freshest entry in the LRU
679 list, which is denoted by self._head.
679 list, which is denoted by self._head.
680
680
681 Visually, let's make ``N`` the new head node (* denotes head):
681 Visually, let's make ``N`` the new head node (* denotes head):
682
682
683 previous/oldest <-> head <-> next/next newest
683 previous/oldest <-> head <-> next/next newest
684
684
685 ----<->--- A* ---<->-----
685 ----<->--- A* ---<->-----
686 | |
686 | |
687 E <-> D <-> N <-> C <-> B
687 E <-> D <-> N <-> C <-> B
688
688
689 To:
689 To:
690
690
691 ----<->--- N* ---<->-----
691 ----<->--- N* ---<->-----
692 | |
692 | |
693 E <-> D <-> C <-> B <-> A
693 E <-> D <-> C <-> B <-> A
694
694
695 This requires the following moves:
695 This requires the following moves:
696
696
697 C.next = D (node.prev.next = node.next)
697 C.next = D (node.prev.next = node.next)
698 D.prev = C (node.next.prev = node.prev)
698 D.prev = C (node.next.prev = node.prev)
699 E.next = N (head.prev.next = node)
699 E.next = N (head.prev.next = node)
700 N.prev = E (node.prev = head.prev)
700 N.prev = E (node.prev = head.prev)
701 N.next = A (node.next = head)
701 N.next = A (node.next = head)
702 A.prev = N (head.prev = node)
702 A.prev = N (head.prev = node)
703 """
703 """
704 head = self._head
704 head = self._head
705 # C.next = D
705 # C.next = D
706 node.prev.next = node.next
706 node.prev.next = node.next
707 # D.prev = C
707 # D.prev = C
708 node.next.prev = node.prev
708 node.next.prev = node.prev
709 # N.prev = E
709 # N.prev = E
710 node.prev = head.prev
710 node.prev = head.prev
711 # N.next = A
711 # N.next = A
712 # It is tempting to do just "head" here, however if node is
712 # It is tempting to do just "head" here, however if node is
713 # adjacent to head, this will do bad things.
713 # adjacent to head, this will do bad things.
714 node.next = head.prev.next
714 node.next = head.prev.next
715 # A.prev = N
715 # A.prev = N
716 node.next.prev = node
716 node.next.prev = node
717 # E.next = N
717 # E.next = N
718 node.prev.next = node
718 node.prev.next = node
719
719
720 self._head = node
720 self._head = node
721
721
722 def _addcapacity(self):
722 def _addcapacity(self):
723 """Add a node to the circular linked list.
723 """Add a node to the circular linked list.
724
724
725 The new node is inserted before the head node.
725 The new node is inserted before the head node.
726 """
726 """
727 head = self._head
727 head = self._head
728 node = _lrucachenode()
728 node = _lrucachenode()
729 head.prev.next = node
729 head.prev.next = node
730 node.prev = head.prev
730 node.prev = head.prev
731 node.next = head
731 node.next = head
732 head.prev = node
732 head.prev = node
733 self._size += 1
733 self._size += 1
734 return node
734 return node
735
735
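# Editor's aside -- eviction sketch: with capacity 2, touching 'a'
# makes 'b' the oldest entry, so inserting 'c' recycles b's node.
def _demo_lrucachedict():
    c = lrucachedict(2)
    c['a'] = 1
    c['b'] = 2
    c['a']                    # freshen 'a'
    c['c'] = 3                # evicts 'b'
    assert 'b' not in c
    assert c.get('a') == 1 and c.get('c') == 3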
736 def lrucachefunc(func):
736 def lrucachefunc(func):
737 '''cache most recent results of function calls'''
737 '''cache most recent results of function calls'''
738 cache = {}
738 cache = {}
739 order = collections.deque()
739 order = collections.deque()
740 if func.__code__.co_argcount == 1:
740 if func.__code__.co_argcount == 1:
741 def f(arg):
741 def f(arg):
742 if arg not in cache:
742 if arg not in cache:
743 if len(cache) > 20:
743 if len(cache) > 20:
744 del cache[order.popleft()]
744 del cache[order.popleft()]
745 cache[arg] = func(arg)
745 cache[arg] = func(arg)
746 else:
746 else:
747 order.remove(arg)
747 order.remove(arg)
748 order.append(arg)
748 order.append(arg)
749 return cache[arg]
749 return cache[arg]
750 else:
750 else:
751 def f(*args):
751 def f(*args):
752 if args not in cache:
752 if args not in cache:
753 if len(cache) > 20:
753 if len(cache) > 20:
754 del cache[order.popleft()]
754 del cache[order.popleft()]
755 cache[args] = func(*args)
755 cache[args] = func(*args)
756 else:
756 else:
757 order.remove(args)
757 order.remove(args)
758 order.append(args)
758 order.append(args)
759 return cache[args]
759 return cache[args]
760
760
761 return f
761 return f
762
762
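# Editor's aside -- like cachefunc, but a bounded cache: once roughly
# twenty results accumulate, the least recently used one is discarded.
def _demo_lrucachefunc():
    @lrucachefunc
    def double(x):
        return 2 * x
    for i in range(30):
        double(i)             # early results get evicted along the way
    return double(29)         # still cached; double(0) would recompute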
763 class propertycache(object):
763 class propertycache(object):
764 def __init__(self, func):
764 def __init__(self, func):
765 self.func = func
765 self.func = func
766 self.name = func.__name__
766 self.name = func.__name__
767 def __get__(self, obj, type=None):
767 def __get__(self, obj, type=None):
768 result = self.func(obj)
768 result = self.func(obj)
769 self.cachevalue(obj, result)
769 self.cachevalue(obj, result)
770 return result
770 return result
771
771
772 def cachevalue(self, obj, value):
772 def cachevalue(self, obj, value):
773 # __dict__ assignment required to bypass __setattr__ (eg: repoview)
773 # __dict__ assignment required to bypass __setattr__ (eg: repoview)
774 obj.__dict__[self.name] = value
774 obj.__dict__[self.name] = value
775
775
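# Editor's aside -- propertycache is a non-data descriptor: the first
# access computes the value and stores it in the instance __dict__, so
# later accesses bypass the descriptor entirely. _demoowner is ours.
class _demoowner(object):
    @propertycache
    def answer(self):
        return 6 * 7          # computed once per instance

# _demoowner().answer -> 42, thereafter a plain instance attribute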
776 def pipefilter(s, cmd):
776 def pipefilter(s, cmd):
777 '''filter string S through command CMD, returning its output'''
777 '''filter string S through command CMD, returning its output'''
778 p = subprocess.Popen(cmd, shell=True, close_fds=closefds,
778 p = subprocess.Popen(cmd, shell=True, close_fds=closefds,
779 stdin=subprocess.PIPE, stdout=subprocess.PIPE)
779 stdin=subprocess.PIPE, stdout=subprocess.PIPE)
780 pout, perr = p.communicate(s)
780 pout, perr = p.communicate(s)
781 return pout
781 return pout
782
782
783 def tempfilter(s, cmd):
783 def tempfilter(s, cmd):
784 '''filter string S through a pair of temporary files with CMD.
784 '''filter string S through a pair of temporary files with CMD.
785 CMD is used as a template to create the real command to be run,
785 CMD is used as a template to create the real command to be run,
786 with the strings INFILE and OUTFILE replaced by the real names of
786 with the strings INFILE and OUTFILE replaced by the real names of
787 the temporary files generated.'''
787 the temporary files generated.'''
788 inname, outname = None, None
788 inname, outname = None, None
789 try:
789 try:
790 infd, inname = tempfile.mkstemp(prefix='hg-filter-in-')
790 infd, inname = tempfile.mkstemp(prefix='hg-filter-in-')
791 fp = os.fdopen(infd, 'wb')
791 fp = os.fdopen(infd, 'wb')
792 fp.write(s)
792 fp.write(s)
793 fp.close()
793 fp.close()
794 outfd, outname = tempfile.mkstemp(prefix='hg-filter-out-')
794 outfd, outname = tempfile.mkstemp(prefix='hg-filter-out-')
795 os.close(outfd)
795 os.close(outfd)
796 cmd = cmd.replace('INFILE', inname)
796 cmd = cmd.replace('INFILE', inname)
797 cmd = cmd.replace('OUTFILE', outname)
797 cmd = cmd.replace('OUTFILE', outname)
798 code = os.system(cmd)
798 code = os.system(cmd)
799 if sys.platform == 'OpenVMS' and code & 1:
799 if sys.platform == 'OpenVMS' and code & 1:
800 code = 0
800 code = 0
801 if code:
801 if code:
802 raise Abort(_("command '%s' failed: %s") %
802 raise Abort(_("command '%s' failed: %s") %
803 (cmd, explainexit(code)[0]))
803 (cmd, explainexit(code)[0]))
804 return readfile(outname)
804 return readfile(outname)
805 finally:
805 finally:
806 try:
806 try:
807 if inname:
807 if inname:
808 os.unlink(inname)
808 os.unlink(inname)
809 except OSError:
809 except OSError:
810 pass
810 pass
811 try:
811 try:
812 if outname:
812 if outname:
813 os.unlink(outname)
813 os.unlink(outname)
814 except OSError:
814 except OSError:
815 pass
815 pass
816
816
817 filtertable = {
817 filtertable = {
818 'tempfile:': tempfilter,
818 'tempfile:': tempfilter,
819 'pipe:': pipefilter,
819 'pipe:': pipefilter,
820 }
820 }
821
821
822 def filter(s, cmd):
822 def filter(s, cmd):
823 "filter a string through a command that transforms its input to its output"
823 "filter a string through a command that transforms its input to its output"
824 for name, fn in filtertable.iteritems():
824 for name, fn in filtertable.iteritems():
825 if cmd.startswith(name):
825 if cmd.startswith(name):
826 return fn(s, cmd[len(name):].lstrip())
826 return fn(s, cmd[len(name):].lstrip())
827 return pipefilter(s, cmd)
827 return pipefilter(s, cmd)
828
828
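# Editor's aside -- dispatch sketch, POSIX `tr` assumed: a 'tempfile:'
# or 'pipe:' prefix picks the strategy explicitly; anything else falls
# through to pipefilter.
def _demo_filter():
    upper = filter('hello\n', 'pipe: tr a-z A-Z')   # -> 'HELLO\n'
    same = filter('hello\n', 'tr a-z A-Z')          # same, implicitly
    return upper, same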
829 def binary(s):
829 def binary(s):
830 """return true if a string is binary data"""
830 """return true if a string is binary data"""
831 return bool(s and '\0' in s)
831 return bool(s and '\0' in s)
832
832
833 def increasingchunks(source, min=1024, max=65536):
833 def increasingchunks(source, min=1024, max=65536):
834 '''return no less than min bytes per chunk while data remains,
834 '''return no less than min bytes per chunk while data remains,
835 doubling min after each chunk until it reaches max'''
835 doubling min after each chunk until it reaches max'''
836 def log2(x):
836 def log2(x):
837 if not x:
837 if not x:
838 return 0
838 return 0
839 i = 0
839 i = 0
840 while x:
840 while x:
841 x >>= 1
841 x >>= 1
842 i += 1
842 i += 1
843 return i - 1
843 return i - 1
844
844
845 buf = []
845 buf = []
846 blen = 0
846 blen = 0
847 for chunk in source:
847 for chunk in source:
848 buf.append(chunk)
848 buf.append(chunk)
849 blen += len(chunk)
849 blen += len(chunk)
850 if blen >= min:
850 if blen >= min:
851 if min < max:
851 if min < max:
852 min = min << 1
852 min = min << 1
853 nmin = 1 << log2(blen)
853 nmin = 1 << log2(blen)
854 if nmin > min:
854 if nmin > min:
855 min = nmin
855 min = nmin
856 if min > max:
856 if min > max:
857 min = max
857 min = max
858 yield ''.join(buf)
858 yield ''.join(buf)
859 blen = 0
859 blen = 0
860 buf = []
860 buf = []
861 if buf:
861 if buf:
862 yield ''.join(buf)
862 yield ''.join(buf)
863
863
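# Editor's aside -- behavior sketch: many small chunks are coalesced
# into progressively larger ones, keeping per-chunk overhead low when
# streaming lots of data.
def _demo_increasingchunks():
    pieces = ['x' * 100] * 100                 # 10,000 bytes total
    sizes = [len(c) for c in increasingchunks(pieces)]
    # e.g. [1100, 2100, 4100, 2700]: every chunk but the last honors
    # the doubling minimum
    assert all(s >= 1024 for s in sizes[:-1])
    return sizes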
864 Abort = error.Abort
864 Abort = error.Abort
865
865
866 def always(fn):
866 def always(fn):
867 return True
867 return True
868
868
869 def never(fn):
869 def never(fn):
870 return False
870 return False
871
871
872 def nogc(func):
872 def nogc(func):
873 """disable garbage collector
873 """disable garbage collector
874
874
875 Python's garbage collector triggers a GC each time a certain number of
875 Python's garbage collector triggers a GC each time a certain number of
876 container objects (the number being defined by gc.get_threshold()) are
876 container objects (the number being defined by gc.get_threshold()) are
877 allocated even when marked not to be tracked by the collector. Tracking has
877 allocated even when marked not to be tracked by the collector. Tracking has
878 no effect on when GCs are triggered, only on what objects the GC looks
878 no effect on when GCs are triggered, only on what objects the GC looks
879 into. As a workaround, disable GC while building complex (huge)
879 into. As a workaround, disable GC while building complex (huge)
880 containers.
880 containers.
881
881
882 This garbage collector issue has been fixed in Python 2.7.
882 This garbage collector issue has been fixed in Python 2.7.
883 """
883 """
884 if sys.version_info >= (2, 7):
884 if sys.version_info >= (2, 7):
885 return func
885 return func
886 def wrapper(*args, **kwargs):
886 def wrapper(*args, **kwargs):
887 gcenabled = gc.isenabled()
887 gcenabled = gc.isenabled()
888 gc.disable()
888 gc.disable()
889 try:
889 try:
890 return func(*args, **kwargs)
890 return func(*args, **kwargs)
891 finally:
891 finally:
892 if gcenabled:
892 if gcenabled:
893 gc.enable()
893 gc.enable()
894 return wrapper
894 return wrapper
895
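# Editor's aside -- usage sketch: decorate container-heavy constructors
# so pre-2.7 interpreters do not pause in the collector while building.
@nogc
def _demo_buildindex(items):
    return dict(enumerate(items))   # GC stays disabled for the call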
895
896 def pathto(root, n1, n2):
896 def pathto(root, n1, n2):
897 '''return the relative path from one place to another.
897 '''return the relative path from one place to another.
898 root should use os.sep to separate directories
898 root should use os.sep to separate directories
899 n1 should use os.sep to separate directories
899 n1 should use os.sep to separate directories
900 n2 should use "/" to separate directories
900 n2 should use "/" to separate directories
901 returns an os.sep-separated path.
901 returns an os.sep-separated path.
902
902
903 If n1 is a relative path, it's assumed it's
903 If n1 is a relative path, it's assumed it's
904 relative to root.
904 relative to root.
905 n2 should always be relative to root.
905 n2 should always be relative to root.
906 '''
906 '''
907 if not n1:
907 if not n1:
908 return localpath(n2)
908 return localpath(n2)
909 if os.path.isabs(n1):
909 if os.path.isabs(n1):
910 if os.path.splitdrive(root)[0] != os.path.splitdrive(n1)[0]:
910 if os.path.splitdrive(root)[0] != os.path.splitdrive(n1)[0]:
911 return os.path.join(root, localpath(n2))
911 return os.path.join(root, localpath(n2))
912 n2 = '/'.join((pconvert(root), n2))
912 n2 = '/'.join((pconvert(root), n2))
913 a, b = splitpath(n1), n2.split('/')
913 a, b = splitpath(n1), n2.split('/')
914 a.reverse()
914 a.reverse()
915 b.reverse()
915 b.reverse()
916 while a and b and a[-1] == b[-1]:
916 while a and b and a[-1] == b[-1]:
917 a.pop()
917 a.pop()
918 b.pop()
918 b.pop()
919 b.reverse()
919 b.reverse()
920 return os.sep.join((['..'] * len(a)) + b) or '.'
920 return os.sep.join((['..'] * len(a)) + b) or '.'
921
921
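# Editor's aside -- worked example, assuming POSIX separators
# (os.sep == '/'): going from 'a/b' to 'a/c/d' under '/repo' means up
# one level, then down into c/d.
def _demo_pathto():
    assert pathto('/repo', 'a/b', 'a/c/d') == '../c/d'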
922 def mainfrozen():
922 def mainfrozen():
923 """return True if we are a frozen executable.
923 """return True if we are a frozen executable.
924
924
925 The code supports py2exe (most common, Windows only) and tools/freeze
925 The code supports py2exe (most common, Windows only) and tools/freeze
926 (portable, not much used).
926 (portable, not much used).
927 """
927 """
928 return (safehasattr(sys, "frozen") or # new py2exe
928 return (safehasattr(sys, "frozen") or # new py2exe
929 safehasattr(sys, "importers") or # old py2exe
929 safehasattr(sys, "importers") or # old py2exe
930 imp.is_frozen("__main__")) # tools/freeze
930 imp.is_frozen("__main__")) # tools/freeze
931
931
932 # the location of data files matching the source code
932 # the location of data files matching the source code
933 if mainfrozen() and getattr(sys, 'frozen', None) != 'macosx_app':
933 if mainfrozen() and getattr(sys, 'frozen', None) != 'macosx_app':
934 # executable version (py2exe) doesn't support __file__
934 # executable version (py2exe) doesn't support __file__
935 datapath = os.path.dirname(sys.executable)
935 datapath = os.path.dirname(sys.executable)
936 else:
936 else:
937 datapath = os.path.dirname(__file__)
937 datapath = os.path.dirname(__file__)
938
938
939 i18n.setdatapath(datapath)
939 i18n.setdatapath(datapath)
940
940
941 _hgexecutable = None
941 _hgexecutable = None
942
942
943 def hgexecutable():
943 def hgexecutable():
944 """return location of the 'hg' executable.
944 """return location of the 'hg' executable.
945
945
946 Defaults to $HG or 'hg' in the search path.
946 Defaults to $HG or 'hg' in the search path.
947 """
947 """
948 if _hgexecutable is None:
948 if _hgexecutable is None:
949 hg = os.environ.get('HG')
949 hg = os.environ.get('HG')
950 mainmod = sys.modules['__main__']
950 mainmod = sys.modules['__main__']
951 if hg:
951 if hg:
952 _sethgexecutable(hg)
952 _sethgexecutable(hg)
953 elif mainfrozen():
953 elif mainfrozen():
954 if getattr(sys, 'frozen', None) == 'macosx_app':
954 if getattr(sys, 'frozen', None) == 'macosx_app':
955 # Env variable set by py2app
955 # Env variable set by py2app
956 _sethgexecutable(os.environ['EXECUTABLEPATH'])
956 _sethgexecutable(os.environ['EXECUTABLEPATH'])
957 else:
957 else:
958 _sethgexecutable(sys.executable)
958 _sethgexecutable(sys.executable)
959 elif os.path.basename(getattr(mainmod, '__file__', '')) == 'hg':
959 elif os.path.basename(getattr(mainmod, '__file__', '')) == 'hg':
960 _sethgexecutable(mainmod.__file__)
960 _sethgexecutable(mainmod.__file__)
961 else:
961 else:
962 exe = findexe('hg') or os.path.basename(sys.argv[0])
962 exe = findexe('hg') or os.path.basename(sys.argv[0])
963 _sethgexecutable(exe)
963 _sethgexecutable(exe)
964 return _hgexecutable
964 return _hgexecutable
965
965
966 def _sethgexecutable(path):
966 def _sethgexecutable(path):
967 """set location of the 'hg' executable"""
967 """set location of the 'hg' executable"""
968 global _hgexecutable
968 global _hgexecutable
969 _hgexecutable = path
969 _hgexecutable = path
970
970
971 def _isstdout(f):
971 def _isstdout(f):
972 fileno = getattr(f, 'fileno', None)
972 fileno = getattr(f, 'fileno', None)
973 return fileno and fileno() == sys.__stdout__.fileno()
973 return fileno and fileno() == sys.__stdout__.fileno()
974
974
975 def system(cmd, environ=None, cwd=None, onerr=None, errprefix=None, out=None):
975 def system(cmd, environ=None, cwd=None, onerr=None, errprefix=None, out=None):
976 '''enhanced shell command execution.
976 '''enhanced shell command execution.
977 run with environment maybe modified, maybe in different dir.
977 run with environment maybe modified, maybe in different dir.
978
978
979 if command fails and onerr is None, return status, else raise onerr
979 if command fails and onerr is None, return status, else raise onerr
980 object as exception.
980 object as exception.
981
981
982 if out is specified, it is assumed to be a file-like object that has a
982 if out is specified, it is assumed to be a file-like object that has a
983 write() method. stdout and stderr will be redirected to out.'''
983 write() method. stdout and stderr will be redirected to out.'''
984 if environ is None:
984 if environ is None:
985 environ = {}
985 environ = {}
986 try:
986 try:
987 sys.stdout.flush()
987 sys.stdout.flush()
988 except Exception:
988 except Exception:
989 pass
989 pass
990 def py2shell(val):
990 def py2shell(val):
991 'convert python object into string that is useful to shell'
991 'convert python object into string that is useful to shell'
992 if val is None or val is False:
992 if val is None or val is False:
993 return '0'
993 return '0'
994 if val is True:
994 if val is True:
995 return '1'
995 return '1'
996 return str(val)
996 return str(val)
997 origcmd = cmd
997 origcmd = cmd
998 cmd = quotecommand(cmd)
998 cmd = quotecommand(cmd)
999 if sys.platform == 'plan9' and (sys.version_info[0] == 2
999 if sys.platform == 'plan9' and (sys.version_info[0] == 2
1000 and sys.version_info[1] < 7):
1000 and sys.version_info[1] < 7):
1001 # subprocess kludge to work around issues in half-baked Python
1001 # subprocess kludge to work around issues in half-baked Python
1002 # ports, notably bichued/python:
1002 # ports, notably bichued/python:
1003 if cwd is not None:
1003 if cwd is not None:
1004 os.chdir(cwd)
1004 os.chdir(cwd)
1005 rc = os.system(cmd)
1005 rc = os.system(cmd)
1006 else:
1006 else:
1007 env = dict(os.environ)
1007 env = dict(os.environ)
1008 env.update((k, py2shell(v)) for k, v in environ.iteritems())
1008 env.update((k, py2shell(v)) for k, v in environ.iteritems())
1009 env['HG'] = hgexecutable()
1009 env['HG'] = hgexecutable()
1010 if out is None or _isstdout(out):
1010 if out is None or _isstdout(out):
1011 rc = subprocess.call(cmd, shell=True, close_fds=closefds,
1011 rc = subprocess.call(cmd, shell=True, close_fds=closefds,
1012 env=env, cwd=cwd)
1012 env=env, cwd=cwd)
1013 else:
1013 else:
1014 proc = subprocess.Popen(cmd, shell=True, close_fds=closefds,
1014 proc = subprocess.Popen(cmd, shell=True, close_fds=closefds,
1015 env=env, cwd=cwd, stdout=subprocess.PIPE,
1015 env=env, cwd=cwd, stdout=subprocess.PIPE,
1016 stderr=subprocess.STDOUT)
1016 stderr=subprocess.STDOUT)
1017 for line in iter(proc.stdout.readline, ''):
1017 for line in iter(proc.stdout.readline, ''):
1018 out.write(line)
1018 out.write(line)
1019 proc.wait()
1019 proc.wait()
1020 rc = proc.returncode
1020 rc = proc.returncode
1021 if sys.platform == 'OpenVMS' and rc & 1:
1021 if sys.platform == 'OpenVMS' and rc & 1:
1022 rc = 0
1022 rc = 0
1023 if rc and onerr:
1023 if rc and onerr:
1024 errmsg = '%s %s' % (os.path.basename(origcmd.split(None, 1)[0]),
1024 errmsg = '%s %s' % (os.path.basename(origcmd.split(None, 1)[0]),
1025 explainexit(rc)[0])
1025 explainexit(rc)[0])
1026 if errprefix:
1026 if errprefix:
1027 errmsg = '%s: %s' % (errprefix, errmsg)
1027 errmsg = '%s: %s' % (errprefix, errmsg)
1028 raise onerr(errmsg)
1028 raise onerr(errmsg)
1029 return rc
1029 return rc
1030
1030
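# Editor's aside -- usage sketch, POSIX shell assumed: run a command
# with extra environment entries and capture combined stdout/stderr in
# any object that has a write() method.
def _demo_system():
    import io
    buf = io.BytesIO()
    rc = system('echo "$HG_DEMO"', environ={'HG_DEMO': 'hello'}, out=buf)
    return rc, buf.getvalue()       # (0, 'hello\n') on success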
1031 def checksignature(func):
1031 def checksignature(func):
1032 '''wrap a function with code to check for calling errors'''
1032 '''wrap a function with code to check for calling errors'''
1033 def check(*args, **kwargs):
1033 def check(*args, **kwargs):
1034 try:
1034 try:
1035 return func(*args, **kwargs)
1035 return func(*args, **kwargs)
1036 except TypeError:
1036 except TypeError:
1037 if len(traceback.extract_tb(sys.exc_info()[2])) == 1:
1037 if len(traceback.extract_tb(sys.exc_info()[2])) == 1:
1038 raise error.SignatureError
1038 raise error.SignatureError
1039 raise
1039 raise
1040
1040
1041 return check
1041 return check
1042
1042
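# Editor's aside -- behavior sketch: a TypeError raised by the call
# itself (wrong arity) becomes SignatureError, while TypeErrors from
# deeper frames propagate unchanged.
def _demo_checksignature():
    @checksignature
    def twoargs(a, b):
        return a + b
    try:
        twoargs(1)                  # wrong arity
    except error.SignatureError:
        pass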
1043 def copyfile(src, dest, hardlink=False, copystat=False, checkambig=False):
1043 def copyfile(src, dest, hardlink=False, copystat=False, checkambig=False):
1044 '''copy a file, preserving mode and optionally other stat info like
1044 '''copy a file, preserving mode and optionally other stat info like
1045 atime/mtime
1045 atime/mtime
1046
1046
1047 checkambig argument is used with filestat, and is useful only if
1047 checkambig argument is used with filestat, and is useful only if
1048 destination file is guarded by any lock (e.g. repo.lock or
1048 destination file is guarded by any lock (e.g. repo.lock or
1049 repo.wlock).
1049 repo.wlock).
1050
1050
1051 copystat and checkambig should be exclusive.
1051 copystat and checkambig should be exclusive.
1052 '''
1052 '''
1053 assert not (copystat and checkambig)
1053 assert not (copystat and checkambig)
1054 oldstat = None
1054 oldstat = None
1055 if os.path.lexists(dest):
1055 if os.path.lexists(dest):
1056 if checkambig:
1056 if checkambig:
1057 oldstat = filestat(dest)
1057 oldstat = filestat(dest)
1058 unlink(dest)
1058 unlink(dest)
1059 # hardlinks are problematic on CIFS, quietly ignore this flag
1059 # hardlinks are problematic on CIFS, quietly ignore this flag
1060 # until we find a way to work around it cleanly (issue4546)
1060 # until we find a way to work around it cleanly (issue4546)
1061 if False and hardlink:
1061 if False and hardlink:
1062 try:
1062 try:
1063 oslink(src, dest)
1063 oslink(src, dest)
1064 return
1064 return
1065 except (IOError, OSError):
1065 except (IOError, OSError):
1066 pass # fall back to normal copy
1066 pass # fall back to normal copy
1067 if os.path.islink(src):
1067 if os.path.islink(src):
1068 os.symlink(os.readlink(src), dest)
1068 os.symlink(os.readlink(src), dest)
1069 # copytime is ignored for symlinks, but in general copytime isn't needed
1069 # copytime is ignored for symlinks, but in general copytime isn't needed
1070 # for them anyway
1070 # for them anyway
1071 else:
1071 else:
1072 try:
1072 try:
1073 shutil.copyfile(src, dest)
1073 shutil.copyfile(src, dest)
1074 if copystat:
1074 if copystat:
1075 # copystat also copies mode
1075 # copystat also copies mode
1076 shutil.copystat(src, dest)
1076 shutil.copystat(src, dest)
1077 else:
1077 else:
1078 shutil.copymode(src, dest)
1078 shutil.copymode(src, dest)
1079 if oldstat and oldstat.stat:
1079 if oldstat and oldstat.stat:
1080 newstat = filestat(dest)
1080 newstat = filestat(dest)
1081 if newstat.isambig(oldstat):
1081 if newstat.isambig(oldstat):
1082 # stat of copied file is ambiguous to original one
1082 # stat of copied file is ambiguous to original one
1083 advanced = (oldstat.stat.st_mtime + 1) & 0x7fffffff
1083 advanced = (oldstat.stat.st_mtime + 1) & 0x7fffffff
1084 os.utime(dest, (advanced, advanced))
1084 os.utime(dest, (advanced, advanced))
1085 except shutil.Error as inst:
1085 except shutil.Error as inst:
1086 raise Abort(str(inst))
1086 raise Abort(str(inst))
1087
1087
1088 def copyfiles(src, dst, hardlink=None, progress=lambda t, pos: None):
1088 def copyfiles(src, dst, hardlink=None, progress=lambda t, pos: None):
1089 """Copy a directory tree using hardlinks if possible."""
1089 """Copy a directory tree using hardlinks if possible."""
1090 num = 0
1090 num = 0
1091
1091
1092 if hardlink is None:
1092 if hardlink is None:
1093 hardlink = (os.stat(src).st_dev ==
1093 hardlink = (os.stat(src).st_dev ==
1094 os.stat(os.path.dirname(dst)).st_dev)
1094 os.stat(os.path.dirname(dst)).st_dev)
1095 if hardlink:
1095 if hardlink:
1096 topic = _('linking')
1096 topic = _('linking')
1097 else:
1097 else:
1098 topic = _('copying')
1098 topic = _('copying')
1099
1099
1100 if os.path.isdir(src):
1100 if os.path.isdir(src):
1101 os.mkdir(dst)
1101 os.mkdir(dst)
1102 for name, kind in osutil.listdir(src):
1102 for name, kind in osutil.listdir(src):
1103 srcname = os.path.join(src, name)
1103 srcname = os.path.join(src, name)
1104 dstname = os.path.join(dst, name)
1104 dstname = os.path.join(dst, name)
1105 def nprog(t, pos):
1105 def nprog(t, pos):
1106 if pos is not None:
1106 if pos is not None:
1107 return progress(t, pos + num)
1107 return progress(t, pos + num)
1108 hardlink, n = copyfiles(srcname, dstname, hardlink, progress=nprog)
1108 hardlink, n = copyfiles(srcname, dstname, hardlink, progress=nprog)
1109 num += n
1109 num += n
1110 else:
1110 else:
1111 if hardlink:
1111 if hardlink:
1112 try:
1112 try:
1113 oslink(src, dst)
1113 oslink(src, dst)
1114 except (IOError, OSError):
1114 except (IOError, OSError):
1115 hardlink = False
1115 hardlink = False
1116 shutil.copy(src, dst)
1116 shutil.copy(src, dst)
1117 else:
1117 else:
1118 shutil.copy(src, dst)
1118 shutil.copy(src, dst)
1119 num += 1
1119 num += 1
1120 progress(topic, num)
1120 progress(topic, num)
1121 progress(topic, None)
1121 progress(topic, None)
1122
1122
1123 return hardlink, num
1123 return hardlink, num
1124
1124
1125 _winreservednames = '''con prn aux nul
1125 _winreservednames = '''con prn aux nul
1126 com1 com2 com3 com4 com5 com6 com7 com8 com9
1126 com1 com2 com3 com4 com5 com6 com7 com8 com9
1127 lpt1 lpt2 lpt3 lpt4 lpt5 lpt6 lpt7 lpt8 lpt9'''.split()
1127 lpt1 lpt2 lpt3 lpt4 lpt5 lpt6 lpt7 lpt8 lpt9'''.split()
1128 _winreservedchars = ':*?"<>|'
1128 _winreservedchars = ':*?"<>|'
1129 def checkwinfilename(path):
1129 def checkwinfilename(path):
1130 r'''Check that the base-relative path is a valid filename on Windows.
1130 r'''Check that the base-relative path is a valid filename on Windows.
1131 Returns None if the path is ok, or a UI string describing the problem.
1131 Returns None if the path is ok, or a UI string describing the problem.
1132
1132
1133 >>> checkwinfilename("just/a/normal/path")
1133 >>> checkwinfilename("just/a/normal/path")
1134 >>> checkwinfilename("foo/bar/con.xml")
1134 >>> checkwinfilename("foo/bar/con.xml")
1135 "filename contains 'con', which is reserved on Windows"
1135 "filename contains 'con', which is reserved on Windows"
1136 >>> checkwinfilename("foo/con.xml/bar")
1136 >>> checkwinfilename("foo/con.xml/bar")
1137 "filename contains 'con', which is reserved on Windows"
1137 "filename contains 'con', which is reserved on Windows"
1138 >>> checkwinfilename("foo/bar/xml.con")
1138 >>> checkwinfilename("foo/bar/xml.con")
1139 >>> checkwinfilename("foo/bar/AUX/bla.txt")
1139 >>> checkwinfilename("foo/bar/AUX/bla.txt")
1140 "filename contains 'AUX', which is reserved on Windows"
1140 "filename contains 'AUX', which is reserved on Windows"
1141 >>> checkwinfilename("foo/bar/bla:.txt")
1141 >>> checkwinfilename("foo/bar/bla:.txt")
1142 "filename contains ':', which is reserved on Windows"
1142 "filename contains ':', which is reserved on Windows"
1143 >>> checkwinfilename("foo/bar/b\07la.txt")
1143 >>> checkwinfilename("foo/bar/b\07la.txt")
1144 "filename contains '\\x07', which is invalid on Windows"
1144 "filename contains '\\x07', which is invalid on Windows"
1145 >>> checkwinfilename("foo/bar/bla ")
1145 >>> checkwinfilename("foo/bar/bla ")
1146 "filename ends with ' ', which is not allowed on Windows"
1146 "filename ends with ' ', which is not allowed on Windows"
1147 >>> checkwinfilename("../bar")
1147 >>> checkwinfilename("../bar")
1148 >>> checkwinfilename("foo\\")
1148 >>> checkwinfilename("foo\\")
1149 "filename ends with '\\', which is invalid on Windows"
1149 "filename ends with '\\', which is invalid on Windows"
1150 >>> checkwinfilename("foo\\/bar")
1150 >>> checkwinfilename("foo\\/bar")
1151 "directory name ends with '\\', which is invalid on Windows"
1151 "directory name ends with '\\', which is invalid on Windows"
1152 '''
1152 '''
1153 if path.endswith('\\'):
1153 if path.endswith('\\'):
1154 return _("filename ends with '\\', which is invalid on Windows")
1154 return _("filename ends with '\\', which is invalid on Windows")
1155 if '\\/' in path:
1155 if '\\/' in path:
1156 return _("directory name ends with '\\', which is invalid on Windows")
1156 return _("directory name ends with '\\', which is invalid on Windows")
1157 for n in path.replace('\\', '/').split('/'):
1157 for n in path.replace('\\', '/').split('/'):
1158 if not n:
1158 if not n:
1159 continue
1159 continue
1160 for c in n:
1160 for c in n:
1161 if c in _winreservedchars:
1161 if c in _winreservedchars:
1162 return _("filename contains '%s', which is reserved "
1162 return _("filename contains '%s', which is reserved "
1163 "on Windows") % c
1163 "on Windows") % c
1164 if ord(c) <= 31:
1164 if ord(c) <= 31:
1165 return _("filename contains %r, which is invalid "
1165 return _("filename contains %r, which is invalid "
1166 "on Windows") % c
1166 "on Windows") % c
1167 base = n.split('.')[0]
1167 base = n.split('.')[0]
1168 if base and base.lower() in _winreservednames:
1168 if base and base.lower() in _winreservednames:
1169 return _("filename contains '%s', which is reserved "
1169 return _("filename contains '%s', which is reserved "
1170 "on Windows") % base
1170 "on Windows") % base
1171 t = n[-1]
1171 t = n[-1]
1172 if t in '. ' and n not in '..':
1172 if t in '. ' and n not in '..':
1173 return _("filename ends with '%s', which is not allowed "
1173 return _("filename ends with '%s', which is not allowed "
1174 "on Windows") % t
1174 "on Windows") % t
1175
1175
1176 if os.name == 'nt':
1176 if os.name == 'nt':
1177 checkosfilename = checkwinfilename
1177 checkosfilename = checkwinfilename
1178 else:
1178 else:
1179 checkosfilename = platform.checkosfilename
1179 checkosfilename = platform.checkosfilename
1180
1180
1181 def makelock(info, pathname):
1181 def makelock(info, pathname):
1182 try:
1182 try:
1183 return os.symlink(info, pathname)
1183 return os.symlink(info, pathname)
1184 except OSError as why:
1184 except OSError as why:
1185 if why.errno == errno.EEXIST:
1185 if why.errno == errno.EEXIST:
1186 raise
1186 raise
1187 except AttributeError: # no symlink in os
1187 except AttributeError: # no symlink in os
1188 pass
1188 pass
1189
1189
1190 ld = os.open(pathname, os.O_CREAT | os.O_WRONLY | os.O_EXCL)
1190 ld = os.open(pathname, os.O_CREAT | os.O_WRONLY | os.O_EXCL)
1191 os.write(ld, info)
1191 os.write(ld, info)
1192 os.close(ld)
1192 os.close(ld)
1193
1193
1194 def readlock(pathname):
1194 def readlock(pathname):
1195 try:
1195 try:
1196 return os.readlink(pathname)
1196 return os.readlink(pathname)
1197 except OSError as why:
1197 except OSError as why:
1198 if why.errno not in (errno.EINVAL, errno.ENOSYS):
1198 if why.errno not in (errno.EINVAL, errno.ENOSYS):
1199 raise
1199 raise
1200 except AttributeError: # no symlink in os
1200 except AttributeError: # no symlink in os
1201 pass
1201 pass
1202 fp = posixfile(pathname)
1202 fp = posixfile(pathname)
1203 r = fp.read()
1203 r = fp.read()
1204 fp.close()
1204 fp.close()
1205 return r
1205 return r
1206
1206
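# Editor's aside -- round-trip sketch with a hypothetical path:
# makelock prefers storing `info` as a symlink target and falls back to
# an O_EXCL regular file; readlock reads back either representation.
def _demo_locks(tmpdir):
    lockpath = os.path.join(tmpdir, 'demo.lock')
    makelock('pid:1234', lockpath)
    assert readlock(lockpath) == 'pid:1234'
    os.unlink(lockpath)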
1207 def fstat(fp):
1207 def fstat(fp):
1208 '''stat file object that may not have fileno method.'''
1208 '''stat file object that may not have fileno method.'''
1209 try:
1209 try:
1210 return os.fstat(fp.fileno())
1210 return os.fstat(fp.fileno())
1211 except AttributeError:
1211 except AttributeError:
1212 return os.stat(fp.name)
1212 return os.stat(fp.name)
1213
1213
1214 # File system features
1214 # File system features
1215
1215
1216 def fscasesensitive(path):
1216 def fscasesensitive(path):
1217 """
1217 """
1218 Return true if the given path is on a case-sensitive filesystem
1218 Return true if the given path is on a case-sensitive filesystem
1219
1219
1220 Requires a path (like /foo/.hg) ending with a foldable final
1220 Requires a path (like /foo/.hg) ending with a foldable final
1221 directory component.
1221 directory component.
1222 """
1222 """
1223 s1 = os.lstat(path)
1223 s1 = os.lstat(path)
1224 d, b = os.path.split(path)
1224 d, b = os.path.split(path)
1225 b2 = b.upper()
1225 b2 = b.upper()
1226 if b == b2:
1226 if b == b2:
1227 b2 = b.lower()
1227 b2 = b.lower()
1228 if b == b2:
1228 if b == b2:
1229 return True # no evidence against case sensitivity
1229 return True # no evidence against case sensitivity
1230 p2 = os.path.join(d, b2)
1230 p2 = os.path.join(d, b2)
1231 try:
1231 try:
1232 s2 = os.lstat(p2)
1232 s2 = os.lstat(p2)
1233 if s2 == s1:
1233 if s2 == s1:
1234 return False
1234 return False
1235 return True
1235 return True
1236 except OSError:
1236 except OSError:
1237 return True
1237 return True

try:
    import re2
    _re2 = None
except ImportError:
    _re2 = False

class _re(object):
    def _checkre2(self):
        global _re2
        try:
            # check if match works, see issue3964
            _re2 = bool(re2.match(r'\[([^\[]+)\]', '[ui]'))
        except ImportError:
            _re2 = False

    def compile(self, pat, flags=0):
        '''Compile a regular expression, using re2 if possible

        For best performance, use only re2-compatible regexp features. The
        only flags from the re module that are re2-compatible are
        IGNORECASE and MULTILINE.'''
        if _re2 is None:
            self._checkre2()
        if _re2 and (flags & ~(remod.IGNORECASE | remod.MULTILINE)) == 0:
            if flags & remod.IGNORECASE:
                pat = '(?i)' + pat
            if flags & remod.MULTILINE:
                pat = '(?m)' + pat
            try:
                return re2.compile(pat)
            except re2.error:
                pass
        return remod.compile(pat, flags)

    @propertycache
    def escape(self):
        '''Return the version of escape corresponding to self.compile.

        This is imperfect because whether re2 or re is used for a particular
        function depends on the flags, etc, but it's the best we can do.
        '''
        global _re2
        if _re2 is None:
            self._checkre2()
        if _re2:
            return re2.escape
        else:
            return remod.escape

re = _re()
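
# Usage sketch: callers go through this module-level instance so the re2
# probe runs lazily on first use; flags other than IGNORECASE/MULTILINE
# silently fall back to the stdlib engine (illustrative pattern):
#
#   pat = re.compile(r'[0-9a-f]{6,40}', remod.IGNORECASE)
#   if pat.match(node):
#       ...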

_fspathcache = {}
def fspath(name, root):
    '''Get name in the case stored in the filesystem

    The name should be relative to root, and be normcase-ed for efficiency.

    Note that this function is unnecessary, and should not be
    called, for case-sensitive filesystems (simply because it's expensive).

    The root should be normcase-ed, too.
    '''
    def _makefspathcacheentry(dir):
        return dict((normcase(n), n) for n in os.listdir(dir))

    seps = os.sep
    if os.altsep:
        seps = seps + os.altsep
    # Protect backslashes. This gets silly very quickly.
    # (str.replace returns a new string, so the result must be assigned.)
    seps = seps.replace('\\', '\\\\')
    pattern = remod.compile(r'([^%s]+)|([%s]+)' % (seps, seps))
    dir = os.path.normpath(root)
    result = []
    for part, sep in pattern.findall(name):
        if sep:
            result.append(sep)
            continue

        if dir not in _fspathcache:
            _fspathcache[dir] = _makefspathcacheentry(dir)
        contents = _fspathcache[dir]

        found = contents.get(part)
        if not found:
            # retry "once per directory" per "dirstate.walk" which
            # may take place for each patch of "hg qpush", for example
            _fspathcache[dir] = contents = _makefspathcacheentry(dir)
            found = contents.get(part)

        result.append(found or part)
        dir = os.path.join(dir, part)

    return ''.join(result)
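
# Illustrative round trip on a case-insensitive filesystem, assuming the
# file is stored on disk as 'Foo/README.txt' under '/repo':
#
#   fspath('foo/readme.txt', '/repo')   # -> 'Foo/README.txt'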

def checknlink(testfile):
    '''check whether hardlink count reporting works properly'''

    # testfile may be open, so we need a separate file for checking to
    # work around issue2543 (or testfile may get lost on Samba shares)
    f1 = testfile + ".hgtmp1"
    if os.path.lexists(f1):
        return False
    try:
        posixfile(f1, 'w').close()
    except IOError:
        try:
            os.unlink(f1)
        except OSError:
            pass
        return False

    f2 = testfile + ".hgtmp2"
    fd = None
    try:
        oslink(f1, f2)
        # nlinks() may behave differently for files on Windows shares if
        # the file is open.
        fd = posixfile(f2)
        return nlinks(f2) > 1
    except OSError:
        return False
    finally:
        if fd is not None:
            fd.close()
        for f in (f1, f2):
            try:
                os.unlink(f)
            except OSError:
                pass
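
# Callers typically probe once per store and cache the answer, along the
# lines of (hypothetical path):
#
#   hardlinkswork = checknlink(os.path.join(storedir, "probe"))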

def endswithsep(path):
    '''Check path ends with os.sep or os.altsep.'''
    return path.endswith(os.sep) or os.altsep and path.endswith(os.altsep)

def splitpath(path):
    '''Split path by os.sep.
    Note that this function does not use os.altsep because it is
    intended as a simple alternative to "xxx.split(os.sep)".
    It is recommended to apply os.path.normpath() to the path before
    using this function, if needed.'''
    return path.split(os.sep)
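
# For example, on POSIX where os.sep == '/':
#
#   splitpath('foo/bar/baz')   # -> ['foo', 'bar', 'baz']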

def gui():
    '''Are we running in a GUI?'''
    if sys.platform == 'darwin':
        if 'SSH_CONNECTION' in os.environ:
            # handle SSH access to a box where the user is logged in
            return False
        elif getattr(osutil, 'isgui', None):
            # check if a CoreGraphics session is available
            return osutil.isgui()
        else:
            # pure build; use a safe default
            return True
    else:
        return os.name == "nt" or os.environ.get("DISPLAY")

def mktempcopy(name, emptyok=False, createmode=None):
    """Create a temporary file with the same contents as name

    The permission bits are copied from the original file.

    If the temporary file is going to be truncated immediately, you
    can use emptyok=True as an optimization.

    Returns the name of the temporary file.
    """
    d, fn = os.path.split(name)
    fd, temp = tempfile.mkstemp(prefix='.%s-' % fn, dir=d)
    os.close(fd)
    # Temporary files are created with mode 0600, which is usually not
    # what we want. If the original file already exists, just copy
    # its mode. Otherwise, manually obey umask.
    copymode(name, temp, createmode)
    if emptyok:
        return temp
    try:
        try:
            ifp = posixfile(name, "rb")
        except IOError as inst:
            if inst.errno == errno.ENOENT:
                return temp
            if not getattr(inst, 'filename', None):
                inst.filename = name
            raise
        ofp = posixfile(temp, "wb")
        for chunk in filechunkiter(ifp):
            ofp.write(chunk)
        ifp.close()
        ofp.close()
    except: # re-raises
        try: os.unlink(temp)
        except OSError: pass
        raise
    return temp

class filestat(object):
    """help to exactly detect change of a file

    'stat' attribute is the result of 'os.stat()' if the specified 'path'
    exists. Otherwise, it is None. This avoids a preparatory 'exists()'
    examination on the caller's side.
    """
    def __init__(self, path):
        try:
            self.stat = os.stat(path)
        except OSError as err:
            if err.errno != errno.ENOENT:
                raise
            self.stat = None

    __hash__ = object.__hash__

    def __eq__(self, old):
        try:
            # if ambiguity between stat of new and old file is
            # avoided, comparison of size, ctime and mtime is enough
            # to exactly detect change of a file regardless of platform
            return (self.stat.st_size == old.stat.st_size and
                    self.stat.st_ctime == old.stat.st_ctime and
                    self.stat.st_mtime == old.stat.st_mtime)
        except AttributeError:
            return False

    def isambig(self, old):
        """Examine whether new (= self) stat is ambiguous against old one

        "S[N]" below means stat of a file at N-th change:

        - S[n-1].ctime < S[n].ctime: can detect change of a file
        - S[n-1].ctime == S[n].ctime
          - S[n-1].ctime < S[n].mtime: means natural advancing (*1)
          - S[n-1].ctime == S[n].mtime: is ambiguous (*2)
          - S[n-1].ctime > S[n].mtime: never occurs naturally (don't care)
        - S[n-1].ctime > S[n].ctime: never occurs naturally (don't care)

        Case (*2) above means that a file was changed twice or more at
        the same time in seconds (= S[n-1].ctime), and comparison of
        timestamps is ambiguous.

        The base idea to avoid such ambiguity is "advance mtime 1 sec,
        if the timestamp is ambiguous".

        But advancing mtime only in case (*2) doesn't work as
        expected, because naturally advanced S[n].mtime in case (*1)
        might be equal to manually advanced S[n-1 or earlier].mtime.

        Therefore, all "S[n-1].ctime == S[n].ctime" cases should be
        treated as ambiguous regardless of mtime, to avoid overlooking
        a collision between such mtimes.

        Advancing mtime "if isambig(oldstat)" ensures "S[n-1].mtime !=
        S[n].mtime", even if the size of a file isn't changed.
        """
        try:
            return (self.stat.st_ctime == old.stat.st_ctime)
        except AttributeError:
            return False

    def __ne__(self, other):
        return not self == other
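
# Sketch of the intended call pattern (hypothetical path), as used by
# atomictempfile.close() below:
#
#   old = filestat(path)            # snapshot before replacing the file
#   ...replace the file...
#   if filestat(path).isambig(old):
#       # ctime did not advance, so mtime alone cannot distinguish the
#       # versions; advance mtime by one second
#       ...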

class atomictempfile(object):
    '''writable file object that atomically updates a file

    All writes will go to a temporary copy of the original file. Call
    close() when you are done writing, and atomictempfile will rename
    the temporary copy to the original name, making the changes
    visible. If the object is destroyed without being closed, all your
    writes are discarded.

    checkambig argument of constructor is used with filestat, and is
    useful only if target file is guarded by any lock (e.g. repo.lock
    or repo.wlock).
    '''
    def __init__(self, name, mode='w+b', createmode=None, checkambig=False):
        self.__name = name # permanent name
        self._tempname = mktempcopy(name, emptyok=('w' in mode),
                                    createmode=createmode)
        self._fp = posixfile(self._tempname, mode)
        self._checkambig = checkambig

        # delegated methods
        self.read = self._fp.read
        self.write = self._fp.write
        self.seek = self._fp.seek
        self.tell = self._fp.tell
        self.fileno = self._fp.fileno

    def close(self):
        if not self._fp.closed:
            self._fp.close()
            filename = localpath(self.__name)
            oldstat = self._checkambig and filestat(filename)
            if oldstat and oldstat.stat:
                rename(self._tempname, filename)
                newstat = filestat(filename)
                if newstat.isambig(oldstat):
                    # stat of changed file is ambiguous to original one
                    advanced = (oldstat.stat.st_mtime + 1) & 0x7fffffff
                    os.utime(filename, (advanced, advanced))
            else:
                rename(self._tempname, filename)

    def discard(self):
        if not self._fp.closed:
            try:
                os.unlink(self._tempname)
            except OSError:
                pass
            self._fp.close()

    def __del__(self):
        if safehasattr(self, '_fp'): # constructor actually did something
            self.discard()

    def __enter__(self):
        return self

    def __exit__(self, exctype, excvalue, traceback):
        if exctype is not None:
            self.discard()
        else:
            self.close()
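
# Typical usage relies on the context-manager protocol above
# (hypothetical filename):
#
#   with atomictempfile('somefile', 'wb') as fp:
#       fp.write(data)
#   # a normal exit renames the temporary copy over 'somefile';
#   # an exception discards it, leaving 'somefile' untouched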

def makedirs(name, mode=None, notindexed=False):
    """recursive directory creation with parent mode inheritance

    Newly created directories are marked as "not to be indexed by
    the content indexing service", if ``notindexed`` is specified
    for "write" mode access.
    """
    try:
        makedir(name, notindexed)
    except OSError as err:
        if err.errno == errno.EEXIST:
            return
        if err.errno != errno.ENOENT or not name:
            raise
        parent = os.path.dirname(os.path.abspath(name))
        if parent == name:
            raise
        makedirs(parent, mode, notindexed)
        try:
            makedir(name, notindexed)
        except OSError as err:
            # Catch EEXIST to handle races
            if err.errno == errno.EEXIST:
                return
            raise
    if mode is not None:
        os.chmod(name, mode)

def readfile(path):
    with open(path, 'rb') as fp:
        return fp.read()

def writefile(path, text):
    with open(path, 'wb') as fp:
        fp.write(text)

def appendfile(path, text):
    with open(path, 'ab') as fp:
        fp.write(text)

class chunkbuffer(object):
    """Allow arbitrary sized chunks of data to be efficiently read from an
    iterator over chunks of arbitrary size."""

    def __init__(self, in_iter):
        """in_iter is the iterator that's iterating over the input chunks."""
        def splitbig(chunks):
            for chunk in chunks:
                if len(chunk) > 2**20:
                    pos = 0
                    while pos < len(chunk):
                        end = pos + 2 ** 18
                        yield chunk[pos:end]
                        pos = end
                else:
                    yield chunk
        self.iter = splitbig(in_iter)
        self._queue = collections.deque()
        self._chunkoffset = 0

    def read(self, l=None):
        """Read up to l bytes of data from the iterator of chunks of data.
        Returns less than l bytes if the iterator runs dry.

        If l is omitted, read everything"""
        if l is None:
            return ''.join(self.iter)

        left = l
        buf = []
        queue = self._queue
        while left > 0:
            # refill the queue
            if not queue:
                target = 2**18
                for chunk in self.iter:
                    queue.append(chunk)
                    target -= len(chunk)
                    if target <= 0:
                        break
                if not queue:
                    break

            # The easy way to do this would be to queue.popleft(), modify the
            # chunk (if necessary), then queue.appendleft(). However, for cases
            # where we read partial chunk content, this incurs 2 dequeue
            # mutations and creates a new str for the remaining chunk in the
            # queue. Our code below avoids this overhead.

            chunk = queue[0]
            chunkl = len(chunk)
            offset = self._chunkoffset

            # Use full chunk.
            if offset == 0 and left >= chunkl:
                left -= chunkl
                queue.popleft()
                buf.append(chunk)
                # self._chunkoffset remains at 0.
                continue

            chunkremaining = chunkl - offset

            # Use all of unconsumed part of chunk.
            if left >= chunkremaining:
                left -= chunkremaining
                queue.popleft()
                # offset == 0 is enabled by block above, so this won't merely
                # copy via ``chunk[0:]``.
                buf.append(chunk[offset:])
                self._chunkoffset = 0

            # Partial chunk needed.
            else:
                buf.append(chunk[offset:offset + left])
                self._chunkoffset += left
                left -= chunkremaining

        return ''.join(buf)
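
# Usage sketch (illustrative): sized reads may span chunk boundaries and
# return fewer bytes only once the underlying iterator runs dry:
#
#   buf = chunkbuffer(iter(['abc', 'defghi', 'j']))
#   buf.read(4)     # -> 'abcd'
#   buf.read(100)   # -> 'efghij'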

def filechunkiter(f, size=65536, limit=None):
    """Create a generator that produces the data in the file, size
    (default 65536) bytes at a time, up to optional limit (default is
    to read all data). Chunks may be less than size bytes if the
    chunk is the last chunk in the file, or the file is a socket or
    some other type of file that sometimes reads less data than is
    requested."""
    assert size >= 0
    assert limit is None or limit >= 0
    while True:
        if limit is None:
            nbytes = size
        else:
            nbytes = min(limit, size)
        s = nbytes and f.read(nbytes)
        if not s:
            break
        if limit:
            limit -= len(s)
        yield s
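
# Illustrative copy loop built on filechunkiter (hypothetical paths):
#
#   src = posixfile('src', 'rb')
#   dst = posixfile('dst', 'wb')
#   for chunk in filechunkiter(src, size=131072):
#       dst.write(chunk)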

def makedate(timestamp=None):
    '''Return a unix timestamp (or the current time) as a (unixtime,
    offset) tuple based off the local timezone.'''
    if timestamp is None:
        timestamp = time.time()
    if timestamp < 0:
        hint = _("check your clock")
        raise Abort(_("negative timestamp: %d") % timestamp, hint=hint)
    delta = (datetime.datetime.utcfromtimestamp(timestamp) -
             datetime.datetime.fromtimestamp(timestamp))
    tz = delta.days * 86400 + delta.seconds
    return timestamp, tz

def datestr(date=None, format='%a %b %d %H:%M:%S %Y %1%2'):
    """represent a (unixtime, offset) tuple as a localized time.
    unixtime is seconds since the epoch, and offset is the time zone's
    number of seconds away from UTC.

    >>> datestr((0, 0))
    'Thu Jan 01 00:00:00 1970 +0000'
    >>> datestr((42, 0))
    'Thu Jan 01 00:00:42 1970 +0000'
    >>> datestr((-42, 0))
    'Wed Dec 31 23:59:18 1969 +0000'
    >>> datestr((0x7fffffff, 0))
    'Tue Jan 19 03:14:07 2038 +0000'
    >>> datestr((-0x80000000, 0))
    'Fri Dec 13 20:45:52 1901 +0000'
    """
    t, tz = date or makedate()
    if "%1" in format or "%2" in format or "%z" in format:
        sign = (tz > 0) and "-" or "+"
        minutes = abs(tz) // 60
        q, r = divmod(minutes, 60)
        format = format.replace("%z", "%1%2")
        format = format.replace("%1", "%c%02d" % (sign, q))
        format = format.replace("%2", "%02d" % r)
    d = t - tz
    if d > 0x7fffffff:
        d = 0x7fffffff
    elif d < -0x80000000:
        d = -0x80000000
    # Never use time.gmtime() and datetime.datetime.fromtimestamp()
    # because they use the gmtime() system call which is buggy on Windows
    # for negative values.
    t = datetime.datetime(1970, 1, 1) + datetime.timedelta(seconds=d)
    s = t.strftime(format)
    return s

def shortdate(date=None):
    """turn (timestamp, tzoff) tuple into an ISO 8601 date."""
    return datestr(date, format='%Y-%m-%d')

def parsetimezone(s):
    """find a trailing timezone, if any, in string, and return a
    (offset, remainder) pair"""

    if s.endswith("GMT") or s.endswith("UTC"):
        return 0, s[:-3].rstrip()

    # Unix-style timezones [+-]hhmm
    if len(s) >= 5 and s[-5] in "+-" and s[-4:].isdigit():
        sign = (s[-5] == "+") and 1 or -1
        hours = int(s[-4:-2])
        minutes = int(s[-2:])
        return -sign * (hours * 60 + minutes) * 60, s[:-5].rstrip()

    # ISO8601 trailing Z
    if s.endswith("Z") and s[-2:-1].isdigit():
        return 0, s[:-1]

    # ISO8601-style [+-]hh:mm
    if (len(s) >= 6 and s[-6] in "+-" and s[-3] == ":" and
        s[-5:-3].isdigit() and s[-2:].isdigit()):
        sign = (s[-6] == "+") and 1 or -1
        hours = int(s[-5:-3])
        minutes = int(s[-2:])
        return -sign * (hours * 60 + minutes) * 60, s[:-6]

    return None, s
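
# Doctest-style illustrations (not part of the original test suite); note
# that under this module's sign convention a '+0200' zone yields -7200:
#
#   >>> parsetimezone('2006-02-07 10:00 +0200')
#   (-7200, '2006-02-07 10:00')
#   >>> parsetimezone('2006-02-07 10:00 UTC')
#   (0, '2006-02-07 10:00')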

def strdate(string, format, defaults=[]):
    """parse a localized time string and return a (unixtime, offset) tuple.
    if the string cannot be parsed, ValueError is raised."""
    # NOTE: unixtime = localunixtime + offset
    offset, date = parsetimezone(string)

    # add missing elements from defaults
    usenow = False # default to using biased defaults
    for part in ("S", "M", "HI", "d", "mb", "yY"): # decreasing specificity
        found = [True for p in part if ("%"+p) in format]
        if not found:
            date += "@" + defaults[part][usenow]
            format += "@%" + part[0]
        else:
            # We've found a specific time element, less specific time
            # elements are relative to today
            usenow = True

    timetuple = time.strptime(date, format)
    localunixtime = int(calendar.timegm(timetuple))
    if offset is None:
        # local timezone
        unixtime = int(time.mktime(timetuple))
        offset = unixtime - localunixtime
    else:
        unixtime = localunixtime + offset
    return unixtime, offset

def parsedate(date, formats=None, bias=None):
    """parse a localized date/time and return a (unixtime, offset) tuple.

    The date may be a "unixtime offset" string or in one of the specified
    formats. If the date already is a (unixtime, offset) tuple, it is returned.

    >>> parsedate(' today ') == parsedate(\
                                  datetime.date.today().strftime('%b %d'))
    True
    >>> parsedate('yesterday ') == parsedate((datetime.date.today() -\
                                              datetime.timedelta(days=1)\
                                             ).strftime('%b %d'))
    True
    >>> now, tz = makedate()
    >>> strnow, strtz = parsedate('now')
    >>> (strnow - now) < 1
    True
    >>> tz == strtz
    True
    """
    if bias is None:
        bias = {}
    if not date:
        return 0, 0
    if isinstance(date, tuple) and len(date) == 2:
        return date
    if not formats:
        formats = defaultdateformats
    date = date.strip()

    if date == 'now' or date == _('now'):
        return makedate()
    if date == 'today' or date == _('today'):
        date = datetime.date.today().strftime('%b %d')
    elif date == 'yesterday' or date == _('yesterday'):
        date = (datetime.date.today() -
                datetime.timedelta(days=1)).strftime('%b %d')

    try:
        when, offset = map(int, date.split(' '))
    except ValueError:
        # fill out defaults
        now = makedate()
        defaults = {}
        for part in ("d", "mb", "yY", "HI", "M", "S"):
            # this piece is for rounding the specific end of unknowns
            b = bias.get(part)
            if b is None:
                if part[0] in "HMS":
                    b = "00"
                else:
                    b = "0"

            # this piece is for matching the generic end to today's date
            n = datestr(now, "%" + part[0])

            defaults[part] = (b, n)

        for format in formats:
            try:
                when, offset = strdate(date, format, defaults)
            except (ValueError, OverflowError):
                pass
            else:
                break
        else:
            raise Abort(_('invalid date: %r') % date)
    # validate explicit (probably user-specified) date and
    # time zone offset. values must fit in signed 32 bits for
    # current 32-bit linux runtimes. timezones go from UTC-12
    # to UTC+14
    if when < -0x80000000 or when > 0x7fffffff:
        raise Abort(_('date exceeds 32 bits: %d') % when)
    if offset < -50400 or offset > 43200:
        raise Abort(_('impossible time zone offset: %d') % offset)
    return when, offset

def matchdate(date):
    """Return a function that matches a given date match specifier

    Formats include:

    '{date}' match a given date to the accuracy provided

    '<{date}' on or before a given date

    '>{date}' on or after a given date

    >>> p1 = parsedate("10:29:59")
    >>> p2 = parsedate("10:30:00")
    >>> p3 = parsedate("10:30:59")
    >>> p4 = parsedate("10:31:00")
    >>> p5 = parsedate("Sep 15 10:30:00 1999")
    >>> f = matchdate("10:30")
    >>> f(p1[0])
    False
    >>> f(p2[0])
    True
    >>> f(p3[0])
    True
    >>> f(p4[0])
    False
    >>> f(p5[0])
    False
    """

    def lower(date):
        d = {'mb': "1", 'd': "1"}
        return parsedate(date, extendeddateformats, d)[0]

    def upper(date):
        d = {'mb': "12", 'HI': "23", 'M': "59", 'S': "59"}
        for days in ("31", "30", "29"):
            try:
                d["d"] = days
                return parsedate(date, extendeddateformats, d)[0]
            except Abort:
                pass
        d["d"] = "28"
        return parsedate(date, extendeddateformats, d)[0]

    date = date.strip()

    if not date:
        raise Abort(_("dates cannot consist entirely of whitespace"))
    elif date[0] == "<":
        if not date[1:]:
            raise Abort(_("invalid day spec, use '<DATE'"))
        when = upper(date[1:])
        return lambda x: x <= when
    elif date[0] == ">":
        if not date[1:]:
            raise Abort(_("invalid day spec, use '>DATE'"))
        when = lower(date[1:])
        return lambda x: x >= when
    elif date[0] == "-":
        try:
            days = int(date[1:])
        except ValueError:
            raise Abort(_("invalid day spec: %s") % date[1:])
        if days < 0:
            raise Abort(_("%s must be nonnegative (see 'hg help dates')")
                        % date[1:])
        when = makedate()[0] - days * 3600 * 24
        return lambda x: x >= when
    elif " to " in date:
        a, b = date.split(" to ")
        start, stop = lower(a), upper(b)
        return lambda x: x >= start and x <= stop
    else:
        start, stop = lower(date), upper(date)
        return lambda x: x >= start and x <= stop

def stringmatcher(pattern):
    """
    accepts a string, possibly starting with 're:' or 'literal:' prefix.
    returns the matcher name, pattern, and matcher function.
    missing or unknown prefixes are treated as literal matches.

    helper for tests:
    >>> def test(pattern, *tests):
    ...     kind, pattern, matcher = stringmatcher(pattern)
    ...     return (kind, pattern, [bool(matcher(t)) for t in tests])

    exact matching (no prefix):
    >>> test('abcdefg', 'abc', 'def', 'abcdefg')
    ('literal', 'abcdefg', [False, False, True])

    regex matching ('re:' prefix)
    >>> test('re:a.+b', 'nomatch', 'fooadef', 'fooadefbar')
    ('re', 'a.+b', [False, False, True])

    force exact matches ('literal:' prefix)
    >>> test('literal:re:foobar', 'foobar', 're:foobar')
    ('literal', 're:foobar', [False, True])

    unknown prefixes are ignored and treated as literals
    >>> test('foo:bar', 'foo', 'bar', 'foo:bar')
    ('literal', 'foo:bar', [False, False, True])
    """
    if pattern.startswith('re:'):
        pattern = pattern[3:]
        try:
            regex = remod.compile(pattern)
        except remod.error as e:
            raise error.ParseError(_('invalid regular expression: %s')
                                   % e)
        return 're', pattern, regex.search
    elif pattern.startswith('literal:'):
        pattern = pattern[8:]
    return 'literal', pattern, pattern.__eq__

def shortuser(user):
    """Return a short representation of a user name or email address."""
    f = user.find('@')
    if f >= 0:
        user = user[:f]
    f = user.find('<')
    if f >= 0:
        user = user[f + 1:]
    f = user.find(' ')
    if f >= 0:
        user = user[:f]
    f = user.find('.')
    if f >= 0:
        user = user[:f]
    return user

def emailuser(user):
    """Return the user portion of an email address."""
    f = user.find('@')
    if f >= 0:
        user = user[:f]
    f = user.find('<')
    if f >= 0:
        user = user[f + 1:]
    return user

def email(author):
    '''get email of author.'''
    r = author.find('>')
    if r == -1:
        r = None
    return author[author.find('<') + 1:r]
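
# Doctest-style illustrations (not part of the original test suite):
#
#   >>> shortuser('John Doe <john.doe@example.com>')
#   'john'
#   >>> emailuser('John Doe <jdoe@example.com>')
#   'jdoe'
#   >>> email('John Doe <jdoe@example.com>')
#   'jdoe@example.com'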

def ellipsis(text, maxlength=400):
    """Trim string to at most maxlength (default: 400) columns in display."""
    return encoding.trim(text, maxlength, ellipsis='...')

def unitcountfn(*unittable):
    '''return a function that renders a readable count of some quantity'''

    def go(count):
        for multiplier, divisor, format in unittable:
            if count >= divisor * multiplier:
                return format % (count / float(divisor))
        return unittable[-1][2] % count

    return go

bytecount = unitcountfn(
    (100, 1 << 30, _('%.0f GB')),
    (10, 1 << 30, _('%.1f GB')),
    (1, 1 << 30, _('%.2f GB')),
    (100, 1 << 20, _('%.0f MB')),
    (10, 1 << 20, _('%.1f MB')),
    (1, 1 << 20, _('%.2f MB')),
    (100, 1 << 10, _('%.0f KB')),
    (10, 1 << 10, _('%.1f KB')),
    (1, 1 << 10, _('%.2f KB')),
    (1, 1, _('%.0f bytes')),
    )
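
# Illustrative values (assuming the default untranslated format strings):
#
#   bytecount(1536)            # -> '1.50 KB'
#   bytecount(10 * (1 << 20))  # -> '10.0 MB'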

def uirepr(s):
    # Avoid double backslash in Windows path repr()
    return repr(s).replace('\\\\', '\\')

# delay import of textwrap
def MBTextWrapper(**kwargs):
    class tw(textwrap.TextWrapper):
        """
        Extend TextWrapper for width-awareness.

        Neither the number of 'bytes' in any encoding nor the number of
        'characters' is appropriate to calculate terminal columns for a
        specified string.

        The original TextWrapper implementation uses the built-in 'len()'
        directly, so overriding is needed to use the width information of
        each character.

        In addition, characters classified as 'ambiguous' width are
        treated as wide in East Asian areas, but as narrow elsewhere.

        This requires a decision by the user to determine the width of
        such characters.
        """
        def _cutdown(self, ucstr, space_left):
            l = 0
            colwidth = encoding.ucolwidth
            for i in xrange(len(ucstr)):
                l += colwidth(ucstr[i])
                if space_left < l:
                    return (ucstr[:i], ucstr[i:])
            return ucstr, ''

        # overriding of base class
        def _handle_long_word(self, reversed_chunks, cur_line, cur_len, width):
            space_left = max(width - cur_len, 1)

            if self.break_long_words:
                cut, res = self._cutdown(reversed_chunks[-1], space_left)
                cur_line.append(cut)
                reversed_chunks[-1] = res
            elif not cur_line:
                cur_line.append(reversed_chunks.pop())

        # this overriding code is imported from TextWrapper of Python 2.6
        # to calculate columns of string by 'encoding.ucolwidth()'
        def _wrap_chunks(self, chunks):
            colwidth = encoding.ucolwidth

            lines = []
            if self.width <= 0:
                raise ValueError("invalid width %r (must be > 0)" % self.width)

            # Arrange in reverse order so items can be efficiently popped
            # from a stack of chunks.
            chunks.reverse()

            while chunks:

                # Start the list of chunks that will make up the current line.
                # cur_len is just the length of all the chunks in cur_line.
                cur_line = []
                cur_len = 0

                # Figure out which static string will prefix this line.
                if lines:
                    indent = self.subsequent_indent
                else:
                    indent = self.initial_indent

                # Maximum width for this line.
                width = self.width - len(indent)

                # First chunk on line is whitespace -- drop it, unless this
                # is the very beginning of the text (i.e. no lines started yet).
                if self.drop_whitespace and chunks[-1].strip() == '' and lines:
                    del chunks[-1]

                while chunks:
                    l = colwidth(chunks[-1])

                    # Can at least squeeze this chunk onto the current line.
                    if cur_len + l <= width:
                        cur_line.append(chunks.pop())
                        cur_len += l

                    # Nope, this line is full.
                    else:
                        break

                # The current line is full, and the next chunk is too big to
                # fit on *any* line (not just this one).
                if chunks and colwidth(chunks[-1]) > width:
                    self._handle_long_word(chunks, cur_line, cur_len, width)

                # If the last chunk on this line is all whitespace, drop it.
                if (self.drop_whitespace and
                    cur_line and cur_line[-1].strip() == ''):
                    del cur_line[-1]

                # Convert current line back to a string and store it in list
                # of all lines (return value).
                if cur_line:
                    lines.append(indent + ''.join(cur_line))

            return lines

    global MBTextWrapper
    MBTextWrapper = tw
    return tw(**kwargs)

def wrap(line, width, initindent='', hangindent=''):
    maxindent = max(len(hangindent), len(initindent))
    if width <= maxindent:
        # adjust for weird terminal size
        width = max(78, maxindent + 1)
    line = line.decode(encoding.encoding, encoding.encodingmode)
    initindent = initindent.decode(encoding.encoding, encoding.encodingmode)
    hangindent = hangindent.decode(encoding.encoding, encoding.encodingmode)
    wrapper = MBTextWrapper(width=width,
                            initial_indent=initindent,
                            subsequent_indent=hangindent)
    return wrapper.fill(line).encode(encoding.encoding)
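
# Illustrative usage (a sketch, not from the original source; assumes an
# ASCII-compatible local encoding):
#
#   wrap('the quick brown fox jumps over the lazy dog', 30, hangindent='  ')
#       -> 'the quick brown fox jumps over\n  the lazy dog'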

def iterlines(iterator):
    for chunk in iterator:
        for line in chunk.splitlines():
            yield line

def expandpath(path):
    return os.path.expanduser(os.path.expandvars(path))

def hgcmd():
    """Return the command used to execute the current hg

    This is different from hgexecutable() because on Windows we want
    to avoid things opening new shell windows like batch files, so we
    get either the python call or current executable.
    """
    if mainfrozen():
        if getattr(sys, 'frozen', None) == 'macosx_app':
            # Env variable set by py2app
            return [os.environ['EXECUTABLEPATH']]
        else:
            return [sys.executable]
    return gethgcmd()

def rundetached(args, condfn):
    """Execute the argument list in a detached process.

    condfn is a callable which is called repeatedly and should return
    True once the child process is known to have started successfully.
    At this point, the child process PID is returned. If the child
    process fails to start or finishes before condfn() evaluates to
    True, return -1.
    """
    # Windows case is easier because the child process is either
    # successfully starting and validating the condition or exiting
    # on failure. We just poll on its PID. On Unix, if the child
    # process fails to start, it will be left in a zombie state until
    # the parent waits on it, which we cannot do since we expect a
    # long-running process on success. Instead we listen for SIGCHLD
    # telling us our child process terminated.
    terminated = set()
    def handler(signum, frame):
        terminated.add(os.wait())
    prevhandler = None
    SIGCHLD = getattr(signal, 'SIGCHLD', None)
    if SIGCHLD is not None:
        prevhandler = signal.signal(SIGCHLD, handler)
    try:
        pid = spawndetached(args)
        while not condfn():
            if ((pid in terminated or not testpid(pid))
                and not condfn()):
                return -1
            time.sleep(0.1)
        return pid
    finally:
        if prevhandler is not None:
            signal.signal(signal.SIGCHLD, prevhandler)
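
# Hypothetical usage (a sketch; 'pidfile' and the readiness check are
# assumptions, not from the original source): start a daemon and wait
# until it has written its pid file.
#
#   pid = rundetached(['hg', 'serve', '-d'], lambda: os.path.exists(pidfile))
#   if pid == -1:
#       raise Abort(_('child process failed to start'))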

def interpolate(prefix, mapping, s, fn=None, escape_prefix=False):
    """Return the result of interpolating items in the mapping into string s.

    prefix is a single character string, or a two character string with
    a backslash as the first character if the prefix needs to be escaped in
    a regular expression.

    fn is an optional function that will be applied to the replacement text
    just before replacement.

    escape_prefix is an optional flag that allows a doubled prefix to act
    as an escape for a literal prefix character.
    """
    fn = fn or (lambda s: s)
    patterns = '|'.join(mapping.keys())
    if escape_prefix:
        patterns += '|' + prefix
        if len(prefix) > 1:
            prefix_char = prefix[1:]
        else:
            prefix_char = prefix
        mapping[prefix_char] = prefix_char
    r = remod.compile(r'%s(%s)' % (prefix, patterns))
    return r.sub(lambda x: fn(mapping[x.group()[1:]]), s)
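
# Illustrative usage (a sketch, not from the original source):
#
#   interpolate('%', {'foo': 'bar'}, 'got %foo')
#       -> 'got bar'
#   interpolate(r'\$', {'foo': 'bar'}, 'costs $foo, literal $$',
#               escape_prefix=True)
#       -> 'costs bar, literal $'
#
# Note that with escape_prefix the mapping is mutated to map the prefix
# character to itself.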

def getport(port):
    """Return the port for a given network service.

    If port is an integer, it's returned as is. If it's a string, it's
    looked up using socket.getservbyname(). If there's no matching
    service, error.Abort is raised.
    """
    try:
        return int(port)
    except ValueError:
        pass

    try:
        return socket.getservbyname(port)
    except socket.error:
        raise Abort(_("no port number associated with service '%s'") % port)
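
# Illustrative usage (a sketch, not from the original source; the 'https'
# lookup assumes a standard services database):
#
#   getport(8080)    -> 8080
#   getport('8080')  -> 8080
#   getport('https') -> 443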

_booleans = {'1': True, 'yes': True, 'true': True, 'on': True, 'always': True,
             '0': False, 'no': False, 'false': False, 'off': False,
             'never': False}

def parsebool(s):
    """Parse s into a boolean.

    If s is not a valid boolean, returns None.
    """
    return _booleans.get(s.lower(), None)
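
# Illustrative usage (a sketch, not from the original source):
#
#   parsebool('on')    -> True
#   parsebool('Never') -> False   (the lookup is case-insensitive)
#   parsebool('maybe') -> None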

_hexdig = '0123456789ABCDEFabcdef'
_hextochr = dict((a + b, chr(int(a + b, 16)))
                 for a in _hexdig for b in _hexdig)

def _urlunquote(s):
    """Decode HTTP/HTML % encoding.

    >>> _urlunquote('abc%20def')
    'abc def'
    """
    res = s.split('%')
    # fastpath
    if len(res) == 1:
        return s
    s = res[0]
    for item in res[1:]:
        try:
            s += _hextochr[item[:2]] + item[2:]
        except KeyError:
            s += '%' + item
        except UnicodeDecodeError:
            s += unichr(int(item[:2], 16)) + item[2:]
    return s
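
# Malformed escapes fall through unchanged (a sketch, not from the
# original source):
#
#   _urlunquote('100%zz') -> '100%zz'   (the KeyError branch keeps the
#                                        literal '%')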

class url(object):
    r"""Reliable URL parser.

    This parses URLs and provides attributes for the following
    components:

    <scheme>://<user>:<passwd>@<host>:<port>/<path>?<query>#<fragment>

    Missing components are set to None. The only exception is
    fragment, which is set to '' if present but empty.

    If parsefragment is False, fragment is included in query. If
    parsequery is False, query is included in path. If both are
    False, both fragment and query are included in path.

    See http://www.ietf.org/rfc/rfc2396.txt for more information.

    Note that for backward compatibility reasons, bundle URLs do not
    take host names. That means 'bundle://../' has a path of '../'.

    Examples:

    >>> url('http://www.ietf.org/rfc/rfc2396.txt')
    <url scheme: 'http', host: 'www.ietf.org', path: 'rfc/rfc2396.txt'>
    >>> url('ssh://[::1]:2200//home/joe/repo')
    <url scheme: 'ssh', host: '[::1]', port: '2200', path: '/home/joe/repo'>
    >>> url('file:///home/joe/repo')
    <url scheme: 'file', path: '/home/joe/repo'>
    >>> url('file:///c:/temp/foo/')
    <url scheme: 'file', path: 'c:/temp/foo/'>
    >>> url('bundle:foo')
    <url scheme: 'bundle', path: 'foo'>
    >>> url('bundle://../foo')
    <url scheme: 'bundle', path: '../foo'>
    >>> url(r'c:\foo\bar')
    <url path: 'c:\\foo\\bar'>
    >>> url(r'\\blah\blah\blah')
    <url path: '\\\\blah\\blah\\blah'>
    >>> url(r'\\blah\blah\blah#baz')
    <url path: '\\\\blah\\blah\\blah', fragment: 'baz'>
    >>> url(r'file:///C:\users\me')
    <url scheme: 'file', path: 'C:\\users\\me'>

    Authentication credentials:

    >>> url('ssh://joe:xyz@x/repo')
    <url scheme: 'ssh', user: 'joe', passwd: 'xyz', host: 'x', path: 'repo'>
    >>> url('ssh://joe@x/repo')
    <url scheme: 'ssh', user: 'joe', host: 'x', path: 'repo'>

    Query strings and fragments:

    >>> url('http://host/a?b#c')
    <url scheme: 'http', host: 'host', path: 'a', query: 'b', fragment: 'c'>
    >>> url('http://host/a?b#c', parsequery=False, parsefragment=False)
    <url scheme: 'http', host: 'host', path: 'a?b#c'>
    """

    _safechars = "!~*'()+"
    _safepchars = "/!~*'()+:\\"
    _matchscheme = remod.compile(r'^[a-zA-Z0-9+.\-]+:').match

    def __init__(self, path, parsequery=True, parsefragment=True):
        # We slowly chomp away at path until we have only the path left
        self.scheme = self.user = self.passwd = self.host = None
        self.port = self.path = self.query = self.fragment = None
        self._localpath = True
        self._hostport = ''
        self._origpath = path

        if parsefragment and '#' in path:
            path, self.fragment = path.split('#', 1)
            if not path:
                path = None

        # special case for Windows drive letters and UNC paths
        if hasdriveletter(path) or path.startswith(r'\\'):
            self.path = path
            return

        # For compatibility reasons, we can't handle bundle paths as
        # normal URLs
        if path.startswith('bundle:'):
            self.scheme = 'bundle'
            path = path[7:]
            if path.startswith('//'):
                path = path[2:]
            self.path = path
            return

        if self._matchscheme(path):
            parts = path.split(':', 1)
            if parts[0]:
                self.scheme, path = parts
                self._localpath = False

        if not path:
            path = None
            if self._localpath:
                self.path = ''
                return
        else:
            if self._localpath:
                self.path = path
                return

            if parsequery and '?' in path:
                path, self.query = path.split('?', 1)
                if not path:
                    path = None
                if not self.query:
                    self.query = None

            # // is required to specify a host/authority
            if path and path.startswith('//'):
                parts = path[2:].split('/', 1)
                if len(parts) > 1:
                    self.host, path = parts
                else:
                    self.host = parts[0]
                    path = None
                if not self.host:
                    self.host = None
                    # path of file:///d is /d
                    # path of file:///d:/ is d:/, not /d:/
                    if path and not hasdriveletter(path):
                        path = '/' + path

            if self.host and '@' in self.host:
                self.user, self.host = self.host.rsplit('@', 1)
                if ':' in self.user:
                    self.user, self.passwd = self.user.split(':', 1)
                if not self.host:
                    self.host = None

            # Don't split on colons in IPv6 addresses without ports
            if (self.host and ':' in self.host and
                not (self.host.startswith('[') and self.host.endswith(']'))):
                self._hostport = self.host
                self.host, self.port = self.host.rsplit(':', 1)
                if not self.host:
                    self.host = None

            if (self.host and self.scheme == 'file' and
                self.host not in ('localhost', '127.0.0.1', '[::1]')):
                raise Abort(_('file:// URLs can only refer to localhost'))

        self.path = path

        # leave the query string escaped
        for a in ('user', 'passwd', 'host', 'port',
                  'path', 'fragment'):
            v = getattr(self, a)
            if v is not None:
                setattr(self, a, _urlunquote(v))

    def __repr__(self):
        attrs = []
        for a in ('scheme', 'user', 'passwd', 'host', 'port', 'path',
                  'query', 'fragment'):
            v = getattr(self, a)
            if v is not None:
                attrs.append('%s: %r' % (a, v))
        return '<url %s>' % ', '.join(attrs)

    def __str__(self):
        r"""Join the URL's components back into a URL string.

        Examples:

        >>> str(url('http://user:pw@host:80/c:/bob?fo:oo#ba:ar'))
        'http://user:pw@host:80/c:/bob?fo:oo#ba:ar'
        >>> str(url('http://user:pw@host:80/?foo=bar&baz=42'))
        'http://user:pw@host:80/?foo=bar&baz=42'
        >>> str(url('http://user:pw@host:80/?foo=bar%3dbaz'))
        'http://user:pw@host:80/?foo=bar%3dbaz'
        >>> str(url('ssh://user:pw@[::1]:2200//home/joe#'))
        'ssh://user:pw@[::1]:2200//home/joe#'
        >>> str(url('http://localhost:80//'))
        'http://localhost:80//'
        >>> str(url('http://localhost:80/'))
        'http://localhost:80/'
        >>> str(url('http://localhost:80'))
        'http://localhost:80/'
        >>> str(url('bundle:foo'))
        'bundle:foo'
        >>> str(url('bundle://../foo'))
        'bundle:../foo'
        >>> str(url('path'))
        'path'
        >>> str(url('file:///tmp/foo/bar'))
        'file:///tmp/foo/bar'
        >>> str(url('file:///c:/tmp/foo/bar'))
        'file:///c:/tmp/foo/bar'
        >>> print url(r'bundle:foo\bar')
        bundle:foo\bar
        >>> print url(r'file:///D:\data\hg')
        file:///D:\data\hg
        """
        if self._localpath:
            s = self.path
            if self.scheme == 'bundle':
                s = 'bundle:' + s
            if self.fragment:
                s += '#' + self.fragment
            return s

        s = self.scheme + ':'
        if self.user or self.passwd or self.host:
            s += '//'
        elif self.scheme and (not self.path or self.path.startswith('/')
                              or hasdriveletter(self.path)):
            s += '//'
            if hasdriveletter(self.path):
                s += '/'
        if self.user:
            s += urlreq.quote(self.user, safe=self._safechars)
        if self.passwd:
            s += ':' + urlreq.quote(self.passwd, safe=self._safechars)
        if self.user or self.passwd:
            s += '@'
        if self.host:
            if not (self.host.startswith('[') and self.host.endswith(']')):
                s += urlreq.quote(self.host)
            else:
                s += self.host
            if self.port:
                s += ':' + urlreq.quote(self.port)
            s += '/'
        if self.path:
            # TODO: similar to the query string, we should not unescape the
            # path when we store it, the path might contain '%2f' = '/',
            # which we should *not* escape.
            s += urlreq.quote(self.path, safe=self._safepchars)
        if self.query:
            # we store the query in escaped form.
            s += '?' + self.query
        if self.fragment is not None:
            s += '#' + urlreq.quote(self.fragment, safe=self._safepchars)
        return s

    def authinfo(self):
        user, passwd = self.user, self.passwd
        try:
            self.user, self.passwd = None, None
            s = str(self)
        finally:
            self.user, self.passwd = user, passwd
        if not self.user:
            return (s, None)
        # authinfo[1] is passed to urllib2 password manager, and its
        # URIs must not contain credentials. The host is passed in the
        # URIs list because Python < 2.4.3 uses only that to search for
        # a password.
        return (s, (None, (s, self.host),
                    self.user, self.passwd or ''))

    def isabs(self):
        if self.scheme and self.scheme != 'file':
            return True # remote URL
        if hasdriveletter(self.path):
            return True # absolute for our purposes - can't be joined()
        if self.path.startswith(r'\\'):
            return True # Windows UNC path
        if self.path.startswith('/'):
            return True # POSIX-style
        return False

    def localpath(self):
        if self.scheme == 'file' or self.scheme == 'bundle':
            path = self.path or '/'
            # For Windows, we need to promote hosts containing drive
            # letters to paths with drive letters.
            if hasdriveletter(self._hostport):
                path = self._hostport + '/' + self.path
            elif (self.host is not None and self.path
                  and not hasdriveletter(path)):
                path = '/' + path
            return path
        return self._origpath

    def islocal(self):
        '''whether localpath will return something that posixfile can open'''
        return (not self.scheme or self.scheme == 'file'
                or self.scheme == 'bundle')

def hasscheme(path):
    return bool(url(path).scheme)

def hasdriveletter(path):
    return path and path[1:2] == ':' and path[0:1].isalpha()

def urllocalpath(path):
    return url(path, parsequery=False, parsefragment=False).localpath()

def hidepassword(u):
    '''hide user credential in a url string'''
    u = url(u)
    if u.passwd:
        u.passwd = '***'
    return str(u)

def removeauth(u):
    '''remove all authentication information from a url string'''
    u = url(u)
    u.user = u.passwd = None
    return str(u)
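
# Illustrative usage (a sketch, not from the original source):
#
#   hidepassword('http://user:secret@example.com/repo')
#       -> 'http://user:***@example.com/repo'
#   removeauth('http://user:secret@example.com/repo')
#       -> 'http://example.com/repo'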

def isatty(fp):
    try:
        return fp.isatty()
    except AttributeError:
        return False

timecount = unitcountfn(
    (1, 1e3, _('%.0f s')),
    (100, 1, _('%.1f s')),
    (10, 1, _('%.2f s')),
    (1, 1, _('%.3f s')),
    (100, 0.001, _('%.1f ms')),
    (10, 0.001, _('%.2f ms')),
    (1, 0.001, _('%.3f ms')),
    (100, 0.000001, _('%.1f us')),
    (10, 0.000001, _('%.2f us')),
    (1, 0.000001, _('%.3f us')),
    (100, 0.000000001, _('%.1f ns')),
    (10, 0.000000001, _('%.2f ns')),
    (1, 0.000000001, _('%.3f ns')),
    )

_timenesting = [0]

def timed(func):
    '''Report the execution time of a function call to stderr.

    During development, use as a decorator when you need to measure
    the cost of a function, e.g. as follows:

    @util.timed
    def foo(a, b, c):
        pass
    '''

    def wrapper(*args, **kwargs):
        start = time.time()
        indent = 2
        _timenesting[0] += indent
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.time() - start
            _timenesting[0] -= indent
            sys.stderr.write('%s%s: %s\n' %
                             (' ' * _timenesting[0], func.__name__,
                              timecount(elapsed)))
    return wrapper

_sizeunits = (('m', 2**20), ('k', 2**10), ('g', 2**30),
              ('kb', 2**10), ('mb', 2**20), ('gb', 2**30), ('b', 1))

def sizetoint(s):
    '''Convert a space specifier to a byte count.

    >>> sizetoint('30')
    30
    >>> sizetoint('2.2kb')
    2252
    >>> sizetoint('6M')
    6291456
    '''
    t = s.strip().lower()
    try:
        for k, u in _sizeunits:
            if t.endswith(k):
                return int(float(t[:-len(k)]) * u)
        return int(t)
    except ValueError:
        raise error.ParseError(_("couldn't parse size: %s") % s)

class hooks(object):
    '''A collection of hook functions that can be used to extend a
    function's behavior. Hooks are called in lexicographic order,
    based on the names of their sources.'''

    def __init__(self):
        self._hooks = []

    def add(self, source, hook):
        self._hooks.append((source, hook))

    def __call__(self, *args):
        self._hooks.sort(key=lambda x: x[0])
        results = []
        for source, hook in self._hooks:
            results.append(hook(*args))
        return results
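
# Illustrative usage (a sketch, not from the original source):
#
#   h = hooks()
#   h.add('zsource', lambda x: x * 2)
#   h.add('asource', lambda x: x + 1)
#   h(3)    -> [4, 6]   ('asource' sorts before 'zsource')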

def getstackframes(skip=0, line=' %-*s in %s\n', fileline='%s:%s'):
    '''Yields lines for a nicely formatted stacktrace.
    Skips the 'skip' last entries.
    Each file+linenumber is formatted according to fileline.
    Each line is formatted according to line.
    If line is None, it yields:
      length of longest filepath+line number,
      filepath+linenumber,
      function

    Not to be used in production code, but very convenient while developing.
    '''
    entries = [(fileline % (fn, ln), func)
               for fn, ln, func, _text in traceback.extract_stack()[:-skip - 1]]
    if entries:
        fnmax = max(len(entry[0]) for entry in entries)
        for fnln, func in entries:
            if line is None:
                yield (fnmax, fnln, func)
            else:
                yield line % (fnmax, fnln, func)

def debugstacktrace(msg='stacktrace', skip=0, f=sys.stderr, otherf=sys.stdout):
    '''Writes a message to f (stderr) with a nicely formatted stacktrace.
    Skips the 'skip' last entries. By default it will flush stdout first.
    It can be used everywhere and intentionally does not require an ui object.
    Not to be used in production code, but very convenient while developing.
    '''
    if otherf:
        otherf.flush()
    f.write('%s at:\n' % msg)
    for line in getstackframes(skip + 1):
        f.write(line)
    f.flush()
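
# Illustrative usage (a sketch, not from the original source): drop this
# into a suspect function temporarily to see who calls it.
#
#   debugstacktrace('entering foo', skip=1)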

class dirs(object):
    '''a multiset of directory names from a dirstate or manifest'''

    def __init__(self, map, skip=None):
        self._dirs = {}
        addpath = self.addpath
        if safehasattr(map, 'iteritems') and skip is not None:
            for f, s in map.iteritems():
                if s[0] != skip:
                    addpath(f)
        else:
            for f in map:
                addpath(f)

    def addpath(self, path):
        dirs = self._dirs
        for base in finddirs(path):
            if base in dirs:
                dirs[base] += 1
                return
            dirs[base] = 1

    def delpath(self, path):
        dirs = self._dirs
        for base in finddirs(path):
            if dirs[base] > 1:
                dirs[base] -= 1
                return
            del dirs[base]

    def __iter__(self):
        return self._dirs.iterkeys()

    def __contains__(self, d):
        return d in self._dirs

if safehasattr(parsers, 'dirs'):
    dirs = parsers.dirs
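
# Illustrative usage (a sketch, not from the original source):
#
#   d = dirs(['a/b/c', 'a/d'])
#   'a' in d      -> True
#   'a/b' in d    -> True
#   'a/b/c' in d  -> False   (leaf entries are files, not directories)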

def finddirs(path):
    pos = path.rfind('/')
    while pos != -1:
        yield path[:pos]
        pos = path.rfind('/', 0, pos)
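
# Illustrative usage (a sketch, not from the original source); the root
# ('') is never yielded:
#
#   list(finddirs('a/b/c')) -> ['a/b', 'a']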

# compression utility

class nocompress(object):
    def compress(self, x):
        return x
    def flush(self):
        return ""

compressors = {
    None: nocompress,
    # lambda to prevent early import
    'BZ': lambda: bz2.BZ2Compressor(),
    'GZ': lambda: zlib.compressobj(),
    }
# also support the old form as a courtesy
compressors['UN'] = compressors[None]
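
# Illustrative usage (a sketch, not from the original source):
#
#   comp = compressors['BZ']()
#   data = comp.compress('chunk one') + comp.compress('chunk two')
#   data += comp.flush()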

def _makedecompressor(decompcls):
    def generator(f):
        d = decompcls()
        for chunk in filechunkiter(f):
            yield d.decompress(chunk)
    def func(fh):
        return chunkbuffer(generator(fh))
    return func

class ctxmanager(object):
    '''A context manager for use in 'with' blocks to allow multiple
    contexts to be entered at once. This is both safer and more
    flexible than contextlib.nested.

    Once Mercurial supports Python 2.7+, this will become mostly
    unnecessary.
    '''

    def __init__(self, *args):
        '''Accepts a list of no-argument functions that return context
        managers. These will be invoked when enter() is called.'''
        self._pending = args
        self._atexit = []

    def __enter__(self):
        return self

    def enter(self):
        '''Create and enter context managers in the order in which they were
        passed to the constructor.'''
        values = []
        for func in self._pending:
            obj = func()
            values.append(obj.__enter__())
            self._atexit.append(obj.__exit__)
        del self._pending
        return values

    def atexit(self, func, *args, **kwargs):
        '''Add a function to call when this context manager exits. The
        ordering of multiple atexit calls is unspecified, save that
        they will happen before any __exit__ functions.'''
        def wrapper(exc_type, exc_val, exc_tb):
            func(*args, **kwargs)
        self._atexit.append(wrapper)
        return func

    def __exit__(self, exc_type, exc_val, exc_tb):
        '''Context managers are exited in the reverse order from which
        they were created.'''
        received = exc_type is not None
        suppressed = False
        pending = None
        self._atexit.reverse()
        for exitfunc in self._atexit:
            try:
                if exitfunc(exc_type, exc_val, exc_tb):
                    suppressed = True
                    exc_type = None
                    exc_val = None
                    exc_tb = None
            except BaseException:
                exc_type, exc_val, exc_tb = pending = sys.exc_info()
        del self._atexit
        if pending:
            raise exc_val
        return received and suppressed
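
# Illustrative usage (a sketch; 'makelock1', 'makelock2' and 'cleanup' are
# hypothetical, not from the original source):
#
#   with ctxmanager(makelock1, makelock2) as c:
#       lock1, lock2 = c.enter()
#       c.atexit(cleanup)   # runs before lock2 and lock1 are exited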

def _bz2():
    d = bz2.BZ2Decompressor()
    # Bzip2 streams start with BZ, but we stripped it;
    # put it back for good measure.
    d.decompress('BZ')
    return d

decompressors = {None: lambda fh: fh,
                 '_truncatedBZ': _makedecompressor(_bz2),
                 'BZ': _makedecompressor(lambda: bz2.BZ2Decompressor()),
                 'GZ': _makedecompressor(lambda: zlib.decompressobj()),
                 }
# also support the old form as a courtesy
decompressors['UN'] = decompressors[None]
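
# Illustrative usage (a sketch; 'rawfh' is a hypothetical file object, not
# from the original source):
#
#   fh = decompressors['GZ'](rawfh)   # a chunkbuffer yielding inflated data
#   data = fh.read(4096)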

# convenient shortcut
dst = debugstacktrace