procutil: always popen() in binary mode...
Yuya Nishihara
r37476:00e4bd97 default
@@ -1,1129 +1,1129
1 1 # bugzilla.py - bugzilla integration for mercurial
2 2 #
3 3 # Copyright 2006 Vadim Gelfer <vadim.gelfer@gmail.com>
4 4 # Copyright 2011-4 Jim Hague <jim.hague@acm.org>
5 5 #
6 6 # This software may be used and distributed according to the terms of the
7 7 # GNU General Public License version 2 or any later version.
8 8
9 9 '''hooks for integrating with the Bugzilla bug tracker
10 10
11 11 This hook extension adds comments on bugs in Bugzilla when changesets
12 12 that refer to bugs by Bugzilla ID are seen. The comment is formatted using
13 13 the Mercurial template mechanism.
14 14
15 15 The bug references can optionally include an update for Bugzilla of the
16 16 hours spent working on the bug. Bugs can also be marked fixed.
17 17
18 18 Four basic modes of access to Bugzilla are provided:
19 19
20 20 1. Access via the Bugzilla REST-API. Requires bugzilla 5.0 or later.
21 21
22 22 2. Access via the Bugzilla XMLRPC interface. Requires Bugzilla 3.4 or later.
23 23
24 24 3. Check data via the Bugzilla XMLRPC interface and submit bug change
25 25 via email to Bugzilla email interface. Requires Bugzilla 3.4 or later.
26 26
27 27 4. Writing directly to the Bugzilla database. Only Bugzilla installations
28 28 using MySQL are supported. Requires Python MySQLdb.
29 29
30 30 Writing directly to the database is susceptible to schema changes, and
31 31 relies on a Bugzilla contrib script to send out bug change
32 32 notification emails. This script runs as the user running Mercurial,
33 33 must be run on the host with the Bugzilla install, and requires
34 34 permission to read Bugzilla configuration details and the necessary
35 35 MySQL user and password to have full access rights to the Bugzilla
36 36 database. For these reasons this access mode is now considered
37 37 deprecated, and will not be updated for new Bugzilla versions going
38 38 forward. Only adding comments is supported in this access mode.
39 39
40 40 Access via XMLRPC needs a Bugzilla username and password to be specified
41 41 in the configuration. Comments are added under that username. Since the
42 42 configuration must be readable by all Mercurial users, it is recommended
43 43 that the rights of that user are restricted in Bugzilla to the minimum
44 44 necessary to add comments. Marking bugs fixed requires Bugzilla 4.0 or later.
45 45
46 46 Access via XMLRPC/email uses XMLRPC to query Bugzilla, but sends
47 47 email to the Bugzilla email interface to submit comments to bugs.
48 48 The From: address in the email is set to the email address of the Mercurial
49 49 user, so the comment appears to come from the Mercurial user. In the event
50 50 that the Mercurial user email is not recognized by Bugzilla as a Bugzilla
51 51 user, the email associated with the Bugzilla username used to log into
52 52 Bugzilla is used instead as the source of the comment. Marking bugs fixed
53 53 works on all supported Bugzilla versions.
54 54
55 55 Access via the REST-API needs either a Bugzilla username and password
56 56 or an apikey specified in the configuration. Comments are made under
57 57 the given username or the user associated with the apikey in Bugzilla.
58 58
59 59 Configuration items common to all access modes:
60 60
61 61 bugzilla.version
62 62 The access type to use. Values recognized are:
63 63
64 64 :``restapi``: Bugzilla REST-API, Bugzilla 5.0 and later.
65 65 :``xmlrpc``: Bugzilla XMLRPC interface.
66 66 :``xmlrpc+email``: Bugzilla XMLRPC and email interfaces.
67 67 :``3.0``: MySQL access, Bugzilla 3.0 and later.
68 68 :``2.18``: MySQL access, Bugzilla 2.18 and up to but not
69 69 including 3.0.
70 70 :``2.16``: MySQL access, Bugzilla 2.16 and up to but not
71 71 including 2.18.
72 72
73 73 bugzilla.regexp
74 74 Regular expression to match bug IDs for update in changeset commit message.
75 75 It must contain one "()" named group ``<ids>`` containing the bug
76 76 IDs separated by non-digit characters. It may also contain
77 77 a named group ``<hours>`` with a floating-point number giving the
78 78 hours worked on the bug. If no named groups are present, the first
79 79 "()" group is assumed to contain the bug IDs, and work time is not
80 80 updated. The default expression matches ``Bug 1234``, ``Bug no. 1234``,
81 81 ``Bug number 1234``, ``Bugs 1234,5678``, ``Bug 1234 and 5678`` and
82 82 variations thereof, followed by an hours number prefixed by ``h`` or
83 83 ``hours``, e.g. ``hours 1.5``. Matching is case insensitive.
84 84
85 85 bugzilla.fixregexp
86 86 Regular expression to match bug IDs for marking fixed in changeset
87 87 commit message. This must contain a "()" named group ``<ids>`` containing
88 88 the bug IDs separated by non-digit characters. It may also contain
89 89 a named group ``<hours>`` with a floating-point number giving the
90 90 hours worked on the bug. If no named groups are present, the first
91 91 "()" group is assumed to contain the bug IDs, and work time is not
92 92 updated. The default expression matches ``Fixes 1234``, ``Fixes bug 1234``,
93 93 ``Fixes bugs 1234,5678``, ``Fixes 1234 and 5678`` and
94 94 variations thereof, followed by an hours number prefixed by ``h`` or
95 95 ``hours``, e.g. ``hours 1.5``. Matching is case insensitive.
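The two default expressions described above can be exercised directly. This is a small sketch using the defaults exactly as defined in the configitem declarations later in this file, showing how the ``<ids>`` and ``<hours>`` groups come back; the sample commit messages are made up for illustration:

```python
import re

# Default bugzilla.regexp and bugzilla.fixregexp, copied from the
# configitem defaults further down in this file.
bug_re = re.compile(
    r'bugs?\s*,?\s*(?:#|nos?\.?|num(?:ber)?s?)?\s*'
    r'(?P<ids>(?:\d+\s*(?:,?\s*(?:and)?)?\s*)+)'
    r'\.?\s*(?:h(?:ours?)?\s*(?P<hours>\d*(?:\.\d+)?))?', re.IGNORECASE)
fix_re = re.compile(
    r'fix(?:es)?\s*(?:bugs?\s*)?,?\s*'
    r'(?:nos?\.?|num(?:ber)?s?)?\s*'
    r'(?P<ids>(?:#?\d+\s*(?:,?\s*(?:and)?)?\s*)+)'
    r'\.?\s*(?:h(?:ours?)?\s*(?P<hours>\d*(?:\.\d+)?))?', re.IGNORECASE)

m = bug_re.search('Bugs 1234 and 5678 hours 1.5')
ids = re.findall(r'\d+', m.group('ids'))   # ['1234', '5678']
hours = m.group('hours')                   # '1.5'

f = fix_re.search('Fixes bug 4321 h 2')    # ids '4321', hours '2'
```

The ``<ids>`` group captures the whole run of IDs and separators, so the hook still splits it on non-digit characters afterwards.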
96 96
97 97 bugzilla.fixstatus
98 98 The status to set a bug to when marking fixed. Default ``RESOLVED``.
99 99
100 100 bugzilla.fixresolution
101 101 The resolution to set a bug to when marking fixed. Default ``FIXED``.
102 102
103 103 bugzilla.style
104 104 The style file to use when formatting comments.
105 105
106 106 bugzilla.template
107 107 Template to use when formatting comments. Overrides style if
108 108 specified. In addition to the usual Mercurial keywords, the
109 109 extension specifies:
110 110
111 111 :``{bug}``: The Bugzilla bug ID.
112 112 :``{root}``: The full pathname of the Mercurial repository.
113 113 :``{webroot}``: Stripped pathname of the Mercurial repository.
114 114 :``{hgweb}``: Base URL for browsing Mercurial repositories.
115 115
116 116 Default ``changeset {node|short} in repo {root} refers to bug
117 117 {bug}.\\ndetails:\\n\\t{desc|tabindent}``
118 118
119 119 bugzilla.strip
120 120 The number of path separator characters to strip from the front of
121 121 the Mercurial repository path (``{root}`` in templates) to produce
122 122 ``{webroot}``. For example, a repository with ``{root}``
123 123 ``/var/local/my-project`` with a strip of 2 gives a value for
124 124 ``{webroot}`` of ``my-project``. Default 0.
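The stripping described above can be sketched as follows; the helper name ``webroot`` here is illustrative, not the extension's internal API:

```python
def webroot(root, strip):
    # sketch of the documented {webroot} computation: drop the leading
    # slash, then remove `strip` leading path components
    root = root.lstrip('/')
    while strip > 0:
        c = root.find('/')
        if c == -1:
            break
        root = root[c + 1:]
        strip -= 1
    return root

webroot('/var/local/my-project', 2)   # 'my-project'
webroot('/var/local/my-project', 0)   # 'var/local/my-project'
```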
125 125
126 126 web.baseurl
127 127 Base URL for browsing Mercurial repositories. Referenced from
128 128 templates as ``{hgweb}``.
129 129
130 130 Configuration items common to XMLRPC+email and MySQL access modes:
131 131
132 132 bugzilla.usermap
133 133 Path of file containing Mercurial committer email to Bugzilla user email
134 134 mappings. If specified, the file should contain one mapping per
135 135 line::
136 136
137 137 committer = Bugzilla user
138 138
139 139 See also the ``[usermap]`` section.
140 140
141 141 The ``[usermap]`` section is used to specify mappings of Mercurial
142 142 committer email to Bugzilla user email. See also ``bugzilla.usermap``.
143 143 Contains entries of the form ``committer = Bugzilla user``.
144 144
145 145 XMLRPC and REST-API access mode configuration:
146 146
147 147 bugzilla.bzurl
148 148 The base URL for the Bugzilla installation.
149 149 Default ``http://localhost/bugzilla``.
150 150
151 151 bugzilla.user
152 152 The username to use to log into Bugzilla via XMLRPC. Default
153 153 ``bugs``.
154 154
155 155 bugzilla.password
156 156 The password for Bugzilla login.
157 157
158 158 REST-API access mode uses the options listed above as well as:
159 159
160 160 bugzilla.apikey
161 161 An apikey generated on the Bugzilla instance for API access.
162 162 Using an apikey removes the need to store the user and password
163 163 options.
164 164
165 165 XMLRPC+email access mode uses the XMLRPC access mode configuration items,
166 166 and also:
167 167
168 168 bugzilla.bzemail
169 169 The Bugzilla email address.
170 170
171 171 In addition, the Mercurial email settings must be configured. See the
172 172 documentation in hgrc(5), sections ``[email]`` and ``[smtp]``.
173 173
174 174 MySQL access mode configuration:
175 175
176 176 bugzilla.host
177 177 Hostname of the MySQL server holding the Bugzilla database.
178 178 Default ``localhost``.
179 179
180 180 bugzilla.db
181 181 Name of the Bugzilla database in MySQL. Default ``bugs``.
182 182
183 183 bugzilla.user
184 184 Username to use to access MySQL server. Default ``bugs``.
185 185
186 186 bugzilla.password
187 187 Password to use to access MySQL server.
188 188
189 189 bugzilla.timeout
190 190 Database connection timeout (seconds). Default 5.
191 191
192 192 bugzilla.bzuser
193 193 Fallback Bugzilla user name to record comments with, if changeset
194 194 committer cannot be found as a Bugzilla user.
195 195
196 196 bugzilla.bzdir
197 197 Bugzilla install directory. Used by default notify. Default
198 198 ``/var/www/html/bugzilla``.
199 199
200 200 bugzilla.notify
201 201 The command to run to get Bugzilla to send bug change notification
202 202 emails. Substitutes from a map with 3 keys, ``bzdir``, ``id`` (bug
203 203 id) and ``user`` (committer bugzilla email). Default depends on
204 204 version; from 2.18 it is "cd %(bzdir)s && perl -T
205 205 contrib/sendbugmail.pl %(id)s %(user)s".
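The substitution is ordinary Python %-formatting with a mapping; the ``bzdir`` and ``user`` values below are example data:

```python
cmdfmt = "cd %(bzdir)s && perl -T contrib/sendbugmail.pl %(id)s %(user)s"
# keys supplied by the hook: bzdir, id (bug id), user (committer email)
cmd = cmdfmt % {'bzdir': '/var/www/html/bugzilla',
                'id': 1234,
                'user': 'bugs@example.com'}
# cmd == 'cd /var/www/html/bugzilla && perl -T contrib/sendbugmail.pl 1234 bugs@example.com'
```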
206 206
207 207 Activating the extension::
208 208
209 209 [extensions]
210 210 bugzilla =
211 211
212 212 [hooks]
213 213 # run bugzilla hook on every change pulled or pushed in here
214 214 incoming.bugzilla = python:hgext.bugzilla.hook
215 215
216 216 Example configurations:
217 217
218 218 XMLRPC example configuration. This uses the Bugzilla at
219 219 ``http://my-project.org/bugzilla``, logging in as user
220 220 ``bugmail@my-project.org`` with password ``plugh``. It is used with a
221 221 collection of Mercurial repositories in ``/var/local/hg/repos/``,
222 222 with a web interface at ``http://my-project.org/hg``. ::
223 223
224 224 [bugzilla]
225 225 bzurl=http://my-project.org/bugzilla
226 226 user=bugmail@my-project.org
227 227 password=plugh
228 228 version=xmlrpc
229 229 template=Changeset {node|short} in {root|basename}.
230 230 {hgweb}/{webroot}/rev/{node|short}\\n
231 231 {desc}\\n
232 232 strip=5
233 233
234 234 [web]
235 235 baseurl=http://my-project.org/hg
236 236
237 237 XMLRPC+email example configuration. This uses the Bugzilla at
238 238 ``http://my-project.org/bugzilla``, logging in as user
239 239 ``bugmail@my-project.org`` with password ``plugh``. It is used with a
240 240 collection of Mercurial repositories in ``/var/local/hg/repos/``,
241 241 with a web interface at ``http://my-project.org/hg``. Bug comments
242 242 are sent to the Bugzilla email address
243 243 ``bugzilla@my-project.org``. ::
244 244
245 245 [bugzilla]
246 246 bzurl=http://my-project.org/bugzilla
247 247 user=bugmail@my-project.org
248 248 password=plugh
249 249 version=xmlrpc+email
250 250 bzemail=bugzilla@my-project.org
251 251 template=Changeset {node|short} in {root|basename}.
252 252 {hgweb}/{webroot}/rev/{node|short}\\n
253 253 {desc}\\n
254 254 strip=5
255 255
256 256 [web]
257 257 baseurl=http://my-project.org/hg
258 258
259 259 [usermap]
260 260 user@emaildomain.com=user.name@bugzilladomain.com
261 261
262 262 MySQL example configuration. This has a local Bugzilla 3.2 installation
263 263 in ``/opt/bugzilla-3.2``. The MySQL database is on ``localhost``,
264 264 the Bugzilla database name is ``bugs`` and MySQL is
265 265 accessed with MySQL username ``bugs`` password ``XYZZY``. It is used
266 266 with a collection of Mercurial repositories in ``/var/local/hg/repos/``,
267 267 with a web interface at ``http://my-project.org/hg``. ::
268 268
269 269 [bugzilla]
270 270 host=localhost
271 271 password=XYZZY
272 272 version=3.0
273 273 bzuser=unknown@domain.com
274 274 bzdir=/opt/bugzilla-3.2
275 275 template=Changeset {node|short} in {root|basename}.
276 276 {hgweb}/{webroot}/rev/{node|short}\\n
277 277 {desc}\\n
278 278 strip=5
279 279
280 280 [web]
281 281 baseurl=http://my-project.org/hg
282 282
283 283 [usermap]
284 284 user@emaildomain.com=user.name@bugzilladomain.com
285 285
286 286 All the above add a comment to the Bugzilla bug record of the form::
287 287
288 288 Changeset 3b16791d6642 in repository-name.
289 289 http://my-project.org/hg/repository-name/rev/3b16791d6642
290 290
291 291 Changeset commit comment. Bug 1234.
292 292 '''
293 293
294 294 from __future__ import absolute_import
295 295
296 296 import json
297 297 import re
298 298 import time
299 299
300 300 from mercurial.i18n import _
301 301 from mercurial.node import short
302 302 from mercurial import (
303 303 error,
304 304 logcmdutil,
305 305 mail,
306 306 registrar,
307 307 url,
308 308 util,
309 309 )
310 310 from mercurial.utils import (
311 311 procutil,
312 312 stringutil,
313 313 )
314 314
315 315 xmlrpclib = util.xmlrpclib
316 316
317 317 # Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
318 318 # extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
319 319 # be specifying the version(s) of Mercurial they are tested with, or
320 320 # leave the attribute unspecified.
321 321 testedwith = 'ships-with-hg-core'
322 322
323 323 configtable = {}
324 324 configitem = registrar.configitem(configtable)
325 325
326 326 configitem('bugzilla', 'apikey',
327 327 default='',
328 328 )
329 329 configitem('bugzilla', 'bzdir',
330 330 default='/var/www/html/bugzilla',
331 331 )
332 332 configitem('bugzilla', 'bzemail',
333 333 default=None,
334 334 )
335 335 configitem('bugzilla', 'bzurl',
336 336 default='http://localhost/bugzilla/',
337 337 )
338 338 configitem('bugzilla', 'bzuser',
339 339 default=None,
340 340 )
341 341 configitem('bugzilla', 'db',
342 342 default='bugs',
343 343 )
344 344 configitem('bugzilla', 'fixregexp',
345 345 default=(r'fix(?:es)?\s*(?:bugs?\s*)?,?\s*'
346 346 r'(?:nos?\.?|num(?:ber)?s?)?\s*'
347 347 r'(?P<ids>(?:#?\d+\s*(?:,?\s*(?:and)?)?\s*)+)'
348 348 r'\.?\s*(?:h(?:ours?)?\s*(?P<hours>\d*(?:\.\d+)?))?')
349 349 )
350 350 configitem('bugzilla', 'fixresolution',
351 351 default='FIXED',
352 352 )
353 353 configitem('bugzilla', 'fixstatus',
354 354 default='RESOLVED',
355 355 )
356 356 configitem('bugzilla', 'host',
357 357 default='localhost',
358 358 )
359 359 configitem('bugzilla', 'notify',
360 360 default=configitem.dynamicdefault,
361 361 )
362 362 configitem('bugzilla', 'password',
363 363 default=None,
364 364 )
365 365 configitem('bugzilla', 'regexp',
366 366 default=(r'bugs?\s*,?\s*(?:#|nos?\.?|num(?:ber)?s?)?\s*'
367 367 r'(?P<ids>(?:\d+\s*(?:,?\s*(?:and)?)?\s*)+)'
368 368 r'\.?\s*(?:h(?:ours?)?\s*(?P<hours>\d*(?:\.\d+)?))?')
369 369 )
370 370 configitem('bugzilla', 'strip',
371 371 default=0,
372 372 )
373 373 configitem('bugzilla', 'style',
374 374 default=None,
375 375 )
376 376 configitem('bugzilla', 'template',
377 377 default=None,
378 378 )
379 379 configitem('bugzilla', 'timeout',
380 380 default=5,
381 381 )
382 382 configitem('bugzilla', 'user',
383 383 default='bugs',
384 384 )
385 385 configitem('bugzilla', 'usermap',
386 386 default=None,
387 387 )
388 388 configitem('bugzilla', 'version',
389 389 default=None,
390 390 )
391 391
392 392 class bzaccess(object):
393 393 '''Base class for access to Bugzilla.'''
394 394
395 395 def __init__(self, ui):
396 396 self.ui = ui
397 397 usermap = self.ui.config('bugzilla', 'usermap')
398 398 if usermap:
399 399 self.ui.readconfig(usermap, sections=['usermap'])
400 400
401 401 def map_committer(self, user):
402 402 '''map name of committer to Bugzilla user name.'''
403 403 for committer, bzuser in self.ui.configitems('usermap'):
404 404 if committer.lower() == user.lower():
405 405 return bzuser
406 406 return user
407 407
408 408 # Methods to be implemented by access classes.
409 409 #
410 410 # 'bugs' is a dict keyed on bug id, where values are a dict holding
411 411 # updates to bug state. Recognized dict keys are:
412 412 #
413 413 # 'hours': Value, float containing work hours to be updated.
414 414 # 'fix': If key present, bug is to be marked fixed. Value ignored.
415 415
416 416 def filter_real_bug_ids(self, bugs):
417 417 '''remove bug IDs that do not exist in Bugzilla from bugs.'''
418 418
419 419 def filter_cset_known_bug_ids(self, node, bugs):
420 420 '''remove bug IDs where node occurs in comment text from bugs.'''
421 421
422 422 def updatebug(self, bugid, newstate, text, committer):
423 423 '''update the specified bug. Add comment text and set new states.
424 424
425 425 If possible add the comment as being from the committer of
426 426 the changeset. Otherwise use the default Bugzilla user.
427 427 '''
428 428
429 429 def notify(self, bugs, committer):
430 430 '''Force sending of Bugzilla notification emails.
431 431
432 432 Only required if the access method does not trigger notification
433 433 emails automatically.
434 434 '''
435 435
436 436 # Bugzilla via direct access to MySQL database.
437 437 class bzmysql(bzaccess):
438 438 '''Support for direct MySQL access to Bugzilla.
439 439
440 440 The earliest Bugzilla version this is tested with is version 2.16.
441 441
442 442 If your Bugzilla is version 3.4 or above, you are strongly
443 443 recommended to use the XMLRPC access method instead.
444 444 '''
445 445
446 446 @staticmethod
447 447 def sql_buglist(ids):
448 448 '''return SQL-friendly list of bug ids'''
449 449 return '(' + ','.join(map(str, ids)) + ')'
450 450
451 451 _MySQLdb = None
452 452
453 453 def __init__(self, ui):
454 454 try:
455 455 import MySQLdb as mysql
456 456 bzmysql._MySQLdb = mysql
457 457 except ImportError as err:
458 458 raise error.Abort(_('python mysql support not available: %s') % err)
459 459
460 460 bzaccess.__init__(self, ui)
461 461
462 462 host = self.ui.config('bugzilla', 'host')
463 463 user = self.ui.config('bugzilla', 'user')
464 464 passwd = self.ui.config('bugzilla', 'password')
465 465 db = self.ui.config('bugzilla', 'db')
466 466 timeout = int(self.ui.config('bugzilla', 'timeout'))
467 467 self.ui.note(_('connecting to %s:%s as %s, password %s\n') %
468 468 (host, db, user, '*' * len(passwd)))
469 469 self.conn = bzmysql._MySQLdb.connect(host=host,
470 470 user=user, passwd=passwd,
471 471 db=db,
472 472 connect_timeout=timeout)
473 473 self.cursor = self.conn.cursor()
474 474 self.longdesc_id = self.get_longdesc_id()
475 475 self.user_ids = {}
476 476 self.default_notify = "cd %(bzdir)s && ./processmail %(id)s %(user)s"
477 477
478 478 def run(self, *args, **kwargs):
479 479 '''run a query.'''
480 480 self.ui.note(_('query: %s %s\n') % (args, kwargs))
481 481 try:
482 482 self.cursor.execute(*args, **kwargs)
483 483 except bzmysql._MySQLdb.MySQLError:
484 484 self.ui.note(_('failed query: %s %s\n') % (args, kwargs))
485 485 raise
486 486
487 487 def get_longdesc_id(self):
488 488 '''get identity of longdesc field'''
489 489 self.run('select fieldid from fielddefs where name = "longdesc"')
490 490 ids = self.cursor.fetchall()
491 491 if len(ids) != 1:
492 492 raise error.Abort(_('unknown database schema'))
493 493 return ids[0][0]
494 494
495 495 def filter_real_bug_ids(self, bugs):
496 496 '''filter not-existing bugs from set.'''
497 497 self.run('select bug_id from bugs where bug_id in %s' %
498 498 bzmysql.sql_buglist(bugs.keys()))
499 499 existing = [id for (id,) in self.cursor.fetchall()]
500 500 for id in bugs.keys():
501 501 if id not in existing:
502 502 self.ui.status(_('bug %d does not exist\n') % id)
503 503 del bugs[id]
504 504
505 505 def filter_cset_known_bug_ids(self, node, bugs):
506 506 '''filter bug ids that already refer to this changeset from set.'''
507 507 self.run('''select bug_id from longdescs where
508 508 bug_id in %s and thetext like "%%%s%%"''' %
509 509 (bzmysql.sql_buglist(bugs.keys()), short(node)))
510 510 for (id,) in self.cursor.fetchall():
511 511 self.ui.status(_('bug %d already knows about changeset %s\n') %
512 512 (id, short(node)))
513 513 del bugs[id]
514 514
515 515 def notify(self, bugs, committer):
516 516 '''tell bugzilla to send mail.'''
517 517 self.ui.status(_('telling bugzilla to send mail:\n'))
518 518 (user, userid) = self.get_bugzilla_user(committer)
519 519 for id in bugs.keys():
520 520 self.ui.status(_(' bug %s\n') % id)
521 521 cmdfmt = self.ui.config('bugzilla', 'notify', self.default_notify)
522 522 bzdir = self.ui.config('bugzilla', 'bzdir')
523 523 try:
524 524 # Backwards-compatible with old notify string, which
525 525 # took one string. This will throw with a new format
526 526 # string.
527 527 cmd = cmdfmt % id
528 528 except TypeError:
529 529 cmd = cmdfmt % {'bzdir': bzdir, 'id': id, 'user': user}
530 530 self.ui.note(_('running notify command %s\n') % cmd)
531 fp = procutil.popen('(%s) 2>&1' % cmd)
532 out = fp.read()
531 fp = procutil.popen('(%s) 2>&1' % cmd, 'rb')
532 out = util.fromnativeeol(fp.read())
533 533 ret = fp.close()
534 534 if ret:
535 535 self.ui.warn(out)
536 536 raise error.Abort(_('bugzilla notify command %s') %
537 537 procutil.explainexit(ret)[0])
538 538 self.ui.status(_('done\n'))
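The change in notify() above reads the notify command's pipe in binary mode and normalizes line endings afterwards. A minimal sketch of what ``util.fromnativeeol`` amounts to on Windows (on POSIX it is the identity); the function name here is illustrative:

```python
def fromnativeeol_windows(s):
    # sketch: a binary-mode pipe read keeps os.linesep intact, so CRLF
    # from a Windows child process must be folded back to LF by hand
    return s.replace(b'\r\n', b'\n')

fromnativeeol_windows(b'bug 1234 updated\r\ndone\r\n')
# b'bug 1234 updated\ndone\n'
```

Reading in binary and normalizing explicitly makes the captured output identical across platforms, which is the point of this changeset.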
539 539
540 540 def get_user_id(self, user):
541 541 '''look up numeric bugzilla user id.'''
542 542 try:
543 543 return self.user_ids[user]
544 544 except KeyError:
545 545 try:
546 546 userid = int(user)
547 547 except ValueError:
548 548 self.ui.note(_('looking up user %s\n') % user)
549 549 self.run('''select userid from profiles
550 550 where login_name like %s''', user)
551 551 all = self.cursor.fetchall()
552 552 if len(all) != 1:
553 553 raise KeyError(user)
554 554 userid = int(all[0][0])
555 555 self.user_ids[user] = userid
556 556 return userid
557 557
558 558 def get_bugzilla_user(self, committer):
559 559 '''See if committer is a registered bugzilla user. Return
560 560 bugzilla username and userid if so. If not, return default
561 561 bugzilla username and userid.'''
562 562 user = self.map_committer(committer)
563 563 try:
564 564 userid = self.get_user_id(user)
565 565 except KeyError:
566 566 try:
567 567 defaultuser = self.ui.config('bugzilla', 'bzuser')
568 568 if not defaultuser:
569 569 raise error.Abort(_('cannot find bugzilla user id for %s') %
570 570 user)
571 571 userid = self.get_user_id(defaultuser)
572 572 user = defaultuser
573 573 except KeyError:
574 574 raise error.Abort(_('cannot find bugzilla user id for %s or %s')
575 575 % (user, defaultuser))
576 576 return (user, userid)
577 577
578 578 def updatebug(self, bugid, newstate, text, committer):
579 579 '''update bug state with comment text.
580 580
581 581 Try adding comment as committer of changeset, otherwise as
582 582 default bugzilla user.'''
583 583 if len(newstate) > 0:
584 584 self.ui.warn(_("Bugzilla/MySQL cannot update bug state\n"))
585 585
586 586 (user, userid) = self.get_bugzilla_user(committer)
587 587 now = time.strftime(r'%Y-%m-%d %H:%M:%S')
588 588 self.run('''insert into longdescs
589 589 (bug_id, who, bug_when, thetext)
590 590 values (%s, %s, %s, %s)''',
591 591 (bugid, userid, now, text))
592 592 self.run('''insert into bugs_activity (bug_id, who, bug_when, fieldid)
593 593 values (%s, %s, %s, %s)''',
594 594 (bugid, userid, now, self.longdesc_id))
595 595 self.conn.commit()
596 596
597 597 class bzmysql_2_18(bzmysql):
598 598 '''support for bugzilla 2.18 series.'''
599 599
600 600 def __init__(self, ui):
601 601 bzmysql.__init__(self, ui)
602 602 self.default_notify = \
603 603 "cd %(bzdir)s && perl -T contrib/sendbugmail.pl %(id)s %(user)s"
604 604
605 605 class bzmysql_3_0(bzmysql_2_18):
606 606 '''support for bugzilla 3.0 series.'''
607 607
608 608 def __init__(self, ui):
609 609 bzmysql_2_18.__init__(self, ui)
610 610
611 611 def get_longdesc_id(self):
612 612 '''get identity of longdesc field'''
613 613 self.run('select id from fielddefs where name = "longdesc"')
614 614 ids = self.cursor.fetchall()
615 615 if len(ids) != 1:
616 616 raise error.Abort(_('unknown database schema'))
617 617 return ids[0][0]
618 618
619 619 # Bugzilla via XMLRPC interface.
620 620
621 621 class cookietransportrequest(object):
622 622 """A Transport request method that retains cookies over its lifetime.
623 623
624 624 The regular xmlrpclib transports ignore cookies. Which causes
625 625 a bit of a problem when you need a cookie-based login, as with
626 626 the Bugzilla XMLRPC interface prior to 4.4.3.
627 627
628 628 So this is a helper for defining a Transport which looks for
629 629 cookies being set in responses and saves them to add to all future
630 630 requests.
631 631 """
632 632
633 633 # Inspiration drawn from
634 634 # http://blog.godson.in/2010/09/how-to-make-python-xmlrpclib-client.html
635 635 # http://www.itkovian.net/base/transport-class-for-pythons-xml-rpc-lib/
636 636
637 637 cookies = []
638 638 def send_cookies(self, connection):
639 639 if self.cookies:
640 640 for cookie in self.cookies:
641 641 connection.putheader("Cookie", cookie)
642 642
643 643 def request(self, host, handler, request_body, verbose=0):
644 644 self.verbose = verbose
645 645 self.accept_gzip_encoding = False
646 646
647 647 # issue XML-RPC request
648 648 h = self.make_connection(host)
649 649 if verbose:
650 650 h.set_debuglevel(1)
651 651
652 652 self.send_request(h, handler, request_body)
653 653 self.send_host(h, host)
654 654 self.send_cookies(h)
655 655 self.send_user_agent(h)
656 656 self.send_content(h, request_body)
657 657
658 658 # Deal with differences between Python 2.6 and 2.7.
659 659 # In the former h is a HTTP(S). In the latter it's a
660 660 # HTTP(S)Connection. Luckily, the 2.6 implementation of
661 661 # HTTP(S) has an underlying HTTP(S)Connection, so extract
662 662 # that and use it.
663 663 try:
664 664 response = h.getresponse()
665 665 except AttributeError:
666 666 response = h._conn.getresponse()
667 667
668 668 # Add any cookie definitions to our list.
669 669 for header in response.msg.getallmatchingheaders("Set-Cookie"):
670 670 val = header.split(": ", 1)[1]
671 671 cookie = val.split(";", 1)[0]
672 672 self.cookies.append(cookie)
673 673
674 674 if response.status != 200:
675 675 raise xmlrpclib.ProtocolError(host + handler, response.status,
676 676 response.reason, response.msg.headers)
677 677
678 678 payload = response.read()
679 679 parser, unmarshaller = self.getparser()
680 680 parser.feed(payload)
681 681 parser.close()
682 682
683 683 return unmarshaller.close()
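The cookie capture in request() above boils down to simple header string handling; the header value here is made up for illustration:

```python
# mirrors the Set-Cookie parsing in request() above: take the header
# value after "Set-Cookie: ", then keep only the name=value pair
header = "Set-Cookie: Bugzilla_login=42; path=/; HttpOnly"
val = header.split(": ", 1)[1]     # 'Bugzilla_login=42; path=/; HttpOnly'
cookie = val.split(";", 1)[0]      # 'Bugzilla_login=42'
```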
684 684
685 685 # The explicit calls to the underlying xmlrpclib __init__() methods are
686 686 # necessary. The xmlrpclib.Transport classes are old-style classes, and
687 687 # it turns out their __init__() doesn't get called when doing multiple
688 688 # inheritance with a new-style class.
689 689 class cookietransport(cookietransportrequest, xmlrpclib.Transport):
690 690 def __init__(self, use_datetime=0):
691 691 if util.safehasattr(xmlrpclib.Transport, "__init__"):
692 692 xmlrpclib.Transport.__init__(self, use_datetime)
693 693
694 694 class cookiesafetransport(cookietransportrequest, xmlrpclib.SafeTransport):
695 695 def __init__(self, use_datetime=0):
696 696 if util.safehasattr(xmlrpclib.Transport, "__init__"):
697 697 xmlrpclib.SafeTransport.__init__(self, use_datetime)
698 698
699 699 class bzxmlrpc(bzaccess):
700 700 """Support for access to Bugzilla via the Bugzilla XMLRPC API.
701 701
702 702 Requires a minimum Bugzilla version 3.4.
703 703 """
704 704
705 705 def __init__(self, ui):
706 706 bzaccess.__init__(self, ui)
707 707
708 708 bzweb = self.ui.config('bugzilla', 'bzurl')
709 709 bzweb = bzweb.rstrip("/") + "/xmlrpc.cgi"
710 710
711 711 user = self.ui.config('bugzilla', 'user')
712 712 passwd = self.ui.config('bugzilla', 'password')
713 713
714 714 self.fixstatus = self.ui.config('bugzilla', 'fixstatus')
715 715 self.fixresolution = self.ui.config('bugzilla', 'fixresolution')
716 716
717 717 self.bzproxy = xmlrpclib.ServerProxy(bzweb, self.transport(bzweb))
718 718 ver = self.bzproxy.Bugzilla.version()['version'].split('.')
719 719 self.bzvermajor = int(ver[0])
720 720 self.bzverminor = int(ver[1])
721 721 login = self.bzproxy.User.login({'login': user, 'password': passwd,
722 722 'restrict_login': True})
723 723 self.bztoken = login.get('token', '')
724 724
725 725 def transport(self, uri):
726 726 if util.urlreq.urlparse(uri, "http")[0] == "https":
727 727 return cookiesafetransport()
728 728 else:
729 729 return cookietransport()
730 730
731 731 def get_bug_comments(self, id):
732 732 """Return a string with all comment text for a bug."""
733 733 c = self.bzproxy.Bug.comments({'ids': [id],
734 734 'include_fields': ['text'],
735 735 'token': self.bztoken})
736 736 return ''.join([t['text'] for t in c['bugs'][str(id)]['comments']])
737 737
738 738 def filter_real_bug_ids(self, bugs):
739 739 probe = self.bzproxy.Bug.get({'ids': sorted(bugs.keys()),
740 740 'include_fields': [],
741 741 'permissive': True,
742 742 'token': self.bztoken,
743 743 })
744 744 for badbug in probe['faults']:
745 745 id = badbug['id']
746 746 self.ui.status(_('bug %d does not exist\n') % id)
747 747 del bugs[id]
748 748
749 749 def filter_cset_known_bug_ids(self, node, bugs):
750 750 for id in sorted(bugs.keys()):
751 751 if self.get_bug_comments(id).find(short(node)) != -1:
752 752 self.ui.status(_('bug %d already knows about changeset %s\n') %
753 753 (id, short(node)))
754 754 del bugs[id]
755 755
756 756 def updatebug(self, bugid, newstate, text, committer):
757 757 args = {}
758 758 if 'hours' in newstate:
759 759 args['work_time'] = newstate['hours']
760 760
761 761 if self.bzvermajor >= 4:
762 762 args['ids'] = [bugid]
763 763 args['comment'] = {'body' : text}
764 764 if 'fix' in newstate:
765 765 args['status'] = self.fixstatus
766 766 args['resolution'] = self.fixresolution
767 767 args['token'] = self.bztoken
768 768 self.bzproxy.Bug.update(args)
769 769 else:
770 770 if 'fix' in newstate:
771 771 self.ui.warn(_("Bugzilla/XMLRPC needs Bugzilla 4.0 or later "
772 772 "to mark bugs fixed\n"))
773 773 args['id'] = bugid
774 774 args['comment'] = text
775 775 self.bzproxy.Bug.add_comment(args)
776 776
777 777 class bzxmlrpcemail(bzxmlrpc):
778 778 """Read data from Bugzilla via XMLRPC, send updates via email.
779 779
780 780 Advantages of sending updates via email:
781 781 1. Comments can be added as any user, not just logged in user.
782 782 2. Bug statuses or other fields not accessible via XMLRPC can
783 783 potentially be updated.
784 784
785 785 There is no XMLRPC function to change bug status before Bugzilla
786 786 4.0, so bugs cannot be marked fixed via XMLRPC before Bugzilla 4.0.
787 787 But bugs can be marked fixed via email from 3.4 onwards.
788 788 """
789 789
790 790 # The email interface changes subtly between 3.4 and 3.6. In 3.4,
791 791 # in-email fields are specified as '@<fieldname> = <value>'. In
792 792 # 3.6 this becomes '@<fieldname> <value>'. And fieldname @bug_id
793 793 # in 3.4 becomes @id in 3.6. 3.6 and 4.0 both maintain backwards
794 794 # compatibility, but rather than rely on this use the new format for
795 795 # 4.0 onwards.
796 796
797 797 def __init__(self, ui):
798 798 bzxmlrpc.__init__(self, ui)
799 799
800 800 self.bzemail = self.ui.config('bugzilla', 'bzemail')
801 801 if not self.bzemail:
802 802 raise error.Abort(_("configuration 'bzemail' missing"))
803 803 mail.validateconfig(self.ui)
804 804
805 805 def makecommandline(self, fieldname, value):
806 806 if self.bzvermajor >= 4:
807 807 return "@%s %s" % (fieldname, str(value))
808 808 else:
809 809 if fieldname == "id":
810 810 fieldname = "bug_id"
811 811 return "@%s = %s" % (fieldname, str(value))
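The version switch in makecommandline() above produces, for example, the following command lines; this is a standalone restatement for illustration, not a call into the class:

```python
def commandline(fieldname, value, bzvermajor):
    # standalone sketch of makecommandline(): 4.0+ uses '@field value',
    # earlier versions use '@field = value' and spell 'id' as 'bug_id'
    if bzvermajor >= 4:
        return "@%s %s" % (fieldname, value)
    if fieldname == "id":
        fieldname = "bug_id"
    return "@%s = %s" % (fieldname, value)

commandline("id", 1234, 4)          # '@id 1234'
commandline("id", 1234, 3)          # '@bug_id = 1234'
commandline("work_time", 1.5, 4)    # '@work_time 1.5'
```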
812 812
813 813 def send_bug_modify_email(self, bugid, commands, comment, committer):
814 814 '''send modification message to Bugzilla bug via email.
815 815
816 816 The message format is documented in the Bugzilla email_in.pl
817 817 specification. commands is a list of command lines, comment is the
818 818 comment text.
819 819
820 820 To stop users from crafting commit comments with
821 821 Bugzilla commands, specify the bug ID via the message body, rather
822 822 than the subject line, and leave a blank line after it.
823 823 '''
824 824 user = self.map_committer(committer)
825 825 matches = self.bzproxy.User.get({'match': [user],
826 826 'token': self.bztoken})
827 827 if not matches['users']:
828 828 user = self.ui.config('bugzilla', 'user')
829 829 matches = self.bzproxy.User.get({'match': [user],
830 830 'token': self.bztoken})
831 831 if not matches['users']:
832 832 raise error.Abort(_("default bugzilla user %s email not found")
833 833 % user)
834 834 user = matches['users'][0]['email']
835 835 commands.append(self.makecommandline("id", bugid))
836 836
837 837 text = "\n".join(commands) + "\n\n" + comment
838 838
839 839 _charsets = mail._charsets(self.ui)
840 840 user = mail.addressencode(self.ui, user, _charsets)
841 841 bzemail = mail.addressencode(self.ui, self.bzemail, _charsets)
842 842 msg = mail.mimeencode(self.ui, text, _charsets)
843 843 msg['From'] = user
844 844 msg['To'] = bzemail
845 845 msg['Subject'] = mail.headencode(self.ui, "Bug modification", _charsets)
846 846 sendmail = mail.connect(self.ui)
847 847 sendmail(user, bzemail, msg.as_string())
848 848
849 849 def updatebug(self, bugid, newstate, text, committer):
850 850 cmds = []
851 851 if 'hours' in newstate:
852 852 cmds.append(self.makecommandline("work_time", newstate['hours']))
853 853 if 'fix' in newstate:
854 854 cmds.append(self.makecommandline("bug_status", self.fixstatus))
855 855 cmds.append(self.makecommandline("resolution", self.fixresolution))
856 856 self.send_bug_modify_email(bugid, cmds, text, committer)
857 857
858 858 class NotFound(LookupError):
859 859 pass
860 860
861 861 class bzrestapi(bzaccess):
862 862 """Read and write bugzilla data using the REST API available since
863 863 Bugzilla 5.0.
864 864 """
865 865 def __init__(self, ui):
866 866 bzaccess.__init__(self, ui)
867 867 bz = self.ui.config('bugzilla', 'bzurl')
868 868 self.bzroot = '/'.join([bz, 'rest'])
869 869 self.apikey = self.ui.config('bugzilla', 'apikey')
870 870 self.user = self.ui.config('bugzilla', 'user')
871 871 self.passwd = self.ui.config('bugzilla', 'password')
872 872 self.fixstatus = self.ui.config('bugzilla', 'fixstatus')
873 873 self.fixresolution = self.ui.config('bugzilla', 'fixresolution')
874 874
875 875 def apiurl(self, targets, include_fields=None):
876 876 url = '/'.join([self.bzroot] + [str(t) for t in targets])
877 877 qv = {}
878 878 if self.apikey:
879 879 qv['api_key'] = self.apikey
880 880 elif self.user and self.passwd:
881 881 qv['login'] = self.user
882 882 qv['password'] = self.passwd
883 883 if include_fields:
884 884 qv['include_fields'] = include_fields
885 885 if qv:
886 886 url = '%s?%s' % (url, util.urlreq.urlencode(qv))
887 887 return url
888 888
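apiurl above builds a REST endpoint by joining path targets under the configured root and appending credentials as query parameters. A standalone sketch of the same construction (Python 3 urllib; the host is an example, not from the source):

```python
from urllib.parse import urlencode

def apiurl(bzroot, targets, apikey=None, include_fields=None):
    # join path components under the REST root, then append auth/query params
    url = '/'.join([bzroot] + [str(t) for t in targets])
    qv = {}
    if apikey:
        qv['api_key'] = apikey
    if include_fields:
        qv['include_fields'] = include_fields
    return '%s?%s' % (url, urlencode(qv)) if qv else url

print(apiurl('https://bz.example.org/rest', ('bug', 123), apikey='KEY'))
# https://bz.example.org/rest/bug/123?api_key=KEY
```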
889 889 def _fetch(self, burl):
890 890 try:
891 891 resp = url.open(self.ui, burl)
892 892 return json.loads(resp.read())
893 893 except util.urlerr.httperror as inst:
894 894 if inst.code == 401:
895 895 raise error.Abort(_('authorization failed'))
896 896 if inst.code == 404:
897 897 raise NotFound()
898 898 else:
899 899 raise
900 900
901 901 def _submit(self, burl, data, method='POST'):
902 902 data = json.dumps(data)
903 903 if method == 'PUT':
904 904 class putrequest(util.urlreq.request):
905 905 def get_method(self):
906 906 return 'PUT'
907 907 request_type = putrequest
908 908 else:
909 909 request_type = util.urlreq.request
910 910 req = request_type(burl, data,
911 911 {'Content-Type': 'application/json'})
912 912 try:
913 913 resp = url.opener(self.ui).open(req)
914 914 return json.loads(resp.read())
915 915 except util.urlerr.httperror as inst:
916 916 if inst.code == 401:
917 917 raise error.Abort(_('authorization failed'))
918 918 if inst.code == 404:
919 919 raise NotFound()
920 920 else:
921 921 raise
922 922
923 923 def filter_real_bug_ids(self, bugs):
924 924 '''remove bug IDs that do not exist in Bugzilla from bugs.'''
925 925 badbugs = set()
926 926 for bugid in bugs:
927 927 burl = self.apiurl(('bug', bugid), include_fields='status')
928 928 try:
929 929 self._fetch(burl)
930 930 except NotFound:
931 931 badbugs.add(bugid)
932 932 for bugid in badbugs:
933 933 del bugs[bugid]
934 934
935 935 def filter_cset_known_bug_ids(self, node, bugs):
936 936 '''remove bug IDs where node occurs in comment text from bugs.'''
937 937 sn = short(node)
938 938 for bugid in bugs.keys():
939 939 burl = self.apiurl(('bug', bugid, 'comment'), include_fields='text')
940 940 result = self._fetch(burl)
941 941 comments = result['bugs'][str(bugid)]['comments']
942 942 if any(sn in c['text'] for c in comments):
943 943 self.ui.status(_('bug %d already knows about changeset %s\n') %
944 944 (bugid, sn))
945 945 del bugs[bugid]
946 946
947 947 def updatebug(self, bugid, newstate, text, committer):
948 948 '''update the specified bug. Add comment text and set new states.
949 949
950 950 If possible add the comment as being from the committer of
951 951 the changeset. Otherwise use the default Bugzilla user.
952 952 '''
953 953 bugmod = {}
954 954 if 'hours' in newstate:
955 955 bugmod['work_time'] = newstate['hours']
956 956 if 'fix' in newstate:
957 957 bugmod['status'] = self.fixstatus
958 958 bugmod['resolution'] = self.fixresolution
959 959 if bugmod:
960 960 # if we have to change the bug's state, do it here
961 961 bugmod['comment'] = {
962 962 'comment': text,
963 963 'is_private': False,
964 964 'is_markdown': False,
965 965 }
966 966 burl = self.apiurl(('bug', bugid))
967 967 self._submit(burl, bugmod, method='PUT')
968 968 self.ui.debug('updated bug %s\n' % bugid)
969 969 else:
970 970 burl = self.apiurl(('bug', bugid, 'comment'))
971 971 self._submit(burl, {
972 972 'comment': text,
973 973 'is_private': False,
974 974 'is_markdown': False,
975 975 })
976 976 self.ui.debug('added comment to bug %s\n' % bugid)
977 977
978 978 def notify(self, bugs, committer):
979 979 '''Force sending of Bugzilla notification emails.
980 980
981 981 Only required if the access method does not trigger notification
982 982 emails automatically.
983 983 '''
984 984 pass
985 985
986 986 class bugzilla(object):
987 987 # supported versions of bugzilla. different versions have
988 988 # different schemas.
989 989 _versions = {
990 990 '2.16': bzmysql,
991 991 '2.18': bzmysql_2_18,
992 992 '3.0': bzmysql_3_0,
993 993 'xmlrpc': bzxmlrpc,
994 994 'xmlrpc+email': bzxmlrpcemail,
995 995 'restapi': bzrestapi,
996 996 }
997 997
998 998 def __init__(self, ui, repo):
999 999 self.ui = ui
1000 1000 self.repo = repo
1001 1001
1002 1002 bzversion = self.ui.config('bugzilla', 'version')
1003 1003 try:
1004 1004 bzclass = bugzilla._versions[bzversion]
1005 1005 except KeyError:
1006 1006 raise error.Abort(_('bugzilla version %s not supported') %
1007 1007 bzversion)
1008 1008 self.bzdriver = bzclass(self.ui)
1009 1009
1010 1010 self.bug_re = re.compile(
1011 1011 self.ui.config('bugzilla', 'regexp'), re.IGNORECASE)
1012 1012 self.fix_re = re.compile(
1013 1013 self.ui.config('bugzilla', 'fixregexp'), re.IGNORECASE)
1014 1014 self.split_re = re.compile(r'\D+')
1015 1015
1016 1016 def find_bugs(self, ctx):
1017 1017 '''return bugs dictionary created from commit comment.
1018 1018
1019 1019 Extract bug info from changeset comments. Filter out any that are
1020 1020 not known to Bugzilla, and any that already have a reference to
1021 1021 the given changeset in their comments.
1022 1022 '''
1023 1023 start = 0
1024 1024 hours = 0.0
1025 1025 bugs = {}
1026 1026 bugmatch = self.bug_re.search(ctx.description(), start)
1027 1027 fixmatch = self.fix_re.search(ctx.description(), start)
1028 1028 while True:
1029 1029 bugattribs = {}
1030 1030 if not bugmatch and not fixmatch:
1031 1031 break
1032 1032 if not bugmatch:
1033 1033 m = fixmatch
1034 1034 elif not fixmatch:
1035 1035 m = bugmatch
1036 1036 else:
1037 1037 if bugmatch.start() < fixmatch.start():
1038 1038 m = bugmatch
1039 1039 else:
1040 1040 m = fixmatch
1041 1041 start = m.end()
1042 1042 if m is bugmatch:
1043 1043 bugmatch = self.bug_re.search(ctx.description(), start)
1044 1044 if 'fix' in bugattribs:
1045 1045 del bugattribs['fix']
1046 1046 else:
1047 1047 fixmatch = self.fix_re.search(ctx.description(), start)
1048 1048 bugattribs['fix'] = None
1049 1049
1050 1050 try:
1051 1051 ids = m.group('ids')
1052 1052 except IndexError:
1053 1053 ids = m.group(1)
1054 1054 try:
1055 1055 hours = float(m.group('hours'))
1056 1056 bugattribs['hours'] = hours
1057 1057 except IndexError:
1058 1058 pass
1059 1059 except TypeError:
1060 1060 pass
1061 1061 except ValueError:
1062 1062 self.ui.status(_("%s: invalid hours\n") % m.group('hours'))
1063 1063
1064 1064 for id in self.split_re.split(ids):
1065 1065 if not id:
1066 1066 continue
1067 1067 bugs[int(id)] = bugattribs
1068 1068 if bugs:
1069 1069 self.bzdriver.filter_real_bug_ids(bugs)
1070 1070 if bugs:
1071 1071 self.bzdriver.filter_cset_known_bug_ids(ctx.node(), bugs)
1072 1072 return bugs
1073 1073
1074 1074 def update(self, bugid, newstate, ctx):
1075 1075 '''update bugzilla bug with reference to changeset.'''
1076 1076
1077 1077 def webroot(root):
1078 1078 '''strip leading prefix of repo root and turn into
1079 1079 url-safe path.'''
1080 1080 count = int(self.ui.config('bugzilla', 'strip'))
1081 1081 root = util.pconvert(root)
1082 1082 while count > 0:
1083 1083 c = root.find('/')
1084 1084 if c == -1:
1085 1085 break
1086 1086 root = root[c + 1:]
1087 1087 count -= 1
1088 1088 return root
1089 1089
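The webroot helper above drops the number of leading path components given by the bugzilla.strip setting. A standalone sketch of the same loop (name and sample path are illustrative only):

```python
def webroot(root, count):
    # drop `count` leading components, e.g. ('/var/hg/repos/myrepo', 2)
    # yields 'hg/repos/myrepo'; stop early if there are too few components
    root = root.replace('\\', '/')
    while count > 0:
        c = root.find('/')
        if c == -1:
            break
        root = root[c + 1:]
        count -= 1
    return root

print(webroot('/var/hg/repos/myrepo', 2))  # hg/repos/myrepo
```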
1090 1090 mapfile = None
1091 1091 tmpl = self.ui.config('bugzilla', 'template')
1092 1092 if not tmpl:
1093 1093 mapfile = self.ui.config('bugzilla', 'style')
1094 1094 if not mapfile and not tmpl:
1095 1095 tmpl = _('changeset {node|short} in repo {root} refers '
1096 1096 'to bug {bug}.\ndetails:\n\t{desc|tabindent}')
1097 1097 spec = logcmdutil.templatespec(tmpl, mapfile)
1098 1098 t = logcmdutil.changesettemplater(self.ui, self.repo, spec)
1099 1099 self.ui.pushbuffer()
1100 1100 t.show(ctx, changes=ctx.changeset(),
1101 1101 bug=str(bugid),
1102 1102 hgweb=self.ui.config('web', 'baseurl'),
1103 1103 root=self.repo.root,
1104 1104 webroot=webroot(self.repo.root))
1105 1105 data = self.ui.popbuffer()
1106 1106 self.bzdriver.updatebug(bugid, newstate, data,
1107 1107 stringutil.email(ctx.user()))
1108 1108
1109 1109 def notify(self, bugs, committer):
1110 1110 '''ensure Bugzilla users are notified of bug change.'''
1111 1111 self.bzdriver.notify(bugs, committer)
1112 1112
1113 1113 def hook(ui, repo, hooktype, node=None, **kwargs):
1114 1114 '''add comment to bugzilla for each changeset that refers to a
1115 1115 bugzilla bug id. only add a comment once per bug, so the same change
1116 1116 seen multiple times does not fill the bug with duplicate data.'''
1117 1117 if node is None:
1118 1118 raise error.Abort(_('hook type %s does not pass a changeset id') %
1119 1119 hooktype)
1120 1120 try:
1121 1121 bz = bugzilla(ui, repo)
1122 1122 ctx = repo[node]
1123 1123 bugs = bz.find_bugs(ctx)
1124 1124 if bugs:
1125 1125 for bug in bugs:
1126 1126 bz.update(bug, bugs[bug], ctx)
1127 1127 bz.notify(bugs, stringutil.email(ctx.user()))
1128 1128 except Exception as e:
1129 1129 raise error.Abort(_('Bugzilla error: %s') % e)
@@ -1,958 +1,958
1 1 # Mercurial built-in replacement for cvsps.
2 2 #
3 3 # Copyright 2008, Frank Kingswood <frank@kingswood-consulting.co.uk>
4 4 #
5 5 # This software may be used and distributed according to the terms of the
6 6 # GNU General Public License version 2 or any later version.
7 7 from __future__ import absolute_import
8 8
9 9 import os
10 10 import re
11 11
12 12 from mercurial.i18n import _
13 13 from mercurial import (
14 14 encoding,
15 15 error,
16 16 hook,
17 17 pycompat,
18 18 util,
19 19 )
20 20 from mercurial.utils import (
21 21 dateutil,
22 22 procutil,
23 23 stringutil,
24 24 )
25 25
26 26 pickle = util.pickle
27 27
28 28 class logentry(object):
29 29 '''Class logentry has the following attributes:
30 30 .author - author name as CVS knows it
31 31 .branch - name of branch this revision is on
32 32 .branches - revision tuple of branches starting at this revision
33 33 .comment - commit message
34 34 .commitid - CVS commitid or None
35 35 .date - the commit date as a (time, tz) tuple
36 36 .dead - true if file revision is dead
37 37 .file - Name of file
38 38 .lines - a tuple (+lines, -lines) or None
39 39 .parent - Previous revision of this entry
40 40 .rcs - name of file as returned from CVS
41 41 .revision - revision number as tuple
42 42 .tags - list of tags on the file
43 43 .synthetic - is this a synthetic "file ... added on ..." revision?
44 44 .mergepoint - the branch that has been merged from (if present in
45 45 rlog output) or None
46 46 .branchpoints - the branches that start at the current entry or empty
47 47 '''
48 48 def __init__(self, **entries):
49 49 self.synthetic = False
50 50 self.__dict__.update(entries)
51 51
52 52 def __repr__(self):
53 53 items = ("%s=%r"%(k, self.__dict__[k]) for k in sorted(self.__dict__))
54 54 return "%s(%s)"%(type(self).__name__, ", ".join(items))
55 55
56 56 class logerror(Exception):
57 57 pass
58 58
59 59 def getrepopath(cvspath):
60 60 """Return the repository path from a CVS path.
61 61
62 62 >>> getrepopath(b'/foo/bar')
63 63 '/foo/bar'
64 64 >>> getrepopath(b'c:/foo/bar')
65 65 '/foo/bar'
66 66 >>> getrepopath(b':pserver:10/foo/bar')
67 67 '/foo/bar'
68 68 >>> getrepopath(b':pserver:10c:/foo/bar')
69 69 '/foo/bar'
70 70 >>> getrepopath(b':pserver:/foo/bar')
71 71 '/foo/bar'
72 72 >>> getrepopath(b':pserver:c:/foo/bar')
73 73 '/foo/bar'
74 74 >>> getrepopath(b':pserver:truc@foo.bar:/foo/bar')
75 75 '/foo/bar'
76 76 >>> getrepopath(b':pserver:truc@foo.bar:c:/foo/bar')
77 77 '/foo/bar'
78 78 >>> getrepopath(b'user@server/path/to/repository')
79 79 '/path/to/repository'
80 80 """
81 81 # According to CVS manual, CVS paths are expressed like:
82 82 # [:method:][[user][:password]@]hostname[:[port]]/path/to/repository
83 83 #
84 84 # CVSpath is splitted into parts and then position of the first occurrence
85 85 # of the '/' char after the '@' is located. The solution is the rest of the
86 86 # string after that '/' sign including it
87 87
88 88 parts = cvspath.split(':')
89 89 atposition = parts[-1].find('@')
90 90 start = 0
91 91
92 92 if atposition != -1:
93 93 start = atposition
94 94
95 95 repopath = parts[-1][parts[-1].find('/', start):]
96 96 return repopath
97 97
98 98 def createlog(ui, directory=None, root="", rlog=True, cache=None):
99 99 '''Collect the CVS rlog'''
100 100
101 101 # Because we store many duplicate commit log messages, reusing strings
102 102 # saves a lot of memory and pickle storage space.
103 103 _scache = {}
104 104 def scache(s):
105 105 "return a shared version of a string"
106 106 return _scache.setdefault(s, s)
107 107
108 108 ui.status(_('collecting CVS rlog\n'))
109 109
110 110 log = [] # list of logentry objects containing the CVS state
111 111
112 112 # patterns to match in CVS (r)log output, by state of use
113 113 re_00 = re.compile('RCS file: (.+)$')
114 114 re_01 = re.compile('cvs \\[r?log aborted\\]: (.+)$')
115 115 re_02 = re.compile('cvs (r?log|server): (.+)\n$')
116 116 re_03 = re.compile("(Cannot access.+CVSROOT)|"
117 117 "(can't create temporary directory.+)$")
118 118 re_10 = re.compile('Working file: (.+)$')
119 119 re_20 = re.compile('symbolic names:')
120 120 re_30 = re.compile('\t(.+): ([\\d.]+)$')
121 121 re_31 = re.compile('----------------------------$')
122 122 re_32 = re.compile('======================================='
123 123 '======================================$')
124 124 re_50 = re.compile('revision ([\\d.]+)(\s+locked by:\s+.+;)?$')
125 125 re_60 = re.compile(r'date:\s+(.+);\s+author:\s+(.+);\s+state:\s+(.+?);'
126 126 r'(\s+lines:\s+(\+\d+)?\s+(-\d+)?;)?'
127 127 r'(\s+commitid:\s+([^;]+);)?'
128 128 r'(.*mergepoint:\s+([^;]+);)?')
129 129 re_70 = re.compile('branches: (.+);$')
130 130
131 131 file_added_re = re.compile(r'file [^/]+ was (initially )?added on branch')
132 132
133 133 prefix = '' # leading path to strip off what we get from CVS
134 134
135 135 if directory is None:
136 136 # Current working directory
137 137
138 138 # Get the real directory in the repository
139 139 try:
140 140 prefix = open(os.path.join('CVS','Repository'), 'rb').read().strip()
141 141 directory = prefix
142 142 if prefix == ".":
143 143 prefix = ""
144 144 except IOError:
145 145 raise logerror(_('not a CVS sandbox'))
146 146
147 147 if prefix and not prefix.endswith(pycompat.ossep):
148 148 prefix += pycompat.ossep
149 149
150 150 # Use the Root file in the sandbox, if it exists
151 151 try:
152 152 root = open(os.path.join('CVS','Root'), 'rb').read().strip()
153 153 except IOError:
154 154 pass
155 155
156 156 if not root:
157 157 root = encoding.environ.get('CVSROOT', '')
158 158
159 159 # read log cache if one exists
160 160 oldlog = []
161 161 date = None
162 162
163 163 if cache:
164 164 cachedir = os.path.expanduser('~/.hg.cvsps')
165 165 if not os.path.exists(cachedir):
166 166 os.mkdir(cachedir)
167 167
168 168 # The cvsps cache pickle needs a uniquified name, based on the
169 169 # repository location. The address may have all sort of nasties
170 170 # in it, slashes, colons and such. So here we take just the
171 171 # alphanumeric characters, concatenated in a way that does not
172 172 # mix up the various components, so that
173 173 # :pserver:user@server:/path
174 174 # and
175 175 # /pserver/user/server/path
176 176 # are mapped to different cache file names.
177 177 cachefile = root.split(":") + [directory, "cache"]
178 178 cachefile = ['-'.join(re.findall(br'\w+', s)) for s in cachefile if s]
179 179 cachefile = os.path.join(cachedir,
180 180 '.'.join([s for s in cachefile if s]))
181 181
182 182 if cache == 'update':
183 183 try:
184 184 ui.note(_('reading cvs log cache %s\n') % cachefile)
185 185 oldlog = pickle.load(open(cachefile, 'rb'))
186 186 for e in oldlog:
187 187 if not (util.safehasattr(e, 'branchpoints') and
188 188 util.safehasattr(e, 'commitid') and
189 189 util.safehasattr(e, 'mergepoint')):
190 190 ui.status(_('ignoring old cache\n'))
191 191 oldlog = []
192 192 break
193 193
194 194 ui.note(_('cache has %d log entries\n') % len(oldlog))
195 195 except Exception as e:
196 196 ui.note(_('error reading cache: %r\n') % e)
197 197
198 198 if oldlog:
199 199 date = oldlog[-1].date # last commit date as a (time,tz) tuple
200 200 date = dateutil.datestr(date, '%Y/%m/%d %H:%M:%S %1%2')
201 201
202 202 # build the CVS commandline
203 203 cmd = ['cvs', '-q']
204 204 if root:
205 205 cmd.append('-d%s' % root)
206 206 p = util.normpath(getrepopath(root))
207 207 if not p.endswith('/'):
208 208 p += '/'
209 209 if prefix:
210 210 # looks like normpath replaces "" by "."
211 211 prefix = p + util.normpath(prefix)
212 212 else:
213 213 prefix = p
214 214 cmd.append(['log', 'rlog'][rlog])
215 215 if date:
216 216 # no space between option and date string
217 217 cmd.append('-d>%s' % date)
218 218 cmd.append(directory)
219 219
220 220 # state machine begins here
221 221 tags = {} # dictionary of revisions on current file with their tags
222 222 branchmap = {} # mapping between branch names and revision numbers
223 223 rcsmap = {}
224 224 state = 0
225 225 store = False # set when a new record can be appended
226 226
227 227 cmd = [procutil.shellquote(arg) for arg in cmd]
228 228 ui.note(_("running %s\n") % (' '.join(cmd)))
229 229 ui.debug("prefix=%r directory=%r root=%r\n" % (prefix, directory, root))
230 230
231 231 pfp = procutil.popen(' '.join(cmd), 'rb')
232 232 peek = util.fromnativeeol(pfp.readline())
233 233 while True:
234 234 line = peek
235 235 if line == '':
236 236 break
237 237 peek = util.fromnativeeol(pfp.readline())
238 238 if line.endswith('\n'):
239 239 line = line[:-1]
240 240 #ui.debug('state=%d line=%r\n' % (state, line))
241 241
242 242 if state == 0:
243 243 # initial state, consume input until we see 'RCS file'
244 244 match = re_00.match(line)
245 245 if match:
246 246 rcs = match.group(1)
247 247 tags = {}
248 248 if rlog:
249 249 filename = util.normpath(rcs[:-2])
250 250 if filename.startswith(prefix):
251 251 filename = filename[len(prefix):]
252 252 if filename.startswith('/'):
253 253 filename = filename[1:]
254 254 if filename.startswith('Attic/'):
255 255 filename = filename[6:]
256 256 else:
257 257 filename = filename.replace('/Attic/', '/')
258 258 state = 2
259 259 continue
260 260 state = 1
261 261 continue
262 262 match = re_01.match(line)
263 263 if match:
264 264 raise logerror(match.group(1))
265 265 match = re_02.match(line)
266 266 if match:
267 267 raise logerror(match.group(2))
268 268 if re_03.match(line):
269 269 raise logerror(line)
270 270
271 271 elif state == 1:
272 272 # expect 'Working file' (only when using log instead of rlog)
273 273 match = re_10.match(line)
274 274 assert match, _('RCS file must be followed by working file')
275 275 filename = util.normpath(match.group(1))
276 276 state = 2
277 277
278 278 elif state == 2:
279 279 # expect 'symbolic names'
280 280 if re_20.match(line):
281 281 branchmap = {}
282 282 state = 3
283 283
284 284 elif state == 3:
285 285 # read the symbolic names and store as tags
286 286 match = re_30.match(line)
287 287 if match:
288 288 rev = [int(x) for x in match.group(2).split('.')]
289 289
290 290 # Convert magic branch number to an odd-numbered one
291 291 revn = len(rev)
292 292 if revn > 3 and (revn % 2) == 0 and rev[-2] == 0:
293 293 rev = rev[:-2] + rev[-1:]
294 294 rev = tuple(rev)
295 295
296 296 if rev not in tags:
297 297 tags[rev] = []
298 298 tags[rev].append(match.group(1))
299 299 branchmap[match.group(1)] = match.group(2)
300 300
301 301 elif re_31.match(line):
302 302 state = 5
303 303 elif re_32.match(line):
304 304 state = 0
305 305
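The magic-branch-number conversion in state 3 above can be sketched on its own: CVS records a branch tag such as 1.2.0.4 with a zero inserted before the last component, and the usable branch prefix is the same tuple with that zero removed (function name is hypothetical):

```python
def unmagic(revstr):
    # an even-length revision with 0 in the second-to-last slot is a
    # "magic" branch number; 1.2.0.4 denotes branch prefix 1.2.4
    rev = [int(x) for x in revstr.split('.')]
    if len(rev) > 3 and len(rev) % 2 == 0 and rev[-2] == 0:
        rev = rev[:-2] + rev[-1:]
    return tuple(rev)

print(unmagic('1.2.0.4'))  # (1, 2, 4)
print(unmagic('1.2'))      # (1, 2)
```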
306 306 elif state == 4:
307 307 # expecting '------' separator before first revision
308 308 if re_31.match(line):
309 309 state = 5
310 310 else:
311 311 assert not re_32.match(line), _('must have at least '
312 312 'some revisions')
313 313
314 314 elif state == 5:
315 315 # expecting revision number and possibly (ignored) lock indication
316 316 # we create the logentry here from values stored in states 0 to 4,
317 317 # as this state is re-entered for subsequent revisions of a file.
318 318 match = re_50.match(line)
319 319 assert match, _('expected revision number')
320 320 e = logentry(rcs=scache(rcs),
321 321 file=scache(filename),
322 322 revision=tuple([int(x) for x in
323 323 match.group(1).split('.')]),
324 324 branches=[],
325 325 parent=None,
326 326 commitid=None,
327 327 mergepoint=None,
328 328 branchpoints=set())
329 329
330 330 state = 6
331 331
332 332 elif state == 6:
333 333 # expecting date, author, state, lines changed
334 334 match = re_60.match(line)
335 335 assert match, _('revision must be followed by date line')
336 336 d = match.group(1)
337 337 if d[2] == '/':
338 338 # Y2K
339 339 d = '19' + d
340 340
341 341 if len(d.split()) != 3:
342 342 # cvs log dates always in GMT
343 343 d = d + ' UTC'
344 344 e.date = dateutil.parsedate(d, ['%y/%m/%d %H:%M:%S',
345 345 '%Y/%m/%d %H:%M:%S',
346 346 '%Y-%m-%d %H:%M:%S'])
347 347 e.author = scache(match.group(2))
348 348 e.dead = match.group(3).lower() == 'dead'
349 349
350 350 if match.group(5):
351 351 if match.group(6):
352 352 e.lines = (int(match.group(5)), int(match.group(6)))
353 353 else:
354 354 e.lines = (int(match.group(5)), 0)
355 355 elif match.group(6):
356 356 e.lines = (0, int(match.group(6)))
357 357 else:
358 358 e.lines = None
359 359
360 360 if match.group(7): # cvs 1.12 commitid
361 361 e.commitid = match.group(8)
362 362
363 363 if match.group(9): # cvsnt mergepoint
364 364 myrev = match.group(10).split('.')
365 365 if len(myrev) == 2: # head
366 366 e.mergepoint = 'HEAD'
367 367 else:
368 368 myrev = '.'.join(myrev[:-2] + ['0', myrev[-2]])
369 369 branches = [b for b in branchmap if branchmap[b] == myrev]
370 370 assert len(branches) == 1, ('unknown branch: %s'
371 371 % myrev)
372 372 e.mergepoint = branches[0]
373 373
374 374 e.comment = []
375 375 state = 7
376 376
377 377 elif state == 7:
378 378 # read the revision numbers of branches that start at this revision
379 379 # or store the commit log message otherwise
380 380 m = re_70.match(line)
381 381 if m:
382 382 e.branches = [tuple([int(y) for y in x.strip().split('.')])
383 383 for x in m.group(1).split(';')]
384 384 state = 8
385 385 elif re_31.match(line) and re_50.match(peek):
386 386 state = 5
387 387 store = True
388 388 elif re_32.match(line):
389 389 state = 0
390 390 store = True
391 391 else:
392 392 e.comment.append(line)
393 393
394 394 elif state == 8:
395 395 # store commit log message
396 396 if re_31.match(line):
397 397 cpeek = peek
398 398 if cpeek.endswith('\n'):
399 399 cpeek = cpeek[:-1]
400 400 if re_50.match(cpeek):
401 401 state = 5
402 402 store = True
403 403 else:
404 404 e.comment.append(line)
405 405 elif re_32.match(line):
406 406 state = 0
407 407 store = True
408 408 else:
409 409 e.comment.append(line)
410 410
411 411 # When a file is added on a branch B1, CVS creates a synthetic
412 412 # dead trunk revision 1.1 so that the branch has a root.
413 413 # Likewise, if you merge such a file to a later branch B2 (one
414 414 # that already existed when the file was added on B1), CVS
415 415 # creates a synthetic dead revision 1.1.x.1 on B2. Don't drop
416 416 # these revisions now, but mark them synthetic so
417 417 # createchangeset() can take care of them.
418 418 if (store and
419 419 e.dead and
420 420 e.revision[-1] == 1 and # 1.1 or 1.1.x.1
421 421 len(e.comment) == 1 and
422 422 file_added_re.match(e.comment[0])):
423 423 ui.debug('found synthetic revision in %s: %r\n'
424 424 % (e.rcs, e.comment[0]))
425 425 e.synthetic = True
426 426
427 427 if store:
428 428 # clean up the results and save in the log.
429 429 store = False
430 430 e.tags = sorted([scache(x) for x in tags.get(e.revision, [])])
431 431 e.comment = scache('\n'.join(e.comment))
432 432
433 433 revn = len(e.revision)
434 434 if revn > 3 and (revn % 2) == 0:
435 435 e.branch = tags.get(e.revision[:-1], [None])[0]
436 436 else:
437 437 e.branch = None
438 438
439 439 # find the branches starting from this revision
440 440 branchpoints = set()
441 441 for branch, revision in branchmap.iteritems():
442 442 revparts = tuple([int(i) for i in revision.split('.')])
443 443 if len(revparts) < 2: # bad tags
444 444 continue
445 445 if revparts[-2] == 0 and revparts[-1] % 2 == 0:
446 446 # normal branch
447 447 if revparts[:-2] == e.revision:
448 448 branchpoints.add(branch)
449 449 elif revparts == (1, 1, 1): # vendor branch
450 450 if revparts in e.branches:
451 451 branchpoints.add(branch)
452 452 e.branchpoints = branchpoints
453 453
454 454 log.append(e)
455 455
456 456 rcsmap[e.rcs.replace('/Attic/', '/')] = e.rcs
457 457
458 458 if len(log) % 100 == 0:
459 459 ui.status(stringutil.ellipsis('%d %s' % (len(log), e.file), 80)
460 460 + '\n')
461 461
462 462 log.sort(key=lambda x: (x.rcs, x.revision))
463 463
464 464 # find parent revisions of individual files
465 465 versions = {}
466 466 for e in sorted(oldlog, key=lambda x: (x.rcs, x.revision)):
467 467 rcs = e.rcs.replace('/Attic/', '/')
468 468 if rcs in rcsmap:
469 469 e.rcs = rcsmap[rcs]
470 470 branch = e.revision[:-1]
471 471 versions[(e.rcs, branch)] = e.revision
472 472
473 473 for e in log:
474 474 branch = e.revision[:-1]
475 475 p = versions.get((e.rcs, branch), None)
476 476 if p is None:
477 477 p = e.revision[:-2]
478 478 e.parent = p
479 479 versions[(e.rcs, branch)] = e.revision
480 480
481 481 # update the log cache
482 482 if cache:
483 483 if log:
484 484 # join up the old and new logs
485 485 log.sort(key=lambda x: x.date)
486 486
487 487 if oldlog and oldlog[-1].date >= log[0].date:
488 488 raise logerror(_('log cache overlaps with new log entries,'
489 489 ' re-run without cache.'))
490 490
491 491 log = oldlog + log
492 492
493 493 # write the new cachefile
494 494 ui.note(_('writing cvs log cache %s\n') % cachefile)
495 495 pickle.dump(log, open(cachefile, 'wb'))
496 496 else:
497 497 log = oldlog
498 498
499 499 ui.status(_('%d log entries\n') % len(log))
500 500
501 501 encodings = ui.configlist('convert', 'cvsps.logencoding')
502 502 if encodings:
503 503 def revstr(r):
504 504 # this is needed, because logentry.revision is a tuple of "int"
505 505 # (e.g. (1, 2) for "1.2")
506 506 return '.'.join(pycompat.maplist(pycompat.bytestr, r))
507 507
508 508 for entry in log:
509 509 comment = entry.comment
510 510 for e in encodings:
511 511 try:
512 512 entry.comment = comment.decode(e).encode('utf-8')
513 513 if ui.debugflag:
514 514 ui.debug("transcoding by %s: %s of %s\n" %
515 515 (e, revstr(entry.revision), entry.file))
516 516 break
517 517 except UnicodeDecodeError:
518 518 pass # try next encoding
519 519 except LookupError as inst: # unknown encoding, maybe
520 520 raise error.Abort(inst,
521 521 hint=_('check convert.cvsps.logencoding'
522 522 ' configuration'))
523 523 else:
524 524 raise error.Abort(_("no encoding can transcode"
525 525 " CVS log message for %s of %s")
526 526 % (revstr(entry.revision), entry.file),
527 527 hint=_('check convert.cvsps.logencoding'
528 528 ' configuration'))
529 529
530 530 hook.hook(ui, None, "cvslog", True, log=log)
531 531
532 532 return log
533 533
534 534
535 535 class changeset(object):
536 536 '''Class changeset has the following attributes:
537 537 .id - integer identifying this changeset (list index)
538 538 .author - author name as CVS knows it
539 539 .branch - name of branch this changeset is on, or None
540 540 .comment - commit message
541 541 .commitid - CVS commitid or None
542 542 .date - the commit date as a (time,tz) tuple
543 543 .entries - list of logentry objects in this changeset
544 544 .parents - list of one or two parent changesets
545 545 .tags - list of tags on this changeset
546 546 .synthetic - from synthetic revision "file ... added on branch ..."
547 547 .mergepoint - the branch that has been merged from or None
548 548 .branchpoints - the branches that start at the current entry or empty
549 549 '''
550 550 def __init__(self, **entries):
551 551 self.id = None
552 552 self.synthetic = False
553 553 self.__dict__.update(entries)
554 554
555 555 def __repr__(self):
556 556 items = ("%s=%r"%(k, self.__dict__[k]) for k in sorted(self.__dict__))
557 557 return "%s(%s)"%(type(self).__name__, ", ".join(items))
558 558
559 559 def createchangeset(ui, log, fuzz=60, mergefrom=None, mergeto=None):
560 560 '''Convert log into changesets.'''
561 561
562 562 ui.status(_('creating changesets\n'))
563 563
564 564 # try to order commitids by date
565 565 mindate = {}
566 566 for e in log:
567 567 if e.commitid:
568 568 mindate[e.commitid] = min(e.date, mindate.get(e.commitid, e.date))
569 569
570 570 # Merge changesets
571 571 log.sort(key=lambda x: (mindate.get(x.commitid), x.commitid, x.comment,
572 572 x.author, x.branch, x.date, x.branchpoints))
573 573
574 574 changesets = []
575 575 files = set()
576 576 c = None
577 577 for i, e in enumerate(log):
578 578
579 579 # Check if log entry belongs to the current changeset or not.
580 580
581 581 # Since CVS is file-centric, two different file revisions with
582 582 # different branchpoints should be treated as belonging to two
583 583 # different changesets (and the ordering is important and not
584 584 # honoured by cvsps at this point).
585 585 #
586 586 # Consider the following case:
587 587 # foo 1.1 branchpoints: [MYBRANCH]
588 588 # bar 1.1 branchpoints: [MYBRANCH, MYBRANCH2]
589 589 #
590 590 # Here foo is part only of MYBRANCH, but not MYBRANCH2, e.g. a
591 591 # later version of foo may be in MYBRANCH2, so foo should be the
592 592 # first changeset and bar the next and MYBRANCH and MYBRANCH2
593 593 # should both start off of the bar changeset. No provisions are
594 594 # made to ensure that this is, in fact, what happens.
595 595 if not (c and e.branchpoints == c.branchpoints and
596 596 (# cvs commitids
597 597 (e.commitid is not None and e.commitid == c.commitid) or
598 598 (# no commitids, use fuzzy commit detection
599 599 (e.commitid is None or c.commitid is None) and
600 600 e.comment == c.comment and
601 601 e.author == c.author and
602 602 e.branch == c.branch and
603 603 ((c.date[0] + c.date[1]) <=
604 604 (e.date[0] + e.date[1]) <=
605 605 (c.date[0] + c.date[1]) + fuzz) and
606 606 e.file not in files))):
607 607 c = changeset(comment=e.comment, author=e.author,
608 608 branch=e.branch, date=e.date,
609 609 entries=[], mergepoint=e.mergepoint,
610 610 branchpoints=e.branchpoints, commitid=e.commitid)
611 611 changesets.append(c)
612 612
613 613 files = set()
614 614 if len(changesets) % 100 == 0:
615 615 t = '%d %s' % (len(changesets), repr(e.comment)[1:-1])
616 616 ui.status(stringutil.ellipsis(t, 80) + '\n')
617 617
618 618 c.entries.append(e)
619 619 files.add(e.file)
620 620 c.date = e.date # changeset date is date of latest commit in it
621 621
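The loop above folds per-file CVS log entries into changesets: an entry joins the current changeset when the commitids match, or when author, branch, and comment agree and the date falls within the fuzz window. A standalone sketch of that windowing rule (the tuple fields and function name here are illustrative, not the cvsps entry API):

```python
# Group (author, branch, comment, timestamp) tuples into changesets when
# consecutive entries share metadata and land inside the fuzz window,
# mirroring the date test above: c.date <= e.date <= c.date + fuzz.
def group_entries(entries, fuzz=60):
    changesets = []
    current = None
    for author, branch, comment, ts in sorted(entries, key=lambda e: e[3]):
        if (current and
                current['author'] == author and
                current['branch'] == branch and
                current['comment'] == comment and
                current['date'] <= ts <= current['date'] + fuzz):
            current['entries'].append(ts)
        else:
            current = {'author': author, 'branch': branch,
                       'comment': comment, 'date': ts, 'entries': [ts]}
            changesets.append(current)
        # changeset date is the date of the latest entry in it
        current['date'] = ts
    return changesets
```

As in the real code, an entry is only compared against the current changeset, so a matching entry that arrives after an unrelated one starts a fresh changeset rather than rejoining an earlier one.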
622 622 # Mark synthetic changesets
623 623
624 624 for c in changesets:
625 625 # Synthetic revisions always get their own changeset, because
626 626 # the log message includes the filename. E.g. if you add file3
627 627 # and file4 on a branch, you get four log entries and three
628 628 # changesets:
629 629 # "File file3 was added on branch ..." (synthetic, 1 entry)
630 630 # "File file4 was added on branch ..." (synthetic, 1 entry)
631 631 # "Add file3 and file4 to fix ..." (real, 2 entries)
632 632 # Hence the check for 1 entry here.
633 633 c.synthetic = len(c.entries) == 1 and c.entries[0].synthetic
634 634
635 635 # Sort files in each changeset
636 636
637 637 def entitycompare(l, r):
638 638 'Mimic cvsps sorting order'
639 639 l = l.file.split('/')
640 640 r = r.file.split('/')
641 641 nl = len(l)
642 642 nr = len(r)
643 643 n = min(nl, nr)
644 644 for i in range(n):
645 645 if i + 1 == nl and nl < nr:
646 646 return -1
647 647 elif i + 1 == nr and nl > nr:
648 648 return +1
649 649 elif l[i] < r[i]:
650 650 return -1
651 651 elif l[i] > r[i]:
652 652 return +1
653 653 return 0
654 654
655 655 for c in changesets:
656 656 c.entries.sort(entitycompare)
657 657
658 658 # Sort changesets by date
659 659
660 660 odd = set()
661 661 def cscmp(l, r):
662 662 d = sum(l.date) - sum(r.date)
663 663 if d:
664 664 return d
665 665
666 666 # detect vendor branches and initial commits on a branch
667 667 le = {}
668 668 for e in l.entries:
669 669 le[e.rcs] = e.revision
670 670 re = {}
671 671 for e in r.entries:
672 672 re[e.rcs] = e.revision
673 673
674 674 d = 0
675 675 for e in l.entries:
676 676 if re.get(e.rcs, None) == e.parent:
677 677 assert not d
678 678 d = 1
679 679 break
680 680
681 681 for e in r.entries:
682 682 if le.get(e.rcs, None) == e.parent:
683 683 if d:
684 684 odd.add((l, r))
685 685 d = -1
686 686 break
687 687 # By this point, the changesets are sufficiently compared that
688 688 # we don't really care about ordering. However, this leaves
689 689 # some race conditions in the tests, so we compare on the
690 690 # number of files modified, the files contained in each
691 691 # changeset, and the branchpoints in the change to ensure test
692 692 # output remains stable.
693 693
694 694 # recommended replacement for cmp from
695 695 # https://docs.python.org/3.0/whatsnew/3.0.html
696 696 c = lambda x, y: (x > y) - (x < y)
697 697 # Sort bigger changes first.
698 698 if not d:
699 699 d = c(len(l.entries), len(r.entries))
700 700 # Try sorting by filename in the change.
701 701 if not d:
702 702 d = c([e.file for e in l.entries], [e.file for e in r.entries])
703 703 # Try and put changes without a branch point before ones with
704 704 # a branch point.
705 705 if not d:
706 706 d = c(len(l.branchpoints), len(r.branchpoints))
707 707 return d
708 708
709 709 changesets.sort(cscmp)
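The `c = lambda x, y: (x > y) - (x < y)` helper above is the documented replacement for the removed builtin cmp(); on Python 3, a cmp-style comparator such as cscmp also has to be adapted with functools.cmp_to_key before it can drive sort(). A minimal illustration of both pieces:

```python
from functools import cmp_to_key

def cmp(x, y):
    # Python 3 replacement for the removed builtin cmp()
    return (x > y) - (x < y)

# Sort pairs by second element, then first, via a cmp-style comparator.
pairs = [(2, 'b'), (1, 'b'), (3, 'a')]
pairs.sort(key=cmp_to_key(lambda l, r: cmp(l[1], r[1]) or cmp(l[0], r[0])))
```

The `changesets.sort(cscmp)` call above relies on the Python 2 positional-cmp form of list.sort(); the cmp_to_key adapter is the portable spelling.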
710 710
711 711 # Collect tags
712 712
713 713 globaltags = {}
714 714 for c in changesets:
715 715 for e in c.entries:
716 716 for tag in e.tags:
717 717 # remember which is the latest changeset to have this tag
718 718 globaltags[tag] = c
719 719
720 720 for c in changesets:
721 721 tags = set()
722 722 for e in c.entries:
723 723 tags.update(e.tags)
724 724 # remember tags only if this is the latest changeset to have it
725 725 c.tags = sorted(tag for tag in tags if globaltags[tag] is c)
726 726
727 727 # Find parent changesets, handle {{mergetobranch BRANCHNAME}}
728 728 # by inserting dummy changesets with two parents, and handle
729 729 # {{mergefrombranch BRANCHNAME}} by setting two parents.
730 730
731 731 if mergeto is None:
732 732 mergeto = r'{{mergetobranch ([-\w]+)}}'
733 733 if mergeto:
734 734 mergeto = re.compile(mergeto)
735 735
736 736 if mergefrom is None:
737 737 mergefrom = r'{{mergefrombranch ([-\w]+)}}'
738 738 if mergefrom:
739 739 mergefrom = re.compile(mergefrom)
740 740
741 741 versions = {} # changeset index where we saw any particular file version
742 742 branches = {} # changeset index where we saw a branch
743 743 n = len(changesets)
744 744 i = 0
745 745 while i < n:
746 746 c = changesets[i]
747 747
748 748 for f in c.entries:
749 749 versions[(f.rcs, f.revision)] = i
750 750
751 751 p = None
752 752 if c.branch in branches:
753 753 p = branches[c.branch]
754 754 else:
755 755 # first changeset on a new branch
756 756 # the parent is a changeset with the branch in its
757 757 # branchpoints such that it is the latest possible
758 758 # commit without any intervening, unrelated commits.
759 759
760 760 for candidate in xrange(i):
761 761 if c.branch not in changesets[candidate].branchpoints:
762 762 if p is not None:
763 763 break
764 764 continue
765 765 p = candidate
766 766
767 767 c.parents = []
768 768 if p is not None:
769 769 p = changesets[p]
770 770
771 771 # Ensure no changeset has a synthetic changeset as a parent.
772 772 while p.synthetic:
773 773 assert len(p.parents) <= 1, \
774 774 _('synthetic changeset cannot have multiple parents')
775 775 if p.parents:
776 776 p = p.parents[0]
777 777 else:
778 778 p = None
779 779 break
780 780
781 781 if p is not None:
782 782 c.parents.append(p)
783 783
784 784 if c.mergepoint:
785 785 if c.mergepoint == 'HEAD':
786 786 c.mergepoint = None
787 787 c.parents.append(changesets[branches[c.mergepoint]])
788 788
789 789 if mergefrom:
790 790 m = mergefrom.search(c.comment)
791 791 if m:
792 792 m = m.group(1)
793 793 if m == 'HEAD':
794 794 m = None
795 795 try:
796 796 candidate = changesets[branches[m]]
797 797 except KeyError:
798 798 ui.warn(_("warning: CVS commit message references "
799 799 "non-existent branch %r:\n%s\n")
800 800 % (m, c.comment))
801 801 if m in branches and c.branch != m and not candidate.synthetic:
802 802 c.parents.append(candidate)
803 803
804 804 if mergeto:
805 805 m = mergeto.search(c.comment)
806 806 if m:
807 807 if m.groups():
808 808 m = m.group(1)
809 809 if m == 'HEAD':
810 810 m = None
811 811 else:
812 812 m = None # if no group found then merge to HEAD
813 813 if m in branches and c.branch != m:
814 814 # insert empty changeset for merge
815 815 cc = changeset(
816 816 author=c.author, branch=m, date=c.date,
817 817 comment='convert-repo: CVS merge from branch %s'
818 818 % c.branch,
819 819 entries=[], tags=[],
820 820 parents=[changesets[branches[m]], c])
821 821 changesets.insert(i + 1, cc)
822 822 branches[m] = i + 1
823 823
824 824 # adjust our loop counters now we have inserted a new entry
825 825 n += 1
826 826 i += 2
827 827 continue
828 828
829 829 branches[c.branch] = i
830 830 i += 1
831 831
832 832 # Drop synthetic changesets (safe now that we have ensured no other
833 833 # changesets can have them as parents).
834 834 i = 0
835 835 while i < len(changesets):
836 836 if changesets[i].synthetic:
837 837 del changesets[i]
838 838 else:
839 839 i += 1
840 840
841 841 # Number changesets
842 842
843 843 for i, c in enumerate(changesets):
844 844 c.id = i + 1
845 845
846 846 if odd:
847 847 for l, r in odd:
848 848 if l.id is not None and r.id is not None:
849 849 ui.warn(_('changeset %d is both before and after %d\n')
850 850 % (l.id, r.id))
851 851
852 852 ui.status(_('%d changeset entries\n') % len(changesets))
853 853
854 854 hook.hook(ui, None, "cvschangesets", True, changesets=changesets)
855 855
856 856 return changesets
857 857
858 858
859 859 def debugcvsps(ui, *args, **opts):
860 860 '''Read CVS rlog for current directory or named path in
861 861 repository, and convert the log to changesets based on matching
862 862 commit log entries and dates.
863 863 '''
864 864 opts = pycompat.byteskwargs(opts)
865 865 if opts["new_cache"]:
866 866 cache = "write"
867 867 elif opts["update_cache"]:
868 868 cache = "update"
869 869 else:
870 870 cache = None
871 871
872 872 revisions = opts["revisions"]
873 873
874 874 try:
875 875 if args:
876 876 log = []
877 877 for d in args:
878 878 log += createlog(ui, d, root=opts["root"], cache=cache)
879 879 else:
880 880 log = createlog(ui, root=opts["root"], cache=cache)
881 881 except logerror as e:
882 882 ui.write("%r\n"%e)
883 883 return
884 884
885 885 changesets = createchangeset(ui, log, opts["fuzz"])
886 886 del log
887 887
888 888 # Print changesets (optionally filtered)
889 889
890 890 off = len(revisions)
891 891 branches = {} # latest version number in each branch
892 892 ancestors = {} # parent branch
893 893 for cs in changesets:
894 894
895 895 if opts["ancestors"]:
896 896 if cs.branch not in branches and cs.parents and cs.parents[0].id:
897 897 ancestors[cs.branch] = (changesets[cs.parents[0].id - 1].branch,
898 898 cs.parents[0].id)
899 899 branches[cs.branch] = cs.id
900 900
901 901 # limit by branches
902 902 if opts["branches"] and (cs.branch or 'HEAD') not in opts["branches"]:
903 903 continue
904 904
905 905 if not off:
906 906 # Note: trailing spaces on several lines here are needed to have
907 907 # bug-for-bug compatibility with cvsps.
908 908 ui.write('---------------------\n')
909 909 ui.write(('PatchSet %d \n' % cs.id))
910 910 ui.write(('Date: %s\n' % dateutil.datestr(cs.date,
911 911 '%Y/%m/%d %H:%M:%S %1%2')))
912 912 ui.write(('Author: %s\n' % cs.author))
913 913 ui.write(('Branch: %s\n' % (cs.branch or 'HEAD')))
914 914 ui.write(('Tag%s: %s \n' % (['', 's'][len(cs.tags) > 1],
915 915 ','.join(cs.tags) or '(none)')))
916 916 if cs.branchpoints:
917 917 ui.write(('Branchpoints: %s \n') %
918 918 ', '.join(sorted(cs.branchpoints)))
919 919 if opts["parents"] and cs.parents:
920 920 if len(cs.parents) > 1:
921 921 ui.write(('Parents: %s\n' %
922 922 (','.join([str(p.id) for p in cs.parents]))))
923 923 else:
924 924 ui.write(('Parent: %d\n' % cs.parents[0].id))
925 925
926 926 if opts["ancestors"]:
927 927 b = cs.branch
928 928 r = []
929 929 while b:
930 930 b, c = ancestors[b]
931 931 r.append('%s:%d:%d' % (b or "HEAD", c, branches[b]))
932 932 if r:
933 933 ui.write(('Ancestors: %s\n' % (','.join(r))))
934 934
935 935 ui.write(('Log:\n'))
936 936 ui.write('%s\n\n' % cs.comment)
937 937 ui.write(('Members: \n'))
938 938 for f in cs.entries:
939 939 fn = f.file
940 940 if fn.startswith(opts["prefix"]):
941 941 fn = fn[len(opts["prefix"]):]
942 942 ui.write('\t%s:%s->%s%s \n' % (
943 943 fn, '.'.join([str(x) for x in f.parent]) or 'INITIAL',
944 944 '.'.join([str(x) for x in f.revision]),
945 945 ['', '(DEAD)'][f.dead]))
946 946 ui.write('\n')
947 947
948 948 # have we seen the start tag?
949 949 if revisions and off:
950 950 if revisions[0] == str(cs.id) or \
951 951 revisions[0] in cs.tags:
952 952 off = False
953 953
954 954 # see if we reached the end tag
955 955 if len(revisions) > 1 and not off:
956 956 if revisions[1] == str(cs.id) or \
957 957 revisions[1] in cs.tags:
958 958 break
@@ -1,341 +1,341
1 1 # mail.py - mail sending bits for mercurial
2 2 #
3 3 # Copyright 2006 Matt Mackall <mpm@selenic.com>
4 4 #
5 5 # This software may be used and distributed according to the terms of the
6 6 # GNU General Public License version 2 or any later version.
7 7
8 8 from __future__ import absolute_import
9 9
10 10 import email
11 11 import email.charset
12 12 import email.header
13 13 import email.message
14 14 import os
15 15 import smtplib
16 16 import socket
17 17 import time
18 18
19 19 from .i18n import _
20 20 from . import (
21 21 encoding,
22 22 error,
23 23 pycompat,
24 24 sslutil,
25 25 util,
26 26 )
27 27 from .utils import (
28 28 procutil,
29 29 stringutil,
30 30 )
31 31
32 32 class STARTTLS(smtplib.SMTP):
33 33 '''Derived class to verify the peer certificate for STARTTLS.
34 34
35 35 This class allows passing any keyword arguments to SSL socket creation.
36 36 '''
37 37 def __init__(self, ui, host=None, **kwargs):
38 38 smtplib.SMTP.__init__(self, **kwargs)
39 39 self._ui = ui
40 40 self._host = host
41 41
42 42 def starttls(self, keyfile=None, certfile=None):
43 43 if not self.has_extn("starttls"):
44 44 msg = "STARTTLS extension not supported by server"
45 45 raise smtplib.SMTPException(msg)
46 46 (resp, reply) = self.docmd("STARTTLS")
47 47 if resp == 220:
48 48 self.sock = sslutil.wrapsocket(self.sock, keyfile, certfile,
49 49 ui=self._ui,
50 50 serverhostname=self._host)
51 51 self.file = smtplib.SSLFakeFile(self.sock)
52 52 self.helo_resp = None
53 53 self.ehlo_resp = None
54 54 self.esmtp_features = {}
55 55 self.does_esmtp = 0
56 56 return (resp, reply)
57 57
58 58 class SMTPS(smtplib.SMTP):
59 59 '''Derived class to verify the peer certificate for SMTPS.
60 60
61 61 This class allows passing any keyword arguments to SSL socket creation.
62 62 '''
63 63 def __init__(self, ui, keyfile=None, certfile=None, host=None,
64 64 **kwargs):
65 65 self.keyfile = keyfile
66 66 self.certfile = certfile
67 67 smtplib.SMTP.__init__(self, **kwargs)
68 68 self._host = host
69 69 self.default_port = smtplib.SMTP_SSL_PORT
70 70 self._ui = ui
71 71
72 72 def _get_socket(self, host, port, timeout):
73 73 if self.debuglevel > 0:
74 74 self._ui.debug('connect: %r\n' % ((host, port),))
75 75 new_socket = socket.create_connection((host, port), timeout)
76 76 new_socket = sslutil.wrapsocket(new_socket,
77 77 self.keyfile, self.certfile,
78 78 ui=self._ui,
79 79 serverhostname=self._host)
80 80 self.file = smtplib.SSLFakeFile(new_socket)
81 81 return new_socket
82 82
83 83 def _smtp(ui):
84 84 '''build an smtp connection and return a function to send mail'''
85 85 local_hostname = ui.config('smtp', 'local_hostname')
86 86 tls = ui.config('smtp', 'tls')
87 87 # backward compatible: when tls = true, we use starttls.
88 88 starttls = tls == 'starttls' or stringutil.parsebool(tls)
89 89 smtps = tls == 'smtps'
90 90 if (starttls or smtps) and not util.safehasattr(socket, 'ssl'):
91 91 raise error.Abort(_("can't use TLS: Python SSL support not installed"))
92 92 mailhost = ui.config('smtp', 'host')
93 93 if not mailhost:
94 94 raise error.Abort(_('smtp.host not configured - cannot send mail'))
95 95 if smtps:
96 96 ui.note(_('(using smtps)\n'))
97 97 s = SMTPS(ui, local_hostname=local_hostname, host=mailhost)
98 98 elif starttls:
99 99 s = STARTTLS(ui, local_hostname=local_hostname, host=mailhost)
100 100 else:
101 101 s = smtplib.SMTP(local_hostname=local_hostname)
102 102 if smtps:
103 103 defaultport = 465
104 104 else:
105 105 defaultport = 25
106 106 mailport = util.getport(ui.config('smtp', 'port', defaultport))
107 107 ui.note(_('sending mail: smtp host %s, port %d\n') %
108 108 (mailhost, mailport))
109 109 s.connect(host=mailhost, port=mailport)
110 110 if starttls:
111 111 ui.note(_('(using starttls)\n'))
112 112 s.ehlo()
113 113 s.starttls()
114 114 s.ehlo()
115 115 if starttls or smtps:
116 116 ui.note(_('(verifying remote certificate)\n'))
117 117 sslutil.validatesocket(s.sock)
118 118 username = ui.config('smtp', 'username')
119 119 password = ui.config('smtp', 'password')
120 120 if username and not password:
121 121 password = ui.getpass()
122 122 if username and password:
123 123 ui.note(_('(authenticating to mail server as %s)\n') %
124 124 (username))
125 125 try:
126 126 s.login(username, password)
127 127 except smtplib.SMTPException as inst:
128 128 raise error.Abort(inst)
129 129
130 130 def send(sender, recipients, msg):
131 131 try:
132 132 return s.sendmail(sender, recipients, msg)
133 133 except smtplib.SMTPRecipientsRefused as inst:
134 134 recipients = [r[1] for r in inst.recipients.values()]
135 135 raise error.Abort('\n' + '\n'.join(recipients))
136 136 except smtplib.SMTPException as inst:
137 137 raise error.Abort(inst)
138 138
139 139 return send
140 140
141 141 def _sendmail(ui, sender, recipients, msg):
142 142 '''send mail using sendmail.'''
143 143 program = ui.config('email', 'method')
144 144 cmdline = '%s -f %s %s' % (program, stringutil.email(sender),
145 145 ' '.join(map(stringutil.email, recipients)))
146 146 ui.note(_('sending mail: %s\n') % cmdline)
147 fp = procutil.popen(cmdline, 'w')
148 fp.write(msg)
147 fp = procutil.popen(cmdline, 'wb')
148 fp.write(util.tonativeeol(msg))
149 149 ret = fp.close()
150 150 if ret:
151 151 raise error.Abort('%s %s' % (
152 152 os.path.basename(program.split(None, 1)[0]),
153 153 procutil.explainexit(ret)[0]))
154 154
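_sendmail() above opens the pipe to the sendmail program in binary mode ('wb') and converts line endings explicitly with util.tonativeeol(), so EOL translation happens once and identically on every platform instead of implicitly in text mode. A rough stand-in for that conversion (illustrative helper, not the actual Mercurial implementation):

```python
import os

def tonativeeol(data):
    # Convert LF to the platform's native line ending before writing
    # to a pipe opened in binary mode. On POSIX this is a no-op.
    if os.linesep == '\r\n':
        return data.replace(b'\n', b'\r\n')
    return data
```

Doing the conversion by hand keeps the payload as bytes end to end, which is what binary-mode popen() requires.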
155 155 def _mbox(mbox, sender, recipients, msg):
156 156 '''write mails to mbox'''
157 157 fp = open(mbox, 'ab+')
158 158 # Should be time.asctime(), but Windows prints 2-characters day
159 159 # of month instead of one. Make them print the same thing.
160 160 date = time.strftime(r'%a %b %d %H:%M:%S %Y', time.localtime())
161 161 fp.write('From %s %s\n' % (sender, date))
162 162 fp.write(msg)
163 163 fp.write('\n\n')
164 164 fp.close()
165 165
166 166 def connect(ui, mbox=None):
167 167 '''make a mail connection. return a function to send mail.
168 168 call as sendmail(sender, list-of-recipients, msg).'''
169 169 if mbox:
170 170 open(mbox, 'wb').close()
171 171 return lambda s, r, m: _mbox(mbox, s, r, m)
172 172 if ui.config('email', 'method') == 'smtp':
173 173 return _smtp(ui)
174 174 return lambda s, r, m: _sendmail(ui, s, r, m)
175 175
176 176 def sendmail(ui, sender, recipients, msg, mbox=None):
177 177 send = connect(ui, mbox=mbox)
178 178 return send(sender, recipients, msg)
179 179
180 180 def validateconfig(ui):
181 181 '''determine if we have enough config data to try sending email.'''
182 182 method = ui.config('email', 'method')
183 183 if method == 'smtp':
184 184 if not ui.config('smtp', 'host'):
185 185 raise error.Abort(_('smtp specified as email transport, '
186 186 'but no smtp host configured'))
187 187 else:
188 188 if not procutil.findexe(method):
189 189 raise error.Abort(_('%r specified as email transport, '
190 190 'but not in PATH') % method)
191 191
192 192 def codec2iana(cs):
193 193 '''Map an email charset name to its IANA-preferred equivalent.'''
194 194 cs = pycompat.sysbytes(email.charset.Charset(cs).input_charset.lower())
195 195
196 196 # "latin1" normalizes to "iso8859-1", standard calls for "iso-8859-1"
197 197 if cs.startswith("iso") and not cs.startswith("iso-"):
198 198 return "iso-" + cs[3:]
199 199 return cs
200 200
201 201 def mimetextpatch(s, subtype='plain', display=False):
202 202 '''Return MIME message suitable for a patch.
203 203 Charset will be detected by first trying to decode as us-ascii, then utf-8,
204 204 and finally the global encodings. If all those fail, fall back to
205 205 ISO-8859-1, an encoding that allows all byte sequences.
206 206 Transfer encodings will be used if necessary.'''
207 207
208 208 cs = ['us-ascii', 'utf-8', encoding.encoding, encoding.fallbackencoding]
209 209 if display:
210 210 return mimetextqp(s, subtype, 'us-ascii')
211 211 for charset in cs:
212 212 try:
213 213 s.decode(pycompat.sysstr(charset))
214 214 return mimetextqp(s, subtype, codec2iana(charset))
215 215 except UnicodeDecodeError:
216 216 pass
217 217
218 218 return mimetextqp(s, subtype, "iso-8859-1")
219 219
220 220 def mimetextqp(body, subtype, charset):
221 221 '''Return MIME message.
222 222 Quoted-printable transfer encoding will be used if necessary.
223 223 '''
224 224 cs = email.charset.Charset(charset)
225 225 msg = email.message.Message()
226 226 msg.set_type(pycompat.sysstr('text/' + subtype))
227 227
228 228 for line in body.splitlines():
229 229 if len(line) > 950:
230 230 cs.body_encoding = email.charset.QP
231 231 break
232 232
233 233 msg.set_payload(body, cs)
234 234
235 235 return msg
236 236
237 237 def _charsets(ui):
238 238 '''Obtains charsets to send mail parts not containing patches.'''
239 239 charsets = [cs.lower() for cs in ui.configlist('email', 'charsets')]
240 240 fallbacks = [encoding.fallbackencoding.lower(),
241 241 encoding.encoding.lower(), 'utf-8']
242 242 for cs in fallbacks: # find unique charsets while keeping order
243 243 if cs not in charsets:
244 244 charsets.append(cs)
245 245 return [cs for cs in charsets if not cs.endswith('ascii')]
246 246
247 247 def _encode(ui, s, charsets):
248 248 '''Returns (converted) string, charset tuple.
249 249 Finds out best charset by cycling through sendcharsets in descending
250 250 order. Tries both encoding and fallbackencoding for input. Only as
251 251 a last resort, send as-is in fake ascii.
252 252 Caveat: Do not use for mail parts containing patches!'''
253 253 try:
254 254 s.decode('ascii')
255 255 except UnicodeDecodeError:
256 256 sendcharsets = charsets or _charsets(ui)
257 257 for ics in (encoding.encoding, encoding.fallbackencoding):
258 258 try:
259 259 u = s.decode(ics)
260 260 except UnicodeDecodeError:
261 261 continue
262 262 for ocs in sendcharsets:
263 263 try:
264 264 return u.encode(ocs), ocs
265 265 except UnicodeEncodeError:
266 266 pass
267 267 except LookupError:
268 268 ui.warn(_('ignoring invalid sendcharset: %s\n') % ocs)
269 269 # if ascii, or all conversion attempts fail, send (broken) ascii
270 270 return s, 'us-ascii'
271 271
272 272 def headencode(ui, s, charsets=None, display=False):
273 273 '''Returns RFC-2047 compliant header from given string.'''
274 274 if not display:
275 275 # split into words?
276 276 s, cs = _encode(ui, s, charsets)
277 277 return str(email.header.Header(s, cs))
278 278 return s
279 279
280 280 def _addressencode(ui, name, addr, charsets=None):
281 281 name = headencode(ui, name, charsets)
282 282 try:
283 283 acc, dom = addr.split('@')
284 284 acc = acc.encode('ascii')
285 285 dom = dom.decode(encoding.encoding).encode('idna')
286 286 addr = '%s@%s' % (acc, dom)
287 287 except UnicodeDecodeError:
288 288 raise error.Abort(_('invalid email address: %s') % addr)
289 289 except ValueError:
290 290 try:
291 291 # too strict?
292 292 addr = addr.encode('ascii')
293 293 except UnicodeDecodeError:
294 294 raise error.Abort(_('invalid local address: %s') % addr)
295 295 return email.utils.formataddr((name, addr))
296 296
297 297 def addressencode(ui, address, charsets=None, display=False):
298 298 '''Turns address into RFC-2047 compliant header.'''
299 299 if display or not address:
300 300 return address or ''
301 301 name, addr = email.utils.parseaddr(address)
302 302 return _addressencode(ui, name, addr, charsets)
303 303
304 304 def addrlistencode(ui, addrs, charsets=None, display=False):
305 305 '''Turns a list of addresses into a list of RFC-2047 compliant headers.
306 306 A single element of input list may contain multiple addresses, but output
307 307 always has one address per item'''
308 308 if display:
309 309 return [a.strip() for a in addrs if a.strip()]
310 310
311 311 result = []
312 312 for name, addr in email.utils.getaddresses(addrs):
313 313 if name or addr:
314 314 result.append(_addressencode(ui, name, addr, charsets))
315 315 return result
316 316
317 317 def mimeencode(ui, s, charsets=None, display=False):
318 318 '''creates mime text object, encodes it if needed, and sets
319 319 charset and transfer-encoding accordingly.'''
320 320 cs = 'us-ascii'
321 321 if not display:
322 322 s, cs = _encode(ui, s, charsets)
323 323 return mimetextqp(s, 'plain', cs)
324 324
325 325 def headdecode(s):
326 326 '''Decodes RFC-2047 header'''
327 327 uparts = []
328 328 for part, charset in email.header.decode_header(s):
329 329 if charset is not None:
330 330 try:
331 331 uparts.append(part.decode(charset))
332 332 continue
333 333 except UnicodeDecodeError:
334 334 pass
335 335 try:
336 336 uparts.append(part.decode('UTF-8'))
337 337 continue
338 338 except UnicodeDecodeError:
339 339 pass
340 340 uparts.append(part.decode('ISO-8859-1'))
341 341 return encoding.unitolocal(u' '.join(uparts))
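headdecode() above walks a fallback chain: first the charset declared in the RFC 2047 header, then UTF-8, then ISO-8859-1, which always succeeds because every byte maps to a code point. The same chain in isolation (illustrative sketch, not the Mercurial API):

```python
def decode_with_fallback(raw, declared=None):
    # Try the declared charset, then UTF-8; ISO-8859-1 is the terminal
    # fallback since it accepts any byte sequence.
    for cs in filter(None, (declared, 'utf-8')):
        try:
            return raw.decode(cs)
        except (UnicodeDecodeError, LookupError):
            pass
    return raw.decode('iso-8859-1')
```

Catching LookupError as well covers headers that declare a charset Python does not know about.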
@@ -1,2911 +1,2912
1 1 # patch.py - patch file parsing routines
2 2 #
3 3 # Copyright 2006 Brendan Cully <brendan@kublai.com>
4 4 # Copyright 2007 Chris Mason <chris.mason@oracle.com>
5 5 #
6 6 # This software may be used and distributed according to the terms of the
7 7 # GNU General Public License version 2 or any later version.
8 8
9 9 from __future__ import absolute_import, print_function
10 10
11 11 import collections
12 12 import copy
13 13 import difflib
14 14 import email
15 15 import errno
16 16 import hashlib
17 17 import os
18 18 import posixpath
19 19 import re
20 20 import shutil
21 21 import tempfile
22 22 import zlib
23 23
24 24 from .i18n import _
25 25 from .node import (
26 26 hex,
27 27 short,
28 28 )
29 29 from . import (
30 30 copies,
31 31 encoding,
32 32 error,
33 33 mail,
34 34 mdiff,
35 35 pathutil,
36 36 policy,
37 37 pycompat,
38 38 scmutil,
39 39 similar,
40 40 util,
41 41 vfs as vfsmod,
42 42 )
43 43 from .utils import (
44 44 dateutil,
45 45 procutil,
46 46 stringutil,
47 47 )
48 48
49 49 diffhelpers = policy.importmod(r'diffhelpers')
50 50 stringio = util.stringio
51 51
52 52 gitre = re.compile(br'diff --git a/(.*) b/(.*)')
53 53 tabsplitter = re.compile(br'(\t+|[^\t]+)')
54 54 _nonwordre = re.compile(br'([^a-zA-Z0-9_\x80-\xff])')
55 55
56 56 PatchError = error.PatchError
57 57
58 58 # public functions
59 59
60 60 def split(stream):
61 61 '''return an iterator of individual patches from a stream'''
62 62 def isheader(line, inheader):
63 63 if inheader and line[0] in (' ', '\t'):
64 64 # continuation
65 65 return True
66 66 if line[0] in (' ', '-', '+'):
67 67 # diff line - don't check for header pattern in there
68 68 return False
69 69 l = line.split(': ', 1)
70 70 return len(l) == 2 and ' ' not in l[0]
71 71
72 72 def chunk(lines):
73 73 return stringio(''.join(lines))
74 74
75 75 def hgsplit(stream, cur):
76 76 inheader = True
77 77
78 78 for line in stream:
79 79 if not line.strip():
80 80 inheader = False
81 81 if not inheader and line.startswith('# HG changeset patch'):
82 82 yield chunk(cur)
83 83 cur = []
84 84 inheader = True
85 85
86 86 cur.append(line)
87 87
88 88 if cur:
89 89 yield chunk(cur)
90 90
91 91 def mboxsplit(stream, cur):
92 92 for line in stream:
93 93 if line.startswith('From '):
94 94 for c in split(chunk(cur[1:])):
95 95 yield c
96 96 cur = []
97 97
98 98 cur.append(line)
99 99
100 100 if cur:
101 101 for c in split(chunk(cur[1:])):
102 102 yield c
103 103
104 104 def mimesplit(stream, cur):
105 105 def msgfp(m):
106 106 fp = stringio()
107 107 g = email.Generator.Generator(fp, mangle_from_=False)
108 108 g.flatten(m)
109 109 fp.seek(0)
110 110 return fp
111 111
112 112 for line in stream:
113 113 cur.append(line)
114 114 c = chunk(cur)
115 115
116 116 m = pycompat.emailparser().parse(c)
117 117 if not m.is_multipart():
118 118 yield msgfp(m)
119 119 else:
120 120 ok_types = ('text/plain', 'text/x-diff', 'text/x-patch')
121 121 for part in m.walk():
122 122 ct = part.get_content_type()
123 123 if ct not in ok_types:
124 124 continue
125 125 yield msgfp(part)
126 126
127 127 def headersplit(stream, cur):
128 128 inheader = False
129 129
130 130 for line in stream:
131 131 if not inheader and isheader(line, inheader):
132 132 yield chunk(cur)
133 133 cur = []
134 134 inheader = True
135 135 if inheader and not isheader(line, inheader):
136 136 inheader = False
137 137
138 138 cur.append(line)
139 139
140 140 if cur:
141 141 yield chunk(cur)
142 142
143 143 def remainder(cur):
144 144 yield chunk(cur)
145 145
146 146 class fiter(object):
147 147 def __init__(self, fp):
148 148 self.fp = fp
149 149
150 150 def __iter__(self):
151 151 return self
152 152
153 153 def next(self):
154 154 l = self.fp.readline()
155 155 if not l:
156 156 raise StopIteration
157 157 return l
158 158
159 159 __next__ = next
160 160
161 161 inheader = False
162 162 cur = []
163 163
164 164 mimeheaders = ['content-type']
165 165
166 166 if not util.safehasattr(stream, 'next'):
167 167 # http responses, for example, have readline but not next
168 168 stream = fiter(stream)
169 169
170 170 for line in stream:
171 171 cur.append(line)
172 172 if line.startswith('# HG changeset patch'):
173 173 return hgsplit(stream, cur)
174 174 elif line.startswith('From '):
175 175 return mboxsplit(stream, cur)
176 176 elif isheader(line, inheader):
177 177 inheader = True
178 178 if line.split(':', 1)[0].lower() in mimeheaders:
179 179 # let email parser handle this
180 180 return mimesplit(stream, cur)
181 181 elif line.startswith('--- ') and inheader:
182 182 # No evil headers seen by diff start, split by hand
183 183 return headersplit(stream, cur)
184 184 # Not enough info, keep reading
185 185
186 186 # if we are here, we have a very plain patch
187 187 return remainder(cur)
188 188
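The isheader() heuristic inside split() above treats a line as an RFC 822-style header when it parses as 'Key: value' with no space in the key, or when it is an indented continuation of a previous header; lines starting with ' ', '-', or '+' are assumed to be diff content. The same heuristic sketched standalone (using a slice instead of line[0] so empty lines are safe):

```python
def isheader(line, inheader):
    if inheader and line[:1] in (' ', '\t'):
        return True   # continuation of the previous header
    if line[:1] in (' ', '-', '+'):
        return False  # looks like a diff line, not a header
    parts = line.split(': ', 1)
    return len(parts) == 2 and ' ' not in parts[0]
```

split() uses this to decide whether a bare patch starts with mail-like headers that should be routed through the email parser.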
189 189 ## Some facility for extensible patch parsing:
190 190 # list of pairs ("header to match", "data key")
191 191 patchheadermap = [('Date', 'date'),
192 192 ('Branch', 'branch'),
193 193 ('Node ID', 'nodeid'),
194 194 ]
195 195
196 196 def extract(ui, fileobj):
197 197 '''extract patch from data read from fileobj.
198 198
199 199 patch can be a normal patch or contained in an email message.
200 200
201 201 return a dictionary. Standard keys are:
202 202 - filename,
203 203 - message,
204 204 - user,
205 205 - date,
206 206 - branch,
207 207 - node,
208 208 - p1,
209 209 - p2.
210 210 Any item can be missing from the dictionary. If filename is missing,
211 211 fileobj did not contain a patch. Caller must unlink filename when done.'''
212 212
213 213 # attempt to detect the start of a patch
214 214 # (this heuristic is borrowed from quilt)
215 215 diffre = re.compile(br'^(?:Index:[ \t]|diff[ \t]-|RCS file: |'
216 216 br'retrieving revision [0-9]+(\.[0-9]+)*$|'
217 217 br'---[ \t].*?^\+\+\+[ \t]|'
218 218 br'\*\*\*[ \t].*?^---[ \t])',
219 219 re.MULTILINE | re.DOTALL)
220 220
221 221 data = {}
222 222 fd, tmpname = tempfile.mkstemp(prefix='hg-patch-')
223 223 tmpfp = os.fdopen(fd, r'wb')
224 224 try:
225 225 msg = pycompat.emailparser().parse(fileobj)
226 226
227 227 subject = msg['Subject'] and mail.headdecode(msg['Subject'])
228 228 data['user'] = msg['From'] and mail.headdecode(msg['From'])
229 229 if not subject and not data['user']:
230 230 # Not an email, restore parsed headers if any
231 231 subject = '\n'.join(': '.join(map(encoding.strtolocal, h))
232 232 for h in msg.items()) + '\n'
233 233
234 234 # should try to parse msg['Date']
235 235 parents = []
236 236
237 237 if subject:
238 238 if subject.startswith('[PATCH'):
239 239 pend = subject.find(']')
240 240 if pend >= 0:
241 241 subject = subject[pend + 1:].lstrip()
242 242 subject = re.sub(br'\n[ \t]+', ' ', subject)
243 243 ui.debug('Subject: %s\n' % subject)
244 244 if data['user']:
245 245 ui.debug('From: %s\n' % data['user'])
246 246 diffs_seen = 0
247 247 ok_types = ('text/plain', 'text/x-diff', 'text/x-patch')
248 248 message = ''
249 249 for part in msg.walk():
250 250 content_type = pycompat.bytestr(part.get_content_type())
251 251 ui.debug('Content-Type: %s\n' % content_type)
252 252 if content_type not in ok_types:
253 253 continue
254 254 payload = part.get_payload(decode=True)
255 255 m = diffre.search(payload)
256 256 if m:
257 257 hgpatch = False
258 258 hgpatchheader = False
259 259 ignoretext = False
260 260
261 261 ui.debug('found patch at byte %d\n' % m.start(0))
262 262 diffs_seen += 1
263 263 cfp = stringio()
264 264 for line in payload[:m.start(0)].splitlines():
265 265 if line.startswith('# HG changeset patch') and not hgpatch:
266 266 ui.debug('patch generated by hg export\n')
267 267 hgpatch = True
268 268 hgpatchheader = True
269 269 # drop earlier commit message content
270 270 cfp.seek(0)
271 271 cfp.truncate()
272 272 subject = None
273 273 elif hgpatchheader:
274 274 if line.startswith('# User '):
275 275 data['user'] = line[7:]
276 276 ui.debug('From: %s\n' % data['user'])
277 277 elif line.startswith("# Parent "):
278 278 parents.append(line[9:].lstrip())
279 279 elif line.startswith("# "):
280 280 for header, key in patchheadermap:
281 281 prefix = '# %s ' % header
282 282 if line.startswith(prefix):
283 283 data[key] = line[len(prefix):]
284 284 else:
285 285 hgpatchheader = False
286 286 elif line == '---':
287 287 ignoretext = True
288 288 if not hgpatchheader and not ignoretext:
289 289 cfp.write(line)
290 290 cfp.write('\n')
291 291 message = cfp.getvalue()
292 292 if tmpfp:
293 293 tmpfp.write(payload)
294 294 if not payload.endswith('\n'):
295 295 tmpfp.write('\n')
296 296 elif not diffs_seen and message and content_type == 'text/plain':
297 297 message += '\n' + payload
298 298 except: # re-raises
299 299 tmpfp.close()
300 300 os.unlink(tmpname)
301 301 raise
302 302
303 303 if subject and not message.startswith(subject):
304 304 message = '%s\n%s' % (subject, message)
305 305 data['message'] = message
306 306 tmpfp.close()
307 307 if parents:
308 308 data['p1'] = parents.pop(0)
309 309 if parents:
310 310 data['p2'] = parents.pop(0)
311 311
312 312 if diffs_seen:
313 313 data['filename'] = tmpname
314 314 else:
315 315 os.unlink(tmpname)
316 316 return data
317 317
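The patch-start heuristic used by extract() above (borrowed from quilt) can be exercised on its own. A minimal sketch, with the same regex re-declared so the snippet is self-contained:

```python
import re

# Equivalent of the diffre heuristic above, re-declared for illustration.
diffre = re.compile(br'^(?:Index:[ \t]|diff[ \t]-|RCS file: |'
                    br'retrieving revision [0-9]+(\.[0-9]+)*$|'
                    br'---[ \t].*?^\+\+\+[ \t]|'
                    br'\*\*\*[ \t].*?^---[ \t])',
                    re.MULTILINE | re.DOTALL)

mail_body = (b'Here is the fix we discussed.\n'
             b'\n'
             b'--- a/hello.c\n'
             b'+++ b/hello.c\n'
             b'@@ -1 +1 @@\n'
             b'-old\n'
             b'+new\n')

m = diffre.search(mail_body)
# The match starts at the '--- a/hello.c' line, skipping the prose above it.
start = m.start(0)
```

Note how the `---`/`+++` alternative spans two physical lines, which is why the pattern needs both re.MULTILINE (so `^` anchors at each line start) and re.DOTALL (so `.*?` can cross the newline).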
318 318 class patchmeta(object):
319 319 """Patched file metadata
320 320
321 321     'op' is the performed operation, one of ADD, DELETE, RENAME, MODIFY
322 322     or COPY. 'path' is the patched file path. 'oldpath' is set to the
323 323 origin file when 'op' is either COPY or RENAME, None otherwise. If
324 324 file mode is changed, 'mode' is a tuple (islink, isexec) where
325 325 'islink' is True if the file is a symlink and 'isexec' is True if
326 326 the file is executable. Otherwise, 'mode' is None.
327 327 """
328 328 def __init__(self, path):
329 329 self.path = path
330 330 self.oldpath = None
331 331 self.mode = None
332 332 self.op = 'MODIFY'
333 333 self.binary = False
334 334
335 335 def setmode(self, mode):
336 336 islink = mode & 0o20000
337 337 isexec = mode & 0o100
338 338 self.mode = (islink, isexec)
339 339
340 340 def copy(self):
341 341 other = patchmeta(self.path)
342 342 other.oldpath = self.oldpath
343 343 other.mode = self.mode
344 344 other.op = self.op
345 345 other.binary = self.binary
346 346 return other
347 347
348 348 def _ispatchinga(self, afile):
349 349 if afile == '/dev/null':
350 350 return self.op == 'ADD'
351 351 return afile == 'a/' + (self.oldpath or self.path)
352 352
353 353 def _ispatchingb(self, bfile):
354 354 if bfile == '/dev/null':
355 355 return self.op == 'DELETE'
356 356 return bfile == 'b/' + self.path
357 357
358 358 def ispatching(self, afile, bfile):
359 359 return self._ispatchinga(afile) and self._ispatchingb(bfile)
360 360
361 361 def __repr__(self):
362 362 return "<patchmeta %s %r>" % (self.op, self.path)
363 363
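setmode() above decodes the octal git file-mode field into the (islink, isexec) pair. A self-contained sketch of the same two bit tests:

```python
# The two mode bits patchmeta.setmode() inspects, shown standalone.
def decode_mode(mode):
    islink = bool(mode & 0o20000)   # symlink type bit from the git mode field
    isexec = bool(mode & 0o100)     # owner-execute permission bit
    return (islink, isexec)

# 'new file mode 100755' in a git patch is an executable regular file;
# 'new file mode 120000' marks a symlink.
exe = decode_mode(0o100755)    # (False, True)
link = decode_mode(0o120000)   # (True, False)
```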
364 364 def readgitpatch(lr):
365 365 """extract git-style metadata about patches from <patchname>"""
366 366
367 367 # Filter patch for git information
368 368 gp = None
369 369 gitpatches = []
370 370 for line in lr:
371 371 line = line.rstrip(' \r\n')
372 372 if line.startswith('diff --git a/'):
373 373 m = gitre.match(line)
374 374 if m:
375 375 if gp:
376 376 gitpatches.append(gp)
377 377 dst = m.group(2)
378 378 gp = patchmeta(dst)
379 379 elif gp:
380 380 if line.startswith('--- '):
381 381 gitpatches.append(gp)
382 382 gp = None
383 383 continue
384 384 if line.startswith('rename from '):
385 385 gp.op = 'RENAME'
386 386 gp.oldpath = line[12:]
387 387 elif line.startswith('rename to '):
388 388 gp.path = line[10:]
389 389 elif line.startswith('copy from '):
390 390 gp.op = 'COPY'
391 391 gp.oldpath = line[10:]
392 392 elif line.startswith('copy to '):
393 393 gp.path = line[8:]
394 394 elif line.startswith('deleted file'):
395 395 gp.op = 'DELETE'
396 396 elif line.startswith('new file mode '):
397 397 gp.op = 'ADD'
398 398 gp.setmode(int(line[-6:], 8))
399 399 elif line.startswith('new mode '):
400 400 gp.setmode(int(line[-6:], 8))
401 401 elif line.startswith('GIT binary patch'):
402 402 gp.binary = True
403 403 if gp:
404 404 gitpatches.append(gp)
405 405
406 406 return gitpatches
407 407
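A condensed, self-contained sketch of the per-line dispatch readgitpatch() performs on git extended header lines; the real function also tracks modes, copies, deletions and binary patches:

```python
# Minimal stand-in for the 'rename from'/'rename to' handling above.
def parse_rename(lines):
    op, oldpath, path = 'MODIFY', None, None
    for line in lines:
        line = line.rstrip(' \r\n')
        if line.startswith('rename from '):
            op, oldpath = 'RENAME', line[12:]
        elif line.startswith('rename to '):
            path = line[10:]
    return op, oldpath, path

header = ['diff --git a/old.txt b/new.txt\n',
          'rename from old.txt\n',
          'rename to new.txt\n']
meta = parse_rename(header)
```

The fixed slice offsets (12 and 10) are simply the lengths of the matched prefixes, exactly as in the code above.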
408 408 class linereader(object):
409 409 # simple class to allow pushing lines back into the input stream
410 410 def __init__(self, fp):
411 411 self.fp = fp
412 412 self.buf = []
413 413
414 414 def push(self, line):
415 415 if line is not None:
416 416 self.buf.append(line)
417 417
418 418 def readline(self):
419 419 if self.buf:
420 420 l = self.buf[0]
421 421 del self.buf[0]
422 422 return l
423 423 return self.fp.readline()
424 424
425 425 def __iter__(self):
426 426 return iter(self.readline, '')
427 427
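linereader's push-back behavior is easiest to see in isolation; a standalone demo with the class re-declared verbatim (iteration stops on the empty-string sentinel, matching `iter(self.readline, '')`):

```python
import io

# Re-declared copy of the linereader class above, for a standalone demo.
class linereader(object):
    def __init__(self, fp):
        self.fp = fp
        self.buf = []

    def push(self, line):
        if line is not None:
            self.buf.append(line)

    def readline(self):
        if self.buf:
            l = self.buf[0]
            del self.buf[0]
            return l
        return self.fp.readline()

    def __iter__(self):
        return iter(self.readline, '')

lr = linereader(io.StringIO('one\ntwo\n'))
first = lr.readline()      # consume the first line
lr.push(first)             # ...then put it back
lines = list(lr)           # iteration sees the pushed line again
```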
428 428 class abstractbackend(object):
429 429 def __init__(self, ui):
430 430 self.ui = ui
431 431
432 432 def getfile(self, fname):
433 433 """Return target file data and flags as a (data, (islink,
434 434 isexec)) tuple. Data is None if file is missing/deleted.
435 435 """
436 436 raise NotImplementedError
437 437
438 438 def setfile(self, fname, data, mode, copysource):
439 439 """Write data to target file fname and set its mode. mode is a
440 440 (islink, isexec) tuple. If data is None, the file content should
441 441 be left unchanged. If the file is modified after being copied,
442 442 copysource is set to the original file name.
443 443 """
444 444 raise NotImplementedError
445 445
446 446 def unlink(self, fname):
447 447 """Unlink target file."""
448 448 raise NotImplementedError
449 449
450 450 def writerej(self, fname, failed, total, lines):
451 451         """Write rejected lines for fname. failed is the number of hunks
452 452         which failed to apply and total the total number of hunks for this
453 453         file.
454 454 """
455 455
456 456 def exists(self, fname):
457 457 raise NotImplementedError
458 458
459 459 def close(self):
460 460 raise NotImplementedError
461 461
462 462 class fsbackend(abstractbackend):
463 463 def __init__(self, ui, basedir):
464 464 super(fsbackend, self).__init__(ui)
465 465 self.opener = vfsmod.vfs(basedir)
466 466
467 467 def getfile(self, fname):
468 468 if self.opener.islink(fname):
469 469 return (self.opener.readlink(fname), (True, False))
470 470
471 471 isexec = False
472 472 try:
473 473 isexec = self.opener.lstat(fname).st_mode & 0o100 != 0
474 474 except OSError as e:
475 475 if e.errno != errno.ENOENT:
476 476 raise
477 477 try:
478 478 return (self.opener.read(fname), (False, isexec))
479 479 except IOError as e:
480 480 if e.errno != errno.ENOENT:
481 481 raise
482 482 return None, None
483 483
484 484 def setfile(self, fname, data, mode, copysource):
485 485 islink, isexec = mode
486 486 if data is None:
487 487 self.opener.setflags(fname, islink, isexec)
488 488 return
489 489 if islink:
490 490 self.opener.symlink(data, fname)
491 491 else:
492 492 self.opener.write(fname, data)
493 493 if isexec:
494 494 self.opener.setflags(fname, False, True)
495 495
496 496 def unlink(self, fname):
497 497 self.opener.unlinkpath(fname, ignoremissing=True)
498 498
499 499 def writerej(self, fname, failed, total, lines):
500 500 fname = fname + ".rej"
501 501 self.ui.warn(
502 502 _("%d out of %d hunks FAILED -- saving rejects to file %s\n") %
503 503 (failed, total, fname))
504 504 fp = self.opener(fname, 'w')
505 505 fp.writelines(lines)
506 506 fp.close()
507 507
508 508 def exists(self, fname):
509 509 return self.opener.lexists(fname)
510 510
511 511 class workingbackend(fsbackend):
512 512 def __init__(self, ui, repo, similarity):
513 513 super(workingbackend, self).__init__(ui, repo.root)
514 514 self.repo = repo
515 515 self.similarity = similarity
516 516 self.removed = set()
517 517 self.changed = set()
518 518 self.copied = []
519 519
520 520 def _checkknown(self, fname):
521 521 if self.repo.dirstate[fname] == '?' and self.exists(fname):
522 522 raise PatchError(_('cannot patch %s: file is not tracked') % fname)
523 523
524 524 def setfile(self, fname, data, mode, copysource):
525 525 self._checkknown(fname)
526 526 super(workingbackend, self).setfile(fname, data, mode, copysource)
527 527 if copysource is not None:
528 528 self.copied.append((copysource, fname))
529 529 self.changed.add(fname)
530 530
531 531 def unlink(self, fname):
532 532 self._checkknown(fname)
533 533 super(workingbackend, self).unlink(fname)
534 534 self.removed.add(fname)
535 535 self.changed.add(fname)
536 536
537 537 def close(self):
538 538 wctx = self.repo[None]
539 539 changed = set(self.changed)
540 540 for src, dst in self.copied:
541 541 scmutil.dirstatecopy(self.ui, self.repo, wctx, src, dst)
542 542 if self.removed:
543 543 wctx.forget(sorted(self.removed))
544 544 for f in self.removed:
545 545 if f not in self.repo.dirstate:
546 546 # File was deleted and no longer belongs to the
547 547 # dirstate, it was probably marked added then
548 548 # deleted, and should not be considered by
549 549 # marktouched().
550 550 changed.discard(f)
551 551 if changed:
552 552 scmutil.marktouched(self.repo, changed, self.similarity)
553 553 return sorted(self.changed)
554 554
555 555 class filestore(object):
556 556 def __init__(self, maxsize=None):
557 557 self.opener = None
558 558 self.files = {}
559 559 self.created = 0
560 560 self.maxsize = maxsize
561 561 if self.maxsize is None:
562 562 self.maxsize = 4*(2**20)
563 563 self.size = 0
564 564 self.data = {}
565 565
566 566 def setfile(self, fname, data, mode, copied=None):
567 567 if self.maxsize < 0 or (len(data) + self.size) <= self.maxsize:
568 568 self.data[fname] = (data, mode, copied)
569 569 self.size += len(data)
570 570 else:
571 571 if self.opener is None:
572 572 root = tempfile.mkdtemp(prefix='hg-patch-')
573 573 self.opener = vfsmod.vfs(root)
574 574 # Avoid filename issues with these simple names
575 575 fn = '%d' % self.created
576 576 self.opener.write(fn, data)
577 577 self.created += 1
578 578 self.files[fname] = (fn, mode, copied)
579 579
580 580 def getfile(self, fname):
581 581 if fname in self.data:
582 582 return self.data[fname]
583 583 if not self.opener or fname not in self.files:
584 584 return None, None, None
585 585 fn, mode, copied = self.files[fname]
586 586 return self.opener.read(fn), mode, copied
587 587
588 588 def close(self):
589 589 if self.opener:
590 590 shutil.rmtree(self.opener.base)
591 591
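filestore keeps small patched files in memory and spills larger ones to a temporary directory once maxsize (4 MiB by default) would be exceeded. A sketch of just that bookkeeping decision, with the spill reduced to a list so the snippet needs no filesystem:

```python
# Standalone sketch of filestore's in-memory vs. on-disk decision.
class tinystore(object):
    def __init__(self, maxsize=4 * (2 ** 20)):
        self.maxsize = maxsize
        self.size = 0
        self.data = {}      # small files, kept in memory
        self.spilled = []   # names that would go to the temp directory

    def setfile(self, fname, data):
        if self.maxsize < 0 or (len(data) + self.size) <= self.maxsize:
            self.data[fname] = data
            self.size += len(data)
        else:
            self.spilled.append(fname)

store = tinystore(maxsize=10)
store.setfile('a', b'12345')      # fits: 5 bytes within the 10-byte budget
store.setfile('b', b'123456789')  # would exceed the budget: spilled
```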
592 592 class repobackend(abstractbackend):
593 593 def __init__(self, ui, repo, ctx, store):
594 594 super(repobackend, self).__init__(ui)
595 595 self.repo = repo
596 596 self.ctx = ctx
597 597 self.store = store
598 598 self.changed = set()
599 599 self.removed = set()
600 600 self.copied = {}
601 601
602 602 def _checkknown(self, fname):
603 603 if fname not in self.ctx:
604 604 raise PatchError(_('cannot patch %s: file is not tracked') % fname)
605 605
606 606 def getfile(self, fname):
607 607 try:
608 608 fctx = self.ctx[fname]
609 609 except error.LookupError:
610 610 return None, None
611 611 flags = fctx.flags()
612 612 return fctx.data(), ('l' in flags, 'x' in flags)
613 613
614 614 def setfile(self, fname, data, mode, copysource):
615 615 if copysource:
616 616 self._checkknown(copysource)
617 617 if data is None:
618 618 data = self.ctx[fname].data()
619 619 self.store.setfile(fname, data, mode, copysource)
620 620 self.changed.add(fname)
621 621 if copysource:
622 622 self.copied[fname] = copysource
623 623
624 624 def unlink(self, fname):
625 625 self._checkknown(fname)
626 626 self.removed.add(fname)
627 627
628 628 def exists(self, fname):
629 629 return fname in self.ctx
630 630
631 631 def close(self):
632 632 return self.changed | self.removed
633 633
634 634 # @@ -start,len +start,len @@ or @@ -start +start @@ if len is 1
635 635 unidesc = re.compile('@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@')
636 636 contextdesc = re.compile('(?:---|\*\*\*) (\d+)(?:,(\d+))? (?:---|\*\*\*)')
637 637 eolmodes = ['strict', 'crlf', 'lf', 'auto']
638 638
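The unidesc pattern above pulls the four start/length fields out of a unified hunk header; the lengths are optional and omitted when they are 1. A standalone check (written as a raw string here):

```python
import re

# Same pattern as unidesc above, re-declared for a standalone check.
unidesc = re.compile(r'@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@')

m = unidesc.match('@@ -12,5 +12,6 @@')
full = m.groups()          # all four fields present

m = unidesc.match('@@ -3 +3 @@')
short = m.groups()         # lengths omitted when they are 1
```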
639 639 class patchfile(object):
640 640 def __init__(self, ui, gp, backend, store, eolmode='strict'):
641 641 self.fname = gp.path
642 642 self.eolmode = eolmode
643 643 self.eol = None
644 644 self.backend = backend
645 645 self.ui = ui
646 646 self.lines = []
647 647 self.exists = False
648 648 self.missing = True
649 649 self.mode = gp.mode
650 650 self.copysource = gp.oldpath
651 651 self.create = gp.op in ('ADD', 'COPY', 'RENAME')
652 652 self.remove = gp.op == 'DELETE'
653 653 if self.copysource is None:
654 654 data, mode = backend.getfile(self.fname)
655 655 else:
656 656 data, mode = store.getfile(self.copysource)[:2]
657 657 if data is not None:
658 658 self.exists = self.copysource is None or backend.exists(self.fname)
659 659 self.missing = False
660 660 if data:
661 661 self.lines = mdiff.splitnewlines(data)
662 662 if self.mode is None:
663 663 self.mode = mode
664 664 if self.lines:
665 665 # Normalize line endings
666 666 if self.lines[0].endswith('\r\n'):
667 667 self.eol = '\r\n'
668 668 elif self.lines[0].endswith('\n'):
669 669 self.eol = '\n'
670 670 if eolmode != 'strict':
671 671 nlines = []
672 672 for l in self.lines:
673 673 if l.endswith('\r\n'):
674 674 l = l[:-2] + '\n'
675 675 nlines.append(l)
676 676 self.lines = nlines
677 677 else:
678 678 if self.create:
679 679 self.missing = False
680 680 if self.mode is None:
681 681 self.mode = (False, False)
682 682 if self.missing:
683 683 self.ui.warn(_("unable to find '%s' for patching\n") % self.fname)
684 684 self.ui.warn(_("(use '--prefix' to apply patch relative to the "
685 685 "current directory)\n"))
686 686
687 687 self.hash = {}
688 688 self.dirty = 0
689 689 self.offset = 0
690 690 self.skew = 0
691 691 self.rej = []
692 692 self.fileprinted = False
693 693 self.printfile(False)
694 694 self.hunks = 0
695 695
696 696 def writelines(self, fname, lines, mode):
697 697 if self.eolmode == 'auto':
698 698 eol = self.eol
699 699 elif self.eolmode == 'crlf':
700 700 eol = '\r\n'
701 701 else:
702 702 eol = '\n'
703 703
704 704 if self.eolmode != 'strict' and eol and eol != '\n':
705 705 rawlines = []
706 706 for l in lines:
707 707 if l and l[-1] == '\n':
708 708 l = l[:-1] + eol
709 709 rawlines.append(l)
710 710 lines = rawlines
711 711
712 712 self.backend.setfile(fname, ''.join(lines), mode, self.copysource)
713 713
714 714 def printfile(self, warn):
715 715 if self.fileprinted:
716 716 return
717 717 if warn or self.ui.verbose:
718 718 self.fileprinted = True
719 719 s = _("patching file %s\n") % self.fname
720 720 if warn:
721 721 self.ui.warn(s)
722 722 else:
723 723 self.ui.note(s)
724 724
725 725
726 726 def findlines(self, l, linenum):
727 727 # looks through the hash and finds candidate lines. The
728 728 # result is a list of line numbers sorted based on distance
729 729 # from linenum
730 730
731 731 cand = self.hash.get(l, [])
732 732 if len(cand) > 1:
733 733 # resort our list of potentials forward then back.
734 734 cand.sort(key=lambda x: abs(x - linenum))
735 735 return cand
736 736
737 737 def write_rej(self):
738 738 # our rejects are a little different from patch(1). This always
739 739 # creates rejects in the same form as the original patch. A file
740 740 # header is inserted so that you can run the reject through patch again
741 741 # without having to type the filename.
742 742 if not self.rej:
743 743 return
744 744 base = os.path.basename(self.fname)
745 745 lines = ["--- %s\n+++ %s\n" % (base, base)]
746 746 for x in self.rej:
747 747 for l in x.hunk:
748 748 lines.append(l)
749 749 if l[-1:] != '\n':
750 750 lines.append("\n\ No newline at end of file\n")
751 751 self.backend.writerej(self.fname, len(self.rej), self.hunks, lines)
752 752
753 753 def apply(self, h):
754 754 if not h.complete():
755 755 raise PatchError(_("bad hunk #%d %s (%d %d %d %d)") %
756 756 (h.number, h.desc, len(h.a), h.lena, len(h.b),
757 757 h.lenb))
758 758
759 759 self.hunks += 1
760 760
761 761 if self.missing:
762 762 self.rej.append(h)
763 763 return -1
764 764
765 765 if self.exists and self.create:
766 766 if self.copysource:
767 767 self.ui.warn(_("cannot create %s: destination already "
768 768 "exists\n") % self.fname)
769 769 else:
770 770 self.ui.warn(_("file %s already exists\n") % self.fname)
771 771 self.rej.append(h)
772 772 return -1
773 773
774 774 if isinstance(h, binhunk):
775 775 if self.remove:
776 776 self.backend.unlink(self.fname)
777 777 else:
778 778 l = h.new(self.lines)
779 779 self.lines[:] = l
780 780 self.offset += len(l)
781 781 self.dirty = True
782 782 return 0
783 783
784 784 horig = h
785 785 if (self.eolmode in ('crlf', 'lf')
786 786 or self.eolmode == 'auto' and self.eol):
787 787 # If new eols are going to be normalized, then normalize
788 788 # hunk data before patching. Otherwise, preserve input
789 789 # line-endings.
790 790 h = h.getnormalized()
791 791
792 792 # fast case first, no offsets, no fuzz
793 793 old, oldstart, new, newstart = h.fuzzit(0, False)
794 794 oldstart += self.offset
795 795 orig_start = oldstart
796 796 # if there's skew we want to emit the "(offset %d lines)" even
797 797 # when the hunk cleanly applies at start + skew, so skip the
798 798 # fast case code
799 799 if (self.skew == 0 and
800 800 diffhelpers.testhunk(old, self.lines, oldstart) == 0):
801 801 if self.remove:
802 802 self.backend.unlink(self.fname)
803 803 else:
804 804 self.lines[oldstart:oldstart + len(old)] = new
805 805 self.offset += len(new) - len(old)
806 806 self.dirty = True
807 807 return 0
808 808
809 809 # ok, we couldn't match the hunk. Lets look for offsets and fuzz it
810 810 self.hash = {}
811 811 for x, s in enumerate(self.lines):
812 812 self.hash.setdefault(s, []).append(x)
813 813
814 814 for fuzzlen in xrange(self.ui.configint("patch", "fuzz") + 1):
815 815 for toponly in [True, False]:
816 816 old, oldstart, new, newstart = h.fuzzit(fuzzlen, toponly)
817 817 oldstart = oldstart + self.offset + self.skew
818 818 oldstart = min(oldstart, len(self.lines))
819 819 if old:
820 820 cand = self.findlines(old[0][1:], oldstart)
821 821 else:
822 822 # Only adding lines with no or fuzzed context, just
823 823                     # take the skew into account
824 824 cand = [oldstart]
825 825
826 826 for l in cand:
827 827 if not old or diffhelpers.testhunk(old, self.lines, l) == 0:
828 828 self.lines[l : l + len(old)] = new
829 829 self.offset += len(new) - len(old)
830 830 self.skew = l - orig_start
831 831 self.dirty = True
832 832 offset = l - orig_start - fuzzlen
833 833 if fuzzlen:
834 834 msg = _("Hunk #%d succeeded at %d "
835 835 "with fuzz %d "
836 836 "(offset %d lines).\n")
837 837 self.printfile(True)
838 838 self.ui.warn(msg %
839 839 (h.number, l + 1, fuzzlen, offset))
840 840 else:
841 841 msg = _("Hunk #%d succeeded at %d "
842 842 "(offset %d lines).\n")
843 843 self.ui.note(msg % (h.number, l + 1, offset))
844 844 return fuzzlen
845 845 self.printfile(True)
846 846 self.ui.warn(_("Hunk #%d FAILED at %d\n") % (h.number, orig_start))
847 847 self.rej.append(horig)
848 848 return -1
849 849
850 850 def close(self):
851 851 if self.dirty:
852 852 self.writelines(self.fname, self.lines, self.mode)
853 853 self.write_rej()
854 854 return len(self.rej)
855 855
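writelines() above rewrites line endings on the way out when eolmode asks for it. The core transformation, sketched standalone:

```python
# Standalone sketch of the EOL rewrite in patchfile.writelines().
def apply_eol(lines, eol):
    if eol == '\n':
        return lines          # nothing to rewrite
    rawlines = []
    for l in lines:
        if l and l[-1] == '\n':
            l = l[:-1] + eol
        rawlines.append(l)
    return rawlines

crlf = apply_eol(['a\n', 'b\n', 'no-eol'], '\r\n')
```

A final line without a newline is deliberately left alone, just as in the method above.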
856 856 class header(object):
857 857 """patch header
858 858 """
859 859 diffgit_re = re.compile('diff --git a/(.*) b/(.*)$')
860 860 diff_re = re.compile('diff -r .* (.*)$')
861 861 allhunks_re = re.compile('(?:index|deleted file) ')
862 862 pretty_re = re.compile('(?:new file|deleted file) ')
863 863 special_re = re.compile('(?:index|deleted|copy|rename) ')
864 864 newfile_re = re.compile('(?:new file)')
865 865
866 866 def __init__(self, header):
867 867 self.header = header
868 868 self.hunks = []
869 869
870 870 def binary(self):
871 871 return any(h.startswith('index ') for h in self.header)
872 872
873 873 def pretty(self, fp):
874 874 for h in self.header:
875 875 if h.startswith('index '):
876 876 fp.write(_('this modifies a binary file (all or nothing)\n'))
877 877 break
878 878 if self.pretty_re.match(h):
879 879 fp.write(h)
880 880 if self.binary():
881 881 fp.write(_('this is a binary file\n'))
882 882 break
883 883 if h.startswith('---'):
884 884 fp.write(_('%d hunks, %d lines changed\n') %
885 885 (len(self.hunks),
886 886 sum([max(h.added, h.removed) for h in self.hunks])))
887 887 break
888 888 fp.write(h)
889 889
890 890 def write(self, fp):
891 891 fp.write(''.join(self.header))
892 892
893 893 def allhunks(self):
894 894 return any(self.allhunks_re.match(h) for h in self.header)
895 895
896 896 def files(self):
897 897 match = self.diffgit_re.match(self.header[0])
898 898 if match:
899 899 fromfile, tofile = match.groups()
900 900 if fromfile == tofile:
901 901 return [fromfile]
902 902 return [fromfile, tofile]
903 903 else:
904 904 return self.diff_re.match(self.header[0]).groups()
905 905
906 906 def filename(self):
907 907 return self.files()[-1]
908 908
909 909 def __repr__(self):
910 910 return '<header %s>' % (' '.join(map(repr, self.files())))
911 911
912 912 def isnewfile(self):
913 913 return any(self.newfile_re.match(h) for h in self.header)
914 914
915 915 def special(self):
916 916         # Special files are shown only at the header level and not at the hunk
917 917         # level; for example, a file that has been deleted is a special file.
918 918         # The user cannot change the content of the operation: in the case of
919 919         # a deleted file, the deletion must be taken or left as a whole, it
920 920         # cannot be taken partially.
921 921         # Newly added files are special only if they are empty; if they have
922 922         # some content they are not special, as we want to be able to change it
923 923 nocontent = len(self.header) == 2
924 924 emptynewfile = self.isnewfile() and nocontent
925 925 return emptynewfile or \
926 926 any(self.special_re.match(h) for h in self.header)
927 927
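header.files() relies on diffgit_re to recover the file names from the first header line, collapsing them to a single entry when a git diff touches the same path on both sides. A standalone sketch:

```python
import re

# Same pattern as header.diffgit_re, re-declared for the demo.
diffgit_re = re.compile('diff --git a/(.*) b/(.*)$')

def files(firstline):
    fromfile, tofile = diffgit_re.match(firstline).groups()
    if fromfile == tofile:
        return [fromfile]
    return [fromfile, tofile]

same = files('diff --git a/foo.c b/foo.c')      # one name when unchanged
renamed = files('diff --git a/foo.c b/bar.c')   # both names for a rename
```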
928 928 class recordhunk(object):
929 929 """patch hunk
930 930
931 931 XXX shouldn't we merge this with the other hunk class?
932 932 """
933 933
934 934 def __init__(self, header, fromline, toline, proc, before, hunk, after,
935 935 maxcontext=None):
936 936 def trimcontext(lines, reverse=False):
937 937 if maxcontext is not None:
938 938 delta = len(lines) - maxcontext
939 939 if delta > 0:
940 940 if reverse:
941 941 return delta, lines[delta:]
942 942 else:
943 943 return delta, lines[:maxcontext]
944 944 return 0, lines
945 945
946 946 self.header = header
947 947 trimedbefore, self.before = trimcontext(before, True)
948 948 self.fromline = fromline + trimedbefore
949 949 self.toline = toline + trimedbefore
950 950 _trimedafter, self.after = trimcontext(after, False)
951 951 self.proc = proc
952 952 self.hunk = hunk
953 953 self.added, self.removed = self.countchanges(self.hunk)
954 954
955 955 def __eq__(self, v):
956 956 if not isinstance(v, recordhunk):
957 957 return False
958 958
959 959 return ((v.hunk == self.hunk) and
960 960 (v.proc == self.proc) and
961 961 (self.fromline == v.fromline) and
962 962 (self.header.files() == v.header.files()))
963 963
964 964 def __hash__(self):
965 965 return hash((tuple(self.hunk),
966 966 tuple(self.header.files()),
967 967 self.fromline,
968 968 self.proc))
969 969
970 970 def countchanges(self, hunk):
971 971 """hunk -> (n+,n-)"""
972 972 add = len([h for h in hunk if h.startswith('+')])
973 973 rem = len([h for h in hunk if h.startswith('-')])
974 974 return add, rem
975 975
976 976 def reversehunk(self):
977 977 """return another recordhunk which is the reverse of the hunk
978 978
979 979 If this hunk is diff(A, B), the returned hunk is diff(B, A). To do
980 980         that, swap fromline/toline and +/- signs while keeping other things
981 981 unchanged.
982 982 """
983 983 m = {'+': '-', '-': '+', '\\': '\\'}
984 984 hunk = ['%s%s' % (m[l[0:1]], l[1:]) for l in self.hunk]
985 985 return recordhunk(self.header, self.toline, self.fromline, self.proc,
986 986 self.before, hunk, self.after)
987 987
988 988 def write(self, fp):
989 989 delta = len(self.before) + len(self.after)
990 990 if self.after and self.after[-1] == '\\ No newline at end of file\n':
991 991 delta -= 1
992 992 fromlen = delta + self.removed
993 993 tolen = delta + self.added
994 994 fp.write('@@ -%d,%d +%d,%d @@%s\n' %
995 995 (self.fromline, fromlen, self.toline, tolen,
996 996 self.proc and (' ' + self.proc)))
997 997 fp.write(''.join(self.before + self.hunk + self.after))
998 998
999 999 pretty = write
1000 1000
1001 1001 def filename(self):
1002 1002 return self.header.filename()
1003 1003
1004 1004 def __repr__(self):
1005 1005 return '<hunk %r@%d>' % (self.filename(), self.fromline)
1006 1006
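countchanges() and reversehunk() above both operate on the raw '+'/'-' prefixed hunk lines. A self-contained sketch of the counting and the sign swap:

```python
# Standalone versions of recordhunk.countchanges() and the sign swap
# performed inside recordhunk.reversehunk().
def countchanges(hunk):
    add = len([h for h in hunk if h.startswith('+')])
    rem = len([h for h in hunk if h.startswith('-')])
    return add, rem

def reverselines(hunk):
    m = {'+': '-', '-': '+', '\\': '\\'}
    return ['%s%s' % (m[l[0:1]], l[1:]) for l in hunk]

hunk = ['-old line\n', '+new line\n', '+extra line\n']
counts = countchanges(hunk)         # two additions, one removal
reversed_hunk = reverselines(hunk)  # diff(B, A) of the same lines
```

Context lines never reach the swap table because recordhunk stores them separately in `before` and `after`.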
1007 1007 def getmessages():
1008 1008 return {
1009 1009 'multiple': {
1010 1010 'apply': _("apply change %d/%d to '%s'?"),
1011 1011 'discard': _("discard change %d/%d to '%s'?"),
1012 1012 'record': _("record change %d/%d to '%s'?"),
1013 1013 },
1014 1014 'single': {
1015 1015 'apply': _("apply this change to '%s'?"),
1016 1016 'discard': _("discard this change to '%s'?"),
1017 1017 'record': _("record this change to '%s'?"),
1018 1018 },
1019 1019 'help': {
1020 1020 'apply': _('[Ynesfdaq?]'
1021 1021 '$$ &Yes, apply this change'
1022 1022 '$$ &No, skip this change'
1023 1023 '$$ &Edit this change manually'
1024 1024 '$$ &Skip remaining changes to this file'
1025 1025 '$$ Apply remaining changes to this &file'
1026 1026 '$$ &Done, skip remaining changes and files'
1027 1027 '$$ Apply &all changes to all remaining files'
1028 1028 '$$ &Quit, applying no changes'
1029 1029 '$$ &? (display help)'),
1030 1030 'discard': _('[Ynesfdaq?]'
1031 1031 '$$ &Yes, discard this change'
1032 1032 '$$ &No, skip this change'
1033 1033 '$$ &Edit this change manually'
1034 1034 '$$ &Skip remaining changes to this file'
1035 1035 '$$ Discard remaining changes to this &file'
1036 1036 '$$ &Done, skip remaining changes and files'
1037 1037 '$$ Discard &all changes to all remaining files'
1038 1038 '$$ &Quit, discarding no changes'
1039 1039 '$$ &? (display help)'),
1040 1040 'record': _('[Ynesfdaq?]'
1041 1041 '$$ &Yes, record this change'
1042 1042 '$$ &No, skip this change'
1043 1043 '$$ &Edit this change manually'
1044 1044 '$$ &Skip remaining changes to this file'
1045 1045 '$$ Record remaining changes to this &file'
1046 1046 '$$ &Done, skip remaining changes and files'
1047 1047 '$$ Record &all changes to all remaining files'
1048 1048 '$$ &Quit, recording no changes'
1049 1049 '$$ &? (display help)'),
1050 1050 }
1051 1051 }
1052 1052
1053 1053 def filterpatch(ui, headers, operation=None):
1054 1054 """Interactively filter patch chunks into applied-only chunks"""
1055 1055 messages = getmessages()
1056 1056
1057 1057 if operation is None:
1058 1058 operation = 'record'
1059 1059
1060 1060 def prompt(skipfile, skipall, query, chunk):
1061 1061 """prompt query, and process base inputs
1062 1062
1063 1063 - y/n for the rest of file
1064 1064 - y/n for the rest
1065 1065 - ? (help)
1066 1066 - q (quit)
1067 1067
1068 1068 Return True/False and possibly updated skipfile and skipall.
1069 1069 """
1070 1070 newpatches = None
1071 1071 if skipall is not None:
1072 1072 return skipall, skipfile, skipall, newpatches
1073 1073 if skipfile is not None:
1074 1074 return skipfile, skipfile, skipall, newpatches
1075 1075 while True:
1076 1076 resps = messages['help'][operation]
1077 1077 r = ui.promptchoice("%s %s" % (query, resps))
1078 1078 ui.write("\n")
1079 1079 if r == 8: # ?
1080 1080 for c, t in ui.extractchoices(resps)[1]:
1081 1081 ui.write('%s - %s\n' % (c, encoding.lower(t)))
1082 1082 continue
1083 1083 elif r == 0: # yes
1084 1084 ret = True
1085 1085 elif r == 1: # no
1086 1086 ret = False
1087 1087 elif r == 2: # Edit patch
1088 1088 if chunk is None:
1089 1089 ui.write(_('cannot edit patch for whole file'))
1090 1090 ui.write("\n")
1091 1091 continue
1092 1092 if chunk.header.binary():
1093 1093 ui.write(_('cannot edit patch for binary file'))
1094 1094 ui.write("\n")
1095 1095 continue
1096 1096 # Patch comment based on the Git one (based on comment at end of
1097 1097 # https://mercurial-scm.org/wiki/RecordExtension)
1098 1098 phelp = '---' + _("""
1099 1099 To remove '-' lines, make them ' ' lines (context).
1100 1100 To remove '+' lines, delete them.
1101 1101 Lines starting with # will be removed from the patch.
1102 1102
1103 1103 If the patch applies cleanly, the edited hunk will immediately be
1104 1104 added to the record list. If it does not apply cleanly, a rejects
1105 1105 file will be generated: you can use that when you try again. If
1106 1106 all lines of the hunk are removed, then the edit is aborted and
1107 1107 the hunk is left unchanged.
1108 1108 """)
1109 1109 (patchfd, patchfn) = tempfile.mkstemp(prefix="hg-editor-",
1110 1110 suffix=".diff")
1111 1111 ncpatchfp = None
1112 1112 try:
1113 1113 # Write the initial patch
1114 1114 f = util.nativeeolwriter(os.fdopen(patchfd, r'wb'))
1115 1115 chunk.header.write(f)
1116 1116 chunk.write(f)
1117 1117 f.write('\n'.join(['# ' + i for i in phelp.splitlines()]))
1118 1118 f.close()
1119 1119 # Start the editor and wait for it to complete
1120 1120 editor = ui.geteditor()
1121 1121 ret = ui.system("%s \"%s\"" % (editor, patchfn),
1122 1122 environ={'HGUSER': ui.username()},
1123 1123 blockedtag='filterpatch')
1124 1124 if ret != 0:
1125 1125 ui.warn(_("editor exited with exit code %d\n") % ret)
1126 1126 continue
1127 1127 # Remove comment lines
1128 1128 patchfp = open(patchfn, r'rb')
1129 1129 ncpatchfp = stringio()
1130 1130 for line in util.iterfile(patchfp):
1131 1131 line = util.fromnativeeol(line)
1132 1132 if not line.startswith('#'):
1133 1133 ncpatchfp.write(line)
1134 1134 patchfp.close()
1135 1135 ncpatchfp.seek(0)
1136 1136 newpatches = parsepatch(ncpatchfp)
1137 1137 finally:
1138 1138 os.unlink(patchfn)
1139 1139 del ncpatchfp
1140 1140 # Signal that the chunk shouldn't be applied as-is, but
1141 1141 # provide the new patch to be used instead.
1142 1142 ret = False
1143 1143 elif r == 3: # Skip
1144 1144 ret = skipfile = False
1145 1145 elif r == 4: # file (Record remaining)
1146 1146 ret = skipfile = True
1147 1147 elif r == 5: # done, skip remaining
1148 1148 ret = skipall = False
1149 1149 elif r == 6: # all
1150 1150 ret = skipall = True
1151 1151 elif r == 7: # quit
1152 1152 raise error.Abort(_('user quit'))
1153 1153 return ret, skipfile, skipall, newpatches
1154 1154
1155 1155 seen = set()
1156 1156 applied = {} # 'filename' -> [] of chunks
1157 1157 skipfile, skipall = None, None
1158 1158 pos, total = 1, sum(len(h.hunks) for h in headers)
1159 1159 for h in headers:
1160 1160 pos += len(h.hunks)
1161 1161 skipfile = None
1162 1162 fixoffset = 0
1163 1163 hdr = ''.join(h.header)
1164 1164 if hdr in seen:
1165 1165 continue
1166 1166 seen.add(hdr)
1167 1167 if skipall is None:
1168 1168 h.pretty(ui)
1169 1169 msg = (_('examine changes to %s?') %
1170 1170 _(' and ').join("'%s'" % f for f in h.files()))
1171 1171 r, skipfile, skipall, np = prompt(skipfile, skipall, msg, None)
1172 1172 if not r:
1173 1173 continue
1174 1174 applied[h.filename()] = [h]
1175 1175 if h.allhunks():
1176 1176 applied[h.filename()] += h.hunks
1177 1177 continue
1178 1178 for i, chunk in enumerate(h.hunks):
1179 1179 if skipfile is None and skipall is None:
1180 1180 chunk.pretty(ui)
1181 1181 if total == 1:
1182 1182 msg = messages['single'][operation] % chunk.filename()
1183 1183 else:
1184 1184 idx = pos - len(h.hunks) + i
1185 1185 msg = messages['multiple'][operation] % (idx, total,
1186 1186 chunk.filename())
1187 1187 r, skipfile, skipall, newpatches = prompt(skipfile,
1188 1188 skipall, msg, chunk)
1189 1189 if r:
1190 1190 if fixoffset:
1191 1191 chunk = copy.copy(chunk)
1192 1192 chunk.toline += fixoffset
1193 1193 applied[chunk.filename()].append(chunk)
1194 1194 elif newpatches is not None:
1195 1195 for newpatch in newpatches:
1196 1196 for newhunk in newpatch.hunks:
1197 1197 if fixoffset:
1198 1198 newhunk.toline += fixoffset
1199 1199 applied[newhunk.filename()].append(newhunk)
1200 1200 else:
1201 1201 fixoffset += chunk.removed - chunk.added
1202 1202 return (sum([h for h in applied.itervalues()
1203 1203 if h[0].special() or len(h) > 1], []), {})
1204 1204 class hunk(object):
1205 1205 def __init__(self, desc, num, lr, context):
1206 1206 self.number = num
1207 1207 self.desc = desc
1208 1208 self.hunk = [desc]
1209 1209 self.a = []
1210 1210 self.b = []
1211 1211 self.starta = self.lena = None
1212 1212 self.startb = self.lenb = None
1213 1213 if lr is not None:
1214 1214 if context:
1215 1215 self.read_context_hunk(lr)
1216 1216 else:
1217 1217 self.read_unified_hunk(lr)
1218 1218
1219 1219 def getnormalized(self):
1220 1220 """Return a copy with line endings normalized to LF."""
1221 1221
1222 1222 def normalize(lines):
1223 1223 nlines = []
1224 1224 for line in lines:
1225 1225 if line.endswith('\r\n'):
1226 1226 line = line[:-2] + '\n'
1227 1227 nlines.append(line)
1228 1228 return nlines
1229 1229
1230 1230 # Dummy object, it is rebuilt manually
1231 1231 nh = hunk(self.desc, self.number, None, None)
1232 1232 nh.number = self.number
1233 1233 nh.desc = self.desc
1234 1234 nh.hunk = self.hunk
1235 1235 nh.a = normalize(self.a)
1236 1236 nh.b = normalize(self.b)
1237 1237 nh.starta = self.starta
1238 1238 nh.startb = self.startb
1239 1239 nh.lena = self.lena
1240 1240 nh.lenb = self.lenb
1241 1241 return nh
1242 1242
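The normalize() helper above can be exercised on its own. Below is a minimal standalone sketch of the same CRLF-to-LF rewrite (the function name is illustrative, not part of this module):

```python
def normalize_eol(lines):
    """Rewrite CRLF line endings to LF, leaving other lines untouched."""
    nlines = []
    for line in lines:
        if line.endswith('\r\n'):
            # drop the '\r', keep the '\n'
            line = line[:-2] + '\n'
        nlines.append(line)
    return nlines
```

Note that a bare `'\r'` or a line with no terminator passes through unchanged, matching the conservative behavior of the original helper.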
1243 1243 def read_unified_hunk(self, lr):
1244 1244 m = unidesc.match(self.desc)
1245 1245 if not m:
1246 1246 raise PatchError(_("bad hunk #%d") % self.number)
1247 1247 self.starta, self.lena, self.startb, self.lenb = m.groups()
1248 1248 if self.lena is None:
1249 1249 self.lena = 1
1250 1250 else:
1251 1251 self.lena = int(self.lena)
1252 1252 if self.lenb is None:
1253 1253 self.lenb = 1
1254 1254 else:
1255 1255 self.lenb = int(self.lenb)
1256 1256 self.starta = int(self.starta)
1257 1257 self.startb = int(self.startb)
1258 1258 diffhelpers.addlines(lr, self.hunk, self.lena, self.lenb, self.a,
1259 1259 self.b)
1260 1260 # if we hit eof before finishing out the hunk, the last line will
1261 1261 # be zero length. Let's try to fix it up.
1262 1262 while len(self.hunk[-1]) == 0:
1263 1263 del self.hunk[-1]
1264 1264 del self.a[-1]
1265 1265 del self.b[-1]
1266 1266 self.lena -= 1
1267 1267 self.lenb -= 1
1268 1268 self._fixnewline(lr)
1269 1269
1270 1270 def read_context_hunk(self, lr):
1271 1271 self.desc = lr.readline()
1272 1272 m = contextdesc.match(self.desc)
1273 1273 if not m:
1274 1274 raise PatchError(_("bad hunk #%d") % self.number)
1275 1275 self.starta, aend = m.groups()
1276 1276 self.starta = int(self.starta)
1277 1277 if aend is None:
1278 1278 aend = self.starta
1279 1279 self.lena = int(aend) - self.starta
1280 1280 if self.starta:
1281 1281 self.lena += 1
1282 1282 for x in xrange(self.lena):
1283 1283 l = lr.readline()
1284 1284 if l.startswith('---'):
1285 1285 # lines addition, old block is empty
1286 1286 lr.push(l)
1287 1287 break
1288 1288 s = l[2:]
1289 1289 if l.startswith('- ') or l.startswith('! '):
1290 1290 u = '-' + s
1291 1291 elif l.startswith(' '):
1292 1292 u = ' ' + s
1293 1293 else:
1294 1294 raise PatchError(_("bad hunk #%d old text line %d") %
1295 1295 (self.number, x))
1296 1296 self.a.append(u)
1297 1297 self.hunk.append(u)
1298 1298
1299 1299 l = lr.readline()
1300 1300 if l.startswith('\ '):
1301 1301 s = self.a[-1][:-1]
1302 1302 self.a[-1] = s
1303 1303 self.hunk[-1] = s
1304 1304 l = lr.readline()
1305 1305 m = contextdesc.match(l)
1306 1306 if not m:
1307 1307 raise PatchError(_("bad hunk #%d") % self.number)
1308 1308 self.startb, bend = m.groups()
1309 1309 self.startb = int(self.startb)
1310 1310 if bend is None:
1311 1311 bend = self.startb
1312 1312 self.lenb = int(bend) - self.startb
1313 1313 if self.startb:
1314 1314 self.lenb += 1
1315 1315 hunki = 1
1316 1316 for x in xrange(self.lenb):
1317 1317 l = lr.readline()
1318 1318 if l.startswith('\ '):
1319 1319 # XXX: the only way to hit this is with an invalid line range.
1320 1320 # The no-eol marker is not counted in the line range, but some
1321 1321 # diff(1) implementations out there may behave differently.
1322 1322 s = self.b[-1][:-1]
1323 1323 self.b[-1] = s
1324 1324 self.hunk[hunki - 1] = s
1325 1325 continue
1326 1326 if not l:
1327 1327 # line deletions, new block is empty and we hit EOF
1328 1328 lr.push(l)
1329 1329 break
1330 1330 s = l[2:]
1331 1331 if l.startswith('+ ') or l.startswith('! '):
1332 1332 u = '+' + s
1333 1333 elif l.startswith(' '):
1334 1334 u = ' ' + s
1335 1335 elif len(self.b) == 0:
1336 1336 # line deletions, new block is empty
1337 1337 lr.push(l)
1338 1338 break
1339 1339 else:
1340 1340 raise PatchError(_("bad hunk #%d old text line %d") %
1341 1341 (self.number, x))
1342 1342 self.b.append(s)
1343 1343 while True:
1344 1344 if hunki >= len(self.hunk):
1345 1345 h = ""
1346 1346 else:
1347 1347 h = self.hunk[hunki]
1348 1348 hunki += 1
1349 1349 if h == u:
1350 1350 break
1351 1351 elif h.startswith('-'):
1352 1352 continue
1353 1353 else:
1354 1354 self.hunk.insert(hunki - 1, u)
1355 1355 break
1356 1356
1357 1357 if not self.a:
1358 1358 # this happens when lines were only added to the hunk
1359 1359 for x in self.hunk:
1360 1360 if x.startswith('-') or x.startswith(' '):
1361 1361 self.a.append(x)
1362 1362 if not self.b:
1363 1363 # this happens when lines were only deleted from the hunk
1364 1364 for x in self.hunk:
1365 1365 if x.startswith('+') or x.startswith(' '):
1366 1366 self.b.append(x[1:])
1367 1367 # @@ -start,len +start,len @@
1368 1368 self.desc = "@@ -%d,%d +%d,%d @@\n" % (self.starta, self.lena,
1369 1369 self.startb, self.lenb)
1370 1370 self.hunk[0] = self.desc
1371 1371 self._fixnewline(lr)
1372 1372
1373 1373 def _fixnewline(self, lr):
1374 1374 l = lr.readline()
1375 1375 if l.startswith('\ '):
1376 1376 diffhelpers.fix_newline(self.hunk, self.a, self.b)
1377 1377 else:
1378 1378 lr.push(l)
1379 1379
1380 1380 def complete(self):
1381 1381 return len(self.a) == self.lena and len(self.b) == self.lenb
1382 1382
1383 1383 def _fuzzit(self, old, new, fuzz, toponly):
1384 1384 # this removes context lines from the top and bottom of the old and
1385 1385 # new line lists. It checks the hunk to make sure only context lines
1386 1386 # are removed, and then returns the shortened lists of lines.
1387 1387 fuzz = min(fuzz, len(old))
1388 1388 if fuzz:
1389 1389 top = 0
1390 1390 bot = 0
1391 1391 hlen = len(self.hunk)
1392 1392 for x in xrange(hlen - 1):
1393 1393 # the hunk starts with the @@ line, so use x+1
1394 1394 if self.hunk[x + 1][0] == ' ':
1395 1395 top += 1
1396 1396 else:
1397 1397 break
1398 1398 if not toponly:
1399 1399 for x in xrange(hlen - 1):
1400 1400 if self.hunk[hlen - bot - 1][0] == ' ':
1401 1401 bot += 1
1402 1402 else:
1403 1403 break
1404 1404
1405 1405 bot = min(fuzz, bot)
1406 1406 top = min(fuzz, top)
1407 1407 return old[top:len(old) - bot], new[top:len(new) - bot], top
1408 1408 return old, new, 0
1409 1409
1410 1410 def fuzzit(self, fuzz, toponly):
1411 1411 old, new, top = self._fuzzit(self.a, self.b, fuzz, toponly)
1412 1412 oldstart = self.starta + top
1413 1413 newstart = self.startb + top
1414 1414 # zero length hunk ranges already have their start decremented
1415 1415 if self.lena and oldstart > 0:
1416 1416 oldstart -= 1
1417 1417 if self.lenb and newstart > 0:
1418 1418 newstart -= 1
1419 1419 return old, oldstart, new, newstart
1420 1420
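The trimming that _fuzzit() performs can be sketched independently of the hunk class: count leading and trailing context lines in the hunk body, cap each count at the fuzz factor, and slice them off both line lists. This is a simplified standalone version (names and the `body` parameter are illustrative; upstream walks `self.hunk` past the @@ line instead):

```python
def trim_fuzz(old, new, body, fuzz, toponly=False):
    """Drop up to `fuzz` context lines from the top (and, unless
    toponly is set, the bottom) of the old/new line lists.

    `body` is the hunk text without the @@ header line; context
    lines are the ones starting with a space.
    """
    fuzz = min(fuzz, len(old))
    if not fuzz:
        return old, new, 0
    top = 0
    for line in body:                 # leading context run
        if line[:1] == ' ':
            top += 1
        else:
            break
    bot = 0
    if not toponly:
        for line in reversed(body):   # trailing context run
            if line[:1] == ' ':
                bot += 1
            else:
                break
    top = min(fuzz, top)
    bot = min(fuzz, bot)
    return old[top:len(old) - bot], new[top:len(new) - bot], top
```

The returned `top` is what fuzzit() uses to shift the hunk's start lines.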
1421 1421 class binhunk(object):
1422 1422 'A binary patch file.'
1423 1423 def __init__(self, lr, fname):
1424 1424 self.text = None
1425 1425 self.delta = False
1426 1426 self.hunk = ['GIT binary patch\n']
1427 1427 self._fname = fname
1428 1428 self._read(lr)
1429 1429
1430 1430 def complete(self):
1431 1431 return self.text is not None
1432 1432
1433 1433 def new(self, lines):
1434 1434 if self.delta:
1435 1435 return [applybindelta(self.text, ''.join(lines))]
1436 1436 return [self.text]
1437 1437
1438 1438 def _read(self, lr):
1439 1439 def getline(lr, hunk):
1440 1440 l = lr.readline()
1441 1441 hunk.append(l)
1442 1442 return l.rstrip('\r\n')
1443 1443
1444 1444 size = 0
1445 1445 while True:
1446 1446 line = getline(lr, self.hunk)
1447 1447 if not line:
1448 1448 raise PatchError(_('could not extract "%s" binary data')
1449 1449 % self._fname)
1450 1450 if line.startswith('literal '):
1451 1451 size = int(line[8:].rstrip())
1452 1452 break
1453 1453 if line.startswith('delta '):
1454 1454 size = int(line[6:].rstrip())
1455 1455 self.delta = True
1456 1456 break
1457 1457 dec = []
1458 1458 line = getline(lr, self.hunk)
1459 1459 while len(line) > 1:
1460 1460 l = line[0:1]
1461 1461 if l <= 'Z' and l >= 'A':
1462 1462 l = ord(l) - ord('A') + 1
1463 1463 else:
1464 1464 l = ord(l) - ord('a') + 27
1465 1465 try:
1466 1466 dec.append(util.b85decode(line[1:])[:l])
1467 1467 except ValueError as e:
1468 1468 raise PatchError(_('could not decode "%s" binary patch: %s')
1469 1469 % (self._fname, stringutil.forcebytestr(e)))
1470 1470 line = getline(lr, self.hunk)
1471 1471 text = zlib.decompress(''.join(dec))
1472 1472 if len(text) != size:
1473 1473 raise PatchError(_('"%s" length is %d bytes, should be %d')
1474 1474 % (self._fname, len(text), size))
1475 1475 self.text = text
1476 1476
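Each data line of a "GIT binary patch" hunk starts with a length character ('A'-'Z' for 1-26 bytes, 'a'-'z' for 27-52) followed by a base85 payload, which is what the decode loop above handles. Python's base64.b85decode uses the same alphabet as git, so a single line can be decoded in isolation; this is a sketch, not the module's API (git pads the raw bytes to 4-byte groups before encoding, hence the trim):

```python
import base64

def decode_binary_line(line):
    """Decode one data line of a git binary patch: length char + base85."""
    lc = line[0:1]
    if 'A' <= lc <= 'Z':
        nbytes = ord(lc) - ord('A') + 1
    else:
        nbytes = ord(lc) - ord('a') + 27
    # the payload decodes to a multiple of 4 bytes; trim to the
    # declared length, as the [:l] slice above does
    return base64.b85decode(line[1:])[:nbytes]
```
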
1477 1477 def parsefilename(str):
1478 1478 # --- filename \t|space stuff
1479 1479 s = str[4:].rstrip('\r\n')
1480 1480 i = s.find('\t')
1481 1481 if i < 0:
1482 1482 i = s.find(' ')
1483 1483 if i < 0:
1484 1484 return s
1485 1485 return s[:i]
1486 1486
1487 1487 def reversehunks(hunks):
1488 1488 '''reverse the signs in the hunks given as argument
1489 1489
1490 1490 This function operates on hunks coming out of patch.filterpatch, that is
1491 1491 a list of the form: [header1, hunk1, hunk2, header2...]. Example usage:
1492 1492
1493 1493 >>> rawpatch = b"""diff --git a/folder1/g b/folder1/g
1494 1494 ... --- a/folder1/g
1495 1495 ... +++ b/folder1/g
1496 1496 ... @@ -1,7 +1,7 @@
1497 1497 ... +firstline
1498 1498 ... c
1499 1499 ... 1
1500 1500 ... 2
1501 1501 ... + 3
1502 1502 ... -4
1503 1503 ... 5
1504 1504 ... d
1505 1505 ... +lastline"""
1506 1506 >>> hunks = parsepatch([rawpatch])
1507 1507 >>> hunkscomingfromfilterpatch = []
1508 1508 >>> for h in hunks:
1509 1509 ... hunkscomingfromfilterpatch.append(h)
1510 1510 ... hunkscomingfromfilterpatch.extend(h.hunks)
1511 1511
1512 1512 >>> reversedhunks = reversehunks(hunkscomingfromfilterpatch)
1513 1513 >>> from . import util
1514 1514 >>> fp = util.stringio()
1515 1515 >>> for c in reversedhunks:
1516 1516 ... c.write(fp)
1517 1517 >>> fp.seek(0) or None
1518 1518 >>> reversedpatch = fp.read()
1519 1519 >>> print(pycompat.sysstr(reversedpatch))
1520 1520 diff --git a/folder1/g b/folder1/g
1521 1521 --- a/folder1/g
1522 1522 +++ b/folder1/g
1523 1523 @@ -1,4 +1,3 @@
1524 1524 -firstline
1525 1525 c
1526 1526 1
1527 1527 2
1528 1528 @@ -2,6 +1,6 @@
1529 1529 c
1530 1530 1
1531 1531 2
1532 1532 - 3
1533 1533 +4
1534 1534 5
1535 1535 d
1536 1536 @@ -6,3 +5,2 @@
1537 1537 5
1538 1538 d
1539 1539 -lastline
1540 1540
1541 1541 '''
1542 1542
1543 1543 newhunks = []
1544 1544 for c in hunks:
1545 1545 if util.safehasattr(c, 'reversehunk'):
1546 1546 c = c.reversehunk()
1547 1547 newhunks.append(c)
1548 1548 return newhunks
1549 1549
1550 1550 def parsepatch(originalchunks, maxcontext=None):
1551 1551 """patch -> [] of headers -> [] of hunks
1552 1552
1553 1553 If maxcontext is not None, trim context lines if necessary.
1554 1554
1555 1555 >>> rawpatch = b'''diff --git a/folder1/g b/folder1/g
1556 1556 ... --- a/folder1/g
1557 1557 ... +++ b/folder1/g
1558 1558 ... @@ -1,8 +1,10 @@
1559 1559 ... 1
1560 1560 ... 2
1561 1561 ... -3
1562 1562 ... 4
1563 1563 ... 5
1564 1564 ... 6
1565 1565 ... +6.1
1566 1566 ... +6.2
1567 1567 ... 7
1568 1568 ... 8
1569 1569 ... +9'''
1570 1570 >>> out = util.stringio()
1571 1571 >>> headers = parsepatch([rawpatch], maxcontext=1)
1572 1572 >>> for header in headers:
1573 1573 ... header.write(out)
1574 1574 ... for hunk in header.hunks:
1575 1575 ... hunk.write(out)
1576 1576 >>> print(pycompat.sysstr(out.getvalue()))
1577 1577 diff --git a/folder1/g b/folder1/g
1578 1578 --- a/folder1/g
1579 1579 +++ b/folder1/g
1580 1580 @@ -2,3 +2,2 @@
1581 1581 2
1582 1582 -3
1583 1583 4
1584 1584 @@ -6,2 +5,4 @@
1585 1585 6
1586 1586 +6.1
1587 1587 +6.2
1588 1588 7
1589 1589 @@ -8,1 +9,2 @@
1590 1590 8
1591 1591 +9
1592 1592 """
1593 1593 class parser(object):
1594 1594 """patch parsing state machine"""
1595 1595 def __init__(self):
1596 1596 self.fromline = 0
1597 1597 self.toline = 0
1598 1598 self.proc = ''
1599 1599 self.header = None
1600 1600 self.context = []
1601 1601 self.before = []
1602 1602 self.hunk = []
1603 1603 self.headers = []
1604 1604
1605 1605 def addrange(self, limits):
1606 1606 fromstart, fromend, tostart, toend, proc = limits
1607 1607 self.fromline = int(fromstart)
1608 1608 self.toline = int(tostart)
1609 1609 self.proc = proc
1610 1610
1611 1611 def addcontext(self, context):
1612 1612 if self.hunk:
1613 1613 h = recordhunk(self.header, self.fromline, self.toline,
1614 1614 self.proc, self.before, self.hunk, context, maxcontext)
1615 1615 self.header.hunks.append(h)
1616 1616 self.fromline += len(self.before) + h.removed
1617 1617 self.toline += len(self.before) + h.added
1618 1618 self.before = []
1619 1619 self.hunk = []
1620 1620 self.context = context
1621 1621
1622 1622 def addhunk(self, hunk):
1623 1623 if self.context:
1624 1624 self.before = self.context
1625 1625 self.context = []
1626 1626 self.hunk = hunk
1627 1627
1628 1628 def newfile(self, hdr):
1629 1629 self.addcontext([])
1630 1630 h = header(hdr)
1631 1631 self.headers.append(h)
1632 1632 self.header = h
1633 1633
1634 1634 def addother(self, line):
1635 1635 pass # 'other' lines are ignored
1636 1636
1637 1637 def finished(self):
1638 1638 self.addcontext([])
1639 1639 return self.headers
1640 1640
1641 1641 transitions = {
1642 1642 'file': {'context': addcontext,
1643 1643 'file': newfile,
1644 1644 'hunk': addhunk,
1645 1645 'range': addrange},
1646 1646 'context': {'file': newfile,
1647 1647 'hunk': addhunk,
1648 1648 'range': addrange,
1649 1649 'other': addother},
1650 1650 'hunk': {'context': addcontext,
1651 1651 'file': newfile,
1652 1652 'range': addrange},
1653 1653 'range': {'context': addcontext,
1654 1654 'hunk': addhunk},
1655 1655 'other': {'other': addother},
1656 1656 }
1657 1657
1658 1658 p = parser()
1659 1659 fp = stringio()
1660 1660 fp.write(''.join(originalchunks))
1661 1661 fp.seek(0)
1662 1662
1663 1663 state = 'context'
1664 1664 for newstate, data in scanpatch(fp):
1665 1665 try:
1666 1666 p.transitions[state][newstate](p, data)
1667 1667 except KeyError:
1668 1668 raise PatchError('unhandled transition: %s -> %s' %
1669 1669 (state, newstate))
1670 1670 state = newstate
1671 1671 del fp
1672 1672 return p.finished()
1673 1673
1674 1674 def pathtransform(path, strip, prefix):
1675 1675 '''turn a path from a patch into a path suitable for the repository
1676 1676
1677 1677 prefix, if not empty, is expected to be normalized with a / at the end.
1678 1678
1679 1679 Returns (stripped components, path in repository).
1680 1680
1681 1681 >>> pathtransform(b'a/b/c', 0, b'')
1682 1682 ('', 'a/b/c')
1683 1683 >>> pathtransform(b' a/b/c ', 0, b'')
1684 1684 ('', ' a/b/c')
1685 1685 >>> pathtransform(b' a/b/c ', 2, b'')
1686 1686 ('a/b/', 'c')
1687 1687 >>> pathtransform(b'a/b/c', 0, b'd/e/')
1688 1688 ('', 'd/e/a/b/c')
1689 1689 >>> pathtransform(b' a//b/c ', 2, b'd/e/')
1690 1690 ('a//b/', 'd/e/c')
1691 1691 >>> pathtransform(b'a/b/c', 3, b'')
1692 1692 Traceback (most recent call last):
1693 1693 PatchError: unable to strip away 1 of 3 dirs from a/b/c
1694 1694 '''
1695 1695 pathlen = len(path)
1696 1696 i = 0
1697 1697 if strip == 0:
1698 1698 return '', prefix + path.rstrip()
1699 1699 count = strip
1700 1700 while count > 0:
1701 1701 i = path.find('/', i)
1702 1702 if i == -1:
1703 1703 raise PatchError(_("unable to strip away %d of %d dirs from %s") %
1704 1704 (count, strip, path))
1705 1705 i += 1
1706 1706 # consume '//' in the path
1707 1707 while i < pathlen - 1 and path[i:i + 1] == '/':
1708 1708 i += 1
1709 1709 count -= 1
1710 1710 return path[:i].lstrip(), prefix + path[i:].rstrip()
1711 1711
1712 1712 def makepatchmeta(backend, afile_orig, bfile_orig, hunk, strip, prefix):
1713 1713 nulla = afile_orig == "/dev/null"
1714 1714 nullb = bfile_orig == "/dev/null"
1715 1715 create = nulla and hunk.starta == 0 and hunk.lena == 0
1716 1716 remove = nullb and hunk.startb == 0 and hunk.lenb == 0
1717 1717 abase, afile = pathtransform(afile_orig, strip, prefix)
1718 1718 gooda = not nulla and backend.exists(afile)
1719 1719 bbase, bfile = pathtransform(bfile_orig, strip, prefix)
1720 1720 if afile == bfile:
1721 1721 goodb = gooda
1722 1722 else:
1723 1723 goodb = not nullb and backend.exists(bfile)
1724 1724 missing = not goodb and not gooda and not create
1725 1725
1726 1726 # some diff programs apparently produce patches where the afile is
1727 1727 # not /dev/null, but afile starts with bfile
1728 1728 abasedir = afile[:afile.rfind('/') + 1]
1729 1729 bbasedir = bfile[:bfile.rfind('/') + 1]
1730 1730 if (missing and abasedir == bbasedir and afile.startswith(bfile)
1731 1731 and hunk.starta == 0 and hunk.lena == 0):
1732 1732 create = True
1733 1733 missing = False
1734 1734
1735 1735 # If afile is "a/b/foo" and bfile is "a/b/foo.orig" we assume the
1736 1736 # diff is between a file and its backup. In this case, the original
1737 1737 # file should be patched (see original mpatch code).
1738 1738 isbackup = (abase == bbase and bfile.startswith(afile))
1739 1739 fname = None
1740 1740 if not missing:
1741 1741 if gooda and goodb:
1742 1742 if isbackup:
1743 1743 fname = afile
1744 1744 else:
1745 1745 fname = bfile
1746 1746 elif gooda:
1747 1747 fname = afile
1748 1748
1749 1749 if not fname:
1750 1750 if not nullb:
1751 1751 if isbackup:
1752 1752 fname = afile
1753 1753 else:
1754 1754 fname = bfile
1755 1755 elif not nulla:
1756 1756 fname = afile
1757 1757 else:
1758 1758 raise PatchError(_("undefined source and destination files"))
1759 1759
1760 1760 gp = patchmeta(fname)
1761 1761 if create:
1762 1762 gp.op = 'ADD'
1763 1763 elif remove:
1764 1764 gp.op = 'DELETE'
1765 1765 return gp
1766 1766
1767 1767 def scanpatch(fp):
1768 1768 """like patch.iterhunks, but yield different events
1769 1769
1770 1770 - ('file', [header_lines + fromfile + tofile])
1771 1771 - ('context', [context_lines])
1772 1772 - ('hunk', [hunk_lines])
1773 1773 - ('range', (-start,len, +start,len, proc))
1774 1774 """
1775 1775 lines_re = re.compile(br'@@ -(\d+),(\d+) \+(\d+),(\d+) @@\s*(.*)')
1776 1776 lr = linereader(fp)
1777 1777
1778 1778 def scanwhile(first, p):
1779 1779 """scan lr while predicate holds"""
1780 1780 lines = [first]
1781 1781 for line in iter(lr.readline, ''):
1782 1782 if p(line):
1783 1783 lines.append(line)
1784 1784 else:
1785 1785 lr.push(line)
1786 1786 break
1787 1787 return lines
1788 1788
1789 1789 for line in iter(lr.readline, ''):
1790 1790 if line.startswith('diff --git a/') or line.startswith('diff -r '):
1791 1791 def notheader(line):
1792 1792 s = line.split(None, 1)
1793 1793 return not s or s[0] not in ('---', 'diff')
1794 1794 header = scanwhile(line, notheader)
1795 1795 fromfile = lr.readline()
1796 1796 if fromfile.startswith('---'):
1797 1797 tofile = lr.readline()
1798 1798 header += [fromfile, tofile]
1799 1799 else:
1800 1800 lr.push(fromfile)
1801 1801 yield 'file', header
1802 1802 elif line[0:1] == ' ':
1803 1803 yield 'context', scanwhile(line, lambda l: l[0] in ' \\')
1804 1804 elif line[0] in '-+':
1805 1805 yield 'hunk', scanwhile(line, lambda l: l[0] in '-+\\')
1806 1806 else:
1807 1807 m = lines_re.match(line)
1808 1808 if m:
1809 1809 yield 'range', m.groups()
1810 1810 else:
1811 1811 yield 'other', line
1812 1812
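The four event kinds above can be demonstrated with a much-simplified standalone scanner. This sketch classifies one line at a time, so unlike scanpatch() it emits a 'file' event per header line rather than grouping them, and it omits the pushback reader and git-header handling (everything here is illustrative):

```python
import re

_range_re = re.compile(r'@@ -(\d+),(\d+) \+(\d+),(\d+) @@\s*(.*)')

def scan_events(lines):
    """Classify unified-diff lines into (kind, payload) events."""
    events = []
    for line in lines:
        if line.startswith('diff ') or line.startswith(('--- ', '+++ ')):
            events.append(('file', line))
            continue
        m = _range_re.match(line)
        if m:
            events.append(('range', m.groups()))
        elif line[:1] == ' ':
            events.append(('context', line))
        elif line[:1] in ('-', '+'):
            events.append(('hunk', line))
        else:
            events.append(('other', line))
    return events
```
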
1813 1813 def scangitpatch(lr, firstline):
1814 1814 """
1815 1815 Git patches can emit:
1816 1816 - rename a to b
1817 1817 - change b
1818 1818 - copy a to c
1819 1819 - change c
1820 1820
1821 1821 We cannot apply this sequence as-is: the renamed 'a' could not be
1822 1822 found, as it would have been renamed already. And we cannot copy
1823 1823 from 'b' instead because 'b' would have been changed already. So
1824 1824 we scan the git patch for copy and rename commands so we can
1825 1825 perform the copies ahead of time.
1826 1826 """
1827 1827 pos = 0
1828 1828 try:
1829 1829 pos = lr.fp.tell()
1830 1830 fp = lr.fp
1831 1831 except IOError:
1832 1832 fp = stringio(lr.fp.read())
1833 1833 gitlr = linereader(fp)
1834 1834 gitlr.push(firstline)
1835 1835 gitpatches = readgitpatch(gitlr)
1836 1836 fp.seek(pos)
1837 1837 return gitpatches
1838 1838
1839 1839 def iterhunks(fp):
1840 1840 """Read a patch and yield the following events:
1841 1841 - ("file", afile, bfile, firsthunk): select a new target file.
1842 1842 - ("hunk", hunk): a new hunk is ready to be applied, follows a
1843 1843 "file" event.
1844 1844 - ("git", gitchanges): current diff is in git format, gitchanges
1845 1845 maps filenames to gitpatch records. Unique event.
1846 1846 """
1847 1847 afile = ""
1848 1848 bfile = ""
1849 1849 state = None
1850 1850 hunknum = 0
1851 1851 emitfile = newfile = False
1852 1852 gitpatches = None
1853 1853
1854 1854 # our states
1855 1855 BFILE = 1
1856 1856 context = None
1857 1857 lr = linereader(fp)
1858 1858
1859 1859 for x in iter(lr.readline, ''):
1860 1860 if state == BFILE and (
1861 1861 (not context and x.startswith('@'))
1862 1862 or (context is not False and x.startswith('***************'))
1863 1863 or x.startswith('GIT binary patch')):
1864 1864 gp = None
1865 1865 if (gitpatches and
1866 1866 gitpatches[-1].ispatching(afile, bfile)):
1867 1867 gp = gitpatches.pop()
1868 1868 if x.startswith('GIT binary patch'):
1869 1869 h = binhunk(lr, gp.path)
1870 1870 else:
1871 1871 if context is None and x.startswith('***************'):
1872 1872 context = True
1873 1873 h = hunk(x, hunknum + 1, lr, context)
1874 1874 hunknum += 1
1875 1875 if emitfile:
1876 1876 emitfile = False
1877 1877 yield 'file', (afile, bfile, h, gp and gp.copy() or None)
1878 1878 yield 'hunk', h
1879 1879 elif x.startswith('diff --git a/'):
1880 1880 m = gitre.match(x.rstrip(' \r\n'))
1881 1881 if not m:
1882 1882 continue
1883 1883 if gitpatches is None:
1884 1884 # scan whole input for git metadata
1885 1885 gitpatches = scangitpatch(lr, x)
1886 1886 yield 'git', [g.copy() for g in gitpatches
1887 1887 if g.op in ('COPY', 'RENAME')]
1888 1888 gitpatches.reverse()
1889 1889 afile = 'a/' + m.group(1)
1890 1890 bfile = 'b/' + m.group(2)
1891 1891 while gitpatches and not gitpatches[-1].ispatching(afile, bfile):
1892 1892 gp = gitpatches.pop()
1893 1893 yield 'file', ('a/' + gp.path, 'b/' + gp.path, None, gp.copy())
1894 1894 if not gitpatches:
1895 1895 raise PatchError(_('failed to synchronize metadata for "%s"')
1896 1896 % afile[2:])
1897 1897 gp = gitpatches[-1]
1898 1898 newfile = True
1899 1899 elif x.startswith('---'):
1900 1900 # check for a unified diff
1901 1901 l2 = lr.readline()
1902 1902 if not l2.startswith('+++'):
1903 1903 lr.push(l2)
1904 1904 continue
1905 1905 newfile = True
1906 1906 context = False
1907 1907 afile = parsefilename(x)
1908 1908 bfile = parsefilename(l2)
1909 1909 elif x.startswith('***'):
1910 1910 # check for a context diff
1911 1911 l2 = lr.readline()
1912 1912 if not l2.startswith('---'):
1913 1913 lr.push(l2)
1914 1914 continue
1915 1915 l3 = lr.readline()
1916 1916 lr.push(l3)
1917 1917 if not l3.startswith("***************"):
1918 1918 lr.push(l2)
1919 1919 continue
1920 1920 newfile = True
1921 1921 context = True
1922 1922 afile = parsefilename(x)
1923 1923 bfile = parsefilename(l2)
1924 1924
1925 1925 if newfile:
1926 1926 newfile = False
1927 1927 emitfile = True
1928 1928 state = BFILE
1929 1929 hunknum = 0
1930 1930
1931 1931 while gitpatches:
1932 1932 gp = gitpatches.pop()
1933 1933 yield 'file', ('a/' + gp.path, 'b/' + gp.path, None, gp.copy())
1934 1934
1935 1935 def applybindelta(binchunk, data):
1936 1936 """Apply a binary delta hunk
1937 1937 The algorithm used is the algorithm from git's patch-delta.c
1938 1938 """
1939 1939 def deltahead(binchunk):
1940 1940 i = 0
1941 1941 for c in binchunk:
1942 1942 i += 1
1943 1943 if not (ord(c) & 0x80):
1944 1944 return i
1945 1945 return i
1946 1946 out = ""
1947 1947 s = deltahead(binchunk)
1948 1948 binchunk = binchunk[s:]
1949 1949 s = deltahead(binchunk)
1950 1950 binchunk = binchunk[s:]
1951 1951 i = 0
1952 1952 while i < len(binchunk):
1953 1953 cmd = ord(binchunk[i])
1954 1954 i += 1
1955 1955 if (cmd & 0x80):
1956 1956 offset = 0
1957 1957 size = 0
1958 1958 if (cmd & 0x01):
1959 1959 offset = ord(binchunk[i])
1960 1960 i += 1
1961 1961 if (cmd & 0x02):
1962 1962 offset |= ord(binchunk[i]) << 8
1963 1963 i += 1
1964 1964 if (cmd & 0x04):
1965 1965 offset |= ord(binchunk[i]) << 16
1966 1966 i += 1
1967 1967 if (cmd & 0x08):
1968 1968 offset |= ord(binchunk[i]) << 24
1969 1969 i += 1
1970 1970 if (cmd & 0x10):
1971 1971 size = ord(binchunk[i])
1972 1972 i += 1
1973 1973 if (cmd & 0x20):
1974 1974 size |= ord(binchunk[i]) << 8
1975 1975 i += 1
1976 1976 if (cmd & 0x40):
1977 1977 size |= ord(binchunk[i]) << 16
1978 1978 i += 1
1979 1979 if size == 0:
1980 1980 size = 0x10000
1981 1981 offset_end = offset + size
1982 1982 out += data[offset:offset_end]
1983 1983 elif cmd != 0:
1984 1984 offset_end = i + cmd
1985 1985 out += binchunk[i:offset_end]
1986 1986 i += cmd
1987 1987 else:
1988 1988 raise PatchError(_('unexpected delta opcode 0'))
1989 1989 return out
1990 1990
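The delta format consumed above is git's: two varints (source and target sizes, which deltahead() skips) followed by a stream of opcodes, where a set high bit means "copy offset/size bytes from the source" (the low flag bits select which offset/size bytes follow) and 1-127 means "insert the next N literal bytes". A handcrafted delta exercises the copy+insert path; this is a Python 3, bytes-based sketch of the same algorithm (upstream operates on Python 2 str):

```python
def apply_delta(src, delta):
    """Apply a git-style binary delta to src (copy/insert opcodes)."""
    def skip_varint(buf, i):
        # size header: 7 bits per byte, high bit = continuation
        while buf[i] & 0x80:
            i += 1
        return i + 1
    i = skip_varint(delta, 0)      # source size (not validated here)
    i = skip_varint(delta, i)      # target size (not validated here)
    out = bytearray()
    while i < len(delta):
        cmd = delta[i]; i += 1
        if cmd & 0x80:             # copy from source
            offset = size = 0
            for bit, shift in ((0x01, 0), (0x02, 8), (0x04, 16), (0x08, 24)):
                if cmd & bit:
                    offset |= delta[i] << shift; i += 1
            for bit, shift in ((0x10, 0), (0x20, 8), (0x40, 16)):
                if cmd & bit:
                    size |= delta[i] << shift; i += 1
            size = size or 0x10000  # zero size means 64KiB
            out += src[offset:offset + size]
        elif cmd:                  # insert cmd literal bytes
            out += delta[i:i + cmd]; i += cmd
        else:
            raise ValueError('unexpected delta opcode 0')
    return bytes(out)
```

The example delta below says: source and target are both 11 bytes; copy 6 bytes from offset 0 of the source; then insert the 5 literal bytes `there`.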
1991 1991 def applydiff(ui, fp, backend, store, strip=1, prefix='', eolmode='strict'):
1992 1992 """Reads a patch from fp and tries to apply it.
1993 1993
1994 1994 Returns 0 for a clean patch, -1 if any rejects were found and 1 if
1995 1995 there was any fuzz.
1996 1996
1997 1997 If 'eolmode' is 'strict', the patch content and patched file are
1998 1998 read in binary mode. Otherwise, line endings are ignored when
1999 1999 patching then normalized according to 'eolmode'.
2000 2000 """
2001 2001 return _applydiff(ui, fp, patchfile, backend, store, strip=strip,
2002 2002 prefix=prefix, eolmode=eolmode)
2003 2003
2004 2004 def _canonprefix(repo, prefix):
2005 2005 if prefix:
2006 2006 prefix = pathutil.canonpath(repo.root, repo.getcwd(), prefix)
2007 2007 if prefix != '':
2008 2008 prefix += '/'
2009 2009 return prefix
2010 2010
2011 2011 def _applydiff(ui, fp, patcher, backend, store, strip=1, prefix='',
2012 2012 eolmode='strict'):
2013 2013 prefix = _canonprefix(backend.repo, prefix)
2014 2014 def pstrip(p):
2015 2015 return pathtransform(p, strip - 1, prefix)[1]
2016 2016
2017 2017 rejects = 0
2018 2018 err = 0
2019 2019 current_file = None
2020 2020
2021 2021 for state, values in iterhunks(fp):
2022 2022 if state == 'hunk':
2023 2023 if not current_file:
2024 2024 continue
2025 2025 ret = current_file.apply(values)
2026 2026 if ret > 0:
2027 2027 err = 1
2028 2028 elif state == 'file':
2029 2029 if current_file:
2030 2030 rejects += current_file.close()
2031 2031 current_file = None
2032 2032 afile, bfile, first_hunk, gp = values
2033 2033 if gp:
2034 2034 gp.path = pstrip(gp.path)
2035 2035 if gp.oldpath:
2036 2036 gp.oldpath = pstrip(gp.oldpath)
2037 2037 else:
2038 2038 gp = makepatchmeta(backend, afile, bfile, first_hunk, strip,
2039 2039 prefix)
2040 2040 if gp.op == 'RENAME':
2041 2041 backend.unlink(gp.oldpath)
2042 2042 if not first_hunk:
2043 2043 if gp.op == 'DELETE':
2044 2044 backend.unlink(gp.path)
2045 2045 continue
2046 2046 data, mode = None, None
2047 2047 if gp.op in ('RENAME', 'COPY'):
2048 2048 data, mode = store.getfile(gp.oldpath)[:2]
2049 2049 if data is None:
2050 2050 # This means that the old path does not exist
2051 2051 raise PatchError(_("source file '%s' does not exist")
2052 2052 % gp.oldpath)
2053 2053 if gp.mode:
2054 2054 mode = gp.mode
2055 2055 if gp.op == 'ADD':
2056 2056 # Added files without content have no hunk and
2057 2057 # must be created
2058 2058 data = ''
2059 2059 if data or mode:
2060 2060 if (gp.op in ('ADD', 'RENAME', 'COPY')
2061 2061 and backend.exists(gp.path)):
2062 2062 raise PatchError(_("cannot create %s: destination "
2063 2063 "already exists") % gp.path)
2064 2064 backend.setfile(gp.path, data, mode, gp.oldpath)
2065 2065 continue
2066 2066 try:
2067 2067 current_file = patcher(ui, gp, backend, store,
2068 2068 eolmode=eolmode)
2069 2069 except PatchError as inst:
2070 2070 ui.warn(str(inst) + '\n')
2071 2071 current_file = None
2072 2072 rejects += 1
2073 2073 continue
2074 2074 elif state == 'git':
2075 2075 for gp in values:
2076 2076 path = pstrip(gp.oldpath)
2077 2077 data, mode = backend.getfile(path)
2078 2078 if data is None:
2079 2079 # The error ignored here will trigger a getfile()
2080 2080 # error in a place more appropriate for error
2081 2081 # handling, and will not interrupt the patching
2082 2082 # process.
2083 2083 pass
2084 2084 else:
2085 2085 store.setfile(path, data, mode)
2086 2086 else:
2087 2087 raise error.Abort(_('unsupported parser state: %s') % state)
2088 2088
2089 2089 if current_file:
2090 2090 rejects += current_file.close()
2091 2091
2092 2092 if rejects:
2093 2093 return -1
2094 2094 return err
2095 2095
2096 2096 def _externalpatch(ui, repo, patcher, patchname, strip, files,
2097 2097 similarity):
2098 2098 """use <patcher> to apply <patchname> to the working directory.
2099 2099 returns whether patch was applied with fuzz factor."""
2100 2100
2101 2101 fuzz = False
2102 2102 args = []
2103 2103 cwd = repo.root
2104 2104 if cwd:
2105 2105 args.append('-d %s' % procutil.shellquote(cwd))
2106 fp = procutil.popen('%s %s -p%d < %s' % (patcher, ' '.join(args), strip,
2107 procutil.shellquote(patchname)))
2106 cmd = ('%s %s -p%d < %s'
2107 % (patcher, ' '.join(args), strip, procutil.shellquote(patchname)))
2108 fp = procutil.popen(cmd, 'rb')
2108 2109 try:
2109 2110 for line in util.iterfile(fp):
2110 2111 line = line.rstrip()
2111 2112 ui.note(line + '\n')
2112 2113 if line.startswith('patching file '):
2113 2114 pf = util.parsepatchoutput(line)
2114 2115 printed_file = False
2115 2116 files.add(pf)
2116 2117 elif line.find('with fuzz') >= 0:
2117 2118 fuzz = True
2118 2119 if not printed_file:
2119 2120 ui.warn(pf + '\n')
2120 2121 printed_file = True
2121 2122 ui.warn(line + '\n')
2122 2123 elif line.find('saving rejects to file') >= 0:
2123 2124 ui.warn(line + '\n')
2124 2125 elif line.find('FAILED') >= 0:
2125 2126 if not printed_file:
2126 2127 ui.warn(pf + '\n')
2127 2128 printed_file = True
2128 2129 ui.warn(line + '\n')
2129 2130 finally:
2130 2131 if files:
2131 2132 scmutil.marktouched(repo, files, similarity)
2132 2133 code = fp.close()
2133 2134 if code:
2134 2135 raise PatchError(_("patch command failed: %s") %
2135 2136 procutil.explainexit(code)[0])
2136 2137 return fuzz
2137 2138
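Opening the pipe in binary mode ('rb' in the popen() call above, per this changeset) matters because the external patcher's output is handled as bytes throughout. The same concern can be shown with plain subprocess: a pipe without a text/encoding option is binary and yields bytes, so line handling must use bytes operations, as the loop above does. A small sketch:

```python
import subprocess
import sys

# Run a child that writes one line; with stdout=PIPE and no text
# mode requested, the pipe is binary and communicate() returns bytes.
proc = subprocess.Popen(
    [sys.executable, '-c', "import sys; sys.stdout.write('patching file a\\n')"],
    stdout=subprocess.PIPE)
out, _ = proc.communicate()
# rstrip() on bytes strips b'\r\n' too, so this is platform-safe
line = out.rstrip()
```
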
2138 2139 def patchbackend(ui, backend, patchobj, strip, prefix, files=None,
2139 2140 eolmode='strict'):
2140 2141 if files is None:
2141 2142 files = set()
2142 2143 if eolmode is None:
2143 2144 eolmode = ui.config('patch', 'eol')
2144 2145 if eolmode.lower() not in eolmodes:
2145 2146 raise error.Abort(_('unsupported line endings type: %s') % eolmode)
2146 2147 eolmode = eolmode.lower()
2147 2148
2148 2149 store = filestore()
2149 2150 try:
2150 2151 fp = open(patchobj, 'rb')
2151 2152 except TypeError:
2152 2153 fp = patchobj
2153 2154 try:
2154 2155 ret = applydiff(ui, fp, backend, store, strip=strip, prefix=prefix,
2155 2156 eolmode=eolmode)
2156 2157 finally:
2157 2158 if fp != patchobj:
2158 2159 fp.close()
2159 2160 files.update(backend.close())
2160 2161 store.close()
2161 2162 if ret < 0:
2162 2163 raise PatchError(_('patch failed to apply'))
2163 2164 return ret > 0
2164 2165
2165 2166 def internalpatch(ui, repo, patchobj, strip, prefix='', files=None,
2166 2167 eolmode='strict', similarity=0):
2167 2168 """use builtin patch to apply <patchobj> to the working directory.
2168 2169 returns whether patch was applied with fuzz factor."""
2169 2170 backend = workingbackend(ui, repo, similarity)
2170 2171 return patchbackend(ui, backend, patchobj, strip, prefix, files, eolmode)
2171 2172
2172 2173 def patchrepo(ui, repo, ctx, store, patchobj, strip, prefix, files=None,
2173 2174 eolmode='strict'):
2174 2175 backend = repobackend(ui, repo, ctx, store)
2175 2176 return patchbackend(ui, backend, patchobj, strip, prefix, files, eolmode)
2176 2177
2177 2178 def patch(ui, repo, patchname, strip=1, prefix='', files=None, eolmode='strict',
2178 2179 similarity=0):
2179 2180 """Apply <patchname> to the working directory.
2180 2181
2181 2182 'eolmode' specifies how end of lines should be handled. It can be:
2182 2183 - 'strict': inputs are read in binary mode, EOLs are preserved
2183 2184 - 'crlf': EOLs are ignored when patching and reset to CRLF
2184 2185 - 'lf': EOLs are ignored when patching and reset to LF
2185 2186 - None: get it from user settings, default to 'strict'
2186 2187 'eolmode' is ignored when using an external patcher program.
2187 2188
2188 2189 Returns whether patch was applied with fuzz factor.
2189 2190 """
2190 2191 patcher = ui.config('ui', 'patch')
2191 2192 if files is None:
2192 2193 files = set()
2193 2194 if patcher:
2194 2195 return _externalpatch(ui, repo, patcher, patchname, strip,
2195 2196 files, similarity)
2196 2197 return internalpatch(ui, repo, patchname, strip, prefix, files, eolmode,
2197 2198 similarity)
2198 2199
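The three eolmode values documented above can be illustrated with a small standalone sketch; `normalize_eol` is a hypothetical helper for illustration, not Mercurial's implementation:

```python
def normalize_eol(data, eolmode):
    """Apply one of the eolmode policies above to patched-file bytes."""
    if eolmode == 'strict':
        return data                            # EOLs preserved as-is
    # 'crlf'/'lf': original EOLs are ignored and reset uniformly
    eol = b'\r\n' if eolmode == 'crlf' else b'\n'
    return b''.join(l + eol for l in data.splitlines())

print(normalize_eol(b'a\r\nb\n', 'lf'))    # b'a\nb\n'
print(normalize_eol(b'a\nb\n', 'crlf'))    # b'a\r\nb\r\n'
```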
2199 2200 def changedfiles(ui, repo, patchpath, strip=1, prefix=''):
2200 2201 backend = fsbackend(ui, repo.root)
2201 2202 prefix = _canonprefix(repo, prefix)
2202 2203 with open(patchpath, 'rb') as fp:
2203 2204 changed = set()
2204 2205 for state, values in iterhunks(fp):
2205 2206 if state == 'file':
2206 2207 afile, bfile, first_hunk, gp = values
2207 2208 if gp:
2208 2209 gp.path = pathtransform(gp.path, strip - 1, prefix)[1]
2209 2210 if gp.oldpath:
2210 2211 gp.oldpath = pathtransform(gp.oldpath, strip - 1,
2211 2212 prefix)[1]
2212 2213 else:
2213 2214 gp = makepatchmeta(backend, afile, bfile, first_hunk, strip,
2214 2215 prefix)
2215 2216 changed.add(gp.path)
2216 2217 if gp.op == 'RENAME':
2217 2218 changed.add(gp.oldpath)
2218 2219 elif state not in ('hunk', 'git'):
2219 2220 raise error.Abort(_('unsupported parser state: %s') % state)
2220 2221 return changed
2221 2222
2222 2223 class GitDiffRequired(Exception):
2223 2224 pass
2224 2225
2225 2226 def diffallopts(ui, opts=None, untrusted=False, section='diff'):
2226 2227 '''return diffopts with all features supported and parsed'''
2227 2228 return difffeatureopts(ui, opts=opts, untrusted=untrusted, section=section,
2228 2229 git=True, whitespace=True, formatchanging=True)
2229 2230
2230 2231 diffopts = diffallopts
2231 2232
2232 2233 def difffeatureopts(ui, opts=None, untrusted=False, section='diff', git=False,
2233 2234 whitespace=False, formatchanging=False):
2234 2235 '''return diffopts with only opted-in features parsed
2235 2236
2236 2237 Features:
2237 2238 - git: git-style diffs
2238 2239 - whitespace: whitespace options like ignoreblanklines and ignorews
2239 2240 - formatchanging: options that will likely break or cause correctness issues
2240 2241 with most diff parsers
2241 2242 '''
2242 2243 def get(key, name=None, getter=ui.configbool, forceplain=None):
2243 2244 if opts:
2244 2245 v = opts.get(key)
2245 2246 # diffopts flags are either None-default (which is passed
2246 2247 # through unchanged, so we can identify unset values), or
2247 2248 # some other falsey default (eg --unified, which defaults
2248 2249 # to an empty string). We only want to override the config
2249 2250 # entries from hgrc with command line values if they
2250 2251 # appear to have been set, which is any truthy value,
2251 2252 # True, or False.
2252 2253 if v or isinstance(v, bool):
2253 2254 return v
2254 2255 if forceplain is not None and ui.plain():
2255 2256 return forceplain
2256 2257 return getter(section, name or key, untrusted=untrusted)
2257 2258
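The comment in get() above encodes a precedence rule that is easy to miss: a command-line value overrides the config file only when it appears to have been set. A minimal standalone sketch (`resolve` is a hypothetical name):

```python
def resolve(cli_value, config_value):
    """Command-line value wins only if actually set: any truthy value,
    or an explicit True/False; otherwise fall back to the config file."""
    if cli_value or isinstance(cli_value, bool):
        return cli_value
    return config_value

print(resolve(None, 'cfg'))   # 'cfg'  (flag never given)
print(resolve('', 'cfg'))     # 'cfg'  (falsey default, e.g. --unified)
print(resolve(False, 'cfg'))  # False  (boolean explicitly set)
```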
2258 2259 # core options, expected to be understood by every diff parser
2259 2260 buildopts = {
2260 2261 'nodates': get('nodates'),
2261 2262 'showfunc': get('show_function', 'showfunc'),
2262 2263 'context': get('unified', getter=ui.config),
2263 2264 }
2264 2265 buildopts['worddiff'] = ui.configbool('experimental', 'worddiff')
2265 2266 buildopts['xdiff'] = ui.configbool('experimental', 'xdiff')
2266 2267
2267 2268 if git:
2268 2269 buildopts['git'] = get('git')
2269 2270
2270 2271 # since this is in the experimental section, we need to call
2271 2272         # ui.configbool directly
2272 2273 buildopts['showsimilarity'] = ui.configbool('experimental',
2273 2274 'extendedheader.similarity')
2274 2275
2275 2276 # need to inspect the ui object instead of using get() since we want to
2276 2277 # test for an int
2277 2278 hconf = ui.config('experimental', 'extendedheader.index')
2278 2279 if hconf is not None:
2279 2280 hlen = None
2280 2281 try:
2281 2282 # the hash config could be an integer (for length of hash) or a
2282 2283 # word (e.g. short, full, none)
2283 2284 hlen = int(hconf)
2284 2285 if hlen < 0 or hlen > 40:
2285 2286 msg = _("invalid length for extendedheader.index: '%d'\n")
2286 2287 ui.warn(msg % hlen)
2287 2288 except ValueError:
2288 2289 # default value
2289 2290 if hconf == 'short' or hconf == '':
2290 2291 hlen = 12
2291 2292 elif hconf == 'full':
2292 2293 hlen = 40
2293 2294 elif hconf != 'none':
2294 2295 msg = _("invalid value for extendedheader.index: '%s'\n")
2295 2296 ui.warn(msg % hconf)
2296 2297 finally:
2297 2298 buildopts['index'] = hlen
2298 2299
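The extendedheader.index parsing above accepts either an integer hash length or a keyword. A simplified standalone sketch that rejects out-of-range values outright rather than warning, as the real code does:

```python
def parse_index_config(hconf):
    """Map an extendedheader.index value to a hash length (or None)."""
    try:
        hlen = int(hconf)              # integer: explicit hash length
        return hlen if 0 <= hlen <= 40 else None
    except ValueError:                 # word: short, full, none, or ''
        return {'short': 12, '': 12, 'full': 40}.get(hconf)

print(parse_index_config('8'))      # 8
print(parse_index_config('short'))  # 12
print(parse_index_config('full'))   # 40
print(parse_index_config('none'))   # None
```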
2299 2300 if whitespace:
2300 2301 buildopts['ignorews'] = get('ignore_all_space', 'ignorews')
2301 2302 buildopts['ignorewsamount'] = get('ignore_space_change',
2302 2303 'ignorewsamount')
2303 2304 buildopts['ignoreblanklines'] = get('ignore_blank_lines',
2304 2305 'ignoreblanklines')
2305 2306 buildopts['ignorewseol'] = get('ignore_space_at_eol', 'ignorewseol')
2306 2307 if formatchanging:
2307 2308 buildopts['text'] = opts and opts.get('text')
2308 2309 binary = None if opts is None else opts.get('binary')
2309 2310 buildopts['nobinary'] = (not binary if binary is not None
2310 2311 else get('nobinary', forceplain=False))
2311 2312 buildopts['noprefix'] = get('noprefix', forceplain=False)
2312 2313
2313 2314 return mdiff.diffopts(**pycompat.strkwargs(buildopts))
2314 2315
2315 2316 def diff(repo, node1=None, node2=None, match=None, changes=None,
2316 2317 opts=None, losedatafn=None, prefix='', relroot='', copy=None,
2317 2318 hunksfilterfn=None):
2318 2319 '''yields diff of changes to files between two nodes, or node and
2319 2320 working directory.
2320 2321
2321 2322 if node1 is None, use first dirstate parent instead.
2322 2323 if node2 is None, compare node1 with working directory.
2323 2324
2324 2325 losedatafn(**kwarg) is a callable run when opts.upgrade=True and
2325 2326 every time some change cannot be represented with the current
2326 2327 patch format. Return False to upgrade to git patch format, True to
2327 2328 accept the loss or raise an exception to abort the diff. It is
2328 2329 called with the name of current file being diffed as 'fn'. If set
2329 2330 to None, patches will always be upgraded to git format when
2330 2331 necessary.
2331 2332
2332 2333 prefix is a filename prefix that is prepended to all filenames on
2333 2334 display (used for subrepos).
2334 2335
2335 2336 relroot, if not empty, must be normalized with a trailing /. Any match
2336 2337 patterns that fall outside it will be ignored.
2337 2338
2338 2339 copy, if not empty, should contain mappings {dst@y: src@x} of copy
2339 2340 information.
2340 2341
2341 2342 hunksfilterfn, if not None, should be a function taking a filectx and
2342 2343 hunks generator that may yield filtered hunks.
2343 2344 '''
2344 2345 for fctx1, fctx2, hdr, hunks in diffhunks(
2345 2346 repo, node1=node1, node2=node2,
2346 2347 match=match, changes=changes, opts=opts,
2347 2348 losedatafn=losedatafn, prefix=prefix, relroot=relroot, copy=copy,
2348 2349 ):
2349 2350 if hunksfilterfn is not None:
2350 2351 # If the file has been removed, fctx2 is None; but this should
2351 2352 # not occur here since we catch removed files early in
2352 2353 # logcmdutil.getlinerangerevs() for 'hg log -L'.
2353 2354 assert fctx2 is not None, \
2354 2355                 'fctx2 unexpectedly None in diff hunks filtering'
2355 2356 hunks = hunksfilterfn(fctx2, hunks)
2356 2357 text = ''.join(sum((list(hlines) for hrange, hlines in hunks), []))
2357 2358 if hdr and (text or len(hdr) > 1):
2358 2359 yield '\n'.join(hdr) + '\n'
2359 2360 if text:
2360 2361 yield text
2361 2362
2362 2363 def diffhunks(repo, node1=None, node2=None, match=None, changes=None,
2363 2364 opts=None, losedatafn=None, prefix='', relroot='', copy=None):
2364 2365 """Yield diff of changes to files in the form of (`header`, `hunks`) tuples
2365 2366 where `header` is a list of diff headers and `hunks` is an iterable of
2366 2367 (`hunkrange`, `hunklines`) tuples.
2367 2368
2368 2369 See diff() for the meaning of parameters.
2369 2370 """
2370 2371
2371 2372 if opts is None:
2372 2373 opts = mdiff.defaultopts
2373 2374
2374 2375 if not node1 and not node2:
2375 2376 node1 = repo.dirstate.p1()
2376 2377
2377 2378 def lrugetfilectx():
2378 2379 cache = {}
2379 2380 order = collections.deque()
2380 2381 def getfilectx(f, ctx):
2381 2382 fctx = ctx.filectx(f, filelog=cache.get(f))
2382 2383 if f not in cache:
2383 2384 if len(cache) > 20:
2384 2385 del cache[order.popleft()]
2385 2386 cache[f] = fctx.filelog()
2386 2387 else:
2387 2388 order.remove(f)
2388 2389 order.append(f)
2389 2390 return fctx
2390 2391 return getfilectx
2391 2392 getfilectx = lrugetfilectx()
2392 2393
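lrugetfilectx() above keeps a small LRU cache of filelogs using a dict plus a deque of keys in recency order. The same scheme, slightly simplified, as a generic standalone sketch:

```python
from collections import deque

def make_lru(maxsize):
    """Return a memoizing getter that keeps at most maxsize entries."""
    cache, order = {}, deque()
    def get(key, compute):
        if key not in cache:
            if len(cache) >= maxsize:
                del cache[order.popleft()]   # evict least recently used
            cache[key] = compute(key)
        else:
            order.remove(key)                # refresh recency
        order.append(key)
        return cache[key]
    get.cache = cache                        # exposed for inspection only
    return get

get = make_lru(2)
for k in ('a', 'b', 'a', 'c'):               # 'c' evicts 'b', not 'a'
    get(k, str.upper)
print(sorted(get.cache))                     # ['a', 'c']
```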
2393 2394 ctx1 = repo[node1]
2394 2395 ctx2 = repo[node2]
2395 2396
2396 2397 relfiltered = False
2397 2398 if relroot != '' and match.always():
2398 2399 # as a special case, create a new matcher with just the relroot
2399 2400 pats = [relroot]
2400 2401 match = scmutil.match(ctx2, pats, default='path')
2401 2402 relfiltered = True
2402 2403
2403 2404 if not changes:
2404 2405 changes = repo.status(ctx1, ctx2, match=match)
2405 2406 modified, added, removed = changes[:3]
2406 2407
2407 2408 if not modified and not added and not removed:
2408 2409 return []
2409 2410
2410 2411 if repo.ui.debugflag:
2411 2412 hexfunc = hex
2412 2413 else:
2413 2414 hexfunc = short
2414 2415 revs = [hexfunc(node) for node in [ctx1.node(), ctx2.node()] if node]
2415 2416
2416 2417 if copy is None:
2417 2418 copy = {}
2418 2419 if opts.git or opts.upgrade:
2419 2420 copy = copies.pathcopies(ctx1, ctx2, match=match)
2420 2421
2421 2422 if relroot is not None:
2422 2423 if not relfiltered:
2423 2424 # XXX this would ideally be done in the matcher, but that is
2424 2425 # generally meant to 'or' patterns, not 'and' them. In this case we
2425 2426 # need to 'and' all the patterns from the matcher with relroot.
2426 2427 def filterrel(l):
2427 2428 return [f for f in l if f.startswith(relroot)]
2428 2429 modified = filterrel(modified)
2429 2430 added = filterrel(added)
2430 2431 removed = filterrel(removed)
2431 2432 relfiltered = True
2432 2433 # filter out copies where either side isn't inside the relative root
2433 2434 copy = dict(((dst, src) for (dst, src) in copy.iteritems()
2434 2435 if dst.startswith(relroot)
2435 2436 and src.startswith(relroot)))
2436 2437
2437 2438 modifiedset = set(modified)
2438 2439 addedset = set(added)
2439 2440 removedset = set(removed)
2440 2441 for f in modified:
2441 2442 if f not in ctx1:
2442 2443 # Fix up added, since merged-in additions appear as
2443 2444 # modifications during merges
2444 2445 modifiedset.remove(f)
2445 2446 addedset.add(f)
2446 2447 for f in removed:
2447 2448 if f not in ctx1:
2448 2449 # Merged-in additions that are then removed are reported as removed.
2449 2450             # They are not in ctx1, so we don't want to show them in the diff.
2450 2451 removedset.remove(f)
2451 2452 modified = sorted(modifiedset)
2452 2453 added = sorted(addedset)
2453 2454 removed = sorted(removedset)
2454 2455 for dst, src in list(copy.items()):
2455 2456 if src not in ctx1:
2456 2457 # Files merged in during a merge and then copied/renamed are
2457 2458 # reported as copies. We want to show them in the diff as additions.
2458 2459 del copy[dst]
2459 2460
2460 2461 def difffn(opts, losedata):
2461 2462 return trydiff(repo, revs, ctx1, ctx2, modified, added, removed,
2462 2463 copy, getfilectx, opts, losedata, prefix, relroot)
2463 2464 if opts.upgrade and not opts.git:
2464 2465 try:
2465 2466 def losedata(fn):
2466 2467 if not losedatafn or not losedatafn(fn=fn):
2467 2468 raise GitDiffRequired
2468 2469 # Buffer the whole output until we are sure it can be generated
2469 2470 return list(difffn(opts.copy(git=False), losedata))
2470 2471 except GitDiffRequired:
2471 2472 return difffn(opts.copy(git=True), None)
2472 2473 else:
2473 2474 return difffn(opts, None)
2474 2475
2475 2476 def difflabel(func, *args, **kw):
2476 2477 '''yields 2-tuples of (output, label) based on the output of func()'''
2477 2478 inlinecolor = False
2478 2479 if kw.get(r'opts'):
2479 2480 inlinecolor = kw[r'opts'].worddiff
2480 2481 headprefixes = [('diff', 'diff.diffline'),
2481 2482 ('copy', 'diff.extended'),
2482 2483 ('rename', 'diff.extended'),
2483 2484 ('old', 'diff.extended'),
2484 2485 ('new', 'diff.extended'),
2485 2486 ('deleted', 'diff.extended'),
2486 2487 ('index', 'diff.extended'),
2487 2488 ('similarity', 'diff.extended'),
2488 2489 ('---', 'diff.file_a'),
2489 2490 ('+++', 'diff.file_b')]
2490 2491 textprefixes = [('@', 'diff.hunk'),
2491 2492 ('-', 'diff.deleted'),
2492 2493 ('+', 'diff.inserted')]
2493 2494 head = False
2494 2495 for chunk in func(*args, **kw):
2495 2496 lines = chunk.split('\n')
2496 2497 matches = {}
2497 2498 if inlinecolor:
2498 2499 matches = _findmatches(lines)
2499 2500 for i, line in enumerate(lines):
2500 2501 if i != 0:
2501 2502 yield ('\n', '')
2502 2503 if head:
2503 2504 if line.startswith('@'):
2504 2505 head = False
2505 2506 else:
2506 2507 if line and line[0] not in ' +-@\\':
2507 2508 head = True
2508 2509 stripline = line
2509 2510 diffline = False
2510 2511 if not head and line and line[0] in '+-':
2511 2512 # highlight tabs and trailing whitespace, but only in
2512 2513 # changed lines
2513 2514 stripline = line.rstrip()
2514 2515 diffline = True
2515 2516
2516 2517 prefixes = textprefixes
2517 2518 if head:
2518 2519 prefixes = headprefixes
2519 2520 for prefix, label in prefixes:
2520 2521 if stripline.startswith(prefix):
2521 2522 if diffline:
2522 2523 if i in matches:
2523 2524 for t, l in _inlinediff(lines[i].rstrip(),
2524 2525 lines[matches[i]].rstrip(),
2525 2526 label):
2526 2527 yield (t, l)
2527 2528 else:
2528 2529 for token in tabsplitter.findall(stripline):
2529 2530 if token.startswith('\t'):
2530 2531 yield (token, 'diff.tab')
2531 2532 else:
2532 2533 yield (token, label)
2533 2534 else:
2534 2535 yield (stripline, label)
2535 2536 break
2536 2537 else:
2537 2538 yield (line, '')
2538 2539 if line != stripline:
2539 2540 yield (line[len(stripline):], 'diff.trailingwhitespace')
2540 2541
2541 2542 def _findmatches(slist):
2542 2543     '''Look for insertion matches to deletions and return a dict of
2543 2544 correspondences.
2544 2545 '''
2545 2546 lastmatch = 0
2546 2547 matches = {}
2547 2548 for i, line in enumerate(slist):
2548 2549 if line == '':
2549 2550 continue
2550 2551 if line[0] == '-':
2551 2552 lastmatch = max(lastmatch, i)
2552 2553 newgroup = False
2553 2554 for j, newline in enumerate(slist[lastmatch + 1:]):
2554 2555 if newline == '':
2555 2556 continue
2556 2557 if newline[0] == '-' and newgroup: # too far, no match
2557 2558 break
2558 2559 if newline[0] == '+': # potential match
2559 2560 newgroup = True
2560 2561 sim = difflib.SequenceMatcher(None, line, newline).ratio()
2561 2562 if sim > 0.7:
2562 2563 lastmatch = lastmatch + 1 + j
2563 2564 matches[i] = lastmatch
2564 2565 matches[lastmatch] = i
2565 2566 break
2566 2567 return matches
2567 2568
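The 0.7 similarity threshold used above comes from difflib's SequenceMatcher ratio; an illustration of the kind of deleted/inserted pair it accepts:

```python
import difflib

# A deleted line and its slightly edited insertion score well above 0.7,
# so the matching heuristic above would pair them for inline highlighting.
old = '-def compute_total(items):'
new = '+def compute_totals(items):'
sim = difflib.SequenceMatcher(None, old, new).ratio()
print(sim > 0.7)  # True
```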
2568 2569 def _inlinediff(s1, s2, operation):
2569 2570 '''Perform string diff to highlight specific changes.'''
2570 2571 operation_skip = '+?' if operation == 'diff.deleted' else '-?'
2571 2572 if operation == 'diff.deleted':
2572 2573 s2, s1 = s1, s2
2573 2574
2574 2575 buff = []
2575 2576     # we never want to highlight the leading +-
2576 2577 if operation == 'diff.deleted' and s2.startswith('-'):
2577 2578 label = operation
2578 2579 token = '-'
2579 2580 s2 = s2[1:]
2580 2581 s1 = s1[1:]
2581 2582 elif operation == 'diff.inserted' and s1.startswith('+'):
2582 2583 label = operation
2583 2584 token = '+'
2584 2585 s2 = s2[1:]
2585 2586 s1 = s1[1:]
2586 2587 else:
2587 2588 raise error.ProgrammingError("Case not expected, operation = %s" %
2588 2589 operation)
2589 2590
2590 2591 s = difflib.ndiff(_nonwordre.split(s2), _nonwordre.split(s1))
2591 2592 for part in s:
2592 2593 if part[0] in operation_skip or len(part) == 2:
2593 2594 continue
2594 2595 l = operation + '.highlight'
2595 2596 if part[0] in ' ':
2596 2597 l = operation
2597 2598 if part[2:] == '\t':
2598 2599 l = 'diff.tab'
2599 2600 if l == label: # contiguous token with same label
2600 2601 token += part[2:]
2601 2602 continue
2602 2603 else:
2603 2604 buff.append((token, label))
2604 2605 label = l
2605 2606 token = part[2:]
2606 2607 buff.append((token, label))
2607 2608
2608 2609 return buff
2609 2610
2610 2611 def diffui(*args, **kw):
2611 2612 '''like diff(), but yields 2-tuples of (output, label) for ui.write()'''
2612 2613 return difflabel(diff, *args, **kw)
2613 2614
2614 2615 def _filepairs(modified, added, removed, copy, opts):
2615 2616 '''generates tuples (f1, f2, copyop), where f1 is the name of the file
2616 2617     before and f2 is the name after. For added files, f1 will be None,
2617 2618 and for removed files, f2 will be None. copyop may be set to None, 'copy'
2618 2619 or 'rename' (the latter two only if opts.git is set).'''
2619 2620 gone = set()
2620 2621
2621 2622 copyto = dict([(v, k) for k, v in copy.items()])
2622 2623
2623 2624 addedset, removedset = set(added), set(removed)
2624 2625
2625 2626 for f in sorted(modified + added + removed):
2626 2627 copyop = None
2627 2628 f1, f2 = f, f
2628 2629 if f in addedset:
2629 2630 f1 = None
2630 2631 if f in copy:
2631 2632 if opts.git:
2632 2633 f1 = copy[f]
2633 2634 if f1 in removedset and f1 not in gone:
2634 2635 copyop = 'rename'
2635 2636 gone.add(f1)
2636 2637 else:
2637 2638 copyop = 'copy'
2638 2639 elif f in removedset:
2639 2640 f2 = None
2640 2641 if opts.git:
2641 2642 # have we already reported a copy above?
2642 2643 if (f in copyto and copyto[f] in addedset
2643 2644 and copy[copyto[f]] == f):
2644 2645 continue
2645 2646 yield f1, f2, copyop
2646 2647
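The pairing convention _filepairs() above yields can be shown with a stripped-down variant that omits the git copy/rename handling (function name is illustrative only):

```python
def filepairs(modified, added, removed):
    """Yield (f1, f2): f1 is None for added files, f2 None for removed."""
    addedset, removedset = set(added), set(removed)
    for f in sorted(modified + added + removed):
        yield (None if f in addedset else f,
               None if f in removedset else f)

print(list(filepairs(['m'], ['a'], ['r'])))
# [(None, 'a'), ('m', 'm'), ('r', None)]
```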
2647 2648 def trydiff(repo, revs, ctx1, ctx2, modified, added, removed,
2648 2649 copy, getfilectx, opts, losedatafn, prefix, relroot):
2649 2650 '''given input data, generate a diff and yield it in blocks
2650 2651
2651 2652 If generating a diff would lose data like flags or binary data and
2652 2653 losedatafn is not None, it will be called.
2653 2654
2654 2655 relroot is removed and prefix is added to every path in the diff output.
2655 2656
2656 2657 If relroot is not empty, this function expects every path in modified,
2657 2658 added, removed and copy to start with it.'''
2658 2659
2659 2660 def gitindex(text):
2660 2661 if not text:
2661 2662 text = ""
2662 2663 l = len(text)
2663 2664 s = hashlib.sha1('blob %d\0' % l)
2664 2665 s.update(text)
2665 2666 return hex(s.digest())
2666 2667
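gitindex() above reproduces git's blob id: a SHA-1 over the header "blob <size>\0" followed by the content, emitted as hex. A standalone equivalent:

```python
import hashlib

def git_blob_id(text):
    """Return the hex git blob id for the given bytes."""
    s = hashlib.sha1(b'blob %d\0' % len(text))
    s.update(text)
    return s.hexdigest()

# The well-known id of the empty blob:
print(git_blob_id(b''))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```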
2667 2668 if opts.noprefix:
2668 2669 aprefix = bprefix = ''
2669 2670 else:
2670 2671 aprefix = 'a/'
2671 2672 bprefix = 'b/'
2672 2673
2673 2674 def diffline(f, revs):
2674 2675 revinfo = ' '.join(["-r %s" % rev for rev in revs])
2675 2676 return 'diff %s %s' % (revinfo, f)
2676 2677
2677 2678 def isempty(fctx):
2678 2679 return fctx is None or fctx.size() == 0
2679 2680
2680 2681 date1 = dateutil.datestr(ctx1.date())
2681 2682 date2 = dateutil.datestr(ctx2.date())
2682 2683
2683 2684 gitmode = {'l': '120000', 'x': '100755', '': '100644'}
2684 2685
2685 2686 if relroot != '' and (repo.ui.configbool('devel', 'all-warnings')
2686 2687 or repo.ui.configbool('devel', 'check-relroot')):
2687 2688 for f in modified + added + removed + list(copy) + list(copy.values()):
2688 2689 if f is not None and not f.startswith(relroot):
2689 2690 raise AssertionError(
2690 2691 "file %s doesn't start with relroot %s" % (f, relroot))
2691 2692
2692 2693 for f1, f2, copyop in _filepairs(modified, added, removed, copy, opts):
2693 2694 content1 = None
2694 2695 content2 = None
2695 2696 fctx1 = None
2696 2697 fctx2 = None
2697 2698 flag1 = None
2698 2699 flag2 = None
2699 2700 if f1:
2700 2701 fctx1 = getfilectx(f1, ctx1)
2701 2702 if opts.git or losedatafn:
2702 2703 flag1 = ctx1.flags(f1)
2703 2704 if f2:
2704 2705 fctx2 = getfilectx(f2, ctx2)
2705 2706 if opts.git or losedatafn:
2706 2707 flag2 = ctx2.flags(f2)
2707 2708 # if binary is True, output "summary" or "base85", but not "text diff"
2708 2709 if opts.text:
2709 2710 binary = False
2710 2711 else:
2711 2712 binary = any(f.isbinary() for f in [fctx1, fctx2] if f is not None)
2712 2713
2713 2714 if losedatafn and not opts.git:
2714 2715 if (binary or
2715 2716 # copy/rename
2716 2717 f2 in copy or
2717 2718 # empty file creation
2718 2719 (not f1 and isempty(fctx2)) or
2719 2720 # empty file deletion
2720 2721 (isempty(fctx1) and not f2) or
2721 2722 # create with flags
2722 2723 (not f1 and flag2) or
2723 2724 # change flags
2724 2725 (f1 and f2 and flag1 != flag2)):
2725 2726 losedatafn(f2 or f1)
2726 2727
2727 2728 path1 = f1 or f2
2728 2729 path2 = f2 or f1
2729 2730 path1 = posixpath.join(prefix, path1[len(relroot):])
2730 2731 path2 = posixpath.join(prefix, path2[len(relroot):])
2731 2732 header = []
2732 2733 if opts.git:
2733 2734 header.append('diff --git %s%s %s%s' %
2734 2735 (aprefix, path1, bprefix, path2))
2735 2736 if not f1: # added
2736 2737 header.append('new file mode %s' % gitmode[flag2])
2737 2738 elif not f2: # removed
2738 2739 header.append('deleted file mode %s' % gitmode[flag1])
2739 2740 else: # modified/copied/renamed
2740 2741 mode1, mode2 = gitmode[flag1], gitmode[flag2]
2741 2742 if mode1 != mode2:
2742 2743 header.append('old mode %s' % mode1)
2743 2744 header.append('new mode %s' % mode2)
2744 2745 if copyop is not None:
2745 2746 if opts.showsimilarity:
2746 2747 sim = similar.score(ctx1[path1], ctx2[path2]) * 100
2747 2748 header.append('similarity index %d%%' % sim)
2748 2749 header.append('%s from %s' % (copyop, path1))
2749 2750 header.append('%s to %s' % (copyop, path2))
2750 2751 elif revs and not repo.ui.quiet:
2751 2752 header.append(diffline(path1, revs))
2752 2753
2753 2754 # fctx.is | diffopts | what to | is fctx.data()
2754 2755 # binary() | text nobinary git index | output? | outputted?
2755 2756 # ------------------------------------|----------------------------
2756 2757 # yes | no no no * | summary | no
2757 2758 # yes | no no yes * | base85 | yes
2758 2759 # yes | no yes no * | summary | no
2759 2760 # yes | no yes yes 0 | summary | no
2760 2761 # yes | no yes yes >0 | summary | semi [1]
2761 2762 # yes | yes * * * | text diff | yes
2762 2763 # no | * * * * | text diff | yes
2763 2764         # [1]: hash(fctx.data()) is outputted, so fctx.data() cannot be faked
2764 2765 if binary and (not opts.git or (opts.git and opts.nobinary and not
2765 2766 opts.index)):
2766 2767 # fast path: no binary content will be displayed, content1 and
2767 2768 # content2 are only used for equivalent test. cmp() could have a
2768 2769 # fast path.
2769 2770 if fctx1 is not None:
2770 2771 content1 = b'\0'
2771 2772 if fctx2 is not None:
2772 2773 if fctx1 is not None and not fctx1.cmp(fctx2):
2773 2774 content2 = b'\0' # not different
2774 2775 else:
2775 2776 content2 = b'\0\0'
2776 2777 else:
2777 2778 # normal path: load contents
2778 2779 if fctx1 is not None:
2779 2780 content1 = fctx1.data()
2780 2781 if fctx2 is not None:
2781 2782 content2 = fctx2.data()
2782 2783
2783 2784 if binary and opts.git and not opts.nobinary:
2784 2785 text = mdiff.b85diff(content1, content2)
2785 2786 if text:
2786 2787 header.append('index %s..%s' %
2787 2788 (gitindex(content1), gitindex(content2)))
2788 2789 hunks = (None, [text]),
2789 2790 else:
2790 2791 if opts.git and opts.index > 0:
2791 2792 flag = flag1
2792 2793 if flag is None:
2793 2794 flag = flag2
2794 2795 header.append('index %s..%s %s' %
2795 2796 (gitindex(content1)[0:opts.index],
2796 2797 gitindex(content2)[0:opts.index],
2797 2798 gitmode[flag]))
2798 2799
2799 2800 uheaders, hunks = mdiff.unidiff(content1, date1,
2800 2801 content2, date2,
2801 2802 path1, path2,
2802 2803 binary=binary, opts=opts)
2803 2804 header.extend(uheaders)
2804 2805 yield fctx1, fctx2, header, hunks
2805 2806
2806 2807 def diffstatsum(stats):
2807 2808 maxfile, maxtotal, addtotal, removetotal, binary = 0, 0, 0, 0, False
2808 2809 for f, a, r, b in stats:
2809 2810 maxfile = max(maxfile, encoding.colwidth(f))
2810 2811 maxtotal = max(maxtotal, a + r)
2811 2812 addtotal += a
2812 2813 removetotal += r
2813 2814 binary = binary or b
2814 2815
2815 2816 return maxfile, maxtotal, addtotal, removetotal, binary
2816 2817
2817 2818 def diffstatdata(lines):
2818 2819 diffre = re.compile('^diff .*-r [a-z0-9]+\s(.*)$')
2819 2820
2820 2821 results = []
2821 2822 filename, adds, removes, isbinary = None, 0, 0, False
2822 2823
2823 2824 def addresult():
2824 2825 if filename:
2825 2826 results.append((filename, adds, removes, isbinary))
2826 2827
2827 2828 # inheader is used to track if a line is in the
2828 2829 # header portion of the diff. This helps properly account
2829 2830 # for lines that start with '--' or '++'
2830 2831 inheader = False
2831 2832
2832 2833 for line in lines:
2833 2834 if line.startswith('diff'):
2834 2835 addresult()
2835 2836 # starting a new file diff
2836 2837 # set numbers to 0 and reset inheader
2837 2838 inheader = True
2838 2839 adds, removes, isbinary = 0, 0, False
2839 2840 if line.startswith('diff --git a/'):
2840 2841 filename = gitre.search(line).group(2)
2841 2842 elif line.startswith('diff -r'):
2842 2843 # format: "diff -r ... -r ... filename"
2843 2844 filename = diffre.search(line).group(1)
2844 2845 elif line.startswith('@@'):
2845 2846 inheader = False
2846 2847 elif line.startswith('+') and not inheader:
2847 2848 adds += 1
2848 2849 elif line.startswith('-') and not inheader:
2849 2850 removes += 1
2850 2851 elif (line.startswith('GIT binary patch') or
2851 2852 line.startswith('Binary file')):
2852 2853 isbinary = True
2853 2854 addresult()
2854 2855 return results
2855 2856
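The header-tracking idea in diffstatdata() above (lines starting with '-' or '+' are not counted until the first '@@' hunk marker, so '---'/'+++' headers are skipped) in a reduced form:

```python
def count_changes(lines):
    """Count added/removed lines, ignoring the per-file diff header."""
    adds = removes = 0
    inheader = True
    for line in lines:
        if line.startswith('@@'):
            inheader = False             # hunk body starts here
        elif line.startswith('+') and not inheader:
            adds += 1
        elif line.startswith('-') and not inheader:
            removes += 1
    return adds, removes

diff = ['diff --git a/f b/f', '--- a/f', '+++ b/f',
        '@@ -1,2 +1,2 @@', ' ctx', '-old', '+new']
print(count_changes(diff))  # (1, 1)
```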
2856 2857 def diffstat(lines, width=80):
2857 2858 output = []
2858 2859 stats = diffstatdata(lines)
2859 2860 maxname, maxtotal, totaladds, totalremoves, hasbinary = diffstatsum(stats)
2860 2861
2861 2862 countwidth = len(str(maxtotal))
2862 2863 if hasbinary and countwidth < 3:
2863 2864 countwidth = 3
2864 2865 graphwidth = width - countwidth - maxname - 6
2865 2866 if graphwidth < 10:
2866 2867 graphwidth = 10
2867 2868
2868 2869 def scale(i):
2869 2870 if maxtotal <= graphwidth:
2870 2871 return i
2871 2872 # If diffstat runs out of room it doesn't print anything,
2872 2873 # which isn't very useful, so always print at least one + or -
2873 2874 # if there were at least some changes.
2874 2875 return max(i * graphwidth // maxtotal, int(bool(i)))
2875 2876
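scale() above compresses counts into the available graph width while guaranteeing that a nonzero count still shows at least one mark; the same logic as a free function:

```python
def scale(i, maxtotal, graphwidth):
    if maxtotal <= graphwidth:
        return i                 # everything fits, no scaling needed
    # always print at least one +/- if there were any changes at all
    return max(i * graphwidth // maxtotal, int(bool(i)))

print(scale(5, 50, 100))   # 5  (fits unscaled)
print(scale(5, 500, 50))   # 1  (scaled down, but still visible)
print(scale(0, 500, 50))   # 0
```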
2876 2877 for filename, adds, removes, isbinary in stats:
2877 2878 if isbinary:
2878 2879 count = 'Bin'
2879 2880 else:
2880 2881 count = '%d' % (adds + removes)
2881 2882 pluses = '+' * scale(adds)
2882 2883 minuses = '-' * scale(removes)
2883 2884 output.append(' %s%s | %*s %s%s\n' %
2884 2885 (filename, ' ' * (maxname - encoding.colwidth(filename)),
2885 2886 countwidth, count, pluses, minuses))
2886 2887
2887 2888 if stats:
2888 2889 output.append(_(' %d files changed, %d insertions(+), '
2889 2890 '%d deletions(-)\n')
2890 2891 % (len(stats), totaladds, totalremoves))
2891 2892
2892 2893 return ''.join(output)
2893 2894
2894 2895 def diffstatui(*args, **kw):
2895 2896 '''like diffstat(), but yields 2-tuples of (output, label) for
2896 2897 ui.write()
2897 2898 '''
2898 2899
2899 2900 for line in diffstat(*args, **kw).splitlines():
2900 2901 if line and line[-1] in '+-':
2901 2902 name, graph = line.rsplit(' ', 1)
2902 2903 yield (name + ' ', '')
2903 2904 m = re.search(br'\++', graph)
2904 2905 if m:
2905 2906 yield (m.group(0), 'diffstat.inserted')
2906 2907 m = re.search(br'-+', graph)
2907 2908 if m:
2908 2909 yield (m.group(0), 'diffstat.deleted')
2909 2910 else:
2910 2911 yield (line, '')
2911 2912 yield ('\n', '')