convert: backout b75a04502ced and 9616b03113ce - tagmap...
Mads Kiilerich -
r21076:5236c7a7 default
@@ -1,395 +1,389 b''
1 1 # convert.py Foreign SCM converter
2 2 #
3 3 # Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
4 4 #
5 5 # This software may be used and distributed according to the terms of the
6 6 # GNU General Public License version 2 or any later version.
7 7
8 8 '''import revisions from foreign VCS repositories into Mercurial'''
9 9
10 10 import convcmd
11 11 import cvsps
12 12 import subversion
13 13 from mercurial import commands, templatekw
14 14 from mercurial.i18n import _
15 15
16 16 testedwith = 'internal'
17 17
18 18 # Commands definition was moved elsewhere to ease demandload job.
19 19
20 20 def convert(ui, src, dest=None, revmapfile=None, **opts):
21 21 """convert a foreign SCM repository to a Mercurial one.
22 22
23 23 Accepted source formats [identifiers]:
24 24
25 25 - Mercurial [hg]
26 26 - CVS [cvs]
27 27 - Darcs [darcs]
28 28 - git [git]
29 29 - Subversion [svn]
30 30 - Monotone [mtn]
31 31 - GNU Arch [gnuarch]
32 32 - Bazaar [bzr]
33 33 - Perforce [p4]
34 34
35 35 Accepted destination formats [identifiers]:
36 36
37 37 - Mercurial [hg]
38 38 - Subversion [svn] (history on branches is not preserved)
39 39
40 40 If no revision is given, all revisions will be converted.
41 41 Otherwise, convert will only import up to the named revision
42 42 (given in a format understood by the source).
43 43
44 44 If no destination directory name is specified, it defaults to the
45 45 basename of the source with ``-hg`` appended. If the destination
46 46 repository doesn't exist, it will be created.
47 47
48 48 By default, all sources except Mercurial will use --branchsort.
49 49 Mercurial uses --sourcesort to preserve original revision numbers
50 50 order. Sort modes have the following effects:
51 51
52 52 --branchsort convert from parent to child revision when possible,
53 53 which means branches are usually converted one after
54 54 the other. It generates more compact repositories.
55 55
56 56 --datesort sort revisions by date. Converted repositories have
57 57 good-looking changelogs but are often an order of
58 58 magnitude larger than the same ones generated by
59 59 --branchsort.
60 60
61 61 --sourcesort try to preserve source revisions order, only
62 62 supported by Mercurial sources.
63 63
64 64 --closesort try to move closed revisions as close as possible
65 65 to parent branches, only supported by Mercurial
66 66 sources.
67 67
68 68 If ``REVMAP`` isn't given, it will be put in a default location
69 69 (``<dest>/.hg/shamap`` by default). The ``REVMAP`` is a simple
70 70 text file that maps each source commit ID to the destination ID
71 71 for that revision, like so::
72 72
73 73 <source ID> <destination ID>
74 74
75 75 If the file doesn't exist, it's automatically created. It's
76 76 updated on each commit copied, so :hg:`convert` can be interrupted
77 77 and can be run repeatedly to copy new commits.
78 78
79 79 The authormap is a simple text file that maps each source commit
80 80 author to a destination commit author. It is handy for source SCMs
81 81 that use unix logins to identify authors (e.g.: CVS). One line per
82 82 author mapping and the line format is::
83 83
84 84 source author = destination author
85 85
86 86 Empty lines and lines starting with a ``#`` are ignored.
87 87
88 88 The filemap is a file that allows filtering and remapping of files
89 89 and directories. Each line can contain one of the following
90 90 directives::
91 91
92 92 include path/to/file-or-dir
93 93
94 94 exclude path/to/file-or-dir
95 95
96 96 rename path/to/source path/to/destination
97 97
98 98 Comment lines start with ``#``. A specified path matches if it
99 99 equals the full relative name of a file or one of its parent
100 100 directories. The ``include`` or ``exclude`` directive with the
101 101 longest matching path applies, so line order does not matter.
102 102
103 103 The ``include`` directive causes a file, or all files under a
104 104 directory, to be included in the destination repository. The default
105 105 if there are no ``include`` statements is to include everything.
106 106 If there are any ``include`` statements, nothing else is included.
107 107 The ``exclude`` directive causes files or directories to
108 108 be omitted. The ``rename`` directive renames a file or directory if
109 109 it is converted. To rename from a subdirectory into the root of
110 110 the repository, use ``.`` as the path to rename to.
111 111
112 112 The splicemap is a file that allows insertion of synthetic
113 113 history, letting you specify the parents of a revision. This is
114 114 useful if you want to e.g. give a Subversion merge two parents, or
115 115 graft two disconnected series of history together. Each entry
116 116 contains a key, followed by a space, followed by one or two
117 117 comma-separated values::
118 118
119 119 key parent1, parent2
120 120
121 121 The key is the revision ID in the source
122 122 revision control system whose parents should be modified (same
123 123 format as a key in .hg/shamap). The values are the revision IDs
124 124 (in either the source or destination revision control system) that
125 125 should be used as the new parents for that node. For example, if
126 126 you have merged "release-1.0" into "trunk", then you should
127 127 specify the revision on "trunk" as the first parent and the one on
128 128 the "release-1.0" branch as the second.
129 129
130 130 The branchmap is a file that allows you to rename a branch when it is
131 131 being brought in from whatever external repository. When used in
132 132 conjunction with a splicemap, it allows for a powerful combination
133 133 to help fix even the most badly mismanaged repositories and turn them
134 134 into nicely structured Mercurial repositories. The branchmap contains
135 135 lines of the form::
136 136
137 137 original_branch_name new_branch_name
138 138
139 139 where "original_branch_name" is the name of the branch in the
140 140 source repository, and "new_branch_name" is the name of the branch
141 141 is the destination repository. No whitespace is allowed in the
142 142 branch names. This can be used to (for instance) move code in one
143 143 repository from "default" to a named branch.
144 144
145 145 The closemap is a file that allows closing of a branch. This is useful if
146 146 you want to close a branch. Each entry contains a revision or hash
147 147 separated by white space.
148 148
149 The tagmap is a file that exactly analogous to the branchmap. This will
150 rename tags on the fly and prevent the 'update tags' commit usually found
151 at the end of a convert process.
152
153 149 Mercurial Source
154 150 ################
155 151
156 152 The Mercurial source recognizes the following configuration
157 153 options, which you can set on the command line with ``--config``:
158 154
159 155 :convert.hg.ignoreerrors: ignore integrity errors when reading.
160 156 Use it to fix Mercurial repositories with missing revlogs, by
161 157 converting from and to Mercurial. Default is False.
162 158
163 159 :convert.hg.saverev: store original revision ID in changeset
164 160 (forces target IDs to change). It takes a boolean argument and
165 161 defaults to False.
166 162
167 163 :convert.hg.revs: revset specifying the source revisions to convert.
168 164
169 165 CVS Source
170 166 ##########
171 167
172 168 CVS source will use a sandbox (i.e. a checked-out copy) from CVS
173 169 to indicate the starting point of what will be converted. Direct
174 170 access to the repository files is not needed, unless of course the
175 171 repository is ``:local:``. The conversion uses the top level
176 172 directory in the sandbox to find the CVS repository, and then uses
177 173 CVS rlog commands to find files to convert. This means that unless
178 174 a filemap is given, all files under the starting directory will be
179 175 converted, and that any directory reorganization in the CVS
180 176 sandbox is ignored.
181 177
182 178 The following options can be used with ``--config``:
183 179
184 180 :convert.cvsps.cache: Set to False to disable remote log caching,
185 181 for testing and debugging purposes. Default is True.
186 182
187 183 :convert.cvsps.fuzz: Specify the maximum time (in seconds) that is
188 184 allowed between commits with identical user and log message in
189 185 a single changeset. When very large files were checked in as
190 186 part of a changeset then the default may not be long enough.
191 187 The default is 60.
192 188
193 189 :convert.cvsps.mergeto: Specify a regular expression to which
194 190 commit log messages are matched. If a match occurs, then the
195 191 conversion process will insert a dummy revision merging the
196 192 branch on which this log message occurs to the branch
197 193 indicated in the regex. Default is ``{{mergetobranch
198 194 ([-\\w]+)}}``
199 195
200 196 :convert.cvsps.mergefrom: Specify a regular expression to which
201 197 commit log messages are matched. If a match occurs, then the
202 198 conversion process will add the most recent revision on the
203 199 branch indicated in the regex as the second parent of the
204 200 changeset. Default is ``{{mergefrombranch ([-\\w]+)}}``
205 201
206 202 :convert.localtimezone: use local time (as determined by the TZ
207 203 environment variable) for changeset date/times. The default
208 204 is False (use UTC).
209 205
210 206 :hooks.cvslog: Specify a Python function to be called at the end of
211 207 gathering the CVS log. The function is passed a list with the
212 208 log entries, and can modify the entries in-place, or add or
213 209 delete them.
214 210
215 211 :hooks.cvschangesets: Specify a Python function to be called after
216 212 the changesets are calculated from the CVS log. The
217 213 function is passed a list with the changeset entries, and can
218 214 modify the changesets in-place, or add or delete them.
219 215
220 216 An additional "debugcvsps" Mercurial command allows the builtin
221 217 changeset merging code to be run without doing a conversion. Its
222 218 parameters and output are similar to that of cvsps 2.1. Please see
223 219 the command help for more details.
224 220
225 221 Subversion Source
226 222 #################
227 223
228 224 Subversion source detects classical trunk/branches/tags layouts.
229 225 By default, the supplied ``svn://repo/path/`` source URL is
230 226 converted as a single branch. If ``svn://repo/path/trunk`` exists
231 227 it replaces the default branch. If ``svn://repo/path/branches``
232 228 exists, its subdirectories are listed as possible branches. If
233 229 ``svn://repo/path/tags`` exists, it is looked for tags referencing
234 230 converted branches. Default ``trunk``, ``branches`` and ``tags``
235 231 values can be overridden with following options. Set them to paths
236 232 relative to the source URL, or leave them blank to disable auto
237 233 detection.
238 234
239 235 The following options can be set with ``--config``:
240 236
241 237 :convert.svn.branches: specify the directory containing branches.
242 238 The default is ``branches``.
243 239
244 240 :convert.svn.tags: specify the directory containing tags. The
245 241 default is ``tags``.
246 242
247 243 :convert.svn.trunk: specify the name of the trunk branch. The
248 244 default is ``trunk``.
249 245
250 246 :convert.localtimezone: use local time (as determined by the TZ
251 247 environment variable) for changeset date/times. The default
252 248 is False (use UTC).
253 249
254 250 Source history can be retrieved starting at a specific revision,
255 251 instead of being integrally converted. Only single branch
256 252 conversions are supported.
257 253
258 254 :convert.svn.startrev: specify start Subversion revision number.
259 255 The default is 0.
260 256
261 257 Perforce Source
262 258 ###############
263 259
264 260 The Perforce (P4) importer can be given a p4 depot path or a
265 261 client specification as source. It will convert all files in the
266 262 source to a flat Mercurial repository, ignoring labels, branches
267 263 and integrations. Note that when a depot path is given you then
268 264 usually should specify a target directory, because otherwise the
269 265 target may be named ``...-hg``.
270 266
271 267 It is possible to limit the amount of source history to be
272 268 converted by specifying an initial Perforce revision:
273 269
274 270 :convert.p4.startrev: specify initial Perforce revision (a
275 271 Perforce changelist number).
276 272
277 273 Mercurial Destination
278 274 #####################
279 275
280 276 The following options are supported:
281 277
282 278 :convert.hg.clonebranches: dispatch source branches in separate
283 279 clones. The default is False.
284 280
285 281 :convert.hg.tagsbranch: branch name for tag revisions, defaults to
286 282 ``default``.
287 283
288 284 :convert.hg.usebranchnames: preserve branch names. The default is
289 285 True.
290 286 """
291 287 return convcmd.convert(ui, src, dest, revmapfile, **opts)
292 288
293 289 def debugsvnlog(ui, **opts):
294 290 return subversion.debugsvnlog(ui, **opts)
295 291
296 292 def debugcvsps(ui, *args, **opts):
297 293 '''create changeset information from CVS
298 294
299 295 This command is intended as a debugging tool for the CVS to
300 296 Mercurial converter, and can be used as a direct replacement for
301 297 cvsps.
302 298
303 299 Hg debugcvsps reads the CVS rlog for current directory (or any
304 300 named directory) in the CVS repository, and converts the log to a
305 301 series of changesets based on matching commit log entries and
306 302 dates.'''
307 303 return cvsps.debugcvsps(ui, *args, **opts)
308 304
309 305 commands.norepo += " convert debugsvnlog debugcvsps"
310 306
311 307 cmdtable = {
312 308 "convert":
313 309 (convert,
314 310 [('', 'authors', '',
315 311 _('username mapping filename (DEPRECATED, use --authormap instead)'),
316 312 _('FILE')),
317 313 ('s', 'source-type', '',
318 314 _('source repository type'), _('TYPE')),
319 315 ('d', 'dest-type', '',
320 316 _('destination repository type'), _('TYPE')),
321 317 ('r', 'rev', '',
322 318 _('import up to source revision REV'), _('REV')),
323 319 ('A', 'authormap', '',
324 320 _('remap usernames using this file'), _('FILE')),
325 321 ('', 'filemap', '',
326 322 _('remap file names using contents of file'), _('FILE')),
327 323 ('', 'splicemap', '',
328 324 _('splice synthesized history into place'), _('FILE')),
329 325 ('', 'branchmap', '',
330 326 _('change branch names while converting'), _('FILE')),
331 327 ('', 'closemap', '',
332 328 _('closes given revs'), _('FILE')),
333 ('', 'tagmap', '',
334 _('change tag names while converting'), _('FILE')),
335 329 ('', 'branchsort', None, _('try to sort changesets by branches')),
336 330 ('', 'datesort', None, _('try to sort changesets by date')),
337 331 ('', 'sourcesort', None, _('preserve source changesets order')),
338 332 ('', 'closesort', None, _('try to reorder closed revisions'))],
339 333 _('hg convert [OPTION]... SOURCE [DEST [REVMAP]]')),
340 334 "debugsvnlog":
341 335 (debugsvnlog,
342 336 [],
343 337 'hg debugsvnlog'),
344 338 "debugcvsps":
345 339 (debugcvsps,
346 340 [
347 341 # Main options shared with cvsps-2.1
348 342 ('b', 'branches', [], _('only return changes on specified branches')),
349 343 ('p', 'prefix', '', _('prefix to remove from file names')),
350 344 ('r', 'revisions', [],
351 345 _('only return changes after or between specified tags')),
352 346 ('u', 'update-cache', None, _("update cvs log cache")),
353 347 ('x', 'new-cache', None, _("create new cvs log cache")),
354 348 ('z', 'fuzz', 60, _('set commit time fuzz in seconds')),
355 349 ('', 'root', '', _('specify cvsroot')),
356 350 # Options specific to builtin cvsps
357 351 ('', 'parents', '', _('show parent changesets')),
358 352 ('', 'ancestors', '',
359 353 _('show current changeset in ancestor branches')),
360 354 # Options that are ignored for compatibility with cvsps-2.1
361 355 ('A', 'cvs-direct', None, _('ignored for compatibility')),
362 356 ],
363 357 _('hg debugcvsps [OPTION]... [PATH]...')),
364 358 }
365 359
366 360 def kwconverted(ctx, name):
367 361 rev = ctx.extra().get('convert_revision', '')
368 362 if rev.startswith('svn:'):
369 363 if name == 'svnrev':
370 364 return str(subversion.revsplit(rev)[2])
371 365 elif name == 'svnpath':
372 366 return subversion.revsplit(rev)[1]
373 367 elif name == 'svnuuid':
374 368 return subversion.revsplit(rev)[0]
375 369 return rev
376 370
377 371 def kwsvnrev(repo, ctx, **args):
378 372 """:svnrev: String. Converted subversion revision number."""
379 373 return kwconverted(ctx, 'svnrev')
380 374
381 375 def kwsvnpath(repo, ctx, **args):
382 376 """:svnpath: String. Converted subversion revision project path."""
383 377 return kwconverted(ctx, 'svnpath')
384 378
385 379 def kwsvnuuid(repo, ctx, **args):
386 380 """:svnuuid: String. Converted subversion revision repository identifier."""
387 381 return kwconverted(ctx, 'svnuuid')
388 382
389 383 def extsetup(ui):
390 384 templatekw.keywords['svnrev'] = kwsvnrev
391 385 templatekw.keywords['svnpath'] = kwsvnpath
392 386 templatekw.keywords['svnuuid'] = kwsvnuuid
393 387
394 388 # tell hggettext to extract docstrings from these functions:
395 389 i18nfunctions = [kwsvnrev, kwsvnpath, kwsvnuuid]
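Reviewer note: the filemap rule stated in the `convert` docstring above (the `include` or `exclude` directive with the longest matching path applies, so line order does not matter) can be sketched as follows. This is a minimal illustration under stated assumptions, not the extension's actual filemap code: `parse_filemap`, `longest_match` and `is_included` are hypothetical names, `rename` directives are skipped, and a tie between an identical `include` and `exclude` path is resolved as excluded purely for the sake of the example.

```python
def parse_filemap(text):
    """Collect include/exclude paths from filemap text (hypothetical helper)."""
    includes, excludes = set(), set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # blank lines and comment lines are skipped
        directive, _sep, path = line.partition(' ')
        path = path.strip()
        if directive == 'include':
            includes.add(path)
        elif directive == 'exclude':
            excludes.add(path)
        # 'rename' directives are deliberately ignored in this sketch
    return includes, excludes


def longest_match(path, candidates):
    """Length of the longest candidate that equals path or is one of its
    parent directories; -1 if no candidate matches."""
    best = -1
    for cand in candidates:
        if path == cand or path.startswith(cand + '/'):
            best = max(best, len(cand))
    return best


def is_included(path, includes, excludes):
    """Apply the longest-matching-path rule to a relative file name."""
    inc = longest_match(path, includes)
    exc = longest_match(path, excludes)
    if not includes:
        # No include statements: everything is included unless excluded.
        return exc < 0
    # With include statements, nothing else is included, and the longer
    # of the matching include/exclude paths decides.
    return inc >= 0 and inc > exc
```

For instance, with `include src`, `exclude src/vendor` and `include src/vendor/keep`, a file under `src/vendor/keep` is kept while other files under `src/vendor` are dropped, regardless of the order in which the three lines appear.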
@@ -1,444 +1,443 b''
1 1 # common.py - common code for the convert extension
2 2 #
3 3 # Copyright 2005-2009 Matt Mackall <mpm@selenic.com> and others
4 4 #
5 5 # This software may be used and distributed according to the terms of the
6 6 # GNU General Public License version 2 or any later version.
7 7
8 8 import base64, errno, subprocess, os, datetime, re
9 9 import cPickle as pickle
10 10 from mercurial import util
11 11 from mercurial.i18n import _
12 12
13 13 propertycache = util.propertycache
14 14
15 15 def encodeargs(args):
16 16 def encodearg(s):
17 17 lines = base64.encodestring(s)
18 18 lines = [l.splitlines()[0] for l in lines]
19 19 return ''.join(lines)
20 20
21 21 s = pickle.dumps(args)
22 22 return encodearg(s)
23 23
24 24 def decodeargs(s):
25 25 s = base64.decodestring(s)
26 26 return pickle.loads(s)
27 27
28 28 class MissingTool(Exception):
29 29 pass
30 30
31 31 def checktool(exe, name=None, abort=True):
32 32 name = name or exe
33 33 if not util.findexe(exe):
34 34 exc = abort and util.Abort or MissingTool
35 35 raise exc(_('cannot find required "%s" tool') % name)
36 36
37 37 class NoRepo(Exception):
38 38 pass
39 39
40 40 SKIPREV = 'SKIP'
41 41
42 42 class commit(object):
43 43 def __init__(self, author, date, desc, parents, branch=None, rev=None,
44 44 extra={}, sortkey=None):
45 45 self.author = author or 'unknown'
46 46 self.date = date or '0 0'
47 47 self.desc = desc
48 48 self.parents = parents
49 49 self.branch = branch
50 50 self.rev = rev
51 51 self.extra = extra
52 52 self.sortkey = sortkey
53 53
54 54 class converter_source(object):
55 55 """Conversion source interface"""
56 56
57 57 def __init__(self, ui, path=None, rev=None):
58 58 """Initialize conversion source (or raise NoRepo("message")
59 59 exception if path is not a valid repository)"""
60 60 self.ui = ui
61 61 self.path = path
62 62 self.rev = rev
63 63
64 64 self.encoding = 'utf-8'
65 65
66 66 def checkhexformat(self, revstr, mapname='splicemap'):
67 67 """ fails if revstr is not a 40 byte hex. mercurial and git both uses
68 68 such format for their revision numbering
69 69 """
70 70 if not re.match(r'[0-9a-fA-F]{40,40}$', revstr):
71 71 raise util.Abort(_('%s entry %s is not a valid revision'
72 72 ' identifier') % (mapname, revstr))
73 73
74 74 def before(self):
75 75 pass
76 76
77 77 def after(self):
78 78 pass
79 79
80 80 def setrevmap(self, revmap):
81 81 """set the map of already-converted revisions"""
82 82 pass
83 83
84 84 def getheads(self):
85 85 """Return a list of this repository's heads"""
86 86 raise NotImplementedError
87 87
88 88 def getfile(self, name, rev):
89 89 """Return a pair (data, mode) where data is the file content
90 90 as a string and mode one of '', 'x' or 'l'. rev is the
91 91 identifier returned by a previous call to getchanges(). Raise
92 92 IOError to indicate that name was deleted in rev.
93 93 """
94 94 raise NotImplementedError
95 95
96 96 def getchanges(self, version):
97 97 """Returns a tuple of (files, copies).
98 98
99 99 files is a sorted list of (filename, id) tuples for all files
100 100 changed between version and its first parent returned by
101 101 getcommit(). id is the source revision id of the file.
102 102
103 103 copies is a dictionary of dest: source
104 104 """
105 105 raise NotImplementedError
106 106
107 107 def getcommit(self, version):
108 108 """Return the commit object for version"""
109 109 raise NotImplementedError
110 110
111 111 def gettags(self):
112 112 """Return the tags as a dictionary of name: revision
113 113
114 114 Tag names must be UTF-8 strings.
115 115 """
116 116 raise NotImplementedError
117 117
118 118 def recode(self, s, encoding=None):
119 119 if not encoding:
120 120 encoding = self.encoding or 'utf-8'
121 121
122 122 if isinstance(s, unicode):
123 123 return s.encode("utf-8")
124 124 try:
125 125 return s.decode(encoding).encode("utf-8")
126 126 except UnicodeError:
127 127 try:
128 128 return s.decode("latin-1").encode("utf-8")
129 129 except UnicodeError:
130 130 return s.decode(encoding, "replace").encode("utf-8")
131 131
132 132 def getchangedfiles(self, rev, i):
133 133 """Return the files changed by rev compared to parent[i].
134 134
135 135 i is an index selecting one of the parents of rev. The return
136 136 value should be the list of files that are different in rev and
137 137 this parent.
138 138
139 139 If rev has no parents, i is None.
140 140
141 141 This function is only needed to support --filemap
142 142 """
143 143 raise NotImplementedError
144 144
145 145 def converted(self, rev, sinkrev):
146 146 '''Notify the source that a revision has been converted.'''
147 147 pass
148 148
149 149 def hasnativeorder(self):
150 150 """Return true if this source has a meaningful, native revision
151 151 order. For instance, Mercurial revisions are store sequentially
152 152 while there is no such global ordering with Darcs.
153 153 """
154 154 return False
155 155
156 156 def hasnativeclose(self):
157 157 """Return true if this source has ability to close branch.
158 158 """
159 159 return False
160 160
161 161 def lookuprev(self, rev):
162 162 """If rev is a meaningful revision reference in source, return
163 163 the referenced identifier in the same format used by getcommit().
164 164 return None otherwise.
165 165 """
166 166 return None
167 167
168 168 def getbookmarks(self):
169 169 """Return the bookmarks as a dictionary of name: revision
170 170
171 171 Bookmark names are to be UTF-8 strings.
172 172 """
173 173 return {}
174 174
175 175 def checkrevformat(self, revstr, mapname='splicemap'):
176 176 """revstr is a string that describes a revision in the given
177 177 source control system. Return true if revstr has correct
178 178 format.
179 179 """
180 180 return True
181 181
182 182 class converter_sink(object):
183 183 """Conversion sink (target) interface"""
184 184
185 185 def __init__(self, ui, path):
186 186 """Initialize conversion sink (or raise NoRepo("message")
187 187 exception if path is not a valid repository)
188 188
189 189 created is a list of paths to remove if a fatal error occurs
190 190 later"""
191 191 self.ui = ui
192 192 self.path = path
193 193 self.created = []
194 194
195 195 def revmapfile(self):
196 196 """Path to a file that will contain lines
197 197 source_rev_id sink_rev_id
198 198 mapping equivalent revision identifiers for each system."""
199 199 raise NotImplementedError
200 200
201 201 def authorfile(self):
202 202 """Path to a file that will contain lines
203 203 srcauthor=dstauthor
204 204 mapping equivalent authors identifiers for each system."""
205 205 return None
206 206
207 def putcommit(self, files, copies, parents, commit, source,
208 revmap, tagmap):
207 def putcommit(self, files, copies, parents, commit, source, revmap):
209 208 """Create a revision with all changed files listed in 'files'
210 209 and having listed parents. 'commit' is a commit object
211 210 containing at a minimum the author, date, and message for this
212 211 changeset. 'files' is a list of (path, version) tuples,
213 212 'copies' is a dictionary mapping destinations to sources,
214 213 'source' is the source repository, and 'revmap' is a mapfile
215 214 of source revisions to converted revisions. Only getfile() and
216 215 lookuprev() should be called on 'source'.
217 216
218 217 Note that the sink repository is not told to update itself to
219 218 a particular revision (or even what that revision would be)
220 219 before it receives the file data.
221 220 """
222 221 raise NotImplementedError
223 222
224 223 def puttags(self, tags):
225 224 """Put tags into sink.
226 225
227 226 tags: {tagname: sink_rev_id, ...} where tagname is an UTF-8 string.
228 227 Return a pair (tag_revision, tag_parent_revision), or (None, None)
229 228 if nothing was changed.
230 229 """
231 230 raise NotImplementedError
232 231
233 232 def setbranch(self, branch, pbranches):
234 233 """Set the current branch name. Called before the first putcommit
235 234 on the branch.
236 235 branch: branch name for subsequent commits
237 236 pbranches: (converted parent revision, parent branch) tuples"""
238 237 pass
239 238
240 239 def setfilemapmode(self, active):
241 240 """Tell the destination that we're using a filemap
242 241
243 242 Some converter_sources (svn in particular) can claim that a file
244 243 was changed in a revision, even if there was no change. This method
245 244 tells the destination that we're using a filemap and that it should
246 245 filter empty revisions.
247 246 """
248 247 pass
249 248
250 249 def before(self):
251 250 pass
252 251
253 252 def after(self):
254 253 pass
255 254
256 255 def putbookmarks(self, bookmarks):
257 256 """Put bookmarks into sink.
258 257
259 258 bookmarks: {bookmarkname: sink_rev_id, ...}
260 259 where bookmarkname is an UTF-8 string.
261 260 """
262 261 pass
263 262
264 263 def hascommit(self, rev):
265 264 """Return True if the sink contains rev"""
266 265 raise NotImplementedError
267 266
268 267 class commandline(object):
269 268 def __init__(self, ui, command):
270 269 self.ui = ui
271 270 self.command = command
272 271
273 272 def prerun(self):
274 273 pass
275 274
276 275 def postrun(self):
277 276 pass
278 277
279 278 def _cmdline(self, cmd, *args, **kwargs):
280 279 cmdline = [self.command, cmd] + list(args)
281 280 for k, v in kwargs.iteritems():
282 281 if len(k) == 1:
283 282 cmdline.append('-' + k)
284 283 else:
285 284 cmdline.append('--' + k.replace('_', '-'))
286 285 try:
287 286 if len(k) == 1:
288 287 cmdline.append('' + v)
289 288 else:
290 289 cmdline[-1] += '=' + v
291 290 except TypeError:
292 291 pass
293 292 cmdline = [util.shellquote(arg) for arg in cmdline]
294 293 if not self.ui.debugflag:
295 294 cmdline += ['2>', os.devnull]
296 295 cmdline = ' '.join(cmdline)
297 296 return cmdline
298 297
299 298 def _run(self, cmd, *args, **kwargs):
300 299 def popen(cmdline):
301 300 p = subprocess.Popen(cmdline, shell=True, bufsize=-1,
302 301 close_fds=util.closefds,
303 302 stdout=subprocess.PIPE)
304 303 return p
305 304 return self._dorun(popen, cmd, *args, **kwargs)
306 305
307 306 def _run2(self, cmd, *args, **kwargs):
308 307 return self._dorun(util.popen2, cmd, *args, **kwargs)
309 308
310 309 def _dorun(self, openfunc, cmd, *args, **kwargs):
311 310 cmdline = self._cmdline(cmd, *args, **kwargs)
312 311 self.ui.debug('running: %s\n' % (cmdline,))
313 312 self.prerun()
314 313 try:
315 314 return openfunc(cmdline)
316 315 finally:
317 316 self.postrun()
318 317
319 318 def run(self, cmd, *args, **kwargs):
320 319 p = self._run(cmd, *args, **kwargs)
321 320 output = p.communicate()[0]
322 321 self.ui.debug(output)
323 322 return output, p.returncode
324 323
325 324 def runlines(self, cmd, *args, **kwargs):
326 325 p = self._run(cmd, *args, **kwargs)
327 326 output = p.stdout.readlines()
328 327 p.wait()
329 328 self.ui.debug(''.join(output))
330 329 return output, p.returncode
331 330
332 331 def checkexit(self, status, output=''):
333 332 if status:
334 333 if output:
335 334 self.ui.warn(_('%s error:\n') % self.command)
336 335 self.ui.warn(output)
337 336 msg = util.explainexit(status)[0]
338 337 raise util.Abort('%s %s' % (self.command, msg))
339 338
340 339 def run0(self, cmd, *args, **kwargs):
341 340 output, status = self.run(cmd, *args, **kwargs)
342 341 self.checkexit(status, output)
343 342 return output
344 343
345 344 def runlines0(self, cmd, *args, **kwargs):
346 345 output, status = self.runlines(cmd, *args, **kwargs)
347 346 self.checkexit(status, ''.join(output))
348 347 return output
349 348
350 349 @propertycache
351 350 def argmax(self):
352 351 # POSIX requires at least 4096 bytes for ARG_MAX
353 352 argmax = 4096
354 353 try:
355 354 argmax = os.sysconf("SC_ARG_MAX")
356 355 except (AttributeError, ValueError):
357 356 pass
358 357
359 358 # Windows shells impose their own limits on command line length,
360 359 # down to 2047 bytes for cmd.exe under Windows NT/2k and 2500 bytes
361 360 # for older 4nt.exe. See http://support.microsoft.com/kb/830473 for
362 361 # details about cmd.exe limitations.
363 362
364 363 # Since ARG_MAX is for command line _and_ environment, lower our limit
365 364 # (and make happy Windows shells while doing this).
366 365 return argmax // 2 - 1
367 366
368 367 def _limit_arglist(self, arglist, cmd, *args, **kwargs):
369 368 cmdlen = len(self._cmdline(cmd, *args, **kwargs))
370 369 limit = self.argmax - cmdlen
371 370 bytes = 0
372 371 fl = []
373 372 for fn in arglist:
374 373 b = len(fn) + 3
375 374 if bytes + b < limit or len(fl) == 0:
376 375 fl.append(fn)
377 376 bytes += b
378 377 else:
379 378 yield fl
380 379 fl = [fn]
381 380 bytes = b
382 381 if fl:
383 382 yield fl
384 383
385 384 def xargs(self, arglist, cmd, *args, **kwargs):
386 385 for l in self._limit_arglist(arglist, cmd, *args, **kwargs):
387 386 self.run0(cmd, *(list(args) + l), **kwargs)
388 387
389 388 class mapfile(dict):
390 389 def __init__(self, ui, path):
391 390 super(mapfile, self).__init__()
392 391 self.ui = ui
393 392 self.path = path
394 393 self.fp = None
395 394 self.order = []
396 395 self._read()
397 396
398 397 def _read(self):
399 398 if not self.path:
400 399 return
401 400 try:
402 401 fp = open(self.path, 'r')
403 402 except IOError, err:
404 403 if err.errno != errno.ENOENT:
405 404 raise
406 405 return
407 406 for i, line in enumerate(fp):
408 407 line = line.splitlines()[0].rstrip()
409 408 if not line:
410 409 # Ignore blank lines
411 410 continue
412 411 try:
413 412 key, value = line.rsplit(' ', 1)
414 413 except ValueError:
415 414 raise util.Abort(
416 415 _('syntax error in %s(%d): key/value pair expected')
417 416 % (self.path, i + 1))
418 417 if key not in self:
419 418 self.order.append(key)
420 419 super(mapfile, self).__setitem__(key, value)
421 420 fp.close()
422 421
423 422 def __setitem__(self, key, value):
424 423 if self.fp is None:
425 424 try:
426 425 self.fp = open(self.path, 'a')
427 426 except IOError, err:
428 427 raise util.Abort(_('could not open map file %r: %s') %
429 428 (self.path, err.strerror))
430 429 self.fp.write('%s %s\n' % (key, value))
431 430 self.fp.flush()
432 431 super(mapfile, self).__setitem__(key, value)
433 432
434 433 def close(self):
435 434 if self.fp:
436 435 self.fp.close()
437 436 self.fp = None
438 437
439 438 def makedatetimestamp(t):
440 439 """Like util.makedate() but for time t instead of current time"""
441 440 delta = (datetime.datetime.utcfromtimestamp(t) -
442 441 datetime.datetime.fromtimestamp(t))
443 442 tz = delta.days * 86400 + delta.seconds
444 443 return t, tz
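Review note on the `mapfile` class above: its on-disk format is one `key value` pair per line, split on the last space, with blank lines skipped and first-seen key order recorded. A minimal standalone sketch of that parsing rule follows. The helper name `parse_map_lines` and its `ValueError` are assumptions made for illustration (the extension itself raises `util.Abort`), and the line-terminator handling is simplified to `rstrip()`.

```python
def parse_map_lines(lines, path="<map>"):
    """Parse 'key value' lines roughly the way mapfile._read() does.

    Hypothetical helper, not part of the convert extension.
    Returns (mapping, order): the key/value dict and the keys in
    first-seen order.
    """
    mapping = {}
    order = []
    for i, line in enumerate(lines):
        # Drop the line terminator and any trailing whitespace.
        line = line.rstrip()
        if not line:
            # Blank lines are ignored.
            continue
        try:
            # Split on the LAST space, so a key may itself contain spaces.
            key, value = line.rsplit(' ', 1)
        except ValueError:
            raise ValueError('syntax error in %s(%d): key/value pair expected'
                             % (path, i + 1))
        if key not in mapping:
            order.append(key)
        # A later line for the same key overrides the earlier value.
        mapping[key] = value
    return mapping, order
```

Because `__setitem__` in the class only ever appends to the file, a key can appear several times on disk. Letting the last value win on read, as above, is what makes the append-only scheme work for incremental conversions.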
@@ -1,571 +1,567 b''
1 1 # convcmd - convert extension commands definition
2 2 #
3 3 # Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
4 4 #
5 5 # This software may be used and distributed according to the terms of the
6 6 # GNU General Public License version 2 or any later version.
7 7
8 8 from common import NoRepo, MissingTool, SKIPREV, mapfile
9 9 from cvs import convert_cvs
10 10 from darcs import darcs_source
11 11 from git import convert_git
12 12 from hg import mercurial_source, mercurial_sink
13 13 from subversion import svn_source, svn_sink
14 14 from monotone import monotone_source
15 15 from gnuarch import gnuarch_source
16 16 from bzr import bzr_source
17 17 from p4 import p4_source
18 18 import filemap
19 19
20 20 import os, shutil, shlex
21 21 from mercurial import hg, util, encoding
22 22 from mercurial.i18n import _
23 23
24 24 orig_encoding = 'ascii'
25 25
26 26 def recode(s):
27 27 if isinstance(s, unicode):
28 28 return s.encode(orig_encoding, 'replace')
29 29 else:
30 30 return s.decode('utf-8').encode(orig_encoding, 'replace')
31 31
32 32 source_converters = [
33 33 ('cvs', convert_cvs, 'branchsort'),
34 34 ('git', convert_git, 'branchsort'),
35 35 ('svn', svn_source, 'branchsort'),
36 36 ('hg', mercurial_source, 'sourcesort'),
37 37 ('darcs', darcs_source, 'branchsort'),
38 38 ('mtn', monotone_source, 'branchsort'),
39 39 ('gnuarch', gnuarch_source, 'branchsort'),
40 40 ('bzr', bzr_source, 'branchsort'),
41 41 ('p4', p4_source, 'branchsort'),
42 42 ]
43 43
44 44 sink_converters = [
45 45 ('hg', mercurial_sink),
46 46 ('svn', svn_sink),
47 47 ]
48 48
49 49 def convertsource(ui, path, type, rev):
50 50 exceptions = []
51 51 if type and type not in [s[0] for s in source_converters]:
52 52 raise util.Abort(_('%s: invalid source repository type') % type)
53 53 for name, source, sortmode in source_converters:
54 54 try:
55 55 if not type or name == type:
56 56 return source(ui, path, rev), sortmode
57 57 except (NoRepo, MissingTool), inst:
58 58 exceptions.append(inst)
59 59 if not ui.quiet:
60 60 for inst in exceptions:
61 61 ui.write("%s\n" % inst)
62 62 raise util.Abort(_('%s: missing or unsupported repository') % path)
63 63
64 64 def convertsink(ui, path, type):
65 65 if type and type not in [s[0] for s in sink_converters]:
66 66 raise util.Abort(_('%s: invalid destination repository type') % type)
67 67 for name, sink in sink_converters:
68 68 try:
69 69 if not type or name == type:
70 70 return sink(ui, path)
71 71 except NoRepo, inst:
72 72 ui.note(_("convert: %s\n") % inst)
73 73 except MissingTool, inst:
74 74 raise util.Abort('%s\n' % inst)
75 75 raise util.Abort(_('%s: unknown repository type') % path)
76 76
77 77 class progresssource(object):
78 78 def __init__(self, ui, source, filecount):
79 79 self.ui = ui
80 80 self.source = source
81 81 self.filecount = filecount
82 82 self.retrieved = 0
83 83
84 84 def getfile(self, file, rev):
85 85 self.retrieved += 1
86 86 self.ui.progress(_('getting files'), self.retrieved,
87 87 item=file, total=self.filecount)
88 88 return self.source.getfile(file, rev)
89 89
90 90 def lookuprev(self, rev):
91 91 return self.source.lookuprev(rev)
92 92
93 93 def close(self):
94 94 self.ui.progress(_('getting files'), None)
95 95
96 96 class converter(object):
97 97 def __init__(self, ui, source, dest, revmapfile, opts):
98 98
99 99 self.source = source
100 100 self.dest = dest
101 101 self.ui = ui
102 102 self.opts = opts
103 103 self.commitcache = {}
104 104 self.authors = {}
105 105 self.authorfile = None
106 106
107 107 # Record converted revisions persistently: maps source revision
108 108 # ID to target revision ID (both strings). (This is how
109 109 # incremental conversions work.)
110 110 self.map = mapfile(ui, revmapfile)
111 111
112 112 # Read first the dst author map if any
113 113 authorfile = self.dest.authorfile()
114 114 if authorfile and os.path.exists(authorfile):
115 115 self.readauthormap(authorfile)
116 116 # Extend/Override with new author map if necessary
117 117 if opts.get('authormap'):
118 118 self.readauthormap(opts.get('authormap'))
119 119 self.authorfile = self.dest.authorfile()
120 120
121 121 self.splicemap = self.parsesplicemap(opts.get('splicemap'))
122 122 self.branchmap = mapfile(ui, opts.get('branchmap'))
123 123 self.closemap = self.parseclosemap(opts.get('closemap'))
124 - self.tagmap = mapfile(ui, opts.get('tagmap'))
125 124
126 125 def parseclosemap(self, path):
127 126 """ check and validate the closemap format and
128 127 return a list of revs to close.
129 128 Format checking has two parts.
130 129 1. generic format which is same across all source types
131 130 2. specific format checking which may be different for
132 131 different source type. This logic is implemented in
133 132 checkrevformat function in source files like
134 133 hg.py, subversion.py etc.
135 134 """
136 135
137 136 if not path:
138 137 return []
139 138 m = []
140 139 try:
141 140 fp = open(path, 'r')
142 141 for i, line in enumerate(fp):
143 142 line = line.splitlines()[0].rstrip()
144 143 if not line:
145 144 # Ignore blank lines
146 145 continue
147 146 # split line
148 147 lex = shlex.shlex(line, posix=True)
149 148 lex.whitespace_split = True
150 149 lex.whitespace += ','
151 150 line = list(lex)
152 151 for part in line:
153 152 self.source.checkrevformat(part, 'closemap')
154 153 m.extend(line)
155 154 # if file does not exist or error reading, exit
156 155 except IOError:
157 156 raise util.Abort(_('closemap file not found or error reading %s:')
158 157 % path)
159 158 return m
160 159
161 160 def parsesplicemap(self, path):
162 161 """ check and validate the splicemap format and
163 162 return a child/parents dictionary.
164 163 Format checking has two parts.
165 164 1. generic format which is same across all source types
166 165 2. specific format checking which may be different for
167 166 different source type. This logic is implemented in
168 167 checkrevformat function in source files like
169 168 hg.py, subversion.py etc.
170 169 """
171 170
172 171 if not path:
173 172 return {}
174 173 m = {}
175 174 try:
176 175 fp = open(path, 'r')
177 176 for i, line in enumerate(fp):
178 177 line = line.splitlines()[0].rstrip()
179 178 if not line:
180 179 # Ignore blank lines
181 180 continue
182 181 # split line
183 182 lex = shlex.shlex(line, posix=True)
184 183 lex.whitespace_split = True
185 184 lex.whitespace += ','
186 185 line = list(lex)
187 186 # check number of parents
188 187 if not (2 <= len(line) <= 3):
189 188 raise util.Abort(_('syntax error in %s(%d): child parent1'
190 189 '[,parent2] expected') % (path, i + 1))
191 190 for part in line:
192 191 self.source.checkrevformat(part)
193 192 child, p1, p2 = line[0], line[1:2], line[2:]
194 193 if p1 == p2:
195 194 m[child] = p1
196 195 else:
197 196 m[child] = p1 + p2
198 197 # if file does not exist or error reading, exit
199 198 except IOError:
200 199 raise util.Abort(_('splicemap file not found or error reading %s:')
201 200 % path)
202 201 return m
203 202
204 203
205 204 def walktree(self, heads):
206 205 '''Return a mapping that identifies the uncommitted parents of every
207 206 uncommitted changeset.'''
208 207 visit = heads
209 208 known = set()
210 209 parents = {}
211 210 while visit:
212 211 n = visit.pop(0)
213 212 if n in known or n in self.map:
214 213 continue
215 214 known.add(n)
216 215 self.ui.progress(_('scanning'), len(known), unit=_('revisions'))
217 216 commit = self.cachecommit(n)
218 217 parents[n] = []
219 218 for p in commit.parents:
220 219 parents[n].append(p)
221 220 visit.append(p)
222 221 self.ui.progress(_('scanning'), None)
223 222
224 223 return parents
225 224
226 225 def mergesplicemap(self, parents, splicemap):
227 226 """A splicemap redefines child/parent relationships. Check the
228 227 map contains valid revision identifiers and merge the new
229 228 links in the source graph.
230 229 """
231 230 for c in sorted(splicemap):
232 231 if c not in parents:
233 232 if not self.dest.hascommit(self.map.get(c, c)):
234 233 # Could be in source but not converted during this run
235 234 self.ui.warn(_('splice map revision %s is not being '
236 235 'converted, ignoring\n') % c)
237 236 continue
238 237 pc = []
239 238 for p in splicemap[c]:
240 239 # We do not have to wait for nodes already in dest.
241 240 if self.dest.hascommit(self.map.get(p, p)):
242 241 continue
243 242 # Parent is not in dest and not being converted, not good
244 243 if p not in parents:
245 244 raise util.Abort(_('unknown splice map parent: %s') % p)
246 245 pc.append(p)
247 246 parents[c] = pc
248 247
249 248 def toposort(self, parents, sortmode):
250 249 '''Return an ordering such that every uncommitted changeset is
251 250 preceded by all its uncommitted ancestors.'''
252 251
253 252 def mapchildren(parents):
254 253 """Return a (children, roots) tuple where 'children' maps parent
255 254 revision identifiers to children ones, and 'roots' is the list of
256 255 revisions without parents. 'parents' must be a mapping of revision
257 256 identifier to its parents ones.
258 257 """
259 258 visit = sorted(parents)
260 259 seen = set()
261 260 children = {}
262 261 roots = []
263 262
264 263 while visit:
265 264 n = visit.pop(0)
266 265 if n in seen:
267 266 continue
268 267 seen.add(n)
269 268 # Ensure that nodes without parents are present in the
270 269 # 'children' mapping.
271 270 children.setdefault(n, [])
272 271 hasparent = False
273 272 for p in parents[n]:
274 273 if p not in self.map:
275 274 visit.append(p)
276 275 hasparent = True
277 276 children.setdefault(p, []).append(n)
278 277 if not hasparent:
279 278 roots.append(n)
280 279
281 280 return children, roots
282 281
283 282 # Sort functions are supposed to take a list of revisions which
284 283 # can be converted immediately and pick one
285 284
286 285 def makebranchsorter():
287 286 """If the previously converted revision has a child in the
288 287 eligible revisions list, pick it. Return the list head
289 288 otherwise. Branch sort attempts to minimize branch
290 289 switching, which is harmful for Mercurial backend
291 290 compression.
292 291 """
293 292 prev = [None]
294 293 def picknext(nodes):
295 294 next = nodes[0]
296 295 for n in nodes:
297 296 if prev[0] in parents[n]:
298 297 next = n
299 298 break
300 299 prev[0] = next
301 300 return next
302 301 return picknext
303 302
304 303 def makesourcesorter():
305 304 """Source specific sort."""
306 305 keyfn = lambda n: self.commitcache[n].sortkey
307 306 def picknext(nodes):
308 307 return sorted(nodes, key=keyfn)[0]
309 308 return picknext
310 309
311 310 def makeclosesorter():
312 311 """Close order sort."""
313 312 keyfn = lambda n: ('close' not in self.commitcache[n].extra,
314 313 self.commitcache[n].sortkey)
315 314 def picknext(nodes):
316 315 return sorted(nodes, key=keyfn)[0]
317 316 return picknext
318 317
319 318 def makedatesorter():
320 319 """Sort revisions by date."""
321 320 dates = {}
322 321 def getdate(n):
323 322 if n not in dates:
324 323 dates[n] = util.parsedate(self.commitcache[n].date)
325 324 return dates[n]
326 325
327 326 def picknext(nodes):
328 327 return min([(getdate(n), n) for n in nodes])[1]
329 328
330 329 return picknext
331 330
332 331 if sortmode == 'branchsort':
333 332 picknext = makebranchsorter()
334 333 elif sortmode == 'datesort':
335 334 picknext = makedatesorter()
336 335 elif sortmode == 'sourcesort':
337 336 picknext = makesourcesorter()
338 337 elif sortmode == 'closesort':
339 338 picknext = makeclosesorter()
340 339 else:
341 340 raise util.Abort(_('unknown sort mode: %s') % sortmode)
342 341
343 342 children, actives = mapchildren(parents)
344 343
345 344 s = []
346 345 pendings = {}
347 346 while actives:
348 347 n = picknext(actives)
349 348 actives.remove(n)
350 349 s.append(n)
351 350
352 351 # Update dependents list
353 352 for c in children.get(n, []):
354 353 if c not in pendings:
355 354 pendings[c] = [p for p in parents[c] if p not in self.map]
356 355 try:
357 356 pendings[c].remove(n)
358 357 except ValueError:
359 358 raise util.Abort(_('cycle detected between %s and %s')
360 359 % (recode(c), recode(n)))
361 360 if not pendings[c]:
362 361 # Parents are converted, node is eligible
363 362 actives.insert(0, c)
364 363 pendings[c] = None
365 364
366 365 if len(s) != len(parents):
367 366 raise util.Abort(_("not all revisions were sorted"))
368 367
369 368 return s
370 369
371 370 def writeauthormap(self):
372 371 authorfile = self.authorfile
373 372 if authorfile:
374 373 self.ui.status(_('writing author map file %s\n') % authorfile)
375 374 ofile = open(authorfile, 'w+')
376 375 for author in self.authors:
377 376 ofile.write("%s=%s\n" % (author, self.authors[author]))
378 377 ofile.close()
379 378
380 379 def readauthormap(self, authorfile):
381 380 afile = open(authorfile, 'r')
382 381 for line in afile:
383 382
384 383 line = line.strip()
385 384 if not line or line.startswith('#'):
386 385 continue
387 386
388 387 try:
389 388 srcauthor, dstauthor = line.split('=', 1)
390 389 except ValueError:
391 390 msg = _('ignoring bad line in author map file %s: %s\n')
392 391 self.ui.warn(msg % (authorfile, line.rstrip()))
393 392 continue
394 393
395 394 srcauthor = srcauthor.strip()
396 395 dstauthor = dstauthor.strip()
397 396 if self.authors.get(srcauthor) in (None, dstauthor):
398 397 msg = _('mapping author %s to %s\n')
399 398 self.ui.debug(msg % (srcauthor, dstauthor))
400 399 self.authors[srcauthor] = dstauthor
401 400 continue
402 401
403 402 m = _('overriding mapping for author %s, was %s, will be %s\n')
404 403 self.ui.status(m % (srcauthor, self.authors[srcauthor], dstauthor))
405 404
406 405 afile.close()
407 406
408 407 def cachecommit(self, rev):
409 408 commit = self.source.getcommit(rev)
410 409 commit.author = self.authors.get(commit.author, commit.author)
411 410 # If commit.branch is None, this commit is coming from the source
412 411 # repository's default branch and destined for the default branch in the
413 412 # destination repository. For such commits, passing a literal "None"
414 413 # string to branchmap.get() below allows the user to map "None" to an
415 414 # alternate default branch in the destination repository.
416 415 commit.branch = self.branchmap.get(str(commit.branch), commit.branch)
417 416 self.commitcache[rev] = commit
418 417 return commit
419 418
420 419 def copy(self, rev):
421 420 commit = self.commitcache[rev]
422 421
423 422 changes = self.source.getchanges(rev)
424 423 if isinstance(changes, basestring):
425 424 if changes == SKIPREV:
426 425 dest = SKIPREV
427 426 else:
428 427 dest = self.map[changes]
429 428 self.map[rev] = dest
430 429 return
431 430 files, copies = changes
432 431 pbranches = []
433 432 if commit.parents:
434 433 for prev in commit.parents:
435 434 if prev not in self.commitcache:
436 435 self.cachecommit(prev)
437 436 pbranches.append((self.map[prev],
438 437 self.commitcache[prev].branch))
439 438 self.dest.setbranch(commit.branch, pbranches)
440 439 try:
441 440 parents = self.splicemap[rev]
442 441 self.ui.status(_('spliced in %s as parents of %s\n') %
443 442 (parents, rev))
444 443 parents = [self.map.get(p, p) for p in parents]
445 444 except KeyError:
446 445 parents = [b[0] for b in pbranches]
447 446 source = progresssource(self.ui, self.source, len(files))
448 447 if self.closemap and rev in self.closemap:
449 448 commit.extra['close'] = 1
450 449
451 450 newnode = self.dest.putcommit(files, copies, parents, commit,
452 - source, self.map, self.tagmap)
451 + source, self.map)
453 452 source.close()
454 453 self.source.converted(rev, newnode)
455 454 self.map[rev] = newnode
456 455
457 456 def convert(self, sortmode):
458 457 try:
459 458 self.source.before()
460 459 self.dest.before()
461 460 self.source.setrevmap(self.map)
462 461 self.ui.status(_("scanning source...\n"))
463 462 heads = self.source.getheads()
464 463 parents = self.walktree(heads)
465 464 self.mergesplicemap(parents, self.splicemap)
466 465 self.ui.status(_("sorting...\n"))
467 466 t = self.toposort(parents, sortmode)
468 467 num = len(t)
469 468 c = None
470 469
471 470 self.ui.status(_("converting...\n"))
472 471 for i, c in enumerate(t):
473 472 num -= 1
474 473 desc = self.commitcache[c].desc
475 474 if "\n" in desc:
476 475 desc = desc.splitlines()[0]
477 476 # convert log message to local encoding without using
478 477 # tolocal() because the encoding.encoding convert()
479 478 # uses is 'utf-8'
480 479 self.ui.status("%d %s\n" % (num, recode(desc)))
481 480 self.ui.note(_("source: %s\n") % recode(c))
482 481 self.ui.progress(_('converting'), i, unit=_('revisions'),
483 482 total=len(t))
484 483 self.copy(c)
485 484 self.ui.progress(_('converting'), None)
486 485
487 486 tags = self.source.gettags()
488 - tags = dict((self.tagmap.get(k, k), v)
489 - for k, v in tags.iteritems())
490 -
491 487 ctags = {}
492 488 for k in tags:
493 489 v = tags[k]
494 490 if self.map.get(v, SKIPREV) != SKIPREV:
495 491 ctags[k] = self.map[v]
496 492
497 493 if c and ctags:
498 494 nrev, tagsparent = self.dest.puttags(ctags)
499 495 if nrev and tagsparent:
500 496 # write another hash correspondence to override the previous
501 497 # one so we don't end up with extra tag heads
502 498 tagsparents = [e for e in self.map.iteritems()
503 499 if e[1] == tagsparent]
504 500 if tagsparents:
505 501 self.map[tagsparents[0][0]] = nrev
506 502
507 503 bookmarks = self.source.getbookmarks()
508 504 cbookmarks = {}
509 505 for k in bookmarks:
510 506 v = bookmarks[k]
511 507 if self.map.get(v, SKIPREV) != SKIPREV:
512 508 cbookmarks[k] = self.map[v]
513 509
514 510 if c and cbookmarks:
515 511 self.dest.putbookmarks(cbookmarks)
516 512
517 513 self.writeauthormap()
518 514 finally:
519 515 self.cleanup()
520 516
521 517 def cleanup(self):
522 518 try:
523 519 self.dest.after()
524 520 finally:
525 521 self.source.after()
526 522 self.map.close()
527 523
528 524 def convert(ui, src, dest=None, revmapfile=None, **opts):
529 525 global orig_encoding
530 526 orig_encoding = encoding.encoding
531 527 encoding.encoding = 'UTF-8'
532 528
533 529 # support --authors as an alias for --authormap
534 530 if not opts.get('authormap'):
535 531 opts['authormap'] = opts.get('authors')
536 532
537 533 if not dest:
538 534 dest = hg.defaultdest(src) + "-hg"
539 535 ui.status(_("assuming destination %s\n") % dest)
540 536
541 537 destc = convertsink(ui, dest, opts.get('dest_type'))
542 538
543 539 try:
544 540 srcc, defaultsort = convertsource(ui, src, opts.get('source_type'),
545 541 opts.get('rev'))
546 542 except Exception:
547 543 for path in destc.created:
548 544 shutil.rmtree(path, True)
549 545 raise
550 546
551 547 sortmodes = ('branchsort', 'datesort', 'sourcesort', 'closesort')
552 548 sortmode = [m for m in sortmodes if opts.get(m)]
553 549 if len(sortmode) > 1:
554 550 raise util.Abort(_('more than one sort mode specified'))
555 551 sortmode = sortmode and sortmode[0] or defaultsort
556 552 if sortmode == 'sourcesort' and not srcc.hasnativeorder():
557 553 raise util.Abort(_('--sourcesort is not supported by this data source'))
558 554 if sortmode == 'closesort' and not srcc.hasnativeclose():
559 555 raise util.Abort(_('--closesort is not supported by this data source'))
560 556
561 557 fmap = opts.get('filemap')
562 558 if fmap:
563 559 srcc = filemap.filemap_source(ui, srcc, fmap)
564 560 destc.setfilemapmode(True)
565 561
566 562 if not revmapfile:
567 563 revmapfile = destc.revmapfile()
568 564
569 565 c = converter(ui, srcc, destc, revmapfile, opts)
570 566 c.convert(sortmode)
571 567
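Review note on `parsesplicemap` above: each splicemap line is tokenized with `shlex` in POSIX mode, with the comma added to the whitespace set, so `child parent1,parent2` and `child parent1 parent2` are equivalent. The sketch below isolates that step for a single line. The function name `parse_splice_line` and the use of `ValueError` are assumptions for illustration, and the per-source `checkrevformat` validation is left out.

```python
import shlex


def parse_splice_line(line):
    """Split one splicemap line into (child, [parents]).

    Hypothetical helper mirroring the tokenizing step of
    converter.parsesplicemap(); revision-format checks are omitted.
    """
    lex = shlex.shlex(line, posix=True)
    # Split on whitespace only, and treat ',' as whitespace too.
    lex.whitespace_split = True
    lex.whitespace += ','
    parts = list(lex)
    # A line needs a child plus one or two parents.
    if not (2 <= len(parts) <= 3):
        raise ValueError('child parent1[,parent2] expected')
    child, p1, p2 = parts[0], parts[1:2], parts[2:]
    if p1 == p2:
        # Both parents identical: keep a single parent.
        return child, p1
    return child, p1 + p2
```

Slicing (`parts[1:2]`, `parts[2:]`) instead of indexing keeps both parents as lists, so the single-parent case falls out without a separate branch.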
@@ -1,447 +1,446 b''
1 1 # hg.py - hg backend for convert extension
2 2 #
3 3 # Copyright 2005-2009 Matt Mackall <mpm@selenic.com> and others
4 4 #
5 5 # This software may be used and distributed according to the terms of the
6 6 # GNU General Public License version 2 or any later version.
7 7
8 8 # Notes for hg->hg conversion:
9 9 #
10 10 # * Old versions of Mercurial didn't trim the whitespace from the ends
11 11 # of commit messages, but new versions do. Changesets created by
12 12 # those older versions, then converted, may thus have different
13 13 # hashes for changesets that are otherwise identical.
14 14 #
15 15 # * Using "--config convert.hg.saverev=true" will make the source
16 16 # identifier to be stored in the converted revision. This will cause
17 17 # the converted revision to have a different identity than the
18 18 # source.
19 19
20 20
21 21 import os, time, cStringIO
22 22 from mercurial.i18n import _
23 23 from mercurial.node import bin, hex, nullid
24 24 from mercurial import hg, util, context, bookmarks, error, scmutil
25 25
26 26 from common import NoRepo, commit, converter_source, converter_sink
27 27
28 28 import re
29 29 sha1re = re.compile(r'\b[0-9a-f]{6,40}\b')
30 30
31 31 class mercurial_sink(converter_sink):
32 32 def __init__(self, ui, path):
33 33 converter_sink.__init__(self, ui, path)
34 34 self.branchnames = ui.configbool('convert', 'hg.usebranchnames', True)
35 35 self.clonebranches = ui.configbool('convert', 'hg.clonebranches', False)
36 36 self.tagsbranch = ui.config('convert', 'hg.tagsbranch', 'default')
37 37 self.lastbranch = None
38 38 if os.path.isdir(path) and len(os.listdir(path)) > 0:
39 39 try:
40 40 self.repo = hg.repository(self.ui, path)
41 41 if not self.repo.local():
42 42 raise NoRepo(_('%s is not a local Mercurial repository')
43 43 % path)
44 44 except error.RepoError, err:
45 45 ui.traceback()
46 46 raise NoRepo(err.args[0])
47 47 else:
48 48 try:
49 49 ui.status(_('initializing destination %s repository\n') % path)
50 50 self.repo = hg.repository(self.ui, path, create=True)
51 51 if not self.repo.local():
52 52 raise NoRepo(_('%s is not a local Mercurial repository')
53 53 % path)
54 54 self.created.append(path)
55 55 except error.RepoError:
56 56 ui.traceback()
57 57 raise NoRepo(_("could not create hg repository %s as sink")
58 58 % path)
59 59 self.lock = None
60 60 self.wlock = None
61 61 self.filemapmode = False
62 62
63 63 def before(self):
64 64 self.ui.debug('run hg sink pre-conversion action\n')
65 65 self.wlock = self.repo.wlock()
66 66 self.lock = self.repo.lock()
67 67
68 68 def after(self):
69 69 self.ui.debug('run hg sink post-conversion action\n')
70 70 if self.lock:
71 71 self.lock.release()
72 72 if self.wlock:
73 73 self.wlock.release()
74 74
75 75 def revmapfile(self):
76 76 return self.repo.join("shamap")
77 77
78 78 def authorfile(self):
79 79 return self.repo.join("authormap")
80 80
81 81 def setbranch(self, branch, pbranches):
82 82 if not self.clonebranches:
83 83 return
84 84
85 85 setbranch = (branch != self.lastbranch)
86 86 self.lastbranch = branch
87 87 if not branch:
88 88 branch = 'default'
89 89 pbranches = [(b[0], b[1] and b[1] or 'default') for b in pbranches]
90 90 pbranch = pbranches and pbranches[0][1] or 'default'
91 91
92 92 branchpath = os.path.join(self.path, branch)
93 93 if setbranch:
94 94 self.after()
95 95 try:
96 96 self.repo = hg.repository(self.ui, branchpath)
97 97 except Exception:
98 98 self.repo = hg.repository(self.ui, branchpath, create=True)
99 99 self.before()
100 100
101 101 # pbranches may bring revisions from other branches (merge parents)
102 102 # Make sure we have them, or pull them.
103 103 missings = {}
104 104 for b in pbranches:
105 105 try:
106 106 self.repo.lookup(b[0])
107 107 except Exception:
108 108 missings.setdefault(b[1], []).append(b[0])
109 109
110 110 if missings:
111 111 self.after()
112 112 for pbranch, heads in sorted(missings.iteritems()):
113 113 pbranchpath = os.path.join(self.path, pbranch)
114 114 prepo = hg.peer(self.ui, {}, pbranchpath)
115 115 self.ui.note(_('pulling from %s into %s\n') % (pbranch, branch))
116 116 self.repo.pull(prepo, [prepo.lookup(h) for h in heads])
117 117 self.before()
118 118
119 - def _rewritetags(self, source, revmap, tagmap, data):
119 + def _rewritetags(self, source, revmap, data):
120 120 fp = cStringIO.StringIO()
121 121 for line in data.splitlines():
122 122 s = line.split(' ', 1)
123 123 if len(s) != 2:
124 124 continue
125 125 revid = revmap.get(source.lookuprev(s[0]))
126 126 if not revid:
127 127 continue
128 - fp.write('%s %s\n' % (revid, tagmap.get(s[1], s[1])))
128 + fp.write('%s %s\n' % (revid, s[1]))
129 129 return fp.getvalue()
130 130
131 - def putcommit(self, files, copies, parents, commit, source,
132 - revmap, tagmap):
131 + def putcommit(self, files, copies, parents, commit, source, revmap):
133 132
134 133 files = dict(files)
135 134 def getfilectx(repo, memctx, f):
136 135 v = files[f]
137 136 data, mode = source.getfile(f, v)
138 137 if f == '.hgtags':
139 - data = self._rewritetags(source, revmap, tagmap, data)
138 + data = self._rewritetags(source, revmap, data)
140 139 return context.memfilectx(f, data, 'l' in mode, 'x' in mode,
141 140 copies.get(f))
142 141
143 142 pl = []
144 143 for p in parents:
145 144 if p not in pl:
146 145 pl.append(p)
147 146 parents = pl
148 147 nparents = len(parents)
149 148 if self.filemapmode and nparents == 1:
150 149 m1node = self.repo.changelog.read(bin(parents[0]))[0]
151 150 parent = parents[0]
152 151
153 152 if len(parents) < 2:
154 153 parents.append(nullid)
155 154 if len(parents) < 2:
156 155 parents.append(nullid)
157 156 p2 = parents.pop(0)
158 157
159 158 text = commit.desc
160 159
161 160 sha1s = re.findall(sha1re, text)
162 161 for sha1 in sha1s:
163 162 oldrev = source.lookuprev(sha1)
164 163 newrev = revmap.get(oldrev)
165 164 if newrev is not None:
166 165 text = text.replace(sha1, newrev[:len(sha1)])
167 166
168 167 extra = commit.extra.copy()
169 168 if self.branchnames and commit.branch:
170 169 extra['branch'] = commit.branch
171 170 if commit.rev:
172 171 extra['convert_revision'] = commit.rev
173 172
174 173 while parents:
175 174 p1 = p2
176 175 p2 = parents.pop(0)
177 176 ctx = context.memctx(self.repo, (p1, p2), text, files.keys(),
178 177 getfilectx, commit.author, commit.date, extra)
179 178 self.repo.commitctx(ctx)
180 179 text = "(octopus merge fixup)\n"
181 180 p2 = hex(self.repo.changelog.tip())
182 181
183 182 if self.filemapmode and nparents == 1:
184 183 man = self.repo.manifest
185 184 mnode = self.repo.changelog.read(bin(p2))[0]
186 185 closed = 'close' in commit.extra
187 186 if not closed and not man.cmp(m1node, man.revision(mnode)):
188 187 self.ui.status(_("filtering out empty revision\n"))
189 188 self.repo.rollback(force=True)
190 189 return parent
191 190 return p2
192 191
193 192 def puttags(self, tags):
194 193 try:
195 194 parentctx = self.repo[self.tagsbranch]
196 195 tagparent = parentctx.node()
197 196 except error.RepoError:
198 197 parentctx = None
199 198 tagparent = nullid
200 199
201 200 oldlines = set()
202 201 for branch, heads in self.repo.branchmap().iteritems():
203 202 for h in heads:
204 203 if '.hgtags' in self.repo[h]:
205 204 oldlines.update(
206 205 set(self.repo[h]['.hgtags'].data().splitlines(True)))
207 206 oldlines = sorted(list(oldlines))
208 207
209 208 newlines = sorted([("%s %s\n" % (tags[tag], tag)) for tag in tags])
210 209 if newlines == oldlines:
211 210 return None, None
212 211
213 212 # if the old and new tags match, then there is nothing to update
214 213 oldtags = set()
215 214 newtags = set()
216 215 for line in oldlines:
217 216 s = line.strip().split(' ', 1)
218 217 if len(s) != 2:
219 218 continue
220 219 oldtags.add(s[1])
221 220 for line in newlines:
222 221 s = line.strip().split(' ', 1)
223 222 if len(s) != 2:
224 223 continue
225 224 if s[1] not in oldtags:
226 225 newtags.add(s[1].strip())
227 226
228 227 if not newtags:
229 228 return None, None
230 229
231 230 data = "".join(newlines)
232 231 def getfilectx(repo, memctx, f):
233 232 return context.memfilectx(f, data, False, False, None)
234 233
235 234 self.ui.status(_("updating tags\n"))
236 235 date = "%s 0" % int(time.mktime(time.gmtime()))
237 236 extra = {'branch': self.tagsbranch}
238 237 ctx = context.memctx(self.repo, (tagparent, None), "update tags",
239 238 [".hgtags"], getfilectx, "convert-repo", date,
240 239 extra)
241 240 self.repo.commitctx(ctx)
242 241 return hex(self.repo.changelog.tip()), hex(tagparent)
243 242
244 243 def setfilemapmode(self, active):
245 244 self.filemapmode = active
246 245
247 246 def putbookmarks(self, updatedbookmark):
248 247 if not len(updatedbookmark):
249 248 return
250 249
251 250 self.ui.status(_("updating bookmarks\n"))
252 251 destmarks = self.repo._bookmarks
253 252 for bookmark in updatedbookmark:
254 253 destmarks[bookmark] = bin(updatedbookmark[bookmark])
255 254 destmarks.write()
256 255
257 256 def hascommit(self, rev):
258 257 if rev not in self.repo and self.clonebranches:
259 258 raise util.Abort(_('revision %s not found in destination '
260 259 'repository (lookups with clonebranches=true '
261 260 'are not implemented)') % rev)
262 261 return rev in self.repo
263 262
264 263 class mercurial_source(converter_source):
265 264 def __init__(self, ui, path, rev=None):
266 265 converter_source.__init__(self, ui, path, rev)
267 266 self.ignoreerrors = ui.configbool('convert', 'hg.ignoreerrors', False)
268 267 self.ignored = set()
269 268 self.saverev = ui.configbool('convert', 'hg.saverev', False)
270 269 try:
271 270 self.repo = hg.repository(self.ui, path)
272 271 # try to provoke an exception if this isn't really a hg
273 272 # repo, but some other bogus compatible-looking url
274 273 if not self.repo.local():
275 274 raise error.RepoError
276 275 except error.RepoError:
277 276 ui.traceback()
278 277 raise NoRepo(_("%s is not a local Mercurial repository") % path)
279 278 self.lastrev = None
280 279 self.lastctx = None
281 280 self._changescache = None
282 281 self.convertfp = None
283 282 # Restrict converted revisions to startrev descendants
284 283 startnode = ui.config('convert', 'hg.startrev')
285 284 hgrevs = ui.config('convert', 'hg.revs')
286 285 if hgrevs is None:
287 286 if startnode is not None:
288 287 try:
289 288 startnode = self.repo.lookup(startnode)
290 289 except error.RepoError:
291 290 raise util.Abort(_('%s is not a valid start revision')
292 291 % startnode)
293 292 startrev = self.repo.changelog.rev(startnode)
294 293 children = {startnode: 1}
295 294 for r in self.repo.changelog.descendants([startrev]):
296 295 children[self.repo.changelog.node(r)] = 1
297 296 self.keep = children.__contains__
298 297 else:
299 298 self.keep = util.always
300 299 if rev:
301 300 self._heads = [self.repo[rev].node()]
302 301 else:
303 302 self._heads = self.repo.heads()
304 303 else:
305 304 if rev or startnode is not None:
306 305 raise util.Abort(_('hg.revs cannot be combined with '
307 306 'hg.startrev or --rev'))
308 307 nodes = set()
309 308 parents = set()
310 309 for r in scmutil.revrange(self.repo, [hgrevs]):
311 310 ctx = self.repo[r]
312 311 nodes.add(ctx.node())
313 312 parents.update(p.node() for p in ctx.parents())
314 313 self.keep = nodes.__contains__
315 314 self._heads = nodes - parents
316 315
317 316 def changectx(self, rev):
318 317 if self.lastrev != rev:
319 318 self.lastctx = self.repo[rev]
320 319 self.lastrev = rev
321 320 return self.lastctx
322 321
323 322 def parents(self, ctx):
324 323 return [p for p in ctx.parents() if p and self.keep(p.node())]
325 324
326 325 def getheads(self):
327 326 return [hex(h) for h in self._heads if self.keep(h)]
328 327
329 328 def getfile(self, name, rev):
330 329 try:
331 330 fctx = self.changectx(rev)[name]
332 331 return fctx.data(), fctx.flags()
333 332 except error.LookupError, err:
334 333 raise IOError(err)
335 334
336 335 def getchanges(self, rev):
337 336 ctx = self.changectx(rev)
338 337 parents = self.parents(ctx)
339 338 if not parents:
340 339 files = sorted(ctx.manifest())
341 340 # getcopies() is not needed for roots, but it is a simple way to
342 341 # detect missing revlogs and abort on errors or populate
343 342 # self.ignored
344 343 self.getcopies(ctx, parents, files)
345 344 return [(f, rev) for f in files if f not in self.ignored], {}
346 345 if self._changescache and self._changescache[0] == rev:
347 346 m, a, r = self._changescache[1]
348 347 else:
349 348 m, a, r = self.repo.status(parents[0].node(), ctx.node())[:3]
350 349 # getcopies() detects missing revlogs early, run it before
351 350 # filtering the changes.
352 351 copies = self.getcopies(ctx, parents, m + a)
353 352 changes = [(name, rev) for name in m + a + r
354 353 if name not in self.ignored]
355 354 return sorted(changes), copies
356 355
357 356 def getcopies(self, ctx, parents, files):
358 357 copies = {}
359 358 for name in files:
360 359 if name in self.ignored:
361 360 continue
362 361 try:
363 362 copysource, _copynode = ctx.filectx(name).renamed()
364 363 if copysource in self.ignored:
365 364 continue
366 365 # Ignore copy sources not in parent revisions
367 366 found = False
368 367 for p in parents:
369 368 if copysource in p:
370 369 found = True
371 370 break
372 371 if not found:
373 372 continue
374 373 copies[name] = copysource
375 374 except TypeError:
376 375 pass
377 376 except error.LookupError, e:
378 377 if not self.ignoreerrors:
379 378 raise
380 379 self.ignored.add(name)
381 380 self.ui.warn(_('ignoring: %s\n') % e)
382 381 return copies
383 382
384 383 def getcommit(self, rev):
385 384 ctx = self.changectx(rev)
386 385 parents = [p.hex() for p in self.parents(ctx)]
387 386 if self.saverev:
388 387 crev = rev
389 388 else:
390 389 crev = None
391 390 return commit(author=ctx.user(),
392 391 date=util.datestr(ctx.date(), '%Y-%m-%d %H:%M:%S %1%2'),
393 392 desc=ctx.description(), rev=crev, parents=parents,
394 393 branch=ctx.branch(), extra=ctx.extra(),
395 394 sortkey=ctx.rev())
396 395
397 396 def gettags(self):
398 397 tags = [t for t in self.repo.tagslist() if t[0] != 'tip']
399 398 return dict([(name, hex(node)) for name, node in tags
400 399 if self.keep(node)])
401 400
402 401 def getchangedfiles(self, rev, i):
403 402 ctx = self.changectx(rev)
404 403 parents = self.parents(ctx)
405 404 if not parents and i is None:
406 405 i = 0
407 406 changes = [], ctx.manifest().keys(), []
408 407 else:
409 408 i = i or 0
410 409 changes = self.repo.status(parents[i].node(), ctx.node())[:3]
411 410 changes = [[f for f in l if f not in self.ignored] for l in changes]
412 411
413 412 if i == 0:
414 413 self._changescache = (rev, changes)
415 414
416 415 return changes[0] + changes[1] + changes[2]
417 416
418 417 def converted(self, rev, destrev):
419 418 if self.convertfp is None:
420 419 self.convertfp = open(self.repo.join('shamap'), 'a')
421 420 self.convertfp.write('%s %s\n' % (destrev, rev))
422 421 self.convertfp.flush()
423 422
424 423 def before(self):
425 424 self.ui.debug('run hg source pre-conversion action\n')
426 425
427 426 def after(self):
428 427 self.ui.debug('run hg source post-conversion action\n')
429 428
430 429 def hasnativeorder(self):
431 430 return True
432 431
433 432 def hasnativeclose(self):
434 433 return True
435 434
436 435 def lookuprev(self, rev):
437 436 try:
438 437 return hex(self.repo.lookup(rev))
439 438 except error.RepoError:
440 439 return None
441 440
442 441 def getbookmarks(self):
443 442 return bookmarks.listbookmarks(self.repo)
444 443
445 444 def checkrevformat(self, revstr, mapname='splicemap'):
446 445 """For Mercurial, a revision string is a 40-character hex string."""
447 446 self.checkhexformat(revstr, mapname)
@@ -1,1311 +1,1310 b''
1 1 # Subversion 1.4/1.5 Python API backend
2 2 #
3 3 # Copyright(C) 2007 Daniel Holth et al
4 4
5 5 import os, re, sys, tempfile, urllib, urllib2
6 6 import xml.dom.minidom
7 7 import cPickle as pickle
8 8
9 9 from mercurial import strutil, scmutil, util, encoding
10 10 from mercurial.i18n import _
11 11
12 12 propertycache = util.propertycache
13 13
14 14 # Subversion stuff. Works best with very recent Python SVN bindings
15 15 # e.g. SVN 1.5 or backports. Thanks to the bzr folks for enhancing
16 16 # these bindings.
17 17
18 18 from cStringIO import StringIO
19 19
20 20 from common import NoRepo, MissingTool, commit, encodeargs, decodeargs
21 21 from common import commandline, converter_source, converter_sink, mapfile
22 22 from common import makedatetimestamp
23 23
24 24 try:
25 25 from svn.core import SubversionException, Pool
26 26 import svn
27 27 import svn.client
28 28 import svn.core
29 29 import svn.ra
30 30 import svn.delta
31 31 import transport
32 32 import warnings
33 33 warnings.filterwarnings('ignore',
34 34 module='svn.core',
35 35 category=DeprecationWarning)
36 36
37 37 except ImportError:
38 38 svn = None
39 39
40 40 class SvnPathNotFound(Exception):
41 41 pass
42 42
43 43 def revsplit(rev):
44 44 """Parse a revision string and return (uuid, path, revnum).
45 45 >>> revsplit('svn:a2147622-4a9f-4db4-a8d3-13562ff547b2'
46 46 ... '/proj%20B/mytrunk/mytrunk@1')
47 47 ('a2147622-4a9f-4db4-a8d3-13562ff547b2', '/proj%20B/mytrunk/mytrunk', 1)
48 48 >>> revsplit('svn:8af66a51-67f5-4354-b62c-98d67cc7be1d@1')
49 49 ('', '', 1)
50 50 >>> revsplit('@7')
51 51 ('', '', 7)
52 52 >>> revsplit('7')
53 53 ('', '', 0)
54 54 >>> revsplit('bad')
55 55 ('', '', 0)
56 56 """
57 57 parts = rev.rsplit('@', 1)
58 58 revnum = 0
59 59 if len(parts) > 1:
60 60 revnum = int(parts[1])
61 61 parts = parts[0].split('/', 1)
62 62 uuid = ''
63 63 mod = ''
64 64 if len(parts) > 1 and parts[0].startswith('svn:'):
65 65 uuid = parts[0][4:]
66 66 mod = '/' + parts[1]
67 67 return uuid, mod, revnum
68 68
69 69 def quote(s):
70 70 # As of svn 1.7, many svn calls expect "canonical" paths. In
71 71 # theory, we should call svn.core.*canonicalize() on all paths
72 72 # before passing them to the API. Instead, we assume the base url
73 73 # is canonical and copy the behaviour of svn URL encoding function
74 74 # so we can extend it safely with new components. The "safe"
75 75 # characters were taken from the "svn_uri__char_validity" table in
76 76 # libsvn_subr/path.c.
77 77 return urllib.quote(s, "!$&'()*+,-./:=@_~")
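As a rough illustration of the quoting rule described in the comment above, the sketch below applies the same "safe" character set when appending a path to a base URL assumed to be canonical already. It uses Python 3's `urllib.parse.quote` rather than the Python 2 `urllib.quote` call in this file, and the base URL and path are made-up values.

```python
from urllib.parse import quote

# Same "safe" set as in quote() above; everything else is percent-encoded.
SVN_SAFE = "!$&'()*+,-./:=@_~"


def svn_quote(s):
    return quote(s, SVN_SAFE)


# Hypothetical base URL, assumed canonical, extended with a quoted component.
base = 'file:///srv/svn/repo'
url = base + svn_quote('/proj B/tags/v1.0 (final)')
```

Spaces are encoded while slashes and parentheses pass through unchanged.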
78 78
79 79 def geturl(path):
80 80 try:
81 81 return svn.client.url_from_path(svn.core.svn_path_canonicalize(path))
82 82 except SubversionException:
83 83 # svn.client.url_from_path() fails with local repositories
84 84 pass
85 85 if os.path.isdir(path):
86 86 path = os.path.normpath(os.path.abspath(path))
87 87 if os.name == 'nt':
88 88 path = '/' + util.normpath(path)
89 89 # Module URL is later compared with the repository URL returned
90 90 # by svn API, which is UTF-8.
91 91 path = encoding.tolocal(path)
92 92 path = 'file://%s' % quote(path)
93 93 return svn.core.svn_path_canonicalize(path)
94 94
95 95 def optrev(number):
96 96 optrev = svn.core.svn_opt_revision_t()
97 97 optrev.kind = svn.core.svn_opt_revision_number
98 98 optrev.value.number = number
99 99 return optrev
100 100
101 101 class changedpath(object):
102 102 def __init__(self, p):
103 103 self.copyfrom_path = p.copyfrom_path
104 104 self.copyfrom_rev = p.copyfrom_rev
105 105 self.action = p.action
106 106
107 107 def get_log_child(fp, url, paths, start, end, limit=0,
108 108 discover_changed_paths=True, strict_node_history=False):
109 109 protocol = -1
110 110 def receiver(orig_paths, revnum, author, date, message, pool):
111 111 paths = {}
112 112 if orig_paths is not None:
113 113 for k, v in orig_paths.iteritems():
114 114 paths[k] = changedpath(v)
115 115 pickle.dump((paths, revnum, author, date, message),
116 116 fp, protocol)
117 117
118 118 try:
119 119 # Use an ra of our own so that our parent can consume
120 120 # our results without confusing the server.
121 121 t = transport.SvnRaTransport(url=url)
122 122 svn.ra.get_log(t.ra, paths, start, end, limit,
123 123 discover_changed_paths,
124 124 strict_node_history,
125 125 receiver)
126 126 except IOError:
127 127 # Caller may interrupt the iteration
128 128 pickle.dump(None, fp, protocol)
129 129 except Exception, inst:
130 130 pickle.dump(str(inst), fp, protocol)
131 131 else:
132 132 pickle.dump(None, fp, protocol)
133 133 fp.close()
134 134 # With a large history, the cleanup process goes crazy and suddenly
135 135 # consumes a *huge* amount of memory. Since the output file is already
136 136 # closed, there is no need for clean termination.
137 137 os._exit(0)
138 138
139 139 def debugsvnlog(ui, **opts):
140 140 """Fetch the SVN log in a subprocess and channel the entries back to the
141 141 parent to avoid memory collection issues.
142 142 """
143 143 if svn is None:
144 144 raise util.Abort(_('debugsvnlog could not load Subversion python '
145 145 'bindings'))
146 146
147 147 util.setbinary(sys.stdin)
148 148 util.setbinary(sys.stdout)
149 149 args = decodeargs(sys.stdin.read())
150 150 get_log_child(sys.stdout, *args)
151 151
152 152 class logstream(object):
153 153 """Interruptible revision log iterator."""
154 154 def __init__(self, stdout):
155 155 self._stdout = stdout
156 156
157 157 def __iter__(self):
158 158 while True:
159 159 try:
160 160 entry = pickle.load(self._stdout)
161 161 except EOFError:
162 162 raise util.Abort(_('Mercurial failed to run itself, check'
163 163 ' hg executable is in PATH'))
164 164 try:
165 165 orig_paths, revnum, author, date, message = entry
166 166 except (TypeError, ValueError):
167 167 if entry is None:
168 168 break
169 169 raise util.Abort(_("log stream exception '%s'") % entry)
170 170 yield entry
171 171
172 172 def close(self):
173 173 if self._stdout:
174 174 self._stdout.close()
175 175 self._stdout = None
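The child/parent exchange implemented by get_log_child and logstream can be sketched without Subversion: the writer pickles one tuple per log entry and then a terminator, which is None on success or an error string on failure. The function names and the in-memory buffer below are illustrative assumptions; the real code writes to the subprocess's stdout and raises util.Abort.

```python
import io
import pickle


def write_entries(fp, entries, error=None):
    """Dump each 5-tuple entry, then a terminator (None or an error string)."""
    for entry in entries:
        pickle.dump(entry, fp, -1)
    pickle.dump(error, fp, -1)


def read_entries(fp):
    """Yield entries until the None terminator; raise on an error string."""
    while True:
        entry = pickle.load(fp)
        try:
            paths, revnum, author, date, message = entry
        except (TypeError, ValueError):
            if entry is None:
                break
            raise RuntimeError("log stream exception %r" % (entry,))
        yield entry


buf = io.BytesIO()
write_entries(buf, [({}, 1, 'alice', 'd1', 'first'),
                    ({}, 2, 'bob', 'd2', 'second')])
buf.seek(0)
revnums = [entry[1] for entry in read_entries(buf)]
```

One quirk of this unpack-based check, shared with the original, is that an error string of exactly five characters would unpack cleanly and be mistaken for an entry.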
176 176
177 177 class directlogstream(list):
178 178 """Direct revision log iterator.
179 179 This can be used for debugging and development but it will probably leak
180 180 memory and is not suitable for real conversions."""
181 181 def __init__(self, url, paths, start, end, limit=0,
182 182 discover_changed_paths=True, strict_node_history=False):
183 183
184 184 def receiver(orig_paths, revnum, author, date, message, pool):
185 185 paths = {}
186 186 if orig_paths is not None:
187 187 for k, v in orig_paths.iteritems():
188 188 paths[k] = changedpath(v)
189 189 self.append((paths, revnum, author, date, message))
190 190
191 191 # Use an ra of our own so that our parent can consume
192 192 # our results without confusing the server.
193 193 t = transport.SvnRaTransport(url=url)
194 194 svn.ra.get_log(t.ra, paths, start, end, limit,
195 195 discover_changed_paths,
196 196 strict_node_history,
197 197 receiver)
198 198
199 199 def close(self):
200 200 pass
201 201
202 202 # Check to see if the given path is a local Subversion repo. Verify this by
203 203 # looking for several svn-specific files and directories in the given
204 204 # directory.
205 205 def filecheck(ui, path, proto):
206 206 for x in ('locks', 'hooks', 'format', 'db'):
207 207 if not os.path.exists(os.path.join(path, x)):
208 208 return False
209 209 return True
210 210
211 211 # Check to see if a given path is the root of an svn repo over http. We verify
212 212 # this by requesting a version-controlled URL we know can't exist and looking
213 213 # for the svn-specific "not found" XML.
214 214 def httpcheck(ui, path, proto):
215 215 try:
216 216 opener = urllib2.build_opener()
217 217 rsp = opener.open('%s://%s/!svn/ver/0/.svn' % (proto, path))
218 218 data = rsp.read()
219 219 except urllib2.HTTPError, inst:
220 220 if inst.code != 404:
221 221 # Except for 404 we cannot know for sure this is not an svn repo
222 222 ui.warn(_('svn: cannot probe remote repository, assume it could '
223 223 'be a subversion repository. Use --source-type if you '
224 224 'know better.\n'))
225 225 return True
226 226 data = inst.fp.read()
227 227 except Exception:
228 228 # Could be urllib2.URLError if the URL is invalid or anything else.
229 229 return False
230 230 return '<m:human-readable errcode="160013">' in data
231 231
232 232 protomap = {'http': httpcheck,
233 233 'https': httpcheck,
234 234 'file': filecheck,
235 235 }
236 236 def issvnurl(ui, url):
237 237 try:
238 238 proto, path = url.split('://', 1)
239 239 if proto == 'file':
240 240 if (os.name == 'nt' and path[:1] == '/' and path[1:2].isalpha()
241 241 and path[2:6].lower() == '%3a/'):
242 242 path = path[:2] + ':/' + path[6:]
243 243 path = urllib.url2pathname(path)
244 244 except ValueError:
245 245 proto = 'file'
246 246 path = os.path.abspath(url)
247 247 if proto == 'file':
248 248 path = util.pconvert(path)
249 249 check = protomap.get(proto, lambda *args: False)
250 250 while '/' in path:
251 251 if check(ui, path, proto):
252 252 return True
253 253 path = path.rsplit('/', 1)[0]
254 254 return False
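The loop at the end of issvnurl probes the full path first and then each parent in turn, stopping once no slash is left. A minimal sketch of that walk, with a hypothetical `check` callback in place of httpcheck or filecheck:

```python
def probe_upwards(path, check):
    """Call check() on path and each ancestor until one matches."""
    while '/' in path:
        if check(path):
            return True
        # Drop the last component and try the parent.
        path = path.rsplit('/', 1)[0]
    return False


tried = []


def fake_check(path):
    # Record every probe; pretend the repository root is 'host/repo'.
    tried.append(path)
    return path == 'host/repo'


found = probe_upwards('host/repo/trunk/src', fake_check)
```

Note that the final slash-free remainder (here `host`) is never probed.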
255 255
256 256 # SVN conversion code stolen from bzr-svn and tailor
257 257 #
258 258 # Subversion looks like a versioned filesystem; branch structures
259 259 # are defined by convention and not enforced by the tool. First,
260 260 # we define the potential branches (modules) as "trunk" and the
261 261 # child directories of "branches". Revisions are then identified by
262 262 # their module and revision number (and a repository identifier).
263 263 #
264 264 # The revision graph is really a tree (or a forest). By default, a
265 265 # revision parent is the previous revision in the same module. If the
266 266 # module directory is copied/moved from another module then the
267 267 # revision is the module root and its parent the source revision in
268 268 # the parent module. A revision has at most one parent.
269 269 #
270 270 class svn_source(converter_source):
271 271 def __init__(self, ui, url, rev=None):
272 272 super(svn_source, self).__init__(ui, url, rev=rev)
273 273
274 274 if not (url.startswith('svn://') or url.startswith('svn+ssh://') or
275 275 (os.path.exists(url) and
276 276 os.path.exists(os.path.join(url, '.svn'))) or
277 277 issvnurl(ui, url)):
278 278 raise NoRepo(_("%s does not look like a Subversion repository")
279 279 % url)
280 280 if svn is None:
281 281 raise MissingTool(_('could not load Subversion python bindings'))
282 282
283 283 try:
284 284 version = svn.core.SVN_VER_MAJOR, svn.core.SVN_VER_MINOR
285 285 if version < (1, 4):
286 286 raise MissingTool(_('Subversion python bindings %d.%d found, '
287 287 '1.4 or later required') % version)
288 288 except AttributeError:
289 289 raise MissingTool(_('Subversion python bindings are too old, 1.4 '
290 290 'or later required'))
291 291
292 292 self.lastrevs = {}
293 293
294 294 latest = None
295 295 try:
296 296 # Support file://path@rev syntax. Useful e.g. to convert
297 297 # deleted branches.
298 298 at = url.rfind('@')
299 299 if at >= 0:
300 300 latest = int(url[at + 1:])
301 301 url = url[:at]
302 302 except ValueError:
303 303 pass
304 304 self.url = geturl(url)
305 305 self.encoding = 'UTF-8' # Subversion is always nominally UTF-8
306 306 try:
307 307 self.transport = transport.SvnRaTransport(url=self.url)
308 308 self.ra = self.transport.ra
309 309 self.ctx = self.transport.client
310 310 self.baseurl = svn.ra.get_repos_root(self.ra)
311 311 # Module is either empty or a repository path starting with
312 312 # a slash and not ending with a slash.
313 313 self.module = urllib.unquote(self.url[len(self.baseurl):])
314 314 self.prevmodule = None
315 315 self.rootmodule = self.module
316 316 self.commits = {}
317 317 self.paths = {}
318 318 self.uuid = svn.ra.get_uuid(self.ra)
319 319 except SubversionException:
320 320 ui.traceback()
321 321 raise NoRepo(_("%s does not look like a Subversion repository")
322 322 % self.url)
323 323
324 324 if rev:
325 325 try:
326 326 latest = int(rev)
327 327 except ValueError:
328 328 raise util.Abort(_('svn: revision %s is not an integer') % rev)
329 329
330 330 self.trunkname = self.ui.config('convert', 'svn.trunk',
331 331 'trunk').strip('/')
332 332 self.startrev = self.ui.config('convert', 'svn.startrev', default=0)
333 333 try:
334 334 self.startrev = int(self.startrev)
335 335 if self.startrev < 0:
336 336 self.startrev = 0
337 337 except ValueError:
338 338 raise util.Abort(_('svn: start revision %s is not an integer')
339 339 % self.startrev)
340 340
341 341 try:
342 342 self.head = self.latest(self.module, latest)
343 343 except SvnPathNotFound:
344 344 self.head = None
345 345 if not self.head:
346 346 raise util.Abort(_('no revision found in module %s')
347 347 % self.module)
348 348 self.last_changed = self.revnum(self.head)
349 349
350 350 self._changescache = None
351 351
352 352 if os.path.exists(os.path.join(url, '.svn/entries')):
353 353 self.wc = url
354 354 else:
355 355 self.wc = None
356 356 self.convertfp = None
357 357
358 358 def setrevmap(self, revmap):
359 359 lastrevs = {}
360 360 for revid in revmap.iterkeys():
361 361 uuid, module, revnum = revsplit(revid)
362 362 lastrevnum = lastrevs.setdefault(module, revnum)
363 363 if revnum > lastrevnum:
364 364 lastrevs[module] = revnum
365 365 self.lastrevs = lastrevs
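setrevmap reduces the revision map to the highest converted revision number per module. The sketch below reproduces that reduction on plain strings; `module_and_revnum` is a simplified, hypothetical stand-in for revsplit that assumes well-formed `svn:<uuid><module>@<revnum>` identifiers.

```python
def module_and_revnum(revid):
    """Simplified split; assumes the identifier contains '@<revnum>'."""
    head, revnum = revid.rsplit('@', 1)
    parts = head.split('/', 1)
    module = '/' + parts[1] if len(parts) > 1 else ''
    return module, int(revnum)


def last_revs(revids):
    """Map each module to the highest revision number seen for it."""
    lastrevs = {}
    for revid in revids:
        module, revnum = module_and_revnum(revid)
        lastrevnum = lastrevs.setdefault(module, revnum)
        if revnum > lastrevnum:
            lastrevs[module] = revnum
    return lastrevs


result = last_revs(['svn:u/trunk@3', 'svn:u/trunk@10', 'svn:u/branches/x@5'])
```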
366 366
367 367 def exists(self, path, optrev):
368 368 try:
369 369 svn.client.ls(self.url.rstrip('/') + '/' + quote(path),
370 370 optrev, False, self.ctx)
371 371 return True
372 372 except SubversionException:
373 373 return False
374 374
375 375 def getheads(self):
376 376
377 377 def isdir(path, revnum):
378 378 kind = self._checkpath(path, revnum)
379 379 return kind == svn.core.svn_node_dir
380 380
381 381 def getcfgpath(name, rev):
382 382 cfgpath = self.ui.config('convert', 'svn.' + name)
383 383 if cfgpath is not None and cfgpath.strip() == '':
384 384 return None
385 385 path = (cfgpath or name).strip('/')
386 386 if not self.exists(path, rev):
387 387 if self.module.endswith(path) and name == 'trunk':
388 388 # we are converting from inside this directory
389 389 return None
390 390 if cfgpath:
391 391 raise util.Abort(_('expected %s to be at %r, but not found')
392 392 % (name, path))
393 393 return None
394 394 self.ui.note(_('found %s at %r\n') % (name, path))
395 395 return path
396 396
397 397 rev = optrev(self.last_changed)
398 398 oldmodule = ''
399 399 trunk = getcfgpath('trunk', rev)
400 400 self.tags = getcfgpath('tags', rev)
401 401 branches = getcfgpath('branches', rev)
402 402
403 403 # If the project has a trunk or branches, we will extract heads
404 404 # from them. We keep the project root otherwise.
405 405 if trunk:
406 406 oldmodule = self.module or ''
407 407 self.module += '/' + trunk
408 408 self.head = self.latest(self.module, self.last_changed)
409 409 if not self.head:
410 410 raise util.Abort(_('no revision found in module %s')
411 411 % self.module)
412 412
413 413 # First head in the list is the module's head
414 414 self.heads = [self.head]
415 415 if self.tags is not None:
416 416 self.tags = '%s/%s' % (oldmodule, (self.tags or 'tags'))
417 417
418 418 # Check if branches bring a few more heads to the list
419 419 if branches:
420 420 rpath = self.url.strip('/')
421 421 branchnames = svn.client.ls(rpath + '/' + quote(branches),
422 422 rev, False, self.ctx)
423 423 for branch in sorted(branchnames):
424 424 module = '%s/%s/%s' % (oldmodule, branches, branch)
425 425 if not isdir(module, self.last_changed):
426 426 continue
427 427 brevid = self.latest(module, self.last_changed)
428 428 if not brevid:
429 429 self.ui.note(_('ignoring empty branch %s\n') % branch)
430 430 continue
431 431 self.ui.note(_('found branch %s at %d\n') %
432 432 (branch, self.revnum(brevid)))
433 433 self.heads.append(brevid)
434 434
435 435 if self.startrev and self.heads:
436 436 if len(self.heads) > 1:
437 437 raise util.Abort(_('svn: start revision is not supported '
438 438 'with more than one branch'))
439 439 revnum = self.revnum(self.heads[0])
440 440 if revnum < self.startrev:
441 441 raise util.Abort(
442 442 _('svn: no revision found after start revision %d')
443 443 % self.startrev)
444 444
445 445 return self.heads
446 446
447 447 def getchanges(self, rev):
448 448 if self._changescache and self._changescache[0] == rev:
449 449 return self._changescache[1]
450 450 self._changescache = None
451 451 (paths, parents) = self.paths[rev]
452 452 if parents:
453 453 files, self.removed, copies = self.expandpaths(rev, paths, parents)
454 454 else:
455 455 # Perform a full checkout on roots
456 456 uuid, module, revnum = revsplit(rev)
457 457 entries = svn.client.ls(self.baseurl + quote(module),
458 458 optrev(revnum), True, self.ctx)
459 459 files = [n for n, e in entries.iteritems()
460 460 if e.kind == svn.core.svn_node_file]
461 461 copies = {}
462 462 self.removed = set()
463 463
464 464 files.sort()
465 465 files = zip(files, [rev] * len(files))
466 466
467 467 # caller caches the result, so free it here to release memory
468 468 del self.paths[rev]
469 469 return (files, copies)
470 470
471 471 def getchangedfiles(self, rev, i):
472 472 changes = self.getchanges(rev)
473 473 self._changescache = (rev, changes)
474 474 return [f[0] for f in changes[0]]
475 475
476 476 def getcommit(self, rev):
477 477 if rev not in self.commits:
478 478 uuid, module, revnum = revsplit(rev)
479 479 self.module = module
480 480 self.reparent(module)
481 481 # We assume that:
482 482 # - requests for revisions after "stop" come from the
483 483 # revision graph backward traversal. Cache all of them
484 484 # down to stop, they will be used eventually.
485 485 # - requests for revisions before "stop" come to get
486 486 # isolated branches parents. Just fetch what is needed.
487 487 stop = self.lastrevs.get(module, 0)
488 488 if revnum < stop:
489 489 stop = revnum + 1
490 490 self._fetch_revisions(revnum, stop)
491 491 if rev not in self.commits:
492 492 raise util.Abort(_('svn: revision %s not found') % revnum)
493 493 commit = self.commits[rev]
494 494 # caller caches the result, so free it here to release memory
495 495 del self.commits[rev]
496 496 return commit
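The choice of fetch range in getcommit can be read as follows: a request above the last converted revision caches everything down to that point, while a request below it fetches only the revision itself. A small sketch of that decision, assuming (as the swap at the top of _fetch_revisions suggests) that the range is always walked from the higher to the lower number; `fetch_range` is a hypothetical helper, not part of the converter.

```python
def fetch_range(revnum, lastrevs, module):
    """Return the (from, to) pair that would be fetched, highest first."""
    stop = lastrevs.get(module, 0)
    if revnum < stop:
        # Isolated parent lookup: fetch only what is needed.
        stop = revnum + 1
    low, high = sorted((revnum, stop))
    return high, low


lastrevs = {'/trunk': 20}
after_stop = fetch_range(50, lastrevs, '/trunk')
before_stop = fetch_range(5, lastrevs, '/trunk')
unknown_module = fetch_range(50, lastrevs, '/branches/x')
```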
497 497
498 498 def checkrevformat(self, revstr, mapname='splicemap'):
499 499 """Abort if revstr does not match the svn revision identifier format."""
500 500 if not re.match(r'svn:[0-9a-f]{8,8}-[0-9a-f]{4,4}-'
501 501 r'[0-9a-f]{4,4}-[0-9a-f]{4,4}-[0-9a-f]'
502 502 r'{12,12}(.*)\@[0-9]+$', revstr):
503 503 raise util.Abort(_('%s entry %s is not a valid revision'
504 504 ' identifier') % (mapname, revstr))
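To see what checkrevformat accepts, the pattern can be tried in isolation. The sketch below compiles an equivalent expression (with `{8}` written instead of `{8,8}`) and wraps it in a hypothetical `is_valid_revid` helper that returns a boolean instead of aborting.

```python
import re

# Equivalent to the pattern in checkrevformat: 'svn:' + lowercase UUID,
# an optional module path, then '@' and a decimal revision number.
REVID_RE = re.compile(
    r'svn:[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}'
    r'(.*)@[0-9]+$')


def is_valid_revid(revstr):
    return REVID_RE.match(revstr) is not None


good = is_valid_revid('svn:a2147622-4a9f-4db4-a8d3-13562ff547b2/trunk@1')
no_uuid = is_valid_revid('@7')
upper_hex = is_valid_revid('svn:A2147622-4A9F-4DB4-A8D3-13562FF547B2/trunk@1')
```

Identifiers that revsplit tolerates, such as `@7`, are rejected here because the UUID prefix is mandatory.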
505 505
506 506 def gettags(self):
507 507 tags = {}
508 508 if self.tags is None:
509 509 return tags
510 510
511 511 # svn tags are just a convention, project branches left in a
512 512 # 'tags' directory. There is no other relationship than
513 513 # ancestry, which is expensive to discover and makes them hard
514 514 # to update incrementally. Worse, past revisions may be
515 515 # referenced by tags far away in the future, requiring a deep
516 516 # history traversal on every calculation. Current code
517 517 # performs a single backward traversal, tracking moves within
518 518 # the tags directory (tag renaming) and recording a new tag
519 519 # every time a project is copied from outside the tags
520 520 # directory. It also lists deleted tags; this behaviour may
521 521 # change in the future.
522 522 pendings = []
523 523 tagspath = self.tags
524 524 start = svn.ra.get_latest_revnum(self.ra)
525 525 stream = self._getlog([self.tags], start, self.startrev)
526 526 try:
527 527 for entry in stream:
528 528 origpaths, revnum, author, date, message = entry
529 529 if not origpaths:
530 530 origpaths = []
531 531 copies = [(e.copyfrom_path, e.copyfrom_rev, p) for p, e
532 532 in origpaths.iteritems() if e.copyfrom_path]
533 533 # Apply moves/copies from more specific to general
534 534 copies.sort(reverse=True)
535 535
536 536 srctagspath = tagspath
537 537 if copies and copies[-1][2] == tagspath:
538 538 # Track tags directory moves
539 539 srctagspath = copies.pop()[0]
540 540
541 541 for source, sourcerev, dest in copies:
542 542 if not dest.startswith(tagspath + '/'):
543 543 continue
544 544 for tag in pendings:
545 545 if tag[0].startswith(dest):
546 546 tagpath = source + tag[0][len(dest):]
547 547 tag[:2] = [tagpath, sourcerev]
548 548 break
549 549 else:
550 550 pendings.append([source, sourcerev, dest])
551 551
552 552 # Filter out tags with children coming from different
553 553 # parts of the repository like:
554 554 # /tags/tag.1 (from /trunk:10)
555 555 # /tags/tag.1/foo (from /branches/foo:12)
556 556 # Here /tags/tag.1 is discarded as well as its children.
557 557 # It happens with tools like cvs2svn. Such tags cannot
558 558 # be represented in mercurial.
559 559 addeds = dict((p, e.copyfrom_path) for p, e
560 560 in origpaths.iteritems()
561 561 if e.action == 'A' and e.copyfrom_path)
562 562 badroots = set()
563 563 for destroot in addeds:
564 564 for source, sourcerev, dest in pendings:
565 565 if (not dest.startswith(destroot + '/')
566 566 or source.startswith(addeds[destroot] + '/')):
567 567 continue
568 568 badroots.add(destroot)
569 569 break
570 570
571 571 for badroot in badroots:
572 572 pendings = [p for p in pendings if p[2] != badroot
573 573 and not p[2].startswith(badroot + '/')]
574 574
575 575 # Tell tag renamings from tag creations
576 576 renamings = []
577 577 for source, sourcerev, dest in pendings:
578 578 tagname = dest.split('/')[-1]
579 579 if source.startswith(srctagspath):
580 580 renamings.append([source, sourcerev, tagname])
581 581 continue
582 582 if tagname in tags:
583 583 # Keep the latest tag value
584 584 continue
585 585 # From revision may be fake, get one with changes
586 586 try:
587 587 tagid = self.latest(source, sourcerev)
588 588 if tagid and tagname not in tags:
589 589 tags[tagname] = tagid
590 590 except SvnPathNotFound:
591 591 # It happens when we are following directories
592 592 # we assumed were copied with their parents
593 593 # but were really created in the tag
594 594 # directory.
595 595 pass
596 596 pendings = renamings
597 597 tagspath = srctagspath
598 598 finally:
599 599 stream.close()
600 600 return tags
601 601
602 602 def converted(self, rev, destrev):
603 603 if not self.wc:
604 604 return
605 605 if self.convertfp is None:
606 606 self.convertfp = open(os.path.join(self.wc, '.svn', 'hg-shamap'),
607 607 'a')
608 608 self.convertfp.write('%s %d\n' % (destrev, self.revnum(rev)))
609 609 self.convertfp.flush()
610 610
611 611 def revid(self, revnum, module=None):
612 612 return 'svn:%s%s@%s' % (self.uuid, module or self.module, revnum)
613 613
614 614 def revnum(self, rev):
615 615 return int(rev.split('@')[-1])
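revid and revnum, together with revsplit near the top of the file, build and take apart identifiers of the form `svn:<uuid><module>@<revnum>`. The round trip can be sketched with standalone functions; the names below are illustrative, and `split_revid` copies the logic of revsplit.

```python
def make_revid(uuid, module, revnum):
    return 'svn:%s%s@%s' % (uuid, module, revnum)


def revnum_of(revid):
    return int(revid.split('@')[-1])


def split_revid(revid):
    """Same steps as revsplit: peel off '@revnum', then 'svn:uuid'."""
    parts = revid.rsplit('@', 1)
    revnum = 0
    if len(parts) > 1:
        revnum = int(parts[1])
    parts = parts[0].split('/', 1)
    uuid = ''
    mod = ''
    if len(parts) > 1 and parts[0].startswith('svn:'):
        uuid = parts[0][4:]
        mod = '/' + parts[1]
    return uuid, mod, revnum


uuid = 'a2147622-4a9f-4db4-a8d3-13562ff547b2'
revid = make_revid(uuid, '/trunk', 7)
pieces = split_revid(revid)
```

As the doctests for revsplit show, an identifier with an empty module loses its UUID on the way back, so the round trip is only exact when the module is non-empty.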
616 616
617 617 def latest(self, path, stop=None):
618 618 """Find the latest revid affecting path, up to stop revision
619 619 number. If stop is None, default to repository latest
620 620 revision. It may return a revision in a different module,
621 621 since a branch may be moved without a change being
622 622 reported. Return None if computed module does not belong to
623 623 rootmodule subtree.
624 624 """
625 625 def findchanges(path, start, stop=None):
626 626 stream = self._getlog([path], start, stop or 1)
627 627 try:
628 628 for entry in stream:
629 629 paths, revnum, author, date, message = entry
630 630 if stop is None and paths:
631 631 # We do not know the latest changed revision,
632 632 # keep the first one with changed paths.
633 633 break
634 634 if revnum <= stop:
635 635 break
636 636
637 637 for p in paths:
638 638 if (not path.startswith(p) or
639 639 not paths[p].copyfrom_path):
640 640 continue
641 641 newpath = paths[p].copyfrom_path + path[len(p):]
642 642 self.ui.debug("branch renamed from %s to %s at %d\n" %
643 643 (path, newpath, revnum))
644 644 path = newpath
645 645 break
646 646 if not paths:
647 647 revnum = None
648 648 return revnum, path
649 649 finally:
650 650 stream.close()
651 651
652 652 if not path.startswith(self.rootmodule):
653 653 # Requests on foreign branches may be forbidden at server level
654 654 self.ui.debug('ignoring foreign branch %r\n' % path)
655 655 return None
656 656
657 657 if stop is None:
658 658 stop = svn.ra.get_latest_revnum(self.ra)
659 659 try:
660 660 prevmodule = self.reparent('')
661 661 dirent = svn.ra.stat(self.ra, path.strip('/'), stop)
662 662 self.reparent(prevmodule)
663 663 except SubversionException:
664 664 dirent = None
665 665 if not dirent:
666 666 raise SvnPathNotFound(_('%s not found up to revision %d')
667 667 % (path, stop))
668 668
669 669 # stat() gives us the previous revision on this line of
670 670 # development, but it might be in *another module*. Fetch the
671 671 # log and detect renames down to the latest revision.
672 672 revnum, realpath = findchanges(path, stop, dirent.created_rev)
673 673 if revnum is None:
674 674 # Tools like svnsync can create empty revisions, when
675 675 # synchronizing only a subtree for instance. The created_rev of
676 676 # these empty revisions still has its original value
677 677 # despite all changes having disappeared and can be
678 678 # returned by ra.stat(), at least when stating the root
679 679 # module. In that case, do not trust created_rev and scan
680 680 # the whole history.
681 681 revnum, realpath = findchanges(path, stop)
682 682 if revnum is None:
683 683 self.ui.debug('ignoring empty branch %r\n' % realpath)
684 684 return None
685 685
686 686 if not realpath.startswith(self.rootmodule):
687 687 self.ui.debug('ignoring foreign branch %r\n' % realpath)
688 688 return None
689 689 return self.revid(revnum, realpath)
690 690
691 691 def reparent(self, module):
692 692 """Reparent the svn transport and return the previous parent."""
693 693 if self.prevmodule == module:
694 694 return module
695 695 svnurl = self.baseurl + quote(module)
696 696 prevmodule = self.prevmodule
697 697 if prevmodule is None:
698 698 prevmodule = ''
699 699 self.ui.debug("reparent to %s\n" % svnurl)
700 700 svn.ra.reparent(self.ra, svnurl)
701 701 self.prevmodule = module
702 702 return prevmodule
703 703
704 704 def expandpaths(self, rev, paths, parents):
705 705 changed, removed = set(), set()
706 706 copies = {}
707 707
708 708 new_module, revnum = revsplit(rev)[1:]
709 709 if new_module != self.module:
710 710 self.module = new_module
711 711 self.reparent(self.module)
712 712
713 713 for i, (path, ent) in enumerate(paths):
714 714 self.ui.progress(_('scanning paths'), i, item=path,
715 715 total=len(paths))
716 716 entrypath = self.getrelpath(path)
717 717
718 718 kind = self._checkpath(entrypath, revnum)
719 719 if kind == svn.core.svn_node_file:
720 720 changed.add(self.recode(entrypath))
721 721 if not ent.copyfrom_path or not parents:
722 722 continue
723 723 # Copy sources not in parent revisions cannot be
724 724 # represented, ignore their origin for now
725 725 pmodule, prevnum = revsplit(parents[0])[1:]
726 726 if ent.copyfrom_rev < prevnum:
727 727 continue
728 728 copyfrom_path = self.getrelpath(ent.copyfrom_path, pmodule)
729 729 if not copyfrom_path:
730 730 continue
731 731 self.ui.debug("copied to %s from %s@%s\n" %
732 732 (entrypath, copyfrom_path, ent.copyfrom_rev))
733 733 copies[self.recode(entrypath)] = self.recode(copyfrom_path)
734 734 elif kind == 0: # gone, but had better be a deleted *file*
735 735 self.ui.debug("gone from %s\n" % ent.copyfrom_rev)
736 736 pmodule, prevnum = revsplit(parents[0])[1:]
737 737 parentpath = pmodule + "/" + entrypath
738 738 fromkind = self._checkpath(entrypath, prevnum, pmodule)
739 739
740 740 if fromkind == svn.core.svn_node_file:
741 741 removed.add(self.recode(entrypath))
742 742 elif fromkind == svn.core.svn_node_dir:
743 743 oroot = parentpath.strip('/')
744 744 nroot = path.strip('/')
745 745 children = self._iterfiles(oroot, prevnum)
746 746 for childpath in children:
747 747 childpath = childpath.replace(oroot, nroot)
748 748 childpath = self.getrelpath("/" + childpath, pmodule)
749 749 if childpath:
750 750 removed.add(self.recode(childpath))
751 751 else:
752 752 self.ui.debug('unknown path in revision %d: %s\n' % \
753 753 (revnum, path))
754 754 elif kind == svn.core.svn_node_dir:
755 755 if ent.action == 'M':
756 756 # If the directory just had a prop change,
757 757 # then we shouldn't need to look for its children.
758 758 continue
759 759 if ent.action == 'R' and parents:
760 760 # If a directory is replacing a file, mark the previous
761 761 # file as deleted
762 762 pmodule, prevnum = revsplit(parents[0])[1:]
763 763 pkind = self._checkpath(entrypath, prevnum, pmodule)
764 764 if pkind == svn.core.svn_node_file:
765 765 removed.add(self.recode(entrypath))
766 766 elif pkind == svn.core.svn_node_dir:
767 767 # We do not know what files were kept or removed,
768 768 # mark them all as changed.
769 769 for childpath in self._iterfiles(pmodule, prevnum):
770 770 childpath = self.getrelpath("/" + childpath)
771 771 if childpath:
772 772 changed.add(self.recode(childpath))
773 773
774 774 for childpath in self._iterfiles(path, revnum):
775 775 childpath = self.getrelpath("/" + childpath)
776 776 if childpath:
777 777 changed.add(self.recode(childpath))
778 778
779 779 # Handle directory copies
780 780 if not ent.copyfrom_path or not parents:
781 781 continue
782 782 # Copy sources not in parent revisions cannot be
783 783 # represented, ignore their origin for now
784 784 pmodule, prevnum = revsplit(parents[0])[1:]
785 785 if ent.copyfrom_rev < prevnum:
786 786 continue
787 787 copyfrompath = self.getrelpath(ent.copyfrom_path, pmodule)
788 788 if not copyfrompath:
789 789 continue
790 790 self.ui.debug("mark %s came from %s:%d\n"
791 791 % (path, copyfrompath, ent.copyfrom_rev))
792 792 children = self._iterfiles(ent.copyfrom_path, ent.copyfrom_rev)
793 793 for childpath in children:
794 794 childpath = self.getrelpath("/" + childpath, pmodule)
795 795 if not childpath:
796 796 continue
797 797 copytopath = path + childpath[len(copyfrompath):]
798 798 copytopath = self.getrelpath(copytopath)
799 799 copies[self.recode(copytopath)] = self.recode(childpath)
800 800
801 801 self.ui.progress(_('scanning paths'), None)
802 802 changed.update(removed)
803 803 return (list(changed), removed, copies)
804 804
805 805 def _fetch_revisions(self, from_revnum, to_revnum):
806 806 if from_revnum < to_revnum:
807 807 from_revnum, to_revnum = to_revnum, from_revnum
808 808
809 809 self.child_cset = None
810 810
811 811 def parselogentry(orig_paths, revnum, author, date, message):
812 812 """Return the parsed commit object or None, and True if
813 813 the revision is a branch root.
814 814 """
815 815 self.ui.debug("parsing revision %d (%d changes)\n" %
816 816 (revnum, len(orig_paths)))
817 817
818 818 branched = False
819 819 rev = self.revid(revnum)
820 820 # branch log might return entries for a parent we already have
821 821
822 822 if rev in self.commits or revnum < to_revnum:
823 823 return None, branched
824 824
825 825 parents = []
826 826 # check whether this revision is the start of a branch or part
827 827 # of a branch renaming
828 828 orig_paths = sorted(orig_paths.iteritems())
829 829 root_paths = [(p, e) for p, e in orig_paths
830 830 if self.module.startswith(p)]
831 831 if root_paths:
832 832 path, ent = root_paths[-1]
833 833 if ent.copyfrom_path:
834 834 branched = True
835 835 newpath = ent.copyfrom_path + self.module[len(path):]
836 836 # ent.copyfrom_rev may not be the actual last revision
837 837 previd = self.latest(newpath, ent.copyfrom_rev)
838 838 if previd is not None:
839 839 prevmodule, prevnum = revsplit(previd)[1:]
840 840 if prevnum >= self.startrev:
841 841 parents = [previd]
842 842 self.ui.note(
843 843 _('found parent of branch %s at %d: %s\n') %
844 844 (self.module, prevnum, prevmodule))
845 845 else:
846 846 self.ui.debug("no copyfrom path, don't know what to do.\n")
847 847
848 848 paths = []
849 849 # filter out unrelated paths
850 850 for path, ent in orig_paths:
851 851 if self.getrelpath(path) is None:
852 852 continue
853 853 paths.append((path, ent))
854 854
855 855 # Example SVN datetime. Includes microseconds.
856 856 # ISO-8601 conformant
857 857 # '2007-01-04T17:35:00.902377Z'
858 858 date = util.parsedate(date[:19] + " UTC", ["%Y-%m-%dT%H:%M:%S"])
859 859 if self.ui.configbool('convert', 'localtimezone'):
860 860 date = makedatetimestamp(date[0])
861 861
862 862 log = message and self.recode(message) or ''
863 863 author = author and self.recode(author) or ''
864 864 try:
865 865 branch = self.module.split("/")[-1]
866 866 if branch == self.trunkname:
867 867 branch = None
868 868 except IndexError:
869 869 branch = None
870 870
871 871 cset = commit(author=author,
872 872 date=util.datestr(date, '%Y-%m-%d %H:%M:%S %1%2'),
873 873 desc=log,
874 874 parents=parents,
875 875 branch=branch,
876 876 rev=rev)
877 877
878 878 self.commits[rev] = cset
879 879 # The parents list is *shared* among self.paths and the
880 880 # commit object. Both will be updated below.
881 881 self.paths[rev] = (paths, cset.parents)
882 882 if self.child_cset and not self.child_cset.parents:
883 883 self.child_cset.parents[:] = [rev]
884 884 self.child_cset = cset
885 885 return cset, branched
886 886
887 887 self.ui.note(_('fetching revision log for "%s" from %d to %d\n') %
888 888 (self.module, from_revnum, to_revnum))
889 889
890 890 try:
891 891 firstcset = None
892 892 lastonbranch = False
893 893 stream = self._getlog([self.module], from_revnum, to_revnum)
894 894 try:
895 895 for entry in stream:
896 896 paths, revnum, author, date, message = entry
897 897 if revnum < self.startrev:
898 898 lastonbranch = True
899 899 break
900 900 if not paths:
901 901 self.ui.debug('revision %d has no entries\n' % revnum)
902 902 # If we ever leave the loop on an empty
903 903 # revision, do not try to get a parent branch
904 904 lastonbranch = lastonbranch or revnum == 0
905 905 continue
906 906 cset, lastonbranch = parselogentry(paths, revnum, author,
907 907 date, message)
908 908 if cset:
909 909 firstcset = cset
910 910 if lastonbranch:
911 911 break
912 912 finally:
913 913 stream.close()
914 914
915 915 if not lastonbranch and firstcset and not firstcset.parents:
916 916 # The first revision of the sequence (the last fetched one)
917 917 # has invalid parents if not a branch root. Find the parent
918 918 # revision now, if any.
919 919 try:
920 920 firstrevnum = self.revnum(firstcset.rev)
921 921 if firstrevnum > 1:
922 922 latest = self.latest(self.module, firstrevnum - 1)
923 923 if latest:
924 924 firstcset.parents.append(latest)
925 925 except SvnPathNotFound:
926 926 pass
927 927 except SubversionException, (inst, num):
928 928 if num == svn.core.SVN_ERR_FS_NO_SUCH_REVISION:
929 929 raise util.Abort(_('svn: branch has no revision %s')
930 930 % to_revnum)
931 931 raise
932 932
933 933 def getfile(self, file, rev):
934 934 # TODO: ra.get_file transmits the whole file instead of diffs.
935 935 if file in self.removed:
936 936 raise IOError
937 937 mode = ''
938 938 try:
939 939 new_module, revnum = revsplit(rev)[1:]
940 940 if self.module != new_module:
941 941 self.module = new_module
942 942 self.reparent(self.module)
943 943 io = StringIO()
944 944 info = svn.ra.get_file(self.ra, file, revnum, io)
945 945 data = io.getvalue()
946 946 # ra.get_file() seems to keep a reference on the input buffer
947 947 # preventing collection. Release it explicitly.
948 948 io.close()
949 949 if isinstance(info, list):
950 950 info = info[-1]
951 951 mode = ("svn:executable" in info) and 'x' or ''
952 952 mode = ("svn:special" in info) and 'l' or mode
953 953 except SubversionException, e:
954 954 notfound = (svn.core.SVN_ERR_FS_NOT_FOUND,
955 955 svn.core.SVN_ERR_RA_DAV_PATH_NOT_FOUND)
956 956 if e.apr_err in notfound: # File not found
957 957 raise IOError
958 958 raise
959 959 if mode == 'l':
960 960 link_prefix = "link "
961 961 if data.startswith(link_prefix):
962 962 data = data[len(link_prefix):]
963 963 return data, mode
964 964
965 965 def _iterfiles(self, path, revnum):
966 966 """Enumerate all files in path at revnum, recursively."""
967 967 path = path.strip('/')
968 968 pool = Pool()
969 969 rpath = '/'.join([self.baseurl, quote(path)]).strip('/')
970 970 entries = svn.client.ls(rpath, optrev(revnum), True, self.ctx, pool)
971 971 if path:
972 972 path += '/'
973 973 return ((path + p) for p, e in entries.iteritems()
974 974 if e.kind == svn.core.svn_node_file)
975 975
976 976 def getrelpath(self, path, module=None):
977 977 if module is None:
978 978 module = self.module
979 979 # Given the repository url of this wc, say
980 980 # "http://server/plone/CMFPlone/branches/Plone-2_0-branch"
981 981 # extract the "entry" portion (a relative path) from what
982 982 # svn log --xml says, i.e.
983 983 # "/CMFPlone/branches/Plone-2_0-branch/tests/PloneTestCase.py"
984 984 # that is to say "tests/PloneTestCase.py"
985 985 if path.startswith(module):
986 986 relative = path.rstrip('/')[len(module):]
987 987 if relative.startswith('/'):
988 988 return relative[1:]
989 989 elif relative == '':
990 990 return relative
991 991
992 992 # The path is outside our tracked tree...
993 993 self.ui.debug('%r is not under %r, ignoring\n' % (path, module))
994 994 return None
995 995
996 996 def _checkpath(self, path, revnum, module=None):
997 997 if module is not None:
998 998 prevmodule = self.reparent('')
999 999 path = module + '/' + path
1000 1000 try:
1001 1001 # ra.check_path does not like leading slashes very much, it leads
1002 1002 # to PROPFIND subversion errors
1003 1003 return svn.ra.check_path(self.ra, path.strip('/'), revnum)
1004 1004 finally:
1005 1005 if module is not None:
1006 1006 self.reparent(prevmodule)
1007 1007
1008 1008 def _getlog(self, paths, start, end, limit=0, discover_changed_paths=True,
1009 1009 strict_node_history=False):
1010 1010 # Normalize path names, svn >= 1.5 only wants paths relative to
1011 1011 # supplied URL
1012 1012 relpaths = []
1013 1013 for p in paths:
1014 1014 if not p.startswith('/'):
1015 1015 p = self.module + '/' + p
1016 1016 relpaths.append(p.strip('/'))
1017 1017 args = [self.baseurl, relpaths, start, end, limit,
1018 1018 discover_changed_paths, strict_node_history]
1019 1019 # undocumented feature: debugsvnlog can be disabled
1020 1020 if not self.ui.configbool('convert', 'svn.debugsvnlog', True):
1021 1021 return directlogstream(*args)
1022 1022 arg = encodeargs(args)
1023 1023 hgexe = util.hgexecutable()
1024 1024 cmd = '%s debugsvnlog' % util.shellquote(hgexe)
1025 1025 stdin, stdout = util.popen2(util.quotecommand(cmd))
1026 1026 stdin.write(arg)
1027 1027 try:
1028 1028 stdin.close()
1029 1029 except IOError:
1030 1030 raise util.Abort(_('Mercurial failed to run itself, check'
1031 1031 ' hg executable is in PATH'))
1032 1032 return logstream(stdout)
1033 1033
1034 1034 pre_revprop_change = '''#!/bin/sh
1035 1035
1036 1036 REPOS="$1"
1037 1037 REV="$2"
1038 1038 USER="$3"
1039 1039 PROPNAME="$4"
1040 1040 ACTION="$5"
1041 1041
1042 1042 if [ "$ACTION" = "M" -a "$PROPNAME" = "svn:log" ]; then exit 0; fi
1043 1043 if [ "$ACTION" = "A" -a "$PROPNAME" = "hg:convert-branch" ]; then exit 0; fi
1044 1044 if [ "$ACTION" = "A" -a "$PROPNAME" = "hg:convert-rev" ]; then exit 0; fi
1045 1045
1046 1046 echo "Changing prohibited revision property" >&2
1047 1047 exit 1
1048 1048 '''
1049 1049
1050 1050 class svn_sink(converter_sink, commandline):
1051 1051 commit_re = re.compile(r'Committed revision (\d+).', re.M)
1052 1052 uuid_re = re.compile(r'Repository UUID:\s*(\S+)', re.M)
1053 1053
1054 1054 def prerun(self):
1055 1055 if self.wc:
1056 1056 os.chdir(self.wc)
1057 1057
1058 1058 def postrun(self):
1059 1059 if self.wc:
1060 1060 os.chdir(self.cwd)
1061 1061
1062 1062 def join(self, name):
1063 1063 return os.path.join(self.wc, '.svn', name)
1064 1064
1065 1065 def revmapfile(self):
1066 1066 return self.join('hg-shamap')
1067 1067
1068 1068 def authorfile(self):
1069 1069 return self.join('hg-authormap')
1070 1070
1071 1071 def __init__(self, ui, path):
1072 1072
1073 1073 converter_sink.__init__(self, ui, path)
1074 1074 commandline.__init__(self, ui, 'svn')
1075 1075 self.delete = []
1076 1076 self.setexec = []
1077 1077 self.delexec = []
1078 1078 self.copies = []
1079 1079 self.wc = None
1080 1080 self.cwd = os.getcwd()
1081 1081
1082 1082 created = False
1083 1083 if os.path.isfile(os.path.join(path, '.svn', 'entries')):
1084 1084 self.wc = os.path.realpath(path)
1085 1085 self.run0('update')
1086 1086 else:
1087 1087 if not re.search(r'^(file|http|https|svn|svn\+ssh)\://', path):
1088 1088 path = os.path.realpath(path)
1089 1089 if os.path.isdir(os.path.dirname(path)):
1090 1090 if not os.path.exists(os.path.join(path, 'db', 'fs-type')):
1091 1091 ui.status(_('initializing svn repository %r\n') %
1092 1092 os.path.basename(path))
1093 1093 commandline(ui, 'svnadmin').run0('create', path)
1094 1094 created = path
1095 1095 path = util.normpath(path)
1096 1096 if not path.startswith('/'):
1097 1097 path = '/' + path
1098 1098 path = 'file://' + path
1099 1099
1100 1100 wcpath = os.path.join(os.getcwd(), os.path.basename(path) + '-wc')
1101 1101 ui.status(_('initializing svn working copy %r\n')
1102 1102 % os.path.basename(wcpath))
1103 1103 self.run0('checkout', path, wcpath)
1104 1104
1105 1105 self.wc = wcpath
1106 1106 self.opener = scmutil.opener(self.wc)
1107 1107 self.wopener = scmutil.opener(self.wc)
1108 1108 self.childmap = mapfile(ui, self.join('hg-childmap'))
1109 1109 self.is_exec = util.checkexec(self.wc) and util.isexec or None
1110 1110
1111 1111 if created:
1112 1112 hook = os.path.join(created, 'hooks', 'pre-revprop-change')
1113 1113 fp = open(hook, 'w')
1114 1114 fp.write(pre_revprop_change)
1115 1115 fp.close()
1116 1116 util.setflags(hook, False, True)
1117 1117
1118 1118 output = self.run0('info')
1119 1119 self.uuid = self.uuid_re.search(output).group(1).strip()
1120 1120
1121 1121 def wjoin(self, *names):
1122 1122 return os.path.join(self.wc, *names)
1123 1123
1124 1124 @propertycache
1125 1125 def manifest(self):
1126 1126 # As of svn 1.7, the "add" command fails when receiving
1127 1127 # already tracked entries, so we have to track and filter them
1128 1128 # ourselves.
1129 1129 m = set()
1130 1130 output = self.run0('ls', recursive=True, xml=True)
1131 1131 doc = xml.dom.minidom.parseString(output)
1132 1132 for e in doc.getElementsByTagName('entry'):
1133 1133 for n in e.childNodes:
1134 1134 if n.nodeType != n.ELEMENT_NODE or n.tagName != 'name':
1135 1135 continue
1136 1136 name = ''.join(c.data for c in n.childNodes
1137 1137 if c.nodeType == c.TEXT_NODE)
1138 1138 # Entries are compared with names coming from
1139 1139 # mercurial, so bytes with undefined encoding. Our
1140 1140 # best bet is to assume they are in local
1141 1141 # encoding. They will be passed to command line calls
1142 1142 # later anyway, so they better be.
1143 1143 m.add(encoding.tolocal(name.encode('utf-8')))
1144 1144 break
1145 1145 return m
1146 1146
1147 1147 def putfile(self, filename, flags, data):
1148 1148 if 'l' in flags:
1149 1149 self.wopener.symlink(data, filename)
1150 1150 else:
1151 1151 try:
1152 1152 if os.path.islink(self.wjoin(filename)):
1153 1153 os.unlink(filename)
1154 1154 except OSError:
1155 1155 pass
1156 1156 self.wopener.write(filename, data)
1157 1157
1158 1158 if self.is_exec:
1159 1159 if self.is_exec(self.wjoin(filename)):
1160 1160 if 'x' not in flags:
1161 1161 self.delexec.append(filename)
1162 1162 else:
1163 1163 if 'x' in flags:
1164 1164 self.setexec.append(filename)
1165 1165 util.setflags(self.wjoin(filename), False, 'x' in flags)
1166 1166
1167 1167 def _copyfile(self, source, dest):
1168 1168 # SVN's copy command pukes if the destination file exists, but
1169 1169 # our copyfile method expects to record a copy that has
1170 1170 # already occurred. Cross the semantic gap.
1171 1171 wdest = self.wjoin(dest)
1172 1172 exists = os.path.lexists(wdest)
1173 1173 if exists:
1174 1174 fd, tempname = tempfile.mkstemp(
1175 1175 prefix='hg-copy-', dir=os.path.dirname(wdest))
1176 1176 os.close(fd)
1177 1177 os.unlink(tempname)
1178 1178 os.rename(wdest, tempname)
1179 1179 try:
1180 1180 self.run0('copy', source, dest)
1181 1181 finally:
1182 1182 self.manifest.add(dest)
1183 1183 if exists:
1184 1184 try:
1185 1185 os.unlink(wdest)
1186 1186 except OSError:
1187 1187 pass
1188 1188 os.rename(tempname, wdest)
1189 1189
1190 1190 def dirs_of(self, files):
1191 1191 dirs = set()
1192 1192 for f in files:
1193 1193 if os.path.isdir(self.wjoin(f)):
1194 1194 dirs.add(f)
1195 1195 for i in strutil.rfindall(f, '/'):
1196 1196 dirs.add(f[:i])
1197 1197 return dirs
1198 1198
1199 1199 def add_dirs(self, files):
1200 1200 add_dirs = [d for d in sorted(self.dirs_of(files))
1201 1201 if d not in self.manifest]
1202 1202 if add_dirs:
1203 1203 self.manifest.update(add_dirs)
1204 1204 self.xargs(add_dirs, 'add', non_recursive=True, quiet=True)
1205 1205 return add_dirs
1206 1206
1207 1207 def add_files(self, files):
1208 1208 files = [f for f in files if f not in self.manifest]
1209 1209 if files:
1210 1210 self.manifest.update(files)
1211 1211 self.xargs(files, 'add', quiet=True)
1212 1212 return files
1213 1213
1214 1214 def tidy_dirs(self, names):
1215 1215 deleted = []
1216 1216 for d in sorted(self.dirs_of(names), reverse=True):
1217 1217 wd = self.wjoin(d)
1218 1218 if os.listdir(wd) == ['.svn']:
1219 1219 self.run0('delete', d)
1220 1220 self.manifest.remove(d)
1221 1221 deleted.append(d)
1222 1222 return deleted
1223 1223
1224 1224 def addchild(self, parent, child):
1225 1225 self.childmap[parent] = child
1226 1226
1227 1227 def revid(self, rev):
1228 1228 return u"svn:%s@%s" % (self.uuid, rev)
1229 1229
1230 def putcommit(self, files, copies, parents, commit, source,
1231 revmap, tagmap):
1230 def putcommit(self, files, copies, parents, commit, source, revmap):
1232 1231 for parent in parents:
1233 1232 try:
1234 1233 return self.revid(self.childmap[parent])
1235 1234 except KeyError:
1236 1235 pass
1237 1236
1238 1237 # Apply changes to working copy
1239 1238 for f, v in files:
1240 1239 try:
1241 1240 data, mode = source.getfile(f, v)
1242 1241 except IOError:
1243 1242 self.delete.append(f)
1244 1243 else:
1245 1244 self.putfile(f, mode, data)
1246 1245 if f in copies:
1247 1246 self.copies.append([copies[f], f])
1248 1247 files = [f[0] for f in files]
1249 1248
1250 1249 entries = set(self.delete)
1251 1250 files = frozenset(files)
1252 1251 entries.update(self.add_dirs(files.difference(entries)))
1253 1252 if self.copies:
1254 1253 for s, d in self.copies:
1255 1254 self._copyfile(s, d)
1256 1255 self.copies = []
1257 1256 if self.delete:
1258 1257 self.xargs(self.delete, 'delete')
1259 1258 for f in self.delete:
1260 1259 self.manifest.remove(f)
1261 1260 self.delete = []
1262 1261 entries.update(self.add_files(files.difference(entries)))
1263 1262 entries.update(self.tidy_dirs(entries))
1264 1263 if self.delexec:
1265 1264 self.xargs(self.delexec, 'propdel', 'svn:executable')
1266 1265 self.delexec = []
1267 1266 if self.setexec:
1268 1267 self.xargs(self.setexec, 'propset', 'svn:executable', '*')
1269 1268 self.setexec = []
1270 1269
1271 1270 fd, messagefile = tempfile.mkstemp(prefix='hg-convert-')
1272 1271 fp = os.fdopen(fd, 'w')
1273 1272 fp.write(commit.desc)
1274 1273 fp.close()
1275 1274 try:
1276 1275 output = self.run0('commit',
1277 1276 username=util.shortuser(commit.author),
1278 1277 file=messagefile,
1279 1278 encoding='utf-8')
1280 1279 try:
1281 1280 rev = self.commit_re.search(output).group(1)
1282 1281 except AttributeError:
1283 1282 if not files:
1284 1283 return parents[0]
1285 1284 self.ui.warn(_('unexpected svn output:\n'))
1286 1285 self.ui.warn(output)
1287 1286 raise util.Abort(_('unable to cope with svn output'))
1288 1287 if commit.rev:
1289 1288 self.run('propset', 'hg:convert-rev', commit.rev,
1290 1289 revprop=True, revision=rev)
1291 1290 if commit.branch and commit.branch != 'default':
1292 1291 self.run('propset', 'hg:convert-branch', commit.branch,
1293 1292 revprop=True, revision=rev)
1294 1293 for parent in parents:
1295 1294 self.addchild(parent, rev)
1296 1295 return self.revid(rev)
1297 1296 finally:
1298 1297 os.unlink(messagefile)
1299 1298
1300 1299 def puttags(self, tags):
1301 1300 self.ui.warn(_('writing Subversion tags is not yet implemented\n'))
1302 1301 return None, None
1303 1302
1304 1303 def hascommit(self, rev):
1305 1304 # This is not correct as one can convert to an existing subversion
1306 1305 # repository and childmap would not list all revisions. Too bad.
1307 1306 if rev in self.childmap:
1308 1307 return True
1309 1308 raise util.Abort(_('splice map revision %s not found in subversion '
1310 1309 'child map (revision lookups are not implemented)')
1311 1310 % rev)
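
Note on getrelpath() in the subversion.py hunk above: the prefix test alone is not enough to decide that a path belongs to the tracked module, which is why the method also inspects what follows the prefix. A minimal standalone sketch of that logic, assuming a free function instead of the svn_source method (the explicit module argument and the sample paths are illustrative, not part of the changeset):

```python
def getrelpath(path, module):
    """Return path relative to module, '' when they are the same node,
    or None when path lies outside module.

    Standalone sketch of the method shown in the diff; the real method
    defaults module to self.module and logs a debug message on None.
    """
    if path.startswith(module):
        relative = path.rstrip('/')[len(module):]
        if relative.startswith('/'):
            # Proper descendant of the module: drop the separator.
            return relative[1:]
        elif relative == '':
            # The module directory itself.
            return relative
    # Either an unrelated path or a sibling that merely shares a name
    # prefix with the module (e.g. "trunk-old" versus "trunk").
    return None


module = '/CMFPlone/branches/Plone-2_0-branch'
print(repr(getrelpath(module + '/tests/PloneTestCase.py', module)))
print(repr(getrelpath(module, module)))
print(repr(getrelpath(module + '-old/setup.py', module)))
```

The third call is the case the second check exists for: the path starts with the module string, but the remainder does not begin with "/", so it is treated as outside the tracked tree.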
@@ -1,468 +1,463 b''
1 1 $ cat >> $HGRCPATH <<EOF
2 2 > [extensions]
3 3 > convert=
4 4 > [convert]
5 5 > hg.saverev=False
6 6 > EOF
7 7 $ hg help convert
8 8 hg convert [OPTION]... SOURCE [DEST [REVMAP]]
9 9
10 10 convert a foreign SCM repository to a Mercurial one.
11 11
12 12 Accepted source formats [identifiers]:
13 13
14 14 - Mercurial [hg]
15 15 - CVS [cvs]
16 16 - Darcs [darcs]
17 17 - git [git]
18 18 - Subversion [svn]
19 19 - Monotone [mtn]
20 20 - GNU Arch [gnuarch]
21 21 - Bazaar [bzr]
22 22 - Perforce [p4]
23 23
24 24 Accepted destination formats [identifiers]:
25 25
26 26 - Mercurial [hg]
27 27 - Subversion [svn] (history on branches is not preserved)
28 28
29 29 If no revision is given, all revisions will be converted. Otherwise,
30 30 convert will only import up to the named revision (given in a format
31 31 understood by the source).
32 32
33 33 If no destination directory name is specified, it defaults to the basename
34 34 of the source with "-hg" appended. If the destination repository doesn't
35 35 exist, it will be created.
36 36
37 37 By default, all sources except Mercurial will use --branchsort. Mercurial
38 38 uses --sourcesort to preserve original revision numbers order. Sort modes
39 39 have the following effects:
40 40
41 41 --branchsort convert from parent to child revision when possible, which
42 42 means branches are usually converted one after the other.
43 43 It generates more compact repositories.
44 44 --datesort sort revisions by date. Converted repositories have good-
45 45 looking changelogs but are often an order of magnitude
46 46 larger than the same ones generated by --branchsort.
47 47 --sourcesort try to preserve source revisions order, only supported by
48 48 Mercurial sources.
49 49 --closesort try to move closed revisions as close as possible to parent
50 50 branches, only supported by Mercurial sources.
51 51
52 52 If "REVMAP" isn't given, it will be put in a default location
53 53 ("<dest>/.hg/shamap" by default). The "REVMAP" is a simple text file that
54 54 maps each source commit ID to the destination ID for that revision, like
55 55 so:
56 56
57 57 <source ID> <destination ID>
58 58
59 59 If the file doesn't exist, it's automatically created. It's updated on
60 60 each commit copied, so "hg convert" can be interrupted and can be run
61 61 repeatedly to copy new commits.
62 62
63 63 The authormap is a simple text file that maps each source commit author to
64 64 a destination commit author. It is handy for source SCMs that use unix
65 65 logins to identify authors (e.g.: CVS). One line per author mapping and
66 66 the line format is:
67 67
68 68 source author = destination author
69 69
70 70 Empty lines and lines starting with a "#" are ignored.
71 71
72 72 The filemap is a file that allows filtering and remapping of files and
73 73 directories. Each line can contain one of the following directives:
74 74
75 75 include path/to/file-or-dir
76 76
77 77 exclude path/to/file-or-dir
78 78
79 79 rename path/to/source path/to/destination
80 80
81 81 Comment lines start with "#". A specified path matches if it equals the
82 82 full relative name of a file or one of its parent directories. The
83 83 "include" or "exclude" directive with the longest matching path applies,
84 84 so line order does not matter.
85 85
86 86 The "include" directive causes a file, or all files under a directory, to
87 87 be included in the destination repository. The default if there are no
88 88 "include" statements is to include everything. If there are any "include"
89 89 statements, nothing else is included. The "exclude" directive causes files
90 90 or directories to be omitted. The "rename" directive renames a file or
91 91 directory if it is converted. To rename from a subdirectory into the root
92 92 of the repository, use "." as the path to rename to.
93 93
94 94 The splicemap is a file that allows insertion of synthetic history,
95 95 letting you specify the parents of a revision. This is useful if you want
96 96 to e.g. give a Subversion merge two parents, or graft two disconnected
97 97 series of history together. Each entry contains a key, followed by a
98 98 space, followed by one or two comma-separated values:
99 99
100 100 key parent1, parent2
101 101
102 102 The key is the revision ID in the source revision control system whose
103 103 parents should be modified (same format as a key in .hg/shamap). The
104 104 values are the revision IDs (in either the source or destination revision
105 105 control system) that should be used as the new parents for that node. For
106 106 example, if you have merged "release-1.0" into "trunk", then you should
107 107 specify the revision on "trunk" as the first parent and the one on the
108 108 "release-1.0" branch as the second.
109 109
110 110 The branchmap is a file that allows you to rename a branch when it is
111 111 being brought in from whatever external repository. When used in
112 112 conjunction with a splicemap, it allows for a powerful combination to help
113 113 fix even the most badly mismanaged repositories and turn them into nicely
114 114 structured Mercurial repositories. The branchmap contains lines of the
115 115 form:
116 116
117 117 original_branch_name new_branch_name
118 118
119 119 where "original_branch_name" is the name of the branch in the source
120 120 repository, and "new_branch_name" is the name of the branch is the
121 121 destination repository. No whitespace is allowed in the branch names. This
122 122 can be used to (for instance) move code in one repository from "default"
123 123 to a named branch.
124 124
125 125 The closemap is a file that allows closing of a branch. This is useful if
126 126 you want to close a branch. Each entry contains a revision or hash
127 127 separated by white space.
128 128
129 The tagmap is a file that exactly analogous to the branchmap. This will
130 rename tags on the fly and prevent the 'update tags' commit usually found
131 at the end of a convert process.
132
133 129 Mercurial Source
134 130 ################
135 131
136 132 The Mercurial source recognizes the following configuration options, which
137 133 you can set on the command line with "--config":
138 134
139 135 convert.hg.ignoreerrors
140 136 ignore integrity errors when reading. Use it to fix
141 137 Mercurial repositories with missing revlogs, by converting
142 138 from and to Mercurial. Default is False.
143 139 convert.hg.saverev
144 140 store original revision ID in changeset (forces target IDs
145 141 to change). It takes a boolean argument and defaults to
146 142 False.
147 143 convert.hg.revs
148 144 revset specifying the source revisions to convert.
149 145
150 146 CVS Source
151 147 ##########
152 148
153 149 CVS source will use a sandbox (i.e. a checked-out copy) from CVS to
154 150 indicate the starting point of what will be converted. Direct access to
155 151 the repository files is not needed, unless of course the repository is
156 152 ":local:". The conversion uses the top level directory in the sandbox to
157 153 find the CVS repository, and then uses CVS rlog commands to find files to
158 154 convert. This means that unless a filemap is given, all files under the
159 155 starting directory will be converted, and that any directory
160 156 reorganization in the CVS sandbox is ignored.
161 157
162 158 The following options can be used with "--config":
163 159
164 160 convert.cvsps.cache
165 161 Set to False to disable remote log caching, for testing and
166 162 debugging purposes. Default is True.
167 163 convert.cvsps.fuzz
168 164 Specify the maximum time (in seconds) that is allowed
169 165 between commits with identical user and log message in a
170 166 single changeset. When very large files were checked in as
171 167 part of a changeset then the default may not be long enough.
172 168 The default is 60.
173 169 convert.cvsps.mergeto
174 170 Specify a regular expression to which commit log messages
175 171 are matched. If a match occurs, then the conversion process
176 172 will insert a dummy revision merging the branch on which
177 173 this log message occurs to the branch indicated in the
178 174 regex. Default is "{{mergetobranch ([-\w]+)}}"
179 175 convert.cvsps.mergefrom
180 176 Specify a regular expression to which commit log messages
181 177 are matched. If a match occurs, then the conversion process
182 178 will add the most recent revision on the branch indicated in
183 179 the regex as the second parent of the changeset. Default is
184 180 "{{mergefrombranch ([-\w]+)}}"
185 181 convert.localtimezone
186 182 use local time (as determined by the TZ environment
187 183 variable) for changeset date/times. The default is False
188 184 (use UTC).
189 185 hooks.cvslog Specify a Python function to be called at the end of
190 186 gathering the CVS log. The function is passed a list with
191 187 the log entries, and can modify the entries in-place, or add
192 188 or delete them.
193 189 hooks.cvschangesets
194 190 Specify a Python function to be called after the changesets
195 191 are calculated from the CVS log. The function is passed a
196 192 list with the changeset entries, and can modify the
197 193 changesets in-place, or add or delete them.
198 194
199 195 An additional "debugcvsps" Mercurial command allows the builtin changeset
200 196 merging code to be run without doing a conversion. Its parameters and
201 197 output are similar to that of cvsps 2.1. Please see the command help for
202 198 more details.
203 199
204 200 Subversion Source
205 201 #################
206 202
207 203 Subversion source detects classical trunk/branches/tags layouts. By
208 204 default, the supplied "svn://repo/path/" source URL is converted as a
209 205 single branch. If "svn://repo/path/trunk" exists it replaces the default
210 206 branch. If "svn://repo/path/branches" exists, its subdirectories are
211 207 listed as possible branches. If "svn://repo/path/tags" exists, it is
212 208 looked for tags referencing converted branches. Default "trunk",
213 209 "branches" and "tags" values can be overridden with following options. Set
214 210 them to paths relative to the source URL, or leave them blank to disable
215 211 auto detection.
216 212
217 213 The following options can be set with "--config":
218 214
219 215 convert.svn.branches
220 216 specify the directory containing branches. The default is
221 217 "branches".
222 218 convert.svn.tags
223 219 specify the directory containing tags. The default is
224 220 "tags".
225 221 convert.svn.trunk
226 222 specify the name of the trunk branch. The default is
227 223 "trunk".
228 224 convert.localtimezone
229 225 use local time (as determined by the TZ environment
230 226 variable) for changeset date/times. The default is False
231 227 (use UTC).
232 228
233 229 Source history can be retrieved starting at a specific revision, instead
234 230 of being integrally converted. Only single branch conversions are
235 231 supported.
236 232
237 233 convert.svn.startrev
238 234 specify start Subversion revision number. The default is 0.
239 235
240 236 Perforce Source
241 237 ###############
242 238
243 239 The Perforce (P4) importer can be given a p4 depot path or a client
244 240 specification as source. It will convert all files in the source to a flat
245 241 Mercurial repository, ignoring labels, branches and integrations. Note
246 242 that when a depot path is given you then usually should specify a target
247 243 directory, because otherwise the target may be named "...-hg".
248 244
249 245 It is possible to limit the amount of source history to be converted by
250 246 specifying an initial Perforce revision:
251 247
252 248 convert.p4.startrev
253 249 specify initial Perforce revision (a Perforce changelist
254 250 number).
255 251
256 252 Mercurial Destination
257 253 #####################
258 254
259 255 The following options are supported:
260 256
261 257 convert.hg.clonebranches
262 258 dispatch source branches in separate clones. The default is
263 259 False.
264 260 convert.hg.tagsbranch
265 261 branch name for tag revisions, defaults to "default".
266 262 convert.hg.usebranchnames
267 263 preserve branch names. The default is True.
268 264
269 265 options:
270 266
271 267 -s --source-type TYPE source repository type
272 268 -d --dest-type TYPE destination repository type
273 269 -r --rev REV import up to source revision REV
274 270 -A --authormap FILE remap usernames using this file
275 271 --filemap FILE remap file names using contents of file
276 272 --splicemap FILE splice synthesized history into place
277 273 --branchmap FILE change branch names while converting
278 274 --closemap FILE closes given revs
279 --tagmap FILE change tag names while converting
280 275 --branchsort try to sort changesets by branches
281 276 --datesort try to sort changesets by date
282 277 --sourcesort preserve source changesets order
283 278 --closesort try to reorder closed revisions
284 279
285 280 use "hg -v help convert" to show the global options
286 281 $ hg init a
287 282 $ cd a
288 283 $ echo a > a
289 284 $ hg ci -d'0 0' -Ama
290 285 adding a
291 286 $ hg cp a b
292 287 $ hg ci -d'1 0' -mb
293 288 $ hg rm a
294 289 $ hg ci -d'2 0' -mc
295 290 $ hg mv b a
296 291 $ hg ci -d'3 0' -md
297 292 $ echo a >> a
298 293 $ hg ci -d'4 0' -me
299 294 $ cd ..
300 295 $ hg convert a 2>&1 | grep -v 'subversion python bindings could not be loaded'
301 296 assuming destination a-hg
302 297 initializing destination a-hg repository
303 298 scanning source...
304 299 sorting...
305 300 converting...
306 301 4 a
307 302 3 b
308 303 2 c
309 304 1 d
310 305 0 e
311 306 $ hg --cwd a-hg pull ../a
312 307 pulling from ../a
313 308 searching for changes
314 309 no changes found
315 310
316 311 conversion to existing file should fail
317 312
318 313 $ touch bogusfile
319 314 $ hg convert a bogusfile
320 315 initializing destination bogusfile repository
321 316 abort: cannot create new bundle repository
322 317 [255]
323 318
324 319 #if unix-permissions no-root
325 320
326 321 conversion to dir without permissions should fail
327 322
328 323 $ mkdir bogusdir
329 324 $ chmod 000 bogusdir
330 325
331 326 $ hg convert a bogusdir
332 327 abort: Permission denied: 'bogusdir'
333 328 [255]
334 329
335 330 user permissions should succeed
336 331
337 332 $ chmod 700 bogusdir
338 333 $ hg convert a bogusdir
339 334 initializing destination bogusdir repository
340 335 scanning source...
341 336 sorting...
342 337 converting...
343 338 4 a
344 339 3 b
345 340 2 c
346 341 1 d
347 342 0 e
348 343
349 344 #endif
350 345
351 346 test pre and post conversion actions
352 347
353 348 $ echo 'include b' > filemap
354 349 $ hg convert --debug --filemap filemap a partialb | \
355 350 > grep 'run hg'
356 351 run hg source pre-conversion action
357 352 run hg sink pre-conversion action
358 353 run hg sink post-conversion action
359 354 run hg source post-conversion action
360 355
361 356 converting empty dir should fail "nicely"
362 357
363 358 $ mkdir emptydir
364 359
365 360 override $PATH to ensure p4 not visible; use $PYTHON in case we're
366 361 running from a devel copy, not a temp installation
367 362
368 363 $ PATH="$BINDIR" $PYTHON "$BINDIR"/hg convert emptydir
369 364 assuming destination emptydir-hg
370 365 initializing destination emptydir-hg repository
371 366 emptydir does not look like a CVS checkout
372 367 emptydir does not look like a Git repository
373 368 emptydir does not look like a Subversion repository
374 369 emptydir is not a local Mercurial repository
375 370 emptydir does not look like a darcs repository
376 371 emptydir does not look like a monotone repository
377 372 emptydir does not look like a GNU Arch repository
378 373 emptydir does not look like a Bazaar repository
379 374 cannot find required "p4" tool
380 375 abort: emptydir: missing or unsupported repository
381 376 [255]
382 377
383 378 convert with imaginary source type
384 379
385 380 $ hg convert --source-type foo a a-foo
386 381 initializing destination a-foo repository
387 382 abort: foo: invalid source repository type
388 383 [255]
389 384
390 385 convert with imaginary sink type
391 386
392 387 $ hg convert --dest-type foo a a-foo
393 388 abort: foo: invalid destination repository type
394 389 [255]
395 390
396 391 testing: convert must not produce duplicate entries in fncache
397 392
398 393 $ hg convert a b
399 394 initializing destination b repository
400 395 scanning source...
401 396 sorting...
402 397 converting...
403 398 4 a
404 399 3 b
405 400 2 c
406 401 1 d
407 402 0 e
408 403
409 404 contents of fncache file:
410 405
411 406 $ cat b/.hg/store/fncache | sort
412 407 data/a.i
413 408 data/b.i
414 409
415 410 test bogus URL
416 411
417 412 $ hg convert -q bzr+ssh://foobar@selenic.com/baz baz
418 413 abort: bzr+ssh://foobar@selenic.com/baz: missing or unsupported repository
419 414 [255]
420 415
421 416 test revset converted() lookup
422 417
423 418 $ hg --config convert.hg.saverev=True convert a c
424 419 initializing destination c repository
425 420 scanning source...
426 421 sorting...
427 422 converting...
428 423 4 a
429 424 3 b
430 425 2 c
431 426 1 d
432 427 0 e
433 428 $ echo f > c/f
434 429 $ hg -R c ci -d'0 0' -Amf
435 430 adding f
436 431 created new head
437 432 $ hg -R c log -r "converted(09d945a62ce6)"
438 433 changeset: 1:98c3dd46a874
439 434 user: test
440 435 date: Thu Jan 01 00:00:01 1970 +0000
441 436 summary: b
442 437
443 438 $ hg -R c log -r "converted()"
444 439 changeset: 0:31ed57b2037c
445 440 user: test
446 441 date: Thu Jan 01 00:00:00 1970 +0000
447 442 summary: a
448 443
449 444 changeset: 1:98c3dd46a874
450 445 user: test
451 446 date: Thu Jan 01 00:00:01 1970 +0000
452 447 summary: b
453 448
454 449 changeset: 2:3b9ca06ef716
455 450 user: test
456 451 date: Thu Jan 01 00:00:02 1970 +0000
457 452 summary: c
458 453
459 454 changeset: 3:4e0debd37cf2
460 455 user: test
461 456 date: Thu Jan 01 00:00:03 1970 +0000
462 457 summary: d
463 458
464 459 changeset: 4:9de3bc9349c5
465 460 user: test
466 461 date: Thu Jan 01 00:00:04 1970 +0000
467 462 summary: e
468 463
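The help text in this diff lists several `--config` keys (`convert.svn.trunk`, `convert.svn.branches`, `convert.svn.tags`, `convert.svn.startrev`), and the test passes one as `hg --config convert.hg.saverev=True convert a c`. Below is a minimal sketch of assembling such a command line from a dictionary of overrides. The helper name `build_convert_argv`, the destination `path-hg`, and the values `main` and `100` are hypothetical and chosen only for illustration. The sketch builds the argument list and does not invoke Mercurial.

```python
def build_convert_argv(source, dest, config=None):
    """Assemble an argv list for `hg convert` with --config overrides.

    Hypothetical helper: it mirrors the `hg --config KEY=VALUE convert SRC DEST`
    form used in the test above and does not run Mercurial itself.
    """
    argv = ["hg"]
    # Sort the keys so the resulting command line is deterministic.
    for key, value in sorted((config or {}).items()):
        argv += ["--config", "%s=%s" % (key, value)]
    argv += ["convert", source, dest]
    return argv


# Illustrative values only: a trunk directory named "main" and a start
# revision of 100 (startrev supports single-branch conversions only).
argv = build_convert_argv(
    "svn://repo/path",
    "path-hg",
    {"convert.svn.trunk": "main", "convert.svn.startrev": 100},
)
print(" ".join(argv))
```

The resulting list can be handed to `subprocess.call` unchanged if Mercurial is installed. Keeping it as a list rather than a single string avoids shell quoting issues.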