convert: deprecate --authors in preference for --authormap...
Martin Geisler -
r12198:0c67a58f default
@@ -1,318 +1,321 b''
1 1 # convert.py Foreign SCM converter
2 2 #
3 3 # Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
4 4 #
5 5 # This software may be used and distributed according to the terms of the
6 6 # GNU General Public License version 2 or any later version.
7 7
8 8 '''import revisions from foreign VCS repositories into Mercurial'''
9 9
10 10 import convcmd
11 11 import cvsps
12 12 import subversion
13 13 from mercurial import commands
14 14 from mercurial.i18n import _
15 15
16 16 # Commands definition was moved elsewhere to ease demandload job.
17 17
18 18 def convert(ui, src, dest=None, revmapfile=None, **opts):
19 19 """convert a foreign SCM repository to a Mercurial one.
20 20
21 21 Accepted source formats [identifiers]:
22 22
23 23 - Mercurial [hg]
24 24 - CVS [cvs]
25 25 - Darcs [darcs]
26 26 - git [git]
27 27 - Subversion [svn]
28 28 - Monotone [mtn]
29 29 - GNU Arch [gnuarch]
30 30 - Bazaar [bzr]
31 31 - Perforce [p4]
32 32
33 33 Accepted destination formats [identifiers]:
34 34
35 35 - Mercurial [hg]
36 36 - Subversion [svn] (history on branches is not preserved)
37 37
38 38 If no revision is given, all revisions will be converted.
39 39 Otherwise, convert will only import up to the named revision
40 40 (given in a format understood by the source).
41 41
42 42 If no destination directory name is specified, it defaults to the
43 43 basename of the source with ``-hg`` appended. If the destination
44 44 repository doesn't exist, it will be created.
45 45
46 46 By default, all sources except Mercurial will use --branchsort.
47 47 Mercurial uses --sourcesort to preserve original revision numbers
48 48 order. Sort modes have the following effects:
49 49
50 50 --branchsort convert from parent to child revision when possible,
51 51 which means branches are usually converted one after
52 52 the other. It generates more compact repositories.
53 53
54 54 --datesort sort revisions by date. Converted repositories have
55 55 good-looking changelogs but are often an order of
56 56 magnitude larger than the same ones generated by
57 57 --branchsort.
58 58
59 59 --sourcesort try to preserve source revisions order, only
60 60 supported by Mercurial sources.
61 61
62 62 If <REVMAP> isn't given, it will be put in a default location
63 63 (<dest>/.hg/shamap by default). The <REVMAP> is a simple text file
64 64 that maps each source commit ID to the destination ID for that
65 65 revision, like so::
66 66
67 67 <source ID> <destination ID>
68 68
69 69 If the file doesn't exist, it's automatically created. It's
70 70 updated on each commit copied, so :hg:`convert` can be interrupted
71 71 and can be run repeatedly to copy new commits.
72 72
73 The username mapping file is a simple text file that maps each
74 source commit author to a destination commit author. It is handy
75 for source SCMs that use unix logins to identify authors (eg:
76 CVS). One line per author mapping and the line format is::
73 The authormap is a simple text file that maps each source commit
74 author to a destination commit author. It is handy for source SCMs
75 that use unix logins to identify authors (eg: CVS). One line per
76 author mapping and the line format is::
77 77
78 78 source author = destination author
79 79
80 80 Empty lines and lines starting with a ``#`` are ignored.
81 81
82 82 The filemap is a file that allows filtering and remapping of files
83 83 and directories. Each line can contain one of the following
84 84 directives::
85 85
86 86 include path/to/file-or-dir
87 87
88 88 exclude path/to/file-or-dir
89 89
90 90 rename path/to/source path/to/destination
91 91
92 92 Comment lines start with ``#``. A specified path matches if it
93 93 equals the full relative name of a file or one of its parent
94 94 directories. The ``include`` or ``exclude`` directive with the
95 95 longest matching path applies, so line order does not matter.
96 96
97 97 The ``include`` directive causes a file, or all files under a
98 98 directory, to be included in the destination repository, and the
99 99 exclusion of all other files and directories not explicitly
100 100 included. The ``exclude`` directive causes files or directories to
101 101 be omitted. The ``rename`` directive renames a file or directory if
102 102 it is converted. To rename from a subdirectory into the root of
103 103 the repository, use ``.`` as the path to rename to.
104 104
105 105 The splicemap is a file that allows insertion of synthetic
106 106 history, letting you specify the parents of a revision. This is
107 107 useful if you want to e.g. give a Subversion merge two parents, or
108 108 graft two disconnected series of history together. Each entry
109 109 contains a key, followed by a space, followed by one or two
110 110 comma-separated values::
111 111
112 112 key parent1, parent2
113 113
114 114 The key is the revision ID in the source
115 115 revision control system whose parents should be modified (same
116 116 format as a key in .hg/shamap). The values are the revision IDs
117 117 (in either the source or destination revision control system) that
118 118 should be used as the new parents for that node. For example, if
119 119 you have merged "release-1.0" into "trunk", then you should
120 120 specify the revision on "trunk" as the first parent and the one on
121 121 the "release-1.0" branch as the second.
122 122
123 123 The branchmap is a file that allows you to rename a branch when it is
124 124 being brought in from whatever external repository. When used in
125 125 conjunction with a splicemap, it allows for a powerful combination
126 126 to help fix even the most badly mismanaged repositories and turn them
127 127 into nicely structured Mercurial repositories. The branchmap contains
128 128 lines of the form::
129 129
130 130 original_branch_name new_branch_name
131 131
132 132 where "original_branch_name" is the name of the branch in the
133 133 source repository, and "new_branch_name" is the name of the branch
134 134 in the destination repository. No whitespace is allowed in the
135 135 branch names. This can be used to (for instance) move code in one
136 136 repository from "default" to a named branch.
137 137
138 138 Mercurial Source
139 139 ----------------
140 140
141 141 --config convert.hg.ignoreerrors=False (boolean)
142 142 ignore integrity errors when reading. Use it to fix Mercurial
143 143 repositories with missing revlogs, by converting from and to
144 144 Mercurial.
145 145 --config convert.hg.saverev=False (boolean)
146 146 store original revision ID in changeset (forces target IDs to
147 147 change)
148 148 --config convert.hg.startrev=0 (hg revision identifier)
149 149 convert start revision and its descendants
150 150
151 151 CVS Source
152 152 ----------
153 153
154 154 CVS source will use a sandbox (i.e. a checked-out copy) from CVS
155 155 to indicate the starting point of what will be converted. Direct
156 156 access to the repository files is not needed, unless of course the
157 157 repository is :local:. The conversion uses the top level directory
158 158 in the sandbox to find the CVS repository, and then uses CVS rlog
159 159 commands to find files to convert. This means that unless a
160 160 filemap is given, all files under the starting directory will be
161 161 converted, and that any directory reorganization in the CVS
162 162 sandbox is ignored.
163 163
164 164 The options shown are the defaults.
165 165
166 166 --config convert.cvsps.cache=True (boolean)
167 167 Set to False to disable remote log caching, for testing and
168 168 debugging purposes.
169 169 --config convert.cvsps.fuzz=60 (integer)
170 170 Specify the maximum time (in seconds) that is allowed between
171 171 commits with identical user and log message in a single
172 172 changeset. When very large files were checked in as part of a
173 173 changeset then the default may not be long enough.
174 174 --config convert.cvsps.mergeto='{{mergetobranch ([-\\w]+)}}'
175 175 Specify a regular expression to which commit log messages are
176 176 matched. If a match occurs, then the conversion process will
177 177 insert a dummy revision merging the branch on which this log
178 178 message occurs to the branch indicated in the regex.
179 179 --config convert.cvsps.mergefrom='{{mergefrombranch ([-\\w]+)}}'
180 180 Specify a regular expression to which commit log messages are
181 181 matched. If a match occurs, then the conversion process will
182 182 add the most recent revision on the branch indicated in the
183 183 regex as the second parent of the changeset.
184 184 --config hook.cvslog
185 185 Specify a Python function to be called at the end of gathering
186 186 the CVS log. The function is passed a list with the log entries,
187 187 and can modify the entries in-place, or add or delete them.
188 188 --config hook.cvschangesets
189 189 Specify a Python function to be called after the changesets
190 190 are calculated from the CVS log. The function is passed
191 191 a list with the changeset entries, and can modify the changesets
192 192 in-place, or add or delete them.
193 193
194 194 An additional "debugcvsps" Mercurial command allows the builtin
195 195 changeset merging code to be run without doing a conversion. Its
196 196 parameters and output are similar to that of cvsps 2.1. Please see
197 197 the command help for more details.
198 198
199 199 Subversion Source
200 200 -----------------
201 201
202 202 Subversion source detects classical trunk/branches/tags layouts.
203 203 By default, the supplied "svn://repo/path/" source URL is
204 204 converted as a single branch. If "svn://repo/path/trunk" exists it
205 205 replaces the default branch. If "svn://repo/path/branches" exists,
206 206 its subdirectories are listed as possible branches. If
207 207 "svn://repo/path/tags" exists, it is searched for tags referencing
208 208 converted branches. Default "trunk", "branches" and "tags" values
209 209 can be overridden with the following options. Set them to paths
210 210 relative to the source URL, or leave them blank to disable auto
211 211 detection.
212 212
213 213 --config convert.svn.branches=branches (directory name)
214 214 specify the directory containing branches
215 215 --config convert.svn.tags=tags (directory name)
216 216 specify the directory containing tags
217 217 --config convert.svn.trunk=trunk (directory name)
218 218 specify the name of the trunk branch
219 219
220 220 Source history can be retrieved starting at a specific revision,
221 221 instead of being integrally converted. Only single branch
222 222 conversions are supported.
223 223
224 224 --config convert.svn.startrev=0 (svn revision number)
225 225 specify start Subversion revision.
226 226
227 227 Perforce Source
228 228 ---------------
229 229
230 230 The Perforce (P4) importer can be given a p4 depot path or a
231 231 client specification as source. It will convert all files in the
232 232 source to a flat Mercurial repository, ignoring labels, branches
233 233 and integrations. Note that when a depot path is given you then
234 234 usually should specify a target directory, because otherwise the
235 235 target may be named ...-hg.
236 236
237 237 It is possible to limit the amount of source history to be
238 238 converted by specifying an initial Perforce revision.
239 239
240 240 --config convert.p4.startrev=0 (perforce changelist number)
241 241 specify initial Perforce revision.
242 242
243 243 Mercurial Destination
244 244 ---------------------
245 245
246 246 --config convert.hg.clonebranches=False (boolean)
247 247 dispatch source branches in separate clones.
248 248 --config convert.hg.tagsbranch=default (branch name)
249 249 tag revisions branch name
250 250 --config convert.hg.usebranchnames=True (boolean)
251 251 preserve branch names
252 252
253 253 """
254 254 return convcmd.convert(ui, src, dest, revmapfile, **opts)
255 255
256 256 def debugsvnlog(ui, **opts):
257 257 return subversion.debugsvnlog(ui, **opts)
258 258
259 259 def debugcvsps(ui, *args, **opts):
260 260 '''create changeset information from CVS
261 261
262 262 This command is intended as a debugging tool for the CVS to
263 263 Mercurial converter, and can be used as a direct replacement for
264 264 cvsps.
265 265
266 266 Hg debugcvsps reads the CVS rlog for current directory (or any
267 267 named directory) in the CVS repository, and converts the log to a
268 268 series of changesets based on matching commit log entries and
269 269 dates.'''
270 270 return cvsps.debugcvsps(ui, *args, **opts)
271 271
272 272 commands.norepo += " convert debugsvnlog debugcvsps"
273 273
274 274 cmdtable = {
275 275 "convert":
276 276 (convert,
277 277 [('A', 'authors', '',
278 _('username mapping filename'), _('FILE')),
278 _('username mapping filename (DEPRECATED, use --authormap instead)'),
279 _('FILE')),
279 280 ('s', 'source-type', '',
280 281 _('source repository type'), _('TYPE')),
281 282 ('d', 'dest-type', '',
282 283 _('destination repository type'), _('TYPE')),
283 284 ('r', 'rev', '',
284 285 _('import up to target revision REV'), _('REV')),
286 ('', 'authormap', '',
287 _('remap usernames using this file'), _('FILE')),
285 288 ('', 'filemap', '',
286 289 _('remap file names using contents of file'), _('FILE')),
287 290 ('', 'splicemap', '',
288 291 _('splice synthesized history into place'), _('FILE')),
289 292 ('', 'branchmap', '',
290 293 _('change branch names while converting'), _('FILE')),
291 294 ('', 'branchsort', None, _('try to sort changesets by branches')),
292 295 ('', 'datesort', None, _('try to sort changesets by date')),
293 296 ('', 'sourcesort', None, _('preserve source changesets order'))],
294 297 _('hg convert [OPTION]... SOURCE [DEST [REVMAP]]')),
295 298 "debugsvnlog":
296 299 (debugsvnlog,
297 300 [],
298 301 'hg debugsvnlog'),
299 302 "debugcvsps":
300 303 (debugcvsps,
301 304 [
302 305 # Main options shared with cvsps-2.1
303 306 ('b', 'branches', [], _('only return changes on specified branches')),
304 307 ('p', 'prefix', '', _('prefix to remove from file names')),
305 308 ('r', 'revisions', [],
306 309 _('only return changes after or between specified tags')),
307 310 ('u', 'update-cache', None, _("update cvs log cache")),
308 311 ('x', 'new-cache', None, _("create new cvs log cache")),
309 312 ('z', 'fuzz', 60, _('set commit time fuzz in seconds')),
310 313 ('', 'root', '', _('specify cvsroot')),
311 314 # Options specific to builtin cvsps
312 315 ('', 'parents', '', _('show parent changesets')),
313 316 ('', 'ancestors', '', _('show current changeset in ancestor branches')),
314 317 # Options that are ignored for compatibility with cvsps-2.1
315 318 ('A', 'cvs-direct', None, _('ignored for compatibility')),
316 319 ],
317 320 _('hg debugcvsps [OPTION]... [PATH]...')),
318 321 }
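The authormap format documented in the docstring above (one `source author = destination author` pair per line, with empty lines and `#` comments ignored) can be illustrated with a small standalone parser. This is a hedged sketch in Python 3, not the extension's own `readauthormap`; the function name `parse_authormap` and the sample identities are hypothetical, and the sketch assumes that a later line for the same source author simply replaces an earlier one.

```python
def parse_authormap(lines):
    """Parse authormap lines into a {source author: destination author} dict.

    Hypothetical helper for illustration; not part of the convert extension.
    """
    authors = {}
    for raw in lines:
        line = raw.strip()
        # Empty lines and lines starting with '#' are ignored.
        if not line or line.startswith('#'):
            continue
        # Split on the first '=' only, so the destination may contain '='.
        src, sep, dst = line.partition('=')
        if not sep:
            # Malformed line without '=': this sketch silently skips it.
            continue
        # Assumption of this sketch: a later mapping replaces an earlier one.
        authors[src.strip()] = dst.strip()
    return authors


sample = [
    "# CVS logins mapped to full identities",
    "",
    "jdoe = John Doe <jdoe@example.com>",
    "asmith=Alice Smith <alice@example.com>",
    "this line has no separator",
]
mapping = parse_authormap(sample)
```

A file in this format would be passed on the command line with the option added in this change, for instance as `--authormap FILE`.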
@@ -1,430 +1,434 b''
1 1 # convcmd - convert extension commands definition
2 2 #
3 3 # Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
4 4 #
5 5 # This software may be used and distributed according to the terms of the
6 6 # GNU General Public License version 2 or any later version.
7 7
8 8 from common import NoRepo, MissingTool, SKIPREV, mapfile
9 9 from cvs import convert_cvs
10 10 from darcs import darcs_source
11 11 from git import convert_git
12 12 from hg import mercurial_source, mercurial_sink
13 13 from subversion import svn_source, svn_sink
14 14 from monotone import monotone_source
15 15 from gnuarch import gnuarch_source
16 16 from bzr import bzr_source
17 17 from p4 import p4_source
18 18 import filemap
19 19
20 20 import os, shutil
21 21 from mercurial import hg, util, encoding
22 22 from mercurial.i18n import _
23 23
24 24 orig_encoding = 'ascii'
25 25
26 26 def recode(s):
27 27 if isinstance(s, unicode):
28 28 return s.encode(orig_encoding, 'replace')
29 29 else:
30 30 return s.decode('utf-8').encode(orig_encoding, 'replace')
31 31
32 32 source_converters = [
33 33 ('cvs', convert_cvs, 'branchsort'),
34 34 ('git', convert_git, 'branchsort'),
35 35 ('svn', svn_source, 'branchsort'),
36 36 ('hg', mercurial_source, 'sourcesort'),
37 37 ('darcs', darcs_source, 'branchsort'),
38 38 ('mtn', monotone_source, 'branchsort'),
39 39 ('gnuarch', gnuarch_source, 'branchsort'),
40 40 ('bzr', bzr_source, 'branchsort'),
41 41 ('p4', p4_source, 'branchsort'),
42 42 ]
43 43
44 44 sink_converters = [
45 45 ('hg', mercurial_sink),
46 46 ('svn', svn_sink),
47 47 ]
48 48
49 49 def convertsource(ui, path, type, rev):
50 50 exceptions = []
51 51 if type and type not in [s[0] for s in source_converters]:
52 52 raise util.Abort(_('%s: invalid source repository type') % type)
53 53 for name, source, sortmode in source_converters:
54 54 try:
55 55 if not type or name == type:
56 56 return source(ui, path, rev), sortmode
57 57 except (NoRepo, MissingTool), inst:
58 58 exceptions.append(inst)
59 59 if not ui.quiet:
60 60 for inst in exceptions:
61 61 ui.write("%s\n" % inst)
62 62 raise util.Abort(_('%s: missing or unsupported repository') % path)
63 63
64 64 def convertsink(ui, path, type):
65 65 if type and type not in [s[0] for s in sink_converters]:
66 66 raise util.Abort(_('%s: invalid destination repository type') % type)
67 67 for name, sink in sink_converters:
68 68 try:
69 69 if not type or name == type:
70 70 return sink(ui, path)
71 71 except NoRepo, inst:
72 72 ui.note(_("convert: %s\n") % inst)
73 73 raise util.Abort(_('%s: unknown repository type') % path)
74 74
75 75 class progresssource(object):
76 76 def __init__(self, ui, source, filecount):
77 77 self.ui = ui
78 78 self.source = source
79 79 self.filecount = filecount
80 80 self.retrieved = 0
81 81
82 82 def getfile(self, file, rev):
83 83 self.retrieved += 1
84 84 self.ui.progress(_('getting files'), self.retrieved,
85 85 item=file, total=self.filecount)
86 86 return self.source.getfile(file, rev)
87 87
88 88 def lookuprev(self, rev):
89 89 return self.source.lookuprev(rev)
90 90
91 91 def close(self):
92 92 self.ui.progress(_('getting files'), None)
93 93
94 94 class converter(object):
95 95 def __init__(self, ui, source, dest, revmapfile, opts):
96 96
97 97 self.source = source
98 98 self.dest = dest
99 99 self.ui = ui
100 100 self.opts = opts
101 101 self.commitcache = {}
102 102 self.authors = {}
103 103 self.authorfile = None
104 104
105 105 # Record converted revisions persistently: maps source revision
106 106 # ID to target revision ID (both strings). (This is how
107 107 # incremental conversions work.)
108 108 self.map = mapfile(ui, revmapfile)
109 109
110 110 # Read first the dst author map if any
111 111 authorfile = self.dest.authorfile()
112 112 if authorfile and os.path.exists(authorfile):
113 113 self.readauthormap(authorfile)
114 114 # Extend/Override with new author map if necessary
115 if opts.get('authors'):
116 self.readauthormap(opts.get('authors'))
115 if opts.get('authormap'):
116 self.readauthormap(opts.get('authormap'))
117 117 self.authorfile = self.dest.authorfile()
118 118
119 119 self.splicemap = mapfile(ui, opts.get('splicemap'))
120 120 self.branchmap = mapfile(ui, opts.get('branchmap'))
121 121
122 122 def walktree(self, heads):
123 123 '''Return a mapping that identifies the uncommitted parents of every
124 124 uncommitted changeset.'''
125 125 visit = heads
126 126 known = set()
127 127 parents = {}
128 128 while visit:
129 129 n = visit.pop(0)
130 130 if n in known or n in self.map:
131 131 continue
132 132 known.add(n)
133 133 self.ui.progress(_('scanning'), len(known), unit=_('revisions'))
134 134 commit = self.cachecommit(n)
135 135 parents[n] = []
136 136 for p in commit.parents:
137 137 parents[n].append(p)
138 138 visit.append(p)
139 139 self.ui.progress(_('scanning'), None)
140 140
141 141 return parents
142 142
143 143 def toposort(self, parents, sortmode):
144 144 '''Return an ordering such that every uncommitted changeset is
145 145 preceded by all its uncommitted ancestors.'''
146 146
147 147 def mapchildren(parents):
148 148 """Return a (children, roots) tuple where 'children' maps parent
149 149 revision identifiers to children ones, and 'roots' is the list of
150 150 revisions without parents. 'parents' must be a mapping of revision
151 151 identifier to its parents ones.
152 152 """
153 153 visit = parents.keys()
154 154 seen = set()
155 155 children = {}
156 156 roots = []
157 157
158 158 while visit:
159 159 n = visit.pop(0)
160 160 if n in seen:
161 161 continue
162 162 seen.add(n)
163 163 # Ensure that nodes without parents are present in the
164 164 # 'children' mapping.
165 165 children.setdefault(n, [])
166 166 hasparent = False
167 167 for p in parents[n]:
168 168 if not p in self.map:
169 169 visit.append(p)
170 170 hasparent = True
171 171 children.setdefault(p, []).append(n)
172 172 if not hasparent:
173 173 roots.append(n)
174 174
175 175 return children, roots
176 176
177 177 # Sort functions are supposed to take a list of revisions which
178 178 # can be converted immediately and pick one
179 179
180 180 def makebranchsorter():
181 181 """If the previously converted revision has a child in the
182 182 eligible revisions list, pick it. Return the list head
183 183 otherwise. Branch sort attempts to minimize branch
184 184 switching, which is harmful for Mercurial backend
185 185 compression.
186 186 """
187 187 prev = [None]
188 188 def picknext(nodes):
189 189 next = nodes[0]
190 190 for n in nodes:
191 191 if prev[0] in parents[n]:
192 192 next = n
193 193 break
194 194 prev[0] = next
195 195 return next
196 196 return picknext
197 197
198 198 def makesourcesorter():
199 199 """Source specific sort."""
200 200 keyfn = lambda n: self.commitcache[n].sortkey
201 201 def picknext(nodes):
202 202 return sorted(nodes, key=keyfn)[0]
203 203 return picknext
204 204
205 205 def makedatesorter():
206 206 """Sort revisions by date."""
207 207 dates = {}
208 208 def getdate(n):
209 209 if n not in dates:
210 210 dates[n] = util.parsedate(self.commitcache[n].date)
211 211 return dates[n]
212 212
213 213 def picknext(nodes):
214 214 return min([(getdate(n), n) for n in nodes])[1]
215 215
216 216 return picknext
217 217
218 218 if sortmode == 'branchsort':
219 219 picknext = makebranchsorter()
220 220 elif sortmode == 'datesort':
221 221 picknext = makedatesorter()
222 222 elif sortmode == 'sourcesort':
223 223 picknext = makesourcesorter()
224 224 else:
225 225 raise util.Abort(_('unknown sort mode: %s') % sortmode)
226 226
227 227 children, actives = mapchildren(parents)
228 228
229 229 s = []
230 230 pendings = {}
231 231 while actives:
232 232 n = picknext(actives)
233 233 actives.remove(n)
234 234 s.append(n)
235 235
236 236 # Update dependents list
237 237 for c in children.get(n, []):
238 238 if c not in pendings:
239 239 pendings[c] = [p for p in parents[c] if p not in self.map]
240 240 try:
241 241 pendings[c].remove(n)
242 242 except ValueError:
243 243 raise util.Abort(_('cycle detected between %s and %s')
244 244 % (recode(c), recode(n)))
245 245 if not pendings[c]:
246 246 # Parents are converted, node is eligible
247 247 actives.insert(0, c)
248 248 pendings[c] = None
249 249
250 250 if len(s) != len(parents):
251 251 raise util.Abort(_("not all revisions were sorted"))
252 252
253 253 return s
254 254
255 255 def writeauthormap(self):
256 256 authorfile = self.authorfile
257 257 if authorfile:
258 258 self.ui.status(_('Writing author map file %s\n') % authorfile)
259 259 ofile = open(authorfile, 'w+')
260 260 for author in self.authors:
261 261 ofile.write("%s=%s\n" % (author, self.authors[author]))
262 262 ofile.close()
263 263
264 264 def readauthormap(self, authorfile):
265 265 afile = open(authorfile, 'r')
266 266 for line in afile:
267 267
268 268 line = line.strip()
269 269 if not line or line.startswith('#'):
270 270 continue
271 271
272 272 try:
273 273 srcauthor, dstauthor = line.split('=', 1)
274 274 except ValueError:
275 275 msg = _('Ignoring bad line in author map file %s: %s\n')
276 276 self.ui.warn(msg % (authorfile, line.rstrip()))
277 277 continue
278 278
279 279 srcauthor = srcauthor.strip()
280 280 dstauthor = dstauthor.strip()
281 281 if self.authors.get(srcauthor) in (None, dstauthor):
282 282 msg = _('mapping author %s to %s\n')
283 283 self.ui.debug(msg % (srcauthor, dstauthor))
284 284 self.authors[srcauthor] = dstauthor
285 285 continue
286 286
287 287 m = _('overriding mapping for author %s, was %s, will be %s\n')
288 288 self.ui.status(m % (srcauthor, self.authors[srcauthor], dstauthor))
289 289
290 290 afile.close()
291 291
292 292 def cachecommit(self, rev):
293 293 commit = self.source.getcommit(rev)
294 294 commit.author = self.authors.get(commit.author, commit.author)
295 295 commit.branch = self.branchmap.get(commit.branch, commit.branch)
296 296 self.commitcache[rev] = commit
297 297 return commit
298 298
299 299 def copy(self, rev):
300 300 commit = self.commitcache[rev]
301 301
302 302 changes = self.source.getchanges(rev)
303 303 if isinstance(changes, basestring):
304 304 if changes == SKIPREV:
305 305 dest = SKIPREV
306 306 else:
307 307 dest = self.map[changes]
308 308 self.map[rev] = dest
309 309 return
310 310 files, copies = changes
311 311 pbranches = []
312 312 if commit.parents:
313 313 for prev in commit.parents:
314 314 if prev not in self.commitcache:
315 315 self.cachecommit(prev)
316 316 pbranches.append((self.map[prev],
317 317 self.commitcache[prev].branch))
318 318 self.dest.setbranch(commit.branch, pbranches)
319 319 try:
320 320 parents = self.splicemap[rev].replace(',', ' ').split()
321 321 self.ui.status(_('spliced in %s as parents of %s\n') %
322 322 (parents, rev))
323 323 parents = [self.map.get(p, p) for p in parents]
324 324 except KeyError:
325 325 parents = [b[0] for b in pbranches]
326 326 source = progresssource(self.ui, self.source, len(files))
327 327 newnode = self.dest.putcommit(files, copies, parents, commit,
328 328 source, self.map)
329 329 source.close()
330 330 self.source.converted(rev, newnode)
331 331 self.map[rev] = newnode
332 332
333 333 def convert(self, sortmode):
334 334 try:
335 335 self.source.before()
336 336 self.dest.before()
337 337 self.source.setrevmap(self.map)
338 338 self.ui.status(_("scanning source...\n"))
339 339 heads = self.source.getheads()
340 340 parents = self.walktree(heads)
341 341 self.ui.status(_("sorting...\n"))
342 342 t = self.toposort(parents, sortmode)
343 343 num = len(t)
344 344 c = None
345 345
346 346 self.ui.status(_("converting...\n"))
347 347 for i, c in enumerate(t):
348 348 num -= 1
349 349 desc = self.commitcache[c].desc
350 350 if "\n" in desc:
351 351 desc = desc.splitlines()[0]
352 352 # convert log message to local encoding without using
353 353 # tolocal() because convert() sets encoding.encoding to
354 354 # 'utf-8'
355 355 self.ui.status("%d %s\n" % (num, recode(desc)))
356 356 self.ui.note(_("source: %s\n") % recode(c))
357 357 self.ui.progress(_('converting'), i, unit=_('revisions'),
358 358 total=len(t))
359 359 self.copy(c)
360 360 self.ui.progress(_('converting'), None)
361 361
362 362 tags = self.source.gettags()
363 363 ctags = {}
364 364 for k in tags:
365 365 v = tags[k]
366 366 if self.map.get(v, SKIPREV) != SKIPREV:
367 367 ctags[k] = self.map[v]
368 368
369 369 if c and ctags:
370 370 nrev, tagsparent = self.dest.puttags(ctags)
371 371 if nrev and tagsparent:
372 372 # write another hash correspondence to override the previous
373 373 # one so we don't end up with extra tag heads
374 374 tagsparents = [e for e in self.map.iteritems()
375 375 if e[1] == tagsparent]
376 376 if tagsparents:
377 377 self.map[tagsparents[0][0]] = nrev
378 378
379 379 self.writeauthormap()
380 380 finally:
381 381 self.cleanup()
382 382
383 383 def cleanup(self):
384 384 try:
385 385 self.dest.after()
386 386 finally:
387 387 self.source.after()
388 388 self.map.close()
389 389
390 390 def convert(ui, src, dest=None, revmapfile=None, **opts):
391 391 global orig_encoding
392 392 orig_encoding = encoding.encoding
393 393 encoding.encoding = 'UTF-8'
394 394
395 # support --authors as an alias for --authormap
396 if not opts.get('authormap'):
397 opts['authormap'] = opts.get('authors')
398
395 399 if not dest:
396 400 dest = hg.defaultdest(src) + "-hg"
397 401 ui.status(_("assuming destination %s\n") % dest)
398 402
399 403 destc = convertsink(ui, dest, opts.get('dest_type'))
400 404
401 405 try:
402 406 srcc, defaultsort = convertsource(ui, src, opts.get('source_type'),
403 407 opts.get('rev'))
404 408 except Exception:
405 409 for path in destc.created:
406 410 shutil.rmtree(path, True)
407 411 raise
408 412
409 413 sortmodes = ('branchsort', 'datesort', 'sourcesort')
410 414 sortmode = [m for m in sortmodes if opts.get(m)]
411 415 if len(sortmode) > 1:
412 416 raise util.Abort(_('more than one sort mode specified'))
413 417 sortmode = sortmode and sortmode[0] or defaultsort
414 418 if sortmode == 'sourcesort' and not srcc.hasnativeorder():
415 419 raise util.Abort(_('--sourcesort is not supported by this data source'))
416 420
417 421 fmap = opts.get('filemap')
418 422 if fmap:
419 423 srcc = filemap.filemap_source(ui, srcc, fmap)
420 424 destc.setfilemapmode(True)
421 425
422 426 if not revmapfile:
423 427 try:
424 428 revmapfile = destc.revmapfile()
425 429 except:
426 430 revmapfile = os.path.join(destc, "map")
427 431
428 432 c = converter(ui, srcc, destc, revmapfile, opts)
429 433 c.convert(sortmode)
430 434
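The compatibility shim added to `convert()` in this file keeps the deprecated `--authors` option working by copying its value into `authormap` only when `--authormap` was not given. A minimal Python 3 sketch of that precedence rule follows; `resolve_authormap` is a hypothetical name, and unlike the change itself it returns a copy of the options rather than updating them in place.

```python
def resolve_authormap(opts):
    """Return a copy of opts where 'authormap' falls back to 'authors'.

    Hypothetical helper mirroring the fallback in this change:
    --authormap wins when set, otherwise --authors is used.
    """
    resolved = dict(opts)
    if not resolved.get('authormap'):
        resolved['authormap'] = resolved.get('authors')
    return resolved


# Only the deprecated option is given: its value is used.
only_old = resolve_authormap({'authors': 'old.map', 'authormap': ''})
# Both are given: the new option takes precedence.
both = resolve_authormap({'authors': 'old.map', 'authormap': 'new.map'})
# Neither is given: the result stays empty, so no map is read.
neither = resolve_authormap({'authors': '', 'authormap': ''})
```

With this rule the `converter` class only needs to look at `opts.get('authormap')`, which matches the replacement of `opts.get('authors')` in its constructor.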
@@ -1,318 +1,318 b''
1 1 hg convert [OPTION]... SOURCE [DEST [REVMAP]]
2 2
3 3 convert a foreign SCM repository to a Mercurial one.
4 4
5 5 Accepted source formats [identifiers]:
6 6
7 7 - Mercurial [hg]
8 8 - CVS [cvs]
9 9 - Darcs [darcs]
10 10 - git [git]
11 11 - Subversion [svn]
12 12 - Monotone [mtn]
13 13 - GNU Arch [gnuarch]
14 14 - Bazaar [bzr]
15 15 - Perforce [p4]
16 16
17 17 Accepted destination formats [identifiers]:
18 18
19 19 - Mercurial [hg]
20 20 - Subversion [svn] (history on branches is not preserved)
21 21
22 22 If no revision is given, all revisions will be converted. Otherwise,
23 23 convert will only import up to the named revision (given in a format
24 24 understood by the source).
25 25
26 26 If no destination directory name is specified, it defaults to the basename
27 27 of the source with "-hg" appended. If the destination repository doesn't
28 28 exist, it will be created.
29 29
30 30 By default, all sources except Mercurial will use --branchsort. Mercurial
31 31 uses --sourcesort to preserve original revision numbers order. Sort modes
32 32 have the following effects:
33 33
34 34 --branchsort convert from parent to child revision when possible, which
35 35 means branches are usually converted one after the other. It
36 36 generates more compact repositories.
37 37 --datesort sort revisions by date. Converted repositories have good-
38 38 looking changelogs but are often an order of magnitude
39 39 larger than the same ones generated by --branchsort.
40 40 --sourcesort try to preserve source revisions order, only supported by
41 41 Mercurial sources.
42 42
43 43 If <REVMAP> isn't given, it will be put in a default location
44 44 (<dest>/.hg/shamap by default). The <REVMAP> is a simple text file that
45 45 maps each source commit ID to the destination ID for that revision, like
46 46 so:
47 47
48 48 <source ID> <destination ID>
49 49
50 50 If the file doesn't exist, it's automatically created. It's updated on
51 51 each commit copied, so "hg convert" can be interrupted and can be run
52 52 repeatedly to copy new commits.
53 53
54 The username mapping file is a simple text file that maps each source
55 commit author to a destination commit author. It is handy for source SCMs
56 that use unix logins to identify authors (eg: CVS). One line per author
57 mapping and the line format is:
54 The authormap is a simple text file that maps each source commit author to
55 a destination commit author. It is handy for source SCMs that use unix
56 logins to identify authors (eg: CVS). One line per author mapping and the
57 line format is:
58 58
59 59 source author = destination author
60 60
61 61 Empty lines and lines starting with a "#" are ignored.
62 62
63 63 The filemap is a file that allows filtering and remapping of files and
64 64 directories. Each line can contain one of the following directives:
65 65
66 66 include path/to/file-or-dir
67 67
68 68 exclude path/to/file-or-dir
69 69
70 70 rename path/to/source path/to/destination
71 71
72 72 Comment lines start with "#". A specified path matches if it equals the
73 73 full relative name of a file or one of its parent directories. The
74 74 "include" or "exclude" directive with the longest matching path applies,
75 75 so line order does not matter.
76 76
77 77 The "include" directive causes a file, or all files under a directory, to
78 78 be included in the destination repository, and the exclusion of all other
79 79 files and directories not explicitly included. The "exclude" directive
80 80 causes files or directories to be omitted. The "rename" directive renames
81 81 a file or directory if it is converted. To rename from a subdirectory into
82 82 the root of the repository, use "." as the path to rename to.
83 83
84 84 The splicemap is a file that allows insertion of synthetic history,
85 85 letting you specify the parents of a revision. This is useful if you want
86 86 to e.g. give a Subversion merge two parents, or graft two disconnected
87 87 series of history together. Each entry contains a key, followed by a
88 88 space, followed by one or two comma-separated values:
89 89
90 90 key parent1, parent2
91 91
92 92 The key is the revision ID in the source revision control system whose
93 93 parents should be modified (same format as a key in .hg/shamap). The
94 94 values are the revision IDs (in either the source or destination revision
95 95 control system) that should be used as the new parents for that node. For
96 96 example, if you have merged "release-1.0" into "trunk", then you should
97 97 specify the revision on "trunk" as the first parent and the one on the
98 98 "release-1.0" branch as the second.
99 99
100 100 The branchmap is a file that allows you to rename a branch when it is
101 101 being brought in from whatever external repository. When used in
102 102 conjunction with a splicemap, it allows for a powerful combination to help
103 103 fix even the most badly mismanaged repositories and turn them into nicely
104 104 structured Mercurial repositories. The branchmap contains lines of the
105 105 form:
106 106
107 107 original_branch_name new_branch_name
108 108
109 109 where "original_branch_name" is the name of the branch in the source
110 110 repository, and "new_branch_name" is the name of the branch in the
111 111 destination repository. No whitespace is allowed in the branch names. This
112 112 can be used to (for instance) move code in one repository from "default"
113 113 to a named branch.
114 114
115 115 Mercurial Source
116 116 ----------------
117 117
118 118 --config convert.hg.ignoreerrors=False (boolean)
119 119 ignore integrity errors when reading. Use it to fix Mercurial
120 120 repositories with missing revlogs, by converting from and to
121 121 Mercurial.
122 122
123 123 --config convert.hg.saverev=False (boolean)
124 124 store original revision ID in changeset (forces target IDs to change)
125 125
126 126 --config convert.hg.startrev=0 (hg revision identifier)
127 127 convert start revision and its descendants
128 128
129 129 CVS Source
130 130 ----------
131 131
132 132 CVS source will use a sandbox (i.e. a checked-out copy) from CVS to
133 133 indicate the starting point of what will be converted. Direct access to
134 134 the repository files is not needed, unless of course the repository is
135 135 :local:. The conversion uses the top level directory in the sandbox to
136 136 find the CVS repository, and then uses CVS rlog commands to find files to
137 137 convert. This means that unless a filemap is given, all files under the
138 138 starting directory will be converted, and that any directory
139 139 reorganization in the CVS sandbox is ignored.
140 140
141 141 The options shown are the defaults.
142 142
143 143 --config convert.cvsps.cache=True (boolean)
144 144 Set to False to disable remote log caching, for testing and debugging
145 145 purposes.
146 146
147 147 --config convert.cvsps.fuzz=60 (integer)
148 148 Specify the maximum time (in seconds) that is allowed between commits
149 149 with identical user and log message in a single changeset. When very
150 150 large files were checked in as part of a changeset then the default
151 151 may not be long enough.
152 152
153 153 --config convert.cvsps.mergeto='{{mergetobranch ([-\w]+)}}'
154 154 Specify a regular expression to which commit log messages are matched.
155 155 If a match occurs, then the conversion process will insert a dummy
156 156 revision merging the branch on which this log message occurs to the
157 157 branch indicated in the regex.
158 158
159 159 --config convert.cvsps.mergefrom='{{mergefrombranch ([-\w]+)}}'
160 160 Specify a regular expression to which commit log messages are matched.
161 161 If a match occurs, then the conversion process will add the most
162 162 recent revision on the branch indicated in the regex as the second
163 163 parent of the changeset.
164 164
165 165 --config hook.cvslog
166 166 Specify a Python function to be called at the end of gathering the CVS
167 167 log. The function is passed a list with the log entries, and can
168 168 modify the entries in-place, or add or delete them.
169 169
170 170 --config hook.cvschangesets
171 171 Specify a Python function to be called after the changesets are
172 172 calculated from the CVS log. The function is passed a list with
173 173 the changeset entries, and can modify the changesets in-place, or add
174 174 or delete them.
175 175
176 176 An additional "debugcvsps" Mercurial command allows the builtin changeset
177 177 merging code to be run without doing a conversion. Its parameters and
178 178 output are similar to that of cvsps 2.1. Please see the command help for
179 179 more details.
180 180
181 181 Subversion Source
182 182 -----------------
183 183
184 184 Subversion source detects classical trunk/branches/tags layouts. By
185 185 default, the supplied "svn://repo/path/" source URL is converted as a
186 186 single branch. If "svn://repo/path/trunk" exists it replaces the default
187 187 branch. If "svn://repo/path/branches" exists, its subdirectories are
188 188 listed as possible branches. If "svn://repo/path/tags" exists, it is
189 189 looked for tags referencing converted branches. Default "trunk",
190 190 "branches" and "tags" values can be overridden with following options. Set
191 191 them to paths relative to the source URL, or leave them blank to disable
192 192 auto detection.
193 193
194 194 --config convert.svn.branches=branches (directory name)
195 195 specify the directory containing branches
196 196
197 197 --config convert.svn.tags=tags (directory name)
198 198 specify the directory containing tags
199 199
200 200 --config convert.svn.trunk=trunk (directory name)
201 201 specify the name of the trunk branch
202 202
203 203 Source history can be retrieved starting at a specific revision, instead
204 204 of being integrally converted. Only single branch conversions are
205 205 supported.
206 206
207 207 --config convert.svn.startrev=0 (svn revision number)
208 208 specify start Subversion revision.
209 209
210 210 Perforce Source
211 211 ---------------
212 212
213 213 The Perforce (P4) importer can be given a p4 depot path or a client
214 214 specification as source. It will convert all files in the source to a flat
215 215 Mercurial repository, ignoring labels, branches and integrations. Note
216 216 that when a depot path is given you then usually should specify a target
217 217 directory, because otherwise the target may be named ...-hg.
218 218
219 219 It is possible to limit the amount of source history to be converted by
220 220 specifying an initial Perforce revision.
221 221
222 222 --config convert.p4.startrev=0 (perforce changelist number)
223 223 specify initial Perforce revision.
224 224
225 225 Mercurial Destination
226 226 ---------------------
227 227
228 228 --config convert.hg.clonebranches=False (boolean)
229 229 dispatch source branches in separate clones.
230 230
231 231 --config convert.hg.tagsbranch=default (branch name)
232 232 tag revisions branch name
233 233
234 234 --config convert.hg.usebranchnames=True (boolean)
235 235 preserve branch names
236 236
237 237 options:
238 238
239 -A --authors FILE username mapping filename
240 239 -s --source-type TYPE source repository type
241 240 -d --dest-type TYPE destination repository type
242 241 -r --rev REV import up to target revision REV
242 --authormap FILE remap usernames using this file
243 243 --filemap FILE remap file names using contents of file
244 244 --splicemap FILE splice synthesized history into place
245 245 --branchmap FILE change branch names while converting
246 246 --branchsort try to sort changesets by branches
247 247 --datesort try to sort changesets by date
248 248 --sourcesort preserve source changesets order
249 249
250 250 use "hg -v help convert" to show global options
251 251 adding a
252 252 assuming destination a-hg
253 253 initializing destination a-hg repository
254 254 scanning source...
255 255 sorting...
256 256 converting...
257 257 4 a
258 258 3 b
259 259 2 c
260 260 1 d
261 261 0 e
262 262 pulling from ../a
263 263 searching for changes
264 264 no changes found
265 265 % should fail
266 266 initializing destination bogusfile repository
267 267 abort: cannot create new bundle repository
268 268 % should fail
269 269 abort: Permission denied: bogusdir
270 270 % should succeed
271 271 initializing destination bogusdir repository
272 272 scanning source...
273 273 sorting...
274 274 converting...
275 275 4 a
276 276 3 b
277 277 2 c
278 278 1 d
279 279 0 e
280 280 % test pre and post conversion actions
281 281 run hg source pre-conversion action
282 282 run hg sink pre-conversion action
283 283 run hg sink post-conversion action
284 284 run hg source post-conversion action
285 285 % converting empty dir should fail nicely
286 286 assuming destination emptydir-hg
287 287 initializing destination emptydir-hg repository
288 288 emptydir does not look like a CVS checkout
289 289 emptydir does not look like a Git repository
290 290 emptydir does not look like a Subversion repository
291 291 emptydir is not a local Mercurial repository
292 292 emptydir does not look like a darcs repository
293 293 emptydir does not look like a monotone repository
294 294 emptydir does not look like a GNU Arch repository
295 295 emptydir does not look like a Bazaar repository
296 296 cannot find required "p4" tool
297 297 abort: emptydir: missing or unsupported repository
298 298 % convert with imaginary source type
299 299 initializing destination a-foo repository
300 300 abort: foo: invalid source repository type
301 301 % convert with imaginary sink type
302 302 abort: foo: invalid destination repository type
303 303
304 304 % testing: convert must not produce duplicate entries in fncache
305 305 initializing destination b repository
306 306 scanning source...
307 307 sorting...
308 308 converting...
309 309 4 a
310 310 3 b
311 311 2 c
312 312 1 d
313 313 0 e
314 314 % contents of fncache file:
315 315 data/a.i
316 316 data/b.i
317 317 % test bogus URL
318 318 abort: bzr+ssh://foobar@selenic.com/baz: missing or unsupported repository
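
The help text in this diff describes the authormap format: one "source author = destination author" mapping per line, with empty lines and lines starting with "#" ignored. A minimal sketch of a reader for that format follows. The function name parse_authormap, the sample authors, and the choice to skip lines without "=" are assumptions made for illustration; this is not the code the convert extension itself uses.

```python
def parse_authormap(lines):
    """Build a dict of source author -> destination author.

    Hypothetical helper that follows the format described in the help
    text: "source author = destination author", one mapping per line.
    """
    authors = {}
    for line in lines:
        line = line.strip()
        # Empty lines and lines starting with "#" are ignored.
        if not line or line.startswith("#"):
            continue
        # Assumption: a line without "=" is malformed and is skipped.
        if "=" not in line:
            continue
        # Split on the first "=" only, so the destination may contain "=".
        src, dst = line.split("=", 1)
        authors[src.strip()] = dst.strip()
    return authors


# Made-up sample data in the documented format.
sample = """\
# unix logins from CVS mapped to full names
jdoe = John Doe <jdoe@example.com>

mary=Mary Major <mary@example.com>
"""

authormap = parse_authormap(sample.splitlines())

# Authors without a mapping keep their original name in this sketch.
for author in ("jdoe", "mary", "unknown"):
    print(author, "->", authormap.get(author, author))
```

A file in this format would be passed with the new option, for example "hg convert --authormap authors.txt src dest", where authors.txt, src and dest are placeholder names.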