convert: implement two hooks in builtin cvsps
Frank Kingswood -
r10095:69ce7a10 default
@@ -1,285 +1,294
1 1 # convert.py Foreign SCM converter
2 2 #
3 3 # Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
4 4 #
5 5 # This software may be used and distributed according to the terms of the
6 6 # GNU General Public License version 2, incorporated herein by reference.
7 7
8 8 '''import revisions from foreign VCS repositories into Mercurial'''
9 9
10 10 import convcmd
11 11 import cvsps
12 12 import subversion
13 13 from mercurial import commands
14 14 from mercurial.i18n import _
15 15
16 16 # Commands definition was moved elsewhere to ease demandload job.
17 17
18 18 def convert(ui, src, dest=None, revmapfile=None, **opts):
19 19 """convert a foreign SCM repository to a Mercurial one.
20 20
21 21 Accepted source formats [identifiers]:
22 22
23 23 - Mercurial [hg]
24 24 - CVS [cvs]
25 25 - Darcs [darcs]
26 26 - git [git]
27 27 - Subversion [svn]
28 28 - Monotone [mtn]
29 29 - GNU Arch [gnuarch]
30 30 - Bazaar [bzr]
31 31 - Perforce [p4]
32 32
33 33 Accepted destination formats [identifiers]:
34 34
35 35 - Mercurial [hg]
36 36 - Subversion [svn] (history on branches is not preserved)
37 37
38 38 If no revision is given, all revisions will be converted.
39 39 Otherwise, convert will only import up to the named revision
40 40 (given in a format understood by the source).
41 41
42 42 If no destination directory name is specified, it defaults to the
43 43 basename of the source with '-hg' appended. If the destination
44 44 repository doesn't exist, it will be created.
45 45
46 46 By default, all sources except Mercurial will use --branchsort.
47 47 Mercurial uses --sourcesort to preserve original revision numbers
48 48 order. Sort modes have the following effects:
49 49
50 50 --branchsort convert from parent to child revision when possible,
51 51 which means branches are usually converted one after
52 52 the other. It generates more compact repositories.
53 53
54 54 --datesort sort revisions by date. Converted repositories have
55 55 good-looking changelogs but are often an order of
56 56 magnitude larger than the same ones generated by
57 57 --branchsort.
58 58
59 59 --sourcesort try to preserve source revisions order, only
60 60 supported by Mercurial sources.
61 61
62 62 If <REVMAP> isn't given, it will be put in a default location
63 63 (<dest>/.hg/shamap by default). The <REVMAP> is a simple text file
64 64 that maps each source commit ID to the destination ID for that
65 65 revision, like so::
66 66
67 67 <source ID> <destination ID>
68 68
69 69 If the file doesn't exist, it's automatically created. It's
70 70 updated on each commit copied, so convert-repo can be interrupted
71 71 and can be run repeatedly to copy new commits.
72 72
73 73 The [username mapping] file is a simple text file that maps each
74 74 source commit author to a destination commit author. It is handy
75 75 for source SCMs that use unix logins to identify authors (eg:
76 76 CVS). One line per author mapping and the line format is:
77 77 srcauthor=whatever string you want
78 78
79 79 The filemap is a file that allows filtering and remapping of files
80 80 and directories. Comment lines start with '#'. Each line can
81 81 contain one of the following directives::
82 82
83 83 include path/to/file
84 84
85 85 exclude path/to/file
86 86
87 87 rename from/file to/file
88 88
89 89 The 'include' directive causes a file, or all files under a
90 90 directory, to be included in the destination repository, and the
91 91 exclusion of all other files and directories not explicitly
92 92 included. The 'exclude' directive causes files or directories to
93 93 be omitted. The 'rename' directive renames a file or directory. To
94 94 rename from a subdirectory into the root of the repository, use
95 95 '.' as the path to rename to.
96 96
97 97 The splicemap is a file that allows insertion of synthetic
98 98 history, letting you specify the parents of a revision. This is
99 99 useful if you want to e.g. give a Subversion merge two parents, or
100 100 graft two disconnected series of history together. Each entry
101 101 contains a key, followed by a space, followed by one or two
102 102 comma-separated values. The key is the revision ID in the source
103 103 revision control system whose parents should be modified (same
104 104 format as a key in .hg/shamap). The values are the revision IDs
105 105 (in either the source or destination revision control system) that
106 106 should be used as the new parents for that node. For example, if
107 107 you have merged "release-1.0" into "trunk", then you should
108 108 specify the revision on "trunk" as the first parent and the one on
109 109 the "release-1.0" branch as the second.
110 110
111 111 The branchmap is a file that allows you to rename a branch when it is
112 112 being brought in from whatever external repository. When used in
113 113 conjunction with a splicemap, it allows for a powerful combination
114 114 to help fix even the most badly mismanaged repositories and turn them
115 115 into nicely structured Mercurial repositories. The branchmap contains
116 116 lines of the form "original_branch_name new_branch_name".
117 117 "original_branch_name" is the name of the branch in the source
118 118 repository, and "new_branch_name" is the name of the branch in the
119 119 destination repository. This can be used to (for instance) move code
120 120 in one repository from "default" to a named branch.
121 121
122 122 Mercurial Source
123 123 ----------------
124 124
125 125 --config convert.hg.ignoreerrors=False (boolean)
126 126 ignore integrity errors when reading. Use it to fix Mercurial
127 127 repositories with missing revlogs, by converting from and to
128 128 Mercurial.
129 129 --config convert.hg.saverev=False (boolean)
130 130 store original revision ID in changeset (forces target IDs to
131 131 change)
132 132 --config convert.hg.startrev=0 (hg revision identifier)
133 133 convert start revision and its descendants
134 134
135 135 CVS Source
136 136 ----------
137 137
138 138 CVS source will use a sandbox (i.e. a checked-out copy) from CVS
139 139 to indicate the starting point of what will be converted. Direct
140 140 access to the repository files is not needed, unless of course the
141 141 repository is :local:. The conversion uses the top level directory
142 142 in the sandbox to find the CVS repository, and then uses CVS rlog
143 143 commands to find files to convert. This means that unless a
144 144 filemap is given, all files under the starting directory will be
145 145 converted, and that any directory reorganization in the CVS
146 146 sandbox is ignored.
147 147
148 148 The options shown are the defaults.
149 149
150 150 --config convert.cvsps.cache=True (boolean)
151 151 Set to False to disable remote log caching, for testing and
152 152 debugging purposes.
153 153 --config convert.cvsps.fuzz=60 (integer)
154 154 Specify the maximum time (in seconds) that is allowed between
155 155 commits with identical user and log message in a single
156 156 changeset. When very large files were checked in as part of a
157 157 changeset then the default may not be long enough.
158 158 --config convert.cvsps.mergeto='{{mergetobranch ([-\\w]+)}}'
159 159 Specify a regular expression to which commit log messages are
160 160 matched. If a match occurs, then the conversion process will
161 161 insert a dummy revision merging the branch on which this log
162 162 message occurs to the branch indicated in the regex.
163 163 --config convert.cvsps.mergefrom='{{mergefrombranch ([-\\w]+)}}'
164 164 Specify a regular expression to which commit log messages are
165 165 matched. If a match occurs, then the conversion process will
166 166 add the most recent revision on the branch indicated in the
167 167 regex as the second parent of the changeset.
168 --config hook.cvslog
169 Specify a Python function to be called at the end of gathering
170 the CVS log. The function is passed a list with the log entries,
171 and can modify the entries in-place, or add or delete them.
172 --config hook.cvschangesets
173 Specify a Python function to be called after the changesets
174 are calculated from the CVS log. The function is passed
175 a list with the changeset entries, and can modify the changesets
176 in-place, or add or delete them.
168 177
169 178 An additional "debugcvsps" Mercurial command allows the builtin
170 179 changeset merging code to be run without doing a conversion. Its
171 180 parameters and output are similar to that of cvsps 2.1. Please see
172 181 the command help for more details.
173 182
174 183 Subversion Source
175 184 -----------------
176 185
177 186 Subversion source detects classical trunk/branches/tags layouts.
178 187 By default, the supplied "svn://repo/path/" source URL is
179 188 converted as a single branch. If "svn://repo/path/trunk" exists it
180 189 replaces the default branch. If "svn://repo/path/branches" exists,
181 190 its subdirectories are listed as possible branches. If
182 191 "svn://repo/path/tags" exists, it is searched for tags referencing
183 192 converted branches. Default "trunk", "branches" and "tags" values
184 193 can be overridden with the following options. Set them to paths
185 194 relative to the source URL, or leave them blank to disable auto
186 195 detection.
187 196
188 197 --config convert.svn.branches=branches (directory name)
189 198 specify the directory containing branches
190 199 --config convert.svn.tags=tags (directory name)
191 200 specify the directory containing tags
192 201 --config convert.svn.trunk=trunk (directory name)
193 202 specify the name of the trunk branch
194 203
195 204 Source history can be retrieved starting at a specific revision,
196 205 instead of being integrally converted. Only single branch
197 206 conversions are supported.
198 207
199 208 --config convert.svn.startrev=0 (svn revision number)
200 209 specify start Subversion revision.
201 210
202 211 Perforce Source
203 212 ---------------
204 213
205 214 The Perforce (P4) importer can be given a p4 depot path or a
206 215 client specification as source. It will convert all files in the
207 216 source to a flat Mercurial repository, ignoring labels, branches
208 217 and integrations. Note that when a depot path is given you then
209 218 usually should specify a target directory, because otherwise the
210 219 target may be named ...-hg.
211 220
212 221 It is possible to limit the amount of source history to be
213 222 converted by specifying an initial Perforce revision.
214 223
215 224 --config convert.p4.startrev=0 (perforce changelist number)
216 225 specify initial Perforce revision.
217 226
218 227 Mercurial Destination
219 228 ---------------------
220 229
221 230 --config convert.hg.clonebranches=False (boolean)
222 231 dispatch source branches in separate clones.
223 232 --config convert.hg.tagsbranch=default (branch name)
224 233 tag revisions branch name
225 234 --config convert.hg.usebranchnames=True (boolean)
226 235 preserve branch names
227 236
228 237 """
229 238 return convcmd.convert(ui, src, dest, revmapfile, **opts)
230 239
231 240 def debugsvnlog(ui, **opts):
232 241 return subversion.debugsvnlog(ui, **opts)
233 242
234 243 def debugcvsps(ui, *args, **opts):
235 244 '''create changeset information from CVS
236 245
237 246 This command is intended as a debugging tool for the CVS to
238 247 Mercurial converter, and can be used as a direct replacement for
239 248 cvsps.
240 249
241 250 Hg debugcvsps reads the CVS rlog for the current directory (or any
242 251 named directory) in the CVS repository, and converts the log to a
243 252 series of changesets based on matching commit log entries and
244 253 dates.'''
245 254 return cvsps.debugcvsps(ui, *args, **opts)
246 255
247 256 commands.norepo += " convert debugsvnlog debugcvsps"
248 257
249 258 cmdtable = {
250 259 "convert":
251 260 (convert,
252 261 [('A', 'authors', '', _('username mapping filename')),
253 262 ('d', 'dest-type', '', _('destination repository type')),
254 263 ('', 'filemap', '', _('remap file names using contents of file')),
255 264 ('r', 'rev', '', _('import up to target revision REV')),
256 265 ('s', 'source-type', '', _('source repository type')),
257 266 ('', 'splicemap', '', _('splice synthesized history into place')),
258 267 ('', 'branchmap', '', _('change branch names while converting')),
259 268 ('', 'branchsort', None, _('try to sort changesets by branches')),
260 269 ('', 'datesort', None, _('try to sort changesets by date')),
261 270 ('', 'sourcesort', None, _('preserve source changesets order'))],
262 271 _('hg convert [OPTION]... SOURCE [DEST [REVMAP]]')),
263 272 "debugsvnlog":
264 273 (debugsvnlog,
265 274 [],
266 275 'hg debugsvnlog'),
267 276 "debugcvsps":
268 277 (debugcvsps,
269 278 [
270 279 # Main options shared with cvsps-2.1
271 280 ('b', 'branches', [], _('only return changes on specified branches')),
272 281 ('p', 'prefix', '', _('prefix to remove from file names')),
273 282 ('r', 'revisions', [], _('only return changes after or between specified tags')),
274 283 ('u', 'update-cache', None, _("update cvs log cache")),
275 284 ('x', 'new-cache', None, _("create new cvs log cache")),
276 285 ('z', 'fuzz', 60, _('set commit time fuzz in seconds')),
277 286 ('', 'root', '', _('specify cvsroot')),
278 287 # Options specific to builtin cvsps
279 288 ('', 'parents', '', _('show parent changesets')),
280 289 ('', 'ancestors', '', _('show current changeset in ancestor branches')),
281 290 # Options that are ignored for compatibility with cvsps-2.1
282 291 ('A', 'cvs-direct', None, _('ignored for compatibility')),
283 292 ],
284 293 _('hg debugcvsps [OPTION]... [PATH]...')),
285 294 }
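
The two hooks described in the help text above receive a plain Python list and are expected to change it in place. The following is a minimal sketch of a `cvslog` hook, not code from this changeset. It assumes Mercurial's usual in-process hook calling convention (`ui`, `repo`, `hooktype`, then keyword arguments) and that the hook is registered with the usual `python:module.function` syntax. The module name `cvshooks`, the function name `fixlog`, and the author mapping are hypothetical.

```python
# Hypothetical hook module (e.g. cvshooks.py); all names are illustrative.

# Made-up mapping from CVS login names to full author strings.
AUTHOR_ALIASES = {"jdoe": "John Doe <jdoe@example.com>"}


def fixlog(ui, repo, hooktype, log=None, **kwargs):
    """cvslog hook sketch: rewrite authors and drop CVSROOT entries.

    'log' is the list of log entries gathered from CVS. It has to be
    modified in place, because the caller keeps using the same list.
    """
    # Modify entries in place.
    for entry in log:
        entry.author = AUTHOR_ALIASES.get(entry.author, entry.author)

    # Delete entries in place: slice assignment keeps the list object.
    log[:] = [e for e in log if not e.file.startswith("CVSROOT/")]

    # A false return value signals success to Mercurial's hook machinery.
    return False
```

A `cvschangesets` hook would look the same, except that it receives the list of changesets. The keyword name it is passed under is not visible in this part of the diff, so accepting `**kwargs` is the safer choice there.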
@@ -1,831 +1,836
1 1 #
2 2 # Mercurial built-in replacement for cvsps.
3 3 #
4 4 # Copyright 2008, Frank Kingswood <frank@kingswood-consulting.co.uk>
5 5 #
6 6 # This software may be used and distributed according to the terms of the
7 7 # GNU General Public License version 2, incorporated herein by reference.
8 8
9 9 import os
10 10 import re
11 11 import cPickle as pickle
12 12 from mercurial import util
13 13 from mercurial.i18n import _
14 from mercurial import hook
14 15
15 16 class logentry(object):
16 17 '''Class logentry has the following attributes:
17 18 .author - author name as CVS knows it
18 19 .branch - name of branch this revision is on
19 20 .branches - revision tuple of branches starting at this revision
20 21 .comment - commit message
21 22 .date - the commit date as a (time, tz) tuple
22 23 .dead - true if file revision is dead
23 24 .file - Name of file
24 25 .lines - a tuple (+lines, -lines) or None
25 26 .parent - Previous revision of this entry
26 27 .rcs - name of file as returned from CVS
27 28 .revision - revision number as tuple
28 29 .tags - list of tags on the file
29 30 .synthetic - is this a synthetic "file ... added on ..." revision?
30 31 .mergepoint- the branch that has been merged from
31 32 (if present in rlog output)
32 33 .branchpoints- the branches that start at the current entry
33 34 '''
34 35 def __init__(self, **entries):
35 36 self.__dict__.update(entries)
36 37
37 38 def __repr__(self):
38 39 return "<%s at 0x%x: %s %s>" % (self.__class__.__name__,
39 40 id(self),
40 41 self.file,
41 42 ".".join(map(str, self.revision)))
42 43
43 44 class logerror(Exception):
44 45 pass
45 46
46 47 def getrepopath(cvspath):
47 48 """Return the repository path from a CVS path.
48 49
49 50 >>> getrepopath('/foo/bar')
50 51 '/foo/bar'
51 52 >>> getrepopath('c:/foo/bar')
52 53 'c:/foo/bar'
53 54 >>> getrepopath(':pserver:10/foo/bar')
54 55 '/foo/bar'
55 56 >>> getrepopath(':pserver:10c:/foo/bar')
56 57 '/foo/bar'
57 58 >>> getrepopath(':pserver:/foo/bar')
58 59 '/foo/bar'
59 60 >>> getrepopath(':pserver:c:/foo/bar')
60 61 'c:/foo/bar'
61 62 >>> getrepopath(':pserver:truc@foo.bar:/foo/bar')
62 63 '/foo/bar'
63 64 >>> getrepopath(':pserver:truc@foo.bar:c:/foo/bar')
64 65 'c:/foo/bar'
65 66 """
66 67 # According to CVS manual, CVS paths are expressed like:
67 68 # [:method:][[user][:password]@]hostname[:[port]]/path/to/repository
68 69 #
69 70 # Unfortunately, Windows absolute paths start with a drive letter
70 71 # like 'c:' making it harder to parse. Here we assume that drive
71 72 # letters are only one character long and any CVS component before
72 73 # the repository path is at least 2 characters long, and use this
73 74 # to disambiguate.
74 75 parts = cvspath.split(':')
75 76 if len(parts) == 1:
76 77 return parts[0]
77 78 # Here there is an ambiguous case if we have a port number
78 79 # immediately followed by a Windows drive letter. We assume this
79 80 # never happens and decide it must be CVS path component,
80 81 # therefore ignoring it.
81 82 if len(parts[-2]) > 1:
82 83 return parts[-1].lstrip('0123456789')
83 84 return parts[-2] + ':' + parts[-1]
84 85
85 86 def createlog(ui, directory=None, root="", rlog=True, cache=None):
86 87 '''Collect the CVS rlog'''
87 88
88 89 # Because we store many duplicate commit log messages, reusing strings
89 90 # saves a lot of memory and pickle storage space.
90 91 _scache = {}
91 92 def scache(s):
92 93 "return a shared version of a string"
93 94 return _scache.setdefault(s, s)
94 95
95 96 ui.status(_('collecting CVS rlog\n'))
96 97
97 98 log = [] # list of logentry objects containing the CVS state
98 99
99 100 # patterns to match in CVS (r)log output, by state of use
100 101 re_00 = re.compile('RCS file: (.+)$')
101 102 re_01 = re.compile('cvs \\[r?log aborted\\]: (.+)$')
102 103 re_02 = re.compile('cvs (r?log|server): (.+)\n$')
103 104 re_03 = re.compile("(Cannot access.+CVSROOT)|"
104 105 "(can't create temporary directory.+)$")
105 106 re_10 = re.compile('Working file: (.+)$')
106 107 re_20 = re.compile('symbolic names:')
107 108 re_30 = re.compile('\t(.+): ([\\d.]+)$')
108 109 re_31 = re.compile('----------------------------$')
109 110 re_32 = re.compile('======================================='
110 111 '======================================$')
111 112 re_50 = re.compile('revision ([\\d.]+)(\s+locked by:\s+.+;)?$')
112 113 re_60 = re.compile(r'date:\s+(.+);\s+author:\s+(.+);\s+state:\s+(.+?);'
113 114 r'(\s+lines:\s+(\+\d+)?\s+(-\d+)?;)?'
114 115 r'(.*mergepoint:\s+([^;]+);)?')
115 116 re_70 = re.compile('branches: (.+);$')
116 117
117 118 file_added_re = re.compile(r'file [^/]+ was (initially )?added on branch')
118 119
119 120 prefix = '' # leading path to strip of what we get from CVS
120 121
121 122 if directory is None:
122 123 # Current working directory
123 124
124 125 # Get the real directory in the repository
125 126 try:
126 127 prefix = open(os.path.join('CVS','Repository')).read().strip()
127 128 if prefix == ".":
128 129 prefix = ""
129 130 directory = prefix
130 131 except IOError:
131 132 raise logerror('Not a CVS sandbox')
132 133
133 134 if prefix and not prefix.endswith(os.sep):
134 135 prefix += os.sep
135 136
136 137 # Use the Root file in the sandbox, if it exists
137 138 try:
138 139 root = open(os.path.join('CVS','Root')).read().strip()
139 140 except IOError:
140 141 pass
141 142
142 143 if not root:
143 144 root = os.environ.get('CVSROOT', '')
144 145
145 146 # read log cache if one exists
146 147 oldlog = []
147 148 date = None
148 149
149 150 if cache:
150 151 cachedir = os.path.expanduser('~/.hg.cvsps')
151 152 if not os.path.exists(cachedir):
152 153 os.mkdir(cachedir)
153 154
154 155 # The cvsps cache pickle needs a uniquified name, based on the
155 156 # repository location. The address may have all sorts of nasties
156 157 # in it, slashes, colons and such. So here we take just the
157 158 # alphanumerics, concatenated in a way that does not mix up the
158 159 # various components, so that
159 160 # :pserver:user@server:/path
160 161 # and
161 162 # /pserver/user/server/path
162 163 # are mapped to different cache file names.
163 164 cachefile = root.split(":") + [directory, "cache"]
164 165 cachefile = ['-'.join(re.findall(r'\w+', s)) for s in cachefile if s]
165 166 cachefile = os.path.join(cachedir,
166 167 '.'.join([s for s in cachefile if s]))
167 168
168 169 if cache == 'update':
169 170 try:
170 171 ui.note(_('reading cvs log cache %s\n') % cachefile)
171 172 oldlog = pickle.load(open(cachefile))
172 173 ui.note(_('cache has %d log entries\n') % len(oldlog))
173 174 except Exception, e:
174 175 ui.note(_('error reading cache: %r\n') % e)
175 176
176 177 if oldlog:
177 178 date = oldlog[-1].date # last commit date as a (time,tz) tuple
178 179 date = util.datestr(date, '%Y/%m/%d %H:%M:%S %1%2')
179 180
180 181 # build the CVS commandline
181 182 cmd = ['cvs', '-q']
182 183 if root:
183 184 cmd.append('-d%s' % root)
184 185 p = util.normpath(getrepopath(root))
185 186 if not p.endswith('/'):
186 187 p += '/'
187 188 prefix = p + util.normpath(prefix)
188 189 cmd.append(['log', 'rlog'][rlog])
189 190 if date:
190 191 # no space between option and date string
191 192 cmd.append('-d>%s' % date)
192 193 cmd.append(directory)
193 194
194 195 # state machine begins here
195 196 tags = {} # dictionary of revisions on current file with their tags
196 197 branchmap = {} # mapping between branch names and revision numbers
197 198 state = 0
198 199 store = False # set when a new record can be appended
199 200
200 201 cmd = [util.shellquote(arg) for arg in cmd]
201 202 ui.note(_("running %s\n") % (' '.join(cmd)))
202 203 ui.debug("prefix=%r directory=%r root=%r\n" % (prefix, directory, root))
203 204
204 205 pfp = util.popen(' '.join(cmd))
205 206 peek = pfp.readline()
206 207 while True:
207 208 line = peek
208 209 if line == '':
209 210 break
210 211 peek = pfp.readline()
211 212 if line.endswith('\n'):
212 213 line = line[:-1]
213 214 #ui.debug('state=%d line=%r\n' % (state, line))
214 215
215 216 if state == 0:
216 217 # initial state, consume input until we see 'RCS file'
217 218 match = re_00.match(line)
218 219 if match:
219 220 rcs = match.group(1)
220 221 tags = {}
221 222 if rlog:
222 223 filename = util.normpath(rcs[:-2])
223 224 if filename.startswith(prefix):
224 225 filename = filename[len(prefix):]
225 226 if filename.startswith('/'):
226 227 filename = filename[1:]
227 228 if filename.startswith('Attic/'):
228 229 filename = filename[6:]
229 230 else:
230 231 filename = filename.replace('/Attic/', '/')
231 232 state = 2
232 233 continue
233 234 state = 1
234 235 continue
235 236 match = re_01.match(line)
236 237 if match:
237 238 raise Exception(match.group(1))
238 239 match = re_02.match(line)
239 240 if match:
240 241 raise Exception(match.group(2))
241 242 if re_03.match(line):
242 243 raise Exception(line)
243 244
244 245 elif state == 1:
245 246 # expect 'Working file' (only when using log instead of rlog)
246 247 match = re_10.match(line)
247 248 assert match, _('RCS file must be followed by working file')
248 249 filename = util.normpath(match.group(1))
249 250 state = 2
250 251
251 252 elif state == 2:
252 253 # expect 'symbolic names'
253 254 if re_20.match(line):
254 255 branchmap = {}
255 256 state = 3
256 257
257 258 elif state == 3:
258 259 # read the symbolic names and store as tags
259 260 match = re_30.match(line)
260 261 if match:
261 262 rev = [int(x) for x in match.group(2).split('.')]
262 263
263 264 # Convert magic branch number to an odd-numbered one
264 265 revn = len(rev)
265 266 if revn > 3 and (revn % 2) == 0 and rev[-2] == 0:
266 267 rev = rev[:-2] + rev[-1:]
267 268 rev = tuple(rev)
268 269
269 270 if rev not in tags:
270 271 tags[rev] = []
271 272 tags[rev].append(match.group(1))
272 273 branchmap[match.group(1)] = match.group(2)
273 274
274 275 elif re_31.match(line):
275 276 state = 5
276 277 elif re_32.match(line):
277 278 state = 0
278 279
279 280 elif state == 4:
280 281 # expecting '------' separator before first revision
281 282 if re_31.match(line):
282 283 state = 5
283 284 else:
284 285 assert not re_32.match(line), _('must have at least '
285 286 'some revisions')
286 287
287 288 elif state == 5:
288 289 # expecting revision number and possibly (ignored) lock indication
289 290 # we create the logentry here from values stored in states 0 to 4,
290 291 # as this state is re-entered for subsequent revisions of a file.
291 292 match = re_50.match(line)
292 293 assert match, _('expected revision number')
293 294 e = logentry(rcs=scache(rcs), file=scache(filename),
294 295 revision=tuple([int(x) for x in match.group(1).split('.')]),
295 296 branches=[], parent=None,
296 297 synthetic=False)
297 298 state = 6
298 299
299 300 elif state == 6:
300 301 # expecting date, author, state, lines changed
301 302 match = re_60.match(line)
302 303 assert match, _('revision must be followed by date line')
303 304 d = match.group(1)
304 305 if d[2] == '/':
305 306 # Y2K
306 307 d = '19' + d
307 308
308 309 if len(d.split()) != 3:
309 310 # cvs log dates always in GMT
310 311 d = d + ' UTC'
311 312 e.date = util.parsedate(d, ['%y/%m/%d %H:%M:%S',
312 313 '%Y/%m/%d %H:%M:%S',
313 314 '%Y-%m-%d %H:%M:%S'])
314 315 e.author = scache(match.group(2))
315 316 e.dead = match.group(3).lower() == 'dead'
316 317
317 318 if match.group(5):
318 319 if match.group(6):
319 320 e.lines = (int(match.group(5)), int(match.group(6)))
320 321 else:
321 322 e.lines = (int(match.group(5)), 0)
322 323 elif match.group(6):
323 324 e.lines = (0, int(match.group(6)))
324 325 else:
325 326 e.lines = None
326 327
327 328 if match.group(7): # cvsnt mergepoint
328 329 myrev = match.group(8).split('.')
329 330 if len(myrev) == 2: # head
330 331 e.mergepoint = 'HEAD'
331 332 else:
332 333 myrev = '.'.join(myrev[:-2] + ['0', myrev[-2]])
333 334 branches = [b for b in branchmap if branchmap[b] == myrev]
334 335 assert len(branches) == 1, 'unknown branch: %s' % e.mergepoint
335 336 e.mergepoint = branches[0]
336 337 else:
337 338 e.mergepoint = None
338 339 e.comment = []
339 340 state = 7
340 341
341 342 elif state == 7:
342 343 # read the revision numbers of branches that start at this revision
343 344 # or store the commit log message otherwise
344 345 m = re_70.match(line)
345 346 if m:
346 347 e.branches = [tuple([int(y) for y in x.strip().split('.')])
347 348 for x in m.group(1).split(';')]
348 349 state = 8
349 350 elif re_31.match(line) and re_50.match(peek):
350 351 state = 5
351 352 store = True
352 353 elif re_32.match(line):
353 354 state = 0
354 355 store = True
355 356 else:
356 357 e.comment.append(line)
357 358
358 359 elif state == 8:
359 360 # store commit log message
360 361 if re_31.match(line):
361 362 state = 5
362 363 store = True
363 364 elif re_32.match(line):
364 365 state = 0
365 366 store = True
366 367 else:
367 368 e.comment.append(line)
368 369
369 370 # When a file is added on a branch B1, CVS creates a synthetic
370 371 # dead trunk revision 1.1 so that the branch has a root.
371 372 # Likewise, if you merge such a file to a later branch B2 (one
372 373 # that already existed when the file was added on B1), CVS
373 374 # creates a synthetic dead revision 1.1.x.1 on B2. Don't drop
374 375 # these revisions now, but mark them synthetic so
375 376 # createchangeset() can take care of them.
376 377 if (store and
377 378 e.dead and
378 379 e.revision[-1] == 1 and # 1.1 or 1.1.x.1
379 380 len(e.comment) == 1 and
380 381 file_added_re.match(e.comment[0])):
381 382 ui.debug('found synthetic revision in %s: %r\n'
382 383 % (e.rcs, e.comment[0]))
383 384 e.synthetic = True
384 385
385 386 if store:
386 387 # clean up the results and save in the log.
387 388 store = False
388 389 e.tags = sorted([scache(x) for x in tags.get(e.revision, [])])
389 390 e.comment = scache('\n'.join(e.comment))
390 391
391 392 revn = len(e.revision)
392 393 if revn > 3 and (revn % 2) == 0:
393 394 e.branch = tags.get(e.revision[:-1], [None])[0]
394 395 else:
395 396 e.branch = None
396 397
397 398 # find the branches starting from this revision
398 399 branchpoints = set()
399 400 for branch, revision in branchmap.iteritems():
400 401 revparts = tuple([int(i) for i in revision.split('.')])
401 402 if revparts[-2] == 0 and revparts[-1] % 2 == 0:
402 403 # normal branch
403 404 if revparts[:-2] == e.revision:
404 405 branchpoints.add(branch)
405 406 elif revparts == (1,1,1): # vendor branch
406 407 if revparts in e.branches:
407 408 branchpoints.add(branch)
408 409 e.branchpoints = branchpoints
409 410
410 411 log.append(e)
411 412
412 413 if len(log) % 100 == 0:
413 414 ui.status(util.ellipsis('%d %s' % (len(log), e.file), 80)+'\n')
414 415
415 416 log.sort(key=lambda x: (x.rcs, x.revision))
416 417
417 418 # find parent revisions of individual files
418 419 versions = {}
419 420 for e in log:
420 421 branch = e.revision[:-1]
421 422 p = versions.get((e.rcs, branch), None)
422 423 if p is None:
423 424 p = e.revision[:-2]
424 425 e.parent = p
425 426 versions[(e.rcs, branch)] = e.revision
426 427
427 428 # update the log cache
428 429 if cache:
429 430 if log:
430 431 # join up the old and new logs
431 432 log.sort(key=lambda x: x.date)
432 433
433 434 if oldlog and oldlog[-1].date >= log[0].date:
434 435 raise logerror('Log cache overlaps with new log entries,'
435 436 ' re-run without cache.')
436 437
437 438 log = oldlog + log
438 439
439 440 # write the new cachefile
440 441 ui.note(_('writing cvs log cache %s\n') % cachefile)
441 442 pickle.dump(log, open(cachefile, 'w'))
442 443 else:
443 444 log = oldlog
444 445
445 446 ui.status(_('%d log entries\n') % len(log))
446 447
448 hook.hook(ui, None, "cvslog", True, log=log)
449
447 450 return log
448 451
449 452
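
Two small steps inside `createlog()` are easier to follow in isolation: undoing the "magic" branch number that CVS stores for symbolic names, and flattening the CVS root into a cache file name. The sketch below restates both as standalone functions. The function names are hypothetical and do not exist in `cvsps.py`; the logic mirrors the corresponding lines of the function above.

```python
import re


def normalize_symbol_revision(revstr):
    """Convert a symbolic-name revision string to a tuple.

    CVS records a branch tag as e.g. 1.2.0.4; revisions on that branch
    are numbered 1.2.4.x, so the 0 is dropped to get the branch prefix.
    """
    rev = [int(x) for x in revstr.split(".")]
    if len(rev) > 3 and len(rev) % 2 == 0 and rev[-2] == 0:
        rev = rev[:-2] + rev[-1:]
    return tuple(rev)


def cache_file_name(root, directory):
    """Build a flat cache file name from a CVS root and a directory.

    Components are separated by '.', and the alphanumeric runs inside a
    component are joined by '-', so different roots cannot collide.
    """
    parts = root.split(":") + [directory, "cache"]
    parts = ["-".join(re.findall(r"\w+", s)) for s in parts if s]
    return ".".join(s for s in parts if s)


print(normalize_symbol_revision("1.2.0.4"))
# (1, 2, 4)
print(cache_file_name(":pserver:user@server:/path", "module"))
# pserver.user-server.path.module.cache
print(cache_file_name("/pserver/user/server/path", "module"))
# pserver-user-server-path.module.cache
```

The last two calls show the property the source comment asks for: a `:pserver:` root and a local path with the same words end up in different cache files.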
450 453 class changeset(object):
451 454 '''Class changeset has the following attributes:
452 455 .id - integer identifying this changeset (list index)
453 456 .author - author name as CVS knows it
454 457 .branch - name of branch this changeset is on, or None
455 458 .comment - commit message
456 459 .date - the commit date as a (time,tz) tuple
457 460 .entries - list of logentry objects in this changeset
458 461 .parents - list of one or two parent changesets
459 462 .tags - list of tags on this changeset
460 463 .synthetic - from synthetic revision "file ... added on branch ..."
461 464 .mergepoint- the branch that has been merged from
462 465 (if present in rlog output)
463 466 .branchpoints- the branches that start at the current entry
464 467 '''
465 468 def __init__(self, **entries):
466 469 self.__dict__.update(entries)
467 470
468 471 def __repr__(self):
469 472 return "<%s at 0x%x: %s>" % (self.__class__.__name__,
470 473 id(self),
471 474 getattr(self, 'id', "(no id)"))
472 475
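
The grouping rule that `createchangeset()` applies (same comment, author and branch, commit times within `fuzz` seconds of the previous entry, and no file appearing twice) can be sketched on its own. This is a simplified illustration, not the converter's code: `Entry` is a hypothetical stand-in for `logentry`, dates are plain timestamps rather than `(time, tz)` tuples, and the branchpoint comparison is left out.

```python
from collections import namedtuple

# Hypothetical stand-in for logentry with only the fields used here.
Entry = namedtuple("Entry", "comment author branch date file")


def group_entries(entries, fuzz=60):
    """Group per-file log entries into changeset-like lists."""
    # Sort so that entries that may belong together are adjacent.
    ordered = sorted(
        entries,
        key=lambda e: (e.comment, e.author, e.branch or "", e.date))

    groups = []
    files = set()   # files already present in the current group
    last = None     # date of the most recent entry in the current group

    for e in ordered:
        head = groups[-1][0] if groups else None
        belongs = (
            head is not None
            and e.comment == head.comment
            and e.author == head.author
            and e.branch == head.branch
            and last <= e.date <= last + fuzz
            and e.file not in files
        )
        if not belongs:
            groups.append([])
            files = set()
        groups[-1].append(e)
        files.add(e.file)
        last = e.date   # the window slides with the latest entry
    return groups
```

Because the window is measured from the latest entry rather than the first, a long commit whose files are each checked in less than `fuzz` seconds apart stays in one group even if it spans more than `fuzz` seconds overall.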
473 476 def createchangeset(ui, log, fuzz=60, mergefrom=None, mergeto=None):
474 477 '''Convert log into changesets.'''
475 478
476 479 ui.status(_('creating changesets\n'))
477 480
478 481 # Merge changesets
479 482
480 483 log.sort(key=lambda x: (x.comment, x.author, x.branch, x.date))
481 484
482 485 changesets = []
483 486 files = set()
484 487 c = None
485 488 for i, e in enumerate(log):
486 489
487 490 # Check if log entry belongs to the current changeset or not.
488 491
489 492 # Since CVS is file centric, two different file revisions with
490 493 # different branchpoints should be treated as belonging to two
491 494 # different changesets (and the ordering is important and not
492 495 # honoured by cvsps at this point).
493 496 #
494 497 # Consider the following case:
495 498 # foo 1.1 branchpoints: [MYBRANCH]
496 499 # bar 1.1 branchpoints: [MYBRANCH, MYBRANCH2]
497 500 #
498 501 # Here foo is part only of MYBRANCH, but not MYBRANCH2, e.g. a
499 502 # later version of foo may be in MYBRANCH2, so foo should be the
500 503 # first changeset and bar the next and MYBRANCH and MYBRANCH2
501 504 # should both start off of the bar changeset. No provisions are
502 505 # made to ensure that this is, in fact, what happens.
503 506 if not (c and
504 507 e.comment == c.comment and
505 508 e.author == c.author and
506 509 e.branch == c.branch and
507 510 (not hasattr(e, 'branchpoints') or
508 511 not hasattr (c, 'branchpoints') or
509 512 e.branchpoints == c.branchpoints) and
510 513 ((c.date[0] + c.date[1]) <=
511 514 (e.date[0] + e.date[1]) <=
512 515 (c.date[0] + c.date[1]) + fuzz) and
513 516 e.file not in files):
514 517 c = changeset(comment=e.comment, author=e.author,
515 518 branch=e.branch, date=e.date, entries=[],
516 519 mergepoint=getattr(e, 'mergepoint', None),
517 520 branchpoints=getattr(e, 'branchpoints', set()))
518 521 changesets.append(c)
519 522 files = set()
520 523 if len(changesets) % 100 == 0:
521 524 t = '%d %s' % (len(changesets), repr(e.comment)[1:-1])
522 525 ui.status(util.ellipsis(t, 80) + '\n')
523 526
524 527 c.entries.append(e)
525 528 files.add(e.file)
526 529 c.date = e.date # changeset date is date of latest commit in it
527 530
528 531 # Mark synthetic changesets
529 532
530 533 for c in changesets:
531 534 # Synthetic revisions always get their own changeset, because
532 535 # the log message includes the filename. E.g. if you add file3
533 536 # and file4 on a branch, you get four log entries and three
534 537 # changesets:
535 538 # "File file3 was added on branch ..." (synthetic, 1 entry)
536 539 # "File file4 was added on branch ..." (synthetic, 1 entry)
537 540 # "Add file3 and file4 to fix ..." (real, 2 entries)
538 541 # Hence the check for 1 entry here.
539 542 synth = getattr(c.entries[0], 'synthetic', None)
540 543 c.synthetic = (len(c.entries) == 1 and synth)
541 544
542 545 # Sort files in each changeset
543 546
544 547 for c in changesets:
545 548 def pathcompare(l, r):
546 549 'Mimic cvsps sorting order'
547 550 l = l.split('/')
548 551 r = r.split('/')
549 552 nl = len(l)
550 553 nr = len(r)
551 554 n = min(nl, nr)
552 555 for i in range(n):
553 556 if i + 1 == nl and nl < nr:
554 557 return -1
555 558 elif i + 1 == nr and nl > nr:
556 559 return +1
557 560 elif l[i] < r[i]:
558 561 return -1
559 562 elif l[i] > r[i]:
560 563 return +1
561 564 return 0
562 565 def entitycompare(l, r):
563 566 return pathcompare(l.file, r.file)
564 567
565 568 c.entries.sort(entitycompare)
566 569
567 570 # Sort changesets by date
568 571
569 572 def cscmp(l, r):
570 573 d = sum(l.date) - sum(r.date)
571 574 if d:
572 575 return d
573 576
574 577 # detect vendor branches and initial commits on a branch
575 578 le = {}
576 579 for e in l.entries:
577 580 le[e.rcs] = e.revision
578 581 re = {}
579 582 for e in r.entries:
580 583 re[e.rcs] = e.revision
581 584
582 585 d = 0
583 586 for e in l.entries:
584 587 if re.get(e.rcs, None) == e.parent:
585 588 assert not d
586 589 d = 1
587 590 break
588 591
589 592 for e in r.entries:
590 593 if le.get(e.rcs, None) == e.parent:
591 594 assert not d
592 595 d = -1
593 596 break
594 597
595 598 return d
596 599
597 600 changesets.sort(cscmp)
598 601
599 602 # Collect tags
600 603
601 604 globaltags = {}
602 605 for c in changesets:
603 606 for e in c.entries:
604 607 for tag in e.tags:
605 608 # remember which is the latest changeset to have this tag
606 609 globaltags[tag] = c
607 610
608 611 for c in changesets:
609 612 tags = set()
610 613 for e in c.entries:
611 614 tags.update(e.tags)
612 615 # remember tags only if this is the latest changeset to have it
613 616 c.tags = sorted(tag for tag in tags if globaltags[tag] is c)
614 617
615 618 # Find parent changesets, handle {{mergetobranch BRANCHNAME}}
616 619 # by inserting dummy changesets with two parents, and handle
617 620 # {{mergefrombranch BRANCHNAME}} by setting two parents.
618 621
619 622 if mergeto is None:
620 623 mergeto = r'{{mergetobranch ([-\w]+)}}'
621 624 if mergeto:
622 625 mergeto = re.compile(mergeto)
623 626
624 627 if mergefrom is None:
625 628 mergefrom = r'{{mergefrombranch ([-\w]+)}}'
626 629 if mergefrom:
627 630 mergefrom = re.compile(mergefrom)
628 631
629 632 versions = {} # changeset index where we saw any particular file version
630 633 branches = {} # changeset index where we saw a branch
631 634 n = len(changesets)
632 635 i = 0
633 636 while i<n:
634 637 c = changesets[i]
635 638
636 639 for f in c.entries:
637 640 versions[(f.rcs, f.revision)] = i
638 641
639 642 p = None
640 643 if c.branch in branches:
641 644 p = branches[c.branch]
642 645 else:
643 646 # first changeset on a new branch
644 647 # the parent is a changeset with the branch in its
645 648 # branchpoints such that it is the latest possible
646 649 # commit without any intervening, unrelated commits.
647 650
648 651 for candidate in xrange(i):
649 652 if c.branch not in changesets[candidate].branchpoints:
650 653 if p is not None:
651 654 break
652 655 continue
653 656 p = candidate
654 657
655 658 c.parents = []
656 659 if p is not None:
657 660 p = changesets[p]
658 661
659 662 # Ensure no changeset has a synthetic changeset as a parent.
660 663 while p.synthetic:
661 664 assert len(p.parents) <= 1, \
662 665 _('synthetic changeset cannot have multiple parents')
663 666 if p.parents:
664 667 p = p.parents[0]
665 668 else:
666 669 p = None
667 670 break
668 671
669 672 if p is not None:
670 673 c.parents.append(p)
671 674
672 675 if c.mergepoint:
673 676 if c.mergepoint == 'HEAD':
674 677 c.mergepoint = None
675 678 c.parents.append(changesets[branches[c.mergepoint]])
676 679
677 680 if mergefrom:
678 681 m = mergefrom.search(c.comment)
679 682 if m:
680 683 m = m.group(1)
681 684 if m == 'HEAD':
682 685 m = None
683 686 try:
684 687 candidate = changesets[branches[m]]
685 688 except KeyError:
686 689 ui.warn(_("warning: CVS commit message references "
687 690 "non-existent branch %r:\n%s\n")
688 691 % (m, c.comment))
689 692 if m in branches and c.branch != m and not candidate.synthetic:
690 693 c.parents.append(candidate)
691 694
692 695 if mergeto:
693 696 m = mergeto.search(c.comment)
694 697 if m:
695 698 try:
696 699 m = m.group(1)
697 700 if m == 'HEAD':
698 701 m = None
699 702 except:
700 703 m = None # if no group found then merge to HEAD
701 704 if m in branches and c.branch != m:
702 705 # insert empty changeset for merge
703 706 cc = changeset(author=c.author, branch=m, date=c.date,
704 707 comment='convert-repo: CVS merge from branch %s' % c.branch,
705 708 entries=[], tags=[], parents=[changesets[branches[m]], c])
706 709 changesets.insert(i + 1, cc)
707 710 branches[m] = i + 1
708 711
709 712 # adjust our loop counters now we have inserted a new entry
710 713 n += 1
711 714 i += 2
712 715 continue
713 716
714 717 branches[c.branch] = i
715 718 i += 1
716 719
717 720 # Drop synthetic changesets (safe now that we have ensured no other
718 721 # changesets can have them as parents).
719 722 i = 0
720 723 while i < len(changesets):
721 724 if changesets[i].synthetic:
722 725 del changesets[i]
723 726 else:
724 727 i += 1
725 728
726 729 # Number changesets
727 730
728 731 for i, c in enumerate(changesets):
729 732 c.id = i + 1
730 733
731 734 ui.status(_('%d changeset entries\n') % len(changesets))
732 735
736 hook.hook(ui, None, "cvschangesets", True, changesets=changesets)
737
733 738 return changesets
734 739
735 740
736 741 def debugcvsps(ui, *args, **opts):
737 742 '''Read CVS rlog for current directory or named path in
738 743 repository, and convert the log to changesets based on matching
739 744 commit log entries and dates.
740 745 '''
741 746 if opts["new_cache"]:
742 747 cache = "write"
743 748 elif opts["update_cache"]:
744 749 cache = "update"
745 750 else:
746 751 cache = None
747 752
748 753 revisions = opts["revisions"]
749 754
750 755 try:
751 756 if args:
752 757 log = []
753 758 for d in args:
754 759 log += createlog(ui, d, root=opts["root"], cache=cache)
755 760 else:
756 761 log = createlog(ui, root=opts["root"], cache=cache)
757 762 except logerror, e:
758 763 ui.write("%r\n"%e)
759 764 return
760 765
761 766 changesets = createchangeset(ui, log, opts["fuzz"])
762 767 del log
763 768
764 769 # Print changesets (optionally filtered)
765 770
766 771 off = len(revisions)
767 772 branches = {} # latest version number in each branch
768 773 ancestors = {} # parent branch
769 774 for cs in changesets:
770 775
771 776 if opts["ancestors"]:
772 777 if cs.branch not in branches and cs.parents and cs.parents[0].id:
773 778 ancestors[cs.branch] = (changesets[cs.parents[0].id-1].branch,
774 779 cs.parents[0].id)
775 780 branches[cs.branch] = cs.id
776 781
777 782 # limit by branches
778 783 if opts["branches"] and (cs.branch or 'HEAD') not in opts["branches"]:
779 784 continue
780 785
781 786 if not off:
782 787 # Note: trailing spaces on several lines here are needed to have
783 788 # bug-for-bug compatibility with cvsps.
784 789 ui.write('---------------------\n')
785 790 ui.write('PatchSet %d \n' % cs.id)
786 791 ui.write('Date: %s\n' % util.datestr(cs.date,
787 792 '%Y/%m/%d %H:%M:%S %1%2'))
788 793 ui.write('Author: %s\n' % cs.author)
789 794 ui.write('Branch: %s\n' % (cs.branch or 'HEAD'))
790 795 ui.write('Tag%s: %s \n' % (['', 's'][len(cs.tags)>1],
791 796 ','.join(cs.tags) or '(none)'))
792 797 branchpoints = getattr(cs, 'branchpoints', None)
793 798 if branchpoints:
794 799 ui.write('Branchpoints: %s \n' % ', '.join(branchpoints))
795 800 if opts["parents"] and cs.parents:
796 801 if len(cs.parents)>1:
797 802 ui.write('Parents: %s\n' % (','.join([str(p.id) for p in cs.parents])))
798 803 else:
799 804 ui.write('Parent: %d\n' % cs.parents[0].id)
800 805
801 806 if opts["ancestors"]:
802 807 b = cs.branch
803 808 r = []
804 809 while b:
805 810 b, c = ancestors[b]
806 811 r.append('%s:%d:%d' % (b or "HEAD", c, branches[b]))
807 812 if r:
808 813 ui.write('Ancestors: %s\n' % (','.join(r)))
809 814
810 815 ui.write('Log:\n')
811 816 ui.write('%s\n\n' % cs.comment)
812 817 ui.write('Members: \n')
813 818 for f in cs.entries:
814 819 fn = f.file
815 820 if fn.startswith(opts["prefix"]):
816 821 fn = fn[len(opts["prefix"]):]
817 822 ui.write('\t%s:%s->%s%s \n' % (fn, '.'.join([str(x) for x in f.parent]) or 'INITIAL',
818 823 '.'.join([str(x) for x in f.revision]), ['', '(DEAD)'][f.dead]))
819 824 ui.write('\n')
820 825
821 826 # have we seen the start tag?
822 827 if revisions and off:
823 828 if revisions[0] == str(cs.id) or \
824 829 revisions[0] in cs.tags:
825 830 off = False
826 831
827 832 # see if we reached the end tag
828 833 if len(revisions)>1 and not off:
829 834 if revisions[1] == str(cs.id) or \
830 835 revisions[1] in cs.tags:
831 836 break
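The grouping rule that createchangeset applies in the hunk above can be sketched in isolation. This is a simplified illustration under stated assumptions, not the code from cvsps.py: the Entry tuple and the group_entries function are hypothetical names, dates are plain integer seconds rather than (time, offset) pairs, and the branchpoints comparison is left out.

```python
from collections import namedtuple

# Hypothetical, simplified log entry; the real entries carry more fields.
Entry = namedtuple("Entry", "comment author branch date file")


def group_entries(entries, fuzz=60):
    """Group log entries into changesets using a sliding time window."""
    # Sort so that entries which may belong together become adjacent.
    ordered = sorted(entries,
                     key=lambda e: (e.comment, e.author, e.branch, e.date))
    groups = []
    current = None    # entries of the changeset being built
    last_date = None  # date of the entry most recently added to it
    files = set()     # files already present in the current changeset
    for e in ordered:
        same = (
            current is not None
            and e.comment == current[0].comment
            and e.author == current[0].author
            and e.branch == current[0].branch
            # the window slides: it is measured from the latest entry
            and last_date <= e.date <= last_date + fuzz
            # a file may appear only once per changeset
            and e.file not in files
        )
        if not same:
            current = []
            groups.append(current)
            files = set()
        current.append(e)
        files.add(e.file)
        last_date = e.date
    return groups
```

Because the window is measured from the most recent entry rather than from the first one, a long-running commit whose files arrive less than fuzz seconds apart still ends up in a single group.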
@@ -1,120 +1,133
1 1 #!/bin/sh
2 2
3 3 "$TESTDIR/hghave" cvs || exit 80
4 4
5 5 cvscall()
6 6 {
7 7 cvs -f "$@"
8 8 }
9 9
10 10 hgcat()
11 11 {
12 12 hg --cwd src-hg cat -r tip "$1"
13 13 }
14 14
15 15 echo "[extensions]" >> $HGRCPATH
16 16 echo "convert = " >> $HGRCPATH
17 17 echo "graphlog = " >> $HGRCPATH
18 18
19 cat > cvshooks.py <<EOF
20 def cvslog(ui,repo,hooktype,log):
21 print "%s hook: %d entries"%(hooktype,len(log))
22
23 def cvschangesets(ui,repo,hooktype,changesets):
24 print "%s hook: %d changesets"%(hooktype,len(changesets))
25 EOF
26 hookpath=$PWD
27
28 echo "[hooks]" >> $HGRCPATH
29 echo "cvslog=python:$hookpath/cvshooks.py:cvslog" >> $HGRCPATH
30 echo "cvschangesets=python:$hookpath/cvshooks.py:cvschangesets" >> $HGRCPATH
31
19 32 echo % create cvs repository
20 33 mkdir cvsrepo
21 34 cd cvsrepo
22 CVSROOT=`pwd`
35 CVSROOT=$PWD
23 36 export CVSROOT
24 37 CVS_OPTIONS=-f
25 38 export CVS_OPTIONS
26 39 cd ..
27 40
28 41 cvscall -q -d "$CVSROOT" init
29 42
30 43 echo % create source directory
31 44 mkdir src-temp
32 45 cd src-temp
33 46 echo a > a
34 47 mkdir b
35 48 cd b
36 49 echo c > c
37 50 cd ..
38 51
39 52 echo % import source directory
40 53 cvscall -q import -m import src INITIAL start
41 54 cd ..
42 55
43 56 echo % checkout source directory
44 57 cvscall -q checkout src
45 58
46 59 echo % commit a new revision changing b/c
47 60 cd src
48 61 sleep 1
49 62 echo c >> b/c
50 63 cvscall -q commit -mci0 . | grep '<--' |\
51 64 sed -e 's:.*src/\(.*\),v.*:checking in src/\1,v:g'
52 65 cd ..
53 66
54 67 echo % convert fresh repo
55 68 hg convert src src-hg | sed -e 's/connecting to.*cvsrepo/connecting to cvsrepo/g'
56 69 hgcat a
57 70 hgcat b/c
58 71
59 72 echo % convert fresh repo with --filemap
60 73 echo include b/c > filemap
61 74 hg convert --filemap filemap src src-filemap | sed -e 's/connecting to.*cvsrepo/connecting to cvsrepo/g'
62 75 hgcat b/c
63 76 hg -R src-filemap log --template '{rev} {desc} files: {files}\n'
64 77
65 78 echo % commit new file revisions
66 79 cd src
67 80 echo a >> a
68 81 echo c >> b/c
69 82 cvscall -q commit -mci1 . | grep '<--' |\
70 83 sed -e 's:.*src/\(.*\),v.*:checking in src/\1,v:g'
71 84 cd ..
72 85
73 86 echo % convert again
74 87 hg convert src src-hg | sed -e 's/connecting to.*cvsrepo/connecting to cvsrepo/g'
75 88 hgcat a
76 89 hgcat b/c
77 90
78 91 echo % convert again with --filemap
79 92 hg convert --filemap filemap src src-filemap | sed -e 's/connecting to.*cvsrepo/connecting to cvsrepo/g'
80 93 hgcat b/c
81 94 hg -R src-filemap log --template '{rev} {desc} files: {files}\n'
82 95
83 96 echo % commit branch
84 97 cd src
85 98 cvs -q update -r1.1 b/c
86 99 cvs -q tag -b branch
87 100 cvs -q update -r branch > /dev/null
88 101 echo d >> b/c
89 102 cvs -q commit -mci2 . | grep '<--' |\
90 103 sed -e 's:.*src/\(.*\),v.*:checking in src/\1,v:g'
91 104 cd ..
92 105
93 106 echo % convert again
94 107 hg convert src src-hg | sed -e 's/connecting to.*cvsrepo/connecting to cvsrepo/g'
95 108 hgcat b/c
96 109
97 110 echo % convert again with --filemap
98 111 hg convert --filemap filemap src src-filemap | sed -e 's/connecting to.*cvsrepo/connecting to cvsrepo/g'
99 112 hgcat b/c
100 113 hg -R src-filemap log --template '{rev} {desc} files: {files}\n'
101 114
102 115 echo % commit a new revision with funny log message
103 116 cd src
104 117 sleep 1
105 118 echo e >> a
106 119 cvscall -q commit -m'funny
107 120 ----------------------------
108 121 log message' . | grep '<--' |\
109 122 sed -e 's:.*src/\(.*\),v.*:checking in src/\1,v:g'
110 123 cd ..
111 124
112 125 echo % convert again
113 126 hg convert src src-hg | sed -e 's/connecting to.*cvsrepo/connecting to cvsrepo/g'
114 127
115 128 echo "graphlog = " >> $HGRCPATH
116 129 hg -R src-hg glog --template '{rev} ({branches}) {desc} files: {files}\n'
117 130
118 131 echo % testing debugcvsps
119 132 cd src
120 133 hg debugcvsps | sed -e 's/Author:.*/Author:/' -e 's/Date:.*/Date:/'
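The cvshooks.py module written by the test script above uses Python 2 print statements. A variant that reports through the ui object instead might look like the sketch below. It assumes that Mercurial passes the hook arguments as keywords (ui, repo, hooktype, plus log or changesets) and that a false return value means success. RecordingUI is a hypothetical stand-in so the sketch can run without Mercurial installed.

```python
def cvslog(ui, repo, hooktype, log=None, **kwargs):
    # Report how many log entries were gathered.
    ui.status("%s hook: %d entries\n" % (hooktype, len(log)))


def cvschangesets(ui, repo, hooktype, changesets=None, **kwargs):
    # Report how many changesets were built from the log.
    ui.status("%s hook: %d changesets\n" % (hooktype, len(changesets)))


class RecordingUI(object):
    """Hypothetical stand-in for Mercurial's ui object, for illustration."""

    def __init__(self):
        self.messages = []

    def status(self, msg):
        # Record the message instead of writing it to the terminal.
        self.messages.append(msg)
```

Both functions return None implicitly, which is assumed here to be treated as success. The extra **kwargs parameter is a defensive choice in case additional keyword arguments are passed.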
@@ -1,254 +1,270
1 1 % create cvs repository
2 2 % create source directory
3 3 % import source directory
4 4 N src/a
5 5 N src/b/c
6 6
7 7 No conflicts created by this import
8 8
9 9 % checkout source directory
10 10 U src/a
11 11 U src/b/c
12 12 % commit a new revision changing b/c
13 13 checking in src/b/c,v
14 14 % convert fresh repo
15 15 initializing destination src-hg repository
16 16 connecting to cvsrepo
17 17 scanning source...
18 18 collecting CVS rlog
19 19 5 log entries
20 cvslog hook: 5 entries
20 21 creating changesets
21 22 3 changeset entries
23 cvschangesets hook: 3 changesets
22 24 sorting...
23 25 converting...
24 26 2 Initial revision
25 27 1 import
26 28 0 ci0
27 29 updating tags
28 30 a
29 31 c
30 32 c
31 33 % convert fresh repo with --filemap
32 34 initializing destination src-filemap repository
33 35 connecting to cvsrepo
34 36 scanning source...
35 37 collecting CVS rlog
36 38 5 log entries
39 cvslog hook: 5 entries
37 40 creating changesets
38 41 3 changeset entries
42 cvschangesets hook: 3 changesets
39 43 sorting...
40 44 converting...
41 45 2 Initial revision
42 46 1 import
43 47 filtering out empty revision
44 48 rolling back last transaction
45 49 0 ci0
46 50 updating tags
47 51 c
48 52 c
49 53 2 update tags files: .hgtags
50 54 1 ci0 files: b/c
51 55 0 Initial revision files: b/c
52 56 % commit new file revisions
53 57 checking in src/a,v
54 58 checking in src/b/c,v
55 59 % convert again
56 60 connecting to cvsrepo
57 61 scanning source...
58 62 collecting CVS rlog
59 63 7 log entries
64 cvslog hook: 7 entries
60 65 creating changesets
61 66 4 changeset entries
67 cvschangesets hook: 4 changesets
62 68 sorting...
63 69 converting...
64 70 0 ci1
65 71 a
66 72 a
67 73 c
68 74 c
69 75 c
70 76 % convert again with --filemap
71 77 connecting to cvsrepo
72 78 scanning source...
73 79 collecting CVS rlog
74 80 7 log entries
81 cvslog hook: 7 entries
75 82 creating changesets
76 83 4 changeset entries
84 cvschangesets hook: 4 changesets
77 85 sorting...
78 86 converting...
79 87 0 ci1
80 88 c
81 89 c
82 90 c
83 91 3 ci1 files: b/c
84 92 2 update tags files: .hgtags
85 93 1 ci0 files: b/c
86 94 0 Initial revision files: b/c
87 95 % commit branch
88 96 U b/c
89 97 T a
90 98 T b/c
91 99 checking in src/b/c,v
92 100 % convert again
93 101 connecting to cvsrepo
94 102 scanning source...
95 103 collecting CVS rlog
96 104 8 log entries
105 cvslog hook: 8 entries
97 106 creating changesets
98 107 5 changeset entries
108 cvschangesets hook: 5 changesets
99 109 sorting...
100 110 converting...
101 111 0 ci2
102 112 c
103 113 d
104 114 % convert again with --filemap
105 115 connecting to cvsrepo
106 116 scanning source...
107 117 collecting CVS rlog
108 118 8 log entries
119 cvslog hook: 8 entries
109 120 creating changesets
110 121 5 changeset entries
122 cvschangesets hook: 5 changesets
111 123 sorting...
112 124 converting...
113 125 0 ci2
114 126 c
115 127 d
116 128 4 ci2 files: b/c
117 129 3 ci1 files: b/c
118 130 2 update tags files: .hgtags
119 131 1 ci0 files: b/c
120 132 0 Initial revision files: b/c
121 133 % commit a new revision with funny log message
122 134 checking in src/a,v
123 135 % convert again
124 136 connecting to cvsrepo
125 137 scanning source...
126 138 collecting CVS rlog
127 139 9 log entries
140 cvslog hook: 9 entries
128 141 creating changesets
129 142 6 changeset entries
143 cvschangesets hook: 6 changesets
130 144 sorting...
131 145 converting...
132 146 0 funny
133 147 o 6 (branch) funny
134 148 | ----------------------------
135 149 | log message files: a
136 150 o 5 (branch) ci2 files: b/c
137 151
138 152 o 4 () ci1 files: a b/c
139 153 |
140 154 o 3 () update tags files: .hgtags
141 155 |
142 156 o 2 () ci0 files: b/c
143 157 |
144 158 | o 1 (INITIAL) import files:
145 159 |/
146 160 o 0 () Initial revision files: a b/c
147 161
148 162 % testing debugcvsps
149 163 collecting CVS rlog
150 164 9 log entries
165 cvslog hook: 9 entries
151 166 creating changesets
152 167 8 changeset entries
168 cvschangesets hook: 8 changesets
153 169 ---------------------
154 170 PatchSet 1
155 171 Date:
156 172 Author:
157 173 Branch: HEAD
158 174 Tag: (none)
159 175 Branchpoints: INITIAL
160 176 Log:
161 177 Initial revision
162 178
163 179 Members:
164 180 a:INITIAL->1.1
165 181
166 182 ---------------------
167 183 PatchSet 2
168 184 Date:
169 185 Author:
170 186 Branch: HEAD
171 187 Tag: (none)
172 188 Branchpoints: INITIAL, branch
173 189 Log:
174 190 Initial revision
175 191
176 192 Members:
177 193 b/c:INITIAL->1.1
178 194
179 195 ---------------------
180 196 PatchSet 3
181 197 Date:
182 198 Author:
183 199 Branch: INITIAL
184 200 Tag: start
185 201 Log:
186 202 import
187 203
188 204 Members:
189 205 a:1.1->1.1.1.1
190 206 b/c:1.1->1.1.1.1
191 207
192 208 ---------------------
193 209 PatchSet 4
194 210 Date:
195 211 Author:
196 212 Branch: HEAD
197 213 Tag: (none)
198 214 Log:
199 215 ci0
200 216
201 217 Members:
202 218 b/c:1.1->1.2
203 219
204 220 ---------------------
205 221 PatchSet 5
206 222 Date:
207 223 Author:
208 224 Branch: HEAD
209 225 Tag: (none)
210 226 Branchpoints: branch
211 227 Log:
212 228 ci1
213 229
214 230 Members:
215 231 a:1.1->1.2
216 232
217 233 ---------------------
218 234 PatchSet 6
219 235 Date:
220 236 Author:
221 237 Branch: HEAD
222 238 Tag: (none)
223 239 Log:
224 240 ci1
225 241
226 242 Members:
227 243 b/c:1.2->1.3
228 244
229 245 ---------------------
230 246 PatchSet 7
231 247 Date:
232 248 Author:
233 249 Branch: branch
234 250 Tag: (none)
235 251 Log:
236 252 ci2
237 253
238 254 Members:
239 255 b/c:1.1->1.1.2.1
240 256
241 257 ---------------------
242 258 PatchSet 8
243 259 Date:
244 260 Author:
245 261 Branch: branch
246 262 Tag: (none)
247 263 Log:
248 264 funny
249 265 ----------------------------
250 266 log message
251 267
252 268 Members:
253 269 a:1.2->1.2.2.1
254 270
@@ -1,266 +1,275
1 1 hg convert [OPTION]... SOURCE [DEST [REVMAP]]
2 2
3 3 convert a foreign SCM repository to a Mercurial one.
4 4
5 5 Accepted source formats [identifiers]:
6 6
7 7 - Mercurial [hg]
8 8 - CVS [cvs]
9 9 - Darcs [darcs]
10 10 - git [git]
11 11 - Subversion [svn]
12 12 - Monotone [mtn]
13 13 - GNU Arch [gnuarch]
14 14 - Bazaar [bzr]
15 15 - Perforce [p4]
16 16
17 17 Accepted destination formats [identifiers]:
18 18
19 19 - Mercurial [hg]
20 20 - Subversion [svn] (history on branches is not preserved)
21 21
22 22 If no revision is given, all revisions will be converted. Otherwise,
23 23 convert will only import up to the named revision (given in a format
24 24 understood by the source).
25 25
26 26 If no destination directory name is specified, it defaults to the basename
27 27 of the source with '-hg' appended. If the destination repository doesn't
28 28 exist, it will be created.
29 29
30 30 By default, all sources except Mercurial will use --branchsort. Mercurial
31 31 uses --sourcesort to preserve original revision numbers order. Sort modes
32 32 have the following effects:
33 33
34 34 --branchsort convert from parent to child revision when possible, which
35 35 means branches are usually converted one after the other. It
36 36 generates more compact repositories.
37 37 --datesort sort revisions by date. Converted repositories have good-
38 38 looking changelogs but are often an order of magnitude
39 39 larger than the same ones generated by --branchsort.
40 40 --sourcesort try to preserve source revisions order, only supported by
41 41 Mercurial sources.
42 42
43 43 If <REVMAP> isn't given, it will be put in a default location
44 44 (<dest>/.hg/shamap by default). The <REVMAP> is a simple text file that
45 45 maps each source commit ID to the destination ID for that revision, like
46 46 so:
47 47
48 48 <source ID> <destination ID>
49 49
50 50 If the file doesn't exist, it's automatically created. It's updated on
51 51 each commit copied, so convert-repo can be interrupted and can be run
52 52 repeatedly to copy new commits.
53 53
54 54 The [username mapping] file is a simple text file that maps each source
55 55 commit author to a destination commit author. It is handy for source SCMs
56 57 that use Unix logins to identify authors (e.g. CVS). There is one line per
57 58 author mapping, and the line format is: srcauthor=whatever string you want
58 58
59 59 The filemap is a file that allows filtering and remapping of files and
60 60 directories. Comment lines start with '#'. Each line can contain one of
61 61 the following directives:
62 62
63 63 include path/to/file
64 64
65 65 exclude path/to/file
66 66
67 67 rename from/file to/file
68 68
69 69 The 'include' directive causes a file, or all files under a directory, to
70 70 be included in the destination repository, and the exclusion of all other
71 71 files and directories not explicitly included. The 'exclude' directive
72 72 causes files or directories to be omitted. The 'rename' directive renames
73 73 a file or directory. To rename from a subdirectory into the root of the
74 74 repository, use '.' as the path to rename to.
75 75
76 76 The splicemap is a file that allows insertion of synthetic history,
77 77 letting you specify the parents of a revision. This is useful if you want
78 78 to e.g. give a Subversion merge two parents, or graft two disconnected
79 79 series of history together. Each entry contains a key, followed by a
80 80 space, followed by one or two comma-separated values. The key is the
81 81 revision ID in the source revision control system whose parents should be
82 82 modified (same format as a key in .hg/shamap). The values are the revision
83 83 IDs (in either the source or destination revision control system) that
84 84 should be used as the new parents for that node. For example, if you have
85 85 merged "release-1.0" into "trunk", then you should specify the revision on
86 86 "trunk" as the first parent and the one on the "release-1.0" branch as the
87 87 second.
88 88
89 89 The branchmap is a file that allows you to rename a branch when it is
90 90 being brought in from whatever external repository. When used in
91 91 conjunction with a splicemap, it allows for a powerful combination to help
92 92 fix even the most badly mismanaged repositories and turn them into nicely
93 93 structured Mercurial repositories. The branchmap contains lines of the
94 94 form "original_branch_name new_branch_name". "original_branch_name" is the
95 95 name of the branch in the source repository, and "new_branch_name" is the
96 96 name of the branch in the destination repository. This can be used to (for
97 97 instance) move code in one repository from "default" to a named branch.
98 98
99 99 Mercurial Source
100 100 ----------------
101 101
102 102 --config convert.hg.ignoreerrors=False (boolean)
103 103 ignore integrity errors when reading. Use it to fix Mercurial
104 104 repositories with missing revlogs, by converting from and to
105 105 Mercurial.
106 106 --config convert.hg.saverev=False (boolean)
107 107 store original revision ID in changeset (forces target IDs to change)
108 108 --config convert.hg.startrev=0 (hg revision identifier)
109 109 convert start revision and its descendants
110 110
111 111 CVS Source
112 112 ----------
113 113
114 114 CVS source will use a sandbox (i.e. a checked-out copy) from CVS to
115 115 indicate the starting point of what will be converted. Direct access to
116 116 the repository files is not needed, unless of course the repository is
117 117 :local:. The conversion uses the top level directory in the sandbox to
118 118 find the CVS repository, and then uses CVS rlog commands to find files to
119 119 convert. This means that unless a filemap is given, all files under the
120 120 starting directory will be converted, and that any directory
121 121 reorganization in the CVS sandbox is ignored.
122 122
123 123 The options shown are the defaults.
124 124
125 125 --config convert.cvsps.cache=True (boolean)
126 126 Set to False to disable remote log caching, for testing and debugging
127 127 purposes.
128 128 --config convert.cvsps.fuzz=60 (integer)
129 129 Specify the maximum time (in seconds) that is allowed between commits
130 130 with identical user and log message in a single changeset. When very
131 131 large files were checked in as part of a changeset then the default
132 132 may not be long enough.
133 133 --config convert.cvsps.mergeto='{{mergetobranch ([-\w]+)}}'
134 134 Specify a regular expression to which commit log messages are matched.
135 135 If a match occurs, then the conversion process will insert a dummy
136 136 revision merging the branch on which this log message occurs to the
137 137 branch indicated in the regex.
138 138 --config convert.cvsps.mergefrom='{{mergefrombranch ([-\w]+)}}'
139 139 Specify a regular expression to which commit log messages are matched.
140 140 If a match occurs, then the conversion process will add the most
141 141 recent revision on the branch indicated in the regex as the second
142 142 parent of the changeset.
143 --config hooks.cvslog
144 Specify a Python function to be called at the end of gathering the CVS
145 log. The function is passed a list with the log entries, and can
146 modify the entries in-place, or add or delete them.
147 --config hooks.cvschangesets
148 Specify a Python function to be called after the changesets are
149 calculated from the CVS log. The function is passed a list with
150 the changeset entries, and can modify the changesets in-place, or add
151 or delete them.
143 152
144 153 An additional "debugcvsps" Mercurial command allows the builtin changeset
145 154 merging code to be run without doing a conversion. Its parameters and
146 155 output are similar to that of cvsps 2.1. Please see the command help for
147 156 more details.
148 157
149 158 Subversion Source
150 159 -----------------
151 160
152 161 Subversion source detects classical trunk/branches/tags layouts. By
153 162 default, the supplied "svn://repo/path/" source URL is converted as a
154 163 single branch. If "svn://repo/path/trunk" exists it replaces the default
155 164 branch. If "svn://repo/path/branches" exists, its subdirectories are
156 165 listed as possible branches. If "svn://repo/path/tags" exists, it is
157 166 looked for tags referencing converted branches. Default "trunk",
158 167 "branches" and "tags" values can be overridden with following options. Set
159 168 them to paths relative to the source URL, or leave them blank to disable
160 169 auto detection.
161 170
162 171 --config convert.svn.branches=branches (directory name)
163 172 specify the directory containing branches
164 173 --config convert.svn.tags=tags (directory name)
165 174 specify the directory containing tags
166 175 --config convert.svn.trunk=trunk (directory name)
167 176 specify the name of the trunk branch
168 177
169 178 Source history can be retrieved starting at a specific revision, instead
170 179 of being integrally converted. Only single branch conversions are
171 180 supported.
172 181
173 182 --config convert.svn.startrev=0 (svn revision number)
174 183 specify start Subversion revision.
175 184
176 185 Perforce Source
177 186 ---------------
178 187
179 188 The Perforce (P4) importer can be given a p4 depot path or a client
180 189 specification as source. It will convert all files in the source to a flat
181 190 Mercurial repository, ignoring labels, branches and integrations. Note
182 191 that when a depot path is given you then usually should specify a target
183 192 directory, because otherwise the target may be named ...-hg.
184 193
185 194 It is possible to limit the amount of source history to be converted by
186 195 specifying an initial Perforce revision.
187 196
188 197 --config convert.p4.startrev=0 (perforce changelist number)
189 198 specify initial Perforce revision.
190 199
191 200 Mercurial Destination
192 201 ---------------------
193 202
194 203 --config convert.hg.clonebranches=False (boolean)
195 204 dispatch source branches in separate clones.
196 205 --config convert.hg.tagsbranch=default (branch name)
197 206 tag revisions branch name
198 207 --config convert.hg.usebranchnames=True (boolean)
199 208 preserve branch names
200 209
201 210 options:
202 211
203 212 -A --authors username mapping filename
204 213 -d --dest-type destination repository type
205 214 --filemap remap file names using contents of file
206 215 -r --rev import up to target revision REV
207 216 -s --source-type source repository type
208 217 --splicemap splice synthesized history into place
209 218 --branchmap change branch names while converting
210 219 --branchsort try to sort changesets by branches
211 220 --datesort try to sort changesets by date
212 221 --sourcesort preserve source changesets order
213 222
214 223 use "hg -v help convert" to show global options
215 224 adding a
216 225 assuming destination a-hg
217 226 initializing destination a-hg repository
218 227 scanning source...
219 228 sorting...
220 229 converting...
221 230 4 a
222 231 3 b
223 232 2 c
224 233 1 d
225 234 0 e
226 235 pulling from ../a
227 236 searching for changes
228 237 no changes found
229 238 % should fail
230 239 initializing destination bogusfile repository
231 240 abort: cannot create new bundle repository
232 241 % should fail
233 242 abort: Permission denied: bogusdir
234 243 % should succeed
235 244 initializing destination bogusdir repository
236 245 scanning source...
237 246 sorting...
238 247 converting...
239 248 4 a
240 249 3 b
241 250 2 c
242 251 1 d
243 252 0 e
244 253 % test pre and post conversion actions
245 254 run hg source pre-conversion action
246 255 run hg sink pre-conversion action
247 256 run hg sink post-conversion action
248 257 run hg source post-conversion action
249 258 % converting empty dir should fail nicely
250 259 assuming destination emptydir-hg
251 260 initializing destination emptydir-hg repository
252 261 emptydir does not look like a CVS checkout
253 262 emptydir does not look like a Git repo
254 263 emptydir does not look like a Subversion repo
255 264 emptydir is not a local Mercurial repo
256 265 emptydir does not look like a darcs repo
257 266 emptydir does not look like a monotone repo
258 267 emptydir does not look like a GNU Arch repo
259 268 emptydir does not look like a Bazaar repo
260 269 cannot find required "p4" tool
261 270 abort: emptydir: missing or unsupported repository
262 271 % convert with imaginary source type
263 272 initializing destination a-foo repository
264 273 abort: foo: invalid source repository type
265 274 % convert with imaginary sink type
266 275 abort: foo: invalid destination repository type
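The help text in this hunk says that a cvslog hook may modify, add or delete log entries. A minimal sketch of such a hook follows. It assumes that the converter keeps using the same list object it passed in, so deletions have to be made in place. The "attic/" prefix and the SimpleNamespace entries are hypothetical stand-ins used only for the demonstration; the comment and file attributes match the names used in cvsps.py.

```python
from types import SimpleNamespace


def cvslog(ui, repo, hooktype, log=None, **kwargs):
    # Modify entries in place: strip surrounding whitespace from messages.
    for entry in log:
        entry.comment = entry.comment.strip()
    # Delete entries by slice assignment so the caller's list changes too.
    # Rebinding the local name `log` would leave the caller's list untouched.
    log[:] = [e for e in log if not e.file.startswith("attic/")]


# Demonstration with stand-in entries (not real cvsps log entries).
demo = [
    SimpleNamespace(comment=" fix typo \n", file="src/a"),
    SimpleNamespace(comment="obsolete", file="attic/b"),
]
cvslog(None, None, "cvslog", log=demo)
```

As in the test script, such a function would be enabled from the [hooks] section of the configuration, for example cvslog=python:/path/to/cvshooks.py:cvslog, where the path is a placeholder.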