lfs: expand the user facing documentation
Matt Harbison
r35786:60a6ab7b default
@@ -1,340 +1,387
1 1 # lfs - hash-preserving large file support using Git-LFS protocol
2 2 #
3 3 # Copyright 2017 Facebook, Inc.
4 4 #
5 5 # This software may be used and distributed according to the terms of the
6 6 # GNU General Public License version 2 or any later version.
7 7
8 8 """lfs - large file support (EXPERIMENTAL)
9 9
10 This extension allows large files to be tracked outside of the normal
11 repository storage and stored on a centralized server, similar to the
12 ``largefiles`` extension. The ``git-lfs`` protocol is used when
13 communicating with the server, so existing git infrastructure can be
14 harnessed. Even though the files are stored outside of the repository,
15 they are still integrity checked in the same manner as normal files.
16
17 The files stored outside of the repository are downloaded on demand,
18 which reduces the time to clone, and possibly the local disk usage.
19 This changes fundamental workflows in a DVCS, so careful thought
20 should be given before deploying it. :hg:`convert` can be used to
21 convert LFS repositories to normal repositories that no longer
22 require this extension, and do so without changing the commit hashes.
23 This allows the extension to be disabled if the centralized workflow
24 becomes burdensome. However, the pre and post convert clones will
25 not be able to communicate with each other unless the extension is
26 enabled on both.
27
28 To start a new repository, or add new LFS files, just create and add
29 an ``.hglfs`` file as described below. Because the file is tracked in
30 the repository, all clones will use the same selection policy. During
31 subsequent commits, Mercurial will consult this file to determine if
32 an added or modified file should be stored externally. The type of
33 storage depends on the characteristics of the file at each commit. A
34 file that is near a size threshold may switch back and forth between
35 LFS and normal storage, as needed.
36
37 Alternately, both normal repositories and largefile controlled
38 repositories can be converted to LFS by using :hg:`convert` and the
39 ``lfs.track`` config option described below. The ``.hglfs`` file
40 should then be created and added, to control subsequent LFS selection.
41 The hashes are also unchanged in this case. The LFS and non-LFS
42 repositories can be distinguished because the LFS repository will
43 abort any command if this extension is disabled.
44
45 Committed LFS files are held locally, until the repository is pushed.
46 Prior to pushing the normal repository data, the LFS files that are
47 tracked by the outgoing commits are automatically uploaded to the
48 configured central server. No LFS files are transferred on
49 :hg:`pull` or :hg:`clone`. Instead, the files are downloaded on
50 demand as they need to be read, if a cached copy cannot be found
51 locally. Both committing and downloading an LFS file will link the
52 file to a usercache, to speed up future access. See the `usercache`
53 config setting described below.
54
55 .hglfs::
56
10 57 The extension reads its configuration from a versioned ``.hglfs``
11 58 configuration file found in the root of the working directory. The
12 59 ``.hglfs`` file uses the same syntax as all other Mercurial
13 60 configuration files. It uses a single section, ``[track]``.
14 61
15 62 The ``[track]`` section specifies which files are stored as LFS (or
16 63 not). Each line is keyed by a file pattern, with a predicate value.
17 64 The first file pattern match is used, so put more specific patterns
18 65 first. The available predicates are ``all()``, ``none()``, and
19 66 ``size()``. See "hg help filesets.size" for the latter.
20 67
21 68 Example versioned ``.hglfs`` file::
22 69
23 70 [track]
24 71 # No Makefile or python file, anywhere, will be LFS
25 72 **Makefile = none()
26 73 **.py = none()
27 74
28 75 **.zip = all()
29 76 **.exe = size(">1MB")
30 77
31 78 # Catchall for everything not matched above
32 79 ** = size(">10MB")
33 80
34 81 Configs::
35 82
36 83 [lfs]
37 84 # Remote endpoint. Multiple protocols are supported:
38 85 # - http(s)://user:pass@example.com/path
39 86 # git-lfs endpoint
40 87 # - file:///tmp/path
41 88 # local filesystem, usually for testing
42 89 # if unset, lfs will prompt setting this when it must use this value.
43 90 # (default: unset)
44 url = https://example.com/lfs
91 url = https://example.com/repo.git/info/lfs
45 92
46 93 # Which files to track in LFS. Path tests are "**.extname" for file
47 94 # extensions, and "path:under/some/directory" for path prefix. Both
48 95 # are relative to the repository root.
49 96 # File size can be tested with the "size()" fileset, and tests can be
50 97 # joined with fileset operators. (See "hg help filesets.operators".)
51 98 #
52 99 # Some examples:
53 100 # - all() # everything
54 101 # - none() # nothing
55 102 # - size(">20MB") # larger than 20MB
56 103 # - !**.txt # anything not a *.txt file
57 104 # - **.zip | **.tar.gz | **.7z # some types of compressed files
58 105 # - path:bin # files under "bin" in the project root
59 106 # - (**.php & size(">2MB")) | (**.js & size(">5MB")) | **.tar.gz
60 107 # | (path:bin & !path:/bin/README) | size(">1GB")
61 108 # (default: none())
62 109 #
63 110 # This is ignored if there is a tracked '.hglfs' file, and this setting
64 111 # will eventually be deprecated and removed.
65 112 track = size(">10M")
66 113
67 114 # how many times to retry before giving up on transferring an object
68 115 retry = 5
69 116
70 117 # the local directory to store lfs files for sharing across local clones.
71 118 # If not set, the cache is located in an OS specific cache location.
72 119 usercache = /path/to/global/cache
73 120 """
74 121
75 122 from __future__ import absolute_import
76 123
77 124 from mercurial.i18n import _
78 125
79 126 from mercurial import (
80 127 bundle2,
81 128 changegroup,
82 129 cmdutil,
83 130 config,
84 131 context,
85 132 error,
86 133 exchange,
87 134 extensions,
88 135 filelog,
89 136 fileset,
90 137 hg,
91 138 localrepo,
92 139 minifileset,
93 140 node,
94 141 pycompat,
95 142 registrar,
96 143 revlog,
97 144 scmutil,
98 145 templatekw,
99 146 upgrade,
100 147 util,
101 148 vfs as vfsmod,
102 149 wireproto,
103 150 )
104 151
105 152 from . import (
106 153 blobstore,
107 154 wrapper,
108 155 )
109 156
110 157 # Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
111 158 # extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
112 159 # be specifying the version(s) of Mercurial they are tested with, or
113 160 # leave the attribute unspecified.
114 161 testedwith = 'ships-with-hg-core'
115 162
116 163 configtable = {}
117 164 configitem = registrar.configitem(configtable)
118 165
119 166 configitem('experimental', 'lfs.user-agent',
120 167 default=None,
121 168 )
122 169 configitem('experimental', 'lfs.worker-enable',
123 170 default=False,
124 171 )
125 172
126 173 configitem('lfs', 'url',
127 174 default=None,
128 175 )
129 176 configitem('lfs', 'usercache',
130 177 default=None,
131 178 )
132 179 # Deprecated
133 180 configitem('lfs', 'threshold',
134 181 default=None,
135 182 )
136 183 configitem('lfs', 'track',
137 184 default='none()',
138 185 )
139 186 configitem('lfs', 'retry',
140 187 default=5,
141 188 )
142 189
143 190 cmdtable = {}
144 191 command = registrar.command(cmdtable)
145 192
146 193 templatekeyword = registrar.templatekeyword()
147 194
148 195 def featuresetup(ui, supported):
149 196 # don't die on seeing a repo with the lfs requirement
150 197 supported |= {'lfs'}
151 198
152 199 def uisetup(ui):
153 200 localrepo.localrepository.featuresetupfuncs.add(featuresetup)
154 201
155 202 def reposetup(ui, repo):
156 203 # Nothing to do with a remote repo
157 204 if not repo.local():
158 205 return
159 206
160 207 repo.svfs.lfslocalblobstore = blobstore.local(repo)
161 208 repo.svfs.lfsremoteblobstore = blobstore.remote(repo)
162 209
163 210 class lfsrepo(repo.__class__):
164 211 @localrepo.unfilteredmethod
165 212 def commitctx(self, ctx, error=False):
166 213 repo.svfs.options['lfstrack'] = _trackedmatcher(self, ctx)
167 214 return super(lfsrepo, self).commitctx(ctx, error)
168 215
169 216 repo.__class__ = lfsrepo
170 217
171 218 if 'lfs' not in repo.requirements:
172 219 def checkrequireslfs(ui, repo, **kwargs):
173 220 if 'lfs' not in repo.requirements:
174 221 last = kwargs.get('node_last')
175 222 _bin = node.bin
176 223 if last:
177 224 s = repo.set('%n:%n', _bin(kwargs['node']), _bin(last))
178 225 else:
179 226 s = repo.set('%n', _bin(kwargs['node']))
180 227 for ctx in s:
181 228 # TODO: is there a way to just walk the files in the commit?
182 229 if any(ctx[f].islfs() for f in ctx.files() if f in ctx):
183 230 repo.requirements.add('lfs')
184 231 repo._writerequirements()
185 232 repo.prepushoutgoinghooks.add('lfs', wrapper.prepush)
186 233 break
187 234
188 235 ui.setconfig('hooks', 'commit.lfs', checkrequireslfs, 'lfs')
189 236 ui.setconfig('hooks', 'pretxnchangegroup.lfs', checkrequireslfs, 'lfs')
190 237 else:
191 238 repo.prepushoutgoinghooks.add('lfs', wrapper.prepush)
192 239
193 240 def _trackedmatcher(repo, ctx):
194 241 """Return a function (path, size) -> bool indicating whether or not to
195 242 track a given file with lfs."""
196 243 data = ''
197 244
198 245 if '.hglfs' in ctx.added() or '.hglfs' in ctx.modified():
199 246 data = ctx['.hglfs'].data()
200 247 elif '.hglfs' not in ctx.removed():
201 248 p1 = repo['.']
202 249
203 250 if '.hglfs' not in p1:
204 251 # No '.hglfs' in wdir or in parent. Fallback to config
205 252 # for now.
206 253 trackspec = repo.ui.config('lfs', 'track')
207 254
208 255 # deprecated config: lfs.threshold
209 256 threshold = repo.ui.configbytes('lfs', 'threshold')
210 257 if threshold:
211 258 fileset.parse(trackspec) # make sure syntax errors are confined
212 259 trackspec = "(%s) | size('>%d')" % (trackspec, threshold)
213 260
214 261 return minifileset.compile(trackspec)
215 262
216 263 data = p1['.hglfs'].data()
217 264
218 265 # In removed, or not in parent
219 266 if not data:
220 267 return lambda p, s: False
221 268
222 269 # Parse errors here will abort with a message that points to the .hglfs file
223 270 # and line number.
224 271 cfg = config.config()
225 272 cfg.parse('.hglfs', data)
226 273
227 274 try:
228 275 rules = [(minifileset.compile(pattern), minifileset.compile(rule))
229 276 for pattern, rule in cfg.items('track')]
230 277 except error.ParseError as e:
231 278 # The original exception gives no indicator that the error is in the
232 279 # .hglfs file, so add that.
233 280
234 281 # TODO: See if the line number of the file can be made available.
235 282 raise error.Abort(_('parse error in .hglfs: %s') % e)
236 283
237 284 def _match(path, size):
238 285 for pat, rule in rules:
239 286 if pat(path, size):
240 287 return rule(path, size)
241 288
242 289 return False
243 290
244 291 return _match
245 292
246 293 def wrapfilelog(filelog):
247 294 wrapfunction = extensions.wrapfunction
248 295
249 296 wrapfunction(filelog, 'addrevision', wrapper.filelogaddrevision)
250 297 wrapfunction(filelog, 'renamed', wrapper.filelogrenamed)
251 298 wrapfunction(filelog, 'size', wrapper.filelogsize)
252 299
253 300 def extsetup(ui):
254 301 wrapfilelog(filelog.filelog)
255 302
256 303 wrapfunction = extensions.wrapfunction
257 304
258 305 wrapfunction(cmdutil, '_updatecatformatter', wrapper._updatecatformatter)
259 306 wrapfunction(scmutil, 'wrapconvertsink', wrapper.convertsink)
260 307
261 308 wrapfunction(upgrade, '_finishdatamigration',
262 309 wrapper.upgradefinishdatamigration)
263 310
264 311 wrapfunction(upgrade, 'preservedrequirements',
265 312 wrapper.upgraderequirements)
266 313
267 314 wrapfunction(upgrade, 'supporteddestrequirements',
268 315 wrapper.upgraderequirements)
269 316
270 317 wrapfunction(changegroup,
271 318 'supportedoutgoingversions',
272 319 wrapper.supportedoutgoingversions)
273 320 wrapfunction(changegroup,
274 321 'allsupportedversions',
275 322 wrapper.allsupportedversions)
276 323
277 324 wrapfunction(exchange, 'push', wrapper.push)
278 325 wrapfunction(wireproto, '_capabilities', wrapper._capabilities)
279 326
280 327 wrapfunction(context.basefilectx, 'cmp', wrapper.filectxcmp)
281 328 wrapfunction(context.basefilectx, 'isbinary', wrapper.filectxisbinary)
282 329 context.basefilectx.islfs = wrapper.filectxislfs
283 330
284 331 revlog.addflagprocessor(
285 332 revlog.REVIDX_EXTSTORED,
286 333 (
287 334 wrapper.readfromstore,
288 335 wrapper.writetostore,
289 336 wrapper.bypasscheckhash,
290 337 ),
291 338 )
292 339
293 340 wrapfunction(hg, 'clone', wrapper.hgclone)
294 341 wrapfunction(hg, 'postshare', wrapper.hgpostshare)
295 342
296 343 # Make bundle choose changegroup3 instead of changegroup2. This affects
297 344 # "hg bundle" command. Note: it does not cover all bundle formats like
298 345 # "packed1". Using "packed1" with lfs will likely cause trouble.
299 346 names = [k for k, v in exchange._bundlespeccgversions.items() if v == '02']
300 347 for k in names:
301 348 exchange._bundlespeccgversions[k] = '03'
302 349
303 350 # bundlerepo uses "vfsmod.readonlyvfs(othervfs)", we need to make sure lfs
304 351 # options and blob stores are passed from othervfs to the new readonlyvfs.
305 352 wrapfunction(vfsmod.readonlyvfs, '__init__', wrapper.vfsinit)
306 353
307 354 # when writing a bundle via "hg bundle" command, upload related LFS blobs
308 355 wrapfunction(bundle2, 'writenewbundle', wrapper.writenewbundle)
309 356
310 357 @templatekeyword('lfs_files')
311 358 def lfsfiles(repo, ctx, **args):
312 359 """List of strings. LFS files added or modified by the changeset."""
313 360 args = pycompat.byteskwargs(args)
314 361
315 362 pointers = wrapper.pointersfromctx(ctx) # {path: pointer}
316 363 files = sorted(pointers.keys())
317 364
318 365 def lfsattrs(v):
319 366 # In the file spec, version is first and the other keys are sorted.
320 367 sortkeyfunc = lambda x: (x[0] != 'version', x)
321 368 items = sorted(pointers[v].iteritems(), key=sortkeyfunc)
322 369 return util.sortdict(items)
323 370
324 371 makemap = lambda v: {
325 372 'file': v,
326 373 'oid': pointers[v].oid(),
327 374 'lfsattrs': templatekw.hybriddict(lfsattrs(v)),
328 375 }
329 376
330 377 # TODO: make the separator ', '?
331 378 f = templatekw._showlist('lfs_file', files, args)
332 379 return templatekw._hybrid(f, files, makemap, pycompat.identity)
333 380
334 381 @command('debuglfsupload',
335 382 [('r', 'rev', [], _('upload large files introduced by REV'))])
336 383 def debuglfsupload(ui, repo, **opts):
337 384 """upload lfs blobs added by the working copy parent or given revisions"""
338 385 revs = opts.get('rev', [])
339 386 pointers = wrapper.extractpointers(repo, scmutil.revrange(repo, revs))
340 387 wrapper.uploadblobs(repo, pointers)
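
Review note on the `[track]` semantics documented in this change: the new help text says the first matching file pattern wins, and `_trackedmatcher` implements that by walking an ordered list of (pattern, rule) pairs. The following is a minimal standalone sketch of that evaluation order, not the extension's code. The helpers `suffix`, `larger_than`, `always`, and `never` are hypothetical stand-ins for what `minifileset.compile` would produce, and the `MB` constant assumes 1 MB = 2**20 bytes.

```python
# Standalone model of first-match-wins rule evaluation. It does not
# import Mercurial; every helper below is a hypothetical stand-in.

MB = 1024 * 1024  # assumption: "MB" means 2**20 bytes


def suffix(ending):
    """Predicate: the path ends with the given string."""
    return lambda path, size: path.endswith(ending)


def larger_than(limit):
    """Predicate: the file size is strictly greater than limit bytes."""
    return lambda path, size: size > limit


def always(path, size):
    return True


def never(path, size):
    return False


def make_matcher(rules):
    """Build a (path, size) -> bool function from ordered rules.

    Each rule is a (pattern, predicate) pair. The predicate of the first
    pattern that matches decides the result; later rules are not consulted.
    """
    def _match(path, size):
        for pat, rule in rules:
            if pat(path, size):
                return rule(path, size)
        # No pattern matched at all: store the file normally.
        return False
    return _match


# Rough analogue of the example .hglfs file in the help text.
rules = [
    (suffix('Makefile'), never),
    (suffix('.py'), never),
    (suffix('.zip'), always),
    (suffix('.exe'), larger_than(1 * MB)),
    (always, larger_than(10 * MB)),  # catchall, must come last
]
track = make_matcher(rules)

print(track('lib/huge.py', 50 * MB))  # .py rule matches first
print(track('data.bin', 11 * MB))     # falls through to the catchall
```

Because the `.py` rule is listed before the catchall, a 50 MB Python file is still stored normally. Moving the catchall to the top of the list would shadow every rule after it, which is why the help text advises putting more specific patterns first.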