upstream/mercurial-mirror Files · hgext/largefiles/proto.py

lfs: add basic routing for the server side wire protocol processing

The recent hgweb refactoring yielded a clean point to wrap a function that could handle this, so I moved the routing for this out of the core. While not an hg wire protocol, this seems logically close enough. For now, these handlers do nothing other than check permissions.

The protocol requires support for PUT requests, so that has been added to the core, and funnels into the same handler as GET and POST. The permission checking code was assuming that anything not checking 'pull' or None ops should be using POST. But that breaks the upload check if it checks 'push'. So I invented a new 'upload' permission, and used it to avoid the mandate to POST. A function wrap point could be added, but security code should probably stay grouped together. Given that anything not 'pull' or None was requiring POST, the comment on hgweb.common.permhooks is probably wrong: there is no 'read'.

The rationale for the URIs is that the spec for the Batch API[1] defines the URL as the LFS server url + '/objects/batch'. The default git URLs are:

    Git remote: https://git-server.com/foo/bar
    LFS server: https://git-server.com/foo/bar.git/info/lfs
    Batch API: https://git-server.com/foo/bar.git/info/lfs/objects/batch

'.git/' seems like it's not something a user would normally track. If we adhere to how git defines the URLs, then the hg-git extension should be able to talk to a git based server without any additional work.

The URI for the transfer requests starts with '.hg/' to ensure that there are no conflicts with tracked files. Since these are handed out by the Batch API, we can change this at any point in the future. (Specifically, it might be a good idea to use something under the proposed /api/ namespace.) In any case, no files are stored at these locations in the repository directory.

I started a new module for this because it seems like a good idea to keep all of the security sensitive server side code together.

There's also an issue with `hg verify` in that it will want to download *all* blobs in order to run. Sadly, there's no way in the protocol to ask the server to verify the content of a blob it may have. (The verify action is for storing files on a 3rd party server, and then informing the LFS server when that completes.) So we may end up implementing a custom transfer adapter that simply indicates if the blobs are valid, and fall back to basic transfers for non-hg servers. In other words, this code is likely to get bigger before this is made non-experimental.

[1] https://github.com/git-lfs/git-lfs/blob/master/docs/api/batch.md
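The URL scheme described in the commit message can be illustrated with a small sketch. The helper name `lfs_urls` is hypothetical and is not part of Mercurial or the lfs extension; it only applies the default Git convention quoted above (append `.git/info/lfs` to the remote, then `/objects/batch` for the Batch API).

```python
def lfs_urls(git_remote):
    """Return (lfs_server_url, batch_api_url) for a Git remote URL.

    Hypothetical helper: follows the default Git LFS convention of
    '<remote>.git/info/lfs' and '<lfs server>/objects/batch'.
    """
    base = git_remote.rstrip('/')
    # A remote that already ends in '.git' only needs '/info/lfs' appended.
    if not base.endswith('.git'):
        base += '.git'
    lfs_server = base + '/info/lfs'
    return lfs_server, lfs_server + '/objects/batch'


server, batch = lfs_urls('https://git-server.com/foo/bar')
print(server)
print(batch)
```

Because the transfer URIs are handed out by the Batch API response, only the Batch API URL itself needs to be derivable by the client in this way.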

Augie Fackler

File last commit: r36674:5c4c9eb1 default

r37165:a2566597 default

proto.py | 190 lines | 7.2 KiB | text/x-python | PythonLexer
            

# Copyright 2011 Fog Creek Software
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
from __future__ import absolute_import

import os
import re

from mercurial.i18n import _

from mercurial import (
    error,
    httppeer,
    util,
    wireproto,
    wireprototypes,
)

from . import (
    lfutil,
)

urlerr = util.urlerr
urlreq = util.urlreq

LARGEFILES_REQUIRED_MSG = ('\nThis repository uses the largefiles extension.'
                           '\n\nPlease enable it in your Mercurial config '
                           'file.\n')

# these will all be replaced by largefiles.uisetup
ssholdcallstream = None
httpoldcallstream = None

def putlfile(repo, proto, sha):
    '''Server command for putting a largefile into a repository's local store
    and into the user cache.'''
    with proto.mayberedirectstdio() as output:
        path = lfutil.storepath(repo, sha)
        util.makedirs(os.path.dirname(path))
        tmpfp = util.atomictempfile(path, createmode=repo.store.createmode)

        try:
            proto.forwardpayload(tmpfp)
            tmpfp._fp.seek(0)
            if sha != lfutil.hexsha1(tmpfp._fp):
                raise IOError(0, _('largefile contents do not match hash'))
            tmpfp.close()
            lfutil.linktousercache(repo, sha)
        except IOError as e:
            repo.ui.warn(_('largefiles: failed to put %s into store: %s\n') %
                         (sha, e.strerror))
            return wireproto.pushres(1, output.getvalue() if output else '')
        finally:
            tmpfp.discard()

    return wireproto.pushres(0, output.getvalue() if output else '')

def getlfile(repo, proto, sha):
    '''Server command for retrieving a largefile from the repository-local
    cache or user cache.'''
    filename = lfutil.findfile(repo, sha)
    if not filename:
        raise error.Abort(_('requested largefile %s not present in cache')
                          % sha)
    f = open(filename, 'rb')
    length = os.fstat(f.fileno())[6]

    # Since we can't set an HTTP content-length header here, and
    # Mercurial core provides no way to give the length of a streamres
    # (and reading the entire file into RAM would be ill-advised), we
    # just send the length on the first line of the response, like the
    # ssh proto does for string responses.
    def generator():
        yield '%d\n' % length
        for chunk in util.filechunkiter(f):
            yield chunk
    return wireproto.streamres_legacy(gen=generator())

def statlfile(repo, proto, sha):
    '''Server command for checking if a largefile is present - returns '2\n' if
    the largefile is missing, '0\n' if it seems to be in good condition.

    The value 1 is reserved for mismatched checksum, but that is too expensive
    to be verified on every stat and must be caught by running 'hg verify'
    server side.'''
    filename = lfutil.findfile(repo, sha)
    if not filename:
        return wireprototypes.bytesresponse('2\n')
    return wireprototypes.bytesresponse('0\n')

def wirereposetup(ui, repo):
    class lfileswirerepository(repo.__class__):
        def putlfile(self, sha, fd):
            # unfortunately, httprepository._callpush tries to convert its
            # input file-like into a bundle before sending it, so we can't use
            # it ...
            if issubclass(self.__class__, httppeer.httppeer):
                res = self._call('putlfile', data=fd, sha=sha,
                    headers={r'content-type': r'application/mercurial-0.1'})
                try:
                    d, output = res.split('\n', 1)
                    for l in output.splitlines(True):
                        self.ui.warn(_('remote: '), l) # assume l ends with \n
                    return int(d)
                except ValueError:
                    self.ui.warn(_('unexpected putlfile response: %r\n') % res)
                    return 1
            # ... but we can't use sshrepository._call because the data=
            # argument won't get sent, and _callpush does exactly what we want
            # in this case: send the data straight through
            else:
                try:
                    ret, output = self._callpush("putlfile", fd, sha=sha)
                    if ret == "":
                        raise error.ResponseError(_('putlfile failed:'),
                                output)
                    return int(ret)
                except IOError:
                    return 1
                except ValueError:
                    raise error.ResponseError(
                        _('putlfile failed (unexpected response):'), ret)

        def getlfile(self, sha):
            """returns an iterable with the chunks of the file with sha sha"""
            stream = self._callstream("getlfile", sha=sha)
            length = stream.readline()
            try:
                length = int(length)
            except ValueError:
                self._abort(error.ResponseError(_("unexpected response:"),
                                                length))

            # SSH streams will block if reading more than length
            for chunk in util.filechunkiter(stream, limit=length):
                yield chunk
            # HTTP streams must hit the end to process the last empty
            # chunk of Chunked-Encoding so the connection can be reused.
            if issubclass(self.__class__, httppeer.httppeer):
                chunk = stream.read(1)
                if chunk:
                    self._abort(error.ResponseError(_("unexpected response:"),
                                                    chunk))

        @wireproto.batchable
        def statlfile(self, sha):
            f = wireproto.future()
            result = {'sha': sha}
            yield result, f
            try:
                yield int(f.value)
            except (ValueError, urlerr.httperror):
                # If the server returns anything but an integer followed by a
                # newline, it's not speaking our language; if we get
                # an HTTP error, we can't be sure the largefile is present;
                # either way, consider it missing.
                yield 2

    repo.__class__ = lfileswirerepository

# advertise the largefiles=serve capability
def _capabilities(orig, repo, proto):
    '''announce largefile server capability'''
    caps = orig(repo, proto)
    caps.append('largefiles=serve')
    return caps

def heads(repo, proto):
    '''Wrap server command - largefile capable clients will know to call
    lheads instead'''
    if lfutil.islfilesrepo(repo):
        return wireproto.ooberror(LARGEFILES_REQUIRED_MSG)
    return wireproto.heads(repo, proto)

def sshrepocallstream(self, cmd, **args):
    if cmd == 'heads' and self.capable('largefiles'):
        cmd = 'lheads'
    if cmd == 'batch' and self.capable('largefiles'):
        args[r'cmds'] = args[r'cmds'].replace('heads ', 'lheads ')
    return ssholdcallstream(self, cmd, **args)

headsre = re.compile(br'(^|;)heads\b')

def httprepocallstream(self, cmd, **args):
    if cmd == 'heads' and self.capable('largefiles'):
        cmd = 'lheads'
    if cmd == 'batch' and self.capable('largefiles'):
        args[r'cmds'] = headsre.sub('lheads', args[r'cmds'])
    return httpoldcallstream(self, cmd, **args)
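
The length-prefixed framing used by `getlfile` (the server sends the payload length on the first line, and the client reads exactly that many bytes) can be sketched in isolation. This is a minimal standalone illustration, not Mercurial code: `frame` and `read_framed` are hypothetical names, and an in-memory `io.BytesIO` stands in for the wire stream.

```python
import io


def frame(payload, chunksize=4):
    """Yield a decimal length line, then the payload in fixed-size chunks."""
    yield b'%d\n' % len(payload)
    for i in range(0, len(payload), chunksize):
        yield payload[i:i + chunksize]


def read_framed(stream, chunksize=4):
    """Read the length line, then yield at most that many bytes in chunks."""
    header = stream.readline()
    try:
        remaining = int(header)
    except ValueError:
        raise ValueError('unexpected response: %r' % header)
    while remaining > 0:
        # Never request more than the announced length, so a stream that
        # stays open after the payload (as with SSH) does not block the read.
        chunk = stream.read(min(chunksize, remaining))
        if not chunk:
            break  # the stream ended before the announced length
        remaining -= len(chunk)
        yield chunk


# Bytes after the announced length are left unread on the stream.
wire = io.BytesIO(b''.join(frame(b'largefile bytes')) + b'TRAILING')
received = b''.join(read_framed(wire))
print(received)
```

Bounding each read by the announced length is what lets the same reader work on a connection that remains open after the payload.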
