wireproto: add streams to frame-based protocol

Previously, the frame-based protocol was just a series of frames, with each frame associated with a request ID.

In order to scale the protocol, we'll want to enable the use of compression. While it is possible to enable compression at the socket/pipe level, this has its disadvantages. The big one is it undermines the point of frames being standalone, atomic units that can be read and written: if you add compression above the framing protocol, you are back to having a stream-based protocol as opposed to something frame-based.

So in order to preserve frames, compression needs to occur at the frame payload level.

Compressing each frame's payload individually will limit compression ratios because the window size of the compressor will be limited by the max frame size, which is 32-64kb as currently defined. It will also add CPU overhead, as it is more efficient for compressors to operate on fewer, larger blocks of data than more, smaller blocks. So compressing each frame independently is out.

This means we need to compress each frame's payload as if it is part of a larger stream.

The simplest approach is to have 1 stream per connection. This could certainly work. However, it has disadvantages (documented below).

We could also have 1 stream per RPC/command invocation. (This is the model HTTP/2 goes with.) This also has disadvantages.

The main disadvantage to one global stream is that it has the very real potential to create CPU bottlenecks doing compression. Networks are only getting faster and the performance of single CPU cores has been relatively flat. Newer compression formats like zstandard offer better CPU cycle efficiency than predecessors like zlib. But it is still all too common to saturate your CPU with compression overhead long before you saturate the network pipe.

The main disadvantage with streams per request is that you can't reap the benefits of the compression context for multiple requests.
For example, if you send 1000 RPC requests (or HTTP/2 requests for that matter), the response to each would have its own compression context. The overall size of the raw responses would be larger because compression contexts wouldn't be able to reference data from another request or response.

The approach for streams as implemented in this commit is to support N streams per connection and for streams to potentially span requests and responses. As explained by the added internals docs, this facilitates servers and clients delegating independent streams and compression to independent threads / CPU cores. This helps alleviate the CPU bottleneck of compression. This design also allows compression contexts to be reused across requests/responses. This can result in improved compression ratios and less overhead for compressors and decompressors having to build new contexts.

Another feature that was defined was the ability for individual frames within a stream to declare whether that individual frame's payload uses the content encoding (read: compression) defined by the stream. The idea here is that some servers may serve data from a combination of caches and dynamic resolution. Data coming from caches may be pre-compressed. We want to facilitate servers being able to essentially stream bytes from caches to the wire with minimal overhead. Being able to mix and match which frames are compressed within a stream enables this type of advanced server functionality.

This commit defines the new streams mechanism. Basic code for supporting streams in frames has been added. But that code is seriously lacking and doesn't fully conform to the defined protocol. For example, we don't close any streams. And support for content encoding within streams is not yet implemented. The change was rather invasive and I didn't think it would be reasonable to implement the entire feature in a single commit.
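The compression-context argument above can be illustrated with a small, self-contained sketch. It is not Mercurial's implementation (the commit notes that content encoding within streams is not yet implemented). It assumes zlib as a stand-in compressor, and the payloads and function names are hypothetical. It compares compressing each frame payload with a fresh context against compressing all payloads with one context that persists across frames, flushing at each frame boundary so every frame stays independently deliverable.

```python
import zlib

# Hypothetical frame payloads: many small, similar responses, as a server
# might emit for a series of related requests.
payloads = [
    b"changeset %d: user=alice branch=default "
    b"files=mercurial/wireproto.py mercurial/util.py\n" % i
    for i in range(200)
]


def compress_independently(frames):
    """Compress each frame payload with a fresh zlib context."""
    return [zlib.compress(p) for p in frames]


def compress_as_stream(frames):
    """Compress frame payloads with one context shared across frames.

    Z_SYNC_FLUSH ends each frame on a byte boundary, so the receiver can
    decode a frame as soon as it arrives, while the compression window
    carries over to later frames.
    """
    ctx = zlib.compressobj()
    return [ctx.compress(p) + ctx.flush(zlib.Z_SYNC_FLUSH) for p in frames]


def decompress_stream(chunks):
    """Decode frames produced by compress_as_stream, one frame at a time."""
    ctx = zlib.decompressobj()
    return [ctx.decompress(c) for c in chunks]


independent = compress_independently(payloads)
streamed = compress_as_stream(payloads)

independent_size = sum(len(c) for c in independent)
streamed_size = sum(len(c) for c in streamed)
roundtrip = decompress_stream(streamed)

print("independent frames:", independent_size, "bytes")
print("shared stream:     ", streamed_size, "bytes")
```

With similar payloads like these, the shared context should produce noticeably less output, because later frames can reference data seen in earlier ones. Running several such contexts side by side, one per stream, is what would let the work be spread across CPU cores.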
For the record, I would have loved to reuse an existing multiplexing protocol to build the new wire protocol on top of. However, I couldn't find a protocol that offers the performance and scaling characteristics that I desired. Namely, it should support multiple compression contexts to facilitate scaling out to multiple CPU cores and compression contexts should be able to live longer than single RPC requests.

HTTP/2 *almost* fits the bill. But the semantics of HTTP message exchange state that streams can only live for a single request-response. We /could/ tunnel on top of HTTP/2 streams and frames with HEADER and DATA frames. But there's no guarantee that HTTP/2 libraries and proxies would allow us to use HTTP/2 streams and frames without the HTTP message exchange semantics defined in RFC 7540 Section 8.

Other RPC protocols like gRPC are built on top of HTTP/2 and thus preserve its semantics of stream per RPC invocation. Even QUIC does this. We could attempt to invent a higher-level stream that spans HTTP/2 streams. But this would be violating HTTP/2 because there is no guarantee that HTTP/2 streams are routed to the same server. The best we can do - which is what this protocol does - is shoehorn all request and response data into a single HTTP message and create streams within. At that point, we've defined a Content-Type in HTTP parlance. It just so happens our media type can also work as a standalone, stream-based protocol, without leaning on HTTP or a similar protocol.

Differential Revision: https://phab.mercurial-scm.org/D2907

File last commit:

r36675:214f61ab default
r37304:9bfcbe4f default
statichttprepo.py
221 lines | 6.6 KiB | text/x-python | PythonLexer
mpm@selenic.com
Separate out old-http support...
r1101 # statichttprepo.py - simple http repository class for mercurial
#
# This provides read-only repo access to repositories exported via static http
#
Thomas Arendsen Hein
Updated copyright notices and add "and others" to "hg version"
r4635 # Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
mpm@selenic.com
Separate out old-http support...
r1101 #
Martin Geisler
updated license to be explicit about GPL version 2
r8225 # This software may be used and distributed according to the terms of the
Matt Mackall
Update license to GPLv2+
r10263 # GNU General Public License version 2 or any later version.
mpm@selenic.com
Separate out old-http support...
r1101
Gregory Szorc
statichttprepo: use absolute_import
r25978 from __future__ import absolute_import
import errno
from .i18n import _
from . import (
changelog,
error,
localrepo,
manifest,
namespaces,
Yuya Nishihara
statichttprepo: do not use platform path separator to build a URL...
r34944 pathutil,
Gregory Szorc
statichttprepo: use absolute_import
r25978 scmutil,
store,
url,
util,
Pierre-Yves David
vfs: use 'vfs' module directly in 'mercurial.statichttprepo'...
r31241 vfs as vfsmod,
Gregory Szorc
statichttprepo: use absolute_import
r25978 )
Bryan O'Sullivan
Move urllib error handling from revlog into statichttprepo, where it belongs.
r1325
timeless
pycompat: switch to util.urlreq/util.urlerr for py3 compat
r28883 urlerr = util.urlerr
urlreq = util.urlreq
Benoit Boissinot
statichttprepo: cleanups, use url.py (proxy, password support)...
r7274 class httprangereader(object):
    def __init__(self, url, opener):
        # we assume opener has HTTPRangeHandler
        self.url = url
        self.pos = 0
        self.opener = opener
Nicolas Dumazet
static-http: mimic more closely localrepo (issue2164: allow clone -r )...
r11066 self.name = url
Gregory Szorc
statichttprepo: implement __enter__ and __exit__ on httprangeheader...
r27705
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
Benoit Boissinot
statichttprepo: cleanups, use url.py (proxy, password support)...
r7274 def seek(self, pos):
self.pos = pos
def read(self, bytes=None):
timeless
pycompat: switch to util.urlreq/util.urlerr for py3 compat
r28883 req = urlreq.request(self.url)
Benoit Boissinot
statichttprepo: cleanups, use url.py (proxy, password support)...
r7274 end = ''
if bytes:
end = self.pos + bytes - 1
Alexander Boyd
statichttprepo: don't send Range header when requesting entire file...
r16882 if self.pos or end:
req.add_header('Range', 'bytes=%d-%s' % (self.pos, end))
Benoit Boissinot
statichttprepo: cleanups, use url.py (proxy, password support)...
r7274
Bryan O'Sullivan
Move urllib error handling from revlog into statichttprepo, where it belongs.
r1325 try:
Benoit Boissinot
statichttprepo: cleanups, use url.py (proxy, password support)...
r7274 f = self.opener.open(req)
data = f.read()
Augie Fackler
statichttprepo: remove wrong getattr ladder...
r25196 code = f.code
timeless
pycompat: switch to util.urlreq/util.urlerr for py3 compat
r28883 except urlerr.httperror as inst:
Dirkjan Ochtman
make static-http work with empty repos (issue965)
r6028 num = inst.code == 404 and errno.ENOENT or None
raise IOError(num, inst)
timeless
pycompat: switch to util.urlreq/util.urlerr for py3 compat
r28883 except urlerr.urlerror as inst:
Thomas Arendsen Hein
Catch urllib errors for old-http in a nicer way.
r1821 raise IOError(None, inst.reason[1])
mpm@selenic.com
Separate out old-http support...
r1101
Patrick Mezard
statichttprepo: handle remote not supporting Range headers...
r8612 if code == 200:
            # HTTPRangeHandler does nothing if remote does not support
            # Range headers and returns the full entity. Let's slice it.
            if bytes:
                data = data[self.pos:self.pos + bytes]
            else:
                data = data[self.pos:]
        elif bytes:
Benoit Boissinot
statichttprepo: cleanups, use url.py (proxy, password support)...
r7274 data = data[:bytes]
Patrick Mezard
statichttprepo: handle remote not supporting Range headers...
r8612 self.pos += len(data)
Benoit Boissinot
statichttprepo: cleanups, use url.py (proxy, password support)...
r7274 return data
Siddharth Agarwal
statichttprepo.httprangeheader: implement readlines...
r20055 def readlines(self):
return self.read().splitlines(True)
Nicolas Dumazet
static-http: mimic more closely localrepo (issue2164: allow clone -r )...
r11066 def __iter__(self):
Siddharth Agarwal
statichttprepo.httprangeheader: implement readlines...
r20055 return iter(self.readlines())
Nicolas Dumazet
static-http: mimic more closely localrepo (issue2164: allow clone -r )...
r11066 def close(self):
pass
Benoit Boissinot
statichttprepo: cleanups, use url.py (proxy, password support)...
r7274
Augie Fackler
statichttprepo: move HTTPRangeHandler from byterange and delete the latter...
r36443 # _RangeError and _HTTPRangeHandler were originally in byterange.py,
# which was itself extracted from urlgrabber. See the last version of
# byterange.py from history if you need more information.
class _RangeError(IOError):
    """Error raised when an unsatisfiable range is requested."""

class _HTTPRangeHandler(urlreq.basehandler):
    """Handler that enables HTTP Range headers.

    This was extremely simple. The Range header is a HTTP feature to
    begin with so all this class does is tell urllib2 that the
    "206 Partial Content" response from the HTTP server is what we
    expected.
    """
    def http_error_206(self, req, fp, code, msg, hdrs):
        # 206 Partial Content Response
        r = urlreq.addinfourl(fp, hdrs, req.get_full_url())
        r.code = code
        r.msg = msg
        return r

    def http_error_416(self, req, fp, code, msg, hdrs):
        # HTTP's Range Not Satisfiable error
        raise _RangeError('Requested Range Not Satisfiable')
Benoit Boissinot
statichttprepo: cleanups, use url.py (proxy, password support)...
r7274 def build_opener(ui, authinfo):
# urllib cannot handle URLs with embedded user or passwd
urlopener = url.opener(ui, authinfo)
Augie Fackler
statichttprepo: move HTTPRangeHandler from byterange and delete the latter...
r36443 urlopener.add_handler(_HTTPRangeHandler())
Benoit Boissinot
statichttprepo: cleanups, use url.py (proxy, password support)...
r7274
Pierre-Yves David
vfs: use 'vfs' module directly in 'mercurial.statichttprepo'...
r31241 class statichttpvfs(vfsmod.abstractvfs):
Dan Villiom Podlaski Christiansen
statichttprepo: make the opener a subclass of abstractopener
r14091 def __init__(self, base):
self.base = base
Mads Kiilerich
statichttprepo: update profile of __call__ in mock vfs object...
r23552 def __call__(self, path, mode='r', *args, **kw):
Adrian Buehlmann
statichttprepo: abort if opener mode is 'r+' or 'rb+'...
r13533 if mode not in ('r', 'rb'):
Nicolas Dumazet
static-http: mimic more closely localrepo (issue2164: allow clone -r )...
r11066 raise IOError('Permission denied')
timeless
pycompat: switch to util.urlreq/util.urlerr for py3 compat
r28883 f = "/".join((self.base, urlreq.quote(path)))
Benoit Boissinot
statichttprepo: cleanups, use url.py (proxy, password support)...
r7274 return httprangereader(f, urlopener)
FUJIWARA Katsunori
vfs: define "join()" in each classes derived from "abstractvfs"...
r17725 def join(self, path):
if path:
Yuya Nishihara
statichttprepo: do not use platform path separator to build a URL...
r34944 return pathutil.join(self.base, path)
FUJIWARA Katsunori
vfs: define "join()" in each classes derived from "abstractvfs"...
r17725 else:
return self.base
FUJIWARA Katsunori
scmutil: rename classes from "opener" to "vfs"...
r17649 return statichttpvfs
mpm@selenic.com
Separate out old-http support...
r1101
Peter Arrenbrecht
peer: introduce real peer classes...
r17192 class statichttppeer(localrepo.localpeer):
def local(self):
return None
Sune Foldager
peer: introduce canpush and improve error message
r17193 def canpush(self):
return False
Peter Arrenbrecht
peer: introduce real peer classes...
r17192
mpm@selenic.com
Separate out old-http support...
r1101 class statichttprepository(localrepo.localrepository):
FUJIWARA Katsunori
localrepo: make supported features manageable in each repositories individually...
r19778 supported = localrepo.localrepository._basesupported
mpm@selenic.com
Separate out old-http support...
r1101 def __init__(self, ui, path):
Vadim Gelfer
hooks: add url to changegroup, incoming, prechangegroup, pretxnchangegroup hooks...
r2673 self._url = path
mpm@selenic.com
Separate out old-http support...
r1101 self.ui = ui
Benoit Boissinot
switch to the .hg/store layout, fix the tests
r3853
Nicolas Dumazet
static-http: mimic more closely localrepo (issue2164: allow clone -r )...
r11066 self.root = path
Brodie Rao
url: move URL parsing functions into util to improve startup time...
r14076 u = util.url(path.rstrip('/') + "/.hg")
Brodie Rao
httprepo/sshrepo: use url.url...
r13819 self.path, authinfo = u.authinfo()
Benoit Boissinot
statichttprepo: cleanups, use url.py (proxy, password support)...
r7274
Pierre-Yves David
statichttp: use 'repo.vfs' as the main attribute...
r31147 vfsclass = build_opener(ui, authinfo)
self.vfs = vfsclass(self.path)
Boris Feld
cachevfs: add a vfs dedicated to cache...
r33533 self.cachevfs = vfsclass(self.vfs.join('cache'))
Pierre-Yves David
phases: mechanism to allow extension to alter initial computation of phase...
r15922 self._phasedefaults = []
Dirkjan Ochtman
make static-http work with empty repos (issue965)
r6028
Ryan McElroy
namespaces: remove weakref; always pass in repo...
r23561 self.names = namespaces.namespaces()
Gregory Szorc
localrepo: move filtername to __init__...
r32730 self.filtername = None
Sean Farley
namespaces: add bookmarks to the names data structure...
r23558
Benoit Boissinot
add "requires" file to the repo, specifying the requirements
r3851 try:
Angel Ezquerra
localrepo: remove all external users of localrepo.opener...
r23877 requirements = scmutil.readrequires(self.vfs, self.supported)
Gregory Szorc
global: mass rewrite to use modern exception syntax...
r25660 except IOError as inst:
Thomas Arendsen Hein
Fix Debian bug #494889 (fetching from static-http://... broken)...
r7178 if inst.errno != errno.ENOENT:
raise
Adrian Buehlmann
introduce new function scmutil.readrequires...
r14482 requirements = set()
Thomas Arendsen Hein
Fix Debian bug #494889 (fetching from static-http://... broken)...
r7178 # check if it is a non-empty old-style repository
try:
Angel Ezquerra
localrepo: remove all external users of localrepo.opener...
r23877 fp = self.vfs("00changelog.i")
Dan Villiom Podlaski Christiansen
explicitly close files...
r13400 fp.read(1)
fp.close()
Gregory Szorc
global: mass rewrite to use modern exception syntax...
r25660 except IOError as inst:
Thomas Arendsen Hein
Fix Debian bug #494889 (fetching from static-http://... broken)...
r7178 if inst.errno != errno.ENOENT:
raise
# we do not care about empty old-style repositories here
Dirkjan Ochtman
make static-http work with empty repos (issue965)
r6028 msg = _("'%s' does not appear to be an hg repository") % path
Matt Mackall
error: move repo errors...
r7637 raise error.RepoError(msg)
Benoit Boissinot
add "requires" file to the repo, specifying the requirements
r3851
# setup store
Pierre-Yves David
statichttp: use 'repo.vfs' as the main attribute...
r31147 self.store = store.store(requirements, self.path, vfsclass)
Matt Mackall
statichttp: use store class...
r6897 self.spath = self.store.path
Angel Ezquerra
localrepo: remove all external users of localrepo.sopener...
r23878 self.svfs = self.store.opener
Matt Mackall
statichttp: use store class...
r6897 self.sjoin = self.store.join
Idan Kamara
scmutil: update cached copy when filecached attribute is assigned (issue3263)...
r16115 self._filecache = {}
Peter Arrenbrecht
peer: introduce real peer classes...
r17192 self.requirements = requirements
Benoit Boissinot
add "requires" file to the repo, specifying the requirements
r3851
Durham Goode
manifest: make manifestlog a storecache...
r30219 self.manifestlog = manifest.manifestlog(self.svfs, self)
Angel Ezquerra
localrepo: remove all external users of localrepo.sopener...
r23878 self.changelog = changelog.changelog(self.svfs)
Greg Ward
localrepo: rename in-memory tag cache instance attributes (issue548)....
r9146 self._tags = None
mpm@selenic.com
Separate out old-http support...
r1101 self.nodetagscache = None
Pierre-Yves David
branchmap: enable caching for filtered version too...
r18189 self._branchcaches = {}
Durham Goode
revbranchcache: move out of branchmap onto localrepo...
r24373 self._revbranchcache = None
Benoit Boissinot
cleanup of revlog.group when repository is local...
r1598 self.encodepats = None
self.decodepats = None
Durham Goode
revbranchcache: move cache writing to the transaction finalizer...
r24377 self._transref = None
Peter Arrenbrecht
peer: introduce real peer classes...
r17192
def _restrictcapabilities(self, caps):
Pierre-Yves David
statichttp: respect localrepo _restrictcapabilities...
r20962 caps = super(statichttprepository, self)._restrictcapabilities(caps)
Peter Arrenbrecht
peer: introduce real peer classes...
r17192 return caps.difference(["pushkey"])
mpm@selenic.com
Separate out old-http support...
r1101
Vadim Gelfer
hooks: add url to changegroup, incoming, prechangegroup, pretxnchangegroup hooks...
r2673 def url(self):
Matt Mackall
Autodetect static-http
r7211 return self._url
Vadim Gelfer
hooks: add url to changegroup, incoming, prechangegroup, pretxnchangegroup hooks...
r2673
mpm@selenic.com
Separate out old-http support...
r1101 def local(self):
return False
Vadim Gelfer
clean up hg.py: move repo constructor code into each repo module
r2740
Peter Arrenbrecht
peer: introduce real peer classes...
r17192 def peer(self):
return statichttppeer(self)
Gregory Szorc
statichttprepo: implement wlock() (issue5613)...
r33605 def wlock(self, wait=True):
Yuya Nishihara
py3: back out c77c925987d7 to store bytes filename in IOError...
r36675 raise error.LockUnavailable(0, _('lock not available'), 'lock',
Gregory Szorc
statichttprepo: implement wlock() (issue5613)...
r33605 _('cannot lock static-http repository'))
Martin Geisler
do not pretend to lock static-http repositories (issue994)
r7005 def lock(self, wait=True):
Pierre-Yves David
error: get Abort from 'error' instead of 'util'...
r26587 raise error.Abort(_('cannot lock static-http repository'))
Martin Geisler
do not pretend to lock static-http repositories (issue994)
r7005
Pierre-Yves David
statichttprepo: do not try to write caches...
r29738 def _writecaches(self):
pass # statichttprepository are read only
Vadim Gelfer
clean up hg.py: move repo constructor code into each repo module
r2740 def instance(ui, path, create):
if create:
Pierre-Yves David
error: get Abort from 'error' instead of 'util'...
r26587 raise error.Abort(_('cannot create new static-http repository'))
Thomas Arendsen Hein
Removed deprecated hg:// and old-http:// protocols (issue406)
r4853 return statichttprepository(ui, path[7:])