Pierre-Yves David
# bundle2.py - generic container format to transmit arbitrary data.
#
# Copyright 2013 Facebook, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
"""Handling of the new bundle2 format

The goal of bundle2 is to act as an atomic packet to transmit a set of
payloads in an application agnostic way. It consists of a sequence of "parts"
that will be handed to and processed by the application layer.


General format architecture
===========================

The format is structured as follows

 - magic string
 - stream level parameters
 - payload parts (any number)
 - end of stream marker.
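
As an illustration of the layout listed above, here is a minimal sketch of the
smallest possible stream: the magic string, an empty parameter block and the
end of stream marker, with no parts in between. The names `MAGIC` and
`empty_bundle` are hypothetical and are not part of this module.

```python
import struct

MAGIC = b"HG20"  # same value as `_magicstring` in this module


def empty_bundle():
    # Stream level parameters: a 16-bit size of 0 means "no parameters".
    param_size = struct.pack(">H", 0)
    # A part header size of 0 is read as the end of stream marker.
    end_marker = struct.pack(">H", 0)
    return MAGIC + param_size + end_marker
```

Under these assumptions the result is 8 bytes long: 4 for the magic string and
2 for each of the two zero sizes.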

The binary format
=================

All numbers are unsigned and big endian.

stream level parameters
------------------------

Binary format is as follows

:params size: (16 bits integer)

  The total number of Bytes used by the parameters

:params value: arbitrary number of Bytes

  A blob of `params size` containing the serialized version of all stream level
  parameters.

  The blob contains a space separated list of parameters. Parameters with a
  value are stored in the form `<name>=<value>`. Both name and value are
  urlquoted.

  Empty names are forbidden.

  A name MUST start with a letter. If this first letter is lower case, the
  parameter is advisory and can be safely ignored. However, when the first
  letter is capital, the parameter is mandatory and the bundling process MUST
  stop if it is not able to process it.

  Stream parameters use a simple textual format for two main reasons:

  - Stream level parameters should remain simple and we want to discourage any
    crazy usage.
  - Textual data allows easy human inspection of the bundle2 header in case of
    trouble.

  Any application level options MUST go into a bundle2 part instead.
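
The stream level parameter block described in this section can be sketched as
follows. This is an illustration only: it uses Python 3's `urllib.parse`, while
the module itself relies on `urllib.quote`, and the helper names
(`encode_stream_params`, `decode_stream_params`, `is_mandatory`) are
hypothetical.

```python
import string
import struct
from urllib.parse import quote, unquote


def encode_stream_params(params):
    # `params` is a list of (name, value) pairs; value may be None.
    blocks = []
    for name, value in params:
        if not name or name[0] not in string.ascii_letters:
            raise ValueError("name must start with a letter: %r" % name)
        item = quote(name)
        if value is not None:
            item = "%s=%s" % (item, quote(value))
        blocks.append(item)
    blob = " ".join(blocks).encode("ascii")
    # 16-bit big endian size, followed by the textual blob.
    return struct.pack(">H", len(blob)) + blob


def decode_stream_params(data):
    (size,) = struct.unpack(">H", data[:2])
    blob = data[2:2 + size].decode("ascii")
    params = {}
    for item in (blob.split(" ") if blob else []):
        pieces = [unquote(p) for p in item.split("=", 1)]
        params[pieces[0]] = pieces[1] if len(pieces) > 1 else None
    return params


def is_mandatory(name):
    # Capital first letter: mandatory. Lower case first letter: advisory.
    return name[0].isupper()
```

Because spaces and `=` are percent-encoded by the quoting step, splitting the
blob on spaces and then on the first `=` recovers each name and value.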

Payload part
------------------------

Binary format is as follows

:header size: (16 bits integer)

  The total number of Bytes used by the part headers. When the header is empty
  (size = 0) this is interpreted as the end of stream marker.

:header:

    The header defines how to interpret the part. It contains two pieces of
    data: the part type, and the part parameters.

    The part type is used to route to an application level handler that can
    interpret the payload.

    Part parameters are passed to the application level handler. They are
    meant to convey information that will help the application level object to
    interpret the part payload.

    The binary format of the header is as follows

    :typesize: (one byte)

    :typename: alphanumerical part name

    :parameters:

        A part's parameters may have arbitrary content. The binary structure
        is::

            <mandatory-count><advisory-count><param-sizes><param-data>

        :mandatory-count: 1 byte, number of mandatory parameters

        :advisory-count:  1 byte, number of advisory parameters

        :param-sizes:

            N couples of bytes, where N is the total number of parameters. Each
            couple contains (<size-of-key>, <size-of-value>) for one parameter.

        :param-data:

            A blob of bytes from which each parameter key and value can be
            retrieved using the list of size couples stored in the previous
            field.

            Mandatory parameters come first, then the advisory ones.
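
The part header layout described above can be sketched as below. The helper
name `encode_part_header` is hypothetical, and the sketch assumes that the part
type, keys and values are already byte strings of at most 255 bytes each.

```python
import struct


def encode_part_header(parttype, mandatory=(), advisory=()):
    # <typesize><typename>
    header = [struct.pack(">B", len(parttype)), parttype]
    # <mandatory-count><advisory-count>
    header.append(struct.pack(">BB", len(mandatory), len(advisory)))
    # Mandatory parameters come first, then the advisory ones.
    params = list(mandatory) + list(advisory)
    # <param-sizes>: one (size-of-key, size-of-value) couple per parameter.
    sizes = []
    for key, value in params:
        sizes.extend((len(key), len(value)))
    header.append(struct.pack(">" + "BB" * len(params), *sizes))
    # <param-data>: keys and values back to back, in the same order.
    for key, value in params:
        header.append(key)
        header.append(value)
    block = b"".join(header)
    # The header itself is prefixed by its 16-bit size.
    return struct.pack(">H", len(block)) + block
```

For example, a part of type `test` with one mandatory parameter (`k`, `v1`) and
one advisory parameter (`adv`, empty value) gives a 17 byte header block:
1 + 4 bytes for the type, 2 for the counts, 4 for the sizes and 6 for the data.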

:payload:

    The payload is a series of `<chunksize><chunkdata>`.

    `chunksize` is a 32 bits integer, `chunkdata` are plain bytes (as many as
    `chunksize` says). The payload part is concluded by a zero size chunk.

    The current implementation always produces either zero or one chunk.
    This is an implementation limitation that will ultimately be lifted.
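
The chunk framing described above can be sketched as follows. Unlike the
current implementation, this illustration splits the data into several chunks,
which the format allows. The names `encode_payload` and `decode_payload` and
the default `chunksize` are assumptions made for the example.

```python
import struct


def encode_payload(data, chunksize=4096):
    # Emit <chunksize><chunkdata> pairs, then a zero size chunk.
    for start in range(0, len(data), chunksize):
        chunk = data[start:start + chunksize]
        yield struct.pack(">I", len(chunk))
        yield chunk
    yield struct.pack(">I", 0)


def decode_payload(data):
    # Read chunks until the zero size chunk that concludes the payload.
    chunks = []
    offset = 0
    while True:
        (size,) = struct.unpack_from(">I", data, offset)
        offset += 4
        if size == 0:
            break
        chunks.append(data[offset:offset + size])
        offset += size
    return b"".join(chunks)
```

An empty payload is encoded as the zero size chunk alone, which matches the
"zero or one chunk" behaviour of the current implementation.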
"""

import util
import struct
import urllib
import string

import changegroup
from i18n import _

_pack = struct.pack
_unpack = struct.unpack

_magicstring = 'HG20'

_fstreamparamsize = '>H'
_fpartheadersize = '>H'
_fparttypesize = '>B'
_fpayloadsize = '>I'
_fpartparamcount = '>BB'

def _makefpartparamsizes(nbparams):
    """return a struct format to read part parameter sizes

    The number of parameters is variable so we need to build that format
    dynamically.
    """
    return '>'+('BB'*nbparams)

class bundle20(object):
    """represent an outgoing bundle2 container

    Use the `addparam` method to add stream level parameters and `addpart` to
    populate it. Then call `getchunks` to retrieve all the binary chunks of
    data that compose the bundle2 container."""

    def __init__(self, ui):
        self.ui = ui
        self._params = []
        self._parts = []

    def addparam(self, name, value=None):
        """add a stream level parameter"""
        if not name:
            raise ValueError('empty parameter name')
        if name[0] not in string.letters:
            raise ValueError('non letter first character: %r' % name)
        self._params.append((name, value))

    def addpart(self, part):
        """add a new part to the bundle2 container

        Parts contain the actual application level payload."""
        self._parts.append(part)

    def getchunks(self):
        self.ui.debug('start emission of %s stream\n' % _magicstring)
        yield _magicstring
        param = self._paramchunk()
        self.ui.debug('bundle parameter: %s\n' % param)
        yield _pack(_fstreamparamsize, len(param))
        if param:
            yield param

        self.ui.debug('start of parts\n')
        for part in self._parts:
            self.ui.debug('bundle part: "%s"\n' % part.type)
            for chunk in part.getchunks():
                yield chunk
        self.ui.debug('end of bundle\n')
        yield '\0\0'

    def _paramchunk(self):
        """return an encoded version of all stream parameters"""
        blocks = []
        for par, value in self._params:
            par = urllib.quote(par)
            if value is not None:
                value = urllib.quote(value)
                par = '%s=%s' % (par, value)
            blocks.append(par)
        return ' '.join(blocks)

class unbundle20(object):
    """interpret a bundle2 stream

    (this will eventually yield parts)"""

    def __init__(self, ui, fp):
        self.ui = ui
        self._fp = fp
        header = self._readexact(4)
        magic, version = header[0:2], header[2:4]
        if magic != 'HG':
            raise util.Abort(_('not a Mercurial bundle'))
        if version != '20':
            raise util.Abort(_('unknown bundle version %s') % version)
        self.ui.debug('start processing of %s stream\n' % header)

    def _unpack(self, format):
        """unpack this struct format from the stream"""
        data = self._readexact(struct.calcsize(format))
        return _unpack(format, data)

    def _readexact(self, size):
        """read exactly <size> bytes from the stream"""
        return changegroup.readexactly(self._fp, size)

    @util.propertycache
    def params(self):
        """dictionary of stream level parameters"""
        self.ui.debug('reading bundle2 stream parameters\n')
        params = {}
        paramssize = self._unpack(_fstreamparamsize)[0]
        if paramssize:
            for p in self._readexact(paramssize).split(' '):
                p = p.split('=', 1)
                p = [urllib.unquote(i) for i in p]
                if len(p) < 2:
                    p.append(None)
                self._processparam(*p)
                params[p[0]] = p[1]
        return params

    def _processparam(self, name, value):
        """process a parameter, applying its effect if needed

        Parameters starting with a lower case letter are advisory and will be
        ignored when unknown. Those starting with an upper case letter are
        mandatory and this function will raise a KeyError when they are
        unknown.

        Note: no options are currently supported. Any input will be either
        ignored or rejected.
        """
        if not name:
            raise ValueError('empty parameter name')
        if name[0] not in string.letters:
            raise ValueError('non letter first character: %r' % name)
        # Some logic will be added here later to try to process the option
        # from a dict of known parameters.
        if name[0].islower():
            self.ui.debug("ignoring unknown parameter %r\n" % name)
        else:
            raise KeyError(name)

    def __iter__(self):
        """yield all parts contained in the stream"""
        # make sure params have been loaded
        self.params
        self.ui.debug('start extraction of bundle2 parts\n')
        part = self._readpart()
        while part is not None:
            yield part
            part = self._readpart()
        self.ui.debug('end of bundle2 stream\n')

    def _readpart(self):
        """return None when an end of stream marker is reached"""

        headersize = self._unpack(_fpartheadersize)[0]
        self.ui.debug('part header size: %i\n' % headersize)
        if not headersize:
            return None
        headerblock = self._readexact(headersize)
        # some utility to help reading from the header block
        self._offset = 0 # layer violation to have something easy to understand
        def fromheader(size):
            """return the next <size> bytes from the header"""
            offset = self._offset
            data = headerblock[offset:(offset + size)]
            self._offset = offset + size
            return data
        def unpackheader(format):
            """read given format from header

            This automatically computes the size of the format to read."""
            data = fromheader(struct.calcsize(format))
            return _unpack(format, data)

        typesize = unpackheader(_fparttypesize)[0]
        parttype = fromheader(typesize)
        self.ui.debug('part type: "%s"\n' % parttype)
        ## reading parameters
        # param count
        mancount, advcount = unpackheader(_fpartparamcount)
        self.ui.debug('part parameters: %i\n' % (mancount + advcount))
        # param size
        paramsizes = unpackheader(_makefpartparamsizes(mancount + advcount))
        # make it a list of couples again
        paramsizes = zip(paramsizes[::2], paramsizes[1::2])
        # split mandatory from advisory
        mansizes = paramsizes[:mancount]
        advsizes = paramsizes[mancount:]
        # retrieve param values
        manparams = []
        for key, value in mansizes:
            manparams.append((fromheader(key), fromheader(value)))
        advparams = []
        for key, value in advsizes:
            advparams.append((fromheader(key), fromheader(value)))
        del self._offset # clean up layer, nobody saw anything.
        ## part payload
        payload = []
        payloadsize = self._unpack(_fpayloadsize)[0]
        self.ui.debug('payload chunk size: %i\n' % payloadsize)
        while payloadsize:
            payload.append(self._readexact(payloadsize))
            payloadsize = self._unpack(_fpayloadsize)[0]
            self.ui.debug('payload chunk size: %i\n' % payloadsize)
        payload = ''.join(payload)
        current = part(parttype, manparams, advparams, data=payload)
        return current

class part(object):
    """A bundle2 part contains application level payload

    The part `type` is used to route the part to the application level
    handler.
    """

    def __init__(self, parttype, mandatoryparams=(), advisoryparams=(),
                 data=''):
        self.type = parttype
        self.data = data
        self.mandatoryparams = mandatoryparams
        self.advisoryparams = advisoryparams

    def getchunks(self):
        #### header
        ## parttype
        header = [_pack(_fparttypesize, len(self.type)),
                  self.type,
                 ]
        ## parameters
        # count
        manpar = self.mandatoryparams
        advpar = self.advisoryparams
        header.append(_pack(_fpartparamcount, len(manpar), len(advpar)))
        # size
        parsizes = []
        for key, value in manpar:
            parsizes.append(len(key))
            parsizes.append(len(value))
        for key, value in advpar:
            parsizes.append(len(key))
            parsizes.append(len(value))
        # two sizes (key, value) per parameter, hence the integer division
        paramsizes = _pack(_makefpartparamsizes(len(parsizes) // 2), *parsizes)
        header.append(paramsizes)
        # key, value
        for key, value in manpar:
            header.append(key)
            header.append(value)
        for key, value in advpar:
            header.append(key)
            header.append(value)
        ## finalize header
        headerchunk = ''.join(header)
        yield _pack(_fpartheadersize, len(headerchunk))
        yield headerchunk
        ## payload
        # we only support fixed size data now.
        # This will be improved in the future.
        if len(self.data):
            yield _pack(_fpayloadsize, len(self.data))
            yield self.data
        # end of payload
        yield _pack(_fpayloadsize, 0)