@@ -0,0 +1,27 b'' | |||
|
1 | Copyright (c) 2016, Gregory Szorc | |
|
2 | All rights reserved. | |
|
3 | ||
|
4 | Redistribution and use in source and binary forms, with or without modification, | |
|
5 | are permitted provided that the following conditions are met: | |
|
6 | ||
|
7 | 1. Redistributions of source code must retain the above copyright notice, this | |
|
8 | list of conditions and the following disclaimer. | |
|
9 | ||
|
10 | 2. Redistributions in binary form must reproduce the above copyright notice, | |
|
11 | this list of conditions and the following disclaimer in the documentation | |
|
12 | and/or other materials provided with the distribution. | |
|
13 | ||
|
14 | 3. Neither the name of the copyright holder nor the names of its contributors | |
|
15 | may be used to endorse or promote products derived from this software without | |
|
16 | specific prior written permission. | |
|
17 | ||
|
18 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND | |
|
19 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED | |
|
20 | WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE | |
|
21 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR | |
|
22 | ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES | |
|
23 | (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; | |
|
24 | LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON | |
|
25 | ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | |
|
26 | (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS | |
|
27 | SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. |
@@ -0,0 +1,63 b'' | |||
|
1 | Version History | |
|
2 | =============== | |
|
3 | ||
|
4 | 0.5.0 (released 2016-11-10) | |
|
5 | --------------------------- | |
|
6 | ||
|
7 | * Vendored version of zstd updated to 1.1.1. | |
|
8 | * Continuous integration for Python 3.6 and 3.7 | |
|
9 | * Continuous integration for Conda | |
|
10 | * Added compression and decompression APIs providing similar interfaces | |
|
11 | to the standard library ``zlib`` and ``bz2`` modules. This allows | |
|
12 | coding to a common interface. | |
|
13 | * ``zstd.__version__`` is now defined. | |
|
14 | * ``read_from()`` on various APIs now accepts objects implementing the buffer | |
|
15 | protocol. | |
|
16 | * ``read_from()`` has gained a ``skip_bytes`` argument. This allows callers | |
|
17 | to pass in an existing buffer with a header without having to create a | |
|
18 | slice or a new object. | |
|
19 | * Implemented ``ZstdCompressionDict.as_bytes()``. | |
|
20 | * Python's memory allocator is now used instead of ``malloc()``. | |
|
21 | * Low-level zstd data structures are reused in more instances, cutting down | |
|
22 | on overhead for certain operations. | |
|
23 | * ``distutils`` boilerplate for obtaining an ``Extension`` instance | |
|
24 | has now been refactored into a standalone ``setup_zstd.py`` file. This | |
|
25 | allows other projects with ``setup.py`` files to reuse the | |
|
26 | ``distutils`` code for this project without copying code. | |
|
27 | * The monolithic ``zstd.c`` file has been split into a header file defining | |
|
28 | types and separate ``.c`` source files for the implementation. | |
|
29 | ||
|
30 | History of the Project | |
|
31 | ====================== | |
|
32 | ||
|
33 | 2016-08-31 - Zstandard 1.0.0 is released and Gregory starts hacking on a | |
|
34 | Python extension for use by the Mercurial project. A very hacky prototype | |
|
35 | is sent to the mercurial-devel list for RFC. | |
|
36 | ||
|
37 | 2016-09-03 - Most functionality from Zstandard C API implemented. Source | |
|
38 | code published on https://github.com/indygreg/python-zstandard. Travis-CI | |
|
39 | automation configured. 0.0.1 release on PyPI. | |
|
40 | ||
|
41 | 2016-09-05 - After the API was rounded out a bit and support for Python | |
|
42 | 2.6 and 2.7 was added, version 0.1 was released to PyPI. | |
|
43 | ||
|
44 | 2016-09-05 - After the compressor and decompressor APIs were changed, 0.2 | |
|
45 | was released to PyPI. | |
|
46 | ||
|
47 | 2016-09-10 - 0.3 is released with a bunch of new features. ZstdCompressor | |
|
48 | now accepts arguments controlling frame parameters. The source size can now | |
|
49 | be declared when performing streaming compression. ZstdDecompressor.decompress() | |
|
50 | is implemented. Compression dictionaries are now cached when using the simple | |
|
51 | compression and decompression APIs. Memory size APIs added. | |
|
52 | ZstdCompressor.read_from() and ZstdDecompressor.read_from() have been | |
|
53 | implemented. This rounds out the major compression/decompression APIs planned | |
|
54 | by the author. | |
|
55 | ||
|
56 | 2016-10-02 - 0.3.3 is released with a bug fix for read_from not fully | |
|
57 | decoding a zstd frame (issue #2). | |
|
58 | ||
|
59 | 2016-10-02 - 0.4.0 is released with zstd 1.1.0, support for custom read and | |
|
60 | write buffer sizes, and a few bug fixes involving failure to read/write | |
|
61 | all data when buffer sizes were too small to hold remaining data. | |
|
62 | ||
|
63 | 2016-11-10 - 0.5.0 is released with zstd 1.1.1 and other enhancements. |
@@ -0,0 +1,776 b'' | |||
|
1 | ================ | |
|
2 | python-zstandard | |
|
3 | ================ | |
|
4 | ||
|
5 | This project provides a Python C extension for interfacing with the | |
|
6 | `Zstandard <http://www.zstd.net>`_ compression library. | |
|
7 | ||
|
8 | The primary goal of the extension is to provide a Pythonic interface to | |
|
9 | the underlying C API. This means exposing most of the features and flexibility | |
|
10 | of the C API while not sacrificing usability or safety that Python provides. | |
|
11 | ||
|
12 | | |ci-status| |win-ci-status| | |
|
13 | ||
|
14 | State of Project | |
|
15 | ================ | |
|
16 | ||
|
17 | The project is officially in beta state. The author is reasonably satisfied | |
|
18 | with the current API and that functionality works as advertised. There | |
|
19 | may be some backwards incompatible changes before 1.0, though the author | |

20 | does not intend to make any major changes to the Python API. | |
|
21 | ||
|
22 | There is continuous integration for Python versions 2.6, 2.7, and 3.3+ | |
|
23 | on Linux x86_64 and Windows x86 and x86_64. The author is reasonably | |
|
24 | confident the extension is stable and works as advertised on these | |
|
25 | platforms. | |
|
26 | ||
|
27 | Expected Changes | |
|
28 | ---------------- | |
|
29 | ||
|
30 | The author is reasonably confident in the current state of what's | |
|
31 | implemented on the ``ZstdCompressor`` and ``ZstdDecompressor`` types. | |
|
32 | Those APIs likely won't change significantly. Some low-level behavior | |
|
33 | (such as naming and types expected by arguments) may change. | |
|
34 | ||
|
35 | There will likely be arguments added to control the input and output | |
|
36 | buffer sizes (currently, certain operations read and write in chunk | |
|
37 | sizes using zstd's preferred defaults). | |
|
38 | ||
|
39 | There should be an API that accepts an object that conforms to the buffer | |
|
40 | interface and returns an iterator over compressed or decompressed output. | |
|
41 | ||
|
42 | The author is on the fence as to whether to support the extremely | |
|
43 | low level compression and decompression APIs. It could be useful to | |
|
44 | support compression without the framing headers. But the author doesn't | |
|
45 | believe it a high priority at this time. | |
|
46 | ||
|
47 | The CFFI bindings are half-baked and need to be finished. | |
|
48 | ||
|
49 | Requirements | |
|
50 | ============ | |
|
51 | ||
|
52 | This extension is designed to run with Python 2.6, 2.7, 3.3, 3.4, and 3.5 | |
|
53 | on common platforms (Linux, Windows, and OS X). Only x86_64 is currently | |
|
54 | well-tested as an architecture. | |
|
55 | ||
|
56 | Installing | |
|
57 | ========== | |
|
58 | ||
|
59 | This package is uploaded to PyPI at https://pypi.python.org/pypi/zstandard. | |
|
60 | So, to install this package:: | |
|
61 | ||
|
62 | $ pip install zstandard | |
|
63 | ||
|
64 | Binary wheels are made available for some platforms. If you need to | |
|
65 | install from a source distribution, all you should need is a working C | |
|
66 | compiler and the Python development headers/libraries. On many Linux | |
|
67 | distributions, you can install a ``python-dev`` or ``python-devel`` | |
|
68 | package to provide these dependencies. | |
|
69 | ||
|
70 | Packages are also uploaded to Anaconda Cloud at | |
|
71 | https://anaconda.org/indygreg/zstandard. See that URL for how to install | |
|
72 | this package with ``conda``. | |
|
73 | ||
|
74 | Performance | |
|
75 | =========== | |
|
76 | ||
|
77 | Very crude and non-scientific benchmarking (most benchmarks fall in this | |
|
78 | category because proper benchmarking is hard) show that the Python bindings | |
|
79 | perform within 10% of the native C implementation. | |
|
80 | ||
|
81 | The following table compares the performance of compressing and decompressing | |
|
82 | a 1.1 GB tar file composed of the files in a Firefox source checkout. Values | |
|
83 | obtained with the ``zstd`` program are on the left. The remaining columns detail | |
|
84 | performance of various compression APIs in the Python bindings. | |
|
85 | ||
|
86 | +-------+-----------------+-----------------+-----------------+---------------+ | |
|
87 | | Level | Native | Simple | Stream In | Stream Out | | |
|
88 | | | Comp / Decomp | Comp / Decomp | Comp / Decomp | Comp | | |
|
89 | +=======+=================+=================+=================+===============+ | |
|
90 | | 1 | 490 / 1338 MB/s | 458 / 1266 MB/s | 407 / 1156 MB/s | 405 MB/s | | |
|
91 | +-------+-----------------+-----------------+-----------------+---------------+ | |
|
92 | | 2 | 412 / 1288 MB/s | 381 / 1203 MB/s | 345 / 1128 MB/s | 349 MB/s | | |
|
93 | +-------+-----------------+-----------------+-----------------+---------------+ | |
|
94 | | 3 | 342 / 1312 MB/s | 319 / 1182 MB/s | 285 / 1165 MB/s | 287 MB/s | | |
|
95 | +-------+-----------------+-----------------+-----------------+---------------+ | |
|
96 | | 11 | 64 / 1506 MB/s | 66 / 1436 MB/s | 56 / 1342 MB/s | 57 MB/s | | |
|
97 | +-------+-----------------+-----------------+-----------------+---------------+ | |
|
98 | ||
|
99 | Again, these are very unscientific. But it shows that Python is capable of | |
|
100 | compressing at several hundred MB/s and decompressing at over 1 GB/s. | |
|
101 | ||
|
102 | Comparison to Other Python Bindings | |
|
103 | =================================== | |
|
104 | ||
|
105 | https://pypi.python.org/pypi/zstd is an alternative Python binding to | |
|
106 | Zstandard. At the time this was written, the latest release of that | |
|
107 | package (1.0.0.2) had the following significant differences from this package: | |
|
108 | ||
|
109 | * It only exposes the simple API for compression and decompression operations. | |
|
110 | This extension exposes the streaming API, dictionary training, and more. | |
|
111 | * It adds a custom framing header to compressed data and there is no way to | |
|
112 | disable it. This means that data produced with that module cannot be used by | |
|
113 | other Zstandard implementations. | |
|
114 | ||
|
115 | Bundling of Zstandard Source Code | |
|
116 | ================================= | |
|
117 | ||
|
118 | The source repository for this project contains a vendored copy of the | |
|
119 | Zstandard source code. This is done for a few reasons. | |
|
120 | ||
|
121 | First, Zstandard is relatively new and not yet widely available as a system | |
|
122 | package. Providing a copy of the source code enables the Python C extension | |
|
123 | to be compiled without requiring the user to obtain the Zstandard source code | |
|
124 | separately. | |
|
125 | ||
|
126 | Second, Zstandard has both a stable *public* API and an *experimental* API. | |
|
127 | The *experimental* API is actually quite useful (contains functionality for | |
|
128 | training dictionaries for example), so it is something we wish to expose to | |
|
129 | Python. However, the *experimental* API is only available via static linking. | |
|
130 | Furthermore, the *experimental* API can change at any time. So, control over | |
|
131 | the exact version of the Zstandard library linked against is important to | |
|
132 | ensure known behavior. | |
|
133 | ||
|
134 | Instructions for Building and Testing | |
|
135 | ===================================== | |
|
136 | ||
|
137 | Once you have the source code, the extension can be built via setup.py:: | |
|
138 | ||
|
139 | $ python setup.py build_ext | |
|
140 | ||
|
141 | We recommend testing with ``nose``:: | |
|
142 | ||
|
143 | $ nosetests | |
|
144 | ||
|
145 | A Tox configuration is present to test against multiple Python versions:: | |
|
146 | ||
|
147 | $ tox | |
|
148 | ||
|
149 | Tests use the ``hypothesis`` Python package to perform fuzzing. If you | |
|
150 | don't have it, those tests won't run. | |
|
151 | ||
|
152 | There is also an experimental CFFI module. You need the ``cffi`` Python | |
|
153 | package installed to build and test that. | |
|
154 | ||
|
155 | To create a virtualenv with all development dependencies, do something | |
|
156 | like the following:: | |
|
157 | ||
|
158 | # Python 2 | |
|
159 | $ virtualenv venv | |
|
160 | ||
|
161 | # Python 3 | |
|
162 | $ python3 -m venv venv | |
|
163 | ||
|
164 | $ source venv/bin/activate | |
|
165 | $ pip install cffi hypothesis nose tox | |
|
166 | ||
|
167 | API | |
|
168 | === | |
|
169 | ||
|
170 | The compiled C extension provides a ``zstd`` Python module. This module | |
|
171 | exposes the following interfaces. | |
|
172 | ||
|
173 | ZstdCompressor | |
|
174 | -------------- | |
|
175 | ||
|
176 | The ``ZstdCompressor`` class provides an interface for performing | |
|
177 | compression operations. | |
|
178 | ||
|
179 | Each instance is associated with parameters that control compression | |
|
180 | behavior. These come from the following named arguments (all optional): | |
|
181 | ||
|
182 | level | |
|
183 | Integer compression level. Valid values are between 1 and 22. | |
|
184 | dict_data | |
|
185 | Compression dictionary to use. | |
|
186 | ||
|
187 | Note: When using dictionary data and ``compress()`` is called multiple | |
|
188 | times, the ``CompressionParameters`` derived from an integer compression | |
|
189 | ``level`` and the first compressed data's size will be reused for all | |
|
190 | subsequent operations. This may not be desirable if source data size | |
|
191 | varies significantly. | |
|
192 | compression_params | |
|
193 | A ``CompressionParameters`` instance (overrides the ``level`` value). | |
|
194 | write_checksum | |
|
195 | Whether a 4 byte checksum should be written with the compressed data. | |
|
196 | Defaults to False. If True, the decompressor can verify that decompressed | |
|
197 | data matches the original input data. | |
|
198 | write_content_size | |
|
199 | Whether the size of the uncompressed data will be written into the | |
|
200 | header of compressed data. Defaults to False. The data will only be | |
|
201 | written if the compressor knows the size of the input data. This is | |
|
202 | likely not true for streaming compression. | |
|
203 | write_dict_id | |
|
204 | Whether to write the dictionary ID into the compressed data. | |
|
205 | Defaults to True. The dictionary ID is only written if a dictionary | |
|
206 | is being used. | |
|
207 | ||
|
208 | Simple API | |
|
209 | ^^^^^^^^^^ | |
|
210 | ||
|
211 | ``compress(data)`` compresses and returns data as a one-shot operation.:: | |
|
212 | ||
|
213 | cctx = zstd.ZstdCompressor() | |
|
214 | compressed = cctx.compress(b'data to compress') | |
|
215 | ||
|
216 | Streaming Input API | |
|
217 | ^^^^^^^^^^^^^^^^^^^ | |
|
218 | ||
|
219 | ``write_to(fh)`` (which behaves as a context manager) allows you to *stream* | |
|
220 | data into a compressor.:: | |
|
221 | ||
|
222 | cctx = zstd.ZstdCompressor(level=10) | |
|
223 | with cctx.write_to(fh) as compressor: | |
|
224 | compressor.write(b'chunk 0') | |
|
225 | compressor.write(b'chunk 1') | |
|
226 | ... | |
|
227 | ||
|
228 | The argument to ``write_to()`` must have a ``write(data)`` method. As | |
|
229 | compressed data is available, ``write()`` will be called with the compressed | |
|
230 | data as its argument. Many common Python types implement ``write()``, including | |
|
231 | open file handles and ``io.BytesIO``. | |
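The same sink pattern exists in the standard library. As a purely illustrative analogue (using stdlib ``gzip`` rather than zstd, so the snippet runs without this extension), any object with a ``write(data)`` method can collect a compressed stream::

```python
import gzip
import io

# Any object with a write(data) method can serve as the sink;
# io.BytesIO collects the compressed bytes in memory.
sink = io.BytesIO()

# gzip.GzipFile streams compressed data into `sink` as we write,
# mirroring the write_to() pattern described above.
with gzip.GzipFile(fileobj=sink, mode='wb') as compressor:
    compressor.write(b'chunk 0')
    compressor.write(b'chunk 1')

# The sink now holds a complete gzip stream.
assert gzip.decompress(sink.getvalue()) == b'chunk 0chunk 1'
```

The zstd ``write_to()`` API described above follows the same shape, with the compressor object additionally requiring context manager usage.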
|
232 | ||
|
233 | ``write_to()`` returns an object representing a streaming compressor instance. | |
|
234 | It **must** be used as a context manager. That object's ``write(data)`` method | |
|
235 | is used to feed data into the compressor. | |
|
236 | ||
|
237 | If the size of the data being fed to this streaming compressor is known, | |
|
238 | you can declare it before compression begins:: | |
|
239 | ||
|
240 | cctx = zstd.ZstdCompressor() | |
|
241 | with cctx.write_to(fh, size=data_len) as compressor: | |
|
242 | compressor.write(chunk0) | |
|
243 | compressor.write(chunk1) | |
|
244 | ... | |
|
245 | ||
|
246 | Declaring the size of the source data allows compression parameters to | |
|
247 | be tuned. And if ``write_content_size`` is used, it also results in the | |
|
248 | content size being written into the frame header of the output data. | |
|
249 | ||
|
250 | The size of chunks written to the destination can be specified:: | |
|
251 | ||
|
252 | cctx = zstd.ZstdCompressor() | |
|
253 | with cctx.write_to(fh, write_size=32768) as compressor: | |
|
254 | ... | |
|
255 | ||
|
256 | To see how much memory is being used by the streaming compressor:: | |
|
257 | ||
|
258 | cctx = zstd.ZstdCompressor() | |
|
259 | with cctx.write_to(fh) as compressor: | |
|
260 | ... | |
|
261 | byte_size = compressor.memory_size() | |
|
262 | ||
|
263 | Streaming Output API | |
|
264 | ^^^^^^^^^^^^^^^^^^^^ | |
|
265 | ||
|
266 | ``read_from(reader)`` provides a mechanism to stream data out of a compressor | |
|
267 | as an iterator of data chunks.:: | |
|
268 | ||
|
269 | cctx = zstd.ZstdCompressor() | |
|
270 | for chunk in cctx.read_from(fh): | |
|
271 | # Do something with emitted data. | |
|
272 | ||
|
273 | ``read_from()`` accepts an object that has a ``read(size)`` method or conforms | |
|
274 | to the buffer protocol. (``bytes`` and ``memoryview`` are 2 common types that | |
|
275 | provide the buffer protocol.) | |
|
276 | ||
|
277 | Uncompressed data is fetched from the source either by calling ``read(size)`` | |
|
278 | or by fetching a slice of data from the object directly (in the case where | |
|
279 | the buffer protocol is being used). The returned iterator consists of chunks | |
|
280 | of compressed data. | |
|
281 | ||
|
282 | Like ``write_to()``, ``read_from()`` also accepts a ``size`` argument | |
|
283 | declaring the size of the input stream:: | |
|
284 | ||
|
285 | cctx = zstd.ZstdCompressor() | |
|
286 | for chunk in cctx.read_from(fh, size=some_int): | |
|
287 | pass | |
|
288 | ||
|
289 | You can also control the size of each ``read()`` from the source and | |
|
290 | the ideal size of output chunks:: | |
|
291 | ||
|
292 | cctx = zstd.ZstdCompressor() | |
|
293 | for chunk in cctx.read_from(fh, read_size=16384, write_size=8192): | |
|
294 | pass | |
|
295 | ||
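The shape of this iterator API can be sketched with the standard library. The following stand-in is built on ``zlib.compressobj`` (not the zstd implementation): it reads fixed-size chunks from a source and yields compressed chunks as they become available::

```python
import io
import zlib

def compress_chunks(reader, read_size=16384):
    """Yield compressed chunks from a file-like `reader`.

    Illustrative stand-in for the read_from() iterator pattern,
    built on zlib.compressobj rather than zstd.
    """
    cobj = zlib.compressobj()
    while True:
        chunk = reader.read(read_size)
        if not chunk:
            break
        data = cobj.compress(chunk)
        if data:
            yield data
    # Flush any data still buffered inside the compressor.
    tail = cobj.flush()
    if tail:
        yield tail

source = io.BytesIO(b'data ' * 10000)
compressed = b''.join(compress_chunks(source))
assert zlib.decompress(compressed) == b'data ' * 10000
```

Because it is a generator, no chunk is compressed until the consumer asks for it, which matches the lazy behavior of ``read_from()`` described below.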
|
296 | Stream Copying API | |
|
297 | ^^^^^^^^^^^^^^^^^^ | |
|
298 | ||
|
299 | ``copy_stream(ifh, ofh)`` can be used to copy data between 2 streams while | |
|
300 | compressing it.:: | |
|
301 | ||
|
302 | cctx = zstd.ZstdCompressor() | |
|
303 | cctx.copy_stream(ifh, ofh) | |
|
304 | ||
|
305 | For example, say you wish to compress a file:: | |
|
306 | ||
|
307 | cctx = zstd.ZstdCompressor() | |
|
308 | with open(input_path, 'rb') as ifh, open(output_path, 'wb') as ofh: | |
|
309 | cctx.copy_stream(ifh, ofh) | |
|
310 | ||
|
311 | It is also possible to declare the size of the source stream:: | |
|
312 | ||
|
313 | cctx = zstd.ZstdCompressor() | |
|
314 | cctx.copy_stream(ifh, ofh, size=len_of_input) | |
|
315 | ||
|
316 | You can also specify the size of the chunks that are ``read()`` from and | |

317 | written to the streams:: | |
|
318 | ||
|
319 | cctx = zstd.ZstdCompressor() | |
|
320 | cctx.copy_stream(ifh, ofh, read_size=32768, write_size=16384) | |
|
321 | ||
|
322 | The stream copier returns a 2-tuple of bytes read and written:: | |
|
323 | ||
|
324 | cctx = zstd.ZstdCompressor() | |
|
325 | read_count, write_count = cctx.copy_stream(ifh, ofh) | |
|
326 | ||
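For illustration, here is a minimal ``copy_stream()``-style helper built on stdlib ``zlib`` (an analogue, not this extension's implementation), showing the chunked read/compress/write loop and the 2-tuple return::

```python
import io
import zlib

def copy_stream(ifh, ofh, read_size=16384):
    """Copy `ifh` to `ofh`, compressing along the way.

    Sketch of the copy_stream() behavior described above, using
    zlib in place of zstd. Returns (bytes_read, bytes_written).
    """
    cobj = zlib.compressobj()
    read_count = write_count = 0
    while True:
        chunk = ifh.read(read_size)
        if not chunk:
            break
        read_count += len(chunk)
        data = cobj.compress(chunk)
        ofh.write(data)
        write_count += len(data)
    tail = cobj.flush()
    ofh.write(tail)
    write_count += len(tail)
    return read_count, write_count

ifh = io.BytesIO(b'hello world' * 1000)
ofh = io.BytesIO()
read_count, write_count = copy_stream(ifh, ofh)
assert read_count == 11000
assert zlib.decompress(ofh.getvalue()) == b'hello world' * 1000
```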
|
327 | Compressor API | |
|
328 | ^^^^^^^^^^^^^^ | |
|
329 | ||
|
330 | ``compressobj()`` returns an object that exposes ``compress(data)`` and | |
|
331 | ``flush()`` methods. Each returns compressed data or an empty bytes object. | |
|
332 | ||
|
333 | The purpose of ``compressobj()`` is to provide an API-compatible interface | |
|
334 | with ``zlib.compressobj`` and ``bz2.BZ2Compressor``. This allows callers to | |
|
335 | swap in different compressor objects while using the same API. | |
|
336 | ||
|
337 | Once ``flush()`` is called, the compressor will no longer accept new data | |
|
338 | to ``compress()``. ``flush()`` **must** be called to end the compression | |
|
339 | context. If not called, the returned data may be incomplete. | |
|
340 | ||
|
341 | Here is how this API should be used:: | |
|
342 | ||
|
343 | cctx = zstd.ZstdCompressor() | |
|
344 | cobj = cctx.compressobj() | |
|
345 | data = cobj.compress(b'raw input 0') | |
|
346 | data = cobj.compress(b'raw input 1') | |
|
347 | data = cobj.flush() | |
|
348 | ||
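Because the interface matches the stdlib compressor objects, calling code can be parameterized over the compressor factory. A stdlib-only sketch (zstd omitted here so the example stays self-contained)::

```python
import bz2
import zlib

def compress_all(make_compressor, chunks):
    # Works with any object exposing compress(data) and flush(),
    # e.g. zlib.compressobj(), bz2.BZ2Compressor(), or a
    # compressobj() from this extension as described above.
    cobj = make_compressor()
    out = [cobj.compress(chunk) for chunk in chunks]
    out.append(cobj.flush())
    return b''.join(out)

chunks = [b'raw input 0', b'raw input 1']
z = compress_all(zlib.compressobj, chunks)
b = compress_all(bz2.BZ2Compressor, chunks)
assert zlib.decompress(z) == b'raw input 0raw input 1'
assert bz2.decompress(b) == b'raw input 0raw input 1'
```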
|
349 | For best performance results, keep input chunks under 256KB. This avoids | |
|
350 | extra allocations for a large output object. | |
|
351 | ||
|
352 | It is possible to declare the input size of the data that will be fed into | |
|
353 | the compressor:: | |
|
354 | ||
|
355 | cctx = zstd.ZstdCompressor() | |
|
356 | cobj = cctx.compressobj(size=6) | |
|
357 | data = cobj.compress(b'foobar') | |
|
358 | data = cobj.flush() | |
|
359 | ||
|
360 | ZstdDecompressor | |
|
361 | ---------------- | |
|
362 | ||
|
363 | The ``ZstdDecompressor`` class provides an interface for performing | |
|
364 | decompression. | |
|
365 | ||
|
366 | Each instance is associated with parameters that control decompression. These | |
|
367 | come from the following named arguments (all optional): | |
|
368 | ||
|
369 | dict_data | |
|
370 | Compression dictionary to use. | |
|
371 | ||
|
372 | The interface of this class is very similar to ``ZstdCompressor`` (by design). | |
|
373 | ||
|
374 | Simple API | |
|
375 | ^^^^^^^^^^ | |
|
376 | ||
|
377 | ``decompress(data)`` can be used to decompress an entire compressed zstd | |
|
378 | frame in a single operation.:: | |
|
379 | ||
|
380 | dctx = zstd.ZstdDecompressor() | |
|
381 | decompressed = dctx.decompress(data) | |
|
382 | ||
|
383 | By default, ``decompress(data)`` will only work on data written with the content | |
|
384 | size encoded in its header. This can be achieved by creating a | |
|
385 | ``ZstdCompressor`` with ``write_content_size=True``. If compressed data without | |
|
386 | an embedded content size is seen, ``zstd.ZstdError`` will be raised. | |
|
387 | ||
|
388 | If the compressed data doesn't have its content size embedded within it, | |
|
389 | decompression can be attempted by specifying the ``max_output_size`` | |
|
390 | argument.:: | |
|
391 | ||
|
392 | dctx = zstd.ZstdDecompressor() | |
|
393 | uncompressed = dctx.decompress(data, max_output_size=1048576) | |
|
394 | ||
|
395 | Ideally, ``max_output_size`` will be identical to the decompressed output | |
|
396 | size. | |
|
397 | ||
|
398 | If ``max_output_size`` is too small to hold the decompressed data, | |
|
399 | ``zstd.ZstdError`` will be raised. | |
|
400 | ||
|
401 | If ``max_output_size`` is larger than the decompressed data, the allocated | |
|
402 | output buffer will be resized to only use the space required. | |
|
403 | ||
|
404 | Please note that an allocation of the requested ``max_output_size`` will be | |
|
405 | performed every time the method is called. Setting this to a very large value could | |
|
406 | result in a lot of work for the memory allocator and may result in | |
|
407 | ``MemoryError`` being raised if the allocation fails. | |
|
408 | ||
|
409 | If the exact size of decompressed data is unknown, it is **strongly** | |
|
410 | recommended to use a streaming API. | |
|
411 | ||
|
412 | Streaming Input API | |
|
413 | ^^^^^^^^^^^^^^^^^^^ | |
|
414 | ||
|
415 | ``write_to(fh)`` can be used to incrementally send compressed data to a | |
|
416 | decompressor.:: | |
|
417 | ||
|
418 | dctx = zstd.ZstdDecompressor() | |
|
419 | with dctx.write_to(fh) as decompressor: | |
|
420 | decompressor.write(compressed_data) | |
|
421 | ||
|
422 | This behaves similarly to ``zstd.ZstdCompressor``: compressed data is written to | |
|
423 | the decompressor by calling ``write(data)`` and decompressed output is written | |
|
424 | to the output object by calling its ``write(data)`` method. | |
|
425 | ||
|
426 | The size of chunks written to the destination can be specified:: | |
|
427 | ||
|
428 | dctx = zstd.ZstdDecompressor() | |
|
429 | with dctx.write_to(fh, write_size=16384) as decompressor: | |
|
430 | pass | |
|
431 | ||
|
432 | You can see how much memory is being used by the decompressor:: | |
|
433 | ||
|
434 | dctx = zstd.ZstdDecompressor() | |
|
435 | with dctx.write_to(fh) as decompressor: | |
|
436 | byte_size = decompressor.memory_size() | |
|
437 | ||
|
438 | Streaming Output API | |
|
439 | ^^^^^^^^^^^^^^^^^^^^ | |
|
440 | ||
|
441 | ``read_from(fh)`` provides a mechanism to stream decompressed data out of a | |
|
442 | compressed source as an iterator of data chunks.:: | |
|
443 | ||
|
444 | dctx = zstd.ZstdDecompressor() | |
|
445 | for chunk in dctx.read_from(fh): | |
|
446 | # Do something with original data. | |
|
447 | ||
|
448 | ``read_from()`` accepts either a) an object with a ``read(size)`` method that | |

449 | will return compressed bytes or b) an object conforming to the buffer protocol that | |
|
450 | can expose its data as a contiguous range of bytes. The ``bytes`` and | |
|
451 | ``memoryview`` types expose this buffer protocol. | |
|
452 | ||
|
453 | ``read_from()`` returns an iterator whose elements are chunks of the | |
|
454 | decompressed data. | |
|
455 | ||
|
456 | The size of each requested ``read()`` from the source can be specified:: | |
|
457 | ||
|
458 | dctx = zstd.ZstdDecompressor() | |
|
459 | for chunk in dctx.read_from(fh, read_size=16384): | |
|
460 | pass | |
|
461 | ||
|
462 | It is also possible to skip leading bytes in the input data:: | |
|
463 | ||
|
464 | dctx = zstd.ZstdDecompressor() | |
|
465 | for chunk in dctx.read_from(fh, skip_bytes=1): | |
|
466 | pass | |
|
467 | ||
|
468 | Skipping leading bytes is useful if the source data contains extra | |
|
469 | *header* data but you want to avoid the overhead of making a buffer copy | |
|
470 | or allocating a new ``memoryview`` object in order to decompress the data. | |
|
471 | ||
|
472 | Similarly to ``ZstdCompressor.read_from()``, the consumer of the iterator | |
|
473 | controls when data is decompressed. If the iterator isn't consumed, | |
|
474 | decompression is put on hold. | |
|
475 | ||
|
476 | When ``read_from()`` is passed an object conforming to the buffer protocol, | |
|
477 | the behavior may seem similar to what occurs when the simple decompression | |
|
478 | API is used. However, this API works when the decompressed size is unknown. | |
|
479 | Furthermore, if feeding large inputs, the decompressor will work in chunks | |
|
480 | instead of performing a single operation. | |
|
481 | ||
|
482 | Stream Copying API | |
|
483 | ^^^^^^^^^^^^^^^^^^ | |
|
484 | ||
|
485 | ``copy_stream(ifh, ofh)`` can be used to copy data across 2 streams while | |
|
486 | performing decompression.:: | |
|
487 | ||
|
488 | dctx = zstd.ZstdDecompressor() | |
|
489 | dctx.copy_stream(ifh, ofh) | |
|
490 | ||
|
491 | For example, to decompress a file to another file:: | |
|
492 | ||
|
493 | dctx = zstd.ZstdDecompressor() | |
|
494 | with open(input_path, 'rb') as ifh, open(output_path, 'wb') as ofh: | |
|
495 | dctx.copy_stream(ifh, ofh) | |
|
496 | ||
|
497 | The size of chunks being ``read()`` and ``write()`` from and to the streams | |
|
498 | can be specified:: | |
|
499 | ||
|
500 | dctx = zstd.ZstdDecompressor() | |
|
501 | dctx.copy_stream(ifh, ofh, read_size=8192, write_size=16384) | |
|
502 | ||
|
503 | Decompressor API | |
|
504 | ^^^^^^^^^^^^^^^^ | |
|
505 | ||
|
506 | ``decompressobj()`` returns an object that exposes a ``decompress(data)`` | |
|
507 | method. Compressed data chunks are fed into ``decompress(data)`` and | |
|
508 | uncompressed output (or an empty bytes) is returned. Output from subsequent | |
|
509 | calls needs to be concatenated to reassemble the full decompressed byte | |
|
510 | sequence. | |
|
511 | ||
|
512 | The purpose of ``decompressobj()`` is to provide an API-compatible interface | |
|
513 | with ``zlib.decompressobj`` and ``bz2.BZ2Decompressor``. This allows callers | |
|
514 | to swap in different decompressor objects while using the same API. | |
|
515 | ||
|
516 | Each object is single use: once an input frame is decoded, ``decompress()`` | |
|
517 | can no longer be called. | |
|
518 | ||
|
519 | Here is how this API should be used:: | |
|
520 | ||
|
521 | dctx = zstd.ZstdDecompressor() | |

522 | dobj = dctx.decompressobj() | |
|
523 | data = dobj.decompress(compressed_chunk_0) | |
|
524 | data = dobj.decompress(compressed_chunk_1) | |
|
525 | ||
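The chunked feed-and-concatenate pattern is identical to the stdlib decompressor objects. An equivalent round trip using ``zlib`` (as an analogue of the zstd API)::

```python
import zlib

original = b'0123456789' * 5000
compressed = zlib.compress(original)

# Feed compressed data in arbitrary chunks; concatenate the
# partial outputs to reassemble the full decompressed bytes.
dobj = zlib.decompressobj()
parts = []
for i in range(0, len(compressed), 1024):
    parts.append(dobj.decompress(compressed[i:i + 1024]))
parts.append(dobj.flush())
result = b''.join(parts)
assert result == original
```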
|
526 | Choosing an API | |
|
527 | --------------- | |
|
528 | ||
|
529 | Various forms of compression and decompression APIs are provided because each | |
|
530 | is suited to different use cases. | |
|
531 | ||
|
532 | The simple/one-shot APIs are useful for small data, when the decompressed | |
|
533 | data size is known (either recorded in the zstd frame header via | |
|
534 | ``write_content_size`` or known via an out-of-band mechanism, such as a file | |
|
535 | size). | |
|
536 | ||
|
537 | A limitation of the simple APIs is that input or output data must fit in memory. | |
|
538 | And unless using advanced tricks with Python *buffer objects*, both input and | |
|
539 | output must fit in memory simultaneously. | |
|
540 | ||
|
541 | Another limitation is that compression or decompression is performed as a single | |
|
542 | operation. So if you feed large input, it could take a long time for the | |
|
543 | function to return. | |
|
544 | ||
|
545 | The streaming APIs do not have the limitations of the simple API. The cost to | |
|
546 | this is they are more complex to use than a single function call. | |
|
547 | ||
|
548 | The streaming APIs put the caller in control of compression and decompression | |
|
549 | behavior by allowing them to directly control either the input or output side | |
|
550 | of the operation. | |
|
551 | ||
|
552 | With the streaming input APIs, the caller feeds data into the compressor or | |
|
553 | decompressor as they see fit. Output data will only be written after the caller | |
|
554 | has explicitly written data. | |
|
555 | ||
|
556 | With the streaming output APIs, the caller consumes output from the compressor | |
|
557 | or decompressor as they see fit. The compressor or decompressor will only | |
|
558 | consume data from the source when the caller is ready to receive it. | |
|
559 | ||
|
560 | One end of the streaming APIs involves a file-like object that must | |
|
561 | ``write()`` output data or ``read()`` input data. Depending on what the | |
|
562 | backing storage for these objects is, those operations may not complete quickly. | |
|
563 | For example, when streaming compressed data to a file, the ``write()`` into | |
|
564 | a streaming compressor could result in a ``write()`` to the filesystem, which | |
|
565 | may take a long time to finish due to slow I/O on the filesystem. So, there | |
|
566 | may be overhead in streaming APIs beyond the compression and decompression | |
|
567 | operations. | |
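One way to mitigate that overhead is to buffer the destination so many small writes become fewer large ones. A minimal stdlib sketch (``io.BufferedWriter`` is a standard library class, not part of this module):

```python
import io

# A BytesIO standing in for a slow destination file object.
raw = io.BytesIO()

# BufferedWriter coalesces many small write() calls into fewer,
# larger writes against the underlying object.
buffered = io.BufferedWriter(raw, buffer_size=64 * 1024)
for _ in range(1000):
    buffered.write(b"x" * 100)   # lands in the in-memory buffer
buffered.flush()                 # actual I/O happens here, in bulk

assert raw.getvalue() == b"x" * 100000
```

The same wrapping works for any writable file object handed to a streaming compressor.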
|
568 | ||
|
569 | Dictionary Creation and Management | |
|
570 | ---------------------------------- | |
|
571 | ||
|
572 | Zstandard allows *dictionaries* to be used when compressing and | |
|
573 | decompressing data. The idea is that if you are compressing a lot of similar | |
|
574 | data, you can precompute common properties of that data (such as recurring | |
|
575 | byte sequences) to achieve better compression ratios. | |
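The same idea exists in the standard library's ``zlib``, which can serve as a runnable illustration when ``zstd`` is not installed (the ``zdict`` parameter is zlib's, not this module's):

```python
import zlib

# A tiny "dictionary" of byte sequences expected to recur in the inputs.
zdict = b'{"name": "", "id": 0, "tags": []}'
sample = b'{"name": "alice", "id": 17, "tags": []}'

# Compress the same payload with and without the shared dictionary.
with_dict = zlib.compressobj(zdict=zdict)
compressed_with = with_dict.compress(sample) + with_dict.flush()

plain = zlib.compressobj()
compressed_plain = plain.compress(sample) + plain.flush()

# Referencing recurring substrings in the dictionary usually shrinks output.
assert len(compressed_with) <= len(compressed_plain)

# The identical dictionary is required on the decompression side.
decomp = zlib.decompressobj(zdict=zdict)
assert decomp.decompress(compressed_with) == sample
```

Zstandard dictionaries work on the same principle, but are *trained* from many samples rather than hand-written, as described below.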
|
576 | ||
|
577 | In Python, compression dictionaries are represented as the | |
|
578 | ``ZstdCompressionDict`` type. | |
|
579 | ||
|
580 | Instances can be constructed from bytes:: | |
|
581 | ||
|
582 | dict_data = zstd.ZstdCompressionDict(data) | |
|
583 | ||
|
584 | More interestingly, instances can be created by *training* on sample data:: | |
|
585 | ||
|
586 | dict_data = zstd.train_dictionary(size, samples) | |
|
587 | ||
|
588 | This takes a list of bytes instances and creates and returns a | |
|
589 | ``ZstdCompressionDict``. | |
|
590 | ||
|
591 | You can see how many bytes are in the dictionary by calling ``len()``:: | |
|
592 | ||
|
593 | dict_data = zstd.train_dictionary(size, samples) | |
|
594 | dict_size = len(dict_data) # will not be larger than ``size`` | |
|
595 | ||
|
596 | Once you have a dictionary, you can pass it to the objects performing | |
|
597 | compression and decompression:: | |
|
598 | ||
|
599 | dict_data = zstd.train_dictionary(16384, samples) | |
|
600 | ||
|
601 | cctx = zstd.ZstdCompressor(dict_data=dict_data) | |
|
602 | for source_data in input_data: | |
|
603 | compressed = cctx.compress(source_data) | |
|
604 | # Do something with compressed data. | |
|
605 | ||
|
606 | dctx = zstd.ZstdDecompressor(dict_data=dict_data) | |
|
607 | for compressed_data in input_data: | |
|
608 | buffer = io.BytesIO() | |
|
609 | with dctx.write_to(buffer) as decompressor: | |
|
610 | decompressor.write(compressed_data) | |
|
611 | # Do something with raw data in ``buffer``. | |
|
612 | ||
|
613 | Dictionaries have unique integer IDs. You can retrieve this ID via:: | |
|
614 | ||
|
615 | dict_id = zstd.dictionary_id(dict_data) | |
|
616 | ||
|
617 | You can obtain the raw data in the dict (useful for persisting and constructing | |
|
618 | a ``ZstdCompressionDict`` later) via ``as_bytes()``:: | |
|
619 | ||
|
620 | dict_data = zstd.train_dictionary(size, samples) | |
|
621 | raw_data = dict_data.as_bytes() | |
|
622 | ||
|
623 | Explicit Compression Parameters | |
|
624 | ------------------------------- | |
|
625 | ||
|
626 | Zstandard's integer compression levels along with the input size and dictionary | |
|
627 | size are converted into a data structure defining multiple parameters to tune | |
|
628 | behavior of the compression algorithm. It is possible to define this | |
|
629 | data structure explicitly to have lower-level control over compression behavior. | |
|
630 | ||
|
631 | The ``zstd.CompressionParameters`` type represents this data structure. | |
|
632 | You can see how Zstandard converts compression levels to this data structure | |
|
633 | by calling ``zstd.get_compression_parameters()``. e.g.:: | |
|
634 | ||
|
635 | params = zstd.get_compression_parameters(5) | |
|
636 | ||
|
637 | This function also accepts the uncompressed data size and dictionary size | |
|
638 | to adjust parameters:: | |
|
639 | ||
|
640 | params = zstd.get_compression_parameters(3, source_size=len(data), dict_size=len(dict_data)) | |
|
641 | ||
|
642 | You can also construct compression parameters from their low-level components:: | |
|
643 | ||
|
644 | params = zstd.CompressionParameters(20, 6, 12, 5, 4, 10, zstd.STRATEGY_FAST) | |
|
645 | ||
|
646 | You can then configure a compressor to use the custom parameters:: | |
|
647 | ||
|
648 | cctx = zstd.ZstdCompressor(compression_params=params) | |
|
649 | ||
|
650 | The members of the ``CompressionParameters`` tuple are as follows:: | |
|
651 | ||
|
652 | * 0 - Window log | |
|
653 | * 1 - Chain log | |
|
654 | * 2 - Hash log | |
|
655 | * 3 - Search log | |
|
656 | * 4 - Search length | |
|
657 | * 5 - Target length | |
|
658 | * 6 - Strategy (one of the ``zstd.STRATEGY_`` constants) | |
|
659 | ||
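For readability, the positional layout above can be mirrored with a ``namedtuple`` when juggling these values in calling code (the field names here are illustrative and not part of this module's API):

```python
from collections import namedtuple

# Illustrative mirror of the 7-element CompressionParameters layout.
CParams = namedtuple(
    "CParams",
    ["window_log", "chain_log", "hash_log", "search_log",
     "search_length", "target_length", "strategy"],
)

# Same values as the constructor example above.
params = CParams(20, 6, 12, 5, 4, 10, "STRATEGY_FAST")

# Positional and named access agree.
assert params[0] == params.window_log == 20
assert params[6] == params.strategy == "STRATEGY_FAST"

# The tuple unpacks in the documented order.
window_log, *rest = params
assert window_log == 20 and len(rest) == 6
```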
|
660 | You'll need to read the Zstandard documentation for what these parameters | |
|
661 | do. | |
|
662 | ||
|
663 | Misc Functionality | |
|
664 | ------------------ | |
|
665 | ||
|
666 | estimate_compression_context_size(CompressionParameters) | |
|
667 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
|
668 | ||
|
669 | Given a ``CompressionParameters`` struct, estimate the memory size required | |
|
670 | to perform compression. | |
|
671 | ||
|
672 | estimate_decompression_context_size() | |
|
673 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
|
674 | ||
|
675 | Estimate the memory size requirements for a decompressor instance. | |
|
676 | ||
|
677 | Constants | |
|
678 | --------- | |
|
679 | ||
|
680 | The following module constants/attributes are exposed: | |
|
681 | ||
|
682 | ZSTD_VERSION | |
|
683 | This module attribute exposes a 3-tuple of the Zstandard version. e.g. | |
|
684 | ``(1, 0, 0)`` | |
|
685 | MAX_COMPRESSION_LEVEL | |
|
686 | Integer max compression level accepted by compression functions | |
|
687 | COMPRESSION_RECOMMENDED_INPUT_SIZE | |
|
688 | Recommended chunk size to feed to compressor functions | |
|
689 | COMPRESSION_RECOMMENDED_OUTPUT_SIZE | |
|
690 | Recommended chunk size for compression output | |
|
691 | DECOMPRESSION_RECOMMENDED_INPUT_SIZE | |
|
692 | Recommended chunk size to feed into decompressor functions | |
|
693 | DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE | |
|
694 | Recommended chunk size for decompression output | |
|
695 | ||
|
696 | FRAME_HEADER | |
|
697 | Bytes containing the header of a Zstandard frame | |
|
698 | MAGIC_NUMBER | |
|
699 | Frame header as an integer | |
|
700 | ||
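The relationship between those two constants is just byte order: per the Zstandard frame format specification, every frame starts with the magic number ``0xFD2FB528`` serialized little-endian (this is a property of the format itself, not something specific to this module):

```python
# Zstandard frame magic number, from the zstd frame format spec.
MAGIC_NUMBER = 0xFD2FB528

# Serialized little-endian, this is the 4-byte prefix of every frame.
frame_header = MAGIC_NUMBER.to_bytes(4, "little")
assert frame_header == b"\x28\xb5\x2f\xfd"

# And back again:
assert int.from_bytes(frame_header, "little") == MAGIC_NUMBER
```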
|
701 | WINDOWLOG_MIN | |
|
702 | Minimum value for compression parameter | |
|
703 | WINDOWLOG_MAX | |
|
704 | Maximum value for compression parameter | |
|
705 | CHAINLOG_MIN | |
|
706 | Minimum value for compression parameter | |
|
707 | CHAINLOG_MAX | |
|
708 | Maximum value for compression parameter | |
|
709 | HASHLOG_MIN | |
|
710 | Minimum value for compression parameter | |
|
711 | HASHLOG_MAX | |
|
712 | Maximum value for compression parameter | |
|
713 | SEARCHLOG_MIN | |
|
714 | Minimum value for compression parameter | |
|
715 | SEARCHLOG_MAX | |
|
716 | Maximum value for compression parameter | |
|
717 | SEARCHLENGTH_MIN | |
|
718 | Minimum value for compression parameter | |
|
719 | SEARCHLENGTH_MAX | |
|
720 | Maximum value for compression parameter | |
|
721 | TARGETLENGTH_MIN | |
|
722 | Minimum value for compression parameter | |
|
723 | TARGETLENGTH_MAX | |
|
724 | Maximum value for compression parameter | |
|
725 | STRATEGY_FAST | |
|
726 | Compression strategy | |
|
727 | STRATEGY_DFAST | |
|
728 | Compression strategy | |
|
729 | STRATEGY_GREEDY | |
|
730 | Compression strategy | |
|
731 | STRATEGY_LAZY | |
|
732 | Compression strategy | |
|
733 | STRATEGY_LAZY2 | |
|
734 | Compression strategy | |
|
735 | STRATEGY_BTLAZY2 | |
|
736 | Compression strategy | |
|
737 | STRATEGY_BTOPT | |
|
738 | Compression strategy | |
|
739 | ||
|
740 | Note on Zstandard's *Experimental* API | |
|
741 | ====================================== | |
|
742 | ||
|
743 | Many of the Zstandard APIs used by this module are marked as *experimental* | |
|
744 | within the Zstandard project. This includes a large number of useful | |
|
745 | features, such as compression and frame parameters and parts of dictionary | |
|
746 | compression. | |
|
747 | ||
|
748 | It is unclear how Zstandard's C API will evolve over time, especially with | |
|
749 | regards to this *experimental* functionality. We will try to maintain | |
|
750 | backwards compatibility at the Python API level. However, we cannot | |
|
751 | guarantee this for things not under our control. | |
|
752 | ||
|
753 | Since a copy of the Zstandard source code is distributed with this | |
|
754 | module and since we compile against it, the behavior of a specific | |
|
755 | version of this module should remain constant over time. So if you | |
|
756 | pin the version of this module used in your projects (which is a Python | |
|
757 | best practice), you should be insulated from unwanted future changes. | |
|
758 | ||
|
759 | Donate | |
|
760 | ====== | |
|
761 | ||
|
762 | A lot of time has been invested into this project by the author. | |
|
763 | ||
|
764 | If you find this project useful and would like to thank the author for | |
|
765 | their work, consider donating some money. Any amount is appreciated. | |
|
766 | ||
|
767 | .. image:: https://www.paypalobjects.com/en_US/i/btn/btn_donate_LG.gif | |
|
768 | :target: https://www.paypal.com/cgi-bin/webscr?cmd=_donations&business=gregory%2eszorc%40gmail%2ecom&lc=US&item_name=python%2dzstandard¤cy_code=USD&bn=PP%2dDonationsBF%3abtn_donate_LG%2egif%3aNonHosted | |
|
769 | :alt: Donate via PayPal | |
|
770 | ||
|
771 | .. |ci-status| image:: https://travis-ci.org/indygreg/python-zstandard.svg?branch=master | |
|
772 | :target: https://travis-ci.org/indygreg/python-zstandard | |
|
773 | ||
|
774 | .. |win-ci-status| image:: https://ci.appveyor.com/api/projects/status/github/indygreg/python-zstandard?svg=true | |
|
775 | :target: https://ci.appveyor.com/project/indygreg/python-zstandard | |
|
776 | :alt: Windows build status |
@@ -0,0 +1,247 b'' | |||
|
1 | /** | |
|
2 | * Copyright (c) 2016-present, Gregory Szorc | |
|
3 | * All rights reserved. | |
|
4 | * | |
|
5 | * This software may be modified and distributed under the terms | |
|
6 | * of the BSD license. See the LICENSE file for details. | |
|
7 | */ | |
|
8 | ||
|
9 | #include "python-zstandard.h" | |
|
10 | ||
|
11 | extern PyObject* ZstdError; | |
|
12 | ||
|
13 | ZstdCompressionDict* train_dictionary(PyObject* self, PyObject* args, PyObject* kwargs) { | |
|
14 | static char *kwlist[] = { "dict_size", "samples", "parameters", NULL }; | |
|
15 | size_t capacity; | |
|
16 | PyObject* samples; | |
|
17 | Py_ssize_t samplesLen; | |
|
18 | PyObject* parameters = NULL; | |
|
19 | ZDICT_params_t zparams; | |
|
20 | Py_ssize_t sampleIndex; | |
|
21 | Py_ssize_t sampleSize; | |
|
22 | PyObject* sampleItem; | |
|
23 | size_t zresult; | |
|
24 | void* sampleBuffer; | |
|
25 | void* sampleOffset; | |
|
26 | size_t samplesSize = 0; | |
|
27 | size_t* sampleSizes; | |
|
28 | void* dict; | |
|
29 | ZstdCompressionDict* result; | |
|
30 | ||
|
31 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "nO!|O!", kwlist, | |
|
32 | &capacity, | |
|
33 | &PyList_Type, &samples, | |
|
34 | (PyObject*)&DictParametersType, ¶meters)) { | |
|
35 | return NULL; | |
|
36 | } | |
|
37 | ||
|
38 | /* Validate parameters first since it is easiest. */ | |
|
39 | zparams.selectivityLevel = 0; | |
|
40 | zparams.compressionLevel = 0; | |
|
41 | zparams.notificationLevel = 0; | |
|
42 | zparams.dictID = 0; | |
|
43 | zparams.reserved[0] = 0; | |
|
44 | zparams.reserved[1] = 0; | |
|
45 | ||
|
46 | if (parameters) { | |
|
47 | /* TODO validate data ranges */ | |
|
48 | zparams.selectivityLevel = PyLong_AsUnsignedLong(PyTuple_GetItem(parameters, 0)); | |
|
49 | zparams.compressionLevel = PyLong_AsLong(PyTuple_GetItem(parameters, 1)); | |
|
50 | zparams.notificationLevel = PyLong_AsUnsignedLong(PyTuple_GetItem(parameters, 2)); | |
|
51 | zparams.dictID = PyLong_AsUnsignedLong(PyTuple_GetItem(parameters, 3)); | |
|
52 | } | |
|
53 | ||
|
54 | /* Figure out the size of the raw samples */ | |
|
55 | samplesLen = PyList_Size(samples); | |
|
56 | for (sampleIndex = 0; sampleIndex < samplesLen; sampleIndex++) { | |
|
57 | sampleItem = PyList_GetItem(samples, sampleIndex); | |
|
58 | if (!PyBytes_Check(sampleItem)) { | |
|
59 | PyErr_SetString(PyExc_ValueError, "samples must be bytes"); | |
|
60 | /* TODO probably need to perform DECREF here */ | |
|
61 | return NULL; | |
|
62 | } | |
|
63 | samplesSize += PyBytes_GET_SIZE(sampleItem); | |
|
64 | } | |
|
65 | ||
|
66 | /* Now that we know the total size of the raw samples, we can allocate | |
|
67 | a buffer for the raw data */ | |
|
68 | sampleBuffer = malloc(samplesSize); | |
|
69 | if (!sampleBuffer) { | |
|
70 | PyErr_NoMemory(); | |
|
71 | return NULL; | |
|
72 | } | |
|
73 | sampleSizes = malloc(samplesLen * sizeof(size_t)); | |
|
74 | if (!sampleSizes) { | |
|
75 | free(sampleBuffer); | |
|
76 | PyErr_NoMemory(); | |
|
77 | return NULL; | |
|
78 | } | |
|
79 | ||
|
80 | sampleOffset = sampleBuffer; | |
|
81 | /* Now iterate again and assemble the samples in the buffer */ | |
|
82 | for (sampleIndex = 0; sampleIndex < samplesLen; sampleIndex++) { | |
|
83 | sampleItem = PyList_GetItem(samples, sampleIndex); | |
|
84 | sampleSize = PyBytes_GET_SIZE(sampleItem); | |
|
85 | sampleSizes[sampleIndex] = sampleSize; | |
|
86 | memcpy(sampleOffset, PyBytes_AS_STRING(sampleItem), sampleSize); | |
|
87 | sampleOffset = (char*)sampleOffset + sampleSize; | |
|
88 | } | |
|
89 | ||
|
90 | dict = malloc(capacity); | |
|
91 | if (!dict) { | |
|
92 | free(sampleSizes); | |
|
93 | free(sampleBuffer); | |
|
94 | PyErr_NoMemory(); | |
|
95 | return NULL; | |
|
96 | } | |
|
97 | ||
|
98 | zresult = ZDICT_trainFromBuffer_advanced(dict, capacity, | |
|
99 | sampleBuffer, sampleSizes, (unsigned int)samplesLen, | |
|
100 | zparams); | |
|
101 | if (ZDICT_isError(zresult)) { | |
|
102 | PyErr_Format(ZstdError, "Cannot train dict: %s", ZDICT_getErrorName(zresult)); | |
|
103 | free(dict); | |
|
104 | free(sampleSizes); | |
|
105 | free(sampleBuffer); | |
|
106 | return NULL; | |
|
107 | } | |
|
108 | ||
|
109 | result = PyObject_New(ZstdCompressionDict, &ZstdCompressionDictType); | |
|
110 | if (!result) { | |
|
111 | return NULL; | |
|
112 | } | |
|
113 | ||
|
114 | result->dictData = dict; | |
|
115 | result->dictSize = zresult; | |
|
116 | return result; | |
|
117 | } | |
|
118 | ||
|
119 | ||
|
120 | PyDoc_STRVAR(ZstdCompressionDict__doc__, | |
|
121 | "ZstdCompressionDict(data) - Represents a computed compression dictionary\n" | |
|
122 | "\n" | |
|
123 | "This type holds the results of a computed Zstandard compression dictionary.\n" | |
|
124 | "Instances are obtained by calling ``train_dictionary()`` or by passing bytes\n" | |
|
125 | "obtained from another source into the constructor.\n" | |
|
126 | ); | |
|
127 | ||
|
128 | static int ZstdCompressionDict_init(ZstdCompressionDict* self, PyObject* args) { | |
|
129 | const char* source; | |
|
130 | Py_ssize_t sourceSize; | |
|
131 | ||
|
132 | self->dictData = NULL; | |
|
133 | self->dictSize = 0; | |
|
134 | ||
|
135 | #if PY_MAJOR_VERSION >= 3 | |
|
136 | if (!PyArg_ParseTuple(args, "y#", &source, &sourceSize)) { | |
|
137 | #else | |
|
138 | if (!PyArg_ParseTuple(args, "s#", &source, &sourceSize)) { | |
|
139 | #endif | |
|
140 | return -1; | |
|
141 | } | |
|
142 | ||
|
143 | self->dictData = malloc(sourceSize); | |
|
144 | if (!self->dictData) { | |
|
145 | PyErr_NoMemory(); | |
|
146 | return -1; | |
|
147 | } | |
|
148 | ||
|
149 | memcpy(self->dictData, source, sourceSize); | |
|
150 | self->dictSize = sourceSize; | |
|
151 | ||
|
152 | return 0; | |
|
153 | } | |
|
154 | ||
|
155 | static void ZstdCompressionDict_dealloc(ZstdCompressionDict* self) { | |
|
156 | if (self->dictData) { | |
|
157 | free(self->dictData); | |
|
158 | self->dictData = NULL; | |
|
159 | } | |
|
160 | ||
|
161 | PyObject_Del(self); | |
|
162 | } | |
|
163 | ||
|
164 | static PyObject* ZstdCompressionDict_dict_id(ZstdCompressionDict* self) { | |
|
165 | unsigned dictID = ZDICT_getDictID(self->dictData, self->dictSize); | |
|
166 | ||
|
167 | return PyLong_FromUnsignedLong(dictID); | |
|
168 | } | |
|
169 | ||
|
170 | static PyObject* ZstdCompressionDict_as_bytes(ZstdCompressionDict* self) { | |
|
171 | return PyBytes_FromStringAndSize(self->dictData, self->dictSize); | |
|
172 | } | |
|
173 | ||
|
174 | static PyMethodDef ZstdCompressionDict_methods[] = { | |
|
175 | { "dict_id", (PyCFunction)ZstdCompressionDict_dict_id, METH_NOARGS, | |
|
176 | PyDoc_STR("dict_id() -- obtain the numeric dictionary ID") }, | |
|
177 | { "as_bytes", (PyCFunction)ZstdCompressionDict_as_bytes, METH_NOARGS, | |
|
178 | PyDoc_STR("as_bytes() -- obtain the raw bytes constituting the dictionary data") }, | |
|
179 | { NULL, NULL } | |
|
180 | }; | |
|
181 | ||
|
182 | static Py_ssize_t ZstdCompressionDict_length(ZstdCompressionDict* self) { | |
|
183 | return self->dictSize; | |
|
184 | } | |
|
185 | ||
|
186 | static PySequenceMethods ZstdCompressionDict_sq = { | |
|
187 | (lenfunc)ZstdCompressionDict_length, /* sq_length */ | |
|
188 | 0, /* sq_concat */ | |
|
189 | 0, /* sq_repeat */ | |
|
190 | 0, /* sq_item */ | |
|
191 | 0, /* sq_ass_item */ | |
|
192 | 0, /* sq_contains */ | |
|
193 | 0, /* sq_inplace_concat */ | |
|
194 | 0 /* sq_inplace_repeat */ | |
|
195 | }; | |
|
196 | ||
|
197 | PyTypeObject ZstdCompressionDictType = { | |
|
198 | PyVarObject_HEAD_INIT(NULL, 0) | |
|
199 | "zstd.ZstdCompressionDict", /* tp_name */ | |
|
200 | sizeof(ZstdCompressionDict), /* tp_basicsize */ | |
|
201 | 0, /* tp_itemsize */ | |
|
202 | (destructor)ZstdCompressionDict_dealloc, /* tp_dealloc */ | |
|
203 | 0, /* tp_print */ | |
|
204 | 0, /* tp_getattr */ | |
|
205 | 0, /* tp_setattr */ | |
|
206 | 0, /* tp_compare */ | |
|
207 | 0, /* tp_repr */ | |
|
208 | 0, /* tp_as_number */ | |
|
209 | &ZstdCompressionDict_sq, /* tp_as_sequence */ | |
|
210 | 0, /* tp_as_mapping */ | |
|
211 | 0, /* tp_hash */ | |
|
212 | 0, /* tp_call */ | |
|
213 | 0, /* tp_str */ | |
|
214 | 0, /* tp_getattro */ | |
|
215 | 0, /* tp_setattro */ | |
|
216 | 0, /* tp_as_buffer */ | |
|
217 | Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /* tp_flags */ | |
|
218 | ZstdCompressionDict__doc__, /* tp_doc */ | |
|
219 | 0, /* tp_traverse */ | |
|
220 | 0, /* tp_clear */ | |
|
221 | 0, /* tp_richcompare */ | |
|
222 | 0, /* tp_weaklistoffset */ | |
|
223 | 0, /* tp_iter */ | |
|
224 | 0, /* tp_iternext */ | |
|
225 | ZstdCompressionDict_methods, /* tp_methods */ | |
|
226 | 0, /* tp_members */ | |
|
227 | 0, /* tp_getset */ | |
|
228 | 0, /* tp_base */ | |
|
229 | 0, /* tp_dict */ | |
|
230 | 0, /* tp_descr_get */ | |
|
231 | 0, /* tp_descr_set */ | |
|
232 | 0, /* tp_dictoffset */ | |
|
233 | (initproc)ZstdCompressionDict_init, /* tp_init */ | |
|
234 | 0, /* tp_alloc */ | |
|
235 | PyType_GenericNew, /* tp_new */ | |
|
236 | }; | |
|
237 | ||
|
238 | void compressiondict_module_init(PyObject* mod) { | |
|
239 | Py_TYPE(&ZstdCompressionDictType) = &PyType_Type; | |
|
240 | if (PyType_Ready(&ZstdCompressionDictType) < 0) { | |
|
241 | return; | |
|
242 | } | |
|
243 | ||
|
244 | Py_INCREF((PyObject*)&ZstdCompressionDictType); | |
|
245 | PyModule_AddObject(mod, "ZstdCompressionDict", | |
|
246 | (PyObject*)&ZstdCompressionDictType); | |
|
247 | } |
@@ -0,0 +1,226 b'' | |||
|
1 | /** | |
|
2 | * Copyright (c) 2016-present, Gregory Szorc | |
|
3 | * All rights reserved. | |
|
4 | * | |
|
5 | * This software may be modified and distributed under the terms | |
|
6 | * of the BSD license. See the LICENSE file for details. | |
|
7 | */ | |
|
8 | ||
|
9 | #include "python-zstandard.h" | |
|
10 | ||
|
11 | void ztopy_compression_parameters(CompressionParametersObject* params, ZSTD_compressionParameters* zparams) { | |
|
12 | zparams->windowLog = params->windowLog; | |
|
13 | zparams->chainLog = params->chainLog; | |
|
14 | zparams->hashLog = params->hashLog; | |
|
15 | zparams->searchLog = params->searchLog; | |
|
16 | zparams->searchLength = params->searchLength; | |
|
17 | zparams->targetLength = params->targetLength; | |
|
18 | zparams->strategy = params->strategy; | |
|
19 | } | |
|
20 | ||
|
21 | CompressionParametersObject* get_compression_parameters(PyObject* self, PyObject* args) { | |
|
22 | int compressionLevel; | |
|
23 | unsigned PY_LONG_LONG sourceSize = 0; | |
|
24 | Py_ssize_t dictSize = 0; | |
|
25 | ZSTD_compressionParameters params; | |
|
26 | CompressionParametersObject* result; | |
|
27 | ||
|
28 | if (!PyArg_ParseTuple(args, "i|Kn", &compressionLevel, &sourceSize, &dictSize)) { | |
|
29 | return NULL; | |
|
30 | } | |
|
31 | ||
|
32 | params = ZSTD_getCParams(compressionLevel, sourceSize, dictSize); | |
|
33 | ||
|
34 | result = PyObject_New(CompressionParametersObject, &CompressionParametersType); | |
|
35 | if (!result) { | |
|
36 | return NULL; | |
|
37 | } | |
|
38 | ||
|
39 | result->windowLog = params.windowLog; | |
|
40 | result->chainLog = params.chainLog; | |
|
41 | result->hashLog = params.hashLog; | |
|
42 | result->searchLog = params.searchLog; | |
|
43 | result->searchLength = params.searchLength; | |
|
44 | result->targetLength = params.targetLength; | |
|
45 | result->strategy = params.strategy; | |
|
46 | ||
|
47 | return result; | |
|
48 | } | |
|
49 | ||
|
50 | PyObject* estimate_compression_context_size(PyObject* self, PyObject* args) { | |
|
51 | CompressionParametersObject* params; | |
|
52 | ZSTD_compressionParameters zparams; | |
|
53 | PyObject* result; | |
|
54 | ||
|
55 | if (!PyArg_ParseTuple(args, "O!", &CompressionParametersType, ¶ms)) { | |
|
56 | return NULL; | |
|
57 | } | |
|
58 | ||
|
59 | ztopy_compression_parameters(params, &zparams); | |
|
60 | result = PyLong_FromSize_t(ZSTD_estimateCCtxSize(zparams)); | |
|
61 | return result; | |
|
62 | } | |
|
63 | ||
|
64 | PyDoc_STRVAR(CompressionParameters__doc__, | |
|
65 | "CompressionParameters: low-level control over zstd compression"); | |
|
66 | ||
|
67 | static PyObject* CompressionParameters_new(PyTypeObject* subtype, PyObject* args, PyObject* kwargs) { | |
|
68 | CompressionParametersObject* self; | |
|
69 | unsigned windowLog; | |
|
70 | unsigned chainLog; | |
|
71 | unsigned hashLog; | |
|
72 | unsigned searchLog; | |
|
73 | unsigned searchLength; | |
|
74 | unsigned targetLength; | |
|
75 | unsigned strategy; | |
|
76 | ||
|
77 | if (!PyArg_ParseTuple(args, "IIIIIII", &windowLog, &chainLog, &hashLog, &searchLog, | |
|
78 | &searchLength, &targetLength, &strategy)) { | |
|
79 | return NULL; | |
|
80 | } | |
|
81 | ||
|
82 | if (windowLog < ZSTD_WINDOWLOG_MIN || windowLog > ZSTD_WINDOWLOG_MAX) { | |
|
83 | PyErr_SetString(PyExc_ValueError, "invalid window log value"); | |
|
84 | return NULL; | |
|
85 | } | |
|
86 | ||
|
87 | if (chainLog < ZSTD_CHAINLOG_MIN || chainLog > ZSTD_CHAINLOG_MAX) { | |
|
88 | PyErr_SetString(PyExc_ValueError, "invalid chain log value"); | |
|
89 | return NULL; | |
|
90 | } | |
|
91 | ||
|
92 | if (hashLog < ZSTD_HASHLOG_MIN || hashLog > ZSTD_HASHLOG_MAX) { | |
|
93 | PyErr_SetString(PyExc_ValueError, "invalid hash log value"); | |
|
94 | return NULL; | |
|
95 | } | |
|
96 | ||
|
97 | if (searchLog < ZSTD_SEARCHLOG_MIN || searchLog > ZSTD_SEARCHLOG_MAX) { | |
|
98 | PyErr_SetString(PyExc_ValueError, "invalid search log value"); | |
|
99 | return NULL; | |
|
100 | } | |
|
101 | ||
|
102 | if (searchLength < ZSTD_SEARCHLENGTH_MIN || searchLength > ZSTD_SEARCHLENGTH_MAX) { | |
|
103 | PyErr_SetString(PyExc_ValueError, "invalid search length value"); | |
|
104 | return NULL; | |
|
105 | } | |
|
106 | ||
|
107 | if (targetLength < ZSTD_TARGETLENGTH_MIN || targetLength > ZSTD_TARGETLENGTH_MAX) { | |
|
108 | PyErr_SetString(PyExc_ValueError, "invalid target length value"); | |
|
109 | return NULL; | |
|
110 | } | |
|
111 | ||
|
112 | if (strategy < ZSTD_fast || strategy > ZSTD_btopt) { | |
|
113 | PyErr_SetString(PyExc_ValueError, "invalid strategy value"); | |
|
114 | return NULL; | |
|
115 | } | |
|
116 | ||
|
117 | self = (CompressionParametersObject*)subtype->tp_alloc(subtype, 0); | |
|
118 | if (!self) { | |
|
119 | return NULL; | |
|
120 | } | |
|
121 | ||
|
122 | self->windowLog = windowLog; | |
|
123 | self->chainLog = chainLog; | |
|
124 | self->hashLog = hashLog; | |
|
125 | self->searchLog = searchLog; | |
|
126 | self->searchLength = searchLength; | |
|
127 | self->targetLength = targetLength; | |
|
128 | self->strategy = strategy; | |
|
129 | ||
|
130 | return (PyObject*)self; | |
|
131 | } | |
|
132 | ||
|
133 | static void CompressionParameters_dealloc(PyObject* self) { | |
|
134 | PyObject_Del(self); | |
|
135 | } | |
|
136 | ||
|
137 | static Py_ssize_t CompressionParameters_length(PyObject* self) { | |
|
138 | return 7; | |
|
139 | }; | |
|
140 | ||
|
141 | static PyObject* CompressionParameters_item(PyObject* o, Py_ssize_t i) { | |
|
142 | CompressionParametersObject* self = (CompressionParametersObject*)o; | |
|
143 | ||
|
144 | switch (i) { | |
|
145 | case 0: | |
|
146 | return PyLong_FromLong(self->windowLog); | |
|
147 | case 1: | |
|
148 | return PyLong_FromLong(self->chainLog); | |
|
149 | case 2: | |
|
150 | return PyLong_FromLong(self->hashLog); | |
|
151 | case 3: | |
|
152 | return PyLong_FromLong(self->searchLog); | |
|
153 | case 4: | |
|
154 | return PyLong_FromLong(self->searchLength); | |
|
155 | case 5: | |
|
156 | return PyLong_FromLong(self->targetLength); | |
|
157 | case 6: | |
|
158 | return PyLong_FromLong(self->strategy); | |
|
159 | default: | |
|
160 | PyErr_SetString(PyExc_IndexError, "index out of range"); | |
|
161 | return NULL; | |
|
162 | } | |
|
163 | } | |
|
164 | ||
|
165 | static PySequenceMethods CompressionParameters_sq = { | |
|
166 | CompressionParameters_length, /* sq_length */ | |
|
167 | 0, /* sq_concat */ | |
|
168 | 0, /* sq_repeat */ | |
|
169 | CompressionParameters_item, /* sq_item */ | |
|
170 | 0, /* sq_ass_item */ | |
|
171 | 0, /* sq_contains */ | |
|
172 | 0, /* sq_inplace_concat */ | |
|
173 | 0 /* sq_inplace_repeat */ | |
|
174 | }; | |
|
175 | ||
|
176 | PyTypeObject CompressionParametersType = { | |
|
177 | PyVarObject_HEAD_INIT(NULL, 0) | |
|
178 | "CompressionParameters", /* tp_name */ | |
|
179 | sizeof(CompressionParametersObject), /* tp_basicsize */ | |
|
180 | 0, /* tp_itemsize */ | |
|
181 | (destructor)CompressionParameters_dealloc, /* tp_dealloc */ | |
|
182 | 0, /* tp_print */ | |
|
183 | 0, /* tp_getattr */ | |
|
184 | 0, /* tp_setattr */ | |
|
185 | 0, /* tp_compare */ | |
|
186 | 0, /* tp_repr */ | |
|
187 | 0, /* tp_as_number */ | |
|
188 | &CompressionParameters_sq, /* tp_as_sequence */ | |
|
189 | 0, /* tp_as_mapping */ | |
|
190 | 0, /* tp_hash */ | |
|
191 | 0, /* tp_call */ | |
|
192 | 0, /* tp_str */ | |
|
193 | 0, /* tp_getattro */ | |
|
194 | 0, /* tp_setattro */ | |
|
195 | 0, /* tp_as_buffer */ | |
|
196 | Py_TPFLAGS_DEFAULT, /* tp_flags */ | |
|
197 | CompressionParameters__doc__, /* tp_doc */ | |
|
198 | 0, /* tp_traverse */ | |
|
199 | 0, /* tp_clear */ | |
|
200 | 0, /* tp_richcompare */ | |
|
201 | 0, /* tp_weaklistoffset */ | |
|
202 | 0, /* tp_iter */ | |
|
203 | 0, /* tp_iternext */ | |
|
204 | 0, /* tp_methods */ | |
|
205 | 0, /* tp_members */ | |
|
206 | 0, /* tp_getset */ | |
|
207 | 0, /* tp_base */ | |
|
208 | 0, /* tp_dict */ | |
|
209 | 0, /* tp_descr_get */ | |
|
210 | 0, /* tp_descr_set */ | |
|
211 | 0, /* tp_dictoffset */ | |
|
212 | 0, /* tp_init */ | |
|
213 | 0, /* tp_alloc */ | |
|
214 | CompressionParameters_new, /* tp_new */ | |
|
215 | }; | |
|
216 | ||
|
217 | void compressionparams_module_init(PyObject* mod) { | |
|
218 | Py_TYPE(&CompressionParametersType) = &PyType_Type; | |
|
219 | if (PyType_Ready(&CompressionParametersType) < 0) { | |
|
220 | return; | |
|
221 | } | |
|
222 | ||
|
223 | Py_INCREF((PyObject*)&CompressionParametersType); | |
|
224 | PyModule_AddObject(mod, "CompressionParameters", | |
|
225 | (PyObject*)&CompressionParametersType); | |
|
226 | } |
@@ -0,0 +1,235 b'' | |||
|
1 | /** | |
|
2 | * Copyright (c) 2016-present, Gregory Szorc | |
|
3 | * All rights reserved. | |
|
4 | * | |
|
5 | * This software may be modified and distributed under the terms | |
|
6 | * of the BSD license. See the LICENSE file for details. | |
|
7 | */ | |
|
8 | ||
|
9 | #include "python-zstandard.h" | |
|
10 | ||
|
11 | extern PyObject* ZstdError; | |
|
12 | ||
|
13 | PyDoc_STRVAR(ZstdCompresssionWriter__doc__, | |
|
14 | """A context manager used for writing compressed output to a writer.\n" | |
|
15 | ); | |
|
16 | ||
|
17 | static void ZstdCompressionWriter_dealloc(ZstdCompressionWriter* self) { | |
|
18 | Py_XDECREF(self->compressor); | |
|
19 | Py_XDECREF(self->writer); | |
|
20 | ||
|
21 | if (self->cstream) { | |
|
22 | ZSTD_freeCStream(self->cstream); | |
|
23 | self->cstream = NULL; | |
|
24 | } | |
|
25 | ||
|
26 | PyObject_Del(self); | |
|
27 | } | |
|
28 | ||
|
29 | static PyObject* ZstdCompressionWriter_enter(ZstdCompressionWriter* self) { | |
|
30 | if (self->entered) { | |
|
31 | PyErr_SetString(ZstdError, "cannot __enter__ multiple times"); | |
|
32 | return NULL; | |
|
33 | } | |
|
34 | ||
|
35 | self->cstream = CStream_from_ZstdCompressor(self->compressor, self->sourceSize); | |
|
36 | if (!self->cstream) { | |
|
37 | return NULL; | |
|
38 | } | |
|
39 | ||
|
40 | self->entered = 1; | |
|
41 | ||
|
42 | Py_INCREF(self); | |
|
43 | return (PyObject*)self; | |
|
44 | } | |
|
45 | ||
|
46 | static PyObject* ZstdCompressionWriter_exit(ZstdCompressionWriter* self, PyObject* args) { | |
|
47 | PyObject* exc_type; | |
|
48 | PyObject* exc_value; | |
|
49 | PyObject* exc_tb; | |
|
50 | size_t zresult; | |
|
51 | ||
|
52 | ZSTD_outBuffer output; | |
|
53 | PyObject* res; | |
|
54 | ||
|
55 | if (!PyArg_ParseTuple(args, "OOO", &exc_type, &exc_value, &exc_tb)) { | |
|
56 | return NULL; | |
|
57 | } | |
|
58 | ||
|
59 | self->entered = 0; | |
|
60 | ||
|
61 | if (self->cstream && exc_type == Py_None && exc_value == Py_None && | |
|
62 | exc_tb == Py_None) { | |
|
63 | ||
|
64 | output.dst = malloc(self->outSize); | |
|
65 | if (!output.dst) { | |
|
66 | return PyErr_NoMemory(); | |
|
67 | } | |
|
68 | output.size = self->outSize; | |
|
69 | output.pos = 0; | |
|
70 | ||
|
71 | while (1) { | |
|
72 | zresult = ZSTD_endStream(self->cstream, &output); | |
|
73 | if (ZSTD_isError(zresult)) { | |
|
74 | PyErr_Format(ZstdError, "error ending compression stream: %s", | |
|
75 | ZSTD_getErrorName(zresult)); | |
|
76 | free(output.dst); | |
|
77 | return NULL; | |
|
78 | } | |
|
79 | ||
|
80 | if (output.pos) { | |
|
81 | #if PY_MAJOR_VERSION >= 3 | |
|
82 | res = PyObject_CallMethod(self->writer, "write", "y#", | |
|
83 | #else | |
|
84 | res = PyObject_CallMethod(self->writer, "write", "s#", | |
|
85 | #endif | |
|
86 | output.dst, output.pos); | |
|
87 | Py_XDECREF(res); | |
|
88 | } | |
|
89 | ||
|
90 | if (!zresult) { | |
|
91 | break; | |
|
92 | } | |
|
93 | ||
|
94 | output.pos = 0; | |
|
95 | } | |
|
96 | ||
|
97 | free(output.dst); | |
|
98 | ZSTD_freeCStream(self->cstream); | |
|
99 | self->cstream = NULL; | |
|
100 | } | |
|
101 | ||
|
102 | Py_RETURN_FALSE; | |
|
103 | } | |
|
104 | ||
|
105 | static PyObject* ZstdCompressionWriter_memory_size(ZstdCompressionWriter* self) { | |
|
106 | if (!self->cstream) { | |
|
107 | PyErr_SetString(ZstdError, "cannot determine size of an inactive compressor; " | |
|
108 | "call when a context manager is active"); | |
|
109 | return NULL; | |
|
110 | } | |
|
111 | ||
|
112 | return PyLong_FromSize_t(ZSTD_sizeof_CStream(self->cstream)); | |
|
113 | } | |
|
114 | ||
|
115 | static PyObject* ZstdCompressionWriter_write(ZstdCompressionWriter* self, PyObject* args) { | |
|
116 | const char* source; | |
|
117 | Py_ssize_t sourceSize; | |
|
118 | size_t zresult; | |
|
119 | ZSTD_inBuffer input; | |
|
120 | ZSTD_outBuffer output; | |
|
121 | PyObject* res; | |
|
122 | ||
|
123 | #if PY_MAJOR_VERSION >= 3 | |
|
124 | if (!PyArg_ParseTuple(args, "y#", &source, &sourceSize)) { | |
|
125 | #else | |
|
126 | if (!PyArg_ParseTuple(args, "s#", &source, &sourceSize)) { | |
|
127 | #endif | |
|
128 | return NULL; | |
|
129 | } | |
|
130 | ||
|
131 | if (!self->entered) { | |
|
132 | PyErr_SetString(ZstdError, "compress must be called from an active context manager"); | |
|
133 | return NULL; | |
|
134 | } | |
|
135 | ||
|
136 | output.dst = malloc(self->outSize); | |
|
137 | if (!output.dst) { | |
|
138 | return PyErr_NoMemory(); | |
|
139 | } | |
|
140 | output.size = self->outSize; | |
|
141 | output.pos = 0; | |
|
142 | ||
|
143 | input.src = source; | |
|
144 | input.size = sourceSize; | |
|
145 | input.pos = 0; | |
|
146 | ||
|
147 | while ((ssize_t)input.pos < sourceSize) { | |
|
148 | Py_BEGIN_ALLOW_THREADS | |
|
149 | zresult = ZSTD_compressStream(self->cstream, &output, &input); | |
|
150 | Py_END_ALLOW_THREADS | |
|
151 | ||
|
152 | if (ZSTD_isError(zresult)) { | |
|
153 | free(output.dst); | |
|
154 | PyErr_Format(ZstdError, "zstd compress error: %s", ZSTD_getErrorName(zresult)); | |
|
155 | return NULL; | |
|
156 | } | |
|
157 | ||
|
158 | /* Copy data from output buffer to writer. */ | |
|
159 | if (output.pos) { | |
|
160 | #if PY_MAJOR_VERSION >= 3 | |
|
161 | res = PyObject_CallMethod(self->writer, "write", "y#", | |
|
162 | #else | |
|
163 | res = PyObject_CallMethod(self->writer, "write", "s#", | |
|
164 | #endif | |
|
165 | output.dst, output.pos); | |
|
166 | Py_XDECREF(res); | |
|
167 | } | |
|
168 | output.pos = 0; | |
|
169 | } | |
|
170 | ||
|
171 | free(output.dst); | |
|
172 | ||
|
173 | /* TODO return bytes written */ | |
|
174 | Py_RETURN_NONE; | |
|
175 | } | |
|
176 | ||
|
177 | static PyMethodDef ZstdCompressionWriter_methods[] = { | |
|
178 | { "__enter__", (PyCFunction)ZstdCompressionWriter_enter, METH_NOARGS, | |
|
179 | PyDoc_STR("Enter a compression context.") }, | |
|
180 | { "__exit__", (PyCFunction)ZstdCompressionWriter_exit, METH_VARARGS, | |
|
181 | PyDoc_STR("Exit a compression context.") }, | |
|
182 | { "memory_size", (PyCFunction)ZstdCompressionWriter_memory_size, METH_NOARGS, | |
|
183 | PyDoc_STR("Obtain the memory size of the underlying compressor") }, | |
|
184 | { "write", (PyCFunction)ZstdCompressionWriter_write, METH_VARARGS, | |
|
185 | PyDoc_STR("Compress data") }, | |
|
186 | { NULL, NULL } | |
|
187 | }; | |
|
188 | ||
|
189 | PyTypeObject ZstdCompressionWriterType = { | |
|
190 | PyVarObject_HEAD_INIT(NULL, 0) | |
|
191 | "zstd.ZstdCompressionWriter", /* tp_name */ | |
|
192 | sizeof(ZstdCompressionWriter), /* tp_basicsize */ | |
|
193 | 0, /* tp_itemsize */ | |
|
194 | (destructor)ZstdCompressionWriter_dealloc, /* tp_dealloc */ | |
|
195 | 0, /* tp_print */ | |
|
196 | 0, /* tp_getattr */ | |
|
197 | 0, /* tp_setattr */ | |
|
198 | 0, /* tp_compare */ | |
|
199 | 0, /* tp_repr */ | |
|
200 | 0, /* tp_as_number */ | |
|
201 | 0, /* tp_as_sequence */ | |
|
202 | 0, /* tp_as_mapping */ | |
|
203 | 0, /* tp_hash */ | |
|
204 | 0, /* tp_call */ | |
|
205 | 0, /* tp_str */ | |
|
206 | 0, /* tp_getattro */ | |
|
207 | 0, /* tp_setattro */ | |
|
208 | 0, /* tp_as_buffer */ | |
|
209 | Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /* tp_flags */ | |
|
210 | ZstdCompresssionWriter__doc__, /* tp_doc */ | |
|
211 | 0, /* tp_traverse */ | |
|
212 | 0, /* tp_clear */ | |
|
213 | 0, /* tp_richcompare */ | |
|
214 | 0, /* tp_weaklistoffset */ | |
|
215 | 0, /* tp_iter */ | |
|
216 | 0, /* tp_iternext */ | |
|
217 | ZstdCompressionWriter_methods, /* tp_methods */ | |
|
218 | 0, /* tp_members */ | |
|
219 | 0, /* tp_getset */ | |
|
220 | 0, /* tp_base */ | |
|
221 | 0, /* tp_dict */ | |
|
222 | 0, /* tp_descr_get */ | |
|
223 | 0, /* tp_descr_set */ | |
|
224 | 0, /* tp_dictoffset */ | |
|
225 | 0, /* tp_init */ | |
|
226 | 0, /* tp_alloc */ | |
|
227 | PyType_GenericNew, /* tp_new */ | |
|
228 | }; | |
|
229 | ||
|
230 | void compressionwriter_module_init(PyObject* mod) { | |
|
231 | Py_TYPE(&ZstdCompressionWriterType) = &PyType_Type; | |
|
232 | if (PyType_Ready(&ZstdCompressionWriterType) < 0) { | |
|
233 | return; | |
|
234 | } | |
|
235 | } |
@@ -0,0 +1,205 b'' | |||
|
1 | /** | |
|
2 | * Copyright (c) 2016-present, Gregory Szorc | |
|
3 | * All rights reserved. | |
|
4 | * | |
|
5 | * This software may be modified and distributed under the terms | |
|
6 | * of the BSD license. See the LICENSE file for details. | |
|
7 | */ | |
|
8 | ||
|
9 | #include "python-zstandard.h" | |
|
10 | ||
|
11 | extern PyObject* ZstdError; | |
|
12 | ||
|
13 | PyDoc_STRVAR(ZstdCompressionObj__doc__, | |
|
14 | "Perform compression using a standard library compatible API.\n" | |
|
15 | ); | |
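The "standard library compatible API" this docstring refers to is the `zlib.compressobj` / `bz2.BZ2Compressor` protocol: repeated `compress()` calls that may return partial (or empty) output, followed by exactly one `flush()`. A sketch of that protocol using zlib itself, which is the model the type mirrors:

```python
import zlib

def compress_chunks(chunks):
    """Compress an iterable of byte chunks using the compressobj protocol
    that ZstdCompressionObj mirrors: compress() per chunk, flush() once."""
    c = zlib.compressobj()
    out = []
    for chunk in chunks:
        data = c.compress(chunk)
        if data:             # compress() may buffer input and return b""
            out.append(data)
    out.append(c.flush())    # after flush(), further compress() calls error
    return b"".join(out)
```

The one-shot `flush()` rule is why the C implementation tracks a `flushed` flag and rejects `compress()` afterwards.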
|
16 | ||
|
17 | static void ZstdCompressionObj_dealloc(ZstdCompressionObj* self) { | |
|
18 | PyMem_Free(self->output.dst); | |
|
19 | self->output.dst = NULL; | |
|
20 | ||
|
21 | if (self->cstream) { | |
|
22 | ZSTD_freeCStream(self->cstream); | |
|
23 | self->cstream = NULL; | |
|
24 | } | |
|
25 | ||
|
26 | Py_XDECREF(self->compressor); | |
|
27 | ||
|
28 | PyObject_Del(self); | |
|
29 | } | |
|
30 | ||
|
31 | static PyObject* ZstdCompressionObj_compress(ZstdCompressionObj* self, PyObject* args) { | |
|
32 | const char* source; | |
|
33 | Py_ssize_t sourceSize; | |
|
34 | ZSTD_inBuffer input; | |
|
35 | size_t zresult; | |
|
36 | PyObject* result = NULL; | |
|
37 | Py_ssize_t resultSize = 0; | |
|
38 | ||
|
39 | if (self->flushed) { | |
|
40 | PyErr_SetString(ZstdError, "cannot call compress() after flush() has been called"); | |
|
41 | return NULL; | |
|
42 | } | |
|
43 | ||
|
44 | #if PY_MAJOR_VERSION >= 3 | |
|
45 | if (!PyArg_ParseTuple(args, "y#", &source, &sourceSize)) { | |
|
46 | #else | |
|
47 | if (!PyArg_ParseTuple(args, "s#", &source, &sourceSize)) { | |
|
48 | #endif | |
|
49 | return NULL; | |
|
50 | } | |
|
51 | ||
|
52 | input.src = source; | |
|
53 | input.size = sourceSize; | |
|
54 | input.pos = 0; | |
|
55 | ||
|
56 | while ((ssize_t)input.pos < sourceSize) { | |
|
57 | Py_BEGIN_ALLOW_THREADS | |
|
58 | zresult = ZSTD_compressStream(self->cstream, &self->output, &input); | |
|
59 | Py_END_ALLOW_THREADS | |
|
60 | ||
|
61 | if (ZSTD_isError(zresult)) { | |
|
62 | PyErr_Format(ZstdError, "zstd compress error: %s", ZSTD_getErrorName(zresult)); | |
|
63 | return NULL; | |
|
64 | } | |
|
65 | ||
|
66 | if (self->output.pos) { | |
|
67 | if (result) { | |
|
68 | resultSize = PyBytes_GET_SIZE(result); | |
|
69 | if (-1 == _PyBytes_Resize(&result, resultSize + self->output.pos)) { | |
|
70 | return NULL; | |
|
71 | } | |
|
72 | ||
|
73 | memcpy(PyBytes_AS_STRING(result) + resultSize, | |
|
74 | self->output.dst, self->output.pos); | |
|
75 | } | |
|
76 | else { | |
|
77 | result = PyBytes_FromStringAndSize(self->output.dst, self->output.pos); | |
|
78 | if (!result) { | |
|
79 | return NULL; | |
|
80 | } | |
|
81 | } | |
|
82 | ||
|
83 | self->output.pos = 0; | |
|
84 | } | |
|
85 | } | |
|
86 | ||
|
87 | if (result) { | |
|
88 | return result; | |
|
89 | } | |
|
90 | else { | |
|
91 | return PyBytes_FromString(""); | |
|
92 | } | |
|
93 | } | |
|
94 | ||
|
95 | static PyObject* ZstdCompressionObj_flush(ZstdCompressionObj* self) { | |
|
96 | size_t zresult; | |
|
97 | PyObject* result = NULL; | |
|
98 | Py_ssize_t resultSize = 0; | |
|
99 | ||
|
100 | if (self->flushed) { | |
|
101 | PyErr_SetString(ZstdError, "flush() already called"); | |
|
102 | return NULL; | |
|
103 | } | |
|
104 | ||
|
105 | self->flushed = 1; | |
|
106 | ||
|
107 | while (1) { | |
|
108 | zresult = ZSTD_endStream(self->cstream, &self->output); | |
|
109 | if (ZSTD_isError(zresult)) { | |
|
110 | PyErr_Format(ZstdError, "error ending compression stream: %s", | |
|
111 | ZSTD_getErrorName(zresult)); | |
|
112 | return NULL; | |
|
113 | } | |
|
114 | ||
|
115 | if (self->output.pos) { | |
|
116 | if (result) { | |
|
117 | resultSize = PyBytes_GET_SIZE(result); | |
|
118 | if (-1 == _PyBytes_Resize(&result, resultSize + self->output.pos)) { | |
|
119 | return NULL; | |
|
120 | } | |
|
121 | ||
|
122 | memcpy(PyBytes_AS_STRING(result) + resultSize, | |
|
123 | self->output.dst, self->output.pos); | |
|
124 | } | |
|
125 | else { | |
|
126 | result = PyBytes_FromStringAndSize(self->output.dst, self->output.pos); | |
|
127 | if (!result) { | |
|
128 | return NULL; | |
|
129 | } | |
|
130 | } | |
|
131 | ||
|
132 | self->output.pos = 0; | |
|
133 | } | |
|
134 | ||
|
135 | if (!zresult) { | |
|
136 | break; | |
|
137 | } | |
|
138 | } | |
|
139 | ||
|
140 | ZSTD_freeCStream(self->cstream); | |
|
141 | self->cstream = NULL; | |
|
142 | ||
|
143 | if (result) { | |
|
144 | return result; | |
|
145 | } | |
|
146 | else { | |
|
147 | return PyBytes_FromString(""); | |
|
148 | } | |
|
149 | } | |
|
150 | ||
|
151 | static PyMethodDef ZstdCompressionObj_methods[] = { | |
|
152 | { "compress", (PyCFunction)ZstdCompressionObj_compress, METH_VARARGS, | |
|
153 | PyDoc_STR("compress data") }, | |
|
154 | { "flush", (PyCFunction)ZstdCompressionObj_flush, METH_NOARGS, | |
|
155 | PyDoc_STR("finish compression operation") }, | |
|
156 | { NULL, NULL } | |
|
157 | }; | |
|
158 | ||
|
159 | PyTypeObject ZstdCompressionObjType = { | |
|
160 | PyVarObject_HEAD_INIT(NULL, 0) | |
|
161 | "zstd.ZstdCompressionObj", /* tp_name */ | |
|
162 | sizeof(ZstdCompressionObj), /* tp_basicsize */ | |
|
163 | 0, /* tp_itemsize */ | |
|
164 | (destructor)ZstdCompressionObj_dealloc, /* tp_dealloc */ | |
|
165 | 0, /* tp_print */ | |
|
166 | 0, /* tp_getattr */ | |
|
167 | 0, /* tp_setattr */ | |
|
168 | 0, /* tp_compare */ | |
|
169 | 0, /* tp_repr */ | |
|
170 | 0, /* tp_as_number */ | |
|
171 | 0, /* tp_as_sequence */ | |
|
172 | 0, /* tp_as_mapping */ | |
|
173 | 0, /* tp_hash */ | |
|
174 | 0, /* tp_call */ | |
|
175 | 0, /* tp_str */ | |
|
176 | 0, /* tp_getattro */ | |
|
177 | 0, /* tp_setattro */ | |
|
178 | 0, /* tp_as_buffer */ | |
|
179 | Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /* tp_flags */ | |
|
180 | ZstdCompressionObj__doc__, /* tp_doc */ | |
|
181 | 0, /* tp_traverse */ | |
|
182 | 0, /* tp_clear */ | |
|
183 | 0, /* tp_richcompare */ | |
|
184 | 0, /* tp_weaklistoffset */ | |
|
185 | 0, /* tp_iter */ | |
|
186 | 0, /* tp_iternext */ | |
|
187 | ZstdCompressionObj_methods, /* tp_methods */ | |
|
188 | 0, /* tp_members */ | |
|
189 | 0, /* tp_getset */ | |
|
190 | 0, /* tp_base */ | |
|
191 | 0, /* tp_dict */ | |
|
192 | 0, /* tp_descr_get */ | |
|
193 | 0, /* tp_descr_set */ | |
|
194 | 0, /* tp_dictoffset */ | |
|
195 | 0, /* tp_init */ | |
|
196 | 0, /* tp_alloc */ | |
|
197 | PyType_GenericNew, /* tp_new */ | |
|
198 | }; | |
|
199 | ||
|
200 | void compressobj_module_init(PyObject* module) { | |
|
201 | Py_TYPE(&ZstdCompressionObjType) = &PyType_Type; | |
|
202 | if (PyType_Ready(&ZstdCompressionObjType) < 0) { | |
|
203 | return; | |
|
204 | } | |
|
205 | } |
This diff has been collapsed as it changes many lines (757 lines changed)

@@ -0,0 +1,757 b'' | |||
|
1 | /** | |
|
2 | * Copyright (c) 2016-present, Gregory Szorc | |
|
3 | * All rights reserved. | |
|
4 | * | |
|
5 | * This software may be modified and distributed under the terms | |
|
6 | * of the BSD license. See the LICENSE file for details. | |
|
7 | */ | |
|
8 | ||
|
9 | #include "python-zstandard.h" | |
|
10 | ||
|
11 | extern PyObject* ZstdError; | |
|
12 | ||
|
13 | /** | |
|
14 | * Initialize a zstd CStream from a ZstdCompressor instance. | |
|
15 | * | |
|
16 | * Returns a ZSTD_CStream on success or NULL on failure. If NULL, a Python | |
|
17 | * exception will be set. | |
|
18 | */ | |
|
19 | ZSTD_CStream* CStream_from_ZstdCompressor(ZstdCompressor* compressor, Py_ssize_t sourceSize) { | |
|
20 | ZSTD_CStream* cstream; | |
|
21 | ZSTD_parameters zparams; | |
|
22 | void* dictData = NULL; | |
|
23 | size_t dictSize = 0; | |
|
24 | size_t zresult; | |
|
25 | ||
|
26 | cstream = ZSTD_createCStream(); | |
|
27 | if (!cstream) { | |
|
28 | PyErr_SetString(ZstdError, "cannot create CStream"); | |
|
29 | return NULL; | |
|
30 | } | |
|
31 | ||
|
32 | if (compressor->dict) { | |
|
33 | dictData = compressor->dict->dictData; | |
|
34 | dictSize = compressor->dict->dictSize; | |
|
35 | } | |
|
36 | ||
|
37 | memset(&zparams, 0, sizeof(zparams)); | |
|
38 | if (compressor->cparams) { | |
|
39 | ztopy_compression_parameters(compressor->cparams, &zparams.cParams); | |
|
40 | /* Do NOT call ZSTD_adjustCParams() here because the compression params | |
|
41 | come from the user. */ | |
|
42 | } | |
|
43 | else { | |
|
44 | zparams.cParams = ZSTD_getCParams(compressor->compressionLevel, sourceSize, dictSize); | |
|
45 | } | |
|
46 | ||
|
47 | zparams.fParams = compressor->fparams; | |
|
48 | ||
|
49 | zresult = ZSTD_initCStream_advanced(cstream, dictData, dictSize, zparams, sourceSize); | |
|
50 | ||
|
51 | if (ZSTD_isError(zresult)) { | |
|
52 | ZSTD_freeCStream(cstream); | |
|
53 | PyErr_Format(ZstdError, "cannot init CStream: %s", ZSTD_getErrorName(zresult)); | |
|
54 | return NULL; | |
|
55 | } | |
|
56 | ||
|
57 | return cstream; | |
|
58 | } | |
|
59 | ||
|
60 | ||
|
61 | PyDoc_STRVAR(ZstdCompressor__doc__, | |
|
62 | "ZstdCompressor(level=None, dict_data=None, compression_params=None)\n" | |
|
63 | "\n" | |
|
64 | "Create an object used to perform Zstandard compression.\n" | |
|
65 | "\n" | |
|
66 | "An instance can compress data various ways. Instances can be used multiple\n" | |
|
67 | "times. Each compression operation will use the compression parameters\n" | |
|
68 | "defined at construction time.\n" | |
|
69 | "\n" | |
|
70 | "Compression can be configured via the following named arguments:\n" | 
|
71 | "\n" | |
|
72 | "level\n" | |
|
73 | " Integer compression level.\n" | |
|
74 | "dict_data\n" | |
|
75 | " A ``ZstdCompressionDict`` to be used to compress with dictionary data.\n" | |
|
76 | "compression_params\n" | |
|
77 | " A ``CompressionParameters`` instance defining low-level compression" | |
|
78 | " parameters. If defined, this will overwrite the ``level`` argument.\n" | |
|
79 | "write_checksum\n" | |
|
80 | " If True, a 4 byte content checksum will be written with the compressed\n" | |
|
81 | " data, allowing the decompressor to perform content verification.\n" | |
|
82 | "write_content_size\n" | |
|
83 | " If True, the decompressed content size will be included in the header of\n" | |
|
84 | " the compressed data. This data will only be written if the compressor\n" | |
|
85 | " knows the size of the input data.\n" | |
|
86 | "write_dict_id\n" | |
|
87 | " Determines whether the dictionary ID will be written into the compressed\n" | |
|
88 | " data. Defaults to True. Only adds content to the compressed data if\n" | |
|
89 | " a dictionary is being used.\n" | |
|
90 | ); | |
|
91 | ||
|
92 | static int ZstdCompressor_init(ZstdCompressor* self, PyObject* args, PyObject* kwargs) { | |
|
93 | static char* kwlist[] = { | |
|
94 | "level", | |
|
95 | "dict_data", | |
|
96 | "compression_params", | |
|
97 | "write_checksum", | |
|
98 | "write_content_size", | |
|
99 | "write_dict_id", | |
|
100 | NULL | |
|
101 | }; | |
|
102 | ||
|
103 | int level = 3; | |
|
104 | ZstdCompressionDict* dict = NULL; | |
|
105 | CompressionParametersObject* params = NULL; | |
|
106 | PyObject* writeChecksum = NULL; | |
|
107 | PyObject* writeContentSize = NULL; | |
|
108 | PyObject* writeDictID = NULL; | |
|
109 | ||
|
110 | self->dict = NULL; | |
|
111 | self->cparams = NULL; | |
|
112 | self->cdict = NULL; | |
|
113 | ||
|
114 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|iO!O!OOO", kwlist, | |
|
115 | &level, &ZstdCompressionDictType, &dict, | |
|
116 | &CompressionParametersType, ¶ms, | |
|
117 | &writeChecksum, &writeContentSize, &writeDictID)) { | |
|
118 | return -1; | |
|
119 | } | |
|
120 | ||
|
121 | if (level < 1) { | |
|
122 | PyErr_SetString(PyExc_ValueError, "level must be greater than 0"); | |
|
123 | return -1; | |
|
124 | } | |
|
125 | ||
|
126 | if (level > ZSTD_maxCLevel()) { | |
|
127 | PyErr_Format(PyExc_ValueError, "level must be less than %d", | |
|
128 | ZSTD_maxCLevel() + 1); | |
|
129 | return -1; | |
|
130 | } | |
|
131 | ||
|
132 | self->compressionLevel = level; | |
|
133 | ||
|
134 | if (dict) { | |
|
135 | self->dict = dict; | |
|
136 | Py_INCREF(dict); | |
|
137 | } | |
|
138 | ||
|
139 | if (params) { | |
|
140 | self->cparams = params; | |
|
141 | Py_INCREF(params); | |
|
142 | } | |
|
143 | ||
|
144 | memset(&self->fparams, 0, sizeof(self->fparams)); | |
|
145 | ||
|
146 | if (writeChecksum && PyObject_IsTrue(writeChecksum)) { | |
|
147 | self->fparams.checksumFlag = 1; | |
|
148 | } | |
|
149 | if (writeContentSize && PyObject_IsTrue(writeContentSize)) { | |
|
150 | self->fparams.contentSizeFlag = 1; | |
|
151 | } | |
|
152 | if (writeDictID && PyObject_Not(writeDictID)) { | |
|
153 | self->fparams.noDictIDFlag = 1; | |
|
154 | } | |
|
155 | ||
|
156 | return 0; | |
|
157 | } | |
|
158 | ||
|
159 | static void ZstdCompressor_dealloc(ZstdCompressor* self) { | |
|
160 | Py_XDECREF(self->cparams); | |
|
161 | Py_XDECREF(self->dict); | |
|
162 | ||
|
163 | if (self->cdict) { | |
|
164 | ZSTD_freeCDict(self->cdict); | |
|
165 | self->cdict = NULL; | |
|
166 | } | |
|
167 | ||
|
168 | PyObject_Del(self); | |
|
169 | } | |
|
170 | ||
|
171 | PyDoc_STRVAR(ZstdCompressor_copy_stream__doc__, | |
|
172 | "copy_stream(ifh, ofh[, size=0, read_size=default, write_size=default])\n" | |
|
173 | "Compress data between streams\n" | 
|
174 | "\n" | |
|
175 | "Data will be read from ``ifh``, compressed, and written to ``ofh``.\n" | |
|
176 | "``ifh`` must have a ``read(size)`` method. ``ofh`` must have a ``write(data)``\n" | |
|
177 | "method.\n" | |
|
178 | "\n" | |
|
179 | "An optional ``size`` argument specifies the size of the source stream.\n" | |
|
180 | "If defined, compression parameters will be tuned based on the size.\n" | |
|
181 | "\n" | |
|
182 | "Optional arguments ``read_size`` and ``write_size`` define the chunk sizes\n" | |
|
183 | "of ``read()`` and ``write()`` operations, respectively. By default, they use\n" | |
|
184 | "the default compression stream input and output sizes, respectively.\n" | |
|
185 | ); | |
|
186 | ||
|
187 | static PyObject* ZstdCompressor_copy_stream(ZstdCompressor* self, PyObject* args, PyObject* kwargs) { | |
|
188 | static char* kwlist[] = { | |
|
189 | "ifh", | |
|
190 | "ofh", | |
|
191 | "size", | |
|
192 | "read_size", | |
|
193 | "write_size", | |
|
194 | NULL | |
|
195 | }; | |
|
196 | ||
|
197 | PyObject* source; | |
|
198 | PyObject* dest; | |
|
199 | Py_ssize_t sourceSize = 0; | |
|
200 | size_t inSize = ZSTD_CStreamInSize(); | |
|
201 | size_t outSize = ZSTD_CStreamOutSize(); | |
|
202 | ZSTD_CStream* cstream; | |
|
203 | ZSTD_inBuffer input; | |
|
204 | 	ZSTD_outBuffer output = { NULL, 0, 0 }; /* dst must be NULL in case we goto finally before allocating */ | 
|
205 | Py_ssize_t totalRead = 0; | |
|
206 | Py_ssize_t totalWrite = 0; | |
|
207 | char* readBuffer; | |
|
208 | Py_ssize_t readSize; | |
|
209 | PyObject* readResult; | |
|
210 | PyObject* res = NULL; | |
|
211 | size_t zresult; | |
|
212 | PyObject* writeResult; | |
|
213 | PyObject* totalReadPy; | |
|
214 | PyObject* totalWritePy; | |
|
215 | ||
|
216 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "OO|nkk", kwlist, &source, &dest, &sourceSize, | |
|
217 | &inSize, &outSize)) { | |
|
218 | return NULL; | |
|
219 | } | |
|
220 | ||
|
221 | if (!PyObject_HasAttrString(source, "read")) { | |
|
222 | PyErr_SetString(PyExc_ValueError, "first argument must have a read() method"); | |
|
223 | return NULL; | |
|
224 | } | |
|
225 | ||
|
226 | if (!PyObject_HasAttrString(dest, "write")) { | |
|
227 | PyErr_SetString(PyExc_ValueError, "second argument must have a write() method"); | |
|
228 | return NULL; | |
|
229 | } | |
|
230 | ||
|
231 | cstream = CStream_from_ZstdCompressor(self, sourceSize); | |
|
232 | if (!cstream) { | |
|
233 | res = NULL; | |
|
234 | goto finally; | |
|
235 | } | |
|
236 | ||
|
237 | output.dst = PyMem_Malloc(outSize); | |
|
238 | if (!output.dst) { | |
|
239 | PyErr_NoMemory(); | |
|
240 | res = NULL; | |
|
241 | goto finally; | |
|
242 | } | |
|
243 | output.size = outSize; | |
|
244 | output.pos = 0; | |
|
245 | ||
|
246 | while (1) { | |
|
247 | /* Try to read from source stream. */ | |
|
248 | readResult = PyObject_CallMethod(source, "read", "n", inSize); | |
|
249 | if (!readResult) { | |
|
250 | PyErr_SetString(ZstdError, "could not read() from source"); | |
|
251 | goto finally; | |
|
252 | } | |
|
253 | ||
|
254 | PyBytes_AsStringAndSize(readResult, &readBuffer, &readSize); | |
|
255 | ||
|
256 | /* If no data was read, we're at EOF. */ | |
|
257 | if (0 == readSize) { | |
|
258 | break; | |
|
259 | } | |
|
260 | ||
|
261 | totalRead += readSize; | |
|
262 | ||
|
263 | /* Send data to compressor */ | |
|
264 | input.src = readBuffer; | |
|
265 | input.size = readSize; | |
|
266 | input.pos = 0; | |
|
267 | ||
|
268 | while (input.pos < input.size) { | |
|
269 | Py_BEGIN_ALLOW_THREADS | |
|
270 | zresult = ZSTD_compressStream(cstream, &output, &input); | |
|
271 | Py_END_ALLOW_THREADS | |
|
272 | ||
|
273 | if (ZSTD_isError(zresult)) { | |
|
274 | res = NULL; | |
|
275 | PyErr_Format(ZstdError, "zstd compress error: %s", ZSTD_getErrorName(zresult)); | |
|
276 | goto finally; | |
|
277 | } | |
|
278 | ||
|
279 | if (output.pos) { | |
|
280 | #if PY_MAJOR_VERSION >= 3 | |
|
281 | writeResult = PyObject_CallMethod(dest, "write", "y#", | |
|
282 | #else | |
|
283 | writeResult = PyObject_CallMethod(dest, "write", "s#", | |
|
284 | #endif | |
|
285 | output.dst, output.pos); | |
|
286 | Py_XDECREF(writeResult); | |
|
287 | totalWrite += output.pos; | |
|
288 | output.pos = 0; | |
|
289 | } | |
|
290 | } | |
|
291 | } | |
|
292 | ||
|
293 | /* We've finished reading. Now flush the compressor stream. */ | |
|
294 | while (1) { | |
|
295 | zresult = ZSTD_endStream(cstream, &output); | |
|
296 | if (ZSTD_isError(zresult)) { | |
|
297 | PyErr_Format(ZstdError, "error ending compression stream: %s", | |
|
298 | ZSTD_getErrorName(zresult)); | |
|
299 | res = NULL; | |
|
300 | goto finally; | |
|
301 | } | |
|
302 | ||
|
303 | if (output.pos) { | |
|
304 | #if PY_MAJOR_VERSION >= 3 | |
|
305 | writeResult = PyObject_CallMethod(dest, "write", "y#", | |
|
306 | #else | |
|
307 | writeResult = PyObject_CallMethod(dest, "write", "s#", | |
|
308 | #endif | |
|
309 | output.dst, output.pos); | |
|
310 | totalWrite += output.pos; | |
|
311 | Py_XDECREF(writeResult); | |
|
312 | output.pos = 0; | |
|
313 | } | |
|
314 | ||
|
315 | if (!zresult) { | |
|
316 | break; | |
|
317 | } | |
|
318 | } | |
|
319 | ||
|
320 | ZSTD_freeCStream(cstream); | |
|
321 | cstream = NULL; | |
|
322 | ||
|
323 | totalReadPy = PyLong_FromSsize_t(totalRead); | |
|
324 | totalWritePy = PyLong_FromSsize_t(totalWrite); | |
|
325 | res = PyTuple_Pack(2, totalReadPy, totalWritePy); | |
|
326 | Py_DecRef(totalReadPy); | |
|
327 | Py_DecRef(totalWritePy); | |
|
328 | ||
|
329 | finally: | |
|
330 | if (output.dst) { | |
|
331 | PyMem_Free(output.dst); | |
|
332 | } | |
|
333 | ||
|
334 | if (cstream) { | |
|
335 | ZSTD_freeCStream(cstream); | |
|
336 | } | |
|
337 | ||
|
338 | return res; | |
|
339 | } | |
|
340 | ||
|
341 | PyDoc_STRVAR(ZstdCompressor_compress__doc__, | |
|
342 | "compress(data)\n" | |
|
343 | "\n" | |
|
344 | "Compress data in a single operation.\n" | |
|
345 | "\n" | |
|
346 | "This is the simplest mechanism to perform compression: simply pass in a\n" | |
|
347 | "value and get a compressed value back. It is also the most prone to abuse.\n" | 
|
348 | "The input and output values must fit in memory, so passing in very large\n" | |
|
349 | "values can result in excessive memory usage. For this reason, one of the\n" | |
|
350 | "streaming based APIs is preferred for larger values.\n" | |
|
351 | ); | |
|
352 | ||
|
353 | static PyObject* ZstdCompressor_compress(ZstdCompressor* self, PyObject* args) { | |
|
354 | const char* source; | |
|
355 | Py_ssize_t sourceSize; | |
|
356 | size_t destSize; | |
|
357 | ZSTD_CCtx* cctx; | |
|
358 | PyObject* output; | |
|
359 | char* dest; | |
|
360 | void* dictData = NULL; | |
|
361 | size_t dictSize = 0; | |
|
362 | size_t zresult; | |
|
363 | ZSTD_parameters zparams; | |
|
364 | ZSTD_customMem zmem; | |
|
365 | ||
|
366 | #if PY_MAJOR_VERSION >= 3 | |
|
367 | if (!PyArg_ParseTuple(args, "y#", &source, &sourceSize)) { | |
|
368 | #else | |
|
369 | if (!PyArg_ParseTuple(args, "s#", &source, &sourceSize)) { | |
|
370 | #endif | |
|
371 | return NULL; | |
|
372 | } | |
|
373 | ||
|
374 | destSize = ZSTD_compressBound(sourceSize); | |
|
375 | output = PyBytes_FromStringAndSize(NULL, destSize); | |
|
376 | if (!output) { | |
|
377 | return NULL; | |
|
378 | } | |
|
379 | ||
|
380 | dest = PyBytes_AsString(output); | |
|
381 | ||
|
382 | cctx = ZSTD_createCCtx(); | |
|
383 | if (!cctx) { | |
|
384 | Py_DECREF(output); | |
|
385 | PyErr_SetString(ZstdError, "could not create CCtx"); | |
|
386 | return NULL; | |
|
387 | } | |
|
388 | ||
|
389 | if (self->dict) { | |
|
390 | dictData = self->dict->dictData; | |
|
391 | dictSize = self->dict->dictSize; | |
|
392 | } | |
|
393 | ||
|
394 | memset(&zparams, 0, sizeof(zparams)); | |
|
395 | if (!self->cparams) { | |
|
396 | zparams.cParams = ZSTD_getCParams(self->compressionLevel, sourceSize, dictSize); | |
|
397 | } | |
|
398 | else { | |
|
399 | ztopy_compression_parameters(self->cparams, &zparams.cParams); | |
|
400 | /* Do NOT call ZSTD_adjustCParams() here because the compression params | |
|
401 | come from the user. */ | |
|
402 | } | |
|
403 | ||
|
404 | zparams.fParams = self->fparams; | |
|
405 | ||
|
406 | /* The raw dict data has to be processed before it can be used. Since this | |
|
407 | adds overhead - especially if multiple dictionary compression operations | |
|
408 | are performed on the same ZstdCompressor instance - we create a | |
|
409 | ZSTD_CDict once and reuse it for all operations. */ | |
|
410 | ||
|
411 | /* TODO the zparams (which can be derived from the source data size) used | |
|
412 | on first invocation are effectively reused for subsequent operations. This | |
|
413 | may not be appropriate if input sizes vary significantly and could affect | |
|
414 | chosen compression parameters. | |
|
415 | https://github.com/facebook/zstd/issues/358 tracks this issue. */ | |
|
416 | if (dictData && !self->cdict) { | |
|
417 | Py_BEGIN_ALLOW_THREADS | |
|
418 | memset(&zmem, 0, sizeof(zmem)); | |
|
419 | self->cdict = ZSTD_createCDict_advanced(dictData, dictSize, zparams, zmem); | |
|
420 | Py_END_ALLOW_THREADS | |
|
421 | ||
|
422 | if (!self->cdict) { | |
|
423 | Py_DECREF(output); | |
|
424 | ZSTD_freeCCtx(cctx); | |
|
425 | PyErr_SetString(ZstdError, "could not create compression dictionary"); | |
|
426 | return NULL; | |
|
427 | } | |
|
428 | } | |
|
429 | ||
|
430 | Py_BEGIN_ALLOW_THREADS | |
|
431 | /* By avoiding ZSTD_compress(), we don't necessarily write out content | |
|
432 | size. This means the argument to ZstdCompressor to control frame | |
|
433 | parameters is honored. */ | |
|
434 | if (self->cdict) { | |
|
435 | zresult = ZSTD_compress_usingCDict(cctx, dest, destSize, | |
|
436 | source, sourceSize, self->cdict); | |
|
437 | } | |
|
438 | else { | |
|
439 | zresult = ZSTD_compress_advanced(cctx, dest, destSize, | |
|
440 | source, sourceSize, dictData, dictSize, zparams); | |
|
441 | } | |
|
442 | Py_END_ALLOW_THREADS | |
|
443 | ||
|
444 | ZSTD_freeCCtx(cctx); | |
|
445 | ||
|
446 | if (ZSTD_isError(zresult)) { | |
|
447 | PyErr_Format(ZstdError, "cannot compress: %s", ZSTD_getErrorName(zresult)); | |
|
448 | Py_CLEAR(output); | |
|
449 | return NULL; | |
|
450 | } | |
|
451 | else { | |
|
452 | Py_SIZE(output) = zresult; | |
|
453 | } | |
|
454 | ||
|
455 | return output; | |
|
456 | } | |
|
457 | ||
|
458 | PyDoc_STRVAR(ZstdCompressionObj__doc__, | |
|
459 | "compressobj()\n" | |
|
460 | "\n" | |
|
461 | "Return an object exposing ``compress(data)`` and ``flush()`` methods.\n" | |
|
462 | "\n" | |
|
463 | "The returned object exposes an API similar to ``zlib.compressobj`` and\n" | |
|
464 | "``bz2.BZ2Compressor`` so that callers can swap in the zstd compressor\n" | |
|
465 | "without changing how compression is performed.\n" | |
|
466 | ); | |
|
467 | ||
|
468 | static ZstdCompressionObj* ZstdCompressor_compressobj(ZstdCompressor* self, PyObject* args, PyObject* kwargs) { | |
|
469 | static char* kwlist[] = { | |
|
470 | "size", | |
|
471 | NULL | |
|
472 | }; | |
|
473 | ||
|
474 | Py_ssize_t inSize = 0; | |
|
475 | size_t outSize = ZSTD_CStreamOutSize(); | |
|
476 | ZstdCompressionObj* result = PyObject_New(ZstdCompressionObj, &ZstdCompressionObjType); | |
|
477 | if (!result) { | |
|
478 | return NULL; | |
|
479 | } | |
|
480 | ||
|
481 | 	if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|n", kwlist, &inSize)) { | 
|
482 | 		Py_DECREF(result); /* result was allocated above; avoid leaking it */ | 
|
483 | 		return NULL; | 
|
484 | 	} | 
|
484 | ||
|
485 | result->cstream = CStream_from_ZstdCompressor(self, inSize); | |
|
486 | if (!result->cstream) { | |
|
487 | Py_DECREF(result); | |
|
488 | return NULL; | |
|
489 | } | |
|
490 | ||
|
491 | result->output.dst = PyMem_Malloc(outSize); | |
|
492 | if (!result->output.dst) { | |
|
493 | PyErr_NoMemory(); | |
|
494 | Py_DECREF(result); | |
|
495 | return NULL; | |
|
496 | } | |
|
497 | result->output.size = outSize; | |
|
498 | result->output.pos = 0; | |
|
499 | ||
|
500 | result->compressor = self; | |
|
501 | Py_INCREF(result->compressor); | |
|
502 | ||
|
503 | result->flushed = 0; | |
|
504 | ||
|
505 | return result; | |
|
506 | } | |
|
507 | ||
|
508 | PyDoc_STRVAR(ZstdCompressor_read_from__doc__, | |
|
509 | "read_from(reader, [size=0, read_size=default, write_size=default])\n" | |
|
510 | "Read uncompressed data from a reader and return an iterator\n" | 
|
511 | "\n" | |
|
512 | "Returns an iterator of compressed data produced from reading from ``reader``.\n" | |
|
513 | "\n" | |
|
514 | "Uncompressed data will be obtained from ``reader`` by calling its\n" | 
|
515 | "``read(size)`` method. The source data will be streamed into a\n" | 
|
516 | "compressor. As compressed data is available, it will be exposed to the\n" | |
|
517 | "iterator.\n" | |
|
518 | "\n" | |
|
519 | "Data is read from the source in chunks of ``read_size``. Compressed chunks\n" | |
|
520 | "are at most ``write_size`` bytes. Both values default to the zstd input and\n" | |
|
521 | "output defaults, respectively.\n" | 
|
522 | "\n" | |
|
523 | "The caller is partially in control of how fast data is fed into the\n" | |
|
524 | "compressor by how it consumes the returned iterator. The compressor will\n" | |
|
525 | "not consume from the reader unless the caller consumes from the iterator.\n" | |
|
526 | ); | |
|
527 | ||
|
528 | static ZstdCompressorIterator* ZstdCompressor_read_from(ZstdCompressor* self, PyObject* args, PyObject* kwargs) { | |
|
529 | static char* kwlist[] = { | |
|
530 | "reader", | |
|
531 | "size", | |
|
532 | "read_size", | |
|
533 | "write_size", | |
|
534 | NULL | |
|
535 | }; | |
|
536 | ||
|
537 | PyObject* reader; | |
|
538 | Py_ssize_t sourceSize = 0; | |
|
539 | size_t inSize = ZSTD_CStreamInSize(); | |
|
540 | size_t outSize = ZSTD_CStreamOutSize(); | |
|
541 | ZstdCompressorIterator* result; | |
|
542 | ||
|
543 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|nkk", kwlist, &reader, &sourceSize, | |
|
544 | &inSize, &outSize)) { | |
|
545 | return NULL; | |
|
546 | } | |
|
547 | ||
|
548 | result = PyObject_New(ZstdCompressorIterator, &ZstdCompressorIteratorType); | |
|
549 | if (!result) { | |
|
550 | return NULL; | |
|
551 | } | |
|
552 | ||
|
553 | result->compressor = NULL; | |
|
554 | result->reader = NULL; | |
|
555 | result->buffer = NULL; | |
|
556 | result->cstream = NULL; | |
|
557 | result->input.src = NULL; | |
|
558 | result->output.dst = NULL; | |
|
559 | result->readResult = NULL; | |
|
560 | ||
|
561 | if (PyObject_HasAttrString(reader, "read")) { | |
|
562 | result->reader = reader; | |
|
563 | Py_INCREF(result->reader); | |
|
564 | } | |
|
565 | else if (1 == PyObject_CheckBuffer(reader)) { | |
|
566 | result->buffer = PyMem_Malloc(sizeof(Py_buffer)); | |
|
567 | if (!result->buffer) { | |
|
568 | goto except; | |
|
569 | } | |
|
570 | ||
|
571 | memset(result->buffer, 0, sizeof(Py_buffer)); | |
|
572 | ||
|
573 | if (0 != PyObject_GetBuffer(reader, result->buffer, PyBUF_CONTIG_RO)) { | |
|
574 | goto except; | |
|
575 | } | |
|
576 | ||
|
577 | result->bufferOffset = 0; | |
|
578 | sourceSize = result->buffer->len; | |
|
579 | } | |
|
580 | else { | |
|
581 | PyErr_SetString(PyExc_ValueError, | |
|
 582 | "must pass an object with a read() method or that conforms to the buffer protocol"); | |
|
583 | goto except; | |
|
584 | } | |
|
585 | ||
|
586 | result->compressor = self; | |
|
587 | Py_INCREF(result->compressor); | |
|
588 | ||
|
589 | result->sourceSize = sourceSize; | |
|
590 | result->cstream = CStream_from_ZstdCompressor(self, sourceSize); | |
|
591 | if (!result->cstream) { | |
|
592 | goto except; | |
|
593 | } | |
|
594 | ||
|
595 | result->inSize = inSize; | |
|
596 | result->outSize = outSize; | |
|
597 | ||
|
598 | result->output.dst = PyMem_Malloc(outSize); | |
|
599 | if (!result->output.dst) { | |
|
600 | PyErr_NoMemory(); | |
|
601 | goto except; | |
|
602 | } | |
|
603 | result->output.size = outSize; | |
|
604 | result->output.pos = 0; | |
|
605 | ||
|
606 | result->input.src = NULL; | |
|
607 | result->input.size = 0; | |
|
608 | result->input.pos = 0; | |
|
609 | ||
|
610 | result->finishedInput = 0; | |
|
611 | result->finishedOutput = 0; | |
|
612 | ||
|
613 | goto finally; | |
|
614 | ||
|
615 | except: | |
|
616 | if (result->cstream) { | |
|
617 | ZSTD_freeCStream(result->cstream); | |
|
618 | result->cstream = NULL; | |
|
619 | } | |
|
620 | ||
|
621 | Py_DecRef((PyObject*)result->compressor); | |
|
622 | Py_DecRef(result->reader); | |
|
623 | ||
|
624 | Py_DECREF(result); | |
|
625 | result = NULL; | |
|
626 | ||
|
627 | finally: | |
|
628 | return result; | |
|
629 | } | |
|
630 | ||
|
631 | PyDoc_STRVAR(ZstdCompressor_write_to___doc__, | |
|
632 | "Create a context manager to write compressed data to an object.\n" | |
|
633 | "\n" | |
|
634 | "The passed object must have a ``write()`` method.\n" | |
|
635 | "\n" | |
|
636 | "The caller feeds input data to the object by calling ``compress(data)``.\n" | |
|
637 | "Compressed data is written to the argument given to this function.\n" | |
|
638 | "\n" | |
|
639 | "The function takes an optional ``size`` argument indicating the total size\n" | |
|
640 | "of the eventual input. If specified, the size will influence compression\n" | |
|
641 | "parameter tuning and could result in the size being written into the\n" | |
|
642 | "header of the compressed data.\n" | |
|
643 | "\n" | |
|
644 | "An optional ``write_size`` argument is also accepted. It defines the maximum\n" | |
|
645 | "byte size of chunks fed to ``write()``. By default, it uses the zstd default\n" | |
|
646 | "for a compressor output stream.\n" | |
|
647 | ); | |
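The context-manager contract described above can be illustrated with a short Python sketch. Again this is a stand-in, not this module's implementation: `zlib.compressobj` substitutes for the zstd C stream, and the class name `write_to` mirrors the API shape only.

```python
import io
import zlib

class write_to:
    # Sketch of the write_to() context manager: compress(data) feeds the
    # stream, compressed output is written to the wrapped writer, and the
    # stream is finalized on clean exit. zlib stands in for zstd (assumption).
    def __init__(self, writer):
        self._writer = writer
        self._cobj = None

    def __enter__(self):
        self._cobj = zlib.compressobj()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type is None:
            self._writer.write(self._cobj.flush())  # emit the stream epilogue
        return False

    def compress(self, data):
        out = self._cobj.compress(data)
        if out:
            self._writer.write(out)

dest = io.BytesIO()
with write_to(dest) as compressor:
    compressor.compress(b"data " * 500)
assert zlib.decompress(dest.getvalue()) == b"data " * 500
```

The design choice matches the C code below: the compression stream is created on `__enter__` and torn down on `__exit__`, so `compress()` outside an active context is an error.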
|
648 | ||
|
649 | static ZstdCompressionWriter* ZstdCompressor_write_to(ZstdCompressor* self, PyObject* args, PyObject* kwargs) { | |
|
650 | static char* kwlist[] = { | |
|
651 | "writer", | |
|
652 | "size", | |
|
653 | "write_size", | |
|
654 | NULL | |
|
655 | }; | |
|
656 | ||
|
657 | PyObject* writer; | |
|
658 | ZstdCompressionWriter* result; | |
|
659 | Py_ssize_t sourceSize = 0; | |
|
660 | size_t outSize = ZSTD_CStreamOutSize(); | |
|
661 | ||
|
662 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|nk", kwlist, &writer, &sourceSize, | |
|
663 | &outSize)) { | |
|
664 | return NULL; | |
|
665 | } | |
|
666 | ||
|
667 | if (!PyObject_HasAttrString(writer, "write")) { | |
|
668 | PyErr_SetString(PyExc_ValueError, "must pass an object with a write() method"); | |
|
669 | return NULL; | |
|
670 | } | |
|
671 | ||
|
672 | result = PyObject_New(ZstdCompressionWriter, &ZstdCompressionWriterType); | |
|
673 | if (!result) { | |
|
674 | return NULL; | |
|
675 | } | |
|
676 | ||
|
677 | result->compressor = self; | |
|
678 | Py_INCREF(result->compressor); | |
|
679 | ||
|
680 | result->writer = writer; | |
|
681 | Py_INCREF(result->writer); | |
|
682 | ||
|
683 | result->sourceSize = sourceSize; | |
|
684 | ||
|
685 | result->outSize = outSize; | |
|
686 | ||
|
687 | result->entered = 0; | |
|
688 | result->cstream = NULL; | |
|
689 | ||
|
690 | return result; | |
|
691 | } | |
|
692 | ||
|
693 | static PyMethodDef ZstdCompressor_methods[] = { | |
|
694 | { "compress", (PyCFunction)ZstdCompressor_compress, METH_VARARGS, | |
|
695 | ZstdCompressor_compress__doc__ }, | |
|
696 | { "compressobj", (PyCFunction)ZstdCompressor_compressobj, | |
|
697 | METH_VARARGS | METH_KEYWORDS, ZstdCompressionObj__doc__ }, | |
|
698 | { "copy_stream", (PyCFunction)ZstdCompressor_copy_stream, | |
|
699 | METH_VARARGS | METH_KEYWORDS, ZstdCompressor_copy_stream__doc__ }, | |
|
700 | { "read_from", (PyCFunction)ZstdCompressor_read_from, | |
|
701 | METH_VARARGS | METH_KEYWORDS, ZstdCompressor_read_from__doc__ }, | |
|
702 | { "write_to", (PyCFunction)ZstdCompressor_write_to, | |
|
703 | METH_VARARGS | METH_KEYWORDS, ZstdCompressor_write_to___doc__ }, | |
|
704 | { NULL, NULL } | |
|
705 | }; | |
|
706 | ||
|
707 | PyTypeObject ZstdCompressorType = { | |
|
708 | PyVarObject_HEAD_INIT(NULL, 0) | |
|
709 | "zstd.ZstdCompressor", /* tp_name */ | |
|
710 | sizeof(ZstdCompressor), /* tp_basicsize */ | |
|
711 | 0, /* tp_itemsize */ | |
|
712 | (destructor)ZstdCompressor_dealloc, /* tp_dealloc */ | |
|
713 | 0, /* tp_print */ | |
|
714 | 0, /* tp_getattr */ | |
|
715 | 0, /* tp_setattr */ | |
|
716 | 0, /* tp_compare */ | |
|
717 | 0, /* tp_repr */ | |
|
718 | 0, /* tp_as_number */ | |
|
719 | 0, /* tp_as_sequence */ | |
|
720 | 0, /* tp_as_mapping */ | |
|
721 | 0, /* tp_hash */ | |
|
722 | 0, /* tp_call */ | |
|
723 | 0, /* tp_str */ | |
|
724 | 0, /* tp_getattro */ | |
|
725 | 0, /* tp_setattro */ | |
|
726 | 0, /* tp_as_buffer */ | |
|
727 | Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /* tp_flags */ | |
|
728 | ZstdCompressor__doc__, /* tp_doc */ | |
|
729 | 0, /* tp_traverse */ | |
|
730 | 0, /* tp_clear */ | |
|
731 | 0, /* tp_richcompare */ | |
|
732 | 0, /* tp_weaklistoffset */ | |
|
733 | 0, /* tp_iter */ | |
|
734 | 0, /* tp_iternext */ | |
|
735 | ZstdCompressor_methods, /* tp_methods */ | |
|
736 | 0, /* tp_members */ | |
|
737 | 0, /* tp_getset */ | |
|
738 | 0, /* tp_base */ | |
|
739 | 0, /* tp_dict */ | |
|
740 | 0, /* tp_descr_get */ | |
|
741 | 0, /* tp_descr_set */ | |
|
742 | 0, /* tp_dictoffset */ | |
|
743 | (initproc)ZstdCompressor_init, /* tp_init */ | |
|
744 | 0, /* tp_alloc */ | |
|
745 | PyType_GenericNew, /* tp_new */ | |
|
746 | }; | |
|
747 | ||
|
748 | void compressor_module_init(PyObject* mod) { | |
|
749 | Py_TYPE(&ZstdCompressorType) = &PyType_Type; | |
|
750 | if (PyType_Ready(&ZstdCompressorType) < 0) { | |
|
751 | return; | |
|
752 | } | |
|
753 | ||
|
754 | Py_INCREF((PyObject*)&ZstdCompressorType); | |
|
755 | PyModule_AddObject(mod, "ZstdCompressor", | |
|
756 | (PyObject*)&ZstdCompressorType); | |
|
757 | } |
@@ -0,0 +1,234 b'' | |||
|
1 | /** | |
|
2 | * Copyright (c) 2016-present, Gregory Szorc | |
|
3 | * All rights reserved. | |
|
4 | * | |
|
5 | * This software may be modified and distributed under the terms | |
|
6 | * of the BSD license. See the LICENSE file for details. | |
|
7 | */ | |
|
8 | ||
|
9 | #include "python-zstandard.h" | |
|
10 | ||
|
11 | #define min(a, b) (((a) < (b)) ? (a) : (b)) | |
|
12 | ||
|
13 | extern PyObject* ZstdError; | |
|
14 | ||
|
15 | PyDoc_STRVAR(ZstdCompressorIterator__doc__, | |
|
16 | "Represents an iterator of compressed data.\n" | |
|
17 | ); | |
|
18 | ||
|
19 | static void ZstdCompressorIterator_dealloc(ZstdCompressorIterator* self) { | |
|
20 | Py_XDECREF(self->readResult); | |
|
21 | Py_XDECREF(self->compressor); | |
|
22 | Py_XDECREF(self->reader); | |
|
23 | ||
|
24 | if (self->buffer) { | |
|
25 | PyBuffer_Release(self->buffer); | |
|
26 | PyMem_FREE(self->buffer); | |
|
27 | self->buffer = NULL; | |
|
28 | } | |
|
29 | ||
|
30 | if (self->cstream) { | |
|
31 | ZSTD_freeCStream(self->cstream); | |
|
32 | self->cstream = NULL; | |
|
33 | } | |
|
34 | ||
|
35 | if (self->output.dst) { | |
|
36 | PyMem_Free(self->output.dst); | |
|
37 | self->output.dst = NULL; | |
|
38 | } | |
|
39 | ||
|
40 | PyObject_Del(self); | |
|
41 | } | |
|
42 | ||
|
43 | static PyObject* ZstdCompressorIterator_iter(PyObject* self) { | |
|
44 | Py_INCREF(self); | |
|
45 | return self; | |
|
46 | } | |
|
47 | ||
|
48 | static PyObject* ZstdCompressorIterator_iternext(ZstdCompressorIterator* self) { | |
|
49 | size_t zresult; | |
|
50 | PyObject* readResult = NULL; | |
|
51 | PyObject* chunk; | |
|
52 | char* readBuffer; | |
|
53 | Py_ssize_t readSize = 0; | |
|
54 | Py_ssize_t bufferRemaining; | |
|
55 | ||
|
56 | if (self->finishedOutput) { | |
|
57 | PyErr_SetString(PyExc_StopIteration, "output flushed"); | |
|
58 | return NULL; | |
|
59 | } | |
|
60 | ||
|
61 | feedcompressor: | |
|
62 | ||
|
63 | /* If we have data left in the input, consume it. */ | |
|
64 | if (self->input.pos < self->input.size) { | |
|
65 | Py_BEGIN_ALLOW_THREADS | |
|
66 | zresult = ZSTD_compressStream(self->cstream, &self->output, &self->input); | |
|
67 | Py_END_ALLOW_THREADS | |
|
68 | ||
|
69 | /* Release the Python object holding the input buffer. */ | |
|
70 | if (self->input.pos == self->input.size) { | |
|
71 | self->input.src = NULL; | |
|
72 | self->input.pos = 0; | |
|
73 | self->input.size = 0; | |
|
74 | Py_DECREF(self->readResult); | |
|
75 | self->readResult = NULL; | |
|
76 | } | |
|
77 | ||
|
78 | if (ZSTD_isError(zresult)) { | |
|
79 | PyErr_Format(ZstdError, "zstd compress error: %s", ZSTD_getErrorName(zresult)); | |
|
80 | return NULL; | |
|
81 | } | |
|
82 | ||
|
83 | /* If it produced output data, emit it. */ | |
|
84 | if (self->output.pos) { | |
|
85 | chunk = PyBytes_FromStringAndSize(self->output.dst, self->output.pos); | |
|
86 | self->output.pos = 0; | |
|
87 | return chunk; | |
|
88 | } | |
|
89 | } | |
|
90 | ||
|
91 | /* We should never have output data sitting around after a previous call. */ | |
|
92 | assert(self->output.pos == 0); | |
|
93 | ||
|
94 | /* The code above should have either emitted a chunk and returned or consumed | |
|
95 | the entire input buffer. So the state of the input buffer is not | |
|
96 | relevant. */ | |
|
97 | if (!self->finishedInput) { | |
|
98 | if (self->reader) { | |
|
99 | readResult = PyObject_CallMethod(self->reader, "read", "I", self->inSize); | |
|
100 | if (!readResult) { | |
|
101 | PyErr_SetString(ZstdError, "could not read() from source"); | |
|
102 | return NULL; | |
|
103 | } | |
|
104 | ||
|
105 | PyBytes_AsStringAndSize(readResult, &readBuffer, &readSize); | |
|
106 | } | |
|
107 | else { | |
|
108 | assert(self->buffer && self->buffer->buf); | |
|
109 | ||
|
110 | /* Only support contiguous C arrays. */ | |
|
111 | assert(self->buffer->strides == NULL && self->buffer->suboffsets == NULL); | |
|
112 | assert(self->buffer->itemsize == 1); | |
|
113 | ||
|
114 | readBuffer = (char*)self->buffer->buf + self->bufferOffset; | |
|
115 | bufferRemaining = self->buffer->len - self->bufferOffset; | |
|
116 | readSize = min(bufferRemaining, (Py_ssize_t)self->inSize); | |
|
117 | self->bufferOffset += readSize; | |
|
118 | } | |
|
119 | ||
|
120 | if (0 == readSize) { | |
|
121 | Py_XDECREF(readResult); | |
|
122 | self->finishedInput = 1; | |
|
123 | } | |
|
124 | else { | |
|
125 | self->readResult = readResult; | |
|
126 | } | |
|
127 | } | |
|
128 | ||
|
129 | /* EOF */ | |
|
130 | if (0 == readSize) { | |
|
131 | zresult = ZSTD_endStream(self->cstream, &self->output); | |
|
132 | if (ZSTD_isError(zresult)) { | |
|
133 | PyErr_Format(ZstdError, "error ending compression stream: %s", | |
|
134 | ZSTD_getErrorName(zresult)); | |
|
135 | return NULL; | |
|
136 | } | |
|
137 | ||
|
138 | assert(self->output.pos); | |
|
139 | ||
|
140 | if (0 == zresult) { | |
|
141 | self->finishedOutput = 1; | |
|
142 | } | |
|
143 | ||
|
144 | chunk = PyBytes_FromStringAndSize(self->output.dst, self->output.pos); | |
|
145 | self->output.pos = 0; | |
|
146 | return chunk; | |
|
147 | } | |
|
148 | ||
|
149 | /* New data from reader. Feed into compressor. */ | |
|
150 | self->input.src = readBuffer; | |
|
151 | self->input.size = readSize; | |
|
152 | self->input.pos = 0; | |
|
153 | ||
|
154 | Py_BEGIN_ALLOW_THREADS | |
|
155 | zresult = ZSTD_compressStream(self->cstream, &self->output, &self->input); | |
|
156 | Py_END_ALLOW_THREADS | |
|
157 | ||
|
158 | /* The input buffer currently points to memory managed by Python | |
|
159 | (readBuffer). This object was allocated by this function. If it wasn't | |
|
160 | fully consumed, we need to release it in a subsequent function call. | |
|
161 | If it is fully consumed, do that now. | |
|
162 | */ | |
|
163 | if (self->input.pos == self->input.size) { | |
|
164 | self->input.src = NULL; | |
|
165 | self->input.pos = 0; | |
|
166 | self->input.size = 0; | |
|
167 | Py_XDECREF(self->readResult); | |
|
168 | self->readResult = NULL; | |
|
169 | } | |
|
170 | ||
|
171 | if (ZSTD_isError(zresult)) { | |
|
172 | PyErr_Format(ZstdError, "zstd compress error: %s", ZSTD_getErrorName(zresult)); | |
|
173 | return NULL; | |
|
174 | } | |
|
175 | ||
|
176 | assert(self->input.pos <= self->input.size); | |
|
177 | ||
|
178 | /* If we didn't write anything, start the process over. */ | |
|
179 | if (0 == self->output.pos) { | |
|
180 | goto feedcompressor; | |
|
181 | } | |
|
182 | ||
|
183 | chunk = PyBytes_FromStringAndSize(self->output.dst, self->output.pos); | |
|
184 | self->output.pos = 0; | |
|
185 | return chunk; | |
|
186 | } | |
|
187 | ||
|
188 | PyTypeObject ZstdCompressorIteratorType = { | |
|
189 | PyVarObject_HEAD_INIT(NULL, 0) | |
|
190 | "zstd.ZstdCompressorIterator", /* tp_name */ | |
|
191 | sizeof(ZstdCompressorIterator), /* tp_basicsize */ | |
|
192 | 0, /* tp_itemsize */ | |
|
193 | (destructor)ZstdCompressorIterator_dealloc, /* tp_dealloc */ | |
|
194 | 0, /* tp_print */ | |
|
195 | 0, /* tp_getattr */ | |
|
196 | 0, /* tp_setattr */ | |
|
197 | 0, /* tp_compare */ | |
|
198 | 0, /* tp_repr */ | |
|
199 | 0, /* tp_as_number */ | |
|
200 | 0, /* tp_as_sequence */ | |
|
201 | 0, /* tp_as_mapping */ | |
|
202 | 0, /* tp_hash */ | |
|
203 | 0, /* tp_call */ | |
|
204 | 0, /* tp_str */ | |
|
205 | 0, /* tp_getattro */ | |
|
206 | 0, /* tp_setattro */ | |
|
207 | 0, /* tp_as_buffer */ | |
|
208 | Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /* tp_flags */ | |
|
209 | ZstdCompressorIterator__doc__, /* tp_doc */ | |
|
210 | 0, /* tp_traverse */ | |
|
211 | 0, /* tp_clear */ | |
|
212 | 0, /* tp_richcompare */ | |
|
213 | 0, /* tp_weaklistoffset */ | |
|
214 | ZstdCompressorIterator_iter, /* tp_iter */ | |
|
215 | (iternextfunc)ZstdCompressorIterator_iternext, /* tp_iternext */ | |
|
216 | 0, /* tp_methods */ | |
|
217 | 0, /* tp_members */ | |
|
218 | 0, /* tp_getset */ | |
|
219 | 0, /* tp_base */ | |
|
220 | 0, /* tp_dict */ | |
|
221 | 0, /* tp_descr_get */ | |
|
222 | 0, /* tp_descr_set */ | |
|
223 | 0, /* tp_dictoffset */ | |
|
224 | 0, /* tp_init */ | |
|
225 | 0, /* tp_alloc */ | |
|
226 | PyType_GenericNew, /* tp_new */ | |
|
227 | }; | |
|
228 | ||
|
229 | void compressoriterator_module_init(PyObject* mod) { | |
|
230 | Py_TYPE(&ZstdCompressorIteratorType) = &PyType_Type; | |
|
231 | if (PyType_Ready(&ZstdCompressorIteratorType) < 0) { | |
|
232 | return; | |
|
233 | } | |
|
234 | } |
@@ -0,0 +1,84 b'' | |||
|
1 | /** | |
|
2 | * Copyright (c) 2016-present, Gregory Szorc | |
|
3 | * All rights reserved. | |
|
4 | * | |
|
5 | * This software may be modified and distributed under the terms | |
|
6 | * of the BSD license. See the LICENSE file for details. | |
|
7 | */ | |
|
8 | ||
|
9 | #include "python-zstandard.h" | |
|
10 | ||
|
11 | extern PyObject* ZstdError; | |
|
12 | ||
|
13 | static char frame_header[] = { | |
|
14 | '\x28', | |
|
15 | '\xb5', | |
|
16 | '\x2f', | |
|
17 | '\xfd', | |
|
18 | }; | |
|
19 | ||
|
20 | void constants_module_init(PyObject* mod) { | |
|
21 | PyObject* version; | |
|
22 | PyObject* zstdVersion; | |
|
23 | PyObject* frameHeader; | |
|
24 | ||
|
25 | #if PY_MAJOR_VERSION >= 3 | |
|
26 | version = PyUnicode_FromString(PYTHON_ZSTANDARD_VERSION); | |
|
27 | #else | |
|
28 | version = PyString_FromString(PYTHON_ZSTANDARD_VERSION); | |
|
29 | #endif | |
|
30 | Py_INCREF(version); | |
|
31 | PyModule_AddObject(mod, "__version__", version); | |
|
32 | ||
|
33 | ZstdError = PyErr_NewException("zstd.ZstdError", NULL, NULL); | |
|
34 | PyModule_AddObject(mod, "ZstdError", ZstdError); | |
|
35 | ||
|
36 | /* For now, the version is a simple tuple instead of a dedicated type. */ | |
|
37 | zstdVersion = PyTuple_New(3); | |
|
38 | PyTuple_SetItem(zstdVersion, 0, PyLong_FromLong(ZSTD_VERSION_MAJOR)); | |
|
39 | PyTuple_SetItem(zstdVersion, 1, PyLong_FromLong(ZSTD_VERSION_MINOR)); | |
|
40 | PyTuple_SetItem(zstdVersion, 2, PyLong_FromLong(ZSTD_VERSION_RELEASE)); | |
|
41 | Py_IncRef(zstdVersion); | |
|
42 | PyModule_AddObject(mod, "ZSTD_VERSION", zstdVersion); | |
|
43 | ||
|
44 | frameHeader = PyBytes_FromStringAndSize(frame_header, sizeof(frame_header)); | |
|
45 | if (frameHeader) { | |
|
46 | PyModule_AddObject(mod, "FRAME_HEADER", frameHeader); | |
|
47 | } | |
|
48 | else { | |
|
49 | PyErr_Format(PyExc_ValueError, "could not create frame header object"); | |
|
50 | } | |
|
51 | ||
|
52 | PyModule_AddIntConstant(mod, "MAX_COMPRESSION_LEVEL", ZSTD_maxCLevel()); | |
|
53 | PyModule_AddIntConstant(mod, "COMPRESSION_RECOMMENDED_INPUT_SIZE", | |
|
54 | (long)ZSTD_CStreamInSize()); | |
|
55 | PyModule_AddIntConstant(mod, "COMPRESSION_RECOMMENDED_OUTPUT_SIZE", | |
|
56 | (long)ZSTD_CStreamOutSize()); | |
|
57 | PyModule_AddIntConstant(mod, "DECOMPRESSION_RECOMMENDED_INPUT_SIZE", | |
|
58 | (long)ZSTD_DStreamInSize()); | |
|
59 | PyModule_AddIntConstant(mod, "DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE", | |
|
60 | (long)ZSTD_DStreamOutSize()); | |
|
61 | ||
|
62 | PyModule_AddIntConstant(mod, "MAGIC_NUMBER", ZSTD_MAGICNUMBER); | |
|
63 | PyModule_AddIntConstant(mod, "WINDOWLOG_MIN", ZSTD_WINDOWLOG_MIN); | |
|
64 | PyModule_AddIntConstant(mod, "WINDOWLOG_MAX", ZSTD_WINDOWLOG_MAX); | |
|
65 | PyModule_AddIntConstant(mod, "CHAINLOG_MIN", ZSTD_CHAINLOG_MIN); | |
|
66 | PyModule_AddIntConstant(mod, "CHAINLOG_MAX", ZSTD_CHAINLOG_MAX); | |
|
67 | PyModule_AddIntConstant(mod, "HASHLOG_MIN", ZSTD_HASHLOG_MIN); | |
|
68 | PyModule_AddIntConstant(mod, "HASHLOG_MAX", ZSTD_HASHLOG_MAX); | |
|
69 | PyModule_AddIntConstant(mod, "HASHLOG3_MAX", ZSTD_HASHLOG3_MAX); | |
|
70 | PyModule_AddIntConstant(mod, "SEARCHLOG_MIN", ZSTD_SEARCHLOG_MIN); | |
|
71 | PyModule_AddIntConstant(mod, "SEARCHLOG_MAX", ZSTD_SEARCHLOG_MAX); | |
|
72 | PyModule_AddIntConstant(mod, "SEARCHLENGTH_MIN", ZSTD_SEARCHLENGTH_MIN); | |
|
73 | PyModule_AddIntConstant(mod, "SEARCHLENGTH_MAX", ZSTD_SEARCHLENGTH_MAX); | |
|
74 | PyModule_AddIntConstant(mod, "TARGETLENGTH_MIN", ZSTD_TARGETLENGTH_MIN); | |
|
75 | PyModule_AddIntConstant(mod, "TARGETLENGTH_MAX", ZSTD_TARGETLENGTH_MAX); | |
|
76 | ||
|
77 | PyModule_AddIntConstant(mod, "STRATEGY_FAST", ZSTD_fast); | |
|
78 | PyModule_AddIntConstant(mod, "STRATEGY_DFAST", ZSTD_dfast); | |
|
79 | PyModule_AddIntConstant(mod, "STRATEGY_GREEDY", ZSTD_greedy); | |
|
80 | PyModule_AddIntConstant(mod, "STRATEGY_LAZY", ZSTD_lazy); | |
|
81 | PyModule_AddIntConstant(mod, "STRATEGY_LAZY2", ZSTD_lazy2); | |
|
82 | PyModule_AddIntConstant(mod, "STRATEGY_BTLAZY2", ZSTD_btlazy2); | |
|
83 | PyModule_AddIntConstant(mod, "STRATEGY_BTOPT", ZSTD_btopt); | |
|
84 | } |
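The `frame_header` bytes and `MAGIC_NUMBER` constant exported above describe the same value: the four bytes are the zstd frame magic number in little-endian order, and `ZSTD_MAGICNUMBER` is that value as an integer (0xFD2FB528). A small check, independent of the extension module:

```python
import struct

# The FRAME_HEADER bytes exported by the module, as defined above.
frame_header = b"\x28\xb5\x2f\xfd"

# Interpreted as a little-endian 32-bit integer, they equal the zstd
# magic number exposed as MAGIC_NUMBER (ZSTD_MAGICNUMBER).
(magic,) = struct.unpack("<I", frame_header)
assert magic == 0xFD2FB528
```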
@@ -0,0 +1,187 b'' | |||
|
1 | /** | |
|
2 | * Copyright (c) 2016-present, Gregory Szorc | |
|
3 | * All rights reserved. | |
|
4 | * | |
|
5 | * This software may be modified and distributed under the terms | |
|
6 | * of the BSD license. See the LICENSE file for details. | |
|
7 | */ | |
|
8 | ||
|
9 | #include "python-zstandard.h" | |
|
10 | ||
|
11 | extern PyObject* ZstdError; | |
|
12 | ||
|
13 | PyDoc_STRVAR(ZstdDecompressionWriter__doc, | |
|
 14 | "A context manager used for writing decompressed output.\n" | |
|
15 | ); | |
|
16 | ||
|
17 | static void ZstdDecompressionWriter_dealloc(ZstdDecompressionWriter* self) { | |
|
18 | Py_XDECREF(self->decompressor); | |
|
19 | Py_XDECREF(self->writer); | |
|
20 | ||
|
21 | if (self->dstream) { | |
|
22 | ZSTD_freeDStream(self->dstream); | |
|
23 | self->dstream = NULL; | |
|
24 | } | |
|
25 | ||
|
26 | PyObject_Del(self); | |
|
27 | } | |
|
28 | ||
|
29 | static PyObject* ZstdDecompressionWriter_enter(ZstdDecompressionWriter* self) { | |
|
30 | if (self->entered) { | |
|
31 | PyErr_SetString(ZstdError, "cannot __enter__ multiple times"); | |
|
32 | return NULL; | |
|
33 | } | |
|
34 | ||
|
35 | self->dstream = DStream_from_ZstdDecompressor(self->decompressor); | |
|
36 | if (!self->dstream) { | |
|
37 | return NULL; | |
|
38 | } | |
|
39 | ||
|
40 | self->entered = 1; | |
|
41 | ||
|
42 | Py_INCREF(self); | |
|
43 | return (PyObject*)self; | |
|
44 | } | |
|
45 | ||
|
46 | static PyObject* ZstdDecompressionWriter_exit(ZstdDecompressionWriter* self, PyObject* args) { | |
|
47 | self->entered = 0; | |
|
48 | ||
|
49 | if (self->dstream) { | |
|
50 | ZSTD_freeDStream(self->dstream); | |
|
51 | self->dstream = NULL; | |
|
52 | } | |
|
53 | ||
|
54 | Py_RETURN_FALSE; | |
|
55 | } | |
|
56 | ||
|
57 | static PyObject* ZstdDecompressionWriter_memory_size(ZstdDecompressionWriter* self) { | |
|
58 | if (!self->dstream) { | |
|
59 | PyErr_SetString(ZstdError, "cannot determine size of inactive decompressor; " | |
|
60 | "call when context manager is active"); | |
|
61 | return NULL; | |
|
62 | } | |
|
63 | ||
|
64 | return PyLong_FromSize_t(ZSTD_sizeof_DStream(self->dstream)); | |
|
65 | } | |
|
66 | ||
|
67 | static PyObject* ZstdDecompressionWriter_write(ZstdDecompressionWriter* self, PyObject* args) { | |
|
68 | const char* source; | |
|
69 | Py_ssize_t sourceSize; | |
|
70 | size_t zresult = 0; | |
|
71 | ZSTD_inBuffer input; | |
|
72 | ZSTD_outBuffer output; | |
|
73 | PyObject* res; | |
|
74 | ||
|
75 | #if PY_MAJOR_VERSION >= 3 | |
|
76 | if (!PyArg_ParseTuple(args, "y#", &source, &sourceSize)) { | |
|
77 | #else | |
|
78 | if (!PyArg_ParseTuple(args, "s#", &source, &sourceSize)) { | |
|
79 | #endif | |
|
80 | return NULL; | |
|
81 | } | |
|
82 | ||
|
83 | if (!self->entered) { | |
|
84 | PyErr_SetString(ZstdError, "write must be called from an active context manager"); | |
|
85 | return NULL; | |
|
86 | } | |
|
87 | ||
|
88 | output.dst = malloc(self->outSize); | |
|
89 | if (!output.dst) { | |
|
90 | return PyErr_NoMemory(); | |
|
91 | } | |
|
92 | output.size = self->outSize; | |
|
93 | output.pos = 0; | |
|
94 | ||
|
95 | input.src = source; | |
|
96 | input.size = sourceSize; | |
|
97 | input.pos = 0; | |
|
98 | ||
|
99 | while ((ssize_t)input.pos < sourceSize) { | |
|
100 | Py_BEGIN_ALLOW_THREADS | |
|
101 | zresult = ZSTD_decompressStream(self->dstream, &output, &input); | |
|
102 | Py_END_ALLOW_THREADS | |
|
103 | ||
|
104 | if (ZSTD_isError(zresult)) { | |
|
105 | free(output.dst); | |
|
106 | PyErr_Format(ZstdError, "zstd decompress error: %s", | |
|
107 | ZSTD_getErrorName(zresult)); | |
|
108 | return NULL; | |
|
109 | } | |
|
110 | ||
|
111 | if (output.pos) { | |
|
112 | #if PY_MAJOR_VERSION >= 3 | |
|
113 | res = PyObject_CallMethod(self->writer, "write", "y#", | |
|
114 | #else | |
|
115 | res = PyObject_CallMethod(self->writer, "write", "s#", | |
|
116 | #endif | |
|
117 | output.dst, output.pos); | |
|
118 | Py_XDECREF(res); | |
|
119 | output.pos = 0; | |
|
120 | } | |
|
121 | } | |
|
122 | ||
|
123 | free(output.dst); | |
|
124 | ||
|
125 | /* TODO return bytes written */ | |
|
126 | Py_RETURN_NONE; | |
|
127 | } | |
|
128 | ||
|
129 | static PyMethodDef ZstdDecompressionWriter_methods[] = { | |
|
130 | { "__enter__", (PyCFunction)ZstdDecompressionWriter_enter, METH_NOARGS, | |
|
131 | PyDoc_STR("Enter a decompression context.") }, | |
|
132 | { "__exit__", (PyCFunction)ZstdDecompressionWriter_exit, METH_VARARGS, | |
|
133 | PyDoc_STR("Exit a decompression context.") }, | |
|
134 | { "memory_size", (PyCFunction)ZstdDecompressionWriter_memory_size, METH_NOARGS, | |
|
135 | PyDoc_STR("Obtain the memory size in bytes of the underlying decompressor.") }, | |
|
136 | { "write", (PyCFunction)ZstdDecompressionWriter_write, METH_VARARGS, | |
|
137 | PyDoc_STR("Compress data") }, | |
|
138 | { NULL, NULL } | |
|
139 | }; | |
|
140 | ||
|
141 | PyTypeObject ZstdDecompressionWriterType = { | |
|
142 | PyVarObject_HEAD_INIT(NULL, 0) | |
|
143 | "zstd.ZstdDecompressionWriter", /* tp_name */ | |
|
144 | sizeof(ZstdDecompressionWriter),/* tp_basicsize */ | |
|
145 | 0, /* tp_itemsize */ | |
|
146 | (destructor)ZstdDecompressionWriter_dealloc, /* tp_dealloc */ | |
|
147 | 0, /* tp_print */ | |
|
148 | 0, /* tp_getattr */ | |
|
149 | 0, /* tp_setattr */ | |
|
150 | 0, /* tp_compare */ | |
|
151 | 0, /* tp_repr */ | |
|
152 | 0, /* tp_as_number */ | |
|
153 | 0, /* tp_as_sequence */ | |
|
154 | 0, /* tp_as_mapping */ | |
|
155 | 0, /* tp_hash */ | |
|
156 | 0, /* tp_call */ | |
|
157 | 0, /* tp_str */ | |
|
158 | 0, /* tp_getattro */ | |
|
159 | 0, /* tp_setattro */ | |
|
160 | 0, /* tp_as_buffer */ | |
|
161 | Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /* tp_flags */ | |
|
162 | ZstdDecompressionWriter__doc, /* tp_doc */ | |
|
163 | 0, /* tp_traverse */ | |
|
164 | 0, /* tp_clear */ | |
|
165 | 0, /* tp_richcompare */ | |
|
166 | 0, /* tp_weaklistoffset */ | |
|
167 | 0, /* tp_iter */ | |
|
168 | 0, /* tp_iternext */ | |
|
169 | ZstdDecompressionWriter_methods,/* tp_methods */ | |
|
170 | 0, /* tp_members */ | |
|
171 | 0, /* tp_getset */ | |
|
172 | 0, /* tp_base */ | |
|
173 | 0, /* tp_dict */ | |
|
174 | 0, /* tp_descr_get */ | |
|
175 | 0, /* tp_descr_set */ | |
|
176 | 0, /* tp_dictoffset */ | |
|
177 | 0, /* tp_init */ | |
|
178 | 0, /* tp_alloc */ | |
|
179 | PyType_GenericNew, /* tp_new */ | |
|
180 | }; | |
|
181 | ||
|
182 | void decompressionwriter_module_init(PyObject* mod) { | |
|
183 | Py_TYPE(&ZstdDecompressionWriterType) = &PyType_Type; | |
|
184 | if (PyType_Ready(&ZstdDecompressionWriterType) < 0) { | |
|
185 | return; | |
|
186 | } | |
|
187 | } |
@@ -0,0 +1,170 b'' | |||
|
1 | /** | |
|
2 | * Copyright (c) 2016-present, Gregory Szorc | |
|
3 | * All rights reserved. | |
|
4 | * | |
|
5 | * This software may be modified and distributed under the terms | |
|
6 | * of the BSD license. See the LICENSE file for details. | |
|
7 | */ | |
|
8 | ||
|
9 | #include "python-zstandard.h" | |
|
10 | ||
|
11 | extern PyObject* ZstdError; | |
|
12 | ||
|
13 | PyDoc_STRVAR(DecompressionObj__doc__, | |
|
14 | "Perform decompression using a standard library compatible API.\n" | |
|
15 | ); | |
|
16 | ||
|
17 | static void DecompressionObj_dealloc(ZstdDecompressionObj* self) { | |
|
18 | if (self->dstream) { | |
|
19 | ZSTD_freeDStream(self->dstream); | |
|
20 | self->dstream = NULL; | |
|
21 | } | |
|
22 | ||
|
23 | Py_XDECREF(self->decompressor); | |
|
24 | ||
|
25 | PyObject_Del(self); | |
|
26 | } | |
|
27 | ||
|
28 | static PyObject* DecompressionObj_decompress(ZstdDecompressionObj* self, PyObject* args) { | |
|
29 | const char* source; | |
|
30 | Py_ssize_t sourceSize; | |
|
31 | size_t zresult; | |
|
32 | ZSTD_inBuffer input; | |
|
33 | ZSTD_outBuffer output; | |
|
34 | size_t outSize = ZSTD_DStreamOutSize(); | |
|
35 | PyObject* result = NULL; | |
|
36 | Py_ssize_t resultSize = 0; | |
|
37 | ||
|
38 | if (self->finished) { | |
|
39 | PyErr_SetString(ZstdError, "cannot use a decompressobj multiple times"); | |
|
40 | return NULL; | |
|
41 | } | |
|
42 | ||
|
43 | #if PY_MAJOR_VERSION >= 3 | |
|
44 | if (!PyArg_ParseTuple(args, "y#", | |
|
45 | #else | |
|
46 | if (!PyArg_ParseTuple(args, "s#", | |
|
47 | #endif | |
|
48 | &source, &sourceSize)) { | |
|
49 | return NULL; | |
|
50 | } | |
|
51 | ||
|
52 | input.src = source; | |
|
53 | input.size = sourceSize; | |
|
54 | input.pos = 0; | |
|
55 | ||
|
56 | output.dst = PyMem_Malloc(outSize); | |
|
57 | if (!output.dst) { | |
|
58 | PyErr_NoMemory(); | |
|
59 | return NULL; | |
|
60 | } | |
|
61 | output.size = outSize; | |
|
62 | output.pos = 0; | |
|
63 | ||
|
64 | /* Read input until exhausted. */ | |
|
65 | while (input.pos < input.size) { | |
|
66 | Py_BEGIN_ALLOW_THREADS | |
|
67 | zresult = ZSTD_decompressStream(self->dstream, &output, &input); | |
|
68 | Py_END_ALLOW_THREADS | |
|
69 | ||
|
70 | if (ZSTD_isError(zresult)) { | |
|
71 | PyErr_Format(ZstdError, "zstd decompressor error: %s", | |
|
72 | ZSTD_getErrorName(zresult)); | |
|
73 | result = NULL; | |
|
74 | goto finally; | |
|
75 | } | |
|
76 | ||
|
77 | if (0 == zresult) { | |
|
78 | self->finished = 1; | |
|
79 | } | |
|
80 | ||
|
81 | if (output.pos) { | |
|
82 | if (result) { | |
|
83 | resultSize = PyBytes_GET_SIZE(result); | |
|
84 | if (-1 == _PyBytes_Resize(&result, resultSize + output.pos)) { | |
|
85 | goto except; | |
|
86 | } | |
|
87 | ||
|
88 | memcpy(PyBytes_AS_STRING(result) + resultSize, | |
|
89 | output.dst, output.pos); | |
|
90 | } | |
|
91 | else { | |
|
92 | result = PyBytes_FromStringAndSize(output.dst, output.pos); | |
|
93 | if (!result) { | |
|
94 | goto except; | |
|
95 | } | |
|
96 | } | |
|
97 | ||
|
98 | output.pos = 0; | |
|
99 | } | |
|
100 | } | |
|
101 | ||
|
102 | if (!result) { | |
|
103 | result = PyBytes_FromString(""); | |
|
104 | } | |
|
105 | ||
|
106 | goto finally; | |
|
107 | ||
|
108 | except: | |
|
109 | Py_DecRef(result); | |
|
110 | result = NULL; | |
|
111 | ||
|
112 | finally: | |
|
113 | PyMem_Free(output.dst); | |
|
114 | ||
|
115 | return result; | |
|
116 | } | |
|
117 | ||
|
118 | static PyMethodDef DecompressionObj_methods[] = { | |
|
119 | { "decompress", (PyCFunction)DecompressionObj_decompress, | |
|
120 | METH_VARARGS, PyDoc_STR("decompress data") }, | |
|
121 | { NULL, NULL } | |
|
122 | }; | |
|
123 | ||
|
124 | PyTypeObject ZstdDecompressionObjType = { | |
|
125 | PyVarObject_HEAD_INIT(NULL, 0) | |
|
126 | "zstd.ZstdDecompressionObj", /* tp_name */ | |
|
127 | sizeof(ZstdDecompressionObj), /* tp_basicsize */ | |
|
128 | 0, /* tp_itemsize */ | |
|
129 | (destructor)DecompressionObj_dealloc, /* tp_dealloc */ | |
|
130 | 0, /* tp_print */ | |
|
131 | 0, /* tp_getattr */ | |
|
132 | 0, /* tp_setattr */ | |
|
133 | 0, /* tp_compare */ | |
|
134 | 0, /* tp_repr */ | |
|
135 | 0, /* tp_as_number */ | |
|
136 | 0, /* tp_as_sequence */ | |
|
137 | 0, /* tp_as_mapping */ | |
|
138 | 0, /* tp_hash */ | |
|
139 | 0, /* tp_call */ | |
|
140 | 0, /* tp_str */ | |
|
141 | 0, /* tp_getattro */ | |
|
142 | 0, /* tp_setattro */ | |
|
143 | 0, /* tp_as_buffer */ | |
|
144 | Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /* tp_flags */ | |
|
145 | DecompressionObj__doc__, /* tp_doc */ | |
|
146 | 0, /* tp_traverse */ | |
|
147 | 0, /* tp_clear */ | |
|
148 | 0, /* tp_richcompare */ | |
|
149 | 0, /* tp_weaklistoffset */ | |
|
150 | 0, /* tp_iter */ | |
|
151 | 0, /* tp_iternext */ | |
|
152 | DecompressionObj_methods, /* tp_methods */ | |
|
153 | 0, /* tp_members */ | |
|
154 | 0, /* tp_getset */ | |
|
155 | 0, /* tp_base */ | |
|
156 | 0, /* tp_dict */ | |
|
157 | 0, /* tp_descr_get */ | |
|
158 | 0, /* tp_descr_set */ | |
|
159 | 0, /* tp_dictoffset */ | |
|
160 | 0, /* tp_init */ | |
|
161 | 0, /* tp_alloc */ | |
|
162 | PyType_GenericNew, /* tp_new */ | |
|
163 | }; | |
|
164 | ||
|
165 | void decompressobj_module_init(PyObject* module) { | |
|
166 | Py_TYPE(&ZstdDecompressionObjType) = &PyType_Type; | |
|
167 | if (PyType_Ready(&ZstdDecompressionObjType) < 0) { | |
|
168 | return; | |
|
169 | } | |
|
170 | } |
@@ -0,0 +1,669 b'' | |||
|
1 | /** | |
|
2 | * Copyright (c) 2016-present, Gregory Szorc | |
|
3 | * All rights reserved. | |
|
4 | * | |
|
5 | * This software may be modified and distributed under the terms | |
|
6 | * of the BSD license. See the LICENSE file for details. | |
|
7 | */ | |
|
8 | ||
|
9 | #include "python-zstandard.h" | |
|
10 | ||
|
11 | extern PyObject* ZstdError; | |
|
12 | ||
|
13 | ZSTD_DStream* DStream_from_ZstdDecompressor(ZstdDecompressor* decompressor) { | |
|
14 | ZSTD_DStream* dstream; | |
|
15 | void* dictData = NULL; | |
|
16 | size_t dictSize = 0; | |
|
17 | size_t zresult; | |
|
18 | ||
|
19 | dstream = ZSTD_createDStream(); | |
|
20 | if (!dstream) { | |
|
21 | PyErr_SetString(ZstdError, "could not create DStream"); | |
|
22 | return NULL; | |
|
23 | } | |
|
24 | ||
|
25 | if (decompressor->dict) { | |
|
26 | dictData = decompressor->dict->dictData; | |
|
27 | dictSize = decompressor->dict->dictSize; | |
|
28 | } | |
|
29 | ||
|
30 | if (dictData) { | |
|
31 | zresult = ZSTD_initDStream_usingDict(dstream, dictData, dictSize); | |
|
32 | } | |
|
33 | else { | |
|
34 | zresult = ZSTD_initDStream(dstream); | |
|
35 | } | |
|
36 | ||
|
37 | if (ZSTD_isError(zresult)) { | |
|
38 | PyErr_Format(ZstdError, "could not initialize DStream: %s", | |
|
39 | ZSTD_getErrorName(zresult)); | |
|
40 | return NULL; | |
|
41 | } | |
|
42 | ||
|
43 | return dstream; | |
|
44 | } | |
|
45 | ||
|
46 | PyDoc_STRVAR(Decompressor__doc__, | |
|
47 | "ZstdDecompressor(dict_data=None)\n" | |
|
48 | "\n" | |
|
49 | "Create an object used to perform Zstandard decompression.\n" | |
|
50 | "\n" | |
|
51 | "An instance can perform multiple decompression operations." | |
|
52 | ); | |
|
53 | ||
|
54 | static int Decompressor_init(ZstdDecompressor* self, PyObject* args, PyObject* kwargs) { | |
|
55 | static char* kwlist[] = { | |
|
56 | "dict_data", | |
|
57 | NULL | |
|
58 | }; | |
|
59 | ||
|
60 | ZstdCompressionDict* dict = NULL; | |
|
61 | ||
|
62 | self->refdctx = NULL; | |
|
63 | self->dict = NULL; | |
|
64 | self->ddict = NULL; | |
|
65 | ||
|
66 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|O!", kwlist, | |
|
67 | &ZstdCompressionDictType, &dict)) { | |
|
68 | return -1; | |
|
69 | } | |
|
70 | ||
|
71 | /* Instead of creating a ZSTD_DCtx for every decompression operation, | |
|
72 | we create an instance at object creation time and recycle it via | |
|
73 | ZSTD_copyDCtx() on each use. This means each use is a malloc+memcpy | 
|
74 | instead of a malloc+init. */ | |
|
75 | /* TODO lazily initialize the reference ZSTD_DCtx on first use since | |
|
76 | not all instances of ZstdDecompressor will use a ZSTD_DCtx. */ | 
|
77 | self->refdctx = ZSTD_createDCtx(); | |
|
78 | if (!self->refdctx) { | |
|
79 | PyErr_NoMemory(); | |
|
80 | goto except; | |
|
81 | } | |
|
82 | ||
|
83 | if (dict) { | |
|
84 | self->dict = dict; | |
|
85 | Py_INCREF(dict); | |
|
86 | } | |
|
87 | ||
|
88 | return 0; | |
|
89 | ||
|
90 | except: | |
|
91 | if (self->refdctx) { | |
|
92 | ZSTD_freeDCtx(self->refdctx); | |
|
93 | self->refdctx = NULL; | |
|
94 | } | |
|
95 | ||
|
96 | return -1; | |
|
97 | } | |
|
98 | ||
|
99 | static void Decompressor_dealloc(ZstdDecompressor* self) { | |
|
100 | if (self->refdctx) { | |
|
101 | ZSTD_freeDCtx(self->refdctx); | |
|
102 | } | |
|
103 | ||
|
104 | Py_XDECREF(self->dict); | |
|
105 | ||
|
106 | if (self->ddict) { | |
|
107 | ZSTD_freeDDict(self->ddict); | |
|
108 | self->ddict = NULL; | |
|
109 | } | |
|
110 | ||
|
111 | PyObject_Del(self); | |
|
112 | } | |
|
113 | ||
|
114 | PyDoc_STRVAR(Decompressor_copy_stream__doc__, | |
|
115 | "copy_stream(ifh, ofh[, read_size=default, write_size=default]) -- decompress data between streams\n" | |
|
116 | "\n" | |
|
117 | "Compressed data will be read from ``ifh``, decompressed, and written to\n" | |
|
118 | "``ofh``. ``ifh`` must have a ``read(size)`` method. ``ofh`` must have a\n" | |
|
119 | "``write(data)`` method.\n" | |
|
120 | "\n" | |
|
121 | "The optional ``read_size`` and ``write_size`` arguments control the chunk\n" | |
|
122 | "size of data that is ``read()`` and ``write()`` between streams. They default\n" | |
|
123 | "to the default input and output sizes of zstd decompressor streams.\n" | |
|
124 | ); | |
|
125 | ||
|
126 | static PyObject* Decompressor_copy_stream(ZstdDecompressor* self, PyObject* args, PyObject* kwargs) { | |
|
127 | static char* kwlist[] = { | |
|
128 | "ifh", | |
|
129 | "ofh", | |
|
130 | "read_size", | |
|
131 | "write_size", | |
|
132 | NULL | |
|
133 | }; | |
|
134 | ||
|
135 | PyObject* source; | |
|
136 | PyObject* dest; | |
|
137 | size_t inSize = ZSTD_DStreamInSize(); | |
|
138 | size_t outSize = ZSTD_DStreamOutSize(); | |
|
139 | ZSTD_DStream* dstream; | |
|
140 | ZSTD_inBuffer input; | |
|
141 | ZSTD_outBuffer output; | |
|
142 | Py_ssize_t totalRead = 0; | |
|
143 | Py_ssize_t totalWrite = 0; | |
|
144 | char* readBuffer; | |
|
145 | Py_ssize_t readSize; | |
|
146 | PyObject* readResult; | |
|
147 | PyObject* res = NULL; | |
|
148 | size_t zresult = 0; | |
|
149 | PyObject* writeResult; | |
|
150 | PyObject* totalReadPy; | |
|
151 | PyObject* totalWritePy; | |
|
152 | ||
|
153 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "OO|kk", kwlist, &source, | |
|
154 | &dest, &inSize, &outSize)) { | |
|
155 | return NULL; | |
|
156 | } | |
|
157 | ||
|
158 | if (!PyObject_HasAttrString(source, "read")) { | |
|
159 | PyErr_SetString(PyExc_ValueError, "first argument must have a read() method"); | |
|
160 | return NULL; | |
|
161 | } | |
|
162 | ||
|
163 | if (!PyObject_HasAttrString(dest, "write")) { | |
|
164 | PyErr_SetString(PyExc_ValueError, "second argument must have a write() method"); | |
|
165 | return NULL; | |
|
166 | } | |
|
167 | ||
|
168 | dstream = DStream_from_ZstdDecompressor(self); | |
|
169 | if (!dstream) { | |
|
170 | res = NULL; | |
|
171 | goto finally; | |
|
172 | } | |
|
173 | ||
|
174 | output.dst = PyMem_Malloc(outSize); | |
|
175 | if (!output.dst) { | |
|
176 | PyErr_NoMemory(); | |
|
177 | res = NULL; | |
|
178 | goto finally; | |
|
179 | } | |
|
180 | output.size = outSize; | |
|
181 | output.pos = 0; | |
|
182 | ||
|
183 | /* Read source stream until EOF */ | |
|
184 | while (1) { | |
|
185 | readResult = PyObject_CallMethod(source, "read", "n", inSize); | |
|
186 | if (!readResult) { | |
|
187 | PyErr_SetString(ZstdError, "could not read() from source"); | |
|
188 | goto finally; | |
|
189 | } | |
|
190 | ||
|
191 | PyBytes_AsStringAndSize(readResult, &readBuffer, &readSize); | |
|
192 | ||
|
193 | /* If no data was read, we're at EOF. */ | |
|
194 | if (0 == readSize) { | |
|
195 | break; | |
|
196 | } | |
|
197 | ||
|
198 | totalRead += readSize; | |
|
199 | ||
|
200 | /* Send data to decompressor */ | |
|
201 | input.src = readBuffer; | |
|
202 | input.size = readSize; | |
|
203 | input.pos = 0; | |
|
204 | ||
|
205 | while (input.pos < input.size) { | |
|
206 | Py_BEGIN_ALLOW_THREADS | |
|
207 | zresult = ZSTD_decompressStream(dstream, &output, &input); | |
|
208 | Py_END_ALLOW_THREADS | |
|
209 | ||
|
210 | if (ZSTD_isError(zresult)) { | |
|
211 | PyErr_Format(ZstdError, "zstd decompressor error: %s", | |
|
212 | ZSTD_getErrorName(zresult)); | |
|
213 | res = NULL; | |
|
214 | goto finally; | |
|
215 | } | |
|
216 | ||
|
217 | if (output.pos) { | |
|
218 | #if PY_MAJOR_VERSION >= 3 | |
|
219 | writeResult = PyObject_CallMethod(dest, "write", "y#", | |
|
220 | #else | |
|
221 | writeResult = PyObject_CallMethod(dest, "write", "s#", | |
|
222 | #endif | |
|
223 | output.dst, output.pos); | |
|
224 | ||
|
225 | Py_XDECREF(writeResult); | |
|
226 | totalWrite += output.pos; | |
|
227 | output.pos = 0; | |
|
228 | } | |
|
229 | } | |
|
230 | } | |
|
231 | ||
|
232 | /* Source stream is exhausted. Finish up. */ | |
|
233 | ||
|
234 | ZSTD_freeDStream(dstream); | |
|
235 | dstream = NULL; | |
|
236 | ||
|
237 | totalReadPy = PyLong_FromSsize_t(totalRead); | |
|
238 | totalWritePy = PyLong_FromSsize_t(totalWrite); | |
|
239 | res = PyTuple_Pack(2, totalReadPy, totalWritePy); | |
|
240 | Py_DecRef(totalReadPy); | |
|
241 | Py_DecRef(totalWritePy); | |
|
242 | ||
|
243 | finally: | |
|
244 | if (output.dst) { | |
|
245 | PyMem_Free(output.dst); | |
|
246 | } | |
|
247 | ||
|
248 | if (dstream) { | |
|
249 | ZSTD_freeDStream(dstream); | |
|
250 | } | |
|
251 | ||
|
252 | return res; | |
|
253 | } | |
|
254 | ||
|
255 | PyDoc_STRVAR(Decompressor_decompress__doc__, | |
|
256 | "decompress(data[, max_output_size=None]) -- Decompress data in its entirety\n" | |
|
257 | "\n" | |
|
258 | "This method will decompress the entirety of the argument and return the\n" | |
|
259 | "result.\n" | |
|
260 | "\n" | |
|
261 | "The input bytes are expected to contain a full Zstandard frame (something\n" | |
|
262 | "compressed with ``ZstdCompressor.compress()`` or similar). If the input does\n" | |
|
263 | "not contain a full frame, an exception will be raised.\n" | |
|
264 | "\n" | |
|
265 | "If the frame header of the compressed data does not contain the content size\n" | |
|
266 | "``max_output_size`` must be specified or ``ZstdError`` will be raised. An\n" | |
|
267 | "allocation of size ``max_output_size`` will be performed and an attempt will\n" | |
|
268 | "be made to perform decompression into that buffer. If the buffer is too\n" | |
|
269 | "small or cannot be allocated, ``ZstdError`` will be raised. The buffer will\n" | |
|
270 | "be resized if it is too large.\n" | |
|
271 | "\n" | |
|
272 | "Uncompressed data could be much larger than compressed data. As a result,\n" | |
|
273 | "calling this function could result in a very large memory allocation being\n" | |
|
274 | "performed to hold the uncompressed data. Therefore it is **highly**\n" | |
|
275 | "recommended to use a streaming decompression method instead of this one.\n" | |
|
276 | ); | |
|
277 | ||
|
278 | PyObject* Decompressor_decompress(ZstdDecompressor* self, PyObject* args, PyObject* kwargs) { | |
|
279 | static char* kwlist[] = { | |
|
280 | "data", | |
|
281 | "max_output_size", | |
|
282 | NULL | |
|
283 | }; | |
|
284 | ||
|
285 | const char* source; | |
|
286 | Py_ssize_t sourceSize; | |
|
287 | Py_ssize_t maxOutputSize = 0; | |
|
288 | unsigned long long decompressedSize; | |
|
289 | size_t destCapacity; | |
|
290 | PyObject* result = NULL; | |
|
291 | ZSTD_DCtx* dctx = NULL; | |
|
292 | void* dictData = NULL; | |
|
293 | size_t dictSize = 0; | |
|
294 | size_t zresult; | |
|
295 | ||
|
296 | #if PY_MAJOR_VERSION >= 3 | |
|
297 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "y#|n", kwlist, | |
|
298 | #else | |
|
299 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "s#|n", kwlist, | |
|
300 | #endif | |
|
301 | &source, &sourceSize, &maxOutputSize)) { | |
|
302 | return NULL; | |
|
303 | } | |
|
304 | ||
|
305 | dctx = PyMem_Malloc(ZSTD_sizeof_DCtx(self->refdctx)); | |
|
306 | if (!dctx) { | |
|
307 | PyErr_NoMemory(); | |
|
308 | return NULL; | |
|
309 | } | |
|
310 | ||
|
311 | ZSTD_copyDCtx(dctx, self->refdctx); | |
|
312 | ||
|
313 | if (self->dict) { | |
|
314 | dictData = self->dict->dictData; | |
|
315 | dictSize = self->dict->dictSize; | |
|
316 | } | |
|
317 | ||
|
318 | if (dictData && !self->ddict) { | |
|
319 | Py_BEGIN_ALLOW_THREADS | |
|
320 | self->ddict = ZSTD_createDDict(dictData, dictSize); | |
|
321 | Py_END_ALLOW_THREADS | |
|
322 | ||
|
323 | if (!self->ddict) { | |
|
324 | PyErr_SetString(ZstdError, "could not create decompression dict"); | |
|
325 | goto except; | |
|
326 | } | |
|
327 | } | |
|
328 | ||
|
329 | decompressedSize = ZSTD_getDecompressedSize(source, sourceSize); | |
|
330 | /* 0 returned if content size not in the zstd frame header */ | |
|
331 | if (0 == decompressedSize) { | |
|
332 | if (0 == maxOutputSize) { | |
|
333 | PyErr_SetString(ZstdError, "input data invalid or missing content size " | |
|
334 | "in frame header"); | |
|
335 | goto except; | |
|
336 | } | |
|
337 | else { | |
|
338 | result = PyBytes_FromStringAndSize(NULL, maxOutputSize); | |
|
339 | destCapacity = maxOutputSize; | |
|
340 | } | |
|
341 | } | |
|
342 | else { | |
|
343 | result = PyBytes_FromStringAndSize(NULL, decompressedSize); | |
|
344 | destCapacity = decompressedSize; | |
|
345 | } | |
|
346 | ||
|
347 | if (!result) { | |
|
348 | goto except; | |
|
349 | } | |
|
350 | ||
|
351 | Py_BEGIN_ALLOW_THREADS | |
|
352 | if (self->ddict) { | |
|
353 | zresult = ZSTD_decompress_usingDDict(dctx, PyBytes_AsString(result), destCapacity, | |
|
354 | source, sourceSize, self->ddict); | |
|
355 | } | |
|
356 | else { | |
|
357 | zresult = ZSTD_decompressDCtx(dctx, PyBytes_AsString(result), destCapacity, source, sourceSize); | |
|
358 | } | |
|
359 | Py_END_ALLOW_THREADS | |
|
360 | ||
|
361 | if (ZSTD_isError(zresult)) { | |
|
362 | PyErr_Format(ZstdError, "decompression error: %s", ZSTD_getErrorName(zresult)); | |
|
363 | goto except; | |
|
364 | } | |
|
365 | else if (decompressedSize && zresult != decompressedSize) { | |
|
366 | PyErr_Format(ZstdError, "decompression error: decompressed %zu bytes; expected %llu", | |
|
367 | zresult, decompressedSize); | |
|
368 | goto except; | |
|
369 | } | |
|
370 | else if (zresult < destCapacity) { | |
|
371 | if (_PyBytes_Resize(&result, zresult)) { | |
|
372 | goto except; | |
|
373 | } | |
|
374 | } | |
|
375 | ||
|
376 | goto finally; | |
|
377 | ||
|
378 | except: | |
|
379 | Py_DecRef(result); | |
|
380 | result = NULL; | |
|
381 | ||
|
382 | finally: | |
|
383 | if (dctx) { | |
|
384 | PyMem_FREE(dctx); | |
|
385 | } | |
|
386 | ||
|
387 | return result; | |
|
388 | } | |
|
389 | ||
|
390 | PyDoc_STRVAR(Decompressor_decompressobj__doc__, | |
|
391 | "decompressobj()\n" | |
|
392 | "\n" | |
|
393 | "Incrementally feed data into a decompressor.\n" | |
|
394 | "\n" | |
|
395 | "The returned object exposes a ``decompress(data)`` method. This makes it\n" | |
|
396 | "compatible with ``zlib.decompressobj`` and ``bz2.BZ2Decompressor`` so that\n" | |
|
397 | "callers can swap in the zstd decompressor while using the same API.\n" | |
|
398 | ); | |
|
399 | ||
|
400 | static ZstdDecompressionObj* Decompressor_decompressobj(ZstdDecompressor* self) { | |
|
401 | ZstdDecompressionObj* result = PyObject_New(ZstdDecompressionObj, &ZstdDecompressionObjType); | |
|
402 | if (!result) { | |
|
403 | return NULL; | |
|
404 | } | |
|
405 | ||
|
406 | result->dstream = DStream_from_ZstdDecompressor(self); | |
|
407 | if (!result->dstream) { | |
|
408 | Py_DecRef((PyObject*)result); | |
|
409 | return NULL; | |
|
410 | } | |
|
411 | ||
|
412 | result->decompressor = self; | |
|
413 | Py_INCREF(result->decompressor); | |
|
414 | ||
|
415 | result->finished = 0; | |
|
416 | ||
|
417 | return result; | |
|
418 | } | |
|
419 | ||
|
420 | PyDoc_STRVAR(Decompressor_read_from__doc__, | |
|
421 | "read_from(reader[, read_size=default, write_size=default, skip_bytes=0])\n" | |
|
422 | "Read compressed data and return an iterator\n" | |
|
423 | "\n" | |
|
424 | "Returns an iterator of decompressed data chunks produced from reading from\n" | |
|
425 | "the ``reader``.\n" | |
|
426 | "\n" | |
|
427 | "Compressed data will be obtained from ``reader`` by calling the\n" | |
|
428 | "``read(size)`` method of it. The source data will be streamed into a\n" | |
|
429 | "decompressor. As decompressed data is available, it will be exposed to the\n" | |
|
430 | "returned iterator.\n" | |
|
431 | "\n" | |
|
432 | "Data is ``read()`` in chunks of size ``read_size`` and exposed to the\n" | |
|
433 | "iterator in chunks of size ``write_size``. The default values are the input\n" | |
|
434 | "and output sizes for a zstd streaming decompressor.\n" | |
|
435 | "\n" | |
|
436 | "There is also support for skipping the first ``skip_bytes`` of data from\n" | |
|
437 | "the source.\n" | |
|
438 | ); | |
|
439 | ||
|
440 | static ZstdDecompressorIterator* Decompressor_read_from(ZstdDecompressor* self, PyObject* args, PyObject* kwargs) { | |
|
441 | static char* kwlist[] = { | |
|
442 | "reader", | |
|
443 | "read_size", | |
|
444 | "write_size", | |
|
445 | "skip_bytes", | |
|
446 | NULL | |
|
447 | }; | |
|
448 | ||
|
449 | PyObject* reader; | |
|
450 | size_t inSize = ZSTD_DStreamInSize(); | |
|
451 | size_t outSize = ZSTD_DStreamOutSize(); | |
|
452 | ZstdDecompressorIterator* result; | |
|
453 | size_t skipBytes = 0; | |
|
454 | ||
|
455 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|kkk", kwlist, &reader, | |
|
456 | &inSize, &outSize, &skipBytes)) { | |
|
457 | return NULL; | |
|
458 | } | |
|
459 | ||
|
460 | if (skipBytes >= inSize) { | |
|
461 | PyErr_SetString(PyExc_ValueError, | |
|
462 | "skip_bytes must be smaller than read_size"); | |
|
463 | return NULL; | |
|
464 | } | |
|
465 | ||
|
466 | result = PyObject_New(ZstdDecompressorIterator, &ZstdDecompressorIteratorType); | |
|
467 | if (!result) { | |
|
468 | return NULL; | |
|
469 | } | |
|
470 | ||
|
471 | result->decompressor = NULL; | |
|
472 | result->reader = NULL; | |
|
473 | result->buffer = NULL; | |
|
474 | result->dstream = NULL; | |
|
475 | result->input.src = NULL; | |
|
476 | result->output.dst = NULL; | |
|
477 | ||
|
478 | if (PyObject_HasAttrString(reader, "read")) { | |
|
479 | result->reader = reader; | |
|
480 | Py_INCREF(result->reader); | |
|
481 | } | |
|
482 | else if (1 == PyObject_CheckBuffer(reader)) { | |
|
483 | /* Object claims it is a buffer. Try to get a handle to it. */ | |
|
484 | result->buffer = PyMem_Malloc(sizeof(Py_buffer)); | |
|
485 | if (!result->buffer) { | |
|
486 | goto except; | |
|
487 | } | |
|
488 | ||
|
489 | memset(result->buffer, 0, sizeof(Py_buffer)); | |
|
490 | ||
|
491 | if (0 != PyObject_GetBuffer(reader, result->buffer, PyBUF_CONTIG_RO)) { | |
|
492 | goto except; | |
|
493 | } | |
|
494 | ||
|
495 | result->bufferOffset = 0; | |
|
496 | } | |
|
497 | else { | |
|
498 | PyErr_SetString(PyExc_ValueError, | |
|
499 | "must pass an object with a read() method or one that conforms to the buffer protocol"); | 
|
500 | goto except; | |
|
501 | } | |
|
502 | ||
|
503 | result->decompressor = self; | |
|
504 | Py_INCREF(result->decompressor); | |
|
505 | ||
|
506 | result->inSize = inSize; | |
|
507 | result->outSize = outSize; | |
|
508 | result->skipBytes = skipBytes; | |
|
509 | ||
|
510 | result->dstream = DStream_from_ZstdDecompressor(self); | |
|
511 | if (!result->dstream) { | |
|
512 | goto except; | |
|
513 | } | |
|
514 | ||
|
515 | result->input.src = PyMem_Malloc(inSize); | |
|
516 | if (!result->input.src) { | |
|
517 | PyErr_NoMemory(); | |
|
518 | goto except; | |
|
519 | } | |
|
520 | result->input.size = 0; | |
|
521 | result->input.pos = 0; | |
|
522 | ||
|
523 | result->output.dst = NULL; | |
|
524 | result->output.size = 0; | |
|
525 | result->output.pos = 0; | |
|
526 | ||
|
527 | result->readCount = 0; | |
|
528 | result->finishedInput = 0; | |
|
529 | result->finishedOutput = 0; | |
|
530 | ||
|
531 | goto finally; | |
|
532 | ||
|
533 | except: | |
|
534 | if (result->reader) { | |
|
535 | Py_DECREF(result->reader); | |
|
536 | result->reader = NULL; | |
|
537 | } | |
|
538 | ||
|
539 | if (result->buffer) { | |
|
540 | PyBuffer_Release(result->buffer); | |
|
541 | Py_DECREF(result->buffer); | |
|
542 | result->buffer = NULL; | |
|
543 | } | |
|
544 | ||
|
545 | Py_DECREF(result); | |
|
546 | result = NULL; | |
|
547 | ||
|
548 | finally: | |
|
549 | ||
|
550 | return result; | |
|
551 | } | |
|
552 | ||
|
553 | PyDoc_STRVAR(Decompressor_write_to__doc__, | |
|
554 | "Create a context manager to write decompressed data to an object.\n" | |
|
555 | "\n" | |
|
556 | "The passed object must have a ``write()`` method.\n" | |
|
557 | "\n" | |
|
558 | "The caller feeds input data to the object by calling ``write(data)``.\n" | 
|
559 | "Decompressed data is written to the argument given as it is decompressed.\n" | |
|
560 | "\n" | |
|
561 | "An optional ``write_size`` argument defines the size of chunks to\n" | |
|
562 | "``write()`` to the writer. It defaults to the default output size for a zstd\n" | |
|
563 | "streaming decompressor.\n" | |
|
564 | ); | |
|
565 | ||
|
566 | static ZstdDecompressionWriter* Decompressor_write_to(ZstdDecompressor* self, PyObject* args, PyObject* kwargs) { | |
|
567 | static char* kwlist[] = { | |
|
568 | "writer", | |
|
569 | "write_size", | |
|
570 | NULL | |
|
571 | }; | |
|
572 | ||
|
573 | PyObject* writer; | |
|
574 | size_t outSize = ZSTD_DStreamOutSize(); | |
|
575 | ZstdDecompressionWriter* result; | |
|
576 | ||
|
577 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|k", kwlist, &writer, &outSize)) { | |
|
578 | return NULL; | |
|
579 | } | |
|
580 | ||
|
581 | if (!PyObject_HasAttrString(writer, "write")) { | |
|
582 | PyErr_SetString(PyExc_ValueError, "must pass an object with a write() method"); | |
|
583 | return NULL; | |
|
584 | } | |
|
585 | ||
|
586 | result = PyObject_New(ZstdDecompressionWriter, &ZstdDecompressionWriterType); | |
|
587 | if (!result) { | |
|
588 | return NULL; | |
|
589 | } | |
|
590 | ||
|
591 | result->decompressor = self; | |
|
592 | Py_INCREF(result->decompressor); | |
|
593 | ||
|
594 | result->writer = writer; | |
|
595 | Py_INCREF(result->writer); | |
|
596 | ||
|
597 | result->outSize = outSize; | |
|
598 | ||
|
599 | result->entered = 0; | |
|
600 | result->dstream = NULL; | |
|
601 | ||
|
602 | return result; | |
|
603 | } | |
|
604 | ||
|
605 | static PyMethodDef Decompressor_methods[] = { | |
|
606 | { "copy_stream", (PyCFunction)Decompressor_copy_stream, METH_VARARGS | METH_KEYWORDS, | |
|
607 | Decompressor_copy_stream__doc__ }, | |
|
608 | { "decompress", (PyCFunction)Decompressor_decompress, METH_VARARGS | METH_KEYWORDS, | |
|
609 | Decompressor_decompress__doc__ }, | |
|
610 | { "decompressobj", (PyCFunction)Decompressor_decompressobj, METH_NOARGS, | |
|
611 | Decompressor_decompressobj__doc__ }, | |
|
612 | { "read_from", (PyCFunction)Decompressor_read_from, METH_VARARGS | METH_KEYWORDS, | |
|
613 | Decompressor_read_from__doc__ }, | |
|
614 | { "write_to", (PyCFunction)Decompressor_write_to, METH_VARARGS | METH_KEYWORDS, | |
|
615 | Decompressor_write_to__doc__ }, | |
|
616 | { NULL, NULL } | |
|
617 | }; | |
|
618 | ||
|
619 | PyTypeObject ZstdDecompressorType = { | |
|
620 | PyVarObject_HEAD_INIT(NULL, 0) | |
|
621 | "zstd.ZstdDecompressor", /* tp_name */ | |
|
622 | sizeof(ZstdDecompressor), /* tp_basicsize */ | |
|
623 | 0, /* tp_itemsize */ | |
|
624 | (destructor)Decompressor_dealloc, /* tp_dealloc */ | |
|
625 | 0, /* tp_print */ | |
|
626 | 0, /* tp_getattr */ | |
|
627 | 0, /* tp_setattr */ | |
|
628 | 0, /* tp_compare */ | |
|
629 | 0, /* tp_repr */ | |
|
630 | 0, /* tp_as_number */ | |
|
631 | 0, /* tp_as_sequence */ | |
|
632 | 0, /* tp_as_mapping */ | |
|
633 | 0, /* tp_hash */ | |
|
634 | 0, /* tp_call */ | |
|
635 | 0, /* tp_str */ | |
|
636 | 0, /* tp_getattro */ | |
|
637 | 0, /* tp_setattro */ | |
|
638 | 0, /* tp_as_buffer */ | |
|
639 | Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /* tp_flags */ | |
|
640 | Decompressor__doc__, /* tp_doc */ | |
|
641 | 0, /* tp_traverse */ | |
|
642 | 0, /* tp_clear */ | |
|
643 | 0, /* tp_richcompare */ | |
|
644 | 0, /* tp_weaklistoffset */ | |
|
645 | 0, /* tp_iter */ | |
|
646 | 0, /* tp_iternext */ | |
|
647 | Decompressor_methods, /* tp_methods */ | |
|
648 | 0, /* tp_members */ | |
|
649 | 0, /* tp_getset */ | |
|
650 | 0, /* tp_base */ | |
|
651 | 0, /* tp_dict */ | |
|
652 | 0, /* tp_descr_get */ | |
|
653 | 0, /* tp_descr_set */ | |
|
654 | 0, /* tp_dictoffset */ | |
|
655 | (initproc)Decompressor_init, /* tp_init */ | |
|
656 | 0, /* tp_alloc */ | |
|
657 | PyType_GenericNew, /* tp_new */ | |
|
658 | }; | |
|
659 | ||
|
660 | void decompressor_module_init(PyObject* mod) { | |
|
661 | Py_TYPE(&ZstdDecompressorType) = &PyType_Type; | |
|
662 | if (PyType_Ready(&ZstdDecompressorType) < 0) { | |
|
663 | return; | |
|
664 | } | |
|
665 | ||
|
666 | Py_INCREF((PyObject*)&ZstdDecompressorType); | |
|
667 | PyModule_AddObject(mod, "ZstdDecompressor", | |
|
668 | (PyObject*)&ZstdDecompressorType); | |
|
669 | } |
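`Decompressor_copy_stream` above implements a classic pump loop: read a chunk from the source, run it through the streaming decompressor, write any output produced, and return the `(bytes_read, bytes_written)` totals. A hedged Python sketch of the same loop, using the stdlib `zlib` streaming decompressor as a stand-in (the `copy_stream` helper below is illustrative, not the library's implementation):

```python
import io
import zlib

def copy_stream(ifh, ofh, read_size=8192):
    """Decompress from ifh to ofh in chunks; return (bytes_read, bytes_written)."""
    dobj = zlib.decompressobj()
    total_read = total_write = 0
    while True:
        chunk = ifh.read(read_size)
        if not chunk:  # empty read means EOF, mirroring the C loop's break
            break
        total_read += len(chunk)
        data = dobj.decompress(chunk)
        ofh.write(data)
        total_write += len(data)
    tail = dobj.flush()
    ofh.write(tail)
    total_write += len(tail)
    return total_read, total_write

src = io.BytesIO(zlib.compress(b"x" * 10000))
dst = io.BytesIO()
nread, nwritten = copy_stream(src, dst)
assert dst.getvalue() == b"x" * 10000
```

The C version differs mainly in that it releases the GIL around `ZSTD_decompressStream()` and may call `write()` several times per input chunk when the fixed-size output buffer fills up.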
@@ -0,0 +1,254 b'' | |||
|
1 | /** | |
|
2 | * Copyright (c) 2016-present, Gregory Szorc | |
|
3 | * All rights reserved. | |
|
4 | * | |
|
5 | * This software may be modified and distributed under the terms | |
|
6 | * of the BSD license. See the LICENSE file for details. | |
|
7 | */ | |
|
8 | ||
|
9 | #include "python-zstandard.h" | |
|
10 | ||
|
11 | #define min(a, b) (((a) < (b)) ? (a) : (b)) | |
|
12 | ||
|
13 | extern PyObject* ZstdError; | |
|
14 | ||
|
15 | PyDoc_STRVAR(ZstdDecompressorIterator__doc__, | |
|
16 | "Represents an iterator of decompressed data.\n" | |
|
17 | ); | |
|
18 | ||
|
19 | static void ZstdDecompressorIterator_dealloc(ZstdDecompressorIterator* self) { | |
|
20 | Py_XDECREF(self->decompressor); | |
|
21 | Py_XDECREF(self->reader); | |
|
22 | ||
|
23 | if (self->buffer) { | |
|
24 | PyBuffer_Release(self->buffer); | |
|
25 | PyMem_FREE(self->buffer); | |
|
26 | self->buffer = NULL; | |
|
27 | } | |
|
28 | ||
|
29 | if (self->dstream) { | |
|
30 | ZSTD_freeDStream(self->dstream); | |
|
31 | self->dstream = NULL; | |
|
32 | } | |
|
33 | ||
|
34 | if (self->input.src) { | |
|
35 | PyMem_Free((void*)self->input.src); | |
|
36 | self->input.src = NULL; | |
|
37 | } | |
|
38 | ||
|
39 | PyObject_Del(self); | |
|
40 | } | |
|
41 | ||
|
42 | static PyObject* ZstdDecompressorIterator_iter(PyObject* self) { | |
|
43 | Py_INCREF(self); | |
|
44 | return self; | |
|
45 | } | |
|
46 | ||
|
47 | static DecompressorIteratorResult read_decompressor_iterator(ZstdDecompressorIterator* self) { | |
|
48 | size_t zresult; | |
|
49 | PyObject* chunk; | |
|
50 | DecompressorIteratorResult result; | |
|
51 | size_t oldInputPos = self->input.pos; | |
|
52 | ||
|
53 | result.chunk = NULL; | |
|
54 | ||
|
55 | chunk = PyBytes_FromStringAndSize(NULL, self->outSize); | |
|
56 | if (!chunk) { | |
|
57 | result.errored = 1; | |
|
58 | return result; | |
|
59 | } | |
|
60 | ||
|
61 | self->output.dst = PyBytes_AsString(chunk); | |
|
62 | self->output.size = self->outSize; | |
|
63 | self->output.pos = 0; | |
|
64 | ||
|
65 | Py_BEGIN_ALLOW_THREADS | |
|
66 | zresult = ZSTD_decompressStream(self->dstream, &self->output, &self->input); | |
|
67 | Py_END_ALLOW_THREADS | |
|
68 | ||
|
69 | /* We're done with the pointer. Nullify to prevent anyone from getting a | |
|
70 | handle on a Python object. */ | |
|
71 | self->output.dst = NULL; | |
|
72 | ||
|
73 | if (ZSTD_isError(zresult)) { | |
|
74 | Py_DECREF(chunk); | |
|
75 | PyErr_Format(ZstdError, "zstd decompress error: %s", | |
|
76 | ZSTD_getErrorName(zresult)); | |
|
77 | result.errored = 1; | |
|
78 | return result; | |
|
79 | } | |
|
80 | ||
|
81 | self->readCount += self->input.pos - oldInputPos; | |
|
82 | ||
|
83 | /* Frame is fully decoded. Input exhausted and output sitting in buffer. */ | |
|
84 | if (0 == zresult) { | |
|
85 | self->finishedInput = 1; | |
|
86 | self->finishedOutput = 1; | |
|
87 | } | |
|
88 | ||
|
89 | /* If it produced output data, return it. */ | |
|
90 | if (self->output.pos) { | |
|
91 | if (self->output.pos < self->outSize) { | |
|
92 | if (_PyBytes_Resize(&chunk, self->output.pos)) { | |
|
93 | result.errored = 1; | |
|
94 | return result; | |
|
95 | } | |
|
96 | } | |
|
97 | } | |
|
98 | else { | |
|
99 | Py_DECREF(chunk); | |
|
100 | chunk = NULL; | |
|
101 | } | |
|
102 | ||
|
103 | result.errored = 0; | |
|
104 | result.chunk = chunk; | |
|
105 | ||
|
106 | return result; | |
|
107 | } | |
|
108 | ||
|
109 | static PyObject* ZstdDecompressorIterator_iternext(ZstdDecompressorIterator* self) { | |
|
110 | PyObject* readResult = NULL; | |
|
111 | char* readBuffer; | |
|
112 | Py_ssize_t readSize; | |
|
113 | Py_ssize_t bufferRemaining; | |
|
114 | DecompressorIteratorResult result; | |
|
115 | ||
|
116 | if (self->finishedOutput) { | |
|
117 | PyErr_SetString(PyExc_StopIteration, "output flushed"); | |
|
118 | return NULL; | |
|
119 | } | |
|
120 | ||
|
121 | /* If we have data left in the input, consume it. */ | |
|
122 | if (self->input.pos < self->input.size) { | |
|
123 | result = read_decompressor_iterator(self); | |
|
124 | if (result.chunk || result.errored) { | |
|
125 | return result.chunk; | |
|
126 | } | |
|
127 | ||
|
128 | /* Else fall through to get more data from input. */ | |
|
129 | } | |
|
130 | ||
|
131 | read_from_source: | |
|
132 | ||
|
133 | if (!self->finishedInput) { | |
|
134 | if (self->reader) { | |
|
135 | readResult = PyObject_CallMethod(self->reader, "read", "I", self->inSize); | |
|
136 | if (!readResult) { | |
|
137 | return NULL; | |
|
138 | } | |
|
139 | ||
|
140 | PyBytes_AsStringAndSize(readResult, &readBuffer, &readSize); | |
|
141 | } | |
|
142 | else { | |
|
143 | assert(self->buffer && self->buffer->buf); | |
|
144 | ||
|
145 | /* Only support contiguous C arrays for now */ | |
|
146 | assert(self->buffer->strides == NULL && self->buffer->suboffsets == NULL); | |
|
147 | assert(self->buffer->itemsize == 1); | |
|
148 | ||
|
149 | /* TODO avoid memcpy() below */ | |
|
150 | readBuffer = (char *)self->buffer->buf + self->bufferOffset; | |
|
151 | bufferRemaining = self->buffer->len - self->bufferOffset; | |
|
152 | readSize = min(bufferRemaining, (Py_ssize_t)self->inSize); | |
|
153 | self->bufferOffset += readSize; | |
|
154 | } | |
|
155 | ||
|
156 | if (readSize) { | |
|
157 | if (!self->readCount && self->skipBytes) { | |
|
158 | assert(self->skipBytes < self->inSize); | |
|
159 | if ((Py_ssize_t)self->skipBytes >= readSize) { | |
|
160 | PyErr_SetString(PyExc_ValueError, | |
|
161 | "skip_bytes larger than first input chunk; " | |
|
162 | "this scenario is currently unsupported"); | |
|
163 | Py_DecRef(readResult); | |
|
164 | return NULL; | |
|
165 | } | |
|
166 | ||
|
167 | readBuffer = readBuffer + self->skipBytes; | |
|
168 | readSize -= self->skipBytes; | |
|
169 | } | |
|
170 | ||
|
171 | /* Copy input into previously allocated buffer because it can live longer | |
|
172 | than a single function call and we don't want to keep a ref to a Python | |
|
173 | object around. This could be changed... */ | |
|
174 | memcpy((void*)self->input.src, readBuffer, readSize); | |
|
175 | self->input.size = readSize; | |
|
176 | self->input.pos = 0; | |
|
177 | } | |
|
178 | /* No bytes on first read must mean an empty input stream. */ | |
|
179 | else if (!self->readCount) { | |
|
180 | self->finishedInput = 1; | |
|
181 | self->finishedOutput = 1; | |
|
182 | Py_DecRef(readResult); | |
|
183 | PyErr_SetString(PyExc_StopIteration, "empty input"); | |
|
184 | return NULL; | |
|
185 | } | |
|
186 | else { | |
|
187 | self->finishedInput = 1; | |
|
188 | } | |
|
189 | ||
|
190 | /* We've copied the data managed by memory. Discard the Python object. */ | |
|
191 | Py_DecRef(readResult); | |
|
192 | } | |
|
193 | ||
|
194 | result = read_decompressor_iterator(self); | |
|
195 | if (result.errored || result.chunk) { | |
|
196 | return result.chunk; | |
|
197 | } | |
|
198 | ||
|
199 | /* No new output data. Try again unless we know there is no more data. */ | |
|
200 | if (!self->finishedInput) { | |
|
201 | goto read_from_source; | |
|
202 | } | |
|
203 | ||
|
204 | PyErr_SetString(PyExc_StopIteration, "input exhausted"); | |
|
205 | return NULL; | |
|
206 | } | |
|
207 | ||
|
208 | PyTypeObject ZstdDecompressorIteratorType = { | |
|
209 | PyVarObject_HEAD_INIT(NULL, 0) | |
|
210 | "zstd.ZstdDecompressorIterator", /* tp_name */ | |
|
211 | sizeof(ZstdDecompressorIterator), /* tp_basicsize */ | |
|
212 | 0, /* tp_itemsize */ | |
|
213 | (destructor)ZstdDecompressorIterator_dealloc, /* tp_dealloc */ | |
|
214 | 0, /* tp_print */ | |
|
215 | 0, /* tp_getattr */ | |
|
216 | 0, /* tp_setattr */ | |
|
217 | 0, /* tp_compare */ | |
|
218 | 0, /* tp_repr */ | |
|
219 | 0, /* tp_as_number */ | |
|
220 | 0, /* tp_as_sequence */ | |
|
221 | 0, /* tp_as_mapping */ | |
|
222 | 0, /* tp_hash */ | |
|
223 | 0, /* tp_call */ | |
|
224 | 0, /* tp_str */ | |
|
225 | 0, /* tp_getattro */ | |
|
226 | 0, /* tp_setattro */ | |
|
227 | 0, /* tp_as_buffer */ | |
|
228 | Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /* tp_flags */ | |
|
229 | ZstdDecompressorIterator__doc__, /* tp_doc */ | |
|
230 | 0, /* tp_traverse */ | |
|
231 | 0, /* tp_clear */ | |
|
232 | 0, /* tp_richcompare */ | |
|
233 | 0, /* tp_weaklistoffset */ | |
|
234 | ZstdDecompressorIterator_iter, /* tp_iter */ | |
|
235 | (iternextfunc)ZstdDecompressorIterator_iternext, /* tp_iternext */ | |
|
236 | 0, /* tp_methods */ | |
|
237 | 0, /* tp_members */ | |
|
238 | 0, /* tp_getset */ | |
|
239 | 0, /* tp_base */ | |
|
240 | 0, /* tp_dict */ | |
|
241 | 0, /* tp_descr_get */ | |
|
242 | 0, /* tp_descr_set */ | |
|
243 | 0, /* tp_dictoffset */ | |
|
244 | 0, /* tp_init */ | |
|
245 | 0, /* tp_alloc */ | |
|
246 | PyType_GenericNew, /* tp_new */ | |
|
247 | }; | |
|
248 | ||
|
249 | void decompressoriterator_module_init(PyObject* mod) { | |
|
250 | Py_TYPE(&ZstdDecompressorIteratorType) = &PyType_Type; | |
|
251 | if (PyType_Ready(&ZstdDecompressorIteratorType) < 0) { | |
|
252 | return; | |
|
253 | } | |
|
254 | } |
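The `iternext` implementation above follows a simple control flow: read a chunk from the source, feed it to the decompressor, yield any output, and signal `StopIteration` once the input is exhausted. A minimal pure-Python sketch of that same loop (the `chunk_iterator` name and `transform` callable are illustrative, not part of the real binding):

```python
import io

def chunk_iterator(reader, transform, read_size=8192):
    """Mirror the C iternext control flow: pull from a source object,
    run each piece through a transform, stop when input is exhausted."""
    finished_input = False
    while True:
        if not finished_input:
            data = reader.read(read_size)
            if not data:
                # No bytes on a read means the input stream is exhausted.
                finished_input = True
            else:
                chunk = transform(data)
                if chunk:
                    yield chunk
                continue
        # Input exhausted and nothing left to emit.
        return
```

In the C version the transform is the streaming decompress call and the yielded chunks are bytes objects built from the output buffer.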
@@ -0,0 +1,125 b'' | |||
|
1 | /** | |
|
2 | * Copyright (c) 2016-present, Gregory Szorc | |
|
3 | * All rights reserved. | |
|
4 | * | |
|
5 | * This software may be modified and distributed under the terms | |
|
6 | * of the BSD license. See the LICENSE file for details. | |
|
7 | */ | |
|
8 | ||
|
9 | #include "python-zstandard.h" | |
|
10 | ||
|
11 | PyDoc_STRVAR(DictParameters__doc__, | |
|
12 | "DictParameters: low-level control over dictionary generation"); | |
|
13 | ||
|
14 | static PyObject* DictParameters_new(PyTypeObject* subtype, PyObject* args, PyObject* kwargs) { | |
|
15 | DictParametersObject* self; | |
|
16 | unsigned selectivityLevel; | |
|
17 | int compressionLevel; | |
|
18 | unsigned notificationLevel; | |
|
19 | unsigned dictID; | |
|
20 | ||
|
21 | if (!PyArg_ParseTuple(args, "IiII", &selectivityLevel, &compressionLevel, | |
|
22 | ¬ificationLevel, &dictID)) { | |
|
23 | return NULL; | |
|
24 | } | |
|
25 | ||
|
26 | self = (DictParametersObject*)subtype->tp_alloc(subtype, 1); | |
|
27 | if (!self) { | |
|
28 | return NULL; | |
|
29 | } | |
|
30 | ||
|
31 | self->selectivityLevel = selectivityLevel; | |
|
32 | self->compressionLevel = compressionLevel; | |
|
33 | self->notificationLevel = notificationLevel; | |
|
34 | self->dictID = dictID; | |
|
35 | ||
|
36 | return (PyObject*)self; | |
|
37 | } | |
|
38 | ||
|
39 | static void DictParameters_dealloc(PyObject* self) { | |
|
40 | PyObject_Del(self); | |
|
41 | } | |
|
42 | ||
|
43 | static Py_ssize_t DictParameters_length(PyObject* self) { | |
|
44 | return 4; | |
|
45 | }; | |
|
46 | ||
|
47 | static PyObject* DictParameters_item(PyObject* o, Py_ssize_t i) { | |
|
48 | DictParametersObject* self = (DictParametersObject*)o; | |
|
49 | ||
|
50 | switch (i) { | |
|
51 | case 0: | |
|
52 | return PyLong_FromLong(self->selectivityLevel); | |
|
53 | case 1: | |
|
54 | return PyLong_FromLong(self->compressionLevel); | |
|
55 | case 2: | |
|
56 | return PyLong_FromLong(self->notificationLevel); | |
|
57 | case 3: | |
|
58 | return PyLong_FromLong(self->dictID); | |
|
59 | default: | |
|
60 | PyErr_SetString(PyExc_IndexError, "index out of range"); | |
|
61 | return NULL; | |
|
62 | } | |
|
63 | } | |
|
64 | ||
|
65 | static PySequenceMethods DictParameters_sq = { | |
|
66 | DictParameters_length, /* sq_length */ | |
|
67 | 0, /* sq_concat */ | |
|
68 | 0, /* sq_repeat */ | |
|
69 | DictParameters_item, /* sq_item */ | |
|
70 | 0, /* sq_ass_item */ | |
|
71 | 0, /* sq_contains */ | |
|
72 | 0, /* sq_inplace_concat */ | |
|
73 | 0 /* sq_inplace_repeat */ | |
|
74 | }; | |
|
75 | ||
|
76 | PyTypeObject DictParametersType = { | |
|
77 | PyVarObject_HEAD_INIT(NULL, 0) | |
|
78 | "DictParameters", /* tp_name */ | |
|
79 | sizeof(DictParametersObject), /* tp_basicsize */ | |
|
80 | 0, /* tp_itemsize */ | |
|
81 | (destructor)DictParameters_dealloc, /* tp_dealloc */ | |
|
82 | 0, /* tp_print */ | |
|
83 | 0, /* tp_getattr */ | |
|
84 | 0, /* tp_setattr */ | |
|
85 | 0, /* tp_compare */ | |
|
86 | 0, /* tp_repr */ | |
|
87 | 0, /* tp_as_number */ | |
|
88 | &DictParameters_sq, /* tp_as_sequence */ | |
|
89 | 0, /* tp_as_mapping */ | |
|
90 | 0, /* tp_hash */ | |
|
91 | 0, /* tp_call */ | |
|
92 | 0, /* tp_str */ | |
|
93 | 0, /* tp_getattro */ | |
|
94 | 0, /* tp_setattro */ | |
|
95 | 0, /* tp_as_buffer */ | |
|
96 | Py_TPFLAGS_DEFAULT, /* tp_flags */ | |
|
97 | DictParameters__doc__, /* tp_doc */ | |
|
98 | 0, /* tp_traverse */ | |
|
99 | 0, /* tp_clear */ | |
|
100 | 0, /* tp_richcompare */ | |
|
101 | 0, /* tp_weaklistoffset */ | |
|
102 | 0, /* tp_iter */ | |
|
103 | 0, /* tp_iternext */ | |
|
104 | 0, /* tp_methods */ | |
|
105 | 0, /* tp_members */ | |
|
106 | 0, /* tp_getset */ | |
|
107 | 0, /* tp_base */ | |
|
108 | 0, /* tp_dict */ | |
|
109 | 0, /* tp_descr_get */ | |
|
110 | 0, /* tp_descr_set */ | |
|
111 | 0, /* tp_dictoffset */ | |
|
112 | 0, /* tp_init */ | |
|
113 | 0, /* tp_alloc */ | |
|
114 | DictParameters_new, /* tp_new */ | |
|
115 | }; | |
|
116 | ||
|
117 | void dictparams_module_init(PyObject* mod) { | |
|
118 | Py_TYPE(&DictParametersType) = &PyType_Type; | |
|
119 | if (PyType_Ready(&DictParametersType) < 0) { | |
|
120 | return; | |
|
121 | } | |
|
122 | ||
|
123 | Py_IncRef((PyObject*)&DictParametersType); | |
|
124 | PyModule_AddObject(mod, "DictParameters", (PyObject*)&DictParametersType); | |
|
125 | } |
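`DictParameters` exposes its four fields through the sequence protocol (`sq_length` returns 4, `sq_item` maps indices 0-3 to the fields), so instances behave like a read-only 4-tuple. A pure-Python analog of that shape (the class name is made up for illustration):

```python
class DictParametersSketch:
    """Pure-Python analog of the C DictParameters type: a fixed,
    read-only 4-element sequence over the dictionary parameters."""

    def __init__(self, selectivity, compression_level, notification_level, dict_id):
        self._fields = (selectivity, compression_level,
                        notification_level, dict_id)

    def __len__(self):
        return 4

    def __getitem__(self, i):
        # Matches the C sq_item: valid indices are 0 through 3.
        if not 0 <= i < 4:
            raise IndexError('index out of range')
        return self._fields[i]
```

Because iteration falls back to `__getitem__`, tuple-style unpacking (`s, c, n, d = params`) works without a dedicated iterator.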
@@ -0,0 +1,172 b'' | |||
|
1 | /** | |
|
2 | * Copyright (c) 2016-present, Gregory Szorc | |
|
3 | * All rights reserved. | |
|
4 | * | |
|
5 | * This software may be modified and distributed under the terms | |
|
6 | * of the BSD license. See the LICENSE file for details. | |
|
7 | */ | |
|
8 | ||
|
9 | #define PY_SSIZE_T_CLEAN | |
|
10 | #include <Python.h> | |
|
11 | ||
|
12 | #define ZSTD_STATIC_LINKING_ONLY | |
|
13 | #define ZDICT_STATIC_LINKING_ONLY | |
|
14 | #include "mem.h" | |
|
15 | #include "zstd.h" | |
|
16 | #include "zdict.h" | |
|
17 | ||
|
18 | #define PYTHON_ZSTANDARD_VERSION "0.5.0" | |
|
19 | ||
|
20 | typedef struct { | |
|
21 | PyObject_HEAD | |
|
22 | unsigned windowLog; | |
|
23 | unsigned chainLog; | |
|
24 | unsigned hashLog; | |
|
25 | unsigned searchLog; | |
|
26 | unsigned searchLength; | |
|
27 | unsigned targetLength; | |
|
28 | ZSTD_strategy strategy; | |
|
29 | } CompressionParametersObject; | |
|
30 | ||
|
31 | extern PyTypeObject CompressionParametersType; | |
|
32 | ||
|
33 | typedef struct { | |
|
34 | PyObject_HEAD | |
|
35 | unsigned selectivityLevel; | |
|
36 | int compressionLevel; | |
|
37 | unsigned notificationLevel; | |
|
38 | unsigned dictID; | |
|
39 | } DictParametersObject; | |
|
40 | ||
|
41 | extern PyTypeObject DictParametersType; | |
|
42 | ||
|
43 | typedef struct { | |
|
44 | PyObject_HEAD | |
|
45 | ||
|
46 | void* dictData; | |
|
47 | size_t dictSize; | |
|
48 | } ZstdCompressionDict; | |
|
49 | ||
|
50 | extern PyTypeObject ZstdCompressionDictType; | |
|
51 | ||
|
52 | typedef struct { | |
|
53 | PyObject_HEAD | |
|
54 | ||
|
55 | int compressionLevel; | |
|
56 | ZstdCompressionDict* dict; | |
|
57 | ZSTD_CDict* cdict; | |
|
58 | CompressionParametersObject* cparams; | |
|
59 | ZSTD_frameParameters fparams; | |
|
60 | } ZstdCompressor; | |
|
61 | ||
|
62 | extern PyTypeObject ZstdCompressorType; | |
|
63 | ||
|
64 | typedef struct { | |
|
65 | PyObject_HEAD | |
|
66 | ||
|
67 | ZstdCompressor* compressor; | |
|
68 | ZSTD_CStream* cstream; | |
|
69 | ZSTD_outBuffer output; | |
|
70 | int flushed; | |
|
71 | } ZstdCompressionObj; | |
|
72 | ||
|
73 | extern PyTypeObject ZstdCompressionObjType; | |
|
74 | ||
|
75 | typedef struct { | |
|
76 | PyObject_HEAD | |
|
77 | ||
|
78 | ZstdCompressor* compressor; | |
|
79 | PyObject* writer; | |
|
80 | Py_ssize_t sourceSize; | |
|
81 | size_t outSize; | |
|
82 | ZSTD_CStream* cstream; | |
|
83 | int entered; | |
|
84 | } ZstdCompressionWriter; | |
|
85 | ||
|
86 | extern PyTypeObject ZstdCompressionWriterType; | |
|
87 | ||
|
88 | typedef struct { | |
|
89 | PyObject_HEAD | |
|
90 | ||
|
91 | ZstdCompressor* compressor; | |
|
92 | PyObject* reader; | |
|
93 | Py_buffer* buffer; | |
|
94 | Py_ssize_t bufferOffset; | |
|
95 | Py_ssize_t sourceSize; | |
|
96 | size_t inSize; | |
|
97 | size_t outSize; | |
|
98 | ||
|
99 | ZSTD_CStream* cstream; | |
|
100 | ZSTD_inBuffer input; | |
|
101 | ZSTD_outBuffer output; | |
|
102 | int finishedOutput; | |
|
103 | int finishedInput; | |
|
104 | PyObject* readResult; | |
|
105 | } ZstdCompressorIterator; | |
|
106 | ||
|
107 | extern PyTypeObject ZstdCompressorIteratorType; | |
|
108 | ||
|
109 | typedef struct { | |
|
110 | PyObject_HEAD | |
|
111 | ||
|
112 | ZSTD_DCtx* refdctx; | |
|
113 | ||
|
114 | ZstdCompressionDict* dict; | |
|
115 | ZSTD_DDict* ddict; | |
|
116 | } ZstdDecompressor; | |
|
117 | ||
|
118 | extern PyTypeObject ZstdDecompressorType; | |
|
119 | ||
|
120 | typedef struct { | |
|
121 | PyObject_HEAD | |
|
122 | ||
|
123 | ZstdDecompressor* decompressor; | |
|
124 | ZSTD_DStream* dstream; | |
|
125 | int finished; | |
|
126 | } ZstdDecompressionObj; | |
|
127 | ||
|
128 | extern PyTypeObject ZstdDecompressionObjType; | |
|
129 | ||
|
130 | typedef struct { | |
|
131 | PyObject_HEAD | |
|
132 | ||
|
133 | ZstdDecompressor* decompressor; | |
|
134 | PyObject* writer; | |
|
135 | size_t outSize; | |
|
136 | ZSTD_DStream* dstream; | |
|
137 | int entered; | |
|
138 | } ZstdDecompressionWriter; | |
|
139 | ||
|
140 | extern PyTypeObject ZstdDecompressionWriterType; | |
|
141 | ||
|
142 | typedef struct { | |
|
143 | PyObject_HEAD | |
|
144 | ||
|
145 | ZstdDecompressor* decompressor; | |
|
146 | PyObject* reader; | |
|
147 | Py_buffer* buffer; | |
|
148 | Py_ssize_t bufferOffset; | |
|
149 | size_t inSize; | |
|
150 | size_t outSize; | |
|
151 | size_t skipBytes; | |
|
152 | ZSTD_DStream* dstream; | |
|
153 | ZSTD_inBuffer input; | |
|
154 | ZSTD_outBuffer output; | |
|
155 | Py_ssize_t readCount; | |
|
156 | int finishedInput; | |
|
157 | int finishedOutput; | |
|
158 | } ZstdDecompressorIterator; | |
|
159 | ||
|
160 | extern PyTypeObject ZstdDecompressorIteratorType; | |
|
161 | ||
|
162 | typedef struct { | |
|
163 | int errored; | |
|
164 | PyObject* chunk; | |
|
165 | } DecompressorIteratorResult; | |
|
166 | ||
|
167 | void ztopy_compression_parameters(CompressionParametersObject* params, ZSTD_compressionParameters* zparams); | |
|
168 | CompressionParametersObject* get_compression_parameters(PyObject* self, PyObject* args); | |
|
169 | PyObject* estimate_compression_context_size(PyObject* self, PyObject* args); | |
|
170 | ZSTD_CStream* CStream_from_ZstdCompressor(ZstdCompressor* compressor, Py_ssize_t sourceSize); | |
|
171 | ZSTD_DStream* DStream_from_ZstdDecompressor(ZstdDecompressor* decompressor); | |
|
172 | ZstdCompressionDict* train_dictionary(PyObject* self, PyObject* args, PyObject* kwargs); |
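The `DecompressorIteratorResult` struct declared above carries two outcomes from `read_decompressor_iterator`: `errored` flags a failure, `chunk` carries output, and both being clear means "no output yet, read more input". A small sketch of that three-way contract (names here are illustrative, not the C API):

```python
from collections import namedtuple

# Hypothetical analog of the C DecompressorIteratorResult struct.
IterResult = namedtuple('IterResult', ['errored', 'chunk'])

def interpret(result):
    """Three outcomes: raise on error, return a chunk if produced,
    or return None to tell the caller to feed more input."""
    if result.errored:
        raise RuntimeError('decompression error')
    if result.chunk is not None:
        return result.chunk
    return None
```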
@@ -0,0 +1,110 b'' | |||
|
1 | # Copyright (c) 2016-present, Gregory Szorc | |
|
2 | # All rights reserved. | |
|
3 | # | |
|
4 | # This software may be modified and distributed under the terms | |
|
5 | # of the BSD license. See the LICENSE file for details. | |
|
6 | ||
|
7 | from __future__ import absolute_import | |
|
8 | ||
|
9 | import cffi | |
|
10 | import os | |
|
11 | ||
|
12 | ||
|
13 | HERE = os.path.abspath(os.path.dirname(__file__)) | |
|
14 | ||
|
15 | SOURCES = ['zstd/%s' % p for p in ( | |
|
16 | 'common/entropy_common.c', | |
|
17 | 'common/error_private.c', | |
|
18 | 'common/fse_decompress.c', | |
|
19 | 'common/xxhash.c', | |
|
20 | 'common/zstd_common.c', | |
|
21 | 'compress/fse_compress.c', | |
|
22 | 'compress/huf_compress.c', | |
|
23 | 'compress/zbuff_compress.c', | |
|
24 | 'compress/zstd_compress.c', | |
|
25 | 'decompress/huf_decompress.c', | |
|
26 | 'decompress/zbuff_decompress.c', | |
|
27 | 'decompress/zstd_decompress.c', | |
|
28 | 'dictBuilder/divsufsort.c', | |
|
29 | 'dictBuilder/zdict.c', | |
|
30 | )] | |
|
31 | ||
|
32 | INCLUDE_DIRS = [os.path.join(HERE, d) for d in ( | |
|
33 | 'zstd', | |
|
34 | 'zstd/common', | |
|
35 | 'zstd/compress', | |
|
36 | 'zstd/decompress', | |
|
37 | 'zstd/dictBuilder', | |
|
38 | )] | |
|
39 | ||
|
40 | with open(os.path.join(HERE, 'zstd', 'zstd.h'), 'rb') as fh: | |
|
41 | zstd_h = fh.read() | |
|
42 | ||
|
43 | ffi = cffi.FFI() | |
|
44 | ffi.set_source('_zstd_cffi', ''' | |
|
45 | /* needed for typedefs like U32 references in zstd.h */ | |
|
46 | #include "mem.h" | |
|
47 | #define ZSTD_STATIC_LINKING_ONLY | |
|
48 | #include "zstd.h" | |
|
49 | ''', | |
|
50 | sources=SOURCES, include_dirs=INCLUDE_DIRS) | |
|
51 | ||
|
52 | # Rather than define the API definitions from zstd.h inline, munge the | |
|
53 | # source in a way that cdef() will accept. | |
|
54 | lines = zstd_h.splitlines() | |
|
55 | lines = [l.rstrip() for l in lines if l.strip()] | |
|
56 | ||
|
57 | # Strip preprocessor directives - they aren't important for our needs. | |
|
58 | lines = [l for l in lines | |
|
59 | if not l.startswith((b'#if', b'#else', b'#endif', b'#include'))] | |
|
60 | ||
|
61 | # Remove extern C block | |
|
62 | lines = [l for l in lines if l not in (b'extern "C" {', b'}')] | |
|
63 | ||
|
64 | # The version #defines don't parse and aren't necessary. Strip them. | |
|
65 | lines = [l for l in lines if not l.startswith(( | |
|
66 | b'#define ZSTD_H_235446', | |
|
67 | b'#define ZSTD_LIB_VERSION', | |
|
68 | b'#define ZSTD_QUOTE', | |
|
69 | b'#define ZSTD_EXPAND_AND_QUOTE', | |
|
70 | b'#define ZSTD_VERSION_STRING', | |
|
71 | b'#define ZSTD_VERSION_NUMBER'))] | |
|
72 | ||
|
73 | # The C parser also doesn't like some constant defines referencing | |
|
74 | # other constants. | |
|
75 | # TODO we pick the 64-bit constants here. We should assert somewhere | |
|
76 | # we're compiling for 64-bit. | |
|
77 | def fix_constants(l): | |
|
78 | if l.startswith(b'#define ZSTD_WINDOWLOG_MAX '): | |
|
79 | return b'#define ZSTD_WINDOWLOG_MAX 27' | |
|
80 | elif l.startswith(b'#define ZSTD_CHAINLOG_MAX '): | |
|
81 | return b'#define ZSTD_CHAINLOG_MAX 28' | |
|
82 | elif l.startswith(b'#define ZSTD_HASHLOG_MAX '): | |
|
83 | return b'#define ZSTD_HASHLOG_MAX 27' | |
|
84 | elif l.startswith(b'#define ZSTD_CHAINLOG_MAX '): | |
|
85 | return b'#define ZSTD_CHAINLOG_MAX 28' | |
|
86 | elif l.startswith(b'#define ZSTD_CHAINLOG_MIN '): | |
|
87 | return b'#define ZSTD_CHAINLOG_MIN 6' | |
|
88 | elif l.startswith(b'#define ZSTD_SEARCHLOG_MAX '): | |
|
89 | return b'#define ZSTD_SEARCHLOG_MAX 26' | |
|
90 | elif l.startswith(b'#define ZSTD_BLOCKSIZE_ABSOLUTEMAX '): | |
|
91 | return b'#define ZSTD_BLOCKSIZE_ABSOLUTEMAX 131072' | |
|
92 | else: | |
|
93 | return l | |
|
94 | lines = map(fix_constants, lines) | |
|
95 | ||
|
96 | # ZSTDLIB_API isn't handled correctly. Strip it. | |
|
97 | lines = [l for l in lines if not l.startswith(b'# define ZSTDLIB_API')] | |
|
98 | def strip_api(l): | |
|
99 | if l.startswith(b'ZSTDLIB_API '): | |
|
100 | return l[len(b'ZSTDLIB_API '):] | |
|
101 | else: | |
|
102 | return l | |
|
103 | lines = map(strip_api, lines) | |
|
104 | ||
|
105 | source = b'\n'.join(lines) | |
|
106 | ffi.cdef(source.decode('latin1')) | |
|
107 | ||
|
108 | ||
|
109 | if __name__ == '__main__': | |
|
110 | ffi.compile() |
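The munging steps above (drop preprocessor lines, strip the `extern "C"` block, remove the `ZSTDLIB_API` prefix) can be seen end to end on a tiny sample. The sample header text here is made up for illustration; `ZSTD_compressBound` is a real zstd declaration shape:

```python
sample = b'''#include <stddef.h>
extern "C" {
ZSTDLIB_API size_t ZSTD_compressBound(size_t srcSize);
}
'''

# Same pipeline as the script above, condensed.
lines = [l.rstrip() for l in sample.splitlines() if l.strip()]
lines = [l for l in lines
         if not l.startswith((b'#if', b'#else', b'#endif', b'#include'))]
lines = [l for l in lines if l not in (b'extern "C" {', b'}')]
lines = [l[len(b'ZSTDLIB_API '):] if l.startswith(b'ZSTDLIB_API ') else l
         for l in lines]
cdef_source = b'\n'.join(lines)
```

What survives is exactly the kind of plain C declaration `ffi.cdef()` can parse.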
@@ -0,0 +1,62 b'' | |||
|
1 | #!/usr/bin/env python | |
|
2 | # Copyright (c) 2016-present, Gregory Szorc | |
|
3 | # All rights reserved. | |
|
4 | # | |
|
5 | # This software may be modified and distributed under the terms | |
|
6 | # of the BSD license. See the LICENSE file for details. | |
|
7 | ||
|
8 | from setuptools import setup | |
|
9 | ||
|
10 | try: | |
|
11 | import cffi | |
|
12 | except ImportError: | |
|
13 | cffi = None | |
|
14 | ||
|
15 | import setup_zstd | |
|
16 | ||
|
17 | # Code for obtaining the Extension instance is in its own module to | |
|
18 | # facilitate reuse in other projects. | |
|
19 | extensions = [setup_zstd.get_c_extension()] | |
|
20 | ||
|
21 | if cffi: | |
|
22 | import make_cffi | |
|
23 | extensions.append(make_cffi.ffi.distutils_extension()) | |
|
24 | ||
|
25 | version = None | |
|
26 | ||
|
27 | with open('c-ext/python-zstandard.h', 'r') as fh: | |
|
28 | for line in fh: | |
|
29 | if not line.startswith('#define PYTHON_ZSTANDARD_VERSION'): | |
|
30 | continue | |
|
31 | ||
|
32 | version = line.split()[2][1:-1] | |
|
33 | break | |
|
34 | ||
|
35 | if not version: | |
|
36 | raise Exception('could not resolve package version; ' | |
|
37 | 'this should never happen') | |
|
38 | ||
|
39 | setup( | |
|
40 | name='zstandard', | |
|
41 | version=version, | |
|
42 | description='Zstandard bindings for Python', | |
|
43 | long_description=open('README.rst', 'r').read(), | |
|
44 | url='https://github.com/indygreg/python-zstandard', | |
|
45 | author='Gregory Szorc', | |
|
46 | author_email='gregory.szorc@gmail.com', | |
|
47 | license='BSD', | |
|
48 | classifiers=[ | |
|
49 | 'Development Status :: 4 - Beta', | |
|
50 | 'Intended Audience :: Developers', | |
|
51 | 'License :: OSI Approved :: BSD License', | |
|
52 | 'Programming Language :: C', | |
|
53 | 'Programming Language :: Python :: 2.6', | |
|
54 | 'Programming Language :: Python :: 2.7', | |
|
55 | 'Programming Language :: Python :: 3.3', | |
|
56 | 'Programming Language :: Python :: 3.4', | |
|
57 | 'Programming Language :: Python :: 3.5', | |
|
58 | ], | |
|
59 | keywords='zstandard zstd compression', | |
|
60 | ext_modules=extensions, | |
|
61 | test_suite='tests', | |
|
62 | ) |
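The version-scraping loop in `setup.py` above relies on the `#define` line having the version as its third whitespace-separated token, wrapped in quotes. The same logic as a standalone function (the function name is illustrative):

```python
def parse_version(header_text):
    """Extract the quoted version string from a
    '#define PYTHON_ZSTANDARD_VERSION "x.y.z"' line, or None."""
    for line in header_text.splitlines():
        if line.startswith('#define PYTHON_ZSTANDARD_VERSION'):
            # Third token is the quoted string; strip the quotes.
            return line.split()[2][1:-1]
    return None
```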
@@ -0,0 +1,64 b'' | |||
|
1 | # Copyright (c) 2016-present, Gregory Szorc | |
|
2 | # All rights reserved. | |
|
3 | # | |
|
4 | # This software may be modified and distributed under the terms | |
|
5 | # of the BSD license. See the LICENSE file for details. | |
|
6 | ||
|
7 | import os | |
|
8 | from distutils.extension import Extension | |
|
9 | ||
|
10 | ||
|
11 | zstd_sources = ['zstd/%s' % p for p in ( | |
|
12 | 'common/entropy_common.c', | |
|
13 | 'common/error_private.c', | |
|
14 | 'common/fse_decompress.c', | |
|
15 | 'common/xxhash.c', | |
|
16 | 'common/zstd_common.c', | |
|
17 | 'compress/fse_compress.c', | |
|
18 | 'compress/huf_compress.c', | |
|
19 | 'compress/zbuff_compress.c', | |
|
20 | 'compress/zstd_compress.c', | |
|
21 | 'decompress/huf_decompress.c', | |
|
22 | 'decompress/zbuff_decompress.c', | |
|
23 | 'decompress/zstd_decompress.c', | |
|
24 | 'dictBuilder/divsufsort.c', | |
|
25 | 'dictBuilder/zdict.c', | |
|
26 | )] | |
|
27 | ||
|
28 | ||
|
29 | zstd_includes = [ | |
|
30 | 'c-ext', | |
|
31 | 'zstd', | |
|
32 | 'zstd/common', | |
|
33 | 'zstd/compress', | |
|
34 | 'zstd/decompress', | |
|
35 | 'zstd/dictBuilder', | |
|
36 | ] | |
|
37 | ||
|
38 | ext_sources = [ | |
|
39 | 'zstd.c', | |
|
40 | 'c-ext/compressiondict.c', | |
|
41 | 'c-ext/compressobj.c', | |
|
42 | 'c-ext/compressor.c', | |
|
43 | 'c-ext/compressoriterator.c', | |
|
44 | 'c-ext/compressionparams.c', | |
|
45 | 'c-ext/compressionwriter.c', | |
|
46 | 'c-ext/constants.c', | |
|
47 | 'c-ext/decompressobj.c', | |
|
48 | 'c-ext/decompressor.c', | |
|
49 | 'c-ext/decompressoriterator.c', | |
|
50 | 'c-ext/decompressionwriter.c', | |
|
51 | 'c-ext/dictparams.c', | |
|
52 | ] | |
|
53 | ||
|
54 | ||
|
55 | def get_c_extension(name='zstd'): | |
|
56 | """Obtain a distutils.extension.Extension for the C extension.""" | |
|
57 | root = os.path.abspath(os.path.dirname(__file__)) | |
|
58 | ||
|
59 | sources = [os.path.join(root, p) for p in zstd_sources + ext_sources] | |
|
60 | include_dirs = [os.path.join(root, d) for d in zstd_includes] | |
|
61 | ||
|
62 | # TODO compile with optimizations. | |
|
63 | return Extension(name, sources, | |
|
64 | include_dirs=include_dirs) |
|
1 | NO CONTENT: new file 100644 |
@@ -0,0 +1,15 b'' | |||
|
1 | import io | |
|
2 | ||
|
3 | class OpCountingBytesIO(io.BytesIO): | |
|
4 | def __init__(self, *args, **kwargs): | |
|
5 | self._read_count = 0 | |
|
6 | self._write_count = 0 | |
|
7 | super(OpCountingBytesIO, self).__init__(*args, **kwargs) | |
|
8 | ||
|
9 | def read(self, *args): | |
|
10 | self._read_count += 1 | |
|
11 | return super(OpCountingBytesIO, self).read(*args) | |
|
12 | ||
|
13 | def write(self, data): | |
|
14 | self._write_count += 1 | |
|
15 | return super(OpCountingBytesIO, self).write(data) |
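The helper above exists so tests can assert how many `read()`/`write()` calls a streaming API makes for given `read_size`/`write_size` values. A self-contained copy with a quick demonstration of what the counters capture:

```python
import io

class OpCountingBytesIO(io.BytesIO):
    """BytesIO that counts read() and write() calls."""

    def __init__(self, *args, **kwargs):
        self._read_count = 0
        self._write_count = 0
        super(OpCountingBytesIO, self).__init__(*args, **kwargs)

    def read(self, *args):
        self._read_count += 1
        return super(OpCountingBytesIO, self).read(*args)

    def write(self, data):
        self._write_count += 1
        return super(OpCountingBytesIO, self).write(data)

# Draining 6 bytes in 2-byte reads takes 3 data reads plus the
# final empty read that signals EOF: 4 calls total.
buf = OpCountingBytesIO(b'abcdef')
while buf.read(2):
    pass
```

This is exactly the pattern `test_read_write_size` uses: drain the stream, then compare the counters against the expected number of operations.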
@@ -0,0 +1,35 b'' | |||
|
1 | import io | |
|
2 | ||
|
3 | try: | |
|
4 | import unittest2 as unittest | |
|
5 | except ImportError: | |
|
6 | import unittest | |
|
7 | ||
|
8 | import zstd | |
|
9 | ||
|
10 | try: | |
|
11 | import zstd_cffi | |
|
12 | except ImportError: | |
|
13 | raise unittest.SkipTest('cffi version of zstd not available') | |
|
14 | ||
|
15 | ||
|
16 | class TestCFFIWriteToToCDecompressor(unittest.TestCase): | |
|
17 | def test_simple(self): | |
|
18 | orig = io.BytesIO() | |
|
19 | orig.write(b'foo') | |
|
20 | orig.write(b'bar') | |
|
21 | orig.write(b'foobar' * 16384) | |
|
22 | ||
|
23 | dest = io.BytesIO() | |
|
24 | cctx = zstd_cffi.ZstdCompressor() | |
|
25 | with cctx.write_to(dest) as compressor: | |
|
26 | compressor.write(orig.getvalue()) | |
|
27 | ||
|
28 | uncompressed = io.BytesIO() | |
|
29 | dctx = zstd.ZstdDecompressor() | |
|
30 | with dctx.write_to(uncompressed) as decompressor: | |
|
31 | decompressor.write(dest.getvalue()) | |
|
32 | ||
|
33 | self.assertEqual(uncompressed.getvalue(), orig.getvalue()) | |
|
34 | ||
|
35 |
@@ -0,0 +1,465 b'' | |||
|
1 | import hashlib | |
|
2 | import io | |
|
3 | import struct | |
|
4 | import sys | |
|
5 | ||
|
6 | try: | |
|
7 | import unittest2 as unittest | |
|
8 | except ImportError: | |
|
9 | import unittest | |
|
10 | ||
|
11 | import zstd | |
|
12 | ||
|
13 | from .common import OpCountingBytesIO | |
|
14 | ||
|
15 | ||
|
16 | if sys.version_info[0] >= 3: | |
|
17 | next = lambda it: it.__next__() | |
|
18 | else: | |
|
19 | next = lambda it: it.next() | |
|
20 | ||
|
21 | ||
|
22 | class TestCompressor(unittest.TestCase): | |
|
23 | def test_level_bounds(self): | |
|
24 | with self.assertRaises(ValueError): | |
|
25 | zstd.ZstdCompressor(level=0) | |
|
26 | ||
|
27 | with self.assertRaises(ValueError): | |
|
28 | zstd.ZstdCompressor(level=23) | |
|
29 | ||
|
30 | ||
|
31 | class TestCompressor_compress(unittest.TestCase): | |
|
32 | def test_compress_empty(self): | |
|
33 | cctx = zstd.ZstdCompressor(level=1) | |
|
34 | cctx.compress(b'') | |
|
35 | ||
|
36 | cctx = zstd.ZstdCompressor(level=22) | |
|
37 | cctx.compress(b'') | |
|
38 | ||
|
39 | def test_compress_empty_frame(self): | |
|
40 | cctx = zstd.ZstdCompressor(level=1) | |
|
41 | self.assertEqual(cctx.compress(b''), | |
|
42 | b'\x28\xb5\x2f\xfd\x00\x48\x01\x00\x00') | |
|
43 | ||
|
44 | def test_compress_large(self): | |
|
45 | chunks = [] | |
|
46 | for i in range(255): | |
|
47 | chunks.append(struct.Struct('>B').pack(i) * 16384) | |
|
48 | ||
|
49 | cctx = zstd.ZstdCompressor(level=3) | |
|
50 | result = cctx.compress(b''.join(chunks)) | |
|
51 | self.assertEqual(len(result), 999) | |
|
52 | self.assertEqual(result[0:4], b'\x28\xb5\x2f\xfd') | |
|
53 | ||
|
54 | def test_write_checksum(self): | |
|
55 | cctx = zstd.ZstdCompressor(level=1) | |
|
56 | no_checksum = cctx.compress(b'foobar') | |
|
57 | cctx = zstd.ZstdCompressor(level=1, write_checksum=True) | |
|
58 | with_checksum = cctx.compress(b'foobar') | |
|
59 | ||
|
60 | self.assertEqual(len(with_checksum), len(no_checksum) + 4) | |
|
61 | ||
|
62 | def test_write_content_size(self): | |
|
63 | cctx = zstd.ZstdCompressor(level=1) | |
|
64 | no_size = cctx.compress(b'foobar' * 256) | |
|
65 | cctx = zstd.ZstdCompressor(level=1, write_content_size=True) | |
|
66 | with_size = cctx.compress(b'foobar' * 256) | |
|
67 | ||
|
68 | self.assertEqual(len(with_size), len(no_size) + 1) | |
|
69 | ||
|
70 | def test_no_dict_id(self): | |
|
71 | samples = [] | |
|
72 | for i in range(128): | |
|
73 | samples.append(b'foo' * 64) | |
|
74 | samples.append(b'bar' * 64) | |
|
75 | samples.append(b'foobar' * 64) | |
|
76 | ||
|
77 | d = zstd.train_dictionary(1024, samples) | |
|
78 | ||
|
79 | cctx = zstd.ZstdCompressor(level=1, dict_data=d) | |
|
80 | with_dict_id = cctx.compress(b'foobarfoobar') | |
|
81 | ||
|
82 | cctx = zstd.ZstdCompressor(level=1, dict_data=d, write_dict_id=False) | |
|
83 | no_dict_id = cctx.compress(b'foobarfoobar') | |
|
84 | ||
|
85 | self.assertEqual(len(with_dict_id), len(no_dict_id) + 4) | |
|
86 | ||
|
87 | def test_compress_dict_multiple(self): | |
|
88 | samples = [] | |
|
89 | for i in range(128): | |
|
90 | samples.append(b'foo' * 64) | |
|
91 | samples.append(b'bar' * 64) | |
|
92 | samples.append(b'foobar' * 64) | |
|
93 | ||
|
94 | d = zstd.train_dictionary(8192, samples) | |
|
95 | ||
|
96 | cctx = zstd.ZstdCompressor(level=1, dict_data=d) | |
|
97 | ||
|
98 | for i in range(32): | |
|
99 | cctx.compress(b'foo bar foobar foo bar foobar') | |
|
100 | ||
|
101 | ||
|
102 | class TestCompressor_compressobj(unittest.TestCase): | |
|
103 | def test_compressobj_empty(self): | |
|
104 | cctx = zstd.ZstdCompressor(level=1) | |
|
105 | cobj = cctx.compressobj() | |
|
106 | self.assertEqual(cobj.compress(b''), b'') | |
|
107 | self.assertEqual(cobj.flush(), | |
|
108 | b'\x28\xb5\x2f\xfd\x00\x48\x01\x00\x00') | |
|
109 | ||
|
110 | def test_compressobj_large(self): | |
|
111 | chunks = [] | |
|
112 | for i in range(255): | |
|
113 | chunks.append(struct.Struct('>B').pack(i) * 16384) | |
|
114 | ||
|
115 | cctx = zstd.ZstdCompressor(level=3) | |
|
116 | cobj = cctx.compressobj() | |
|
117 | ||
|
118 | result = cobj.compress(b''.join(chunks)) + cobj.flush() | |
|
119 | self.assertEqual(len(result), 999) | |
|
120 | self.assertEqual(result[0:4], b'\x28\xb5\x2f\xfd') | |
|
121 | ||
|
122 | def test_write_checksum(self): | |
|
123 | cctx = zstd.ZstdCompressor(level=1) | |
|
124 | cobj = cctx.compressobj() | |
|
125 | no_checksum = cobj.compress(b'foobar') + cobj.flush() | |
|
126 | cctx = zstd.ZstdCompressor(level=1, write_checksum=True) | |
|
127 | cobj = cctx.compressobj() | |
|
128 | with_checksum = cobj.compress(b'foobar') + cobj.flush() | |
|
129 | ||
|
130 | self.assertEqual(len(with_checksum), len(no_checksum) + 4) | |
|
131 | ||
|
132 | def test_write_content_size(self): | |
|
133 | cctx = zstd.ZstdCompressor(level=1) | |
|
134 | cobj = cctx.compressobj(size=len(b'foobar' * 256)) | |
|
135 | no_size = cobj.compress(b'foobar' * 256) + cobj.flush() | |
|
136 | cctx = zstd.ZstdCompressor(level=1, write_content_size=True) | |
|
137 | cobj = cctx.compressobj(size=len(b'foobar' * 256)) | |
|
138 | with_size = cobj.compress(b'foobar' * 256) + cobj.flush() | |
|
139 | ||
|
140 | self.assertEqual(len(with_size), len(no_size) + 1) | |
|
141 | ||
|
142 | def test_compress_after_flush(self): | |
|
143 | cctx = zstd.ZstdCompressor() | |
|
144 | cobj = cctx.compressobj() | |
|
145 | ||
|
146 | cobj.compress(b'foo') | |
|
147 | cobj.flush() | |
|
148 | ||
|
149 | with self.assertRaisesRegexp(zstd.ZstdError, r'cannot call compress\(\) after flush'): | |
|
150 | cobj.compress(b'foo') | |
|
151 | ||
|
152 | with self.assertRaisesRegexp(zstd.ZstdError, r'flush\(\) already called'): | |
|
153 | cobj.flush() | |
|
154 | ||
|
155 | ||
|
156 | class TestCompressor_copy_stream(unittest.TestCase): | |
|
157 | def test_no_read(self): | |
|
158 | source = object() | |
|
159 | dest = io.BytesIO() | |
|
160 | ||
|
161 | cctx = zstd.ZstdCompressor() | |
|
162 | with self.assertRaises(ValueError): | |
|
163 | cctx.copy_stream(source, dest) | |
|
164 | ||
|
165 | def test_no_write(self): | |
|
166 | source = io.BytesIO() | |
|
167 | dest = object() | |
|
168 | ||
|
169 | cctx = zstd.ZstdCompressor() | |
|
170 | with self.assertRaises(ValueError): | |
|
171 | cctx.copy_stream(source, dest) | |
|
172 | ||
|
173 | def test_empty(self): | |
|
174 | source = io.BytesIO() | |
|
175 | dest = io.BytesIO() | |
|
176 | ||
|
177 | cctx = zstd.ZstdCompressor(level=1) | |
|
178 | r, w = cctx.copy_stream(source, dest) | |
|
179 | self.assertEqual(int(r), 0) | |
|
180 | self.assertEqual(w, 9) | |
|
181 | ||
|
182 | self.assertEqual(dest.getvalue(), | |
|
183 | b'\x28\xb5\x2f\xfd\x00\x48\x01\x00\x00') | |
|
184 | ||
|
185 | def test_large_data(self): | |
|
186 | source = io.BytesIO() | |
|
187 | for i in range(255): | |
|
188 | source.write(struct.Struct('>B').pack(i) * 16384) | |
|
189 | source.seek(0) | |
|
190 | ||
|
191 | dest = io.BytesIO() | |
|
192 | cctx = zstd.ZstdCompressor() | |
|
193 | r, w = cctx.copy_stream(source, dest) | |
|
194 | ||
|
195 | self.assertEqual(r, 255 * 16384) | |
|
196 | self.assertEqual(w, 999) | |
|
197 | ||
|
198 | def test_write_checksum(self): | |
|
199 | source = io.BytesIO(b'foobar') | |
|
200 | no_checksum = io.BytesIO() | |
|
201 | ||
|
202 | cctx = zstd.ZstdCompressor(level=1) | |
|
203 | cctx.copy_stream(source, no_checksum) | |
|
204 | ||
|
205 | source.seek(0) | |
|
206 | with_checksum = io.BytesIO() | |
|
207 | cctx = zstd.ZstdCompressor(level=1, write_checksum=True) | |
|
208 | cctx.copy_stream(source, with_checksum) | |
|
209 | ||
|
210 | self.assertEqual(len(with_checksum.getvalue()), | |
|
211 | len(no_checksum.getvalue()) + 4) | |
|
212 | ||
|
213 | def test_write_content_size(self): | |
|
214 | source = io.BytesIO(b'foobar' * 256) | |
|
215 | no_size = io.BytesIO() | |
|
216 | ||
|
217 | cctx = zstd.ZstdCompressor(level=1) | |
|
218 | cctx.copy_stream(source, no_size) | |
|
219 | ||
|
220 | source.seek(0) | |
|
221 | with_size = io.BytesIO() | |
|
222 | cctx = zstd.ZstdCompressor(level=1, write_content_size=True) | |
|
223 | cctx.copy_stream(source, with_size) | |
|
224 | ||
|
225 | # Source content size is unknown, so no content size written. | |
|
226 | self.assertEqual(len(with_size.getvalue()), | |
|
227 | len(no_size.getvalue())) | |
|
228 | ||
|
229 | source.seek(0) | |
|
230 | with_size = io.BytesIO() | |
|
231 | cctx.copy_stream(source, with_size, size=len(source.getvalue())) | |
|
232 | ||
|
233 | # We specified source size, so content size header is present. | |
|
234 | self.assertEqual(len(with_size.getvalue()), | |
|
235 | len(no_size.getvalue()) + 1) | |
|
236 | ||
|
237 | def test_read_write_size(self): | |
|
238 | source = OpCountingBytesIO(b'foobarfoobar') | |
|
239 | dest = OpCountingBytesIO() | |
|
240 | cctx = zstd.ZstdCompressor() | |
|
241 | r, w = cctx.copy_stream(source, dest, read_size=1, write_size=1) | |
|
242 | ||
|
243 | self.assertEqual(r, len(source.getvalue())) | |
|
244 | self.assertEqual(w, 21) | |
|
245 | self.assertEqual(source._read_count, len(source.getvalue()) + 1) | |
|
246 | self.assertEqual(dest._write_count, len(dest.getvalue())) | |
|
247 | ||
|
248 | ||
|
249 | def compress(data, level): | |
|
250 | buffer = io.BytesIO() | |
|
251 | cctx = zstd.ZstdCompressor(level=level) | |
|
252 | with cctx.write_to(buffer) as compressor: | |
|
253 | compressor.write(data) | |
|
254 | return buffer.getvalue() | |
|
255 | ||
|
256 | ||
|
257 | class TestCompressor_write_to(unittest.TestCase): | |
|
258 | def test_empty(self): | |
|
259 | self.assertEqual(compress(b'', 1), | |
|
260 | b'\x28\xb5\x2f\xfd\x00\x48\x01\x00\x00') | |
|
261 | ||
|
262 | def test_multiple_compress(self): | |
|
263 | buffer = io.BytesIO() | |
|
264 | cctx = zstd.ZstdCompressor(level=5) | |
|
265 | with cctx.write_to(buffer) as compressor: | |
|
266 | compressor.write(b'foo') | |
|
267 | compressor.write(b'bar') | |
|
268 | compressor.write(b'x' * 8192) | |
|
269 | ||
|
270 | result = buffer.getvalue() | |
|
271 | self.assertEqual(result, | |
|
272 | b'\x28\xb5\x2f\xfd\x00\x50\x75\x00\x00\x38\x66\x6f' | |
|
273 | b'\x6f\x62\x61\x72\x78\x01\x00\xfc\xdf\x03\x23') | |
|
274 | ||
|
275 | def test_dictionary(self): | |
|
276 | samples = [] | |
|
277 | for i in range(128): | |
|
278 | samples.append(b'foo' * 64) | |
|
279 | samples.append(b'bar' * 64) | |
|
280 | samples.append(b'foobar' * 64) | |
|
281 | ||
|
282 | d = zstd.train_dictionary(8192, samples) | |
|
283 | ||
|
284 | buffer = io.BytesIO() | |
|
285 | cctx = zstd.ZstdCompressor(level=9, dict_data=d) | |
|
286 | with cctx.write_to(buffer) as compressor: | |
|
287 | compressor.write(b'foo') | |
|
288 | compressor.write(b'bar') | |
|
289 | compressor.write(b'foo' * 16384) | |
|
290 | ||
|
291 | compressed = buffer.getvalue() | |
|
292 | h = hashlib.sha1(compressed).hexdigest() | |
|
293 | self.assertEqual(h, '1c5bcd25181bcd8c1a73ea8773323e0056129f92') | |
|
294 | ||
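The `foo`/`bar`/`foobar` sample-building loop above is repeated in several tests in this series. As a stdlib-only sketch (the helper name `make_samples` is an illustration, not part of the upstream test suite), the loop factors out to:

```python
def make_samples(n=128):
    """Build the repetitive dictionary-training corpus used by these tests."""
    samples = []
    for _ in range(n):
        # Each iteration contributes three highly redundant samples.
        samples.extend([b'foo' * 64, b'bar' * 64, b'foobar' * 64])
    return samples

samples = make_samples()
assert len(samples) == 384  # 128 iterations * 3 samples each
```

The redundancy is deliberate: dictionary training only produces a useful dictionary when the samples share common substrings.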
|
295 | def test_compression_params(self): | |
|
296 | params = zstd.CompressionParameters(20, 6, 12, 5, 4, 10, zstd.STRATEGY_FAST) | |
|
297 | ||
|
298 | buffer = io.BytesIO() | |
|
299 | cctx = zstd.ZstdCompressor(compression_params=params) | |
|
300 | with cctx.write_to(buffer) as compressor: | |
|
301 | compressor.write(b'foo') | |
|
302 | compressor.write(b'bar') | |
|
303 | compressor.write(b'foobar' * 16384) | |
|
304 | ||
|
305 | compressed = buffer.getvalue() | |
|
306 | h = hashlib.sha1(compressed).hexdigest() | |
|
307 | self.assertEqual(h, '1ae31f270ed7de14235221a604b31ecd517ebd99') | |
|
308 | ||
|
309 | def test_write_checksum(self): | |
|
310 | no_checksum = io.BytesIO() | |
|
311 | cctx = zstd.ZstdCompressor(level=1) | |
|
312 | with cctx.write_to(no_checksum) as compressor: | |
|
313 | compressor.write(b'foobar') | |
|
314 | ||
|
315 | with_checksum = io.BytesIO() | |
|
316 | cctx = zstd.ZstdCompressor(level=1, write_checksum=True) | |
|
317 | with cctx.write_to(with_checksum) as compressor: | |
|
318 | compressor.write(b'foobar') | |
|
319 | ||
|
320 | self.assertEqual(len(with_checksum.getvalue()), | |
|
321 | len(no_checksum.getvalue()) + 4) | |
|
322 | ||
|
323 | def test_write_content_size(self): | |
|
324 | no_size = io.BytesIO() | |
|
325 | cctx = zstd.ZstdCompressor(level=1) | |
|
326 | with cctx.write_to(no_size) as compressor: | |
|
327 | compressor.write(b'foobar' * 256) | |
|
328 | ||
|
329 | with_size = io.BytesIO() | |
|
330 | cctx = zstd.ZstdCompressor(level=1, write_content_size=True) | |
|
331 | with cctx.write_to(with_size) as compressor: | |
|
332 | compressor.write(b'foobar' * 256) | |
|
333 | ||
|
334 | # Source size is not known in streaming mode, so header not | |
|
335 | # written. | |
|
336 | self.assertEqual(len(with_size.getvalue()), | |
|
337 | len(no_size.getvalue())) | |
|
338 | ||
|
339 | # Declaring size will write the header. | |
|
340 | with_size = io.BytesIO() | |
|
341 | with cctx.write_to(with_size, size=len(b'foobar' * 256)) as compressor: | |
|
342 | compressor.write(b'foobar' * 256) | |
|
343 | ||
|
344 | self.assertEqual(len(with_size.getvalue()), | |
|
345 | len(no_size.getvalue()) + 1) | |
|
346 | ||
|
347 | def test_no_dict_id(self): | |
|
348 | samples = [] | |
|
349 | for i in range(128): | |
|
350 | samples.append(b'foo' * 64) | |
|
351 | samples.append(b'bar' * 64) | |
|
352 | samples.append(b'foobar' * 64) | |
|
353 | ||
|
354 | d = zstd.train_dictionary(1024, samples) | |
|
355 | ||
|
356 | with_dict_id = io.BytesIO() | |
|
357 | cctx = zstd.ZstdCompressor(level=1, dict_data=d) | |
|
358 | with cctx.write_to(with_dict_id) as compressor: | |
|
359 | compressor.write(b'foobarfoobar') | |
|
360 | ||
|
361 | cctx = zstd.ZstdCompressor(level=1, dict_data=d, write_dict_id=False) | |
|
362 | no_dict_id = io.BytesIO() | |
|
363 | with cctx.write_to(no_dict_id) as compressor: | |
|
364 | compressor.write(b'foobarfoobar') | |
|
365 | ||
|
366 | self.assertEqual(len(with_dict_id.getvalue()), | |
|
367 | len(no_dict_id.getvalue()) + 4) | |
|
368 | ||
|
369 | def test_memory_size(self): | |
|
370 | cctx = zstd.ZstdCompressor(level=3) | |
|
371 | buffer = io.BytesIO() | |
|
372 | with cctx.write_to(buffer) as compressor: | |
|
373 | size = compressor.memory_size() | |
|
374 | ||
|
375 | self.assertGreater(size, 100000) | |
|
376 | ||
|
377 | def test_write_size(self): | |
|
378 | cctx = zstd.ZstdCompressor(level=3) | |
|
379 | dest = OpCountingBytesIO() | |
|
380 | with cctx.write_to(dest, write_size=1) as compressor: | |
|
381 | compressor.write(b'foo') | |
|
382 | compressor.write(b'bar') | |
|
383 | compressor.write(b'foobar') | |
|
384 | ||
|
385 | self.assertEqual(len(dest.getvalue()), dest._write_count) | |
|
386 | ||
|
387 | ||
|
388 | class TestCompressor_read_from(unittest.TestCase): | |
|
389 | def test_type_validation(self): | |
|
390 | cctx = zstd.ZstdCompressor() | |
|
391 | ||
|
392 | # Object with read() works. | |
|
393 | cctx.read_from(io.BytesIO()) | |
|
394 | ||
|
395 | # Buffer protocol works. | |
|
396 | cctx.read_from(b'foobar') | |
|
397 | ||
|
398 | with self.assertRaisesRegexp(ValueError, 'must pass an object with a read'): | |
|
399 | cctx.read_from(True) | |
|
400 | ||
|
401 | def test_read_empty(self): | |
|
402 | cctx = zstd.ZstdCompressor(level=1) | |
|
403 | ||
|
404 | source = io.BytesIO() | |
|
405 | it = cctx.read_from(source) | |
|
406 | chunks = list(it) | |
|
407 | self.assertEqual(len(chunks), 1) | |
|
408 | compressed = b''.join(chunks) | |
|
409 | self.assertEqual(compressed, b'\x28\xb5\x2f\xfd\x00\x48\x01\x00\x00') | |
|
410 | ||
|
411 | # And again with the buffer protocol. | |
|
412 | it = cctx.read_from(b'') | |
|
413 | chunks = list(it) | |
|
414 | self.assertEqual(len(chunks), 1) | |
|
415 | compressed2 = b''.join(chunks) | |
|
416 | self.assertEqual(compressed2, compressed) | |
|
417 | ||
|
418 | def test_read_large(self): | |
|
419 | cctx = zstd.ZstdCompressor(level=1) | |
|
420 | ||
|
421 | source = io.BytesIO() | |
|
422 | source.write(b'f' * zstd.COMPRESSION_RECOMMENDED_INPUT_SIZE) | |
|
423 | source.write(b'o') | |
|
424 | source.seek(0) | |
|
425 | ||
|
426 | # Creating an iterator should not perform any compression until | |
|
427 | # first read. | |
|
428 | it = cctx.read_from(source, size=len(source.getvalue())) | |
|
429 | self.assertEqual(source.tell(), 0) | |
|
430 | ||
|
431 | # We should have exactly 2 output chunks. | |
|
432 | chunks = [] | |
|
433 | chunk = next(it) | |
|
434 | self.assertIsNotNone(chunk) | |
|
435 | self.assertEqual(source.tell(), zstd.COMPRESSION_RECOMMENDED_INPUT_SIZE) | |
|
436 | chunks.append(chunk) | |
|
437 | chunk = next(it) | |
|
438 | self.assertIsNotNone(chunk) | |
|
439 | chunks.append(chunk) | |
|
440 | ||
|
441 | self.assertEqual(source.tell(), len(source.getvalue())) | |
|
442 | ||
|
443 | with self.assertRaises(StopIteration): | |
|
444 | next(it) | |
|
445 | ||
|
446 | # And again for good measure. | |
|
447 | with self.assertRaises(StopIteration): | |
|
448 | next(it) | |
|
449 | ||
|
450 | # We should get the same output as the one-shot compression mechanism. | |
|
451 | self.assertEqual(b''.join(chunks), cctx.compress(source.getvalue())) | |
|
452 | ||
|
453 | # Now check the buffer protocol. | |
|
454 | it = cctx.read_from(source.getvalue()) | |
|
455 | chunks = list(it) | |
|
456 | self.assertEqual(len(chunks), 2) | |
|
457 | self.assertEqual(b''.join(chunks), cctx.compress(source.getvalue())) | |
|
458 | ||
|
459 | def test_read_write_size(self): | |
|
460 | source = OpCountingBytesIO(b'foobarfoobar') | |
|
461 | cctx = zstd.ZstdCompressor(level=3) | |
|
462 | for chunk in cctx.read_from(source, read_size=1, write_size=1): | |
|
463 | self.assertEqual(len(chunk), 1) | |
|
464 | ||
|
465 | self.assertEqual(source._read_count, len(source.getvalue()) + 1) |
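The `_read_count == len(source.getvalue()) + 1` assertions in this file reflect that a read loop must issue one final `read()` returning `b''` to detect EOF. A stdlib-only sketch of such a loop (no zstd bindings required; `read_chunks` is an illustrative name, not an API from the library):

```python
import io

def read_chunks(reader, read_size=1):
    """Drain ``reader`` in ``read_size`` pieces; count every read() call."""
    reads = 0
    chunks = []
    while True:
        chunk = reader.read(read_size)
        reads += 1
        if not chunk:
            break  # the empty read is what signals EOF
        chunks.append(chunk)
    return b''.join(chunks), reads

joined, reads = read_chunks(io.BytesIO(b'foobar'), read_size=1)
assert joined == b'foobar'
assert reads == len(b'foobar') + 1  # one extra read returns b'' at EOF
```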
@@ -0,0 +1,107 b'' | |||
|
1 | import io | |
|
2 | ||
|
3 | try: | |
|
4 | import unittest2 as unittest | |
|
5 | except ImportError: | |
|
6 | import unittest | |
|
7 | ||
|
8 | try: | |
|
9 | import hypothesis | |
|
10 | import hypothesis.strategies as strategies | |
|
11 | except ImportError: | |
|
12 | hypothesis = None | |
|
13 | ||
|
14 | import zstd | |
|
15 | ||
|
16 | class TestCompressionParameters(unittest.TestCase): | |
|
17 | def test_init_bad_arg_type(self): | |
|
18 | with self.assertRaises(TypeError): | |
|
19 | zstd.CompressionParameters() | |
|
20 | ||
|
21 | with self.assertRaises(TypeError): | |
|
22 | zstd.CompressionParameters(0, 1) | |
|
23 | ||
|
24 | def test_bounds(self): | |
|
25 | zstd.CompressionParameters(zstd.WINDOWLOG_MIN, | |
|
26 | zstd.CHAINLOG_MIN, | |
|
27 | zstd.HASHLOG_MIN, | |
|
28 | zstd.SEARCHLOG_MIN, | |
|
29 | zstd.SEARCHLENGTH_MIN, | |
|
30 | zstd.TARGETLENGTH_MIN, | |
|
31 | zstd.STRATEGY_FAST) | |
|
32 | ||
|
33 | zstd.CompressionParameters(zstd.WINDOWLOG_MAX, | |
|
34 | zstd.CHAINLOG_MAX, | |
|
35 | zstd.HASHLOG_MAX, | |
|
36 | zstd.SEARCHLOG_MAX, | |
|
37 | zstd.SEARCHLENGTH_MAX, | |
|
38 | zstd.TARGETLENGTH_MAX, | |
|
39 | zstd.STRATEGY_BTOPT) | |
|
40 | ||
|
41 | def test_get_compression_parameters(self): | |
|
42 | p = zstd.get_compression_parameters(1) | |
|
43 | self.assertIsInstance(p, zstd.CompressionParameters) | |
|
44 | ||
|
45 | self.assertEqual(p[0], 19) | |
|
46 | ||
|
47 | if hypothesis: | |
|
48 | s_windowlog = strategies.integers(min_value=zstd.WINDOWLOG_MIN, | |
|
49 | max_value=zstd.WINDOWLOG_MAX) | |
|
50 | s_chainlog = strategies.integers(min_value=zstd.CHAINLOG_MIN, | |
|
51 | max_value=zstd.CHAINLOG_MAX) | |
|
52 | s_hashlog = strategies.integers(min_value=zstd.HASHLOG_MIN, | |
|
53 | max_value=zstd.HASHLOG_MAX) | |
|
54 | s_searchlog = strategies.integers(min_value=zstd.SEARCHLOG_MIN, | |
|
55 | max_value=zstd.SEARCHLOG_MAX) | |
|
56 | s_searchlength = strategies.integers(min_value=zstd.SEARCHLENGTH_MIN, | |
|
57 | max_value=zstd.SEARCHLENGTH_MAX) | |
|
58 | s_targetlength = strategies.integers(min_value=zstd.TARGETLENGTH_MIN, | |
|
59 | max_value=zstd.TARGETLENGTH_MAX) | |
|
60 | s_strategy = strategies.sampled_from((zstd.STRATEGY_FAST, | |
|
61 | zstd.STRATEGY_DFAST, | |
|
62 | zstd.STRATEGY_GREEDY, | |
|
63 | zstd.STRATEGY_LAZY, | |
|
64 | zstd.STRATEGY_LAZY2, | |
|
65 | zstd.STRATEGY_BTLAZY2, | |
|
66 | zstd.STRATEGY_BTOPT)) | |
|
67 | ||
|
68 | class TestCompressionParametersHypothesis(unittest.TestCase): | |
|
69 | @hypothesis.given(s_windowlog, s_chainlog, s_hashlog, s_searchlog, | |
|
70 | s_searchlength, s_targetlength, s_strategy) | |
|
71 | def test_valid_init(self, windowlog, chainlog, hashlog, searchlog, | |
|
72 | searchlength, targetlength, strategy): | |
|
73 | p = zstd.CompressionParameters(windowlog, chainlog, hashlog, | |
|
74 | searchlog, searchlength, | |
|
75 | targetlength, strategy) | |
|
76 | self.assertEqual(tuple(p), | |
|
77 | (windowlog, chainlog, hashlog, searchlog, | |
|
78 | searchlength, targetlength, strategy)) | |
|
79 | ||
|
80 | # Verify we can instantiate a compressor with the supplied values. | |
|
81 | # ZSTD_checkCParams moves the goal posts on us from what's advertised | |
|
82 | # in the constants. So move along with them. | |
|
83 | if searchlength == zstd.SEARCHLENGTH_MIN and strategy in (zstd.STRATEGY_FAST, zstd.STRATEGY_GREEDY): | |
|
84 | searchlength += 1 | |
|
85 | p = zstd.CompressionParameters(windowlog, chainlog, hashlog, | |
|
86 | searchlog, searchlength, | |
|
87 | targetlength, strategy) | |
|
88 | elif searchlength == zstd.SEARCHLENGTH_MAX and strategy != zstd.STRATEGY_FAST: | |
|
89 | searchlength -= 1 | |
|
90 | p = zstd.CompressionParameters(windowlog, chainlog, hashlog, | |
|
91 | searchlog, searchlength, | |
|
92 | targetlength, strategy) | |
|
93 | ||
|
94 | cctx = zstd.ZstdCompressor(compression_params=p) | |
|
95 | with cctx.write_to(io.BytesIO()): | |
|
96 | pass | |
|
97 | ||
|
98 | @hypothesis.given(s_windowlog, s_chainlog, s_hashlog, s_searchlog, | |
|
99 | s_searchlength, s_targetlength, s_strategy) | |
|
100 | def test_estimate_compression_context_size(self, windowlog, chainlog, | |
|
101 | hashlog, searchlog, | |
|
102 | searchlength, targetlength, | |
|
103 | strategy): | |
|
104 | p = zstd.CompressionParameters(windowlog, chainlog, hashlog, | |
|
105 | searchlog, searchlength, | |
|
106 | targetlength, strategy) | |
|
107 | size = zstd.estimate_compression_context_size(p) |
@@ -0,0 +1,478 b'' | |||
|
1 | import io | |
|
2 | import random | |
|
3 | import struct | |
|
4 | import sys | |
|
5 | ||
|
6 | try: | |
|
7 | import unittest2 as unittest | |
|
8 | except ImportError: | |
|
9 | import unittest | |
|
10 | ||
|
11 | import zstd | |
|
12 | ||
|
13 | from .common import OpCountingBytesIO | |
|
14 | ||
|
15 | ||
|
16 | if sys.version_info[0] >= 3: | |
|
17 | next = lambda it: it.__next__() | |
|
18 | else: | |
|
19 | next = lambda it: it.next() | |
|
20 | ||
|
21 | ||
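The `OpCountingBytesIO` helper imported from `.common` above is used throughout to assert how many I/O operations occur. A minimal stdlib sketch of what such a helper could look like (the class name comes from the import; the body here is an assumption, and the upstream implementation may differ):

```python
import io

class OpCountingBytesIO(io.BytesIO):
    """BytesIO that counts read() and write() calls (illustrative sketch)."""

    def __init__(self, *args):
        super(OpCountingBytesIO, self).__init__(*args)
        self._read_count = 0
        self._write_count = 0

    def read(self, *args):
        self._read_count += 1
        return super(OpCountingBytesIO, self).read(*args)

    def write(self, data):
        self._write_count += 1
        return super(OpCountingBytesIO, self).write(data)
```

With `write_size=1` or `read_size=1`, every byte becomes its own operation, which is how the tests verify the buffer-size knobs are honored.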
|
22 | class TestDecompressor_decompress(unittest.TestCase): | |
|
23 | def test_empty_input(self): | |
|
24 | dctx = zstd.ZstdDecompressor() | |
|
25 | ||
|
26 | with self.assertRaisesRegexp(zstd.ZstdError, 'input data invalid'): | |
|
27 | dctx.decompress(b'') | |
|
28 | ||
|
29 | def test_invalid_input(self): | |
|
30 | dctx = zstd.ZstdDecompressor() | |
|
31 | ||
|
32 | with self.assertRaisesRegexp(zstd.ZstdError, 'input data invalid'): | |
|
33 | dctx.decompress(b'foobar') | |
|
34 | ||
|
35 | def test_no_content_size_in_frame(self): | |
|
36 | cctx = zstd.ZstdCompressor(write_content_size=False) | |
|
37 | compressed = cctx.compress(b'foobar') | |
|
38 | ||
|
39 | dctx = zstd.ZstdDecompressor() | |
|
40 | with self.assertRaisesRegexp(zstd.ZstdError, 'input data invalid'): | |
|
41 | dctx.decompress(compressed) | |
|
42 | ||
|
43 | def test_content_size_present(self): | |
|
44 | cctx = zstd.ZstdCompressor(write_content_size=True) | |
|
45 | compressed = cctx.compress(b'foobar') | |
|
46 | ||
|
47 | dctx = zstd.ZstdDecompressor() | |
|
48 | decompressed = dctx.decompress(compressed) | |
|
49 | self.assertEqual(decompressed, b'foobar') | |
|
50 | ||
|
51 | def test_max_output_size(self): | |
|
52 | cctx = zstd.ZstdCompressor(write_content_size=False) | |
|
53 | source = b'foobar' * 256 | |
|
54 | compressed = cctx.compress(source) | |
|
55 | ||
|
56 | dctx = zstd.ZstdDecompressor() | |
|
57 | # Will fit into buffer exactly the size of input. | |
|
58 | decompressed = dctx.decompress(compressed, max_output_size=len(source)) | |
|
59 | self.assertEqual(decompressed, source) | |
|
60 | ||
|
61 | # Input size - 1 fails | |
|
62 | with self.assertRaisesRegexp(zstd.ZstdError, 'Destination buffer is too small'): | |
|
63 | dctx.decompress(compressed, max_output_size=len(source) - 1) | |
|
64 | ||
|
65 | # Input size + 1 works | |
|
66 | decompressed = dctx.decompress(compressed, max_output_size=len(source) + 1) | |
|
67 | self.assertEqual(decompressed, source) | |
|
68 | ||
|
69 | # A much larger buffer works. | |
|
70 | decompressed = dctx.decompress(compressed, max_output_size=len(source) * 64) | |
|
71 | self.assertEqual(decompressed, source) | |
|
72 | ||
|
73 | def test_stupidly_large_output_buffer(self): | |
|
74 | cctx = zstd.ZstdCompressor(write_content_size=False) | |
|
75 | compressed = cctx.compress(b'foobar' * 256) | |
|
76 | dctx = zstd.ZstdDecompressor() | |
|
77 | ||
|
78 | # Will get OverflowError on some Python distributions that can't | |
|
79 | # handle really large integers. | |
|
80 | with self.assertRaises((MemoryError, OverflowError)): | |
|
81 | dctx.decompress(compressed, max_output_size=2**62) | |
|
82 | ||
|
83 | def test_dictionary(self): | |
|
84 | samples = [] | |
|
85 | for i in range(128): | |
|
86 | samples.append(b'foo' * 64) | |
|
87 | samples.append(b'bar' * 64) | |
|
88 | samples.append(b'foobar' * 64) | |
|
89 | ||
|
90 | d = zstd.train_dictionary(8192, samples) | |
|
91 | ||
|
92 | orig = b'foobar' * 16384 | |
|
93 | cctx = zstd.ZstdCompressor(level=1, dict_data=d, write_content_size=True) | |
|
94 | compressed = cctx.compress(orig) | |
|
95 | ||
|
96 | dctx = zstd.ZstdDecompressor(dict_data=d) | |
|
97 | decompressed = dctx.decompress(compressed) | |
|
98 | ||
|
99 | self.assertEqual(decompressed, orig) | |
|
100 | ||
|
101 | def test_dictionary_multiple(self): | |
|
102 | samples = [] | |
|
103 | for i in range(128): | |
|
104 | samples.append(b'foo' * 64) | |
|
105 | samples.append(b'bar' * 64) | |
|
106 | samples.append(b'foobar' * 64) | |
|
107 | ||
|
108 | d = zstd.train_dictionary(8192, samples) | |
|
109 | ||
|
110 | sources = (b'foobar' * 8192, b'foo' * 8192, b'bar' * 8192) | |
|
111 | compressed = [] | |
|
112 | cctx = zstd.ZstdCompressor(level=1, dict_data=d, write_content_size=True) | |
|
113 | for source in sources: | |
|
114 | compressed.append(cctx.compress(source)) | |
|
115 | ||
|
116 | dctx = zstd.ZstdDecompressor(dict_data=d) | |
|
117 | for i in range(len(sources)): | |
|
118 | decompressed = dctx.decompress(compressed[i]) | |
|
119 | self.assertEqual(decompressed, sources[i]) | |
|
120 | ||
|
121 | ||
|
122 | class TestDecompressor_copy_stream(unittest.TestCase): | |
|
123 | def test_no_read(self): | |
|
124 | source = object() | |
|
125 | dest = io.BytesIO() | |
|
126 | ||
|
127 | dctx = zstd.ZstdDecompressor() | |
|
128 | with self.assertRaises(ValueError): | |
|
129 | dctx.copy_stream(source, dest) | |
|
130 | ||
|
131 | def test_no_write(self): | |
|
132 | source = io.BytesIO() | |
|
133 | dest = object() | |
|
134 | ||
|
135 | dctx = zstd.ZstdDecompressor() | |
|
136 | with self.assertRaises(ValueError): | |
|
137 | dctx.copy_stream(source, dest) | |
|
138 | ||
|
139 | def test_empty(self): | |
|
140 | source = io.BytesIO() | |
|
141 | dest = io.BytesIO() | |
|
142 | ||
|
143 | dctx = zstd.ZstdDecompressor() | |
|
144 | # TODO should this raise an error? | |
|
145 | r, w = dctx.copy_stream(source, dest) | |
|
146 | ||
|
147 | self.assertEqual(r, 0) | |
|
148 | self.assertEqual(w, 0) | |
|
149 | self.assertEqual(dest.getvalue(), b'') | |
|
150 | ||
|
151 | def test_large_data(self): | |
|
152 | source = io.BytesIO() | |
|
153 | for i in range(255): | |
|
154 | source.write(struct.Struct('>B').pack(i) * 16384) | |
|
155 | source.seek(0) | |
|
156 | ||
|
157 | compressed = io.BytesIO() | |
|
158 | cctx = zstd.ZstdCompressor() | |
|
159 | cctx.copy_stream(source, compressed) | |
|
160 | ||
|
161 | compressed.seek(0) | |
|
162 | dest = io.BytesIO() | |
|
163 | dctx = zstd.ZstdDecompressor() | |
|
164 | r, w = dctx.copy_stream(compressed, dest) | |
|
165 | ||
|
166 | self.assertEqual(r, len(compressed.getvalue())) | |
|
167 | self.assertEqual(w, len(source.getvalue())) | |
|
168 | ||
|
169 | def test_read_write_size(self): | |
|
170 | source = OpCountingBytesIO(zstd.ZstdCompressor().compress( | |
|
171 | b'foobarfoobar')) | |
|
172 | ||
|
173 | dest = OpCountingBytesIO() | |
|
174 | dctx = zstd.ZstdDecompressor() | |
|
175 | r, w = dctx.copy_stream(source, dest, read_size=1, write_size=1) | |
|
176 | ||
|
177 | self.assertEqual(r, len(source.getvalue())) | |
|
178 | self.assertEqual(w, len(b'foobarfoobar')) | |
|
179 | self.assertEqual(source._read_count, len(source.getvalue()) + 1) | |
|
180 | self.assertEqual(dest._write_count, len(dest.getvalue())) | |
|
181 | ||
|
182 | ||
|
183 | class TestDecompressor_decompressobj(unittest.TestCase): | |
|
184 | def test_simple(self): | |
|
185 | data = zstd.ZstdCompressor(level=1).compress(b'foobar') | |
|
186 | ||
|
187 | dctx = zstd.ZstdDecompressor() | |
|
188 | dobj = dctx.decompressobj() | |
|
189 | self.assertEqual(dobj.decompress(data), b'foobar') | |
|
190 | ||
|
191 | def test_reuse(self): | |
|
192 | data = zstd.ZstdCompressor(level=1).compress(b'foobar') | |
|
193 | ||
|
194 | dctx = zstd.ZstdDecompressor() | |
|
195 | dobj = dctx.decompressobj() | |
|
196 | dobj.decompress(data) | |
|
197 | ||
|
198 | with self.assertRaisesRegexp(zstd.ZstdError, 'cannot use a decompressobj'): | |
|
199 | dobj.decompress(data) | |
|
200 | ||
|
201 | ||
|
202 | def decompress_via_writer(data): | |
|
203 | buffer = io.BytesIO() | |
|
204 | dctx = zstd.ZstdDecompressor() | |
|
205 | with dctx.write_to(buffer) as decompressor: | |
|
206 | decompressor.write(data) | |
|
207 | return buffer.getvalue() | |
|
208 | ||
|
209 | ||
|
210 | class TestDecompressor_write_to(unittest.TestCase): | |
|
211 | def test_empty_roundtrip(self): | |
|
212 | cctx = zstd.ZstdCompressor() | |
|
213 | empty = cctx.compress(b'') | |
|
214 | self.assertEqual(decompress_via_writer(empty), b'') | |
|
215 | ||
|
216 | def test_large_roundtrip(self): | |
|
217 | chunks = [] | |
|
218 | for i in range(255): | |
|
219 | chunks.append(struct.Struct('>B').pack(i) * 16384) | |
|
220 | orig = b''.join(chunks) | |
|
221 | cctx = zstd.ZstdCompressor() | |
|
222 | compressed = cctx.compress(orig) | |
|
223 | ||
|
224 | self.assertEqual(decompress_via_writer(compressed), orig) | |
|
225 | ||
|
226 | def test_multiple_calls(self): | |
|
227 | chunks = [] | |
|
228 | for i in range(255): | |
|
229 | for j in range(255): | |
|
230 | chunks.append(struct.Struct('>B').pack(j) * i) | |
|
231 | ||
|
232 | orig = b''.join(chunks) | |
|
233 | cctx = zstd.ZstdCompressor() | |
|
234 | compressed = cctx.compress(orig) | |
|
235 | ||
|
236 | buffer = io.BytesIO() | |
|
237 | dctx = zstd.ZstdDecompressor() | |
|
238 | with dctx.write_to(buffer) as decompressor: | |
|
239 | pos = 0 | |
|
240 | while pos < len(compressed): | |
|
241 | pos2 = pos + 8192 | |
|
242 | decompressor.write(compressed[pos:pos2]) | |
|
243 | pos += 8192 | |
|
244 | self.assertEqual(buffer.getvalue(), orig) | |
|
245 | ||
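The 8192-byte feeding loop in the test above is a generic slicing pattern, not anything zstd-specific. A stdlib-only sketch (the helper name `write_in_chunks` is an illustration):

```python
import io

def write_in_chunks(data, writer, chunk_size=8192):
    """Feed ``data`` to ``writer.write()`` in fixed-size slices; return call count."""
    calls = 0
    pos = 0
    while pos < len(data):
        writer.write(data[pos:pos + chunk_size])
        pos += chunk_size
        calls += 1
    return calls

dest = io.BytesIO()
calls = write_in_chunks(b'x' * 20000, dest, 8192)
assert dest.getvalue() == b'x' * 20000
assert calls == 3  # 8192 + 8192 + 3616 bytes
```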
|
246 | def test_dictionary(self): | |
|
247 | samples = [] | |
|
248 | for i in range(128): | |
|
249 | samples.append(b'foo' * 64) | |
|
250 | samples.append(b'bar' * 64) | |
|
251 | samples.append(b'foobar' * 64) | |
|
252 | ||
|
253 | d = zstd.train_dictionary(8192, samples) | |
|
254 | ||
|
255 | orig = b'foobar' * 16384 | |
|
256 | buffer = io.BytesIO() | |
|
257 | cctx = zstd.ZstdCompressor(dict_data=d) | |
|
258 | with cctx.write_to(buffer) as compressor: | |
|
259 | compressor.write(orig) | |
|
260 | ||
|
261 | compressed = buffer.getvalue() | |
|
262 | buffer = io.BytesIO() | |
|
263 | ||
|
264 | dctx = zstd.ZstdDecompressor(dict_data=d) | |
|
265 | with dctx.write_to(buffer) as decompressor: | |
|
266 | decompressor.write(compressed) | |
|
267 | ||
|
268 | self.assertEqual(buffer.getvalue(), orig) | |
|
269 | ||
|
270 | def test_memory_size(self): | |
|
271 | dctx = zstd.ZstdDecompressor() | |
|
272 | buffer = io.BytesIO() | |
|
273 | with dctx.write_to(buffer) as decompressor: | |
|
274 | size = decompressor.memory_size() | |
|
275 | ||
|
276 | self.assertGreater(size, 100000) | |
|
277 | ||
|
278 | def test_write_size(self): | |
|
279 | source = zstd.ZstdCompressor().compress(b'foobarfoobar') | |
|
280 | dest = OpCountingBytesIO() | |
|
281 | dctx = zstd.ZstdDecompressor() | |
|
282 | with dctx.write_to(dest, write_size=1) as decompressor: | |
|
283 | s = struct.Struct('>B') | |
|
284 | for c in source: | |
|
285 | if not isinstance(c, str): | |
|
286 | c = s.pack(c) | |
|
287 | decompressor.write(c) | |
|
288 | ||
|
289 | ||
|
290 | self.assertEqual(dest.getvalue(), b'foobarfoobar') | |
|
291 | self.assertEqual(dest._write_count, len(dest.getvalue())) | |
|
292 | ||
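The `isinstance(c, str)` check in the loop above papers over a Python 2/3 difference: iterating a bytes object yields length-1 strings on Python 2 but ints on Python 3. A stdlib sketch of the same normalization (the generator name is an illustration):

```python
import struct

def iter_byte_strings(data):
    """Yield each byte of ``data`` as a length-1 bytes object on Python 2 and 3."""
    s = struct.Struct('>B')
    for c in data:
        if not isinstance(c, bytes):  # Python 3: iteration yields ints
            c = s.pack(c)
        yield c

assert b''.join(iter_byte_strings(b'abc')) == b'abc'
```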
|
293 | ||
|
294 | class TestDecompressor_read_from(unittest.TestCase): | |
|
295 | def test_type_validation(self): | |
|
296 | dctx = zstd.ZstdDecompressor() | |
|
297 | ||
|
298 | # Object with read() works. | |
|
299 | dctx.read_from(io.BytesIO()) | |
|
300 | ||
|
301 | # Buffer protocol works. | |
|
302 | dctx.read_from(b'foobar') | |
|
303 | ||
|
304 | with self.assertRaisesRegexp(ValueError, 'must pass an object with a read'): | |
|
305 | dctx.read_from(True) | |
|
306 | ||
|
307 | def test_empty_input(self): | |
|
308 | dctx = zstd.ZstdDecompressor() | |
|
309 | ||
|
310 | source = io.BytesIO() | |
|
311 | it = dctx.read_from(source) | |
|
312 | # TODO this is arguably wrong. Should get an error about a missing frame. | |
|
313 | with self.assertRaises(StopIteration): | |
|
314 | next(it) | |
|
315 | ||
|
316 | it = dctx.read_from(b'') | |
|
317 | with self.assertRaises(StopIteration): | |
|
318 | next(it) | |
|
319 | ||
|
320 | def test_invalid_input(self): | |
|
321 | dctx = zstd.ZstdDecompressor() | |
|
322 | ||
|
323 | source = io.BytesIO(b'foobar') | |
|
324 | it = dctx.read_from(source) | |
|
325 | with self.assertRaisesRegexp(zstd.ZstdError, 'Unknown frame descriptor'): | |
|
326 | next(it) | |
|
327 | ||
|
328 | it = dctx.read_from(b'foobar') | |
|
329 | with self.assertRaisesRegexp(zstd.ZstdError, 'Unknown frame descriptor'): | |
|
330 | next(it) | |
|
331 | ||
|
332 | def test_empty_roundtrip(self): | |
|
333 | cctx = zstd.ZstdCompressor(level=1, write_content_size=False) | |
|
334 | empty = cctx.compress(b'') | |
|
335 | ||
|
336 | source = io.BytesIO(empty) | |
|
337 | source.seek(0) | |
|
338 | ||
|
339 | dctx = zstd.ZstdDecompressor() | |
|
340 | it = dctx.read_from(source) | |
|
341 | ||
|
342 | # No chunks should be emitted since there is no data. | |
|
343 | with self.assertRaises(StopIteration): | |
|
344 | next(it) | |
|
345 | ||
|
346 | # Again for good measure. | |
|
347 | with self.assertRaises(StopIteration): | |
|
348 | next(it) | |
|
349 | ||
|
350 | def test_skip_bytes_too_large(self): | |
|
351 | dctx = zstd.ZstdDecompressor() | |
|
352 | ||
|
353 | with self.assertRaisesRegexp(ValueError, 'skip_bytes must be smaller than read_size'): | |
|
354 | dctx.read_from(b'', skip_bytes=1, read_size=1) | |
|
355 | ||
|
356 | with self.assertRaisesRegexp(ValueError, 'skip_bytes larger than first input chunk'): | |
|
357 | b''.join(dctx.read_from(b'foobar', skip_bytes=10)) | |
|
358 | ||
|
359 | def test_skip_bytes(self): | |
|
360 | cctx = zstd.ZstdCompressor(write_content_size=False) | |
|
361 | compressed = cctx.compress(b'foobar') | |
|
362 | ||
|
363 | dctx = zstd.ZstdDecompressor() | |
|
364 | output = b''.join(dctx.read_from(b'hdr' + compressed, skip_bytes=3)) | |
|
365 | self.assertEqual(output, b'foobar') | |
|
366 | ||
|
367 | def test_large_output(self): | |
|
368 | source = io.BytesIO() | |
|
369 | source.write(b'f' * zstd.DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE) | |
|
370 | source.write(b'o') | |
|
371 | source.seek(0) | |
|
372 | ||
|
373 | cctx = zstd.ZstdCompressor(level=1) | |
|
374 | compressed = io.BytesIO(cctx.compress(source.getvalue())) | |
|
375 | compressed.seek(0) | |
|
376 | ||
|
377 | dctx = zstd.ZstdDecompressor() | |
|
378 | it = dctx.read_from(compressed) | |
|
379 | ||
|
380 | chunks = [] | |
|
381 | chunks.append(next(it)) | |
|
382 | chunks.append(next(it)) | |
|
383 | ||
|
384 | with self.assertRaises(StopIteration): | |
|
385 | next(it) | |
|
386 | ||
|
387 | decompressed = b''.join(chunks) | |
|
388 | self.assertEqual(decompressed, source.getvalue()) | |
|
389 | ||
|
390 | # And again with buffer protocol. | |
|
391 | it = dctx.read_from(compressed.getvalue()) | |
|
392 | chunks = [] | |
|
393 | chunks.append(next(it)) | |
|
394 | chunks.append(next(it)) | |
|
395 | ||
|
396 | with self.assertRaises(StopIteration): | |
|
397 | next(it) | |
|
398 | ||
|
399 | decompressed = b''.join(chunks) | |
|
400 | self.assertEqual(decompressed, source.getvalue()) | |
|
401 | ||
|
402 | def test_large_input(self): | |
|
403 | bytes = list(struct.Struct('>B').pack(i) for i in range(256)) | |
|
404 | compressed = io.BytesIO() | |
|
405 | input_size = 0 | |
|
406 | cctx = zstd.ZstdCompressor(level=1) | |
|
407 | with cctx.write_to(compressed) as compressor: | |
|
408 | while True: | |
|
409 | compressor.write(random.choice(bytes)) | |
|
410 | input_size += 1 | |
|
411 | ||
|
412 | have_compressed = len(compressed.getvalue()) > zstd.DECOMPRESSION_RECOMMENDED_INPUT_SIZE | |
|
413 | have_raw = input_size > zstd.DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE * 2 | |
|
414 | if have_compressed and have_raw: | |
|
415 | break | |
|
416 | ||
|
417 | compressed.seek(0) | |
|
418 | self.assertGreater(len(compressed.getvalue()), | |
|
419 | zstd.DECOMPRESSION_RECOMMENDED_INPUT_SIZE) | |
|
420 | ||
|
421 | dctx = zstd.ZstdDecompressor() | |
|
422 | it = dctx.read_from(compressed) | |
|
423 | ||
|
424 | chunks = [] | |
|
425 | chunks.append(next(it)) | |
|
426 | chunks.append(next(it)) | |
|
427 | chunks.append(next(it)) | |
|
428 | ||
|
429 | with self.assertRaises(StopIteration): | |
|
430 | next(it) | |
|
431 | ||
|
432 | decompressed = b''.join(chunks) | |
|
433 | self.assertEqual(len(decompressed), input_size) | |
|
434 | ||
|
435 | # And again with buffer protocol. | |
|
436 | it = dctx.read_from(compressed.getvalue()) | |
|
437 | ||
|
438 | chunks = [] | |
|
439 | chunks.append(next(it)) | |
|
440 | chunks.append(next(it)) | |
|
441 | chunks.append(next(it)) | |
|
442 | ||
|
443 | with self.assertRaises(StopIteration): | |
|
444 | next(it) | |
|
445 | ||
|
446 | decompressed = b''.join(chunks) | |
|
447 | self.assertEqual(len(decompressed), input_size) | |
|
448 | ||
|
449 | def test_interesting(self): | |
|
450 | # Found this edge case via fuzzing. | |
|
451 | cctx = zstd.ZstdCompressor(level=1) | |
|
452 | ||
|
453 | source = io.BytesIO() | |
|
454 | ||
|
455 | compressed = io.BytesIO() | |
|
456 | with cctx.write_to(compressed) as compressor: | |
|
457 | for i in range(256): | |
|
458 | chunk = b'\0' * 1024 | |
|
459 | compressor.write(chunk) | |
|
460 | source.write(chunk) | |
|
461 | ||
|
462 | dctx = zstd.ZstdDecompressor() | |
|
463 | ||
|
464 | simple = dctx.decompress(compressed.getvalue(), | |
|
465 | max_output_size=len(source.getvalue())) | |
|
466 | self.assertEqual(simple, source.getvalue()) | |
|
467 | ||
|
468 | compressed.seek(0) | |
|
469 | streamed = b''.join(dctx.read_from(compressed)) | |
|
470 | self.assertEqual(streamed, source.getvalue()) | |
|
471 | ||
|
472 | def test_read_write_size(self): | |
|
473 | source = OpCountingBytesIO(zstd.ZstdCompressor().compress(b'foobarfoobar')) | |
|
474 | dctx = zstd.ZstdDecompressor() | |
|
475 | for chunk in dctx.read_from(source, read_size=1, write_size=1): | |
|
476 | self.assertEqual(len(chunk), 1) | |
|
477 | ||
|
478 | self.assertEqual(source._read_count, len(source.getvalue())) |
@@ -0,0 +1,17 b'' | |||
|
1 | try: | |
|
2 | import unittest2 as unittest | |
|
3 | except ImportError: | |
|
4 | import unittest | |
|
5 | ||
|
6 | import zstd | |
|
7 | ||
|
8 | ||
|
9 | class TestSizes(unittest.TestCase): | |
|
10 | def test_decompression_size(self): | |
|
11 | size = zstd.estimate_decompression_context_size() | |
|
12 | self.assertGreater(size, 100000) | |
|
13 | ||
|
14 | def test_compression_size(self): | |
|
15 | params = zstd.get_compression_parameters(3) | |
|
16 | size = zstd.estimate_compression_context_size(params) | |
|
17 | self.assertGreater(size, 100000) |
@@ -0,0 +1,48 b'' | |||
|
1 | from __future__ import unicode_literals | |
|
2 | ||
|
3 | try: | |
|
4 | import unittest2 as unittest | |
|
5 | except ImportError: | |
|
6 | import unittest | |
|
7 | ||
|
8 | import zstd | |
|
9 | ||
|
10 | class TestModuleAttributes(unittest.TestCase): | |
|
11 | def test_version(self): | |
|
12 | self.assertEqual(zstd.ZSTD_VERSION, (1, 1, 1)) | |
|
13 | ||
|
14 | def test_constants(self): | |
|
15 | self.assertEqual(zstd.MAX_COMPRESSION_LEVEL, 22) | |
|
16 | self.assertEqual(zstd.FRAME_HEADER, b'\x28\xb5\x2f\xfd') | |
|
17 | ||
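The `FRAME_HEADER` bytes asserted above are the zstd magic number that begins every frame. As a stdlib-only sketch (no zstd bindings needed), the four bytes decode little-endian to the spec's magic value:

```python
import struct

FRAME_HEADER = b'\x28\xb5\x2f\xfd'  # first four bytes of every zstd frame

# Interpreted as a little-endian 32-bit integer, this is the
# magic number defined by the zstd frame format.
magic, = struct.unpack('<I', FRAME_HEADER)
assert magic == 0xFD2FB528
```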
|
18 | def test_hasattr(self): | |
|
19 | attrs = ( | |
|
20 | 'COMPRESSION_RECOMMENDED_INPUT_SIZE', | |
|
21 | 'COMPRESSION_RECOMMENDED_OUTPUT_SIZE', | |
|
22 | 'DECOMPRESSION_RECOMMENDED_INPUT_SIZE', | |
|
23 | 'DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE', | |
|
24 | 'MAGIC_NUMBER', | |
|
25 | 'WINDOWLOG_MIN', | |
|
26 | 'WINDOWLOG_MAX', | |
|
27 | 'CHAINLOG_MIN', | |
|
28 | 'CHAINLOG_MAX', | |
|
29 | 'HASHLOG_MIN', | |
|
30 | 'HASHLOG_MAX', | |
|
31 | 'HASHLOG3_MAX', | |
|
32 | 'SEARCHLOG_MIN', | |
|
33 | 'SEARCHLOG_MAX', | |
|
34 | 'SEARCHLENGTH_MIN', | |
|
35 | 'SEARCHLENGTH_MAX', | |
|
36 | 'TARGETLENGTH_MIN', | |
|
37 | 'TARGETLENGTH_MAX', | |
|
38 | 'STRATEGY_FAST', | |
|
39 | 'STRATEGY_DFAST', | |
|
40 | 'STRATEGY_GREEDY', | |
|
41 | 'STRATEGY_LAZY', | |
|
42 | 'STRATEGY_LAZY2', | |
|
43 | 'STRATEGY_BTLAZY2', | |
|
44 | 'STRATEGY_BTOPT', | |
|
45 | ) | |
|
46 | ||
|
47 | for a in attrs: | |
|
48 | self.assertTrue(hasattr(zstd, a)) |
@@ -0,0 +1,64 b'' | |||
|
1 | import io | |
|
2 | ||
|
3 | try: | |
|
4 | import unittest2 as unittest | |
|
5 | except ImportError: | |
|
6 | import unittest | |
|
7 | ||
|
8 | try: | |
|
9 | import hypothesis | |
|
10 | import hypothesis.strategies as strategies | |
|
11 | except ImportError: | |
|
12 | raise unittest.SkipTest('hypothesis not available') | |
|
13 | ||
|
14 | import zstd | |
|
15 | ||
|
16 | ||
|
17 | compression_levels = strategies.integers(min_value=1, max_value=22) | |
|
18 | ||
|
19 | ||
|
20 | class TestRoundTrip(unittest.TestCase): | |
|
21 | @hypothesis.given(strategies.binary(), compression_levels) | |
|
22 | def test_compress_write_to(self, data, level): | |
|
23 | """Random data from compress() roundtrips via write_to.""" | |
|
24 | cctx = zstd.ZstdCompressor(level=level) | |
|
25 | compressed = cctx.compress(data) | |
|
26 | ||
|
27 | buffer = io.BytesIO() | |
|
28 | dctx = zstd.ZstdDecompressor() | |
|
29 | with dctx.write_to(buffer) as decompressor: | |
|
30 | decompressor.write(compressed) | |
|
31 | ||
|
32 | self.assertEqual(buffer.getvalue(), data) | |
|
33 | ||
|
34 | @hypothesis.given(strategies.binary(), compression_levels) | |
|
35 | def test_compressor_write_to_decompressor_write_to(self, data, level): | |
|
36 | """Random data from compressor write_to roundtrips via write_to.""" | |
|
37 | compress_buffer = io.BytesIO() | |
|
38 | decompressed_buffer = io.BytesIO() | |
|
39 | ||
|
40 | cctx = zstd.ZstdCompressor(level=level) | |
|
41 | with cctx.write_to(compress_buffer) as compressor: | |
|
42 | compressor.write(data) | |
|
43 | ||
|
44 | dctx = zstd.ZstdDecompressor() | |
|
45 | with dctx.write_to(decompressed_buffer) as decompressor: | |
|
46 | decompressor.write(compress_buffer.getvalue()) | |
|
47 | ||
|
48 | self.assertEqual(decompressed_buffer.getvalue(), data) | |
|
49 | ||
|
50 | @hypothesis.given(strategies.binary(average_size=1048576)) | |
|
51 | @hypothesis.settings(perform_health_check=False) | |
|
52 | def test_compressor_write_to_decompressor_write_to_larger(self, data): | |
|
53 | compress_buffer = io.BytesIO() | |
|
54 | decompressed_buffer = io.BytesIO() | |
|
55 | ||
|
56 | cctx = zstd.ZstdCompressor(level=5) | |
|
57 | with cctx.write_to(compress_buffer) as compressor: | |
|
58 | compressor.write(data) | |
|
59 | ||
|
60 | dctx = zstd.ZstdDecompressor() | |
|
61 | with dctx.write_to(decompressed_buffer) as decompressor: | |
|
62 | decompressor.write(compress_buffer.getvalue()) | |
|
63 | ||
|
64 | self.assertEqual(decompressed_buffer.getvalue(), data) |
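The round-trip property these Hypothesis tests exercise — one-shot compression whose output streams back through a writing decompressor unchanged — can be sketched with the stdlib zlib module, since the vendored zstd extension may not be importable outside this tree. The `roundtrip` helper name is hypothetical; zlib stands in for zstd:

```python
import io
import zlib

def roundtrip(data, level=5):
    # One-shot compress, then streaming decompress into a writer,
    # mirroring the compress() -> write_to() round trip tested above.
    compressed = zlib.compress(data, level)
    buf = io.BytesIO()
    dobj = zlib.decompressobj()
    buf.write(dobj.decompress(compressed))
    buf.write(dobj.flush())
    return buf.getvalue()
```

The property under test is simply `roundtrip(data) == data` for arbitrary byte strings, including the empty one.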
@@ -0,0 +1,46 b'' | |||
|
1 | import sys | |
|
2 | ||
|
3 | try: | |
|
4 | import unittest2 as unittest | |
|
5 | except ImportError: | |
|
6 | import unittest | |
|
7 | ||
|
8 | import zstd | |
|
9 | ||
|
10 | ||
|
11 | if sys.version_info[0] >= 3: | |
|
12 | int_type = int | |
|
13 | else: | |
|
14 | int_type = long | |
|
15 | ||
|
16 | ||
|
17 | class TestTrainDictionary(unittest.TestCase): | |
|
18 | def test_no_args(self): | |
|
19 | with self.assertRaises(TypeError): | |
|
20 | zstd.train_dictionary() | |
|
21 | ||
|
22 | def test_bad_args(self): | |
|
23 | with self.assertRaises(TypeError): | |
|
24 | zstd.train_dictionary(8192, u'foo') | |
|
25 | ||
|
26 | with self.assertRaises(ValueError): | |
|
27 | zstd.train_dictionary(8192, [u'foo']) | |
|
28 | ||
|
29 | def test_basic(self): | |
|
30 | samples = [] | |
|
31 | for i in range(128): | |
|
32 | samples.append(b'foo' * 64) | |
|
33 | samples.append(b'bar' * 64) | |
|
34 | samples.append(b'foobar' * 64) | |
|
35 | samples.append(b'baz' * 64) | |
|
36 | samples.append(b'foobaz' * 64) | |
|
37 | samples.append(b'bazfoo' * 64) | |
|
38 | ||
|
39 | d = zstd.train_dictionary(8192, samples) | |
|
40 | self.assertLessEqual(len(d), 8192) | |
|
41 | ||
|
42 | dict_id = d.dict_id() | |
|
43 | self.assertIsInstance(dict_id, int_type) | |
|
44 | ||
|
45 | data = d.as_bytes() | |
|
46 | self.assertEqual(data[0:4], b'\x37\xa4\x30\xec') |
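The final assertion above checks the Zstandard dictionary magic number, `0xEC30A437` stored little-endian. A minimal sketch of that check (the helper name is hypothetical, not part of this package):

```python
import struct

# Zstandard dictionaries begin with the magic number 0xEC30A437,
# stored little-endian: the byte sequence asserted in the test above.
ZDICT_MAGIC = b'\x37\xa4\x30\xec'

def looks_like_zstd_dict(data):
    """Return True if ``data`` begins with the dictionary magic number."""
    return data[:4] == ZDICT_MAGIC

# Sanity check: the bytes decode to the documented 32-bit constant.
assert struct.unpack('<I', ZDICT_MAGIC)[0] == 0xEC30A437
```

Note this differs from the frame magic `b'\x28\xb5\x2f\xfd'` asserted in the module-attribute tests, which marks compressed frames rather than dictionaries.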
@@ -0,0 +1,112 b'' | |||
|
1 | /** | |
|
2 | * Copyright (c) 2016-present, Gregory Szorc | |
|
3 | * All rights reserved. | |
|
4 | * | |
|
5 | * This software may be modified and distributed under the terms | |
|
6 | * of the BSD license. See the LICENSE file for details. | |
|
7 | */ | |
|
8 | ||
|
9 | /* A Python C extension for Zstandard. */ | |
|
10 | ||
|
11 | #include "python-zstandard.h" | |
|
12 | ||
|
13 | PyObject *ZstdError; | |
|
14 | ||
|
15 | PyDoc_STRVAR(estimate_compression_context_size__doc__, | |
|
16 | "estimate_compression_context_size(compression_parameters)\n" | |
|
17 | "\n" | |
|
18 | "Give the amount of memory allocated for a compression context given a\n" | |
|
19 | "CompressionParameters instance"); | |
|
20 | ||
|
21 | PyDoc_STRVAR(estimate_decompression_context_size__doc__, | |
|
22 | "estimate_decompression_context_size()\n" | |
|
23 | "\n" | |
|
24 | "Estimate the amount of memory allocated to a decompression context.\n" | |
|
25 | ); | |
|
26 | ||
|
27 | static PyObject* estimate_decompression_context_size(PyObject* self) { | |
|
28 | return PyLong_FromSize_t(ZSTD_estimateDCtxSize()); | |
|
29 | } | |
|
30 | ||
|
31 | PyDoc_STRVAR(get_compression_parameters__doc__, | |
|
32 | "get_compression_parameters(compression_level[, source_size[, dict_size]])\n" | |
|
33 | "\n" | |
|
34 | "Obtains a ``CompressionParameters`` instance from a compression level and\n" | |
|
35 | "optional input size and dictionary size"); | |
|
36 | ||
|
37 | PyDoc_STRVAR(train_dictionary__doc__, | |
|
38 | "train_dictionary(dict_size, samples)\n" | |
|
39 | "\n" | |
|
40 | "Train a dictionary from sample data.\n" | |
|
41 | "\n" | |
|
42 | "A compression dictionary of size ``dict_size`` will be created from the\n" | |
|
43 | "iterable of samples provided by ``samples``.\n" | |
|
44 | "\n" | |
|
45 | "A ``ZstdCompressionDict`` representing the trained dictionary is returned.\n"); | |
|
46 | ||
|
47 | static char zstd_doc[] = "Interface to the Zstandard compression library"; | |
|
48 | ||
|
49 | static PyMethodDef zstd_methods[] = { | |
|
50 | { "estimate_compression_context_size", (PyCFunction)estimate_compression_context_size, | |
|
51 | METH_VARARGS, estimate_compression_context_size__doc__ }, | |
|
52 | { "estimate_decompression_context_size", (PyCFunction)estimate_decompression_context_size, | |
|
53 | METH_NOARGS, estimate_decompression_context_size__doc__ }, | |
|
54 | { "get_compression_parameters", (PyCFunction)get_compression_parameters, | |
|
55 | METH_VARARGS, get_compression_parameters__doc__ }, | |
|
56 | { "train_dictionary", (PyCFunction)train_dictionary, | |
|
57 | METH_VARARGS | METH_KEYWORDS, train_dictionary__doc__ }, | |
|
58 | { NULL, NULL } | |
|
59 | }; | |
|
60 | ||
|
61 | void compressobj_module_init(PyObject* mod); | |
|
62 | void compressor_module_init(PyObject* mod); | |
|
63 | void compressionparams_module_init(PyObject* mod); | |
|
64 | void constants_module_init(PyObject* mod); | |
|
65 | void dictparams_module_init(PyObject* mod); | |
|
66 | void compressiondict_module_init(PyObject* mod); | |
|
67 | void compressionwriter_module_init(PyObject* mod); | |
|
68 | void compressoriterator_module_init(PyObject* mod); | |
|
69 | void decompressor_module_init(PyObject* mod); | |
|
70 | void decompressobj_module_init(PyObject* mod); | |
|
71 | void decompressionwriter_module_init(PyObject* mod); | |
|
72 | void decompressoriterator_module_init(PyObject* mod); | |
|
73 | ||
|
74 | void zstd_module_init(PyObject* m) { | |
|
75 | compressionparams_module_init(m); | |
|
76 | dictparams_module_init(m); | |
|
77 | compressiondict_module_init(m); | |
|
78 | compressobj_module_init(m); | |
|
79 | compressor_module_init(m); | |
|
80 | compressionwriter_module_init(m); | |
|
81 | compressoriterator_module_init(m); | |
|
82 | constants_module_init(m); | |
|
83 | decompressor_module_init(m); | |
|
84 | decompressobj_module_init(m); | |
|
85 | decompressionwriter_module_init(m); | |
|
86 | decompressoriterator_module_init(m); | |
|
87 | } | |
|
88 | ||
|
89 | #if PY_MAJOR_VERSION >= 3 | |
|
90 | static struct PyModuleDef zstd_module = { | |
|
91 | PyModuleDef_HEAD_INIT, | |
|
92 | "zstd", | |
|
93 | zstd_doc, | |
|
94 | -1, | |
|
95 | zstd_methods | |
|
96 | }; | |
|
97 | ||
|
98 | PyMODINIT_FUNC PyInit_zstd(void) { | |
|
99 | PyObject *m = PyModule_Create(&zstd_module); | |
|
100 | if (m) { | |
|
101 | zstd_module_init(m); | |
|
102 | } | |
|
103 | return m; | |
|
104 | } | |
|
105 | #else | |
|
106 | PyMODINIT_FUNC initzstd(void) { | |
|
107 | PyObject *m = Py_InitModule3("zstd", zstd_methods, zstd_doc); | |
|
108 | if (m) { | |
|
109 | zstd_module_init(m); | |
|
110 | } | |
|
111 | } | |
|
112 | #endif |
@@ -0,0 +1,152 b'' | |||
|
1 | # Copyright (c) 2016-present, Gregory Szorc | |
|
2 | # All rights reserved. | |
|
3 | # | |
|
4 | # This software may be modified and distributed under the terms | |
|
5 | # of the BSD license. See the LICENSE file for details. | |
|
6 | ||
|
7 | """Python interface to the Zstandard (zstd) compression library.""" | |
|
8 | ||
|
9 | from __future__ import absolute_import, unicode_literals | |
|
10 | ||
|
11 | import io | |
|
12 | ||
|
13 | from _zstd_cffi import ( | |
|
14 | ffi, | |
|
15 | lib, | |
|
16 | ) | |
|
17 | ||
|
18 | ||
|
19 | _CSTREAM_IN_SIZE = lib.ZSTD_CStreamInSize() | |
|
20 | _CSTREAM_OUT_SIZE = lib.ZSTD_CStreamOutSize() | |
|
21 | ||
|
22 | ||
|
23 | class _ZstdCompressionWriter(object): | |
|
24 | def __init__(self, cstream, writer): | |
|
25 | self._cstream = cstream | |
|
26 | self._writer = writer | |
|
27 | ||
|
28 | def __enter__(self): | |
|
29 | return self | |
|
30 | ||
|
31 | def __exit__(self, exc_type, exc_value, exc_tb): | |
|
32 | if not exc_type and not exc_value and not exc_tb: | |
|
33 | out_buffer = ffi.new('ZSTD_outBuffer *') | |
|
34 | out_buffer.dst = ffi.new('char[]', _CSTREAM_OUT_SIZE) | |
|
35 | out_buffer.size = _CSTREAM_OUT_SIZE | |
|
36 | out_buffer.pos = 0 | |
|
37 | ||
|
38 | while True: | |
|
39 | res = lib.ZSTD_endStream(self._cstream, out_buffer) | |
|
40 | if lib.ZSTD_isError(res): | |
|
41 | raise Exception('error ending compression stream: %s' % lib.ZSTD_getErrorName(res)) | |
|
42 | ||
|
43 | if out_buffer.pos: | |
|
44 | self._writer.write(ffi.buffer(out_buffer.dst, out_buffer.pos)) | |
|
45 | out_buffer.pos = 0 | |
|
46 | ||
|
47 | if res == 0: | |
|
48 | break | |
|
49 | ||
|
50 | return False | |
|
51 | ||
|
52 | def write(self, data): | |
|
53 | out_buffer = ffi.new('ZSTD_outBuffer *') | |
|
54 | out_buffer.dst = ffi.new('char[]', _CSTREAM_OUT_SIZE) | |
|
55 | out_buffer.size = _CSTREAM_OUT_SIZE | |
|
56 | out_buffer.pos = 0 | |
|
57 | ||
|
58 | # TODO can we reuse existing memory? | |
|
59 | in_buffer = ffi.new('ZSTD_inBuffer *') | |
|
60 | in_buffer.src = ffi.new('char[]', data) | |
|
61 | in_buffer.size = len(data) | |
|
62 | in_buffer.pos = 0 | |
|
63 | while in_buffer.pos < in_buffer.size: | |
|
64 | res = lib.ZSTD_compressStream(self._cstream, out_buffer, in_buffer) | |
|
65 | if lib.ZSTD_isError(res): | |
|
66 | raise Exception('zstd compress error: %s' % lib.ZSTD_getErrorName(res)) | |
|
67 | ||
|
68 | if out_buffer.pos: | |
|
69 | self._writer.write(ffi.buffer(out_buffer.dst, out_buffer.pos)) | |
|
70 | out_buffer.pos = 0 | |
|
71 | ||
|
72 | ||
|
73 | class ZstdCompressor(object): | |
|
74 | def __init__(self, level=3, dict_data=None, compression_params=None): | |
|
75 | if dict_data: | |
|
76 | raise Exception('dict_data not yet supported') | |
|
77 | if compression_params: | |
|
78 | raise Exception('compression_params not yet supported') | |
|
79 | ||
|
80 | self._compression_level = level | |
|
81 | ||
|
82 | def compress(self, data): | |
|
83 | # Just use the stream API for now. | |
|
84 | output = io.BytesIO() | |
|
85 | with self.write_to(output) as compressor: | |
|
86 | compressor.write(data) | |
|
87 | return output.getvalue() | |
|
88 | ||
|
89 | def copy_stream(self, ifh, ofh): | |
|
90 | cstream = self._get_cstream() | |
|
91 | ||
|
92 | in_buffer = ffi.new('ZSTD_inBuffer *') | |
|
93 | out_buffer = ffi.new('ZSTD_outBuffer *') | |
|
94 | ||
|
95 | out_buffer.dst = ffi.new('char[]', _CSTREAM_OUT_SIZE) | |
|
96 | out_buffer.size = _CSTREAM_OUT_SIZE | |
|
97 | out_buffer.pos = 0 | |
|
98 | ||
|
99 | total_read, total_write = 0, 0 | |
|
100 | ||
|
101 | while True: | |
|
102 | data = ifh.read(_CSTREAM_IN_SIZE) | |
|
103 | if not data: | |
|
104 | break | |
|
105 | ||
|
106 | total_read += len(data) | |
|
107 | ||
|
108 | in_buffer.src = ffi.new('char[]', data) | |
|
109 | in_buffer.size = len(data) | |
|
110 | in_buffer.pos = 0 | |
|
111 | ||
|
112 | while in_buffer.pos < in_buffer.size: | |
|
113 | res = lib.ZSTD_compressStream(cstream, out_buffer, in_buffer) | |
|
114 | if lib.ZSTD_isError(res): | |
|
115 | raise Exception('zstd compress error: %s' % | |
|
116 | lib.ZSTD_getErrorName(res)) | |
|
117 | ||
|
118 | if out_buffer.pos: | |
|
119 | ofh.write(ffi.buffer(out_buffer.dst, out_buffer.pos)) | |
|
120 | total_write += out_buffer.pos | |
|
121 | out_buffer.pos = 0 | |
|
122 | ||
|
123 | # We've finished reading. Flush the compressor. | |
|
124 | while True: | |
|
125 | res = lib.ZSTD_endStream(cstream, out_buffer) | |
|
126 | if lib.ZSTD_isError(res): | |
|
127 | raise Exception('error ending compression stream: %s' % | |
|
128 | lib.ZSTD_getErrorName(res)) | |
|
129 | ||
|
130 | if out_buffer.pos: | |
|
131 | ofh.write(ffi.buffer(out_buffer.dst, out_buffer.pos)) | |
|
132 | total_write += out_buffer.pos | |
|
133 | out_buffer.pos = 0 | |
|
134 | ||
|
135 | if res == 0: | |
|
136 | break | |
|
137 | ||
|
138 | return total_read, total_write | |
|
139 | ||
|
140 | def write_to(self, writer): | |
|
141 | return _ZstdCompressionWriter(self._get_cstream(), writer) | |
|
142 | ||
|
143 | def _get_cstream(self): | |
|
144 | cstream = lib.ZSTD_createCStream() | |
|
145 | cstream = ffi.gc(cstream, lib.ZSTD_freeCStream) | |
|
146 | ||
|
147 | res = lib.ZSTD_initCStream(cstream, self._compression_level) | |
|
148 | if lib.ZSTD_isError(res): | |
|
149 | raise Exception('cannot init CStream: %s' % | |
|
150 | lib.ZSTD_getErrorName(res)) | |
|
151 | ||
|
152 | return cstream |
@@ -7,7 +7,7 b'' | |||
|
7 | 7 | New errors are not allowed. Warnings are strongly discouraged. |
|
8 | 8 | (The writing "no-che?k-code" is for not skipping this file when checking.) |
|
9 | 9 | |
|
10 | $ hg locate | sed 's-\\-/-g' | | |
|
10 | $ hg locate -X contrib/python-zstandard | sed 's-\\-/-g' | | |
|
11 | 11 | > xargs "$check_code" --warnings --per-file=0 || false |
|
12 | 12 | Skipping hgext/fsmonitor/pywatchman/__init__.py it has no-che?k-code (glob) |
|
13 | 13 | Skipping hgext/fsmonitor/pywatchman/bser.c it has no-che?k-code (glob) |
@@ -159,6 +159,7 b' outputs, which should be fixed later.' | |||
|
159 | 159 | $ hg locate 'set:**.py or grep(r"^#!.*?python")' \ |
|
160 | 160 | > 'tests/**.t' \ |
|
161 | 161 | > -X contrib/debugshell.py \ |
|
162 | > -X contrib/python-zstandard/ \ | |
|
162 | 163 | > -X contrib/win32/hgwebdir_wsgi.py \ |
|
163 | 164 | > -X doc/gendoc.py \ |
|
164 | 165 | > -X doc/hgmanpage.py \ |
@@ -4,6 +4,17 b'' | |||
|
4 | 4 | $ cd "$TESTDIR"/.. |
|
5 | 5 | |
|
6 | 6 | $ hg files 'set:(**.py)' | sed 's|\\|/|g' | xargs python contrib/check-py3-compat.py |
|
7 | contrib/python-zstandard/setup.py not using absolute_import | |
|
8 | contrib/python-zstandard/setup_zstd.py not using absolute_import | |
|
9 | contrib/python-zstandard/tests/common.py not using absolute_import | |
|
10 | contrib/python-zstandard/tests/test_cffi.py not using absolute_import | |
|
11 | contrib/python-zstandard/tests/test_compressor.py not using absolute_import | |
|
12 | contrib/python-zstandard/tests/test_data_structures.py not using absolute_import | |
|
13 | contrib/python-zstandard/tests/test_decompressor.py not using absolute_import | |
|
14 | contrib/python-zstandard/tests/test_estimate_sizes.py not using absolute_import | |
|
15 | contrib/python-zstandard/tests/test_module_attributes.py not using absolute_import | |
|
16 | contrib/python-zstandard/tests/test_roundtrip.py not using absolute_import | |
|
17 | contrib/python-zstandard/tests/test_train_dictionary.py not using absolute_import | |
|
7 | 18 | hgext/fsmonitor/pywatchman/__init__.py not using absolute_import |
|
8 | 19 | hgext/fsmonitor/pywatchman/__init__.py requires print_function |
|
9 | 20 | hgext/fsmonitor/pywatchman/capabilities.py not using absolute_import |
@@ -10,6 +10,6 b' run pyflakes on all tracked files ending' | |||
|
10 | 10 | > -X mercurial/pycompat.py \ |
|
11 | 11 | > 2>/dev/null \ |
|
12 | 12 | > | xargs pyflakes 2>/dev/null | "$TESTDIR/filterpyflakes.py" |
|
13 | contrib/python-zstandard/tests/test_data_structures.py:107: local variable 'size' is assigned to but never used | |
|
13 | 14 | tests/filterpyflakes.py:39: undefined name 'undefinedname' |
|
14 | 15 | |
|
15 |