zstandard: vendor python-zstandard 0.11...
Gregory Szorc
r42237:675775c3 default

The requested changes are too big and content was truncated. Five new files (mode 100644) are omitted from this diff.
@@ -62,6 +62,11 b' contrib/python-zstandard/zstd/compress/z'
62 62 contrib/python-zstandard/zstd/compress/zstd_opt.c
63 63 contrib/python-zstandard/zstd/compress/zstd_opt.h
64 64 contrib/python-zstandard/zstd/decompress/huf_decompress.c
65 contrib/python-zstandard/zstd/decompress/zstd_ddict.c
66 contrib/python-zstandard/zstd/decompress/zstd_ddict.h
67 contrib/python-zstandard/zstd/decompress/zstd_decompress_block.c
68 contrib/python-zstandard/zstd/decompress/zstd_decompress_block.h
69 contrib/python-zstandard/zstd/decompress/zstd_decompress_internal.h
65 70 contrib/python-zstandard/zstd/decompress/zstd_decompress.c
66 71 contrib/python-zstandard/zstd/deprecated/zbuff_common.c
67 72 contrib/python-zstandard/zstd/deprecated/zbuff_compress.c
@@ -5,6 +5,5 b' graft tests'
5 5 include make_cffi.py
6 6 include setup_zstd.py
7 7 include zstd.c
8 include zstd_cffi.py
9 8 include LICENSE
10 9 include NEWS.rst
@@ -8,8 +8,18 b' 1.0.0 (not yet released)'
8 8 Actions Blocking Release
9 9 ------------------------
10 10
11 * compression and decompression APIs that support ``io.rawIOBase`` interface
11 * compression and decompression APIs that support ``io.RawIOBase`` interface
12 12 (#13).
13 * ``stream_writer()`` APIs should support ``io.RawIOBase`` interface.
14 * Properly handle non-blocking I/O and partial writes for objects implementing
15 ``io.RawIOBase``.
16 * Make ``write_return_read=True`` the default for objects implementing
17 ``io.RawIOBase``.
18 * Audit for consistent and proper behavior of ``flush()`` and ``close()`` for
19 all objects implementing ``io.RawIOBase``. Is calling ``close()`` on
20 wrapped stream acceptable, should ``__exit__`` always call ``close()``,
21 should ``close()`` imply ``flush()``, etc.
22 * Consider making reads across frames configurable behavior.
13 23 * Refactor module names so C and CFFI extensions live under ``zstandard``
14 24 package.
15 25 * Overall API design review.
@@ -43,6 +53,11 b' Actions Blocking Release'
43 53 * Consider a ``chunker()`` API for decompression.
44 54 * Consider stats for ``chunker()`` API, including finding the last consumed
45 55 offset of input data.
56 * Consider exposing ``ZSTD_cParam_getBounds()`` and
57 ``ZSTD_dParam_getBounds()`` APIs.
58 * Consider controls over resetting compression contexts (session only, parameters,
59 or session and parameters).
60 * Actually use the CFFI backend in fuzzing tests.
46 61
47 62 Other Actions Not Blocking Release
48 63 ---------------------------------------
@@ -51,6 +66,207 b' Other Actions Not Blocking Release'
51 66 * API for ensuring max memory ceiling isn't exceeded.
52 67 * Move off nose for testing.
53 68
69 0.11.0 (released 2019-02-24)
70 ============================
71
72 Backwards Compatibility Notes
73 -----------------------------
74
75 * ``ZstdDecompressor.read()`` now allows reading sizes of ``-1`` or ``0``
76 and defaults to ``-1``, per the documented behavior of
77 ``io.RawIOBase.read()``. Previously, we required an argument that was
78 a positive value.
79 * The ``readline()``, ``readlines()``, ``__iter__``, and ``__next__`` methods
80 of ``ZstdDecompressionReader()`` now raise ``io.UnsupportedOperation``
81 instead of ``NotImplementedError``.
82 * ``ZstdDecompressor.stream_reader()`` now accepts a ``read_across_frames``
83 argument. The default value will likely be changed in a future release
84 and consumers are advised to pass the argument to avoid unwanted change
85 of behavior in the future.
86 * ``setup.py`` now always disables the CFFI backend if the installed
87 CFFI package does not meet the minimum version requirements. Before, it was
88 possible for the CFFI backend to be generated and a run-time error to
89 occur.
90 * In the CFFI backend, ``CompressionReader`` and ``DecompressionReader``
91 were renamed to ``ZstdCompressionReader`` and ``ZstdDecompressionReader``,
92 respectively so naming is identical to the C extension. This should have
93 no meaningful end-user impact, as instances aren't meant to be
94 constructed directly.
95 * ``ZstdDecompressor.stream_writer()`` now accepts a ``write_return_read``
96 argument to control whether ``write()`` returns the number of bytes
97 read from the source / written to the decompressor. It defaults to off,
98 which preserves the existing behavior of returning the number of bytes
99 emitted from the decompressor. The default will change in a future release
100 so behavior aligns with the specified behavior of ``io.RawIOBase``.
101 * ``ZstdDecompressionWriter.__exit__`` now calls ``self.close()``. This
102 will result in that stream plus the underlying stream being closed as
103 well. If this behavior is not desirable, do not use instances as
104 context managers.
105 * ``ZstdCompressor.stream_writer()`` now accepts a ``write_return_read``
106 argument to control whether ``write()`` returns the number of bytes read
107 from the source / written to the compressor. It defaults to off, which
108 preserves the existing behavior of returning the number of bytes emitted
109 from the compressor. The default will change in a future release so
110 behavior aligns with the specified behavior of ``io.RawIOBase``.
111 * ``ZstdCompressionWriter.__exit__`` now calls ``self.close()``. This will
112 result in that stream plus any underlying stream being closed as well. If
113 this behavior is not desirable, do not use instances as context managers.
114 * ``ZstdDecompressionWriter`` no longer requires being used as a context
115 manager (#57).
116 * ``ZstdCompressionWriter`` no longer requires being used as a context
117 manager (#57).
118 * The ``overlap_size_log`` attribute on ``CompressionParameters`` instances
119 has been deprecated and will be removed in a future release. The
120 ``overlap_log`` attribute should be used instead.
121 * The ``overlap_size_log`` argument to ``CompressionParameters`` has been
122 deprecated and will be removed in a future release. The ``overlap_log``
123 argument should be used instead.
124 * The ``ldm_hash_every_log`` attribute on ``CompressionParameters`` instances
125 has been deprecated and will be removed in a future release. The
126 ``ldm_hash_rate_log`` attribute should be used instead.
127 * The ``ldm_hash_every_log`` argument to ``CompressionParameters`` has been
128 deprecated and will be removed in a future release. The ``ldm_hash_rate_log``
129 argument should be used instead.
130 * The ``compression_strategy`` argument to ``CompressionParameters`` has been
131 deprecated and will be removed in a future release. The ``strategy``
132 argument should be used instead.
133 * The ``SEARCHLENGTH_MIN`` and ``SEARCHLENGTH_MAX`` constants are deprecated
134 and will be removed in a future release. Use ``MINMATCH_MIN`` and
135 ``MINMATCH_MAX`` instead.
136 * The ``zstd_cffi`` module has been renamed to ``zstandard.cffi``. As had
137 been documented in the ``README`` file since the ``0.9.0`` release, the
138 module should not be imported directly at its new location. Instead,
139 ``import zstandard`` to cause an appropriate backend module to be loaded
140 automatically.
141
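
The import change above can be sketched as follows (a minimal example, assuming the ``zstandard`` package from this release is installed)::

    # Before 0.11, the CFFI backend could be imported directly as ``zstd_cffi``.
    # Importing ``zstandard`` now loads an appropriate backend (C extension
    # or CFFI) automatically.
    import zstandard as zstd

    cctx = zstd.ZstdCompressor()
    frame = cctx.compress(b'data to compress')
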
142 Bug Fixes
143 ---------
144
145 * CFFI backend could encounter a failure when sending an empty chunk into
146 ``ZstdDecompressionObj.decompress()``. The issue has been fixed.
147 * CFFI backend could encounter an error when calling
148 ``ZstdDecompressionReader.read()`` if there was data remaining in an
149 internal buffer. The issue has been fixed. (#71)
150
151 Changes
152 -------
153
154 * ``ZstdDecompressionObj.decompress()`` now properly handles empty inputs in
155 the CFFI backend.
156 * ``ZstdCompressionReader`` now implements ``read1()`` and ``readinto1()``.
157 These are part of the ``io.BufferedIOBase`` interface.
158 * ``ZstdCompressionReader`` has gained a ``readinto(b)`` method for reading
159 compressed output into an existing buffer.
160 * ``ZstdCompressionReader.read()`` now defaults to ``size=-1`` and accepts
161 read sizes of ``-1`` and ``0``. The new behavior aligns with the documented
162 behavior of ``io.RawIOBase``.
163 * ``ZstdCompressionReader`` now implements ``readall()``. Previously, this
164 method raised ``NotImplementedError``.
165 * ``ZstdDecompressionReader`` now implements ``read1()`` and ``readinto1()``.
166 These are part of the ``io.BufferedIOBase`` interface.
167 * ``ZstdDecompressionReader.read()`` now defaults to ``size=-1`` and accepts
168 read sizes of ``-1`` and ``0``. The new behavior aligns with the documented
169 behavior of ``io.RawIOBase``.
170 * ``ZstdDecompressionReader()`` now implements ``readall()``. Previously, this
171 method raised ``NotImplementedError``.
172 * The ``readline()``, ``readlines()``, ``__iter__``, and ``__next__`` methods
173 of ``ZstdDecompressionReader()`` now raise ``io.UnsupportedOperation``
174 instead of ``NotImplementedError``. This reflects a decision to never
175 implement text-based I/O on (de)compressors and keep the low-level API
176 operating in the binary domain. (#13)
177 * ``README.rst`` now documents how to achieve linewise iteration using
178 an ``io.TextIOWrapper`` with a ``ZstdDecompressionReader``.
179 * ``ZstdDecompressionReader`` has gained a ``readinto(b)`` method for
180 reading decompressed output into an existing buffer. This allows chaining
181 to an ``io.TextIOWrapper`` on Python 3 without using an ``io.BufferedReader``.
182 * ``ZstdDecompressor.stream_reader()`` now accepts a ``read_across_frames``
183 argument to control behavior when the input data has multiple zstd
184 *frames*. When ``False`` (the default for backwards compatibility), a
185 ``read()`` will stop when the end of a zstd *frame* is encountered. When
186 ``True``, ``read()`` can potentially return data spanning multiple zstd
187 *frames*. The default will likely be changed to ``True`` in a future
188 release.
189 * ``setup.py`` now performs CFFI version sniffing and disables the CFFI
190 backend if CFFI is too old. Previously, we only used ``install_requires``
191 to enforce the CFFI version and not all build modes would properly enforce
192 the minimum CFFI version. (#69)
193 * CFFI's ``ZstdDecompressionReader.read()`` now properly handles data
194 remaining in any internal buffer. Before, repeated ``read()`` could
195 result in *random* errors. (#71)
196 * Upgraded various Python packages in CI environment.
197 * Upgrade to hypothesis 4.5.11.
198 * In the CFFI backend, ``CompressionReader`` and ``DecompressionReader``
199 were renamed to ``ZstdCompressionReader`` and ``ZstdDecompressionReader``,
200 respectively.
201 * ``ZstdDecompressor.stream_writer()`` now accepts a ``write_return_read``
202 argument to control whether ``write()`` returns the number of bytes read
203 from the source. It defaults to ``False`` to preserve backwards
204 compatibility.
205 * ``ZstdDecompressor.stream_writer()`` now implements the ``io.RawIOBase``
206 interface and behaves as a proper stream object.
207 * ``ZstdCompressor.stream_writer()`` now accepts a ``write_return_read``
208 argument to control whether ``write()`` returns the number of bytes read
209 from the source. It defaults to ``False`` to preserve backwards
210 compatibility.
211 * ``ZstdCompressionWriter`` now implements the ``io.RawIOBase`` interface and
212 behaves as a proper stream object. ``close()`` will now close the stream
213 and the underlying stream (if possible). ``__exit__`` will now call
214 ``close()``. Methods like ``writable()`` and ``fileno()`` are implemented.
215 * ``ZstdDecompressionWriter`` no longer must be used as a context manager.
216 * ``ZstdCompressionWriter`` no longer must be used as a context manager.
217 When not used as a context manager, it is important to call
218 ``flush(FLUSH_FRAME)`` or the compression stream won't be properly
219 terminated and decoders may complain about malformed input.
220 * ``ZstdCompressionWriter.flush()`` (what is returned from
221 ``ZstdCompressor.stream_writer()``) now accepts an argument controlling the
222 flush behavior. Its value can be one of the new constants
223 ``FLUSH_BLOCK`` or ``FLUSH_FRAME``.
224 * ``ZstdDecompressionObj`` instances now have a ``flush([length=None])`` method.
225 This provides parity with standard library equivalent types. (#65)
226 * ``CompressionParameters`` no longer redundantly store individual compression
227 parameters on each instance. Instead, compression parameters are stored inside
228 the underlying ``ZSTD_CCtx_params`` instance. Attributes for obtaining
229 parameters are now properties rather than instance variables.
230 * Exposed the ``STRATEGY_BTULTRA2`` constant.
231 * ``CompressionParameters`` instances now expose an ``overlap_log`` attribute.
232 This behaves identically to the ``overlap_size_log`` attribute.
233 * ``CompressionParameters()`` now accepts an ``overlap_log`` argument that
234 behaves identically to the ``overlap_size_log`` argument. An error will be
235 raised if both arguments are specified.
236 * ``CompressionParameters`` instances now expose an ``ldm_hash_rate_log``
237 attribute. This behaves identically to the ``ldm_hash_every_log`` attribute.
238 * ``CompressionParameters()`` now accepts an ``ldm_hash_rate_log`` argument that
239 behaves identically to the ``ldm_hash_every_log`` argument. An error will be
240 raised if both arguments are specified.
241 * ``CompressionParameters()`` now accepts a ``strategy`` argument that behaves
242 identically to the ``compression_strategy`` argument. An error will be raised
243 if both arguments are specified.
244 * The ``MINMATCH_MIN`` and ``MINMATCH_MAX`` constants were added. They are
245 semantically equivalent to the old ``SEARCHLENGTH_MIN`` and
246 ``SEARCHLENGTH_MAX`` constants.
247 * Bundled zstandard library upgraded from 1.3.7 to 1.3.8.
248 * ``setup.py`` denotes support for Python 3.7 (Python 3.7 was supported and
249 tested in the 0.10 release).
250 * ``zstd_cffi`` module has been renamed to ``zstandard.cffi``.
251 * ``ZstdCompressor.stream_writer()`` now reuses a buffer in order to avoid
252 allocating a new buffer for every operation. This should result in faster
253 performance in cases where ``write()`` or ``flush()`` are being called
254 frequently. (#62)
255 * Bundled zstandard library upgraded from 1.3.6 to 1.3.7.
256
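
As a hedged illustration of the renamed ``CompressionParameters`` arguments listed above (argument values are arbitrary; ``enable_ldm`` is set only so the LDM parameter is meaningful)::

    params = zstd.CompressionParameters(
        strategy=zstd.STRATEGY_BTULTRA2,  # replaces ``compression_strategy``
        overlap_log=6,                    # replaces ``overlap_size_log``
        enable_ldm=1,
        ldm_hash_rate_log=7,              # replaces ``ldm_hash_every_log``
    )
    cctx = zstd.ZstdCompressor(compression_params=params)
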
257 0.10.2 (released 2018-11-03)
258 ============================
259
260 Bug Fixes
261 ---------
262
263 * ``zstd_cffi.py`` added to ``setup.py`` (#60).
264
265 Changes
266 -------
267
268 * Change some integer casts to avoid ``ssize_t`` (#61).
269
54 270 0.10.1 (released 2018-10-08)
55 271 ============================
56 272
@@ -20,9 +20,9 b' https://github.com/indygreg/python-zstan'
20 20 Requirements
21 21 ============
22 22
23 This extension is designed to run with Python 2.7, 3.4, 3.5, and 3.6
24 on common platforms (Linux, Windows, and OS X). x86 and x86_64 are well-tested
25 on Windows. Only x86_64 is well-tested on Linux and macOS.
23 This extension is designed to run with Python 2.7, 3.4, 3.5, 3.6, and 3.7
24 on common platforms (Linux, Windows, and OS X). On PyPy (both PyPy2 and PyPy3) we support version 6.0.0 and above.
25 x86 and x86_64 are well-tested on Windows. Only x86_64 is well-tested on Linux and macOS.
26 26
27 27 Installing
28 28 ==========
@@ -215,7 +215,7 b' Instances can also be used as context ma'
215 215
216 216 # Do something with compressed chunk.
217 217
218 When the context manager exists or ``close()`` is called, the stream is closed,
218 When the context manager exits or ``close()`` is called, the stream is closed,
219 219 underlying resources are released, and future operations against the compression
220 220 stream will fail.
221 221
@@ -251,8 +251,54 b' emitted so far.'
251 251 Streaming Input API
252 252 ^^^^^^^^^^^^^^^^^^^
253 253
254 ``stream_writer(fh)`` (which behaves as a context manager) allows you to *stream*
255 data into a compressor.::
254 ``stream_writer(fh)`` allows you to *stream* data into a compressor.
255
256 Returned instances implement the ``io.RawIOBase`` interface. Only methods
257 that involve writing will do useful things.
258
259 The argument to ``stream_writer()`` must have a ``write(data)`` method. As
260 compressed data is available, ``write()`` will be called with the compressed
261 data as its argument. Many common Python types implement ``write()``, including
262 open file handles and ``io.BytesIO``.
263
264 The ``write(data)`` method is used to feed data into the compressor.
265
266 The ``flush([flush_mode=FLUSH_BLOCK])`` method can be called to evict whatever
267 data remains within the compressor's internal state into the output object. This
268 may result in 0 or more ``write()`` calls to the output object. This method
269 accepts an optional ``flush_mode`` argument to control the flushing behavior.
270 Its value can be any of the ``FLUSH_*`` constants.
271
272 Both ``write()`` and ``flush()`` return the number of bytes written to the
273 object's ``write()``. In many cases, small inputs do not accumulate enough
274 data to cause a write and ``write()`` will return ``0``.
275
276 Calling ``close()`` will mark the stream as closed and subsequent I/O
277 operations will raise ``ValueError`` (per the documented behavior of
278 ``io.RawIOBase``). ``close()`` will also call ``close()`` on the underlying
279 stream if such a method exists.
280
281 Typical usage is as follows::
282
283 cctx = zstd.ZstdCompressor(level=10)
284 compressor = cctx.stream_writer(fh)
285
286 compressor.write(b'chunk 0\n')
287 compressor.write(b'chunk 1\n')
288 compressor.flush()
289 # Receiver will be able to decode ``chunk 0\nchunk 1\n`` at this point.
290 # Receiver is also expecting more data in the zstd *frame*.
291
292 compressor.write(b'chunk 2\n')
293 compressor.flush(zstd.FLUSH_FRAME)
294 # Receiver will be able to decode ``chunk 0\nchunk 1\nchunk 2``.
295 # Receiver is expecting no more data, as the zstd frame is closed.
296 # Any future calls to ``write()`` at this point will construct a new
297 # zstd frame.
298
299 Instances can be used as context managers. Exiting the context manager is
300 the equivalent of calling ``close()``, which is equivalent to calling
301 ``flush(zstd.FLUSH_FRAME)``::
256 302
257 303 cctx = zstd.ZstdCompressor(level=10)
258 304 with cctx.stream_writer(fh) as compressor:
@@ -260,22 +306,12 b' data into a compressor.::'
260 306 compressor.write(b'chunk 1')
261 307 ...
262 308
263 The argument to ``stream_writer()`` must have a ``write(data)`` method. As
264 compressed data is available, ``write()`` will be called with the compressed
265 data as its argument. Many common Python types implement ``write()``, including
266 open file handles and ``io.BytesIO``.
309 .. important::
267 310
268 ``stream_writer()`` returns an object representing a streaming compressor
269 instance. It **must** be used as a context manager. That object's
270 ``write(data)`` method is used to feed data into the compressor.
271
272 A ``flush()`` method can be called to evict whatever data remains within the
273 compressor's internal state into the output object. This may result in 0 or
274 more ``write()`` calls to the output object.
275
276 Both ``write()`` and ``flush()`` return the number of bytes written to the
277 object's ``write()``. In many cases, small inputs do not accumulate enough
278 data to cause a write and ``write()`` will return ``0``.
311 If ``flush(FLUSH_FRAME)`` is not called, emitted data doesn't constitute
312 a full zstd *frame* and consumers of this data may complain about malformed
313 input. It is recommended to use instances as a context manager to ensure
314 *frames* are properly finished.
279 315
280 316 If the size of the data being fed to this streaming compressor is known,
281 317 you can declare it before compression begins::
@@ -310,6 +346,14 b' Thte total number of bytes written so fa'
310 346 ...
311 347 total_written = compressor.tell()
312 348
349 ``stream_writer()`` accepts a ``write_return_read`` boolean argument to control
350 the return value of ``write()``. When ``False`` (the default), ``write()`` returns
351 the number of bytes that were ``write()``en to the underlying object. When
352 ``True``, ``write()`` returns the number of bytes read from the input that
353 were subsequently written to the compressor. ``True`` is the *proper* behavior
354 for ``write()`` as specified by the ``io.RawIOBase`` interface and will become
355 the default value in a future release.
356
313 357 Streaming Output API
314 358 ^^^^^^^^^^^^^^^^^^^^
315 359
@@ -654,27 +698,63 b' will raise ``ValueError`` if attempted.'
654 698 ``tell()`` returns the number of decompressed bytes read so far.
655 699
656 700 Not all I/O methods are implemented. Notably missing is support for
657 ``readline()``, ``readlines()``, and linewise iteration support. Support for
658 these is planned for a future release.
701 ``readline()``, ``readlines()``, and linewise iteration support. This is
702 because streams operate on binary data - not text data. If you want to
703 convert decompressed output to text, you can chain an ``io.TextIOWrapper``
704 to the stream::
705
706 with open(path, 'rb') as fh:
707 dctx = zstd.ZstdDecompressor()
708 stream_reader = dctx.stream_reader(fh)
709 text_stream = io.TextIOWrapper(stream_reader, encoding='utf-8')
710
711 for line in text_stream:
712 ...
713
714 The ``read_across_frames`` argument to ``stream_reader()`` controls the
715 behavior of read operations when the end of a zstd *frame* is encountered.
716 When ``False`` (the default), a read will complete when the end of a
717 zstd *frame* is encountered. When ``True``, a read can potentially
718 return data spanning multiple zstd *frames*.
659 719
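
A short sketch of ``read_across_frames`` (``multi_frame_fh`` is a hypothetical stream containing two concatenated zstd frames)::

    dctx = zstd.ZstdDecompressor()

    # Default (read_across_frames=False): read() stops at the first
    # frame boundary.
    reader = dctx.stream_reader(multi_frame_fh)

    # Opt in to reads spanning frames. Passing the argument explicitly
    # also guards against the planned change of the default.
    reader = dctx.stream_reader(multi_frame_fh, read_across_frames=True)
    data = reader.read(1048576)
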
660 720 Streaming Input API
661 721 ^^^^^^^^^^^^^^^^^^^
662 722
663 ``stream_writer(fh)`` can be used to incrementally send compressed data to a
664 decompressor.::
723 ``stream_writer(fh)`` allows you to *stream* data into a decompressor.
724
725 Returned instances implement the ``io.RawIOBase`` interface. Only methods
726 that involve writing will do useful things.
727
728 The argument to ``stream_writer()`` is typically an object that also implements
729 ``io.RawIOBase``. But any object with a ``write(data)`` method will work. Many
730 common Python types conform to this interface, including open file handles
731 and ``io.BytesIO``.
732
733 Behavior is similar to ``ZstdCompressor.stream_writer()``: compressed data
734 is sent to the decompressor by calling ``write(data)`` and decompressed
735 output is written to the underlying stream by calling its ``write(data)``
736 method.::
665 737
666 738 dctx = zstd.ZstdDecompressor()
667 with dctx.stream_writer(fh) as decompressor:
668 decompressor.write(compressed_data)
739 decompressor = dctx.stream_writer(fh)
669 740
670 This behaves similarly to ``zstd.ZstdCompressor``: compressed data is written to
671 the decompressor by calling ``write(data)`` and decompressed output is written
672 to the output object by calling its ``write(data)`` method.
741 decompressor.write(compressed_data)
742 ...
743
673 744
674 745 Calls to ``write()`` will return the number of bytes written to the output
675 746 object. Not all inputs will result in bytes being written, so return values
676 747 of ``0`` are possible.
677 748
749 Like the ``stream_writer()`` compressor, instances can be used as context
750 managers. However, the context manager adds no special behavior and offers
751 little practical benefit here.
752
753 Calling ``close()`` will mark the stream as closed and subsequent I/O operations
754 will raise ``ValueError`` (per the documented behavior of ``io.RawIOBase``).
755 ``close()`` will also call ``close()`` on the underlying stream if such a
756 method exists.
757
678 758 The size of chunks being ``write()`` to the destination can be specified::
679 759
680 760 dctx = zstd.ZstdDecompressor()
@@ -687,6 +767,13 b' You can see how much memory is being use'
687 767 with dctx.stream_writer(fh) as decompressor:
688 768 byte_size = decompressor.memory_size()
689 769
770 ``stream_writer()`` accepts a ``write_return_read`` boolean argument to control
771 the return value of ``write()``. When ``False`` (the default)``, ``write()``
772 returns the number of bytes that were ``write()``en to the underlying stream.
773 When ``True``, ``write()`` returns the number of bytes read from the input.
774 ``True`` is the *proper* behavior for ``write()`` as specified by the
775 ``io.RawIOBase`` interface and will become the default in a future release.
776
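
A minimal sketch of the decompressor's ``write_return_read`` argument (``compressed_chunk`` is hypothetical input data)::

    dctx = zstd.ZstdDecompressor()
    decompressor = dctx.stream_writer(fh, write_return_read=True)

    # Returns the number of compressed bytes consumed from the input,
    # not the number of decompressed bytes written to ``fh``.
    consumed = decompressor.write(compressed_chunk)
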
690 777 Streaming Output API
691 778 ^^^^^^^^^^^^^^^^^^^^
692 779
@@ -791,6 +878,10 b' these temporary chunks by passing ``writ'
791 878 memory (re)allocations, this streaming decompression API isn't as
792 879 efficient as other APIs.
793 880
881 For compatibility with the standard library APIs, instances expose a
882 ``flush([length=None])`` method. This method no-ops and has no meaningful
883 side-effects, making it safe to call any time.
884
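
A short sketch of the no-op ``flush()`` on a decompression object (``compressed_chunk`` is hypothetical input)::

    dctx = zstd.ZstdDecompressor()
    dobj = dctx.decompressobj()

    data = dobj.decompress(compressed_chunk)
    dobj.flush()  # no-op; present for parity with zlib/bz2 objects
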
794 885 Batch Decompression API
795 886 ^^^^^^^^^^^^^^^^^^^^^^^
796 887
@@ -1147,18 +1238,21 b' follows:'
1147 1238 * search_log
1148 1239 * min_match
1149 1240 * target_length
1150 * compression_strategy
1241 * strategy
1242 * compression_strategy (deprecated: same as ``strategy``)
1151 1243 * write_content_size
1152 1244 * write_checksum
1153 1245 * write_dict_id
1154 1246 * job_size
1155 * overlap_size_log
1247 * overlap_log
1248 * overlap_size_log (deprecated: same as ``overlap_log``)
1156 1249 * force_max_window
1157 1250 * enable_ldm
1158 1251 * ldm_hash_log
1159 1252 * ldm_min_match
1160 1253 * ldm_bucket_size_log
1161 * ldm_hash_every_log
1254 * ldm_hash_rate_log
1255 * ldm_hash_every_log (deprecated: same as ``ldm_hash_rate_log``)
1162 1256 * threads
1163 1257
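
A hedged sketch of reading the settings listed above back off an instance (per the 0.11 change, attributes are read-only properties backed by the underlying ``ZSTD_CCtx_params``; values are arbitrary)::

    params = zstd.CompressionParameters(threads=2, job_size=1048576)
    assert params.threads == 2
    assert params.job_size == 1048576
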
1164 1258 Some of these are very low-level settings. It may help to consult the official
@@ -1240,6 +1334,13 b' FRAME_HEADER'
1240 1334 MAGIC_NUMBER
1241 1335 Frame header as an integer
1242 1336
1337 FLUSH_BLOCK
1338 Flushing behavior that denotes to flush a zstd block. A decompressor will
1339 be able to decode all data fed into the compressor so far.
1340 FLUSH_FRAME
1341 Flushing behavior that denotes to end a zstd frame. Any new data fed
1342 to the compressor will start a new frame.
1343
1243 1344 CONTENTSIZE_UNKNOWN
1244 1345 Value for content size when the content size is unknown.
1245 1346 CONTENTSIZE_ERROR
@@ -1261,10 +1362,18 b' SEARCHLOG_MIN'
1261 1362 Minimum value for compression parameter
1262 1363 SEARCHLOG_MAX
1263 1364 Maximum value for compression parameter
1365 MINMATCH_MIN
1366 Minimum value for compression parameter
1367 MINMATCH_MAX
1368 Maximum value for compression parameter
1264 1369 SEARCHLENGTH_MIN
1265 1370 Minimum value for compression parameter
1371
1372 Deprecated: use ``MINMATCH_MIN``
1266 1373 SEARCHLENGTH_MAX
1267 1374 Maximum value for compression parameter
1375
1376 Deprecated: use ``MINMATCH_MAX``
1268 1377 TARGETLENGTH_MIN
1269 1378 Minimum value for compression parameter
1270 1379 STRATEGY_FAST
@@ -1283,6 +1392,8 b' STRATEGY_BTOPT'
1283 1392 Compression strategy
1284 1393 STRATEGY_BTULTRA
1285 1394 Compression strategy
1395 STRATEGY_BTULTRA2
1396 Compression strategy
1286 1397
1287 1398 FORMAT_ZSTD1
1288 1399 Zstandard frame format
@@ -43,7 +43,7 b' static PyObject* ZstdCompressionChunkerI'
43 43 /* If we have data left in the input, consume it. */
44 44 while (chunker->input.pos < chunker->input.size) {
45 45 Py_BEGIN_ALLOW_THREADS
46 zresult = ZSTD_compress_generic(chunker->compressor->cctx, &chunker->output,
46 zresult = ZSTD_compressStream2(chunker->compressor->cctx, &chunker->output,
47 47 &chunker->input, ZSTD_e_continue);
48 48 Py_END_ALLOW_THREADS
49 49
@@ -104,7 +104,7 b' static PyObject* ZstdCompressionChunkerI'
104 104 }
105 105
106 106 Py_BEGIN_ALLOW_THREADS
107 zresult = ZSTD_compress_generic(chunker->compressor->cctx, &chunker->output,
107 zresult = ZSTD_compressStream2(chunker->compressor->cctx, &chunker->output,
108 108 &chunker->input, zFlushMode);
109 109 Py_END_ALLOW_THREADS
110 110
@@ -298,13 +298,9 b' static PyObject* ZstdCompressionDict_pre'
298 298 cParams = ZSTD_getCParams(level, 0, self->dictSize);
299 299 }
300 300 else {
301 cParams.chainLog = compressionParams->chainLog;
302 cParams.hashLog = compressionParams->hashLog;
303 cParams.searchLength = compressionParams->minMatch;
304 cParams.searchLog = compressionParams->searchLog;
305 cParams.strategy = compressionParams->compressionStrategy;
306 cParams.targetLength = compressionParams->targetLength;
307 cParams.windowLog = compressionParams->windowLog;
301 if (to_cparams(compressionParams, &cParams)) {
302 return NULL;
303 }
308 304 }
309 305
310 306 assert(!self->cdict);
@@ -10,7 +10,7 b''
10 10
11 11 extern PyObject* ZstdError;
12 12
13 int set_parameter(ZSTD_CCtx_params* params, ZSTD_cParameter param, unsigned value) {
13 int set_parameter(ZSTD_CCtx_params* params, ZSTD_cParameter param, int value) {
14 14 size_t zresult = ZSTD_CCtxParam_setParameter(params, param, value);
15 15 if (ZSTD_isError(zresult)) {
16 16 PyErr_Format(ZstdError, "unable to set compression context parameter: %s",
@@ -23,28 +23,41 b' int set_parameter(ZSTD_CCtx_params* para'
23 23
24 24 #define TRY_SET_PARAMETER(params, param, value) if (set_parameter(params, param, value)) return -1;
25 25
26 #define TRY_COPY_PARAMETER(source, dest, param) { \
27 int result; \
28 size_t zresult = ZSTD_CCtxParam_getParameter(source, param, &result); \
29 if (ZSTD_isError(zresult)) { \
30 return 1; \
31 } \
32 zresult = ZSTD_CCtxParam_setParameter(dest, param, result); \
33 if (ZSTD_isError(zresult)) { \
34 return 1; \
35 } \
36 }
37
26 38 int set_parameters(ZSTD_CCtx_params* params, ZstdCompressionParametersObject* obj) {
27 TRY_SET_PARAMETER(params, ZSTD_p_format, obj->format);
28 TRY_SET_PARAMETER(params, ZSTD_p_compressionLevel, (unsigned)obj->compressionLevel);
29 TRY_SET_PARAMETER(params, ZSTD_p_windowLog, obj->windowLog);
30 TRY_SET_PARAMETER(params, ZSTD_p_hashLog, obj->hashLog);
31 TRY_SET_PARAMETER(params, ZSTD_p_chainLog, obj->chainLog);
32 TRY_SET_PARAMETER(params, ZSTD_p_searchLog, obj->searchLog);
33 TRY_SET_PARAMETER(params, ZSTD_p_minMatch, obj->minMatch);
34 TRY_SET_PARAMETER(params, ZSTD_p_targetLength, obj->targetLength);
35 TRY_SET_PARAMETER(params, ZSTD_p_compressionStrategy, obj->compressionStrategy);
36 TRY_SET_PARAMETER(params, ZSTD_p_contentSizeFlag, obj->contentSizeFlag);
37 TRY_SET_PARAMETER(params, ZSTD_p_checksumFlag, obj->checksumFlag);
38 TRY_SET_PARAMETER(params, ZSTD_p_dictIDFlag, obj->dictIDFlag);
39 TRY_SET_PARAMETER(params, ZSTD_p_nbWorkers, obj->threads);
40 TRY_SET_PARAMETER(params, ZSTD_p_jobSize, obj->jobSize);
41 TRY_SET_PARAMETER(params, ZSTD_p_overlapSizeLog, obj->overlapSizeLog);
42 TRY_SET_PARAMETER(params, ZSTD_p_forceMaxWindow, obj->forceMaxWindow);
43 TRY_SET_PARAMETER(params, ZSTD_p_enableLongDistanceMatching, obj->enableLongDistanceMatching);
44 TRY_SET_PARAMETER(params, ZSTD_p_ldmHashLog, obj->ldmHashLog);
45 TRY_SET_PARAMETER(params, ZSTD_p_ldmMinMatch, obj->ldmMinMatch);
46 TRY_SET_PARAMETER(params, ZSTD_p_ldmBucketSizeLog, obj->ldmBucketSizeLog);
47 TRY_SET_PARAMETER(params, ZSTD_p_ldmHashEveryLog, obj->ldmHashEveryLog);
39 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_nbWorkers);
40
41 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_format);
42 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_compressionLevel);
43 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_windowLog);
44 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_hashLog);
45 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_chainLog);
46 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_searchLog);
47 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_minMatch);
48 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_targetLength);
49 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_strategy);
50 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_contentSizeFlag);
51 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_checksumFlag);
52 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_dictIDFlag);
53 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_jobSize);
54 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_overlapLog);
55 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_forceMaxWindow);
56 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_enableLongDistanceMatching);
57 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_ldmHashLog);
58 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_ldmMinMatch);
59 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_ldmBucketSizeLog);
60 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_ldmHashRateLog);
48 61
49 62 return 0;
50 63 }
@@ -64,6 +77,41 b' int reset_params(ZstdCompressionParamete'
64 77 return set_parameters(params->params, params);
65 78 }
66 79
80 #define TRY_GET_PARAMETER(params, param, value) { \
81 size_t zresult = ZSTD_CCtxParam_getParameter(params, param, value); \
82 if (ZSTD_isError(zresult)) { \
83 PyErr_Format(ZstdError, "unable to retrieve parameter: %s", ZSTD_getErrorName(zresult)); \
84 return 1; \
85 } \
86 }
87
88 int to_cparams(ZstdCompressionParametersObject* params, ZSTD_compressionParameters* cparams) {
89 int value;
90
91 TRY_GET_PARAMETER(params->params, ZSTD_c_windowLog, &value);
92 cparams->windowLog = value;
93
94 TRY_GET_PARAMETER(params->params, ZSTD_c_chainLog, &value);
95 cparams->chainLog = value;
96
97 TRY_GET_PARAMETER(params->params, ZSTD_c_hashLog, &value);
98 cparams->hashLog = value;
99
100 TRY_GET_PARAMETER(params->params, ZSTD_c_searchLog, &value);
101 cparams->searchLog = value;
102
103 TRY_GET_PARAMETER(params->params, ZSTD_c_minMatch, &value);
104 cparams->minMatch = value;
105
106 TRY_GET_PARAMETER(params->params, ZSTD_c_targetLength, &value);
107 cparams->targetLength = value;
108
109 TRY_GET_PARAMETER(params->params, ZSTD_c_strategy, &value);
110 cparams->strategy = value;
111
112 return 0;
113 }
114
67 115 static int ZstdCompressionParameters_init(ZstdCompressionParametersObject* self, PyObject* args, PyObject* kwargs) {
68 116 static char* kwlist[] = {
69 117 "format",
@@ -75,50 +123,60 b' static int ZstdCompressionParameters_ini'
75 123 "min_match",
76 124 "target_length",
77 125 "compression_strategy",
126 "strategy",
78 127 "write_content_size",
79 128 "write_checksum",
80 129 "write_dict_id",
81 130 "job_size",
131 "overlap_log",
82 132 "overlap_size_log",
83 133 "force_max_window",
84 134 "enable_ldm",
85 135 "ldm_hash_log",
86 136 "ldm_min_match",
87 137 "ldm_bucket_size_log",
138 "ldm_hash_rate_log",
88 139 "ldm_hash_every_log",
89 140 "threads",
90 141 NULL
91 142 };
92 143
93 unsigned format = 0;
144 int format = 0;
94 145 int compressionLevel = 0;
95 unsigned windowLog = 0;
96 unsigned hashLog = 0;
97 unsigned chainLog = 0;
98 unsigned searchLog = 0;
99 unsigned minMatch = 0;
100 unsigned targetLength = 0;
101 unsigned compressionStrategy = 0;
102 unsigned contentSizeFlag = 1;
103 unsigned checksumFlag = 0;
104 unsigned dictIDFlag = 0;
105 unsigned jobSize = 0;
106 unsigned overlapSizeLog = 0;
107 unsigned forceMaxWindow = 0;
108 unsigned enableLDM = 0;
109 unsigned ldmHashLog = 0;
110 unsigned ldmMinMatch = 0;
111 unsigned ldmBucketSizeLog = 0;
112 unsigned ldmHashEveryLog = 0;
146 int windowLog = 0;
147 int hashLog = 0;
148 int chainLog = 0;
149 int searchLog = 0;
150 int minMatch = 0;
151 int targetLength = 0;
152 int compressionStrategy = -1;
153 int strategy = -1;
154 int contentSizeFlag = 1;
155 int checksumFlag = 0;
156 int dictIDFlag = 0;
157 int jobSize = 0;
158 int overlapLog = -1;
159 int overlapSizeLog = -1;
160 int forceMaxWindow = 0;
161 int enableLDM = 0;
162 int ldmHashLog = 0;
163 int ldmMinMatch = 0;
164 int ldmBucketSizeLog = 0;
165 int ldmHashRateLog = -1;
166 int ldmHashEveryLog = -1;
113 167 int threads = 0;
114 168
115 169 if (!PyArg_ParseTupleAndKeywords(args, kwargs,
116 "|IiIIIIIIIIIIIIIIIIIIi:CompressionParameters",
170 "|iiiiiiiiiiiiiiiiiiiiiiii:CompressionParameters",
117 171 kwlist, &format, &compressionLevel, &windowLog, &hashLog, &chainLog,
118 &searchLog, &minMatch, &targetLength, &compressionStrategy,
119 &contentSizeFlag, &checksumFlag, &dictIDFlag, &jobSize, &overlapSizeLog,
120 &forceMaxWindow, &enableLDM, &ldmHashLog, &ldmMinMatch, &ldmBucketSizeLog,
121 &ldmHashEveryLog, &threads)) {
172 &searchLog, &minMatch, &targetLength, &compressionStrategy, &strategy,
173 &contentSizeFlag, &checksumFlag, &dictIDFlag, &jobSize, &overlapLog,
174 &overlapSizeLog, &forceMaxWindow, &enableLDM, &ldmHashLog, &ldmMinMatch,
175 &ldmBucketSizeLog, &ldmHashRateLog, &ldmHashEveryLog, &threads)) {
176 return -1;
177 }
178
179 if (reset_params(self)) {
122 180 return -1;
123 181 }
124 182
@@ -126,32 +184,70 b' static int ZstdCompressionParameters_ini'
126 184 threads = cpu_count();
127 185 }
128 186
129 self->format = format;
130 self->compressionLevel = compressionLevel;
131 self->windowLog = windowLog;
132 self->hashLog = hashLog;
133 self->chainLog = chainLog;
134 self->searchLog = searchLog;
135 self->minMatch = minMatch;
136 self->targetLength = targetLength;
137 self->compressionStrategy = compressionStrategy;
138 self->contentSizeFlag = contentSizeFlag;
139 self->checksumFlag = checksumFlag;
140 self->dictIDFlag = dictIDFlag;
141 self->threads = threads;
142 self->jobSize = jobSize;
143 self->overlapSizeLog = overlapSizeLog;
144 self->forceMaxWindow = forceMaxWindow;
145 self->enableLongDistanceMatching = enableLDM;
146 self->ldmHashLog = ldmHashLog;
147 self->ldmMinMatch = ldmMinMatch;
148 self->ldmBucketSizeLog = ldmBucketSizeLog;
149 self->ldmHashEveryLog = ldmHashEveryLog;
187 /* We need to set ZSTD_c_nbWorkers before ZSTD_c_jobSize and ZSTD_c_overlapLog
188 * because setting ZSTD_c_nbWorkers resets the other parameters. */
189 TRY_SET_PARAMETER(self->params, ZSTD_c_nbWorkers, threads);
190
191 TRY_SET_PARAMETER(self->params, ZSTD_c_format, format);
192 TRY_SET_PARAMETER(self->params, ZSTD_c_compressionLevel, compressionLevel);
193 TRY_SET_PARAMETER(self->params, ZSTD_c_windowLog, windowLog);
194 TRY_SET_PARAMETER(self->params, ZSTD_c_hashLog, hashLog);
195 TRY_SET_PARAMETER(self->params, ZSTD_c_chainLog, chainLog);
196 TRY_SET_PARAMETER(self->params, ZSTD_c_searchLog, searchLog);
197 TRY_SET_PARAMETER(self->params, ZSTD_c_minMatch, minMatch);
198 TRY_SET_PARAMETER(self->params, ZSTD_c_targetLength, targetLength);
150 199
151 if (reset_params(self)) {
200 if (compressionStrategy != -1 && strategy != -1) {
201 PyErr_SetString(PyExc_ValueError, "cannot specify both compression_strategy and strategy");
202 return -1;
203 }
204
205 if (compressionStrategy != -1) {
206 strategy = compressionStrategy;
207 }
208 else if (strategy == -1) {
209 strategy = 0;
210 }
211
212 TRY_SET_PARAMETER(self->params, ZSTD_c_strategy, strategy);
213 TRY_SET_PARAMETER(self->params, ZSTD_c_contentSizeFlag, contentSizeFlag);
214 TRY_SET_PARAMETER(self->params, ZSTD_c_checksumFlag, checksumFlag);
215 TRY_SET_PARAMETER(self->params, ZSTD_c_dictIDFlag, dictIDFlag);
216 TRY_SET_PARAMETER(self->params, ZSTD_c_jobSize, jobSize);
217
218 if (overlapLog != -1 && overlapSizeLog != -1) {
219 PyErr_SetString(PyExc_ValueError, "cannot specify both overlap_log and overlap_size_log");
152 220 return -1;
153 221 }
154 222
223 if (overlapSizeLog != -1) {
224 overlapLog = overlapSizeLog;
225 }
226 else if (overlapLog == -1) {
227 overlapLog = 0;
228 }
229
230 TRY_SET_PARAMETER(self->params, ZSTD_c_overlapLog, overlapLog);
231 TRY_SET_PARAMETER(self->params, ZSTD_c_forceMaxWindow, forceMaxWindow);
232 TRY_SET_PARAMETER(self->params, ZSTD_c_enableLongDistanceMatching, enableLDM);
233 TRY_SET_PARAMETER(self->params, ZSTD_c_ldmHashLog, ldmHashLog);
234 TRY_SET_PARAMETER(self->params, ZSTD_c_ldmMinMatch, ldmMinMatch);
235 TRY_SET_PARAMETER(self->params, ZSTD_c_ldmBucketSizeLog, ldmBucketSizeLog);
236
237 if (ldmHashRateLog != -1 && ldmHashEveryLog != -1) {
238 PyErr_SetString(PyExc_ValueError, "cannot specify both ldm_hash_rate_log and ldm_hash_every_log");
239 return -1;
240 }
241
242 if (ldmHashEveryLog != -1) {
243 ldmHashRateLog = ldmHashEveryLog;
244 }
245 else if (ldmHashRateLog == -1) {
246 ldmHashRateLog = 0;
247 }
248
249 TRY_SET_PARAMETER(self->params, ZSTD_c_ldmHashRateLog, ldmHashRateLog);
250
155 251 return 0;
156 252 }
157 253
@@ -259,7 +355,7 b' ZstdCompressionParametersObject* Compres'
259 355
260 356 val = PyDict_GetItemString(kwargs, "min_match");
261 357 if (!val) {
262 val = PyLong_FromUnsignedLong(params.searchLength);
358 val = PyLong_FromUnsignedLong(params.minMatch);
263 359 if (!val) {
264 360 goto cleanup;
265 361 }
@@ -336,6 +432,41 b' static void ZstdCompressionParameters_de'
336 432 PyObject_Del(self);
337 433 }
338 434
435 #define PARAM_GETTER(name, param) PyObject* ZstdCompressionParameters_get_##name(PyObject* self, void* unused) { \
436 int result; \
437 size_t zresult; \
438 ZstdCompressionParametersObject* p = (ZstdCompressionParametersObject*)(self); \
439 zresult = ZSTD_CCtxParam_getParameter(p->params, param, &result); \
440 if (ZSTD_isError(zresult)) { \
441 PyErr_Format(ZstdError, "unable to get compression parameter: %s", \
442 ZSTD_getErrorName(zresult)); \
443 return NULL; \
444 } \
445 return PyLong_FromLong(result); \
446 }
447
448 PARAM_GETTER(format, ZSTD_c_format)
449 PARAM_GETTER(compression_level, ZSTD_c_compressionLevel)
450 PARAM_GETTER(window_log, ZSTD_c_windowLog)
451 PARAM_GETTER(hash_log, ZSTD_c_hashLog)
452 PARAM_GETTER(chain_log, ZSTD_c_chainLog)
453 PARAM_GETTER(search_log, ZSTD_c_searchLog)
454 PARAM_GETTER(min_match, ZSTD_c_minMatch)
455 PARAM_GETTER(target_length, ZSTD_c_targetLength)
456 PARAM_GETTER(compression_strategy, ZSTD_c_strategy)
457 PARAM_GETTER(write_content_size, ZSTD_c_contentSizeFlag)
458 PARAM_GETTER(write_checksum, ZSTD_c_checksumFlag)
459 PARAM_GETTER(write_dict_id, ZSTD_c_dictIDFlag)
460 PARAM_GETTER(job_size, ZSTD_c_jobSize)
461 PARAM_GETTER(overlap_log, ZSTD_c_overlapLog)
462 PARAM_GETTER(force_max_window, ZSTD_c_forceMaxWindow)
463 PARAM_GETTER(enable_ldm, ZSTD_c_enableLongDistanceMatching)
464 PARAM_GETTER(ldm_hash_log, ZSTD_c_ldmHashLog)
465 PARAM_GETTER(ldm_min_match, ZSTD_c_ldmMinMatch)
466 PARAM_GETTER(ldm_bucket_size_log, ZSTD_c_ldmBucketSizeLog)
467 PARAM_GETTER(ldm_hash_rate_log, ZSTD_c_ldmHashRateLog)
468 PARAM_GETTER(threads, ZSTD_c_nbWorkers)
469
339 470 static PyMethodDef ZstdCompressionParameters_methods[] = {
340 471 {
341 472 "from_level",
@@ -352,70 +483,34 b' static PyMethodDef ZstdCompressionParame'
352 483 { NULL, NULL }
353 484 };
354 485
355 static PyMemberDef ZstdCompressionParameters_members[] = {
356 { "format", T_UINT,
357 offsetof(ZstdCompressionParametersObject, format), READONLY,
358 "compression format" },
359 { "compression_level", T_INT,
360 offsetof(ZstdCompressionParametersObject, compressionLevel), READONLY,
361 "compression level" },
362 { "window_log", T_UINT,
363 offsetof(ZstdCompressionParametersObject, windowLog), READONLY,
364 "window log" },
365 { "hash_log", T_UINT,
366 offsetof(ZstdCompressionParametersObject, hashLog), READONLY,
367 "hash log" },
368 { "chain_log", T_UINT,
369 offsetof(ZstdCompressionParametersObject, chainLog), READONLY,
370 "chain log" },
371 { "search_log", T_UINT,
372 offsetof(ZstdCompressionParametersObject, searchLog), READONLY,
373 "search log" },
374 { "min_match", T_UINT,
375 offsetof(ZstdCompressionParametersObject, minMatch), READONLY,
376 "search length" },
377 { "target_length", T_UINT,
378 offsetof(ZstdCompressionParametersObject, targetLength), READONLY,
379 "target length" },
380 { "compression_strategy", T_UINT,
381 offsetof(ZstdCompressionParametersObject, compressionStrategy), READONLY,
382 "compression strategy" },
383 { "write_content_size", T_UINT,
384 offsetof(ZstdCompressionParametersObject, contentSizeFlag), READONLY,
385 "whether to write content size in frames" },
386 { "write_checksum", T_UINT,
387 offsetof(ZstdCompressionParametersObject, checksumFlag), READONLY,
388 "whether to write checksum in frames" },
389 { "write_dict_id", T_UINT,
390 offsetof(ZstdCompressionParametersObject, dictIDFlag), READONLY,
391 "whether to write dictionary ID in frames" },
392 { "threads", T_UINT,
393 offsetof(ZstdCompressionParametersObject, threads), READONLY,
394 "number of threads to use" },
395 { "job_size", T_UINT,
396 offsetof(ZstdCompressionParametersObject, jobSize), READONLY,
397 "size of compression job when using multiple threads" },
398 { "overlap_size_log", T_UINT,
399 offsetof(ZstdCompressionParametersObject, overlapSizeLog), READONLY,
400 "Size of previous input reloaded at the beginning of each job" },
401 { "force_max_window", T_UINT,
402 offsetof(ZstdCompressionParametersObject, forceMaxWindow), READONLY,
403 "force back references to remain smaller than window size" },
404 { "enable_ldm", T_UINT,
405 offsetof(ZstdCompressionParametersObject, enableLongDistanceMatching), READONLY,
406 "whether to enable long distance matching" },
407 { "ldm_hash_log", T_UINT,
408 offsetof(ZstdCompressionParametersObject, ldmHashLog), READONLY,
409 "Size of the table for long distance matching, as a power of 2" },
410 { "ldm_min_match", T_UINT,
411 offsetof(ZstdCompressionParametersObject, ldmMinMatch), READONLY,
412 "minimum size of searched matches for long distance matcher" },
413 { "ldm_bucket_size_log", T_UINT,
414 offsetof(ZstdCompressionParametersObject, ldmBucketSizeLog), READONLY,
415 "log size of each bucket in the LDM hash table for collision resolution" },
416 { "ldm_hash_every_log", T_UINT,
417 offsetof(ZstdCompressionParametersObject, ldmHashEveryLog), READONLY,
418 "frequency of inserting/looking up entries in the LDM hash table" },
486 #define GET_SET_ENTRY(name) { #name, ZstdCompressionParameters_get_##name, NULL, NULL, NULL }
487
488 static PyGetSetDef ZstdCompressionParameters_getset[] = {
489 GET_SET_ENTRY(format),
490 GET_SET_ENTRY(compression_level),
491 GET_SET_ENTRY(window_log),
492 GET_SET_ENTRY(hash_log),
493 GET_SET_ENTRY(chain_log),
494 GET_SET_ENTRY(search_log),
495 GET_SET_ENTRY(min_match),
496 GET_SET_ENTRY(target_length),
497 GET_SET_ENTRY(compression_strategy),
498 GET_SET_ENTRY(write_content_size),
499 GET_SET_ENTRY(write_checksum),
500 GET_SET_ENTRY(write_dict_id),
501 GET_SET_ENTRY(threads),
502 GET_SET_ENTRY(job_size),
503 GET_SET_ENTRY(overlap_log),
504 /* TODO remove this deprecated attribute */
505 { "overlap_size_log", ZstdCompressionParameters_get_overlap_log, NULL, NULL, NULL },
506 GET_SET_ENTRY(force_max_window),
507 GET_SET_ENTRY(enable_ldm),
508 GET_SET_ENTRY(ldm_hash_log),
509 GET_SET_ENTRY(ldm_min_match),
510 GET_SET_ENTRY(ldm_bucket_size_log),
511 GET_SET_ENTRY(ldm_hash_rate_log),
512 /* TODO remove this deprecated attribute */
513 { "ldm_hash_every_log", ZstdCompressionParameters_get_ldm_hash_rate_log, NULL, NULL, NULL },
419 514 { NULL }
420 515 };
421 516
@@ -448,8 +543,8 b' PyTypeObject ZstdCompressionParametersTy'
448 543 0, /* tp_iter */
449 544 0, /* tp_iternext */
450 545 ZstdCompressionParameters_methods, /* tp_methods */
451 ZstdCompressionParameters_members, /* tp_members */
452 0, /* tp_getset */
546 0, /* tp_members */
547 ZstdCompressionParameters_getset, /* tp_getset */
453 548 0, /* tp_base */
454 549 0, /* tp_dict */
455 550 0, /* tp_descr_get */
This diff has been collapsed as it changes many lines (604 lines changed).
@@ -128,6 +128,96 b' static PyObject* reader_tell(ZstdCompres'
128 128 return PyLong_FromUnsignedLongLong(self->bytesCompressed);
129 129 }
130 130
131 int read_compressor_input(ZstdCompressionReader* self) {
132 if (self->finishedInput) {
133 return 0;
134 }
135
136 if (self->input.pos != self->input.size) {
137 return 0;
138 }
139
140 if (self->reader) {
141 Py_buffer buffer;
142
143 assert(self->readResult == NULL);
144
145 self->readResult = PyObject_CallMethod(self->reader, "read",
146 "k", self->readSize);
147
148 if (NULL == self->readResult) {
149 return -1;
150 }
151
152 memset(&buffer, 0, sizeof(buffer));
153
154 if (0 != PyObject_GetBuffer(self->readResult, &buffer, PyBUF_CONTIG_RO)) {
155 return -1;
156 }
157
158 /* EOF */
159 if (0 == buffer.len) {
160 self->finishedInput = 1;
161 Py_CLEAR(self->readResult);
162 }
163 else {
164 self->input.src = buffer.buf;
165 self->input.size = buffer.len;
166 self->input.pos = 0;
167 }
168
169 PyBuffer_Release(&buffer);
170 }
171 else {
172 assert(self->buffer.buf);
173
174 self->input.src = self->buffer.buf;
175 self->input.size = self->buffer.len;
176 self->input.pos = 0;
177 }
178
179 return 1;
180 }
181
182 int compress_input(ZstdCompressionReader* self, ZSTD_outBuffer* output) {
183 size_t oldPos;
184 size_t zresult;
185
186 /* If we have data left over, consume it. */
187 if (self->input.pos < self->input.size) {
188 oldPos = output->pos;
189
190 Py_BEGIN_ALLOW_THREADS
191 zresult = ZSTD_compressStream2(self->compressor->cctx,
192 output, &self->input, ZSTD_e_continue);
193 Py_END_ALLOW_THREADS
194
195 self->bytesCompressed += output->pos - oldPos;
196
197 /* Input exhausted. Clear out state tracking. */
198 if (self->input.pos == self->input.size) {
199 memset(&self->input, 0, sizeof(self->input));
200 Py_CLEAR(self->readResult);
201
202 if (self->buffer.buf) {
203 self->finishedInput = 1;
204 }
205 }
206
207 if (ZSTD_isError(zresult)) {
208 PyErr_Format(ZstdError, "zstd compress error: %s", ZSTD_getErrorName(zresult));
209 return -1;
210 }
211 }
212
213 if (output->pos && output->pos == output->size) {
214 return 1;
215 }
216 else {
217 return 0;
218 }
219 }
220
131 221 static PyObject* reader_read(ZstdCompressionReader* self, PyObject* args, PyObject* kwargs) {
132 222 static char* kwlist[] = {
133 223 "size",
@@ -140,25 +230,30 b' static PyObject* reader_read(ZstdCompres'
140 230 Py_ssize_t resultSize;
141 231 size_t zresult;
142 232 size_t oldPos;
233 int readResult, compressResult;
143 234
144 235 if (self->closed) {
145 236 PyErr_SetString(PyExc_ValueError, "stream is closed");
146 237 return NULL;
147 238 }
148 239
149 if (self->finishedOutput) {
150 return PyBytes_FromStringAndSize("", 0);
151 }
152
153 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "n", kwlist, &size)) {
240 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|n", kwlist, &size)) {
154 241 return NULL;
155 242 }
156 243
157 if (size < 1) {
158 PyErr_SetString(PyExc_ValueError, "cannot read negative or size 0 amounts");
244 if (size < -1) {
245 PyErr_SetString(PyExc_ValueError, "cannot read negative amounts less than -1");
159 246 return NULL;
160 247 }
161 248
249 if (size == -1) {
250 return PyObject_CallMethod((PyObject*)self, "readall", NULL);
251 }
252
253 if (self->finishedOutput || size == 0) {
254 return PyBytes_FromStringAndSize("", 0);
255 }
256
162 257 result = PyBytes_FromStringAndSize(NULL, size);
163 258 if (NULL == result) {
164 259 return NULL;
@@ -172,86 +267,34 b' static PyObject* reader_read(ZstdCompres'
172 267
173 268 readinput:
174 269
175 /* If we have data left over, consume it. */
176 if (self->input.pos < self->input.size) {
177 oldPos = self->output.pos;
178
179 Py_BEGIN_ALLOW_THREADS
180 zresult = ZSTD_compress_generic(self->compressor->cctx,
181 &self->output, &self->input, ZSTD_e_continue);
182
183 Py_END_ALLOW_THREADS
184
185 self->bytesCompressed += self->output.pos - oldPos;
186
187 /* Input exhausted. Clear out state tracking. */
188 if (self->input.pos == self->input.size) {
189 memset(&self->input, 0, sizeof(self->input));
190 Py_CLEAR(self->readResult);
270 compressResult = compress_input(self, &self->output);
191 271
192 if (self->buffer.buf) {
193 self->finishedInput = 1;
194 }
195 }
196
197 if (ZSTD_isError(zresult)) {
198 PyErr_Format(ZstdError, "zstd compress error: %s", ZSTD_getErrorName(zresult));
199 return NULL;
200 }
201
202 if (self->output.pos) {
203 /* If no more room in output, emit it. */
204 if (self->output.pos == self->output.size) {
205 memset(&self->output, 0, sizeof(self->output));
206 return result;
207 }
208
209 /*
210 * There is room in the output. We fall through to below, which will either
211 * get more input for us or will attempt to end the stream.
212 */
213 }
214
215 /* Fall through to gather more input. */
272 if (-1 == compressResult) {
273 Py_XDECREF(result);
274 return NULL;
275 }
276 else if (0 == compressResult) {
277 /* There is room in the output. We fall through to below, which will
278 * either get more input for us or will attempt to end the stream.
279 */
280 }
281 else if (1 == compressResult) {
282 memset(&self->output, 0, sizeof(self->output));
283 return result;
284 }
285 else {
286 assert(0);
216 287 }
217 288
218 if (!self->finishedInput) {
219 if (self->reader) {
220 Py_buffer buffer;
221
222 assert(self->readResult == NULL);
223 self->readResult = PyObject_CallMethod(self->reader, "read",
224 "k", self->readSize);
225 if (self->readResult == NULL) {
226 return NULL;
227 }
228
229 memset(&buffer, 0, sizeof(buffer));
230
231 if (0 != PyObject_GetBuffer(self->readResult, &buffer, PyBUF_CONTIG_RO)) {
232 return NULL;
233 }
289 readResult = read_compressor_input(self);
234 290
235 /* EOF */
236 if (0 == buffer.len) {
237 self->finishedInput = 1;
238 Py_CLEAR(self->readResult);
239 }
240 else {
241 self->input.src = buffer.buf;
242 self->input.size = buffer.len;
243 self->input.pos = 0;
244 }
245
246 PyBuffer_Release(&buffer);
247 }
248 else {
249 assert(self->buffer.buf);
250
251 self->input.src = self->buffer.buf;
252 self->input.size = self->buffer.len;
253 self->input.pos = 0;
254 }
291 if (-1 == readResult) {
292 return NULL;
293 }
294 else if (0 == readResult) { }
295 else if (1 == readResult) { }
296 else {
297 assert(0);
255 298 }
256 299
257 300 if (self->input.size) {
@@ -261,7 +304,7 b' readinput:'
261 304 /* Else EOF */
262 305 oldPos = self->output.pos;
263 306
264 zresult = ZSTD_compress_generic(self->compressor->cctx, &self->output,
307 zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output,
265 308 &self->input, ZSTD_e_end);
266 309
267 310 self->bytesCompressed += self->output.pos - oldPos;
@@ -269,6 +312,7 b' readinput:'
269 312 if (ZSTD_isError(zresult)) {
270 313 PyErr_Format(ZstdError, "error ending compression stream: %s",
271 314 ZSTD_getErrorName(zresult));
315 Py_XDECREF(result);
272 316 return NULL;
273 317 }
274 318
@@ -288,9 +332,394 b' readinput:'
288 332 return result;
289 333 }
290 334
335 static PyObject* reader_read1(ZstdCompressionReader* self, PyObject* args, PyObject* kwargs) {
336 static char* kwlist[] = {
337 "size",
338 NULL
339 };
340
341 Py_ssize_t size = -1;
342 PyObject* result = NULL;
343 char* resultBuffer;
344 Py_ssize_t resultSize;
345 ZSTD_outBuffer output;
346 int compressResult;
347 size_t oldPos;
348 size_t zresult;
349
350 if (self->closed) {
351 PyErr_SetString(PyExc_ValueError, "stream is closed");
352 return NULL;
353 }
354
355 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|n:read1", kwlist, &size)) {
356 return NULL;
357 }
358
359 if (size < -1) {
360 PyErr_SetString(PyExc_ValueError, "cannot read negative amounts less than -1");
361 return NULL;
362 }
363
364 if (self->finishedOutput || size == 0) {
365 return PyBytes_FromStringAndSize("", 0);
366 }
367
368 if (size == -1) {
369 size = ZSTD_CStreamOutSize();
370 }
371
372 result = PyBytes_FromStringAndSize(NULL, size);
373 if (NULL == result) {
374 return NULL;
375 }
376
377 PyBytes_AsStringAndSize(result, &resultBuffer, &resultSize);
378
379 output.dst = resultBuffer;
380 output.size = resultSize;
381 output.pos = 0;
382
383 /* read1() is supposed to use at most 1 read() from the underlying stream.
384 However, we can't satisfy this requirement with compression because
385 not every input will generate output. We /could/ flush the compressor,
386 but this may not be desirable. We allow multiple read() from the
387 underlying stream. But unlike read(), we return as soon as output data
388 is available.
389 */
390
391 compressResult = compress_input(self, &output);
392
393 if (-1 == compressResult) {
394 Py_XDECREF(result);
395 return NULL;
396 }
397 else if (0 == compressResult || 1 == compressResult) { }
398 else {
399 assert(0);
400 }
401
402 if (output.pos) {
403 goto finally;
404 }
405
406 while (!self->finishedInput) {
407 int readResult = read_compressor_input(self);
408
409 if (-1 == readResult) {
410 Py_XDECREF(result);
411 return NULL;
412 }
413 else if (0 == readResult || 1 == readResult) { }
414 else {
415 assert(0);
416 }
417
418 compressResult = compress_input(self, &output);
419
420 if (-1 == compressResult) {
421 Py_XDECREF(result);
422 return NULL;
423 }
424 else if (0 == compressResult || 1 == compressResult) { }
425 else {
426 assert(0);
427 }
428
429 if (output.pos) {
430 goto finally;
431 }
432 }
433
434 /* EOF */
435 oldPos = output.pos;
436
437 zresult = ZSTD_compressStream2(self->compressor->cctx, &output, &self->input,
438 ZSTD_e_end);
439
440 self->bytesCompressed += output.pos - oldPos;
441
442 if (ZSTD_isError(zresult)) {
443 PyErr_Format(ZstdError, "error ending compression stream: %s",
444 ZSTD_getErrorName(zresult));
445 Py_XDECREF(result);
446 return NULL;
447 }
448
449 if (zresult == 0) {
450 self->finishedOutput = 1;
451 }
452
453 finally:
454 if (result) {
455 if (safe_pybytes_resize(&result, output.pos)) {
456 Py_XDECREF(result);
457 return NULL;
458 }
459 }
460
461 return result;
462 }
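Per the comment above, read1() trades the strict single-read contract for returning as soon as any compressed output exists. A sketch contrasting it with read(), reusing the ``reader`` name from the previous sketch (an assumption, not part of this diff):

    # read(n) tries to return exactly n bytes (short only at EOF);
    # read1(n) returns the first compressed output available, which may
    # be much shorter than n.
    exact = reader.read(8192)
    quick = reader.read1(8192)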
463
291 464 static PyObject* reader_readall(PyObject* self) {
292 PyErr_SetNone(PyExc_NotImplementedError);
293 return NULL;
465 PyObject* chunks = NULL;
466 PyObject* empty = NULL;
467 PyObject* result = NULL;
468
469 /* Our strategy is to collect chunks into a list then join all the
470 * chunks at the end. We could potentially use e.g. an io.BytesIO. But
471 * this feels simple enough to implement and avoids potentially expensive
472 * reallocations of large buffers.
473 */
474 chunks = PyList_New(0);
475 if (NULL == chunks) {
476 return NULL;
477 }
478
479 while (1) {
480 PyObject* chunk = PyObject_CallMethod(self, "read", "i", 1048576);
481 if (NULL == chunk) {
482 Py_DECREF(chunks);
483 return NULL;
484 }
485
486 if (!PyBytes_Size(chunk)) {
487 Py_DECREF(chunk);
488 break;
489 }
490
491 if (PyList_Append(chunks, chunk)) {
492 Py_DECREF(chunk);
493 Py_DECREF(chunks);
494 return NULL;
495 }
496
497 Py_DECREF(chunk);
498 }
499
500 empty = PyBytes_FromStringAndSize("", 0);
501 if (NULL == empty) {
502 Py_DECREF(chunks);
503 return NULL;
504 }
505
506 result = PyObject_CallMethod(empty, "join", "O", chunks);
507
508 Py_DECREF(empty);
509 Py_DECREF(chunks);
510
511 return result;
512 }
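The chunk-collection strategy in readall() above corresponds to this Python sketch; the 1048576-byte chunk size mirrors the hard-coded value in the C code:

    def readall(reader):
        # Gather 1 MiB chunks and join them once at the end, avoiding
        # repeated reallocation of a single ever-growing buffer.
        chunks = []
        while True:
            chunk = reader.read(1048576)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)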
513
514 static PyObject* reader_readinto(ZstdCompressionReader* self, PyObject* args) {
515 Py_buffer dest;
516 ZSTD_outBuffer output;
517 int readResult, compressResult;
518 PyObject* result = NULL;
519 size_t zresult;
520 size_t oldPos;
521
522 if (self->closed) {
523 PyErr_SetString(PyExc_ValueError, "stream is closed");
524 return NULL;
525 }
526
527 if (self->finishedOutput) {
528 return PyLong_FromLong(0);
529 }
530
531 if (!PyArg_ParseTuple(args, "w*:readinto", &dest)) {
532 return NULL;
533 }
534
535 if (!PyBuffer_IsContiguous(&dest, 'C') || dest.ndim > 1) {
536 PyErr_SetString(PyExc_ValueError,
537 "destination buffer should be contiguous and have at most one dimension");
538 goto finally;
539 }
540
541 output.dst = dest.buf;
542 output.size = dest.len;
543 output.pos = 0;
544
545 compressResult = compress_input(self, &output);
546
547 if (-1 == compressResult) {
548 goto finally;
549 }
550 else if (0 == compressResult) { }
551 else if (1 == compressResult) {
552 result = PyLong_FromSize_t(output.pos);
553 goto finally;
554 }
555 else {
556 assert(0);
557 }
558
559 while (!self->finishedInput) {
560 readResult = read_compressor_input(self);
561
562 if (-1 == readResult) {
563 goto finally;
564 }
565 else if (0 == readResult || 1 == readResult) {}
566 else {
567 assert(0);
568 }
569
570 compressResult = compress_input(self, &output);
571
572 if (-1 == compressResult) {
573 goto finally;
574 }
575 else if (0 == compressResult) { }
576 else if (1 == compressResult) {
577 result = PyLong_FromSize_t(output.pos);
578 goto finally;
579 }
580 else {
581 assert(0);
582 }
583 }
584
585 /* EOF */
586 oldPos = output.pos;
587
588 zresult = ZSTD_compressStream2(self->compressor->cctx, &output, &self->input,
589 ZSTD_e_end);
590
591 self->bytesCompressed += output.pos - oldPos;
592
593 if (ZSTD_isError(zresult)) {
594 PyErr_Format(ZstdError, "error ending compression stream: %s",
595 ZSTD_getErrorName(zresult));
596 goto finally;
597 }
598
599 assert(output.pos);
600
601 if (0 == zresult) {
602 self->finishedOutput = 1;
603 }
604
605 result = PyLong_FromSize_t(output.pos);
606
607 finally:
608 PyBuffer_Release(&dest);
609
610 return result;
611 }
612
613 static PyObject* reader_readinto1(ZstdCompressionReader* self, PyObject* args) {
614 Py_buffer dest;
615 PyObject* result = NULL;
616 ZSTD_outBuffer output;
617 int compressResult;
618 size_t oldPos;
619 size_t zresult;
620
621 if (self->closed) {
622 PyErr_SetString(PyExc_ValueError, "stream is closed");
623 return NULL;
624 }
625
626 if (self->finishedOutput) {
627 return PyLong_FromLong(0);
628 }
629
630 if (!PyArg_ParseTuple(args, "w*:readinto1", &dest)) {
631 return NULL;
632 }
633
634 if (!PyBuffer_IsContiguous(&dest, 'C') || dest.ndim > 1) {
635 PyErr_SetString(PyExc_ValueError,
636 "destination buffer should be contiguous and have at most one dimension");
637 goto finally;
638 }
639
640 output.dst = dest.buf;
641 output.size = dest.len;
642 output.pos = 0;
643
644 compressResult = compress_input(self, &output);
645
646 if (-1 == compressResult) {
647 goto finally;
648 }
649 else if (0 == compressResult || 1 == compressResult) { }
650 else {
651 assert(0);
652 }
653
654 if (output.pos) {
655 result = PyLong_FromSize_t(output.pos);
656 goto finally;
657 }
658
659 while (!self->finishedInput) {
660 int readResult = read_compressor_input(self);
661
662 if (-1 == readResult) {
663 goto finally;
664 }
665 else if (0 == readResult || 1 == readResult) { }
666 else {
667 assert(0);
668 }
669
670 compressResult = compress_input(self, &output);
671
672 if (-1 == compressResult) {
673 goto finally;
674 }
675 else if (0 == compressResult) { }
676 else if (1 == compressResult) {
677 result = PyLong_FromSize_t(output.pos);
678 goto finally;
679 }
680 else {
681 assert(0);
682 }
683
684 /* If we produced output and we're not done with input, emit
685 * that output now, as we've hit the single-pass restriction of readinto1().
686 */
687 if (output.pos && !self->finishedInput) {
688 result = PyLong_FromSize_t(output.pos);
689 goto finally;
690 }
691
692 /* Otherwise we either have no output or we've exhausted the
693 * input. In that case, loop to gather more input or fall through
694 * to the EOF handling below. */
695 }
696
697 /* EOF */
698 oldPos = output.pos;
699
700 zresult = ZSTD_compressStream2(self->compressor->cctx, &output, &self->input,
701 ZSTD_e_end);
702
703 self->bytesCompressed += output.pos - oldPos;
704
705 if (ZSTD_isError(zresult)) {
706 PyErr_Format(ZstdError, "error ending compression stream: %s",
707 ZSTD_getErrorName(zresult));
708 goto finally;
709 }
710
711 assert(output.pos);
712
713 if (0 == zresult) {
714 self->finishedOutput = 1;
715 }
716
717 result = PyLong_FromSize_t(output.pos);
718
719 finally:
720 PyBuffer_Release(&dest);
721
722 return result;
294 723 }
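readinto() and readinto1() compress into a caller-supplied writable buffer instead of allocating a new bytes object. A usage sketch, assuming a ``reader`` obtained from ``stream_reader()``:

    buf = bytearray(131072)

    # readinto() fills as much of buf as possible; readinto1() returns
    # as soon as any compressed output has been written into buf.
    n = reader.readinto(buf)
    data = bytes(buf[:n])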
295 724
296 725 static PyObject* reader_iter(PyObject* self) {
@@ -315,7 +744,10 b' static PyMethodDef reader_methods[] = {'
315 744 { "readable", (PyCFunction)reader_readable, METH_NOARGS,
316 745 PyDoc_STR("Returns True") },
317 746 { "read", (PyCFunction)reader_read, METH_VARARGS | METH_KEYWORDS, PyDoc_STR("read compressed data") },
747 { "read1", (PyCFunction)reader_read1, METH_VARARGS | METH_KEYWORDS, NULL },
318 748 { "readall", (PyCFunction)reader_readall, METH_NOARGS, PyDoc_STR("Not implemented") },
749 { "readinto", (PyCFunction)reader_readinto, METH_VARARGS, NULL },
750 { "readinto1", (PyCFunction)reader_readinto1, METH_VARARGS, NULL },
319 751 { "readline", (PyCFunction)reader_readline, METH_VARARGS, PyDoc_STR("Not implemented") },
320 752 { "readlines", (PyCFunction)reader_readlines, METH_VARARGS, PyDoc_STR("Not implemented") },
321 753 { "seekable", (PyCFunction)reader_seekable, METH_NOARGS,
@@ -18,24 +18,23 b' static void ZstdCompressionWriter_deallo'
18 18 Py_XDECREF(self->compressor);
19 19 Py_XDECREF(self->writer);
20 20
21 PyMem_Free(self->output.dst);
22 self->output.dst = NULL;
23
21 24 PyObject_Del(self);
22 25 }
23 26
24 27 static PyObject* ZstdCompressionWriter_enter(ZstdCompressionWriter* self) {
25 size_t zresult;
28 if (self->closed) {
29 PyErr_SetString(PyExc_ValueError, "stream is closed");
30 return NULL;
31 }
26 32
27 33 if (self->entered) {
28 34 PyErr_SetString(ZstdError, "cannot __enter__ multiple times");
29 35 return NULL;
30 36 }
31 37
32 zresult = ZSTD_CCtx_setPledgedSrcSize(self->compressor->cctx, self->sourceSize);
33 if (ZSTD_isError(zresult)) {
34 PyErr_Format(ZstdError, "error setting source size: %s",
35 ZSTD_getErrorName(zresult));
36 return NULL;
37 }
38
39 38 self->entered = 1;
40 39
41 40 Py_INCREF(self);
@@ -46,10 +45,6 b' static PyObject* ZstdCompressionWriter_e'
46 45 PyObject* exc_type;
47 46 PyObject* exc_value;
48 47 PyObject* exc_tb;
49 size_t zresult;
50
51 ZSTD_outBuffer output;
52 PyObject* res;
53 48
54 49 if (!PyArg_ParseTuple(args, "OOO:__exit__", &exc_type, &exc_value, &exc_tb)) {
55 50 return NULL;
@@ -58,46 +53,11 b' static PyObject* ZstdCompressionWriter_e'
58 53 self->entered = 0;
59 54
60 55 if (exc_type == Py_None && exc_value == Py_None && exc_tb == Py_None) {
61 ZSTD_inBuffer inBuffer;
62
63 inBuffer.src = NULL;
64 inBuffer.size = 0;
65 inBuffer.pos = 0;
66
67 output.dst = PyMem_Malloc(self->outSize);
68 if (!output.dst) {
69 return PyErr_NoMemory();
70 }
71 output.size = self->outSize;
72 output.pos = 0;
56 PyObject* result = PyObject_CallMethod((PyObject*)self, "close", NULL);
73 57
74 while (1) {
75 zresult = ZSTD_compress_generic(self->compressor->cctx, &output, &inBuffer, ZSTD_e_end);
76 if (ZSTD_isError(zresult)) {
77 PyErr_Format(ZstdError, "error ending compression stream: %s",
78 ZSTD_getErrorName(zresult));
79 PyMem_Free(output.dst);
80 return NULL;
81 }
82
83 if (output.pos) {
84 #if PY_MAJOR_VERSION >= 3
85 res = PyObject_CallMethod(self->writer, "write", "y#",
86 #else
87 res = PyObject_CallMethod(self->writer, "write", "s#",
88 #endif
89 output.dst, output.pos);
90 Py_XDECREF(res);
91 }
92
93 if (!zresult) {
94 break;
95 }
96
97 output.pos = 0;
58 if (NULL == result) {
59 return NULL;
98 60 }
99
100 PyMem_Free(output.dst);
101 61 }
102 62
103 63 Py_RETURN_FALSE;
@@ -117,7 +77,6 b' static PyObject* ZstdCompressionWriter_w'
117 77 Py_buffer source;
118 78 size_t zresult;
119 79 ZSTD_inBuffer input;
120 ZSTD_outBuffer output;
121 80 PyObject* res;
122 81 Py_ssize_t totalWrite = 0;
123 82
@@ -130,143 +89,240 b' static PyObject* ZstdCompressionWriter_w'
130 89 return NULL;
131 90 }
132 91
133 if (!self->entered) {
134 PyErr_SetString(ZstdError, "compress must be called from an active context manager");
135 goto finally;
136 }
137
138 92 if (!PyBuffer_IsContiguous(&source, 'C') || source.ndim > 1) {
139 93 PyErr_SetString(PyExc_ValueError,
140 94 "data buffer should be contiguous and have at most one dimension");
141 95 goto finally;
142 96 }
143 97
144 output.dst = PyMem_Malloc(self->outSize);
145 if (!output.dst) {
146 PyErr_NoMemory();
147 goto finally;
98 if (self->closed) {
99 PyErr_SetString(PyExc_ValueError, "stream is closed");
100 goto finally;
148 101 }
149 output.size = self->outSize;
150 output.pos = 0;
102
103 self->output.pos = 0;
151 104
152 105 input.src = source.buf;
153 106 input.size = source.len;
154 107 input.pos = 0;
155 108
156 while ((ssize_t)input.pos < source.len) {
109 while (input.pos < (size_t)source.len) {
157 110 Py_BEGIN_ALLOW_THREADS
158 zresult = ZSTD_compress_generic(self->compressor->cctx, &output, &input, ZSTD_e_continue);
111 zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output, &input, ZSTD_e_continue);
159 112 Py_END_ALLOW_THREADS
160 113
161 114 if (ZSTD_isError(zresult)) {
162 PyMem_Free(output.dst);
163 115 PyErr_Format(ZstdError, "zstd compress error: %s", ZSTD_getErrorName(zresult));
164 116 goto finally;
165 117 }
166 118
167 119 /* Copy data from output buffer to writer. */
168 if (output.pos) {
120 if (self->output.pos) {
169 121 #if PY_MAJOR_VERSION >= 3
170 122 res = PyObject_CallMethod(self->writer, "write", "y#",
171 123 #else
172 124 res = PyObject_CallMethod(self->writer, "write", "s#",
173 125 #endif
174 output.dst, output.pos);
126 self->output.dst, self->output.pos);
175 127 Py_XDECREF(res);
176 totalWrite += output.pos;
177 self->bytesCompressed += output.pos;
128 totalWrite += self->output.pos;
129 self->bytesCompressed += self->output.pos;
178 130 }
179 output.pos = 0;
131 self->output.pos = 0;
180 132 }
181 133
182 PyMem_Free(output.dst);
183
184 result = PyLong_FromSsize_t(totalWrite);
134 if (self->writeReturnRead) {
135 result = PyLong_FromSize_t(input.pos);
136 }
137 else {
138 result = PyLong_FromSsize_t(totalWrite);
139 }
185 140
186 141 finally:
187 142 PyBuffer_Release(&source);
188 143 return result;
189 144 }
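The new ``write_return_read`` flag (wired up in stream_writer() later in this diff) selects which count write() returns. A sketch, assuming the vendored API:

    import io
    import zstandard

    cctx = zstandard.ZstdCompressor()
    dest = io.BytesIO()

    # With write_return_read=True, write() returns the number of input
    # bytes consumed (io.RawIOBase semantics); the default (False)
    # returns the number of compressed bytes written to dest.
    with cctx.stream_writer(dest, write_return_read=True) as compressor:
        consumed = compressor.write(b"data to compress")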
190 145
191 static PyObject* ZstdCompressionWriter_flush(ZstdCompressionWriter* self, PyObject* args) {
146 static PyObject* ZstdCompressionWriter_flush(ZstdCompressionWriter* self, PyObject* args, PyObject* kwargs) {
147 static char* kwlist[] = {
148 "flush_mode",
149 NULL
150 };
151
192 152 size_t zresult;
193 ZSTD_outBuffer output;
194 153 ZSTD_inBuffer input;
195 154 PyObject* res;
196 155 Py_ssize_t totalWrite = 0;
156 unsigned flush_mode = 0;
157 ZSTD_EndDirective flush;
197 158
198 if (!self->entered) {
199 PyErr_SetString(ZstdError, "flush must be called from an active context manager");
159 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|I:flush",
160 kwlist, &flush_mode)) {
200 161 return NULL;
201 162 }
202 163
164 switch (flush_mode) {
165 case 0:
166 flush = ZSTD_e_flush;
167 break;
168 case 1:
169 flush = ZSTD_e_end;
170 break;
171 default:
172 PyErr_Format(PyExc_ValueError, "unknown flush_mode: %d", flush_mode);
173 return NULL;
174 }
175
176 if (self->closed) {
177 PyErr_SetString(PyExc_ValueError, "stream is closed");
178 return NULL;
179 }
180
181 self->output.pos = 0;
182
203 183 input.src = NULL;
204 184 input.size = 0;
205 185 input.pos = 0;
206 186
207 output.dst = PyMem_Malloc(self->outSize);
208 if (!output.dst) {
209 return PyErr_NoMemory();
210 }
211 output.size = self->outSize;
212 output.pos = 0;
213
214 187 while (1) {
215 188 Py_BEGIN_ALLOW_THREADS
216 zresult = ZSTD_compress_generic(self->compressor->cctx, &output, &input, ZSTD_e_flush);
189 zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output, &input, flush);
217 190 Py_END_ALLOW_THREADS
218 191
219 192 if (ZSTD_isError(zresult)) {
220 PyMem_Free(output.dst);
221 193 PyErr_Format(ZstdError, "zstd compress error: %s", ZSTD_getErrorName(zresult));
222 194 return NULL;
223 195 }
224 196
225 197 /* Copy data from output buffer to writer. */
226 if (output.pos) {
198 if (self->output.pos) {
227 199 #if PY_MAJOR_VERSION >= 3
228 200 res = PyObject_CallMethod(self->writer, "write", "y#",
229 201 #else
230 202 res = PyObject_CallMethod(self->writer, "write", "s#",
231 203 #endif
232 output.dst, output.pos);
204 self->output.dst, self->output.pos);
233 205 Py_XDECREF(res);
234 totalWrite += output.pos;
235 self->bytesCompressed += output.pos;
206 totalWrite += self->output.pos;
207 self->bytesCompressed += self->output.pos;
236 208 }
237 209
238 output.pos = 0;
210 self->output.pos = 0;
239 211
240 212 if (!zresult) {
241 213 break;
242 214 }
243 215 }
244 216
245 PyMem_Free(output.dst);
217 return PyLong_FromSsize_t(totalWrite);
218 }
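flush() now takes a ``flush_mode`` argument whose two values correspond to the FLUSH_BLOCK and FLUSH_FRAME constants registered in constants.c later in this diff. A sketch, reusing the ``cctx`` name from the previous sketch with a fresh ``dest``:

    writer = cctx.stream_writer(dest)
    writer.write(b"chunk 1")
    writer.flush(zstandard.FLUSH_BLOCK)  # emit a complete block; frame stays open
    writer.write(b"chunk 2")
    writer.flush(zstandard.FLUSH_FRAME)  # end the current zstd frame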
219
220 static PyObject* ZstdCompressionWriter_close(ZstdCompressionWriter* self) {
221 PyObject* result;
222
223 if (self->closed) {
224 Py_RETURN_NONE;
225 }
226
227 result = PyObject_CallMethod((PyObject*)self, "flush", "I", 1);
228 self->closed = 1;
229
230 if (NULL == result) {
231 return NULL;
232 }
246 233
247 return PyLong_FromSsize_t(totalWrite);
234 /* Call close on underlying stream as well. */
235 if (PyObject_HasAttrString(self->writer, "close")) {
236 return PyObject_CallMethod(self->writer, "close", NULL);
237 }
238
239 Py_RETURN_NONE;
240 }
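close() flushes with mode 1 (FLUSH_FRAME) to end the frame, then closes the wrapped stream if it exposes a close() method. A sketch continuing from the one above:

    writer.write(b"more data")
    writer.close()          # ends the frame, then calls dest.close() if present
    assert writer.closed    # exposed via the new ``closed`` member below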
241
242 static PyObject* ZstdCompressionWriter_fileno(ZstdCompressionWriter* self) {
243 if (PyObject_HasAttrString(self->writer, "fileno")) {
244 return PyObject_CallMethod(self->writer, "fileno", NULL);
245 }
246 else {
247 PyErr_SetString(PyExc_OSError, "fileno not available on underlying writer");
248 return NULL;
249 }
248 250 }
249 251
250 252 static PyObject* ZstdCompressionWriter_tell(ZstdCompressionWriter* self) {
251 253 return PyLong_FromUnsignedLongLong(self->bytesCompressed);
252 254 }
253 255
256 static PyObject* ZstdCompressionWriter_writelines(PyObject* self, PyObject* args) {
257 PyErr_SetNone(PyExc_NotImplementedError);
258 return NULL;
259 }
260
261 static PyObject* ZstdCompressionWriter_false(PyObject* self, PyObject* args) {
262 Py_RETURN_FALSE;
263 }
264
265 static PyObject* ZstdCompressionWriter_true(PyObject* self, PyObject* args) {
266 Py_RETURN_TRUE;
267 }
268
269 static PyObject* ZstdCompressionWriter_unsupported(PyObject* self, PyObject* args, PyObject* kwargs) {
270 PyObject* iomod;
271 PyObject* exc;
272
273 iomod = PyImport_ImportModule("io");
274 if (NULL == iomod) {
275 return NULL;
276 }
277
278 exc = PyObject_GetAttrString(iomod, "UnsupportedOperation");
279 if (NULL == exc) {
280 Py_DECREF(iomod);
281 return NULL;
282 }
283
284 PyErr_SetNone(exc);
285 Py_DECREF(exc);
286 Py_DECREF(iomod);
287
288 return NULL;
289 }
290
254 291 static PyMethodDef ZstdCompressionWriter_methods[] = {
255 292 { "__enter__", (PyCFunction)ZstdCompressionWriter_enter, METH_NOARGS,
256 293 PyDoc_STR("Enter a compression context.") },
257 294 { "__exit__", (PyCFunction)ZstdCompressionWriter_exit, METH_VARARGS,
258 295 PyDoc_STR("Exit a compression context.") },
296 { "close", (PyCFunction)ZstdCompressionWriter_close, METH_NOARGS, NULL },
297 { "fileno", (PyCFunction)ZstdCompressionWriter_fileno, METH_NOARGS, NULL },
298 { "isatty", (PyCFunction)ZstdCompressionWriter_false, METH_NOARGS, NULL },
299 { "readable", (PyCFunction)ZstdCompressionWriter_false, METH_NOARGS, NULL },
300 { "readline", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
301 { "readlines", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
302 { "seek", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
303 { "seekable", ZstdCompressionWriter_false, METH_NOARGS, NULL },
304 { "truncate", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
305 { "writable", ZstdCompressionWriter_true, METH_NOARGS, NULL },
306 { "writelines", ZstdCompressionWriter_writelines, METH_VARARGS, NULL },
307 { "read", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
308 { "readall", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
309 { "readinto", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
259 310 { "memory_size", (PyCFunction)ZstdCompressionWriter_memory_size, METH_NOARGS,
260 311 PyDoc_STR("Obtain the memory size of the underlying compressor") },
261 312 { "write", (PyCFunction)ZstdCompressionWriter_write, METH_VARARGS | METH_KEYWORDS,
262 313 PyDoc_STR("Compress data") },
263 { "flush", (PyCFunction)ZstdCompressionWriter_flush, METH_NOARGS,
314 { "flush", (PyCFunction)ZstdCompressionWriter_flush, METH_VARARGS | METH_KEYWORDS,
264 315 PyDoc_STR("Flush data and finish a zstd frame") },
265 316 { "tell", (PyCFunction)ZstdCompressionWriter_tell, METH_NOARGS,
266 317 PyDoc_STR("Returns current number of bytes compressed") },
267 318 { NULL, NULL }
268 319 };
269 320
321 static PyMemberDef ZstdCompressionWriter_members[] = {
322 { "closed", T_BOOL, offsetof(ZstdCompressionWriter, closed), READONLY, NULL },
323 { NULL }
324 };
325
270 326 PyTypeObject ZstdCompressionWriterType = {
271 327 PyVarObject_HEAD_INIT(NULL, 0)
272 328 "zstd.ZstdCompressionWriter", /* tp_name */
@@ -296,7 +352,7 b' PyTypeObject ZstdCompressionWriterType ='
296 352 0, /* tp_iter */
297 353 0, /* tp_iternext */
298 354 ZstdCompressionWriter_methods, /* tp_methods */
299 0, /* tp_members */
355 ZstdCompressionWriter_members, /* tp_members */
300 356 0, /* tp_getset */
301 357 0, /* tp_base */
302 358 0, /* tp_dict */
@@ -59,9 +59,9 b' static PyObject* ZstdCompressionObj_comp'
59 59 input.size = source.len;
60 60 input.pos = 0;
61 61
62 while ((ssize_t)input.pos < source.len) {
62 while (input.pos < (size_t)source.len) {
63 63 Py_BEGIN_ALLOW_THREADS
64 zresult = ZSTD_compress_generic(self->compressor->cctx, &self->output,
64 zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output,
65 65 &input, ZSTD_e_continue);
66 66 Py_END_ALLOW_THREADS
67 67
@@ -154,7 +154,7 b' static PyObject* ZstdCompressionObj_flus'
154 154
155 155 while (1) {
156 156 Py_BEGIN_ALLOW_THREADS
157 zresult = ZSTD_compress_generic(self->compressor->cctx, &self->output,
157 zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output,
158 158 &input, zFlushMode);
159 159 Py_END_ALLOW_THREADS
160 160
@@ -204,27 +204,27 b' static int ZstdCompressor_init(ZstdCompr'
204 204 }
205 205 }
206 206 else {
207 if (set_parameter(self->params, ZSTD_p_compressionLevel, level)) {
207 if (set_parameter(self->params, ZSTD_c_compressionLevel, level)) {
208 208 return -1;
209 209 }
210 210
211 if (set_parameter(self->params, ZSTD_p_contentSizeFlag,
211 if (set_parameter(self->params, ZSTD_c_contentSizeFlag,
212 212 writeContentSize ? PyObject_IsTrue(writeContentSize) : 1)) {
213 213 return -1;
214 214 }
215 215
216 if (set_parameter(self->params, ZSTD_p_checksumFlag,
216 if (set_parameter(self->params, ZSTD_c_checksumFlag,
217 217 writeChecksum ? PyObject_IsTrue(writeChecksum) : 0)) {
218 218 return -1;
219 219 }
220 220
221 if (set_parameter(self->params, ZSTD_p_dictIDFlag,
221 if (set_parameter(self->params, ZSTD_c_dictIDFlag,
222 222 writeDictID ? PyObject_IsTrue(writeDictID) : 1)) {
223 223 return -1;
224 224 }
225 225
226 226 if (threads) {
227 if (set_parameter(self->params, ZSTD_p_nbWorkers, threads)) {
227 if (set_parameter(self->params, ZSTD_c_nbWorkers, threads)) {
228 228 return -1;
229 229 }
230 230 }
@@ -344,7 +344,7 b' static PyObject* ZstdCompressor_copy_str'
344 344 return NULL;
345 345 }
346 346
347 ZSTD_CCtx_reset(self->cctx);
347 ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only);
348 348
349 349 zresult = ZSTD_CCtx_setPledgedSrcSize(self->cctx, sourceSize);
350 350 if (ZSTD_isError(zresult)) {
@@ -391,7 +391,7 b' static PyObject* ZstdCompressor_copy_str'
391 391
392 392 while (input.pos < input.size) {
393 393 Py_BEGIN_ALLOW_THREADS
394 zresult = ZSTD_compress_generic(self->cctx, &output, &input, ZSTD_e_continue);
394 zresult = ZSTD_compressStream2(self->cctx, &output, &input, ZSTD_e_continue);
395 395 Py_END_ALLOW_THREADS
396 396
397 397 if (ZSTD_isError(zresult)) {
@@ -421,7 +421,7 b' static PyObject* ZstdCompressor_copy_str'
421 421
422 422 while (1) {
423 423 Py_BEGIN_ALLOW_THREADS
424 zresult = ZSTD_compress_generic(self->cctx, &output, &input, ZSTD_e_end);
424 zresult = ZSTD_compressStream2(self->cctx, &output, &input, ZSTD_e_end);
425 425 Py_END_ALLOW_THREADS
426 426
427 427 if (ZSTD_isError(zresult)) {
@@ -517,7 +517,7 b' static ZstdCompressionReader* ZstdCompre'
517 517 goto except;
518 518 }
519 519
520 ZSTD_CCtx_reset(self->cctx);
520 ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only);
521 521
522 522 zresult = ZSTD_CCtx_setPledgedSrcSize(self->cctx, sourceSize);
523 523 if (ZSTD_isError(zresult)) {
@@ -577,7 +577,7 b' static PyObject* ZstdCompressor_compress'
577 577 goto finally;
578 578 }
579 579
580 ZSTD_CCtx_reset(self->cctx);
580 ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only);
581 581
582 582 destSize = ZSTD_compressBound(source.len);
583 583 output = PyBytes_FromStringAndSize(NULL, destSize);
@@ -605,7 +605,7 b' static PyObject* ZstdCompressor_compress'
605 605 /* By avoiding ZSTD_compress(), we don't necessarily write out content
606 606 size. This means the argument to ZstdCompressor to control frame
607 607 parameters is honored. */
608 zresult = ZSTD_compress_generic(self->cctx, &outBuffer, &inBuffer, ZSTD_e_end);
608 zresult = ZSTD_compressStream2(self->cctx, &outBuffer, &inBuffer, ZSTD_e_end);
609 609 Py_END_ALLOW_THREADS
610 610
611 611 if (ZSTD_isError(zresult)) {
@@ -651,7 +651,7 b' static ZstdCompressionObj* ZstdCompresso'
651 651 return NULL;
652 652 }
653 653
654 ZSTD_CCtx_reset(self->cctx);
654 ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only);
655 655
656 656 zresult = ZSTD_CCtx_setPledgedSrcSize(self->cctx, inSize);
657 657 if (ZSTD_isError(zresult)) {
@@ -740,7 +740,7 b' static ZstdCompressorIterator* ZstdCompr'
740 740 goto except;
741 741 }
742 742
743 ZSTD_CCtx_reset(self->cctx);
743 ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only);
744 744
745 745 zresult = ZSTD_CCtx_setPledgedSrcSize(self->cctx, sourceSize);
746 746 if (ZSTD_isError(zresult)) {
@@ -794,16 +794,19 b' static ZstdCompressionWriter* ZstdCompre'
794 794 "writer",
795 795 "size",
796 796 "write_size",
797 "write_return_read",
797 798 NULL
798 799 };
799 800
800 801 PyObject* writer;
801 802 ZstdCompressionWriter* result;
803 size_t zresult;
802 804 unsigned long long sourceSize = ZSTD_CONTENTSIZE_UNKNOWN;
803 805 size_t outSize = ZSTD_CStreamOutSize();
806 PyObject* writeReturnRead = NULL;
804 807
805 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|Kk:stream_writer", kwlist,
806 &writer, &sourceSize, &outSize)) {
808 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|KkO:stream_writer", kwlist,
809 &writer, &sourceSize, &outSize, &writeReturnRead)) {
807 810 return NULL;
808 811 }
809 812
@@ -812,22 +815,38 b' static ZstdCompressionWriter* ZstdCompre'
812 815 return NULL;
813 816 }
814 817
815 ZSTD_CCtx_reset(self->cctx);
818 ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only);
819
820 zresult = ZSTD_CCtx_setPledgedSrcSize(self->cctx, sourceSize);
821 if (ZSTD_isError(zresult)) {
822 PyErr_Format(ZstdError, "error setting source size: %s",
823 ZSTD_getErrorName(zresult));
824 return NULL;
825 }
816 826
817 827 result = (ZstdCompressionWriter*)PyObject_CallObject((PyObject*)&ZstdCompressionWriterType, NULL);
818 828 if (!result) {
819 829 return NULL;
820 830 }
821 831
832 result->output.dst = PyMem_Malloc(outSize);
833 if (!result->output.dst) {
834 Py_DECREF(result);
835 return (ZstdCompressionWriter*)PyErr_NoMemory();
836 }
837
838 result->output.pos = 0;
839 result->output.size = outSize;
840
822 841 result->compressor = self;
823 842 Py_INCREF(result->compressor);
824 843
825 844 result->writer = writer;
826 845 Py_INCREF(result->writer);
827 846
828 result->sourceSize = sourceSize;
829 847 result->outSize = outSize;
830 848 result->bytesCompressed = 0;
849 result->writeReturnRead = writeReturnRead ? PyObject_IsTrue(writeReturnRead) : 0;
831 850
832 851 return result;
833 852 }
@@ -853,7 +872,7 b' static ZstdCompressionChunker* ZstdCompr'
853 872 return NULL;
854 873 }
855 874
856 ZSTD_CCtx_reset(self->cctx);
875 ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only);
857 876
858 877 zresult = ZSTD_CCtx_setPledgedSrcSize(self->cctx, sourceSize);
859 878 if (ZSTD_isError(zresult)) {
@@ -1115,7 +1134,7 b' static void compress_worker(WorkerState*'
1115 1134 break;
1116 1135 }
1117 1136
1118 zresult = ZSTD_compress_generic(state->cctx, &opOutBuffer, &opInBuffer, ZSTD_e_end);
1137 zresult = ZSTD_compressStream2(state->cctx, &opOutBuffer, &opInBuffer, ZSTD_e_end);
1119 1138 if (ZSTD_isError(zresult)) {
1120 1139 state->error = WorkerError_zstd;
1121 1140 state->zresult = zresult;
@@ -57,7 +57,7 b' feedcompressor:'
57 57 /* If we have data left in the input, consume it. */
58 58 if (self->input.pos < self->input.size) {
59 59 Py_BEGIN_ALLOW_THREADS
60 zresult = ZSTD_compress_generic(self->compressor->cctx, &self->output,
60 zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output,
61 61 &self->input, ZSTD_e_continue);
62 62 Py_END_ALLOW_THREADS
63 63
@@ -127,7 +127,7 b' feedcompressor:'
127 127 self->input.size = 0;
128 128 self->input.pos = 0;
129 129
130 zresult = ZSTD_compress_generic(self->compressor->cctx, &self->output,
130 zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output,
131 131 &self->input, ZSTD_e_end);
132 132 if (ZSTD_isError(zresult)) {
133 133 PyErr_Format(ZstdError, "error ending compression stream: %s",
@@ -152,7 +152,7 b' feedcompressor:'
152 152 self->input.pos = 0;
153 153
154 154 Py_BEGIN_ALLOW_THREADS
155 zresult = ZSTD_compress_generic(self->compressor->cctx, &self->output,
155 zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output,
156 156 &self->input, ZSTD_e_continue);
157 157 Py_END_ALLOW_THREADS
158 158
@@ -32,6 +32,9 b' void constants_module_init(PyObject* mod'
32 32 ZstdError = PyErr_NewException("zstd.ZstdError", NULL, NULL);
33 33 PyModule_AddObject(mod, "ZstdError", ZstdError);
34 34
35 PyModule_AddIntConstant(mod, "FLUSH_BLOCK", 0);
36 PyModule_AddIntConstant(mod, "FLUSH_FRAME", 1);
37
35 38 PyModule_AddIntConstant(mod, "COMPRESSOBJ_FLUSH_FINISH", compressorobj_flush_finish);
36 39 PyModule_AddIntConstant(mod, "COMPRESSOBJ_FLUSH_BLOCK", compressorobj_flush_block);
37 40
@@ -77,8 +80,11 b' void constants_module_init(PyObject* mod'
77 80 PyModule_AddIntConstant(mod, "HASHLOG3_MAX", ZSTD_HASHLOG3_MAX);
78 81 PyModule_AddIntConstant(mod, "SEARCHLOG_MIN", ZSTD_SEARCHLOG_MIN);
79 82 PyModule_AddIntConstant(mod, "SEARCHLOG_MAX", ZSTD_SEARCHLOG_MAX);
80 PyModule_AddIntConstant(mod, "SEARCHLENGTH_MIN", ZSTD_SEARCHLENGTH_MIN);
81 PyModule_AddIntConstant(mod, "SEARCHLENGTH_MAX", ZSTD_SEARCHLENGTH_MAX);
83 PyModule_AddIntConstant(mod, "MINMATCH_MIN", ZSTD_MINMATCH_MIN);
84 PyModule_AddIntConstant(mod, "MINMATCH_MAX", ZSTD_MINMATCH_MAX);
85 /* TODO SEARCHLENGTH_* is deprecated. */
86 PyModule_AddIntConstant(mod, "SEARCHLENGTH_MIN", ZSTD_MINMATCH_MIN);
87 PyModule_AddIntConstant(mod, "SEARCHLENGTH_MAX", ZSTD_MINMATCH_MAX);
82 88 PyModule_AddIntConstant(mod, "TARGETLENGTH_MIN", ZSTD_TARGETLENGTH_MIN);
83 89 PyModule_AddIntConstant(mod, "TARGETLENGTH_MAX", ZSTD_TARGETLENGTH_MAX);
84 90 PyModule_AddIntConstant(mod, "LDM_MINMATCH_MIN", ZSTD_LDM_MINMATCH_MIN);
@@ -93,6 +99,7 b' void constants_module_init(PyObject* mod'
93 99 PyModule_AddIntConstant(mod, "STRATEGY_BTLAZY2", ZSTD_btlazy2);
94 100 PyModule_AddIntConstant(mod, "STRATEGY_BTOPT", ZSTD_btopt);
95 101 PyModule_AddIntConstant(mod, "STRATEGY_BTULTRA", ZSTD_btultra);
102 PyModule_AddIntConstant(mod, "STRATEGY_BTULTRA2", ZSTD_btultra2);
96 103
97 104 PyModule_AddIntConstant(mod, "DICT_TYPE_AUTO", ZSTD_dct_auto);
98 105 PyModule_AddIntConstant(mod, "DICT_TYPE_RAWCONTENT", ZSTD_dct_rawContent);
This diff has been collapsed as it changes many lines (511 lines changed).
@@ -102,6 +102,114 b' static PyObject* reader_isatty(PyObject*'
102 102 Py_RETURN_FALSE;
103 103 }
104 104
105 /**
106 * Read available input.
107 *
108 * Returns 0 if no data was added to input.
109 * Returns 1 if new input data is available.
110 * Returns -1 on error and sets a Python exception as a side-effect.
111 */
112 int read_decompressor_input(ZstdDecompressionReader* self) {
113 if (self->finishedInput) {
114 return 0;
115 }
116
117 if (self->input.pos != self->input.size) {
118 return 0;
119 }
120
121 if (self->reader) {
122 Py_buffer buffer;
123
124 assert(self->readResult == NULL);
125 self->readResult = PyObject_CallMethod(self->reader, "read",
126 "k", self->readSize);
127 if (NULL == self->readResult) {
128 return -1;
129 }
130
131 memset(&buffer, 0, sizeof(buffer));
132
133 if (0 != PyObject_GetBuffer(self->readResult, &buffer, PyBUF_CONTIG_RO)) {
134 return -1;
135 }
136
137 /* EOF */
138 if (0 == buffer.len) {
139 self->finishedInput = 1;
140 Py_CLEAR(self->readResult);
141 }
142 else {
143 self->input.src = buffer.buf;
144 self->input.size = buffer.len;
145 self->input.pos = 0;
146 }
147
148 PyBuffer_Release(&buffer);
149 }
150 else {
151 assert(self->buffer.buf);
152 /*
153 * We should only get here once, since we always exhaust the input
154 * buffer before reading again.
155 */
156 assert(self->input.src == NULL);
157
158 self->input.src = self->buffer.buf;
159 self->input.size = self->buffer.len;
160 self->input.pos = 0;
161 }
162
163 return 1;
164 }
165
166 /**
167 * Decompresses available input into an output buffer.
168 *
169 * Returns 0 if we need more input.
170 * Returns 1 if output buffer should be emitted.
171 * Returns -1 on error and sets a Python exception.
172 */
173 int decompress_input(ZstdDecompressionReader* self, ZSTD_outBuffer* output) {
174 size_t zresult;
175
176 if (self->input.pos >= self->input.size) {
177 return 0;
178 }
179
180 Py_BEGIN_ALLOW_THREADS
181 zresult = ZSTD_decompressStream(self->decompressor->dctx, output, &self->input);
182 Py_END_ALLOW_THREADS
183
184 /* Input exhausted. Clear our state tracking. */
185 if (self->input.pos == self->input.size) {
186 memset(&self->input, 0, sizeof(self->input));
187 Py_CLEAR(self->readResult);
188
189 if (self->buffer.buf) {
190 self->finishedInput = 1;
191 }
192 }
193
194 if (ZSTD_isError(zresult)) {
195 PyErr_Format(ZstdError, "zstd decompress error: %s", ZSTD_getErrorName(zresult));
196 return -1;
197 }
198
199 /* We fulfilled the full read request. Signal to emit. */
200 if (output->pos && output->pos == output->size) {
201 return 1;
202 }
203 /* We're at the end of a frame and we aren't allowed to return data
204 spanning frames. */
205 else if (output->pos && zresult == 0 && !self->readAcrossFrames) {
206 return 1;
207 }
208
209 /* There is more room in the output. Signal to collect more data. */
210 return 0;
211 }
212
105 213 static PyObject* reader_read(ZstdDecompressionReader* self, PyObject* args, PyObject* kwargs) {
106 214 static char* kwlist[] = {
107 215 "size",
@@ -113,26 +221,30 b' static PyObject* reader_read(ZstdDecompr'
113 221 char* resultBuffer;
114 222 Py_ssize_t resultSize;
115 223 ZSTD_outBuffer output;
116 size_t zresult;
224 int decompressResult, readResult;
117 225
118 226 if (self->closed) {
119 227 PyErr_SetString(PyExc_ValueError, "stream is closed");
120 228 return NULL;
121 229 }
122 230
123 if (self->finishedOutput) {
124 return PyBytes_FromStringAndSize("", 0);
125 }
126
127 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "n", kwlist, &size)) {
231 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|n", kwlist, &size)) {
128 232 return NULL;
129 233 }
130 234
131 if (size < 1) {
132 PyErr_SetString(PyExc_ValueError, "cannot read negative or size 0 amounts");
235 if (size < -1) {
236 PyErr_SetString(PyExc_ValueError, "cannot read negative amounts less than -1");
133 237 return NULL;
134 238 }
135 239
240 if (size == -1) {
241 return PyObject_CallMethod((PyObject*)self, "readall", NULL);
242 }
243
244 if (self->finishedOutput || size == 0) {
245 return PyBytes_FromStringAndSize("", 0);
246 }
247
136 248 result = PyBytes_FromStringAndSize(NULL, size);
137 249 if (NULL == result) {
138 250 return NULL;
@@ -146,85 +258,38 b' static PyObject* reader_read(ZstdDecompr'
146 258
147 259 readinput:
148 260
149 /* Consume input data left over from last time. */
150 if (self->input.pos < self->input.size) {
151 Py_BEGIN_ALLOW_THREADS
152 zresult = ZSTD_decompress_generic(self->decompressor->dctx,
153 &output, &self->input);
154 Py_END_ALLOW_THREADS
261 decompressResult = decompress_input(self, &output);
155 262
156 /* Input exhausted. Clear our state tracking. */
157 if (self->input.pos == self->input.size) {
158 memset(&self->input, 0, sizeof(self->input));
159 Py_CLEAR(self->readResult);
263 if (-1 == decompressResult) {
264 Py_XDECREF(result);
265 return NULL;
266 }
267 else if (0 == decompressResult) { }
268 else if (1 == decompressResult) {
269 self->bytesDecompressed += output.pos;
160 270
161 if (self->buffer.buf) {
162 self->finishedInput = 1;
271 if (output.pos != output.size) {
272 if (safe_pybytes_resize(&result, output.pos)) {
273 Py_XDECREF(result);
274 return NULL;
163 275 }
164 276 }
165
166 if (ZSTD_isError(zresult)) {
167 PyErr_Format(ZstdError, "zstd decompress error: %s", ZSTD_getErrorName(zresult));
168 return NULL;
169 }
170 else if (0 == zresult) {
171 self->finishedOutput = 1;
172 }
173
174 /* We fulfilled the full read request. Emit it. */
175 if (output.pos && output.pos == output.size) {
176 self->bytesDecompressed += output.size;
177 return result;
178 }
179
180 /*
181 * There is more room in the output. Fall through to try to collect
182 * more data so we can try to fill the output.
183 */
277 return result;
278 }
279 else {
280 assert(0);
184 281 }
185 282
186 if (!self->finishedInput) {
187 if (self->reader) {
188 Py_buffer buffer;
189
190 assert(self->readResult == NULL);
191 self->readResult = PyObject_CallMethod(self->reader, "read",
192 "k", self->readSize);
193 if (NULL == self->readResult) {
194 return NULL;
195 }
196
197 memset(&buffer, 0, sizeof(buffer));
198
199 if (0 != PyObject_GetBuffer(self->readResult, &buffer, PyBUF_CONTIG_RO)) {
200 return NULL;
201 }
283 readResult = read_decompressor_input(self);
202 284
203 /* EOF */
204 if (0 == buffer.len) {
205 self->finishedInput = 1;
206 Py_CLEAR(self->readResult);
207 }
208 else {
209 self->input.src = buffer.buf;
210 self->input.size = buffer.len;
211 self->input.pos = 0;
212 }
213
214 PyBuffer_Release(&buffer);
215 }
216 else {
217 assert(self->buffer.buf);
218 /*
219 * We should only get here once since above block will exhaust
220 * source buffer until finishedInput is set.
221 */
222 assert(self->input.src == NULL);
223
224 self->input.src = self->buffer.buf;
225 self->input.size = self->buffer.len;
226 self->input.pos = 0;
227 }
285 if (-1 == readResult) {
286 Py_XDECREF(result);
287 return NULL;
288 }
289 else if (0 == readResult) {}
290 else if (1 == readResult) {}
291 else {
292 assert(0);
228 293 }
229 294
230 295 if (self->input.size) {
@@ -242,18 +307,288 b' readinput:'
242 307 return result;
243 308 }
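The decompression reader's read() now accepts ``size=-1`` (also the default now that the size argument is optional), in which case it delegates to readall(). A sketch, where ``frame`` is a hypothetical buffer holding zstd-compressed data:

    import io
    import zstandard

    dctx = zstandard.ZstdDecompressor()

    with dctx.stream_reader(io.BytesIO(frame)) as reader:
        everything = reader.read(-1)   # equivalent to reader.readall()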
244 309
310 static PyObject* reader_read1(ZstdDecompressionReader* self, PyObject* args, PyObject* kwargs) {
311 static char* kwlist[] = {
312 "size",
313 NULL
314 };
315
316 Py_ssize_t size = -1;
317 PyObject* result = NULL;
318 char* resultBuffer;
319 Py_ssize_t resultSize;
320 ZSTD_outBuffer output;
321
322 if (self->closed) {
323 PyErr_SetString(PyExc_ValueError, "stream is closed");
324 return NULL;
325 }
326
327 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|n", kwlist, &size)) {
328 return NULL;
329 }
330
331 if (size < -1) {
332 PyErr_SetString(PyExc_ValueError, "cannot read negative amounts less than -1");
333 return NULL;
334 }
335
336 if (self->finishedOutput || size == 0) {
337 return PyBytes_FromStringAndSize("", 0);
338 }
339
340 if (size == -1) {
341 size = ZSTD_DStreamOutSize();
342 }
343
344 result = PyBytes_FromStringAndSize(NULL, size);
345 if (NULL == result) {
346 return NULL;
347 }
348
349 PyBytes_AsStringAndSize(result, &resultBuffer, &resultSize);
350
351 output.dst = resultBuffer;
352 output.size = resultSize;
353 output.pos = 0;
354
355 /* read1() is supposed to use at most 1 read() from the underlying stream.
356 * However, we can't satisfy this requirement with decompression, since a
357 * single read() may not yield any output. Our strategy is to read + decompress
358 * until we get any output, at which point we return. This satisfies the
359 * intent of the read1() API to limit read operations.
360 */
361 while (!self->finishedInput) {
362 int readResult, decompressResult;
363
364 readResult = read_decompressor_input(self);
365 if (-1 == readResult) {
366 Py_XDECREF(result);
367 return NULL;
368 }
369 else if (0 == readResult || 1 == readResult) { }
370 else {
371 assert(0);
372 }
373
374 decompressResult = decompress_input(self, &output);
375
376 if (-1 == decompressResult) {
377 Py_XDECREF(result);
378 return NULL;
379 }
380 else if (0 == decompressResult || 1 == decompressResult) { }
381 else {
382 assert(0);
383 }
384
385 if (output.pos) {
386 break;
387 }
388 }
389
390 self->bytesDecompressed += output.pos;
391 if (safe_pybytes_resize(&result, output.pos)) {
392 Py_XDECREF(result);
393 return NULL;
394 }
395
396 return result;
397 }
398
399 static PyObject* reader_readinto(ZstdDecompressionReader* self, PyObject* args) {
400 Py_buffer dest;
401 ZSTD_outBuffer output;
402 int decompressResult, readResult;
403 PyObject* result = NULL;
404
405 if (self->closed) {
406 PyErr_SetString(PyExc_ValueError, "stream is closed");
407 return NULL;
408 }
409
410 if (self->finishedOutput) {
411 return PyLong_FromLong(0);
412 }
413
414 if (!PyArg_ParseTuple(args, "w*:readinto", &dest)) {
415 return NULL;
416 }
417
418 if (!PyBuffer_IsContiguous(&dest, 'C') || dest.ndim > 1) {
419 PyErr_SetString(PyExc_ValueError,
420 "destination buffer should be contiguous and have at most one dimension");
421 goto finally;
422 }
423
424 output.dst = dest.buf;
425 output.size = dest.len;
426 output.pos = 0;
427
428 readinput:
429
430 decompressResult = decompress_input(self, &output);
431
432 if (-1 == decompressResult) {
433 goto finally;
434 }
435 else if (0 == decompressResult) { }
436 else if (1 == decompressResult) {
437 self->bytesDecompressed += output.pos;
438 result = PyLong_FromSize_t(output.pos);
439 goto finally;
440 }
441 else {
442 assert(0);
443 }
444
445 readResult = read_decompressor_input(self);
446
447 if (-1 == readResult) {
448 goto finally;
449 }
450 else if (0 == readResult) {}
451 else if (1 == readResult) {}
452 else {
453 assert(0);
454 }
455
456 if (self->input.size) {
457 goto readinput;
458 }
459
460 /* EOF */
461 self->bytesDecompressed += output.pos;
462 result = PyLong_FromSize_t(output.pos);
463
464 finally:
465 PyBuffer_Release(&dest);
466
467 return result;
468 }
469
470 static PyObject* reader_readinto1(ZstdDecompressionReader* self, PyObject* args) {
471 Py_buffer dest;
472 ZSTD_outBuffer output;
473 PyObject* result = NULL;
474
475 if (self->closed) {
476 PyErr_SetString(PyExc_ValueError, "stream is closed");
477 return NULL;
478 }
479
480 if (self->finishedOutput) {
481 return PyLong_FromLong(0);
482 }
483
484 if (!PyArg_ParseTuple(args, "w*:readinto1", &dest)) {
485 return NULL;
486 }
487
488 if (!PyBuffer_IsContiguous(&dest, 'C') || dest.ndim > 1) {
489 PyErr_SetString(PyExc_ValueError,
490 "destination buffer should be contiguous and have at most one dimension");
491 goto finally;
492 }
493
494 output.dst = dest.buf;
495 output.size = dest.len;
496 output.pos = 0;
497
498 while (!self->finishedInput && !self->finishedOutput) {
499 int decompressResult, readResult;
500
501 readResult = read_decompressor_input(self);
502
503 if (-1 == readResult) {
504 goto finally;
505 }
506 else if (0 == readResult || 1 == readResult) {}
507 else {
508 assert(0);
509 }
510
511 decompressResult = decompress_input(self, &output);
512
513 if (-1 == decompressResult) {
514 goto finally;
515 }
516 else if (0 == decompressResult || 1 == decompressResult) {}
517 else {
518 assert(0);
519 }
520
521 if (output.pos) {
522 break;
523 }
524 }
525
526 self->bytesDecompressed += output.pos;
527 result = PyLong_FromSize_t(output.pos);
528
529 finally:
530 PyBuffer_Release(&dest);
531
532 return result;
533 }
534
245 535 static PyObject* reader_readall(PyObject* self) {
246 PyErr_SetNone(PyExc_NotImplementedError);
247 return NULL;
536 PyObject* chunks = NULL;
537 PyObject* empty = NULL;
538 PyObject* result = NULL;
539
540 /* Our strategy is to collect chunks into a list then join all the
541 * chunks at the end. We could potentially use e.g. an io.BytesIO. But
542 * this feels simple enough to implement and avoids potentially expensive
543 * reallocations of large buffers.
544 */
545 chunks = PyList_New(0);
546 if (NULL == chunks) {
547 return NULL;
548 }
549
550 while (1) {
551 PyObject* chunk = PyObject_CallMethod(self, "read", "i", 1048576);
552 if (NULL == chunk) {
553 Py_DECREF(chunks);
554 return NULL;
555 }
556
557 if (!PyBytes_Size(chunk)) {
558 Py_DECREF(chunk);
559 break;
560 }
561
562 if (PyList_Append(chunks, chunk)) {
563 Py_DECREF(chunk);
564 Py_DECREF(chunks);
565 return NULL;
566 }
567
568 Py_DECREF(chunk);
569 }
570
571 empty = PyBytes_FromStringAndSize("", 0);
572 if (NULL == empty) {
573 Py_DECREF(chunks);
574 return NULL;
575 }
576
577 result = PyObject_CallMethod(empty, "join", "O", chunks);
578
579 Py_DECREF(empty);
580 Py_DECREF(chunks);
581
582 return result;
248 583 }
249 584
250 585 static PyObject* reader_readline(PyObject* self) {
251 PyErr_SetNone(PyExc_NotImplementedError);
586 set_unsupported_operation();
252 587 return NULL;
253 588 }
254 589
255 590 static PyObject* reader_readlines(PyObject* self) {
256 PyErr_SetNone(PyExc_NotImplementedError);
591 set_unsupported_operation();
257 592 return NULL;
258 593 }
259 594
@@ -345,12 +680,12 b' static PyObject* reader_writelines(PyObj'
345 680 }
346 681
347 682 static PyObject* reader_iter(PyObject* self) {
348 PyErr_SetNone(PyExc_NotImplementedError);
683 set_unsupported_operation();
349 684 return NULL;
350 685 }
351 686
352 687 static PyObject* reader_iternext(PyObject* self) {
353 PyErr_SetNone(PyExc_NotImplementedError);
688 set_unsupported_operation();
354 689 return NULL;
355 690 }
356 691
@@ -367,6 +702,10 b' static PyMethodDef reader_methods[] = {'
367 702 PyDoc_STR("Returns True") },
368 703 { "read", (PyCFunction)reader_read, METH_VARARGS | METH_KEYWORDS,
369 704 PyDoc_STR("read decompressed data") },
705 { "read1", (PyCFunction)reader_read1, METH_VARARGS | METH_KEYWORDS,
706 PyDoc_STR("read compressed data") },
707 { "readinto", (PyCFunction)reader_readinto, METH_VARARGS, NULL },
708 { "readinto1", (PyCFunction)reader_readinto1, METH_VARARGS, NULL },
370 709 { "readall", (PyCFunction)reader_readall, METH_NOARGS, PyDoc_STR("Not implemented") },
371 710 { "readline", (PyCFunction)reader_readline, METH_NOARGS, PyDoc_STR("Not implemented") },
372 711 { "readlines", (PyCFunction)reader_readlines, METH_NOARGS, PyDoc_STR("Not implemented") },
@@ -22,12 +22,13 b' static void ZstdDecompressionWriter_deal'
22 22 }
23 23
24 24 static PyObject* ZstdDecompressionWriter_enter(ZstdDecompressionWriter* self) {
25 if (self->entered) {
26 PyErr_SetString(ZstdError, "cannot __enter__ multiple times");
25 if (self->closed) {
26 PyErr_SetString(PyExc_ValueError, "stream is closed");
27 27 return NULL;
28 28 }
29 29
30 if (ensure_dctx(self->decompressor, 1)) {
30 if (self->entered) {
31 PyErr_SetString(ZstdError, "cannot __enter__ multiple times");
31 32 return NULL;
32 33 }
33 34
@@ -40,6 +41,10 b' static PyObject* ZstdDecompressionWriter'
40 41 static PyObject* ZstdDecompressionWriter_exit(ZstdDecompressionWriter* self, PyObject* args) {
41 42 self->entered = 0;
42 43
44 if (NULL == PyObject_CallMethod((PyObject*)self, "close", NULL)) {
45 return NULL;
46 }
47
43 48 Py_RETURN_FALSE;
44 49 }
45 50
@@ -76,9 +81,9 b' static PyObject* ZstdDecompressionWriter'
76 81 goto finally;
77 82 }
78 83
79 if (!self->entered) {
80 PyErr_SetString(ZstdError, "write must be called from an active context manager");
81 goto finally;
84 if (self->closed) {
85 PyErr_SetString(PyExc_ValueError, "stream is closed");
86 goto finally;
82 87 }
83 88
84 89 output.dst = PyMem_Malloc(self->outSize);
@@ -93,9 +98,9 b' static PyObject* ZstdDecompressionWriter'
93 98 input.size = source.len;
94 99 input.pos = 0;
95 100
96 while ((ssize_t)input.pos < source.len) {
101 while (input.pos < (size_t)source.len) {
97 102 Py_BEGIN_ALLOW_THREADS
98 zresult = ZSTD_decompress_generic(self->decompressor->dctx, &output, &input);
103 zresult = ZSTD_decompressStream(self->decompressor->dctx, &output, &input);
99 104 Py_END_ALLOW_THREADS
100 105
101 106 if (ZSTD_isError(zresult)) {
@@ -120,13 +125,94 b' static PyObject* ZstdDecompressionWriter'
120 125
121 126 PyMem_Free(output.dst);
122 127
123 result = PyLong_FromSsize_t(totalWrite);
128 if (self->writeReturnRead) {
129 result = PyLong_FromSize_t(input.pos);
130 }
131 else {
132 result = PyLong_FromSsize_t(totalWrite);
133 }
124 134
125 135 finally:
126 136 PyBuffer_Release(&source);
127 137 return result;
128 138 }
129 139
140 static PyObject* ZstdDecompressionWriter_close(ZstdDecompressionWriter* self) {
141 PyObject* result;
142
143 if (self->closed) {
144 Py_RETURN_NONE;
145 }
146
147 result = PyObject_CallMethod((PyObject*)self, "flush", NULL);
148 self->closed = 1;
149
150 if (NULL == result) {
151 return NULL;
152 }
153
154 /* Call close on underlying stream as well. */
155 if (PyObject_HasAttrString(self->writer, "close")) {
156 return PyObject_CallMethod(self->writer, "close", NULL);
157 }
158
159 Py_RETURN_NONE;
160 }
161
162 static PyObject* ZstdDecompressionWriter_fileno(ZstdDecompressionWriter* self) {
163 if (PyObject_HasAttrString(self->writer, "fileno")) {
164 return PyObject_CallMethod(self->writer, "fileno", NULL);
165 }
166 else {
167 PyErr_SetString(PyExc_OSError, "fileno not available on underlying writer");
168 return NULL;
169 }
170 }
171
172 static PyObject* ZstdDecompressionWriter_flush(ZstdDecompressionWriter* self) {
173 if (self->closed) {
174 PyErr_SetString(PyExc_ValueError, "stream is closed");
175 return NULL;
176 }
177
178 if (PyObject_HasAttrString(self->writer, "flush")) {
179 return PyObject_CallMethod(self->writer, "flush", NULL);
180 }
181 else {
182 Py_RETURN_NONE;
183 }
184 }
185
186 static PyObject* ZstdDecompressionWriter_false(PyObject* self, PyObject* args) {
187 Py_RETURN_FALSE;
188 }
189
190 static PyObject* ZstdDecompressionWriter_true(PyObject* self, PyObject* args) {
191 Py_RETURN_TRUE;
192 }
193
194 static PyObject* ZstdDecompressionWriter_unsupported(PyObject* self, PyObject* args, PyObject* kwargs) {
195 PyObject* iomod;
196 PyObject* exc;
197
198 iomod = PyImport_ImportModule("io");
199 if (NULL == iomod) {
200 return NULL;
201 }
202
203 exc = PyObject_GetAttrString(iomod, "UnsupportedOperation");
204 if (NULL == exc) {
205 Py_DECREF(iomod);
206 return NULL;
207 }
208
209 PyErr_SetNone(exc);
210 Py_DECREF(exc);
211 Py_DECREF(iomod);
212
213 return NULL;
214 }
215
130 216 static PyMethodDef ZstdDecompressionWriter_methods[] = {
131 217 { "__enter__", (PyCFunction)ZstdDecompressionWriter_enter, METH_NOARGS,
132 218 PyDoc_STR("Enter a decompression context.") },
@@ -134,11 +220,32 b' static PyMethodDef ZstdDecompressionWrit'
134 220 PyDoc_STR("Exit a decompression context.") },
135 221 { "memory_size", (PyCFunction)ZstdDecompressionWriter_memory_size, METH_NOARGS,
136 222 PyDoc_STR("Obtain the memory size in bytes of the underlying decompressor.") },
223 { "close", (PyCFunction)ZstdDecompressionWriter_close, METH_NOARGS, NULL },
224 { "fileno", (PyCFunction)ZstdDecompressionWriter_fileno, METH_NOARGS, NULL },
225 { "flush", (PyCFunction)ZstdDecompressionWriter_flush, METH_NOARGS, NULL },
226 { "isatty", ZstdDecompressionWriter_false, METH_NOARGS, NULL },
227 { "readable", ZstdDecompressionWriter_false, METH_NOARGS, NULL },
228 { "readline", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
229 { "readlines", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
230 { "seek", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
231 { "seekable", ZstdDecompressionWriter_false, METH_NOARGS, NULL },
232 { "tell", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
233 { "truncate", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
234 { "writable", ZstdDecompressionWriter_true, METH_NOARGS, NULL },
235 { "writelines" , (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
236 { "read", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
237 { "readall", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
238 { "readinto", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
137 239 { "write", (PyCFunction)ZstdDecompressionWriter_write, METH_VARARGS | METH_KEYWORDS,
138 240 PyDoc_STR("Decompress data") },
139 241 { NULL, NULL }
140 242 };
141 243
244 static PyMemberDef ZstdDecompressionWriter_members[] = {
245 { "closed", T_BOOL, offsetof(ZstdDecompressionWriter, closed), READONLY, NULL },
246 { NULL }
247 };
248
142 249 PyTypeObject ZstdDecompressionWriterType = {
143 250 PyVarObject_HEAD_INIT(NULL, 0)
144 251 "zstd.ZstdDecompressionWriter", /* tp_name */
@@ -168,7 +275,7 b' PyTypeObject ZstdDecompressionWriterType'
168 275 0, /* tp_iter */
169 276 0, /* tp_iternext */
170 277 ZstdDecompressionWriter_methods,/* tp_methods */
171 0, /* tp_members */
278 ZstdDecompressionWriter_members,/* tp_members */
172 279 0, /* tp_getset */
173 280 0, /* tp_base */
174 281 0, /* tp_dict */
@@ -75,7 +75,7 b' static PyObject* DecompressionObj_decomp'
75 75
76 76 while (1) {
77 77 Py_BEGIN_ALLOW_THREADS
78 zresult = ZSTD_decompress_generic(self->decompressor->dctx, &output, &input);
78 zresult = ZSTD_decompressStream(self->decompressor->dctx, &output, &input);
79 79 Py_END_ALLOW_THREADS
80 80
81 81 if (ZSTD_isError(zresult)) {
@@ -130,9 +130,26 b' finally:'
130 130 return result;
131 131 }
132 132
133 static PyObject* DecompressionObj_flush(ZstdDecompressionObj* self, PyObject* args, PyObject* kwargs) {
134 static char* kwlist[] = {
135 "length",
136 NULL
137 };
138
139 PyObject* length = NULL;
140
141 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|O:flush", kwlist, &length)) {
142 return NULL;
143 }
144
145 Py_RETURN_NONE;
146 }
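The new DecompressionObj.flush() is deliberately a no-op: it exists so the object can be dropped in where a zlib/bz2 decompressor object is expected, and the optional ``length`` argument is accepted and ignored. A sketch (``compressed`` is hypothetical input; ``dctx`` as in the earlier sketches):

    dobj = dctx.decompressobj()
    data = dobj.decompress(compressed)
    dobj.flush()   # no-op; present for zlib/bz2 API compatibility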
147
133 148 static PyMethodDef DecompressionObj_methods[] = {
134 149 { "decompress", (PyCFunction)DecompressionObj_decompress,
135 150 METH_VARARGS | METH_KEYWORDS, PyDoc_STR("decompress data") },
151 { "flush", (PyCFunction)DecompressionObj_flush,
152 METH_VARARGS | METH_KEYWORDS, PyDoc_STR("no-op") },
136 153 { NULL, NULL }
137 154 };
138 155
@@ -17,7 +17,7 b' extern PyObject* ZstdError;'
17 17 int ensure_dctx(ZstdDecompressor* decompressor, int loadDict) {
18 18 size_t zresult;
19 19
20 ZSTD_DCtx_reset(decompressor->dctx);
20 ZSTD_DCtx_reset(decompressor->dctx, ZSTD_reset_session_only);
21 21
22 22 if (decompressor->maxWindowSize) {
23 23 zresult = ZSTD_DCtx_setMaxWindowSize(decompressor->dctx, decompressor->maxWindowSize);
@@ -229,7 +229,7 b' static PyObject* Decompressor_copy_strea'
229 229
230 230 while (input.pos < input.size) {
231 231 Py_BEGIN_ALLOW_THREADS
232 zresult = ZSTD_decompress_generic(self->dctx, &output, &input);
232 zresult = ZSTD_decompressStream(self->dctx, &output, &input);
233 233 Py_END_ALLOW_THREADS
234 234
235 235 if (ZSTD_isError(zresult)) {
@@ -379,7 +379,7 b' PyObject* Decompressor_decompress(ZstdDe'
379 379 inBuffer.pos = 0;
380 380
381 381 Py_BEGIN_ALLOW_THREADS
382 zresult = ZSTD_decompress_generic(self->dctx, &outBuffer, &inBuffer);
382 zresult = ZSTD_decompressStream(self->dctx, &outBuffer, &inBuffer);
383 383 Py_END_ALLOW_THREADS
384 384
385 385 if (ZSTD_isError(zresult)) {
@@ -550,28 +550,35 b' finally:'
550 550 }
551 551
552 552 PyDoc_STRVAR(Decompressor_stream_reader__doc__,
553 "stream_reader(source, [read_size=default])\n"
553 "stream_reader(source, [read_size=default, [read_across_frames=False]])\n"
554 554 "\n"
555 555 "Obtain an object that behaves like an I/O stream that can be used for\n"
556 556 "reading decompressed output from an object.\n"
557 557 "\n"
558 558 "The source object can be any object with a ``read(size)`` method or that\n"
559 559 "conforms to the buffer protocol.\n"
560 "\n"
561 "``read_across_frames`` controls the behavior of ``read()`` when the end\n"
562 "of a zstd frame is reached. When ``True``, ``read()`` can potentially\n"
563 "return data belonging to multiple zstd frames. When ``False``, ``read()``\n"
564 "will return when the end of a frame is reached.\n"
560 565 );
561 566
562 567 static ZstdDecompressionReader* Decompressor_stream_reader(ZstdDecompressor* self, PyObject* args, PyObject* kwargs) {
563 568 static char* kwlist[] = {
564 569 "source",
565 570 "read_size",
571 "read_across_frames",
566 572 NULL
567 573 };
568 574
569 575 PyObject* source;
570 576 size_t readSize = ZSTD_DStreamInSize();
577 PyObject* readAcrossFrames = NULL;
571 578 ZstdDecompressionReader* result;
572 579
573 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|k:stream_reader", kwlist,
574 &source, &readSize)) {
580 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|kO:stream_reader", kwlist,
581 &source, &readSize, &readAcrossFrames)) {
575 582 return NULL;
576 583 }
577 584
@@ -604,6 +611,7 b' static ZstdDecompressionReader* Decompre'
604 611
605 612 result->decompressor = self;
606 613 Py_INCREF(self);
614 result->readAcrossFrames = readAcrossFrames ? PyObject_IsTrue(readAcrossFrames) : 0;
607 615
608 616 return result;
609 617 }
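A sketch of the new ``read_across_frames`` behavior documented in the docstring above (``source`` is a hypothetical file object containing concatenated zstd frames):

    # Default (False): a single read() stops at the end of the current
    # frame, so it never mixes data from two frames.
    reader = dctx.stream_reader(source)

    # True: read() may return data spanning multiple frames.
    reader = dctx.stream_reader(source, read_across_frames=True)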
@@ -625,15 +633,17 b' static ZstdDecompressionWriter* Decompre'
625 633 static char* kwlist[] = {
626 634 "writer",
627 635 "write_size",
636 "write_return_read",
628 637 NULL
629 638 };
630 639
631 640 PyObject* writer;
632 641 size_t outSize = ZSTD_DStreamOutSize();
642 PyObject* writeReturnRead = NULL;
633 643 ZstdDecompressionWriter* result;
634 644
635 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|k:stream_writer", kwlist,
636 &writer, &outSize)) {
645 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|kO:stream_writer", kwlist,
646 &writer, &outSize, &writeReturnRead)) {
637 647 return NULL;
638 648 }
639 649
@@ -642,6 +652,10 b' static ZstdDecompressionWriter* Decompre'
642 652 return NULL;
643 653 }
644 654
655 if (ensure_dctx(self, 1)) {
656 return NULL;
657 }
658
645 659 result = (ZstdDecompressionWriter*)PyObject_CallObject((PyObject*)&ZstdDecompressionWriterType, NULL);
646 660 if (!result) {
647 661 return NULL;
@@ -654,6 +668,7 b' static ZstdDecompressionWriter* Decompre'
654 668 Py_INCREF(result->writer);
655 669
656 670 result->outSize = outSize;
671 result->writeReturnRead = writeReturnRead ? PyObject_IsTrue(writeReturnRead) : 0;
657 672
658 673 return result;
659 674 }
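Similarly, a sketch of the new write_return_read flag on the decompression stream_writer(); with it, write() reports compressed input bytes consumed rather than decompressed bytes written (assumed here to consume the whole frame in one call):

    import io
    import zstandard as zstd

    cctx = zstd.ZstdCompressor()
    frame = cctx.compress(b'hello' * 100)

    dctx = zstd.ZstdDecompressor()
    writer = dctx.stream_writer(io.BytesIO(), write_return_read=True)
    n = writer.write(frame)  # n counts compressed input bytes consumed,
                             # not decompressed bytes emitted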
@@ -756,7 +771,7 b' static PyObject* Decompressor_decompress'
756 771 inBuffer.pos = 0;
757 772
758 773 Py_BEGIN_ALLOW_THREADS
759 zresult = ZSTD_decompress_generic(self->dctx, &outBuffer, &inBuffer);
774 zresult = ZSTD_decompressStream(self->dctx, &outBuffer, &inBuffer);
760 775 Py_END_ALLOW_THREADS
761 776 if (ZSTD_isError(zresult)) {
762 777 PyErr_Format(ZstdError, "could not decompress chunk 0: %s", ZSTD_getErrorName(zresult));
@@ -852,7 +867,7 b' static PyObject* Decompressor_decompress'
852 867 outBuffer.pos = 0;
853 868
854 869 Py_BEGIN_ALLOW_THREADS
855 zresult = ZSTD_decompress_generic(self->dctx, &outBuffer, &inBuffer);
870 zresult = ZSTD_decompressStream(self->dctx, &outBuffer, &inBuffer);
856 871 Py_END_ALLOW_THREADS
857 872 if (ZSTD_isError(zresult)) {
858 873 PyErr_Format(ZstdError, "could not decompress chunk %zd: %s",
@@ -892,7 +907,7 b' static PyObject* Decompressor_decompress'
892 907 outBuffer.pos = 0;
893 908
894 909 Py_BEGIN_ALLOW_THREADS
895 zresult = ZSTD_decompress_generic(self->dctx, &outBuffer, &inBuffer);
910 zresult = ZSTD_decompressStream(self->dctx, &outBuffer, &inBuffer);
896 911 Py_END_ALLOW_THREADS
897 912 if (ZSTD_isError(zresult)) {
898 913 PyErr_Format(ZstdError, "could not decompress chunk %zd: %s",
@@ -1176,7 +1191,7 b' static void decompress_worker(WorkerStat'
1176 1191 inBuffer.size = sourceSize;
1177 1192 inBuffer.pos = 0;
1178 1193
1179 zresult = ZSTD_decompress_generic(state->dctx, &outBuffer, &inBuffer);
1194 zresult = ZSTD_decompressStream(state->dctx, &outBuffer, &inBuffer);
1180 1195 if (ZSTD_isError(zresult)) {
1181 1196 state->error = WorkerError_zstd;
1182 1197 state->zresult = zresult;
@@ -57,7 +57,7 b' static DecompressorIteratorResult read_d'
57 57 self->output.pos = 0;
58 58
59 59 Py_BEGIN_ALLOW_THREADS
60 zresult = ZSTD_decompress_generic(self->decompressor->dctx, &self->output, &self->input);
60 zresult = ZSTD_decompressStream(self->decompressor->dctx, &self->output, &self->input);
61 61 Py_END_ALLOW_THREADS
62 62
63 63 /* We're done with the pointer. Nullify to prevent anyone from getting a
@@ -16,7 +16,7 b''
16 16 #include <zdict.h>
17 17
18 18 /* Remember to change the string in zstandard/__init__ as well */
19 #define PYTHON_ZSTANDARD_VERSION "0.10.1"
19 #define PYTHON_ZSTANDARD_VERSION "0.11.0"
20 20
21 21 typedef enum {
22 22 compressorobj_flush_finish,
@@ -31,27 +31,6 b' typedef enum {'
31 31 typedef struct {
32 32 PyObject_HEAD
33 33 ZSTD_CCtx_params* params;
34 unsigned format;
35 int compressionLevel;
36 unsigned windowLog;
37 unsigned hashLog;
38 unsigned chainLog;
39 unsigned searchLog;
40 unsigned minMatch;
41 unsigned targetLength;
42 unsigned compressionStrategy;
43 unsigned contentSizeFlag;
44 unsigned checksumFlag;
45 unsigned dictIDFlag;
46 unsigned threads;
47 unsigned jobSize;
48 unsigned overlapSizeLog;
49 unsigned forceMaxWindow;
50 unsigned enableLongDistanceMatching;
51 unsigned ldmHashLog;
52 unsigned ldmMinMatch;
53 unsigned ldmBucketSizeLog;
54 unsigned ldmHashEveryLog;
55 34 } ZstdCompressionParametersObject;
56 35
57 36 extern PyTypeObject ZstdCompressionParametersType;
@@ -129,9 +108,11 b' typedef struct {'
129 108
130 109 ZstdCompressor* compressor;
131 110 PyObject* writer;
132 unsigned long long sourceSize;
111 ZSTD_outBuffer output;
133 112 size_t outSize;
134 113 int entered;
114 int closed;
115 int writeReturnRead;
135 116 unsigned long long bytesCompressed;
136 117 } ZstdCompressionWriter;
137 118
@@ -235,6 +216,8 b' typedef struct {'
235 216 PyObject* reader;
236 217 /* Size for read() operations on reader. */
237 218 size_t readSize;
219 /* Whether a read() can return data spanning multiple zstd frames. */
220 int readAcrossFrames;
238 221 /* Buffer to read from (if reading from a buffer). */
239 222 Py_buffer buffer;
240 223
@@ -267,6 +250,8 b' typedef struct {'
267 250 PyObject* writer;
268 251 size_t outSize;
269 252 int entered;
253 int closed;
254 int writeReturnRead;
270 255 } ZstdDecompressionWriter;
271 256
272 257 extern PyTypeObject ZstdDecompressionWriterType;
@@ -360,8 +345,9 b' typedef struct {'
360 345
361 346 extern PyTypeObject ZstdBufferWithSegmentsCollectionType;
362 347
363 int set_parameter(ZSTD_CCtx_params* params, ZSTD_cParameter param, unsigned value);
348 int set_parameter(ZSTD_CCtx_params* params, ZSTD_cParameter param, int value);
364 349 int set_parameters(ZSTD_CCtx_params* params, ZstdCompressionParametersObject* obj);
350 int to_cparams(ZstdCompressionParametersObject* params, ZSTD_compressionParameters* cparams);
365 351 FrameParametersObject* get_frame_parameters(PyObject* self, PyObject* args, PyObject* kwargs);
366 352 int ensure_ddict(ZstdCompressionDict* dict);
367 353 int ensure_dctx(ZstdDecompressor* decompressor, int loadDict);
@@ -36,7 +36,9 b" SOURCES = ['zstd/%s' % p for p in ("
36 36 'compress/zstd_opt.c',
37 37 'compress/zstdmt_compress.c',
38 38 'decompress/huf_decompress.c',
39 'decompress/zstd_ddict.c',
39 40 'decompress/zstd_decompress.c',
41 'decompress/zstd_decompress_block.c',
40 42 'dictBuilder/cover.c',
41 43 'dictBuilder/fastcover.c',
42 44 'dictBuilder/divsufsort.c',
@@ -5,12 +5,32 b''
5 5 # This software may be modified and distributed under the terms
6 6 # of the BSD license. See the LICENSE file for details.
7 7
8 from __future__ import print_function
9
10 from distutils.version import LooseVersion
8 11 import os
9 12 import sys
10 13 from setuptools import setup
11 14
15 # We need the change in CFFI 1.10 for ffi.from_buffer() to handle all
16 # buffer types (like memoryview) and the feature in CFFI 1.11 for
17 # ffi.gc() to declare the size of objects so we avoid garbage
18 # collection pitfalls.
19 MINIMUM_CFFI_VERSION = '1.11'
20
12 21 try:
13 22 import cffi
23
24 # PyPy (and possibly other distributions) ships CFFI as part of the
25 # interpreter, so the install_requires for CFFI below won't take effect.
26 # We need to sniff out the CFFI version here and reject it if too old.
27 cffi_version = LooseVersion(cffi.__version__)
28 if cffi_version < LooseVersion(MINIMUM_CFFI_VERSION):
29 print('CFFI %s or newer required (%s found); '
30 'not building CFFI backend' % (MINIMUM_CFFI_VERSION, cffi_version),
31 file=sys.stderr)
32 cffi = None
33
14 34 except ImportError:
15 35 cffi = None
16 36
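For context on why the version sniffing above goes through LooseVersion instead of comparing strings (a sketch, not part of the change):

    from distutils.version import LooseVersion

    '1.9' < '1.11'                               # False: lexicographic comparison
    LooseVersion('1.9') < LooseVersion('1.11')   # True: compares numeric components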
@@ -49,12 +69,7 b' install_requires = []'
49 69 if cffi:
50 70 import make_cffi
51 71 extensions.append(make_cffi.ffi.distutils_extension())
52
53 # Need change in 1.10 for ffi.from_buffer() to handle all buffer types
54 # (like memoryview).
55 # Need feature in 1.11 for ffi.gc() to declare size of objects so we avoid
56 # garbage collection pitfalls.
57 install_requires.append('cffi>=1.11')
72 install_requires.append('cffi>=%s' % MINIMUM_CFFI_VERSION)
58 73
59 74 version = None
60 75
@@ -88,6 +103,7 b' setup('
88 103 'Programming Language :: Python :: 3.4',
89 104 'Programming Language :: Python :: 3.5',
90 105 'Programming Language :: Python :: 3.6',
106 'Programming Language :: Python :: 3.7',
91 107 ],
92 108 keywords='zstandard zstd compression',
93 109 packages=['zstandard'],
@@ -30,7 +30,9 b" zstd_sources = ['zstd/%s' % p for p in ("
30 30 'compress/zstd_opt.c',
31 31 'compress/zstdmt_compress.c',
32 32 'decompress/huf_decompress.c',
33 'decompress/zstd_ddict.c',
33 34 'decompress/zstd_decompress.c',
35 'decompress/zstd_decompress_block.c',
34 36 'dictBuilder/cover.c',
35 37 'dictBuilder/divsufsort.c',
36 38 'dictBuilder/fastcover.c',
@@ -79,12 +79,37 b' def make_cffi(cls):'
79 79 return cls
80 80
81 81
82 class OpCountingBytesIO(io.BytesIO):
82 class NonClosingBytesIO(io.BytesIO):
83 """BytesIO that saves the underlying buffer on close().
84
85 This allows us to access written data after close().
86 """
83 87 def __init__(self, *args, **kwargs):
88 super(NonClosingBytesIO, self).__init__(*args, **kwargs)
89 self._saved_buffer = None
90
91 def close(self):
92 self._saved_buffer = self.getvalue()
93 return super(NonClosingBytesIO, self).close()
94
95 def getvalue(self):
96 if self.closed:
97 return self._saved_buffer
98 else:
99 return super(NonClosingBytesIO, self).getvalue()
100
101
102 class OpCountingBytesIO(NonClosingBytesIO):
103 def __init__(self, *args, **kwargs):
104 self._flush_count = 0
84 105 self._read_count = 0
85 106 self._write_count = 0
86 107 return super(OpCountingBytesIO, self).__init__(*args, **kwargs)
87 108
109 def flush(self):
110 self._flush_count += 1
111 return super(OpCountingBytesIO, self).flush()
112
88 113 def read(self, *args):
89 114 self._read_count += 1
90 115 return super(OpCountingBytesIO, self).read(*args)
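A sketch of why the tests need this helper: in 0.11 closing a stream_writer also closes the wrapped stream, and a plain io.BytesIO discards its buffer on close():

    import io

    buf = io.BytesIO()
    buf.write(b'data')
    buf.close()
    # buf.getvalue() now raises ValueError: I/O operation on closed file.

    buf = NonClosingBytesIO()   # the helper defined in the hunk above
    buf.write(b'data')
    buf.close()
    assert buf.getvalue() == b'data'  # the saved copy survives close()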
@@ -117,6 +142,13 b' def random_input_data():'
117 142 except OSError:
118 143 pass
119 144
145 # Also add some actual random data.
146 _source_files.append(os.urandom(100))
147 _source_files.append(os.urandom(1000))
148 _source_files.append(os.urandom(10000))
149 _source_files.append(os.urandom(100000))
150 _source_files.append(os.urandom(1000000))
151
120 152 return _source_files
121 153
122 154
@@ -140,12 +172,14 b' def generate_samples():'
140 172
141 173
142 174 if hypothesis:
143 default_settings = hypothesis.settings()
175 default_settings = hypothesis.settings(deadline=10000)
144 176 hypothesis.settings.register_profile('default', default_settings)
145 177
146 ci_settings = hypothesis.settings(max_examples=2500,
147 max_iterations=2500)
178 ci_settings = hypothesis.settings(deadline=20000, max_examples=1000)
148 179 hypothesis.settings.register_profile('ci', ci_settings)
149 180
181 expensive_settings = hypothesis.settings(deadline=None, max_examples=10000)
182 hypothesis.settings.register_profile('expensive', expensive_settings)
183
150 184 hypothesis.settings.load_profile(
151 185 os.environ.get('HYPOTHESIS_PROFILE', 'default'))
@@ -8,6 +8,9 b" ss = struct.Struct('=QQ')"
8 8
9 9 class TestBufferWithSegments(unittest.TestCase):
10 10 def test_arguments(self):
11 if not hasattr(zstd, 'BufferWithSegments'):
12 self.skipTest('BufferWithSegments not available')
13
11 14 with self.assertRaises(TypeError):
12 15 zstd.BufferWithSegments()
13 16
@@ -19,10 +22,16 b' class TestBufferWithSegments(unittest.Te'
19 22 zstd.BufferWithSegments(b'foo', b'\x00\x00')
20 23
21 24 def test_invalid_offset(self):
25 if not hasattr(zstd, 'BufferWithSegments'):
26 self.skipTest('BufferWithSegments not available')
27
22 28 with self.assertRaisesRegexp(ValueError, 'offset within segments array references memory'):
23 29 zstd.BufferWithSegments(b'foo', ss.pack(0, 4))
24 30
25 31 def test_invalid_getitem(self):
32 if not hasattr(zstd, 'BufferWithSegments'):
33 self.skipTest('BufferWithSegments not available')
34
26 35 b = zstd.BufferWithSegments(b'foo', ss.pack(0, 3))
27 36
28 37 with self.assertRaisesRegexp(IndexError, 'offset must be non-negative'):
@@ -35,6 +44,9 b' class TestBufferWithSegments(unittest.Te'
35 44 test = b[2]
36 45
37 46 def test_single(self):
47 if not hasattr(zstd, 'BufferWithSegments'):
48 self.skipTest('BufferWithSegments not available')
49
38 50 b = zstd.BufferWithSegments(b'foo', ss.pack(0, 3))
39 51 self.assertEqual(len(b), 1)
40 52 self.assertEqual(b.size, 3)
@@ -45,6 +57,9 b' class TestBufferWithSegments(unittest.Te'
45 57 self.assertEqual(b[0].tobytes(), b'foo')
46 58
47 59 def test_multiple(self):
60 if not hasattr(zstd, 'BufferWithSegments'):
61 self.skipTest('BufferWithSegments not available')
62
48 63 b = zstd.BufferWithSegments(b'foofooxfooxy', b''.join([ss.pack(0, 3),
49 64 ss.pack(3, 4),
50 65 ss.pack(7, 5)]))
@@ -59,10 +74,16 b' class TestBufferWithSegments(unittest.Te'
59 74
60 75 class TestBufferWithSegmentsCollection(unittest.TestCase):
61 76 def test_empty_constructor(self):
77 if not hasattr(zstd, 'BufferWithSegmentsCollection'):
78 self.skipTest('BufferWithSegmentsCollection not available')
79
62 80 with self.assertRaisesRegexp(ValueError, 'must pass at least 1 argument'):
63 81 zstd.BufferWithSegmentsCollection()
64 82
65 83 def test_argument_validation(self):
84 if not hasattr(zstd, 'BufferWithSegmentsCollection'):
85 self.skipTest('BufferWithSegmentsCollection not available')
86
66 87 with self.assertRaisesRegexp(TypeError, 'arguments must be BufferWithSegments'):
67 88 zstd.BufferWithSegmentsCollection(None)
68 89
@@ -74,6 +95,9 b' class TestBufferWithSegmentsCollection(u'
74 95 zstd.BufferWithSegmentsCollection(zstd.BufferWithSegments(b'', b''))
75 96
76 97 def test_length(self):
98 if not hasattr(zstd, 'BufferWithSegmentsCollection'):
99 self.skipTest('BufferWithSegmentsCollection not available')
100
77 101 b1 = zstd.BufferWithSegments(b'foo', ss.pack(0, 3))
78 102 b2 = zstd.BufferWithSegments(b'barbaz', b''.join([ss.pack(0, 3),
79 103 ss.pack(3, 3)]))
@@ -91,6 +115,9 b' class TestBufferWithSegmentsCollection(u'
91 115 self.assertEqual(c.size(), 9)
92 116
93 117 def test_getitem(self):
118 if not hasattr(zstd, 'BufferWithSegmentsCollection'):
119 self.skipTest('BufferWithSegmentsCollection not available')
120
94 121 b1 = zstd.BufferWithSegments(b'foo', ss.pack(0, 3))
95 122 b2 = zstd.BufferWithSegments(b'barbaz', b''.join([ss.pack(0, 3),
96 123 ss.pack(3, 3)]))
@@ -1,14 +1,17 b''
1 1 import hashlib
2 2 import io
3 import os
3 4 import struct
4 5 import sys
5 6 import tarfile
7 import tempfile
6 8 import unittest
7 9
8 10 import zstandard as zstd
9 11
10 12 from .common import (
11 13 make_cffi,
14 NonClosingBytesIO,
12 15 OpCountingBytesIO,
13 16 )
14 17
@@ -272,7 +275,7 b' class TestCompressor_compressobj(unittes'
272 275
273 276 params = zstd.get_frame_parameters(result)
274 277 self.assertEqual(params.content_size, zstd.CONTENTSIZE_UNKNOWN)
275 self.assertEqual(params.window_size, 1048576)
278 self.assertEqual(params.window_size, 2097152)
276 279 self.assertEqual(params.dict_id, 0)
277 280 self.assertFalse(params.has_checksum)
278 281
@@ -321,7 +324,7 b' class TestCompressor_compressobj(unittes'
321 324 cobj.compress(b'foo')
322 325 cobj.flush()
323 326
324 with self.assertRaisesRegexp(zstd.ZstdError, 'cannot call compress\(\) after compressor'):
327 with self.assertRaisesRegexp(zstd.ZstdError, r'cannot call compress\(\) after compressor'):
325 328 cobj.compress(b'foo')
326 329
327 330 with self.assertRaisesRegexp(zstd.ZstdError, 'compressor object already finished'):
@@ -453,7 +456,7 b' class TestCompressor_copy_stream(unittes'
453 456
454 457 params = zstd.get_frame_parameters(dest.getvalue())
455 458 self.assertEqual(params.content_size, zstd.CONTENTSIZE_UNKNOWN)
456 self.assertEqual(params.window_size, 1048576)
459 self.assertEqual(params.window_size, 2097152)
457 460 self.assertEqual(params.dict_id, 0)
458 461 self.assertFalse(params.has_checksum)
459 462
@@ -605,10 +608,6 b' class TestCompressor_stream_reader(unitt'
605 608 with self.assertRaises(io.UnsupportedOperation):
606 609 reader.readlines()
607 610
608 # This could probably be implemented someday.
609 with self.assertRaises(NotImplementedError):
610 reader.readall()
611
612 611 with self.assertRaises(io.UnsupportedOperation):
613 612 iter(reader)
614 613
@@ -644,15 +643,16 b' class TestCompressor_stream_reader(unitt'
644 643 with self.assertRaisesRegexp(ValueError, 'stream is closed'):
645 644 reader.read(10)
646 645
647 def test_read_bad_size(self):
646 def test_read_sizes(self):
648 647 cctx = zstd.ZstdCompressor()
648 foo = cctx.compress(b'foo')
649 649
650 650 with cctx.stream_reader(b'foo') as reader:
651 with self.assertRaisesRegexp(ValueError, 'cannot read negative or size 0 amounts'):
652 reader.read(-1)
651 with self.assertRaisesRegexp(ValueError, 'cannot read negative amounts less than -1'):
652 reader.read(-2)
653 653
654 with self.assertRaisesRegexp(ValueError, 'cannot read negative or size 0 amounts'):
655 reader.read(0)
654 self.assertEqual(reader.read(0), b'')
655 self.assertEqual(reader.read(), foo)
656 656
657 657 def test_read_buffer(self):
658 658 cctx = zstd.ZstdCompressor()
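The rewritten test above reflects the move toward io.RawIOBase read semantics: read(0) is legal, read(-1) or read() means read to EOF, and only sizes below -1 are rejected. A sketch:

    import zstandard as zstd

    cctx = zstd.ZstdCompressor()
    with cctx.stream_reader(b'foo') as reader:
        assert reader.read(0) == b''  # zero-byte reads are now legal
        frame = reader.read()         # read to EOF, io.RawIOBase-style
        # reader.read(-2) raises ValueError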
@@ -746,11 +746,202 b' class TestCompressor_stream_reader(unitt'
746 746 with cctx.stream_reader(source, size=42):
747 747 pass
748 748
749 def test_readall(self):
750 cctx = zstd.ZstdCompressor()
751 frame = cctx.compress(b'foo' * 1024)
752
753 reader = cctx.stream_reader(b'foo' * 1024)
754 self.assertEqual(reader.readall(), frame)
755
756 def test_readinto(self):
757 cctx = zstd.ZstdCompressor()
758 foo = cctx.compress(b'foo')
759
760 reader = cctx.stream_reader(b'foo')
761 with self.assertRaises(Exception):
762 reader.readinto(b'foobar')
763
764 # readinto() with sufficiently large destination.
765 b = bytearray(1024)
766 reader = cctx.stream_reader(b'foo')
767 self.assertEqual(reader.readinto(b), len(foo))
768 self.assertEqual(b[0:len(foo)], foo)
769 self.assertEqual(reader.readinto(b), 0)
770 self.assertEqual(b[0:len(foo)], foo)
771
772 # readinto() with small reads.
773 b = bytearray(1024)
774 reader = cctx.stream_reader(b'foo', read_size=1)
775 self.assertEqual(reader.readinto(b), len(foo))
776 self.assertEqual(b[0:len(foo)], foo)
777
778 # Too small destination buffer.
779 b = bytearray(2)
780 reader = cctx.stream_reader(b'foo')
781 self.assertEqual(reader.readinto(b), 2)
782 self.assertEqual(b[:], foo[0:2])
783 self.assertEqual(reader.readinto(b), 2)
784 self.assertEqual(b[:], foo[2:4])
785 self.assertEqual(reader.readinto(b), 2)
786 self.assertEqual(b[:], foo[4:6])
787
788 def test_readinto1(self):
789 cctx = zstd.ZstdCompressor()
790 foo = b''.join(cctx.read_to_iter(io.BytesIO(b'foo')))
791
792 reader = cctx.stream_reader(b'foo')
793 with self.assertRaises(Exception):
794 reader.readinto1(b'foobar')
795
796 b = bytearray(1024)
797 source = OpCountingBytesIO(b'foo')
798 reader = cctx.stream_reader(source)
799 self.assertEqual(reader.readinto1(b), len(foo))
800 self.assertEqual(b[0:len(foo)], foo)
801 self.assertEqual(source._read_count, 2)
802
803 # readinto1() with small reads.
804 b = bytearray(1024)
805 source = OpCountingBytesIO(b'foo')
806 reader = cctx.stream_reader(source, read_size=1)
807 self.assertEqual(reader.readinto1(b), len(foo))
808 self.assertEqual(b[0:len(foo)], foo)
809 self.assertEqual(source._read_count, 4)
810
811 def test_read1(self):
812 cctx = zstd.ZstdCompressor()
813 foo = b''.join(cctx.read_to_iter(io.BytesIO(b'foo')))
814
815 b = OpCountingBytesIO(b'foo')
816 reader = cctx.stream_reader(b)
817
818 self.assertEqual(reader.read1(), foo)
819 self.assertEqual(b._read_count, 2)
820
821 b = OpCountingBytesIO(b'foo')
822 reader = cctx.stream_reader(b)
823
824 self.assertEqual(reader.read1(0), b'')
825 self.assertEqual(reader.read1(2), foo[0:2])
826 self.assertEqual(b._read_count, 2)
827 self.assertEqual(reader.read1(2), foo[2:4])
828 self.assertEqual(reader.read1(1024), foo[4:])
829
749 830
750 831 @make_cffi
751 832 class TestCompressor_stream_writer(unittest.TestCase):
833 def test_io_api(self):
834 buffer = io.BytesIO()
835 cctx = zstd.ZstdCompressor()
836 writer = cctx.stream_writer(buffer)
837
838 self.assertFalse(writer.isatty())
839 self.assertFalse(writer.readable())
840
841 with self.assertRaises(io.UnsupportedOperation):
842 writer.readline()
843
844 with self.assertRaises(io.UnsupportedOperation):
845 writer.readline(42)
846
847 with self.assertRaises(io.UnsupportedOperation):
848 writer.readline(size=42)
849
850 with self.assertRaises(io.UnsupportedOperation):
851 writer.readlines()
852
853 with self.assertRaises(io.UnsupportedOperation):
854 writer.readlines(42)
855
856 with self.assertRaises(io.UnsupportedOperation):
857 writer.readlines(hint=42)
858
859 with self.assertRaises(io.UnsupportedOperation):
860 writer.seek(0)
861
862 with self.assertRaises(io.UnsupportedOperation):
863 writer.seek(10, os.SEEK_SET)
864
865 self.assertFalse(writer.seekable())
866
867 with self.assertRaises(io.UnsupportedOperation):
868 writer.truncate()
869
870 with self.assertRaises(io.UnsupportedOperation):
871 writer.truncate(42)
872
873 with self.assertRaises(io.UnsupportedOperation):
874 writer.truncate(size=42)
875
876 self.assertTrue(writer.writable())
877
878 with self.assertRaises(NotImplementedError):
879 writer.writelines([])
880
881 with self.assertRaises(io.UnsupportedOperation):
882 writer.read()
883
884 with self.assertRaises(io.UnsupportedOperation):
885 writer.read(42)
886
887 with self.assertRaises(io.UnsupportedOperation):
888 writer.read(size=42)
889
890 with self.assertRaises(io.UnsupportedOperation):
891 writer.readall()
892
893 with self.assertRaises(io.UnsupportedOperation):
894 writer.readinto(None)
895
896 with self.assertRaises(io.UnsupportedOperation):
897 writer.fileno()
898
899 self.assertFalse(writer.closed)
900
901 def test_fileno_file(self):
902 with tempfile.TemporaryFile('wb') as tf:
903 cctx = zstd.ZstdCompressor()
904 writer = cctx.stream_writer(tf)
905
906 self.assertEqual(writer.fileno(), tf.fileno())
907
908 def test_close(self):
909 buffer = NonClosingBytesIO()
910 cctx = zstd.ZstdCompressor(level=1)
911 writer = cctx.stream_writer(buffer)
912
913 writer.write(b'foo' * 1024)
914 self.assertFalse(writer.closed)
915 self.assertFalse(buffer.closed)
916 writer.close()
917 self.assertTrue(writer.closed)
918 self.assertTrue(buffer.closed)
919
920 with self.assertRaisesRegexp(ValueError, 'stream is closed'):
921 writer.write(b'foo')
922
923 with self.assertRaisesRegexp(ValueError, 'stream is closed'):
924 writer.flush()
925
926 with self.assertRaisesRegexp(ValueError, 'stream is closed'):
927 with writer:
928 pass
929
930 self.assertEqual(buffer.getvalue(),
931 b'\x28\xb5\x2f\xfd\x00\x48\x55\x00\x00\x18\x66\x6f'
932 b'\x6f\x01\x00\xfa\xd3\x77\x43')
933
934 # Context manager exit should close stream.
935 buffer = io.BytesIO()
936 writer = cctx.stream_writer(buffer)
937
938 with writer:
939 writer.write(b'foo')
940
941 self.assertTrue(writer.closed)
942
752 943 def test_empty(self):
753 buffer = io.BytesIO()
944 buffer = NonClosingBytesIO()
754 945 cctx = zstd.ZstdCompressor(level=1, write_content_size=False)
755 946 with cctx.stream_writer(buffer) as compressor:
756 947 compressor.write(b'')
@@ -764,6 +955,25 b' class TestCompressor_stream_writer(unitt'
764 955 self.assertEqual(params.dict_id, 0)
765 956 self.assertFalse(params.has_checksum)
766 957
958 # Test without context manager.
959 buffer = io.BytesIO()
960 compressor = cctx.stream_writer(buffer)
961 self.assertEqual(compressor.write(b''), 0)
962 self.assertEqual(buffer.getvalue(), b'')
963 self.assertEqual(compressor.flush(zstd.FLUSH_FRAME), 9)
964 result = buffer.getvalue()
965 self.assertEqual(result, b'\x28\xb5\x2f\xfd\x00\x48\x01\x00\x00')
966
967 params = zstd.get_frame_parameters(result)
968 self.assertEqual(params.content_size, zstd.CONTENTSIZE_UNKNOWN)
969 self.assertEqual(params.window_size, 524288)
970 self.assertEqual(params.dict_id, 0)
971 self.assertFalse(params.has_checksum)
972
973 # Test write_return_read=True
974 compressor = cctx.stream_writer(buffer, write_return_read=True)
975 self.assertEqual(compressor.write(b''), 0)
976
767 977 def test_input_types(self):
768 978 expected = b'\x28\xb5\x2f\xfd\x00\x48\x19\x00\x00\x66\x6f\x6f'
769 979 cctx = zstd.ZstdCompressor(level=1)
@@ -778,14 +988,17 b' class TestCompressor_stream_writer(unitt'
778 988 ]
779 989
780 990 for source in sources:
781 buffer = io.BytesIO()
991 buffer = NonClosingBytesIO()
782 992 with cctx.stream_writer(buffer) as compressor:
783 993 compressor.write(source)
784 994
785 995 self.assertEqual(buffer.getvalue(), expected)
786 996
997 compressor = cctx.stream_writer(buffer, write_return_read=True)
998 self.assertEqual(compressor.write(source), len(source))
999
787 1000 def test_multiple_compress(self):
788 buffer = io.BytesIO()
1001 buffer = NonClosingBytesIO()
789 1002 cctx = zstd.ZstdCompressor(level=5)
790 1003 with cctx.stream_writer(buffer) as compressor:
791 1004 self.assertEqual(compressor.write(b'foo'), 0)
@@ -794,9 +1007,27 b' class TestCompressor_stream_writer(unitt'
794 1007
795 1008 result = buffer.getvalue()
796 1009 self.assertEqual(result,
797 b'\x28\xb5\x2f\xfd\x00\x50\x75\x00\x00\x38\x66\x6f'
1010 b'\x28\xb5\x2f\xfd\x00\x58\x75\x00\x00\x38\x66\x6f'
798 1011 b'\x6f\x62\x61\x72\x78\x01\x00\xfc\xdf\x03\x23')
799 1012
1013 # Test without context manager.
1014 buffer = io.BytesIO()
1015 compressor = cctx.stream_writer(buffer)
1016 self.assertEqual(compressor.write(b'foo'), 0)
1017 self.assertEqual(compressor.write(b'bar'), 0)
1018 self.assertEqual(compressor.write(b'x' * 8192), 0)
1019 self.assertEqual(compressor.flush(zstd.FLUSH_FRAME), 23)
1020 result = buffer.getvalue()
1021 self.assertEqual(result,
1022 b'\x28\xb5\x2f\xfd\x00\x58\x75\x00\x00\x38\x66\x6f'
1023 b'\x6f\x62\x61\x72\x78\x01\x00\xfc\xdf\x03\x23')
1024
1025 # Test with write_return_read=True.
1026 compressor = cctx.stream_writer(buffer, write_return_read=True)
1027 self.assertEqual(compressor.write(b'foo'), 3)
1028 self.assertEqual(compressor.write(b'barbiz'), 6)
1029 self.assertEqual(compressor.write(b'x' * 8192), 8192)
1030
800 1031 def test_dictionary(self):
801 1032 samples = []
802 1033 for i in range(128):
@@ -807,9 +1038,9 b' class TestCompressor_stream_writer(unitt'
807 1038 d = zstd.train_dictionary(8192, samples)
808 1039
809 1040 h = hashlib.sha1(d.as_bytes()).hexdigest()
810 self.assertEqual(h, '2b3b6428da5bf2c9cc9d4bb58ba0bc5990dd0e79')
1041 self.assertEqual(h, '88ca0d38332aff379d4ced166a51c280a7679aad')
811 1042
812 buffer = io.BytesIO()
1043 buffer = NonClosingBytesIO()
813 1044 cctx = zstd.ZstdCompressor(level=9, dict_data=d)
814 1045 with cctx.stream_writer(buffer) as compressor:
815 1046 self.assertEqual(compressor.write(b'foo'), 0)
@@ -825,7 +1056,7 b' class TestCompressor_stream_writer(unitt'
825 1056 self.assertFalse(params.has_checksum)
826 1057
827 1058 h = hashlib.sha1(compressed).hexdigest()
828 self.assertEqual(h, '23f88344263678478f5f82298e0a5d1833125786')
1059 self.assertEqual(h, '8703b4316f274d26697ea5dd480f29c08e85d940')
829 1060
830 1061 source = b'foo' + b'bar' + (b'foo' * 16384)
831 1062
@@ -842,9 +1073,9 b' class TestCompressor_stream_writer(unitt'
842 1073 min_match=5,
843 1074 search_log=4,
844 1075 target_length=10,
845 compression_strategy=zstd.STRATEGY_FAST)
1076 strategy=zstd.STRATEGY_FAST)
846 1077
847 buffer = io.BytesIO()
1078 buffer = NonClosingBytesIO()
848 1079 cctx = zstd.ZstdCompressor(compression_params=params)
849 1080 with cctx.stream_writer(buffer) as compressor:
850 1081 self.assertEqual(compressor.write(b'foo'), 0)
@@ -863,12 +1094,12 b' class TestCompressor_stream_writer(unitt'
863 1094 self.assertEqual(h, '2a8111d72eb5004cdcecbdac37da9f26720d30ef')
864 1095
865 1096 def test_write_checksum(self):
866 no_checksum = io.BytesIO()
1097 no_checksum = NonClosingBytesIO()
867 1098 cctx = zstd.ZstdCompressor(level=1)
868 1099 with cctx.stream_writer(no_checksum) as compressor:
869 1100 self.assertEqual(compressor.write(b'foobar'), 0)
870 1101
871 with_checksum = io.BytesIO()
1102 with_checksum = NonClosingBytesIO()
872 1103 cctx = zstd.ZstdCompressor(level=1, write_checksum=True)
873 1104 with cctx.stream_writer(with_checksum) as compressor:
874 1105 self.assertEqual(compressor.write(b'foobar'), 0)
@@ -886,12 +1117,12 b' class TestCompressor_stream_writer(unitt'
886 1117 len(no_checksum.getvalue()) + 4)
887 1118
888 1119 def test_write_content_size(self):
889 no_size = io.BytesIO()
1120 no_size = NonClosingBytesIO()
890 1121 cctx = zstd.ZstdCompressor(level=1, write_content_size=False)
891 1122 with cctx.stream_writer(no_size) as compressor:
892 1123 self.assertEqual(compressor.write(b'foobar' * 256), 0)
893 1124
894 with_size = io.BytesIO()
1125 with_size = NonClosingBytesIO()
895 1126 cctx = zstd.ZstdCompressor(level=1)
896 1127 with cctx.stream_writer(with_size) as compressor:
897 1128 self.assertEqual(compressor.write(b'foobar' * 256), 0)
@@ -902,7 +1133,7 b' class TestCompressor_stream_writer(unitt'
902 1133 len(no_size.getvalue()))
903 1134
904 1135 # Declaring size will write the header.
905 with_size = io.BytesIO()
1136 with_size = NonClosingBytesIO()
906 1137 with cctx.stream_writer(with_size, size=len(b'foobar' * 256)) as compressor:
907 1138 self.assertEqual(compressor.write(b'foobar' * 256), 0)
908 1139
@@ -927,7 +1158,7 b' class TestCompressor_stream_writer(unitt'
927 1158
928 1159 d = zstd.train_dictionary(1024, samples)
929 1160
930 with_dict_id = io.BytesIO()
1161 with_dict_id = NonClosingBytesIO()
931 1162 cctx = zstd.ZstdCompressor(level=1, dict_data=d)
932 1163 with cctx.stream_writer(with_dict_id) as compressor:
933 1164 self.assertEqual(compressor.write(b'foobarfoobar'), 0)
@@ -935,7 +1166,7 b' class TestCompressor_stream_writer(unitt'
935 1166 self.assertEqual(with_dict_id.getvalue()[4:5], b'\x03')
936 1167
937 1168 cctx = zstd.ZstdCompressor(level=1, dict_data=d, write_dict_id=False)
938 no_dict_id = io.BytesIO()
1169 no_dict_id = NonClosingBytesIO()
939 1170 with cctx.stream_writer(no_dict_id) as compressor:
940 1171 self.assertEqual(compressor.write(b'foobarfoobar'), 0)
941 1172
@@ -1009,8 +1240,32 b' class TestCompressor_stream_writer(unitt'
1009 1240 header = trailing[0:3]
1010 1241 self.assertEqual(header, b'\x01\x00\x00')
1011 1242
1243 def test_flush_frame(self):
1244 cctx = zstd.ZstdCompressor(level=3)
1245 dest = OpCountingBytesIO()
1246
1247 with cctx.stream_writer(dest) as compressor:
1248 self.assertEqual(compressor.write(b'foobar' * 8192), 0)
1249 self.assertEqual(compressor.flush(zstd.FLUSH_FRAME), 23)
1250 compressor.write(b'biz' * 16384)
1251
1252 self.assertEqual(dest.getvalue(),
1253 # Frame 1.
1254 b'\x28\xb5\x2f\xfd\x00\x58\x75\x00\x00\x30\x66\x6f\x6f'
1255 b'\x62\x61\x72\x01\x00\xf7\xbf\xe8\xa5\x08'
1256 # Frame 2.
1257 b'\x28\xb5\x2f\xfd\x00\x58\x5d\x00\x00\x18\x62\x69\x7a'
1258 b'\x01\x00\xfa\x3f\x75\x37\x04')
1259
1260 def test_bad_flush_mode(self):
1261 cctx = zstd.ZstdCompressor()
1262 dest = io.BytesIO()
1263 with cctx.stream_writer(dest) as compressor:
1264 with self.assertRaisesRegexp(ValueError, 'unknown flush_mode: 42'):
1265 compressor.flush(flush_mode=42)
1266
1012 1267 def test_multithreaded(self):
1013 dest = io.BytesIO()
1268 dest = NonClosingBytesIO()
1014 1269 cctx = zstd.ZstdCompressor(threads=2)
1015 1270 with cctx.stream_writer(dest) as compressor:
1016 1271 compressor.write(b'a' * 1048576)
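The test_flush_frame hunk above exercises the new FLUSH_FRAME mode; in sketch form:

    import io
    import zstandard as zstd

    dest = io.BytesIO()
    cctx = zstd.ZstdCompressor()
    with cctx.stream_writer(dest) as compressor:
        compressor.write(b'first logical record')
        compressor.flush(zstd.FLUSH_FRAME)          # ends the current zstd frame
        compressor.write(b'second logical record')  # implicitly starts a new frame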
@@ -1043,22 +1298,21 b' class TestCompressor_stream_writer(unitt'
1043 1298 pass
1044 1299
1045 1300 def test_tarfile_compat(self):
1046 raise unittest.SkipTest('not yet fully working')
1047
1048 dest = io.BytesIO()
1301 dest = NonClosingBytesIO()
1049 1302 cctx = zstd.ZstdCompressor()
1050 1303 with cctx.stream_writer(dest) as compressor:
1051 with tarfile.open('tf', mode='w', fileobj=compressor) as tf:
1304 with tarfile.open('tf', mode='w|', fileobj=compressor) as tf:
1052 1305 tf.add(__file__, 'test_compressor.py')
1053 1306
1054 dest.seek(0)
1307 dest = io.BytesIO(dest.getvalue())
1055 1308
1056 1309 dctx = zstd.ZstdDecompressor()
1057 1310 with dctx.stream_reader(dest) as reader:
1058 with tarfile.open(mode='r:', fileobj=reader) as tf:
1311 with tarfile.open(mode='r|', fileobj=reader) as tf:
1059 1312 for member in tf:
1060 1313 self.assertEqual(member.name, 'test_compressor.py')
1061 1314
1315
1062 1316 @make_cffi
1063 1317 class TestCompressor_read_to_iter(unittest.TestCase):
1064 1318 def test_type_validation(self):
@@ -1192,7 +1446,7 b' class TestCompressor_chunker(unittest.Te'
1192 1446
1193 1447 it = chunker.finish()
1194 1448
1195 self.assertEqual(next(it), b'\x28\xb5\x2f\xfd\x00\x50\x01\x00\x00')
1449 self.assertEqual(next(it), b'\x28\xb5\x2f\xfd\x00\x58\x01\x00\x00')
1196 1450
1197 1451 with self.assertRaises(StopIteration):
1198 1452 next(it)
@@ -1214,7 +1468,7 b' class TestCompressor_chunker(unittest.Te'
1214 1468 it = chunker.finish()
1215 1469
1216 1470 self.assertEqual(next(it),
1217 b'\x28\xb5\x2f\xfd\x00\x50\x7d\x00\x00\x48\x66\x6f'
1471 b'\x28\xb5\x2f\xfd\x00\x58\x7d\x00\x00\x48\x66\x6f'
1218 1472 b'\x6f\x62\x61\x72\x62\x61\x7a\x01\x00\xe4\xe4\x8e')
1219 1473
1220 1474 with self.assertRaises(StopIteration):
@@ -1258,7 +1512,7 b' class TestCompressor_chunker(unittest.Te'
1258 1512
1259 1513 self.assertEqual(
1260 1514 b''.join(chunks),
1261 b'\x28\xb5\x2f\xfd\x00\x50\x55\x00\x00\x18\x66\x6f\x6f\x01\x00'
1515 b'\x28\xb5\x2f\xfd\x00\x58\x55\x00\x00\x18\x66\x6f\x6f\x01\x00'
1262 1516 b'\xfa\xd3\x77\x43')
1263 1517
1264 1518 dctx = zstd.ZstdDecompressor()
@@ -1283,7 +1537,7 b' class TestCompressor_chunker(unittest.Te'
1283 1537
1284 1538 self.assertEqual(list(chunker.compress(source)), [])
1285 1539 self.assertEqual(list(chunker.finish()), [
1286 b'\x28\xb5\x2f\xfd\x00\x50\x19\x00\x00\x66\x6f\x6f'
1540 b'\x28\xb5\x2f\xfd\x00\x58\x19\x00\x00\x66\x6f\x6f'
1287 1541 ])
1288 1542
1289 1543 def test_flush(self):
@@ -1296,7 +1550,7 b' class TestCompressor_chunker(unittest.Te'
1296 1550 chunks1 = list(chunker.flush())
1297 1551
1298 1552 self.assertEqual(chunks1, [
1299 b'\x28\xb5\x2f\xfd\x00\x50\x8c\x00\x00\x30\x66\x6f\x6f\x62\x61\x72'
1553 b'\x28\xb5\x2f\xfd\x00\x58\x8c\x00\x00\x30\x66\x6f\x6f\x62\x61\x72'
1300 1554 b'\x02\x00\xfa\x03\xfe\xd0\x9f\xbe\x1b\x02'
1301 1555 ])
1302 1556
@@ -1326,7 +1580,7 b' class TestCompressor_chunker(unittest.Te'
1326 1580
1327 1581 with self.assertRaisesRegexp(
1328 1582 zstd.ZstdError,
1329 'cannot call compress\(\) after compression finished'):
1583 r'cannot call compress\(\) after compression finished'):
1330 1584 list(chunker.compress(b'foo'))
1331 1585
1332 1586 def test_flush_after_finish(self):
@@ -1338,7 +1592,7 b' class TestCompressor_chunker(unittest.Te'
1338 1592
1339 1593 with self.assertRaisesRegexp(
1340 1594 zstd.ZstdError,
1341 'cannot call flush\(\) after compression finished'):
1595 r'cannot call flush\(\) after compression finished'):
1342 1596 list(chunker.flush())
1343 1597
1344 1598 def test_finish_after_finish(self):
@@ -1350,7 +1604,7 b' class TestCompressor_chunker(unittest.Te'
1350 1604
1351 1605 with self.assertRaisesRegexp(
1352 1606 zstd.ZstdError,
1353 'cannot call finish\(\) after compression finished'):
1607 r'cannot call finish\(\) after compression finished'):
1354 1608 list(chunker.finish())
1355 1609
1356 1610
@@ -1358,6 +1612,9 b' class TestCompressor_multi_compress_to_b'
1358 1612 def test_invalid_inputs(self):
1359 1613 cctx = zstd.ZstdCompressor()
1360 1614
1615 if not hasattr(cctx, 'multi_compress_to_buffer'):
1616 self.skipTest('multi_compress_to_buffer not available')
1617
1361 1618 with self.assertRaises(TypeError):
1362 1619 cctx.multi_compress_to_buffer(True)
1363 1620
@@ -1370,6 +1627,9 b' class TestCompressor_multi_compress_to_b'
1370 1627 def test_empty_input(self):
1371 1628 cctx = zstd.ZstdCompressor()
1372 1629
1630 if not hasattr(cctx, 'multi_compress_to_buffer'):
1631 self.skipTest('multi_compress_to_buffer not available')
1632
1373 1633 with self.assertRaisesRegexp(ValueError, 'no source elements found'):
1374 1634 cctx.multi_compress_to_buffer([])
1375 1635
@@ -1379,6 +1639,9 b' class TestCompressor_multi_compress_to_b'
1379 1639 def test_list_input(self):
1380 1640 cctx = zstd.ZstdCompressor(write_checksum=True)
1381 1641
1642 if not hasattr(cctx, 'multi_compress_to_buffer'):
1643 self.skipTest('multi_compress_to_buffer not available')
1644
1382 1645 original = [b'foo' * 12, b'bar' * 6]
1383 1646 frames = [cctx.compress(c) for c in original]
1384 1647 b = cctx.multi_compress_to_buffer(original)
@@ -1394,6 +1657,9 b' class TestCompressor_multi_compress_to_b'
1394 1657 def test_buffer_with_segments_input(self):
1395 1658 cctx = zstd.ZstdCompressor(write_checksum=True)
1396 1659
1660 if not hasattr(cctx, 'multi_compress_to_buffer'):
1661 self.skipTest('multi_compress_to_buffer not available')
1662
1397 1663 original = [b'foo' * 4, b'bar' * 6]
1398 1664 frames = [cctx.compress(c) for c in original]
1399 1665
@@ -1412,6 +1678,9 b' class TestCompressor_multi_compress_to_b'
1412 1678 def test_buffer_with_segments_collection_input(self):
1413 1679 cctx = zstd.ZstdCompressor(write_checksum=True)
1414 1680
1681 if not hasattr(cctx, 'multi_compress_to_buffer'):
1682 self.skipTest('multi_compress_to_buffer not available')
1683
1415 1684 original = [
1416 1685 b'foo1',
1417 1686 b'foo2' * 2,
@@ -1449,6 +1718,9 b' class TestCompressor_multi_compress_to_b'
1449 1718
1450 1719 cctx = zstd.ZstdCompressor(write_checksum=True)
1451 1720
1721 if not hasattr(cctx, 'multi_compress_to_buffer'):
1722 self.skipTest('multi_compress_to_buffer not available')
1723
1452 1724 frames = []
1453 1725 frames.extend(b'x' * 64 for i in range(256))
1454 1726 frames.extend(b'y' * 64 for i in range(256))
@@ -12,6 +12,7 b' import zstandard as zstd'
12 12
13 13 from . common import (
14 14 make_cffi,
15 NonClosingBytesIO,
15 16 random_input_data,
16 17 )
17 18
@@ -19,6 +20,62 b' from . common import ('
19 20 @unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set')
20 21 @make_cffi
21 22 class TestCompressor_stream_reader_fuzzing(unittest.TestCase):
23 @hypothesis.settings(
24 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
25 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
26 level=strategies.integers(min_value=1, max_value=5),
27 source_read_size=strategies.integers(1, 16384),
28 read_size=strategies.integers(-1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE))
29 def test_stream_source_read(self, original, level, source_read_size,
30 read_size):
31 if read_size == 0:
32 read_size = -1
33
34 refctx = zstd.ZstdCompressor(level=level)
35 ref_frame = refctx.compress(original)
36
37 cctx = zstd.ZstdCompressor(level=level)
38 with cctx.stream_reader(io.BytesIO(original), size=len(original),
39 read_size=source_read_size) as reader:
40 chunks = []
41 while True:
42 chunk = reader.read(read_size)
43 if not chunk:
44 break
45
46 chunks.append(chunk)
47
48 self.assertEqual(b''.join(chunks), ref_frame)
49
50 @hypothesis.settings(
51 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
52 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
53 level=strategies.integers(min_value=1, max_value=5),
54 source_read_size=strategies.integers(1, 16384),
55 read_size=strategies.integers(-1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE))
56 def test_buffer_source_read(self, original, level, source_read_size,
57 read_size):
58 if read_size == 0:
59 read_size = -1
60
61 refctx = zstd.ZstdCompressor(level=level)
62 ref_frame = refctx.compress(original)
63
64 cctx = zstd.ZstdCompressor(level=level)
65 with cctx.stream_reader(original, size=len(original),
66 read_size=source_read_size) as reader:
67 chunks = []
68 while True:
69 chunk = reader.read(read_size)
70 if not chunk:
71 break
72
73 chunks.append(chunk)
74
75 self.assertEqual(b''.join(chunks), ref_frame)
76
77 @hypothesis.settings(
78 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
22 79 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
23 80 level=strategies.integers(min_value=1, max_value=5),
24 81 source_read_size=strategies.integers(1, 16384),
@@ -33,15 +90,17 b' class TestCompressor_stream_reader_fuzzi'
33 90 read_size=source_read_size) as reader:
34 91 chunks = []
35 92 while True:
36 read_size = read_sizes.draw(strategies.integers(1, 16384))
93 read_size = read_sizes.draw(strategies.integers(-1, 16384))
37 94 chunk = reader.read(read_size)
95 if not chunk and read_size:
96 break
38 97
39 if not chunk:
40 break
41 98 chunks.append(chunk)
42 99
43 100 self.assertEqual(b''.join(chunks), ref_frame)
44 101
102 @hypothesis.settings(
103 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
45 104 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
46 105 level=strategies.integers(min_value=1, max_value=5),
47 106 source_read_size=strategies.integers(1, 16384),
@@ -57,14 +116,343 b' class TestCompressor_stream_reader_fuzzi'
57 116 read_size=source_read_size) as reader:
58 117 chunks = []
59 118 while True:
119 read_size = read_sizes.draw(strategies.integers(-1, 16384))
120 chunk = reader.read(read_size)
121 if not chunk and read_size:
122 break
123
124 chunks.append(chunk)
125
126 self.assertEqual(b''.join(chunks), ref_frame)
127
128 @hypothesis.settings(
129 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
130 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
131 level=strategies.integers(min_value=1, max_value=5),
132 source_read_size=strategies.integers(1, 16384),
133 read_size=strategies.integers(1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE))
134 def test_stream_source_readinto(self, original, level,
135 source_read_size, read_size):
136 refctx = zstd.ZstdCompressor(level=level)
137 ref_frame = refctx.compress(original)
138
139 cctx = zstd.ZstdCompressor(level=level)
140 with cctx.stream_reader(io.BytesIO(original), size=len(original),
141 read_size=source_read_size) as reader:
142 chunks = []
143 while True:
144 b = bytearray(read_size)
145 count = reader.readinto(b)
146
147 if not count:
148 break
149
150 chunks.append(bytes(b[0:count]))
151
152 self.assertEqual(b''.join(chunks), ref_frame)
153
154 @hypothesis.settings(
155 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
156 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
157 level=strategies.integers(min_value=1, max_value=5),
158 source_read_size=strategies.integers(1, 16384),
159 read_size=strategies.integers(1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE))
160 def test_buffer_source_readinto(self, original, level,
161 source_read_size, read_size):
162
163 refctx = zstd.ZstdCompressor(level=level)
164 ref_frame = refctx.compress(original)
165
166 cctx = zstd.ZstdCompressor(level=level)
167 with cctx.stream_reader(original, size=len(original),
168 read_size=source_read_size) as reader:
169 chunks = []
170 while True:
171 b = bytearray(read_size)
172 count = reader.readinto(b)
173
174 if not count:
175 break
176
177 chunks.append(bytes(b[0:count]))
178
179 self.assertEqual(b''.join(chunks), ref_frame)
180
181 @hypothesis.settings(
182 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
183 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
184 level=strategies.integers(min_value=1, max_value=5),
185 source_read_size=strategies.integers(1, 16384),
186 read_sizes=strategies.data())
187 def test_stream_source_readinto_variance(self, original, level,
188 source_read_size, read_sizes):
189 refctx = zstd.ZstdCompressor(level=level)
190 ref_frame = refctx.compress(original)
191
192 cctx = zstd.ZstdCompressor(level=level)
193 with cctx.stream_reader(io.BytesIO(original), size=len(original),
194 read_size=source_read_size) as reader:
195 chunks = []
196 while True:
60 197 read_size = read_sizes.draw(strategies.integers(1, 16384))
61 chunk = reader.read(read_size)
198 b = bytearray(read_size)
199 count = reader.readinto(b)
200
201 if not count:
202 break
203
204 chunks.append(bytes(b[0:count]))
205
206 self.assertEqual(b''.join(chunks), ref_frame)
207
208 @hypothesis.settings(
209 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
210 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
211 level=strategies.integers(min_value=1, max_value=5),
212 source_read_size=strategies.integers(1, 16384),
213 read_sizes=strategies.data())
214 def test_buffer_source_readinto_variance(self, original, level,
215 source_read_size, read_sizes):
216
217 refctx = zstd.ZstdCompressor(level=level)
218 ref_frame = refctx.compress(original)
219
220 cctx = zstd.ZstdCompressor(level=level)
221 with cctx.stream_reader(original, size=len(original),
222 read_size=source_read_size) as reader:
223 chunks = []
224 while True:
225 read_size = read_sizes.draw(strategies.integers(1, 16384))
226 b = bytearray(read_size)
227 count = reader.readinto(b)
228
229 if not count:
230 break
231
232 chunks.append(bytes(b[0:count]))
233
234 self.assertEqual(b''.join(chunks), ref_frame)
235
236 @hypothesis.settings(
237 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
238 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
239 level=strategies.integers(min_value=1, max_value=5),
240 source_read_size=strategies.integers(1, 16384),
241 read_size=strategies.integers(-1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE))
242 def test_stream_source_read1(self, original, level, source_read_size,
243 read_size):
244 if read_size == 0:
245 read_size = -1
246
247 refctx = zstd.ZstdCompressor(level=level)
248 ref_frame = refctx.compress(original)
249
250 cctx = zstd.ZstdCompressor(level=level)
251 with cctx.stream_reader(io.BytesIO(original), size=len(original),
252 read_size=source_read_size) as reader:
253 chunks = []
254 while True:
255 chunk = reader.read1(read_size)
62 256 if not chunk:
63 257 break
258
64 259 chunks.append(chunk)
65 260
66 261 self.assertEqual(b''.join(chunks), ref_frame)
67 262
263 @hypothesis.settings(
264 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
265 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
266 level=strategies.integers(min_value=1, max_value=5),
267 source_read_size=strategies.integers(1, 16384),
268 read_size=strategies.integers(-1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE))
269 def test_buffer_source_read1(self, original, level, source_read_size,
270 read_size):
271 if read_size == 0:
272 read_size = -1
273
274 refctx = zstd.ZstdCompressor(level=level)
275 ref_frame = refctx.compress(original)
276
277 cctx = zstd.ZstdCompressor(level=level)
278 with cctx.stream_reader(original, size=len(original),
279 read_size=source_read_size) as reader:
280 chunks = []
281 while True:
282 chunk = reader.read1(read_size)
283 if not chunk:
284 break
285
286 chunks.append(chunk)
287
288 self.assertEqual(b''.join(chunks), ref_frame)
289
290 @hypothesis.settings(
291 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
292 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
293 level=strategies.integers(min_value=1, max_value=5),
294 source_read_size=strategies.integers(1, 16384),
295 read_sizes=strategies.data())
296 def test_stream_source_read1_variance(self, original, level, source_read_size,
297 read_sizes):
298 refctx = zstd.ZstdCompressor(level=level)
299 ref_frame = refctx.compress(original)
300
301 cctx = zstd.ZstdCompressor(level=level)
302 with cctx.stream_reader(io.BytesIO(original), size=len(original),
303 read_size=source_read_size) as reader:
304 chunks = []
305 while True:
306 read_size = read_sizes.draw(strategies.integers(-1, 16384))
307 chunk = reader.read1(read_size)
308 if not chunk and read_size:
309 break
310
311 chunks.append(chunk)
312
313 self.assertEqual(b''.join(chunks), ref_frame)
314
315 @hypothesis.settings(
316 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
317 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
318 level=strategies.integers(min_value=1, max_value=5),
319 source_read_size=strategies.integers(1, 16384),
320 read_sizes=strategies.data())
321 def test_buffer_source_read1_variance(self, original, level, source_read_size,
322 read_sizes):
323
324 refctx = zstd.ZstdCompressor(level=level)
325 ref_frame = refctx.compress(original)
326
327 cctx = zstd.ZstdCompressor(level=level)
328 with cctx.stream_reader(original, size=len(original),
329 read_size=source_read_size) as reader:
330 chunks = []
331 while True:
332 read_size = read_sizes.draw(strategies.integers(-1, 16384))
333 chunk = reader.read1(read_size)
334 if not chunk and read_size:
335 break
336
337 chunks.append(chunk)
338
339 self.assertEqual(b''.join(chunks), ref_frame)
340
341
342 @hypothesis.settings(
343 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
344 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
345 level=strategies.integers(min_value=1, max_value=5),
346 source_read_size=strategies.integers(1, 16384),
347 read_size=strategies.integers(1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE))
348 def test_stream_source_readinto1(self, original, level, source_read_size,
349 read_size):
350 if read_size == 0:
351 read_size = -1
352
353 refctx = zstd.ZstdCompressor(level=level)
354 ref_frame = refctx.compress(original)
355
356 cctx = zstd.ZstdCompressor(level=level)
357 with cctx.stream_reader(io.BytesIO(original), size=len(original),
358 read_size=source_read_size) as reader:
359 chunks = []
360 while True:
361 b = bytearray(read_size)
362 count = reader.readinto1(b)
363
364 if not count:
365 break
366
367 chunks.append(bytes(b[0:count]))
368
369 self.assertEqual(b''.join(chunks), ref_frame)
370
371 @hypothesis.settings(
372 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
373 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
374 level=strategies.integers(min_value=1, max_value=5),
375 source_read_size=strategies.integers(1, 16384),
376 read_size=strategies.integers(1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE))
377 def test_buffer_source_readinto1(self, original, level, source_read_size,
378 read_size):
379 if read_size == 0:
380 read_size = -1
381
382 refctx = zstd.ZstdCompressor(level=level)
383 ref_frame = refctx.compress(original)
384
385 cctx = zstd.ZstdCompressor(level=level)
386 with cctx.stream_reader(original, size=len(original),
387 read_size=source_read_size) as reader:
388 chunks = []
389 while True:
390 b = bytearray(read_size)
391 count = reader.readinto1(b)
392
393 if not count:
394 break
395
396 chunks.append(bytes(b[0:count]))
397
398 self.assertEqual(b''.join(chunks), ref_frame)
399
400 @hypothesis.settings(
401 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
402 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
403 level=strategies.integers(min_value=1, max_value=5),
404 source_read_size=strategies.integers(1, 16384),
405 read_sizes=strategies.data())
406 def test_stream_source_readinto1_variance(self, original, level, source_read_size,
407 read_sizes):
408 refctx = zstd.ZstdCompressor(level=level)
409 ref_frame = refctx.compress(original)
410
411 cctx = zstd.ZstdCompressor(level=level)
412 with cctx.stream_reader(io.BytesIO(original), size=len(original),
413 read_size=source_read_size) as reader:
414 chunks = []
415 while True:
416 read_size = read_sizes.draw(strategies.integers(1, 16384))
417 b = bytearray(read_size)
418 count = reader.readinto1(b)
419
420 if not count:
421 break
422
423 chunks.append(bytes(b[0:count]))
424
425 self.assertEqual(b''.join(chunks), ref_frame)
426
427 @hypothesis.settings(
428 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
429 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
430 level=strategies.integers(min_value=1, max_value=5),
431 source_read_size=strategies.integers(1, 16384),
432 read_sizes=strategies.data())
433 def test_buffer_source_readinto1_variance(self, original, level, source_read_size,
434 read_sizes):
435
436 refctx = zstd.ZstdCompressor(level=level)
437 ref_frame = refctx.compress(original)
438
439 cctx = zstd.ZstdCompressor(level=level)
440 with cctx.stream_reader(original, size=len(original),
441 read_size=source_read_size) as reader:
442 chunks = []
443 while True:
444 read_size = read_sizes.draw(strategies.integers(1, 16384))
445 b = bytearray(read_size)
446 count = reader.readinto1(b)
447
448 if not count:
449 break
450
451 chunks.append(bytes(b[0:count]))
452
453 self.assertEqual(b''.join(chunks), ref_frame)
454
455
68 456
69 457 @unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set')
70 458 @make_cffi
@@ -77,7 +465,7 b' class TestCompressor_stream_writer_fuzzi'
77 465 ref_frame = refctx.compress(original)
78 466
79 467 cctx = zstd.ZstdCompressor(level=level)
80 b = io.BytesIO()
468 b = NonClosingBytesIO()
81 469 with cctx.stream_writer(b, size=len(original), write_size=write_size) as compressor:
82 470 compressor.write(original)
83 471
@@ -219,6 +607,9 b' class TestCompressor_multi_compress_to_b'
219 607 write_checksum=True,
220 608 **kwargs)
221 609
610 if not hasattr(cctx, 'multi_compress_to_buffer'):
611 self.skipTest('multi_compress_to_buffer not available')
612
222 613 result = cctx.multi_compress_to_buffer(original, threads=-1)
223 614
224 615 self.assertEqual(len(result), len(original))
@@ -15,17 +15,17 b' class TestCompressionParameters(unittest'
15 15 chain_log=zstd.CHAINLOG_MIN,
16 16 hash_log=zstd.HASHLOG_MIN,
17 17 search_log=zstd.SEARCHLOG_MIN,
18 min_match=zstd.SEARCHLENGTH_MIN + 1,
18 min_match=zstd.MINMATCH_MIN + 1,
19 19 target_length=zstd.TARGETLENGTH_MIN,
20 compression_strategy=zstd.STRATEGY_FAST)
20 strategy=zstd.STRATEGY_FAST)
21 21
22 22 zstd.ZstdCompressionParameters(window_log=zstd.WINDOWLOG_MAX,
23 23 chain_log=zstd.CHAINLOG_MAX,
24 24 hash_log=zstd.HASHLOG_MAX,
25 25 search_log=zstd.SEARCHLOG_MAX,
26 min_match=zstd.SEARCHLENGTH_MAX - 1,
26 min_match=zstd.MINMATCH_MAX - 1,
27 27 target_length=zstd.TARGETLENGTH_MAX,
28 compression_strategy=zstd.STRATEGY_BTULTRA)
28 strategy=zstd.STRATEGY_BTULTRA2)
29 29
30 30 def test_from_level(self):
31 31 p = zstd.ZstdCompressionParameters.from_level(1)
@@ -43,7 +43,7 b' class TestCompressionParameters(unittest'
43 43 search_log=4,
44 44 min_match=5,
45 45 target_length=8,
46 compression_strategy=1)
46 strategy=1)
47 47 self.assertEqual(p.window_log, 10)
48 48 self.assertEqual(p.chain_log, 6)
49 49 self.assertEqual(p.hash_log, 7)
@@ -59,9 +59,10 b' class TestCompressionParameters(unittest'
59 59 self.assertEqual(p.threads, 4)
60 60
61 61 p = zstd.ZstdCompressionParameters(threads=2, job_size=1048576,
62 overlap_size_log=6)
62 overlap_log=6)
63 63 self.assertEqual(p.threads, 2)
64 64 self.assertEqual(p.job_size, 1048576)
65 self.assertEqual(p.overlap_log, 6)
65 66 self.assertEqual(p.overlap_size_log, 6)
66 67
67 68 p = zstd.ZstdCompressionParameters(compression_level=-1)
@@ -85,8 +86,9 b' class TestCompressionParameters(unittest'
85 86 p = zstd.ZstdCompressionParameters(ldm_bucket_size_log=7)
86 87 self.assertEqual(p.ldm_bucket_size_log, 7)
87 88
88 p = zstd.ZstdCompressionParameters(ldm_hash_every_log=8)
89 p = zstd.ZstdCompressionParameters(ldm_hash_rate_log=8)
89 90 self.assertEqual(p.ldm_hash_every_log, 8)
91 self.assertEqual(p.ldm_hash_rate_log, 8)
90 92
91 93 def test_estimated_compression_context_size(self):
92 94 p = zstd.ZstdCompressionParameters(window_log=20,
@@ -95,12 +97,44 b' class TestCompressionParameters(unittest'
95 97 search_log=1,
96 98 min_match=5,
97 99 target_length=16,
98 compression_strategy=zstd.STRATEGY_DFAST)
100 strategy=zstd.STRATEGY_DFAST)
99 101
100 102 # 32-bit has slightly different values from 64-bit.
101 103 self.assertAlmostEqual(p.estimated_compression_context_size(), 1294072,
102 104 delta=250)
103 105
106 def test_strategy(self):
107 with self.assertRaisesRegexp(ValueError, 'cannot specify both compression_strategy'):
108 zstd.ZstdCompressionParameters(strategy=0, compression_strategy=0)
109
110 p = zstd.ZstdCompressionParameters(strategy=2)
111 self.assertEqual(p.compression_strategy, 2)
112
113 p = zstd.ZstdCompressionParameters(strategy=3)
114 self.assertEqual(p.compression_strategy, 3)
115
116 def test_ldm_hash_rate_log(self):
117 with self.assertRaisesRegexp(ValueError, 'cannot specify both ldm_hash_rate_log'):
118 zstd.ZstdCompressionParameters(ldm_hash_rate_log=8, ldm_hash_every_log=4)
119
120 p = zstd.ZstdCompressionParameters(ldm_hash_rate_log=8)
121 self.assertEqual(p.ldm_hash_every_log, 8)
122
123 p = zstd.ZstdCompressionParameters(ldm_hash_every_log=16)
124 self.assertEqual(p.ldm_hash_every_log, 16)
125
126 def test_overlap_log(self):
127 with self.assertRaisesRegexp(ValueError, 'cannot specify both overlap_log'):
128 zstd.ZstdCompressionParameters(overlap_log=1, overlap_size_log=9)
129
130 p = zstd.ZstdCompressionParameters(overlap_log=2)
131 self.assertEqual(p.overlap_log, 2)
132 self.assertEqual(p.overlap_size_log, 2)
133
134 p = zstd.ZstdCompressionParameters(overlap_size_log=4)
135 self.assertEqual(p.overlap_log, 4)
136 self.assertEqual(p.overlap_size_log, 4)
137
104 138
105 139 @make_cffi
106 140 class TestFrameParameters(unittest.TestCase):
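The parameter-rename hunks above (SEARCHLENGTH_* to MINMATCH_*, compression_strategy to strategy, overlap_size_log to overlap_log, ldm_hash_every_log to ldm_hash_rate_log) keep the old spellings as aliases; a sketch of the resulting behavior, assuming the knobs can be combined freely:

    import zstandard as zstd

    params = zstd.ZstdCompressionParameters(strategy=zstd.STRATEGY_FAST,
                                            ldm_hash_rate_log=8,
                                            overlap_log=6)
    params.compression_strategy  # same value passed as strategy= above
    params.ldm_hash_every_log    # 8, via the deprecated alias
    params.overlap_size_log      # 6, via the deprecated alias

    # Mixing an old and a new spelling of the same knob raises ValueError, e.g.:
    # zstd.ZstdCompressionParameters(overlap_log=1, overlap_size_log=9)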
@@ -24,8 +24,8 b' s_hashlog = strategies.integers(min_valu'
24 24 max_value=zstd.HASHLOG_MAX)
25 25 s_searchlog = strategies.integers(min_value=zstd.SEARCHLOG_MIN,
26 26 max_value=zstd.SEARCHLOG_MAX)
27 s_searchlength = strategies.integers(min_value=zstd.SEARCHLENGTH_MIN,
28 max_value=zstd.SEARCHLENGTH_MAX)
27 s_minmatch = strategies.integers(min_value=zstd.MINMATCH_MIN,
28 max_value=zstd.MINMATCH_MAX)
29 29 s_targetlength = strategies.integers(min_value=zstd.TARGETLENGTH_MIN,
30 30 max_value=zstd.TARGETLENGTH_MAX)
31 31 s_strategy = strategies.sampled_from((zstd.STRATEGY_FAST,
@@ -35,41 +35,42 b' s_strategy = strategies.sampled_from((zs'
35 35 zstd.STRATEGY_LAZY2,
36 36 zstd.STRATEGY_BTLAZY2,
37 37 zstd.STRATEGY_BTOPT,
38 zstd.STRATEGY_BTULTRA))
38 zstd.STRATEGY_BTULTRA,
39 zstd.STRATEGY_BTULTRA2))
39 40
40 41
41 42 @make_cffi
42 43 @unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set')
43 44 class TestCompressionParametersHypothesis(unittest.TestCase):
44 45 @hypothesis.given(s_windowlog, s_chainlog, s_hashlog, s_searchlog,
45 s_searchlength, s_targetlength, s_strategy)
46 s_minmatch, s_targetlength, s_strategy)
46 47 def test_valid_init(self, windowlog, chainlog, hashlog, searchlog,
47 searchlength, targetlength, strategy):
48 minmatch, targetlength, strategy):
48 49 zstd.ZstdCompressionParameters(window_log=windowlog,
49 50 chain_log=chainlog,
50 51 hash_log=hashlog,
51 52 search_log=searchlog,
52 min_match=searchlength,
53 min_match=minmatch,
53 54 target_length=targetlength,
54 compression_strategy=strategy)
55 strategy=strategy)
55 56
56 57 @hypothesis.given(s_windowlog, s_chainlog, s_hashlog, s_searchlog,
57 s_searchlength, s_targetlength, s_strategy)
58 s_minmatch, s_targetlength, s_strategy)
58 59 def test_estimated_compression_context_size(self, windowlog, chainlog,
59 60 hashlog, searchlog,
60 searchlength, targetlength,
61 minmatch, targetlength,
61 62 strategy):
62 if searchlength == zstd.SEARCHLENGTH_MIN and strategy in (zstd.STRATEGY_FAST, zstd.STRATEGY_GREEDY):
63 searchlength += 1
64 elif searchlength == zstd.SEARCHLENGTH_MAX and strategy != zstd.STRATEGY_FAST:
65 searchlength -= 1
63 if minmatch == zstd.MINMATCH_MIN and strategy in (zstd.STRATEGY_FAST, zstd.STRATEGY_GREEDY):
64 minmatch += 1
65 elif minmatch == zstd.MINMATCH_MAX and strategy != zstd.STRATEGY_FAST:
66 minmatch -= 1
66 67
67 68 p = zstd.ZstdCompressionParameters(window_log=windowlog,
68 69 chain_log=chainlog,
69 70 hash_log=hashlog,
70 71 search_log=searchlog,
71 min_match=searchlength,
72 min_match=minmatch,
72 73 target_length=targetlength,
73 compression_strategy=strategy)
74 strategy=strategy)
74 75 size = p.estimated_compression_context_size()
75 76
@@ -3,6 +3,7 b' import os'
3 3 import random
4 4 import struct
5 5 import sys
6 import tempfile
6 7 import unittest
7 8
8 9 import zstandard as zstd
@@ -10,6 +11,7 b' import zstandard as zstd'
10 11 from .common import (
11 12 generate_samples,
12 13 make_cffi,
14 NonClosingBytesIO,
13 15 OpCountingBytesIO,
14 16 )
15 17
@@ -219,7 +221,7 b' class TestDecompressor_decompress(unitte'
219 221 cctx = zstd.ZstdCompressor(write_content_size=False)
220 222 frame = cctx.compress(source)
221 223
222 dctx = zstd.ZstdDecompressor(max_window_size=1)
224 dctx = zstd.ZstdDecompressor(max_window_size=2**zstd.WINDOWLOG_MIN)
223 225
224 226 with self.assertRaisesRegexp(
225 227 zstd.ZstdError, 'decompression error: Frame requires too much memory'):
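The rewritten test reflects that ``max_window_size`` must now be a window size zstd will actually honor; the smallest meaningful cap is ``2 ** zstd.WINDOWLOG_MIN`` bytes. A short sketch of the guard, assuming a frame whose window exceeds that cap:

    import zstandard as zstd

    data = b'x' * (1 << 20)
    frame = zstd.ZstdCompressor(write_content_size=False).compress(data)

    # Cap the decompression window at the smallest legal size (1 << WINDOWLOG_MIN).
    dctx = zstd.ZstdDecompressor(max_window_size=2 ** zstd.WINDOWLOG_MIN)

    try:
        dctx.decompress(frame, max_output_size=len(data))
    except zstd.ZstdError as e:
        print(e)  # decompression error: Frame requires too much memory ...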
@@ -302,19 +304,16 b' class TestDecompressor_stream_reader(uni'
302 304 dctx = zstd.ZstdDecompressor()
303 305
304 306 with dctx.stream_reader(b'foo') as reader:
305 with self.assertRaises(NotImplementedError):
307 with self.assertRaises(io.UnsupportedOperation):
306 308 reader.readline()
307 309
308 with self.assertRaises(NotImplementedError):
310 with self.assertRaises(io.UnsupportedOperation):
309 311 reader.readlines()
310 312
311 with self.assertRaises(NotImplementedError):
312 reader.readall()
313
314 with self.assertRaises(NotImplementedError):
313 with self.assertRaises(io.UnsupportedOperation):
315 314 iter(reader)
316 315
317 with self.assertRaises(NotImplementedError):
316 with self.assertRaises(io.UnsupportedOperation):
318 317 next(reader)
319 318
320 319 with self.assertRaises(io.UnsupportedOperation):
@@ -347,15 +346,18 b' class TestDecompressor_stream_reader(uni'
347 346 with self.assertRaisesRegexp(ValueError, 'stream is closed'):
348 347 reader.read(1)
349 348
350 def test_bad_read_size(self):
349 def test_read_sizes(self):
350 cctx = zstd.ZstdCompressor()
351 foo = cctx.compress(b'foo')
352
351 353 dctx = zstd.ZstdDecompressor()
352 354
353 with dctx.stream_reader(b'foo') as reader:
354 with self.assertRaisesRegexp(ValueError, 'cannot read negative or size 0 amounts'):
355 reader.read(-1)
355 with dctx.stream_reader(foo) as reader:
356 with self.assertRaisesRegexp(ValueError, 'cannot read negative amounts less than -1'):
357 reader.read(-2)
356 358
357 with self.assertRaisesRegexp(ValueError, 'cannot read negative or size 0 amounts'):
358 reader.read(0)
359 self.assertEqual(reader.read(0), b'')
360 self.assertEqual(reader.read(), b'foo')
359 361
360 362 def test_read_buffer(self):
361 363 cctx = zstd.ZstdCompressor()
@@ -524,13 +526,243 b' class TestDecompressor_stream_reader(uni'
524 526 reader = dctx.stream_reader(source)
525 527
526 528 with reader:
527 with self.assertRaises(TypeError):
528 reader.read()
529 reader.read(0)
529 530
530 531 with reader:
531 532 with self.assertRaisesRegexp(ValueError, 'stream is closed'):
532 533 reader.read(100)
533 534
535 def test_partial_read(self):
536 # Inspired by https://github.com/indygreg/python-zstandard/issues/71.
537 buffer = io.BytesIO()
538 cctx = zstd.ZstdCompressor()
539 writer = cctx.stream_writer(buffer)
540 writer.write(bytearray(os.urandom(1000000)))
541 writer.flush(zstd.FLUSH_FRAME)
542 buffer.seek(0)
543
544 dctx = zstd.ZstdDecompressor()
545 reader = dctx.stream_reader(buffer)
546
547 while True:
548 chunk = reader.read(8192)
549 if not chunk:
550 break
551
552 def test_read_multiple_frames(self):
553 cctx = zstd.ZstdCompressor()
554 source = io.BytesIO()
555 writer = cctx.stream_writer(source)
556 writer.write(b'foo')
557 writer.flush(zstd.FLUSH_FRAME)
558 writer.write(b'bar')
559 writer.flush(zstd.FLUSH_FRAME)
560
561 dctx = zstd.ZstdDecompressor()
562
563 reader = dctx.stream_reader(source.getvalue())
564 self.assertEqual(reader.read(2), b'fo')
565 self.assertEqual(reader.read(2), b'o')
566 self.assertEqual(reader.read(2), b'ba')
567 self.assertEqual(reader.read(2), b'r')
568
569 source.seek(0)
570 reader = dctx.stream_reader(source)
571 self.assertEqual(reader.read(2), b'fo')
572 self.assertEqual(reader.read(2), b'o')
573 self.assertEqual(reader.read(2), b'ba')
574 self.assertEqual(reader.read(2), b'r')
575
576 reader = dctx.stream_reader(source.getvalue())
577 self.assertEqual(reader.read(3), b'foo')
578 self.assertEqual(reader.read(3), b'bar')
579
580 source.seek(0)
581 reader = dctx.stream_reader(source)
582 self.assertEqual(reader.read(3), b'foo')
583 self.assertEqual(reader.read(3), b'bar')
584
585 reader = dctx.stream_reader(source.getvalue())
586 self.assertEqual(reader.read(4), b'foo')
587 self.assertEqual(reader.read(4), b'bar')
588
589 source.seek(0)
590 reader = dctx.stream_reader(source)
591 self.assertEqual(reader.read(4), b'foo')
592 self.assertEqual(reader.read(4), b'bar')
593
594 reader = dctx.stream_reader(source.getvalue())
595 self.assertEqual(reader.read(128), b'foo')
596 self.assertEqual(reader.read(128), b'bar')
597
598 source.seek(0)
599 reader = dctx.stream_reader(source)
600 self.assertEqual(reader.read(128), b'foo')
601 self.assertEqual(reader.read(128), b'bar')
602
603 # Now tests for reads spanning frames.
604 reader = dctx.stream_reader(source.getvalue(), read_across_frames=True)
605 self.assertEqual(reader.read(3), b'foo')
606 self.assertEqual(reader.read(3), b'bar')
607
608 source.seek(0)
609 reader = dctx.stream_reader(source, read_across_frames=True)
610 self.assertEqual(reader.read(3), b'foo')
611 self.assertEqual(reader.read(3), b'bar')
612
613 reader = dctx.stream_reader(source.getvalue(), read_across_frames=True)
614 self.assertEqual(reader.read(6), b'foobar')
615
616 source.seek(0)
617 reader = dctx.stream_reader(source, read_across_frames=True)
618 self.assertEqual(reader.read(6), b'foobar')
619
620 reader = dctx.stream_reader(source.getvalue(), read_across_frames=True)
621 self.assertEqual(reader.read(7), b'foobar')
622
623 source.seek(0)
624 reader = dctx.stream_reader(source, read_across_frames=True)
625 self.assertEqual(reader.read(7), b'foobar')
626
627 reader = dctx.stream_reader(source.getvalue(), read_across_frames=True)
628 self.assertEqual(reader.read(128), b'foobar')
629
630 source.seek(0)
631 reader = dctx.stream_reader(source, read_across_frames=True)
632 self.assertEqual(reader.read(128), b'foobar')
633
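The long test above pins down the new ``read_across_frames`` flag: by default a reader still stops at the first frame boundary, while opting in makes a multi-frame input read as one continuous stream. Condensed to the essentials:

    import io
    import zstandard as zstd

    buf = io.BytesIO()
    writer = zstd.ZstdCompressor().stream_writer(buf)
    writer.write(b'foo')
    writer.flush(zstd.FLUSH_FRAME)  # end first frame
    writer.write(b'bar')
    writer.flush(zstd.FLUSH_FRAME)  # end second frame

    dctx = zstd.ZstdDecompressor()

    # Default behavior: stop at the first frame boundary.
    assert dctx.stream_reader(buf.getvalue()).read(128) == b'foo'

    # Opt in to transparently continuing into the next frame.
    reader = dctx.stream_reader(buf.getvalue(), read_across_frames=True)
    assert reader.read(128) == b'foobar'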
634 def test_readinto(self):
635 cctx = zstd.ZstdCompressor()
636 foo = cctx.compress(b'foo')
637
638 dctx = zstd.ZstdDecompressor()
639
640 # Attempting to readinto() a non-writable buffer fails.
641 # The exact exception varies based on the backend.
642 reader = dctx.stream_reader(foo)
643 with self.assertRaises(Exception):
644 reader.readinto(b'foobar')
645
646 # readinto() with sufficiently large destination.
647 b = bytearray(1024)
648 reader = dctx.stream_reader(foo)
649 self.assertEqual(reader.readinto(b), 3)
650 self.assertEqual(b[0:3], b'foo')
651 self.assertEqual(reader.readinto(b), 0)
652 self.assertEqual(b[0:3], b'foo')
653
654 # readinto() with small reads.
655 b = bytearray(1024)
656 reader = dctx.stream_reader(foo, read_size=1)
657 self.assertEqual(reader.readinto(b), 3)
658 self.assertEqual(b[0:3], b'foo')
659
660 # Too small destination buffer.
661 b = bytearray(2)
662 reader = dctx.stream_reader(foo)
663 self.assertEqual(reader.readinto(b), 2)
664 self.assertEqual(b[:], b'fo')
665
666 def test_readinto1(self):
667 cctx = zstd.ZstdCompressor()
668 foo = cctx.compress(b'foo')
669
670 dctx = zstd.ZstdDecompressor()
671
672 reader = dctx.stream_reader(foo)
673 with self.assertRaises(Exception):
674 reader.readinto1(b'foobar')
675
676 # Sufficiently large destination.
677 b = bytearray(1024)
678 reader = dctx.stream_reader(foo)
679 self.assertEqual(reader.readinto1(b), 3)
680 self.assertEqual(b[0:3], b'foo')
681 self.assertEqual(reader.readinto1(b), 0)
682 self.assertEqual(b[0:3], b'foo')
683
684 # readinto1() with small reads.
685 b = bytearray(1024)
686 reader = dctx.stream_reader(foo, read_size=1)
687 self.assertEqual(reader.readinto1(b), 3)
688 self.assertEqual(b[0:3], b'foo')
689
690 # Too small destination buffer.
691 b = bytearray(2)
692 reader = dctx.stream_reader(foo)
693 self.assertEqual(reader.readinto1(b), 2)
694 self.assertEqual(b[:], b'fo')
695
696 def test_readall(self):
697 cctx = zstd.ZstdCompressor()
698 foo = cctx.compress(b'foo')
699
700 dctx = zstd.ZstdDecompressor()
701 reader = dctx.stream_reader(foo)
702
703 self.assertEqual(reader.readall(), b'foo')
704
705 def test_read1(self):
706 cctx = zstd.ZstdCompressor()
707 foo = cctx.compress(b'foo')
708
709 dctx = zstd.ZstdDecompressor()
710
711 b = OpCountingBytesIO(foo)
712 reader = dctx.stream_reader(b)
713
714 self.assertEqual(reader.read1(), b'foo')
715 self.assertEqual(b._read_count, 1)
716
717 b = OpCountingBytesIO(foo)
718 reader = dctx.stream_reader(b)
719
720 self.assertEqual(reader.read1(0), b'')
721 self.assertEqual(reader.read1(2), b'fo')
722 self.assertEqual(b._read_count, 1)
723 self.assertEqual(reader.read1(1), b'o')
724 self.assertEqual(b._read_count, 1)
725 self.assertEqual(reader.read1(1), b'')
726 self.assertEqual(b._read_count, 2)
727
728 def test_read_lines(self):
729 cctx = zstd.ZstdCompressor()
730 source = b'\n'.join(('line %d' % i).encode('ascii') for i in range(1024))
731
732 frame = cctx.compress(source)
733
734 dctx = zstd.ZstdDecompressor()
735 reader = dctx.stream_reader(frame)
736 tr = io.TextIOWrapper(reader, encoding='utf-8')
737
738 lines = []
739 for line in tr:
740 lines.append(line.encode('utf-8'))
741
742 self.assertEqual(len(lines), 1024)
743 self.assertEqual(b''.join(lines), source)
744
745 reader = dctx.stream_reader(frame)
746 tr = io.TextIOWrapper(reader, encoding='utf-8')
747
748 lines = tr.readlines()
749 self.assertEqual(len(lines), 1024)
750 self.assertEqual(''.join(lines).encode('utf-8'), source)
751
752 reader = dctx.stream_reader(frame)
753 tr = io.TextIOWrapper(reader, encoding='utf-8')
754
755 lines = []
756 while True:
757 line = tr.readline()
758 if not line:
759 break
760
761 lines.append(line.encode('utf-8'))
762
763 self.assertEqual(len(lines), 1024)
764 self.assertEqual(b''.join(lines), source)
765
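With ``readinto()``, ``readall()``, ``read1()`` and friends now implemented, the reader satisfies enough of ``io.RawIOBase`` to be wrapped by stdlib adapters directly, which is exactly what ``test_read_lines`` exercises. For instance:

    import io
    import zstandard as zstd

    frame = zstd.ZstdCompressor().compress(b'alpha\nbeta\ngamma\n')

    reader = zstd.ZstdDecompressor().stream_reader(frame)
    text = io.TextIOWrapper(reader, encoding='utf-8')

    # Line iteration decompresses on demand through the wrapper.
    assert [line.rstrip('\n') for line in text] == ['alpha', 'beta', 'gamma']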
534 766
535 767 @make_cffi
536 768 class TestDecompressor_decompressobj(unittest.TestCase):
@@ -540,6 +772,9 b' class TestDecompressor_decompressobj(uni'
540 772 dctx = zstd.ZstdDecompressor()
541 773 dobj = dctx.decompressobj()
542 774 self.assertEqual(dobj.decompress(data), b'foobar')
775 self.assertIsNone(dobj.flush())
776 self.assertIsNone(dobj.flush(10))
777 self.assertIsNone(dobj.flush(length=100))
543 778
544 779 def test_input_types(self):
545 780 compressed = zstd.ZstdCompressor(level=1).compress(b'foo')
@@ -557,7 +792,11 b' class TestDecompressor_decompressobj(uni'
557 792
558 793 for source in sources:
559 794 dobj = dctx.decompressobj()
795 self.assertIsNone(dobj.flush())
796 self.assertIsNone(dobj.flush(10))
797 self.assertIsNone(dobj.flush(length=100))
560 798 self.assertEqual(dobj.decompress(source), b'foo')
799 self.assertIsNone(dobj.flush())
561 800
562 801 def test_reuse(self):
563 802 data = zstd.ZstdCompressor(level=1).compress(b'foobar')
@@ -568,6 +807,7 b' class TestDecompressor_decompressobj(uni'
568 807
569 808 with self.assertRaisesRegexp(zstd.ZstdError, 'cannot use a decompressobj'):
570 809 dobj.decompress(data)
810 self.assertIsNone(dobj.flush())
571 811
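``ZstdDecompressionObj.flush()`` exists purely for zlib/bz2 API parity: it accepts the optional length argument, always returns ``None``, and is safe to call at any point, as the assertions above establish. A sketch:

    import zstandard as zstd

    frame = zstd.ZstdCompressor().compress(b'foobar')

    dobj = zstd.ZstdDecompressor().decompressobj()
    assert dobj.flush() is None            # legal even before any input
    assert dobj.decompress(frame) == b'foobar'
    assert dobj.flush(length=100) is None  # argument accepted, still a no-op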
572 812 def test_bad_write_size(self):
573 813 dctx = zstd.ZstdDecompressor()
@@ -585,16 +825,141 b' class TestDecompressor_decompressobj(uni'
585 825 dobj = dctx.decompressobj(write_size=i + 1)
586 826 self.assertEqual(dobj.decompress(data), source)
587 827
828
588 829 def decompress_via_writer(data):
589 830 buffer = io.BytesIO()
590 831 dctx = zstd.ZstdDecompressor()
591 with dctx.stream_writer(buffer) as decompressor:
592 decompressor.write(data)
832 decompressor = dctx.stream_writer(buffer)
833 decompressor.write(data)
834
593 835 return buffer.getvalue()
594 836
595 837
596 838 @make_cffi
597 839 class TestDecompressor_stream_writer(unittest.TestCase):
840 def test_io_api(self):
841 buffer = io.BytesIO()
842 dctx = zstd.ZstdDecompressor()
843 writer = dctx.stream_writer(buffer)
844
845 self.assertFalse(writer.closed)
846 self.assertFalse(writer.isatty())
847 self.assertFalse(writer.readable())
848
849 with self.assertRaises(io.UnsupportedOperation):
850 writer.readline()
851
852 with self.assertRaises(io.UnsupportedOperation):
853 writer.readline(42)
854
855 with self.assertRaises(io.UnsupportedOperation):
856 writer.readline(size=42)
857
858 with self.assertRaises(io.UnsupportedOperation):
859 writer.readlines()
860
861 with self.assertRaises(io.UnsupportedOperation):
862 writer.readlines(42)
863
864 with self.assertRaises(io.UnsupportedOperation):
865 writer.readlines(hint=42)
866
867 with self.assertRaises(io.UnsupportedOperation):
868 writer.seek(0)
869
870 with self.assertRaises(io.UnsupportedOperation):
871 writer.seek(10, os.SEEK_SET)
872
873 self.assertFalse(writer.seekable())
874
875 with self.assertRaises(io.UnsupportedOperation):
876 writer.tell()
877
878 with self.assertRaises(io.UnsupportedOperation):
879 writer.truncate()
880
881 with self.assertRaises(io.UnsupportedOperation):
882 writer.truncate(42)
883
884 with self.assertRaises(io.UnsupportedOperation):
885 writer.truncate(size=42)
886
887 self.assertTrue(writer.writable())
888
889 with self.assertRaises(io.UnsupportedOperation):
890 writer.writelines([])
891
892 with self.assertRaises(io.UnsupportedOperation):
893 writer.read()
894
895 with self.assertRaises(io.UnsupportedOperation):
896 writer.read(42)
897
898 with self.assertRaises(io.UnsupportedOperation):
899 writer.read(size=42)
900
901 with self.assertRaises(io.UnsupportedOperation):
902 writer.readall()
903
904 with self.assertRaises(io.UnsupportedOperation):
905 writer.readinto(None)
906
907 with self.assertRaises(io.UnsupportedOperation):
908 writer.fileno()
909
910 def test_fileno_file(self):
911 with tempfile.TemporaryFile('wb') as tf:
912 dctx = zstd.ZstdDecompressor()
913 writer = dctx.stream_writer(tf)
914
915 self.assertEqual(writer.fileno(), tf.fileno())
916
917 def test_close(self):
918 foo = zstd.ZstdCompressor().compress(b'foo')
919
920 buffer = NonClosingBytesIO()
921 dctx = zstd.ZstdDecompressor()
922 writer = dctx.stream_writer(buffer)
923
924 writer.write(foo)
925 self.assertFalse(writer.closed)
926 self.assertFalse(buffer.closed)
927 writer.close()
928 self.assertTrue(writer.closed)
929 self.assertTrue(buffer.closed)
930
931 with self.assertRaisesRegexp(ValueError, 'stream is closed'):
932 writer.write(b'')
933
934 with self.assertRaisesRegexp(ValueError, 'stream is closed'):
935 writer.flush()
936
937 with self.assertRaisesRegexp(ValueError, 'stream is closed'):
938 with writer:
939 pass
940
941 self.assertEqual(buffer.getvalue(), b'foo')
942
943 # Context manager exit should close stream.
944 buffer = NonClosingBytesIO()
945 writer = dctx.stream_writer(buffer)
946
947 with writer:
948 writer.write(foo)
949
950 self.assertTrue(writer.closed)
951 self.assertEqual(buffer.getvalue(), b'foo')
952
953 def test_flush(self):
954 buffer = OpCountingBytesIO()
955 dctx = zstd.ZstdDecompressor()
956 writer = dctx.stream_writer(buffer)
957
958 writer.flush()
959 self.assertEqual(buffer._flush_count, 1)
960 writer.flush()
961 self.assertEqual(buffer._flush_count, 2)
962
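The decompression writer now behaves like a real ``io.RawIOBase`` object: ``close()`` (and context-manager exit) also closes the wrapped stream, which is why these tests switch to the ``NonClosingBytesIO`` helper. When writing into a plain ``BytesIO``, capture the result before the writer closes it:

    import io
    import zstandard as zstd

    frame = zstd.ZstdCompressor().compress(b'foo')

    buffer = io.BytesIO()
    dctx = zstd.ZstdDecompressor()
    with dctx.stream_writer(buffer) as writer:
        writer.write(frame)
        result = buffer.getvalue()  # grab output before __exit__ closes buffer

    assert result == b'foo'
    assert writer.closed and buffer.closed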
598 963 def test_empty_roundtrip(self):
599 964 cctx = zstd.ZstdCompressor()
600 965 empty = cctx.compress(b'')
@@ -616,9 +981,21 b' class TestDecompressor_stream_writer(uni'
616 981 dctx = zstd.ZstdDecompressor()
617 982 for source in sources:
618 983 buffer = io.BytesIO()
984
985 decompressor = dctx.stream_writer(buffer)
986 decompressor.write(source)
987 self.assertEqual(buffer.getvalue(), b'foo')
988
989 buffer = NonClosingBytesIO()
990
619 991 with dctx.stream_writer(buffer) as decompressor:
620 decompressor.write(source)
992 self.assertEqual(decompressor.write(source), 3)
993
994 self.assertEqual(buffer.getvalue(), b'foo')
621 995
996 buffer = io.BytesIO()
997 writer = dctx.stream_writer(buffer, write_return_read=True)
998 self.assertEqual(writer.write(source), len(source))
622 999 self.assertEqual(buffer.getvalue(), b'foo')
623 1000
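``write()`` historically returned the number of decompressed bytes pushed to the inner stream; the new ``write_return_read=True`` mode instead returns how much of the input was consumed, matching ``io.RawIOBase.write()`` semantics:

    import io
    import zstandard as zstd

    frame = zstd.ZstdCompressor().compress(b'foo' * 1000)

    buffer = io.BytesIO()
    writer = zstd.ZstdDecompressor().stream_writer(buffer, write_return_read=True)

    assert writer.write(frame) == len(frame)  # compressed input bytes consumed
    assert buffer.getvalue() == b'foo' * 1000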
624 1001 def test_large_roundtrip(self):
@@ -641,7 +1018,7 b' class TestDecompressor_stream_writer(uni'
641 1018 cctx = zstd.ZstdCompressor()
642 1019 compressed = cctx.compress(orig)
643 1020
644 buffer = io.BytesIO()
1021 buffer = NonClosingBytesIO()
645 1022 dctx = zstd.ZstdDecompressor()
646 1023 with dctx.stream_writer(buffer) as decompressor:
647 1024 pos = 0
@@ -651,6 +1028,17 b' class TestDecompressor_stream_writer(uni'
651 1028 pos += 8192
652 1029 self.assertEqual(buffer.getvalue(), orig)
653 1030
1031 # Again with write_return_read=True
1032 buffer = io.BytesIO()
1033 writer = dctx.stream_writer(buffer, write_return_read=True)
1034 pos = 0
1035 while pos < len(compressed):
1036 pos2 = pos + 8192
1037 chunk = compressed[pos:pos2]
1038 self.assertEqual(writer.write(chunk), len(chunk))
1039 pos += 8192
1040 self.assertEqual(buffer.getvalue(), orig)
1041
654 1042 def test_dictionary(self):
655 1043 samples = []
656 1044 for i in range(128):
@@ -661,7 +1049,7 b' class TestDecompressor_stream_writer(uni'
661 1049 d = zstd.train_dictionary(8192, samples)
662 1050
663 1051 orig = b'foobar' * 16384
664 buffer = io.BytesIO()
1052 buffer = NonClosingBytesIO()
665 1053 cctx = zstd.ZstdCompressor(dict_data=d)
666 1054 with cctx.stream_writer(buffer) as compressor:
667 1055 self.assertEqual(compressor.write(orig), 0)
@@ -670,6 +1058,12 b' class TestDecompressor_stream_writer(uni'
670 1058 buffer = io.BytesIO()
671 1059
672 1060 dctx = zstd.ZstdDecompressor(dict_data=d)
1061 decompressor = dctx.stream_writer(buffer)
1062 self.assertEqual(decompressor.write(compressed), len(orig))
1063 self.assertEqual(buffer.getvalue(), orig)
1064
1065 buffer = NonClosingBytesIO()
1066
673 1067 with dctx.stream_writer(buffer) as decompressor:
674 1068 self.assertEqual(decompressor.write(compressed), len(orig))
675 1069
@@ -678,6 +1072,11 b' class TestDecompressor_stream_writer(uni'
678 1072 def test_memory_size(self):
679 1073 dctx = zstd.ZstdDecompressor()
680 1074 buffer = io.BytesIO()
1075
1076 decompressor = dctx.stream_writer(buffer)
1077 size = decompressor.memory_size()
1078 self.assertGreater(size, 100000)
1079
681 1080 with dctx.stream_writer(buffer) as decompressor:
682 1081 size = decompressor.memory_size()
683 1082
@@ -810,7 +1209,7 b' class TestDecompressor_read_to_iter(unit'
810 1209 @unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set')
811 1210 def test_large_input(self):
812 1211 bytes = list(struct.Struct('>B').pack(i) for i in range(256))
813 compressed = io.BytesIO()
1212 compressed = NonClosingBytesIO()
814 1213 input_size = 0
815 1214 cctx = zstd.ZstdCompressor(level=1)
816 1215 with cctx.stream_writer(compressed) as compressor:
@@ -823,7 +1222,7 b' class TestDecompressor_read_to_iter(unit'
823 1222 if have_compressed and have_raw:
824 1223 break
825 1224
826 compressed.seek(0)
1225 compressed = io.BytesIO(compressed.getvalue())
827 1226 self.assertGreater(len(compressed.getvalue()),
828 1227 zstd.DECOMPRESSION_RECOMMENDED_INPUT_SIZE)
829 1228
@@ -861,7 +1260,7 b' class TestDecompressor_read_to_iter(unit'
861 1260
862 1261 source = io.BytesIO()
863 1262
864 compressed = io.BytesIO()
1263 compressed = NonClosingBytesIO()
865 1264 with cctx.stream_writer(compressed) as compressor:
866 1265 for i in range(256):
867 1266 chunk = b'\0' * 1024
@@ -874,7 +1273,7 b' class TestDecompressor_read_to_iter(unit'
874 1273 max_output_size=len(source.getvalue()))
875 1274 self.assertEqual(simple, source.getvalue())
876 1275
877 compressed.seek(0)
1276 compressed = io.BytesIO(compressed.getvalue())
878 1277 streamed = b''.join(dctx.read_to_iter(compressed))
879 1278 self.assertEqual(streamed, source.getvalue())
880 1279
@@ -1001,6 +1400,9 b' class TestDecompressor_multi_decompress_'
1001 1400 def test_invalid_inputs(self):
1002 1401 dctx = zstd.ZstdDecompressor()
1003 1402
1403 if not hasattr(dctx, 'multi_decompress_to_buffer'):
1404 self.skipTest('multi_decompress_to_buffer not available')
1405
1004 1406 with self.assertRaises(TypeError):
1005 1407 dctx.multi_decompress_to_buffer(True)
1006 1408
@@ -1020,6 +1422,10 b' class TestDecompressor_multi_decompress_'
1020 1422 frames = [cctx.compress(d) for d in original]
1021 1423
1022 1424 dctx = zstd.ZstdDecompressor()
1425
1426 if not hasattr(dctx, 'multi_decompress_to_buffer'):
1427 self.skipTest('multi_decompress_to_buffer not available')
1428
1023 1429 result = dctx.multi_decompress_to_buffer(frames)
1024 1430
1025 1431 self.assertEqual(len(result), len(frames))
@@ -1041,6 +1447,10 b' class TestDecompressor_multi_decompress_'
1041 1447 sizes = struct.pack('=' + 'Q' * len(original), *map(len, original))
1042 1448
1043 1449 dctx = zstd.ZstdDecompressor()
1450
1451 if not hasattr(dctx, 'multi_decompress_to_buffer'):
1452 self.skipTest('multi_decompress_to_buffer not available')
1453
1044 1454 result = dctx.multi_decompress_to_buffer(frames, decompressed_sizes=sizes)
1045 1455
1046 1456 self.assertEqual(len(result), len(frames))
@@ -1057,6 +1467,9 b' class TestDecompressor_multi_decompress_'
1057 1467
1058 1468 dctx = zstd.ZstdDecompressor()
1059 1469
1470 if not hasattr(dctx, 'multi_decompress_to_buffer'):
1471 self.skipTest('multi_decompress_to_buffer not available')
1472
1060 1473 segments = struct.pack('=QQQQ', 0, len(frames[0]), len(frames[0]), len(frames[1]))
1061 1474 b = zstd.BufferWithSegments(b''.join(frames), segments)
1062 1475
@@ -1074,12 +1487,16 b' class TestDecompressor_multi_decompress_'
1074 1487 frames = [cctx.compress(d) for d in original]
1075 1488 sizes = struct.pack('=' + 'Q' * len(original), *map(len, original))
1076 1489
1490 dctx = zstd.ZstdDecompressor()
1491
1492 if not hasattr(dctx, 'multi_decompress_to_buffer'):
1493 self.skipTest('multi_decompress_to_buffer not available')
1494
1077 1495 segments = struct.pack('=QQQQQQ', 0, len(frames[0]),
1078 1496 len(frames[0]), len(frames[1]),
1079 1497 len(frames[0]) + len(frames[1]), len(frames[2]))
1080 1498 b = zstd.BufferWithSegments(b''.join(frames), segments)
1081 1499
1082 dctx = zstd.ZstdDecompressor()
1083 1500 result = dctx.multi_decompress_to_buffer(b, decompressed_sizes=sizes)
1084 1501
1085 1502 self.assertEqual(len(result), len(frames))
@@ -1099,10 +1516,14 b' class TestDecompressor_multi_decompress_'
1099 1516 b'foo4' * 6,
1100 1517 ]
1101 1518
1519 if not hasattr(cctx, 'multi_compress_to_buffer'):
1520 self.skipTest('multi_compress_to_buffer not available')
1521
1102 1522 frames = cctx.multi_compress_to_buffer(original)
1103 1523
1104 1524 # Check round trip.
1105 1525 dctx = zstd.ZstdDecompressor()
1526
1106 1527 decompressed = dctx.multi_decompress_to_buffer(frames, threads=3)
1107 1528
1108 1529 self.assertEqual(len(decompressed), len(original))
@@ -1138,7 +1559,12 b' class TestDecompressor_multi_decompress_'
1138 1559 frames = [cctx.compress(s) for s in generate_samples()]
1139 1560
1140 1561 dctx = zstd.ZstdDecompressor(dict_data=d)
1562
1563 if not hasattr(dctx, 'multi_decompress_to_buffer'):
1564 self.skipTest('multi_decompress_to_buffer not available')
1565
1141 1566 result = dctx.multi_decompress_to_buffer(frames)
1567
1142 1568 self.assertEqual([o.tobytes() for o in result], generate_samples())
1143 1569
1144 1570 def test_multiple_threads(self):
@@ -1149,6 +1575,10 b' class TestDecompressor_multi_decompress_'
1149 1575 frames.extend(cctx.compress(b'y' * 64) for i in range(256))
1150 1576
1151 1577 dctx = zstd.ZstdDecompressor()
1578
1579 if not hasattr(dctx, 'multi_decompress_to_buffer'):
1580 self.skipTest('multi_decompress_to_buffer not available')
1581
1152 1582 result = dctx.multi_decompress_to_buffer(frames, threads=-1)
1153 1583
1154 1584 self.assertEqual(len(result), len(frames))
@@ -1164,6 +1594,9 b' class TestDecompressor_multi_decompress_'
1164 1594
1165 1595 dctx = zstd.ZstdDecompressor()
1166 1596
1597 if not hasattr(dctx, 'multi_decompress_to_buffer'):
1598 self.skipTest('multi_decompress_to_buffer not available')
1599
1167 1600 with self.assertRaisesRegexp(zstd.ZstdError,
1168 1601 'error decompressing item 1: ('
1169 1602 'Corrupted block|'
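The repeated ``hasattr`` guards added throughout this class exist because ``multi_compress_to_buffer()`` and ``multi_decompress_to_buffer()`` are only implemented by the C-extension backend; the CFFI backend (e.g. under PyPy) lacks them. Callers should feature-detect the same way:

    import zstandard as zstd

    cctx = zstd.ZstdCompressor()
    dctx = zstd.ZstdDecompressor()

    if hasattr(dctx, 'multi_decompress_to_buffer'):
        frames = [cctx.compress(b'foo'), cctx.compress(b'bar')]
        result = dctx.multi_decompress_to_buffer(frames)
        assert [seg.tobytes() for seg in result] == [b'foo', b'bar']
    else:
        # CFFI backend: fall back to decompressing one frame at a time.
        pass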
@@ -12,6 +12,7 b' import zstandard as zstd'
12 12
13 13 from . common import (
14 14 make_cffi,
15 NonClosingBytesIO,
15 16 random_input_data,
16 17 )
17 18
@@ -23,22 +24,200 b' class TestDecompressor_stream_reader_fuz'
23 24 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
24 25 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
25 26 level=strategies.integers(min_value=1, max_value=5),
26 source_read_size=strategies.integers(1, 16384),
27 streaming=strategies.booleans(),
28 source_read_size=strategies.integers(1, 1048576),
27 29 read_sizes=strategies.data())
28 def test_stream_source_read_variance(self, original, level, source_read_size,
29 read_sizes):
30 def test_stream_source_read_variance(self, original, level, streaming,
31 source_read_size, read_sizes):
30 32 cctx = zstd.ZstdCompressor(level=level)
31 frame = cctx.compress(original)
33
34 if streaming:
35 source = io.BytesIO()
36 writer = cctx.stream_writer(source)
37 writer.write(original)
38 writer.flush(zstd.FLUSH_FRAME)
39 source.seek(0)
40 else:
41 frame = cctx.compress(original)
42 source = io.BytesIO(frame)
32 43
33 44 dctx = zstd.ZstdDecompressor()
34 source = io.BytesIO(frame)
35 45
36 46 chunks = []
37 47 with dctx.stream_reader(source, read_size=source_read_size) as reader:
38 48 while True:
39 read_size = read_sizes.draw(strategies.integers(1, 16384))
49 read_size = read_sizes.draw(strategies.integers(-1, 131072))
50 chunk = reader.read(read_size)
51 if not chunk and read_size:
52 break
53
54 chunks.append(chunk)
55
56 self.assertEqual(b''.join(chunks), original)
57
58 # Similar to above except we have a constant read() size.
59 @hypothesis.settings(
60 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
61 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
62 level=strategies.integers(min_value=1, max_value=5),
63 streaming=strategies.booleans(),
64 source_read_size=strategies.integers(1, 1048576),
65 read_size=strategies.integers(-1, 131072))
66 def test_stream_source_read_size(self, original, level, streaming,
67 source_read_size, read_size):
68 if read_size == 0:
69 read_size = 1
70
71 cctx = zstd.ZstdCompressor(level=level)
72
73 if streaming:
74 source = io.BytesIO()
75 writer = cctx.stream_writer(source)
76 writer.write(original)
77 writer.flush(zstd.FLUSH_FRAME)
78 source.seek(0)
79 else:
80 frame = cctx.compress(original)
81 source = io.BytesIO(frame)
82
83 dctx = zstd.ZstdDecompressor()
84
85 chunks = []
86 reader = dctx.stream_reader(source, read_size=source_read_size)
87 while True:
88 chunk = reader.read(read_size)
89 if not chunk and read_size:
90 break
91
92 chunks.append(chunk)
93
94 self.assertEqual(b''.join(chunks), original)
95
96 @hypothesis.settings(
97 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
98 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
99 level=strategies.integers(min_value=1, max_value=5),
100 streaming=strategies.booleans(),
101 source_read_size=strategies.integers(1, 1048576),
102 read_sizes=strategies.data())
103 def test_buffer_source_read_variance(self, original, level, streaming,
104 source_read_size, read_sizes):
105 cctx = zstd.ZstdCompressor(level=level)
106
107 if streaming:
108 source = io.BytesIO()
109 writer = cctx.stream_writer(source)
110 writer.write(original)
111 writer.flush(zstd.FLUSH_FRAME)
112 frame = source.getvalue()
113 else:
114 frame = cctx.compress(original)
115
116 dctx = zstd.ZstdDecompressor()
117 chunks = []
118
119 with dctx.stream_reader(frame, read_size=source_read_size) as reader:
120 while True:
121 read_size = read_sizes.draw(strategies.integers(-1, 131072))
40 122 chunk = reader.read(read_size)
41 if not chunk:
123 if not chunk and read_size:
124 break
125
126 chunks.append(chunk)
127
128 self.assertEqual(b''.join(chunks), original)
129
130 # Similar to above except we have a constant read() size.
131 @hypothesis.settings(
132 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
133 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
134 level=strategies.integers(min_value=1, max_value=5),
135 streaming=strategies.booleans(),
136 source_read_size=strategies.integers(1, 1048576),
137 read_size=strategies.integers(-1, 131072))
138 def test_buffer_source_constant_read_size(self, original, level, streaming,
139 source_read_size, read_size):
140 if read_size == 0:
141 read_size = -1
142
143 cctx = zstd.ZstdCompressor(level=level)
144
145 if streaming:
146 source = io.BytesIO()
147 writer = cctx.stream_writer(source)
148 writer.write(original)
149 writer.flush(zstd.FLUSH_FRAME)
150 frame = source.getvalue()
151 else:
152 frame = cctx.compress(original)
153
154 dctx = zstd.ZstdDecompressor()
155 chunks = []
156
157 reader = dctx.stream_reader(frame, read_size=source_read_size)
158 while True:
159 chunk = reader.read(read_size)
160 if not chunk and read_size:
161 break
162
163 chunks.append(chunk)
164
165 self.assertEqual(b''.join(chunks), original)
166
167 @hypothesis.settings(
168 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
169 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
170 level=strategies.integers(min_value=1, max_value=5),
171 streaming=strategies.booleans(),
172 source_read_size=strategies.integers(1, 1048576))
173 def test_stream_source_readall(self, original, level, streaming,
174 source_read_size):
175 cctx = zstd.ZstdCompressor(level=level)
176
177 if streaming:
178 source = io.BytesIO()
179 writer = cctx.stream_writer(source)
180 writer.write(original)
181 writer.flush(zstd.FLUSH_FRAME)
182 source.seek(0)
183 else:
184 frame = cctx.compress(original)
185 source = io.BytesIO(frame)
186
187 dctx = zstd.ZstdDecompressor()
188
189 data = dctx.stream_reader(source, read_size=source_read_size).readall()
190 self.assertEqual(data, original)
191
192 @hypothesis.settings(
193 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
194 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
195 level=strategies.integers(min_value=1, max_value=5),
196 streaming=strategies.booleans(),
197 source_read_size=strategies.integers(1, 1048576),
198 read_sizes=strategies.data())
199 def test_stream_source_read1_variance(self, original, level, streaming,
200 source_read_size, read_sizes):
201 cctx = zstd.ZstdCompressor(level=level)
202
203 if streaming:
204 source = io.BytesIO()
205 writer = cctx.stream_writer(source)
206 writer.write(original)
207 writer.flush(zstd.FLUSH_FRAME)
208 source.seek(0)
209 else:
210 frame = cctx.compress(original)
211 source = io.BytesIO(frame)
212
213 dctx = zstd.ZstdDecompressor()
214
215 chunks = []
216 with dctx.stream_reader(source, read_size=source_read_size) as reader:
217 while True:
218 read_size = read_sizes.draw(strategies.integers(-1, 131072))
219 chunk = reader.read1(read_size)
220 if not chunk and read_size:
42 221 break
43 222
44 223 chunks.append(chunk)
@@ -49,24 +228,36 b' class TestDecompressor_stream_reader_fuz'
49 228 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
50 229 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
51 230 level=strategies.integers(min_value=1, max_value=5),
52 source_read_size=strategies.integers(1, 16384),
231 streaming=strategies.booleans(),
232 source_read_size=strategies.integers(1, 1048576),
53 233 read_sizes=strategies.data())
54 def test_buffer_source_read_variance(self, original, level, source_read_size,
55 read_sizes):
234 def test_stream_source_readinto1_variance(self, original, level, streaming,
235 source_read_size, read_sizes):
56 236 cctx = zstd.ZstdCompressor(level=level)
57 frame = cctx.compress(original)
237
238 if streaming:
239 source = io.BytesIO()
240 writer = cctx.stream_writer(source)
241 writer.write(original)
242 writer.flush(zstd.FLUSH_FRAME)
243 source.seek(0)
244 else:
245 frame = cctx.compress(original)
246 source = io.BytesIO(frame)
58 247
59 248 dctx = zstd.ZstdDecompressor()
249
60 250 chunks = []
61
62 with dctx.stream_reader(frame, read_size=source_read_size) as reader:
251 with dctx.stream_reader(source, read_size=source_read_size) as reader:
63 252 while True:
64 read_size = read_sizes.draw(strategies.integers(1, 16384))
65 chunk = reader.read(read_size)
66 if not chunk:
253 read_size = read_sizes.draw(strategies.integers(1, 131072))
254 b = bytearray(read_size)
255 count = reader.readinto1(b)
256
257 if not count:
67 258 break
68 259
69 chunks.append(chunk)
260 chunks.append(bytes(b[0:count]))
70 261
71 262 self.assertEqual(b''.join(chunks), original)
72 263
@@ -75,7 +266,7 b' class TestDecompressor_stream_reader_fuz'
75 266 @hypothesis.given(
76 267 original=strategies.sampled_from(random_input_data()),
77 268 level=strategies.integers(min_value=1, max_value=5),
78 source_read_size=strategies.integers(1, 16384),
269 source_read_size=strategies.integers(1, 1048576),
79 270 seek_amounts=strategies.data(),
80 271 read_sizes=strategies.data())
81 272 def test_relative_seeks(self, original, level, source_read_size, seek_amounts,
@@ -99,6 +290,46 b' class TestDecompressor_stream_reader_fuz'
99 290
100 291 self.assertEqual(original[offset:offset + len(chunk)], chunk)
101 292
293 @hypothesis.settings(
294 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
295 @hypothesis.given(
296 originals=strategies.data(),
297 frame_count=strategies.integers(min_value=2, max_value=10),
298 level=strategies.integers(min_value=1, max_value=5),
299 source_read_size=strategies.integers(1, 1048576),
300 read_sizes=strategies.data())
301 def test_multiple_frames(self, originals, frame_count, level,
302 source_read_size, read_sizes):
303
304 cctx = zstd.ZstdCompressor(level=level)
305 source = io.BytesIO()
306 buffer = io.BytesIO()
307 writer = cctx.stream_writer(buffer)
308
309 for i in range(frame_count):
310 data = originals.draw(strategies.sampled_from(random_input_data()))
311 source.write(data)
312 writer.write(data)
313 writer.flush(zstd.FLUSH_FRAME)
314
315 dctx = zstd.ZstdDecompressor()
316 buffer.seek(0)
317 reader = dctx.stream_reader(buffer, read_size=source_read_size,
318 read_across_frames=True)
319
320 chunks = []
321
322 while True:
323 read_amount = read_sizes.draw(strategies.integers(-1, 16384))
324 chunk = reader.read(read_amount)
325
326 if not chunk and read_amount:
327 break
328
329 chunks.append(chunk)
330
331 self.assertEqual(source.getvalue(), b''.join(chunks))
332
102 333
103 334 @unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set')
104 335 @make_cffi
@@ -113,7 +344,7 b' class TestDecompressor_stream_writer_fuz'
113 344
114 345 dctx = zstd.ZstdDecompressor()
115 346 source = io.BytesIO(frame)
116 dest = io.BytesIO()
347 dest = NonClosingBytesIO()
117 348
118 349 with dctx.stream_writer(dest, write_size=write_size) as decompressor:
119 350 while True:
@@ -234,10 +465,12 b' class TestDecompressor_multi_decompress_'
234 465 write_checksum=True,
235 466 **kwargs)
236 467
468 if not hasattr(cctx, 'multi_compress_to_buffer'):
469 self.skipTest('multi_compress_to_buffer not available')
470
237 471 frames_buffer = cctx.multi_compress_to_buffer(original, threads=-1)
238 472
239 473 dctx = zstd.ZstdDecompressor(**kwargs)
240
241 474 result = dctx.multi_decompress_to_buffer(frames_buffer)
242 475
243 476 self.assertEqual(len(result), len(original))
@@ -12,9 +12,9 b' from . common import ('
12 12 @make_cffi
13 13 class TestModuleAttributes(unittest.TestCase):
14 14 def test_version(self):
15 self.assertEqual(zstd.ZSTD_VERSION, (1, 3, 6))
15 self.assertEqual(zstd.ZSTD_VERSION, (1, 3, 8))
16 16
17 self.assertEqual(zstd.__version__, '0.10.1')
17 self.assertEqual(zstd.__version__, '0.11.0')
18 18
19 19 def test_constants(self):
20 20 self.assertEqual(zstd.MAX_COMPRESSION_LEVEL, 22)
@@ -29,6 +29,8 b' class TestModuleAttributes(unittest.Test'
29 29 'DECOMPRESSION_RECOMMENDED_INPUT_SIZE',
30 30 'DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE',
31 31 'MAGIC_NUMBER',
32 'FLUSH_BLOCK',
33 'FLUSH_FRAME',
32 34 'BLOCKSIZELOG_MAX',
33 35 'BLOCKSIZE_MAX',
34 36 'WINDOWLOG_MIN',
@@ -38,6 +40,8 b' class TestModuleAttributes(unittest.Test'
38 40 'HASHLOG_MIN',
39 41 'HASHLOG_MAX',
40 42 'HASHLOG3_MAX',
43 'MINMATCH_MIN',
44 'MINMATCH_MAX',
41 45 'SEARCHLOG_MIN',
42 46 'SEARCHLOG_MAX',
43 47 'SEARCHLENGTH_MIN',
@@ -55,6 +59,7 b' class TestModuleAttributes(unittest.Test'
55 59 'STRATEGY_BTLAZY2',
56 60 'STRATEGY_BTOPT',
57 61 'STRATEGY_BTULTRA',
62 'STRATEGY_BTULTRA2',
58 63 'DICT_TYPE_AUTO',
59 64 'DICT_TYPE_RAWCONTENT',
60 65 'DICT_TYPE_FULLDICT',
@@ -35,31 +35,31 b" if _module_policy == 'default':"
35 35 from zstd import *
36 36 backend = 'cext'
37 37 elif platform.python_implementation() in ('PyPy',):
38 from zstd_cffi import *
38 from .cffi import *
39 39 backend = 'cffi'
40 40 else:
41 41 try:
42 42 from zstd import *
43 43 backend = 'cext'
44 44 except ImportError:
45 from zstd_cffi import *
45 from .cffi import *
46 46 backend = 'cffi'
47 47 elif _module_policy == 'cffi_fallback':
48 48 try:
49 49 from zstd import *
50 50 backend = 'cext'
51 51 except ImportError:
52 from zstd_cffi import *
52 from .cffi import *
53 53 backend = 'cffi'
54 54 elif _module_policy == 'cext':
55 55 from zstd import *
56 56 backend = 'cext'
57 57 elif _module_policy == 'cffi':
58 from zstd_cffi import *
58 from .cffi import *
59 59 backend = 'cffi'
60 60 else:
61 61 raise ImportError('unknown module import policy: %s; use default, cffi_fallback, '
62 62 'cext, or cffi' % _module_policy)
63 63
64 64 # Keep this in sync with python-zstandard.h.
65 __version__ = '0.10.1'
65 __version__ = '0.11.0'
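The CFFI backend moves from a top-level ``zstd_cffi`` module to ``zstandard.cffi`` inside the package. Which backend actually loaded remains exposed as ``zstandard.backend``, and (assuming the pre-existing ``PYTHON_ZSTANDARD_IMPORT_POLICY`` environment variable, which this hunk does not rename) a specific backend can still be forced before first import:

    import os

    # Assumed env var name, read as the module import policy at first import.
    os.environ['PYTHON_ZSTANDARD_IMPORT_POLICY'] = 'cffi'

    import zstandard

    print(zstandard.backend)      # 'cffi' here; 'cext' under the default policy
    print(zstandard.__version__)  # '0.11.0'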
@@ -28,6 +28,8 b' from __future__ import absolute_import, '
28 28 'train_dictionary',
29 29
30 30 # Constants.
31 'FLUSH_BLOCK',
32 'FLUSH_FRAME',
31 33 'COMPRESSOBJ_FLUSH_FINISH',
32 34 'COMPRESSOBJ_FLUSH_BLOCK',
33 35 'ZSTD_VERSION',
@@ -49,6 +51,8 b' from __future__ import absolute_import, '
49 51 'HASHLOG_MIN',
50 52 'HASHLOG_MAX',
51 53 'HASHLOG3_MAX',
54 'MINMATCH_MIN',
55 'MINMATCH_MAX',
52 56 'SEARCHLOG_MIN',
53 57 'SEARCHLOG_MAX',
54 58 'SEARCHLENGTH_MIN',
@@ -66,6 +70,7 b' from __future__ import absolute_import, '
66 70 'STRATEGY_BTLAZY2',
67 71 'STRATEGY_BTOPT',
68 72 'STRATEGY_BTULTRA',
73 'STRATEGY_BTULTRA2',
69 74 'DICT_TYPE_AUTO',
70 75 'DICT_TYPE_RAWCONTENT',
71 76 'DICT_TYPE_FULLDICT',
@@ -114,10 +119,12 b' CHAINLOG_MAX = lib.ZSTD_CHAINLOG_MAX'
114 119 HASHLOG_MIN = lib.ZSTD_HASHLOG_MIN
115 120 HASHLOG_MAX = lib.ZSTD_HASHLOG_MAX
116 121 HASHLOG3_MAX = lib.ZSTD_HASHLOG3_MAX
122 MINMATCH_MIN = lib.ZSTD_MINMATCH_MIN
123 MINMATCH_MAX = lib.ZSTD_MINMATCH_MAX
117 124 SEARCHLOG_MIN = lib.ZSTD_SEARCHLOG_MIN
118 125 SEARCHLOG_MAX = lib.ZSTD_SEARCHLOG_MAX
119 SEARCHLENGTH_MIN = lib.ZSTD_SEARCHLENGTH_MIN
120 SEARCHLENGTH_MAX = lib.ZSTD_SEARCHLENGTH_MAX
126 SEARCHLENGTH_MIN = lib.ZSTD_MINMATCH_MIN
127 SEARCHLENGTH_MAX = lib.ZSTD_MINMATCH_MAX
121 128 TARGETLENGTH_MIN = lib.ZSTD_TARGETLENGTH_MIN
122 129 TARGETLENGTH_MAX = lib.ZSTD_TARGETLENGTH_MAX
123 130 LDM_MINMATCH_MIN = lib.ZSTD_LDM_MINMATCH_MIN
@@ -132,6 +139,7 b' STRATEGY_LAZY2 = lib.ZSTD_lazy2'
132 139 STRATEGY_BTLAZY2 = lib.ZSTD_btlazy2
133 140 STRATEGY_BTOPT = lib.ZSTD_btopt
134 141 STRATEGY_BTULTRA = lib.ZSTD_btultra
142 STRATEGY_BTULTRA2 = lib.ZSTD_btultra2
135 143
136 144 DICT_TYPE_AUTO = lib.ZSTD_dct_auto
137 145 DICT_TYPE_RAWCONTENT = lib.ZSTD_dct_rawContent
@@ -140,6 +148,9 b' DICT_TYPE_FULLDICT = lib.ZSTD_dct_fullDi'
140 148 FORMAT_ZSTD1 = lib.ZSTD_f_zstd1
141 149 FORMAT_ZSTD1_MAGICLESS = lib.ZSTD_f_zstd1_magicless
142 150
151 FLUSH_BLOCK = 0
152 FLUSH_FRAME = 1
153
143 154 COMPRESSOBJ_FLUSH_FINISH = 0
144 155 COMPRESSOBJ_FLUSH_BLOCK = 1
145 156
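``FLUSH_BLOCK`` and ``FLUSH_FRAME`` are new module-level constants consumed by ``ZstdCompressionWriter.flush()``: the former emits buffered data as a complete block while keeping the frame open, the latter also writes the frame epilogue. A sketch:

    import io
    import zstandard as zstd

    buf = io.BytesIO()
    writer = zstd.ZstdCompressor().stream_writer(buf)

    writer.write(b'chunk 1')
    writer.flush(zstd.FLUSH_BLOCK)  # block boundary; frame remains open
    writer.write(b'chunk 2')
    writer.flush(zstd.FLUSH_FRAME)  # finish the frame

    # One complete frame holding both chunks. A streamed frame carries no
    # content size header, so give decompress() an output size bound.
    dctx = zstd.ZstdDecompressor()
    assert dctx.decompress(buf.getvalue(), max_output_size=64) == b'chunk 1chunk 2'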
@@ -182,27 +193,27 b' def _make_cctx_params(params):'
182 193 res = ffi.gc(res, lib.ZSTD_freeCCtxParams)
183 194
184 195 attrs = [
185 (lib.ZSTD_p_format, params.format),
186 (lib.ZSTD_p_compressionLevel, params.compression_level),
187 (lib.ZSTD_p_windowLog, params.window_log),
188 (lib.ZSTD_p_hashLog, params.hash_log),
189 (lib.ZSTD_p_chainLog, params.chain_log),
190 (lib.ZSTD_p_searchLog, params.search_log),
191 (lib.ZSTD_p_minMatch, params.min_match),
192 (lib.ZSTD_p_targetLength, params.target_length),
193 (lib.ZSTD_p_compressionStrategy, params.compression_strategy),
194 (lib.ZSTD_p_contentSizeFlag, params.write_content_size),
195 (lib.ZSTD_p_checksumFlag, params.write_checksum),
196 (lib.ZSTD_p_dictIDFlag, params.write_dict_id),
197 (lib.ZSTD_p_nbWorkers, params.threads),
198 (lib.ZSTD_p_jobSize, params.job_size),
199 (lib.ZSTD_p_overlapSizeLog, params.overlap_size_log),
200 (lib.ZSTD_p_forceMaxWindow, params.force_max_window),
201 (lib.ZSTD_p_enableLongDistanceMatching, params.enable_ldm),
202 (lib.ZSTD_p_ldmHashLog, params.ldm_hash_log),
203 (lib.ZSTD_p_ldmMinMatch, params.ldm_min_match),
204 (lib.ZSTD_p_ldmBucketSizeLog, params.ldm_bucket_size_log),
205 (lib.ZSTD_p_ldmHashEveryLog, params.ldm_hash_every_log),
196 (lib.ZSTD_c_format, params.format),
197 (lib.ZSTD_c_compressionLevel, params.compression_level),
198 (lib.ZSTD_c_windowLog, params.window_log),
199 (lib.ZSTD_c_hashLog, params.hash_log),
200 (lib.ZSTD_c_chainLog, params.chain_log),
201 (lib.ZSTD_c_searchLog, params.search_log),
202 (lib.ZSTD_c_minMatch, params.min_match),
203 (lib.ZSTD_c_targetLength, params.target_length),
204 (lib.ZSTD_c_strategy, params.compression_strategy),
205 (lib.ZSTD_c_contentSizeFlag, params.write_content_size),
206 (lib.ZSTD_c_checksumFlag, params.write_checksum),
207 (lib.ZSTD_c_dictIDFlag, params.write_dict_id),
208 (lib.ZSTD_c_nbWorkers, params.threads),
209 (lib.ZSTD_c_jobSize, params.job_size),
210 (lib.ZSTD_c_overlapLog, params.overlap_log),
211 (lib.ZSTD_c_forceMaxWindow, params.force_max_window),
212 (lib.ZSTD_c_enableLongDistanceMatching, params.enable_ldm),
213 (lib.ZSTD_c_ldmHashLog, params.ldm_hash_log),
214 (lib.ZSTD_c_ldmMinMatch, params.ldm_min_match),
215 (lib.ZSTD_c_ldmBucketSizeLog, params.ldm_bucket_size_log),
216 (lib.ZSTD_c_ldmHashRateLog, params.ldm_hash_rate_log),
206 217 ]
207 218
208 219 for param, value in attrs:
@@ -220,7 +231,7 b' class ZstdCompressionParameters(object):'
220 231 'chain_log': 'chainLog',
221 232 'hash_log': 'hashLog',
222 233 'search_log': 'searchLog',
223 'min_match': 'searchLength',
234 'min_match': 'minMatch',
224 235 'target_length': 'targetLength',
225 236 'compression_strategy': 'strategy',
226 237 }
@@ -233,41 +244,170 b' class ZstdCompressionParameters(object):'
233 244
234 245 def __init__(self, format=0, compression_level=0, window_log=0, hash_log=0,
235 246 chain_log=0, search_log=0, min_match=0, target_length=0,
236 compression_strategy=0, write_content_size=1, write_checksum=0,
237 write_dict_id=0, job_size=0, overlap_size_log=0,
238 force_max_window=0, enable_ldm=0, ldm_hash_log=0,
239 ldm_min_match=0, ldm_bucket_size_log=0, ldm_hash_every_log=0,
240 threads=0):
247 strategy=-1, compression_strategy=-1,
248 write_content_size=1, write_checksum=0,
249 write_dict_id=0, job_size=0, overlap_log=-1,
250 overlap_size_log=-1, force_max_window=0, enable_ldm=0,
251 ldm_hash_log=0, ldm_min_match=0, ldm_bucket_size_log=0,
252 ldm_hash_rate_log=-1, ldm_hash_every_log=-1, threads=0):
253
254 params = lib.ZSTD_createCCtxParams()
255 if params == ffi.NULL:
256 raise MemoryError()
257
258 params = ffi.gc(params, lib.ZSTD_freeCCtxParams)
259
260 self._params = params
241 261
242 262 if threads < 0:
243 263 threads = _cpu_count()
244 264
245 self.format = format
246 self.compression_level = compression_level
247 self.window_log = window_log
248 self.hash_log = hash_log
249 self.chain_log = chain_log
250 self.search_log = search_log
251 self.min_match = min_match
252 self.target_length = target_length
253 self.compression_strategy = compression_strategy
254 self.write_content_size = write_content_size
255 self.write_checksum = write_checksum
256 self.write_dict_id = write_dict_id
257 self.job_size = job_size
258 self.overlap_size_log = overlap_size_log
259 self.force_max_window = force_max_window
260 self.enable_ldm = enable_ldm
261 self.ldm_hash_log = ldm_hash_log
262 self.ldm_min_match = ldm_min_match
263 self.ldm_bucket_size_log = ldm_bucket_size_log
264 self.ldm_hash_every_log = ldm_hash_every_log
265 self.threads = threads
266
267 self.params = _make_cctx_params(self)
265 # We need to set ZSTD_c_nbWorkers before ZSTD_c_jobSize and ZSTD_c_overlapLog
266 # because setting ZSTD_c_nbWorkers resets the other parameters.
267 _set_compression_parameter(params, lib.ZSTD_c_nbWorkers, threads)
268
269 _set_compression_parameter(params, lib.ZSTD_c_format, format)
270 _set_compression_parameter(params, lib.ZSTD_c_compressionLevel, compression_level)
271 _set_compression_parameter(params, lib.ZSTD_c_windowLog, window_log)
272 _set_compression_parameter(params, lib.ZSTD_c_hashLog, hash_log)
273 _set_compression_parameter(params, lib.ZSTD_c_chainLog, chain_log)
274 _set_compression_parameter(params, lib.ZSTD_c_searchLog, search_log)
275 _set_compression_parameter(params, lib.ZSTD_c_minMatch, min_match)
276 _set_compression_parameter(params, lib.ZSTD_c_targetLength, target_length)
277
278 if strategy != -1 and compression_strategy != -1:
279 raise ValueError('cannot specify both compression_strategy and strategy')
280
281 if compression_strategy != -1:
282 strategy = compression_strategy
283 elif strategy == -1:
284 strategy = 0
285
286 _set_compression_parameter(params, lib.ZSTD_c_strategy, strategy)
287 _set_compression_parameter(params, lib.ZSTD_c_contentSizeFlag, write_content_size)
288 _set_compression_parameter(params, lib.ZSTD_c_checksumFlag, write_checksum)
289 _set_compression_parameter(params, lib.ZSTD_c_dictIDFlag, write_dict_id)
290 _set_compression_parameter(params, lib.ZSTD_c_jobSize, job_size)
291
292 if overlap_log != -1 and overlap_size_log != -1:
293 raise ValueError('cannot specify both overlap_log and overlap_size_log')
294
295 if overlap_size_log != -1:
296 overlap_log = overlap_size_log
297 elif overlap_log == -1:
298 overlap_log = 0
299
300 _set_compression_parameter(params, lib.ZSTD_c_overlapLog, overlap_log)
301 _set_compression_parameter(params, lib.ZSTD_c_forceMaxWindow, force_max_window)
302 _set_compression_parameter(params, lib.ZSTD_c_enableLongDistanceMatching, enable_ldm)
303 _set_compression_parameter(params, lib.ZSTD_c_ldmHashLog, ldm_hash_log)
304 _set_compression_parameter(params, lib.ZSTD_c_ldmMinMatch, ldm_min_match)
305 _set_compression_parameter(params, lib.ZSTD_c_ldmBucketSizeLog, ldm_bucket_size_log)
306
307 if ldm_hash_rate_log != -1 and ldm_hash_every_log != -1:
308 raise ValueError('cannot specify both ldm_hash_rate_log and ldm_hash_every_log')
309
310 if ldm_hash_every_log != -1:
311 ldm_hash_rate_log = ldm_hash_every_log
312 elif ldm_hash_rate_log == -1:
313 ldm_hash_rate_log = 0
314
315 _set_compression_parameter(params, lib.ZSTD_c_ldmHashRateLog, ldm_hash_rate_log)
316
317 @property
318 def format(self):
319 return _get_compression_parameter(self._params, lib.ZSTD_c_format)
320
321 @property
322 def compression_level(self):
323 return _get_compression_parameter(self._params, lib.ZSTD_c_compressionLevel)
324
325 @property
326 def window_log(self):
327 return _get_compression_parameter(self._params, lib.ZSTD_c_windowLog)
328
329 @property
330 def hash_log(self):
331 return _get_compression_parameter(self._params, lib.ZSTD_c_hashLog)
332
333 @property
334 def chain_log(self):
335 return _get_compression_parameter(self._params, lib.ZSTD_c_chainLog)
336
337 @property
338 def search_log(self):
339 return _get_compression_parameter(self._params, lib.ZSTD_c_searchLog)
340
341 @property
342 def min_match(self):
343 return _get_compression_parameter(self._params, lib.ZSTD_c_minMatch)
344
345 @property
346 def target_length(self):
347 return _get_compression_parameter(self._params, lib.ZSTD_c_targetLength)
348
349 @property
350 def compression_strategy(self):
351 return _get_compression_parameter(self._params, lib.ZSTD_c_strategy)
352
353 @property
354 def write_content_size(self):
355 return _get_compression_parameter(self._params, lib.ZSTD_c_contentSizeFlag)
356
357 @property
358 def write_checksum(self):
359 return _get_compression_parameter(self._params, lib.ZSTD_c_checksumFlag)
360
361 @property
362 def write_dict_id(self):
363 return _get_compression_parameter(self._params, lib.ZSTD_c_dictIDFlag)
364
365 @property
366 def job_size(self):
367 return _get_compression_parameter(self._params, lib.ZSTD_c_jobSize)
368
369 @property
370 def overlap_log(self):
371 return _get_compression_parameter(self._params, lib.ZSTD_c_overlapLog)
372
373 @property
374 def overlap_size_log(self):
375 return self.overlap_log
376
377 @property
378 def force_max_window(self):
379 return _get_compression_parameter(self._params, lib.ZSTD_c_forceMaxWindow)
380
381 @property
382 def enable_ldm(self):
383 return _get_compression_parameter(self._params, lib.ZSTD_c_enableLongDistanceMatching)
384
385 @property
386 def ldm_hash_log(self):
387 return _get_compression_parameter(self._params, lib.ZSTD_c_ldmHashLog)
388
389 @property
390 def ldm_min_match(self):
391 return _get_compression_parameter(self._params, lib.ZSTD_c_ldmMinMatch)
392
393 @property
394 def ldm_bucket_size_log(self):
395 return _get_compression_parameter(self._params, lib.ZSTD_c_ldmBucketSizeLog)
396
397 @property
398 def ldm_hash_rate_log(self):
399 return _get_compression_parameter(self._params, lib.ZSTD_c_ldmHashRateLog)
400
401 @property
402 def ldm_hash_every_log(self):
403 return self.ldm_hash_rate_log
404
405 @property
406 def threads(self):
407 return _get_compression_parameter(self._params, lib.ZSTD_c_nbWorkers)
268 408
269 409 def estimated_compression_context_size(self):
270 return lib.ZSTD_estimateCCtxSize_usingCCtxParams(self.params)
410 return lib.ZSTD_estimateCCtxSize_usingCCtxParams(self._params)
271 411
272 412 CompressionParameters = ZstdCompressionParameters
273 413
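After this rewrite the CFFI ``ZstdCompressionParameters`` no longer caches attribute values on the Python object: every accessor is a property that calls ``ZSTD_CCtxParam_getParameter()`` against the retained ``ZSTD_CCtx_params``, so reads reflect whatever normalization the zstd library applied when the value was stored. For example:

    import zstandard as zstd

    p = zstd.ZstdCompressionParameters.from_level(3)

    # Each attribute access queries the native parameter struct.
    print(p.window_log, p.chain_log, p.hash_log, p.compression_strategy)
    print(p.estimated_compression_context_size())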
@@ -276,31 +416,53 b' def estimate_decompression_context_size('
276 416
277 417
278 418 def _set_compression_parameter(params, param, value):
279 zresult = lib.ZSTD_CCtxParam_setParameter(params, param,
280 ffi.cast('unsigned', value))
419 zresult = lib.ZSTD_CCtxParam_setParameter(params, param, value)
281 420 if lib.ZSTD_isError(zresult):
282 421 raise ZstdError('unable to set compression context parameter: %s' %
283 422 _zstd_error(zresult))
284 423
424
425 def _get_compression_parameter(params, param):
426 result = ffi.new('int *')
427
428 zresult = lib.ZSTD_CCtxParam_getParameter(params, param, result)
429 if lib.ZSTD_isError(zresult):
430 raise ZstdError('unable to get compression context parameter: %s' %
431 _zstd_error(zresult))
432
433 return result[0]
434
435
285 436 class ZstdCompressionWriter(object):
286 def __init__(self, compressor, writer, source_size, write_size):
437 def __init__(self, compressor, writer, source_size, write_size,
438 write_return_read):
287 439 self._compressor = compressor
288 440 self._writer = writer
289 self._source_size = source_size
290 441 self._write_size = write_size
442 self._write_return_read = bool(write_return_read)
291 443 self._entered = False
444 self._closed = False
292 445 self._bytes_compressed = 0
293 446
294 def __enter__(self):
295 if self._entered:
296 raise ZstdError('cannot __enter__ multiple times')
297
298 zresult = lib.ZSTD_CCtx_setPledgedSrcSize(self._compressor._cctx,
299 self._source_size)
447 self._dst_buffer = ffi.new('char[]', write_size)
448 self._out_buffer = ffi.new('ZSTD_outBuffer *')
449 self._out_buffer.dst = self._dst_buffer
450 self._out_buffer.size = len(self._dst_buffer)
451 self._out_buffer.pos = 0
452
453 zresult = lib.ZSTD_CCtx_setPledgedSrcSize(compressor._cctx,
454 source_size)
300 455 if lib.ZSTD_isError(zresult):
301 456 raise ZstdError('error setting source size: %s' %
302 457 _zstd_error(zresult))
303 458
459 def __enter__(self):
460 if self._closed:
461 raise ValueError('stream is closed')
462
463 if self._entered:
464 raise ZstdError('cannot __enter__ multiple times')
465
304 466 self._entered = True
305 467 return self
306 468
@@ -308,50 +470,79 b' class ZstdCompressionWriter(object):'
308 470 self._entered = False
309 471
310 472 if not exc_type and not exc_value and not exc_tb:
311 dst_buffer = ffi.new('char[]', self._write_size)
312
313 out_buffer = ffi.new('ZSTD_outBuffer *')
314 in_buffer = ffi.new('ZSTD_inBuffer *')
315
316 out_buffer.dst = dst_buffer
317 out_buffer.size = len(dst_buffer)
318 out_buffer.pos = 0
319
320 in_buffer.src = ffi.NULL
321 in_buffer.size = 0
322 in_buffer.pos = 0
323
324 while True:
325 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
326 out_buffer, in_buffer,
327 lib.ZSTD_e_end)
328
329 if lib.ZSTD_isError(zresult):
330 raise ZstdError('error ending compression stream: %s' %
331 _zstd_error(zresult))
332
333 if out_buffer.pos:
334 self._writer.write(ffi.buffer(out_buffer.dst, out_buffer.pos)[:])
335 out_buffer.pos = 0
336
337 if zresult == 0:
338 break
473 self.close()
339 474
340 475 self._compressor = None
341 476
342 477 return False
343 478
344 479 def memory_size(self):
345 if not self._entered:
346 raise ZstdError('cannot determine size of an inactive compressor; '
347 'call when a context manager is active')
348
349 480 return lib.ZSTD_sizeof_CCtx(self._compressor._cctx)
350 481
482 def fileno(self):
483 f = getattr(self._writer, 'fileno', None)
484 if f:
485 return f()
486 else:
487 raise OSError('fileno not available on underlying writer')
488
489 def close(self):
490 if self._closed:
491 return
492
493 try:
494 self.flush(FLUSH_FRAME)
495 finally:
496 self._closed = True
497
498 # Call close() on underlying stream as well.
499 f = getattr(self._writer, 'close', None)
500 if f:
501 f()
502
503 @property
504 def closed(self):
505 return self._closed
506
507 def isatty(self):
508 return False
509
510 def readable(self):
511 return False
512
513 def readline(self, size=-1):
514 raise io.UnsupportedOperation()
515
516 def readlines(self, hint=-1):
517 raise io.UnsupportedOperation()
518
519 def seek(self, offset, whence=None):
520 raise io.UnsupportedOperation()
521
522 def seekable(self):
523 return False
524
525 def truncate(self, size=None):
526 raise io.UnsupportedOperation()
527
528 def writable(self):
529 return True
530
531 def writelines(self, lines):
532 raise NotImplementedError('writelines() is not yet implemented')
533
534 def read(self, size=-1):
535 raise io.UnsupportedOperation()
536
537 def readall(self):
538 raise io.UnsupportedOperation()
539
540 def readinto(self, b):
541 raise io.UnsupportedOperation()
542
351 543 def write(self, data):
352 if not self._entered:
353 raise ZstdError('write() must be called from an active context '
354 'manager')
544 if self._closed:
545 raise ValueError('stream is closed')
355 546
356 547 total_write = 0
357 548
@@ -362,16 +553,13 b' class ZstdCompressionWriter(object):'
362 553 in_buffer.size = len(data_buffer)
363 554 in_buffer.pos = 0
364 555
365 out_buffer = ffi.new('ZSTD_outBuffer *')
366 dst_buffer = ffi.new('char[]', self._write_size)
367 out_buffer.dst = dst_buffer
368 out_buffer.size = self._write_size
556 out_buffer = self._out_buffer
369 557 out_buffer.pos = 0
370 558
371 559 while in_buffer.pos < in_buffer.size:
372 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
373 out_buffer, in_buffer,
374 lib.ZSTD_e_continue)
560 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
561 out_buffer, in_buffer,
562 lib.ZSTD_e_continue)
375 563 if lib.ZSTD_isError(zresult):
376 564 raise ZstdError('zstd compress error: %s' %
377 565 _zstd_error(zresult))
@@ -382,18 +570,25 b' class ZstdCompressionWriter(object):'
382 570 self._bytes_compressed += out_buffer.pos
383 571 out_buffer.pos = 0
384 572
385 return total_write
386
387 def flush(self):
388 if not self._entered:
389 raise ZstdError('flush must be called from an active context manager')
573 if self._write_return_read:
574 return in_buffer.pos
575 else:
576 return total_write
577
578 def flush(self, flush_mode=FLUSH_BLOCK):
579 if flush_mode == FLUSH_BLOCK:
580 flush = lib.ZSTD_e_flush
581 elif flush_mode == FLUSH_FRAME:
582 flush = lib.ZSTD_e_end
583 else:
584 raise ValueError('unknown flush_mode: %r' % flush_mode)
585
586 if self._closed:
587 raise ValueError('stream is closed')
390 588
391 589 total_write = 0
392 590
393 out_buffer = ffi.new('ZSTD_outBuffer *')
394 dst_buffer = ffi.new('char[]', self._write_size)
395 out_buffer.dst = dst_buffer
396 out_buffer.size = self._write_size
591 out_buffer = self._out_buffer
397 592 out_buffer.pos = 0
398 593
399 594 in_buffer = ffi.new('ZSTD_inBuffer *')
@@ -402,9 +597,9 b' class ZstdCompressionWriter(object):'
402 597 in_buffer.pos = 0
403 598
404 599 while True:
405 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
406 out_buffer, in_buffer,
407 lib.ZSTD_e_flush)
600 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
601 out_buffer, in_buffer,
602 flush)
408 603 if lib.ZSTD_isError(zresult):
409 604 raise ZstdError('zstd compress error: %s' %
410 605 _zstd_error(zresult))
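
The hunks above move ZstdCompressionWriter toward io.RawIOBase semantics: write() can report input bytes consumed, flush() takes a flush mode, and close() ends the frame and also closes the wrapped stream. A minimal usage sketch, assuming the public zstandard package re-exports the FLUSH_BLOCK/FLUSH_FRAME constants referenced in this module:

    import io
    import zstandard as zstd

    cctx = zstd.ZstdCompressor()
    dest = io.BytesIO()

    # write_return_read=True makes write() return input bytes consumed,
    # matching io.RawIOBase, instead of compressed bytes written.
    writer = cctx.stream_writer(dest, write_return_read=True)
    consumed = writer.write(b'data to compress')
    writer.flush(zstd.FLUSH_BLOCK)    # emit a block, keep the frame open
    writer.flush(zstd.FLUSH_FRAME)    # end the current frame
    frame = dest.getvalue()           # read before close(); per this diff,
                                      # close() also closes the wrapped stream
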
@@ -438,10 +633,10 b' class ZstdCompressionObj(object):'
438 633 chunks = []
439 634
440 635 while source.pos < len(data):
441 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
442 self._out,
443 source,
444 lib.ZSTD_e_continue)
636 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
637 self._out,
638 source,
639 lib.ZSTD_e_continue)
445 640 if lib.ZSTD_isError(zresult):
446 641 raise ZstdError('zstd compress error: %s' %
447 642 _zstd_error(zresult))
@@ -477,10 +672,10 b' class ZstdCompressionObj(object):'
477 672 chunks = []
478 673
479 674 while True:
480 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
481 self._out,
482 in_buffer,
483 z_flush_mode)
675 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
676 self._out,
677 in_buffer,
678 z_flush_mode)
484 679 if lib.ZSTD_isError(zresult):
485 680 raise ZstdError('error ending compression stream: %s' %
486 681 _zstd_error(zresult))
@@ -528,10 +723,10 b' class ZstdCompressionChunker(object):'
528 723 self._in.pos = 0
529 724
530 725 while self._in.pos < self._in.size:
531 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
532 self._out,
533 self._in,
534 lib.ZSTD_e_continue)
726 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
727 self._out,
728 self._in,
729 lib.ZSTD_e_continue)
535 730
536 731 if self._in.pos == self._in.size:
537 732 self._in.src = ffi.NULL
@@ -555,9 +750,9 b' class ZstdCompressionChunker(object):'
555 750 'previous operation')
556 751
557 752 while True:
558 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
559 self._out, self._in,
560 lib.ZSTD_e_flush)
753 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
754 self._out, self._in,
755 lib.ZSTD_e_flush)
561 756 if lib.ZSTD_isError(zresult):
562 757 raise ZstdError('zstd compress error: %s' % _zstd_error(zresult))
563 758
@@ -577,9 +772,9 b' class ZstdCompressionChunker(object):'
577 772 'previous operation')
578 773
579 774 while True:
580 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
581 self._out, self._in,
582 lib.ZSTD_e_end)
775 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
776 self._out, self._in,
777 lib.ZSTD_e_end)
583 778 if lib.ZSTD_isError(zresult):
584 779 raise ZstdError('zstd compress error: %s' % _zstd_error(zresult))
585 780
@@ -592,7 +787,7 b' class ZstdCompressionChunker(object):'
592 787 return
593 788
594 789
595 class CompressionReader(object):
790 class ZstdCompressionReader(object):
596 791 def __init__(self, compressor, source, read_size):
597 792 self._compressor = compressor
598 793 self._source = source
@@ -661,7 +856,16 b' class CompressionReader(object):'
661 856 return self._bytes_compressed
662 857
663 858 def readall(self):
664 raise NotImplementedError()
859 chunks = []
860
861 while True:
862 chunk = self.read(1048576)
863 if not chunk:
864 break
865
866 chunks.append(chunk)
867
868 return b''.join(chunks)
665 869
666 870 def __iter__(self):
667 871 raise io.UnsupportedOperation()
@@ -671,16 +875,67 b' class CompressionReader(object):'
671 875
672 876 next = __next__
673 877
878 def _read_input(self):
879 if self._finished_input:
880 return
881
882 if hasattr(self._source, 'read'):
883 data = self._source.read(self._read_size)
884
885 if not data:
886 self._finished_input = True
887 return
888
889 self._source_buffer = ffi.from_buffer(data)
890 self._in_buffer.src = self._source_buffer
891 self._in_buffer.size = len(self._source_buffer)
892 self._in_buffer.pos = 0
893 else:
894 self._source_buffer = ffi.from_buffer(self._source)
895 self._in_buffer.src = self._source_buffer
896 self._in_buffer.size = len(self._source_buffer)
897 self._in_buffer.pos = 0
898
899 def _compress_into_buffer(self, out_buffer):
900 if self._in_buffer.pos >= self._in_buffer.size:
901 return
902
903 old_pos = out_buffer.pos
904
905 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
906 out_buffer, self._in_buffer,
907 lib.ZSTD_e_continue)
908
909 self._bytes_compressed += out_buffer.pos - old_pos
910
911 if self._in_buffer.pos == self._in_buffer.size:
912 self._in_buffer.src = ffi.NULL
913 self._in_buffer.pos = 0
914 self._in_buffer.size = 0
915 self._source_buffer = None
916
917 if not hasattr(self._source, 'read'):
918 self._finished_input = True
919
920 if lib.ZSTD_isError(zresult):
921 raise ZstdError('zstd compress error: %s' %
922 _zstd_error(zresult))
923
924 return out_buffer.pos and out_buffer.pos == out_buffer.size
925
674 926 def read(self, size=-1):
675 927 if self._closed:
676 928 raise ValueError('stream is closed')
677 929
678 if self._finished_output:
930 if size < -1:
931 raise ValueError('cannot read negative amounts less than -1')
932
933 if size == -1:
934 return self.readall()
935
936 if self._finished_output or size == 0:
679 937 return b''
680 938
681 if size < 1:
682 raise ValueError('cannot read negative or size 0 amounts')
683
684 939 # Need a dedicated ref to dest buffer otherwise it gets collected.
685 940 dst_buffer = ffi.new('char[]', size)
686 941 out_buffer = ffi.new('ZSTD_outBuffer *')
@@ -688,71 +943,21 b' class CompressionReader(object):'
688 943 out_buffer.size = size
689 944 out_buffer.pos = 0
690 945
691 def compress_input():
692 if self._in_buffer.pos >= self._in_buffer.size:
693 return
694
695 old_pos = out_buffer.pos
696
697 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
698 out_buffer, self._in_buffer,
699 lib.ZSTD_e_continue)
700
701 self._bytes_compressed += out_buffer.pos - old_pos
702
703 if self._in_buffer.pos == self._in_buffer.size:
704 self._in_buffer.src = ffi.NULL
705 self._in_buffer.pos = 0
706 self._in_buffer.size = 0
707 self._source_buffer = None
708
709 if not hasattr(self._source, 'read'):
710 self._finished_input = True
711
712 if lib.ZSTD_isError(zresult):
713 raise ZstdError('zstd compress error: %s',
714 _zstd_error(zresult))
715
716 if out_buffer.pos and out_buffer.pos == out_buffer.size:
717 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
718
719 def get_input():
720 if self._finished_input:
721 return
722
723 if hasattr(self._source, 'read'):
724 data = self._source.read(self._read_size)
725
726 if not data:
727 self._finished_input = True
728 return
729
730 self._source_buffer = ffi.from_buffer(data)
731 self._in_buffer.src = self._source_buffer
732 self._in_buffer.size = len(self._source_buffer)
733 self._in_buffer.pos = 0
734 else:
735 self._source_buffer = ffi.from_buffer(self._source)
736 self._in_buffer.src = self._source_buffer
737 self._in_buffer.size = len(self._source_buffer)
738 self._in_buffer.pos = 0
739
740 result = compress_input()
741 if result:
742 return result
946 if self._compress_into_buffer(out_buffer):
947 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
743 948
744 949 while not self._finished_input:
745 get_input()
746 result = compress_input()
747 if result:
748 return result
950 self._read_input()
951
952 if self._compress_into_buffer(out_buffer):
953 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
749 954
750 955 # EOF
751 956 old_pos = out_buffer.pos
752 957
753 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
754 out_buffer, self._in_buffer,
755 lib.ZSTD_e_end)
958 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
959 out_buffer, self._in_buffer,
960 lib.ZSTD_e_end)
756 961
757 962 self._bytes_compressed += out_buffer.pos - old_pos
758 963
@@ -765,6 +970,159 b' class CompressionReader(object):'
765 970
766 971 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
767 972
973 def read1(self, size=-1):
974 if self._closed:
975 raise ValueError('stream is closed')
976
977 if size < -1:
978 raise ValueError('cannot read negative amounts less than -1')
979
980 if self._finished_output or size == 0:
981 return b''
982
983 # -1 returns an arbitrary number of bytes.
984 if size == -1:
985 size = COMPRESSION_RECOMMENDED_OUTPUT_SIZE
986
987 dst_buffer = ffi.new('char[]', size)
988 out_buffer = ffi.new('ZSTD_outBuffer *')
989 out_buffer.dst = dst_buffer
990 out_buffer.size = size
991 out_buffer.pos = 0
992
993 # read1() dictates that we can perform at most 1 call to the
994 # underlying stream to get input. However, we can't satisfy this
995 # restriction with compression because not all input generates output.
996 # It is possible to perform a block flush in order to ensure output.
997 # But this may not be desirable behavior. So we allow multiple read()
998 # calls to the underlying stream. But unlike read(), we stop once we
999 # have any output.
1000
1001 self._compress_into_buffer(out_buffer)
1002 if out_buffer.pos:
1003 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
1004
1005 while not self._finished_input:
1006 self._read_input()
1007
1008 # If we've filled the output buffer, return immediately.
1009 if self._compress_into_buffer(out_buffer):
1010 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
1011
1012 # If we've populated the output buffer and we're not at EOF,
1013 # also return, as we've satisfied the read1() limits.
1014 if out_buffer.pos and not self._finished_input:
1015 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
1016
1017 # Else if we're at EOS and we have room left in the buffer,
1018 # fall through to below and try to add more data to the output.
1019
1020 # EOF.
1021 old_pos = out_buffer.pos
1022
1023 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
1024 out_buffer, self._in_buffer,
1025 lib.ZSTD_e_end)
1026
1027 self._bytes_compressed += out_buffer.pos - old_pos
1028
1029 if lib.ZSTD_isError(zresult):
1030 raise ZstdError('error ending compression stream: %s' %
1031 _zstd_error(zresult))
1032
1033 if zresult == 0:
1034 self._finished_output = True
1035
1036 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
1037
1038 def readinto(self, b):
1039 if self._closed:
1040 raise ValueError('stream is closed')
1041
1042 if self._finished_output:
1043 return 0
1044
1045 # TODO use writable=True once we require CFFI >= 1.12.
1046 dest_buffer = ffi.from_buffer(b)
1047 ffi.memmove(b, b'', 0)
1048 out_buffer = ffi.new('ZSTD_outBuffer *')
1049 out_buffer.dst = dest_buffer
1050 out_buffer.size = len(dest_buffer)
1051 out_buffer.pos = 0
1052
1053 if self._compress_into_buffer(out_buffer):
1054 return out_buffer.pos
1055
1056 while not self._finished_input:
1057 self._read_input()
1058 if self._compress_into_buffer(out_buffer):
1059 return out_buffer.pos
1060
1061 # EOF.
1062 old_pos = out_buffer.pos
1063 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
1064 out_buffer, self._in_buffer,
1065 lib.ZSTD_e_end)
1066
1067 self._bytes_compressed += out_buffer.pos - old_pos
1068
1069 if lib.ZSTD_isError(zresult):
1070 raise ZstdError('error ending compression stream: %s' %
1071 _zstd_error(zresult))
1072
1073 if zresult == 0:
1074 self._finished_output = True
1075
1076 return out_buffer.pos
1077
1078 def readinto1(self, b):
1079 if self._closed:
1080 raise ValueError('stream is closed')
1081
1082 if self._finished_output:
1083 return 0
1084
1085 # TODO use writable=True once we require CFFI >= 1.12.
1086 dest_buffer = ffi.from_buffer(b)
1087 ffi.memmove(b, b'', 0)
1088
1089 out_buffer = ffi.new('ZSTD_outBuffer *')
1090 out_buffer.dst = dest_buffer
1091 out_buffer.size = len(dest_buffer)
1092 out_buffer.pos = 0
1093
1094 self._compress_into_buffer(out_buffer)
1095 if out_buffer.pos:
1096 return out_buffer.pos
1097
1098 while not self._finished_input:
1099 self._read_input()
1100
1101 if self._compress_into_buffer(out_buffer):
1102 return out_buffer.pos
1103
1104 if out_buffer.pos and not self._finished_input:
1105 return out_buffer.pos
1106
1107 # EOF.
1108 old_pos = out_buffer.pos
1109
1110 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
1111 out_buffer, self._in_buffer,
1112 lib.ZSTD_e_end)
1113
1114 self._bytes_compressed += out_buffer.pos - old_pos
1115
1116 if lib.ZSTD_isError(zresult):
1117 raise ZstdError('error ending compression stream: %s' %
1118 _zstd_error(zresult))
1119
1120 if zresult == 0:
1121 self._finished_output = True
1122
1123 return out_buffer.pos
1124
1125
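
With readall(), read1(), readinto(), and readinto1() now implemented, ZstdCompressionReader behaves much more like a standard io.RawIOBase reader. A sketch of typical use (an in-memory BytesIO source is assumed for illustration):

    import io
    import zstandard as zstd

    cctx = zstd.ZstdCompressor()
    source = io.BytesIO(b'raw input ' * 1000)

    reader = cctx.stream_reader(source)
    frame = reader.readall()   # newly implemented: drains the source

    # readinto() fills a caller-supplied buffer and returns the byte count.
    source.seek(0)
    reader = cctx.stream_reader(source)
    buf = bytearray(16384)
    n = reader.readinto(buf)
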
768 1126 class ZstdCompressor(object):
769 1127 def __init__(self, level=3, dict_data=None, compression_params=None,
770 1128 write_checksum=None, write_content_size=None,
@@ -803,25 +1161,25 b' class ZstdCompressor(object):'
803 1161 self._params = ffi.gc(params, lib.ZSTD_freeCCtxParams)
804 1162
805 1163 _set_compression_parameter(self._params,
806 lib.ZSTD_p_compressionLevel,
1164 lib.ZSTD_c_compressionLevel,
807 1165 level)
808 1166
809 1167 _set_compression_parameter(
810 1168 self._params,
811 lib.ZSTD_p_contentSizeFlag,
1169 lib.ZSTD_c_contentSizeFlag,
812 1170 write_content_size if write_content_size is not None else 1)
813 1171
814 1172 _set_compression_parameter(self._params,
815 lib.ZSTD_p_checksumFlag,
1173 lib.ZSTD_c_checksumFlag,
816 1174 1 if write_checksum else 0)
817 1175
818 1176 _set_compression_parameter(self._params,
819 lib.ZSTD_p_dictIDFlag,
1177 lib.ZSTD_c_dictIDFlag,
820 1178 1 if write_dict_id else 0)
821 1179
822 1180 if threads:
823 1181 _set_compression_parameter(self._params,
824 lib.ZSTD_p_nbWorkers,
1182 lib.ZSTD_c_nbWorkers,
825 1183 threads)
826 1184
827 1185 cctx = lib.ZSTD_createCCtx()
@@ -864,7 +1222,7 b' class ZstdCompressor(object):'
864 1222 return lib.ZSTD_sizeof_CCtx(self._cctx)
865 1223
866 1224 def compress(self, data):
867 lib.ZSTD_CCtx_reset(self._cctx)
1225 lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only)
868 1226
869 1227 data_buffer = ffi.from_buffer(data)
870 1228
@@ -887,10 +1245,10 b' class ZstdCompressor(object):'
887 1245 in_buffer.size = len(data_buffer)
888 1246 in_buffer.pos = 0
889 1247
890 zresult = lib.ZSTD_compress_generic(self._cctx,
891 out_buffer,
892 in_buffer,
893 lib.ZSTD_e_end)
1248 zresult = lib.ZSTD_compressStream2(self._cctx,
1249 out_buffer,
1250 in_buffer,
1251 lib.ZSTD_e_end)
894 1252
895 1253 if lib.ZSTD_isError(zresult):
896 1254 raise ZstdError('cannot compress: %s' %
@@ -901,7 +1259,7 b' class ZstdCompressor(object):'
901 1259 return ffi.buffer(out, out_buffer.pos)[:]
902 1260
903 1261 def compressobj(self, size=-1):
904 lib.ZSTD_CCtx_reset(self._cctx)
1262 lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only)
905 1263
906 1264 if size < 0:
907 1265 size = lib.ZSTD_CONTENTSIZE_UNKNOWN
@@ -923,7 +1281,7 b' class ZstdCompressor(object):'
923 1281 return cobj
924 1282
925 1283 def chunker(self, size=-1, chunk_size=COMPRESSION_RECOMMENDED_OUTPUT_SIZE):
926 lib.ZSTD_CCtx_reset(self._cctx)
1284 lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only)
927 1285
928 1286 if size < 0:
929 1287 size = lib.ZSTD_CONTENTSIZE_UNKNOWN
@@ -944,7 +1302,7 b' class ZstdCompressor(object):'
944 1302 if not hasattr(ofh, 'write'):
945 1303 raise ValueError('second argument must have a write() method')
946 1304
947 lib.ZSTD_CCtx_reset(self._cctx)
1305 lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only)
948 1306
949 1307 if size < 0:
950 1308 size = lib.ZSTD_CONTENTSIZE_UNKNOWN
@@ -976,10 +1334,10 b' class ZstdCompressor(object):'
976 1334 in_buffer.pos = 0
977 1335
978 1336 while in_buffer.pos < in_buffer.size:
979 zresult = lib.ZSTD_compress_generic(self._cctx,
980 out_buffer,
981 in_buffer,
982 lib.ZSTD_e_continue)
1337 zresult = lib.ZSTD_compressStream2(self._cctx,
1338 out_buffer,
1339 in_buffer,
1340 lib.ZSTD_e_continue)
983 1341 if lib.ZSTD_isError(zresult):
984 1342 raise ZstdError('zstd compress error: %s' %
985 1343 _zstd_error(zresult))
@@ -991,10 +1349,10 b' class ZstdCompressor(object):'
991 1349
992 1350 # We've finished reading. Flush the compressor.
993 1351 while True:
994 zresult = lib.ZSTD_compress_generic(self._cctx,
995 out_buffer,
996 in_buffer,
997 lib.ZSTD_e_end)
1352 zresult = lib.ZSTD_compressStream2(self._cctx,
1353 out_buffer,
1354 in_buffer,
1355 lib.ZSTD_e_end)
998 1356 if lib.ZSTD_isError(zresult):
999 1357 raise ZstdError('error ending compression stream: %s' %
1000 1358 _zstd_error(zresult))
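
copy_stream() drives the same ZSTD_compressStream2() loop shown above over a pair of file objects. A sketch with in-memory streams:

    import io
    import zstandard as zstd

    cctx = zstd.ZstdCompressor()
    src = io.BytesIO(b'input data ' * 1000)
    dst = io.BytesIO()

    # Returns (bytes_read, bytes_written).
    read_count, write_count = cctx.copy_stream(src, dst)
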
@@ -1011,7 +1369,7 b' class ZstdCompressor(object):'
1011 1369
1012 1370 def stream_reader(self, source, size=-1,
1013 1371 read_size=COMPRESSION_RECOMMENDED_INPUT_SIZE):
1014 lib.ZSTD_CCtx_reset(self._cctx)
1372 lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only)
1015 1373
1016 1374 try:
1017 1375 size = len(source)
@@ -1026,20 +1384,22 b' class ZstdCompressor(object):'
1026 1384 raise ZstdError('error setting source size: %s' %
1027 1385 _zstd_error(zresult))
1028 1386
1029 return CompressionReader(self, source, read_size)
1387 return ZstdCompressionReader(self, source, read_size)
1030 1388
1031 1389 def stream_writer(self, writer, size=-1,
1032 write_size=COMPRESSION_RECOMMENDED_OUTPUT_SIZE):
1390 write_size=COMPRESSION_RECOMMENDED_OUTPUT_SIZE,
1391 write_return_read=False):
1033 1392
1034 1393 if not hasattr(writer, 'write'):
1035 1394 raise ValueError('must pass an object with a write() method')
1036 1395
1037 lib.ZSTD_CCtx_reset(self._cctx)
1396 lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only)
1038 1397
1039 1398 if size < 0:
1040 1399 size = lib.ZSTD_CONTENTSIZE_UNKNOWN
1041 1400
1042 return ZstdCompressionWriter(self, writer, size, write_size)
1401 return ZstdCompressionWriter(self, writer, size, write_size,
1402 write_return_read)
1043 1403
1044 1404 write_to = stream_writer
1045 1405
@@ -1056,7 +1416,7 b' class ZstdCompressor(object):'
1056 1416 raise ValueError('must pass an object with a read() method or '
1057 1417 'conforms to buffer protocol')
1058 1418
1059 lib.ZSTD_CCtx_reset(self._cctx)
1419 lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only)
1060 1420
1061 1421 if size < 0:
1062 1422 size = lib.ZSTD_CONTENTSIZE_UNKNOWN
@@ -1104,8 +1464,8 b' class ZstdCompressor(object):'
1104 1464 in_buffer.pos = 0
1105 1465
1106 1466 while in_buffer.pos < in_buffer.size:
1107 zresult = lib.ZSTD_compress_generic(self._cctx, out_buffer, in_buffer,
1108 lib.ZSTD_e_continue)
1467 zresult = lib.ZSTD_compressStream2(self._cctx, out_buffer, in_buffer,
1468 lib.ZSTD_e_continue)
1109 1469 if lib.ZSTD_isError(zresult):
1110 1470 raise ZstdError('zstd compress error: %s' %
1111 1471 _zstd_error(zresult))
@@ -1124,10 +1484,10 b' class ZstdCompressor(object):'
1124 1484 # remains.
1125 1485 while True:
1126 1486 assert out_buffer.pos == 0
1127 zresult = lib.ZSTD_compress_generic(self._cctx,
1128 out_buffer,
1129 in_buffer,
1130 lib.ZSTD_e_end)
1487 zresult = lib.ZSTD_compressStream2(self._cctx,
1488 out_buffer,
1489 in_buffer,
1490 lib.ZSTD_e_end)
1131 1491 if lib.ZSTD_isError(zresult):
1132 1492 raise ZstdError('error ending compression stream: %s' %
1133 1493 _zstd_error(zresult))
@@ -1234,7 +1594,7 b' class ZstdCompressionDict(object):'
1234 1594 cparams = ffi.new('ZSTD_compressionParameters')
1235 1595 cparams.chainLog = compression_params.chain_log
1236 1596 cparams.hashLog = compression_params.hash_log
1237 cparams.searchLength = compression_params.min_match
1597 cparams.minMatch = compression_params.min_match
1238 1598 cparams.searchLog = compression_params.search_log
1239 1599 cparams.strategy = compression_params.compression_strategy
1240 1600 cparams.targetLength = compression_params.target_length
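
zstd 1.3.8 renamed ZSTD_compressionParameters.searchLength to minMatch; the Python-level keyword stays min_match. A sketch of passing explicit parameters (the field values here are illustrative, not tuned):

    import zstandard as zstd

    params = zstd.ZstdCompressionParameters(window_log=20,
                                            chain_log=16,
                                            hash_log=17,
                                            search_log=1,
                                            min_match=4,
                                            target_length=16,
                                            compression_strategy=zstd.STRATEGY_GREEDY)
    cctx = zstd.ZstdCompressor(compression_params=params)
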
@@ -1345,6 +1705,10 b' class ZstdDecompressionObj(object):'
1345 1705 out_buffer = ffi.new('ZSTD_outBuffer *')
1346 1706
1347 1707 data_buffer = ffi.from_buffer(data)
1708
1709 if len(data_buffer) == 0:
1710 return b''
1711
1348 1712 in_buffer.src = data_buffer
1349 1713 in_buffer.size = len(data_buffer)
1350 1714 in_buffer.pos = 0
@@ -1357,8 +1721,8 b' class ZstdDecompressionObj(object):'
1357 1721 chunks = []
1358 1722
1359 1723 while True:
1360 zresult = lib.ZSTD_decompress_generic(self._decompressor._dctx,
1361 out_buffer, in_buffer)
1724 zresult = lib.ZSTD_decompressStream(self._decompressor._dctx,
1725 out_buffer, in_buffer)
1362 1726 if lib.ZSTD_isError(zresult):
1363 1727 raise ZstdError('zstd decompressor error: %s' %
1364 1728 _zstd_error(zresult))
@@ -1378,12 +1742,16 b' class ZstdDecompressionObj(object):'
1378 1742
1379 1743 return b''.join(chunks)
1380 1744
1381
1382 class DecompressionReader(object):
1383 def __init__(self, decompressor, source, read_size):
1745 def flush(self, length=0):
1746 pass
1747
1748
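
ZstdDecompressionObj grows an empty-input fast path and a no-op flush() for zlib/bz2 API compatibility. A sketch:

    import zstandard as zstd

    frame = zstd.ZstdCompressor().compress(b'hello world')

    dobj = zstd.ZstdDecompressor().decompressobj()
    assert dobj.decompress(frame) == b'hello world'
    dobj.flush()   # intentionally a no-op, present for API compatibility
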
1749 class ZstdDecompressionReader(object):
1750 def __init__(self, decompressor, source, read_size, read_across_frames):
1384 1751 self._decompressor = decompressor
1385 1752 self._source = source
1386 1753 self._read_size = read_size
1754 self._read_across_frames = bool(read_across_frames)
1387 1755 self._entered = False
1388 1756 self._closed = False
1389 1757 self._bytes_decompressed = 0
@@ -1418,10 +1786,10 b' class DecompressionReader(object):'
1418 1786 return True
1419 1787
1420 1788 def readline(self):
1421 raise NotImplementedError()
1789 raise io.UnsupportedOperation()
1422 1790
1423 1791 def readlines(self):
1424 raise NotImplementedError()
1792 raise io.UnsupportedOperation()
1425 1793
1426 1794 def write(self, data):
1427 1795 raise io.UnsupportedOperation()
@@ -1447,25 +1815,158 b' class DecompressionReader(object):'
1447 1815 return self._bytes_decompressed
1448 1816
1449 1817 def readall(self):
1450 raise NotImplementedError()
1818 chunks = []
1819
1820 while True:
1821 chunk = self.read(1048576)
1822 if not chunk:
1823 break
1824
1825 chunks.append(chunk)
1826
1827 return b''.join(chunks)
1451 1828
1452 1829 def __iter__(self):
1453 raise NotImplementedError()
1830 raise io.UnsupportedOperation()
1454 1831
1455 1832 def __next__(self):
1456 raise NotImplementedError()
1833 raise io.UnsupportedOperation()
1457 1834
1458 1835 next = __next__
1459 1836
1460 def read(self, size):
1837 def _read_input(self):
1838 # We have data left over in the input buffer. Use it.
1839 if self._in_buffer.pos < self._in_buffer.size:
1840 return
1841
1842 # All input data exhausted. Nothing to do.
1843 if self._finished_input:
1844 return
1845
1846 # Else populate the input buffer from our source.
1847 if hasattr(self._source, 'read'):
1848 data = self._source.read(self._read_size)
1849
1850 if not data:
1851 self._finished_input = True
1852 return
1853
1854 self._source_buffer = ffi.from_buffer(data)
1855 self._in_buffer.src = self._source_buffer
1856 self._in_buffer.size = len(self._source_buffer)
1857 self._in_buffer.pos = 0
1858 else:
1859 self._source_buffer = ffi.from_buffer(self._source)
1860 self._in_buffer.src = self._source_buffer
1861 self._in_buffer.size = len(self._source_buffer)
1862 self._in_buffer.pos = 0
1863
1864 def _decompress_into_buffer(self, out_buffer):
1865 """Decompress available input into an output buffer.
1866
1867 Returns True if data in output buffer should be emitted.
1868 """
1869 zresult = lib.ZSTD_decompressStream(self._decompressor._dctx,
1870 out_buffer, self._in_buffer)
1871
1872 if self._in_buffer.pos == self._in_buffer.size:
1873 self._in_buffer.src = ffi.NULL
1874 self._in_buffer.pos = 0
1875 self._in_buffer.size = 0
1876 self._source_buffer = None
1877
1878 if not hasattr(self._source, 'read'):
1879 self._finished_input = True
1880
1881 if lib.ZSTD_isError(zresult):
1882 raise ZstdError('zstd decompress error: %s' %
1883 _zstd_error(zresult))
1884
1885 # Emit data if there is data AND either:
1886 # a) output buffer is full (read amount is satisfied)
1887 # b) we're at end of a frame and not in frame spanning mode
1888 return (out_buffer.pos and
1889 (out_buffer.pos == out_buffer.size or
1890 zresult == 0 and not self._read_across_frames))
1891
1892 def read(self, size=-1):
1893 if self._closed:
1894 raise ValueError('stream is closed')
1895
1896 if size < -1:
1897 raise ValueError('cannot read negative amounts less than -1')
1898
1899 if size == -1:
1900 # This is recursive. But it gets the job done.
1901 return self.readall()
1902
1903 if self._finished_output or size == 0:
1904 return b''
1905
1906 # We /could/ call into readinto() here. But that introduces more
1907 # overhead.
1908 dst_buffer = ffi.new('char[]', size)
1909 out_buffer = ffi.new('ZSTD_outBuffer *')
1910 out_buffer.dst = dst_buffer
1911 out_buffer.size = size
1912 out_buffer.pos = 0
1913
1914 self._read_input()
1915 if self._decompress_into_buffer(out_buffer):
1916 self._bytes_decompressed += out_buffer.pos
1917 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
1918
1919 while not self._finished_input:
1920 self._read_input()
1921 if self._decompress_into_buffer(out_buffer):
1922 self._bytes_decompressed += out_buffer.pos
1923 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
1924
1925 self._bytes_decompressed += out_buffer.pos
1926 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
1927
1928 def readinto(self, b):
1461 1929 if self._closed:
1462 1930 raise ValueError('stream is closed')
1463 1931
1464 1932 if self._finished_output:
1933 return 0
1934
1935 # TODO use writable=True once we require CFFI >= 1.12.
1936 dest_buffer = ffi.from_buffer(b)
1937 ffi.memmove(b, b'', 0)
1938 out_buffer = ffi.new('ZSTD_outBuffer *')
1939 out_buffer.dst = dest_buffer
1940 out_buffer.size = len(dest_buffer)
1941 out_buffer.pos = 0
1942
1943 self._read_input()
1944 if self._decompress_into_buffer(out_buffer):
1945 self._bytes_decompressed += out_buffer.pos
1946 return out_buffer.pos
1947
1948 while not self._finished_input:
1949 self._read_input()
1950 if self._decompress_into_buffer(out_buffer):
1951 self._bytes_decompressed += out_buffer.pos
1952 return out_buffer.pos
1953
1954 self._bytes_decompressed += out_buffer.pos
1955 return out_buffer.pos
1956
1957 def read1(self, size=-1):
1958 if self._closed:
1959 raise ValueError('stream is closed')
1960
1961 if size < -1:
1962 raise ValueError('cannot read negative amounts less than -1')
1963
1964 if self._finished_output or size == 0:
1465 1965 return b''
1466 1966
1467 if size < 1:
1468 raise ValueError('cannot read negative or size 0 amounts')
1967 # -1 returns an arbitrary number of bytes.
1968 if size == -1:
1969 size = DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE
1469 1970
1470 1971 dst_buffer = ffi.new('char[]', size)
1471 1972 out_buffer = ffi.new('ZSTD_outBuffer *')
@@ -1473,64 +1974,46 b' class DecompressionReader(object):'
1473 1974 out_buffer.size = size
1474 1975 out_buffer.pos = 0
1475 1976
1476 def decompress():
1477 zresult = lib.ZSTD_decompress_generic(self._decompressor._dctx,
1478 out_buffer, self._in_buffer)
1479
1480 if self._in_buffer.pos == self._in_buffer.size:
1481 self._in_buffer.src = ffi.NULL
1482 self._in_buffer.pos = 0
1483 self._in_buffer.size = 0
1484 self._source_buffer = None
1485
1486 if not hasattr(self._source, 'read'):
1487 self._finished_input = True
1488
1489 if lib.ZSTD_isError(zresult):
1490 raise ZstdError('zstd decompress error: %s',
1491 _zstd_error(zresult))
1492 elif zresult == 0:
1493 self._finished_output = True
1494
1495 if out_buffer.pos and out_buffer.pos == out_buffer.size:
1496 self._bytes_decompressed += out_buffer.size
1497 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
1498
1499 def get_input():
1500 if self._finished_input:
1501 return
1502
1503 if hasattr(self._source, 'read'):
1504 data = self._source.read(self._read_size)
1505
1506 if not data:
1507 self._finished_input = True
1508 return
1509
1510 self._source_buffer = ffi.from_buffer(data)
1511 self._in_buffer.src = self._source_buffer
1512 self._in_buffer.size = len(self._source_buffer)
1513 self._in_buffer.pos = 0
1514 else:
1515 self._source_buffer = ffi.from_buffer(self._source)
1516 self._in_buffer.src = self._source_buffer
1517 self._in_buffer.size = len(self._source_buffer)
1518 self._in_buffer.pos = 0
1519
1520 get_input()
1521 result = decompress()
1522 if result:
1523 return result
1524
1977 # read1() dictates that we can perform at most 1 call to the underlying
1978 # stream to get input. However, we can't satisfy this restriction with
1979 # decompression because not all input generates output. So we allow
1980 # multiple read() calls. But unlike read(), we stop once we have any output.
1525 1981 while not self._finished_input:
1526 get_input()
1527 result = decompress()
1528 if result:
1529 return result
1982 self._read_input()
1983 self._decompress_into_buffer(out_buffer)
1984
1985 if out_buffer.pos:
1986 break
1530 1987
1531 1988 self._bytes_decompressed += out_buffer.pos
1532 1989 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
1533 1990
1991 def readinto1(self, b):
1992 if self._closed:
1993 raise ValueError('stream is closed')
1994
1995 if self._finished_output:
1996 return 0
1997
1998 # TODO use writable=True once we require CFFI >= 1.12.
1999 dest_buffer = ffi.from_buffer(b)
2000 ffi.memmove(b, b'', 0)
2001
2002 out_buffer = ffi.new('ZSTD_outBuffer *')
2003 out_buffer.dst = dest_buffer
2004 out_buffer.size = len(dest_buffer)
2005 out_buffer.pos = 0
2006
2007 while not self._finished_input and not self._finished_output:
2008 self._read_input()
2009 self._decompress_into_buffer(out_buffer)
2010
2011 if out_buffer.pos:
2012 break
2013
2014 self._bytes_decompressed += out_buffer.pos
2015 return out_buffer.pos
2016
1534 2017 def seek(self, pos, whence=os.SEEK_SET):
1535 2018 if self._closed:
1536 2019 raise ValueError('stream is closed')
@@ -1569,34 +2052,108 b' class DecompressionReader(object):'
1569 2052 return self._bytes_decompressed
1570 2053
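
The new read_across_frames argument controls whether ZstdDecompressionReader stops at the first frame boundary (the default, preserving prior behavior) or keeps decoding into subsequent frames. A sketch over two concatenated frames:

    import io
    import zstandard as zstd

    cctx = zstd.ZstdCompressor()
    frames = cctx.compress(b'first') + cctx.compress(b'second')
    dctx = zstd.ZstdDecompressor()

    # Default: reads stop at the end of the first frame.
    reader = dctx.stream_reader(io.BytesIO(frames))
    assert reader.read(1024) == b'first'

    # read_across_frames=True continues into the next frame.
    reader = dctx.stream_reader(io.BytesIO(frames), read_across_frames=True)
    assert reader.readall() == b'firstsecond'
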
1571 2054 class ZstdDecompressionWriter(object):
1572 def __init__(self, decompressor, writer, write_size):
2055 def __init__(self, decompressor, writer, write_size, write_return_read):
2056 decompressor._ensure_dctx()
2057
1573 2058 self._decompressor = decompressor
1574 2059 self._writer = writer
1575 2060 self._write_size = write_size
2061 self._write_return_read = bool(write_return_read)
1576 2062 self._entered = False
2063 self._closed = False
1577 2064
1578 2065 def __enter__(self):
2066 if self._closed:
2067 raise ValueError('stream is closed')
2068
1579 2069 if self._entered:
1580 2070 raise ZstdError('cannot __enter__ multiple times')
1581 2071
1582 self._decompressor._ensure_dctx()
1583 2072 self._entered = True
1584 2073
1585 2074 return self
1586 2075
1587 2076 def __exit__(self, exc_type, exc_value, exc_tb):
1588 2077 self._entered = False
2078 self.close()
1589 2079
1590 2080 def memory_size(self):
1591 if not self._decompressor._dctx:
1592 raise ZstdError('cannot determine size of inactive decompressor '
1593 'call when context manager is active')
1594
1595 2081 return lib.ZSTD_sizeof_DCtx(self._decompressor._dctx)
1596 2082
2083 def close(self):
2084 if self._closed:
2085 return
2086
2087 try:
2088 self.flush()
2089 finally:
2090 self._closed = True
2091
2092 f = getattr(self._writer, 'close', None)
2093 if f:
2094 f()
2095
2096 @property
2097 def closed(self):
2098 return self._closed
2099
2100 def fileno(self):
2101 f = getattr(self._writer, 'fileno', None)
2102 if f:
2103 return f()
2104 else:
2105 raise OSError('fileno not available on underlying writer')
2106
2107 def flush(self):
2108 if self._closed:
2109 raise ValueError('stream is closed')
2110
2111 f = getattr(self._writer, 'flush', None)
2112 if f:
2113 return f()
2114
2115 def isatty(self):
2116 return False
2117
2118 def readable(self):
2119 return False
2120
2121 def readline(self, size=-1):
2122 raise io.UnsupportedOperation()
2123
2124 def readlines(self, hint=-1):
2125 raise io.UnsupportedOperation()
2126
2127 def seek(self, offset, whence=None):
2128 raise io.UnsupportedOperation()
2129
2130 def seekable(self):
2131 return False
2132
2133 def tell(self):
2134 raise io.UnsupportedOperation()
2135
2136 def truncate(self, size=None):
2137 raise io.UnsupportedOperation()
2138
2139 def writable(self):
2140 return True
2141
2142 def writelines(self, lines):
2143 raise io.UnsupportedOperation()
2144
2145 def read(self, size=-1):
2146 raise io.UnsupportedOperation()
2147
2148 def readall(self):
2149 raise io.UnsupportedOperation()
2150
2151 def readinto(self, b):
2152 raise io.UnsupportedOperation()
2153
1597 2154 def write(self, data):
1598 if not self._entered:
1599 raise ZstdError('write must be called from an active context manager')
2155 if self._closed:
2156 raise ValueError('stream is closed')
1600 2157
1601 2158 total_write = 0
1602 2159
@@ -1616,7 +2173,7 b' class ZstdDecompressionWriter(object):'
1616 2173 dctx = self._decompressor._dctx
1617 2174
1618 2175 while in_buffer.pos < in_buffer.size:
1619 zresult = lib.ZSTD_decompress_generic(dctx, out_buffer, in_buffer)
2176 zresult = lib.ZSTD_decompressStream(dctx, out_buffer, in_buffer)
1620 2177 if lib.ZSTD_isError(zresult):
1621 2178 raise ZstdError('zstd decompress error: %s' %
1622 2179 _zstd_error(zresult))
@@ -1626,7 +2183,10 b' class ZstdDecompressionWriter(object):'
1626 2183 total_write += out_buffer.pos
1627 2184 out_buffer.pos = 0
1628 2185
1629 return total_write
2186 if self._write_return_read:
2187 return in_buffer.pos
2188 else:
2189 return total_write
1630 2190
1631 2191
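
ZstdDecompressionWriter gains the same io.RawIOBase-style surface as the compression writer, including write_return_read. A sketch (the output is read before close(), since close() also closes the wrapped stream per this diff):

    import io
    import zstandard as zstd

    frame = zstd.ZstdCompressor().compress(b'payload')

    dctx = zstd.ZstdDecompressor()
    dest = io.BytesIO()
    writer = dctx.stream_writer(dest, write_return_read=True)
    consumed = writer.write(frame)   # == len(frame) once fully consumed
    assert dest.getvalue() == b'payload'
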
1632 2192 class ZstdDecompressor(object):
@@ -1684,7 +2244,7 b' class ZstdDecompressor(object):'
1684 2244 in_buffer.size = len(data_buffer)
1685 2245 in_buffer.pos = 0
1686 2246
1687 zresult = lib.ZSTD_decompress_generic(self._dctx, out_buffer, in_buffer)
2247 zresult = lib.ZSTD_decompressStream(self._dctx, out_buffer, in_buffer)
1688 2248 if lib.ZSTD_isError(zresult):
1689 2249 raise ZstdError('decompression error: %s' %
1690 2250 _zstd_error(zresult))
@@ -1696,9 +2256,10 b' class ZstdDecompressor(object):'
1696 2256
1697 2257 return ffi.buffer(result_buffer, out_buffer.pos)[:]
1698 2258
1699 def stream_reader(self, source, read_size=DECOMPRESSION_RECOMMENDED_INPUT_SIZE):
2259 def stream_reader(self, source, read_size=DECOMPRESSION_RECOMMENDED_INPUT_SIZE,
2260 read_across_frames=False):
1700 2261 self._ensure_dctx()
1701 return DecompressionReader(self, source, read_size)
2262 return ZstdDecompressionReader(self, source, read_size, read_across_frames)
1702 2263
1703 2264 def decompressobj(self, write_size=DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE):
1704 2265 if write_size < 1:
@@ -1767,7 +2328,7 b' class ZstdDecompressor(object):'
1767 2328 while in_buffer.pos < in_buffer.size:
1768 2329 assert out_buffer.pos == 0
1769 2330
1770 zresult = lib.ZSTD_decompress_generic(self._dctx, out_buffer, in_buffer)
2331 zresult = lib.ZSTD_decompressStream(self._dctx, out_buffer, in_buffer)
1771 2332 if lib.ZSTD_isError(zresult):
1772 2333 raise ZstdError('zstd decompress error: %s' %
1773 2334 _zstd_error(zresult))
@@ -1787,11 +2348,13 b' class ZstdDecompressor(object):'
1787 2348
1788 2349 read_from = read_to_iter
1789 2350
1790 def stream_writer(self, writer, write_size=DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE):
2351 def stream_writer(self, writer, write_size=DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE,
2352 write_return_read=False):
1791 2353 if not hasattr(writer, 'write'):
1792 2354 raise ValueError('must pass an object with a write() method')
1793 2355
1794 return ZstdDecompressionWriter(self, writer, write_size)
2356 return ZstdDecompressionWriter(self, writer, write_size,
2357 write_return_read)
1795 2358
1796 2359 write_to = stream_writer
1797 2360
@@ -1829,7 +2392,7 b' class ZstdDecompressor(object):'
1829 2392
1830 2393 # Flush all read data to output.
1831 2394 while in_buffer.pos < in_buffer.size:
1832 zresult = lib.ZSTD_decompress_generic(self._dctx, out_buffer, in_buffer)
2395 zresult = lib.ZSTD_decompressStream(self._dctx, out_buffer, in_buffer)
1833 2396 if lib.ZSTD_isError(zresult):
1834 2397 raise ZstdError('zstd decompressor error: %s' %
1835 2398 _zstd_error(zresult))
@@ -1881,7 +2444,7 b' class ZstdDecompressor(object):'
1881 2444 in_buffer.size = len(chunk_buffer)
1882 2445 in_buffer.pos = 0
1883 2446
1884 zresult = lib.ZSTD_decompress_generic(self._dctx, out_buffer, in_buffer)
2447 zresult = lib.ZSTD_decompressStream(self._dctx, out_buffer, in_buffer)
1885 2448 if lib.ZSTD_isError(zresult):
1886 2449 raise ZstdError('could not decompress chunk 0: %s' %
1887 2450 _zstd_error(zresult))
@@ -1918,7 +2481,7 b' class ZstdDecompressor(object):'
1918 2481 in_buffer.size = len(chunk_buffer)
1919 2482 in_buffer.pos = 0
1920 2483
1921 zresult = lib.ZSTD_decompress_generic(self._dctx, out_buffer, in_buffer)
2484 zresult = lib.ZSTD_decompressStream(self._dctx, out_buffer, in_buffer)
1922 2485 if lib.ZSTD_isError(zresult):
1923 2486 raise ZstdError('could not decompress chunk %d: %s' %
1924 2487 _zstd_error(zresult))
@@ -1931,7 +2494,7 b' class ZstdDecompressor(object):'
1931 2494 return ffi.buffer(last_buffer, len(last_buffer))[:]
1932 2495
1933 2496 def _ensure_dctx(self, load_dict=True):
1934 lib.ZSTD_DCtx_reset(self._dctx)
2497 lib.ZSTD_DCtx_reset(self._dctx, lib.ZSTD_reset_session_only)
1935 2498
1936 2499 if self._max_window_size:
1937 2500 zresult = lib.ZSTD_DCtx_setMaxWindowSize(self._dctx,
@@ -210,7 +210,7 b' void zstd_module_init(PyObject* m) {'
210 210 We detect this mismatch here and refuse to load the module if this
211 211 scenario is detected.
212 212 */
213 if (ZSTD_VERSION_NUMBER != 10306 || ZSTD_versionNumber() != 10306) {
213 if (ZSTD_VERSION_NUMBER != 10308 || ZSTD_versionNumber() != 10308) {
214 214 PyErr_SetString(PyExc_ImportError, "zstd C API mismatch; Python bindings not compiled against expected zstd version");
215 215 return;
216 216 }
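
The version gate moves from 10306 (zstd 1.3.6) to 10308 (zstd 1.3.8). From Python, the bundled library version can be checked with something like the following, assuming the package exposes the ZSTD_VERSION tuple:

    import zstandard as zstd

    assert zstd.ZSTD_VERSION == (1, 3, 8)
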
@@ -339,17 +339,10 b' MEM_STATIC size_t BIT_getUpperBits(size_'
339 339
340 340 MEM_STATIC size_t BIT_getMiddleBits(size_t bitContainer, U32 const start, U32 const nbBits)
341 341 {
342 #if defined(__BMI__) && defined(__GNUC__) && __GNUC__*1000+__GNUC_MINOR__ >= 4008 /* experimental */
343 # if defined(__x86_64__)
344 if (sizeof(bitContainer)==8)
345 return _bextr_u64(bitContainer, start, nbBits);
346 else
347 # endif
348 return _bextr_u32(bitContainer, start, nbBits);
349 #else
342 U32 const regMask = sizeof(bitContainer)*8 - 1;
343 /* if start > regMask, bitstream is corrupted, and result is undefined */
350 344 assert(nbBits < BIT_MASK_SIZE);
351 return (bitContainer >> start) & BIT_mask[nbBits];
352 #endif
345 return (bitContainer >> (start & regMask)) & BIT_mask[nbBits];
353 346 }
354 347
355 348 MEM_STATIC size_t BIT_getLowerBits(size_t bitContainer, U32 const nbBits)
@@ -366,9 +359,13 b' MEM_STATIC size_t BIT_getLowerBits(size_'
366 359 * @return : value extracted */
367 360 MEM_STATIC size_t BIT_lookBits(const BIT_DStream_t* bitD, U32 nbBits)
368 361 {
369 #if defined(__BMI__) && defined(__GNUC__) /* experimental; fails if bitD->bitsConsumed + nbBits > sizeof(bitD->bitContainer)*8 */
362 /* arbitrate between double-shift and shift+mask */
363 #if 1
364 /* if bitD->bitsConsumed + nbBits > sizeof(bitD->bitContainer)*8,
365 * bitstream is likely corrupted, and result is undefined */
370 366 return BIT_getMiddleBits(bitD->bitContainer, (sizeof(bitD->bitContainer)*8) - bitD->bitsConsumed - nbBits, nbBits);
371 367 #else
368 /* this code path is slower on my os-x laptop */
372 369 U32 const regMask = sizeof(bitD->bitContainer)*8 - 1;
373 370 return ((bitD->bitContainer << (bitD->bitsConsumed & regMask)) >> 1) >> ((regMask-nbBits) & regMask);
374 371 #endif
@@ -392,7 +389,7 b' MEM_STATIC void BIT_skipBits(BIT_DStream'
392 389 * Read (consume) next n bits from local register and update.
393 390 * Pay attention to not read more than nbBits contained into local register.
394 391 * @return : extracted value. */
395 MEM_STATIC size_t BIT_readBits(BIT_DStream_t* bitD, U32 nbBits)
392 MEM_STATIC size_t BIT_readBits(BIT_DStream_t* bitD, unsigned nbBits)
396 393 {
397 394 size_t const value = BIT_lookBits(bitD, nbBits);
398 395 BIT_skipBits(bitD, nbBits);
@@ -401,7 +398,7 b' MEM_STATIC size_t BIT_readBits(BIT_DStre'
401 398
402 399 /*! BIT_readBitsFast() :
403 400 * unsafe version; only works if nbBits >= 1 */
404 MEM_STATIC size_t BIT_readBitsFast(BIT_DStream_t* bitD, U32 nbBits)
401 MEM_STATIC size_t BIT_readBitsFast(BIT_DStream_t* bitD, unsigned nbBits)
405 402 {
406 403 size_t const value = BIT_lookBitsFast(bitD, nbBits);
407 404 assert(nbBits >= 1);
@@ -15,6 +15,8 b''
15 15 * Compiler specifics
16 16 *********************************************************/
17 17 /* force inlining */
18
19 #if !defined(ZSTD_NO_INLINE)
18 20 #if defined (__GNUC__) || defined(__cplusplus) || defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L /* C99 */
19 21 # define INLINE_KEYWORD inline
20 22 #else
@@ -29,6 +31,13 b''
29 31 # define FORCE_INLINE_ATTR
30 32 #endif
31 33
34 #else
35
36 #define INLINE_KEYWORD
37 #define FORCE_INLINE_ATTR
38
39 #endif
40
32 41 /**
33 42 * FORCE_INLINE_TEMPLATE is used to define C "templates", which take constant
34 43 * parameters. They must be inlined for the compiler to eliminate the constant
@@ -89,23 +98,21 b''
89 98 #endif
90 99
91 100 /* prefetch
92 * can be disabled, by declaring NO_PREFETCH macro
93 * All prefetch invocations use a single default locality 2,
94 * generating instruction prefetcht1,
95 * which, according to Intel, means "load data into L2 cache".
96 * This is a good enough "middle ground" for the time being,
97 * though in theory, it would be better to specialize locality depending on data being prefetched.
98 * Tests could not determine any sensible difference based on locality value. */
101 * can be disabled, by declaring NO_PREFETCH build macro */
99 102 #if defined(NO_PREFETCH)
100 # define PREFETCH(ptr) (void)(ptr) /* disabled */
103 # define PREFETCH_L1(ptr) (void)(ptr) /* disabled */
104 # define PREFETCH_L2(ptr) (void)(ptr) /* disabled */
101 105 #else
102 106 # if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_I86)) /* _mm_prefetch() is not defined outside of x86/x64 */
103 107 # include <mmintrin.h> /* https://msdn.microsoft.com/fr-fr/library/84szxsww(v=vs.90).aspx */
104 # define PREFETCH(ptr) _mm_prefetch((const char*)(ptr), _MM_HINT_T1)
108 # define PREFETCH_L1(ptr) _mm_prefetch((const char*)(ptr), _MM_HINT_T0)
109 # define PREFETCH_L2(ptr) _mm_prefetch((const char*)(ptr), _MM_HINT_T1)
105 110 # elif defined(__GNUC__) && ( (__GNUC__ >= 4) || ( (__GNUC__ == 3) && (__GNUC_MINOR__ >= 1) ) )
106 # define PREFETCH(ptr) __builtin_prefetch((ptr), 0 /* rw==read */, 2 /* locality */)
111 # define PREFETCH_L1(ptr) __builtin_prefetch((ptr), 0 /* rw==read */, 3 /* locality */)
112 # define PREFETCH_L2(ptr) __builtin_prefetch((ptr), 0 /* rw==read */, 2 /* locality */)
107 113 # else
108 # define PREFETCH(ptr) (void)(ptr) /* disabled */
114 # define PREFETCH_L1(ptr) (void)(ptr) /* disabled */
115 # define PREFETCH_L2(ptr) (void)(ptr) /* disabled */
109 116 # endif
110 117 #endif /* NO_PREFETCH */
111 118
@@ -116,7 +123,7 b''
116 123 size_t const _size = (size_t)(s); \
117 124 size_t _pos; \
118 125 for (_pos=0; _pos<_size; _pos+=CACHELINE_SIZE) { \
119 PREFETCH(_ptr + _pos); \
126 PREFETCH_L2(_ptr + _pos); \
120 127 } \
121 128 }
122 129
@@ -78,7 +78,7 b' MEM_STATIC ZSTD_cpuid_t ZSTD_cpuid(void)'
78 78 __asm__(
79 79 "pushl %%ebx\n\t"
80 80 "cpuid\n\t"
81 "movl %%ebx, %%eax\n\r"
81 "movl %%ebx, %%eax\n\t"
82 82 "popl %%ebx"
83 83 : "=a"(f7b), "=c"(f7c)
84 84 : "a"(7), "c"(0)
@@ -57,9 +57,9 b' extern "C" {'
57 57 #endif
58 58
59 59
60 /* static assert is triggered at compile time, leaving no runtime artefact,
61 * but can only work with compile-time constants.
62 * This variant can only be used inside a function. */
60 /* static assert is triggered at compile time, leaving no runtime artefact.
61 * static assert only works with compile-time constants.
62 * Also, this variant can only be used inside a function. */
63 63 #define DEBUG_STATIC_ASSERT(c) (void)sizeof(char[(c) ? 1 : -1])
64 64
65 65
@@ -70,9 +70,19 b' extern "C" {'
70 70 # define DEBUGLEVEL 0
71 71 #endif
72 72
73
74 /* DEBUGFILE can be defined externally,
75 * typically through compiler command line.
76 * note : currently useless.
77 * Value must be stderr or stdout */
78 #ifndef DEBUGFILE
79 # define DEBUGFILE stderr
80 #endif
81
82
73 83 /* recommended values for DEBUGLEVEL :
74 * 0 : no debug, all run-time functions disabled
75 * 1 : no display, enables assert() only
84 * 0 : release mode, no debug, all run-time checks disabled
85 * 1 : enables assert() only, no display
76 86 * 2 : reserved, for currently active debug path
77 87 * 3 : events once per object lifetime (CCtx, CDict, etc.)
78 88 * 4 : events once per frame
@@ -81,7 +91,7 b' extern "C" {'
81 91 * 7+: events at every position (*very* verbose)
82 92 *
83 93 * It's generally inconvenient to output traces > 5.
84 * In which case, it's possible to selectively enable higher verbosity levels
94 * In which case, it's possible to selectively trigger high verbosity levels
85 95 * by modifying g_debug_level.
86 96 */
87 97
@@ -95,11 +105,12 b' extern "C" {'
95 105
96 106 #if (DEBUGLEVEL>=2)
97 107 # include <stdio.h>
98 extern int g_debuglevel; /* here, this variable is only declared,
99 it actually lives in debug.c,
100 and is shared by the whole process.
101 It's typically used to enable very verbose levels
102 on selective conditions (such as position in src) */
108 extern int g_debuglevel; /* the variable is only declared,
109 it actually lives in debug.c,
110 and is shared by the whole process.
111 It's not thread-safe.
112 It's useful when enabling very verbose levels
113 on selective conditions (such as position in src) */
103 114
104 115 # define RAWLOG(l, ...) { \
105 116 if (l<=g_debuglevel) { \
@@ -14,6 +14,10 b''
14 14
15 15 const char* ERR_getErrorString(ERR_enum code)
16 16 {
17 #ifdef ZSTD_STRIP_ERROR_STRINGS
18 (void)code;
19 return "Error strings stripped";
20 #else
17 21 static const char* const notErrorCode = "Unspecified error code";
18 22 switch( code )
19 23 {
@@ -39,10 +43,12 b' const char* ERR_getErrorString(ERR_enum '
39 43 case PREFIX(dictionaryCreation_failed): return "Cannot create Dictionary from provided samples";
40 44 case PREFIX(dstSize_tooSmall): return "Destination buffer is too small";
41 45 case PREFIX(srcSize_wrong): return "Src size is incorrect";
46 case PREFIX(dstBuffer_null): return "Operation on NULL destination buffer";
42 47 /* following error codes are not stable and may be removed or changed in a future version */
43 48 case PREFIX(frameIndex_tooLarge): return "Frame index is too large";
44 49 case PREFIX(seekableIO): return "An I/O error occurred when reading/seeking";
45 50 case PREFIX(maxCode):
46 51 default: return notErrorCode;
47 52 }
53 #endif
48 54 }
@@ -512,7 +512,7 b' MEM_STATIC void FSE_initCState(FSE_CStat'
512 512 const U32 tableLog = MEM_read16(ptr);
513 513 statePtr->value = (ptrdiff_t)1<<tableLog;
514 514 statePtr->stateTable = u16ptr+2;
515 statePtr->symbolTT = ((const U32*)ct + 1 + (tableLog ? (1<<(tableLog-1)) : 1));
515 statePtr->symbolTT = ct + 1 + (tableLog ? (1<<(tableLog-1)) : 1);
516 516 statePtr->stateLog = tableLog;
517 517 }
518 518
@@ -531,7 +531,7 b' MEM_STATIC void FSE_initCState2(FSE_CSta'
531 531 }
532 532 }
533 533
534 MEM_STATIC void FSE_encodeSymbol(BIT_CStream_t* bitC, FSE_CState_t* statePtr, U32 symbol)
534 MEM_STATIC void FSE_encodeSymbol(BIT_CStream_t* bitC, FSE_CState_t* statePtr, unsigned symbol)
535 535 {
536 536 FSE_symbolCompressionTransform const symbolTT = ((const FSE_symbolCompressionTransform*)(statePtr->symbolTT))[symbol];
537 537 const U16* const stateTable = (const U16*)(statePtr->stateTable);
@@ -173,15 +173,19 b' typedef U32 HUF_DTable;'
173 173 * Advanced decompression functions
174 174 ******************************************/
175 175 size_t HUF_decompress4X1 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< single-symbol decoder */
176 #ifndef HUF_FORCE_DECOMPRESS_X1
176 177 size_t HUF_decompress4X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< double-symbols decoder */
178 #endif
177 179
178 180 size_t HUF_decompress4X_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< decodes RLE and uncompressed */
179 181 size_t HUF_decompress4X_hufOnly(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< considers RLE and uncompressed as errors */
180 182 size_t HUF_decompress4X_hufOnly_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< considers RLE and uncompressed as errors */
181 183 size_t HUF_decompress4X1_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< single-symbol decoder */
182 184 size_t HUF_decompress4X1_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< single-symbol decoder */
185 #ifndef HUF_FORCE_DECOMPRESS_X1
183 186 size_t HUF_decompress4X2_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< double-symbols decoder */
184 187 size_t HUF_decompress4X2_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< double-symbols decoder */
188 #endif
185 189
186 190
187 191 /* ****************************************
@@ -228,7 +232,7 b' size_t HUF_compress4X_repeat(void* dst, '
228 232 #define HUF_CTABLE_WORKSPACE_SIZE_U32 (2*HUF_SYMBOLVALUE_MAX +1 +1)
229 233 #define HUF_CTABLE_WORKSPACE_SIZE (HUF_CTABLE_WORKSPACE_SIZE_U32 * sizeof(unsigned))
230 234 size_t HUF_buildCTable_wksp (HUF_CElt* tree,
231 const U32* count, U32 maxSymbolValue, U32 maxNbBits,
235 const unsigned* count, U32 maxSymbolValue, U32 maxNbBits,
232 236 void* workSpace, size_t wkspSize);
233 237
234 238 /*! HUF_readStats() :
@@ -277,14 +281,22 b' U32 HUF_selectDecoder (size_t dstSize, s'
277 281 #define HUF_DECOMPRESS_WORKSPACE_SIZE (2 << 10)
278 282 #define HUF_DECOMPRESS_WORKSPACE_SIZE_U32 (HUF_DECOMPRESS_WORKSPACE_SIZE / sizeof(U32))
279 283
284 #ifndef HUF_FORCE_DECOMPRESS_X2
280 285 size_t HUF_readDTableX1 (HUF_DTable* DTable, const void* src, size_t srcSize);
281 286 size_t HUF_readDTableX1_wksp (HUF_DTable* DTable, const void* src, size_t srcSize, void* workSpace, size_t wkspSize);
287 #endif
288 #ifndef HUF_FORCE_DECOMPRESS_X1
282 289 size_t HUF_readDTableX2 (HUF_DTable* DTable, const void* src, size_t srcSize);
283 290 size_t HUF_readDTableX2_wksp (HUF_DTable* DTable, const void* src, size_t srcSize, void* workSpace, size_t wkspSize);
291 #endif
284 292
285 293 size_t HUF_decompress4X_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable);
294 #ifndef HUF_FORCE_DECOMPRESS_X2
286 295 size_t HUF_decompress4X1_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable);
296 #endif
297 #ifndef HUF_FORCE_DECOMPRESS_X1
287 298 size_t HUF_decompress4X2_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable);
299 #endif
288 300
289 301
290 302 /* ====================== */
@@ -306,24 +318,36 b' size_t HUF_compress1X_repeat(void* dst, '
306 318 HUF_CElt* hufTable, HUF_repeat* repeat, int preferRepeat, int bmi2);
307 319
308 320 size_t HUF_decompress1X1 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /* single-symbol decoder */
321 #ifndef HUF_FORCE_DECOMPRESS_X1
309 322 size_t HUF_decompress1X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /* double-symbol decoder */
323 #endif
310 324
311 325 size_t HUF_decompress1X_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);
312 326 size_t HUF_decompress1X_DCtx_wksp (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize);
327 #ifndef HUF_FORCE_DECOMPRESS_X2
313 328 size_t HUF_decompress1X1_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< single-symbol decoder */
314 329 size_t HUF_decompress1X1_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< single-symbol decoder */
330 #endif
331 #ifndef HUF_FORCE_DECOMPRESS_X1
315 332 size_t HUF_decompress1X2_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< double-symbols decoder */
316 333 size_t HUF_decompress1X2_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< double-symbols decoder */
334 #endif
317 335
318 336 size_t HUF_decompress1X_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable); /**< automatic selection of single or double symbol decoder, based on DTable */
337 #ifndef HUF_FORCE_DECOMPRESS_X2
319 338 size_t HUF_decompress1X1_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable);
339 #endif
340 #ifndef HUF_FORCE_DECOMPRESS_X1
320 341 size_t HUF_decompress1X2_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable);
342 #endif
321 343
322 344 /* BMI2 variants.
323 345 * If the CPU has BMI2 support, pass bmi2=1, otherwise pass bmi2=0.
324 346 */
325 347 size_t HUF_decompress1X_usingDTable_bmi2(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable, int bmi2);
348 #ifndef HUF_FORCE_DECOMPRESS_X2
326 349 size_t HUF_decompress1X1_DCtx_wksp_bmi2(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize, int bmi2);
350 #endif
327 351 size_t HUF_decompress4X_usingDTable_bmi2(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable, int bmi2);
328 352 size_t HUF_decompress4X_hufOnly_wksp_bmi2(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize, int bmi2);
329 353
@@ -39,6 +39,10 b' extern "C" {'
39 39 # define MEM_STATIC static /* this version may generate warnings for unused static functions; disable the relevant warning */
40 40 #endif
41 41
42 #ifndef __has_builtin
43 # define __has_builtin(x) 0 /* compat. with non-clang compilers */
44 #endif
45
42 46 /* code only tested on 32 and 64 bits systems */
43 47 #define MEM_STATIC_ASSERT(c) { enum { MEM_static_assert = 1/(int)(!!(c)) }; }
44 48 MEM_STATIC void MEM_check(void) { MEM_STATIC_ASSERT((sizeof(size_t)==4) || (sizeof(size_t)==8)); }
@@ -198,7 +202,8 b' MEM_STATIC U32 MEM_swap32(U32 in)'
198 202 {
199 203 #if defined(_MSC_VER) /* Visual Studio */
200 204 return _byteswap_ulong(in);
201 #elif defined (__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__ >= 403)
205 #elif (defined (__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__ >= 403)) \
206 || (defined(__clang__) && __has_builtin(__builtin_bswap32))
202 207 return __builtin_bswap32(in);
203 208 #else
204 209 return ((in << 24) & 0xff000000 ) |
@@ -212,7 +217,8 b' MEM_STATIC U64 MEM_swap64(U64 in)'
212 217 {
213 218 #if defined(_MSC_VER) /* Visual Studio */
214 219 return _byteswap_uint64(in);
215 #elif defined (__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__ >= 403)
220 #elif (defined (__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__ >= 403)) \
221 || (defined(__clang__) && __has_builtin(__builtin_bswap64))
216 222 return __builtin_bswap64(in);
217 223 #else
218 224 return ((in << 56) & 0xff00000000000000ULL) |
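
The two byte-swap hunks above pair a __has_builtin compatibility shim with the existing GCC version test, so clang is detected by feature rather than by its impersonated GCC version. The same portability pattern in isolation, as a self-contained sketch:

    #include <stdint.h>
    #include <stdlib.h>   /* _byteswap_ulong on MSVC */

    #ifndef __has_builtin
    #  define __has_builtin(x) 0   /* non-clang compilers: assume absent */
    #endif

    static uint32_t swap32(uint32_t v)
    {
    #if defined(_MSC_VER)
        return _byteswap_ulong(v);
    #elif (defined(__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__ >= 403)) \
       || (defined(__clang__) && __has_builtin(__builtin_bswap32))
        return __builtin_bswap32(v);
    #else   /* portable byte-by-byte fallback */
        return ((v << 24) & 0xff000000u) | ((v <<  8) & 0x00ff0000u)
             | ((v >>  8) & 0x0000ff00u) | ((v >> 24) & 0x000000ffu);
    #endif
    }
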
@@ -88,8 +88,8 b' static void* POOL_thread(void* opaque) {'
88 88 ctx->numThreadsBusy++;
89 89 ctx->queueEmpty = ctx->queueHead == ctx->queueTail;
90 90 /* Unlock the mutex, signal a pusher, and run the job */
91 ZSTD_pthread_cond_signal(&ctx->queuePushCond);
91 92 ZSTD_pthread_mutex_unlock(&ctx->queueMutex);
92 ZSTD_pthread_cond_signal(&ctx->queuePushCond);
93 93
94 94 job.function(job.opaque);
95 95
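
The pool.c reorder moves the condition signal before the mutex unlock. Signaling while the lock is still held means a waiter cannot observe the state change and tear the queue down before the signal is issued; presumably that window is what the change closes. The pattern in miniature, standard pthreads with illustrative names:

    #include <pthread.h>

    typedef struct {
        pthread_mutex_t mutex;
        pthread_cond_t  cond;
        int             ready;
    } gate_t;

    static void gate_open(gate_t* g)
    {
        pthread_mutex_lock(&g->mutex);
        g->ready = 1;
        pthread_cond_signal(&g->cond);    /* signal first... */
        pthread_mutex_unlock(&g->mutex);  /* ...then release the lock */
    }

    static void gate_wait(gate_t* g)
    {
        pthread_mutex_lock(&g->mutex);
        while (!g->ready)
            pthread_cond_wait(&g->cond, &g->mutex);
        pthread_mutex_unlock(&g->mutex);
    }
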
@@ -30,8 +30,10 b' const char* ZSTD_versionString(void) { r'
30 30 /*-****************************************
31 31 * ZSTD Error Management
32 32 ******************************************/
33 #undef ZSTD_isError /* defined within zstd_internal.h */
33 34 /*! ZSTD_isError() :
34 * tells if a return value is an error code */
35 * tells if a return value is an error code
36 * symbol is required for external callers */
35 37 unsigned ZSTD_isError(size_t code) { return ERR_isError(code); }
36 38
37 39 /*! ZSTD_getErrorName() :
@@ -72,6 +72,7 b' typedef enum {'
72 72 ZSTD_error_workSpace_tooSmall= 66,
73 73 ZSTD_error_dstSize_tooSmall = 70,
74 74 ZSTD_error_srcSize_wrong = 72,
75 ZSTD_error_dstBuffer_null = 74,
75 76 /* following error codes are __NOT STABLE__, they can be removed or changed in future versions */
76 77 ZSTD_error_frameIndex_tooLarge = 100,
77 78 ZSTD_error_seekableIO = 102,
@@ -41,6 +41,9 b' extern "C" {'
41 41
42 42 /* ---- static assert (debug) --- */
43 43 #define ZSTD_STATIC_ASSERT(c) DEBUG_STATIC_ASSERT(c)
44 #define ZSTD_isError ERR_isError /* for inlining */
45 #define FSE_isError ERR_isError
46 #define HUF_isError ERR_isError
44 47
45 48
46 49 /*-*************************************
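
Together, the zstd_common.c and zstd_internal.h hunks implement an inline-internally, export-publicly split: every in-library call to ZSTD_isError() now compiles down to ERR_isError(), while the single #undef in zstd_common.c still emits the public symbol external callers link against. A single-file sketch of the pattern, using hypothetical MYLIB_* names:

    #include <stddef.h>

    /* internal, inlinable predicate (threshold is illustrative) */
    static unsigned ERR_isError(size_t code) { return code > (size_t)-100; }

    /* internal headers alias the public name to the static one */
    #define MYLIB_isError ERR_isError

    static size_t internal_caller(size_t code)
    {
        return MYLIB_isError(code) ? 0 : code;   /* no call overhead */
    }

    /* exactly one translation unit drops the alias and defines the
     * real, exported function */
    #undef MYLIB_isError
    unsigned MYLIB_isError(size_t code) { return ERR_isError(code); }
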
@@ -75,7 +78,6 b' static const U32 repStartValue[ZSTD_REP_'
75 78 #define BIT0 1
76 79
77 80 #define ZSTD_WINDOWLOG_ABSOLUTEMIN 10
78 #define ZSTD_WINDOWLOG_DEFAULTMAX 27 /* Default maximum allowed window log */
79 81 static const size_t ZSTD_fcs_fieldSize[4] = { 0, 2, 4, 8 };
80 82 static const size_t ZSTD_did_fieldSize[4] = { 0, 1, 2, 4 };
81 83
@@ -242,7 +244,7 b' typedef struct {'
242 244 blockType_e blockType;
243 245 U32 lastBlock;
244 246 U32 origSize;
245 } blockProperties_t;
247 } blockProperties_t; /* declared here for decompress and fullbench */
246 248
247 249 /*! ZSTD_getcBlockSize() :
248 250 * Provides the size of compressed block from block header `src` */
@@ -250,6 +252,13 b' typedef struct {'
250 252 size_t ZSTD_getcBlockSize(const void* src, size_t srcSize,
251 253 blockProperties_t* bpPtr);
252 254
255 /*! ZSTD_decodeSeqHeaders() :
256 * decode sequence header from src */
257 /* Used by: decompress, fullbench (does not get its definition from here) */
258 size_t ZSTD_decodeSeqHeaders(ZSTD_DCtx* dctx, int* nbSeqPtr,
259 const void* src, size_t srcSize);
260
261
253 262 #if defined (__cplusplus)
254 263 }
255 264 #endif
@@ -115,7 +115,7 b' size_t FSE_buildCTable_wksp(FSE_CTable* '
115 115 /* symbol start positions */
116 116 { U32 u;
117 117 cumul[0] = 0;
118 for (u=1; u<=maxSymbolValue+1; u++) {
118 for (u=1; u <= maxSymbolValue+1; u++) {
119 119 if (normalizedCounter[u-1]==-1) { /* Low proba symbol */
120 120 cumul[u] = cumul[u-1] + 1;
121 121 tableSymbol[highThreshold--] = (FSE_FUNCTION_TYPE)(u-1);
@@ -658,7 +658,7 b' size_t FSE_compress_wksp (void* dst, siz'
658 658 BYTE* op = ostart;
659 659 BYTE* const oend = ostart + dstSize;
660 660
661 U32 count[FSE_MAX_SYMBOL_VALUE+1];
661 unsigned count[FSE_MAX_SYMBOL_VALUE+1];
662 662 S16 norm[FSE_MAX_SYMBOL_VALUE+1];
663 663 FSE_CTable* CTable = (FSE_CTable*)workSpace;
664 664 size_t const CTableSize = FSE_CTABLE_SIZE_U32(tableLog, maxSymbolValue);
@@ -672,7 +672,7 b' size_t FSE_compress_wksp (void* dst, siz'
672 672 if (!tableLog) tableLog = FSE_DEFAULT_TABLELOG;
673 673
674 674 /* Scan input and build symbol stats */
675 { CHECK_V_F(maxCount, HIST_count_wksp(count, &maxSymbolValue, src, srcSize, (unsigned*)scratchBuffer) );
675 { CHECK_V_F(maxCount, HIST_count_wksp(count, &maxSymbolValue, src, srcSize, scratchBuffer, scratchBufferSize) );
676 676 if (maxCount == srcSize) return 1; /* only a single symbol in src : rle */
677 677 if (maxCount == 1) return 0; /* each symbol present maximum once => not compressible */
678 678 if (maxCount < (srcSize >> 7)) return 0; /* Heuristic : not compressible enough */
@@ -73,6 +73,7 b' unsigned HIST_count_simple(unsigned* cou'
73 73 return largestCount;
74 74 }
75 75
76 typedef enum { trustInput, checkMaxSymbolValue } HIST_checkInput_e;
76 77
77 78 /* HIST_count_parallel_wksp() :
78 79 * store histogram into 4 intermediate tables, recombined at the end.
@@ -85,8 +86,8 b' unsigned HIST_count_simple(unsigned* cou'
85 86 static size_t HIST_count_parallel_wksp(
86 87 unsigned* count, unsigned* maxSymbolValuePtr,
87 88 const void* source, size_t sourceSize,
88 unsigned checkMax,
89 unsigned* const workSpace)
89 HIST_checkInput_e check,
90 U32* const workSpace)
90 91 {
91 92 const BYTE* ip = (const BYTE*)source;
92 93 const BYTE* const iend = ip+sourceSize;
@@ -137,7 +138,7 b' static size_t HIST_count_parallel_wksp('
137 138 /* finish last symbols */
138 139 while (ip<iend) Counting1[*ip++]++;
139 140
140 if (checkMax) { /* verify stats will fit into destination table */
141 if (check) { /* verify stats will fit into destination table */
141 142 U32 s; for (s=255; s>maxSymbolValue; s--) {
142 143 Counting1[s] += Counting2[s] + Counting3[s] + Counting4[s];
143 144 if (Counting1[s]) return ERROR(maxSymbolValue_tooSmall);
@@ -157,14 +158,18 b' static size_t HIST_count_parallel_wksp('
157 158
158 159 /* HIST_countFast_wksp() :
159 160 * Same as HIST_countFast(), but using an externally provided scratch buffer.
160 * `workSpace` size must be table of >= HIST_WKSP_SIZE_U32 unsigned */
161 * `workSpace` is a writable buffer which must be 4-bytes aligned,
162 * `workSpaceSize` must be >= HIST_WKSP_SIZE
163 */
161 164 size_t HIST_countFast_wksp(unsigned* count, unsigned* maxSymbolValuePtr,
162 165 const void* source, size_t sourceSize,
163 unsigned* workSpace)
166 void* workSpace, size_t workSpaceSize)
164 167 {
165 168 if (sourceSize < 1500) /* heuristic threshold */
166 169 return HIST_count_simple(count, maxSymbolValuePtr, source, sourceSize);
167 return HIST_count_parallel_wksp(count, maxSymbolValuePtr, source, sourceSize, 0, workSpace);
170 if ((size_t)workSpace & 3) return ERROR(GENERIC); /* must be aligned on 4-bytes boundaries */
171 if (workSpaceSize < HIST_WKSP_SIZE) return ERROR(workSpace_tooSmall);
172 return HIST_count_parallel_wksp(count, maxSymbolValuePtr, source, sourceSize, trustInput, (U32*)workSpace);
168 173 }
169 174
170 175 /* fast variant (unsafe : won't check if src contains values beyond count[] limit) */
@@ -172,24 +177,27 b' size_t HIST_countFast(unsigned* count, u'
172 177 const void* source, size_t sourceSize)
173 178 {
174 179 unsigned tmpCounters[HIST_WKSP_SIZE_U32];
175 return HIST_countFast_wksp(count, maxSymbolValuePtr, source, sourceSize, tmpCounters);
180 return HIST_countFast_wksp(count, maxSymbolValuePtr, source, sourceSize, tmpCounters, sizeof(tmpCounters));
176 181 }
177 182
178 183 /* HIST_count_wksp() :
179 184 * Same as HIST_count(), but using an externally provided scratch buffer.
180 185 * `workSpace` size must be table of >= HIST_WKSP_SIZE_U32 unsigned */
181 186 size_t HIST_count_wksp(unsigned* count, unsigned* maxSymbolValuePtr,
182 const void* source, size_t sourceSize, unsigned* workSpace)
187 const void* source, size_t sourceSize,
188 void* workSpace, size_t workSpaceSize)
183 189 {
190 if ((size_t)workSpace & 3) return ERROR(GENERIC); /* must be aligned on 4-bytes boundaries */
191 if (workSpaceSize < HIST_WKSP_SIZE) return ERROR(workSpace_tooSmall);
184 192 if (*maxSymbolValuePtr < 255)
185 return HIST_count_parallel_wksp(count, maxSymbolValuePtr, source, sourceSize, 1, workSpace);
193 return HIST_count_parallel_wksp(count, maxSymbolValuePtr, source, sourceSize, checkMaxSymbolValue, (U32*)workSpace);
186 194 *maxSymbolValuePtr = 255;
187 return HIST_countFast_wksp(count, maxSymbolValuePtr, source, sourceSize, workSpace);
195 return HIST_countFast_wksp(count, maxSymbolValuePtr, source, sourceSize, workSpace, workSpaceSize);
188 196 }
189 197
190 198 size_t HIST_count(unsigned* count, unsigned* maxSymbolValuePtr,
191 199 const void* src, size_t srcSize)
192 200 {
193 201 unsigned tmpCounters[HIST_WKSP_SIZE_U32];
194 return HIST_count_wksp(count, maxSymbolValuePtr, src, srcSize, tmpCounters);
202 return HIST_count_wksp(count, maxSymbolValuePtr, src, srcSize, tmpCounters, sizeof(tmpCounters));
195 203 }
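
The histogram workspace now travels as a (void*, size_t) pair, and both entry points verify 4-byte alignment and a minimum size of HIST_WKSP_SIZE before use, so an undersized buffer fails with an error code instead of overflowing. A hedged sketch of a caller under the new signature:

    #include "hist.h"

    static size_t histogram256(unsigned count[256],
                               const void* src, size_t srcSize)
    {
        unsigned wksp[HIST_WKSP_SIZE_U32];   /* naturally 4-byte aligned */
        unsigned maxSymbolValue = 255;
        return HIST_count_wksp(count, &maxSymbolValue, src, srcSize,
                               wksp, sizeof(wksp));
    }
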
@@ -41,11 +41,11 b''
41 41
42 42 /*! HIST_count():
43 43 * Provides the precise count of each byte within a table 'count'.
44 * 'count' is a table of unsigned int, of minimum size (*maxSymbolValuePtr+1).
44 * 'count' is a table of unsigned int, of minimum size (*maxSymbolValuePtr+1).
45 45 * Updates *maxSymbolValuePtr with actual largest symbol value detected.
46 * @return : count of the most frequent symbol (which isn't identified).
47 * or an error code, which can be tested using HIST_isError().
48 * note : if return == srcSize, there is only one symbol.
46 * @return : count of the most frequent symbol (which isn't identified).
47 * or an error code, which can be tested using HIST_isError().
48 * note : if return == srcSize, there is only one symbol.
49 49 */
50 50 size_t HIST_count(unsigned* count, unsigned* maxSymbolValuePtr,
51 51 const void* src, size_t srcSize);
@@ -56,14 +56,16 b' unsigned HIST_isError(size_t code); /**'
56 56 /* --- advanced histogram functions --- */
57 57
58 58 #define HIST_WKSP_SIZE_U32 1024
59 #define HIST_WKSP_SIZE (HIST_WKSP_SIZE_U32 * sizeof(unsigned))
59 60 /** HIST_count_wksp() :
60 61 * Same as HIST_count(), but using an externally provided scratch buffer.
61 62 * Benefit is this function will use very little stack space.
62 * `workSpace` must be a table of unsigned of size >= HIST_WKSP_SIZE_U32
63 * `workSpace` is a writable buffer which must be 4-bytes aligned,
64 * `workSpaceSize` must be >= HIST_WKSP_SIZE
63 65 */
64 66 size_t HIST_count_wksp(unsigned* count, unsigned* maxSymbolValuePtr,
65 67 const void* src, size_t srcSize,
66 unsigned* workSpace);
68 void* workSpace, size_t workSpaceSize);
67 69
68 70 /** HIST_countFast() :
69 71 * same as HIST_count(), but blindly trusts that all byte values within src are <= *maxSymbolValuePtr.
@@ -74,11 +76,12 b' size_t HIST_countFast(unsigned* count, u'
74 76
75 77 /** HIST_countFast_wksp() :
76 78 * Same as HIST_countFast(), but using an externally provided scratch buffer.
77 * `workSpace` must be a table of unsigned of size >= HIST_WKSP_SIZE_U32
79 * `workSpace` is a writable buffer which must be 4-bytes aligned,
80 * `workSpaceSize` must be >= HIST_WKSP_SIZE
78 81 */
79 82 size_t HIST_countFast_wksp(unsigned* count, unsigned* maxSymbolValuePtr,
80 83 const void* src, size_t srcSize,
81 unsigned* workSpace);
84 void* workSpace, size_t workSpaceSize);
82 85
83 86 /*! HIST_count_simple() :
84 87 * Same as HIST_countFast(), this function is unsafe,
@@ -88,13 +88,13 b' static size_t HUF_compressWeights (void*'
88 88 BYTE* op = ostart;
89 89 BYTE* const oend = ostart + dstSize;
90 90
91 U32 maxSymbolValue = HUF_TABLELOG_MAX;
91 unsigned maxSymbolValue = HUF_TABLELOG_MAX;
92 92 U32 tableLog = MAX_FSE_TABLELOG_FOR_HUFF_HEADER;
93 93
94 94 FSE_CTable CTable[FSE_CTABLE_SIZE_U32(MAX_FSE_TABLELOG_FOR_HUFF_HEADER, HUF_TABLELOG_MAX)];
95 95 BYTE scratchBuffer[1<<MAX_FSE_TABLELOG_FOR_HUFF_HEADER];
96 96
97 U32 count[HUF_TABLELOG_MAX+1];
97 unsigned count[HUF_TABLELOG_MAX+1];
98 98 S16 norm[HUF_TABLELOG_MAX+1];
99 99
100 100 /* init conditions */
@@ -134,7 +134,7 b' struct HUF_CElt_s {'
134 134 `CTable` : Huffman tree to save, using huf representation.
135 135 @return : size of saved CTable */
136 136 size_t HUF_writeCTable (void* dst, size_t maxDstSize,
137 const HUF_CElt* CTable, U32 maxSymbolValue, U32 huffLog)
137 const HUF_CElt* CTable, unsigned maxSymbolValue, unsigned huffLog)
138 138 {
139 139 BYTE bitsToWeight[HUF_TABLELOG_MAX + 1]; /* precomputed conversion table */
140 140 BYTE huffWeight[HUF_SYMBOLVALUE_MAX];
@@ -169,7 +169,7 b' size_t HUF_writeCTable (void* dst, size_'
169 169 }
170 170
171 171
172 size_t HUF_readCTable (HUF_CElt* CTable, U32* maxSymbolValuePtr, const void* src, size_t srcSize)
172 size_t HUF_readCTable (HUF_CElt* CTable, unsigned* maxSymbolValuePtr, const void* src, size_t srcSize)
173 173 {
174 174 BYTE huffWeight[HUF_SYMBOLVALUE_MAX + 1]; /* init not required, even though some static analyzer may complain */
175 175 U32 rankVal[HUF_TABLELOG_ABSOLUTEMAX + 1]; /* large enough for values from 0 to 16 */
@@ -315,7 +315,7 b' typedef struct {'
315 315 U32 current;
316 316 } rankPos;
317 317
318 static void HUF_sort(nodeElt* huffNode, const U32* count, U32 maxSymbolValue)
318 static void HUF_sort(nodeElt* huffNode, const unsigned* count, U32 maxSymbolValue)
319 319 {
320 320 rankPos rank[32];
321 321 U32 n;
@@ -347,7 +347,7 b' static void HUF_sort(nodeElt* huffNode, '
347 347 */
348 348 #define STARTNODE (HUF_SYMBOLVALUE_MAX+1)
349 349 typedef nodeElt huffNodeTable[HUF_CTABLE_WORKSPACE_SIZE_U32];
350 size_t HUF_buildCTable_wksp (HUF_CElt* tree, const U32* count, U32 maxSymbolValue, U32 maxNbBits, void* workSpace, size_t wkspSize)
350 size_t HUF_buildCTable_wksp (HUF_CElt* tree, const unsigned* count, U32 maxSymbolValue, U32 maxNbBits, void* workSpace, size_t wkspSize)
351 351 {
352 352 nodeElt* const huffNode0 = (nodeElt*)workSpace;
353 353 nodeElt* const huffNode = huffNode0+1;
@@ -421,7 +421,7 b' size_t HUF_buildCTable_wksp (HUF_CElt* t'
421 421 * @return : maxNbBits
422 422 * Note : count is used before tree is written, so they can safely overlap
423 423 */
424 size_t HUF_buildCTable (HUF_CElt* tree, const U32* count, U32 maxSymbolValue, U32 maxNbBits)
424 size_t HUF_buildCTable (HUF_CElt* tree, const unsigned* count, unsigned maxSymbolValue, unsigned maxNbBits)
425 425 {
426 426 huffNodeTable nodeTable;
427 427 return HUF_buildCTable_wksp(tree, count, maxSymbolValue, maxNbBits, nodeTable, sizeof(nodeTable));
@@ -610,13 +610,14 b' size_t HUF_compress4X_usingCTable(void* '
610 610 return HUF_compress4X_usingCTable_internal(dst, dstSize, src, srcSize, CTable, /* bmi2 */ 0);
611 611 }
612 612
613 typedef enum { HUF_singleStream, HUF_fourStreams } HUF_nbStreams_e;
613 614
614 615 static size_t HUF_compressCTable_internal(
615 616 BYTE* const ostart, BYTE* op, BYTE* const oend,
616 617 const void* src, size_t srcSize,
617 unsigned singleStream, const HUF_CElt* CTable, const int bmi2)
618 HUF_nbStreams_e nbStreams, const HUF_CElt* CTable, const int bmi2)
618 619 {
619 size_t const cSize = singleStream ?
620 size_t const cSize = (nbStreams==HUF_singleStream) ?
620 621 HUF_compress1X_usingCTable_internal(op, oend - op, src, srcSize, CTable, bmi2) :
621 622 HUF_compress4X_usingCTable_internal(op, oend - op, src, srcSize, CTable, bmi2);
622 623 if (HUF_isError(cSize)) { return cSize; }
@@ -628,21 +629,21 b' static size_t HUF_compressCTable_interna'
628 629 }
629 630
630 631 typedef struct {
631 U32 count[HUF_SYMBOLVALUE_MAX + 1];
632 unsigned count[HUF_SYMBOLVALUE_MAX + 1];
632 633 HUF_CElt CTable[HUF_SYMBOLVALUE_MAX + 1];
633 634 huffNodeTable nodeTable;
634 635 } HUF_compress_tables_t;
635 636
636 637 /* HUF_compress_internal() :
637 638 * `workSpace` must a table of at least HUF_WORKSPACE_SIZE_U32 unsigned */
638 static size_t HUF_compress_internal (
639 void* dst, size_t dstSize,
640 const void* src, size_t srcSize,
641 unsigned maxSymbolValue, unsigned huffLog,
642 unsigned singleStream,
643 void* workSpace, size_t wkspSize,
644 HUF_CElt* oldHufTable, HUF_repeat* repeat, int preferRepeat,
645 const int bmi2)
639 static size_t
640 HUF_compress_internal (void* dst, size_t dstSize,
641 const void* src, size_t srcSize,
642 unsigned maxSymbolValue, unsigned huffLog,
643 HUF_nbStreams_e nbStreams,
644 void* workSpace, size_t wkspSize,
645 HUF_CElt* oldHufTable, HUF_repeat* repeat, int preferRepeat,
646 const int bmi2)
646 647 {
647 648 HUF_compress_tables_t* const table = (HUF_compress_tables_t*)workSpace;
648 649 BYTE* const ostart = (BYTE*)dst;
@@ -651,7 +652,7 b' static size_t HUF_compress_internal ('
651 652
652 653 /* checks & inits */
653 654 if (((size_t)workSpace & 3) != 0) return ERROR(GENERIC); /* must be aligned on 4-bytes boundaries */
654 if (wkspSize < sizeof(*table)) return ERROR(workSpace_tooSmall);
655 if (wkspSize < HUF_WORKSPACE_SIZE) return ERROR(workSpace_tooSmall);
655 656 if (!srcSize) return 0; /* Uncompressed */
656 657 if (!dstSize) return 0; /* cannot fit anything within dst budget */
657 658 if (srcSize > HUF_BLOCKSIZE_MAX) return ERROR(srcSize_wrong); /* current block size limit */
@@ -664,11 +665,11 b' static size_t HUF_compress_internal ('
664 665 if (preferRepeat && repeat && *repeat == HUF_repeat_valid) {
665 666 return HUF_compressCTable_internal(ostart, op, oend,
666 667 src, srcSize,
667 singleStream, oldHufTable, bmi2);
668 nbStreams, oldHufTable, bmi2);
668 669 }
669 670
670 671 /* Scan input and build symbol stats */
671 { CHECK_V_F(largest, HIST_count_wksp (table->count, &maxSymbolValue, (const BYTE*)src, srcSize, table->count) );
672 { CHECK_V_F(largest, HIST_count_wksp (table->count, &maxSymbolValue, (const BYTE*)src, srcSize, workSpace, wkspSize) );
672 673 if (largest == srcSize) { *ostart = ((const BYTE*)src)[0]; return 1; } /* single symbol, rle */
673 674 if (largest <= (srcSize >> 7)+4) return 0; /* heuristic : probably not compressible enough */
674 675 }
@@ -683,14 +684,15 b' static size_t HUF_compress_internal ('
683 684 if (preferRepeat && repeat && *repeat != HUF_repeat_none) {
684 685 return HUF_compressCTable_internal(ostart, op, oend,
685 686 src, srcSize,
686 singleStream, oldHufTable, bmi2);
687 nbStreams, oldHufTable, bmi2);
687 688 }
688 689
689 690 /* Build Huffman Tree */
690 691 huffLog = HUF_optimalTableLog(huffLog, srcSize, maxSymbolValue);
691 { CHECK_V_F(maxBits, HUF_buildCTable_wksp(table->CTable, table->count,
692 maxSymbolValue, huffLog,
693 table->nodeTable, sizeof(table->nodeTable)) );
692 { size_t const maxBits = HUF_buildCTable_wksp(table->CTable, table->count,
693 maxSymbolValue, huffLog,
694 table->nodeTable, sizeof(table->nodeTable));
695 CHECK_F(maxBits);
694 696 huffLog = (U32)maxBits;
695 697 /* Zero unused symbols in CTable, so we can check it for validity */
696 698 memset(table->CTable + (maxSymbolValue + 1), 0,
@@ -706,7 +708,7 b' static size_t HUF_compress_internal ('
706 708 if (oldSize <= hSize + newSize || hSize + 12 >= srcSize) {
707 709 return HUF_compressCTable_internal(ostart, op, oend,
708 710 src, srcSize,
709 singleStream, oldHufTable, bmi2);
711 nbStreams, oldHufTable, bmi2);
710 712 } }
711 713
712 714 /* Use the new huffman table */
@@ -718,7 +720,7 b' static size_t HUF_compress_internal ('
718 720 }
719 721 return HUF_compressCTable_internal(ostart, op, oend,
720 722 src, srcSize,
721 singleStream, table->CTable, bmi2);
723 nbStreams, table->CTable, bmi2);
722 724 }
723 725
724 726
@@ -728,7 +730,7 b' size_t HUF_compress1X_wksp (void* dst, s'
728 730 void* workSpace, size_t wkspSize)
729 731 {
730 732 return HUF_compress_internal(dst, dstSize, src, srcSize,
731 maxSymbolValue, huffLog, 1 /*single stream*/,
733 maxSymbolValue, huffLog, HUF_singleStream,
732 734 workSpace, wkspSize,
733 735 NULL, NULL, 0, 0 /*bmi2*/);
734 736 }
@@ -740,7 +742,7 b' size_t HUF_compress1X_repeat (void* dst,'
740 742 HUF_CElt* hufTable, HUF_repeat* repeat, int preferRepeat, int bmi2)
741 743 {
742 744 return HUF_compress_internal(dst, dstSize, src, srcSize,
743 maxSymbolValue, huffLog, 1 /*single stream*/,
745 maxSymbolValue, huffLog, HUF_singleStream,
744 746 workSpace, wkspSize, hufTable,
745 747 repeat, preferRepeat, bmi2);
746 748 }
@@ -762,7 +764,7 b' size_t HUF_compress4X_wksp (void* dst, s'
762 764 void* workSpace, size_t wkspSize)
763 765 {
764 766 return HUF_compress_internal(dst, dstSize, src, srcSize,
765 maxSymbolValue, huffLog, 0 /*4 streams*/,
767 maxSymbolValue, huffLog, HUF_fourStreams,
766 768 workSpace, wkspSize,
767 769 NULL, NULL, 0, 0 /*bmi2*/);
768 770 }
@@ -777,7 +779,7 b' size_t HUF_compress4X_repeat (void* dst,'
777 779 HUF_CElt* hufTable, HUF_repeat* repeat, int preferRepeat, int bmi2)
778 780 {
779 781 return HUF_compress_internal(dst, dstSize, src, srcSize,
780 maxSymbolValue, huffLog, 0 /* 4 streams */,
782 maxSymbolValue, huffLog, HUF_fourStreams,
781 783 workSpace, wkspSize,
782 784 hufTable, repeat, preferRepeat, bmi2);
783 785 }
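
Throughout huf_compress.c the 0/1 singleStream flag becomes the two-value HUF_nbStreams_e enum, so call sites read as HUF_singleStream or HUF_fourStreams instead of a bare integer, and histogram counting flows through the size-checked workspace API. For orientation, a minimal sketch of the public 4-stream workspace entry point these internals back; workspace sizing comes from huf.h and the 255/11 limits match the call sites above:

    #include "huf.h"

    static size_t huff_block(void* dst, size_t dstCapacity,
                             const void* src, size_t srcSize)
    {
        unsigned wksp[HUF_WORKSPACE_SIZE / sizeof(unsigned)];
        /* 0 means "not compressible"; callers must handle that case */
        return HUF_compress4X_wksp(dst, dstCapacity, src, srcSize,
                                   255 /* maxSymbolValue */,
                                   11  /* huffLog */,
                                   wksp, sizeof(wksp));
    }
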
@@ -11,6 +11,7 b''
11 11 /*-*************************************
12 12 * Dependencies
13 13 ***************************************/
14 #include <limits.h> /* INT_MAX */
14 15 #include <string.h> /* memset */
15 16 #include "cpu.h"
16 17 #include "mem.h"
@@ -61,7 +62,7 b' static void ZSTD_initCCtx(ZSTD_CCtx* cct'
61 62 memset(cctx, 0, sizeof(*cctx));
62 63 cctx->customMem = memManager;
63 64 cctx->bmi2 = ZSTD_cpuid_bmi2(ZSTD_cpuid());
64 { size_t const err = ZSTD_CCtx_resetParameters(cctx);
65 { size_t const err = ZSTD_CCtx_reset(cctx, ZSTD_reset_parameters);
65 66 assert(!ZSTD_isError(err));
66 67 (void)err;
67 68 }
@@ -128,7 +129,7 b' static size_t ZSTD_sizeof_mtctx(const ZS'
128 129 #ifdef ZSTD_MULTITHREAD
129 130 return ZSTDMT_sizeof_CCtx(cctx->mtctx);
130 131 #else
131 (void) cctx;
132 (void)cctx;
132 133 return 0;
133 134 #endif
134 135 }
@@ -226,9 +227,160 b' static ZSTD_CCtx_params ZSTD_assignParam'
226 227 return ret;
227 228 }
228 229
229 #define CLAMPCHECK(val,min,max) { \
230 if (((val)<(min)) | ((val)>(max))) { \
231 return ERROR(parameter_outOfBound); \
230 ZSTD_bounds ZSTD_cParam_getBounds(ZSTD_cParameter param)
231 {
232 ZSTD_bounds bounds = { 0, 0, 0 };
233
234 switch(param)
235 {
236 case ZSTD_c_compressionLevel:
237 bounds.lowerBound = ZSTD_minCLevel();
238 bounds.upperBound = ZSTD_maxCLevel();
239 return bounds;
240
241 case ZSTD_c_windowLog:
242 bounds.lowerBound = ZSTD_WINDOWLOG_MIN;
243 bounds.upperBound = ZSTD_WINDOWLOG_MAX;
244 return bounds;
245
246 case ZSTD_c_hashLog:
247 bounds.lowerBound = ZSTD_HASHLOG_MIN;
248 bounds.upperBound = ZSTD_HASHLOG_MAX;
249 return bounds;
250
251 case ZSTD_c_chainLog:
252 bounds.lowerBound = ZSTD_CHAINLOG_MIN;
253 bounds.upperBound = ZSTD_CHAINLOG_MAX;
254 return bounds;
255
256 case ZSTD_c_searchLog:
257 bounds.lowerBound = ZSTD_SEARCHLOG_MIN;
258 bounds.upperBound = ZSTD_SEARCHLOG_MAX;
259 return bounds;
260
261 case ZSTD_c_minMatch:
262 bounds.lowerBound = ZSTD_MINMATCH_MIN;
263 bounds.upperBound = ZSTD_MINMATCH_MAX;
264 return bounds;
265
266 case ZSTD_c_targetLength:
267 bounds.lowerBound = ZSTD_TARGETLENGTH_MIN;
268 bounds.upperBound = ZSTD_TARGETLENGTH_MAX;
269 return bounds;
270
271 case ZSTD_c_strategy:
272 bounds.lowerBound = ZSTD_STRATEGY_MIN;
273 bounds.upperBound = ZSTD_STRATEGY_MAX;
274 return bounds;
275
276 case ZSTD_c_contentSizeFlag:
277 bounds.lowerBound = 0;
278 bounds.upperBound = 1;
279 return bounds;
280
281 case ZSTD_c_checksumFlag:
282 bounds.lowerBound = 0;
283 bounds.upperBound = 1;
284 return bounds;
285
286 case ZSTD_c_dictIDFlag:
287 bounds.lowerBound = 0;
288 bounds.upperBound = 1;
289 return bounds;
290
291 case ZSTD_c_nbWorkers:
292 bounds.lowerBound = 0;
293 #ifdef ZSTD_MULTITHREAD
294 bounds.upperBound = ZSTDMT_NBWORKERS_MAX;
295 #else
296 bounds.upperBound = 0;
297 #endif
298 return bounds;
299
300 case ZSTD_c_jobSize:
301 bounds.lowerBound = 0;
302 #ifdef ZSTD_MULTITHREAD
303 bounds.upperBound = ZSTDMT_JOBSIZE_MAX;
304 #else
305 bounds.upperBound = 0;
306 #endif
307 return bounds;
308
309 case ZSTD_c_overlapLog:
310 bounds.lowerBound = ZSTD_OVERLAPLOG_MIN;
311 bounds.upperBound = ZSTD_OVERLAPLOG_MAX;
312 return bounds;
313
314 case ZSTD_c_enableLongDistanceMatching:
315 bounds.lowerBound = 0;
316 bounds.upperBound = 1;
317 return bounds;
318
319 case ZSTD_c_ldmHashLog:
320 bounds.lowerBound = ZSTD_LDM_HASHLOG_MIN;
321 bounds.upperBound = ZSTD_LDM_HASHLOG_MAX;
322 return bounds;
323
324 case ZSTD_c_ldmMinMatch:
325 bounds.lowerBound = ZSTD_LDM_MINMATCH_MIN;
326 bounds.upperBound = ZSTD_LDM_MINMATCH_MAX;
327 return bounds;
328
329 case ZSTD_c_ldmBucketSizeLog:
330 bounds.lowerBound = ZSTD_LDM_BUCKETSIZELOG_MIN;
331 bounds.upperBound = ZSTD_LDM_BUCKETSIZELOG_MAX;
332 return bounds;
333
334 case ZSTD_c_ldmHashRateLog:
335 bounds.lowerBound = ZSTD_LDM_HASHRATELOG_MIN;
336 bounds.upperBound = ZSTD_LDM_HASHRATELOG_MAX;
337 return bounds;
338
339 /* experimental parameters */
340 case ZSTD_c_rsyncable:
341 bounds.lowerBound = 0;
342 bounds.upperBound = 1;
343 return bounds;
344
345 case ZSTD_c_forceMaxWindow :
346 bounds.lowerBound = 0;
347 bounds.upperBound = 1;
348 return bounds;
349
350 case ZSTD_c_format:
351 ZSTD_STATIC_ASSERT(ZSTD_f_zstd1 < ZSTD_f_zstd1_magicless);
352 bounds.lowerBound = ZSTD_f_zstd1;
353 bounds.upperBound = ZSTD_f_zstd1_magicless; /* note : how to ensure at compile time that this is the highest value enum ? */
354 return bounds;
355
356 case ZSTD_c_forceAttachDict:
357 ZSTD_STATIC_ASSERT(ZSTD_dictDefaultAttach < ZSTD_dictForceCopy);
358 bounds.lowerBound = ZSTD_dictDefaultAttach;
359 bounds.upperBound = ZSTD_dictForceCopy; /* note : how to ensure at compile time that this is the highest value enum ? */
360 return bounds;
361
362 default:
363 { ZSTD_bounds const boundError = { ERROR(parameter_unsupported), 0, 0 };
364 return boundError;
365 }
366 }
367 }
368
369 /* ZSTD_cParam_withinBounds:
370 * @return 1 if value is within cParam bounds,
371 * 0 otherwise */
372 static int ZSTD_cParam_withinBounds(ZSTD_cParameter cParam, int value)
373 {
374 ZSTD_bounds const bounds = ZSTD_cParam_getBounds(cParam);
375 if (ZSTD_isError(bounds.error)) return 0;
376 if (value < bounds.lowerBound) return 0;
377 if (value > bounds.upperBound) return 0;
378 return 1;
379 }
380
381 #define BOUNDCHECK(cParam, val) { \
382 if (!ZSTD_cParam_withinBounds(cParam,val)) { \
383 return ERROR(parameter_outOfBound); \
232 384 } }
233 385
234 386
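
ZSTD_cParam_getBounds() replaces the scattered CLAMPCHECK min/max constants with a runtime-queryable range per parameter, and BOUNDCHECK is rebuilt on top of it. A hedged sketch of client-side use; in this vendored version the bounds API lives in the advanced/experimental section of zstd.h:

    #define ZSTD_STATIC_LINKING_ONLY   /* bounds API is experimental here */
    #include "zstd.h"

    /* clamp a requested value into the parameter's advertised range */
    static int clamp_cparam(ZSTD_cParameter p, int value)
    {
        ZSTD_bounds const b = ZSTD_cParam_getBounds(p);
        if (ZSTD_isError(b.error)) return value;  /* unknown parameter */
        if (value < b.lowerBound)  return b.lowerBound;
        if (value > b.upperBound)  return b.upperBound;
        return value;
    }
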
@@ -236,38 +388,39 b' static int ZSTD_isUpdateAuthorized(ZSTD_'
236 388 {
237 389 switch(param)
238 390 {
239 case ZSTD_p_compressionLevel:
240 case ZSTD_p_hashLog:
241 case ZSTD_p_chainLog:
242 case ZSTD_p_searchLog:
243 case ZSTD_p_minMatch:
244 case ZSTD_p_targetLength:
245 case ZSTD_p_compressionStrategy:
391 case ZSTD_c_compressionLevel:
392 case ZSTD_c_hashLog:
393 case ZSTD_c_chainLog:
394 case ZSTD_c_searchLog:
395 case ZSTD_c_minMatch:
396 case ZSTD_c_targetLength:
397 case ZSTD_c_strategy:
246 398 return 1;
247 399
248 case ZSTD_p_format:
249 case ZSTD_p_windowLog:
250 case ZSTD_p_contentSizeFlag:
251 case ZSTD_p_checksumFlag:
252 case ZSTD_p_dictIDFlag:
253 case ZSTD_p_forceMaxWindow :
254 case ZSTD_p_nbWorkers:
255 case ZSTD_p_jobSize:
256 case ZSTD_p_overlapSizeLog:
257 case ZSTD_p_enableLongDistanceMatching:
258 case ZSTD_p_ldmHashLog:
259 case ZSTD_p_ldmMinMatch:
260 case ZSTD_p_ldmBucketSizeLog:
261 case ZSTD_p_ldmHashEveryLog:
262 case ZSTD_p_forceAttachDict:
400 case ZSTD_c_format:
401 case ZSTD_c_windowLog:
402 case ZSTD_c_contentSizeFlag:
403 case ZSTD_c_checksumFlag:
404 case ZSTD_c_dictIDFlag:
405 case ZSTD_c_forceMaxWindow :
406 case ZSTD_c_nbWorkers:
407 case ZSTD_c_jobSize:
408 case ZSTD_c_overlapLog:
409 case ZSTD_c_rsyncable:
410 case ZSTD_c_enableLongDistanceMatching:
411 case ZSTD_c_ldmHashLog:
412 case ZSTD_c_ldmMinMatch:
413 case ZSTD_c_ldmBucketSizeLog:
414 case ZSTD_c_ldmHashRateLog:
415 case ZSTD_c_forceAttachDict:
263 416 default:
264 417 return 0;
265 418 }
266 419 }
267 420
268 size_t ZSTD_CCtx_setParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, unsigned value)
421 size_t ZSTD_CCtx_setParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, int value)
269 422 {
270 DEBUGLOG(4, "ZSTD_CCtx_setParameter (%u, %u)", (U32)param, value);
423 DEBUGLOG(4, "ZSTD_CCtx_setParameter (%i, %i)", (int)param, value);
271 424 if (cctx->streamStage != zcss_init) {
272 425 if (ZSTD_isUpdateAuthorized(param)) {
273 426 cctx->cParamsChanged = 1;
@@ -277,51 +430,52 b' size_t ZSTD_CCtx_setParameter(ZSTD_CCtx*'
277 430
278 431 switch(param)
279 432 {
280 case ZSTD_p_format :
433 case ZSTD_c_format :
281 434 return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value);
282 435
283 case ZSTD_p_compressionLevel:
436 case ZSTD_c_compressionLevel:
284 437 if (cctx->cdict) return ERROR(stage_wrong);
285 438 return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value);
286 439
287 case ZSTD_p_windowLog:
288 case ZSTD_p_hashLog:
289 case ZSTD_p_chainLog:
290 case ZSTD_p_searchLog:
291 case ZSTD_p_minMatch:
292 case ZSTD_p_targetLength:
293 case ZSTD_p_compressionStrategy:
440 case ZSTD_c_windowLog:
441 case ZSTD_c_hashLog:
442 case ZSTD_c_chainLog:
443 case ZSTD_c_searchLog:
444 case ZSTD_c_minMatch:
445 case ZSTD_c_targetLength:
446 case ZSTD_c_strategy:
294 447 if (cctx->cdict) return ERROR(stage_wrong);
295 448 return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value);
296 449
297 case ZSTD_p_contentSizeFlag:
298 case ZSTD_p_checksumFlag:
299 case ZSTD_p_dictIDFlag:
450 case ZSTD_c_contentSizeFlag:
451 case ZSTD_c_checksumFlag:
452 case ZSTD_c_dictIDFlag:
300 453 return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value);
301 454
302 case ZSTD_p_forceMaxWindow : /* Force back-references to remain < windowSize,
455 case ZSTD_c_forceMaxWindow : /* Force back-references to remain < windowSize,
303 456 * even when referencing into Dictionary content.
304 457 * default : 0 when using a CDict, 1 when using a Prefix */
305 458 return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value);
306 459
307 case ZSTD_p_forceAttachDict:
460 case ZSTD_c_forceAttachDict:
308 461 return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value);
309 462
310 case ZSTD_p_nbWorkers:
311 if ((value>0) && cctx->staticSize) {
463 case ZSTD_c_nbWorkers:
464 if ((value!=0) && cctx->staticSize) {
312 465 return ERROR(parameter_unsupported); /* MT not compatible with static alloc */
313 466 }
314 467 return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value);
315 468
316 case ZSTD_p_jobSize:
317 case ZSTD_p_overlapSizeLog:
469 case ZSTD_c_jobSize:
470 case ZSTD_c_overlapLog:
471 case ZSTD_c_rsyncable:
318 472 return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value);
319 473
320 case ZSTD_p_enableLongDistanceMatching:
321 case ZSTD_p_ldmHashLog:
322 case ZSTD_p_ldmMinMatch:
323 case ZSTD_p_ldmBucketSizeLog:
324 case ZSTD_p_ldmHashEveryLog:
474 case ZSTD_c_enableLongDistanceMatching:
475 case ZSTD_c_ldmHashLog:
476 case ZSTD_c_ldmMinMatch:
477 case ZSTD_c_ldmBucketSizeLog:
478 case ZSTD_c_ldmHashRateLog:
325 479 if (cctx->cdict) return ERROR(stage_wrong);
326 480 return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value);
327 481
@@ -329,21 +483,21 b' size_t ZSTD_CCtx_setParameter(ZSTD_CCtx*'
329 483 }
330 484 }
331 485
332 size_t ZSTD_CCtxParam_setParameter(
333 ZSTD_CCtx_params* CCtxParams, ZSTD_cParameter param, unsigned value)
486 size_t ZSTD_CCtxParam_setParameter(ZSTD_CCtx_params* CCtxParams,
487 ZSTD_cParameter param, int value)
334 488 {
335 DEBUGLOG(4, "ZSTD_CCtxParam_setParameter (%u, %u)", (U32)param, value);
489 DEBUGLOG(4, "ZSTD_CCtxParam_setParameter (%i, %i)", (int)param, value);
336 490 switch(param)
337 491 {
338 case ZSTD_p_format :
339 if (value > (unsigned)ZSTD_f_zstd1_magicless)
340 return ERROR(parameter_unsupported);
492 case ZSTD_c_format :
493 BOUNDCHECK(ZSTD_c_format, value);
341 494 CCtxParams->format = (ZSTD_format_e)value;
342 495 return (size_t)CCtxParams->format;
343 496
344 case ZSTD_p_compressionLevel : {
345 int cLevel = (int)value; /* cast expected to restore negative sign */
497 case ZSTD_c_compressionLevel : {
498 int cLevel = value;
346 499 if (cLevel > ZSTD_maxCLevel()) cLevel = ZSTD_maxCLevel();
500 if (cLevel < ZSTD_minCLevel()) cLevel = ZSTD_minCLevel();
347 501 if (cLevel) { /* 0 : does not change current level */
348 502 CCtxParams->compressionLevel = cLevel;
349 503 }
@@ -351,213 +505,229 b' size_t ZSTD_CCtxParam_setParameter('
351 505 return 0; /* return type (size_t) cannot represent negative values */
352 506 }
353 507
354 case ZSTD_p_windowLog :
355 if (value>0) /* 0 => use default */
356 CLAMPCHECK(value, ZSTD_WINDOWLOG_MIN, ZSTD_WINDOWLOG_MAX);
508 case ZSTD_c_windowLog :
509 if (value!=0) /* 0 => use default */
510 BOUNDCHECK(ZSTD_c_windowLog, value);
357 511 CCtxParams->cParams.windowLog = value;
358 512 return CCtxParams->cParams.windowLog;
359 513
360 case ZSTD_p_hashLog :
361 if (value>0) /* 0 => use default */
362 CLAMPCHECK(value, ZSTD_HASHLOG_MIN, ZSTD_HASHLOG_MAX);
514 case ZSTD_c_hashLog :
515 if (value!=0) /* 0 => use default */
516 BOUNDCHECK(ZSTD_c_hashLog, value);
363 517 CCtxParams->cParams.hashLog = value;
364 518 return CCtxParams->cParams.hashLog;
365 519
366 case ZSTD_p_chainLog :
367 if (value>0) /* 0 => use default */
368 CLAMPCHECK(value, ZSTD_CHAINLOG_MIN, ZSTD_CHAINLOG_MAX);
520 case ZSTD_c_chainLog :
521 if (value!=0) /* 0 => use default */
522 BOUNDCHECK(ZSTD_c_chainLog, value);
369 523 CCtxParams->cParams.chainLog = value;
370 524 return CCtxParams->cParams.chainLog;
371 525
372 case ZSTD_p_searchLog :
373 if (value>0) /* 0 => use default */
374 CLAMPCHECK(value, ZSTD_SEARCHLOG_MIN, ZSTD_SEARCHLOG_MAX);
526 case ZSTD_c_searchLog :
527 if (value!=0) /* 0 => use default */
528 BOUNDCHECK(ZSTD_c_searchLog, value);
375 529 CCtxParams->cParams.searchLog = value;
376 530 return value;
377 531
378 case ZSTD_p_minMatch :
379 if (value>0) /* 0 => use default */
380 CLAMPCHECK(value, ZSTD_SEARCHLENGTH_MIN, ZSTD_SEARCHLENGTH_MAX);
381 CCtxParams->cParams.searchLength = value;
382 return CCtxParams->cParams.searchLength;
383
384 case ZSTD_p_targetLength :
385 /* all values are valid. 0 => use default */
532 case ZSTD_c_minMatch :
533 if (value!=0) /* 0 => use default */
534 BOUNDCHECK(ZSTD_c_minMatch, value);
535 CCtxParams->cParams.minMatch = value;
536 return CCtxParams->cParams.minMatch;
537
538 case ZSTD_c_targetLength :
539 BOUNDCHECK(ZSTD_c_targetLength, value);
386 540 CCtxParams->cParams.targetLength = value;
387 541 return CCtxParams->cParams.targetLength;
388 542
389 case ZSTD_p_compressionStrategy :
390 if (value>0) /* 0 => use default */
391 CLAMPCHECK(value, (unsigned)ZSTD_fast, (unsigned)ZSTD_btultra);
543 case ZSTD_c_strategy :
544 if (value!=0) /* 0 => use default */
545 BOUNDCHECK(ZSTD_c_strategy, value);
392 546 CCtxParams->cParams.strategy = (ZSTD_strategy)value;
393 547 return (size_t)CCtxParams->cParams.strategy;
394 548
395 case ZSTD_p_contentSizeFlag :
549 case ZSTD_c_contentSizeFlag :
396 550 /* Content size written in frame header _when known_ (default:1) */
397 DEBUGLOG(4, "set content size flag = %u", (value>0));
398 CCtxParams->fParams.contentSizeFlag = value > 0;
551 DEBUGLOG(4, "set content size flag = %u", (value!=0));
552 CCtxParams->fParams.contentSizeFlag = value != 0;
399 553 return CCtxParams->fParams.contentSizeFlag;
400 554
401 case ZSTD_p_checksumFlag :
555 case ZSTD_c_checksumFlag :
402 556 /* A 32-bits content checksum will be calculated and written at end of frame (default:0) */
403 CCtxParams->fParams.checksumFlag = value > 0;
557 CCtxParams->fParams.checksumFlag = value != 0;
404 558 return CCtxParams->fParams.checksumFlag;
405 559
406 case ZSTD_p_dictIDFlag : /* When applicable, dictionary's dictID is provided in frame header (default:1) */
407 DEBUGLOG(4, "set dictIDFlag = %u", (value>0));
560 case ZSTD_c_dictIDFlag : /* When applicable, dictionary's dictID is provided in frame header (default:1) */
561 DEBUGLOG(4, "set dictIDFlag = %u", (value!=0));
408 562 CCtxParams->fParams.noDictIDFlag = !value;
409 563 return !CCtxParams->fParams.noDictIDFlag;
410 564
411 case ZSTD_p_forceMaxWindow :
412 CCtxParams->forceWindow = (value > 0);
565 case ZSTD_c_forceMaxWindow :
566 CCtxParams->forceWindow = (value != 0);
413 567 return CCtxParams->forceWindow;
414 568
415 case ZSTD_p_forceAttachDict :
416 CCtxParams->attachDictPref = value ?
417 (value > 0 ? ZSTD_dictForceAttach : ZSTD_dictForceCopy) :
418 ZSTD_dictDefaultAttach;
569 case ZSTD_c_forceAttachDict : {
570 const ZSTD_dictAttachPref_e pref = (ZSTD_dictAttachPref_e)value;
571 BOUNDCHECK(ZSTD_c_forceAttachDict, pref);
572 CCtxParams->attachDictPref = pref;
419 573 return CCtxParams->attachDictPref;
420
421 case ZSTD_p_nbWorkers :
574 }
575
576 case ZSTD_c_nbWorkers :
422 577 #ifndef ZSTD_MULTITHREAD
423 if (value>0) return ERROR(parameter_unsupported);
578 if (value!=0) return ERROR(parameter_unsupported);
424 579 return 0;
425 580 #else
426 581 return ZSTDMT_CCtxParam_setNbWorkers(CCtxParams, value);
427 582 #endif
428 583
429 case ZSTD_p_jobSize :
584 case ZSTD_c_jobSize :
430 585 #ifndef ZSTD_MULTITHREAD
431 586 return ERROR(parameter_unsupported);
432 587 #else
433 588 return ZSTDMT_CCtxParam_setMTCtxParameter(CCtxParams, ZSTDMT_p_jobSize, value);
434 589 #endif
435 590
436 case ZSTD_p_overlapSizeLog :
591 case ZSTD_c_overlapLog :
592 #ifndef ZSTD_MULTITHREAD
593 return ERROR(parameter_unsupported);
594 #else
595 return ZSTDMT_CCtxParam_setMTCtxParameter(CCtxParams, ZSTDMT_p_overlapLog, value);
596 #endif
597
598 case ZSTD_c_rsyncable :
437 599 #ifndef ZSTD_MULTITHREAD
438 600 return ERROR(parameter_unsupported);
439 601 #else
440 return ZSTDMT_CCtxParam_setMTCtxParameter(CCtxParams, ZSTDMT_p_overlapSectionLog, value);
602 return ZSTDMT_CCtxParam_setMTCtxParameter(CCtxParams, ZSTDMT_p_rsyncable, value);
441 603 #endif
442 604
443 case ZSTD_p_enableLongDistanceMatching :
444 CCtxParams->ldmParams.enableLdm = (value>0);
605 case ZSTD_c_enableLongDistanceMatching :
606 CCtxParams->ldmParams.enableLdm = (value!=0);
445 607 return CCtxParams->ldmParams.enableLdm;
446 608
447 case ZSTD_p_ldmHashLog :
448 if (value>0) /* 0 ==> auto */
449 CLAMPCHECK(value, ZSTD_HASHLOG_MIN, ZSTD_HASHLOG_MAX);
609 case ZSTD_c_ldmHashLog :
610 if (value!=0) /* 0 ==> auto */
611 BOUNDCHECK(ZSTD_c_ldmHashLog, value);
450 612 CCtxParams->ldmParams.hashLog = value;
451 613 return CCtxParams->ldmParams.hashLog;
452 614
453 case ZSTD_p_ldmMinMatch :
454 if (value>0) /* 0 ==> default */
455 CLAMPCHECK(value, ZSTD_LDM_MINMATCH_MIN, ZSTD_LDM_MINMATCH_MAX);
615 case ZSTD_c_ldmMinMatch :
616 if (value!=0) /* 0 ==> default */
617 BOUNDCHECK(ZSTD_c_ldmMinMatch, value);
456 618 CCtxParams->ldmParams.minMatchLength = value;
457 619 return CCtxParams->ldmParams.minMatchLength;
458 620
459 case ZSTD_p_ldmBucketSizeLog :
460 if (value > ZSTD_LDM_BUCKETSIZELOG_MAX)
461 return ERROR(parameter_outOfBound);
621 case ZSTD_c_ldmBucketSizeLog :
622 if (value!=0) /* 0 ==> default */
623 BOUNDCHECK(ZSTD_c_ldmBucketSizeLog, value);
462 624 CCtxParams->ldmParams.bucketSizeLog = value;
463 625 return CCtxParams->ldmParams.bucketSizeLog;
464 626
465 case ZSTD_p_ldmHashEveryLog :
627 case ZSTD_c_ldmHashRateLog :
466 628 if (value > ZSTD_WINDOWLOG_MAX - ZSTD_HASHLOG_MIN)
467 629 return ERROR(parameter_outOfBound);
468 CCtxParams->ldmParams.hashEveryLog = value;
469 return CCtxParams->ldmParams.hashEveryLog;
630 CCtxParams->ldmParams.hashRateLog = value;
631 return CCtxParams->ldmParams.hashRateLog;
470 632
471 633 default: return ERROR(parameter_unsupported);
472 634 }
473 635 }
474 636
475 size_t ZSTD_CCtx_getParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, unsigned* value)
637 size_t ZSTD_CCtx_getParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, int* value)
476 638 {
477 639 return ZSTD_CCtxParam_getParameter(&cctx->requestedParams, param, value);
478 640 }
479 641
480 642 size_t ZSTD_CCtxParam_getParameter(
481 ZSTD_CCtx_params* CCtxParams, ZSTD_cParameter param, unsigned* value)
643 ZSTD_CCtx_params* CCtxParams, ZSTD_cParameter param, int* value)
482 644 {
483 645 switch(param)
484 646 {
485 case ZSTD_p_format :
647 case ZSTD_c_format :
486 648 *value = CCtxParams->format;
487 649 break;
488 case ZSTD_p_compressionLevel :
650 case ZSTD_c_compressionLevel :
489 651 *value = CCtxParams->compressionLevel;
490 652 break;
491 case ZSTD_p_windowLog :
653 case ZSTD_c_windowLog :
492 654 *value = CCtxParams->cParams.windowLog;
493 655 break;
494 case ZSTD_p_hashLog :
656 case ZSTD_c_hashLog :
495 657 *value = CCtxParams->cParams.hashLog;
496 658 break;
497 case ZSTD_p_chainLog :
659 case ZSTD_c_chainLog :
498 660 *value = CCtxParams->cParams.chainLog;
499 661 break;
500 case ZSTD_p_searchLog :
662 case ZSTD_c_searchLog :
501 663 *value = CCtxParams->cParams.searchLog;
502 664 break;
503 case ZSTD_p_minMatch :
504 *value = CCtxParams->cParams.searchLength;
665 case ZSTD_c_minMatch :
666 *value = CCtxParams->cParams.minMatch;
505 667 break;
506 case ZSTD_p_targetLength :
668 case ZSTD_c_targetLength :
507 669 *value = CCtxParams->cParams.targetLength;
508 670 break;
509 case ZSTD_p_compressionStrategy :
671 case ZSTD_c_strategy :
510 672 *value = (unsigned)CCtxParams->cParams.strategy;
511 673 break;
512 case ZSTD_p_contentSizeFlag :
674 case ZSTD_c_contentSizeFlag :
513 675 *value = CCtxParams->fParams.contentSizeFlag;
514 676 break;
515 case ZSTD_p_checksumFlag :
677 case ZSTD_c_checksumFlag :
516 678 *value = CCtxParams->fParams.checksumFlag;
517 679 break;
518 case ZSTD_p_dictIDFlag :
680 case ZSTD_c_dictIDFlag :
519 681 *value = !CCtxParams->fParams.noDictIDFlag;
520 682 break;
521 case ZSTD_p_forceMaxWindow :
683 case ZSTD_c_forceMaxWindow :
522 684 *value = CCtxParams->forceWindow;
523 685 break;
524 case ZSTD_p_forceAttachDict :
686 case ZSTD_c_forceAttachDict :
525 687 *value = CCtxParams->attachDictPref;
526 688 break;
527 case ZSTD_p_nbWorkers :
689 case ZSTD_c_nbWorkers :
528 690 #ifndef ZSTD_MULTITHREAD
529 691 assert(CCtxParams->nbWorkers == 0);
530 692 #endif
531 693 *value = CCtxParams->nbWorkers;
532 694 break;
533 case ZSTD_p_jobSize :
695 case ZSTD_c_jobSize :
534 696 #ifndef ZSTD_MULTITHREAD
535 697 return ERROR(parameter_unsupported);
536 698 #else
537 *value = CCtxParams->jobSize;
699 assert(CCtxParams->jobSize <= INT_MAX);
700 *value = (int)CCtxParams->jobSize;
538 701 break;
539 702 #endif
540 case ZSTD_p_overlapSizeLog :
703 case ZSTD_c_overlapLog :
541 704 #ifndef ZSTD_MULTITHREAD
542 705 return ERROR(parameter_unsupported);
543 706 #else
544 *value = CCtxParams->overlapSizeLog;
707 *value = CCtxParams->overlapLog;
545 708 break;
546 709 #endif
547 case ZSTD_p_enableLongDistanceMatching :
710 case ZSTD_c_rsyncable :
711 #ifndef ZSTD_MULTITHREAD
712 return ERROR(parameter_unsupported);
713 #else
714 *value = CCtxParams->rsyncable;
715 break;
716 #endif
717 case ZSTD_c_enableLongDistanceMatching :
548 718 *value = CCtxParams->ldmParams.enableLdm;
549 719 break;
550 case ZSTD_p_ldmHashLog :
720 case ZSTD_c_ldmHashLog :
551 721 *value = CCtxParams->ldmParams.hashLog;
552 722 break;
553 case ZSTD_p_ldmMinMatch :
723 case ZSTD_c_ldmMinMatch :
554 724 *value = CCtxParams->ldmParams.minMatchLength;
555 725 break;
556 case ZSTD_p_ldmBucketSizeLog :
726 case ZSTD_c_ldmBucketSizeLog :
557 727 *value = CCtxParams->ldmParams.bucketSizeLog;
558 728 break;
559 case ZSTD_p_ldmHashEveryLog :
560 *value = CCtxParams->ldmParams.hashEveryLog;
729 case ZSTD_c_ldmHashRateLog :
730 *value = CCtxParams->ldmParams.hashRateLog;
561 731 break;
562 732 default: return ERROR(parameter_unsupported);
563 733 }
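
The hunks above complete the parameter-API rename: ZSTD_p_* becomes ZSTD_c_*, values move from unsigned to int (so negative compression levels need no casting), overlapSizeLog becomes overlapLog, ldmHashEveryLog becomes ldmHashRateLog, and rsyncable appears as a multithread-only knob. A hedged sketch of the renamed setter/getter pair:

    #define ZSTD_STATIC_LINKING_ONLY
    #include "zstd.h"

    static void configure(ZSTD_CCtx* cctx)
    {
        /* int-typed values: negative (fast) levels fit directly */
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, -5);
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_checksumFlag, 1);

        {   int level = 0;
            ZSTD_CCtx_getParameter(cctx, ZSTD_c_compressionLevel, &level);
            (void)level;   /* now -5, per the setter logic above */
        }
    }
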
@@ -655,34 +825,35 b' size_t ZSTD_CCtx_refPrefix_advanced('
655 825
656 826 /*! ZSTD_CCtx_reset() :
657 827 * Also dumps dictionary */
658 void ZSTD_CCtx_reset(ZSTD_CCtx* cctx)
828 size_t ZSTD_CCtx_reset(ZSTD_CCtx* cctx, ZSTD_ResetDirective reset)
659 829 {
660 cctx->streamStage = zcss_init;
661 cctx->pledgedSrcSizePlusOne = 0;
830 if ( (reset == ZSTD_reset_session_only)
831 || (reset == ZSTD_reset_session_and_parameters) ) {
832 cctx->streamStage = zcss_init;
833 cctx->pledgedSrcSizePlusOne = 0;
834 }
835 if ( (reset == ZSTD_reset_parameters)
836 || (reset == ZSTD_reset_session_and_parameters) ) {
837 if (cctx->streamStage != zcss_init) return ERROR(stage_wrong);
838 cctx->cdict = NULL;
839 return ZSTD_CCtxParams_reset(&cctx->requestedParams);
840 }
841 return 0;
662 842 }
663 843
664 size_t ZSTD_CCtx_resetParameters(ZSTD_CCtx* cctx)
665 {
666 if (cctx->streamStage != zcss_init) return ERROR(stage_wrong);
667 cctx->cdict = NULL;
668 return ZSTD_CCtxParams_reset(&cctx->requestedParams);
669 }
670 844
671 845 /** ZSTD_checkCParams() :
672 846 control CParam values remain within authorized range.
673 847 @return : 0, or an error code if one value is beyond authorized range */
674 848 size_t ZSTD_checkCParams(ZSTD_compressionParameters cParams)
675 849 {
676 CLAMPCHECK(cParams.windowLog, ZSTD_WINDOWLOG_MIN, ZSTD_WINDOWLOG_MAX);
677 CLAMPCHECK(cParams.chainLog, ZSTD_CHAINLOG_MIN, ZSTD_CHAINLOG_MAX);
678 CLAMPCHECK(cParams.hashLog, ZSTD_HASHLOG_MIN, ZSTD_HASHLOG_MAX);
679 CLAMPCHECK(cParams.searchLog, ZSTD_SEARCHLOG_MIN, ZSTD_SEARCHLOG_MAX);
680 CLAMPCHECK(cParams.searchLength, ZSTD_SEARCHLENGTH_MIN, ZSTD_SEARCHLENGTH_MAX);
681 ZSTD_STATIC_ASSERT(ZSTD_TARGETLENGTH_MIN == 0);
682 if (cParams.targetLength > ZSTD_TARGETLENGTH_MAX)
683 return ERROR(parameter_outOfBound);
684 if ((U32)(cParams.strategy) > (U32)ZSTD_btultra)
685 return ERROR(parameter_unsupported);
850 BOUNDCHECK(ZSTD_c_windowLog, cParams.windowLog);
851 BOUNDCHECK(ZSTD_c_chainLog, cParams.chainLog);
852 BOUNDCHECK(ZSTD_c_hashLog, cParams.hashLog);
853 BOUNDCHECK(ZSTD_c_searchLog, cParams.searchLog);
854 BOUNDCHECK(ZSTD_c_minMatch, cParams.minMatch);
855 BOUNDCHECK(ZSTD_c_targetLength,cParams.targetLength);
856 BOUNDCHECK(ZSTD_c_strategy, cParams.strategy);
686 857 return 0;
687 858 }
688 859
@@ -692,19 +863,19 b' size_t ZSTD_checkCParams(ZSTD_compressio'
692 863 static ZSTD_compressionParameters
693 864 ZSTD_clampCParams(ZSTD_compressionParameters cParams)
694 865 {
695 # define CLAMP(val,min,max) { \
696 if (val<min) val=min; \
697 else if (val>max) val=max; \
866 # define CLAMP_TYPE(cParam, val, type) { \
867 ZSTD_bounds const bounds = ZSTD_cParam_getBounds(cParam); \
868 if ((int)val<bounds.lowerBound) val=(type)bounds.lowerBound; \
869 else if ((int)val>bounds.upperBound) val=(type)bounds.upperBound; \
698 870 }
699 CLAMP(cParams.windowLog, ZSTD_WINDOWLOG_MIN, ZSTD_WINDOWLOG_MAX);
700 CLAMP(cParams.chainLog, ZSTD_CHAINLOG_MIN, ZSTD_CHAINLOG_MAX);
701 CLAMP(cParams.hashLog, ZSTD_HASHLOG_MIN, ZSTD_HASHLOG_MAX);
702 CLAMP(cParams.searchLog, ZSTD_SEARCHLOG_MIN, ZSTD_SEARCHLOG_MAX);
703 CLAMP(cParams.searchLength, ZSTD_SEARCHLENGTH_MIN, ZSTD_SEARCHLENGTH_MAX);
704 ZSTD_STATIC_ASSERT(ZSTD_TARGETLENGTH_MIN == 0);
705 if (cParams.targetLength > ZSTD_TARGETLENGTH_MAX)
706 cParams.targetLength = ZSTD_TARGETLENGTH_MAX;
707 CLAMP(cParams.strategy, ZSTD_fast, ZSTD_btultra);
871 # define CLAMP(cParam, val) CLAMP_TYPE(cParam, val, int)
872 CLAMP(ZSTD_c_windowLog, cParams.windowLog);
873 CLAMP(ZSTD_c_chainLog, cParams.chainLog);
874 CLAMP(ZSTD_c_hashLog, cParams.hashLog);
875 CLAMP(ZSTD_c_searchLog, cParams.searchLog);
876 CLAMP(ZSTD_c_minMatch, cParams.minMatch);
877 CLAMP(ZSTD_c_targetLength,cParams.targetLength);
878 CLAMP_TYPE(ZSTD_c_strategy,cParams.strategy, ZSTD_strategy);
708 879 return cParams;
709 880 }
710 881
@@ -774,7 +945,7 b' ZSTD_compressionParameters ZSTD_getCPara'
774 945 if (CCtxParams->cParams.hashLog) cParams.hashLog = CCtxParams->cParams.hashLog;
775 946 if (CCtxParams->cParams.chainLog) cParams.chainLog = CCtxParams->cParams.chainLog;
776 947 if (CCtxParams->cParams.searchLog) cParams.searchLog = CCtxParams->cParams.searchLog;
777 if (CCtxParams->cParams.searchLength) cParams.searchLength = CCtxParams->cParams.searchLength;
948 if (CCtxParams->cParams.minMatch) cParams.minMatch = CCtxParams->cParams.minMatch;
778 949 if (CCtxParams->cParams.targetLength) cParams.targetLength = CCtxParams->cParams.targetLength;
779 950 if (CCtxParams->cParams.strategy) cParams.strategy = CCtxParams->cParams.strategy;
780 951 assert(!ZSTD_checkCParams(cParams));
@@ -787,13 +958,12 b' ZSTD_sizeof_matchState(const ZSTD_compre'
787 958 {
788 959 size_t const chainSize = (cParams->strategy == ZSTD_fast) ? 0 : ((size_t)1 << cParams->chainLog);
789 960 size_t const hSize = ((size_t)1) << cParams->hashLog;
790 U32 const hashLog3 = (forCCtx && cParams->searchLength==3) ? MIN(ZSTD_HASHLOG3_MAX, cParams->windowLog) : 0;
961 U32 const hashLog3 = (forCCtx && cParams->minMatch==3) ? MIN(ZSTD_HASHLOG3_MAX, cParams->windowLog) : 0;
791 962 size_t const h3Size = ((size_t)1) << hashLog3;
792 963 size_t const tableSpace = (chainSize + hSize + h3Size) * sizeof(U32);
793 964 size_t const optPotentialSpace = ((MaxML+1) + (MaxLL+1) + (MaxOff+1) + (1<<Litbits)) * sizeof(U32)
794 965 + (ZSTD_OPT_NUM+1) * (sizeof(ZSTD_match_t)+sizeof(ZSTD_optimal_t));
795 size_t const optSpace = (forCCtx && ((cParams->strategy == ZSTD_btopt) ||
796 (cParams->strategy == ZSTD_btultra)))
966 size_t const optSpace = (forCCtx && (cParams->strategy >= ZSTD_btopt))
797 967 ? optPotentialSpace
798 968 : 0;
799 969 DEBUGLOG(4, "chainSize: %u - hSize: %u - h3Size: %u",
@@ -808,7 +978,7 b' size_t ZSTD_estimateCCtxSize_usingCCtxPa'
808 978 { ZSTD_compressionParameters const cParams =
809 979 ZSTD_getCParamsFromCCtxParams(params, 0, 0);
810 980 size_t const blockSize = MIN(ZSTD_BLOCKSIZE_MAX, (size_t)1 << cParams.windowLog);
811 U32 const divider = (cParams.searchLength==3) ? 3 : 4;
981 U32 const divider = (cParams.minMatch==3) ? 3 : 4;
812 982 size_t const maxNbSeq = blockSize / divider;
813 983 size_t const tokenSpace = WILDCOPY_OVERLENGTH + blockSize + 11*maxNbSeq;
814 984 size_t const entropySpace = HUF_WORKSPACE_SIZE;
@@ -843,7 +1013,7 b' size_t ZSTD_estimateCCtxSize(int compres'
843 1013 {
844 1014 int level;
845 1015 size_t memBudget = 0;
846 for (level=1; level<=compressionLevel; level++) {
1016 for (level=MIN(compressionLevel, 1); level<=compressionLevel; level++) {
847 1017 size_t const newMB = ZSTD_estimateCCtxSize_internal(level);
848 1018 if (newMB > memBudget) memBudget = newMB;
849 1019 }
@@ -879,7 +1049,7 b' size_t ZSTD_estimateCStreamSize(int comp'
879 1049 {
880 1050 int level;
881 1051 size_t memBudget = 0;
882 for (level=1; level<=compressionLevel; level++) {
1052 for (level=MIN(compressionLevel, 1); level<=compressionLevel; level++) {
883 1053 size_t const newMB = ZSTD_estimateCStreamSize_internal(level);
884 1054 if (newMB > memBudget) memBudget = newMB;
885 1055 }
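
Both size-estimation loops now start at MIN(compressionLevel, 1) instead of 1, so negative (fast) levels are actually scanned. Illustrated:

    /* compressionLevel =  3 : level runs 1,2,3    (as before)
     * compressionLevel = -5 : level runs -5 only  (previously the loop
     *                         body never executed and the memory budget
     *                         came back as 0 for fast levels) */
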
@@ -933,7 +1103,7 b' static U32 ZSTD_equivalentCParams(ZSTD_c'
933 1103 return (cParams1.hashLog == cParams2.hashLog)
934 1104 & (cParams1.chainLog == cParams2.chainLog)
935 1105 & (cParams1.strategy == cParams2.strategy) /* opt parser space */
936 & ((cParams1.searchLength==3) == (cParams2.searchLength==3)); /* hashlog3 space */
1106 & ((cParams1.minMatch==3) == (cParams2.minMatch==3)); /* hashlog3 space */
937 1107 }
938 1108
939 1109 static void ZSTD_assertEqualCParams(ZSTD_compressionParameters cParams1,
@@ -945,7 +1115,7 b' static void ZSTD_assertEqualCParams(ZSTD'
945 1115 assert(cParams1.chainLog == cParams2.chainLog);
946 1116 assert(cParams1.hashLog == cParams2.hashLog);
947 1117 assert(cParams1.searchLog == cParams2.searchLog);
948 assert(cParams1.searchLength == cParams2.searchLength);
1118 assert(cParams1.minMatch == cParams2.minMatch);
949 1119 assert(cParams1.targetLength == cParams2.targetLength);
950 1120 assert(cParams1.strategy == cParams2.strategy);
951 1121 }
@@ -960,7 +1130,7 b' static U32 ZSTD_equivalentLdmParams(ldmP'
960 1130 ldmParams1.hashLog == ldmParams2.hashLog &&
961 1131 ldmParams1.bucketSizeLog == ldmParams2.bucketSizeLog &&
962 1132 ldmParams1.minMatchLength == ldmParams2.minMatchLength &&
963 ldmParams1.hashEveryLog == ldmParams2.hashEveryLog);
1133 ldmParams1.hashRateLog == ldmParams2.hashRateLog);
964 1134 }
965 1135
966 1136 typedef enum { ZSTDb_not_buffered, ZSTDb_buffered } ZSTD_buffered_policy_e;
@@ -976,7 +1146,7 b' static U32 ZSTD_sufficientBuff(size_t bu'
976 1146 {
977 1147 size_t const windowSize2 = MAX(1, (size_t)MIN(((U64)1 << cParams2.windowLog), pledgedSrcSize));
978 1148 size_t const blockSize2 = MIN(ZSTD_BLOCKSIZE_MAX, windowSize2);
979 size_t const maxNbSeq2 = blockSize2 / ((cParams2.searchLength == 3) ? 3 : 4);
1149 size_t const maxNbSeq2 = blockSize2 / ((cParams2.minMatch == 3) ? 3 : 4);
980 1150 size_t const maxNbLit2 = blockSize2;
981 1151 size_t const neededBufferSize2 = (buffPol2==ZSTDb_buffered) ? windowSize2 + blockSize2 : 0;
982 1152 DEBUGLOG(4, "ZSTD_sufficientBuff: is neededBufferSize2=%u <= bufferSize1=%u",
@@ -1034,8 +1204,8 b' static void ZSTD_invalidateMatchState(ZS'
1034 1204 {
1035 1205 ZSTD_window_clear(&ms->window);
1036 1206
1037 ms->nextToUpdate = ms->window.dictLimit + 1;
1038 ms->nextToUpdate3 = ms->window.dictLimit + 1;
1207 ms->nextToUpdate = ms->window.dictLimit;
1208 ms->nextToUpdate3 = ms->window.dictLimit;
1039 1209 ms->loadedDictEnd = 0;
1040 1210 ms->opt.litLengthSum = 0; /* force reset of btopt stats */
1041 1211 ms->dictMatchState = NULL;
@@ -1080,7 +1250,7 b' ZSTD_reset_matchState(ZSTD_matchState_t*'
1080 1250 {
1081 1251 size_t const chainSize = (cParams->strategy == ZSTD_fast) ? 0 : ((size_t)1 << cParams->chainLog);
1082 1252 size_t const hSize = ((size_t)1) << cParams->hashLog;
1083 U32 const hashLog3 = (forCCtx && cParams->searchLength==3) ? MIN(ZSTD_HASHLOG3_MAX, cParams->windowLog) : 0;
1253 U32 const hashLog3 = (forCCtx && cParams->minMatch==3) ? MIN(ZSTD_HASHLOG3_MAX, cParams->windowLog) : 0;
1084 1254 size_t const h3Size = ((size_t)1) << hashLog3;
1085 1255 size_t const tableSpace = (chainSize + hSize + h3Size) * sizeof(U32);
1086 1256
@@ -1094,9 +1264,9 b' ZSTD_reset_matchState(ZSTD_matchState_t*'
1094 1264 ZSTD_invalidateMatchState(ms);
1095 1265
1096 1266 /* opt parser space */
1097 if (forCCtx && ((cParams->strategy == ZSTD_btopt) | (cParams->strategy == ZSTD_btultra))) {
1267 if (forCCtx && (cParams->strategy >= ZSTD_btopt)) {
1098 1268 DEBUGLOG(4, "reserving optimal parser space");
1099 ms->opt.litFreq = (U32*)ptr;
1269 ms->opt.litFreq = (unsigned*)ptr;
1100 1270 ms->opt.litLengthFreq = ms->opt.litFreq + (1<<Litbits);
1101 1271 ms->opt.matchLengthFreq = ms->opt.litLengthFreq + (MaxLL+1);
1102 1272 ms->opt.offCodeFreq = ms->opt.matchLengthFreq + (MaxML+1);
@@ -1158,13 +1328,13 b' static size_t ZSTD_resetCCtx_internal(ZS'
1158 1328 /* Adjust long distance matching parameters */
1159 1329 ZSTD_ldm_adjustParameters(&params.ldmParams, &params.cParams);
1160 1330 assert(params.ldmParams.hashLog >= params.ldmParams.bucketSizeLog);
1161 assert(params.ldmParams.hashEveryLog < 32);
1162 zc->ldmState.hashPower = ZSTD_ldm_getHashPower(params.ldmParams.minMatchLength);
1331 assert(params.ldmParams.hashRateLog < 32);
1332 zc->ldmState.hashPower = ZSTD_rollingHash_primePower(params.ldmParams.minMatchLength);
1163 1333 }
1164 1334
1165 1335 { size_t const windowSize = MAX(1, (size_t)MIN(((U64)1 << params.cParams.windowLog), pledgedSrcSize));
1166 1336 size_t const blockSize = MIN(ZSTD_BLOCKSIZE_MAX, windowSize);
1167 U32 const divider = (params.cParams.searchLength==3) ? 3 : 4;
1337 U32 const divider = (params.cParams.minMatch==3) ? 3 : 4;
1168 1338 size_t const maxNbSeq = blockSize / divider;
1169 1339 size_t const tokenSpace = WILDCOPY_OVERLENGTH + blockSize + 11*maxNbSeq;
1170 1340 size_t const buffOutSize = (zbuff==ZSTDb_buffered) ? ZSTD_compressBound(blockSize)+1 : 0;
@@ -1227,7 +1397,7 b' static size_t ZSTD_resetCCtx_internal(ZS'
1227 1397 if (pledgedSrcSize == ZSTD_CONTENTSIZE_UNKNOWN)
1228 1398 zc->appliedParams.fParams.contentSizeFlag = 0;
1229 1399 DEBUGLOG(4, "pledged content size : %u ; flag : %u",
1230 (U32)pledgedSrcSize, zc->appliedParams.fParams.contentSizeFlag);
1400 (unsigned)pledgedSrcSize, zc->appliedParams.fParams.contentSizeFlag);
1231 1401 zc->blockSize = blockSize;
1232 1402
1233 1403 XXH64_reset(&zc->xxhState, 0);
@@ -1306,16 +1476,17 b' void ZSTD_invalidateRepCodes(ZSTD_CCtx* '
1306 1476 * dictionary tables into the working context is faster than using them
1307 1477 * in-place.
1308 1478 */
1309 static const size_t attachDictSizeCutoffs[(unsigned)ZSTD_btultra+1] = {
1310 8 KB, /* unused */
1311 8 KB, /* ZSTD_fast */
1479 static const size_t attachDictSizeCutoffs[ZSTD_STRATEGY_MAX+1] = {
1480 8 KB, /* unused */
1481 8 KB, /* ZSTD_fast */
1312 1482 16 KB, /* ZSTD_dfast */
1313 1483 32 KB, /* ZSTD_greedy */
1314 1484 32 KB, /* ZSTD_lazy */
1315 1485 32 KB, /* ZSTD_lazy2 */
1316 1486 32 KB, /* ZSTD_btlazy2 */
1317 1487 32 KB, /* ZSTD_btopt */
1318 8 KB /* ZSTD_btultra */
1488 8 KB, /* ZSTD_btultra */
1489 8 KB /* ZSTD_btultra2 */
1319 1490 };
1320 1491
1321 1492 static int ZSTD_shouldAttachDict(const ZSTD_CDict* cdict,
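The enlarged cutoff table feeds ZSTD_shouldAttachDict just below: attaching a CDict in place avoids copying its tables into the working context, which pays off when the data to compress is small relative to the copy. A rough sketch of that decision, simplified from the real function (which also consults attachDictPref and forceWindow); shouldAttach_sketch is a hypothetical name, types from zstd.h:

    /* Attach when the pledged source is small (or unknown): copying the
     * dictionary tables would dominate the cost of compressing it. */
    static int shouldAttach_sketch(unsigned long long pledgedSrcSize,
                                   ZSTD_strategy strat,
                                   const size_t cutoffs[ZSTD_STRATEGY_MAX+1])
    {
        return (pledgedSrcSize <= cutoffs[strat])
            || (pledgedSrcSize == ZSTD_CONTENTSIZE_UNKNOWN);
    }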
@@ -1447,7 +1618,8 b' static size_t ZSTD_resetCCtx_usingCDict('
1447 1618 ZSTD_buffered_policy_e zbuff)
1448 1619 {
1449 1620
1450 DEBUGLOG(4, "ZSTD_resetCCtx_usingCDict (pledgedSrcSize=%u)", (U32)pledgedSrcSize);
1621 DEBUGLOG(4, "ZSTD_resetCCtx_usingCDict (pledgedSrcSize=%u)",
1622 (unsigned)pledgedSrcSize);
1451 1623
1452 1624 if (ZSTD_shouldAttachDict(cdict, params, pledgedSrcSize)) {
1453 1625 return ZSTD_resetCCtx_byAttachingCDict(
@@ -1670,7 +1842,9 b' static size_t ZSTD_compressRleLiteralsBl'
1670 1842 * note : use same formula for both situations */
1671 1843 static size_t ZSTD_minGain(size_t srcSize, ZSTD_strategy strat)
1672 1844 {
1673 U32 const minlog = (strat==ZSTD_btultra) ? 7 : 6;
1845 U32 const minlog = (strat>=ZSTD_btultra) ? (U32)(strat) - 1 : 6;
1846 ZSTD_STATIC_ASSERT(ZSTD_btultra == 8);
1847 assert(ZSTD_cParam_withinBounds(ZSTD_c_strategy, strat));
1674 1848 return (srcSize >> minlog) + 2;
1675 1849 }
1676 1850
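The rewritten ZSTD_minGain generalizes the old two-case rule: with ZSTD_btultra == 8 (asserted above), btultra keeps minlog 7 while the new btultra2 strategy gets minlog 8, so stronger strategies accept compressed literals on smaller savings. A self-contained check of the arithmetic (minGain_sketch is a hypothetical stand-in):

    #include <stddef.h>

    /* minGain for a 128 KB block (1<<17) under the new rule
     * minlog = (strat >= ZSTD_btultra) ? strat - 1 : 6 :
     *   btopt    (7) -> (131072 >> 6) + 2 == 2050
     *   btultra  (8) -> (131072 >> 7) + 2 == 1026
     *   btultra2 (9) -> (131072 >> 8) + 2 ==  514   */
    static size_t minGain_sketch(size_t srcSize, int strat)
    {
        unsigned const minlog = (strat >= 8 /* ZSTD_btultra */) ? (unsigned)strat - 1 : 6;
        return (srcSize >> minlog) + 2;
    }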
@@ -1679,7 +1853,8 b' static size_t ZSTD_compressLiterals (ZST'
1679 1853 ZSTD_strategy strategy, int disableLiteralCompression,
1680 1854 void* dst, size_t dstCapacity,
1681 1855 const void* src, size_t srcSize,
1682 U32* workspace, const int bmi2)
1856 void* workspace, size_t wkspSize,
1857 const int bmi2)
1683 1858 {
1684 1859 size_t const minGain = ZSTD_minGain(srcSize, strategy);
1685 1860 size_t const lhSize = 3 + (srcSize >= 1 KB) + (srcSize >= 16 KB);
@@ -1708,9 +1883,9 b' static size_t ZSTD_compressLiterals (ZST'
1708 1883 int const preferRepeat = strategy < ZSTD_lazy ? srcSize <= 1024 : 0;
1709 1884 if (repeat == HUF_repeat_valid && lhSize == 3) singleStream = 1;
1710 1885 cLitSize = singleStream ? HUF_compress1X_repeat(ostart+lhSize, dstCapacity-lhSize, src, srcSize, 255, 11,
1711 workspace, HUF_WORKSPACE_SIZE, (HUF_CElt*)nextHuf->CTable, &repeat, preferRepeat, bmi2)
1886 workspace, wkspSize, (HUF_CElt*)nextHuf->CTable, &repeat, preferRepeat, bmi2)
1712 1887 : HUF_compress4X_repeat(ostart+lhSize, dstCapacity-lhSize, src, srcSize, 255, 11,
1713 workspace, HUF_WORKSPACE_SIZE, (HUF_CElt*)nextHuf->CTable, &repeat, preferRepeat, bmi2);
1888 workspace, wkspSize, (HUF_CElt*)nextHuf->CTable, &repeat, preferRepeat, bmi2);
1714 1889 if (repeat != HUF_repeat_none) {
1715 1890 /* reused the existing table */
1716 1891 hType = set_repeat;
@@ -1977,7 +2152,7 b' ZSTD_selectEncodingType('
1977 2152 assert(!ZSTD_isError(NCountCost));
1978 2153 assert(compressedCost < ERROR(maxCode));
1979 2154 DEBUGLOG(5, "Estimated bit costs: basic=%u\trepeat=%u\tcompressed=%u",
1980 (U32)basicCost, (U32)repeatCost, (U32)compressedCost);
2155 (unsigned)basicCost, (unsigned)repeatCost, (unsigned)compressedCost);
1981 2156 if (basicCost <= repeatCost && basicCost <= compressedCost) {
1982 2157 DEBUGLOG(5, "Selected set_basic");
1983 2158 assert(isDefaultAllowed);
@@ -1999,7 +2174,7 b' ZSTD_selectEncodingType('
1999 2174 MEM_STATIC size_t
2000 2175 ZSTD_buildCTable(void* dst, size_t dstCapacity,
2001 2176 FSE_CTable* nextCTable, U32 FSELog, symbolEncodingType_e type,
2002 U32* count, U32 max,
2177 unsigned* count, U32 max,
2003 2178 const BYTE* codeTable, size_t nbSeq,
2004 2179 const S16* defaultNorm, U32 defaultNormLog, U32 defaultMax,
2005 2180 const FSE_CTable* prevCTable, size_t prevCTableSize,
@@ -2007,11 +2182,13 b' ZSTD_buildCTable(void* dst, size_t dstCa'
2007 2182 {
2008 2183 BYTE* op = (BYTE*)dst;
2009 2184 const BYTE* const oend = op + dstCapacity;
2185 DEBUGLOG(6, "ZSTD_buildCTable (dstCapacity=%u)", (unsigned)dstCapacity);
2010 2186
2011 2187 switch (type) {
2012 2188 case set_rle:
2189 CHECK_F(FSE_buildCTable_rle(nextCTable, (BYTE)max));
2190 if (dstCapacity==0) return ERROR(dstSize_tooSmall);
2013 2191 *op = codeTable[0];
2014 CHECK_F(FSE_buildCTable_rle(nextCTable, (BYTE)max));
2015 2192 return 1;
2016 2193 case set_repeat:
2017 2194 memcpy(nextCTable, prevCTable, prevCTableSize);
@@ -2053,6 +2230,9 b' ZSTD_encodeSequences_body('
2053 2230 FSE_CState_t stateLitLength;
2054 2231
2055 2232 CHECK_E(BIT_initCStream(&blockStream, dst, dstCapacity), dstSize_tooSmall); /* not enough space remaining */
2233 DEBUGLOG(6, "available space for bitstream : %i (dstCapacity=%u)",
2234 (int)(blockStream.endPtr - blockStream.startPtr),
2235 (unsigned)dstCapacity);
2056 2236
2057 2237 /* first symbols */
2058 2238 FSE_initCState2(&stateMatchLength, CTable_MatchLength, mlCodeTable[nbSeq-1]);
@@ -2085,9 +2265,9 b' ZSTD_encodeSequences_body('
2085 2265 U32 const ofBits = ofCode;
2086 2266 U32 const mlBits = ML_bits[mlCode];
2087 2267 DEBUGLOG(6, "encoding: litlen:%2u - matchlen:%2u - offCode:%7u",
2088 sequences[n].litLength,
2089 sequences[n].matchLength + MINMATCH,
2090 sequences[n].offset);
2268 (unsigned)sequences[n].litLength,
2269 (unsigned)sequences[n].matchLength + MINMATCH,
2270 (unsigned)sequences[n].offset);
2091 2271 /* 32b*/ /* 64b*/
2092 2272 /* (7)*/ /* (7)*/
2093 2273 FSE_encodeSymbol(&blockStream, &stateOffsetBits, ofCode); /* 15 */ /* 15 */
@@ -2112,6 +2292,7 b' ZSTD_encodeSequences_body('
2112 2292 BIT_addBits(&blockStream, sequences[n].offset, ofBits); /* 31 */
2113 2293 }
2114 2294 BIT_flushBits(&blockStream); /* (7)*/
2295 DEBUGLOG(7, "remaining space : %i", (int)(blockStream.endPtr - blockStream.ptr));
2115 2296 } }
2116 2297
2117 2298 DEBUGLOG(6, "ZSTD_encodeSequences: flushing ML state with %u bits", stateMatchLength.stateLog);
@@ -2169,6 +2350,7 b' static size_t ZSTD_encodeSequences('
2169 2350 FSE_CTable const* CTable_LitLength, BYTE const* llCodeTable,
2170 2351 seqDef const* sequences, size_t nbSeq, int longOffsets, int bmi2)
2171 2352 {
2353 DEBUGLOG(5, "ZSTD_encodeSequences: dstCapacity = %u", (unsigned)dstCapacity);
2172 2354 #if DYNAMIC_BMI2
2173 2355 if (bmi2) {
2174 2356 return ZSTD_encodeSequences_bmi2(dst, dstCapacity,
@@ -2186,16 +2368,20 b' static size_t ZSTD_encodeSequences('
2186 2368 sequences, nbSeq, longOffsets);
2187 2369 }
2188 2370
2189 MEM_STATIC size_t ZSTD_compressSequences_internal(seqStore_t* seqStorePtr,
2190 ZSTD_entropyCTables_t const* prevEntropy,
2191 ZSTD_entropyCTables_t* nextEntropy,
2192 ZSTD_CCtx_params const* cctxParams,
2193 void* dst, size_t dstCapacity, U32* workspace,
2194 const int bmi2)
2371 /* ZSTD_compressSequences_internal():
2372 * actually compresses both literals and sequences */
2373 MEM_STATIC size_t
2374 ZSTD_compressSequences_internal(seqStore_t* seqStorePtr,
2375 const ZSTD_entropyCTables_t* prevEntropy,
2376 ZSTD_entropyCTables_t* nextEntropy,
2377 const ZSTD_CCtx_params* cctxParams,
2378 void* dst, size_t dstCapacity,
2379 void* workspace, size_t wkspSize,
2380 const int bmi2)
2195 2381 {
2196 2382 const int longOffsets = cctxParams->cParams.windowLog > STREAM_ACCUMULATOR_MIN;
2197 2383 ZSTD_strategy const strategy = cctxParams->cParams.strategy;
2198 U32 count[MaxSeq+1];
2384 unsigned count[MaxSeq+1];
2199 2385 FSE_CTable* CTable_LitLength = nextEntropy->fse.litlengthCTable;
2200 2386 FSE_CTable* CTable_OffsetBits = nextEntropy->fse.offcodeCTable;
2201 2387 FSE_CTable* CTable_MatchLength = nextEntropy->fse.matchlengthCTable;
@@ -2212,6 +2398,7 b' MEM_STATIC size_t ZSTD_compressSequences'
2212 2398 BYTE* lastNCount = NULL;
2213 2399
2214 2400 ZSTD_STATIC_ASSERT(HUF_WORKSPACE_SIZE >= (1<<MAX(MLFSELog,LLFSELog)));
2401 DEBUGLOG(5, "ZSTD_compressSequences_internal");
2215 2402
2216 2403 /* Compress literals */
2217 2404 { const BYTE* const literals = seqStorePtr->litStart;
@@ -2222,7 +2409,8 b' MEM_STATIC size_t ZSTD_compressSequences'
2222 2409 cctxParams->cParams.strategy, disableLiteralCompression,
2223 2410 op, dstCapacity,
2224 2411 literals, litSize,
2225 workspace, bmi2);
2412 workspace, wkspSize,
2413 bmi2);
2226 2414 if (ZSTD_isError(cSize))
2227 2415 return cSize;
2228 2416 assert(cSize <= dstCapacity);
@@ -2249,51 +2437,63 b' MEM_STATIC size_t ZSTD_compressSequences'
2249 2437 /* convert length/distances into codes */
2250 2438 ZSTD_seqToCodes(seqStorePtr);
2251 2439 /* build CTable for Literal Lengths */
2252 { U32 max = MaxLL;
2253 size_t const mostFrequent = HIST_countFast_wksp(count, &max, llCodeTable, nbSeq, workspace); /* can't fail */
2440 { unsigned max = MaxLL;
2441 size_t const mostFrequent = HIST_countFast_wksp(count, &max, llCodeTable, nbSeq, workspace, wkspSize); /* can't fail */
2254 2442 DEBUGLOG(5, "Building LL table");
2255 2443 nextEntropy->fse.litlength_repeatMode = prevEntropy->fse.litlength_repeatMode;
2256 LLtype = ZSTD_selectEncodingType(&nextEntropy->fse.litlength_repeatMode, count, max, mostFrequent, nbSeq, LLFSELog, prevEntropy->fse.litlengthCTable, LL_defaultNorm, LL_defaultNormLog, ZSTD_defaultAllowed, strategy);
2444 LLtype = ZSTD_selectEncodingType(&nextEntropy->fse.litlength_repeatMode,
2445 count, max, mostFrequent, nbSeq,
2446 LLFSELog, prevEntropy->fse.litlengthCTable,
2447 LL_defaultNorm, LL_defaultNormLog,
2448 ZSTD_defaultAllowed, strategy);
2257 2449 assert(set_basic < set_compressed && set_rle < set_compressed);
2258 2450 assert(!(LLtype < set_compressed && nextEntropy->fse.litlength_repeatMode != FSE_repeat_none)); /* We don't copy tables */
2259 2451 { size_t const countSize = ZSTD_buildCTable(op, oend - op, CTable_LitLength, LLFSELog, (symbolEncodingType_e)LLtype,
2260 2452 count, max, llCodeTable, nbSeq, LL_defaultNorm, LL_defaultNormLog, MaxLL,
2261 2453 prevEntropy->fse.litlengthCTable, sizeof(prevEntropy->fse.litlengthCTable),
2262 workspace, HUF_WORKSPACE_SIZE);
2454 workspace, wkspSize);
2263 2455 if (ZSTD_isError(countSize)) return countSize;
2264 2456 if (LLtype == set_compressed)
2265 2457 lastNCount = op;
2266 2458 op += countSize;
2267 2459 } }
2268 2460 /* build CTable for Offsets */
2269 { U32 max = MaxOff;
2270 size_t const mostFrequent = HIST_countFast_wksp(count, &max, ofCodeTable, nbSeq, workspace); /* can't fail */
2461 { unsigned max = MaxOff;
2462 size_t const mostFrequent = HIST_countFast_wksp(count, &max, ofCodeTable, nbSeq, workspace, wkspSize); /* can't fail */
2271 2463 /* We can only use the basic table if max <= DefaultMaxOff, otherwise the offsets are too large */
2272 2464 ZSTD_defaultPolicy_e const defaultPolicy = (max <= DefaultMaxOff) ? ZSTD_defaultAllowed : ZSTD_defaultDisallowed;
2273 2465 DEBUGLOG(5, "Building OF table");
2274 2466 nextEntropy->fse.offcode_repeatMode = prevEntropy->fse.offcode_repeatMode;
2275 Offtype = ZSTD_selectEncodingType(&nextEntropy->fse.offcode_repeatMode, count, max, mostFrequent, nbSeq, OffFSELog, prevEntropy->fse.offcodeCTable, OF_defaultNorm, OF_defaultNormLog, defaultPolicy, strategy);
2467 Offtype = ZSTD_selectEncodingType(&nextEntropy->fse.offcode_repeatMode,
2468 count, max, mostFrequent, nbSeq,
2469 OffFSELog, prevEntropy->fse.offcodeCTable,
2470 OF_defaultNorm, OF_defaultNormLog,
2471 defaultPolicy, strategy);
2276 2472 assert(!(Offtype < set_compressed && nextEntropy->fse.offcode_repeatMode != FSE_repeat_none)); /* We don't copy tables */
2277 2473 { size_t const countSize = ZSTD_buildCTable(op, oend - op, CTable_OffsetBits, OffFSELog, (symbolEncodingType_e)Offtype,
2278 2474 count, max, ofCodeTable, nbSeq, OF_defaultNorm, OF_defaultNormLog, DefaultMaxOff,
2279 2475 prevEntropy->fse.offcodeCTable, sizeof(prevEntropy->fse.offcodeCTable),
2280 workspace, HUF_WORKSPACE_SIZE);
2476 workspace, wkspSize);
2281 2477 if (ZSTD_isError(countSize)) return countSize;
2282 2478 if (Offtype == set_compressed)
2283 2479 lastNCount = op;
2284 2480 op += countSize;
2285 2481 } }
2286 2482 /* build CTable for MatchLengths */
2287 { U32 max = MaxML;
2288 size_t const mostFrequent = HIST_countFast_wksp(count, &max, mlCodeTable, nbSeq, workspace); /* can't fail */
2289 DEBUGLOG(5, "Building ML table");
2483 { unsigned max = MaxML;
2484 size_t const mostFrequent = HIST_countFast_wksp(count, &max, mlCodeTable, nbSeq, workspace, wkspSize); /* can't fail */
2485 DEBUGLOG(5, "Building ML table (remaining space : %i)", (int)(oend-op));
2290 2486 nextEntropy->fse.matchlength_repeatMode = prevEntropy->fse.matchlength_repeatMode;
2291 MLtype = ZSTD_selectEncodingType(&nextEntropy->fse.matchlength_repeatMode, count, max, mostFrequent, nbSeq, MLFSELog, prevEntropy->fse.matchlengthCTable, ML_defaultNorm, ML_defaultNormLog, ZSTD_defaultAllowed, strategy);
2487 MLtype = ZSTD_selectEncodingType(&nextEntropy->fse.matchlength_repeatMode,
2488 count, max, mostFrequent, nbSeq,
2489 MLFSELog, prevEntropy->fse.matchlengthCTable,
2490 ML_defaultNorm, ML_defaultNormLog,
2491 ZSTD_defaultAllowed, strategy);
2292 2492 assert(!(MLtype < set_compressed && nextEntropy->fse.matchlength_repeatMode != FSE_repeat_none)); /* We don't copy tables */
2293 2493 { size_t const countSize = ZSTD_buildCTable(op, oend - op, CTable_MatchLength, MLFSELog, (symbolEncodingType_e)MLtype,
2294 2494 count, max, mlCodeTable, nbSeq, ML_defaultNorm, ML_defaultNormLog, MaxML,
2295 2495 prevEntropy->fse.matchlengthCTable, sizeof(prevEntropy->fse.matchlengthCTable),
2296 workspace, HUF_WORKSPACE_SIZE);
2496 workspace, wkspSize);
2297 2497 if (ZSTD_isError(countSize)) return countSize;
2298 2498 if (MLtype == set_compressed)
2299 2499 lastNCount = op;
@@ -2328,19 +2528,24 b' MEM_STATIC size_t ZSTD_compressSequences'
2328 2528 }
2329 2529 }
2330 2530
2531 DEBUGLOG(5, "compressed block size : %u", (unsigned)(op - ostart));
2331 2532 return op - ostart;
2332 2533 }
2333 2534
2334 MEM_STATIC size_t ZSTD_compressSequences(seqStore_t* seqStorePtr,
2335 const ZSTD_entropyCTables_t* prevEntropy,
2336 ZSTD_entropyCTables_t* nextEntropy,
2337 const ZSTD_CCtx_params* cctxParams,
2338 void* dst, size_t dstCapacity,
2339 size_t srcSize, U32* workspace, int bmi2)
2535 MEM_STATIC size_t
2536 ZSTD_compressSequences(seqStore_t* seqStorePtr,
2537 const ZSTD_entropyCTables_t* prevEntropy,
2538 ZSTD_entropyCTables_t* nextEntropy,
2539 const ZSTD_CCtx_params* cctxParams,
2540 void* dst, size_t dstCapacity,
2541 size_t srcSize,
2542 void* workspace, size_t wkspSize,
2543 int bmi2)
2340 2544 {
2341 2545 size_t const cSize = ZSTD_compressSequences_internal(
2342 seqStorePtr, prevEntropy, nextEntropy, cctxParams, dst, dstCapacity,
2343 workspace, bmi2);
2546 seqStorePtr, prevEntropy, nextEntropy, cctxParams,
2547 dst, dstCapacity,
2548 workspace, wkspSize, bmi2);
2344 2549 if (cSize == 0) return 0;
2345 2550 /* When srcSize <= dstCapacity, there is enough space to write a raw uncompressed block.
2346 2551 * Since we ran out of space, block must be not compressible, so fall back to raw uncompressed block.
@@ -2362,7 +2567,7 b' MEM_STATIC size_t ZSTD_compressSequences'
2362 2567 * assumption : strat is a valid strategy */
2363 2568 ZSTD_blockCompressor ZSTD_selectBlockCompressor(ZSTD_strategy strat, ZSTD_dictMode_e dictMode)
2364 2569 {
2365 static const ZSTD_blockCompressor blockCompressor[3][(unsigned)ZSTD_btultra+1] = {
2570 static const ZSTD_blockCompressor blockCompressor[3][ZSTD_STRATEGY_MAX+1] = {
2366 2571 { ZSTD_compressBlock_fast /* default for 0 */,
2367 2572 ZSTD_compressBlock_fast,
2368 2573 ZSTD_compressBlock_doubleFast,
@@ -2371,7 +2576,8 b' ZSTD_blockCompressor ZSTD_selectBlockCom'
2371 2576 ZSTD_compressBlock_lazy2,
2372 2577 ZSTD_compressBlock_btlazy2,
2373 2578 ZSTD_compressBlock_btopt,
2374 ZSTD_compressBlock_btultra },
2579 ZSTD_compressBlock_btultra,
2580 ZSTD_compressBlock_btultra2 },
2375 2581 { ZSTD_compressBlock_fast_extDict /* default for 0 */,
2376 2582 ZSTD_compressBlock_fast_extDict,
2377 2583 ZSTD_compressBlock_doubleFast_extDict,
@@ -2380,6 +2586,7 b' ZSTD_blockCompressor ZSTD_selectBlockCom'
2380 2586 ZSTD_compressBlock_lazy2_extDict,
2381 2587 ZSTD_compressBlock_btlazy2_extDict,
2382 2588 ZSTD_compressBlock_btopt_extDict,
2589 ZSTD_compressBlock_btultra_extDict,
2383 2590 ZSTD_compressBlock_btultra_extDict },
2384 2591 { ZSTD_compressBlock_fast_dictMatchState /* default for 0 */,
2385 2592 ZSTD_compressBlock_fast_dictMatchState,
@@ -2389,14 +2596,14 b' ZSTD_blockCompressor ZSTD_selectBlockCom'
2389 2596 ZSTD_compressBlock_lazy2_dictMatchState,
2390 2597 ZSTD_compressBlock_btlazy2_dictMatchState,
2391 2598 ZSTD_compressBlock_btopt_dictMatchState,
2599 ZSTD_compressBlock_btultra_dictMatchState,
2392 2600 ZSTD_compressBlock_btultra_dictMatchState }
2393 2601 };
2394 2602 ZSTD_blockCompressor selectedCompressor;
2395 2603 ZSTD_STATIC_ASSERT((unsigned)ZSTD_fast == 1);
2396 2604
2397 assert((U32)strat >= (U32)ZSTD_fast);
2398 assert((U32)strat <= (U32)ZSTD_btultra);
2399 selectedCompressor = blockCompressor[(int)dictMode][(U32)strat];
2605 assert(ZSTD_cParam_withinBounds(ZSTD_c_strategy, strat));
2606 selectedCompressor = blockCompressor[(int)dictMode][(int)strat];
2400 2607 assert(selectedCompressor != NULL);
2401 2608 return selectedCompressor;
2402 2609 }
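Sizing each row with ZSTD_STRATEGY_MAX+1 instead of a hard-coded last enum member means adding ZSTD_btultra2 costs exactly one entry per dictMode row, and a forgotten entry trips the NULL assert rather than indexing past the table. A minimal self-contained rendition of the pattern (fn_fast, fn_btultra2, and dispatch are hypothetical stand-ins):

    #include <assert.h>
    #include <stddef.h>

    typedef size_t (*blockFn)(size_t srcSize);
    static size_t fn_fast(size_t n)     { return n; }       /* stand-in */
    static size_t fn_btultra2(size_t n) { return n / 2; }   /* stand-in */

    enum { STRAT_MAX = 9 };   /* mirrors ZSTD_STRATEGY_MAX == ZSTD_btultra2 */

    static size_t dispatch(int strat, size_t n)
    {
        static const blockFn table[STRAT_MAX + 1] = {
            fn_fast, fn_fast, fn_fast, fn_fast, fn_fast,
            fn_fast, fn_fast, fn_fast, fn_fast, fn_btultra2,
        };
        blockFn const f = table[strat];
        assert(f != NULL);   /* a missing entry fails loudly, as above */
        return f(n);
    }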
@@ -2421,15 +2628,15 b' static size_t ZSTD_compressBlock_interna'
2421 2628 {
2422 2629 ZSTD_matchState_t* const ms = &zc->blockState.matchState;
2423 2630 size_t cSize;
2424 DEBUGLOG(5, "ZSTD_compressBlock_internal (dstCapacity=%zu, dictLimit=%u, nextToUpdate=%u)",
2425 dstCapacity, ms->window.dictLimit, ms->nextToUpdate);
2631 DEBUGLOG(5, "ZSTD_compressBlock_internal (dstCapacity=%u, dictLimit=%u, nextToUpdate=%u)",
2632 (unsigned)dstCapacity, (unsigned)ms->window.dictLimit, (unsigned)ms->nextToUpdate);
2426 2633 assert(srcSize <= ZSTD_BLOCKSIZE_MAX);
2427 2634
2428 2635 /* Assert that we have correctly flushed the ctx params into the ms's copy */
2429 2636 ZSTD_assertEqualCParams(zc->appliedParams.cParams, ms->cParams);
2430 2637
2431 2638 if (srcSize < MIN_CBLOCK_SIZE+ZSTD_blockHeaderSize+1) {
2432 ZSTD_ldm_skipSequences(&zc->externSeqStore, srcSize, zc->appliedParams.cParams.searchLength);
2639 ZSTD_ldm_skipSequences(&zc->externSeqStore, srcSize, zc->appliedParams.cParams.minMatch);
2433 2640 cSize = 0;
2434 2641 goto out; /* don't even attempt compression below a certain srcSize */
2435 2642 }
@@ -2437,8 +2644,8 b' static size_t ZSTD_compressBlock_interna'
2437 2644 ms->opt.symbolCosts = &zc->blockState.prevCBlock->entropy; /* required for optimal parser to read stats from dictionary */
2438 2645
2439 2646 /* a gap between an attached dict and the current window is not safe,
2440 * they must remain adjacent, and when that stops being the case, the dict
2441 * must be unset */
2647 * they must remain adjacent,
2648 * and when that stops being the case, the dict must be unset */
2442 2649 assert(ms->dictMatchState == NULL || ms->loadedDictEnd == ms->window.dictLimit);
2443 2650
2444 2651 /* limited update after a very long match */
@@ -2495,7 +2702,9 b' static size_t ZSTD_compressBlock_interna'
2495 2702 &zc->blockState.prevCBlock->entropy, &zc->blockState.nextCBlock->entropy,
2496 2703 &zc->appliedParams,
2497 2704 dst, dstCapacity,
2498 srcSize, zc->entropyWorkspace, zc->bmi2);
2705 srcSize,
2706 zc->entropyWorkspace, HUF_WORKSPACE_SIZE /* statically allocated in resetCCtx */,
2707 zc->bmi2);
2499 2708
2500 2709 out:
2501 2710 if (!ZSTD_isError(cSize) && cSize != 0) {
@@ -2535,7 +2744,7 b' static size_t ZSTD_compress_frameChunk ('
2535 2744 U32 const maxDist = (U32)1 << cctx->appliedParams.cParams.windowLog;
2536 2745 assert(cctx->appliedParams.cParams.windowLog <= 31);
2537 2746
2538 DEBUGLOG(5, "ZSTD_compress_frameChunk (blockSize=%u)", (U32)blockSize);
2747 DEBUGLOG(5, "ZSTD_compress_frameChunk (blockSize=%u)", (unsigned)blockSize);
2539 2748 if (cctx->appliedParams.fParams.checksumFlag && srcSize)
2540 2749 XXH64_update(&cctx->xxhState, src, srcSize);
2541 2750
@@ -2583,7 +2792,7 b' static size_t ZSTD_compress_frameChunk ('
2583 2792 assert(dstCapacity >= cSize);
2584 2793 dstCapacity -= cSize;
2585 2794 DEBUGLOG(5, "ZSTD_compress_frameChunk: adding a block of size %u",
2586 (U32)cSize);
2795 (unsigned)cSize);
2587 2796 } }
2588 2797
2589 2798 if (lastFrameChunk && (op>ostart)) cctx->stage = ZSTDcs_ending;
@@ -2606,9 +2815,9 b' static size_t ZSTD_writeFrameHeader(void'
2606 2815 size_t pos=0;
2607 2816
2608 2817 assert(!(params.fParams.contentSizeFlag && pledgedSrcSize == ZSTD_CONTENTSIZE_UNKNOWN));
2609 if (dstCapacity < ZSTD_frameHeaderSize_max) return ERROR(dstSize_tooSmall);
2818 if (dstCapacity < ZSTD_FRAMEHEADERSIZE_MAX) return ERROR(dstSize_tooSmall);
2610 2819 DEBUGLOG(4, "ZSTD_writeFrameHeader : dictIDFlag : %u ; dictID : %u ; dictIDSizeCode : %u",
2611 !params.fParams.noDictIDFlag, dictID, dictIDSizeCode);
2820 !params.fParams.noDictIDFlag, (unsigned)dictID, (unsigned)dictIDSizeCode);
2612 2821
2613 2822 if (params.format == ZSTD_f_zstd1) {
2614 2823 MEM_writeLE32(dst, ZSTD_MAGICNUMBER);
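The renamed ZSTD_FRAMEHEADERSIZE_MAX bound guards the same layout as before: in ZSTD_f_zstd1 format, every frame opens with the 4-byte little-endian magic 0xFD2FB528, then the frame header descriptor byte. A sketch of what MEM_writeLE32 emits here (write_zstd_magic is a hypothetical helper):

    #include <stdint.h>

    /* First four bytes of a zstd frame, byte for byte. */
    static void write_zstd_magic(uint8_t* dst)
    {
        uint32_t const magic = 0xFD2FB528u;   /* ZSTD_MAGICNUMBER */
        dst[0] = (uint8_t)(magic >>  0);
        dst[1] = (uint8_t)(magic >>  8);
        dst[2] = (uint8_t)(magic >> 16);
        dst[3] = (uint8_t)(magic >> 24);
    }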
@@ -2672,7 +2881,7 b' static size_t ZSTD_compressContinue_inte'
2672 2881 size_t fhSize = 0;
2673 2882
2674 2883 DEBUGLOG(5, "ZSTD_compressContinue_internal, stage: %u, srcSize: %u",
2675 cctx->stage, (U32)srcSize);
2884 cctx->stage, (unsigned)srcSize);
2676 2885 if (cctx->stage==ZSTDcs_created) return ERROR(stage_wrong); /* missing init (ZSTD_compressBegin) */
2677 2886
2678 2887 if (frame && (cctx->stage==ZSTDcs_init)) {
@@ -2709,7 +2918,7 b' static size_t ZSTD_compressContinue_inte'
2709 2918 }
2710 2919 }
2711 2920
2712 DEBUGLOG(5, "ZSTD_compressContinue_internal (blockSize=%u)", (U32)cctx->blockSize);
2921 DEBUGLOG(5, "ZSTD_compressContinue_internal (blockSize=%u)", (unsigned)cctx->blockSize);
2713 2922 { size_t const cSize = frame ?
2714 2923 ZSTD_compress_frameChunk (cctx, dst, dstCapacity, src, srcSize, lastFrameChunk) :
2715 2924 ZSTD_compressBlock_internal (cctx, dst, dstCapacity, src, srcSize);
@@ -2721,7 +2930,7 b' static size_t ZSTD_compressContinue_inte'
2721 2930 ZSTD_STATIC_ASSERT(ZSTD_CONTENTSIZE_UNKNOWN == (unsigned long long)-1);
2722 2931 if (cctx->consumedSrcSize+1 > cctx->pledgedSrcSizePlusOne) {
2723 2932 DEBUGLOG(4, "error : pledgedSrcSize = %u, while realSrcSize >= %u",
2724 (U32)cctx->pledgedSrcSizePlusOne-1, (U32)cctx->consumedSrcSize);
2933 (unsigned)cctx->pledgedSrcSizePlusOne-1, (unsigned)cctx->consumedSrcSize);
2725 2934 return ERROR(srcSize_wrong);
2726 2935 }
2727 2936 }
@@ -2733,7 +2942,7 b' size_t ZSTD_compressContinue (ZSTD_CCtx*'
2733 2942 void* dst, size_t dstCapacity,
2734 2943 const void* src, size_t srcSize)
2735 2944 {
2736 DEBUGLOG(5, "ZSTD_compressContinue (srcSize=%u)", (U32)srcSize);
2945 DEBUGLOG(5, "ZSTD_compressContinue (srcSize=%u)", (unsigned)srcSize);
2737 2946 return ZSTD_compressContinue_internal(cctx, dst, dstCapacity, src, srcSize, 1 /* frame mode */, 0 /* last chunk */);
2738 2947 }
2739 2948
@@ -2791,6 +3000,7 b' static size_t ZSTD_loadDictionaryContent'
2791 3000 case ZSTD_btlazy2: /* we want the dictionary table fully sorted */
2792 3001 case ZSTD_btopt:
2793 3002 case ZSTD_btultra:
3003 case ZSTD_btultra2:
2794 3004 if (srcSize >= HASH_READ_SIZE)
2795 3005 ZSTD_updateTree(ms, iend-HASH_READ_SIZE, iend);
2796 3006 break;
@@ -2861,7 +3071,9 b' static size_t ZSTD_loadZstdDictionary(ZS'
2861 3071 if (offcodeLog > OffFSELog) return ERROR(dictionary_corrupted);
2862 3072 /* Defer checking offcodeMaxValue because we need to know the size of the dictionary content */
2863 3073 /* fill all offset symbols to avoid garbage at end of table */
2864 CHECK_E( FSE_buildCTable_wksp(bs->entropy.fse.offcodeCTable, offcodeNCount, MaxOff, offcodeLog, workspace, HUF_WORKSPACE_SIZE),
3074 CHECK_E( FSE_buildCTable_wksp(bs->entropy.fse.offcodeCTable,
3075 offcodeNCount, MaxOff, offcodeLog,
3076 workspace, HUF_WORKSPACE_SIZE),
2865 3077 dictionary_corrupted);
2866 3078 dictPtr += offcodeHeaderSize;
2867 3079 }
@@ -2873,7 +3085,9 b' static size_t ZSTD_loadZstdDictionary(ZS'
2873 3085 if (matchlengthLog > MLFSELog) return ERROR(dictionary_corrupted);
2874 3086 /* Every match length code must have non-zero probability */
2875 3087 CHECK_F( ZSTD_checkDictNCount(matchlengthNCount, matchlengthMaxValue, MaxML));
2876 CHECK_E( FSE_buildCTable_wksp(bs->entropy.fse.matchlengthCTable, matchlengthNCount, matchlengthMaxValue, matchlengthLog, workspace, HUF_WORKSPACE_SIZE),
3088 CHECK_E( FSE_buildCTable_wksp(bs->entropy.fse.matchlengthCTable,
3089 matchlengthNCount, matchlengthMaxValue, matchlengthLog,
3090 workspace, HUF_WORKSPACE_SIZE),
2877 3091 dictionary_corrupted);
2878 3092 dictPtr += matchlengthHeaderSize;
2879 3093 }
@@ -2885,7 +3099,9 b' static size_t ZSTD_loadZstdDictionary(ZS'
2885 3099 if (litlengthLog > LLFSELog) return ERROR(dictionary_corrupted);
2886 3100 /* Every literal length code must have non-zero probability */
2887 3101 CHECK_F( ZSTD_checkDictNCount(litlengthNCount, litlengthMaxValue, MaxLL));
2888 CHECK_E( FSE_buildCTable_wksp(bs->entropy.fse.litlengthCTable, litlengthNCount, litlengthMaxValue, litlengthLog, workspace, HUF_WORKSPACE_SIZE),
3102 CHECK_E( FSE_buildCTable_wksp(bs->entropy.fse.litlengthCTable,
3103 litlengthNCount, litlengthMaxValue, litlengthLog,
3104 workspace, HUF_WORKSPACE_SIZE),
2889 3105 dictionary_corrupted);
2890 3106 dictPtr += litlengthHeaderSize;
2891 3107 }
@@ -3023,7 +3239,7 b' size_t ZSTD_compressBegin_usingDict(ZSTD'
3023 3239 ZSTD_parameters const params = ZSTD_getParams(compressionLevel, ZSTD_CONTENTSIZE_UNKNOWN, dictSize);
3024 3240 ZSTD_CCtx_params const cctxParams =
3025 3241 ZSTD_assignParamsToCCtxParams(cctx->requestedParams, params);
3026 DEBUGLOG(4, "ZSTD_compressBegin_usingDict (dictSize=%u)", (U32)dictSize);
3242 DEBUGLOG(4, "ZSTD_compressBegin_usingDict (dictSize=%u)", (unsigned)dictSize);
3027 3243 return ZSTD_compressBegin_internal(cctx, dict, dictSize, ZSTD_dct_auto, ZSTD_dtlm_fast, NULL,
3028 3244 cctxParams, ZSTD_CONTENTSIZE_UNKNOWN, ZSTDb_not_buffered);
3029 3245 }
@@ -3067,7 +3283,7 b' static size_t ZSTD_writeEpilogue(ZSTD_CC'
3067 3283 if (cctx->appliedParams.fParams.checksumFlag) {
3068 3284 U32 const checksum = (U32) XXH64_digest(&cctx->xxhState);
3069 3285 if (dstCapacity<4) return ERROR(dstSize_tooSmall);
3070 DEBUGLOG(4, "ZSTD_writeEpilogue: write checksum : %08X", checksum);
3286 DEBUGLOG(4, "ZSTD_writeEpilogue: write checksum : %08X", (unsigned)checksum);
3071 3287 MEM_writeLE32(op, checksum);
3072 3288 op += 4;
3073 3289 }
@@ -3093,7 +3309,7 b' size_t ZSTD_compressEnd (ZSTD_CCtx* cctx'
3093 3309 DEBUGLOG(4, "end of frame : controlling src size");
3094 3310 if (cctx->pledgedSrcSizePlusOne != cctx->consumedSrcSize+1) {
3095 3311 DEBUGLOG(4, "error : pledgedSrcSize = %u, while realSrcSize = %u",
3096 (U32)cctx->pledgedSrcSizePlusOne-1, (U32)cctx->consumedSrcSize);
3312 (unsigned)cctx->pledgedSrcSizePlusOne-1, (unsigned)cctx->consumedSrcSize);
3097 3313 return ERROR(srcSize_wrong);
3098 3314 } }
3099 3315 return cSize + endResult;
@@ -3139,7 +3355,7 b' size_t ZSTD_compress_advanced_internal('
3139 3355 const void* dict,size_t dictSize,
3140 3356 ZSTD_CCtx_params params)
3141 3357 {
3142 DEBUGLOG(4, "ZSTD_compress_advanced_internal (srcSize:%u)", (U32)srcSize);
3358 DEBUGLOG(4, "ZSTD_compress_advanced_internal (srcSize:%u)", (unsigned)srcSize);
3143 3359 CHECK_F( ZSTD_compressBegin_internal(cctx,
3144 3360 dict, dictSize, ZSTD_dct_auto, ZSTD_dtlm_fast, NULL,
3145 3361 params, srcSize, ZSTDb_not_buffered) );
@@ -3163,7 +3379,7 b' size_t ZSTD_compressCCtx(ZSTD_CCtx* cctx'
3163 3379 const void* src, size_t srcSize,
3164 3380 int compressionLevel)
3165 3381 {
3166 DEBUGLOG(4, "ZSTD_compressCCtx (srcSize=%u)", (U32)srcSize);
3382 DEBUGLOG(4, "ZSTD_compressCCtx (srcSize=%u)", (unsigned)srcSize);
3167 3383 assert(cctx != NULL);
3168 3384 return ZSTD_compress_usingDict(cctx, dst, dstCapacity, src, srcSize, NULL, 0, compressionLevel);
3169 3385 }
@@ -3189,7 +3405,7 b' size_t ZSTD_estimateCDictSize_advanced('
3189 3405 size_t dictSize, ZSTD_compressionParameters cParams,
3190 3406 ZSTD_dictLoadMethod_e dictLoadMethod)
3191 3407 {
3192 DEBUGLOG(5, "sizeof(ZSTD_CDict) : %u", (U32)sizeof(ZSTD_CDict));
3408 DEBUGLOG(5, "sizeof(ZSTD_CDict) : %u", (unsigned)sizeof(ZSTD_CDict));
3193 3409 return sizeof(ZSTD_CDict) + HUF_WORKSPACE_SIZE + ZSTD_sizeof_matchState(&cParams, /* forCCtx */ 0)
3194 3410 + (dictLoadMethod == ZSTD_dlm_byRef ? 0 : dictSize);
3195 3411 }
@@ -3203,7 +3419,7 b' size_t ZSTD_estimateCDictSize(size_t dic'
3203 3419 size_t ZSTD_sizeof_CDict(const ZSTD_CDict* cdict)
3204 3420 {
3205 3421 if (cdict==NULL) return 0; /* support sizeof on NULL */
3206 DEBUGLOG(5, "sizeof(*cdict) : %u", (U32)sizeof(*cdict));
3422 DEBUGLOG(5, "sizeof(*cdict) : %u", (unsigned)sizeof(*cdict));
3207 3423 return cdict->workspaceSize + (cdict->dictBuffer ? cdict->dictContentSize : 0) + sizeof(*cdict);
3208 3424 }
3209 3425
@@ -3214,7 +3430,7 b' static size_t ZSTD_initCDict_internal('
3214 3430 ZSTD_dictContentType_e dictContentType,
3215 3431 ZSTD_compressionParameters cParams)
3216 3432 {
3217 DEBUGLOG(3, "ZSTD_initCDict_internal (dictContentType:%u)", (U32)dictContentType);
3433 DEBUGLOG(3, "ZSTD_initCDict_internal (dictContentType:%u)", (unsigned)dictContentType);
3218 3434 assert(!ZSTD_checkCParams(cParams));
3219 3435 cdict->matchState.cParams = cParams;
3220 3436 if ((dictLoadMethod == ZSTD_dlm_byRef) || (!dictBuffer) || (!dictSize)) {
@@ -3264,7 +3480,7 b' ZSTD_CDict* ZSTD_createCDict_advanced(co'
3264 3480 ZSTD_dictContentType_e dictContentType,
3265 3481 ZSTD_compressionParameters cParams, ZSTD_customMem customMem)
3266 3482 {
3267 DEBUGLOG(3, "ZSTD_createCDict_advanced, mode %u", (U32)dictContentType);
3483 DEBUGLOG(3, "ZSTD_createCDict_advanced, mode %u", (unsigned)dictContentType);
3268 3484 if (!customMem.customAlloc ^ !customMem.customFree) return NULL;
3269 3485
3270 3486 { ZSTD_CDict* const cdict = (ZSTD_CDict*)ZSTD_malloc(sizeof(ZSTD_CDict), customMem);
@@ -3345,7 +3561,7 b' const ZSTD_CDict* ZSTD_initStaticCDict('
3345 3561 void* ptr;
3346 3562 if ((size_t)workspace & 7) return NULL; /* 8-aligned */
3347 3563 DEBUGLOG(4, "(workspaceSize < neededSize) : (%u < %u) => %u",
3348 (U32)workspaceSize, (U32)neededSize, (U32)(workspaceSize < neededSize));
3564 (unsigned)workspaceSize, (unsigned)neededSize, (unsigned)(workspaceSize < neededSize));
3349 3565 if (workspaceSize < neededSize) return NULL;
3350 3566
3351 3567 if (dictLoadMethod == ZSTD_dlm_byCopy) {
@@ -3505,7 +3721,7 b' static size_t ZSTD_resetCStream_internal'
3505 3721 size_t ZSTD_resetCStream(ZSTD_CStream* zcs, unsigned long long pledgedSrcSize)
3506 3722 {
3507 3723 ZSTD_CCtx_params params = zcs->requestedParams;
3508 DEBUGLOG(4, "ZSTD_resetCStream: pledgedSrcSize = %u", (U32)pledgedSrcSize);
3724 DEBUGLOG(4, "ZSTD_resetCStream: pledgedSrcSize = %u", (unsigned)pledgedSrcSize);
3509 3725 if (pledgedSrcSize==0) pledgedSrcSize = ZSTD_CONTENTSIZE_UNKNOWN;
3510 3726 params.fParams.contentSizeFlag = 1;
3511 3727 return ZSTD_resetCStream_internal(zcs, NULL, 0, ZSTD_dct_auto, zcs->cdict, params, pledgedSrcSize);
@@ -3525,7 +3741,7 b' size_t ZSTD_initCStream_internal(ZSTD_CS'
3525 3741 assert(!((dict) && (cdict))); /* either dict or cdict, not both */
3526 3742
3527 3743 if (dict && dictSize >= 8) {
3528 DEBUGLOG(4, "loading dictionary of size %u", (U32)dictSize);
3744 DEBUGLOG(4, "loading dictionary of size %u", (unsigned)dictSize);
3529 3745 if (zcs->staticSize) { /* static CCtx : never uses malloc */
3530 3746 /* incompatible with internal cdict creation */
3531 3747 return ERROR(memory_allocation);
@@ -3584,7 +3800,7 b' size_t ZSTD_initCStream_advanced(ZSTD_CS'
3584 3800 ZSTD_parameters params, unsigned long long pledgedSrcSize)
3585 3801 {
3586 3802 DEBUGLOG(4, "ZSTD_initCStream_advanced: pledgedSrcSize=%u, flag=%u",
3587 (U32)pledgedSrcSize, params.fParams.contentSizeFlag);
3803 (unsigned)pledgedSrcSize, params.fParams.contentSizeFlag);
3588 3804 CHECK_F( ZSTD_checkCParams(params.cParams) );
3589 3805 if ((pledgedSrcSize==0) && (params.fParams.contentSizeFlag==0)) pledgedSrcSize = ZSTD_CONTENTSIZE_UNKNOWN; /* for compatibility with older programs relying on this behavior. Users should now specify ZSTD_CONTENTSIZE_UNKNOWN. This line will be removed in the future. */
3590 3806 zcs->requestedParams = ZSTD_assignParamsToCCtxParams(zcs->requestedParams, params);
@@ -3612,8 +3828,15 b' size_t ZSTD_initCStream(ZSTD_CStream* zc'
3612 3828
3613 3829 /*====== Compression ======*/
3614 3830
3615 MEM_STATIC size_t ZSTD_limitCopy(void* dst, size_t dstCapacity,
3616 const void* src, size_t srcSize)
3831 static size_t ZSTD_nextInputSizeHint(const ZSTD_CCtx* cctx)
3832 {
3833 size_t hintInSize = cctx->inBuffTarget - cctx->inBuffPos;
3834 if (hintInSize==0) hintInSize = cctx->blockSize;
3835 return hintInSize;
3836 }
3837
3838 static size_t ZSTD_limitCopy(void* dst, size_t dstCapacity,
3839 const void* src, size_t srcSize)
3617 3840 {
3618 3841 size_t const length = MIN(dstCapacity, srcSize);
3619 3842 if (length) memcpy(dst, src, length);
@@ -3621,7 +3844,7 b' MEM_STATIC size_t ZSTD_limitCopy(void* d'
3621 3844 }
3622 3845
3623 3846 /** ZSTD_compressStream_generic():
3624 * internal function for all *compressStream*() variants and *compress_generic()
3847 * internal function for all *compressStream*() variants
3625 3848 * non-static, because can be called from zstdmt_compress.c
3626 3849 * @return : hint size for next input */
3627 3850 size_t ZSTD_compressStream_generic(ZSTD_CStream* zcs,
@@ -3638,7 +3861,7 b' size_t ZSTD_compressStream_generic(ZSTD_'
3638 3861 U32 someMoreWork = 1;
3639 3862
3640 3863 /* check expectations */
3641 DEBUGLOG(5, "ZSTD_compressStream_generic, flush=%u", (U32)flushMode);
3864 DEBUGLOG(5, "ZSTD_compressStream_generic, flush=%u", (unsigned)flushMode);
3642 3865 assert(zcs->inBuff != NULL);
3643 3866 assert(zcs->inBuffSize > 0);
3644 3867 assert(zcs->outBuff != NULL);
@@ -3660,12 +3883,12 b' size_t ZSTD_compressStream_generic(ZSTD_'
3660 3883 /* shortcut to compression pass directly into output buffer */
3661 3884 size_t const cSize = ZSTD_compressEnd(zcs,
3662 3885 op, oend-op, ip, iend-ip);
3663 DEBUGLOG(4, "ZSTD_compressEnd : %u", (U32)cSize);
3886 DEBUGLOG(4, "ZSTD_compressEnd : cSize=%u", (unsigned)cSize);
3664 3887 if (ZSTD_isError(cSize)) return cSize;
3665 3888 ip = iend;
3666 3889 op += cSize;
3667 3890 zcs->frameEnded = 1;
3668 ZSTD_CCtx_reset(zcs);
3891 ZSTD_CCtx_reset(zcs, ZSTD_reset_session_only);
3669 3892 someMoreWork = 0; break;
3670 3893 }
3671 3894 /* complete loading into inBuffer */
@@ -3709,7 +3932,7 b' size_t ZSTD_compressStream_generic(ZSTD_'
3709 3932 if (zcs->inBuffTarget > zcs->inBuffSize)
3710 3933 zcs->inBuffPos = 0, zcs->inBuffTarget = zcs->blockSize;
3711 3934 DEBUGLOG(5, "inBuffTarget:%u / inBuffSize:%u",
3712 (U32)zcs->inBuffTarget, (U32)zcs->inBuffSize);
3935 (unsigned)zcs->inBuffTarget, (unsigned)zcs->inBuffSize);
3713 3936 if (!lastBlock)
3714 3937 assert(zcs->inBuffTarget <= zcs->inBuffSize);
3715 3938 zcs->inToCompress = zcs->inBuffPos;
@@ -3718,7 +3941,7 b' size_t ZSTD_compressStream_generic(ZSTD_'
3718 3941 if (zcs->frameEnded) {
3719 3942 DEBUGLOG(5, "Frame completed directly in outBuffer");
3720 3943 someMoreWork = 0;
3721 ZSTD_CCtx_reset(zcs);
3944 ZSTD_CCtx_reset(zcs, ZSTD_reset_session_only);
3722 3945 }
3723 3946 break;
3724 3947 }
@@ -3733,7 +3956,7 b' size_t ZSTD_compressStream_generic(ZSTD_'
3733 3956 size_t const flushed = ZSTD_limitCopy(op, oend-op,
3734 3957 zcs->outBuff + zcs->outBuffFlushedSize, toFlush);
3735 3958 DEBUGLOG(5, "toFlush: %u into %u ==> flushed: %u",
3736 (U32)toFlush, (U32)(oend-op), (U32)flushed);
3959 (unsigned)toFlush, (unsigned)(oend-op), (unsigned)flushed);
3737 3960 op += flushed;
3738 3961 zcs->outBuffFlushedSize += flushed;
3739 3962 if (toFlush!=flushed) {
@@ -3746,7 +3969,7 b' size_t ZSTD_compressStream_generic(ZSTD_'
3746 3969 if (zcs->frameEnded) {
3747 3970 DEBUGLOG(5, "Frame completed on flush");
3748 3971 someMoreWork = 0;
3749 ZSTD_CCtx_reset(zcs);
3972 ZSTD_CCtx_reset(zcs, ZSTD_reset_session_only);
3750 3973 break;
3751 3974 }
3752 3975 zcs->streamStage = zcss_load;
@@ -3761,28 +3984,34 b' size_t ZSTD_compressStream_generic(ZSTD_'
3761 3984 input->pos = ip - istart;
3762 3985 output->pos = op - ostart;
3763 3986 if (zcs->frameEnded) return 0;
3764 { size_t hintInSize = zcs->inBuffTarget - zcs->inBuffPos;
3765 if (hintInSize==0) hintInSize = zcs->blockSize;
3766 return hintInSize;
3987 return ZSTD_nextInputSizeHint(zcs);
3988 }
3989
3990 static size_t ZSTD_nextInputSizeHint_MTorST(const ZSTD_CCtx* cctx)
3991 {
3992 #ifdef ZSTD_MULTITHREAD
3993 if (cctx->appliedParams.nbWorkers >= 1) {
3994 assert(cctx->mtctx != NULL);
3995 return ZSTDMT_nextInputSizeHint(cctx->mtctx);
3767 3996 }
3997 #endif
3998 return ZSTD_nextInputSizeHint(cctx);
3999
3768 4000 }
3769 4001
3770 4002 size_t ZSTD_compressStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output, ZSTD_inBuffer* input)
3771 4003 {
3772 /* check conditions */
3773 if (output->pos > output->size) return ERROR(GENERIC);
3774 if (input->pos > input->size) return ERROR(GENERIC);
3775
3776 return ZSTD_compressStream_generic(zcs, output, input, ZSTD_e_continue);
4004 CHECK_F( ZSTD_compressStream2(zcs, output, input, ZSTD_e_continue) );
4005 return ZSTD_nextInputSizeHint_MTorST(zcs);
3777 4006 }
3778 4007
3779 4008
3780 size_t ZSTD_compress_generic (ZSTD_CCtx* cctx,
3781 ZSTD_outBuffer* output,
3782 ZSTD_inBuffer* input,
3783 ZSTD_EndDirective endOp)
4009 size_t ZSTD_compressStream2( ZSTD_CCtx* cctx,
4010 ZSTD_outBuffer* output,
4011 ZSTD_inBuffer* input,
4012 ZSTD_EndDirective endOp)
3784 4013 {
3785 DEBUGLOG(5, "ZSTD_compress_generic, endOp=%u ", (U32)endOp);
4014 DEBUGLOG(5, "ZSTD_compressStream2, endOp=%u ", (unsigned)endOp);
3786 4015 /* check conditions */
3787 4016 if (output->pos > output->size) return ERROR(GENERIC);
3788 4017 if (input->pos > input->size) return ERROR(GENERIC);
@@ -3792,9 +4021,9 b' size_t ZSTD_compress_generic (ZSTD_CCtx*'
3792 4021 if (cctx->streamStage == zcss_init) {
3793 4022 ZSTD_CCtx_params params = cctx->requestedParams;
3794 4023 ZSTD_prefixDict const prefixDict = cctx->prefixDict;
3795 memset(&cctx->prefixDict, 0, sizeof(cctx->prefixDict)); /* single usage */
3796 assert(prefixDict.dict==NULL || cctx->cdict==NULL); /* only one can be set */
3797 DEBUGLOG(4, "ZSTD_compress_generic : transparent init stage");
4024 memset(&cctx->prefixDict, 0, sizeof(cctx->prefixDict)); /* single usage */
4025 assert(prefixDict.dict==NULL || cctx->cdict==NULL); /* only one can be set */
4026 DEBUGLOG(4, "ZSTD_compressStream2 : transparent init stage");
3798 4027 if (endOp == ZSTD_e_end) cctx->pledgedSrcSizePlusOne = input->size + 1; /* auto-fix pledgedSrcSize */
3799 4028 params.cParams = ZSTD_getCParamsFromCCtxParams(
3800 4029 &cctx->requestedParams, cctx->pledgedSrcSizePlusOne-1, 0 /*dictSize*/);
@@ -3807,7 +4036,7 b' size_t ZSTD_compress_generic (ZSTD_CCtx*'
3807 4036 if (params.nbWorkers > 0) {
3808 4037 /* mt context creation */
3809 4038 if (cctx->mtctx == NULL) {
3810 DEBUGLOG(4, "ZSTD_compress_generic: creating new mtctx for nbWorkers=%u",
4039 DEBUGLOG(4, "ZSTD_compressStream2: creating new mtctx for nbWorkers=%u",
3811 4040 params.nbWorkers);
3812 4041 cctx->mtctx = ZSTDMT_createCCtx_advanced(params.nbWorkers, cctx->customMem);
3813 4042 if (cctx->mtctx == NULL) return ERROR(memory_allocation);
@@ -3829,6 +4058,7 b' size_t ZSTD_compress_generic (ZSTD_CCtx*'
3829 4058 assert(cctx->streamStage == zcss_load);
3830 4059 assert(cctx->appliedParams.nbWorkers == 0);
3831 4060 } }
4061 /* end of transparent initialization stage */
3832 4062
3833 4063 /* compression stage */
3834 4064 #ifdef ZSTD_MULTITHREAD
@@ -3840,18 +4070,18 b' size_t ZSTD_compress_generic (ZSTD_CCtx*'
3840 4070 { size_t const flushMin = ZSTDMT_compressStream_generic(cctx->mtctx, output, input, endOp);
3841 4071 if ( ZSTD_isError(flushMin)
3842 4072 || (endOp == ZSTD_e_end && flushMin == 0) ) { /* compression completed */
3843 ZSTD_CCtx_reset(cctx);
4073 ZSTD_CCtx_reset(cctx, ZSTD_reset_session_only);
3844 4074 }
3845 DEBUGLOG(5, "completed ZSTD_compress_generic delegating to ZSTDMT_compressStream_generic");
4075 DEBUGLOG(5, "completed ZSTD_compressStream2 delegating to ZSTDMT_compressStream_generic");
3846 4076 return flushMin;
3847 4077 } }
3848 4078 #endif
3849 4079 CHECK_F( ZSTD_compressStream_generic(cctx, output, input, endOp) );
3850 DEBUGLOG(5, "completed ZSTD_compress_generic");
4080 DEBUGLOG(5, "completed ZSTD_compressStream2");
3851 4081 return cctx->outBuffContentSize - cctx->outBuffFlushedSize; /* remaining to flush */
3852 4082 }
3853 4083
3854 size_t ZSTD_compress_generic_simpleArgs (
4084 size_t ZSTD_compressStream2_simpleArgs (
3855 4085 ZSTD_CCtx* cctx,
3856 4086 void* dst, size_t dstCapacity, size_t* dstPos,
3857 4087 const void* src, size_t srcSize, size_t* srcPos,
@@ -3859,13 +4089,33 b' size_t ZSTD_compress_generic_simpleArgs '
3859 4089 {
3860 4090 ZSTD_outBuffer output = { dst, dstCapacity, *dstPos };
3861 4091 ZSTD_inBuffer input = { src, srcSize, *srcPos };
3862 /* ZSTD_compress_generic() will check validity of dstPos and srcPos */
3863 size_t const cErr = ZSTD_compress_generic(cctx, &output, &input, endOp);
4092 /* ZSTD_compressStream2() will check validity of dstPos and srcPos */
4093 size_t const cErr = ZSTD_compressStream2(cctx, &output, &input, endOp);
3864 4094 *dstPos = output.pos;
3865 4095 *srcPos = input.pos;
3866 4096 return cErr;
3867 4097 }
3868 4098
4099 size_t ZSTD_compress2(ZSTD_CCtx* cctx,
4100 void* dst, size_t dstCapacity,
4101 const void* src, size_t srcSize)
4102 {
4103 ZSTD_CCtx_reset(cctx, ZSTD_reset_session_only);
4104 { size_t oPos = 0;
4105 size_t iPos = 0;
4106 size_t const result = ZSTD_compressStream2_simpleArgs(cctx,
4107 dst, dstCapacity, &oPos,
4108 src, srcSize, &iPos,
4109 ZSTD_e_end);
4110 if (ZSTD_isError(result)) return result;
4111 if (result != 0) { /* compression not completed, due to lack of output space */
4112 assert(oPos == dstCapacity);
4113 return ERROR(dstSize_tooSmall);
4114 }
4115 assert(iPos == srcSize); /* all input is expected consumed */
4116 return oPos;
4117 }
4118 }
3869 4119
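ZSTD_compress2 is the new one-shot companion to ZSTD_compressStream2: it resets the session (note the two-argument ZSTD_CCtx_reset taking a reset directive), runs a single ZSTD_e_end pass, and converts an incomplete flush into dstSize_tooSmall. A usage sketch against the public API; compress_once is a hypothetical wrapper, and the ZSTD_c_* parameter names are assumed from the generation this snapshot uses:

    #include "zstd.h"

    /* One-shot compression honoring parameters set via the advanced API.
     * Sizing dst with ZSTD_compressBound(srcSize) guarantees completion. */
    static size_t compress_once(void* dst, size_t dstCapacity,
                                const void* src, size_t srcSize)
    {
        ZSTD_CCtx* const cctx = ZSTD_createCCtx();
        size_t result;
        if (cctx == NULL) return (size_t)-1;   /* reads as error via ZSTD_isError() */
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, 19);
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_checksumFlag, 1);
        result = ZSTD_compress2(cctx, dst, dstCapacity, src, srcSize);
        ZSTD_freeCCtx(cctx);
        return result;   /* compressed size, or an error code per ZSTD_isError() */
    }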
3870 4120 /*====== Finalize ======*/
3871 4121
@@ -3874,21 +4124,21 b' size_t ZSTD_compress_generic_simpleArgs '
3874 4124 size_t ZSTD_flushStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output)
3875 4125 {
3876 4126 ZSTD_inBuffer input = { NULL, 0, 0 };
3877 if (output->pos > output->size) return ERROR(GENERIC);
3878 CHECK_F( ZSTD_compressStream_generic(zcs, output, &input, ZSTD_e_flush) );
3879 return zcs->outBuffContentSize - zcs->outBuffFlushedSize; /* remaining to flush */
4127 return ZSTD_compressStream2(zcs, output, &input, ZSTD_e_flush);
3880 4128 }
3881 4129
3882 4130
3883 4131 size_t ZSTD_endStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output)
3884 4132 {
3885 4133 ZSTD_inBuffer input = { NULL, 0, 0 };
3886 if (output->pos > output->size) return ERROR(GENERIC);
3887 CHECK_F( ZSTD_compressStream_generic(zcs, output, &input, ZSTD_e_end) );
4134 size_t const remainingToFlush = ZSTD_compressStream2(zcs, output, &input, ZSTD_e_end);
4135 CHECK_F( remainingToFlush );
4136 if (zcs->appliedParams.nbWorkers > 0) return remainingToFlush; /* minimal estimation */
4137 /* single thread mode : attempt to calculate remaining to flush more precisely */
3888 4138 { size_t const lastBlockSize = zcs->frameEnded ? 0 : ZSTD_BLOCKHEADERSIZE;
3889 4139 size_t const checksumSize = zcs->frameEnded ? 0 : zcs->appliedParams.fParams.checksumFlag * 4;
3890 size_t const toFlush = zcs->outBuffContentSize - zcs->outBuffFlushedSize + lastBlockSize + checksumSize;
3891 DEBUGLOG(4, "ZSTD_endStream : remaining to flush : %u", (U32)toFlush);
4140 size_t const toFlush = remainingToFlush + lastBlockSize + checksumSize;
4141 DEBUGLOG(4, "ZSTD_endStream : remaining to flush : %u", (unsigned)toFlush);
3892 4142 return toFlush;
3893 4143 }
3894 4144 }
@@ -3905,27 +4155,27 b' static const ZSTD_compressionParameters '
3905 4155 /* W, C, H, S, L, TL, strat */
3906 4156 { 19, 12, 13, 1, 6, 1, ZSTD_fast }, /* base for negative levels */
3907 4157 { 19, 13, 14, 1, 7, 0, ZSTD_fast }, /* level 1 */
3908 { 19, 15, 16, 1, 6, 0, ZSTD_fast }, /* level 2 */
3909 { 20, 16, 17, 1, 5, 1, ZSTD_dfast }, /* level 3 */
3910 { 20, 18, 18, 1, 5, 1, ZSTD_dfast }, /* level 4 */
3911 { 20, 18, 18, 2, 5, 2, ZSTD_greedy }, /* level 5 */
3912 { 21, 18, 19, 2, 5, 4, ZSTD_lazy }, /* level 6 */
3913 { 21, 18, 19, 3, 5, 8, ZSTD_lazy2 }, /* level 7 */
4158 { 20, 15, 16, 1, 6, 0, ZSTD_fast }, /* level 2 */
4159 { 21, 16, 17, 1, 5, 1, ZSTD_dfast }, /* level 3 */
4160 { 21, 18, 18, 1, 5, 1, ZSTD_dfast }, /* level 4 */
4161 { 21, 18, 19, 2, 5, 2, ZSTD_greedy }, /* level 5 */
4162 { 21, 19, 19, 3, 5, 4, ZSTD_greedy }, /* level 6 */
4163 { 21, 19, 19, 3, 5, 8, ZSTD_lazy }, /* level 7 */
3914 4164 { 21, 19, 19, 3, 5, 16, ZSTD_lazy2 }, /* level 8 */
3915 4165 { 21, 19, 20, 4, 5, 16, ZSTD_lazy2 }, /* level 9 */
3916 { 21, 20, 21, 4, 5, 16, ZSTD_lazy2 }, /* level 10 */
3917 { 21, 21, 22, 4, 5, 16, ZSTD_lazy2 }, /* level 11 */
3918 { 22, 20, 22, 5, 5, 16, ZSTD_lazy2 }, /* level 12 */
3919 { 22, 21, 22, 4, 5, 32, ZSTD_btlazy2 }, /* level 13 */
3920 { 22, 21, 22, 5, 5, 32, ZSTD_btlazy2 }, /* level 14 */
3921 { 22, 22, 22, 6, 5, 32, ZSTD_btlazy2 }, /* level 15 */
3922 { 22, 21, 22, 4, 5, 48, ZSTD_btopt }, /* level 16 */
3923 { 23, 22, 22, 4, 4, 64, ZSTD_btopt }, /* level 17 */
3924 { 23, 23, 22, 6, 3,256, ZSTD_btopt }, /* level 18 */
3925 { 23, 24, 22, 7, 3,256, ZSTD_btultra }, /* level 19 */
3926 { 25, 25, 23, 7, 3,256, ZSTD_btultra }, /* level 20 */
3927 { 26, 26, 24, 7, 3,512, ZSTD_btultra }, /* level 21 */
3928 { 27, 27, 25, 9, 3,999, ZSTD_btultra }, /* level 22 */
4166 { 22, 20, 21, 4, 5, 16, ZSTD_lazy2 }, /* level 10 */
4167 { 22, 21, 22, 4, 5, 16, ZSTD_lazy2 }, /* level 11 */
4168 { 22, 21, 22, 5, 5, 16, ZSTD_lazy2 }, /* level 12 */
4169 { 22, 21, 22, 5, 5, 32, ZSTD_btlazy2 }, /* level 13 */
4170 { 22, 22, 23, 5, 5, 32, ZSTD_btlazy2 }, /* level 14 */
4171 { 22, 23, 23, 6, 5, 32, ZSTD_btlazy2 }, /* level 15 */
4172 { 22, 22, 22, 5, 5, 48, ZSTD_btopt }, /* level 16 */
4173 { 23, 23, 22, 5, 4, 64, ZSTD_btopt }, /* level 17 */
4174 { 23, 23, 22, 6, 3, 64, ZSTD_btultra }, /* level 18 */
4175 { 23, 24, 22, 7, 3,256, ZSTD_btultra2}, /* level 19 */
4176 { 25, 25, 23, 7, 3,256, ZSTD_btultra2}, /* level 20 */
4177 { 26, 26, 24, 7, 3,512, ZSTD_btultra2}, /* level 21 */
4178 { 27, 27, 25, 9, 3,999, ZSTD_btultra2}, /* level 22 */
3929 4179 },
3930 4180 { /* for srcSize <= 256 KB */
3931 4181 /* W, C, H, S, L, T, strat */
@@ -3940,18 +4190,18 b' static const ZSTD_compressionParameters '
3940 4190 { 18, 18, 19, 4, 4, 8, ZSTD_lazy2 }, /* level 8 */
3941 4191 { 18, 18, 19, 5, 4, 8, ZSTD_lazy2 }, /* level 9 */
3942 4192 { 18, 18, 19, 6, 4, 8, ZSTD_lazy2 }, /* level 10 */
3943 { 18, 18, 19, 5, 4, 16, ZSTD_btlazy2 }, /* level 11.*/
3944 { 18, 19, 19, 6, 4, 16, ZSTD_btlazy2 }, /* level 12.*/
3945 { 18, 19, 19, 8, 4, 16, ZSTD_btlazy2 }, /* level 13 */
3946 { 18, 18, 19, 4, 4, 24, ZSTD_btopt }, /* level 14.*/
3947 { 18, 18, 19, 4, 3, 24, ZSTD_btopt }, /* level 15.*/
3948 { 18, 19, 19, 6, 3, 64, ZSTD_btopt }, /* level 16.*/
3949 { 18, 19, 19, 8, 3,128, ZSTD_btopt }, /* level 17.*/
3950 { 18, 19, 19, 10, 3,256, ZSTD_btopt }, /* level 18.*/
3951 { 18, 19, 19, 10, 3,256, ZSTD_btultra }, /* level 19.*/
3952 { 18, 19, 19, 11, 3,512, ZSTD_btultra }, /* level 20.*/
3953 { 18, 19, 19, 12, 3,512, ZSTD_btultra }, /* level 21.*/
3954 { 18, 19, 19, 13, 3,999, ZSTD_btultra }, /* level 22.*/
4193 { 18, 18, 19, 5, 4, 12, ZSTD_btlazy2 }, /* level 11.*/
4194 { 18, 19, 19, 7, 4, 12, ZSTD_btlazy2 }, /* level 12.*/
4195 { 18, 18, 19, 4, 4, 16, ZSTD_btopt }, /* level 13 */
4196 { 18, 18, 19, 4, 3, 32, ZSTD_btopt }, /* level 14.*/
4197 { 18, 18, 19, 6, 3,128, ZSTD_btopt }, /* level 15.*/
4198 { 18, 19, 19, 6, 3,128, ZSTD_btultra }, /* level 16.*/
4199 { 18, 19, 19, 8, 3,256, ZSTD_btultra }, /* level 17.*/
4200 { 18, 19, 19, 6, 3,128, ZSTD_btultra2}, /* level 18.*/
4201 { 18, 19, 19, 8, 3,256, ZSTD_btultra2}, /* level 19.*/
4202 { 18, 19, 19, 10, 3,512, ZSTD_btultra2}, /* level 20.*/
4203 { 18, 19, 19, 12, 3,512, ZSTD_btultra2}, /* level 21.*/
4204 { 18, 19, 19, 13, 3,999, ZSTD_btultra2}, /* level 22.*/
3955 4205 },
3956 4206 { /* for srcSize <= 128 KB */
3957 4207 /* W, C, H, S, L, T, strat */
@@ -3966,26 +4216,26 b' static const ZSTD_compressionParameters '
3966 4216 { 17, 17, 17, 4, 4, 8, ZSTD_lazy2 }, /* level 8 */
3967 4217 { 17, 17, 17, 5, 4, 8, ZSTD_lazy2 }, /* level 9 */
3968 4218 { 17, 17, 17, 6, 4, 8, ZSTD_lazy2 }, /* level 10 */
3969 { 17, 17, 17, 7, 4, 8, ZSTD_lazy2 }, /* level 11 */
3970 { 17, 18, 17, 6, 4, 16, ZSTD_btlazy2 }, /* level 12 */
3971 { 17, 18, 17, 8, 4, 16, ZSTD_btlazy2 }, /* level 13.*/
3972 { 17, 18, 17, 4, 4, 32, ZSTD_btopt }, /* level 14.*/
3973 { 17, 18, 17, 6, 3, 64, ZSTD_btopt }, /* level 15.*/
3974 { 17, 18, 17, 7, 3,128, ZSTD_btopt }, /* level 16.*/
3975 { 17, 18, 17, 7, 3,256, ZSTD_btopt }, /* level 17.*/
3976 { 17, 18, 17, 8, 3,256, ZSTD_btopt }, /* level 18.*/
3977 { 17, 18, 17, 8, 3,256, ZSTD_btultra }, /* level 19.*/
3978 { 17, 18, 17, 9, 3,256, ZSTD_btultra }, /* level 20.*/
3979 { 17, 18, 17, 10, 3,256, ZSTD_btultra }, /* level 21.*/
3980 { 17, 18, 17, 11, 3,512, ZSTD_btultra }, /* level 22.*/
4219 { 17, 17, 17, 5, 4, 8, ZSTD_btlazy2 }, /* level 11 */
4220 { 17, 18, 17, 7, 4, 12, ZSTD_btlazy2 }, /* level 12 */
4221 { 17, 18, 17, 3, 4, 12, ZSTD_btopt }, /* level 13.*/
4222 { 17, 18, 17, 4, 3, 32, ZSTD_btopt }, /* level 14.*/
4223 { 17, 18, 17, 6, 3,256, ZSTD_btopt }, /* level 15.*/
4224 { 17, 18, 17, 6, 3,128, ZSTD_btultra }, /* level 16.*/
4225 { 17, 18, 17, 8, 3,256, ZSTD_btultra }, /* level 17.*/
4226 { 17, 18, 17, 10, 3,512, ZSTD_btultra }, /* level 18.*/
4227 { 17, 18, 17, 5, 3,256, ZSTD_btultra2}, /* level 19.*/
4228 { 17, 18, 17, 7, 3,512, ZSTD_btultra2}, /* level 20.*/
4229 { 17, 18, 17, 9, 3,512, ZSTD_btultra2}, /* level 21.*/
4230 { 17, 18, 17, 11, 3,999, ZSTD_btultra2}, /* level 22.*/
3981 4231 },
3982 4232 { /* for srcSize <= 16 KB */
3983 4233 /* W, C, H, S, L, T, strat */
3984 4234 { 14, 12, 13, 1, 5, 1, ZSTD_fast }, /* base for negative levels */
3985 4235 { 14, 14, 15, 1, 5, 0, ZSTD_fast }, /* level 1 */
3986 4236 { 14, 14, 15, 1, 4, 0, ZSTD_fast }, /* level 2 */
3987 { 14, 14, 14, 2, 4, 1, ZSTD_dfast }, /* level 3.*/
3988 { 14, 14, 14, 4, 4, 2, ZSTD_greedy }, /* level 4.*/
4237 { 14, 14, 15, 2, 4, 1, ZSTD_dfast }, /* level 3 */
4238 { 14, 14, 14, 4, 4, 2, ZSTD_greedy }, /* level 4 */
3989 4239 { 14, 14, 14, 3, 4, 4, ZSTD_lazy }, /* level 5.*/
3990 4240 { 14, 14, 14, 4, 4, 8, ZSTD_lazy2 }, /* level 6 */
3991 4241 { 14, 14, 14, 6, 4, 8, ZSTD_lazy2 }, /* level 7 */
@@ -3993,17 +4243,17 b' static const ZSTD_compressionParameters '
3993 4243 { 14, 15, 14, 5, 4, 8, ZSTD_btlazy2 }, /* level 9.*/
3994 4244 { 14, 15, 14, 9, 4, 8, ZSTD_btlazy2 }, /* level 10.*/
3995 4245 { 14, 15, 14, 3, 4, 12, ZSTD_btopt }, /* level 11.*/
3996 { 14, 15, 14, 6, 3, 16, ZSTD_btopt }, /* level 12.*/
3997 { 14, 15, 14, 6, 3, 24, ZSTD_btopt }, /* level 13.*/
3998 { 14, 15, 15, 6, 3, 48, ZSTD_btopt }, /* level 14.*/
3999 { 14, 15, 15, 6, 3, 64, ZSTD_btopt }, /* level 15.*/
4000 { 14, 15, 15, 6, 3, 96, ZSTD_btopt }, /* level 16.*/
4001 { 14, 15, 15, 6, 3,128, ZSTD_btopt }, /* level 17.*/
4002 { 14, 15, 15, 8, 3,256, ZSTD_btopt }, /* level 18.*/
4003 { 14, 15, 15, 6, 3,256, ZSTD_btultra }, /* level 19.*/
4004 { 14, 15, 15, 8, 3,256, ZSTD_btultra }, /* level 20.*/
4005 { 14, 15, 15, 9, 3,256, ZSTD_btultra }, /* level 21.*/
4006 { 14, 15, 15, 10, 3,512, ZSTD_btultra }, /* level 22.*/
4246 { 14, 15, 14, 4, 3, 24, ZSTD_btopt }, /* level 12.*/
4247 { 14, 15, 14, 5, 3, 32, ZSTD_btultra }, /* level 13.*/
4248 { 14, 15, 15, 6, 3, 64, ZSTD_btultra }, /* level 14.*/
4249 { 14, 15, 15, 7, 3,256, ZSTD_btultra }, /* level 15.*/
4250 { 14, 15, 15, 5, 3, 48, ZSTD_btultra2}, /* level 16.*/
4251 { 14, 15, 15, 6, 3,128, ZSTD_btultra2}, /* level 17.*/
4252 { 14, 15, 15, 7, 3,256, ZSTD_btultra2}, /* level 18.*/
4253 { 14, 15, 15, 8, 3,256, ZSTD_btultra2}, /* level 19.*/
4254 { 14, 15, 15, 8, 3,512, ZSTD_btultra2}, /* level 20.*/
4255 { 14, 15, 15, 9, 3,512, ZSTD_btultra2}, /* level 21.*/
4256 { 14, 15, 15, 10, 3,999, ZSTD_btultra2}, /* level 22.*/
4007 4257 },
4008 4258 };
4009 4259
@@ -4022,8 +4272,8 b' ZSTD_compressionParameters ZSTD_getCPara'
4022 4272 if (compressionLevel > ZSTD_MAX_CLEVEL) row = ZSTD_MAX_CLEVEL;
4023 4273 { ZSTD_compressionParameters cp = ZSTD_defaultCParameters[tableID][row];
4024 4274 if (compressionLevel < 0) cp.targetLength = (unsigned)(-compressionLevel); /* acceleration factor */
4025 return ZSTD_adjustCParams_internal(cp, srcSizeHint, dictSize); }
4026
4275 return ZSTD_adjustCParams_internal(cp, srcSizeHint, dictSize);
4276 }
4027 4277 }
4028 4278
4029 4279 /*! ZSTD_getParams() :
@@ -48,12 +48,6 b' extern "C" {'
48 48 typedef enum { ZSTDcs_created=0, ZSTDcs_init, ZSTDcs_ongoing, ZSTDcs_ending } ZSTD_compressionStage_e;
49 49 typedef enum { zcss_init=0, zcss_load, zcss_flush } ZSTD_cStreamStage;
50 50
51 typedef enum {
52 ZSTD_dictDefaultAttach = 0,
53 ZSTD_dictForceAttach = 1,
54 ZSTD_dictForceCopy = -1,
55 } ZSTD_dictAttachPref_e;
56
57 51 typedef struct ZSTD_prefixDict_s {
58 52 const void* dict;
59 53 size_t dictSize;
@@ -96,10 +90,10 b' typedef enum { zop_dynamic=0, zop_predef'
96 90
97 91 typedef struct {
98 92 /* All tables are allocated inside cctx->workspace by ZSTD_resetCCtx_internal() */
99 U32* litFreq; /* table of literals statistics, of size 256 */
100 U32* litLengthFreq; /* table of litLength statistics, of size (MaxLL+1) */
101 U32* matchLengthFreq; /* table of matchLength statistics, of size (MaxML+1) */
102 U32* offCodeFreq; /* table of offCode statistics, of size (MaxOff+1) */
93 unsigned* litFreq; /* table of literals statistics, of size 256 */
94 unsigned* litLengthFreq; /* table of litLength statistics, of size (MaxLL+1) */
95 unsigned* matchLengthFreq; /* table of matchLength statistics, of size (MaxML+1) */
96 unsigned* offCodeFreq; /* table of offCode statistics, of size (MaxOff+1) */
103 97 ZSTD_match_t* matchTable; /* list of found matches, of size ZSTD_OPT_NUM+1 */
104 98 ZSTD_optimal_t* priceTable; /* All positions tracked by optimal parser, of size ZSTD_OPT_NUM+1 */
105 99
@@ -139,7 +133,7 b' struct ZSTD_matchState_t {'
139 133 U32* hashTable3;
140 134 U32* chainTable;
141 135 optState_t opt; /* optimal parser state */
142 const ZSTD_matchState_t *dictMatchState;
136 const ZSTD_matchState_t * dictMatchState;
143 137 ZSTD_compressionParameters cParams;
144 138 };
145 139
@@ -167,7 +161,7 b' typedef struct {'
167 161 U32 hashLog; /* Log size of hashTable */
168 162 U32 bucketSizeLog; /* Log bucket size for collision resolution, at most 8 */
169 163 U32 minMatchLength; /* Minimum match length */
170 U32 hashEveryLog; /* Log number of entries to skip */
164 U32 hashRateLog; /* Log number of entries to skip */
171 165 U32 windowLog; /* Window log for the LDM */
172 166 } ldmParams_t;
173 167
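hashEveryLog becomes hashRateLog: the field is the log of how aggressively LDM subsamples insert positions, so "rate" describes it better than "every". Through the advanced API the knob looks roughly like this — a sketch; enable_ldm_sketch is hypothetical, and the ZSTD_c_* names are assumed from the parameter generation this snapshot uses:

    #define ZSTD_STATIC_LINKING_ONLY
    #include "zstd.h"

    static void enable_ldm_sketch(ZSTD_CCtx* cctx)
    {
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_enableLongDistanceMatching, 1);
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_windowLog, 27);
        /* larger hashRateLog => fewer table insertions per byte scanned */
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_ldmHashRateLog, 7);
    }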
@@ -196,9 +190,10 b' struct ZSTD_CCtx_params_s {'
196 190 ZSTD_dictAttachPref_e attachDictPref;
197 191
198 192 /* Multithreading: used to pass parameters to mtctx */
199 unsigned nbWorkers;
200 unsigned jobSize;
201 unsigned overlapSizeLog;
193 int nbWorkers;
194 size_t jobSize;
195 int overlapLog;
196 int rsyncable;
202 197
203 198 /* Long distance matching parameters */
204 199 ldmParams_t ldmParams;
@@ -498,6 +493,64 b' MEM_STATIC size_t ZSTD_hashPtr(const voi'
498 493 }
499 494 }
500 495
496 /** ZSTD_ipow() :
497 * Return base^exponent.
498 */
499 static U64 ZSTD_ipow(U64 base, U64 exponent)
500 {
501 U64 power = 1;
502 while (exponent) {
503 if (exponent & 1) power *= base;
504 exponent >>= 1;
505 base *= base;
506 }
507 return power;
508 }
509
510 #define ZSTD_ROLL_HASH_CHAR_OFFSET 10
511
512 /** ZSTD_rollingHash_append() :
513 * Add the buffer to the hash value.
514 */
515 static U64 ZSTD_rollingHash_append(U64 hash, void const* buf, size_t size)
516 {
517 BYTE const* istart = (BYTE const*)buf;
518 size_t pos;
519 for (pos = 0; pos < size; ++pos) {
520 hash *= prime8bytes;
521 hash += istart[pos] + ZSTD_ROLL_HASH_CHAR_OFFSET;
522 }
523 return hash;
524 }
525
526 /** ZSTD_rollingHash_compute() :
527 * Compute the rolling hash value of the buffer.
528 */
529 MEM_STATIC U64 ZSTD_rollingHash_compute(void const* buf, size_t size)
530 {
531 return ZSTD_rollingHash_append(0, buf, size);
532 }
533
534 /** ZSTD_rollingHash_primePower() :
535 * Compute the primePower to be passed to ZSTD_rollingHash_rotate() for a hash
536 * over a window of length bytes.
537 */
538 MEM_STATIC U64 ZSTD_rollingHash_primePower(U32 length)
539 {
540 return ZSTD_ipow(prime8bytes, length - 1);
541 }
542
543 /** ZSTD_rollingHash_rotate() :
544 * Rotate the rolling hash by one byte.
545 */
546 MEM_STATIC U64 ZSTD_rollingHash_rotate(U64 hash, BYTE toRemove, BYTE toAdd, U64 primePower)
547 {
548 hash -= (toRemove + ZSTD_ROLL_HASH_CHAR_OFFSET) * primePower;
549 hash *= prime8bytes;
550 hash += toAdd + ZSTD_ROLL_HASH_CHAR_OFFSET;
551 return hash;
552 }
553
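These helpers form a Rabin-Karp-style polynomial rolling hash: primePower caches prime8bytes^(length-1), the weight of the window's oldest byte, so rotate() can slide the window in O(1). The invariant they maintain, sketched with the functions above (rollingHash_invariant is a hypothetical check; BYTE/U64/U32 are the codebase's typedefs, and buf must hold at least len+1 bytes):

    /* rotate(hash(buf[0..len-1]), buf[0], buf[len], prime^(len-1))
     *     == hash(buf[1..len])
     * remove the oldest byte's weighted term, shift the remaining
     * polynomial by one power of the prime, append the new byte. */
    static int rollingHash_invariant(const BYTE* buf, size_t len)
    {
        U64 const pp = ZSTD_rollingHash_primePower((U32)len);
        U64 const h  = ZSTD_rollingHash_compute(buf, len);
        return ZSTD_rollingHash_rotate(h, buf[0], buf[len], pp)
            == ZSTD_rollingHash_compute(buf + 1, len);   /* always 1 */
    }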
501 554 /*-*************************************
502 555 * Round buffer management
503 556 ***************************************/
@@ -626,20 +679,23 b' MEM_STATIC U32 ZSTD_window_correctOverfl'
626 679 * dictMatchState mode, lowLimit and dictLimit are the same, and the dictionary
627 680 * is below them. forceWindow and dictMatchState are therefore incompatible.
628 681 */
629 MEM_STATIC void ZSTD_window_enforceMaxDist(ZSTD_window_t* window,
630 void const* srcEnd, U32 maxDist,
631 U32* loadedDictEndPtr,
632 const ZSTD_matchState_t** dictMatchStatePtr)
682 MEM_STATIC void
683 ZSTD_window_enforceMaxDist(ZSTD_window_t* window,
684 void const* srcEnd,
685 U32 maxDist,
686 U32* loadedDictEndPtr,
687 const ZSTD_matchState_t** dictMatchStatePtr)
633 688 {
634 U32 const current = (U32)((BYTE const*)srcEnd - window->base);
635 U32 loadedDictEnd = loadedDictEndPtr != NULL ? *loadedDictEndPtr : 0;
636 DEBUGLOG(5, "ZSTD_window_enforceMaxDist: current=%u, maxDist=%u", current, maxDist);
637 if (current > maxDist + loadedDictEnd) {
638 U32 const newLowLimit = current - maxDist;
689 U32 const blockEndIdx = (U32)((BYTE const*)srcEnd - window->base);
690 U32 loadedDictEnd = (loadedDictEndPtr != NULL) ? *loadedDictEndPtr : 0;
691 DEBUGLOG(5, "ZSTD_window_enforceMaxDist: blockEndIdx=%u, maxDist=%u",
692 (unsigned)blockEndIdx, (unsigned)maxDist);
693 if (blockEndIdx > maxDist + loadedDictEnd) {
694 U32 const newLowLimit = blockEndIdx - maxDist;
639 695 if (window->lowLimit < newLowLimit) window->lowLimit = newLowLimit;
640 696 if (window->dictLimit < window->lowLimit) {
641 697 DEBUGLOG(5, "Update dictLimit to match lowLimit, from %u to %u",
642 window->dictLimit, window->lowLimit);
698 (unsigned)window->dictLimit, (unsigned)window->lowLimit);
643 699 window->dictLimit = window->lowLimit;
644 700 }
645 701 if (loadedDictEndPtr)
@@ -690,20 +746,23 b' MEM_STATIC U32 ZSTD_window_update(ZSTD_w'
690 746
691 747
692 748 /* debug functions */
749 #if (DEBUGLEVEL>=2)
693 750
694 751 MEM_STATIC double ZSTD_fWeight(U32 rawStat)
695 752 {
696 753 U32 const fp_accuracy = 8;
697 754 U32 const fp_multiplier = (1 << fp_accuracy);
698 U32 const stat = rawStat + 1;
699 U32 const hb = ZSTD_highbit32(stat);
755 U32 const newStat = rawStat + 1;
756 U32 const hb = ZSTD_highbit32(newStat);
700 757 U32 const BWeight = hb * fp_multiplier;
701 U32 const FWeight = (stat << fp_accuracy) >> hb;
758 U32 const FWeight = (newStat << fp_accuracy) >> hb;
702 759 U32 const weight = BWeight + FWeight;
703 760 assert(hb + fp_accuracy < 31);
704 761 return (double)weight / fp_multiplier;
705 762 }
706 763
764 /* display a table's content,
765 * listing each element, its frequency, and its predicted bit cost */
707 766 MEM_STATIC void ZSTD_debugTable(const U32* table, U32 max)
708 767 {
709 768 unsigned u, sum;
@@ -715,6 +774,9 b' MEM_STATIC void ZSTD_debugTable(const U3'
715 774 }
716 775 }
717 776
777 #endif
778
779
718 780 #if defined (__cplusplus)
719 781 }
720 782 #endif
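/* Illustration (not part of the vendored sources): ZSTD_fWeight() above is a
 * fixed-point log2 estimate. With hb = floor(log2(stat)) and 8 fractional
 * bits, weight/256 = hb + stat/2^hb, and stat/2^hb lies in [1,2); that is a
 * piecewise-linear approximation of log2(stat) + 1, exact at powers of 2.
 * A standalone check (math stand-ins only, no zstd internals) : */
#include <math.h>
#include <stdint.h>
#include <stdio.h>

static unsigned highbit32(uint32_t v)    /* floor(log2(v)), v > 0 */
{
    unsigned r = 0;
    while (v >>= 1) r++;
    return r;
}

static double fWeight(uint32_t rawStat)  /* mirrors ZSTD_fWeight, fp_accuracy == 8 */
{
    uint32_t const stat = rawStat + 1;
    uint32_t const hb = highbit32(stat);
    uint32_t const BWeight = hb * 256;
    uint32_t const FWeight = (stat << 8) >> hb;
    return (double)(BWeight + FWeight) / 256;
}

int main(void)
{
    uint32_t s;
    for (s = 1; s <= 64; s *= 2)
        printf("stat=%2u  fWeight=%6.3f  log2+1=%6.3f\n",
               (unsigned)s, fWeight(s - 1), log2((double)s) + 1);
    return 0;
}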
@@ -18,7 +18,7 b' void ZSTD_fillDoubleHashTable(ZSTD_match'
18 18 const ZSTD_compressionParameters* const cParams = &ms->cParams;
19 19 U32* const hashLarge = ms->hashTable;
20 20 U32 const hBitsL = cParams->hashLog;
21 U32 const mls = cParams->searchLength;
21 U32 const mls = cParams->minMatch;
22 22 U32* const hashSmall = ms->chainTable;
23 23 U32 const hBitsS = cParams->chainLog;
24 24 const BYTE* const base = ms->window.base;
@@ -309,7 +309,7 b' size_t ZSTD_compressBlock_doubleFast('
309 309 ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM],
310 310 void const* src, size_t srcSize)
311 311 {
312 const U32 mls = ms->cParams.searchLength;
312 const U32 mls = ms->cParams.minMatch;
313 313 switch(mls)
314 314 {
315 315 default: /* includes case 3 */
@@ -329,7 +329,7 b' size_t ZSTD_compressBlock_doubleFast_dic'
329 329 ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM],
330 330 void const* src, size_t srcSize)
331 331 {
332 const U32 mls = ms->cParams.searchLength;
332 const U32 mls = ms->cParams.minMatch;
333 333 switch(mls)
334 334 {
335 335 default: /* includes case 3 */
@@ -483,7 +483,7 b' size_t ZSTD_compressBlock_doubleFast_ext'
483 483 ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM],
484 484 void const* src, size_t srcSize)
485 485 {
486 U32 const mls = ms->cParams.searchLength;
486 U32 const mls = ms->cParams.minMatch;
487 487 switch(mls)
488 488 {
489 489 default: /* includes case 3 */
@@ -18,7 +18,7 b' void ZSTD_fillHashTable(ZSTD_matchState_'
18 18 const ZSTD_compressionParameters* const cParams = &ms->cParams;
19 19 U32* const hashTable = ms->hashTable;
20 20 U32 const hBits = cParams->hashLog;
21 U32 const mls = cParams->searchLength;
21 U32 const mls = cParams->minMatch;
22 22 const BYTE* const base = ms->window.base;
23 23 const BYTE* ip = base + ms->nextToUpdate;
24 24 const BYTE* const iend = ((const BYTE*)end) - HASH_READ_SIZE;
@@ -27,18 +27,18 b' void ZSTD_fillHashTable(ZSTD_matchState_'
27 27 /* Always insert every fastHashFillStep position into the hash table.
28 28 * Insert the other positions if their hash entry is empty.
29 29 */
30 for (; ip + fastHashFillStep - 1 <= iend; ip += fastHashFillStep) {
30 for ( ; ip + fastHashFillStep < iend + 2; ip += fastHashFillStep) {
31 31 U32 const current = (U32)(ip - base);
32 U32 i;
33 for (i = 0; i < fastHashFillStep; ++i) {
34 size_t const hash = ZSTD_hashPtr(ip + i, hBits, mls);
35 if (i == 0 || hashTable[hash] == 0)
36 hashTable[hash] = current + i;
37 /* Only load extra positions for ZSTD_dtlm_full */
38 if (dtlm == ZSTD_dtlm_fast)
39 break;
40 }
41 }
32 size_t const hash0 = ZSTD_hashPtr(ip, hBits, mls);
33 hashTable[hash0] = current;
34 if (dtlm == ZSTD_dtlm_fast) continue;
35 /* Only load extra positions for ZSTD_dtlm_full */
36 { U32 p;
37 for (p = 1; p < fastHashFillStep; ++p) {
38 size_t const hash = ZSTD_hashPtr(ip + p, hBits, mls);
39 if (hashTable[hash] == 0) { /* not yet filled */
40 hashTable[hash] = current + p;
41 } } } }
42 42 }
43 43
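/* Illustration (not from the vendored sources): a toy model of the fill
 * pattern implemented by ZSTD_fillHashTable() above. Every fastHashFillStep-th
 * position is always recorded; in ZSTD_dtlm_full mode the intermediate
 * positions are also recorded, but only into still-empty slots. The hash
 * function and table size below are stand-ins, not the real ZSTD_hashPtr. */
#include <stdio.h>
#include <string.h>

#define STEP  3     /* stands in for fastHashFillStep */
#define TSIZE 16

static unsigned toyHash(unsigned pos) { return (pos * 2654435761u) >> 28; }

static unsigned fill(unsigned table[TSIZE], unsigned nbPos, int fullMode)
{
    unsigned ip, h, filled = 0;
    memset(table, 0, TSIZE * sizeof(unsigned));
    for (ip = 0; ip + STEP < nbPos + 2; ip += STEP) {
        table[toyHash(ip)] = ip;                 /* primary position: always stored */
        if (fullMode) {
            unsigned p;
            for (p = 1; p < STEP; p++) {
                unsigned const hp = toyHash(ip + p);
                if (table[hp] == 0) table[hp] = ip + p;   /* extra: only if slot empty */
            }
        }
    }
    for (h = 0; h < TSIZE; h++) filled += (table[h] != 0);
    return filled;
}

int main(void)
{
    unsigned table[TSIZE];
    /* full mode fills at least as many slots as fast mode */
    printf("slots filled: fast=%u full=%u\n", fill(table, 12, 0), fill(table, 12, 1));
    return 0;
}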
44 44 FORCE_INLINE_TEMPLATE
@@ -235,7 +235,7 b' size_t ZSTD_compressBlock_fast('
235 235 void const* src, size_t srcSize)
236 236 {
237 237 ZSTD_compressionParameters const* cParams = &ms->cParams;
238 U32 const mls = cParams->searchLength;
238 U32 const mls = cParams->minMatch;
239 239 assert(ms->dictMatchState == NULL);
240 240 switch(mls)
241 241 {
@@ -256,7 +256,7 b' size_t ZSTD_compressBlock_fast_dictMatch'
256 256 void const* src, size_t srcSize)
257 257 {
258 258 ZSTD_compressionParameters const* cParams = &ms->cParams;
259 U32 const mls = cParams->searchLength;
259 U32 const mls = cParams->minMatch;
260 260 assert(ms->dictMatchState != NULL);
261 261 switch(mls)
262 262 {
@@ -375,7 +375,7 b' size_t ZSTD_compressBlock_fast_extDict('
375 375 void const* src, size_t srcSize)
376 376 {
377 377 ZSTD_compressionParameters const* cParams = &ms->cParams;
378 U32 const mls = cParams->searchLength;
378 U32 const mls = cParams->minMatch;
379 379 switch(mls)
380 380 {
381 381 default: /* includes case 3 */
@@ -63,12 +63,13 b' ZSTD_updateDUBT(ZSTD_matchState_t* ms,'
63 63 static void
64 64 ZSTD_insertDUBT1(ZSTD_matchState_t* ms,
65 65 U32 current, const BYTE* inputEnd,
66 U32 nbCompares, U32 btLow, const ZSTD_dictMode_e dictMode)
66 U32 nbCompares, U32 btLow,
67 const ZSTD_dictMode_e dictMode)
67 68 {
68 69 const ZSTD_compressionParameters* const cParams = &ms->cParams;
69 U32* const bt = ms->chainTable;
70 U32 const btLog = cParams->chainLog - 1;
71 U32 const btMask = (1 << btLog) - 1;
70 U32* const bt = ms->chainTable;
71 U32 const btLog = cParams->chainLog - 1;
72 U32 const btMask = (1 << btLog) - 1;
72 73 size_t commonLengthSmaller=0, commonLengthLarger=0;
73 74 const BYTE* const base = ms->window.base;
74 75 const BYTE* const dictBase = ms->window.dictBase;
@@ -80,7 +81,7 b' ZSTD_insertDUBT1(ZSTD_matchState_t* ms,'
80 81 const BYTE* match;
81 82 U32* smallerPtr = bt + 2*(current&btMask);
82 83 U32* largerPtr = smallerPtr + 1;
83 U32 matchIndex = *smallerPtr;
84 U32 matchIndex = *smallerPtr; /* this candidate is unsorted : next sorted candidate is reached through *smallerPtr, while *largerPtr contains previous unsorted candidate (which is already saved and can be overwritten) */
84 85 U32 dummy32; /* to be nullified at the end */
85 86 U32 const windowLow = ms->window.lowLimit;
86 87
@@ -93,6 +94,9 b' ZSTD_insertDUBT1(ZSTD_matchState_t* ms,'
93 94 U32* const nextPtr = bt + 2*(matchIndex & btMask);
94 95 size_t matchLength = MIN(commonLengthSmaller, commonLengthLarger); /* guaranteed minimum nb of common bytes */
95 96 assert(matchIndex < current);
97 /* note : all candidates are now supposed to be sorted,
98 * but it's still possible to have nextPtr[1] == ZSTD_DUBT_UNSORTED_MARK
99 * when a real index has the same value as ZSTD_DUBT_UNSORTED_MARK */
96 100
97 101 if ( (dictMode != ZSTD_extDict)
98 102 || (matchIndex+matchLength >= dictLimit) /* both in current segment*/
@@ -108,7 +112,7 b' ZSTD_insertDUBT1(ZSTD_matchState_t* ms,'
108 112 match = dictBase + matchIndex;
109 113 matchLength += ZSTD_count_2segments(ip+matchLength, match+matchLength, iend, dictEnd, prefixStart);
110 114 if (matchIndex+matchLength >= dictLimit)
111 match = base + matchIndex; /* to prepare for next usage of match[matchLength] */
115 match = base + matchIndex; /* preparation for next read of match[matchLength] */
112 116 }
113 117
114 118 DEBUGLOG(8, "ZSTD_insertDUBT1: comparing %u with %u : found %u common bytes ",
@@ -147,6 +151,7 b' ZSTD_DUBT_findBetterDictMatch ('
147 151 ZSTD_matchState_t* ms,
148 152 const BYTE* const ip, const BYTE* const iend,
149 153 size_t* offsetPtr,
154 size_t bestLength,
150 155 U32 nbCompares,
151 156 U32 const mls,
152 157 const ZSTD_dictMode_e dictMode)
@@ -172,8 +177,7 b' ZSTD_DUBT_findBetterDictMatch ('
172 177 U32 const btMask = (1 << btLog) - 1;
173 178 U32 const btLow = (btMask >= dictHighLimit - dictLowLimit) ? dictLowLimit : dictHighLimit - btMask;
174 179
175 size_t commonLengthSmaller=0, commonLengthLarger=0, bestLength=0;
176 U32 matchEndIdx = current+8+1;
180 size_t commonLengthSmaller=0, commonLengthLarger=0;
177 181
178 182 (void)dictMode;
179 183 assert(dictMode == ZSTD_dictMatchState);
@@ -188,10 +192,8 b' ZSTD_DUBT_findBetterDictMatch ('
188 192
189 193 if (matchLength > bestLength) {
190 194 U32 matchIndex = dictMatchIndex + dictIndexDelta;
191 if (matchLength > matchEndIdx - matchIndex)
192 matchEndIdx = matchIndex + (U32)matchLength;
193 195 if ( (4*(int)(matchLength-bestLength)) > (int)(ZSTD_highbit32(current-matchIndex+1) - ZSTD_highbit32((U32)offsetPtr[0]+1)) ) {
194 DEBUGLOG(2, "ZSTD_DUBT_findBestDictMatch(%u) : found better match length %u -> %u and offsetCode %u -> %u (dictMatchIndex %u, matchIndex %u)",
196 DEBUGLOG(9, "ZSTD_DUBT_findBetterDictMatch(%u) : found better match length %u -> %u and offsetCode %u -> %u (dictMatchIndex %u, matchIndex %u)",
195 197 current, (U32)bestLength, (U32)matchLength, (U32)*offsetPtr, ZSTD_REP_MOVE + current - matchIndex, dictMatchIndex, matchIndex);
196 198 bestLength = matchLength, *offsetPtr = ZSTD_REP_MOVE + current - matchIndex;
197 199 }
@@ -200,7 +202,6 b' ZSTD_DUBT_findBetterDictMatch ('
200 202 }
201 203 }
202 204
203 DEBUGLOG(2, "matchLength:%6zu, match:%p, prefixStart:%p, ip:%p", matchLength, match, prefixStart, ip);
204 205 if (match[matchLength] < ip[matchLength]) {
205 206 if (dictMatchIndex <= btLow) { break; } /* beyond tree size, stop the search */
206 207 commonLengthSmaller = matchLength; /* all smaller will now have at least this guaranteed common length */
@@ -215,7 +216,7 b' ZSTD_DUBT_findBetterDictMatch ('
215 216
216 217 if (bestLength >= MINMATCH) {
217 218 U32 const mIndex = current - ((U32)*offsetPtr - ZSTD_REP_MOVE); (void)mIndex;
218 DEBUGLOG(2, "ZSTD_DUBT_findBestDictMatch(%u) : found match of length %u and offsetCode %u (pos %u)",
219 DEBUGLOG(8, "ZSTD_DUBT_findBetterDictMatch(%u) : found match of length %u and offsetCode %u (pos %u)",
219 220 current, (U32)bestLength, (U32)*offsetPtr, mIndex);
220 221 }
221 222 return bestLength;
@@ -261,7 +262,7 b' ZSTD_DUBT_findBestMatch(ZSTD_matchState_'
261 262 && (nbCandidates > 1) ) {
262 263 DEBUGLOG(8, "ZSTD_DUBT_findBestMatch: candidate %u is unsorted",
263 264 matchIndex);
264 *unsortedMark = previousCandidate;
265 *unsortedMark = previousCandidate; /* the unsortedMark becomes a reversed chain, used to walk back up to the original position */
265 266 previousCandidate = matchIndex;
266 267 matchIndex = *nextCandidate;
267 268 nextCandidate = bt + 2*(matchIndex&btMask);
@@ -269,11 +270,13 b' ZSTD_DUBT_findBestMatch(ZSTD_matchState_'
269 270 nbCandidates --;
270 271 }
271 272
273 /* nullify last candidate if it's still unsorted
274 * simplification, detrimental to compression ratio, beneficial for speed */
272 275 if ( (matchIndex > unsortLimit)
273 276 && (*unsortedMark==ZSTD_DUBT_UNSORTED_MARK) ) {
274 277 DEBUGLOG(7, "ZSTD_DUBT_findBestMatch: nullify last unsorted candidate %u",
275 278 matchIndex);
276 *nextCandidate = *unsortedMark = 0; /* nullify next candidate if it's still unsorted (note : simplification, detrimental to compression ratio, beneficial for speed) */
279 *nextCandidate = *unsortedMark = 0;
277 280 }
278 281
279 282 /* batch sort stacked candidates */
@@ -288,14 +291,14 b' ZSTD_DUBT_findBestMatch(ZSTD_matchState_'
288 291 }
289 292
290 293 /* find longest match */
291 { size_t commonLengthSmaller=0, commonLengthLarger=0;
294 { size_t commonLengthSmaller = 0, commonLengthLarger = 0;
292 295 const BYTE* const dictBase = ms->window.dictBase;
293 296 const U32 dictLimit = ms->window.dictLimit;
294 297 const BYTE* const dictEnd = dictBase + dictLimit;
295 298 const BYTE* const prefixStart = base + dictLimit;
296 299 U32* smallerPtr = bt + 2*(current&btMask);
297 300 U32* largerPtr = bt + 2*(current&btMask) + 1;
298 U32 matchEndIdx = current+8+1;
301 U32 matchEndIdx = current + 8 + 1;
299 302 U32 dummy32; /* to be nullified at the end */
300 303 size_t bestLength = 0;
301 304
@@ -323,6 +326,11 b' ZSTD_DUBT_findBestMatch(ZSTD_matchState_'
323 326 if ( (4*(int)(matchLength-bestLength)) > (int)(ZSTD_highbit32(current-matchIndex+1) - ZSTD_highbit32((U32)offsetPtr[0]+1)) )
324 327 bestLength = matchLength, *offsetPtr = ZSTD_REP_MOVE + current - matchIndex;
325 328 if (ip+matchLength == iend) { /* equal : no way to know if inf or sup */
329 if (dictMode == ZSTD_dictMatchState) {
330 nbCompares = 0; /* in addition to avoiding checking any
331 * further in this loop, make sure we
332 * skip checking in the dictionary. */
333 }
326 334 break; /* drop, to guarantee consistency (miss a little bit of compression) */
327 335 }
328 336 }
@@ -346,7 +354,10 b' ZSTD_DUBT_findBestMatch(ZSTD_matchState_'
346 354 *smallerPtr = *largerPtr = 0;
347 355
348 356 if (dictMode == ZSTD_dictMatchState && nbCompares) {
349 bestLength = ZSTD_DUBT_findBetterDictMatch(ms, ip, iend, offsetPtr, nbCompares, mls, dictMode);
357 bestLength = ZSTD_DUBT_findBetterDictMatch(
358 ms, ip, iend,
359 offsetPtr, bestLength, nbCompares,
360 mls, dictMode);
350 361 }
351 362
352 363 assert(matchEndIdx > current+8); /* ensure nextToUpdate is increased */
@@ -381,7 +392,7 b' ZSTD_BtFindBestMatch_selectMLS ( ZSTD_m'
381 392 const BYTE* ip, const BYTE* const iLimit,
382 393 size_t* offsetPtr)
383 394 {
384 switch(ms->cParams.searchLength)
395 switch(ms->cParams.minMatch)
385 396 {
386 397 default : /* includes case 3 */
387 398 case 4 : return ZSTD_BtFindBestMatch(ms, ip, iLimit, offsetPtr, 4, ZSTD_noDict);
@@ -397,7 +408,7 b' static size_t ZSTD_BtFindBestMatch_dictM'
397 408 const BYTE* ip, const BYTE* const iLimit,
398 409 size_t* offsetPtr)
399 410 {
400 switch(ms->cParams.searchLength)
411 switch(ms->cParams.minMatch)
401 412 {
402 413 default : /* includes case 3 */
403 414 case 4 : return ZSTD_BtFindBestMatch(ms, ip, iLimit, offsetPtr, 4, ZSTD_dictMatchState);
@@ -413,7 +424,7 b' static size_t ZSTD_BtFindBestMatch_extDi'
413 424 const BYTE* ip, const BYTE* const iLimit,
414 425 size_t* offsetPtr)
415 426 {
416 switch(ms->cParams.searchLength)
427 switch(ms->cParams.minMatch)
417 428 {
418 429 default : /* includes case 3 */
419 430 case 4 : return ZSTD_BtFindBestMatch(ms, ip, iLimit, offsetPtr, 4, ZSTD_extDict);
@@ -428,7 +439,7 b' static size_t ZSTD_BtFindBestMatch_extDi'
428 439 /* *********************************
429 440 * Hash Chain
430 441 ***********************************/
431 #define NEXT_IN_CHAIN(d, mask) chainTable[(d) & mask]
442 #define NEXT_IN_CHAIN(d, mask) chainTable[(d) & (mask)]
432 443
433 444 /* Update chains up to ip (excluded)
434 445 Assumption : always within prefix (i.e. not within extDict) */
@@ -458,7 +469,7 b' static U32 ZSTD_insertAndFindFirstIndex_'
458 469
459 470 U32 ZSTD_insertAndFindFirstIndex(ZSTD_matchState_t* ms, const BYTE* ip) {
460 471 const ZSTD_compressionParameters* const cParams = &ms->cParams;
461 return ZSTD_insertAndFindFirstIndex_internal(ms, cParams, ip, ms->cParams.searchLength);
472 return ZSTD_insertAndFindFirstIndex_internal(ms, cParams, ip, ms->cParams.minMatch);
462 473 }
463 474
464 475
@@ -492,6 +503,7 b' size_t ZSTD_HcFindBestMatch_generic ('
492 503 size_t currentMl=0;
493 504 if ((dictMode != ZSTD_extDict) || matchIndex >= dictLimit) {
494 505 const BYTE* const match = base + matchIndex;
506 assert(matchIndex >= dictLimit); /* ensures this is true if dictMode != ZSTD_extDict */
495 507 if (match[ml] == ip[ml]) /* potentially better */
496 508 currentMl = ZSTD_count(ip, match, iLimit);
497 509 } else {
@@ -554,7 +566,7 b' FORCE_INLINE_TEMPLATE size_t ZSTD_HcFind'
554 566 const BYTE* ip, const BYTE* const iLimit,
555 567 size_t* offsetPtr)
556 568 {
557 switch(ms->cParams.searchLength)
569 switch(ms->cParams.minMatch)
558 570 {
559 571 default : /* includes case 3 */
560 572 case 4 : return ZSTD_HcFindBestMatch_generic(ms, ip, iLimit, offsetPtr, 4, ZSTD_noDict);
@@ -570,7 +582,7 b' static size_t ZSTD_HcFindBestMatch_dictM'
570 582 const BYTE* ip, const BYTE* const iLimit,
571 583 size_t* offsetPtr)
572 584 {
573 switch(ms->cParams.searchLength)
585 switch(ms->cParams.minMatch)
574 586 {
575 587 default : /* includes case 3 */
576 588 case 4 : return ZSTD_HcFindBestMatch_generic(ms, ip, iLimit, offsetPtr, 4, ZSTD_dictMatchState);
@@ -586,7 +598,7 b' FORCE_INLINE_TEMPLATE size_t ZSTD_HcFind'
586 598 const BYTE* ip, const BYTE* const iLimit,
587 599 size_t* offsetPtr)
588 600 {
589 switch(ms->cParams.searchLength)
601 switch(ms->cParams.minMatch)
590 602 {
591 603 default : /* includes case 3 */
592 604 case 4 : return ZSTD_HcFindBestMatch_generic(ms, ip, iLimit, offsetPtr, 4, ZSTD_extDict);
@@ -37,8 +37,8 b' void ZSTD_ldm_adjustParameters(ldmParams'
37 37 params->hashLog = MAX(ZSTD_HASHLOG_MIN, params->windowLog - LDM_HASH_RLOG);
38 38 assert(params->hashLog <= ZSTD_HASHLOG_MAX);
39 39 }
40 if (params->hashEveryLog == 0) {
41 params->hashEveryLog = params->windowLog < params->hashLog
40 if (params->hashRateLog == 0) {
41 params->hashRateLog = params->windowLog < params->hashLog
42 42 ? 0
43 43 : params->windowLog - params->hashLog;
44 44 }
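/* Illustration (not from the vendored sources): the default computed above
 * makes the LDM hash table receive, on average, one entry per 2^hashRateLog
 * scanned positions, since ZSTD_ldm_makeEntryAndInsertByTag() (next hunk)
 * only inserts when all hashRateLog tag bits are set. A standalone check : */
#include <stdio.h>

int main(void)
{
    unsigned const windowLog = 27, hashLog = 20;
    unsigned const hashRateLog = (windowLog < hashLog) ? 0 : windowLog - hashLog;
    printf("hashRateLog=%u => ~1 insertion per %u positions\n",
           hashRateLog, 1u << hashRateLog);   /* 7 => 1 per 128 */
    return 0;
}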
@@ -119,20 +119,20 b' static void ZSTD_ldm_insertEntry(ldmStat'
119 119 *
120 120 * Gets the small hash, checksum, and tag from the rollingHash.
121 121 *
122 * If the tag matches (1 << ldmParams.hashEveryLog)-1, then
122 * If the tag matches (1 << ldmParams.hashRateLog)-1, then
123 123 * creates an ldmEntry from the offset, and inserts it into the hash table.
124 124 *
125 125 * hBits is the length of the small hash, which is the most significant hBits
126 126 * of rollingHash. The checksum is the next 32 most significant bits, followed
127 * by ldmParams.hashEveryLog bits that make up the tag. */
127 * by ldmParams.hashRateLog bits that make up the tag. */
128 128 static void ZSTD_ldm_makeEntryAndInsertByTag(ldmState_t* ldmState,
129 129 U64 const rollingHash,
130 130 U32 const hBits,
131 131 U32 const offset,
132 132 ldmParams_t const ldmParams)
133 133 {
134 U32 const tag = ZSTD_ldm_getTag(rollingHash, hBits, ldmParams.hashEveryLog);
135 U32 const tagMask = ((U32)1 << ldmParams.hashEveryLog) - 1;
134 U32 const tag = ZSTD_ldm_getTag(rollingHash, hBits, ldmParams.hashRateLog);
135 U32 const tagMask = ((U32)1 << ldmParams.hashRateLog) - 1;
136 136 if (tag == tagMask) {
137 137 U32 const hash = ZSTD_ldm_getSmallHash(rollingHash, hBits);
138 138 U32 const checksum = ZSTD_ldm_getChecksum(rollingHash, hBits);
@@ -143,56 +143,6 b' static void ZSTD_ldm_makeEntryAndInsertB'
143 143 }
144 144 }
145 145
146 /** ZSTD_ldm_getRollingHash() :
147 * Get a 64-bit hash using the first len bytes from buf.
148 *
149 * Giving bytes s = s_1, s_2, ... s_k, the hash is defined to be
150 * H(s) = s_1*(a^(k-1)) + s_2*(a^(k-2)) + ... + s_k*(a^0)
151 *
152 * where the constant a is defined to be prime8bytes.
153 *
154 * The implementation adds an offset to each byte, so
155 * H(s) = (s_1 + HASH_CHAR_OFFSET)*(a^(k-1)) + ... */
156 static U64 ZSTD_ldm_getRollingHash(const BYTE* buf, U32 len)
157 {
158 U64 ret = 0;
159 U32 i;
160 for (i = 0; i < len; i++) {
161 ret *= prime8bytes;
162 ret += buf[i] + LDM_HASH_CHAR_OFFSET;
163 }
164 return ret;
165 }
166
167 /** ZSTD_ldm_ipow() :
168 * Return base^exp. */
169 static U64 ZSTD_ldm_ipow(U64 base, U64 exp)
170 {
171 U64 ret = 1;
172 while (exp) {
173 if (exp & 1) { ret *= base; }
174 exp >>= 1;
175 base *= base;
176 }
177 return ret;
178 }
179
180 U64 ZSTD_ldm_getHashPower(U32 minMatchLength) {
181 DEBUGLOG(4, "ZSTD_ldm_getHashPower: mml=%u", minMatchLength);
182 assert(minMatchLength >= ZSTD_LDM_MINMATCH_MIN);
183 return ZSTD_ldm_ipow(prime8bytes, minMatchLength - 1);
184 }
185
186 /** ZSTD_ldm_updateHash() :
187 * Updates hash by removing toRemove and adding toAdd. */
188 static U64 ZSTD_ldm_updateHash(U64 hash, BYTE toRemove, BYTE toAdd, U64 hashPower)
189 {
190 hash -= ((toRemove + LDM_HASH_CHAR_OFFSET) * hashPower);
191 hash *= prime8bytes;
192 hash += toAdd + LDM_HASH_CHAR_OFFSET;
193 return hash;
194 }
195
196 146 /** ZSTD_ldm_countBackwardsMatch() :
197 147 * Returns the number of bytes that match backwards before pIn and pMatch.
198 148 *
@@ -238,6 +188,7 b' static size_t ZSTD_ldm_fillFastTables(ZS'
238 188 case ZSTD_btlazy2:
239 189 case ZSTD_btopt:
240 190 case ZSTD_btultra:
191 case ZSTD_btultra2:
241 192 break;
242 193 default:
243 194 assert(0); /* not possible : not a valid strategy id */
@@ -261,9 +212,9 b' static U64 ZSTD_ldm_fillLdmHashTable(ldm'
261 212 const BYTE* cur = lastHashed + 1;
262 213
263 214 while (cur < iend) {
264 rollingHash = ZSTD_ldm_updateHash(rollingHash, cur[-1],
265 cur[ldmParams.minMatchLength-1],
266 state->hashPower);
215 rollingHash = ZSTD_rollingHash_rotate(rollingHash, cur[-1],
216 cur[ldmParams.minMatchLength-1],
217 state->hashPower);
267 218 ZSTD_ldm_makeEntryAndInsertByTag(state,
268 219 rollingHash, hBits,
269 220 (U32)(cur - base), ldmParams);
@@ -297,8 +248,8 b' static size_t ZSTD_ldm_generateSequences'
297 248 U64 const hashPower = ldmState->hashPower;
298 249 U32 const hBits = params->hashLog - params->bucketSizeLog;
299 250 U32 const ldmBucketSize = 1U << params->bucketSizeLog;
300 U32 const hashEveryLog = params->hashEveryLog;
301 U32 const ldmTagMask = (1U << params->hashEveryLog) - 1;
251 U32 const hashRateLog = params->hashRateLog;
252 U32 const ldmTagMask = (1U << params->hashRateLog) - 1;
302 253 /* Prefix and extDict parameters */
303 254 U32 const dictLimit = ldmState->window.dictLimit;
304 255 U32 const lowestIndex = extDict ? ldmState->window.lowLimit : dictLimit;
@@ -324,16 +275,16 b' static size_t ZSTD_ldm_generateSequences'
324 275 size_t forwardMatchLength = 0, backwardMatchLength = 0;
325 276 ldmEntry_t* bestEntry = NULL;
326 277 if (ip != istart) {
327 rollingHash = ZSTD_ldm_updateHash(rollingHash, lastHashed[0],
328 lastHashed[minMatchLength],
329 hashPower);
278 rollingHash = ZSTD_rollingHash_rotate(rollingHash, lastHashed[0],
279 lastHashed[minMatchLength],
280 hashPower);
330 281 } else {
331 rollingHash = ZSTD_ldm_getRollingHash(ip, minMatchLength);
282 rollingHash = ZSTD_rollingHash_compute(ip, minMatchLength);
332 283 }
333 284 lastHashed = ip;
334 285
335 286 /* Do not insert and do not look for a match */
336 if (ZSTD_ldm_getTag(rollingHash, hBits, hashEveryLog) != ldmTagMask) {
287 if (ZSTD_ldm_getTag(rollingHash, hBits, hashRateLog) != ldmTagMask) {
337 288 ip++;
338 289 continue;
339 290 }
@@ -593,7 +544,7 b' size_t ZSTD_ldm_blockCompress(rawSeqStor'
593 544 void const* src, size_t srcSize)
594 545 {
595 546 const ZSTD_compressionParameters* const cParams = &ms->cParams;
596 unsigned const minMatch = cParams->searchLength;
547 unsigned const minMatch = cParams->minMatch;
597 548 ZSTD_blockCompressor const blockCompressor =
598 549 ZSTD_selectBlockCompressor(cParams->strategy, ZSTD_matchState_dictMode(ms));
599 550 /* Input bounds */
@@ -21,7 +21,7 b' extern "C" {'
21 21 * Long distance matching
22 22 ***************************************/
23 23
24 #define ZSTD_LDM_DEFAULT_WINDOW_LOG ZSTD_WINDOWLOG_DEFAULTMAX
24 #define ZSTD_LDM_DEFAULT_WINDOW_LOG ZSTD_WINDOWLOG_LIMIT_DEFAULT
25 25
26 26 /**
27 27 * ZSTD_ldm_generateSequences():
@@ -86,12 +86,8 b' size_t ZSTD_ldm_getTableSize(ldmParams_t'
86 86 */
87 87 size_t ZSTD_ldm_getMaxNbSeq(ldmParams_t params, size_t maxChunkSize);
88 88
89 /** ZSTD_ldm_getTableSize() :
90 * Return prime8bytes^(minMatchLength-1) */
91 U64 ZSTD_ldm_getHashPower(U32 minMatchLength);
92
93 89 /** ZSTD_ldm_adjustParameters() :
94 * If the params->hashEveryLog is not set, set it to its default value based on
90 * If the params->hashRateLog is not set, set it to its default value based on
95 91 * windowLog and params->hashLog.
96 92 *
97 93 * Ensures that params->bucketSizeLog is <= params->hashLog (setting it to
@@ -17,6 +17,8 b''
17 17 #define ZSTD_FREQ_DIV 4 /* log factor when using previous stats to init next stats */
18 18 #define ZSTD_MAX_PRICE (1<<30)
19 19
20 #define ZSTD_PREDEF_THRESHOLD 1024 /* if srcSize < ZSTD_PREDEF_THRESHOLD, symbols' cost is assumed static, directly determined by pre-defined distributions */
21
20 22
21 23 /*-*************************************
22 24 * Price functions for optimal parser
@@ -52,11 +54,15 b' MEM_STATIC U32 ZSTD_fracWeight(U32 rawSt'
52 54 return weight;
53 55 }
54 56
55 /* debugging function, @return price in bytes */
57 #if (DEBUGLEVEL>=2)
58 /* debugging function,
59 * @return price in bytes as fractional value
60 * for debug messages only */
56 61 MEM_STATIC double ZSTD_fCost(U32 price)
57 62 {
58 63 return (double)price / (BITCOST_MULTIPLIER*8);
59 64 }
65 #endif
60 66
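/* Illustration (not from the vendored sources): prices in the optimal parser
 * are fixed-point bit counts. Assuming BITCOST_MULTIPLIER == (1<<8), as
 * defined alongside these helpers, ZSTD_fCost() converts a price to bytes :
 *     price = 6144  =>  6144 / 256 = 24 bits  =>  24 / 8 = 3.0 bytes  */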
61 67 static void ZSTD_setBasePrices(optState_t* optPtr, int optLevel)
62 68 {
@@ -67,29 +73,44 b' static void ZSTD_setBasePrices(optState_'
67 73 }
68 74
69 75
70 static U32 ZSTD_downscaleStat(U32* table, U32 lastEltIndex, int malus)
76 /* ZSTD_downscaleStat() :
77 * reduce all elements in table by a factor 2^(ZSTD_FREQ_DIV+malus)
78 * return the resulting sum of elements */
79 static U32 ZSTD_downscaleStat(unsigned* table, U32 lastEltIndex, int malus)
71 80 {
72 81 U32 s, sum=0;
82 DEBUGLOG(5, "ZSTD_downscaleStat (nbElts=%u)", (unsigned)lastEltIndex+1);
73 83 assert(ZSTD_FREQ_DIV+malus > 0 && ZSTD_FREQ_DIV+malus < 31);
74 for (s=0; s<=lastEltIndex; s++) {
84 for (s=0; s<lastEltIndex+1; s++) {
75 85 table[s] = 1 + (table[s] >> (ZSTD_FREQ_DIV+malus));
76 86 sum += table[s];
77 87 }
78 88 return sum;
79 89 }
80 90
81 static void ZSTD_rescaleFreqs(optState_t* const optPtr,
82 const BYTE* const src, size_t const srcSize,
83 int optLevel)
91 /* ZSTD_rescaleFreqs() :
92 * if first block (detected by optPtr->litLengthSum == 0) : init statistics
93 * take hints from dictionary if there is one
94 * or init from zero, using src for literals stats, or flat 1 for match symbols
95 * otherwise downscale existing stats, to be used as seed for next block.
96 */
97 static void
98 ZSTD_rescaleFreqs(optState_t* const optPtr,
99 const BYTE* const src, size_t const srcSize,
100 int const optLevel)
84 101 {
102 DEBUGLOG(5, "ZSTD_rescaleFreqs (srcSize=%u)", (unsigned)srcSize);
85 103 optPtr->priceType = zop_dynamic;
86 104
87 105 if (optPtr->litLengthSum == 0) { /* first block : init */
88 if (srcSize <= 1024) /* heuristic */
106 if (srcSize <= ZSTD_PREDEF_THRESHOLD) { /* heuristic */
107 DEBUGLOG(5, "(srcSize <= ZSTD_PREDEF_THRESHOLD) => zop_predef");
89 108 optPtr->priceType = zop_predef;
109 }
90 110
91 111 assert(optPtr->symbolCosts != NULL);
92 if (optPtr->symbolCosts->huf.repeatMode == HUF_repeat_valid) { /* huffman table presumed generated by dictionary */
112 if (optPtr->symbolCosts->huf.repeatMode == HUF_repeat_valid) {
113 /* huffman table presumed generated by dictionary */
93 114 optPtr->priceType = zop_dynamic;
94 115
95 116 assert(optPtr->litFreq != NULL);
@@ -208,7 +229,9 b' static U32 ZSTD_litLengthPrice(U32 const'
208 229
209 230 /* dynamic statistics */
210 231 { U32 const llCode = ZSTD_LLcode(litLength);
211 return (LL_bits[llCode] * BITCOST_MULTIPLIER) + (optPtr->litLengthSumBasePrice - WEIGHT(optPtr->litLengthFreq[llCode], optLevel));
232 return (LL_bits[llCode] * BITCOST_MULTIPLIER)
233 + optPtr->litLengthSumBasePrice
234 - WEIGHT(optPtr->litLengthFreq[llCode], optLevel);
212 235 }
213 236 }
214 237
@@ -253,7 +276,7 b' static int ZSTD_literalsContribution(con'
253 276 FORCE_INLINE_TEMPLATE U32
254 277 ZSTD_getMatchPrice(U32 const offset,
255 278 U32 const matchLength,
256 const optState_t* const optPtr,
279 const optState_t* const optPtr,
257 280 int const optLevel)
258 281 {
259 282 U32 price;
@@ -385,7 +408,6 b' static U32 ZSTD_insertBt1('
385 408 U32* largerPtr = smallerPtr + 1;
386 409 U32 dummy32; /* to be nullified at the end */
387 410 U32 const windowLow = ms->window.lowLimit;
388 U32 const matchLow = windowLow ? windowLow : 1;
389 411 U32 matchEndIdx = current+8+1;
390 412 size_t bestLength = 8;
391 413 U32 nbCompares = 1U << cParams->searchLog;
@@ -401,7 +423,8 b' static U32 ZSTD_insertBt1('
401 423 assert(ip <= iend-8); /* required for h calculation */
402 424 hashTable[h] = current; /* Update Hash Table */
403 425
404 while (nbCompares-- && (matchIndex >= matchLow)) {
426 assert(windowLow > 0);
427 while (nbCompares-- && (matchIndex >= windowLow)) {
405 428 U32* const nextPtr = bt + 2*(matchIndex & btMask);
406 429 size_t matchLength = MIN(commonLengthSmaller, commonLengthLarger); /* guaranteed minimum nb of common bytes */
407 430 assert(matchIndex < current);
@@ -479,7 +502,7 b' void ZSTD_updateTree_internal('
479 502 const BYTE* const base = ms->window.base;
480 503 U32 const target = (U32)(ip - base);
481 504 U32 idx = ms->nextToUpdate;
482 DEBUGLOG(5, "ZSTD_updateTree_internal, from %u to %u (dictMode:%u)",
505 DEBUGLOG(6, "ZSTD_updateTree_internal, from %u to %u (dictMode:%u)",
483 506 idx, target, dictMode);
484 507
485 508 while(idx < target)
@@ -488,15 +511,18 b' void ZSTD_updateTree_internal('
488 511 }
489 512
490 513 void ZSTD_updateTree(ZSTD_matchState_t* ms, const BYTE* ip, const BYTE* iend) {
491 ZSTD_updateTree_internal(ms, ip, iend, ms->cParams.searchLength, ZSTD_noDict);
514 ZSTD_updateTree_internal(ms, ip, iend, ms->cParams.minMatch, ZSTD_noDict);
492 515 }
493 516
494 517 FORCE_INLINE_TEMPLATE
495 518 U32 ZSTD_insertBtAndGetAllMatches (
496 519 ZSTD_matchState_t* ms,
497 520 const BYTE* const ip, const BYTE* const iLimit, const ZSTD_dictMode_e dictMode,
498 U32 rep[ZSTD_REP_NUM], U32 const ll0,
499 ZSTD_match_t* matches, const U32 lengthToBeat, U32 const mls /* template */)
521 U32 rep[ZSTD_REP_NUM],
522 U32 const ll0, /* tells if associated literal length is 0 or not. This value must be 0 or 1 */
523 ZSTD_match_t* matches,
524 const U32 lengthToBeat,
525 U32 const mls /* template */)
500 526 {
501 527 const ZSTD_compressionParameters* const cParams = &ms->cParams;
502 528 U32 const sufficient_len = MIN(cParams->targetLength, ZSTD_OPT_NUM -1);
@@ -542,6 +568,7 b' U32 ZSTD_insertBtAndGetAllMatches ('
542 568 DEBUGLOG(8, "ZSTD_insertBtAndGetAllMatches: current=%u", current);
543 569
544 570 /* check repCode */
571 assert(ll0 <= 1); /* necessarily 1 or 0 */
545 572 { U32 const lastR = ZSTD_REP_NUM + ll0;
546 573 U32 repCode;
547 574 for (repCode = ll0; repCode < lastR; repCode++) {
@@ -724,7 +751,7 b' FORCE_INLINE_TEMPLATE U32 ZSTD_BtGetAllM'
724 751 ZSTD_match_t* matches, U32 const lengthToBeat)
725 752 {
726 753 const ZSTD_compressionParameters* const cParams = &ms->cParams;
727 U32 const matchLengthSearch = cParams->searchLength;
754 U32 const matchLengthSearch = cParams->minMatch;
728 755 DEBUGLOG(8, "ZSTD_BtGetAllMatches");
729 756 if (ip < ms->window.base + ms->nextToUpdate) return 0; /* skipped area */
730 757 ZSTD_updateTree_internal(ms, ip, iHighLimit, matchLengthSearch, dictMode);
@@ -774,12 +801,30 b' static U32 ZSTD_totalLen(ZSTD_optimal_t '
774 801 return sol.litlen + sol.mlen;
775 802 }
776 803
804 #if 0 /* debug */
805
806 static void
807 listStats(const U32* table, int lastEltID)
808 {
809 int const nbElts = lastEltID + 1;
810 int enb;
811 for (enb=0; enb < nbElts; enb++) {
812 (void)table;
813 //RAWLOG(2, "%3i:%3i, ", enb, table[enb]);
814 RAWLOG(2, "%4i,", table[enb]);
815 }
816 RAWLOG(2, " \n");
817 }
818
819 #endif
820
777 821 FORCE_INLINE_TEMPLATE size_t
778 822 ZSTD_compressBlock_opt_generic(ZSTD_matchState_t* ms,
779 823 seqStore_t* seqStore,
780 824 U32 rep[ZSTD_REP_NUM],
781 const void* src, size_t srcSize,
782 const int optLevel, const ZSTD_dictMode_e dictMode)
825 const void* src, size_t srcSize,
826 const int optLevel,
827 const ZSTD_dictMode_e dictMode)
783 828 {
784 829 optState_t* const optStatePtr = &ms->opt;
785 830 const BYTE* const istart = (const BYTE*)src;
@@ -792,14 +837,15 b' ZSTD_compressBlock_opt_generic(ZSTD_matc'
792 837 const ZSTD_compressionParameters* const cParams = &ms->cParams;
793 838
794 839 U32 const sufficient_len = MIN(cParams->targetLength, ZSTD_OPT_NUM -1);
795 U32 const minMatch = (cParams->searchLength == 3) ? 3 : 4;
840 U32 const minMatch = (cParams->minMatch == 3) ? 3 : 4;
796 841
797 842 ZSTD_optimal_t* const opt = optStatePtr->priceTable;
798 843 ZSTD_match_t* const matches = optStatePtr->matchTable;
799 844 ZSTD_optimal_t lastSequence;
800 845
801 846 /* init */
802 DEBUGLOG(5, "ZSTD_compressBlock_opt_generic");
847 DEBUGLOG(5, "ZSTD_compressBlock_opt_generic: current=%u, prefix=%u, nextToUpdate=%u",
848 (U32)(ip - base), ms->window.dictLimit, ms->nextToUpdate);
803 849 assert(optLevel <= 2);
804 850 ms->nextToUpdate3 = ms->nextToUpdate;
805 851 ZSTD_rescaleFreqs(optStatePtr, (const BYTE*)src, srcSize, optLevel);
@@ -999,7 +1045,7 b' ZSTD_compressBlock_opt_generic(ZSTD_matc'
999 1045 U32 const offCode = opt[storePos].off;
1000 1046 U32 const advance = llen + mlen;
1001 1047 DEBUGLOG(6, "considering seq starting at %zi, llen=%u, mlen=%u",
1002 anchor - istart, llen, mlen);
1048 anchor - istart, (unsigned)llen, (unsigned)mlen);
1003 1049
1004 1050 if (mlen==0) { /* only literals => must be last "sequence", actually starting a new stream of sequences */
1005 1051 assert(storePos == storeEnd); /* must be last sequence */
@@ -1047,11 +1093,11 b' size_t ZSTD_compressBlock_btopt('
1047 1093
1048 1094
1049 1095 /* used in 2-pass strategy */
1050 static U32 ZSTD_upscaleStat(U32* table, U32 lastEltIndex, int bonus)
1096 static U32 ZSTD_upscaleStat(unsigned* table, U32 lastEltIndex, int bonus)
1051 1097 {
1052 1098 U32 s, sum=0;
1053 assert(ZSTD_FREQ_DIV+bonus > 0);
1054 for (s=0; s<=lastEltIndex; s++) {
1099 assert(ZSTD_FREQ_DIV+bonus >= 0);
1100 for (s=0; s<lastEltIndex+1; s++) {
1055 1101 table[s] <<= ZSTD_FREQ_DIV+bonus;
1056 1102 table[s]--;
1057 1103 sum += table[s];
@@ -1063,9 +1109,43 b' static U32 ZSTD_upscaleStat(U32* table, '
1063 1109 MEM_STATIC void ZSTD_upscaleStats(optState_t* optPtr)
1064 1110 {
1065 1111 optPtr->litSum = ZSTD_upscaleStat(optPtr->litFreq, MaxLit, 0);
1066 optPtr->litLengthSum = ZSTD_upscaleStat(optPtr->litLengthFreq, MaxLL, 1);
1067 optPtr->matchLengthSum = ZSTD_upscaleStat(optPtr->matchLengthFreq, MaxML, 1);
1068 optPtr->offCodeSum = ZSTD_upscaleStat(optPtr->offCodeFreq, MaxOff, 1);
1112 optPtr->litLengthSum = ZSTD_upscaleStat(optPtr->litLengthFreq, MaxLL, 0);
1113 optPtr->matchLengthSum = ZSTD_upscaleStat(optPtr->matchLengthFreq, MaxML, 0);
1114 optPtr->offCodeSum = ZSTD_upscaleStat(optPtr->offCodeFreq, MaxOff, 0);
1115 }
1116
1117 /* ZSTD_initStats_ultra():
1118 * make a first compression pass, just to seed stats with more accurate starting values.
1119 * only works on the first block, with no dictionary and no ldm.
1120 * this function cannot fail, hence its contract must be respected.
1121 */
1122 static void
1123 ZSTD_initStats_ultra(ZSTD_matchState_t* ms,
1124 seqStore_t* seqStore,
1125 U32 rep[ZSTD_REP_NUM],
1126 const void* src, size_t srcSize)
1127 {
1128 U32 tmpRep[ZSTD_REP_NUM]; /* updated rep codes will sink here */
1129 memcpy(tmpRep, rep, sizeof(tmpRep));
1130
1131 DEBUGLOG(4, "ZSTD_initStats_ultra (srcSize=%zu)", srcSize);
1132 assert(ms->opt.litLengthSum == 0); /* first block */
1133 assert(seqStore->sequences == seqStore->sequencesStart); /* no ldm */
1134 assert(ms->window.dictLimit == ms->window.lowLimit); /* no dictionary */
1135 assert(ms->window.dictLimit - ms->nextToUpdate <= 1); /* no prefix (note: intentional overflow, defined as 2-complement) */
1136
1137 ZSTD_compressBlock_opt_generic(ms, seqStore, tmpRep, src, srcSize, 2 /*optLevel*/, ZSTD_noDict); /* generate stats into ms->opt*/
1138
1139 /* invalidate first scan from history */
1140 ZSTD_resetSeqStore(seqStore);
1141 ms->window.base -= srcSize;
1142 ms->window.dictLimit += (U32)srcSize;
1143 ms->window.lowLimit = ms->window.dictLimit;
1144 ms->nextToUpdate = ms->window.dictLimit;
1145 ms->nextToUpdate3 = ms->window.dictLimit;
1146
1147 /* re-inforce weight of collected statistics */
1148 ZSTD_upscaleStats(&ms->opt);
1069 1149 }
1070 1150
1071 1151 size_t ZSTD_compressBlock_btultra(
@@ -1073,33 +1153,34 b' size_t ZSTD_compressBlock_btultra('
1073 1153 const void* src, size_t srcSize)
1074 1154 {
1075 1155 DEBUGLOG(5, "ZSTD_compressBlock_btultra (srcSize=%zu)", srcSize);
1076 #if 0
1077 /* 2-pass strategy (disabled)
1156 return ZSTD_compressBlock_opt_generic(ms, seqStore, rep, src, srcSize, 2 /*optLevel*/, ZSTD_noDict);
1157 }
1158
1159 size_t ZSTD_compressBlock_btultra2(
1160 ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM],
1161 const void* src, size_t srcSize)
1162 {
1163 U32 const current = (U32)((const BYTE*)src - ms->window.base);
1164 DEBUGLOG(5, "ZSTD_compressBlock_btultra2 (srcSize=%zu)", srcSize);
1165
1166 /* 2-pass strategy:
1078 1167 * this strategy makes a first pass over the first block to collect statistics
1079 1168 * and seed the next round's statistics with them.
1169 * After the 1st pass, the function forgets everything and starts a new block.
1170 * Consequently, this can only work if no data has been previously loaded into the tables,
1171 * i.e., no dictionary, no prefix, no ldm preprocessing.
1080 1172 * The compression ratio gain is generally small (~0.5% on first block),
1081 1173 * the cost is 2x cpu time on first block. */
1082 1174 assert(srcSize <= ZSTD_BLOCKSIZE_MAX);
1083 1175 if ( (ms->opt.litLengthSum==0) /* first block */
1084 && (seqStore->sequences == seqStore->sequencesStart) /* no ldm */
1085 && (ms->window.dictLimit == ms->window.lowLimit) ) { /* no dictionary */
1086 U32 tmpRep[ZSTD_REP_NUM];
1087 DEBUGLOG(5, "ZSTD_compressBlock_btultra: first block: collecting statistics");
1088 assert(ms->nextToUpdate >= ms->window.dictLimit
1089 && ms->nextToUpdate <= ms->window.dictLimit + 1);
1090 memcpy(tmpRep, rep, sizeof(tmpRep));
1091 ZSTD_compressBlock_opt_generic(ms, seqStore, tmpRep, src, srcSize, 2 /*optLevel*/, ZSTD_noDict); /* generate stats into ms->opt*/
1092 ZSTD_resetSeqStore(seqStore);
1093 /* invalidate first scan from history */
1094 ms->window.base -= srcSize;
1095 ms->window.dictLimit += (U32)srcSize;
1096 ms->window.lowLimit = ms->window.dictLimit;
1097 ms->nextToUpdate = ms->window.dictLimit;
1098 ms->nextToUpdate3 = ms->window.dictLimit;
1099 /* re-inforce weight of collected statistics */
1100 ZSTD_upscaleStats(&ms->opt);
1176 && (seqStore->sequences == seqStore->sequencesStart) /* no ldm */
1177 && (ms->window.dictLimit == ms->window.lowLimit) /* no dictionary */
1178 && (current == ms->window.dictLimit) /* start of frame, nothing already loaded nor skipped */
1179 && (srcSize > ZSTD_PREDEF_THRESHOLD)
1180 ) {
1181 ZSTD_initStats_ultra(ms, seqStore, rep, src, srcSize);
1101 1182 }
1102 #endif
1183
1103 1184 return ZSTD_compressBlock_opt_generic(ms, seqStore, rep, src, srcSize, 2 /*optLevel*/, ZSTD_noDict);
1104 1185 }
1105 1186
@@ -1130,3 +1211,7 b' size_t ZSTD_compressBlock_btultra_extDic'
1130 1211 {
1131 1212 return ZSTD_compressBlock_opt_generic(ms, seqStore, rep, src, srcSize, 2 /*optLevel*/, ZSTD_extDict);
1132 1213 }
1214
1215 /* note : no btultra2 variant for extDict or dictMatchState,
1216 * because btultra2 is not meant to work with dictionaries
1217 * and only applies to the first block (no prefix) */
@@ -26,6 +26,10 b' size_t ZSTD_compressBlock_btopt('
26 26 size_t ZSTD_compressBlock_btultra(
27 27 ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM],
28 28 void const* src, size_t srcSize);
29 size_t ZSTD_compressBlock_btultra2(
30 ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM],
31 void const* src, size_t srcSize);
32
29 33
30 34 size_t ZSTD_compressBlock_btopt_dictMatchState(
31 35 ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM],
@@ -41,6 +45,10 b' size_t ZSTD_compressBlock_btultra_extDic'
41 45 ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM],
42 46 void const* src, size_t srcSize);
43 47
48 /* note : no btultra2 variant for extDict or dictMatchState,
49 * because btultra2 is not meant to work with dictionaries
50 * and only applies to the first block (no prefix) */
51
44 52 #if defined (__cplusplus)
45 53 }
46 54 #endif
@@ -9,21 +9,19 b''
9 9 */
10 10
11 11
12 /* ====== Tuning parameters ====== */
13 #define ZSTDMT_NBWORKERS_MAX 200
14 #define ZSTDMT_JOBSIZE_MAX (MEM_32bits() ? (512 MB) : (2 GB)) /* note : limited by `jobSize` type, which is `unsigned` */
15 #define ZSTDMT_OVERLAPLOG_DEFAULT 6
16
17
18 12 /* ====== Compiler specifics ====== */
19 13 #if defined(_MSC_VER)
20 14 # pragma warning(disable : 4204) /* disable: C4204: non-constant aggregate initializer */
21 15 #endif
22 16
23 17
18 /* ====== Constants ====== */
19 #define ZSTDMT_OVERLAPLOG_DEFAULT 0
20
21
24 22 /* ====== Dependencies ====== */
25 23 #include <string.h> /* memcpy, memset */
26 #include <limits.h> /* INT_MAX */
24 #include <limits.h> /* INT_MAX, UINT_MAX */
27 25 #include "pool.h" /* threadpool */
28 26 #include "threading.h" /* mutex */
29 27 #include "zstd_compress_internal.h" /* MIN, ERROR, ZSTD_*, ZSTD_highbit32 */
@@ -57,9 +55,9 b' static unsigned long long GetCurrentCloc'
57 55 static clock_t _ticksPerSecond = 0;
58 56 if (_ticksPerSecond <= 0) _ticksPerSecond = sysconf(_SC_CLK_TCK);
59 57
60 { struct tms junk; clock_t newTicks = (clock_t) times(&junk);
61 return ((((unsigned long long)newTicks)*(1000000))/_ticksPerSecond); }
62 }
58 { struct tms junk; clock_t newTicks = (clock_t) times(&junk);
59 return ((((unsigned long long)newTicks)*(1000000))/_ticksPerSecond);
60 } }
63 61
64 62 #define MUTEX_WAIT_TIME_DLEVEL 6
65 63 #define ZSTD_PTHREAD_MUTEX_LOCK(mutex) { \
@@ -342,8 +340,8 b' static ZSTDMT_seqPool* ZSTDMT_expandSeqP'
342 340
343 341 typedef struct {
344 342 ZSTD_pthread_mutex_t poolMutex;
345 unsigned totalCCtx;
346 unsigned availCCtx;
343 int totalCCtx;
344 int availCCtx;
347 345 ZSTD_customMem cMem;
348 346 ZSTD_CCtx* cctx[1]; /* variable size */
349 347 } ZSTDMT_CCtxPool;
@@ -351,16 +349,16 b' typedef struct {'
351 349 /* note : all CCtx borrowed from the pool should be released back to the pool _before_ freeing the pool */
352 350 static void ZSTDMT_freeCCtxPool(ZSTDMT_CCtxPool* pool)
353 351 {
354 unsigned u;
355 for (u=0; u<pool->totalCCtx; u++)
356 ZSTD_freeCCtx(pool->cctx[u]); /* note : compatible with free on NULL */
352 int cid;
353 for (cid=0; cid<pool->totalCCtx; cid++)
354 ZSTD_freeCCtx(pool->cctx[cid]); /* note : compatible with free on NULL */
357 355 ZSTD_pthread_mutex_destroy(&pool->poolMutex);
358 356 ZSTD_free(pool, pool->cMem);
359 357 }
360 358
361 359 /* ZSTDMT_createCCtxPool() :
362 360 * implies nbWorkers >= 1 , checked by caller ZSTDMT_createCCtx() */
363 static ZSTDMT_CCtxPool* ZSTDMT_createCCtxPool(unsigned nbWorkers,
361 static ZSTDMT_CCtxPool* ZSTDMT_createCCtxPool(int nbWorkers,
364 362 ZSTD_customMem cMem)
365 363 {
366 364 ZSTDMT_CCtxPool* const cctxPool = (ZSTDMT_CCtxPool*) ZSTD_calloc(
@@ -381,7 +379,7 b' static ZSTDMT_CCtxPool* ZSTDMT_createCCt'
381 379 }
382 380
383 381 static ZSTDMT_CCtxPool* ZSTDMT_expandCCtxPool(ZSTDMT_CCtxPool* srcPool,
384 unsigned nbWorkers)
382 int nbWorkers)
385 383 {
386 384 if (srcPool==NULL) return NULL;
387 385 if (nbWorkers <= srcPool->totalCCtx) return srcPool; /* good enough */
@@ -469,9 +467,9 b' static int ZSTDMT_serialState_reset(seri'
469 467 DEBUGLOG(4, "LDM window size = %u KB", (1U << params.cParams.windowLog) >> 10);
470 468 ZSTD_ldm_adjustParameters(&params.ldmParams, &params.cParams);
471 469 assert(params.ldmParams.hashLog >= params.ldmParams.bucketSizeLog);
472 assert(params.ldmParams.hashEveryLog < 32);
470 assert(params.ldmParams.hashRateLog < 32);
473 471 serialState->ldmState.hashPower =
474 ZSTD_ldm_getHashPower(params.ldmParams.minMatchLength);
472 ZSTD_rollingHash_primePower(params.ldmParams.minMatchLength);
475 473 } else {
476 474 memset(&params.ldmParams, 0, sizeof(params.ldmParams));
477 475 }
@@ -674,7 +672,7 b' static void ZSTDMT_compressionJob(void* '
674 672 if (ZSTD_isError(initError)) JOB_ERROR(initError);
675 673 } else { /* srcStart points at reloaded section */
676 674 U64 const pledgedSrcSize = job->firstJob ? job->fullFrameSize : job->src.size;
677 { size_t const forceWindowError = ZSTD_CCtxParam_setParameter(&jobParams, ZSTD_p_forceMaxWindow, !job->firstJob);
675 { size_t const forceWindowError = ZSTD_CCtxParam_setParameter(&jobParams, ZSTD_c_forceMaxWindow, !job->firstJob);
678 676 if (ZSTD_isError(forceWindowError)) JOB_ERROR(forceWindowError);
679 677 }
680 678 { size_t const initError = ZSTD_compressBegin_advanced_internal(cctx,
@@ -777,6 +775,14 b' typedef struct {'
777 775
778 776 static const roundBuff_t kNullRoundBuff = {NULL, 0, 0};
779 777
778 #define RSYNC_LENGTH 32
779
780 typedef struct {
781 U64 hash;
782 U64 hitMask;
783 U64 primePower;
784 } rsyncState_t;
785
780 786 struct ZSTDMT_CCtx_s {
781 787 POOL_ctx* factory;
782 788 ZSTDMT_jobDescription* jobs;
@@ -790,6 +796,7 b' struct ZSTDMT_CCtx_s {'
790 796 inBuff_t inBuff;
791 797 roundBuff_t roundBuff;
792 798 serialState_t serial;
799 rsyncState_t rsync;
793 800 unsigned singleBlockingThread;
794 801 unsigned jobIDMask;
795 802 unsigned doneJobID;
@@ -859,7 +866,7 b' size_t ZSTDMT_CCtxParam_setNbWorkers(ZST'
859 866 {
860 867 if (nbWorkers > ZSTDMT_NBWORKERS_MAX) nbWorkers = ZSTDMT_NBWORKERS_MAX;
861 868 params->nbWorkers = nbWorkers;
862 params->overlapSizeLog = ZSTDMT_OVERLAPLOG_DEFAULT;
869 params->overlapLog = ZSTDMT_OVERLAPLOG_DEFAULT;
863 870 params->jobSize = 0;
864 871 return nbWorkers;
865 872 }
@@ -969,52 +976,59 b' size_t ZSTDMT_sizeof_CCtx(ZSTDMT_CCtx* m'
969 976 }
970 977
971 978 /* Internal only */
972 size_t ZSTDMT_CCtxParam_setMTCtxParameter(ZSTD_CCtx_params* params,
973 ZSTDMT_parameter parameter, unsigned value) {
979 size_t
980 ZSTDMT_CCtxParam_setMTCtxParameter(ZSTD_CCtx_params* params,
981 ZSTDMT_parameter parameter,
982 int value)
983 {
974 984 DEBUGLOG(4, "ZSTDMT_CCtxParam_setMTCtxParameter");
975 985 switch(parameter)
976 986 {
977 987 case ZSTDMT_p_jobSize :
978 DEBUGLOG(4, "ZSTDMT_CCtxParam_setMTCtxParameter : set jobSize to %u", value);
979 if ( (value > 0) /* value==0 => automatic job size */
980 & (value < ZSTDMT_JOBSIZE_MIN) )
988 DEBUGLOG(4, "ZSTDMT_CCtxParam_setMTCtxParameter : set jobSize to %i", value);
989 if ( value != 0 /* default */
990 && value < ZSTDMT_JOBSIZE_MIN)
981 991 value = ZSTDMT_JOBSIZE_MIN;
982 if (value > ZSTDMT_JOBSIZE_MAX)
983 value = ZSTDMT_JOBSIZE_MAX;
992 assert(value >= 0);
993 if (value > ZSTDMT_JOBSIZE_MAX) value = ZSTDMT_JOBSIZE_MAX;
984 994 params->jobSize = value;
985 995 return value;
986 case ZSTDMT_p_overlapSectionLog :
987 if (value > 9) value = 9;
988 DEBUGLOG(4, "ZSTDMT_p_overlapSectionLog : %u", value);
989 params->overlapSizeLog = (value >= 9) ? 9 : value;
996
997 case ZSTDMT_p_overlapLog :
998 DEBUGLOG(4, "ZSTDMT_p_overlapLog : %i", value);
999 if (value < ZSTD_OVERLAPLOG_MIN) value = ZSTD_OVERLAPLOG_MIN;
1000 if (value > ZSTD_OVERLAPLOG_MAX) value = ZSTD_OVERLAPLOG_MAX;
1001 params->overlapLog = value;
990 1002 return value;
1003
1004 case ZSTDMT_p_rsyncable :
1005 value = (value != 0);
1006 params->rsyncable = value;
1007 return value;
1008
991 1009 default :
992 1010 return ERROR(parameter_unsupported);
993 1011 }
994 1012 }
995 1013
996 size_t ZSTDMT_setMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter, unsigned value)
1014 size_t ZSTDMT_setMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter, int value)
997 1015 {
998 1016 DEBUGLOG(4, "ZSTDMT_setMTCtxParameter");
999 switch(parameter)
1000 {
1001 case ZSTDMT_p_jobSize :
1002 return ZSTDMT_CCtxParam_setMTCtxParameter(&mtctx->params, parameter, value);
1003 case ZSTDMT_p_overlapSectionLog :
1004 return ZSTDMT_CCtxParam_setMTCtxParameter(&mtctx->params, parameter, value);
1005 default :
1006 return ERROR(parameter_unsupported);
1007 }
1017 return ZSTDMT_CCtxParam_setMTCtxParameter(&mtctx->params, parameter, value);
1008 1018 }
1009 1019
1010 size_t ZSTDMT_getMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter, unsigned* value)
1020 size_t ZSTDMT_getMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter, int* value)
1011 1021 {
1012 1022 switch (parameter) {
1013 1023 case ZSTDMT_p_jobSize:
1014 *value = mtctx->params.jobSize;
1024 assert(mtctx->params.jobSize <= INT_MAX);
1025 *value = (int)(mtctx->params.jobSize);
1015 1026 break;
1016 case ZSTDMT_p_overlapSectionLog:
1017 *value = mtctx->params.overlapSizeLog;
1027 case ZSTDMT_p_overlapLog:
1028 *value = mtctx->params.overlapLog;
1029 break;
1030 case ZSTDMT_p_rsyncable:
1031 *value = mtctx->params.rsyncable;
1018 1032 break;
1019 1033 default:
1020 1034 return ERROR(parameter_unsupported);
@@ -1140,22 +1154,66 b' size_t ZSTDMT_toFlushNow(ZSTDMT_CCtx* mt'
1140 1154 /* ===== Multi-threaded compression ===== */
1141 1155 /* ------------------------------------------ */
1142 1156
1143 static size_t ZSTDMT_computeTargetJobLog(ZSTD_CCtx_params const params)
1157 static unsigned ZSTDMT_computeTargetJobLog(ZSTD_CCtx_params const params)
1144 1158 {
1145 1159 if (params.ldmParams.enableLdm)
1160 /* In Long Range Mode, the windowLog is typically oversized.
1161 * In that case, it's preferable to determine the jobSize
1162 * based on chainLog instead. */
1146 1163 return MAX(21, params.cParams.chainLog + 4);
1147 1164 return MAX(20, params.cParams.windowLog + 2);
1148 1165 }
1149 1166
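/* Illustration (not from the vendored sources): job sizes produced by the
 * rule above. Without ldm, windowLog=23 gives max(20, 23+2) = 25, i.e. 32 MB
 * jobs. With ldm, a small chainLog=16 under an oversized windowLog=27 gives
 * max(21, 16+4) = 21, i.e. 2 MB jobs, instead of the 512 MB the window-based
 * rule would have produced. */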
1150 static size_t ZSTDMT_computeOverlapLog(ZSTD_CCtx_params const params)
1167 static int ZSTDMT_overlapLog_default(ZSTD_strategy strat)
1151 1168 {
1152 unsigned const overlapRLog = (params.overlapSizeLog>9) ? 0 : 9-params.overlapSizeLog;
1153 if (params.ldmParams.enableLdm)
1154 return (MIN(params.cParams.windowLog, ZSTDMT_computeTargetJobLog(params) - 2) - overlapRLog);
1155 return overlapRLog >= 9 ? 0 : (params.cParams.windowLog - overlapRLog);
1169 switch(strat)
1170 {
1171 case ZSTD_btultra2:
1172 return 9;
1173 case ZSTD_btultra:
1174 case ZSTD_btopt:
1175 return 8;
1176 case ZSTD_btlazy2:
1177 case ZSTD_lazy2:
1178 return 7;
1179 case ZSTD_lazy:
1180 case ZSTD_greedy:
1181 case ZSTD_dfast:
1182 case ZSTD_fast:
1183 default:;
1184 }
1185 return 6;
1156 1186 }
1157 1187
1158 static unsigned ZSTDMT_computeNbJobs(ZSTD_CCtx_params params, size_t srcSize, unsigned nbWorkers) {
1188 static int ZSTDMT_overlapLog(int ovlog, ZSTD_strategy strat)
1189 {
1190 assert(0 <= ovlog && ovlog <= 9);
1191 if (ovlog == 0) return ZSTDMT_overlapLog_default(strat);
1192 return ovlog;
1193 }
1194
1195 static size_t ZSTDMT_computeOverlapSize(ZSTD_CCtx_params const params)
1196 {
1197 int const overlapRLog = 9 - ZSTDMT_overlapLog(params.overlapLog, params.cParams.strategy);
1198 int ovLog = (overlapRLog >= 8) ? 0 : (params.cParams.windowLog - overlapRLog);
1199 assert(0 <= overlapRLog && overlapRLog <= 8);
1200 if (params.ldmParams.enableLdm) {
1201 /* In Long Range Mode, the windowLog is typically oversized.
1202 * In that case, it's preferable to determine the jobSize
1203 * based on chainLog instead.
1204 * Then, ovLog becomes a fraction of the jobSize, rather than windowSize */
1205 ovLog = MIN(params.cParams.windowLog, ZSTDMT_computeTargetJobLog(params) - 2)
1206 - overlapRLog;
1207 }
1208 assert(0 <= ovLog && ovLog <= 30);
1209 DEBUGLOG(4, "overlapLog : %i", params.overlapLog);
1210 DEBUGLOG(4, "overlap size : %i", 1 << ovLog);
1211 return (ovLog==0) ? 0 : (size_t)1 << ovLog;
1212 }
1213
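/* Illustration (not from the vendored sources): overlap sizes produced by the
 * two functions above, for windowLog=23 (8 MB window) and no ldm :
 *     ZSTD_fast     : default ovlog=6 => overlapRLog=3 => ovLog=20 => 1 MB (1/8 window)
 *     ZSTD_btultra2 : default ovlog=9 => overlapRLog=0 => ovLog=23 => 8 MB (full window) */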
1214 static unsigned
1215 ZSTDMT_computeNbJobs(ZSTD_CCtx_params params, size_t srcSize, unsigned nbWorkers)
1216 {
1159 1217 assert(nbWorkers>0);
1160 1218 { size_t const jobSizeTarget = (size_t)1 << ZSTDMT_computeTargetJobLog(params);
1161 1219 size_t const jobMaxSize = jobSizeTarget << 2;
@@ -1178,7 +1236,7 b' static size_t ZSTDMT_compress_advanced_i'
1178 1236 ZSTD_CCtx_params params)
1179 1237 {
1180 1238 ZSTD_CCtx_params const jobParams = ZSTDMT_initJobCCtxParams(params);
1181 size_t const overlapSize = (size_t)1 << ZSTDMT_computeOverlapLog(params);
1239 size_t const overlapSize = ZSTDMT_computeOverlapSize(params);
1182 1240 unsigned const nbJobs = ZSTDMT_computeNbJobs(params, srcSize, params.nbWorkers);
1183 1241 size_t const proposedJobSize = (srcSize + (nbJobs-1)) / nbJobs;
1184 1242 size_t const avgJobSize = (((proposedJobSize-1) & 0x1FFFF) < 0x7FFF) ? proposedJobSize + 0xFFFF : proposedJobSize; /* avoid too small last block */
@@ -1289,16 +1347,17 b' static size_t ZSTDMT_compress_advanced_i'
1289 1347 }
1290 1348
1291 1349 size_t ZSTDMT_compress_advanced(ZSTDMT_CCtx* mtctx,
1292 void* dst, size_t dstCapacity,
1293 const void* src, size_t srcSize,
1294 const ZSTD_CDict* cdict,
1295 ZSTD_parameters params,
1296 unsigned overlapLog)
1350 void* dst, size_t dstCapacity,
1351 const void* src, size_t srcSize,
1352 const ZSTD_CDict* cdict,
1353 ZSTD_parameters params,
1354 int overlapLog)
1297 1355 {
1298 1356 ZSTD_CCtx_params cctxParams = mtctx->params;
1299 1357 cctxParams.cParams = params.cParams;
1300 1358 cctxParams.fParams = params.fParams;
1301 cctxParams.overlapSizeLog = overlapLog;
1359 assert(ZSTD_OVERLAPLOG_MIN <= overlapLog && overlapLog <= ZSTD_OVERLAPLOG_MAX);
1360 cctxParams.overlapLog = overlapLog;
1302 1361 return ZSTDMT_compress_advanced_internal(mtctx,
1303 1362 dst, dstCapacity,
1304 1363 src, srcSize,
@@ -1311,8 +1370,8 b' size_t ZSTDMT_compressCCtx(ZSTDMT_CCtx* '
1311 1370 const void* src, size_t srcSize,
1312 1371 int compressionLevel)
1313 1372 {
1314 U32 const overlapLog = (compressionLevel >= ZSTD_maxCLevel()) ? 9 : ZSTDMT_OVERLAPLOG_DEFAULT;
1315 1373 ZSTD_parameters params = ZSTD_getParams(compressionLevel, srcSize, 0);
1374 int const overlapLog = ZSTDMT_overlapLog_default(params.cParams.strategy);
1316 1375 params.fParams.contentSizeFlag = 1;
1317 1376 return ZSTDMT_compress_advanced(mtctx, dst, dstCapacity, src, srcSize, NULL, params, overlapLog);
1318 1377 }
@@ -1339,8 +1398,8 b' size_t ZSTDMT_initCStream_internal('
1339 1398 if (params.nbWorkers != mtctx->params.nbWorkers)
1340 1399 CHECK_F( ZSTDMT_resize(mtctx, params.nbWorkers) );
1341 1400
1342 if (params.jobSize > 0 && params.jobSize < ZSTDMT_JOBSIZE_MIN) params.jobSize = ZSTDMT_JOBSIZE_MIN;
1343 if (params.jobSize > ZSTDMT_JOBSIZE_MAX) params.jobSize = ZSTDMT_JOBSIZE_MAX;
1401 if (params.jobSize != 0 && params.jobSize < ZSTDMT_JOBSIZE_MIN) params.jobSize = ZSTDMT_JOBSIZE_MIN;
1402 if (params.jobSize > (size_t)ZSTDMT_JOBSIZE_MAX) params.jobSize = ZSTDMT_JOBSIZE_MAX;
1344 1403
1345 1404 mtctx->singleBlockingThread = (pledgedSrcSize <= ZSTDMT_JOBSIZE_MIN); /* do not trigger multi-threading when srcSize is too small */
1346 1405 if (mtctx->singleBlockingThread) {
@@ -1375,14 +1434,24 b' size_t ZSTDMT_initCStream_internal('
1375 1434 mtctx->cdict = cdict;
1376 1435 }
1377 1436
1378 mtctx->targetPrefixSize = (size_t)1 << ZSTDMT_computeOverlapLog(params);
1379 DEBUGLOG(4, "overlapLog=%u => %u KB", params.overlapSizeLog, (U32)(mtctx->targetPrefixSize>>10));
1437 mtctx->targetPrefixSize = ZSTDMT_computeOverlapSize(params);
1438 DEBUGLOG(4, "overlapLog=%i => %u KB", params.overlapLog, (U32)(mtctx->targetPrefixSize>>10));
1380 1439 mtctx->targetSectionSize = params.jobSize;
1381 1440 if (mtctx->targetSectionSize == 0) {
1382 1441 mtctx->targetSectionSize = 1ULL << ZSTDMT_computeTargetJobLog(params);
1383 1442 }
1443 if (params.rsyncable) {
1444 /* Aim for the targetSectionSize as the average job size. */
1445 U32 const jobSizeMB = (U32)(mtctx->targetSectionSize >> 20);
1446 U32 const rsyncBits = ZSTD_highbit32(jobSizeMB) + 20;
1447 assert(jobSizeMB >= 1);
1448 DEBUGLOG(4, "rsyncLog = %u", rsyncBits);
1449 mtctx->rsync.hash = 0;
1450 mtctx->rsync.hitMask = (1ULL << rsyncBits) - 1;
1451 mtctx->rsync.primePower = ZSTD_rollingHash_primePower(RSYNC_LENGTH);
1452 }
1384 1453 if (mtctx->targetSectionSize < mtctx->targetPrefixSize) mtctx->targetSectionSize = mtctx->targetPrefixSize; /* job size must be >= overlap size */
1385 DEBUGLOG(4, "Job Size : %u KB (note : set to %u)", (U32)(mtctx->targetSectionSize>>10), params.jobSize);
1454 DEBUGLOG(4, "Job Size : %u KB (note : set to %u)", (U32)(mtctx->targetSectionSize>>10), (U32)params.jobSize);
1386 1455 DEBUGLOG(4, "inBuff Size : %u KB", (U32)(mtctx->targetSectionSize>>10));
1387 1456 ZSTDMT_setBufferSize(mtctx->bufPool, ZSTD_compressBound(mtctx->targetSectionSize));
1388 1457 {
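To make the rsyncable initialization above concrete, a worked example with illustrative numbers:

    /* Illustrative only: suppose targetSectionSize is 32 MB. */
    U32 const jobSizeMB = 32;                              /* 32 MB >> 20 */
    U32 const rsyncBits = ZSTD_highbit32(jobSizeMB) + 20;  /* 5 + 20 = 25 */
    U64 const hitMask   = (1ULL << rsyncBits) - 1;         /* low 25 bits set */
    /* (hash & hitMask) == hitMask matches with probability 2^-25 per input
     * byte for a well-mixed hash, i.e. on average once every 2^25 bytes
     * = 32 MB -- one synchronization point per average job size. */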
@@ -1818,6 +1887,89 b' static int ZSTDMT_tryGetInputRange(ZSTDM'
1818 1887 return 1;
1819 1888 }
1820 1889
1890 typedef struct {
1891 size_t toLoad; /* The number of bytes to load from the input. */
1892 int flush; /* Boolean declaring if we must flush because we found a synchronization point. */
1893 } syncPoint_t;
1894
1895 /**
1896 * Searches through the input for a synchronization point. If one is found, we
1897 * will instruct the caller to flush, and return the number of bytes to load.
1898 * Otherwise, we will load as many bytes as possible and instruct the caller
1899 * to continue as normal.
1900 */
1901 static syncPoint_t
1902 findSynchronizationPoint(ZSTDMT_CCtx const* mtctx, ZSTD_inBuffer const input)
1903 {
1904 BYTE const* const istart = (BYTE const*)input.src + input.pos;
1905 U64 const primePower = mtctx->rsync.primePower;
1906 U64 const hitMask = mtctx->rsync.hitMask;
1907
1908 syncPoint_t syncPoint;
1909 U64 hash;
1910 BYTE const* prev;
1911 size_t pos;
1912
1913 syncPoint.toLoad = MIN(input.size - input.pos, mtctx->targetSectionSize - mtctx->inBuff.filled);
1914 syncPoint.flush = 0;
1915 if (!mtctx->params.rsyncable)
1916 /* Rsync is disabled. */
1917 return syncPoint;
1918 if (mtctx->inBuff.filled + syncPoint.toLoad < RSYNC_LENGTH)
1919 /* Not enough to compute the hash.
1920 * We will miss any synchronization points in this RSYNC_LENGTH byte
1921 * window. However, since it depends only in the internal buffers, if the
1922 * state is already synchronized, we will remain synchronized.
1923 * Additionally, the probability that we miss a synchronization point is
1924 * low: RSYNC_LENGTH / targetSectionSize.
1925 */
1926 return syncPoint;
1927 /* Initialize the loop variables. */
1928 if (mtctx->inBuff.filled >= RSYNC_LENGTH) {
1929 /* We have enough bytes buffered to initialize the hash.
1930 * Start scanning at the beginning of the input.
1931 */
1932 pos = 0;
1933 prev = (BYTE const*)mtctx->inBuff.buffer.start + mtctx->inBuff.filled - RSYNC_LENGTH;
1934 hash = ZSTD_rollingHash_compute(prev, RSYNC_LENGTH);
1935 } else {
1936 /* We don't have enough bytes buffered to initialize the hash, but
1937 * we know we have at least RSYNC_LENGTH bytes total.
1938 * Start scanning after the first RSYNC_LENGTH bytes, less the bytes
1939 * already buffered.
1940 */
1941 pos = RSYNC_LENGTH - mtctx->inBuff.filled;
1942 prev = (BYTE const*)mtctx->inBuff.buffer.start - pos;
1943 hash = ZSTD_rollingHash_compute(mtctx->inBuff.buffer.start, mtctx->inBuff.filled);
1944 hash = ZSTD_rollingHash_append(hash, istart, pos);
1945 }
1946 /* Starting with the hash of the previous RSYNC_LENGTH bytes, roll
1947 * through the input. If we hit a synchronization point, then cut the
1948 * job off, and tell the compressor to flush the job. Otherwise, load
1949 * all the bytes and continue as normal.
1950 * If we go too long without a synchronization point (targetSectionSize)
1951 * then a block will be emitted anyway, but this is okay, since if we
1952 * are already synchronized we will remain synchronized.
1953 */
1954 for (; pos < syncPoint.toLoad; ++pos) {
1955 BYTE const toRemove = pos < RSYNC_LENGTH ? prev[pos] : istart[pos - RSYNC_LENGTH];
1956 /* if (pos >= RSYNC_LENGTH) assert(ZSTD_rollingHash_compute(istart + pos - RSYNC_LENGTH, RSYNC_LENGTH) == hash); */
1957 hash = ZSTD_rollingHash_rotate(hash, toRemove, istart[pos], primePower);
1958 if ((hash & hitMask) == hitMask) {
1959 syncPoint.toLoad = pos + 1;
1960 syncPoint.flush = 1;
1961 break;
1962 }
1963 }
1964 return syncPoint;
1965 }
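The scan relies on the rolling-hash property hinted at by the commented-out assert above: rotating one byte through the hash must equal recomputing the hash of the window shifted by one. A minimal sketch using the same internal helpers (buffer name hypothetical):

    /* Sketch: `buf` is any buffer holding at least RSYNC_LENGTH + 1 bytes. */
    U64 const pp = ZSTD_rollingHash_primePower(RSYNC_LENGTH);
    U64 h = ZSTD_rollingHash_compute(buf, RSYNC_LENGTH);   /* buf[0 .. RSYNC_LENGTH) */
    h = ZSTD_rollingHash_rotate(h, buf[0], buf[RSYNC_LENGTH], pp);
    assert(h == ZSTD_rollingHash_compute(buf + 1, RSYNC_LENGTH)); /* shifted window */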
1966
1967 size_t ZSTDMT_nextInputSizeHint(const ZSTDMT_CCtx* mtctx)
1968 {
1969 size_t hintInSize = mtctx->targetSectionSize - mtctx->inBuff.filled;
1970 if (hintInSize==0) hintInSize = mtctx->targetSectionSize;
1971 return hintInSize;
1972 }
1821 1973
1822 1974 /** ZSTDMT_compressStream_generic() :
1823 1975 * internal use only - exposed to be invoked from zstd_compress.c
@@ -1844,7 +1996,8 b' size_t ZSTDMT_compressStream_generic(ZST'
1844 1996 }
1845 1997
1846 1998 /* single-pass shortcut (note : synchronous-mode) */
1847 if ( (mtctx->nextJobID == 0) /* just started */
1999 if ( (!mtctx->params.rsyncable) /* rsyncable mode is disabled */
2000 && (mtctx->nextJobID == 0) /* just started */
1848 2001 && (mtctx->inBuff.filled == 0) /* nothing buffered */
1849 2002 && (!mtctx->jobReady) /* no job already created */
1850 2003 && (endOp == ZSTD_e_end) /* end order */
@@ -1876,14 +2029,17 b' size_t ZSTDMT_compressStream_generic(ZST'
1876 2029 DEBUGLOG(5, "ZSTDMT_tryGetInputRange completed successfully : mtctx->inBuff.buffer.start = %p", mtctx->inBuff.buffer.start);
1877 2030 }
1878 2031 if (mtctx->inBuff.buffer.start != NULL) {
1879 size_t const toLoad = MIN(input->size - input->pos, mtctx->targetSectionSize - mtctx->inBuff.filled);
2032 syncPoint_t const syncPoint = findSynchronizationPoint(mtctx, *input);
2033 if (syncPoint.flush && endOp == ZSTD_e_continue) {
2034 endOp = ZSTD_e_flush;
2035 }
1880 2036 assert(mtctx->inBuff.buffer.capacity >= mtctx->targetSectionSize);
1881 2037 DEBUGLOG(5, "ZSTDMT_compressStream_generic: adding %u bytes on top of %u to buffer of size %u",
1882 (U32)toLoad, (U32)mtctx->inBuff.filled, (U32)mtctx->targetSectionSize);
1883 memcpy((char*)mtctx->inBuff.buffer.start + mtctx->inBuff.filled, (const char*)input->src + input->pos, toLoad);
1884 input->pos += toLoad;
1885 mtctx->inBuff.filled += toLoad;
1886 forwardInputProgress = toLoad>0;
2038 (U32)syncPoint.toLoad, (U32)mtctx->inBuff.filled, (U32)mtctx->targetSectionSize);
2039 memcpy((char*)mtctx->inBuff.buffer.start + mtctx->inBuff.filled, (const char*)input->src + input->pos, syncPoint.toLoad);
2040 input->pos += syncPoint.toLoad;
2041 mtctx->inBuff.filled += syncPoint.toLoad;
2042 forwardInputProgress = syncPoint.toLoad>0;
1887 2043 }
1888 2044 if ((input->pos < input->size) && (endOp == ZSTD_e_end))
1889 2045 endOp = ZSTD_e_flush; /* can't end now : not all input consumed */
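From the caller's perspective the contract is unchanged: when a synchronization point is found mid-buffer, only syncPoint.toLoad bytes are consumed and the promoted flush ends the job at that point, so the usual retry loop still works. A hedged sketch (names hypothetical; assumes dstCapacity is large enough that output never stalls):

    /* Sketch only: push `src` through the multi-threaded streaming API.
     * With rsyncable enabled, partial input consumption is expected. */
    static size_t pushAll(ZSTDMT_CCtx* mtctx,
                          void* dst, size_t dstCapacity,
                          const void* src, size_t srcSize)
    {
        ZSTD_outBuffer out = { dst, dstCapacity, 0 };
        ZSTD_inBuffer in = { src, srcSize, 0 };
        while (in.pos < in.size) {
            size_t const ret = ZSTDMT_compressStream_generic(mtctx, &out, &in, ZSTD_e_continue);
            if (ZSTD_isError(ret)) return ret;  /* error code from zstd */
        }
        return 0;
    }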
@@ -28,6 +28,16 b''
28 28 #include "zstd.h" /* ZSTD_inBuffer, ZSTD_outBuffer, ZSTDLIB_API */
29 29
30 30
31 /* === Constants === */
32 #ifndef ZSTDMT_NBWORKERS_MAX
33 # define ZSTDMT_NBWORKERS_MAX 200
34 #endif
35 #ifndef ZSTDMT_JOBSIZE_MIN
36 # define ZSTDMT_JOBSIZE_MIN (1 MB)
37 #endif
38 #define ZSTDMT_JOBSIZE_MAX (MEM_32bits() ? (512 MB) : (1024 MB))
39
40
31 41 /* === Memory management === */
32 42 typedef struct ZSTDMT_CCtx_s ZSTDMT_CCtx;
33 43 ZSTDLIB_API ZSTDMT_CCtx* ZSTDMT_createCCtx(unsigned nbWorkers);
@@ -52,6 +62,7 b' ZSTDLIB_API size_t ZSTDMT_compressCCtx(Z'
52 62 ZSTDLIB_API size_t ZSTDMT_initCStream(ZSTDMT_CCtx* mtctx, int compressionLevel);
53 63 ZSTDLIB_API size_t ZSTDMT_resetCStream(ZSTDMT_CCtx* mtctx, unsigned long long pledgedSrcSize); /**< if srcSize is not known at reset time, use ZSTD_CONTENTSIZE_UNKNOWN. Note: for compatibility with older programs, 0 means the same as ZSTD_CONTENTSIZE_UNKNOWN, but it will change in the future to mean "empty" */
54 64
65 ZSTDLIB_API size_t ZSTDMT_nextInputSizeHint(const ZSTDMT_CCtx* mtctx);
55 66 ZSTDLIB_API size_t ZSTDMT_compressStream(ZSTDMT_CCtx* mtctx, ZSTD_outBuffer* output, ZSTD_inBuffer* input);
56 67
57 68 ZSTDLIB_API size_t ZSTDMT_flushStream(ZSTDMT_CCtx* mtctx, ZSTD_outBuffer* output); /**< @return : 0 == all flushed; >0 : still some data to be flushed; or an error code (ZSTD_isError()) */
@@ -60,16 +71,12 b' ZSTDLIB_API size_t ZSTDMT_endStream(ZSTD'
60 71
61 72 /* === Advanced functions and parameters === */
62 73
63 #ifndef ZSTDMT_JOBSIZE_MIN
64 # define ZSTDMT_JOBSIZE_MIN (1U << 20) /* 1 MB - Minimum size of each compression job */
65 #endif
66
67 74 ZSTDLIB_API size_t ZSTDMT_compress_advanced(ZSTDMT_CCtx* mtctx,
68 75 void* dst, size_t dstCapacity,
69 76 const void* src, size_t srcSize,
70 77 const ZSTD_CDict* cdict,
71 78 ZSTD_parameters params,
72 unsigned overlapLog);
79 int overlapLog);
73 80
74 81 ZSTDLIB_API size_t ZSTDMT_initCStream_advanced(ZSTDMT_CCtx* mtctx,
75 82 const void* dict, size_t dictSize, /* dict can be released after init, a local copy is preserved within zcs */
@@ -84,8 +91,9 b' ZSTDLIB_API size_t ZSTDMT_initCStream_us'
84 91 /* ZSTDMT_parameter :
85 92 * List of parameters that can be set using ZSTDMT_setMTCtxParameter() */
86 93 typedef enum {
87 ZSTDMT_p_jobSize, /* Each job is compressed in parallel. By default, this value is dynamically determined depending on compression parameters. Can be set explicitly here. */
88 ZSTDMT_p_overlapSectionLog /* Each job may reload a part of previous job to enhance compression ratio; 0 == no overlap, 6 (default) == use 1/8th of window, >=9 == use full window. This is a "sticky" parameter : its value will be re-used on next compression job */
94 ZSTDMT_p_jobSize, /* Each job is compressed in parallel. By default, this value is dynamically determined depending on compression parameters. Can be set explicitly here. */
95 ZSTDMT_p_overlapLog, /* Each job may reload a part of previous job to enhance compression ratio; 0 == no overlap, 6 (default) == use 1/8th of window, >=9 == use full window. This is a "sticky" parameter : its value will be re-used on next compression job */
96 ZSTDMT_p_rsyncable /* Enables rsyncable mode. */
89 97 } ZSTDMT_parameter;
90 98
91 99 /* ZSTDMT_setMTCtxParameter() :
@@ -93,12 +101,12 b' typedef enum {'
93 101 * The function must be called typically after ZSTD_createCCtx() but __before ZSTDMT_init*() !__
94 102 * Parameters not explicitly reset by ZSTDMT_init*() remain the same in consecutive compression sessions.
95 103 * @return : 0, or an error code (which can be tested using ZSTD_isError()) */
96 ZSTDLIB_API size_t ZSTDMT_setMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter, unsigned value);
104 ZSTDLIB_API size_t ZSTDMT_setMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter, int value);
97 105
98 106 /* ZSTDMT_getMTCtxParameter() :
99 107 * Query the ZSTDMT_CCtx for a parameter value.
100 108 * @return : 0, or an error code (which can be tested using ZSTD_isError()) */
101 ZSTDLIB_API size_t ZSTDMT_getMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter, unsigned* value);
109 ZSTDLIB_API size_t ZSTDMT_getMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter, int* value);
102 110
103 111
104 112 /*! ZSTDMT_compressStream_generic() :
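A short usage sketch for the renamed, now int-typed parameter API (values illustrative; setters must run before any ZSTDMT_init*() call, per the comment above):

    ZSTDMT_CCtx* mtctx = ZSTDMT_createCCtx(4 /* workers */);
    size_t err = ZSTDMT_setMTCtxParameter(mtctx, ZSTDMT_p_overlapLog, 6); /* 1/8th of window */
    if (!ZSTD_isError(err))
        err = ZSTDMT_setMTCtxParameter(mtctx, ZSTDMT_p_rsyncable, 1);     /* enable rsyncable */
    {   int value = 0;
        if (!ZSTD_isError(ZSTDMT_getMTCtxParameter(mtctx, ZSTDMT_p_rsyncable, &value)))
            assert(value == 1);  /* assumed: getter mirrors the setter */
    }
    ZSTDMT_freeCCtx(mtctx);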
@@ -129,7 +137,7 b' size_t ZSTDMT_toFlushNow(ZSTDMT_CCtx* mt'
129 137
130 138 /*! ZSTDMT_CCtxParam_setMTCtxParameter()
131 139 * like ZSTDMT_setMTCtxParameter(), but into a ZSTD_CCtx_Params */
132 size_t ZSTDMT_CCtxParam_setMTCtxParameter(ZSTD_CCtx_params* params, ZSTDMT_parameter parameter, unsigned value);
140 size_t ZSTDMT_CCtxParam_setMTCtxParameter(ZSTD_CCtx_params* params, ZSTDMT_parameter parameter, int value);
133 141
134 142 /*! ZSTDMT_CCtxParam_setNbWorkers()
135 143 * Set nbWorkers, and clamp it.