The requested changes are too big and content was truncated. Five new files
(mode 100644) were added with no content shown.
@@ -62,6 +62,11 @@ contrib/python-zstandard/zstd/compress/z
 contrib/python-zstandard/zstd/compress/zstd_opt.c
 contrib/python-zstandard/zstd/compress/zstd_opt.h
 contrib/python-zstandard/zstd/decompress/huf_decompress.c
+contrib/python-zstandard/zstd/decompress/zstd_ddict.c
+contrib/python-zstandard/zstd/decompress/zstd_ddict.h
+contrib/python-zstandard/zstd/decompress/zstd_decompress_block.c
+contrib/python-zstandard/zstd/decompress/zstd_decompress_block.h
+contrib/python-zstandard/zstd/decompress/zstd_decompress_internal.h
 contrib/python-zstandard/zstd/decompress/zstd_decompress.c
 contrib/python-zstandard/zstd/deprecated/zbuff_common.c
 contrib/python-zstandard/zstd/deprecated/zbuff_compress.c
@@ -5,6 +5,5 @@ graft tests
 include make_cffi.py
 include setup_zstd.py
 include zstd.c
-include zstd_cffi.py
 include LICENSE
 include NEWS.rst
@@ -8,8 +8,18 @@ 1.0.0 (not yet released)
 Actions Blocking Release
 ------------------------
 
-* compression and decompression APIs that support ``io.
+* compression and decompression APIs that support ``io.RawIOBase`` interface
   (#13).
+* ``stream_writer()`` APIs should support ``io.RawIOBase`` interface.
+* Properly handle non-blocking I/O and partial writes for objects implementing
+  ``io.RawIOBase``.
+* Make ``write_return_read=True`` the default for objects implementing
+  ``io.RawIOBase``.
+* Audit for consistent and proper behavior of ``flush()`` and ``close()`` for
+  all objects implementing ``io.RawIOBase``. Is calling ``close()`` on the
+  wrapped stream acceptable, should ``__exit__`` always call ``close()``,
+  should ``close()`` imply ``flush()``, etc.
+* Consider making reads across frames configurable behavior.
 * Refactor module names so C and CFFI extensions live under ``zstandard``
   package.
 * Overall API design review.
@@ -43,6 +53,11 @@ Actions Blocking Release
 * Consider a ``chunker()`` API for decompression.
 * Consider stats for ``chunker()`` API, including finding the last consumed
   offset of input data.
+* Consider exposing ``ZSTD_cParam_getBounds()`` and
+  ``ZSTD_dParam_getBounds()`` APIs.
+* Consider controls over resetting compression contexts (session only,
+  parameters, or session and parameters).
+* Actually use the CFFI backend in fuzzing tests.
 
 Other Actions Not Blocking Release
 ---------------------------------------
@@ -51,6 +66,207 @@ Other Actions Not Blocking Release
 
 * API for ensuring max memory ceiling isn't exceeded.
 * Move off nose for testing.
 
+0.11.0 (released 2019-02-24)
+============================
+
+Backwards Compatibility Notes
+-----------------------------
+
+* ``ZstdDecompressor.read()`` now allows reading sizes of ``-1`` or ``0``
+  and defaults to ``-1``, per the documented behavior of
+  ``io.RawIOBase.read()``. Previously, we required an argument that was
+  a positive value.
+* The ``readline()``, ``readlines()``, ``__iter__``, and ``__next__`` methods
+  of ``ZstdDecompressionReader()`` now raise ``io.UnsupportedOperation``
+  instead of ``NotImplementedError``.
+* ``ZstdDecompressor.stream_reader()`` now accepts a ``read_across_frames``
+  argument. The default value will likely be changed in a future release
+  and consumers are advised to pass the argument to avoid unwanted change
+  of behavior in the future.
+* ``setup.py`` now always disables the CFFI backend if the installed
+  CFFI package does not meet the minimum version requirements. Before, it was
+  possible for the CFFI backend to be generated and a run-time error to
+  occur.
+* In the CFFI backend, ``CompressionReader`` and ``DecompressionReader``
+  were renamed to ``ZstdCompressionReader`` and ``ZstdDecompressionReader``,
+  respectively, so naming is identical to the C extension. This should have
+  no meaningful end-user impact, as instances aren't meant to be
+  constructed directly.
+* ``ZstdDecompressor.stream_writer()`` now accepts a ``write_return_read``
+  argument to control whether ``write()`` returns the number of bytes
+  read from the source / written to the decompressor. It defaults to off,
+  which preserves the existing behavior of returning the number of bytes
+  emitted from the decompressor. The default will change in a future release
+  so behavior aligns with the specified behavior of ``io.RawIOBase``.
+* ``ZstdDecompressionWriter.__exit__`` now calls ``self.close()``. This
+  will result in that stream plus the underlying stream being closed as
+  well. If this behavior is not desirable, do not use instances as
+  context managers.
+* ``ZstdCompressor.stream_writer()`` now accepts a ``write_return_read``
+  argument to control whether ``write()`` returns the number of bytes read
+  from the source / written to the compressor. It defaults to off, which
+  preserves the existing behavior of returning the number of bytes emitted
+  from the compressor. The default will change in a future release so
+  behavior aligns with the specified behavior of ``io.RawIOBase``.
+* ``ZstdCompressionWriter.__exit__`` now calls ``self.close()``. This will
+  result in that stream plus any underlying stream being closed as well. If
+  this behavior is not desirable, do not use instances as context managers.
+* ``ZstdDecompressionWriter`` no longer requires being used as a context
+  manager (#57).
+* ``ZstdCompressionWriter`` no longer requires being used as a context
+  manager (#57).
+* The ``overlap_size_log`` attribute on ``CompressionParameters`` instances
+  has been deprecated and will be removed in a future release. The
+  ``overlap_log`` attribute should be used instead.
+* The ``overlap_size_log`` argument to ``CompressionParameters`` has been
+  deprecated and will be removed in a future release. The ``overlap_log``
+  argument should be used instead.
+* The ``ldm_hash_every_log`` attribute on ``CompressionParameters`` instances
+  has been deprecated and will be removed in a future release. The
+  ``ldm_hash_rate_log`` attribute should be used instead.
+* The ``ldm_hash_every_log`` argument to ``CompressionParameters`` has been
+  deprecated and will be removed in a future release. The ``ldm_hash_rate_log``
+  argument should be used instead.
+* The ``compression_strategy`` argument to ``CompressionParameters`` has been
+  deprecated and will be removed in a future release. The ``strategy``
+  argument should be used instead.
+* The ``SEARCHLENGTH_MIN`` and ``SEARCHLENGTH_MAX`` constants are deprecated
+  and will be removed in a future release. Use ``MINMATCH_MIN`` and
+  ``MINMATCH_MAX`` instead.
+* The ``zstd_cffi`` module has been renamed to ``zstandard.cffi``. As had
+  been documented in the ``README`` file since the ``0.9.0`` release, the
+  module should not be imported directly at its new location. Instead,
+  ``import zstandard`` to cause an appropriate backend module to be loaded
+  automatically.
+
+Bug Fixes
+---------
+
+* CFFI backend could encounter a failure when sending an empty chunk into
+  ``ZstdDecompressionObj.decompress()``. The issue has been fixed.
+* CFFI backend could encounter an error when calling
+  ``ZstdDecompressionReader.read()`` if there was data remaining in an
+  internal buffer. The issue has been fixed. (#71)
+
+Changes
+-------
+
+* ``ZstdDecompressionObj.decompress()`` now properly handles empty inputs in
+  the CFFI backend.
+* ``ZstdCompressionReader`` now implements ``read1()`` and ``readinto1()``.
+  These are part of the ``io.BufferedIOBase`` interface.
+* ``ZstdCompressionReader`` has gained a ``readinto(b)`` method for reading
+  compressed output into an existing buffer.
+* ``ZstdCompressionReader.read()`` now defaults to ``size=-1`` and accepts
+  read sizes of ``-1`` and ``0``. The new behavior aligns with the documented
+  behavior of ``io.RawIOBase``.
+* ``ZstdCompressionReader`` now implements ``readall()``. Previously, this
+  method raised ``NotImplementedError``.
+* ``ZstdDecompressionReader`` now implements ``read1()`` and ``readinto1()``.
+  These are part of the ``io.BufferedIOBase`` interface.
+* ``ZstdDecompressionReader.read()`` now defaults to ``size=-1`` and accepts
+  read sizes of ``-1`` and ``0``. The new behavior aligns with the documented
+  behavior of ``io.RawIOBase``.
+* ``ZstdDecompressionReader()`` now implements ``readall()``. Previously, this
+  method raised ``NotImplementedError``.
+* The ``readline()``, ``readlines()``, ``__iter__``, and ``__next__`` methods
+  of ``ZstdDecompressionReader()`` now raise ``io.UnsupportedOperation``
+  instead of ``NotImplementedError``. This reflects a decision to never
+  implement text-based I/O on (de)compressors and keep the low-level API
+  operating in the binary domain. (#13)
+* ``README.rst`` now documents how to achieve linewise iteration using
+  an ``io.TextIOWrapper`` with a ``ZstdDecompressionReader``.
+* ``ZstdDecompressionReader`` has gained a ``readinto(b)`` method for
+  reading decompressed output into an existing buffer. This allows chaining
+  to an ``io.TextIOWrapper`` on Python 3 without using an ``io.BufferedReader``.
+* ``ZstdDecompressor.stream_reader()`` now accepts a ``read_across_frames``
+  argument to control behavior when the input data has multiple zstd
+  *frames*. When ``False`` (the default for backwards compatibility), a
+  ``read()`` will stop when the end of a zstd *frame* is encountered. When
+  ``True``, ``read()`` can potentially return data spanning multiple zstd
+  *frames*. The default will likely be changed to ``True`` in a future
+  release.
+* ``setup.py`` now performs CFFI version sniffing and disables the CFFI
+  backend if CFFI is too old. Previously, we only used ``install_requires``
+  to enforce the CFFI version and not all build modes would properly enforce
+  the minimum CFFI version. (#69)
+* CFFI's ``ZstdDecompressionReader.read()`` now properly handles data
+  remaining in any internal buffer. Before, repeated ``read()`` could
+  result in *random* errors. (#71)
+* Upgraded various Python packages in CI environment.
+* Upgraded to hypothesis 4.5.11.
+* In the CFFI backend, ``CompressionReader`` and ``DecompressionReader``
+  were renamed to ``ZstdCompressionReader`` and ``ZstdDecompressionReader``,
+  respectively.
+* ``ZstdDecompressor.stream_writer()`` now accepts a ``write_return_read``
+  argument to control whether ``write()`` returns the number of bytes read
+  from the source. It defaults to ``False`` to preserve backwards
+  compatibility.
+* ``ZstdDecompressor.stream_writer()`` now implements the ``io.RawIOBase``
+  interface and behaves as a proper stream object.
+* ``ZstdCompressor.stream_writer()`` now accepts a ``write_return_read``
+  argument to control whether ``write()`` returns the number of bytes read
+  from the source. It defaults to ``False`` to preserve backwards
+  compatibility.
+* ``ZstdCompressionWriter`` now implements the ``io.RawIOBase`` interface and
+  behaves as a proper stream object. ``close()`` will now close the stream
+  and the underlying stream (if possible). ``__exit__`` will now call
+  ``close()``. Methods like ``writable()`` and ``fileno()`` are implemented.
+* ``ZstdDecompressionWriter`` no longer must be used as a context manager.
+* ``ZstdCompressionWriter`` no longer must be used as a context manager.
+  When not using as a context manager, it is important to call
+  ``flush(FLUSH_FRAME)`` or the compression stream won't be properly
+  terminated and decoders may complain about malformed input.
+* ``ZstdCompressionWriter.flush()`` (what is returned from
+  ``ZstdCompressor.stream_writer()``) now accepts an argument controlling the
+  flush behavior. Its value can be one of the new constants
+  ``FLUSH_BLOCK`` or ``FLUSH_FRAME``.
+* ``ZstdDecompressionObj`` instances now have a ``flush([length=None])`` method.
+  This provides parity with standard library equivalent types. (#65)
+* ``CompressionParameters`` no longer redundantly store individual compression
+  parameters on each instance. Instead, compression parameters are stored inside
+  the underlying ``ZSTD_CCtx_params`` instance. Attributes for obtaining
+  parameters are now properties rather than instance variables.
+* Exposed the ``STRATEGY_BTULTRA2`` constant.
+* ``CompressionParameters`` instances now expose an ``overlap_log`` attribute.
+  This behaves identically to the ``overlap_size_log`` attribute.
+* ``CompressionParameters()`` now accepts an ``overlap_log`` argument that
+  behaves identically to the ``overlap_size_log`` argument. An error will be
+  raised if both arguments are specified.
+* ``CompressionParameters`` instances now expose an ``ldm_hash_rate_log``
+  attribute. This behaves identically to the ``ldm_hash_every_log`` attribute.
+* ``CompressionParameters()`` now accepts a ``ldm_hash_rate_log`` argument that
+  behaves identically to the ``ldm_hash_every_log`` argument. An error will be
+  raised if both arguments are specified.
+* ``CompressionParameters()`` now accepts a ``strategy`` argument that behaves
+  identically to the ``compression_strategy`` argument. An error will be raised
+  if both arguments are specified.
+* The ``MINMATCH_MIN`` and ``MINMATCH_MAX`` constants were added. They are
+  semantically equivalent to the old ``SEARCHLENGTH_MIN`` and
+  ``SEARCHLENGTH_MAX`` constants.
+* Bundled zstandard library upgraded from 1.3.7 to 1.3.8.
+* ``setup.py`` denotes support for Python 3.7 (Python 3.7 was supported and
+  tested in the 0.10 release).
+* ``zstd_cffi`` module has been renamed to ``zstandard.cffi``.
+* ``ZstdCompressor.stream_writer()`` now reuses a buffer in order to avoid
+  allocating a new buffer for every operation. This should result in faster
+  performance in cases where ``write()`` or ``flush()`` are being called
+  frequently. (#62)
+* Bundled zstandard library upgraded from 1.3.6 to 1.3.7.
+
+0.10.2 (released 2018-11-03)
+============================
+
+Bug Fixes
+---------
+
+* ``zstd_cffi.py`` added to ``setup.py`` (#60).
+
+Changes
+-------
+
+* Change some integer casts to avoid ``ssize_t`` (#61).
+
 0.10.1 (released 2018-10-08)
 ============================
 
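The ``FLUSH_BLOCK`` / ``FLUSH_FRAME`` split introduced in this changelog parallels zlib's ``Z_SYNC_FLUSH`` / ``Z_FINISH`` flush modes. A stdlib-only sketch of the same idea, using ``zlib`` purely as a stand-in so it runs without the ``zstandard`` package installed:

```python
import zlib

# Z_SYNC_FLUSH behaves like FLUSH_BLOCK: everything written so far becomes
# decodable, but the stream stays open for more data.
co = zlib.compressobj()
out = co.compress(b"chunk 0\n") + co.flush(zlib.Z_SYNC_FLUSH)

do = zlib.decompressobj()
assert do.decompress(out) == b"chunk 0\n"

# Z_FINISH behaves like FLUSH_FRAME: it terminates the stream, after which
# the decompressor reports end-of-stream.
out = co.compress(b"chunk 1\n") + co.flush(zlib.Z_FINISH)
assert do.decompress(out) == b"chunk 1\n"
assert do.eof
```

The zstd equivalents are ``flush(zstd.FLUSH_BLOCK)`` and ``flush(zstd.FLUSH_FRAME)`` on the object returned by ``stream_writer()``.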
@@ -20,9 +20,9 @@ https://github.com/indygreg/python-zstan
 Requirements
 ============
 
-This extension is designed to run with Python 2.7, 3.4, 3.5, and 3.
-on common platforms (Linux, Windows, and OS X).
-on Windows. Only x86_64 is well-tested on Linux and macOS.
+This extension is designed to run with Python 2.7, 3.4, 3.5, 3.6, and 3.7
+on common platforms (Linux, Windows, and OS X). On PyPy (both PyPy2 and PyPy3) we support version 6.0.0 and above.
+x86 and x86_64 are well-tested on Windows. Only x86_64 is well-tested on Linux and macOS.
 
 Installing
 ==========
@@ -215,7 +215,7 @@ Instances can also be used as context ma
 
         # Do something with compressed chunk.
 
-When the context manager exi
+When the context manager exits or ``close()`` is called, the stream is closed,
 underlying resources are released, and future operations against the compression
 stream will fail.
 
@@ -251,8 +251,54 @@ emitted so far.
 Streaming Input API
 ^^^^^^^^^^^^^^^^^^^
 
-``stream_writer(fh)`` (which behaves as a context manager) allows you to *stream*
-data into a compressor.::
+``stream_writer(fh)`` allows you to *stream* data into a compressor.
+
+Returned instances implement the ``io.RawIOBase`` interface. Only methods
+that involve writing will do useful things.
+
+The argument to ``stream_writer()`` must have a ``write(data)`` method. As
+compressed data is available, ``write()`` will be called with the compressed
+data as its argument. Many common Python types implement ``write()``, including
+open file handles and ``io.BytesIO``.
+
+The ``write(data)`` method is used to feed data into the compressor.
+
+The ``flush([flush_mode=FLUSH_BLOCK])`` method can be called to evict whatever
+data remains within the compressor's internal state into the output object. This
+may result in 0 or more ``write()`` calls to the output object. This method
+accepts an optional ``flush_mode`` argument to control the flushing behavior.
+Its value can be any of the ``FLUSH_*`` constants.
+
+Both ``write()`` and ``flush()`` return the number of bytes written to the
+object's ``write()``. In many cases, small inputs do not accumulate enough
+data to cause a write and ``write()`` will return ``0``.
+
+Calling ``close()`` will mark the stream as closed and subsequent I/O
+operations will raise ``ValueError`` (per the documented behavior of
+``io.RawIOBase``). ``close()`` will also call ``close()`` on the underlying
+stream if such a method exists.
+
+Typical usage is as follows::
+
+    cctx = zstd.ZstdCompressor(level=10)
+    compressor = cctx.stream_writer(fh)
+
+    compressor.write(b'chunk 0\n')
+    compressor.write(b'chunk 1\n')
+    compressor.flush()
+    # Receiver will be able to decode ``chunk 0\nchunk 1\n`` at this point.
+    # Receiver is also expecting more data in the zstd *frame*.
+
+    compressor.write(b'chunk 2\n')
+    compressor.flush(zstd.FLUSH_FRAME)
+    # Receiver will be able to decode ``chunk 0\nchunk 1\nchunk 2``.
+    # Receiver is expecting no more data, as the zstd frame is closed.
+    # Any future calls to ``write()`` at this point will construct a new
+    # zstd frame.
+
+Instances can be used as context managers. Exiting the context manager is
+the equivalent of calling ``close()``, which is equivalent to calling
+``flush(zstd.FLUSH_FRAME)``::
 
     cctx = zstd.ZstdCompressor(level=10)
     with cctx.stream_writer(fh) as compressor:
@@ -260,22 +306,12 @@ data into a compressor.::
         compressor.write(b'chunk 1')
         ...
 
-The argument to ``stream_writer()`` must have a ``write(data)`` method. As
-compressed data is available, ``write()`` will be called with the compressed
-data as its argument. Many common Python types implement ``write()``, including
-open file handles and ``io.BytesIO``.
+.. important::
 
-``stream_writer()`` returns an object representing a streaming compressor
-instance. It **must** be used as a context manager. That object's
-``write(data)`` method is used to feed data into the compressor.
-
-A ``flush()`` method can be called to evict whatever data remains within the
-compressor's internal state into the output object. This may result in 0 or
-more ``write()`` calls to the output object.
-
-Both ``write()`` and ``flush()`` return the number of bytes written to the
-object's ``write()``. In many cases, small inputs do not accumulate enough
-data to cause a write and ``write()`` will return ``0``.
+   If ``flush(FLUSH_FRAME)`` is not called, emitted data doesn't constitute
+   a full zstd *frame* and consumers of this data may complain about malformed
+   input. It is recommended to use instances as a context manager to ensure
+   *frames* are properly finished.
 
 If the size of the data being fed to this streaming compressor is known,
 you can declare it before compression begins::
@@ -310,6 +346,14 @@ The total number of bytes written so fa
     ...
     total_written = compressor.tell()
 
+``stream_writer()`` accepts a ``write_return_read`` boolean argument to control
+the return value of ``write()``. When ``False`` (the default), ``write()`` returns
+the number of bytes that were ``write()``en to the underlying object. When
+``True``, ``write()`` returns the number of bytes read from the input that
+were subsequently written to the compressor. ``True`` is the *proper* behavior
+for ``write()`` as specified by the ``io.RawIOBase`` interface and will become
+the default value in a future release.
+
 Streaming Output API
 ^^^^^^^^^^^^^^^^^^^^
 
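The ``write_return_read`` semantics described above follow the ``io.RawIOBase`` contract, under which ``write()`` reports how many *input* bytes were consumed rather than how many bytes reached the underlying stream. A toy stdlib-only writer (not the zstandard implementation) illustrating the distinction:

```python
import io

class CountingWriter(io.RawIOBase):
    """Toy sink illustrating the io.RawIOBase write contract: write()
    reports how many input bytes were consumed, which may differ from
    how many bytes reach the underlying stream."""
    def __init__(self, inner):
        super().__init__()
        self.inner = inner
    def writable(self):
        return True
    def write(self, data):
        self.inner.write(bytes(data)[:4])  # pretend only part passes through
        return len(data)                   # report all input bytes consumed

sink = io.BytesIO()
w = CountingWriter(sink)
assert w.write(b"abcdefgh") == 8   # input bytes consumed
assert sink.getvalue() == b"abcd"  # bytes actually forwarded
```

With ``write_return_read=True``, ``stream_writer()`` objects behave like this toy: the return value tracks input consumed, not compressed bytes emitted.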
@@ -654,27 +698,63 @@ will raise ``ValueError`` if attempted.
 ``tell()`` returns the number of decompressed bytes read so far.
 
 Not all I/O methods are implemented. Notably missing is support for
-``readline()``, ``readlines()``, and linewise iteration support.
-these is planned for a future release.
+``readline()``, ``readlines()``, and linewise iteration support. This is
+because streams operate on binary data - not text data. If you want to
+convert decompressed output to text, you can chain an ``io.TextIOWrapper``
+to the stream::
+
+    with open(path, 'rb') as fh:
+        dctx = zstd.ZstdDecompressor()
+        stream_reader = dctx.stream_reader(fh)
+        text_stream = io.TextIOWrapper(stream_reader, encoding='utf-8')
+
+        for line in text_stream:
+            ...
+
+The ``read_across_frames`` argument to ``stream_reader()`` controls the
+behavior of read operations when the end of a zstd *frame* is encountered.
+When ``False`` (the default), a read will complete when the end of a
+zstd *frame* is encountered. When ``True``, a read can potentially
+return data spanning multiple zstd *frames*.
 
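The ``io.TextIOWrapper`` chaining shown above works with any readable binary stream; here ``io.BytesIO`` stands in for the object returned by ``stream_reader()`` so the sketch runs without the ``zstandard`` package:

```python
import io

# io.BytesIO stands in for the binary stream returned by stream_reader();
# any readable binary stream can be wrapped the same way for linewise
# iteration over decoded text.
binary_stream = io.BytesIO(b"line 1\nline 2\n")
text_stream = io.TextIOWrapper(binary_stream, encoding="utf-8")

lines = [line.rstrip("\n") for line in text_stream]
assert lines == ["line 1", "line 2"]
```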
 Streaming Input API
 ^^^^^^^^^^^^^^^^^^^
 
-``stream_writer(fh)`` can be used to incrementally send compressed data to a
-decompressor.::
+``stream_writer(fh)`` allows you to *stream* data into a decompressor.
+
+Returned instances implement the ``io.RawIOBase`` interface. Only methods
+that involve writing will do useful things.
+
+The argument to ``stream_writer()`` is typically an object that also implements
+``io.RawIOBase``. But any object with a ``write(data)`` method will work. Many
+common Python types conform to this interface, including open file handles
+and ``io.BytesIO``.
+
+Behavior is similar to ``ZstdCompressor.stream_writer()``: compressed data
+is sent to the decompressor by calling ``write(data)`` and decompressed
+output is written to the underlying stream by calling its ``write(data)``
+method::
 
     dctx = zstd.ZstdDecompressor()
-
-    decompressor.write(compressed_data)
+    decompressor = dctx.stream_writer(fh)
 
-This behaves similarly to ``zstd.ZstdCompressor``: compressed data is written to
-the decompressor by calling ``write(data)`` and decompressed output is written
-to the output object by calling its ``write(data)`` method.
+    decompressor.write(compressed_data)
+    ...
+
 
 Calls to ``write()`` will return the number of bytes written to the output
 object. Not all inputs will result in bytes being written, so return values
 of ``0`` are possible.
 
+Like the ``stream_writer()`` compressor, instances can be used as context
+managers. However, context managers add no extra special behavior and offer
+little to no benefit to being used.
+
+Calling ``close()`` will mark the stream as closed and subsequent I/O operations
+will raise ``ValueError`` (per the documented behavior of ``io.RawIOBase``).
+``close()`` will also call ``close()`` on the underlying stream if such a
+method exists.
+
 The size of chunks being ``write()`` to the destination can be specified::
 
     dctx = zstd.ZstdDecompressor()
@@ -687,6 +767,13 @@ You can see how much memory is being use
     with dctx.stream_writer(fh) as decompressor:
         byte_size = decompressor.memory_size()
 
+``stream_writer()`` accepts a ``write_return_read`` boolean argument to control
+the return value of ``write()``. When ``False`` (the default), ``write()``
+returns the number of bytes that were ``write()``en to the underlying stream.
+When ``True``, ``write()`` returns the number of bytes read from the input.
+``True`` is the *proper* behavior for ``write()`` as specified by the
+``io.RawIOBase`` interface and will become the default in a future release.
+
 Streaming Output API
 ^^^^^^^^^^^^^^^^^^^^
 
@@ -791,6 +878,10 @@ these temporary chunks by passing ``writ
 memory (re)allocations, this streaming decompression API isn't as
 efficient as other APIs.
 
+For compatibility with the standard library APIs, instances expose a
+``flush([length=None])`` method. This method no-ops and has no meaningful
+side-effects, making it safe to call any time.
+
 Batch Decompression API
 ^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -1147,18 +1238,21 @@ follows:
 * search_log
 * min_match
 * target_length
-*
+* strategy
+* compression_strategy (deprecated: same as ``strategy``)
 * write_content_size
 * write_checksum
 * write_dict_id
 * job_size
-* overlap_
+* overlap_log
+* overlap_size_log (deprecated: same as ``overlap_log``)
 * force_max_window
 * enable_ldm
 * ldm_hash_log
 * ldm_min_match
 * ldm_bucket_size_log
-* ldm_hash_e
+* ldm_hash_rate_log
+* ldm_hash_every_log (deprecated: same as ``ldm_hash_rate_log``)
 * threads
 
 Some of these are very low-level settings. It may help to consult the official
@@ -1240,6 +1334,13 @@ FRAME_HEADER
 MAGIC_NUMBER
    Frame header as an integer
 
+FLUSH_BLOCK
+   Flushing behavior that denotes to flush a zstd block. A decompressor will
+   be able to decode all data fed into the compressor so far.
+FLUSH_FRAME
+   Flushing behavior that denotes to end a zstd frame. Any new data fed
+   to the compressor will start a new frame.
+
 CONTENTSIZE_UNKNOWN
    Value for content size when the content size is unknown.
 CONTENTSIZE_ERROR
@@ -1261,10 +1362,18 @@ SEARCHLOG_MIN
    Minimum value for compression parameter
 SEARCHLOG_MAX
    Maximum value for compression parameter
+MINMATCH_MIN
+   Minimum value for compression parameter
+MINMATCH_MAX
+   Maximum value for compression parameter
 SEARCHLENGTH_MIN
    Minimum value for compression parameter
+
+   Deprecated: use ``MINMATCH_MIN``
 SEARCHLENGTH_MAX
    Maximum value for compression parameter
+
+   Deprecated: use ``MINMATCH_MAX``
 TARGETLENGTH_MIN
    Minimum value for compression parameter
 STRATEGY_FAST
@@ -1283,6 +1392,8 @@ STRATEGY_BTOPT
    Compression strategy
 STRATEGY_BTULTRA
    Compression strategy
+STRATEGY_BTULTRA2
+   Compression strategy
 
 FORMAT_ZSTD1
    Zstandard frame format
@@ -43,7 +43,7 @@ static PyObject* ZstdCompressionChunkerI
43 | 43 | /* If we have data left in the input, consume it. */
44 | 44 | while (chunker->input.pos < chunker->input.size) {
45 | 45 | Py_BEGIN_ALLOW_THREADS
46 | zresult = ZSTD_compress_generic(chunker->compressor->cctx, &chunker->output,
46 | zresult = ZSTD_compressStream2(chunker->compressor->cctx, &chunker->output,
47 | 47 | &chunker->input, ZSTD_e_continue);
48 | 48 | Py_END_ALLOW_THREADS
49 | 49 |
@@ -104,7 +104,7 @@ static PyObject* ZstdCompressionChunkerI
104 | 104 | }
105 | 105 |
106 | 106 | Py_BEGIN_ALLOW_THREADS
107 | zresult = ZSTD_compress_generic(chunker->compressor->cctx, &chunker->output,
107 | zresult = ZSTD_compressStream2(chunker->compressor->cctx, &chunker->output,
108 | 108 | &chunker->input, zFlushMode);
109 | 109 | Py_END_ALLOW_THREADS
110 | 110 |
@@ -298,13 +298,9 @@ static PyObject* ZstdCompressionDict_pre
298 | 298 | cParams = ZSTD_getCParams(level, 0, self->dictSize);
299 | 299 | }
300 | 300 | else {
301 | cParams.chainLog = compressionParams->chainLog;
302 | cParams.hashLog = compressionParams->hashLog;
303 | cParams.searchLength = compressionParams->minMatch;
304 | cParams.searchLog = compressionParams->searchLog;
305 | cParams.strategy = compressionParams->compressionStrategy;
306 | cParams.targetLength = compressionParams->targetLength;
307 | cParams.windowLog = compressionParams->windowLog;
301 | if (to_cparams(compressionParams, &cParams)) {
302 | return NULL;
303 | }
308 | 304 | }
309 | 305 |
310 | 306 | assert(!self->cdict);
@@ -10,7 +10,7 @@
10 | 10 |
11 | 11 | extern PyObject* ZstdError;
12 | 12 |
13 | int set_parameter(ZSTD_CCtx_params* params, ZSTD_cParameter param, unsigned value) {
13 | int set_parameter(ZSTD_CCtx_params* params, ZSTD_cParameter param, int value) {
14 | 14 | size_t zresult = ZSTD_CCtxParam_setParameter(params, param, value);
15 | 15 | if (ZSTD_isError(zresult)) {
16 | 16 | PyErr_Format(ZstdError, "unable to set compression context parameter: %s",
@@ -23,28 +23,41 @@ int set_parameter(ZSTD_CCtx_params* para
23 | 23 |
24 | 24 | #define TRY_SET_PARAMETER(params, param, value) if (set_parameter(params, param, value)) return -1;
25 | 25 |
26 | #define TRY_COPY_PARAMETER(source, dest, param) { \
27 | int result; \
28 | size_t zresult = ZSTD_CCtxParam_getParameter(source, param, &result); \
29 | if (ZSTD_isError(zresult)) { \
30 | return 1; \
31 | } \
32 | zresult = ZSTD_CCtxParam_setParameter(dest, param, result); \
33 | if (ZSTD_isError(zresult)) { \
34 | return 1; \
35 | } \
36 | }
37 |
26 | 38 | int set_parameters(ZSTD_CCtx_params* params, ZstdCompressionParametersObject* obj) {
27 | TRY_SET_PARAMETER(params, ZSTD_p_format, obj->format);
28 | TRY_SET_PARAMETER(params, ZSTD_p_compressionLevel, (unsigned)obj->compressionLevel);
29 | TRY_SET_PARAMETER(params, ZSTD_p_windowLog, obj->windowLog);
30 | TRY_SET_PARAMETER(params, ZSTD_p_hashLog, obj->hashLog);
31 | TRY_SET_PARAMETER(params, ZSTD_p_chainLog, obj->chainLog);
32 | TRY_SET_PARAMETER(params, ZSTD_p_searchLog, obj->searchLog);
33 | TRY_SET_PARAMETER(params, ZSTD_p_minMatch, obj->minMatch);
34 | TRY_SET_PARAMETER(params, ZSTD_p_targetLength, obj->targetLength);
35 | TRY_SET_PARAMETER(params, ZSTD_p_compressionStrategy, obj->compressionStrategy);
36 | TRY_SET_PARAMETER(params, ZSTD_p_contentSizeFlag, obj->contentSizeFlag);
37 | TRY_SET_PARAMETER(params, ZSTD_p_checksumFlag, obj->checksumFlag);
38 | TRY_SET_PARAMETER(params, ZSTD_p_dictIDFlag, obj->dictIDFlag);
39 | TRY_SET_PARAMETER(params, ZSTD_p_nbWorkers, obj->threads);
40 | TRY_SET_PARAMETER(params, ZSTD_p_jobSize, obj->jobSize);
41 | TRY_SET_PARAMETER(params, ZSTD_p_overlapSizeLog, obj->overlapSizeLog);
42 | TRY_SET_PARAMETER(params, ZSTD_p_forceMaxWindow, obj->forceMaxWindow);
43 | TRY_SET_PARAMETER(params, ZSTD_p_enableLongDistanceMatching, obj->enableLongDistanceMatching);
44 | TRY_SET_PARAMETER(params, ZSTD_p_ldmHashLog, obj->ldmHashLog);
45 | TRY_SET_PARAMETER(params, ZSTD_p_ldmMinMatch, obj->ldmMinMatch);
46 | TRY_SET_PARAMETER(params, ZSTD_p_ldmBucketSizeLog, obj->ldmBucketSizeLog);
47 | TRY_SET_PARAMETER(params, ZSTD_p_ldmHashEveryLog, obj->ldmHashEveryLog);
39 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_nbWorkers);
40 |
41 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_format);
42 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_compressionLevel);
43 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_windowLog);
44 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_hashLog);
45 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_chainLog);
46 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_searchLog);
47 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_minMatch);
48 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_targetLength);
49 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_strategy);
50 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_contentSizeFlag);
51 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_checksumFlag);
52 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_dictIDFlag);
53 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_jobSize);
54 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_overlapLog);
55 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_forceMaxWindow);
56 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_enableLongDistanceMatching);
57 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_ldmHashLog);
58 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_ldmMinMatch);
59 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_ldmBucketSizeLog);
60 | TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_ldmHashRateLog);
48 | 61 |
49 | 62 | return 0;
50 | 63 | }
@@ -64,6 +77,41 @@ int reset_params(ZstdCompressionParamete
64 | 77 | return set_parameters(params->params, params);
65 | 78 | }
66 | 79 |
80 | #define TRY_GET_PARAMETER(params, param, value) { \
81 | size_t zresult = ZSTD_CCtxParam_getParameter(params, param, value); \
82 | if (ZSTD_isError(zresult)) { \
83 | PyErr_Format(ZstdError, "unable to retrieve parameter: %s", ZSTD_getErrorName(zresult)); \
84 | return 1; \
85 | } \
86 | }
87 |
88 | int to_cparams(ZstdCompressionParametersObject* params, ZSTD_compressionParameters* cparams) {
89 | int value;
90 |
91 | TRY_GET_PARAMETER(params->params, ZSTD_c_windowLog, &value);
92 | cparams->windowLog = value;
93 |
94 | TRY_GET_PARAMETER(params->params, ZSTD_c_chainLog, &value);
95 | cparams->chainLog = value;
96 |
97 | TRY_GET_PARAMETER(params->params, ZSTD_c_hashLog, &value);
98 | cparams->hashLog = value;
99 |
100 | TRY_GET_PARAMETER(params->params, ZSTD_c_searchLog, &value);
101 | cparams->searchLog = value;
102 |
103 | TRY_GET_PARAMETER(params->params, ZSTD_c_minMatch, &value);
104 | cparams->minMatch = value;
105 |
106 | TRY_GET_PARAMETER(params->params, ZSTD_c_targetLength, &value);
107 | cparams->targetLength = value;
108 |
109 | TRY_GET_PARAMETER(params->params, ZSTD_c_strategy, &value);
110 | cparams->strategy = value;
111 |
112 | return 0;
113 | }
114 |
67 | 115 | static int ZstdCompressionParameters_init(ZstdCompressionParametersObject* self, PyObject* args, PyObject* kwargs) {
68 | 116 | static char* kwlist[] = {
69 | 117 | "format",
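The new ``to_cparams()`` helper above reads each compression parameter out of the generic parameter store and copies it into a ``ZSTD_compressionParameters`` struct. A pure-Python sketch of that shape (field names and the dict-backed store are assumptions made for illustration, not the real extension types):

```python
from dataclasses import dataclass, fields

@dataclass
class CParams:
    window_log: int = 0
    chain_log: int = 0
    hash_log: int = 0
    search_log: int = 0
    min_match: int = 0
    target_length: int = 0
    strategy: int = 0

def to_cparams(store):
    # Mirror of the C helper: fetch each field from the parameter store
    # and populate the struct-like result.
    return CParams(**{f.name: store[f.name] for f in fields(CParams)})

cp = to_cparams({"window_log": 20, "chain_log": 16, "hash_log": 17,
                 "search_log": 1, "min_match": 4, "target_length": 0,
                 "strategy": 2})
assert cp.window_log == 20 and cp.strategy == 2
```

The C version differs only in that each read can fail, which is why every fetch goes through the error-checking ``TRY_GET_PARAMETER`` macro.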
@@ -75,50 +123,60 @@ static int ZstdCompressionParameters_ini
75 | 123 | "min_match",
76 | 124 | "target_length",
77 | 125 | "compression_strategy",
126 | "strategy",
78 | 127 | "write_content_size",
79 | 128 | "write_checksum",
80 | 129 | "write_dict_id",
81 | 130 | "job_size",
131 | "overlap_log",
82 | 132 | "overlap_size_log",
83 | 133 | "force_max_window",
84 | 134 | "enable_ldm",
85 | 135 | "ldm_hash_log",
86 | 136 | "ldm_min_match",
87 | 137 | "ldm_bucket_size_log",
138 | "ldm_hash_rate_log",
88 | 139 | "ldm_hash_every_log",
89 | 140 | "threads",
90 | 141 | NULL
91 | 142 | };
92 | 143 |
93 | unsigned format = 0;
144 | int format = 0;
94 | 145 | int compressionLevel = 0;
95 | unsigned windowLog = 0;
96 | unsigned hashLog = 0;
97 | unsigned chainLog = 0;
98 | unsigned searchLog = 0;
99 | unsigned minMatch = 0;
100 | unsigned targetLength = 0;
101 | unsigned compressionStrategy = 0;
102 | unsigned contentSizeFlag = 1;
103 | unsigned checksumFlag = 0;
104 | unsigned dictIDFlag = 0;
105 | unsigned jobSize = 0;
106 | unsigned overlapSizeLog = 0;
107 | unsigned forceMaxWindow = 0;
108 | unsigned enableLDM = 0;
109 | unsigned ldmHashLog = 0;
110 | unsigned ldmMinMatch = 0;
111 | unsigned ldmBucketSizeLog = 0;
112 | unsigned ldmHashEveryLog = 0;
146 | int windowLog = 0;
147 | int hashLog = 0;
148 | int chainLog = 0;
149 | int searchLog = 0;
150 | int minMatch = 0;
151 | int targetLength = 0;
152 | int compressionStrategy = -1;
153 | int strategy = -1;
154 | int contentSizeFlag = 1;
155 | int checksumFlag = 0;
156 | int dictIDFlag = 0;
157 | int jobSize = 0;
158 | int overlapLog = -1;
159 | int overlapSizeLog = -1;
160 | int forceMaxWindow = 0;
161 | int enableLDM = 0;
162 | int ldmHashLog = 0;
163 | int ldmMinMatch = 0;
164 | int ldmBucketSizeLog = 0;
165 | int ldmHashRateLog = -1;
166 | int ldmHashEveryLog = -1;
113 | 167 | int threads = 0;
114 | 168 |
115 | 169 | if (!PyArg_ParseTupleAndKeywords(args, kwargs,
116 | "|IiIIIIIIIIIIIIIIIIIIi:CompressionParameters",
170 | "|iiiiiiiiiiiiiiiiiiiiiiii:CompressionParameters",
117 | 171 | kwlist, &format, &compressionLevel, &windowLog, &hashLog, &chainLog,
118 | &searchLog, &minMatch, &targetLength, &compressionStrategy,
119 | &contentSizeFlag, &checksumFlag, &dictIDFlag, &jobSize, &overlapSizeLog,
120 | &forceMaxWindow, &enableLDM, &ldmHashLog, &ldmMinMatch, &ldmBucketSizeLog,
121 | &ldmHashEveryLog, &threads)) {
172 | &searchLog, &minMatch, &targetLength, &compressionStrategy, &strategy,
173 | &contentSizeFlag, &checksumFlag, &dictIDFlag, &jobSize, &overlapLog,
174 | &overlapSizeLog, &forceMaxWindow, &enableLDM, &ldmHashLog, &ldmMinMatch,
175 | &ldmBucketSizeLog, &ldmHashRateLog, &ldmHashEveryLog, &threads)) {
176 | return -1;
177 | }
178 |
179 | if (reset_params(self)) {
122 | 180 | return -1;
123 | 181 | }
124 | 182 |
@@ -126,32 +184,70 @@ static int ZstdCompressionParameters_ini
126 | 184 | threads = cpu_count();
127 | 185 | }
128 | 186 |
129 | self->format = format;
130 | self->compressionLevel = compressionLevel;
131 | self->windowLog = windowLog;
132 | self->hashLog = hashLog;
133 | self->chainLog = chainLog;
134 | self->searchLog = searchLog;
135 | self->minMatch = minMatch;
136 | self->targetLength = targetLength;
137 | self->compressionStrategy = compressionStrategy;
138 | self->contentSizeFlag = contentSizeFlag;
139 | self->checksumFlag = checksumFlag;
140 | self->dictIDFlag = dictIDFlag;
141 | self->threads = threads;
142 | self->jobSize = jobSize;
143 | self->overlapSizeLog = overlapSizeLog;
144 | self->forceMaxWindow = forceMaxWindow;
145 | self->enableLongDistanceMatching = enableLDM;
146 | self->ldmHashLog = ldmHashLog;
147 | self->ldmMinMatch = ldmMinMatch;
148 | self->ldmBucketSizeLog = ldmBucketSizeLog;
149 | self->ldmHashEveryLog = ldmHashEveryLog;
187 | /* We need to set ZSTD_c_nbWorkers before ZSTD_c_jobSize and ZSTD_c_overlapLog
188 | * because setting ZSTD_c_nbWorkers resets the other parameters. */
189 | TRY_SET_PARAMETER(self->params, ZSTD_c_nbWorkers, threads);
190 |
191 | TRY_SET_PARAMETER(self->params, ZSTD_c_format, format);
192 | TRY_SET_PARAMETER(self->params, ZSTD_c_compressionLevel, compressionLevel);
193 | TRY_SET_PARAMETER(self->params, ZSTD_c_windowLog, windowLog);
194 | TRY_SET_PARAMETER(self->params, ZSTD_c_hashLog, hashLog);
195 | TRY_SET_PARAMETER(self->params, ZSTD_c_chainLog, chainLog);
196 | TRY_SET_PARAMETER(self->params, ZSTD_c_searchLog, searchLog);
197 | TRY_SET_PARAMETER(self->params, ZSTD_c_minMatch, minMatch);
198 | TRY_SET_PARAMETER(self->params, ZSTD_c_targetLength, targetLength);
150 | 199 |
151 | if (reset_params(self)) {
200 | if (compressionStrategy != -1 && strategy != -1) {
201 | PyErr_SetString(PyExc_ValueError, "cannot specify both compression_strategy and strategy");
202 | return -1;
203 | }
204 |
205 | if (compressionStrategy != -1) {
206 | strategy = compressionStrategy;
207 | }
208 | else if (strategy == -1) {
209 | strategy = 0;
210 | }
211 |
212 | TRY_SET_PARAMETER(self->params, ZSTD_c_strategy, strategy);
213 | TRY_SET_PARAMETER(self->params, ZSTD_c_contentSizeFlag, contentSizeFlag);
214 | TRY_SET_PARAMETER(self->params, ZSTD_c_checksumFlag, checksumFlag);
215 | TRY_SET_PARAMETER(self->params, ZSTD_c_dictIDFlag, dictIDFlag);
216 | TRY_SET_PARAMETER(self->params, ZSTD_c_jobSize, jobSize);
217 |
218 | if (overlapLog != -1 && overlapSizeLog != -1) {
219 | PyErr_SetString(PyExc_ValueError, "cannot specify both overlap_log and overlap_size_log");
152 | 220 | return -1;
153 | 221 | }
154 | 222 |
223 | if (overlapSizeLog != -1) {
224 | overlapLog = overlapSizeLog;
225 | }
226 | else if (overlapLog == -1) {
227 | overlapLog = 0;
228 | }
229 |
230 | TRY_SET_PARAMETER(self->params, ZSTD_c_overlapLog, overlapLog);
231 | TRY_SET_PARAMETER(self->params, ZSTD_c_forceMaxWindow, forceMaxWindow);
232 | TRY_SET_PARAMETER(self->params, ZSTD_c_enableLongDistanceMatching, enableLDM);
233 | TRY_SET_PARAMETER(self->params, ZSTD_c_ldmHashLog, ldmHashLog);
234 | TRY_SET_PARAMETER(self->params, ZSTD_c_ldmMinMatch, ldmMinMatch);
235 | TRY_SET_PARAMETER(self->params, ZSTD_c_ldmBucketSizeLog, ldmBucketSizeLog);
236 |
237 | if (ldmHashRateLog != -1 && ldmHashEveryLog != -1) {
238 | PyErr_SetString(PyExc_ValueError, "cannot specify both ldm_hash_rate_log and ldm_hash_everyLog");
239 | return -1;
240 | }
241 |
242 | if (ldmHashEveryLog != -1) {
243 | ldmHashRateLog = ldmHashEveryLog;
244 | }
245 | else if (ldmHashRateLog == -1) {
246 | ldmHashRateLog = 0;
247 | }
248 |
249 | TRY_SET_PARAMETER(self->params, ZSTD_c_ldmHashRateLog, ldmHashRateLog);
250 |
155 | 251 | return 0;
156 | 252 | }
157 | 253 |
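Several parameters gain new names in this hunk (`strategy`, `overlap_log`, `ldm_hash_rate_log`) while the old keyword names remain accepted. All three resolution branches above follow one pattern, sketched here in Python (the helper name and `-1` sentinel convention mirror the C code; the function itself is hypothetical):

```python
def resolve_deprecated_alias(new_value, deprecated_value, default=0, sentinel=-1):
    """Mirror of the C pattern: specifying both names is an error;
    otherwise honor whichever was given, falling back to a default."""
    if new_value != sentinel and deprecated_value != sentinel:
        raise ValueError("cannot specify both the new and deprecated parameter")
    if deprecated_value != sentinel:
        return deprecated_value
    if new_value == sentinel:
        return default
    return new_value

assert resolve_deprecated_alias(-1, -1) == 0   # neither given -> default
assert resolve_deprecated_alias(9, -1) == 9    # new name honored
assert resolve_deprecated_alias(-1, 7) == 7    # deprecated name still honored
```

Initializing both variables to `-1` is what lets the code distinguish "caller passed 0" from "caller passed nothing", which the previous `unsigned ... = 0` declarations could not do.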
@@ -259,7 +355,7 @@ ZstdCompressionParametersObject* Compres
259 | 355 |
260 | 356 | val = PyDict_GetItemString(kwargs, "min_match");
261 | 357 | if (!val) {
262 | val = PyLong_FromUnsignedLong(params.searchLength);
358 | val = PyLong_FromUnsignedLong(params.minMatch);
263 | 359 | if (!val) {
264 | 360 | goto cleanup;
265 | 361 | }
@@ -336,6 +432,41 @@ static void ZstdCompressionParameters_de
336 | 432 | PyObject_Del(self);
337 | 433 | }
338 | 434 |
435 | #define PARAM_GETTER(name, param) PyObject* ZstdCompressionParameters_get_##name(PyObject* self, void* unused) { \
436 | int result; \
437 | size_t zresult; \
438 | ZstdCompressionParametersObject* p = (ZstdCompressionParametersObject*)(self); \
439 | zresult = ZSTD_CCtxParam_getParameter(p->params, param, &result); \
440 | if (ZSTD_isError(zresult)) { \
441 | PyErr_Format(ZstdError, "unable to get compression parameter: %s", \
442 | ZSTD_getErrorName(zresult)); \
443 | return NULL; \
444 | } \
445 | return PyLong_FromLong(result); \
446 | }
447 |
448 | PARAM_GETTER(format, ZSTD_c_format)
449 | PARAM_GETTER(compression_level, ZSTD_c_compressionLevel)
450 | PARAM_GETTER(window_log, ZSTD_c_windowLog)
451 | PARAM_GETTER(hash_log, ZSTD_c_hashLog)
452 | PARAM_GETTER(chain_log, ZSTD_c_chainLog)
453 | PARAM_GETTER(search_log, ZSTD_c_searchLog)
454 | PARAM_GETTER(min_match, ZSTD_c_minMatch)
455 | PARAM_GETTER(target_length, ZSTD_c_targetLength)
456 | PARAM_GETTER(compression_strategy, ZSTD_c_strategy)
457 | PARAM_GETTER(write_content_size, ZSTD_c_contentSizeFlag)
458 | PARAM_GETTER(write_checksum, ZSTD_c_checksumFlag)
459 | PARAM_GETTER(write_dict_id, ZSTD_c_dictIDFlag)
460 | PARAM_GETTER(job_size, ZSTD_c_jobSize)
461 | PARAM_GETTER(overlap_log, ZSTD_c_overlapLog)
462 | PARAM_GETTER(force_max_window, ZSTD_c_forceMaxWindow)
463 | PARAM_GETTER(enable_ldm, ZSTD_c_enableLongDistanceMatching)
464 | PARAM_GETTER(ldm_hash_log, ZSTD_c_ldmHashLog)
465 | PARAM_GETTER(ldm_min_match, ZSTD_c_ldmMinMatch)
466 | PARAM_GETTER(ldm_bucket_size_log, ZSTD_c_ldmBucketSizeLog)
467 | PARAM_GETTER(ldm_hash_rate_log, ZSTD_c_ldmHashRateLog)
468 | PARAM_GETTER(threads, ZSTD_c_nbWorkers)
469 |
339 | 470 | static PyMethodDef ZstdCompressionParameters_methods[] = {
340 | 471 | {
341 | 472 | "from_level",
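The `PARAM_GETTER` macro above replaces cached struct members with getters that query the live parameter store on every attribute access. The same pattern in Python, using generated properties over a dict-backed store (a simplified model for illustration, not the real extension type):

```python
class CompressionParameters:
    def __init__(self, store):
        self._store = store  # stands in for the ZSTD_CCtx_params object

def _make_getter(key):
    # One read-only property per parameter, consulting the backing
    # store on each access instead of caching at construction time.
    return property(lambda self: self._store[key])

for _name in ("window_log", "hash_log", "min_match", "threads"):
    setattr(CompressionParameters, _name, _make_getter(_name))

params = CompressionParameters({"window_log": 10, "hash_log": 6,
                                "min_match": 4, "threads": 0})
assert params.window_log == 10
params._store["window_log"] = 12
assert params.window_log == 12  # read live, not cached
```

This is why the deprecated attributes (`overlap_size_log`, `ldm_hash_every_log`) can simply reuse the getter for their new counterparts: there is no separate stored field left to diverge.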
@@ -352,70 +483,34 @@ static PyMethodDef ZstdCompressionParame
352 | 483 | { NULL, NULL }
353 | 484 | };
354 | 485 |
355 | static PyMemberDef ZstdCompressionParameters_members[] = {
356 | { "format", T_UINT,
357 | offsetof(ZstdCompressionParametersObject, format), READONLY,
358 | "compression format" },
359 | { "compression_level", T_INT,
360 | offsetof(ZstdCompressionParametersObject, compressionLevel), READONLY,
361 | "compression level" },
362 | { "window_log", T_UINT,
363 | offsetof(ZstdCompressionParametersObject, windowLog), READONLY,
364 | "window log" },
365 | { "hash_log", T_UINT,
366 | offsetof(ZstdCompressionParametersObject, hashLog), READONLY,
367 | "hash log" },
368 | { "chain_log", T_UINT,
369 | offsetof(ZstdCompressionParametersObject, chainLog), READONLY,
370 | "chain log" },
371 | { "search_log", T_UINT,
372 | offsetof(ZstdCompressionParametersObject, searchLog), READONLY,
373 | "search log" },
374 | { "min_match", T_UINT,
375 | offsetof(ZstdCompressionParametersObject, minMatch), READONLY,
376 | "search length" },
377 | { "target_length", T_UINT,
378 | offsetof(ZstdCompressionParametersObject, targetLength), READONLY,
379 | "target length" },
380 | { "compression_strategy", T_UINT,
381 | offsetof(ZstdCompressionParametersObject, compressionStrategy), READONLY,
382 | "compression strategy" },
383 | { "write_content_size", T_UINT,
384 | offsetof(ZstdCompressionParametersObject, contentSizeFlag), READONLY,
385 | "whether to write content size in frames" },
386 | { "write_checksum", T_UINT,
387 | offsetof(ZstdCompressionParametersObject, checksumFlag), READONLY,
388 | "whether to write checksum in frames" },
389 | { "write_dict_id", T_UINT,
390 | offsetof(ZstdCompressionParametersObject, dictIDFlag), READONLY,
391 | "whether to write dictionary ID in frames" },
392 | { "threads", T_UINT,
393 | offsetof(ZstdCompressionParametersObject, threads), READONLY,
394 | "number of threads to use" },
395 | { "job_size", T_UINT,
396 | offsetof(ZstdCompressionParametersObject, jobSize), READONLY,
397 | "size of compression job when using multiple threads" },
398 | { "overlap_size_log", T_UINT,
399 | offsetof(ZstdCompressionParametersObject, overlapSizeLog), READONLY,
400 | "Size of previous input reloaded at the beginning of each job" },
401 | { "force_max_window", T_UINT,
402 | offsetof(ZstdCompressionParametersObject, forceMaxWindow), READONLY,
403 | "force back references to remain smaller than window size" },
404 | { "enable_ldm", T_UINT,
405 | offsetof(ZstdCompressionParametersObject, enableLongDistanceMatching), READONLY,
406 | "whether to enable long distance matching" },
407 | { "ldm_hash_log", T_UINT,
408 | offsetof(ZstdCompressionParametersObject, ldmHashLog), READONLY,
409 | "Size of the table for long distance matching, as a power of 2" },
410 | { "ldm_min_match", T_UINT,
411 | offsetof(ZstdCompressionParametersObject, ldmMinMatch), READONLY,
412 | "minimum size of searched matches for long distance matcher" },
413 | { "ldm_bucket_size_log", T_UINT,
414 | offsetof(ZstdCompressionParametersObject, ldmBucketSizeLog), READONLY,
415 | "log size of each bucket in the LDM hash table for collision resolution" },
416 | { "ldm_hash_every_log", T_UINT,
417 | offsetof(ZstdCompressionParametersObject, ldmHashEveryLog), READONLY,
418 | "frequency of inserting/looking up entries in the LDM hash table" },
486 | #define GET_SET_ENTRY(name) { #name, ZstdCompressionParameters_get_##name, NULL, NULL, NULL }
487 |
488 | static PyGetSetDef ZstdCompressionParameters_getset[] = {
489 | GET_SET_ENTRY(format),
490 | GET_SET_ENTRY(compression_level),
491 | GET_SET_ENTRY(window_log),
492 | GET_SET_ENTRY(hash_log),
493 | GET_SET_ENTRY(chain_log),
494 | GET_SET_ENTRY(search_log),
495 | GET_SET_ENTRY(min_match),
496 | GET_SET_ENTRY(target_length),
497 | GET_SET_ENTRY(compression_strategy),
498 | GET_SET_ENTRY(write_content_size),
499 | GET_SET_ENTRY(write_checksum),
500 | GET_SET_ENTRY(write_dict_id),
501 | GET_SET_ENTRY(threads),
502 | GET_SET_ENTRY(job_size),
503 | GET_SET_ENTRY(overlap_log),
504 | /* TODO remove this deprecated attribute */
505 | { "overlap_size_log", ZstdCompressionParameters_get_overlap_log, NULL, NULL, NULL },
506 | GET_SET_ENTRY(force_max_window),
507 | GET_SET_ENTRY(enable_ldm),
508 | GET_SET_ENTRY(ldm_hash_log),
509 | GET_SET_ENTRY(ldm_min_match),
510 | GET_SET_ENTRY(ldm_bucket_size_log),
511 | GET_SET_ENTRY(ldm_hash_rate_log),
512 | /* TODO remove this deprecated attribute */
513 | { "ldm_hash_every_log", ZstdCompressionParameters_get_ldm_hash_rate_log, NULL, NULL, NULL },
419 | 514 | { NULL }
420 | 515 | };
421 | 516 |
This diff has been collapsed as it changes many lines (604 lines changed).
@@ -128,6 +128,96 @@ static PyObject* reader_tell(ZstdCompres
128 | 128 | return PyLong_FromUnsignedLongLong(self->bytesCompressed);
129 | 129 | }
130 | 130 |
131 | int read_compressor_input(ZstdCompressionReader* self) {
132 | if (self->finishedInput) {
133 | return 0;
134 | }
135 |
136 | if (self->input.pos != self->input.size) {
137 | return 0;
138 | }
139 |
140 | if (self->reader) {
141 | Py_buffer buffer;
142 |
143 | assert(self->readResult == NULL);
144 |
145 | self->readResult = PyObject_CallMethod(self->reader, "read",
146 | "k", self->readSize);
147 |
148 | if (NULL == self->readResult) {
149 | return -1;
150 | }
151 |
152 | memset(&buffer, 0, sizeof(buffer));
153 |
154 | if (0 != PyObject_GetBuffer(self->readResult, &buffer, PyBUF_CONTIG_RO)) {
155 | return -1;
156 | }
157 |
158 | /* EOF */
159 | if (0 == buffer.len) {
160 | self->finishedInput = 1;
161 | Py_CLEAR(self->readResult);
162 | }
163 | else {
164 | self->input.src = buffer.buf;
165 | self->input.size = buffer.len;
166 | self->input.pos = 0;
167 | }
168 |
169 | PyBuffer_Release(&buffer);
170 | }
171 | else {
172 | assert(self->buffer.buf);
173 |
174 | self->input.src = self->buffer.buf;
175 | self->input.size = self->buffer.len;
176 | self->input.pos = 0;
177 | }
178 |
179 | return 1;
180 | }
181 |
182 | int compress_input(ZstdCompressionReader* self, ZSTD_outBuffer* output) {
183 | size_t oldPos;
184 | size_t zresult;
185 |
186 | /* If we have data left over, consume it. */
187 | if (self->input.pos < self->input.size) {
188 | oldPos = output->pos;
189 |
190 | Py_BEGIN_ALLOW_THREADS
191 | zresult = ZSTD_compressStream2(self->compressor->cctx,
192 | output, &self->input, ZSTD_e_continue);
193 | Py_END_ALLOW_THREADS
194 |
195 | self->bytesCompressed += output->pos - oldPos;
196 |
197 | /* Input exhausted. Clear out state tracking. */
198 | if (self->input.pos == self->input.size) {
199 | memset(&self->input, 0, sizeof(self->input));
200 | Py_CLEAR(self->readResult);
201 |
202 | if (self->buffer.buf) {
203 | self->finishedInput = 1;
204 | }
205 | }
206 |
207 | if (ZSTD_isError(zresult)) {
208 | PyErr_Format(ZstdError, "zstd compress error: %s", ZSTD_getErrorName(zresult));
209 | return -1;
210 | }
211 | }
212 |
213 | if (output->pos && output->pos == output->size) {
214 | return 1;
215 | }
216 | else {
217 | return 0;
218 | }
219 | }
220 |
131 | 221 | static PyObject* reader_read(ZstdCompressionReader* self, PyObject* args, PyObject* kwargs) {
132 | 222 | static char* kwlist[] = {
133 | 223 | "size",
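`read_compressor_input()` and `compress_input()` factor the reader's pull loop into two reusable helpers: fetch more input once the current buffer is exhausted, then compress into the caller's output buffer. A compact sketch of that loop, substituting stdlib zlib for zstd (an illustrative assumption; the real code also tracks partial output and finished-input state):

```python
import io
import zlib

def compressed_read(source, read_size=8192):
    """Pull chunks from `source`, compressing as we go; end the
    stream when the source reports EOF (an empty read)."""
    compressor = zlib.compressobj()
    out = []
    while True:
        chunk = source.read(read_size)
        if not chunk:                      # EOF: finalize the stream
            out.append(compressor.flush())
            return b"".join(out)
        out.append(compressor.compress(chunk))

data = compressed_read(io.BytesIO(b"hello world" * 100))
assert zlib.decompress(data) == b"hello world" * 100
```

Splitting "get input" from "compress input" is what lets both `read()` and the new `read1()` share the same helpers while enforcing different looping policies.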
@@ -140,25 +230,30 @@ static PyObject* reader_read(ZstdCompres
140 | 230 | Py_ssize_t resultSize;
141 | 231 | size_t zresult;
142 | 232 | size_t oldPos;
233 | int readResult, compressResult;
143 | 234 |
144 | 235 | if (self->closed) {
145 | 236 | PyErr_SetString(PyExc_ValueError, "stream is closed");
146 | 237 | return NULL;
147 | 238 | }
148 | 239 |
149 | if (self->finishedOutput) {
150 | return PyBytes_FromStringAndSize("", 0);
151 | }
152 |
153 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "n", kwlist, &size)) {
240 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|n", kwlist, &size)) {
154 | 241 | return NULL;
155 | 242 | }
156 | 243 |
157 | if (size < 1) {
158 | PyErr_SetString(PyExc_ValueError, "cannot read negative or size 0 amounts");
244 | if (size < -1) {
245 | PyErr_SetString(PyExc_ValueError, "cannot read negative amounts less than -1");
159 | 246 | return NULL;
160 | 247 | }
161 | 248 |
249 | if (size == -1) {
250 | return PyObject_CallMethod((PyObject*)self, "readall", NULL);
251 | }
252 |
253 | if (self->finishedOutput || size == 0) {
254 | return PyBytes_FromStringAndSize("", 0);
255 | }
256 |
162 | 257 | result = PyBytes_FromStringAndSize(NULL, size);
163 | 258 | if (NULL == result) {
164 | 259 | return NULL;
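The argument handling added to `read()` above gives `size` the usual `io.RawIOBase` semantics: `-1` (now the default) delegates to `readall()`, `0` returns an empty result, and anything below `-1` is rejected. Sketched as plain Python (the helper is hypothetical; it just names the branch the C code takes):

```python
def classify_read_size(size):
    """Mirror of the C validation: which action does read() take?"""
    if size < -1:
        raise ValueError("cannot read negative amounts less than -1")
    if size == -1:
        return "readall"   # delegate to readall()
    if size == 0:
        return "empty"     # return b"" immediately
    return "read"          # produce up to `size` bytes

assert classify_read_size(-1) == "readall"
assert classify_read_size(0) == "empty"
assert classify_read_size(16) == "read"
```

Previously `size < 1` was rejected outright; accepting `-1` and `0` is part of making the reader conform to the `io.RawIOBase` interface mentioned in the release notes.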
@@ -172,86 +267,34 @@ static PyObject* reader_read(ZstdCompres
172 | 267 |
173 | 268 | readinput:
174 | 269 |
175 | /* If we have data left over, consume it. */
176 | if (self->input.pos < self->input.size) {
177 | oldPos = self->output.pos;
178 |
179 | Py_BEGIN_ALLOW_THREADS
180 | zresult = ZSTD_compress_generic(self->compressor->cctx,
181 | &self->output, &self->input, ZSTD_e_continue);
182 |
183 | Py_END_ALLOW_THREADS
184 |
185 | self->bytesCompressed += self->output.pos - oldPos;
186 |
187 | /* Input exhausted. Clear out state tracking. */
188 | if (self->input.pos == self->input.size) {
189 | memset(&self->input, 0, sizeof(self->input));
190 | Py_CLEAR(self->readResult);
270 | compressResult = compress_input(self, &self->output);
191 | 271 |
192 | if (self->buffer.buf) {
193 | self->finishedInput = 1;
194 | }
195 | }
196 |
197 | if (ZSTD_isError(zresult)) {
198 | PyErr_Format(ZstdError, "zstd compress error: %s", ZSTD_getErrorName(zresult));
199 | return NULL;
200 | }
201 |
202 | if (self->output.pos) {
203 | /* If no more room in output, emit it. */
204 | if (self->output.pos == self->output.size) {
205 | memset(&self->output, 0, sizeof(self->output));
206 | return result;
207 | }
208 |
209 | /*
210 | * There is room in the output. We fall through to below, which will either
211 | * get more input for us or will attempt to end the stream.
212 | */
213 | }
214 |
215 | /* Fall through to gather more input. */
272 | if (-1 == compressResult) {
273 | Py_XDECREF(result);
274 | return NULL;
275 | }
276 | else if (0 == compressResult) {
277 | /* There is room in the output. We fall through to below, which will
278 | * either get more input for us or will attempt to end the stream.
279 | */
280 | }
281 | else if (1 == compressResult) {
282 | memset(&self->output, 0, sizeof(self->output));
283 | return result;
284 | }
285 | else {
286 | assert(0);
216 | 287 | }
217 | 288 |
218 | if (!self->finishedInput) {
219 | if (self->reader) {
220 | Py_buffer buffer;
221 |
222 | assert(self->readResult == NULL);
223 | self->readResult = PyObject_CallMethod(self->reader, "read",
224 | "k", self->readSize);
225 | if (self->readResult == NULL) {
226 | return NULL;
227 | }
228 |
229 | memset(&buffer, 0, sizeof(buffer));
230 |
231 | if (0 != PyObject_GetBuffer(self->readResult, &buffer, PyBUF_CONTIG_RO)) {
232 | return NULL;
233 | }
289 | readResult = read_compressor_input(self);
234 | 290 |
235 | /* EOF */
236 | if (0 == buffer.len) {
237 | self->finishedInput = 1;
238 | Py_CLEAR(self->readResult);
239 | }
240 | else {
241 | self->input.src = buffer.buf;
242 | self->input.size = buffer.len;
243 | self->input.pos = 0;
244 | }
245 |
246 | PyBuffer_Release(&buffer);
247 | }
248 | else {
249 | assert(self->buffer.buf);
250 |
251 | self->input.src = self->buffer.buf;
252 | self->input.size = self->buffer.len;
253 | self->input.pos = 0;
254 | }
291 | if (-1 == readResult) {
292 | return NULL;
293 | }
294 | else if (0 == readResult) { }
295 | else if (1 == readResult) { }
296 | else {
297 | assert(0);
255 | 298 | }
256 | 299 |
257 | 300 | if (self->input.size) {
@@ -261,7 +304,7 b' readinput:' | |||
|
261 | 304 | /* Else EOF */ |
|
262 | 305 | oldPos = self->output.pos; |
|
263 | 306 | |
|
264 |
zresult = ZSTD_compress |
|
|
307 | zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output, | |
|
265 | 308 | &self->input, ZSTD_e_end); |
|
266 | 309 | |
|
267 | 310 | self->bytesCompressed += self->output.pos - oldPos; |
@@ -269,6 +312,7 b' readinput:' | |||
|
269 | 312 | if (ZSTD_isError(zresult)) { |
|
270 | 313 | PyErr_Format(ZstdError, "error ending compression stream: %s", |
|
271 | 314 | ZSTD_getErrorName(zresult)); |
|
315 | Py_XDECREF(result); | |
|
272 | 316 | return NULL; |
|
273 | 317 | } |
|
274 | 318 | |
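The hunk above replaces the inline read logic with a call to a `read_compressor_input()` helper whose tri-state return (-1 error, 0 or 1 otherwise) the caller switches on. The helper's body is not shown in this hunk, so the following stdlib-only Python sketch of that tri-state contract is an assumption inferred from the call sites (the `state` dict stands in for the reader's `finishedInput`/`input` fields):

```python
import io

def read_compressor_input(reader, read_size, state):
    """Hypothetical sketch of the helper's contract:
    -1 = error, 0 = no new input (EOF seen), 1 = input loaded."""
    if state.get("finished_input"):
        return 0
    data = reader.read(read_size)
    if data is None:  # the C code would return NULL -> error
        return -1
    if not data:  # empty read marks end of input
        state["finished_input"] = True
        return 0
    state["input"] = data  # stands in for filling self->input
    return 1

state = {}
reader = io.BytesIO(b"abc")
first = read_compressor_input(reader, 16, state)   # loads b"abc"
second = read_compressor_input(reader, 16, state)  # hits EOF
third = read_compressor_input(reader, 16, state)   # already finished
```

Factoring the read into one helper lets read(), read1(), readinto(), and readinto1() below share identical error/EOF handling.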
@@ -288,9 +332,394 b' readinput:' | |||
|
288 | 332 | return result; |
|
289 | 333 | } |
|
290 | 334 | |
|
335 | static PyObject* reader_read1(ZstdCompressionReader* self, PyObject* args, PyObject* kwargs) { | |
|
336 | static char* kwlist[] = { | |
|
337 | "size", | |
|
338 | NULL | |
|
339 | }; | |
|
340 | ||
|
341 | Py_ssize_t size = -1; | |
|
342 | PyObject* result = NULL; | |
|
343 | char* resultBuffer; | |
|
344 | Py_ssize_t resultSize; | |
|
345 | ZSTD_outBuffer output; | |
|
346 | int compressResult; | |
|
347 | size_t oldPos; | |
|
348 | size_t zresult; | |
|
349 | ||
|
350 | if (self->closed) { | |
|
351 | PyErr_SetString(PyExc_ValueError, "stream is closed"); | |
|
352 | return NULL; | |
|
353 | } | |
|
354 | ||
|
355 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|n:read1", kwlist, &size)) { | |
|
356 | return NULL; | |
|
357 | } | |
|
358 | ||
|
359 | if (size < -1) { | |
|
360 | PyErr_SetString(PyExc_ValueError, "cannot read negative amounts less than -1"); | |
|
361 | return NULL; | |
|
362 | } | |
|
363 | ||
|
364 | if (self->finishedOutput || size == 0) { | |
|
365 | return PyBytes_FromStringAndSize("", 0); | |
|
366 | } | |
|
367 | ||
|
368 | if (size == -1) { | |
|
369 | size = ZSTD_CStreamOutSize(); | |
|
370 | } | |
|
371 | ||
|
372 | result = PyBytes_FromStringAndSize(NULL, size); | |
|
373 | if (NULL == result) { | |
|
374 | return NULL; | |
|
375 | } | |
|
376 | ||
|
377 | PyBytes_AsStringAndSize(result, &resultBuffer, &resultSize); | |
|
378 | ||
|
379 | output.dst = resultBuffer; | |
|
380 | output.size = resultSize; | |
|
381 | output.pos = 0; | |
|
382 | ||
|
383 | /* read1() is supposed to use at most 1 read() from the underlying stream. | |
|
384 | However, we can't satisfy this requirement with compression because | |
|
385 | not every input will generate output. We /could/ flush the compressor, | |
|
386 | but this may not be desirable. We allow multiple read() from the | |
|
387 | underlying stream. But unlike read(), we return as soon as output data | |
|
388 | is available. | |
|
389 | */ | |
|
390 | ||
|
391 | compressResult = compress_input(self, &output); | |
|
392 | ||
|
393 | if (-1 == compressResult) { | |
|
394 | Py_XDECREF(result); | |
|
395 | return NULL; | |
|
396 | } | |
|
397 | else if (0 == compressResult || 1 == compressResult) { } | |
|
398 | else { | |
|
399 | assert(0); | |
|
400 | } | |
|
401 | ||
|
402 | if (output.pos) { | |
|
403 | goto finally; | |
|
404 | } | |
|
405 | ||
|
406 | while (!self->finishedInput) { | |
|
407 | int readResult = read_compressor_input(self); | |
|
408 | ||
|
409 | if (-1 == readResult) { | |
|
410 | Py_XDECREF(result); | |
|
411 | return NULL; | |
|
412 | } | |
|
413 | else if (0 == readResult || 1 == readResult) { } | |
|
414 | else { | |
|
415 | assert(0); | |
|
416 | } | |
|
417 | ||
|
418 | compressResult = compress_input(self, &output); | |
|
419 | ||
|
420 | if (-1 == compressResult) { | |
|
421 | Py_XDECREF(result); | |
|
422 | return NULL; | |
|
423 | } | |
|
424 | else if (0 == compressResult || 1 == compressResult) { } | |
|
425 | else { | |
|
426 | assert(0); | |
|
427 | } | |
|
428 | ||
|
429 | if (output.pos) { | |
|
430 | goto finally; | |
|
431 | } | |
|
432 | } | |
|
433 | ||
|
434 | /* EOF */ | |
|
435 | oldPos = output.pos; | |
|
436 | ||
|
437 | zresult = ZSTD_compressStream2(self->compressor->cctx, &output, &self->input, | |
|
438 | ZSTD_e_end); | |
|
439 | ||
|
440 | self->bytesCompressed += output.pos - oldPos; | |
|
441 | ||
|
442 | if (ZSTD_isError(zresult)) { | |
|
443 | PyErr_Format(ZstdError, "error ending compression stream: %s", | |
|
444 | ZSTD_getErrorName(zresult)); | |
|
445 | Py_XDECREF(result); | |
|
446 | return NULL; | |
|
447 | } | |
|
448 | ||
|
449 | if (zresult == 0) { | |
|
450 | self->finishedOutput = 1; | |
|
451 | } | |
|
452 | ||
|
453 | finally: | |
|
454 | if (result) { | |
|
455 | if (safe_pybytes_resize(&result, output.pos)) { | |
|
456 | Py_XDECREF(result); | |
|
457 | return NULL; | |
|
458 | } | |
|
459 | } | |
|
460 | ||
|
461 | return result; | |
|
462 | } | |
|
463 | ||
|
291 | 464 | static PyObject* reader_readall(PyObject* self) { |
|
292 | PyErr_SetNone(PyExc_NotImplementedError); | |
|
293 | return NULL; | |
|
465 | PyObject* chunks = NULL; | |
|
466 | PyObject* empty = NULL; | |
|
467 | PyObject* result = NULL; | |
|
468 | ||
|
469 | /* Our strategy is to collect chunks into a list then join all the | |
|
470 | * chunks at the end. We could potentially use e.g. an io.BytesIO. But | |
|
471 | * this feels simple enough to implement and avoids potentially expensive | |
|
472 | * reallocations of large buffers. | |
|
473 | */ | |
|
474 | chunks = PyList_New(0); | |
|
475 | if (NULL == chunks) { | |
|
476 | return NULL; | |
|
477 | } | |
|
478 | ||
|
479 | while (1) { | |
|
480 | PyObject* chunk = PyObject_CallMethod(self, "read", "i", 1048576); | |
|
481 | if (NULL == chunk) { | |
|
482 | Py_DECREF(chunks); | |
|
483 | return NULL; | |
|
484 | } | |
|
485 | ||
|
486 | if (!PyBytes_Size(chunk)) { | |
|
487 | Py_DECREF(chunk); | |
|
488 | break; | |
|
489 | } | |
|
490 | ||
|
491 | if (PyList_Append(chunks, chunk)) { | |
|
492 | Py_DECREF(chunk); | |
|
493 | Py_DECREF(chunks); | |
|
494 | return NULL; | |
|
495 | } | |
|
496 | ||
|
497 | Py_DECREF(chunk); | |
|
498 | } | |
|
499 | ||
|
500 | empty = PyBytes_FromStringAndSize("", 0); | |
|
501 | if (NULL == empty) { | |
|
502 | Py_DECREF(chunks); | |
|
503 | return NULL; | |
|
504 | } | |
|
505 | ||
|
506 | result = PyObject_CallMethod(empty, "join", "O", chunks); | |
|
507 | ||
|
508 | Py_DECREF(empty); | |
|
509 | Py_DECREF(chunks); | |
|
510 | ||
|
511 | return result; | |
|
512 | } | |
|
513 | ||
|
514 | static PyObject* reader_readinto(ZstdCompressionReader* self, PyObject* args) { | |
|
515 | Py_buffer dest; | |
|
516 | ZSTD_outBuffer output; | |
|
517 | int readResult, compressResult; | |
|
518 | PyObject* result = NULL; | |
|
519 | size_t zresult; | |
|
520 | size_t oldPos; | |
|
521 | ||
|
522 | if (self->closed) { | |
|
523 | PyErr_SetString(PyExc_ValueError, "stream is closed"); | |
|
524 | return NULL; | |
|
525 | } | |
|
526 | ||
|
527 | if (self->finishedOutput) { | |
|
528 | return PyLong_FromLong(0); | |
|
529 | } | |
|
530 | ||
|
531 | if (!PyArg_ParseTuple(args, "w*:readinto", &dest)) { | |
|
532 | return NULL; | |
|
533 | } | |
|
534 | ||
|
535 | if (!PyBuffer_IsContiguous(&dest, 'C') || dest.ndim > 1) { | |
|
536 | PyErr_SetString(PyExc_ValueError, | |
|
537 | "destination buffer should be contiguous and have at most one dimension"); | |
|
538 | goto finally; | |
|
539 | } | |
|
540 | ||
|
541 | output.dst = dest.buf; | |
|
542 | output.size = dest.len; | |
|
543 | output.pos = 0; | |
|
544 | ||
|
545 | compressResult = compress_input(self, &output); | |
|
546 | ||
|
547 | if (-1 == compressResult) { | |
|
548 | goto finally; | |
|
549 | } | |
|
550 | else if (0 == compressResult) { } | |
|
551 | else if (1 == compressResult) { | |
|
552 | result = PyLong_FromSize_t(output.pos); | |
|
553 | goto finally; | |
|
554 | } | |
|
555 | else { | |
|
556 | assert(0); | |
|
557 | } | |
|
558 | ||
|
559 | while (!self->finishedInput) { | |
|
560 | readResult = read_compressor_input(self); | |
|
561 | ||
|
562 | if (-1 == readResult) { | |
|
563 | goto finally; | |
|
564 | } | |
|
565 | else if (0 == readResult || 1 == readResult) {} | |
|
566 | else { | |
|
567 | assert(0); | |
|
568 | } | |
|
569 | ||
|
570 | compressResult = compress_input(self, &output); | |
|
571 | ||
|
572 | if (-1 == compressResult) { | |
|
573 | goto finally; | |
|
574 | } | |
|
575 | else if (0 == compressResult) { } | |
|
576 | else if (1 == compressResult) { | |
|
577 | result = PyLong_FromSize_t(output.pos); | |
|
578 | goto finally; | |
|
579 | } | |
|
580 | else { | |
|
581 | assert(0); | |
|
582 | } | |
|
583 | } | |
|
584 | ||
|
585 | /* EOF */ | |
|
586 | oldPos = output.pos; | |
|
587 | ||
|
588 | zresult = ZSTD_compressStream2(self->compressor->cctx, &output, &self->input, | |
|
589 | ZSTD_e_end); | |
|
590 | ||
|
591 | self->bytesCompressed += output.pos - oldPos; | |
|
592 | ||
|
593 | if (ZSTD_isError(zresult)) { | |
|
594 | PyErr_Format(ZstdError, "error ending compression stream: %s", | |
|
595 | ZSTD_getErrorName(zresult)); | |
|
596 | goto finally; | |
|
597 | } | |
|
598 | ||
|
599 | assert(output.pos); | |
|
600 | ||
|
601 | if (0 == zresult) { | |
|
602 | self->finishedOutput = 1; | |
|
603 | } | |
|
604 | ||
|
605 | result = PyLong_FromSize_t(output.pos); | |
|
606 | ||
|
607 | finally: | |
|
608 | PyBuffer_Release(&dest); | |
|
609 | ||
|
610 | return result; | |
|
611 | } | |
|
612 | ||
|
613 | static PyObject* reader_readinto1(ZstdCompressionReader* self, PyObject* args) { | |
|
614 | Py_buffer dest; | |
|
615 | PyObject* result = NULL; | |
|
616 | ZSTD_outBuffer output; | |
|
617 | int compressResult; | |
|
618 | size_t oldPos; | |
|
619 | size_t zresult; | |
|
620 | ||
|
621 | if (self->closed) { | |
|
622 | PyErr_SetString(PyExc_ValueError, "stream is closed"); | |
|
623 | return NULL; | |
|
624 | } | |
|
625 | ||
|
626 | if (self->finishedOutput) { | |
|
627 | return PyLong_FromLong(0); | |
|
628 | } | |
|
629 | ||
|
630 | if (!PyArg_ParseTuple(args, "w*:readinto1", &dest)) { | |
|
631 | return NULL; | |
|
632 | } | |
|
633 | ||
|
634 | if (!PyBuffer_IsContiguous(&dest, 'C') || dest.ndim > 1) { | |
|
635 | PyErr_SetString(PyExc_ValueError, | |
|
636 | "destination buffer should be contiguous and have at most one dimension"); | |
|
637 | goto finally; | |
|
638 | } | |
|
639 | ||
|
640 | output.dst = dest.buf; | |
|
641 | output.size = dest.len; | |
|
642 | output.pos = 0; | |
|
643 | ||
|
644 | compressResult = compress_input(self, &output); | |
|
645 | ||
|
646 | if (-1 == compressResult) { | |
|
647 | goto finally; | |
|
648 | } | |
|
649 | else if (0 == compressResult || 1 == compressResult) { } | |
|
650 | else { | |
|
651 | assert(0); | |
|
652 | } | |
|
653 | ||
|
654 | if (output.pos) { | |
|
655 | result = PyLong_FromSize_t(output.pos); | |
|
656 | goto finally; | |
|
657 | } | |
|
658 | ||
|
659 | while (!self->finishedInput) { | |
|
660 | int readResult = read_compressor_input(self); | |
|
661 | ||
|
662 | if (-1 == readResult) { | |
|
663 | goto finally; | |
|
664 | } | |
|
665 | else if (0 == readResult || 1 == readResult) { } | |
|
666 | else { | |
|
667 | assert(0); | |
|
668 | } | |
|
669 | ||
|
670 | compressResult = compress_input(self, &output); | |
|
671 | ||
|
672 | if (-1 == compressResult) { | |
|
673 | goto finally; | |
|
674 | } | |
|
675 | else if (0 == compressResult) { } | |
|
676 | else if (1 == compressResult) { | |
|
677 | result = PyLong_FromSize_t(output.pos); | |
|
678 | goto finally; | |
|
679 | } | |
|
680 | else { | |
|
681 | assert(0); | |
|
682 | } | |
|
683 | ||
|
684 | /* If we produced output and we're not done with input, emit | |
|
685 | * that output now, as we've hit restrictions of read1(). | |
|
686 | */ | |
|
687 | if (output.pos && !self->finishedInput) { | |
|
688 | result = PyLong_FromSize_t(output.pos); | |
|
689 | goto finally; | |
|
690 | } | |
|
691 | ||
|
692 | /* Otherwise we either have no output or we've exhausted the | |
|
693 | * input. Either we try to get more input or we fall through | |
|
694 | * to EOF below */ | |
|
695 | } | |
|
696 | ||
|
697 | /* EOF */ | |
|
698 | oldPos = output.pos; | |
|
699 | ||
|
700 | zresult = ZSTD_compressStream2(self->compressor->cctx, &output, &self->input, | |
|
701 | ZSTD_e_end); | |
|
702 | ||
|
703 | self->bytesCompressed += output.pos - oldPos; | |
|
704 | ||
|
705 | if (ZSTD_isError(zresult)) { | |
|
706 | PyErr_Format(ZstdError, "error ending compression stream: %s", | |
|
707 | ZSTD_getErrorName(zresult)); | |
|
708 | goto finally; | |
|
709 | } | |
|
710 | ||
|
711 | assert(output.pos); | |
|
712 | ||
|
713 | if (0 == zresult) { | |
|
714 | self->finishedOutput = 1; | |
|
715 | } | |
|
716 | ||
|
717 | result = PyLong_FromSize_t(output.pos); | |
|
718 | ||
|
719 | finally: | |
|
720 | PyBuffer_Release(&dest); | |
|
721 | ||
|
722 | return result; | |
|
294 | 723 | } |
|
295 | 724 | |
|
296 | 725 | static PyObject* reader_iter(PyObject* self) { |
@@ -315,7 +744,10 b' static PyMethodDef reader_methods[] = {' | |||
|
315 | 744 | { "readable", (PyCFunction)reader_readable, METH_NOARGS, |
|
316 | 745 | PyDoc_STR("Returns True") }, |
|
317 | 746 | { "read", (PyCFunction)reader_read, METH_VARARGS | METH_KEYWORDS, PyDoc_STR("read compressed data") }, |
|
747 | { "read1", (PyCFunction)reader_read1, METH_VARARGS | METH_KEYWORDS, NULL }, | |
|
318 | 748 | { "readall", (PyCFunction)reader_readall, METH_NOARGS, PyDoc_STR("Not implemented") }, |
|
749 | { "readinto", (PyCFunction)reader_readinto, METH_VARARGS, NULL }, | |
|
750 | { "readinto1", (PyCFunction)reader_readinto1, METH_VARARGS, NULL }, | |
|
319 | 751 | { "readline", (PyCFunction)reader_readline, METH_VARARGS, PyDoc_STR("Not implemented") }, |
|
320 | 752 | { "readlines", (PyCFunction)reader_readlines, METH_VARARGS, PyDoc_STR("Not implemented") }, |
|
321 | 753 | { "seekable", (PyCFunction)reader_seekable, METH_NOARGS, |
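reader_readall() above repeatedly issues 1 MB read() calls, appends each chunk to a list, and joins once at the end, trading a final O(n) join for avoiding repeated reallocation of one growing buffer. The same strategy in plain Python (the default chunk size mirrors the 1048576 constant in the C code):

```python
import io

def readall(stream, chunk_size=1048576):
    """Collect fixed-size reads into a list, then join once at the end."""
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # b"" signals end of stream
            break
        chunks.append(chunk)
    return b"".join(chunks)

data = readall(io.BytesIO(b"abc" * 1000), chunk_size=256)
```

An io.BytesIO accumulator would work too; the list-plus-join form is simply the lighter of the two, as the C comment notes.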
@@ -18,24 +18,23 b' static void ZstdCompressionWriter_deallo' | |||
|
18 | 18 | Py_XDECREF(self->compressor); |
|
19 | 19 | Py_XDECREF(self->writer); |
|
20 | 20 | |
|
21 | PyMem_Free(self->output.dst); | |
|
22 | self->output.dst = NULL; | |
|
23 | ||
|
21 | 24 | PyObject_Del(self); |
|
22 | 25 | } |
|
23 | 26 | |
|
24 | 27 | static PyObject* ZstdCompressionWriter_enter(ZstdCompressionWriter* self) { |
|
25 | size_t zresult; | |
|
28 | if (self->closed) { | |
|
29 | PyErr_SetString(PyExc_ValueError, "stream is closed"); | |
|
30 | return NULL; | |
|
31 | } | |
|
26 | 32 | |
|
27 | 33 | if (self->entered) { |
|
28 | 34 | PyErr_SetString(ZstdError, "cannot __enter__ multiple times"); |
|
29 | 35 | return NULL; |
|
30 | 36 | } |
|
31 | 37 | |
|
32 | zresult = ZSTD_CCtx_setPledgedSrcSize(self->compressor->cctx, self->sourceSize); | |
|
33 | if (ZSTD_isError(zresult)) { | |
|
34 | PyErr_Format(ZstdError, "error setting source size: %s", | |
|
35 | ZSTD_getErrorName(zresult)); | |
|
36 | return NULL; | |
|
37 | } | |
|
38 | ||
|
39 | 38 | self->entered = 1; |
|
40 | 39 | |
|
41 | 40 | Py_INCREF(self); |
@@ -46,10 +45,6 b' static PyObject* ZstdCompressionWriter_e' | |||
|
46 | 45 | PyObject* exc_type; |
|
47 | 46 | PyObject* exc_value; |
|
48 | 47 | PyObject* exc_tb; |
|
49 | size_t zresult; | |
|
50 | ||
|
51 | ZSTD_outBuffer output; | |
|
52 | PyObject* res; | |
|
53 | 48 | |
|
54 | 49 | if (!PyArg_ParseTuple(args, "OOO:__exit__", &exc_type, &exc_value, &exc_tb)) { |
|
55 | 50 | return NULL; |
@@ -58,46 +53,11 b' static PyObject* ZstdCompressionWriter_e' | |||
|
58 | 53 | self->entered = 0; |
|
59 | 54 | |
|
60 | 55 | if (exc_type == Py_None && exc_value == Py_None && exc_tb == Py_None) { |
|
61 | ZSTD_inBuffer inBuffer; | |
|
62 | ||
|
63 | inBuffer.src = NULL; | |
|
64 | inBuffer.size = 0; | |
|
65 | inBuffer.pos = 0; | |
|
66 | ||
|
67 | output.dst = PyMem_Malloc(self->outSize); | |
|
68 | if (!output.dst) { | |
|
69 | return PyErr_NoMemory(); | |
|
70 | } | |
|
71 | output.size = self->outSize; | |
|
72 | output.pos = 0; | |
|
56 | PyObject* result = PyObject_CallMethod((PyObject*)self, "close", NULL); | |
|
73 | 57 | |
|
74 | while (1) { | |
|
75 | zresult = ZSTD_compress_generic(self->compressor->cctx, &output, &inBuffer, ZSTD_e_end); | |
|
76 | if (ZSTD_isError(zresult)) { | |
|
77 | PyErr_Format(ZstdError, "error ending compression stream: %s", | |
|
78 | ZSTD_getErrorName(zresult)); | |
|
79 | PyMem_Free(output.dst); | |
|
80 | return NULL; | |
|
81 | } | |
|
82 | ||
|
83 | if (output.pos) { | |
|
84 | #if PY_MAJOR_VERSION >= 3 | |
|
85 | res = PyObject_CallMethod(self->writer, "write", "y#", | |
|
86 | #else | |
|
87 | res = PyObject_CallMethod(self->writer, "write", "s#", | |
|
88 | #endif | |
|
89 | output.dst, output.pos); | |
|
90 | Py_XDECREF(res); | |
|
91 | } | |
|
92 | ||
|
93 | if (!zresult) { | |
|
94 | break; | |
|
95 | } | |
|
96 | ||
|
97 | output.pos = 0; | |
|
58 | if (NULL == result) { | |
|
59 | return NULL; | |
|
98 | 60 | } |
|
99 | ||
|
100 | PyMem_Free(output.dst); | |
|
101 | 61 | } |
|
102 | 62 | |
|
103 | 63 | Py_RETURN_FALSE; |
@@ -117,7 +77,6 b' static PyObject* ZstdCompressionWriter_w' | |||
|
117 | 77 | Py_buffer source; |
|
118 | 78 | size_t zresult; |
|
119 | 79 | ZSTD_inBuffer input; |
|
120 | ZSTD_outBuffer output; | |
|
121 | 80 | PyObject* res; |
|
122 | 81 | Py_ssize_t totalWrite = 0; |
|
123 | 82 | |
@@ -130,143 +89,240 b' static PyObject* ZstdCompressionWriter_w' | |||
|
130 | 89 | return NULL; |
|
131 | 90 | } |
|
132 | 91 | |
|
133 | if (!self->entered) { | |
|
134 | PyErr_SetString(ZstdError, "compress must be called from an active context manager"); | |
|
135 | goto finally; | |
|
136 | } | |
|
137 | ||
|
138 | 92 | if (!PyBuffer_IsContiguous(&source, 'C') || source.ndim > 1) { |
|
139 | 93 | PyErr_SetString(PyExc_ValueError, |
|
140 | 94 | "data buffer should be contiguous and have at most one dimension"); |
|
141 | 95 | goto finally; |
|
142 | 96 | } |
|
143 | 97 | |
|
144 | output.dst = PyMem_Malloc(self->outSize); | |
|
145 | if (!output.dst) { | |
|
146 | PyErr_NoMemory(); | |
|
147 | goto finally; | |
|
98 | if (self->closed) { | |
|
99 | PyErr_SetString(PyExc_ValueError, "stream is closed"); | |
|
100 | return NULL; | |
|
148 | 101 | } |
|
149 | output.size = self->outSize; | |
|
150 | output.pos = 0; | |
|
102 | ||
|
103 | self->output.pos = 0; | |
|
151 | 104 | |
|
152 | 105 | input.src = source.buf; |
|
153 | 106 | input.size = source.len; |
|
154 | 107 | input.pos = 0; |
|
155 | 108 | |
|
156 | while ((ssize_t)input.pos < source.len) { | |
|
109 | while (input.pos < (size_t)source.len) { | |
|
157 | 110 | Py_BEGIN_ALLOW_THREADS |
|
158 | zresult = ZSTD_compress_generic(self->compressor->cctx, &output, &input, ZSTD_e_continue); | |
|
111 | zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output, &input, ZSTD_e_continue); | |
|
159 | 112 | Py_END_ALLOW_THREADS |
|
160 | 113 | |
|
161 | 114 | if (ZSTD_isError(zresult)) { |
|
162 | PyMem_Free(output.dst); | |
|
163 | 115 | PyErr_Format(ZstdError, "zstd compress error: %s", ZSTD_getErrorName(zresult)); |
|
164 | 116 | goto finally; |
|
165 | 117 | } |
|
166 | 118 | |
|
167 | 119 | /* Copy data from output buffer to writer. */ |
|
168 | if (output.pos) { | |
|
120 | if (self->output.pos) { | |
|
169 | 121 | #if PY_MAJOR_VERSION >= 3 |
|
170 | 122 | res = PyObject_CallMethod(self->writer, "write", "y#", |
|
171 | 123 | #else |
|
172 | 124 | res = PyObject_CallMethod(self->writer, "write", "s#", |
|
173 | 125 | #endif |
|
174 | output.dst, output.pos); | |
|
126 | self->output.dst, self->output.pos); | |
|
175 | 127 | Py_XDECREF(res); |
|
176 | totalWrite += output.pos; | |
|
177 | self->bytesCompressed += output.pos; | |
|
128 | totalWrite += self->output.pos; | |
|
129 | self->bytesCompressed += self->output.pos; | |
|
178 | 130 | } |
|
179 | output.pos = 0; | |
|
131 | self->output.pos = 0; | |
|
180 | 132 | } |
|
181 | 133 | |
|
182 | PyMem_Free(output.dst); | |
|
183 | ||
|
184 | result = PyLong_FromSsize_t(totalWrite); | |
|
134 | if (self->writeReturnRead) { | |
|
135 | result = PyLong_FromSize_t(input.pos); | |
|
136 | } | |
|
137 | else { | |
|
138 | result = PyLong_FromSsize_t(totalWrite); | |
|
139 | } | |
|
185 | 140 | |
|
186 | 141 | finally: |
|
187 | 142 | PyBuffer_Release(&source); |
|
188 | 143 | return result; |
|
189 | 144 | } |
|
190 | 145 | |
|
191 | static PyObject* ZstdCompressionWriter_flush(ZstdCompressionWriter* self, PyObject* args) { | |
|
146 | static PyObject* ZstdCompressionWriter_flush(ZstdCompressionWriter* self, PyObject* args, PyObject* kwargs) { | |
|
147 | static char* kwlist[] = { | |
|
148 | "flush_mode", | |
|
149 | NULL | |
|
150 | }; | |
|
151 | ||
|
192 | 152 | size_t zresult; |
|
193 | ZSTD_outBuffer output; | |
|
194 | 153 | ZSTD_inBuffer input; |
|
195 | 154 | PyObject* res; |
|
196 | 155 | Py_ssize_t totalWrite = 0; |
|
156 | unsigned flush_mode = 0; | |
|
157 | ZSTD_EndDirective flush; | |
|
197 | 158 | |
|
198 | if (!self->entered) { | |
|
199 | PyErr_SetString(ZstdError, "flush must be called from an active context manager"); | |
|
159 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|I:flush", | |
|
160 | kwlist, &flush_mode)) { | |
|
200 | 161 | return NULL; |
|
201 | 162 | } |
|
202 | 163 | |
|
164 | switch (flush_mode) { | |
|
165 | case 0: | |
|
166 | flush = ZSTD_e_flush; | |
|
167 | break; | |
|
168 | case 1: | |
|
169 | flush = ZSTD_e_end; | |
|
170 | break; | |
|
171 | default: | |
|
172 | PyErr_Format(PyExc_ValueError, "unknown flush_mode: %d", flush_mode); | |
|
173 | return NULL; | |
|
174 | } | |
|
175 | ||
|
176 | if (self->closed) { | |
|
177 | PyErr_SetString(PyExc_ValueError, "stream is closed"); | |
|
178 | return NULL; | |
|
179 | } | |
|
180 | ||
|
181 | self->output.pos = 0; | |
|
182 | ||
|
203 | 183 | input.src = NULL; |
|
204 | 184 | input.size = 0; |
|
205 | 185 | input.pos = 0; |
|
206 | 186 | |
|
207 | output.dst = PyMem_Malloc(self->outSize); | |
|
208 | if (!output.dst) { | |
|
209 | return PyErr_NoMemory(); | |
|
210 | } | |
|
211 | output.size = self->outSize; | |
|
212 | output.pos = 0; | |
|
213 | ||
|
214 | 187 | while (1) { |
|
215 | 188 | Py_BEGIN_ALLOW_THREADS |
|
216 | zresult = ZSTD_compress_generic(self->compressor->cctx, &output, &input, ZSTD_e_flush); | |
|
189 | zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output, &input, flush); | |
|
217 | 190 | Py_END_ALLOW_THREADS |
|
218 | 191 | |
|
219 | 192 | if (ZSTD_isError(zresult)) { |
|
220 | PyMem_Free(output.dst); | |
|
221 | 193 | PyErr_Format(ZstdError, "zstd compress error: %s", ZSTD_getErrorName(zresult)); |
|
222 | 194 | return NULL; |
|
223 | 195 | } |
|
224 | 196 | |
|
225 | 197 | /* Copy data from output buffer to writer. */ |
|
226 | if (output.pos) { | |
|
198 | if (self->output.pos) { | |
|
227 | 199 | #if PY_MAJOR_VERSION >= 3 |
|
228 | 200 | res = PyObject_CallMethod(self->writer, "write", "y#", |
|
229 | 201 | #else |
|
230 | 202 | res = PyObject_CallMethod(self->writer, "write", "s#", |
|
231 | 203 | #endif |
|
232 | output.dst, output.pos); | |
|
204 | self->output.dst, self->output.pos); | |
|
233 | 205 | Py_XDECREF(res); |
|
234 | totalWrite += output.pos; | |
|
235 | self->bytesCompressed += output.pos; | |
|
206 | totalWrite += self->output.pos; | |
|
207 | self->bytesCompressed += self->output.pos; | |
|
236 | 208 | } |
|
237 | 209 | |
|
238 | output.pos = 0; | |
|
210 | self->output.pos = 0; | |
|
239 | 211 | |
|
240 | 212 | if (!zresult) { |
|
241 | 213 | break; |
|
242 | 214 | } |
|
243 | 215 | } |
|
244 | 216 | |
|
245 | PyMem_Free(output.dst); | |
|
217 | return PyLong_FromSsize_t(totalWrite); | |
|
218 | } | |
|
219 | ||
|
220 | static PyObject* ZstdCompressionWriter_close(ZstdCompressionWriter* self) { | |
|
221 | PyObject* result; | |
|
222 | ||
|
223 | if (self->closed) { | |
|
224 | Py_RETURN_NONE; | |
|
225 | } | |
|
226 | ||
|
227 | result = PyObject_CallMethod((PyObject*)self, "flush", "I", 1); | |
|
228 | self->closed = 1; | |
|
229 | ||
|
230 | if (NULL == result) { | |
|
231 | return NULL; | |
|
232 | } | |
|
246 | 233 | |
|
247 | return PyLong_FromSsize_t(totalWrite); | |
|
234 | /* Call close on underlying stream as well. */ | |
|
235 | if (PyObject_HasAttrString(self->writer, "close")) { | |
|
236 | return PyObject_CallMethod(self->writer, "close", NULL); | |
|
237 | } | |
|
238 | ||
|
239 | Py_RETURN_NONE; | |
|
240 | } | |
|
241 | ||
|
242 | static PyObject* ZstdCompressionWriter_fileno(ZstdCompressionWriter* self) { | |
|
243 | if (PyObject_HasAttrString(self->writer, "fileno")) { | |
|
244 | return PyObject_CallMethod(self->writer, "fileno", NULL); | |
|
245 | } | |
|
246 | else { | |
|
247 | PyErr_SetString(PyExc_OSError, "fileno not available on underlying writer"); | |
|
248 | return NULL; | |
|
249 | } | |
|
248 | 250 | } |
|
249 | 251 | |
|
250 | 252 | static PyObject* ZstdCompressionWriter_tell(ZstdCompressionWriter* self) { |
|
251 | 253 | return PyLong_FromUnsignedLongLong(self->bytesCompressed); |
|
252 | 254 | } |
|
253 | 255 | |
|
256 | static PyObject* ZstdCompressionWriter_writelines(PyObject* self, PyObject* args) { | |
|
257 | PyErr_SetNone(PyExc_NotImplementedError); | |
|
258 | return NULL; | |
|
259 | } | |
|
260 | ||
|
261 | static PyObject* ZstdCompressionWriter_false(PyObject* self, PyObject* args) { | |
|
262 | Py_RETURN_FALSE; | |
|
263 | } | |
|
264 | ||
|
265 | static PyObject* ZstdCompressionWriter_true(PyObject* self, PyObject* args) { | |
|
266 | Py_RETURN_TRUE; | |
|
267 | } | |
|
268 | ||
|
269 | static PyObject* ZstdCompressionWriter_unsupported(PyObject* self, PyObject* args, PyObject* kwargs) { | |
|
270 | PyObject* iomod; | |
|
271 | PyObject* exc; | |
|
272 | ||
|
273 | iomod = PyImport_ImportModule("io"); | |
|
274 | if (NULL == iomod) { | |
|
275 | return NULL; | |
|
276 | } | |
|
277 | ||
|
278 | exc = PyObject_GetAttrString(iomod, "UnsupportedOperation"); | |
|
279 | if (NULL == exc) { | |
|
280 | Py_DECREF(iomod); | |
|
281 | return NULL; | |
|
282 | } | |
|
283 | ||
|
284 | PyErr_SetNone(exc); | |
|
285 | Py_DECREF(exc); | |
|
286 | Py_DECREF(iomod); | |
|
287 | ||
|
288 | return NULL; | |
|
289 | } | |
|
290 | ||
|
254 | 291 | static PyMethodDef ZstdCompressionWriter_methods[] = { |
|
255 | 292 | { "__enter__", (PyCFunction)ZstdCompressionWriter_enter, METH_NOARGS, |
|
256 | 293 | PyDoc_STR("Enter a compression context.") }, |
|
257 | 294 | { "__exit__", (PyCFunction)ZstdCompressionWriter_exit, METH_VARARGS, |
|
258 | 295 | PyDoc_STR("Exit a compression context.") }, |
|
296 | { "close", (PyCFunction)ZstdCompressionWriter_close, METH_NOARGS, NULL }, | |
|
297 | { "fileno", (PyCFunction)ZstdCompressionWriter_fileno, METH_NOARGS, NULL }, | |
|
298 | { "isatty", (PyCFunction)ZstdCompressionWriter_false, METH_NOARGS, NULL }, | |
|
299 | { "readable", (PyCFunction)ZstdCompressionWriter_false, METH_NOARGS, NULL }, | |
|
300 | { "readline", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL }, | |
|
301 | { "readlines", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL }, | |
|
302 | { "seek", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL }, | |
|
303 | { "seekable", ZstdCompressionWriter_false, METH_NOARGS, NULL }, | |
|
304 | { "truncate", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL }, | |
|
305 | { "writable", ZstdCompressionWriter_true, METH_NOARGS, NULL }, | |
|
306 | { "writelines", ZstdCompressionWriter_writelines, METH_VARARGS, NULL }, | |
|
307 | { "read", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL }, | |
|
308 | { "readall", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL }, | |
|
309 | { "readinto", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL }, | |
|
259 | 310 | { "memory_size", (PyCFunction)ZstdCompressionWriter_memory_size, METH_NOARGS, |
|
260 | 311 | PyDoc_STR("Obtain the memory size of the underlying compressor") }, |
|
261 | 312 | { "write", (PyCFunction)ZstdCompressionWriter_write, METH_VARARGS | METH_KEYWORDS, |
|
262 | 313 | PyDoc_STR("Compress data") }, |
|
263 | { "flush", (PyCFunction)ZstdCompressionWriter_flush, METH_NOARGS, | |
|
314 | { "flush", (PyCFunction)ZstdCompressionWriter_flush, METH_VARARGS | METH_KEYWORDS, | |
|
264 | 315 | PyDoc_STR("Flush data and finish a zstd frame") }, |
|
265 | 316 | { "tell", (PyCFunction)ZstdCompressionWriter_tell, METH_NOARGS, |
|
266 | 317 | PyDoc_STR("Returns current number of bytes compressed") }, |
|
267 | 318 | { NULL, NULL } |
|
268 | 319 | }; |
|
269 | 320 | |
|
321 | static PyMemberDef ZstdCompressionWriter_members[] = { | |
|
322 | { "closed", T_BOOL, offsetof(ZstdCompressionWriter, closed), READONLY, NULL }, | |
|
323 | { NULL } | |
|
324 | }; | |
|
325 | ||
|
270 | 326 | PyTypeObject ZstdCompressionWriterType = { |
|
271 | 327 | PyVarObject_HEAD_INIT(NULL, 0) |
|
272 | 328 | "zstd.ZstdCompressionWriter", /* tp_name */ |
@@ -296,7 +352,7 b' PyTypeObject ZstdCompressionWriterType =' | |||
|
296 | 352 | 0, /* tp_iter */ |
|
297 | 353 | 0, /* tp_iternext */ |
|
298 | 354 | ZstdCompressionWriter_methods, /* tp_methods */ |
|
299 | 0, /* tp_members */ | |
|
355 | ZstdCompressionWriter_members, /* tp_members */ | |
|
300 | 356 | 0, /* tp_getset */ |
|
301 | 357 | 0, /* tp_base */ |
|
302 | 358 | 0, /* tp_dict */ |
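The writer changes above give ZstdCompressionWriter a real close(): __exit__ now delegates to close(), which flushes the final frame (flush_mode 1, i.e. ZSTD_e_end), marks the stream closed, and closes the inner writer when it has a close() method. A stdlib-only sketch of that control flow (FrameWriter, Sink, and the b"<end>" marker are stand-ins, not the real compressor):

```python
class Sink:
    """Minimal write target recording writes and close()."""
    def __init__(self):
        self.parts = []
        self.closed = False
    def write(self, b):
        self.parts.append(b)
    def close(self):
        self.closed = True

class FrameWriter:
    def __init__(self, inner):
        self.inner = inner
        self.closed = False
    def write(self, data):
        self.inner.write(data)  # stand-in for compress-and-forward
        return len(data)
    def close(self):
        if self.closed:
            return
        self.inner.write(b"<end>")  # stand-in for the ZSTD_e_end flush
        self.closed = True
        if hasattr(self.inner, "close"):  # mirrors PyObject_HasAttrString
            self.inner.close()
    def __enter__(self):
        if self.closed:
            raise ValueError("stream is closed")
        return self
    def __exit__(self, exc_type, exc_value, exc_tb):
        if exc_type is None:
            self.close()
        return False

sink = Sink()
with FrameWriter(sink) as w:
    w.write(b"hello")
```

Routing __exit__ through close() keeps the frame-finishing logic in one place instead of duplicating the old inline flush loop.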
@@ -59,9 +59,9 b' static PyObject* ZstdCompressionObj_comp' | |||
|
59 | 59 | input.size = source.len; |
|
60 | 60 | input.pos = 0; |
|
61 | 61 | |
|
62 | while ((ssize_t)input.pos < source.len) { | |
|
62 | while (input.pos < (size_t)source.len) { | |
|
63 | 63 | Py_BEGIN_ALLOW_THREADS |
|
64 | zresult = ZSTD_compress_generic(self->compressor->cctx, &self->output, | |
|
64 | zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output, | |
|
65 | 65 | &input, ZSTD_e_continue); |
|
66 | 66 | Py_END_ALLOW_THREADS |
|
67 | 67 | |
@@ -154,7 +154,7 b' static PyObject* ZstdCompressionObj_flus' | |||
|
154 | 154 | |
|
155 | 155 | while (1) { |
|
156 | 156 | Py_BEGIN_ALLOW_THREADS |
|
157 | zresult = ZSTD_compress_generic(self->compressor->cctx, &self->output, | |
|
157 | zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output, | |
|
158 | 158 | &input, zFlushMode); |
|
159 | 159 | Py_END_ALLOW_THREADS |
|
160 | 160 |
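The flush paths above all share one loop shape: call ZSTD_compressStream2() repeatedly, write out whatever landed in the output buffer, and stop once the return value reaches zero (no bytes left to flush). That skeleton, abstracted into Python, where step() is a hypothetical stand-in for the compressor call returning (bytes_remaining, output):

```python
def drain(step, write):
    """Call step() until it reports 0 bytes remaining, writing any output."""
    total = 0
    while True:
        remaining, out = step()
        if out:  # mirrors `if (self->output.pos) { ...write... }`
            write(out)
            total += len(out)
        if remaining == 0:  # mirrors `if (!zresult) { break; }`
            return total

# Fake compressor that needs three calls to finish flushing.
pending = [(2, b"aa"), (1, b"bb"), (0, b"")]
outputs = []
total = drain(lambda: pending.pop(0), outputs.append)
```

Looping on the return value rather than on output.pos matters because a flush can need several calls even when some iterations produce no output.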
@@ -204,27 +204,27 b' static int ZstdCompressor_init(ZstdCompr' | |||
|
204 | 204 | } |
|
205 | 205 | } |
|
206 | 206 | else { |
|
207 | if (set_parameter(self->params, ZSTD_p_compressionLevel, level)) { | |
|
207 | if (set_parameter(self->params, ZSTD_c_compressionLevel, level)) { | |
|
208 | 208 | return -1; |
|
209 | 209 | } |
|
210 | 210 | |
|
211 | if (set_parameter(self->params, ZSTD_p_contentSizeFlag, | |
|
211 | if (set_parameter(self->params, ZSTD_c_contentSizeFlag, | |
|
212 | 212 | writeContentSize ? PyObject_IsTrue(writeContentSize) : 1)) { |
|
213 | 213 | return -1; |
|
214 | 214 | } |
|
215 | 215 | |
|
216 | if (set_parameter(self->params, ZSTD_p_checksumFlag, | |
|
216 | if (set_parameter(self->params, ZSTD_c_checksumFlag, | |
|
217 | 217 | writeChecksum ? PyObject_IsTrue(writeChecksum) : 0)) { |
|
218 | 218 | return -1; |
|
219 | 219 | } |
|
220 | 220 | |
|
221 | if (set_parameter(self->params, ZSTD_p_dictIDFlag, | |
|
221 | if (set_parameter(self->params, ZSTD_c_dictIDFlag, | |
|
222 | 222 | writeDictID ? PyObject_IsTrue(writeDictID) : 1)) { |
|
223 | 223 | return -1; |
|
224 | 224 | } |
|
225 | 225 | |
|
226 | 226 | if (threads) { |
|
227 | if (set_parameter(self->params, ZSTD_p_nbWorkers, threads)) { | |
|
227 | if (set_parameter(self->params, ZSTD_c_nbWorkers, threads)) { | |
|
228 | 228 | return -1; |
|
229 | 229 | } |
|
230 | 230 | } |
@@ -344,7 +344,7 b' static PyObject* ZstdCompressor_copy_str' | |||
|
344 | 344 | return NULL; |
|
345 | 345 | } |
|
346 | 346 | |
|
347 | ZSTD_CCtx_reset(self->cctx); | |
|
347 | ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only); | |
|
348 | 348 | |
|
349 | 349 | zresult = ZSTD_CCtx_setPledgedSrcSize(self->cctx, sourceSize); |
|
350 | 350 | if (ZSTD_isError(zresult)) { |
@@ -391,7 +391,7 b' static PyObject* ZstdCompressor_copy_str' | |||
|
391 | 391 | |
|
392 | 392 | while (input.pos < input.size) { |
|
393 | 393 | Py_BEGIN_ALLOW_THREADS |
|
394 | zresult = ZSTD_compress_generic(self->cctx, &output, &input, ZSTD_e_continue); | |
|
394 | zresult = ZSTD_compressStream2(self->cctx, &output, &input, ZSTD_e_continue); | |
|
395 | 395 | Py_END_ALLOW_THREADS |
|
396 | 396 | |
|
397 | 397 | if (ZSTD_isError(zresult)) { |
@@ -421,7 +421,7 b' static PyObject* ZstdCompressor_copy_str' | |||
|
421 | 421 | |
|
422 | 422 | while (1) { |
|
423 | 423 | Py_BEGIN_ALLOW_THREADS |
|
424 | zresult = ZSTD_compress_generic(self->cctx, &output, &input, ZSTD_e_end); | |
|
424 | zresult = ZSTD_compressStream2(self->cctx, &output, &input, ZSTD_e_end); | |
|
425 | 425 | Py_END_ALLOW_THREADS |
|
426 | 426 | |
|
427 | 427 | if (ZSTD_isError(zresult)) { |
@@ -517,7 +517,7 b' static ZstdCompressionReader* ZstdCompre' | |||
|
517 | 517 | goto except; |
|
518 | 518 | } |
|
519 | 519 | |
|
520 | ZSTD_CCtx_reset(self->cctx); | |
|
520 | ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only); | |
|
521 | 521 | |
|
522 | 522 | zresult = ZSTD_CCtx_setPledgedSrcSize(self->cctx, sourceSize); |
|
523 | 523 | if (ZSTD_isError(zresult)) { |
@@ -577,7 +577,7 b' static PyObject* ZstdCompressor_compress' | |||
|
577 | 577 | goto finally; |
|
578 | 578 | } |
|
579 | 579 | |
|
580 | ZSTD_CCtx_reset(self->cctx); | |
|
580 | ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only); | |
|
581 | 581 | |
|
582 | 582 | destSize = ZSTD_compressBound(source.len); |
|
583 | 583 | output = PyBytes_FromStringAndSize(NULL, destSize); |
@@ -605,7 +605,7 b' static PyObject* ZstdCompressor_compress' | |||
|
605 | 605 | /* By avoiding ZSTD_compress(), we don't necessarily write out content |
|
606 | 606 | size. This means the argument to ZstdCompressor to control frame |
|
607 | 607 | parameters is honored. */ |
|
608 |
zresult = ZSTD_compress |
|
|
608 | zresult = ZSTD_compressStream2(self->cctx, &outBuffer, &inBuffer, ZSTD_e_end); | |
|
609 | 609 | Py_END_ALLOW_THREADS |
|
610 | 610 | |
|
611 | 611 | if (ZSTD_isError(zresult)) { |
@@ -651,7 +651,7 b' static ZstdCompressionObj* ZstdCompresso' | |||
|
651 | 651 | return NULL; |
|
652 | 652 | } |
|
653 | 653 | |
|
654 | ZSTD_CCtx_reset(self->cctx); | |
|
654 | ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only); | |
|
655 | 655 | |
|
656 | 656 | zresult = ZSTD_CCtx_setPledgedSrcSize(self->cctx, inSize); |
|
657 | 657 | if (ZSTD_isError(zresult)) { |
@@ -740,7 +740,7 b' static ZstdCompressorIterator* ZstdCompr' | |||
|
740 | 740 | goto except; |
|
741 | 741 | } |
|
742 | 742 | |
|
743 | ZSTD_CCtx_reset(self->cctx); | |
|
743 | ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only); | |
|
744 | 744 | |
|
745 | 745 | zresult = ZSTD_CCtx_setPledgedSrcSize(self->cctx, sourceSize); |
|
746 | 746 | if (ZSTD_isError(zresult)) { |
@@ -794,16 +794,19 b' static ZstdCompressionWriter* ZstdCompre' | |||
|
794 | 794 | "writer", |
|
795 | 795 | "size", |
|
796 | 796 | "write_size", |
|
797 | "write_return_read", | |
|
797 | 798 | NULL |
|
798 | 799 | }; |
|
799 | 800 | |
|
800 | 801 | PyObject* writer; |
|
801 | 802 | ZstdCompressionWriter* result; |
|
803 | size_t zresult; | |
|
802 | 804 | unsigned long long sourceSize = ZSTD_CONTENTSIZE_UNKNOWN; |
|
803 | 805 | size_t outSize = ZSTD_CStreamOutSize(); |
|
806 | PyObject* writeReturnRead = NULL; | |
|
804 | 807 | |
|
805 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|Kk:stream_writer", kwlist, | |
|
806 | &writer, &sourceSize, &outSize)) { | |
|
808 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|KkO:stream_writer", kwlist, | |
|
809 | &writer, &sourceSize, &outSize, &writeReturnRead)) { | |
|
807 | 810 | return NULL; |
|
808 | 811 | } |
|
809 | 812 | |
@@ -812,22 +815,38 b' static ZstdCompressionWriter* ZstdCompre' | |||
|
812 | 815 | return NULL; |
|
813 | 816 | } |
|
814 | 817 | |
|
815 | ZSTD_CCtx_reset(self->cctx); | |
|
818 | ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only); | |
|
819 | ||
|
820 | zresult = ZSTD_CCtx_setPledgedSrcSize(self->cctx, sourceSize); | |
|
821 | if (ZSTD_isError(zresult)) { | |
|
822 | PyErr_Format(ZstdError, "error setting source size: %s", | |
|
823 | ZSTD_getErrorName(zresult)); | |
|
824 | return NULL; | |
|
825 | } | |
|
816 | 826 | |
|
817 | 827 | result = (ZstdCompressionWriter*)PyObject_CallObject((PyObject*)&ZstdCompressionWriterType, NULL); |
|
818 | 828 | if (!result) { |
|
819 | 829 | return NULL; |
|
820 | 830 | } |
|
821 | 831 | |
|
832 | result->output.dst = PyMem_Malloc(outSize); | |
|
833 | if (!result->output.dst) { | |
|
834 | Py_DECREF(result); | |
|
835 | return (ZstdCompressionWriter*)PyErr_NoMemory(); | |
|
836 | } | |
|
837 | ||
|
838 | result->output.pos = 0; | |
|
839 | result->output.size = outSize; | |
|
840 | ||
|
822 | 841 | result->compressor = self; |
|
823 | 842 | Py_INCREF(result->compressor); |
|
824 | 843 | |
|
825 | 844 | result->writer = writer; |
|
826 | 845 | Py_INCREF(result->writer); |
|
827 | 846 | |
|
828 | result->sourceSize = sourceSize; | |
|
829 | 847 | result->outSize = outSize; |
|
830 | 848 | result->bytesCompressed = 0; |
|
849 | result->writeReturnRead = writeReturnRead ? PyObject_IsTrue(writeReturnRead) : 0; | |
|
831 | 850 | |
|
832 | 851 | return result; |
|
833 | 852 | } |
@@ -853,7 +872,7 b' static ZstdCompressionChunker* ZstdCompr' | |||
|
853 | 872 | return NULL; |
|
854 | 873 | } |
|
855 | 874 | |
|
856 | ZSTD_CCtx_reset(self->cctx); | |
|
875 | ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only); | |
|
857 | 876 | |
|
858 | 877 | zresult = ZSTD_CCtx_setPledgedSrcSize(self->cctx, sourceSize); |
|
859 | 878 | if (ZSTD_isError(zresult)) { |
@@ -1115,7 +1134,7 b' static void compress_worker(WorkerState*' | |||
|
1115 | 1134 | break; |
|
1116 | 1135 | } |
|
1117 | 1136 | |
|
1118 |
zresult = ZSTD_compress |
|
|
1137 | zresult = ZSTD_compressStream2(state->cctx, &opOutBuffer, &opInBuffer, ZSTD_e_end); | |
|
1119 | 1138 | if (ZSTD_isError(zresult)) { |
|
1120 | 1139 | state->error = WorkerError_zstd; |
|
1121 | 1140 | state->zresult = zresult; |
@@ -57,7 +57,7 b' feedcompressor:' | |||
|
57 | 57 | /* If we have data left in the input, consume it. */ |
|
58 | 58 | if (self->input.pos < self->input.size) { |
|
59 | 59 | Py_BEGIN_ALLOW_THREADS |
|
60 |
zresult = ZSTD_compress |
|
|
60 | zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output, | |
|
61 | 61 | &self->input, ZSTD_e_continue); |
|
62 | 62 | Py_END_ALLOW_THREADS |
|
63 | 63 | |
@@ -127,7 +127,7 b' feedcompressor:' | |||
|
127 | 127 | self->input.size = 0; |
|
128 | 128 | self->input.pos = 0; |
|
129 | 129 | |
|
130 |
zresult = ZSTD_compress |
|
|
130 | zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output, | |
|
131 | 131 | &self->input, ZSTD_e_end); |
|
132 | 132 | if (ZSTD_isError(zresult)) { |
|
133 | 133 | PyErr_Format(ZstdError, "error ending compression stream: %s", |
@@ -152,7 +152,7 b' feedcompressor:' | |||
|
152 | 152 | self->input.pos = 0; |
|
153 | 153 | |
|
154 | 154 | Py_BEGIN_ALLOW_THREADS |
|
155 |
zresult = ZSTD_compress |
|
|
155 | zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output, | |
|
156 | 156 | &self->input, ZSTD_e_continue); |
|
157 | 157 | Py_END_ALLOW_THREADS |
|
158 | 158 |
@@ -32,6 +32,9 b' void constants_module_init(PyObject* mod' | |||
|
32 | 32 | ZstdError = PyErr_NewException("zstd.ZstdError", NULL, NULL); |
|
33 | 33 | PyModule_AddObject(mod, "ZstdError", ZstdError); |
|
34 | 34 | |
|
35 | PyModule_AddIntConstant(mod, "FLUSH_BLOCK", 0); | |
|
36 | PyModule_AddIntConstant(mod, "FLUSH_FRAME", 1); | |
|
37 | ||
|
35 | 38 | PyModule_AddIntConstant(mod, "COMPRESSOBJ_FLUSH_FINISH", compressorobj_flush_finish); |
|
36 | 39 | PyModule_AddIntConstant(mod, "COMPRESSOBJ_FLUSH_BLOCK", compressorobj_flush_block); |
|
37 | 40 | |
@@ -77,8 +80,11 b' void constants_module_init(PyObject* mod' | |||
|
77 | 80 | PyModule_AddIntConstant(mod, "HASHLOG3_MAX", ZSTD_HASHLOG3_MAX); |
|
78 | 81 | PyModule_AddIntConstant(mod, "SEARCHLOG_MIN", ZSTD_SEARCHLOG_MIN); |
|
79 | 82 | PyModule_AddIntConstant(mod, "SEARCHLOG_MAX", ZSTD_SEARCHLOG_MAX); |
|
80 |
PyModule_AddIntConstant(mod, " |
|
|
81 |
PyModule_AddIntConstant(mod, " |
|
|
83 | PyModule_AddIntConstant(mod, "MINMATCH_MIN", ZSTD_MINMATCH_MIN); | |
|
84 | PyModule_AddIntConstant(mod, "MINMATCH_MAX", ZSTD_MINMATCH_MAX); | |
|
85 | /* TODO SEARCHLENGTH_* is deprecated. */ | |
|
86 | PyModule_AddIntConstant(mod, "SEARCHLENGTH_MIN", ZSTD_MINMATCH_MIN); | |
|
87 | PyModule_AddIntConstant(mod, "SEARCHLENGTH_MAX", ZSTD_MINMATCH_MAX); | |
|
82 | 88 | PyModule_AddIntConstant(mod, "TARGETLENGTH_MIN", ZSTD_TARGETLENGTH_MIN); |
|
83 | 89 | PyModule_AddIntConstant(mod, "TARGETLENGTH_MAX", ZSTD_TARGETLENGTH_MAX); |
|
84 | 90 | PyModule_AddIntConstant(mod, "LDM_MINMATCH_MIN", ZSTD_LDM_MINMATCH_MIN); |
@@ -93,6 +99,7 b' void constants_module_init(PyObject* mod' | |||
|
93 | 99 | PyModule_AddIntConstant(mod, "STRATEGY_BTLAZY2", ZSTD_btlazy2); |
|
94 | 100 | PyModule_AddIntConstant(mod, "STRATEGY_BTOPT", ZSTD_btopt); |
|
95 | 101 | PyModule_AddIntConstant(mod, "STRATEGY_BTULTRA", ZSTD_btultra); |
|
102 | PyModule_AddIntConstant(mod, "STRATEGY_BTULTRA2", ZSTD_btultra2); | |
|
96 | 103 | |
|
97 | 104 | PyModule_AddIntConstant(mod, "DICT_TYPE_AUTO", ZSTD_dct_auto); |
|
98 | 105 | PyModule_AddIntConstant(mod, "DICT_TYPE_RAWCONTENT", ZSTD_dct_rawContent); |
@@ -102,6 +102,114 @@ static PyObject* reader_isatty(PyObject*
     Py_RETURN_FALSE;
 }

+/**
+ * Read available input.
+ *
+ * Returns 0 if no data was added to input.
+ * Returns 1 if new input data is available.
+ * Returns -1 on error and sets a Python exception as a side-effect.
+ */
+int read_decompressor_input(ZstdDecompressionReader* self) {
+    if (self->finishedInput) {
+        return 0;
+    }
+
+    if (self->input.pos != self->input.size) {
+        return 0;
+    }
+
+    if (self->reader) {
+        Py_buffer buffer;
+
+        assert(self->readResult == NULL);
+        self->readResult = PyObject_CallMethod(self->reader, "read",
+            "k", self->readSize);
+        if (NULL == self->readResult) {
+            return -1;
+        }
+
+        memset(&buffer, 0, sizeof(buffer));
+
+        if (0 != PyObject_GetBuffer(self->readResult, &buffer, PyBUF_CONTIG_RO)) {
+            return -1;
+        }
+
+        /* EOF */
+        if (0 == buffer.len) {
+            self->finishedInput = 1;
+            Py_CLEAR(self->readResult);
+        }
+        else {
+            self->input.src = buffer.buf;
+            self->input.size = buffer.len;
+            self->input.pos = 0;
+        }
+
+        PyBuffer_Release(&buffer);
+    }
+    else {
+        assert(self->buffer.buf);
+        /*
+         * We should only get here once since expectation is we always
+         * exhaust input buffer before reading again.
+         */
+        assert(self->input.src == NULL);
+
+        self->input.src = self->buffer.buf;
+        self->input.size = self->buffer.len;
+        self->input.pos = 0;
+    }
+
+    return 1;
+}
+
+/**
+ * Decompresses available input into an output buffer.
+ *
+ * Returns 0 if we need more input.
+ * Returns 1 if output buffer should be emitted.
+ * Returns -1 on error and sets a Python exception.
+ */
+int decompress_input(ZstdDecompressionReader* self, ZSTD_outBuffer* output) {
+    size_t zresult;
+
+    if (self->input.pos >= self->input.size) {
+        return 0;
+    }
+
+    Py_BEGIN_ALLOW_THREADS
+    zresult = ZSTD_decompressStream(self->decompressor->dctx, output, &self->input);
+    Py_END_ALLOW_THREADS
+
+    /* Input exhausted. Clear our state tracking. */
+    if (self->input.pos == self->input.size) {
+        memset(&self->input, 0, sizeof(self->input));
+        Py_CLEAR(self->readResult);
+
+        if (self->buffer.buf) {
+            self->finishedInput = 1;
+        }
+    }
+
+    if (ZSTD_isError(zresult)) {
+        PyErr_Format(ZstdError, "zstd decompress error: %s", ZSTD_getErrorName(zresult));
+        return -1;
+    }
+
+    /* We fulfilled the full read request. Signal to emit. */
+    if (output->pos && output->pos == output->size) {
+        return 1;
+    }
+    /* We're at the end of a frame and we aren't allowed to return data
+       spanning frames. */
+    else if (output->pos && zresult == 0 && !self->readAcrossFrames) {
+        return 1;
+    }
+
+    /* There is more room in the output. Signal to collect more data. */
+    return 0;
+}
+
 static PyObject* reader_read(ZstdDecompressionReader* self, PyObject* args, PyObject* kwargs) {
     static char* kwlist[] = {
         "size",
@@ -113,26 +221,30 @@ static PyObject* reader_read(ZstdDecompr
     char* resultBuffer;
     Py_ssize_t resultSize;
     ZSTD_outBuffer output;
-    size_t zresult;
+    int decompressResult, readResult;

     if (self->closed) {
         PyErr_SetString(PyExc_ValueError, "stream is closed");
         return NULL;
     }

-    if (self->finishedOutput) {
-        return PyBytes_FromStringAndSize("", 0);
-    }
-
-    if (!PyArg_ParseTupleAndKeywords(args, kwargs, "n", kwlist, &size)) {
+    if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|n", kwlist, &size)) {
         return NULL;
     }

-    if (size < 1) {
-        PyErr_SetString(PyExc_ValueError, "cannot read negative or size 0 amounts");
+    if (size < -1) {
+        PyErr_SetString(PyExc_ValueError, "cannot read negative amounts less than -1");
         return NULL;
     }

+    if (size == -1) {
+        return PyObject_CallMethod((PyObject*)self, "readall", NULL);
+    }
+
+    if (self->finishedOutput || size == 0) {
+        return PyBytes_FromStringAndSize("", 0);
+    }
+
     result = PyBytes_FromStringAndSize(NULL, size);
     if (NULL == result) {
         return NULL;
@@ -146,85 +258,38 @@ static PyObject* reader_read(ZstdDecompr

 readinput:

-    /* Consume input data left over from last time. */
-    if (self->input.pos < self->input.size) {
-        Py_BEGIN_ALLOW_THREADS
-        zresult = ZSTD_decompress_generic(self->decompressor->dctx,
-            &output, &self->input);
-        Py_END_ALLOW_THREADS
+    decompressResult = decompress_input(self, &output);

-        /* Input exhausted. Clear our state tracking. */
-        if (self->input.pos == self->input.size) {
-            memset(&self->input, 0, sizeof(self->input));
-            Py_CLEAR(self->readResult);
+    if (-1 == decompressResult) {
+        Py_XDECREF(result);
+        return NULL;
+    }
+    else if (0 == decompressResult) { }
+    else if (1 == decompressResult) {
+        self->bytesDecompressed += output.pos;

-            if (self->buffer.buf) {
-                self->finishedInput = 1;
+        if (output.pos != output.size) {
+            if (safe_pybytes_resize(&result, output.pos)) {
+                Py_XDECREF(result);
+                return NULL;
             }
         }
-
-        if (ZSTD_isError(zresult)) {
-            PyErr_Format(ZstdError, "zstd decompress error: %s", ZSTD_getErrorName(zresult));
-            return NULL;
-        }
-        else if (0 == zresult) {
-            self->finishedOutput = 1;
-        }
-
-        /* We fulfilled the full read request. Emit it. */
-        if (output.pos && output.pos == output.size) {
-            self->bytesDecompressed += output.size;
-            return result;
-        }
-
-        /*
-         * There is more room in the output. Fall through to try to collect
-         * more data so we can try to fill the output.
-         */
+        return result;
+    }
+    else {
+        assert(0);
     }

-    if (!self->finishedInput) {
-        if (self->reader) {
-            Py_buffer buffer;
-
-            assert(self->readResult == NULL);
-            self->readResult = PyObject_CallMethod(self->reader, "read",
-                "k", self->readSize);
-            if (NULL == self->readResult) {
-                return NULL;
-            }
-
-            memset(&buffer, 0, sizeof(buffer));
-
-            if (0 != PyObject_GetBuffer(self->readResult, &buffer, PyBUF_CONTIG_RO)) {
-                return NULL;
-            }
+    readResult = read_decompressor_input(self);

-            /* EOF */
-            if (0 == buffer.len) {
-                self->finishedInput = 1;
-                Py_CLEAR(self->readResult);
-            }
-            else {
-                self->input.src = buffer.buf;
-                self->input.size = buffer.len;
-                self->input.pos = 0;
-            }
-
-            PyBuffer_Release(&buffer);
-        }
-        else {
-            assert(self->buffer.buf);
-            /*
-             * We should only get here once since above block will exhaust
-             * source buffer until finishedInput is set.
-             */
-            assert(self->input.src == NULL);
-
-            self->input.src = self->buffer.buf;
-            self->input.size = self->buffer.len;
-            self->input.pos = 0;
-        }
+    if (-1 == readResult) {
+        Py_XDECREF(result);
+        return NULL;
+    }
+    else if (0 == readResult) {}
+    else if (1 == readResult) {}
+    else {
+        assert(0);
     }

     if (self->input.size) {
@@ -242,18 +307,288 @@ readinput:
         return result;
     }

+static PyObject* reader_read1(ZstdDecompressionReader* self, PyObject* args, PyObject* kwargs) {
+    static char* kwlist[] = {
+        "size",
+        NULL
+    };
+
+    Py_ssize_t size = -1;
+    PyObject* result = NULL;
+    char* resultBuffer;
+    Py_ssize_t resultSize;
+    ZSTD_outBuffer output;
+
+    if (self->closed) {
+        PyErr_SetString(PyExc_ValueError, "stream is closed");
+        return NULL;
+    }
+
+    if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|n", kwlist, &size)) {
+        return NULL;
+    }
+
+    if (size < -1) {
+        PyErr_SetString(PyExc_ValueError, "cannot read negative amounts less than -1");
+        return NULL;
+    }
+
+    if (self->finishedOutput || size == 0) {
+        return PyBytes_FromStringAndSize("", 0);
+    }
+
+    if (size == -1) {
+        size = ZSTD_DStreamOutSize();
+    }
+
+    result = PyBytes_FromStringAndSize(NULL, size);
+    if (NULL == result) {
+        return NULL;
+    }
+
+    PyBytes_AsStringAndSize(result, &resultBuffer, &resultSize);
+
+    output.dst = resultBuffer;
+    output.size = resultSize;
+    output.pos = 0;
+
+    /* read1() is supposed to use at most 1 read() from the underlying stream.
+     * However, we can't satisfy this requirement with decompression due to the
+     * nature of how decompression works. Our strategy is to read + decompress
+     * until we get any output, at which point we return. This satisfies the
+     * intent of the read1() API to limit read operations.
+     */
+    while (!self->finishedInput) {
+        int readResult, decompressResult;
+
+        readResult = read_decompressor_input(self);
+        if (-1 == readResult) {
+            Py_XDECREF(result);
+            return NULL;
+        }
+        else if (0 == readResult || 1 == readResult) { }
+        else {
+            assert(0);
+        }
+
+        decompressResult = decompress_input(self, &output);
+
+        if (-1 == decompressResult) {
+            Py_XDECREF(result);
+            return NULL;
+        }
+        else if (0 == decompressResult || 1 == decompressResult) { }
+        else {
+            assert(0);
+        }
+
+        if (output.pos) {
+            break;
+        }
+    }
+
+    self->bytesDecompressed += output.pos;
+    if (safe_pybytes_resize(&result, output.pos)) {
+        Py_XDECREF(result);
+        return NULL;
+    }
+
+    return result;
+}
+
+static PyObject* reader_readinto(ZstdDecompressionReader* self, PyObject* args) {
+    Py_buffer dest;
+    ZSTD_outBuffer output;
+    int decompressResult, readResult;
+    PyObject* result = NULL;
+
+    if (self->closed) {
+        PyErr_SetString(PyExc_ValueError, "stream is closed");
+        return NULL;
+    }
+
+    if (self->finishedOutput) {
+        return PyLong_FromLong(0);
+    }
+
+    if (!PyArg_ParseTuple(args, "w*:readinto", &dest)) {
+        return NULL;
+    }
+
+    if (!PyBuffer_IsContiguous(&dest, 'C') || dest.ndim > 1) {
+        PyErr_SetString(PyExc_ValueError,
+            "destination buffer should be contiguous and have at most one dimension");
+        goto finally;
+    }
+
+    output.dst = dest.buf;
+    output.size = dest.len;
+    output.pos = 0;
+
+readinput:
+
+    decompressResult = decompress_input(self, &output);
+
+    if (-1 == decompressResult) {
+        goto finally;
+    }
+    else if (0 == decompressResult) { }
+    else if (1 == decompressResult) {
+        self->bytesDecompressed += output.pos;
+        result = PyLong_FromSize_t(output.pos);
+        goto finally;
+    }
+    else {
+        assert(0);
+    }
+
+    readResult = read_decompressor_input(self);
+
+    if (-1 == readResult) {
+        goto finally;
+    }
+    else if (0 == readResult) {}
+    else if (1 == readResult) {}
+    else {
+        assert(0);
+    }
+
+    if (self->input.size) {
+        goto readinput;
+    }
+
+    /* EOF */
+    self->bytesDecompressed += output.pos;
+    result = PyLong_FromSize_t(output.pos);
+
+finally:
+    PyBuffer_Release(&dest);
+
+    return result;
+}
+
+static PyObject* reader_readinto1(ZstdDecompressionReader* self, PyObject* args) {
+    Py_buffer dest;
+    ZSTD_outBuffer output;
+    PyObject* result = NULL;
+
+    if (self->closed) {
+        PyErr_SetString(PyExc_ValueError, "stream is closed");
+        return NULL;
+    }
+
+    if (self->finishedOutput) {
+        return PyLong_FromLong(0);
+    }
+
+    if (!PyArg_ParseTuple(args, "w*:readinto1", &dest)) {
+        return NULL;
+    }
+
+    if (!PyBuffer_IsContiguous(&dest, 'C') || dest.ndim > 1) {
+        PyErr_SetString(PyExc_ValueError,
+            "destination buffer should be contiguous and have at most one dimension");
+        goto finally;
+    }
+
+    output.dst = dest.buf;
+    output.size = dest.len;
+    output.pos = 0;
+
+    while (!self->finishedInput && !self->finishedOutput) {
+        int decompressResult, readResult;
+
+        readResult = read_decompressor_input(self);
+
+        if (-1 == readResult) {
+            goto finally;
+        }
+        else if (0 == readResult || 1 == readResult) {}
+        else {
+            assert(0);
+        }
+
+        decompressResult = decompress_input(self, &output);
+
+        if (-1 == decompressResult) {
+            goto finally;
+        }
+        else if (0 == decompressResult || 1 == decompressResult) {}
+        else {
+            assert(0);
+        }
+
+        if (output.pos) {
+            break;
+        }
+    }
+
+    self->bytesDecompressed += output.pos;
+    result = PyLong_FromSize_t(output.pos);
+
+finally:
+    PyBuffer_Release(&dest);
+
+    return result;
+}
+
 static PyObject* reader_readall(PyObject* self) {
-    PyErr_SetNone(PyExc_NotImplementedError);
-    return NULL;
+    PyObject* chunks = NULL;
+    PyObject* empty = NULL;
+    PyObject* result = NULL;
+
+    /* Our strategy is to collect chunks into a list then join all the
+     * chunks at the end. We could potentially use e.g. an io.BytesIO. But
+     * this feels simple enough to implement and avoids potentially expensive
+     * reallocations of large buffers.
+     */
+    chunks = PyList_New(0);
+    if (NULL == chunks) {
+        return NULL;
+    }
+
+    while (1) {
+        PyObject* chunk = PyObject_CallMethod(self, "read", "i", 1048576);
+        if (NULL == chunk) {
+            Py_DECREF(chunks);
+            return NULL;
+        }
+
+        if (!PyBytes_Size(chunk)) {
+            Py_DECREF(chunk);
+            break;
+        }
+
+        if (PyList_Append(chunks, chunk)) {
+            Py_DECREF(chunk);
+            Py_DECREF(chunks);
+            return NULL;
+        }
+
+        Py_DECREF(chunk);
+    }
+
+    empty = PyBytes_FromStringAndSize("", 0);
+    if (NULL == empty) {
+        Py_DECREF(chunks);
+        return NULL;
+    }
+
+    result = PyObject_CallMethod(empty, "join", "O", chunks);
+
+    Py_DECREF(empty);
+    Py_DECREF(chunks);
+
+    return result;
 }

 static PyObject* reader_readline(PyObject* self) {
-    PyErr_SetNone(PyExc_NotImplementedError);
+    set_unsupported_operation();
     return NULL;
 }

 static PyObject* reader_readlines(PyObject* self) {
-    PyErr_SetNone(PyExc_NotImplementedError);
+    set_unsupported_operation();
     return NULL;
 }

@@ -345,12 +680,12 @@ static PyObject* reader_writelines(PyObj
 }

 static PyObject* reader_iter(PyObject* self) {
-    PyErr_SetNone(PyExc_NotImplementedError);
+    set_unsupported_operation();
     return NULL;
 }

 static PyObject* reader_iternext(PyObject* self) {
-    PyErr_SetNone(PyExc_NotImplementedError);
+    set_unsupported_operation();
     return NULL;
 }

@@ -367,6 +702,10 @@ static PyMethodDef reader_methods[] = {
         PyDoc_STR("Returns True") },
     { "read", (PyCFunction)reader_read, METH_VARARGS | METH_KEYWORDS,
         PyDoc_STR("read compressed data") },
+    { "read1", (PyCFunction)reader_read1, METH_VARARGS | METH_KEYWORDS,
+        PyDoc_STR("read compressed data") },
+    { "readinto", (PyCFunction)reader_readinto, METH_VARARGS, NULL },
+    { "readinto1", (PyCFunction)reader_readinto1, METH_VARARGS, NULL },
     { "readall", (PyCFunction)reader_readall, METH_NOARGS, PyDoc_STR("Not implemented") },
     { "readline", (PyCFunction)reader_readline, METH_NOARGS, PyDoc_STR("Not implemented") },
     { "readlines", (PyCFunction)reader_readlines, METH_NOARGS, PyDoc_STR("Not implemented") },
@@ -22,12 +22,13 @@ static void ZstdDecompressionWriter_deal
 }

 static PyObject* ZstdDecompressionWriter_enter(ZstdDecompressionWriter* self) {
-    if (self->entered) {
-        PyErr_SetString(ZstdError, "cannot __enter__ multiple times");
+    if (self->closed) {
+        PyErr_SetString(PyExc_ValueError, "stream is closed");
         return NULL;
     }

-    if (ensure_dctx(self->decompressor, 1)) {
+    if (self->entered) {
+        PyErr_SetString(ZstdError, "cannot __enter__ multiple times");
         return NULL;
     }

@@ -40,6 +41,10 @@ static PyObject* ZstdDecompressionWriter
 static PyObject* ZstdDecompressionWriter_exit(ZstdDecompressionWriter* self, PyObject* args) {
     self->entered = 0;

+    if (NULL == PyObject_CallMethod((PyObject*)self, "close", NULL)) {
+        return NULL;
+    }
+
     Py_RETURN_FALSE;
 }

@@ -76,9 +81,9 @@ static PyObject* ZstdDecompressionWriter
         goto finally;
     }

-    if (!self->entered) {
-        PyErr_SetString(ZstdError, "write must be called from an active context manager");
-        goto finally;
+    if (self->closed) {
+        PyErr_SetString(PyExc_ValueError, "stream is closed");
+        return NULL;
     }

     output.dst = PyMem_Malloc(self->outSize);
@@ -93,9 +98,9 @@ static PyObject* ZstdDecompressionWriter
     input.size = source.len;
     input.pos = 0;

-    while ((ssize_t)input.pos < source.len) {
+    while (input.pos < (size_t)source.len) {
         Py_BEGIN_ALLOW_THREADS
-        zresult = ZSTD_decompress_generic(self->decompressor->dctx, &output, &input);
+        zresult = ZSTD_decompressStream(self->decompressor->dctx, &output, &input);
         Py_END_ALLOW_THREADS

         if (ZSTD_isError(zresult)) {
@@ -120,13 +125,94 @@ static PyObject* ZstdDecompressionWriter

     PyMem_Free(output.dst);

-    result = PyLong_FromSsize_t(totalWrite);
+    if (self->writeReturnRead) {
+        result = PyLong_FromSize_t(input.pos);
+    }
+    else {
+        result = PyLong_FromSsize_t(totalWrite);
+    }

 finally:
     PyBuffer_Release(&source);
     return result;
 }

+static PyObject* ZstdDecompressionWriter_close(ZstdDecompressionWriter* self) {
+    PyObject* result;
+
+    if (self->closed) {
+        Py_RETURN_NONE;
+    }
+
+    result = PyObject_CallMethod((PyObject*)self, "flush", NULL);
+    self->closed = 1;
+
+    if (NULL == result) {
+        return NULL;
+    }
+
+    /* Call close on underlying stream as well. */
+    if (PyObject_HasAttrString(self->writer, "close")) {
+        return PyObject_CallMethod(self->writer, "close", NULL);
+    }
+
+    Py_RETURN_NONE;
+}
+
+static PyObject* ZstdDecompressionWriter_fileno(ZstdDecompressionWriter* self) {
+    if (PyObject_HasAttrString(self->writer, "fileno")) {
+        return PyObject_CallMethod(self->writer, "fileno", NULL);
+    }
+    else {
+        PyErr_SetString(PyExc_OSError, "fileno not available on underlying writer");
+        return NULL;
+    }
+}
+
+static PyObject* ZstdDecompressionWriter_flush(ZstdDecompressionWriter* self) {
+    if (self->closed) {
+        PyErr_SetString(PyExc_ValueError, "stream is closed");
+        return NULL;
+    }
+
+    if (PyObject_HasAttrString(self->writer, "flush")) {
+        return PyObject_CallMethod(self->writer, "flush", NULL);
+    }
+    else {
+        Py_RETURN_NONE;
+    }
+}
+
+static PyObject* ZstdDecompressionWriter_false(PyObject* self, PyObject* args) {
+    Py_RETURN_FALSE;
+}
+
+static PyObject* ZstdDecompressionWriter_true(PyObject* self, PyObject* args) {
+    Py_RETURN_TRUE;
+}
+
+static PyObject* ZstdDecompressionWriter_unsupported(PyObject* self, PyObject* args, PyObject* kwargs) {
+    PyObject* iomod;
+    PyObject* exc;
+
+    iomod = PyImport_ImportModule("io");
+    if (NULL == iomod) {
+        return NULL;
+    }
+
+    exc = PyObject_GetAttrString(iomod, "UnsupportedOperation");
+    if (NULL == exc) {
+        Py_DECREF(iomod);
+        return NULL;
+    }
+
+    PyErr_SetNone(exc);
+    Py_DECREF(exc);
+    Py_DECREF(iomod);
+
+    return NULL;
+}
+
 static PyMethodDef ZstdDecompressionWriter_methods[] = {
     { "__enter__", (PyCFunction)ZstdDecompressionWriter_enter, METH_NOARGS,
         PyDoc_STR("Enter a decompression context.") },
@@ -134,11 +220,32 @@ static PyMethodDef ZstdDecompressionWrit
         PyDoc_STR("Exit a decompression context.") },
     { "memory_size", (PyCFunction)ZstdDecompressionWriter_memory_size, METH_NOARGS,
         PyDoc_STR("Obtain the memory size in bytes of the underlying decompressor.") },
+    { "close", (PyCFunction)ZstdDecompressionWriter_close, METH_NOARGS, NULL },
+    { "fileno", (PyCFunction)ZstdDecompressionWriter_fileno, METH_NOARGS, NULL },
+    { "flush", (PyCFunction)ZstdDecompressionWriter_flush, METH_NOARGS, NULL },
|
226 | { "isatty", ZstdDecompressionWriter_false, METH_NOARGS, NULL }, | |
|
227 | { "readable", ZstdDecompressionWriter_false, METH_NOARGS, NULL }, | |
|
228 | { "readline", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL }, | |
|
229 | { "readlines", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL }, | |
|
230 | { "seek", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL }, | |
|
231 | { "seekable", ZstdDecompressionWriter_false, METH_NOARGS, NULL }, | |
|
232 | { "tell", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL }, | |
|
233 | { "truncate", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL }, | |
|
234 | { "writable", ZstdDecompressionWriter_true, METH_NOARGS, NULL }, | |
|
235 | { "writelines" , (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL }, | |
|
236 | { "read", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL }, | |
|
237 | { "readall", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL }, | |
|
238 | { "readinto", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL }, | |
|
137 | 239 | { "write", (PyCFunction)ZstdDecompressionWriter_write, METH_VARARGS | METH_KEYWORDS, |
|
138 | 240 | PyDoc_STR("Compress data") }, |
|
139 | 241 | { NULL, NULL } |
|
140 | 242 | }; |
|
141 | 243 | |
|
244 | static PyMemberDef ZstdDecompressionWriter_members[] = { | |
|
245 | { "closed", T_BOOL, offsetof(ZstdDecompressionWriter, closed), READONLY, NULL }, | |
|
246 | { NULL } | |
|
247 | }; | |
|
248 | ||
|
142 | 249 | PyTypeObject ZstdDecompressionWriterType = { |
|
143 | 250 | PyVarObject_HEAD_INIT(NULL, 0) |
|
144 | 251 | "zstd.ZstdDecompressionWriter", /* tp_name */ |
@@ -168,7 +275,7 b' PyTypeObject ZstdDecompressionWriterType' | |||
|
168 | 275 | 0, /* tp_iter */ |
|
169 | 276 | 0, /* tp_iternext */ |
|
170 | 277 | ZstdDecompressionWriter_methods,/* tp_methods */ |
|
171 | 0, /* tp_members */ | |
|
278 | ZstdDecompressionWriter_members,/* tp_members */ | |
|
172 | 279 | 0, /* tp_getset */ |
|
173 | 280 | 0, /* tp_base */ |
|
174 | 281 | 0, /* tp_dict */ |
@@ -75,7 +75,7 b' static PyObject* DecompressionObj_decomp' | |||
|
75 | 75 | |
|
76 | 76 | while (1) { |
|
77 | 77 | Py_BEGIN_ALLOW_THREADS |
|
78 | zresult = ZSTD_decompress
|
78 | zresult = ZSTD_decompressStream(self->decompressor->dctx, &output, &input); | |
|
79 | 79 | Py_END_ALLOW_THREADS |
|
80 | 80 | |
|
81 | 81 | if (ZSTD_isError(zresult)) { |
@@ -130,9 +130,26 b' finally:' | |||
|
130 | 130 | return result; |
|
131 | 131 | } |
|
132 | 132 | |
|
133 | static PyObject* DecompressionObj_flush(ZstdDecompressionObj* self, PyObject* args, PyObject* kwargs) { | |
|
134 | static char* kwlist[] = { | |
|
135 | "length", | |
|
136 | NULL | |
|
137 | }; | |
|
138 | ||
|
139 | PyObject* length = NULL; | |
|
140 | ||
|
141 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|O:flush", kwlist, &length)) { | |
|
142 | return NULL; | |
|
143 | } | |
|
144 | ||
|
145 | Py_RETURN_NONE; | |
|
146 | } | |
|
147 | ||
|
133 | 148 | static PyMethodDef DecompressionObj_methods[] = { |
|
134 | 149 | { "decompress", (PyCFunction)DecompressionObj_decompress, |
|
135 | 150 | METH_VARARGS | METH_KEYWORDS, PyDoc_STR("decompress data") }, |
|
151 | { "flush", (PyCFunction)DecompressionObj_flush, | |
|
152 | METH_VARARGS | METH_KEYWORDS, PyDoc_STR("no-op") }, | |
|
136 | 153 | { NULL, NULL } |
|
137 | 154 | }; |
|
138 | 155 |
@@ -17,7 +17,7 b' extern PyObject* ZstdError;' | |||
|
17 | 17 | int ensure_dctx(ZstdDecompressor* decompressor, int loadDict) { |
|
18 | 18 | size_t zresult; |
|
19 | 19 | |
|
20 | ZSTD_DCtx_reset(decompressor->dctx); | |
|
20 | ZSTD_DCtx_reset(decompressor->dctx, ZSTD_reset_session_only); | |
|
21 | 21 | |
|
22 | 22 | if (decompressor->maxWindowSize) { |
|
23 | 23 | zresult = ZSTD_DCtx_setMaxWindowSize(decompressor->dctx, decompressor->maxWindowSize); |
@@ -229,7 +229,7 b' static PyObject* Decompressor_copy_strea' | |||
|
229 | 229 | |
|
230 | 230 | while (input.pos < input.size) { |
|
231 | 231 | Py_BEGIN_ALLOW_THREADS |
|
232 | zresult = ZSTD_decompress
|
232 | zresult = ZSTD_decompressStream(self->dctx, &output, &input); | |
|
233 | 233 | Py_END_ALLOW_THREADS |
|
234 | 234 | |
|
235 | 235 | if (ZSTD_isError(zresult)) { |
@@ -379,7 +379,7 b' PyObject* Decompressor_decompress(ZstdDe' | |||
|
379 | 379 | inBuffer.pos = 0; |
|
380 | 380 | |
|
381 | 381 | Py_BEGIN_ALLOW_THREADS |
|
382 | zresult = ZSTD_decompress
|
382 | zresult = ZSTD_decompressStream(self->dctx, &outBuffer, &inBuffer); | |
|
383 | 383 | Py_END_ALLOW_THREADS |
|
384 | 384 | |
|
385 | 385 | if (ZSTD_isError(zresult)) { |
@@ -550,28 +550,35 b' finally:' | |||
|
550 | 550 | } |
|
551 | 551 | |
|
552 | 552 | PyDoc_STRVAR(Decompressor_stream_reader__doc__, |
|
553 | "stream_reader(source, [read_size=default])\n" | |
|
553 | "stream_reader(source, [read_size=default, [read_across_frames=False]])\n" | |
|
554 | 554 | "\n" |
|
555 | 555 | "Obtain an object that behaves like an I/O stream that can be used for\n" |
|
556 | 556 | "reading decompressed output from an object.\n" |
|
557 | 557 | "\n" |
|
558 | 558 | "The source object can be any object with a ``read(size)`` method or that\n" |
|
559 | 559 | "conforms to the buffer protocol.\n" |
|
560 | "\n" | |
|
561 | "``read_across_frames`` controls the behavior of ``read()`` when the end\n" | |
|
562 | "of a zstd frame is reached. When ``True``, ``read()`` can potentially\n" | |
|
563 | "return data belonging to multiple zstd frames. When ``False``, ``read()``\n" | |
|
564 | "will return when the end of a frame is reached.\n" | |
|
560 | 565 | ); |
|
561 | 566 | |
|
562 | 567 | static ZstdDecompressionReader* Decompressor_stream_reader(ZstdDecompressor* self, PyObject* args, PyObject* kwargs) { |
|
563 | 568 | static char* kwlist[] = { |
|
564 | 569 | "source", |
|
565 | 570 | "read_size", |
|
571 | "read_across_frames", | |
|
566 | 572 | NULL |
|
567 | 573 | }; |
|
568 | 574 | |
|
569 | 575 | PyObject* source; |
|
570 | 576 | size_t readSize = ZSTD_DStreamInSize(); |
|
577 | PyObject* readAcrossFrames = NULL; | |
|
571 | 578 | ZstdDecompressionReader* result; |
|
572 | 579 | |
|
573 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|k:stream_reader", kwlist, | |
|
574 | &source, &readSize)) { | |
|
580 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|kO:stream_reader", kwlist, | |
|
581 | &source, &readSize, &readAcrossFrames)) { | |
|
575 | 582 | return NULL; |
|
576 | 583 | } |
|
577 | 584 | |
@@ -604,6 +611,7 b' static ZstdDecompressionReader* Decompre' | |||
|
604 | 611 | |
|
605 | 612 | result->decompressor = self; |
|
606 | 613 | Py_INCREF(self); |
|
614 | result->readAcrossFrames = readAcrossFrames ? PyObject_IsTrue(readAcrossFrames) : 0; | |
|
607 | 615 | |
|
608 | 616 | return result; |
|
609 | 617 | } |
@@ -625,15 +633,17 b' static ZstdDecompressionWriter* Decompre' | |||
|
625 | 633 | static char* kwlist[] = { |
|
626 | 634 | "writer", |
|
627 | 635 | "write_size", |
|
636 | "write_return_read", | |
|
628 | 637 | NULL |
|
629 | 638 | }; |
|
630 | 639 | |
|
631 | 640 | PyObject* writer; |
|
632 | 641 | size_t outSize = ZSTD_DStreamOutSize(); |
|
642 | PyObject* writeReturnRead = NULL; | |
|
633 | 643 | ZstdDecompressionWriter* result; |
|
634 | 644 | |
|
635 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|k:stream_writer", kwlist, | |
|
636 | &writer, &outSize)) { | |
|
645 | if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|kO:stream_writer", kwlist, | |
|
646 | &writer, &outSize, &writeReturnRead)) { | |
|
637 | 647 | return NULL; |
|
638 | 648 | } |
|
639 | 649 | |
@@ -642,6 +652,10 b' static ZstdDecompressionWriter* Decompre' | |||
|
642 | 652 | return NULL; |
|
643 | 653 | } |
|
644 | 654 | |
|
655 | if (ensure_dctx(self, 1)) { | |
|
656 | return NULL; | |
|
657 | } | |
|
658 | ||
|
645 | 659 | result = (ZstdDecompressionWriter*)PyObject_CallObject((PyObject*)&ZstdDecompressionWriterType, NULL); |
|
646 | 660 | if (!result) { |
|
647 | 661 | return NULL; |
@@ -654,6 +668,7 b' static ZstdDecompressionWriter* Decompre' | |||
|
654 | 668 | Py_INCREF(result->writer); |
|
655 | 669 | |
|
656 | 670 | result->outSize = outSize; |
|
671 | result->writeReturnRead = writeReturnRead ? PyObject_IsTrue(writeReturnRead) : 0; | |
|
657 | 672 | |
|
658 | 673 | return result; |
|
659 | 674 | } |
@@ -756,7 +771,7 b' static PyObject* Decompressor_decompress' | |||
|
756 | 771 | inBuffer.pos = 0; |
|
757 | 772 | |
|
758 | 773 | Py_BEGIN_ALLOW_THREADS |
|
759 | zresult = ZSTD_decompress
|
774 | zresult = ZSTD_decompressStream(self->dctx, &outBuffer, &inBuffer); | |
|
760 | 775 | Py_END_ALLOW_THREADS |
|
761 | 776 | if (ZSTD_isError(zresult)) { |
|
762 | 777 | PyErr_Format(ZstdError, "could not decompress chunk 0: %s", ZSTD_getErrorName(zresult)); |
@@ -852,7 +867,7 b' static PyObject* Decompressor_decompress' | |||
|
852 | 867 | outBuffer.pos = 0; |
|
853 | 868 | |
|
854 | 869 | Py_BEGIN_ALLOW_THREADS |
|
855 | zresult = ZSTD_decompress
|
870 | zresult = ZSTD_decompressStream(self->dctx, &outBuffer, &inBuffer); | |
|
856 | 871 | Py_END_ALLOW_THREADS |
|
857 | 872 | if (ZSTD_isError(zresult)) { |
|
858 | 873 | PyErr_Format(ZstdError, "could not decompress chunk %zd: %s", |
@@ -892,7 +907,7 b' static PyObject* Decompressor_decompress' | |||
|
892 | 907 | outBuffer.pos = 0; |
|
893 | 908 | |
|
894 | 909 | Py_BEGIN_ALLOW_THREADS |
|
895 | zresult = ZSTD_decompress
|
910 | zresult = ZSTD_decompressStream(self->dctx, &outBuffer, &inBuffer); | |
|
896 | 911 | Py_END_ALLOW_THREADS |
|
897 | 912 | if (ZSTD_isError(zresult)) { |
|
898 | 913 | PyErr_Format(ZstdError, "could not decompress chunk %zd: %s", |
@@ -1176,7 +1191,7 b' static void decompress_worker(WorkerStat' | |||
|
1176 | 1191 | inBuffer.size = sourceSize; |
|
1177 | 1192 | inBuffer.pos = 0; |
|
1178 | 1193 | |
|
1179 | zresult = ZSTD_decompress
|
1194 | zresult = ZSTD_decompressStream(state->dctx, &outBuffer, &inBuffer); | |
|
1180 | 1195 | if (ZSTD_isError(zresult)) { |
|
1181 | 1196 | state->error = WorkerError_zstd; |
|
1182 | 1197 | state->zresult = zresult; |
@@ -57,7 +57,7 b' static DecompressorIteratorResult read_d' | |||
|
57 | 57 | self->output.pos = 0; |
|
58 | 58 | |
|
59 | 59 | Py_BEGIN_ALLOW_THREADS |
|
60 | zresult = ZSTD_decompress
|
60 | zresult = ZSTD_decompressStream(self->decompressor->dctx, &self->output, &self->input); | |
|
61 | 61 | Py_END_ALLOW_THREADS |
|
62 | 62 | |
|
63 | 63 | /* We're done with the pointer. Nullify to prevent anyone from getting a |
@@ -16,7 +16,7 b'' | |||
|
16 | 16 | #include <zdict.h> |
|
17 | 17 | |
|
18 | 18 | /* Remember to change the string in zstandard/__init__ as well */ |
|
19 | #define PYTHON_ZSTANDARD_VERSION "0.1
|
19 | #define PYTHON_ZSTANDARD_VERSION "0.11.0" | |
|
20 | 20 | |
|
21 | 21 | typedef enum { |
|
22 | 22 | compressorobj_flush_finish, |
@@ -31,27 +31,6 b' typedef enum {' | |||
|
31 | 31 | typedef struct { |
|
32 | 32 | PyObject_HEAD |
|
33 | 33 | ZSTD_CCtx_params* params; |
|
34 | unsigned format; | |
|
35 | int compressionLevel; | |
|
36 | unsigned windowLog; | |
|
37 | unsigned hashLog; | |
|
38 | unsigned chainLog; | |
|
39 | unsigned searchLog; | |
|
40 | unsigned minMatch; | |
|
41 | unsigned targetLength; | |
|
42 | unsigned compressionStrategy; | |
|
43 | unsigned contentSizeFlag; | |
|
44 | unsigned checksumFlag; | |
|
45 | unsigned dictIDFlag; | |
|
46 | unsigned threads; | |
|
47 | unsigned jobSize; | |
|
48 | unsigned overlapSizeLog; | |
|
49 | unsigned forceMaxWindow; | |
|
50 | unsigned enableLongDistanceMatching; | |
|
51 | unsigned ldmHashLog; | |
|
52 | unsigned ldmMinMatch; | |
|
53 | unsigned ldmBucketSizeLog; | |
|
54 | unsigned ldmHashEveryLog; | |
|
55 | 34 | } ZstdCompressionParametersObject; |
|
56 | 35 | |
|
57 | 36 | extern PyTypeObject ZstdCompressionParametersType; |
@@ -129,9 +108,11 b' typedef struct {' | |||
|
129 | 108 | |
|
130 | 109 | ZstdCompressor* compressor; |
|
131 | 110 | PyObject* writer; |
|
132 | unsigned long long sourceSize; | |
|
111 | ZSTD_outBuffer output; | |
|
133 | 112 | size_t outSize; |
|
134 | 113 | int entered; |
|
114 | int closed; | |
|
115 | int writeReturnRead; | |
|
135 | 116 | unsigned long long bytesCompressed; |
|
136 | 117 | } ZstdCompressionWriter; |
|
137 | 118 | |
@@ -235,6 +216,8 b' typedef struct {' | |||
|
235 | 216 | PyObject* reader; |
|
236 | 217 | /* Size for read() operations on reader. */ |
|
237 | 218 | size_t readSize; |
|
219 | /* Whether a read() can return data spanning multiple zstd frames. */ | |
|
220 | int readAcrossFrames; | |
|
238 | 221 | /* Buffer to read from (if reading from a buffer). */ |
|
239 | 222 | Py_buffer buffer; |
|
240 | 223 | |
@@ -267,6 +250,8 b' typedef struct {' | |||
|
267 | 250 | PyObject* writer; |
|
268 | 251 | size_t outSize; |
|
269 | 252 | int entered; |
|
253 | int closed; | |
|
254 | int writeReturnRead; | |
|
270 | 255 | } ZstdDecompressionWriter; |
|
271 | 256 | |
|
272 | 257 | extern PyTypeObject ZstdDecompressionWriterType; |
@@ -360,8 +345,9 b' typedef struct {' | |||
|
360 | 345 | |
|
361 | 346 | extern PyTypeObject ZstdBufferWithSegmentsCollectionType; |
|
362 | 347 | |
|
363 | int set_parameter(ZSTD_CCtx_params* params, ZSTD_cParameter param,
|
348 | int set_parameter(ZSTD_CCtx_params* params, ZSTD_cParameter param, int value); | |
|
364 | 349 | int set_parameters(ZSTD_CCtx_params* params, ZstdCompressionParametersObject* obj); |
|
350 | int to_cparams(ZstdCompressionParametersObject* params, ZSTD_compressionParameters* cparams); | |
|
365 | 351 | FrameParametersObject* get_frame_parameters(PyObject* self, PyObject* args, PyObject* kwargs); |
|
366 | 352 | int ensure_ddict(ZstdCompressionDict* dict); |
|
367 | 353 | int ensure_dctx(ZstdDecompressor* decompressor, int loadDict); |
@@ -36,7 +36,9 b" SOURCES = ['zstd/%s' % p for p in (" | |||
|
36 | 36 | 'compress/zstd_opt.c', |
|
37 | 37 | 'compress/zstdmt_compress.c', |
|
38 | 38 | 'decompress/huf_decompress.c', |
|
39 | 'decompress/zstd_ddict.c', | |
|
39 | 40 | 'decompress/zstd_decompress.c', |
|
41 | 'decompress/zstd_decompress_block.c', | |
|
40 | 42 | 'dictBuilder/cover.c', |
|
41 | 43 | 'dictBuilder/fastcover.c', |
|
42 | 44 | 'dictBuilder/divsufsort.c', |
@@ -5,12 +5,32 b'' | |||
|
5 | 5 | # This software may be modified and distributed under the terms |
|
6 | 6 | # of the BSD license. See the LICENSE file for details. |
|
7 | 7 | |
|
8 | from __future__ import print_function | |
|
9 | ||
|
10 | from distutils.version import LooseVersion | |
|
8 | 11 | import os |
|
9 | 12 | import sys |
|
10 | 13 | from setuptools import setup |
|
11 | 14 | |
|
15 | # Need change in 1.10 for ffi.from_buffer() to handle all buffer types | |
|
16 | # (like memoryview). | |
|
17 | # Need feature in 1.11 for ffi.gc() to declare size of objects so we avoid | |
|
18 | # garbage collection pitfalls. | |
|
19 | MINIMUM_CFFI_VERSION = '1.11' | |
|
20 | ||
|
12 | 21 | try: |
|
13 | 22 | import cffi |
|
23 | ||
|
24 | # PyPy (and possibly other distros) have CFFI distributed as part of | |
|
25 | # them. The install_requires for CFFI below won't work. We need to sniff | |
|
26 | # out the CFFI version here and reject CFFI if it is too old. | |
|
27 | cffi_version = LooseVersion(cffi.__version__) | |
|
28 | if cffi_version < LooseVersion(MINIMUM_CFFI_VERSION): | |
|
29 | print('CFFI 1.11 or newer required (%s found); ' | |
|
30 | 'not building CFFI backend' % cffi_version, | |
|
31 | file=sys.stderr) | |
|
32 | cffi = None | |
|
33 | ||
|
14 | 34 | except ImportError: |
|
15 | 35 | cffi = None |
|
16 | 36 | |
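The version sniffing above compares ``LooseVersion`` objects from distutils. For the dotted numeric versions that matter here, the same gate can be sketched without distutils; ``parse_version``/``cffi_new_enough`` below are simplified stand-ins for illustration, not the project's code:

```python
MINIMUM_CFFI_VERSION = '1.11'

def parse_version(s):
    # Simplified stand-in for distutils' LooseVersion: handles dotted
    # numeric versions only ("1.11.5" -> (1, 11, 5)); suffixes like
    # "dev0" would need LooseVersion's more forgiving parser.
    return tuple(int(p) for p in s.split('.'))

def cffi_new_enough(installed):
    # Tuple comparison gives the expected ordering: (1, 10, 0) < (1, 11).
    return parse_version(installed) >= parse_version(MINIMUM_CFFI_VERSION)
```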
@@ -49,12 +69,7 b' install_requires = []' | |||
|
49 | 69 | if cffi: |
|
50 | 70 | import make_cffi |
|
51 | 71 | extensions.append(make_cffi.ffi.distutils_extension()) |
|
52 | ||
|
53 | # Need change in 1.10 for ffi.from_buffer() to handle all buffer types | |
|
54 | # (like memoryview). | |
|
55 | # Need feature in 1.11 for ffi.gc() to declare size of objects so we avoid | |
|
56 | # garbage collection pitfalls. | |
|
57 | install_requires.append('cffi>=1.11') | |
|
72 | install_requires.append('cffi>=%s' % MINIMUM_CFFI_VERSION) | |
|
58 | 73 | |
|
59 | 74 | version = None |
|
60 | 75 | |
@@ -88,6 +103,7 b' setup(' | |||
|
88 | 103 | 'Programming Language :: Python :: 3.4', |
|
89 | 104 | 'Programming Language :: Python :: 3.5', |
|
90 | 105 | 'Programming Language :: Python :: 3.6', |
|
106 | 'Programming Language :: Python :: 3.7', | |
|
91 | 107 | ], |
|
92 | 108 | keywords='zstandard zstd compression', |
|
93 | 109 | packages=['zstandard'], |
@@ -30,7 +30,9 b" zstd_sources = ['zstd/%s' % p for p in (" | |||
|
30 | 30 | 'compress/zstd_opt.c', |
|
31 | 31 | 'compress/zstdmt_compress.c', |
|
32 | 32 | 'decompress/huf_decompress.c', |
|
33 | 'decompress/zstd_ddict.c', | |
|
33 | 34 | 'decompress/zstd_decompress.c', |
|
35 | 'decompress/zstd_decompress_block.c', | |
|
34 | 36 | 'dictBuilder/cover.c', |
|
35 | 37 | 'dictBuilder/divsufsort.c', |
|
36 | 38 | 'dictBuilder/fastcover.c', |
@@ -79,12 +79,37 b' def make_cffi(cls):' | |||
|
79 | 79 | return cls |
|
80 | 80 | |
|
81 | 81 | |
|
82 | class
|
82 | class NonClosingBytesIO(io.BytesIO): | |
|
83 | """BytesIO that saves the underlying buffer on close(). | |
|
84 | ||
|
85 | This allows us to access written data after close(). | |
|
86 | """ | |
|
83 | 87 | def __init__(self, *args, **kwargs): |
|
88 | super(NonClosingBytesIO, self).__init__(*args, **kwargs) | |
|
89 | self._saved_buffer = None | |
|
90 | ||
|
91 | def close(self): | |
|
92 | self._saved_buffer = self.getvalue() | |
|
93 | return super(NonClosingBytesIO, self).close() | |
|
94 | ||
|
95 | def getvalue(self): | |
|
96 | if self.closed: | |
|
97 | return self._saved_buffer | |
|
98 | else: | |
|
99 | return super(NonClosingBytesIO, self).getvalue() | |
|
100 | ||
|
101 | ||
|
102 | class OpCountingBytesIO(NonClosingBytesIO): | |
|
103 | def __init__(self, *args, **kwargs): | |
|
104 | self._flush_count = 0 | |
|
84 | 105 | self._read_count = 0 |
|
85 | 106 | self._write_count = 0 |
|
86 | 107 | return super(OpCountingBytesIO, self).__init__(*args, **kwargs) |
|
87 | 108 | |
|
109 | def flush(self): | |
|
110 | self._flush_count += 1 | |
|
111 | return super(OpCountingBytesIO, self).flush() | |
|
112 | ||
|
88 | 113 | def read(self, *args): |
|
89 | 114 | self._read_count += 1 |
|
90 | 115 | return super(OpCountingBytesIO, self).read(*args) |
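The NonClosingBytesIO helper added above exists because stream writers now close their inner stream, which would make a plain BytesIO's buffer unreachable afterwards. Its behavior in isolation (stdlib only):

```python
import io

class NonClosingBytesIO(io.BytesIO):
    """BytesIO that saves the underlying buffer on close().

    This allows us to access written data after close().
    """
    def __init__(self, *args, **kwargs):
        super(NonClosingBytesIO, self).__init__(*args, **kwargs)
        self._saved_buffer = None

    def close(self):
        # Snapshot the buffer before the base class releases it.
        self._saved_buffer = self.getvalue()
        return super(NonClosingBytesIO, self).close()

    def getvalue(self):
        if self.closed:
            return self._saved_buffer
        return super(NonClosingBytesIO, self).getvalue()

fh = NonClosingBytesIO()
fh.write(b'payload')
fh.close()
value = fh.getvalue()   # still available; plain BytesIO raises ValueError here
```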
@@ -117,6 +142,13 b' def random_input_data():' | |||
|
117 | 142 | except OSError: |
|
118 | 143 | pass |
|
119 | 144 | |
|
145 | # Also add some actual random data. | |
|
146 | _source_files.append(os.urandom(100)) | |
|
147 | _source_files.append(os.urandom(1000)) | |
|
148 | _source_files.append(os.urandom(10000)) | |
|
149 | _source_files.append(os.urandom(100000)) | |
|
150 | _source_files.append(os.urandom(1000000)) | |
|
151 | ||
|
120 | 152 | return _source_files |
|
121 | 153 | |
|
122 | 154 | |
@@ -140,12 +172,14 b' def generate_samples():' | |||
|
140 | 172 | |
|
141 | 173 | |
|
142 | 174 | if hypothesis: |
|
143 | default_settings = hypothesis.settings() | |
|
175 | default_settings = hypothesis.settings(deadline=10000) | |
|
144 | 176 | hypothesis.settings.register_profile('default', default_settings) |
|
145 | 177 | |
|
146 | ci_settings = hypothesis.settings(max_examples=
|
147 | max_iterations=2500) | |
|
178 | ci_settings = hypothesis.settings(deadline=20000, max_examples=1000) | |
|
148 | 179 | hypothesis.settings.register_profile('ci', ci_settings) |
|
149 | 180 | |
|
181 | expensive_settings = hypothesis.settings(deadline=None, max_examples=10000) | |
|
182 | hypothesis.settings.register_profile('expensive', expensive_settings) | |
|
183 | ||
|
150 | 184 | hypothesis.settings.load_profile( |
|
151 | 185 | os.environ.get('HYPOTHESIS_PROFILE', 'default')) |
@@ -8,6 +8,9 b" ss = struct.Struct('=QQ')" | |||
|
8 | 8 | |
|
9 | 9 | class TestBufferWithSegments(unittest.TestCase): |
|
10 | 10 | def test_arguments(self): |
|
11 | if not hasattr(zstd, 'BufferWithSegments'): | |
|
12 | self.skipTest('BufferWithSegments not available') | |
|
13 | ||
|
11 | 14 | with self.assertRaises(TypeError): |
|
12 | 15 | zstd.BufferWithSegments() |
|
13 | 16 | |
@@ -19,10 +22,16 b' class TestBufferWithSegments(unittest.Te' | |||
|
19 | 22 | zstd.BufferWithSegments(b'foo', b'\x00\x00') |
|
20 | 23 | |
|
21 | 24 | def test_invalid_offset(self): |
|
25 | if not hasattr(zstd, 'BufferWithSegments'): | |
|
26 | self.skipTest('BufferWithSegments not available') | |
|
27 | ||
|
22 | 28 | with self.assertRaisesRegexp(ValueError, 'offset within segments array references memory'): |
|
23 | 29 | zstd.BufferWithSegments(b'foo', ss.pack(0, 4)) |
|
24 | 30 | |
|
25 | 31 | def test_invalid_getitem(self): |
|
32 | if not hasattr(zstd, 'BufferWithSegments'): | |
|
33 | self.skipTest('BufferWithSegments not available') | |
|
34 | ||
|
26 | 35 | b = zstd.BufferWithSegments(b'foo', ss.pack(0, 3)) |
|
27 | 36 | |
|
28 | 37 | with self.assertRaisesRegexp(IndexError, 'offset must be non-negative'): |
@@ -35,6 +44,9 b' class TestBufferWithSegments(unittest.Te' | |||
|
35 | 44 | test = b[2] |
|
36 | 45 | |
|
37 | 46 | def test_single(self): |
|
47 | if not hasattr(zstd, 'BufferWithSegments'): | |
|
48 | self.skipTest('BufferWithSegments not available') | |
|
49 | ||
|
38 | 50 | b = zstd.BufferWithSegments(b'foo', ss.pack(0, 3)) |
|
39 | 51 | self.assertEqual(len(b), 1) |
|
40 | 52 | self.assertEqual(b.size, 3) |
@@ -45,6 +57,9 b' class TestBufferWithSegments(unittest.Te' | |||
|
45 | 57 | self.assertEqual(b[0].tobytes(), b'foo') |
|
46 | 58 | |
|
47 | 59 | def test_multiple(self): |
|
60 | if not hasattr(zstd, 'BufferWithSegments'): | |
|
61 | self.skipTest('BufferWithSegments not available') | |
|
62 | ||
|
48 | 63 | b = zstd.BufferWithSegments(b'foofooxfooxy', b''.join([ss.pack(0, 3), |
|
49 | 64 | ss.pack(3, 4), |
|
50 | 65 | ss.pack(7, 5)])) |
@@ -59,10 +74,16 b' class TestBufferWithSegments(unittest.Te' | |||
|
59 | 74 | |
|
60 | 75 | class TestBufferWithSegmentsCollection(unittest.TestCase): |
|
61 | 76 | def test_empty_constructor(self): |
|
77 | if not hasattr(zstd, 'BufferWithSegmentsCollection'): | |
|
78 | self.skipTest('BufferWithSegmentsCollection not available') | |
|
79 | ||
|
62 | 80 | with self.assertRaisesRegexp(ValueError, 'must pass at least 1 argument'): |
|
63 | 81 | zstd.BufferWithSegmentsCollection() |
|
64 | 82 | |
|
65 | 83 | def test_argument_validation(self): |
|
84 | if not hasattr(zstd, 'BufferWithSegmentsCollection'): | |
|
85 | self.skipTest('BufferWithSegmentsCollection not available') | |
|
86 | ||
|
66 | 87 | with self.assertRaisesRegexp(TypeError, 'arguments must be BufferWithSegments'): |
|
67 | 88 | zstd.BufferWithSegmentsCollection(None) |
|
68 | 89 | |
@@ -74,6 +95,9 b' class TestBufferWithSegmentsCollection(u' | |||
|
74 | 95 | zstd.BufferWithSegmentsCollection(zstd.BufferWithSegments(b'', b'')) |
|
75 | 96 | |
|
76 | 97 | def test_length(self): |
|
98 | if not hasattr(zstd, 'BufferWithSegmentsCollection'): | |
|
99 | self.skipTest('BufferWithSegmentsCollection not available') | |
|
100 | ||
|
77 | 101 | b1 = zstd.BufferWithSegments(b'foo', ss.pack(0, 3)) |
|
78 | 102 | b2 = zstd.BufferWithSegments(b'barbaz', b''.join([ss.pack(0, 3), |
|
79 | 103 | ss.pack(3, 3)])) |
@@ -91,6 +115,9 b' class TestBufferWithSegmentsCollection(u' | |||
|
91 | 115 | self.assertEqual(c.size(), 9) |
|
92 | 116 | |
|
93 | 117 | def test_getitem(self): |
|
118 | if not hasattr(zstd, 'BufferWithSegmentsCollection'): | |
|
119 | self.skipTest('BufferWithSegmentsCollection not available') | |
|
120 | ||
|
94 | 121 | b1 = zstd.BufferWithSegments(b'foo', ss.pack(0, 3)) |
|
95 | 122 | b2 = zstd.BufferWithSegments(b'barbaz', b''.join([ss.pack(0, 3), |
|
96 | 123 | ss.pack(3, 3)])) |
@@ -1,14 +1,17 b'' | |||
|
1 | 1 | import hashlib |
|
2 | 2 | import io |
|
3 | import os | |
|
3 | 4 | import struct |
|
4 | 5 | import sys |
|
5 | 6 | import tarfile |
|
7 | import tempfile | |
|
6 | 8 | import unittest |
|
7 | 9 | |
|
8 | 10 | import zstandard as zstd |
|
9 | 11 | |
|
10 | 12 | from .common import ( |
|
11 | 13 | make_cffi, |
|
14 | NonClosingBytesIO, | |
|
12 | 15 | OpCountingBytesIO, |
|
13 | 16 | ) |
|
14 | 17 | |
@@ -272,7 +275,7 b' class TestCompressor_compressobj(unittes' | |||
|
272 | 275 | |
|
273 | 276 | params = zstd.get_frame_parameters(result) |
|
274 | 277 | self.assertEqual(params.content_size, zstd.CONTENTSIZE_UNKNOWN) |
|
275 | self.assertEqual(params.window_size,
|
278 | self.assertEqual(params.window_size, 2097152) | |
|
276 | 279 | self.assertEqual(params.dict_id, 0) |
|
277 | 280 | self.assertFalse(params.has_checksum) |
|
278 | 281 | |
@@ -321,7 +324,7 b' class TestCompressor_compressobj(unittes' | |||
|
321 | 324 | cobj.compress(b'foo') |
|
322 | 325 | cobj.flush() |
|
323 | 326 | |
|
324 | with self.assertRaisesRegexp(zstd.ZstdError, 'cannot call compress\(\) after compressor'): | |
|
327 | with self.assertRaisesRegexp(zstd.ZstdError, r'cannot call compress\(\) after compressor'): | |
|
325 | 328 | cobj.compress(b'foo') |
|
326 | 329 | |
|
327 | 330 | with self.assertRaisesRegexp(zstd.ZstdError, 'compressor object already finished'): |
@@ -453,7 +456,7 b' class TestCompressor_copy_stream(unittes' | |||
|
453 | 456 | |
|
454 | 457 | params = zstd.get_frame_parameters(dest.getvalue()) |
|
455 | 458 | self.assertEqual(params.content_size, zstd.CONTENTSIZE_UNKNOWN) |
|
456 | self.assertEqual(params.window_size,
|
459 | self.assertEqual(params.window_size, 2097152) | |
|
457 | 460 | self.assertEqual(params.dict_id, 0) |
|
458 | 461 | self.assertFalse(params.has_checksum) |
|
459 | 462 | |
@@ -605,10 +608,6 b' class TestCompressor_stream_reader(unitt' | |||
|
605 | 608 | with self.assertRaises(io.UnsupportedOperation): |
|
606 | 609 | reader.readlines() |
|
607 | 610 | |
|
608 | # This could probably be implemented someday. | |
|
609 | with self.assertRaises(NotImplementedError): | |
|
610 | reader.readall() | |
|
611 | ||
|
612 | 611 | with self.assertRaises(io.UnsupportedOperation): |
|
613 | 612 | iter(reader) |
|
614 | 613 | |
@@ -644,15 +643,16 b' class TestCompressor_stream_reader(unitt' | |||
|
644 | 643 | with self.assertRaisesRegexp(ValueError, 'stream is closed'): |
|
645 | 644 | reader.read(10) |
|
646 | 645 | |
|
647 | def test_read_
|
646 | def test_read_sizes(self): | |
|
648 | 647 | cctx = zstd.ZstdCompressor() |
|
648 | foo = cctx.compress(b'foo') | |
|
649 | 649 | |
|
650 | 650 | with cctx.stream_reader(b'foo') as reader: |
|
651 | with self.assertRaisesRegexp(ValueError, 'cannot read negative
|
652 | reader.read(-
|
651 | with self.assertRaisesRegexp(ValueError, 'cannot read negative amounts less than -1'): | |
|
652 | reader.read(-2) | |
|
653 | 653 | |
|
654 | with self.assertRaisesRegexp(ValueError, 'cannot read negative or size 0 amounts'): | |
|
655 |
|
|
|
654 | self.assertEqual(reader.read(0), b'') | |
|
655 | self.assertEqual(reader.read(), foo) | |
|
656 | 656 | |
|
657 | 657 | def test_read_buffer(self): |
|
658 | 658 | cctx = zstd.ZstdCompressor() |
@@ -746,11 +746,202 b' class TestCompressor_stream_reader(unitt' | |||
|
746 | 746 | with cctx.stream_reader(source, size=42): |
|
747 | 747 | pass |
|
748 | 748 | |
|
749 | def test_readall(self): | |
|
750 | cctx = zstd.ZstdCompressor() | |
|
751 | frame = cctx.compress(b'foo' * 1024) | |
|
752 | ||
|
753 | reader = cctx.stream_reader(b'foo' * 1024) | |
|
754 | self.assertEqual(reader.readall(), frame) | |
|
755 | ||
|
756 | def test_readinto(self): | |
|
757 | cctx = zstd.ZstdCompressor() | |
|
758 | foo = cctx.compress(b'foo') | |
|
759 | ||
|
760 | reader = cctx.stream_reader(b'foo') | |
|
761 | with self.assertRaises(Exception): | |
|
762 | reader.readinto(b'foobar') | |
|
763 | ||
|
764 | # readinto() with sufficiently large destination. | |
|
765 | b = bytearray(1024) | |
|
766 | reader = cctx.stream_reader(b'foo') | |
|
767 | self.assertEqual(reader.readinto(b), len(foo)) | |
|
768 | self.assertEqual(b[0:len(foo)], foo) | |
|
769 | self.assertEqual(reader.readinto(b), 0) | |
|
770 | self.assertEqual(b[0:len(foo)], foo) | |
|
771 | ||
|
772 | # readinto() with small reads. | |
|
773 | b = bytearray(1024) | |
|
774 | reader = cctx.stream_reader(b'foo', read_size=1) | |
|
775 | self.assertEqual(reader.readinto(b), len(foo)) | |
|
776 | self.assertEqual(b[0:len(foo)], foo) | |
|
777 | ||
|
778 | # Too small destination buffer. | |
|
779 | b = bytearray(2) | |
|
780 | reader = cctx.stream_reader(b'foo') | |
|
781 | self.assertEqual(reader.readinto(b), 2) | |
|
782 | self.assertEqual(b[:], foo[0:2]) | |
|
783 | self.assertEqual(reader.readinto(b), 2) | |
|
784 | self.assertEqual(b[:], foo[2:4]) | |
|
785 | self.assertEqual(reader.readinto(b), 2) | |
|
786 | self.assertEqual(b[:], foo[4:6]) | |
|
787 | ||
|
788 | def test_readinto1(self): | |
|
789 | cctx = zstd.ZstdCompressor() | |
|
790 | foo = b''.join(cctx.read_to_iter(io.BytesIO(b'foo'))) | |
|
791 | ||
|
792 | reader = cctx.stream_reader(b'foo') | |
|
793 | with self.assertRaises(Exception): | |
|
794 | reader.readinto1(b'foobar') | |
|
795 | ||
|
796 | b = bytearray(1024) | |
|
797 | source = OpCountingBytesIO(b'foo') | |
|
798 | reader = cctx.stream_reader(source) | |
|
799 | self.assertEqual(reader.readinto1(b), len(foo)) | |
|
800 | self.assertEqual(b[0:len(foo)], foo) | |
|
801 | self.assertEqual(source._read_count, 2) | |
|
802 | ||
|
803 | # readinto1() with small reads. | |
|
804 | b = bytearray(1024) | |
|
805 | source = OpCountingBytesIO(b'foo') | |
|
806 | reader = cctx.stream_reader(source, read_size=1) | |
|
807 | self.assertEqual(reader.readinto1(b), len(foo)) | |
|
808 | self.assertEqual(b[0:len(foo)], foo) | |
|
809 | self.assertEqual(source._read_count, 4) | |
|
810 | ||
|
811 | def test_read1(self): | |
|
812 | cctx = zstd.ZstdCompressor() | |
|
813 | foo = b''.join(cctx.read_to_iter(io.BytesIO(b'foo'))) | |
|
814 | ||
|
815 | b = OpCountingBytesIO(b'foo') | |
|
816 | reader = cctx.stream_reader(b) | |
|
817 | ||
|
818 | self.assertEqual(reader.read1(), foo) | |
|
819 | self.assertEqual(b._read_count, 2) | |
|
820 | ||
|
821 | b = OpCountingBytesIO(b'foo') | |
|
822 | reader = cctx.stream_reader(b) | |
|
823 | ||
|
824 | self.assertEqual(reader.read1(0), b'') | |
|
825 | self.assertEqual(reader.read1(2), foo[0:2]) | |
|
826 | self.assertEqual(b._read_count, 2) | |
|
827 | self.assertEqual(reader.read1(2), foo[2:4]) | |
|
828 | self.assertEqual(reader.read1(1024), foo[4:]) | |
|
829 | ||
|
749 | 830 | |
|
750 | 831 | @make_cffi |
|
751 | 832 | class TestCompressor_stream_writer(unittest.TestCase): |
|
833 | def test_io_api(self): | |
|
834 | buffer = io.BytesIO() | |
|
835 | cctx = zstd.ZstdCompressor() | |
|
836 | writer = cctx.stream_writer(buffer) | |
|
837 | ||
|
838 | self.assertFalse(writer.isatty()) | |
|
839 | self.assertFalse(writer.readable()) | |
|
840 | ||
|
841 | with self.assertRaises(io.UnsupportedOperation): | |
|
842 | writer.readline() | |
|
843 | ||
|
844 | with self.assertRaises(io.UnsupportedOperation): | |
|
845 | writer.readline(42) | |
|
846 | ||
|
847 | with self.assertRaises(io.UnsupportedOperation): | |
|
848 | writer.readline(size=42) | |
|
849 | ||
|
850 | with self.assertRaises(io.UnsupportedOperation): | |
|
851 | writer.readlines() | |
|
852 | ||
|
853 | with self.assertRaises(io.UnsupportedOperation): | |
|
854 | writer.readlines(42) | |
|
855 | ||
|
856 | with self.assertRaises(io.UnsupportedOperation): | |
|
857 | writer.readlines(hint=42) | |
|
858 | ||
|
859 | with self.assertRaises(io.UnsupportedOperation): | |
|
860 | writer.seek(0) | |
|
861 | ||
|
862 | with self.assertRaises(io.UnsupportedOperation): | |
|
863 | writer.seek(10, os.SEEK_SET) | |
|
864 | ||
|
865 | self.assertFalse(writer.seekable()) | |
|
866 | ||
|
867 | with self.assertRaises(io.UnsupportedOperation): | |
|
868 | writer.truncate() | |
|
869 | ||
|
870 | with self.assertRaises(io.UnsupportedOperation): | |
|
871 | writer.truncate(42) | |
|
872 | ||
|
873 | with self.assertRaises(io.UnsupportedOperation): | |
|
874 | writer.truncate(size=42) | |
|
875 | ||
|
876 | self.assertTrue(writer.writable()) | |
|
877 | ||
|
878 | with self.assertRaises(NotImplementedError): | |
|
879 | writer.writelines([]) | |
|
880 | ||
|
881 | with self.assertRaises(io.UnsupportedOperation): | |
|
882 | writer.read() | |
|
883 | ||
|
884 | with self.assertRaises(io.UnsupportedOperation): | |
|
885 | writer.read(42) | |
|
886 | ||
|
887 | with self.assertRaises(io.UnsupportedOperation): | |
|
888 | writer.read(size=42) | |
|
889 | ||
|
890 | with self.assertRaises(io.UnsupportedOperation): | |
|
891 | writer.readall() | |
|
892 | ||
|
893 | with self.assertRaises(io.UnsupportedOperation): | |
|
894 | writer.readinto(None) | |
|
895 | ||
|
896 | with self.assertRaises(io.UnsupportedOperation): | |
|
897 | writer.fileno() | |
|
898 | ||
|
899 | self.assertFalse(writer.closed) | |
|
900 | ||
|
901 | def test_fileno_file(self): | |
|
902 | with tempfile.TemporaryFile('wb') as tf: | |
|
903 | cctx = zstd.ZstdCompressor() | |
|
904 | writer = cctx.stream_writer(tf) | |
|
905 | ||
|
906 | self.assertEqual(writer.fileno(), tf.fileno()) | |
|
907 | ||
|
908 | def test_close(self): | |
|
909 | buffer = NonClosingBytesIO() | |
|
910 | cctx = zstd.ZstdCompressor(level=1) | |
|
911 | writer = cctx.stream_writer(buffer) | |
|
912 | ||
|
913 | writer.write(b'foo' * 1024) | |
|
914 | self.assertFalse(writer.closed) | |
|
915 | self.assertFalse(buffer.closed) | |
|
916 | writer.close() | |
|
917 | self.assertTrue(writer.closed) | |
|
918 | self.assertTrue(buffer.closed) | |
|
919 | ||
|
920 | with self.assertRaisesRegexp(ValueError, 'stream is closed'): | |
|
921 | writer.write(b'foo') | |
|
922 | ||
|
923 | with self.assertRaisesRegexp(ValueError, 'stream is closed'): | |
|
924 | writer.flush() | |
|
925 | ||
|
926 | with self.assertRaisesRegexp(ValueError, 'stream is closed'): | |
|
927 | with writer: | |
|
928 | pass | |
|
929 | ||
|
930 | self.assertEqual(buffer.getvalue(), | |
|
931 | b'\x28\xb5\x2f\xfd\x00\x48\x55\x00\x00\x18\x66\x6f' | |
|
932 | b'\x6f\x01\x00\xfa\xd3\x77\x43') | |
|
933 | ||
|
934 | # Context manager exit should close stream. | |
|
935 | buffer = io.BytesIO() | |
|
936 | writer = cctx.stream_writer(buffer) | |
|
937 | ||
|
938 | with writer: | |
|
939 | writer.write(b'foo') | |
|
940 | ||
|
941 | self.assertTrue(writer.closed) | |
|
942 | ||
|
752 | 943 | def test_empty(self): |
|
753 | buffer = io.BytesIO() | 

944 | buffer = NonClosingBytesIO() | 
|
754 | 945 | cctx = zstd.ZstdCompressor(level=1, write_content_size=False) |
|
755 | 946 | with cctx.stream_writer(buffer) as compressor: |
|
756 | 947 | compressor.write(b'') |
@@ -764,6 +955,25 b' class TestCompressor_stream_writer(unitt' | |||
|
764 | 955 | self.assertEqual(params.dict_id, 0) |
|
765 | 956 | self.assertFalse(params.has_checksum) |
|
766 | 957 | |
|
958 | # Test without context manager. | |
|
959 | buffer = io.BytesIO() | |
|
960 | compressor = cctx.stream_writer(buffer) | |
|
961 | self.assertEqual(compressor.write(b''), 0) | |
|
962 | self.assertEqual(buffer.getvalue(), b'') | |
|
963 | self.assertEqual(compressor.flush(zstd.FLUSH_FRAME), 9) | |
|
964 | result = buffer.getvalue() | |
|
965 | self.assertEqual(result, b'\x28\xb5\x2f\xfd\x00\x48\x01\x00\x00') | |
|
966 | ||
|
967 | params = zstd.get_frame_parameters(result) | |
|
968 | self.assertEqual(params.content_size, zstd.CONTENTSIZE_UNKNOWN) | |
|
969 | self.assertEqual(params.window_size, 524288) | |
|
970 | self.assertEqual(params.dict_id, 0) | |
|
971 | self.assertFalse(params.has_checksum) | |
|
972 | ||
|
973 | # Test write_return_read=True | |
|
974 | compressor = cctx.stream_writer(buffer, write_return_read=True) | |
|
975 | self.assertEqual(compressor.write(b''), 0) | |
|
976 | ||
|
767 | 977 | def test_input_types(self): |
|
768 | 978 | expected = b'\x28\xb5\x2f\xfd\x00\x48\x19\x00\x00\x66\x6f\x6f' |
|
769 | 979 | cctx = zstd.ZstdCompressor(level=1) |
@@ -778,14 +988,17 b' class TestCompressor_stream_writer(unitt' | |||
|
778 | 988 | ] |
|
779 | 989 | |
|
780 | 990 | for source in sources: |
|
781 | buffer = io.BytesIO() | 

991 | buffer = NonClosingBytesIO() | 
|
782 | 992 | with cctx.stream_writer(buffer) as compressor: |
|
783 | 993 | compressor.write(source) |
|
784 | 994 | |
|
785 | 995 | self.assertEqual(buffer.getvalue(), expected) |
|
786 | 996 | |
|
997 | compressor = cctx.stream_writer(buffer, write_return_read=True) | |
|
998 | self.assertEqual(compressor.write(source), len(source)) | |
|
999 | ||
|
787 | 1000 | def test_multiple_compress(self): |
|
788 | buffer = io.BytesIO() | 

1001 | buffer = NonClosingBytesIO() | 
|
789 | 1002 | cctx = zstd.ZstdCompressor(level=5) |
|
790 | 1003 | with cctx.stream_writer(buffer) as compressor: |
|
791 | 1004 | self.assertEqual(compressor.write(b'foo'), 0) |
@@ -794,9 +1007,27 b' class TestCompressor_stream_writer(unitt' | |||
|
794 | 1007 | |
|
795 | 1008 | result = buffer.getvalue() |
|
796 | 1009 | self.assertEqual(result, |
|
797 | b'\x28\xb5\x2f\xfd\x00\x5

1010 | b'\x28\xb5\x2f\xfd\x00\x58\x75\x00\x00\x38\x66\x6f' | 
|
798 | 1011 | b'\x6f\x62\x61\x72\x78\x01\x00\xfc\xdf\x03\x23') |
|
799 | 1012 | |
|
1013 | # Test without context manager. | |
|
1014 | buffer = io.BytesIO() | |
|
1015 | compressor = cctx.stream_writer(buffer) | |
|
1016 | self.assertEqual(compressor.write(b'foo'), 0) | |
|
1017 | self.assertEqual(compressor.write(b'bar'), 0) | |
|
1018 | self.assertEqual(compressor.write(b'x' * 8192), 0) | |
|
1019 | self.assertEqual(compressor.flush(zstd.FLUSH_FRAME), 23) | |
|
1020 | result = buffer.getvalue() | |
|
1021 | self.assertEqual(result, | |
|
1022 | b'\x28\xb5\x2f\xfd\x00\x58\x75\x00\x00\x38\x66\x6f' | |
|
1023 | b'\x6f\x62\x61\x72\x78\x01\x00\xfc\xdf\x03\x23') | |
|
1024 | ||
|
1025 | # Test with write_return_read=True. | |
|
1026 | compressor = cctx.stream_writer(buffer, write_return_read=True) | |
|
1027 | self.assertEqual(compressor.write(b'foo'), 3) | |
|
1028 | self.assertEqual(compressor.write(b'barbiz'), 6) | |
|
1029 | self.assertEqual(compressor.write(b'x' * 8192), 8192) | |
|
1030 | ||
|
800 | 1031 | def test_dictionary(self): |
|
801 | 1032 | samples = [] |
|
802 | 1033 | for i in range(128): |
@@ -807,9 +1038,9 b' class TestCompressor_stream_writer(unitt' | |||
|
807 | 1038 | d = zstd.train_dictionary(8192, samples) |
|
808 | 1039 | |
|
809 | 1040 | h = hashlib.sha1(d.as_bytes()).hexdigest() |
|
810 | self.assertEqual(h, '2b3b6428da5bf2c9cc9d4bb58ba0bc5990dd0e79') | |
|
1041 | self.assertEqual(h, '88ca0d38332aff379d4ced166a51c280a7679aad') | |
|
811 | 1042 | |
|
812 | buffer = io.BytesIO() | 

1043 | buffer = NonClosingBytesIO() | 
|
813 | 1044 | cctx = zstd.ZstdCompressor(level=9, dict_data=d) |
|
814 | 1045 | with cctx.stream_writer(buffer) as compressor: |
|
815 | 1046 | self.assertEqual(compressor.write(b'foo'), 0) |
@@ -825,7 +1056,7 b' class TestCompressor_stream_writer(unitt' | |||
|
825 | 1056 | self.assertFalse(params.has_checksum) |
|
826 | 1057 | |
|
827 | 1058 | h = hashlib.sha1(compressed).hexdigest() |
|
828 | self.assertEqual(h, '

1059 | self.assertEqual(h, '8703b4316f274d26697ea5dd480f29c08e85d940') | 
|
829 | 1060 | |
|
830 | 1061 | source = b'foo' + b'bar' + (b'foo' * 16384) |
|
831 | 1062 | |
@@ -842,9 +1073,9 b' class TestCompressor_stream_writer(unitt' | |||
|
842 | 1073 | min_match=5, |
|
843 | 1074 | search_log=4, |
|
844 | 1075 | target_length=10, |
|
845 | compression_strategy=zstd.STRATEGY_FAST) | 

1076 | strategy=zstd.STRATEGY_FAST) | 
|
846 | 1077 | |
|
847 |
buffer = |
|
|
1078 | buffer = NonClosingBytesIO() | |
|
848 | 1079 | cctx = zstd.ZstdCompressor(compression_params=params) |
|
849 | 1080 | with cctx.stream_writer(buffer) as compressor: |
|
850 | 1081 | self.assertEqual(compressor.write(b'foo'), 0) |
@@ -863,12 +1094,12 b' class TestCompressor_stream_writer(unitt' | |||
|
863 | 1094 | self.assertEqual(h, '2a8111d72eb5004cdcecbdac37da9f26720d30ef') |
|
864 | 1095 | |
|
865 | 1096 | def test_write_checksum(self): |
|
866 | no_checksum = io.BytesIO() | 

1097 | no_checksum = NonClosingBytesIO() | 
|
867 | 1098 | cctx = zstd.ZstdCompressor(level=1) |
|
868 | 1099 | with cctx.stream_writer(no_checksum) as compressor: |
|
869 | 1100 | self.assertEqual(compressor.write(b'foobar'), 0) |
|
870 | 1101 | |
|
871 | with_checksum = io.BytesIO() | 

1102 | with_checksum = NonClosingBytesIO() | 
|
872 | 1103 | cctx = zstd.ZstdCompressor(level=1, write_checksum=True) |
|
873 | 1104 | with cctx.stream_writer(with_checksum) as compressor: |
|
874 | 1105 | self.assertEqual(compressor.write(b'foobar'), 0) |
@@ -886,12 +1117,12 b' class TestCompressor_stream_writer(unitt' | |||
|
886 | 1117 | len(no_checksum.getvalue()) + 4) |
|
887 | 1118 | |
|
888 | 1119 | def test_write_content_size(self): |
|
889 | no_size = io.BytesIO() | 

1120 | no_size = NonClosingBytesIO() | 
|
890 | 1121 | cctx = zstd.ZstdCompressor(level=1, write_content_size=False) |
|
891 | 1122 | with cctx.stream_writer(no_size) as compressor: |
|
892 | 1123 | self.assertEqual(compressor.write(b'foobar' * 256), 0) |
|
893 | 1124 | |
|
894 | with_size = io.BytesIO() | 

1125 | with_size = NonClosingBytesIO() | 
|
895 | 1126 | cctx = zstd.ZstdCompressor(level=1) |
|
896 | 1127 | with cctx.stream_writer(with_size) as compressor: |
|
897 | 1128 | self.assertEqual(compressor.write(b'foobar' * 256), 0) |
@@ -902,7 +1133,7 b' class TestCompressor_stream_writer(unitt' | |||
|
902 | 1133 | len(no_size.getvalue())) |
|
903 | 1134 | |
|
904 | 1135 | # Declaring size will write the header. |
|
905 | with_size = io.BytesIO() | 

1136 | with_size = NonClosingBytesIO() | 
|
906 | 1137 | with cctx.stream_writer(with_size, size=len(b'foobar' * 256)) as compressor: |
|
907 | 1138 | self.assertEqual(compressor.write(b'foobar' * 256), 0) |
|
908 | 1139 | |
@@ -927,7 +1158,7 b' class TestCompressor_stream_writer(unitt' | |||
|
927 | 1158 | |
|
928 | 1159 | d = zstd.train_dictionary(1024, samples) |
|
929 | 1160 | |
|
930 | with_dict_id = io.BytesIO() | 

1161 | with_dict_id = NonClosingBytesIO() | 
|
931 | 1162 | cctx = zstd.ZstdCompressor(level=1, dict_data=d) |
|
932 | 1163 | with cctx.stream_writer(with_dict_id) as compressor: |
|
933 | 1164 | self.assertEqual(compressor.write(b'foobarfoobar'), 0) |
@@ -935,7 +1166,7 b' class TestCompressor_stream_writer(unitt' | |||
|
935 | 1166 | self.assertEqual(with_dict_id.getvalue()[4:5], b'\x03') |
|
936 | 1167 | |
|
937 | 1168 | cctx = zstd.ZstdCompressor(level=1, dict_data=d, write_dict_id=False) |
|
938 | no_dict_id = io.BytesIO() | 

1169 | no_dict_id = NonClosingBytesIO() | 
|
939 | 1170 | with cctx.stream_writer(no_dict_id) as compressor: |
|
940 | 1171 | self.assertEqual(compressor.write(b'foobarfoobar'), 0) |
|
941 | 1172 | |
@@ -1009,8 +1240,32 b' class TestCompressor_stream_writer(unitt' | |||
|
1009 | 1240 | header = trailing[0:3] |
|
1010 | 1241 | self.assertEqual(header, b'\x01\x00\x00') |
|
1011 | 1242 | |
|
1243 | def test_flush_frame(self): | |
|
1244 | cctx = zstd.ZstdCompressor(level=3) | |
|
1245 | dest = OpCountingBytesIO() | |
|
1246 | ||
|
1247 | with cctx.stream_writer(dest) as compressor: | |
|
1248 | self.assertEqual(compressor.write(b'foobar' * 8192), 0) | |
|
1249 | self.assertEqual(compressor.flush(zstd.FLUSH_FRAME), 23) | |
|
1250 | compressor.write(b'biz' * 16384) | |
|
1251 | ||
|
1252 | self.assertEqual(dest.getvalue(), | |
|
1253 | # Frame 1. | |
|
1254 | b'\x28\xb5\x2f\xfd\x00\x58\x75\x00\x00\x30\x66\x6f\x6f' | |
|
1255 | b'\x62\x61\x72\x01\x00\xf7\xbf\xe8\xa5\x08' | |
|
1256 | # Frame 2. | |
|
1257 | b'\x28\xb5\x2f\xfd\x00\x58\x5d\x00\x00\x18\x62\x69\x7a' | |
|
1258 | b'\x01\x00\xfa\x3f\x75\x37\x04') | |
|
1259 | ||
|
1260 | def test_bad_flush_mode(self): | |
|
1261 | cctx = zstd.ZstdCompressor() | |
|
1262 | dest = io.BytesIO() | |
|
1263 | with cctx.stream_writer(dest) as compressor: | |
|
1264 | with self.assertRaisesRegexp(ValueError, 'unknown flush_mode: 42'): | |
|
1265 | compressor.flush(flush_mode=42) | |
|
1266 | ||
|
1012 | 1267 | def test_multithreaded(self): |
|
1013 | dest = io.BytesIO() | 

1268 | dest = NonClosingBytesIO() | 
|
1014 | 1269 | cctx = zstd.ZstdCompressor(threads=2) |
|
1015 | 1270 | with cctx.stream_writer(dest) as compressor: |
|
1016 | 1271 | compressor.write(b'a' * 1048576) |
@@ -1043,22 +1298,21 b' class TestCompressor_stream_writer(unitt' | |||
|
1043 | 1298 | pass |
|
1044 | 1299 | |
|
1045 | 1300 | def test_tarfile_compat(self): |
|
1046 | raise unittest.SkipTest('not yet fully working') | |
|
1047 | ||
|
1048 | dest = io.BytesIO() | |
|
1301 | dest = NonClosingBytesIO() | |
|
1049 | 1302 | cctx = zstd.ZstdCompressor() |
|
1050 | 1303 | with cctx.stream_writer(dest) as compressor: |
|
1051 | with tarfile.open('tf', mode='w', fileobj=compressor) as tf: | |
|
1304 | with tarfile.open('tf', mode='w|', fileobj=compressor) as tf: | |
|
1052 | 1305 | tf.add(__file__, 'test_compressor.py') |
|
1053 | 1306 | |
|
1054 | dest.seek(0) | |
|
1307 | dest = io.BytesIO(dest.getvalue()) | |
|
1055 | 1308 | |
|
1056 | 1309 | dctx = zstd.ZstdDecompressor() |
|
1057 | 1310 | with dctx.stream_reader(dest) as reader: |
|
1058 | with tarfile.open(mode='r', fileobj=reader) as tf: | 

1311 | with tarfile.open(mode='r|', fileobj=reader) as tf: | 
|
1059 | 1312 | for member in tf: |
|
1060 | 1313 | self.assertEqual(member.name, 'test_compressor.py') |
|
1061 | 1314 | |
|
1315 | ||
|
1062 | 1316 | @make_cffi |
|
1063 | 1317 | class TestCompressor_read_to_iter(unittest.TestCase): |
|
1064 | 1318 | def test_type_validation(self): |
@@ -1192,7 +1446,7 b' class TestCompressor_chunker(unittest.Te' | |||
|
1192 | 1446 | |
|
1193 | 1447 | it = chunker.finish() |
|
1194 | 1448 | |
|
1195 |
self.assertEqual(next(it), b'\x28\xb5\x2f\xfd\x00\x5 |
|
|
1449 | self.assertEqual(next(it), b'\x28\xb5\x2f\xfd\x00\x58\x01\x00\x00') | |
|
1196 | 1450 | |
|
1197 | 1451 | with self.assertRaises(StopIteration): |
|
1198 | 1452 | next(it) |
@@ -1214,7 +1468,7 b' class TestCompressor_chunker(unittest.Te' | |||
|
1214 | 1468 | it = chunker.finish() |
|
1215 | 1469 | |
|
1216 | 1470 | self.assertEqual(next(it), |
|
1217 | b'\x28\xb5\x2f\xfd\x00\x5

1471 | b'\x28\xb5\x2f\xfd\x00\x58\x7d\x00\x00\x48\x66\x6f' | 
|
1218 | 1472 | b'\x6f\x62\x61\x72\x62\x61\x7a\x01\x00\xe4\xe4\x8e') |
|
1219 | 1473 | |
|
1220 | 1474 | with self.assertRaises(StopIteration): |
@@ -1258,7 +1512,7 b' class TestCompressor_chunker(unittest.Te' | |||
|
1258 | 1512 | |
|
1259 | 1513 | self.assertEqual( |
|
1260 | 1514 | b''.join(chunks), |
|
1261 | b'\x28\xb5\x2f\xfd\x00\x5

1515 | b'\x28\xb5\x2f\xfd\x00\x58\x55\x00\x00\x18\x66\x6f\x6f\x01\x00' | 
|
1262 | 1516 | b'\xfa\xd3\x77\x43') |
|
1263 | 1517 | |
|
1264 | 1518 | dctx = zstd.ZstdDecompressor() |
@@ -1283,7 +1537,7 b' class TestCompressor_chunker(unittest.Te' | |||
|
1283 | 1537 | |
|
1284 | 1538 | self.assertEqual(list(chunker.compress(source)), []) |
|
1285 | 1539 | self.assertEqual(list(chunker.finish()), [ |
|
1286 | b'\x28\xb5\x2f\xfd\x00\x5

1540 | b'\x28\xb5\x2f\xfd\x00\x58\x19\x00\x00\x66\x6f\x6f' | 
|
1287 | 1541 | ]) |
|
1288 | 1542 | |
|
1289 | 1543 | def test_flush(self): |
@@ -1296,7 +1550,7 b' class TestCompressor_chunker(unittest.Te' | |||
|
1296 | 1550 | chunks1 = list(chunker.flush()) |
|
1297 | 1551 | |
|
1298 | 1552 | self.assertEqual(chunks1, [ |
|
1299 | b'\x28\xb5\x2f\xfd\x00\x5

1553 | b'\x28\xb5\x2f\xfd\x00\x58\x8c\x00\x00\x30\x66\x6f\x6f\x62\x61\x72' | 
|
1300 | 1554 | b'\x02\x00\xfa\x03\xfe\xd0\x9f\xbe\x1b\x02' |
|
1301 | 1555 | ]) |
|
1302 | 1556 | |
@@ -1326,7 +1580,7 b' class TestCompressor_chunker(unittest.Te' | |||
|
1326 | 1580 | |
|
1327 | 1581 | with self.assertRaisesRegexp( |
|
1328 | 1582 | zstd.ZstdError, |
|
1329 | 'cannot call compress\(\) after compression finished'): | |
|
1583 | r'cannot call compress\(\) after compression finished'): | |
|
1330 | 1584 | list(chunker.compress(b'foo')) |
|
1331 | 1585 | |
|
1332 | 1586 | def test_flush_after_finish(self): |
@@ -1338,7 +1592,7 b' class TestCompressor_chunker(unittest.Te' | |||
|
1338 | 1592 | |
|
1339 | 1593 | with self.assertRaisesRegexp( |
|
1340 | 1594 | zstd.ZstdError, |
|
1341 | 'cannot call flush\(\) after compression finished'): | |
|
1595 | r'cannot call flush\(\) after compression finished'): | |
|
1342 | 1596 | list(chunker.flush()) |
|
1343 | 1597 | |
|
1344 | 1598 | def test_finish_after_finish(self): |
@@ -1350,7 +1604,7 b' class TestCompressor_chunker(unittest.Te' | |||
|
1350 | 1604 | |
|
1351 | 1605 | with self.assertRaisesRegexp( |
|
1352 | 1606 | zstd.ZstdError, |
|
1353 | 'cannot call finish\(\) after compression finished'): | |
|
1607 | r'cannot call finish\(\) after compression finished'): | |
|
1354 | 1608 | list(chunker.finish()) |
|
1355 | 1609 | |
|
1356 | 1610 | |
@@ -1358,6 +1612,9 b' class TestCompressor_multi_compress_to_b' | |||
|
1358 | 1612 | def test_invalid_inputs(self): |
|
1359 | 1613 | cctx = zstd.ZstdCompressor() |
|
1360 | 1614 | |
|
1615 | if not hasattr(cctx, 'multi_compress_to_buffer'): | |
|
1616 | self.skipTest('multi_compress_to_buffer not available') | |
|
1617 | ||
|
1361 | 1618 | with self.assertRaises(TypeError): |
|
1362 | 1619 | cctx.multi_compress_to_buffer(True) |
|
1363 | 1620 | |
@@ -1370,6 +1627,9 b' class TestCompressor_multi_compress_to_b' | |||
|
1370 | 1627 | def test_empty_input(self): |
|
1371 | 1628 | cctx = zstd.ZstdCompressor() |
|
1372 | 1629 | |
|
1630 | if not hasattr(cctx, 'multi_compress_to_buffer'): | |
|
1631 | self.skipTest('multi_compress_to_buffer not available') | |
|
1632 | ||
|
1373 | 1633 | with self.assertRaisesRegexp(ValueError, 'no source elements found'): |
|
1374 | 1634 | cctx.multi_compress_to_buffer([]) |
|
1375 | 1635 | |
@@ -1379,6 +1639,9 b' class TestCompressor_multi_compress_to_b' | |||
|
1379 | 1639 | def test_list_input(self): |
|
1380 | 1640 | cctx = zstd.ZstdCompressor(write_checksum=True) |
|
1381 | 1641 | |
|
1642 | if not hasattr(cctx, 'multi_compress_to_buffer'): | |
|
1643 | self.skipTest('multi_compress_to_buffer not available') | |
|
1644 | ||
|
1382 | 1645 | original = [b'foo' * 12, b'bar' * 6] |
|
1383 | 1646 | frames = [cctx.compress(c) for c in original] |
|
1384 | 1647 | b = cctx.multi_compress_to_buffer(original) |
@@ -1394,6 +1657,9 b' class TestCompressor_multi_compress_to_b' | |||
|
1394 | 1657 | def test_buffer_with_segments_input(self): |
|
1395 | 1658 | cctx = zstd.ZstdCompressor(write_checksum=True) |
|
1396 | 1659 | |
|
1660 | if not hasattr(cctx, 'multi_compress_to_buffer'): | |
|
1661 | self.skipTest('multi_compress_to_buffer not available') | |
|
1662 | ||
|
1397 | 1663 | original = [b'foo' * 4, b'bar' * 6] |
|
1398 | 1664 | frames = [cctx.compress(c) for c in original] |
|
1399 | 1665 | |
@@ -1412,6 +1678,9 b' class TestCompressor_multi_compress_to_b' | |||
|
1412 | 1678 | def test_buffer_with_segments_collection_input(self): |
|
1413 | 1679 | cctx = zstd.ZstdCompressor(write_checksum=True) |
|
1414 | 1680 | |
|
1681 | if not hasattr(cctx, 'multi_compress_to_buffer'): | |
|
1682 | self.skipTest('multi_compress_to_buffer not available') | |
|
1683 | ||
|
1415 | 1684 | original = [ |
|
1416 | 1685 | b'foo1', |
|
1417 | 1686 | b'foo2' * 2, |
@@ -1449,6 +1718,9 b' class TestCompressor_multi_compress_to_b' | |||
|
1449 | 1718 | |
|
1450 | 1719 | cctx = zstd.ZstdCompressor(write_checksum=True) |
|
1451 | 1720 | |
|
1721 | if not hasattr(cctx, 'multi_compress_to_buffer'): | |
|
1722 | self.skipTest('multi_compress_to_buffer not available') | |
|
1723 | ||
|
1452 | 1724 | frames = [] |
|
1453 | 1725 | frames.extend(b'x' * 64 for i in range(256)) |
|
1454 | 1726 | frames.extend(b'y' * 64 for i in range(256)) |
@@ -12,6 +12,7 b' import zstandard as zstd' | |||
|
12 | 12 | |
|
13 | 13 | from . common import ( |
|
14 | 14 | make_cffi, |
|
15 | NonClosingBytesIO, | |
|
15 | 16 | random_input_data, |
|
16 | 17 | ) |
|
17 | 18 | |
@@ -19,6 +20,62 b' from . common import (' | |||
|
19 | 20 | @unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set') |
|
20 | 21 | @make_cffi |
|
21 | 22 | class TestCompressor_stream_reader_fuzzing(unittest.TestCase): |
|
23 | @hypothesis.settings( | |
|
24 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
25 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
26 | level=strategies.integers(min_value=1, max_value=5), | |
|
27 | source_read_size=strategies.integers(1, 16384), | |
|
28 | read_size=strategies.integers(-1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE)) | |
|
29 | def test_stream_source_read(self, original, level, source_read_size, | |
|
30 | read_size): | |
|
31 | if read_size == 0: | |
|
32 | read_size = -1 | |
|
33 | ||
|
34 | refctx = zstd.ZstdCompressor(level=level) | |
|
35 | ref_frame = refctx.compress(original) | |
|
36 | ||
|
37 | cctx = zstd.ZstdCompressor(level=level) | |
|
38 | with cctx.stream_reader(io.BytesIO(original), size=len(original), | |
|
39 | read_size=source_read_size) as reader: | |
|
40 | chunks = [] | |
|
41 | while True: | |
|
42 | chunk = reader.read(read_size) | |
|
43 | if not chunk: | |
|
44 | break | |
|
45 | ||
|
46 | chunks.append(chunk) | |
|
47 | ||
|
48 | self.assertEqual(b''.join(chunks), ref_frame) | |
|
49 | ||
|
50 | @hypothesis.settings( | |
|
51 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
52 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
53 | level=strategies.integers(min_value=1, max_value=5), | |
|
54 | source_read_size=strategies.integers(1, 16384), | |
|
55 | read_size=strategies.integers(-1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE)) | |
|
56 | def test_buffer_source_read(self, original, level, source_read_size, | |
|
57 | read_size): | |
|
58 | if read_size == 0: | |
|
59 | read_size = -1 | |
|
60 | ||
|
61 | refctx = zstd.ZstdCompressor(level=level) | |
|
62 | ref_frame = refctx.compress(original) | |
|
63 | ||
|
64 | cctx = zstd.ZstdCompressor(level=level) | |
|
65 | with cctx.stream_reader(original, size=len(original), | |
|
66 | read_size=source_read_size) as reader: | |
|
67 | chunks = [] | |
|
68 | while True: | |
|
69 | chunk = reader.read(read_size) | |
|
70 | if not chunk: | |
|
71 | break | |
|
72 | ||
|
73 | chunks.append(chunk) | |
|
74 | ||
|
75 | self.assertEqual(b''.join(chunks), ref_frame) | |
|
76 | ||
|
77 | @hypothesis.settings( | |
|
78 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
22 | 79 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), |
|
23 | 80 | level=strategies.integers(min_value=1, max_value=5), |
|
24 | 81 | source_read_size=strategies.integers(1, 16384), |
@@ -33,15 +90,17 b' class TestCompressor_stream_reader_fuzzi' | |||
|
33 | 90 | read_size=source_read_size) as reader: |
|
34 | 91 | chunks = [] |
|
35 | 92 | while True: |
|
36 | read_size = read_sizes.draw(strategies.integers(1, 16384)) | |
|
93 | read_size = read_sizes.draw(strategies.integers(-1, 16384)) | |
|
37 | 94 | chunk = reader.read(read_size) |
|
95 | if not chunk and read_size: | |
|
96 | break | |
|
38 | 97 | |
|
39 | if not chunk: | |
|
40 | break | |
|
41 | 98 | chunks.append(chunk) |
|
42 | 99 | |
|
43 | 100 | self.assertEqual(b''.join(chunks), ref_frame) |
|
44 | 101 | |
|
102 | @hypothesis.settings( | |
|
103 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
45 | 104 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), |
|
46 | 105 | level=strategies.integers(min_value=1, max_value=5), |
|
47 | 106 | source_read_size=strategies.integers(1, 16384), |
@@ -57,14 +116,343 b' class TestCompressor_stream_reader_fuzzi' | |||
|
57 | 116 | read_size=source_read_size) as reader: |
|
58 | 117 | chunks = [] |
|
59 | 118 | while True: |
|
119 | read_size = read_sizes.draw(strategies.integers(-1, 16384)) | |
|
120 | chunk = reader.read(read_size) | |
|
121 | if not chunk and read_size: | |
|
122 | break | |
|
123 | ||
|
124 | chunks.append(chunk) | |
|
125 | ||
|
126 | self.assertEqual(b''.join(chunks), ref_frame) | |
|
127 | ||
|
128 | @hypothesis.settings( | |
|
129 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
130 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
131 | level=strategies.integers(min_value=1, max_value=5), | |
|
132 | source_read_size=strategies.integers(1, 16384), | |
|
133 | read_size=strategies.integers(1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE)) | |
|
134 | def test_stream_source_readinto(self, original, level, | |
|
135 | source_read_size, read_size): | |
|
136 | refctx = zstd.ZstdCompressor(level=level) | |
|
137 | ref_frame = refctx.compress(original) | |
|
138 | ||
|
139 | cctx = zstd.ZstdCompressor(level=level) | |
|
140 | with cctx.stream_reader(io.BytesIO(original), size=len(original), | |
|
141 | read_size=source_read_size) as reader: | |
|
142 | chunks = [] | |
|
143 | while True: | |
|
144 | b = bytearray(read_size) | |
|
145 | count = reader.readinto(b) | |
|
146 | ||
|
147 | if not count: | |
|
148 | break | |
|
149 | ||
|
150 | chunks.append(bytes(b[0:count])) | |
|
151 | ||
|
152 | self.assertEqual(b''.join(chunks), ref_frame) | |
|
153 | ||
|
154 | @hypothesis.settings( | |
|
155 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
156 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
157 | level=strategies.integers(min_value=1, max_value=5), | |
|
158 | source_read_size=strategies.integers(1, 16384), | |
|
159 | read_size=strategies.integers(1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE)) | |
|
160 | def test_buffer_source_readinto(self, original, level, | |
|
161 | source_read_size, read_size): | |
|
162 | ||
|
163 | refctx = zstd.ZstdCompressor(level=level) | |
|
164 | ref_frame = refctx.compress(original) | |
|
165 | ||
|
166 | cctx = zstd.ZstdCompressor(level=level) | |
|
167 | with cctx.stream_reader(original, size=len(original), | |
|
168 | read_size=source_read_size) as reader: | |
|
169 | chunks = [] | |
|
170 | while True: | |
|
171 | b = bytearray(read_size) | |
|
172 | count = reader.readinto(b) | |
|
173 | ||
|
174 | if not count: | |
|
175 | break | |
|
176 | ||
|
177 | chunks.append(bytes(b[0:count])) | |
|
178 | ||
|
179 | self.assertEqual(b''.join(chunks), ref_frame) | |
|
180 | ||
|
181 | @hypothesis.settings( | |
|
182 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
183 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
184 | level=strategies.integers(min_value=1, max_value=5), | |
|
185 | source_read_size=strategies.integers(1, 16384), | |
|
186 | read_sizes=strategies.data()) | |
|
187 | def test_stream_source_readinto_variance(self, original, level, | |
|
188 | source_read_size, read_sizes): | |
|
189 | refctx = zstd.ZstdCompressor(level=level) | |
|
190 | ref_frame = refctx.compress(original) | |
|
191 | ||
|
192 | cctx = zstd.ZstdCompressor(level=level) | |
|
193 | with cctx.stream_reader(io.BytesIO(original), size=len(original), | |
|
194 | read_size=source_read_size) as reader: | |
|
195 | chunks = [] | |
|
196 | while True: | |
|
60 | 197 | read_size = read_sizes.draw(strategies.integers(1, 16384)) |
|
61 | chunk = reader.read(read_size) | 

198 | b = bytearray(read_size) | 
|
199 | count = reader.readinto(b) | |
|
200 | ||
|
201 | if not count: | |
|
202 | break | |
|
203 | ||
|
204 | chunks.append(bytes(b[0:count])) | |
|
205 | ||
|
206 | self.assertEqual(b''.join(chunks), ref_frame) | |
|
207 | ||
|
208 | @hypothesis.settings( | |
|
209 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
210 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
211 | level=strategies.integers(min_value=1, max_value=5), | |
|
212 | source_read_size=strategies.integers(1, 16384), | |
|
213 | read_sizes=strategies.data()) | |
|
214 | def test_buffer_source_readinto_variance(self, original, level, | |
|
215 | source_read_size, read_sizes): | |
|
216 | ||
|
217 | refctx = zstd.ZstdCompressor(level=level) | |
|
218 | ref_frame = refctx.compress(original) | |
|
219 | ||
|
220 | cctx = zstd.ZstdCompressor(level=level) | |
|
221 | with cctx.stream_reader(original, size=len(original), | |
|
222 | read_size=source_read_size) as reader: | |
|
223 | chunks = [] | |
|
224 | while True: | |
|
225 | read_size = read_sizes.draw(strategies.integers(1, 16384)) | |
|
226 | b = bytearray(read_size) | |
|
227 | count = reader.readinto(b) | |
|
228 | ||
|
229 | if not count: | |
|
230 | break | |
|
231 | ||
|
232 | chunks.append(bytes(b[0:count])) | |
|
233 | ||
|
234 | self.assertEqual(b''.join(chunks), ref_frame) | |
|
235 | ||
|
236 | @hypothesis.settings( | |
|
237 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
238 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
239 | level=strategies.integers(min_value=1, max_value=5), | |
|
240 | source_read_size=strategies.integers(1, 16384), | |
|
241 | read_size=strategies.integers(-1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE)) | |
|
242 | def test_stream_source_read1(self, original, level, source_read_size, | |
|
243 | read_size): | |
|
244 | if read_size == 0: | |
|
245 | read_size = -1 | |
|
246 | ||
|
247 | refctx = zstd.ZstdCompressor(level=level) | |
|
248 | ref_frame = refctx.compress(original) | |
|
249 | ||
|
250 | cctx = zstd.ZstdCompressor(level=level) | |
|
251 | with cctx.stream_reader(io.BytesIO(original), size=len(original), | |
|
252 | read_size=source_read_size) as reader: | |
|
253 | chunks = [] | |
|
254 | while True: | |
|
255 | chunk = reader.read1(read_size) | |
|
62 | 256 | if not chunk: |
|
63 | 257 | break |
|
258 | ||
|
64 | 259 | chunks.append(chunk) |
|
65 | 260 | |
|
66 | 261 | self.assertEqual(b''.join(chunks), ref_frame) |
|
67 | 262 | |
|
263 | @hypothesis.settings( | |
|
264 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
265 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
266 | level=strategies.integers(min_value=1, max_value=5), | |
|
267 | source_read_size=strategies.integers(1, 16384), | |
|
268 | read_size=strategies.integers(-1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE)) | |
|
269 | def test_buffer_source_read1(self, original, level, source_read_size, | |
|
270 | read_size): | |
|
271 | if read_size == 0: | |
|
272 | read_size = -1 | |
|
273 | ||
|
274 | refctx = zstd.ZstdCompressor(level=level) | |
|
275 | ref_frame = refctx.compress(original) | |
|
276 | ||
|
277 | cctx = zstd.ZstdCompressor(level=level) | |
|
278 | with cctx.stream_reader(original, size=len(original), | |
|
279 | read_size=source_read_size) as reader: | |
|
280 | chunks = [] | |
|
281 | while True: | |
|
282 | chunk = reader.read1(read_size) | |
|
283 | if not chunk: | |
|
284 | break | |
|
285 | ||
|
286 | chunks.append(chunk) | |
|
287 | ||
|
288 | self.assertEqual(b''.join(chunks), ref_frame) | |
|
289 | ||
|
290 | @hypothesis.settings( | |
|
291 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
292 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
293 | level=strategies.integers(min_value=1, max_value=5), | |
|
294 | source_read_size=strategies.integers(1, 16384), | |
|
295 | read_sizes=strategies.data()) | |
|
296 | def test_stream_source_read1_variance(self, original, level, source_read_size, | |
|
297 | read_sizes): | |
|
298 | refctx = zstd.ZstdCompressor(level=level) | |
|
299 | ref_frame = refctx.compress(original) | |
|
300 | ||
|
301 | cctx = zstd.ZstdCompressor(level=level) | |
|
302 | with cctx.stream_reader(io.BytesIO(original), size=len(original), | |
|
303 | read_size=source_read_size) as reader: | |
|
304 | chunks = [] | |
|
305 | while True: | |
|
306 | read_size = read_sizes.draw(strategies.integers(-1, 16384)) | |
|
307 | chunk = reader.read1(read_size) | |
|
308 | if not chunk and read_size: | |
|
309 | break | |
|
310 | ||
|
311 | chunks.append(chunk) | |
|
312 | ||
|
313 | self.assertEqual(b''.join(chunks), ref_frame) | |
|
314 | ||
|
315 | @hypothesis.settings( | |
|
316 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
317 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
318 | level=strategies.integers(min_value=1, max_value=5), | |
|
319 | source_read_size=strategies.integers(1, 16384), | |
|
320 | read_sizes=strategies.data()) | |
|
321 | def test_buffer_source_read1_variance(self, original, level, source_read_size, | |
|
322 | read_sizes): | |
|
323 | ||
|
324 | refctx = zstd.ZstdCompressor(level=level) | |
|
325 | ref_frame = refctx.compress(original) | |
|
326 | ||
|
327 | cctx = zstd.ZstdCompressor(level=level) | |
|
328 | with cctx.stream_reader(original, size=len(original), | |
|
329 | read_size=source_read_size) as reader: | |
|
330 | chunks = [] | |
|
331 | while True: | |
|
332 | read_size = read_sizes.draw(strategies.integers(-1, 16384)) | |
|
333 | chunk = reader.read1(read_size) | |
|
334 | if not chunk and read_size: | |
|
335 | break | |
|
336 | ||
|
337 | chunks.append(chunk) | |
|
338 | ||
|
339 | self.assertEqual(b''.join(chunks), ref_frame) | |
|
340 | ||
|
341 | ||
|
342 | @hypothesis.settings( | |
|
343 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
344 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
345 | level=strategies.integers(min_value=1, max_value=5), | |
|
346 | source_read_size=strategies.integers(1, 16384), | |
|
347 | read_size=strategies.integers(1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE)) | |
|
348 | def test_stream_source_readinto1(self, original, level, source_read_size, | |
|
349 | read_size): | |
|
350 | if read_size == 0: | |
|
351 | read_size = -1 | |
|
352 | ||
|
353 | refctx = zstd.ZstdCompressor(level=level) | |
|
354 | ref_frame = refctx.compress(original) | |
|
355 | ||
|
356 | cctx = zstd.ZstdCompressor(level=level) | |
|
357 | with cctx.stream_reader(io.BytesIO(original), size=len(original), | |
|
358 | read_size=source_read_size) as reader: | |
|
359 | chunks = [] | |
|
360 | while True: | |
|
361 | b = bytearray(read_size) | |
|
362 | count = reader.readinto1(b) | |
|
363 | ||
|
364 | if not count: | |
|
365 | break | |
|
366 | ||
|
367 | chunks.append(bytes(b[0:count])) | |
|
368 | ||
|
369 | self.assertEqual(b''.join(chunks), ref_frame) | |
|
370 | ||
|
371 | @hypothesis.settings( | |
|
372 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
373 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
374 | level=strategies.integers(min_value=1, max_value=5), | |
|
375 | source_read_size=strategies.integers(1, 16384), | |
|
376 | read_size=strategies.integers(1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE)) | |
|
377 | def test_buffer_source_readinto1(self, original, level, source_read_size, | |
|
378 | read_size): | |
|
379 | if read_size == 0: | |
|
380 | read_size = -1 | |
|
381 | ||
|
382 | refctx = zstd.ZstdCompressor(level=level) | |
|
383 | ref_frame = refctx.compress(original) | |
|
384 | ||
|
385 | cctx = zstd.ZstdCompressor(level=level) | |
|
386 | with cctx.stream_reader(original, size=len(original), | |
|
387 | read_size=source_read_size) as reader: | |
|
388 | chunks = [] | |
|
389 | while True: | |
|
390 | b = bytearray(read_size) | |
|
391 | count = reader.readinto1(b) | |
|
392 | ||
|
393 | if not count: | |
|
394 | break | |
|
395 | ||
|
396 | chunks.append(bytes(b[0:count])) | |
|
397 | ||
|
398 | self.assertEqual(b''.join(chunks), ref_frame) | |
|
399 | ||
|
400 | @hypothesis.settings( | |
|
401 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
402 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
403 | level=strategies.integers(min_value=1, max_value=5), | |
|
404 | source_read_size=strategies.integers(1, 16384), | |
|
405 | read_sizes=strategies.data()) | |
|
406 | def test_stream_source_readinto1_variance(self, original, level, source_read_size, | |
|
407 | read_sizes): | |
|
408 | refctx = zstd.ZstdCompressor(level=level) | |
|
409 | ref_frame = refctx.compress(original) | |
|
410 | ||
|
411 | cctx = zstd.ZstdCompressor(level=level) | |
|
412 | with cctx.stream_reader(io.BytesIO(original), size=len(original), | |
|
413 | read_size=source_read_size) as reader: | |
|
414 | chunks = [] | |
|
415 | while True: | |
|
416 | read_size = read_sizes.draw(strategies.integers(1, 16384)) | |
|
417 | b = bytearray(read_size) | |
|
418 | count = reader.readinto1(b) | |
|
419 | ||
|
420 | if not count: | |
|
421 | break | |
|
422 | ||
|
423 | chunks.append(bytes(b[0:count])) | |
|
424 | ||
|
425 | self.assertEqual(b''.join(chunks), ref_frame) | |
|
426 | ||
|
427 | @hypothesis.settings( | |
|
428 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
429 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
430 | level=strategies.integers(min_value=1, max_value=5), | |
|
431 | source_read_size=strategies.integers(1, 16384), | |
|
432 | read_sizes=strategies.data()) | |
|
433 | def test_buffer_source_readinto1_variance(self, original, level, source_read_size, | |
|
434 | read_sizes): | |
|
435 | ||
|
436 | refctx = zstd.ZstdCompressor(level=level) | |
|
437 | ref_frame = refctx.compress(original) | |
|
438 | ||
|
439 | cctx = zstd.ZstdCompressor(level=level) | |
|
440 | with cctx.stream_reader(original, size=len(original), | |
|
441 | read_size=source_read_size) as reader: | |
|
442 | chunks = [] | |
|
443 | while True: | |
|
444 | read_size = read_sizes.draw(strategies.integers(1, 16384)) | |
|
445 | b = bytearray(read_size) | |
|
446 | count = reader.readinto1(b) | |
|
447 | ||
|
448 | if not count: | |
|
449 | break | |
|
450 | ||
|
451 | chunks.append(bytes(b[0:count])) | |
|
452 | ||
|
453 | self.assertEqual(b''.join(chunks), ref_frame) | |
|
454 | ||
|
455 | ||
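The `*_variance` tests above all share one fuzzing shape: drain a reader with randomly sized `read()`/`readinto()` calls and check that the reassembled bytes match a reference. A stdlib-only sketch of that loop (plain `io.BytesIO` standing in for the zstd reader, `random.Random` standing in for the Hypothesis `data()` draws; the function name is ours):

```python
import io
import random

def drain_with_random_reads(source_bytes, max_read=16384, seed=42):
    """Read source_bytes back through randomly sized read() calls."""
    rng = random.Random(seed)
    reader = io.BytesIO(source_bytes)
    chunks = []
    while True:
        # Each iteration draws a fresh read size, like read_sizes.draw().
        chunk = reader.read(rng.randint(1, max_read))
        if not chunk:
            break
        chunks.append(chunk)
    return b''.join(chunks)

data = bytes(range(256)) * 100
assert drain_with_random_reads(data) == data
```

The reassembly assertion at the end mirrors the tests' `self.assertEqual(b''.join(chunks), ref_frame)`.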
|
68 | 456 | |
|
69 | 457 | @unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set') |
|
70 | 458 | @make_cffi |
@@ -77,7 +465,7 b' class TestCompressor_stream_writer_fuzzi' | |||
|
77 | 465 | ref_frame = refctx.compress(original) |
|
78 | 466 | |
|
79 | 467 | cctx = zstd.ZstdCompressor(level=level) |
|
80 | b = | |
|
468 | b = NonClosingBytesIO() | |
|
81 | 469 | with cctx.stream_writer(b, size=len(original), write_size=write_size) as compressor: |
|
82 | 470 | compressor.write(original) |
|
83 | 471 | |
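The `NonClosingBytesIO` helper substituted in above lives in the project's `tests/common.py`; it lets a test inspect written bytes even after the writer closes the stream. A minimal sketch of what such a helper might look like (assumed implementation, not the project's exact code):

```python
import io

class NonClosingBytesIO(io.BytesIO):
    """BytesIO whose contents survive close() -- a guessed sketch of the
    tests/common.py helper; the real implementation may differ."""

    def __init__(self, *args, **kwargs):
        super(NonClosingBytesIO, self).__init__(*args, **kwargs)
        self._saved_value = None

    def close(self):
        # Capture contents before the underlying buffer is released.
        self._saved_value = self.getvalue()
        return super(NonClosingBytesIO, self).close()

    def getvalue(self):
        if self.closed:
            return self._saved_value
        return super(NonClosingBytesIO, self).getvalue()

b = NonClosingBytesIO()
b.write(b'data')
b.close()
assert b.closed
assert b.getvalue() == b'data'  # still readable after close()
```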
@@ -219,6 +607,9 b' class TestCompressor_multi_compress_to_b' | |||
|
219 | 607 | write_checksum=True, |
|
220 | 608 | **kwargs) |
|
221 | 609 | |
|
610 | if not hasattr(cctx, 'multi_compress_to_buffer'): | |
|
611 | self.skipTest('multi_compress_to_buffer not available') | |
|
612 | ||
|
222 | 613 | result = cctx.multi_compress_to_buffer(original, threads=-1) |
|
223 | 614 | |
|
224 | 615 | self.assertEqual(len(result), len(original)) |
@@ -15,17 +15,17 b' class TestCompressionParameters(unittest' | |||
|
15 | 15 | chain_log=zstd.CHAINLOG_MIN, |
|
16 | 16 | hash_log=zstd.HASHLOG_MIN, |
|
17 | 17 | search_log=zstd.SEARCHLOG_MIN, |
|
18 | min_match=zstd. | |
|
18 | min_match=zstd.MINMATCH_MIN + 1, | |
|
19 | 19 | target_length=zstd.TARGETLENGTH_MIN, |
|
20 |
|
|
|
20 | strategy=zstd.STRATEGY_FAST) | |
|
21 | 21 | |
|
22 | 22 | zstd.ZstdCompressionParameters(window_log=zstd.WINDOWLOG_MAX, |
|
23 | 23 | chain_log=zstd.CHAINLOG_MAX, |
|
24 | 24 | hash_log=zstd.HASHLOG_MAX, |
|
25 | 25 | search_log=zstd.SEARCHLOG_MAX, |
|
26 | min_match=zstd. | |
|
26 | min_match=zstd.MINMATCH_MAX - 1, | |
|
27 | 27 | target_length=zstd.TARGETLENGTH_MAX, |
|
28 |
|
|
|
28 | strategy=zstd.STRATEGY_BTULTRA2) | |
|
29 | 29 | |
|
30 | 30 | def test_from_level(self): |
|
31 | 31 | p = zstd.ZstdCompressionParameters.from_level(1) |
@@ -43,7 +43,7 b' class TestCompressionParameters(unittest' | |||
|
43 | 43 | search_log=4, |
|
44 | 44 | min_match=5, |
|
45 | 45 | target_length=8, |
|
46 |
|
|
|
46 | strategy=1) | |
|
47 | 47 | self.assertEqual(p.window_log, 10) |
|
48 | 48 | self.assertEqual(p.chain_log, 6) |
|
49 | 49 | self.assertEqual(p.hash_log, 7) |
@@ -59,9 +59,10 b' class TestCompressionParameters(unittest' | |||
|
59 | 59 | self.assertEqual(p.threads, 4) |
|
60 | 60 | |
|
61 | 61 | p = zstd.ZstdCompressionParameters(threads=2, job_size=1048576, |
|
62 | overlap | |
|
62 | overlap_log=6) | |
|
63 | 63 | self.assertEqual(p.threads, 2) |
|
64 | 64 | self.assertEqual(p.job_size, 1048576) |
|
65 | self.assertEqual(p.overlap_log, 6) | |
|
65 | 66 | self.assertEqual(p.overlap_size_log, 6) |
|
66 | 67 | |
|
67 | 68 | p = zstd.ZstdCompressionParameters(compression_level=-1) |
@@ -85,8 +86,9 b' class TestCompressionParameters(unittest' | |||
|
85 | 86 | p = zstd.ZstdCompressionParameters(ldm_bucket_size_log=7) |
|
86 | 87 | self.assertEqual(p.ldm_bucket_size_log, 7) |
|
87 | 88 | |
|
88 | p = zstd.ZstdCompressionParameters(ldm_hash_e | |
|
89 | p = zstd.ZstdCompressionParameters(ldm_hash_rate_log=8) | |
|
89 | 90 | self.assertEqual(p.ldm_hash_every_log, 8) |
|
91 | self.assertEqual(p.ldm_hash_rate_log, 8) | |
|
90 | 92 | |
|
91 | 93 | def test_estimated_compression_context_size(self): |
|
92 | 94 | p = zstd.ZstdCompressionParameters(window_log=20, |
@@ -95,12 +97,44 b' class TestCompressionParameters(unittest' | |||
|
95 | 97 | search_log=1, |
|
96 | 98 | min_match=5, |
|
97 | 99 | target_length=16, |
|
98 |
|
|
|
100 | strategy=zstd.STRATEGY_DFAST) | |
|
99 | 101 | |
|
100 | 102 | # 32-bit has slightly different values from 64-bit. |
|
101 | 103 | self.assertAlmostEqual(p.estimated_compression_context_size(), 1294072, |
|
102 | 104 | delta=250) |
|
103 | 105 | |
|
106 | def test_strategy(self): | |
|
107 | with self.assertRaisesRegexp(ValueError, 'cannot specify both compression_strategy'): | |
|
108 | zstd.ZstdCompressionParameters(strategy=0, compression_strategy=0) | |
|
109 | ||
|
110 | p = zstd.ZstdCompressionParameters(strategy=2) | |
|
111 | self.assertEqual(p.compression_strategy, 2) | |
|
112 | ||
|
113 | p = zstd.ZstdCompressionParameters(strategy=3) | |
|
114 | self.assertEqual(p.compression_strategy, 3) | |
|
115 | ||
|
116 | def test_ldm_hash_rate_log(self): | |
|
117 | with self.assertRaisesRegexp(ValueError, 'cannot specify both ldm_hash_rate_log'): | |
|
118 | zstd.ZstdCompressionParameters(ldm_hash_rate_log=8, ldm_hash_every_log=4) | |
|
119 | ||
|
120 | p = zstd.ZstdCompressionParameters(ldm_hash_rate_log=8) | |
|
121 | self.assertEqual(p.ldm_hash_every_log, 8) | |
|
122 | ||
|
123 | p = zstd.ZstdCompressionParameters(ldm_hash_every_log=16) | |
|
124 | self.assertEqual(p.ldm_hash_every_log, 16) | |
|
125 | ||
|
126 | def test_overlap_log(self): | |
|
127 | with self.assertRaisesRegexp(ValueError, 'cannot specify both overlap_log'): | |
|
128 | zstd.ZstdCompressionParameters(overlap_log=1, overlap_size_log=9) | |
|
129 | ||
|
130 | p = zstd.ZstdCompressionParameters(overlap_log=2) | |
|
131 | self.assertEqual(p.overlap_log, 2) | |
|
132 | self.assertEqual(p.overlap_size_log, 2) | |
|
133 | ||
|
134 | p = zstd.ZstdCompressionParameters(overlap_size_log=4) | |
|
135 | self.assertEqual(p.overlap_log, 4) | |
|
136 | self.assertEqual(p.overlap_size_log, 4) | |
|
137 | ||
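The new `test_overlap_log` and `test_ldm_hash_rate_log` tests above exercise pairs of aliased constructor keywords (`overlap_log`/`overlap_size_log`, `ldm_hash_rate_log`/`ldm_hash_every_log`) that must not be passed together. The generic pattern can be sketched like this (the `resolve_alias` helper is hypothetical, not the project's code):

```python
def resolve_alias(new_value, old_value, name):
    """Return one value for a parameter exposed under two keyword names,
    rejecting calls that set both (hypothetical helper)."""
    if new_value is not None and old_value is not None:
        raise ValueError('cannot specify both %s and its alias' % name)
    return new_value if new_value is not None else old_value

# Either spelling alone works; both together is an error.
assert resolve_alias(2, None, 'overlap_log') == 2
assert resolve_alias(None, 4, 'overlap_log') == 4
try:
    resolve_alias(1, 9, 'overlap_log')
except ValueError:
    pass
else:
    raise AssertionError('expected ValueError')
```

Resolving to a single internal value keeps old callers working while the tests can assert that both property spellings report the same number.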
|
104 | 138 | |
|
105 | 139 | @make_cffi |
|
106 | 140 | class TestFrameParameters(unittest.TestCase): |
@@ -24,8 +24,8 b' s_hashlog = strategies.integers(min_valu' | |||
|
24 | 24 | max_value=zstd.HASHLOG_MAX) |
|
25 | 25 | s_searchlog = strategies.integers(min_value=zstd.SEARCHLOG_MIN, |
|
26 | 26 | max_value=zstd.SEARCHLOG_MAX) |
|
27 | s_ | |
|
28 | |
|
27 | s_minmatch = strategies.integers(min_value=zstd.MINMATCH_MIN, | |
|
28 | max_value=zstd.MINMATCH_MAX) | |
|
29 | 29 | s_targetlength = strategies.integers(min_value=zstd.TARGETLENGTH_MIN, |
|
30 | 30 | max_value=zstd.TARGETLENGTH_MAX) |
|
31 | 31 | s_strategy = strategies.sampled_from((zstd.STRATEGY_FAST, |
@@ -35,41 +35,42 b' s_strategy = strategies.sampled_from((zs' | |||
|
35 | 35 | zstd.STRATEGY_LAZY2, |
|
36 | 36 | zstd.STRATEGY_BTLAZY2, |
|
37 | 37 | zstd.STRATEGY_BTOPT, |
|
38 | zstd.STRATEGY_BTULTRA | |
|
38 | zstd.STRATEGY_BTULTRA, | |
|
39 | zstd.STRATEGY_BTULTRA2)) | |
|
39 | 40 | |
|
40 | 41 | |
|
41 | 42 | @make_cffi |
|
42 | 43 | @unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set') |
|
43 | 44 | class TestCompressionParametersHypothesis(unittest.TestCase): |
|
44 | 45 | @hypothesis.given(s_windowlog, s_chainlog, s_hashlog, s_searchlog, |
|
45 | s_ | |
|
46 | s_minmatch, s_targetlength, s_strategy) | |
|
46 | 47 | def test_valid_init(self, windowlog, chainlog, hashlog, searchlog, |
|
47 |
|
|
|
48 | minmatch, targetlength, strategy): | |
|
48 | 49 | zstd.ZstdCompressionParameters(window_log=windowlog, |
|
49 | 50 | chain_log=chainlog, |
|
50 | 51 | hash_log=hashlog, |
|
51 | 52 | search_log=searchlog, |
|
52 | min_match= | |
|
53 | min_match=minmatch, | |
|
53 | 54 | target_length=targetlength, |
|
54 |
|
|
|
55 | strategy=strategy) | |
|
55 | 56 | |
|
56 | 57 | @hypothesis.given(s_windowlog, s_chainlog, s_hashlog, s_searchlog, |
|
57 |
|
|
|
58 | s_minmatch, s_targetlength, s_strategy) | |
|
58 | 59 | def test_estimated_compression_context_size(self, windowlog, chainlog, |
|
59 | 60 | hashlog, searchlog, |
|
60 |
|
|
|
61 | minmatch, targetlength, | |
|
61 | 62 | strategy): |
|
62 | if | |
|
63 | |
|
64 | elif | |
|
65 | |
|
63 | if minmatch == zstd.MINMATCH_MIN and strategy in (zstd.STRATEGY_FAST, zstd.STRATEGY_GREEDY): | |
|
64 | minmatch += 1 | |
|
65 | elif minmatch == zstd.MINMATCH_MAX and strategy != zstd.STRATEGY_FAST: | |
|
66 | minmatch -= 1 | |
|
66 | 67 | |
|
67 | 68 | p = zstd.ZstdCompressionParameters(window_log=windowlog, |
|
68 | 69 | chain_log=chainlog, |
|
69 | 70 | hash_log=hashlog, |
|
70 | 71 | search_log=searchlog, |
|
71 | min_match= | |
|
72 | min_match=minmatch, | |
|
72 | 73 | target_length=targetlength, |
|
73 |
|
|
|
74 | strategy=strategy) | |
|
74 | 75 | size = p.estimated_compression_context_size() |
|
75 | 76 |
@@ -3,6 +3,7 b' import os' | |||
|
3 | 3 | import random |
|
4 | 4 | import struct |
|
5 | 5 | import sys |
|
6 | import tempfile | |
|
6 | 7 | import unittest |
|
7 | 8 | |
|
8 | 9 | import zstandard as zstd |
@@ -10,6 +11,7 b' import zstandard as zstd' | |||
|
10 | 11 | from .common import ( |
|
11 | 12 | generate_samples, |
|
12 | 13 | make_cffi, |
|
14 | NonClosingBytesIO, | |
|
13 | 15 | OpCountingBytesIO, |
|
14 | 16 | ) |
|
15 | 17 | |
@@ -219,7 +221,7 b' class TestDecompressor_decompress(unitte' | |||
|
219 | 221 | cctx = zstd.ZstdCompressor(write_content_size=False) |
|
220 | 222 | frame = cctx.compress(source) |
|
221 | 223 | |
|
222 | dctx = zstd.ZstdDecompressor(max_window_size= | |
|
224 | dctx = zstd.ZstdDecompressor(max_window_size=2**zstd.WINDOWLOG_MIN) | |
|
223 | 225 | |
|
224 | 226 | with self.assertRaisesRegexp( |
|
225 | 227 | zstd.ZstdError, 'decompression error: Frame requires too much memory'): |
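The rewritten line above derives the smallest legal window from the window log: `max_window_size=2**zstd.WINDOWLOG_MIN`. The relationship is a plain power of two; `WINDOWLOG_MIN` is assumed to be 10 here, its value in current zstd releases:

```python
# A zstd window log of N permits a decompression window of 2**N bytes.
# WINDOWLOG_MIN = 10 is an assumption based on current zstd headers.
WINDOWLOG_MIN = 10
max_window_size = 2 ** WINDOWLOG_MIN
assert max_window_size == 1024
```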
@@ -302,19 +304,16 b' class TestDecompressor_stream_reader(uni' | |||
|
302 | 304 | dctx = zstd.ZstdDecompressor() |
|
303 | 305 | |
|
304 | 306 | with dctx.stream_reader(b'foo') as reader: |
|
305 | with self.assertRaises( | |
|
307 | with self.assertRaises(io.UnsupportedOperation): | |
|
306 | 308 | reader.readline() |
|
307 | 309 | |
|
308 | with self.assertRaises( | |
|
310 | with self.assertRaises(io.UnsupportedOperation): | |
|
309 | 311 | reader.readlines() |
|
310 | 312 | |
|
311 | with self.assertRaises( | |
|
312 | reader.readall() | |
|
313 | ||
|
314 | with self.assertRaises(NotImplementedError): | |
|
313 | with self.assertRaises(io.UnsupportedOperation): | |
|
315 | 314 | iter(reader) |
|
316 | 315 | |
|
317 | with self.assertRaises( | |
|
316 | with self.assertRaises(io.UnsupportedOperation): | |
|
318 | 317 | next(reader) |
|
319 | 318 | |
|
320 | 319 | with self.assertRaises(io.UnsupportedOperation): |
@@ -347,15 +346,18 b' class TestDecompressor_stream_reader(uni' | |||
|
347 | 346 | with self.assertRaisesRegexp(ValueError, 'stream is closed'): |
|
348 | 347 | reader.read(1) |
|
349 | 348 | |
|
350 | def test_ | |
|
349 | def test_read_sizes(self): | |
|
350 | cctx = zstd.ZstdCompressor() | |
|
351 | foo = cctx.compress(b'foo') | |
|
352 | ||
|
351 | 353 | dctx = zstd.ZstdDecompressor() |
|
352 | 354 | |
|
353 | with dctx.stream_reader( | |
|
354 | with self.assertRaisesRegexp(ValueError, 'cannot read negative | |
|
355 | reader.read(- | |
|
355 | with dctx.stream_reader(foo) as reader: | |
|
356 | with self.assertRaisesRegexp(ValueError, 'cannot read negative amounts less than -1'): | |
|
357 | reader.read(-2) | |
|
356 | 358 | |
|
357 | with self.assertRaisesRegexp(ValueError, 'cannot read negative or size 0 amounts'): | |
|
358 |
|
|
|
359 | self.assertEqual(reader.read(0), b'') | |
|
360 | self.assertEqual(reader.read(), b'foo') | |
|
359 | 361 | |
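The reworked `test_read_sizes` above asserts that `read(0)` returns empty bytes, `read()`/`read(-1)` drains the stream, and negative sizes other than -1 are rejected. That matches the convention of stdlib buffered streams, which can be sketched directly (analogy only; the zstd reader implements the same `io` semantics itself):

```python
import io

reader = io.BufferedReader(io.BytesIO(b'foo'))

assert reader.read(0) == b''      # size 0: empty result, stream untouched
assert reader.read(-1) == b'foo'  # -1 means read to EOF

try:
    reader.read(-2)               # other negative sizes are rejected
except ValueError:
    pass
else:
    raise AssertionError('expected ValueError')
```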
|
360 | 362 | def test_read_buffer(self): |
|
361 | 363 | cctx = zstd.ZstdCompressor() |
@@ -524,13 +526,243 b' class TestDecompressor_stream_reader(uni' | |||
|
524 | 526 | reader = dctx.stream_reader(source) |
|
525 | 527 | |
|
526 | 528 | with reader: |
|
527 | with self.assertRaises(TypeError): | |
|
528 | reader.read() | |
|
529 | reader.read(0) | |
|
529 | 530 | |
|
530 | 531 | with reader: |
|
531 | 532 | with self.assertRaisesRegexp(ValueError, 'stream is closed'): |
|
532 | 533 | reader.read(100) |
|
533 | 534 | |
|
535 | def test_partial_read(self): | |
|
536 | # Inspired by https://github.com/indygreg/python-zstandard/issues/71. | |
|
537 | buffer = io.BytesIO() | |
|
538 | cctx = zstd.ZstdCompressor() | |
|
539 | writer = cctx.stream_writer(buffer) | |
|
540 | writer.write(bytearray(os.urandom(1000000))) | |
|
541 | writer.flush(zstd.FLUSH_FRAME) | |
|
542 | buffer.seek(0) | |
|
543 | ||
|
544 | dctx = zstd.ZstdDecompressor() | |
|
545 | reader = dctx.stream_reader(buffer) | |
|
546 | ||
|
547 | while True: | |
|
548 | chunk = reader.read(8192) | |
|
549 | if not chunk: | |
|
550 | break | |
|
551 | ||
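`test_partial_read` above drains a large frame through fixed 8192-byte reads. The same consume-in-chunks pattern can be sketched with the stdlib's zlib streaming decompressor, where `max_length` plays the role of the read size (an analogy, not the zstd code path):

```python
import os
import zlib

payload = os.urandom(100000)  # incompressible, like the test's input
compressed = zlib.compress(payload)

d = zlib.decompressobj()
out = []
data = compressed
while data:
    # Emit at most 8192 decompressed bytes per call; input zlib
    # withheld to honor the cap comes back in unconsumed_tail.
    chunk = d.decompress(data, 8192)
    out.append(chunk)
    data = d.unconsumed_tail

assert b''.join(out) == payload
```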
|
552 | def test_read_multiple_frames(self): | |
|
553 | cctx = zstd.ZstdCompressor() | |
|
554 | source = io.BytesIO() | |
|
555 | writer = cctx.stream_writer(source) | |
|
556 | writer.write(b'foo') | |
|
557 | writer.flush(zstd.FLUSH_FRAME) | |
|
558 | writer.write(b'bar') | |
|
559 | writer.flush(zstd.FLUSH_FRAME) | |
|
560 | ||
|
561 | dctx = zstd.ZstdDecompressor() | |
|
562 | ||
|
563 | reader = dctx.stream_reader(source.getvalue()) | |
|
564 | self.assertEqual(reader.read(2), b'fo') | |
|
565 | self.assertEqual(reader.read(2), b'o') | |
|
566 | self.assertEqual(reader.read(2), b'ba') | |
|
567 | self.assertEqual(reader.read(2), b'r') | |
|
568 | ||
|
569 | source.seek(0) | |
|
570 | reader = dctx.stream_reader(source) | |
|
571 | self.assertEqual(reader.read(2), b'fo') | |
|
572 | self.assertEqual(reader.read(2), b'o') | |
|
573 | self.assertEqual(reader.read(2), b'ba') | |
|
574 | self.assertEqual(reader.read(2), b'r') | |
|
575 | ||
|
576 | reader = dctx.stream_reader(source.getvalue()) | |
|
577 | self.assertEqual(reader.read(3), b'foo') | |
|
578 | self.assertEqual(reader.read(3), b'bar') | |
|
579 | ||
|
580 | source.seek(0) | |
|
581 | reader = dctx.stream_reader(source) | |
|
582 | self.assertEqual(reader.read(3), b'foo') | |
|
583 | self.assertEqual(reader.read(3), b'bar') | |
|
584 | ||
|
585 | reader = dctx.stream_reader(source.getvalue()) | |
|
586 | self.assertEqual(reader.read(4), b'foo') | |
|
587 | self.assertEqual(reader.read(4), b'bar') | |
|
588 | ||
|
589 | source.seek(0) | |
|
590 | reader = dctx.stream_reader(source) | |
|
591 | self.assertEqual(reader.read(4), b'foo') | |
|
592 | self.assertEqual(reader.read(4), b'bar') | |
|
593 | ||
|
594 | reader = dctx.stream_reader(source.getvalue()) | |
|
595 | self.assertEqual(reader.read(128), b'foo') | |
|
596 | self.assertEqual(reader.read(128), b'bar') | |
|
597 | ||
|
598 | source.seek(0) | |
|
599 | reader = dctx.stream_reader(source) | |
|
600 | self.assertEqual(reader.read(128), b'foo') | |
|
601 | self.assertEqual(reader.read(128), b'bar') | |
|
602 | ||
|
603 | # Now tests for reads spanning frames. | |
|
604 | reader = dctx.stream_reader(source.getvalue(), read_across_frames=True) | |
|
605 | self.assertEqual(reader.read(3), b'foo') | |
|
606 | self.assertEqual(reader.read(3), b'bar') | |
|
607 | ||
|
608 | source.seek(0) | |
|
609 | reader = dctx.stream_reader(source, read_across_frames=True) | |
|
610 | self.assertEqual(reader.read(3), b'foo') | |
|
611 | self.assertEqual(reader.read(3), b'bar') | |
|
612 | ||
|
613 | reader = dctx.stream_reader(source.getvalue(), read_across_frames=True) | |
|
614 | self.assertEqual(reader.read(6), b'foobar') | |
|
615 | ||
|
616 | source.seek(0) | |
|
617 | reader = dctx.stream_reader(source, read_across_frames=True) | |
|
618 | self.assertEqual(reader.read(6), b'foobar') | |
|
619 | ||
|
620 | reader = dctx.stream_reader(source.getvalue(), read_across_frames=True) | |
|
621 | self.assertEqual(reader.read(7), b'foobar') | |
|
622 | ||
|
623 | source.seek(0) | |
|
624 | reader = dctx.stream_reader(source, read_across_frames=True) | |
|
625 | self.assertEqual(reader.read(7), b'foobar') | |
|
626 | ||
|
627 | reader = dctx.stream_reader(source.getvalue(), read_across_frames=True) | |
|
628 | self.assertEqual(reader.read(128), b'foobar') | |
|
629 | ||
|
630 | source.seek(0) | |
|
631 | reader = dctx.stream_reader(source, read_across_frames=True) | |
|
632 | self.assertEqual(reader.read(128), b'foobar') | |
|
633 | ||
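`test_read_multiple_frames` above checks that reads stop at a frame boundary unless `read_across_frames=True` is passed. The stdlib's zlib decompressor exhibits the same boundary behavior for concatenated streams, handing back the untouched remainder in `unused_data` (an analogy for illustration, not zstd itself):

```python
import zlib

stream1 = zlib.compress(b'foo')
stream2 = zlib.compress(b'bar')

d = zlib.decompressobj()
first = d.decompress(stream1 + stream2)

# Decompression stops at the first stream's end; the second stream
# is returned untouched in unused_data.
assert first == b'foo'
assert d.unused_data == stream2

second = zlib.decompressobj().decompress(d.unused_data)
assert second == b'bar'
```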
|
634 | def test_readinto(self): | |
|
635 | cctx = zstd.ZstdCompressor() | |
|
636 | foo = cctx.compress(b'foo') | |
|
637 | ||
|
638 | dctx = zstd.ZstdDecompressor() | |
|
639 | ||
|
640 | # Attempting to readinto() a non-writable buffer fails. | |
|
641 | # The exact exception varies based on the backend. | |
|
642 | reader = dctx.stream_reader(foo) | |
|
643 | with self.assertRaises(Exception): | |
|
644 | reader.readinto(b'foobar') | |
|
645 | ||
|
646 | # readinto() with sufficiently large destination. | |
|
647 | b = bytearray(1024) | |
|
648 | reader = dctx.stream_reader(foo) | |
|
649 | self.assertEqual(reader.readinto(b), 3) | |
|
650 | self.assertEqual(b[0:3], b'foo') | |
|
651 | self.assertEqual(reader.readinto(b), 0) | |
|
652 | self.assertEqual(b[0:3], b'foo') | |
|
653 | ||
|
654 | # readinto() with small reads. | |
|
655 | b = bytearray(1024) | |
|
656 | reader = dctx.stream_reader(foo, read_size=1) | |
|
657 | self.assertEqual(reader.readinto(b), 3) | |
|
658 | self.assertEqual(b[0:3], b'foo') | |
|
659 | ||
|
660 | # Too small destination buffer. | |
|
661 | b = bytearray(2) | |
|
662 | reader = dctx.stream_reader(foo) | |
|
663 | self.assertEqual(reader.readinto(b), 2) | |
|
664 | self.assertEqual(b[:], b'fo') | |
|
665 | ||
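The `readinto()` cases above (large destination, small reads, undersized destination) follow the standard `io` contract, which can be seen on a plain `io.BytesIO` (illustration of the protocol only):

```python
import io

src = io.BytesIO(b'foo')

# Large destination: readinto() fills only what is available
# and reports the byte count.
b = bytearray(1024)
assert src.readinto(b) == 3
assert bytes(b[0:3]) == b'foo'
assert src.readinto(b) == 0   # EOF: nothing written, count is 0

# A destination smaller than the remaining data caps the read.
src.seek(0)
small = bytearray(2)
assert src.readinto(small) == 2
assert bytes(small) == b'fo'
```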
|
666 | def test_readinto1(self): | |
|
667 | cctx = zstd.ZstdCompressor() | |
|
668 | foo = cctx.compress(b'foo') | |
|
669 | ||
|
670 | dctx = zstd.ZstdDecompressor() | |
|
671 | ||
|
672 | reader = dctx.stream_reader(foo) | |
|
673 | with self.assertRaises(Exception): | |
|
674 | reader.readinto1(b'foobar') | |
|
675 | ||
|
676 | # Sufficiently large destination. | |
|
677 | b = bytearray(1024) | |
|
678 | reader = dctx.stream_reader(foo) | |
|
679 | self.assertEqual(reader.readinto1(b), 3) | |
|
680 | self.assertEqual(b[0:3], b'foo') | |
|
681 | self.assertEqual(reader.readinto1(b), 0) | |
|
682 | self.assertEqual(b[0:3], b'foo') | |
|
683 | ||
|
684 | # readinto() with small reads. | |
|
685 | b = bytearray(1024) | |
|
686 | reader = dctx.stream_reader(foo, read_size=1) | |
|
687 | self.assertEqual(reader.readinto1(b), 3) | |
|
688 | self.assertEqual(b[0:3], b'foo') | |
|
689 | ||
|
690 | # Too small destination buffer. | |
|
691 | b = bytearray(2) | |
|
692 | reader = dctx.stream_reader(foo) | |
|
693 | self.assertEqual(reader.readinto1(b), 2) | |
|
694 | self.assertEqual(b[:], b'fo') | |
|
695 | ||
|
696 | def test_readall(self): | |
|
697 | cctx = zstd.ZstdCompressor() | |
|
698 | foo = cctx.compress(b'foo') | |
|
699 | ||
|
700 | dctx = zstd.ZstdDecompressor() | |
|
701 | reader = dctx.stream_reader(foo) | |
|
702 | ||
|
703 | self.assertEqual(reader.readall(), b'foo') | |
|
704 | ||
|
705 | def test_read1(self): | |
|
706 | cctx = zstd.ZstdCompressor() | |
|
707 | foo = cctx.compress(b'foo') | |
|
708 | ||
|
709 | dctx = zstd.ZstdDecompressor() | |
|
710 | ||
|
711 | b = OpCountingBytesIO(foo) | |
|
712 | reader = dctx.stream_reader(b) | |
|
713 | ||
|
714 | self.assertEqual(reader.read1(), b'foo') | |
|
715 | self.assertEqual(b._read_count, 1) | |
|
716 | ||
|
717 | b = OpCountingBytesIO(foo) | |
|
718 | reader = dctx.stream_reader(b) | |
|
719 | ||
|
720 | self.assertEqual(reader.read1(0), b'') | |
|
721 | self.assertEqual(reader.read1(2), b'fo') | |
|
722 | self.assertEqual(b._read_count, 1) | |
|
723 | self.assertEqual(reader.read1(1), b'o') | |
|
724 | self.assertEqual(b._read_count, 1) | |
|
725 | self.assertEqual(reader.read1(1), b'') | |
|
726 | self.assertEqual(b._read_count, 2) | |
|
727 | ||
|
728 | def test_read_lines(self): | |
|
729 | cctx = zstd.ZstdCompressor() | |
|
730 | source = b'\n'.join(('line %d' % i).encode('ascii') for i in range(1024)) | |
|
731 | ||
|
732 | frame = cctx.compress(source) | |
|
733 | ||
|
734 | dctx = zstd.ZstdDecompressor() | |
|
735 | reader = dctx.stream_reader(frame) | |
|
736 | tr = io.TextIOWrapper(reader, encoding='utf-8') | |
|
737 | ||
|
738 | lines = [] | |
|
739 | for line in tr: | |
|
740 | lines.append(line.encode('utf-8')) | |
|
741 | ||
|
742 | self.assertEqual(len(lines), 1024) | |
|
743 | self.assertEqual(b''.join(lines), source) | |
|
744 | ||
|
745 | reader = dctx.stream_reader(frame) | |
|
746 | tr = io.TextIOWrapper(reader, encoding='utf-8') | |
|
747 | ||
|
748 | lines = tr.readlines() | |
|
749 | self.assertEqual(len(lines), 1024) | |
|
750 | self.assertEqual(''.join(lines).encode('utf-8'), source) | |
|
751 | ||
|
752 | reader = dctx.stream_reader(frame) | |
|
753 | tr = io.TextIOWrapper(reader, encoding='utf-8') | |
|
754 | ||
|
755 | lines = [] | |
|
756 | while True: | |
|
757 | line = tr.readline() | |
|
758 | if not line: | |
|
759 | break | |
|
760 | ||
|
761 | lines.append(line.encode('utf-8')) | |
|
762 | ||
|
763 | self.assertEqual(len(lines), 1024) | |
|
764 | self.assertEqual(b''.join(lines), source) | |
|
765 | ||
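`test_read_lines` above layers `io.TextIOWrapper` over the binary zstd reader to get line-oriented text access. Any readable binary stream supports the same wrapping; a minimal stdlib demonstration:

```python
import io

# The tests wrap a zstd stream_reader; here a plain BytesIO stands in.
binary = io.BytesIO(b'line 0\nline 1\nline 2\n')
text = io.TextIOWrapper(binary, encoding='utf-8')

lines = text.readlines()
assert lines == ['line 0\n', 'line 1\n', 'line 2\n']
```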
|
534 | 766 | |
|
535 | 767 | @make_cffi |
|
536 | 768 | class TestDecompressor_decompressobj(unittest.TestCase): |
@@ -540,6 +772,9 b' class TestDecompressor_decompressobj(uni' | |||
|
540 | 772 | dctx = zstd.ZstdDecompressor() |
|
541 | 773 | dobj = dctx.decompressobj() |
|
542 | 774 | self.assertEqual(dobj.decompress(data), b'foobar') |
|
775 | self.assertIsNone(dobj.flush()) | |
|
776 | self.assertIsNone(dobj.flush(10)) | |
|
777 | self.assertIsNone(dobj.flush(length=100)) | |
|
543 | 778 | |
|
544 | 779 | def test_input_types(self): |
|
545 | 780 | compressed = zstd.ZstdCompressor(level=1).compress(b'foo') |
@@ -557,7 +792,11 b' class TestDecompressor_decompressobj(uni' | |||
|
557 | 792 | |
|
558 | 793 | for source in sources: |
|
559 | 794 | dobj = dctx.decompressobj() |
|
795 | self.assertIsNone(dobj.flush()) | |
|
796 | self.assertIsNone(dobj.flush(10)) | |
|
797 | self.assertIsNone(dobj.flush(length=100)) | |
|
560 | 798 | self.assertEqual(dobj.decompress(source), b'foo') |
|
799 | self.assertIsNone(dobj.flush()) | |
|
561 | 800 | |
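The added assertions above pin down that the zstd `decompressobj.flush()` always returns `None`: it exists purely for API compatibility with stdlib decompressors. For contrast, stdlib zlib's `flush()` returns bytes (any remaining buffered output, empty when nothing is pending):

```python
import zlib

d = zlib.decompressobj()
out = d.decompress(zlib.compress(b'foo'))
assert out == b'foo'

# zlib's flush() returns bytes (b'' here); the zstd decompressobj
# accepts the same call but returns None instead.
assert d.flush() == b''
```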
|
562 | 801 | def test_reuse(self): |
|
563 | 802 | data = zstd.ZstdCompressor(level=1).compress(b'foobar') |
@@ -568,6 +807,7 b' class TestDecompressor_decompressobj(uni' | |||
|
568 | 807 | |
|
569 | 808 | with self.assertRaisesRegexp(zstd.ZstdError, 'cannot use a decompressobj'): |
|
570 | 809 | dobj.decompress(data) |
|
810 | self.assertIsNone(dobj.flush()) | |
|
571 | 811 | |
|
572 | 812 | def test_bad_write_size(self): |
|
573 | 813 | dctx = zstd.ZstdDecompressor() |
@@ -585,16 +825,141 b' class TestDecompressor_decompressobj(uni' | |||
|
585 | 825 | dobj = dctx.decompressobj(write_size=i + 1) |
|
586 | 826 | self.assertEqual(dobj.decompress(data), source) |
|
587 | 827 | |
|
828 | ||
|
588 | 829 | def decompress_via_writer(data): |
|
589 | 830 | buffer = io.BytesIO() |
|
590 | 831 | dctx = zstd.ZstdDecompressor() |
|
591 |
|
|
|
592 |
|
|
|
832 | decompressor = dctx.stream_writer(buffer) | |
|
833 | decompressor.write(data) | |
|
834 | ||
|
593 | 835 | return buffer.getvalue() |
|
594 | 836 | |
|
595 | 837 | |
|
596 | 838 | @make_cffi |
|
597 | 839 | class TestDecompressor_stream_writer(unittest.TestCase): |
|
840 | def test_io_api(self): | |
|
841 | buffer = io.BytesIO() | |
|
842 | dctx = zstd.ZstdDecompressor() | |
|
843 | writer = dctx.stream_writer(buffer) | |
|
844 | ||
|
845 | self.assertFalse(writer.closed) | |
|
846 | self.assertFalse(writer.isatty()) | |
|
847 | self.assertFalse(writer.readable()) | |
|
848 | ||
|
849 | with self.assertRaises(io.UnsupportedOperation): | |
|
850 | writer.readline() | |
|
851 | ||
|
852 | with self.assertRaises(io.UnsupportedOperation): | |
|
853 | writer.readline(42) | |
|
854 | ||
|
855 | with self.assertRaises(io.UnsupportedOperation): | |
|
856 | writer.readline(size=42) | |
|
857 | ||
|
858 | with self.assertRaises(io.UnsupportedOperation): | |
|
859 | writer.readlines() | |
|
860 | ||
|
861 | with self.assertRaises(io.UnsupportedOperation): | |
|
862 | writer.readlines(42) | |
|
863 | ||
|
864 | with self.assertRaises(io.UnsupportedOperation): | |
|
865 | writer.readlines(hint=42) | |
|
866 | ||
|
867 | with self.assertRaises(io.UnsupportedOperation): | |
|
868 | writer.seek(0) | |
|
869 | ||
|
870 | with self.assertRaises(io.UnsupportedOperation): | |
|
871 | writer.seek(10, os.SEEK_SET) | |
|
872 | ||
|
873 | self.assertFalse(writer.seekable()) | |
|
874 | ||
|
875 | with self.assertRaises(io.UnsupportedOperation): | |
|
876 | writer.tell() | |
|
877 | ||
|
878 | with self.assertRaises(io.UnsupportedOperation): | |
|
879 | writer.truncate() | |
|
880 | ||
|
881 | with self.assertRaises(io.UnsupportedOperation): | |
|
882 | writer.truncate(42) | |
|
883 | ||
|
884 | with self.assertRaises(io.UnsupportedOperation): | |
|
885 | writer.truncate(size=42) | |
|
886 | ||
|
887 | self.assertTrue(writer.writable()) | |
|
888 | ||
|
889 | with self.assertRaises(io.UnsupportedOperation): | |
|
890 | writer.writelines([]) | |
|
891 | ||
|
892 | with self.assertRaises(io.UnsupportedOperation): | |
|
893 | writer.read() | |
|
894 | ||
|
895 | with self.assertRaises(io.UnsupportedOperation): | |
|
896 | writer.read(42) | |
|
897 | ||
|
898 | with self.assertRaises(io.UnsupportedOperation): | |
|
899 | writer.read(size=42) | |
|
900 | ||
|
901 | with self.assertRaises(io.UnsupportedOperation): | |
|
902 | writer.readall() | |
|
903 | ||
|
904 | with self.assertRaises(io.UnsupportedOperation): | |
|
905 | writer.readinto(None) | |
|
906 | ||
|
907 | with self.assertRaises(io.UnsupportedOperation): | |
|
908 | writer.fileno() | |
|
909 | ||
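The `test_io_api` rows above exercise the `io.RawIOBase` surface the 0.11 `stream_writer` grows: read-side methods raise `io.UnsupportedOperation` while the write side works. A minimal sketch of that pattern with a hypothetical stand-in class (not the library's implementation):

```python
import io

class WriteOnlyStream(io.RawIOBase):
    """Hypothetical write-only stream mirroring the tested behavior."""
    def __init__(self):
        super(WriteOnlyStream, self).__init__()
        self._chunks = []

    def writable(self):
        return True

    def write(self, data):
        # Accept bytes-like input; report how many bytes were consumed.
        self._chunks.append(bytes(data))
        return len(data)

    def read(self, size=-1):
        raise io.UnsupportedOperation('read')
```

`readable()` and `seekable()` inherit their `False` defaults from `io.IOBase`, so only the write path needs explicit support.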
|
910 | def test_fileno_file(self): | |
|
911 | with tempfile.TemporaryFile('wb') as tf: | |
|
912 | dctx = zstd.ZstdDecompressor() | |
|
913 | writer = dctx.stream_writer(tf) | |
|
914 | ||
|
915 | self.assertEqual(writer.fileno(), tf.fileno()) | |
|
916 | ||
|
917 | def test_close(self): | |
|
918 | foo = zstd.ZstdCompressor().compress(b'foo') | |
|
919 | ||
|
920 | buffer = NonClosingBytesIO() | |
|
921 | dctx = zstd.ZstdDecompressor() | |
|
922 | writer = dctx.stream_writer(buffer) | |
|
923 | ||
|
924 | writer.write(foo) | |
|
925 | self.assertFalse(writer.closed) | |
|
926 | self.assertFalse(buffer.closed) | |
|
927 | writer.close() | |
|
928 | self.assertTrue(writer.closed) | |
|
929 | self.assertTrue(buffer.closed) | |
|
930 | ||
|
931 | with self.assertRaisesRegexp(ValueError, 'stream is closed'): | |
|
932 | writer.write(b'') | |
|
933 | ||
|
934 | with self.assertRaisesRegexp(ValueError, 'stream is closed'): | |
|
935 | writer.flush() | |
|
936 | ||
|
937 | with self.assertRaisesRegexp(ValueError, 'stream is closed'): | |
|
938 | with writer: | |
|
939 | pass | |
|
940 | ||
|
941 | self.assertEqual(buffer.getvalue(), b'foo') | |
|
942 | ||
|
943 | # Context manager exit should close stream. | |
|
944 | buffer = NonClosingBytesIO() | |
|
945 | writer = dctx.stream_writer(buffer) | |
|
946 | ||
|
947 | with writer: | |
|
948 | writer.write(foo) | |
|
949 | ||
|
950 | self.assertTrue(writer.closed) | |
|
951 | self.assertEqual(buffer.getvalue(), b'foo') | |
|
952 | ||
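Several of these tests swap `io.BytesIO` for a `NonClosingBytesIO` helper imported from the tests' `common` module, so `buffer.getvalue()` still works after `close()` propagates from the writer. A plausible minimal sketch of that helper (assumed; the real one may differ):

```python
import io

class NonClosingBytesIO(io.BytesIO):
    """BytesIO whose contents stay readable via getvalue() after close()."""
    def __init__(self, *args, **kwargs):
        super(NonClosingBytesIO, self).__init__(*args, **kwargs)
        self._saved = b''

    def close(self):
        # Snapshot the buffer before the real close releases it.
        self._saved = self.getvalue()
        return super(NonClosingBytesIO, self).close()

    def getvalue(self):
        if self.closed:
            return self._saved
        return super(NonClosingBytesIO, self).getvalue()
```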
|
953 | def test_flush(self): | |
|
954 | buffer = OpCountingBytesIO() | |
|
955 | dctx = zstd.ZstdDecompressor() | |
|
956 | writer = dctx.stream_writer(buffer) | |
|
957 | ||
|
958 | writer.flush() | |
|
959 | self.assertEqual(buffer._flush_count, 1) | |
|
960 | writer.flush() | |
|
961 | self.assertEqual(buffer._flush_count, 2) | |
|
962 | ||
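`test_flush` relies on an `OpCountingBytesIO` helper exposing a `_flush_count` attribute. A sketch of the assumed shape (the tests' real helper likely counts other operations too):

```python
import io

class OpCountingBytesIO(io.BytesIO):
    """BytesIO that counts how often flush() is invoked."""
    def __init__(self, *args, **kwargs):
        super(OpCountingBytesIO, self).__init__(*args, **kwargs)
        self._flush_count = 0

    def flush(self):
        self._flush_count += 1
        return super(OpCountingBytesIO, self).flush()
```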
|
598 | 963 | def test_empty_roundtrip(self): |
|
599 | 964 | cctx = zstd.ZstdCompressor() |
|
600 | 965 | empty = cctx.compress(b'') |
@@ -616,9 +981,21 b' class TestDecompressor_stream_writer(uni' | |||
|
616 | 981 | dctx = zstd.ZstdDecompressor() |
|
617 | 982 | for source in sources: |
|
618 | 983 | buffer = io.BytesIO() |
|
984 | ||
|
985 | decompressor = dctx.stream_writer(buffer) | |
|
986 | decompressor.write(source) | |
|
987 | self.assertEqual(buffer.getvalue(), b'foo') | |
|
988 | ||
|
989 | buffer = NonClosingBytesIO() | |
|
990 | ||
|
619 | 991 | with dctx.stream_writer(buffer) as decompressor: |
|
620 | decompressor.write(source) | |
|
992 | self.assertEqual(decompressor.write(source), 3) | |
|
993 | ||
|
994 | self.assertEqual(buffer.getvalue(), b'foo') | |
|
621 | 995 | |
|
996 | buffer = io.BytesIO() | |
|
997 | writer = dctx.stream_writer(buffer, write_return_read=True) | |
|
998 | self.assertEqual(writer.write(source), len(source)) | |
|
622 | 999 | self.assertEqual(buffer.getvalue(), b'foo') |
|
623 | 1000 | |
|
624 | 1001 | def test_large_roundtrip(self): |
@@ -641,7 +1018,7 b' class TestDecompressor_stream_writer(uni' | |||
|
641 | 1018 | cctx = zstd.ZstdCompressor() |
|
642 | 1019 | compressed = cctx.compress(orig) |
|
643 | 1020 | |
|
644 | buffer = io.BytesIO() |

1021 | buffer = NonClosingBytesIO() | |
|
645 | 1022 | dctx = zstd.ZstdDecompressor() |
|
646 | 1023 | with dctx.stream_writer(buffer) as decompressor: |
|
647 | 1024 | pos = 0 |
@@ -651,6 +1028,17 b' class TestDecompressor_stream_writer(uni' | |||
|
651 | 1028 | pos += 8192 |
|
652 | 1029 | self.assertEqual(buffer.getvalue(), orig) |
|
653 | 1030 | |
|
1031 | # Again with write_return_read=True | |
|
1032 | buffer = io.BytesIO() | |
|
1033 | writer = dctx.stream_writer(buffer, write_return_read=True) | |
|
1034 | pos = 0 | |
|
1035 | while pos < len(compressed): | |
|
1036 | pos2 = pos + 8192 | |
|
1037 | chunk = compressed[pos:pos2] | |
|
1038 | self.assertEqual(writer.write(chunk), len(chunk)) | |
|
1039 | pos += 8192 | |
|
1040 | self.assertEqual(buffer.getvalue(), orig) | |
|
1041 | ||
|
654 | 1042 | def test_dictionary(self): |
|
655 | 1043 | samples = [] |
|
656 | 1044 | for i in range(128): |
@@ -661,7 +1049,7 b' class TestDecompressor_stream_writer(uni' | |||
|
661 | 1049 | d = zstd.train_dictionary(8192, samples) |
|
662 | 1050 | |
|
663 | 1051 | orig = b'foobar' * 16384 |
|
664 | buffer = io.BytesIO() |

1052 | buffer = NonClosingBytesIO() | |
|
665 | 1053 | cctx = zstd.ZstdCompressor(dict_data=d) |
|
666 | 1054 | with cctx.stream_writer(buffer) as compressor: |
|
667 | 1055 | self.assertEqual(compressor.write(orig), 0) |
@@ -670,6 +1058,12 b' class TestDecompressor_stream_writer(uni' | |||
|
670 | 1058 | buffer = io.BytesIO() |
|
671 | 1059 | |
|
672 | 1060 | dctx = zstd.ZstdDecompressor(dict_data=d) |
|
1061 | decompressor = dctx.stream_writer(buffer) | |
|
1062 | self.assertEqual(decompressor.write(compressed), len(orig)) | |
|
1063 | self.assertEqual(buffer.getvalue(), orig) | |
|
1064 | ||
|
1065 | buffer = NonClosingBytesIO() | |
|
1066 | ||
|
673 | 1067 | with dctx.stream_writer(buffer) as decompressor: |
|
674 | 1068 | self.assertEqual(decompressor.write(compressed), len(orig)) |
|
675 | 1069 | |
@@ -678,6 +1072,11 b' class TestDecompressor_stream_writer(uni' | |||
|
678 | 1072 | def test_memory_size(self): |
|
679 | 1073 | dctx = zstd.ZstdDecompressor() |
|
680 | 1074 | buffer = io.BytesIO() |
|
1075 | ||
|
1076 | decompressor = dctx.stream_writer(buffer) | |
|
1077 | size = decompressor.memory_size() | |
|
1078 | self.assertGreater(size, 100000) | |
|
1079 | ||
|
681 | 1080 | with dctx.stream_writer(buffer) as decompressor: |
|
682 | 1081 | size = decompressor.memory_size() |
|
683 | 1082 | |
@@ -810,7 +1209,7 b' class TestDecompressor_read_to_iter(unit' | |||
|
810 | 1209 | @unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set') |
|
811 | 1210 | def test_large_input(self): |
|
812 | 1211 | bytes = list(struct.Struct('>B').pack(i) for i in range(256)) |
|
813 | compressed = io.BytesIO() |

1212 | compressed = NonClosingBytesIO() | |
|
814 | 1213 | input_size = 0 |
|
815 | 1214 | cctx = zstd.ZstdCompressor(level=1) |
|
816 | 1215 | with cctx.stream_writer(compressed) as compressor: |
@@ -823,7 +1222,7 b' class TestDecompressor_read_to_iter(unit' | |||
|
823 | 1222 | if have_compressed and have_raw: |
|
824 | 1223 | break |
|
825 | 1224 | |
|
826 | compressed.seek(0) |

1225 | compressed = io.BytesIO(compressed.getvalue()) | |
|
827 | 1226 | self.assertGreater(len(compressed.getvalue()), |
|
828 | 1227 | zstd.DECOMPRESSION_RECOMMENDED_INPUT_SIZE) |
|
829 | 1228 | |
@@ -861,7 +1260,7 b' class TestDecompressor_read_to_iter(unit' | |||
|
861 | 1260 | |
|
862 | 1261 | source = io.BytesIO() |
|
863 | 1262 | |
|
864 | compressed = io.BytesIO() |

1263 | compressed = NonClosingBytesIO() | |
|
865 | 1264 | with cctx.stream_writer(compressed) as compressor: |
|
866 | 1265 | for i in range(256): |
|
867 | 1266 | chunk = b'\0' * 1024 |
@@ -874,7 +1273,7 b' class TestDecompressor_read_to_iter(unit' | |||
|
874 | 1273 | max_output_size=len(source.getvalue())) |
|
875 | 1274 | self.assertEqual(simple, source.getvalue()) |
|
876 | 1275 | |
|
877 | compressed.seek(0) |

1276 | compressed = io.BytesIO(compressed.getvalue()) | |
|
878 | 1277 | streamed = b''.join(dctx.read_to_iter(compressed)) |
|
879 | 1278 | self.assertEqual(streamed, source.getvalue()) |
|
880 | 1279 | |
@@ -1001,6 +1400,9 b' class TestDecompressor_multi_decompress_' | |||
|
1001 | 1400 | def test_invalid_inputs(self): |
|
1002 | 1401 | dctx = zstd.ZstdDecompressor() |
|
1003 | 1402 | |
|
1403 | if not hasattr(dctx, 'multi_decompress_to_buffer'): | |
|
1404 | self.skipTest('multi_decompress_to_buffer not available') | |
|
1405 | ||
|
1004 | 1406 | with self.assertRaises(TypeError): |
|
1005 | 1407 | dctx.multi_decompress_to_buffer(True) |
|
1006 | 1408 | |
@@ -1020,6 +1422,10 b' class TestDecompressor_multi_decompress_' | |||
|
1020 | 1422 | frames = [cctx.compress(d) for d in original] |
|
1021 | 1423 | |
|
1022 | 1424 | dctx = zstd.ZstdDecompressor() |
|
1425 | ||
|
1426 | if not hasattr(dctx, 'multi_decompress_to_buffer'): | |
|
1427 | self.skipTest('multi_decompress_to_buffer not available') | |
|
1428 | ||
|
1023 | 1429 | result = dctx.multi_decompress_to_buffer(frames) |
|
1024 | 1430 | |
|
1025 | 1431 | self.assertEqual(len(result), len(frames)) |
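The repeated `hasattr()` guards added throughout this class make the tests degrade to a skip when the active backend (e.g. the cffi one) lacks the multi-buffer APIs. The pattern in isolation, using a plain stand-in object rather than a real `ZstdDecompressor`:

```python
import io
import unittest

class OptionalAPITest(unittest.TestCase):
    def test_optional_api(self):
        dctx = object()  # stand-in that lacks the optional method
        if not hasattr(dctx, 'multi_decompress_to_buffer'):
            self.skipTest('multi_decompress_to_buffer not available')
        self.fail('unreachable when the API is missing')

suite = unittest.defaultTestLoader.loadTestsFromTestCase(OptionalAPITest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

A skipped test still counts as a successful run, which is why the suite passes on both backends.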
@@ -1041,6 +1447,10 b' class TestDecompressor_multi_decompress_' | |||
|
1041 | 1447 | sizes = struct.pack('=' + 'Q' * len(original), *map(len, original)) |
|
1042 | 1448 | |
|
1043 | 1449 | dctx = zstd.ZstdDecompressor() |
|
1450 | ||
|
1451 | if not hasattr(dctx, 'multi_decompress_to_buffer'): | |
|
1452 | self.skipTest('multi_decompress_to_buffer not available') | |
|
1453 | ||
|
1044 | 1454 | result = dctx.multi_decompress_to_buffer(frames, decompressed_sizes=sizes) |
|
1045 | 1455 | |
|
1046 | 1456 | self.assertEqual(len(result), len(frames)) |
@@ -1057,6 +1467,9 b' class TestDecompressor_multi_decompress_' | |||
|
1057 | 1467 | |
|
1058 | 1468 | dctx = zstd.ZstdDecompressor() |
|
1059 | 1469 | |
|
1470 | if not hasattr(dctx, 'multi_decompress_to_buffer'): | |
|
1471 | self.skipTest('multi_decompress_to_buffer not available') | |
|
1472 | ||
|
1060 | 1473 | segments = struct.pack('=QQQQ', 0, len(frames[0]), len(frames[0]), len(frames[1])) |
|
1061 | 1474 | b = zstd.BufferWithSegments(b''.join(frames), segments) |
|
1062 | 1475 | |
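The `segments` argument built with `struct.pack('=QQQQ', ...)` above is a flat table of (offset, length) pairs, each a native-endian unsigned 64-bit integer. The layout in isolation, with assumed example data:

```python
import struct

# Two frames laid out back to back.
frames = [b'abc', b'defgh']
segments = struct.pack('=QQQQ',
                       0, len(frames[0]),               # frame 0: offset, length
                       len(frames[0]), len(frames[1]))  # frame 1: offset, length
joined = b''.join(frames)

# Recover each frame from the packed 16-byte segment records.
pieces = []
for i in range(0, len(segments), 16):
    offset, length = struct.unpack_from('=QQ', segments, i)
    pieces.append(joined[offset:offset + length])
```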
@@ -1074,12 +1487,16 b' class TestDecompressor_multi_decompress_' | |||
|
1074 | 1487 | frames = [cctx.compress(d) for d in original] |
|
1075 | 1488 | sizes = struct.pack('=' + 'Q' * len(original), *map(len, original)) |
|
1076 | 1489 | |
|
1490 | dctx = zstd.ZstdDecompressor() | |
|
1491 | ||
|
1492 | if not hasattr(dctx, 'multi_decompress_to_buffer'): | |
|
1493 | self.skipTest('multi_decompress_to_buffer not available') | |
|
1494 | ||
|
1077 | 1495 | segments = struct.pack('=QQQQQQ', 0, len(frames[0]), |
|
1078 | 1496 | len(frames[0]), len(frames[1]), |
|
1079 | 1497 | len(frames[0]) + len(frames[1]), len(frames[2])) |
|
1080 | 1498 | b = zstd.BufferWithSegments(b''.join(frames), segments) |
|
1081 | 1499 | |
|
1082 | dctx = zstd.ZstdDecompressor() | |
|
1083 | 1500 | result = dctx.multi_decompress_to_buffer(b, decompressed_sizes=sizes) |
|
1084 | 1501 | |
|
1085 | 1502 | self.assertEqual(len(result), len(frames)) |
@@ -1099,10 +1516,14 b' class TestDecompressor_multi_decompress_' | |||
|
1099 | 1516 | b'foo4' * 6, |
|
1100 | 1517 | ] |
|
1101 | 1518 | |
|
1519 | if not hasattr(cctx, 'multi_compress_to_buffer'): | |
|
1520 | self.skipTest('multi_compress_to_buffer not available') | |
|
1521 | ||
|
1102 | 1522 | frames = cctx.multi_compress_to_buffer(original) |
|
1103 | 1523 | |
|
1104 | 1524 | # Check round trip. |
|
1105 | 1525 | dctx = zstd.ZstdDecompressor() |
|
1526 | ||
|
1106 | 1527 | decompressed = dctx.multi_decompress_to_buffer(frames, threads=3) |
|
1107 | 1528 | |
|
1108 | 1529 | self.assertEqual(len(decompressed), len(original)) |
@@ -1138,7 +1559,12 b' class TestDecompressor_multi_decompress_' | |||
|
1138 | 1559 | frames = [cctx.compress(s) for s in generate_samples()] |
|
1139 | 1560 | |
|
1140 | 1561 | dctx = zstd.ZstdDecompressor(dict_data=d) |
|
1562 | ||
|
1563 | if not hasattr(dctx, 'multi_decompress_to_buffer'): | |
|
1564 | self.skipTest('multi_decompress_to_buffer not available') | |
|
1565 | ||
|
1141 | 1566 | result = dctx.multi_decompress_to_buffer(frames) |
|
1567 | ||
|
1142 | 1568 | self.assertEqual([o.tobytes() for o in result], generate_samples()) |
|
1143 | 1569 | |
|
1144 | 1570 | def test_multiple_threads(self): |
@@ -1149,6 +1575,10 b' class TestDecompressor_multi_decompress_' | |||
|
1149 | 1575 | frames.extend(cctx.compress(b'y' * 64) for i in range(256)) |
|
1150 | 1576 | |
|
1151 | 1577 | dctx = zstd.ZstdDecompressor() |
|
1578 | ||
|
1579 | if not hasattr(dctx, 'multi_decompress_to_buffer'): | |
|
1580 | self.skipTest('multi_decompress_to_buffer not available') | |
|
1581 | ||
|
1152 | 1582 | result = dctx.multi_decompress_to_buffer(frames, threads=-1) |
|
1153 | 1583 | |
|
1154 | 1584 | self.assertEqual(len(result), len(frames)) |
@@ -1164,6 +1594,9 b' class TestDecompressor_multi_decompress_' | |||
|
1164 | 1594 | |
|
1165 | 1595 | dctx = zstd.ZstdDecompressor() |
|
1166 | 1596 | |
|
1597 | if not hasattr(dctx, 'multi_decompress_to_buffer'): | |
|
1598 | self.skipTest('multi_decompress_to_buffer not available') | |
|
1599 | ||
|
1167 | 1600 | with self.assertRaisesRegexp(zstd.ZstdError, |
|
1168 | 1601 | 'error decompressing item 1: (' |
|
1169 | 1602 | 'Corrupted block|' |
@@ -12,6 +12,7 b' import zstandard as zstd' | |||
|
12 | 12 | |
|
13 | 13 | from . common import ( |
|
14 | 14 | make_cffi, |
|
15 | NonClosingBytesIO, | |
|
15 | 16 | random_input_data, |
|
16 | 17 | ) |
|
17 | 18 | |
@@ -23,22 +24,200 b' class TestDecompressor_stream_reader_fuz' | |||
|
23 | 24 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) |
|
24 | 25 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), |
|
25 | 26 | level=strategies.integers(min_value=1, max_value=5), |
|
26 | source_read_size=strategies.integers(1, 16384), |

27 | streaming=strategies.booleans(), | |
|
28 | source_read_size=strategies.integers(1, 1048576), | |
|
27 | 29 | read_sizes=strategies.data()) |
|
28 | def test_stream_source_read_variance(self, original, level, source_read_size, |

29 | read_sizes): | |
|
30 | def test_stream_source_read_variance(self, original, level, streaming, | |
|
31 | source_read_size, read_sizes): | |
|
30 | 32 | cctx = zstd.ZstdCompressor(level=level) |
|
31 | frame = cctx.compress(original) | |
|
33 | ||
|
34 | if streaming: | |
|
35 | source = io.BytesIO() | |
|
36 | writer = cctx.stream_writer(source) | |
|
37 | writer.write(original) | |
|
38 | writer.flush(zstd.FLUSH_FRAME) | |
|
39 | source.seek(0) | |
|
40 | else: | |
|
41 | frame = cctx.compress(original) | |
|
42 | source = io.BytesIO(frame) | |
|
32 | 43 | |
|
33 | 44 | dctx = zstd.ZstdDecompressor() |
|
34 | source = io.BytesIO(frame) | |
|
35 | 45 | |
|
36 | 46 | chunks = [] |
|
37 | 47 | with dctx.stream_reader(source, read_size=source_read_size) as reader: |
|
38 | 48 | while True: |
|
39 | read_size = read_sizes.draw(strategies.integers(1, 16384)) |

49 | read_size = read_sizes.draw(strategies.integers(-1, 131072)) | |
|
50 | chunk = reader.read(read_size) | |
|
51 | if not chunk and read_size: | |
|
52 | break | |
|
53 | ||
|
54 | chunks.append(chunk) | |
|
55 | ||
|
56 | self.assertEqual(b''.join(chunks), original) | |
|
57 | ||
|
58 | # Similar to above except we have a constant read() size. | |
|
59 | @hypothesis.settings( | |
|
60 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
61 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
62 | level=strategies.integers(min_value=1, max_value=5), | |
|
63 | streaming=strategies.booleans(), | |
|
64 | source_read_size=strategies.integers(1, 1048576), | |
|
65 | read_size=strategies.integers(-1, 131072)) | |
|
66 | def test_stream_source_read_size(self, original, level, streaming, | |
|
67 | source_read_size, read_size): | |
|
68 | if read_size == 0: | |
|
69 | read_size = 1 | |
|
70 | ||
|
71 | cctx = zstd.ZstdCompressor(level=level) | |
|
72 | ||
|
73 | if streaming: | |
|
74 | source = io.BytesIO() | |
|
75 | writer = cctx.stream_writer(source) | |
|
76 | writer.write(original) | |
|
77 | writer.flush(zstd.FLUSH_FRAME) | |
|
78 | source.seek(0) | |
|
79 | else: | |
|
80 | frame = cctx.compress(original) | |
|
81 | source = io.BytesIO(frame) | |
|
82 | ||
|
83 | dctx = zstd.ZstdDecompressor() | |
|
84 | ||
|
85 | chunks = [] | |
|
86 | reader = dctx.stream_reader(source, read_size=source_read_size) | |
|
87 | while True: | |
|
88 | chunk = reader.read(read_size) | |
|
89 | if not chunk and read_size: | |
|
90 | break | |
|
91 | ||
|
92 | chunks.append(chunk) | |
|
93 | ||
|
94 | self.assertEqual(b''.join(chunks), original) | |
|
95 | ||
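These fuzzing loops now draw read sizes down to -1 and 0, so the stop condition changes from `if not chunk` to `if not chunk and read_size`: `read(0)` legitimately returns `b''` without meaning end of stream, while `read(-1)` means read everything available. The loop shape with a plain `BytesIO`:

```python
import io

def read_in_chunks(stream, sizes):
    """Drain stream with the given read sizes; b'' ends the loop only
    when a nonzero size was requested."""
    chunks = []
    for size in sizes:
        chunk = stream.read(size)
        if not chunk and size:
            break
        chunks.append(chunk)
    return b''.join(chunks)
```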
|
96 | @hypothesis.settings( | |
|
97 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
98 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
99 | level=strategies.integers(min_value=1, max_value=5), | |
|
100 | streaming=strategies.booleans(), | |
|
101 | source_read_size=strategies.integers(1, 1048576), | |
|
102 | read_sizes=strategies.data()) | |
|
103 | def test_buffer_source_read_variance(self, original, level, streaming, | |
|
104 | source_read_size, read_sizes): | |
|
105 | cctx = zstd.ZstdCompressor(level=level) | |
|
106 | ||
|
107 | if streaming: | |
|
108 | source = io.BytesIO() | |
|
109 | writer = cctx.stream_writer(source) | |
|
110 | writer.write(original) | |
|
111 | writer.flush(zstd.FLUSH_FRAME) | |
|
112 | frame = source.getvalue() | |
|
113 | else: | |
|
114 | frame = cctx.compress(original) | |
|
115 | ||
|
116 | dctx = zstd.ZstdDecompressor() | |
|
117 | chunks = [] | |
|
118 | ||
|
119 | with dctx.stream_reader(frame, read_size=source_read_size) as reader: | |
|
120 | while True: | |
|
121 | read_size = read_sizes.draw(strategies.integers(-1, 131072)) | |
|
40 | 122 | chunk = reader.read(read_size) |
|
41 | if not chunk: | |
|
123 | if not chunk and read_size: | |
|
124 | break | |
|
125 | ||
|
126 | chunks.append(chunk) | |
|
127 | ||
|
128 | self.assertEqual(b''.join(chunks), original) | |
|
129 | ||
|
130 | # Similar to above except we have a constant read() size. | |
|
131 | @hypothesis.settings( | |
|
132 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
133 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
134 | level=strategies.integers(min_value=1, max_value=5), | |
|
135 | streaming=strategies.booleans(), | |
|
136 | source_read_size=strategies.integers(1, 1048576), | |
|
137 | read_size=strategies.integers(-1, 131072)) | |
|
138 | def test_buffer_source_constant_read_size(self, original, level, streaming, | |
|
139 | source_read_size, read_size): | |
|
140 | if read_size == 0: | |
|
141 | read_size = -1 | |
|
142 | ||
|
143 | cctx = zstd.ZstdCompressor(level=level) | |
|
144 | ||
|
145 | if streaming: | |
|
146 | source = io.BytesIO() | |
|
147 | writer = cctx.stream_writer(source) | |
|
148 | writer.write(original) | |
|
149 | writer.flush(zstd.FLUSH_FRAME) | |
|
150 | frame = source.getvalue() | |
|
151 | else: | |
|
152 | frame = cctx.compress(original) | |
|
153 | ||
|
154 | dctx = zstd.ZstdDecompressor() | |
|
155 | chunks = [] | |
|
156 | ||
|
157 | reader = dctx.stream_reader(frame, read_size=source_read_size) | |
|
158 | while True: | |
|
159 | chunk = reader.read(read_size) | |
|
160 | if not chunk and read_size: | |
|
161 | break | |
|
162 | ||
|
163 | chunks.append(chunk) | |
|
164 | ||
|
165 | self.assertEqual(b''.join(chunks), original) | |
|
166 | ||
|
167 | @hypothesis.settings( | |
|
168 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
169 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
170 | level=strategies.integers(min_value=1, max_value=5), | |
|
171 | streaming=strategies.booleans(), | |
|
172 | source_read_size=strategies.integers(1, 1048576)) | |
|
173 | def test_stream_source_readall(self, original, level, streaming, | |
|
174 | source_read_size): | |
|
175 | cctx = zstd.ZstdCompressor(level=level) | |
|
176 | ||
|
177 | if streaming: | |
|
178 | source = io.BytesIO() | |
|
179 | writer = cctx.stream_writer(source) | |
|
180 | writer.write(original) | |
|
181 | writer.flush(zstd.FLUSH_FRAME) | |
|
182 | source.seek(0) | |
|
183 | else: | |
|
184 | frame = cctx.compress(original) | |
|
185 | source = io.BytesIO(frame) | |
|
186 | ||
|
187 | dctx = zstd.ZstdDecompressor() | |
|
188 | ||
|
189 | data = dctx.stream_reader(source, read_size=source_read_size).readall() | |
|
190 | self.assertEqual(data, original) | |
|
191 | ||
|
192 | @hypothesis.settings( | |
|
193 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
194 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), | |
|
195 | level=strategies.integers(min_value=1, max_value=5), | |
|
196 | streaming=strategies.booleans(), | |
|
197 | source_read_size=strategies.integers(1, 1048576), | |
|
198 | read_sizes=strategies.data()) | |
|
199 | def test_stream_source_read1_variance(self, original, level, streaming, | |
|
200 | source_read_size, read_sizes): | |
|
201 | cctx = zstd.ZstdCompressor(level=level) | |
|
202 | ||
|
203 | if streaming: | |
|
204 | source = io.BytesIO() | |
|
205 | writer = cctx.stream_writer(source) | |
|
206 | writer.write(original) | |
|
207 | writer.flush(zstd.FLUSH_FRAME) | |
|
208 | source.seek(0) | |
|
209 | else: | |
|
210 | frame = cctx.compress(original) | |
|
211 | source = io.BytesIO(frame) | |
|
212 | ||
|
213 | dctx = zstd.ZstdDecompressor() | |
|
214 | ||
|
215 | chunks = [] | |
|
216 | with dctx.stream_reader(source, read_size=source_read_size) as reader: | |
|
217 | while True: | |
|
218 | read_size = read_sizes.draw(strategies.integers(-1, 131072)) | |
|
219 | chunk = reader.read1(read_size) | |
|
220 | if not chunk and read_size: | |
|
42 | 221 | break |
|
43 | 222 | |
|
44 | 223 | chunks.append(chunk) |
@@ -49,24 +228,36 b' class TestDecompressor_stream_reader_fuz' | |||
|
49 | 228 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) |
|
50 | 229 | @hypothesis.given(original=strategies.sampled_from(random_input_data()), |
|
51 | 230 | level=strategies.integers(min_value=1, max_value=5), |
|
52 | source_read_size=strategies.integers(1, 16384), |

231 | streaming=strategies.booleans(), | |
|
232 | source_read_size=strategies.integers(1, 1048576), | |
|
53 | 233 | read_sizes=strategies.data()) |
|
54 | def test_buffer_source_read_variance(self, original, level, source_read_size, |

55 | read_sizes): | |
|
234 | def test_stream_source_readinto1_variance(self, original, level, streaming, | |
|
235 | source_read_size, read_sizes): | |
|
56 | 236 | cctx = zstd.ZstdCompressor(level=level) |
|
57 | frame = cctx.compress(original) | |
|
237 | ||
|
238 | if streaming: | |
|
239 | source = io.BytesIO() | |
|
240 | writer = cctx.stream_writer(source) | |
|
241 | writer.write(original) | |
|
242 | writer.flush(zstd.FLUSH_FRAME) | |
|
243 | source.seek(0) | |
|
244 | else: | |
|
245 | frame = cctx.compress(original) | |
|
246 | source = io.BytesIO(frame) | |
|
58 | 247 | |
|
59 | 248 | dctx = zstd.ZstdDecompressor() |
|
249 | ||
|
60 | 250 | chunks = [] |
|
61 | ||
|
62 | with dctx.stream_reader(frame, read_size=source_read_size) as reader: | |
|
251 | with dctx.stream_reader(source, read_size=source_read_size) as reader: | |
|
63 | 252 | while True: |
|
64 | read_size = read_sizes.draw(strategies.integers(1, 16384)) |

65 | chunk = reader.read(read_size) |

66 | if not chunk: | |
|
253 | read_size = read_sizes.draw(strategies.integers(1, 131072)) | |
|
254 | b = bytearray(read_size) | |
|
255 | count = reader.readinto1(b) | |
|
256 | ||
|
257 | if not count: | |
|
67 | 258 | break |
|
68 | 259 | |
|
69 | chunks.append(chunk) |

260 | chunks.append(bytes(b[0:count])) | |
|
70 | 261 | |
|
71 | 262 | self.assertEqual(b''.join(chunks), original) |
|
72 | 263 | |
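The new `readinto1()` variant fills a caller-provided buffer and returns a count, so the loop above copies out only `b[0:count]`. The same dance with stdlib `BytesIO.readinto` (`readinto1` is the `BufferedIOBase` flavor used by the tests):

```python
import io

source = io.BytesIO(b'hello world')
chunks = []
while True:
    b = bytearray(4)            # fixed-size scratch buffer
    count = source.readinto(b)  # number of bytes actually written into b
    if not count:
        break
    chunks.append(bytes(b[0:count]))  # keep only the filled prefix
data = b''.join(chunks)
```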
@@ -75,7 +266,7 b' class TestDecompressor_stream_reader_fuz' | |||
|
75 | 266 | @hypothesis.given( |
|
76 | 267 | original=strategies.sampled_from(random_input_data()), |
|
77 | 268 | level=strategies.integers(min_value=1, max_value=5), |
|
78 | source_read_size=strategies.integers(1, 16384), |

269 | source_read_size=strategies.integers(1, 1048576), | |
|
79 | 270 | seek_amounts=strategies.data(), |
|
80 | 271 | read_sizes=strategies.data()) |
|
81 | 272 | def test_relative_seeks(self, original, level, source_read_size, seek_amounts, |
@@ -99,6 +290,46 b' class TestDecompressor_stream_reader_fuz' | |||
|
99 | 290 | |
|
100 | 291 | self.assertEqual(original[offset:offset + len(chunk)], chunk) |
|
101 | 292 | |
|
293 | @hypothesis.settings( | |
|
294 | suppress_health_check=[hypothesis.HealthCheck.large_base_example]) | |
|
295 | @hypothesis.given( | |
|
296 | originals=strategies.data(), | |
|
297 | frame_count=strategies.integers(min_value=2, max_value=10), | |
|
298 | level=strategies.integers(min_value=1, max_value=5), | |
|
299 | source_read_size=strategies.integers(1, 1048576), | |
|
300 | read_sizes=strategies.data()) | |
|
301 | def test_multiple_frames(self, originals, frame_count, level, | |
|
302 | source_read_size, read_sizes): | |
|
303 | ||
|
304 | cctx = zstd.ZstdCompressor(level=level) | |
|
305 | source = io.BytesIO() | |
|
306 | buffer = io.BytesIO() | |
|
307 | writer = cctx.stream_writer(buffer) | |
|
308 | ||
|
309 | for i in range(frame_count): | |
|
310 | data = originals.draw(strategies.sampled_from(random_input_data())) | |
|
311 | source.write(data) | |
|
312 | writer.write(data) | |
|
313 | writer.flush(zstd.FLUSH_FRAME) | |
|
314 | ||
|
315 | dctx = zstd.ZstdDecompressor() | |
|
316 | buffer.seek(0) | |
|
317 | reader = dctx.stream_reader(buffer, read_size=source_read_size, | |
|
318 | read_across_frames=True) | |
|
319 | ||
|
320 | chunks = [] | |
|
321 | ||
|
322 | while True: | |
|
323 | read_amount = read_sizes.draw(strategies.integers(-1, 16384)) | |
|
324 | chunk = reader.read(read_amount) | |
|
325 | ||
|
326 | if not chunk and read_amount: | |
|
327 | break | |
|
328 | ||
|
329 | chunks.append(chunk) | |
|
330 | ||
|
331 | self.assertEqual(source.getvalue(), b''.join(chunks)) | |
|
332 | ||
|
102 | 333 | |
|
103 | 334 | @unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set') |
|
104 | 335 | @make_cffi |
@@ -113,7 +344,7 b' class TestDecompressor_stream_writer_fuz' | |||
|
113 | 344 | |
|
114 | 345 | dctx = zstd.ZstdDecompressor() |
|
115 | 346 | source = io.BytesIO(frame) |
|
116 | dest = io.BytesIO() |

347 | dest = NonClosingBytesIO() | |
|
117 | 348 | |
|
118 | 349 | with dctx.stream_writer(dest, write_size=write_size) as decompressor: |
|
119 | 350 | while True: |
@@ -234,10 +465,12 b' class TestDecompressor_multi_decompress_' | |||
|
234 | 465 | write_checksum=True, |
|
235 | 466 | **kwargs) |
|
236 | 467 | |
|
468 | if not hasattr(cctx, 'multi_compress_to_buffer'): | |
|
469 | self.skipTest('multi_compress_to_buffer not available') | |
|
470 | ||
|
237 | 471 | frames_buffer = cctx.multi_compress_to_buffer(original, threads=-1) |
|
238 | 472 | |
|
239 | 473 | dctx = zstd.ZstdDecompressor(**kwargs) |
|
240 | ||
|
241 | 474 | result = dctx.multi_decompress_to_buffer(frames_buffer) |
|
242 | 475 | |
|
243 | 476 | self.assertEqual(len(result), len(original)) |
@@ -12,9 +12,9 b' from . common import (' | |||
|
12 | 12 | @make_cffi |
|
13 | 13 | class TestModuleAttributes(unittest.TestCase): |
|
14 | 14 | def test_version(self): |
|
15 | self.assertEqual(zstd.ZSTD_VERSION, (1, 3, 6)) |

15 | self.assertEqual(zstd.ZSTD_VERSION, (1, 3, 8)) | |
|
16 | 16 | |
|
17 | self.assertEqual(zstd.__version__, '0.10.1') |

17 | self.assertEqual(zstd.__version__, '0.11.0') | |
|
18 | 18 | |
|
19 | 19 | def test_constants(self): |
|
20 | 20 | self.assertEqual(zstd.MAX_COMPRESSION_LEVEL, 22) |
@@ -29,6 +29,8 b' class TestModuleAttributes(unittest.Test' | |||
|
29 | 29 | 'DECOMPRESSION_RECOMMENDED_INPUT_SIZE', |
|
30 | 30 | 'DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE', |
|
31 | 31 | 'MAGIC_NUMBER', |
|
32 | 'FLUSH_BLOCK', | |
|
33 | 'FLUSH_FRAME', | |
|
32 | 34 | 'BLOCKSIZELOG_MAX', |
|
33 | 35 | 'BLOCKSIZE_MAX', |
|
34 | 36 | 'WINDOWLOG_MIN', |
@@ -38,6 +40,8 b' class TestModuleAttributes(unittest.Test' | |||
|
38 | 40 | 'HASHLOG_MIN', |
|
39 | 41 | 'HASHLOG_MAX', |
|
40 | 42 | 'HASHLOG3_MAX', |
|
43 | 'MINMATCH_MIN', | |
|
44 | 'MINMATCH_MAX', | |
|
41 | 45 | 'SEARCHLOG_MIN', |
|
42 | 46 | 'SEARCHLOG_MAX', |
|
43 | 47 | 'SEARCHLENGTH_MIN', |
@@ -55,6 +59,7 b' class TestModuleAttributes(unittest.Test' | |||
|
55 | 59 | 'STRATEGY_BTLAZY2', |
|
56 | 60 | 'STRATEGY_BTOPT', |
|
57 | 61 | 'STRATEGY_BTULTRA', |
|
62 | 'STRATEGY_BTULTRA2', | |
|
58 | 63 | 'DICT_TYPE_AUTO', |
|
59 | 64 | 'DICT_TYPE_RAWCONTENT', |
|
60 | 65 | 'DICT_TYPE_FULLDICT', |
@@ -35,31 +35,31 b" if _module_policy == 'default':" | |||
|
35 | 35 | from zstd import * |
|
36 | 36 | backend = 'cext' |
|
37 | 37 | elif platform.python_implementation() in ('PyPy',): |
|
38 | from zstd_cffi import * |

38 | from .cffi import * | |
|
39 | 39 | backend = 'cffi' |
|
40 | 40 | else: |
|
41 | 41 | try: |
|
42 | 42 | from zstd import * |
|
43 | 43 | backend = 'cext' |
|
44 | 44 | except ImportError: |
|
45 | from zstd_cffi import * |

45 | from .cffi import * | |
|
46 | 46 | backend = 'cffi' |
|
47 | 47 | elif _module_policy == 'cffi_fallback': |
|
48 | 48 | try: |
|
49 | 49 | from zstd import * |
|
50 | 50 | backend = 'cext' |
|
51 | 51 | except ImportError: |
|
52 | from zstd_cffi import * |

52 | from .cffi import * | |
|
53 | 53 | backend = 'cffi' |
|
54 | 54 | elif _module_policy == 'cext': |
|
55 | 55 | from zstd import * |
|
56 | 56 | backend = 'cext' |
|
57 | 57 | elif _module_policy == 'cffi': |
|
58 | from zstd_cffi import * |

58 | from .cffi import * | |
|
59 | 59 | backend = 'cffi' |
|
60 | 60 | else: |
|
61 | 61 | raise ImportError('unknown module import policy: %s; use default, cffi_fallback, ' |
|
62 | 62 | 'cext, or cffi' % _module_policy) |
|
63 | 63 | |
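The `__init__.py` hunk above keeps the try-the-C-extension-then-fall-back-to-cffi policy and only changes the fallback target from the old top-level `zstd_cffi` module to the package-relative `.cffi`. The generic pattern, demonstrated with stdlib module names as placeholders (not the real backends):

```python
import importlib

def import_first(names):
    """Return the first importable module, mirroring cffi_fallback."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError('no importable module among %r' % (names,))

backend = import_first(['not_a_real_accelerated_module', 'json'])
```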
|
64 | 64 | # Keep this in sync with python-zstandard.h. |
|
65 | __version__ = '0.10.1' |

65 | __version__ = '0.11.0' |
This diff has been collapsed as it changes many lines (1,203 lines changed).
@@ -28,6 +28,8 b' from __future__ import absolute_import, ' | |||
|
28 | 28 | 'train_dictionary', |
|
29 | 29 | |
|
30 | 30 | # Constants. |
|
31 | 'FLUSH_BLOCK', | |
|
32 | 'FLUSH_FRAME', | |
|
31 | 33 | 'COMPRESSOBJ_FLUSH_FINISH', |
|
32 | 34 | 'COMPRESSOBJ_FLUSH_BLOCK', |
|
33 | 35 | 'ZSTD_VERSION', |
@@ -49,6 +51,8 b' from __future__ import absolute_import, ' | |||
|
49 | 51 | 'HASHLOG_MIN', |
|
50 | 52 | 'HASHLOG_MAX', |
|
51 | 53 | 'HASHLOG3_MAX', |
|
54 | 'MINMATCH_MIN', | |
|
55 | 'MINMATCH_MAX', | |
|
52 | 56 | 'SEARCHLOG_MIN', |
|
53 | 57 | 'SEARCHLOG_MAX', |
|
54 | 58 | 'SEARCHLENGTH_MIN', |
@@ -66,6 +70,7 b' from __future__ import absolute_import, ' | |||
|
66 | 70 | 'STRATEGY_BTLAZY2', |
|
67 | 71 | 'STRATEGY_BTOPT', |
|
68 | 72 | 'STRATEGY_BTULTRA', |
|
73 | 'STRATEGY_BTULTRA2', | |
|
69 | 74 | 'DICT_TYPE_AUTO', |
|
70 | 75 | 'DICT_TYPE_RAWCONTENT', |
|
71 | 76 | 'DICT_TYPE_FULLDICT', |
@@ -114,10 +119,12 b' CHAINLOG_MAX = lib.ZSTD_CHAINLOG_MAX' | |||
|
114 | 119 | HASHLOG_MIN = lib.ZSTD_HASHLOG_MIN |
|
115 | 120 | HASHLOG_MAX = lib.ZSTD_HASHLOG_MAX |
|
116 | 121 | HASHLOG3_MAX = lib.ZSTD_HASHLOG3_MAX |
|
122 | MINMATCH_MIN = lib.ZSTD_MINMATCH_MIN | |
|
123 | MINMATCH_MAX = lib.ZSTD_MINMATCH_MAX | |
|
117 | 124 | SEARCHLOG_MIN = lib.ZSTD_SEARCHLOG_MIN |
|
118 | 125 | SEARCHLOG_MAX = lib.ZSTD_SEARCHLOG_MAX |
|
119 | SEARCHLENGTH_MIN = lib.ZSTD_SEARCHLENGTH_MIN |

120 | SEARCHLENGTH_MAX = lib.ZSTD_SEARCHLENGTH_MAX |

126 | SEARCHLENGTH_MIN = lib.ZSTD_MINMATCH_MIN | |
|
127 | SEARCHLENGTH_MAX = lib.ZSTD_MINMATCH_MAX | |
|
121 | 128 | TARGETLENGTH_MIN = lib.ZSTD_TARGETLENGTH_MIN |
|
122 | 129 | TARGETLENGTH_MAX = lib.ZSTD_TARGETLENGTH_MAX |
|
123 | 130 | LDM_MINMATCH_MIN = lib.ZSTD_LDM_MINMATCH_MIN |
@@ -132,6 +139,7 b' STRATEGY_LAZY2 = lib.ZSTD_lazy2' | |||
|
132 | 139 | STRATEGY_BTLAZY2 = lib.ZSTD_btlazy2 |
|
133 | 140 | STRATEGY_BTOPT = lib.ZSTD_btopt |
|
134 | 141 | STRATEGY_BTULTRA = lib.ZSTD_btultra |
|
142 | STRATEGY_BTULTRA2 = lib.ZSTD_btultra2 | |
|
135 | 143 | |
|
136 | 144 | DICT_TYPE_AUTO = lib.ZSTD_dct_auto |
|
137 | 145 | DICT_TYPE_RAWCONTENT = lib.ZSTD_dct_rawContent |
@@ -140,6 +148,9 b' DICT_TYPE_FULLDICT = lib.ZSTD_dct_fullDi' | |||
|
140 | 148 | FORMAT_ZSTD1 = lib.ZSTD_f_zstd1 |
|
141 | 149 | FORMAT_ZSTD1_MAGICLESS = lib.ZSTD_f_zstd1_magicless |
|
142 | 150 | |
|
151 | FLUSH_BLOCK = 0 | |
|
152 | FLUSH_FRAME = 1 | |
|
153 | ||
|
143 | 154 | COMPRESSOBJ_FLUSH_FINISH = 0 |
|
144 | 155 | COMPRESSOBJ_FLUSH_BLOCK = 1 |
|
145 | 156 | |
@@ -182,27 +193,27 b' def _make_cctx_params(params):' | |||
|
182 | 193 | res = ffi.gc(res, lib.ZSTD_freeCCtxParams) |
|
183 | 194 | |
|
184 | 195 | attrs = [ |
|
185 | (lib.ZSTD_p_format, params.format), | |
|
186 | (lib.ZSTD_p_compressionLevel, params.compression_level), | |
|
187 | (lib.ZSTD_p_windowLog, params.window_log), | |
|
188 | (lib.ZSTD_p_hashLog, params.hash_log), | |
|
189 | (lib.ZSTD_p_chainLog, params.chain_log), | |
|
190 | (lib.ZSTD_p_searchLog, params.search_log), | |
|
191 | (lib.ZSTD_p_minMatch, params.min_match), | |
|
192 | (lib.ZSTD_p_targetLength, params.target_length), | |
|
193 | (lib.ZSTD_p_compressionStrategy, params.compression_strategy), | |
|
194 | (lib.ZSTD_p_contentSizeFlag, params.write_content_size), | |
|
195 | (lib.ZSTD_p_checksumFlag, params.write_checksum), | |
|
196 | (lib.ZSTD_p_dictIDFlag, params.write_dict_id), | |
|
197 | (lib.ZSTD_p_nbWorkers, params.threads), | |
|
198 | (lib.ZSTD_p_jobSize, params.job_size), | |
|
199 | (lib.ZSTD_p_overlapSizeLog, params.overlap_size_log), | |
|
200 | (lib.ZSTD_p_forceMaxWindow, params.force_max_window), | |
|
201 | (lib.ZSTD_p_enableLongDistanceMatching, params.enable_ldm), | |
|
202 | (lib.ZSTD_p_ldmHashLog, params.ldm_hash_log), | |
|
203 | (lib.ZSTD_p_ldmMinMatch, params.ldm_min_match), | |
|
204 | (lib.ZSTD_p_ldmBucketSizeLog, params.ldm_bucket_size_log), | |
|
205 | (lib.ZSTD_p_ldmHashEveryLog, params.ldm_hash_every_log), | |
|
196 | (lib.ZSTD_c_format, params.format), | |
|
197 | (lib.ZSTD_c_compressionLevel, params.compression_level), | |
|
198 | (lib.ZSTD_c_windowLog, params.window_log), | |
|
199 | (lib.ZSTD_c_hashLog, params.hash_log), | |
|
200 | (lib.ZSTD_c_chainLog, params.chain_log), | |
|
201 | (lib.ZSTD_c_searchLog, params.search_log), | |
|
202 | (lib.ZSTD_c_minMatch, params.min_match), | |
|
203 | (lib.ZSTD_c_targetLength, params.target_length), | |
|
204 | (lib.ZSTD_c_strategy, params.compression_strategy), | |
|
205 | (lib.ZSTD_c_contentSizeFlag, params.write_content_size), | |
|
206 | (lib.ZSTD_c_checksumFlag, params.write_checksum), | |
|
207 | (lib.ZSTD_c_dictIDFlag, params.write_dict_id), | |
|
208 | (lib.ZSTD_c_nbWorkers, params.threads), | |
|
209 | (lib.ZSTD_c_jobSize, params.job_size), | |
|
210 | (lib.ZSTD_c_overlapLog, params.overlap_log), | |
|
211 | (lib.ZSTD_c_forceMaxWindow, params.force_max_window), | |
|
212 | (lib.ZSTD_c_enableLongDistanceMatching, params.enable_ldm), | |
|
213 | (lib.ZSTD_c_ldmHashLog, params.ldm_hash_log), | |
|
214 | (lib.ZSTD_c_ldmMinMatch, params.ldm_min_match), | |
|
215 | (lib.ZSTD_c_ldmBucketSizeLog, params.ldm_bucket_size_log), | |
|
216 | (lib.ZSTD_c_ldmHashRateLog, params.ldm_hash_rate_log), | |
|
206 | 217 | ] |
|
207 | 218 | |
|
208 | 219 | for param, value in attrs: |
@@ -220,7 +231,7 b' class ZstdCompressionParameters(object):' | |||
|
220 | 231 | 'chain_log': 'chainLog', |
|
221 | 232 | 'hash_log': 'hashLog', |
|
222 | 233 | 'search_log': 'searchLog', |
|
223 | 'min_match': 'searchLength', | |
|
234 | 'min_match': 'minMatch', | |
|
224 | 235 | 'target_length': 'targetLength', |
|
225 | 236 | 'compression_strategy': 'strategy', |
|
226 | 237 | } |
@@ -233,41 +244,170 b' class ZstdCompressionParameters(object):' | |||
|
233 | 244 | |
|
234 | 245 | def __init__(self, format=0, compression_level=0, window_log=0, hash_log=0, |
|
235 | 246 | chain_log=0, search_log=0, min_match=0, target_length=0, |
|
236 | compression_strategy=0, write_content_size=1, write_checksum=0, | |
|
237 | write_dict_id=0, job_size=0, overlap_size_log=0, | |
|
238 | force_max_window=0, enable_ldm=0, ldm_hash_log=0, | |
|
239 | ldm_min_match=0, ldm_bucket_size_log=0, ldm_hash_every_log=0, | |
|
240 | threads=0): | |
|
247 | strategy=-1, compression_strategy=-1, | |
|
248 | write_content_size=1, write_checksum=0, | |
|
249 | write_dict_id=0, job_size=0, overlap_log=-1, | |
|
250 | overlap_size_log=-1, force_max_window=0, enable_ldm=0, | |
|
251 | ldm_hash_log=0, ldm_min_match=0, ldm_bucket_size_log=0, | |
|
252 | ldm_hash_rate_log=-1, ldm_hash_every_log=-1, threads=0): | |
|
253 | ||
|
254 | params = lib.ZSTD_createCCtxParams() | |
|
255 | if params == ffi.NULL: | |
|
256 | raise MemoryError() | |
|
257 | ||
|
258 | params = ffi.gc(params, lib.ZSTD_freeCCtxParams) | |
|
259 | ||
|
260 | self._params = params | |
|
241 | 261 | |
|
242 | 262 | if threads < 0: |
|
243 | 263 | threads = _cpu_count() |
|
244 | 264 | |
|
245 | self.format = format | |
|
246 | self.compression_level = compression_level | |
|
247 | self.window_log = window_log | |
|
248 | self.hash_log = hash_log | |
|
249 | self.chain_log = chain_log | |
|
250 | self.search_log = search_log | |
|
251 | self.min_match = min_match | |
|
252 | self.target_length = target_length | |
|
253 | self.compression_strategy = compression_strategy | |
|
254 | self.write_content_size = write_content_size | |
|
255 | self.write_checksum = write_checksum | |
|
256 | self.write_dict_id = write_dict_id | |
|
257 | self.job_size = job_size | |
|
258 | self.overlap_size_log = overlap_size_log | |
|
259 | self.force_max_window = force_max_window | |
|
260 | self.enable_ldm = enable_ldm | |
|
261 | self.ldm_hash_log = ldm_hash_log | |
|
262 | self.ldm_min_match = ldm_min_match | |
|
263 | self.ldm_bucket_size_log = ldm_bucket_size_log | |
|
264 | self.ldm_hash_every_log = ldm_hash_every_log | |
|
265 | self.threads = threads | |
|
266 | ||
|
267 | self.params = _make_cctx_params(self) | |
|
265 | # We need to set ZSTD_c_nbWorkers before ZSTD_c_jobSize and ZSTD_c_overlapLog | |
|
266 | # because setting ZSTD_c_nbWorkers resets the other parameters. | |
|
267 | _set_compression_parameter(params, lib.ZSTD_c_nbWorkers, threads) | |
|
268 | ||
|
269 | _set_compression_parameter(params, lib.ZSTD_c_format, format) | |
|
270 | _set_compression_parameter(params, lib.ZSTD_c_compressionLevel, compression_level) | |
|
271 | _set_compression_parameter(params, lib.ZSTD_c_windowLog, window_log) | |
|
272 | _set_compression_parameter(params, lib.ZSTD_c_hashLog, hash_log) | |
|
273 | _set_compression_parameter(params, lib.ZSTD_c_chainLog, chain_log) | |
|
274 | _set_compression_parameter(params, lib.ZSTD_c_searchLog, search_log) | |
|
275 | _set_compression_parameter(params, lib.ZSTD_c_minMatch, min_match) | |
|
276 | _set_compression_parameter(params, lib.ZSTD_c_targetLength, target_length) | |
|
277 | ||
|
278 | if strategy != -1 and compression_strategy != -1: | |
|
279 | raise ValueError('cannot specify both compression_strategy and strategy') | |
|
280 | ||
|
281 | if compression_strategy != -1: | |
|
282 | strategy = compression_strategy | |
|
283 | elif strategy == -1: | |
|
284 | strategy = 0 | |
|
285 | ||
|
286 | _set_compression_parameter(params, lib.ZSTD_c_strategy, strategy) | |
|
287 | _set_compression_parameter(params, lib.ZSTD_c_contentSizeFlag, write_content_size) | |
|
288 | _set_compression_parameter(params, lib.ZSTD_c_checksumFlag, write_checksum) | |
|
289 | _set_compression_parameter(params, lib.ZSTD_c_dictIDFlag, write_dict_id) | |
|
290 | _set_compression_parameter(params, lib.ZSTD_c_jobSize, job_size) | |
|
291 | ||
|
292 | if overlap_log != -1 and overlap_size_log != -1: | |
|
293 | raise ValueError('cannot specify both overlap_log and overlap_size_log') | |
|
294 | ||
|
295 | if overlap_size_log != -1: | |
|
296 | overlap_log = overlap_size_log | |
|
297 | elif overlap_log == -1: | |
|
298 | overlap_log = 0 | |
|
299 | ||
|
300 | _set_compression_parameter(params, lib.ZSTD_c_overlapLog, overlap_log) | |
|
301 | _set_compression_parameter(params, lib.ZSTD_c_forceMaxWindow, force_max_window) | |
|
302 | _set_compression_parameter(params, lib.ZSTD_c_enableLongDistanceMatching, enable_ldm) | |
|
303 | _set_compression_parameter(params, lib.ZSTD_c_ldmHashLog, ldm_hash_log) | |
|
304 | _set_compression_parameter(params, lib.ZSTD_c_ldmMinMatch, ldm_min_match) | |
|
305 | _set_compression_parameter(params, lib.ZSTD_c_ldmBucketSizeLog, ldm_bucket_size_log) | |
|
306 | ||
|
307 | if ldm_hash_rate_log != -1 and ldm_hash_every_log != -1: | |
|
308 | raise ValueError('cannot specify both ldm_hash_rate_log and ldm_hash_every_log') | |
|
309 | ||
|
310 | if ldm_hash_every_log != -1: | |
|
311 | ldm_hash_rate_log = ldm_hash_every_log | |
|
312 | elif ldm_hash_rate_log == -1: | |
|
313 | ldm_hash_rate_log = 0 | |
|
314 | ||
|
315 | _set_compression_parameter(params, lib.ZSTD_c_ldmHashRateLog, ldm_hash_rate_log) | |
|
316 | ||
|
317 | @property | |
|
318 | def format(self): | |
|
319 | return _get_compression_parameter(self._params, lib.ZSTD_c_format) | |
|
320 | ||
|
321 | @property | |
|
322 | def compression_level(self): | |
|
323 | return _get_compression_parameter(self._params, lib.ZSTD_c_compressionLevel) | |
|
324 | ||
|
325 | @property | |
|
326 | def window_log(self): | |
|
327 | return _get_compression_parameter(self._params, lib.ZSTD_c_windowLog) | |
|
328 | ||
|
329 | @property | |
|
330 | def hash_log(self): | |
|
331 | return _get_compression_parameter(self._params, lib.ZSTD_c_hashLog) | |
|
332 | ||
|
333 | @property | |
|
334 | def chain_log(self): | |
|
335 | return _get_compression_parameter(self._params, lib.ZSTD_c_chainLog) | |
|
336 | ||
|
337 | @property | |
|
338 | def search_log(self): | |
|
339 | return _get_compression_parameter(self._params, lib.ZSTD_c_searchLog) | |
|
340 | ||
|
341 | @property | |
|
342 | def min_match(self): | |
|
343 | return _get_compression_parameter(self._params, lib.ZSTD_c_minMatch) | |
|
344 | ||
|
345 | @property | |
|
346 | def target_length(self): | |
|
347 | return _get_compression_parameter(self._params, lib.ZSTD_c_targetLength) | |
|
348 | ||
|
349 | @property | |
|
350 | def compression_strategy(self): | |
|
351 | return _get_compression_parameter(self._params, lib.ZSTD_c_strategy) | |
|
352 | ||
|
353 | @property | |
|
354 | def write_content_size(self): | |
|
355 | return _get_compression_parameter(self._params, lib.ZSTD_c_contentSizeFlag) | |
|
356 | ||
|
357 | @property | |
|
358 | def write_checksum(self): | |
|
359 | return _get_compression_parameter(self._params, lib.ZSTD_c_checksumFlag) | |
|
360 | ||
|
361 | @property | |
|
362 | def write_dict_id(self): | |
|
363 | return _get_compression_parameter(self._params, lib.ZSTD_c_dictIDFlag) | |
|
364 | ||
|
365 | @property | |
|
366 | def job_size(self): | |
|
367 | return _get_compression_parameter(self._params, lib.ZSTD_c_jobSize) | |
|
368 | ||
|
369 | @property | |
|
370 | def overlap_log(self): | |
|
371 | return _get_compression_parameter(self._params, lib.ZSTD_c_overlapLog) | |
|
372 | ||
|
373 | @property | |
|
374 | def overlap_size_log(self): | |
|
375 | return self.overlap_log | |
|
376 | ||
|
377 | @property | |
|
378 | def force_max_window(self): | |
|
379 | return _get_compression_parameter(self._params, lib.ZSTD_c_forceMaxWindow) | |
|
380 | ||
|
381 | @property | |
|
382 | def enable_ldm(self): | |
|
383 | return _get_compression_parameter(self._params, lib.ZSTD_c_enableLongDistanceMatching) | |
|
384 | ||
|
385 | @property | |
|
386 | def ldm_hash_log(self): | |
|
387 | return _get_compression_parameter(self._params, lib.ZSTD_c_ldmHashLog) | |
|
388 | ||
|
389 | @property | |
|
390 | def ldm_min_match(self): | |
|
391 | return _get_compression_parameter(self._params, lib.ZSTD_c_ldmMinMatch) | |
|
392 | ||
|
393 | @property | |
|
394 | def ldm_bucket_size_log(self): | |
|
395 | return _get_compression_parameter(self._params, lib.ZSTD_c_ldmBucketSizeLog) | |
|
396 | ||
|
397 | @property | |
|
398 | def ldm_hash_rate_log(self): | |
|
399 | return _get_compression_parameter(self._params, lib.ZSTD_c_ldmHashRateLog) | |
|
400 | ||
|
401 | @property | |
|
402 | def ldm_hash_every_log(self): | |
|
403 | return self.ldm_hash_rate_log | |
|
404 | ||
|
405 | @property | |
|
406 | def threads(self): | |
|
407 | return _get_compression_parameter(self._params, lib.ZSTD_c_nbWorkers) | |
|
268 | 408 | |
|
269 | 409 | def estimated_compression_context_size(self): |
|
270 | return lib.ZSTD_estimateCCtxSize_usingCCtxParams(self.params) | |
|
410 | return lib.ZSTD_estimateCCtxSize_usingCCtxParams(self._params) | |
|
271 | 411 | |
|
272 | 412 | CompressionParameters = ZstdCompressionParameters |
|
273 | 413 | |
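The rewritten `ZstdCompressionParameters.__init__` above folds each deprecated keyword (`compression_strategy`, `overlap_size_log`, `ldm_hash_every_log`) into its renamed replacement (`strategy`, `overlap_log`, `ldm_hash_rate_log`) using a `-1` "not specified" sentinel, and rejects the case where both names are given. A minimal standalone sketch of that resolution pattern (the helper name `resolve_alias` is hypothetical, not part of this diff):

```python
def resolve_alias(new_value, old_value, new_name, old_name,
                  default=0, sentinel=-1):
    """Resolve a renamed keyword argument against its deprecated alias.

    Mirrors the strategy/compression_strategy, overlap_log/overlap_size_log
    and ldm_hash_rate_log/ldm_hash_every_log handling in the diff above:
    ``sentinel`` means "not specified" and passing both names is an error.
    """
    if new_value != sentinel and old_value != sentinel:
        raise ValueError('cannot specify both %s and %s' %
                         (new_name, old_name))

    if old_value != sentinel:
        # Deprecated spelling wins only when the new one is absent.
        return old_value
    if new_value != sentinel:
        return new_value

    return default
```

The sentinel approach keeps backwards compatibility without `**kwargs` inspection: callers using the old keyword still work, while mixing old and new spellings fails loudly instead of silently preferring one.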
@@ -276,31 +416,53 b' def estimate_decompression_context_size(' | |||
|
276 | 416 | |
|
277 | 417 | |
|
278 | 418 | def _set_compression_parameter(params, param, value): |
|
279 | zresult = lib.ZSTD_CCtxParam_setParameter(params, param, | |
|
280 | ffi.cast('unsigned', value)) | |
|
419 | zresult = lib.ZSTD_CCtxParam_setParameter(params, param, value) | |
|
281 | 420 | if lib.ZSTD_isError(zresult): |
|
282 | 421 | raise ZstdError('unable to set compression context parameter: %s' % |
|
283 | 422 | _zstd_error(zresult)) |
|
284 | 423 | |
|
424 | ||
|
425 | def _get_compression_parameter(params, param): | |
|
426 | result = ffi.new('int *') | |
|
427 | ||
|
428 | zresult = lib.ZSTD_CCtxParam_getParameter(params, param, result) | |
|
429 | if lib.ZSTD_isError(zresult): | |
|
430 | raise ZstdError('unable to get compression context parameter: %s' % | |
|
431 | _zstd_error(zresult)) | |
|
432 | ||
|
433 | return result[0] | |
|
434 | ||
|
435 | ||
|
285 | 436 | class ZstdCompressionWriter(object): |
|
286 | def __init__(self, compressor, writer, source_size, write_size): | |
|
437 | def __init__(self, compressor, writer, source_size, write_size, | |
|
438 | write_return_read): | |
|
287 | 439 | self._compressor = compressor |
|
288 | 440 | self._writer = writer |
|
289 | self._source_size = source_size | |
|
290 | 441 | self._write_size = write_size |
|
442 | self._write_return_read = bool(write_return_read) | |
|
291 | 443 | self._entered = False |
|
444 | self._closed = False | |
|
292 | 445 | self._bytes_compressed = 0 |
|
293 | 446 | |
|
294 | def __enter__(self): | |
|
295 | if self._entered: | |
|
296 | raise ZstdError('cannot __enter__ multiple times') | |
|
297 | ||
|
298 | zresult = lib.ZSTD_CCtx_setPledgedSrcSize(self._compressor._cctx, | |
|
299 | self._source_size) | |
|
447 | self._dst_buffer = ffi.new('char[]', write_size) | |
|
448 | self._out_buffer = ffi.new('ZSTD_outBuffer *') | |
|
449 | self._out_buffer.dst = self._dst_buffer | |
|
450 | self._out_buffer.size = len(self._dst_buffer) | |
|
451 | self._out_buffer.pos = 0 | |
|
452 | ||
|
453 | zresult = lib.ZSTD_CCtx_setPledgedSrcSize(compressor._cctx, | |
|
454 | source_size) | |
|
300 | 455 | if lib.ZSTD_isError(zresult): |
|
301 | 456 | raise ZstdError('error setting source size: %s' % |
|
302 | 457 | _zstd_error(zresult)) |
|
303 | 458 | |
|
459 | def __enter__(self): | |
|
460 | if self._closed: | |
|
461 | raise ValueError('stream is closed') | |
|
462 | ||
|
463 | if self._entered: | |
|
464 | raise ZstdError('cannot __enter__ multiple times') | |
|
465 | ||
|
304 | 466 | self._entered = True |
|
305 | 467 | return self |
|
306 | 468 | |
@@ -308,50 +470,79 b' class ZstdCompressionWriter(object):' | |||
|
308 | 470 | self._entered = False |
|
309 | 471 | |
|
310 | 472 | if not exc_type and not exc_value and not exc_tb: |
|
311 | dst_buffer = ffi.new('char[]', self._write_size) | |
|
312 | ||
|
313 | out_buffer = ffi.new('ZSTD_outBuffer *') | |
|
314 | in_buffer = ffi.new('ZSTD_inBuffer *') | |
|
315 | ||
|
316 | out_buffer.dst = dst_buffer | |
|
317 | out_buffer.size = len(dst_buffer) | |
|
318 | out_buffer.pos = 0 | |
|
319 | ||
|
320 | in_buffer.src = ffi.NULL | |
|
321 | in_buffer.size = 0 | |
|
322 | in_buffer.pos = 0 | |
|
323 | ||
|
324 | while True: | |
|
325 | zresult = lib.ZSTD_compress_generic(self._compressor._cctx, | |
|
326 | out_buffer, in_buffer, | |
|
327 | lib.ZSTD_e_end) | |
|
328 | ||
|
329 | if lib.ZSTD_isError(zresult): | |
|
330 | raise ZstdError('error ending compression stream: %s' % | |
|
331 | _zstd_error(zresult)) | |
|
332 | ||
|
333 | if out_buffer.pos: | |
|
334 | self._writer.write(ffi.buffer(out_buffer.dst, out_buffer.pos)[:]) | |
|
335 | out_buffer.pos = 0 | |
|
336 | ||
|
337 | if zresult == 0: | |
|
338 | break | |
|
473 | self.close() | |
|
339 | 474 | |
|
340 | 475 | self._compressor = None |
|
341 | 476 | |
|
342 | 477 | return False |
|
343 | 478 | |
|
344 | 479 | def memory_size(self): |
|
345 | if not self._entered: | |
|
346 | raise ZstdError('cannot determine size of an inactive compressor; ' | |
|
347 | 'call when a context manager is active') | |
|
348 | ||
|
349 | 480 | return lib.ZSTD_sizeof_CCtx(self._compressor._cctx) |
|
350 | 481 | |
|
482 | def fileno(self): | |
|
483 | f = getattr(self._writer, 'fileno', None) | |
|
484 | if f: | |
|
485 | return f() | |
|
486 | else: | |
|
487 | raise OSError('fileno not available on underlying writer') | |
|
488 | ||
|
489 | def close(self): | |
|
490 | if self._closed: | |
|
491 | return | |
|
492 | ||
|
493 | try: | |
|
494 | self.flush(FLUSH_FRAME) | |
|
495 | finally: | |
|
496 | self._closed = True | |
|
497 | ||
|
498 | # Call close() on underlying stream as well. | |
|
499 | f = getattr(self._writer, 'close', None) | |
|
500 | if f: | |
|
501 | f() | |
|
502 | ||
|
503 | @property | |
|
504 | def closed(self): | |
|
505 | return self._closed | |
|
506 | ||
|
507 | def isatty(self): | |
|
508 | return False | |
|
509 | ||
|
510 | def readable(self): | |
|
511 | return False | |
|
512 | ||
|
513 | def readline(self, size=-1): | |
|
514 | raise io.UnsupportedOperation() | |
|
515 | ||
|
516 | def readlines(self, hint=-1): | |
|
517 | raise io.UnsupportedOperation() | |
|
518 | ||
|
519 | def seek(self, offset, whence=None): | |
|
520 | raise io.UnsupportedOperation() | |
|
521 | ||
|
522 | def seekable(self): | |
|
523 | return False | |
|
524 | ||
|
525 | def truncate(self, size=None): | |
|
526 | raise io.UnsupportedOperation() | |
|
527 | ||
|
528 | def writable(self): | |
|
529 | return True | |
|
530 | ||
|
531 | def writelines(self, lines): | |
|
532 | raise NotImplementedError('writelines() is not yet implemented') | |
|
533 | ||
|
534 | def read(self, size=-1): | |
|
535 | raise io.UnsupportedOperation() | |
|
536 | ||
|
537 | def readall(self): | |
|
538 | raise io.UnsupportedOperation() | |
|
539 | ||
|
540 | def readinto(self, b): | |
|
541 | raise io.UnsupportedOperation() | |
|
542 | ||
|
351 | 543 | def write(self, data): |
|
352 | if not self._entered: | |
|
353 | raise ZstdError('write() must be called from an active context ' | |
|
354 | 'manager') | |
|
544 | if self._closed: | |
|
545 | raise ValueError('stream is closed') | |
|
355 | 546 | |
|
356 | 547 | total_write = 0 |
|
357 | 548 | |
@@ -362,16 +553,13 b' class ZstdCompressionWriter(object):' | |||
|
362 | 553 | in_buffer.size = len(data_buffer) |
|
363 | 554 | in_buffer.pos = 0 |
|
364 | 555 | |
|
365 | out_buffer = ffi.new('ZSTD_outBuffer *') | |
|
366 | dst_buffer = ffi.new('char[]', self._write_size) | |
|
367 | out_buffer.dst = dst_buffer | |
|
368 | out_buffer.size = self._write_size | |
|
556 | out_buffer = self._out_buffer | |
|
369 | 557 | out_buffer.pos = 0 |
|
370 | 558 | |
|
371 | 559 | while in_buffer.pos < in_buffer.size: |
|
372 | zresult = lib.ZSTD_compress_generic(self._compressor._cctx, | |
|
373 | out_buffer, in_buffer, | |
|
374 | lib.ZSTD_e_continue) | |
|
560 | zresult = lib.ZSTD_compressStream2(self._compressor._cctx, | |
|
561 | out_buffer, in_buffer, | |
|
562 | lib.ZSTD_e_continue) | |
|
375 | 563 | if lib.ZSTD_isError(zresult): |
|
376 | 564 | raise ZstdError('zstd compress error: %s' % |
|
377 | 565 | _zstd_error(zresult)) |
@@ -382,18 +570,25 b' class ZstdCompressionWriter(object):' | |||
|
382 | 570 | self._bytes_compressed += out_buffer.pos |
|
383 | 571 | out_buffer.pos = 0 |
|
384 | 572 | |
|
385 | return total_write | |
|
386 | ||
|
387 | def flush(self): | |
|
388 | if not self._entered: | |
|
389 | raise ZstdError('flush must be called from an active context manager') | |
|
573 | if self._write_return_read: | |
|
574 | return in_buffer.pos | |
|
575 | else: | |
|
576 | return total_write | |
|
577 | ||
|
578 | def flush(self, flush_mode=FLUSH_BLOCK): | |
|
579 | if flush_mode == FLUSH_BLOCK: | |
|
580 | flush = lib.ZSTD_e_flush | |
|
581 | elif flush_mode == FLUSH_FRAME: | |
|
582 | flush = lib.ZSTD_e_end | |
|
583 | else: | |
|
584 | raise ValueError('unknown flush_mode: %r' % flush_mode) | |
|
585 | ||
|
586 | if self._closed: | |
|
587 | raise ValueError('stream is closed') | |
|
390 | 588 | |
|
391 | 589 | total_write = 0 |
|
392 | 590 | |
|
393 | out_buffer = ffi.new('ZSTD_outBuffer *') | |
|
394 | dst_buffer = ffi.new('char[]', self._write_size) | |
|
395 | out_buffer.dst = dst_buffer | |
|
396 | out_buffer.size = self._write_size | |
|
591 | out_buffer = self._out_buffer | |
|
397 | 592 | out_buffer.pos = 0 |
|
398 | 593 | |
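The rewritten `flush()` above maps the new `flush_mode` argument onto a zstd end directive: `FLUSH_BLOCK` selects `ZSTD_e_flush` (emit the current block, keep the frame open) while `FLUSH_FRAME` selects `ZSTD_e_end` (also write the frame epilogue and checksum). A small sketch of that dispatch, returning the directive by name since no C library is loaded here (the function name `resolve_flush_directive` is illustrative, not part of the diff):

```python
# Public flush-mode constants, mirroring FLUSH_BLOCK/FLUSH_FRAME in the diff.
FLUSH_BLOCK = 0
FLUSH_FRAME = 1


def resolve_flush_directive(flush_mode):
    """Map a public flush_mode value onto a ZSTD_EndDirective name.

    ZSTD_e_flush completes the current block so all input so far becomes
    decompressible; ZSTD_e_end additionally finishes the frame, which is
    why close() above can implement itself as flush(FLUSH_FRAME).
    """
    if flush_mode == FLUSH_BLOCK:
        return 'ZSTD_e_flush'
    elif flush_mode == FLUSH_FRAME:
        return 'ZSTD_e_end'

    raise ValueError('unknown flush_mode: %r' % flush_mode)
```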
|
399 | 594 | in_buffer = ffi.new('ZSTD_inBuffer *') |
@@ -402,9 +597,9 b' class ZstdCompressionWriter(object):' | |||
|
402 | 597 | in_buffer.pos = 0 |
|
403 | 598 | |
|
404 | 599 | while True: |
|
405 | zresult = lib.ZSTD_compress_generic(self._compressor._cctx, | |
|
406 | out_buffer, in_buffer, | |
|
407 | lib.ZSTD_e_flush) | |
|
600 | zresult = lib.ZSTD_compressStream2(self._compressor._cctx, | |
|
601 | out_buffer, in_buffer, | |
|
602 | flush) | |
|
408 | 603 | if lib.ZSTD_isError(zresult): |
|
409 | 604 | raise ZstdError('zstd compress error: %s' % |
|
410 | 605 | _zstd_error(zresult)) |
@@ -438,10 +633,10 b' class ZstdCompressionObj(object):' | |||
|
438 | 633 | chunks = [] |
|
439 | 634 | |
|
440 | 635 | while source.pos < len(data): |
|
441 | zresult = lib.ZSTD_compress_generic(self._compressor._cctx, | |
|
442 | self._out, | |
|
443 | source, | |
|
444 | lib.ZSTD_e_continue) | |
|
636 | zresult = lib.ZSTD_compressStream2(self._compressor._cctx, | |
|
637 | self._out, | |
|
638 | source, | |
|
639 | lib.ZSTD_e_continue) | |
|
445 | 640 | if lib.ZSTD_isError(zresult): |
|
446 | 641 | raise ZstdError('zstd compress error: %s' % |
|
447 | 642 | _zstd_error(zresult)) |
@@ -477,10 +672,10 b' class ZstdCompressionObj(object):' | |||
|
477 | 672 | chunks = [] |
|
478 | 673 | |
|
479 | 674 | while True: |
|
480 | zresult = lib.ZSTD_compress_generic(self._compressor._cctx, | |
|
481 | self._out, | |
|
482 | in_buffer, | |
|
483 | z_flush_mode) | |
|
675 | zresult = lib.ZSTD_compressStream2(self._compressor._cctx, | |
|
676 | self._out, | |
|
677 | in_buffer, | |
|
678 | z_flush_mode) | |
|
484 | 679 | if lib.ZSTD_isError(zresult): |
|
485 | 680 | raise ZstdError('error ending compression stream: %s' % |
|
486 | 681 | _zstd_error(zresult)) |
@@ -528,10 +723,10 b' class ZstdCompressionChunker(object):' | |||
|
528 | 723 | self._in.pos = 0 |
|
529 | 724 | |
|
530 | 725 | while self._in.pos < self._in.size: |
|
531 | zresult = lib.ZSTD_compress_generic(self._compressor._cctx, | |
|
532 | self._out, | |
|
533 | self._in, | |
|
534 | lib.ZSTD_e_continue) | |
|
726 | zresult = lib.ZSTD_compressStream2(self._compressor._cctx, | |
|
727 | self._out, | |
|
728 | self._in, | |
|
729 | lib.ZSTD_e_continue) | |
|
535 | 730 | |
|
536 | 731 | if self._in.pos == self._in.size: |
|
537 | 732 | self._in.src = ffi.NULL |
@@ -555,9 +750,9 b' class ZstdCompressionChunker(object):' | |||
|
555 | 750 | 'previous operation') |
|
556 | 751 | |
|
557 | 752 | while True: |
|
558 | zresult = lib.ZSTD_compress_generic(self._compressor._cctx, | |
|
559 | self._out, self._in, | |
|
560 | lib.ZSTD_e_flush) | |
|
753 | zresult = lib.ZSTD_compressStream2(self._compressor._cctx, | |
|
754 | self._out, self._in, | |
|
755 | lib.ZSTD_e_flush) | |
|
561 | 756 | if lib.ZSTD_isError(zresult): |
|
562 | 757 | raise ZstdError('zstd compress error: %s' % _zstd_error(zresult)) |
|
563 | 758 | |
@@ -577,9 +772,9 b' class ZstdCompressionChunker(object):' | |||
|
577 | 772 | 'previous operation') |
|
578 | 773 | |
|
579 | 774 | while True: |
|
580 | zresult = lib.ZSTD_compress_generic(self._compressor._cctx, | |
|
581 | self._out, self._in, | |
|
582 | lib.ZSTD_e_end) | |
|
775 | zresult = lib.ZSTD_compressStream2(self._compressor._cctx, | |
|
776 | self._out, self._in, | |
|
777 | lib.ZSTD_e_end) | |
|
583 | 778 | if lib.ZSTD_isError(zresult): |
|
584 | 779 | raise ZstdError('zstd compress error: %s' % _zstd_error(zresult)) |
|
585 | 780 | |
@@ -592,7 +787,7 b' class ZstdCompressionChunker(object):' | |||
|
592 | 787 | return |
|
593 | 788 | |
|
594 | 789 | |
|
595 | class CompressionReader(object): | |
|
790 | class ZstdCompressionReader(object): | |
|
596 | 791 | def __init__(self, compressor, source, read_size): |
|
597 | 792 | self._compressor = compressor |
|
598 | 793 | self._source = source |
@@ -661,7 +856,16 b' class CompressionReader(object):' | |||
|
661 | 856 | return self._bytes_compressed |
|
662 | 857 | |
|
663 | 858 | def readall(self): |
|
664 | raise NotImplementedError() | |
|
859 | chunks = [] | |
|
860 | ||
|
861 | while True: | |
|
862 | chunk = self.read(1048576) | |
|
863 | if not chunk: | |
|
864 | break | |
|
865 | ||
|
866 | chunks.append(chunk) | |
|
867 | ||
|
868 | return b''.join(chunks) | |
|
665 | 869 | |
|
666 | 870 | def __iter__(self): |
|
667 | 871 | raise io.UnsupportedOperation() |
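The `readall()` implementation added above replaces the old `NotImplementedError` stub: it drains the stream in fixed 1 MiB reads until an empty chunk signals end of stream, then joins the chunks. The same loop, sketched against any `read(size)` callable (`read_all` and `chunk_size` are illustrative names, not part of the diff):

```python
import io


def read_all(read, chunk_size=1048576):
    """Drain a read(size) callable the way the new readall() does.

    An empty bytes return marks end of stream; accumulating chunks and
    joining once avoids quadratic bytes concatenation.
    """
    chunks = []

    while True:
        chunk = read(chunk_size)
        if not chunk:
            break

        chunks.append(chunk)

    return b''.join(chunks)


# Usage against an in-memory stream standing in for a compression reader.
source = io.BytesIO(b'x' * 3000000)
data = read_all(source.read)
```

This is also why the new `read(size=-1)` path above can simply delegate to `readall()` for the "read everything" case.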
@@ -671,16 +875,67 b' class CompressionReader(object):' | |||
|
671 | 875 | |
|
672 | 876 | next = __next__ |
|
673 | 877 | |
|
878 | def _read_input(self): | |
|
879 | if self._finished_input: | |
|
880 | return | |
|
881 | ||
|
882 | if hasattr(self._source, 'read'): | |
|
883 | data = self._source.read(self._read_size) | |
|
884 | ||
|
885 | if not data: | |
|
886 | self._finished_input = True | |
|
887 | return | |
|
888 | ||
|
889 | self._source_buffer = ffi.from_buffer(data) | |
|
890 | self._in_buffer.src = self._source_buffer | |
|
891 | self._in_buffer.size = len(self._source_buffer) | |
|
892 | self._in_buffer.pos = 0 | |
|
893 | else: | |
|
894 | self._source_buffer = ffi.from_buffer(self._source) | |
|
895 | self._in_buffer.src = self._source_buffer | |
|
896 | self._in_buffer.size = len(self._source_buffer) | |
|
897 | self._in_buffer.pos = 0 | |
|
898 | ||
|
899 | def _compress_into_buffer(self, out_buffer): | |
|
900 | if self._in_buffer.pos >= self._in_buffer.size: | |
|
901 | return | |
|
902 | ||
|
903 | old_pos = out_buffer.pos | |
|
904 | ||
|
905 | zresult = lib.ZSTD_compressStream2(self._compressor._cctx, | |
|
906 | out_buffer, self._in_buffer, | |
|
907 | lib.ZSTD_e_continue) | |
|
908 | ||
|
909 | self._bytes_compressed += out_buffer.pos - old_pos | |
|
910 | ||
|
911 | if self._in_buffer.pos == self._in_buffer.size: | |
|
912 | self._in_buffer.src = ffi.NULL | |
|
913 | self._in_buffer.pos = 0 | |
|
914 | self._in_buffer.size = 0 | |
|
915 | self._source_buffer = None | |
|
916 | ||
|
917 | if not hasattr(self._source, 'read'): | |
|
918 | self._finished_input = True | |
|
919 | ||
|
920 | if lib.ZSTD_isError(zresult): | |
|
921 | raise ZstdError('zstd compress error: %s', | |
|
922 | _zstd_error(zresult)) | |
|
923 | ||
|
924 | return out_buffer.pos and out_buffer.pos == out_buffer.size | |
|
925 | ||
|
674 | 926 | def read(self, size=-1): |
|
675 | 927 | if self._closed: |
|
676 | 928 | raise ValueError('stream is closed') |
|
677 | 929 | |
|
678 | if self._finished_output: | |
|
930 | if size < -1: | |
|
931 | raise ValueError('cannot read negative amounts less than -1') | |
|
932 | ||
|
933 | if size == -1: | |
|
934 | return self.readall() | |
|
935 | ||
|
936 | if self._finished_output or size == 0: | |
|
679 | 937 | return b'' |
|
680 | 938 | |
|
681 | if size < 1: | |
|
682 | raise ValueError('cannot read negative or size 0 amounts') | |
|
683 | ||
|
684 | 939 | # Need a dedicated ref to dest buffer otherwise it gets collected. |
|
685 | 940 | dst_buffer = ffi.new('char[]', size) |
|
686 | 941 | out_buffer = ffi.new('ZSTD_outBuffer *') |
@@ -688,71 +943,21 b' class CompressionReader(object):' | |||
|
688 | 943 | out_buffer.size = size |
|
689 | 944 | out_buffer.pos = 0 |
|
690 | 945 | |
|
691 |
|
|
|
692 | if self._in_buffer.pos >= self._in_buffer.size: | |
|
693 | return | |
|
694 | ||
|
695 | old_pos = out_buffer.pos | |
|
696 | ||
|
697 | zresult = lib.ZSTD_compress_generic(self._compressor._cctx, | |
|
698 | out_buffer, self._in_buffer, | |
|
699 | lib.ZSTD_e_continue) | |
|
700 | ||
|
701 | self._bytes_compressed += out_buffer.pos - old_pos | |
|
702 | ||
|
703 | if self._in_buffer.pos == self._in_buffer.size: | |
|
704 | self._in_buffer.src = ffi.NULL | |
|
705 | self._in_buffer.pos = 0 | |
|
706 | self._in_buffer.size = 0 | |
|
707 | self._source_buffer = None | |
|
708 | ||
|
709 | if not hasattr(self._source, 'read'): | |
|
710 | self._finished_input = True | |
|
711 | ||
|
712 | if lib.ZSTD_isError(zresult): | |
|
713 | raise ZstdError('zstd compress error: %s', | |
|
714 | _zstd_error(zresult)) | |
|
715 | ||
|
716 | if out_buffer.pos and out_buffer.pos == out_buffer.size: | |
|
717 | return ffi.buffer(out_buffer.dst, out_buffer.pos)[:] | |
|
718 | ||
|
719 | def get_input(): | |
|
720 | if self._finished_input: | |
|
721 | return | |
|
722 | ||
|
723 | if hasattr(self._source, 'read'): | |
|
724 | data = self._source.read(self._read_size) | |
|
725 | ||
|
726 | if not data: | |
|
727 | self._finished_input = True | |
|
728 | return | |
|
729 | ||
|
730 | self._source_buffer = ffi.from_buffer(data) | |
|
731 | self._in_buffer.src = self._source_buffer | |
|
732 | self._in_buffer.size = len(self._source_buffer) | |
|
733 | self._in_buffer.pos = 0 | |
|
734 | else: | |
|
735 | self._source_buffer = ffi.from_buffer(self._source) | |
|
736 | self._in_buffer.src = self._source_buffer | |
|
737 | self._in_buffer.size = len(self._source_buffer) | |
|
738 | self._in_buffer.pos = 0 | |
|
739 | ||
|
740 | result = compress_input() | |
|
741 | if result: | |
|
742 | return result | |
|
946 | if self._compress_into_buffer(out_buffer): | |
|
947 | return ffi.buffer(out_buffer.dst, out_buffer.pos)[:] | |
|
743 | 948 | |
|
744 | 949 | while not self._finished_input: |
|
745 |
|
|
|
746 | result = compress_input() | |
|
747 | if result: | |
|
748 | return result | |
|
950 | self._read_input() | |
|
951 | ||
|
952 | if self._compress_into_buffer(out_buffer): | |
|
953 | return ffi.buffer(out_buffer.dst, out_buffer.pos)[:] | |
|
749 | 954 | |
|
750 | 955 | # EOF |
|
751 | 956 | old_pos = out_buffer.pos |
|
752 | 957 | |
|
753 | zresult = lib.ZSTD_compress_generic(self._compressor._cctx, | |
|
754 | out_buffer, self._in_buffer, | |
|
755 | lib.ZSTD_e_end) | |
|
958 | zresult = lib.ZSTD_compressStream2(self._compressor._cctx, | |
|
959 | out_buffer, self._in_buffer, | |
|
960 | lib.ZSTD_e_end) | |
|
756 | 961 | |
|
757 | 962 | self._bytes_compressed += out_buffer.pos - old_pos |
|
758 | 963 | |
@@ -765,6 +970,159 b' class CompressionReader(object):' | |||
|
765 | 970 | |
|
766 | 971 | return ffi.buffer(out_buffer.dst, out_buffer.pos)[:] |
|
767 | 972 | |
|
973 | def read1(self, size=-1): | |
|
974 | if self._closed: | |
|
975 | raise ValueError('stream is closed') | |
|
976 | ||
|
977 | if size < -1: | |
|
978 | raise ValueError('cannot read negative amounts less than -1') | |
|
979 | ||
|
980 | if self._finished_output or size == 0: | |
|
981 | return b'' | |
|
982 | ||
|
983 | # -1 returns arbitrary number of bytes. | |
|
984 | if size == -1: | |
|
985 | size = COMPRESSION_RECOMMENDED_OUTPUT_SIZE | |
|
986 | ||
|
987 | dst_buffer = ffi.new('char[]', size) | |
|
988 | out_buffer = ffi.new('ZSTD_outBuffer *') | |
|
989 | out_buffer.dst = dst_buffer | |
|
990 | out_buffer.size = size | |
|
991 | out_buffer.pos = 0 | |
|
992 | ||
|
993 | # read1() dictates that we can perform at most 1 call to the | |
|
994 | # underlying stream to get input. However, we can't satisfy this | |
|
995 | # restriction with compression because not all input generates output. | |
|
996 | # It is possible to perform a block flush in order to ensure output. | |
|
997 | # But this may not be desirable behavior. So we allow multiple read() | |
|
998 | # to the underlying stream. But unlike read(), we stop once we have | |
|
999 | # any output. | |
|
1000 | ||
|
1001 | self._compress_into_buffer(out_buffer) | |
|
1002 | if out_buffer.pos: | |
|
1003 | return ffi.buffer(out_buffer.dst, out_buffer.pos)[:] | |
|
1004 | ||
|
1005 | while not self._finished_input: | |
|
1006 | self._read_input() | |
|
1007 | ||
|
1008 | # If we've filled the output buffer, return immediately. | |
|
1009 | if self._compress_into_buffer(out_buffer): | |
|
1010 | return ffi.buffer(out_buffer.dst, out_buffer.pos)[:] | |
|
1011 | ||
|
1012 | # If we've populated the output buffer and we're not at EOF, | |
|
1013 | # also return, as we've satisfied the read1() limits. | |
|
1014 | if out_buffer.pos and not self._finished_input: | |
|
1015 | return ffi.buffer(out_buffer.dst, out_buffer.pos)[:] | |
|
1016 | ||
|
1017 | # Else if we're at EOS and we have room left in the buffer, | |
|
1018 | # fall through to below and try to add more data to the output. | |
|
1019 | ||
|
1020 | # EOF. | |
|
1021 | old_pos = out_buffer.pos | |
|
1022 | ||
|
1023 | zresult = lib.ZSTD_compressStream2(self._compressor._cctx, | |
|
1024 | out_buffer, self._in_buffer, | |
|
1025 | lib.ZSTD_e_end) | |
|
1026 | ||
|
1027 | self._bytes_compressed += out_buffer.pos - old_pos | |
|
1028 | ||
|
1029 | if lib.ZSTD_isError(zresult): | |
|
1030 | raise ZstdError('error ending compression stream: %s' % | |
|
1031 | _zstd_error(zresult)) | |
|
1032 | ||
|
1033 | if zresult == 0: | |
|
1034 | self._finished_output = True | |
|
1035 | ||
|
1036 | return ffi.buffer(out_buffer.dst, out_buffer.pos)[:] | |
|
1037 | ||
|
1038 | def readinto(self, b): | |
|
1039 | if self._closed: | |
|
1040 | raise ValueError('stream is closed') | |
|
1041 | ||
|
1042 | if self._finished_output: | |
|
1043 | return 0 | |
|
1044 | ||
|
1045 | # TODO use writable=True once we require CFFI >= 1.12. | |
|
1046 | dest_buffer = ffi.from_buffer(b) | |
|
1047 | ffi.memmove(b, b'', 0) | |
|
1048 | out_buffer = ffi.new('ZSTD_outBuffer *') | |
|
1049 | out_buffer.dst = dest_buffer | |
|
1050 | out_buffer.size = len(dest_buffer) | |
|
1051 | out_buffer.pos = 0 | |
|
1052 | ||
|
1053 | if self._compress_into_buffer(out_buffer): | |
|
1054 | return out_buffer.pos | |
|
1055 | ||
|
1056 | while not self._finished_input: | |
|
1057 | self._read_input() | |
|
1058 | if self._compress_into_buffer(out_buffer): | |
|
1059 | return out_buffer.pos | |
|
1060 | ||
|
1061 | # EOF. | |
|
1062 | old_pos = out_buffer.pos | |
|
1063 | zresult = lib.ZSTD_compressStream2(self._compressor._cctx, | |
|
1064 | out_buffer, self._in_buffer, | |
|
1065 | lib.ZSTD_e_end) | |
|
1066 | ||
|
1067 | self._bytes_compressed += out_buffer.pos - old_pos | |
|
1068 | ||
|
1069 | if lib.ZSTD_isError(zresult): | |
|
1070 | raise ZstdError('error ending compression stream: %s' % | |
|
1071 | _zstd_error(zresult)) | |
|
1072 | ||
|
1073 | if zresult == 0: | |
|
1074 | self._finished_output = True | |
|
1075 | ||
|
1076 | return out_buffer.pos | |
|
1077 | ||
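The `readinto()` implementation above follows the standard `io` contract: fill a caller-supplied writable buffer and return the number of bytes written, instead of allocating a new bytes object. A minimal stdlib sketch of that contract (independent of zstd):

```python
import io

# readinto() fills a caller-provided writable buffer and returns the
# number of bytes written, rather than returning a new bytes object.
buf = bytearray(8)
stream = io.BytesIO(b'zstd')
n = stream.readinto(buf)

assert n == 4
assert bytes(buf[:n]) == b'zstd'
```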
|
1078 | def readinto1(self, b): | |
|
1079 | if self._closed: | |
|
1080 | raise ValueError('stream is closed') | |
|
1081 | ||
|
1082 | if self._finished_output: | |
|
1083 | return 0 | |
|
1084 | ||
|
1085 | # TODO use writable=True once we require CFFI >= 1.12. | |
|
1086 | dest_buffer = ffi.from_buffer(b) | |
|
1087 | ffi.memmove(b, b'', 0) | |
|
1088 | ||
|
1089 | out_buffer = ffi.new('ZSTD_outBuffer *') | |
|
1090 | out_buffer.dst = dest_buffer | |
|
1091 | out_buffer.size = len(dest_buffer) | |
|
1092 | out_buffer.pos = 0 | |
|
1093 | ||
|
1094 | self._compress_into_buffer(out_buffer) | |
|
1095 | if out_buffer.pos: | |
|
1096 | return out_buffer.pos | |
|
1097 | ||
|
1098 | while not self._finished_input: | |
|
1099 | self._read_input() | |
|
1100 | ||
|
1101 | if self._compress_into_buffer(out_buffer): | |
|
1102 | return out_buffer.pos | |
|
1103 | ||
|
1104 | if out_buffer.pos and not self._finished_input: | |
|
1105 | return out_buffer.pos | |
|
1106 | ||
|
1107 | # EOF. | |
|
1108 | old_pos = out_buffer.pos | |
|
1109 | ||
|
1110 | zresult = lib.ZSTD_compressStream2(self._compressor._cctx, | |
|
1111 | out_buffer, self._in_buffer, | |
|
1112 | lib.ZSTD_e_end) | |
|
1113 | ||
|
1114 | self._bytes_compressed += out_buffer.pos - old_pos | |
|
1115 | ||
|
1116 | if lib.ZSTD_isError(zresult): | |
|
1117 | raise ZstdError('error ending compression stream: %s' % | |
|
1118 | _zstd_error(zresult)) | |
|
1119 | ||
|
1120 | if zresult == 0: | |
|
1121 | self._finished_output = True | |
|
1122 | ||
|
1123 | return out_buffer.pos | |
|
1124 | ||
|
1125 | ||
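The `read1()`/`readinto1()` comments above mirror the `io.BufferedIOBase` contract: `read1()` should perform at most one call to the underlying stream and may therefore return fewer bytes than requested, while `read()` keeps reading until the request is satisfied or EOF. A stdlib-only illustration of that contract (the `ChunkedSource` class is a made-up example source, not part of the bindings):

```python
import io

class ChunkedSource(io.RawIOBase):
    """Raw stream that returns at most 4 bytes per read call."""
    def __init__(self, data):
        self._data = data
        self._pos = 0

    def readable(self):
        return True

    def readinto(self, b):
        chunk = self._data[self._pos:self._pos + min(len(b), 4)]
        b[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)

reader = io.BufferedReader(ChunkedSource(b'0123456789'))

# read1() makes at most one call on the raw stream, so it can return
# fewer bytes than requested.
first = reader.read1(8)
# read() keeps calling the raw stream until satisfied or EOF.
rest = reader.read(8)

assert first == b'0123'
assert rest == b'456789'
```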
|
768 | 1126 | class ZstdCompressor(object): |
|
769 | 1127 | def __init__(self, level=3, dict_data=None, compression_params=None, |
|
770 | 1128 | write_checksum=None, write_content_size=None, |
@@ -803,25 +1161,25 b' class ZstdCompressor(object):' | |||
|
803 | 1161 | self._params = ffi.gc(params, lib.ZSTD_freeCCtxParams) |
|
804 | 1162 | |
|
805 | 1163 | _set_compression_parameter(self._params, |
|
806 | lib.ZSTD_p_compressionLevel, | |
|
1164 | lib.ZSTD_c_compressionLevel, | |
|
807 | 1165 | level) |
|
808 | 1166 | |
|
809 | 1167 | _set_compression_parameter( |
|
810 | 1168 | self._params, |
|
811 | lib.ZSTD_p_contentSizeFlag, | |
|
1169 | lib.ZSTD_c_contentSizeFlag, | |
|
812 | 1170 | write_content_size if write_content_size is not None else 1) |
|
813 | 1171 | |
|
814 | 1172 | _set_compression_parameter(self._params, |
|
815 | lib.ZSTD_p_checksumFlag, | |
|
1173 | lib.ZSTD_c_checksumFlag, | |
|
816 | 1174 | 1 if write_checksum else 0) |
|
817 | 1175 | |
|
818 | 1176 | _set_compression_parameter(self._params, |
|
819 | lib.ZSTD_p_dictIDFlag, | |
|
1177 | lib.ZSTD_c_dictIDFlag, | |
|
820 | 1178 | 1 if write_dict_id else 0) |
|
821 | 1179 | |
|
822 | 1180 | if threads: |
|
823 | 1181 | _set_compression_parameter(self._params, |
|
824 | lib.ZSTD_p_nbWorkers, | |
|
1182 | lib.ZSTD_c_nbWorkers, | |
|
825 | 1183 | threads) |
|
826 | 1184 | |
|
827 | 1185 | cctx = lib.ZSTD_createCCtx() |
@@ -864,7 +1222,7 b' class ZstdCompressor(object):' | |||
|
864 | 1222 | return lib.ZSTD_sizeof_CCtx(self._cctx) |
|
865 | 1223 | |
|
866 | 1224 | def compress(self, data): |
|
867 | lib.ZSTD_CCtx_reset(self._cctx) | |
|
1225 | lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only) | |
|
868 | 1226 | |
|
869 | 1227 | data_buffer = ffi.from_buffer(data) |
|
870 | 1228 | |
@@ -887,10 +1245,10 b' class ZstdCompressor(object):' | |||
|
887 | 1245 | in_buffer.size = len(data_buffer) |
|
888 | 1246 | in_buffer.pos = 0 |
|
889 | 1247 | |
|
890 | zresult = lib.ZSTD_compress_generic(self._cctx, | |

891 | out_buffer, | |

892 | in_buffer, | |

893 | lib.ZSTD_e_end) | |
|
1248 | zresult = lib.ZSTD_compressStream2(self._cctx, | |
|
1249 | out_buffer, | |
|
1250 | in_buffer, | |
|
1251 | lib.ZSTD_e_end) | |
|
894 | 1252 | |
|
895 | 1253 | if lib.ZSTD_isError(zresult): |
|
896 | 1254 | raise ZstdError('cannot compress: %s' % |
@@ -901,7 +1259,7 b' class ZstdCompressor(object):' | |||
|
901 | 1259 | return ffi.buffer(out, out_buffer.pos)[:] |
|
902 | 1260 | |
|
903 | 1261 | def compressobj(self, size=-1): |
|
904 | lib.ZSTD_CCtx_reset(self._cctx) | |
|
1262 | lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only) | |
|
905 | 1263 | |
|
906 | 1264 | if size < 0: |
|
907 | 1265 | size = lib.ZSTD_CONTENTSIZE_UNKNOWN |
@@ -923,7 +1281,7 b' class ZstdCompressor(object):' | |||
|
923 | 1281 | return cobj |
|
924 | 1282 | |
|
925 | 1283 | def chunker(self, size=-1, chunk_size=COMPRESSION_RECOMMENDED_OUTPUT_SIZE): |
|
926 | lib.ZSTD_CCtx_reset(self._cctx) | |
|
1284 | lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only) | |
|
927 | 1285 | |
|
928 | 1286 | if size < 0: |
|
929 | 1287 | size = lib.ZSTD_CONTENTSIZE_UNKNOWN |
@@ -944,7 +1302,7 b' class ZstdCompressor(object):' | |||
|
944 | 1302 | if not hasattr(ofh, 'write'): |
|
945 | 1303 | raise ValueError('second argument must have a write() method') |
|
946 | 1304 | |
|
947 | lib.ZSTD_CCtx_reset(self._cctx) | |
|
1305 | lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only) | |
|
948 | 1306 | |
|
949 | 1307 | if size < 0: |
|
950 | 1308 | size = lib.ZSTD_CONTENTSIZE_UNKNOWN |
@@ -976,10 +1334,10 b' class ZstdCompressor(object):' | |||
|
976 | 1334 | in_buffer.pos = 0 |
|
977 | 1335 | |
|
978 | 1336 | while in_buffer.pos < in_buffer.size: |
|
979 | zresult = lib.ZSTD_compress_generic(self._cctx, | |

980 | out_buffer, | |

981 | in_buffer, | |

982 | lib.ZSTD_e_continue) | |
|
1337 | zresult = lib.ZSTD_compressStream2(self._cctx, | |
|
1338 | out_buffer, | |
|
1339 | in_buffer, | |
|
1340 | lib.ZSTD_e_continue) | |
|
983 | 1341 | if lib.ZSTD_isError(zresult): |
|
984 | 1342 | raise ZstdError('zstd compress error: %s' % |
|
985 | 1343 | _zstd_error(zresult)) |
@@ -991,10 +1349,10 b' class ZstdCompressor(object):' | |||
|
991 | 1349 | |
|
992 | 1350 | # We've finished reading. Flush the compressor. |
|
993 | 1351 | while True: |
|
994 | zresult = lib.ZSTD_compress_generic(self._cctx, | |

995 | out_buffer, | |

996 | in_buffer, | |

997 | lib.ZSTD_e_end) | |
|
1352 | zresult = lib.ZSTD_compressStream2(self._cctx, | |
|
1353 | out_buffer, | |
|
1354 | in_buffer, | |
|
1355 | lib.ZSTD_e_end) | |
|
998 | 1356 | if lib.ZSTD_isError(zresult): |
|
999 | 1357 | raise ZstdError('error ending compression stream: %s' % |
|
1000 | 1358 | _zstd_error(zresult)) |
@@ -1011,7 +1369,7 b' class ZstdCompressor(object):' | |||
|
1011 | 1369 | |
|
1012 | 1370 | def stream_reader(self, source, size=-1, |
|
1013 | 1371 | read_size=COMPRESSION_RECOMMENDED_INPUT_SIZE): |
|
1014 | lib.ZSTD_CCtx_reset(self._cctx) | |
|
1372 | lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only) | |
|
1015 | 1373 | |
|
1016 | 1374 | try: |
|
1017 | 1375 | size = len(source) |
@@ -1026,20 +1384,22 b' class ZstdCompressor(object):' | |||
|
1026 | 1384 | raise ZstdError('error setting source size: %s' % |
|
1027 | 1385 | _zstd_error(zresult)) |
|
1028 | 1386 | |
|
1029 | return CompressionReader(self, source, read_size) | |
|
1387 | return ZstdCompressionReader(self, source, read_size) | |
|
1030 | 1388 | |
|
1031 | 1389 | def stream_writer(self, writer, size=-1, |
|
1032 | write_size=COMPRESSION_RECOMMENDED_OUTPUT_SIZE): | |
|
1390 | write_size=COMPRESSION_RECOMMENDED_OUTPUT_SIZE, | |
|
1391 | write_return_read=False): | |
|
1033 | 1392 | |
|
1034 | 1393 | if not hasattr(writer, 'write'): |
|
1035 | 1394 | raise ValueError('must pass an object with a write() method') |
|
1036 | 1395 | |
|
1037 | lib.ZSTD_CCtx_reset(self._cctx) | |
|
1396 | lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only) | |
|
1038 | 1397 | |
|
1039 | 1398 | if size < 0: |
|
1040 | 1399 | size = lib.ZSTD_CONTENTSIZE_UNKNOWN |
|
1041 | 1400 | |
|
1042 | return ZstdCompressionWriter(self, writer, size, write_size) | |
|
1401 | return ZstdCompressionWriter(self, writer, size, write_size, | |
|
1402 | write_return_read) | |
|
1043 | 1403 | |
|
1044 | 1404 | write_to = stream_writer |
|
1045 | 1405 | |
@@ -1056,7 +1416,7 b' class ZstdCompressor(object):' | |||
|
1056 | 1416 | raise ValueError('must pass an object with a read() method or ' |
|
1057 | 1417 | 'conforms to buffer protocol') |
|
1058 | 1418 | |
|
1059 | lib.ZSTD_CCtx_reset(self._cctx) | |
|
1419 | lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only) | |
|
1060 | 1420 | |
|
1061 | 1421 | if size < 0: |
|
1062 | 1422 | size = lib.ZSTD_CONTENTSIZE_UNKNOWN |
@@ -1104,8 +1464,8 b' class ZstdCompressor(object):' | |||
|
1104 | 1464 | in_buffer.pos = 0 |
|
1105 | 1465 | |
|
1106 | 1466 | while in_buffer.pos < in_buffer.size: |
|
1107 | zresult = lib.ZSTD_compress_generic(self._cctx, out_buffer, in_buffer, | |

1108 | lib.ZSTD_e_continue) | |
|
1467 | zresult = lib.ZSTD_compressStream2(self._cctx, out_buffer, in_buffer, | |
|
1468 | lib.ZSTD_e_continue) | |
|
1109 | 1469 | if lib.ZSTD_isError(zresult): |
|
1110 | 1470 | raise ZstdError('zstd compress error: %s' % |
|
1111 | 1471 | _zstd_error(zresult)) |
@@ -1124,10 +1484,10 b' class ZstdCompressor(object):' | |||
|
1124 | 1484 | # remains. |
|
1125 | 1485 | while True: |
|
1126 | 1486 | assert out_buffer.pos == 0 |
|
1127 | zresult = lib.ZSTD_compress_generic(self._cctx, | |

1128 | out_buffer, | |

1129 | in_buffer, | |

1130 | lib.ZSTD_e_end) | |
|
1487 | zresult = lib.ZSTD_compressStream2(self._cctx, | |
|
1488 | out_buffer, | |
|
1489 | in_buffer, | |
|
1490 | lib.ZSTD_e_end) | |
|
1131 | 1491 | if lib.ZSTD_isError(zresult): |
|
1132 | 1492 | raise ZstdError('error ending compression stream: %s' % |
|
1133 | 1493 | _zstd_error(zresult)) |
@@ -1234,7 +1594,7 b' class ZstdCompressionDict(object):' | |||
|
1234 | 1594 | cparams = ffi.new('ZSTD_compressionParameters') |
|
1235 | 1595 | cparams.chainLog = compression_params.chain_log |
|
1236 | 1596 | cparams.hashLog = compression_params.hash_log |
|
1237 | cparams.searchLength = compression_params.min_match | |
|
1597 | cparams.minMatch = compression_params.min_match | |
|
1238 | 1598 | cparams.searchLog = compression_params.search_log |
|
1239 | 1599 | cparams.strategy = compression_params.compression_strategy |
|
1240 | 1600 | cparams.targetLength = compression_params.target_length |
@@ -1345,6 +1705,10 b' class ZstdDecompressionObj(object):' | |||
|
1345 | 1705 | out_buffer = ffi.new('ZSTD_outBuffer *') |
|
1346 | 1706 | |
|
1347 | 1707 | data_buffer = ffi.from_buffer(data) |
|
1708 | ||
|
1709 | if len(data_buffer) == 0: | |
|
1710 | return b'' | |
|
1711 | ||
|
1348 | 1712 | in_buffer.src = data_buffer |
|
1349 | 1713 | in_buffer.size = len(data_buffer) |
|
1350 | 1714 | in_buffer.pos = 0 |
@@ -1357,8 +1721,8 b' class ZstdDecompressionObj(object):' | |||
|
1357 | 1721 | chunks = [] |
|
1358 | 1722 | |
|
1359 | 1723 | while True: |
|
1360 | zresult = lib.ZSTD_decompress_generic(self._decompressor._dctx, | |

1361 | out_buffer, in_buffer) | |
|
1724 | zresult = lib.ZSTD_decompressStream(self._decompressor._dctx, | |
|
1725 | out_buffer, in_buffer) | |
|
1362 | 1726 | if lib.ZSTD_isError(zresult): |
|
1363 | 1727 | raise ZstdError('zstd decompressor error: %s' % |
|
1364 | 1728 | _zstd_error(zresult)) |
@@ -1378,12 +1742,16 b' class ZstdDecompressionObj(object):' | |||
|
1378 | 1742 | |
|
1379 | 1743 | return b''.join(chunks) |
|
1380 | 1744 | |
|
1381 | ||
|
1382 | class DecompressionReader(object): | |
|
1383 | def __init__(self, decompressor, source, read_size): | |
|
1745 | def flush(self, length=0): | |
|
1746 | pass | |
|
1747 | ||
|
1748 | ||
|
1749 | class ZstdDecompressionReader(object): | |
|
1750 | def __init__(self, decompressor, source, read_size, read_across_frames): | |
|
1384 | 1751 | self._decompressor = decompressor |
|
1385 | 1752 | self._source = source |
|
1386 | 1753 | self._read_size = read_size |
|
1754 | self._read_across_frames = bool(read_across_frames) | |
|
1387 | 1755 | self._entered = False |
|
1388 | 1756 | self._closed = False |
|
1389 | 1757 | self._bytes_decompressed = 0 |
@@ -1418,10 +1786,10 b' class DecompressionReader(object):' | |||
|
1418 | 1786 | return True |
|
1419 | 1787 | |
|
1420 | 1788 | def readline(self): |
|
1421 | raise NotImplementedError() | |
|
1789 | raise io.UnsupportedOperation() | |
|
1422 | 1790 | |
|
1423 | 1791 | def readlines(self): |
|
1424 | raise NotImplementedError() | |
|
1792 | raise io.UnsupportedOperation() | |
|
1425 | 1793 | |
|
1426 | 1794 | def write(self, data): |
|
1427 | 1795 | raise io.UnsupportedOperation() |
@@ -1447,25 +1815,158 b' class DecompressionReader(object):' | |||
|
1447 | 1815 | return self._bytes_decompressed |
|
1448 | 1816 | |
|
1449 | 1817 | def readall(self): |
|
1450 | raise NotImplementedError() | |
|
1818 | chunks = [] | |
|
1819 | ||
|
1820 | while True: | |
|
1821 | chunk = self.read(1048576) | |
|
1822 | if not chunk: | |
|
1823 | break | |
|
1824 | ||
|
1825 | chunks.append(chunk) | |
|
1826 | ||
|
1827 | return b''.join(chunks) | |
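The `readall()` added above drains the stream through repeated fixed-size reads and joins the chunks at the end. The same pattern works for any `read(size)` callable; a stdlib-only sketch (the 1048576 chunk size matches the code above):

```python
import io

def readall(read, chunk_size=1048576):
    """Drain a read(size) callable into a single bytes object."""
    chunks = []
    while True:
        chunk = read(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
    return b''.join(chunks)

src = io.BytesIO(b'abc' * 1000)
assert readall(src.read) == b'abc' * 1000
```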
|
1451 | 1828 | |
|
1452 | 1829 | def __iter__(self): |
|
1453 | raise NotImplementedError() | |
|
1830 | raise io.UnsupportedOperation() | |
|
1454 | 1831 | |
|
1455 | 1832 | def __next__(self): |
|
1456 | raise NotImplementedError() | |
|
1833 | raise io.UnsupportedOperation() | |
|
1457 | 1834 | |
|
1458 | 1835 | next = __next__ |
|
1459 | 1836 | |
|
1460 | def read(self, size): | |
|
1837 | def _read_input(self): | |
|
1838 | # We have data left over in the input buffer. Use it. | |
|
1839 | if self._in_buffer.pos < self._in_buffer.size: | |
|
1840 | return | |
|
1841 | ||
|
1842 | # All input data exhausted. Nothing to do. | |
|
1843 | if self._finished_input: | |
|
1844 | return | |
|
1845 | ||
|
1846 | # Else populate the input buffer from our source. | |
|
1847 | if hasattr(self._source, 'read'): | |
|
1848 | data = self._source.read(self._read_size) | |
|
1849 | ||
|
1850 | if not data: | |
|
1851 | self._finished_input = True | |
|
1852 | return | |
|
1853 | ||
|
1854 | self._source_buffer = ffi.from_buffer(data) | |
|
1855 | self._in_buffer.src = self._source_buffer | |
|
1856 | self._in_buffer.size = len(self._source_buffer) | |
|
1857 | self._in_buffer.pos = 0 | |
|
1858 | else: | |
|
1859 | self._source_buffer = ffi.from_buffer(self._source) | |
|
1860 | self._in_buffer.src = self._source_buffer | |
|
1861 | self._in_buffer.size = len(self._source_buffer) | |
|
1862 | self._in_buffer.pos = 0 | |
|
1863 | ||
|
1864 | def _decompress_into_buffer(self, out_buffer): | |
|
1865 | """Decompress available input into an output buffer. | |
|
1866 | ||
|
1867 | Returns True if data in output buffer should be emitted. | |
|
1868 | """ | |
|
1869 | zresult = lib.ZSTD_decompressStream(self._decompressor._dctx, | |
|
1870 | out_buffer, self._in_buffer) | |
|
1871 | ||
|
1872 | if self._in_buffer.pos == self._in_buffer.size: | |
|
1873 | self._in_buffer.src = ffi.NULL | |
|
1874 | self._in_buffer.pos = 0 | |
|
1875 | self._in_buffer.size = 0 | |
|
1876 | self._source_buffer = None | |
|
1877 | ||
|
1878 | if not hasattr(self._source, 'read'): | |
|
1879 | self._finished_input = True | |
|
1880 | ||
|
1881 | if lib.ZSTD_isError(zresult): | |
|
1882 | raise ZstdError('zstd decompress error: %s' % | |
|
1883 | _zstd_error(zresult)) | |
|
1884 | ||
|
1885 | # Emit data if there is data AND either: | |
|
1886 | # a) output buffer is full (read amount is satisfied) | |
|
1887 | # b) we're at end of a frame and not in frame spanning mode | |
|
1888 | return (out_buffer.pos and | |
|
1889 | (out_buffer.pos == out_buffer.size or | |
|
1890 | zresult == 0 and not self._read_across_frames)) | |
|
1891 | ||
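The return expression of `_decompress_into_buffer()` above packs the emit decision into one boolean. Pulled out as a standalone predicate (a hypothetical helper for illustration, not part of the bindings), the logic reads:

```python
def should_emit(pos, size, zresult, read_across_frames):
    # Emit buffered output when there is any data AND either:
    #  a) the output buffer is full (the read amount is satisfied), or
    #  b) a frame just ended (zresult == 0) and we are not reading
    #     across frame boundaries.
    return bool(pos and
                (pos == size or
                 (zresult == 0 and not read_across_frames)))

assert not should_emit(0, 16, 1, False)   # no data yet
assert should_emit(16, 16, 1, False)      # output buffer full
assert should_emit(4, 16, 0, False)       # frame end, stop here
assert not should_emit(4, 16, 0, True)    # frame end, keep spanning
```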
|
1892 | def read(self, size=-1): | |
|
1893 | if self._closed: | |
|
1894 | raise ValueError('stream is closed') | |
|
1895 | ||
|
1896 | if size < -1: | |
|
1897 | raise ValueError('cannot read negative amounts less than -1') | |
|
1898 | ||
|
1899 | if size == -1: | |
|
1900 | # This is recursive. But it gets the job done. | |
|
1901 | return self.readall() | |
|
1902 | ||
|
1903 | if self._finished_output or size == 0: | |
|
1904 | return b'' | |
|
1905 | ||
|
1906 | # We /could/ call into readinto() here. But that introduces more | |
|
1907 | # overhead. | |
|
1908 | dst_buffer = ffi.new('char[]', size) | |
|
1909 | out_buffer = ffi.new('ZSTD_outBuffer *') | |
|
1910 | out_buffer.dst = dst_buffer | |
|
1911 | out_buffer.size = size | |
|
1912 | out_buffer.pos = 0 | |
|
1913 | ||
|
1914 | self._read_input() | |
|
1915 | if self._decompress_into_buffer(out_buffer): | |
|
1916 | self._bytes_decompressed += out_buffer.pos | |
|
1917 | return ffi.buffer(out_buffer.dst, out_buffer.pos)[:] | |
|
1918 | ||
|
1919 | while not self._finished_input: | |
|
1920 | self._read_input() | |
|
1921 | if self._decompress_into_buffer(out_buffer): | |
|
1922 | self._bytes_decompressed += out_buffer.pos | |
|
1923 | return ffi.buffer(out_buffer.dst, out_buffer.pos)[:] | |
|
1924 | ||
|
1925 | self._bytes_decompressed += out_buffer.pos | |
|
1926 | return ffi.buffer(out_buffer.dst, out_buffer.pos)[:] | |
|
1927 | ||
|
1928 | def readinto(self, b): | |
|
1461 | 1929 | if self._closed: |
|
1462 | 1930 | raise ValueError('stream is closed') |
|
1463 | 1931 | |
|
1464 | 1932 | if self._finished_output: |
|
1933 | return 0 | |
|
1934 | ||
|
1935 | # TODO use writable=True once we require CFFI >= 1.12. | |
|
1936 | dest_buffer = ffi.from_buffer(b) | |
|
1937 | ffi.memmove(b, b'', 0) | |
|
1938 | out_buffer = ffi.new('ZSTD_outBuffer *') | |
|
1939 | out_buffer.dst = dest_buffer | |
|
1940 | out_buffer.size = len(dest_buffer) | |
|
1941 | out_buffer.pos = 0 | |
|
1942 | ||
|
1943 | self._read_input() | |
|
1944 | if self._decompress_into_buffer(out_buffer): | |
|
1945 | self._bytes_decompressed += out_buffer.pos | |
|
1946 | return out_buffer.pos | |
|
1947 | ||
|
1948 | while not self._finished_input: | |
|
1949 | self._read_input() | |
|
1950 | if self._decompress_into_buffer(out_buffer): | |
|
1951 | self._bytes_decompressed += out_buffer.pos | |
|
1952 | return out_buffer.pos | |
|
1953 | ||
|
1954 | self._bytes_decompressed += out_buffer.pos | |
|
1955 | return out_buffer.pos | |
|
1956 | ||
|
1957 | def read1(self, size=-1): | |
|
1958 | if self._closed: | |
|
1959 | raise ValueError('stream is closed') | |
|
1960 | ||
|
1961 | if size < -1: | |
|
1962 | raise ValueError('cannot read negative amounts less than -1') | |
|
1963 | ||
|
1964 | if self._finished_output or size == 0: | |
|
1465 | 1965 | return b'' |
|
1466 | 1966 | |
|
1467 | if size < 1: | |
|
1468 | raise ValueError('cannot read negative or size 0 amounts') | |
|
1967 | # -1 returns arbitrary number of bytes. | |
|
1968 | if size == -1: | |
|
1969 | size = DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE | |
|
1469 | 1970 | |
|
1470 | 1971 | dst_buffer = ffi.new('char[]', size) |
|
1471 | 1972 | out_buffer = ffi.new('ZSTD_outBuffer *') |
@@ -1473,64 +1974,46 b' class DecompressionReader(object):' | |||
|
1473 | 1974 | out_buffer.size = size |
|
1474 | 1975 | out_buffer.pos = 0 |
|
1475 | 1976 | |
|
1476 | def decompress(): | |
|
1477 | zresult = lib.ZSTD_decompress_generic(self._decompressor._dctx, | |
|
1478 | out_buffer, self._in_buffer) | |
|
1479 | ||
|
1480 | if self._in_buffer.pos == self._in_buffer.size: | |
|
1481 | self._in_buffer.src = ffi.NULL | |
|
1482 | self._in_buffer.pos = 0 | |
|
1483 | self._in_buffer.size = 0 | |
|
1484 | self._source_buffer = None | |
|
1485 | ||
|
1486 | if not hasattr(self._source, 'read'): | |
|
1487 | self._finished_input = True | |
|
1488 | ||
|
1489 | if lib.ZSTD_isError(zresult): | |
|
1490 | raise ZstdError('zstd decompress error: %s', | |
|
1491 | _zstd_error(zresult)) | |
|
1492 | elif zresult == 0: | |
|
1493 | self._finished_output = True | |
|
1494 | ||
|
1495 | if out_buffer.pos and out_buffer.pos == out_buffer.size: | |
|
1496 | self._bytes_decompressed += out_buffer.size | |
|
1497 | return ffi.buffer(out_buffer.dst, out_buffer.pos)[:] | |
|
1498 | ||
|
1499 | def get_input(): | |
|
1500 | if self._finished_input: | |
|
1501 | return | |
|
1502 | ||
|
1503 | if hasattr(self._source, 'read'): | |
|
1504 | data = self._source.read(self._read_size) | |
|
1505 | ||
|
1506 | if not data: | |
|
1507 | self._finished_input = True | |
|
1508 | return | |
|
1509 | ||
|
1510 | self._source_buffer = ffi.from_buffer(data) | |
|
1511 | self._in_buffer.src = self._source_buffer | |
|
1512 | self._in_buffer.size = len(self._source_buffer) | |
|
1513 | self._in_buffer.pos = 0 | |
|
1514 | else: | |
|
1515 | self._source_buffer = ffi.from_buffer(self._source) | |
|
1516 | self._in_buffer.src = self._source_buffer | |
|
1517 | self._in_buffer.size = len(self._source_buffer) | |
|
1518 | self._in_buffer.pos = 0 | |
|
1519 | ||
|
1520 | get_input() | |
|
1521 | result = decompress() | |
|
1522 | if result: | |
|
1523 | return result | |
|
1524 | ||
|
1977 | # read1() dictates that we can perform at most 1 call to underlying | |
|
1978 | # stream to get input. However, we can't satisfy this restriction with | |
|
1979 | # decompression because not all input generates output. So we allow | |
|
1980 | # multiple read(). But unlike read(), we stop once we have any output. | |
|
1525 | 1981 | while not self._finished_input: |
|
1526 | get_input() | |

1527 | result = decompress() | |
|
1528 | if result: | |
|
1529 | return result | |
|
1982 | self._read_input() | |
|
1983 | self._decompress_into_buffer(out_buffer) | |
|
1984 | ||
|
1985 | if out_buffer.pos: | |
|
1986 | break | |
|
1530 | 1987 | |
|
1531 | 1988 | self._bytes_decompressed += out_buffer.pos |
|
1532 | 1989 | return ffi.buffer(out_buffer.dst, out_buffer.pos)[:] |
|
1533 | 1990 | |
|
1991 | def readinto1(self, b): | |
|
1992 | if self._closed: | |
|
1993 | raise ValueError('stream is closed') | |
|
1994 | ||
|
1995 | if self._finished_output: | |
|
1996 | return 0 | |
|
1997 | ||
|
1998 | # TODO use writable=True once we require CFFI >= 1.12. | |
|
1999 | dest_buffer = ffi.from_buffer(b) | |
|
2000 | ffi.memmove(b, b'', 0) | |
|
2001 | ||
|
2002 | out_buffer = ffi.new('ZSTD_outBuffer *') | |
|
2003 | out_buffer.dst = dest_buffer | |
|
2004 | out_buffer.size = len(dest_buffer) | |
|
2005 | out_buffer.pos = 0 | |
|
2006 | ||
|
2007 | while not self._finished_input and not self._finished_output: | |
|
2008 | self._read_input() | |
|
2009 | self._decompress_into_buffer(out_buffer) | |
|
2010 | ||
|
2011 | if out_buffer.pos: | |
|
2012 | break | |
|
2013 | ||
|
2014 | self._bytes_decompressed += out_buffer.pos | |
|
2015 | return out_buffer.pos | |
|
2016 | ||
|
1534 | 2017 | def seek(self, pos, whence=os.SEEK_SET): |
|
1535 | 2018 | if self._closed: |
|
1536 | 2019 | raise ValueError('stream is closed') |
@@ -1569,34 +2052,108 b' class DecompressionReader(object):' | |||
|
1569 | 2052 | return self._bytes_decompressed |
|
1570 | 2053 | |
|
1571 | 2054 | class ZstdDecompressionWriter(object): |
|
1572 | def __init__(self, decompressor, writer, write_size): | |
|
2055 | def __init__(self, decompressor, writer, write_size, write_return_read): | |
|
2056 | decompressor._ensure_dctx() | |
|
2057 | ||
|
1573 | 2058 | self._decompressor = decompressor |
|
1574 | 2059 | self._writer = writer |
|
1575 | 2060 | self._write_size = write_size |
|
2061 | self._write_return_read = bool(write_return_read) | |
|
1576 | 2062 | self._entered = False |
|
2063 | self._closed = False | |
|
1577 | 2064 | |
|
1578 | 2065 | def __enter__(self): |
|
2066 | if self._closed: | |
|
2067 | raise ValueError('stream is closed') | |
|
2068 | ||
|
1579 | 2069 | if self._entered: |
|
1580 | 2070 | raise ZstdError('cannot __enter__ multiple times') |
|
1581 | 2071 | |
|
1582 | self._decompressor._ensure_dctx() | |
|
1583 | 2072 | self._entered = True |
|
1584 | 2073 | |
|
1585 | 2074 | return self |
|
1586 | 2075 | |
|
1587 | 2076 | def __exit__(self, exc_type, exc_value, exc_tb): |
|
1588 | 2077 | self._entered = False |
|
2078 | self.close() | |
|
1589 | 2079 | |
|
1590 | 2080 | def memory_size(self): |
|
1591 | if not self._decompressor._dctx: | |
|
1592 | raise ZstdError('cannot determine size of inactive decompressor ' | |
|
1593 | 'call when context manager is active') | |
|
1594 | ||
|
1595 | 2081 | return lib.ZSTD_sizeof_DCtx(self._decompressor._dctx) |
|
1596 | 2082 | |
|
2083 | def close(self): | |
|
2084 | if self._closed: | |
|
2085 | return | |
|
2086 | ||
|
2087 | try: | |
|
2088 | self.flush() | |
|
2089 | finally: | |
|
2090 | self._closed = True | |
|
2091 | ||
|
2092 | f = getattr(self._writer, 'close', None) | |
|
2093 | if f: | |
|
2094 | f() | |
|
2095 | ||
|
2096 | @property | |
|
2097 | def closed(self): | |
|
2098 | return self._closed | |
|
2099 | ||
|
2100 | def fileno(self): | |
|
2101 | f = getattr(self._writer, 'fileno', None) | |
|
2102 | if f: | |
|
2103 | return f() | |
|
2104 | else: | |
|
2105 | raise OSError('fileno not available on underlying writer') | |
|
2106 | ||
|
2107 | def flush(self): | |
|
2108 | if self._closed: | |
|
2109 | raise ValueError('stream is closed') | |
|
2110 | ||
|
2111 | f = getattr(self._writer, 'flush', None) | |
|
2112 | if f: | |
|
2113 | return f() | |
|
2114 | ||
|
2115 | def isatty(self): | |
|
2116 | return False | |
|
2117 | ||
|
2118 | def readable(self): | |
|
2119 | return False | |
|
2120 | ||
|
2121 | def readline(self, size=-1): | |
|
2122 | raise io.UnsupportedOperation() | |
|
2123 | ||
|
2124 | def readlines(self, hint=-1): | |
|
2125 | raise io.UnsupportedOperation() | |
|
2126 | ||
|
2127 | def seek(self, offset, whence=None): | |
|
2128 | raise io.UnsupportedOperation() | |
|
2129 | ||
|
2130 | def seekable(self): | |
|
2131 | return False | |
|
2132 | ||
|
2133 | def tell(self): | |
|
2134 | raise io.UnsupportedOperation() | |
|
2135 | ||
|
2136 | def truncate(self, size=None): | |
|
2137 | raise io.UnsupportedOperation() | |
|
2138 | ||
|
2139 | def writable(self): | |
|
2140 | return True | |
|
2141 | ||
|
2142 | def writelines(self, lines): | |
|
2143 | raise io.UnsupportedOperation() | |
|
2144 | ||
|
2145 | def read(self, size=-1): | |
|
2146 | raise io.UnsupportedOperation() | |
|
2147 | ||
|
2148 | def readall(self): | |
|
2149 | raise io.UnsupportedOperation() | |
|
2150 | ||
|
2151 | def readinto(self, b): | |
|
2152 | raise io.UnsupportedOperation() | |
|
2153 | ||
|
1597 | 2154 | def write(self, data): |
|
1598 | if not self._entered: | |
|
1599 | raise ZstdError('write must be called from an active context manager') | |
|
2155 | if self._closed: | |
|
2156 | raise ValueError('stream is closed') | |
|
1600 | 2157 | |
|
1601 | 2158 | total_write = 0 |
|
1602 | 2159 | |
@@ -1616,7 +2173,7 b' class ZstdDecompressionWriter(object):' | |||
|
1616 | 2173 | dctx = self._decompressor._dctx |
|
1617 | 2174 | |
|
1618 | 2175 | while in_buffer.pos < in_buffer.size: |
|
1619 | zresult = lib.ZSTD_decompress_generic(dctx, out_buffer, in_buffer) | |
|
2176 | zresult = lib.ZSTD_decompressStream(dctx, out_buffer, in_buffer) | |
|
1620 | 2177 | if lib.ZSTD_isError(zresult): |
|
1621 | 2178 | raise ZstdError('zstd decompress error: %s' % |
|
1622 | 2179 | _zstd_error(zresult)) |
@@ -1626,7 +2183,10 b' class ZstdDecompressionWriter(object):' | |||
|
1626 | 2183 | total_write += out_buffer.pos |
|
1627 | 2184 | out_buffer.pos = 0 |
|
1628 | 2185 | |
|
1629 | return total_write | |
|
2186 | if self._write_return_read: | |
|
2187 | return in_buffer.pos | |
|
2188 | else: | |
|
2189 | return total_write | |
|
1630 | 2190 | |
|
1631 | 2191 | |
|
1632 | 2192 | class ZstdDecompressor(object): |
@@ -1684,7 +2244,7 b' class ZstdDecompressor(object):' | |||
|
1684 | 2244 | in_buffer.size = len(data_buffer) |
|
1685 | 2245 | in_buffer.pos = 0 |
|
1686 | 2246 | |
|
1687 | zresult = lib.ZSTD_decompress_generic(self._dctx, out_buffer, in_buffer) | |
|
2247 | zresult = lib.ZSTD_decompressStream(self._dctx, out_buffer, in_buffer) | |
|
1688 | 2248 | if lib.ZSTD_isError(zresult): |
|
1689 | 2249 | raise ZstdError('decompression error: %s' % |
|
1690 | 2250 | _zstd_error(zresult)) |
@@ -1696,9 +2256,10 b' class ZstdDecompressor(object):' | |||
|
1696 | 2256 | |
|
1697 | 2257 | return ffi.buffer(result_buffer, out_buffer.pos)[:] |
|
1698 | 2258 | |
|
1699 | def stream_reader(self, source, read_size=DECOMPRESSION_RECOMMENDED_INPUT_SIZE): | |
|
2259 | def stream_reader(self, source, read_size=DECOMPRESSION_RECOMMENDED_INPUT_SIZE, | |
|
2260 | read_across_frames=False): | |
|
1700 | 2261 | self._ensure_dctx() |
|
1701 | return DecompressionReader(self, source, read_size) | |
|
2262 | return ZstdDecompressionReader(self, source, read_size, read_across_frames) | |
|
1702 | 2263 | |
|
1703 | 2264 | def decompressobj(self, write_size=DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE): |
|
1704 | 2265 | if write_size < 1: |
@@ -1767,7 +2328,7 b' class ZstdDecompressor(object):' | |||
|
1767 | 2328 | while in_buffer.pos < in_buffer.size: |
|
1768 | 2329 | assert out_buffer.pos == 0 |
|
1769 | 2330 | |
|
1770 | zresult = lib.ZSTD_decompress_generic(self._dctx, out_buffer, in_buffer) | |
|
2331 | zresult = lib.ZSTD_decompressStream(self._dctx, out_buffer, in_buffer) | |
|
1771 | 2332 | if lib.ZSTD_isError(zresult): |
|
1772 | 2333 | raise ZstdError('zstd decompress error: %s' % |
|
1773 | 2334 | _zstd_error(zresult)) |
@@ -1787,11 +2348,13 b' class ZstdDecompressor(object):' | |||
|
1787 | 2348 | |
|
1788 | 2349 | read_from = read_to_iter |
|
1789 | 2350 | |
|
1790 | def stream_writer(self, writer, write_size=DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE): | |
|
2351 | def stream_writer(self, writer, write_size=DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE, | |
|
2352 | write_return_read=False): | |
|
1791 | 2353 | if not hasattr(writer, 'write'): |
|
1792 | 2354 | raise ValueError('must pass an object with a write() method') |
|
1793 | 2355 | |
|
1794 | return ZstdDecompressionWriter(self, writer, write_size) | |
|
2356 | return ZstdDecompressionWriter(self, writer, write_size, | |
|
2357 | write_return_read) | |
|
1795 | 2358 | |
|
1796 | 2359 | write_to = stream_writer |
|
1797 | 2360 | |
@@ -1829,7 +2392,7 b' class ZstdDecompressor(object):' | |||
|
1829 | 2392 | |
|
1830 | 2393 | # Flush all read data to output. |
|
1831 | 2394 | while in_buffer.pos < in_buffer.size: |
|
1832 | zresult = lib.ZSTD_decompress_generic(self._dctx, out_buffer, in_buffer) | |
|
2395 | zresult = lib.ZSTD_decompressStream(self._dctx, out_buffer, in_buffer) | |
|
1833 | 2396 | if lib.ZSTD_isError(zresult): |
|
1834 | 2397 | raise ZstdError('zstd decompressor error: %s' % |
|
1835 | 2398 | _zstd_error(zresult)) |
@@ -1881,7 +2444,7 b' class ZstdDecompressor(object):' | |||
|
1881 | 2444 | in_buffer.size = len(chunk_buffer) |
|
1882 | 2445 | in_buffer.pos = 0 |
|
1883 | 2446 | |
|
1884 | zresult = lib.ZSTD_decompress_generic(self._dctx, out_buffer, in_buffer) | |
|
2447 | zresult = lib.ZSTD_decompressStream(self._dctx, out_buffer, in_buffer) | |
|
1885 | 2448 | if lib.ZSTD_isError(zresult): |
|
1886 | 2449 | raise ZstdError('could not decompress chunk 0: %s' % |
|
1887 | 2450 | _zstd_error(zresult)) |
@@ -1918,7 +2481,7 b' class ZstdDecompressor(object):' | |||
|
1918 | 2481 | in_buffer.size = len(chunk_buffer) |
|
1919 | 2482 | in_buffer.pos = 0 |
|
1920 | 2483 | |
|
1921 | zresult = lib.ZSTD_decompress_generic(self._dctx, out_buffer, in_buffer)

2484 | zresult = lib.ZSTD_decompressStream(self._dctx, out_buffer, in_buffer) | |
|
1922 | 2485 | if lib.ZSTD_isError(zresult): |
|
1923 | 2486 | raise ZstdError('could not decompress chunk %d: %s' % |
|
1924 | 2487 | _zstd_error(zresult)) |
@@ -1931,7 +2494,7 b' class ZstdDecompressor(object):' | |||
|
1931 | 2494 | return ffi.buffer(last_buffer, len(last_buffer))[:] |
|
1932 | 2495 | |
|
1933 | 2496 | def _ensure_dctx(self, load_dict=True): |
|
1934 | lib.ZSTD_DCtx_reset(self._dctx) | |
|
2497 | lib.ZSTD_DCtx_reset(self._dctx, lib.ZSTD_reset_session_only) | |
|
1935 | 2498 | |
|
1936 | 2499 | if self._max_window_size: |
|
1937 | 2500 | zresult = lib.ZSTD_DCtx_setMaxWindowSize(self._dctx, |
@@ -210,7 +210,7 b' void zstd_module_init(PyObject* m) {' | |||
|
210 | 210 | We detect this mismatch here and refuse to load the module if this |
|
211 | 211 | scenario is detected. |
|
212 | 212 | */ |
|
213 | if (ZSTD_VERSION_NUMBER != 10307 || ZSTD_versionNumber() != 10307) {

213 | if (ZSTD_VERSION_NUMBER != 10308 || ZSTD_versionNumber() != 10308) { | |
|
214 | 214 | PyErr_SetString(PyExc_ImportError, "zstd C API mismatch; Python bindings not compiled against expected zstd version"); |
|
215 | 215 | return; |
|
216 | 216 | } |
@@ -339,17 +339,10 b' MEM_STATIC size_t BIT_getUpperBits(size_' | |||
|
339 | 339 | |
|
340 | 340 | MEM_STATIC size_t BIT_getMiddleBits(size_t bitContainer, U32 const start, U32 const nbBits) |
|
341 | 341 | { |
|
342 | #if defined(__BMI__) && defined(__GNUC__) && __GNUC__*1000+__GNUC_MINOR__ >= 4008 /* experimental */ | |
|
343 | # if defined(__x86_64__) | |
|
344 | if (sizeof(bitContainer)==8) | |
|
345 | return _bextr_u64(bitContainer, start, nbBits); | |
|
346 | else | |
|
347 | # endif | |
|
348 | return _bextr_u32(bitContainer, start, nbBits); | |
|
349 | #else | |
|
342 | U32 const regMask = sizeof(bitContainer)*8 - 1; | |
|
343 | /* if start > regMask, bitstream is corrupted, and result is undefined */ | |
|
350 | 344 | assert(nbBits < BIT_MASK_SIZE); |
|
351 | return (bitContainer >> start) & BIT_mask[nbBits]; | |
|
352 | #endif | |
|
345 | return (bitContainer >> (start & regMask)) & BIT_mask[nbBits]; | |
|
353 | 346 | } |
|
354 | 347 | |
|
355 | 348 | MEM_STATIC size_t BIT_getLowerBits(size_t bitContainer, U32 const nbBits) |
@@ -366,9 +359,13 b' MEM_STATIC size_t BIT_getLowerBits(size_' | |||
|
366 | 359 | * @return : value extracted */ |
|
367 | 360 | MEM_STATIC size_t BIT_lookBits(const BIT_DStream_t* bitD, U32 nbBits) |
|
368 | 361 | { |
|
369 | #if defined(__BMI__) && defined(__GNUC__) /* experimental; fails if bitD->bitsConsumed + nbBits > sizeof(bitD->bitContainer)*8 */ | |
|
362 | /* arbitrate between double-shift and shift+mask */ | |
|
363 | #if 1 | |
|
364 | /* if bitD->bitsConsumed + nbBits > sizeof(bitD->bitContainer)*8, | |
|
365 | * bitstream is likely corrupted, and result is undefined */ | |
|
370 | 366 | return BIT_getMiddleBits(bitD->bitContainer, (sizeof(bitD->bitContainer)*8) - bitD->bitsConsumed - nbBits, nbBits); |
|
371 | 367 | #else |
|
368 | /* this code path is slower on my os-x laptop */ | |
|
372 | 369 | U32 const regMask = sizeof(bitD->bitContainer)*8 - 1; |
|
373 | 370 | return ((bitD->bitContainer << (bitD->bitsConsumed & regMask)) >> 1) >> ((regMask-nbBits) & regMask); |
|
374 | 371 | #endif |
@@ -392,7 +389,7 b' MEM_STATIC void BIT_skipBits(BIT_DStream' | |||
|
392 | 389 | * Read (consume) next n bits from local register and update. |
|
393 | 390 | * Pay attention to not read more than nbBits contained into local register. |
|
394 | 391 | * @return : extracted value. */ |
|
395 | MEM_STATIC size_t BIT_readBits(BIT_DStream_t* bitD, U32 nbBits)

392 | MEM_STATIC size_t BIT_readBits(BIT_DStream_t* bitD, unsigned nbBits) | |
|
396 | 393 | { |
|
397 | 394 | size_t const value = BIT_lookBits(bitD, nbBits); |
|
398 | 395 | BIT_skipBits(bitD, nbBits); |
@@ -401,7 +398,7 b' MEM_STATIC size_t BIT_readBits(BIT_DStre' | |||
|
401 | 398 | |
|
402 | 399 | /*! BIT_readBitsFast() : |
|
403 | 400 | * unsafe version; only works only if nbBits >= 1 */ |
|
404 |
MEM_STATIC size_t BIT_readBitsFast(BIT_DStream_t* bitD, |
|
|
401 | MEM_STATIC size_t BIT_readBitsFast(BIT_DStream_t* bitD, unsigned nbBits) | |
|
405 | 402 | { |
|
406 | 403 | size_t const value = BIT_lookBitsFast(bitD, nbBits); |
|
407 | 404 | assert(nbBits >= 1); |
@@ -15,6 +15,8 b'' | |||
|
15 | 15 | * Compiler specifics |
|
16 | 16 | *********************************************************/ |
|
17 | 17 | /* force inlining */ |
|
18 | ||
|
19 | #if !defined(ZSTD_NO_INLINE) | |
|
18 | 20 | #if defined (__GNUC__) || defined(__cplusplus) || defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L /* C99 */ |
|
19 | 21 | # define INLINE_KEYWORD inline |
|
20 | 22 | #else |
@@ -29,6 +31,13 b'' | |||
|
29 | 31 | # define FORCE_INLINE_ATTR |
|
30 | 32 | #endif |
|
31 | 33 | |
|
34 | #else | |
|
35 | ||
|
36 | #define INLINE_KEYWORD | |
|
37 | #define FORCE_INLINE_ATTR | |
|
38 | ||
|
39 | #endif | |
|
40 | ||
|
32 | 41 | /** |
|
33 | 42 | * FORCE_INLINE_TEMPLATE is used to define C "templates", which take constant |
|
34 | 43 | * parameters. They must be inlined for the compiler to elimininate the constant |
@@ -89,23 +98,21 b'' | |||
|
89 | 98 | #endif |
|
90 | 99 | |
|
91 | 100 | /* prefetch |
|
92 | * can be disabled, by declaring NO_PREFETCH macro | |
|
93 | * All prefetch invocations use a single default locality 2, | |
|
94 | * generating instruction prefetcht1, | |
|
95 | * which, according to Intel, means "load data into L2 cache". | |
|
96 | * This is a good enough "middle ground" for the time being, | |
|
97 | * though in theory, it would be better to specialize locality depending on data being prefetched. | |
|
98 | * Tests could not determine any sensible difference based on locality value. */ | |
|
101 | * can be disabled, by declaring NO_PREFETCH build macro */ | |
|
99 | 102 | #if defined(NO_PREFETCH) |
|
100 | # define PREFETCH(ptr) (void)(ptr) /* disabled */

103 | # define PREFETCH_L1(ptr) (void)(ptr) /* disabled */ | |
|
104 | # define PREFETCH_L2(ptr) (void)(ptr) /* disabled */ | |
|
101 | 105 | #else |
|
102 | 106 | # if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_I86)) /* _mm_prefetch() is not defined outside of x86/x64 */ |
|
103 | 107 | # include <mmintrin.h> /* https://msdn.microsoft.com/fr-fr/library/84szxsww(v=vs.90).aspx */ |
|
104 | # define PREFETCH(ptr) _mm_prefetch((const char*)(ptr), _MM_HINT_T1)

108 | # define PREFETCH_L1(ptr) _mm_prefetch((const char*)(ptr), _MM_HINT_T0) | |
|
109 | # define PREFETCH_L2(ptr) _mm_prefetch((const char*)(ptr), _MM_HINT_T1) | |
|
105 | 110 | # elif defined(__GNUC__) && ( (__GNUC__ >= 4) || ( (__GNUC__ == 3) && (__GNUC_MINOR__ >= 1) ) ) |
|
106 | # define PREFETCH(ptr) __builtin_prefetch((ptr), 0 /* rw==read */, 2 /* locality */)

111 | # define PREFETCH_L1(ptr) __builtin_prefetch((ptr), 0 /* rw==read */, 3 /* locality */) | |
|
112 | # define PREFETCH_L2(ptr) __builtin_prefetch((ptr), 0 /* rw==read */, 2 /* locality */) | |
|
107 | 113 | # else |
|
108 | # define PREFETCH(ptr) (void)(ptr) /* disabled */

114 | # define PREFETCH_L1(ptr) (void)(ptr) /* disabled */ | |
|
115 | # define PREFETCH_L2(ptr) (void)(ptr) /* disabled */ | |
|
109 | 116 | # endif |
|
110 | 117 | #endif /* NO_PREFETCH */ |
|
111 | 118 | |
@@ -116,7 +123,7 b'' | |||
|
116 | 123 | size_t const _size = (size_t)(s); \ |
|
117 | 124 | size_t _pos; \ |
|
118 | 125 | for (_pos=0; _pos<_size; _pos+=CACHELINE_SIZE) { \ |
|
119 | PREFETCH(_ptr + _pos); \

126 | PREFETCH_L2(_ptr + _pos); \ | |
|
120 | 127 | } \ |
|
121 | 128 | } |
|
122 | 129 |
@@ -78,7 +78,7 b' MEM_STATIC ZSTD_cpuid_t ZSTD_cpuid(void)' | |||
|
78 | 78 | __asm__( |
|
79 | 79 | "pushl %%ebx\n\t" |
|
80 | 80 | "cpuid\n\t" |
|
81 | "movl %%ebx, %%eax\n\t"

81 | "movl %%ebx, %%eax\n\t" | |
|
82 | 82 | "popl %%ebx" |
|
83 | 83 | : "=a"(f7b), "=c"(f7c) |
|
84 | 84 | : "a"(7), "c"(0) |
@@ -57,9 +57,9 b' extern "C" {' | |||
|
57 | 57 | #endif |
|
58 | 58 | |
|
59 | 59 | |
|
60 | /* static assert is triggered at compile time, leaving no runtime artefact,

61 | * but can only work with compile-time constants.

62 | * This variant can only be used inside a function. */
|
60 | /* static assert is triggered at compile time, leaving no runtime artefact. | |
|
61 | * static assert only works with compile-time constants. | |
|
62 | * Also, this variant can only be used inside a function. */ | |
|
63 | 63 | #define DEBUG_STATIC_ASSERT(c) (void)sizeof(char[(c) ? 1 : -1]) |
|
64 | 64 | |
|
65 | 65 | |
@@ -70,9 +70,19 b' extern "C" {' | |||
|
70 | 70 | # define DEBUGLEVEL 0 |
|
71 | 71 | #endif |
|
72 | 72 | |
|
73 | ||
|
74 | /* DEBUGFILE can be defined externally, | |
|
75 | * typically through compiler command line. | |
|
76 | * note : currently useless. | |
|
77 | * Value must be stderr or stdout */ | |
|
78 | #ifndef DEBUGFILE | |
|
79 | # define DEBUGFILE stderr | |
|
80 | #endif | |
|
81 | ||
|
82 | ||
|
73 | 83 | /* recommended values for DEBUGLEVEL : |
|
74 | * 0 : no debug, all run-time checks disabled

75 | * 1 | no display, enables assert() only

84 | * 0 : release mode, no debug, all run-time checks disabled | |
|
85 | * 1 : enables assert() only, no display | |
|
76 | 86 | * 2 : reserved, for currently active debug path |
|
77 | 87 | * 3 : events once per object lifetime (CCtx, CDict, etc.) |
|
78 | 88 | * 4 : events once per frame |
@@ -81,7 +91,7 b' extern "C" {' | |||
|
81 | 91 | * 7+: events at every position (*very* verbose) |
|
82 | 92 | * |
|
83 | 93 | * It's generally inconvenient to output traces > 5. |
|
84 |
* In which case, it's possible to selectively |
|
|
94 | * In which case, it's possible to selectively trigger high verbosity levels | |
|
85 | 95 | * by modifying g_debug_level. |
|
86 | 96 | */ |
|
87 | 97 | |
@@ -95,11 +105,12 b' extern "C" {' | |||
|
95 | 105 | |
|
96 | 106 | #if (DEBUGLEVEL>=2) |
|
97 | 107 | # include <stdio.h> |
|
98 | extern int g_debuglevel; /* here, this variable is only declared,

99 | it actually lives in debug.c, | |
|
100 | and is shared by the whole process. | |
|
101 | It's useful when enabling very verbose levels

102 | on selective conditions (such as position in src) */ | |
|
108 | extern int g_debuglevel; /* the variable is only declared, | |
|
109 | it actually lives in debug.c, | |
|
110 | and is shared by the whole process. | |
|
111 | It's not thread-safe. | |
|
112 | It's useful when enabling very verbose levels | |
|
113 | on selective conditions (such as position in src) */ | |
|
103 | 114 | |
|
104 | 115 | # define RAWLOG(l, ...) { \ |
|
105 | 116 | if (l<=g_debuglevel) { \ |
@@ -14,6 +14,10 b'' | |||
|
14 | 14 | |
|
15 | 15 | const char* ERR_getErrorString(ERR_enum code) |
|
16 | 16 | { |
|
17 | #ifdef ZSTD_STRIP_ERROR_STRINGS | |
|
18 | (void)code; | |
|
19 | return "Error strings stripped"; | |
|
20 | #else | |
|
17 | 21 | static const char* const notErrorCode = "Unspecified error code"; |
|
18 | 22 | switch( code ) |
|
19 | 23 | { |
@@ -39,10 +43,12 b' const char* ERR_getErrorString(ERR_enum ' | |||
|
39 | 43 | case PREFIX(dictionaryCreation_failed): return "Cannot create Dictionary from provided samples"; |
|
40 | 44 | case PREFIX(dstSize_tooSmall): return "Destination buffer is too small"; |
|
41 | 45 | case PREFIX(srcSize_wrong): return "Src size is incorrect"; |
|
46 | case PREFIX(dstBuffer_null): return "Operation on NULL destination buffer"; | |
|
42 | 47 | /* following error codes are not stable and may be removed or changed in a future version */ |
|
43 | 48 | case PREFIX(frameIndex_tooLarge): return "Frame index is too large"; |
|
44 | 49 | case PREFIX(seekableIO): return "An I/O error occurred when reading/seeking"; |
|
45 | 50 | case PREFIX(maxCode): |
|
46 | 51 | default: return notErrorCode; |
|
47 | 52 | } |
|
53 | #endif | |
|
48 | 54 | } |
@@ -512,7 +512,7 b' MEM_STATIC void FSE_initCState(FSE_CStat' | |||
|
512 | 512 | const U32 tableLog = MEM_read16(ptr); |
|
513 | 513 | statePtr->value = (ptrdiff_t)1<<tableLog; |
|
514 | 514 | statePtr->stateTable = u16ptr+2; |
|
515 | statePtr->symbolTT = ((const U32*)ct + 1 + (tableLog ? (1<<(tableLog-1)) : 1));

515 | statePtr->symbolTT = ct + 1 + (tableLog ? (1<<(tableLog-1)) : 1); | |
|
516 | 516 | statePtr->stateLog = tableLog; |
|
517 | 517 | } |
|
518 | 518 | |
@@ -531,7 +531,7 b' MEM_STATIC void FSE_initCState2(FSE_CSta' | |||
|
531 | 531 | } |
|
532 | 532 | } |
|
533 | 533 | |
|
534 | MEM_STATIC void FSE_encodeSymbol(BIT_CStream_t* bitC, FSE_CState_t* statePtr, U32 symbol)

534 | MEM_STATIC void FSE_encodeSymbol(BIT_CStream_t* bitC, FSE_CState_t* statePtr, unsigned symbol) | |
|
535 | 535 | { |
|
536 | 536 | FSE_symbolCompressionTransform const symbolTT = ((const FSE_symbolCompressionTransform*)(statePtr->symbolTT))[symbol]; |
|
537 | 537 | const U16* const stateTable = (const U16*)(statePtr->stateTable); |
@@ -173,15 +173,19 b' typedef U32 HUF_DTable;' | |||
|
173 | 173 | * Advanced decompression functions |
|
174 | 174 | ******************************************/ |
|
175 | 175 | size_t HUF_decompress4X1 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< single-symbol decoder */ |
|
176 | #ifndef HUF_FORCE_DECOMPRESS_X1 | |
|
176 | 177 | size_t HUF_decompress4X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< double-symbols decoder */ |
|
178 | #endif | |
|
177 | 179 | |
|
178 | 180 | size_t HUF_decompress4X_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< decodes RLE and uncompressed */ |
|
179 | 181 | size_t HUF_decompress4X_hufOnly(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< considers RLE and uncompressed as errors */ |
|
180 | 182 | size_t HUF_decompress4X_hufOnly_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< considers RLE and uncompressed as errors */ |
|
181 | 183 | size_t HUF_decompress4X1_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< single-symbol decoder */ |
|
182 | 184 | size_t HUF_decompress4X1_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< single-symbol decoder */ |
|
185 | #ifndef HUF_FORCE_DECOMPRESS_X1 | |
|
183 | 186 | size_t HUF_decompress4X2_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< double-symbols decoder */ |
|
184 | 187 | size_t HUF_decompress4X2_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< double-symbols decoder */ |
|
188 | #endif | |
|
185 | 189 | |
|
186 | 190 | |
|
187 | 191 | /* **************************************** |
@@ -228,7 +232,7 b' size_t HUF_compress4X_repeat(void* dst, ' | |||
|
228 | 232 | #define HUF_CTABLE_WORKSPACE_SIZE_U32 (2*HUF_SYMBOLVALUE_MAX +1 +1) |
|
229 | 233 | #define HUF_CTABLE_WORKSPACE_SIZE (HUF_CTABLE_WORKSPACE_SIZE_U32 * sizeof(unsigned)) |
|
230 | 234 | size_t HUF_buildCTable_wksp (HUF_CElt* tree, |
|
231 | const U32* count, U32 maxSymbolValue, U32 maxNbBits,

235 | const unsigned* count, U32 maxSymbolValue, U32 maxNbBits, | |
|
232 | 236 | void* workSpace, size_t wkspSize); |
|
233 | 237 | |
|
234 | 238 | /*! HUF_readStats() : |
@@ -277,14 +281,22 b' U32 HUF_selectDecoder (size_t dstSize, s' | |||
|
277 | 281 | #define HUF_DECOMPRESS_WORKSPACE_SIZE (2 << 10) |
|
278 | 282 | #define HUF_DECOMPRESS_WORKSPACE_SIZE_U32 (HUF_DECOMPRESS_WORKSPACE_SIZE / sizeof(U32)) |
|
279 | 283 | |
|
284 | #ifndef HUF_FORCE_DECOMPRESS_X2 | |
|
280 | 285 | size_t HUF_readDTableX1 (HUF_DTable* DTable, const void* src, size_t srcSize); |
|
281 | 286 | size_t HUF_readDTableX1_wksp (HUF_DTable* DTable, const void* src, size_t srcSize, void* workSpace, size_t wkspSize); |
|
287 | #endif | |
|
288 | #ifndef HUF_FORCE_DECOMPRESS_X1 | |
|
282 | 289 | size_t HUF_readDTableX2 (HUF_DTable* DTable, const void* src, size_t srcSize); |
|
283 | 290 | size_t HUF_readDTableX2_wksp (HUF_DTable* DTable, const void* src, size_t srcSize, void* workSpace, size_t wkspSize); |
|
291 | #endif | |
|
284 | 292 | |
|
285 | 293 | size_t HUF_decompress4X_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable); |
|
294 | #ifndef HUF_FORCE_DECOMPRESS_X2 | |
|
286 | 295 | size_t HUF_decompress4X1_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable); |
|
296 | #endif | |
|
297 | #ifndef HUF_FORCE_DECOMPRESS_X1 | |
|
287 | 298 | size_t HUF_decompress4X2_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable); |
|
299 | #endif | |
|
288 | 300 | |
|
289 | 301 | |
|
290 | 302 | /* ====================== */ |
@@ -306,24 +318,36 b' size_t HUF_compress1X_repeat(void* dst, ' | |||
|
306 | 318 | HUF_CElt* hufTable, HUF_repeat* repeat, int preferRepeat, int bmi2); |
|
307 | 319 | |
|
308 | 320 | size_t HUF_decompress1X1 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /* single-symbol decoder */ |
|
321 | #ifndef HUF_FORCE_DECOMPRESS_X1 | |
|
309 | 322 | size_t HUF_decompress1X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /* double-symbol decoder */ |
|
323 | #endif | |
|
310 | 324 | |
|
311 | 325 | size_t HUF_decompress1X_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); |
|
312 | 326 | size_t HUF_decompress1X_DCtx_wksp (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); |
|
327 | #ifndef HUF_FORCE_DECOMPRESS_X2 | |
|
313 | 328 | size_t HUF_decompress1X1_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< single-symbol decoder */ |
|
314 | 329 | size_t HUF_decompress1X1_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< single-symbol decoder */ |
|
330 | #endif | |
|
331 | #ifndef HUF_FORCE_DECOMPRESS_X1 | |
|
315 | 332 | size_t HUF_decompress1X2_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< double-symbols decoder */ |
|
316 | 333 | size_t HUF_decompress1X2_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< double-symbols decoder */ |
|
334 | #endif | |
|
317 | 335 | |
|
318 | 336 | size_t HUF_decompress1X_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable); /**< automatic selection of sing or double symbol decoder, based on DTable */ |
|
337 | #ifndef HUF_FORCE_DECOMPRESS_X2 | |
|
319 | 338 | size_t HUF_decompress1X1_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable); |
|
339 | #endif | |
|
340 | #ifndef HUF_FORCE_DECOMPRESS_X1 | |
|
320 | 341 | size_t HUF_decompress1X2_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable); |
|
342 | #endif | |
|
321 | 343 | |
|
322 | 344 | /* BMI2 variants. |
|
323 | 345 | * If the CPU has BMI2 support, pass bmi2=1, otherwise pass bmi2=0. |
|
324 | 346 | */ |
|
325 | 347 | size_t HUF_decompress1X_usingDTable_bmi2(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable, int bmi2); |
|
348 | #ifndef HUF_FORCE_DECOMPRESS_X2 | |
|
326 | 349 | size_t HUF_decompress1X1_DCtx_wksp_bmi2(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize, int bmi2); |
|
350 | #endif | |
|
327 | 351 | size_t HUF_decompress4X_usingDTable_bmi2(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable, int bmi2); |
|
328 | 352 | size_t HUF_decompress4X_hufOnly_wksp_bmi2(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize, int bmi2); |
|
329 | 353 |
@@ -39,6 +39,10 b' extern "C" {' | |||
|
39 | 39 | # define MEM_STATIC static /* this version may generate warnings for unused static functions; disable the relevant warning */ |
|
40 | 40 | #endif |
|
41 | 41 | |
|
42 | #ifndef __has_builtin | |
|
43 | # define __has_builtin(x) 0 /* compat. with non-clang compilers */ | |
|
44 | #endif | |
|
45 | ||
|
42 | 46 | /* code only tested on 32 and 64 bits systems */ |
|
43 | 47 | #define MEM_STATIC_ASSERT(c) { enum { MEM_static_assert = 1/(int)(!!(c)) }; } |
|
44 | 48 | MEM_STATIC void MEM_check(void) { MEM_STATIC_ASSERT((sizeof(size_t)==4) || (sizeof(size_t)==8)); } |
@@ -198,7 +202,8 b' MEM_STATIC U32 MEM_swap32(U32 in)' | |||
|
198 | 202 | { |
|
199 | 203 | #if defined(_MSC_VER) /* Visual Studio */ |
|
200 | 204 | return _byteswap_ulong(in); |
|
201 | #elif defined (__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__ >= 403) | |
|
205 | #elif (defined (__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__ >= 403)) \ | |
|
206 | || (defined(__clang__) && __has_builtin(__builtin_bswap32)) | |
|
202 | 207 | return __builtin_bswap32(in); |
|
203 | 208 | #else |
|
204 | 209 | return ((in << 24) & 0xff000000 ) | |
@@ -212,7 +217,8 b' MEM_STATIC U64 MEM_swap64(U64 in)' | |||
|
212 | 217 | { |
|
213 | 218 | #if defined(_MSC_VER) /* Visual Studio */ |
|
214 | 219 | return _byteswap_uint64(in); |
|
215 | #elif defined (__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__ >= 403) | |
|
220 | #elif (defined (__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__ >= 403)) \ | |
|
221 | || (defined(__clang__) && __has_builtin(__builtin_bswap64)) | |
|
216 | 222 | return __builtin_bswap64(in); |
|
217 | 223 | #else |
|
218 | 224 | return ((in << 56) & 0xff00000000000000ULL) | |
@@ -88,8 +88,8 b' static void* POOL_thread(void* opaque) {' | |||
|
88 | 88 | ctx->numThreadsBusy++; |
|
89 | 89 | ctx->queueEmpty = ctx->queueHead == ctx->queueTail; |
|
90 | 90 | /* Unlock the mutex, signal a pusher, and run the job */ |
|
91 | ZSTD_pthread_cond_signal(&ctx->queuePushCond); | |
|
91 | 92 | ZSTD_pthread_mutex_unlock(&ctx->queueMutex); |
|
92 | ZSTD_pthread_cond_signal(&ctx->queuePushCond); | |
|
93 | 93 | |
|
94 | 94 | job.function(job.opaque); |
|
95 | 95 |
@@ -30,8 +30,10 b' const char* ZSTD_versionString(void) { r' | |||
|
30 | 30 | /*-**************************************** |
|
31 | 31 | * ZSTD Error Management |
|
32 | 32 | ******************************************/ |
|
33 | #undef ZSTD_isError /* defined within zstd_internal.h */ | |
|
33 | 34 | /*! ZSTD_isError() : |
|
34 |
* tells if a return value is an error code |
|
|
35 | * tells if a return value is an error code | |
|
36 | * symbol is required for external callers */ | |
|
35 | 37 | unsigned ZSTD_isError(size_t code) { return ERR_isError(code); } |
|
36 | 38 | |
|
37 | 39 | /*! ZSTD_getErrorName() : |
@@ -72,6 +72,7 b' typedef enum {' | |||
|
72 | 72 | ZSTD_error_workSpace_tooSmall= 66, |
|
73 | 73 | ZSTD_error_dstSize_tooSmall = 70, |
|
74 | 74 | ZSTD_error_srcSize_wrong = 72, |
|
75 | ZSTD_error_dstBuffer_null = 74, | |
|
75 | 76 | /* following error codes are __NOT STABLE__, they can be removed or changed in future versions */ |
|
76 | 77 | ZSTD_error_frameIndex_tooLarge = 100, |
|
77 | 78 | ZSTD_error_seekableIO = 102, |
@@ -41,6 +41,9 b' extern "C" {' | |||
|
41 | 41 | |
|
42 | 42 | /* ---- static assert (debug) --- */ |
|
43 | 43 | #define ZSTD_STATIC_ASSERT(c) DEBUG_STATIC_ASSERT(c) |
|
44 | #define ZSTD_isError ERR_isError /* for inlining */ | |
|
45 | #define FSE_isError ERR_isError | |
|
46 | #define HUF_isError ERR_isError | |
|
44 | 47 | |
|
45 | 48 | |
|
46 | 49 | /*-************************************* |
@@ -75,7 +78,6 b' static const U32 repStartValue[ZSTD_REP_' | |||
|
75 | 78 | #define BIT0 1 |
|
76 | 79 | |
|
77 | 80 | #define ZSTD_WINDOWLOG_ABSOLUTEMIN 10 |
|
78 | #define ZSTD_WINDOWLOG_DEFAULTMAX 27 /* Default maximum allowed window log */ | |
|
79 | 81 | static const size_t ZSTD_fcs_fieldSize[4] = { 0, 2, 4, 8 }; |
|
80 | 82 | static const size_t ZSTD_did_fieldSize[4] = { 0, 1, 2, 4 }; |
|
81 | 83 | |
@@ -242,7 +244,7 b' typedef struct {' | |||
|
242 | 244 | blockType_e blockType; |
|
243 | 245 | U32 lastBlock; |
|
244 | 246 | U32 origSize; |
|
245 | } blockProperties_t; | |
|
247 | } blockProperties_t; /* declared here for decompress and fullbench */ | |
|
246 | 248 | |
|
247 | 249 | /*! ZSTD_getcBlockSize() : |
|
248 | 250 | * Provides the size of compressed block from block header `src` */ |
@@ -250,6 +252,13 b' typedef struct {' | |||
|
250 | 252 | size_t ZSTD_getcBlockSize(const void* src, size_t srcSize, |
|
251 | 253 | blockProperties_t* bpPtr); |
|
252 | 254 | |
|
255 | /*! ZSTD_decodeSeqHeaders() : | |
|
256 | * decode sequence header from src */ | |
|
257 | /* Used by: decompress, fullbench (does not get its definition from here) */ | |
|
258 | size_t ZSTD_decodeSeqHeaders(ZSTD_DCtx* dctx, int* nbSeqPtr, | |
|
259 | const void* src, size_t srcSize); | |
|
260 | ||
|
261 | ||
|
253 | 262 | #if defined (__cplusplus) |
|
254 | 263 | } |
|
255 | 264 | #endif |
@@ -115,7 +115,7 b' size_t FSE_buildCTable_wksp(FSE_CTable* ' | |||
|
115 | 115 | /* symbol start positions */ |
|
116 | 116 | { U32 u; |
|
117 | 117 | cumul[0] = 0; |
|
118 | for (u=1; u<=maxSymbolValue+1; u++) { | |
|
118 | for (u=1; u <= maxSymbolValue+1; u++) { | |
|
119 | 119 | if (normalizedCounter[u-1]==-1) { /* Low proba symbol */ |
|
120 | 120 | cumul[u] = cumul[u-1] + 1; |
|
121 | 121 | tableSymbol[highThreshold--] = (FSE_FUNCTION_TYPE)(u-1); |
@@ -658,7 +658,7 b' size_t FSE_compress_wksp (void* dst, siz' | |||
|
658 | 658 | BYTE* op = ostart; |
|
659 | 659 | BYTE* const oend = ostart + dstSize; |
|
660 | 660 | |
|
661 | U32 count[FSE_MAX_SYMBOL_VALUE+1];

661 | unsigned count[FSE_MAX_SYMBOL_VALUE+1]; | |
|
662 | 662 | S16 norm[FSE_MAX_SYMBOL_VALUE+1]; |
|
663 | 663 | FSE_CTable* CTable = (FSE_CTable*)workSpace; |
|
664 | 664 | size_t const CTableSize = FSE_CTABLE_SIZE_U32(tableLog, maxSymbolValue); |
@@ -672,7 +672,7 b' size_t FSE_compress_wksp (void* dst, siz' | |||
|
672 | 672 | if (!tableLog) tableLog = FSE_DEFAULT_TABLELOG; |
|
673 | 673 | |
|
674 | 674 | /* Scan input and build symbol stats */ |
|
675 | { CHECK_V_F(maxCount, HIST_count_wksp(count, &maxSymbolValue, src, srcSize, scratchBuffer) );

675 | { CHECK_V_F(maxCount, HIST_count_wksp(count, &maxSymbolValue, src, srcSize, scratchBuffer, scratchBufferSize) ); | |
|
676 | 676 | if (maxCount == srcSize) return 1; /* only a single symbol in src : rle */ |
|
677 | 677 | if (maxCount == 1) return 0; /* each symbol present maximum once => not compressible */ |
|
678 | 678 | if (maxCount < (srcSize >> 7)) return 0; /* Heuristic : not compressible enough */ |
@@ -73,6 +73,7 b' unsigned HIST_count_simple(unsigned* cou' | |||
|
73 | 73 | return largestCount; |
|
74 | 74 | } |
|
75 | 75 | |
|
76 | typedef enum { trustInput, checkMaxSymbolValue } HIST_checkInput_e; | |
|
76 | 77 | |
|
77 | 78 | /* HIST_count_parallel_wksp() : |
|
78 | 79 | * store histogram into 4 intermediate tables, recombined at the end. |
@@ -85,8 +86,8 b' unsigned HIST_count_simple(unsigned* cou' | |||
|
85 | 86 | static size_t HIST_count_parallel_wksp( |
|
86 | 87 | unsigned* count, unsigned* maxSymbolValuePtr, |
|
87 | 88 | const void* source, size_t sourceSize, |
|
88 | unsigned checkMax,

89 | U32* const workSpace)

89 | HIST_checkInput_e check, | |
|
90 | U32* const workSpace) | |
|
90 | 91 | { |
|
91 | 92 | const BYTE* ip = (const BYTE*)source; |
|
92 | 93 | const BYTE* const iend = ip+sourceSize; |
@@ -137,7 +138,7 b' static size_t HIST_count_parallel_wksp(' | |||
|
137 | 138 | /* finish last symbols */ |
|
138 | 139 | while (ip<iend) Counting1[*ip++]++; |
|
139 | 140 | |
|
140 | if (checkMax) { /* verify stats will fit into destination table */

141 | if (check) { /* verify stats will fit into destination table */ | |
|
141 | 142 | U32 s; for (s=255; s>maxSymbolValue; s--) { |
|
142 | 143 | Counting1[s] += Counting2[s] + Counting3[s] + Counting4[s]; |
|
143 | 144 | if (Counting1[s]) return ERROR(maxSymbolValue_tooSmall); |
@@ -157,14 +158,18 b' static size_t HIST_count_parallel_wksp(' | |||
|
157 | 158 | |
|
158 | 159 | /* HIST_countFast_wksp() : |
|
159 | 160 | * Same as HIST_countFast(), but using an externally provided scratch buffer. |
|
160 | * `workSpace` size must be table of >= HIST_WKSP_SIZE_U32 unsigned */ | |
|
161 | * `workSpace` is a writable buffer which must be 4-bytes aligned, | |
|
162 | * `workSpaceSize` must be >= HIST_WKSP_SIZE | |
|
163 | */ | |
|
161 | 164 | size_t HIST_countFast_wksp(unsigned* count, unsigned* maxSymbolValuePtr, |
|
162 | 165 | const void* source, size_t sourceSize, |
|
163 | unsigned* workSpace)

166 | void* workSpace, size_t workSpaceSize) | |
|
164 | 167 | { |
|
165 | 168 | if (sourceSize < 1500) /* heuristic threshold */ |
|
166 | 169 | return HIST_count_simple(count, maxSymbolValuePtr, source, sourceSize); |
|
167 | return HIST_count_parallel_wksp(count, maxSymbolValuePtr, source, sourceSize, 0, workSpace); | |
|
170 | if ((size_t)workSpace & 3) return ERROR(GENERIC); /* must be aligned on 4-bytes boundaries */ | |
|
171 | if (workSpaceSize < HIST_WKSP_SIZE) return ERROR(workSpace_tooSmall); | |
|
172 | return HIST_count_parallel_wksp(count, maxSymbolValuePtr, source, sourceSize, trustInput, (U32*)workSpace); | |
|
168 | 173 | } |
|
169 | 174 | |
|
170 | 175 | /* fast variant (unsafe : won't check if src contains values beyond count[] limit) */ |
@@ -172,24 +177,27 b' size_t HIST_countFast(unsigned* count, u' | |||
|
172 | 177 | const void* source, size_t sourceSize) |
|
173 | 178 | { |
|
174 | 179 | unsigned tmpCounters[HIST_WKSP_SIZE_U32]; |
|
175 | return HIST_countFast_wksp(count, maxSymbolValuePtr, source, sourceSize, tmpCounters); | |
|
180 | return HIST_countFast_wksp(count, maxSymbolValuePtr, source, sourceSize, tmpCounters, sizeof(tmpCounters)); | |
|
176 | 181 | } |
|
177 | 182 | |
|
178 | 183 | /* HIST_count_wksp() : |
|
179 | 184 | * Same as HIST_count(), but using an externally provided scratch buffer. |
|
180 | 185 | * `workSpace` size must be table of >= HIST_WKSP_SIZE_U32 unsigned */ |
|
181 | 186 | size_t HIST_count_wksp(unsigned* count, unsigned* maxSymbolValuePtr, |
|
182 | const void* source, size_t sourceSize, unsigned* workSpace)

187 | const void* source, size_t sourceSize, | |
|
188 | void* workSpace, size_t workSpaceSize) | |
|
183 | 189 | { |
|
190 | if ((size_t)workSpace & 3) return ERROR(GENERIC); /* must be aligned on 4-bytes boundaries */ | |
|
191 | if (workSpaceSize < HIST_WKSP_SIZE) return ERROR(workSpace_tooSmall); | |
|
184 | 192 | if (*maxSymbolValuePtr < 255) |
|
185 | return HIST_count_parallel_wksp(count, maxSymbolValuePtr, source, sourceSize, 1, workSpace);

193 | return HIST_count_parallel_wksp(count, maxSymbolValuePtr, source, sourceSize, checkMaxSymbolValue, (U32*)workSpace); | |
|
186 | 194 | *maxSymbolValuePtr = 255; |
|
187 | return HIST_countFast_wksp(count, maxSymbolValuePtr, source, sourceSize, workSpace); | |
|
195 | return HIST_countFast_wksp(count, maxSymbolValuePtr, source, sourceSize, workSpace, workSpaceSize); | |
|
188 | 196 | } |
|
189 | 197 | |
|
190 | 198 | size_t HIST_count(unsigned* count, unsigned* maxSymbolValuePtr, |
|
191 | 199 | const void* src, size_t srcSize) |
|
192 | 200 | { |
|
193 | 201 | unsigned tmpCounters[HIST_WKSP_SIZE_U32]; |
|
194 | return HIST_count_wksp(count, maxSymbolValuePtr, src, srcSize, tmpCounters); | |
|
202 | return HIST_count_wksp(count, maxSymbolValuePtr, src, srcSize, tmpCounters, sizeof(tmpCounters)); | |
|
195 | 203 | } |
@@ -41,11 +41,11 b'' | |||
|
41 | 41 | |
|
42 | 42 | /*! HIST_count(): |
|
43 | 43 | * Provides the precise count of each byte within a table 'count'. |
|
44 |
* |
|
|
44 | * 'count' is a table of unsigned int, of minimum size (*maxSymbolValuePtr+1). | |
|
45 | 45 | * Updates *maxSymbolValuePtr with actual largest symbol value detected. |
|
46 |
* |
|
|
47 |
* |
|
|
48 |
* |
|
|
46 | * @return : count of the most frequent symbol (which isn't identified). | |
|
47 | * or an error code, which can be tested using HIST_isError(). | |
|
48 | * note : if return == srcSize, there is only one symbol. | |
|
49 | 49 | */ |
|
50 | 50 | size_t HIST_count(unsigned* count, unsigned* maxSymbolValuePtr, |
|
51 | 51 | const void* src, size_t srcSize); |
@@ -56,14 +56,16 b' unsigned HIST_isError(size_t code); /**' | |||
|
56 | 56 | /* --- advanced histogram functions --- */ |
|
57 | 57 | |
|
58 | 58 | #define HIST_WKSP_SIZE_U32 1024 |
|
59 | #define HIST_WKSP_SIZE (HIST_WKSP_SIZE_U32 * sizeof(unsigned)) | |
|
59 | 60 | /** HIST_count_wksp() : |
|
60 | 61 | * Same as HIST_count(), but using an externally provided scratch buffer. |
|
61 | 62 | * Benefit is this function will use very little stack space. |
|
62 | * `workSpace` must be a table of unsigned of size >= HIST_WKSP_SIZE_U32 | |
|
63 | * `workSpace` is a writable buffer which must be 4-bytes aligned, | |
|
64 | * `workSpaceSize` must be >= HIST_WKSP_SIZE | |
|
63 | 65 | */ |
|
64 | 66 | size_t HIST_count_wksp(unsigned* count, unsigned* maxSymbolValuePtr, |
|
65 | 67 | const void* src, size_t srcSize, |
|
66 | | 

68 | void* workSpace, size_t workSpaceSize); | |
|
67 | 69 | |
|
68 | 70 | /** HIST_countFast() : |
|
69 | 71 | * same as HIST_count(), but blindly trusts that all byte values within src are <= *maxSymbolValuePtr. |
@@ -74,11 +76,12 b' size_t HIST_countFast(unsigned* count, u' | |||
|
74 | 76 | |
|
75 | 77 | /** HIST_countFast_wksp() : |
|
76 | 78 | * Same as HIST_countFast(), but using an externally provided scratch buffer. |
|
77 | * `workSpace` must be a table of unsigned of size >= HIST_WKSP_SIZE_U32 | |
|
79 | * `workSpace` is a writable buffer which must be 4-bytes aligned, | |
|
80 | * `workSpaceSize` must be >= HIST_WKSP_SIZE | |
|
78 | 81 | */ |
|
79 | 82 | size_t HIST_countFast_wksp(unsigned* count, unsigned* maxSymbolValuePtr, |
|
80 | 83 | const void* src, size_t srcSize, |
|
81 | | 

84 | void* workSpace, size_t workSpaceSize); | |
|
82 | 85 | |
|
83 | 86 | /*! HIST_count_simple() : |
|
84 | 87 | * Same as HIST_countFast(), this function is unsafe, |
@@ -88,13 +88,13 b' static size_t HUF_compressWeights (void*' | |||
|
88 | 88 | BYTE* op = ostart; |
|
89 | 89 | BYTE* const oend = ostart + dstSize; |
|
90 | 90 | |
|
91 | | 

91 | unsigned maxSymbolValue = HUF_TABLELOG_MAX; | |
|
92 | 92 | U32 tableLog = MAX_FSE_TABLELOG_FOR_HUFF_HEADER; |
|
93 | 93 | |
|
94 | 94 | FSE_CTable CTable[FSE_CTABLE_SIZE_U32(MAX_FSE_TABLELOG_FOR_HUFF_HEADER, HUF_TABLELOG_MAX)]; |
|
95 | 95 | BYTE scratchBuffer[1<<MAX_FSE_TABLELOG_FOR_HUFF_HEADER]; |
|
96 | 96 | |
|
97 | | 

97 | unsigned count[HUF_TABLELOG_MAX+1]; | |
|
98 | 98 | S16 norm[HUF_TABLELOG_MAX+1]; |
|
99 | 99 | |
|
100 | 100 | /* init conditions */ |
@@ -134,7 +134,7 b' struct HUF_CElt_s {' | |||
|
134 | 134 | `CTable` : Huffman tree to save, using huf representation. |
|
135 | 135 | @return : size of saved CTable */ |
|
136 | 136 | size_t HUF_writeCTable (void* dst, size_t maxDstSize, |
|
137 | const HUF_CElt* CTable, | 

137 | const HUF_CElt* CTable, unsigned maxSymbolValue, unsigned huffLog) | |
|
138 | 138 | { |
|
139 | 139 | BYTE bitsToWeight[HUF_TABLELOG_MAX + 1]; /* precomputed conversion table */ |
|
140 | 140 | BYTE huffWeight[HUF_SYMBOLVALUE_MAX]; |
@@ -169,7 +169,7 b' size_t HUF_writeCTable (void* dst, size_' | |||
|
169 | 169 | } |
|
170 | 170 | |
|
171 | 171 | |
|
172 | size_t HUF_readCTable (HUF_CElt* CTable, | 

172 | size_t HUF_readCTable (HUF_CElt* CTable, unsigned* maxSymbolValuePtr, const void* src, size_t srcSize) | |
|
173 | 173 | { |
|
174 | 174 | BYTE huffWeight[HUF_SYMBOLVALUE_MAX + 1]; /* init not required, even though some static analyzer may complain */ |
|
175 | 175 | U32 rankVal[HUF_TABLELOG_ABSOLUTEMAX + 1]; /* large enough for values from 0 to 16 */ |
@@ -315,7 +315,7 b' typedef struct {' | |||
|
315 | 315 | U32 current; |
|
316 | 316 | } rankPos; |
|
317 | 317 | |
|
318 | static void HUF_sort(nodeElt* huffNode, const | 

318 | static void HUF_sort(nodeElt* huffNode, const unsigned* count, U32 maxSymbolValue) | |
|
319 | 319 | { |
|
320 | 320 | rankPos rank[32]; |
|
321 | 321 | U32 n; |
@@ -347,7 +347,7 b' static void HUF_sort(nodeElt* huffNode, ' | |||
|
347 | 347 | */ |
|
348 | 348 | #define STARTNODE (HUF_SYMBOLVALUE_MAX+1) |
|
349 | 349 | typedef nodeElt huffNodeTable[HUF_CTABLE_WORKSPACE_SIZE_U32]; |
|
350 | size_t HUF_buildCTable_wksp (HUF_CElt* tree, const | 

350 | size_t HUF_buildCTable_wksp (HUF_CElt* tree, const unsigned* count, U32 maxSymbolValue, U32 maxNbBits, void* workSpace, size_t wkspSize) | |
|
351 | 351 | { |
|
352 | 352 | nodeElt* const huffNode0 = (nodeElt*)workSpace; |
|
353 | 353 | nodeElt* const huffNode = huffNode0+1; |
@@ -421,7 +421,7 b' size_t HUF_buildCTable_wksp (HUF_CElt* t' | |||
|
421 | 421 | * @return : maxNbBits |
|
422 | 422 | * Note : count is used before tree is written, so they can safely overlap |
|
423 | 423 | */ |
|
424 | size_t HUF_buildCTable (HUF_CElt* tree, const | 

424 | size_t HUF_buildCTable (HUF_CElt* tree, const unsigned* count, unsigned maxSymbolValue, unsigned maxNbBits) | |
|
425 | 425 | { |
|
426 | 426 | huffNodeTable nodeTable; |
|
427 | 427 | return HUF_buildCTable_wksp(tree, count, maxSymbolValue, maxNbBits, nodeTable, sizeof(nodeTable)); |
@@ -610,13 +610,14 b' size_t HUF_compress4X_usingCTable(void* ' | |||
|
610 | 610 | return HUF_compress4X_usingCTable_internal(dst, dstSize, src, srcSize, CTable, /* bmi2 */ 0); |
|
611 | 611 | } |
|
612 | 612 | |
|
613 | typedef enum { HUF_singleStream, HUF_fourStreams } HUF_nbStreams_e; | |
|
613 | 614 | |
|
614 | 615 | static size_t HUF_compressCTable_internal( |
|
615 | 616 | BYTE* const ostart, BYTE* op, BYTE* const oend, |
|
616 | 617 | const void* src, size_t srcSize, |
|
617 | | 

618 | HUF_nbStreams_e nbStreams, const HUF_CElt* CTable, const int bmi2) | |
|
618 | 619 | { |
|
619 | size_t const cSize = singleStream ? | |
|
620 | size_t const cSize = (nbStreams==HUF_singleStream) ? | |
|
620 | 621 | HUF_compress1X_usingCTable_internal(op, oend - op, src, srcSize, CTable, bmi2) : |
|
621 | 622 | HUF_compress4X_usingCTable_internal(op, oend - op, src, srcSize, CTable, bmi2); |
|
622 | 623 | if (HUF_isError(cSize)) { return cSize; } |
@@ -628,21 +629,21 b' static size_t HUF_compressCTable_interna' | |||
|
628 | 629 | } |
|
629 | 630 | |
|
630 | 631 | typedef struct { |
|
631 | | 

632 | unsigned count[HUF_SYMBOLVALUE_MAX + 1]; | |
|
632 | 633 | HUF_CElt CTable[HUF_SYMBOLVALUE_MAX + 1]; |
|
633 | 634 | huffNodeTable nodeTable; |
|
634 | 635 | } HUF_compress_tables_t; |
|
635 | 636 | |
|
636 | 637 | /* HUF_compress_internal() : |
|
637 | 638 | * `workSpace` must a table of at least HUF_WORKSPACE_SIZE_U32 unsigned */ |
|
638 | static size_t HUF_compress_internal ( | |
|
639 | void* dst, size_t dstSize, | |
|
640 | const void* src, size_t srcSize, | |
|
641 | unsigned maxSymbolValue, unsigned huffLog, | |
|
642 | unsigned singleStream, | |
|
643 | void* workSpace, size_t wkspSize, | |
|
644 | HUF_CElt* oldHufTable, HUF_repeat* repeat, int preferRepeat, | |
|
645 | const int bmi2) | |
|
639 | static size_t | |
|
640 | HUF_compress_internal (void* dst, size_t dstSize, | |
|
641 | const void* src, size_t srcSize, | |
|
642 | unsigned maxSymbolValue, unsigned huffLog, | |
|
643 | HUF_nbStreams_e nbStreams, | |
|
644 | void* workSpace, size_t wkspSize, | |
|
645 | HUF_CElt* oldHufTable, HUF_repeat* repeat, int preferRepeat, | |
|
646 | const int bmi2) | |
|
646 | 647 | { |
|
647 | 648 | HUF_compress_tables_t* const table = (HUF_compress_tables_t*)workSpace; |
|
648 | 649 | BYTE* const ostart = (BYTE*)dst; |
@@ -651,7 +652,7 b' static size_t HUF_compress_internal (' | |||
|
651 | 652 | |
|
652 | 653 | /* checks & inits */ |
|
653 | 654 | if (((size_t)workSpace & 3) != 0) return ERROR(GENERIC); /* must be aligned on 4-bytes boundaries */ |
|
654 | if (wkspSize < | 

655 | if (wkspSize < HUF_WORKSPACE_SIZE) return ERROR(workSpace_tooSmall); | |
|
655 | 656 | if (!srcSize) return 0; /* Uncompressed */ |
|
656 | 657 | if (!dstSize) return 0; /* cannot fit anything within dst budget */ |
|
657 | 658 | if (srcSize > HUF_BLOCKSIZE_MAX) return ERROR(srcSize_wrong); /* current block size limit */ |
@@ -664,11 +665,11 b' static size_t HUF_compress_internal (' | |||
|
664 | 665 | if (preferRepeat && repeat && *repeat == HUF_repeat_valid) { |
|
665 | 666 | return HUF_compressCTable_internal(ostart, op, oend, |
|
666 | 667 | src, srcSize, |
|
667 | | 

668 | nbStreams, oldHufTable, bmi2); | |
|
668 | 669 | } |
|
669 | 670 | |
|
670 | 671 | /* Scan input and build symbol stats */ |
|
671 | { CHECK_V_F(largest, HIST_count_wksp (table->count, &maxSymbolValue, (const BYTE*)src, srcSize, | 

672 | { CHECK_V_F(largest, HIST_count_wksp (table->count, &maxSymbolValue, (const BYTE*)src, srcSize, workSpace, wkspSize) ); | |
|
672 | 673 | if (largest == srcSize) { *ostart = ((const BYTE*)src)[0]; return 1; } /* single symbol, rle */ |
|
673 | 674 | if (largest <= (srcSize >> 7)+4) return 0; /* heuristic : probably not compressible enough */ |
|
674 | 675 | } |
@@ -683,14 +684,15 b' static size_t HUF_compress_internal (' | |||
|
683 | 684 | if (preferRepeat && repeat && *repeat != HUF_repeat_none) { |
|
684 | 685 | return HUF_compressCTable_internal(ostart, op, oend, |
|
685 | 686 | src, srcSize, |
|
686 | | 

687 | nbStreams, oldHufTable, bmi2); | |
|
687 | 688 | } |
|
688 | 689 | |
|
689 | 690 | /* Build Huffman Tree */ |
|
690 | 691 | huffLog = HUF_optimalTableLog(huffLog, srcSize, maxSymbolValue); |
|
691 | { | 

692 | | 

693 | | 

692 | { size_t const maxBits = HUF_buildCTable_wksp(table->CTable, table->count, | |
|
693 | maxSymbolValue, huffLog, | |
|
694 | table->nodeTable, sizeof(table->nodeTable)); | |
|
695 | CHECK_F(maxBits); | |
|
694 | 696 | huffLog = (U32)maxBits; |
|
695 | 697 | /* Zero unused symbols in CTable, so we can check it for validity */ |
|
696 | 698 | memset(table->CTable + (maxSymbolValue + 1), 0, |
@@ -706,7 +708,7 b' static size_t HUF_compress_internal (' | |||
|
706 | 708 | if (oldSize <= hSize + newSize || hSize + 12 >= srcSize) { |
|
707 | 709 | return HUF_compressCTable_internal(ostart, op, oend, |
|
708 | 710 | src, srcSize, |
|
709 | | 

711 | nbStreams, oldHufTable, bmi2); | |
|
710 | 712 | } } |
|
711 | 713 | |
|
712 | 714 | /* Use the new huffman table */ |
@@ -718,7 +720,7 b' static size_t HUF_compress_internal (' | |||
|
718 | 720 | } |
|
719 | 721 | return HUF_compressCTable_internal(ostart, op, oend, |
|
720 | 722 | src, srcSize, |
|
721 | | 

723 | nbStreams, table->CTable, bmi2); | |
|
722 | 724 | } |
|
723 | 725 | |
|
724 | 726 | |
@@ -728,7 +730,7 b' size_t HUF_compress1X_wksp (void* dst, s' | |||
|
728 | 730 | void* workSpace, size_t wkspSize) |
|
729 | 731 | { |
|
730 | 732 | return HUF_compress_internal(dst, dstSize, src, srcSize, |
|
731 | maxSymbolValue, huffLog, | 

733 | maxSymbolValue, huffLog, HUF_singleStream, | |
|
732 | 734 | workSpace, wkspSize, |
|
733 | 735 | NULL, NULL, 0, 0 /*bmi2*/); |
|
734 | 736 | } |
@@ -740,7 +742,7 b' size_t HUF_compress1X_repeat (void* dst,' | |||
|
740 | 742 | HUF_CElt* hufTable, HUF_repeat* repeat, int preferRepeat, int bmi2) |
|
741 | 743 | { |
|
742 | 744 | return HUF_compress_internal(dst, dstSize, src, srcSize, |
|
743 | maxSymbolValue, huffLog, | 

745 | maxSymbolValue, huffLog, HUF_singleStream, | |
|
744 | 746 | workSpace, wkspSize, hufTable, |
|
745 | 747 | repeat, preferRepeat, bmi2); |
|
746 | 748 | } |
@@ -762,7 +764,7 b' size_t HUF_compress4X_wksp (void* dst, s' | |||
|
762 | 764 | void* workSpace, size_t wkspSize) |
|
763 | 765 | { |
|
764 | 766 | return HUF_compress_internal(dst, dstSize, src, srcSize, |
|
765 | maxSymbolValue, huffLog, | 

767 | maxSymbolValue, huffLog, HUF_fourStreams, | |
|
766 | 768 | workSpace, wkspSize, |
|
767 | 769 | NULL, NULL, 0, 0 /*bmi2*/); |
|
768 | 770 | } |
@@ -777,7 +779,7 b' size_t HUF_compress4X_repeat (void* dst,' | |||
|
777 | 779 | HUF_CElt* hufTable, HUF_repeat* repeat, int preferRepeat, int bmi2) |
|
778 | 780 | { |
|
779 | 781 | return HUF_compress_internal(dst, dstSize, src, srcSize, |
|
780 | maxSymbolValue, huffLog, | 

782 | maxSymbolValue, huffLog, HUF_fourStreams, | |
|
781 | 783 | workSpace, wkspSize, |
|
782 | 784 | hufTable, repeat, preferRepeat, bmi2); |
|
783 | 785 | } |
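The HIST_count_wksp / HIST_countFast_wksp changes above add an explicit `workSpaceSize` argument so the callee can validate a caller-owned scratch buffer before writing into it, instead of trusting its size. A minimal standalone sketch of that pattern follows; it is not the zstd implementation, and the names `hist_count_wksp`, `HIST_WKSP_U32`, and `HIST_WKSP_BYTES` are illustrative only (zstd returns an error code on a too-small workspace; this sketch just returns 0):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HIST_WKSP_U32   256                          /* one counter per byte value */
#define HIST_WKSP_BYTES (HIST_WKSP_U32 * sizeof(unsigned))

/* Count byte frequencies of `src` into `count` (256 entries), using a
 * caller-provided scratch buffer that is size-checked before use.
 * Reports the largest symbol value seen via *maxSymbolValuePtr.
 * Returns the count of the most frequent byte, or 0 if the workspace
 * is too small. */
static size_t hist_count_wksp(unsigned* count, unsigned* maxSymbolValuePtr,
                              const void* src, size_t srcSize,
                              void* workSpace, size_t workSpaceSize)
{
    const unsigned char* ip = (const unsigned char*)src;
    unsigned* const wksp = (unsigned*)workSpace;
    unsigned maxSym = 0;
    size_t largest = 0;
    size_t i;

    if (workSpaceSize < HIST_WKSP_BYTES) return 0;   /* validate before use */
    memset(wksp, 0, HIST_WKSP_BYTES);

    for (i = 0; i < srcSize; i++) wksp[ip[i]]++;     /* histogram pass */

    for (i = 0; i < HIST_WKSP_U32; i++) {
        if (wksp[i]) maxSym = (unsigned)i;
        if (wksp[i] > largest) largest = wksp[i];
    }
    memcpy(count, wksp, HIST_WKSP_BYTES);
    *maxSymbolValuePtr = maxSym;
    return largest;
}
```

Passing the size alongside the pointer lets the function fail cleanly on an undersized buffer, which is the behavior the diff adds via the new `workSpace_tooSmall` check.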
This diff has been collapsed as it changes many lines, (994 lines changed) | |||
@@ -11,6 +11,7 b'' | |||
|
11 | 11 | /*-************************************* |
|
12 | 12 | * Dependencies |
|
13 | 13 | ***************************************/ |
|
14 | #include <limits.h> /* INT_MAX */ | |
|
14 | 15 | #include <string.h> /* memset */ |
|
15 | 16 | #include "cpu.h" |
|
16 | 17 | #include "mem.h" |
@@ -61,7 +62,7 b' static void ZSTD_initCCtx(ZSTD_CCtx* cct' | |||
|
61 | 62 | memset(cctx, 0, sizeof(*cctx)); |
|
62 | 63 | cctx->customMem = memManager; |
|
63 | 64 | cctx->bmi2 = ZSTD_cpuid_bmi2(ZSTD_cpuid()); |
|
64 | { size_t const err = ZSTD_CCtx_reset | 

65 | { size_t const err = ZSTD_CCtx_reset(cctx, ZSTD_reset_parameters); | |
|
65 | 66 | assert(!ZSTD_isError(err)); |
|
66 | 67 | (void)err; |
|
67 | 68 | } |
@@ -128,7 +129,7 b' static size_t ZSTD_sizeof_mtctx(const ZS' | |||
|
128 | 129 | #ifdef ZSTD_MULTITHREAD |
|
129 | 130 | return ZSTDMT_sizeof_CCtx(cctx->mtctx); |
|
130 | 131 | #else |
|
131 | (void) | 

132 | (void)cctx; | |
|
132 | 133 | return 0; |
|
133 | 134 | #endif |
|
134 | 135 | } |
@@ -226,9 +227,160 b' static ZSTD_CCtx_params ZSTD_assignParam' | |||
|
226 | 227 | return ret; |
|
227 | 228 | } |
|
228 | 229 | |
|
229 | #define CLAMPCHECK(val,min,max) { \ | |
|
230 | if (((val)<(min)) | ((val)>(max))) { \ | |
|
231 | return ERROR(parameter_outOfBound); \ | |
|
230 | ZSTD_bounds ZSTD_cParam_getBounds(ZSTD_cParameter param) | |
|
231 | { | |
|
232 | ZSTD_bounds bounds = { 0, 0, 0 }; | |
|
233 | ||
|
234 | switch(param) | |
|
235 | { | |
|
236 | case ZSTD_c_compressionLevel: | |
|
237 | bounds.lowerBound = ZSTD_minCLevel(); | |
|
238 | bounds.upperBound = ZSTD_maxCLevel(); | |
|
239 | return bounds; | |
|
240 | ||
|
241 | case ZSTD_c_windowLog: | |
|
242 | bounds.lowerBound = ZSTD_WINDOWLOG_MIN; | |
|
243 | bounds.upperBound = ZSTD_WINDOWLOG_MAX; | |
|
244 | return bounds; | |
|
245 | ||
|
246 | case ZSTD_c_hashLog: | |
|
247 | bounds.lowerBound = ZSTD_HASHLOG_MIN; | |
|
248 | bounds.upperBound = ZSTD_HASHLOG_MAX; | |
|
249 | return bounds; | |
|
250 | ||
|
251 | case ZSTD_c_chainLog: | |
|
252 | bounds.lowerBound = ZSTD_CHAINLOG_MIN; | |
|
253 | bounds.upperBound = ZSTD_CHAINLOG_MAX; | |
|
254 | return bounds; | |
|
255 | ||
|
256 | case ZSTD_c_searchLog: | |
|
257 | bounds.lowerBound = ZSTD_SEARCHLOG_MIN; | |
|
258 | bounds.upperBound = ZSTD_SEARCHLOG_MAX; | |
|
259 | return bounds; | |
|
260 | ||
|
261 | case ZSTD_c_minMatch: | |
|
262 | bounds.lowerBound = ZSTD_MINMATCH_MIN; | |
|
263 | bounds.upperBound = ZSTD_MINMATCH_MAX; | |
|
264 | return bounds; | |
|
265 | ||
|
266 | case ZSTD_c_targetLength: | |
|
267 | bounds.lowerBound = ZSTD_TARGETLENGTH_MIN; | |
|
268 | bounds.upperBound = ZSTD_TARGETLENGTH_MAX; | |
|
269 | return bounds; | |
|
270 | ||
|
271 | case ZSTD_c_strategy: | |
|
272 | bounds.lowerBound = ZSTD_STRATEGY_MIN; | |
|
273 | bounds.upperBound = ZSTD_STRATEGY_MAX; | |
|
274 | return bounds; | |
|
275 | ||
|
276 | case ZSTD_c_contentSizeFlag: | |
|
277 | bounds.lowerBound = 0; | |
|
278 | bounds.upperBound = 1; | |
|
279 | return bounds; | |
|
280 | ||
|
281 | case ZSTD_c_checksumFlag: | |
|
282 | bounds.lowerBound = 0; | |
|
283 | bounds.upperBound = 1; | |
|
284 | return bounds; | |
|
285 | ||
|
286 | case ZSTD_c_dictIDFlag: | |
|
287 | bounds.lowerBound = 0; | |
|
288 | bounds.upperBound = 1; | |
|
289 | return bounds; | |
|
290 | ||
|
291 | case ZSTD_c_nbWorkers: | |
|
292 | bounds.lowerBound = 0; | |
|
293 | #ifdef ZSTD_MULTITHREAD | |
|
294 | bounds.upperBound = ZSTDMT_NBWORKERS_MAX; | |
|
295 | #else | |
|
296 | bounds.upperBound = 0; | |
|
297 | #endif | |
|
298 | return bounds; | |
|
299 | ||
|
300 | case ZSTD_c_jobSize: | |
|
301 | bounds.lowerBound = 0; | |
|
302 | #ifdef ZSTD_MULTITHREAD | |
|
303 | bounds.upperBound = ZSTDMT_JOBSIZE_MAX; | |
|
304 | #else | |
|
305 | bounds.upperBound = 0; | |
|
306 | #endif | |
|
307 | return bounds; | |
|
308 | ||
|
309 | case ZSTD_c_overlapLog: | |
|
310 | bounds.lowerBound = ZSTD_OVERLAPLOG_MIN; | |
|
311 | bounds.upperBound = ZSTD_OVERLAPLOG_MAX; | |
|
312 | return bounds; | |
|
313 | ||
|
314 | case ZSTD_c_enableLongDistanceMatching: | |
|
315 | bounds.lowerBound = 0; | |
|
316 | bounds.upperBound = 1; | |
|
317 | return bounds; | |
|
318 | ||
|
319 | case ZSTD_c_ldmHashLog: | |
|
320 | bounds.lowerBound = ZSTD_LDM_HASHLOG_MIN; | |
|
321 | bounds.upperBound = ZSTD_LDM_HASHLOG_MAX; | |
|
322 | return bounds; | |
|
323 | ||
|
324 | case ZSTD_c_ldmMinMatch: | |
|
325 | bounds.lowerBound = ZSTD_LDM_MINMATCH_MIN; | |
|
326 | bounds.upperBound = ZSTD_LDM_MINMATCH_MAX; | |
|
327 | return bounds; | |
|
328 | ||
|
329 | case ZSTD_c_ldmBucketSizeLog: | |
|
330 | bounds.lowerBound = ZSTD_LDM_BUCKETSIZELOG_MIN; | |
|
331 | bounds.upperBound = ZSTD_LDM_BUCKETSIZELOG_MAX; | |
|
332 | return bounds; | |
|
333 | ||
|
334 | case ZSTD_c_ldmHashRateLog: | |
|
335 | bounds.lowerBound = ZSTD_LDM_HASHRATELOG_MIN; | |
|
336 | bounds.upperBound = ZSTD_LDM_HASHRATELOG_MAX; | |
|
337 | return bounds; | |
|
338 | ||
|
339 | /* experimental parameters */ | |
|
340 | case ZSTD_c_rsyncable: | |
|
341 | bounds.lowerBound = 0; | |
|
342 | bounds.upperBound = 1; | |
|
343 | return bounds; | |
|
344 | ||
|
345 | case ZSTD_c_forceMaxWindow : | |
|
346 | bounds.lowerBound = 0; | |
|
347 | bounds.upperBound = 1; | |
|
348 | return bounds; | |
|
349 | ||
|
350 | case ZSTD_c_format: | |
|
351 | ZSTD_STATIC_ASSERT(ZSTD_f_zstd1 < ZSTD_f_zstd1_magicless); | |
|
352 | bounds.lowerBound = ZSTD_f_zstd1; | |
|
353 | bounds.upperBound = ZSTD_f_zstd1_magicless; /* note : how to ensure at compile time that this is the highest value enum ? */ | |
|
354 | return bounds; | |
|
355 | ||
|
356 | case ZSTD_c_forceAttachDict: | |
|
357 | ZSTD_STATIC_ASSERT(ZSTD_dictDefaultAttach < ZSTD_dictForceCopy); | |
|
358 | bounds.lowerBound = ZSTD_dictDefaultAttach; | |
|
359 | bounds.upperBound = ZSTD_dictForceCopy; /* note : how to ensure at compile time that this is the highest value enum ? */ | |
|
360 | return bounds; | |
|
361 | ||
|
362 | default: | |
|
363 | { ZSTD_bounds const boundError = { ERROR(parameter_unsupported), 0, 0 }; | |
|
364 | return boundError; | |
|
365 | } | |
|
366 | } | |
|
367 | } | |
|
368 | ||
|
369 | /* ZSTD_cParam_withinBounds: | |
|
370 | * @return 1 if value is within cParam bounds, | |
|
371 | * 0 otherwise */ | |
|
372 | static int ZSTD_cParam_withinBounds(ZSTD_cParameter cParam, int value) | |
|
373 | { | |
|
374 | ZSTD_bounds const bounds = ZSTD_cParam_getBounds(cParam); | |
|
375 | if (ZSTD_isError(bounds.error)) return 0; | |
|
376 | if (value < bounds.lowerBound) return 0; | |
|
377 | if (value > bounds.upperBound) return 0; | |
|
378 | return 1; | |
|
379 | } | |
|
380 | ||
|
381 | #define BOUNDCHECK(cParam, val) { \ | |
|
382 | if (!ZSTD_cParam_withinBounds(cParam,val)) { \ | |
|
383 | return ERROR(parameter_outOfBound); \ | |
|
232 | 384 | } } |
|
233 | 385 | |
|
234 | 386 | |
@@ -236,38 +388,39 b' static int ZSTD_isUpdateAuthorized(ZSTD_' | |||
|
236 | 388 | { |
|
237 | 389 | switch(param) |
|
238 | 390 | { |
|
239 | case ZSTD_ | 

240 | case ZSTD_ | 

241 | case ZSTD_ | 

242 | case ZSTD_ | 

243 | case ZSTD_ | 

244 | case ZSTD_ | 

245 | case ZSTD_ | 

391 | case ZSTD_c_compressionLevel: | |
|
392 | case ZSTD_c_hashLog: | |
|
393 | case ZSTD_c_chainLog: | |
|
394 | case ZSTD_c_searchLog: | |
|
395 | case ZSTD_c_minMatch: | |
|
396 | case ZSTD_c_targetLength: | |
|
397 | case ZSTD_c_strategy: | |
|
246 | 398 | return 1; |
|
247 | 399 | |
|
248 | case ZSTD_ | 

249 | case ZSTD_ | 

250 | case ZSTD_ | 

251 | case ZSTD_ | 

252 | case ZSTD_ | 

253 | case ZSTD_ | 

254 | case ZSTD_ | 

255 | case ZSTD_ | 

256 | case ZSTD_ | 

257 | case ZSTD_ | 

258 | case ZSTD_p_ldmHashLog: | |
|
259 | case ZSTD_ | 

260 | case ZSTD_ | 

261 | case ZSTD_ | 

262 | case ZSTD_p_forceAttachDict: | |
|
400 | case ZSTD_c_format: | |
|
401 | case ZSTD_c_windowLog: | |
|
402 | case ZSTD_c_contentSizeFlag: | |
|
403 | case ZSTD_c_checksumFlag: | |
|
404 | case ZSTD_c_dictIDFlag: | |
|
405 | case ZSTD_c_forceMaxWindow : | |
|
406 | case ZSTD_c_nbWorkers: | |
|
407 | case ZSTD_c_jobSize: | |
|
408 | case ZSTD_c_overlapLog: | |
|
409 | case ZSTD_c_rsyncable: | |
|
410 | case ZSTD_c_enableLongDistanceMatching: | |
|
411 | case ZSTD_c_ldmHashLog: | |
|
412 | case ZSTD_c_ldmMinMatch: | |
|
413 | case ZSTD_c_ldmBucketSizeLog: | |
|
414 | case ZSTD_c_ldmHashRateLog: | |
|
415 | case ZSTD_c_forceAttachDict: | |
|
263 | 416 | default: |
|
264 | 417 | return 0; |
|
265 | 418 | } |
|
266 | 419 | } |
|
267 | 420 | |
|
268 | size_t ZSTD_CCtx_setParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, | 

421 | size_t ZSTD_CCtx_setParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, int value) | |
|
269 | 422 | { |
|
270 | DEBUGLOG(4, "ZSTD_CCtx_setParameter (% | 

423 | DEBUGLOG(4, "ZSTD_CCtx_setParameter (%i, %i)", (int)param, value); | |
|
271 | 424 | if (cctx->streamStage != zcss_init) { |
|
272 | 425 | if (ZSTD_isUpdateAuthorized(param)) { |
|
273 | 426 | cctx->cParamsChanged = 1; |
@@ -277,51 +430,52 b' size_t ZSTD_CCtx_setParameter(ZSTD_CCtx*' | |||
|
277 | 430 | |
|
278 | 431 | switch(param) |
|
279 | 432 | { |
|
280 | case ZSTD_ | 

433 | case ZSTD_c_format : | |
|
281 | 434 | return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value); |
|
282 | 435 | |
|
283 | case ZSTD_ | 

436 | case ZSTD_c_compressionLevel: | |
|
284 | 437 | if (cctx->cdict) return ERROR(stage_wrong); |
|
285 | 438 | return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value); |
|
286 | 439 | |
|
287 | case ZSTD_ | 

288 | case ZSTD_ | 

289 | case ZSTD_ | 

290 | case ZSTD_ | 

291 | case ZSTD_ | 

292 | case ZSTD_ | 

293 | case ZSTD_ | 

440 | case ZSTD_c_windowLog: | |
|
441 | case ZSTD_c_hashLog: | |
|
442 | case ZSTD_c_chainLog: | |
|
443 | case ZSTD_c_searchLog: | |
|
444 | case ZSTD_c_minMatch: | |
|
445 | case ZSTD_c_targetLength: | |
|
446 | case ZSTD_c_strategy: | |
|
294 | 447 | if (cctx->cdict) return ERROR(stage_wrong); |
|
295 | 448 | return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value); |
|
296 | 449 | |
|
297 | case ZSTD_ | 

298 | case ZSTD_ | 

299 | case ZSTD_ | 

450 | case ZSTD_c_contentSizeFlag: | |
|
451 | case ZSTD_c_checksumFlag: | |
|
452 | case ZSTD_c_dictIDFlag: | |
|
300 | 453 | return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value); |
|
301 | 454 | |
|
302 | case ZSTD_ | 

455 | case ZSTD_c_forceMaxWindow : /* Force back-references to remain < windowSize, | |
|
303 | 456 | * even when referencing into Dictionary content. |
|
304 | 457 | * default : 0 when using a CDict, 1 when using a Prefix */ |
|
305 | 458 | return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value); |
|
306 | 459 | |
|
307 | case ZSTD_ | 

460 | case ZSTD_c_forceAttachDict: | |
|
308 | 461 | return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value); |
|
309 | 462 | |
|
310 | case ZSTD_ | 

311 | if ((value | 

463 | case ZSTD_c_nbWorkers: | |
|
464 | if ((value!=0) && cctx->staticSize) { | |
|
312 | 465 | return ERROR(parameter_unsupported); /* MT not compatible with static alloc */ |
|
313 | 466 | } |
|
314 | 467 | return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value); |
|
315 | 468 | |
|
316 | case ZSTD_ | 

317 | case ZSTD_ | 

469 | case ZSTD_c_jobSize: | |
|
470 | case ZSTD_c_overlapLog: | |
|
471 | case ZSTD_c_rsyncable: | |
|
318 | 472 | return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value); |
|
319 | 473 | |
|
320 | case ZSTD_ | 

321 | case ZSTD_ | 

322 | case ZSTD_ | 

323 | case ZSTD_ | 

324 | case ZSTD_ | 

474 | case ZSTD_c_enableLongDistanceMatching: | |
|
475 | case ZSTD_c_ldmHashLog: | |
|
476 | case ZSTD_c_ldmMinMatch: | |
|
477 | case ZSTD_c_ldmBucketSizeLog: | |
|
478 | case ZSTD_c_ldmHashRateLog: | |
|
325 | 479 | if (cctx->cdict) return ERROR(stage_wrong); |
|
326 | 480 | return ZSTD_CCtxParam_setParameter(&cctx->requestedParams, param, value); |
|
327 | 481 | |
@@ -329,21 +483,21 b' size_t ZSTD_CCtx_setParameter(ZSTD_CCtx*' | |||
|
329 | 483 | } |
|
330 | 484 | } |
|
331 | 485 | |
|
332 | size_t ZSTD_CCtxParam_setParameter( | |
|
333 | ZSTD_CCtx_params* CCtxParams, ZSTD_cParameter param, unsigned value) | |
|
486 | size_t ZSTD_CCtxParam_setParameter(ZSTD_CCtx_params* CCtxParams, | |
|
487 | ZSTD_cParameter param, int value) | |
|
334 | 488 | { |
|
335 | DEBUGLOG(4, "ZSTD_CCtxParam_setParameter (% | 

489 | DEBUGLOG(4, "ZSTD_CCtxParam_setParameter (%i, %i)", (int)param, value); | |
|
336 | 490 | switch(param) |
|
337 | 491 | { |
|
338 | case ZSTD_ | 

339 | if (value > (unsigned)ZSTD_f_zstd1_magicless) | |
|
340 | return ERROR(parameter_unsupported); | |
|
492 | case ZSTD_c_format : | |
|
493 | BOUNDCHECK(ZSTD_c_format, value); | |
|
341 | 494 | CCtxParams->format = (ZSTD_format_e)value; |
|
342 | 495 | return (size_t)CCtxParams->format; |
|
343 | 496 | |
|
344 | case ZSTD_ | 

345 | int cLevel = (int)value; /* cast expected to restore negative sign */ | |
|
497 | case ZSTD_c_compressionLevel : { | |
|
498 | int cLevel = value; | |
|
346 | 499 | if (cLevel > ZSTD_maxCLevel()) cLevel = ZSTD_maxCLevel(); |
|
500 | if (cLevel < ZSTD_minCLevel()) cLevel = ZSTD_minCLevel(); | |
|
347 | 501 | if (cLevel) { /* 0 : does not change current level */ |
|
348 | 502 | CCtxParams->compressionLevel = cLevel; |
|
349 | 503 | } |
@@ -351,213 +505,229 b' size_t ZSTD_CCtxParam_setParameter(' | |||
|
351 | 505 | return 0; /* return type (size_t) cannot represent negative values */ |
|
352 | 506 | } |
|
353 | 507 | |
|
354 | case ZSTD_ | 

355 | if (value | 

356 | CLAMPCHECK(value, ZSTD_WINDOWLOG_MIN, ZSTD_WINDOWLOG_MAX); | |
|
508 | case ZSTD_c_windowLog : | |
|
509 | if (value!=0) /* 0 => use default */ | |
|
510 | BOUNDCHECK(ZSTD_c_windowLog, value); | |
|
357 | 511 | CCtxParams->cParams.windowLog = value; |
|
358 | 512 | return CCtxParams->cParams.windowLog; |
|
359 | 513 | |
|
360 | case ZSTD_ | 

361 | if (value | 

362 | CLAMPCHECK(value, ZSTD_HASHLOG_MIN, ZSTD_HASHLOG_MAX); | |
|
514 | case ZSTD_c_hashLog : | |
|
515 | if (value!=0) /* 0 => use default */ | |
|
516 | BOUNDCHECK(ZSTD_c_hashLog, value); | |
|
363 | 517 | CCtxParams->cParams.hashLog = value; |
|
364 | 518 | return CCtxParams->cParams.hashLog; |
|
365 | 519 | |
|
366 | case ZSTD_ | 

367 | if (value | 

368 | CLAMPCHECK(value, ZSTD_CHAINLOG_MIN, ZSTD_CHAINLOG_MAX); | |
|
520 | case ZSTD_c_chainLog : | |
|
521 | if (value!=0) /* 0 => use default */ | |
|
522 | BOUNDCHECK(ZSTD_c_chainLog, value); | |
|
369 | 523 | CCtxParams->cParams.chainLog = value; |
|
370 | 524 | return CCtxParams->cParams.chainLog; |
|
371 | 525 | |
|
372 | case ZSTD_ | 

373 | if (value | 

374 | CLAMPCHECK(value, ZSTD_SEARCHLOG_MIN, ZSTD_SEARCHLOG_MAX); | |
|
526 | case ZSTD_c_searchLog : | |
|
527 | if (value!=0) /* 0 => use default */ | |
|
528 | BOUNDCHECK(ZSTD_c_searchLog, value); | |
|
375 | 529 | CCtxParams->cParams.searchLog = value; |
|
376 | 530 | return value; |
|
377 | 531 | |
|
378 | case ZSTD_ | 

379 | if (value | 

380 | CLAMPCHECK(value, ZSTD_SEARCHLENGTH_MIN, ZSTD_SEARCHLENGTH_MAX); | |
|
381 | CCtxParams->cParams. | 

382 | return CCtxParams->cParams. | 

383 | ||
|
384 | case ZSTD_ | 

385 | /* all values are valid. 0 => use default */ | |
|
532 | case ZSTD_c_minMatch : | |
|
533 | if (value!=0) /* 0 => use default */ | |
|
534 | BOUNDCHECK(ZSTD_c_minMatch, value); | |
|
535 | CCtxParams->cParams.minMatch = value; | |
|
536 | return CCtxParams->cParams.minMatch; | |
|
537 | ||
|
538 | case ZSTD_c_targetLength : | |
|
539 | BOUNDCHECK(ZSTD_c_targetLength, value); | |
|
386 | 540 | CCtxParams->cParams.targetLength = value; |
|
387 | 541 | return CCtxParams->cParams.targetLength; |
|
388 | 542 | |
|
389 | case ZSTD_ | 

390 | if (value | 

391 | CLAMPCHECK(value, (unsigned)ZSTD_fast, (unsigned)ZSTD_btultra); | |
|
543 | case ZSTD_c_strategy : | |
|
544 | if (value!=0) /* 0 => use default */ | |
|
545 | BOUNDCHECK(ZSTD_c_strategy, value); | |
|
392 | 546 | CCtxParams->cParams.strategy = (ZSTD_strategy)value; |
|
393 | 547 | return (size_t)CCtxParams->cParams.strategy; |
|
394 | 548 | |
|
395 | case ZSTD_ | 

549 | case ZSTD_c_contentSizeFlag : | |
|
396 | 550 | /* Content size written in frame header _when known_ (default:1) */ |
|
397 | DEBUGLOG(4, "set content size flag = %u", (value | 

398 | CCtxParams->fParams.contentSizeFlag = value | 

551 | DEBUGLOG(4, "set content size flag = %u", (value!=0)); | |
|
552 | CCtxParams->fParams.contentSizeFlag = value != 0; | |
|
399 | 553 | return CCtxParams->fParams.contentSizeFlag; |
|
400 | 554 | |
|
401 | case ZSTD_ | 

555 | case ZSTD_c_checksumFlag : | |
|
402 | 556 | /* A 32-bits content checksum will be calculated and written at end of frame (default:0) */ |
|
403 | CCtxParams->fParams.checksumFlag = value | 

557 | CCtxParams->fParams.checksumFlag = value != 0; | |
|
404 | 558 | return CCtxParams->fParams.checksumFlag; |
|
405 | 559 | |
|
406 | case ZSTD_ | 

407 | DEBUGLOG(4, "set dictIDFlag = %u", (value | 

560 | case ZSTD_c_dictIDFlag : /* When applicable, dictionary's dictID is provided in frame header (default:1) */ | |
|
561 | DEBUGLOG(4, "set dictIDFlag = %u", (value!=0)); | |
|
408 | 562 | CCtxParams->fParams.noDictIDFlag = !value; |
|
409 | 563 | return !CCtxParams->fParams.noDictIDFlag; |
|
410 | 564 | |
|
411 | case ZSTD_ | 

412 | CCtxParams->forceWindow = (value | 

565 | case ZSTD_c_forceMaxWindow : | |
|
566 | CCtxParams->forceWindow = (value != 0); | |
|
413 | 567 | return CCtxParams->forceWindow; |
|
414 | 568 | |
|
415 | case ZSTD_ | 

416 | CCtxParams->attachDictPref = value ? | |
|
417 | (value > 0 ? ZSTD_dictForceAttach : ZSTD_dictForceCopy) : | |
|
418 | ZSTD_dictDefaultAttach; | |
|
569 | case ZSTD_c_forceAttachDict : { | |
|
570 | const ZSTD_dictAttachPref_e pref = (ZSTD_dictAttachPref_e)value; | |
|
571 | BOUNDCHECK(ZSTD_c_forceAttachDict, pref); | |
|
572 | CCtxParams->attachDictPref = pref; | |
|
419 | 573 | return CCtxParams->attachDictPref; |
|
420 | ||
|
421 | case ZSTD_p_nbWorkers : | |
|
574 | } | |
|
575 | ||
|
576 | case ZSTD_c_nbWorkers : | |
|
422 | 577 | #ifndef ZSTD_MULTITHREAD |
|
423 | if (value | 

578 | if (value!=0) return ERROR(parameter_unsupported); | |
|
424 | 579 | return 0; |
|
425 | 580 | #else |
|
426 | 581 | return ZSTDMT_CCtxParam_setNbWorkers(CCtxParams, value); |
|
427 | 582 | #endif |
|
428 | 583 | |
|
429 | case ZSTD_ | 

584 | case ZSTD_c_jobSize : | |
|
430 | 585 | #ifndef ZSTD_MULTITHREAD |
|
431 | 586 | return ERROR(parameter_unsupported); |
|
432 | 587 | #else |
|
433 | 588 | return ZSTDMT_CCtxParam_setMTCtxParameter(CCtxParams, ZSTDMT_p_jobSize, value); |
|
434 | 589 | #endif |
|
435 | 590 | |
|
436 | case ZSTD_ | 

591 | case ZSTD_c_overlapLog : | |
|
592 | #ifndef ZSTD_MULTITHREAD | |
|
593 | return ERROR(parameter_unsupported); | |
|
594 | #else | |
|
595 | return ZSTDMT_CCtxParam_setMTCtxParameter(CCtxParams, ZSTDMT_p_overlapLog, value); | |
|
596 | #endif | |
|
597 | ||
|
598 | case ZSTD_c_rsyncable : | |
|
437 | 599 | #ifndef ZSTD_MULTITHREAD |
|
438 | 600 | return ERROR(parameter_unsupported); |
|
439 | 601 | #else |
|
440 | return ZSTDMT_CCtxParam_setMTCtxParameter(CCtxParams, ZSTDMT_p_ | 

602 | return ZSTDMT_CCtxParam_setMTCtxParameter(CCtxParams, ZSTDMT_p_rsyncable, value); | |
|
441 | 603 | #endif |
|
442 | 604 | |
|
443 | case ZSTD_ | 

444 |
CCtxParams->ldmParams.enableLdm = (value |
|
|
605 | case ZSTD_c_enableLongDistanceMatching : | |
|
606 | CCtxParams->ldmParams.enableLdm = (value!=0); | |
|
445 | 607 | return CCtxParams->ldmParams.enableLdm; |
|
446 | 608 | |
|
447 | case ZSTD_ | 

448 | if (value | 

449 | CLAMPCHECK(value, ZSTD_HASHLOG_MIN, ZSTD_HASHLOG_MAX); | |
|
609 | case ZSTD_c_ldmHashLog : | |
|
610 | if (value!=0) /* 0 ==> auto */ | |
|
611 | BOUNDCHECK(ZSTD_c_ldmHashLog, value); | |
|
450 | 612 | CCtxParams->ldmParams.hashLog = value; |
|
451 | 613 | return CCtxParams->ldmParams.hashLog; |
|
452 | 614 | |
|
453 |
case ZSTD_ |
|
|
454 |
if (value |
|
|
455 | CLAMPCHECK(value, ZSTD_LDM_MINMATCH_MIN, ZSTD_LDM_MINMATCH_MAX); | |
|
615 | case ZSTD_c_ldmMinMatch : | |
|
616 | if (value!=0) /* 0 ==> default */ | |
|
617 | BOUNDCHECK(ZSTD_c_ldmMinMatch, value); | |
|
456 | 618 | CCtxParams->ldmParams.minMatchLength = value; |
|
457 | 619 | return CCtxParams->ldmParams.minMatchLength; |
|
458 | 620 | |
|
459 |
case ZSTD_ |
|
|
460 | if (value > ZSTD_LDM_BUCKETSIZELOG_MAX) | |
|
461 | return ERROR(parameter_outOfBound); | |
|
621 | case ZSTD_c_ldmBucketSizeLog : | |
|
622 | if (value!=0) /* 0 ==> default */ | |
|
623 | BOUNDCHECK(ZSTD_c_ldmBucketSizeLog, value); | |
|
462 | 624 | CCtxParams->ldmParams.bucketSizeLog = value; |
|
463 | 625 | return CCtxParams->ldmParams.bucketSizeLog; |
|
464 | 626 | |
|
465 |
case ZSTD_ |
|
|
627 | case ZSTD_c_ldmHashRateLog : | |
|
466 | 628 | if (value > ZSTD_WINDOWLOG_MAX - ZSTD_HASHLOG_MIN) |
|
467 | 629 | return ERROR(parameter_outOfBound); |
|
468 |
CCtxParams->ldmParams.hash |
|
|
469 |
return CCtxParams->ldmParams.hash |
|
|
630 | CCtxParams->ldmParams.hashRateLog = value; | |
|
631 | return CCtxParams->ldmParams.hashRateLog; | |
|
470 | 632 | |
|
471 | 633 | default: return ERROR(parameter_unsupported); |
|
472 | 634 | } |
|
473 | 635 | } |
|
474 | 636 | |
|
475 |
size_t ZSTD_CCtx_getParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, |
|
|
637 | size_t ZSTD_CCtx_getParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, int* value) | |
|
476 | 638 | { |
|
477 | 639 | return ZSTD_CCtxParam_getParameter(&cctx->requestedParams, param, value); |
|
478 | 640 | } |
|
479 | 641 | |
|
480 | 642 | size_t ZSTD_CCtxParam_getParameter( |
|
481 |
ZSTD_CCtx_params* CCtxParams, ZSTD_cParameter param, |
|
|
643 | ZSTD_CCtx_params* CCtxParams, ZSTD_cParameter param, int* value) | |
|
482 | 644 | { |
|
483 | 645 | switch(param) |
|
484 | 646 | { |
|
485 |
case ZSTD_ |
|
|
647 | case ZSTD_c_format : | |
|
486 | 648 | *value = CCtxParams->format; |
|
487 | 649 | break; |
|
488 |
case ZSTD_ |
|
|
650 | case ZSTD_c_compressionLevel : | |
|
489 | 651 | *value = CCtxParams->compressionLevel; |
|
490 | 652 | break; |
|
491 |
case ZSTD_ |
|
|
653 | case ZSTD_c_windowLog : | |
|
492 | 654 | *value = CCtxParams->cParams.windowLog; |
|
493 | 655 | break; |
|
494 |
case ZSTD_ |
|
|
656 | case ZSTD_c_hashLog : | |
|
495 | 657 | *value = CCtxParams->cParams.hashLog; |
|
496 | 658 | break; |
|
497 |
case ZSTD_ |
|
|
659 | case ZSTD_c_chainLog : | |
|
498 | 660 | *value = CCtxParams->cParams.chainLog; |
|
499 | 661 | break; |
|
500 |
case ZSTD_ |
|
|
662 | case ZSTD_c_searchLog : | |
|
501 | 663 | *value = CCtxParams->cParams.searchLog; |
|
502 | 664 | break; |
|
503 |
case ZSTD_ |
|
|
504 |
*value = CCtxParams->cParams. |
|
|
665 | case ZSTD_c_minMatch : | |
|
666 | *value = CCtxParams->cParams.minMatch; | |
|
505 | 667 | break; |
|
506 |
case ZSTD_ |
|
|
668 | case ZSTD_c_targetLength : | |
|
507 | 669 | *value = CCtxParams->cParams.targetLength; |
|
508 | 670 | break; |
|
509 |
case ZSTD_ |
|
|
671 | case ZSTD_c_strategy : | |
|
510 | 672 | *value = (unsigned)CCtxParams->cParams.strategy; |
|
511 | 673 | break; |
|
512 |
case ZSTD_ |
|
|
674 | case ZSTD_c_contentSizeFlag : | |
|
513 | 675 | *value = CCtxParams->fParams.contentSizeFlag; |
|
514 | 676 | break; |
|
515 |
case ZSTD_ |
|
|
677 | case ZSTD_c_checksumFlag : | |
|
516 | 678 | *value = CCtxParams->fParams.checksumFlag; |
|
517 | 679 | break; |
|
518 |
case ZSTD_ |
|
|
680 | case ZSTD_c_dictIDFlag : | |
|
519 | 681 | *value = !CCtxParams->fParams.noDictIDFlag; |
|
520 | 682 | break; |
|
521 |
case ZSTD_ |
|
|
683 | case ZSTD_c_forceMaxWindow : | |
|
522 | 684 | *value = CCtxParams->forceWindow; |
|
523 | 685 | break; |
|
524 |
case ZSTD_ |
|
|
686 | case ZSTD_c_forceAttachDict : | |
|
525 | 687 | *value = CCtxParams->attachDictPref; |
|
526 | 688 | break; |
|
527 |
case ZSTD_ |
|
|
689 | case ZSTD_c_nbWorkers : | |
|
528 | 690 | #ifndef ZSTD_MULTITHREAD |
|
529 | 691 | assert(CCtxParams->nbWorkers == 0); |
|
530 | 692 | #endif |
|
531 | 693 | *value = CCtxParams->nbWorkers; |
|
532 | 694 | break; |
|
533 |
case ZSTD_ |
|
|
695 | case ZSTD_c_jobSize : | |
|
534 | 696 | #ifndef ZSTD_MULTITHREAD |
|
535 | 697 | return ERROR(parameter_unsupported); |
|
536 | 698 | #else |
|
537 |
|
|
|
699 | assert(CCtxParams->jobSize <= INT_MAX); | |
|
700 | *value = (int)CCtxParams->jobSize; | |
|
538 | 701 | break; |
|
539 | 702 | #endif |
|
540 |
case ZSTD_ |
|
|
703 | case ZSTD_c_overlapLog : | |
|
541 | 704 | #ifndef ZSTD_MULTITHREAD |
|
542 | 705 | return ERROR(parameter_unsupported); |
|
543 | 706 | #else |
|
544 |
*value = CCtxParams->overlap |
|
|
707 | *value = CCtxParams->overlapLog; | |
|
545 | 708 | break; |
|
546 | 709 | #endif |
|
547 |
case ZSTD_ |
|
|
710 | case ZSTD_c_rsyncable : | |
|
711 | #ifndef ZSTD_MULTITHREAD | |
|
712 | return ERROR(parameter_unsupported); | |
|
713 | #else | |
|
714 | *value = CCtxParams->rsyncable; | |
|
715 | break; | |
|
716 | #endif | |
|
717 | case ZSTD_c_enableLongDistanceMatching : | |
|
548 | 718 | *value = CCtxParams->ldmParams.enableLdm; |
|
549 | 719 | break; |
|
550 |
case ZSTD_ |
|
|
720 | case ZSTD_c_ldmHashLog : | |
|
551 | 721 | *value = CCtxParams->ldmParams.hashLog; |
|
552 | 722 | break; |
|
553 |
case ZSTD_ |
|
|
723 | case ZSTD_c_ldmMinMatch : | |
|
554 | 724 | *value = CCtxParams->ldmParams.minMatchLength; |
|
555 | 725 | break; |
|
556 |
case ZSTD_ |
|
|
726 | case ZSTD_c_ldmBucketSizeLog : | |
|
557 | 727 | *value = CCtxParams->ldmParams.bucketSizeLog; |
|
558 | 728 | break; |
|
559 |
case ZSTD_ |
|
|
560 |
*value = CCtxParams->ldmParams.hash |
|
|
729 | case ZSTD_c_ldmHashRateLog : | |
|
730 | *value = CCtxParams->ldmParams.hashRateLog; | |
|
561 | 731 | break; |
|
562 | 732 | default: return ERROR(parameter_unsupported); |
|
563 | 733 | } |
@@ -655,34 +825,35 b' size_t ZSTD_CCtx_refPrefix_advanced(' | |||
|
655 | 825 |
656 | 826 | /*! ZSTD_CCtx_reset() :
657 | 827 | * Also dumps dictionary */
658 | size_t ZSTD_CCtx_reset(ZSTD_CCtx* cctx) |
828 | size_t ZSTD_CCtx_reset(ZSTD_CCtx* cctx, ZSTD_ResetDirective reset) |
659 | 829 | {
660 | cctx->streamStage = zcss_init; |
661 | cctx->pledgedSrcSizePlusOne = 0; |
830 | if ( (reset == ZSTD_reset_session_only) |
831 | || (reset == ZSTD_reset_session_and_parameters) ) { |
832 | cctx->streamStage = zcss_init; |
833 | cctx->pledgedSrcSizePlusOne = 0; |
834 | } |
835 | if ( (reset == ZSTD_reset_parameters) |
836 | || (reset == ZSTD_reset_session_and_parameters) ) { |
837 | if (cctx->streamStage != zcss_init) return ERROR(stage_wrong); |
838 | cctx->cdict = NULL; |
839 | return ZSTD_CCtxParams_reset(&cctx->requestedParams); |
840 | } |
841 | return 0; |
662 | 842 | }
663 | 843 |
664 | size_t ZSTD_CCtx_resetParameters(ZSTD_CCtx* cctx) |
665 | { |
666 | if (cctx->streamStage != zcss_init) return ERROR(stage_wrong); |
667 | cctx->cdict = NULL; |
668 | return ZSTD_CCtxParams_reset(&cctx->requestedParams); |
669 | } |
670 | 844 |
671 | 845 | /** ZSTD_checkCParams() :
672 | 846 | control CParam values remain within authorized range.
673 | 847 | @return : 0, or an error code if one value is beyond authorized range */
674 | 848 | size_t ZSTD_checkCParams(ZSTD_compressionParameters cParams)
675 | 849 | {
676 | CLAMPCHECK(cParams.windowLog, ZSTD_WINDOWLOG_MIN, ZSTD_WINDOWLOG_MAX); |
677 | CLAMPCHECK(cParams.chainLog, ZSTD_CHAINLOG_MIN, ZSTD_CHAINLOG_MAX); |
678 | CLAMPCHECK(cParams.hashLog, ZSTD_HASHLOG_MIN, ZSTD_HASHLOG_MAX); |
679 | CLAMPCHECK(cParams.searchLog, ZSTD_SEARCHLOG_MIN, ZSTD_SEARCHLOG_MAX); |
680 | CLAMPCHECK(cParams.searchLength, ZSTD_SEARCHLENGTH_MIN, ZSTD_SEARCHLENGTH_MAX); |
681 | ZSTD_STATIC_ASSERT(ZSTD_TARGETLENGTH_MIN == 0); |
682 | if (cParams.targetLength > ZSTD_TARGETLENGTH_MAX) |
683 | return ERROR(parameter_outOfBound); |
684 | if ((U32)(cParams.strategy) > (U32)ZSTD_btultra) |
685 | return ERROR(parameter_unsupported); |
850 | BOUNDCHECK(ZSTD_c_windowLog, cParams.windowLog); |
851 | BOUNDCHECK(ZSTD_c_chainLog, cParams.chainLog); |
852 | BOUNDCHECK(ZSTD_c_hashLog, cParams.hashLog); |
853 | BOUNDCHECK(ZSTD_c_searchLog, cParams.searchLog); |
854 | BOUNDCHECK(ZSTD_c_minMatch, cParams.minMatch); |
855 | BOUNDCHECK(ZSTD_c_targetLength,cParams.targetLength); |
856 | BOUNDCHECK(ZSTD_c_strategy, cParams.strategy); |
686 | 857 | return 0;
687 | 858 | }
688 | 859 |
@@ -692,19 +863,19 b' size_t ZSTD_checkCParams(ZSTD_compressio' | |||
|
692 | 863 | static ZSTD_compressionParameters
693 | 864 | ZSTD_clampCParams(ZSTD_compressionParameters cParams)
694 | 865 | {
695 | # define CLAMP(val,min,max) { \ |
696 | if (val<min) val=min; \ |
697 | else if (val>max) val=max; \ |
866 | # define CLAMP_TYPE(cParam, val, type) { \ |
867 | ZSTD_bounds const bounds = ZSTD_cParam_getBounds(cParam); \ |
868 | if ((int)val<bounds.lowerBound) val=(type)bounds.lowerBound; \ |
869 | else if ((int)val>bounds.upperBound) val=(type)bounds.upperBound; \ |
698 | 870 | }
699 | CLAMP(cParams.windowLog, ZSTD_WINDOWLOG_MIN, ZSTD_WINDOWLOG_MAX); |
700 | CLAMP(cParams.chainLog, ZSTD_CHAINLOG_MIN, ZSTD_CHAINLOG_MAX); |
701 | CLAMP(cParams.hashLog, ZSTD_HASHLOG_MIN, ZSTD_HASHLOG_MAX); |
702 | CLAMP(cParams.searchLog, ZSTD_SEARCHLOG_MIN, ZSTD_SEARCHLOG_MAX); |
703 | CLAMP(cParams.searchLength, ZSTD_SEARCHLENGTH_MIN, ZSTD_SEARCHLENGTH_MAX); |
704 | ZSTD_STATIC_ASSERT(ZSTD_TARGETLENGTH_MIN == 0); |
705 | if (cParams.targetLength > ZSTD_TARGETLENGTH_MAX) |
706 | cParams.targetLength = ZSTD_TARGETLENGTH_MAX; |
707 | CLAMP(cParams.strategy, ZSTD_fast, ZSTD_btultra); |
871 | # define CLAMP(cParam, val) CLAMP_TYPE(cParam, val, int) |
872 | CLAMP(ZSTD_c_windowLog, cParams.windowLog); |
873 | CLAMP(ZSTD_c_chainLog, cParams.chainLog); |
874 | CLAMP(ZSTD_c_hashLog, cParams.hashLog); |
875 | CLAMP(ZSTD_c_searchLog, cParams.searchLog); |
876 | CLAMP(ZSTD_c_minMatch, cParams.minMatch); |
877 | CLAMP(ZSTD_c_targetLength,cParams.targetLength); |
878 | CLAMP_TYPE(ZSTD_c_strategy,cParams.strategy, ZSTD_strategy); |
708 | 879 | return cParams;
709 | 880 | }
710 | 881 |
@@ -774,7 +945,7 b' ZSTD_compressionParameters ZSTD_getCPara' | |||
|
774 | 945 | if (CCtxParams->cParams.hashLog) cParams.hashLog = CCtxParams->cParams.hashLog;
775 | 946 | if (CCtxParams->cParams.chainLog) cParams.chainLog = CCtxParams->cParams.chainLog;
776 | 947 | if (CCtxParams->cParams.searchLog) cParams.searchLog = CCtxParams->cParams.searchLog;
777 | if (CCtxParams->cParams.searchLength) cParams.searchLength = CCtxParams->cParams.searchLength; |
948 | if (CCtxParams->cParams.minMatch) cParams.minMatch = CCtxParams->cParams.minMatch; |
778 | 949 | if (CCtxParams->cParams.targetLength) cParams.targetLength = CCtxParams->cParams.targetLength;
779 | 950 | if (CCtxParams->cParams.strategy) cParams.strategy = CCtxParams->cParams.strategy;
780 | 951 | assert(!ZSTD_checkCParams(cParams));
@@ -787,13 +958,12 b' ZSTD_sizeof_matchState(const ZSTD_compre' | |||
|
787 | 958 | {
788 | 959 | size_t const chainSize = (cParams->strategy == ZSTD_fast) ? 0 : ((size_t)1 << cParams->chainLog);
789 | 960 | size_t const hSize = ((size_t)1) << cParams->hashLog;
790 | U32 const hashLog3 = (forCCtx && cParams->searchLength==3) ? MIN(ZSTD_HASHLOG3_MAX, cParams->windowLog) : 0; |
961 | U32 const hashLog3 = (forCCtx && cParams->minMatch==3) ? MIN(ZSTD_HASHLOG3_MAX, cParams->windowLog) : 0; |
791 | 962 | size_t const h3Size = ((size_t)1) << hashLog3;
792 | 963 | size_t const tableSpace = (chainSize + hSize + h3Size) * sizeof(U32);
793 | 964 | size_t const optPotentialSpace = ((MaxML+1) + (MaxLL+1) + (MaxOff+1) + (1<<Litbits)) * sizeof(U32)
794 | 965 | + (ZSTD_OPT_NUM+1) * (sizeof(ZSTD_match_t)+sizeof(ZSTD_optimal_t));
795 | size_t const optSpace = (forCCtx && ((cParams->strategy == ZSTD_btopt) || |
796 | (cParams->strategy == ZSTD_btultra))) |
966 | size_t const optSpace = (forCCtx && (cParams->strategy >= ZSTD_btopt)) |
797 | 967 | ? optPotentialSpace
798 | 968 | : 0;
799 | 969 | DEBUGLOG(4, "chainSize: %u - hSize: %u - h3Size: %u",
@@ -808,7 +978,7 b' size_t ZSTD_estimateCCtxSize_usingCCtxPa' | |||
|
808 | 978 | { ZSTD_compressionParameters const cParams =
809 | 979 | ZSTD_getCParamsFromCCtxParams(params, 0, 0);
810 | 980 | size_t const blockSize = MIN(ZSTD_BLOCKSIZE_MAX, (size_t)1 << cParams.windowLog);
811 | U32 const divider = (cParams.searchLength==3) ? 3 : 4; |
981 | U32 const divider = (cParams.minMatch==3) ? 3 : 4; |
812 | 982 | size_t const maxNbSeq = blockSize / divider;
813 | 983 | size_t const tokenSpace = WILDCOPY_OVERLENGTH + blockSize + 11*maxNbSeq;
814 | 984 | size_t const entropySpace = HUF_WORKSPACE_SIZE;
@@ -843,7 +1013,7 b' size_t ZSTD_estimateCCtxSize(int compres' | |||
|
843 | 1013 | {
844 | 1014 | int level;
845 | 1015 | size_t memBudget = 0;
846 | for (level=1; level<=compressionLevel; level++) { |
1016 | for (level=MIN(compressionLevel, 1); level<=compressionLevel; level++) { |
847 | 1017 | size_t const newMB = ZSTD_estimateCCtxSize_internal(level);
848 | 1018 | if (newMB > memBudget) memBudget = newMB;
849 | 1019 | }
@@ -879,7 +1049,7 b' size_t ZSTD_estimateCStreamSize(int comp' | |||
|
879 | 1049 | {
880 | 1050 | int level;
881 | 1051 | size_t memBudget = 0;
882 | for (level=1; level<=compressionLevel; level++) { |
1052 | for (level=MIN(compressionLevel, 1); level<=compressionLevel; level++) { |
883 | 1053 | size_t const newMB = ZSTD_estimateCStreamSize_internal(level);
884 | 1054 | if (newMB > memBudget) memBudget = newMB;
885 | 1055 | }
@@ -933,7 +1103,7 b' static U32 ZSTD_equivalentCParams(ZSTD_c' | |||
|
933 | 1103 | return (cParams1.hashLog == cParams2.hashLog)
934 | 1104 | & (cParams1.chainLog == cParams2.chainLog)
935 | 1105 | & (cParams1.strategy == cParams2.strategy) /* opt parser space */
936 | & ((cParams1.searchLength==3) == (cParams2.searchLength==3)); /* hashlog3 space */ |
1106 | & ((cParams1.minMatch==3) == (cParams2.minMatch==3)); /* hashlog3 space */ |
937 | 1107 | }
938 | 1108 |
939 | 1109 | static void ZSTD_assertEqualCParams(ZSTD_compressionParameters cParams1,
@@ -945,7 +1115,7 b' static void ZSTD_assertEqualCParams(ZSTD' | |||
|
945 | 1115 | assert(cParams1.chainLog == cParams2.chainLog);
946 | 1116 | assert(cParams1.hashLog == cParams2.hashLog);
947 | 1117 | assert(cParams1.searchLog == cParams2.searchLog);
948 | assert(cParams1.searchLength == cParams2.searchLength); |
1118 | assert(cParams1.minMatch == cParams2.minMatch); |
949 | 1119 | assert(cParams1.targetLength == cParams2.targetLength);
950 | 1120 | assert(cParams1.strategy == cParams2.strategy);
951 | 1121 | }
@@ -960,7 +1130,7 b' static U32 ZSTD_equivalentLdmParams(ldmP' | |||
|
960 | 1130 | ldmParams1.hashLog == ldmParams2.hashLog &&
961 | 1131 | ldmParams1.bucketSizeLog == ldmParams2.bucketSizeLog &&
962 | 1132 | ldmParams1.minMatchLength == ldmParams2.minMatchLength &&
963 | ldmParams1.hashEveryLog == ldmParams2.hashEveryLog); |
1133 | ldmParams1.hashRateLog == ldmParams2.hashRateLog); |
964 | 1134 | }
965 | 1135 |
966 | 1136 | typedef enum { ZSTDb_not_buffered, ZSTDb_buffered } ZSTD_buffered_policy_e;
@@ -976,7 +1146,7 b' static U32 ZSTD_sufficientBuff(size_t bu' | |||
|
976 | 1146 | {
977 | 1147 | size_t const windowSize2 = MAX(1, (size_t)MIN(((U64)1 << cParams2.windowLog), pledgedSrcSize));
978 | 1148 | size_t const blockSize2 = MIN(ZSTD_BLOCKSIZE_MAX, windowSize2);
979 | size_t const maxNbSeq2 = blockSize2 / ((cParams2.searchLength == 3) ? 3 : 4); |
1149 | size_t const maxNbSeq2 = blockSize2 / ((cParams2.minMatch == 3) ? 3 : 4); |
980 | 1150 | size_t const maxNbLit2 = blockSize2;
981 | 1151 | size_t const neededBufferSize2 = (buffPol2==ZSTDb_buffered) ? windowSize2 + blockSize2 : 0;
982 | 1152 | DEBUGLOG(4, "ZSTD_sufficientBuff: is neededBufferSize2=%u <= bufferSize1=%u",
@@ -1034,8 +1204,8 b' static void ZSTD_invalidateMatchState(ZS' | |||
|
1034 | 1204 | {
1035 | 1205 | ZSTD_window_clear(&ms->window);
1036 | 1206 |
1037 | ms->nextToUpdate = ms->window.dictLimit + 1; |
1038 | ms->nextToUpdate3 = ms->window.dictLimit + 1; |
1207 | ms->nextToUpdate = ms->window.dictLimit; |
1208 | ms->nextToUpdate3 = ms->window.dictLimit; |
1039 | 1209 | ms->loadedDictEnd = 0;
1040 | 1210 | ms->opt.litLengthSum = 0; /* force reset of btopt stats */
1041 | 1211 | ms->dictMatchState = NULL;
@@ -1080,7 +1250,7 b' ZSTD_reset_matchState(ZSTD_matchState_t*' | |||
|
1080 | 1250 | {
1081 | 1251 | size_t const chainSize = (cParams->strategy == ZSTD_fast) ? 0 : ((size_t)1 << cParams->chainLog);
1082 | 1252 | size_t const hSize = ((size_t)1) << cParams->hashLog;
1083 | U32 const hashLog3 = (forCCtx && cParams->searchLength==3) ? MIN(ZSTD_HASHLOG3_MAX, cParams->windowLog) : 0; |
1253 | U32 const hashLog3 = (forCCtx && cParams->minMatch==3) ? MIN(ZSTD_HASHLOG3_MAX, cParams->windowLog) : 0; |
1084 | 1254 | size_t const h3Size = ((size_t)1) << hashLog3;
1085 | 1255 | size_t const tableSpace = (chainSize + hSize + h3Size) * sizeof(U32);
1086 | 1256 |
@@ -1094,9 +1264,9 b' ZSTD_reset_matchState(ZSTD_matchState_t*' | |||
|
1094 | 1264 | ZSTD_invalidateMatchState(ms);
1095 | 1265 |
1096 | 1266 | /* opt parser space */
1097 | if (forCCtx && ((cParams->strategy == ZSTD_btopt) || (cParams->strategy == ZSTD_btultra))) { |
1267 | if (forCCtx && (cParams->strategy >= ZSTD_btopt)) { |
1098 | 1268 | DEBUGLOG(4, "reserving optimal parser space");
1099 | ms->opt.litFreq = (U32*)ptr; |
1269 | ms->opt.litFreq = (unsigned*)ptr; |
1100 | 1270 | ms->opt.litLengthFreq = ms->opt.litFreq + (1<<Litbits);
1101 | 1271 | ms->opt.matchLengthFreq = ms->opt.litLengthFreq + (MaxLL+1);
1102 | 1272 | ms->opt.offCodeFreq = ms->opt.matchLengthFreq + (MaxML+1);
@@ -1158,13 +1328,13 b' static size_t ZSTD_resetCCtx_internal(ZS' | |||
|
1158 | 1328 | /* Adjust long distance matching parameters */
1159 | 1329 | ZSTD_ldm_adjustParameters(&params.ldmParams, &params.cParams);
1160 | 1330 | assert(params.ldmParams.hashLog >= params.ldmParams.bucketSizeLog);
1161 | assert(params.ldmParams.hashEveryLog < 32); |
1162 | zc->ldmState.hashPower = ZSTD_ldm_getHashPower(params.ldmParams.minMatchLength); |
1331 | assert(params.ldmParams.hashRateLog < 32); |
1332 | zc->ldmState.hashPower = ZSTD_rollingHash_primePower(params.ldmParams.minMatchLength); |
1163 | 1333 | }
1164 | 1334 |
1165 | 1335 | { size_t const windowSize = MAX(1, (size_t)MIN(((U64)1 << params.cParams.windowLog), pledgedSrcSize));
1166 | 1336 | size_t const blockSize = MIN(ZSTD_BLOCKSIZE_MAX, windowSize);
1167 | U32 const divider = (params.cParams.searchLength==3) ? 3 : 4; |
1337 | U32 const divider = (params.cParams.minMatch==3) ? 3 : 4; |
1168 | 1338 | size_t const maxNbSeq = blockSize / divider;
1169 | 1339 | size_t const tokenSpace = WILDCOPY_OVERLENGTH + blockSize + 11*maxNbSeq;
1170 | 1340 | size_t const buffOutSize = (zbuff==ZSTDb_buffered) ? ZSTD_compressBound(blockSize)+1 : 0;
@@ -1227,7 +1397,7 b' static size_t ZSTD_resetCCtx_internal(ZS' | |||
|
1227 | 1397 | if (pledgedSrcSize == ZSTD_CONTENTSIZE_UNKNOWN)
1228 | 1398 | zc->appliedParams.fParams.contentSizeFlag = 0;
1229 | 1399 | DEBUGLOG(4, "pledged content size : %u ; flag : %u",
1230 | (U32)pledgedSrcSize, zc->appliedParams.fParams.contentSizeFlag); |
1400 | (unsigned)pledgedSrcSize, zc->appliedParams.fParams.contentSizeFlag); |
1231 | 1401 | zc->blockSize = blockSize;
1232 | 1402 |
1233 | 1403 | XXH64_reset(&zc->xxhState, 0);
@@ -1306,16 +1476,17 b' void ZSTD_invalidateRepCodes(ZSTD_CCtx* ' | |||
|
1306 | 1476 | * dictionary tables into the working context is faster than using them
1307 | 1477 | * in-place.
1308 | 1478 | */
1309 | static const size_t attachDictSizeCutoffs[(unsigned)ZSTD_btultra+1] = { |
1310 | 8 KB, /* unused */ |
1311 | 8 KB, /* ZSTD_fast */ |
1479 | static const size_t attachDictSizeCutoffs[ZSTD_STRATEGY_MAX+1] = { |
1480 | 8 KB, /* unused */ |
1481 | 8 KB, /* ZSTD_fast */ |
1312 | 1482 | 16 KB, /* ZSTD_dfast */
1313 | 1483 | 32 KB, /* ZSTD_greedy */
1314 | 1484 | 32 KB, /* ZSTD_lazy */
1315 | 1485 | 32 KB, /* ZSTD_lazy2 */
1316 | 1486 | 32 KB, /* ZSTD_btlazy2 */
1317 | 1487 | 32 KB, /* ZSTD_btopt */
1318 | 8 KB /* ZSTD_btultra */ |
1488 | 8 KB, /* ZSTD_btultra */ |
1489 | 8 KB /* ZSTD_btultra2 */ |
1319 | 1490 | };
1320 | 1491 |
1321 | 1492 | static int ZSTD_shouldAttachDict(const ZSTD_CDict* cdict,
@@ -1447,7 +1618,8 b' static size_t ZSTD_resetCCtx_usingCDict(' | |||
|
1447 | 1618 | ZSTD_buffered_policy_e zbuff)
1448 | 1619 | {
1449 | 1620 |
1450 | DEBUGLOG(4, "ZSTD_resetCCtx_usingCDict (pledgedSrcSize=%u)", (U32)pledgedSrcSize); |
1621 | DEBUGLOG(4, "ZSTD_resetCCtx_usingCDict (pledgedSrcSize=%u)", |
1622 | (unsigned)pledgedSrcSize); |
1451 | 1623 |
1452 | 1624 | if (ZSTD_shouldAttachDict(cdict, params, pledgedSrcSize)) {
1453 | 1625 | return ZSTD_resetCCtx_byAttachingCDict(
@@ -1670,7 +1842,9 b' static size_t ZSTD_compressRleLiteralsBl' | |||
|
1670 | 1842 | * note : use same formula for both situations */
1671 | 1843 | static size_t ZSTD_minGain(size_t srcSize, ZSTD_strategy strat)
1672 | 1844 | {
1673 | U32 const minlog = (strat==ZSTD_btultra) ? 7 : 6; |
1845 | U32 const minlog = (strat>=ZSTD_btultra) ? (U32)(strat) - 1 : 6; |
1846 | ZSTD_STATIC_ASSERT(ZSTD_btultra == 8); |
1847 | assert(ZSTD_cParam_withinBounds(ZSTD_c_strategy, strat)); |
1674 | 1848 | return (srcSize >> minlog) + 2;
1675 | 1849 | }
1676 | 1850 |
@@ -1679,7 +1853,8 b' static size_t ZSTD_compressLiterals (ZST' | |||
|
1679 | 1853 | ZSTD_strategy strategy, int disableLiteralCompression,
1680 | 1854 | void* dst, size_t dstCapacity,
1681 | 1855 | const void* src, size_t srcSize,
1682 | U32* workspace, const int bmi2) |
1856 | void* workspace, size_t wkspSize, |
1857 | const int bmi2) |
1683 | 1858 | {
1684 | 1859 | size_t const minGain = ZSTD_minGain(srcSize, strategy);
1685 | 1860 | size_t const lhSize = 3 + (srcSize >= 1 KB) + (srcSize >= 16 KB);
@@ -1708,9 +1883,9 b' static size_t ZSTD_compressLiterals (ZST' | |||
|
1708 | 1883 | int const preferRepeat = strategy < ZSTD_lazy ? srcSize <= 1024 : 0;
1709 | 1884 | if (repeat == HUF_repeat_valid && lhSize == 3) singleStream = 1;
1710 | 1885 | cLitSize = singleStream ? HUF_compress1X_repeat(ostart+lhSize, dstCapacity-lhSize, src, srcSize, 255, 11,
1711 | workspace, HUF_WORKSPACE_SIZE, (HUF_CElt*)nextHuf->CTable, &repeat, preferRepeat, bmi2) |
1886 | workspace, wkspSize, (HUF_CElt*)nextHuf->CTable, &repeat, preferRepeat, bmi2) |
1712 | 1887 | : HUF_compress4X_repeat(ostart+lhSize, dstCapacity-lhSize, src, srcSize, 255, 11,
1713 | workspace, HUF_WORKSPACE_SIZE, (HUF_CElt*)nextHuf->CTable, &repeat, preferRepeat, bmi2); |
1888 | workspace, wkspSize, (HUF_CElt*)nextHuf->CTable, &repeat, preferRepeat, bmi2); |
1714 | 1889 | if (repeat != HUF_repeat_none) {
1715 | 1890 | /* reused the existing table */
1716 | 1891 | hType = set_repeat;
@@ -1977,7 +2152,7 b' ZSTD_selectEncodingType(' | |||
|
1977 | 2152 | assert(!ZSTD_isError(NCountCost));
1978 | 2153 | assert(compressedCost < ERROR(maxCode));
1979 | 2154 | DEBUGLOG(5, "Estimated bit costs: basic=%u\trepeat=%u\tcompressed=%u",
1980 | (U32)basicCost, (U32)repeatCost, (U32)compressedCost); |
2155 | (unsigned)basicCost, (unsigned)repeatCost, (unsigned)compressedCost); |
1981 | 2156 | if (basicCost <= repeatCost && basicCost <= compressedCost) {
1982 | 2157 | DEBUGLOG(5, "Selected set_basic");
1983 | 2158 | assert(isDefaultAllowed);
@@ -1999,7 +2174,7 b' ZSTD_selectEncodingType(' | |||
|
1999 | 2174 | MEM_STATIC size_t
2000 | 2175 | ZSTD_buildCTable(void* dst, size_t dstCapacity,
2001 | 2176 | FSE_CTable* nextCTable, U32 FSELog, symbolEncodingType_e type,
2002 | U32* count, U32 max, |
2177 | unsigned* count, U32 max, |
2003 | 2178 | const BYTE* codeTable, size_t nbSeq,
2004 | 2179 | const S16* defaultNorm, U32 defaultNormLog, U32 defaultMax,
2005 | 2180 | const FSE_CTable* prevCTable, size_t prevCTableSize,
@@ -2007,11 +2182,13 b' ZSTD_buildCTable(void* dst, size_t dstCa' | |||
|
2007 | 2182 | {
2008 | 2183 | BYTE* op = (BYTE*)dst;
2009 | 2184 | const BYTE* const oend = op + dstCapacity;
2185 | DEBUGLOG(6, "ZSTD_buildCTable (dstCapacity=%u)", (unsigned)dstCapacity); |
2010 | 2186 |
2011 | 2187 | switch (type) {
2012 | 2188 | case set_rle:
2189 | CHECK_F(FSE_buildCTable_rle(nextCTable, (BYTE)max)); |
2190 | if (dstCapacity==0) return ERROR(dstSize_tooSmall); |
2013 | 2191 | *op = codeTable[0];
2014 | CHECK_F(FSE_buildCTable_rle(nextCTable, (BYTE)max)); |
2015 | 2192 | return 1;
2016 | 2193 | case set_repeat:
2017 | 2194 | memcpy(nextCTable, prevCTable, prevCTableSize);
@@ -2053,6 +2230,9 b' ZSTD_encodeSequences_body(' | |||
|
2053 | 2230 | FSE_CState_t stateLitLength;
2054 | 2231 |
2055 | 2232 | CHECK_E(BIT_initCStream(&blockStream, dst, dstCapacity), dstSize_tooSmall); /* not enough space remaining */
2233 | DEBUGLOG(6, "available space for bitstream : %i (dstCapacity=%u)", |
2234 | (int)(blockStream.endPtr - blockStream.startPtr), |
2235 | (unsigned)dstCapacity); |
2056 | 2236 |
2057 | 2237 | /* first symbols */
2058 | 2238 | FSE_initCState2(&stateMatchLength, CTable_MatchLength, mlCodeTable[nbSeq-1]);
@@ -2085,9 +2265,9 b' ZSTD_encodeSequences_body(' | |||
|
2085 | 2265 | U32 const ofBits = ofCode;
2086 | 2266 | U32 const mlBits = ML_bits[mlCode];
2087 | 2267 | DEBUGLOG(6, "encoding: litlen:%2u - matchlen:%2u - offCode:%7u",
2088 | sequences[n].litLength, |
2089 | sequences[n].matchLength + MINMATCH, |
2090 | sequences[n].offset); |
2268 | (unsigned)sequences[n].litLength, |
2269 | (unsigned)sequences[n].matchLength + MINMATCH, |
2270 | (unsigned)sequences[n].offset); |
2091 | 2271 | /* 32b*/ /* 64b*/
2092 | 2272 | /* (7)*/ /* (7)*/
2093 | 2273 | FSE_encodeSymbol(&blockStream, &stateOffsetBits, ofCode); /* 15 */ /* 15 */
@@ -2112,6 +2292,7 b' ZSTD_encodeSequences_body(' | |||
|
2112 | 2292 | BIT_addBits(&blockStream, sequences[n].offset, ofBits); /* 31 */
2113 | 2293 | }
2114 | 2294 | BIT_flushBits(&blockStream); /* (7)*/
2295 | DEBUGLOG(7, "remaining space : %i", (int)(blockStream.endPtr - blockStream.ptr)); |
2115 | 2296 | } }
2116 | 2297 |
2117 | 2298 | DEBUGLOG(6, "ZSTD_encodeSequences: flushing ML state with %u bits", stateMatchLength.stateLog);
@@ -2169,6 +2350,7 b' static size_t ZSTD_encodeSequences(' | |||
|
2169 | 2350 | FSE_CTable const* CTable_LitLength, BYTE const* llCodeTable,
2170 | 2351 | seqDef const* sequences, size_t nbSeq, int longOffsets, int bmi2)
2171 | 2352 | {
2353 | DEBUGLOG(5, "ZSTD_encodeSequences: dstCapacity = %u", (unsigned)dstCapacity); |
2172 | 2354 | #if DYNAMIC_BMI2
2173 | 2355 | if (bmi2) {
2174 | 2356 | return ZSTD_encodeSequences_bmi2(dst, dstCapacity,
@@ -2186,16 +2368,20 b' static size_t ZSTD_encodeSequences(' | |||
|
2186 | 2368 | sequences, nbSeq, longOffsets);
2187 | 2369 | }
2188 | 2370 |
2189 | MEM_STATIC size_t ZSTD_compressSequences_internal(seqStore_t* seqStorePtr, |
2190 | ZSTD_entropyCTables_t const* prevEntropy, |
2191 | ZSTD_entropyCTables_t* nextEntropy, |
2192 | ZSTD_CCtx_params const* cctxParams, |
2193 | void* dst, size_t dstCapacity, U32* workspace, |
2194 | const int bmi2) |
2371 | /* ZSTD_compressSequences_internal(): |
2372 | * actually compresses both literals and sequences */ |
2373 | MEM_STATIC size_t |
2374 | ZSTD_compressSequences_internal(seqStore_t* seqStorePtr, |
2375 | const ZSTD_entropyCTables_t* prevEntropy, |
2376 | ZSTD_entropyCTables_t* nextEntropy, |
2377 | const ZSTD_CCtx_params* cctxParams, |
2378 | void* dst, size_t dstCapacity, |
2379 | void* workspace, size_t wkspSize, |
2380 | const int bmi2) |
2195 | 2381 | {
2196 | 2382 | const int longOffsets = cctxParams->cParams.windowLog > STREAM_ACCUMULATOR_MIN;
2197 | 2383 | ZSTD_strategy const strategy = cctxParams->cParams.strategy;
2198 | U32 count[MaxSeq+1]; |
2384 | unsigned count[MaxSeq+1]; |
2199 | 2385 | FSE_CTable* CTable_LitLength = nextEntropy->fse.litlengthCTable;
2200 | 2386 | FSE_CTable* CTable_OffsetBits = nextEntropy->fse.offcodeCTable;
2201 | 2387 | FSE_CTable* CTable_MatchLength = nextEntropy->fse.matchlengthCTable;
@@ -2212,6 +2398,7 b' MEM_STATIC size_t ZSTD_compressSequences' | |||
|
2212 | 2398 | BYTE* lastNCount = NULL;
2213 | 2399 |
2214 | 2400 | ZSTD_STATIC_ASSERT(HUF_WORKSPACE_SIZE >= (1<<MAX(MLFSELog,LLFSELog)));
2401 | DEBUGLOG(5, "ZSTD_compressSequences_internal"); |
2215 | 2402 |
2216 | 2403 | /* Compress literals */
2217 | 2404 | { const BYTE* const literals = seqStorePtr->litStart;
@@ -2222,7 +2409,8 b' MEM_STATIC size_t ZSTD_compressSequences' | |||
|
2222 | 2409 | cctxParams->cParams.strategy, disableLiteralCompression,
2223 | 2410 | op, dstCapacity,
2224 | 2411 | literals, litSize,
2225 | workspace, bmi2); |
2412 | workspace, wkspSize, |
2413 | bmi2); |
2226 | 2414 | if (ZSTD_isError(cSize))
2227 | 2415 | return cSize;
2228 | 2416 | assert(cSize <= dstCapacity);
@@ -2249,51 +2437,63 b' MEM_STATIC size_t ZSTD_compressSequences' | |||
|
2249 | 2437 | /* convert length/distances into codes */ |
|
2250 | 2438 | ZSTD_seqToCodes(seqStorePtr); |
|
2251 | 2439 | /* build CTable for Literal Lengths */ |
|
2252 |
{ |
|
|
2253 | size_t const mostFrequent = HIST_countFast_wksp(count, &max, llCodeTable, nbSeq, workspace); /* can't fail */ | |
|
2440 | { unsigned max = MaxLL; | |
|
2441 | size_t const mostFrequent = HIST_countFast_wksp(count, &max, llCodeTable, nbSeq, workspace, wkspSize); /* can't fail */ | |
|
2254 | 2442 | DEBUGLOG(5, "Building LL table"); |
|
2255 | 2443 | nextEntropy->fse.litlength_repeatMode = prevEntropy->fse.litlength_repeatMode; |
|
2256 | LLtype = ZSTD_selectEncodingType(&nextEntropy->fse.litlength_repeatMode, count, max, mostFrequent, nbSeq, LLFSELog, prevEntropy->fse.litlengthCTable, LL_defaultNorm, LL_defaultNormLog, ZSTD_defaultAllowed, strategy); | |
|
2444 | LLtype = ZSTD_selectEncodingType(&nextEntropy->fse.litlength_repeatMode, | |
|
2445 | count, max, mostFrequent, nbSeq, | |
|
2446 | LLFSELog, prevEntropy->fse.litlengthCTable, | |
|
2447 | LL_defaultNorm, LL_defaultNormLog, | |
|
2448 | ZSTD_defaultAllowed, strategy); | |
|
2257 | 2449 | assert(set_basic < set_compressed && set_rle < set_compressed); |
|
2258 | 2450 | assert(!(LLtype < set_compressed && nextEntropy->fse.litlength_repeatMode != FSE_repeat_none)); /* We don't copy tables */ |
|
2259 | 2451 | { size_t const countSize = ZSTD_buildCTable(op, oend - op, CTable_LitLength, LLFSELog, (symbolEncodingType_e)LLtype, |
|
2260 | 2452 | count, max, llCodeTable, nbSeq, LL_defaultNorm, LL_defaultNormLog, MaxLL, |
|
2261 | 2453 | prevEntropy->fse.litlengthCTable, sizeof(prevEntropy->fse.litlengthCTable), |
|
2262 |
workspace, |
|
|
2454 | workspace, wkspSize); | |
|
2263 | 2455 | if (ZSTD_isError(countSize)) return countSize; |
|
2264 | 2456 | if (LLtype == set_compressed) |
|
2265 | 2457 | lastNCount = op; |
|
2266 | 2458 | op += countSize; |
|
2267 | 2459 | } } |
|
2268 | 2460 | /* build CTable for Offsets */ |
|
2269 |
{ |
|
|
2270 | size_t const mostFrequent = HIST_countFast_wksp(count, &max, ofCodeTable, nbSeq, workspace); /* can't fail */ | |
|
2461 | { unsigned max = MaxOff; | |
|
2462 | size_t const mostFrequent = HIST_countFast_wksp(count, &max, ofCodeTable, nbSeq, workspace, wkspSize); /* can't fail */ | |
|
2271 | 2463 | /* We can only use the basic table if max <= DefaultMaxOff, otherwise the offsets are too large */ |
|
2272 | 2464 | ZSTD_defaultPolicy_e const defaultPolicy = (max <= DefaultMaxOff) ? ZSTD_defaultAllowed : ZSTD_defaultDisallowed; |
|
2273 | 2465 | DEBUGLOG(5, "Building OF table"); |
|
2274 | 2466 | nextEntropy->fse.offcode_repeatMode = prevEntropy->fse.offcode_repeatMode; |
|
2275 | Offtype = ZSTD_selectEncodingType(&nextEntropy->fse.offcode_repeatMode, count, max, mostFrequent, nbSeq, OffFSELog, prevEntropy->fse.offcodeCTable, OF_defaultNorm, OF_defaultNormLog, defaultPolicy, strategy); | |
|
2467 | Offtype = ZSTD_selectEncodingType(&nextEntropy->fse.offcode_repeatMode, | |
|
2468 | count, max, mostFrequent, nbSeq, | |
|
2469 | OffFSELog, prevEntropy->fse.offcodeCTable, | |
|
2470 | OF_defaultNorm, OF_defaultNormLog, | |
|
2471 | defaultPolicy, strategy); | |
|
2276 | 2472 | assert(!(Offtype < set_compressed && nextEntropy->fse.offcode_repeatMode != FSE_repeat_none)); /* We don't copy tables */ |
|
2277 | 2473 | { size_t const countSize = ZSTD_buildCTable(op, oend - op, CTable_OffsetBits, OffFSELog, (symbolEncodingType_e)Offtype, |
|
2278 | 2474 | count, max, ofCodeTable, nbSeq, OF_defaultNorm, OF_defaultNormLog, DefaultMaxOff, |
|
2279 | 2475 | prevEntropy->fse.offcodeCTable, sizeof(prevEntropy->fse.offcodeCTable), |
|
2280 | workspace, HUF_WORKSPACE_SIZE); | |

2476 | workspace, wkspSize); | |
|
2281 | 2477 | if (ZSTD_isError(countSize)) return countSize; |
|
2282 | 2478 | if (Offtype == set_compressed) |
|
2283 | 2479 | lastNCount = op; |
|
2284 | 2480 | op += countSize; |
|
2285 | 2481 | } } |
|
2286 | 2482 | /* build CTable for MatchLengths */ |
|
2287 | { U32 max = MaxML; | |

2288 | size_t const mostFrequent = HIST_countFast_wksp(count, &max, mlCodeTable, nbSeq, workspace); /* can't fail */ | |
|
2289 | DEBUGLOG(5, "Building ML table"); | |
|
2483 | { unsigned max = MaxML; | |
|
2484 | size_t const mostFrequent = HIST_countFast_wksp(count, &max, mlCodeTable, nbSeq, workspace, wkspSize); /* can't fail */ | |
|
2485 | DEBUGLOG(5, "Building ML table (remaining space : %i)", (int)(oend-op)); | |
|
2290 | 2486 | nextEntropy->fse.matchlength_repeatMode = prevEntropy->fse.matchlength_repeatMode; |
|
2291 | MLtype = ZSTD_selectEncodingType(&nextEntropy->fse.matchlength_repeatMode, count, max, mostFrequent, nbSeq, MLFSELog, prevEntropy->fse.matchlengthCTable, ML_defaultNorm, ML_defaultNormLog, ZSTD_defaultAllowed, strategy); | |
|
2487 | MLtype = ZSTD_selectEncodingType(&nextEntropy->fse.matchlength_repeatMode, | |
|
2488 | count, max, mostFrequent, nbSeq, | |
|
2489 | MLFSELog, prevEntropy->fse.matchlengthCTable, | |
|
2490 | ML_defaultNorm, ML_defaultNormLog, | |
|
2491 | ZSTD_defaultAllowed, strategy); | |
|
2292 | 2492 | assert(!(MLtype < set_compressed && nextEntropy->fse.matchlength_repeatMode != FSE_repeat_none)); /* We don't copy tables */ |
|
2293 | 2493 | { size_t const countSize = ZSTD_buildCTable(op, oend - op, CTable_MatchLength, MLFSELog, (symbolEncodingType_e)MLtype, |
|
2294 | 2494 | count, max, mlCodeTable, nbSeq, ML_defaultNorm, ML_defaultNormLog, MaxML, |
|
2295 | 2495 | prevEntropy->fse.matchlengthCTable, sizeof(prevEntropy->fse.matchlengthCTable), |
|
2296 | workspace, HUF_WORKSPACE_SIZE); | |

2496 | workspace, wkspSize); | |
|
2297 | 2497 | if (ZSTD_isError(countSize)) return countSize; |
|
2298 | 2498 | if (MLtype == set_compressed) |
|
2299 | 2499 | lastNCount = op; |
@@ -2328,19 +2528,24 b' MEM_STATIC size_t ZSTD_compressSequences' | |||
|
2328 | 2528 | } |
|
2329 | 2529 | } |
|
2330 | 2530 | |
|
2531 | DEBUGLOG(5, "compressed block size : %u", (unsigned)(op - ostart)); | |
|
2331 | 2532 | return op - ostart; |
|
2332 | 2533 | } |
|
2333 | 2534 | |
|
2334 | MEM_STATIC size_t ZSTD_compressSequences(seqStore_t* seqStorePtr, | |
|
2335 | const ZSTD_entropyCTables_t* prevEntropy, | |
|
2336 | ZSTD_entropyCTables_t* nextEntropy, | |

2337 | const ZSTD_CCtx_params* cctxParams, | |

2338 | void* dst, size_t dstCapacity, | |

2339 | size_t srcSize, void* workspace, int bmi2) | |

2535 | MEM_STATIC size_t | |
|
2536 | ZSTD_compressSequences(seqStore_t* seqStorePtr, | |
|
2537 | const ZSTD_entropyCTables_t* prevEntropy, | |
|
2538 | ZSTD_entropyCTables_t* nextEntropy, | |
|
2539 | const ZSTD_CCtx_params* cctxParams, | |
|
2540 | void* dst, size_t dstCapacity, | |
|
2541 | size_t srcSize, | |
|
2542 | void* workspace, size_t wkspSize, | |
|
2543 | int bmi2) | |
|
2340 | 2544 | { |
|
2341 | 2545 | size_t const cSize = ZSTD_compressSequences_internal( |
|
2342 | seqStorePtr, prevEntropy, nextEntropy, cctxParams, dst, dstCapacity, | |

2343 | workspace, bmi2); | |
|
2546 | seqStorePtr, prevEntropy, nextEntropy, cctxParams, | |
|
2547 | dst, dstCapacity, | |
|
2548 | workspace, wkspSize, bmi2); | |
|
2344 | 2549 | if (cSize == 0) return 0; |
|
2345 | 2550 | /* When srcSize <= dstCapacity, there is enough space to write a raw uncompressed block. |
|
2346 | 2551 | * Since we ran out of space, block must be not compressible, so fall back to raw uncompressed block. |
@@ -2362,7 +2567,7 b' MEM_STATIC size_t ZSTD_compressSequences' | |||
|
2362 | 2567 | * assumption : strat is a valid strategy */ |
|
2363 | 2568 | ZSTD_blockCompressor ZSTD_selectBlockCompressor(ZSTD_strategy strat, ZSTD_dictMode_e dictMode) |
|
2364 | 2569 | { |
|
2365 | static const ZSTD_blockCompressor blockCompressor[3][(unsigned)ZSTD_btultra+1] = { | |

2570 | static const ZSTD_blockCompressor blockCompressor[3][ZSTD_STRATEGY_MAX+1] = { | |
|
2366 | 2571 | { ZSTD_compressBlock_fast /* default for 0 */, |
|
2367 | 2572 | ZSTD_compressBlock_fast, |
|
2368 | 2573 | ZSTD_compressBlock_doubleFast, |
@@ -2371,7 +2576,8 b' ZSTD_blockCompressor ZSTD_selectBlockCom' | |||
|
2371 | 2576 | ZSTD_compressBlock_lazy2, |
|
2372 | 2577 | ZSTD_compressBlock_btlazy2, |
|
2373 | 2578 | ZSTD_compressBlock_btopt, |
|
2374 | ZSTD_compressBlock_btultra }, | |

2579 | ZSTD_compressBlock_btultra, | |
|
2580 | ZSTD_compressBlock_btultra2 }, | |
|
2375 | 2581 | { ZSTD_compressBlock_fast_extDict /* default for 0 */, |
|
2376 | 2582 | ZSTD_compressBlock_fast_extDict, |
|
2377 | 2583 | ZSTD_compressBlock_doubleFast_extDict, |
@@ -2380,6 +2586,7 b' ZSTD_blockCompressor ZSTD_selectBlockCom' | |||
|
2380 | 2586 | ZSTD_compressBlock_lazy2_extDict, |
|
2381 | 2587 | ZSTD_compressBlock_btlazy2_extDict, |
|
2382 | 2588 | ZSTD_compressBlock_btopt_extDict, |
|
2589 | ZSTD_compressBlock_btultra_extDict, | |
|
2383 | 2590 | ZSTD_compressBlock_btultra_extDict }, |
|
2384 | 2591 | { ZSTD_compressBlock_fast_dictMatchState /* default for 0 */, |
|
2385 | 2592 | ZSTD_compressBlock_fast_dictMatchState, |
@@ -2389,14 +2596,14 b' ZSTD_blockCompressor ZSTD_selectBlockCom' | |||
|
2389 | 2596 | ZSTD_compressBlock_lazy2_dictMatchState, |
|
2390 | 2597 | ZSTD_compressBlock_btlazy2_dictMatchState, |
|
2391 | 2598 | ZSTD_compressBlock_btopt_dictMatchState, |
|
2599 | ZSTD_compressBlock_btultra_dictMatchState, | |
|
2392 | 2600 | ZSTD_compressBlock_btultra_dictMatchState } |
|
2393 | 2601 | }; |
|
2394 | 2602 | ZSTD_blockCompressor selectedCompressor; |
|
2395 | 2603 | ZSTD_STATIC_ASSERT((unsigned)ZSTD_fast == 1); |
|
2396 | 2604 | |
|
2397 | assert((U32)strat >= (U32)ZSTD_fast); | |
|
2398 | assert((U32)strat <= (U32)ZSTD_btultra); | |
|
2399 | selectedCompressor = blockCompressor[(int)dictMode][(U32)strat]; | |
|
2605 | assert(ZSTD_cParam_withinBounds(ZSTD_c_strategy, strat)); | |
|
2606 | selectedCompressor = blockCompressor[(int)dictMode][(int)strat]; | |
|
2400 | 2607 | assert(selectedCompressor != NULL); |
|
2401 | 2608 | return selectedCompressor; |
|
2402 | 2609 | } |
@@ -2421,15 +2628,15 b' static size_t ZSTD_compressBlock_interna' | |||
|
2421 | 2628 | { |
|
2422 | 2629 | ZSTD_matchState_t* const ms = &zc->blockState.matchState; |
|
2423 | 2630 | size_t cSize; |
|
2424 | DEBUGLOG(5, "ZSTD_compressBlock_internal (dstCapacity=%zu, dictLimit=%u, nextToUpdate=%u)", | |

2425 | dstCapacity, ms->window.dictLimit, ms->nextToUpdate); | |
|
2631 | DEBUGLOG(5, "ZSTD_compressBlock_internal (dstCapacity=%u, dictLimit=%u, nextToUpdate=%u)", | |
|
2632 | (unsigned)dstCapacity, (unsigned)ms->window.dictLimit, (unsigned)ms->nextToUpdate); | |
|
2426 | 2633 | assert(srcSize <= ZSTD_BLOCKSIZE_MAX); |
|
2427 | 2634 | |
|
2428 | 2635 | /* Assert that we have correctly flushed the ctx params into the ms's copy */ |
|
2429 | 2636 | ZSTD_assertEqualCParams(zc->appliedParams.cParams, ms->cParams); |
|
2430 | 2637 | |
|
2431 | 2638 | if (srcSize < MIN_CBLOCK_SIZE+ZSTD_blockHeaderSize+1) { |
|
2432 | ZSTD_ldm_skipSequences(&zc->externSeqStore, srcSize, zc->appliedParams.cParams.searchLength); | |

2639 | ZSTD_ldm_skipSequences(&zc->externSeqStore, srcSize, zc->appliedParams.cParams.minMatch); | |
|
2433 | 2640 | cSize = 0; |
|
2434 | 2641 | goto out; /* don't even attempt compression below a certain srcSize */ |
|
2435 | 2642 | } |
@@ -2437,8 +2644,8 b' static size_t ZSTD_compressBlock_interna' | |||
|
2437 | 2644 | ms->opt.symbolCosts = &zc->blockState.prevCBlock->entropy; /* required for optimal parser to read stats from dictionary */ |
|
2438 | 2645 | |
|
2439 | 2646 | /* a gap between an attached dict and the current window is not safe, |
|
2440 | * they must remain adjacent, and when that stops being the case, the dict | |

2441 | * must be unset */ | |
|
2647 | * they must remain adjacent, | |
|
2648 | * and when that stops being the case, the dict must be unset */ | |
|
2442 | 2649 | assert(ms->dictMatchState == NULL || ms->loadedDictEnd == ms->window.dictLimit); |
|
2443 | 2650 | |
|
2444 | 2651 | /* limited update after a very long match */ |
@@ -2495,7 +2702,9 b' static size_t ZSTD_compressBlock_interna' | |||
|
2495 | 2702 | &zc->blockState.prevCBlock->entropy, &zc->blockState.nextCBlock->entropy, |
|
2496 | 2703 | &zc->appliedParams, |
|
2497 | 2704 | dst, dstCapacity, |
|
2498 | srcSize, zc->entropyWorkspace, zc->bmi2); | |
|
2705 | srcSize, | |
|
2706 | zc->entropyWorkspace, HUF_WORKSPACE_SIZE /* statically allocated in resetCCtx */, | |
|
2707 | zc->bmi2); | |
|
2499 | 2708 | |
|
2500 | 2709 | out: |
|
2501 | 2710 | if (!ZSTD_isError(cSize) && cSize != 0) { |
@@ -2535,7 +2744,7 b' static size_t ZSTD_compress_frameChunk (' | |||
|
2535 | 2744 | U32 const maxDist = (U32)1 << cctx->appliedParams.cParams.windowLog; |
|
2536 | 2745 | assert(cctx->appliedParams.cParams.windowLog <= 31); |
|
2537 | 2746 | |
|
2538 | DEBUGLOG(5, "ZSTD_compress_frameChunk (blockSize=%u)", (U32)blockSize); | |

2747 | DEBUGLOG(5, "ZSTD_compress_frameChunk (blockSize=%u)", (unsigned)blockSize); | |
|
2539 | 2748 | if (cctx->appliedParams.fParams.checksumFlag && srcSize) |
|
2540 | 2749 | XXH64_update(&cctx->xxhState, src, srcSize); |
|
2541 | 2750 | |
@@ -2583,7 +2792,7 b' static size_t ZSTD_compress_frameChunk (' | |||
|
2583 | 2792 | assert(dstCapacity >= cSize); |
|
2584 | 2793 | dstCapacity -= cSize; |
|
2585 | 2794 | DEBUGLOG(5, "ZSTD_compress_frameChunk: adding a block of size %u", |
|
2586 | (U32)cSize); | |

2795 | (unsigned)cSize); | |
|
2587 | 2796 | } } |
|
2588 | 2797 | |
|
2589 | 2798 | if (lastFrameChunk && (op>ostart)) cctx->stage = ZSTDcs_ending; |
@@ -2606,9 +2815,9 b' static size_t ZSTD_writeFrameHeader(void' | |||
|
2606 | 2815 | size_t pos=0; |
|
2607 | 2816 | |
|
2608 | 2817 | assert(!(params.fParams.contentSizeFlag && pledgedSrcSize == ZSTD_CONTENTSIZE_UNKNOWN)); |
|
2609 | if (dstCapacity < ZSTD_frameHeaderSize_max) return ERROR(dstSize_tooSmall); | |

2818 | if (dstCapacity < ZSTD_FRAMEHEADERSIZE_MAX) return ERROR(dstSize_tooSmall); | |
|
2610 | 2819 | DEBUGLOG(4, "ZSTD_writeFrameHeader : dictIDFlag : %u ; dictID : %u ; dictIDSizeCode : %u", |
|
2611 | !params.fParams.noDictIDFlag, dictID, dictIDSizeCode); | |

2820 | !params.fParams.noDictIDFlag, (unsigned)dictID, (unsigned)dictIDSizeCode); | |
|
2612 | 2821 | |
|
2613 | 2822 | if (params.format == ZSTD_f_zstd1) { |
|
2614 | 2823 | MEM_writeLE32(dst, ZSTD_MAGICNUMBER); |
@@ -2672,7 +2881,7 b' static size_t ZSTD_compressContinue_inte' | |||
|
2672 | 2881 | size_t fhSize = 0; |
|
2673 | 2882 | |
|
2674 | 2883 | DEBUGLOG(5, "ZSTD_compressContinue_internal, stage: %u, srcSize: %u", |
|
2675 | cctx->stage, (U32)srcSize); | |

2884 | cctx->stage, (unsigned)srcSize); | |
|
2676 | 2885 | if (cctx->stage==ZSTDcs_created) return ERROR(stage_wrong); /* missing init (ZSTD_compressBegin) */ |
|
2677 | 2886 | |
|
2678 | 2887 | if (frame && (cctx->stage==ZSTDcs_init)) { |
@@ -2709,7 +2918,7 b' static size_t ZSTD_compressContinue_inte' | |||
|
2709 | 2918 | } |
|
2710 | 2919 | } |
|
2711 | 2920 | |
|
2712 | DEBUGLOG(5, "ZSTD_compressContinue_internal (blockSize=%u)", (U32)cctx->blockSize); | |

2921 | DEBUGLOG(5, "ZSTD_compressContinue_internal (blockSize=%u)", (unsigned)cctx->blockSize); | |
|
2713 | 2922 | { size_t const cSize = frame ? |
|
2714 | 2923 | ZSTD_compress_frameChunk (cctx, dst, dstCapacity, src, srcSize, lastFrameChunk) : |
|
2715 | 2924 | ZSTD_compressBlock_internal (cctx, dst, dstCapacity, src, srcSize); |
@@ -2721,7 +2930,7 b' static size_t ZSTD_compressContinue_inte' | |||
|
2721 | 2930 | ZSTD_STATIC_ASSERT(ZSTD_CONTENTSIZE_UNKNOWN == (unsigned long long)-1); |
|
2722 | 2931 | if (cctx->consumedSrcSize+1 > cctx->pledgedSrcSizePlusOne) { |
|
2723 | 2932 | DEBUGLOG(4, "error : pledgedSrcSize = %u, while realSrcSize >= %u", |
|
2724 | (U32)cctx->pledgedSrcSizePlusOne-1, (U32)cctx->consumedSrcSize); | |

2933 | (unsigned)cctx->pledgedSrcSizePlusOne-1, (unsigned)cctx->consumedSrcSize); | |
|
2725 | 2934 | return ERROR(srcSize_wrong); |
|
2726 | 2935 | } |
|
2727 | 2936 | } |
@@ -2733,7 +2942,7 b' size_t ZSTD_compressContinue (ZSTD_CCtx*' | |||
|
2733 | 2942 | void* dst, size_t dstCapacity, |
|
2734 | 2943 | const void* src, size_t srcSize) |
|
2735 | 2944 | { |
|
2736 | DEBUGLOG(5, "ZSTD_compressContinue (srcSize=%u)", (U32)srcSize); | |

2945 | DEBUGLOG(5, "ZSTD_compressContinue (srcSize=%u)", (unsigned)srcSize); | |
|
2737 | 2946 | return ZSTD_compressContinue_internal(cctx, dst, dstCapacity, src, srcSize, 1 /* frame mode */, 0 /* last chunk */); |
|
2738 | 2947 | } |
|
2739 | 2948 | |
@@ -2791,6 +3000,7 b' static size_t ZSTD_loadDictionaryContent' | |||
|
2791 | 3000 | case ZSTD_btlazy2: /* we want the dictionary table fully sorted */ |
|
2792 | 3001 | case ZSTD_btopt: |
|
2793 | 3002 | case ZSTD_btultra: |
|
3003 | case ZSTD_btultra2: | |
|
2794 | 3004 | if (srcSize >= HASH_READ_SIZE) |
|
2795 | 3005 | ZSTD_updateTree(ms, iend-HASH_READ_SIZE, iend); |
|
2796 | 3006 | break; |
@@ -2861,7 +3071,9 b' static size_t ZSTD_loadZstdDictionary(ZS' | |||
|
2861 | 3071 | if (offcodeLog > OffFSELog) return ERROR(dictionary_corrupted); |
|
2862 | 3072 | /* Defer checking offcodeMaxValue because we need to know the size of the dictionary content */ |
|
2863 | 3073 | /* fill all offset symbols to avoid garbage at end of table */ |
|
2864 | CHECK_E( FSE_buildCTable_wksp(bs->entropy.fse.offcodeCTable, offcodeNCount, MaxOff, offcodeLog, workspace, HUF_WORKSPACE_SIZE), | |

3074 | CHECK_E( FSE_buildCTable_wksp(bs->entropy.fse.offcodeCTable, | |
|
3075 | offcodeNCount, MaxOff, offcodeLog, | |
|
3076 | workspace, HUF_WORKSPACE_SIZE), | |
|
2865 | 3077 | dictionary_corrupted); |
|
2866 | 3078 | dictPtr += offcodeHeaderSize; |
|
2867 | 3079 | } |
@@ -2873,7 +3085,9 b' static size_t ZSTD_loadZstdDictionary(ZS' | |||
|
2873 | 3085 | if (matchlengthLog > MLFSELog) return ERROR(dictionary_corrupted); |
|
2874 | 3086 | /* Every match length code must have non-zero probability */ |
|
2875 | 3087 | CHECK_F( ZSTD_checkDictNCount(matchlengthNCount, matchlengthMaxValue, MaxML)); |
|
2876 | CHECK_E( FSE_buildCTable_wksp(bs->entropy.fse.matchlengthCTable, matchlengthNCount, matchlengthMaxValue, matchlengthLog, workspace, HUF_WORKSPACE_SIZE), | |

3088 | CHECK_E( FSE_buildCTable_wksp(bs->entropy.fse.matchlengthCTable, | |
|
3089 | matchlengthNCount, matchlengthMaxValue, matchlengthLog, | |
|
3090 | workspace, HUF_WORKSPACE_SIZE), | |
|
2877 | 3091 | dictionary_corrupted); |
|
2878 | 3092 | dictPtr += matchlengthHeaderSize; |
|
2879 | 3093 | } |
@@ -2885,7 +3099,9 b' static size_t ZSTD_loadZstdDictionary(ZS' | |||
|
2885 | 3099 | if (litlengthLog > LLFSELog) return ERROR(dictionary_corrupted); |
|
2886 | 3100 | /* Every literal length code must have non-zero probability */ |
|
2887 | 3101 | CHECK_F( ZSTD_checkDictNCount(litlengthNCount, litlengthMaxValue, MaxLL)); |
|
2888 | CHECK_E( FSE_buildCTable_wksp(bs->entropy.fse.litlengthCTable, litlengthNCount, litlengthMaxValue, litlengthLog, workspace, HUF_WORKSPACE_SIZE), | |

3102 | CHECK_E( FSE_buildCTable_wksp(bs->entropy.fse.litlengthCTable, | |
|
3103 | litlengthNCount, litlengthMaxValue, litlengthLog, | |
|
3104 | workspace, HUF_WORKSPACE_SIZE), | |
|
2889 | 3105 | dictionary_corrupted); |
|
2890 | 3106 | dictPtr += litlengthHeaderSize; |
|
2891 | 3107 | } |
@@ -3023,7 +3239,7 b' size_t ZSTD_compressBegin_usingDict(ZSTD' | |||
|
3023 | 3239 | ZSTD_parameters const params = ZSTD_getParams(compressionLevel, ZSTD_CONTENTSIZE_UNKNOWN, dictSize); |
|
3024 | 3240 | ZSTD_CCtx_params const cctxParams = |
|
3025 | 3241 | ZSTD_assignParamsToCCtxParams(cctx->requestedParams, params); |
|
3026 | DEBUGLOG(4, "ZSTD_compressBegin_usingDict (dictSize=%u)", (U32)dictSize); | |

3242 | DEBUGLOG(4, "ZSTD_compressBegin_usingDict (dictSize=%u)", (unsigned)dictSize); | |
|
3027 | 3243 | return ZSTD_compressBegin_internal(cctx, dict, dictSize, ZSTD_dct_auto, ZSTD_dtlm_fast, NULL, |
|
3028 | 3244 | cctxParams, ZSTD_CONTENTSIZE_UNKNOWN, ZSTDb_not_buffered); |
|
3029 | 3245 | } |
@@ -3067,7 +3283,7 b' static size_t ZSTD_writeEpilogue(ZSTD_CC' | |||
|
3067 | 3283 | if (cctx->appliedParams.fParams.checksumFlag) { |
|
3068 | 3284 | U32 const checksum = (U32) XXH64_digest(&cctx->xxhState); |
|
3069 | 3285 | if (dstCapacity<4) return ERROR(dstSize_tooSmall); |
|
3070 | DEBUGLOG(4, "ZSTD_writeEpilogue: write checksum : %08X", checksum); | |
|
3286 | DEBUGLOG(4, "ZSTD_writeEpilogue: write checksum : %08X", (unsigned)checksum); | |
|
3071 | 3287 | MEM_writeLE32(op, checksum); |
|
3072 | 3288 | op += 4; |
|
3073 | 3289 | } |
@@ -3093,7 +3309,7 b' size_t ZSTD_compressEnd (ZSTD_CCtx* cctx' | |||
|
3093 | 3309 | DEBUGLOG(4, "end of frame : controlling src size"); |
|
3094 | 3310 | if (cctx->pledgedSrcSizePlusOne != cctx->consumedSrcSize+1) { |
|
3095 | 3311 | DEBUGLOG(4, "error : pledgedSrcSize = %u, while realSrcSize = %u", |
|
3096 | (U32)cctx->pledgedSrcSizePlusOne-1, (U32)cctx->consumedSrcSize); | |

3312 | (unsigned)cctx->pledgedSrcSizePlusOne-1, (unsigned)cctx->consumedSrcSize); | |
|
3097 | 3313 | return ERROR(srcSize_wrong); |
|
3098 | 3314 | } } |
|
3099 | 3315 | return cSize + endResult; |
@@ -3139,7 +3355,7 b' size_t ZSTD_compress_advanced_internal(' | |||
|
3139 | 3355 | const void* dict,size_t dictSize, |
|
3140 | 3356 | ZSTD_CCtx_params params) |
|
3141 | 3357 | { |
|
3142 | DEBUGLOG(4, "ZSTD_compress_advanced_internal (srcSize:%u)", (U32)srcSize); | |

3358 | DEBUGLOG(4, "ZSTD_compress_advanced_internal (srcSize:%u)", (unsigned)srcSize); | |
|
3143 | 3359 | CHECK_F( ZSTD_compressBegin_internal(cctx, |
|
3144 | 3360 | dict, dictSize, ZSTD_dct_auto, ZSTD_dtlm_fast, NULL, |
|
3145 | 3361 | params, srcSize, ZSTDb_not_buffered) ); |
@@ -3163,7 +3379,7 b' size_t ZSTD_compressCCtx(ZSTD_CCtx* cctx' | |||
|
3163 | 3379 | const void* src, size_t srcSize, |
|
3164 | 3380 | int compressionLevel) |
|
3165 | 3381 | { |
|
3166 | DEBUGLOG(4, "ZSTD_compressCCtx (srcSize=%u)", (U32)srcSize); | |

3382 | DEBUGLOG(4, "ZSTD_compressCCtx (srcSize=%u)", (unsigned)srcSize); | |
|
3167 | 3383 | assert(cctx != NULL); |
|
3168 | 3384 | return ZSTD_compress_usingDict(cctx, dst, dstCapacity, src, srcSize, NULL, 0, compressionLevel); |
|
3169 | 3385 | } |
@@ -3189,7 +3405,7 b' size_t ZSTD_estimateCDictSize_advanced(' | |||
|
3189 | 3405 | size_t dictSize, ZSTD_compressionParameters cParams, |
|
3190 | 3406 | ZSTD_dictLoadMethod_e dictLoadMethod) |
|
3191 | 3407 | { |
|
3192 | DEBUGLOG(5, "sizeof(ZSTD_CDict) : %u", (U32)sizeof(ZSTD_CDict)); | |

3408 | DEBUGLOG(5, "sizeof(ZSTD_CDict) : %u", (unsigned)sizeof(ZSTD_CDict)); | |
|
3193 | 3409 | return sizeof(ZSTD_CDict) + HUF_WORKSPACE_SIZE + ZSTD_sizeof_matchState(&cParams, /* forCCtx */ 0) |
|
3194 | 3410 | + (dictLoadMethod == ZSTD_dlm_byRef ? 0 : dictSize); |
|
3195 | 3411 | } |
@@ -3203,7 +3419,7 b' size_t ZSTD_estimateCDictSize(size_t dic' | |||
|
3203 | 3419 | size_t ZSTD_sizeof_CDict(const ZSTD_CDict* cdict) |
|
3204 | 3420 | { |
|
3205 | 3421 | if (cdict==NULL) return 0; /* support sizeof on NULL */ |
|
3206 | DEBUGLOG(5, "sizeof(*cdict) : %u", (U32)sizeof(*cdict)); | |

3422 | DEBUGLOG(5, "sizeof(*cdict) : %u", (unsigned)sizeof(*cdict)); | |
|
3207 | 3423 | return cdict->workspaceSize + (cdict->dictBuffer ? cdict->dictContentSize : 0) + sizeof(*cdict); |
|
3208 | 3424 | } |
|
3209 | 3425 | |
@@ -3214,7 +3430,7 b' static size_t ZSTD_initCDict_internal(' | |||
|
3214 | 3430 | ZSTD_dictContentType_e dictContentType, |
|
3215 | 3431 | ZSTD_compressionParameters cParams) |
|
3216 | 3432 | { |
|
3217 | DEBUGLOG(3, "ZSTD_initCDict_internal (dictContentType:%u)", (U32)dictContentType); | |

3433 | DEBUGLOG(3, "ZSTD_initCDict_internal (dictContentType:%u)", (unsigned)dictContentType); | |
|
3218 | 3434 | assert(!ZSTD_checkCParams(cParams)); |
|
3219 | 3435 | cdict->matchState.cParams = cParams; |
|
3220 | 3436 | if ((dictLoadMethod == ZSTD_dlm_byRef) || (!dictBuffer) || (!dictSize)) { |
@@ -3264,7 +3480,7 b' ZSTD_CDict* ZSTD_createCDict_advanced(co' | |||
|
3264 | 3480 | ZSTD_dictContentType_e dictContentType, |
|
3265 | 3481 | ZSTD_compressionParameters cParams, ZSTD_customMem customMem) |
|
3266 | 3482 | { |
|
3267 | DEBUGLOG(3, "ZSTD_createCDict_advanced, mode %u", (U32)dictContentType); | |

3483 | DEBUGLOG(3, "ZSTD_createCDict_advanced, mode %u", (unsigned)dictContentType); | |
|
3268 | 3484 | if (!customMem.customAlloc ^ !customMem.customFree) return NULL; |
|
3269 | 3485 | |
|
3270 | 3486 | { ZSTD_CDict* const cdict = (ZSTD_CDict*)ZSTD_malloc(sizeof(ZSTD_CDict), customMem); |
@@ -3345,7 +3561,7 b' const ZSTD_CDict* ZSTD_initStaticCDict(' | |||
|
3345 | 3561 | void* ptr; |
|
3346 | 3562 | if ((size_t)workspace & 7) return NULL; /* 8-aligned */ |
|
3347 | 3563 | DEBUGLOG(4, "(workspaceSize < neededSize) : (%u < %u) => %u", |
|
3348 | (U32)workspaceSize, (U32)neededSize, (U32)(workspaceSize < neededSize)); | |

3564 | (unsigned)workspaceSize, (unsigned)neededSize, (unsigned)(workspaceSize < neededSize)); | |
|
3349 | 3565 | if (workspaceSize < neededSize) return NULL; |
|
3350 | 3566 | |
|
3351 | 3567 | if (dictLoadMethod == ZSTD_dlm_byCopy) { |
@@ -3505,7 +3721,7 b' static size_t ZSTD_resetCStream_internal' | |||
|
3505 | 3721 | size_t ZSTD_resetCStream(ZSTD_CStream* zcs, unsigned long long pledgedSrcSize) |
|
3506 | 3722 | { |
|
3507 | 3723 | ZSTD_CCtx_params params = zcs->requestedParams; |
|
3508 | DEBUGLOG(4, "ZSTD_resetCStream: pledgedSrcSize = %u", (U32)pledgedSrcSize); | |

3724 | DEBUGLOG(4, "ZSTD_resetCStream: pledgedSrcSize = %u", (unsigned)pledgedSrcSize); | |
|
3509 | 3725 | if (pledgedSrcSize==0) pledgedSrcSize = ZSTD_CONTENTSIZE_UNKNOWN; |
|
3510 | 3726 | params.fParams.contentSizeFlag = 1; |
|
3511 | 3727 | return ZSTD_resetCStream_internal(zcs, NULL, 0, ZSTD_dct_auto, zcs->cdict, params, pledgedSrcSize); |
@@ -3525,7 +3741,7 b' size_t ZSTD_initCStream_internal(ZSTD_CS' | |||
|
3525 | 3741 | assert(!((dict) && (cdict))); /* either dict or cdict, not both */ |
|
3526 | 3742 | |
|
3527 | 3743 | if (dict && dictSize >= 8) { |
|
3528 | DEBUGLOG(4, "loading dictionary of size %u", (U32)dictSize); | |

3744 | DEBUGLOG(4, "loading dictionary of size %u", (unsigned)dictSize); | |
|
3529 | 3745 | if (zcs->staticSize) { /* static CCtx : never uses malloc */ |
|
3530 | 3746 | /* incompatible with internal cdict creation */ |
|
3531 | 3747 | return ERROR(memory_allocation); |
@@ -3584,7 +3800,7 b' size_t ZSTD_initCStream_advanced(ZSTD_CS' | |||
|
3584 | 3800 | ZSTD_parameters params, unsigned long long pledgedSrcSize) |
|
3585 | 3801 | { |
|
3586 | 3802 | DEBUGLOG(4, "ZSTD_initCStream_advanced: pledgedSrcSize=%u, flag=%u", |
|
3587 | (U32)pledgedSrcSize, params.fParams.contentSizeFlag); | |

3803 | (unsigned)pledgedSrcSize, params.fParams.contentSizeFlag); | |
|
3588 | 3804 | CHECK_F( ZSTD_checkCParams(params.cParams) ); |
|
3589 | 3805 | if ((pledgedSrcSize==0) && (params.fParams.contentSizeFlag==0)) pledgedSrcSize = ZSTD_CONTENTSIZE_UNKNOWN; /* for compatibility with older programs relying on this behavior. Users should now specify ZSTD_CONTENTSIZE_UNKNOWN. This line will be removed in the future. */ |
|
3590 | 3806 | zcs->requestedParams = ZSTD_assignParamsToCCtxParams(zcs->requestedParams, params); |
@@ -3612,8 +3828,15 b' size_t ZSTD_initCStream(ZSTD_CStream* zc' | |||
|
3612 | 3828 | |
|
3613 | 3829 | /*====== Compression ======*/ |
|
3614 | 3830 | |
|
3615 | MEM_STATIC size_t ZSTD_limitCopy(void* dst, size_t dstCapacity, | |
|
3616 | const void* src, size_t srcSize) | |
|
3831 | static size_t ZSTD_nextInputSizeHint(const ZSTD_CCtx* cctx) | |
|
3832 | { | |
|
3833 | size_t hintInSize = cctx->inBuffTarget - cctx->inBuffPos; | |
|
3834 | if (hintInSize==0) hintInSize = cctx->blockSize; | |
|
3835 | return hintInSize; | |
|
3836 | } | |
|
3837 | ||
|
3838 | static size_t ZSTD_limitCopy(void* dst, size_t dstCapacity, | |
|
3839 | const void* src, size_t srcSize) | |
|
3617 | 3840 | { |
|
3618 | 3841 | size_t const length = MIN(dstCapacity, srcSize); |
|
3619 | 3842 | if (length) memcpy(dst, src, length); |
@@ -3621,7 +3844,7 b' MEM_STATIC size_t ZSTD_limitCopy(void* d' | |||
|
3621 | 3844 | } |
|
3622 | 3845 | |
|
3623 | 3846 | /** ZSTD_compressStream_generic(): |
|
3624 | * internal function for all *compressStream*() variants and *compress_generic() | |

3847 | * internal function for all *compressStream*() variants | |
|
3625 | 3848 | * non-static, because can be called from zstdmt_compress.c |
|
3626 | 3849 | * @return : hint size for next input */ |
|
3627 | 3850 | size_t ZSTD_compressStream_generic(ZSTD_CStream* zcs, |
@@ -3638,7 +3861,7 b' size_t ZSTD_compressStream_generic(ZSTD_' | |||
|
3638 | 3861 | U32 someMoreWork = 1; |
|
3639 | 3862 | |
|
3640 | 3863 | /* check expectations */ |
|
3641 | DEBUGLOG(5, "ZSTD_compressStream_generic, flush=%u", (U32)flushMode); | |

3864 | DEBUGLOG(5, "ZSTD_compressStream_generic, flush=%u", (unsigned)flushMode); | |
|
3642 | 3865 | assert(zcs->inBuff != NULL); |
|
3643 | 3866 | assert(zcs->inBuffSize > 0); |
|
3644 | 3867 | assert(zcs->outBuff != NULL); |
@@ -3660,12 +3883,12 b' size_t ZSTD_compressStream_generic(ZSTD_' | |||
|
3660 | 3883 | /* shortcut to compression pass directly into output buffer */ |
|
3661 | 3884 | size_t const cSize = ZSTD_compressEnd(zcs, |
|
3662 | 3885 | op, oend-op, ip, iend-ip); |
|
3663 | DEBUGLOG(4, "ZSTD_compressEnd : %u", (U32)cSize); | |

3886 | DEBUGLOG(4, "ZSTD_compressEnd : cSize=%u", (unsigned)cSize); | |
|
3664 | 3887 | if (ZSTD_isError(cSize)) return cSize; |
|
3665 | 3888 | ip = iend; |
|
3666 | 3889 | op += cSize; |
|
3667 | 3890 | zcs->frameEnded = 1; |
|
3668 | ZSTD_CCtx_reset(zcs); | |
|
3891 | ZSTD_CCtx_reset(zcs, ZSTD_reset_session_only); | |
|
3669 | 3892 | someMoreWork = 0; break; |
|
3670 | 3893 | } |
|
3671 | 3894 | /* complete loading into inBuffer */ |
@@ -3709,7 +3932,7 b' size_t ZSTD_compressStream_generic(ZSTD_' | |||
|
3709 | 3932 | if (zcs->inBuffTarget > zcs->inBuffSize) |
|
3710 | 3933 | zcs->inBuffPos = 0, zcs->inBuffTarget = zcs->blockSize; |
|
3711 | 3934 | DEBUGLOG(5, "inBuffTarget:%u / inBuffSize:%u", |
|
3712 | (U32)zcs->inBuffTarget, (U32)zcs->inBuffSize); | |

3935 | (unsigned)zcs->inBuffTarget, (unsigned)zcs->inBuffSize); | |
|
3713 | 3936 | if (!lastBlock) |
|
3714 | 3937 | assert(zcs->inBuffTarget <= zcs->inBuffSize); |
|
3715 | 3938 | zcs->inToCompress = zcs->inBuffPos; |
@@ -3718,7 +3941,7 b' size_t ZSTD_compressStream_generic(ZSTD_' | |||
|
3718 | 3941 | if (zcs->frameEnded) { |
|
3719 | 3942 | DEBUGLOG(5, "Frame completed directly in outBuffer"); |
|
3720 | 3943 | someMoreWork = 0; |
|
3721 | ZSTD_CCtx_reset(zcs); | |
|
3944 | ZSTD_CCtx_reset(zcs, ZSTD_reset_session_only); | |
|
3722 | 3945 | } |
|
3723 | 3946 | break; |
|
3724 | 3947 | } |
@@ -3733,7 +3956,7 b' size_t ZSTD_compressStream_generic(ZSTD_' | |||
|
3733 | 3956 | size_t const flushed = ZSTD_limitCopy(op, oend-op, |
|
3734 | 3957 | zcs->outBuff + zcs->outBuffFlushedSize, toFlush); |
|
3735 | 3958 | DEBUGLOG(5, "toFlush: %u into %u ==> flushed: %u", |
|
3736 | (U32)toFlush, (U32)(oend-op), (U32)flushed); | |

3959 | (unsigned)toFlush, (unsigned)(oend-op), (unsigned)flushed); | |
|
3737 | 3960 | op += flushed; |
|
3738 | 3961 | zcs->outBuffFlushedSize += flushed; |
|
3739 | 3962 | if (toFlush!=flushed) { |
@@ -3746,7 +3969,7 b' size_t ZSTD_compressStream_generic(ZSTD_' | |||
|
3746 | 3969 | if (zcs->frameEnded) { |
|
3747 | 3970 | DEBUGLOG(5, "Frame completed on flush"); |
|
3748 | 3971 | someMoreWork = 0; |
|
3749 | ZSTD_CCtx_reset(zcs); | |
|
3972 | ZSTD_CCtx_reset(zcs, ZSTD_reset_session_only); | |
|
3750 | 3973 | break; |
|
3751 | 3974 | } |
|
3752 | 3975 | zcs->streamStage = zcss_load; |
@@ -3761,28 +3984,34 b' size_t ZSTD_compressStream_generic(ZSTD_' | |||
|
3761 | 3984 | input->pos = ip - istart; |
|
3762 | 3985 | output->pos = op - ostart; |
|
3763 | 3986 | if (zcs->frameEnded) return 0; |
|
3764 | { size_t hintInSize = zcs->inBuffTarget - zcs->inBuffPos; | |
|
3765 | if (hintInSize==0) hintInSize = zcs->blockSize; | |
|
3766 | return hintInSize; | |
|
3987 | return ZSTD_nextInputSizeHint(zcs); | |
|
3988 | } | |
|
3989 | ||
|
3990 | static size_t ZSTD_nextInputSizeHint_MTorST(const ZSTD_CCtx* cctx) | |
|
3991 | { | |
|
3992 | #ifdef ZSTD_MULTITHREAD | |
|
3993 | if (cctx->appliedParams.nbWorkers >= 1) { | |
|
3994 | assert(cctx->mtctx != NULL); | |
|
3995 | return ZSTDMT_nextInputSizeHint(cctx->mtctx); | |
|
3767 | 3996 | } |
|
3997 | #endif | |
|
3998 | return ZSTD_nextInputSizeHint(cctx); | |
|
3999 | ||
|
3768 | 4000 | } |
|
3769 | 4001 | |
|
3770 | 4002 | size_t ZSTD_compressStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output, ZSTD_inBuffer* input) |
|
3771 | 4003 | { |
|
3772 | /* check conditions */ | |
|
3773 | if (output->pos > output->size) return ERROR(GENERIC); | |
|
3774 | if (input->pos > input->size) return ERROR(GENERIC); | |
|
3775 | ||
|
3776 | return ZSTD_compressStream_generic(zcs, output, input, ZSTD_e_continue); | |
|
4004 | CHECK_F( ZSTD_compressStream2(zcs, output, input, ZSTD_e_continue) ); | |
|
4005 | return ZSTD_nextInputSizeHint_MTorST(zcs); | |
|
3777 | 4006 | } |
|
3778 | 4007 | |
|
3779 | 4008 | |
|
3780 | size_t ZSTD_compress_generic (ZSTD_CCtx* cctx, | |

3781 | ZSTD_outBuffer* output, | |

3782 | ZSTD_inBuffer* input, | |

3783 | ZSTD_EndDirective endOp) | |

4009 | size_t ZSTD_compressStream2( ZSTD_CCtx* cctx, | |
|
4010 | ZSTD_outBuffer* output, | |
|
4011 | ZSTD_inBuffer* input, | |
|
4012 | ZSTD_EndDirective endOp) | |
|
3784 | 4013 | { |
|
3785 | DEBUGLOG(5, "ZSTD_compress_generic, endOp=%u ", (U32)endOp); | |

4014 | DEBUGLOG(5, "ZSTD_compressStream2, endOp=%u ", (unsigned)endOp); | |
|
3786 | 4015 | /* check conditions */ |
|
3787 | 4016 | if (output->pos > output->size) return ERROR(GENERIC); |
|
3788 | 4017 | if (input->pos > input->size) return ERROR(GENERIC); |
@@ -3792,9 +4021,9 b' size_t ZSTD_compress_generic (ZSTD_CCtx*' | |||
|
3792 | 4021 | if (cctx->streamStage == zcss_init) { |
|
3793 | 4022 | ZSTD_CCtx_params params = cctx->requestedParams; |
|
3794 | 4023 | ZSTD_prefixDict const prefixDict = cctx->prefixDict; |
|
3795 | memset(&cctx->prefixDict, 0, sizeof(cctx->prefixDict)); /* single usage */ | |
|
3796 | assert(prefixDict.dict==NULL || cctx->cdict==NULL); /* only one can be set */ | |
|
3797 | DEBUGLOG(4, "ZSTD_compress_generic : transparent init stage"); | |

4024 | memset(&cctx->prefixDict, 0, sizeof(cctx->prefixDict)); /* single usage */ | |
|
4025 | assert(prefixDict.dict==NULL || cctx->cdict==NULL); /* only one can be set */ | |
|
4026 | DEBUGLOG(4, "ZSTD_compressStream2 : transparent init stage"); | |
|
3798 | 4027 | if (endOp == ZSTD_e_end) cctx->pledgedSrcSizePlusOne = input->size + 1; /* auto-fix pledgedSrcSize */ |
|
3799 | 4028 | params.cParams = ZSTD_getCParamsFromCCtxParams( |
|
3800 | 4029 | &cctx->requestedParams, cctx->pledgedSrcSizePlusOne-1, 0 /*dictSize*/); |
@@ -3807,7 +4036,7 b' size_t ZSTD_compress_generic (ZSTD_CCtx*' | |||
|
3807 | 4036 | if (params.nbWorkers > 0) { |
|
3808 | 4037 | /* mt context creation */ |
|
3809 | 4038 | if (cctx->mtctx == NULL) { |
|
3810 | DEBUGLOG(4, "ZSTD_compress_generic: creating new mtctx for nbWorkers=%u", | |

4039 | DEBUGLOG(4, "ZSTD_compressStream2: creating new mtctx for nbWorkers=%u", | |
|
3811 | 4040 | params.nbWorkers); |
|
3812 | 4041 | cctx->mtctx = ZSTDMT_createCCtx_advanced(params.nbWorkers, cctx->customMem); |
|
3813 | 4042 | if (cctx->mtctx == NULL) return ERROR(memory_allocation); |
@@ -3829,6 +4058,7 b' size_t ZSTD_compress_generic (ZSTD_CCtx*' | |||
|
3829 | 4058 | assert(cctx->streamStage == zcss_load); |
|
3830 | 4059 | assert(cctx->appliedParams.nbWorkers == 0); |
|
3831 | 4060 | } } |
|
4061 | /* end of transparent initialization stage */ | |
|
3832 | 4062 | |
|
3833 | 4063 | /* compression stage */ |
|
3834 | 4064 | #ifdef ZSTD_MULTITHREAD |
@@ -3840,18 +4070,18 b' size_t ZSTD_compress_generic (ZSTD_CCtx*' | |||
|
3840 | 4070 | { size_t const flushMin = ZSTDMT_compressStream_generic(cctx->mtctx, output, input, endOp); |
|
3841 | 4071 | if ( ZSTD_isError(flushMin) |
|
3842 | 4072 | || (endOp == ZSTD_e_end && flushMin == 0) ) { /* compression completed */ |
|
3843 | ZSTD_CCtx_reset(cctx); | |
|
4073 | ZSTD_CCtx_reset(cctx, ZSTD_reset_session_only); | |
|
3844 | 4074 | } |
|
3845 | DEBUGLOG(5, "completed ZSTD_compress_generic delegating to ZSTDMT_compressStream_generic"); | |

4075 | DEBUGLOG(5, "completed ZSTD_compressStream2 delegating to ZSTDMT_compressStream_generic"); | |
|
3846 | 4076 | return flushMin; |
|
3847 | 4077 | } } |
|
3848 | 4078 | #endif |
|
3849 | 4079 | CHECK_F( ZSTD_compressStream_generic(cctx, output, input, endOp) ); |
|
3850 |
DEBUGLOG(5, "completed ZSTD_compress |
|
|
4080 | DEBUGLOG(5, "completed ZSTD_compressStream2"); | |
|
3851 | 4081 | return cctx->outBuffContentSize - cctx->outBuffFlushedSize; /* remaining to flush */ |
|
3852 | 4082 | } |
|
3853 | 4083 | |
|
3854 |
size_t ZSTD_compress |
|
|
4084 | size_t ZSTD_compressStream2_simpleArgs ( | |
|
3855 | 4085 | ZSTD_CCtx* cctx, |
|
3856 | 4086 | void* dst, size_t dstCapacity, size_t* dstPos, |
|
3857 | 4087 | const void* src, size_t srcSize, size_t* srcPos, |
@@ -3859,13 +4089,33 @@ size_t ZSTD_compress_generic_simpleArgs
 {
     ZSTD_outBuffer output = { dst, dstCapacity, *dstPos };
     ZSTD_inBuffer  input  = { src, srcSize, *srcPos };
-    /* ZSTD_compress_generic() will check validity of dstPos and srcPos */
-    size_t const cErr = ZSTD_compress_generic(cctx, &output, &input, endOp);
+    /* ZSTD_compressStream2() will check validity of dstPos and srcPos */
+    size_t const cErr = ZSTD_compressStream2(cctx, &output, &input, endOp);
     *dstPos = output.pos;
     *srcPos = input.pos;
     return cErr;
 }

+size_t ZSTD_compress2(ZSTD_CCtx* cctx,
+                      void* dst, size_t dstCapacity,
+                      const void* src, size_t srcSize)
+{
+    ZSTD_CCtx_reset(cctx, ZSTD_reset_session_only);
+    {   size_t oPos = 0;
+        size_t iPos = 0;
+        size_t const result = ZSTD_compressStream2_simpleArgs(cctx,
+                                        dst, dstCapacity, &oPos,
+                                        src, srcSize, &iPos,
+                                        ZSTD_e_end);
+        if (ZSTD_isError(result)) return result;
+        if (result != 0) {  /* compression not completed, due to lack of output space */
+            assert(oPos == dstCapacity);
+            return ERROR(dstSize_tooSmall);
+        }
+        assert(iPos == srcSize);  /* all input is expected consumed */
+        return oPos;
+    }
+}

 /*====== Finalize ======*/
@@ -3874,21 +4124,21 @@ size_t ZSTD_compress_generic_simpleArgs
 size_t ZSTD_flushStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output)
 {
     ZSTD_inBuffer input = { NULL, 0, 0 };
-    if (output->pos > output->size) return ERROR(GENERIC);
-    CHECK_F( ZSTD_compressStream_generic(zcs, output, &input, ZSTD_e_flush) );
-    return zcs->outBuffContentSize - zcs->outBuffFlushedSize;  /* remaining to flush */
+    return ZSTD_compressStream2(zcs, output, &input, ZSTD_e_flush);
 }


 size_t ZSTD_endStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output)
 {
     ZSTD_inBuffer input = { NULL, 0, 0 };
-    if (output->pos > output->size) return ERROR(GENERIC);
-    CHECK_F( ZSTD_compressStream_generic(zcs, output, &input, ZSTD_e_end) );
+    size_t const remainingToFlush = ZSTD_compressStream2(zcs, output, &input, ZSTD_e_end);
+    CHECK_F( remainingToFlush );
+    if (zcs->appliedParams.nbWorkers > 0) return remainingToFlush;   /* minimal estimation */
+    /* single thread mode : attempt to calculate remaining to flush more precisely */
     {   size_t const lastBlockSize = zcs->frameEnded ? 0 : ZSTD_BLOCKHEADERSIZE;
         size_t const checksumSize = zcs->frameEnded ? 0 : zcs->appliedParams.fParams.checksumFlag * 4;
-        size_t const toFlush = zcs->outBuffContentSize - zcs->outBuffFlushedSize + lastBlockSize + checksumSize;
-        DEBUGLOG(4, "ZSTD_endStream : remaining to flush : %u", (U32)toFlush);
+        size_t const toFlush = remainingToFlush + lastBlockSize + checksumSize;
+        DEBUGLOG(4, "ZSTD_endStream : remaining to flush : %u", (unsigned)toFlush);
         return toFlush;
     }
 }
@@ -3905,27 +4155,27 @@ static const ZSTD_compressionParameters
     /* W,  C,  H,  S,  L, TL, strat */
     { 19, 12, 13,  1,  6,  1, ZSTD_fast    },  /* base for negative levels */
     { 19, 13, 14,  1,  7,  0, ZSTD_fast    },  /* level  1 */
-    { 19, 15, 16,  1,  6,  0, ZSTD_fast    },  /* level  2 */
-    { 20, 16, 17,  1,  5,  1, ZSTD_dfast   },  /* level  3 */
-    { 20, 18, 18,  1,  5,  1, ZSTD_dfast   },  /* level  4 */
-    { 20, 18, 18,  2,  5,  2, ZSTD_greedy  },  /* level  5 */
-    { 21, 18, 19,  2,  5,  4, ZSTD_lazy    },  /* level  6 */
-    { 21, 18, 19,  3,  5,  8, ZSTD_lazy2   },  /* level  7 */
+    { 20, 15, 16,  1,  6,  0, ZSTD_fast    },  /* level  2 */
+    { 21, 16, 17,  1,  5,  1, ZSTD_dfast   },  /* level  3 */
+    { 21, 18, 18,  1,  5,  1, ZSTD_dfast   },  /* level  4 */
+    { 21, 18, 19,  2,  5,  2, ZSTD_greedy  },  /* level  5 */
+    { 21, 19, 19,  3,  5,  4, ZSTD_greedy  },  /* level  6 */
+    { 21, 19, 19,  3,  5,  8, ZSTD_lazy    },  /* level  7 */
     { 21, 19, 19,  3,  5, 16, ZSTD_lazy2   },  /* level  8 */
     { 21, 19, 20,  4,  5, 16, ZSTD_lazy2   },  /* level  9 */
-    { 21, 20, 21,  4,  5, 16, ZSTD_lazy2   },  /* level 10 */
-    { 21, 21, 22,  4,  5, 16, ZSTD_lazy2   },  /* level 11 */
-    { 22, 20, 22,  5,  5, 16, ZSTD_lazy2   },  /* level 12 */
-    { 22, 21, 22,  4,  5, 32, ZSTD_btlazy2 },  /* level 13 */
-    { 22, 21, 22,  5,  5, 32, ZSTD_btlazy2 },  /* level 14 */
-    { 22, 22, 22,  6,  5, 32, ZSTD_btlazy2 },  /* level 15 */
-    { 22, 21, 22,  4,  5, 48, ZSTD_btopt   },  /* level 16 */
-    { 23, 22, 22,  4,  4, 64, ZSTD_btopt   },  /* level 17 */
-    { 23, 23, 22,  6,  3,256, ZSTD_btopt   },  /* level 18 */
-    { 23, 24, 22,  7,  3,256, ZSTD_btultra },  /* level 19 */
-    { 25, 25, 23,  7,  3,256, ZSTD_btultra },  /* level 20 */
-    { 26, 26, 24,  7,  3,512, ZSTD_btultra },  /* level 21 */
-    { 27, 27, 25,  9,  3,999, ZSTD_btultra },  /* level 22 */
+    { 22, 20, 21,  4,  5, 16, ZSTD_lazy2   },  /* level 10 */
+    { 22, 21, 22,  4,  5, 16, ZSTD_lazy2   },  /* level 11 */
+    { 22, 21, 22,  5,  5, 16, ZSTD_lazy2   },  /* level 12 */
+    { 22, 21, 22,  5,  5, 32, ZSTD_btlazy2 },  /* level 13 */
+    { 22, 22, 23,  5,  5, 32, ZSTD_btlazy2 },  /* level 14 */
+    { 22, 23, 23,  6,  5, 32, ZSTD_btlazy2 },  /* level 15 */
+    { 22, 22, 22,  5,  5, 48, ZSTD_btopt   },  /* level 16 */
+    { 23, 23, 22,  5,  4, 64, ZSTD_btopt   },  /* level 17 */
+    { 23, 23, 22,  6,  3, 64, ZSTD_btultra },  /* level 18 */
+    { 23, 24, 22,  7,  3,256, ZSTD_btultra2},  /* level 19 */
+    { 25, 25, 23,  7,  3,256, ZSTD_btultra2},  /* level 20 */
+    { 26, 26, 24,  7,  3,512, ZSTD_btultra2},  /* level 21 */
+    { 27, 27, 25,  9,  3,999, ZSTD_btultra2},  /* level 22 */
 },
 { /* for srcSize <= 256 KB */
     /* W,  C,  H,  S,  L,  T, strat */
@@ -3940,18 +4190,18 @@ static const ZSTD_compressionParameters
     { 18, 18, 19,  4,  4,  8, ZSTD_lazy2   },  /* level  8 */
     { 18, 18, 19,  5,  4,  8, ZSTD_lazy2   },  /* level  9 */
     { 18, 18, 19,  6,  4,  8, ZSTD_lazy2   },  /* level 10 */
-    { 18, 18, 19,  5,  4, 16, ZSTD_btlazy2 },  /* level 11.*/
-    { 18, 19, 19,  6,  4, 16, ZSTD_btlazy2 },  /* level 12.*/
-    { 18, 19, 19,  8,  4, 16, ZSTD_btlazy2 },  /* level 13 */
-    { 18, 18, 19,  4,  4, 24, ZSTD_btopt   },  /* level 14.*/
-    { 18, 18, 19,  4,  3, 24, ZSTD_btopt   },  /* level 15.*/
-    { 18, 19, 19,  6,  3, 64, ZSTD_btopt   },  /* level 16.*/
-    { 18, 19, 19,  8,  3,128, ZSTD_btopt   },  /* level 17.*/
-    { 18, 19, 19, 10,  3,256, ZSTD_btopt   },  /* level 18.*/
-    { 18, 19, 19, 10,  3,256, ZSTD_btultra },  /* level 19.*/
-    { 18, 19, 19, 11,  3,512, ZSTD_btultra },  /* level 20.*/
-    { 18, 19, 19, 12,  3,512, ZSTD_btultra },  /* level 21.*/
-    { 18, 19, 19, 13,  3,999, ZSTD_btultra },  /* level 22.*/
+    { 18, 18, 19,  5,  4, 12, ZSTD_btlazy2 },  /* level 11.*/
+    { 18, 19, 19,  7,  4, 12, ZSTD_btlazy2 },  /* level 12.*/
+    { 18, 18, 19,  4,  4, 16, ZSTD_btopt   },  /* level 13 */
+    { 18, 18, 19,  4,  3, 32, ZSTD_btopt   },  /* level 14.*/
+    { 18, 18, 19,  6,  3,128, ZSTD_btopt   },  /* level 15.*/
+    { 18, 19, 19,  6,  3,128, ZSTD_btultra },  /* level 16.*/
+    { 18, 19, 19,  8,  3,256, ZSTD_btultra },  /* level 17.*/
+    { 18, 19, 19,  6,  3,128, ZSTD_btultra2},  /* level 18.*/
+    { 18, 19, 19,  8,  3,256, ZSTD_btultra2},  /* level 19.*/
+    { 18, 19, 19, 10,  3,512, ZSTD_btultra2},  /* level 20.*/
+    { 18, 19, 19, 12,  3,512, ZSTD_btultra2},  /* level 21.*/
+    { 18, 19, 19, 13,  3,999, ZSTD_btultra2},  /* level 22.*/
 },
 { /* for srcSize <= 128 KB */
     /* W,  C,  H,  S,  L,  T, strat */
@@ -3966,26 +4216,26 @@ static const ZSTD_compressionParameters
     { 17, 17, 17,  4,  4,  8, ZSTD_lazy2   },  /* level  8 */
     { 17, 17, 17,  5,  4,  8, ZSTD_lazy2   },  /* level  9 */
     { 17, 17, 17,  6,  4,  8, ZSTD_lazy2   },  /* level 10 */
-    { 17, 17, 17,  7,  4,  8, ZSTD_lazy2   },  /* level 11 */
-    { 17, 18, 17,  6,  4, 16, ZSTD_btlazy2 },  /* level 12 */
-    { 17, 18, 17,  8,  4, 16, ZSTD_btlazy2 },  /* level 13.*/
-    { 17, 18, 17,  4,  4, 32, ZSTD_btopt   },  /* level 14.*/
-    { 17, 18, 17,  6,  3, 64, ZSTD_btopt   },  /* level 15.*/
-    { 17, 18, 17,  7,  3,128, ZSTD_btopt   },  /* level 16.*/
-    { 17, 18, 17,  7,  3,256, ZSTD_btopt   },  /* level 17.*/
-    { 17, 18, 17,  8,  3,256, ZSTD_btopt   },  /* level 18.*/
-    { 17, 18, 17,  8,  3,256, ZSTD_btultra },  /* level 19.*/
-    { 17, 18, 17,  9,  3,256, ZSTD_btultra },  /* level 20.*/
-    { 17, 18, 17,  9,  3,256, ZSTD_btultra },  /* level 21.*/
-    { 17, 18, 17,  9,  3,256, ZSTD_btultra },  /* level 22.*/
+    { 17, 17, 17,  5,  4,  8, ZSTD_btlazy2 },  /* level 11 */
+    { 17, 18, 17,  7,  4, 12, ZSTD_btlazy2 },  /* level 12 */
+    { 17, 18, 17,  3,  4, 12, ZSTD_btopt   },  /* level 13.*/
+    { 17, 18, 17,  4,  3, 32, ZSTD_btopt   },  /* level 14.*/
+    { 17, 18, 17,  6,  3,256, ZSTD_btopt   },  /* level 15.*/
+    { 17, 18, 17,  6,  3,128, ZSTD_btultra },  /* level 16.*/
+    { 17, 18, 17,  8,  3,256, ZSTD_btultra },  /* level 17.*/
+    { 17, 18, 17, 10,  3,512, ZSTD_btultra },  /* level 18.*/
+    { 17, 18, 17,  5,  3,256, ZSTD_btultra2},  /* level 19.*/
+    { 17, 18, 17,  7,  3,512, ZSTD_btultra2},  /* level 20.*/
+    { 17, 18, 17,  9,  3,512, ZSTD_btultra2},  /* level 21.*/
+    { 17, 18, 17, 11,  3,999, ZSTD_btultra2},  /* level 22.*/
 },
 { /* for srcSize <= 16 KB */
     /* W,  C,  H,  S,  L,  T, strat */
     { 14, 12, 13,  1,  5,  1, ZSTD_fast    },  /* base for negative levels */
     { 14, 14, 15,  1,  5,  0, ZSTD_fast    },  /* level  1 */
     { 14, 14, 15,  1,  4,  0, ZSTD_fast    },  /* level  2 */
-    { 14, 14, 14,  2,  4,  1, ZSTD_dfast   },  /* level  3.*/
-    { 14, 14, 14,  4,  4,  2, ZSTD_greedy  },  /* level  4.*/
+    { 14, 14, 15,  2,  4,  1, ZSTD_dfast   },  /* level  3 */
+    { 14, 14, 14,  4,  4,  2, ZSTD_greedy  },  /* level  4 */
     { 14, 14, 14,  3,  4,  4, ZSTD_lazy    },  /* level  5.*/
     { 14, 14, 14,  4,  4,  8, ZSTD_lazy2   },  /* level  6 */
     { 14, 14, 14,  6,  4,  8, ZSTD_lazy2   },  /* level  7 */
@@ -3993,17 +4243,17 @@ static const ZSTD_compressionParameters
     { 14, 15, 14,  5,  4,  8, ZSTD_btlazy2 },  /* level  9.*/
     { 14, 15, 14,  9,  4,  8, ZSTD_btlazy2 },  /* level 10.*/
     { 14, 15, 14,  3,  4, 12, ZSTD_btopt   },  /* level 11.*/
-    { 14, 15, 14,  6,  3, 16, ZSTD_btopt   },  /* level 12.*/
-    { 14, 15, 14,  6,  3, 24, ZSTD_btopt   },  /* level 13.*/
-    { 14, 15, 15,  6,  3, 48, ZSTD_btopt   },  /* level 14.*/
-    { 14, 15, 15,  6,  3, 64, ZSTD_btopt   },  /* level 15.*/
-    { 14, 15, 15,  6,  3, 96, ZSTD_btopt   },  /* level 16.*/
-    { 14, 15, 15,  6,  3,128, ZSTD_btopt   },  /* level 17.*/
-    { 14, 15, 15,  6,  3,256, ZSTD_btopt   },  /* level 18.*/
-    { 14, 15, 15,  7,  3,256, ZSTD_btopt   },  /* level 19.*/
-    { 14, 15, 15,  8,  3,256, ZSTD_btultra },  /* level 20.*/
-    { 14, 15, 15,  9,  3,256, ZSTD_btultra },  /* level 21.*/
-    { 14, 15, 15, 10,  3,512, ZSTD_btultra },  /* level 22.*/
+    { 14, 15, 14,  4,  3, 24, ZSTD_btopt   },  /* level 12.*/
+    { 14, 15, 14,  5,  3, 32, ZSTD_btultra },  /* level 13.*/
+    { 14, 15, 15,  6,  3, 64, ZSTD_btultra },  /* level 14.*/
+    { 14, 15, 15,  7,  3,256, ZSTD_btultra },  /* level 15.*/
+    { 14, 15, 15,  5,  3, 48, ZSTD_btultra2},  /* level 16.*/
+    { 14, 15, 15,  6,  3,128, ZSTD_btultra2},  /* level 17.*/
+    { 14, 15, 15,  7,  3,256, ZSTD_btultra2},  /* level 18.*/
+    { 14, 15, 15,  8,  3,256, ZSTD_btultra2},  /* level 19.*/
+    { 14, 15, 15,  8,  3,512, ZSTD_btultra2},  /* level 20.*/
+    { 14, 15, 15,  9,  3,512, ZSTD_btultra2},  /* level 21.*/
+    { 14, 15, 15, 10,  3,999, ZSTD_btultra2},  /* level 22.*/
 },
 };
@@ -4022,8 +4272,8 @@ ZSTD_compressionParameters ZSTD_getCPara
     if (compressionLevel > ZSTD_MAX_CLEVEL) row = ZSTD_MAX_CLEVEL;
     {   ZSTD_compressionParameters cp = ZSTD_defaultCParameters[tableID][row];
         if (compressionLevel < 0) cp.targetLength = (unsigned)(-compressionLevel);   /* acceleration factor */
-        return ZSTD_adjustCParams_internal(cp, srcSizeHint, dictSize); }
-
+        return ZSTD_adjustCParams_internal(cp, srcSizeHint, dictSize);
+    }
 }

 /*! ZSTD_getParams() :
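`ZSTD_getCParams()`, patched just above, indexes the four parameter tables by a source-size bucket. A small standalone sketch of that bucket selection (the helper name `cparams_table_id` is ours; the comparison logic mirrors upstream's `tableID` computation):

```c
#include <assert.h>

#define KB (1 << 10)

/* Pick which of the 4 cparams tables applies:
 * 0 = "default" (large/unknown srcSize), 1 = <= 256 KB,
 * 2 = <= 128 KB, 3 = <= 16 KB.  Each comparison that holds
 * adds one, so smaller inputs land in later tables. */
static int cparams_table_id(unsigned long long srcSizeHint)
{
    return (srcSizeHint <= 256*KB)
         + (srcSizeHint <= 128*KB)
         + (srcSizeHint <= 16*KB);
}
```

An unknown content size (0 or very large) falls through all three comparisons and selects the default, most memory-hungry table.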
@@ -48,12 +48,6 @@ extern "C" {
 typedef enum { ZSTDcs_created=0, ZSTDcs_init, ZSTDcs_ongoing, ZSTDcs_ending } ZSTD_compressionStage_e;
 typedef enum { zcss_init=0, zcss_load, zcss_flush } ZSTD_cStreamStage;

-typedef enum {
-    ZSTD_dictDefaultAttach = 0,
-    ZSTD_dictForceAttach = 1,
-    ZSTD_dictForceCopy = -1,
-} ZSTD_dictAttachPref_e;
-
 typedef struct ZSTD_prefixDict_s {
     const void* dict;
     size_t dictSize;
@@ -96,10 +90,10 @@ typedef enum { zop_dynamic=0, zop_predef

 typedef struct {
     /* All tables are allocated inside cctx->workspace by ZSTD_resetCCtx_internal() */
-    U32* litFreq;                /* table of literals statistics, of size 256 */
-    U32* litLengthFreq;          /* table of litLength statistics, of size (MaxLL+1) */
-    U32* matchLengthFreq;        /* table of matchLength statistics, of size (MaxML+1) */
-    U32* offCodeFreq;            /* table of offCode statistics, of size (MaxOff+1) */
+    unsigned* litFreq;           /* table of literals statistics, of size 256 */
+    unsigned* litLengthFreq;     /* table of litLength statistics, of size (MaxLL+1) */
+    unsigned* matchLengthFreq;   /* table of matchLength statistics, of size (MaxML+1) */
+    unsigned* offCodeFreq;       /* table of offCode statistics, of size (MaxOff+1) */
     ZSTD_match_t* matchTable;    /* list of found matches, of size ZSTD_OPT_NUM+1 */
     ZSTD_optimal_t* priceTable;  /* All positions tracked by optimal parser, of size ZSTD_OPT_NUM+1 */

@@ -139,7 +133,7 @@ struct ZSTD_matchState_t {
     U32* hashTable3;
     U32* chainTable;
     optState_t opt;         /* optimal parser state */
-    const ZSTD_matchState_t *dictMatchState;
+    const ZSTD_matchState_t * dictMatchState;
     ZSTD_compressionParameters cParams;
 };

@@ -167,7 +161,7 @@ typedef struct {
     U32 hashLog;            /* Log size of hashTable */
     U32 bucketSizeLog;      /* Log bucket size for collision resolution, at most 8 */
     U32 minMatchLength;     /* Minimum match length */
-    U32 hashEveryLog;       /* Log number of entries to skip */
+    U32 hashRateLog;        /* Log number of entries to skip */
     U32 windowLog;          /* Window log for the LDM */
 } ldmParams_t;

@@ -196,9 +190,10 @@ struct ZSTD_CCtx_params_s {
     ZSTD_dictAttachPref_e attachDictPref;

     /* Multithreading: used to pass parameters to mtctx */
-    unsigned nbWorkers;
-    unsigned jobSize;
-    unsigned overlapSizeLog;
+    int nbWorkers;
+    size_t jobSize;
+    int overlapLog;
+    int rsyncable;

     /* Long distance matching parameters */
     ldmParams_t ldmParams;
@@ -498,6 +493,64 @@ MEM_STATIC size_t ZSTD_hashPtr(const voi
     }
 }

+/** ZSTD_ipow() :
+ * Return base^exponent.
+ */
+static U64 ZSTD_ipow(U64 base, U64 exponent)
+{
+    U64 power = 1;
+    while (exponent) {
+        if (exponent & 1) power *= base;
+        exponent >>= 1;
+        base *= base;
+    }
+    return power;
+}
+
+#define ZSTD_ROLL_HASH_CHAR_OFFSET 10
+
+/** ZSTD_rollingHash_append() :
+ * Add the buffer to the hash value.
+ */
+static U64 ZSTD_rollingHash_append(U64 hash, void const* buf, size_t size)
+{
+    BYTE const* istart = (BYTE const*)buf;
+    size_t pos;
+    for (pos = 0; pos < size; ++pos) {
+        hash *= prime8bytes;
+        hash += istart[pos] + ZSTD_ROLL_HASH_CHAR_OFFSET;
+    }
+    return hash;
+}
+
+/** ZSTD_rollingHash_compute() :
+ * Compute the rolling hash value of the buffer.
+ */
+MEM_STATIC U64 ZSTD_rollingHash_compute(void const* buf, size_t size)
+{
+    return ZSTD_rollingHash_append(0, buf, size);
+}
+
+/** ZSTD_rollingHash_primePower() :
+ * Compute the primePower to be passed to ZSTD_rollingHash_rotate() for a hash
+ * over a window of length bytes.
+ */
+MEM_STATIC U64 ZSTD_rollingHash_primePower(U32 length)
+{
+    return ZSTD_ipow(prime8bytes, length - 1);
+}
+
+/** ZSTD_rollingHash_rotate() :
+ * Rotate the rolling hash by one byte.
+ */
+MEM_STATIC U64 ZSTD_rollingHash_rotate(U64 hash, BYTE toRemove, BYTE toAdd, U64 primePower)
+{
+    hash -= (toRemove + ZSTD_ROLL_HASH_CHAR_OFFSET) * primePower;
+    hash *= prime8bytes;
+    hash += toAdd + ZSTD_ROLL_HASH_CHAR_OFFSET;
+    return hash;
+}
+
 /*-*************************************
 *  Round buffer management
 ***************************************/
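The rolling hash introduced above is a Rabin-Karp style hash: compute it once over a window, then slide one byte at a time. A standalone restatement with a check of the rotate/recompute equivalence (the `prime8bytes` value is zstd's own constant from `zstd_compress_internal.h`; the helper names here are ours):

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

static const uint64_t prime8bytes = 0xCF1BBCDCB7A56463ULL;
#define ROLL_CHAR_OFFSET 10   /* avoids a zero hash for all-zero windows */

/* base^exponent by square-and-multiply, mod 2^64 */
static uint64_t ipow(uint64_t base, uint64_t exponent)
{
    uint64_t power = 1;
    while (exponent) {
        if (exponent & 1) power *= base;
        exponent >>= 1;
        base *= base;
    }
    return power;
}

/* hash(buf[0..size)) = sum_i (buf[i]+OFFSET) * prime^(size-1-i), mod 2^64 */
static uint64_t roll_compute(const uint8_t* buf, size_t size)
{
    uint64_t hash = 0;
    size_t i;
    for (i = 0; i < size; ++i) {
        hash *= prime8bytes;
        hash += buf[i] + ROLL_CHAR_OFFSET;
    }
    return hash;
}

/* Slide the window one byte: remove `out`, append `in`.
 * primePower must be prime^(windowLength-1). */
static uint64_t roll_rotate(uint64_t hash, uint8_t out, uint8_t in, uint64_t primePower)
{
    hash -= (out + ROLL_CHAR_OFFSET) * primePower;
    hash *= prime8bytes;
    hash += in + ROLL_CHAR_OFFSET;
    return hash;
}
```

All arithmetic is modulo 2^64, so the subtraction in the rotate step is exact even when it wraps. The long-distance matcher relies on this to hash every window position at O(1) amortized cost.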
@@ -626,20 +679,23 @@ MEM_STATIC U32 ZSTD_window_correctOverfl
  * dictMatchState mode, lowLimit and dictLimit are the same, and the dictionary
  * is below them. forceWindow and dictMatchState are therefore incompatible.
  */
-MEM_STATIC void ZSTD_window_enforceMaxDist(ZSTD_window_t* window,
-                                           void const* srcEnd, U32 maxDist,
-                                           U32* loadedDictEndPtr,
-                                           const ZSTD_matchState_t** dictMatchStatePtr)
+MEM_STATIC void
+ZSTD_window_enforceMaxDist(ZSTD_window_t* window,
+                           void const* srcEnd,
+                           U32 maxDist,
+                           U32* loadedDictEndPtr,
+                     const ZSTD_matchState_t** dictMatchStatePtr)
 {
-    U32 const current = (U32)((BYTE const*)srcEnd - window->base);
-    U32 loadedDictEnd = loadedDictEndPtr != NULL ? *loadedDictEndPtr : 0;
-    DEBUGLOG(5, "ZSTD_window_enforceMaxDist: current=%u, maxDist=%u", current, maxDist);
-    if (current > maxDist + loadedDictEnd) {
-        U32 const newLowLimit = current - maxDist;
+    U32 const blockEndIdx = (U32)((BYTE const*)srcEnd - window->base);
+    U32 loadedDictEnd = (loadedDictEndPtr != NULL) ? *loadedDictEndPtr : 0;
+    DEBUGLOG(5, "ZSTD_window_enforceMaxDist: blockEndIdx=%u, maxDist=%u",
+                (unsigned)blockEndIdx, (unsigned)maxDist);
+    if (blockEndIdx > maxDist + loadedDictEnd) {
+        U32 const newLowLimit = blockEndIdx - maxDist;
         if (window->lowLimit < newLowLimit) window->lowLimit = newLowLimit;
         if (window->dictLimit < window->lowLimit) {
             DEBUGLOG(5, "Update dictLimit to match lowLimit, from %u to %u",
-                     window->dictLimit, window->lowLimit);
+                        (unsigned)window->dictLimit, (unsigned)window->lowLimit);
             window->dictLimit = window->lowLimit;
         }
         if (loadedDictEndPtr)
@@ -690,20 +746,23 @@ MEM_STATIC U32 ZSTD_window_update(ZSTD_w


 /* debug functions */
+#if (DEBUGLEVEL>=2)

 MEM_STATIC double ZSTD_fWeight(U32 rawStat)
 {
     U32 const fp_accuracy = 8;
     U32 const fp_multiplier = (1 << fp_accuracy);
-    U32 const stat = rawStat + 1;
-    U32 const hb = ZSTD_highbit32(stat);
+    U32 const newStat = rawStat + 1;
+    U32 const hb = ZSTD_highbit32(newStat);
     U32 const BWeight = hb * fp_multiplier;
-    U32 const FWeight = (stat << fp_accuracy) >> hb;
+    U32 const FWeight = (newStat << fp_accuracy) >> hb;
     U32 const weight = BWeight + FWeight;
     assert(hb + fp_accuracy < 31);
     return (double)weight / fp_multiplier;
 }

+/* display a table content,
+ * listing each element, its frequency, and its predicted bit cost */
 MEM_STATIC void ZSTD_debugTable(const U32* table, U32 max)
 {
     unsigned u, sum;
@@ -715,6 +774,9 @@ MEM_STATIC void ZSTD_debugTable(const U3
     }
 }

+#endif
+
+
 #if defined (__cplusplus)
 }
 #endif
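The `ZSTD_fWeight()` function now guarded by `DEBUGLEVEL>=2` returns roughly `log2(stat+1) + 1` in 8-bit fixed point, used to print predicted bit costs for table entries. A standalone sketch (our names; a portable bit-scan loop stands in for `ZSTD_highbit32`) that is exact when `stat+1` is a power of two:

```c
#include <assert.h>
#include <stdint.h>

/* index of the highest set bit; caller guarantees v != 0 */
static unsigned highbit32(uint32_t v)
{
    unsigned n = 0;
    while (v >>= 1) n++;
    return n;
}

/* ~log2(rawStat+1) + 1, computed in 8-bit fixed point:
 * integer part from the highest set bit, fractional part from
 * the bits just below it. */
static double fweight(uint32_t rawStat)
{
    unsigned const fp_accuracy = 8;
    unsigned const fp_multiplier = 1u << fp_accuracy;
    uint32_t const newStat = rawStat + 1;
    unsigned const hb = highbit32(newStat);
    uint32_t const BWeight = hb * fp_multiplier;
    uint32_t const FWeight = (newStat << fp_accuracy) >> hb;
    return (double)(BWeight + FWeight) / fp_multiplier;
}
```

For `newStat = 2^k` the fractional term is exactly `fp_multiplier`, so the result collapses to `k + 1`; in between it interpolates linearly, which is accurate enough for debug cost estimates.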
@@ -18,7 +18,7 @@ void ZSTD_fillDoubleHashTable(ZSTD_match
     const ZSTD_compressionParameters* const cParams = &ms->cParams;
     U32* const hashLarge = ms->hashTable;
     U32  const hBitsL = cParams->hashLog;
-    U32  const mls = cParams->searchLength;
+    U32  const mls = cParams->minMatch;
     U32* const hashSmall = ms->chainTable;
     U32  const hBitsS = cParams->chainLog;
     const BYTE* const base = ms->window.base;
@@ -309,7 +309,7 @@ size_t ZSTD_compressBlock_doubleFast(
         ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM],
         void const* src, size_t srcSize)
 {
-    const U32 mls = ms->cParams.searchLength;
+    const U32 mls = ms->cParams.minMatch;
     switch(mls)
     {
     default: /* includes case 3 */
@@ -329,7 +329,7 @@ size_t ZSTD_compressBlock_doubleFast_dic
         ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM],
         void const* src, size_t srcSize)
 {
-    const U32 mls = ms->cParams.searchLength;
+    const U32 mls = ms->cParams.minMatch;
     switch(mls)
     {
     default: /* includes case 3 */
@@ -483,7 +483,7 @@ size_t ZSTD_compressBlock_doubleFast_ext
         ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM],
         void const* src, size_t srcSize)
 {
-    U32 const mls = ms->cParams.searchLength;
+    U32 const mls = ms->cParams.minMatch;
     switch(mls)
     {
     default: /* includes case 3 */
@@ -18,7 +18,7 @@ void ZSTD_fillHashTable(ZSTD_matchState_
     const ZSTD_compressionParameters* const cParams = &ms->cParams;
     U32* const hashTable = ms->hashTable;
     U32  const hBits = cParams->hashLog;
-    U32  const mls = cParams->searchLength;
+    U32  const mls = cParams->minMatch;
     const BYTE* const base = ms->window.base;
     const BYTE* ip = base + ms->nextToUpdate;
     const BYTE* const iend = ((const BYTE*)end) - HASH_READ_SIZE;
@@ -27,18 +27,18 @@ void ZSTD_fillHashTable(ZSTD_matchState_
     /* Always insert every fastHashFillStep position into the hash table.
      * Insert the other positions if their hash entry is empty.
      */
-    for (; ip + fastHashFillStep - 1 <= iend; ip += fastHashFillStep) {
+    for ( ; ip + fastHashFillStep < iend + 2; ip += fastHashFillStep) {
         U32 const current = (U32)(ip - base);
-        U32 i;
-        for (i = 0; i < fastHashFillStep; ++i) {
-            size_t const hash = ZSTD_hashPtr(ip + i, hBits, mls);
-            if (i == 0 || hashTable[hash] == 0)
-                hashTable[hash] = current + i;
-            /* Only load extra positions for ZSTD_dtlm_full */
-            if (dtlm == ZSTD_dtlm_fast)
-                break;
-        }
-    }
+        size_t const hash0 = ZSTD_hashPtr(ip, hBits, mls);
+        hashTable[hash0] = current;
+        if (dtlm == ZSTD_dtlm_fast) continue;
+        /* Only load extra positions for ZSTD_dtlm_full */
+        {   U32 p;
+            for (p = 1; p < fastHashFillStep; ++p) {
+                size_t const hash = ZSTD_hashPtr(ip + p, hBits, mls);
+                if (hashTable[hash] == 0) {  /* not yet filled */
+                    hashTable[hash] = current + p;
+    }   }   }   }
 }

 FORCE_INLINE_TEMPLATE
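The restructured fill loop above always records the anchor position, and in `ZSTD_dtlm_full` mode records the in-between positions only into still-empty slots. A toy model of that strategy (toy hash function and table sizes, not zstd's; names are ours):

```c
#include <stdint.h>
#include <stddef.h>

#define TABLE_SIZE 16
#define FILL_STEP  3

/* stand-in for ZSTD_hashPtr: hash a single byte */
static size_t toy_hash(uint8_t b) { return b % TABLE_SIZE; }

/* Every FILL_STEP-th position is recorded unconditionally; in fullMode,
 * positions in between claim their slot only if no position wrote it
 * before.  (Like zstd, an entry of 0 is indistinguishable from "empty",
 * which is an accepted imprecision for position 0.) */
static void fill_table(uint32_t* table, const uint8_t* src, size_t size, int fullMode)
{
    size_t pos;
    for (pos = 0; pos + FILL_STEP < size + 2; pos += FILL_STEP) {
        table[toy_hash(src[pos])] = (uint32_t)pos;   /* anchor: always written */
        if (!fullMode) continue;
        {   size_t p;
            for (p = 1; p < FILL_STEP && pos + p < size; ++p) {
                size_t const h = toy_hash(src[pos + p]);
                if (table[h] == 0)        /* not yet filled */
                    table[h] = (uint32_t)(pos + p);
            }
        }
    }
}
```

Writing anchors first and extras only into empty slots keeps the densest, most recent positions for the anchors while still seeding coverage for the full-load mode.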
@@ -235,7 +235,7 @@ size_t ZSTD_compressBlock_fast(
         void const* src, size_t srcSize)
 {
     ZSTD_compressionParameters const* cParams = &ms->cParams;
-    U32 const mls = cParams->searchLength;
+    U32 const mls = cParams->minMatch;
     assert(ms->dictMatchState == NULL);
     switch(mls)
     {
@@ -256,7 +256,7 @@ size_t ZSTD_compressBlock_fast_dictMatch
         void const* src, size_t srcSize)
 {
     ZSTD_compressionParameters const* cParams = &ms->cParams;
-    U32 const mls = cParams->searchLength;
+    U32 const mls = cParams->minMatch;
     assert(ms->dictMatchState != NULL);
     switch(mls)
     {
@@ -375,7 +375,7 @@ size_t ZSTD_compressBlock_fast_extDict(
         void const* src, size_t srcSize)
 {
     ZSTD_compressionParameters const* cParams = &ms->cParams;
-    U32 const mls = cParams->searchLength;
+    U32 const mls = cParams->minMatch;
     switch(mls)
     {
     default: /* includes case 3 */
@@ -63,12 +63,13 @@ ZSTD_updateDUBT(ZSTD_matchState_t* ms,
 static void
 ZSTD_insertDUBT1(ZSTD_matchState_t* ms,
                  U32 current, const BYTE* inputEnd,
-                 U32 nbCompares, U32 btLow, const ZSTD_dictMode_e dictMode)
+                 U32 nbCompares, U32 btLow,
+                 const ZSTD_dictMode_e dictMode)
 {
     const ZSTD_compressionParameters* const cParams = &ms->cParams;
-    U32*   const bt = ms->chainTable;
-    U32    const btLog  = cParams->chainLog - 1;
-    U32    const btMask = (1 << btLog) - 1;
+    U32* const bt = ms->chainTable;
+    U32  const btLog  = cParams->chainLog - 1;
+    U32  const btMask = (1 << btLog) - 1;
     size_t commonLengthSmaller=0, commonLengthLarger=0;
     const BYTE* const base = ms->window.base;
     const BYTE* const dictBase = ms->window.dictBase;
@@ -80,7 +81,7 @@ ZSTD_insertDUBT1(ZSTD_matchState_t* ms,
     const BYTE* match;
     U32* smallerPtr = bt + 2*(current&btMask);
     U32* largerPtr  = smallerPtr + 1;
-    U32 matchIndex = *smallerPtr;
+    U32 matchIndex = *smallerPtr;   /* this candidate is unsorted : next sorted candidate is reached through *smallerPtr, while *largerPtr contains previous unsorted candidate (which is already saved and can be overwritten) */
     U32 dummy32;   /* to be nullified at the end */
     U32 const windowLow = ms->window.lowLimit;

@@ -93,6 +94,9 @@ ZSTD_insertDUBT1(ZSTD_matchState_t* ms,
         U32* const nextPtr = bt + 2*(matchIndex & btMask);
         size_t matchLength = MIN(commonLengthSmaller, commonLengthLarger);   /* guaranteed minimum nb of common bytes */
         assert(matchIndex < current);
+        /* note : all candidates are now supposed sorted,
+         * but it's still possible to have nextPtr[1] == ZSTD_DUBT_UNSORTED_MARK
+         * when a real index has the same value as ZSTD_DUBT_UNSORTED_MARK */

         if ( (dictMode != ZSTD_extDict)
           || (matchIndex+matchLength >= dictLimit)  /* both in current segment*/
@@ -108,7 +112,7 @@ ZSTD_insertDUBT1(ZSTD_matchState_t* ms,
             match = dictBase + matchIndex;
             matchLength += ZSTD_count_2segments(ip+matchLength, match+matchLength, iend, dictEnd, prefixStart);
             if (matchIndex+matchLength >= dictLimit)
-                match = base + matchIndex;   /* to prepare for next usage of match[matchLength] */
+                match = base + matchIndex;   /* preparation for next read of match[matchLength] */
         }

         DEBUGLOG(8, "ZSTD_insertDUBT1: comparing %u with %u : found %u common bytes ",
@@ -147,6 +151,7 @@ ZSTD_DUBT_findBetterDictMatch (
         ZSTD_matchState_t* ms,
         const BYTE* const ip, const BYTE* const iend,
         size_t* offsetPtr,
+        size_t bestLength,
         U32 nbCompares,
         U32 const mls,
         const ZSTD_dictMode_e dictMode)
@@ -172,8 +177,7 @@ ZSTD_DUBT_findBetterDictMatch (
     U32         const btMask = (1 << btLog) - 1;
     U32         const btLow = (btMask >= dictHighLimit - dictLowLimit) ? dictLowLimit : dictHighLimit - btMask;

-    size_t commonLengthSmaller=0, commonLengthLarger=0, bestLength=0;
-    U32 matchEndIdx = current+8+1;
+    size_t commonLengthSmaller=0, commonLengthLarger=0;

     (void)dictMode;
     assert(dictMode == ZSTD_dictMatchState);
@@ -188,10 +192,8 @@ ZSTD_DUBT_findBetterDictMatch (

         if (matchLength > bestLength) {
             U32 matchIndex = dictMatchIndex + dictIndexDelta;
-            if (matchLength > matchEndIdx - matchIndex)
-                matchEndIdx = matchIndex + (U32)matchLength;
             if ( (4*(int)(matchLength-bestLength)) > (int)(ZSTD_highbit32(current-matchIndex+1) - ZSTD_highbit32((U32)offsetPtr[0]+1)) ) {
-                DEBUGLOG(2, "ZSTD_DUBT_findBetterDictMatch(%u) : found better match length %u -> %u and offsetCode %u -> %u (dictMatchIndex %u, matchIndex %u)",
+                DEBUGLOG(9, "ZSTD_DUBT_findBetterDictMatch(%u) : found better match length %u -> %u and offsetCode %u -> %u (dictMatchIndex %u, matchIndex %u)",
                     current, (U32)bestLength, (U32)matchLength, (U32)*offsetPtr, ZSTD_REP_MOVE + current - matchIndex, dictMatchIndex, matchIndex);
                 bestLength = matchLength, *offsetPtr = ZSTD_REP_MOVE + current - matchIndex;
             }
@@ -200,7 +202,6 @@ ZSTD_DUBT_findBetterDictMatch (
             }
         }

-        DEBUGLOG(2, "matchLength:%6zu, match:%p, prefixStart:%p, ip:%p", matchLength, match, prefixStart, ip);
         if (match[matchLength] < ip[matchLength]) {
             if (dictMatchIndex <= btLow) { break; }   /* beyond tree size, stop the search */
             commonLengthSmaller = matchLength;   /* all smaller will now have at least this guaranteed common length */
@@ -215,7 +216,7 @@ ZSTD_DUBT_findBetterDictMatch (

     if (bestLength >= MINMATCH) {
         U32 const mIndex = current - ((U32)*offsetPtr - ZSTD_REP_MOVE); (void)mIndex;
-        DEBUGLOG(2, "ZSTD_DUBT_findBetterDictMatch(%u) : found match of length %u and offsetCode %u (pos %u)",
+        DEBUGLOG(8, "ZSTD_DUBT_findBetterDictMatch(%u) : found match of length %u and offsetCode %u (pos %u)",
                     current, (U32)bestLength, (U32)*offsetPtr, mIndex);
     }
     return bestLength;
@@ -261,7 +262,7 @@ ZSTD_DUBT_findBestMatch(ZSTD_matchState_
            && (nbCandidates > 1) ) {
         DEBUGLOG(8, "ZSTD_DUBT_findBestMatch: candidate %u is unsorted",
                     matchIndex);
-        *unsortedMark = previousCandidate;
+        *unsortedMark = previousCandidate;   /* the unsortedMark becomes a reversed chain, to move up back to original position */
         previousCandidate = matchIndex;
         matchIndex = *nextCandidate;
         nextCandidate = bt + 2*(matchIndex&btMask);
@@ -269,11 +270,13 @@ ZSTD_DUBT_findBestMatch(ZSTD_matchState_
         nbCandidates --;
     }

+    /* nullify last candidate if it's still unsorted
+     * simplification, detrimental to compression ratio, beneficial for speed */
     if ( (matchIndex > unsortLimit)
       && (*unsortedMark==ZSTD_DUBT_UNSORTED_MARK) ) {
         DEBUGLOG(7, "ZSTD_DUBT_findBestMatch: nullify last unsorted candidate %u",
                     matchIndex);
-        *nextCandidate = *unsortedMark = 0;   /* nullify next candidate if it's still unsorted (note : simplification, detrimental to compression ratio, beneficial for speed) */
+        *nextCandidate = *unsortedMark = 0;
     }

     /* batch sort stacked candidates */
@@ -288,14 +291,14 @@ ZSTD_DUBT_findBestMatch(ZSTD_matchState_
     }

     /* find longest match */
-    {   size_t commonLengthSmaller=0, commonLengthLarger=0;
+    {   size_t commonLengthSmaller = 0, commonLengthLarger = 0;
         const BYTE* const dictBase = ms->window.dictBase;
         const U32 dictLimit = ms->window.dictLimit;
         const BYTE* const dictEnd = dictBase + dictLimit;
         const BYTE* const prefixStart = base + dictLimit;
         U32* smallerPtr = bt + 2*(current&btMask);
         U32* largerPtr  = bt + 2*(current&btMask) + 1;
-        U32 matchEndIdx = current+8+1;
+        U32 matchEndIdx = current + 8 + 1;
         U32 dummy32;   /* to be nullified at the end */
         size_t bestLength = 0;

@@ -323,6 +326,11 b' ZSTD_DUBT_findBestMatch(ZSTD_matchState_' | |||
|
323 | 326 | if ( (4*(int)(matchLength-bestLength)) > (int)(ZSTD_highbit32(current-matchIndex+1) - ZSTD_highbit32((U32)offsetPtr[0]+1)) ) |
|
324 | 327 | bestLength = matchLength, *offsetPtr = ZSTD_REP_MOVE + current - matchIndex; |
|
325 | 328 | if (ip+matchLength == iend) { /* equal : no way to know if inf or sup */ |
|
329 | if (dictMode == ZSTD_dictMatchState) { | |
|
330 | nbCompares = 0; /* in addition to avoiding checking any | |
|
331 | * further in this loop, make sure we | |
|
332 | * skip checking in the dictionary. */ | |
|
333 | } | |
|
326 | 334 | break; /* drop, to guarantee consistency (miss a little bit of compression) */ |
|
327 | 335 | } |
|
328 | 336 | } |
@@ -346,7 +354,10 b' ZSTD_DUBT_findBestMatch(ZSTD_matchState_' | |||
|
346 | 354 | *smallerPtr = *largerPtr = 0; |
|
347 | 355 | |
|
348 | 356 | if (dictMode == ZSTD_dictMatchState && nbCompares) { |
|
349 | bestLength = ZSTD_DUBT_findBetterDictMatch(ms, ip, iend, offsetPtr, bestLength, nbCompares, mls, dictMode);
|
|
357 | bestLength = ZSTD_DUBT_findBetterDictMatch( | |
|
358 | ms, ip, iend, | |
|
359 | offsetPtr, bestLength, nbCompares, | |
|
360 | mls, dictMode); | |
|
350 | 361 | } |
|
351 | 362 | |
|
352 | 363 | assert(matchEndIdx > current+8); /* ensure nextToUpdate is increased */ |
@@ -381,7 +392,7 b' ZSTD_BtFindBestMatch_selectMLS ( ZSTD_m' | |||
|
381 | 392 | const BYTE* ip, const BYTE* const iLimit, |
|
382 | 393 | size_t* offsetPtr) |
|
383 | 394 | { |
|
384 | switch(ms->cParams.searchLength)
|
|
395 | switch(ms->cParams.minMatch) | |
|
385 | 396 | { |
|
386 | 397 | default : /* includes case 3 */ |
|
387 | 398 | case 4 : return ZSTD_BtFindBestMatch(ms, ip, iLimit, offsetPtr, 4, ZSTD_noDict); |
@@ -397,7 +408,7 b' static size_t ZSTD_BtFindBestMatch_dictM' | |||
|
397 | 408 | const BYTE* ip, const BYTE* const iLimit, |
|
398 | 409 | size_t* offsetPtr) |
|
399 | 410 | { |
|
400 | switch(ms->cParams.searchLength)
|
|
411 | switch(ms->cParams.minMatch) | |
|
401 | 412 | { |
|
402 | 413 | default : /* includes case 3 */ |
|
403 | 414 | case 4 : return ZSTD_BtFindBestMatch(ms, ip, iLimit, offsetPtr, 4, ZSTD_dictMatchState); |
@@ -413,7 +424,7 b' static size_t ZSTD_BtFindBestMatch_extDi' | |||
|
413 | 424 | const BYTE* ip, const BYTE* const iLimit, |
|
414 | 425 | size_t* offsetPtr) |
|
415 | 426 | { |
|
416 | switch(ms->cParams.searchLength)
|
|
427 | switch(ms->cParams.minMatch) | |
|
417 | 428 | { |
|
418 | 429 | default : /* includes case 3 */ |
|
419 | 430 | case 4 : return ZSTD_BtFindBestMatch(ms, ip, iLimit, offsetPtr, 4, ZSTD_extDict); |
@@ -428,7 +439,7 b' static size_t ZSTD_BtFindBestMatch_extDi' | |||
|
428 | 439 | /* ********************************* |
|
429 | 440 | * Hash Chain |
|
430 | 441 | ***********************************/ |
|
431 | #define NEXT_IN_CHAIN(d, mask) chainTable[(d) & mask] | |
|
442 | #define NEXT_IN_CHAIN(d, mask) chainTable[(d) & (mask)] | |
|
432 | 443 | |
|
433 | 444 | /* Update chains up to ip (excluded) |
|
434 | 445 | Assumption : always within prefix (i.e. not within extDict) */ |
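The hunk above adds parentheses around the `mask` parameter inside the `NEXT_IN_CHAIN` macro body. A minimal self-contained sketch (hypothetical table and function names, not from zstd) of why the unparenthesized form misbehaves when the argument is an expression:

```c
#include <assert.h>

/* Toy chain table illustrating the NEXT_IN_CHAIN fix: without
 * parentheses around the macro parameter, C operator precedence can
 * regroup an expression argument inside the expansion. */
static int chainTable[8] = {0, 1, 2, 3, 4, 5, 6, 7};

#define NEXT_UNSAFE(d, mask) chainTable[(d) & mask]    /* old form */
#define NEXT_SAFE(d, mask)   chainTable[(d) & (mask)]  /* fixed form */

/* pass the mask as the expression "2 | 1" (value 3) */
int next_unsafe(int d) { return NEXT_UNSAFE(d, 2 | 1); } /* expands to ((d) & 2) | 1 */
int next_safe(int d)   { return NEXT_SAFE(d, 2 | 1); }   /* expands to (d) & 3 */
```

With `d == 4`, the safe form indexes `chainTable[4 & 3] == chainTable[0]`, while the unsafe expansion regroups to `chainTable[(4 & 2) | 1] == chainTable[1]`, a different slot.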
@@ -458,7 +469,7 b' static U32 ZSTD_insertAndFindFirstIndex_' | |||
|
458 | 469 | |
|
459 | 470 | U32 ZSTD_insertAndFindFirstIndex(ZSTD_matchState_t* ms, const BYTE* ip) { |
|
460 | 471 | const ZSTD_compressionParameters* const cParams = &ms->cParams; |
|
461 | return ZSTD_insertAndFindFirstIndex_internal(ms, cParams, ip, ms->cParams.searchLength);
|
|
472 | return ZSTD_insertAndFindFirstIndex_internal(ms, cParams, ip, ms->cParams.minMatch); | |
|
462 | 473 | } |
|
463 | 474 | |
|
464 | 475 | |
@@ -492,6 +503,7 b' size_t ZSTD_HcFindBestMatch_generic (' | |||
|
492 | 503 | size_t currentMl=0; |
|
493 | 504 | if ((dictMode != ZSTD_extDict) || matchIndex >= dictLimit) { |
|
494 | 505 | const BYTE* const match = base + matchIndex; |
|
506 | assert(matchIndex >= dictLimit); /* ensures this is true if dictMode != ZSTD_extDict */ | |
|
495 | 507 | if (match[ml] == ip[ml]) /* potentially better */ |
|
496 | 508 | currentMl = ZSTD_count(ip, match, iLimit); |
|
497 | 509 | } else { |
@@ -554,7 +566,7 b' FORCE_INLINE_TEMPLATE size_t ZSTD_HcFind' | |||
|
554 | 566 | const BYTE* ip, const BYTE* const iLimit, |
|
555 | 567 | size_t* offsetPtr) |
|
556 | 568 | { |
|
557 | switch(ms->cParams.searchLength)
|
|
569 | switch(ms->cParams.minMatch) | |
|
558 | 570 | { |
|
559 | 571 | default : /* includes case 3 */ |
|
560 | 572 | case 4 : return ZSTD_HcFindBestMatch_generic(ms, ip, iLimit, offsetPtr, 4, ZSTD_noDict); |
@@ -570,7 +582,7 b' static size_t ZSTD_HcFindBestMatch_dictM' | |||
|
570 | 582 | const BYTE* ip, const BYTE* const iLimit, |
|
571 | 583 | size_t* offsetPtr) |
|
572 | 584 | { |
|
573 | switch(ms->cParams.searchLength)
|
|
585 | switch(ms->cParams.minMatch) | |
|
574 | 586 | { |
|
575 | 587 | default : /* includes case 3 */ |
|
576 | 588 | case 4 : return ZSTD_HcFindBestMatch_generic(ms, ip, iLimit, offsetPtr, 4, ZSTD_dictMatchState); |
@@ -586,7 +598,7 b' FORCE_INLINE_TEMPLATE size_t ZSTD_HcFind' | |||
|
586 | 598 | const BYTE* ip, const BYTE* const iLimit, |
|
587 | 599 | size_t* offsetPtr) |
|
588 | 600 | { |
|
589 | switch(ms->cParams.searchLength)
|
|
601 | switch(ms->cParams.minMatch) | |
|
590 | 602 | { |
|
591 | 603 | default : /* includes case 3 */ |
|
592 | 604 | case 4 : return ZSTD_HcFindBestMatch_generic(ms, ip, iLimit, offsetPtr, 4, ZSTD_extDict); |
@@ -37,8 +37,8 b' void ZSTD_ldm_adjustParameters(ldmParams' | |||
|
37 | 37 | params->hashLog = MAX(ZSTD_HASHLOG_MIN, params->windowLog - LDM_HASH_RLOG); |
|
38 | 38 | assert(params->hashLog <= ZSTD_HASHLOG_MAX); |
|
39 | 39 | } |
|
40 | if (params->hashEveryLog == 0) {
|
|
41 | params->hashEveryLog = params->windowLog < params->hashLog
|
|
40 | if (params->hashRateLog == 0) { | |
|
41 | params->hashRateLog = params->windowLog < params->hashLog | |
|
42 | 42 | ? 0 |
|
43 | 43 | : params->windowLog - params->hashLog; |
|
44 | 44 | } |
@@ -119,20 +119,20 b' static void ZSTD_ldm_insertEntry(ldmStat' | |||
|
119 | 119 | * |
|
120 | 120 | * Gets the small hash, checksum, and tag from the rollingHash. |
|
121 | 121 | * |
|
122 | * If the tag matches (1 << ldmParams.hashEveryLog)-1, then
|
|
122 | * If the tag matches (1 << ldmParams.hashRateLog)-1, then | |
|
123 | 123 | * creates an ldmEntry from the offset, and inserts it into the hash table. |
|
124 | 124 | * |
|
125 | 125 | * hBits is the length of the small hash, which is the most significant hBits |
|
126 | 126 | * of rollingHash. The checksum is the next 32 most significant bits, followed |
|
127 | * by ldmParams.hashEveryLog bits that make up the tag. */
|
|
127 | * by ldmParams.hashRateLog bits that make up the tag. */ | |
|
128 | 128 | static void ZSTD_ldm_makeEntryAndInsertByTag(ldmState_t* ldmState, |
|
129 | 129 | U64 const rollingHash, |
|
130 | 130 | U32 const hBits, |
|
131 | 131 | U32 const offset, |
|
132 | 132 | ldmParams_t const ldmParams) |
|
133 | 133 | { |
|
134 | U32 const tag = ZSTD_ldm_getTag(rollingHash, hBits, ldmParams.hashEveryLog);
|
|
135 | U32 const tagMask = ((U32)1 << ldmParams.hashEveryLog) - 1;
|
|
134 | U32 const tag = ZSTD_ldm_getTag(rollingHash, hBits, ldmParams.hashRateLog); | |
|
135 | U32 const tagMask = ((U32)1 << ldmParams.hashRateLog) - 1; | |
|
136 | 136 | if (tag == tagMask) { |
|
137 | 137 | U32 const hash = ZSTD_ldm_getSmallHash(rollingHash, hBits); |
|
138 | 138 | U32 const checksum = ZSTD_ldm_getChecksum(rollingHash, hBits); |
@@ -143,56 +143,6 b' static void ZSTD_ldm_makeEntryAndInsertB' | |||
|
143 | 143 | } |
|
144 | 144 | } |
|
145 | 145 | |
|
146 | /** ZSTD_ldm_getRollingHash() : | |
|
147 | * Get a 64-bit hash using the first len bytes from buf. | |
|
148 | * | |
|
149 | * Giving bytes s = s_1, s_2, ... s_k, the hash is defined to be | |
|
150 | * H(s) = s_1*(a^(k-1)) + s_2*(a^(k-2)) + ... + s_k*(a^0) | |
|
151 | * | |
|
152 | * where the constant a is defined to be prime8bytes. | |
|
153 | * | |
|
154 | * The implementation adds an offset to each byte, so | |
|
155 | * H(s) = (s_1 + HASH_CHAR_OFFSET)*(a^(k-1)) + ... */ | |
|
156 | static U64 ZSTD_ldm_getRollingHash(const BYTE* buf, U32 len) | |
|
157 | { | |
|
158 | U64 ret = 0; | |
|
159 | U32 i; | |
|
160 | for (i = 0; i < len; i++) { | |
|
161 | ret *= prime8bytes; | |
|
162 | ret += buf[i] + LDM_HASH_CHAR_OFFSET; | |
|
163 | } | |
|
164 | return ret; | |
|
165 | } | |
|
166 | ||
|
167 | /** ZSTD_ldm_ipow() : | |
|
168 | * Return base^exp. */ | |
|
169 | static U64 ZSTD_ldm_ipow(U64 base, U64 exp) | |
|
170 | { | |
|
171 | U64 ret = 1; | |
|
172 | while (exp) { | |
|
173 | if (exp & 1) { ret *= base; } | |
|
174 | exp >>= 1; | |
|
175 | base *= base; | |
|
176 | } | |
|
177 | return ret; | |
|
178 | } | |
|
179 | ||
|
180 | U64 ZSTD_ldm_getHashPower(U32 minMatchLength) { | |
|
181 | DEBUGLOG(4, "ZSTD_ldm_getHashPower: mml=%u", minMatchLength); | |
|
182 | assert(minMatchLength >= ZSTD_LDM_MINMATCH_MIN); | |
|
183 | return ZSTD_ldm_ipow(prime8bytes, minMatchLength - 1); | |
|
184 | } | |
|
185 | ||
|
186 | /** ZSTD_ldm_updateHash() : | |
|
187 | * Updates hash by removing toRemove and adding toAdd. */ | |
|
188 | static U64 ZSTD_ldm_updateHash(U64 hash, BYTE toRemove, BYTE toAdd, U64 hashPower) | |
|
189 | { | |
|
190 | hash -= ((toRemove + LDM_HASH_CHAR_OFFSET) * hashPower); | |
|
191 | hash *= prime8bytes; | |
|
192 | hash += toAdd + LDM_HASH_CHAR_OFFSET; | |
|
193 | return hash; | |
|
194 | } | |
|
195 | ||
|
196 | 146 | /** ZSTD_ldm_countBackwardsMatch() : |
|
197 | 147 | * Returns the number of bytes that match backwards before pIn and pMatch. |
|
198 | 148 | * |
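The hunk above deletes zstd_ldm.c's local `ZSTD_ldm_getRollingHash` / `ZSTD_ldm_ipow` / `ZSTD_ldm_updateHash`, whose polynomial hash H(s) = (s_1+off)*a^(k-1) + ... + (s_k+off) is now served by shared `ZSTD_rollingHash_*` helpers. A self-contained sketch of the same scheme (the constant and names here are arbitrary stand-ins, not zstd's `prime8bytes`), showing that rotating one byte out and one in equals recomputing from scratch:

```c
#include <stddef.h>
#include <stdint.h>

#define HASH_PRIME  0x9E3779B97F4A7C15ULL  /* arbitrary odd multiplier */
#define CHAR_OFFSET 10ULL                  /* offset added to each byte */

/* H(buf) over len bytes, Horner evaluation, arithmetic mod 2^64 */
static uint64_t rh_compute(const unsigned char* buf, size_t len)
{
    uint64_t h = 0;
    size_t i;
    for (i = 0; i < len; i++) h = h * HASH_PRIME + buf[i] + CHAR_OFFSET;
    return h;
}

/* HASH_PRIME^(len-1): weight of the window's oldest byte (len >= 1) */
static uint64_t rh_power(size_t len)
{
    uint64_t p = 1;
    while (--len) p *= HASH_PRIME;
    return p;
}

/* slide the window one byte: cancel `out`, multiply, append `in` */
static uint64_t rh_rotate(uint64_t h, unsigned char out, unsigned char in,
                          uint64_t power)
{
    h -= (out + CHAR_OFFSET) * power;
    return h * HASH_PRIME + in + CHAR_OFFSET;
}

/* sanity check: rotating buf[0..len) by one position matches buf[1..len+1) */
static int rh_check(const unsigned char* buf, size_t len)
{
    uint64_t const h = rh_compute(buf, len);
    return rh_rotate(h, buf[0], buf[len], rh_power(len)) == rh_compute(buf + 1, len);
}
```

Because every operation is a ring operation mod 2^64, subtracting the outgoing byte's weighted contribution and multiplying by the base is exactly equivalent to rehashing the shifted window.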
@@ -238,6 +188,7 b' static size_t ZSTD_ldm_fillFastTables(ZS' | |||
|
238 | 188 | case ZSTD_btlazy2: |
|
239 | 189 | case ZSTD_btopt: |
|
240 | 190 | case ZSTD_btultra: |
|
191 | case ZSTD_btultra2: | |
|
241 | 192 | break; |
|
242 | 193 | default: |
|
243 | 194 | assert(0); /* not possible : not a valid strategy id */ |
@@ -261,9 +212,9 b' static U64 ZSTD_ldm_fillLdmHashTable(ldm' | |||
|
261 | 212 | const BYTE* cur = lastHashed + 1; |
|
262 | 213 | |
|
263 | 214 | while (cur < iend) { |
|
264 | rollingHash = ZSTD_ldm_updateHash(rollingHash, cur[-1],
|
|
265 | cur[ldmParams.minMatchLength-1], | |
|
266 | state->hashPower); | |
|
215 | rollingHash = ZSTD_rollingHash_rotate(rollingHash, cur[-1], | |
|
216 | cur[ldmParams.minMatchLength-1], | |
|
217 | state->hashPower); | |
|
267 | 218 | ZSTD_ldm_makeEntryAndInsertByTag(state, |
|
268 | 219 | rollingHash, hBits, |
|
269 | 220 | (U32)(cur - base), ldmParams); |
@@ -297,8 +248,8 b' static size_t ZSTD_ldm_generateSequences' | |||
|
297 | 248 | U64 const hashPower = ldmState->hashPower; |
|
298 | 249 | U32 const hBits = params->hashLog - params->bucketSizeLog; |
|
299 | 250 | U32 const ldmBucketSize = 1U << params->bucketSizeLog; |
|
300 | U32 const hashEveryLog = params->hashEveryLog;
|
|
301 | U32 const ldmTagMask = (1U << params->hashEveryLog) - 1;
|
|
251 | U32 const hashRateLog = params->hashRateLog; | |
|
252 | U32 const ldmTagMask = (1U << params->hashRateLog) - 1; | |
|
302 | 253 | /* Prefix and extDict parameters */ |
|
303 | 254 | U32 const dictLimit = ldmState->window.dictLimit; |
|
304 | 255 | U32 const lowestIndex = extDict ? ldmState->window.lowLimit : dictLimit; |
@@ -324,16 +275,16 b' static size_t ZSTD_ldm_generateSequences' | |||
|
324 | 275 | size_t forwardMatchLength = 0, backwardMatchLength = 0; |
|
325 | 276 | ldmEntry_t* bestEntry = NULL; |
|
326 | 277 | if (ip != istart) { |
|
327 | rollingHash = ZSTD_ldm_updateHash(rollingHash, lastHashed[0],
|
|
328 | lastHashed[minMatchLength], | |
|
329 | hashPower); | |
|
278 | rollingHash = ZSTD_rollingHash_rotate(rollingHash, lastHashed[0], | |
|
279 | lastHashed[minMatchLength], | |
|
280 | hashPower); | |
|
330 | 281 | } else { |
|
331 | rollingHash = ZSTD_ldm_getRollingHash(ip, minMatchLength);
|
|
282 | rollingHash = ZSTD_rollingHash_compute(ip, minMatchLength); | |
|
332 | 283 | } |
|
333 | 284 | lastHashed = ip; |
|
334 | 285 | |
|
335 | 286 | /* Do not insert and do not look for a match */ |
|
336 | if (ZSTD_ldm_getTag(rollingHash, hBits, hashEveryLog) != ldmTagMask) {
|
|
287 | if (ZSTD_ldm_getTag(rollingHash, hBits, hashRateLog) != ldmTagMask) { | |
|
337 | 288 | ip++; |
|
338 | 289 | continue; |
|
339 | 290 | } |
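The hunk above renames `hashEveryLog` to `hashRateLog`; it controls how often long-distance matching inserts a position into its hash table: only when the tag bits taken from the rolling hash are all ones, i.e. roughly one position in 2^hashRateLog. A toy sketch of that gating test (a hypothetical helper, simplified to the low bits rather than zstd's exact tag bit positions):

```c
#include <stdint.h>

/* Insert a position into the LDM table only when the selected tag bits
 * are all ones: on well-mixed hashes this fires ~1 in 2^hashRateLog
 * times, so a larger hashRateLog means sparser, faster table updates.
 * Assumes hashRateLog < 64. */
static int ldm_shouldInsert(uint64_t rollingHash, unsigned hashRateLog)
{
    uint64_t const tagMask = ((uint64_t)1 << hashRateLog) - 1;
    return (rollingHash & tagMask) == tagMask;
}
```

With `hashRateLog == 0` the mask is empty and every position is inserted, matching the "insert everything" degenerate case.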
@@ -593,7 +544,7 b' size_t ZSTD_ldm_blockCompress(rawSeqStor' | |||
|
593 | 544 | void const* src, size_t srcSize) |
|
594 | 545 | { |
|
595 | 546 | const ZSTD_compressionParameters* const cParams = &ms->cParams; |
|
596 | unsigned const minMatch = cParams->searchLength;
|
|
547 | unsigned const minMatch = cParams->minMatch; | |
|
597 | 548 | ZSTD_blockCompressor const blockCompressor = |
|
598 | 549 | ZSTD_selectBlockCompressor(cParams->strategy, ZSTD_matchState_dictMode(ms)); |
|
599 | 550 | /* Input bounds */ |
@@ -21,7 +21,7 b' extern "C" {' | |||
|
21 | 21 | * Long distance matching |
|
22 | 22 | ***************************************/ |
|
23 | 23 | |
|
24 | #define ZSTD_LDM_DEFAULT_WINDOW_LOG ZSTD_WINDOWLOG_DEFAULT
|
|
24 | #define ZSTD_LDM_DEFAULT_WINDOW_LOG ZSTD_WINDOWLOG_LIMIT_DEFAULT | |
|
25 | 25 | |
|
26 | 26 | /** |
|
27 | 27 | * ZSTD_ldm_generateSequences(): |
@@ -86,12 +86,8 b' size_t ZSTD_ldm_getTableSize(ldmParams_t' | |||
|
86 | 86 | */ |
|
87 | 87 | size_t ZSTD_ldm_getMaxNbSeq(ldmParams_t params, size_t maxChunkSize); |
|
88 | 88 | |
|
89 | /** ZSTD_ldm_getTableSize() : | |
|
90 | * Return prime8bytes^(minMatchLength-1) */ | |
|
91 | U64 ZSTD_ldm_getHashPower(U32 minMatchLength); | |
|
92 | ||
|
93 | 89 | /** ZSTD_ldm_adjustParameters() : |
|
94 | * If the params->hashEveryLog is not set, set it to its default value based on
|
|
90 | * If the params->hashRateLog is not set, set it to its default value based on | |
|
95 | 91 | * windowLog and params->hashLog. |
|
96 | 92 | * |
|
97 | 93 | * Ensures that params->bucketSizeLog is <= params->hashLog (setting it to |
@@ -17,6 +17,8 b'' | |||
|
17 | 17 | #define ZSTD_FREQ_DIV 4 /* log factor when using previous stats to init next stats */ |
|
18 | 18 | #define ZSTD_MAX_PRICE (1<<30) |
|
19 | 19 | |
|
20 | #define ZSTD_PREDEF_THRESHOLD 1024 /* if srcSize < ZSTD_PREDEF_THRESHOLD, symbols' cost is assumed static, directly determined by pre-defined distributions */ | |
|
21 | ||
|
20 | 22 | |
|
21 | 23 | /*-************************************* |
|
22 | 24 | * Price functions for optimal parser |
@@ -52,11 +54,15 b' MEM_STATIC U32 ZSTD_fracWeight(U32 rawSt' | |||
|
52 | 54 | return weight; |
|
53 | 55 | } |
|
54 | 56 | |
|
55 | /* debugging function, @return price in bytes */ | |
|
57 | #if (DEBUGLEVEL>=2) | |
|
58 | /* debugging function, | |
|
59 | * @return price in bytes as fractional value | |
|
60 | * for debug messages only */ | |
|
56 | 61 | MEM_STATIC double ZSTD_fCost(U32 price) |
|
57 | 62 | { |
|
58 | 63 | return (double)price / (BITCOST_MULTIPLIER*8); |
|
59 | 64 | } |
|
65 | #endif | |
|
60 | 66 | |
|
61 | 67 | static void ZSTD_setBasePrices(optState_t* optPtr, int optLevel) |
|
62 | 68 | { |
@@ -67,29 +73,44 b' static void ZSTD_setBasePrices(optState_' | |||
|
67 | 73 | } |
|
68 | 74 | |
|
69 | 75 | |
|
70 | static U32 ZSTD_downscaleStat(U32* table, U32 lastEltIndex, int malus) | |
|
76 | /* ZSTD_downscaleStat() : | |
|
77 | * reduce all elements in table by a factor 2^(ZSTD_FREQ_DIV+malus) | |
|
78 | * return the resulting sum of elements */ | |
|
79 | static U32 ZSTD_downscaleStat(unsigned* table, U32 lastEltIndex, int malus) | |
|
71 | 80 | { |
|
72 | 81 | U32 s, sum=0; |
|
82 | DEBUGLOG(5, "ZSTD_downscaleStat (nbElts=%u)", (unsigned)lastEltIndex+1); | |
|
73 | 83 | assert(ZSTD_FREQ_DIV+malus > 0 && ZSTD_FREQ_DIV+malus < 31); |
|
74 | for (s=0; s<=lastEltIndex; s++) {
|
|
84 | for (s=0; s<lastEltIndex+1; s++) { | |
|
75 | 85 | table[s] = 1 + (table[s] >> (ZSTD_FREQ_DIV+malus)); |
|
76 | 86 | sum += table[s]; |
|
77 | 87 | } |
|
78 | 88 | return sum; |
|
79 | 89 | } |
|
80 | 90 | |
|
81 | static void ZSTD_rescaleFreqs(optState_t* const optPtr, | |
|
82 | const BYTE* const src, size_t const srcSize, | |
|
83 | int optLevel) | |
|
91 | /* ZSTD_rescaleFreqs() : | |
|
92 | * if first block (detected by optPtr->litLengthSum == 0) : init statistics | |
|
93 | * take hints from dictionary if there is one | |
|
94 | * or init from zero, using src for literals stats, or flat 1 for match symbols | |
|
95 | * otherwise downscale existing stats, to be used as seed for next block. | |
|
96 | */ | |
|
97 | static void | |
|
98 | ZSTD_rescaleFreqs(optState_t* const optPtr, | |
|
99 | const BYTE* const src, size_t const srcSize, | |
|
100 | int const optLevel) | |
|
84 | 101 | { |
|
102 | DEBUGLOG(5, "ZSTD_rescaleFreqs (srcSize=%u)", (unsigned)srcSize); | |
|
85 | 103 | optPtr->priceType = zop_dynamic; |
|
86 | 104 | |
|
87 | 105 | if (optPtr->litLengthSum == 0) { /* first block : init */ |
|
88 | if (srcSize <= 1024) /* heuristic */
|
|
106 | if (srcSize <= ZSTD_PREDEF_THRESHOLD) { /* heuristic */ | |
|
107 | DEBUGLOG(5, "(srcSize <= ZSTD_PREDEF_THRESHOLD) => zop_predef"); | |
|
89 | 108 | optPtr->priceType = zop_predef; |
|
109 | } | |
|
90 | 110 | |
|
91 | 111 | assert(optPtr->symbolCosts != NULL); |
|
92 | if (optPtr->symbolCosts->huf.repeatMode == HUF_repeat_valid) { /* huffman table presumed generated by dictionary */
|
|
112 | if (optPtr->symbolCosts->huf.repeatMode == HUF_repeat_valid) { | |
|
113 | /* huffman table presumed generated by dictionary */ | |
|
93 | 114 | optPtr->priceType = zop_dynamic; |
|
94 | 115 | |
|
95 | 116 | assert(optPtr->litFreq != NULL); |
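The hunk above documents `ZSTD_downscaleStat`, which shrinks every frequency by 2^(ZSTD_FREQ_DIV+malus) while clamping each count to at least 1, so one block's statistics seed the next without dominating it. A standalone sketch of that shrink step (same arithmetic, hypothetical names and a fixed shift constant):

```c
/* log2 of the shrink factor, standing in for ZSTD_FREQ_DIV */
#define FREQ_DIV 4

/* Scale the whole frequency table down by 2^(FREQ_DIV+malus), keeping
 * every entry >= 1 so no symbol ever looks impossible; returns the new
 * total, which the caller caches as the normalization sum. */
static unsigned downscaleStat(unsigned* table, unsigned lastEltIndex, int malus)
{
    unsigned s, sum = 0;
    for (s = 0; s < lastEltIndex + 1; s++) {
        table[s] = 1 + (table[s] >> (FREQ_DIV + malus));
        sum += table[s];
    }
    return sum;
}
```

The `1 +` floor is the important detail: a symbol unseen in the previous block still keeps a nonzero (expensive but finite) price in the next one.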
@@ -208,7 +229,9 b' static U32 ZSTD_litLengthPrice(U32 const' | |||
|
208 | 229 | |
|
209 | 230 | /* dynamic statistics */ |
|
210 | 231 | { U32 const llCode = ZSTD_LLcode(litLength); |
|
211 | return (LL_bits[llCode] * BITCOST_MULTIPLIER) + (optPtr->litLengthSumBasePrice - WEIGHT(optPtr->litLengthFreq[llCode], optLevel)); | |
|
232 | return (LL_bits[llCode] * BITCOST_MULTIPLIER) | |
|
233 | + optPtr->litLengthSumBasePrice | |
|
234 | - WEIGHT(optPtr->litLengthFreq[llCode], optLevel); | |
|
212 | 235 | } |
|
213 | 236 | } |
|
214 | 237 | |
@@ -253,7 +276,7 b' static int ZSTD_literalsContribution(con' | |||
|
253 | 276 | FORCE_INLINE_TEMPLATE U32 |
|
254 | 277 | ZSTD_getMatchPrice(U32 const offset, |
|
255 | 278 | U32 const matchLength, |
|
256 | const optState_t* const optPtr,
|
|
|
279 | const optState_t* const optPtr, | |
|
257 | 280 | int const optLevel) |
|
258 | 281 | { |
|
259 | 282 | U32 price; |
@@ -385,7 +408,6 b' static U32 ZSTD_insertBt1(' | |||
|
385 | 408 | U32* largerPtr = smallerPtr + 1; |
|
386 | 409 | U32 dummy32; /* to be nullified at the end */ |
|
387 | 410 | U32 const windowLow = ms->window.lowLimit; |
|
388 | U32 const matchLow = windowLow ? windowLow : 1; | |
|
389 | 411 | U32 matchEndIdx = current+8+1; |
|
390 | 412 | size_t bestLength = 8; |
|
391 | 413 | U32 nbCompares = 1U << cParams->searchLog; |
@@ -401,7 +423,8 b' static U32 ZSTD_insertBt1(' | |||
|
401 | 423 | assert(ip <= iend-8); /* required for h calculation */ |
|
402 | 424 | hashTable[h] = current; /* Update Hash Table */ |
|
403 | 425 | |
|
404 | while (nbCompares-- && (matchIndex >= matchLow)) { | |
|
426 | assert(windowLow > 0); | |
|
427 | while (nbCompares-- && (matchIndex >= windowLow)) { | |
|
405 | 428 | U32* const nextPtr = bt + 2*(matchIndex & btMask); |
|
406 | 429 | size_t matchLength = MIN(commonLengthSmaller, commonLengthLarger); /* guaranteed minimum nb of common bytes */ |
|
407 | 430 | assert(matchIndex < current); |
@@ -479,7 +502,7 b' void ZSTD_updateTree_internal(' | |||
|
479 | 502 | const BYTE* const base = ms->window.base; |
|
480 | 503 | U32 const target = (U32)(ip - base); |
|
481 | 504 | U32 idx = ms->nextToUpdate; |
|
482 | DEBUGLOG(5, "ZSTD_updateTree_internal, from %u to %u  (dictMode:%u)",
|
|
505 | DEBUGLOG(6, "ZSTD_updateTree_internal, from %u to %u (dictMode:%u)", | |
|
483 | 506 | idx, target, dictMode); |
|
484 | 507 | |
|
485 | 508 | while(idx < target) |
@@ -488,15 +511,18 b' void ZSTD_updateTree_internal(' | |||
|
488 | 511 | } |
|
489 | 512 | |
|
490 | 513 | void ZSTD_updateTree(ZSTD_matchState_t* ms, const BYTE* ip, const BYTE* iend) { |
|
491 | ZSTD_updateTree_internal(ms, ip, iend, ms->cParams.searchLength, ZSTD_noDict);
|
|
514 | ZSTD_updateTree_internal(ms, ip, iend, ms->cParams.minMatch, ZSTD_noDict); | |
|
492 | 515 | } |
|
493 | 516 | |
|
494 | 517 | FORCE_INLINE_TEMPLATE |
|
495 | 518 | U32 ZSTD_insertBtAndGetAllMatches ( |
|
496 | 519 | ZSTD_matchState_t* ms, |
|
497 | 520 | const BYTE* const ip, const BYTE* const iLimit, const ZSTD_dictMode_e dictMode, |
|
498 | U32 rep[ZSTD_REP_NUM], U32 const ll0,
|
|
499 | ZSTD_match_t* matches, const U32 lengthToBeat, U32 const mls /* template */) | |
|
521 | U32 rep[ZSTD_REP_NUM], | |
|
522 | U32 const ll0, /* tells if associated literal length is 0 or not. This value must be 0 or 1 */ | |
|
523 | ZSTD_match_t* matches, | |
|
524 | const U32 lengthToBeat, | |
|
525 | U32 const mls /* template */) | |
|
500 | 526 | { |
|
501 | 527 | const ZSTD_compressionParameters* const cParams = &ms->cParams; |
|
502 | 528 | U32 const sufficient_len = MIN(cParams->targetLength, ZSTD_OPT_NUM -1); |
@@ -542,6 +568,7 b' U32 ZSTD_insertBtAndGetAllMatches (' | |||
|
542 | 568 | DEBUGLOG(8, "ZSTD_insertBtAndGetAllMatches: current=%u", current); |
|
543 | 569 | |
|
544 | 570 | /* check repCode */ |
|
571 | assert(ll0 <= 1); /* necessarily 1 or 0 */ | |
|
545 | 572 | { U32 const lastR = ZSTD_REP_NUM + ll0; |
|
546 | 573 | U32 repCode; |
|
547 | 574 | for (repCode = ll0; repCode < lastR; repCode++) { |
@@ -724,7 +751,7 b' FORCE_INLINE_TEMPLATE U32 ZSTD_BtGetAllM' | |||
|
724 | 751 | ZSTD_match_t* matches, U32 const lengthToBeat) |
|
725 | 752 | { |
|
726 | 753 | const ZSTD_compressionParameters* const cParams = &ms->cParams; |
|
727 | U32 const matchLengthSearch = cParams->searchLength;
|
|
754 | U32 const matchLengthSearch = cParams->minMatch; | |
|
728 | 755 | DEBUGLOG(8, "ZSTD_BtGetAllMatches"); |
|
729 | 756 | if (ip < ms->window.base + ms->nextToUpdate) return 0; /* skipped area */ |
|
730 | 757 | ZSTD_updateTree_internal(ms, ip, iHighLimit, matchLengthSearch, dictMode); |
@@ -774,12 +801,30 b' static U32 ZSTD_totalLen(ZSTD_optimal_t ' | |||
|
774 | 801 | return sol.litlen + sol.mlen; |
|
775 | 802 | } |
|
776 | 803 | |
|
804 | #if 0 /* debug */ | |
|
805 | ||
|
806 | static void | |
|
807 | listStats(const U32* table, int lastEltID) | |
|
808 | { | |
|
809 | int const nbElts = lastEltID + 1; | |
|
810 | int enb; | |
|
811 | for (enb=0; enb < nbElts; enb++) { | |
|
812 | (void)table; | |
|
813 | //RAWLOG(2, "%3i:%3i, ", enb, table[enb]); | |
|
814 | RAWLOG(2, "%4i,", table[enb]); | |
|
815 | } | |
|
816 | RAWLOG(2, " \n"); | |
|
817 | } | |
|
818 | ||
|
819 | #endif | |
|
820 | ||
|
777 | 821 | FORCE_INLINE_TEMPLATE size_t |
|
778 | 822 | ZSTD_compressBlock_opt_generic(ZSTD_matchState_t* ms, |
|
779 | 823 | seqStore_t* seqStore, |
|
780 | 824 | U32 rep[ZSTD_REP_NUM], |
|
781 | const void* src, size_t srcSize,
|
|
|
782 | const int optLevel, const ZSTD_dictMode_e dictMode)
|
|
|
825 | const void* src, size_t srcSize, | |
|
826 | const int optLevel, | |
|
827 | const ZSTD_dictMode_e dictMode) | |
|
783 | 828 | { |
|
784 | 829 | optState_t* const optStatePtr = &ms->opt; |
|
785 | 830 | const BYTE* const istart = (const BYTE*)src; |
@@ -792,14 +837,15 b' ZSTD_compressBlock_opt_generic(ZSTD_matc' | |||
|
792 | 837 | const ZSTD_compressionParameters* const cParams = &ms->cParams; |
|
793 | 838 | |
|
794 | 839 | U32 const sufficient_len = MIN(cParams->targetLength, ZSTD_OPT_NUM -1); |
|
795 | U32 const minMatch = (cParams->searchLength == 3) ? 3 : 4;
|
|
840 | U32 const minMatch = (cParams->minMatch == 3) ? 3 : 4; | |
|
796 | 841 | |
|
797 | 842 | ZSTD_optimal_t* const opt = optStatePtr->priceTable; |
|
798 | 843 | ZSTD_match_t* const matches = optStatePtr->matchTable; |
|
799 | 844 | ZSTD_optimal_t lastSequence; |
|
800 | 845 | |
|
801 | 846 | /* init */ |
|
802 | DEBUGLOG(5, "ZSTD_compressBlock_opt_generic");
|
|
847 | DEBUGLOG(5, "ZSTD_compressBlock_opt_generic: current=%u, prefix=%u, nextToUpdate=%u", | |
|
848 | (U32)(ip - base), ms->window.dictLimit, ms->nextToUpdate); | |
|
803 | 849 | assert(optLevel <= 2); |
|
804 | 850 | ms->nextToUpdate3 = ms->nextToUpdate; |
|
805 | 851 | ZSTD_rescaleFreqs(optStatePtr, (const BYTE*)src, srcSize, optLevel); |
@@ -999,7 +1045,7 b' ZSTD_compressBlock_opt_generic(ZSTD_matc' | |||
|
999 | 1045 | U32 const offCode = opt[storePos].off; |
|
1000 | 1046 | U32 const advance = llen + mlen; |
|
1001 | 1047 | DEBUGLOG(6, "considering seq starting at %zi, llen=%u, mlen=%u", |
|
1002 | anchor - istart, llen, mlen); | |
|
1048 | anchor - istart, (unsigned)llen, (unsigned)mlen); | |
|
1003 | 1049 | |
|
1004 | 1050 | if (mlen==0) { /* only literals => must be last "sequence", actually starting a new stream of sequences */ |
|
1005 | 1051 | assert(storePos == storeEnd); /* must be last sequence */ |
@@ -1047,11 +1093,11 b' size_t ZSTD_compressBlock_btopt(' | |||
|
1047 | 1093 | |
|
1048 | 1094 | |
|
1049 | 1095 | /* used in 2-pass strategy */ |
|
1050 | static U32 ZSTD_upscaleStat(U32* table, U32 lastEltIndex, int bonus)
|
|
1096 | static U32 ZSTD_upscaleStat(unsigned* table, U32 lastEltIndex, int bonus) | |
|
1051 | 1097 | { |
|
1052 | 1098 | U32 s, sum=0; |
|
1053 | assert(ZSTD_FREQ_DIV+bonus > 0); | |
|
1054 | for (s=0; s<=lastEltIndex; s++) {
|
|
1099 | assert(ZSTD_FREQ_DIV+bonus >= 0); | |
|
1100 | for (s=0; s<lastEltIndex+1; s++) { | |
|
1055 | 1101 | table[s] <<= ZSTD_FREQ_DIV+bonus; |
|
1056 | 1102 | table[s]--; |
|
1057 | 1103 | sum += table[s]; |
@@ -1063,9 +1109,43 b' static U32 ZSTD_upscaleStat(U32* table, ' | |||
|
1063 | 1109 | MEM_STATIC void ZSTD_upscaleStats(optState_t* optPtr) |
|
1064 | 1110 | { |
|
1065 | 1111 | optPtr->litSum = ZSTD_upscaleStat(optPtr->litFreq, MaxLit, 0); |
|
1066 | optPtr->litLengthSum = ZSTD_upscaleStat(optPtr->litLengthFreq, MaxLL, 1);
|
|
1067 | optPtr->matchLengthSum = ZSTD_upscaleStat(optPtr->matchLengthFreq, MaxML, 1);
|
|
1068 | optPtr->offCodeSum = ZSTD_upscaleStat(optPtr->offCodeFreq, MaxOff, 1);
|
|
1112 | optPtr->litLengthSum = ZSTD_upscaleStat(optPtr->litLengthFreq, MaxLL, 0); | |
|
1113 | optPtr->matchLengthSum = ZSTD_upscaleStat(optPtr->matchLengthFreq, MaxML, 0); | |
|
1114 | optPtr->offCodeSum = ZSTD_upscaleStat(optPtr->offCodeFreq, MaxOff, 0); | |
|
1115 | } | |
|
1116 | ||
|
1117 | /* ZSTD_initStats_ultra(): | |
|
1118 | * make a first compression pass, just to seed stats with more accurate starting values. | |
|
1119 | * only works on first block, with no dictionary and no ldm. | |
|
1120 | * this function cannot error, hence its contract must be respected.
|
1121 | */ | |
|
1122 | static void | |
|
1123 | ZSTD_initStats_ultra(ZSTD_matchState_t* ms, | |
|
1124 | seqStore_t* seqStore, | |
|
1125 | U32 rep[ZSTD_REP_NUM], | |
|
1126 | const void* src, size_t srcSize) | |
|
1127 | { | |
|
1128 | U32 tmpRep[ZSTD_REP_NUM]; /* updated rep codes will sink here */ | |
|
1129 | memcpy(tmpRep, rep, sizeof(tmpRep)); | |
|
1130 | ||
|
1131 | DEBUGLOG(4, "ZSTD_initStats_ultra (srcSize=%zu)", srcSize); | |
|
1132 | assert(ms->opt.litLengthSum == 0); /* first block */ | |
|
1133 | assert(seqStore->sequences == seqStore->sequencesStart); /* no ldm */ | |
|
1134 | assert(ms->window.dictLimit == ms->window.lowLimit); /* no dictionary */ | |
|
1135 | assert(ms->window.dictLimit - ms->nextToUpdate <= 1); /* no prefix (note: intentional overflow, defined as 2-complement) */ | |
|
1136 | ||
|
1137 | ZSTD_compressBlock_opt_generic(ms, seqStore, tmpRep, src, srcSize, 2 /*optLevel*/, ZSTD_noDict); /* generate stats into ms->opt*/ | |
|
1138 | ||
|
1139 | /* invalidate first scan from history */ | |
|
1140 | ZSTD_resetSeqStore(seqStore); | |
|
1141 | ms->window.base -= srcSize; | |
|
1142 | ms->window.dictLimit += (U32)srcSize; | |
|
1143 | ms->window.lowLimit = ms->window.dictLimit; | |
|
1144 | ms->nextToUpdate = ms->window.dictLimit; | |
|
1145 | ms->nextToUpdate3 = ms->window.dictLimit; | |
|
1146 | ||
|
1147 | /* re-inforce weight of collected statistics */ | |
|
1148 | ZSTD_upscaleStats(&ms->opt); | |
|
1069 | 1149 | } |
|
1070 | 1150 | |
|
1071 | 1151 | size_t ZSTD_compressBlock_btultra( |
@@ -1073,33 +1153,34 b' size_t ZSTD_compressBlock_btultra(' | |||
|
1073 | 1153 | const void* src, size_t srcSize) |
|
1074 | 1154 | { |
|
1075 | 1155 | DEBUGLOG(5, "ZSTD_compressBlock_btultra (srcSize=%zu)", srcSize); |
|
1076 | #if 0 | |
|
1077 | /* 2-pass strategy (disabled) | |
|
1156 | return ZSTD_compressBlock_opt_generic(ms, seqStore, rep, src, srcSize, 2 /*optLevel*/, ZSTD_noDict); | |
|
1157 | } | |
|
1158 | ||
|
1159 | size_t ZSTD_compressBlock_btultra2( | |
|
1160 | ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], | |
|
1161 | const void* src, size_t srcSize) | |
|
1162 | { | |
|
1163 | U32 const current = (U32)((const BYTE*)src - ms->window.base); | |
|
1164 | DEBUGLOG(5, "ZSTD_compressBlock_btultra2 (srcSize=%zu)", srcSize); | |
|
1165 | ||
|
1166 | /* 2-pass strategy: | |
|
1078 | 1167 | * this strategy makes a first pass over first block to collect statistics
|
|
1079 | 1168 | * and seed next round's statistics with it.
|
|
1169 | * After 1st pass, function forgets everything, and starts a new block. | |
|
1170 | * Consequently, this can only work if no data has been previously loaded in tables, | |
|
1171 | * aka, no dictionary, no prefix, no ldm preprocessing. | |
|
1080 | 1172 | * The compression ratio gain is generally small (~0.5% on first block),
|
|
1081 | 1173 | * the cost is 2x cpu time on first block. */
|
|
1082 | 1174 | assert(srcSize <= ZSTD_BLOCKSIZE_MAX); |
|
1083 | 1175 | if ( (ms->opt.litLengthSum==0) /* first block */ |
|
1084 | && (seqStore->sequences == seqStore->sequencesStart)
|
|
1085 | && (ms->window.dictLimit == ms->window.lowLimit) ) {
|
|
1086 | U32 tmpRep[ZSTD_REP_NUM]; | |
|
1087 | DEBUGLOG(5, "ZSTD_compressBlock_btultra: first block: collecting statistics"); | |
|
1088 | assert(ms->nextToUpdate >= ms->window.dictLimit | |
|
1089 | && ms->nextToUpdate <= ms->window.dictLimit + 1); | |
|
1090 | memcpy(tmpRep, rep, sizeof(tmpRep)); | |
|
1091 | ZSTD_compressBlock_opt_generic(ms, seqStore, tmpRep, src, srcSize, 2 /*optLevel*/, ZSTD_noDict); /* generate stats into ms->opt*/ | |
|
1092 | ZSTD_resetSeqStore(seqStore); | |
|
1093 | /* invalidate first scan from history */ | |
|
1094 | ms->window.base -= srcSize; | |
|
1095 | ms->window.dictLimit += (U32)srcSize; | |
|
1096 | ms->window.lowLimit = ms->window.dictLimit; | |
|
1097 | ms->nextToUpdate = ms->window.dictLimit; | |
|
1098 | ms->nextToUpdate3 = ms->window.dictLimit; | |
|
1099 | /* re-inforce weight of collected statistics */ | |
|
1100 | ZSTD_upscaleStats(&ms->opt); | |
|
1176 | && (seqStore->sequences == seqStore->sequencesStart) /* no ldm */ | |
|
1177 | && (ms->window.dictLimit == ms->window.lowLimit) /* no dictionary */ | |
|
1178 | && (current == ms->window.dictLimit) /* start of frame, nothing already loaded nor skipped */ | |
|
1179 | && (srcSize > ZSTD_PREDEF_THRESHOLD) | |
|
1180 | ) { | |
|
1181 | ZSTD_initStats_ultra(ms, seqStore, rep, src, srcSize); | |
|
1101 | 1182 | } |
|
1102 | #endif | |
|
1183 | ||
|
1103 | 1184 | return ZSTD_compressBlock_opt_generic(ms, seqStore, rep, src, srcSize, 2 /*optLevel*/, ZSTD_noDict); |
|
1104 | 1185 | } |
|
1105 | 1186 | |
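The new `btultra2` path above enables the previously `#if 0`'d 2-pass strategy: compress the first block once purely to collect statistics, throw the output away, then compress again with those statistics as seed. A toy model (hypothetical names and a deliberately crude log2 price, nothing like zstd's real price tables) of why seeded frequencies lower the estimated bit cost of the second pass over the same data:

```c
#include <stddef.h>

/* floor(log2(v)) for v >= 1 */
static unsigned highbit(unsigned v) { unsigned n = 0; while (v >>= 1) n++; return n; }

/* toy price: ~log2(total/freq) bits per symbol, the kind of cost an
 * optimal parser derives from a frequency table (assumes freq <= total) */
static unsigned symbolPrice(const unsigned* freq, unsigned total, unsigned char c)
{
    return highbit(total + 1) - highbit(freq[c] + 1);
}

static unsigned bufferPrice(const unsigned char* buf, size_t n,
                            const unsigned* freq, unsigned total)
{
    unsigned price = 0;
    size_t i;
    for (i = 0; i < n; i++) price += symbolPrice(freq, total, buf[i]);
    return price;
}

/* pass 1: only count symbols, the way the first pass seeds ms->opt */
static unsigned countStats(const unsigned char* buf, size_t n, unsigned* freq)
{
    size_t i;
    for (i = 0; i < 256; i++) freq[i] = 0;
    for (i = 0; i < n; i++) freq[buf[i]]++;
    return (unsigned)n;
}

/* nonzero when pass-1 stats price the same data cheaper than flat stats */
static int seededCheaper(const unsigned char* buf, size_t n)
{
    unsigned flat[256], seeded[256];
    unsigned i, seededTotal;
    for (i = 0; i < 256; i++) flat[i] = 1;
    seededTotal = countStats(buf, n, seeded);
    return bufferPrice(buf, n, seeded, seededTotal)
         < bufferPrice(buf, n, flat, 256);
}
```

This is also why the diff guards the real 2-pass with "first block, no dictionary, no ldm, srcSize > ZSTD_PREDEF_THRESHOLD": the gain only pays for the doubled CPU time when the stats are collected from, and applied to, the very same untouched data.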
@@ -1130,3 +1211,7 b' size_t ZSTD_compressBlock_btultra_extDic' | |||
|
1130 | 1211 | { |
|
1131 | 1212 | return ZSTD_compressBlock_opt_generic(ms, seqStore, rep, src, srcSize, 2 /*optLevel*/, ZSTD_extDict); |
|
1132 | 1213 | } |
|
1214 | ||
|
1215 | /* note : no btultra2 variant for extDict nor dictMatchState, | |
|
1216 | * because btultra2 is not meant to work with dictionaries | |
|
1217 | * and is only specific for the first block (no prefix) */ |
@@ -26,6 +26,10 b' size_t ZSTD_compressBlock_btopt(' | |||
|
26 | 26 | size_t ZSTD_compressBlock_btultra( |
|
27 | 27 | ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], |
|
28 | 28 | void const* src, size_t srcSize); |
|
29 | size_t ZSTD_compressBlock_btultra2( | |
|
30 | ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], | |
|
31 | void const* src, size_t srcSize); | |
|
32 | ||
|
29 | 33 | |
|
30 | 34 | size_t ZSTD_compressBlock_btopt_dictMatchState( |
|
31 | 35 | ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], |
@@ -41,6 +45,10 b' size_t ZSTD_compressBlock_btultra_extDic' | |||
|
41 | 45 | ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], |
|
42 | 46 | void const* src, size_t srcSize); |
|
43 | 47 | |
|
48 | /* note : no btultra2 variant for extDict nor dictMatchState, | |
|
49 | * because btultra2 is not meant to work with dictionaries | |
|
50 | * and is only specific for the first block (no prefix) */ | |
|
51 | ||
|
44 | 52 | #if defined (__cplusplus) |
|
45 | 53 | } |
|
46 | 54 | #endif |
@@ -9,21 +9,19 b'' | |||
|
9 | 9 | */ |
|
10 | 10 | |
|
11 | 11 | |
|
12 | /* ====== Tuning parameters ====== */ | |
|
13 | #define ZSTDMT_NBWORKERS_MAX 200 | |
|
14 | #define ZSTDMT_JOBSIZE_MAX (MEM_32bits() ? (512 MB) : (2 GB)) /* note : limited by `jobSize` type, which is `unsigned` */ | |
|
15 | #define ZSTDMT_OVERLAPLOG_DEFAULT 6 | |
|
16 | ||
|
17 | ||
|
18 | 12 | /* ====== Compiler specifics ====== */ |
|
19 | 13 | #if defined(_MSC_VER) |
|
20 | 14 | # pragma warning(disable : 4204) /* disable: C4204: non-constant aggregate initializer */ |
|
21 | 15 | #endif |
|
22 | 16 | |
|
23 | 17 | |
|
18 | /* ====== Constants ====== */ | |
|
19 | #define ZSTDMT_OVERLAPLOG_DEFAULT 0 | |
|
20 | ||
|
21 | ||
|
24 | 22 | /* ====== Dependencies ====== */ |
|
25 | 23 | #include <string.h> /* memcpy, memset */ |
|
26 | #include <limits.h> /* INT_MAX */ | |
|
24 | #include <limits.h> /* INT_MAX, UINT_MAX */ | |
|
27 | 25 | #include "pool.h" /* threadpool */ |
|
28 | 26 | #include "threading.h" /* mutex */ |
|
29 | 27 | #include "zstd_compress_internal.h" /* MIN, ERROR, ZSTD_*, ZSTD_highbit32 */ |
@@ -57,9 +55,9 b' static unsigned long long GetCurrentCloc' | |||
|
57 | 55 | static clock_t _ticksPerSecond = 0; |
|
58 | 56 | if (_ticksPerSecond <= 0) _ticksPerSecond = sysconf(_SC_CLK_TCK); |
|
59 | 57 | |
|
60 | { struct tms junk; clock_t newTicks = (clock_t) times(&junk); | |
|
61 | return ((((unsigned long long)newTicks)*(1000000))/_ticksPerSecond); |
|
|
62 | } | |
|
58 | { struct tms junk; clock_t newTicks = (clock_t) times(&junk); | |
|
59 | return ((((unsigned long long)newTicks)*(1000000))/_ticksPerSecond); | |
|
60 | } } | |
|
63 | 61 | |
|
64 | 62 | #define MUTEX_WAIT_TIME_DLEVEL 6 |
|
65 | 63 | #define ZSTD_PTHREAD_MUTEX_LOCK(mutex) { \ |
@@ -342,8 +340,8 b' static ZSTDMT_seqPool* ZSTDMT_expandSeqP' | |||
|
342 | 340 | |
|
343 | 341 | typedef struct { |
|
344 | 342 | ZSTD_pthread_mutex_t poolMutex; |
|
345 | unsigned totalCCtx; |
|
|
|
346 | unsigned availCCtx; |
|
|
|
343 | int totalCCtx; | |
|
344 | int availCCtx; | |
|
347 | 345 | ZSTD_customMem cMem; |
|
348 | 346 | ZSTD_CCtx* cctx[1]; /* variable size */ |
|
349 | 347 | } ZSTDMT_CCtxPool; |
@@ -351,16 +349,16 b' typedef struct {' | |||
|
351 | 349 | /* note : all CCtx borrowed from the pool should be released back to the pool _before_ freeing the pool */ |
|
352 | 350 | static void ZSTDMT_freeCCtxPool(ZSTDMT_CCtxPool* pool) |
|
353 | 351 | { |
|
354 | unsigned u; | |
|
355 | for (u=0; u<pool->totalCCtx; u++) |
|
|
356 | ZSTD_freeCCtx(pool->cctx[u]);  /* note : compatible with free on NULL */ |
|
|
352 | int cid; | |
|
353 | for (cid=0; cid<pool->totalCCtx; cid++) | |
|
354 | ZSTD_freeCCtx(pool->cctx[cid]); /* note : compatible with free on NULL */ | |
|
357 | 355 | ZSTD_pthread_mutex_destroy(&pool->poolMutex); |
|
358 | 356 | ZSTD_free(pool, pool->cMem); |
|
359 | 357 | } |
|
360 | 358 | |
|
361 | 359 | /* ZSTDMT_createCCtxPool() : |
|
362 | 360 | * implies nbWorkers >= 1 , checked by caller ZSTDMT_createCCtx() */ |
|
363 | static ZSTDMT_CCtxPool* ZSTDMT_createCCtxPool(unsigned nbWorkers, |
|
|
361 | static ZSTDMT_CCtxPool* ZSTDMT_createCCtxPool(int nbWorkers, | |
|
364 | 362 | ZSTD_customMem cMem) |
|
365 | 363 | { |
|
366 | 364 | ZSTDMT_CCtxPool* const cctxPool = (ZSTDMT_CCtxPool*) ZSTD_calloc( |
@@ -381,7 +379,7 b' static ZSTDMT_CCtxPool* ZSTDMT_createCCt' | |||
|
381 | 379 | } |
|
382 | 380 | |
|
383 | 381 | static ZSTDMT_CCtxPool* ZSTDMT_expandCCtxPool(ZSTDMT_CCtxPool* srcPool, |
|
384 | unsigned nbWorkers) |
|
|
|
382 | int nbWorkers) | |
|
385 | 383 | { |
|
386 | 384 | if (srcPool==NULL) return NULL; |
|
387 | 385 | if (nbWorkers <= srcPool->totalCCtx) return srcPool; /* good enough */ |
@@ -469,9 +467,9 b' static int ZSTDMT_serialState_reset(seri' | |||
|
469 | 467 | DEBUGLOG(4, "LDM window size = %u KB", (1U << params.cParams.windowLog) >> 10); |
|
470 | 468 | ZSTD_ldm_adjustParameters(¶ms.ldmParams, ¶ms.cParams); |
|
471 | 469 | assert(params.ldmParams.hashLog >= params.ldmParams.bucketSizeLog); |
|
472 | assert(params.ldmParams.hashEveryLog < 32); |
|
|
470 | assert(params.ldmParams.hashRateLog < 32); | |
|
473 | 471 | serialState->ldmState.hashPower = |
|
474 | ZSTD_ldm_getHashPower(params.ldmParams.minMatchLength); |
|
|
472 | ZSTD_rollingHash_primePower(params.ldmParams.minMatchLength); | |
|
475 | 473 | } else { |
|
476 | 474 | memset(¶ms.ldmParams, 0, sizeof(params.ldmParams)); |
|
477 | 475 | } |
@@ -674,7 +672,7 b' static void ZSTDMT_compressionJob(void* ' | |||
|
674 | 672 | if (ZSTD_isError(initError)) JOB_ERROR(initError); |
|
675 | 673 | } else { /* srcStart points at reloaded section */ |
|
676 | 674 | U64 const pledgedSrcSize = job->firstJob ? job->fullFrameSize : job->src.size; |
|
677 | { size_t const forceWindowError = ZSTD_CCtxParam_setParameter(&jobParams, ZSTD_p_forceMaxWindow, !job->firstJob); |
|
|
675 | { size_t const forceWindowError = ZSTD_CCtxParam_setParameter(&jobParams, ZSTD_c_forceMaxWindow, !job->firstJob); | |
|
678 | 676 | if (ZSTD_isError(forceWindowError)) JOB_ERROR(forceWindowError); |
|
679 | 677 | } |
|
680 | 678 | { size_t const initError = ZSTD_compressBegin_advanced_internal(cctx, |
@@ -777,6 +775,14 b' typedef struct {' | |||
|
777 | 775 | |
|
778 | 776 | static const roundBuff_t kNullRoundBuff = {NULL, 0, 0}; |
|
779 | 777 | |
|
778 | #define RSYNC_LENGTH 32 | |
|
779 | ||
|
780 | typedef struct { | |
|
781 | U64 hash; | |
|
782 | U64 hitMask; | |
|
783 | U64 primePower; | |
|
784 | } rsyncState_t; | |
|
785 | ||
|
780 | 786 | struct ZSTDMT_CCtx_s { |
|
781 | 787 | POOL_ctx* factory; |
|
782 | 788 | ZSTDMT_jobDescription* jobs; |
@@ -790,6 +796,7 b' struct ZSTDMT_CCtx_s {' | |||
|
790 | 796 | inBuff_t inBuff; |
|
791 | 797 | roundBuff_t roundBuff; |
|
792 | 798 | serialState_t serial; |
|
799 | rsyncState_t rsync; | |
|
793 | 800 | unsigned singleBlockingThread; |
|
794 | 801 | unsigned jobIDMask; |
|
795 | 802 | unsigned doneJobID; |
@@ -859,7 +866,7 b' size_t ZSTDMT_CCtxParam_setNbWorkers(ZST' | |||
|
859 | 866 | { |
|
860 | 867 | if (nbWorkers > ZSTDMT_NBWORKERS_MAX) nbWorkers = ZSTDMT_NBWORKERS_MAX; |
|
861 | 868 | params->nbWorkers = nbWorkers; |
|
862 | params->overlapSizeLog = ZSTDMT_OVERLAPLOG_DEFAULT; |
|
|
869 | params->overlapLog = ZSTDMT_OVERLAPLOG_DEFAULT; | |
|
863 | 870 | params->jobSize = 0; |
|
864 | 871 | return nbWorkers; |
|
865 | 872 | } |
@@ -969,52 +976,59 b' size_t ZSTDMT_sizeof_CCtx(ZSTDMT_CCtx* m' | |||
|
969 | 976 | } |
|
970 | 977 | |
|
971 | 978 | /* Internal only */ |
|
972 | size_t ZSTDMT_CCtxParam_setMTCtxParameter(ZSTD_CCtx_params* params, | |
|
973 | ZSTDMT_parameter parameter, unsigned value) { | |
|
979 | size_t | |
|
980 | ZSTDMT_CCtxParam_setMTCtxParameter(ZSTD_CCtx_params* params, | |
|
981 | ZSTDMT_parameter parameter, | |
|
982 | int value) | |
|
983 | { | |
|
974 | 984 | DEBUGLOG(4, "ZSTDMT_CCtxParam_setMTCtxParameter"); |
|
975 | 985 | switch(parameter) |
|
976 | 986 | { |
|
977 | 987 | case ZSTDMT_p_jobSize : |
|
978 | DEBUGLOG(4, "ZSTDMT_CCtxParam_setMTCtxParameter : set jobSize to %u", value); |
|
|
979 | if ( (value > 0)  /* value==0 => automatic job size */ |
|
|
980 | & (value < ZSTDMT_JOBSIZE_MIN) ) |
|
|
|
988 | DEBUGLOG(4, "ZSTDMT_CCtxParam_setMTCtxParameter : set jobSize to %i", value); | |
|
989 | if ( value != 0 /* default */ | |
|
990 | && value < ZSTDMT_JOBSIZE_MIN) | |
|
981 | 991 | value = ZSTDMT_JOBSIZE_MIN; |
|
982 | if (value > ZSTDMT_JOBSIZE_MAX) | |
|
983 | value = ZSTDMT_JOBSIZE_MAX; | |
|
992 | assert(value >= 0); | |
|
993 | if (value > ZSTDMT_JOBSIZE_MAX) value = ZSTDMT_JOBSIZE_MAX; | |
|
984 | 994 | params->jobSize = value; |
|
985 | 995 | return value; |
|
986 | case ZSTDMT_p_overlapSectionLog : | |
|
987 | if (value > 9) value = 9; | |
|
988 | DEBUGLOG(4, "ZSTDMT_p_overlapSectionLog : %u", value); |
|
|
989 | params->overlapSizeLog = (value >= 9) ? 9 : value; | |
|
996 | ||
|
997 | case ZSTDMT_p_overlapLog : | |
|
998 | DEBUGLOG(4, "ZSTDMT_p_overlapLog : %i", value); | |
|
999 | if (value < ZSTD_OVERLAPLOG_MIN) value = ZSTD_OVERLAPLOG_MIN; | |
|
1000 | if (value > ZSTD_OVERLAPLOG_MAX) value = ZSTD_OVERLAPLOG_MAX; | |
|
1001 | params->overlapLog = value; | |
|
990 | 1002 | return value; |
|
1003 | ||
|
1004 | case ZSTDMT_p_rsyncable : | |
|
1005 | value = (value != 0); | |
|
1006 | params->rsyncable = value; | |
|
1007 | return value; | |
|
1008 | ||
|
991 | 1009 | default : |
|
992 | 1010 | return ERROR(parameter_unsupported); |
|
993 | 1011 | } |
|
994 | 1012 | } |
|
995 | 1013 | |
|
996 | size_t ZSTDMT_setMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter, unsigned value) |
|
|
1014 | size_t ZSTDMT_setMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter, int value) | |
|
997 | 1015 | { |
|
998 | 1016 | DEBUGLOG(4, "ZSTDMT_setMTCtxParameter"); |
|
999 | switch(parameter) | |
|
1000 | { | |
|
1001 | case ZSTDMT_p_jobSize : | |
|
1002 | return ZSTDMT_CCtxParam_setMTCtxParameter(&mtctx->params, parameter, value); | |
|
1003 | case ZSTDMT_p_overlapSectionLog : | |
|
1004 | return ZSTDMT_CCtxParam_setMTCtxParameter(&mtctx->params, parameter, value); | |
|
1005 | default : | |
|
1006 | return ERROR(parameter_unsupported); | |
|
1007 | } | |
|
1017 | return ZSTDMT_CCtxParam_setMTCtxParameter(&mtctx->params, parameter, value); | |
|
1008 | 1018 | } |
|
1009 | 1019 | |
|
1010 | size_t ZSTDMT_getMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter, unsigned* value) |
|
|
1020 | size_t ZSTDMT_getMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter, int* value) | |
|
1011 | 1021 | { |
|
1012 | 1022 | switch (parameter) { |
|
1013 | 1023 | case ZSTDMT_p_jobSize: |
|
1014 | *value = mtctx->params.jobSize; |
|
|
|
1024 | assert(mtctx->params.jobSize <= INT_MAX); | |
|
1025 | *value = (int)(mtctx->params.jobSize); | |
|
1015 | 1026 | break; |
|
1016 | case ZSTDMT_p_overlapSectionLog: |
|
|
1017 | *value = mtctx->params.overlapSizeLog; |
|
|
1027 | case ZSTDMT_p_overlapLog: | |
|
1028 | *value = mtctx->params.overlapLog; | |
|
1029 | break; | |
|
1030 | case ZSTDMT_p_rsyncable: | |
|
1031 | *value = mtctx->params.rsyncable; | |
|
1018 | 1032 | break; |
|
1019 | 1033 | default: |
|
1020 | 1034 | return ERROR(parameter_unsupported); |
@@ -1140,22 +1154,66 b' size_t ZSTDMT_toFlushNow(ZSTDMT_CCtx* mt' | |||
|
1140 | 1154 | /* ===== Multi-threaded compression ===== */ |
|
1141 | 1155 | /* ------------------------------------------ */ |
|
1142 | 1156 | |
|
1143 | static size_t ZSTDMT_computeTargetJobLog(ZSTD_CCtx_params const params) |
|
|
1157 | static unsigned ZSTDMT_computeTargetJobLog(ZSTD_CCtx_params const params) | |
|
1144 | 1158 | { |
|
1145 | 1159 | if (params.ldmParams.enableLdm) |
|
1160 | /* In Long Range Mode, the windowLog is typically oversized. | |
|
1161 | * In which case, it's preferable to determine the jobSize | |
|
1162 | * based on chainLog instead. */ | |
|
1146 | 1163 | return MAX(21, params.cParams.chainLog + 4); |
|
1147 | 1164 | return MAX(20, params.cParams.windowLog + 2); |
|
1148 | 1165 | } |
|
1149 | 1166 | |
|
1150 | static size_t ZSTDMT_computeOverlapLog(ZSTD_CCtx_params const params) |
|
|
1167 | static int ZSTDMT_overlapLog_default(ZSTD_strategy strat) | |
|
1151 | 1168 | { |
|
1152 | unsigned const overlapRLog = (params.overlapSizeLog>9) ? 0 : 9-params.overlapSizeLog; | |
|
1153 | if (params.ldmParams.enableLdm) | |
|
1154 | return (MIN(params.cParams.windowLog, ZSTDMT_computeTargetJobLog(params) - 2) - overlapRLog); | |
|
1155 | return overlapRLog >= 9 ? 0 : (params.cParams.windowLog - overlapRLog); | |
|
1169 | switch(strat) | |
|
1170 | { | |
|
1171 | case ZSTD_btultra2: | |
|
1172 | return 9; | |
|
1173 | case ZSTD_btultra: | |
|
1174 | case ZSTD_btopt: | |
|
1175 | return 8; | |
|
1176 | case ZSTD_btlazy2: | |
|
1177 | case ZSTD_lazy2: | |
|
1178 | return 7; | |
|
1179 | case ZSTD_lazy: | |
|
1180 | case ZSTD_greedy: | |
|
1181 | case ZSTD_dfast: | |
|
1182 | case ZSTD_fast: | |
|
1183 | default:; | |
|
1184 | } | |
|
1185 | return 6; | |
|
1156 | 1186 | } |
|
1157 | 1187 | |
|
1158 | static unsigned ZSTDMT_computeNbJobs(ZSTD_CCtx_params params, size_t srcSize, unsigned nbWorkers) { | |
|
1188 | static int ZSTDMT_overlapLog(int ovlog, ZSTD_strategy strat) | |
|
1189 | { | |
|
1190 | assert(0 <= ovlog && ovlog <= 9); | |
|
1191 | if (ovlog == 0) return ZSTDMT_overlapLog_default(strat); | |
|
1192 | return ovlog; | |
|
1193 | } | |
|
1194 | ||
|
1195 | static size_t ZSTDMT_computeOverlapSize(ZSTD_CCtx_params const params) | |
|
1196 | { | |
|
1197 | int const overlapRLog = 9 - ZSTDMT_overlapLog(params.overlapLog, params.cParams.strategy); | |
|
1198 | int ovLog = (overlapRLog >= 8) ? 0 : (params.cParams.windowLog - overlapRLog); | |
|
1199 | assert(0 <= overlapRLog && overlapRLog <= 8); | |
|
1200 | if (params.ldmParams.enableLdm) { | |
|
1201 | /* In Long Range Mode, the windowLog is typically oversized. | |
|
1202 | * In which case, it's preferable to determine the jobSize | |
|
1203 | * based on chainLog instead. | |
|
1204 | * Then, ovLog becomes a fraction of the jobSize, rather than windowSize */ | |
|
1205 | ovLog = MIN(params.cParams.windowLog, ZSTDMT_computeTargetJobLog(params) - 2) | |
|
1206 | - overlapRLog; | |
|
1207 | } | |
|
1208 | assert(0 <= ovLog && ovLog <= 30); | |
|
1209 | DEBUGLOG(4, "overlapLog : %i", params.overlapLog); | |
|
1210 | DEBUGLOG(4, "overlap size : %i", 1 << ovLog); | |
|
1211 | return (ovLog==0) ? 0 : (size_t)1 << ovLog; | |
|
1212 | } | |
|
1213 | ||
|
1214 | static unsigned | |
|
1215 | ZSTDMT_computeNbJobs(ZSTD_CCtx_params params, size_t srcSize, unsigned nbWorkers) | |
|
1216 | { | |
|
1159 | 1217 | assert(nbWorkers>0); |
|
1160 | 1218 | { size_t const jobSizeTarget = (size_t)1 << ZSTDMT_computeTargetJobLog(params); |
|
1161 | 1219 | size_t const jobMaxSize = jobSizeTarget << 2; |
@@ -1178,7 +1236,7 b' static size_t ZSTDMT_compress_advanced_i' | |||
|
1178 | 1236 | ZSTD_CCtx_params params) |
|
1179 | 1237 | { |
|
1180 | 1238 | ZSTD_CCtx_params const jobParams = ZSTDMT_initJobCCtxParams(params); |
|
1181 | size_t const overlapSize = (size_t)1 << ZSTDMT_computeOverlapLog(params); |
|
|
1239 | size_t const overlapSize = ZSTDMT_computeOverlapSize(params); | |
|
1182 | 1240 | unsigned const nbJobs = ZSTDMT_computeNbJobs(params, srcSize, params.nbWorkers); |
|
1183 | 1241 | size_t const proposedJobSize = (srcSize + (nbJobs-1)) / nbJobs; |
|
1184 | 1242 | size_t const avgJobSize = (((proposedJobSize-1) & 0x1FFFF) < 0x7FFF) ? proposedJobSize + 0xFFFF : proposedJobSize; /* avoid too small last block */ |
@@ -1289,16 +1347,17 b' static size_t ZSTDMT_compress_advanced_i' | |||
|
1289 | 1347 | } |
|
1290 | 1348 | |
|
1291 | 1349 | size_t ZSTDMT_compress_advanced(ZSTDMT_CCtx* mtctx, |
|
1292 | void* dst, size_t dstCapacity, | |
|
1293 | const void* src, size_t srcSize, | |
|
1294 | const ZSTD_CDict* cdict, | |
|
1295 | ZSTD_parameters params, | |
|
1296 |
|
|
|
1350 | void* dst, size_t dstCapacity, | |
|
1351 | const void* src, size_t srcSize, | |
|
1352 | const ZSTD_CDict* cdict, | |
|
1353 | ZSTD_parameters params, | |
|
1354 | int overlapLog) | |
|
1297 | 1355 | { |
|
1298 | 1356 | ZSTD_CCtx_params cctxParams = mtctx->params; |
|
1299 | 1357 | cctxParams.cParams = params.cParams; |
|
1300 | 1358 | cctxParams.fParams = params.fParams; |
|
1301 | cctxParams.overlapSizeLog = overlapLog; | |
|
1359 | assert(ZSTD_OVERLAPLOG_MIN <= overlapLog && overlapLog <= ZSTD_OVERLAPLOG_MAX); | |
|
1360 | cctxParams.overlapLog = overlapLog; | |
|
1302 | 1361 | return ZSTDMT_compress_advanced_internal(mtctx, |
|
1303 | 1362 | dst, dstCapacity, |
|
1304 | 1363 | src, srcSize, |
@@ -1311,8 +1370,8 b' size_t ZSTDMT_compressCCtx(ZSTDMT_CCtx* ' | |||
|
1311 | 1370 | const void* src, size_t srcSize, |
|
1312 | 1371 | int compressionLevel) |
|
1313 | 1372 | { |
|
1314 | U32 const overlapLog = (compressionLevel >= ZSTD_maxCLevel()) ? 9 : ZSTDMT_OVERLAPLOG_DEFAULT; | |
|
1315 | 1373 | ZSTD_parameters params = ZSTD_getParams(compressionLevel, srcSize, 0); |
|
1374 | int const overlapLog = ZSTDMT_overlapLog_default(params.cParams.strategy); | |
|
1316 | 1375 | params.fParams.contentSizeFlag = 1; |
|
1317 | 1376 | return ZSTDMT_compress_advanced(mtctx, dst, dstCapacity, src, srcSize, NULL, params, overlapLog); |
|
1318 | 1377 | } |
@@ -1339,8 +1398,8 b' size_t ZSTDMT_initCStream_internal(' | |||
|
1339 | 1398 | if (params.nbWorkers != mtctx->params.nbWorkers) |
|
1340 | 1399 | CHECK_F( ZSTDMT_resize(mtctx, params.nbWorkers) ); |
|
1341 | 1400 | |
|
1342 | if (params.jobSize < ZSTDMT_JOBSIZE_MIN) params.jobSize = ZSTDMT_JOBSIZE_MIN; |
|
|
1343 | if (params.jobSize > ZSTDMT_JOBSIZE_MAX) params.jobSize = ZSTDMT_JOBSIZE_MAX; | |
|
1401 | if (params.jobSize != 0 && params.jobSize < ZSTDMT_JOBSIZE_MIN) params.jobSize = ZSTDMT_JOBSIZE_MIN; | |
|
1402 | if (params.jobSize > (size_t)ZSTDMT_JOBSIZE_MAX) params.jobSize = ZSTDMT_JOBSIZE_MAX; | |
|
1344 | 1403 | |
|
1345 | 1404 | mtctx->singleBlockingThread = (pledgedSrcSize <= ZSTDMT_JOBSIZE_MIN); /* do not trigger multi-threading when srcSize is too small */ |
|
1346 | 1405 | if (mtctx->singleBlockingThread) { |
@@ -1375,14 +1434,24 b' size_t ZSTDMT_initCStream_internal(' | |||
|
1375 | 1434 | mtctx->cdict = cdict; |
|
1376 | 1435 | } |
|
1377 | 1436 | |
|
1378 | mtctx->targetPrefixSize = (size_t)1 << ZSTDMT_computeOverlapLog(params); |
|
|
1379 | DEBUGLOG(4, "overlapLog=%u => %u KB", params.overlapSizeLog, (U32)(mtctx->targetPrefixSize>>10)); |
|
|
1437 | mtctx->targetPrefixSize = ZSTDMT_computeOverlapSize(params); | |
|
1438 | DEBUGLOG(4, "overlapLog=%i => %u KB", params.overlapLog, (U32)(mtctx->targetPrefixSize>>10)); | |
|
1380 | 1439 | mtctx->targetSectionSize = params.jobSize; |
|
1381 | 1440 | if (mtctx->targetSectionSize == 0) { |
|
1382 | 1441 | mtctx->targetSectionSize = 1ULL << ZSTDMT_computeTargetJobLog(params); |
|
1383 | 1442 | } |
|
1443 | if (params.rsyncable) { | |
|
1444 | /* Aim for the targetSectionSize as the average job size. */ |
|
1445 | U32 const jobSizeMB = (U32)(mtctx->targetSectionSize >> 20); | |
|
1446 | U32 const rsyncBits = ZSTD_highbit32(jobSizeMB) + 20; | |
|
1447 | assert(jobSizeMB >= 1); | |
|
1448 | DEBUGLOG(4, "rsyncLog = %u", rsyncBits); | |
|
1449 | mtctx->rsync.hash = 0; | |
|
1450 | mtctx->rsync.hitMask = (1ULL << rsyncBits) - 1; | |
|
1451 | mtctx->rsync.primePower = ZSTD_rollingHash_primePower(RSYNC_LENGTH); | |
|
1452 | } | |
|
1384 | 1453 | if (mtctx->targetSectionSize < mtctx->targetPrefixSize) mtctx->targetSectionSize = mtctx->targetPrefixSize; /* job size must be >= overlap size */ |
|
1385 | DEBUGLOG(4, "Job Size : %u KB (note : set to %u)", (U32)(mtctx->targetSectionSize>>10), params.jobSize); | |
|
1454 | DEBUGLOG(4, "Job Size : %u KB (note : set to %u)", (U32)(mtctx->targetSectionSize>>10), (U32)params.jobSize); | |
|
1386 | 1455 | DEBUGLOG(4, "inBuff Size : %u KB", (U32)(mtctx->targetSectionSize>>10)); |
|
1387 | 1456 | ZSTDMT_setBufferSize(mtctx->bufPool, ZSTD_compressBound(mtctx->targetSectionSize)); |
|
1388 | 1457 | { |
@@ -1818,6 +1887,89 b' static int ZSTDMT_tryGetInputRange(ZSTDM' | |||
|
1818 | 1887 | return 1; |
|
1819 | 1888 | } |
|
1820 | 1889 | |
|
1890 | typedef struct { | |
|
1891 | size_t toLoad; /* The number of bytes to load from the input. */ | |
|
1892 | int flush; /* Boolean declaring if we must flush because we found a synchronization point. */ | |
|
1893 | } syncPoint_t; | |
|
1894 | ||
|
1895 | /** | |
|
1896 | * Searches through the input for a synchronization point. If one is found, we | |
|
1897 | * will instruct the caller to flush, and return the number of bytes to load. | |
|
1898 | * Otherwise, we will load as many bytes as possible and instruct the caller | |
|
1899 | * to continue as normal. | |
|
1900 | */ | |
|
1901 | static syncPoint_t | |
|
1902 | findSynchronizationPoint(ZSTDMT_CCtx const* mtctx, ZSTD_inBuffer const input) | |
|
1903 | { | |
|
1904 | BYTE const* const istart = (BYTE const*)input.src + input.pos; | |
|
1905 | U64 const primePower = mtctx->rsync.primePower; | |
|
1906 | U64 const hitMask = mtctx->rsync.hitMask; | |
|
1907 | ||
|
1908 | syncPoint_t syncPoint; | |
|
1909 | U64 hash; | |
|
1910 | BYTE const* prev; | |
|
1911 | size_t pos; | |
|
1912 | ||
|
1913 | syncPoint.toLoad = MIN(input.size - input.pos, mtctx->targetSectionSize - mtctx->inBuff.filled); | |
|
1914 | syncPoint.flush = 0; | |
|
1915 | if (!mtctx->params.rsyncable) | |
|
1916 | /* Rsync is disabled. */ | |
|
1917 | return syncPoint; | |
|
1918 | if (mtctx->inBuff.filled + syncPoint.toLoad < RSYNC_LENGTH) | |
|
1919 | /* Not enough to compute the hash. | |
|
1920 | * We will miss any synchronization points in this RSYNC_LENGTH byte | |
|
1921 | * window. However, since it depends only on the internal buffers, if the |
|
1922 | * state is already synchronized, we will remain synchronized. | |
|
1923 | * Additionally, the probability that we miss a synchronization point is | |
|
1924 | * low: RSYNC_LENGTH / targetSectionSize. | |
|
1925 | */ | |
|
1926 | return syncPoint; | |
|
1927 | /* Initialize the loop variables. */ | |
|
1928 | if (mtctx->inBuff.filled >= RSYNC_LENGTH) { | |
|
1929 | /* We have enough bytes buffered to initialize the hash. | |
|
1930 | * Start scanning at the beginning of the input. | |
|
1931 | */ | |
|
1932 | pos = 0; | |
|
1933 | prev = (BYTE const*)mtctx->inBuff.buffer.start + mtctx->inBuff.filled - RSYNC_LENGTH; | |
|
1934 | hash = ZSTD_rollingHash_compute(prev, RSYNC_LENGTH); | |
|
1935 | } else { | |
|
1936 | /* We don't have enough bytes buffered to initialize the hash, but | |
|
1937 | * we know we have at least RSYNC_LENGTH bytes total. | |
|
1938 | * Start scanning after the first RSYNC_LENGTH bytes less the bytes | |
|
1939 | * already buffered. | |
|
1940 | */ | |
|
1941 | pos = RSYNC_LENGTH - mtctx->inBuff.filled; | |
|
1942 | prev = (BYTE const*)mtctx->inBuff.buffer.start - pos; | |
|
1943 | hash = ZSTD_rollingHash_compute(mtctx->inBuff.buffer.start, mtctx->inBuff.filled); | |
|
1944 | hash = ZSTD_rollingHash_append(hash, istart, pos); | |
|
1945 | } | |
|
1946 | /* Starting with the hash of the previous RSYNC_LENGTH bytes, roll | |
|
1947 | * through the input. If we hit a synchronization point, then cut the | |
|
1948 | * job off, and tell the compressor to flush the job. Otherwise, load | |
|
1949 | * all the bytes and continue as normal. | |
|
1950 | * If we go too long without a synchronization point (targetSectionSize) | |
|
1951 | * then a block will be emitted anyways, but this is okay, since if we | |
|
1952 | * are already synchronized we will remain synchronized. | |
|
1953 | */ | |
|
1954 | for (; pos < syncPoint.toLoad; ++pos) { | |
|
1955 | BYTE const toRemove = pos < RSYNC_LENGTH ? prev[pos] : istart[pos - RSYNC_LENGTH]; | |
|
1956 | /* if (pos >= RSYNC_LENGTH) assert(ZSTD_rollingHash_compute(istart + pos - RSYNC_LENGTH, RSYNC_LENGTH) == hash); */ | |
|
1957 | hash = ZSTD_rollingHash_rotate(hash, toRemove, istart[pos], primePower); | |
|
1958 | if ((hash & hitMask) == hitMask) { | |
|
1959 | syncPoint.toLoad = pos + 1; | |
|
1960 | syncPoint.flush = 1; | |
|
1961 | break; | |
|
1962 | } | |
|
1963 | } | |
|
1964 | return syncPoint; | |
|
1965 | } | |
|
1966 | ||
|
1967 | size_t ZSTDMT_nextInputSizeHint(const ZSTDMT_CCtx* mtctx) | |
|
1968 | { | |
|
1969 | size_t hintInSize = mtctx->targetSectionSize - mtctx->inBuff.filled; | |
|
1970 | if (hintInSize==0) hintInSize = mtctx->targetSectionSize; | |
|
1971 | return hintInSize; | |
|
1972 | } | |
|
1821 | 1973 | |
|
1822 | 1974 | /** ZSTDMT_compressStream_generic() : |
|
1823 | 1975 | * internal use only - exposed to be invoked from zstd_compress.c |
@@ -1844,7 +1996,8 b' size_t ZSTDMT_compressStream_generic(ZST' | |||
|
1844 | 1996 | } |
|
1845 | 1997 | |
|
1846 | 1998 | /* single-pass shortcut (note : synchronous-mode) */ |
|
1847 | if ( (mtctx->nextJobID == 0) /* just started */ | |
|
1999 | if ( (!mtctx->params.rsyncable) /* rsyncable mode is disabled */ | |
|
2000 | && (mtctx->nextJobID == 0) /* just started */ | |
|
1848 | 2001 | && (mtctx->inBuff.filled == 0) /* nothing buffered */ |
|
1849 | 2002 | && (!mtctx->jobReady) /* no job already created */ |
|
1850 | 2003 | && (endOp == ZSTD_e_end) /* end order */ |
@@ -1876,14 +2029,17 b' size_t ZSTDMT_compressStream_generic(ZST' | |||
|
1876 | 2029 | DEBUGLOG(5, "ZSTDMT_tryGetInputRange completed successfully : mtctx->inBuff.buffer.start = %p", mtctx->inBuff.buffer.start); |
|
1877 | 2030 | } |
|
1878 | 2031 | if (mtctx->inBuff.buffer.start != NULL) { |
|
1879 | size_t const toLoad = MIN(input->size - input->pos, mtctx->targetSectionSize - mtctx->inBuff.filled); | |
|
2032 | syncPoint_t const syncPoint = findSynchronizationPoint(mtctx, *input); | |
|
2033 | if (syncPoint.flush && endOp == ZSTD_e_continue) { | |
|
2034 | endOp = ZSTD_e_flush; | |
|
2035 | } | |
|
1880 | 2036 | assert(mtctx->inBuff.buffer.capacity >= mtctx->targetSectionSize); |
|
1881 | 2037 | DEBUGLOG(5, "ZSTDMT_compressStream_generic: adding %u bytes on top of %u to buffer of size %u", |
|
1882 | (U32)toLoad, (U32)mtctx->inBuff.filled, (U32)mtctx->targetSectionSize); | |
|
1883 | memcpy((char*)mtctx->inBuff.buffer.start + mtctx->inBuff.filled, (const char*)input->src + input->pos, toLoad); | |
|
1884 | input->pos += toLoad; | |
|
1885 | mtctx->inBuff.filled += toLoad; | |
|
1886 | forwardInputProgress = toLoad>0; | |
|
2038 | (U32)syncPoint.toLoad, (U32)mtctx->inBuff.filled, (U32)mtctx->targetSectionSize); | |
|
2039 | memcpy((char*)mtctx->inBuff.buffer.start + mtctx->inBuff.filled, (const char*)input->src + input->pos, syncPoint.toLoad); | |
|
2040 | input->pos += syncPoint.toLoad; | |
|
2041 | mtctx->inBuff.filled += syncPoint.toLoad; | |
|
2042 | forwardInputProgress = syncPoint.toLoad>0; | |
|
1887 | 2043 | } |
|
1888 | 2044 | if ((input->pos < input->size) && (endOp == ZSTD_e_end)) |
|
1889 | 2045 | endOp = ZSTD_e_flush; /* can't end now : not all input consumed */ |
@@ -28,6 +28,16 b'' | |||
|
28 | 28 | #include "zstd.h" /* ZSTD_inBuffer, ZSTD_outBuffer, ZSTDLIB_API */ |
|
29 | 29 | |
|
30 | 30 | |
|
31 | /* === Constants === */ | |
|
32 | #ifndef ZSTDMT_NBWORKERS_MAX | |
|
33 | # define ZSTDMT_NBWORKERS_MAX 200 | |
|
34 | #endif | |
|
35 | #ifndef ZSTDMT_JOBSIZE_MIN | |
|
36 | # define ZSTDMT_JOBSIZE_MIN (1 MB) | |
|
37 | #endif | |
|
38 | #define ZSTDMT_JOBSIZE_MAX (MEM_32bits() ? (512 MB) : (1024 MB)) | |
|
39 | ||
|
40 | ||
|
31 | 41 | /* === Memory management === */ |
|
32 | 42 | typedef struct ZSTDMT_CCtx_s ZSTDMT_CCtx; |
|
33 | 43 | ZSTDLIB_API ZSTDMT_CCtx* ZSTDMT_createCCtx(unsigned nbWorkers); |
@@ -52,6 +62,7 b' ZSTDLIB_API size_t ZSTDMT_compressCCtx(Z' | |||
|
52 | 62 | ZSTDLIB_API size_t ZSTDMT_initCStream(ZSTDMT_CCtx* mtctx, int compressionLevel); |
|
53 | 63 | ZSTDLIB_API size_t ZSTDMT_resetCStream(ZSTDMT_CCtx* mtctx, unsigned long long pledgedSrcSize); /**< if srcSize is not known at reset time, use ZSTD_CONTENTSIZE_UNKNOWN. Note: for compatibility with older programs, 0 means the same as ZSTD_CONTENTSIZE_UNKNOWN, but it will change in the future to mean "empty" */ |
|
54 | 64 | |
|
65 | ZSTDLIB_API size_t ZSTDMT_nextInputSizeHint(const ZSTDMT_CCtx* mtctx); | |
|
55 | 66 | ZSTDLIB_API size_t ZSTDMT_compressStream(ZSTDMT_CCtx* mtctx, ZSTD_outBuffer* output, ZSTD_inBuffer* input); |
|
56 | 67 | |
|
57 | 68 | ZSTDLIB_API size_t ZSTDMT_flushStream(ZSTDMT_CCtx* mtctx, ZSTD_outBuffer* output); /**< @return : 0 == all flushed; >0 : still some data to be flushed; or an error code (ZSTD_isError()) */ |
@@ -60,16 +71,12 b' ZSTDLIB_API size_t ZSTDMT_endStream(ZSTD' | |||
|
60 | 71 | |
|
61 | 72 | /* === Advanced functions and parameters === */ |
|
62 | 73 | |
|
63 | #ifndef ZSTDMT_JOBSIZE_MIN | |
|
64 | # define ZSTDMT_JOBSIZE_MIN (1U << 20) /* 1 MB - Minimum size of each compression job */ | |
|
65 | #endif | |
|
66 | ||
|
67 | 74 | ZSTDLIB_API size_t ZSTDMT_compress_advanced(ZSTDMT_CCtx* mtctx, |
|
68 | 75 | void* dst, size_t dstCapacity, |
|
69 | 76 | const void* src, size_t srcSize, |
|
70 | 77 | const ZSTD_CDict* cdict, |
|
71 | 78 | ZSTD_parameters params, |
|
72 | unsigned overlapLog); |
|
|
|
79 | int overlapLog); | |
|
73 | 80 | |
|
74 | 81 | ZSTDLIB_API size_t ZSTDMT_initCStream_advanced(ZSTDMT_CCtx* mtctx, |
|
75 | 82 | const void* dict, size_t dictSize, /* dict can be released after init, a local copy is preserved within zcs */ |
@@ -84,8 +91,9 b' ZSTDLIB_API size_t ZSTDMT_initCStream_us' | |||
|
84 | 91 | /* ZSTDMT_parameter : |
|
85 | 92 | * List of parameters that can be set using ZSTDMT_setMTCtxParameter() */ |
|
86 | 93 | typedef enum { |
|
87 | ZSTDMT_p_jobSize, |
|
|
88 | ZSTDMT_p_overlapSectionLog |
|
|
94 | ZSTDMT_p_jobSize, /* Each job is compressed in parallel. By default, this value is dynamically determined depending on compression parameters. Can be set explicitly here. */ | |
|
95 | ZSTDMT_p_overlapLog, /* Each job may reload a part of previous job to enhance compression ratio; 0 == no overlap, 6(default) == use 1/8th of window, >=9 == use full window. This is a "sticky" parameter : its value will be re-used on next compression job */ |
|
96 | ZSTDMT_p_rsyncable /* Enables rsyncable mode. */ | |
|
89 | 97 | } ZSTDMT_parameter; |
|
90 | 98 | |
|
91 | 99 | /* ZSTDMT_setMTCtxParameter() : |
@@ -93,12 +101,12 b' typedef enum {' | |||
|
93 | 101 | * The function must be called typically after ZSTD_createCCtx() but __before ZSTDMT_init*() !__ |
|
94 | 102 | * Parameters not explicitly reset by ZSTDMT_init*() remain the same in consecutive compression sessions. |
|
95 | 103 | * @return : 0, or an error code (which can be tested using ZSTD_isError()) */ |
|
96 | ZSTDLIB_API size_t ZSTDMT_setMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter, unsigned value); |
|
|
104 | ZSTDLIB_API size_t ZSTDMT_setMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter, int value); | |
|
97 | 105 | |
|
98 | 106 | /* ZSTDMT_getMTCtxParameter() : |
|
99 | 107 | * Query the ZSTDMT_CCtx for a parameter value. |
|
100 | 108 | * @return : 0, or an error code (which can be tested using ZSTD_isError()) */ |
|
101 | ZSTDLIB_API size_t ZSTDMT_getMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter, unsigned* value); |
|
|
109 | ZSTDLIB_API size_t ZSTDMT_getMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter, int* value); | |
|
102 | 110 | |
|
103 | 111 | |
|
104 | 112 | /*! ZSTDMT_compressStream_generic() : |
@@ -129,7 +137,7 b' size_t ZSTDMT_toFlushNow(ZSTDMT_CCtx* mt' | |||
|
129 | 137 | |
|
130 | 138 | /*! ZSTDMT_CCtxParam_setMTCtxParameter() |
|
131 | 139 | * like ZSTDMT_setMTCtxParameter(), but into a ZSTD_CCtx_Params */ |
|
132 | size_t ZSTDMT_CCtxParam_setMTCtxParameter(ZSTD_CCtx_params* params, ZSTDMT_parameter parameter, unsigned value); |
|
|
140 | size_t ZSTDMT_CCtxParam_setMTCtxParameter(ZSTD_CCtx_params* params, ZSTDMT_parameter parameter, int value); | |
|
133 | 141 | |
|
134 | 142 | /*! ZSTDMT_CCtxParam_setNbWorkers() |
|
135 | 143 | * Set nbWorkers, and clamp it. |
|
1 | NO CONTENT: modified file | |
The requested commit or file is too big and content was truncated. Show full diff |
|