zstd: vendor zstd 1.1.1...
Gregory Szorc
r30434:2e484bde default
@@ -0,0 +1,30 b''
1 BSD License
2
3 For Zstandard software
4
5 Copyright (c) 2016-present, Facebook, Inc. All rights reserved.
6
7 Redistribution and use in source and binary forms, with or without modification,
8 are permitted provided that the following conditions are met:
9
10 * Redistributions of source code must retain the above copyright notice, this
11 list of conditions and the following disclaimer.
12
13 * Redistributions in binary form must reproduce the above copyright notice,
14 this list of conditions and the following disclaimer in the documentation
15 and/or other materials provided with the distribution.
16
17 * Neither the name Facebook nor the names of its contributors may be used to
18 endorse or promote products derived from this software without specific
19 prior written permission.
20
21 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
22 ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
23 WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
24 DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
25 ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
26 (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
27 LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
28 ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
29 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
30 SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
@@ -0,0 +1,33 b''
1 Additional Grant of Patent Rights Version 2
2
3 "Software" means the Zstandard software distributed by Facebook, Inc.
4
5 Facebook, Inc. ("Facebook") hereby grants to each recipient of the Software
6 ("you") a perpetual, worldwide, royalty-free, non-exclusive, irrevocable
7 (subject to the termination provision below) license under any Necessary
8 Claims, to make, have made, use, sell, offer to sell, import, and otherwise
9 transfer the Software. For avoidance of doubt, no license is granted under
10 Facebook’s rights in any patent claims that are infringed by (i) modifications
11 to the Software made by you or any third party or (ii) the Software in
12 combination with any software or other technology.
13
14 The license granted hereunder will terminate, automatically and without notice,
15 if you (or any of your subsidiaries, corporate affiliates or agents) initiate
16 directly or indirectly, or take a direct financial interest in, any Patent
17 Assertion: (i) against Facebook or any of its subsidiaries or corporate
18 affiliates, (ii) against any party if such Patent Assertion arises in whole or
19 in part from any software, technology, product or service of Facebook or any of
20 its subsidiaries or corporate affiliates, or (iii) against any party relating
21 to the Software. Notwithstanding the foregoing, if Facebook or any of its
22 subsidiaries or corporate affiliates files a lawsuit alleging patent
23 infringement against you in the first instance, and you respond by filing a
24 patent infringement counterclaim in that lawsuit against that party that is
25 unrelated to the Software, the license granted hereunder will not terminate
26 under section (i) of this paragraph due to such counterclaim.
27
28 A "Necessary Claim" is a claim of a patent owned by Facebook that is
29 necessarily infringed by the Software standing alone.
30
31 A "Patent Assertion" is any lawsuit or other action alleging direct, indirect,
32 or contributory infringement or inducement to infringe any patent, including a
33 cross-claim or counterclaim.
@@ -0,0 +1,414 b''
1 /* ******************************************************************
2 bitstream
3 Part of FSE library
4 header file (to include)
5 Copyright (C) 2013-2016, Yann Collet.
6
7 BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
8
9 Redistribution and use in source and binary forms, with or without
10 modification, are permitted provided that the following conditions are
11 met:
12
13 * Redistributions of source code must retain the above copyright
14 notice, this list of conditions and the following disclaimer.
15 * Redistributions in binary form must reproduce the above
16 copyright notice, this list of conditions and the following disclaimer
17 in the documentation and/or other materials provided with the
18 distribution.
19
20 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
21 "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
22 LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
23 A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
24 OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
25 SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
26 LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
27 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
28 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
29 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
30 OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
31
32 You can contact the author at :
33 - Source repository : https://github.com/Cyan4973/FiniteStateEntropy
34 ****************************************************************** */
35 #ifndef BITSTREAM_H_MODULE
36 #define BITSTREAM_H_MODULE
37
38 #if defined (__cplusplus)
39 extern "C" {
40 #endif
41
42
43 /*
44 * This API consists of small unitary functions, which must be inlined for best performance.
45 * Since link-time-optimization is not available for all compilers,
46 * these functions are defined in this header, to be included where needed.
47 */
48
49 /*-****************************************
50 * Dependencies
51 ******************************************/
52 #include "mem.h" /* unaligned access routines */
53 #include "error_private.h" /* error codes and messages */
54
55
56 /*=========================================
57 * Target specific
58 =========================================*/
59 #if defined(__BMI__) && defined(__GNUC__)
60 # include <immintrin.h> /* support for bextr (experimental) */
61 #endif
62
63
64 /*-******************************************
65 * bitStream encoding API (write forward)
66 ********************************************/
67 /* bitStream can mix input from multiple sources.
68 * A critical property of these streams is that they encode and decode in **reverse** direction.
69 * So the first bit sequence you add will be the last to be read, like a LIFO stack.
70 */
71 typedef struct
72 {
73 size_t bitContainer;
74 int bitPos;
75 char* startPtr;
76 char* ptr;
77 char* endPtr;
78 } BIT_CStream_t;
79
80 MEM_STATIC size_t BIT_initCStream(BIT_CStream_t* bitC, void* dstBuffer, size_t dstCapacity);
81 MEM_STATIC void BIT_addBits(BIT_CStream_t* bitC, size_t value, unsigned nbBits);
82 MEM_STATIC void BIT_flushBits(BIT_CStream_t* bitC);
83 MEM_STATIC size_t BIT_closeCStream(BIT_CStream_t* bitC);
84
85 /* Start with initCStream, providing the size of buffer to write into.
86 * bitStream will never write outside of this buffer.
87 * `dstCapacity` must be >= sizeof(bitC->bitContainer), otherwise @return will be an error code.
88 *
89 * bits are first added to a local register.
90 * Local register is size_t, hence 64-bits on 64-bits systems, or 32-bits on 32-bits systems.
91 * Writing data into memory is an explicit operation, performed by the flushBits function.
92 * Hence, keep track of how many bits are potentially stored in the local register to avoid register overflow.
93 * After a flushBits, a maximum of 7 bits might still be stored into local register.
94 *
95 * Avoid storing elements of more than 24 bits if you want compatibility with 32-bits bitstream readers.
96 *
97 * Last operation is to close the bitStream.
98 * The function returns the final size of CStream in bytes.
99 * If data couldn't fit into `dstBuffer`, it will return a 0 ( == not storable)
100 */
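/* Example : a minimal usage sketch of the encoding sequence above
 * (illustrative only; assumes caller-provided `dst` and `dstCapacity`,
 * error handling elided) :
 *
 *     BIT_CStream_t bitC;
 *     if (ERR_isError(BIT_initCStream(&bitC, dst, dstCapacity))) return 0;
 *     BIT_addBits(&bitC, 0x2A, 6);   // queue a 6-bit field in the local register
 *     BIT_addBits(&bitC, 1, 1);      // queue 1 more bit (7 bits pending)
 *     BIT_flushBits(&bitC);          // commit whole bytes to memory
 *     { size_t const cSize = BIT_closeCStream(&bitC);   // 0 == didn't fit
 *       if (cSize == 0) return 0; }
 */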
101
102
103 /*-********************************************
104 * bitStream decoding API (read backward)
105 **********************************************/
106 typedef struct
107 {
108 size_t bitContainer;
109 unsigned bitsConsumed;
110 const char* ptr;
111 const char* start;
112 } BIT_DStream_t;
113
114 typedef enum { BIT_DStream_unfinished = 0,
115 BIT_DStream_endOfBuffer = 1,
116 BIT_DStream_completed = 2,
117 BIT_DStream_overflow = 3 } BIT_DStream_status; /* result of BIT_reloadDStream() */
118 /* 1,2,4,8 would be better for bitmap combinations, but slows down performance a bit ... :( */
119
120 MEM_STATIC size_t BIT_initDStream(BIT_DStream_t* bitD, const void* srcBuffer, size_t srcSize);
121 MEM_STATIC size_t BIT_readBits(BIT_DStream_t* bitD, unsigned nbBits);
122 MEM_STATIC BIT_DStream_status BIT_reloadDStream(BIT_DStream_t* bitD);
123 MEM_STATIC unsigned BIT_endOfDStream(const BIT_DStream_t* bitD);
124
125
126 /* Start by invoking BIT_initDStream().
127 * A chunk of the bitStream is then stored into a local register.
128 * Local register size is 64-bits on 64-bits systems, 32-bits on 32-bits systems (size_t).
129 * You can then retrieve bitFields stored into the local register, **in reverse order**.
130 * Local register is explicitly reloaded from memory by the BIT_reloadDStream() method.
131 * A reload guarantees a minimum of ((8*sizeof(bitD->bitContainer))-7) bits when its result is BIT_DStream_unfinished.
132 * Otherwise, it can be less than that, so proceed accordingly.
133 * Checking if DStream has reached its end can be performed with BIT_endOfDStream().
134 */
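/* Example : a minimal decoding sketch, mirroring the encoder example above
 * (illustrative only; `src`/`srcSize` must describe the exact bitStream) :
 *
 *     BIT_DStream_t bitD;
 *     if (ERR_isError(BIT_initDStream(&bitD, src, srcSize))) return 0;
 *     { size_t const flag = BIT_readBits(&bitD, 1);   // last field written is read first
 *       size_t const v    = BIT_readBits(&bitD, 6);   // earlier field comes out second
 *       BIT_reloadDStream(&bitD);
 *       if (!BIT_endOfDStream(&bitD)) return 0; }     // expect exact end here
 */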
135
136
137 /*-****************************************
138 * unsafe API
139 ******************************************/
140 MEM_STATIC void BIT_addBitsFast(BIT_CStream_t* bitC, size_t value, unsigned nbBits);
141 /* faster, but works only if value is "clean", meaning all high bits above nbBits are 0 */
142
143 MEM_STATIC void BIT_flushBitsFast(BIT_CStream_t* bitC);
144 /* unsafe version; does not check buffer overflow */
145
146 MEM_STATIC size_t BIT_readBitsFast(BIT_DStream_t* bitD, unsigned nbBits);
147 /* faster, but works only if nbBits >= 1 */
148
149
150
151 /*-**************************************************************
152 * Internal functions
153 ****************************************************************/
154 MEM_STATIC unsigned BIT_highbit32 (register U32 val)
155 {
156 # if defined(_MSC_VER) /* Visual */
157 unsigned long r=0;
158 _BitScanReverse ( &r, val );
159 return (unsigned) r;
160 # elif defined(__GNUC__) && (__GNUC__ >= 3) /* Use GCC Intrinsic */
161 return 31 - __builtin_clz (val);
162 # else /* Software version */
163 static const unsigned DeBruijnClz[32] = { 0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30, 8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31 };
164 U32 v = val;
165 v |= v >> 1;
166 v |= v >> 2;
167 v |= v >> 4;
168 v |= v >> 8;
169 v |= v >> 16;
170 return DeBruijnClz[ (U32) (v * 0x07C4ACDDU) >> 27];
171 # endif
172 }
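/* Example : BIT_highbit32(1) == 0 ; BIT_highbit32(0x80000000) == 31.
 * In the software fallback, `v` is first smeared into a mask of the form 2^n-1,
 * then the De Bruijn multiply maps each of the 32 possible masks to a unique
 * table index. Note : val == 0 is not a valid input (the intrinsic paths are
 * undefined for 0). */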
173
174 /*===== Local Constants =====*/
175 static const unsigned BIT_mask[] = { 0, 1, 3, 7, 0xF, 0x1F, 0x3F, 0x7F, 0xFF, 0x1FF, 0x3FF, 0x7FF, 0xFFF, 0x1FFF, 0x3FFF, 0x7FFF, 0xFFFF, 0x1FFFF, 0x3FFFF, 0x7FFFF, 0xFFFFF, 0x1FFFFF, 0x3FFFFF, 0x7FFFFF, 0xFFFFFF, 0x1FFFFFF, 0x3FFFFFF }; /* up to 26 bits */
176
177
178 /*-**************************************************************
179 * bitStream encoding
180 ****************************************************************/
181 /*! BIT_initCStream() :
182 * `dstCapacity` must be > sizeof(void*)
183 * @return : 0 if success,
184 otherwise an error code (can be tested using ERR_isError() ) */
185 MEM_STATIC size_t BIT_initCStream(BIT_CStream_t* bitC, void* startPtr, size_t dstCapacity)
186 {
187 bitC->bitContainer = 0;
188 bitC->bitPos = 0;
189 bitC->startPtr = (char*)startPtr;
190 bitC->ptr = bitC->startPtr;
191 bitC->endPtr = bitC->startPtr + dstCapacity - sizeof(bitC->ptr);
192 if (dstCapacity <= sizeof(bitC->ptr)) return ERROR(dstSize_tooSmall);
193 return 0;
194 }
195
196 /*! BIT_addBits() :
197 can add up to 26 bits into `bitC`.
198 Does not check for register overflow ! */
199 MEM_STATIC void BIT_addBits(BIT_CStream_t* bitC, size_t value, unsigned nbBits)
200 {
201 bitC->bitContainer |= (value & BIT_mask[nbBits]) << bitC->bitPos;
202 bitC->bitPos += nbBits;
203 }
204
205 /*! BIT_addBitsFast() :
206 * works only if `value` is _clean_, meaning all high bits above nbBits are 0 */
207 MEM_STATIC void BIT_addBitsFast(BIT_CStream_t* bitC, size_t value, unsigned nbBits)
208 {
209 bitC->bitContainer |= value << bitC->bitPos;
210 bitC->bitPos += nbBits;
211 }
212
213 /*! BIT_flushBitsFast() :
214 * unsafe version; does not check buffer overflow */
215 MEM_STATIC void BIT_flushBitsFast(BIT_CStream_t* bitC)
216 {
217 size_t const nbBytes = bitC->bitPos >> 3;
218 MEM_writeLEST(bitC->ptr, bitC->bitContainer);
219 bitC->ptr += nbBytes;
220 bitC->bitPos &= 7;
221 bitC->bitContainer >>= nbBytes*8; /* if bitPos >= sizeof(bitContainer)*8 --> undefined behavior */
222 }
223
224 /*! BIT_flushBits() :
225 * safe version; checks for buffer overflow and prevents it.
226 * note : does not signal buffer overflow. This will be revealed later on using BIT_closeCStream() */
227 MEM_STATIC void BIT_flushBits(BIT_CStream_t* bitC)
228 {
229 size_t const nbBytes = bitC->bitPos >> 3;
230 MEM_writeLEST(bitC->ptr, bitC->bitContainer);
231 bitC->ptr += nbBytes;
232 if (bitC->ptr > bitC->endPtr) bitC->ptr = bitC->endPtr;
233 bitC->bitPos &= 7;
234 bitC->bitContainer >>= nbBytes*8; /* if bitPos >= sizeof(bitContainer)*8 --> undefined behavior */
235 }
236
237 /*! BIT_closeCStream() :
238 * @return : size of CStream, in bytes,
239 or 0 if it could not fit into dstBuffer */
240 MEM_STATIC size_t BIT_closeCStream(BIT_CStream_t* bitC)
241 {
242 BIT_addBitsFast(bitC, 1, 1); /* endMark */
243 BIT_flushBits(bitC);
244
245 if (bitC->ptr >= bitC->endPtr) return 0; /* doesn't fit within authorized budget : cancel */
246
247 return (bitC->ptr - bitC->startPtr) + (bitC->bitPos > 0);
248 }
249
250
251 /*-********************************************************
252 * bitStream decoding
253 **********************************************************/
254 /*! BIT_initDStream() :
255 * Initialize a BIT_DStream_t.
256 * `bitD` : a pointer to an already allocated BIT_DStream_t structure.
257 * `srcSize` must be the *exact* size of the bitStream, in bytes.
258 * @return : size of stream (== srcSize) or an errorCode if a problem is detected
259 */
260 MEM_STATIC size_t BIT_initDStream(BIT_DStream_t* bitD, const void* srcBuffer, size_t srcSize)
261 {
262 if (srcSize < 1) { memset(bitD, 0, sizeof(*bitD)); return ERROR(srcSize_wrong); }
263
264 if (srcSize >= sizeof(bitD->bitContainer)) { /* normal case */
265 bitD->start = (const char*)srcBuffer;
266 bitD->ptr = (const char*)srcBuffer + srcSize - sizeof(bitD->bitContainer);
267 bitD->bitContainer = MEM_readLEST(bitD->ptr);
268 { BYTE const lastByte = ((const BYTE*)srcBuffer)[srcSize-1];
269 bitD->bitsConsumed = lastByte ? 8 - BIT_highbit32(lastByte) : 0;
270 if (lastByte == 0) return ERROR(GENERIC); /* endMark not present */ }
271 } else {
272 bitD->start = (const char*)srcBuffer;
273 bitD->ptr = bitD->start;
274 bitD->bitContainer = *(const BYTE*)(bitD->start);
275 switch(srcSize)
276 {
277 case 7: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[6]) << (sizeof(bitD->bitContainer)*8 - 16); /* fall-through */
278 case 6: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[5]) << (sizeof(bitD->bitContainer)*8 - 24); /* fall-through */
279 case 5: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[4]) << (sizeof(bitD->bitContainer)*8 - 32); /* fall-through */
280 case 4: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[3]) << 24; /* fall-through */
281 case 3: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[2]) << 16; /* fall-through */
282 case 2: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[1]) << 8; /* fall-through */
283 default:;
284 }
285 { BYTE const lastByte = ((const BYTE*)srcBuffer)[srcSize-1];
286 bitD->bitsConsumed = lastByte ? 8 - BIT_highbit32(lastByte) : 0;
287 if (lastByte == 0) return ERROR(GENERIC); /* endMark not present */ }
288 bitD->bitsConsumed += (U32)(sizeof(bitD->bitContainer) - srcSize)*8;
289 }
290
291 return srcSize;
292 }
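/* Example : endMark handling above. If the stream's last byte is 0x24
 * (binary 00100100), its highest set bit is the endMark written by
 * BIT_closeCStream(), so bitsConsumed starts at 8 - BIT_highbit32(0x24)
 * = 8 - 5 = 3 : the two padding zero bits plus the endMark itself are
 * skipped before the first real field is read. A last byte of 0 carries
 * no endMark, hence the ERROR(GENERIC). */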
293
294 MEM_STATIC size_t BIT_getUpperBits(size_t bitContainer, U32 const start)
295 {
296 return bitContainer >> start;
297 }
298
299 MEM_STATIC size_t BIT_getMiddleBits(size_t bitContainer, U32 const start, U32 const nbBits)
300 {
301 #if defined(__BMI__) && defined(__GNUC__) /* experimental */
302 # if defined(__x86_64__)
303 if (sizeof(bitContainer)==8)
304 return _bextr_u64(bitContainer, start, nbBits);
305 else
306 # endif
307 return _bextr_u32(bitContainer, start, nbBits);
308 #else
309 return (bitContainer >> start) & BIT_mask[nbBits];
310 #endif
311 }
312
313 MEM_STATIC size_t BIT_getLowerBits(size_t bitContainer, U32 const nbBits)
314 {
315 return bitContainer & BIT_mask[nbBits];
316 }
317
318 /*! BIT_lookBits() :
319 * Provides next n bits from local register.
320 * local register is not modified.
321 * On 32-bits, maxNbBits==24.
322 * On 64-bits, maxNbBits==56.
323 * @return : value extracted
324 */
325 MEM_STATIC size_t BIT_lookBits(const BIT_DStream_t* bitD, U32 nbBits)
326 {
327 #if defined(__BMI__) && defined(__GNUC__) /* experimental; fails if bitD->bitsConsumed + nbBits > sizeof(bitD->bitContainer)*8 */
328 return BIT_getMiddleBits(bitD->bitContainer, (sizeof(bitD->bitContainer)*8) - bitD->bitsConsumed - nbBits, nbBits);
329 #else
330 U32 const bitMask = sizeof(bitD->bitContainer)*8 - 1;
331 return ((bitD->bitContainer << (bitD->bitsConsumed & bitMask)) >> 1) >> ((bitMask-nbBits) & bitMask);
332 #endif
333 }
334
335 /*! BIT_lookBitsFast() :
336 * unsafe version; works only if nbBits >= 1 */
337 MEM_STATIC size_t BIT_lookBitsFast(const BIT_DStream_t* bitD, U32 nbBits)
338 {
339 U32 const bitMask = sizeof(bitD->bitContainer)*8 - 1;
340 return (bitD->bitContainer << (bitD->bitsConsumed & bitMask)) >> (((bitMask+1)-nbBits) & bitMask);
341 }
342
343 MEM_STATIC void BIT_skipBits(BIT_DStream_t* bitD, U32 nbBits)
344 {
345 bitD->bitsConsumed += nbBits;
346 }
347
348 /*! BIT_readBits() :
349 * Read (consume) next n bits from local register and update.
350 * Take care not to read more bits than the local register currently holds.
351 * @return : extracted value.
352 */
353 MEM_STATIC size_t BIT_readBits(BIT_DStream_t* bitD, U32 nbBits)
354 {
355 size_t const value = BIT_lookBits(bitD, nbBits);
356 BIT_skipBits(bitD, nbBits);
357 return value;
358 }
359
360 /*! BIT_readBitsFast() :
361 * unsafe version; works only if nbBits >= 1 */
362 MEM_STATIC size_t BIT_readBitsFast(BIT_DStream_t* bitD, U32 nbBits)
363 {
364 size_t const value = BIT_lookBitsFast(bitD, nbBits);
365 BIT_skipBits(bitD, nbBits);
366 return value;
367 }
368
369 /*! BIT_reloadDStream() :
370 * Refill `BIT_DStream_t` from src buffer previously defined (see BIT_initDStream() ).
371 * This function is safe : it guarantees it will not read beyond the src buffer.
372 * @return : status of `BIT_DStream_t` internal register.
373 if status == unfinished, internal register is filled with >= (sizeof(bitD->bitContainer)*8 - 7) bits */
374 MEM_STATIC BIT_DStream_status BIT_reloadDStream(BIT_DStream_t* bitD)
375 {
376 if (bitD->bitsConsumed > (sizeof(bitD->bitContainer)*8)) /* should not happen => corruption detected */
377 return BIT_DStream_overflow;
378
379 if (bitD->ptr >= bitD->start + sizeof(bitD->bitContainer)) {
380 bitD->ptr -= bitD->bitsConsumed >> 3;
381 bitD->bitsConsumed &= 7;
382 bitD->bitContainer = MEM_readLEST(bitD->ptr);
383 return BIT_DStream_unfinished;
384 }
385 if (bitD->ptr == bitD->start) {
386 if (bitD->bitsConsumed < sizeof(bitD->bitContainer)*8) return BIT_DStream_endOfBuffer;
387 return BIT_DStream_completed;
388 }
389 { U32 nbBytes = bitD->bitsConsumed >> 3;
390 BIT_DStream_status result = BIT_DStream_unfinished;
391 if (bitD->ptr - nbBytes < bitD->start) {
392 nbBytes = (U32)(bitD->ptr - bitD->start); /* ptr > start */
393 result = BIT_DStream_endOfBuffer;
394 }
395 bitD->ptr -= nbBytes;
396 bitD->bitsConsumed -= nbBytes*8;
397 bitD->bitContainer = MEM_readLEST(bitD->ptr); /* reminder : srcSize > sizeof(bitD->bitContainer) */
398 return result;
399 }
400 }
401
402 /*! BIT_endOfDStream() :
403 * @return Tells if DStream has exactly reached its end (all bits consumed).
404 */
405 MEM_STATIC unsigned BIT_endOfDStream(const BIT_DStream_t* DStream)
406 {
407 return ((DStream->ptr == DStream->start) && (DStream->bitsConsumed == sizeof(DStream->bitContainer)*8));
408 }
409
410 #if defined (__cplusplus)
411 }
412 #endif
413
414 #endif /* BITSTREAM_H_MODULE */
@@ -0,0 +1,225 b''
1 /*
2 Common functions of New Generation Entropy library
3 Copyright (C) 2016, Yann Collet.
4
5 BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
6
7 Redistribution and use in source and binary forms, with or without
8 modification, are permitted provided that the following conditions are
9 met:
10
11 * Redistributions of source code must retain the above copyright
12 notice, this list of conditions and the following disclaimer.
13 * Redistributions in binary form must reproduce the above
14 copyright notice, this list of conditions and the following disclaimer
15 in the documentation and/or other materials provided with the
16 distribution.
17
18 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
19 "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
20 LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
21 A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
22 OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
23 SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
24 LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
25 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
26 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
27 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
28 OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
29
30 You can contact the author at :
31 - FSE+HUF source repository : https://github.com/Cyan4973/FiniteStateEntropy
32 - Public forum : https://groups.google.com/forum/#!forum/lz4c
33 *************************************************************************** */
34
35 /* *************************************
36 * Dependencies
37 ***************************************/
38 #include "mem.h"
39 #include "error_private.h" /* ERR_*, ERROR */
40 #define FSE_STATIC_LINKING_ONLY /* FSE_MIN_TABLELOG */
41 #include "fse.h"
42 #define HUF_STATIC_LINKING_ONLY /* HUF_TABLELOG_ABSOLUTEMAX */
43 #include "huf.h"
44
45
46 /*-****************************************
47 * FSE Error Management
48 ******************************************/
49 unsigned FSE_isError(size_t code) { return ERR_isError(code); }
50
51 const char* FSE_getErrorName(size_t code) { return ERR_getErrorName(code); }
52
53
54 /* **************************************************************
55 * HUF Error Management
56 ****************************************************************/
57 unsigned HUF_isError(size_t code) { return ERR_isError(code); }
58
59 const char* HUF_getErrorName(size_t code) { return ERR_getErrorName(code); }
60
61
62 /*-**************************************************************
63 * FSE NCount encoding-decoding
64 ****************************************************************/
65 static short FSE_abs(short a) { return (short)(a<0 ? -a : a); }
66
67 size_t FSE_readNCount (short* normalizedCounter, unsigned* maxSVPtr, unsigned* tableLogPtr,
68 const void* headerBuffer, size_t hbSize)
69 {
70 const BYTE* const istart = (const BYTE*) headerBuffer;
71 const BYTE* const iend = istart + hbSize;
72 const BYTE* ip = istart;
73 int nbBits;
74 int remaining;
75 int threshold;
76 U32 bitStream;
77 int bitCount;
78 unsigned charnum = 0;
79 int previous0 = 0;
80
81 if (hbSize < 4) return ERROR(srcSize_wrong);
82 bitStream = MEM_readLE32(ip);
83 nbBits = (bitStream & 0xF) + FSE_MIN_TABLELOG; /* extract tableLog */
84 if (nbBits > FSE_TABLELOG_ABSOLUTE_MAX) return ERROR(tableLog_tooLarge);
85 bitStream >>= 4;
86 bitCount = 4;
87 *tableLogPtr = nbBits;
88 remaining = (1<<nbBits)+1;
89 threshold = 1<<nbBits;
90 nbBits++;
91
92 while ((remaining>1) & (charnum<=*maxSVPtr)) {
93 if (previous0) {
94 unsigned n0 = charnum;
95 while ((bitStream & 0xFFFF) == 0xFFFF) {
96 n0 += 24;
97 if (ip < iend-5) {
98 ip += 2;
99 bitStream = MEM_readLE32(ip) >> bitCount;
100 } else {
101 bitStream >>= 16;
102 bitCount += 16;
103 } }
104 while ((bitStream & 3) == 3) {
105 n0 += 3;
106 bitStream >>= 2;
107 bitCount += 2;
108 }
109 n0 += bitStream & 3;
110 bitCount += 2;
111 if (n0 > *maxSVPtr) return ERROR(maxSymbolValue_tooSmall);
112 while (charnum < n0) normalizedCounter[charnum++] = 0;
113 if ((ip <= iend-7) || (ip + (bitCount>>3) <= iend-4)) {
114 ip += bitCount>>3;
115 bitCount &= 7;
116 bitStream = MEM_readLE32(ip) >> bitCount;
117 } else {
118 bitStream >>= 2;
119 } }
120 { short const max = (short)((2*threshold-1)-remaining);
121 short count;
122
123 if ((bitStream & (threshold-1)) < (U32)max) {
124 count = (short)(bitStream & (threshold-1));
125 bitCount += nbBits-1;
126 } else {
127 count = (short)(bitStream & (2*threshold-1));
128 if (count >= threshold) count -= max;
129 bitCount += nbBits;
130 }
131
132 count--; /* extra accuracy */
133 remaining -= FSE_abs(count);
134 normalizedCounter[charnum++] = count;
135 previous0 = !count;
136 while (remaining < threshold) {
137 nbBits--;
138 threshold >>= 1;
139 }
140
141 if ((ip <= iend-7) || (ip + (bitCount>>3) <= iend-4)) {
142 ip += bitCount>>3;
143 bitCount &= 7;
144 } else {
145 bitCount -= (int)(8 * (iend - 4 - ip));
146 ip = iend - 4;
147 }
148 bitStream = MEM_readLE32(ip) >> (bitCount & 31);
149 } } /* while ((remaining>1) & (charnum<=*maxSVPtr)) */
150 if (remaining != 1) return ERROR(corruption_detected);
151 if (bitCount > 32) return ERROR(corruption_detected);
152 *maxSVPtr = charnum-1;
153
154 ip += (bitCount+7)>>3;
155 return ip-istart;
156 }
157
158
159 /*! HUF_readStats() :
160 Read compact Huffman tree, saved by HUF_writeCTable().
161 `huffWeight` is destination buffer.
162 @return : size read from `src`, or an error code.
163 Note : Needed by HUF_readCTable() and HUF_readDTableX?() .
164 */
165 size_t HUF_readStats(BYTE* huffWeight, size_t hwSize, U32* rankStats,
166 U32* nbSymbolsPtr, U32* tableLogPtr,
167 const void* src, size_t srcSize)
168 {
169 U32 weightTotal;
170 const BYTE* ip = (const BYTE*) src;
171 size_t iSize;
172 size_t oSize;
173
174 if (!srcSize) return ERROR(srcSize_wrong);
175 iSize = ip[0];
176 /* memset(huffWeight, 0, hwSize); */ /* not necessary, even though some analyzers complain ... */
177
178 if (iSize >= 128) { /* special header */
179 oSize = iSize - 127;
180 iSize = ((oSize+1)/2);
181 if (iSize+1 > srcSize) return ERROR(srcSize_wrong);
182 if (oSize >= hwSize) return ERROR(corruption_detected);
183 ip += 1;
184 { U32 n;
185 for (n=0; n<oSize; n+=2) {
186 huffWeight[n] = ip[n/2] >> 4;
187 huffWeight[n+1] = ip[n/2] & 15;
188 } } }
189 else { /* header compressed with FSE (normal case) */
190 if (iSize+1 > srcSize) return ERROR(srcSize_wrong);
191 oSize = FSE_decompress(huffWeight, hwSize-1, ip+1, iSize); /* max (hwSize-1) values decoded, as last one is implied */
192 if (FSE_isError(oSize)) return oSize;
193 }
194
195 /* collect weight stats */
196 memset(rankStats, 0, (HUF_TABLELOG_ABSOLUTEMAX + 1) * sizeof(U32));
197 weightTotal = 0;
198 { U32 n; for (n=0; n<oSize; n++) {
199 if (huffWeight[n] >= HUF_TABLELOG_ABSOLUTEMAX) return ERROR(corruption_detected);
200 rankStats[huffWeight[n]]++;
201 weightTotal += (1 << huffWeight[n]) >> 1;
202 } }
203 if (weightTotal == 0) return ERROR(corruption_detected);
204
205 /* get last non-null symbol weight (implied, total must be 2^n) */
206 { U32 const tableLog = BIT_highbit32(weightTotal) + 1;
207 if (tableLog > HUF_TABLELOG_ABSOLUTEMAX) return ERROR(corruption_detected);
208 *tableLogPtr = tableLog;
209 /* determine last weight */
210 { U32 const total = 1 << tableLog;
211 U32 const rest = total - weightTotal;
212 U32 const verif = 1 << BIT_highbit32(rest);
213 U32 const lastWeight = BIT_highbit32(rest) + 1;
214 if (verif != rest) return ERROR(corruption_detected); /* last value must be a clean power of 2 */
215 huffWeight[oSize] = (BYTE)lastWeight;
216 rankStats[lastWeight]++;
217 } }
218
219 /* check tree construction validity */
220 if ((rankStats[1] < 2) || (rankStats[1] & 1)) return ERROR(corruption_detected); /* by construction : at least 2 elts of rank 1, must be even */
221
222 /* results */
223 *nbSymbolsPtr = (U32)(oSize+1);
224 return iSize+1;
225 }
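/* Worked example of the implied-weight reconstruction above (illustrative) :
 * suppose the decoded weights are {2, 2, 1, 1}.
 * weightTotal = 2 + 2 + 1 + 1 = 6, from (1 << weight) >> 1 per symbol.
 * tableLog = BIT_highbit32(6) + 1 = 3, so total = 8 and rest = 8 - 6 = 2.
 * rest is a clean power of 2 (verif == rest), so the implied last symbol gets
 * lastWeight = BIT_highbit32(2) + 1 = 2, completing the 2^tableLog sum. */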
@@ -0,0 +1,43 b''
1 /**
2 * Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
3 * All rights reserved.
4 *
5 * This source code is licensed under the BSD-style license found in the
6 * LICENSE file in the root directory of this source tree. An additional grant
7 * of patent rights can be found in the PATENTS file in the same directory.
8 */
9
10 /* The purpose of this file is to have a single list of error strings embedded in binary */
11
12 #include "error_private.h"
13
14 const char* ERR_getErrorString(ERR_enum code)
15 {
16 static const char* const notErrorCode = "Unspecified error code";
17 switch( code )
18 {
19 case PREFIX(no_error): return "No error detected";
20 case PREFIX(GENERIC): return "Error (generic)";
21 case PREFIX(prefix_unknown): return "Unknown frame descriptor";
22 case PREFIX(version_unsupported): return "Version not supported";
23 case PREFIX(parameter_unknown): return "Unknown parameter type";
24 case PREFIX(frameParameter_unsupported): return "Unsupported frame parameter";
25 case PREFIX(frameParameter_unsupportedBy32bits): return "Frame parameter unsupported in 32-bits mode";
26 case PREFIX(frameParameter_windowTooLarge): return "Frame requires too much memory for decoding";
27 case PREFIX(compressionParameter_unsupported): return "Compression parameter is out of bound";
28 case PREFIX(init_missing): return "Context should be init first";
29 case PREFIX(memory_allocation): return "Allocation error : not enough memory";
30 case PREFIX(stage_wrong): return "Operation not authorized at current processing stage";
31 case PREFIX(dstSize_tooSmall): return "Destination buffer is too small";
32 case PREFIX(srcSize_wrong): return "Src size incorrect";
33 case PREFIX(corruption_detected): return "Corrupted block detected";
34 case PREFIX(checksum_wrong): return "Restored data doesn't match checksum";
35 case PREFIX(tableLog_tooLarge): return "tableLog requires too much memory : unsupported";
36 case PREFIX(maxSymbolValue_tooLarge): return "Unsupported max Symbol Value : too large";
37 case PREFIX(maxSymbolValue_tooSmall): return "Specified maxSymbolValue is too small";
38 case PREFIX(dictionary_corrupted): return "Dictionary is corrupted";
39 case PREFIX(dictionary_wrong): return "Dictionary mismatch";
40 case PREFIX(maxCode):
41 default: return notErrorCode;
42 }
43 }
@@ -0,0 +1,76 b''
1 /**
2 * Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
3 * All rights reserved.
4 *
5 * This source code is licensed under the BSD-style license found in the
6 * LICENSE file in the root directory of this source tree. An additional grant
7 * of patent rights can be found in the PATENTS file in the same directory.
8 */
9
10 /* Note : this module is expected to remain private, do not expose it */
11
12 #ifndef ERROR_H_MODULE
13 #define ERROR_H_MODULE
14
15 #if defined (__cplusplus)
16 extern "C" {
17 #endif
18
19
20 /* ****************************************
21 * Dependencies
22 ******************************************/
23 #include <stddef.h> /* size_t */
24 #include "zstd_errors.h" /* enum list */
25
26
27 /* ****************************************
28 * Compiler-specific
29 ******************************************/
30 #if defined(__GNUC__)
31 # define ERR_STATIC static __attribute__((unused))
32 #elif defined (__cplusplus) || (defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) /* C99 */)
33 # define ERR_STATIC static inline
34 #elif defined(_MSC_VER)
35 # define ERR_STATIC static __inline
36 #else
37 # define ERR_STATIC static /* this version may generate warnings for unused static functions; disable the relevant warning */
38 #endif
39
40
41 /*-****************************************
42 * Customization (error_public.h)
43 ******************************************/
44 typedef ZSTD_ErrorCode ERR_enum;
45 #define PREFIX(name) ZSTD_error_##name
46
47
48 /*-****************************************
49 * Error codes handling
50 ******************************************/
51 #ifdef ERROR
52 # undef ERROR /* reported already defined on VS 2015 (Rich Geldreich) */
53 #endif
54 #define ERROR(name) ((size_t)-PREFIX(name))
55
56 ERR_STATIC unsigned ERR_isError(size_t code) { return (code > ERROR(maxCode)); }
57
58 ERR_STATIC ERR_enum ERR_getErrorCode(size_t code) { if (!ERR_isError(code)) return (ERR_enum)0; return (ERR_enum) (0-code); }
59
60
61 /*-****************************************
62 * Error Strings
63 ******************************************/
64
65 const char* ERR_getErrorString(ERR_enum code); /* error_private.c */
66
67 ERR_STATIC const char* ERR_getErrorName(size_t code)
68 {
69 return ERR_getErrorString(ERR_getErrorCode(code));
70 }
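/* Example : typical call-site pattern for these helpers (sketch only;
 * someZstdOperation() is a hypothetical caller) :
 *
 *     size_t const r = someZstdOperation();
 *     if (ERR_isError(r))
 *         fprintf(stderr, "error : %s\n", ERR_getErrorName(r));
 */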
71
72 #if defined (__cplusplus)
73 }
74 #endif
75
76 #endif /* ERROR_H_MODULE */
@@ -0,0 +1,634 b''
1 /* ******************************************************************
2 FSE : Finite State Entropy codec
3 Public Prototypes declaration
4 Copyright (C) 2013-2016, Yann Collet.
5
6 BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
7
8 Redistribution and use in source and binary forms, with or without
9 modification, are permitted provided that the following conditions are
10 met:
11
12 * Redistributions of source code must retain the above copyright
13 notice, this list of conditions and the following disclaimer.
14 * Redistributions in binary form must reproduce the above
15 copyright notice, this list of conditions and the following disclaimer
16 in the documentation and/or other materials provided with the
17 distribution.
18
19 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
20 "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
21 LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
22 A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
23 OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
24 SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
25 LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
26 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
27 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
28 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
29 OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
30
31 You can contact the author at :
32 - Source repository : https://github.com/Cyan4973/FiniteStateEntropy
33 ****************************************************************** */
34 #ifndef FSE_H
35 #define FSE_H
36
37 #if defined (__cplusplus)
38 extern "C" {
39 #endif
40
41
42 /*-*****************************************
43 * Dependencies
44 ******************************************/
45 #include <stddef.h> /* size_t, ptrdiff_t */
46
47
48 /*-****************************************
49 * FSE simple functions
50 ******************************************/
51 /*! FSE_compress() :
52 Compress content of buffer 'src', of size 'srcSize', into destination buffer 'dst'.
53 'dst' buffer must be already allocated. Compression runs faster if dstCapacity >= FSE_compressBound(srcSize).
54 @return : size of compressed data (<= dstCapacity).
55 Special values : if return == 0, srcData is not compressible => Nothing is stored within dst !!!
56 if return == 1, srcData is a single byte symbol * srcSize times. Use RLE compression instead.
57 if FSE_isError(return), compression failed (more details using FSE_getErrorName())
58 */
59 size_t FSE_compress(void* dst, size_t dstCapacity,
60 const void* src, size_t srcSize);
61
62 /*! FSE_decompress():
63 Decompress FSE data from buffer 'cSrc', of size 'cSrcSize',
64 into already allocated destination buffer 'dst', of size 'dstCapacity'.
65 @return : size of regenerated data (<= maxDstSize),
66 or an error code, which can be tested using FSE_isError() .
67
68 ** Important ** : FSE_decompress() does not decompress non-compressible nor RLE data !!!
69 Why ? : making this distinction requires a header.
70 Header management is intentionally delegated to the user layer, which can better manage special cases.
71 */
72 size_t FSE_decompress(void* dst, size_t dstCapacity,
73 const void* cSrc, size_t cSrcSize);
74
75
76 /*-*****************************************
77 * Tool functions
78 ******************************************/
79 size_t FSE_compressBound(size_t size); /* maximum compressed size */
80
81 /* Error Management */
82 unsigned FSE_isError(size_t code); /* tells if a return value is an error code */
83 const char* FSE_getErrorName(size_t code); /* provides error code string (useful for debugging) */
84
85
86 /*-*****************************************
87 * FSE advanced functions
88 ******************************************/
89 /*! FSE_compress2() :
90 Same as FSE_compress(), but allows the selection of 'maxSymbolValue' and 'tableLog'
91 Both parameters can be defined as '0' to mean : use default value
92 @return : size of compressed data
93 Special values : if return == 0, srcData is not compressible => Nothing is stored within dst !!!
94 if return == 1, srcData is a single byte symbol * srcSize times. Use RLE compression.
95 if FSE_isError(return), it's an error code.
96 */
97 size_t FSE_compress2 (void* dst, size_t dstSize, const void* src, size_t srcSize, unsigned maxSymbolValue, unsigned tableLog);
98
99
100 /*-*****************************************
101 * FSE detailed API
102 ******************************************/
103 /*!
104 FSE_compress() does the following:
105 1. count symbol occurrence from source[] into table count[]
106 2. normalize counters so that sum(count[]) == Power_of_2 (2^tableLog)
107 3. save normalized counters to memory buffer using writeNCount()
108 4. build encoding table 'CTable' from normalized counters
109 5. encode the data stream using encoding table 'CTable'
110
111 FSE_decompress() does the following:
112 1. read normalized counters with readNCount()
113 2. build decoding table 'DTable' from normalized counters
114 3. decode the data stream using decoding table 'DTable'
115
116 The following API allows targeting specific sub-functions for advanced tasks.
117 For example, it's possible to compress several blocks using the same 'CTable',
118 or to save and provide normalized distribution using external method.
119 */
120
121 /* *** COMPRESSION *** */
122
123 /*! FSE_count():
124 Provides the precise count of each byte within a table 'count'.
125 'count' is a table of unsigned int, of minimum size (*maxSymbolValuePtr+1).
126 *maxSymbolValuePtr will be updated if detected smaller than initial value.
127 @return : the count of the most frequent symbol (which is not identified).
128 if return == srcSize, there is only one symbol.
129 Can also return an error code, which can be tested with FSE_isError(). */
130 size_t FSE_count(unsigned* count, unsigned* maxSymbolValuePtr, const void* src, size_t srcSize);
131
132 /*! FSE_optimalTableLog():
133 dynamically downsize 'tableLog' when conditions are met.
134 It saves CPU time, by using smaller tables, while preserving or even improving compression ratio.
135 @return : recommended tableLog (necessarily <= 'maxTableLog') */
136 unsigned FSE_optimalTableLog(unsigned maxTableLog, size_t srcSize, unsigned maxSymbolValue);
137
138 /*! FSE_normalizeCount():
139 normalize counts so that sum(count[]) == Power_of_2 (2^tableLog)
140 'normalizedCounter' is a table of short, of minimum size (maxSymbolValue+1).
141 @return : tableLog,
142 or an errorCode, which can be tested using FSE_isError() */
143 size_t FSE_normalizeCount(short* normalizedCounter, unsigned tableLog, const unsigned* count, size_t srcSize, unsigned maxSymbolValue);
144
145 /*! FSE_NCountWriteBound():
146 Provides the maximum possible size of an FSE normalized table, given 'maxSymbolValue' and 'tableLog'.
147 Typically useful for allocation purpose. */
148 size_t FSE_NCountWriteBound(unsigned maxSymbolValue, unsigned tableLog);
149
150 /*! FSE_writeNCount():
151 Compactly save 'normalizedCounter' into 'buffer'.
152 @return : size of the compressed table,
153 or an errorCode, which can be tested using FSE_isError(). */
154 size_t FSE_writeNCount (void* buffer, size_t bufferSize, const short* normalizedCounter, unsigned maxSymbolValue, unsigned tableLog);
155
156
157 /*! Constructor and Destructor of FSE_CTable.
158 Note that FSE_CTable size depends on 'tableLog' and 'maxSymbolValue' */
159 typedef unsigned FSE_CTable; /* don't allocate that. It's only meant to be more restrictive than void* */
160 FSE_CTable* FSE_createCTable (unsigned tableLog, unsigned maxSymbolValue);
161 void FSE_freeCTable (FSE_CTable* ct);
162
163 /*! FSE_buildCTable():
164 Builds `ct`, which must be already allocated, using FSE_createCTable().
165 @return : 0, or an errorCode, which can be tested using FSE_isError() */
166 size_t FSE_buildCTable(FSE_CTable* ct, const short* normalizedCounter, unsigned maxSymbolValue, unsigned tableLog);
167
168 /*! FSE_compress_usingCTable():
169 Compress `src` using `ct` into `dst` which must be already allocated.
170 @return : size of compressed data (<= `dstCapacity`),
171 or 0 if compressed data could not fit into `dst`,
172 or an errorCode, which can be tested using FSE_isError() */
173 size_t FSE_compress_usingCTable (void* dst, size_t dstCapacity, const void* src, size_t srcSize, const FSE_CTable* ct);
174
175 /*!
176 Tutorial :
177 ----------
178 The first step is to count all symbols. FSE_count() does this job very fast.
179 Result will be saved into 'count', a table of unsigned int, which must be already allocated, and have 'maxSymbolValuePtr[0]+1' cells.
180 'src' is a table of bytes of size 'srcSize'. All values within 'src' MUST be <= maxSymbolValuePtr[0]
181 maxSymbolValuePtr[0] will be updated, with its real value (necessarily <= original value)
182 FSE_count() will return the number of occurrences of the most frequent symbol.
183 This can be used to know if there is a single symbol within 'src', and to quickly evaluate its compressibility.
184 If there is an error, the function will return an ErrorCode (which can be tested using FSE_isError()).
185
186 The next step is to normalize the frequencies.
187 FSE_normalizeCount() will ensure that sum of frequencies is == 2 ^'tableLog'.
188 It also guarantees a minimum of 1 to any Symbol with frequency >= 1.
189 You can use 'tableLog'==0 to mean "use default tableLog value".
190 If you are unsure of which tableLog value to use, you can ask FSE_optimalTableLog(),
191 which will provide the optimal valid tableLog given sourceSize, maxSymbolValue, and a user-defined maximum (0 means "default").
192
193 The result of FSE_normalizeCount() will be saved into a table,
194 called 'normalizedCounter', which is a table of signed short.
195 'normalizedCounter' must be already allocated, and have at least 'maxSymbolValue+1' cells.
196 The return value is tableLog if everything proceeded as expected.
197 It is 0 if there is a single symbol within distribution.
198 If there is an error (ex: invalid tableLog value), the function will return an ErrorCode (which can be tested using FSE_isError()).
199
200 'normalizedCounter' can be saved in a compact manner to a memory area using FSE_writeNCount().
201 'buffer' must be already allocated.
202 For guaranteed success, buffer size must be at least FSE_NCountWriteBound(maxSymbolValue, tableLog).
203 The result of the function is the number of bytes written into 'buffer'.
204 If there is an error, the function will return an ErrorCode (which can be tested using FSE_isError(); ex : buffer size too small).
205
206 'normalizedCounter' can then be used to create the compression table 'CTable'.
207 The space required by 'CTable' must be already allocated, using FSE_createCTable().
208 You can then use FSE_buildCTable() to fill 'CTable'.
209 If there is an error, both functions will return an ErrorCode (which can be tested using FSE_isError()).
210
211 'CTable' can then be used to compress 'src', with FSE_compress_usingCTable().
212 Similar to FSE_count(), the convention is that 'src' is assumed to be a table of char of size 'srcSize'
213 The function returns the size of compressed data (without header), necessarily <= `dstCapacity`.
214 If it returns '0', compressed data could not fit into 'dst'.
215 If there is an error, the function will return an ErrorCode (which can be tested using FSE_isError()).
216 */
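/* Condensed sketch of the compression tutorial above (illustrative only;
   error checks elided; assumes `src`/`srcSize` input and a `dst` buffer of
   at least FSE_compressBound(srcSize) bytes) :

    unsigned count[256]; short norm[256];
    unsigned maxSymbol = 255; unsigned tableLog;
    FSE_count(count, &maxSymbol, src, srcSize);
    tableLog = FSE_optimalTableLog(0, srcSize, maxSymbol);
    tableLog = (unsigned)FSE_normalizeCount(norm, tableLog, count, srcSize, maxSymbol);
    { size_t const hSize = FSE_writeNCount(dst, dstCapacity, norm, maxSymbol, tableLog);
      FSE_CTable* const ct = FSE_createCTable(tableLog, maxSymbol);
      FSE_buildCTable(ct, norm, maxSymbol, tableLog);
      { size_t const cSize = FSE_compress_usingCTable((char*)dst + hSize,
                                  dstCapacity - hSize, src, srcSize, ct);
        // total output = hSize + cSize (cSize == 0 means it didn't fit)
      }
      FSE_freeCTable(ct); }
*/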
217
218
219 /* *** DECOMPRESSION *** */
220
221 /*! FSE_readNCount():
222 Read compactly saved 'normalizedCounter' from 'rBuffer'.
223 @return : size read from 'rBuffer',
224 or an errorCode, which can be tested using FSE_isError().
225 maxSymbolValuePtr[0] and tableLogPtr[0] will also be updated with their respective values */
226 size_t FSE_readNCount (short* normalizedCounter, unsigned* maxSymbolValuePtr, unsigned* tableLogPtr, const void* rBuffer, size_t rBuffSize);
227
228 /*! Constructor and Destructor of FSE_DTable.
229 Note that its size depends on 'tableLog' */
230 typedef unsigned FSE_DTable; /* don't allocate that. It's just a way to be more restrictive than void* */
231 FSE_DTable* FSE_createDTable(unsigned tableLog);
232 void FSE_freeDTable(FSE_DTable* dt);
233
234 /*! FSE_buildDTable():
235 Builds 'dt', which must be already allocated, using FSE_createDTable().
236 return : 0, or an errorCode, which can be tested using FSE_isError() */
237 size_t FSE_buildDTable (FSE_DTable* dt, const short* normalizedCounter, unsigned maxSymbolValue, unsigned tableLog);
238
239 /*! FSE_decompress_usingDTable():
240 Decompress compressed source `cSrc` of size `cSrcSize` using `dt`
241 into `dst` which must be already allocated.
242 @return : size of regenerated data (necessarily <= `dstCapacity`),
243 or an errorCode, which can be tested using FSE_isError() */
244 size_t FSE_decompress_usingDTable(void* dst, size_t dstCapacity, const void* cSrc, size_t cSrcSize, const FSE_DTable* dt);
245
246 /*!
247 Tutorial :
248 ----------
249 (Note : these functions only decompress FSE-compressed blocks.
250 If block is uncompressed, use memcpy() instead
251 If block is a single repeated byte, use memset() instead )
252
253 The first step is to obtain the normalized frequencies of symbols.
254 This can be performed by FSE_readNCount() if it was saved using FSE_writeNCount().
255 'normalizedCounter' must be already allocated, and have at least 'maxSymbolValuePtr[0]+1' cells of signed short.
256 In practice, that means it's necessary to know 'maxSymbolValue' beforehand,
257 or size the table to handle worst case situations (typically 256).
258 FSE_readNCount() will provide 'tableLog' and 'maxSymbolValue'.
259 The result of FSE_readNCount() is the number of bytes read from 'rBuffer'.
260 Note that 'rBuffSize' must be at least 4 bytes, even if useful information is less than that.
261 If there is an error, the function will return an error code, which can be tested using FSE_isError().
262
263 The next step is to build the decompression tables 'FSE_DTable' from 'normalizedCounter'.
264 This is performed by the function FSE_buildDTable().
265 The space required by 'FSE_DTable' must be already allocated using FSE_createDTable().
266 If there is an error, the function will return an error code, which can be tested using FSE_isError().
267
268 `FSE_DTable` can then be used to decompress `cSrc`, with FSE_decompress_usingDTable().
269 `cSrcSize` must be strictly correct, otherwise decompression will fail.
270 FSE_decompress_usingDTable() result will tell how many bytes were regenerated (<=`dstCapacity`).
271 If there is an error, the function will return an error code, which can be tested using FSE_isError(). (ex: dst buffer too small)
272 */
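/* Condensed sketch of the decompression tutorial above (illustrative only;
   error checks elided; `cSrc`/`cSrcSize` hold the NCount header followed by
   the compressed stream, `dst`/`dstCapacity` receive the output) :

    short norm[256]; unsigned maxSymbol = 255; unsigned tableLog;
    { size_t const hSize = FSE_readNCount(norm, &maxSymbol, &tableLog, cSrc, cSrcSize);
      FSE_DTable* const dt = FSE_createDTable(tableLog);
      FSE_buildDTable(dt, norm, maxSymbol, tableLog);
      { size_t const dSize = FSE_decompress_usingDTable(dst, dstCapacity,
                                  (const char*)cSrc + hSize, cSrcSize - hSize, dt);
        // dSize = regenerated bytes, or an error code (FSE_isError())
      }
      FSE_freeDTable(dt); }
*/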
273
274
275 #ifdef FSE_STATIC_LINKING_ONLY
276
277 /* *** Dependency *** */
278 #include "bitstream.h"
279
280
281 /* *****************************************
282 * Static allocation
283 *******************************************/
284 /* FSE buffer bounds */
285 #define FSE_NCOUNTBOUND 512
286 #define FSE_BLOCKBOUND(size) (size + (size>>7))
287 #define FSE_COMPRESSBOUND(size) (FSE_NCOUNTBOUND + FSE_BLOCKBOUND(size)) /* Macro version, useful for static allocation */
288
289 /* It is possible to statically allocate FSE CTable/DTable as a table of unsigned using below macros */
290 #define FSE_CTABLE_SIZE_U32(maxTableLog, maxSymbolValue) (1 + (1<<(maxTableLog-1)) + ((maxSymbolValue+1)*2))
291 #define FSE_DTABLE_SIZE_U32(maxTableLog) (1 + (1<<maxTableLog))
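/* Example (sketch) : static allocation of a DTable for tableLog 12,
 * avoiding FSE_createDTable()'s heap allocation :
 *
 *     FSE_DTable dTableSpace[FSE_DTABLE_SIZE_U32(12)];   // (1 + 2^12) unsigned, ~16 KB
 *     FSE_buildDTable(dTableSpace, normalizedCounter, maxSymbolValue, 12);
 */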
292
293
294 /* *****************************************
295 * FSE advanced API
296 *******************************************/
297 size_t FSE_countFast(unsigned* count, unsigned* maxSymbolValuePtr, const void* src, size_t srcSize);
298 /**< same as FSE_count(), but blindly trusts that all byte values within src are <= *maxSymbolValuePtr */
299
300 unsigned FSE_optimalTableLog_internal(unsigned maxTableLog, size_t srcSize, unsigned maxSymbolValue, unsigned minus);
301 /**< same as FSE_optimalTableLog(), which uses `minus==2` */
302
303 size_t FSE_buildCTable_raw (FSE_CTable* ct, unsigned nbBits);
304 /**< build a fake FSE_CTable, designed to not compress an input, where each symbol uses nbBits */
305
306 size_t FSE_buildCTable_rle (FSE_CTable* ct, unsigned char symbolValue);
307 /**< build a fake FSE_CTable, designed to always compress the same symbolValue */
308
309 size_t FSE_buildDTable_raw (FSE_DTable* dt, unsigned nbBits);
310 /**< build a fake FSE_DTable, designed to read an uncompressed bitstream where each symbol uses nbBits */
311
312 size_t FSE_buildDTable_rle (FSE_DTable* dt, unsigned char symbolValue);
313 /**< build a fake FSE_DTable, designed to always generate the same symbolValue */
314
315
316 /* *****************************************
317 * FSE symbol compression API
318 *******************************************/
319 /*!
320 This API consists of small unitary functions, which highly benefit from being inlined.
321 You will want to enable link-time-optimization to ensure these functions are properly inlined in your binary.
322 Visual Studio seems to do it automatically.
323 For gcc or clang, you'll need to add -flto flag at compilation and linking stages.
324 If none of these solutions is applicable, include "fse.c" directly.
325 */
326 typedef struct
327 {
328 ptrdiff_t value;
329 const void* stateTable;
330 const void* symbolTT;
331 unsigned stateLog;
332 } FSE_CState_t;
333
334 static void FSE_initCState(FSE_CState_t* CStatePtr, const FSE_CTable* ct);
335
336 static void FSE_encodeSymbol(BIT_CStream_t* bitC, FSE_CState_t* CStatePtr, unsigned symbol);
337
338 static void FSE_flushCState(BIT_CStream_t* bitC, const FSE_CState_t* CStatePtr);
339
340 /**<
341 These functions are inner components of FSE_compress_usingCTable().
342 They allow the creation of custom streams, mixing multiple tables and bit sources.
343
344 A key property to keep in mind is that encoding and decoding are done **in reverse direction**.
345 So the first symbol you will encode is the last you will decode, like a LIFO stack.
346
347 You will need a few variables to track your CStream. They are :
348
349 FSE_CTable ct; // Provided by FSE_buildCTable()
350 BIT_CStream_t bitStream; // bitStream tracking structure
351 FSE_CState_t state; // State tracking structure (can have several)
352
353
354 The first thing to do is to init bitStream and state.
355 size_t errorCode = BIT_initCStream(&bitStream, dstBuffer, maxDstSize);
356 FSE_initCState(&state, ct);
357
358 Note that BIT_initCStream() can produce an error code, so its result should be tested, using FSE_isError();
359 You can then encode your input data, byte after byte.
360 FSE_encodeSymbol() outputs a maximum of 'tableLog' bits at a time.
361 Remember decoding will be done in reverse direction.
362 FSE_encodeSymbol(&bitStream, &state, symbol);
363
364 At any time, you can also add any bit sequence.
365 Note : maximum allowed nbBits is 25, for compatibility with 32-bits decoders
366 BIT_addBits(&bitStream, bitField, nbBits);
367
368 The above methods don't commit data to memory, they just store it into local register, for speed.
369 Local register size is 64-bits on 64-bits systems, 32-bits on 32-bits systems (size_t).
370 Writing data to memory is a manual operation, performed by the flushBits function.
371 BIT_flushBits(&bitStream);
372
373 Your last FSE encoding operation shall be to flush your last state value(s).
374 FSE_flushCState(&bitStream, &state);
375
376 Finally, you must close the bitStream.
377 The function returns the size of CStream in bytes.
378 If data couldn't fit into dstBuffer, it will return a 0 ( == not compressible)
379 If there is an error, it returns an errorCode (which can be tested using FSE_isError()).
380 size_t size = BIT_closeCStream(&bitStream);
381 */
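/* Condensed sketch of the loop described above (illustrative; the input is
   walked backwards because decoding replays symbols in reverse; error
   checks elided; flushing after every symbol is safe but not optimal) :

    const unsigned char* ip = (const unsigned char*)src + srcSize;
    BIT_CStream_t bitStream;
    FSE_CState_t state;
    BIT_initCStream(&bitStream, dst, dstCapacity);
    FSE_initCState(&state, ct);
    while (ip > (const unsigned char*)src) {
        FSE_encodeSymbol(&bitStream, &state, *--ip);
        BIT_flushBits(&bitStream);
    }
    FSE_flushCState(&bitStream, &state);
    { size_t const cSize = BIT_closeCStream(&bitStream); }   // 0 == didn't fit
*/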
382
383
384 /* *****************************************
385 * FSE symbol decompression API
386 *******************************************/
387 typedef struct
388 {
389 size_t state;
390 const void* table; /* precise table may vary, depending on U16 */
391 } FSE_DState_t;
392
393
394 static void FSE_initDState(FSE_DState_t* DStatePtr, BIT_DStream_t* bitD, const FSE_DTable* dt);
395
396 static unsigned char FSE_decodeSymbol(FSE_DState_t* DStatePtr, BIT_DStream_t* bitD);
397
398 static unsigned FSE_endOfDState(const FSE_DState_t* DStatePtr);
399
400 /**<
401 Let's now decompose FSE_decompress_usingDTable() into its unitary components.
402 You will decode FSE-encoded symbols from the bitStream,
403 and also any other bitFields you put in, **in reverse order**.
404
405 You will need a few variables to track your bitStream. They are :
406
407 BIT_DStream_t DStream; // Stream context
408 FSE_DState_t DState; // State context. Multiple ones are possible
409 FSE_DTable* DTablePtr; // Decoding table, provided by FSE_buildDTable()
410
411 The first thing to do is to init the bitStream.
412 errorCode = BIT_initDStream(&DStream, srcBuffer, srcSize);
413
414 You should then retrieve your initial state(s)
415 (in reverse flushing order if you have several ones) :
416 errorCode = FSE_initDState(&DState, &DStream, DTablePtr);
417
418 You can then decode your data, symbol after symbol.
419 For information the maximum number of bits read by FSE_decodeSymbol() is 'tableLog'.
420 Keep in mind that symbols are decoded in reverse order, like a LIFO stack (last in, first out).
421 unsigned char symbol = FSE_decodeSymbol(&DState, &DStream);
422
423 You can retrieve any bitfield you eventually stored into the bitStream (in reverse order)
424 Note : maximum allowed nbBits is 25, for 32-bits compatibility
425 size_t bitField = BIT_readBits(&DStream, nbBits);
426
427 All above operations only read from the local register (whose size depends on size_t).
428 Refilling the register from memory is manually performed by the reload method.
429 endSignal = BIT_reloadDStream(&DStream);
430
431 BIT_reloadDStream() result tells if there is still some more data to read from DStream.
432 BIT_DStream_unfinished : there is still some data left into the DStream.
433 BIT_DStream_endOfBuffer : Dstream reached end of buffer. Its container may no longer be completely filled.
434 BIT_DStream_completed : Dstream reached its exact end, corresponding in general to decompression completed.
435 BIT_DStream_overflow : Dstream went too far. Decompression result is corrupted.
436
437 When reaching end of buffer (BIT_DStream_endOfBuffer), progress slowly, notably if you decode multiple symbols per loop,
438 to properly detect the exact end of stream.
439 After each decoded symbol, check if DStream is fully consumed using this simple test :
440 BIT_reloadDStream(&DStream) >= BIT_DStream_completed
441
442 When it's done, verify decompression is fully completed, by checking both DStream and the relevant states.
443 Checking if DStream has reached its end is performed by :
444 BIT_endOfDStream(&DStream);
445 Check also the states. There might be some symbols left there, if some high probability ones (>50%) are possible.
446 FSE_endOfDState(&DState);
447 */
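/* For illustration, a minimal sketch assembling the decoding steps above into
   one loop. Names (srcBuffer, dst, DTablePtr) are hypothetical, and error
   handling is reduced to the bare minimum :

       BIT_DStream_t DStream;
       FSE_DState_t DState;
       BYTE* op = (BYTE*)dst;
       size_t const initError = BIT_initDStream(&DStream, srcBuffer, srcSize);
       if (FSE_isError(initError)) return initError;
       FSE_initDState(&DState, &DStream, DTablePtr);
       do {
           *op++ = FSE_decodeSymbol(&DState, &DStream);   // dst bounds checks omitted
       } while (BIT_reloadDStream(&DStream) < BIT_DStream_completed);
       if (!BIT_endOfDStream(&DStream) || !FSE_endOfDState(&DState))
           return ERROR(GENERIC);   // stream or state not fully consumed
*/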
448
449
450 /* *****************************************
451 * FSE unsafe API
452 *******************************************/
453 static unsigned char FSE_decodeSymbolFast(FSE_DState_t* DStatePtr, BIT_DStream_t* bitD);
454 /* faster, but works only if nbBits is always >= 1 (otherwise, result will be corrupted) */
455
456
457 /* *****************************************
458 * Implementation of inlined functions
459 *******************************************/
460 typedef struct {
461 int deltaFindState;
462 U32 deltaNbBits;
463 } FSE_symbolCompressionTransform; /* total 8 bytes */
464
465 MEM_STATIC void FSE_initCState(FSE_CState_t* statePtr, const FSE_CTable* ct)
466 {
467 const void* ptr = ct;
468 const U16* u16ptr = (const U16*) ptr;
469 const U32 tableLog = MEM_read16(ptr);
470 statePtr->value = (ptrdiff_t)1<<tableLog;
471 statePtr->stateTable = u16ptr+2;
472 statePtr->symbolTT = ((const U32*)ct + 1 + (tableLog ? (1<<(tableLog-1)) : 1));
473 statePtr->stateLog = tableLog;
474 }
475
476
477 /*! FSE_initCState2() :
478 * Same as FSE_initCState(), but the first symbol to include (which will be the last to be read)
479 * uses the smallest state value possible, saving the cost of this symbol */
480 MEM_STATIC void FSE_initCState2(FSE_CState_t* statePtr, const FSE_CTable* ct, U32 symbol)
481 {
482 FSE_initCState(statePtr, ct);
483 { const FSE_symbolCompressionTransform symbolTT = ((const FSE_symbolCompressionTransform*)(statePtr->symbolTT))[symbol];
484 const U16* stateTable = (const U16*)(statePtr->stateTable);
485 U32 nbBitsOut = (U32)((symbolTT.deltaNbBits + (1<<15)) >> 16);
486 statePtr->value = (nbBitsOut << 16) - symbolTT.deltaNbBits;
487 statePtr->value = stateTable[(statePtr->value >> nbBitsOut) + symbolTT.deltaFindState];
488 }
489 }
490
491 MEM_STATIC void FSE_encodeSymbol(BIT_CStream_t* bitC, FSE_CState_t* statePtr, U32 symbol)
492 {
493 const FSE_symbolCompressionTransform symbolTT = ((const FSE_symbolCompressionTransform*)(statePtr->symbolTT))[symbol];
494 const U16* const stateTable = (const U16*)(statePtr->stateTable);
495 U32 nbBitsOut = (U32)((statePtr->value + symbolTT.deltaNbBits) >> 16);   /* nb of bits to flush for this symbol, given the current state */
496 BIT_addBits(bitC, statePtr->value, nbBitsOut);
497 statePtr->value = stateTable[ (statePtr->value >> nbBitsOut) + symbolTT.deltaFindState];   /* transition to the next state */
498 }
499
500 MEM_STATIC void FSE_flushCState(BIT_CStream_t* bitC, const FSE_CState_t* statePtr)
501 {
502 BIT_addBits(bitC, statePtr->value, statePtr->stateLog);
503 BIT_flushBits(bitC);
504 }
505
506
507 /* ====== Decompression ====== */
508
509 typedef struct {
510 U16 tableLog;
511 U16 fastMode;
512 } FSE_DTableHeader; /* sizeof U32 */
513
514 typedef struct
515 {
516 unsigned short newState;
517 unsigned char symbol;
518 unsigned char nbBits;
519 } FSE_decode_t; /* size == U32 */
520
521 MEM_STATIC void FSE_initDState(FSE_DState_t* DStatePtr, BIT_DStream_t* bitD, const FSE_DTable* dt)
522 {
523 const void* ptr = dt;
524 const FSE_DTableHeader* const DTableH = (const FSE_DTableHeader*)ptr;
525 DStatePtr->state = BIT_readBits(bitD, DTableH->tableLog);
526 BIT_reloadDStream(bitD);
527 DStatePtr->table = dt + 1;
528 }
529
530 MEM_STATIC BYTE FSE_peekSymbol(const FSE_DState_t* DStatePtr)
531 {
532 FSE_decode_t const DInfo = ((const FSE_decode_t*)(DStatePtr->table))[DStatePtr->state];
533 return DInfo.symbol;
534 }
535
536 MEM_STATIC void FSE_updateState(FSE_DState_t* DStatePtr, BIT_DStream_t* bitD)
537 {
538 FSE_decode_t const DInfo = ((const FSE_decode_t*)(DStatePtr->table))[DStatePtr->state];
539 U32 const nbBits = DInfo.nbBits;
540 size_t const lowBits = BIT_readBits(bitD, nbBits);
541 DStatePtr->state = DInfo.newState + lowBits;
542 }
543
544 MEM_STATIC BYTE FSE_decodeSymbol(FSE_DState_t* DStatePtr, BIT_DStream_t* bitD)
545 {
546 FSE_decode_t const DInfo = ((const FSE_decode_t*)(DStatePtr->table))[DStatePtr->state];
547 U32 const nbBits = DInfo.nbBits;
548 BYTE const symbol = DInfo.symbol;
549 size_t const lowBits = BIT_readBits(bitD, nbBits);
550
551 DStatePtr->state = DInfo.newState + lowBits;
552 return symbol;
553 }
554
555 /*! FSE_decodeSymbolFast() :
556 unsafe, only works if no symbol has a probability > 50% */
557 MEM_STATIC BYTE FSE_decodeSymbolFast(FSE_DState_t* DStatePtr, BIT_DStream_t* bitD)
558 {
559 FSE_decode_t const DInfo = ((const FSE_decode_t*)(DStatePtr->table))[DStatePtr->state];
560 U32 const nbBits = DInfo.nbBits;
561 BYTE const symbol = DInfo.symbol;
562 size_t const lowBits = BIT_readBitsFast(bitD, nbBits);
563
564 DStatePtr->state = DInfo.newState + lowBits;
565 return symbol;
566 }
567
568 MEM_STATIC unsigned FSE_endOfDState(const FSE_DState_t* DStatePtr)
569 {
570 return DStatePtr->state == 0;
571 }
572
573
574
575 #ifndef FSE_COMMONDEFS_ONLY
576
577 /* **************************************************************
578 * Tuning parameters
579 ****************************************************************/
580 /*!MEMORY_USAGE :
581 * Memory usage formula : N->2^N Bytes (examples : 10 -> 1KB; 12 -> 4KB ; 16 -> 64KB; 20 -> 1MB; etc.)
582 * Increasing memory usage improves compression ratio.
583 * Reduced memory usage can improve speed, thanks to cache effects.
584 * Recommended max value is 14, for 16 KB, which nicely fits into Intel x86 L1 cache */
585 #ifndef FSE_MAX_MEMORY_USAGE
586 # define FSE_MAX_MEMORY_USAGE 14
587 #endif
588 #ifndef FSE_DEFAULT_MEMORY_USAGE
589 # define FSE_DEFAULT_MEMORY_USAGE 13
590 #endif
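/* Both values can be overridden at build time ; for example (sketch) :
       cc -DFSE_MAX_MEMORY_USAGE=12 ...   => caps FSE tables at 2^12 = 4 KB
*/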
591
592 /*!FSE_MAX_SYMBOL_VALUE :
593 * Maximum symbol value authorized.
594 * Required for proper stack allocation */
595 #ifndef FSE_MAX_SYMBOL_VALUE
596 # define FSE_MAX_SYMBOL_VALUE 255
597 #endif
598
599 /* **************************************************************
600 * template functions type & suffix
601 ****************************************************************/
602 #define FSE_FUNCTION_TYPE BYTE
603 #define FSE_FUNCTION_EXTENSION
604 #define FSE_DECODE_TYPE FSE_decode_t
605
606
607 #endif /* !FSE_COMMONDEFS_ONLY */
608
609
610 /* ***************************************************************
611 * Constants
612 *****************************************************************/
613 #define FSE_MAX_TABLELOG (FSE_MAX_MEMORY_USAGE-2)
614 #define FSE_MAX_TABLESIZE (1U<<FSE_MAX_TABLELOG)
615 #define FSE_MAXTABLESIZE_MASK (FSE_MAX_TABLESIZE-1)
616 #define FSE_DEFAULT_TABLELOG (FSE_DEFAULT_MEMORY_USAGE-2)
617 #define FSE_MIN_TABLELOG 5
618
619 #define FSE_TABLELOG_ABSOLUTE_MAX 15
620 #if FSE_MAX_TABLELOG > FSE_TABLELOG_ABSOLUTE_MAX
621 # error "FSE_MAX_TABLELOG > FSE_TABLELOG_ABSOLUTE_MAX is not supported"
622 #endif
623
624 #define FSE_TABLESTEP(tableSize) ((tableSize>>1) + (tableSize>>3) + 3)
625
626
627 #endif /* FSE_STATIC_LINKING_ONLY */
628
629
630 #if defined (__cplusplus)
631 }
632 #endif
633
634 #endif /* FSE_H */
@@ -0,0 +1,329 b''
1 /* ******************************************************************
2 FSE : Finite State Entropy decoder
3 Copyright (C) 2013-2015, Yann Collet.
4
5 BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
6
7 Redistribution and use in source and binary forms, with or without
8 modification, are permitted provided that the following conditions are
9 met:
10
11 * Redistributions of source code must retain the above copyright
12 notice, this list of conditions and the following disclaimer.
13 * Redistributions in binary form must reproduce the above
14 copyright notice, this list of conditions and the following disclaimer
15 in the documentation and/or other materials provided with the
16 distribution.
17
18 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
19 "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
20 LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
21 A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
22 OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
23 SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
24 LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
25 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
26 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
27 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
28 OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
29
30 You can contact the author at :
31 - FSE source repository : https://github.com/Cyan4973/FiniteStateEntropy
32 - Public forum : https://groups.google.com/forum/#!forum/lz4c
33 ****************************************************************** */
34
35
36 /* **************************************************************
37 * Compiler specifics
38 ****************************************************************/
39 #ifdef _MSC_VER /* Visual Studio */
40 # define FORCE_INLINE static __forceinline
41 # include <intrin.h> /* For Visual 2005 */
42 # pragma warning(disable : 4127) /* disable: C4127: conditional expression is constant */
43 # pragma warning(disable : 4214) /* disable: C4214: non-int bitfields */
44 #else
45 # if defined (__cplusplus) || defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L /* C99 */
46 # ifdef __GNUC__
47 # define FORCE_INLINE static inline __attribute__((always_inline))
48 # else
49 # define FORCE_INLINE static inline
50 # endif
51 # else
52 # define FORCE_INLINE static
53 # endif /* __STDC_VERSION__ */
54 #endif
55
56
57 /* **************************************************************
58 * Includes
59 ****************************************************************/
60 #include <stdlib.h> /* malloc, free, qsort */
61 #include <string.h> /* memcpy, memset */
62 #include <stdio.h> /* printf (debug) */
63 #include "bitstream.h"
64 #define FSE_STATIC_LINKING_ONLY
65 #include "fse.h"
66
67
68 /* **************************************************************
69 * Error Management
70 ****************************************************************/
71 #define FSE_isError ERR_isError
72 #define FSE_STATIC_ASSERT(c) { enum { FSE_static_assert = 1/(int)(!!(c)) }; } /* use only *after* variable declarations */
73
74 /* check and forward error code */
75 #define CHECK_F(f) { size_t const e = f; if (FSE_isError(e)) return e; }
76
77
78 /* **************************************************************
79 * Complex types
80 ****************************************************************/
81 typedef U32 DTable_max_t[FSE_DTABLE_SIZE_U32(FSE_MAX_TABLELOG)];
82
83
84 /* **************************************************************
85 * Templates
86 ****************************************************************/
87 /*
88 designed to be included,
89 for type-specific functions (template emulation in C).
90 The objective is to write these functions only once, for improved maintenance.
91 */
92
93 /* safety checks */
94 #ifndef FSE_FUNCTION_EXTENSION
95 # error "FSE_FUNCTION_EXTENSION must be defined"
96 #endif
97 #ifndef FSE_FUNCTION_TYPE
98 # error "FSE_FUNCTION_TYPE must be defined"
99 #endif
100
101 /* Function names */
102 #define FSE_CAT(X,Y) X##Y
103 #define FSE_FUNCTION_NAME(X,Y) FSE_CAT(X,Y)
104 #define FSE_TYPE_NAME(X,Y) FSE_CAT(X,Y)
105
106
107 /* Function templates */
108 FSE_DTable* FSE_createDTable (unsigned tableLog)
109 {
110 if (tableLog > FSE_TABLELOG_ABSOLUTE_MAX) tableLog = FSE_TABLELOG_ABSOLUTE_MAX;
111 return (FSE_DTable*)malloc( FSE_DTABLE_SIZE_U32(tableLog) * sizeof (U32) );
112 }
113
114 void FSE_freeDTable (FSE_DTable* dt)
115 {
116 free(dt);
117 }
118
119 size_t FSE_buildDTable(FSE_DTable* dt, const short* normalizedCounter, unsigned maxSymbolValue, unsigned tableLog)
120 {
121 void* const tdPtr = dt+1; /* skip the header : *dt is a 32-bit unsigned, so dt+1 remains 32-bit aligned */
122 FSE_DECODE_TYPE* const tableDecode = (FSE_DECODE_TYPE*) (tdPtr);
123 U16 symbolNext[FSE_MAX_SYMBOL_VALUE+1];
124
125 U32 const maxSV1 = maxSymbolValue + 1;
126 U32 const tableSize = 1 << tableLog;
127 U32 highThreshold = tableSize-1;
128
129 /* Sanity Checks */
130 if (maxSymbolValue > FSE_MAX_SYMBOL_VALUE) return ERROR(maxSymbolValue_tooLarge);
131 if (tableLog > FSE_MAX_TABLELOG) return ERROR(tableLog_tooLarge);
132
133 /* Init, lay down lowprob symbols */
134 { FSE_DTableHeader DTableH;
135 DTableH.tableLog = (U16)tableLog;
136 DTableH.fastMode = 1;
137 { S16 const largeLimit= (S16)(1 << (tableLog-1));
138 U32 s;
139 for (s=0; s<maxSV1; s++) {
140 if (normalizedCounter[s]==-1) {
141 tableDecode[highThreshold--].symbol = (FSE_FUNCTION_TYPE)s;
142 symbolNext[s] = 1;
143 } else {
144 if (normalizedCounter[s] >= largeLimit) DTableH.fastMode=0;
145 symbolNext[s] = normalizedCounter[s];
146 } } }
147 memcpy(dt, &DTableH, sizeof(DTableH));
148 }
149
150 /* Spread symbols */
151 { U32 const tableMask = tableSize-1;
152 U32 const step = FSE_TABLESTEP(tableSize);
153 U32 s, position = 0;
154 for (s=0; s<maxSV1; s++) {
155 int i;
156 for (i=0; i<normalizedCounter[s]; i++) {
157 tableDecode[position].symbol = (FSE_FUNCTION_TYPE)s;
158 position = (position + step) & tableMask;
159 while (position > highThreshold) position = (position + step) & tableMask; /* lowprob area */
160 } }
161 if (position!=0) return ERROR(GENERIC); /* position must reach all cells once, otherwise normalizedCounter is incorrect */
162 }
163
164 /* Build Decoding table */
165 { U32 u;
166 for (u=0; u<tableSize; u++) {
167 FSE_FUNCTION_TYPE const symbol = (FSE_FUNCTION_TYPE)(tableDecode[u].symbol);
168 U16 nextState = symbolNext[symbol]++;
169 tableDecode[u].nbBits = (BYTE) (tableLog - BIT_highbit32 ((U32)nextState) );   /* nb of bits to read when leaving this cell */
170 tableDecode[u].newState = (U16) ( (nextState << tableDecode[u].nbBits) - tableSize);   /* base value of the destination state */
171 } }
172
173 return 0;
174 }
175
176
177 #ifndef FSE_COMMONDEFS_ONLY
178
179 /*-*******************************************************
180 * Decompression (Byte symbols)
181 *********************************************************/
182 size_t FSE_buildDTable_rle (FSE_DTable* dt, BYTE symbolValue)
183 {
184 void* ptr = dt;
185 FSE_DTableHeader* const DTableH = (FSE_DTableHeader*)ptr;
186 void* dPtr = dt + 1;
187 FSE_decode_t* const cell = (FSE_decode_t*)dPtr;
188
189 DTableH->tableLog = 0;
190 DTableH->fastMode = 0;
191
192 cell->newState = 0;
193 cell->symbol = symbolValue;
194 cell->nbBits = 0;
195
196 return 0;
197 }
198
199
200 size_t FSE_buildDTable_raw (FSE_DTable* dt, unsigned nbBits)
201 {
202 void* ptr = dt;
203 FSE_DTableHeader* const DTableH = (FSE_DTableHeader*)ptr;
204 void* dPtr = dt + 1;
205 FSE_decode_t* const dinfo = (FSE_decode_t*)dPtr;
206 const unsigned tableSize = 1 << nbBits;
207 const unsigned tableMask = tableSize - 1;
208 const unsigned maxSV1 = tableMask+1;
209 unsigned s;
210
211 /* Sanity checks */
212 if (nbBits < 1) return ERROR(GENERIC); /* min size */
213
214 /* Build Decoding Table */
215 DTableH->tableLog = (U16)nbBits;
216 DTableH->fastMode = 1;
217 for (s=0; s<maxSV1; s++) {
218 dinfo[s].newState = 0;
219 dinfo[s].symbol = (BYTE)s;
220 dinfo[s].nbBits = (BYTE)nbBits;
221 }
222
223 return 0;
224 }
225
226 FORCE_INLINE size_t FSE_decompress_usingDTable_generic(
227 void* dst, size_t maxDstSize,
228 const void* cSrc, size_t cSrcSize,
229 const FSE_DTable* dt, const unsigned fast)
230 {
231 BYTE* const ostart = (BYTE*) dst;
232 BYTE* op = ostart;
233 BYTE* const omax = op + maxDstSize;
234 BYTE* const olimit = omax-3;
235
236 BIT_DStream_t bitD;
237 FSE_DState_t state1;
238 FSE_DState_t state2;
239
240 /* Init */
241 CHECK_F(BIT_initDStream(&bitD, cSrc, cSrcSize));
242
243 FSE_initDState(&state1, &bitD, dt);
244 FSE_initDState(&state2, &bitD, dt);
245
246 #define FSE_GETSYMBOL(statePtr) fast ? FSE_decodeSymbolFast(statePtr, &bitD) : FSE_decodeSymbol(statePtr, &bitD)
247
248 /* 4 symbols per loop */
249 for ( ; (BIT_reloadDStream(&bitD)==BIT_DStream_unfinished) & (op<olimit) ; op+=4) {
250 op[0] = FSE_GETSYMBOL(&state1);
251
252 if (FSE_MAX_TABLELOG*2+7 > sizeof(bitD.bitContainer)*8) /* This test must be static */
253 BIT_reloadDStream(&bitD);
254
255 op[1] = FSE_GETSYMBOL(&state2);
256
257 if (FSE_MAX_TABLELOG*4+7 > sizeof(bitD.bitContainer)*8) /* This test must be static */
258 { if (BIT_reloadDStream(&bitD) > BIT_DStream_unfinished) { op+=2; break; } }
259
260 op[2] = FSE_GETSYMBOL(&state1);
261
262 if (FSE_MAX_TABLELOG*2+7 > sizeof(bitD.bitContainer)*8) /* This test must be static */
263 BIT_reloadDStream(&bitD);
264
265 op[3] = FSE_GETSYMBOL(&state2);
266 }
267
268 /* tail */
269 /* note : at this point, BIT_reloadDStream(&bitD) >= BIT_DStream_endOfBuffer ; the loop ends at exactly BIT_DStream_completed */
270 while (1) {
271 if (op>(omax-2)) return ERROR(dstSize_tooSmall);
272 *op++ = FSE_GETSYMBOL(&state1);
273 if (BIT_reloadDStream(&bitD)==BIT_DStream_overflow) {
274 *op++ = FSE_GETSYMBOL(&state2);
275 break;
276 }
277
278 if (op>(omax-2)) return ERROR(dstSize_tooSmall);
279 *op++ = FSE_GETSYMBOL(&state2);
280 if (BIT_reloadDStream(&bitD)==BIT_DStream_overflow) {
281 *op++ = FSE_GETSYMBOL(&state1);
282 break;
283 } }
284
285 return op-ostart;
286 }
287
288
289 size_t FSE_decompress_usingDTable(void* dst, size_t originalSize,
290 const void* cSrc, size_t cSrcSize,
291 const FSE_DTable* dt)
292 {
293 const void* ptr = dt;
294 const FSE_DTableHeader* DTableH = (const FSE_DTableHeader*)ptr;
295 const U32 fastMode = DTableH->fastMode;
296
297 /* select fast mode (static) */
298 if (fastMode) return FSE_decompress_usingDTable_generic(dst, originalSize, cSrc, cSrcSize, dt, 1);
299 return FSE_decompress_usingDTable_generic(dst, originalSize, cSrc, cSrcSize, dt, 0);
300 }
301
302
303 size_t FSE_decompress(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize)
304 {
305 const BYTE* const istart = (const BYTE*)cSrc;
306 const BYTE* ip = istart;
307 short counting[FSE_MAX_SYMBOL_VALUE+1];
308 DTable_max_t dt; /* Static analyzer seems unable to understand this table will be properly initialized later */
309 unsigned tableLog;
310 unsigned maxSymbolValue = FSE_MAX_SYMBOL_VALUE;
311
312 if (cSrcSize<2) return ERROR(srcSize_wrong); /* too small input size */
313
314 /* normal FSE decoding mode */
315 { size_t const NCountLength = FSE_readNCount (counting, &maxSymbolValue, &tableLog, istart, cSrcSize);
316 if (FSE_isError(NCountLength)) return NCountLength;
317 if (NCountLength >= cSrcSize) return ERROR(srcSize_wrong); /* too small input size */
318 ip += NCountLength;
319 cSrcSize -= NCountLength;
320 }
321
322 CHECK_F( FSE_buildDTable (dt, counting, maxSymbolValue, tableLog) );
323
324 return FSE_decompress_usingDTable (dst, maxDstSize, ip, cSrcSize, dt); /* always return, even if it is an error code */
325 }
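/* Example (sketch) : one-shot decompression with the simple API, assuming
   cBuf/cSize hold an FSE-compressed block and dst is large enough for the
   regenerated data (names hypothetical) :

       size_t const dSize = FSE_decompress(dst, dstCapacity, cBuf, cSize);
       if (FSE_isError(dSize)) { }   // handle error (corrupted or oversized input)
*/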
326
327
328
329 #endif /* FSE_COMMONDEFS_ONLY */
@@ -0,0 +1,228 b''
1 /* ******************************************************************
2 Huffman coder, part of New Generation Entropy library
3 header file
4 Copyright (C) 2013-2016, Yann Collet.
5
6 BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
7
8 Redistribution and use in source and binary forms, with or without
9 modification, are permitted provided that the following conditions are
10 met:
11
12 * Redistributions of source code must retain the above copyright
13 notice, this list of conditions and the following disclaimer.
14 * Redistributions in binary form must reproduce the above
15 copyright notice, this list of conditions and the following disclaimer
16 in the documentation and/or other materials provided with the
17 distribution.
18
19 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
20 "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
21 LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
22 A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
23 OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
24 SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
25 LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
26 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
27 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
28 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
29 OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
30
31 You can contact the author at :
32 - Source repository : https://github.com/Cyan4973/FiniteStateEntropy
33 ****************************************************************** */
34 #ifndef HUF_H_298734234
35 #define HUF_H_298734234
36
37 #if defined (__cplusplus)
38 extern "C" {
39 #endif
40
41
42 /* *** Dependencies *** */
43 #include <stddef.h> /* size_t */
44
45
46 /* *** simple functions *** */
47 /**
48 HUF_compress() :
49 Compress content from buffer 'src', of size 'srcSize', into buffer 'dst'.
50 'dst' buffer must be already allocated.
51 Compression runs faster if `dstCapacity` >= HUF_compressBound(srcSize).
52 `srcSize` must be <= `HUF_BLOCKSIZE_MAX` == 128 KB.
53 @return : size of compressed data (<= `dstCapacity`).
54 Special values : if return == 0, srcData is not compressible => Nothing is stored within dst !!!
55 if return == 1, srcData is a single repeated byte symbol (RLE compression).
56 if HUF_isError(return), compression failed (more details using HUF_getErrorName())
57 */
58 size_t HUF_compress(void* dst, size_t dstCapacity,
59 const void* src, size_t srcSize);
60
61 /**
62 HUF_decompress() :
63 Decompress HUF data from buffer 'cSrc', of size 'cSrcSize',
64 into already allocated buffer 'dst', of minimum size 'dstSize'.
65 `dstSize` : **must** be the ***exact*** size of original (uncompressed) data.
66 Note : in contrast with FSE, HUF_decompress can regenerate
67 RLE (cSrcSize==1) and uncompressed (cSrcSize==dstSize) data,
68 because it knows size to regenerate.
69 @return : size of regenerated data (== dstSize),
70 or an error code, which can be tested using HUF_isError()
71 */
72 size_t HUF_decompress(void* dst, size_t dstSize,
73 const void* cSrc, size_t cSrcSize);
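/* Example (sketch) : a simple-API round trip, assuming src/srcSize with
   srcSize <= HUF_BLOCKSIZE_MAX and caller-sized buffers (names hypothetical) :

       size_t const cSize = HUF_compress(cBuf, cBufCapacity, src, srcSize);
       if (HUF_isError(cSize)) { }   // compression failed
       if (cSize <= 1) { }           // 0 : not compressible ; 1 : RLE ; store raw instead
       { size_t const dSize = HUF_decompress(dst, srcSize, cBuf, cSize);
         if (HUF_isError(dSize)) { } }   // dstSize must be the exact original size
*/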
74
75
76 /* ****************************************
77 * Tool functions
78 ******************************************/
79 #define HUF_BLOCKSIZE_MAX (128 * 1024)
80 size_t HUF_compressBound(size_t size); /**< maximum compressed size (worst case) */
81
82 /* Error Management */
83 unsigned HUF_isError(size_t code); /**< tells if a return value is an error code */
84 const char* HUF_getErrorName(size_t code); /**< provides error code string (useful for debugging) */
85
86
87 /* *** Advanced function *** */
88
89 /** HUF_compress2() :
90 * Same as HUF_compress(), but offers direct control over `maxSymbolValue` and `tableLog` */
91 size_t HUF_compress2 (void* dst, size_t dstSize, const void* src, size_t srcSize, unsigned maxSymbolValue, unsigned tableLog);
92
93
94 #ifdef HUF_STATIC_LINKING_ONLY
95
96 /* *** Dependencies *** */
97 #include "mem.h" /* U32 */
98
99
100 /* *** Constants *** */
101 #define HUF_TABLELOG_ABSOLUTEMAX 16 /* absolute limit of HUF_TABLELOG_MAX. Beyond that value, code does not work */
102 #define HUF_TABLELOG_MAX 12 /* max configured tableLog (for static allocation); can be modified up to HUF_TABLELOG_ABSOLUTEMAX */
103 #define HUF_TABLELOG_DEFAULT 11 /* tableLog by default, when not specified */
104 #define HUF_SYMBOLVALUE_MAX 255
105 #if (HUF_TABLELOG_MAX > HUF_TABLELOG_ABSOLUTEMAX)
106 # error "HUF_TABLELOG_MAX is too large !"
107 #endif
108
109
110 /* ****************************************
111 * Static allocation
112 ******************************************/
113 /* HUF buffer bounds */
114 #define HUF_CTABLEBOUND 129
115 #define HUF_BLOCKBOUND(size) (size + (size>>8) + 8) /* only true if incompressible data is pre-filtered with a fast heuristic */
116 #define HUF_COMPRESSBOUND(size) (HUF_CTABLEBOUND + HUF_BLOCKBOUND(size)) /* Macro version, useful for static allocation */
117
118 /* static allocation of HUF's Compression Table */
119 #define HUF_CREATE_STATIC_CTABLE(name, maxSymbolValue) \
120 U32 name##hb[maxSymbolValue+1]; \
121 void* name##hv = &(name##hb); \
122 HUF_CElt* name = (HUF_CElt*)(name##hv) /* no final ; */
123
124 /* static allocation of HUF's DTable */
125 typedef U32 HUF_DTable;
126 #define HUF_DTABLE_SIZE(maxTableLog) (1 + (1<<(maxTableLog)))
127 #define HUF_CREATE_STATIC_DTABLEX2(DTable, maxTableLog) \
128 HUF_DTable DTable[HUF_DTABLE_SIZE((maxTableLog)-1)] = { ((U32)((maxTableLog)-1)*0x1000001) }
129 #define HUF_CREATE_STATIC_DTABLEX4(DTable, maxTableLog) \
130 HUF_DTable DTable[HUF_DTABLE_SIZE(maxTableLog)] = { ((U32)(maxTableLog)*0x1000001) }
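/* Example (sketch) : stack-allocating a double-symbols DTable at maximum size,
   then loading it from a compressed block (names hypothetical) :

       HUF_CREATE_STATIC_DTABLEX4(dtable, HUF_TABLELOG_MAX);
       size_t const hSize = HUF_readDTableX4(dtable, cSrc, cSrcSize);   // declared below
*/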
131
132
133 /* ****************************************
134 * Advanced decompression functions
135 ******************************************/
136 size_t HUF_decompress4X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< single-symbol decoder */
137 size_t HUF_decompress4X4 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< double-symbols decoder */
138
139 size_t HUF_decompress4X_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< decodes RLE and uncompressed */
140 size_t HUF_decompress4X_hufOnly(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< considers RLE and uncompressed as errors */
141 size_t HUF_decompress4X2_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< single-symbol decoder */
142 size_t HUF_decompress4X4_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< double-symbols decoder */
143
144 size_t HUF_decompress1X_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);
145 size_t HUF_decompress1X2_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< single-symbol decoder */
146 size_t HUF_decompress1X4_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< double-symbols decoder */
147
148
149 /* ****************************************
150 * HUF detailed API
151 ******************************************/
152 /*!
153 HUF_compress() does the following:
154 1. count symbol occurrence from source[] into table count[] using FSE_count()
155 2. (optional) refine tableLog using HUF_optimalTableLog()
156 3. build Huffman table from count using HUF_buildCTable()
157 4. save Huffman table to memory buffer using HUF_writeCTable()
158 5. encode the data stream using HUF_compress4X_usingCTable()
159
160 The following API allows targeting specific sub-functions for advanced tasks.
161 For example, it's possible to compress several blocks using the same 'CTable',
162 or to save and regenerate 'CTable' using external methods.
163 */
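/* Example (sketch) : compressing a block with the detailed API, following the
   steps above. FSE_count()'s (count, &maxSymbolValue, src, srcSize) signature
   is assumed from "fse.h" ; buffer names are hypothetical, error checks elided :

       unsigned count[HUF_SYMBOLVALUE_MAX+1];
       unsigned maxSymbolValue = HUF_SYMBOLVALUE_MAX;
       FSE_count(count, &maxSymbolValue, src, srcSize);
       { unsigned const tableLog = HUF_optimalTableLog(HUF_TABLELOG_DEFAULT, srcSize, maxSymbolValue);
         HUF_CREATE_STATIC_CTABLE(CTable, HUF_SYMBOLVALUE_MAX);
         HUF_buildCTable(CTable, count, maxSymbolValue, tableLog);
         { size_t const hSize = HUF_writeCTable(dst, dstCapacity, CTable, maxSymbolValue, tableLog);
           size_t const cSize = HUF_compress4X_usingCTable(dst+hSize, dstCapacity-hSize, src, srcSize, CTable);
         }   // the same CTable can be reused for further blocks
       }
*/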
164 /* FSE_count() : find it within "fse.h" */
165 unsigned HUF_optimalTableLog(unsigned maxTableLog, size_t srcSize, unsigned maxSymbolValue);
166 typedef struct HUF_CElt_s HUF_CElt; /* incomplete type */
167 size_t HUF_buildCTable (HUF_CElt* CTable, const unsigned* count, unsigned maxSymbolValue, unsigned maxNbBits);
168 size_t HUF_writeCTable (void* dst, size_t maxDstSize, const HUF_CElt* CTable, unsigned maxSymbolValue, unsigned huffLog);
169 size_t HUF_compress4X_usingCTable(void* dst, size_t dstSize, const void* src, size_t srcSize, const HUF_CElt* CTable);
170
171
172 /*! HUF_readStats() :
173 Read compact Huffman tree, saved by HUF_writeCTable().
174 `huffWeight` is destination buffer.
175 @return : size read from `src`, or an error code.
176 Note : needed by HUF_readCTable() and HUF_readDTableXn(). */
177 size_t HUF_readStats(BYTE* huffWeight, size_t hwSize, U32* rankStats,
178 U32* nbSymbolsPtr, U32* tableLogPtr,
179 const void* src, size_t srcSize);
180
181 /** HUF_readCTable() :
182 * Loading a CTable saved with HUF_writeCTable() */
183 size_t HUF_readCTable (HUF_CElt* CTable, unsigned maxSymbolValue, const void* src, size_t srcSize);
184
185
186 /*
187 HUF_decompress() does the following:
188 1. select the decompression algorithm (X2, X4) based on pre-computed heuristics
189 2. build the Huffman table from its saved representation, using HUF_readDTableXn()
190 3. decode 1 or 4 segments in parallel, using HUF_decompressSXn_usingDTable()
191 */
192
193 /** HUF_selectDecoder() :
194 * Tells which decoder is likely to decode faster,
195 * based on a set of pre-determined metrics.
196 * @return : 0==HUF_decompress4X2, 1==HUF_decompress4X4 .
197 * Assumption : 0 < cSrcSize < dstSize <= 128 KB */
198 U32 HUF_selectDecoder (size_t dstSize, size_t cSrcSize);
199
200 size_t HUF_readDTableX2 (HUF_DTable* DTable, const void* src, size_t srcSize);
201 size_t HUF_readDTableX4 (HUF_DTable* DTable, const void* src, size_t srcSize);
202
203 size_t HUF_decompress4X_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable);
204 size_t HUF_decompress4X2_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable);
205 size_t HUF_decompress4X4_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable);
206
207
208 /* single stream variants */
209
210 size_t HUF_compress1X (void* dst, size_t dstSize, const void* src, size_t srcSize, unsigned maxSymbolValue, unsigned tableLog);
211 size_t HUF_compress1X_usingCTable(void* dst, size_t dstSize, const void* src, size_t srcSize, const HUF_CElt* CTable);
212
213 size_t HUF_decompress1X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /* single-symbol decoder */
214 size_t HUF_decompress1X4 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /* double-symbol decoder */
215
216 size_t HUF_decompress1X_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable);
217 size_t HUF_decompress1X2_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable);
218 size_t HUF_decompress1X4_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable);
219
220
221 #endif /* HUF_STATIC_LINKING_ONLY */
222
223
224 #if defined (__cplusplus)
225 }
226 #endif
227
228 #endif /* HUF_H_298734234 */
@@ -0,0 +1,370 b''
1 /**
2 * Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
3 * All rights reserved.
4 *
5 * This source code is licensed under the BSD-style license found in the
6 * LICENSE file in the root directory of this source tree. An additional grant
7 * of patent rights can be found in the PATENTS file in the same directory.
8 */
9
10 #ifndef MEM_H_MODULE
11 #define MEM_H_MODULE
12
13 #if defined (__cplusplus)
14 extern "C" {
15 #endif
16
17 /*-****************************************
18 * Dependencies
19 ******************************************/
20 #include <stddef.h> /* size_t, ptrdiff_t */
21 #include <string.h> /* memcpy */
22
23
24 /*-****************************************
25 * Compiler specifics
26 ******************************************/
27 #if defined(_MSC_VER) /* Visual Studio */
28 # include <stdlib.h> /* _byteswap_ulong */
29 # include <intrin.h> /* _byteswap_* */
30 #endif
31 #if defined(__GNUC__)
32 # define MEM_STATIC static __inline __attribute__((unused))
33 #elif defined (__cplusplus) || (defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) /* C99 */)
34 # define MEM_STATIC static inline
35 #elif defined(_MSC_VER)
36 # define MEM_STATIC static __inline
37 #else
38 # define MEM_STATIC static /* this version may generate warnings for unused static functions; disable the relevant warning */
39 #endif
40
41 /* code only tested on 32- and 64-bit systems */
42 #define MEM_STATIC_ASSERT(c) { enum { XXH_static_assert = 1/(int)(!!(c)) }; }
43 MEM_STATIC void MEM_check(void) { MEM_STATIC_ASSERT((sizeof(size_t)==4) || (sizeof(size_t)==8)); }
44
45
46 /*-**************************************************************
47 * Basic Types
48 *****************************************************************/
49 #if !defined (__VMS) && (defined (__cplusplus) || (defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) /* C99 */) )
50 # include <stdint.h>
51 typedef uint8_t BYTE;
52 typedef uint16_t U16;
53 typedef int16_t S16;
54 typedef uint32_t U32;
55 typedef int32_t S32;
56 typedef uint64_t U64;
57 typedef int64_t S64;
58 #else
59 typedef unsigned char BYTE;
60 typedef unsigned short U16;
61 typedef signed short S16;
62 typedef unsigned int U32;
63 typedef signed int S32;
64 typedef unsigned long long U64;
65 typedef signed long long S64;
66 #endif
67
68
69 /*-**************************************************************
70 * Memory I/O
71 *****************************************************************/
72 /* MEM_FORCE_MEMORY_ACCESS :
73 * By default, access to unaligned memory is controlled by `memcpy()`, which is safe and portable.
74 * Unfortunately, on some target/compiler combinations, the generated assembly is sub-optimal.
75 * The switch below allows selecting a different access method for improved performance.
76 * Method 0 (default) : use `memcpy()`. Safe and portable.
77 * Method 1 : `__packed` statement. It depends on a compiler extension (i.e., not portable).
78 * This method is safe if your compiler supports it, and *generally* as fast or faster than `memcpy`.
79 * Method 2 : direct access. This method doesn't depend on the compiler, but violates the C standard.
80 * It can generate buggy code on targets that depend on alignment.
81 * In some circumstances, it's the only known way to get the best performance (i.e. GCC + ARMv6)
82 * See http://fastcompression.blogspot.fr/2015/08/accessing-unaligned-memory.html for details.
83 * Prefer these methods in priority order (0 > 1 > 2)
84 */
85 #ifndef MEM_FORCE_MEMORY_ACCESS /* can be defined externally, on command line for example */
86 # if defined(__GNUC__) && ( defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) || defined(__ARM_ARCH_6K__) || defined(__ARM_ARCH_6Z__) || defined(__ARM_ARCH_6ZK__) || defined(__ARM_ARCH_6T2__) )
87 # define MEM_FORCE_MEMORY_ACCESS 2
88 # elif defined(__INTEL_COMPILER) /*|| defined(_MSC_VER)*/ || \
89 (defined(__GNUC__) && ( defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7S__) ))
90 # define MEM_FORCE_MEMORY_ACCESS 1
91 # endif
92 #endif
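/* Example : method selection can be forced from the compiler command line (sketch) :
       cc -DMEM_FORCE_MEMORY_ACCESS=1 ...
*/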
93
94 MEM_STATIC unsigned MEM_32bits(void) { return sizeof(size_t)==4; }
95 MEM_STATIC unsigned MEM_64bits(void) { return sizeof(size_t)==8; }
96
97 MEM_STATIC unsigned MEM_isLittleEndian(void)
98 {
99 const union { U32 u; BYTE c[4]; } one = { 1 }; /* don't use static : detrimental to performance */
100 return one.c[0];
101 }
102
103 #if defined(MEM_FORCE_MEMORY_ACCESS) && (MEM_FORCE_MEMORY_ACCESS==2)
104
105 /* violates the C standard, by lying about pointer alignment.
106 Only use if there is no other choice to achieve best performance on the target platform */
107 MEM_STATIC U16 MEM_read16(const void* memPtr) { return *(const U16*) memPtr; }
108 MEM_STATIC U32 MEM_read32(const void* memPtr) { return *(const U32*) memPtr; }
109 MEM_STATIC U64 MEM_read64(const void* memPtr) { return *(const U64*) memPtr; }
110 MEM_STATIC U64 MEM_readST(const void* memPtr) { return *(const size_t*) memPtr; }
111
112 MEM_STATIC void MEM_write16(void* memPtr, U16 value) { *(U16*)memPtr = value; }
113 MEM_STATIC void MEM_write32(void* memPtr, U32 value) { *(U32*)memPtr = value; }
114 MEM_STATIC void MEM_write64(void* memPtr, U64 value) { *(U64*)memPtr = value; }
115
116 #elif defined(MEM_FORCE_MEMORY_ACCESS) && (MEM_FORCE_MEMORY_ACCESS==1)
117
118 /* __pack instructions are safer, but compiler specific, hence potentially problematic for some compilers */
119 /* currently only defined for gcc and icc */
120 #if defined(_MSC_VER) || (defined(__INTEL_COMPILER) && defined(WIN32))
121 __pragma( pack(push, 1) )
122 typedef union { U16 u16; U32 u32; U64 u64; size_t st; } unalign;
123 __pragma( pack(pop) )
124 #else
125 typedef union { U16 u16; U32 u32; U64 u64; size_t st; } __attribute__((packed)) unalign;
126 #endif
127
128 MEM_STATIC U16 MEM_read16(const void* ptr) { return ((const unalign*)ptr)->u16; }
129 MEM_STATIC U32 MEM_read32(const void* ptr) { return ((const unalign*)ptr)->u32; }
130 MEM_STATIC U64 MEM_read64(const void* ptr) { return ((const unalign*)ptr)->u64; }
131 MEM_STATIC U64 MEM_readST(const void* ptr) { return ((const unalign*)ptr)->st; }
132
133 MEM_STATIC void MEM_write16(void* memPtr, U16 value) { ((unalign*)memPtr)->u16 = value; }
134 MEM_STATIC void MEM_write32(void* memPtr, U32 value) { ((unalign*)memPtr)->u32 = value; }
135 MEM_STATIC void MEM_write64(void* memPtr, U64 value) { ((unalign*)memPtr)->u64 = value; }
136
137 #else
138
139 /* default method, safe and standard.
140 can sometimes prove slower */
141
142 MEM_STATIC U16 MEM_read16(const void* memPtr)
143 {
144 U16 val; memcpy(&val, memPtr, sizeof(val)); return val;
145 }
146
147 MEM_STATIC U32 MEM_read32(const void* memPtr)
148 {
149 U32 val; memcpy(&val, memPtr, sizeof(val)); return val;
150 }
151
152 MEM_STATIC U64 MEM_read64(const void* memPtr)
153 {
154 U64 val; memcpy(&val, memPtr, sizeof(val)); return val;
155 }
156
157 MEM_STATIC size_t MEM_readST(const void* memPtr)
158 {
159 size_t val; memcpy(&val, memPtr, sizeof(val)); return val;
160 }
161
162 MEM_STATIC void MEM_write16(void* memPtr, U16 value)
163 {
164 memcpy(memPtr, &value, sizeof(value));
165 }
166
167 MEM_STATIC void MEM_write32(void* memPtr, U32 value)
168 {
169 memcpy(memPtr, &value, sizeof(value));
170 }
171
172 MEM_STATIC void MEM_write64(void* memPtr, U64 value)
173 {
174 memcpy(memPtr, &value, sizeof(value));
175 }
176
177 #endif /* MEM_FORCE_MEMORY_ACCESS */
178
179 MEM_STATIC U32 MEM_swap32(U32 in)
180 {
181 #if defined(_MSC_VER) /* Visual Studio */
182 return _byteswap_ulong(in);
183 #elif defined (__GNUC__)
184 return __builtin_bswap32(in);
185 #else
186 return ((in << 24) & 0xff000000 ) |
187 ((in << 8) & 0x00ff0000 ) |
188 ((in >> 8) & 0x0000ff00 ) |
189 ((in >> 24) & 0x000000ff );
190 #endif
191 }
192
193 MEM_STATIC U64 MEM_swap64(U64 in)
194 {
195 #if defined(_MSC_VER) /* Visual Studio */
196 return _byteswap_uint64(in);
197 #elif defined (__GNUC__)
198 return __builtin_bswap64(in);
199 #else
200 return ((in << 56) & 0xff00000000000000ULL) |
201 ((in << 40) & 0x00ff000000000000ULL) |
202 ((in << 24) & 0x0000ff0000000000ULL) |
203 ((in << 8) & 0x000000ff00000000ULL) |
204 ((in >> 8) & 0x00000000ff000000ULL) |
205 ((in >> 24) & 0x0000000000ff0000ULL) |
206 ((in >> 40) & 0x000000000000ff00ULL) |
207 ((in >> 56) & 0x00000000000000ffULL);
208 #endif
209 }
210
211 MEM_STATIC size_t MEM_swapST(size_t in)
212 {
213 if (MEM_32bits())
214 return (size_t)MEM_swap32((U32)in);
215 else
216 return (size_t)MEM_swap64((U64)in);
217 }
218
219 /*=== Little endian r/w ===*/
220
221 MEM_STATIC U16 MEM_readLE16(const void* memPtr)
222 {
223 if (MEM_isLittleEndian())
224 return MEM_read16(memPtr);
225 else {
226 const BYTE* p = (const BYTE*)memPtr;
227 return (U16)(p[0] + (p[1]<<8));
228 }
229 }
230
231 MEM_STATIC void MEM_writeLE16(void* memPtr, U16 val)
232 {
233 if (MEM_isLittleEndian()) {
234 MEM_write16(memPtr, val);
235 } else {
236 BYTE* p = (BYTE*)memPtr;
237 p[0] = (BYTE)val;
238 p[1] = (BYTE)(val>>8);
239 }
240 }
241
242 MEM_STATIC U32 MEM_readLE24(const void* memPtr)
243 {
244 return MEM_readLE16(memPtr) + (((const BYTE*)memPtr)[2] << 16);
245 }
246
247 MEM_STATIC void MEM_writeLE24(void* memPtr, U32 val)
248 {
249 MEM_writeLE16(memPtr, (U16)val);
250 ((BYTE*)memPtr)[2] = (BYTE)(val>>16);
251 }
252
253 MEM_STATIC U32 MEM_readLE32(const void* memPtr)
254 {
255 if (MEM_isLittleEndian())
256 return MEM_read32(memPtr);
257 else
258 return MEM_swap32(MEM_read32(memPtr));
259 }
260
261 MEM_STATIC void MEM_writeLE32(void* memPtr, U32 val32)
262 {
263 if (MEM_isLittleEndian())
264 MEM_write32(memPtr, val32);
265 else
266 MEM_write32(memPtr, MEM_swap32(val32));
267 }
268
269 MEM_STATIC U64 MEM_readLE64(const void* memPtr)
270 {
271 if (MEM_isLittleEndian())
272 return MEM_read64(memPtr);
273 else
274 return MEM_swap64(MEM_read64(memPtr));
275 }
276
277 MEM_STATIC void MEM_writeLE64(void* memPtr, U64 val64)
278 {
279 if (MEM_isLittleEndian())
280 MEM_write64(memPtr, val64);
281 else
282 MEM_write64(memPtr, MEM_swap64(val64));
283 }
284
285 MEM_STATIC size_t MEM_readLEST(const void* memPtr)
286 {
287 if (MEM_32bits())
288 return (size_t)MEM_readLE32(memPtr);
289 else
290 return (size_t)MEM_readLE64(memPtr);
291 }
292
293 MEM_STATIC void MEM_writeLEST(void* memPtr, size_t val)
294 {
295 if (MEM_32bits())
296 MEM_writeLE32(memPtr, (U32)val);
297 else
298 MEM_writeLE64(memPtr, (U64)val);
299 }
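/* Example (sketch) : a portable little-endian round trip through a byte buffer :

       BYTE buf[4];
       MEM_writeLE32(buf, 0x12345678U);
       { U32 const v = MEM_readLE32(buf); }   // v == 0x12345678 on any endianness
*/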
300
301 /*=== Big endian r/w ===*/
302
303 MEM_STATIC U32 MEM_readBE32(const void* memPtr)
304 {
305 if (MEM_isLittleEndian())
306 return MEM_swap32(MEM_read32(memPtr));
307 else
308 return MEM_read32(memPtr);
309 }
310
311 MEM_STATIC void MEM_writeBE32(void* memPtr, U32 val32)
312 {
313 if (MEM_isLittleEndian())
314 MEM_write32(memPtr, MEM_swap32(val32));
315 else
316 MEM_write32(memPtr, val32);
317 }
318
319 MEM_STATIC U64 MEM_readBE64(const void* memPtr)
320 {
321 if (MEM_isLittleEndian())
322 return MEM_swap64(MEM_read64(memPtr));
323 else
324 return MEM_read64(memPtr);
325 }
326
327 MEM_STATIC void MEM_writeBE64(void* memPtr, U64 val64)
328 {
329 if (MEM_isLittleEndian())
330 MEM_write64(memPtr, MEM_swap64(val64));
331 else
332 MEM_write64(memPtr, val64);
333 }
334
335 MEM_STATIC size_t MEM_readBEST(const void* memPtr)
336 {
337 if (MEM_32bits())
338 return (size_t)MEM_readBE32(memPtr);
339 else
340 return (size_t)MEM_readBE64(memPtr);
341 }
342
343 MEM_STATIC void MEM_writeBEST(void* memPtr, size_t val)
344 {
345 if (MEM_32bits())
346 MEM_writeBE32(memPtr, (U32)val);
347 else
348 MEM_writeBE64(memPtr, (U64)val);
349 }
350
351
352 /* function only safe for comparisons : the returned value is not a canonical 3-byte read */
353 MEM_STATIC U32 MEM_readMINMATCH(const void* memPtr, U32 length)
354 {
355 switch (length)
356 {
357 default :
358 case 4 : return MEM_read32(memPtr);
359 case 3 : if (MEM_isLittleEndian())
360 return MEM_read32(memPtr)<<8;
361 else
362 return MEM_read32(memPtr)>>8;
363 }
364 }
365
366 #if defined (__cplusplus)
367 }
368 #endif
369
370 #endif /* MEM_H_MODULE */
@@ -0,0 +1,867 b''
1 /*
2 * xxHash - Fast Hash algorithm
3 * Copyright (C) 2012-2016, Yann Collet
4 *
5 * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
6 *
7 * Redistribution and use in source and binary forms, with or without
8 * modification, are permitted provided that the following conditions are
9 * met:
10 *
11 * * Redistributions of source code must retain the above copyright
12 * notice, this list of conditions and the following disclaimer.
13 * * Redistributions in binary form must reproduce the above
14 * copyright notice, this list of conditions and the following disclaimer
15 * in the documentation and/or other materials provided with the
16 * distribution.
17 *
18 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
19 * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
20 * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
21 * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
22 * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
23 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
24 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
25 * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
26 * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
27 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
28 * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
29 *
30 * You can contact the author at :
31 * - xxHash homepage: http://www.xxhash.com
32 * - xxHash source repository : https://github.com/Cyan4973/xxHash
33 */
34
35
36 /* *************************************
37 * Tuning parameters
38 ***************************************/
39 /*!XXH_FORCE_MEMORY_ACCESS :
40 * By default, access to unaligned memory is controlled by `memcpy()`, which is safe and portable.
41 * Unfortunately, on some target/compiler combinations, the generated assembly is sub-optimal.
42 * The switch below allows selecting a different access method for improved performance.
43 * Method 0 (default) : use `memcpy()`. Safe and portable.
44 * Method 1 : `__packed` statement. It depends on a compiler extension (i.e., not portable).
45 * This method is safe if your compiler supports it, and *generally* as fast or faster than `memcpy`.
46 * Method 2 : direct access. This method doesn't depend on the compiler, but violates the C standard.
47 * It can generate buggy code on targets which do not support unaligned memory accesses.
48 * But in some circumstances, it's the only known way to get the best performance (i.e. GCC + ARMv6).
49 * See http://stackoverflow.com/a/32095106/646947 for details.
50 * Prefer these methods in priority order (0 > 1 > 2)
51 */
52 #ifndef XXH_FORCE_MEMORY_ACCESS /* can be defined externally, on command line for example */
53 # if defined(__GNUC__) && ( defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) || defined(__ARM_ARCH_6K__) || defined(__ARM_ARCH_6Z__) || defined(__ARM_ARCH_6ZK__) || defined(__ARM_ARCH_6T2__) )
54 # define XXH_FORCE_MEMORY_ACCESS 2
55 # elif (defined(__INTEL_COMPILER) && !defined(WIN32)) || \
56 (defined(__GNUC__) && ( defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7S__) ))
57 # define XXH_FORCE_MEMORY_ACCESS 1
58 # endif
59 #endif
60
61 /*!XXH_ACCEPT_NULL_INPUT_POINTER :
62 * If the input pointer is a null pointer, xxHash's default behavior is to trigger a memory access error, since it is a bad pointer.
63 * When this option is enabled, xxHash's output for a null input pointer will be the same as for a zero-length input.
64 * By default, this option is disabled. To enable it, uncomment below define :
65 */
66 /* #define XXH_ACCEPT_NULL_INPUT_POINTER 1 */
67
68 /*!XXH_FORCE_NATIVE_FORMAT :
69 * By default, the xxHash library provides endian-independent hash values, based on the little-endian convention.
70 * Results are therefore identical for little-endian and big-endian CPUs.
71 * This comes at a performance cost for big-endian CPUs, since some swapping is required to emulate the little-endian format.
72 * Should endian-independence be of no importance for your application, you may set the #define below to 1,
73 * to improve speed for big-endian CPUs.
74 * This option has no impact on little-endian CPUs.
75 */
76 #ifndef XXH_FORCE_NATIVE_FORMAT /* can be defined externally */
77 # define XXH_FORCE_NATIVE_FORMAT 0
78 #endif
79
80 /*!XXH_FORCE_ALIGN_CHECK :
81 * This is a minor performance trick, only useful with lots of very small keys.
82 * It means : check for aligned/unaligned input.
83 * The check costs one initial branch per hash; set to 0 when the input data
84 * is guaranteed to be aligned.
85 */
86 #ifndef XXH_FORCE_ALIGN_CHECK /* can be defined externally */
87 # if defined(__i386) || defined(_M_IX86) || defined(__x86_64__) || defined(_M_X64)
88 # define XXH_FORCE_ALIGN_CHECK 0
89 # else
90 # define XXH_FORCE_ALIGN_CHECK 1
91 # endif
92 #endif
93
94
95 /* *************************************
96 * Includes & Memory related functions
97 ***************************************/
98 /* Modify the local functions below should you wish to use some other memory routines */
99 /* for malloc(), free() */
100 #include <stdlib.h>
101 static void* XXH_malloc(size_t s) { return malloc(s); }
102 static void XXH_free (void* p) { free(p); }
103 /* for memcpy() */
104 #include <string.h>
105 static void* XXH_memcpy(void* dest, const void* src, size_t size) { return memcpy(dest,src,size); }
106
107 #define XXH_STATIC_LINKING_ONLY
108 #include "xxhash.h"
109
110
111 /* *************************************
112 * Compiler Specific Options
113 ***************************************/
114 #ifdef _MSC_VER /* Visual Studio */
115 # pragma warning(disable : 4127) /* disable: C4127: conditional expression is constant */
116 # define FORCE_INLINE static __forceinline
117 #else
118 # if defined (__cplusplus) || defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L /* C99 */
119 # ifdef __GNUC__
120 # define FORCE_INLINE static inline __attribute__((always_inline))
121 # else
122 # define FORCE_INLINE static inline
123 # endif
124 # else
125 # define FORCE_INLINE static
126 # endif /* __STDC_VERSION__ */
127 #endif
128
129
130 /* *************************************
131 * Basic Types
132 ***************************************/
133 #ifndef MEM_MODULE
134 # define MEM_MODULE
135 # if !defined (__VMS) && (defined (__cplusplus) || (defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) /* C99 */) )
136 # include <stdint.h>
137 typedef uint8_t BYTE;
138 typedef uint16_t U16;
139 typedef uint32_t U32;
140 typedef int32_t S32;
141 typedef uint64_t U64;
142 # else
143 typedef unsigned char BYTE;
144 typedef unsigned short U16;
145 typedef unsigned int U32;
146 typedef signed int S32;
147 typedef unsigned long long U64; /* if your compiler doesn't support unsigned long long, replace by another 64-bit type here. Note that xxhash.h will also need to be updated. */
148 # endif
149 #endif
150
151
152 #if (defined(XXH_FORCE_MEMORY_ACCESS) && (XXH_FORCE_MEMORY_ACCESS==2))
153
154 /* Force direct memory access. Only works on CPUs which support unaligned memory access in hardware */
155 static U32 XXH_read32(const void* memPtr) { return *(const U32*) memPtr; }
156 static U64 XXH_read64(const void* memPtr) { return *(const U64*) memPtr; }
157
158 #elif (defined(XXH_FORCE_MEMORY_ACCESS) && (XXH_FORCE_MEMORY_ACCESS==1))
159
160 /* __pack instructions are safer, but compiler specific, hence potentially problematic for some compilers */
161 /* currently only defined for gcc and icc */
162 typedef union { U32 u32; U64 u64; } __attribute__((packed)) unalign;
163
164 static U32 XXH_read32(const void* ptr) { return ((const unalign*)ptr)->u32; }
165 static U64 XXH_read64(const void* ptr) { return ((const unalign*)ptr)->u64; }
166
167 #else
168
169 /* portable and safe solution. Generally efficient.
170 * see : http://stackoverflow.com/a/32095106/646947
171 */
172
173 static U32 XXH_read32(const void* memPtr)
174 {
175 U32 val;
176 memcpy(&val, memPtr, sizeof(val));
177 return val;
178 }
179
180 static U64 XXH_read64(const void* memPtr)
181 {
182 U64 val;
183 memcpy(&val, memPtr, sizeof(val));
184 return val;
185 }
186
187 #endif /* XXH_FORCE_MEMORY_ACCESS */
188
189
190 /* ****************************************
191 * Compiler-specific Functions and Macros
192 ******************************************/
193 #define GCC_VERSION (__GNUC__ * 100 + __GNUC_MINOR__)
194
195 /* Note : although _rotl exists for MinGW (GCC under Windows), performance seems poor */
196 #if defined(_MSC_VER)
197 # define XXH_rotl32(x,r) _rotl(x,r)
198 # define XXH_rotl64(x,r) _rotl64(x,r)
199 #else
200 # define XXH_rotl32(x,r) ((x << r) | (x >> (32 - r)))
201 # define XXH_rotl64(x,r) ((x << r) | (x >> (64 - r)))
202 #endif
203
204 #if defined(_MSC_VER) /* Visual Studio */
205 # define XXH_swap32 _byteswap_ulong
206 # define XXH_swap64 _byteswap_uint64
207 #elif GCC_VERSION >= 403
208 # define XXH_swap32 __builtin_bswap32
209 # define XXH_swap64 __builtin_bswap64
210 #else
211 static U32 XXH_swap32 (U32 x)
212 {
213 return ((x << 24) & 0xff000000 ) |
214 ((x << 8) & 0x00ff0000 ) |
215 ((x >> 8) & 0x0000ff00 ) |
216 ((x >> 24) & 0x000000ff );
217 }
218 static U64 XXH_swap64 (U64 x)
219 {
220 return ((x << 56) & 0xff00000000000000ULL) |
221 ((x << 40) & 0x00ff000000000000ULL) |
222 ((x << 24) & 0x0000ff0000000000ULL) |
223 ((x << 8) & 0x000000ff00000000ULL) |
224 ((x >> 8) & 0x00000000ff000000ULL) |
225 ((x >> 24) & 0x0000000000ff0000ULL) |
226 ((x >> 40) & 0x000000000000ff00ULL) |
227 ((x >> 56) & 0x00000000000000ffULL);
228 }
229 #endif
230
231
232 /* *************************************
233 * Architecture Macros
234 ***************************************/
235 typedef enum { XXH_bigEndian=0, XXH_littleEndian=1 } XXH_endianess;
236
237 /* XXH_CPU_LITTLE_ENDIAN can be defined externally, for example on the compiler command line */
238 #ifndef XXH_CPU_LITTLE_ENDIAN
239 static const int g_one = 1;
240 # define XXH_CPU_LITTLE_ENDIAN (*(const char*)(&g_one))
241 #endif
242
243
244 /* ***************************
245 * Memory reads
246 *****************************/
247 typedef enum { XXH_aligned, XXH_unaligned } XXH_alignment;
248
249 FORCE_INLINE U32 XXH_readLE32_align(const void* ptr, XXH_endianess endian, XXH_alignment align)
250 {
251 if (align==XXH_unaligned)
252 return endian==XXH_littleEndian ? XXH_read32(ptr) : XXH_swap32(XXH_read32(ptr));
253 else
254 return endian==XXH_littleEndian ? *(const U32*)ptr : XXH_swap32(*(const U32*)ptr);
255 }
256
257 FORCE_INLINE U32 XXH_readLE32(const void* ptr, XXH_endianess endian)
258 {
259 return XXH_readLE32_align(ptr, endian, XXH_unaligned);
260 }
261
262 static U32 XXH_readBE32(const void* ptr)
263 {
264 return XXH_CPU_LITTLE_ENDIAN ? XXH_swap32(XXH_read32(ptr)) : XXH_read32(ptr);
265 }
266
267 FORCE_INLINE U64 XXH_readLE64_align(const void* ptr, XXH_endianess endian, XXH_alignment align)
268 {
269 if (align==XXH_unaligned)
270 return endian==XXH_littleEndian ? XXH_read64(ptr) : XXH_swap64(XXH_read64(ptr));
271 else
272 return endian==XXH_littleEndian ? *(const U64*)ptr : XXH_swap64(*(const U64*)ptr);
273 }
274
275 FORCE_INLINE U64 XXH_readLE64(const void* ptr, XXH_endianess endian)
276 {
277 return XXH_readLE64_align(ptr, endian, XXH_unaligned);
278 }
279
280 static U64 XXH_readBE64(const void* ptr)
281 {
282 return XXH_CPU_LITTLE_ENDIAN ? XXH_swap64(XXH_read64(ptr)) : XXH_read64(ptr);
283 }
284
285
286 /* *************************************
287 * Macros
288 ***************************************/
289 #define XXH_STATIC_ASSERT(c) { enum { XXH_static_assert = 1/(int)(!!(c)) }; } /* use only *after* variable declarations */
290
291
292 /* *************************************
293 * Constants
294 ***************************************/
295 static const U32 PRIME32_1 = 2654435761U;
296 static const U32 PRIME32_2 = 2246822519U;
297 static const U32 PRIME32_3 = 3266489917U;
298 static const U32 PRIME32_4 = 668265263U;
299 static const U32 PRIME32_5 = 374761393U;
300
301 static const U64 PRIME64_1 = 11400714785074694791ULL;
302 static const U64 PRIME64_2 = 14029467366897019727ULL;
303 static const U64 PRIME64_3 = 1609587929392839161ULL;
304 static const U64 PRIME64_4 = 9650029242287828579ULL;
305 static const U64 PRIME64_5 = 2870177450012600261ULL;
306
307 XXH_PUBLIC_API unsigned XXH_versionNumber (void) { return XXH_VERSION_NUMBER; }
308
309
310 /* **************************
311 * Utils
312 ****************************/
313 XXH_PUBLIC_API void XXH32_copyState(XXH32_state_t* restrict dstState, const XXH32_state_t* restrict srcState)
314 {
315 memcpy(dstState, srcState, sizeof(*dstState));
316 }
317
318 XXH_PUBLIC_API void XXH64_copyState(XXH64_state_t* restrict dstState, const XXH64_state_t* restrict srcState)
319 {
320 memcpy(dstState, srcState, sizeof(*dstState));
321 }
322
323
324 /* ***************************
325 * Simple Hash Functions
326 *****************************/
327
328 static U32 XXH32_round(U32 seed, U32 input)
329 {
330 seed += input * PRIME32_2;
331 seed = XXH_rotl32(seed, 13);
332 seed *= PRIME32_1;
333 return seed;
334 }
335
336 FORCE_INLINE U32 XXH32_endian_align(const void* input, size_t len, U32 seed, XXH_endianess endian, XXH_alignment align)
337 {
338 const BYTE* p = (const BYTE*)input;
339 const BYTE* bEnd = p + len;
340 U32 h32;
341 #define XXH_get32bits(p) XXH_readLE32_align(p, endian, align)
342
343 #ifdef XXH_ACCEPT_NULL_INPUT_POINTER
344 if (p==NULL) {
345 len=0;
346 bEnd=p=(const BYTE*)(size_t)16;
347 }
348 #endif
349
350 if (len>=16) {
351 const BYTE* const limit = bEnd - 16;
352 U32 v1 = seed + PRIME32_1 + PRIME32_2;
353 U32 v2 = seed + PRIME32_2;
354 U32 v3 = seed + 0;
355 U32 v4 = seed - PRIME32_1;
356
357 do {
358 v1 = XXH32_round(v1, XXH_get32bits(p)); p+=4;
359 v2 = XXH32_round(v2, XXH_get32bits(p)); p+=4;
360 v3 = XXH32_round(v3, XXH_get32bits(p)); p+=4;
361 v4 = XXH32_round(v4, XXH_get32bits(p)); p+=4;
362 } while (p<=limit);
363
364 h32 = XXH_rotl32(v1, 1) + XXH_rotl32(v2, 7) + XXH_rotl32(v3, 12) + XXH_rotl32(v4, 18);
365 } else {
366 h32 = seed + PRIME32_5;
367 }
368
369 h32 += (U32) len;
370
371 while (p+4<=bEnd) {
372 h32 += XXH_get32bits(p) * PRIME32_3;
373 h32 = XXH_rotl32(h32, 17) * PRIME32_4 ;
374 p+=4;
375 }
376
377 while (p<bEnd) {
378 h32 += (*p) * PRIME32_5;
379 h32 = XXH_rotl32(h32, 11) * PRIME32_1 ;
380 p++;
381 }
382
383 h32 ^= h32 >> 15;
384 h32 *= PRIME32_2;
385 h32 ^= h32 >> 13;
386 h32 *= PRIME32_3;
387 h32 ^= h32 >> 16;
388
389 return h32;
390 }
391
392
393 XXH_PUBLIC_API unsigned int XXH32 (const void* input, size_t len, unsigned int seed)
394 {
395 #if 0
396 /* Simple version, good for code maintenance, but unfortunately slow for small inputs */
397 XXH32_CREATESTATE_STATIC(state);
398 XXH32_reset(state, seed);
399 XXH32_update(state, input, len);
400 return XXH32_digest(state);
401 #else
402 XXH_endianess endian_detected = (XXH_endianess)XXH_CPU_LITTLE_ENDIAN;
403
404 if (XXH_FORCE_ALIGN_CHECK) {
405 if ((((size_t)input) & 3) == 0) { /* Input is 4-byte aligned, leverage the speed benefit */
406 if ((endian_detected==XXH_littleEndian) || XXH_FORCE_NATIVE_FORMAT)
407 return XXH32_endian_align(input, len, seed, XXH_littleEndian, XXH_aligned);
408 else
409 return XXH32_endian_align(input, len, seed, XXH_bigEndian, XXH_aligned);
410 } }
411
412 if ((endian_detected==XXH_littleEndian) || XXH_FORCE_NATIVE_FORMAT)
413 return XXH32_endian_align(input, len, seed, XXH_littleEndian, XXH_unaligned);
414 else
415 return XXH32_endian_align(input, len, seed, XXH_bigEndian, XXH_unaligned);
416 #endif
417 }
418
419
420 static U64 XXH64_round(U64 acc, U64 input)
421 {
422 acc += input * PRIME64_2;
423 acc = XXH_rotl64(acc, 31);
424 acc *= PRIME64_1;
425 return acc;
426 }
427
428 static U64 XXH64_mergeRound(U64 acc, U64 val)
429 {
430 val = XXH64_round(0, val);
431 acc ^= val;
432 acc = acc * PRIME64_1 + PRIME64_4;
433 return acc;
434 }
435
436 FORCE_INLINE U64 XXH64_endian_align(const void* input, size_t len, U64 seed, XXH_endianess endian, XXH_alignment align)
437 {
438 const BYTE* p = (const BYTE*)input;
439 const BYTE* const bEnd = p + len;
440 U64 h64;
441 #define XXH_get64bits(p) XXH_readLE64_align(p, endian, align)
442
443 #ifdef XXH_ACCEPT_NULL_INPUT_POINTER
444 if (p==NULL) {
445 len=0;
446 bEnd=p=(const BYTE*)(size_t)32;
447 }
448 #endif
449
450 if (len>=32) {
451 const BYTE* const limit = bEnd - 32;
452 U64 v1 = seed + PRIME64_1 + PRIME64_2;
453 U64 v2 = seed + PRIME64_2;
454 U64 v3 = seed + 0;
455 U64 v4 = seed - PRIME64_1;
456
457 do {
458 v1 = XXH64_round(v1, XXH_get64bits(p)); p+=8;
459 v2 = XXH64_round(v2, XXH_get64bits(p)); p+=8;
460 v3 = XXH64_round(v3, XXH_get64bits(p)); p+=8;
461 v4 = XXH64_round(v4, XXH_get64bits(p)); p+=8;
462 } while (p<=limit);
463
464 h64 = XXH_rotl64(v1, 1) + XXH_rotl64(v2, 7) + XXH_rotl64(v3, 12) + XXH_rotl64(v4, 18);
465 h64 = XXH64_mergeRound(h64, v1);
466 h64 = XXH64_mergeRound(h64, v2);
467 h64 = XXH64_mergeRound(h64, v3);
468 h64 = XXH64_mergeRound(h64, v4);
469
470 } else {
471 h64 = seed + PRIME64_5;
472 }
473
474 h64 += (U64) len;
475
476 while (p+8<=bEnd) {
477 U64 const k1 = XXH64_round(0, XXH_get64bits(p));
478 h64 ^= k1;
479 h64 = XXH_rotl64(h64,27) * PRIME64_1 + PRIME64_4;
480 p+=8;
481 }
482
483 if (p+4<=bEnd) {
484 h64 ^= (U64)(XXH_get32bits(p)) * PRIME64_1;
485 h64 = XXH_rotl64(h64, 23) * PRIME64_2 + PRIME64_3;
486 p+=4;
487 }
488
489 while (p<bEnd) {
490 h64 ^= (*p) * PRIME64_5;
491 h64 = XXH_rotl64(h64, 11) * PRIME64_1;
492 p++;
493 }
494
495 h64 ^= h64 >> 33;
496 h64 *= PRIME64_2;
497 h64 ^= h64 >> 29;
498 h64 *= PRIME64_3;
499 h64 ^= h64 >> 32;
500
501 return h64;
502 }
503
504
505 XXH_PUBLIC_API unsigned long long XXH64 (const void* input, size_t len, unsigned long long seed)
506 {
507 #if 0
508 /* Simple version, good for code maintenance, but unfortunately slow for small inputs */
509 XXH64_CREATESTATE_STATIC(state);
510 XXH64_reset(state, seed);
511 XXH64_update(state, input, len);
512 return XXH64_digest(state);
513 #else
514 XXH_endianess endian_detected = (XXH_endianess)XXH_CPU_LITTLE_ENDIAN;
515
516 if (XXH_FORCE_ALIGN_CHECK) {
517 if ((((size_t)input) & 7)==0) { /* Input is aligned, let's leverage the speed advantage */
518 if ((endian_detected==XXH_littleEndian) || XXH_FORCE_NATIVE_FORMAT)
519 return XXH64_endian_align(input, len, seed, XXH_littleEndian, XXH_aligned);
520 else
521 return XXH64_endian_align(input, len, seed, XXH_bigEndian, XXH_aligned);
522 } }
523
524 if ((endian_detected==XXH_littleEndian) || XXH_FORCE_NATIVE_FORMAT)
525 return XXH64_endian_align(input, len, seed, XXH_littleEndian, XXH_unaligned);
526 else
527 return XXH64_endian_align(input, len, seed, XXH_bigEndian, XXH_unaligned);
528 #endif
529 }
530
531
532 /* **************************************************
533 * Advanced Hash Functions
534 ****************************************************/
535
536 XXH_PUBLIC_API XXH32_state_t* XXH32_createState(void)
537 {
538 return (XXH32_state_t*)XXH_malloc(sizeof(XXH32_state_t));
539 }
540 XXH_PUBLIC_API XXH_errorcode XXH32_freeState(XXH32_state_t* statePtr)
541 {
542 XXH_free(statePtr);
543 return XXH_OK;
544 }
545
546 XXH_PUBLIC_API XXH64_state_t* XXH64_createState(void)
547 {
548 return (XXH64_state_t*)XXH_malloc(sizeof(XXH64_state_t));
549 }
550 XXH_PUBLIC_API XXH_errorcode XXH64_freeState(XXH64_state_t* statePtr)
551 {
552 XXH_free(statePtr);
553 return XXH_OK;
554 }
555
556
557 /*** Hash feed ***/
558
559 XXH_PUBLIC_API XXH_errorcode XXH32_reset(XXH32_state_t* statePtr, unsigned int seed)
560 {
561 XXH32_state_t state; /* using a local state to memcpy() in order to avoid strict-aliasing warnings */
562 memset(&state, 0, sizeof(state)-4); /* do not write into the reserved field, planned for future removal */
563 state.v1 = seed + PRIME32_1 + PRIME32_2;
564 state.v2 = seed + PRIME32_2;
565 state.v3 = seed + 0;
566 state.v4 = seed - PRIME32_1;
567 memcpy(statePtr, &state, sizeof(state));
568 return XXH_OK;
569 }
570
571
572 XXH_PUBLIC_API XXH_errorcode XXH64_reset(XXH64_state_t* statePtr, unsigned long long seed)
573 {
574 XXH64_state_t state; /* using a local state to memcpy() in order to avoid strict-aliasing warnings */
575 memset(&state, 0, sizeof(state)-8); /* do not write into the reserved field, planned for future removal */
576 state.v1 = seed + PRIME64_1 + PRIME64_2;
577 state.v2 = seed + PRIME64_2;
578 state.v3 = seed + 0;
579 state.v4 = seed - PRIME64_1;
580 memcpy(statePtr, &state, sizeof(state));
581 return XXH_OK;
582 }
583
584
585 FORCE_INLINE XXH_errorcode XXH32_update_endian (XXH32_state_t* state, const void* input, size_t len, XXH_endianess endian)
586 {
587 const BYTE* p = (const BYTE*)input;
588 const BYTE* const bEnd = p + len;
589
590 #ifdef XXH_ACCEPT_NULL_INPUT_POINTER
591 if (input==NULL) return XXH_ERROR;
592 #endif
593
594 state->total_len_32 += (unsigned)len;
595 state->large_len |= (len>=16) | (state->total_len_32>=16);
596
597 if (state->memsize + len < 16) { /* fill in tmp buffer */
598 XXH_memcpy((BYTE*)(state->mem32) + state->memsize, input, len);
599 state->memsize += (unsigned)len;
600 return XXH_OK;
601 }
602
603 if (state->memsize) { /* some data left from previous update */
604 XXH_memcpy((BYTE*)(state->mem32) + state->memsize, input, 16-state->memsize);
605 { const U32* p32 = state->mem32;
606 state->v1 = XXH32_round(state->v1, XXH_readLE32(p32, endian)); p32++;
607 state->v2 = XXH32_round(state->v2, XXH_readLE32(p32, endian)); p32++;
608 state->v3 = XXH32_round(state->v3, XXH_readLE32(p32, endian)); p32++;
609 state->v4 = XXH32_round(state->v4, XXH_readLE32(p32, endian)); p32++;
610 }
611 p += 16-state->memsize;
612 state->memsize = 0;
613 }
614
615 if (p <= bEnd-16) {
616 const BYTE* const limit = bEnd - 16;
617 U32 v1 = state->v1;
618 U32 v2 = state->v2;
619 U32 v3 = state->v3;
620 U32 v4 = state->v4;
621
622 do {
623 v1 = XXH32_round(v1, XXH_readLE32(p, endian)); p+=4;
624 v2 = XXH32_round(v2, XXH_readLE32(p, endian)); p+=4;
625 v3 = XXH32_round(v3, XXH_readLE32(p, endian)); p+=4;
626 v4 = XXH32_round(v4, XXH_readLE32(p, endian)); p+=4;
627 } while (p<=limit);
628
629 state->v1 = v1;
630 state->v2 = v2;
631 state->v3 = v3;
632 state->v4 = v4;
633 }
634
635 if (p < bEnd) {
636 XXH_memcpy(state->mem32, p, (size_t)(bEnd-p));
637 state->memsize = (unsigned)(bEnd-p);
638 }
639
640 return XXH_OK;
641 }
642
643 XXH_PUBLIC_API XXH_errorcode XXH32_update (XXH32_state_t* state_in, const void* input, size_t len)
644 {
645 XXH_endianess endian_detected = (XXH_endianess)XXH_CPU_LITTLE_ENDIAN;
646
647 if ((endian_detected==XXH_littleEndian) || XXH_FORCE_NATIVE_FORMAT)
648 return XXH32_update_endian(state_in, input, len, XXH_littleEndian);
649 else
650 return XXH32_update_endian(state_in, input, len, XXH_bigEndian);
651 }
652
653
654
655 FORCE_INLINE U32 XXH32_digest_endian (const XXH32_state_t* state, XXH_endianess endian)
656 {
657 const BYTE * p = (const BYTE*)state->mem32;
658 const BYTE* const bEnd = (const BYTE*)(state->mem32) + state->memsize;
659 U32 h32;
660
661 if (state->large_len) {
662 h32 = XXH_rotl32(state->v1, 1) + XXH_rotl32(state->v2, 7) + XXH_rotl32(state->v3, 12) + XXH_rotl32(state->v4, 18);
663 } else {
664 h32 = state->v3 /* == seed */ + PRIME32_5;
665 }
666
667 h32 += state->total_len_32;
668
669 while (p+4<=bEnd) {
670 h32 += XXH_readLE32(p, endian) * PRIME32_3;
671 h32 = XXH_rotl32(h32, 17) * PRIME32_4;
672 p+=4;
673 }
674
675 while (p<bEnd) {
676 h32 += (*p) * PRIME32_5;
677 h32 = XXH_rotl32(h32, 11) * PRIME32_1;
678 p++;
679 }
680
681 h32 ^= h32 >> 15;
682 h32 *= PRIME32_2;
683 h32 ^= h32 >> 13;
684 h32 *= PRIME32_3;
685 h32 ^= h32 >> 16;
686
687 return h32;
688 }
689
690
691 XXH_PUBLIC_API unsigned int XXH32_digest (const XXH32_state_t* state_in)
692 {
693 XXH_endianess endian_detected = (XXH_endianess)XXH_CPU_LITTLE_ENDIAN;
694
695 if ((endian_detected==XXH_littleEndian) || XXH_FORCE_NATIVE_FORMAT)
696 return XXH32_digest_endian(state_in, XXH_littleEndian);
697 else
698 return XXH32_digest_endian(state_in, XXH_bigEndian);
699 }
700
701
702
703 /* **** XXH64 **** */
704
705 FORCE_INLINE XXH_errorcode XXH64_update_endian (XXH64_state_t* state, const void* input, size_t len, XXH_endianess endian)
706 {
707 const BYTE* p = (const BYTE*)input;
708 const BYTE* const bEnd = p + len;
709
710 #ifdef XXH_ACCEPT_NULL_INPUT_POINTER
711 if (input==NULL) return XXH_ERROR;
712 #endif
713
714 state->total_len += len;
715
716 if (state->memsize + len < 32) { /* fill in tmp buffer */
717 XXH_memcpy(((BYTE*)state->mem64) + state->memsize, input, len);
718 state->memsize += (U32)len;
719 return XXH_OK;
720 }
721
722 if (state->memsize) { /* some data left from previous update */
723 XXH_memcpy(((BYTE*)state->mem64) + state->memsize, input, 32-state->memsize);
724 state->v1 = XXH64_round(state->v1, XXH_readLE64(state->mem64+0, endian));
725 state->v2 = XXH64_round(state->v2, XXH_readLE64(state->mem64+1, endian));
726 state->v3 = XXH64_round(state->v3, XXH_readLE64(state->mem64+2, endian));
727 state->v4 = XXH64_round(state->v4, XXH_readLE64(state->mem64+3, endian));
728 p += 32-state->memsize;
729 state->memsize = 0;
730 }
731
732 if (p+32 <= bEnd) {
733 const BYTE* const limit = bEnd - 32;
734 U64 v1 = state->v1;
735 U64 v2 = state->v2;
736 U64 v3 = state->v3;
737 U64 v4 = state->v4;
738
739 do {
740 v1 = XXH64_round(v1, XXH_readLE64(p, endian)); p+=8;
741 v2 = XXH64_round(v2, XXH_readLE64(p, endian)); p+=8;
742 v3 = XXH64_round(v3, XXH_readLE64(p, endian)); p+=8;
743 v4 = XXH64_round(v4, XXH_readLE64(p, endian)); p+=8;
744 } while (p<=limit);
745
746 state->v1 = v1;
747 state->v2 = v2;
748 state->v3 = v3;
749 state->v4 = v4;
750 }
751
752 if (p < bEnd) {
753 XXH_memcpy(state->mem64, p, (size_t)(bEnd-p));
754 state->memsize = (unsigned)(bEnd-p);
755 }
756
757 return XXH_OK;
758 }
759
760 XXH_PUBLIC_API XXH_errorcode XXH64_update (XXH64_state_t* state_in, const void* input, size_t len)
761 {
762 XXH_endianess endian_detected = (XXH_endianess)XXH_CPU_LITTLE_ENDIAN;
763
764 if ((endian_detected==XXH_littleEndian) || XXH_FORCE_NATIVE_FORMAT)
765 return XXH64_update_endian(state_in, input, len, XXH_littleEndian);
766 else
767 return XXH64_update_endian(state_in, input, len, XXH_bigEndian);
768 }
769
770
771
772 FORCE_INLINE U64 XXH64_digest_endian (const XXH64_state_t* state, XXH_endianess endian)
773 {
774 const BYTE * p = (const BYTE*)state->mem64;
775 const BYTE* const bEnd = (const BYTE*)state->mem64 + state->memsize;
776 U64 h64;
777
778 if (state->total_len >= 32) {
779 U64 const v1 = state->v1;
780 U64 const v2 = state->v2;
781 U64 const v3 = state->v3;
782 U64 const v4 = state->v4;
783
784 h64 = XXH_rotl64(v1, 1) + XXH_rotl64(v2, 7) + XXH_rotl64(v3, 12) + XXH_rotl64(v4, 18);
785 h64 = XXH64_mergeRound(h64, v1);
786 h64 = XXH64_mergeRound(h64, v2);
787 h64 = XXH64_mergeRound(h64, v3);
788 h64 = XXH64_mergeRound(h64, v4);
789 } else {
790 h64 = state->v3 + PRIME64_5;
791 }
792
793 h64 += (U64) state->total_len;
794
795 while (p+8<=bEnd) {
796 U64 const k1 = XXH64_round(0, XXH_readLE64(p, endian));
797 h64 ^= k1;
798 h64 = XXH_rotl64(h64,27) * PRIME64_1 + PRIME64_4;
799 p+=8;
800 }
801
802 if (p+4<=bEnd) {
803 h64 ^= (U64)(XXH_readLE32(p, endian)) * PRIME64_1;
804 h64 = XXH_rotl64(h64, 23) * PRIME64_2 + PRIME64_3;
805 p+=4;
806 }
807
808 while (p<bEnd) {
809 h64 ^= (*p) * PRIME64_5;
810 h64 = XXH_rotl64(h64, 11) * PRIME64_1;
811 p++;
812 }
813
814 h64 ^= h64 >> 33;
815 h64 *= PRIME64_2;
816 h64 ^= h64 >> 29;
817 h64 *= PRIME64_3;
818 h64 ^= h64 >> 32;
819
820 return h64;
821 }
822
823
824 XXH_PUBLIC_API unsigned long long XXH64_digest (const XXH64_state_t* state_in)
825 {
826 XXH_endianess endian_detected = (XXH_endianess)XXH_CPU_LITTLE_ENDIAN;
827
828 if ((endian_detected==XXH_littleEndian) || XXH_FORCE_NATIVE_FORMAT)
829 return XXH64_digest_endian(state_in, XXH_littleEndian);
830 else
831 return XXH64_digest_endian(state_in, XXH_bigEndian);
832 }
833
834
835 /* **************************
836 * Canonical representation
837 ****************************/
838
839 /*! Default XXH result types are basic unsigned 32-bit and 64-bit integers.
840 * The canonical representation follows the human-readable write convention, aka big-endian (large digits first).
841 * These functions allow transformation of a hash result into and from its canonical format.
842 * This way, hash values can be written into a file or buffer, and remain comparable across different systems and programs.
843 */
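/* Usage sketch (illustrative only; the helper names `save_hash` / `load_hash`
 * and the zero seed are assumptions made for this example):
 *
 *    void save_hash(XXH64_canonical_t* dst, const void* data, size_t size)
 *    {
 *        XXH64_canonicalFromHash(dst, XXH64(data, size, 0));  // big-endian bytes, portable on disk
 *    }
 *    XXH64_hash_t load_hash(const XXH64_canonical_t* src)
 *    {
 *        return XXH64_hashFromCanonical(src);                 // same value on any platform
 *    }
 */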
844
845 XXH_PUBLIC_API void XXH32_canonicalFromHash(XXH32_canonical_t* dst, XXH32_hash_t hash)
846 {
847 XXH_STATIC_ASSERT(sizeof(XXH32_canonical_t) == sizeof(XXH32_hash_t));
848 if (XXH_CPU_LITTLE_ENDIAN) hash = XXH_swap32(hash);
849 memcpy(dst, &hash, sizeof(*dst));
850 }
851
852 XXH_PUBLIC_API void XXH64_canonicalFromHash(XXH64_canonical_t* dst, XXH64_hash_t hash)
853 {
854 XXH_STATIC_ASSERT(sizeof(XXH64_canonical_t) == sizeof(XXH64_hash_t));
855 if (XXH_CPU_LITTLE_ENDIAN) hash = XXH_swap64(hash);
856 memcpy(dst, &hash, sizeof(*dst));
857 }
858
859 XXH_PUBLIC_API XXH32_hash_t XXH32_hashFromCanonical(const XXH32_canonical_t* src)
860 {
861 return XXH_readBE32(src);
862 }
863
864 XXH_PUBLIC_API XXH64_hash_t XXH64_hashFromCanonical(const XXH64_canonical_t* src)
865 {
866 return XXH_readBE64(src);
867 }
@@ -0,0 +1,309 b''
1 /*
2 xxHash - Extremely Fast Hash algorithm
3 Header File
4 Copyright (C) 2012-2016, Yann Collet.
5
6 BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
7
8 Redistribution and use in source and binary forms, with or without
9 modification, are permitted provided that the following conditions are
10 met:
11
12 * Redistributions of source code must retain the above copyright
13 notice, this list of conditions and the following disclaimer.
14 * Redistributions in binary form must reproduce the above
15 copyright notice, this list of conditions and the following disclaimer
16 in the documentation and/or other materials provided with the
17 distribution.
18
19 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
20 "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
21 LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
22 A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
23 OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
24 SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
25 LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
26 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
27 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
28 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
29 OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
30
31 You can contact the author at :
32 - xxHash source repository : https://github.com/Cyan4973/xxHash
33 */
34
35 /* Notice extracted from xxHash homepage :
36
37 xxHash is an extremely fast Hash algorithm, running at RAM speed limits.
38 It also successfully passes all tests from the SMHasher suite.
39
41 Comparison (single thread, Windows 7 32-bit, using SMHasher on a Core 2 Duo @3GHz)
41
42 Name Speed Q.Score Author
43 xxHash 5.4 GB/s 10
44 CrapWow 3.2 GB/s 2 Andrew
45 MurmurHash 3a 2.7 GB/s 10 Austin Appleby
46 SpookyHash 2.0 GB/s 10 Bob Jenkins
47 SBox 1.4 GB/s 9 Bret Mulvey
48 Lookup3 1.2 GB/s 9 Bob Jenkins
49 SuperFastHash 1.2 GB/s 1 Paul Hsieh
50 CityHash64 1.05 GB/s 10 Pike & Alakuijala
51 FNV 0.55 GB/s 5 Fowler, Noll, Vo
52 CRC32 0.43 GB/s 9
53 MD5-32 0.33 GB/s 10 Ronald L. Rivest
54 SHA1-32 0.28 GB/s 10
55
56 Q.Score is a measure of quality of the hash function.
57 It is based on successfully passing the SMHasher test set.
58 10 is a perfect score.
59
60 A 64-bit version, named XXH64, is available since r35.
61 It offers much better speed, but for 64-bit applications only.
62 Name Speed on 64-bit Speed on 32-bit
63 XXH64 13.8 GB/s 1.9 GB/s
64 XXH32 6.8 GB/s 6.0 GB/s
65 */
66
67 #ifndef XXHASH_H_5627135585666179
68 #define XXHASH_H_5627135585666179 1
69
70 #if defined (__cplusplus)
71 extern "C" {
72 #endif
73
74 #ifndef XXH_NAMESPACE
75 # define XXH_NAMESPACE ZSTD_ /* Zstandard specific */
76 #endif
77
78
79 /* ****************************
80 * Definitions
81 ******************************/
82 #include <stddef.h> /* size_t */
83 typedef enum { XXH_OK=0, XXH_ERROR } XXH_errorcode;
84
85
86 /* ****************************
87 * API modifier
88 ******************************/
89 /** XXH_PRIVATE_API
90 * This is useful if you want to include xxhash functions in `static` mode
91 * in order to inline them and remove their symbols from the public list.
92 * Methodology :
93 * #define XXH_PRIVATE_API
94 * #include "xxhash.h"
95 * `xxhash.c` is automatically included.
96 * There is then no need to compile and link it as a separate module.
97 */
98 #ifdef XXH_PRIVATE_API
99 # ifndef XXH_STATIC_LINKING_ONLY
100 # define XXH_STATIC_LINKING_ONLY
101 # endif
102 # if defined(__GNUC__)
103 # define XXH_PUBLIC_API static __inline __attribute__((unused))
104 # elif defined (__cplusplus) || (defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) /* C99 */)
105 # define XXH_PUBLIC_API static inline
106 # elif defined(_MSC_VER)
107 # define XXH_PUBLIC_API static __inline
108 # else
109 # define XXH_PUBLIC_API static /* this version may generate warnings for unused static functions; disable the relevant warning */
110 # endif
111 #else
112 # define XXH_PUBLIC_API /* do nothing */
113 #endif /* XXH_PRIVATE_API */
114
115 /*!XXH_NAMESPACE, aka Namespace Emulation :
116
117 If you want to include _and expose_ xxHash functions from within your own library,
118 but also want to avoid symbol collisions with another library which also includes xxHash,
119
120 you can use XXH_NAMESPACE to automatically prefix every public symbol from the xxhash library
121 with the value of XXH_NAMESPACE (so avoid leaving it empty, and avoid numeric values).
122
123 Note that no change is required within the calling program as long as it includes `xxhash.h` :
124 regular symbol names are automatically translated by this header.
125 */
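/* Illustrative example : in this vendored copy XXH_NAMESPACE defaults to ZSTD_
 * (see above), so a call written as
 *     XXH64_state_t* s = XXH64_createState();
 * compiles unchanged but links against ZSTD_XXH64_createState(). A hypothetical
 * client library could instead build with -DXXH_NAMESPACE=MYLIB_ to obtain
 * MYLIB_-prefixed symbols for every function listed below. */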
126 #ifdef XXH_NAMESPACE
127 # define XXH_CAT(A,B) A##B
128 # define XXH_NAME2(A,B) XXH_CAT(A,B)
129 # define XXH32 XXH_NAME2(XXH_NAMESPACE, XXH32)
130 # define XXH64 XXH_NAME2(XXH_NAMESPACE, XXH64)
131 # define XXH_versionNumber XXH_NAME2(XXH_NAMESPACE, XXH_versionNumber)
132 # define XXH32_createState XXH_NAME2(XXH_NAMESPACE, XXH32_createState)
133 # define XXH64_createState XXH_NAME2(XXH_NAMESPACE, XXH64_createState)
134 # define XXH32_freeState XXH_NAME2(XXH_NAMESPACE, XXH32_freeState)
135 # define XXH64_freeState XXH_NAME2(XXH_NAMESPACE, XXH64_freeState)
136 # define XXH32_reset XXH_NAME2(XXH_NAMESPACE, XXH32_reset)
137 # define XXH64_reset XXH_NAME2(XXH_NAMESPACE, XXH64_reset)
138 # define XXH32_update XXH_NAME2(XXH_NAMESPACE, XXH32_update)
139 # define XXH64_update XXH_NAME2(XXH_NAMESPACE, XXH64_update)
140 # define XXH32_digest XXH_NAME2(XXH_NAMESPACE, XXH32_digest)
141 # define XXH64_digest XXH_NAME2(XXH_NAMESPACE, XXH64_digest)
142 # define XXH32_copyState XXH_NAME2(XXH_NAMESPACE, XXH32_copyState)
143 # define XXH64_copyState XXH_NAME2(XXH_NAMESPACE, XXH64_copyState)
144 # define XXH32_canonicalFromHash XXH_NAME2(XXH_NAMESPACE, XXH32_canonicalFromHash)
145 # define XXH64_canonicalFromHash XXH_NAME2(XXH_NAMESPACE, XXH64_canonicalFromHash)
146 # define XXH32_hashFromCanonical XXH_NAME2(XXH_NAMESPACE, XXH32_hashFromCanonical)
147 # define XXH64_hashFromCanonical XXH_NAME2(XXH_NAMESPACE, XXH64_hashFromCanonical)
148 #endif
149
150
151 /* *************************************
152 * Version
153 ***************************************/
154 #define XXH_VERSION_MAJOR 0
155 #define XXH_VERSION_MINOR 6
156 #define XXH_VERSION_RELEASE 2
157 #define XXH_VERSION_NUMBER (XXH_VERSION_MAJOR *100*100 + XXH_VERSION_MINOR *100 + XXH_VERSION_RELEASE)
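/* e.g. for this release : 0*10000 + 6*100 + 2 == 602 */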
158 XXH_PUBLIC_API unsigned XXH_versionNumber (void);
159
160
161 /* ****************************
162 * Simple Hash Functions
163 ******************************/
164 typedef unsigned int XXH32_hash_t;
165 typedef unsigned long long XXH64_hash_t;
166
167 XXH_PUBLIC_API XXH32_hash_t XXH32 (const void* input, size_t length, unsigned int seed);
168 XXH_PUBLIC_API XXH64_hash_t XXH64 (const void* input, size_t length, unsigned long long seed);
169
170 /*!
171 XXH32() :
172 Calculate the 32-bit hash of the sequence of "length" bytes stored at memory address "input".
173 The memory between input & input+length must be valid (allocated and read-accessible).
174 "seed" can be used to alter the result predictably.
175 Speed on Core 2 Duo @ 3 GHz (single thread, SMHasher benchmark) : 5.4 GB/s
176 XXH64() :
177 Calculate the 64-bit hash of the sequence of "length" bytes stored at memory address "input".
178 "seed" can be used to alter the result predictably.
179 This function runs 2x faster on 64-bit systems, but slower on 32-bit systems (see benchmark).
180 */
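/* One-shot usage sketch (illustrative; the helper name `hash_buffer` and the
 * zero seed are assumptions, and <string.h> supplies strlen()):
 *
 *    unsigned long long hash_buffer(void)
 *    {
 *        static const char msg[] = "sample";
 *        return XXH64(msg, strlen(msg), 0);   // same input + seed => same hash everywhere
 *    }
 */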
181
182
183 /* ****************************
184 * Streaming Hash Functions
185 ******************************/
186 typedef struct XXH32_state_s XXH32_state_t; /* incomplete type */
187 typedef struct XXH64_state_s XXH64_state_t; /* incomplete type */
188
189 /*! State allocation, compatible with dynamic libraries */
190
191 XXH_PUBLIC_API XXH32_state_t* XXH32_createState(void);
192 XXH_PUBLIC_API XXH_errorcode XXH32_freeState(XXH32_state_t* statePtr);
193
194 XXH_PUBLIC_API XXH64_state_t* XXH64_createState(void);
195 XXH_PUBLIC_API XXH_errorcode XXH64_freeState(XXH64_state_t* statePtr);
196
197
198 /* hash streaming */
199
200 XXH_PUBLIC_API XXH_errorcode XXH32_reset (XXH32_state_t* statePtr, unsigned int seed);
201 XXH_PUBLIC_API XXH_errorcode XXH32_update (XXH32_state_t* statePtr, const void* input, size_t length);
202 XXH_PUBLIC_API XXH32_hash_t XXH32_digest (const XXH32_state_t* statePtr);
203
204 XXH_PUBLIC_API XXH_errorcode XXH64_reset (XXH64_state_t* statePtr, unsigned long long seed);
205 XXH_PUBLIC_API XXH_errorcode XXH64_update (XXH64_state_t* statePtr, const void* input, size_t length);
206 XXH_PUBLIC_API XXH64_hash_t XXH64_digest (const XXH64_state_t* statePtr);
207
208 /*
209 These functions generate the xxHash of an input provided in multiple segments.
210 Note that, for small inputs, they are slower than the single-call functions, due to state management.
211 For small inputs, prefer `XXH32()` and `XXH64()` .
212
213 XXH state must first be allocated, using XXH*_createState() .
214
215 Start a new hash by initializing state with a seed, using XXH*_reset().
216
217 Then, feed the hash state by calling XXH*_update() as many times as necessary.
218 Obviously, input must be allocated and read accessible.
219 The function returns an error code, with 0 meaning OK, and any other value meaning there is an error.
220
221 Finally, a hash value can be produced anytime, by using XXH*_digest().
222 This function returns the nn-bit hash as an unsigned int or unsigned long long.
223
224 It's still possible to continue inserting input into the hash state after a digest,
225 and to generate new hashes later on, by calling XXH*_digest() again.
226
227 When done, free XXH state space if it was allocated dynamically.
228 */
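/* Streaming sketch of the sequence above (illustrative; the helper name
 * `hash_two_chunks` is an assumption, and error checks on reset/update are
 * omitted for brevity):
 *
 *    unsigned long long hash_two_chunks(const void* c1, size_t n1,
 *                                       const void* c2, size_t n2)
 *    {
 *        unsigned long long h;
 *        XXH64_state_t* const st = XXH64_createState();
 *        if (st == NULL) return 0;           // allocation failure
 *        XXH64_reset(st, 0);                 // start a new hash, seed 0
 *        XXH64_update(st, c1, n1);           // feed as many segments as needed
 *        XXH64_update(st, c2, n2);
 *        h = XXH64_digest(st);               // equals XXH64() over c1 followed by c2
 *        XXH64_freeState(st);
 *        return h;
 *    }
 */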
229
230
231 /* **************************
232 * Utils
233 ****************************/
234 #if !(defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L)) /* ! C99 */
235 # define restrict /* disable restrict */
236 #endif
237
238 XXH_PUBLIC_API void XXH32_copyState(XXH32_state_t* restrict dst_state, const XXH32_state_t* restrict src_state);
239 XXH_PUBLIC_API void XXH64_copyState(XXH64_state_t* restrict dst_state, const XXH64_state_t* restrict src_state);
240
241
242 /* **************************
243 * Canonical representation
244 ****************************/
245 typedef struct { unsigned char digest[4]; } XXH32_canonical_t;
246 typedef struct { unsigned char digest[8]; } XXH64_canonical_t;
247
248 XXH_PUBLIC_API void XXH32_canonicalFromHash(XXH32_canonical_t* dst, XXH32_hash_t hash);
249 XXH_PUBLIC_API void XXH64_canonicalFromHash(XXH64_canonical_t* dst, XXH64_hash_t hash);
250
251 XXH_PUBLIC_API XXH32_hash_t XXH32_hashFromCanonical(const XXH32_canonical_t* src);
252 XXH_PUBLIC_API XXH64_hash_t XXH64_hashFromCanonical(const XXH64_canonical_t* src);
253
254 /* The default result types for XXH functions are primitive unsigned 32-bit and 64-bit integers.
255 * The canonical representation uses the human-readable write convention, aka big-endian (large digits first).
256 * These functions allow transformation of a hash result into and from its canonical format.
257 * This way, hash values can be written into a file / memory, and remain comparable across different systems and programs.
258 */
259
260
261 #ifdef XXH_STATIC_LINKING_ONLY
262
263 /* ================================================================================================
264 This section contains definitions which are not guaranteed to remain stable.
265 They may change in future versions, becoming incompatible with a different version of the library.
266 They shall only be used with static linking.
267 Never use these definitions in association with dynamic linking !
268 =================================================================================================== */
269
270 /* These definitions are only meant to allow allocation of XXH state
271 statically, on the stack, or in a struct, for example.
272 Do not use members directly. */
273
274 struct XXH32_state_s {
275 unsigned total_len_32;
276 unsigned large_len;
277 unsigned v1;
278 unsigned v2;
279 unsigned v3;
280 unsigned v4;
281 unsigned mem32[4]; /* buffer defined as U32 for alignment */
282 unsigned memsize;
283 unsigned reserved; /* never read nor written, will be removed in a future version */
284 }; /* typedef'd to XXH32_state_t */
285
286 struct XXH64_state_s {
287 unsigned long long total_len;
288 unsigned long long v1;
289 unsigned long long v2;
290 unsigned long long v3;
291 unsigned long long v4;
292 unsigned long long mem64[4]; /* buffer defined as U64 for alignment */
293 unsigned memsize;
294 unsigned reserved[2]; /* never read nor written, will be removed in a future version */
295 }; /* typedef'd to XXH64_state_t */
296
297
298 # ifdef XXH_PRIVATE_API
299 # include "xxhash.c" /* include xxhash functions as `static`, for inlining */
300 # endif
301
302 #endif /* XXH_STATIC_LINKING_ONLY */
303
304
305 #if defined (__cplusplus)
306 }
307 #endif
308
309 #endif /* XXHASH_H_5627135585666179 */
@@ -0,0 +1,191 b''
1 /**
2 * Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
3 * All rights reserved.
4 *
5 * This source code is licensed under the BSD-style license found in the
6 * LICENSE file in the root directory of this source tree. An additional grant
7 * of patent rights can be found in the PATENTS file in the same directory.
8 */
9
10 /* ***************************************************************
11 * NOTES/WARNINGS
12 *****************************************************************/
13 /* The streaming API defined here will soon be deprecated by the
14 * new one in 'zstd.h'; consider migrating towards the newer
15 * streaming API. See 'lib/README.md'.
16 *****************************************************************/
17
18 #ifndef ZSTD_BUFFERED_H_23987
19 #define ZSTD_BUFFERED_H_23987
20
21 #if defined (__cplusplus)
22 extern "C" {
23 #endif
24
25 /* *************************************
26 * Dependencies
27 ***************************************/
28 #include <stddef.h> /* size_t */
29
30
31 /* ***************************************************************
32 * Compiler specifics
33 *****************************************************************/
34 /* ZSTD_DLL_EXPORT :
35 * Enable exporting of functions when building a Windows DLL */
36 #if defined(_WIN32) && defined(ZSTD_DLL_EXPORT) && (ZSTD_DLL_EXPORT==1)
37 # define ZSTDLIB_API __declspec(dllexport)
38 #else
39 # define ZSTDLIB_API
40 #endif
41
42
43 /* *************************************
44 * Streaming functions
45 ***************************************/
46 /* This is the easier "buffered" streaming API,
47 * using an internal buffer to lift all restrictions on user-provided buffers
48 * which can be any size, any place, for both input and output.
49 * ZBUFF and ZSTD are 100% interoperable,
50 * frames created by one can be decoded by the other */
51
52 typedef struct ZBUFF_CCtx_s ZBUFF_CCtx;
53 ZSTDLIB_API ZBUFF_CCtx* ZBUFF_createCCtx(void);
54 ZSTDLIB_API size_t ZBUFF_freeCCtx(ZBUFF_CCtx* cctx);
55
56 ZSTDLIB_API size_t ZBUFF_compressInit(ZBUFF_CCtx* cctx, int compressionLevel);
57 ZSTDLIB_API size_t ZBUFF_compressInitDictionary(ZBUFF_CCtx* cctx, const void* dict, size_t dictSize, int compressionLevel);
58
59 ZSTDLIB_API size_t ZBUFF_compressContinue(ZBUFF_CCtx* cctx, void* dst, size_t* dstCapacityPtr, const void* src, size_t* srcSizePtr);
60 ZSTDLIB_API size_t ZBUFF_compressFlush(ZBUFF_CCtx* cctx, void* dst, size_t* dstCapacityPtr);
61 ZSTDLIB_API size_t ZBUFF_compressEnd(ZBUFF_CCtx* cctx, void* dst, size_t* dstCapacityPtr);
62
63 /*-*************************************************
64 * Streaming compression - howto
65 *
66 * A ZBUFF_CCtx object is required to track streaming operation.
67 * Use ZBUFF_createCCtx() and ZBUFF_freeCCtx() to create/release resources.
68 * ZBUFF_CCtx objects can be reused multiple times.
69 *
70 * Start by initializing the ZBUFF_CCtx.
71 * Use ZBUFF_compressInit() to start a new compression operation.
72 * Use ZBUFF_compressInitDictionary() for a compression which requires a dictionary.
73 *
74 * Use ZBUFF_compressContinue() repetitively to consume the input stream.
75 * *srcSizePtr and *dstCapacityPtr can be any size.
76 * The function reports how many bytes were read and written by updating *srcSizePtr and *dstCapacityPtr.
77 * Note that it may not consume the entire input, in which case it's up to the caller to present the remaining data again.
78 * The content of `dst` will be overwritten (up to *dstCapacityPtr) at each call, so save its content if it matters, or change `dst`.
79 * @return : a hint for the preferred number of bytes to use as input for the next function call (it's just a hint, to improve latency)
80 * or an error code, which can be tested using ZBUFF_isError().
81 *
82 * At any moment, it's possible to flush whatever data remains within the internal buffer, using ZBUFF_compressFlush().
83 * The number of bytes written into `dst` will be reported in *dstCapacityPtr.
84 * Note that the function cannot output more than *dstCapacityPtr bytes;
85 * therefore, some content might still be left in the internal buffer if *dstCapacityPtr is too small.
86 * @return : number of bytes still present in the internal buffer (0 if it's empty)
87 * or an error code, which can be tested using ZBUFF_isError().
88 *
89 * ZBUFF_compressEnd() instructs the context to finish a frame.
90 * It will perform a flush and write the frame epilogue.
91 * The epilogue is required for decoders to consider a frame completed.
92 * Similar to ZBUFF_compressFlush(), it may not be able to output the entire internal buffer content if *dstCapacityPtr is too small.
93 * In that case, call ZBUFF_compressFlush() again to complete the flush.
94 * @return : number of bytes still present in the internal buffer (0 if it's empty)
95 * or an error code, which can be tested using ZBUFF_isError().
96 *
97 * Hint : _recommended buffer_ sizes (not compulsory) : ZBUFF_recommendedCInSize() / ZBUFF_recommendedCOutSize()
98 * input : ZBUFF_recommendedCInSize == 128 KB; the 128 KB block size is the internal unit, so using this value reduces intermediate stages (better latency)
99 * output : ZBUFF_recommendedCOutSize == ZSTD_compressBound(128 KB) + 3 + 3; this ensures it's always possible to write/flush/end a full block, skipping some buffering.
100 * Using both ensures that input is entirely consumed and output always contains the result, reducing intermediate buffering.
101 * **************************************************/
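/* Compression loop sketch (illustrative; assumes cctx was already prepared with
 * ZBUFF_compressInit(), and `emit` stands for a hypothetical output sink):
 *
 *    size_t compress_chunk(ZBUFF_CCtx* cctx, const void* src, size_t srcSize,
 *                          void* dst, size_t dstCapacity)
 *    {
 *        while (srcSize > 0) {
 *            size_t dstUsed = dstCapacity;   // in : capacity ; out : bytes produced
 *            size_t srcUsed = srcSize;       // in : available; out : bytes consumed
 *            size_t const hint = ZBUFF_compressContinue(cctx, dst, &dstUsed, src, &srcUsed);
 *            if (ZBUFF_isError(hint)) return hint;
 *            // emit(dst, dstUsed);
 *            src = (const char*)src + srcUsed;
 *            srcSize -= srcUsed;
 *        }
 *        {   size_t dstUsed = dstCapacity;
 *            size_t const left = ZBUFF_compressEnd(cctx, dst, &dstUsed);  // flush + epilogue
 *            if (ZBUFF_isError(left)) return left;
 *            // emit(dst, dstUsed); if left > 0, flush again with more room
 *            return left;
 *        }
 *    }
 */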
102
103
104 typedef struct ZBUFF_DCtx_s ZBUFF_DCtx;
105 ZSTDLIB_API ZBUFF_DCtx* ZBUFF_createDCtx(void);
106 ZSTDLIB_API size_t ZBUFF_freeDCtx(ZBUFF_DCtx* dctx);
107
108 ZSTDLIB_API size_t ZBUFF_decompressInit(ZBUFF_DCtx* dctx);
109 ZSTDLIB_API size_t ZBUFF_decompressInitDictionary(ZBUFF_DCtx* dctx, const void* dict, size_t dictSize);
110
111 ZSTDLIB_API size_t ZBUFF_decompressContinue(ZBUFF_DCtx* dctx,
112 void* dst, size_t* dstCapacityPtr,
113 const void* src, size_t* srcSizePtr);
114
115 /*-***************************************************************************
116 * Streaming decompression howto
117 *
118 * A ZBUFF_DCtx object is required to track streaming operations.
119 * Use ZBUFF_createDCtx() and ZBUFF_freeDCtx() to create/release resources.
120 * Use ZBUFF_decompressInit() to start a new decompression operation,
121 * or ZBUFF_decompressInitDictionary() if decompression requires a dictionary.
122 * Note that ZBUFF_DCtx objects can be re-initialized multiple times.
123 *
124 * Use ZBUFF_decompressContinue() repetitively to consume your input.
125 * *srcSizePtr and *dstCapacityPtr can be any size.
126 * The function reports how many bytes were read and written by modifying *srcSizePtr and *dstCapacityPtr.
127 * Note that it may not consume the entire input, in which case it's up to the caller to present the remaining input again.
128 * The content of `dst` will be overwritten (up to *dstCapacityPtr) at each function call, so save its content if it matters, or change `dst`.
129 * @return : 0 when a frame is completely decoded and fully flushed,
130 * 1 when there is still some data left within the internal buffer to flush,
131 * >1 when more data is expected, the value being a suggested next input size (it's just a hint, which helps latency),
132 * or an error code, which can be tested using ZBUFF_isError().
133 *
134 * Hint : recommended buffer sizes (not compulsory) : ZBUFF_recommendedDInSize() and ZBUFF_recommendedDOutSize()
135 * output : ZBUFF_recommendedDOutSize == 128 KB; the 128 KB block size is the internal unit, so a fully decoded block can always be written.
136 * input : ZBUFF_recommendedDInSize == 128 KB + 3;
137 * just follow the indications from ZBUFF_decompressContinue() to minimize latency. It should always be <= 128 KB + 3 .
138 * *******************************************************************************/
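/* Decompression loop sketch (illustrative; assumes dctx was initialized with
 * ZBUFF_decompressInit(), and `consume` stands for a hypothetical output sink):
 *
 *    size_t decompress_frame(ZBUFF_DCtx* dctx, const void* src, size_t srcSize,
 *                            void* dst, size_t dstCapacity)
 *    {
 *        size_t r = 1;
 *        while (r != 0) {
 *            size_t dstUsed = dstCapacity;   // in : capacity ; out : bytes decoded
 *            size_t srcUsed = srcSize;       // in : available; out : bytes consumed
 *            r = ZBUFF_decompressContinue(dctx, dst, &dstUsed, src, &srcUsed);
 *            if (ZBUFF_isError(r)) return r;
 *            // consume(dst, dstUsed);
 *            src = (const char*)src + srcUsed;
 *            srcSize -= srcUsed;
 *            if (r > 1 && srcSize == 0) break;   // decoder expects more input than available
 *        }
 *        return r;   // 0 : frame fully decoded and flushed; >1 : suggested next input size
 *    }
 */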
139
140
141 /* *************************************
142 * Tool functions
143 ***************************************/
144 ZSTDLIB_API unsigned ZBUFF_isError(size_t errorCode);
145 ZSTDLIB_API const char* ZBUFF_getErrorName(size_t errorCode);
146
147 /** The functions below provide recommended buffer sizes for compression and decompression operations.
148 * These sizes are just hints; using them tends to offer better latency */
149 ZSTDLIB_API size_t ZBUFF_recommendedCInSize(void);
150 ZSTDLIB_API size_t ZBUFF_recommendedCOutSize(void);
151 ZSTDLIB_API size_t ZBUFF_recommendedDInSize(void);
152 ZSTDLIB_API size_t ZBUFF_recommendedDOutSize(void);
153
154
155 #ifdef ZBUFF_STATIC_LINKING_ONLY
156
157 /* ====================================================================================
158 * The definitions in this section are considered experimental.
159 * They should never be used in association with a dynamic library, as they may change in the future.
160 * They are provided for advanced usages.
161 * Use them only in association with static linking.
162 * ==================================================================================== */
163
164 /*--- Dependency ---*/
165 #define ZSTD_STATIC_LINKING_ONLY /* ZSTD_parameters, ZSTD_customMem */
166 #include "zstd.h"
167
168
169 /*--- Custom memory allocator ---*/
170 /*! ZBUFF_createCCtx_advanced() :
171 * Create a ZBUFF compression context using external alloc and free functions */
172 ZSTDLIB_API ZBUFF_CCtx* ZBUFF_createCCtx_advanced(ZSTD_customMem customMem);
173
174 /*! ZBUFF_createDCtx_advanced() :
175 * Create a ZBUFF decompression context using external alloc and free functions */
176 ZSTDLIB_API ZBUFF_DCtx* ZBUFF_createDCtx_advanced(ZSTD_customMem customMem);
177
178
179 /*--- Advanced Streaming Initialization ---*/
180 ZSTDLIB_API size_t ZBUFF_compressInit_advanced(ZBUFF_CCtx* zbc,
181 const void* dict, size_t dictSize,
182 ZSTD_parameters params, unsigned long long pledgedSrcSize);
183
184 #endif /* ZBUFF_STATIC_LINKING_ONLY */
185
186
187 #if defined (__cplusplus)
188 }
189 #endif
190
191 #endif /* ZSTD_BUFFERED_H_23987 */
@@ -0,0 +1,83 b''
1 /**
2 * Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
3 * All rights reserved.
4 *
5 * This source code is licensed under the BSD-style license found in the
6 * LICENSE file in the root directory of this source tree. An additional grant
7 * of patent rights can be found in the PATENTS file in the same directory.
8 */
9
10
11
12 /*-*************************************
13 * Dependencies
14 ***************************************/
15 #include <stdlib.h> /* malloc */
16 #include "error_private.h"
17 #define ZSTD_STATIC_LINKING_ONLY
18 #include "zstd.h" /* declaration of ZSTD_isError, ZSTD_getErrorName, ZSTD_getErrorCode, ZSTD_getErrorString, ZSTD_versionNumber */
19 #include "zbuff.h" /* declaration of ZBUFF_isError, ZBUFF_getErrorName */
20
21
22 /*-****************************************
23 * Version
24 ******************************************/
25 unsigned ZSTD_versionNumber (void) { return ZSTD_VERSION_NUMBER; }
26
27
28 /*-****************************************
29 * ZSTD Error Management
30 ******************************************/
31 /*! ZSTD_isError() :
32 * tells if a return value is an error code */
33 unsigned ZSTD_isError(size_t code) { return ERR_isError(code); }
34
35 /*! ZSTD_getErrorName() :
36 * provides error code string from function result (useful for debugging) */
37 const char* ZSTD_getErrorName(size_t code) { return ERR_getErrorName(code); }
38
39 /*! ZSTD_getErrorCode() :
40 * convert a `size_t` function result into a proper ZSTD_ErrorCode enum */
41 ZSTD_ErrorCode ZSTD_getErrorCode(size_t code) { return ERR_getErrorCode(code); }
42
43 /*! ZSTD_getErrorString() :
44 * provides error code string from enum */
45 const char* ZSTD_getErrorString(ZSTD_ErrorCode code) { return ERR_getErrorName(code); }
46
47
48 /* **************************************************************
49 * ZBUFF Error Management
50 ****************************************************************/
51 unsigned ZBUFF_isError(size_t errorCode) { return ERR_isError(errorCode); }
52
53 const char* ZBUFF_getErrorName(size_t errorCode) { return ERR_getErrorName(errorCode); }
54
55
56
57 /*=**************************************************************
58 * Custom allocator
59 ****************************************************************/
60 /* default uses stdlib */
61 void* ZSTD_defaultAllocFunction(void* opaque, size_t size)
62 {
63 void* address = malloc(size);
64 (void)opaque;
65 return address;
66 }
67
68 void ZSTD_defaultFreeFunction(void* opaque, void* address)
69 {
70 (void)opaque;
71 free(address);
72 }
73
74 void* ZSTD_malloc(size_t size, ZSTD_customMem customMem)
75 {
76 return customMem.customAlloc(customMem.opaque, size);
77 }
78
79 void ZSTD_free(void* ptr, ZSTD_customMem customMem)
80 {
81 if (ptr!=NULL)
82 customMem.customFree(customMem.opaque, ptr);
83 }
@@ -0,0 +1,60 b''
1 /**
2 * Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
3 * All rights reserved.
4 *
5 * This source code is licensed under the BSD-style license found in the
6 * LICENSE file in the root directory of this source tree. An additional grant
7 * of patent rights can be found in the PATENTS file in the same directory.
8 */
9
10 #ifndef ZSTD_ERRORS_H_398273423
11 #define ZSTD_ERRORS_H_398273423
12
13 #if defined (__cplusplus)
14 extern "C" {
15 #endif
16
17 /*===== dependency =====*/
18 #include <stddef.h> /* size_t */
19
20
21 /*-****************************************
22 * error codes list
23 ******************************************/
24 typedef enum {
25 ZSTD_error_no_error,
26 ZSTD_error_GENERIC,
27 ZSTD_error_prefix_unknown,
28 ZSTD_error_version_unsupported,
29 ZSTD_error_parameter_unknown,
30 ZSTD_error_frameParameter_unsupported,
31 ZSTD_error_frameParameter_unsupportedBy32bits,
32 ZSTD_error_frameParameter_windowTooLarge,
33 ZSTD_error_compressionParameter_unsupported,
34 ZSTD_error_init_missing,
35 ZSTD_error_memory_allocation,
36 ZSTD_error_stage_wrong,
37 ZSTD_error_dstSize_tooSmall,
38 ZSTD_error_srcSize_wrong,
39 ZSTD_error_corruption_detected,
40 ZSTD_error_checksum_wrong,
41 ZSTD_error_tableLog_tooLarge,
42 ZSTD_error_maxSymbolValue_tooLarge,
43 ZSTD_error_maxSymbolValue_tooSmall,
44 ZSTD_error_dictionary_corrupted,
45 ZSTD_error_dictionary_wrong,
46 ZSTD_error_maxCode
47 } ZSTD_ErrorCode;
48
49 /*! ZSTD_getErrorCode() :
50 convert a `size_t` function result into a `ZSTD_ErrorCode` enum type,
51 which can be compared directly against the enum list published in "error_public.h" */
52 ZSTD_ErrorCode ZSTD_getErrorCode(size_t functionResult);
53 const char* ZSTD_getErrorString(ZSTD_ErrorCode code);
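/* Inspection sketch (illustrative; ZSTD_compress() and ZSTD_isError() are
 * declared in "zstd.h", fprintf() needs <stdio.h>, and the retry comment is
 * only a suggestion):
 *
 *    size_t const r = ZSTD_compress(dst, dstCapacity, src, srcSize, 1);
 *    if (ZSTD_isError(r)) {
 *        ZSTD_ErrorCode const code = ZSTD_getErrorCode(r);
 *        if (code == ZSTD_error_dstSize_tooSmall) {
 *            // grow dst, then call ZSTD_compress() again
 *        }
 *        fprintf(stderr, "zstd: %s\n", ZSTD_getErrorString(code));
 *    }
 */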
54
55
56 #if defined (__cplusplus)
57 }
58 #endif
59
60 #endif /* ZSTD_ERRORS_H_398273423 */
@@ -0,0 +1,267 b''
1 /**
2 * Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
3 * All rights reserved.
4 *
5 * This source code is licensed under the BSD-style license found in the
6 * LICENSE file in the root directory of this source tree. An additional grant
7 * of patent rights can be found in the PATENTS file in the same directory.
8 */
9
10 #ifndef ZSTD_CCOMMON_H_MODULE
11 #define ZSTD_CCOMMON_H_MODULE
12
13 /*-*******************************************************
14 * Compiler specifics
15 *********************************************************/
16 #ifdef _MSC_VER /* Visual Studio */
17 # define FORCE_INLINE static __forceinline
18 # include <intrin.h> /* For Visual 2005 */
19 # pragma warning(disable : 4127) /* disable: C4127: conditional expression is constant */
20 # pragma warning(disable : 4324) /* disable: C4324: padded structure */
21 # pragma warning(disable : 4100) /* disable: C4100: unreferenced formal parameter */
22 #else
23 # if defined (__cplusplus) || defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L /* C99 */
24 # ifdef __GNUC__
25 # define FORCE_INLINE static inline __attribute__((always_inline))
26 # else
27 # define FORCE_INLINE static inline
28 # endif
29 # else
30 # define FORCE_INLINE static
31 # endif /* __STDC_VERSION__ */
32 #endif
33
34 #ifdef _MSC_VER
35 # define FORCE_NOINLINE static __declspec(noinline)
36 #else
37 # ifdef __GNUC__
38 # define FORCE_NOINLINE static __attribute__((__noinline__))
39 # else
40 # define FORCE_NOINLINE static
41 # endif
42 #endif
43
44
45 /*-*************************************
46 * Dependencies
47 ***************************************/
48 #include "mem.h"
49 #include "error_private.h"
50 #define ZSTD_STATIC_LINKING_ONLY
51 #include "zstd.h"
52
53
54 /*-*************************************
55 * shared macros
56 ***************************************/
57 #define MIN(a,b) ((a)<(b) ? (a) : (b))
58 #define MAX(a,b) ((a)>(b) ? (a) : (b))
59 #define CHECK_F(f) { size_t const errcod = f; if (ERR_isError(errcod)) return errcod; } /* check and Forward error code */
60 #define CHECK_E(f, e) { size_t const errcod = f; if (ERR_isError(errcod)) return ERROR(e); } /* check and send Error code */
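/* usage sketch : with `size_t stage(void)` standing for any hypothetical step
 * returning an error code, CHECK_F(stage()) makes the enclosing function return
 * that code on error, while CHECK_E(stage(), corruption_detected) substitutes
 * the named error instead */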
61
62
63 /*-*************************************
64 * Common constants
65 ***************************************/
66 #define ZSTD_OPT_NUM (1<<12)
67 #define ZSTD_DICT_MAGIC 0xEC30A437 /* v0.7+ */
68
69 #define ZSTD_REP_NUM 3 /* number of repcodes */
70 #define ZSTD_REP_CHECK (ZSTD_REP_NUM) /* number of repcodes to check by the optimal parser */
71 #define ZSTD_REP_MOVE (ZSTD_REP_NUM-1)
72 #define ZSTD_REP_MOVE_OPT (ZSTD_REP_NUM)
73 static const U32 repStartValue[ZSTD_REP_NUM] = { 1, 4, 8 };
74
75 #define KB *(1 <<10)
76 #define MB *(1 <<20)
77 #define GB *(1U<<30)
78
79 #define BIT7 128
80 #define BIT6 64
81 #define BIT5 32
82 #define BIT4 16
83 #define BIT1 2
84 #define BIT0 1
85
86 #define ZSTD_WINDOWLOG_ABSOLUTEMIN 10
87 static const size_t ZSTD_fcs_fieldSize[4] = { 0, 2, 4, 8 };
88 static const size_t ZSTD_did_fieldSize[4] = { 0, 1, 2, 4 };
89
90 #define ZSTD_BLOCKHEADERSIZE 3 /* the C standard doesn't allow a `static const` variable to be initialized from another `static const` variable */
91 static const size_t ZSTD_blockHeaderSize = ZSTD_BLOCKHEADERSIZE;
92 typedef enum { bt_raw, bt_rle, bt_compressed, bt_reserved } blockType_e;
93
94 #define MIN_SEQUENCES_SIZE 1 /* nbSeq==0 */
95 #define MIN_CBLOCK_SIZE (1 /*litCSize*/ + 1 /* RLE or RAW */ + MIN_SEQUENCES_SIZE /* nbSeq==0 */) /* for a non-null block */
96
97 #define HufLog 12
98 typedef enum { set_basic, set_rle, set_compressed, set_repeat } symbolEncodingType_e;
99
100 #define LONGNBSEQ 0x7F00
101
102 #define MINMATCH 3
103 #define EQUAL_READ32 4
104
105 #define Litbits 8
106 #define MaxLit ((1<<Litbits) - 1)
107 #define MaxML 52
108 #define MaxLL 35
109 #define MaxOff 28
110 #define MaxSeq MAX(MaxLL, MaxML) /* Assumption : MaxOff < MaxLL,MaxML */
111 #define MLFSELog 9
112 #define LLFSELog 9
113 #define OffFSELog 8
114
115 static const U32 LL_bits[MaxLL+1] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
116 1, 1, 1, 1, 2, 2, 3, 3, 4, 6, 7, 8, 9,10,11,12,
117 13,14,15,16 };
118 static const S16 LL_defaultNorm[MaxLL+1] = { 4, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1,
119 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 1, 1, 1, 1, 1,
120 -1,-1,-1,-1 };
121 #define LL_DEFAULTNORMLOG 6 /* for static allocation */
122 static const U32 LL_defaultNormLog = LL_DEFAULTNORMLOG;
123
124 static const U32 ML_bits[MaxML+1] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
125 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
126 1, 1, 1, 1, 2, 2, 3, 3, 4, 4, 5, 7, 8, 9,10,11,
127 12,13,14,15,16 };
128 static const S16 ML_defaultNorm[MaxML+1] = { 1, 4, 3, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1,
129 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
130 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,-1,-1,
131 -1,-1,-1,-1,-1 };
132 #define ML_DEFAULTNORMLOG 6 /* for static allocation */
133 static const U32 ML_defaultNormLog = ML_DEFAULTNORMLOG;
134
135 static const S16 OF_defaultNorm[MaxOff+1] = { 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1,
136 1, 1, 1, 1, 1, 1, 1, 1,-1,-1,-1,-1,-1 };
137 #define OF_DEFAULTNORMLOG 5 /* for static allocation */
138 static const U32 OF_defaultNormLog = OF_DEFAULTNORMLOG;
139
140
141 /*-*******************************************
142 * Shared functions to include for inlining
143 *********************************************/
144 static void ZSTD_copy8(void* dst, const void* src) { memcpy(dst, src, 8); }
145 #define COPY8(d,s) { ZSTD_copy8(d,s); d+=8; s+=8; }
146
147 /*! ZSTD_wildcopy() :
148 * custom version of memcpy(), which can copy up to 7 bytes too many (8 bytes if length==0) */
149 #define WILDCOPY_OVERLENGTH 8
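/* corollary : destination buffers passed to ZSTD_wildcopy() must reserve
 * WILDCOPY_OVERLENGTH spare bytes beyond the last byte actually needed */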
150 MEM_STATIC void ZSTD_wildcopy(void* dst, const void* src, size_t length)
151 {
152 const BYTE* ip = (const BYTE*)src;
153 BYTE* op = (BYTE*)dst;
154 BYTE* const oend = op + length;
155 do
156 COPY8(op, ip)
157 while (op < oend);
158 }
159
160 MEM_STATIC void ZSTD_wildcopy_e(void* dst, const void* src, void* dstEnd) /* should be faster for decoding, but strangely, not verified on all platforms */
161 {
162 const BYTE* ip = (const BYTE*)src;
163 BYTE* op = (BYTE*)dst;
164 BYTE* const oend = (BYTE*)dstEnd;
165 do
166 COPY8(op, ip)
167 while (op < oend);
168 }
169
170
171 /*-*******************************************
172 * Private interfaces
173 *********************************************/
174 typedef struct ZSTD_stats_s ZSTD_stats_t;
175
176 typedef struct {
177 U32 off;
178 U32 len;
179 } ZSTD_match_t;
180
181 typedef struct {
182 U32 price;
183 U32 off;
184 U32 mlen;
185 U32 litlen;
186 U32 rep[ZSTD_REP_NUM];
187 } ZSTD_optimal_t;
188
189
190 typedef struct seqDef_s {
191 U32 offset;
192 U16 litLength;
193 U16 matchLength;
194 } seqDef;
195
196
197 typedef struct {
198 seqDef* sequencesStart;
199 seqDef* sequences;
200 BYTE* litStart;
201 BYTE* lit;
202 BYTE* llCode;
203 BYTE* mlCode;
204 BYTE* ofCode;
205 U32 longLengthID; /* 0 == no longLength; 1 == Lit.longLength; 2 == Match.longLength; */
206 U32 longLengthPos;
207 /* opt */
208 ZSTD_optimal_t* priceTable;
209 ZSTD_match_t* matchTable;
210 U32* matchLengthFreq;
211 U32* litLengthFreq;
212 U32* litFreq;
213 U32* offCodeFreq;
214 U32 matchLengthSum;
215 U32 matchSum;
216 U32 litLengthSum;
217 U32 litSum;
218 U32 offCodeSum;
219 U32 log2matchLengthSum;
220 U32 log2matchSum;
221 U32 log2litLengthSum;
222 U32 log2litSum;
223 U32 log2offCodeSum;
224 U32 factor;
225 U32 cachedPrice;
226 U32 cachedLitLength;
227 const BYTE* cachedLiterals;
228 } seqStore_t;
229
230 const seqStore_t* ZSTD_getSeqStore(const ZSTD_CCtx* ctx);
231 void ZSTD_seqToCodes(const seqStore_t* seqStorePtr);
232 int ZSTD_isSkipFrame(ZSTD_DCtx* dctx);
233
234 /* custom memory allocation functions */
235 void* ZSTD_defaultAllocFunction(void* opaque, size_t size);
236 void ZSTD_defaultFreeFunction(void* opaque, void* address);
237 static const ZSTD_customMem defaultCustomMem = { ZSTD_defaultAllocFunction, ZSTD_defaultFreeFunction, NULL };
238 void* ZSTD_malloc(size_t size, ZSTD_customMem customMem);
239 void ZSTD_free(void* ptr, ZSTD_customMem customMem);
240
241
242 /*====== common function ======*/
243
244 MEM_STATIC U32 ZSTD_highbit32(U32 val)
245 {
246 # if defined(_MSC_VER) /* Visual */
247 unsigned long r=0;
248 _BitScanReverse(&r, val);
249 return (unsigned)r;
250 # elif defined(__GNUC__) && (__GNUC__ >= 3) /* GCC Intrinsic */
251 return 31 - __builtin_clz(val);
252 # else /* Software version */
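    /* the shifts smear the highest set bit downward, rounding val up to 2^(r+1)-1;
       multiplying by the De Bruijn constant then leaves a unique 5-bit pattern in
       the top bits, which indexes the position table below */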
253 static const int DeBruijnClz[32] = { 0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30, 8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31 };
254 U32 v = val;
255 int r;
256 v |= v >> 1;
257 v |= v >> 2;
258 v |= v >> 4;
259 v |= v >> 8;
260 v |= v >> 16;
261 r = DeBruijnClz[(U32)(v * 0x07C4ACDDU) >> 27];
262 return r;
263 # endif
264 }
265
266
267 #endif /* ZSTD_CCOMMON_H_MODULE */
@@ -0,0 +1,810 b''
1 /* ******************************************************************
2 FSE : Finite State Entropy encoder
3 Copyright (C) 2013-2015, Yann Collet.
4
5 BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
6
7 Redistribution and use in source and binary forms, with or without
8 modification, are permitted provided that the following conditions are
9 met:
10
11 * Redistributions of source code must retain the above copyright
12 notice, this list of conditions and the following disclaimer.
13 * Redistributions in binary form must reproduce the above
14 copyright notice, this list of conditions and the following disclaimer
15 in the documentation and/or other materials provided with the
16 distribution.
17
18 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
19 "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
20 LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
21 A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
22 OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
23 SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
24 LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
25 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
26 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
27 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
28 OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
29
30 You can contact the author at :
31 - FSE source repository : https://github.com/Cyan4973/FiniteStateEntropy
32 - Public forum : https://groups.google.com/forum/#!forum/lz4c
33 ****************************************************************** */
34
35 /* **************************************************************
36 * Compiler specifics
37 ****************************************************************/
38 #ifdef _MSC_VER /* Visual Studio */
39 # define FORCE_INLINE static __forceinline
40 # include <intrin.h> /* For Visual 2005 */
41 # pragma warning(disable : 4127) /* disable: C4127: conditional expression is constant */
42 # pragma warning(disable : 4214) /* disable: C4214: non-int bitfields */
43 #else
44 # if defined (__cplusplus) || defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L /* C99 */
45 # ifdef __GNUC__
46 # define FORCE_INLINE static inline __attribute__((always_inline))
47 # else
48 # define FORCE_INLINE static inline
49 # endif
50 # else
51 # define FORCE_INLINE static
52 # endif /* __STDC_VERSION__ */
53 #endif
54
55
56 /* **************************************************************
57 * Includes
58 ****************************************************************/
59 #include <stdlib.h> /* malloc, free, qsort */
60 #include <string.h> /* memcpy, memset */
61 #include <stdio.h> /* printf (debug) */
62 #include "bitstream.h"
63 #define FSE_STATIC_LINKING_ONLY
64 #include "fse.h"
65
66
67 /* **************************************************************
68 * Error Management
69 ****************************************************************/
70 #define FSE_STATIC_ASSERT(c) { enum { FSE_static_assert = 1/(int)(!!(c)) }; } /* use only *after* variable declarations */
71
72
73 /* **************************************************************
74 * Complex types
75 ****************************************************************/
76 typedef U32 CTable_max_t[FSE_CTABLE_SIZE_U32(FSE_MAX_TABLELOG, FSE_MAX_SYMBOL_VALUE)];
77
78
79 /* **************************************************************
80 * Templates
81 ****************************************************************/
82 /*
83 designed to be included
84 for type-specific functions (template emulation in C)
85 Objective is to write these functions only once, for improved maintenance
86 */
87
88 /* safety checks */
89 #ifndef FSE_FUNCTION_EXTENSION
90 # error "FSE_FUNCTION_EXTENSION must be defined"
91 #endif
92 #ifndef FSE_FUNCTION_TYPE
93 # error "FSE_FUNCTION_TYPE must be defined"
94 #endif
95
96 /* Function names */
97 #define FSE_CAT(X,Y) X##Y
98 #define FSE_FUNCTION_NAME(X,Y) FSE_CAT(X,Y)
99 #define FSE_TYPE_NAME(X,Y) FSE_CAT(X,Y)
100
101
102 /* Function templates */
103 size_t FSE_buildCTable(FSE_CTable* ct, const short* normalizedCounter, unsigned maxSymbolValue, unsigned tableLog)
104 {
105 U32 const tableSize = 1 << tableLog;
106 U32 const tableMask = tableSize - 1;
107 void* const ptr = ct;
108 U16* const tableU16 = ( (U16*) ptr) + 2;
109 void* const FSCT = ((U32*)ptr) + 1 /* header */ + (tableLog ? tableSize>>1 : 1) ;
110 FSE_symbolCompressionTransform* const symbolTT = (FSE_symbolCompressionTransform*) (FSCT);
111 U32 const step = FSE_TABLESTEP(tableSize);
112 U32 cumul[FSE_MAX_SYMBOL_VALUE+2];
113
114 FSE_FUNCTION_TYPE tableSymbol[FSE_MAX_TABLESIZE]; /* memset() is not necessary, even if a static analyzer complains about it */
115 U32 highThreshold = tableSize-1;
116
117 /* CTable header */
118 tableU16[-2] = (U16) tableLog;
119 tableU16[-1] = (U16) maxSymbolValue;
120
121 /* For explanations on how to distribute symbol values over the table :
122 * http://fastcompression.blogspot.fr/2014/02/fse-distributing-symbol-values.html */
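    /* note : FSE_TABLESTEP() yields an odd step for the table sizes used here,
     * hence coprime with the power-of-two tableSize, so the "Spread symbols"
     * walk below visits every table cell exactly once before wrapping */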
123
124 /* symbol start positions */
125 { U32 u;
126 cumul[0] = 0;
127 for (u=1; u<=maxSymbolValue+1; u++) {
128 if (normalizedCounter[u-1]==-1) { /* Low proba symbol */
129 cumul[u] = cumul[u-1] + 1;
130 tableSymbol[highThreshold--] = (FSE_FUNCTION_TYPE)(u-1);
131 } else {
132 cumul[u] = cumul[u-1] + normalizedCounter[u-1];
133 } }
134 cumul[maxSymbolValue+1] = tableSize+1;
135 }
136
137 /* Spread symbols */
138 { U32 position = 0;
139 U32 symbol;
140 for (symbol=0; symbol<=maxSymbolValue; symbol++) {
141 int nbOccurences;
142 for (nbOccurences=0; nbOccurences<normalizedCounter[symbol]; nbOccurences++) {
143 tableSymbol[position] = (FSE_FUNCTION_TYPE)symbol;
144 position = (position + step) & tableMask;
145 while (position > highThreshold) position = (position + step) & tableMask; /* Low proba area */
146 } }
147
148 if (position!=0) return ERROR(GENERIC); /* Must have gone through all positions */
149 }
150
151 /* Build table */
152 { U32 u; for (u=0; u<tableSize; u++) {
153 FSE_FUNCTION_TYPE s = tableSymbol[u]; /* note : static analyzer may not understand tableSymbol is properly initialized */
154 tableU16[cumul[s]++] = (U16) (tableSize+u); /* TableU16 : sorted by symbol order; gives next state value */
155 } }
156
157 /* Build Symbol Transformation Table */
158 { unsigned total = 0;
159 unsigned s;
160 for (s=0; s<=maxSymbolValue; s++) {
161 switch (normalizedCounter[s])
162 {
163 case 0: break;
164
165 case -1:
166 case 1:
167 symbolTT[s].deltaNbBits = (tableLog << 16) - (1<<tableLog);
168 symbolTT[s].deltaFindState = total - 1;
169 total ++;
170 break;
171 default :
172 {
173 U32 const maxBitsOut = tableLog - BIT_highbit32 (normalizedCounter[s]-1);
174 U32 const minStatePlus = normalizedCounter[s] << maxBitsOut;
175 symbolTT[s].deltaNbBits = (maxBitsOut << 16) - minStatePlus;
176 symbolTT[s].deltaFindState = total - normalizedCounter[s];
177 total += normalizedCounter[s];
178 } } } }
179
180 return 0;
181 }
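/* Worked example (editorial, derived from FSE_encodeSymbol's use of these fields) :
 * at encoding time, the bit count for a state is nbBitsOut = (state + deltaNbBits) >> 16.
 * With tableLog = 6 and normalizedCounter[s] = 3 :
 *   maxBitsOut = 6 - BIT_highbit32(2) = 5 ; minStatePlus = 3 << 5 = 96 ;
 *   deltaNbBits = (5 << 16) - 96.
 * States in [64,96) then emit 4 bits and states in [96,128) emit 5 bits,
 * giving symbol s a fractional average cost between 4 and 5 bits. */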
182
183
184
185 #ifndef FSE_COMMONDEFS_ONLY
186
187 /*-**************************************************************
188 * FSE NCount encoding-decoding
189 ****************************************************************/
190 size_t FSE_NCountWriteBound(unsigned maxSymbolValue, unsigned tableLog)
191 {
192 size_t maxHeaderSize = (((maxSymbolValue+1) * tableLog) >> 3) + 3;
193 return maxSymbolValue ? maxHeaderSize : FSE_NCOUNTBOUND; /* maxSymbolValue==0 ? use default */
194 }
195
196 static short FSE_abs(short a) { return (short)(a<0 ? -a : a); }
197
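/* Layout summary (editorial, derived from the code below) :
 * the header opens with 4 bits storing tableLog - FSE_MIN_TABLELOG ;
 * each normalized count is then written with a bit width (nbBits) that shrinks
 * as `remaining` decreases. Writing a zero count sets `previous0`, after which
 * the following run of zero counts is batched : one 16-bit 0xFFFF flag per
 * 24 zeros, 2-bit repeat codes for each group of 3, then a 2-bit remainder. */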
198 static size_t FSE_writeNCount_generic (void* header, size_t headerBufferSize,
199 const short* normalizedCounter, unsigned maxSymbolValue, unsigned tableLog,
200 unsigned writeIsSafe)
201 {
202 BYTE* const ostart = (BYTE*) header;
203 BYTE* out = ostart;
204 BYTE* const oend = ostart + headerBufferSize;
205 int nbBits;
206 const int tableSize = 1 << tableLog;
207 int remaining;
208 int threshold;
209 U32 bitStream;
210 int bitCount;
211 unsigned charnum = 0;
212 int previous0 = 0;
213
214 bitStream = 0;
215 bitCount = 0;
216 /* Table Size */
217 bitStream += (tableLog-FSE_MIN_TABLELOG) << bitCount;
218 bitCount += 4;
219
220 /* Init */
221 remaining = tableSize+1; /* +1 for extra accuracy */
222 threshold = tableSize;
223 nbBits = tableLog+1;
224
225 while (remaining>1) { /* stops at 1 */
226 if (previous0) {
227 unsigned start = charnum;
228 while (!normalizedCounter[charnum]) charnum++;
229 while (charnum >= start+24) {
230 start+=24;
231 bitStream += 0xFFFFU << bitCount;
232 if ((!writeIsSafe) && (out > oend-2)) return ERROR(dstSize_tooSmall); /* Buffer overflow */
233 out[0] = (BYTE) bitStream;
234 out[1] = (BYTE)(bitStream>>8);
235 out+=2;
236 bitStream>>=16;
237 }
238 while (charnum >= start+3) {
239 start+=3;
240 bitStream += 3 << bitCount;
241 bitCount += 2;
242 }
243 bitStream += (charnum-start) << bitCount;
244 bitCount += 2;
245 if (bitCount>16) {
246 if ((!writeIsSafe) && (out > oend - 2)) return ERROR(dstSize_tooSmall); /* Buffer overflow */
247 out[0] = (BYTE)bitStream;
248 out[1] = (BYTE)(bitStream>>8);
249 out += 2;
250 bitStream >>= 16;
251 bitCount -= 16;
252 } }
253 { short count = normalizedCounter[charnum++];
254 const short max = (short)((2*threshold-1)-remaining);
255 remaining -= FSE_abs(count);
256 if (remaining<1) return ERROR(GENERIC);
257 count++; /* +1 for extra accuracy */
258 if (count>=threshold) count += max; /* [0..max[ [max..threshold[ (...) [threshold+max 2*threshold[ */
259 bitStream += count << bitCount;
260 bitCount += nbBits;
261 bitCount -= (count<max);
262 previous0 = (count==1);
263 while (remaining<threshold) nbBits--, threshold>>=1;
264 }
265 if (bitCount>16) {
266 if ((!writeIsSafe) && (out > oend - 2)) return ERROR(dstSize_tooSmall); /* Buffer overflow */
267 out[0] = (BYTE)bitStream;
268 out[1] = (BYTE)(bitStream>>8);
269 out += 2;
270 bitStream >>= 16;
271 bitCount -= 16;
272 } }
273
274 /* flush remaining bitStream */
275 if ((!writeIsSafe) && (out > oend - 2)) return ERROR(dstSize_tooSmall); /* Buffer overflow */
276 out[0] = (BYTE)bitStream;
277 out[1] = (BYTE)(bitStream>>8);
278 out+= (bitCount+7) /8;
279
280 if (charnum > maxSymbolValue + 1) return ERROR(GENERIC);
281
282 return (out-ostart);
283 }
284
285
286 size_t FSE_writeNCount (void* buffer, size_t bufferSize, const short* normalizedCounter, unsigned maxSymbolValue, unsigned tableLog)
287 {
288 if (tableLog > FSE_MAX_TABLELOG) return ERROR(GENERIC); /* Unsupported */
289 if (tableLog < FSE_MIN_TABLELOG) return ERROR(GENERIC); /* Unsupported */
290
291 if (bufferSize < FSE_NCountWriteBound(maxSymbolValue, tableLog))
292 return FSE_writeNCount_generic(buffer, bufferSize, normalizedCounter, maxSymbolValue, tableLog, 0);
293
294 return FSE_writeNCount_generic(buffer, bufferSize, normalizedCounter, maxSymbolValue, tableLog, 1);
295 }
296
297
298
299 /*-**************************************************************
300 * Counting histogram
301 ****************************************************************/
302 /*! FSE_count_simple
303 This function just counts byte values within `src`,
304 and stores the histogram into table `count`.
305 This function is unsafe : it doesn't check that all values within `src` can fit into `count`.
306 For this reason, prefer using a table `count` with 256 elements.
307 @return : count of most numerous element
308 */
309 static size_t FSE_count_simple(unsigned* count, unsigned* maxSymbolValuePtr,
310 const void* src, size_t srcSize)
311 {
312 const BYTE* ip = (const BYTE*)src;
313 const BYTE* const end = ip + srcSize;
314 unsigned maxSymbolValue = *maxSymbolValuePtr;
315 unsigned max=0;
316
317
318 memset(count, 0, (maxSymbolValue+1)*sizeof(*count));
319 if (srcSize==0) { *maxSymbolValuePtr = 0; return 0; }
320
321 while (ip<end) count[*ip++]++;
322
323 while (!count[maxSymbolValue]) maxSymbolValue--;
324 *maxSymbolValuePtr = maxSymbolValue;
325
326 { U32 s; for (s=0; s<=maxSymbolValue; s++) if (count[s] > max) max = count[s]; }
327
328 return (size_t)max;
329 }
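/* Usage sketch (editorial, not upstream code ; `src`/`srcSize` stand for any
 * caller-provided buffer) : the safe pattern recommended above, with a
 * 256-element table so that any byte value fits. */
#if 0
{   unsigned count[256];
    unsigned maxSymbolValue = 255;   /* request the full byte range */
    size_t const maxCount = FSE_count_simple(count, &maxSymbolValue, src, srcSize);
    /* on return : maxSymbolValue is lowered to the largest byte value actually
     * present, and maxCount is the frequency of the most frequent byte */
}
#endif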
330
331
332 static size_t FSE_count_parallel(unsigned* count, unsigned* maxSymbolValuePtr,
333 const void* source, size_t sourceSize,
334 unsigned checkMax)
335 {
336 const BYTE* ip = (const BYTE*)source;
337 const BYTE* const iend = ip+sourceSize;
338 unsigned maxSymbolValue = *maxSymbolValuePtr;
339 unsigned max=0;
340
341
342 U32 Counting1[256] = { 0 };
343 U32 Counting2[256] = { 0 };
344 U32 Counting3[256] = { 0 };
345 U32 Counting4[256] = { 0 };
346
347 /* safety checks */
348 if (!sourceSize) {
349 memset(count, 0, (maxSymbolValue+1)*sizeof(*count));
350 *maxSymbolValuePtr = 0;
351 return 0;
352 }
353 if (!maxSymbolValue) maxSymbolValue = 255; /* 0 == default */
354
355 /* by stripes of 16 bytes */
356 { U32 cached = MEM_read32(ip); ip += 4;
357 while (ip < iend-15) {
358 U32 c = cached; cached = MEM_read32(ip); ip += 4;
359 Counting1[(BYTE) c ]++;
360 Counting2[(BYTE)(c>>8) ]++;
361 Counting3[(BYTE)(c>>16)]++;
362 Counting4[ c>>24 ]++;
363 c = cached; cached = MEM_read32(ip); ip += 4;
364 Counting1[(BYTE) c ]++;
365 Counting2[(BYTE)(c>>8) ]++;
366 Counting3[(BYTE)(c>>16)]++;
367 Counting4[ c>>24 ]++;
368 c = cached; cached = MEM_read32(ip); ip += 4;
369 Counting1[(BYTE) c ]++;
370 Counting2[(BYTE)(c>>8) ]++;
371 Counting3[(BYTE)(c>>16)]++;
372 Counting4[ c>>24 ]++;
373 c = cached; cached = MEM_read32(ip); ip += 4;
374 Counting1[(BYTE) c ]++;
375 Counting2[(BYTE)(c>>8) ]++;
376 Counting3[(BYTE)(c>>16)]++;
377 Counting4[ c>>24 ]++;
378 }
379 ip-=4;
380 }
381
382 /* finish last symbols */
383 while (ip<iend) Counting1[*ip++]++;
384
385 if (checkMax) { /* verify stats will fit into destination table */
386 U32 s; for (s=255; s>maxSymbolValue; s--) {
387 Counting1[s] += Counting2[s] + Counting3[s] + Counting4[s];
388 if (Counting1[s]) return ERROR(maxSymbolValue_tooSmall);
389 } }
390
391 { U32 s; for (s=0; s<=maxSymbolValue; s++) {
392 count[s] = Counting1[s] + Counting2[s] + Counting3[s] + Counting4[s];
393 if (count[s] > max) max = count[s];
394 }}
395
396 while (!count[maxSymbolValue]) maxSymbolValue--;
397 *maxSymbolValuePtr = maxSymbolValue;
398 return (size_t)max;
399 }
400
401 /* fast variant (unsafe : won't check if src contains values beyond count[] limit) */
402 size_t FSE_countFast(unsigned* count, unsigned* maxSymbolValuePtr,
403 const void* source, size_t sourceSize)
404 {
405 if (sourceSize < 1500) return FSE_count_simple(count, maxSymbolValuePtr, source, sourceSize);
406 return FSE_count_parallel(count, maxSymbolValuePtr, source, sourceSize, 0);
407 }
408
409 size_t FSE_count(unsigned* count, unsigned* maxSymbolValuePtr,
410 const void* source, size_t sourceSize)
411 {
412 if (*maxSymbolValuePtr <255)
413 return FSE_count_parallel(count, maxSymbolValuePtr, source, sourceSize, 1);
414 *maxSymbolValuePtr = 255;
415 return FSE_countFast(count, maxSymbolValuePtr, source, sourceSize);
416 }
417
418
419
420 /*-**************************************************************
421 * FSE Compression Code
422 ****************************************************************/
423 /*! FSE_sizeof_CTable() :
424 FSE_CTable is a variable size structure which contains :
425 `U16 tableLog;`
426 `U16 maxSymbolValue;`
427 `U16 nextStateNumber[1 << tableLog];` // This size is variable
428 `FSE_symbolCompressionTransform symbolTT[maxSymbolValue+1];` // This size is variable
429 Allocation is manual (C standard does not support variable-size structures).
430 */
431
432 size_t FSE_sizeof_CTable (unsigned maxSymbolValue, unsigned tableLog)
433 {
434 size_t size;
435 FSE_STATIC_ASSERT((size_t)FSE_CTABLE_SIZE_U32(FSE_MAX_TABLELOG, FSE_MAX_SYMBOL_VALUE)*4 >= sizeof(CTable_max_t)); /* A compilation error here means FSE_CTABLE_SIZE_U32 is not large enough */
436 if (tableLog > FSE_MAX_TABLELOG) return ERROR(GENERIC);
437 size = FSE_CTABLE_SIZE_U32 (tableLog, maxSymbolValue) * sizeof(U32);
438 return size;
439 }
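/* Usage sketch (editorial, not upstream code ; `maxSymbolValue`/`tableLog` are
 * placeholders) : manual allocation matching the variable-size layout above. */
#if 0
{   size_t const ctSize = FSE_sizeof_CTable(maxSymbolValue, tableLog);
    if (!FSE_isError(ctSize)) {
        FSE_CTable* const ct = (FSE_CTable*)malloc(ctSize);
        /* ... use with FSE_buildCTable(ct, norm, maxSymbolValue, tableLog) ... */
        free(ct);
    }
}
#endif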
440
441 FSE_CTable* FSE_createCTable (unsigned maxSymbolValue, unsigned tableLog)
442 {
443 size_t size;
444 if (tableLog > FSE_TABLELOG_ABSOLUTE_MAX) tableLog = FSE_TABLELOG_ABSOLUTE_MAX;
445 size = FSE_CTABLE_SIZE_U32 (tableLog, maxSymbolValue) * sizeof(U32);
446 return (FSE_CTable*)malloc(size);
447 }
448
449 void FSE_freeCTable (FSE_CTable* ct) { free(ct); }
450
451 /* provides the minimum logSize to safely represent a distribution */
452 static unsigned FSE_minTableLog(size_t srcSize, unsigned maxSymbolValue)
453 {
454 U32 minBitsSrc = BIT_highbit32((U32)(srcSize - 1)) + 1;
455 U32 minBitsSymbols = BIT_highbit32(maxSymbolValue) + 2;
456 U32 minBits = minBitsSrc < minBitsSymbols ? minBitsSrc : minBitsSymbols;
457 return minBits;
458 }
459
460 unsigned FSE_optimalTableLog_internal(unsigned maxTableLog, size_t srcSize, unsigned maxSymbolValue, unsigned minus)
461 {
462 U32 maxBitsSrc = BIT_highbit32((U32)(srcSize - 1)) - minus;
463 U32 tableLog = maxTableLog;
464 U32 minBits = FSE_minTableLog(srcSize, maxSymbolValue);
465 if (tableLog==0) tableLog = FSE_DEFAULT_TABLELOG;
466 if (maxBitsSrc < tableLog) tableLog = maxBitsSrc; /* Accuracy can be reduced */
467 if (minBits > tableLog) tableLog = minBits; /* Need a minimum to safely represent all symbol values */
468 if (tableLog < FSE_MIN_TABLELOG) tableLog = FSE_MIN_TABLELOG;
469 if (tableLog > FSE_MAX_TABLELOG) tableLog = FSE_MAX_TABLELOG;
470 return tableLog;
471 }
472
473 unsigned FSE_optimalTableLog(unsigned maxTableLog, size_t srcSize, unsigned maxSymbolValue)
474 {
475 return FSE_optimalTableLog_internal(maxTableLog, srcSize, maxSymbolValue, 2);
476 }
477
478
479 /* Secondary normalization method.
480 To be used when primary method fails. */
481
482 static size_t FSE_normalizeM2(short* norm, U32 tableLog, const unsigned* count, size_t total, U32 maxSymbolValue)
483 {
484 U32 s;
485 U32 distributed = 0;
486 U32 ToDistribute;
487
488 /* Init */
489 U32 lowThreshold = (U32)(total >> tableLog);
490 U32 lowOne = (U32)((total * 3) >> (tableLog + 1));
491
492 for (s=0; s<=maxSymbolValue; s++) {
493 if (count[s] == 0) {
494 norm[s]=0;
495 continue;
496 }
497 if (count[s] <= lowThreshold) {
498 norm[s] = -1;
499 distributed++;
500 total -= count[s];
501 continue;
502 }
503 if (count[s] <= lowOne) {
504 norm[s] = 1;
505 distributed++;
506 total -= count[s];
507 continue;
508 }
509 norm[s]=-2;
510 }
511 ToDistribute = (1 << tableLog) - distributed;
512
513 if ((total / ToDistribute) > lowOne) {
514 /* risk of rounding to zero */
515 lowOne = (U32)((total * 3) / (ToDistribute * 2));
516 for (s=0; s<=maxSymbolValue; s++) {
517 if ((norm[s] == -2) && (count[s] <= lowOne)) {
518 norm[s] = 1;
519 distributed++;
520 total -= count[s];
521 continue;
522 } }
523 ToDistribute = (1 << tableLog) - distributed;
524 }
525
526 if (distributed == maxSymbolValue+1) {
527 /* all values are pretty poor;
528 probably incompressible data (should have already been detected);
529 find max, then give all remaining points to max */
530 U32 maxV = 0, maxC = 0;
531 for (s=0; s<=maxSymbolValue; s++)
532 if (count[s] > maxC) maxV=s, maxC=count[s];
533 norm[maxV] += (short)ToDistribute;
534 return 0;
535 }
536
537 {
538 U64 const vStepLog = 62 - tableLog;
539 U64 const mid = (1ULL << (vStepLog-1)) - 1;
540 U64 const rStep = ((((U64)1<<vStepLog) * ToDistribute) + mid) / total; /* scale on remaining */
541 U64 tmpTotal = mid;
542 for (s=0; s<=maxSymbolValue; s++) {
543 if (norm[s]==-2) {
544 U64 end = tmpTotal + (count[s] * rStep);
545 U32 sStart = (U32)(tmpTotal >> vStepLog);
546 U32 sEnd = (U32)(end >> vStepLog);
547 U32 weight = sEnd - sStart;
548 if (weight < 1)
549 return ERROR(GENERIC);
550 norm[s] = (short)weight;
551 tmpTotal = end;
552 } } }
553
554 return 0;
555 }
556
557
558 size_t FSE_normalizeCount (short* normalizedCounter, unsigned tableLog,
559 const unsigned* count, size_t total,
560 unsigned maxSymbolValue)
561 {
562 /* Sanity checks */
563 if (tableLog==0) tableLog = FSE_DEFAULT_TABLELOG;
564 if (tableLog < FSE_MIN_TABLELOG) return ERROR(GENERIC); /* Unsupported size */
565 if (tableLog > FSE_MAX_TABLELOG) return ERROR(tableLog_tooLarge); /* Unsupported size */
566 if (tableLog < FSE_minTableLog(total, maxSymbolValue)) return ERROR(GENERIC); /* Too small tableLog, compression potentially impossible */
567
568 { U32 const rtbTable[] = { 0, 473195, 504333, 520860, 550000, 700000, 750000, 830000 };
569
570 U64 const scale = 62 - tableLog;
571 U64 const step = ((U64)1<<62) / total; /* <== here, one division ! */
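/* note (editorial) : the ideal share of symbol s is count[s] * 2^tableLog / total.
 * It is obtained in fixed point as (count[s]*step) >> scale, with
 * step = 2^62 / total and scale = 62 - tableLog, so the division above is
 * the only one needed for the whole table. */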
572 U64 const vStep = 1ULL<<(scale-20);
573 int stillToDistribute = 1<<tableLog;
574 unsigned s;
575 unsigned largest=0;
576 short largestP=0;
577 U32 lowThreshold = (U32)(total >> tableLog);
578
579 for (s=0; s<=maxSymbolValue; s++) {
580 if (count[s] == total) return 0; /* rle special case */
581 if (count[s] == 0) { normalizedCounter[s]=0; continue; }
582 if (count[s] <= lowThreshold) {
583 normalizedCounter[s] = -1;
584 stillToDistribute--;
585 } else {
586 short proba = (short)((count[s]*step) >> scale);
587 if (proba<8) {
588 U64 restToBeat = vStep * rtbTable[proba];
589 proba += (count[s]*step) - ((U64)proba<<scale) > restToBeat;
590 }
591 if (proba > largestP) largestP=proba, largest=s;
592 normalizedCounter[s] = proba;
593 stillToDistribute -= proba;
594 } }
595 if (-stillToDistribute >= (normalizedCounter[largest] >> 1)) {
596 /* corner case, need another normalization method */
597 size_t errorCode = FSE_normalizeM2(normalizedCounter, tableLog, count, total, maxSymbolValue);
598 if (FSE_isError(errorCode)) return errorCode;
599 }
600 else normalizedCounter[largest] += (short)stillToDistribute;
601 }
602
603 #if 0
604 { /* Print Table (debug) */
605 U32 s;
606 U32 nTotal = 0;
607 for (s=0; s<=maxSymbolValue; s++)
608 printf("%3i: %4i \n", s, normalizedCounter[s]);
609 for (s=0; s<=maxSymbolValue; s++)
610 nTotal += abs(normalizedCounter[s]);
611 if (nTotal != (1U<<tableLog))
612 printf("Warning !!! Total == %u != %u !!!", nTotal, 1U<<tableLog);
613 getchar();
614 }
615 #endif
616
617 return tableLog;
618 }
619
620
621 /* fake FSE_CTable, for raw (uncompressed) input */
622 size_t FSE_buildCTable_raw (FSE_CTable* ct, unsigned nbBits)
623 {
624 const unsigned tableSize = 1 << nbBits;
625 const unsigned tableMask = tableSize - 1;
626 const unsigned maxSymbolValue = tableMask;
627 void* const ptr = ct;
628 U16* const tableU16 = ( (U16*) ptr) + 2;
629 void* const FSCT = ((U32*)ptr) + 1 /* header */ + (tableSize>>1); /* assumption : tableLog >= 1 */
630 FSE_symbolCompressionTransform* const symbolTT = (FSE_symbolCompressionTransform*) (FSCT);
631 unsigned s;
632
633 /* Sanity checks */
634 if (nbBits < 1) return ERROR(GENERIC); /* min size */
635
636 /* header */
637 tableU16[-2] = (U16) nbBits;
638 tableU16[-1] = (U16) maxSymbolValue;
639
640 /* Build table */
641 for (s=0; s<tableSize; s++)
642 tableU16[s] = (U16)(tableSize + s);
643
644 /* Build Symbol Transformation Table */
645 { const U32 deltaNbBits = (nbBits << 16) - (1 << nbBits);
646
647 for (s=0; s<=maxSymbolValue; s++) {
648 symbolTT[s].deltaNbBits = deltaNbBits;
649 symbolTT[s].deltaFindState = s-1;
650 } }
651
652
653 return 0;
654 }
655
656 /* fake FSE_CTable, for rle input (always the same symbol) */
657 size_t FSE_buildCTable_rle (FSE_CTable* ct, BYTE symbolValue)
658 {
659 void* ptr = ct;
660 U16* tableU16 = ( (U16*) ptr) + 2;
661 void* FSCTptr = (U32*)ptr + 2;
662 FSE_symbolCompressionTransform* symbolTT = (FSE_symbolCompressionTransform*) FSCTptr;
663
664 /* header */
665 tableU16[-2] = (U16) 0;
666 tableU16[-1] = (U16) symbolValue;
667
668 /* Build table */
669 tableU16[0] = 0;
670 tableU16[1] = 0; /* just in case */
671
672 /* Build Symbol Transformation Table */
673 symbolTT[symbolValue].deltaNbBits = 0;
674 symbolTT[symbolValue].deltaFindState = 0;
675
676 return 0;
677 }
678
679
680 static size_t FSE_compress_usingCTable_generic (void* dst, size_t dstSize,
681 const void* src, size_t srcSize,
682 const FSE_CTable* ct, const unsigned fast)
683 {
684 const BYTE* const istart = (const BYTE*) src;
685 const BYTE* const iend = istart + srcSize;
686 const BYTE* ip=iend;
687
688
689 BIT_CStream_t bitC;
690 FSE_CState_t CState1, CState2;
691
692 /* init */
693 if (srcSize <= 2) return 0;
694 { size_t const errorCode = BIT_initCStream(&bitC, dst, dstSize);
695 if (FSE_isError(errorCode)) return 0; }
696
697 #define FSE_FLUSHBITS(s) (fast ? BIT_flushBitsFast(s) : BIT_flushBits(s))
698
699 if (srcSize & 1) {
700 FSE_initCState2(&CState1, ct, *--ip);
701 FSE_initCState2(&CState2, ct, *--ip);
702 FSE_encodeSymbol(&bitC, &CState1, *--ip);
703 FSE_FLUSHBITS(&bitC);
704 } else {
705 FSE_initCState2(&CState2, ct, *--ip);
706 FSE_initCState2(&CState1, ct, *--ip);
707 }
708
709 /* join to mod 4 */
710 srcSize -= 2;
711 if ((sizeof(bitC.bitContainer)*8 > FSE_MAX_TABLELOG*4+7 ) && (srcSize & 2)) { /* test bit 2 */
712 FSE_encodeSymbol(&bitC, &CState2, *--ip);
713 FSE_encodeSymbol(&bitC, &CState1, *--ip);
714 FSE_FLUSHBITS(&bitC);
715 }
716
717 /* 2 or 4 encoding per loop */
718 for ( ; ip>istart ; ) {
719
720 FSE_encodeSymbol(&bitC, &CState2, *--ip);
721
722 if (sizeof(bitC.bitContainer)*8 < FSE_MAX_TABLELOG*2+7 ) /* this test must be static */
723 FSE_FLUSHBITS(&bitC);
724
725 FSE_encodeSymbol(&bitC, &CState1, *--ip);
726
727 if (sizeof(bitC.bitContainer)*8 > FSE_MAX_TABLELOG*4+7 ) { /* this test must be static */
728 FSE_encodeSymbol(&bitC, &CState2, *--ip);
729 FSE_encodeSymbol(&bitC, &CState1, *--ip);
730 }
731
732 FSE_FLUSHBITS(&bitC);
733 }
734
735 FSE_flushCState(&bitC, &CState2);
736 FSE_flushCState(&bitC, &CState1);
737 return BIT_closeCStream(&bitC);
738 }
739
740 size_t FSE_compress_usingCTable (void* dst, size_t dstSize,
741 const void* src, size_t srcSize,
742 const FSE_CTable* ct)
743 {
744 const unsigned fast = (dstSize >= FSE_BLOCKBOUND(srcSize));
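    /* note (editorial) : when dst is large enough for the worst case (FSE_BLOCKBOUND),
     * the fast variant can use unchecked bit flushes (see FSE_FLUSHBITS above) */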
745
746 if (fast)
747 return FSE_compress_usingCTable_generic(dst, dstSize, src, srcSize, ct, 1);
748 else
749 return FSE_compress_usingCTable_generic(dst, dstSize, src, srcSize, ct, 0);
750 }
751
752
753 size_t FSE_compressBound(size_t size) { return FSE_COMPRESSBOUND(size); }
754
755 size_t FSE_compress2 (void* dst, size_t dstSize, const void* src, size_t srcSize, unsigned maxSymbolValue, unsigned tableLog)
756 {
757 const BYTE* const istart = (const BYTE*) src;
758 const BYTE* ip = istart;
759
760 BYTE* const ostart = (BYTE*) dst;
761 BYTE* op = ostart;
762 BYTE* const oend = ostart + dstSize;
763
764 U32 count[FSE_MAX_SYMBOL_VALUE+1];
765 S16 norm[FSE_MAX_SYMBOL_VALUE+1];
766 CTable_max_t ct;
767 size_t errorCode;
768
769 /* init conditions */
770 if (srcSize <= 1) return 0; /* Uncompressible */
771 if (!maxSymbolValue) maxSymbolValue = FSE_MAX_SYMBOL_VALUE;
772 if (!tableLog) tableLog = FSE_DEFAULT_TABLELOG;
773
774 /* Scan input and build symbol stats */
775 errorCode = FSE_count (count, &maxSymbolValue, ip, srcSize);
776 if (FSE_isError(errorCode)) return errorCode;
777 if (errorCode == srcSize) return 1;
778 if (errorCode == 1) return 0; /* each symbol only present once */
779 if (errorCode < (srcSize >> 7)) return 0; /* Heuristic : not compressible enough */
780
781 tableLog = FSE_optimalTableLog(tableLog, srcSize, maxSymbolValue);
782 errorCode = FSE_normalizeCount (norm, tableLog, count, srcSize, maxSymbolValue);
783 if (FSE_isError(errorCode)) return errorCode;
784
785 /* Write table description header */
786 errorCode = FSE_writeNCount (op, oend-op, norm, maxSymbolValue, tableLog);
787 if (FSE_isError(errorCode)) return errorCode;
788 op += errorCode;
789
790 /* Compress */
791 errorCode = FSE_buildCTable (ct, norm, maxSymbolValue, tableLog);
792 if (FSE_isError(errorCode)) return errorCode;
793 errorCode = FSE_compress_usingCTable(op, oend - op, ip, srcSize, ct);
794 if (errorCode == 0) return 0; /* not enough space for compressed data */
795 op += errorCode;
796
797 /* check compressibility */
798 if ( (size_t)(op-ostart) >= srcSize-1 )
799 return 0;
800
801 return op-ostart;
802 }
803
804 size_t FSE_compress (void* dst, size_t dstSize, const void* src, size_t srcSize)
805 {
806 return FSE_compress2(dst, dstSize, src, (U32)srcSize, FSE_MAX_SYMBOL_VALUE, FSE_DEFAULT_TABLELOG);
807 }
808
809
810 #endif /* FSE_COMMONDEFS_ONLY */
@@ -0,0 +1,533 b''
1 /* ******************************************************************
2 Huffman encoder, part of New Generation Entropy library
3 Copyright (C) 2013-2016, Yann Collet.
4
5 BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
6
7 Redistribution and use in source and binary forms, with or without
8 modification, are permitted provided that the following conditions are
9 met:
10
11 * Redistributions of source code must retain the above copyright
12 notice, this list of conditions and the following disclaimer.
13 * Redistributions in binary form must reproduce the above
14 copyright notice, this list of conditions and the following disclaimer
15 in the documentation and/or other materials provided with the
16 distribution.
17
18 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
19 "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
20 LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
21 A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
22 OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
23 SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
24 LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
25 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
26 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
27 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
28 OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
29
30 You can contact the author at :
31 - FSE+HUF source repository : https://github.com/Cyan4973/FiniteStateEntropy
32 - Public forum : https://groups.google.com/forum/#!forum/lz4c
33 ****************************************************************** */
34
35 /* **************************************************************
36 * Compiler specifics
37 ****************************************************************/
38 #ifdef _MSC_VER /* Visual Studio */
39 # pragma warning(disable : 4127) /* disable: C4127: conditional expression is constant */
40 #endif
41
42
43 /* **************************************************************
44 * Includes
45 ****************************************************************/
46 #include <string.h> /* memcpy, memset */
47 #include <stdio.h> /* printf (debug) */
48 #include "bitstream.h"
49 #define FSE_STATIC_LINKING_ONLY /* FSE_optimalTableLog_internal */
50 #include "fse.h" /* header compression */
51 #define HUF_STATIC_LINKING_ONLY
52 #include "huf.h"
53
54
55 /* **************************************************************
56 * Error Management
57 ****************************************************************/
58 #define HUF_STATIC_ASSERT(c) { enum { HUF_static_assert = 1/(int)(!!(c)) }; } /* use only *after* variable declarations */
59
60
61 /* **************************************************************
62 * Utils
63 ****************************************************************/
64 unsigned HUF_optimalTableLog(unsigned maxTableLog, size_t srcSize, unsigned maxSymbolValue)
65 {
66 return FSE_optimalTableLog_internal(maxTableLog, srcSize, maxSymbolValue, 1);
67 }
68
69
70 /* *******************************************************
71 * HUF : Huffman block compression
72 *********************************************************/
73 struct HUF_CElt_s {
74 U16 val;
75 BYTE nbBits;
76 }; /* typedef'd to HUF_CElt within "huf.h" */
77
78 typedef struct nodeElt_s {
79 U32 count;
80 U16 parent;
81 BYTE byte;
82 BYTE nbBits;
83 } nodeElt;
84
85 /*! HUF_writeCTable() :
86 `CTable` : huffman tree to save, using huf representation.
87 @return : size of saved CTable */
88 size_t HUF_writeCTable (void* dst, size_t maxDstSize,
89 const HUF_CElt* CTable, U32 maxSymbolValue, U32 huffLog)
90 {
91 BYTE bitsToWeight[HUF_TABLELOG_MAX + 1];
92 BYTE huffWeight[HUF_SYMBOLVALUE_MAX];
93 BYTE* op = (BYTE*)dst;
94 U32 n;
95
96 /* check conditions */
97 if (maxSymbolValue > HUF_SYMBOLVALUE_MAX) return ERROR(GENERIC);
98
99 /* convert to weight */
100 bitsToWeight[0] = 0;
101 for (n=1; n<huffLog+1; n++)
102 bitsToWeight[n] = (BYTE)(huffLog + 1 - n);
103 for (n=0; n<maxSymbolValue; n++)
104 huffWeight[n] = bitsToWeight[CTable[n].nbBits];
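    /* example (editorial) : with huffLog = 11, an 11-bit code gets weight 1 and a
     * 1-bit code gets weight 11 ; weight 0 marks an unused symbol */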
105
106 { size_t const size = FSE_compress(op+1, maxDstSize-1, huffWeight, maxSymbolValue);
107 if (FSE_isError(size)) return size;
108 if ((size>1) & (size < maxSymbolValue/2)) { /* FSE compressed */
109 op[0] = (BYTE)size;
110 return size+1;
111 }
112 }
113
114 /* raw values */
115 if (maxSymbolValue > (256-128)) return ERROR(GENERIC); /* should not happen */
116 if (((maxSymbolValue+1)/2) + 1 > maxDstSize) return ERROR(dstSize_tooSmall); /* not enough space within dst buffer */
117 op[0] = (BYTE)(128 /*special case*/ + (maxSymbolValue-1));
118 huffWeight[maxSymbolValue] = 0; /* to be sure it doesn't cause an issue in the final combination */
119 for (n=0; n<maxSymbolValue; n+=2)
120 op[(n/2)+1] = (BYTE)((huffWeight[n] << 4) + huffWeight[n+1]);
121 return ((maxSymbolValue+1)/2) + 1;
122
123 }
124
125
126 size_t HUF_readCTable (HUF_CElt* CTable, U32 maxSymbolValue, const void* src, size_t srcSize)
127 {
128 BYTE huffWeight[HUF_SYMBOLVALUE_MAX + 1];
129 U32 rankVal[HUF_TABLELOG_ABSOLUTEMAX + 1]; /* large enough for values from 0 to 16 */
130 U32 tableLog = 0;
131 size_t readSize;
132 U32 nbSymbols = 0;
133 /*memset(huffWeight, 0, sizeof(huffWeight));*/ /* is not necessary, even though some analyzers complain ... */
134
135 /* get symbol weights */
136 readSize = HUF_readStats(huffWeight, HUF_SYMBOLVALUE_MAX+1, rankVal, &nbSymbols, &tableLog, src, srcSize);
137 if (HUF_isError(readSize)) return readSize;
138
139 /* check result */
140 if (tableLog > HUF_TABLELOG_MAX) return ERROR(tableLog_tooLarge);
141 if (nbSymbols > maxSymbolValue+1) return ERROR(maxSymbolValue_tooSmall);
142
143 /* Prepare base value per rank */
144 { U32 n, nextRankStart = 0;
145 for (n=1; n<=tableLog; n++) {
146 U32 current = nextRankStart;
147 nextRankStart += (rankVal[n] << (n-1));
148 rankVal[n] = current;
149 } }
150
151 /* fill nbBits */
152 { U32 n; for (n=0; n<nbSymbols; n++) {
153 const U32 w = huffWeight[n];
154 CTable[n].nbBits = (BYTE)(tableLog + 1 - w);
155 } }
156
157 /* fill val */
158 { U16 nbPerRank[HUF_TABLELOG_MAX+2] = {0}; /* support w=0=>n=tableLog+1 */
159 U16 valPerRank[HUF_TABLELOG_MAX+2] = {0};
160 { U32 n; for (n=0; n<nbSymbols; n++) nbPerRank[CTable[n].nbBits]++; }
161 /* determine starting value per rank */
162 valPerRank[tableLog+1] = 0; /* for w==0 */
163 { U16 min = 0;
164 U32 n; for (n=tableLog; n>0; n--) { /* start at n=tableLog <-> w=1 */
165 valPerRank[n] = min; /* get starting value within each rank */
166 min += nbPerRank[n];
167 min >>= 1;
168 } }
169 /* assign value within rank, symbol order */
170 { U32 n; for (n=0; n<=maxSymbolValue; n++) CTable[n].val = valPerRank[CTable[n].nbBits]++; }
171 }
172
173 return readSize;
174 }
175
176
177 static U32 HUF_setMaxHeight(nodeElt* huffNode, U32 lastNonNull, U32 maxNbBits)
178 {
179 const U32 largestBits = huffNode[lastNonNull].nbBits;
180 if (largestBits <= maxNbBits) return largestBits; /* early exit : no elt > maxNbBits */
181
182 /* there are several too large elements (at least 2) */
183 { int totalCost = 0;
184 const U32 baseCost = 1 << (largestBits - maxNbBits);
185 U32 n = lastNonNull;
186
187 while (huffNode[n].nbBits > maxNbBits) {
188 totalCost += baseCost - (1 << (largestBits - huffNode[n].nbBits));
189 huffNode[n].nbBits = (BYTE)maxNbBits;
190 n --;
191 } /* n stops at huffNode[n].nbBits <= maxNbBits */
192 while (huffNode[n].nbBits == maxNbBits) n--; /* n ends at the index of the smallest symbol using < maxNbBits */
193
194 /* renorm totalCost */
195 totalCost >>= (largestBits - maxNbBits); /* note : totalCost is necessarily a multiple of baseCost */
196
197 /* repay normalized cost */
198 { U32 const noSymbol = 0xF0F0F0F0;
199 U32 rankLast[HUF_TABLELOG_MAX+2];
200 int pos;
201
202 /* Get pos of last (smallest) symbol per rank */
203 memset(rankLast, 0xF0, sizeof(rankLast));
204 { U32 currentNbBits = maxNbBits;
205 for (pos=n ; pos >= 0; pos--) {
206 if (huffNode[pos].nbBits >= currentNbBits) continue;
207 currentNbBits = huffNode[pos].nbBits; /* < maxNbBits */
208 rankLast[maxNbBits-currentNbBits] = pos;
209 } }
210
211 while (totalCost > 0) {
212 U32 nBitsToDecrease = BIT_highbit32(totalCost) + 1;
213 for ( ; nBitsToDecrease > 1; nBitsToDecrease--) {
214 U32 highPos = rankLast[nBitsToDecrease];
215 U32 lowPos = rankLast[nBitsToDecrease-1];
216 if (highPos == noSymbol) continue;
217 if (lowPos == noSymbol) break;
218 { U32 const highTotal = huffNode[highPos].count;
219 U32 const lowTotal = 2 * huffNode[lowPos].count;
220 if (highTotal <= lowTotal) break;
221 } }
222 /* only triggered when no more rank 1 symbol left => find closest one (note : there is necessarily at least one !) */
223 while ((nBitsToDecrease<=HUF_TABLELOG_MAX) && (rankLast[nBitsToDecrease] == noSymbol)) /* HUF_TABLELOG_MAX test just to please gcc 5+; but it should not be necessary */
224 nBitsToDecrease ++;
225 totalCost -= 1 << (nBitsToDecrease-1);
226 if (rankLast[nBitsToDecrease-1] == noSymbol)
227 rankLast[nBitsToDecrease-1] = rankLast[nBitsToDecrease]; /* this rank is no longer empty */
228 huffNode[rankLast[nBitsToDecrease]].nbBits ++;
229 if (rankLast[nBitsToDecrease] == 0) /* special case, reached largest symbol */
230 rankLast[nBitsToDecrease] = noSymbol;
231 else {
232 rankLast[nBitsToDecrease]--;
233 if (huffNode[rankLast[nBitsToDecrease]].nbBits != maxNbBits-nBitsToDecrease)
234 rankLast[nBitsToDecrease] = noSymbol; /* this rank is now empty */
235 } } /* while (totalCost > 0) */
236
237 while (totalCost < 0) { /* Sometimes, cost correction overshoots */
238 if (rankLast[1] == noSymbol) { /* special case : no rank 1 symbol (using maxNbBits-1); let's create one from largest rank 0 (using maxNbBits) */
239 while (huffNode[n].nbBits == maxNbBits) n--;
240 huffNode[n+1].nbBits--;
241 rankLast[1] = n+1;
242 totalCost++;
243 continue;
244 }
245 huffNode[ rankLast[1] + 1 ].nbBits--;
246 rankLast[1]++;
247 totalCost ++;
248 } } } /* there are several too large elements (at least 2) */
249
250 return maxNbBits;
251 }
252
253
254 typedef struct {
255 U32 base;
256 U32 current;
257 } rankPos;
258
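/* summary (editorial, derived from the code below) : HUF_sort() buckets symbols
 * by BIT_highbit32(count+1), i.e. by the magnitude of their count, then
 * insertion-sorts within each bucket, leaving huffNode[] ordered by
 * decreasing count. */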
259 static void HUF_sort(nodeElt* huffNode, const U32* count, U32 maxSymbolValue)
260 {
261 rankPos rank[32];
262 U32 n;
263
264 memset(rank, 0, sizeof(rank));
265 for (n=0; n<=maxSymbolValue; n++) {
266 U32 r = BIT_highbit32(count[n] + 1);
267 rank[r].base ++;
268 }
269 for (n=30; n>0; n--) rank[n-1].base += rank[n].base;
270 for (n=0; n<32; n++) rank[n].current = rank[n].base;
271 for (n=0; n<=maxSymbolValue; n++) {
272 U32 const c = count[n];
273 U32 const r = BIT_highbit32(c+1) + 1;
274 U32 pos = rank[r].current++;
275 while ((pos > rank[r].base) && (c > huffNode[pos-1].count)) huffNode[pos]=huffNode[pos-1], pos--;
276 huffNode[pos].count = c;
277 huffNode[pos].byte = (BYTE)n;
278 }
279 }
280
281
282 #define STARTNODE (HUF_SYMBOLVALUE_MAX+1)
283 size_t HUF_buildCTable (HUF_CElt* tree, const U32* count, U32 maxSymbolValue, U32 maxNbBits)
284 {
285 nodeElt huffNode0[2*HUF_SYMBOLVALUE_MAX+1 +1];
286 nodeElt* huffNode = huffNode0 + 1;
287 U32 n, nonNullRank;
288 int lowS, lowN;
289 U16 nodeNb = STARTNODE;
290 U32 nodeRoot;
291
292 /* safety checks */
293 if (maxNbBits == 0) maxNbBits = HUF_TABLELOG_DEFAULT;
294 if (maxSymbolValue > HUF_SYMBOLVALUE_MAX) return ERROR(GENERIC);
295 memset(huffNode0, 0, sizeof(huffNode0));
296
297 /* sort, decreasing order */
298 HUF_sort(huffNode, count, maxSymbolValue);
299
300 /* init for parents */
301 nonNullRank = maxSymbolValue;
302 while(huffNode[nonNullRank].count == 0) nonNullRank--;
303 lowS = nonNullRank; nodeRoot = nodeNb + lowS - 1; lowN = nodeNb;
304 huffNode[nodeNb].count = huffNode[lowS].count + huffNode[lowS-1].count;
305 huffNode[lowS].parent = huffNode[lowS-1].parent = nodeNb;
306 nodeNb++; lowS-=2;
307 for (n=nodeNb; n<=nodeRoot; n++) huffNode[n].count = (U32)(1U<<30);
308 huffNode0[0].count = (U32)(1U<<31);
309
310 /* create parents */
311 while (nodeNb <= nodeRoot) {
312 U32 n1 = (huffNode[lowS].count < huffNode[lowN].count) ? lowS-- : lowN++;
313 U32 n2 = (huffNode[lowS].count < huffNode[lowN].count) ? lowS-- : lowN++;
314 huffNode[nodeNb].count = huffNode[n1].count + huffNode[n2].count;
315 huffNode[n1].parent = huffNode[n2].parent = nodeNb;
316 nodeNb++;
317 }
318
319 /* distribute weights (unlimited tree height) */
320 huffNode[nodeRoot].nbBits = 0;
321 for (n=nodeRoot-1; n>=STARTNODE; n--)
322 huffNode[n].nbBits = huffNode[ huffNode[n].parent ].nbBits + 1;
323 for (n=0; n<=nonNullRank; n++)
324 huffNode[n].nbBits = huffNode[ huffNode[n].parent ].nbBits + 1;
325
326 /* enforce maxTableLog */
327 maxNbBits = HUF_setMaxHeight(huffNode, nonNullRank, maxNbBits);
328
329 /* fill result into tree (val, nbBits) */
330 { U16 nbPerRank[HUF_TABLELOG_MAX+1] = {0};
331 U16 valPerRank[HUF_TABLELOG_MAX+1] = {0};
332 if (maxNbBits > HUF_TABLELOG_MAX) return ERROR(GENERIC); /* check fit into table */
333 for (n=0; n<=nonNullRank; n++)
334 nbPerRank[huffNode[n].nbBits]++;
335 /* determine starting value per rank */
336 { U16 min = 0;
337 for (n=maxNbBits; n>0; n--) {
338 valPerRank[n] = min; /* get starting value within each rank */
339 min += nbPerRank[n];
340 min >>= 1;
341 } }
342 for (n=0; n<=maxSymbolValue; n++)
343 tree[huffNode[n].byte].nbBits = huffNode[n].nbBits; /* push nbBits per symbol, symbol order */
344 for (n=0; n<=maxSymbolValue; n++)
345 tree[n].val = valPerRank[tree[n].nbBits]++; /* assign value within rank, symbol order */
346 }
347
348 return maxNbBits;
349 }
350
351 static void HUF_encodeSymbol(BIT_CStream_t* bitCPtr, U32 symbol, const HUF_CElt* CTable)
352 {
353 BIT_addBitsFast(bitCPtr, CTable[symbol].val, CTable[symbol].nbBits);
354 }
355
356 size_t HUF_compressBound(size_t size) { return HUF_COMPRESSBOUND(size); }
357
358 #define HUF_FLUSHBITS(s) (fast ? BIT_flushBitsFast(s) : BIT_flushBits(s))
359
360 #define HUF_FLUSHBITS_1(stream) \
361 if (sizeof((stream)->bitContainer)*8 < HUF_TABLELOG_MAX*2+7) HUF_FLUSHBITS(stream)
362
363 #define HUF_FLUSHBITS_2(stream) \
364 if (sizeof((stream)->bitContainer)*8 < HUF_TABLELOG_MAX*4+7) HUF_FLUSHBITS(stream)
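/* note (editorial) : both size tests are compile-time constants. With
 * HUF_TABLELOG_MAX = 12 (its value in this version), a 64-bit bitContainer makes
 * both tests false, so HUF_FLUSHBITS_1/_2 compile away and up to 4 symbols are
 * accumulated before one flush ; a 32-bit container still skips HUF_FLUSHBITS_1
 * but must flush at HUF_FLUSHBITS_2. */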
365
366 size_t HUF_compress1X_usingCTable(void* dst, size_t dstSize, const void* src, size_t srcSize, const HUF_CElt* CTable)
367 {
368 const BYTE* ip = (const BYTE*) src;
369 BYTE* const ostart = (BYTE*)dst;
370 BYTE* const oend = ostart + dstSize;
371 BYTE* op = ostart;
372 size_t n;
373 const unsigned fast = (dstSize >= HUF_BLOCKBOUND(srcSize));
374 BIT_CStream_t bitC;
375
376 /* init */
377 if (dstSize < 8) return 0; /* not enough space to compress */
378 { size_t const errorCode = BIT_initCStream(&bitC, op, oend-op);
379 if (HUF_isError(errorCode)) return 0; }
380
381 n = srcSize & ~3; /* join to mod 4 */
382 switch (srcSize & 3)
383 {
384 case 3 : HUF_encodeSymbol(&bitC, ip[n+ 2], CTable);
385 HUF_FLUSHBITS_2(&bitC);
386 case 2 : HUF_encodeSymbol(&bitC, ip[n+ 1], CTable);
387 HUF_FLUSHBITS_1(&bitC);
388 case 1 : HUF_encodeSymbol(&bitC, ip[n+ 0], CTable);
389 HUF_FLUSHBITS(&bitC);
390 case 0 :
391 default: ;
392 }
393
394 for (; n>0; n-=4) { /* note : n&3==0 at this stage */
395 HUF_encodeSymbol(&bitC, ip[n- 1], CTable);
396 HUF_FLUSHBITS_1(&bitC);
397 HUF_encodeSymbol(&bitC, ip[n- 2], CTable);
398 HUF_FLUSHBITS_2(&bitC);
399 HUF_encodeSymbol(&bitC, ip[n- 3], CTable);
400 HUF_FLUSHBITS_1(&bitC);
401 HUF_encodeSymbol(&bitC, ip[n- 4], CTable);
402 HUF_FLUSHBITS(&bitC);
403 }
404
405 return BIT_closeCStream(&bitC);
406 }
407
408
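/* format summary (editorial, derived from the code below) : the input is split
 * into 4 segments of (srcSize+3)/4 bytes, the last taking the remainder.
 * Output starts with a 6-byte jump table : three little-endian 16-bit sizes
 * for the first three compressed streams ; the fourth stream runs to the end. */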
409 size_t HUF_compress4X_usingCTable(void* dst, size_t dstSize, const void* src, size_t srcSize, const HUF_CElt* CTable)
410 {
411 size_t const segmentSize = (srcSize+3)/4; /* first 3 segments */
412 const BYTE* ip = (const BYTE*) src;
413 const BYTE* const iend = ip + srcSize;
414 BYTE* const ostart = (BYTE*) dst;
415 BYTE* const oend = ostart + dstSize;
416 BYTE* op = ostart;
417
418 if (dstSize < 6 + 1 + 1 + 1 + 8) return 0; /* minimum space to compress successfully */
419 if (srcSize < 12) return 0; /* no saving possible : too small input */
420 op += 6; /* jumpTable */
421
422 { size_t const cSize = HUF_compress1X_usingCTable(op, oend-op, ip, segmentSize, CTable);
423 if (HUF_isError(cSize)) return cSize;
424 if (cSize==0) return 0;
425 MEM_writeLE16(ostart, (U16)cSize);
426 op += cSize;
427 }
428
429 ip += segmentSize;
430 { size_t const cSize = HUF_compress1X_usingCTable(op, oend-op, ip, segmentSize, CTable);
431 if (HUF_isError(cSize)) return cSize;
432 if (cSize==0) return 0;
433 MEM_writeLE16(ostart+2, (U16)cSize);
434 op += cSize;
435 }
436
437 ip += segmentSize;
438 { size_t const cSize = HUF_compress1X_usingCTable(op, oend-op, ip, segmentSize, CTable);
439 if (HUF_isError(cSize)) return cSize;
440 if (cSize==0) return 0;
441 MEM_writeLE16(ostart+4, (U16)cSize);
442 op += cSize;
443 }
444
445 ip += segmentSize;
446 { size_t const cSize = HUF_compress1X_usingCTable(op, oend-op, ip, iend-ip, CTable);
447 if (HUF_isError(cSize)) return cSize;
448 if (cSize==0) return 0;
449 op += cSize;
450 }
451
452 return op-ostart;
453 }
454
455
456 static size_t HUF_compress_internal (
457 void* dst, size_t dstSize,
458 const void* src, size_t srcSize,
459 unsigned maxSymbolValue, unsigned huffLog,
460 unsigned singleStream)
461 {
462 BYTE* const ostart = (BYTE*)dst;
463 BYTE* const oend = ostart + dstSize;
464 BYTE* op = ostart;
465
466 U32 count[HUF_SYMBOLVALUE_MAX+1];
467 HUF_CElt CTable[HUF_SYMBOLVALUE_MAX+1];
468
469 /* checks & inits */
470 if (!srcSize) return 0; /* Uncompressed (note : 1 means rle, so first byte must be correct) */
471 if (!dstSize) return 0; /* cannot fit within dst budget */
472 if (srcSize > HUF_BLOCKSIZE_MAX) return ERROR(srcSize_wrong); /* current block size limit */
473 if (huffLog > HUF_TABLELOG_MAX) return ERROR(tableLog_tooLarge);
474 if (!maxSymbolValue) maxSymbolValue = HUF_SYMBOLVALUE_MAX;
475 if (!huffLog) huffLog = HUF_TABLELOG_DEFAULT;
476
477 /* Scan input and build symbol stats */
478 { size_t const largest = FSE_count (count, &maxSymbolValue, (const BYTE*)src, srcSize);
479 if (HUF_isError(largest)) return largest;
480 if (largest == srcSize) { *ostart = ((const BYTE*)src)[0]; return 1; } /* single symbol, rle */
481 if (largest <= (srcSize >> 7)+1) return 0; /* Fast heuristic : not compressible enough */
482 }
483
484 /* Build Huffman Tree */
485 huffLog = HUF_optimalTableLog(huffLog, srcSize, maxSymbolValue);
486 { size_t const maxBits = HUF_buildCTable (CTable, count, maxSymbolValue, huffLog);
487 if (HUF_isError(maxBits)) return maxBits;
488 huffLog = (U32)maxBits;
489 }
490
491 /* Write table description header */
492 { size_t const hSize = HUF_writeCTable (op, dstSize, CTable, maxSymbolValue, huffLog);
493 if (HUF_isError(hSize)) return hSize;
494 if (hSize + 12 >= srcSize) return 0; /* not useful to try compression */
495 op += hSize;
496 }
497
498 /* Compress */
499 { size_t const cSize = (singleStream) ?
500 HUF_compress1X_usingCTable(op, oend - op, src, srcSize, CTable) : /* single segment */
501 HUF_compress4X_usingCTable(op, oend - op, src, srcSize, CTable);
502 if (HUF_isError(cSize)) return cSize;
503 if (cSize==0) return 0; /* uncompressible */
504 op += cSize;
505 }
506
507 /* check compressibility */
508 if ((size_t)(op-ostart) >= srcSize-1)
509 return 0;
510
511 return op-ostart;
512 }
513
514
515 size_t HUF_compress1X (void* dst, size_t dstSize,
516 const void* src, size_t srcSize,
517 unsigned maxSymbolValue, unsigned huffLog)
518 {
519 return HUF_compress_internal(dst, dstSize, src, srcSize, maxSymbolValue, huffLog, 1);
520 }
521
522 size_t HUF_compress2 (void* dst, size_t dstSize,
523 const void* src, size_t srcSize,
524 unsigned maxSymbolValue, unsigned huffLog)
525 {
526 return HUF_compress_internal(dst, dstSize, src, srcSize, maxSymbolValue, huffLog, 0);
527 }
528
529
530 size_t HUF_compress (void* dst, size_t maxDstSize, const void* src, size_t srcSize)
531 {
532 return HUF_compress2(dst, maxDstSize, src, (U32)srcSize, 255, HUF_TABLELOG_DEFAULT);
533 }
@@ -0,0 +1,319 b''
1 /**
2 * Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
3 * All rights reserved.
4 *
5 * This source code is licensed under the BSD-style license found in the
6 * LICENSE file in the root directory of this source tree. An additional grant
7 * of patent rights can be found in the PATENTS file in the same directory.
8 */
9
10
11
12 /* *************************************
13 * Dependencies
14 ***************************************/
15 #include <stdlib.h>
16 #include "error_private.h"
17 #include "zstd_internal.h" /* MIN, ZSTD_BLOCKHEADERSIZE, defaultCustomMem */
18 #define ZBUFF_STATIC_LINKING_ONLY
19 #include "zbuff.h"
20
21
22 /* *************************************
23 * Constants
24 ***************************************/
25 static size_t const ZBUFF_endFrameSize = ZSTD_BLOCKHEADERSIZE;
26
27
28 /*-***********************************************************
29 * Streaming compression
30 *
31 * A ZBUFF_CCtx object is required to track streaming operation.
32 * Use ZBUFF_createCCtx() and ZBUFF_freeCCtx() to create/release resources.
33 * Use ZBUFF_compressInit() to start a new compression operation.
34 * ZBUFF_CCtx objects can be reused multiple times.
35 *
36 * Use ZBUFF_compressContinue() repeatedly to consume your input.
37 * *srcSizePtr and *dstCapacityPtr can be any size.
38 * The function will report how many bytes were read or written by modifying *srcSizePtr and *dstCapacityPtr.
39 * Note that it may not consume the entire input; in that case, it's up to the caller to call the function again with the remaining input.
40 * The content of dst will be overwritten (up to *dstCapacityPtr) at each function call, so save its content if it matters, or change dst.
41 * @return : a hint of the preferred nb of bytes to use as input for the next function call (it's only a hint, to improve latency)
42 * or an error code, which can be tested using ZBUFF_isError().
43 *
44 * ZBUFF_compressFlush() can be used to instruct ZBUFF to compress and output whatever remains within its buffer.
45 * Note that it will not output more than *dstCapacityPtr.
46 * Therefore, some content might still be left in its internal buffer if the dst buffer is too small.
47 * @return : nb of bytes still present in the internal buffer (0 if it's empty)
48 * or an error code, which can be tested using ZBUFF_isError().
49 *
50 * ZBUFF_compressEnd() instructs ZBUFF to finish a frame.
51 * It will perform a flush and write frame epilogue.
52 * Similar to ZBUFF_compressFlush(), it may not be able to output the entire internal buffer content if *dstCapacityPtr is too small.
53 * @return : nb of bytes still present in the internal buffer (0 if it's empty)
54 * or an error code, which can be tested using ZBUFF_isError().
55 *
56 * Hint : recommended buffer sizes (not compulsory)
57 * input : ZSTD_BLOCKSIZE_MAX (128 KB), the internal unit size; using this value improves latency.
58 * output : ZSTD_compressBound(ZSTD_BLOCKSIZE_MAX) + ZSTD_blockHeaderSize + ZBUFF_endFrameSize : ensures it's always possible to write/flush/end a full block at best speed.
59 * ***********************************************************/
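/* Usage sketch (editorial, not upstream code) : a minimal streaming loop built
 * from the API documented above. readInput()/writeOutput() are hypothetical
 * caller-provided I/O helpers ; buffer sizes follow the hints above loosely. */
#if 0
static size_t example_streamCompress(int compressionLevel)
{
    ZBUFF_CCtx* const zbc = ZBUFF_createCCtx();
    char inBuff[128 * 1024];    /* ~ZSTD_BLOCKSIZE_MAX, per the input hint */
    char outBuff[256 * 1024];   /* comfortably above the output hint */
    size_t readSize;
    if (zbc == NULL) return ERROR(memory_allocation);
    {   size_t const initErr = ZBUFF_compressInit(zbc, compressionLevel);
        if (ZBUFF_isError(initErr)) { ZBUFF_freeCCtx(zbc); return initErr; }
    }
    while ((readSize = readInput(inBuff, sizeof(inBuff))) != 0) {   /* hypothetical */
        size_t pos = 0;
        while (pos < readSize) {
            size_t dstCapacity = sizeof(outBuff);
            size_t srcSize = readSize - pos;
            size_t const hint = ZBUFF_compressContinue(zbc, outBuff, &dstCapacity, inBuff+pos, &srcSize);
            if (ZBUFF_isError(hint)) { ZBUFF_freeCCtx(zbc); return hint; }
            writeOutput(outBuff, dstCapacity);   /* hypothetical ; dstCapacity = bytes produced */
            pos += srcSize;                      /* srcSize = bytes consumed */
        }
    }
    {   size_t remaining;
        do {   /* finish the frame ; loop until the internal buffer is fully flushed */
            size_t dstCapacity = sizeof(outBuff);
            remaining = ZBUFF_compressEnd(zbc, outBuff, &dstCapacity);
            if (ZBUFF_isError(remaining)) { ZBUFF_freeCCtx(zbc); return remaining; }
            writeOutput(outBuff, dstCapacity);
        } while (remaining != 0);
    }
    ZBUFF_freeCCtx(zbc);
    return 0;
}
#endif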
60
61 typedef enum { ZBUFFcs_init, ZBUFFcs_load, ZBUFFcs_flush, ZBUFFcs_final } ZBUFF_cStage;
62
63 /* *** Resources *** */
64 struct ZBUFF_CCtx_s {
65 ZSTD_CCtx* zc;
66 char* inBuff;
67 size_t inBuffSize;
68 size_t inToCompress;
69 size_t inBuffPos;
70 size_t inBuffTarget;
71 size_t blockSize;
72 char* outBuff;
73 size_t outBuffSize;
74 size_t outBuffContentSize;
75 size_t outBuffFlushedSize;
76 ZBUFF_cStage stage;
77 U32 checksum;
78 U32 frameEnded;
79 ZSTD_customMem customMem;
80 }; /* typedef'd to ZBUFF_CCtx within "zbuff.h" */
81
82 ZBUFF_CCtx* ZBUFF_createCCtx(void)
83 {
84 return ZBUFF_createCCtx_advanced(defaultCustomMem);
85 }
86
87 ZBUFF_CCtx* ZBUFF_createCCtx_advanced(ZSTD_customMem customMem)
88 {
89 ZBUFF_CCtx* zbc;
90
91 if (!customMem.customAlloc && !customMem.customFree)
92 customMem = defaultCustomMem;
93
94 if (!customMem.customAlloc || !customMem.customFree)
95 return NULL;
96
97 zbc = (ZBUFF_CCtx*)customMem.customAlloc(customMem.opaque, sizeof(ZBUFF_CCtx));
98 if (zbc==NULL) return NULL;
99 memset(zbc, 0, sizeof(ZBUFF_CCtx));
100 memcpy(&zbc->customMem, &customMem, sizeof(ZSTD_customMem));
101 zbc->zc = ZSTD_createCCtx_advanced(customMem);
102 if (zbc->zc == NULL) { ZBUFF_freeCCtx(zbc); return NULL; }
103 return zbc;
104 }
105
106 size_t ZBUFF_freeCCtx(ZBUFF_CCtx* zbc)
107 {
108 if (zbc==NULL) return 0; /* support free on NULL */
109 ZSTD_freeCCtx(zbc->zc);
110 if (zbc->inBuff) zbc->customMem.customFree(zbc->customMem.opaque, zbc->inBuff);
111 if (zbc->outBuff) zbc->customMem.customFree(zbc->customMem.opaque, zbc->outBuff);
112 zbc->customMem.customFree(zbc->customMem.opaque, zbc);
113 return 0;
114 }
115
116
117 /* ====== Initialization ====== */
118
119 size_t ZBUFF_compressInit_advanced(ZBUFF_CCtx* zbc,
120 const void* dict, size_t dictSize,
121 ZSTD_parameters params, unsigned long long pledgedSrcSize)
122 {
123 /* allocate buffers */
124 { size_t const neededInBuffSize = (size_t)1 << params.cParams.windowLog;
125 if (zbc->inBuffSize < neededInBuffSize) {
126 zbc->inBuffSize = neededInBuffSize;
127 zbc->customMem.customFree(zbc->customMem.opaque, zbc->inBuff); /* should not be necessary */
128 zbc->inBuff = (char*)zbc->customMem.customAlloc(zbc->customMem.opaque, neededInBuffSize);
129 if (zbc->inBuff == NULL) return ERROR(memory_allocation);
130 }
131 zbc->blockSize = MIN(ZSTD_BLOCKSIZE_ABSOLUTEMAX, neededInBuffSize);
132 }
133 if (zbc->outBuffSize < ZSTD_compressBound(zbc->blockSize)+1) {
134 zbc->outBuffSize = ZSTD_compressBound(zbc->blockSize)+1;
135 zbc->customMem.customFree(zbc->customMem.opaque, zbc->outBuff); /* should not be necessary */
136 zbc->outBuff = (char*)zbc->customMem.customAlloc(zbc->customMem.opaque, zbc->outBuffSize);
137 if (zbc->outBuff == NULL) return ERROR(memory_allocation);
138 }
139
140 { size_t const errorCode = ZSTD_compressBegin_advanced(zbc->zc, dict, dictSize, params, pledgedSrcSize);
141 if (ZSTD_isError(errorCode)) return errorCode; }
142
143 zbc->inToCompress = 0;
144 zbc->inBuffPos = 0;
145 zbc->inBuffTarget = zbc->blockSize;
146 zbc->outBuffContentSize = zbc->outBuffFlushedSize = 0;
147 zbc->stage = ZBUFFcs_load;
148 zbc->checksum = params.fParams.checksumFlag > 0;
149 zbc->frameEnded = 0;
150 return 0; /* ready to go */
151 }
152
153
154 size_t ZBUFF_compressInitDictionary(ZBUFF_CCtx* zbc, const void* dict, size_t dictSize, int compressionLevel)
155 {
156 ZSTD_parameters const params = ZSTD_getParams(compressionLevel, 0, dictSize);
157 return ZBUFF_compressInit_advanced(zbc, dict, dictSize, params, 0);
158 }
159
160 size_t ZBUFF_compressInit(ZBUFF_CCtx* zbc, int compressionLevel)
161 {
162 return ZBUFF_compressInitDictionary(zbc, NULL, 0, compressionLevel);
163 }
164
165
166 /* internal util function */
167 MEM_STATIC size_t ZBUFF_limitCopy(void* dst, size_t dstCapacity, const void* src, size_t srcSize)
168 {
169 size_t const length = MIN(dstCapacity, srcSize);
170 memcpy(dst, src, length);
171 return length;
172 }
173
174
175 /* ====== Compression ====== */
176
177 typedef enum { zbf_gather, zbf_flush, zbf_end } ZBUFF_flush_e;
178
179 static size_t ZBUFF_compressContinue_generic(ZBUFF_CCtx* zbc,
180 void* dst, size_t* dstCapacityPtr,
181 const void* src, size_t* srcSizePtr,
182 ZBUFF_flush_e const flush)
183 {
184 U32 someMoreWork = 1;
185 const char* const istart = (const char*)src;
186 const char* const iend = istart + *srcSizePtr;
187 const char* ip = istart;
188 char* const ostart = (char*)dst;
189 char* const oend = ostart + *dstCapacityPtr;
190 char* op = ostart;
191
192 while (someMoreWork) {
193 switch(zbc->stage)
194 {
195 case ZBUFFcs_init: return ERROR(init_missing); /* call ZBUFF_compressInit() first ! */
196
197 case ZBUFFcs_load:
198 /* complete inBuffer */
199 { size_t const toLoad = zbc->inBuffTarget - zbc->inBuffPos;
200 size_t const loaded = ZBUFF_limitCopy(zbc->inBuff + zbc->inBuffPos, toLoad, ip, iend-ip);
201 zbc->inBuffPos += loaded;
202 ip += loaded;
203 if ( (zbc->inBuffPos==zbc->inToCompress) || (!flush && (toLoad != loaded)) ) {
204 someMoreWork = 0; break; /* not enough input to get a full block : stop there, wait for more */
205 } }
206 /* compress current block (note : this stage cannot be stopped in the middle) */
207 { void* cDst;
208 size_t cSize;
209 size_t const iSize = zbc->inBuffPos - zbc->inToCompress;
210 size_t oSize = oend-op;
211 if (oSize >= ZSTD_compressBound(iSize))
212 cDst = op; /* compress directly into output buffer (avoid flush stage) */
213 else
214 cDst = zbc->outBuff, oSize = zbc->outBuffSize;
215 cSize = (flush == zbf_end) ?
216 ZSTD_compressEnd(zbc->zc, cDst, oSize, zbc->inBuff + zbc->inToCompress, iSize) :
217 ZSTD_compressContinue(zbc->zc, cDst, oSize, zbc->inBuff + zbc->inToCompress, iSize);
218 if (ZSTD_isError(cSize)) return cSize;
219 if (flush == zbf_end) zbc->frameEnded = 1;
220 /* prepare next block */
221 zbc->inBuffTarget = zbc->inBuffPos + zbc->blockSize;
222 if (zbc->inBuffTarget > zbc->inBuffSize)
223 zbc->inBuffPos = 0, zbc->inBuffTarget = zbc->blockSize; /* note : inBuffSize >= blockSize */
224 zbc->inToCompress = zbc->inBuffPos;
225 if (cDst == op) { op += cSize; break; } /* no need to flush */
226 zbc->outBuffContentSize = cSize;
227 zbc->outBuffFlushedSize = 0;
228 zbc->stage = ZBUFFcs_flush; /* continue to flush stage */
229 }
230
231 case ZBUFFcs_flush:
232 { size_t const toFlush = zbc->outBuffContentSize - zbc->outBuffFlushedSize;
233 size_t const flushed = ZBUFF_limitCopy(op, oend-op, zbc->outBuff + zbc->outBuffFlushedSize, toFlush);
234 op += flushed;
235 zbc->outBuffFlushedSize += flushed;
236 if (toFlush!=flushed) { someMoreWork = 0; break; } /* dst too small to store flushed data : stop there */
237 zbc->outBuffContentSize = zbc->outBuffFlushedSize = 0;
238 zbc->stage = ZBUFFcs_load;
239 break;
240 }
241
242 case ZBUFFcs_final:
243 someMoreWork = 0; /* do nothing */
244 break;
245
246 default:
247 return ERROR(GENERIC); /* impossible */
248 }
249 }
250
251 *srcSizePtr = ip - istart;
252 *dstCapacityPtr = op - ostart;
253 if (zbc->frameEnded) return 0;
254 { size_t hintInSize = zbc->inBuffTarget - zbc->inBuffPos;
255 if (hintInSize==0) hintInSize = zbc->blockSize;
256 return hintInSize;
257 }
258 }
259
260 size_t ZBUFF_compressContinue(ZBUFF_CCtx* zbc,
261 void* dst, size_t* dstCapacityPtr,
262 const void* src, size_t* srcSizePtr)
263 {
264 return ZBUFF_compressContinue_generic(zbc, dst, dstCapacityPtr, src, srcSizePtr, zbf_gather);
265 }
266
267
268
269 /* ====== Finalize ====== */
270
271 size_t ZBUFF_compressFlush(ZBUFF_CCtx* zbc, void* dst, size_t* dstCapacityPtr)
272 {
273 size_t srcSize = 0;
274 ZBUFF_compressContinue_generic(zbc, dst, dstCapacityPtr, &srcSize, &srcSize, zbf_flush); /* use a valid src address instead of NULL */
275 return zbc->outBuffContentSize - zbc->outBuffFlushedSize;
276 }
277
278
279 size_t ZBUFF_compressEnd(ZBUFF_CCtx* zbc, void* dst, size_t* dstCapacityPtr)
280 {
281 BYTE* const ostart = (BYTE*)dst;
282 BYTE* const oend = ostart + *dstCapacityPtr;
283 BYTE* op = ostart;
284
285 if (zbc->stage != ZBUFFcs_final) {
286 /* flush whatever remains */
287 size_t outSize = *dstCapacityPtr;
288 size_t srcSize = 0;
289 size_t const notEnded = ZBUFF_compressContinue_generic(zbc, dst, &outSize, &srcSize, &srcSize, zbf_end); /* use a valid address instead of NULL */
290 size_t const remainingToFlush = zbc->outBuffContentSize - zbc->outBuffFlushedSize;
291 op += outSize;
292 if (remainingToFlush) {
293 *dstCapacityPtr = op-ostart;
294 return remainingToFlush + ZBUFF_endFrameSize + (zbc->checksum * 4);
295 }
296 /* create epilogue */
297 zbc->stage = ZBUFFcs_final;
298 zbc->outBuffContentSize = !notEnded ? 0 :
299 ZSTD_compressEnd(zbc->zc, zbc->outBuff, zbc->outBuffSize, NULL, 0); /* write epilogue into outBuff */
300 }
301
302 /* flush epilogue */
303 { size_t const toFlush = zbc->outBuffContentSize - zbc->outBuffFlushedSize;
304 size_t const flushed = ZBUFF_limitCopy(op, oend-op, zbc->outBuff + zbc->outBuffFlushedSize, toFlush);
305 op += flushed;
306 zbc->outBuffFlushedSize += flushed;
307 *dstCapacityPtr = op-ostart;
308 if (toFlush==flushed) zbc->stage = ZBUFFcs_init; /* end reached */
309 return toFlush - flushed;
310 }
311 }
312
313
314
315 /* *************************************
316 * Tool functions
317 ***************************************/
318 size_t ZBUFF_recommendedCInSize(void) { return ZSTD_BLOCKSIZE_ABSOLUTEMAX; }
319 size_t ZBUFF_recommendedCOutSize(void) { return ZSTD_compressBound(ZSTD_BLOCKSIZE_ABSOLUTEMAX) + ZSTD_blockHeaderSize + ZBUFF_endFrameSize; }
@@ -0,0 +1,3264 b''
1 /**
2 * Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
3 * All rights reserved.
4 *
5 * This source code is licensed under the BSD-style license found in the
6 * LICENSE file in the root directory of this source tree. An additional grant
7 * of patent rights can be found in the PATENTS file in the same directory.
8 */
9
10
11 /*-*************************************
12 * Dependencies
13 ***************************************/
14 #include <string.h> /* memset */
15 #include "mem.h"
16 #define XXH_STATIC_LINKING_ONLY /* XXH64_state_t */
17 #include "xxhash.h" /* XXH_reset, update, digest */
18 #define FSE_STATIC_LINKING_ONLY /* FSE_encodeSymbol */
19 #include "fse.h"
20 #define HUF_STATIC_LINKING_ONLY
21 #include "huf.h"
22 #include "zstd_internal.h" /* includes zstd.h */
23
24
25 /*-*************************************
26 * Constants
27 ***************************************/
28 static const U32 g_searchStrength = 8; /* controls skipping over incompressible data */
29 #define HASH_READ_SIZE 8
30 typedef enum { ZSTDcs_created=0, ZSTDcs_init, ZSTDcs_ongoing, ZSTDcs_ending } ZSTD_compressionStage_e;
31
32
33 /*-*************************************
34 * Helper functions
35 ***************************************/
36 size_t ZSTD_compressBound(size_t srcSize) { return FSE_compressBound(srcSize) + 12; }
37
38
39 /*-*************************************
40 * Sequence storage
41 ***************************************/
42 static void ZSTD_resetSeqStore(seqStore_t* ssPtr)
43 {
44 ssPtr->lit = ssPtr->litStart;
45 ssPtr->sequences = ssPtr->sequencesStart;
46 ssPtr->longLengthID = 0;
47 }
48
49
50 /*-*************************************
51 * Context memory management
52 ***************************************/
53 struct ZSTD_CCtx_s
54 {
55 const BYTE* nextSrc; /* next block will start here, continuing on the current prefix */
56 const BYTE* base; /* All regular indexes relative to this position */
57 const BYTE* dictBase; /* extDict indexes relative to this position */
58 U32 dictLimit; /* below that point, need extDict */
59 U32 lowLimit; /* below that point, no more data */
60 U32 nextToUpdate; /* index from which to continue dictionary update */
61 U32 nextToUpdate3; /* index from which to continue dictionary update */
62 U32 hashLog3; /* dispatch table : larger == faster, more memory */
63 U32 loadedDictEnd;
64 ZSTD_compressionStage_e stage;
65 U32 rep[ZSTD_REP_NUM];
66 U32 savedRep[ZSTD_REP_NUM];
67 U32 dictID;
68 ZSTD_parameters params;
69 void* workSpace;
70 size_t workSpaceSize;
71 size_t blockSize;
72 U64 frameContentSize;
73 XXH64_state_t xxhState;
74 ZSTD_customMem customMem;
75
76 seqStore_t seqStore; /* sequences storage ptrs */
77 U32* hashTable;
78 U32* hashTable3;
79 U32* chainTable;
80 HUF_CElt* hufTable;
81 U32 flagStaticTables;
82 FSE_CTable offcodeCTable [FSE_CTABLE_SIZE_U32(OffFSELog, MaxOff)];
83 FSE_CTable matchlengthCTable[FSE_CTABLE_SIZE_U32(MLFSELog, MaxML)];
84 FSE_CTable litlengthCTable [FSE_CTABLE_SIZE_U32(LLFSELog, MaxLL)];
85 };
86
87 ZSTD_CCtx* ZSTD_createCCtx(void)
88 {
89 return ZSTD_createCCtx_advanced(defaultCustomMem);
90 }
91
92 ZSTD_CCtx* ZSTD_createCCtx_advanced(ZSTD_customMem customMem)
93 {
94 ZSTD_CCtx* cctx;
95
96 if (!customMem.customAlloc && !customMem.customFree) customMem = defaultCustomMem;
97 if (!customMem.customAlloc || !customMem.customFree) return NULL;
98
99 cctx = (ZSTD_CCtx*) ZSTD_malloc(sizeof(ZSTD_CCtx), customMem);
100 if (!cctx) return NULL;
101 memset(cctx, 0, sizeof(ZSTD_CCtx));
102 memcpy(&(cctx->customMem), &customMem, sizeof(customMem));
103 return cctx;
104 }
105
106 size_t ZSTD_freeCCtx(ZSTD_CCtx* cctx)
107 {
108 if (cctx==NULL) return 0; /* support free on NULL */
109 ZSTD_free(cctx->workSpace, cctx->customMem);
110 ZSTD_free(cctx, cctx->customMem);
111 return 0; /* reserved as a potential error code in the future */
112 }
113
114 size_t ZSTD_sizeof_CCtx(const ZSTD_CCtx* cctx)
115 {
116 if (cctx==NULL) return 0; /* support sizeof on NULL */
117 return sizeof(*cctx) + cctx->workSpaceSize;
118 }
119
120 const seqStore_t* ZSTD_getSeqStore(const ZSTD_CCtx* ctx) /* hidden interface */
121 {
122 return &(ctx->seqStore);
123 }
124
125 static ZSTD_parameters ZSTD_getParamsFromCCtx(const ZSTD_CCtx* cctx)
126 {
127 return cctx->params;
128 }
129
130
131 /** ZSTD_checkCParams() :
132 ensure param values remain within authorized range.
133 @return : 0, or an error code if one value is beyond authorized range */
134 size_t ZSTD_checkCParams(ZSTD_compressionParameters cParams)
135 {
136 # define CLAMPCHECK(val,min,max) { if ((val<min) | (val>max)) return ERROR(compressionParameter_unsupported); }
137 CLAMPCHECK(cParams.windowLog, ZSTD_WINDOWLOG_MIN, ZSTD_WINDOWLOG_MAX);
138 CLAMPCHECK(cParams.chainLog, ZSTD_CHAINLOG_MIN, ZSTD_CHAINLOG_MAX);
139 CLAMPCHECK(cParams.hashLog, ZSTD_HASHLOG_MIN, ZSTD_HASHLOG_MAX);
140 CLAMPCHECK(cParams.searchLog, ZSTD_SEARCHLOG_MIN, ZSTD_SEARCHLOG_MAX);
141 { U32 const searchLengthMin = ((cParams.strategy == ZSTD_fast) | (cParams.strategy == ZSTD_greedy)) ? ZSTD_SEARCHLENGTH_MIN+1 : ZSTD_SEARCHLENGTH_MIN;
142 U32 const searchLengthMax = (cParams.strategy == ZSTD_fast) ? ZSTD_SEARCHLENGTH_MAX : ZSTD_SEARCHLENGTH_MAX-1;
143 CLAMPCHECK(cParams.searchLength, searchLengthMin, searchLengthMax); }
144 CLAMPCHECK(cParams.targetLength, ZSTD_TARGETLENGTH_MIN, ZSTD_TARGETLENGTH_MAX);
145 if ((U32)(cParams.strategy) > (U32)ZSTD_btopt2) return ERROR(compressionParameter_unsupported);
146 return 0;
147 }
148
149
150 /** ZSTD_adjustCParams() :
151 optimize `cPar` for a given input (`srcSize` and `dictSize`).
152 mostly downsizing, to reduce memory consumption and initialization time.
153 Both `srcSize` and `dictSize` are optional (use 0 if unknown),
154 but if both are 0, no optimization can be done.
155 Note : cPar is considered validated at this stage. Use ZSTD_checkCParams() to ensure that. */
156 ZSTD_compressionParameters ZSTD_adjustCParams(ZSTD_compressionParameters cPar, unsigned long long srcSize, size_t dictSize)
157 {
158 if (srcSize+dictSize == 0) return cPar; /* no size information available : no adjustment */
159
160 /* resize params, to use less memory when necessary */
161 { U32 const minSrcSize = (srcSize==0) ? 500 : 0;
162 U64 const rSize = srcSize + dictSize + minSrcSize;
163 if (rSize < ((U64)1<<ZSTD_WINDOWLOG_MAX)) {
164 U32 const srcLog = MAX(ZSTD_HASHLOG_MIN, ZSTD_highbit32((U32)(rSize)-1) + 1);
165 if (cPar.windowLog > srcLog) cPar.windowLog = srcLog;
166 } }
167 if (cPar.hashLog > cPar.windowLog) cPar.hashLog = cPar.windowLog;
168 { U32 const btPlus = (cPar.strategy == ZSTD_btlazy2) | (cPar.strategy == ZSTD_btopt) | (cPar.strategy == ZSTD_btopt2);
169 U32 const maxChainLog = cPar.windowLog+btPlus;
170 if (cPar.chainLog > maxChainLog) cPar.chainLog = maxChainLog; } /* <= ZSTD_CHAINLOG_MAX */
171
172 if (cPar.windowLog < ZSTD_WINDOWLOG_ABSOLUTEMIN) cPar.windowLog = ZSTD_WINDOWLOG_ABSOLUTEMIN; /* required for frame header */
173
174 return cPar;
175 }
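
A small illustration of the adjustment (editorial sketch; assumes ZSTD_getCParams() from this version's advanced API, which fills a ZSTD_compressionParameters from a level plus optional size hints):

    /* for a 10 KB input, a high level's default windowLog (20+) shrinks to 14,
       just enough to cover the input; hashLog/chainLog are then clamped to match */
    ZSTD_compressionParameters cp = ZSTD_getCParams(19, 0, 0); /* no size hint yet */
    cp = ZSTD_adjustCParams(cp, 10*1024, 0);  /* cp.windowLog becomes 14 */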
176
177
178 size_t ZSTD_estimateCCtxSize(ZSTD_compressionParameters cParams)
179 {
180 size_t const blockSize = MIN(ZSTD_BLOCKSIZE_ABSOLUTEMAX, (size_t)1 << cParams.windowLog);
181 U32 const divider = (cParams.searchLength==3) ? 3 : 4;
182 size_t const maxNbSeq = blockSize / divider;
183 size_t const tokenSpace = blockSize + 11*maxNbSeq;
184
185 size_t const chainSize = (cParams.strategy == ZSTD_fast) ? 0 : (1 << cParams.chainLog);
186 size_t const hSize = ((size_t)1) << cParams.hashLog;
187 U32 const hashLog3 = (cParams.searchLength>3) ? 0 : MIN(ZSTD_HASHLOG3_MAX, cParams.windowLog);
188 size_t const h3Size = ((size_t)1) << hashLog3;
189 size_t const tableSpace = (chainSize + hSize + h3Size) * sizeof(U32);
190
191 size_t const optSpace = ((MaxML+1) + (MaxLL+1) + (MaxOff+1) + (1<<Litbits))*sizeof(U32)
192 + (ZSTD_OPT_NUM+1)*(sizeof(ZSTD_match_t) + sizeof(ZSTD_optimal_t));
193 size_t const neededSpace = tableSpace + (256*sizeof(U32)) /* huffTable */ + tokenSpace
194 + (((cParams.strategy == ZSTD_btopt) || (cParams.strategy == ZSTD_btopt2)) ? optSpace : 0);
195
196 return sizeof(ZSTD_CCtx) + neededSpace;
197 }
198
199
200 static U32 ZSTD_equivalentParams(ZSTD_parameters param1, ZSTD_parameters param2)
201 {
202 return (param1.cParams.hashLog == param2.cParams.hashLog)
203 & (param1.cParams.chainLog == param2.cParams.chainLog)
204 & (param1.cParams.strategy == param2.cParams.strategy)
205 & ((param1.cParams.searchLength==3) == (param2.cParams.searchLength==3));
206 }
207
208 /*! ZSTD_continueCCtx() :
209 reuse CCtx without reset (note : requires no dictionary) */
210 static size_t ZSTD_continueCCtx(ZSTD_CCtx* cctx, ZSTD_parameters params, U64 frameContentSize)
211 {
212 U32 const end = (U32)(cctx->nextSrc - cctx->base);
213 cctx->params = params;
214 cctx->frameContentSize = frameContentSize;
215 cctx->lowLimit = end;
216 cctx->dictLimit = end;
217 cctx->nextToUpdate = end+1;
218 cctx->stage = ZSTDcs_init;
219 cctx->dictID = 0;
220 cctx->loadedDictEnd = 0;
221 { int i; for (i=0; i<ZSTD_REP_NUM; i++) cctx->rep[i] = repStartValue[i]; }
222 cctx->seqStore.litLengthSum = 0; /* force reset of btopt stats */
223 XXH64_reset(&cctx->xxhState, 0);
224 return 0;
225 }
226
227 typedef enum { ZSTDcrp_continue, ZSTDcrp_noMemset, ZSTDcrp_fullReset } ZSTD_compResetPolicy_e;
228
229 /*! ZSTD_resetCCtx_advanced() :
230 note : 'params' must be validated */
231 static size_t ZSTD_resetCCtx_advanced (ZSTD_CCtx* zc,
232 ZSTD_parameters params, U64 frameContentSize,
233 ZSTD_compResetPolicy_e const crp)
234 {
235 if (crp == ZSTDcrp_continue)
236 if (ZSTD_equivalentParams(params, zc->params))
237 return ZSTD_continueCCtx(zc, params, frameContentSize);
238
239 { size_t const blockSize = MIN(ZSTD_BLOCKSIZE_ABSOLUTEMAX, (size_t)1 << params.cParams.windowLog);
240 U32 const divider = (params.cParams.searchLength==3) ? 3 : 4;
241 size_t const maxNbSeq = blockSize / divider;
242 size_t const tokenSpace = blockSize + 11*maxNbSeq;
243 size_t const chainSize = (params.cParams.strategy == ZSTD_fast) ? 0 : (1 << params.cParams.chainLog);
244 size_t const hSize = ((size_t)1) << params.cParams.hashLog;
245 U32 const hashLog3 = (params.cParams.searchLength>3) ? 0 : MIN(ZSTD_HASHLOG3_MAX, params.cParams.windowLog);
246 size_t const h3Size = ((size_t)1) << hashLog3;
247 size_t const tableSpace = (chainSize + hSize + h3Size) * sizeof(U32);
248 void* ptr;
249
250 /* Check if workSpace is large enough, alloc a new one if needed */
251 { size_t const optSpace = ((MaxML+1) + (MaxLL+1) + (MaxOff+1) + (1<<Litbits))*sizeof(U32)
252 + (ZSTD_OPT_NUM+1)*(sizeof(ZSTD_match_t) + sizeof(ZSTD_optimal_t));
253 size_t const neededSpace = tableSpace + (256*sizeof(U32)) /* huffTable */ + tokenSpace
254 + (((params.cParams.strategy == ZSTD_btopt) || (params.cParams.strategy == ZSTD_btopt2)) ? optSpace : 0);
255 if (zc->workSpaceSize < neededSpace) {
256 ZSTD_free(zc->workSpace, zc->customMem);
257 zc->workSpace = ZSTD_malloc(neededSpace, zc->customMem);
258 if (zc->workSpace == NULL) return ERROR(memory_allocation);
259 zc->workSpaceSize = neededSpace;
260 } }
261
262 if (crp!=ZSTDcrp_noMemset) memset(zc->workSpace, 0, tableSpace); /* reset tables only */
263 XXH64_reset(&zc->xxhState, 0);
264 zc->hashLog3 = hashLog3;
265 zc->hashTable = (U32*)(zc->workSpace);
266 zc->chainTable = zc->hashTable + hSize;
267 zc->hashTable3 = zc->chainTable + chainSize;
268 ptr = zc->hashTable3 + h3Size;
269 zc->hufTable = (HUF_CElt*)ptr;
270 zc->flagStaticTables = 0;
271 ptr = ((U32*)ptr) + 256; /* note : HUF_CElt* is an incomplete type, so its size is simulated using U32 units */
272
273 zc->nextToUpdate = 1;
274 zc->nextSrc = NULL;
275 zc->base = NULL;
276 zc->dictBase = NULL;
277 zc->dictLimit = 0;
278 zc->lowLimit = 0;
279 zc->params = params;
280 zc->blockSize = blockSize;
281 zc->frameContentSize = frameContentSize;
282 { int i; for (i=0; i<ZSTD_REP_NUM; i++) zc->rep[i] = repStartValue[i]; }
283
284 if ((params.cParams.strategy == ZSTD_btopt) || (params.cParams.strategy == ZSTD_btopt2)) {
285 zc->seqStore.litFreq = (U32*)ptr;
286 zc->seqStore.litLengthFreq = zc->seqStore.litFreq + (1<<Litbits);
287 zc->seqStore.matchLengthFreq = zc->seqStore.litLengthFreq + (MaxLL+1);
288 zc->seqStore.offCodeFreq = zc->seqStore.matchLengthFreq + (MaxML+1);
289 ptr = zc->seqStore.offCodeFreq + (MaxOff+1);
290 zc->seqStore.matchTable = (ZSTD_match_t*)ptr;
291 ptr = zc->seqStore.matchTable + ZSTD_OPT_NUM+1;
292 zc->seqStore.priceTable = (ZSTD_optimal_t*)ptr;
293 ptr = zc->seqStore.priceTable + ZSTD_OPT_NUM+1;
294 zc->seqStore.litLengthSum = 0;
295 }
296 zc->seqStore.sequencesStart = (seqDef*)ptr;
297 ptr = zc->seqStore.sequencesStart + maxNbSeq;
298 zc->seqStore.llCode = (BYTE*) ptr;
299 zc->seqStore.mlCode = zc->seqStore.llCode + maxNbSeq;
300 zc->seqStore.ofCode = zc->seqStore.mlCode + maxNbSeq;
301 zc->seqStore.litStart = zc->seqStore.ofCode + maxNbSeq;
302
303 zc->stage = ZSTDcs_init;
304 zc->dictID = 0;
305 zc->loadedDictEnd = 0;
306
307 return 0;
308 }
309 }
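
For reference, the carving above lays out the single workspace allocation as follows (a summary derived from the pointer arithmetic, not a separate specification):

    /* workSpace layout :
     *   hashTable            : hSize     * sizeof(U32)
     *   chainTable           : chainSize * sizeof(U32)   (0 for ZSTD_fast)
     *   hashTable3           : h3Size    * sizeof(U32)
     *   hufTable             : 256       * sizeof(U32)
     *   opt tables           : only for btopt / btopt2 (freq, match, price tables)
     *   sequences            : maxNbSeq  * sizeof(seqDef)
     *   llCode/mlCode/ofCode : maxNbSeq bytes each
     *   literals (litStart)  : the remaining blockSize bytes
     */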
310
311
312 /*! ZSTD_copyCCtx() :
313 * Duplicate an existing context `srcCCtx` into another one `dstCCtx`.
314 * Only works during stage ZSTDcs_init (i.e. after creation, but before first call to ZSTD_compressContinue()).
315 * @return : 0, or an error code */
316 size_t ZSTD_copyCCtx(ZSTD_CCtx* dstCCtx, const ZSTD_CCtx* srcCCtx, unsigned long long pledgedSrcSize)
317 {
318 if (srcCCtx->stage!=ZSTDcs_init) return ERROR(stage_wrong);
319
320 memcpy(&dstCCtx->customMem, &srcCCtx->customMem, sizeof(ZSTD_customMem));
321 ZSTD_resetCCtx_advanced(dstCCtx, srcCCtx->params, pledgedSrcSize, ZSTDcrp_noMemset);
322
323 /* copy tables */
324 { size_t const chainSize = (srcCCtx->params.cParams.strategy == ZSTD_fast) ? 0 : (1 << srcCCtx->params.cParams.chainLog);
325 size_t const hSize = ((size_t)1) << srcCCtx->params.cParams.hashLog;
326 size_t const h3Size = (size_t)1 << srcCCtx->hashLog3;
327 size_t const tableSpace = (chainSize + hSize + h3Size) * sizeof(U32);
328 memcpy(dstCCtx->workSpace, srcCCtx->workSpace, tableSpace);
329 }
330
331 /* copy dictionary offsets */
332 dstCCtx->nextToUpdate = srcCCtx->nextToUpdate;
333 dstCCtx->nextToUpdate3= srcCCtx->nextToUpdate3;
334 dstCCtx->nextSrc = srcCCtx->nextSrc;
335 dstCCtx->base = srcCCtx->base;
336 dstCCtx->dictBase = srcCCtx->dictBase;
337 dstCCtx->dictLimit = srcCCtx->dictLimit;
338 dstCCtx->lowLimit = srcCCtx->lowLimit;
339 dstCCtx->loadedDictEnd= srcCCtx->loadedDictEnd;
340 dstCCtx->dictID = srcCCtx->dictID;
341
342 /* copy entropy tables */
343 dstCCtx->flagStaticTables = srcCCtx->flagStaticTables;
344 if (srcCCtx->flagStaticTables) {
345 memcpy(dstCCtx->hufTable, srcCCtx->hufTable, 256*4);
346 memcpy(dstCCtx->litlengthCTable, srcCCtx->litlengthCTable, sizeof(dstCCtx->litlengthCTable));
347 memcpy(dstCCtx->matchlengthCTable, srcCCtx->matchlengthCTable, sizeof(dstCCtx->matchlengthCTable));
348 memcpy(dstCCtx->offcodeCTable, srcCCtx->offcodeCTable, sizeof(dstCCtx->offcodeCTable));
349 }
350
351 return 0;
352 }
353
354
355 /*! ZSTD_reduceTable() :
356 * reduce table indexes by `reducerValue` */
357 static void ZSTD_reduceTable (U32* const table, U32 const size, U32 const reducerValue)
358 {
359 U32 u;
360 for (u=0 ; u < size ; u++) {
361 if (table[u] < reducerValue) table[u] = 0;
362 else table[u] -= reducerValue;
363 }
364 }
365
366 /*! ZSTD_reduceIndex() :
367 * rescale all indexes to avoid future overflow (indexes are U32) */
368 static void ZSTD_reduceIndex (ZSTD_CCtx* zc, const U32 reducerValue)
369 {
370 { U32 const hSize = 1 << zc->params.cParams.hashLog;
371 ZSTD_reduceTable(zc->hashTable, hSize, reducerValue); }
372
373 { U32 const chainSize = (zc->params.cParams.strategy == ZSTD_fast) ? 0 : (1 << zc->params.cParams.chainLog);
374 ZSTD_reduceTable(zc->chainTable, chainSize, reducerValue); }
375
376 { U32 const h3Size = (zc->hashLog3) ? 1 << zc->hashLog3 : 0;
377 ZSTD_reduceTable(zc->hashTable3, h3Size, reducerValue); }
378 }
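
A concrete example of the rescaling (editorial): with reducerValue = 0x01000000, a table entry 0x01000200 becomes 0x200, while any entry below the reducer is zeroed and thereby effectively invalidated, since it can no longer clear the search window's lower bound.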
379
380
381 /*-*******************************************************
382 * Block entropic compression
383 *********************************************************/
384
385 /* See doc/zstd_compression_format.md for detailed format description */
386
387 size_t ZSTD_noCompressBlock (void* dst, size_t dstCapacity, const void* src, size_t srcSize)
388 {
389 if (srcSize + ZSTD_blockHeaderSize > dstCapacity) return ERROR(dstSize_tooSmall);
390 memcpy((BYTE*)dst + ZSTD_blockHeaderSize, src, srcSize);
391 MEM_writeLE24(dst, (U32)(srcSize << 2) + (U32)bt_raw);
392 return ZSTD_blockHeaderSize+srcSize;
393 }
394
395
396 static size_t ZSTD_noCompressLiterals (void* dst, size_t dstCapacity, const void* src, size_t srcSize)
397 {
398 BYTE* const ostart = (BYTE* const)dst;
399 U32 const flSize = 1 + (srcSize>31) + (srcSize>4095);
400
401 if (srcSize + flSize > dstCapacity) return ERROR(dstSize_tooSmall);
402
403 switch(flSize)
404 {
405 case 1: /* 2 - 1 - 5 */
406 ostart[0] = (BYTE)((U32)set_basic + (srcSize<<3));
407 break;
408 case 2: /* 2 - 2 - 12 */
409 MEM_writeLE16(ostart, (U16)((U32)set_basic + (1<<2) + (srcSize<<4)));
410 break;
411 default: /*note : should not be necessary : flSize is necessarily within {1,2,3} */
412 case 3: /* 2 - 2 - 20 */
413 MEM_writeLE32(ostart, (U32)((U32)set_basic + (3<<2) + (srcSize<<4)));
414 break;
415 }
416
417 memcpy(ostart + flSize, src, srcSize);
418 return srcSize + flSize;
419 }
420
421 static size_t ZSTD_compressRleLiteralsBlock (void* dst, size_t dstCapacity, const void* src, size_t srcSize)
422 {
423 BYTE* const ostart = (BYTE* const)dst;
424 U32 const flSize = 1 + (srcSize>31) + (srcSize>4095);
425
426 (void)dstCapacity; /* dstCapacity already guaranteed to be >=4, hence large enough */
427
428 switch(flSize)
429 {
430 case 1: /* 2 - 1 - 5 */
431 ostart[0] = (BYTE)((U32)set_rle + (srcSize<<3));
432 break;
433 case 2: /* 2 - 2 - 12 */
434 MEM_writeLE16(ostart, (U16)((U32)set_rle + (1<<2) + (srcSize<<4)));
435 break;
436 default: /*note : should not be necessary : flSize is necessarily within {1,2,3} */
437 case 3: /* 2 - 2 - 20 */
438 MEM_writeLE32(ostart, (U32)((U32)set_rle + (3<<2) + (srcSize<<4)));
439 break;
440 }
441
442 ostart[flSize] = *(const BYTE*)src;
443 return flSize+1;
444 }
445
446
447 static size_t ZSTD_minGain(size_t srcSize) { return (srcSize >> 6) + 2; }
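
Worked numbers (editorial): for srcSize = 4096, minGain = (4096 >> 6) + 2 = 66, so the checks below keep an entropy-coded result only when it is more than 66 bytes smaller than the input; anything less falls back to the raw (or RLE) encodings above.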
448
449 static size_t ZSTD_compressLiterals (ZSTD_CCtx* zc,
450 void* dst, size_t dstCapacity,
451 const void* src, size_t srcSize)
452 {
453 size_t const minGain = ZSTD_minGain(srcSize);
454 size_t const lhSize = 3 + (srcSize >= 1 KB) + (srcSize >= 16 KB);
455 BYTE* const ostart = (BYTE*)dst;
456 U32 singleStream = srcSize < 256;
457 symbolEncodingType_e hType = set_compressed;
458 size_t cLitSize;
459
460
461 /* too small ? don't even attempt compression (speed optimization) */
462 # define LITERAL_NOENTROPY 63
463 { size_t const minLitSize = zc->flagStaticTables ? 6 : LITERAL_NOENTROPY;
464 if (srcSize <= minLitSize) return ZSTD_noCompressLiterals(dst, dstCapacity, src, srcSize);
465 }
466
467 if (dstCapacity < lhSize+1) return ERROR(dstSize_tooSmall); /* not enough space for compression */
468 if (zc->flagStaticTables && (lhSize==3)) {
469 hType = set_repeat;
470 singleStream = 1;
471 cLitSize = HUF_compress1X_usingCTable(ostart+lhSize, dstCapacity-lhSize, src, srcSize, zc->hufTable);
472 } else {
473 cLitSize = singleStream ? HUF_compress1X(ostart+lhSize, dstCapacity-lhSize, src, srcSize, 255, 11)
474 : HUF_compress2 (ostart+lhSize, dstCapacity-lhSize, src, srcSize, 255, 11);
475 }
476
477 if ((cLitSize==0) | (cLitSize >= srcSize - minGain))
478 return ZSTD_noCompressLiterals(dst, dstCapacity, src, srcSize);
479 if (cLitSize==1)
480 return ZSTD_compressRleLiteralsBlock(dst, dstCapacity, src, srcSize);
481
482 /* Build header */
483 switch(lhSize)
484 {
485 case 3: /* 2 - 2 - 10 - 10 */
486 { U32 const lhc = hType + ((!singleStream) << 2) + ((U32)srcSize<<4) + ((U32)cLitSize<<14);
487 MEM_writeLE24(ostart, lhc);
488 break;
489 }
490 case 4: /* 2 - 2 - 14 - 14 */
491 { U32 const lhc = hType + (2 << 2) + ((U32)srcSize<<4) + ((U32)cLitSize<<18);
492 MEM_writeLE32(ostart, lhc);
493 break;
494 }
495 default: /* should not be necessary, lhSize is only {3,4,5} */
496 case 5: /* 2 - 2 - 18 - 18 */
497 { U32 const lhc = hType + (3 << 2) + ((U32)srcSize<<4) + ((U32)cLitSize<<22);
498 MEM_writeLE32(ostart, lhc);
499 ostart[4] = (BYTE)(cLitSize >> 10);
500 break;
501 }
502 }
503 return lhSize+cLitSize;
504 }
505
506 static const BYTE LL_Code[64] = { 0, 1, 2, 3, 4, 5, 6, 7,
507 8, 9, 10, 11, 12, 13, 14, 15,
508 16, 16, 17, 17, 18, 18, 19, 19,
509 20, 20, 20, 20, 21, 21, 21, 21,
510 22, 22, 22, 22, 22, 22, 22, 22,
511 23, 23, 23, 23, 23, 23, 23, 23,
512 24, 24, 24, 24, 24, 24, 24, 24,
513 24, 24, 24, 24, 24, 24, 24, 24 };
514
515 static const BYTE ML_Code[128] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
516 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
517 32, 32, 33, 33, 34, 34, 35, 35, 36, 36, 36, 36, 37, 37, 37, 37,
518 38, 38, 38, 38, 38, 38, 38, 38, 39, 39, 39, 39, 39, 39, 39, 39,
519 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
520 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41,
521 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42,
522 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42 };
523
524
525 void ZSTD_seqToCodes(const seqStore_t* seqStorePtr)
526 {
527 BYTE const LL_deltaCode = 19;
528 BYTE const ML_deltaCode = 36;
529 const seqDef* const sequences = seqStorePtr->sequencesStart;
530 BYTE* const llCodeTable = seqStorePtr->llCode;
531 BYTE* const ofCodeTable = seqStorePtr->ofCode;
532 BYTE* const mlCodeTable = seqStorePtr->mlCode;
533 U32 const nbSeq = (U32)(seqStorePtr->sequences - seqStorePtr->sequencesStart);
534 U32 u;
535 for (u=0; u<nbSeq; u++) {
536 U32 const llv = sequences[u].litLength;
537 U32 const mlv = sequences[u].matchLength;
538 llCodeTable[u] = (llv> 63) ? (BYTE)ZSTD_highbit32(llv) + LL_deltaCode : LL_Code[llv];
539 ofCodeTable[u] = (BYTE)ZSTD_highbit32(sequences[u].offset);
540 mlCodeTable[u] = (mlv>127) ? (BYTE)ZSTD_highbit32(mlv) + ML_deltaCode : ML_Code[mlv];
541 }
542 if (seqStorePtr->longLengthID==1)
543 llCodeTable[seqStorePtr->longLengthPos] = MaxLL;
544 if (seqStorePtr->longLengthID==2)
545 mlCodeTable[seqStorePtr->longLengthPos] = MaxML;
546 }
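
Two worked mappings (editorial, read off the tables above): litLength 50 falls in the 48..63 bucket, so LL_Code[50] == 24; litLength 100 exceeds 63, so its code is ZSTD_highbit32(100) + 19 == 6 + 19 == 25, with the remainder carried in the extra bits written during sequence encoding.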
547
548
549 size_t ZSTD_compressSequences(ZSTD_CCtx* zc,
550 void* dst, size_t dstCapacity,
551 size_t srcSize)
552 {
553 const seqStore_t* seqStorePtr = &(zc->seqStore);
554 U32 count[MaxSeq+1];
555 S16 norm[MaxSeq+1];
556 FSE_CTable* CTable_LitLength = zc->litlengthCTable;
557 FSE_CTable* CTable_OffsetBits = zc->offcodeCTable;
558 FSE_CTable* CTable_MatchLength = zc->matchlengthCTable;
559 U32 LLtype, Offtype, MLtype; /* compressed, raw or rle */
560 const seqDef* const sequences = seqStorePtr->sequencesStart;
561 const BYTE* const ofCodeTable = seqStorePtr->ofCode;
562 const BYTE* const llCodeTable = seqStorePtr->llCode;
563 const BYTE* const mlCodeTable = seqStorePtr->mlCode;
564 BYTE* const ostart = (BYTE*)dst;
565 BYTE* const oend = ostart + dstCapacity;
566 BYTE* op = ostart;
567 size_t const nbSeq = seqStorePtr->sequences - seqStorePtr->sequencesStart;
568 BYTE* seqHead;
569
570 /* Compress literals */
571 { const BYTE* const literals = seqStorePtr->litStart;
572 size_t const litSize = seqStorePtr->lit - literals;
573 size_t const cSize = ZSTD_compressLiterals(zc, op, dstCapacity, literals, litSize);
574 if (ZSTD_isError(cSize)) return cSize;
575 op += cSize;
576 }
577
578 /* Sequences Header */
579 if ((oend-op) < 3 /*max nbSeq Size*/ + 1 /*seqHead */) return ERROR(dstSize_tooSmall);
580 if (nbSeq < 0x7F) *op++ = (BYTE)nbSeq;
581 else if (nbSeq < LONGNBSEQ) op[0] = (BYTE)((nbSeq>>8) + 0x80), op[1] = (BYTE)nbSeq, op+=2;
582 else op[0]=0xFF, MEM_writeLE16(op+1, (U16)(nbSeq - LONGNBSEQ)), op+=3;
583 if (nbSeq==0) goto _check_compressibility;
584
585 /* seqHead : flags for FSE encoding type */
586 seqHead = op++;
587
588 #define MIN_SEQ_FOR_DYNAMIC_FSE 64
589 #define MAX_SEQ_FOR_STATIC_FSE 1000
590
591 /* convert length/distances into codes */
592 ZSTD_seqToCodes(seqStorePtr);
593
594 /* CTable for Literal Lengths */
595 { U32 max = MaxLL;
596 size_t const mostFrequent = FSE_countFast(count, &max, llCodeTable, nbSeq);
597 if ((mostFrequent == nbSeq) && (nbSeq > 2)) {
598 *op++ = llCodeTable[0];
599 FSE_buildCTable_rle(CTable_LitLength, (BYTE)max);
600 LLtype = set_rle;
601 } else if ((zc->flagStaticTables) && (nbSeq < MAX_SEQ_FOR_STATIC_FSE)) {
602 LLtype = set_repeat;
603 } else if ((nbSeq < MIN_SEQ_FOR_DYNAMIC_FSE) || (mostFrequent < (nbSeq >> (LL_defaultNormLog-1)))) {
604 FSE_buildCTable(CTable_LitLength, LL_defaultNorm, MaxLL, LL_defaultNormLog);
605 LLtype = set_basic;
606 } else {
607 size_t nbSeq_1 = nbSeq;
608 const U32 tableLog = FSE_optimalTableLog(LLFSELog, nbSeq, max);
609 if (count[llCodeTable[nbSeq-1]]>1) { count[llCodeTable[nbSeq-1]]--; nbSeq_1--; }
610 FSE_normalizeCount(norm, tableLog, count, nbSeq_1, max);
611 { size_t const NCountSize = FSE_writeNCount(op, oend-op, norm, max, tableLog); /* overflow protected */
612 if (FSE_isError(NCountSize)) return ERROR(GENERIC);
613 op += NCountSize; }
614 FSE_buildCTable(CTable_LitLength, norm, max, tableLog);
615 LLtype = set_compressed;
616 } }
617
618 /* CTable for Offsets */
619 { U32 max = MaxOff;
620 size_t const mostFrequent = FSE_countFast(count, &max, ofCodeTable, nbSeq);
621 if ((mostFrequent == nbSeq) && (nbSeq > 2)) {
622 *op++ = ofCodeTable[0];
623 FSE_buildCTable_rle(CTable_OffsetBits, (BYTE)max);
624 Offtype = set_rle;
625 } else if ((zc->flagStaticTables) && (nbSeq < MAX_SEQ_FOR_STATIC_FSE)) {
626 Offtype = set_repeat;
627 } else if ((nbSeq < MIN_SEQ_FOR_DYNAMIC_FSE) || (mostFrequent < (nbSeq >> (OF_defaultNormLog-1)))) {
628 FSE_buildCTable(CTable_OffsetBits, OF_defaultNorm, MaxOff, OF_defaultNormLog);
629 Offtype = set_basic;
630 } else {
631 size_t nbSeq_1 = nbSeq;
632 const U32 tableLog = FSE_optimalTableLog(OffFSELog, nbSeq, max);
633 if (count[ofCodeTable[nbSeq-1]]>1) { count[ofCodeTable[nbSeq-1]]--; nbSeq_1--; }
634 FSE_normalizeCount(norm, tableLog, count, nbSeq_1, max);
635 { size_t const NCountSize = FSE_writeNCount(op, oend-op, norm, max, tableLog); /* overflow protected */
636 if (FSE_isError(NCountSize)) return ERROR(GENERIC);
637 op += NCountSize; }
638 FSE_buildCTable(CTable_OffsetBits, norm, max, tableLog);
639 Offtype = set_compressed;
640 } }
641
642 /* CTable for MatchLengths */
643 { U32 max = MaxML;
644 size_t const mostFrequent = FSE_countFast(count, &max, mlCodeTable, nbSeq);
645 if ((mostFrequent == nbSeq) && (nbSeq > 2)) {
646 *op++ = *mlCodeTable;
647 FSE_buildCTable_rle(CTable_MatchLength, (BYTE)max);
648 MLtype = set_rle;
649 } else if ((zc->flagStaticTables) && (nbSeq < MAX_SEQ_FOR_STATIC_FSE)) {
650 MLtype = set_repeat;
651 } else if ((nbSeq < MIN_SEQ_FOR_DYNAMIC_FSE) || (mostFrequent < (nbSeq >> (ML_defaultNormLog-1)))) {
652 FSE_buildCTable(CTable_MatchLength, ML_defaultNorm, MaxML, ML_defaultNormLog);
653 MLtype = set_basic;
654 } else {
655 size_t nbSeq_1 = nbSeq;
656 const U32 tableLog = FSE_optimalTableLog(MLFSELog, nbSeq, max);
657 if (count[mlCodeTable[nbSeq-1]]>1) { count[mlCodeTable[nbSeq-1]]--; nbSeq_1--; }
658 FSE_normalizeCount(norm, tableLog, count, nbSeq_1, max);
659 { size_t const NCountSize = FSE_writeNCount(op, oend-op, norm, max, tableLog); /* overflow protected */
660 if (FSE_isError(NCountSize)) return ERROR(GENERIC);
661 op += NCountSize; }
662 FSE_buildCTable(CTable_MatchLength, norm, max, tableLog);
663 MLtype = set_compressed;
664 } }
665
666 *seqHead = (BYTE)((LLtype<<6) + (Offtype<<4) + (MLtype<<2));
667 zc->flagStaticTables = 0;
668
669 /* Encoding Sequences */
670 { BIT_CStream_t blockStream;
671 FSE_CState_t stateMatchLength;
672 FSE_CState_t stateOffsetBits;
673 FSE_CState_t stateLitLength;
674
675 CHECK_E(BIT_initCStream(&blockStream, op, oend-op), dstSize_tooSmall); /* not enough space remaining */
676
677 /* first symbols */
678 FSE_initCState2(&stateMatchLength, CTable_MatchLength, mlCodeTable[nbSeq-1]);
679 FSE_initCState2(&stateOffsetBits, CTable_OffsetBits, ofCodeTable[nbSeq-1]);
680 FSE_initCState2(&stateLitLength, CTable_LitLength, llCodeTable[nbSeq-1]);
681 BIT_addBits(&blockStream, sequences[nbSeq-1].litLength, LL_bits[llCodeTable[nbSeq-1]]);
682 if (MEM_32bits()) BIT_flushBits(&blockStream);
683 BIT_addBits(&blockStream, sequences[nbSeq-1].matchLength, ML_bits[mlCodeTable[nbSeq-1]]);
684 if (MEM_32bits()) BIT_flushBits(&blockStream);
685 BIT_addBits(&blockStream, sequences[nbSeq-1].offset, ofCodeTable[nbSeq-1]);
686 BIT_flushBits(&blockStream);
687
688 { size_t n;
689 for (n=nbSeq-2 ; n<nbSeq ; n--) { /* intentional underflow */
690 BYTE const llCode = llCodeTable[n];
691 BYTE const ofCode = ofCodeTable[n];
692 BYTE const mlCode = mlCodeTable[n];
693 U32 const llBits = LL_bits[llCode];
694 U32 const ofBits = ofCode; /* 32b*/ /* 64b*/
695 U32 const mlBits = ML_bits[mlCode];
696 /* (7)*/ /* (7)*/
697 FSE_encodeSymbol(&blockStream, &stateOffsetBits, ofCode); /* 15 */ /* 15 */
698 FSE_encodeSymbol(&blockStream, &stateMatchLength, mlCode); /* 24 */ /* 24 */
699 if (MEM_32bits()) BIT_flushBits(&blockStream); /* (7)*/
700 FSE_encodeSymbol(&blockStream, &stateLitLength, llCode); /* 16 */ /* 33 */
701 if (MEM_32bits() || (ofBits+mlBits+llBits >= 64-7-(LLFSELog+MLFSELog+OffFSELog)))
702 BIT_flushBits(&blockStream); /* (7)*/
703 BIT_addBits(&blockStream, sequences[n].litLength, llBits);
704 if (MEM_32bits() && ((llBits+mlBits)>24)) BIT_flushBits(&blockStream);
705 BIT_addBits(&blockStream, sequences[n].matchLength, mlBits);
706 if (MEM_32bits()) BIT_flushBits(&blockStream); /* (7)*/
707 BIT_addBits(&blockStream, sequences[n].offset, ofBits); /* 31 */
708 BIT_flushBits(&blockStream); /* (7)*/
709 } }
710
711 FSE_flushCState(&blockStream, &stateMatchLength);
712 FSE_flushCState(&blockStream, &stateOffsetBits);
713 FSE_flushCState(&blockStream, &stateLitLength);
714
715 { size_t const streamSize = BIT_closeCStream(&blockStream);
716 if (streamSize==0) return ERROR(dstSize_tooSmall); /* not enough space */
717 op += streamSize;
718 } }
719
720 /* check compressibility */
721 _check_compressibility:
722 { size_t const minGain = ZSTD_minGain(srcSize);
723 size_t const maxCSize = srcSize - minGain;
724 if ((size_t)(op-ostart) >= maxCSize) return 0; }
725
726 /* confirm repcodes */
727 { int i; for (i=0; i<ZSTD_REP_NUM; i++) zc->rep[i] = zc->savedRep[i]; }
728
729 return op - ostart;
730 }
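
Given the symbolEncodingType_e values implied by the literals headers above (set_basic = 0, set_rle = 1, set_compressed = 2, set_repeat = 3), a block whose three tables are all FSE-compressed writes a seqHead byte of (2<<6) + (2<<4) + (2<<2) = 0xA8; the low two bits stay 0 (reserved).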
731
732
733 /*! ZSTD_storeSeq() :
734 Store a sequence (literal length, literals, offset code and match length code) into seqStore_t.
735 `offsetCode` : distance to match, or 0 == repCode.
736 `matchCode` : matchLength - MINMATCH
737 */
738 MEM_STATIC void ZSTD_storeSeq(seqStore_t* seqStorePtr, size_t litLength, const void* literals, U32 offsetCode, size_t matchCode)
739 {
740 #if 0 /* for debug */
741 static const BYTE* g_start = NULL;
742 const U32 pos = (U32)(literals - g_start);
743 if (g_start==NULL) g_start = literals;
744 //if ((pos > 1) && (pos < 50000))
745 printf("Cpos %6u :%5u literals & match %3u bytes at distance %6u \n",
746 pos, (U32)litLength, (U32)matchCode+MINMATCH, (U32)offsetCode);
747 #endif
748 /* copy Literals */
749 ZSTD_wildcopy(seqStorePtr->lit, literals, litLength);
750 seqStorePtr->lit += litLength;
751
752 /* literal Length */
753 if (litLength>0xFFFF) { seqStorePtr->longLengthID = 1; seqStorePtr->longLengthPos = (U32)(seqStorePtr->sequences - seqStorePtr->sequencesStart); }
754 seqStorePtr->sequences[0].litLength = (U16)litLength;
755
756 /* match offset */
757 seqStorePtr->sequences[0].offset = offsetCode + 1;
758
759 /* match Length */
760 if (matchCode>0xFFFF) { seqStorePtr->longLengthID = 2; seqStorePtr->longLengthPos = (U32)(seqStorePtr->sequences - seqStorePtr->sequencesStart); }
761 seqStorePtr->sequences[0].matchLength = (U16)matchCode;
762
763 seqStorePtr->sequences++;
764 }
765
766
767 /*-*************************************
768 * Match length counter
769 ***************************************/
770 static unsigned ZSTD_NbCommonBytes (register size_t val)
771 {
772 if (MEM_isLittleEndian()) {
773 if (MEM_64bits()) {
774 # if defined(_MSC_VER) && defined(_WIN64)
775 unsigned long r = 0;
776 _BitScanForward64( &r, (U64)val );
777 return (unsigned)(r>>3);
778 # elif defined(__GNUC__) && (__GNUC__ >= 3)
779 return (__builtin_ctzll((U64)val) >> 3);
780 # else
781 static const int DeBruijnBytePos[64] = { 0, 0, 0, 0, 0, 1, 1, 2, 0, 3, 1, 3, 1, 4, 2, 7, 0, 2, 3, 6, 1, 5, 3, 5, 1, 3, 4, 4, 2, 5, 6, 7, 7, 0, 1, 2, 3, 3, 4, 6, 2, 6, 5, 5, 3, 4, 5, 6, 7, 1, 2, 4, 6, 4, 4, 5, 7, 2, 6, 5, 7, 6, 7, 7 };
782 return DeBruijnBytePos[((U64)((val & -(long long)val) * 0x0218A392CDABBD3FULL)) >> 58];
783 # endif
784 } else { /* 32 bits */
785 # if defined(_MSC_VER)
786 unsigned long r=0;
787 _BitScanForward( &r, (U32)val );
788 return (unsigned)(r>>3);
789 # elif defined(__GNUC__) && (__GNUC__ >= 3)
790 return (__builtin_ctz((U32)val) >> 3);
791 # else
792 static const int DeBruijnBytePos[32] = { 0, 0, 3, 0, 3, 1, 3, 0, 3, 2, 2, 1, 3, 2, 0, 1, 3, 3, 1, 2, 2, 2, 2, 0, 3, 1, 2, 0, 1, 0, 1, 1 };
793 return DeBruijnBytePos[((U32)((val & -(S32)val) * 0x077CB531U)) >> 27];
794 # endif
795 }
796 } else { /* Big Endian CPU */
797 if (MEM_64bits()) {
798 # if defined(_MSC_VER) && defined(_WIN64)
799 unsigned long r = 0;
800 _BitScanReverse64( &r, val );
801 return (unsigned)(r>>3);
802 # elif defined(__GNUC__) && (__GNUC__ >= 3)
803 return (__builtin_clzll(val) >> 3);
804 # else
805 unsigned r;
806 const unsigned n32 = sizeof(size_t)*4; /* computed this way because the compiler complains about the shift amount in 32-bit mode */
807 if (!(val>>n32)) { r=4; } else { r=0; val>>=n32; }
808 if (!(val>>16)) { r+=2; val>>=8; } else { val>>=24; }
809 r += (!val);
810 return r;
811 # endif
812 } else { /* 32 bits */
813 # if defined(_MSC_VER)
814 unsigned long r = 0;
815 _BitScanReverse( &r, (unsigned long)val );
816 return (unsigned)(r>>3);
817 # elif defined(__GNUC__) && (__GNUC__ >= 3)
818 return (__builtin_clz((U32)val) >> 3);
819 # else
820 unsigned r;
821 if (!(val>>16)) { r=2; val>>=8; } else { r=0; val>>=24; }
822 r += (!val);
823 return r;
824 # endif
825 } }
826 }
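
The little-endian 64-bit branch reduces to XOR plus count-trailing-zeros. A self-contained sketch of the idea (editorial; assumes GCC/Clang builtins):

    #include <stdint.h>

    /* number of equal low-address bytes between two loaded 64-bit words,
       on a little-endian target */
    static unsigned common_bytes64(uint64_t a, uint64_t b)
    {
        uint64_t const diff = a ^ b;                    /* 0 iff all 8 bytes match */
        if (diff == 0) return 8;
        return (unsigned)(__builtin_ctzll(diff) >> 3);  /* bits -> bytes */
    }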
827
828
829 static size_t ZSTD_count(const BYTE* pIn, const BYTE* pMatch, const BYTE* const pInLimit)
830 {
831 const BYTE* const pStart = pIn;
832 const BYTE* const pInLoopLimit = pInLimit - (sizeof(size_t)-1);
833
834 while (pIn < pInLoopLimit) {
835 size_t const diff = MEM_readST(pMatch) ^ MEM_readST(pIn);
836 if (!diff) { pIn+=sizeof(size_t); pMatch+=sizeof(size_t); continue; }
837 pIn += ZSTD_NbCommonBytes(diff);
838 return (size_t)(pIn - pStart);
839 }
840 if (MEM_64bits()) if ((pIn<(pInLimit-3)) && (MEM_read32(pMatch) == MEM_read32(pIn))) { pIn+=4; pMatch+=4; }
841 if ((pIn<(pInLimit-1)) && (MEM_read16(pMatch) == MEM_read16(pIn))) { pIn+=2; pMatch+=2; }
842 if ((pIn<pInLimit) && (*pMatch == *pIn)) pIn++;
843 return (size_t)(pIn - pStart);
844 }
845
846 /** ZSTD_count_2segments() :
847 * can count match length with `ip` & `match` in 2 different segments.
848 * convention : on reaching mEnd, the match count continues, restarting from iStart
849 */
850 static size_t ZSTD_count_2segments(const BYTE* ip, const BYTE* match, const BYTE* iEnd, const BYTE* mEnd, const BYTE* iStart)
851 {
852 const BYTE* const vEnd = MIN( ip + (mEnd - match), iEnd);
853 size_t const matchLength = ZSTD_count(ip, match, vEnd);
854 if (match + matchLength != mEnd) return matchLength;
855 return matchLength + ZSTD_count(ip+matchLength, iStart, iEnd);
856 }
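
In other words: the candidate match may start in the extDict segment and, once it reaches mEnd, logically continue at iStart in the current segment; vEnd caps the first ZSTD_count() so it cannot read past either boundary before the second count takes over.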
857
858
859 /*-*************************************
860 * Hashes
861 ***************************************/
862 static const U32 prime3bytes = 506832829U;
863 static U32 ZSTD_hash3(U32 u, U32 h) { return ((u << (32-24)) * prime3bytes) >> (32-h) ; }
864 MEM_STATIC size_t ZSTD_hash3Ptr(const void* ptr, U32 h) { return ZSTD_hash3(MEM_readLE32(ptr), h); } /* only in zstd_opt.h */
865
866 static const U32 prime4bytes = 2654435761U;
867 static U32 ZSTD_hash4(U32 u, U32 h) { return (u * prime4bytes) >> (32-h) ; }
868 static size_t ZSTD_hash4Ptr(const void* ptr, U32 h) { return ZSTD_hash4(MEM_read32(ptr), h); }
869
870 static const U64 prime5bytes = 889523592379ULL;
871 static size_t ZSTD_hash5(U64 u, U32 h) { return (size_t)(((u << (64-40)) * prime5bytes) >> (64-h)) ; }
872 static size_t ZSTD_hash5Ptr(const void* p, U32 h) { return ZSTD_hash5(MEM_readLE64(p), h); }
873
874 static const U64 prime6bytes = 227718039650203ULL;
875 static size_t ZSTD_hash6(U64 u, U32 h) { return (size_t)(((u << (64-48)) * prime6bytes) >> (64-h)) ; }
876 static size_t ZSTD_hash6Ptr(const void* p, U32 h) { return ZSTD_hash6(MEM_readLE64(p), h); }
877
878 static const U64 prime7bytes = 58295818150454627ULL;
879 static size_t ZSTD_hash7(U64 u, U32 h) { return (size_t)(((u << (64-56)) * prime7bytes) >> (64-h)) ; }
880 static size_t ZSTD_hash7Ptr(const void* p, U32 h) { return ZSTD_hash7(MEM_readLE64(p), h); }
881
882 static const U64 prime8bytes = 0xCF1BBCDCB7A56463ULL;
883 static size_t ZSTD_hash8(U64 u, U32 h) { return (size_t)(((u) * prime8bytes) >> (64-h)) ; }
884 static size_t ZSTD_hash8Ptr(const void* p, U32 h) { return ZSTD_hash8(MEM_readLE64(p), h); }
885
886 static size_t ZSTD_hashPtr(const void* p, U32 hBits, U32 mls)
887 {
888 switch(mls)
889 {
890 default:
891 case 4: return ZSTD_hash4Ptr(p, hBits);
892 case 5: return ZSTD_hash5Ptr(p, hBits);
893 case 6: return ZSTD_hash6Ptr(p, hBits);
894 case 7: return ZSTD_hash7Ptr(p, hBits);
895 case 8: return ZSTD_hash8Ptr(p, hBits);
896 }
897 }
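
All of these are Knuth-style multiplicative hashes: shift the loaded word so only `mls` input bytes remain, multiply by a large odd constant, and keep the top `h` bits. A standalone sketch of the 4-byte case (editorial, reusing the prime4bytes constant above):

    #include <stdint.h>

    /* hash 4 bytes into a table of 2^16 entries (h = 16) */
    static uint32_t hash4_16(uint32_t u)
    {
        return (u * 2654435761U) >> (32 - 16);
    }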
898
899
900 /*-*************************************
901 * Fast Scan
902 ***************************************/
903 static void ZSTD_fillHashTable (ZSTD_CCtx* zc, const void* end, const U32 mls)
904 {
905 U32* const hashTable = zc->hashTable;
906 U32 const hBits = zc->params.cParams.hashLog;
907 const BYTE* const base = zc->base;
908 const BYTE* ip = base + zc->nextToUpdate;
909 const BYTE* const iend = ((const BYTE*)end) - HASH_READ_SIZE;
910 const size_t fastHashFillStep = 3;
911
912 while(ip <= iend) {
913 hashTable[ZSTD_hashPtr(ip, hBits, mls)] = (U32)(ip - base);
914 ip += fastHashFillStep;
915 }
916 }
917
918
919 FORCE_INLINE
920 void ZSTD_compressBlock_fast_generic(ZSTD_CCtx* cctx,
921 const void* src, size_t srcSize,
922 const U32 mls)
923 {
924 U32* const hashTable = cctx->hashTable;
925 U32 const hBits = cctx->params.cParams.hashLog;
926 seqStore_t* seqStorePtr = &(cctx->seqStore);
927 const BYTE* const base = cctx->base;
928 const BYTE* const istart = (const BYTE*)src;
929 const BYTE* ip = istart;
930 const BYTE* anchor = istart;
931 const U32 lowestIndex = cctx->dictLimit;
932 const BYTE* const lowest = base + lowestIndex;
933 const BYTE* const iend = istart + srcSize;
934 const BYTE* const ilimit = iend - HASH_READ_SIZE;
935 U32 offset_1=cctx->rep[0], offset_2=cctx->rep[1];
936 U32 offsetSaved = 0;
937
938 /* init */
939 ip += (ip==lowest);
940 { U32 const maxRep = (U32)(ip-lowest);
941 if (offset_2 > maxRep) offsetSaved = offset_2, offset_2 = 0;
942 if (offset_1 > maxRep) offsetSaved = offset_1, offset_1 = 0;
943 }
944
945 /* Main Search Loop */
946 while (ip < ilimit) { /* < instead of <=, because repcode check at (ip+1) */
947 size_t mLength;
948 size_t const h = ZSTD_hashPtr(ip, hBits, mls);
949 U32 const current = (U32)(ip-base);
950 U32 const matchIndex = hashTable[h];
951 const BYTE* match = base + matchIndex;
952 hashTable[h] = current; /* update hash table */
953
954 if ((offset_1 > 0) & (MEM_read32(ip+1-offset_1) == MEM_read32(ip+1))) {
955 mLength = ZSTD_count(ip+1+4, ip+1+4-offset_1, iend) + 4;
956 ip++;
957 ZSTD_storeSeq(seqStorePtr, ip-anchor, anchor, 0, mLength-MINMATCH);
958 } else {
959 U32 offset;
960 if ( (matchIndex <= lowestIndex) || (MEM_read32(match) != MEM_read32(ip)) ) {
961 ip += ((ip-anchor) >> g_searchStrength) + 1;
962 continue;
963 }
964 mLength = ZSTD_count(ip+4, match+4, iend) + 4;
965 offset = (U32)(ip-match);
966 while (((ip>anchor) & (match>lowest)) && (ip[-1] == match[-1])) { ip--; match--; mLength++; } /* catch up */
967 offset_2 = offset_1;
968 offset_1 = offset;
969
970 ZSTD_storeSeq(seqStorePtr, ip-anchor, anchor, offset + ZSTD_REP_MOVE, mLength-MINMATCH);
971 }
972
973 /* match found */
974 ip += mLength;
975 anchor = ip;
976
977 if (ip <= ilimit) {
978 /* Fill Table */
979 hashTable[ZSTD_hashPtr(base+current+2, hBits, mls)] = current+2; /* here because current+2 could be > iend-8 */
980 hashTable[ZSTD_hashPtr(ip-2, hBits, mls)] = (U32)(ip-2-base);
981 /* check immediate repcode */
982 while ( (ip <= ilimit)
983 && ( (offset_2>0)
984 & (MEM_read32(ip) == MEM_read32(ip - offset_2)) )) {
985 /* store sequence */
986 size_t const rLength = ZSTD_count(ip+4, ip+4-offset_2, iend) + 4;
987 { U32 const tmpOff = offset_2; offset_2 = offset_1; offset_1 = tmpOff; } /* swap offset_2 <=> offset_1 */
988 hashTable[ZSTD_hashPtr(ip, hBits, mls)] = (U32)(ip-base);
989 ZSTD_storeSeq(seqStorePtr, 0, anchor, 0, rLength-MINMATCH);
990 ip += rLength;
991 anchor = ip;
992 continue; /* faster when present ... (?) */
993 } } }
994
995 /* save reps for next block */
996 cctx->savedRep[0] = offset_1 ? offset_1 : offsetSaved;
997 cctx->savedRep[1] = offset_2 ? offset_2 : offsetSaved;
998
999 /* Last Literals */
1000 { size_t const lastLLSize = iend - anchor;
1001 memcpy(seqStorePtr->lit, anchor, lastLLSize);
1002 seqStorePtr->lit += lastLLSize;
1003 }
1004 }
1005
1006
1007 static void ZSTD_compressBlock_fast(ZSTD_CCtx* ctx,
1008 const void* src, size_t srcSize)
1009 {
1010 const U32 mls = ctx->params.cParams.searchLength;
1011 switch(mls)
1012 {
1013 default:
1014 case 4 :
1015 ZSTD_compressBlock_fast_generic(ctx, src, srcSize, 4); return;
1016 case 5 :
1017 ZSTD_compressBlock_fast_generic(ctx, src, srcSize, 5); return;
1018 case 6 :
1019 ZSTD_compressBlock_fast_generic(ctx, src, srcSize, 6); return;
1020 case 7 :
1021 ZSTD_compressBlock_fast_generic(ctx, src, srcSize, 7); return;
1022 }
1023 }
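
Note the dispatch pattern: ZSTD_compressBlock_fast_generic is FORCE_INLINE and always receives a compile-time-constant mls, so each case above instantiates a fully specialized match finder. The extDict and doubleFast variants below repeat the same trick.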
1024
1025
1026 static void ZSTD_compressBlock_fast_extDict_generic(ZSTD_CCtx* ctx,
1027 const void* src, size_t srcSize,
1028 const U32 mls)
1029 {
1030 U32* hashTable = ctx->hashTable;
1031 const U32 hBits = ctx->params.cParams.hashLog;
1032 seqStore_t* seqStorePtr = &(ctx->seqStore);
1033 const BYTE* const base = ctx->base;
1034 const BYTE* const dictBase = ctx->dictBase;
1035 const BYTE* const istart = (const BYTE*)src;
1036 const BYTE* ip = istart;
1037 const BYTE* anchor = istart;
1038 const U32 lowestIndex = ctx->lowLimit;
1039 const BYTE* const dictStart = dictBase + lowestIndex;
1040 const U32 dictLimit = ctx->dictLimit;
1041 const BYTE* const lowPrefixPtr = base + dictLimit;
1042 const BYTE* const dictEnd = dictBase + dictLimit;
1043 const BYTE* const iend = istart + srcSize;
1044 const BYTE* const ilimit = iend - 8;
1045 U32 offset_1=ctx->rep[0], offset_2=ctx->rep[1];
1046
1047 /* Search Loop */
1048 while (ip < ilimit) { /* < instead of <=, because repcode check at (ip+1) */
1049 const size_t h = ZSTD_hashPtr(ip, hBits, mls);
1050 const U32 matchIndex = hashTable[h];
1051 const BYTE* matchBase = matchIndex < dictLimit ? dictBase : base;
1052 const BYTE* match = matchBase + matchIndex;
1053 const U32 current = (U32)(ip-base);
1054 const U32 repIndex = current + 1 - offset_1; /* offset_1 expected <= current +1 */
1055 const BYTE* repBase = repIndex < dictLimit ? dictBase : base;
1056 const BYTE* repMatch = repBase + repIndex;
1057 size_t mLength;
1058 hashTable[h] = current; /* update hash table */
1059
1060 if ( (((U32)((dictLimit-1) - repIndex) >= 3) /* intentional underflow */ & (repIndex > lowestIndex))
1061 && (MEM_read32(repMatch) == MEM_read32(ip+1)) ) {
1062 const BYTE* repMatchEnd = repIndex < dictLimit ? dictEnd : iend;
1063 mLength = ZSTD_count_2segments(ip+1+EQUAL_READ32, repMatch+EQUAL_READ32, iend, repMatchEnd, lowPrefixPtr) + EQUAL_READ32;
1064 ip++;
1065 ZSTD_storeSeq(seqStorePtr, ip-anchor, anchor, 0, mLength-MINMATCH);
1066 } else {
1067 if ( (matchIndex < lowestIndex) ||
1068 (MEM_read32(match) != MEM_read32(ip)) ) {
1069 ip += ((ip-anchor) >> g_searchStrength) + 1;
1070 continue;
1071 }
1072 { const BYTE* matchEnd = matchIndex < dictLimit ? dictEnd : iend;
1073 const BYTE* lowMatchPtr = matchIndex < dictLimit ? dictStart : lowPrefixPtr;
1074 U32 offset;
1075 mLength = ZSTD_count_2segments(ip+EQUAL_READ32, match+EQUAL_READ32, iend, matchEnd, lowPrefixPtr) + EQUAL_READ32;
1076 while (((ip>anchor) & (match>lowMatchPtr)) && (ip[-1] == match[-1])) { ip--; match--; mLength++; } /* catch up */
1077 offset = current - matchIndex;
1078 offset_2 = offset_1;
1079 offset_1 = offset;
1080 ZSTD_storeSeq(seqStorePtr, ip-anchor, anchor, offset + ZSTD_REP_MOVE, mLength-MINMATCH);
1081 } }
1082
1083 /* found a match : store it */
1084 ip += mLength;
1085 anchor = ip;
1086
1087 if (ip <= ilimit) {
1088 /* Fill Table */
1089 hashTable[ZSTD_hashPtr(base+current+2, hBits, mls)] = current+2;
1090 hashTable[ZSTD_hashPtr(ip-2, hBits, mls)] = (U32)(ip-2-base);
1091 /* check immediate repcode */
1092 while (ip <= ilimit) {
1093 U32 const current2 = (U32)(ip-base);
1094 U32 const repIndex2 = current2 - offset_2;
1095 const BYTE* repMatch2 = repIndex2 < dictLimit ? dictBase + repIndex2 : base + repIndex2;
1096 if ( (((U32)((dictLimit-1) - repIndex2) >= 3) & (repIndex2 > lowestIndex)) /* intentional underflow */
1097 && (MEM_read32(repMatch2) == MEM_read32(ip)) ) {
1098 const BYTE* const repEnd2 = repIndex2 < dictLimit ? dictEnd : iend;
1099 size_t repLength2 = ZSTD_count_2segments(ip+EQUAL_READ32, repMatch2+EQUAL_READ32, iend, repEnd2, lowPrefixPtr) + EQUAL_READ32;
1100 U32 tmpOffset = offset_2; offset_2 = offset_1; offset_1 = tmpOffset; /* swap offset_2 <=> offset_1 */
1101 ZSTD_storeSeq(seqStorePtr, 0, anchor, 0, repLength2-MINMATCH);
1102 hashTable[ZSTD_hashPtr(ip, hBits, mls)] = current2;
1103 ip += repLength2;
1104 anchor = ip;
1105 continue;
1106 }
1107 break;
1108 } } }
1109
1110 /* save reps for next block */
1111 ctx->savedRep[0] = offset_1; ctx->savedRep[1] = offset_2;
1112
1113 /* Last Literals */
1114 { size_t const lastLLSize = iend - anchor;
1115 memcpy(seqStorePtr->lit, anchor, lastLLSize);
1116 seqStorePtr->lit += lastLLSize;
1117 }
1118 }
1119
1120
1121 static void ZSTD_compressBlock_fast_extDict(ZSTD_CCtx* ctx,
1122 const void* src, size_t srcSize)
1123 {
1124 U32 const mls = ctx->params.cParams.searchLength;
1125 switch(mls)
1126 {
1127 default:
1128 case 4 :
1129 ZSTD_compressBlock_fast_extDict_generic(ctx, src, srcSize, 4); return;
1130 case 5 :
1131 ZSTD_compressBlock_fast_extDict_generic(ctx, src, srcSize, 5); return;
1132 case 6 :
1133 ZSTD_compressBlock_fast_extDict_generic(ctx, src, srcSize, 6); return;
1134 case 7 :
1135 ZSTD_compressBlock_fast_extDict_generic(ctx, src, srcSize, 7); return;
1136 }
1137 }
1138
1139
1140 /*-*************************************
1141 * Double Fast
1142 ***************************************/
1143 static void ZSTD_fillDoubleHashTable (ZSTD_CCtx* cctx, const void* end, const U32 mls)
1144 {
1145 U32* const hashLarge = cctx->hashTable;
1146 U32 const hBitsL = cctx->params.cParams.hashLog;
1147 U32* const hashSmall = cctx->chainTable;
1148 U32 const hBitsS = cctx->params.cParams.chainLog;
1149 const BYTE* const base = cctx->base;
1150 const BYTE* ip = base + cctx->nextToUpdate;
1151 const BYTE* const iend = ((const BYTE*)end) - HASH_READ_SIZE;
1152 const size_t fastHashFillStep = 3;
1153
1154 while(ip <= iend) {
1155 hashSmall[ZSTD_hashPtr(ip, hBitsS, mls)] = (U32)(ip - base);
1156 hashLarge[ZSTD_hashPtr(ip, hBitsL, 8)] = (U32)(ip - base);
1157 ip += fastHashFillStep;
1158 }
1159 }
1160
1161
1162 FORCE_INLINE
1163 void ZSTD_compressBlock_doubleFast_generic(ZSTD_CCtx* cctx,
1164 const void* src, size_t srcSize,
1165 const U32 mls)
1166 {
1167 U32* const hashLong = cctx->hashTable;
1168 const U32 hBitsL = cctx->params.cParams.hashLog;
1169 U32* const hashSmall = cctx->chainTable;
1170 const U32 hBitsS = cctx->params.cParams.chainLog;
1171 seqStore_t* seqStorePtr = &(cctx->seqStore);
1172 const BYTE* const base = cctx->base;
1173 const BYTE* const istart = (const BYTE*)src;
1174 const BYTE* ip = istart;
1175 const BYTE* anchor = istart;
1176 const U32 lowestIndex = cctx->dictLimit;
1177 const BYTE* const lowest = base + lowestIndex;
1178 const BYTE* const iend = istart + srcSize;
1179 const BYTE* const ilimit = iend - HASH_READ_SIZE;
1180 U32 offset_1=cctx->rep[0], offset_2=cctx->rep[1];
1181 U32 offsetSaved = 0;
1182
1183 /* init */
1184 ip += (ip==lowest);
1185 { U32 const maxRep = (U32)(ip-lowest);
1186 if (offset_2 > maxRep) offsetSaved = offset_2, offset_2 = 0;
1187 if (offset_1 > maxRep) offsetSaved = offset_1, offset_1 = 0;
1188 }
1189
1190 /* Main Search Loop */
1191 while (ip < ilimit) { /* < instead of <=, because repcode check at (ip+1) */
1192 size_t mLength;
1193 size_t const h2 = ZSTD_hashPtr(ip, hBitsL, 8);
1194 size_t const h = ZSTD_hashPtr(ip, hBitsS, mls);
1195 U32 const current = (U32)(ip-base);
1196 U32 const matchIndexL = hashLong[h2];
1197 U32 const matchIndexS = hashSmall[h];
1198 const BYTE* matchLong = base + matchIndexL;
1199 const BYTE* match = base + matchIndexS;
1200 hashLong[h2] = hashSmall[h] = current; /* update hash tables */
1201
1202 if ((offset_1 > 0) & (MEM_read32(ip+1-offset_1) == MEM_read32(ip+1))) { /* note : by construction, offset_1 <= current */
1203 mLength = ZSTD_count(ip+1+4, ip+1+4-offset_1, iend) + 4;
1204 ip++;
1205 ZSTD_storeSeq(seqStorePtr, ip-anchor, anchor, 0, mLength-MINMATCH);
1206 } else {
1207 U32 offset;
1208 if ( (matchIndexL > lowestIndex) && (MEM_read64(matchLong) == MEM_read64(ip)) ) {
1209 mLength = ZSTD_count(ip+8, matchLong+8, iend) + 8;
1210 offset = (U32)(ip-matchLong);
1211 while (((ip>anchor) & (matchLong>lowest)) && (ip[-1] == matchLong[-1])) { ip--; matchLong--; mLength++; } /* catch up */
1212 } else if ( (matchIndexS > lowestIndex) && (MEM_read32(match) == MEM_read32(ip)) ) {
1213 size_t const h3 = ZSTD_hashPtr(ip+1, hBitsL, 8);
1214 U32 const matchIndex3 = hashLong[h3];
1215 const BYTE* match3 = base + matchIndex3;
1216 hashLong[h3] = current + 1;
1217 if ( (matchIndex3 > lowestIndex) && (MEM_read64(match3) == MEM_read64(ip+1)) ) {
1218 mLength = ZSTD_count(ip+9, match3+8, iend) + 8;
1219 ip++;
1220 offset = (U32)(ip-match3);
1221 while (((ip>anchor) & (match3>lowest)) && (ip[-1] == match3[-1])) { ip--; match3--; mLength++; } /* catch up */
1222 } else {
1223 mLength = ZSTD_count(ip+4, match+4, iend) + 4;
1224 offset = (U32)(ip-match);
1225 while (((ip>anchor) & (match>lowest)) && (ip[-1] == match[-1])) { ip--; match--; mLength++; } /* catch up */
1226 }
1227 } else {
1228 ip += ((ip-anchor) >> g_searchStrength) + 1;
1229 continue;
1230 }
1231
1232 offset_2 = offset_1;
1233 offset_1 = offset;
1234
1235 ZSTD_storeSeq(seqStorePtr, ip-anchor, anchor, offset + ZSTD_REP_MOVE, mLength-MINMATCH);
1236 }
1237
1238 /* match found */
1239 ip += mLength;
1240 anchor = ip;
1241
1242 if (ip <= ilimit) {
1243 /* Fill Table */
1244 hashLong[ZSTD_hashPtr(base+current+2, hBitsL, 8)] =
1245 hashSmall[ZSTD_hashPtr(base+current+2, hBitsS, mls)] = current+2; /* here because current+2 could be > iend-8 */
1246 hashLong[ZSTD_hashPtr(ip-2, hBitsL, 8)] =
1247 hashSmall[ZSTD_hashPtr(ip-2, hBitsS, mls)] = (U32)(ip-2-base);
1248
1249 /* check immediate repcode */
1250 while ( (ip <= ilimit)
1251 && ( (offset_2>0)
1252 & (MEM_read32(ip) == MEM_read32(ip - offset_2)) )) {
1253 /* store sequence */
1254 size_t const rLength = ZSTD_count(ip+4, ip+4-offset_2, iend) + 4;
1255 { U32 const tmpOff = offset_2; offset_2 = offset_1; offset_1 = tmpOff; } /* swap offset_2 <=> offset_1 */
1256 hashSmall[ZSTD_hashPtr(ip, hBitsS, mls)] = (U32)(ip-base);
1257 hashLong[ZSTD_hashPtr(ip, hBitsL, 8)] = (U32)(ip-base);
1258 ZSTD_storeSeq(seqStorePtr, 0, anchor, 0, rLength-MINMATCH);
1259 ip += rLength;
1260 anchor = ip;
1261 continue; /* faster when present ... (?) */
1262 } } }
1263
1264 /* save reps for next block */
1265 cctx->savedRep[0] = offset_1 ? offset_1 : offsetSaved;
1266 cctx->savedRep[1] = offset_2 ? offset_2 : offsetSaved;
1267
1268 /* Last Literals */
1269 { size_t const lastLLSize = iend - anchor;
1270 memcpy(seqStorePtr->lit, anchor, lastLLSize);
1271 seqStorePtr->lit += lastLLSize;
1272 }
1273 }
1274
1275
1276 static void ZSTD_compressBlock_doubleFast(ZSTD_CCtx* ctx, const void* src, size_t srcSize)
1277 {
1278 const U32 mls = ctx->params.cParams.searchLength;
1279 switch(mls)
1280 {
1281 default:
1282 case 4 :
1283 ZSTD_compressBlock_doubleFast_generic(ctx, src, srcSize, 4); return;
1284 case 5 :
1285 ZSTD_compressBlock_doubleFast_generic(ctx, src, srcSize, 5); return;
1286 case 6 :
1287 ZSTD_compressBlock_doubleFast_generic(ctx, src, srcSize, 6); return;
1288 case 7 :
1289 ZSTD_compressBlock_doubleFast_generic(ctx, src, srcSize, 7); return;
1290 }
1291 }
1292
1293
1294 static void ZSTD_compressBlock_doubleFast_extDict_generic(ZSTD_CCtx* ctx,
1295 const void* src, size_t srcSize,
1296 const U32 mls)
1297 {
1298 U32* const hashLong = ctx->hashTable;
1299 U32 const hBitsL = ctx->params.cParams.hashLog;
1300 U32* const hashSmall = ctx->chainTable;
1301 U32 const hBitsS = ctx->params.cParams.chainLog;
1302 seqStore_t* seqStorePtr = &(ctx->seqStore);
1303 const BYTE* const base = ctx->base;
1304 const BYTE* const dictBase = ctx->dictBase;
1305 const BYTE* const istart = (const BYTE*)src;
1306 const BYTE* ip = istart;
1307 const BYTE* anchor = istart;
1308 const U32 lowestIndex = ctx->lowLimit;
1309 const BYTE* const dictStart = dictBase + lowestIndex;
1310 const U32 dictLimit = ctx->dictLimit;
1311 const BYTE* const lowPrefixPtr = base + dictLimit;
1312 const BYTE* const dictEnd = dictBase + dictLimit;
1313 const BYTE* const iend = istart + srcSize;
1314 const BYTE* const ilimit = iend - 8;
1315 U32 offset_1=ctx->rep[0], offset_2=ctx->rep[1];
1316
1317 /* Search Loop */
1318 while (ip < ilimit) { /* < instead of <=, because repcode check at (ip+1) */
1319 const size_t hSmall = ZSTD_hashPtr(ip, hBitsS, mls);
1320 const U32 matchIndex = hashSmall[hSmall];
1321 const BYTE* matchBase = matchIndex < dictLimit ? dictBase : base;
1322 const BYTE* match = matchBase + matchIndex;
1323
1324 const size_t hLong = ZSTD_hashPtr(ip, hBitsL, 8);
1325 const U32 matchLongIndex = hashLong[hLong];
1326 const BYTE* matchLongBase = matchLongIndex < dictLimit ? dictBase : base;
1327 const BYTE* matchLong = matchLongBase + matchLongIndex;
1328
1329 const U32 current = (U32)(ip-base);
1330 const U32 repIndex = current + 1 - offset_1; /* offset_1 expected <= current +1 */
1331 const BYTE* repBase = repIndex < dictLimit ? dictBase : base;
1332 const BYTE* repMatch = repBase + repIndex;
1333 size_t mLength;
1334 hashSmall[hSmall] = hashLong[hLong] = current; /* update hash table */
1335
1336 if ( (((U32)((dictLimit-1) - repIndex) >= 3) /* intentional underflow */ & (repIndex > lowestIndex))
1337 && (MEM_read32(repMatch) == MEM_read32(ip+1)) ) {
1338 const BYTE* repMatchEnd = repIndex < dictLimit ? dictEnd : iend;
1339 mLength = ZSTD_count_2segments(ip+1+4, repMatch+4, iend, repMatchEnd, lowPrefixPtr) + 4;
1340 ip++;
1341 ZSTD_storeSeq(seqStorePtr, ip-anchor, anchor, 0, mLength-MINMATCH);
1342 } else {
1343 if ((matchLongIndex > lowestIndex) && (MEM_read64(matchLong) == MEM_read64(ip))) {
1344 const BYTE* matchEnd = matchLongIndex < dictLimit ? dictEnd : iend;
1345 const BYTE* lowMatchPtr = matchLongIndex < dictLimit ? dictStart : lowPrefixPtr;
1346 U32 offset;
1347 mLength = ZSTD_count_2segments(ip+8, matchLong+8, iend, matchEnd, lowPrefixPtr) + 8;
1348 offset = current - matchLongIndex;
1349 while (((ip>anchor) & (matchLong>lowMatchPtr)) && (ip[-1] == matchLong[-1])) { ip--; matchLong--; mLength++; } /* catch up */
1350 offset_2 = offset_1;
1351 offset_1 = offset;
1352 ZSTD_storeSeq(seqStorePtr, ip-anchor, anchor, offset + ZSTD_REP_MOVE, mLength-MINMATCH);
1353
1354 } else if ((matchIndex > lowestIndex) && (MEM_read32(match) == MEM_read32(ip))) {
1355 size_t const h3 = ZSTD_hashPtr(ip+1, hBitsL, 8);
1356 U32 const matchIndex3 = hashLong[h3];
1357 const BYTE* const match3Base = matchIndex3 < dictLimit ? dictBase : base;
1358 const BYTE* match3 = match3Base + matchIndex3;
1359 U32 offset;
1360 hashLong[h3] = current + 1;
1361 if ( (matchIndex3 > lowestIndex) && (MEM_read64(match3) == MEM_read64(ip+1)) ) {
1362 const BYTE* matchEnd = matchIndex3 < dictLimit ? dictEnd : iend;
1363 const BYTE* lowMatchPtr = matchIndex3 < dictLimit ? dictStart : lowPrefixPtr;
1364 mLength = ZSTD_count_2segments(ip+9, match3+8, iend, matchEnd, lowPrefixPtr) + 8;
1365 ip++;
1366 offset = current+1 - matchIndex3;
1367 while (((ip>anchor) & (match3>lowMatchPtr)) && (ip[-1] == match3[-1])) { ip--; match3--; mLength++; } /* catch up */
1368 } else {
1369 const BYTE* matchEnd = matchIndex < dictLimit ? dictEnd : iend;
1370 const BYTE* lowMatchPtr = matchIndex < dictLimit ? dictStart : lowPrefixPtr;
1371 mLength = ZSTD_count_2segments(ip+4, match+4, iend, matchEnd, lowPrefixPtr) + 4;
1372 offset = current - matchIndex;
1373 while (((ip>anchor) & (match>lowMatchPtr)) && (ip[-1] == match[-1])) { ip--; match--; mLength++; } /* catch up */
1374 }
1375 offset_2 = offset_1;
1376 offset_1 = offset;
1377 ZSTD_storeSeq(seqStorePtr, ip-anchor, anchor, offset + ZSTD_REP_MOVE, mLength-MINMATCH);
1378
1379 } else {
1380 ip += ((ip-anchor) >> g_searchStrength) + 1;
1381 continue;
1382 } }
1383
1384 /* found a match : store it */
1385 ip += mLength;
1386 anchor = ip;
1387
1388 if (ip <= ilimit) {
1389 /* Fill Table */
1390 hashSmall[ZSTD_hashPtr(base+current+2, hBitsS, mls)] = current+2;
1391 hashLong[ZSTD_hashPtr(base+current+2, hBitsL, 8)] = current+2;
1392 hashSmall[ZSTD_hashPtr(ip-2, hBitsS, mls)] = (U32)(ip-2-base);
1393 hashLong[ZSTD_hashPtr(ip-2, hBitsL, 8)] = (U32)(ip-2-base);
1394 /* check immediate repcode */
1395 while (ip <= ilimit) {
1396 U32 const current2 = (U32)(ip-base);
1397 U32 const repIndex2 = current2 - offset_2;
1398 const BYTE* repMatch2 = repIndex2 < dictLimit ? dictBase + repIndex2 : base + repIndex2;
1399 if ( (((U32)((dictLimit-1) - repIndex2) >= 3) & (repIndex2 > lowestIndex)) /* intentional underflow */
1400 && (MEM_read32(repMatch2) == MEM_read32(ip)) ) {
1401 const BYTE* const repEnd2 = repIndex2 < dictLimit ? dictEnd : iend;
1402 size_t const repLength2 = ZSTD_count_2segments(ip+EQUAL_READ32, repMatch2+EQUAL_READ32, iend, repEnd2, lowPrefixPtr) + EQUAL_READ32;
1403 U32 tmpOffset = offset_2; offset_2 = offset_1; offset_1 = tmpOffset; /* swap offset_2 <=> offset_1 */
1404 ZSTD_storeSeq(seqStorePtr, 0, anchor, 0, repLength2-MINMATCH);
1405 hashSmall[ZSTD_hashPtr(ip, hBitsS, mls)] = current2;
1406 hashLong[ZSTD_hashPtr(ip, hBitsL, 8)] = current2;
1407 ip += repLength2;
1408 anchor = ip;
1409 continue;
1410 }
1411 break;
1412 } } }
1413
1414 /* save reps for next block */
1415 ctx->savedRep[0] = offset_1; ctx->savedRep[1] = offset_2;
1416
1417 /* Last Literals */
1418 { size_t const lastLLSize = iend - anchor;
1419 memcpy(seqStorePtr->lit, anchor, lastLLSize);
1420 seqStorePtr->lit += lastLLSize;
1421 }
1422 }
1423
1424
1425 static void ZSTD_compressBlock_doubleFast_extDict(ZSTD_CCtx* ctx,
1426 const void* src, size_t srcSize)
1427 {
1428 U32 const mls = ctx->params.cParams.searchLength;
1429 switch(mls)
1430 {
1431 default:
1432 case 4 :
1433 ZSTD_compressBlock_doubleFast_extDict_generic(ctx, src, srcSize, 4); return;
1434 case 5 :
1435 ZSTD_compressBlock_doubleFast_extDict_generic(ctx, src, srcSize, 5); return;
1436 case 6 :
1437 ZSTD_compressBlock_doubleFast_extDict_generic(ctx, src, srcSize, 6); return;
1438 case 7 :
1439 ZSTD_compressBlock_doubleFast_extDict_generic(ctx, src, srcSize, 7); return;
1440 }
1441 }
1442
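The switch-on-`mls` wrappers in this file follow a template-emulation idiom: the FORCE_INLINE generic body receives its search parameter as a compile-time constant, so the compiler emits one fully specialized copy per `case`, and the wrapper reduces to a single runtime switch. A minimal self-contained sketch of the idiom, with hypothetical names (`scan_generic`, `scan_select`):

    #include <stddef.h>

    /* generic body : `step` is a compile-time constant at each call site below,
     * so every instantiation is compiled with `step` folded in */
    static inline size_t scan_generic(const char* p, size_t len, const size_t step)
    {
        size_t hits = 0, i;
        for (i = 0; i < len; i += step) hits += (p[i] == 0);
        return hits;
    }

    /* dispatcher : one runtime switch selects the specialized instance */
    static size_t scan_select(const char* p, size_t len, size_t step)
    {
        switch(step)
        {
        default:
        case 4 : return scan_generic(p, len, 4);
        case 5 : return scan_generic(p, len, 5);
        case 6 : return scan_generic(p, len, 6);
        case 7 : return scan_generic(p, len, 7);
        }
    }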
1443
1444 /*-*************************************
1445 * Binary Tree search
1446 ***************************************/
1447 /** ZSTD_insertBt1() : add one or multiple positions to tree.
1448 * ip : assumed <= iend-8.
1449 * @return : nb of positions added */
1450 static U32 ZSTD_insertBt1(ZSTD_CCtx* zc, const BYTE* const ip, const U32 mls, const BYTE* const iend, U32 nbCompares,
1451 U32 extDict)
1452 {
1453 U32* const hashTable = zc->hashTable;
1454 U32 const hashLog = zc->params.cParams.hashLog;
1455 size_t const h = ZSTD_hashPtr(ip, hashLog, mls);
1456 U32* const bt = zc->chainTable;
1457 U32 const btLog = zc->params.cParams.chainLog - 1;
1458 U32 const btMask = (1 << btLog) - 1;
1459 U32 matchIndex = hashTable[h];
1460 size_t commonLengthSmaller=0, commonLengthLarger=0;
1461 const BYTE* const base = zc->base;
1462 const BYTE* const dictBase = zc->dictBase;
1463 const U32 dictLimit = zc->dictLimit;
1464 const BYTE* const dictEnd = dictBase + dictLimit;
1465 const BYTE* const prefixStart = base + dictLimit;
1466 const BYTE* match;
1467 const U32 current = (U32)(ip-base);
1468 const U32 btLow = btMask >= current ? 0 : current - btMask;
1469 U32* smallerPtr = bt + 2*(current&btMask);
1470 U32* largerPtr = smallerPtr + 1;
1471 U32 dummy32; /* to be nullified at the end */
1472 U32 const windowLow = zc->lowLimit;
1473 U32 matchEndIdx = current+8;
1474 size_t bestLength = 8;
1475 #ifdef ZSTD_C_PREDICT
1476 U32 predictedSmall = *(bt + 2*((current-1)&btMask) + 0);
1477 U32 predictedLarge = *(bt + 2*((current-1)&btMask) + 1);
1478 predictedSmall += (predictedSmall>0);
1479 predictedLarge += (predictedLarge>0);
1480 #endif /* ZSTD_C_PREDICT */
1481
1482 hashTable[h] = current; /* Update Hash Table */
1483
1484 while (nbCompares-- && (matchIndex > windowLow)) {
1485 U32* nextPtr = bt + 2*(matchIndex & btMask);
1486 size_t matchLength = MIN(commonLengthSmaller, commonLengthLarger); /* guaranteed minimum nb of common bytes */
1487 #ifdef ZSTD_C_PREDICT /* note : can create issues when hashLog is small (<= 11) */
1488 const U32* predictPtr = bt + 2*((matchIndex-1) & btMask); /* written this way, as bt is a roll buffer */
1489 if (matchIndex == predictedSmall) {
1490 /* no need to check length, result known */
1491 *smallerPtr = matchIndex;
1492 if (matchIndex <= btLow) { smallerPtr=&dummy32; break; } /* beyond tree size, stop the search */
1493 smallerPtr = nextPtr+1; /* new "smaller" => larger of match */
1494 matchIndex = nextPtr[1]; /* new matchIndex larger than previous (closer to current) */
1495 predictedSmall = predictPtr[1] + (predictPtr[1]>0);
1496 continue;
1497 }
1498 if (matchIndex == predictedLarge) {
1499 *largerPtr = matchIndex;
1500 if (matchIndex <= btLow) { largerPtr=&dummy32; break; } /* beyond tree size, stop the search */
1501 largerPtr = nextPtr;
1502 matchIndex = nextPtr[0];
1503 predictedLarge = predictPtr[0] + (predictPtr[0]>0);
1504 continue;
1505 }
1506 #endif
1507 if ((!extDict) || (matchIndex+matchLength >= dictLimit)) {
1508 match = base + matchIndex;
1509 if (match[matchLength] == ip[matchLength])
1510 matchLength += ZSTD_count(ip+matchLength+1, match+matchLength+1, iend) +1;
1511 } else {
1512 match = dictBase + matchIndex;
1513 matchLength += ZSTD_count_2segments(ip+matchLength, match+matchLength, iend, dictEnd, prefixStart);
1514 if (matchIndex+matchLength >= dictLimit)
1515 match = base + matchIndex; /* to prepare for next usage of match[matchLength] */
1516 }
1517
1518 if (matchLength > bestLength) {
1519 bestLength = matchLength;
1520 if (matchLength > matchEndIdx - matchIndex)
1521 matchEndIdx = matchIndex + (U32)matchLength;
1522 }
1523
1524 if (ip+matchLength == iend) /* equal : no way to know if match is smaller or larger */
1525 break; /* drop it, to guarantee consistency; misses a bit of compression, but other solutions can corrupt the tree */
1526
1527 if (match[matchLength] < ip[matchLength]) { /* necessarily within correct buffer */
1528 /* match is smaller than current */
1529 *smallerPtr = matchIndex; /* update smaller idx */
1530 commonLengthSmaller = matchLength; /* all smaller will now have at least this guaranteed common length */
1531 if (matchIndex <= btLow) { smallerPtr=&dummy32; break; } /* beyond tree size, stop the search */
1532 smallerPtr = nextPtr+1; /* new "smaller" => larger of match */
1533 matchIndex = nextPtr[1]; /* new matchIndex larger than previous (closer to current) */
1534 } else {
1535 /* match is larger than current */
1536 *largerPtr = matchIndex;
1537 commonLengthLarger = matchLength;
1538 if (matchIndex <= btLow) { largerPtr=&dummy32; break; } /* beyond tree size, stop the search */
1539 largerPtr = nextPtr;
1540 matchIndex = nextPtr[0];
1541 } }
1542
1543 *smallerPtr = *largerPtr = 0;
1544 if (bestLength > 384) return MIN(192, (U32)(bestLength - 384)); /* speed optimization */
1545 if (matchEndIdx > current + 8) return matchEndIdx - current - 8;
1546 return 1;
1547 }
1548
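For orientation: the tree above is stored flat inside `bt` (the chain table), used as a rolling buffer. Position `p` owns two adjacent U32 slots, holding the root indices of its "smaller" and "larger" subtrees; positions more than `btMask` back have been overwritten, which is why the loops stop at `btLow`. A hedged sketch of that layout, with hypothetical accessor names (the typedef mirrors the one in mem.h):

    #include <stdint.h>
    typedef uint32_t U32;   /* mirrors mem.h */

    /* node of position `pos` = two adjacent slots in a circular buffer */
    static U32* bt_smaller(U32* bt, U32 pos, U32 btMask) { return bt + 2*(pos & btMask); }
    static U32* bt_larger (U32* bt, U32 pos, U32 btMask) { return bt + 2*(pos & btMask) + 1; }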
1549
1550 static size_t ZSTD_insertBtAndFindBestMatch (
1551 ZSTD_CCtx* zc,
1552 const BYTE* const ip, const BYTE* const iend,
1553 size_t* offsetPtr,
1554 U32 nbCompares, const U32 mls,
1555 U32 extDict)
1556 {
1557 U32* const hashTable = zc->hashTable;
1558 U32 const hashLog = zc->params.cParams.hashLog;
1559 size_t const h = ZSTD_hashPtr(ip, hashLog, mls);
1560 U32* const bt = zc->chainTable;
1561 U32 const btLog = zc->params.cParams.chainLog - 1;
1562 U32 const btMask = (1 << btLog) - 1;
1563 U32 matchIndex = hashTable[h];
1564 size_t commonLengthSmaller=0, commonLengthLarger=0;
1565 const BYTE* const base = zc->base;
1566 const BYTE* const dictBase = zc->dictBase;
1567 const U32 dictLimit = zc->dictLimit;
1568 const BYTE* const dictEnd = dictBase + dictLimit;
1569 const BYTE* const prefixStart = base + dictLimit;
1570 const U32 current = (U32)(ip-base);
1571 const U32 btLow = btMask >= current ? 0 : current - btMask;
1572 const U32 windowLow = zc->lowLimit;
1573 U32* smallerPtr = bt + 2*(current&btMask);
1574 U32* largerPtr = bt + 2*(current&btMask) + 1;
1575 U32 matchEndIdx = current+8;
1576 U32 dummy32; /* to be nullified at the end */
1577 size_t bestLength = 0;
1578
1579 hashTable[h] = current; /* Update Hash Table */
1580
1581 while (nbCompares-- && (matchIndex > windowLow)) {
1582 U32* nextPtr = bt + 2*(matchIndex & btMask);
1583 size_t matchLength = MIN(commonLengthSmaller, commonLengthLarger); /* guaranteed minimum nb of common bytes */
1584 const BYTE* match;
1585
1586 if ((!extDict) || (matchIndex+matchLength >= dictLimit)) {
1587 match = base + matchIndex;
1588 if (match[matchLength] == ip[matchLength])
1589 matchLength += ZSTD_count(ip+matchLength+1, match+matchLength+1, iend) +1;
1590 } else {
1591 match = dictBase + matchIndex;
1592 matchLength += ZSTD_count_2segments(ip+matchLength, match+matchLength, iend, dictEnd, prefixStart);
1593 if (matchIndex+matchLength >= dictLimit)
1594 match = base + matchIndex; /* to prepare for next usage of match[matchLength] */
1595 }
1596
1597 if (matchLength > bestLength) {
1598 if (matchLength > matchEndIdx - matchIndex)
1599 matchEndIdx = matchIndex + (U32)matchLength;
1600 if ( (4*(int)(matchLength-bestLength)) > (int)(ZSTD_highbit32(current-matchIndex+1) - ZSTD_highbit32((U32)offsetPtr[0]+1)) )
1601 bestLength = matchLength, *offsetPtr = ZSTD_REP_MOVE + current - matchIndex;
1602 if (ip+matchLength == iend) /* equal : no way to know if match is smaller or larger */
1603 break; /* drop it, to guarantee consistency (misses a little bit of compression) */
1604 }
1605
1606 if (match[matchLength] < ip[matchLength]) {
1607 /* match is smaller than current */
1608 *smallerPtr = matchIndex; /* update smaller idx */
1609 commonLengthSmaller = matchLength; /* all smaller will now have at least this guaranteed common length */
1610 if (matchIndex <= btLow) { smallerPtr=&dummy32; break; } /* beyond tree size, stop the search */
1611 smallerPtr = nextPtr+1; /* new "smaller" => larger of match */
1612 matchIndex = nextPtr[1]; /* new matchIndex larger than previous (closer to current) */
1613 } else {
1614 /* match is larger than current */
1615 *largerPtr = matchIndex;
1616 commonLengthLarger = matchLength;
1617 if (matchIndex <= btLow) { largerPtr=&dummy32; break; } /* beyond tree size, stop the search */
1618 largerPtr = nextPtr;
1619 matchIndex = nextPtr[0];
1620 } }
1621
1622 *smallerPtr = *largerPtr = 0;
1623
1624 zc->nextToUpdate = (matchEndIdx > current + 8) ? matchEndIdx - 8 : current+1;
1625 return bestLength;
1626 }
1627
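The acceptance test above only replaces the current best match when 4 times the extra length outweighs the extra offset cost, estimated as the difference in offset bit counts. Worked numbers (illustrative, ignoring the ZSTD_REP_MOVE bias for clarity): with a best of length 8 at offset 64 and candidates at offset 100000,

    4*(9-8)  = 4   >  ZSTD_highbit32(100001) - ZSTD_highbit32(65) = 16 - 6 = 10 ?  no  : keep current best
    4*(12-8) = 16  >  10 ?                                                        yes : a length-12 candidate wins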
1628
1629 static void ZSTD_updateTree(ZSTD_CCtx* zc, const BYTE* const ip, const BYTE* const iend, const U32 nbCompares, const U32 mls)
1630 {
1631 const BYTE* const base = zc->base;
1632 const U32 target = (U32)(ip - base);
1633 U32 idx = zc->nextToUpdate;
1634
1635 while(idx < target)
1636 idx += ZSTD_insertBt1(zc, base+idx, mls, iend, nbCompares, 0);
1637 }
1638
1639 /** ZSTD_BtFindBestMatch() : Tree updater, providing best match */
1640 static size_t ZSTD_BtFindBestMatch (
1641 ZSTD_CCtx* zc,
1642 const BYTE* const ip, const BYTE* const iLimit,
1643 size_t* offsetPtr,
1644 const U32 maxNbAttempts, const U32 mls)
1645 {
1646 if (ip < zc->base + zc->nextToUpdate) return 0; /* skipped area */
1647 ZSTD_updateTree(zc, ip, iLimit, maxNbAttempts, mls);
1648 return ZSTD_insertBtAndFindBestMatch(zc, ip, iLimit, offsetPtr, maxNbAttempts, mls, 0);
1649 }
1650
1651
1652 static size_t ZSTD_BtFindBestMatch_selectMLS (
1653 ZSTD_CCtx* zc, /* Index table will be updated */
1654 const BYTE* ip, const BYTE* const iLimit,
1655 size_t* offsetPtr,
1656 const U32 maxNbAttempts, const U32 matchLengthSearch)
1657 {
1658 switch(matchLengthSearch)
1659 {
1660 default :
1661 case 4 : return ZSTD_BtFindBestMatch(zc, ip, iLimit, offsetPtr, maxNbAttempts, 4);
1662 case 5 : return ZSTD_BtFindBestMatch(zc, ip, iLimit, offsetPtr, maxNbAttempts, 5);
1663 case 6 : return ZSTD_BtFindBestMatch(zc, ip, iLimit, offsetPtr, maxNbAttempts, 6);
1664 }
1665 }
1666
1667
1668 static void ZSTD_updateTree_extDict(ZSTD_CCtx* zc, const BYTE* const ip, const BYTE* const iend, const U32 nbCompares, const U32 mls)
1669 {
1670 const BYTE* const base = zc->base;
1671 const U32 target = (U32)(ip - base);
1672 U32 idx = zc->nextToUpdate;
1673
1674 while (idx < target) idx += ZSTD_insertBt1(zc, base+idx, mls, iend, nbCompares, 1);
1675 }
1676
1677
1678 /** Tree updater, providing best match */
1679 static size_t ZSTD_BtFindBestMatch_extDict (
1680 ZSTD_CCtx* zc,
1681 const BYTE* const ip, const BYTE* const iLimit,
1682 size_t* offsetPtr,
1683 const U32 maxNbAttempts, const U32 mls)
1684 {
1685 if (ip < zc->base + zc->nextToUpdate) return 0; /* skipped area */
1686 ZSTD_updateTree_extDict(zc, ip, iLimit, maxNbAttempts, mls);
1687 return ZSTD_insertBtAndFindBestMatch(zc, ip, iLimit, offsetPtr, maxNbAttempts, mls, 1);
1688 }
1689
1690
1691 static size_t ZSTD_BtFindBestMatch_selectMLS_extDict (
1692 ZSTD_CCtx* zc, /* Index table will be updated */
1693 const BYTE* ip, const BYTE* const iLimit,
1694 size_t* offsetPtr,
1695 const U32 maxNbAttempts, const U32 matchLengthSearch)
1696 {
1697 switch(matchLengthSearch)
1698 {
1699 default :
1700 case 4 : return ZSTD_BtFindBestMatch_extDict(zc, ip, iLimit, offsetPtr, maxNbAttempts, 4);
1701 case 5 : return ZSTD_BtFindBestMatch_extDict(zc, ip, iLimit, offsetPtr, maxNbAttempts, 5);
1702 case 6 : return ZSTD_BtFindBestMatch_extDict(zc, ip, iLimit, offsetPtr, maxNbAttempts, 6);
1703 }
1704 }
1705
1706
1707
1708 /* *********************************
1709 * Hash Chain
1710 ***********************************/
1711 #define NEXT_IN_CHAIN(d, mask) chainTable[(d) & mask]
1712
1713 /* Update chains up to ip (excluded)
1714 Assumption : always within prefix (i.e. not within extDict) */
1715 FORCE_INLINE
1716 U32 ZSTD_insertAndFindFirstIndex (ZSTD_CCtx* zc, const BYTE* ip, U32 mls)
1717 {
1718 U32* const hashTable = zc->hashTable;
1719 const U32 hashLog = zc->params.cParams.hashLog;
1720 U32* const chainTable = zc->chainTable;
1721 const U32 chainMask = (1 << zc->params.cParams.chainLog) - 1;
1722 const BYTE* const base = zc->base;
1723 const U32 target = (U32)(ip - base);
1724 U32 idx = zc->nextToUpdate;
1725
1726 while(idx < target) { /* catch up */
1727 size_t const h = ZSTD_hashPtr(base+idx, hashLog, mls);
1728 NEXT_IN_CHAIN(idx, chainMask) = hashTable[h];
1729 hashTable[h] = idx;
1730 idx++;
1731 }
1732
1733 zc->nextToUpdate = target;
1734 return hashTable[ZSTD_hashPtr(ip, hashLog, mls)];
1735 }
1736
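The hash chain is a single U32 array doubling as a linked list: `hashTable[h]` holds the most recent position whose hash is `h`, and `chainTable[pos & chainMask]` holds the previous position with the same hash. A minimal sketch of the insertion step, with hypothetical names (`head`, `chain`) and the same U32 typedef assumption as above:

    /* push `pos` as the new head of bucket `h` */
    static void chain_insert(U32* head, U32* chain, U32 chainMask, size_t h, U32 pos)
    {
        chain[pos & chainMask] = head[h];   /* link to previous same-hash position */
        head[h] = pos;                      /* new head */
    }

The walk in ZSTD_HcFindBestMatch_generic below stops at `minChain` because entries more than `chainSize` positions back have been overwritten by the rolling mask.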
1737
1738
1739 FORCE_INLINE /* inlining is important to hardwire a hot branch (template emulation) */
1740 size_t ZSTD_HcFindBestMatch_generic (
1741 ZSTD_CCtx* zc, /* Index table will be updated */
1742 const BYTE* const ip, const BYTE* const iLimit,
1743 size_t* offsetPtr,
1744 const U32 maxNbAttempts, const U32 mls, const U32 extDict)
1745 {
1746 U32* const chainTable = zc->chainTable;
1747 const U32 chainSize = (1 << zc->params.cParams.chainLog);
1748 const U32 chainMask = chainSize-1;
1749 const BYTE* const base = zc->base;
1750 const BYTE* const dictBase = zc->dictBase;
1751 const U32 dictLimit = zc->dictLimit;
1752 const BYTE* const prefixStart = base + dictLimit;
1753 const BYTE* const dictEnd = dictBase + dictLimit;
1754 const U32 lowLimit = zc->lowLimit;
1755 const U32 current = (U32)(ip-base);
1756 const U32 minChain = current > chainSize ? current - chainSize : 0;
1757 int nbAttempts=maxNbAttempts;
1758 size_t ml=EQUAL_READ32-1;
1759
1760 /* HC4 match finder */
1761 U32 matchIndex = ZSTD_insertAndFindFirstIndex (zc, ip, mls);
1762
1763 for ( ; (matchIndex>lowLimit) & (nbAttempts>0) ; nbAttempts--) {
1764 const BYTE* match;
1765 size_t currentMl=0;
1766 if ((!extDict) || matchIndex >= dictLimit) {
1767 match = base + matchIndex;
1768 if (match[ml] == ip[ml]) /* potentially better */
1769 currentMl = ZSTD_count(ip, match, iLimit);
1770 } else {
1771 match = dictBase + matchIndex;
1772 if (MEM_read32(match) == MEM_read32(ip)) /* assumption : matchIndex <= dictLimit-4 (by table construction) */
1773 currentMl = ZSTD_count_2segments(ip+EQUAL_READ32, match+EQUAL_READ32, iLimit, dictEnd, prefixStart) + EQUAL_READ32;
1774 }
1775
1776 /* save best solution */
1777 if (currentMl > ml) { ml = currentMl; *offsetPtr = current - matchIndex + ZSTD_REP_MOVE; if (ip+currentMl == iLimit) break; /* best possible, and avoid read overflow*/ }
1778
1779 if (matchIndex <= minChain) break;
1780 matchIndex = NEXT_IN_CHAIN(matchIndex, chainMask);
1781 }
1782
1783 return ml;
1784 }
1785
1786
1787 FORCE_INLINE size_t ZSTD_HcFindBestMatch_selectMLS (
1788 ZSTD_CCtx* zc,
1789 const BYTE* ip, const BYTE* const iLimit,
1790 size_t* offsetPtr,
1791 const U32 maxNbAttempts, const U32 matchLengthSearch)
1792 {
1793 switch(matchLengthSearch)
1794 {
1795 default :
1796 case 4 : return ZSTD_HcFindBestMatch_generic(zc, ip, iLimit, offsetPtr, maxNbAttempts, 4, 0);
1797 case 5 : return ZSTD_HcFindBestMatch_generic(zc, ip, iLimit, offsetPtr, maxNbAttempts, 5, 0);
1798 case 6 : return ZSTD_HcFindBestMatch_generic(zc, ip, iLimit, offsetPtr, maxNbAttempts, 6, 0);
1799 }
1800 }
1801
1802
1803 FORCE_INLINE size_t ZSTD_HcFindBestMatch_extDict_selectMLS (
1804 ZSTD_CCtx* zc,
1805 const BYTE* ip, const BYTE* const iLimit,
1806 size_t* offsetPtr,
1807 const U32 maxNbAttempts, const U32 matchLengthSearch)
1808 {
1809 switch(matchLengthSearch)
1810 {
1811 default :
1812 case 4 : return ZSTD_HcFindBestMatch_generic(zc, ip, iLimit, offsetPtr, maxNbAttempts, 4, 1);
1813 case 5 : return ZSTD_HcFindBestMatch_generic(zc, ip, iLimit, offsetPtr, maxNbAttempts, 5, 1);
1814 case 6 : return ZSTD_HcFindBestMatch_generic(zc, ip, iLimit, offsetPtr, maxNbAttempts, 6, 1);
1815 }
1816 }
1817
1818
1819 /* *******************************
1820 * Common parser - lazy strategy
1821 *********************************/
1822 FORCE_INLINE
1823 void ZSTD_compressBlock_lazy_generic(ZSTD_CCtx* ctx,
1824 const void* src, size_t srcSize,
1825 const U32 searchMethod, const U32 depth)
1826 {
1827 seqStore_t* seqStorePtr = &(ctx->seqStore);
1828 const BYTE* const istart = (const BYTE*)src;
1829 const BYTE* ip = istart;
1830 const BYTE* anchor = istart;
1831 const BYTE* const iend = istart + srcSize;
1832 const BYTE* const ilimit = iend - 8;
1833 const BYTE* const base = ctx->base + ctx->dictLimit;
1834
1835 U32 const maxSearches = 1 << ctx->params.cParams.searchLog;
1836 U32 const mls = ctx->params.cParams.searchLength;
1837
1838 typedef size_t (*searchMax_f)(ZSTD_CCtx* zc, const BYTE* ip, const BYTE* iLimit,
1839 size_t* offsetPtr,
1840 U32 maxNbAttempts, U32 matchLengthSearch);
1841 searchMax_f const searchMax = searchMethod ? ZSTD_BtFindBestMatch_selectMLS : ZSTD_HcFindBestMatch_selectMLS;
1842 U32 offset_1 = ctx->rep[0], offset_2 = ctx->rep[1], savedOffset=0;
1843
1844 /* init */
1845 ip += (ip==base);
1846 ctx->nextToUpdate3 = ctx->nextToUpdate;
1847 { U32 const maxRep = (U32)(ip-base);
1848 if (offset_2 > maxRep) savedOffset = offset_2, offset_2 = 0;
1849 if (offset_1 > maxRep) savedOffset = offset_1, offset_1 = 0;
1850 }
1851
1852 /* Match Loop */
1853 while (ip < ilimit) {
1854 size_t matchLength=0;
1855 size_t offset=0;
1856 const BYTE* start=ip+1;
1857
1858 /* check repCode */
1859 if ((offset_1>0) & (MEM_read32(ip+1) == MEM_read32(ip+1 - offset_1))) {
1860 /* repcode : we take it */
1861 matchLength = ZSTD_count(ip+1+EQUAL_READ32, ip+1+EQUAL_READ32-offset_1, iend) + EQUAL_READ32;
1862 if (depth==0) goto _storeSequence;
1863 }
1864
1865 /* first search (depth 0) */
1866 { size_t offsetFound = 99999999;
1867 size_t const ml2 = searchMax(ctx, ip, iend, &offsetFound, maxSearches, mls);
1868 if (ml2 > matchLength)
1869 matchLength = ml2, start = ip, offset=offsetFound;
1870 }
1871
1872 if (matchLength < EQUAL_READ32) {
1873 ip += ((ip-anchor) >> g_searchStrength) + 1; /* jump faster over incompressible sections */
1874 continue;
1875 }
1876
1877 /* let's try to find a better solution */
1878 if (depth>=1)
1879 while (ip<ilimit) {
1880 ip ++;
1881 if ((offset) && ((offset_1>0) & (MEM_read32(ip) == MEM_read32(ip - offset_1)))) {
1882 size_t const mlRep = ZSTD_count(ip+EQUAL_READ32, ip+EQUAL_READ32-offset_1, iend) + EQUAL_READ32;
1883 int const gain2 = (int)(mlRep * 3);
1884 int const gain1 = (int)(matchLength*3 - ZSTD_highbit32((U32)offset+1) + 1);
1885 if ((mlRep >= EQUAL_READ32) && (gain2 > gain1))
1886 matchLength = mlRep, offset = 0, start = ip;
1887 }
1888 { size_t offset2=99999999;
1889 size_t const ml2 = searchMax(ctx, ip, iend, &offset2, maxSearches, mls);
1890 int const gain2 = (int)(ml2*4 - ZSTD_highbit32((U32)offset2+1)); /* raw approx */
1891 int const gain1 = (int)(matchLength*4 - ZSTD_highbit32((U32)offset+1) + 4);
1892 if ((ml2 >= EQUAL_READ32) && (gain2 > gain1)) {
1893 matchLength = ml2, offset = offset2, start = ip;
1894 continue; /* search a better one */
1895 } }
1896
1897 /* let's find an even better one */
1898 if ((depth==2) && (ip<ilimit)) {
1899 ip ++;
1900 if ((offset) && ((offset_1>0) & (MEM_read32(ip) == MEM_read32(ip - offset_1)))) {
1901 size_t const ml2 = ZSTD_count(ip+EQUAL_READ32, ip+EQUAL_READ32-offset_1, iend) + EQUAL_READ32;
1902 int const gain2 = (int)(ml2 * 4);
1903 int const gain1 = (int)(matchLength*4 - ZSTD_highbit32((U32)offset+1) + 1);
1904 if ((ml2 >= EQUAL_READ32) && (gain2 > gain1))
1905 matchLength = ml2, offset = 0, start = ip;
1906 }
1907 { size_t offset2=99999999;
1908 size_t const ml2 = searchMax(ctx, ip, iend, &offset2, maxSearches, mls);
1909 int const gain2 = (int)(ml2*4 - ZSTD_highbit32((U32)offset2+1)); /* raw approx */
1910 int const gain1 = (int)(matchLength*4 - ZSTD_highbit32((U32)offset+1) + 7);
1911 if ((ml2 >= EQUAL_READ32) && (gain2 > gain1)) {
1912 matchLength = ml2, offset = offset2, start = ip;
1913 continue;
1914 } } }
1915 break; /* nothing found : store previous solution */
1916 }
1917
1918 /* catch up */
1919 if (offset) {
1920 while ((start>anchor) && (start>base+offset-ZSTD_REP_MOVE) && (start[-1] == start[-1-offset+ZSTD_REP_MOVE])) /* only search for offset within prefix */
1921 { start--; matchLength++; }
1922 offset_2 = offset_1; offset_1 = (U32)(offset - ZSTD_REP_MOVE);
1923 }
1924
1925 /* store sequence */
1926 _storeSequence:
1927 { size_t const litLength = start - anchor;
1928 ZSTD_storeSeq(seqStorePtr, litLength, anchor, (U32)offset, matchLength-MINMATCH);
1929 anchor = ip = start + matchLength;
1930 }
1931
1932 /* check immediate repcode */
1933 while ( (ip <= ilimit)
1934 && ((offset_2>0)
1935 & (MEM_read32(ip) == MEM_read32(ip - offset_2)) )) {
1936 /* store sequence */
1937 matchLength = ZSTD_count(ip+EQUAL_READ32, ip+EQUAL_READ32-offset_2, iend) + EQUAL_READ32;
1938 offset = offset_2; offset_2 = offset_1; offset_1 = (U32)offset; /* swap repcodes */
1939 ZSTD_storeSeq(seqStorePtr, 0, anchor, 0, matchLength-MINMATCH);
1940 ip += matchLength;
1941 anchor = ip;
1942 continue; /* faster when present ... (?) */
1943 } }
1944
1945 /* Save reps for next block */
1946 ctx->savedRep[0] = offset_1 ? offset_1 : savedOffset;
1947 ctx->savedRep[1] = offset_2 ? offset_2 : savedOffset;
1948
1949 /* Last Literals */
1950 { size_t const lastLLSize = iend - anchor;
1951 memcpy(seqStorePtr->lit, anchor, lastLLSize);
1952 seqStorePtr->lit += lastLLSize;
1953 }
1954 }
1955
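Worked numbers for the depth-1 arbitration above (illustrative): suppose depth 0 found length 6 at offset 1024, and one byte later the search finds length 7 at offset 64. Then

    gain1 = 6*4 - ZSTD_highbit32(1024+1) + 4 = 24 - 10 + 4 = 18
    gain2 = 7*4 - ZSTD_highbit32(  64+1)     = 28 -  6     = 22   (22 > 18 : switch)

so the parser drops the first match, pays one extra literal, and continues from the new position. The +4 bonus (+7 at depth 2) is hysteresis in favor of the match already in hand, so the parser only switches when the improvement is clear.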
1956
1957 static void ZSTD_compressBlock_btlazy2(ZSTD_CCtx* ctx, const void* src, size_t srcSize)
1958 {
1959 ZSTD_compressBlock_lazy_generic(ctx, src, srcSize, 1, 2);
1960 }
1961
1962 static void ZSTD_compressBlock_lazy2(ZSTD_CCtx* ctx, const void* src, size_t srcSize)
1963 {
1964 ZSTD_compressBlock_lazy_generic(ctx, src, srcSize, 0, 2);
1965 }
1966
1967 static void ZSTD_compressBlock_lazy(ZSTD_CCtx* ctx, const void* src, size_t srcSize)
1968 {
1969 ZSTD_compressBlock_lazy_generic(ctx, src, srcSize, 0, 1);
1970 }
1971
1972 static void ZSTD_compressBlock_greedy(ZSTD_CCtx* ctx, const void* src, size_t srcSize)
1973 {
1974 ZSTD_compressBlock_lazy_generic(ctx, src, srcSize, 0, 0);
1975 }
1976
1977
1978 FORCE_INLINE
1979 void ZSTD_compressBlock_lazy_extDict_generic(ZSTD_CCtx* ctx,
1980 const void* src, size_t srcSize,
1981 const U32 searchMethod, const U32 depth)
1982 {
1983 seqStore_t* seqStorePtr = &(ctx->seqStore);
1984 const BYTE* const istart = (const BYTE*)src;
1985 const BYTE* ip = istart;
1986 const BYTE* anchor = istart;
1987 const BYTE* const iend = istart + srcSize;
1988 const BYTE* const ilimit = iend - 8;
1989 const BYTE* const base = ctx->base;
1990 const U32 dictLimit = ctx->dictLimit;
1991 const U32 lowestIndex = ctx->lowLimit;
1992 const BYTE* const prefixStart = base + dictLimit;
1993 const BYTE* const dictBase = ctx->dictBase;
1994 const BYTE* const dictEnd = dictBase + dictLimit;
1995 const BYTE* const dictStart = dictBase + ctx->lowLimit;
1996
1997 const U32 maxSearches = 1 << ctx->params.cParams.searchLog;
1998 const U32 mls = ctx->params.cParams.searchLength;
1999
2000 typedef size_t (*searchMax_f)(ZSTD_CCtx* zc, const BYTE* ip, const BYTE* iLimit,
2001 size_t* offsetPtr,
2002 U32 maxNbAttempts, U32 matchLengthSearch);
2003 searchMax_f searchMax = searchMethod ? ZSTD_BtFindBestMatch_selectMLS_extDict : ZSTD_HcFindBestMatch_extDict_selectMLS;
2004
2005 U32 offset_1 = ctx->rep[0], offset_2 = ctx->rep[1];
2006
2007 /* init */
2008 ctx->nextToUpdate3 = ctx->nextToUpdate;
2009 ip += (ip == prefixStart);
2010
2011 /* Match Loop */
2012 while (ip < ilimit) {
2013 size_t matchLength=0;
2014 size_t offset=0;
2015 const BYTE* start=ip+1;
2016 U32 current = (U32)(ip-base);
2017
2018 /* check repCode */
2019 { const U32 repIndex = (U32)(current+1 - offset_1);
2020 const BYTE* const repBase = repIndex < dictLimit ? dictBase : base;
2021 const BYTE* const repMatch = repBase + repIndex;
2022 if (((U32)((dictLimit-1) - repIndex) >= 3) & (repIndex > lowestIndex)) /* intentional overflow */
2023 if (MEM_read32(ip+1) == MEM_read32(repMatch)) {
2024 /* repcode detected : we should take it */
2025 const BYTE* const repEnd = repIndex < dictLimit ? dictEnd : iend;
2026 matchLength = ZSTD_count_2segments(ip+1+EQUAL_READ32, repMatch+EQUAL_READ32, iend, repEnd, prefixStart) + EQUAL_READ32;
2027 if (depth==0) goto _storeSequence;
2028 } }
2029
2030 /* first search (depth 0) */
2031 { size_t offsetFound = 99999999;
2032 size_t const ml2 = searchMax(ctx, ip, iend, &offsetFound, maxSearches, mls);
2033 if (ml2 > matchLength)
2034 matchLength = ml2, start = ip, offset=offsetFound;
2035 }
2036
2037 if (matchLength < EQUAL_READ32) {
2038 ip += ((ip-anchor) >> g_searchStrength) + 1; /* jump faster over incompressible sections */
2039 continue;
2040 }
2041
2042 /* let's try to find a better solution */
2043 if (depth>=1)
2044 while (ip<ilimit) {
2045 ip ++;
2046 current++;
2047 /* check repCode */
2048 if (offset) {
2049 const U32 repIndex = (U32)(current - offset_1);
2050 const BYTE* const repBase = repIndex < dictLimit ? dictBase : base;
2051 const BYTE* const repMatch = repBase + repIndex;
2052 if (((U32)((dictLimit-1) - repIndex) >= 3) & (repIndex > lowestIndex)) /* intentional overflow */
2053 if (MEM_read32(ip) == MEM_read32(repMatch)) {
2054 /* repcode detected */
2055 const BYTE* const repEnd = repIndex < dictLimit ? dictEnd : iend;
2056 size_t const repLength = ZSTD_count_2segments(ip+EQUAL_READ32, repMatch+EQUAL_READ32, iend, repEnd, prefixStart) + EQUAL_READ32;
2057 int const gain2 = (int)(repLength * 3);
2058 int const gain1 = (int)(matchLength*3 - ZSTD_highbit32((U32)offset+1) + 1);
2059 if ((repLength >= EQUAL_READ32) && (gain2 > gain1))
2060 matchLength = repLength, offset = 0, start = ip;
2061 } }
2062
2063 /* search match, depth 1 */
2064 { size_t offset2=99999999;
2065 size_t const ml2 = searchMax(ctx, ip, iend, &offset2, maxSearches, mls);
2066 int const gain2 = (int)(ml2*4 - ZSTD_highbit32((U32)offset2+1)); /* raw approx */
2067 int const gain1 = (int)(matchLength*4 - ZSTD_highbit32((U32)offset+1) + 4);
2068 if ((ml2 >= EQUAL_READ32) && (gain2 > gain1)) {
2069 matchLength = ml2, offset = offset2, start = ip;
2070 continue; /* search a better one */
2071 } }
2072
2073 /* let's find an even better one */
2074 if ((depth==2) && (ip<ilimit)) {
2075 ip ++;
2076 current++;
2077 /* check repCode */
2078 if (offset) {
2079 const U32 repIndex = (U32)(current - offset_1);
2080 const BYTE* const repBase = repIndex < dictLimit ? dictBase : base;
2081 const BYTE* const repMatch = repBase + repIndex;
2082 if (((U32)((dictLimit-1) - repIndex) >= 3) & (repIndex > lowestIndex)) /* intentional overflow */
2083 if (MEM_read32(ip) == MEM_read32(repMatch)) {
2084 /* repcode detected */
2085 const BYTE* const repEnd = repIndex < dictLimit ? dictEnd : iend;
2086 size_t repLength = ZSTD_count_2segments(ip+EQUAL_READ32, repMatch+EQUAL_READ32, iend, repEnd, prefixStart) + EQUAL_READ32;
2087 int gain2 = (int)(repLength * 4);
2088 int gain1 = (int)(matchLength*4 - ZSTD_highbit32((U32)offset+1) + 1);
2089 if ((repLength >= EQUAL_READ32) && (gain2 > gain1))
2090 matchLength = repLength, offset = 0, start = ip;
2091 } }
2092
2093 /* search match, depth 2 */
2094 { size_t offset2=99999999;
2095 size_t const ml2 = searchMax(ctx, ip, iend, &offset2, maxSearches, mls);
2096 int const gain2 = (int)(ml2*4 - ZSTD_highbit32((U32)offset2+1)); /* raw approx */
2097 int const gain1 = (int)(matchLength*4 - ZSTD_highbit32((U32)offset+1) + 7);
2098 if ((ml2 >= EQUAL_READ32) && (gain2 > gain1)) {
2099 matchLength = ml2, offset = offset2, start = ip;
2100 continue;
2101 } } }
2102 break; /* nothing found : store previous solution */
2103 }
2104
2105 /* catch up */
2106 if (offset) {
2107 U32 const matchIndex = (U32)((start-base) - (offset - ZSTD_REP_MOVE));
2108 const BYTE* match = (matchIndex < dictLimit) ? dictBase + matchIndex : base + matchIndex;
2109 const BYTE* const mStart = (matchIndex < dictLimit) ? dictStart : prefixStart;
2110 while ((start>anchor) && (match>mStart) && (start[-1] == match[-1])) { start--; match--; matchLength++; } /* catch up */
2111 offset_2 = offset_1; offset_1 = (U32)(offset - ZSTD_REP_MOVE);
2112 }
2113
2114 /* store sequence */
2115 _storeSequence:
2116 { size_t const litLength = start - anchor;
2117 ZSTD_storeSeq(seqStorePtr, litLength, anchor, (U32)offset, matchLength-MINMATCH);
2118 anchor = ip = start + matchLength;
2119 }
2120
2121 /* check immediate repcode */
2122 while (ip <= ilimit) {
2123 const U32 repIndex = (U32)((ip-base) - offset_2);
2124 const BYTE* const repBase = repIndex < dictLimit ? dictBase : base;
2125 const BYTE* const repMatch = repBase + repIndex;
2126 if (((U32)((dictLimit-1) - repIndex) >= 3) & (repIndex > lowestIndex)) /* intentional overflow */
2127 if (MEM_read32(ip) == MEM_read32(repMatch)) {
2128 /* repcode detected : we should take it */
2129 const BYTE* const repEnd = repIndex < dictLimit ? dictEnd : iend;
2130 matchLength = ZSTD_count_2segments(ip+EQUAL_READ32, repMatch+EQUAL_READ32, iend, repEnd, prefixStart) + EQUAL_READ32;
2131 offset = offset_2; offset_2 = offset_1; offset_1 = (U32)offset; /* swap offset history */
2132 ZSTD_storeSeq(seqStorePtr, 0, anchor, 0, matchLength-MINMATCH);
2133 ip += matchLength;
2134 anchor = ip;
2135 continue; /* faster when present ... (?) */
2136 }
2137 break;
2138 } }
2139
2140 /* Save reps for next block */
2141 ctx->savedRep[0] = offset_1; ctx->savedRep[1] = offset_2;
2142
2143 /* Last Literals */
2144 { size_t const lastLLSize = iend - anchor;
2145 memcpy(seqStorePtr->lit, anchor, lastLLSize);
2146 seqStorePtr->lit += lastLLSize;
2147 }
2148 }
2149
2150
2151 void ZSTD_compressBlock_greedy_extDict(ZSTD_CCtx* ctx, const void* src, size_t srcSize)
2152 {
2153 ZSTD_compressBlock_lazy_extDict_generic(ctx, src, srcSize, 0, 0);
2154 }
2155
2156 static void ZSTD_compressBlock_lazy_extDict(ZSTD_CCtx* ctx, const void* src, size_t srcSize)
2157 {
2158 ZSTD_compressBlock_lazy_extDict_generic(ctx, src, srcSize, 0, 1);
2159 }
2160
2161 static void ZSTD_compressBlock_lazy2_extDict(ZSTD_CCtx* ctx, const void* src, size_t srcSize)
2162 {
2163 ZSTD_compressBlock_lazy_extDict_generic(ctx, src, srcSize, 0, 2);
2164 }
2165
2166 static void ZSTD_compressBlock_btlazy2_extDict(ZSTD_CCtx* ctx, const void* src, size_t srcSize)
2167 {
2168 ZSTD_compressBlock_lazy_extDict_generic(ctx, src, srcSize, 1, 2);
2169 }
2170
2171
2172 /* The optimal parser */
2173 #include "zstd_opt.h"
2174
2175 static void ZSTD_compressBlock_btopt(ZSTD_CCtx* ctx, const void* src, size_t srcSize)
2176 {
2177 #ifdef ZSTD_OPT_H_91842398743
2178 ZSTD_compressBlock_opt_generic(ctx, src, srcSize, 0);
2179 #else
2180 (void)ctx; (void)src; (void)srcSize;
2181 return;
2182 #endif
2183 }
2184
2185 static void ZSTD_compressBlock_btopt2(ZSTD_CCtx* ctx, const void* src, size_t srcSize)
2186 {
2187 #ifdef ZSTD_OPT_H_91842398743
2188 ZSTD_compressBlock_opt_generic(ctx, src, srcSize, 1);
2189 #else
2190 (void)ctx; (void)src; (void)srcSize;
2191 return;
2192 #endif
2193 }
2194
2195 static void ZSTD_compressBlock_btopt_extDict(ZSTD_CCtx* ctx, const void* src, size_t srcSize)
2196 {
2197 #ifdef ZSTD_OPT_H_91842398743
2198 ZSTD_compressBlock_opt_extDict_generic(ctx, src, srcSize, 0);
2199 #else
2200 (void)ctx; (void)src; (void)srcSize;
2201 return;
2202 #endif
2203 }
2204
2205 static void ZSTD_compressBlock_btopt2_extDict(ZSTD_CCtx* ctx, const void* src, size_t srcSize)
2206 {
2207 #ifdef ZSTD_OPT_H_91842398743
2208 ZSTD_compressBlock_opt_extDict_generic(ctx, src, srcSize, 1);
2209 #else
2210 (void)ctx; (void)src; (void)srcSize;
2211 return;
2212 #endif
2213 }
2214
2215
2216 typedef void (*ZSTD_blockCompressor) (ZSTD_CCtx* ctx, const void* src, size_t srcSize);
2217
2218 static ZSTD_blockCompressor ZSTD_selectBlockCompressor(ZSTD_strategy strat, int extDict)
2219 {
2220 static const ZSTD_blockCompressor blockCompressor[2][8] = {
2221 { ZSTD_compressBlock_fast, ZSTD_compressBlock_doubleFast, ZSTD_compressBlock_greedy, ZSTD_compressBlock_lazy, ZSTD_compressBlock_lazy2, ZSTD_compressBlock_btlazy2, ZSTD_compressBlock_btopt, ZSTD_compressBlock_btopt2 },
2222 { ZSTD_compressBlock_fast_extDict, ZSTD_compressBlock_doubleFast_extDict, ZSTD_compressBlock_greedy_extDict, ZSTD_compressBlock_lazy_extDict, ZSTD_compressBlock_lazy2_extDict, ZSTD_compressBlock_btlazy2_extDict, ZSTD_compressBlock_btopt_extDict, ZSTD_compressBlock_btopt2_extDict }
2223 };
2224
2225 return blockCompressor[extDict][(U32)strat];
2226 }
2227
2228
2229 static size_t ZSTD_compressBlock_internal(ZSTD_CCtx* zc, void* dst, size_t dstCapacity, const void* src, size_t srcSize)
2230 {
2231 ZSTD_blockCompressor const blockCompressor = ZSTD_selectBlockCompressor(zc->params.cParams.strategy, zc->lowLimit < zc->dictLimit);
2232 const BYTE* const base = zc->base;
2233 const BYTE* const istart = (const BYTE*)src;
2234 const U32 current = (U32)(istart-base);
2235 if (srcSize < MIN_CBLOCK_SIZE+ZSTD_blockHeaderSize+1) return 0; /* don't even attempt compression below a certain srcSize */
2236 ZSTD_resetSeqStore(&(zc->seqStore));
2237 if (current > zc->nextToUpdate + 384)
2238 zc->nextToUpdate = current - MIN(192, (U32)(current - zc->nextToUpdate - 384)); /* the tree is not updated while very long rep matches are found ; limit the catch-up cost */
2239 blockCompressor(zc, src, srcSize);
2240 return ZSTD_compressSequences(zc, dst, dstCapacity, srcSize);
2241 }
2242
2243
2244 /*! ZSTD_compress_generic() :
2245 * Compress a chunk of data into one or multiple blocks.
2246 * All blocks will be terminated, all input will be consumed.
2247 * Function will issue an error if there is not enough `dstCapacity` to hold the compressed content.
2248 * The frame is assumed to be already started (header already produced).
2249 * @return : compressed size, or an error code
2250 */
2251 static size_t ZSTD_compress_generic (ZSTD_CCtx* cctx,
2252 void* dst, size_t dstCapacity,
2253 const void* src, size_t srcSize,
2254 U32 lastFrameChunk)
2255 {
2256 size_t blockSize = cctx->blockSize;
2257 size_t remaining = srcSize;
2258 const BYTE* ip = (const BYTE*)src;
2259 BYTE* const ostart = (BYTE*)dst;
2260 BYTE* op = ostart;
2261 U32 const maxDist = 1 << cctx->params.cParams.windowLog;
2262
2263 if (cctx->params.fParams.checksumFlag && srcSize)
2264 XXH64_update(&cctx->xxhState, src, srcSize);
2265
2266 while (remaining) {
2267 U32 const lastBlock = lastFrameChunk & (blockSize >= remaining);
2268 size_t cSize;
2269
2270 if (dstCapacity < ZSTD_blockHeaderSize + MIN_CBLOCK_SIZE) return ERROR(dstSize_tooSmall); /* not enough space to store compressed block */
2271 if (remaining < blockSize) blockSize = remaining;
2272
2273 /* preemptive overflow correction */
2274 if (cctx->lowLimit > (1<<30)) {
2275 U32 const btplus = (cctx->params.cParams.strategy == ZSTD_btlazy2) | (cctx->params.cParams.strategy == ZSTD_btopt) | (cctx->params.cParams.strategy == ZSTD_btopt2);
2276 U32 const chainMask = (1 << (cctx->params.cParams.chainLog - btplus)) - 1;
2277 U32 const supLog = MAX(cctx->params.cParams.chainLog, 17 /* blockSize */);
2278 U32 const newLowLimit = (cctx->lowLimit & chainMask) + (1 << supLog); /* preserve position % chainSize, ensure current-repcode doesn't underflow */
2279 U32 const correction = cctx->lowLimit - newLowLimit;
2280 ZSTD_reduceIndex(cctx, correction);
2281 cctx->base += correction;
2282 cctx->dictBase += correction;
2283 cctx->lowLimit = newLowLimit;
2284 cctx->dictLimit -= correction;
2285 if (cctx->nextToUpdate < correction) cctx->nextToUpdate = 0;
2286 else cctx->nextToUpdate -= correction;
2287 }
2288
2289 if ((U32)(ip+blockSize - cctx->base) > cctx->loadedDictEnd + maxDist) {
2290 /* enforce maxDist */
2291 U32 const newLowLimit = (U32)(ip+blockSize - cctx->base) - maxDist;
2292 if (cctx->lowLimit < newLowLimit) cctx->lowLimit = newLowLimit;
2293 if (cctx->dictLimit < cctx->lowLimit) cctx->dictLimit = cctx->lowLimit;
2294 }
2295
2296 cSize = ZSTD_compressBlock_internal(cctx, op+ZSTD_blockHeaderSize, dstCapacity-ZSTD_blockHeaderSize, ip, blockSize);
2297 if (ZSTD_isError(cSize)) return cSize;
2298
2299 if (cSize == 0) { /* block is not compressible */
2300 U32 const cBlockHeader24 = lastBlock + (((U32)bt_raw)<<1) + (U32)(blockSize << 3);
2301 if (blockSize + ZSTD_blockHeaderSize > dstCapacity) return ERROR(dstSize_tooSmall);
2302 MEM_writeLE32(op, cBlockHeader24); /* no problem : the 4th byte will be overwritten by the memcpy below */
2303 memcpy(op + ZSTD_blockHeaderSize, ip, blockSize);
2304 cSize = ZSTD_blockHeaderSize+blockSize;
2305 } else {
2306 U32 const cBlockHeader24 = lastBlock + (((U32)bt_compressed)<<1) + (U32)(cSize << 3);
2307 MEM_writeLE24(op, cBlockHeader24);
2308 cSize += ZSTD_blockHeaderSize;
2309 }
2310
2311 remaining -= blockSize;
2312 dstCapacity -= cSize;
2313 ip += blockSize;
2314 op += cSize;
2315 }
2316
2317 if (lastFrameChunk && (op>ostart)) cctx->stage = ZSTDcs_ending;
2318 return op-ostart;
2319 }
2320
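The 3-byte block header built above packs, from bit 0 upward: the last-block flag (1 bit), the block type (2 bits), then the block size. Worked example, using the frame format's encoding of Raw as 0: a final uncompressed block of 1000 bytes gives

    header = 1 + (0 << 1) + (1000 << 3) = 8001 = 0x001F41

written little-endian as the byte sequence 41 1F 00.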
2321
2322 static size_t ZSTD_writeFrameHeader(void* dst, size_t dstCapacity,
2323 ZSTD_parameters params, U64 pledgedSrcSize, U32 dictID)
2324 { BYTE* const op = (BYTE*)dst;
2325 U32 const dictIDSizeCode = (dictID>0) + (dictID>=256) + (dictID>=65536); /* 0-3 */
2326 U32 const checksumFlag = params.fParams.checksumFlag>0;
2327 U32 const windowSize = 1U << params.cParams.windowLog;
2328 U32 const singleSegment = params.fParams.contentSizeFlag && (windowSize > (pledgedSrcSize-1));
2329 BYTE const windowLogByte = (BYTE)((params.cParams.windowLog - ZSTD_WINDOWLOG_ABSOLUTEMIN) << 3);
2330 U32 const fcsCode = params.fParams.contentSizeFlag ?
2331 (pledgedSrcSize>=256) + (pledgedSrcSize>=65536+256) + (pledgedSrcSize>=0xFFFFFFFFU) : /* 0-3 */
2332 0;
2333 BYTE const frameHeaderDescriptionByte = (BYTE)(dictIDSizeCode + (checksumFlag<<2) + (singleSegment<<5) + (fcsCode<<6) );
2334 size_t pos;
2335
2336 if (dstCapacity < ZSTD_frameHeaderSize_max) return ERROR(dstSize_tooSmall);
2337
2338 MEM_writeLE32(dst, ZSTD_MAGICNUMBER);
2339 op[4] = frameHeaderDescriptionByte; pos=5;
2340 if (!singleSegment) op[pos++] = windowLogByte;
2341 switch(dictIDSizeCode)
2342 {
2343 default: /* impossible */
2344 case 0 : break;
2345 case 1 : op[pos] = (BYTE)(dictID); pos++; break;
2346 case 2 : MEM_writeLE16(op+pos, (U16)dictID); pos+=2; break;
2347 case 3 : MEM_writeLE32(op+pos, dictID); pos+=4; break;
2348 }
2349 switch(fcsCode)
2350 {
2351 default: /* impossible */
2352 case 0 : if (singleSegment) op[pos++] = (BYTE)(pledgedSrcSize); break;
2353 case 1 : MEM_writeLE16(op+pos, (U16)(pledgedSrcSize-256)); pos+=2; break;
2354 case 2 : MEM_writeLE32(op+pos, (U32)(pledgedSrcSize)); pos+=4; break;
2355 case 3 : MEM_writeLE64(op+pos, (U64)(pledgedSrcSize)); pos+=8; break;
2356 }
2357 return pos;
2358 }
2359
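The frame-header descriptor byte assembled above packs `dictIDSizeCode` in bits 0-1, `checksumFlag` in bit 2, `singleSegment` in bit 5 and `fcsCode` in bits 6-7. Worked example with illustrative parameters (no dictionary, checksum enabled, unknown content size, windowLog 22, and assuming ZSTD_WINDOWLOG_ABSOLUTEMIN == 10 as in this version):

    descriptor    = 0 + (1<<2) + (0<<5) + (0<<6)   = 0x04
    windowLogByte = (22 - 10) << 3                 = 0x60

i.e. a 6-byte header: the 4 magic bytes, then 04, then 60.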
2360
2361 static size_t ZSTD_compressContinue_internal (ZSTD_CCtx* cctx,
2362 void* dst, size_t dstCapacity,
2363 const void* src, size_t srcSize,
2364 U32 frame, U32 lastFrameChunk)
2365 {
2366 const BYTE* const ip = (const BYTE*) src;
2367 size_t fhSize = 0;
2368
2369 if (cctx->stage==ZSTDcs_created) return ERROR(stage_wrong); /* missing init (ZSTD_compressBegin) */
2370
2371 if (frame && (cctx->stage==ZSTDcs_init)) {
2372 fhSize = ZSTD_writeFrameHeader(dst, dstCapacity, cctx->params, cctx->frameContentSize, cctx->dictID);
2373 if (ZSTD_isError(fhSize)) return fhSize;
2374 dstCapacity -= fhSize;
2375 dst = (char*)dst + fhSize;
2376 cctx->stage = ZSTDcs_ongoing;
2377 }
2378
2379 /* Check if blocks follow each other */
2380 if (src != cctx->nextSrc) {
2381 /* not contiguous */
2382 ptrdiff_t const delta = cctx->nextSrc - ip;
2383 cctx->lowLimit = cctx->dictLimit;
2384 cctx->dictLimit = (U32)(cctx->nextSrc - cctx->base);
2385 cctx->dictBase = cctx->base;
2386 cctx->base -= delta;
2387 cctx->nextToUpdate = cctx->dictLimit;
2388 if (cctx->dictLimit - cctx->lowLimit < HASH_READ_SIZE) cctx->lowLimit = cctx->dictLimit; /* too small extDict */
2389 }
2390
2391 /* if input and dictionary overlap : reduce dictionary (area presumed modified by input) */
2392 if ((ip+srcSize > cctx->dictBase + cctx->lowLimit) & (ip < cctx->dictBase + cctx->dictLimit)) {
2393 ptrdiff_t const highInputIdx = (ip + srcSize) - cctx->dictBase;
2394 U32 const lowLimitMax = (highInputIdx > (ptrdiff_t)cctx->dictLimit) ? cctx->dictLimit : (U32)highInputIdx;
2395 cctx->lowLimit = lowLimitMax;
2396 }
2397
2398 cctx->nextSrc = ip + srcSize;
2399
2400 { size_t const cSize = frame ?
2401 ZSTD_compress_generic (cctx, dst, dstCapacity, src, srcSize, lastFrameChunk) :
2402 ZSTD_compressBlock_internal (cctx, dst, dstCapacity, src, srcSize);
2403 if (ZSTD_isError(cSize)) return cSize;
2404 return cSize + fhSize;
2405 }
2406 }
2407
2408
2409 size_t ZSTD_compressContinue (ZSTD_CCtx* cctx,
2410 void* dst, size_t dstCapacity,
2411 const void* src, size_t srcSize)
2412 {
2413 return ZSTD_compressContinue_internal(cctx, dst, dstCapacity, src, srcSize, 1, 0);
2414 }
2415
2416
2417 size_t ZSTD_getBlockSizeMax(ZSTD_CCtx* cctx)
2418 {
2419 return MIN (ZSTD_BLOCKSIZE_ABSOLUTEMAX, 1 << cctx->params.cParams.windowLog);
2420 }
2421
2422 size_t ZSTD_compressBlock(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize)
2423 {
2424 size_t const blockSizeMax = ZSTD_getBlockSizeMax(cctx);
2425 if (srcSize > blockSizeMax) return ERROR(srcSize_wrong);
2426 return ZSTD_compressContinue_internal(cctx, dst, dstCapacity, src, srcSize, 0, 0);
2427 }
2428
2429
2430 static size_t ZSTD_loadDictionaryContent(ZSTD_CCtx* zc, const void* src, size_t srcSize)
2431 {
2432 const BYTE* const ip = (const BYTE*) src;
2433 const BYTE* const iend = ip + srcSize;
2434
2435 /* input becomes current prefix */
2436 zc->lowLimit = zc->dictLimit;
2437 zc->dictLimit = (U32)(zc->nextSrc - zc->base);
2438 zc->dictBase = zc->base;
2439 zc->base += ip - zc->nextSrc;
2440 zc->nextToUpdate = zc->dictLimit;
2441 zc->loadedDictEnd = (U32)(iend - zc->base);
2442
2443 zc->nextSrc = iend;
2444 if (srcSize <= HASH_READ_SIZE) return 0;
2445
2446 switch(zc->params.cParams.strategy)
2447 {
2448 case ZSTD_fast:
2449 ZSTD_fillHashTable (zc, iend, zc->params.cParams.searchLength);
2450 break;
2451
2452 case ZSTD_dfast:
2453 ZSTD_fillDoubleHashTable (zc, iend, zc->params.cParams.searchLength);
2454 break;
2455
2456 case ZSTD_greedy:
2457 case ZSTD_lazy:
2458 case ZSTD_lazy2:
2459 ZSTD_insertAndFindFirstIndex (zc, iend-HASH_READ_SIZE, zc->params.cParams.searchLength);
2460 break;
2461
2462 case ZSTD_btlazy2:
2463 case ZSTD_btopt:
2464 case ZSTD_btopt2:
2465 ZSTD_updateTree(zc, iend-HASH_READ_SIZE, iend, 1 << zc->params.cParams.searchLog, zc->params.cParams.searchLength);
2466 break;
2467
2468 default:
2469 return ERROR(GENERIC); /* strategy doesn't exist; impossible */
2470 }
2471
2472 zc->nextToUpdate = zc->loadedDictEnd;
2473 return 0;
2474 }
2475
2476
2477 /* Dictionaries that assign zero probability to symbols that actually show up cause problems
2478 during FSE encoding. Refuse dictionaries that assign zero probability to symbols
2479 that we may encounter during compression.
2480 NOTE: This behavior is not standard and could be improved in the future. */
2481 static size_t ZSTD_checkDictNCount(short* normalizedCounter, unsigned dictMaxSymbolValue, unsigned maxSymbolValue) {
2482 U32 s;
2483 if (dictMaxSymbolValue < maxSymbolValue) return ERROR(dictionary_corrupted);
2484 for (s = 0; s <= maxSymbolValue; ++s) {
2485 if (normalizedCounter[s] == 0) return ERROR(dictionary_corrupted);
2486 }
2487 return 0;
2488 }
2489
2490
2491 /* Dictionary format :
2492 Magic == ZSTD_DICT_MAGIC (4 bytes)
2493 HUF_writeCTable(256)
2494 FSE_writeNCount(off)
2495 FSE_writeNCount(ml)
2496 FSE_writeNCount(ll)
2497 RepOffsets
2498 Dictionary content
2499 */
2500 /*! ZSTD_loadDictEntropyStats() :
2501 @return : size read from dictionary
2502 note : magic number is assumed to have been checked already */
2503 static size_t ZSTD_loadDictEntropyStats(ZSTD_CCtx* cctx, const void* dict, size_t dictSize)
2504 {
2505 const BYTE* dictPtr = (const BYTE*)dict;
2506 const BYTE* const dictEnd = dictPtr + dictSize;
2507 short offcodeNCount[MaxOff+1];
2508 unsigned offcodeMaxValue = MaxOff;
2509
2510 { size_t const hufHeaderSize = HUF_readCTable(cctx->hufTable, 255, dict, dictSize);
2511 if (HUF_isError(hufHeaderSize)) return ERROR(dictionary_corrupted);
2512 dictPtr += hufHeaderSize;
2513 }
2514
2515 { unsigned offcodeLog;
2516 size_t const offcodeHeaderSize = FSE_readNCount(offcodeNCount, &offcodeMaxValue, &offcodeLog, dictPtr, dictEnd-dictPtr);
2517 if (FSE_isError(offcodeHeaderSize)) return ERROR(dictionary_corrupted);
2518 if (offcodeLog > OffFSELog) return ERROR(dictionary_corrupted);
2519 /* Defer checking offcodeMaxValue because we need to know the size of the dictionary content */
2520 CHECK_E (FSE_buildCTable(cctx->offcodeCTable, offcodeNCount, offcodeMaxValue, offcodeLog), dictionary_corrupted);
2521 dictPtr += offcodeHeaderSize;
2522 }
2523
2524 { short matchlengthNCount[MaxML+1];
2525 unsigned matchlengthMaxValue = MaxML, matchlengthLog;
2526 size_t const matchlengthHeaderSize = FSE_readNCount(matchlengthNCount, &matchlengthMaxValue, &matchlengthLog, dictPtr, dictEnd-dictPtr);
2527 if (FSE_isError(matchlengthHeaderSize)) return ERROR(dictionary_corrupted);
2528 if (matchlengthLog > MLFSELog) return ERROR(dictionary_corrupted);
2529 /* Every match length code must have non-zero probability */
2530 CHECK_F (ZSTD_checkDictNCount(matchlengthNCount, matchlengthMaxValue, MaxML));
2531 CHECK_E (FSE_buildCTable(cctx->matchlengthCTable, matchlengthNCount, matchlengthMaxValue, matchlengthLog), dictionary_corrupted);
2532 dictPtr += matchlengthHeaderSize;
2533 }
2534
2535 { short litlengthNCount[MaxLL+1];
2536 unsigned litlengthMaxValue = MaxLL, litlengthLog;
2537 size_t const litlengthHeaderSize = FSE_readNCount(litlengthNCount, &litlengthMaxValue, &litlengthLog, dictPtr, dictEnd-dictPtr);
2538 if (FSE_isError(litlengthHeaderSize)) return ERROR(dictionary_corrupted);
2539 if (litlengthLog > LLFSELog) return ERROR(dictionary_corrupted);
2540 /* Every literal length code must have non-zero probability */
2541 CHECK_F (ZSTD_checkDictNCount(litlengthNCount, litlengthMaxValue, MaxLL));
2542 CHECK_E(FSE_buildCTable(cctx->litlengthCTable, litlengthNCount, litlengthMaxValue, litlengthLog), dictionary_corrupted);
2543 dictPtr += litlengthHeaderSize;
2544 }
2545
2546 if (dictPtr+12 > dictEnd) return ERROR(dictionary_corrupted);
2547 cctx->rep[0] = MEM_readLE32(dictPtr+0); if (cctx->rep[0] >= dictSize) return ERROR(dictionary_corrupted);
2548 cctx->rep[1] = MEM_readLE32(dictPtr+4); if (cctx->rep[1] >= dictSize) return ERROR(dictionary_corrupted);
2549 cctx->rep[2] = MEM_readLE32(dictPtr+8); if (cctx->rep[2] >= dictSize) return ERROR(dictionary_corrupted);
2550 dictPtr += 12;
2551
2552 { U32 offcodeMax = MaxOff;
2553 if ((size_t)(dictEnd - dictPtr) <= ((U32)-1) - 128 KB) {
2554 U32 const maxOffset = (U32)(dictEnd - dictPtr) + 128 KB; /* The maximum offset that must be supported */
2555 /* Calculate minimum offset code required to represent maxOffset */
2556 offcodeMax = ZSTD_highbit32(maxOffset);
2557 }
2558 /* Every possible supported offset <= dictContentSize + 128 KB must be representable */
2559 CHECK_F (ZSTD_checkDictNCount(offcodeNCount, offcodeMaxValue, MIN(offcodeMax, MaxOff)));
2560 }
2561
2562 cctx->flagStaticTables = 1;
2563 return dictPtr - (const BYTE*)dict;
2564 }
2565
2566 /** ZSTD_compress_insertDictionary() :
2567 * @return : 0, or an error code */
2568 static size_t ZSTD_compress_insertDictionary(ZSTD_CCtx* zc, const void* dict, size_t dictSize)
2569 {
2570 if ((dict==NULL) || (dictSize<=8)) return 0;
2571
2572 /* default : dict is pure content */
2573 if (MEM_readLE32(dict) != ZSTD_DICT_MAGIC) return ZSTD_loadDictionaryContent(zc, dict, dictSize);
2574 zc->dictID = zc->params.fParams.noDictIDFlag ? 0 : MEM_readLE32((const char*)dict+4);
2575
2576 /* known magic number : dict is parsed for entropy stats and content */
2577 { size_t const loadError = ZSTD_loadDictEntropyStats(zc, (const char*)dict+8 /* skip dictHeader */, dictSize-8);
2578 size_t const eSize = loadError + 8;
2579 if (ZSTD_isError(loadError)) return loadError;
2580 return ZSTD_loadDictionaryContent(zc, (const char*)dict+eSize, dictSize-eSize);
2581 }
2582 }
2583
2584
2585 /*! ZSTD_compressBegin_internal() :
2586 * @return : 0, or an error code */
2587 static size_t ZSTD_compressBegin_internal(ZSTD_CCtx* cctx,
2588 const void* dict, size_t dictSize,
2589 ZSTD_parameters params, U64 pledgedSrcSize)
2590 {
2591 ZSTD_compResetPolicy_e const crp = dictSize ? ZSTDcrp_fullReset : ZSTDcrp_continue;
2592 CHECK_F(ZSTD_resetCCtx_advanced(cctx, params, pledgedSrcSize, crp));
2593 return ZSTD_compress_insertDictionary(cctx, dict, dictSize);
2594 }
2595
2596
2597 /*! ZSTD_compressBegin_advanced() :
2598 * @return : 0, or an error code */
2599 size_t ZSTD_compressBegin_advanced(ZSTD_CCtx* cctx,
2600 const void* dict, size_t dictSize,
2601 ZSTD_parameters params, unsigned long long pledgedSrcSize)
2602 {
2603 /* compression parameters verification and optimization */
2604 CHECK_F(ZSTD_checkCParams(params.cParams));
2605 return ZSTD_compressBegin_internal(cctx, dict, dictSize, params, pledgedSrcSize);
2606 }
2607
2608
2609 size_t ZSTD_compressBegin_usingDict(ZSTD_CCtx* cctx, const void* dict, size_t dictSize, int compressionLevel)
2610 {
2611 ZSTD_parameters const params = ZSTD_getParams(compressionLevel, 0, dictSize);
2612 return ZSTD_compressBegin_internal(cctx, dict, dictSize, params, 0);
2613 }
2614
2615
2616 size_t ZSTD_compressBegin(ZSTD_CCtx* zc, int compressionLevel)
2617 {
2618 return ZSTD_compressBegin_usingDict(zc, NULL, 0, compressionLevel);
2619 }
2620
2621
2622 /*! ZSTD_writeEpilogue() :
2623 * Ends a frame.
2624 * @return : nb of bytes written into dst (or an error code) */
2625 static size_t ZSTD_writeEpilogue(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity)
2626 {
2627 BYTE* const ostart = (BYTE*)dst;
2628 BYTE* op = ostart;
2629 size_t fhSize = 0;
2630
2631 if (cctx->stage == ZSTDcs_created) return ERROR(stage_wrong); /* init missing */
2632
2633 /* special case : empty frame */
2634 if (cctx->stage == ZSTDcs_init) {
2635 fhSize = ZSTD_writeFrameHeader(dst, dstCapacity, cctx->params, 0, 0);
2636 if (ZSTD_isError(fhSize)) return fhSize;
2637 dstCapacity -= fhSize;
2638 op += fhSize;
2639 cctx->stage = ZSTDcs_ongoing;
2640 }
2641
2642 if (cctx->stage != ZSTDcs_ending) {
2643 /* write one last empty block, make it the "last" block */
2644 U32 const cBlockHeader24 = 1 /* last block */ + (((U32)bt_raw)<<1) + 0;
2645 if (dstCapacity<4) return ERROR(dstSize_tooSmall);
2646 MEM_writeLE32(op, cBlockHeader24);
2647 op += ZSTD_blockHeaderSize;
2648 dstCapacity -= ZSTD_blockHeaderSize;
2649 }
2650
2651 if (cctx->params.fParams.checksumFlag) {
2652 U32 const checksum = (U32) XXH64_digest(&cctx->xxhState);
2653 if (dstCapacity<4) return ERROR(dstSize_tooSmall);
2654 MEM_writeLE32(op, checksum);
2655 op += 4;
2656 }
2657
2658 cctx->stage = ZSTDcs_created; /* return to "created but not initialized" status */
2659 return op-ostart;
2660 }
2661
2662
2663 size_t ZSTD_compressEnd (ZSTD_CCtx* cctx,
2664 void* dst, size_t dstCapacity,
2665 const void* src, size_t srcSize)
2666 {
2667 size_t endResult;
2668 size_t const cSize = ZSTD_compressContinue_internal(cctx, dst, dstCapacity, src, srcSize, 1, 1);
2669 if (ZSTD_isError(cSize)) return cSize;
2670 endResult = ZSTD_writeEpilogue(cctx, (char*)dst + cSize, dstCapacity-cSize);
2671 if (ZSTD_isError(endResult)) return endResult;
2672 return cSize + endResult;
2673 }
2674
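A hedged usage sketch of the begin/continue/end sequence (function name, buffer names and the fixed compression level are illustrative; error handling abbreviated). Contiguous chunks preserve matches across chunk boundaries; non-contiguous input is still legal and is handled by the extDict path in ZSTD_compressContinue_internal above.

    #include "zstd.h"

    static size_t compress_two_chunks(ZSTD_CCtx* cctx,
                                      void* dst, size_t dstCapacity,
                                      const void* part1, size_t size1,
                                      const void* part2, size_t size2)
    {
        char* op = (char*)dst;
        size_t r = ZSTD_compressBegin(cctx, 3 /* illustrative level */);
        if (ZSTD_isError(r)) return r;
        r = ZSTD_compressContinue(cctx, op, dstCapacity, part1, size1);
        if (ZSTD_isError(r)) return r;
        op += r; dstCapacity -= r;
        r = ZSTD_compressEnd(cctx, op, dstCapacity, part2, size2);   /* also writes the epilogue */
        if (ZSTD_isError(r)) return r;
        return (size_t)((op + r) - (char*)dst);
    }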
2675
2676 static size_t ZSTD_compress_internal (ZSTD_CCtx* cctx,
2677 void* dst, size_t dstCapacity,
2678 const void* src, size_t srcSize,
2679 const void* dict,size_t dictSize,
2680 ZSTD_parameters params)
2681 {
2682 CHECK_F(ZSTD_compressBegin_internal(cctx, dict, dictSize, params, srcSize));
2683 return ZSTD_compressEnd(cctx, dst, dstCapacity, src, srcSize);
2684 }
2685
2686 size_t ZSTD_compress_advanced (ZSTD_CCtx* ctx,
2687 void* dst, size_t dstCapacity,
2688 const void* src, size_t srcSize,
2689 const void* dict,size_t dictSize,
2690 ZSTD_parameters params)
2691 {
2692 CHECK_F(ZSTD_checkCParams(params.cParams));
2693 return ZSTD_compress_internal(ctx, dst, dstCapacity, src, srcSize, dict, dictSize, params);
2694 }
2695
2696 size_t ZSTD_compress_usingDict(ZSTD_CCtx* ctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, const void* dict, size_t dictSize, int compressionLevel)
2697 {
2698 ZSTD_parameters params = ZSTD_getParams(compressionLevel, srcSize, dictSize);
2699 params.fParams.contentSizeFlag = 1;
2700 return ZSTD_compress_internal(ctx, dst, dstCapacity, src, srcSize, dict, dictSize, params);
2701 }
2702
2703 size_t ZSTD_compressCCtx (ZSTD_CCtx* ctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, int compressionLevel)
2704 {
2705 return ZSTD_compress_usingDict(ctx, dst, dstCapacity, src, srcSize, NULL, 0, compressionLevel);
2706 }
2707
2708 size_t ZSTD_compress(void* dst, size_t dstCapacity, const void* src, size_t srcSize, int compressionLevel)
2709 {
2710 size_t result;
2711 ZSTD_CCtx ctxBody;
2712 memset(&ctxBody, 0, sizeof(ctxBody));
2713 memcpy(&ctxBody.customMem, &defaultCustomMem, sizeof(ZSTD_customMem));
2714 result = ZSTD_compressCCtx(&ctxBody, dst, dstCapacity, src, srcSize, compressionLevel);
2715 ZSTD_free(ctxBody.workSpace, defaultCustomMem); /* can't free ctxBody itself, as it's on stack; free only heap content */
2716 return result;
2717 }
2718
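For the one-shot entry point above, sizing the destination with ZSTD_compressBound() guarantees enough room for any input. A minimal sketch (heap allocation and names are illustrative):

    #include <stdlib.h>
    #include "zstd.h"

    static size_t compress_once(void** cBufPtr, const void* src, size_t srcSize, int level)
    {
        size_t const bound = ZSTD_compressBound(srcSize);
        void* const cBuf = malloc(bound);
        size_t cSize;
        if (cBuf == NULL) return 0;
        cSize = ZSTD_compress(cBuf, bound, src, srcSize, level);
        if (ZSTD_isError(cSize)) { free(cBuf); return 0; }
        *cBufPtr = cBuf;   /* caller owns and frees */
        return cSize;
    }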
2719
2720 /* ===== Dictionary API ===== */
2721
2722 struct ZSTD_CDict_s {
2723 void* dictContent;
2724 size_t dictContentSize;
2725 ZSTD_CCtx* refContext;
2726 }; /* typedef'd to ZSTD_CDict within "zstd.h" */
2727
2728 size_t ZSTD_sizeof_CDict(const ZSTD_CDict* cdict)
2729 {
2730 if (cdict==NULL) return 0; /* support sizeof on NULL */
2731 return ZSTD_sizeof_CCtx(cdict->refContext) + cdict->dictContentSize;
2732 }
2733
2734 ZSTD_CDict* ZSTD_createCDict_advanced(const void* dict, size_t dictSize, ZSTD_parameters params, ZSTD_customMem customMem)
2735 {
2736 if (!customMem.customAlloc && !customMem.customFree) customMem = defaultCustomMem;
2737 if (!customMem.customAlloc || !customMem.customFree) return NULL;
2738
2739 { ZSTD_CDict* const cdict = (ZSTD_CDict*) ZSTD_malloc(sizeof(ZSTD_CDict), customMem);
2740 void* const dictContent = ZSTD_malloc(dictSize, customMem);
2741 ZSTD_CCtx* const cctx = ZSTD_createCCtx_advanced(customMem);
2742
2743 if (!dictContent || !cdict || !cctx) {
2744 ZSTD_free(dictContent, customMem);
2745 ZSTD_free(cdict, customMem);
2746 ZSTD_free(cctx, customMem);
2747 return NULL;
2748 }
2749
2750 if (dictSize) {
2751 memcpy(dictContent, dict, dictSize);
2752 }
2753 { size_t const errorCode = ZSTD_compressBegin_advanced(cctx, dictContent, dictSize, params, 0);
2754 if (ZSTD_isError(errorCode)) {
2755 ZSTD_free(dictContent, customMem);
2756 ZSTD_free(cdict, customMem);
2757 ZSTD_free(cctx, customMem);
2758 return NULL;
2759 } }
2760
2761 cdict->dictContent = dictContent;
2762 cdict->dictContentSize = dictSize;
2763 cdict->refContext = cctx;
2764 return cdict;
2765 }
2766 }
2767
2768 ZSTD_CDict* ZSTD_createCDict(const void* dict, size_t dictSize, int compressionLevel)
2769 {
2770 ZSTD_customMem const allocator = { NULL, NULL, NULL };
2771 ZSTD_parameters params = ZSTD_getParams(compressionLevel, 0, dictSize);
2772 params.fParams.contentSizeFlag = 1;
2773 return ZSTD_createCDict_advanced(dict, dictSize, params, allocator);
2774 }
2775
2776 size_t ZSTD_freeCDict(ZSTD_CDict* cdict)
2777 {
2778 if (cdict==NULL) return 0; /* support free on NULL */
2779 { ZSTD_customMem const cMem = cdict->refContext->customMem;
2780 ZSTD_freeCCtx(cdict->refContext);
2781 ZSTD_free(cdict->dictContent, cMem);
2782 ZSTD_free(cdict, cMem);
2783 return 0;
2784 }
2785 }
2786
2787 static ZSTD_parameters ZSTD_getParamsFromCDict(const ZSTD_CDict* cdict) {
2788 return ZSTD_getParamsFromCCtx(cdict->refContext);
2789 }
2790
2791 size_t ZSTD_compressBegin_usingCDict(ZSTD_CCtx* cctx, const ZSTD_CDict* cdict, U64 pledgedSrcSize)
2792 {
2793 if (cdict->dictContentSize) CHECK_F(ZSTD_copyCCtx(cctx, cdict->refContext, pledgedSrcSize))
2794 else CHECK_F(ZSTD_compressBegin_advanced(cctx, NULL, 0, cdict->refContext->params, pledgedSrcSize));
2795 return 0;
2796 }
2797
2798 /*! ZSTD_compress_usingCDict() :
2799 * Compression using a digested Dictionary.
2800 * Faster startup than ZSTD_compress_usingDict(), recommended when the same dictionary is used multiple times.
2801 * Note that compression level is decided during dictionary creation */
2802 size_t ZSTD_compress_usingCDict(ZSTD_CCtx* cctx,
2803 void* dst, size_t dstCapacity,
2804 const void* src, size_t srcSize,
2805 const ZSTD_CDict* cdict)
2806 {
2807 CHECK_F(ZSTD_compressBegin_usingCDict(cctx, cdict, srcSize));
2808
2809 if (cdict->refContext->params.fParams.contentSizeFlag==1) {
2810 cctx->params.fParams.contentSizeFlag = 1;
2811 cctx->frameContentSize = srcSize;
2812 }
2813
2814 return ZSTD_compressEnd(cctx, dst, dstCapacity, src, srcSize);
2815 }
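/* A usage sketch for the digested-dictionary path (illustration only; the
 * helper compress_with_cdict is hypothetical) : the CDict is built once and
 * would normally be kept alive and reused across many frames, which is what
 * amortizes the dictionary-loading cost mentioned above. */
#include <zstd.h>

size_t compress_with_cdict(void* dst, size_t dstCapacity,
                           const void* src, size_t srcSize,
                           const void* dict, size_t dictSize, int level)
{
    ZSTD_CDict* const cdict = ZSTD_createCDict(dict, dictSize, level);
    ZSTD_CCtx*  const cctx  = ZSTD_createCCtx();
    size_t cSize = (size_t)-1;   /* generic error marker; ZSTD_isError() is true on it */
    if (cdict && cctx)
        cSize = ZSTD_compress_usingCDict(cctx, dst, dstCapacity, src, srcSize, cdict);
    ZSTD_freeCCtx(cctx);     /* both free functions accept NULL */
    ZSTD_freeCDict(cdict);   /* real code would keep the CDict for reuse */
    return cSize;
}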
2816
2817
2818
2819 /* ******************************************************************
2820 * Streaming
2821 ********************************************************************/
2822
2823 typedef enum { zcss_init, zcss_load, zcss_flush, zcss_final } ZSTD_cStreamStage;
2824
2825 struct ZSTD_CStream_s {
2826 ZSTD_CCtx* cctx;
2827 ZSTD_CDict* cdictLocal;
2828 const ZSTD_CDict* cdict;
2829 char* inBuff;
2830 size_t inBuffSize;
2831 size_t inToCompress;
2832 size_t inBuffPos;
2833 size_t inBuffTarget;
2834 size_t blockSize;
2835 char* outBuff;
2836 size_t outBuffSize;
2837 size_t outBuffContentSize;
2838 size_t outBuffFlushedSize;
2839 ZSTD_cStreamStage stage;
2840 U32 checksum;
2841 U32 frameEnded;
2842 ZSTD_parameters params;
2843 ZSTD_customMem customMem;
2844 }; /* typedef'd to ZSTD_CStream within "zstd.h" */
2845
2846 ZSTD_CStream* ZSTD_createCStream(void)
2847 {
2848 return ZSTD_createCStream_advanced(defaultCustomMem);
2849 }
2850
2851 ZSTD_CStream* ZSTD_createCStream_advanced(ZSTD_customMem customMem)
2852 {
2853 ZSTD_CStream* zcs;
2854
2855 if (!customMem.customAlloc && !customMem.customFree) customMem = defaultCustomMem;
2856 if (!customMem.customAlloc || !customMem.customFree) return NULL;
2857
2858 zcs = (ZSTD_CStream*)ZSTD_malloc(sizeof(ZSTD_CStream), customMem);
2859 if (zcs==NULL) return NULL;
2860 memset(zcs, 0, sizeof(ZSTD_CStream));
2861 memcpy(&zcs->customMem, &customMem, sizeof(ZSTD_customMem));
2862 zcs->cctx = ZSTD_createCCtx_advanced(customMem);
2863 if (zcs->cctx == NULL) { ZSTD_freeCStream(zcs); return NULL; }
2864 return zcs;
2865 }
2866
2867 size_t ZSTD_freeCStream(ZSTD_CStream* zcs)
2868 {
2869 if (zcs==NULL) return 0; /* support free on NULL */
2870 { ZSTD_customMem const cMem = zcs->customMem;
2871 ZSTD_freeCCtx(zcs->cctx);
2872 ZSTD_freeCDict(zcs->cdictLocal);
2873 ZSTD_free(zcs->inBuff, cMem);
2874 ZSTD_free(zcs->outBuff, cMem);
2875 ZSTD_free(zcs, cMem);
2876 return 0;
2877 }
2878 }
2879
2880
2881 /*====== Initialization ======*/
2882
2883 size_t ZSTD_CStreamInSize(void) { return ZSTD_BLOCKSIZE_ABSOLUTEMAX; }
2884 size_t ZSTD_CStreamOutSize(void) { return ZSTD_compressBound(ZSTD_BLOCKSIZE_ABSOLUTEMAX) + ZSTD_blockHeaderSize + 4 /* 32-bit hash */ ; }
2885
2886 size_t ZSTD_resetCStream(ZSTD_CStream* zcs, unsigned long long pledgedSrcSize)
2887 {
2888 if (zcs->inBuffSize==0) return ERROR(stage_wrong); /* zcs has not been initialized at least once */
2889
2890 if (zcs->cdict) CHECK_F(ZSTD_compressBegin_usingCDict(zcs->cctx, zcs->cdict, pledgedSrcSize))
2891 else CHECK_F(ZSTD_compressBegin_advanced(zcs->cctx, NULL, 0, zcs->params, pledgedSrcSize));
2892
2893 zcs->inToCompress = 0;
2894 zcs->inBuffPos = 0;
2895 zcs->inBuffTarget = zcs->blockSize;
2896 zcs->outBuffContentSize = zcs->outBuffFlushedSize = 0;
2897 zcs->stage = zcss_load;
2898 zcs->frameEnded = 0;
2899 return 0; /* ready to go */
2900 }
2901
2902 size_t ZSTD_initCStream_advanced(ZSTD_CStream* zcs,
2903 const void* dict, size_t dictSize,
2904 ZSTD_parameters params, unsigned long long pledgedSrcSize)
2905 {
2906 /* allocate buffers */
2907 { size_t const neededInBuffSize = (size_t)1 << params.cParams.windowLog;
2908 if (zcs->inBuffSize < neededInBuffSize) {
2909 zcs->inBuffSize = neededInBuffSize;
2910 ZSTD_free(zcs->inBuff, zcs->customMem);
2911 zcs->inBuff = (char*) ZSTD_malloc(neededInBuffSize, zcs->customMem);
2912 if (zcs->inBuff == NULL) return ERROR(memory_allocation);
2913 }
2914 zcs->blockSize = MIN(ZSTD_BLOCKSIZE_ABSOLUTEMAX, neededInBuffSize);
2915 }
2916 if (zcs->outBuffSize < ZSTD_compressBound(zcs->blockSize)+1) {
2917 zcs->outBuffSize = ZSTD_compressBound(zcs->blockSize)+1;
2918 ZSTD_free(zcs->outBuff, zcs->customMem);
2919 zcs->outBuff = (char*) ZSTD_malloc(zcs->outBuffSize, zcs->customMem);
2920 if (zcs->outBuff == NULL) return ERROR(memory_allocation);
2921 }
2922
2923 if (dict) {
2924 ZSTD_freeCDict(zcs->cdictLocal);
2925 zcs->cdictLocal = ZSTD_createCDict_advanced(dict, dictSize, params, zcs->customMem);
2926 if (zcs->cdictLocal == NULL) return ERROR(memory_allocation);
2927 zcs->cdict = zcs->cdictLocal;
2928 } else zcs->cdict = NULL;
2929
2930 zcs->checksum = params.fParams.checksumFlag > 0;
2931 zcs->params = params;
2932
2933 return ZSTD_resetCStream(zcs, pledgedSrcSize);
2934 }
2935
2936 /* note : cdict must outlive compression session */
2937 size_t ZSTD_initCStream_usingCDict(ZSTD_CStream* zcs, const ZSTD_CDict* cdict)
2938 {
2939 ZSTD_parameters const params = ZSTD_getParamsFromCDict(cdict);
2940 size_t const initError = ZSTD_initCStream_advanced(zcs, NULL, 0, params, 0);
2941 zcs->cdict = cdict;
2942 return initError;
2943 }
2944
2945 size_t ZSTD_initCStream_usingDict(ZSTD_CStream* zcs, const void* dict, size_t dictSize, int compressionLevel)
2946 {
2947 ZSTD_parameters const params = ZSTD_getParams(compressionLevel, 0, dictSize);
2948 return ZSTD_initCStream_advanced(zcs, dict, dictSize, params, 0);
2949 }
2950
2951 size_t ZSTD_initCStream(ZSTD_CStream* zcs, int compressionLevel)
2952 {
2953 return ZSTD_initCStream_usingDict(zcs, NULL, 0, compressionLevel);
2954 }
2955
2956 size_t ZSTD_sizeof_CStream(const ZSTD_CStream* zcs)
2957 {
2958 if (zcs==NULL) return 0; /* support sizeof on NULL */
2959 return sizeof(*zcs) + ZSTD_sizeof_CCtx(zcs->cctx) + ZSTD_sizeof_CDict(zcs->cdictLocal) + zcs->outBuffSize + zcs->inBuffSize; /* sizeof(*zcs) : size of the struct, not of the pointer */
2960 }
2961
2962 /*====== Compression ======*/
2963
2964 typedef enum { zsf_gather, zsf_flush, zsf_end } ZSTD_flush_e;
2965
2966 MEM_STATIC size_t ZSTD_limitCopy(void* dst, size_t dstCapacity, const void* src, size_t srcSize)
2967 {
2968 size_t const length = MIN(dstCapacity, srcSize);
2969 memcpy(dst, src, length);
2970 return length;
2971 }
2972
2973 static size_t ZSTD_compressStream_generic(ZSTD_CStream* zcs,
2974 void* dst, size_t* dstCapacityPtr,
2975 const void* src, size_t* srcSizePtr,
2976 ZSTD_flush_e const flush)
2977 {
2978 U32 someMoreWork = 1;
2979 const char* const istart = (const char*)src;
2980 const char* const iend = istart + *srcSizePtr;
2981 const char* ip = istart;
2982 char* const ostart = (char*)dst;
2983 char* const oend = ostart + *dstCapacityPtr;
2984 char* op = ostart;
2985
2986 while (someMoreWork) {
2987 switch(zcs->stage)
2988 {
2989 case zcss_init: return ERROR(init_missing); /* call ZSTD_initCStream() first ! */
2990
2991 case zcss_load:
2992 /* complete inBuffer */
2993 { size_t const toLoad = zcs->inBuffTarget - zcs->inBuffPos;
2994 size_t const loaded = ZSTD_limitCopy(zcs->inBuff + zcs->inBuffPos, toLoad, ip, iend-ip);
2995 zcs->inBuffPos += loaded;
2996 ip += loaded;
2997 if ( (zcs->inBuffPos==zcs->inToCompress) || (!flush && (toLoad != loaded)) ) {
2998 someMoreWork = 0; break; /* not enough input to get a full block : stop there, wait for more */
2999 } }
3000 /* compress current block (note : this stage cannot be stopped in the middle) */
3001 { void* cDst;
3002 size_t cSize;
3003 size_t const iSize = zcs->inBuffPos - zcs->inToCompress;
3004 size_t oSize = oend-op;
3005 if (oSize >= ZSTD_compressBound(iSize))
3006 cDst = op; /* compress directly into output buffer (avoid flush stage) */
3007 else
3008 cDst = zcs->outBuff, oSize = zcs->outBuffSize;
3009 cSize = (flush == zsf_end) ?
3010 ZSTD_compressEnd(zcs->cctx, cDst, oSize, zcs->inBuff + zcs->inToCompress, iSize) :
3011 ZSTD_compressContinue(zcs->cctx, cDst, oSize, zcs->inBuff + zcs->inToCompress, iSize);
3012 if (ZSTD_isError(cSize)) return cSize;
3013 if (flush == zsf_end) zcs->frameEnded = 1;
3014 /* prepare next block */
3015 zcs->inBuffTarget = zcs->inBuffPos + zcs->blockSize;
3016 if (zcs->inBuffTarget > zcs->inBuffSize)
3017 zcs->inBuffPos = 0, zcs->inBuffTarget = zcs->blockSize; /* note : inBuffSize >= blockSize */
3018 zcs->inToCompress = zcs->inBuffPos;
3019 if (cDst == op) { op += cSize; break; } /* no need to flush */
3020 zcs->outBuffContentSize = cSize;
3021 zcs->outBuffFlushedSize = 0;
3022 zcs->stage = zcss_flush; /* pass-through to flush stage */
3023 }
3024
3025 case zcss_flush:
3026 { size_t const toFlush = zcs->outBuffContentSize - zcs->outBuffFlushedSize;
3027 size_t const flushed = ZSTD_limitCopy(op, oend-op, zcs->outBuff + zcs->outBuffFlushedSize, toFlush);
3028 op += flushed;
3029 zcs->outBuffFlushedSize += flushed;
3030 if (toFlush!=flushed) { someMoreWork = 0; break; } /* dst too small to store flushed data : stop there */
3031 zcs->outBuffContentSize = zcs->outBuffFlushedSize = 0;
3032 zcs->stage = zcss_load;
3033 break;
3034 }
3035
3036 case zcss_final:
3037 someMoreWork = 0; /* do nothing */
3038 break;
3039
3040 default:
3041 return ERROR(GENERIC); /* impossible */
3042 }
3043 }
3044
3045 *srcSizePtr = ip - istart;
3046 *dstCapacityPtr = op - ostart;
3047 if (zcs->frameEnded) return 0;
3048 { size_t hintInSize = zcs->inBuffTarget - zcs->inBuffPos;
3049 if (hintInSize==0) hintInSize = zcs->blockSize;
3050 return hintInSize;
3051 }
3052 }
3053
3054 size_t ZSTD_compressStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output, ZSTD_inBuffer* input)
3055 {
3056 size_t sizeRead = input->size - input->pos;
3057 size_t sizeWritten = output->size - output->pos;
3058 size_t const result = ZSTD_compressStream_generic(zcs,
3059 (char*)(output->dst) + output->pos, &sizeWritten,
3060 (const char*)(input->src) + input->pos, &sizeRead, zsf_gather);
3061 input->pos += sizeRead;
3062 output->pos += sizeWritten;
3063 return result;
3064 }
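/* Sketch of the expected driver loop around ZSTD_compressStream()
 * (illustration only; feed_chunk is hypothetical and zcs is assumed already
 * initialized via ZSTD_initCStream*). Each call consumes some input and/or
 * produces some output by advancing the ->pos fields, so the caller loops
 * until the chunk is fully consumed. Frame finalization is sketched after
 * ZSTD_endStream() below. */
#include <stdio.h>
#include <zstd.h>

static int feed_chunk(ZSTD_CStream* zcs, FILE* fout,
                      const void* chunk, size_t chunkSize,
                      void* outBuff, size_t outSize)   /* outSize : ZSTD_CStreamOutSize() works well */
{
    ZSTD_inBuffer input = { chunk, chunkSize, 0 };
    while (input.pos < input.size) {            /* chunk not fully consumed yet */
        ZSTD_outBuffer output = { outBuff, outSize, 0 };
        size_t const hint = ZSTD_compressStream(zcs, &output, &input);
        if (ZSTD_isError(hint)) return -1;      /* otherwise hint = suggested next input size */
        fwrite(outBuff, 1, output.pos, fout);   /* write whatever was produced (checks omitted) */
    }
    return 0;
}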
3065
3066
3067 /*====== Finalize ======*/
3068
3069 /*! ZSTD_flushStream() :
3070 * @return : amount of data remaining to flush */
3071 size_t ZSTD_flushStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output)
3072 {
3073 size_t srcSize = 0;
3074 size_t sizeWritten = output->size - output->pos;
3075 size_t const result = ZSTD_compressStream_generic(zcs,
3076 (char*)(output->dst) + output->pos, &sizeWritten,
3077 &srcSize, &srcSize, /* use a valid src address instead of NULL */
3078 zsf_flush);
3079 output->pos += sizeWritten;
3080 if (ZSTD_isError(result)) return result;
3081 return zcs->outBuffContentSize - zcs->outBuffFlushedSize; /* remaining to flush */
3082 }
3083
3084
3085 size_t ZSTD_endStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output)
3086 {
3087 BYTE* const ostart = (BYTE*)(output->dst) + output->pos;
3088 BYTE* const oend = (BYTE*)(output->dst) + output->size;
3089 BYTE* op = ostart;
3090
3091 if (zcs->stage != zcss_final) {
3092 /* flush whatever remains */
3093 size_t srcSize = 0;
3094 size_t sizeWritten = output->size - output->pos;
3095 size_t const notEnded = ZSTD_compressStream_generic(zcs, ostart, &sizeWritten, &srcSize, &srcSize, zsf_end); /* use a valid src address instead of NULL */
3096 size_t const remainingToFlush = zcs->outBuffContentSize - zcs->outBuffFlushedSize;
3097 op += sizeWritten;
3098 if (remainingToFlush) {
3099 output->pos += sizeWritten;
3100 return remainingToFlush + ZSTD_BLOCKHEADERSIZE /* final empty block */ + (zcs->checksum * 4);
3101 }
3102 /* create epilogue */
3103 zcs->stage = zcss_final;
3104 zcs->outBuffContentSize = !notEnded ? 0 :
3105 ZSTD_compressEnd(zcs->cctx, zcs->outBuff, zcs->outBuffSize, NULL, 0); /* write epilogue, including final empty block, into outBuff */
3106 }
3107
3108 /* flush epilogue */
3109 { size_t const toFlush = zcs->outBuffContentSize - zcs->outBuffFlushedSize;
3110 size_t const flushed = ZSTD_limitCopy(op, oend-op, zcs->outBuff + zcs->outBuffFlushedSize, toFlush);
3111 op += flushed;
3112 zcs->outBuffFlushedSize += flushed;
3113 output->pos += op-ostart;
3114 if (toFlush==flushed) zcs->stage = zcss_init; /* end reached */
3115 return toFlush - flushed;
3116 }
3117 }
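/* Matching finalization sketch, continuing the hypothetical driver above :
 * ZSTD_endStream() writes the epilogue (final empty block, optional checksum)
 * and returns the number of bytes still held in internal buffers, so the
 * caller drains until it returns 0 (illustration only, same includes). */
static int finish_frame(ZSTD_CStream* zcs, FILE* fout, void* outBuff, size_t outSize)
{
    size_t remaining;
    do {
        ZSTD_outBuffer output = { outBuff, outSize, 0 };
        remaining = ZSTD_endStream(zcs, &output);
        if (ZSTD_isError(remaining)) return -1;
        fwrite(outBuff, 1, output.pos, fout);
    } while (remaining > 0);   /* 0 : frame fully written out */
    return 0;
}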
3118
3119
3120
3121 /*-===== Pre-defined compression levels =====-*/
3122
3123 #define ZSTD_DEFAULT_CLEVEL 1
3124 #define ZSTD_MAX_CLEVEL 22
3125 int ZSTD_maxCLevel(void) { return ZSTD_MAX_CLEVEL; }
3126
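/* Column legend for the tables below : W = windowLog, C = chainLog,
 * H = hashLog, S = searchLog, L = searchLength, TL/T = targetLength,
 * strat = ZSTD_strategy ; these are the fields of
 * ZSTD_compressionParameters, in declaration order. */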
3127 static const ZSTD_compressionParameters ZSTD_defaultCParameters[4][ZSTD_MAX_CLEVEL+1] = {
3128 { /* "default" */
3129 /* W, C, H, S, L, TL, strat */
3130 { 18, 12, 12, 1, 7, 16, ZSTD_fast }, /* level 0 - never used */
3131 { 19, 13, 14, 1, 7, 16, ZSTD_fast }, /* level 1 */
3132 { 19, 15, 16, 1, 6, 16, ZSTD_fast }, /* level 2 */
3133 { 20, 16, 17, 1, 5, 16, ZSTD_dfast }, /* level 3.*/
3134 { 20, 18, 18, 1, 5, 16, ZSTD_dfast }, /* level 4.*/
3135 { 20, 15, 18, 3, 5, 16, ZSTD_greedy }, /* level 5 */
3136 { 21, 16, 19, 2, 5, 16, ZSTD_lazy }, /* level 6 */
3137 { 21, 17, 20, 3, 5, 16, ZSTD_lazy }, /* level 7 */
3138 { 21, 18, 20, 3, 5, 16, ZSTD_lazy2 }, /* level 8 */
3139 { 21, 20, 20, 3, 5, 16, ZSTD_lazy2 }, /* level 9 */
3140 { 21, 19, 21, 4, 5, 16, ZSTD_lazy2 }, /* level 10 */
3141 { 22, 20, 22, 4, 5, 16, ZSTD_lazy2 }, /* level 11 */
3142 { 22, 20, 22, 5, 5, 16, ZSTD_lazy2 }, /* level 12 */
3143 { 22, 21, 22, 5, 5, 16, ZSTD_lazy2 }, /* level 13 */
3144 { 22, 21, 22, 6, 5, 16, ZSTD_lazy2 }, /* level 14 */
3145 { 22, 21, 21, 5, 5, 16, ZSTD_btlazy2 }, /* level 15 */
3146 { 23, 22, 22, 5, 5, 16, ZSTD_btlazy2 }, /* level 16 */
3147 { 23, 21, 22, 4, 5, 24, ZSTD_btopt }, /* level 17 */
3148 { 23, 23, 22, 6, 5, 32, ZSTD_btopt }, /* level 18 */
3149 { 23, 23, 22, 6, 3, 48, ZSTD_btopt }, /* level 19 */
3150 { 25, 25, 23, 7, 3, 64, ZSTD_btopt2 }, /* level 20 */
3151 { 26, 26, 23, 7, 3,256, ZSTD_btopt2 }, /* level 21 */
3152 { 27, 27, 25, 9, 3,512, ZSTD_btopt2 }, /* level 22 */
3153 },
3154 { /* for srcSize <= 256 KB */
3155 /* W, C, H, S, L, T, strat */
3156 { 0, 0, 0, 0, 0, 0, ZSTD_fast }, /* level 0 - not used */
3157 { 18, 13, 14, 1, 6, 8, ZSTD_fast }, /* level 1 */
3158 { 18, 14, 13, 1, 5, 8, ZSTD_dfast }, /* level 2 */
3159 { 18, 16, 15, 1, 5, 8, ZSTD_dfast }, /* level 3 */
3160 { 18, 15, 17, 1, 5, 8, ZSTD_greedy }, /* level 4.*/
3161 { 18, 16, 17, 4, 5, 8, ZSTD_greedy }, /* level 5.*/
3162 { 18, 16, 17, 3, 5, 8, ZSTD_lazy }, /* level 6.*/
3163 { 18, 17, 17, 4, 4, 8, ZSTD_lazy }, /* level 7 */
3164 { 18, 17, 17, 4, 4, 8, ZSTD_lazy2 }, /* level 8 */
3165 { 18, 17, 17, 5, 4, 8, ZSTD_lazy2 }, /* level 9 */
3166 { 18, 17, 17, 6, 4, 8, ZSTD_lazy2 }, /* level 10 */
3167 { 18, 18, 17, 6, 4, 8, ZSTD_lazy2 }, /* level 11.*/
3168 { 18, 18, 17, 7, 4, 8, ZSTD_lazy2 }, /* level 12.*/
3169 { 18, 19, 17, 6, 4, 8, ZSTD_btlazy2 }, /* level 13 */
3170 { 18, 18, 18, 4, 4, 16, ZSTD_btopt }, /* level 14.*/
3171 { 18, 18, 18, 4, 3, 16, ZSTD_btopt }, /* level 15.*/
3172 { 18, 19, 18, 6, 3, 32, ZSTD_btopt }, /* level 16.*/
3173 { 18, 19, 18, 8, 3, 64, ZSTD_btopt }, /* level 17.*/
3174 { 18, 19, 18, 9, 3,128, ZSTD_btopt }, /* level 18.*/
3175 { 18, 19, 18, 10, 3,256, ZSTD_btopt }, /* level 19.*/
3176 { 18, 19, 18, 11, 3,512, ZSTD_btopt2 }, /* level 20.*/
3177 { 18, 19, 18, 12, 3,512, ZSTD_btopt2 }, /* level 21.*/
3178 { 18, 19, 18, 13, 3,512, ZSTD_btopt2 }, /* level 22.*/
3179 },
3180 { /* for srcSize <= 128 KB */
3181 /* W, C, H, S, L, T, strat */
3182 { 17, 12, 12, 1, 7, 8, ZSTD_fast }, /* level 0 - not used */
3183 { 17, 12, 13, 1, 6, 8, ZSTD_fast }, /* level 1 */
3184 { 17, 13, 16, 1, 5, 8, ZSTD_fast }, /* level 2 */
3185 { 17, 16, 16, 2, 5, 8, ZSTD_dfast }, /* level 3 */
3186 { 17, 13, 15, 3, 4, 8, ZSTD_greedy }, /* level 4 */
3187 { 17, 15, 17, 4, 4, 8, ZSTD_greedy }, /* level 5 */
3188 { 17, 16, 17, 3, 4, 8, ZSTD_lazy }, /* level 6 */
3189 { 17, 15, 17, 4, 4, 8, ZSTD_lazy2 }, /* level 7 */
3190 { 17, 17, 17, 4, 4, 8, ZSTD_lazy2 }, /* level 8 */
3191 { 17, 17, 17, 5, 4, 8, ZSTD_lazy2 }, /* level 9 */
3192 { 17, 17, 17, 6, 4, 8, ZSTD_lazy2 }, /* level 10 */
3193 { 17, 17, 17, 7, 4, 8, ZSTD_lazy2 }, /* level 11 */
3194 { 17, 17, 17, 8, 4, 8, ZSTD_lazy2 }, /* level 12 */
3195 { 17, 18, 17, 6, 4, 8, ZSTD_btlazy2 }, /* level 13.*/
3196 { 17, 17, 17, 7, 3, 8, ZSTD_btopt }, /* level 14.*/
3197 { 17, 17, 17, 7, 3, 16, ZSTD_btopt }, /* level 15.*/
3198 { 17, 18, 17, 7, 3, 32, ZSTD_btopt }, /* level 16.*/
3199 { 17, 18, 17, 7, 3, 64, ZSTD_btopt }, /* level 17.*/
3200 { 17, 18, 17, 7, 3,256, ZSTD_btopt }, /* level 18.*/
3201 { 17, 18, 17, 8, 3,256, ZSTD_btopt }, /* level 19.*/
3202 { 17, 18, 17, 9, 3,256, ZSTD_btopt2 }, /* level 20.*/
3203 { 17, 18, 17, 10, 3,256, ZSTD_btopt2 }, /* level 21.*/
3204 { 17, 18, 17, 11, 3,512, ZSTD_btopt2 }, /* level 22.*/
3205 },
3206 { /* for srcSize <= 16 KB */
3207 /* W, C, H, S, L, T, strat */
3208 { 14, 12, 12, 1, 7, 6, ZSTD_fast }, /* level 0 - not used */
3209 { 14, 14, 14, 1, 6, 6, ZSTD_fast }, /* level 1 */
3210 { 14, 14, 14, 1, 4, 6, ZSTD_fast }, /* level 2 */
3211 { 14, 14, 14, 1, 4, 6, ZSTD_dfast }, /* level 3.*/
3212 { 14, 14, 14, 4, 4, 6, ZSTD_greedy }, /* level 4.*/
3213 { 14, 14, 14, 3, 4, 6, ZSTD_lazy }, /* level 5.*/
3214 { 14, 14, 14, 4, 4, 6, ZSTD_lazy2 }, /* level 6 */
3215 { 14, 14, 14, 5, 4, 6, ZSTD_lazy2 }, /* level 7 */
3216 { 14, 14, 14, 6, 4, 6, ZSTD_lazy2 }, /* level 8.*/
3217 { 14, 15, 14, 6, 4, 6, ZSTD_btlazy2 }, /* level 9.*/
3218 { 14, 15, 14, 3, 3, 6, ZSTD_btopt }, /* level 10.*/
3219 { 14, 15, 14, 6, 3, 8, ZSTD_btopt }, /* level 11.*/
3220 { 14, 15, 14, 6, 3, 16, ZSTD_btopt }, /* level 12.*/
3221 { 14, 15, 14, 6, 3, 24, ZSTD_btopt }, /* level 13.*/
3222 { 14, 15, 15, 6, 3, 48, ZSTD_btopt }, /* level 14.*/
3223 { 14, 15, 15, 6, 3, 64, ZSTD_btopt }, /* level 15.*/
3224 { 14, 15, 15, 6, 3, 96, ZSTD_btopt }, /* level 16.*/
3225 { 14, 15, 15, 6, 3,128, ZSTD_btopt }, /* level 17.*/
3226 { 14, 15, 15, 6, 3,256, ZSTD_btopt }, /* level 18.*/
3227 { 14, 15, 15, 7, 3,256, ZSTD_btopt }, /* level 19.*/
3228 { 14, 15, 15, 8, 3,256, ZSTD_btopt2 }, /* level 20.*/
3229 { 14, 15, 15, 9, 3,256, ZSTD_btopt2 }, /* level 21.*/
3230 { 14, 15, 15, 10, 3,256, ZSTD_btopt2 }, /* level 22.*/
3231 },
3232 };
3233
3234 /*! ZSTD_getCParams() :
3235 * @return ZSTD_compressionParameters structure for a selected compression level, `srcSize` and `dictSize`.
3236 * Size values are optional, provide 0 if not known or unused */
3237 ZSTD_compressionParameters ZSTD_getCParams(int compressionLevel, unsigned long long srcSize, size_t dictSize)
3238 {
3239 ZSTD_compressionParameters cp;
3240 size_t const addedSize = srcSize ? 0 : 500;
3241 U64 const rSize = srcSize+dictSize ? srcSize+dictSize+addedSize : (U64)-1;
3242 U32 const tableID = (rSize <= 256 KB) + (rSize <= 128 KB) + (rSize <= 16 KB); /* intentional underflow for srcSizeHint == 0 */
3243 if (compressionLevel <= 0) compressionLevel = ZSTD_DEFAULT_CLEVEL; /* 0 == default; no negative compressionLevel yet */
3244 if (compressionLevel > ZSTD_MAX_CLEVEL) compressionLevel = ZSTD_MAX_CLEVEL;
3245 cp = ZSTD_defaultCParameters[tableID][compressionLevel];
3246 if (MEM_32bits()) { /* auto-correction, for 32-bit mode */
3247 if (cp.windowLog > ZSTD_WINDOWLOG_MAX) cp.windowLog = ZSTD_WINDOWLOG_MAX;
3248 if (cp.chainLog > ZSTD_CHAINLOG_MAX) cp.chainLog = ZSTD_CHAINLOG_MAX;
3249 if (cp.hashLog > ZSTD_HASHLOG_MAX) cp.hashLog = ZSTD_HASHLOG_MAX;
3250 }
3251 cp = ZSTD_adjustCParams(cp, srcSize, dictSize);
3252 return cp;
3253 }
3254
3255 /*! ZSTD_getParams() :
3256 * same as ZSTD_getCParams(), but @return a `ZSTD_parameters` object (instead of `ZSTD_compressionParameters`).
3257 * All fields of `ZSTD_frameParameters` are set to default (0) */
3258 ZSTD_parameters ZSTD_getParams(int compressionLevel, unsigned long long srcSize, size_t dictSize) {
3259 ZSTD_parameters params;
3260 ZSTD_compressionParameters const cParams = ZSTD_getCParams(compressionLevel, srcSize, dictSize);
3261 memset(&params, 0, sizeof(params));
3262 params.cParams = cParams;
3263 return params;
3264 }
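/* Usage sketch (illustration only; params_for is hypothetical). In this
 * version the advanced declarations, including ZSTD_parameters and
 * ZSTD_getParams(), are exposed by defining ZSTD_STATIC_LINKING_ONLY before
 * including zstd.h. */
#define ZSTD_STATIC_LINKING_ONLY
#include <zstd.h>

ZSTD_parameters params_for(int level, unsigned long long knownSrcSize)
{
    /* pass 0 when the source size is unknown : per the tableID computation
     * above, that selects the large-input parameter table */
    ZSTD_parameters p = ZSTD_getParams(level, knownSrcSize, 0 /* dictSize */);
    p.fParams.checksumFlag = 1;   /* frame parameters start zeroed; opt in here */
    return p;
}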
@@ -0,0 +1,900 b''
1 /**
2 * Copyright (c) 2016-present, Przemyslaw Skibinski, Yann Collet, Facebook, Inc.
3 * All rights reserved.
4 *
5 * This source code is licensed under the BSD-style license found in the
6 * LICENSE file in the root directory of this source tree. An additional grant
7 * of patent rights can be found in the PATENTS file in the same directory.
8 */
9
10
11 /* Note : this file is intended to be included within zstd_compress.c */
12
13
14 #ifndef ZSTD_OPT_H_91842398743
15 #define ZSTD_OPT_H_91842398743
16
17
18 #define ZSTD_FREQ_DIV 5
19 #define ZSTD_MAX_PRICE (1<<30)
20
21 /*-*************************************
22 * Price functions for optimal parser
23 ***************************************/
24 FORCE_INLINE void ZSTD_setLog2Prices(seqStore_t* ssPtr)
25 {
26 ssPtr->log2matchLengthSum = ZSTD_highbit32(ssPtr->matchLengthSum+1);
27 ssPtr->log2litLengthSum = ZSTD_highbit32(ssPtr->litLengthSum+1);
28 ssPtr->log2litSum = ZSTD_highbit32(ssPtr->litSum+1);
29 ssPtr->log2offCodeSum = ZSTD_highbit32(ssPtr->offCodeSum+1);
30 ssPtr->factor = 1 + ((ssPtr->litSum>>5) / ssPtr->litLengthSum) + ((ssPtr->litSum<<1) / (ssPtr->litSum + ssPtr->matchSum));
31 }
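/* In outline, with freq[s] the adaptive count of symbol s and sum the total
 * over its alphabet, the parser prices s at roughly
 *     price(s) ~ log2(sum) - log2(freq[s]) ~ -log2(P(s))
 * i.e. an estimate of its Shannon self-information in bits. The log2*Sum
 * fields above cache the log2(sum) terms so each price query costs a single
 * ZSTD_highbit32() per component; `factor` adds a small per-sequence
 * correction derived from the current literal/match statistics. */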
32
33
34 MEM_STATIC void ZSTD_rescaleFreqs(seqStore_t* ssPtr)
35 {
36 unsigned u;
37
38 ssPtr->cachedLiterals = NULL;
39 ssPtr->cachedPrice = ssPtr->cachedLitLength = 0;
40
41 if (ssPtr->litLengthSum == 0) {
42 ssPtr->litSum = (2<<Litbits);
43 ssPtr->litLengthSum = MaxLL+1;
44 ssPtr->matchLengthSum = MaxML+1;
45 ssPtr->offCodeSum = (MaxOff+1);
46 ssPtr->matchSum = (2<<Litbits);
47
48 for (u=0; u<=MaxLit; u++)
49 ssPtr->litFreq[u] = 2;
50 for (u=0; u<=MaxLL; u++)
51 ssPtr->litLengthFreq[u] = 1;
52 for (u=0; u<=MaxML; u++)
53 ssPtr->matchLengthFreq[u] = 1;
54 for (u=0; u<=MaxOff; u++)
55 ssPtr->offCodeFreq[u] = 1;
56 } else {
57 ssPtr->matchLengthSum = 0;
58 ssPtr->litLengthSum = 0;
59 ssPtr->offCodeSum = 0;
60 ssPtr->matchSum = 0;
61 ssPtr->litSum = 0;
62
63 for (u=0; u<=MaxLit; u++) {
64 ssPtr->litFreq[u] = 1 + (ssPtr->litFreq[u]>>ZSTD_FREQ_DIV);
65 ssPtr->litSum += ssPtr->litFreq[u];
66 }
67 for (u=0; u<=MaxLL; u++) {
68 ssPtr->litLengthFreq[u] = 1 + (ssPtr->litLengthFreq[u]>>ZSTD_FREQ_DIV);
69 ssPtr->litLengthSum += ssPtr->litLengthFreq[u];
70 }
71 for (u=0; u<=MaxML; u++) {
72 ssPtr->matchLengthFreq[u] = 1 + (ssPtr->matchLengthFreq[u]>>ZSTD_FREQ_DIV);
73 ssPtr->matchLengthSum += ssPtr->matchLengthFreq[u];
74 ssPtr->matchSum += ssPtr->matchLengthFreq[u] * (u + 3);
75 }
76 for (u=0; u<=MaxOff; u++) {
77 ssPtr->offCodeFreq[u] = 1 + (ssPtr->offCodeFreq[u]>>ZSTD_FREQ_DIV);
78 ssPtr->offCodeSum += ssPtr->offCodeFreq[u];
79 }
80 }
81
82 ZSTD_setLog2Prices(ssPtr);
83 }
84
85
86 FORCE_INLINE U32 ZSTD_getLiteralPrice(seqStore_t* ssPtr, U32 litLength, const BYTE* literals)
87 {
88 U32 price, u;
89
90 if (litLength == 0)
91 return ssPtr->log2litLengthSum - ZSTD_highbit32(ssPtr->litLengthFreq[0]+1);
92
93 /* literals */
94 if (ssPtr->cachedLiterals == literals) {
95 U32 const additional = litLength - ssPtr->cachedLitLength;
96 const BYTE* literals2 = ssPtr->cachedLiterals + ssPtr->cachedLitLength;
97 price = ssPtr->cachedPrice + additional * ssPtr->log2litSum;
98 for (u=0; u < additional; u++)
99 price -= ZSTD_highbit32(ssPtr->litFreq[literals2[u]]+1);
100 ssPtr->cachedPrice = price;
101 ssPtr->cachedLitLength = litLength;
102 } else {
103 price = litLength * ssPtr->log2litSum;
104 for (u=0; u < litLength; u++)
105 price -= ZSTD_highbit32(ssPtr->litFreq[literals[u]]+1);
106
107 if (litLength >= 12) {
108 ssPtr->cachedLiterals = literals;
109 ssPtr->cachedPrice = price;
110 ssPtr->cachedLitLength = litLength;
111 }
112 }
113
114 /* literal Length */
115 { const BYTE LL_deltaCode = 19;
116 const BYTE llCode = (litLength>63) ? (BYTE)ZSTD_highbit32(litLength) + LL_deltaCode : LL_Code[litLength];
117 price += LL_bits[llCode] + ssPtr->log2litLengthSum - ZSTD_highbit32(ssPtr->litLengthFreq[llCode]+1);
118 }
119
120 return price;
121 }
122
123
124 FORCE_INLINE U32 ZSTD_getPrice(seqStore_t* seqStorePtr, U32 litLength, const BYTE* literals, U32 offset, U32 matchLength, const int ultra)
125 {
126 /* offset */
127 BYTE const offCode = (BYTE)ZSTD_highbit32(offset+1);
128 U32 price = offCode + seqStorePtr->log2offCodeSum - ZSTD_highbit32(seqStorePtr->offCodeFreq[offCode]+1);
129
130 if (!ultra && offCode >= 20) price += (offCode-19)*2;
131
132 /* match Length */
133 { const BYTE ML_deltaCode = 36;
134 const BYTE mlCode = (matchLength>127) ? (BYTE)ZSTD_highbit32(matchLength) + ML_deltaCode : ML_Code[matchLength];
135 price += ML_bits[mlCode] + seqStorePtr->log2matchLengthSum - ZSTD_highbit32(seqStorePtr->matchLengthFreq[mlCode]+1);
136 }
137
138 return price + ZSTD_getLiteralPrice(seqStorePtr, litLength, literals) + seqStorePtr->factor;
139 }
140
141
142 MEM_STATIC void ZSTD_updatePrice(seqStore_t* seqStorePtr, U32 litLength, const BYTE* literals, U32 offset, U32 matchLength)
143 {
144 U32 u;
145
146 /* literals */
147 seqStorePtr->litSum += litLength;
148 for (u=0; u < litLength; u++)
149 seqStorePtr->litFreq[literals[u]]++;
150
151 /* literal Length */
152 { const BYTE LL_deltaCode = 19;
153 const BYTE llCode = (litLength>63) ? (BYTE)ZSTD_highbit32(litLength) + LL_deltaCode : LL_Code[litLength];
154 seqStorePtr->litLengthFreq[llCode]++;
155 seqStorePtr->litLengthSum++;
156 }
157
158 /* match offset */
159 { BYTE const offCode = (BYTE)ZSTD_highbit32(offset+1);
160 seqStorePtr->offCodeSum++;
161 seqStorePtr->offCodeFreq[offCode]++;
162 }
163
164 /* match Length */
165 { const BYTE ML_deltaCode = 36;
166 const BYTE mlCode = (matchLength>127) ? (BYTE)ZSTD_highbit32(matchLength) + ML_deltaCode : ML_Code[matchLength];
167 seqStorePtr->matchLengthFreq[mlCode]++;
168 seqStorePtr->matchLengthSum++;
169 }
170
171 ZSTD_setLog2Prices(seqStorePtr);
172 }
173
174
175 #define SET_PRICE(pos, mlen_, offset_, litlen_, price_) \
176 { \
177 while (last_pos < pos) { opt[last_pos+1].price = ZSTD_MAX_PRICE; last_pos++; } \
178 opt[pos].mlen = mlen_; \
179 opt[pos].off = offset_; \
180 opt[pos].litlen = litlen_; \
181 opt[pos].price = price_; \
182 }
183
184
185
186 /* Update hashTable3 up to ip (excluded)
187 Assumption : always within prefix (i.e. not within extDict) */
188 FORCE_INLINE
189 U32 ZSTD_insertAndFindFirstIndexHash3 (ZSTD_CCtx* zc, const BYTE* ip)
190 {
191 U32* const hashTable3 = zc->hashTable3;
192 U32 const hashLog3 = zc->hashLog3;
193 const BYTE* const base = zc->base;
194 U32 idx = zc->nextToUpdate3;
195 const U32 target = zc->nextToUpdate3 = (U32)(ip - base);
196 const size_t hash3 = ZSTD_hash3Ptr(ip, hashLog3);
197
198 while(idx < target) {
199 hashTable3[ZSTD_hash3Ptr(base+idx, hashLog3)] = idx;
200 idx++;
201 }
202
203 return hashTable3[hash3];
204 }
205
206
207 /*-*************************************
208 * Binary Tree search
209 ***************************************/
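/* In this search, chainTable doubles as a binary tree of past positions :
 * each indexed position owns two slots (bt + 2*(index & btMask) and the one
 * after it), pointing at positions whose suffixes sort lexicographically
 * smaller resp. larger. ZSTD_insertBtAndGetAllMatches() below walks that
 * tree, appends every match improving on the best length so far to
 * matches[], and returns the number found. */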
210 static U32 ZSTD_insertBtAndGetAllMatches (
211 ZSTD_CCtx* zc,
212 const BYTE* const ip, const BYTE* const iLimit,
213 U32 nbCompares, const U32 mls,
214 U32 extDict, ZSTD_match_t* matches, const U32 minMatchLen)
215 {
216 const BYTE* const base = zc->base;
217 const U32 current = (U32)(ip-base);
218 const U32 hashLog = zc->params.cParams.hashLog;
219 const size_t h = ZSTD_hashPtr(ip, hashLog, mls);
220 U32* const hashTable = zc->hashTable;
221 U32 matchIndex = hashTable[h];
222 U32* const bt = zc->chainTable;
223 const U32 btLog = zc->params.cParams.chainLog - 1;
224 const U32 btMask= (1U << btLog) - 1;
225 size_t commonLengthSmaller=0, commonLengthLarger=0;
226 const BYTE* const dictBase = zc->dictBase;
227 const U32 dictLimit = zc->dictLimit;
228 const BYTE* const dictEnd = dictBase + dictLimit;
229 const BYTE* const prefixStart = base + dictLimit;
230 const U32 btLow = btMask >= current ? 0 : current - btMask;
231 const U32 windowLow = zc->lowLimit;
232 U32* smallerPtr = bt + 2*(current&btMask);
233 U32* largerPtr = bt + 2*(current&btMask) + 1;
234 U32 matchEndIdx = current+8;
235 U32 dummy32; /* to be nullified at the end */
236 U32 mnum = 0;
237
238 const U32 minMatch = (mls == 3) ? 3 : 4;
239 size_t bestLength = minMatchLen-1;
240
241 if (minMatch == 3) { /* HC3 match finder */
242 U32 const matchIndex3 = ZSTD_insertAndFindFirstIndexHash3 (zc, ip);
243 if (matchIndex3>windowLow && (current - matchIndex3 < (1<<18))) {
244 const BYTE* match;
245 size_t currentMl=0;
246 if ((!extDict) || matchIndex3 >= dictLimit) {
247 match = base + matchIndex3;
248 if (match[bestLength] == ip[bestLength]) currentMl = ZSTD_count(ip, match, iLimit);
249 } else {
250 match = dictBase + matchIndex3;
251 if (MEM_readMINMATCH(match, MINMATCH) == MEM_readMINMATCH(ip, MINMATCH)) /* assumption : matchIndex3 <= dictLimit-4 (by table construction) */
252 currentMl = ZSTD_count_2segments(ip+MINMATCH, match+MINMATCH, iLimit, dictEnd, prefixStart) + MINMATCH;
253 }
254
255 /* save best solution */
256 if (currentMl > bestLength) {
257 bestLength = currentMl;
258 matches[mnum].off = ZSTD_REP_MOVE_OPT + current - matchIndex3;
259 matches[mnum].len = (U32)currentMl;
260 mnum++;
261 if (currentMl > ZSTD_OPT_NUM) goto update;
262 if (ip+currentMl == iLimit) goto update; /* best possible, and avoids read overflow */
263 }
264 }
265 }
266
267 hashTable[h] = current; /* Update Hash Table */
268
269 while (nbCompares-- && (matchIndex > windowLow)) {
270 U32* nextPtr = bt + 2*(matchIndex & btMask);
271 size_t matchLength = MIN(commonLengthSmaller, commonLengthLarger); /* guaranteed minimum nb of common bytes */
272 const BYTE* match;
273
274 if ((!extDict) || (matchIndex+matchLength >= dictLimit)) {
275 match = base + matchIndex;
276 if (match[matchLength] == ip[matchLength]) {
277 matchLength += ZSTD_count(ip+matchLength+1, match+matchLength+1, iLimit) +1;
278 }
279 } else {
280 match = dictBase + matchIndex;
281 matchLength += ZSTD_count_2segments(ip+matchLength, match+matchLength, iLimit, dictEnd, prefixStart);
282 if (matchIndex+matchLength >= dictLimit)
283 match = base + matchIndex; /* to prepare for next usage of match[matchLength] */
284 }
285
286 if (matchLength > bestLength) {
287 if (matchLength > matchEndIdx - matchIndex) matchEndIdx = matchIndex + (U32)matchLength;
288 bestLength = matchLength;
289 matches[mnum].off = ZSTD_REP_MOVE_OPT + current - matchIndex;
290 matches[mnum].len = (U32)matchLength;
291 mnum++;
292 if (matchLength > ZSTD_OPT_NUM) break;
293 if (ip+matchLength == iLimit) /* equal : no way to know if the suffix sorts smaller or larger */
294 break; /* drop, to guarantee consistency (miss a little bit of compression) */
295 }
296
297 if (match[matchLength] < ip[matchLength]) {
298 /* match is smaller than current */
299 *smallerPtr = matchIndex; /* update smaller idx */
300 commonLengthSmaller = matchLength; /* all smaller will now have at least this guaranteed common length */
301 if (matchIndex <= btLow) { smallerPtr=&dummy32; break; } /* beyond tree size, stop the search */
302 smallerPtr = nextPtr+1; /* new "smaller" => larger of match */
303 matchIndex = nextPtr[1]; /* new matchIndex larger than previous (closer to current) */
304 } else {
305 /* match is larger than current */
306 *largerPtr = matchIndex;
307 commonLengthLarger = matchLength;
308 if (matchIndex <= btLow) { largerPtr=&dummy32; break; } /* beyond tree size, stop the search */
309 largerPtr = nextPtr;
310 matchIndex = nextPtr[0];
311 } }
312
313 *smallerPtr = *largerPtr = 0;
314
315 update:
316 zc->nextToUpdate = (matchEndIdx > current + 8) ? matchEndIdx - 8 : current+1;
317 return mnum;
318 }
319
320
321 /** Tree updater, providing best match */
322 static U32 ZSTD_BtGetAllMatches (
323 ZSTD_CCtx* zc,
324 const BYTE* const ip, const BYTE* const iLimit,
325 const U32 maxNbAttempts, const U32 mls, ZSTD_match_t* matches, const U32 minMatchLen)
326 {
327 if (ip < zc->base + zc->nextToUpdate) return 0; /* skipped area */
328 ZSTD_updateTree(zc, ip, iLimit, maxNbAttempts, mls);
329 return ZSTD_insertBtAndGetAllMatches(zc, ip, iLimit, maxNbAttempts, mls, 0, matches, minMatchLen);
330 }
331
332
333 static U32 ZSTD_BtGetAllMatches_selectMLS (
334 ZSTD_CCtx* zc, /* Index table will be updated */
335 const BYTE* ip, const BYTE* const iHighLimit,
336 const U32 maxNbAttempts, const U32 matchLengthSearch, ZSTD_match_t* matches, const U32 minMatchLen)
337 {
338 switch(matchLengthSearch)
339 {
340 case 3 : return ZSTD_BtGetAllMatches(zc, ip, iHighLimit, maxNbAttempts, 3, matches, minMatchLen);
341 default :
342 case 4 : return ZSTD_BtGetAllMatches(zc, ip, iHighLimit, maxNbAttempts, 4, matches, minMatchLen);
343 case 5 : return ZSTD_BtGetAllMatches(zc, ip, iHighLimit, maxNbAttempts, 5, matches, minMatchLen);
344 case 6 : return ZSTD_BtGetAllMatches(zc, ip, iHighLimit, maxNbAttempts, 6, matches, minMatchLen);
345 }
346 }
347
348 /** Tree updater, providing best match */
349 static U32 ZSTD_BtGetAllMatches_extDict (
350 ZSTD_CCtx* zc,
351 const BYTE* const ip, const BYTE* const iLimit,
352 const U32 maxNbAttempts, const U32 mls, ZSTD_match_t* matches, const U32 minMatchLen)
353 {
354 if (ip < zc->base + zc->nextToUpdate) return 0; /* skipped area */
355 ZSTD_updateTree_extDict(zc, ip, iLimit, maxNbAttempts, mls);
356 return ZSTD_insertBtAndGetAllMatches(zc, ip, iLimit, maxNbAttempts, mls, 1, matches, minMatchLen);
357 }
358
359
360 static U32 ZSTD_BtGetAllMatches_selectMLS_extDict (
361 ZSTD_CCtx* zc, /* Index table will be updated */
362 const BYTE* ip, const BYTE* const iHighLimit,
363 const U32 maxNbAttempts, const U32 matchLengthSearch, ZSTD_match_t* matches, const U32 minMatchLen)
364 {
365 switch(matchLengthSearch)
366 {
367 case 3 : return ZSTD_BtGetAllMatches_extDict(zc, ip, iHighLimit, maxNbAttempts, 3, matches, minMatchLen);
368 default :
369 case 4 : return ZSTD_BtGetAllMatches_extDict(zc, ip, iHighLimit, maxNbAttempts, 4, matches, minMatchLen);
370 case 5 : return ZSTD_BtGetAllMatches_extDict(zc, ip, iHighLimit, maxNbAttempts, 5, matches, minMatchLen);
371 case 6 : return ZSTD_BtGetAllMatches_extDict(zc, ip, iHighLimit, maxNbAttempts, 6, matches, minMatchLen);
372 }
373 }
374
375
376 /*-*******************************
377 * Optimal parser
378 *********************************/
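/* In outline, the generic parser below is a dynamic program over the next
 * positions of the block : opt[pos] records the cheapest known way to reach
 * ip+pos (price, match length, offset, literal run, repcodes). Repcode and
 * tree matches seed and relax these entries through SET_PRICE(), and once
 * the frontier last_pos is reached, the chain is walked backwards
 * (_storeSequence) to emit the selected sequences. */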
379 FORCE_INLINE
380 void ZSTD_compressBlock_opt_generic(ZSTD_CCtx* ctx,
381 const void* src, size_t srcSize, const int ultra)
382 {
383 seqStore_t* seqStorePtr = &(ctx->seqStore);
384 const BYTE* const istart = (const BYTE*)src;
385 const BYTE* ip = istart;
386 const BYTE* anchor = istart;
387 const BYTE* const iend = istart + srcSize;
388 const BYTE* const ilimit = iend - 8;
389 const BYTE* const base = ctx->base;
390 const BYTE* const prefixStart = base + ctx->dictLimit;
391
392 const U32 maxSearches = 1U << ctx->params.cParams.searchLog;
393 const U32 sufficient_len = ctx->params.cParams.targetLength;
394 const U32 mls = ctx->params.cParams.searchLength;
395 const U32 minMatch = (ctx->params.cParams.searchLength == 3) ? 3 : 4;
396
397 ZSTD_optimal_t* opt = seqStorePtr->priceTable;
398 ZSTD_match_t* matches = seqStorePtr->matchTable;
399 const BYTE* inr;
400 U32 offset, rep[ZSTD_REP_NUM];
401
402 /* init */
403 ctx->nextToUpdate3 = ctx->nextToUpdate;
404 ZSTD_rescaleFreqs(seqStorePtr);
405 ip += (ip==prefixStart);
406 { U32 i; for (i=0; i<ZSTD_REP_NUM; i++) rep[i]=ctx->rep[i]; }
407
408 /* Match Loop */
409 while (ip < ilimit) {
410 U32 cur, match_num, last_pos, litlen, price;
411 U32 u, mlen, best_mlen, best_off, litLength;
412 memset(opt, 0, sizeof(ZSTD_optimal_t));
413 last_pos = 0;
414 litlen = (U32)(ip - anchor);
415
416 /* check repCode */
417 { U32 i, last_i = ZSTD_REP_CHECK + (ip==anchor);
418 for (i=(ip == anchor); i<last_i; i++) {
419 const S32 repCur = ((i==ZSTD_REP_MOVE_OPT) && (ip==anchor)) ? (rep[0] - 1) : rep[i];
420 if ( (repCur > 0) && (repCur < (S32)(ip-prefixStart))
421 && (MEM_readMINMATCH(ip, minMatch) == MEM_readMINMATCH(ip - repCur, minMatch))) {
422 mlen = (U32)ZSTD_count(ip+minMatch, ip+minMatch-repCur, iend) + minMatch;
423 if (mlen > sufficient_len || mlen >= ZSTD_OPT_NUM) {
424 best_mlen = mlen; best_off = i; cur = 0; last_pos = 1;
425 goto _storeSequence;
426 }
427 best_off = i - (ip == anchor);
428 do {
429 price = ZSTD_getPrice(seqStorePtr, litlen, anchor, best_off, mlen - MINMATCH, ultra);
430 if (mlen > last_pos || price < opt[mlen].price)
431 SET_PRICE(mlen, mlen, i, litlen, price); /* note : macro modifies last_pos */
432 mlen--;
433 } while (mlen >= minMatch);
434 } } }
435
436 match_num = ZSTD_BtGetAllMatches_selectMLS(ctx, ip, iend, maxSearches, mls, matches, minMatch);
437
438 if (!last_pos && !match_num) { ip++; continue; }
439
440 if (match_num && (matches[match_num-1].len > sufficient_len || matches[match_num-1].len >= ZSTD_OPT_NUM)) {
441 best_mlen = matches[match_num-1].len;
442 best_off = matches[match_num-1].off;
443 cur = 0;
444 last_pos = 1;
445 goto _storeSequence;
446 }
447
448 /* set prices using matches at position = 0 */
449 best_mlen = (last_pos) ? last_pos : minMatch;
450 for (u = 0; u < match_num; u++) {
451 mlen = (u>0) ? matches[u-1].len+1 : best_mlen;
452 best_mlen = matches[u].len;
453 while (mlen <= best_mlen) {
454 price = ZSTD_getPrice(seqStorePtr, litlen, anchor, matches[u].off-1, mlen - MINMATCH, ultra);
455 if (mlen > last_pos || price < opt[mlen].price)
456 SET_PRICE(mlen, mlen, matches[u].off, litlen, price); /* note : macro modifies last_pos */
457 mlen++;
458 } }
459
460 if (last_pos < minMatch) { ip++; continue; }
461
462 /* initialize opt[0] */
463 { U32 i ; for (i=0; i<ZSTD_REP_NUM; i++) opt[0].rep[i] = rep[i]; }
464 opt[0].mlen = 1;
465 opt[0].litlen = litlen;
466
467 /* check further positions */
468 for (cur = 1; cur <= last_pos; cur++) {
469 inr = ip + cur;
470
471 if (opt[cur-1].mlen == 1) {
472 litlen = opt[cur-1].litlen + 1;
473 if (cur > litlen) {
474 price = opt[cur - litlen].price + ZSTD_getLiteralPrice(seqStorePtr, litlen, inr-litlen);
475 } else
476 price = ZSTD_getLiteralPrice(seqStorePtr, litlen, anchor);
477 } else {
478 litlen = 1;
479 price = opt[cur - 1].price + ZSTD_getLiteralPrice(seqStorePtr, litlen, inr-1);
480 }
481
482 if (cur > last_pos || price <= opt[cur].price)
483 SET_PRICE(cur, 1, 0, litlen, price);
484
485 if (cur == last_pos) break;
486
487 if (inr > ilimit) /* last match must start at a minimum distance of 8 from oend */
488 continue;
489
490 mlen = opt[cur].mlen;
491 if (opt[cur].off > ZSTD_REP_MOVE_OPT) {
492 opt[cur].rep[2] = opt[cur-mlen].rep[1];
493 opt[cur].rep[1] = opt[cur-mlen].rep[0];
494 opt[cur].rep[0] = opt[cur].off - ZSTD_REP_MOVE_OPT;
495 } else {
496 opt[cur].rep[2] = (opt[cur].off > 1) ? opt[cur-mlen].rep[1] : opt[cur-mlen].rep[2];
497 opt[cur].rep[1] = (opt[cur].off > 0) ? opt[cur-mlen].rep[0] : opt[cur-mlen].rep[1];
498 opt[cur].rep[0] = ((opt[cur].off==ZSTD_REP_MOVE_OPT) && (mlen != 1)) ? (opt[cur-mlen].rep[0] - 1) : (opt[cur-mlen].rep[opt[cur].off]);
499 }
500
501 best_mlen = minMatch;
502 { U32 i, last_i = ZSTD_REP_CHECK + (mlen != 1);
503 for (i=(opt[cur].mlen != 1); i<last_i; i++) { /* check rep */
504 const S32 repCur = ((i==ZSTD_REP_MOVE_OPT) && (opt[cur].mlen != 1)) ? (opt[cur].rep[0] - 1) : opt[cur].rep[i];
505 if ( (repCur > 0) && (repCur < (S32)(inr-prefixStart))
506 && (MEM_readMINMATCH(inr, minMatch) == MEM_readMINMATCH(inr - repCur, minMatch))) {
507 mlen = (U32)ZSTD_count(inr+minMatch, inr+minMatch - repCur, iend) + minMatch;
508
509 if (mlen > sufficient_len || cur + mlen >= ZSTD_OPT_NUM) {
510 best_mlen = mlen; best_off = i; last_pos = cur + 1;
511 goto _storeSequence;
512 }
513
514 best_off = i - (opt[cur].mlen != 1);
515 if (mlen > best_mlen) best_mlen = mlen;
516
517 do {
518 if (opt[cur].mlen == 1) {
519 litlen = opt[cur].litlen;
520 if (cur > litlen) {
521 price = opt[cur - litlen].price + ZSTD_getPrice(seqStorePtr, litlen, inr-litlen, best_off, mlen - MINMATCH, ultra);
522 } else
523 price = ZSTD_getPrice(seqStorePtr, litlen, anchor, best_off, mlen - MINMATCH, ultra);
524 } else {
525 litlen = 0;
526 price = opt[cur].price + ZSTD_getPrice(seqStorePtr, 0, NULL, best_off, mlen - MINMATCH, ultra);
527 }
528
529 if (cur + mlen > last_pos || price <= opt[cur + mlen].price)
530 SET_PRICE(cur + mlen, mlen, i, litlen, price);
531 mlen--;
532 } while (mlen >= minMatch);
533 } } }
534
535 match_num = ZSTD_BtGetAllMatches_selectMLS(ctx, inr, iend, maxSearches, mls, matches, best_mlen);
536
537 if (match_num > 0 && (matches[match_num-1].len > sufficient_len || cur + matches[match_num-1].len >= ZSTD_OPT_NUM)) {
538 best_mlen = matches[match_num-1].len;
539 best_off = matches[match_num-1].off;
540 last_pos = cur + 1;
541 goto _storeSequence;
542 }
543
544 /* set prices using matches at position = cur */
545 for (u = 0; u < match_num; u++) {
546 mlen = (u>0) ? matches[u-1].len+1 : best_mlen;
547 best_mlen = matches[u].len;
548
549 while (mlen <= best_mlen) {
550 if (opt[cur].mlen == 1) {
551 litlen = opt[cur].litlen;
552 if (cur > litlen)
553 price = opt[cur - litlen].price + ZSTD_getPrice(seqStorePtr, litlen, ip+cur-litlen, matches[u].off-1, mlen - MINMATCH, ultra);
554 else
555 price = ZSTD_getPrice(seqStorePtr, litlen, anchor, matches[u].off-1, mlen - MINMATCH, ultra);
556 } else {
557 litlen = 0;
558 price = opt[cur].price + ZSTD_getPrice(seqStorePtr, 0, NULL, matches[u].off-1, mlen - MINMATCH, ultra);
559 }
560
561 if (cur + mlen > last_pos || (price < opt[cur + mlen].price))
562 SET_PRICE(cur + mlen, mlen, matches[u].off, litlen, price);
563
564 mlen++;
565 } } }
566
567 best_mlen = opt[last_pos].mlen;
568 best_off = opt[last_pos].off;
569 cur = last_pos - best_mlen;
570
571 /* store sequence */
572 _storeSequence: /* cur, last_pos, best_mlen, best_off have to be set */
573 opt[0].mlen = 1;
574
575 while (1) {
576 mlen = opt[cur].mlen;
577 offset = opt[cur].off;
578 opt[cur].mlen = best_mlen;
579 opt[cur].off = best_off;
580 best_mlen = mlen;
581 best_off = offset;
582 if (mlen > cur) break;
583 cur -= mlen;
584 }
585
586 for (u = 0; u <= last_pos;) {
587 u += opt[u].mlen;
588 }
589
590 for (cur=0; cur < last_pos; ) {
591 mlen = opt[cur].mlen;
592 if (mlen == 1) { ip++; cur++; continue; }
593 offset = opt[cur].off;
594 cur += mlen;
595 litLength = (U32)(ip - anchor);
596
597 if (offset > ZSTD_REP_MOVE_OPT) {
598 rep[2] = rep[1];
599 rep[1] = rep[0];
600 rep[0] = offset - ZSTD_REP_MOVE_OPT;
601 offset--;
602 } else {
603 if (offset != 0) {
604 best_off = ((offset==ZSTD_REP_MOVE_OPT) && (litLength==0)) ? (rep[0] - 1) : (rep[offset]);
605 if (offset != 1) rep[2] = rep[1];
606 rep[1] = rep[0];
607 rep[0] = best_off;
608 }
609 if (litLength==0) offset--;
610 }
611
612 ZSTD_updatePrice(seqStorePtr, litLength, anchor, offset, mlen-MINMATCH);
613 ZSTD_storeSeq(seqStorePtr, litLength, anchor, offset, mlen-MINMATCH);
614 anchor = ip = ip + mlen;
615 } } /* for (cur=0; cur < last_pos; ) */
616
617 /* Save reps for next block */
618 { int i; for (i=0; i<ZSTD_REP_NUM; i++) ctx->savedRep[i] = rep[i]; }
619
620 /* Last Literals */
621 { size_t const lastLLSize = iend - anchor;
622 memcpy(seqStorePtr->lit, anchor, lastLLSize);
623 seqStorePtr->lit += lastLLSize;
624 }
625 }
626
627
628 FORCE_INLINE
629 void ZSTD_compressBlock_opt_extDict_generic(ZSTD_CCtx* ctx,
630 const void* src, size_t srcSize, const int ultra)
631 {
632 seqStore_t* seqStorePtr = &(ctx->seqStore);
633 const BYTE* const istart = (const BYTE*)src;
634 const BYTE* ip = istart;
635 const BYTE* anchor = istart;
636 const BYTE* const iend = istart + srcSize;
637 const BYTE* const ilimit = iend - 8;
638 const BYTE* const base = ctx->base;
639 const U32 lowestIndex = ctx->lowLimit;
640 const U32 dictLimit = ctx->dictLimit;
641 const BYTE* const prefixStart = base + dictLimit;
642 const BYTE* const dictBase = ctx->dictBase;
643 const BYTE* const dictEnd = dictBase + dictLimit;
644
645 const U32 maxSearches = 1U << ctx->params.cParams.searchLog;
646 const U32 sufficient_len = ctx->params.cParams.targetLength;
647 const U32 mls = ctx->params.cParams.searchLength;
648 const U32 minMatch = (ctx->params.cParams.searchLength == 3) ? 3 : 4;
649
650 ZSTD_optimal_t* opt = seqStorePtr->priceTable;
651 ZSTD_match_t* matches = seqStorePtr->matchTable;
652 const BYTE* inr;
653
654 /* init */
655 U32 offset, rep[ZSTD_REP_NUM];
656 { U32 i; for (i=0; i<ZSTD_REP_NUM; i++) rep[i]=ctx->rep[i]; }
657
658 ctx->nextToUpdate3 = ctx->nextToUpdate;
659 ZSTD_rescaleFreqs(seqStorePtr);
660 ip += (ip==prefixStart);
661
662 /* Match Loop */
663 while (ip < ilimit) {
664 U32 cur, match_num, last_pos, litlen, price;
665 U32 u, mlen, best_mlen, best_off, litLength;
666 U32 current = (U32)(ip-base);
667 memset(opt, 0, sizeof(ZSTD_optimal_t));
668 last_pos = 0;
669 opt[0].litlen = (U32)(ip - anchor);
670
671 /* check repCode */
672 { U32 i, last_i = ZSTD_REP_CHECK + (ip==anchor);
673 for (i = (ip==anchor); i<last_i; i++) {
674 const S32 repCur = ((i==ZSTD_REP_MOVE_OPT) && (ip==anchor)) ? (rep[0] - 1) : rep[i];
675 const U32 repIndex = (U32)(current - repCur);
676 const BYTE* const repBase = repIndex < dictLimit ? dictBase : base;
677 const BYTE* const repMatch = repBase + repIndex;
678 if ( (repCur > 0 && repCur <= (S32)current)
679 && (((U32)((dictLimit-1) - repIndex) >= 3) & (repIndex>lowestIndex)) /* intentional overflow */
680 && (MEM_readMINMATCH(ip, minMatch) == MEM_readMINMATCH(repMatch, minMatch)) ) {
681 /* repcode detected, we should take it */
682 const BYTE* const repEnd = repIndex < dictLimit ? dictEnd : iend;
683 mlen = (U32)ZSTD_count_2segments(ip+minMatch, repMatch+minMatch, iend, repEnd, prefixStart) + minMatch;
684
685 if (mlen > sufficient_len || mlen >= ZSTD_OPT_NUM) {
686 best_mlen = mlen; best_off = i; cur = 0; last_pos = 1;
687 goto _storeSequence;
688 }
689
690 best_off = i - (ip==anchor);
691 litlen = opt[0].litlen;
692 do {
693 price = ZSTD_getPrice(seqStorePtr, litlen, anchor, best_off, mlen - MINMATCH, ultra);
694 if (mlen > last_pos || price < opt[mlen].price)
695 SET_PRICE(mlen, mlen, i, litlen, price); /* note : macro modifies last_pos */
696 mlen--;
697 } while (mlen >= minMatch);
698 } } }
699
700 match_num = ZSTD_BtGetAllMatches_selectMLS_extDict(ctx, ip, iend, maxSearches, mls, matches, minMatch); /* first search (depth 0) */
701
702 if (!last_pos && !match_num) { ip++; continue; }
703
704 { U32 i; for (i=0; i<ZSTD_REP_NUM; i++) opt[0].rep[i] = rep[i]; }
705 opt[0].mlen = 1;
706
707 if (match_num && (matches[match_num-1].len > sufficient_len || matches[match_num-1].len >= ZSTD_OPT_NUM)) {
708 best_mlen = matches[match_num-1].len;
709 best_off = matches[match_num-1].off;
710 cur = 0;
711 last_pos = 1;
712 goto _storeSequence;
713 }
714
715 best_mlen = (last_pos) ? last_pos : minMatch;
716
717 /* set prices using matches at position = 0 */
718 for (u = 0; u < match_num; u++) {
719 mlen = (u>0) ? matches[u-1].len+1 : best_mlen;
720 best_mlen = matches[u].len;
721 litlen = opt[0].litlen;
722 while (mlen <= best_mlen) {
723 price = ZSTD_getPrice(seqStorePtr, litlen, anchor, matches[u].off-1, mlen - MINMATCH, ultra);
724 if (mlen > last_pos || price < opt[mlen].price)
725 SET_PRICE(mlen, mlen, matches[u].off, litlen, price);
726 mlen++;
727 } }
728
729 if (last_pos < minMatch) {
730 ip++; continue;
731 }
732
733 /* check further positions */
734 for (cur = 1; cur <= last_pos; cur++) {
735 inr = ip + cur;
736
737 if (opt[cur-1].mlen == 1) {
738 litlen = opt[cur-1].litlen + 1;
739 if (cur > litlen) {
740 price = opt[cur - litlen].price + ZSTD_getLiteralPrice(seqStorePtr, litlen, inr-litlen);
741 } else
742 price = ZSTD_getLiteralPrice(seqStorePtr, litlen, anchor);
743 } else {
744 litlen = 1;
745 price = opt[cur - 1].price + ZSTD_getLiteralPrice(seqStorePtr, litlen, inr-1);
746 }
747
748 if (cur > last_pos || price <= opt[cur].price)
749 SET_PRICE(cur, 1, 0, litlen, price);
750
751 if (cur == last_pos) break;
752
753 if (inr > ilimit) /* last match must start at a minimum distance of 8 from oend */
754 continue;
755
756 mlen = opt[cur].mlen;
757 if (opt[cur].off > ZSTD_REP_MOVE_OPT) {
758 opt[cur].rep[2] = opt[cur-mlen].rep[1];
759 opt[cur].rep[1] = opt[cur-mlen].rep[0];
760 opt[cur].rep[0] = opt[cur].off - ZSTD_REP_MOVE_OPT;
761 } else {
762 opt[cur].rep[2] = (opt[cur].off > 1) ? opt[cur-mlen].rep[1] : opt[cur-mlen].rep[2];
763 opt[cur].rep[1] = (opt[cur].off > 0) ? opt[cur-mlen].rep[0] : opt[cur-mlen].rep[1];
764 opt[cur].rep[0] = ((opt[cur].off==ZSTD_REP_MOVE_OPT) && (mlen != 1)) ? (opt[cur-mlen].rep[0] - 1) : (opt[cur-mlen].rep[opt[cur].off]);
765 }
766
767 best_mlen = minMatch;
768 { U32 i, last_i = ZSTD_REP_CHECK + (mlen != 1);
769 for (i = (mlen != 1); i<last_i; i++) {
770 const S32 repCur = ((i==ZSTD_REP_MOVE_OPT) && (opt[cur].mlen != 1)) ? (opt[cur].rep[0] - 1) : opt[cur].rep[i];
771 const U32 repIndex = (U32)(current+cur - repCur);
772 const BYTE* const repBase = repIndex < dictLimit ? dictBase : base;
773 const BYTE* const repMatch = repBase + repIndex;
774 if ( (repCur > 0 && repCur <= (S32)(current+cur))
775 && (((U32)((dictLimit-1) - repIndex) >= 3) & (repIndex>lowestIndex)) /* intentional overflow */
776 && (MEM_readMINMATCH(inr, minMatch) == MEM_readMINMATCH(repMatch, minMatch)) ) {
777 /* repcode detected */
778 const BYTE* const repEnd = repIndex < dictLimit ? dictEnd : iend;
779 mlen = (U32)ZSTD_count_2segments(inr+minMatch, repMatch+minMatch, iend, repEnd, prefixStart) + minMatch;
780
781 if (mlen > sufficient_len || cur + mlen >= ZSTD_OPT_NUM) {
782 best_mlen = mlen; best_off = i; last_pos = cur + 1;
783 goto _storeSequence;
784 }
785
786 best_off = i - (opt[cur].mlen != 1);
787 if (mlen > best_mlen) best_mlen = mlen;
788
789 do {
790 if (opt[cur].mlen == 1) {
791 litlen = opt[cur].litlen;
792 if (cur > litlen) {
793 price = opt[cur - litlen].price + ZSTD_getPrice(seqStorePtr, litlen, inr-litlen, best_off, mlen - MINMATCH, ultra);
794 } else
795 price = ZSTD_getPrice(seqStorePtr, litlen, anchor, best_off, mlen - MINMATCH, ultra);
796 } else {
797 litlen = 0;
798 price = opt[cur].price + ZSTD_getPrice(seqStorePtr, 0, NULL, best_off, mlen - MINMATCH, ultra);
799 }
800
801 if (cur + mlen > last_pos || price <= opt[cur + mlen].price)
802 SET_PRICE(cur + mlen, mlen, i, litlen, price);
803 mlen--;
804 } while (mlen >= minMatch);
805 } } }
806
807 match_num = ZSTD_BtGetAllMatches_selectMLS_extDict(ctx, inr, iend, maxSearches, mls, matches, minMatch);
808
809 if (match_num > 0 && matches[match_num-1].len > sufficient_len) {
810 best_mlen = matches[match_num-1].len;
811 best_off = matches[match_num-1].off;
812 last_pos = cur + 1;
813 goto _storeSequence;
814 }
815
816 /* set prices using matches at position = cur */
817 for (u = 0; u < match_num; u++) {
818 mlen = (u>0) ? matches[u-1].len+1 : best_mlen;
819 best_mlen = (cur + matches[u].len < ZSTD_OPT_NUM) ? matches[u].len : ZSTD_OPT_NUM - cur;
820
821 while (mlen <= best_mlen) {
822 if (opt[cur].mlen == 1) {
823 litlen = opt[cur].litlen;
824 if (cur > litlen)
825 price = opt[cur - litlen].price + ZSTD_getPrice(seqStorePtr, litlen, ip+cur-litlen, matches[u].off-1, mlen - MINMATCH, ultra);
826 else
827 price = ZSTD_getPrice(seqStorePtr, litlen, anchor, matches[u].off-1, mlen - MINMATCH, ultra);
828 } else {
829 litlen = 0;
830 price = opt[cur].price + ZSTD_getPrice(seqStorePtr, 0, NULL, matches[u].off-1, mlen - MINMATCH, ultra);
831 }
832
833 if (cur + mlen > last_pos || (price < opt[cur + mlen].price))
834 SET_PRICE(cur + mlen, mlen, matches[u].off, litlen, price);
835
836 mlen++;
837 } } } /* for (cur = 1; cur <= last_pos; cur++) */
838
839 best_mlen = opt[last_pos].mlen;
840 best_off = opt[last_pos].off;
841 cur = last_pos - best_mlen;
842
843 /* store sequence */
844 _storeSequence: /* cur, last_pos, best_mlen, best_off have to be set */
845 opt[0].mlen = 1;
846
847 while (1) {
848 mlen = opt[cur].mlen;
849 offset = opt[cur].off;
850 opt[cur].mlen = best_mlen;
851 opt[cur].off = best_off;
852 best_mlen = mlen;
853 best_off = offset;
854 if (mlen > cur) break;
855 cur -= mlen;
856 }
857
858 for (u = 0; u <= last_pos; ) {
859 u += opt[u].mlen;
860 }
861
862 for (cur=0; cur < last_pos; ) {
863 mlen = opt[cur].mlen;
864 if (mlen == 1) { ip++; cur++; continue; }
865 offset = opt[cur].off;
866 cur += mlen;
867 litLength = (U32)(ip - anchor);
868
869 if (offset > ZSTD_REP_MOVE_OPT) {
870 rep[2] = rep[1];
871 rep[1] = rep[0];
872 rep[0] = offset - ZSTD_REP_MOVE_OPT;
873 offset--;
874 } else {
875 if (offset != 0) {
876 best_off = ((offset==ZSTD_REP_MOVE_OPT) && (litLength==0)) ? (rep[0] - 1) : (rep[offset]);
877 if (offset != 1) rep[2] = rep[1];
878 rep[1] = rep[0];
879 rep[0] = best_off;
880 }
881
882 if (litLength==0) offset--;
883 }
884
885 ZSTD_updatePrice(seqStorePtr, litLength, anchor, offset, mlen-MINMATCH);
886 ZSTD_storeSeq(seqStorePtr, litLength, anchor, offset, mlen-MINMATCH);
887 anchor = ip = ip + mlen;
888 } } /* for (cur=0; cur < last_pos; ) */
889
890 /* Save reps for next block */
891 { int i; for (i=0; i<ZSTD_REP_NUM; i++) ctx->savedRep[i] = rep[i]; }
892
893 /* Last Literals */
894 { size_t lastLLSize = iend - anchor;
895 memcpy(seqStorePtr->lit, anchor, lastLLSize);
896 seqStorePtr->lit += lastLLSize;
897 }
898 }
899
900 #endif /* ZSTD_OPT_H_91842398743 */
@@ -0,0 +1,883 b''
1 /* ******************************************************************
2 Huffman decoder, part of New Generation Entropy library
3 Copyright (C) 2013-2016, Yann Collet.
4
5 BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
6
7 Redistribution and use in source and binary forms, with or without
8 modification, are permitted provided that the following conditions are
9 met:
10
11 * Redistributions of source code must retain the above copyright
12 notice, this list of conditions and the following disclaimer.
13 * Redistributions in binary form must reproduce the above
14 copyright notice, this list of conditions and the following disclaimer
15 in the documentation and/or other materials provided with the
16 distribution.
17
18 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
19 "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
20 LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
21 A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
22 OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
23 SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
24 LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
25 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
26 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
27 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
28 OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
29
30 You can contact the author at :
31 - FSE+HUF source repository : https://github.com/Cyan4973/FiniteStateEntropy
32 - Public forum : https://groups.google.com/forum/#!forum/lz4c
33 ****************************************************************** */
34
35 /* **************************************************************
36 * Compiler specifics
37 ****************************************************************/
38 #if defined (__cplusplus) || (defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) /* C99 */)
39 /* inline is defined */
40 #elif defined(_MSC_VER) || defined(__GNUC__)
41 # define inline __inline
42 #else
43 # define inline /* disable inline */
44 #endif
45
46 #ifdef _MSC_VER /* Visual Studio */
47 # pragma warning(disable : 4127) /* disable: C4127: conditional expression is constant */
48 #endif
49
50
51 /* **************************************************************
52 * Dependencies
53 ****************************************************************/
54 #include <string.h> /* memcpy, memset */
55 #include "bitstream.h" /* BIT_* */
56 #include "fse.h" /* header compression */
57 #define HUF_STATIC_LINKING_ONLY
58 #include "huf.h"
59
60
61 /* **************************************************************
62 * Error Management
63 ****************************************************************/
64 #define HUF_STATIC_ASSERT(c) { enum { HUF_static_assert = 1/(int)(!!(c)) }; } /* use only *after* variable declarations */
65
66
67 /*-***************************/
68 /* generic DTableDesc */
69 /*-***************************/
70
71 typedef struct { BYTE maxTableLog; BYTE tableType; BYTE tableLog; BYTE reserved; } DTableDesc;
72
73 static DTableDesc HUF_getDTableDesc(const HUF_DTable* table)
74 {
75 DTableDesc dtd;
76 memcpy(&dtd, table, sizeof(dtd));
77 return dtd;
78 }
79
80
81 /*-***************************/
82 /* single-symbol decoding */
83 /*-***************************/
84
85 typedef struct { BYTE byte; BYTE nbBits; } HUF_DEltX2; /* single-symbol decoding */
86
87 size_t HUF_readDTableX2 (HUF_DTable* DTable, const void* src, size_t srcSize)
88 {
89 BYTE huffWeight[HUF_SYMBOLVALUE_MAX + 1];
90 U32 rankVal[HUF_TABLELOG_ABSOLUTEMAX + 1]; /* large enough for values from 0 to 16 */
91 U32 tableLog = 0;
92 U32 nbSymbols = 0;
93 size_t iSize;
94 void* const dtPtr = DTable + 1;
95 HUF_DEltX2* const dt = (HUF_DEltX2*)dtPtr;
96
97 HUF_STATIC_ASSERT(sizeof(DTableDesc) == sizeof(HUF_DTable));
98 /* memset(huffWeight, 0, sizeof(huffWeight)); */ /* is not necessary, even though some analyzers complain... */
99
100 iSize = HUF_readStats(huffWeight, HUF_SYMBOLVALUE_MAX + 1, rankVal, &nbSymbols, &tableLog, src, srcSize);
101 if (HUF_isError(iSize)) return iSize;
102
103 /* Table header */
104 { DTableDesc dtd = HUF_getDTableDesc(DTable);
105 if (tableLog > (U32)(dtd.maxTableLog+1)) return ERROR(tableLog_tooLarge); /* DTable is too small; the Huffman tree cannot fit */
106 dtd.tableType = 0;
107 dtd.tableLog = (BYTE)tableLog;
108 memcpy(DTable, &dtd, sizeof(dtd));
109 }
110
111 /* Prepare ranks */
112 { U32 n, nextRankStart = 0;
113 for (n=1; n<tableLog+1; n++) {
114 U32 current = nextRankStart;
115 nextRankStart += (rankVal[n] << (n-1));
116 rankVal[n] = current;
117 } }
118
119 /* fill DTable */
120 { U32 n;
121 for (n=0; n<nbSymbols; n++) {
122 U32 const w = huffWeight[n];
123 U32 const length = (1 << w) >> 1;
124 U32 i;
125 HUF_DEltX2 D;
126 D.byte = (BYTE)n; D.nbBits = (BYTE)(tableLog + 1 - w);
127 for (i = rankVal[w]; i < rankVal[w] + length; i++)
128 dt[i] = D;
129 rankVal[w] += length;
130 } }
131
132 return iSize;
133 }
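/* Worked illustration of the fill above (hypothetical values, e.g. tableLog == 6) :
   a symbol of weight w occupies length = (1 << w) >> 1 consecutive DTable cells
   and decodes in nbBits = tableLog + 1 - w bits :
       w == 1  ->  length ==  1, nbBits == 6
       w == 4  ->  length ==  8, nbBits == 3
       w == 6  ->  length == 32, nbBits == 1
   so frequent symbols (high weight) cover more cells and cost fewer bits,
   which is what makes the plain table lookup in HUF_decodeSymbolX2() below
   sufficient to decode a symbol. */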
134
135
136 static BYTE HUF_decodeSymbolX2(BIT_DStream_t* Dstream, const HUF_DEltX2* dt, const U32 dtLog)
137 {
138 size_t const val = BIT_lookBitsFast(Dstream, dtLog); /* note : dtLog >= 1 */
139 BYTE const c = dt[val].byte;
140 BIT_skipBits(Dstream, dt[val].nbBits);
141 return c;
142 }
143
144 #define HUF_DECODE_SYMBOLX2_0(ptr, DStreamPtr) \
145 *ptr++ = HUF_decodeSymbolX2(DStreamPtr, dt, dtLog)
146
147 #define HUF_DECODE_SYMBOLX2_1(ptr, DStreamPtr) \
148 if (MEM_64bits() || (HUF_TABLELOG_MAX<=12)) \
149 HUF_DECODE_SYMBOLX2_0(ptr, DStreamPtr)
150
151 #define HUF_DECODE_SYMBOLX2_2(ptr, DStreamPtr) \
152 if (MEM_64bits()) \
153 HUF_DECODE_SYMBOLX2_0(ptr, DStreamPtr)
154
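/* The three macro variants above trade decode depth against reload frequency :
   _0 always decodes one symbol; _1 decodes one only when the bit container is
   known to hold enough bits (64-bit builds, or tables of at most 12 bits);
   _2 decodes one only on 64-bit builds. On 32-bit builds with larger tables,
   _1 and _2 compile to nothing, so each reload covers fewer symbols. */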
155 static inline size_t HUF_decodeStreamX2(BYTE* p, BIT_DStream_t* const bitDPtr, BYTE* const pEnd, const HUF_DEltX2* const dt, const U32 dtLog)
156 {
157 BYTE* const pStart = p;
158
159 /* up to 4 symbols at a time */
160 while ((BIT_reloadDStream(bitDPtr) == BIT_DStream_unfinished) && (p <= pEnd-4)) {
161 HUF_DECODE_SYMBOLX2_2(p, bitDPtr);
162 HUF_DECODE_SYMBOLX2_1(p, bitDPtr);
163 HUF_DECODE_SYMBOLX2_2(p, bitDPtr);
164 HUF_DECODE_SYMBOLX2_0(p, bitDPtr);
165 }
166
167 /* closer to the end */
168 while ((BIT_reloadDStream(bitDPtr) == BIT_DStream_unfinished) && (p < pEnd))
169 HUF_DECODE_SYMBOLX2_0(p, bitDPtr);
170
171 /* no more data to retrieve from bitstream, hence no need to reload */
172 while (p < pEnd)
173 HUF_DECODE_SYMBOLX2_0(p, bitDPtr);
174
175 return pEnd-pStart;
176 }
177
178 static size_t HUF_decompress1X2_usingDTable_internal(
179 void* dst, size_t dstSize,
180 const void* cSrc, size_t cSrcSize,
181 const HUF_DTable* DTable)
182 {
183 BYTE* op = (BYTE*)dst;
184 BYTE* const oend = op + dstSize;
185 const void* dtPtr = DTable + 1;
186 const HUF_DEltX2* const dt = (const HUF_DEltX2*)dtPtr;
187 BIT_DStream_t bitD;
188 DTableDesc const dtd = HUF_getDTableDesc(DTable);
189 U32 const dtLog = dtd.tableLog;
190
191 { size_t const errorCode = BIT_initDStream(&bitD, cSrc, cSrcSize);
192 if (HUF_isError(errorCode)) return errorCode; }
193
194 HUF_decodeStreamX2(op, &bitD, oend, dt, dtLog);
195
196 /* check */
197 if (!BIT_endOfDStream(&bitD)) return ERROR(corruption_detected);
198
199 return dstSize;
200 }
201
202 size_t HUF_decompress1X2_usingDTable(
203 void* dst, size_t dstSize,
204 const void* cSrc, size_t cSrcSize,
205 const HUF_DTable* DTable)
206 {
207 DTableDesc dtd = HUF_getDTableDesc(DTable);
208 if (dtd.tableType != 0) return ERROR(GENERIC);
209 return HUF_decompress1X2_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable);
210 }
211
212 size_t HUF_decompress1X2_DCtx (HUF_DTable* DCtx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
213 {
214 const BYTE* ip = (const BYTE*) cSrc;
215
216 size_t const hSize = HUF_readDTableX2 (DCtx, cSrc, cSrcSize);
217 if (HUF_isError(hSize)) return hSize;
218 if (hSize >= cSrcSize) return ERROR(srcSize_wrong);
219 ip += hSize; cSrcSize -= hSize;
220
221 return HUF_decompress1X2_usingDTable_internal (dst, dstSize, ip, cSrcSize, DCtx);
222 }
223
224 size_t HUF_decompress1X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
225 {
226 HUF_CREATE_STATIC_DTABLEX2(DTable, HUF_TABLELOG_MAX);
227 return HUF_decompress1X2_DCtx (DTable, dst, dstSize, cSrc, cSrcSize);
228 }
229
230
231 static size_t HUF_decompress4X2_usingDTable_internal(
232 void* dst, size_t dstSize,
233 const void* cSrc, size_t cSrcSize,
234 const HUF_DTable* DTable)
235 {
236 /* Check */
237 if (cSrcSize < 10) return ERROR(corruption_detected); /* strict minimum : jump table + 1 byte per stream */
238
239 { const BYTE* const istart = (const BYTE*) cSrc;
240 BYTE* const ostart = (BYTE*) dst;
241 BYTE* const oend = ostart + dstSize;
242 const void* const dtPtr = DTable + 1;
243 const HUF_DEltX2* const dt = (const HUF_DEltX2*)dtPtr;
244
245 /* Init */
246 BIT_DStream_t bitD1;
247 BIT_DStream_t bitD2;
248 BIT_DStream_t bitD3;
249 BIT_DStream_t bitD4;
250 size_t const length1 = MEM_readLE16(istart);
251 size_t const length2 = MEM_readLE16(istart+2);
252 size_t const length3 = MEM_readLE16(istart+4);
253 size_t const length4 = cSrcSize - (length1 + length2 + length3 + 6);
254 const BYTE* const istart1 = istart + 6; /* jumpTable */
255 const BYTE* const istart2 = istart1 + length1;
256 const BYTE* const istart3 = istart2 + length2;
257 const BYTE* const istart4 = istart3 + length3;
258 const size_t segmentSize = (dstSize+3) / 4;
259 BYTE* const opStart2 = ostart + segmentSize;
260 BYTE* const opStart3 = opStart2 + segmentSize;
261 BYTE* const opStart4 = opStart3 + segmentSize;
262 BYTE* op1 = ostart;
263 BYTE* op2 = opStart2;
264 BYTE* op3 = opStart3;
265 BYTE* op4 = opStart4;
266 U32 endSignal;
267 DTableDesc const dtd = HUF_getDTableDesc(DTable);
268 U32 const dtLog = dtd.tableLog;
269
270 if (length4 > cSrcSize) return ERROR(corruption_detected); /* overflow */
271 { size_t const errorCode = BIT_initDStream(&bitD1, istart1, length1);
272 if (HUF_isError(errorCode)) return errorCode; }
273 { size_t const errorCode = BIT_initDStream(&bitD2, istart2, length2);
274 if (HUF_isError(errorCode)) return errorCode; }
275 { size_t const errorCode = BIT_initDStream(&bitD3, istart3, length3);
276 if (HUF_isError(errorCode)) return errorCode; }
277 { size_t const errorCode = BIT_initDStream(&bitD4, istart4, length4);
278 if (HUF_isError(errorCode)) return errorCode; }
279
280 /* 16-32 symbols per loop (4-8 symbols per stream) */
281 endSignal = BIT_reloadDStream(&bitD1) | BIT_reloadDStream(&bitD2) | BIT_reloadDStream(&bitD3) | BIT_reloadDStream(&bitD4);
282 for ( ; (endSignal==BIT_DStream_unfinished) && (op4<(oend-7)) ; ) {
283 HUF_DECODE_SYMBOLX2_2(op1, &bitD1);
284 HUF_DECODE_SYMBOLX2_2(op2, &bitD2);
285 HUF_DECODE_SYMBOLX2_2(op3, &bitD3);
286 HUF_DECODE_SYMBOLX2_2(op4, &bitD4);
287 HUF_DECODE_SYMBOLX2_1(op1, &bitD1);
288 HUF_DECODE_SYMBOLX2_1(op2, &bitD2);
289 HUF_DECODE_SYMBOLX2_1(op3, &bitD3);
290 HUF_DECODE_SYMBOLX2_1(op4, &bitD4);
291 HUF_DECODE_SYMBOLX2_2(op1, &bitD1);
292 HUF_DECODE_SYMBOLX2_2(op2, &bitD2);
293 HUF_DECODE_SYMBOLX2_2(op3, &bitD3);
294 HUF_DECODE_SYMBOLX2_2(op4, &bitD4);
295 HUF_DECODE_SYMBOLX2_0(op1, &bitD1);
296 HUF_DECODE_SYMBOLX2_0(op2, &bitD2);
297 HUF_DECODE_SYMBOLX2_0(op3, &bitD3);
298 HUF_DECODE_SYMBOLX2_0(op4, &bitD4);
299 endSignal = BIT_reloadDStream(&bitD1) | BIT_reloadDStream(&bitD2) | BIT_reloadDStream(&bitD3) | BIT_reloadDStream(&bitD4);
300 }
301
302 /* check corruption */
303 if (op1 > opStart2) return ERROR(corruption_detected);
304 if (op2 > opStart3) return ERROR(corruption_detected);
305 if (op3 > opStart4) return ERROR(corruption_detected);
306 /* note : op4 is already verified within the main loop */
307
308 /* finish bitStreams one by one */
309 HUF_decodeStreamX2(op1, &bitD1, opStart2, dt, dtLog);
310 HUF_decodeStreamX2(op2, &bitD2, opStart3, dt, dtLog);
311 HUF_decodeStreamX2(op3, &bitD3, opStart4, dt, dtLog);
312 HUF_decodeStreamX2(op4, &bitD4, oend, dt, dtLog);
313
314 /* check */
315 endSignal = BIT_endOfDStream(&bitD1) & BIT_endOfDStream(&bitD2) & BIT_endOfDStream(&bitD3) & BIT_endOfDStream(&bitD4);
316 if (!endSignal) return ERROR(corruption_detected);
317
318 /* decoded size */
319 return dstSize;
320 }
321 }
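/* Layout of the 4-stream input consumed above (reconstructed from the code) :

       +------+------+------+----------+----------+----------+-----------+
       | len1 | len2 | len3 | stream 1 | stream 2 | stream 3 | stream 4  |
       | LE16 | LE16 | LE16 | len1 B   | len2 B   | len3 B   | remainder |
       +------+------+------+----------+----------+----------+-----------+

   length4 is implicit : cSrcSize - 6 - (len1+len2+len3). Each stream
   regenerates one quarter of dst (segmentSize = (dstSize+3)/4), the last
   stream taking whatever remains; decoding 4 streams in lockstep keeps
   several independent dependency chains in flight. */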
322
323
324 size_t HUF_decompress4X2_usingDTable(
325 void* dst, size_t dstSize,
326 const void* cSrc, size_t cSrcSize,
327 const HUF_DTable* DTable)
328 {
329 DTableDesc dtd = HUF_getDTableDesc(DTable);
330 if (dtd.tableType != 0) return ERROR(GENERIC);
331 return HUF_decompress4X2_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable);
332 }
333
334
335 size_t HUF_decompress4X2_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
336 {
337 const BYTE* ip = (const BYTE*) cSrc;
338
339 size_t const hSize = HUF_readDTableX2 (dctx, cSrc, cSrcSize);
340 if (HUF_isError(hSize)) return hSize;
341 if (hSize >= cSrcSize) return ERROR(srcSize_wrong);
342 ip += hSize; cSrcSize -= hSize;
343
344 return HUF_decompress4X2_usingDTable_internal (dst, dstSize, ip, cSrcSize, dctx);
345 }
346
347 size_t HUF_decompress4X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
348 {
349 HUF_CREATE_STATIC_DTABLEX2(DTable, HUF_TABLELOG_MAX);
350 return HUF_decompress4X2_DCtx(DTable, dst, dstSize, cSrc, cSrcSize);
351 }
352
353
354 /* *************************/
355 /* double-symbols decoding */
356 /* *************************/
357 typedef struct { U16 sequence; BYTE nbBits; BYTE length; } HUF_DEltX4; /* double-symbols decoding */
358
359 typedef struct { BYTE symbol; BYTE weight; } sortedSymbol_t;
360
361 static void HUF_fillDTableX4Level2(HUF_DEltX4* DTable, U32 sizeLog, const U32 consumed,
362 const U32* rankValOrigin, const int minWeight,
363 const sortedSymbol_t* sortedSymbols, const U32 sortedListSize,
364 U32 nbBitsBaseline, U16 baseSeq)
365 {
366 HUF_DEltX4 DElt;
367 U32 rankVal[HUF_TABLELOG_ABSOLUTEMAX + 1];
368
369 /* get pre-calculated rankVal */
370 memcpy(rankVal, rankValOrigin, sizeof(rankVal));
371
372 /* fill skipped values */
373 if (minWeight>1) {
374 U32 i, skipSize = rankVal[minWeight];
375 MEM_writeLE16(&(DElt.sequence), baseSeq);
376 DElt.nbBits = (BYTE)(consumed);
377 DElt.length = 1;
378 for (i = 0; i < skipSize; i++)
379 DTable[i] = DElt;
380 }
381
382 /* fill DTable */
383 { U32 s; for (s=0; s<sortedListSize; s++) { /* note : sortedSymbols already skipped */
384 const U32 symbol = sortedSymbols[s].symbol;
385 const U32 weight = sortedSymbols[s].weight;
386 const U32 nbBits = nbBitsBaseline - weight;
387 const U32 length = 1 << (sizeLog-nbBits);
388 const U32 start = rankVal[weight];
389 U32 i = start;
390 const U32 end = start + length;
391
392 MEM_writeLE16(&(DElt.sequence), (U16)(baseSeq + (symbol << 8)));
393 DElt.nbBits = (BYTE)(nbBits + consumed);
394 DElt.length = 2;
395 do { DTable[i++] = DElt; } while (i<end); /* since length >= 1 */
396
397 rankVal[weight] += length;
398 } }
399 }
400
401 typedef U32 rankVal_t[HUF_TABLELOG_ABSOLUTEMAX][HUF_TABLELOG_ABSOLUTEMAX + 1];
402
403 static void HUF_fillDTableX4(HUF_DEltX4* DTable, const U32 targetLog,
404 const sortedSymbol_t* sortedList, const U32 sortedListSize,
405 const U32* rankStart, rankVal_t rankValOrigin, const U32 maxWeight,
406 const U32 nbBitsBaseline)
407 {
408 U32 rankVal[HUF_TABLELOG_ABSOLUTEMAX + 1];
409 const int scaleLog = nbBitsBaseline - targetLog; /* note : targetLog >= srcLog, hence scaleLog <= 1 */
410 const U32 minBits = nbBitsBaseline - maxWeight;
411 U32 s;
412
413 memcpy(rankVal, rankValOrigin, sizeof(rankVal));
414
415 /* fill DTable */
416 for (s=0; s<sortedListSize; s++) {
417 const U16 symbol = sortedList[s].symbol;
418 const U32 weight = sortedList[s].weight;
419 const U32 nbBits = nbBitsBaseline - weight;
420 const U32 start = rankVal[weight];
421 const U32 length = 1 << (targetLog-nbBits);
422
423 if (targetLog-nbBits >= minBits) { /* enough room for a second symbol */
424 U32 sortedRank;
425 int minWeight = nbBits + scaleLog;
426 if (minWeight < 1) minWeight = 1;
427 sortedRank = rankStart[minWeight];
428 HUF_fillDTableX4Level2(DTable+start, targetLog-nbBits, nbBits,
429 rankValOrigin[nbBits], minWeight,
430 sortedList+sortedRank, sortedListSize-sortedRank,
431 nbBitsBaseline, symbol);
432 } else {
433 HUF_DEltX4 DElt;
434 MEM_writeLE16(&(DElt.sequence), symbol);
435 DElt.nbBits = (BYTE)(nbBits);
436 DElt.length = 1;
437 { U32 const end = start + length;
438 U32 u;
439 for (u = start; u < end; u++) DTable[u] = DElt;
440 } }
441 rankVal[weight] += length;
442 }
443 }
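/* Summary of the double-symbol fill above : each HUF_DEltX4 cell stores up to
   two decoded bytes (sequence), the total bits consumed (nbBits), and how many
   bytes are valid (length). Whenever a code leaves enough spare index bits for
   the shortest possible second code (targetLog - nbBits >= minBits),
   HUF_fillDTableX4Level2() packs every possible second symbol behind the
   first; otherwise the cell carries a single symbol. The hot loop can then
   emit two bytes per table lookup most of the time. */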
444
445 size_t HUF_readDTableX4 (HUF_DTable* DTable, const void* src, size_t srcSize)
446 {
447 BYTE weightList[HUF_SYMBOLVALUE_MAX + 1];
448 sortedSymbol_t sortedSymbol[HUF_SYMBOLVALUE_MAX + 1];
449 U32 rankStats[HUF_TABLELOG_ABSOLUTEMAX + 1] = { 0 };
450 U32 rankStart0[HUF_TABLELOG_ABSOLUTEMAX + 2] = { 0 };
451 U32* const rankStart = rankStart0+1;
452 rankVal_t rankVal;
453 U32 tableLog, maxW, sizeOfSort, nbSymbols;
454 DTableDesc dtd = HUF_getDTableDesc(DTable);
455 U32 const maxTableLog = dtd.maxTableLog;
456 size_t iSize;
457 void* dtPtr = DTable+1; /* force compiler to avoid strict-aliasing assumptions */
458 HUF_DEltX4* const dt = (HUF_DEltX4*)dtPtr;
459
460 HUF_STATIC_ASSERT(sizeof(HUF_DEltX4) == sizeof(HUF_DTable)); /* if compilation fails here, assertion is false */
461 if (maxTableLog > HUF_TABLELOG_ABSOLUTEMAX) return ERROR(tableLog_tooLarge);
462 /* memset(weightList, 0, sizeof(weightList)); */ /* is not necessary, even though some analyzers complain... */
463
464 iSize = HUF_readStats(weightList, HUF_SYMBOLVALUE_MAX + 1, rankStats, &nbSymbols, &tableLog, src, srcSize);
465 if (HUF_isError(iSize)) return iSize;
466
467 /* check result */
468 if (tableLog > maxTableLog) return ERROR(tableLog_tooLarge); /* DTable can't fit code depth */
469
470 /* find maxWeight */
471 for (maxW = tableLog; rankStats[maxW]==0; maxW--) {} /* necessarily finds a solution before reaching 0 */
472
473 /* Get start index of each weight */
474 { U32 w, nextRankStart = 0;
475 for (w=1; w<maxW+1; w++) {
476 U32 current = nextRankStart;
477 nextRankStart += rankStats[w];
478 rankStart[w] = current;
479 }
480 rankStart[0] = nextRankStart; /* put all 0w symbols at the end of the sorted list */
481 sizeOfSort = nextRankStart;
482 }
483
484 /* sort symbols by weight */
485 { U32 s;
486 for (s=0; s<nbSymbols; s++) {
487 U32 const w = weightList[s];
488 U32 const r = rankStart[w]++;
489 sortedSymbol[r].symbol = (BYTE)s;
490 sortedSymbol[r].weight = (BYTE)w;
491 }
492 rankStart[0] = 0; /* forget 0w symbols; this is the beginning of weight(1) */
493 }
494
495 /* Build rankVal */
496 { U32* const rankVal0 = rankVal[0];
497 { int const rescale = (maxTableLog-tableLog) - 1; /* tableLog <= maxTableLog */
498 U32 nextRankVal = 0;
499 U32 w;
500 for (w=1; w<maxW+1; w++) {
501 U32 current = nextRankVal;
502 nextRankVal += rankStats[w] << (w+rescale);
503 rankVal0[w] = current;
504 } }
505 { U32 const minBits = tableLog+1 - maxW;
506 U32 consumed;
507 for (consumed = minBits; consumed < maxTableLog - minBits + 1; consumed++) {
508 U32* const rankValPtr = rankVal[consumed];
509 U32 w;
510 for (w = 1; w < maxW+1; w++) {
511 rankValPtr[w] = rankVal0[w] >> consumed;
512 } } } }
513
514 HUF_fillDTableX4(dt, maxTableLog,
515 sortedSymbol, sizeOfSort,
516 rankStart0, rankVal, maxW,
517 tableLog+1);
518
519 dtd.tableLog = (BYTE)maxTableLog;
520 dtd.tableType = 1;
521 memcpy(DTable, &dtd, sizeof(dtd));
522 return iSize;
523 }
524
525
526 static U32 HUF_decodeSymbolX4(void* op, BIT_DStream_t* DStream, const HUF_DEltX4* dt, const U32 dtLog)
527 {
528 size_t const val = BIT_lookBitsFast(DStream, dtLog); /* note : dtLog >= 1 */
529 memcpy(op, dt+val, 2);
530 BIT_skipBits(DStream, dt[val].nbBits);
531 return dt[val].length;
532 }
533
534 static U32 HUF_decodeLastSymbolX4(void* op, BIT_DStream_t* DStream, const HUF_DEltX4* dt, const U32 dtLog)
535 {
536 size_t const val = BIT_lookBitsFast(DStream, dtLog); /* note : dtLog >= 1 */
537 memcpy(op, dt+val, 1);
538 if (dt[val].length==1) BIT_skipBits(DStream, dt[val].nbBits);
539 else {
540 if (DStream->bitsConsumed < (sizeof(DStream->bitContainer)*8)) {
541 BIT_skipBits(DStream, dt[val].nbBits);
542 if (DStream->bitsConsumed > (sizeof(DStream->bitContainer)*8))
543 DStream->bitsConsumed = (sizeof(DStream->bitContainer)*8); /* ugly hack; works only because it's the last symbol. Note : can't easily extract nbBits from just this symbol */
544 } }
545 return 1;
546 }
547
548
549 #define HUF_DECODE_SYMBOLX4_0(ptr, DStreamPtr) \
550 ptr += HUF_decodeSymbolX4(ptr, DStreamPtr, dt, dtLog)
551
552 #define HUF_DECODE_SYMBOLX4_1(ptr, DStreamPtr) \
553 if (MEM_64bits() || (HUF_TABLELOG_MAX<=12)) \
554 ptr += HUF_decodeSymbolX4(ptr, DStreamPtr, dt, dtLog)
555
556 #define HUF_DECODE_SYMBOLX4_2(ptr, DStreamPtr) \
557 if (MEM_64bits()) \
558 ptr += HUF_decodeSymbolX4(ptr, DStreamPtr, dt, dtLog)
559
560 static inline size_t HUF_decodeStreamX4(BYTE* p, BIT_DStream_t* bitDPtr, BYTE* const pEnd, const HUF_DEltX4* const dt, const U32 dtLog)
561 {
562 BYTE* const pStart = p;
563
564 /* up to 8 symbols at a time */
565 while ((BIT_reloadDStream(bitDPtr) == BIT_DStream_unfinished) & (p < pEnd-(sizeof(bitDPtr->bitContainer)-1))) {
566 HUF_DECODE_SYMBOLX4_2(p, bitDPtr);
567 HUF_DECODE_SYMBOLX4_1(p, bitDPtr);
568 HUF_DECODE_SYMBOLX4_2(p, bitDPtr);
569 HUF_DECODE_SYMBOLX4_0(p, bitDPtr);
570 }
571
572 /* closer to end : up to 2 symbols at a time */
573 while ((BIT_reloadDStream(bitDPtr) == BIT_DStream_unfinished) & (p <= pEnd-2))
574 HUF_DECODE_SYMBOLX4_0(p, bitDPtr);
575
576 while (p <= pEnd-2)
577 HUF_DECODE_SYMBOLX4_0(p, bitDPtr); /* no need to reload : reached the end of DStream */
578
579 if (p < pEnd)
580 p += HUF_decodeLastSymbolX4(p, bitDPtr, dt, dtLog);
581
582 return p-pStart;
583 }
584
585
586 static size_t HUF_decompress1X4_usingDTable_internal(
587 void* dst, size_t dstSize,
588 const void* cSrc, size_t cSrcSize,
589 const HUF_DTable* DTable)
590 {
591 BIT_DStream_t bitD;
592
593 /* Init */
594 { size_t const errorCode = BIT_initDStream(&bitD, cSrc, cSrcSize);
595 if (HUF_isError(errorCode)) return errorCode;
596 }
597
598 /* decode */
599 { BYTE* const ostart = (BYTE*) dst;
600 BYTE* const oend = ostart + dstSize;
601 const void* const dtPtr = DTable+1; /* force compiler to not use strict-aliasing */
602 const HUF_DEltX4* const dt = (const HUF_DEltX4*)dtPtr;
603 DTableDesc const dtd = HUF_getDTableDesc(DTable);
604 HUF_decodeStreamX4(ostart, &bitD, oend, dt, dtd.tableLog);
605 }
606
607 /* check */
608 if (!BIT_endOfDStream(&bitD)) return ERROR(corruption_detected);
609
610 /* decoded size */
611 return dstSize;
612 }
613
614 size_t HUF_decompress1X4_usingDTable(
615 void* dst, size_t dstSize,
616 const void* cSrc, size_t cSrcSize,
617 const HUF_DTable* DTable)
618 {
619 DTableDesc dtd = HUF_getDTableDesc(DTable);
620 if (dtd.tableType != 1) return ERROR(GENERIC);
621 return HUF_decompress1X4_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable);
622 }
623
624 size_t HUF_decompress1X4_DCtx (HUF_DTable* DCtx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
625 {
626 const BYTE* ip = (const BYTE*) cSrc;
627
628 size_t const hSize = HUF_readDTableX4 (DCtx, cSrc, cSrcSize);
629 if (HUF_isError(hSize)) return hSize;
630 if (hSize >= cSrcSize) return ERROR(srcSize_wrong);
631 ip += hSize; cSrcSize -= hSize;
632
633 return HUF_decompress1X4_usingDTable_internal (dst, dstSize, ip, cSrcSize, DCtx);
634 }
635
636 size_t HUF_decompress1X4 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
637 {
638 HUF_CREATE_STATIC_DTABLEX4(DTable, HUF_TABLELOG_MAX);
639 return HUF_decompress1X4_DCtx(DTable, dst, dstSize, cSrc, cSrcSize);
640 }
641
642 static size_t HUF_decompress4X4_usingDTable_internal(
643 void* dst, size_t dstSize,
644 const void* cSrc, size_t cSrcSize,
645 const HUF_DTable* DTable)
646 {
647 if (cSrcSize < 10) return ERROR(corruption_detected); /* strict minimum : jump table + 1 byte per stream */
648
649 { const BYTE* const istart = (const BYTE*) cSrc;
650 BYTE* const ostart = (BYTE*) dst;
651 BYTE* const oend = ostart + dstSize;
652 const void* const dtPtr = DTable+1;
653 const HUF_DEltX4* const dt = (const HUF_DEltX4*)dtPtr;
654
655 /* Init */
656 BIT_DStream_t bitD1;
657 BIT_DStream_t bitD2;
658 BIT_DStream_t bitD3;
659 BIT_DStream_t bitD4;
660 size_t const length1 = MEM_readLE16(istart);
661 size_t const length2 = MEM_readLE16(istart+2);
662 size_t const length3 = MEM_readLE16(istart+4);
663 size_t const length4 = cSrcSize - (length1 + length2 + length3 + 6);
664 const BYTE* const istart1 = istart + 6; /* jumpTable */
665 const BYTE* const istart2 = istart1 + length1;
666 const BYTE* const istart3 = istart2 + length2;
667 const BYTE* const istart4 = istart3 + length3;
668 size_t const segmentSize = (dstSize+3) / 4;
669 BYTE* const opStart2 = ostart + segmentSize;
670 BYTE* const opStart3 = opStart2 + segmentSize;
671 BYTE* const opStart4 = opStart3 + segmentSize;
672 BYTE* op1 = ostart;
673 BYTE* op2 = opStart2;
674 BYTE* op3 = opStart3;
675 BYTE* op4 = opStart4;
676 U32 endSignal;
677 DTableDesc const dtd = HUF_getDTableDesc(DTable);
678 U32 const dtLog = dtd.tableLog;
679
680 if (length4 > cSrcSize) return ERROR(corruption_detected); /* overflow */
681 { size_t const errorCode = BIT_initDStream(&bitD1, istart1, length1);
682 if (HUF_isError(errorCode)) return errorCode; }
683 { size_t const errorCode = BIT_initDStream(&bitD2, istart2, length2);
684 if (HUF_isError(errorCode)) return errorCode; }
685 { size_t const errorCode = BIT_initDStream(&bitD3, istart3, length3);
686 if (HUF_isError(errorCode)) return errorCode; }
687 { size_t const errorCode = BIT_initDStream(&bitD4, istart4, length4);
688 if (HUF_isError(errorCode)) return errorCode; }
689
690 /* 16-32 symbols per loop (4-8 symbols per stream) */
691 endSignal = BIT_reloadDStream(&bitD1) | BIT_reloadDStream(&bitD2) | BIT_reloadDStream(&bitD3) | BIT_reloadDStream(&bitD4);
692 for ( ; (endSignal==BIT_DStream_unfinished) & (op4<(oend-(sizeof(bitD4.bitContainer)-1))) ; ) {
693 HUF_DECODE_SYMBOLX4_2(op1, &bitD1);
694 HUF_DECODE_SYMBOLX4_2(op2, &bitD2);
695 HUF_DECODE_SYMBOLX4_2(op3, &bitD3);
696 HUF_DECODE_SYMBOLX4_2(op4, &bitD4);
697 HUF_DECODE_SYMBOLX4_1(op1, &bitD1);
698 HUF_DECODE_SYMBOLX4_1(op2, &bitD2);
699 HUF_DECODE_SYMBOLX4_1(op3, &bitD3);
700 HUF_DECODE_SYMBOLX4_1(op4, &bitD4);
701 HUF_DECODE_SYMBOLX4_2(op1, &bitD1);
702 HUF_DECODE_SYMBOLX4_2(op2, &bitD2);
703 HUF_DECODE_SYMBOLX4_2(op3, &bitD3);
704 HUF_DECODE_SYMBOLX4_2(op4, &bitD4);
705 HUF_DECODE_SYMBOLX4_0(op1, &bitD1);
706 HUF_DECODE_SYMBOLX4_0(op2, &bitD2);
707 HUF_DECODE_SYMBOLX4_0(op3, &bitD3);
708 HUF_DECODE_SYMBOLX4_0(op4, &bitD4);
709
710 endSignal = BIT_reloadDStream(&bitD1) | BIT_reloadDStream(&bitD2) | BIT_reloadDStream(&bitD3) | BIT_reloadDStream(&bitD4);
711 }
712
713 /* check corruption */
714 if (op1 > opStart2) return ERROR(corruption_detected);
715 if (op2 > opStart3) return ERROR(corruption_detected);
716 if (op3 > opStart4) return ERROR(corruption_detected);
717 /* note : op4 is already verified within the main loop */
718
719 /* finish bitStreams one by one */
720 HUF_decodeStreamX4(op1, &bitD1, opStart2, dt, dtLog);
721 HUF_decodeStreamX4(op2, &bitD2, opStart3, dt, dtLog);
722 HUF_decodeStreamX4(op3, &bitD3, opStart4, dt, dtLog);
723 HUF_decodeStreamX4(op4, &bitD4, oend, dt, dtLog);
724
725 /* check */
726 { U32 const endCheck = BIT_endOfDStream(&bitD1) & BIT_endOfDStream(&bitD2) & BIT_endOfDStream(&bitD3) & BIT_endOfDStream(&bitD4);
727 if (!endCheck) return ERROR(corruption_detected); }
728
729 /* decoded size */
730 return dstSize;
731 }
732 }
733
734
735 size_t HUF_decompress4X4_usingDTable(
736 void* dst, size_t dstSize,
737 const void* cSrc, size_t cSrcSize,
738 const HUF_DTable* DTable)
739 {
740 DTableDesc dtd = HUF_getDTableDesc(DTable);
741 if (dtd.tableType != 1) return ERROR(GENERIC);
742 return HUF_decompress4X4_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable);
743 }
744
745
746 size_t HUF_decompress4X4_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
747 {
748 const BYTE* ip = (const BYTE*) cSrc;
749
750 size_t hSize = HUF_readDTableX4 (dctx, cSrc, cSrcSize);
751 if (HUF_isError(hSize)) return hSize;
752 if (hSize >= cSrcSize) return ERROR(srcSize_wrong);
753 ip += hSize; cSrcSize -= hSize;
754
755 return HUF_decompress4X4_usingDTable_internal(dst, dstSize, ip, cSrcSize, dctx);
756 }
757
758 size_t HUF_decompress4X4 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
759 {
760 HUF_CREATE_STATIC_DTABLEX4(DTable, HUF_TABLELOG_MAX);
761 return HUF_decompress4X4_DCtx(DTable, dst, dstSize, cSrc, cSrcSize);
762 }
763
764
765 /* ********************************/
766 /* Generic decompression selector */
767 /* ********************************/
768
769 size_t HUF_decompress1X_usingDTable(void* dst, size_t maxDstSize,
770 const void* cSrc, size_t cSrcSize,
771 const HUF_DTable* DTable)
772 {
773 DTableDesc const dtd = HUF_getDTableDesc(DTable);
774 return dtd.tableType ? HUF_decompress1X4_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable) :
775 HUF_decompress1X2_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable);
776 }
777
778 size_t HUF_decompress4X_usingDTable(void* dst, size_t maxDstSize,
779 const void* cSrc, size_t cSrcSize,
780 const HUF_DTable* DTable)
781 {
782 DTableDesc const dtd = HUF_getDTableDesc(DTable);
783 return dtd.tableType ? HUF_decompress4X4_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable) :
784 HUF_decompress4X2_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable);
785 }
786
787
788 typedef struct { U32 tableTime; U32 decode256Time; } algo_time_t;
789 static const algo_time_t algoTime[16 /* Quantization */][3 /* single, double, quad */] =
790 {
791 /* single, double, quad */
792 {{0,0}, {1,1}, {2,2}}, /* Q==0 : impossible */
793 {{0,0}, {1,1}, {2,2}}, /* Q==1 : impossible */
794 {{ 38,130}, {1313, 74}, {2151, 38}}, /* Q == 2 : 12-18% */
795 {{ 448,128}, {1353, 74}, {2238, 41}}, /* Q == 3 : 18-25% */
796 {{ 556,128}, {1353, 74}, {2238, 47}}, /* Q == 4 : 25-32% */
797 {{ 714,128}, {1418, 74}, {2436, 53}}, /* Q == 5 : 32-38% */
798 {{ 883,128}, {1437, 74}, {2464, 61}}, /* Q == 6 : 38-44% */
799 {{ 897,128}, {1515, 75}, {2622, 68}}, /* Q == 7 : 44-50% */
800 {{ 926,128}, {1613, 75}, {2730, 75}}, /* Q == 8 : 50-56% */
801 {{ 947,128}, {1729, 77}, {3359, 77}}, /* Q == 9 : 56-62% */
802 {{1107,128}, {2083, 81}, {4006, 84}}, /* Q ==10 : 62-69% */
803 {{1177,128}, {2379, 87}, {4785, 88}}, /* Q ==11 : 69-75% */
804 {{1242,128}, {2415, 93}, {5155, 84}}, /* Q ==12 : 75-81% */
805 {{1349,128}, {2644,106}, {5260,106}}, /* Q ==13 : 81-87% */
806 {{1455,128}, {2422,124}, {4174,124}}, /* Q ==14 : 87-93% */
807 {{ 722,128}, {1891,145}, {1936,146}}, /* Q ==15 : 93-99% */
808 };
809
810 /** HUF_selectDecoder() :
811 * Tells which decoder is likely to decode faster,
812 * based on a set of pre-determined metrics.
813 * @return : 0==HUF_decompress4X2, 1==HUF_decompress4X4.
814 * Assumption : 0 < cSrcSize < dstSize <= 128 KB */
815 U32 HUF_selectDecoder (size_t dstSize, size_t cSrcSize)
816 {
817 /* decoder timing evaluation */
818 U32 const Q = (U32)(cSrcSize * 16 / dstSize); /* Q < 16 since dstSize > cSrcSize */
819 U32 const D256 = (U32)(dstSize >> 8);
820 U32 const DTime0 = algoTime[Q][0].tableTime + (algoTime[Q][0].decode256Time * D256);
821 U32 DTime1 = algoTime[Q][1].tableTime + (algoTime[Q][1].decode256Time * D256);
822 DTime1 += DTime1 >> 3; /* advantage to algorithm using less memory, for cache eviction */
823
824 return DTime1 < DTime0;
825 }
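/* Worked example, using the algoTime entries above :
   dstSize = 100000, cSrcSize = 40000
       Q      = 40000 * 16 / 100000 = 6        (ratio bucket : 38-44%)
       D256   = 100000 >> 8         = 390
       DTime0 =  883 + 128 * 390    = 50803    (single-symbol estimate)
       DTime1 = 1437 +  74 * 390    = 30297
       DTime1 += 30297 >> 3         -> 34084   (memory-footprint penalty)
   34084 < 50803, so the function returns 1 : the double-symbol decoder
   is expected to be faster on this input. */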
826
827
828 typedef size_t (*decompressionAlgo)(void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);
829
830 size_t HUF_decompress (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
831 {
832 static const decompressionAlgo decompress[2] = { HUF_decompress4X2, HUF_decompress4X4 };
833
834 /* validation checks */
835 if (dstSize == 0) return ERROR(dstSize_tooSmall);
836 if (cSrcSize > dstSize) return ERROR(corruption_detected); /* invalid */
837 if (cSrcSize == dstSize) { memcpy(dst, cSrc, dstSize); return dstSize; } /* not compressed */
838 if (cSrcSize == 1) { memset(dst, *(const BYTE*)cSrc, dstSize); return dstSize; } /* RLE */
839
840 { U32 const algoNb = HUF_selectDecoder(dstSize, cSrcSize);
841 return decompress[algoNb](dst, dstSize, cSrc, cSrcSize);
842 }
843 }
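/* A minimal usage sketch : the caller must already know the exact regenerated
   size, as the validation checks above require. HUF_example_decode is a
   hypothetical helper, shown for illustration only; HUF_decompress() and
   HUF_isError() come from "huf.h", included above. */
static int HUF_example_decode(void* dst, size_t dstSize,
                              const void* cSrc, size_t cSrcSize)
{   /* hypothetical example helper, not part of the library */
    size_t const r = HUF_decompress(dst, dstSize, cSrc, cSrcSize);
    if (HUF_isError(r)) return -1;   /* corrupted input, or invalid sizes */
    return 0;                        /* on success, r == dstSize */
}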
844
845 size_t HUF_decompress4X_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
846 {
847 /* validation checks */
848 if (dstSize == 0) return ERROR(dstSize_tooSmall);
849 if (cSrcSize > dstSize) return ERROR(corruption_detected); /* invalid */
850 if (cSrcSize == dstSize) { memcpy(dst, cSrc, dstSize); return dstSize; } /* not compressed */
851 if (cSrcSize == 1) { memset(dst, *(const BYTE*)cSrc, dstSize); return dstSize; } /* RLE */
852
853 { U32 const algoNb = HUF_selectDecoder(dstSize, cSrcSize);
854 return algoNb ? HUF_decompress4X4_DCtx(dctx, dst, dstSize, cSrc, cSrcSize) :
855 HUF_decompress4X2_DCtx(dctx, dst, dstSize, cSrc, cSrcSize) ;
856 }
857 }
858
859 size_t HUF_decompress4X_hufOnly (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
860 {
861 /* validation checks */
862 if (dstSize == 0) return ERROR(dstSize_tooSmall);
863 if ((cSrcSize >= dstSize) || (cSrcSize <= 1)) return ERROR(corruption_detected); /* invalid */
864
865 { U32 const algoNb = HUF_selectDecoder(dstSize, cSrcSize);
866 return algoNb ? HUF_decompress4X4_DCtx(dctx, dst, dstSize, cSrc, cSrcSize) :
867 HUF_decompress4X2_DCtx(dctx, dst, dstSize, cSrc, cSrcSize) ;
868 }
869 }
870
871 size_t HUF_decompress1X_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
872 {
873 /* validation checks */
874 if (dstSize == 0) return ERROR(dstSize_tooSmall);
875 if (cSrcSize > dstSize) return ERROR(corruption_detected); /* invalid */
876 if (cSrcSize == dstSize) { memcpy(dst, cSrc, dstSize); return dstSize; } /* not compressed */
877 if (cSrcSize == 1) { memset(dst, *(const BYTE*)cSrc, dstSize); return dstSize; } /* RLE */
878
879 { U32 const algoNb = HUF_selectDecoder(dstSize, cSrcSize);
880 return algoNb ? HUF_decompress1X4_DCtx(dctx, dst, dstSize, cSrc, cSrcSize) :
881 HUF_decompress1X2_DCtx(dctx, dst, dstSize, cSrc, cSrcSize) ;
882 }
883 }
@@ -0,0 +1,252 b''
1 /**
2 * Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
3 * All rights reserved.
4 *
5 * This source code is licensed under the BSD-style license found in the
6 * LICENSE file in the root directory of this source tree. An additional grant
7 * of patent rights can be found in the PATENTS file in the same directory.
8 */
9
10
11
12 /* *************************************
13 * Dependencies
14 ***************************************/
15 #include <stdlib.h>
16 #include "error_private.h"
17 #include "zstd_internal.h" /* MIN, ZSTD_blockHeaderSize, ZSTD_BLOCKSIZE_MAX */
18 #define ZBUFF_STATIC_LINKING_ONLY
19 #include "zbuff.h"
20
21
22 typedef enum { ZBUFFds_init, ZBUFFds_loadHeader,
23 ZBUFFds_read, ZBUFFds_load, ZBUFFds_flush } ZBUFF_dStage;
24
25 /* *** Resource management *** */
26 struct ZBUFF_DCtx_s {
27 ZSTD_DCtx* zd;
28 ZSTD_frameParams fParams;
29 ZBUFF_dStage stage;
30 char* inBuff;
31 size_t inBuffSize;
32 size_t inPos;
33 char* outBuff;
34 size_t outBuffSize;
35 size_t outStart;
36 size_t outEnd;
37 size_t blockSize;
38 BYTE headerBuffer[ZSTD_FRAMEHEADERSIZE_MAX];
39 size_t lhSize;
40 ZSTD_customMem customMem;
41 }; /* typedef'd to ZBUFF_DCtx within "zbuff.h" */
42
43
44 ZBUFF_DCtx* ZBUFF_createDCtx(void)
45 {
46 return ZBUFF_createDCtx_advanced(defaultCustomMem);
47 }
48
49 ZBUFF_DCtx* ZBUFF_createDCtx_advanced(ZSTD_customMem customMem)
50 {
51 ZBUFF_DCtx* zbd;
52
53 if (!customMem.customAlloc && !customMem.customFree)
54 customMem = defaultCustomMem;
55
56 if (!customMem.customAlloc || !customMem.customFree)
57 return NULL;
58
59 zbd = (ZBUFF_DCtx*)customMem.customAlloc(customMem.opaque, sizeof(ZBUFF_DCtx));
60 if (zbd==NULL) return NULL;
61 memset(zbd, 0, sizeof(ZBUFF_DCtx));
62 memcpy(&zbd->customMem, &customMem, sizeof(ZSTD_customMem));
63 zbd->zd = ZSTD_createDCtx_advanced(customMem);
64 if (zbd->zd == NULL) { ZBUFF_freeDCtx(zbd); return NULL; }
65 zbd->stage = ZBUFFds_init;
66 return zbd;
67 }
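/* Sketch of a custom allocator (hypothetical shims; assumes the ZSTD_customMem
   layout { customAlloc, customFree, opaque } from zstd.h's static-linking
   section; malloc/free come from <stdlib.h>, included above) : */
static void* ZBUFF_example_alloc(void* opaque, size_t size) { (void)opaque; return malloc(size); }
static void  ZBUFF_example_free (void* opaque, void* address) { (void)opaque; free(address); }
/* usage :
       ZSTD_customMem const cMem = { ZBUFF_example_alloc, ZBUFF_example_free, NULL };
       ZBUFF_DCtx* const zbd = ZBUFF_createDCtx_advanced(cMem);
*/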
68
69 size_t ZBUFF_freeDCtx(ZBUFF_DCtx* zbd)
70 {
71 if (zbd==NULL) return 0; /* support free on null */
72 ZSTD_freeDCtx(zbd->zd);
73 if (zbd->inBuff) zbd->customMem.customFree(zbd->customMem.opaque, zbd->inBuff);
74 if (zbd->outBuff) zbd->customMem.customFree(zbd->customMem.opaque, zbd->outBuff);
75 zbd->customMem.customFree(zbd->customMem.opaque, zbd);
76 return 0;
77 }
78
79
80 /* *** Initialization *** */
81
82 size_t ZBUFF_decompressInitDictionary(ZBUFF_DCtx* zbd, const void* dict, size_t dictSize)
83 {
84 zbd->stage = ZBUFFds_loadHeader;
85 zbd->lhSize = zbd->inPos = zbd->outStart = zbd->outEnd = 0;
86 return ZSTD_decompressBegin_usingDict(zbd->zd, dict, dictSize);
87 }
88
89 size_t ZBUFF_decompressInit(ZBUFF_DCtx* zbd)
90 {
91 return ZBUFF_decompressInitDictionary(zbd, NULL, 0);
92 }
93
94
95 /* internal util function */
96 MEM_STATIC size_t ZBUFF_limitCopy(void* dst, size_t dstCapacity, const void* src, size_t srcSize)
97 {
98 size_t const length = MIN(dstCapacity, srcSize);
99 memcpy(dst, src, length);
100 return length;
101 }
102
103
104 /* *** Decompression *** */
105
106 size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbd,
107 void* dst, size_t* dstCapacityPtr,
108 const void* src, size_t* srcSizePtr)
109 {
110 const char* const istart = (const char*)src;
111 const char* const iend = istart + *srcSizePtr;
112 const char* ip = istart;
113 char* const ostart = (char*)dst;
114 char* const oend = ostart + *dstCapacityPtr;
115 char* op = ostart;
116 U32 someMoreWork = 1;
117
118 while (someMoreWork) {
119 switch(zbd->stage)
120 {
121 case ZBUFFds_init :
122 return ERROR(init_missing);
123
124 case ZBUFFds_loadHeader :
125 { size_t const hSize = ZSTD_getFrameParams(&(zbd->fParams), zbd->headerBuffer, zbd->lhSize);
126 if (ZSTD_isError(hSize)) return hSize;
127 if (hSize != 0) { /* need more input */
128 size_t const toLoad = hSize - zbd->lhSize; /* if hSize!=0, hSize > zbd->lhSize */
129 if (toLoad > (size_t)(iend-ip)) { /* not enough input to load full header */
130 memcpy(zbd->headerBuffer + zbd->lhSize, ip, iend-ip);
131 zbd->lhSize += iend-ip;
132 *dstCapacityPtr = 0;
133 return (hSize - zbd->lhSize) + ZSTD_blockHeaderSize; /* remaining header bytes + next block header */
134 }
135 memcpy(zbd->headerBuffer + zbd->lhSize, ip, toLoad); zbd->lhSize = hSize; ip += toLoad;
136 break;
137 } }
138
139 /* Consume header */
140 { size_t const h1Size = ZSTD_nextSrcSizeToDecompress(zbd->zd); /* == ZSTD_frameHeaderSize_min */
141 size_t const h1Result = ZSTD_decompressContinue(zbd->zd, NULL, 0, zbd->headerBuffer, h1Size);
142 if (ZSTD_isError(h1Result)) return h1Result; /* should not happen : already checked */
143 if (h1Size < zbd->lhSize) { /* long header */
144 size_t const h2Size = ZSTD_nextSrcSizeToDecompress(zbd->zd);
145 size_t const h2Result = ZSTD_decompressContinue(zbd->zd, NULL, 0, zbd->headerBuffer+h1Size, h2Size);
146 if (ZSTD_isError(h2Result)) return h2Result;
147 } }
148
149 zbd->fParams.windowSize = MAX(zbd->fParams.windowSize, 1U << ZSTD_WINDOWLOG_ABSOLUTEMIN);
150
151 /* Frame header instructs buffer sizes */
152 { size_t const blockSize = MIN(zbd->fParams.windowSize, ZSTD_BLOCKSIZE_ABSOLUTEMAX);
153 size_t const neededOutSize = zbd->fParams.windowSize + blockSize;
154 zbd->blockSize = blockSize;
155 if (zbd->inBuffSize < blockSize) {
156 zbd->customMem.customFree(zbd->customMem.opaque, zbd->inBuff);
157 zbd->inBuffSize = blockSize;
158 zbd->inBuff = (char*)zbd->customMem.customAlloc(zbd->customMem.opaque, blockSize);
159 if (zbd->inBuff == NULL) return ERROR(memory_allocation);
160 }
161 if (zbd->outBuffSize < neededOutSize) {
162 zbd->customMem.customFree(zbd->customMem.opaque, zbd->outBuff);
163 zbd->outBuffSize = neededOutSize;
164 zbd->outBuff = (char*)zbd->customMem.customAlloc(zbd->customMem.opaque, neededOutSize);
165 if (zbd->outBuff == NULL) return ERROR(memory_allocation);
166 } }
167 zbd->stage = ZBUFFds_read;
168 /* pass-through */
169
170 case ZBUFFds_read:
171 { size_t const neededInSize = ZSTD_nextSrcSizeToDecompress(zbd->zd);
172 if (neededInSize==0) { /* end of frame */
173 zbd->stage = ZBUFFds_init;
174 someMoreWork = 0;
175 break;
176 }
177 if ((size_t)(iend-ip) >= neededInSize) { /* decode directly from src */
178 const int isSkipFrame = ZSTD_isSkipFrame(zbd->zd);
179 size_t const decodedSize = ZSTD_decompressContinue(zbd->zd,
180 zbd->outBuff + zbd->outStart, (isSkipFrame ? 0 : zbd->outBuffSize - zbd->outStart),
181 ip, neededInSize);
182 if (ZSTD_isError(decodedSize)) return decodedSize;
183 ip += neededInSize;
184 if (!decodedSize && !isSkipFrame) break; /* this was just a header */
185 zbd->outEnd = zbd->outStart + decodedSize;
186 zbd->stage = ZBUFFds_flush;
187 break;
188 }
189 if (ip==iend) { someMoreWork = 0; break; } /* no more input */
190 zbd->stage = ZBUFFds_load;
191 /* pass-through */
192 }
193
194 case ZBUFFds_load:
195 { size_t const neededInSize = ZSTD_nextSrcSizeToDecompress(zbd->zd);
196 size_t const toLoad = neededInSize - zbd->inPos; /* should always be <= remaining space within inBuff */
197 size_t loadedSize;
198 if (toLoad > zbd->inBuffSize - zbd->inPos) return ERROR(corruption_detected); /* should never happen */
199 loadedSize = ZBUFF_limitCopy(zbd->inBuff + zbd->inPos, toLoad, ip, iend-ip);
200 ip += loadedSize;
201 zbd->inPos += loadedSize;
202 if (loadedSize < toLoad) { someMoreWork = 0; break; } /* not enough input, wait for more */
203
204 /* decode loaded input */
205 { const int isSkipFrame = ZSTD_isSkipFrame(zbd->zd);
206 size_t const decodedSize = ZSTD_decompressContinue(zbd->zd,
207 zbd->outBuff + zbd->outStart, zbd->outBuffSize - zbd->outStart,
208 zbd->inBuff, neededInSize);
209 if (ZSTD_isError(decodedSize)) return decodedSize;
210 zbd->inPos = 0; /* input is consumed */
211 if (!decodedSize && !isSkipFrame) { zbd->stage = ZBUFFds_read; break; } /* this was just a header */
212 zbd->outEnd = zbd->outStart + decodedSize;
213 zbd->stage = ZBUFFds_flush;
214 /* pass-through */
215 } }
216
217 case ZBUFFds_flush:
218 { size_t const toFlushSize = zbd->outEnd - zbd->outStart;
219 size_t const flushedSize = ZBUFF_limitCopy(op, oend-op, zbd->outBuff + zbd->outStart, toFlushSize);
220 op += flushedSize;
221 zbd->outStart += flushedSize;
222 if (flushedSize == toFlushSize) { /* flush completed */
223 zbd->stage = ZBUFFds_read;
224 if (zbd->outStart + zbd->blockSize > zbd->outBuffSize)
225 zbd->outStart = zbd->outEnd = 0;
226 break;
227 }
228 /* cannot flush everything */
229 someMoreWork = 0;
230 break;
231 }
232 default: return ERROR(GENERIC); /* impossible */
233 } }
234
235 /* result */
236 *srcSizePtr = ip-istart;
237 *dstCapacityPtr = op-ostart;
238 { size_t nextSrcSizeHint = ZSTD_nextSrcSizeToDecompress(zbd->zd);
239 if (!nextSrcSizeHint) return (zbd->outEnd != zbd->outStart); /* return 0 only if fully flushed too */
240 nextSrcSizeHint += ZSTD_blockHeaderSize * (ZSTD_nextInputType(zbd->zd) == ZSTDnit_block);
241 if (zbd->inPos > nextSrcSizeHint) return ERROR(GENERIC); /* should never happen */
242 nextSrcSizeHint -= zbd->inPos; /* already loaded */
243 return nextSrcSizeHint;
244 }
245 }
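/* A minimal single-shot streaming sketch. ZBUFF_example_decompress is a
   hypothetical helper; it assumes ZBUFF_isError() from the public zbuff.h
   API and a dst buffer large enough for the whole frame. */
static int ZBUFF_example_decompress(void* dst, size_t dstCapacity,
                                    const void* src, size_t srcSize)
{
    ZBUFF_DCtx* const zbd = ZBUFF_createDCtx();
    const char* ip = (const char*)src;
    char* op = (char*)dst;
    size_t remaining = srcSize;
    int result = -1;
    if (zbd == NULL) return -1;
    if (!ZBUFF_isError(ZBUFF_decompressInit(zbd))) {
        while (remaining) {
            size_t srcChunk = remaining;                                /* in : available; out : consumed */
            size_t dstChunk = dstCapacity - (size_t)(op - (char*)dst);  /* in : capacity;  out : produced */
            size_t const hint = ZBUFF_decompressContinue(zbd, op, &dstChunk, ip, &srcChunk);
            if (ZBUFF_isError(hint)) break;
            ip += srcChunk; remaining -= srcChunk; op += dstChunk;
            if (hint == 0) { result = 0; break; }   /* frame decoded and fully flushed */
            if (!srcChunk && !dstChunk) break;      /* no progress : dst is full */
        }
    }
    ZBUFF_freeDCtx(zbd);
    return result;
}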
246
247
248 /* *************************************
249 * Tool functions
250 ***************************************/
251 size_t ZBUFF_recommendedDInSize(void) { return ZSTD_BLOCKSIZE_ABSOLUTEMAX + ZSTD_blockHeaderSize /* block header size */ ; }
252 size_t ZBUFF_recommendedDOutSize(void) { return ZSTD_BLOCKSIZE_ABSOLUTEMAX; }
@@ -0,0 +1,1842 b''
1 /**
2 * Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
3 * All rights reserved.
4 *
5 * This source code is licensed under the BSD-style license found in the
6 * LICENSE file in the root directory of this source tree. An additional grant
7 * of patent rights can be found in the PATENTS file in the same directory.
8 */
9
10
11 /* ***************************************************************
12 * Tuning parameters
13 *****************************************************************/
14 /*!
15 * HEAPMODE :
16 * Selects how the default decompression function ZSTD_decompress() allocates memory :
17 * on the stack (0), or on the heap (1, requires malloc())
18 */
19 #ifndef ZSTD_HEAPMODE
20 # define ZSTD_HEAPMODE 1
21 #endif
22
23 /*!
24 * LEGACY_SUPPORT :
25 * if set to 1, ZSTD_decompress() can decode older formats (v0.1+)
26 */
27 #ifndef ZSTD_LEGACY_SUPPORT
28 # define ZSTD_LEGACY_SUPPORT 0
29 #endif
30
31 /*!
32 * MAXWINDOWSIZE_DEFAULT :
33 * maximum window size accepted by DStream, by default.
34 * Frames requiring more memory will be rejected.
35 */
36 #ifndef ZSTD_MAXWINDOWSIZE_DEFAULT
37 # define ZSTD_MAXWINDOWSIZE_DEFAULT ((1 << ZSTD_WINDOWLOG_MAX) + 1) /* defined within zstd.h */
38 #endif
39
40
41 /*-*******************************************************
42 * Dependencies
43 *********************************************************/
44 #include <string.h> /* memcpy, memmove, memset */
45 #include "mem.h" /* low level memory routines */
46 #define XXH_STATIC_LINKING_ONLY /* XXH64_state_t */
47 #include "xxhash.h" /* XXH64_* */
48 #define FSE_STATIC_LINKING_ONLY
49 #include "fse.h"
50 #define HUF_STATIC_LINKING_ONLY
51 #include "huf.h"
52 #include "zstd_internal.h"
53
54 #if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT>=1)
55 # include "zstd_legacy.h"
56 #endif
57
58
59 /*-*************************************
60 * Macros
61 ***************************************/
62 #define ZSTD_isError ERR_isError /* for inlining */
63 #define FSE_isError ERR_isError
64 #define HUF_isError ERR_isError
65
66
67 /*_*******************************************************
68 * Memory operations
69 **********************************************************/
70 static void ZSTD_copy4(void* dst, const void* src) { memcpy(dst, src, 4); }
71
72
73 /*-*************************************************************
74 * Context management
75 ***************************************************************/
76 typedef enum { ZSTDds_getFrameHeaderSize, ZSTDds_decodeFrameHeader,
77 ZSTDds_decodeBlockHeader, ZSTDds_decompressBlock,
78 ZSTDds_decompressLastBlock, ZSTDds_checkChecksum,
79 ZSTDds_decodeSkippableHeader, ZSTDds_skipFrame } ZSTD_dStage;
80
81 struct ZSTD_DCtx_s
82 {
83 const FSE_DTable* LLTptr;
84 const FSE_DTable* MLTptr;
85 const FSE_DTable* OFTptr;
86 const HUF_DTable* HUFptr;
87 FSE_DTable LLTable[FSE_DTABLE_SIZE_U32(LLFSELog)];
88 FSE_DTable OFTable[FSE_DTABLE_SIZE_U32(OffFSELog)];
89 FSE_DTable MLTable[FSE_DTABLE_SIZE_U32(MLFSELog)];
90 HUF_DTable hufTable[HUF_DTABLE_SIZE(HufLog)]; /* can accommodate HUF_decompress4X */
91 const void* previousDstEnd;
92 const void* base;
93 const void* vBase;
94 const void* dictEnd;
95 size_t expected;
96 U32 rep[ZSTD_REP_NUM];
97 ZSTD_frameParams fParams;
98 blockType_e bType; /* used in ZSTD_decompressContinue(), to transfer blockType between header decoding and block decoding stages */
99 ZSTD_dStage stage;
100 U32 litEntropy;
101 U32 fseEntropy;
102 XXH64_state_t xxhState;
103 size_t headerSize;
104 U32 dictID;
105 const BYTE* litPtr;
106 ZSTD_customMem customMem;
107 size_t litBufSize;
108 size_t litSize;
109 size_t rleSize;
110 BYTE litBuffer[ZSTD_BLOCKSIZE_ABSOLUTEMAX + WILDCOPY_OVERLENGTH];
111 BYTE headerBuffer[ZSTD_FRAMEHEADERSIZE_MAX];
112 }; /* typedef'd to ZSTD_DCtx within "zstd.h" */
113
114 size_t ZSTD_sizeof_DCtx (const ZSTD_DCtx* dctx) { return (dctx==NULL) ? 0 : sizeof(ZSTD_DCtx); }
115
116 size_t ZSTD_estimateDCtxSize(void) { return sizeof(ZSTD_DCtx); }
117
118 size_t ZSTD_decompressBegin(ZSTD_DCtx* dctx)
119 {
120 dctx->expected = ZSTD_frameHeaderSize_prefix;
121 dctx->stage = ZSTDds_getFrameHeaderSize;
122 dctx->previousDstEnd = NULL;
123 dctx->base = NULL;
124 dctx->vBase = NULL;
125 dctx->dictEnd = NULL;
126 dctx->hufTable[0] = (HUF_DTable)((HufLog)*0x1000001); /* multiplying by 0x1000001 replicates HufLog into both the first and fourth byte, so DTableDesc.maxTableLog reads back correctly on little and big endian alike */
127 dctx->litEntropy = dctx->fseEntropy = 0;
128 dctx->dictID = 0;
129 MEM_STATIC_ASSERT(sizeof(dctx->rep) == sizeof(repStartValue));
130 memcpy(dctx->rep, repStartValue, sizeof(repStartValue)); /* initial repcodes */
131 dctx->LLTptr = dctx->LLTable;
132 dctx->MLTptr = dctx->MLTable;
133 dctx->OFTptr = dctx->OFTable;
134 dctx->HUFptr = dctx->hufTable;
135 return 0;
136 }
137
138 ZSTD_DCtx* ZSTD_createDCtx_advanced(ZSTD_customMem customMem)
139 {
140 ZSTD_DCtx* dctx;
141
142 if (!customMem.customAlloc && !customMem.customFree) customMem = defaultCustomMem;
143 if (!customMem.customAlloc || !customMem.customFree) return NULL;
144
145 dctx = (ZSTD_DCtx*)ZSTD_malloc(sizeof(ZSTD_DCtx), customMem);
146 if (!dctx) return NULL;
147 memcpy(&dctx->customMem, &customMem, sizeof(customMem));
148 ZSTD_decompressBegin(dctx);
149 return dctx;
150 }
151
152 ZSTD_DCtx* ZSTD_createDCtx(void)
153 {
154 return ZSTD_createDCtx_advanced(defaultCustomMem);
155 }
156
157 size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx)
158 {
159 if (dctx==NULL) return 0; /* support free on NULL */
160 ZSTD_free(dctx, dctx->customMem);
161 return 0; /* reserved as a potential error code in the future */
162 }
163
164 void ZSTD_copyDCtx(ZSTD_DCtx* dstDCtx, const ZSTD_DCtx* srcDCtx)
165 {
166 size_t const workSpaceSize = (ZSTD_BLOCKSIZE_ABSOLUTEMAX+WILDCOPY_OVERLENGTH) + ZSTD_frameHeaderSize_max;
167 memcpy(dstDCtx, srcDCtx, sizeof(ZSTD_DCtx) - workSpaceSize); /* no need to copy workspace */
168 }
169
170 static void ZSTD_refDCtx(ZSTD_DCtx* dstDCtx, const ZSTD_DCtx* srcDCtx)
171 {
172 ZSTD_decompressBegin(dstDCtx); /* init */
173 if (srcDCtx) { /* support refDCtx on NULL */
174 dstDCtx->dictEnd = srcDCtx->dictEnd;
175 dstDCtx->vBase = srcDCtx->vBase;
176 dstDCtx->base = srcDCtx->base;
177 dstDCtx->previousDstEnd = srcDCtx->previousDstEnd;
178 dstDCtx->dictID = srcDCtx->dictID;
179 dstDCtx->litEntropy = srcDCtx->litEntropy;
180 dstDCtx->fseEntropy = srcDCtx->fseEntropy;
181 dstDCtx->LLTptr = srcDCtx->LLTable;
182 dstDCtx->MLTptr = srcDCtx->MLTable;
183 dstDCtx->OFTptr = srcDCtx->OFTable;
184 dstDCtx->HUFptr = srcDCtx->hufTable;
185 dstDCtx->rep[0] = srcDCtx->rep[0];
186 dstDCtx->rep[1] = srcDCtx->rep[1];
187 dstDCtx->rep[2] = srcDCtx->rep[2];
188 }
189 }
190
191
192 /*-*************************************************************
193 * Decompression section
194 ***************************************************************/
195
196 /* See compression format details in : doc/zstd_compression_format.md */
197
198 /** ZSTD_frameHeaderSize() :
199 * srcSize must be >= ZSTD_frameHeaderSize_prefix.
200 * @return : size of the Frame Header */
201 static size_t ZSTD_frameHeaderSize(const void* src, size_t srcSize)
202 {
203 if (srcSize < ZSTD_frameHeaderSize_prefix) return ERROR(srcSize_wrong);
204 { BYTE const fhd = ((const BYTE*)src)[4];
205 U32 const dictID= fhd & 3;
206 U32 const singleSegment = (fhd >> 5) & 1;
207 U32 const fcsId = fhd >> 6;
208 return ZSTD_frameHeaderSize_prefix + !singleSegment + ZSTD_did_fieldSize[dictID] + ZSTD_fcs_fieldSize[fcsId]
209 + (singleSegment && !fcsId);
210 }
211 }
212
213
214 /** ZSTD_getFrameParams() :
215 * decode Frame Header, or require larger `srcSize`.
216 * @return : 0, `fparamsPtr` is correctly filled,
217 * >0, `srcSize` is too small, result is expected `srcSize`,
218 * or an error code, which can be tested using ZSTD_isError() */
219 size_t ZSTD_getFrameParams(ZSTD_frameParams* fparamsPtr, const void* src, size_t srcSize)
220 {
221 const BYTE* ip = (const BYTE*)src;
222
223 if (srcSize < ZSTD_frameHeaderSize_prefix) return ZSTD_frameHeaderSize_prefix;
224 if (MEM_readLE32(src) != ZSTD_MAGICNUMBER) {
225 if ((MEM_readLE32(src) & 0xFFFFFFF0U) == ZSTD_MAGIC_SKIPPABLE_START) {
226 if (srcSize < ZSTD_skippableHeaderSize) return ZSTD_skippableHeaderSize; /* magic number + skippable frame length */
227 memset(fparamsPtr, 0, sizeof(*fparamsPtr));
228 fparamsPtr->frameContentSize = MEM_readLE32((const char *)src + 4);
229 fparamsPtr->windowSize = 0; /* windowSize==0 means a frame is skippable */
230 return 0;
231 }
232 return ERROR(prefix_unknown);
233 }
234
235 /* ensure there is enough `srcSize` to fully read/decode frame header */
236 { size_t const fhsize = ZSTD_frameHeaderSize(src, srcSize);
237 if (srcSize < fhsize) return fhsize; }
238
239 { BYTE const fhdByte = ip[4];
240 size_t pos = 5;
241 U32 const dictIDSizeCode = fhdByte&3;
242 U32 const checksumFlag = (fhdByte>>2)&1;
243 U32 const singleSegment = (fhdByte>>5)&1;
244 U32 const fcsID = fhdByte>>6;
245 U32 const windowSizeMax = 1U << ZSTD_WINDOWLOG_MAX;
246 U32 windowSize = 0;
247 U32 dictID = 0;
248 U64 frameContentSize = 0;
249 if ((fhdByte & 0x08) != 0) return ERROR(frameParameter_unsupported); /* reserved bits, which must be zero */
250 if (!singleSegment) {
251 BYTE const wlByte = ip[pos++];
252 U32 const windowLog = (wlByte >> 3) + ZSTD_WINDOWLOG_ABSOLUTEMIN;
253 if (windowLog > ZSTD_WINDOWLOG_MAX) return ERROR(frameParameter_windowTooLarge); /* avoids issue with 1 << windowLog */
254 windowSize = (1U << windowLog);
255 windowSize += (windowSize >> 3) * (wlByte&7);
256 }
257
258 switch(dictIDSizeCode)
259 {
260 default: /* impossible */
261 case 0 : break;
262 case 1 : dictID = ip[pos]; pos++; break;
263 case 2 : dictID = MEM_readLE16(ip+pos); pos+=2; break;
264 case 3 : dictID = MEM_readLE32(ip+pos); pos+=4; break;
265 }
266 switch(fcsID)
267 {
268 default: /* impossible */
269 case 0 : if (singleSegment) frameContentSize = ip[pos]; break;
270 case 1 : frameContentSize = MEM_readLE16(ip+pos)+256; break;
271 case 2 : frameContentSize = MEM_readLE32(ip+pos); break;
272 case 3 : frameContentSize = MEM_readLE64(ip+pos); break;
273 }
274 if (!windowSize) windowSize = (U32)frameContentSize;
275 if (windowSize > windowSizeMax) return ERROR(frameParameter_windowTooLarge);
276 fparamsPtr->frameContentSize = frameContentSize;
277 fparamsPtr->windowSize = windowSize;
278 fparamsPtr->dictID = dictID;
279 fparamsPtr->checksumFlag = checksumFlag;
280 }
281 return 0;
282 }
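/* A minimal probing sketch (ZSTD_example_probeFrameParams is a hypothetical
   helper) : call with a growing prefix until the function stops asking for
   more input, following the contract documented above. */
static int ZSTD_example_probeFrameParams(ZSTD_frameParams* fparamsPtr,
                                         const void* src, size_t srcSize)
{   /* hypothetical example helper, not part of the library */
    size_t const r = ZSTD_getFrameParams(fparamsPtr, src, srcSize);
    if (ZSTD_isError(r)) return -1;   /* malformed or unknown frame */
    if (r > 0) return 1;              /* need at least r source bytes; retry with more */
    return 0;                         /* *fparamsPtr is filled */
}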
283
284
285 /** ZSTD_getDecompressedSize() :
286 * compatible with legacy mode
287 * @return : decompressed size if known, 0 otherwise
288 note : 0 can mean any of the following :
289 - decompressed size is not present within frame header
290 - frame header unknown / not supported
291 - frame header not complete (`srcSize` too small) */
292 unsigned long long ZSTD_getDecompressedSize(const void* src, size_t srcSize)
293 {
294 #if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT==1)
295 if (ZSTD_isLegacy(src, srcSize)) return ZSTD_getDecompressedSize_legacy(src, srcSize);
296 #endif
297 { ZSTD_frameParams fparams;
298 size_t const frResult = ZSTD_getFrameParams(&fparams, src, srcSize);
299 if (frResult!=0) return 0;
300 return fparams.frameContentSize;
301 }
302 }
303
304
305 /** ZSTD_decodeFrameHeader() :
306 * `headerSize` must be the size provided by ZSTD_frameHeaderSize().
307 * @return : 0 if success, or an error code, which can be tested using ZSTD_isError() */
308 static size_t ZSTD_decodeFrameHeader(ZSTD_DCtx* dctx, const void* src, size_t headerSize)
309 {
310 size_t const result = ZSTD_getFrameParams(&(dctx->fParams), src, headerSize);
311 if (ZSTD_isError(result)) return result; /* invalid header */
312 if (result>0) return ERROR(srcSize_wrong); /* headerSize too small */
313 if (dctx->fParams.dictID && (dctx->dictID != dctx->fParams.dictID)) return ERROR(dictionary_wrong);
314 if (dctx->fParams.checksumFlag) XXH64_reset(&dctx->xxhState, 0);
315 return 0;
316 }
317
318
319 typedef struct
320 {
321 blockType_e blockType;
322 U32 lastBlock;
323 U32 origSize;
324 } blockProperties_t;
325
326 /*! ZSTD_getcBlockSize() :
327 * Provides the size of compressed block from block header `src` */
328 size_t ZSTD_getcBlockSize(const void* src, size_t srcSize, blockProperties_t* bpPtr)
329 {
330 if (srcSize < ZSTD_blockHeaderSize) return ERROR(srcSize_wrong);
331 { U32 const cBlockHeader = MEM_readLE24(src);
332 U32 const cSize = cBlockHeader >> 3;
333 bpPtr->lastBlock = cBlockHeader & 1;
334 bpPtr->blockType = (blockType_e)((cBlockHeader >> 1) & 3);
335 bpPtr->origSize = cSize; /* only useful for RLE */
336 if (bpPtr->blockType == bt_rle) return 1;
337 if (bpPtr->blockType == bt_reserved) return ERROR(corruption_detected);
338 return cSize;
339 }
340 }
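/* Block header layout, as read above (one LE 24-bit value) :
       bit  0    : lastBlock flag
       bits 1-2  : blockType
       bits 3-23 : block size (for bt_rle : the regenerated size)
   e.g. header bytes 21 00 00 decode as lastBlock==1, blockType==0,
   cSize == 0x21 >> 3 == 4. */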
341
342
343 static size_t ZSTD_copyRawBlock(void* dst, size_t dstCapacity, const void* src, size_t srcSize)
344 {
345 if (srcSize > dstCapacity) return ERROR(dstSize_tooSmall);
346 memcpy(dst, src, srcSize);
347 return srcSize;
348 }
349
350
351 static size_t ZSTD_setRleBlock(void* dst, size_t dstCapacity, const void* src, size_t srcSize, size_t regenSize)
352 {
353 if (srcSize != 1) return ERROR(srcSize_wrong);
354 if (regenSize > dstCapacity) return ERROR(dstSize_tooSmall);
355 memset(dst, *(const BYTE*)src, regenSize);
356 return regenSize;
357 }
358
359 /*! ZSTD_decodeLiteralsBlock() :
360 @return : nb of bytes read from src (< srcSize) */
361 size_t ZSTD_decodeLiteralsBlock(ZSTD_DCtx* dctx,
362 const void* src, size_t srcSize) /* note : srcSize < BLOCKSIZE */
363 {
364 if (srcSize < MIN_CBLOCK_SIZE) return ERROR(corruption_detected);
365
366 { const BYTE* const istart = (const BYTE*) src;
367 symbolEncodingType_e const litEncType = (symbolEncodingType_e)(istart[0] & 3);
368
369 switch(litEncType)
370 {
371 case set_repeat:
372 if (dctx->litEntropy==0) return ERROR(dictionary_corrupted);
373 /* fall-through */
374 case set_compressed:
375 if (srcSize < 5) return ERROR(corruption_detected); /* srcSize >= MIN_CBLOCK_SIZE == 3; here we need up to 5 for case 3 */
376 { size_t lhSize, litSize, litCSize;
377 U32 singleStream=0;
378 U32 const lhlCode = (istart[0] >> 2) & 3;
379 U32 const lhc = MEM_readLE32(istart);
380 switch(lhlCode)
381 {
382 case 0: case 1: default: /* note : default is impossible, since lhlCode is in [0..3] */
383 /* 2 - 2 - 10 - 10 */
384 singleStream = !lhlCode;
385 lhSize = 3;
386 litSize = (lhc >> 4) & 0x3FF;
387 litCSize = (lhc >> 14) & 0x3FF;
388 break;
389 case 2:
390 /* 2 - 2 - 14 - 14 */
391 lhSize = 4;
392 litSize = (lhc >> 4) & 0x3FFF;
393 litCSize = lhc >> 18;
394 break;
395 case 3:
396 /* 2 - 2 - 18 - 18 */
397 lhSize = 5;
398 litSize = (lhc >> 4) & 0x3FFFF;
399 litCSize = (lhc >> 22) + (istart[4] << 10);
400 break;
401 }
402 if (litSize > ZSTD_BLOCKSIZE_ABSOLUTEMAX) return ERROR(corruption_detected);
403 if (litCSize + lhSize > srcSize) return ERROR(corruption_detected);
404
405 if (HUF_isError((litEncType==set_repeat) ?
406 ( singleStream ?
407 HUF_decompress1X_usingDTable(dctx->litBuffer, litSize, istart+lhSize, litCSize, dctx->HUFptr) :
408 HUF_decompress4X_usingDTable(dctx->litBuffer, litSize, istart+lhSize, litCSize, dctx->HUFptr) ) :
409 ( singleStream ?
410 HUF_decompress1X2_DCtx(dctx->hufTable, dctx->litBuffer, litSize, istart+lhSize, litCSize) :
411 HUF_decompress4X_hufOnly (dctx->hufTable, dctx->litBuffer, litSize, istart+lhSize, litCSize)) ))
412 return ERROR(corruption_detected);
413
414 dctx->litPtr = dctx->litBuffer;
415 dctx->litBufSize = ZSTD_BLOCKSIZE_ABSOLUTEMAX+WILDCOPY_OVERLENGTH;
416 dctx->litSize = litSize;
417 dctx->litEntropy = 1;
418 if (litEncType==set_compressed) dctx->HUFptr = dctx->hufTable;
419 return litCSize + lhSize;
420 }
421
422 case set_basic:
423 { size_t litSize, lhSize;
424 U32 const lhlCode = ((istart[0]) >> 2) & 3;
425 switch(lhlCode)
426 {
427 case 0: case 2: default: /* note : default is impossible, since lhlCode is in [0..3] */
428 lhSize = 1;
429 litSize = istart[0] >> 3;
430 break;
431 case 1:
432 lhSize = 2;
433 litSize = MEM_readLE16(istart) >> 4;
434 break;
435 case 3:
436 lhSize = 3;
437 litSize = MEM_readLE24(istart) >> 4;
438 break;
439 }
440
441 if (lhSize+litSize+WILDCOPY_OVERLENGTH > srcSize) { /* risk reading beyond src buffer with wildcopy */
442 if (litSize+lhSize > srcSize) return ERROR(corruption_detected);
443 memcpy(dctx->litBuffer, istart+lhSize, litSize);
444 dctx->litPtr = dctx->litBuffer;
445 dctx->litBufSize = ZSTD_BLOCKSIZE_ABSOLUTEMAX+8;
446 dctx->litSize = litSize;
447 return lhSize+litSize;
448 }
449 /* direct reference into compressed stream */
450 dctx->litPtr = istart+lhSize;
451 dctx->litBufSize = srcSize-lhSize;
452 dctx->litSize = litSize;
453 return lhSize+litSize;
454 }
455
456 case set_rle:
457 { U32 const lhlCode = ((istart[0]) >> 2) & 3;
458 size_t litSize, lhSize;
459 switch(lhlCode)
460 {
461 case 0: case 2: default: /* note : default is impossible, since lhlCode is in [0..3] */
462 lhSize = 1;
463 litSize = istart[0] >> 3;
464 break;
465 case 1:
466 lhSize = 2;
467 litSize = MEM_readLE16(istart) >> 4;
468 break;
469 case 3:
470 lhSize = 3;
471 litSize = MEM_readLE24(istart) >> 4;
472 if (srcSize<4) return ERROR(corruption_detected); /* srcSize >= MIN_CBLOCK_SIZE == 3; here we need lhSize+1 = 4 */
473 break;
474 }
475 if (litSize > ZSTD_BLOCKSIZE_ABSOLUTEMAX) return ERROR(corruption_detected);
476 memset(dctx->litBuffer, istart[lhSize], litSize);
477 dctx->litPtr = dctx->litBuffer;
478 dctx->litBufSize = ZSTD_BLOCKSIZE_ABSOLUTEMAX+WILDCOPY_OVERLENGTH;
479 dctx->litSize = litSize;
480 return lhSize+1;
481 }
482 default:
483 return ERROR(corruption_detected); /* impossible */
484 }
485 }
486 }
487
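The lhlCode switches above pack the literals-header fields little-endian-first; the raw-literals (set_basic) variant is the easiest to see in isolation. Below is a minimal sketch, not part of the vendored source (the demo_ name is illustrative), with explicit byte assembly standing in for MEM_readLE16/MEM_readLE24:

/* Sketch : decode the size field of a raw (set_basic) literals header,
 * mirroring the lhlCode switch above. `hdr` must expose at least 3 bytes.
 * Returns the header size (lhSize); stores the literals size in *litSize. */
static size_t demo_readRawLitHeader(const unsigned char* hdr, size_t* litSize)
{
    unsigned const lhlCode = (hdr[0] >> 2) & 3;
    switch (lhlCode)
    {
    case 0: case 2: default:   /* 1-byte header : 5-bit size */
        *litSize = hdr[0] >> 3;
        return 1;
    case 1:                    /* 2-byte header : 12-bit size */
        *litSize = ((size_t)hdr[0] + ((size_t)hdr[1] << 8)) >> 4;
        return 2;
    case 3:                    /* 3-byte header : 20-bit size */
        *litSize = ((size_t)hdr[0] + ((size_t)hdr[1] << 8) + ((size_t)hdr[2] << 16)) >> 4;
        return 3;
    }
}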
488
489 typedef union {
490 FSE_decode_t realData;
491 U32 alignedBy4;
492 } FSE_decode_t4;
493
494 static const FSE_decode_t4 LL_defaultDTable[(1<<LL_DEFAULTNORMLOG)+1] = {
495 { { LL_DEFAULTNORMLOG, 1, 1 } }, /* header : tableLog, fastMode, fastMode */
496 { { 0, 0, 4 } }, /* 0 : base, symbol, bits */
497 { { 16, 0, 4 } },
498 { { 32, 1, 5 } },
499 { { 0, 3, 5 } },
500 { { 0, 4, 5 } },
501 { { 0, 6, 5 } },
502 { { 0, 7, 5 } },
503 { { 0, 9, 5 } },
504 { { 0, 10, 5 } },
505 { { 0, 12, 5 } },
506 { { 0, 14, 6 } },
507 { { 0, 16, 5 } },
508 { { 0, 18, 5 } },
509 { { 0, 19, 5 } },
510 { { 0, 21, 5 } },
511 { { 0, 22, 5 } },
512 { { 0, 24, 5 } },
513 { { 32, 25, 5 } },
514 { { 0, 26, 5 } },
515 { { 0, 27, 6 } },
516 { { 0, 29, 6 } },
517 { { 0, 31, 6 } },
518 { { 32, 0, 4 } },
519 { { 0, 1, 4 } },
520 { { 0, 2, 5 } },
521 { { 32, 4, 5 } },
522 { { 0, 5, 5 } },
523 { { 32, 7, 5 } },
524 { { 0, 8, 5 } },
525 { { 32, 10, 5 } },
526 { { 0, 11, 5 } },
527 { { 0, 13, 6 } },
528 { { 32, 16, 5 } },
529 { { 0, 17, 5 } },
530 { { 32, 19, 5 } },
531 { { 0, 20, 5 } },
532 { { 32, 22, 5 } },
533 { { 0, 23, 5 } },
534 { { 0, 25, 4 } },
535 { { 16, 25, 4 } },
536 { { 32, 26, 5 } },
537 { { 0, 28, 6 } },
538 { { 0, 30, 6 } },
539 { { 48, 0, 4 } },
540 { { 16, 1, 4 } },
541 { { 32, 2, 5 } },
542 { { 32, 3, 5 } },
543 { { 32, 5, 5 } },
544 { { 32, 6, 5 } },
545 { { 32, 8, 5 } },
546 { { 32, 9, 5 } },
547 { { 32, 11, 5 } },
548 { { 32, 12, 5 } },
549 { { 0, 15, 6 } },
550 { { 32, 17, 5 } },
551 { { 32, 18, 5 } },
552 { { 32, 20, 5 } },
553 { { 32, 21, 5 } },
554 { { 32, 23, 5 } },
555 { { 32, 24, 5 } },
556 { { 0, 35, 6 } },
557 { { 0, 34, 6 } },
558 { { 0, 33, 6 } },
559 { { 0, 32, 6 } },
560 }; /* LL_defaultDTable */
561
562 static const FSE_decode_t4 ML_defaultDTable[(1<<ML_DEFAULTNORMLOG)+1] = {
563 { { ML_DEFAULTNORMLOG, 1, 1 } }, /* header : tableLog, fastMode, fastMode */
564 { { 0, 0, 6 } }, /* 0 : base, symbol, bits */
565 { { 0, 1, 4 } },
566 { { 32, 2, 5 } },
567 { { 0, 3, 5 } },
568 { { 0, 5, 5 } },
569 { { 0, 6, 5 } },
570 { { 0, 8, 5 } },
571 { { 0, 10, 6 } },
572 { { 0, 13, 6 } },
573 { { 0, 16, 6 } },
574 { { 0, 19, 6 } },
575 { { 0, 22, 6 } },
576 { { 0, 25, 6 } },
577 { { 0, 28, 6 } },
578 { { 0, 31, 6 } },
579 { { 0, 33, 6 } },
580 { { 0, 35, 6 } },
581 { { 0, 37, 6 } },
582 { { 0, 39, 6 } },
583 { { 0, 41, 6 } },
584 { { 0, 43, 6 } },
585 { { 0, 45, 6 } },
586 { { 16, 1, 4 } },
587 { { 0, 2, 4 } },
588 { { 32, 3, 5 } },
589 { { 0, 4, 5 } },
590 { { 32, 6, 5 } },
591 { { 0, 7, 5 } },
592 { { 0, 9, 6 } },
593 { { 0, 12, 6 } },
594 { { 0, 15, 6 } },
595 { { 0, 18, 6 } },
596 { { 0, 21, 6 } },
597 { { 0, 24, 6 } },
598 { { 0, 27, 6 } },
599 { { 0, 30, 6 } },
600 { { 0, 32, 6 } },
601 { { 0, 34, 6 } },
602 { { 0, 36, 6 } },
603 { { 0, 38, 6 } },
604 { { 0, 40, 6 } },
605 { { 0, 42, 6 } },
606 { { 0, 44, 6 } },
607 { { 32, 1, 4 } },
608 { { 48, 1, 4 } },
609 { { 16, 2, 4 } },
610 { { 32, 4, 5 } },
611 { { 32, 5, 5 } },
612 { { 32, 7, 5 } },
613 { { 32, 8, 5 } },
614 { { 0, 11, 6 } },
615 { { 0, 14, 6 } },
616 { { 0, 17, 6 } },
617 { { 0, 20, 6 } },
618 { { 0, 23, 6 } },
619 { { 0, 26, 6 } },
620 { { 0, 29, 6 } },
621 { { 0, 52, 6 } },
622 { { 0, 51, 6 } },
623 { { 0, 50, 6 } },
624 { { 0, 49, 6 } },
625 { { 0, 48, 6 } },
626 { { 0, 47, 6 } },
627 { { 0, 46, 6 } },
628 }; /* ML_defaultDTable */
629
630 static const FSE_decode_t4 OF_defaultDTable[(1<<OF_DEFAULTNORMLOG)+1] = {
631 { { OF_DEFAULTNORMLOG, 1, 1 } }, /* header : tableLog, fastMode, fastMode */
632 { { 0, 0, 5 } }, /* 0 : base, symbol, bits */
633 { { 0, 6, 4 } },
634 { { 0, 9, 5 } },
635 { { 0, 15, 5 } },
636 { { 0, 21, 5 } },
637 { { 0, 3, 5 } },
638 { { 0, 7, 4 } },
639 { { 0, 12, 5 } },
640 { { 0, 18, 5 } },
641 { { 0, 23, 5 } },
642 { { 0, 5, 5 } },
643 { { 0, 8, 4 } },
644 { { 0, 14, 5 } },
645 { { 0, 20, 5 } },
646 { { 0, 2, 5 } },
647 { { 16, 7, 4 } },
648 { { 0, 11, 5 } },
649 { { 0, 17, 5 } },
650 { { 0, 22, 5 } },
651 { { 0, 4, 5 } },
652 { { 16, 8, 4 } },
653 { { 0, 13, 5 } },
654 { { 0, 19, 5 } },
655 { { 0, 1, 5 } },
656 { { 16, 6, 4 } },
657 { { 0, 10, 5 } },
658 { { 0, 16, 5 } },
659 { { 0, 28, 5 } },
660 { { 0, 27, 5 } },
661 { { 0, 26, 5 } },
662 { { 0, 25, 5 } },
663 { { 0, 24, 5 } },
664 }; /* OF_defaultDTable */
665
666 /*! ZSTD_buildSeqTable() :
667 @return : nb bytes read from src,
668 or an error code if it fails, testable with ZSTD_isError()
669 */
670 static size_t ZSTD_buildSeqTable(FSE_DTable* DTableSpace, const FSE_DTable** DTablePtr,
671 symbolEncodingType_e type, U32 max, U32 maxLog,
672 const void* src, size_t srcSize,
673 const FSE_decode_t4* defaultTable, U32 flagRepeatTable)
674 {
675 const void* const tmpPtr = defaultTable; /* bypass strict aliasing */
676 switch(type)
677 {
678 case set_rle :
679 if (!srcSize) return ERROR(srcSize_wrong);
680 if ( (*(const BYTE*)src) > max) return ERROR(corruption_detected);
681 FSE_buildDTable_rle(DTableSpace, *(const BYTE*)src);
682 *DTablePtr = DTableSpace;
683 return 1;
684 case set_basic :
685 *DTablePtr = (const FSE_DTable*)tmpPtr;
686 return 0;
687 case set_repeat:
688 if (!flagRepeatTable) return ERROR(corruption_detected);
689 return 0;
690 default : /* impossible */
691 case set_compressed :
692 { U32 tableLog;
693 S16 norm[MaxSeq+1];
694 size_t const headerSize = FSE_readNCount(norm, &max, &tableLog, src, srcSize);
695 if (FSE_isError(headerSize)) return ERROR(corruption_detected);
696 if (tableLog > maxLog) return ERROR(corruption_detected);
697 FSE_buildDTable(DTableSpace, norm, max, tableLog);
698 *DTablePtr = DTableSpace;
699 return headerSize;
700 } }
701 }
702
703 size_t ZSTD_decodeSeqHeaders(ZSTD_DCtx* dctx, int* nbSeqPtr,
704 const void* src, size_t srcSize)
705 {
706 const BYTE* const istart = (const BYTE* const)src;
707 const BYTE* const iend = istart + srcSize;
708 const BYTE* ip = istart;
709
710 /* check */
711 if (srcSize < MIN_SEQUENCES_SIZE) return ERROR(srcSize_wrong);
712
713 /* SeqHead */
714 { int nbSeq = *ip++;
715 if (!nbSeq) { *nbSeqPtr=0; return 1; }
716 if (nbSeq > 0x7F) {
717 if (nbSeq == 0xFF) {
718 if (ip+2 > iend) return ERROR(srcSize_wrong);
719 nbSeq = MEM_readLE16(ip) + LONGNBSEQ, ip+=2;
720 } else {
721 if (ip >= iend) return ERROR(srcSize_wrong);
722 nbSeq = ((nbSeq-0x80)<<8) + *ip++;
723 }
724 }
725 *nbSeqPtr = nbSeq;
726 }
727
728 /* FSE table descriptors */
729 if (ip+4 > iend) return ERROR(srcSize_wrong); /* minimum possible size */
730 { symbolEncodingType_e const LLtype = (symbolEncodingType_e)(*ip >> 6);
731 symbolEncodingType_e const OFtype = (symbolEncodingType_e)((*ip >> 4) & 3);
732 symbolEncodingType_e const MLtype = (symbolEncodingType_e)((*ip >> 2) & 3);
733 ip++;
734
735 /* Build DTables */
736 { size_t const llhSize = ZSTD_buildSeqTable(dctx->LLTable, &dctx->LLTptr,
737 LLtype, MaxLL, LLFSELog,
738 ip, iend-ip, LL_defaultDTable, dctx->fseEntropy);
739 if (ZSTD_isError(llhSize)) return ERROR(corruption_detected);
740 ip += llhSize;
741 }
742 { size_t const ofhSize = ZSTD_buildSeqTable(dctx->OFTable, &dctx->OFTptr,
743 OFtype, MaxOff, OffFSELog,
744 ip, iend-ip, OF_defaultDTable, dctx->fseEntropy);
745 if (ZSTD_isError(ofhSize)) return ERROR(corruption_detected);
746 ip += ofhSize;
747 }
748 { size_t const mlhSize = ZSTD_buildSeqTable(dctx->MLTable, &dctx->MLTptr,
749 MLtype, MaxML, MLFSELog,
750 ip, iend-ip, ML_defaultDTable, dctx->fseEntropy);
751 if (ZSTD_isError(mlhSize)) return ERROR(corruption_detected);
752 ip += mlhSize;
753 }
754 }
755
756 return ip-istart;
757 }
758
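ZSTD_decodeSeqHeaders() above begins by reading the sequence count, which has three encodings selected by the first byte. A standalone sketch of just that step, assuming the vendored LONGNBSEQ value of 0x7F00 (the demo_ name is illustrative):

/* Sketch : the three encodings of the sequence count.
 * Returns bytes consumed, or 0 when srcSize is too small. */
static size_t demo_readNbSeq(const unsigned char* ip, size_t srcSize, int* nbSeq)
{
    if (srcSize < 1) return 0;
    if (ip[0] < 128) { *nbSeq = ip[0]; return 1; }   /* 1 byte : 0..127 */
    if (ip[0] == 0xFF) {                             /* 3 bytes : LE16 + LONGNBSEQ */
        if (srcSize < 3) return 0;
        *nbSeq = (ip[1] + (ip[2] << 8)) + 0x7F00;    /* 0x7F00 == LONGNBSEQ */
        return 3;
    }
    if (srcSize < 2) return 0;
    *nbSeq = ((ip[0] - 0x80) << 8) + ip[1];          /* 2 bytes : 128..32511 */
    return 2;
}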
759
760 typedef struct {
761 size_t litLength;
762 size_t matchLength;
763 size_t offset;
764 } seq_t;
765
766 typedef struct {
767 BIT_DStream_t DStream;
768 FSE_DState_t stateLL;
769 FSE_DState_t stateOffb;
770 FSE_DState_t stateML;
771 size_t prevOffset[ZSTD_REP_NUM];
772 } seqState_t;
773
774
775 static seq_t ZSTD_decodeSequence(seqState_t* seqState)
776 {
777 seq_t seq;
778
779 U32 const llCode = FSE_peekSymbol(&seqState->stateLL);
780 U32 const mlCode = FSE_peekSymbol(&seqState->stateML);
781 U32 const ofCode = FSE_peekSymbol(&seqState->stateOffb); /* <= maxOff, by table construction */
782
783 U32 const llBits = LL_bits[llCode];
784 U32 const mlBits = ML_bits[mlCode];
785 U32 const ofBits = ofCode;
786 U32 const totalBits = llBits+mlBits+ofBits;
787
788 static const U32 LL_base[MaxLL+1] = {
789 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
790 16, 18, 20, 22, 24, 28, 32, 40, 48, 64, 0x80, 0x100, 0x200, 0x400, 0x800, 0x1000,
791 0x2000, 0x4000, 0x8000, 0x10000 };
792
793 static const U32 ML_base[MaxML+1] = {
794 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
795 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
796 35, 37, 39, 41, 43, 47, 51, 59, 67, 83, 99, 0x83, 0x103, 0x203, 0x403, 0x803,
797 0x1003, 0x2003, 0x4003, 0x8003, 0x10003 };
798
799 static const U32 OF_base[MaxOff+1] = {
800 0, 1, 1, 5, 0xD, 0x1D, 0x3D, 0x7D,
801 0xFD, 0x1FD, 0x3FD, 0x7FD, 0xFFD, 0x1FFD, 0x3FFD, 0x7FFD,
802 0xFFFD, 0x1FFFD, 0x3FFFD, 0x7FFFD, 0xFFFFD, 0x1FFFFD, 0x3FFFFD, 0x7FFFFD,
803 0xFFFFFD, 0x1FFFFFD, 0x3FFFFFD, 0x7FFFFFD, 0xFFFFFFD };
804
805 /* sequence */
806 { size_t offset;
807 if (!ofCode)
808 offset = 0;
809 else {
810 offset = OF_base[ofCode] + BIT_readBits(&seqState->DStream, ofBits); /* <= (ZSTD_WINDOWLOG_MAX-1) bits */
811 if (MEM_32bits()) BIT_reloadDStream(&seqState->DStream);
812 }
813
814 if (ofCode <= 1) {
815 offset += (llCode==0);
816 if (offset) {
817 size_t temp = (offset==3) ? seqState->prevOffset[0] - 1 : seqState->prevOffset[offset];
818 temp += !temp; /* 0 is not valid; input is corrupted; force offset to 1 */
819 if (offset != 1) seqState->prevOffset[2] = seqState->prevOffset[1];
820 seqState->prevOffset[1] = seqState->prevOffset[0];
821 seqState->prevOffset[0] = offset = temp;
822 } else {
823 offset = seqState->prevOffset[0];
824 }
825 } else {
826 seqState->prevOffset[2] = seqState->prevOffset[1];
827 seqState->prevOffset[1] = seqState->prevOffset[0];
828 seqState->prevOffset[0] = offset;
829 }
830 seq.offset = offset;
831 }
832
833 seq.matchLength = ML_base[mlCode] + ((mlCode>31) ? BIT_readBits(&seqState->DStream, mlBits) : 0); /* <= 16 bits */
834 if (MEM_32bits() && (mlBits+llBits>24)) BIT_reloadDStream(&seqState->DStream);
835
836 seq.litLength = LL_base[llCode] + ((llCode>15) ? BIT_readBits(&seqState->DStream, llBits) : 0); /* <= 16 bits */
837 if (MEM_32bits() ||
838 (totalBits > 64 - 7 - (LLFSELog+MLFSELog+OffFSELog)) ) BIT_reloadDStream(&seqState->DStream);
839
840 /* ANS state update */
841 FSE_updateState(&seqState->stateLL, &seqState->DStream); /* <= 9 bits */
842 FSE_updateState(&seqState->stateML, &seqState->DStream); /* <= 9 bits */
843 if (MEM_32bits()) BIT_reloadDStream(&seqState->DStream); /* <= 18 bits */
844 FSE_updateState(&seqState->stateOffb, &seqState->DStream); /* <= 8 bits */
845
846 return seq;
847 }
848
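The repeat-offset branch in ZSTD_decodeSequence() above is the subtlest part of sequence decoding: raw offsets 0 and 1 index a 3-slot history, with an extra bump applied when litLength is 0. A sketch of the history update in isolation, not part of the vendored source:

/* Sketch : repeat-offset resolution. `code` is the adjusted repeat index
 * (raw offset 0/1 plus the llCode==0 bump), so it lies in [0..3];
 * rep[] behaves like a small LRU of recent offsets. */
static size_t demo_resolveRepOffset(size_t rep[3], size_t code)
{
    if (code == 0) return rep[0];                  /* most recent offset, no reorder */
    {   size_t off = (code == 3) ? rep[0] - 1 : rep[code];
        off += (off == 0);                         /* 0 is invalid : force to 1 */
        if (code != 1) rep[2] = rep[1];
        rep[1] = rep[0];
        rep[0] = off;
        return off;
    }
}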
849
850 FORCE_NOINLINE
851 size_t ZSTD_execSequenceLast7(BYTE* op,
852 BYTE* const oend, seq_t sequence,
853 const BYTE** litPtr, const BYTE* const litLimit_w,
854 const BYTE* const base, const BYTE* const vBase, const BYTE* const dictEnd)
855 {
856 BYTE* const oLitEnd = op + sequence.litLength;
857 size_t const sequenceLength = sequence.litLength + sequence.matchLength;
858 BYTE* const oMatchEnd = op + sequenceLength; /* risk : address space overflow (32-bits) */
859 BYTE* const oend_w = oend - WILDCOPY_OVERLENGTH;
860 const BYTE* const iLitEnd = *litPtr + sequence.litLength;
861 const BYTE* match = oLitEnd - sequence.offset;
862
863 /* check */
864 if (oMatchEnd>oend) return ERROR(dstSize_tooSmall); /* last match must start at a minimum distance of WILDCOPY_OVERLENGTH from oend */
865 if (iLitEnd > litLimit_w) return ERROR(corruption_detected); /* over-read beyond lit buffer */
866 if (oLitEnd <= oend_w) return ERROR(GENERIC); /* precondition : this helper is only invoked when oLitEnd > oend_w */
867
868 /* copy literals */
869 if (op < oend_w) {
870 ZSTD_wildcopy(op, *litPtr, oend_w - op);
871 *litPtr += oend_w - op;
872 op = oend_w;
873 }
874 while (op < oLitEnd) *op++ = *(*litPtr)++;
875
876 /* copy Match */
877 if (sequence.offset > (size_t)(oLitEnd - base)) {
878 /* offset beyond prefix */
879 if (sequence.offset > (size_t)(oLitEnd - vBase)) return ERROR(corruption_detected);
880 match = dictEnd - (base-match);
881 if (match + sequence.matchLength <= dictEnd) {
882 memmove(oLitEnd, match, sequence.matchLength);
883 return sequenceLength;
884 }
885 /* span extDict & currentPrefixSegment */
886 { size_t const length1 = dictEnd - match;
887 memmove(oLitEnd, match, length1);
888 op = oLitEnd + length1;
889 sequence.matchLength -= length1;
890 match = base;
891 } }
892 while (op < oMatchEnd) *op++ = *match++;
893 return sequenceLength;
894 }
895
896
897 FORCE_INLINE
898 size_t ZSTD_execSequence(BYTE* op,
899 BYTE* const oend, seq_t sequence,
900 const BYTE** litPtr, const BYTE* const litLimit_w,
901 const BYTE* const base, const BYTE* const vBase, const BYTE* const dictEnd)
902 {
903 BYTE* const oLitEnd = op + sequence.litLength;
904 size_t const sequenceLength = sequence.litLength + sequence.matchLength;
905 BYTE* const oMatchEnd = op + sequenceLength; /* risk : address space overflow (32-bits) */
906 BYTE* const oend_w = oend - WILDCOPY_OVERLENGTH;
907 const BYTE* const iLitEnd = *litPtr + sequence.litLength;
908 const BYTE* match = oLitEnd - sequence.offset;
909
910 /* check */
911 if (oMatchEnd>oend) return ERROR(dstSize_tooSmall); /* last match must start at a minimum distance of WILDCOPY_OVERLENGTH from oend */
912 if (iLitEnd > litLimit_w) return ERROR(corruption_detected); /* over-read beyond lit buffer */
913 if (oLitEnd>oend_w) return ZSTD_execSequenceLast7(op, oend, sequence, litPtr, litLimit_w, base, vBase, dictEnd);
914
915 /* copy Literals */
916 ZSTD_copy8(op, *litPtr);
917 if (sequence.litLength > 8)
918 ZSTD_wildcopy(op+8, (*litPtr)+8, sequence.litLength - 8); /* note : since oLitEnd <= oend-WILDCOPY_OVERLENGTH, no risk of overwrite beyond oend */
919 op = oLitEnd;
920 *litPtr = iLitEnd; /* update for next sequence */
921
922 /* copy Match */
923 if (sequence.offset > (size_t)(oLitEnd - base)) {
924 /* offset beyond prefix */
925 if (sequence.offset > (size_t)(oLitEnd - vBase)) return ERROR(corruption_detected);
926 match = dictEnd - (base-match);
927 if (match + sequence.matchLength <= dictEnd) {
928 memmove(oLitEnd, match, sequence.matchLength);
929 return sequenceLength;
930 }
931 /* span extDict & currentPrefixSegment */
932 { size_t const length1 = dictEnd - match;
933 memmove(oLitEnd, match, length1);
934 op = oLitEnd + length1;
935 sequence.matchLength -= length1;
936 match = base;
937 if (op > oend_w) {
938 U32 i;
939 for (i = 0; i < sequence.matchLength; ++i) op[i] = match[i];
940 return sequenceLength;
941 }
942 } }
943 /* Requirement: op <= oend_w */
944
945 /* match within prefix */
946 if (sequence.offset < 8) {
947 /* close range match, overlap */
948 static const U32 dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 }; /* added */
949 static const int dec64table[] = { 8, 8, 8, 7, 8, 9,10,11 }; /* subtracted */
950 int const sub2 = dec64table[sequence.offset];
951 op[0] = match[0];
952 op[1] = match[1];
953 op[2] = match[2];
954 op[3] = match[3];
955 match += dec32table[sequence.offset];
956 ZSTD_copy4(op+4, match);
957 match -= sub2;
958 } else {
959 ZSTD_copy8(op, match);
960 }
961 op += 8; match += 8;
962
963 if (oMatchEnd > oend-(16-MINMATCH)) {
964 if (op < oend_w) {
965 ZSTD_wildcopy(op, match, oend_w - op);
966 match += oend_w - op;
967 op = oend_w;
968 }
969 while (op < oMatchEnd) *op++ = *match++;
970 } else {
971 ZSTD_wildcopy(op, match, sequence.matchLength-8); /* works even if matchLength < 8 */
972 }
973 return sequenceLength;
974 }
975
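The dec32table/dec64table steps above exist only to widen an overlapping match so the remaining bytes can be moved with full 8-byte copies. Semantically, the whole operation is the plain LZ77 overlap copy sketched below (an offset of 1 replicates the previous byte); the byte loop is correct for any offset >= 1 but lacks the vectorized throughput the table-based path recovers:

/* Sketch : semantics of a match copy with possible overlap. */
static void demo_overlapCopy(unsigned char* op, size_t offset, size_t length)
{
    const unsigned char* match = op - offset;
    while (length--) *op++ = *match++;
}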
976
977 static size_t ZSTD_decompressSequences(
978 ZSTD_DCtx* dctx,
979 void* dst, size_t maxDstSize,
980 const void* seqStart, size_t seqSize)
981 {
982 const BYTE* ip = (const BYTE*)seqStart;
983 const BYTE* const iend = ip + seqSize;
984 BYTE* const ostart = (BYTE* const)dst;
985 BYTE* const oend = ostart + maxDstSize;
986 BYTE* op = ostart;
987 const BYTE* litPtr = dctx->litPtr;
988 const BYTE* const litLimit_w = litPtr + dctx->litBufSize - WILDCOPY_OVERLENGTH;
989 const BYTE* const litEnd = litPtr + dctx->litSize;
990 const BYTE* const base = (const BYTE*) (dctx->base);
991 const BYTE* const vBase = (const BYTE*) (dctx->vBase);
992 const BYTE* const dictEnd = (const BYTE*) (dctx->dictEnd);
993 int nbSeq;
994
995 /* Build Decoding Tables */
996 { size_t const seqHSize = ZSTD_decodeSeqHeaders(dctx, &nbSeq, ip, seqSize);
997 if (ZSTD_isError(seqHSize)) return seqHSize;
998 ip += seqHSize;
999 }
1000
1001 /* Regen sequences */
1002 if (nbSeq) {
1003 seqState_t seqState;
1004 dctx->fseEntropy = 1;
1005 { U32 i; for (i=0; i<ZSTD_REP_NUM; i++) seqState.prevOffset[i] = dctx->rep[i]; }
1006 CHECK_E(BIT_initDStream(&seqState.DStream, ip, iend-ip), corruption_detected);
1007 FSE_initDState(&seqState.stateLL, &seqState.DStream, dctx->LLTptr);
1008 FSE_initDState(&seqState.stateOffb, &seqState.DStream, dctx->OFTptr);
1009 FSE_initDState(&seqState.stateML, &seqState.DStream, dctx->MLTptr);
1010
1011 for ( ; (BIT_reloadDStream(&(seqState.DStream)) <= BIT_DStream_completed) && nbSeq ; ) {
1012 nbSeq--;
1013 { seq_t const sequence = ZSTD_decodeSequence(&seqState);
1014 size_t const oneSeqSize = ZSTD_execSequence(op, oend, sequence, &litPtr, litLimit_w, base, vBase, dictEnd);
1015 if (ZSTD_isError(oneSeqSize)) return oneSeqSize;
1016 op += oneSeqSize;
1017 } }
1018
1019 /* check if reached exact end */
1020 if (nbSeq) return ERROR(corruption_detected);
1021 /* save reps for next block */
1022 { U32 i; for (i=0; i<ZSTD_REP_NUM; i++) dctx->rep[i] = (U32)(seqState.prevOffset[i]); }
1023 }
1024
1025 /* last literal segment */
1026 { size_t const lastLLSize = litEnd - litPtr;
1027 if (lastLLSize > (size_t)(oend-op)) return ERROR(dstSize_tooSmall);
1028 memcpy(op, litPtr, lastLLSize);
1029 op += lastLLSize;
1030 }
1031
1032 return op-ostart;
1033 }
1034
1035
1036 static void ZSTD_checkContinuity(ZSTD_DCtx* dctx, const void* dst)
1037 {
1038 if (dst != dctx->previousDstEnd) { /* not contiguous */
1039 dctx->dictEnd = dctx->previousDstEnd;
1040 dctx->vBase = (const char*)dst - ((const char*)(dctx->previousDstEnd) - (const char*)(dctx->base));
1041 dctx->base = dst;
1042 dctx->previousDstEnd = dst;
1043 }
1044 }
1045
1046
1047 static size_t ZSTD_decompressBlock_internal(ZSTD_DCtx* dctx,
1048 void* dst, size_t dstCapacity,
1049 const void* src, size_t srcSize)
1050 { /* blockType == blockCompressed */
1051 const BYTE* ip = (const BYTE*)src;
1052
1053 if (srcSize >= ZSTD_BLOCKSIZE_ABSOLUTEMAX) return ERROR(srcSize_wrong);
1054
1055 /* Decode literals sub-block */
1056 { size_t const litCSize = ZSTD_decodeLiteralsBlock(dctx, src, srcSize);
1057 if (ZSTD_isError(litCSize)) return litCSize;
1058 ip += litCSize;
1059 srcSize -= litCSize;
1060 }
1061 return ZSTD_decompressSequences(dctx, dst, dstCapacity, ip, srcSize);
1062 }
1063
1064
1065 size_t ZSTD_decompressBlock(ZSTD_DCtx* dctx,
1066 void* dst, size_t dstCapacity,
1067 const void* src, size_t srcSize)
1068 {
1069 size_t dSize;
1070 ZSTD_checkContinuity(dctx, dst);
1071 dSize = ZSTD_decompressBlock_internal(dctx, dst, dstCapacity, src, srcSize);
1072 dctx->previousDstEnd = (char*)dst + dSize;
1073 return dSize;
1074 }
1075
1076
1077 /** ZSTD_insertBlock() :
1078 insert `src` block into `dctx` history. Useful to track uncompressed blocks. */
1079 ZSTDLIB_API size_t ZSTD_insertBlock(ZSTD_DCtx* dctx, const void* blockStart, size_t blockSize)
1080 {
1081 ZSTD_checkContinuity(dctx, blockStart);
1082 dctx->previousDstEnd = (const char*)blockStart + blockSize;
1083 return blockSize;
1084 }
1085
1086
1087 size_t ZSTD_generateNxBytes(void* dst, size_t dstCapacity, BYTE byte, size_t length)
1088 {
1089 if (length > dstCapacity) return ERROR(dstSize_tooSmall);
1090 memset(dst, byte, length);
1091 return length;
1092 }
1093
1094
1095 /*! ZSTD_decompressFrame() :
1096 * `dctx` must be properly initialized */
1097 static size_t ZSTD_decompressFrame(ZSTD_DCtx* dctx,
1098 void* dst, size_t dstCapacity,
1099 const void* src, size_t srcSize)
1100 {
1101 const BYTE* ip = (const BYTE*)src;
1102 BYTE* const ostart = (BYTE* const)dst;
1103 BYTE* const oend = ostart + dstCapacity;
1104 BYTE* op = ostart;
1105 size_t remainingSize = srcSize;
1106
1107 /* check */
1108 if (srcSize < ZSTD_frameHeaderSize_min+ZSTD_blockHeaderSize) return ERROR(srcSize_wrong);
1109
1110 /* Frame Header */
1111 { size_t const frameHeaderSize = ZSTD_frameHeaderSize(src, ZSTD_frameHeaderSize_prefix);
1112 if (ZSTD_isError(frameHeaderSize)) return frameHeaderSize;
1113 if (srcSize < frameHeaderSize+ZSTD_blockHeaderSize) return ERROR(srcSize_wrong);
1114 CHECK_F(ZSTD_decodeFrameHeader(dctx, src, frameHeaderSize));
1115 ip += frameHeaderSize; remainingSize -= frameHeaderSize;
1116 }
1117
1118 /* Loop on each block */
1119 while (1) {
1120 size_t decodedSize;
1121 blockProperties_t blockProperties;
1122 size_t const cBlockSize = ZSTD_getcBlockSize(ip, remainingSize, &blockProperties);
1123 if (ZSTD_isError(cBlockSize)) return cBlockSize;
1124
1125 ip += ZSTD_blockHeaderSize;
1126 remainingSize -= ZSTD_blockHeaderSize;
1127 if (cBlockSize > remainingSize) return ERROR(srcSize_wrong);
1128
1129 switch(blockProperties.blockType)
1130 {
1131 case bt_compressed:
1132 decodedSize = ZSTD_decompressBlock_internal(dctx, op, oend-op, ip, cBlockSize);
1133 break;
1134 case bt_raw :
1135 decodedSize = ZSTD_copyRawBlock(op, oend-op, ip, cBlockSize);
1136 break;
1137 case bt_rle :
1138 decodedSize = ZSTD_generateNxBytes(op, oend-op, *ip, blockProperties.origSize);
1139 break;
1140 case bt_reserved :
1141 default:
1142 return ERROR(corruption_detected);
1143 }
1144
1145 if (ZSTD_isError(decodedSize)) return decodedSize;
1146 if (dctx->fParams.checksumFlag) XXH64_update(&dctx->xxhState, op, decodedSize);
1147 op += decodedSize;
1148 ip += cBlockSize;
1149 remainingSize -= cBlockSize;
1150 if (blockProperties.lastBlock) break;
1151 }
1152
1153 if (dctx->fParams.checksumFlag) { /* Frame content checksum verification */
1154 U32 const checkCalc = (U32)XXH64_digest(&dctx->xxhState);
1155 U32 checkRead;
1156 if (remainingSize<4) return ERROR(checksum_wrong);
1157 checkRead = MEM_readLE32(ip);
1158 if (checkRead != checkCalc) return ERROR(checksum_wrong);
1159 remainingSize -= 4;
1160 }
1161
1162 if (remainingSize) return ERROR(srcSize_wrong);
1163 return op-ostart;
1164 }
1165
1166
1167 size_t ZSTD_decompress_usingDict(ZSTD_DCtx* dctx,
1168 void* dst, size_t dstCapacity,
1169 const void* src, size_t srcSize,
1170 const void* dict, size_t dictSize)
1171 {
1172 #if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT==1)
1173 if (ZSTD_isLegacy(src, srcSize)) return ZSTD_decompressLegacy(dst, dstCapacity, src, srcSize, dict, dictSize);
1174 #endif
1175 ZSTD_decompressBegin_usingDict(dctx, dict, dictSize);
1176 ZSTD_checkContinuity(dctx, dst);
1177 return ZSTD_decompressFrame(dctx, dst, dstCapacity, src, srcSize);
1178 }
1179
1180
1181 size_t ZSTD_decompressDCtx(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize)
1182 {
1183 return ZSTD_decompress_usingDict(dctx, dst, dstCapacity, src, srcSize, NULL, 0);
1184 }
1185
1186
1187 size_t ZSTD_decompress(void* dst, size_t dstCapacity, const void* src, size_t srcSize)
1188 {
1189 #if defined(ZSTD_HEAPMODE) && (ZSTD_HEAPMODE==1)
1190 size_t regenSize;
1191 ZSTD_DCtx* const dctx = ZSTD_createDCtx();
1192 if (dctx==NULL) return ERROR(memory_allocation);
1193 regenSize = ZSTD_decompressDCtx(dctx, dst, dstCapacity, src, srcSize);
1194 ZSTD_freeDCtx(dctx);
1195 return regenSize;
1196 #else /* stack mode */
1197 ZSTD_DCtx dctx;
1198 return ZSTD_decompressDCtx(&dctx, dst, dstCapacity, src, srcSize);
1199 #endif
1200 }
1201
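A minimal caller for the one-shot entry point above; a sketch, not a recommended implementation. It assumes the frame recorded its content size at compression time, and bails out when ZSTD_getDecompressedSize() reports 0 (unknown), in which case the streaming API should be used instead:

#include <stdlib.h>

/* Sketch : one-shot decompression into a heap buffer.
 * Returns 0 on success, -1 on any failure. */
static int demo_oneShot(const void* cBuf, size_t cSize)
{
    unsigned long long const rSize = ZSTD_getDecompressedSize(cBuf, cSize);
    void* dst;
    size_t dSize;
    if (rSize == 0 || rSize > (size_t)-1) return -1;  /* unknown or oversized : stream instead */
    dst = malloc((size_t)rSize);
    if (dst == NULL) return -1;
    dSize = ZSTD_decompress(dst, (size_t)rSize, cBuf, cSize);
    if (ZSTD_isError(dSize)) { free(dst); return -1; }
    /* ... consume dst[0..dSize) ... */
    free(dst);
    return 0;
}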
1202
1203 /*-**************************************
1204 * Advanced Streaming Decompression API
1205 * Bufferless and synchronous
1206 ****************************************/
1207 size_t ZSTD_nextSrcSizeToDecompress(ZSTD_DCtx* dctx) { return dctx->expected; }
1208
1209 ZSTD_nextInputType_e ZSTD_nextInputType(ZSTD_DCtx* dctx) {
1210 switch(dctx->stage)
1211 {
1212 default: /* should not happen */
1213 case ZSTDds_getFrameHeaderSize:
1214 case ZSTDds_decodeFrameHeader:
1215 return ZSTDnit_frameHeader;
1216 case ZSTDds_decodeBlockHeader:
1217 return ZSTDnit_blockHeader;
1218 case ZSTDds_decompressBlock:
1219 return ZSTDnit_block;
1220 case ZSTDds_decompressLastBlock:
1221 return ZSTDnit_lastBlock;
1222 case ZSTDds_checkChecksum:
1223 return ZSTDnit_checksum;
1224 case ZSTDds_decodeSkippableHeader:
1225 case ZSTDds_skipFrame:
1226 return ZSTDnit_skippableFrame;
1227 }
1228 }
1229
1230 int ZSTD_isSkipFrame(ZSTD_DCtx* dctx) { return dctx->stage == ZSTDds_skipFrame; } /* for zbuff */
1231
1232 /** ZSTD_decompressContinue() :
1233 * @return : nb of bytes generated into `dst` (necessarily <= `dstCapacity`)
1234 * or an error code, which can be tested using ZSTD_isError() */
1235 size_t ZSTD_decompressContinue(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize)
1236 {
1237 /* Sanity check */
1238 if (srcSize != dctx->expected) return ERROR(srcSize_wrong);
1239 if (dstCapacity) ZSTD_checkContinuity(dctx, dst);
1240
1241 switch (dctx->stage)
1242 {
1243 case ZSTDds_getFrameHeaderSize :
1244 if (srcSize != ZSTD_frameHeaderSize_prefix) return ERROR(srcSize_wrong); /* impossible */
1245 if ((MEM_readLE32(src) & 0xFFFFFFF0U) == ZSTD_MAGIC_SKIPPABLE_START) { /* skippable frame */
1246 memcpy(dctx->headerBuffer, src, ZSTD_frameHeaderSize_prefix);
1247 dctx->expected = ZSTD_skippableHeaderSize - ZSTD_frameHeaderSize_prefix; /* magic number + skippable frame length */
1248 dctx->stage = ZSTDds_decodeSkippableHeader;
1249 return 0;
1250 }
1251 dctx->headerSize = ZSTD_frameHeaderSize(src, ZSTD_frameHeaderSize_prefix);
1252 if (ZSTD_isError(dctx->headerSize)) return dctx->headerSize;
1253 memcpy(dctx->headerBuffer, src, ZSTD_frameHeaderSize_prefix);
1254 if (dctx->headerSize > ZSTD_frameHeaderSize_prefix) {
1255 dctx->expected = dctx->headerSize - ZSTD_frameHeaderSize_prefix;
1256 dctx->stage = ZSTDds_decodeFrameHeader;
1257 return 0;
1258 }
1259 dctx->expected = 0; /* not necessary to copy more */
1260
1261 case ZSTDds_decodeFrameHeader:
1262 memcpy(dctx->headerBuffer + ZSTD_frameHeaderSize_prefix, src, dctx->expected);
1263 CHECK_F(ZSTD_decodeFrameHeader(dctx, dctx->headerBuffer, dctx->headerSize));
1264 dctx->expected = ZSTD_blockHeaderSize;
1265 dctx->stage = ZSTDds_decodeBlockHeader;
1266 return 0;
1267
1268 case ZSTDds_decodeBlockHeader:
1269 { blockProperties_t bp;
1270 size_t const cBlockSize = ZSTD_getcBlockSize(src, ZSTD_blockHeaderSize, &bp);
1271 if (ZSTD_isError(cBlockSize)) return cBlockSize;
1272 dctx->expected = cBlockSize;
1273 dctx->bType = bp.blockType;
1274 dctx->rleSize = bp.origSize;
1275 if (cBlockSize) {
1276 dctx->stage = bp.lastBlock ? ZSTDds_decompressLastBlock : ZSTDds_decompressBlock;
1277 return 0;
1278 }
1279 /* empty block */
1280 if (bp.lastBlock) {
1281 if (dctx->fParams.checksumFlag) {
1282 dctx->expected = 4;
1283 dctx->stage = ZSTDds_checkChecksum;
1284 } else {
1285 dctx->expected = 0; /* end of frame */
1286 dctx->stage = ZSTDds_getFrameHeaderSize;
1287 }
1288 } else {
1289 dctx->expected = 3; /* go directly to next header */
1290 dctx->stage = ZSTDds_decodeBlockHeader;
1291 }
1292 return 0;
1293 }
1294 case ZSTDds_decompressLastBlock:
1295 case ZSTDds_decompressBlock:
1296 { size_t rSize;
1297 switch(dctx->bType)
1298 {
1299 case bt_compressed:
1300 rSize = ZSTD_decompressBlock_internal(dctx, dst, dstCapacity, src, srcSize);
1301 break;
1302 case bt_raw :
1303 rSize = ZSTD_copyRawBlock(dst, dstCapacity, src, srcSize);
1304 break;
1305 case bt_rle :
1306 rSize = ZSTD_setRleBlock(dst, dstCapacity, src, srcSize, dctx->rleSize);
1307 break;
1308 case bt_reserved : /* should never happen */
1309 default:
1310 return ERROR(corruption_detected);
1311 }
1312 if (ZSTD_isError(rSize)) return rSize;
1313 if (dctx->fParams.checksumFlag) XXH64_update(&dctx->xxhState, dst, rSize);
1314
1315 if (dctx->stage == ZSTDds_decompressLastBlock) { /* end of frame */
1316 if (dctx->fParams.checksumFlag) { /* another round for frame checksum */
1317 dctx->expected = 4;
1318 dctx->stage = ZSTDds_checkChecksum;
1319 } else {
1320 dctx->expected = 0; /* ends here */
1321 dctx->stage = ZSTDds_getFrameHeaderSize;
1322 }
1323 } else {
1324 dctx->stage = ZSTDds_decodeBlockHeader;
1325 dctx->expected = ZSTD_blockHeaderSize;
1326 dctx->previousDstEnd = (char*)dst + rSize;
1327 }
1328 return rSize;
1329 }
1330 case ZSTDds_checkChecksum:
1331 { U32 const h32 = (U32)XXH64_digest(&dctx->xxhState);
1332 U32 const check32 = MEM_readLE32(src); /* srcSize == 4, guaranteed by dctx->expected */
1333 if (check32 != h32) return ERROR(checksum_wrong);
1334 dctx->expected = 0;
1335 dctx->stage = ZSTDds_getFrameHeaderSize;
1336 return 0;
1337 }
1338 case ZSTDds_decodeSkippableHeader:
1339 { memcpy(dctx->headerBuffer + ZSTD_frameHeaderSize_prefix, src, dctx->expected);
1340 dctx->expected = MEM_readLE32(dctx->headerBuffer + 4);
1341 dctx->stage = ZSTDds_skipFrame;
1342 return 0;
1343 }
1344 case ZSTDds_skipFrame:
1345 { dctx->expected = 0;
1346 dctx->stage = ZSTDds_getFrameHeaderSize;
1347 return 0;
1348 }
1349 default:
1350 return ERROR(GENERIC); /* impossible */
1351 }
1352 }
1353
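The contract of ZSTD_decompressContinue() above is strict: each call must be fed exactly ZSTD_nextSrcSizeToDecompress() bytes. A sketch of the driving loop for one frame held fully in memory, reusing the file's own CHECK_F/ERROR helpers (the demo_ name is illustrative):

/* Sketch : bufferless streaming decompression of one in-memory frame.
 * Returns bytes written into dst, or an error code (ZSTD_isError()). */
static size_t demo_bufferless(ZSTD_DCtx* dctx,
                              void* dst, size_t dstCapacity,
                              const void* src, size_t srcSize)
{
    const char* ip = (const char*)src;
    const char* const iend = ip + srcSize;
    char* op = (char*)dst;
    char* const oend = op + dstCapacity;
    CHECK_F(ZSTD_decompressBegin(dctx));
    while (1) {
        size_t const toRead = ZSTD_nextSrcSizeToDecompress(dctx);
        size_t written;
        if (toRead == 0) break;                            /* frame fully decoded */
        if (toRead > (size_t)(iend-ip)) return ERROR(srcSize_wrong);
        written = ZSTD_decompressContinue(dctx, op, (size_t)(oend-op), ip, toRead);
        if (ZSTD_isError(written)) return written;
        ip += toRead; op += written;
    }
    return (size_t)(op - (char*)dst);
}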
1354
1355 static size_t ZSTD_refDictContent(ZSTD_DCtx* dctx, const void* dict, size_t dictSize)
1356 {
1357 dctx->dictEnd = dctx->previousDstEnd;
1358 dctx->vBase = (const char*)dict - ((const char*)(dctx->previousDstEnd) - (const char*)(dctx->base));
1359 dctx->base = dict;
1360 dctx->previousDstEnd = (const char*)dict + dictSize;
1361 return 0;
1362 }
1363
1364 static size_t ZSTD_loadEntropy(ZSTD_DCtx* dctx, const void* const dict, size_t const dictSize)
1365 {
1366 const BYTE* dictPtr = (const BYTE*)dict;
1367 const BYTE* const dictEnd = dictPtr + dictSize;
1368
1369 { size_t const hSize = HUF_readDTableX4(dctx->hufTable, dict, dictSize);
1370 if (HUF_isError(hSize)) return ERROR(dictionary_corrupted);
1371 dictPtr += hSize;
1372 }
1373
1374 { short offcodeNCount[MaxOff+1];
1375 U32 offcodeMaxValue=MaxOff, offcodeLog;
1376 size_t const offcodeHeaderSize = FSE_readNCount(offcodeNCount, &offcodeMaxValue, &offcodeLog, dictPtr, dictEnd-dictPtr);
1377 if (FSE_isError(offcodeHeaderSize)) return ERROR(dictionary_corrupted);
1378 if (offcodeLog > OffFSELog) return ERROR(dictionary_corrupted);
1379 CHECK_E(FSE_buildDTable(dctx->OFTable, offcodeNCount, offcodeMaxValue, offcodeLog), dictionary_corrupted);
1380 dictPtr += offcodeHeaderSize;
1381 }
1382
1383 { short matchlengthNCount[MaxML+1];
1384 unsigned matchlengthMaxValue = MaxML, matchlengthLog;
1385 size_t const matchlengthHeaderSize = FSE_readNCount(matchlengthNCount, &matchlengthMaxValue, &matchlengthLog, dictPtr, dictEnd-dictPtr);
1386 if (FSE_isError(matchlengthHeaderSize)) return ERROR(dictionary_corrupted);
1387 if (matchlengthLog > MLFSELog) return ERROR(dictionary_corrupted);
1388 CHECK_E(FSE_buildDTable(dctx->MLTable, matchlengthNCount, matchlengthMaxValue, matchlengthLog), dictionary_corrupted);
1389 dictPtr += matchlengthHeaderSize;
1390 }
1391
1392 { short litlengthNCount[MaxLL+1];
1393 unsigned litlengthMaxValue = MaxLL, litlengthLog;
1394 size_t const litlengthHeaderSize = FSE_readNCount(litlengthNCount, &litlengthMaxValue, &litlengthLog, dictPtr, dictEnd-dictPtr);
1395 if (FSE_isError(litlengthHeaderSize)) return ERROR(dictionary_corrupted);
1396 if (litlengthLog > LLFSELog) return ERROR(dictionary_corrupted);
1397 CHECK_E(FSE_buildDTable(dctx->LLTable, litlengthNCount, litlengthMaxValue, litlengthLog), dictionary_corrupted);
1398 dictPtr += litlengthHeaderSize;
1399 }
1400
1401 if (dictPtr+12 > dictEnd) return ERROR(dictionary_corrupted);
1402 dctx->rep[0] = MEM_readLE32(dictPtr+0); if (dctx->rep[0] >= dictSize) return ERROR(dictionary_corrupted);
1403 dctx->rep[1] = MEM_readLE32(dictPtr+4); if (dctx->rep[1] >= dictSize) return ERROR(dictionary_corrupted);
1404 dctx->rep[2] = MEM_readLE32(dictPtr+8); if (dctx->rep[2] >= dictSize) return ERROR(dictionary_corrupted);
1405 dictPtr += 12;
1406
1407 dctx->litEntropy = dctx->fseEntropy = 1;
1408 return dictPtr - (const BYTE*)dict;
1409 }
1410
1411 static size_t ZSTD_decompress_insertDictionary(ZSTD_DCtx* dctx, const void* dict, size_t dictSize)
1412 {
1413 if (dictSize < 8) return ZSTD_refDictContent(dctx, dict, dictSize);
1414 { U32 const magic = MEM_readLE32(dict);
1415 if (magic != ZSTD_DICT_MAGIC) {
1416 return ZSTD_refDictContent(dctx, dict, dictSize); /* pure content mode */
1417 } }
1418 dctx->dictID = MEM_readLE32((const char*)dict + 4);
1419
1420 /* load entropy tables */
1421 dict = (const char*)dict + 8;
1422 dictSize -= 8;
1423 { size_t const eSize = ZSTD_loadEntropy(dctx, dict, dictSize);
1424 if (ZSTD_isError(eSize)) return ERROR(dictionary_corrupted);
1425 dict = (const char*)dict + eSize;
1426 dictSize -= eSize;
1427 }
1428
1429 /* reference dictionary content */
1430 return ZSTD_refDictContent(dctx, dict, dictSize);
1431 }
1432
1433 size_t ZSTD_decompressBegin_usingDict(ZSTD_DCtx* dctx, const void* dict, size_t dictSize)
1434 {
1435 CHECK_F(ZSTD_decompressBegin(dctx));
1436 if (dict && dictSize) CHECK_E(ZSTD_decompress_insertDictionary(dctx, dict, dictSize), dictionary_corrupted);
1437 return 0;
1438 }
1439
1440
1441 /* ====== ZSTD_DDict ====== */
1442
1443 struct ZSTD_DDict_s {
1444 void* dict;
1445 size_t dictSize;
1446 ZSTD_DCtx* refContext;
1447 }; /* typedef'd to ZSTD_DDict within "zstd.h" */
1448
1449 ZSTD_DDict* ZSTD_createDDict_advanced(const void* dict, size_t dictSize, ZSTD_customMem customMem)
1450 {
1451 if (!customMem.customAlloc && !customMem.customFree) customMem = defaultCustomMem;
1452 if (!customMem.customAlloc || !customMem.customFree) return NULL;
1453
1454 { ZSTD_DDict* const ddict = (ZSTD_DDict*) ZSTD_malloc(sizeof(ZSTD_DDict), customMem);
1455 void* const dictContent = ZSTD_malloc(dictSize, customMem);
1456 ZSTD_DCtx* const dctx = ZSTD_createDCtx_advanced(customMem);
1457
1458 if (!dictContent || !ddict || !dctx) {
1459 ZSTD_free(dictContent, customMem);
1460 ZSTD_free(ddict, customMem);
1461 ZSTD_free(dctx, customMem);
1462 return NULL;
1463 }
1464
1465 if (dictSize) {
1466 memcpy(dictContent, dict, dictSize);
1467 }
1468 { size_t const errorCode = ZSTD_decompressBegin_usingDict(dctx, dictContent, dictSize);
1469 if (ZSTD_isError(errorCode)) {
1470 ZSTD_free(dictContent, customMem);
1471 ZSTD_free(ddict, customMem);
1472 ZSTD_free(dctx, customMem);
1473 return NULL;
1474 } }
1475
1476 ddict->dict = dictContent;
1477 ddict->dictSize = dictSize;
1478 ddict->refContext = dctx;
1479 return ddict;
1480 }
1481 }
1482
1483 /*! ZSTD_createDDict() :
1484 * Create a digested dictionary, ready to start decompression without startup delay.
1485 * `dict` can be released after `ZSTD_DDict` creation */
1486 ZSTD_DDict* ZSTD_createDDict(const void* dict, size_t dictSize)
1487 {
1488 ZSTD_customMem const allocator = { NULL, NULL, NULL };
1489 return ZSTD_createDDict_advanced(dict, dictSize, allocator);
1490 }
1491
1492 size_t ZSTD_freeDDict(ZSTD_DDict* ddict)
1493 {
1494 if (ddict==NULL) return 0; /* support free on NULL */
1495 { ZSTD_customMem const cMem = ddict->refContext->customMem;
1496 ZSTD_freeDCtx(ddict->refContext);
1497 ZSTD_free(ddict->dict, cMem);
1498 ZSTD_free(ddict, cMem);
1499 return 0;
1500 }
1501 }
1502
1503 size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict)
1504 {
1505 if (ddict==NULL) return 0; /* support sizeof on NULL */
1506 return sizeof(*ddict) + sizeof(ddict->refContext) + ddict->dictSize;
1507 }
1508
1509
1510 /*! ZSTD_decompress_usingDDict() :
1511 * Decompression using a pre-digested dictionary.
1512 * Uses the dictionary without significant startup overhead. */
1513 size_t ZSTD_decompress_usingDDict(ZSTD_DCtx* dctx,
1514 void* dst, size_t dstCapacity,
1515 const void* src, size_t srcSize,
1516 const ZSTD_DDict* ddict)
1517 {
1518 #if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT==1)
1519 if (ZSTD_isLegacy(src, srcSize)) return ZSTD_decompressLegacy(dst, dstCapacity, src, srcSize, ddict->dict, ddict->dictSize);
1520 #endif
1521 ZSTD_refDCtx(dctx, ddict->refContext);
1522 ZSTD_checkContinuity(dctx, dst);
1523 return ZSTD_decompressFrame(dctx, dst, dstCapacity, src, srcSize);
1524 }
1525
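The intended DDict workflow is to digest a dictionary once, then reuse it across many frames via the function above. A sketch for a single frame (dictBuf/dictSize are assumed inputs, e.g. a dictionary trained offline; the demo_ name is illustrative):

/* Sketch : decompress one frame against a pre-digested dictionary.
 * In real use, keep the DDict and DCtx alive across frames to amortize
 * the dictionary-loading cost. */
static size_t demo_oneFrameWithDict(const void* dictBuf, size_t dictSize,
                                    void* dst, size_t dstCapacity,
                                    const void* src, size_t srcSize)
{
    size_t dSize = ERROR(memory_allocation);
    ZSTD_DDict* const ddict = ZSTD_createDDict(dictBuf, dictSize);
    ZSTD_DCtx* const dctx = ZSTD_createDCtx();
    if (ddict && dctx)
        dSize = ZSTD_decompress_usingDDict(dctx, dst, dstCapacity, src, srcSize, ddict);
    ZSTD_freeDCtx(dctx);
    ZSTD_freeDDict(ddict);
    return dSize;   /* test with ZSTD_isError() */
}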
1526
1527 /*=====================================
1528 * Streaming decompression
1529 *====================================*/
1530
1531 typedef enum { zdss_init, zdss_loadHeader,
1532 zdss_read, zdss_load, zdss_flush } ZSTD_dStreamStage;
1533
1534 /* *** Resource management *** */
1535 struct ZSTD_DStream_s {
1536 ZSTD_DCtx* dctx;
1537 ZSTD_DDict* ddictLocal;
1538 const ZSTD_DDict* ddict;
1539 ZSTD_frameParams fParams;
1540 ZSTD_dStreamStage stage;
1541 char* inBuff;
1542 size_t inBuffSize;
1543 size_t inPos;
1544 size_t maxWindowSize;
1545 char* outBuff;
1546 size_t outBuffSize;
1547 size_t outStart;
1548 size_t outEnd;
1549 size_t blockSize;
1550 BYTE headerBuffer[ZSTD_FRAMEHEADERSIZE_MAX]; /* tmp buffer to store frame header */
1551 size_t lhSize;
1552 ZSTD_customMem customMem;
1553 void* legacyContext;
1554 U32 previousLegacyVersion;
1555 U32 legacyVersion;
1556 U32 hostageByte;
1557 }; /* typedef'd to ZSTD_DStream within "zstd.h" */
1558
1559
1560 ZSTD_DStream* ZSTD_createDStream(void)
1561 {
1562 return ZSTD_createDStream_advanced(defaultCustomMem);
1563 }
1564
1565 ZSTD_DStream* ZSTD_createDStream_advanced(ZSTD_customMem customMem)
1566 {
1567 ZSTD_DStream* zds;
1568
1569 if (!customMem.customAlloc && !customMem.customFree) customMem = defaultCustomMem;
1570 if (!customMem.customAlloc || !customMem.customFree) return NULL;
1571
1572 zds = (ZSTD_DStream*) ZSTD_malloc(sizeof(ZSTD_DStream), customMem);
1573 if (zds==NULL) return NULL;
1574 memset(zds, 0, sizeof(ZSTD_DStream));
1575 memcpy(&zds->customMem, &customMem, sizeof(ZSTD_customMem));
1576 zds->dctx = ZSTD_createDCtx_advanced(customMem);
1577 if (zds->dctx == NULL) { ZSTD_freeDStream(zds); return NULL; }
1578 zds->stage = zdss_init;
1579 zds->maxWindowSize = ZSTD_MAXWINDOWSIZE_DEFAULT;
1580 return zds;
1581 }
1582
1583 size_t ZSTD_freeDStream(ZSTD_DStream* zds)
1584 {
1585 if (zds==NULL) return 0; /* support free on null */
1586 { ZSTD_customMem const cMem = zds->customMem;
1587 ZSTD_freeDCtx(zds->dctx);
1588 ZSTD_freeDDict(zds->ddictLocal);
1589 ZSTD_free(zds->inBuff, cMem);
1590 ZSTD_free(zds->outBuff, cMem);
1591 #if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT >= 1)
1592 if (zds->legacyContext)
1593 ZSTD_freeLegacyStreamContext(zds->legacyContext, zds->previousLegacyVersion);
1594 #endif
1595 ZSTD_free(zds, cMem);
1596 return 0;
1597 }
1598 }
1599
1600
1601 /* *** Initialization *** */
1602
1603 size_t ZSTD_DStreamInSize(void) { return ZSTD_BLOCKSIZE_ABSOLUTEMAX + ZSTD_blockHeaderSize; }
1604 size_t ZSTD_DStreamOutSize(void) { return ZSTD_BLOCKSIZE_ABSOLUTEMAX; }
1605
1606 size_t ZSTD_initDStream_usingDict(ZSTD_DStream* zds, const void* dict, size_t dictSize)
1607 {
1608 zds->stage = zdss_loadHeader;
1609 zds->lhSize = zds->inPos = zds->outStart = zds->outEnd = 0;
1610 ZSTD_freeDDict(zds->ddictLocal);
1611 if (dict) {
1612 zds->ddictLocal = ZSTD_createDDict(dict, dictSize);
1613 if (zds->ddictLocal == NULL) return ERROR(memory_allocation);
1614 } else zds->ddictLocal = NULL;
1615 zds->ddict = zds->ddictLocal;
1616 zds->legacyVersion = 0;
1617 zds->hostageByte = 0;
1618 return ZSTD_frameHeaderSize_prefix;
1619 }
1620
1621 size_t ZSTD_initDStream(ZSTD_DStream* zds)
1622 {
1623 return ZSTD_initDStream_usingDict(zds, NULL, 0);
1624 }
1625
1626 size_t ZSTD_initDStream_usingDDict(ZSTD_DStream* zds, const ZSTD_DDict* ddict) /**< note : ddict will just be referenced, and must outlive decompression session */
1627 {
1628 size_t const initResult = ZSTD_initDStream(zds);
1629 zds->ddict = ddict;
1630 return initResult;
1631 }
1632
1633 size_t ZSTD_resetDStream(ZSTD_DStream* zds)
1634 {
1635 zds->stage = zdss_loadHeader;
1636 zds->lhSize = zds->inPos = zds->outStart = zds->outEnd = 0;
1637 zds->legacyVersion = 0;
1638 zds->hostageByte = 0;
1639 return ZSTD_frameHeaderSize_prefix;
1640 }
1641
1642 size_t ZSTD_setDStreamParameter(ZSTD_DStream* zds,
1643 ZSTD_DStreamParameter_e paramType, unsigned paramValue)
1644 {
1645 switch(paramType)
1646 {
1647 default : return ERROR(parameter_unknown);
1648 case ZSTDdsp_maxWindowSize : zds->maxWindowSize = paramValue ? paramValue : (U32)(-1); break;
1649 }
1650 return 0;
1651 }
1652
1653
1654 size_t ZSTD_sizeof_DStream(const ZSTD_DStream* zds)
1655 {
1656 if (zds==NULL) return 0; /* support sizeof on NULL */
1657 return sizeof(*zds) + ZSTD_sizeof_DCtx(zds->dctx) + ZSTD_sizeof_DDict(zds->ddictLocal) + zds->inBuffSize + zds->outBuffSize;
1658 }
1659
1660
1661 /* ***** Decompression ***** */
1662
1663 MEM_STATIC size_t ZSTD_limitCopy(void* dst, size_t dstCapacity, const void* src, size_t srcSize)
1664 {
1665 size_t const length = MIN(dstCapacity, srcSize);
1666 memcpy(dst, src, length);
1667 return length;
1668 }
1669
1670
1671 size_t ZSTD_decompressStream(ZSTD_DStream* zds, ZSTD_outBuffer* output, ZSTD_inBuffer* input)
1672 {
1673 const char* const istart = (const char*)(input->src) + input->pos;
1674 const char* const iend = (const char*)(input->src) + input->size;
1675 const char* ip = istart;
1676 char* const ostart = (char*)(output->dst) + output->pos;
1677 char* const oend = (char*)(output->dst) + output->size;
1678 char* op = ostart;
1679 U32 someMoreWork = 1;
1680
1681 #if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT>=1)
1682 if (zds->legacyVersion)
1683 return ZSTD_decompressLegacyStream(zds->legacyContext, zds->legacyVersion, output, input);
1684 #endif
1685
1686 while (someMoreWork) {
1687 switch(zds->stage)
1688 {
1689 case zdss_init :
1690 return ERROR(init_missing);
1691
1692 case zdss_loadHeader :
1693 { size_t const hSize = ZSTD_getFrameParams(&zds->fParams, zds->headerBuffer, zds->lhSize);
1694 if (ZSTD_isError(hSize))
1695 #if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT>=1)
1696 { U32 const legacyVersion = ZSTD_isLegacy(istart, iend-istart);
1697 if (legacyVersion) {
1698 const void* const dict = zds->ddict ? zds->ddict->dict : NULL;
1699 size_t const dictSize = zds->ddict ? zds->ddict->dictSize : 0;
1700 CHECK_F(ZSTD_initLegacyStream(&zds->legacyContext, zds->previousLegacyVersion, legacyVersion,
1701 dict, dictSize));
1702 zds->legacyVersion = zds->previousLegacyVersion = legacyVersion;
1703 return ZSTD_decompressLegacyStream(zds->legacyContext, zds->legacyVersion, output, input);
1704 } else {
1705 return hSize; /* error */
1706 } }
1707 #else
1708 return hSize;
1709 #endif
1710 if (hSize != 0) { /* need more input */
1711 size_t const toLoad = hSize - zds->lhSize; /* if hSize!=0, hSize > zds->lhSize */
1712 if (toLoad > (size_t)(iend-ip)) { /* not enough input to load full header */
1713 memcpy(zds->headerBuffer + zds->lhSize, ip, iend-ip);
1714 zds->lhSize += iend-ip;
1715 input->pos = input->size;
1716 return (MAX(ZSTD_frameHeaderSize_min, hSize) - zds->lhSize) + ZSTD_blockHeaderSize; /* remaining header bytes + next block header */
1717 }
1718 memcpy(zds->headerBuffer + zds->lhSize, ip, toLoad); zds->lhSize = hSize; ip += toLoad;
1719 break;
1720 } }
1721
1722 /* Consume header */
1723 { const ZSTD_DCtx* refContext = zds->ddict ? zds->ddict->refContext : NULL;
1724 ZSTD_refDCtx(zds->dctx, refContext);
1725 }
1726 { size_t const h1Size = ZSTD_nextSrcSizeToDecompress(zds->dctx); /* == ZSTD_frameHeaderSize_prefix */
1727 CHECK_F(ZSTD_decompressContinue(zds->dctx, NULL, 0, zds->headerBuffer, h1Size));
1728 { size_t const h2Size = ZSTD_nextSrcSizeToDecompress(zds->dctx);
1729 CHECK_F(ZSTD_decompressContinue(zds->dctx, NULL, 0, zds->headerBuffer+h1Size, h2Size));
1730 } }
1731
1732 zds->fParams.windowSize = MAX(zds->fParams.windowSize, 1U << ZSTD_WINDOWLOG_ABSOLUTEMIN);
1733 if (zds->fParams.windowSize > zds->maxWindowSize) return ERROR(frameParameter_windowTooLarge);
1734
1735 /* Adapt buffer sizes to frame header instructions */
1736 { size_t const blockSize = MIN(zds->fParams.windowSize, ZSTD_BLOCKSIZE_ABSOLUTEMAX);
1737 size_t const neededOutSize = zds->fParams.windowSize + blockSize;
1738 zds->blockSize = blockSize;
1739 if (zds->inBuffSize < blockSize) {
1740 ZSTD_free(zds->inBuff, zds->customMem);
1741 zds->inBuffSize = blockSize;
1742 zds->inBuff = (char*)ZSTD_malloc(blockSize, zds->customMem);
1743 if (zds->inBuff == NULL) return ERROR(memory_allocation);
1744 }
1745 if (zds->outBuffSize < neededOutSize) {
1746 ZSTD_free(zds->outBuff, zds->customMem);
1747 zds->outBuffSize = neededOutSize;
1748 zds->outBuff = (char*)ZSTD_malloc(neededOutSize, zds->customMem);
1749 if (zds->outBuff == NULL) return ERROR(memory_allocation);
1750 } }
1751 zds->stage = zdss_read;
1752 /* pass-through */
1753
1754 case zdss_read:
1755 { size_t const neededInSize = ZSTD_nextSrcSizeToDecompress(zds->dctx);
1756 if (neededInSize==0) { /* end of frame */
1757 zds->stage = zdss_init;
1758 someMoreWork = 0;
1759 break;
1760 }
1761 if ((size_t)(iend-ip) >= neededInSize) { /* decode directly from src */
1762 const int isSkipFrame = ZSTD_isSkipFrame(zds->dctx);
1763 size_t const decodedSize = ZSTD_decompressContinue(zds->dctx,
1764 zds->outBuff + zds->outStart, (isSkipFrame ? 0 : zds->outBuffSize - zds->outStart),
1765 ip, neededInSize);
1766 if (ZSTD_isError(decodedSize)) return decodedSize;
1767 ip += neededInSize;
1768 if (!decodedSize && !isSkipFrame) break; /* this was just a header */
1769 zds->outEnd = zds->outStart + decodedSize;
1770 zds->stage = zdss_flush;
1771 break;
1772 }
1773 if (ip==iend) { someMoreWork = 0; break; } /* no more input */
1774 zds->stage = zdss_load;
1775 /* pass-through */
1776 }
1777
1778 case zdss_load:
1779 { size_t const neededInSize = ZSTD_nextSrcSizeToDecompress(zds->dctx);
1780 size_t const toLoad = neededInSize - zds->inPos; /* should always be <= remaining space within inBuff */
1781 size_t loadedSize;
1782 if (toLoad > zds->inBuffSize - zds->inPos) return ERROR(corruption_detected); /* should never happen */
1783 loadedSize = ZSTD_limitCopy(zds->inBuff + zds->inPos, toLoad, ip, iend-ip);
1784 ip += loadedSize;
1785 zds->inPos += loadedSize;
1786 if (loadedSize < toLoad) { someMoreWork = 0; break; } /* not enough input, wait for more */
1787
1788 /* decode loaded input */
1789 { const int isSkipFrame = ZSTD_isSkipFrame(zds->dctx);
1790 size_t const decodedSize = ZSTD_decompressContinue(zds->dctx,
1791 zds->outBuff + zds->outStart, zds->outBuffSize - zds->outStart,
1792 zds->inBuff, neededInSize);
1793 if (ZSTD_isError(decodedSize)) return decodedSize;
1794 zds->inPos = 0; /* input is consumed */
1795 if (!decodedSize && !isSkipFrame) { zds->stage = zdss_read; break; } /* this was just a header */
1796 zds->outEnd = zds->outStart + decodedSize;
1797 zds->stage = zdss_flush;
1798 /* pass-through */
1799 } }
1800
1801 case zdss_flush:
1802 { size_t const toFlushSize = zds->outEnd - zds->outStart;
1803 size_t const flushedSize = ZSTD_limitCopy(op, oend-op, zds->outBuff + zds->outStart, toFlushSize);
1804 op += flushedSize;
1805 zds->outStart += flushedSize;
1806 if (flushedSize == toFlushSize) { /* flush completed */
1807 zds->stage = zdss_read;
1808 if (zds->outStart + zds->blockSize > zds->outBuffSize)
1809 zds->outStart = zds->outEnd = 0;
1810 break;
1811 }
1812 /* cannot complete flush */
1813 someMoreWork = 0;
1814 break;
1815 }
1816 default: return ERROR(GENERIC); /* impossible */
1817 } }
1818
1819 /* result */
1820 input->pos += (size_t)(ip-istart);
1821 output->pos += (size_t)(op-ostart);
1822 { size_t nextSrcSizeHint = ZSTD_nextSrcSizeToDecompress(zds->dctx);
1823 if (!nextSrcSizeHint) { /* frame fully decoded */
1824 if (zds->outEnd == zds->outStart) { /* output fully flushed */
1825 if (zds->hostageByte) {
1826 if (input->pos >= input->size) { zds->stage = zdss_read; return 1; } /* can't release hostage (not present) */
1827 input->pos++; /* release hostage */
1828 }
1829 return 0;
1830 }
1831 if (!zds->hostageByte) { /* output not fully flushed; keep last byte as hostage; will be released when all output is flushed */
1832 input->pos--; /* note : pos > 0; otherwise it would be impossible to finish reading the last block */
1833 zds->hostageByte=1;
1834 }
1835 return 1;
1836 }
1837 nextSrcSizeHint += ZSTD_blockHeaderSize * (ZSTD_nextInputType(zds->dctx) == ZSTDnit_block); /* preload header of next block */
1838 if (zds->inPos > nextSrcSizeHint) return ERROR(GENERIC); /* should never happen */
1839 nextSrcSizeHint -= zds->inPos; /* already loaded */
1840 return nextSrcSizeHint;
1841 }
1842 }
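
A sketch of the buffered loop ZSTD_decompressStream() above is designed for: keep feeding input and draining output until the function returns 0 (frame decoded and fully flushed). Buffer sizes come from the hint functions defined earlier; the stdio plumbing is illustrative only:

#include <stdio.h>
#include <stdlib.h>

/* Sketch : stream-decompress an entire file. Returns 0 on success. */
static int demo_streamDecompress(FILE* fin, FILE* fout)
{
    ZSTD_DStream* const zds = ZSTD_createDStream();
    char* const inBuf = (char*)malloc(ZSTD_DStreamInSize());
    char* const outBuf = (char*)malloc(ZSTD_DStreamOutSize());
    size_t ret = 1;   /* nonzero : more input expected */
    int ok = (zds != NULL) && (inBuf != NULL) && (outBuf != NULL)
          && !ZSTD_isError(ZSTD_initDStream(zds));
    while (ok && ret != 0) {
        size_t const readSz = fread(inBuf, 1, ZSTD_DStreamInSize(), fin);
        ZSTD_inBuffer input = { inBuf, readSz, 0 };
        if (readSz == 0) break;                      /* EOF before frame end */
        while (input.pos < input.size) {
            ZSTD_outBuffer output = { outBuf, ZSTD_DStreamOutSize(), 0 };
            ret = ZSTD_decompressStream(zds, &output, &input);
            if (ZSTD_isError(ret)) { ok = 0; break; }
            fwrite(outBuf, 1, output.pos, fout);
            if (ret == 0) break;                     /* frame done and flushed */
        }
        if (ret == 0) break;
    }
    free(inBuf); free(outBuf);
    ZSTD_freeDStream(zds);
    return (ok && ret == 0) ? 0 : -1;
}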
@@ -0,0 +1,1913 b''
1 /*
2 * divsufsort.c for libdivsufsort-lite
3 * Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
4 *
5 * Permission is hereby granted, free of charge, to any person
6 * obtaining a copy of this software and associated documentation
7 * files (the "Software"), to deal in the Software without
8 * restriction, including without limitation the rights to use,
9 * copy, modify, merge, publish, distribute, sublicense, and/or sell
10 * copies of the Software, and to permit persons to whom the
11 * Software is furnished to do so, subject to the following
12 * conditions:
13 *
14 * The above copyright notice and this permission notice shall be
15 * included in all copies or substantial portions of the Software.
16 *
17 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
18 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
19 * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
20 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
21 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
22 * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
23 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
24 * OTHER DEALINGS IN THE SOFTWARE.
25 */
26
27 /*- Compiler specifics -*/
28 #ifdef __clang__
29 #pragma clang diagnostic ignored "-Wshorten-64-to-32"
30 #endif
31
32 #if defined(_MSC_VER)
33 # pragma warning(disable : 4244)
34 # pragma warning(disable : 4127) /* C4127 : Condition expression is constant */
35 #endif
36
37
38 /*- Dependencies -*/
39 #include <assert.h>
40 #include <stdio.h>
41 #include <stdlib.h>
42
43 #include "divsufsort.h"
44
45 /*- Constants -*/
46 #if defined(INLINE)
47 # undef INLINE
48 #endif
49 #if !defined(INLINE)
50 # define INLINE __inline
51 #endif
52 #if defined(ALPHABET_SIZE) && (ALPHABET_SIZE < 1)
53 # undef ALPHABET_SIZE
54 #endif
55 #if !defined(ALPHABET_SIZE)
56 # define ALPHABET_SIZE (256)
57 #endif
58 #define BUCKET_A_SIZE (ALPHABET_SIZE)
59 #define BUCKET_B_SIZE (ALPHABET_SIZE * ALPHABET_SIZE)
60 #if defined(SS_INSERTIONSORT_THRESHOLD)
61 # if SS_INSERTIONSORT_THRESHOLD < 1
62 # undef SS_INSERTIONSORT_THRESHOLD
63 # define SS_INSERTIONSORT_THRESHOLD (1)
64 # endif
65 #else
66 # define SS_INSERTIONSORT_THRESHOLD (8)
67 #endif
68 #if defined(SS_BLOCKSIZE)
69 # if SS_BLOCKSIZE < 0
70 # undef SS_BLOCKSIZE
71 # define SS_BLOCKSIZE (0)
72 # elif 32768 <= SS_BLOCKSIZE
73 # undef SS_BLOCKSIZE
74 # define SS_BLOCKSIZE (32767)
75 # endif
76 #else
77 # define SS_BLOCKSIZE (1024)
78 #endif
79 /* minstacksize = log(SS_BLOCKSIZE) / log(3) * 2 */
80 #if SS_BLOCKSIZE == 0
81 # define SS_MISORT_STACKSIZE (96)
82 #elif SS_BLOCKSIZE <= 4096
83 # define SS_MISORT_STACKSIZE (16)
84 #else
85 # define SS_MISORT_STACKSIZE (24)
86 #endif
87 #define SS_SMERGE_STACKSIZE (32)
88 #define TR_INSERTIONSORT_THRESHOLD (8)
89 #define TR_STACKSIZE (64)
90
91
92 /*- Macros -*/
93 #ifndef SWAP
94 # define SWAP(_a, _b) do { t = (_a); (_a) = (_b); (_b) = t; } while(0)
95 #endif /* SWAP */
96 #ifndef MIN
97 # define MIN(_a, _b) (((_a) < (_b)) ? (_a) : (_b))
98 #endif /* MIN */
99 #ifndef MAX
100 # define MAX(_a, _b) (((_a) > (_b)) ? (_a) : (_b))
101 #endif /* MAX */
102 #define STACK_PUSH(_a, _b, _c, _d)\
103 do {\
104 assert(ssize < STACK_SIZE);\
105 stack[ssize].a = (_a), stack[ssize].b = (_b),\
106 stack[ssize].c = (_c), stack[ssize++].d = (_d);\
107 } while(0)
108 #define STACK_PUSH5(_a, _b, _c, _d, _e)\
109 do {\
110 assert(ssize < STACK_SIZE);\
111 stack[ssize].a = (_a), stack[ssize].b = (_b),\
112 stack[ssize].c = (_c), stack[ssize].d = (_d), stack[ssize++].e = (_e);\
113 } while(0)
114 #define STACK_POP(_a, _b, _c, _d)\
115 do {\
116 assert(0 <= ssize);\
117 if(ssize == 0) { return; }\
118 (_a) = stack[--ssize].a, (_b) = stack[ssize].b,\
119 (_c) = stack[ssize].c, (_d) = stack[ssize].d;\
120 } while(0)
121 #define STACK_POP5(_a, _b, _c, _d, _e)\
122 do {\
123 assert(0 <= ssize);\
124 if(ssize == 0) { return; }\
125 (_a) = stack[--ssize].a, (_b) = stack[ssize].b,\
126 (_c) = stack[ssize].c, (_d) = stack[ssize].d, (_e) = stack[ssize].e;\
127 } while(0)
128 #define BUCKET_A(_c0) bucket_A[(_c0)]
129 #if ALPHABET_SIZE == 256
130 #define BUCKET_B(_c0, _c1) (bucket_B[((_c1) << 8) | (_c0)])
131 #define BUCKET_BSTAR(_c0, _c1) (bucket_B[((_c0) << 8) | (_c1)])
132 #else
133 #define BUCKET_B(_c0, _c1) (bucket_B[(_c1) * ALPHABET_SIZE + (_c0)])
134 #define BUCKET_BSTAR(_c0, _c1) (bucket_B[(_c0) * ALPHABET_SIZE + (_c1)])
135 #endif
136
137
138 /*- Private Functions -*/
139
140 static const int lg_table[256]= {
141 -1,0,1,1,2,2,2,2,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,
142 5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,
143 6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,
144 6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,
145 7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
146 7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
147 7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
148 7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7
149 };
150
151 #if (SS_BLOCKSIZE == 0) || (SS_INSERTIONSORT_THRESHOLD < SS_BLOCKSIZE)
152
153 static INLINE
154 int
155 ss_ilg(int n) {
156 #if SS_BLOCKSIZE == 0
157 return (n & 0xffff0000) ?
158 ((n & 0xff000000) ?
159 24 + lg_table[(n >> 24) & 0xff] :
160 16 + lg_table[(n >> 16) & 0xff]) :
161 ((n & 0x0000ff00) ?
162 8 + lg_table[(n >> 8) & 0xff] :
163 0 + lg_table[(n >> 0) & 0xff]);
164 #elif SS_BLOCKSIZE < 256
165 return lg_table[n];
166 #else
167 return (n & 0xff00) ?
168 8 + lg_table[(n >> 8) & 0xff] :
169 0 + lg_table[(n >> 0) & 0xff];
170 #endif
171 }
172
173 #endif /* (SS_BLOCKSIZE == 0) || (SS_INSERTIONSORT_THRESHOLD < SS_BLOCKSIZE) */
174
175 #if SS_BLOCKSIZE != 0
176
177 static const int sqq_table[256] = {
178 0, 16, 22, 27, 32, 35, 39, 42, 45, 48, 50, 53, 55, 57, 59, 61,
179 64, 65, 67, 69, 71, 73, 75, 76, 78, 80, 81, 83, 84, 86, 87, 89,
180 90, 91, 93, 94, 96, 97, 98, 99, 101, 102, 103, 104, 106, 107, 108, 109,
181 110, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126,
182 128, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142,
183 143, 144, 144, 145, 146, 147, 148, 149, 150, 150, 151, 152, 153, 154, 155, 155,
184 156, 157, 158, 159, 160, 160, 161, 162, 163, 163, 164, 165, 166, 167, 167, 168,
185 169, 170, 170, 171, 172, 173, 173, 174, 175, 176, 176, 177, 178, 178, 179, 180,
186 181, 181, 182, 183, 183, 184, 185, 185, 186, 187, 187, 188, 189, 189, 190, 191,
187 192, 192, 193, 193, 194, 195, 195, 196, 197, 197, 198, 199, 199, 200, 201, 201,
188 202, 203, 203, 204, 204, 205, 206, 206, 207, 208, 208, 209, 209, 210, 211, 211,
189 212, 212, 213, 214, 214, 215, 215, 216, 217, 217, 218, 218, 219, 219, 220, 221,
190 221, 222, 222, 223, 224, 224, 225, 225, 226, 226, 227, 227, 228, 229, 229, 230,
191 230, 231, 231, 232, 232, 233, 234, 234, 235, 235, 236, 236, 237, 237, 238, 238,
192 239, 240, 240, 241, 241, 242, 242, 243, 243, 244, 244, 245, 245, 246, 246, 247,
193 247, 248, 248, 249, 249, 250, 250, 251, 251, 252, 252, 253, 253, 254, 254, 255
194 };
195
196 static INLINE
197 int
198 ss_isqrt(int x) {
199 int y, e;
200
201 if(x >= (SS_BLOCKSIZE * SS_BLOCKSIZE)) { return SS_BLOCKSIZE; }
202 e = (x & 0xffff0000) ?
203 ((x & 0xff000000) ?
204 24 + lg_table[(x >> 24) & 0xff] :
205 16 + lg_table[(x >> 16) & 0xff]) :
206 ((x & 0x0000ff00) ?
207 8 + lg_table[(x >> 8) & 0xff] :
208 0 + lg_table[(x >> 0) & 0xff]);
209
210 if(e >= 16) {
211 y = sqq_table[x >> ((e - 6) - (e & 1))] << ((e >> 1) - 7);
212 if(e >= 24) { y = (y + 1 + x / y) >> 1; }
213 y = (y + 1 + x / y) >> 1;
214 } else if(e >= 8) {
215 y = (sqq_table[x >> ((e - 6) - (e & 1))] >> (7 - (e >> 1))) + 1;
216 } else {
217 return sqq_table[x] >> 4;
218 }
219
220 return (x < (y * y)) ? y - 1 : y;
221 }
222
223 #endif /* SS_BLOCKSIZE != 0 */
224
225
226 /*---------------------------------------------------------------------------*/
227
228 /* Compares two suffixes. */
229 static INLINE
230 int
231 ss_compare(const unsigned char *T,
232 const int *p1, const int *p2,
233 int depth) {
234 const unsigned char *U1, *U2, *U1n, *U2n;
235
236 for(U1 = T + depth + *p1,
237 U2 = T + depth + *p2,
238 U1n = T + *(p1 + 1) + 2,
239 U2n = T + *(p2 + 1) + 2;
240 (U1 < U1n) && (U2 < U2n) && (*U1 == *U2);
241 ++U1, ++U2) {
242 }
243
244 return U1 < U1n ?
245 (U2 < U2n ? *U1 - *U2 : 1) :
246 (U2 < U2n ? -1 : 0);
247 }
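/* Convention used by the substring sorters below: an SA entry may be stored
   bitwise-complemented (hence negative) to mark a B* substring that compares
   equal to the substring ranked immediately before it. sort_typeBstar()
   later turns such runs into shared ranks, which trsort() then resolves. */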
248
249
250 /*---------------------------------------------------------------------------*/
251
252 #if (SS_BLOCKSIZE != 1) && (SS_INSERTIONSORT_THRESHOLD != 1)
253
254 /* Insertionsort for small size groups */
255 static
256 void
257 ss_insertionsort(const unsigned char *T, const int *PA,
258 int *first, int *last, int depth) {
259 int *i, *j;
260 int t;
261 int r;
262
263 for(i = last - 2; first <= i; --i) {
264 for(t = *i, j = i + 1; 0 < (r = ss_compare(T, PA + t, PA + *j, depth));) {
265 do { *(j - 1) = *j; } while((++j < last) && (*j < 0));
266 if(last <= j) { break; }
267 }
268 if(r == 0) { *j = ~*j; }
269 *(j - 1) = t;
270 }
271 }
272
273 #endif /* (SS_BLOCKSIZE != 1) && (SS_INSERTIONSORT_THRESHOLD != 1) */
274
275
276 /*---------------------------------------------------------------------------*/
277
278 #if (SS_BLOCKSIZE == 0) || (SS_INSERTIONSORT_THRESHOLD < SS_BLOCKSIZE)
279
280 static INLINE
281 void
282 ss_fixdown(const unsigned char *Td, const int *PA,
283 int *SA, int i, int size) {
284 int j, k;
285 int v;
286 int c, d, e;
287
288 for(v = SA[i], c = Td[PA[v]]; (j = 2 * i + 1) < size; SA[i] = SA[k], i = k) {
289 d = Td[PA[SA[k = j++]]];
290 if(d < (e = Td[PA[SA[j]]])) { k = j; d = e; }
291 if(d <= c) { break; }
292 }
293 SA[i] = v;
294 }
295
296 /* Simple top-down heapsort. */
297 static
298 void
299 ss_heapsort(const unsigned char *Td, const int *PA, int *SA, int size) {
300 int i, m;
301 int t;
302
303 m = size;
304 if((size % 2) == 0) {
305 m--;
306 if(Td[PA[SA[m / 2]]] < Td[PA[SA[m]]]) { SWAP(SA[m], SA[m / 2]); }
307 }
308
309 for(i = m / 2 - 1; 0 <= i; --i) { ss_fixdown(Td, PA, SA, i, m); }
310 if((size % 2) == 0) { SWAP(SA[0], SA[m]); ss_fixdown(Td, PA, SA, 0, m); }
311 for(i = m - 1; 0 < i; --i) {
312 t = SA[0], SA[0] = SA[i];
313 ss_fixdown(Td, PA, SA, 0, i);
314 SA[i] = t;
315 }
316 }
317
318
319 /*---------------------------------------------------------------------------*/
320
321 /* Returns the median of three elements. */
322 static INLINE
323 int *
324 ss_median3(const unsigned char *Td, const int *PA,
325 int *v1, int *v2, int *v3) {
326 int *t;
327 if(Td[PA[*v1]] > Td[PA[*v2]]) { SWAP(v1, v2); }
328 if(Td[PA[*v2]] > Td[PA[*v3]]) {
329 if(Td[PA[*v1]] > Td[PA[*v3]]) { return v1; }
330 else { return v3; }
331 }
332 return v2;
333 }
334
335 /* Returns the median of five elements. */
336 static INLINE
337 int *
338 ss_median5(const unsigned char *Td, const int *PA,
339 int *v1, int *v2, int *v3, int *v4, int *v5) {
340 int *t;
341 if(Td[PA[*v2]] > Td[PA[*v3]]) { SWAP(v2, v3); }
342 if(Td[PA[*v4]] > Td[PA[*v5]]) { SWAP(v4, v5); }
343 if(Td[PA[*v2]] > Td[PA[*v4]]) { SWAP(v2, v4); SWAP(v3, v5); }
344 if(Td[PA[*v1]] > Td[PA[*v3]]) { SWAP(v1, v3); }
345 if(Td[PA[*v1]] > Td[PA[*v4]]) { SWAP(v1, v4); SWAP(v3, v5); }
346 if(Td[PA[*v3]] > Td[PA[*v4]]) { return v4; }
347 return v3;
348 }
349
350 /* Returns the pivot element. */
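/* Small ranges (<= 32 elements) use a median of 3, medium ranges (<= 512) a
   pseudo-median of 5, and larger ranges a median of three medians-of-3
   taken over nine samples. */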
351 static INLINE
352 int *
353 ss_pivot(const unsigned char *Td, const int *PA, int *first, int *last) {
354 int *middle;
355 int t;
356
357 t = last - first;
358 middle = first + t / 2;
359
360 if(t <= 512) {
361 if(t <= 32) {
362 return ss_median3(Td, PA, first, middle, last - 1);
363 } else {
364 t >>= 2;
365 return ss_median5(Td, PA, first, first + t, middle, last - 1 - t, last - 1);
366 }
367 }
368 t >>= 3;
369 first = ss_median3(Td, PA, first, first + t, first + (t << 1));
370 middle = ss_median3(Td, PA, middle - t, middle, middle + t);
371 last = ss_median3(Td, PA, last - 1 - (t << 1), last - 1 - t, last - 1);
372 return ss_median3(Td, PA, first, middle, last);
373 }
374
375
376 /*---------------------------------------------------------------------------*/
377
378 /* Binary partition for substrings. */
379 static INLINE
380 int *
381 ss_partition(const int *PA,
382 int *first, int *last, int depth) {
383 int *a, *b;
384 int t;
385 for(a = first - 1, b = last;;) {
386 for(; (++a < b) && ((PA[*a] + depth) >= (PA[*a + 1] + 1));) { *a = ~*a; }
387 for(; (a < --b) && ((PA[*b] + depth) < (PA[*b + 1] + 1));) { }
388 if(b <= a) { break; }
389 t = ~*b;
390 *b = *a;
391 *a = t;
392 }
393 if(first < a) { *first = ~*first; }
394 return a;
395 }
396
397 /* Multikey introsort for medium size groups. */
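/* Iterative quicksort on the depth-th character with an explicit stack and
   three-way (fat) partitioning: the middle (equal) partition recurses at
   depth + 1. The `limit` counter bounds the recursion depth; when it hits
   zero the range falls back to ss_heapsort(), and ranges at or below
   SS_INSERTIONSORT_THRESHOLD are finished by ss_insertionsort(). */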
398 static
399 void
400 ss_mintrosort(const unsigned char *T, const int *PA,
401 int *first, int *last,
402 int depth) {
403 #define STACK_SIZE SS_MISORT_STACKSIZE
404 struct { int *a, *b, c; int d; } stack[STACK_SIZE];
405 const unsigned char *Td;
406 int *a, *b, *c, *d, *e, *f;
407 int s, t;
408 int ssize;
409 int limit;
410 int v, x = 0;
411
412 for(ssize = 0, limit = ss_ilg(last - first);;) {
413
414 if((last - first) <= SS_INSERTIONSORT_THRESHOLD) {
415 #if 1 < SS_INSERTIONSORT_THRESHOLD
416 if(1 < (last - first)) { ss_insertionsort(T, PA, first, last, depth); }
417 #endif
418 STACK_POP(first, last, depth, limit);
419 continue;
420 }
421
422 Td = T + depth;
423 if(limit-- == 0) { ss_heapsort(Td, PA, first, last - first); }
424 if(limit < 0) {
425 for(a = first + 1, v = Td[PA[*first]]; a < last; ++a) {
426 if((x = Td[PA[*a]]) != v) {
427 if(1 < (a - first)) { break; }
428 v = x;
429 first = a;
430 }
431 }
432 if(Td[PA[*first] - 1] < v) {
433 first = ss_partition(PA, first, a, depth);
434 }
435 if((a - first) <= (last - a)) {
436 if(1 < (a - first)) {
437 STACK_PUSH(a, last, depth, -1);
438 last = a, depth += 1, limit = ss_ilg(a - first);
439 } else {
440 first = a, limit = -1;
441 }
442 } else {
443 if(1 < (last - a)) {
444 STACK_PUSH(first, a, depth + 1, ss_ilg(a - first));
445 first = a, limit = -1;
446 } else {
447 last = a, depth += 1, limit = ss_ilg(a - first);
448 }
449 }
450 continue;
451 }
452
453 /* choose pivot */
454 a = ss_pivot(Td, PA, first, last);
455 v = Td[PA[*a]];
456 SWAP(*first, *a);
457
458 /* partition */
459 for(b = first; (++b < last) && ((x = Td[PA[*b]]) == v);) { }
460 if(((a = b) < last) && (x < v)) {
461 for(; (++b < last) && ((x = Td[PA[*b]]) <= v);) {
462 if(x == v) { SWAP(*b, *a); ++a; }
463 }
464 }
465 for(c = last; (b < --c) && ((x = Td[PA[*c]]) == v);) { }
466 if((b < (d = c)) && (x > v)) {
467 for(; (b < --c) && ((x = Td[PA[*c]]) >= v);) {
468 if(x == v) { SWAP(*c, *d); --d; }
469 }
470 }
471 for(; b < c;) {
472 SWAP(*b, *c);
473 for(; (++b < c) && ((x = Td[PA[*b]]) <= v);) {
474 if(x == v) { SWAP(*b, *a); ++a; }
475 }
476 for(; (b < --c) && ((x = Td[PA[*c]]) >= v);) {
477 if(x == v) { SWAP(*c, *d); --d; }
478 }
479 }
480
481 if(a <= d) {
482 c = b - 1;
483
484 if((s = a - first) > (t = b - a)) { s = t; }
485 for(e = first, f = b - s; 0 < s; --s, ++e, ++f) { SWAP(*e, *f); }
486 if((s = d - c) > (t = last - d - 1)) { s = t; }
487 for(e = b, f = last - s; 0 < s; --s, ++e, ++f) { SWAP(*e, *f); }
488
489 a = first + (b - a), c = last - (d - c);
490 b = (v <= Td[PA[*a] - 1]) ? a : ss_partition(PA, a, c, depth);
491
492 if((a - first) <= (last - c)) {
493 if((last - c) <= (c - b)) {
494 STACK_PUSH(b, c, depth + 1, ss_ilg(c - b));
495 STACK_PUSH(c, last, depth, limit);
496 last = a;
497 } else if((a - first) <= (c - b)) {
498 STACK_PUSH(c, last, depth, limit);
499 STACK_PUSH(b, c, depth + 1, ss_ilg(c - b));
500 last = a;
501 } else {
502 STACK_PUSH(c, last, depth, limit);
503 STACK_PUSH(first, a, depth, limit);
504 first = b, last = c, depth += 1, limit = ss_ilg(c - b);
505 }
506 } else {
507 if((a - first) <= (c - b)) {
508 STACK_PUSH(b, c, depth + 1, ss_ilg(c - b));
509 STACK_PUSH(first, a, depth, limit);
510 first = c;
511 } else if((last - c) <= (c - b)) {
512 STACK_PUSH(first, a, depth, limit);
513 STACK_PUSH(b, c, depth + 1, ss_ilg(c - b));
514 first = c;
515 } else {
516 STACK_PUSH(first, a, depth, limit);
517 STACK_PUSH(c, last, depth, limit);
518 first = b, last = c, depth += 1, limit = ss_ilg(c - b);
519 }
520 }
521 } else {
522 limit += 1;
523 if(Td[PA[*first] - 1] < v) {
524 first = ss_partition(PA, first, last, depth);
525 limit = ss_ilg(last - first);
526 }
527 depth += 1;
528 }
529 }
530 #undef STACK_SIZE
531 }
532
533 #endif /* (SS_BLOCKSIZE == 0) || (SS_INSERTIONSORT_THRESHOLD < SS_BLOCKSIZE) */
534
535
536 /*---------------------------------------------------------------------------*/
537
538 #if SS_BLOCKSIZE != 0
539
540 static INLINE
541 void
542 ss_blockswap(int *a, int *b, int n) {
543 int t;
544 for(; 0 < n; --n, ++a, ++b) {
545 t = *a, *a = *b, *b = t;
546 }
547 }
548
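/* Rotates [first, last) in place (no temporary buffer) so that the element
   at `middle` ends up at `first`. */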
549 static INLINE
550 void
551 ss_rotate(int *first, int *middle, int *last) {
552 int *a, *b, t;
553 int l, r;
554 l = middle - first, r = last - middle;
555 for(; (0 < l) && (0 < r);) {
556 if(l == r) { ss_blockswap(first, middle, l); break; }
557 if(l < r) {
558 a = last - 1, b = middle - 1;
559 t = *a;
560 do {
561 *a-- = *b, *b-- = *a;
562 if(b < first) {
563 *a = t;
564 last = a;
565 if((r -= l + 1) <= l) { break; }
566 a -= 1, b = middle - 1;
567 t = *a;
568 }
569 } while(1);
570 } else {
571 a = first, b = middle;
572 t = *a;
573 do {
574 *a++ = *b, *b++ = *a;
575 if(last <= b) {
576 *a = t;
577 first = a + 1;
578 if((l -= r + 1) <= r) { break; }
579 a += 1, b = middle;
580 t = *a;
581 }
582 } while(1);
583 }
584 }
585 }
586
587
588 /*---------------------------------------------------------------------------*/
589
590 static
591 void
592 ss_inplacemerge(const unsigned char *T, const int *PA,
593 int *first, int *middle, int *last,
594 int depth) {
595 const int *p;
596 int *a, *b;
597 int len, half;
598 int q, r;
599 int x;
600
601 for(;;) {
602 if(*(last - 1) < 0) { x = 1; p = PA + ~*(last - 1); }
603 else { x = 0; p = PA + *(last - 1); }
604 for(a = first, len = middle - first, half = len >> 1, r = -1;
605 0 < len;
606 len = half, half >>= 1) {
607 b = a + half;
608 q = ss_compare(T, PA + ((0 <= *b) ? *b : ~*b), p, depth);
609 if(q < 0) {
610 a = b + 1;
611 half -= (len & 1) ^ 1;
612 } else {
613 r = q;
614 }
615 }
616 if(a < middle) {
617 if(r == 0) { *a = ~*a; }
618 ss_rotate(a, middle, last);
619 last -= middle - a;
620 middle = a;
621 if(first == middle) { break; }
622 }
623 --last;
624 if(x != 0) { while(*--last < 0) { } }
625 if(middle == last) { break; }
626 }
627 }
628
629
630 /*---------------------------------------------------------------------------*/
631
632 /* Merge-forward with internal buffer. */
633 static
634 void
635 ss_mergeforward(const unsigned char *T, const int *PA,
636 int *first, int *middle, int *last,
637 int *buf, int depth) {
638 int *a, *b, *c, *bufend;
639 int t;
640 int r;
641
642 bufend = buf + (middle - first) - 1;
643 ss_blockswap(buf, first, middle - first);
644
645 for(t = *(a = first), b = buf, c = middle;;) {
646 r = ss_compare(T, PA + *b, PA + *c, depth);
647 if(r < 0) {
648 do {
649 *a++ = *b;
650 if(bufend <= b) { *bufend = t; return; }
651 *b++ = *a;
652 } while(*b < 0);
653 } else if(r > 0) {
654 do {
655 *a++ = *c, *c++ = *a;
656 if(last <= c) {
657 while(b < bufend) { *a++ = *b, *b++ = *a; }
658 *a = *b, *b = t;
659 return;
660 }
661 } while(*c < 0);
662 } else {
663 *c = ~*c;
664 do {
665 *a++ = *b;
666 if(bufend <= b) { *bufend = t; return; }
667 *b++ = *a;
668 } while(*b < 0);
669
670 do {
671 *a++ = *c, *c++ = *a;
672 if(last <= c) {
673 while(b < bufend) { *a++ = *b, *b++ = *a; }
674 *a = *b, *b = t;
675 return;
676 }
677 } while(*c < 0);
678 }
679 }
680 }
681
682 /* Merge-backward with internal buffer. */
683 static
684 void
685 ss_mergebackward(const unsigned char *T, const int *PA,
686 int *first, int *middle, int *last,
687 int *buf, int depth) {
688 const int *p1, *p2;
689 int *a, *b, *c, *bufend;
690 int t;
691 int r;
692 int x;
693
694 bufend = buf + (last - middle) - 1;
695 ss_blockswap(buf, middle, last - middle);
696
697 x = 0;
698 if(*bufend < 0) { p1 = PA + ~*bufend; x |= 1; }
699 else { p1 = PA + *bufend; }
700 if(*(middle - 1) < 0) { p2 = PA + ~*(middle - 1); x |= 2; }
701 else { p2 = PA + *(middle - 1); }
702 for(t = *(a = last - 1), b = bufend, c = middle - 1;;) {
703 r = ss_compare(T, p1, p2, depth);
704 if(0 < r) {
705 if(x & 1) { do { *a-- = *b, *b-- = *a; } while(*b < 0); x ^= 1; }
706 *a-- = *b;
707 if(b <= buf) { *buf = t; break; }
708 *b-- = *a;
709 if(*b < 0) { p1 = PA + ~*b; x |= 1; }
710 else { p1 = PA + *b; }
711 } else if(r < 0) {
712 if(x & 2) { do { *a-- = *c, *c-- = *a; } while(*c < 0); x ^= 2; }
713 *a-- = *c, *c-- = *a;
714 if(c < first) {
715 while(buf < b) { *a-- = *b, *b-- = *a; }
716 *a = *b, *b = t;
717 break;
718 }
719 if(*c < 0) { p2 = PA + ~*c; x |= 2; }
720 else { p2 = PA + *c; }
721 } else {
722 if(x & 1) { do { *a-- = *b, *b-- = *a; } while(*b < 0); x ^= 1; }
723 *a-- = ~*b;
724 if(b <= buf) { *buf = t; break; }
725 *b-- = *a;
726 if(x & 2) { do { *a-- = *c, *c-- = *a; } while(*c < 0); x ^= 2; }
727 *a-- = *c, *c-- = *a;
728 if(c < first) {
729 while(buf < b) { *a-- = *b, *b-- = *a; }
730 *a = *b, *b = t;
731 break;
732 }
733 if(*b < 0) { p1 = PA + ~*b; x |= 1; }
734 else { p1 = PA + *b; }
735 if(*c < 0) { p2 = PA + ~*c; x |= 2; }
736 else { p2 = PA + *c; }
737 }
738 }
739 }
740
741 /* Divide-and-conquer based merge. */
742 static
743 void
744 ss_swapmerge(const unsigned char *T, const int *PA,
745 int *first, int *middle, int *last,
746 int *buf, int bufsize, int depth) {
747 #define STACK_SIZE SS_SMERGE_STACKSIZE
748 #define GETIDX(a) ((0 <= (a)) ? (a) : (~(a)))
749 #define MERGE_CHECK(a, b, c)\
750 do {\
751 if(((c) & 1) ||\
752 (((c) & 2) && (ss_compare(T, PA + GETIDX(*((a) - 1)), PA + *(a), depth) == 0))) {\
753 *(a) = ~*(a);\
754 }\
755 if(((c) & 4) && ((ss_compare(T, PA + GETIDX(*((b) - 1)), PA + *(b), depth) == 0))) {\
756 *(b) = ~*(b);\
757 }\
758 } while(0)
759 struct { int *a, *b, *c; int d; } stack[STACK_SIZE];
760 int *l, *r, *lm, *rm;
761 int m, len, half;
762 int ssize;
763 int check, next;
764
765 for(check = 0, ssize = 0;;) {
766 if((last - middle) <= bufsize) {
767 if((first < middle) && (middle < last)) {
768 ss_mergebackward(T, PA, first, middle, last, buf, depth);
769 }
770 MERGE_CHECK(first, last, check);
771 STACK_POP(first, middle, last, check);
772 continue;
773 }
774
775 if((middle - first) <= bufsize) {
776 if(first < middle) {
777 ss_mergeforward(T, PA, first, middle, last, buf, depth);
778 }
779 MERGE_CHECK(first, last, check);
780 STACK_POP(first, middle, last, check);
781 continue;
782 }
783
784 for(m = 0, len = MIN(middle - first, last - middle), half = len >> 1;
785 0 < len;
786 len = half, half >>= 1) {
787 if(ss_compare(T, PA + GETIDX(*(middle + m + half)),
788 PA + GETIDX(*(middle - m - half - 1)), depth) < 0) {
789 m += half + 1;
790 half -= (len & 1) ^ 1;
791 }
792 }
793
794 if(0 < m) {
795 lm = middle - m, rm = middle + m;
796 ss_blockswap(lm, middle, m);
797 l = r = middle, next = 0;
798 if(rm < last) {
799 if(*rm < 0) {
800 *rm = ~*rm;
801 if(first < lm) { for(; *--l < 0;) { } next |= 4; }
802 next |= 1;
803 } else if(first < lm) {
804 for(; *r < 0; ++r) { }
805 next |= 2;
806 }
807 }
808
809 if((l - first) <= (last - r)) {
810 STACK_PUSH(r, rm, last, (next & 3) | (check & 4));
811 middle = lm, last = l, check = (check & 3) | (next & 4);
812 } else {
813 if((next & 2) && (r == middle)) { next ^= 6; }
814 STACK_PUSH(first, lm, l, (check & 3) | (next & 4));
815 first = r, middle = rm, check = (next & 3) | (check & 4);
816 }
817 } else {
818 if(ss_compare(T, PA + GETIDX(*(middle - 1)), PA + *middle, depth) == 0) {
819 *middle = ~*middle;
820 }
821 MERGE_CHECK(first, last, check);
822 STACK_POP(first, middle, last, check);
823 }
824 }
825 #undef STACK_SIZE
826 }
827
828 #endif /* SS_BLOCKSIZE != 0 */
829
830
831 /*---------------------------------------------------------------------------*/
832
833 /* Substring sort */
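/* Strategy: sort consecutive blocks of SS_BLOCKSIZE entries independently,
   then merge them like a binary counter: after sorting block i, one merge
   is performed for every trailing 1-bit of i. When the caller's buffer is
   too small, a scratch region of about sqrt(last - first) entries is carved
   from the tail of the range and merged back in at the end. */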
834 static
835 void
836 sssort(const unsigned char *T, const int *PA,
837 int *first, int *last,
838 int *buf, int bufsize,
839 int depth, int n, int lastsuffix) {
840 int *a;
841 #if SS_BLOCKSIZE != 0
842 int *b, *middle, *curbuf;
843 int j, k, curbufsize, limit;
844 #endif
845 int i;
846
847 if(lastsuffix != 0) { ++first; }
848
849 #if SS_BLOCKSIZE == 0
850 ss_mintrosort(T, PA, first, last, depth);
851 #else
852 if((bufsize < SS_BLOCKSIZE) &&
853 (bufsize < (last - first)) &&
854 (bufsize < (limit = ss_isqrt(last - first)))) {
855 if(SS_BLOCKSIZE < limit) { limit = SS_BLOCKSIZE; }
856 buf = middle = last - limit, bufsize = limit;
857 } else {
858 middle = last, limit = 0;
859 }
860 for(a = first, i = 0; SS_BLOCKSIZE < (middle - a); a += SS_BLOCKSIZE, ++i) {
861 #if SS_INSERTIONSORT_THRESHOLD < SS_BLOCKSIZE
862 ss_mintrosort(T, PA, a, a + SS_BLOCKSIZE, depth);
863 #elif 1 < SS_BLOCKSIZE
864 ss_insertionsort(T, PA, a, a + SS_BLOCKSIZE, depth);
865 #endif
866 curbufsize = last - (a + SS_BLOCKSIZE);
867 curbuf = a + SS_BLOCKSIZE;
868 if(curbufsize <= bufsize) { curbufsize = bufsize, curbuf = buf; }
869 for(b = a, k = SS_BLOCKSIZE, j = i; j & 1; b -= k, k <<= 1, j >>= 1) {
870 ss_swapmerge(T, PA, b - k, b, b + k, curbuf, curbufsize, depth);
871 }
872 }
873 #if SS_INSERTIONSORT_THRESHOLD < SS_BLOCKSIZE
874 ss_mintrosort(T, PA, a, middle, depth);
875 #elif 1 < SS_BLOCKSIZE
876 ss_insertionsort(T, PA, a, middle, depth);
877 #endif
878 for(k = SS_BLOCKSIZE; i != 0; k <<= 1, i >>= 1) {
879 if(i & 1) {
880 ss_swapmerge(T, PA, a - k, a, middle, buf, bufsize, depth);
881 a -= k;
882 }
883 }
884 if(limit != 0) {
885 #if SS_INSERTIONSORT_THRESHOLD < SS_BLOCKSIZE
886 ss_mintrosort(T, PA, middle, last, depth);
887 #elif 1 < SS_BLOCKSIZE
888 ss_insertionsort(T, PA, middle, last, depth);
889 #endif
890 ss_inplacemerge(T, PA, first, middle, last, depth);
891 }
892 #endif
893
894 if(lastsuffix != 0) {
895 /* Insert last type B* suffix. */
896 int PAi[2]; PAi[0] = PA[*(first - 1)], PAi[1] = n - 2;
897 for(a = first, i = *(first - 1);
898 (a < last) && ((*a < 0) || (0 < ss_compare(T, &(PAi[0]), PA + *a, depth)));
899 ++a) {
900 *(a - 1) = *a;
901 }
902 *(a - 1) = i;
903 }
904 }
905
906
907 /*---------------------------------------------------------------------------*/
908
909 static INLINE
910 int
911 tr_ilg(int n) {
912 return (n & 0xffff0000) ?
913 ((n & 0xff000000) ?
914 24 + lg_table[(n >> 24) & 0xff] :
915 16 + lg_table[(n >> 16) & 0xff]) :
916 ((n & 0x0000ff00) ?
917 8 + lg_table[(n >> 8) & 0xff] :
918 0 + lg_table[(n >> 0) & 0xff]);
919 }
920
921
922 /*---------------------------------------------------------------------------*/
923
924 /* Simple insertionsort for small size groups. */
925 static
926 void
927 tr_insertionsort(const int *ISAd, int *first, int *last) {
928 int *a, *b;
929 int t, r;
930
931 for(a = first + 1; a < last; ++a) {
932 for(t = *a, b = a - 1; 0 > (r = ISAd[t] - ISAd[*b]);) {
933 do { *(b + 1) = *b; } while((first <= --b) && (*b < 0));
934 if(b < first) { break; }
935 }
936 if(r == 0) { *b = ~*b; }
937 *(b + 1) = t;
938 }
939 }
940
941
942 /*---------------------------------------------------------------------------*/
943
944 static INLINE
945 void
946 tr_fixdown(const int *ISAd, int *SA, int i, int size) {
947 int j, k;
948 int v;
949 int c, d, e;
950
951 for(v = SA[i], c = ISAd[v]; (j = 2 * i + 1) < size; SA[i] = SA[k], i = k) {
952 d = ISAd[SA[k = j++]];
953 if(d < (e = ISAd[SA[j]])) { k = j; d = e; }
954 if(d <= c) { break; }
955 }
956 SA[i] = v;
957 }
958
959 /* Simple top-down heapsort. */
960 static
961 void
962 tr_heapsort(const int *ISAd, int *SA, int size) {
963 int i, m;
964 int t;
965
966 m = size;
967 if((size % 2) == 0) {
968 m--;
969 if(ISAd[SA[m / 2]] < ISAd[SA[m]]) { SWAP(SA[m], SA[m / 2]); }
970 }
971
972 for(i = m / 2 - 1; 0 <= i; --i) { tr_fixdown(ISAd, SA, i, m); }
973 if((size % 2) == 0) { SWAP(SA[0], SA[m]); tr_fixdown(ISAd, SA, 0, m); }
974 for(i = m - 1; 0 < i; --i) {
975 t = SA[0], SA[0] = SA[i];
976 tr_fixdown(ISAd, SA, 0, i);
977 SA[i] = t;
978 }
979 }
980
981
982 /*---------------------------------------------------------------------------*/
983
984 /* Returns the median of three elements. */
985 static INLINE
986 int *
987 tr_median3(const int *ISAd, int *v1, int *v2, int *v3) {
988 int *t;
989 if(ISAd[*v1] > ISAd[*v2]) { SWAP(v1, v2); }
990 if(ISAd[*v2] > ISAd[*v3]) {
991 if(ISAd[*v1] > ISAd[*v3]) { return v1; }
992 else { return v3; }
993 }
994 return v2;
995 }
996
997 /* Returns the median of five elements. */
998 static INLINE
999 int *
1000 tr_median5(const int *ISAd,
1001 int *v1, int *v2, int *v3, int *v4, int *v5) {
1002 int *t;
1003 if(ISAd[*v2] > ISAd[*v3]) { SWAP(v2, v3); }
1004 if(ISAd[*v4] > ISAd[*v5]) { SWAP(v4, v5); }
1005 if(ISAd[*v2] > ISAd[*v4]) { SWAP(v2, v4); SWAP(v3, v5); }
1006 if(ISAd[*v1] > ISAd[*v3]) { SWAP(v1, v3); }
1007 if(ISAd[*v1] > ISAd[*v4]) { SWAP(v1, v4); SWAP(v3, v5); }
1008 if(ISAd[*v3] > ISAd[*v4]) { return v4; }
1009 return v3;
1010 }
1011
1012 /* Returns the pivot element. */
1013 static INLINE
1014 int *
1015 tr_pivot(const int *ISAd, int *first, int *last) {
1016 int *middle;
1017 int t;
1018
1019 t = last - first;
1020 middle = first + t / 2;
1021
1022 if(t <= 512) {
1023 if(t <= 32) {
1024 return tr_median3(ISAd, first, middle, last - 1);
1025 } else {
1026 t >>= 2;
1027 return tr_median5(ISAd, first, first + t, middle, last - 1 - t, last - 1);
1028 }
1029 }
1030 t >>= 3;
1031 first = tr_median3(ISAd, first, first + t, first + (t << 1));
1032 middle = tr_median3(ISAd, middle - t, middle, middle + t);
1033 last = tr_median3(ISAd, last - 1 - (t << 1), last - 1 - t, last - 1);
1034 return tr_median3(ISAd, first, middle, last);
1035 }
1036
1037
1038 /*---------------------------------------------------------------------------*/
1039
1040 typedef struct _trbudget_t trbudget_t;
1041 struct _trbudget_t {
1042 int chance;
1043 int remain;
1044 int incval;
1045 int count;
1046 };
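/* Work budget used to cap the cost of tr_introsort(): `remain` is the
   budget left for the current pass, `incval` (initialized to n) is the
   refill amount, `chance` is how many refills remain, and `count`
   accumulates the size of groups abandoned once the budget runs out so
   that trsort() can retry them at a greater depth. */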
1047
1048 static INLINE
1049 void
1050 trbudget_init(trbudget_t *budget, int chance, int incval) {
1051 budget->chance = chance;
1052 budget->remain = budget->incval = incval;
1053 }
1054
1055 static INLINE
1056 int
1057 trbudget_check(trbudget_t *budget, int size) {
1058 if(size <= budget->remain) { budget->remain -= size; return 1; }
1059 if(budget->chance == 0) { budget->count += size; return 0; }
1060 budget->remain += budget->incval - size;
1061 budget->chance -= 1;
1062 return 1;
1063 }
1064
1065
1066 /*---------------------------------------------------------------------------*/
1067
1068 static INLINE
1069 void
1070 tr_partition(const int *ISAd,
1071 int *first, int *middle, int *last,
1072 int **pa, int **pb, int v) {
1073 int *a, *b, *c, *d, *e, *f;
1074 int t, s;
1075 int x = 0;
1076
1077 for(b = middle - 1; (++b < last) && ((x = ISAd[*b]) == v);) { }
1078 if(((a = b) < last) && (x < v)) {
1079 for(; (++b < last) && ((x = ISAd[*b]) <= v);) {
1080 if(x == v) { SWAP(*b, *a); ++a; }
1081 }
1082 }
1083 for(c = last; (b < --c) && ((x = ISAd[*c]) == v);) { }
1084 if((b < (d = c)) && (x > v)) {
1085 for(; (b < --c) && ((x = ISAd[*c]) >= v);) {
1086 if(x == v) { SWAP(*c, *d); --d; }
1087 }
1088 }
1089 for(; b < c;) {
1090 SWAP(*b, *c);
1091 for(; (++b < c) && ((x = ISAd[*b]) <= v);) {
1092 if(x == v) { SWAP(*b, *a); ++a; }
1093 }
1094 for(; (b < --c) && ((x = ISAd[*c]) >= v);) {
1095 if(x == v) { SWAP(*c, *d); --d; }
1096 }
1097 }
1098
1099 if(a <= d) {
1100 c = b - 1;
1101 if((s = a - first) > (t = b - a)) { s = t; }
1102 for(e = first, f = b - s; 0 < s; --s, ++e, ++f) { SWAP(*e, *f); }
1103 if((s = d - c) > (t = last - d - 1)) { s = t; }
1104 for(e = b, f = last - s; 0 < s; --s, ++e, ++f) { SWAP(*e, *f); }
1105 first += (b - a), last -= (d - c);
1106 }
1107 *pa = first, *pb = last;
1108 }
1109
1110 static
1111 void
1112 tr_copy(int *ISA, const int *SA,
1113 int *first, int *a, int *b, int *last,
1114 int depth) {
1115 /* sort suffixes of middle partition
1116 by using sorted order of suffixes of left and right partition. */
1117 int *c, *d, *e;
1118 int s, v;
1119
1120 v = b - SA - 1;
1121 for(c = first, d = a - 1; c <= d; ++c) {
1122 if((0 <= (s = *c - depth)) && (ISA[s] == v)) {
1123 *++d = s;
1124 ISA[s] = d - SA;
1125 }
1126 }
1127 for(c = last - 1, e = d + 1, d = b; e < d; --c) {
1128 if((0 <= (s = *c - depth)) && (ISA[s] == v)) {
1129 *--d = s;
1130 ISA[s] = d - SA;
1131 }
1132 }
1133 }
1134
1135 static
1136 void
1137 tr_partialcopy(int *ISA, const int *SA,
1138 int *first, int *a, int *b, int *last,
1139 int depth) {
1140 int *c, *d, *e;
1141 int s, v;
1142 int rank, lastrank, newrank = -1;
1143
1144 v = b - SA - 1;
1145 lastrank = -1;
1146 for(c = first, d = a - 1; c <= d; ++c) {
1147 if((0 <= (s = *c - depth)) && (ISA[s] == v)) {
1148 *++d = s;
1149 rank = ISA[s + depth];
1150 if(lastrank != rank) { lastrank = rank; newrank = d - SA; }
1151 ISA[s] = newrank;
1152 }
1153 }
1154
1155 lastrank = -1;
1156 for(e = d; first <= e; --e) {
1157 rank = ISA[*e];
1158 if(lastrank != rank) { lastrank = rank; newrank = e - SA; }
1159 if(newrank != rank) { ISA[*e] = newrank; }
1160 }
1161
1162 lastrank = -1;
1163 for(c = last - 1, e = d + 1, d = b; e < d; --c) {
1164 if((0 <= (s = *c - depth)) && (ISA[s] == v)) {
1165 *--d = s;
1166 rank = ISA[s + depth];
1167 if(lastrank != rank) { lastrank = rank; newrank = d - SA; }
1168 ISA[s] = newrank;
1169 }
1170 }
1171 }
1172
1173 static
1174 void
1175 tr_introsort(int *ISA, const int *ISAd,
1176 int *SA, int *first, int *last,
1177 trbudget_t *budget) {
1178 #define STACK_SIZE TR_STACKSIZE
1179 struct { const int *a; int *b, *c; int d, e; } stack[STACK_SIZE];
1180 int *a, *b, *c;
1181 int t;
1182 int v, x = 0;
1183 int incr = ISAd - ISA;
1184 int limit, next;
1185 int ssize, trlink = -1;
1186
1187 for(ssize = 0, limit = tr_ilg(last - first);;) {
1188
1189 if(limit < 0) {
1190 if(limit == -1) {
1191 /* tandem repeat partition */
1192 tr_partition(ISAd - incr, first, first, last, &a, &b, last - SA - 1);
1193
1194 /* update ranks */
1195 if(a < last) {
1196 for(c = first, v = a - SA - 1; c < a; ++c) { ISA[*c] = v; }
1197 }
1198 if(b < last) {
1199 for(c = a, v = b - SA - 1; c < b; ++c) { ISA[*c] = v; }
1200 }
1201
1202 /* push */
1203 if(1 < (b - a)) {
1204 STACK_PUSH5(NULL, a, b, 0, 0);
1205 STACK_PUSH5(ISAd - incr, first, last, -2, trlink);
1206 trlink = ssize - 2;
1207 }
1208 if((a - first) <= (last - b)) {
1209 if(1 < (a - first)) {
1210 STACK_PUSH5(ISAd, b, last, tr_ilg(last - b), trlink);
1211 last = a, limit = tr_ilg(a - first);
1212 } else if(1 < (last - b)) {
1213 first = b, limit = tr_ilg(last - b);
1214 } else {
1215 STACK_POP5(ISAd, first, last, limit, trlink);
1216 }
1217 } else {
1218 if(1 < (last - b)) {
1219 STACK_PUSH5(ISAd, first, a, tr_ilg(a - first), trlink);
1220 first = b, limit = tr_ilg(last - b);
1221 } else if(1 < (a - first)) {
1222 last = a, limit = tr_ilg(a - first);
1223 } else {
1224 STACK_POP5(ISAd, first, last, limit, trlink);
1225 }
1226 }
1227 } else if(limit == -2) {
1228 /* tandem repeat copy */
1229 a = stack[--ssize].b, b = stack[ssize].c;
1230 if(stack[ssize].d == 0) {
1231 tr_copy(ISA, SA, first, a, b, last, ISAd - ISA);
1232 } else {
1233 if(0 <= trlink) { stack[trlink].d = -1; }
1234 tr_partialcopy(ISA, SA, first, a, b, last, ISAd - ISA);
1235 }
1236 STACK_POP5(ISAd, first, last, limit, trlink);
1237 } else {
1238 /* sorted partition */
1239 if(0 <= *first) {
1240 a = first;
1241 do { ISA[*a] = a - SA; } while((++a < last) && (0 <= *a));
1242 first = a;
1243 }
1244 if(first < last) {
1245 a = first; do { *a = ~*a; } while(*++a < 0);
1246 next = (ISA[*a] != ISAd[*a]) ? tr_ilg(a - first + 1) : -1;
1247 if(++a < last) { for(b = first, v = a - SA - 1; b < a; ++b) { ISA[*b] = v; } }
1248
1249 /* push */
1250 if(trbudget_check(budget, a - first)) {
1251 if((a - first) <= (last - a)) {
1252 STACK_PUSH5(ISAd, a, last, -3, trlink);
1253 ISAd += incr, last = a, limit = next;
1254 } else {
1255 if(1 < (last - a)) {
1256 STACK_PUSH5(ISAd + incr, first, a, next, trlink);
1257 first = a, limit = -3;
1258 } else {
1259 ISAd += incr, last = a, limit = next;
1260 }
1261 }
1262 } else {
1263 if(0 <= trlink) { stack[trlink].d = -1; }
1264 if(1 < (last - a)) {
1265 first = a, limit = -3;
1266 } else {
1267 STACK_POP5(ISAd, first, last, limit, trlink);
1268 }
1269 }
1270 } else {
1271 STACK_POP5(ISAd, first, last, limit, trlink);
1272 }
1273 }
1274 continue;
1275 }
1276
1277 if((last - first) <= TR_INSERTIONSORT_THRESHOLD) {
1278 tr_insertionsort(ISAd, first, last);
1279 limit = -3;
1280 continue;
1281 }
1282
1283 if(limit-- == 0) {
1284 tr_heapsort(ISAd, first, last - first);
1285 for(a = last - 1; first < a; a = b) {
1286 for(x = ISAd[*a], b = a - 1; (first <= b) && (ISAd[*b] == x); --b) { *b = ~*b; }
1287 }
1288 limit = -3;
1289 continue;
1290 }
1291
1292 /* choose pivot */
1293 a = tr_pivot(ISAd, first, last);
1294 SWAP(*first, *a);
1295 v = ISAd[*first];
1296
1297 /* partition */
1298 tr_partition(ISAd, first, first + 1, last, &a, &b, v);
1299 if((last - first) != (b - a)) {
1300 next = (ISA[*a] != v) ? tr_ilg(b - a) : -1;
1301
1302 /* update ranks */
1303 for(c = first, v = a - SA - 1; c < a; ++c) { ISA[*c] = v; }
1304 if(b < last) { for(c = a, v = b - SA - 1; c < b; ++c) { ISA[*c] = v; } }
1305
1306 /* push */
1307 if((1 < (b - a)) && (trbudget_check(budget, b - a))) {
1308 if((a - first) <= (last - b)) {
1309 if((last - b) <= (b - a)) {
1310 if(1 < (a - first)) {
1311 STACK_PUSH5(ISAd + incr, a, b, next, trlink);
1312 STACK_PUSH5(ISAd, b, last, limit, trlink);
1313 last = a;
1314 } else if(1 < (last - b)) {
1315 STACK_PUSH5(ISAd + incr, a, b, next, trlink);
1316 first = b;
1317 } else {
1318 ISAd += incr, first = a, last = b, limit = next;
1319 }
1320 } else if((a - first) <= (b - a)) {
1321 if(1 < (a - first)) {
1322 STACK_PUSH5(ISAd, b, last, limit, trlink);
1323 STACK_PUSH5(ISAd + incr, a, b, next, trlink);
1324 last = a;
1325 } else {
1326 STACK_PUSH5(ISAd, b, last, limit, trlink);
1327 ISAd += incr, first = a, last = b, limit = next;
1328 }
1329 } else {
1330 STACK_PUSH5(ISAd, b, last, limit, trlink);
1331 STACK_PUSH5(ISAd, first, a, limit, trlink);
1332 ISAd += incr, first = a, last = b, limit = next;
1333 }
1334 } else {
1335 if((a - first) <= (b - a)) {
1336 if(1 < (last - b)) {
1337 STACK_PUSH5(ISAd + incr, a, b, next, trlink);
1338 STACK_PUSH5(ISAd, first, a, limit, trlink);
1339 first = b;
1340 } else if(1 < (a - first)) {
1341 STACK_PUSH5(ISAd + incr, a, b, next, trlink);
1342 last = a;
1343 } else {
1344 ISAd += incr, first = a, last = b, limit = next;
1345 }
1346 } else if((last - b) <= (b - a)) {
1347 if(1 < (last - b)) {
1348 STACK_PUSH5(ISAd, first, a, limit, trlink);
1349 STACK_PUSH5(ISAd + incr, a, b, next, trlink);
1350 first = b;
1351 } else {
1352 STACK_PUSH5(ISAd, first, a, limit, trlink);
1353 ISAd += incr, first = a, last = b, limit = next;
1354 }
1355 } else {
1356 STACK_PUSH5(ISAd, first, a, limit, trlink);
1357 STACK_PUSH5(ISAd, b, last, limit, trlink);
1358 ISAd += incr, first = a, last = b, limit = next;
1359 }
1360 }
1361 } else {
1362 if((1 < (b - a)) && (0 <= trlink)) { stack[trlink].d = -1; }
1363 if((a - first) <= (last - b)) {
1364 if(1 < (a - first)) {
1365 STACK_PUSH5(ISAd, b, last, limit, trlink);
1366 last = a;
1367 } else if(1 < (last - b)) {
1368 first = b;
1369 } else {
1370 STACK_POP5(ISAd, first, last, limit, trlink);
1371 }
1372 } else {
1373 if(1 < (last - b)) {
1374 STACK_PUSH5(ISAd, first, a, limit, trlink);
1375 first = b;
1376 } else if(1 < (a - first)) {
1377 last = a;
1378 } else {
1379 STACK_POP5(ISAd, first, last, limit, trlink);
1380 }
1381 }
1382 }
1383 } else {
1384 if(trbudget_check(budget, last - first)) {
1385 limit = tr_ilg(last - first), ISAd += incr;
1386 } else {
1387 if(0 <= trlink) { stack[trlink].d = -1; }
1388 STACK_POP5(ISAd, first, last, limit, trlink);
1389 }
1390 }
1391 }
1392 #undef STACK_SIZE
1393 }
1394
1395
1396
1397 /*---------------------------------------------------------------------------*/
1398
1399 /* Tandem repeat sort */
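/* Repeatedly sorts suffixes by the rank of the suffix `depth` positions
   ahead, doubling `depth` on each pass (ISAd += ISAd - ISA), in the style
   of Larsson-Sadakane prefix doubling. Negative SA entries encode the
   lengths of already-sorted runs so that later passes can skip them. */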
1400 static
1401 void
1402 trsort(int *ISA, int *SA, int n, int depth) {
1403 int *ISAd;
1404 int *first, *last;
1405 trbudget_t budget;
1406 int t, skip, unsorted;
1407
1408 trbudget_init(&budget, tr_ilg(n) * 2 / 3, n);
1409 /* trbudget_init(&budget, tr_ilg(n) * 3 / 4, n); */
1410 for(ISAd = ISA + depth; -n < *SA; ISAd += ISAd - ISA) {
1411 first = SA;
1412 skip = 0;
1413 unsorted = 0;
1414 do {
1415 if((t = *first) < 0) { first -= t; skip += t; }
1416 else {
1417 if(skip != 0) { *(first + skip) = skip; skip = 0; }
1418 last = SA + ISA[t] + 1;
1419 if(1 < (last - first)) {
1420 budget.count = 0;
1421 tr_introsort(ISA, ISAd, SA, first, last, &budget);
1422 if(budget.count != 0) { unsorted += budget.count; }
1423 else { skip = first - last; }
1424 } else if((last - first) == 1) {
1425 skip = -1;
1426 }
1427 first = last;
1428 }
1429 } while(first < (SA + n));
1430 if(skip != 0) { *(first + skip) = skip; }
1431 if(unsorted == 0) { break; }
1432 }
1433 }
1434
1435
1436 /*---------------------------------------------------------------------------*/
1437
1438 /* Sorts suffixes of type B*. */
1439 static
1440 int
1441 sort_typeBstar(const unsigned char *T, int *SA,
1442 int *bucket_A, int *bucket_B,
1443 int n, int openMP) {
1444 int *PAb, *ISAb, *buf;
1445 #ifdef LIBBSC_OPENMP
1446 int *curbuf;
1447 int l;
1448 #endif
1449 int i, j, k, t, m, bufsize;
1450 int c0, c1;
1451 #ifdef LIBBSC_OPENMP
1452 int d0, d1;
1453 #endif
1454 (void)openMP;
1455
1456 /* Initialize bucket arrays. */
1457 for(i = 0; i < BUCKET_A_SIZE; ++i) { bucket_A[i] = 0; }
1458 for(i = 0; i < BUCKET_B_SIZE; ++i) { bucket_B[i] = 0; }
1459
1460 /* Count the number of occurrences of the first one or two characters of each
1461 type A, B and B* suffix. Moreover, store the beginning position of all
1462 type B* suffixes into the array SA. */
1463 for(i = n - 1, m = n, c0 = T[n - 1]; 0 <= i;) {
1464 /* type A suffix. */
1465 do { ++BUCKET_A(c1 = c0); } while((0 <= --i) && ((c0 = T[i]) >= c1));
1466 if(0 <= i) {
1467 /* type B* suffix. */
1468 ++BUCKET_BSTAR(c0, c1);
1469 SA[--m] = i;
1470 /* type B suffix. */
1471 for(--i, c1 = c0; (0 <= i) && ((c0 = T[i]) <= c1); --i, c1 = c0) {
1472 ++BUCKET_B(c0, c1);
1473 }
1474 }
1475 }
1476 m = n - m;
1477 /*
1478 note:
1479 A type B* suffix is lexicographically smaller than a type B suffix that
1480 begins with the same first two characters.
1481 */
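/* Classification (right-to-left scan above): suffix i is type A if it is
   lexicographically greater than suffix i+1 and type B if it is smaller;
   when T[i] == T[i + 1] the suffix inherits the type of position i + 1.
   A type B* suffix is a type B suffix whose successor is type A. */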
1482
1483 /* Calculate the index of start/end point of each bucket. */
1484 for(c0 = 0, i = 0, j = 0; c0 < ALPHABET_SIZE; ++c0) {
1485 t = i + BUCKET_A(c0);
1486 BUCKET_A(c0) = i + j; /* start point */
1487 i = t + BUCKET_B(c0, c0);
1488 for(c1 = c0 + 1; c1 < ALPHABET_SIZE; ++c1) {
1489 j += BUCKET_BSTAR(c0, c1);
1490 BUCKET_BSTAR(c0, c1) = j; /* end point */
1491 i += BUCKET_B(c0, c1);
1492 }
1493 }
1494
1495 if(0 < m) {
1496 /* Sort the type B* suffixes by their first two characters. */
1497 PAb = SA + n - m; ISAb = SA + m;
1498 for(i = m - 2; 0 <= i; --i) {
1499 t = PAb[i], c0 = T[t], c1 = T[t + 1];
1500 SA[--BUCKET_BSTAR(c0, c1)] = i;
1501 }
1502 t = PAb[m - 1], c0 = T[t], c1 = T[t + 1];
1503 SA[--BUCKET_BSTAR(c0, c1)] = m - 1;
1504
1505 /* Sort the type B* substrings using sssort. */
1506 #ifdef LIBBSC_OPENMP
1507 if (openMP)
1508 {
1509 buf = SA + m;
1510 c0 = ALPHABET_SIZE - 2, c1 = ALPHABET_SIZE - 1, j = m;
1511 #pragma omp parallel default(shared) private(bufsize, curbuf, k, l, d0, d1)
1512 {
1513 bufsize = (n - (2 * m)) / omp_get_num_threads();
1514 curbuf = buf + omp_get_thread_num() * bufsize;
1515 k = 0;
1516 for(;;) {
1517 #pragma omp critical(sssort_lock)
1518 {
1519 if(0 < (l = j)) {
1520 d0 = c0, d1 = c1;
1521 do {
1522 k = BUCKET_BSTAR(d0, d1);
1523 if(--d1 <= d0) {
1524 d1 = ALPHABET_SIZE - 1;
1525 if(--d0 < 0) { break; }
1526 }
1527 } while(((l - k) <= 1) && (0 < (l = k)));
1528 c0 = d0, c1 = d1, j = k;
1529 }
1530 }
1531 if(l == 0) { break; }
1532 sssort(T, PAb, SA + k, SA + l,
1533 curbuf, bufsize, 2, n, *(SA + k) == (m - 1));
1534 }
1535 }
1536 }
1537 else
1538 {
1539 buf = SA + m, bufsize = n - (2 * m);
1540 for(c0 = ALPHABET_SIZE - 2, j = m; 0 < j; --c0) {
1541 for(c1 = ALPHABET_SIZE - 1; c0 < c1; j = i, --c1) {
1542 i = BUCKET_BSTAR(c0, c1);
1543 if(1 < (j - i)) {
1544 sssort(T, PAb, SA + i, SA + j,
1545 buf, bufsize, 2, n, *(SA + i) == (m - 1));
1546 }
1547 }
1548 }
1549 }
1550 #else
1551 buf = SA + m, bufsize = n - (2 * m);
1552 for(c0 = ALPHABET_SIZE - 2, j = m; 0 < j; --c0) {
1553 for(c1 = ALPHABET_SIZE - 1; c0 < c1; j = i, --c1) {
1554 i = BUCKET_BSTAR(c0, c1);
1555 if(1 < (j - i)) {
1556 sssort(T, PAb, SA + i, SA + j,
1557 buf, bufsize, 2, n, *(SA + i) == (m - 1));
1558 }
1559 }
1560 }
1561 #endif
1562
1563 /* Compute ranks of type B* substrings. */
1564 for(i = m - 1; 0 <= i; --i) {
1565 if(0 <= SA[i]) {
1566 j = i;
1567 do { ISAb[SA[i]] = i; } while((0 <= --i) && (0 <= SA[i]));
1568 SA[i + 1] = i - j;
1569 if(i <= 0) { break; }
1570 }
1571 j = i;
1572 do { ISAb[SA[i] = ~SA[i]] = j; } while(SA[--i] < 0);
1573 ISAb[SA[i]] = j;
1574 }
1575
1576 /* Construct the inverse suffix array of type B* suffixes using trsort. */
1577 trsort(ISAb, SA, m, 1);
1578
1579 /* Set the sorted order of type B* suffixes. */
1580 for(i = n - 1, j = m, c0 = T[n - 1]; 0 <= i;) {
1581 for(--i, c1 = c0; (0 <= i) && ((c0 = T[i]) >= c1); --i, c1 = c0) { }
1582 if(0 <= i) {
1583 t = i;
1584 for(--i, c1 = c0; (0 <= i) && ((c0 = T[i]) <= c1); --i, c1 = c0) { }
1585 SA[ISAb[--j]] = ((t == 0) || (1 < (t - i))) ? t : ~t;
1586 }
1587 }
1588
1589 /* Calculate the index of start/end point of each bucket. */
1590 BUCKET_B(ALPHABET_SIZE - 1, ALPHABET_SIZE - 1) = n; /* end point */
1591 for(c0 = ALPHABET_SIZE - 2, k = m - 1; 0 <= c0; --c0) {
1592 i = BUCKET_A(c0 + 1) - 1;
1593 for(c1 = ALPHABET_SIZE - 1; c0 < c1; --c1) {
1594 t = i - BUCKET_B(c0, c1);
1595 BUCKET_B(c0, c1) = i; /* end point */
1596
1597 /* Move all type B* suffixes to the correct position. */
1598 for(i = t, j = BUCKET_BSTAR(c0, c1);
1599 j <= k;
1600 --i, --k) { SA[i] = SA[k]; }
1601 }
1602 BUCKET_BSTAR(c0, c0 + 1) = i - BUCKET_B(c0, c0) + 1; /* start point */
1603 BUCKET_B(c0, c0) = i; /* end point */
1604 }
1605 }
1606
1607 return m;
1608 }
1609
1610 /* Constructs the suffix array by using the sorted order of type B* suffixes. */
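/* Induced sorting in two passes: a right-to-left scan of each character
   bucket derives the order of all type B suffixes from the sorted B*
   suffixes, then a left-to-right scan places the type A suffixes. The sign
   bit temporarily marks entries that must not seed further inductions in
   the current pass. */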
1611 static
1612 void
1613 construct_SA(const unsigned char *T, int *SA,
1614 int *bucket_A, int *bucket_B,
1615 int n, int m) {
1616 int *i, *j, *k;
1617 int s;
1618 int c0, c1, c2;
1619
1620 if(0 < m) {
1621 /* Construct the sorted order of type B suffixes by using
1622 the sorted order of type B* suffixes. */
1623 for(c1 = ALPHABET_SIZE - 2; 0 <= c1; --c1) {
1624 /* Scan the suffix array from right to left. */
1625 for(i = SA + BUCKET_BSTAR(c1, c1 + 1),
1626 j = SA + BUCKET_A(c1 + 1) - 1, k = NULL, c2 = -1;
1627 i <= j;
1628 --j) {
1629 if(0 < (s = *j)) {
1630 assert(T[s] == c1);
1631 assert(((s + 1) < n) && (T[s] <= T[s + 1]));
1632 assert(T[s - 1] <= T[s]);
1633 *j = ~s;
1634 c0 = T[--s];
1635 if((0 < s) && (T[s - 1] > c0)) { s = ~s; }
1636 if(c0 != c2) {
1637 if(0 <= c2) { BUCKET_B(c2, c1) = k - SA; }
1638 k = SA + BUCKET_B(c2 = c0, c1);
1639 }
1640 assert(k < j);
1641 *k-- = s;
1642 } else {
1643 assert(((s == 0) && (T[s] == c1)) || (s < 0));
1644 *j = ~s;
1645 }
1646 }
1647 }
1648 }
1649
1650 /* Construct the suffix array by using
1651 the sorted order of type B suffixes. */
1652 k = SA + BUCKET_A(c2 = T[n - 1]);
1653 *k++ = (T[n - 2] < c2) ? ~(n - 1) : (n - 1);
1654 /* Scan the suffix array from left to right. */
1655 for(i = SA, j = SA + n; i < j; ++i) {
1656 if(0 < (s = *i)) {
1657 assert(T[s - 1] >= T[s]);
1658 c0 = T[--s];
1659 if((s == 0) || (T[s - 1] < c0)) { s = ~s; }
1660 if(c0 != c2) {
1661 BUCKET_A(c2) = k - SA;
1662 k = SA + BUCKET_A(c2 = c0);
1663 }
1664 assert(i < k);
1665 *k++ = s;
1666 } else {
1667 assert(s < 0);
1668 *i = ~s;
1669 }
1670 }
1671 }
1672
1673 /* Constructs the Burrows-Wheeler transformed string directly
1674 by using the sorted order of type B* suffixes. */
1675 static
1676 int
1677 construct_BWT(const unsigned char *T, int *SA,
1678 int *bucket_A, int *bucket_B,
1679 int n, int m) {
1680 int *i, *j, *k, *orig;
1681 int s;
1682 int c0, c1, c2;
1683
1684 if(0 < m) {
1685 /* Construct the sorted order of type B suffixes by using
1686 the sorted order of type B* suffixes. */
1687 for(c1 = ALPHABET_SIZE - 2; 0 <= c1; --c1) {
1688 /* Scan the suffix array from right to left. */
1689 for(i = SA + BUCKET_BSTAR(c1, c1 + 1),
1690 j = SA + BUCKET_A(c1 + 1) - 1, k = NULL, c2 = -1;
1691 i <= j;
1692 --j) {
1693 if(0 < (s = *j)) {
1694 assert(T[s] == c1);
1695 assert(((s + 1) < n) && (T[s] <= T[s + 1]));
1696 assert(T[s - 1] <= T[s]);
1697 c0 = T[--s];
1698 *j = ~((int)c0);
1699 if((0 < s) && (T[s - 1] > c0)) { s = ~s; }
1700 if(c0 != c2) {
1701 if(0 <= c2) { BUCKET_B(c2, c1) = k - SA; }
1702 k = SA + BUCKET_B(c2 = c0, c1);
1703 }
1704 assert(k < j);
1705 *k-- = s;
1706 } else if(s != 0) {
1707 *j = ~s;
1708 #ifndef NDEBUG
1709 } else {
1710 assert(T[s] == c1);
1711 #endif
1712 }
1713 }
1714 }
1715 }
1716
1717 /* Construct the BWTed string by using
1718 the sorted order of type B suffixes. */
1719 k = SA + BUCKET_A(c2 = T[n - 1]);
1720 *k++ = (T[n - 2] < c2) ? ~((int)T[n - 2]) : (n - 1);
1721 /* Scan the suffix array from left to right. */
1722 for(i = SA, j = SA + n, orig = SA; i < j; ++i) {
1723 if(0 < (s = *i)) {
1724 assert(T[s - 1] >= T[s]);
1725 c0 = T[--s];
1726 *i = c0;
1727 if((0 < s) && (T[s - 1] < c0)) { s = ~((int)T[s - 1]); }
1728 if(c0 != c2) {
1729 BUCKET_A(c2) = k - SA;
1730 k = SA + BUCKET_A(c2 = c0);
1731 }
1732 assert(i < k);
1733 *k++ = s;
1734 } else if(s != 0) {
1735 *i = ~s;
1736 } else {
1737 orig = i;
1738 }
1739 }
1740
1741 return orig - SA;
1742 }
1743
1744 /* Constructs the Burrows-Wheeler transformed string directly
1745 by using the sorted order of type B* suffixes, recording secondary indexes. */
1746 static
1747 int
1748 construct_BWT_indexes(const unsigned char *T, int *SA,
1749 int *bucket_A, int *bucket_B,
1750 int n, int m,
1751 unsigned char * num_indexes, int * indexes) {
1752 int *i, *j, *k, *orig;
1753 int s;
1754 int c0, c1, c2;
1755
1756 int mod = n / 8;
1757 {
1758 mod |= mod >> 1; mod |= mod >> 2;
1759 mod |= mod >> 4; mod |= mod >> 8;
1760 mod |= mod >> 16; mod >>= 1;
1761
1762 *num_indexes = (unsigned char)((n - 1) / (mod + 1));
1763 }
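/* The bit-smearing above rounds n / 8 down to one less than a power of
   two, so (s & mod) == 0 tests whether position s is a multiple of
   (mod + 1); one secondary index is recorded per such position. */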
1764
1765 if(0 < m) {
1766 /* Construct the sorted order of type B suffixes by using
1767 the sorted order of type B* suffixes. */
1768 for(c1 = ALPHABET_SIZE - 2; 0 <= c1; --c1) {
1769 /* Scan the suffix array from right to left. */
1770 for(i = SA + BUCKET_BSTAR(c1, c1 + 1),
1771 j = SA + BUCKET_A(c1 + 1) - 1, k = NULL, c2 = -1;
1772 i <= j;
1773 --j) {
1774 if(0 < (s = *j)) {
1775 assert(T[s] == c1);
1776 assert(((s + 1) < n) && (T[s] <= T[s + 1]));
1777 assert(T[s - 1] <= T[s]);
1778
1779 if ((s & mod) == 0) indexes[s / (mod + 1) - 1] = j - SA;
1780
1781 c0 = T[--s];
1782 *j = ~((int)c0);
1783 if((0 < s) && (T[s - 1] > c0)) { s = ~s; }
1784 if(c0 != c2) {
1785 if(0 <= c2) { BUCKET_B(c2, c1) = k - SA; }
1786 k = SA + BUCKET_B(c2 = c0, c1);
1787 }
1788 assert(k < j);
1789 *k-- = s;
1790 } else if(s != 0) {
1791 *j = ~s;
1792 #ifndef NDEBUG
1793 } else {
1794 assert(T[s] == c1);
1795 #endif
1796 }
1797 }
1798 }
1799 }
1800
1801 /* Construct the BWTed string by using
1802 the sorted order of type B suffixes. */
1803 k = SA + BUCKET_A(c2 = T[n - 1]);
1804 if (T[n - 2] < c2) {
1805 if (((n - 1) & mod) == 0) indexes[(n - 1) / (mod + 1) - 1] = k - SA;
1806 *k++ = ~((int)T[n - 2]);
1807 }
1808 else {
1809 *k++ = n - 1;
1810 }
1811
1812 /* Scan the suffix array from left to right. */
1813 for(i = SA, j = SA + n, orig = SA; i < j; ++i) {
1814 if(0 < (s = *i)) {
1815 assert(T[s - 1] >= T[s]);
1816
1817 if ((s & mod) == 0) indexes[s / (mod + 1) - 1] = i - SA;
1818
1819 c0 = T[--s];
1820 *i = c0;
1821 if(c0 != c2) {
1822 BUCKET_A(c2) = k - SA;
1823 k = SA + BUCKET_A(c2 = c0);
1824 }
1825 assert(i < k);
1826 if((0 < s) && (T[s - 1] < c0)) {
1827 if ((s & mod) == 0) indexes[s / (mod + 1) - 1] = k - SA;
1828 *k++ = ~((int)T[s - 1]);
1829 } else
1830 *k++ = s;
1831 } else if(s != 0) {
1832 *i = ~s;
1833 } else {
1834 orig = i;
1835 }
1836 }
1837
1838 return orig - SA;
1839 }
1840
1841
1842 /*---------------------------------------------------------------------------*/
1843
1844 /*- Function -*/
1845
1846 int
1847 divsufsort(const unsigned char *T, int *SA, int n, int openMP) {
1848 int *bucket_A, *bucket_B;
1849 int m;
1850 int err = 0;
1851
1852 /* Check arguments. */
1853 if((T == NULL) || (SA == NULL) || (n < 0)) { return -1; }
1854 else if(n == 0) { return 0; }
1855 else if(n == 1) { SA[0] = 0; return 0; }
1856 else if(n == 2) { m = (T[0] < T[1]); SA[m ^ 1] = 0, SA[m] = 1; return 0; }
1857
1858 bucket_A = (int *)malloc(BUCKET_A_SIZE * sizeof(int));
1859 bucket_B = (int *)malloc(BUCKET_B_SIZE * sizeof(int));
1860
1861 /* Suffixsort. */
1862 if((bucket_A != NULL) && (bucket_B != NULL)) {
1863 m = sort_typeBstar(T, SA, bucket_A, bucket_B, n, openMP);
1864 construct_SA(T, SA, bucket_A, bucket_B, n, m);
1865 } else {
1866 err = -2;
1867 }
1868
1869 free(bucket_B);
1870 free(bucket_A);
1871
1872 return err;
1873 }
1874
1875 int
1876 divbwt(const unsigned char *T, unsigned char *U, int *A, int n, unsigned char * num_indexes, int * indexes, int openMP) {
1877 int *B;
1878 int *bucket_A, *bucket_B;
1879 int m, pidx, i;
1880
1881 /* Check arguments. */
1882 if((T == NULL) || (U == NULL) || (n < 0)) { return -1; }
1883 else if(n <= 1) { if(n == 1) { U[0] = T[0]; } return n; }
1884
1885 if((B = A) == NULL) { B = (int *)malloc((size_t)(n + 1) * sizeof(int)); }
1886 bucket_A = (int *)malloc(BUCKET_A_SIZE * sizeof(int));
1887 bucket_B = (int *)malloc(BUCKET_B_SIZE * sizeof(int));
1888
1889 /* Burrows-Wheeler Transform. */
1890 if((B != NULL) && (bucket_A != NULL) && (bucket_B != NULL)) {
1891 m = sort_typeBstar(T, B, bucket_A, bucket_B, n, openMP);
1892
1893 if (num_indexes == NULL || indexes == NULL) {
1894 pidx = construct_BWT(T, B, bucket_A, bucket_B, n, m);
1895 } else {
1896 pidx = construct_BWT_indexes(T, B, bucket_A, bucket_B, n, m, num_indexes, indexes);
1897 }
1898
1899 /* Copy to output string. */
1900 U[0] = T[n - 1];
1901 for(i = 0; i < pidx; ++i) { U[i + 1] = (unsigned char)B[i]; }
1902 for(i += 1; i < n; ++i) { U[i] = (unsigned char)B[i]; }
1903 pidx += 1;
1904 } else {
1905 pidx = -2;
1906 }
1907
1908 free(bucket_B);
1909 free(bucket_A);
1910 if(A == NULL) { free(B); }
1911
1912 return pidx;
1913 }
@@ -0,0 +1,67 b''
1 /*
2 * divsufsort.h for libdivsufsort-lite
3 * Copyright (c) 2003-2008 Yuta Mori All Rights Reserved.
4 *
5 * Permission is hereby granted, free of charge, to any person
6 * obtaining a copy of this software and associated documentation
7 * files (the "Software"), to deal in the Software without
8 * restriction, including without limitation the rights to use,
9 * copy, modify, merge, publish, distribute, sublicense, and/or sell
10 * copies of the Software, and to permit persons to whom the
11 * Software is furnished to do so, subject to the following
12 * conditions:
13 *
14 * The above copyright notice and this permission notice shall be
15 * included in all copies or substantial portions of the Software.
16 *
17 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
18 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
19 * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
20 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
21 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
22 * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
23 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
24 * OTHER DEALINGS IN THE SOFTWARE.
25 */
26
27 #ifndef _DIVSUFSORT_H
28 #define _DIVSUFSORT_H 1
29
30 #ifdef __cplusplus
31 extern "C" {
32 #endif /* __cplusplus */
33
34
35 /*- Prototypes -*/
36
37 /**
38 * Constructs the suffix array of a given string.
39 * @param T [0..n-1] The input string.
40 * @param SA [0..n-1] The output array of suffixes.
41 * @param n The length of the given string.
42 * @param openMP enables OpenMP optimization.
43 * @return 0 if no error occurred, -1 or -2 otherwise.
44 */
45 int
46 divsufsort(const unsigned char *T, int *SA, int n, int openMP);
47
48 /**
49 * Constructs the Burrows-Wheeler transformed string of a given string.
50 * @param T [0..n-1] The input string.
51 * @param U [0..n-1] The output string. (can be T)
52 * @param A [0..n-1] The temporary array. (can be NULL)
53 * @param n The length of the given string.
54 * @param num_indexes The length of secondary indexes array. (can be NULL)
55 * @param indexes The secondary indexes array. (can be NULL)
56 * @param openMP enables OpenMP optimization.
57 * @return The primary index if no error occurred, -1 or -2 otherwise.
58 */
59 int
60 divbwt(const unsigned char *T, unsigned char *U, int *A, int n, unsigned char * num_indexes, int * indexes, int openMP);
61
62
63 #ifdef __cplusplus
64 } /* extern "C" */
65 #endif /* __cplusplus */
66
67 #endif /* _DIVSUFSORT_H */
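/* Illustrative usage (not part of the vendored sources): a minimal caller
   that builds a suffix array over an in-memory buffer with the API above
   and prints the sorted suffixes. The input string is arbitrary; openMP is
   passed as 0 for a single-threaded run. */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "divsufsort.h"

int main(void) {
  const unsigned char T[] = "abracadabra";
  int n = (int)strlen((const char *)T);
  int *SA = (int *)malloc((size_t)n * sizeof(int));
  if (SA == NULL) return 1;
  if (divsufsort(T, SA, n, 0) == 0) {          /* 0 return: no error */
    int i;
    for (i = 0; i < n; ++i)
      printf("SA[%2d] = %2d: %s\n", i, SA[i], (const char *)T + SA[i]);
  }
  free(SA);
  return 0;
}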
@@ -0,0 +1,1010 b''
1 /**
2 * Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
3 * All rights reserved.
4 *
5 * This source code is licensed under the BSD-style license found in the
6 * LICENSE file in the root directory of this source tree. An additional grant
7 * of patent rights can be found in the PATENTS file in the same directory.
8 */
9
10
11 /*-**************************************
12 * Tuning parameters
13 ****************************************/
14 #define ZDICT_MAX_SAMPLES_SIZE (2000U << 20)
15 #define ZDICT_MIN_SAMPLES_SIZE 512
16
17
18 /*-**************************************
19 * Compiler Options
20 ****************************************/
21 /* Unix Large Files support (>4GB) */
22 #define _FILE_OFFSET_BITS 64
23 #if (defined(__sun__) && (!defined(__LP64__))) /* 32-bit Sun Solaris requires specific definitions */
24 # define _LARGEFILE_SOURCE
25 #elif ! defined(__LP64__) /* No point defining large-file support on 64-bit systems */
26 # define _LARGEFILE64_SOURCE
27 #endif
28
29
30 /*-*************************************
31 * Dependencies
32 ***************************************/
33 #include <stdlib.h> /* malloc, free */
34 #include <string.h> /* memset */
35 #include <stdio.h> /* fprintf, fopen, ftello64 */
36 #include <time.h> /* clock */
37
38 #include "mem.h" /* read */
39 #include "error_private.h"
40 #include "fse.h" /* FSE_normalizeCount, FSE_writeNCount */
41 #define HUF_STATIC_LINKING_ONLY
42 #include "huf.h"
43 #include "zstd_internal.h" /* includes zstd.h */
44 #include "xxhash.h"
45 #include "divsufsort.h"
46 #ifndef ZDICT_STATIC_LINKING_ONLY
47 # define ZDICT_STATIC_LINKING_ONLY
48 #endif
49 #include "zdict.h"
50
51
52 /*-*************************************
53 * Constants
54 ***************************************/
55 #define KB *(1 <<10)
56 #define MB *(1 <<20)
57 #define GB *(1U<<30)
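/* Written as postfix multipliers so that sizes read naturally,
   e.g. `64 KB` expands to `64 *(1 <<10)`. */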
58
59 #define DICTLISTSIZE_DEFAULT 10000
60
61 #define NOISELENGTH 32
62
63 #define MINRATIO 4
64 static const int g_compressionLevel_default = 5;
65 static const U32 g_selectivity_default = 9;
66 static const size_t g_provision_entropySize = 200;
67 static const size_t g_min_fast_dictContent = 192;
68
69
70 /*-*************************************
71 * Console display
72 ***************************************/
73 #define DISPLAY(...) { fprintf(stderr, __VA_ARGS__); fflush( stderr ); }
74 #define DISPLAYLEVEL(l, ...) if (notificationLevel>=l) { DISPLAY(__VA_ARGS__); } /* 0 : no display; 1: errors; 2: default; 3: details; 4: debug */
75
76 static clock_t ZDICT_clockSpan(clock_t nPrevious) { return clock() - nPrevious; }
77
78 static void ZDICT_printHex(const void* ptr, size_t length)
79 {
80 const BYTE* const b = (const BYTE*)ptr;
81 size_t u;
82 for (u=0; u<length; u++) {
83 BYTE c = b[u];
84 if (c<32 || c>126) c = '.'; /* non-printable char */
85 DISPLAY("%c", c);
86 }
87 }
88
89
90 /*-********************************************************
91 * Helper functions
92 **********************************************************/
93 unsigned ZDICT_isError(size_t errorCode) { return ERR_isError(errorCode); }
94
95 const char* ZDICT_getErrorName(size_t errorCode) { return ERR_getErrorName(errorCode); }
96
97 unsigned ZDICT_getDictID(const void* dictBuffer, size_t dictSize)
98 {
99 if (dictSize < 8) return 0;
100 if (MEM_readLE32(dictBuffer) != ZSTD_DICT_MAGIC) return 0;
101 return MEM_readLE32((const char*)dictBuffer + 4);
102 }
103
104
105 /*-********************************************************
106 * Dictionary training functions
107 **********************************************************/
108 static unsigned ZDICT_NbCommonBytes (register size_t val)
109 {
110 if (MEM_isLittleEndian()) {
111 if (MEM_64bits()) {
112 # if defined(_MSC_VER) && defined(_WIN64)
113 unsigned long r = 0;
114 _BitScanForward64( &r, (U64)val );
115 return (unsigned)(r>>3);
116 # elif defined(__GNUC__) && (__GNUC__ >= 3)
117 return (__builtin_ctzll((U64)val) >> 3);
118 # else
119 static const int DeBruijnBytePos[64] = { 0, 0, 0, 0, 0, 1, 1, 2, 0, 3, 1, 3, 1, 4, 2, 7, 0, 2, 3, 6, 1, 5, 3, 5, 1, 3, 4, 4, 2, 5, 6, 7, 7, 0, 1, 2, 3, 3, 4, 6, 2, 6, 5, 5, 3, 4, 5, 6, 7, 1, 2, 4, 6, 4, 4, 5, 7, 2, 6, 5, 7, 6, 7, 7 };
120 return DeBruijnBytePos[((U64)((val & -(long long)val) * 0x0218A392CDABBD3FULL)) >> 58];
121 # endif
122 } else { /* 32 bits */
123 # if defined(_MSC_VER)
124 unsigned long r=0;
125 _BitScanForward( &r, (U32)val );
126 return (unsigned)(r>>3);
127 # elif defined(__GNUC__) && (__GNUC__ >= 3)
128 return (__builtin_ctz((U32)val) >> 3);
129 # else
130 static const int DeBruijnBytePos[32] = { 0, 0, 3, 0, 3, 1, 3, 0, 3, 2, 2, 1, 3, 2, 0, 1, 3, 3, 1, 2, 2, 2, 2, 0, 3, 1, 2, 0, 1, 0, 1, 1 };
131 return DeBruijnBytePos[((U32)((val & -(S32)val) * 0x077CB531U)) >> 27];
132 # endif
133 }
134 } else { /* Big Endian CPU */
135 if (MEM_64bits()) {
136 # if defined(_MSC_VER) && defined(_WIN64)
137 unsigned long r = 0;
138 _BitScanReverse64( &r, val );
139 return (unsigned)(r>>3);
140 # elif defined(__GNUC__) && (__GNUC__ >= 3)
141 return (__builtin_clzll(val) >> 3);
142 # else
143 unsigned r;
144 const unsigned n32 = sizeof(size_t)*4; /* computed this way to avoid a compiler warning in 32-bit mode */
145 if (!(val>>n32)) { r=4; } else { r=0; val>>=n32; }
146 if (!(val>>16)) { r+=2; val>>=8; } else { val>>=24; }
147 r += (!val);
148 return r;
149 # endif
150 } else { /* 32 bits */
151 # if defined(_MSC_VER)
152 unsigned long r = 0;
153 _BitScanReverse( &r, (unsigned long)val );
154 return (unsigned)(r>>3);
155 # elif defined(__GNUC__) && (__GNUC__ >= 3)
156 return (__builtin_clz((U32)val) >> 3);
157 # else
158 unsigned r;
159 if (!(val>>16)) { r=2; val>>=8; } else { r=0; val>>=24; }
160 r += (!val);
161 return r;
162 # endif
163 } }
164 }
165
166
167 /*! ZDICT_count() :
168 Count the number of common bytes between 2 pointers.
169 Note : this function presumes end of buffer followed by noisy guard band.
170 */
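/* The loop compares one size_t word at a time and resolves only the first
   mismatching byte via ZDICT_NbCommonBytes(), so it can read up to
   sizeof(size_t) - 1 bytes past the true end of a match; the noisy guard
   band mentioned above is what makes that overread terminate safely. */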
171 static size_t ZDICT_count(const void* pIn, const void* pMatch)
172 {
173 const char* const pStart = (const char*)pIn;
174 for (;;) {
175 size_t const diff = MEM_readST(pMatch) ^ MEM_readST(pIn);
176 if (!diff) {
177 pIn = (const char*)pIn+sizeof(size_t);
178 pMatch = (const char*)pMatch+sizeof(size_t);
179 continue;
180 }
181 pIn = (const char*)pIn+ZDICT_NbCommonBytes(diff);
182 return (size_t)((const char*)pIn - pStart);
183 }
184 }
185
186
187 typedef struct {
188 U32 pos;
189 U32 length;
190 U32 savings;
191 } dictItem;
192
193 static void ZDICT_initDictItem(dictItem* d)
194 {
195 d->pos = 1;
196 d->length = 0;
197 d->savings = (U32)(-1);
198 }
199
200
201 #define LLIMIT 64 /* heuristic determined experimentally */
202 #define MINMATCHLENGTH 7 /* heuristic determined experimentally */
203 static dictItem ZDICT_analyzePos(
204 BYTE* doneMarks,
205 const int* suffix, U32 start,
206 const void* buffer, U32 minRatio, U32 notificationLevel)
207 {
208 U32 lengthList[LLIMIT] = {0};
209 U32 cumulLength[LLIMIT] = {0};
210 U32 savings[LLIMIT] = {0};
211 const BYTE* b = (const BYTE*)buffer;
212 size_t length;
213 size_t maxLength = LLIMIT;
214 size_t pos = suffix[start];
215 U32 end = start;
216 dictItem solution;
217
218 /* init */
219 memset(&solution, 0, sizeof(solution));
220 doneMarks[pos] = 1;
221
222 /* trivial repetition cases */
223 if ( (MEM_read16(b+pos+0) == MEM_read16(b+pos+2))
224 ||(MEM_read16(b+pos+1) == MEM_read16(b+pos+3))
225 ||(MEM_read16(b+pos+2) == MEM_read16(b+pos+4)) ) {
226 /* skip and mark segment */
227 U16 u16 = MEM_read16(b+pos+4);
228 U32 u, e = 6;
229 while (MEM_read16(b+pos+e) == u16) e+=2;
230 if (b[pos+e] == b[pos+e-1]) e++;
231 for (u=1; u<e; u++)
232 doneMarks[pos+u] = 1;
233 return solution;
234 }
235
236 /* look forward */
237 do {
238 end++;
239 length = ZDICT_count(b + pos, b + suffix[end]);
240 } while (length >=MINMATCHLENGTH);
241
242 /* look backward */
243 do {
244 length = ZDICT_count(b + pos, b + *(suffix+start-1));
245 if (length >=MINMATCHLENGTH) start--;
246 } while(length >= MINMATCHLENGTH);
247
248 /* exit if the minimum number of repetitions was not found */
249 if (end-start < minRatio) {
250 U32 idx;
251 for(idx=start; idx<end; idx++)
252 doneMarks[suffix[idx]] = 1;
253 return solution;
254 }
255
256 { int i;
257 U32 searchLength;
258 U32 refinedStart = start;
259 U32 refinedEnd = end;
260
261 DISPLAYLEVEL(4, "\n");
262 DISPLAYLEVEL(4, "found %3u matches of length >= %i at pos %7u ", (U32)(end-start), MINMATCHLENGTH, (U32)pos);
263 DISPLAYLEVEL(4, "\n");
264
265 for (searchLength = MINMATCHLENGTH ; ; searchLength++) {
266 BYTE currentChar = 0;
267 U32 currentCount = 0;
268 U32 currentID = refinedStart;
269 U32 id;
270 U32 selectedCount = 0;
271 U32 selectedID = currentID;
272 for (id =refinedStart; id < refinedEnd; id++) {
273 if (b[ suffix[id] + searchLength] != currentChar) {
274 if (currentCount > selectedCount) {
275 selectedCount = currentCount;
276 selectedID = currentID;
277 }
278 currentID = id;
279 currentChar = b[ suffix[id] + searchLength];
280 currentCount = 0;
281 }
282 currentCount ++;
283 }
284 if (currentCount > selectedCount) { /* for last */
285 selectedCount = currentCount;
286 selectedID = currentID;
287 }
288
289 if (selectedCount < minRatio)
290 break;
291 refinedStart = selectedID;
292 refinedEnd = refinedStart + selectedCount;
293 }
294
295 /* evaluate gain based on new ref */
296 start = refinedStart;
297 pos = suffix[refinedStart];
298 end = start;
299 memset(lengthList, 0, sizeof(lengthList));
300
301 /* look forward */
302 do {
303 end++;
304 length = ZDICT_count(b + pos, b + suffix[end]);
305 if (length >= LLIMIT) length = LLIMIT-1;
306 lengthList[length]++;
307 } while (length >=MINMATCHLENGTH);
308
309 /* look backward */
310 length = MINMATCHLENGTH;
311 while ((length >= MINMATCHLENGTH) & (start > 0)) {
312 length = ZDICT_count(b + pos, b + suffix[start - 1]);
313 if (length >= LLIMIT) length = LLIMIT - 1;
314 lengthList[length]++;
315 if (length >= MINMATCHLENGTH) start--;
316 }
317
318 /* largest useful length */
319 memset(cumulLength, 0, sizeof(cumulLength));
320 cumulLength[maxLength-1] = lengthList[maxLength-1];
321 for (i=(int)(maxLength-2); i>=0; i--)
322 cumulLength[i] = cumulLength[i+1] + lengthList[i];
323
324 for (i=LLIMIT-1; i>=MINMATCHLENGTH; i--) if (cumulLength[i]>=minRatio) break;
325 maxLength = i;
326
327 /* reduce maxLength if the selected segment ends in repetitive data */
328 { U32 l = (U32)maxLength;
329 BYTE const c = b[pos + maxLength-1];
330 while (b[pos+l-2]==c) l--;
331 maxLength = l;
332 }
333 if (maxLength < MINMATCHLENGTH) return solution; /* skip : no long-enough solution */
334
335 /* calculate savings */
336 savings[5] = 0;
337 for (i=MINMATCHLENGTH; i<=(int)maxLength; i++)
338 savings[i] = savings[i-1] + (lengthList[i] * (i-3));
339
340 DISPLAYLEVEL(4, "Selected ref at position %u, of length %u : saves %u (ratio: %.2f) \n",
341 (U32)pos, (U32)maxLength, savings[maxLength], (double)savings[maxLength] / maxLength);
342
343 solution.pos = (U32)pos;
344 solution.length = (U32)maxLength;
345 solution.savings = savings[maxLength];
346
347 /* mark positions done */
348 { U32 id;
349 for (id=start; id<end; id++) {
350 U32 p, pEnd;
351 U32 const testedPos = suffix[id];
352 if (testedPos == pos)
353 length = solution.length;
354 else {
355 length = ZDICT_count(b+pos, b+testedPos);
356 if (length > solution.length) length = solution.length;
357 }
358 pEnd = (U32)(testedPos + length);
359 for (p=testedPos; p<pEnd; p++)
360 doneMarks[p] = 1;
361 } } }
362
363 return solution;
364 }
365
366
367 /*! ZDICT_checkMerge :
368 checks whether dictItem `elt` overlaps an existing table entry, merging it if possible
369 @return : id of the destination element, or 0 if not merged
370 */
371 static U32 ZDICT_checkMerge(dictItem* table, dictItem elt, U32 eltNbToSkip)
372 {
373 const U32 tableSize = table->pos;
374 const U32 eltEnd = elt.pos + elt.length;
375
376 /* tail overlap */
377 U32 u; for (u=1; u<tableSize; u++) {
378 if (u==eltNbToSkip) continue;
379 if ((table[u].pos > elt.pos) && (table[u].pos <= eltEnd)) { /* overlap, existing > new */
380 /* append */
381 U32 addedLength = table[u].pos - elt.pos;
382 table[u].length += addedLength;
383 table[u].pos = elt.pos;
384 table[u].savings += elt.savings * addedLength / elt.length; /* rough approx */
385 table[u].savings += elt.length / 8; /* rough approx bonus */
386 elt = table[u];
387 /* sort : improve rank */
388 while ((u>1) && (table[u-1].savings < elt.savings))
389 table[u] = table[u-1], u--;
390 table[u] = elt;
391 return u;
392 } }
393
394 /* front overlap */
395 for (u=1; u<tableSize; u++) {
396 if (u==eltNbToSkip) continue;
397 if ((table[u].pos + table[u].length >= elt.pos) && (table[u].pos < elt.pos)) { /* overlap, existing < new */
398 /* append */
399 int addedLength = (int)eltEnd - (table[u].pos + table[u].length);
400 table[u].savings += elt.length / 8; /* rough approx bonus */
401 if (addedLength > 0) { /* otherwise, elt fully included into existing */
402 table[u].length += addedLength;
403 table[u].savings += elt.savings * addedLength / elt.length; /* rough approx */
404 }
405 /* sort : improve rank */
406 elt = table[u];
407 while ((u>1) && (table[u-1].savings < elt.savings))
408 table[u] = table[u-1], u--;
409 table[u] = elt;
410 return u;
411 } }
412
413 return 0;
414 }
415
416
417 static void ZDICT_removeDictItem(dictItem* table, U32 id)
418 {
419 /* convention : first element is nb of elts */
420 U32 const max = table->pos;
421 U32 u;
422 if (!id) return; /* protection, should never happen */
423 for (u=id; u<max-1; u++)
424 table[u] = table[u+1];
425 table->pos--;
426 }
427
428
429 static void ZDICT_insertDictItem(dictItem* table, U32 maxSize, dictItem elt)
430 {
431 /* merge if possible */
432 U32 mergeId = ZDICT_checkMerge(table, elt, 0);
433 if (mergeId) {
434 U32 newMerge = 1;
435 while (newMerge) {
436 newMerge = ZDICT_checkMerge(table, table[mergeId], mergeId);
437 if (newMerge) ZDICT_removeDictItem(table, mergeId);
438 mergeId = newMerge;
439 }
440 return;
441 }
442
443 /* insert */
444 { U32 current;
445 U32 nextElt = table->pos;
446 if (nextElt >= maxSize) nextElt = maxSize-1;
447 current = nextElt-1;
448 while (table[current].savings < elt.savings) {
449 table[current+1] = table[current];
450 current--;
451 }
452 table[current+1] = elt;
453 table->pos = nextElt+1;
454 }
455 }
456
457
458 static U32 ZDICT_dictSize(const dictItem* dictList)
459 {
460 U32 u, dictSize = 0;
461 for (u=1; u<dictList[0].pos; u++)
462 dictSize += dictList[u].length;
463 return dictSize;
464 }
465
466
467 static size_t ZDICT_trainBuffer(dictItem* dictList, U32 dictListSize,
468 const void* const buffer, size_t bufferSize, /* buffer must end with noisy guard band */
469 const size_t* fileSizes, unsigned nbFiles,
470 U32 minRatio, U32 notificationLevel)
471 {
472 int* const suffix0 = (int*)malloc((bufferSize+2)*sizeof(*suffix0));
473 int* const suffix = suffix0+1;
474 U32* reverseSuffix = (U32*)malloc((bufferSize)*sizeof(*reverseSuffix));
475 BYTE* doneMarks = (BYTE*)malloc((bufferSize+16)*sizeof(*doneMarks)); /* +16 for overflow security */
476 U32* filePos = (U32*)malloc(nbFiles * sizeof(*filePos));
477 size_t result = 0;
478 clock_t displayClock = 0;
479 clock_t const refreshRate = CLOCKS_PER_SEC * 3 / 10;
480
481 # define DISPLAYUPDATE(l, ...) if (notificationLevel>=l) { \
482 if (ZDICT_clockSpan(displayClock) > refreshRate) \
483 { displayClock = clock(); DISPLAY(__VA_ARGS__); \
484 if (notificationLevel>=4) fflush(stdout); } }
485
486 /* init */
487 DISPLAYLEVEL(2, "\r%70s\r", ""); /* clean display line */
488 if (!suffix0 || !reverseSuffix || !doneMarks || !filePos) {
489 result = ERROR(memory_allocation);
490 goto _cleanup;
491 }
492 if (minRatio < MINRATIO) minRatio = MINRATIO;
493 memset(doneMarks, 0, bufferSize+16);
494
495 /* limit sample set size (divsufsort limitation) */
496 if (bufferSize > ZDICT_MAX_SAMPLES_SIZE) DISPLAYLEVEL(3, "sample set too large : reduced to %u MB ...\n", (U32)(ZDICT_MAX_SAMPLES_SIZE>>20));
497 while (bufferSize > ZDICT_MAX_SAMPLES_SIZE) bufferSize -= fileSizes[--nbFiles];
498
499 /* sort */
500 DISPLAYLEVEL(2, "sorting %u files of total size %u MB ...\n", nbFiles, (U32)(bufferSize>>20));
501 { int const divSufSortResult = divsufsort((const unsigned char*)buffer, suffix, (int)bufferSize, 0);
502 if (divSufSortResult != 0) { result = ERROR(GENERIC); goto _cleanup; }
503 }
504 suffix[bufferSize] = (int)bufferSize; /* leads into noise */
505 suffix0[0] = (int)bufferSize; /* leads into noise */
506 /* build reverse suffix sort */
507 { size_t pos;
508 for (pos=0; pos < bufferSize; pos++)
509 reverseSuffix[suffix[pos]] = (U32)pos;
510 /* note : filePos tracks borders between samples.
511 It is not used at this stage, but is planned to become useful in a later update */
512 filePos[0] = 0;
513 for (pos=1; pos<nbFiles; pos++)
514 filePos[pos] = (U32)(filePos[pos-1] + fileSizes[pos-1]);
515 }
516
517 DISPLAYLEVEL(2, "finding patterns ... \n");
518 DISPLAYLEVEL(3, "minimum ratio : %u \n", minRatio);
519
520 { U32 cursor; for (cursor=0; cursor < bufferSize; ) {
521 dictItem solution;
522 if (doneMarks[cursor]) { cursor++; continue; }
523 solution = ZDICT_analyzePos(doneMarks, suffix, reverseSuffix[cursor], buffer, minRatio, notificationLevel);
524 if (solution.length==0) { cursor++; continue; }
525 ZDICT_insertDictItem(dictList, dictListSize, solution);
526 cursor += solution.length;
527 DISPLAYUPDATE(2, "\r%4.2f %% \r", (double)cursor / bufferSize * 100);
528 } }
529
530 _cleanup:
531 free(suffix0);
532 free(reverseSuffix);
533 free(doneMarks);
534 free(filePos);
535 return result;
536 }
537
538
539 static void ZDICT_fillNoise(void* buffer, size_t length)
540 {
541 unsigned const prime1 = 2654435761U;
542 unsigned const prime2 = 2246822519U;
543 unsigned acc = prime1;
544 size_t p=0;
545 for (p=0; p<length; p++) {
546 acc *= prime2;
547 ((unsigned char*)buffer)[p] = (unsigned char)(acc >> 21);
548 }
549 }
550
551
552 typedef struct
553 {
554 ZSTD_CCtx* ref;
555 ZSTD_CCtx* zc;
556 void* workPlace; /* must be allocated with ZSTD_BLOCKSIZE_ABSOLUTEMAX bytes */
557 } EStats_ress_t;
558
559 #define MAXREPOFFSET 1024
560
561 static void ZDICT_countEStats(EStats_ress_t esr, ZSTD_parameters params,
562 U32* countLit, U32* offsetcodeCount, U32* matchlengthCount, U32* litlengthCount, U32* repOffsets,
563 const void* src, size_t srcSize, U32 notificationLevel)
564 {
565 size_t const blockSizeMax = MIN (ZSTD_BLOCKSIZE_ABSOLUTEMAX, 1 << params.cParams.windowLog);
566 size_t cSize;
567
568 if (srcSize > blockSizeMax) srcSize = blockSizeMax; /* protection vs large samples */
569 { size_t const errorCode = ZSTD_copyCCtx(esr.zc, esr.ref, 0);
570 if (ZSTD_isError(errorCode)) { DISPLAYLEVEL(1, "warning : ZSTD_copyCCtx failed \n"); return; }
571 }
572 cSize = ZSTD_compressBlock(esr.zc, esr.workPlace, ZSTD_BLOCKSIZE_ABSOLUTEMAX, src, srcSize);
573 if (ZSTD_isError(cSize)) { DISPLAYLEVEL(1, "warning : could not compress sample size %u \n", (U32)srcSize); return; }
574
575 if (cSize) { /* if cSize == 0, the block is not compressible */
576 const seqStore_t* seqStorePtr = ZSTD_getSeqStore(esr.zc);
577
578 /* literals stats */
579 { const BYTE* bytePtr;
580 for(bytePtr = seqStorePtr->litStart; bytePtr < seqStorePtr->lit; bytePtr++)
581 countLit[*bytePtr]++;
582 }
583
584 /* seqStats */
585 { U32 const nbSeq = (U32)(seqStorePtr->sequences - seqStorePtr->sequencesStart);
586 ZSTD_seqToCodes(seqStorePtr);
587
588 { const BYTE* codePtr = seqStorePtr->ofCode;
589 U32 u;
590 for (u=0; u<nbSeq; u++) offsetcodeCount[codePtr[u]]++;
591 }
592
593 { const BYTE* codePtr = seqStorePtr->mlCode;
594 U32 u;
595 for (u=0; u<nbSeq; u++) matchlengthCount[codePtr[u]]++;
596 }
597
598 { const BYTE* codePtr = seqStorePtr->llCode;
599 U32 u;
600 for (u=0; u<nbSeq; u++) litlengthCount[codePtr[u]]++;
601 }
602
603 if (nbSeq >= 2) { /* rep offsets */
604 const seqDef* const seq = seqStorePtr->sequencesStart;
605 U32 offset1 = seq[0].offset - 3;
606 U32 offset2 = seq[1].offset - 3;
607 if (offset1 >= MAXREPOFFSET) offset1 = 0;
608 if (offset2 >= MAXREPOFFSET) offset2 = 0;
609 repOffsets[offset1] += 3;
610 repOffsets[offset2] += 1;
611 } } }
612 }
613
614 /*
615 static size_t ZDICT_maxSampleSize(const size_t* fileSizes, unsigned nbFiles)
616 {
617 unsigned u;
618 size_t max=0;
619 for (u=0; u<nbFiles; u++)
620 if (max < fileSizes[u]) max = fileSizes[u];
621 return max;
622 }
623 */
624
625 static size_t ZDICT_totalSampleSize(const size_t* fileSizes, unsigned nbFiles)
626 {
627 size_t total=0;
628 unsigned u;
629 for (u=0; u<nbFiles; u++) total += fileSizes[u];
630 return total;
631 }
632
633 typedef struct { U32 offset; U32 count; } offsetCount_t;
634
635 static void ZDICT_insertSortCount(offsetCount_t table[ZSTD_REP_NUM+1], U32 val, U32 count)
636 {
637 U32 u;
638 table[ZSTD_REP_NUM].offset = val;
639 table[ZSTD_REP_NUM].count = count;
640 for (u=ZSTD_REP_NUM; u>0; u--) {
641 offsetCount_t tmp;
642 if (table[u-1].count >= table[u].count) break;
643 tmp = table[u-1];
644 table[u-1] = table[u];
645 table[u] = tmp;
646 }
647 }
648
649
650 #define OFFCODE_MAX 30 /* only applicable to first block */
651 static size_t ZDICT_analyzeEntropy(void* dstBuffer, size_t maxDstSize,
652 unsigned compressionLevel,
653 const void* srcBuffer, const size_t* fileSizes, unsigned nbFiles,
654 const void* dictBuffer, size_t dictBufferSize,
655 unsigned notificationLevel)
656 {
657 U32 countLit[256];
658 HUF_CREATE_STATIC_CTABLE(hufTable, 255);
659 U32 offcodeCount[OFFCODE_MAX+1];
660 short offcodeNCount[OFFCODE_MAX+1];
661 U32 offcodeMax = ZSTD_highbit32((U32)(dictBufferSize + 128 KB));
662 U32 matchLengthCount[MaxML+1];
663 short matchLengthNCount[MaxML+1];
664 U32 litLengthCount[MaxLL+1];
665 short litLengthNCount[MaxLL+1];
666 U32 repOffset[MAXREPOFFSET];
667 offsetCount_t bestRepOffset[ZSTD_REP_NUM+1];
668 EStats_ress_t esr;
669 ZSTD_parameters params;
670 U32 u, huffLog = 11, Offlog = OffFSELog, mlLog = MLFSELog, llLog = LLFSELog, total;
671 size_t pos = 0, errorCode;
672 size_t eSize = 0;
673 size_t const totalSrcSize = ZDICT_totalSampleSize(fileSizes, nbFiles);
674 size_t const averageSampleSize = totalSrcSize / (nbFiles + !nbFiles);
675 BYTE* dstPtr = (BYTE*)dstBuffer;
676
677 /* init */
678 esr.ref = ZSTD_createCCtx();
679 esr.zc = ZSTD_createCCtx();
680 esr.workPlace = malloc(ZSTD_BLOCKSIZE_ABSOLUTEMAX);
681 if (!esr.ref || !esr.zc || !esr.workPlace) {
682 eSize = ERROR(memory_allocation);
683 DISPLAYLEVEL(1, "Not enough memory \n");
684 goto _cleanup;
685 }
686 if (offcodeMax>OFFCODE_MAX) { eSize = ERROR(dictionary_wrong); goto _cleanup; } /* too large dictionary */
687 for (u=0; u<256; u++) countLit[u]=1; /* any character must be described */
688 for (u=0; u<=offcodeMax; u++) offcodeCount[u]=1;
689 for (u=0; u<=MaxML; u++) matchLengthCount[u]=1;
690 for (u=0; u<=MaxLL; u++) litLengthCount[u]=1;
691 memset(repOffset, 0, sizeof(repOffset));
692 repOffset[1] = repOffset[4] = repOffset[8] = 1;
693 memset(bestRepOffset, 0, sizeof(bestRepOffset));
694 if (compressionLevel==0) compressionLevel=g_compressionLevel_default;
695 params = ZSTD_getParams(compressionLevel, averageSampleSize, dictBufferSize);
696 { size_t const beginResult = ZSTD_compressBegin_advanced(esr.ref, dictBuffer, dictBufferSize, params, 0);
697 if (ZSTD_isError(beginResult)) {
698 eSize = ERROR(GENERIC);
699 DISPLAYLEVEL(1, "error : ZSTD_compressBegin_advanced failed \n");
700 goto _cleanup;
701 } }
702
703 /* collect stats on all files */
704 for (u=0; u<nbFiles; u++) {
705 ZDICT_countEStats(esr, params,
706 countLit, offcodeCount, matchLengthCount, litLengthCount, repOffset,
707 (const char*)srcBuffer + pos, fileSizes[u],
708 notificationLevel);
709 pos += fileSizes[u];
710 }
711
712 /* analyze */
713 errorCode = HUF_buildCTable (hufTable, countLit, 255, huffLog);
714 if (HUF_isError(errorCode)) {
715 eSize = ERROR(GENERIC);
716 DISPLAYLEVEL(1, "HUF_buildCTable error \n");
717 goto _cleanup;
718 }
719 huffLog = (U32)errorCode;
720
721 /* looking for most common first offsets */
722 { U32 offset;
723 for (offset=1; offset<MAXREPOFFSET; offset++)
724 ZDICT_insertSortCount(bestRepOffset, offset, repOffset[offset]);
725 }
726 /* note : the result of this phase should be used to better evaluate the impact on statistics */
727
728 total=0; for (u=0; u<=offcodeMax; u++) total+=offcodeCount[u];
729 errorCode = FSE_normalizeCount(offcodeNCount, Offlog, offcodeCount, total, offcodeMax);
730 if (FSE_isError(errorCode)) {
731 eSize = ERROR(GENERIC);
732 DISPLAYLEVEL(1, "FSE_normalizeCount error with offcodeCount \n");
733 goto _cleanup;
734 }
735 Offlog = (U32)errorCode;
736
737 total=0; for (u=0; u<=MaxML; u++) total+=matchLengthCount[u];
738 errorCode = FSE_normalizeCount(matchLengthNCount, mlLog, matchLengthCount, total, MaxML);
739 if (FSE_isError(errorCode)) {
740 eSize = ERROR(GENERIC);
741 DISPLAYLEVEL(1, "FSE_normalizeCount error with matchLengthCount \n");
742 goto _cleanup;
743 }
744 mlLog = (U32)errorCode;
745
746 total=0; for (u=0; u<=MaxLL; u++) total+=litLengthCount[u];
747 errorCode = FSE_normalizeCount(litLengthNCount, llLog, litLengthCount, total, MaxLL);
748 if (FSE_isError(errorCode)) {
749 eSize = ERROR(GENERIC);
750 DISPLAYLEVEL(1, "FSE_normalizeCount error with litLengthCount \n");
751 goto _cleanup;
752 }
753 llLog = (U32)errorCode;
754
755 /* write result to buffer */
756 { size_t const hhSize = HUF_writeCTable(dstPtr, maxDstSize, hufTable, 255, huffLog);
757 if (HUF_isError(hhSize)) {
758 eSize = ERROR(GENERIC);
759 DISPLAYLEVEL(1, "HUF_writeCTable error \n");
760 goto _cleanup;
761 }
762 dstPtr += hhSize;
763 maxDstSize -= hhSize;
764 eSize += hhSize;
765 }
766
767 { size_t const ohSize = FSE_writeNCount(dstPtr, maxDstSize, offcodeNCount, OFFCODE_MAX, Offlog);
768 if (FSE_isError(ohSize)) {
769 eSize = ERROR(GENERIC);
770 DISPLAYLEVEL(1, "FSE_writeNCount error with offcodeNCount \n");
771 goto _cleanup;
772 }
773 dstPtr += ohSize;
774 maxDstSize -= ohSize;
775 eSize += ohSize;
776 }
777
778 { size_t const mhSize = FSE_writeNCount(dstPtr, maxDstSize, matchLengthNCount, MaxML, mlLog);
779 if (FSE_isError(mhSize)) {
780 eSize = ERROR(GENERIC);
781 DISPLAYLEVEL(1, "FSE_writeNCount error with matchLengthNCount \n");
782 goto _cleanup;
783 }
784 dstPtr += mhSize;
785 maxDstSize -= mhSize;
786 eSize += mhSize;
787 }
788
789 { size_t const lhSize = FSE_writeNCount(dstPtr, maxDstSize, litLengthNCount, MaxLL, llLog);
790 if (FSE_isError(lhSize)) {
791 eSize = ERROR(GENERIC);
792 DISPLAYLEVEL(1, "FSE_writeNCount error with litlengthNCount \n");
793 goto _cleanup;
794 }
795 dstPtr += lhSize;
796 maxDstSize -= lhSize;
797 eSize += lhSize;
798 }
799
800 if (maxDstSize<12) {
801 eSize = ERROR(GENERIC);
802 DISPLAYLEVEL(1, "not enough space to write RepOffsets \n");
803 goto _cleanup;
804 }
805 # if 0
806 MEM_writeLE32(dstPtr+0, bestRepOffset[0].offset);
807 MEM_writeLE32(dstPtr+4, bestRepOffset[1].offset);
808 MEM_writeLE32(dstPtr+8, bestRepOffset[2].offset);
809 #else
810 /* at this stage, we don't use the result of "most common first offset",
811 as the impact of statistics is not properly evaluated */
812 MEM_writeLE32(dstPtr+0, repStartValue[0]);
813 MEM_writeLE32(dstPtr+4, repStartValue[1]);
814 MEM_writeLE32(dstPtr+8, repStartValue[2]);
815 #endif
816 //dstPtr += 12;
817 eSize += 12;
818
819 _cleanup:
820 ZSTD_freeCCtx(esr.ref);
821 ZSTD_freeCCtx(esr.zc);
822 free(esr.workPlace);
823
824 return eSize;
825 }
826
827
828 size_t ZDICT_addEntropyTablesFromBuffer_advanced(void* dictBuffer, size_t dictContentSize, size_t dictBufferCapacity,
829 const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
830 ZDICT_params_t params)
831 {
832 size_t hSize;
833 int const compressionLevel = (params.compressionLevel <= 0) ? g_compressionLevel_default : params.compressionLevel;
834 U32 const notificationLevel = params.notificationLevel;
835
836 /* dictionary header */
837 MEM_writeLE32(dictBuffer, ZSTD_DICT_MAGIC);
838 { U64 const randomID = XXH64((char*)dictBuffer + dictBufferCapacity - dictContentSize, dictContentSize, 0);
839 U32 const compliantID = (randomID % ((1U<<31)-32768)) + 32768;
840 U32 const dictID = params.dictID ? params.dictID : compliantID;
841 MEM_writeLE32((char*)dictBuffer+4, dictID);
842 }
843 hSize = 8;
844
845 /* entropy tables */
846 DISPLAYLEVEL(2, "\r%70s\r", ""); /* clean display line */
847 DISPLAYLEVEL(2, "statistics ... \n");
848 { size_t const eSize = ZDICT_analyzeEntropy((char*)dictBuffer+hSize, dictBufferCapacity-hSize,
849 compressionLevel,
850 samplesBuffer, samplesSizes, nbSamples,
851 (char*)dictBuffer + dictBufferCapacity - dictContentSize, dictContentSize,
852 notificationLevel);
853 if (ZDICT_isError(eSize)) return eSize;
854 hSize += eSize;
855 }
856
857
858 if (hSize + dictContentSize < dictBufferCapacity)
859 memmove((char*)dictBuffer + hSize, (char*)dictBuffer + dictBufferCapacity - dictContentSize, dictContentSize);
860 return MIN(dictBufferCapacity, hSize+dictContentSize);
861 }
862
863
864 /*! ZDICT_trainFromBuffer_unsafe() :
865 * Warning : `samplesBuffer` must be followed by noisy guard band.
866 * @return : size of dictionary, or an error code which can be tested with ZDICT_isError()
867 */
868 size_t ZDICT_trainFromBuffer_unsafe(
869 void* dictBuffer, size_t maxDictSize,
870 const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
871 ZDICT_params_t params)
872 {
873 U32 const dictListSize = MAX(MAX(DICTLISTSIZE_DEFAULT, nbSamples), (U32)(maxDictSize/16));
874 dictItem* const dictList = (dictItem*)malloc(dictListSize * sizeof(*dictList));
875 unsigned const selectivity = params.selectivityLevel == 0 ? g_selectivity_default : params.selectivityLevel;
876 unsigned const minRep = (selectivity > 30) ? MINRATIO : nbSamples >> selectivity;
877 size_t const targetDictSize = maxDictSize;
878 size_t const samplesBuffSize = ZDICT_totalSampleSize(samplesSizes, nbSamples);
879 size_t dictSize = 0;
880 U32 const notificationLevel = params.notificationLevel;
881
882 /* checks */
883 if (!dictList) return ERROR(memory_allocation);
884 if (maxDictSize <= g_provision_entropySize + g_min_fast_dictContent) { free(dictList); return ERROR(dstSize_tooSmall); }
885 if (samplesBuffSize < ZDICT_MIN_SAMPLES_SIZE) { free(dictList); return 0; } /* not enough source to create dictionary */
886
887 /* init */
888 ZDICT_initDictItem(dictList);
889
890 /* build dictionary */
891 ZDICT_trainBuffer(dictList, dictListSize,
892 samplesBuffer, samplesBuffSize,
893 samplesSizes, nbSamples,
894 minRep, notificationLevel);
895
896 /* display best matches */
897 if (params.notificationLevel>= 3) {
898 U32 const nb = MIN(25, dictList[0].pos);
899 U32 const dictContentSize = ZDICT_dictSize(dictList);
900 U32 u;
901 DISPLAYLEVEL(3, "\n %u segments found, of total size %u \n", dictList[0].pos, dictContentSize);
902 DISPLAYLEVEL(3, "list %u best segments \n", nb);
903 for (u=1; u<=nb; u++) {
904 U32 pos = dictList[u].pos;
905 U32 length = dictList[u].length;
906 U32 printedLength = MIN(40, length);
907 DISPLAYLEVEL(3, "%3u:%3u bytes at pos %8u, savings %7u bytes |",
908 u, length, pos, dictList[u].savings);
909 ZDICT_printHex((const char*)samplesBuffer+pos, printedLength);
910 DISPLAYLEVEL(3, "| \n");
911 } }
912
913
914 /* create dictionary */
915 { U32 dictContentSize = ZDICT_dictSize(dictList);
916 if (dictContentSize < targetDictSize/3) {
917 DISPLAYLEVEL(2, "! warning : selected content significantly smaller than requested (%u < %u) \n", dictContentSize, (U32)maxDictSize);
918 if (minRep > MINRATIO) {
919 DISPLAYLEVEL(2, "! consider increasing selectivity to produce larger dictionary (-s%u) \n", selectivity+1);
920 DISPLAYLEVEL(2, "! note : larger dictionaries are not necessarily better, test its efficiency on samples \n");
921 }
922 if (samplesBuffSize < 10 * targetDictSize)
923 DISPLAYLEVEL(2, "! consider increasing the number of samples (total size : %u MB)\n", (U32)(samplesBuffSize>>20));
924 }
925
926 if ((dictContentSize > targetDictSize*3) && (nbSamples > 2*MINRATIO) && (selectivity>1)) {
927 U32 proposedSelectivity = selectivity-1;
928 while ((nbSamples >> proposedSelectivity) <= MINRATIO) { proposedSelectivity--; }
929 DISPLAYLEVEL(2, "! note : calculated dictionary significantly larger than requested (%u > %u) \n", dictContentSize, (U32)maxDictSize);
930 DISPLAYLEVEL(2, "! consider increasing dictionary size, or produce denser dictionary (-s%u) \n", proposedSelectivity);
931 DISPLAYLEVEL(2, "! always test dictionary efficiency on samples \n");
932 }
933
934 /* limit dictionary size */
935 { U32 const max = dictList->pos; /* convention : nb of useful elts within dictList */
936 U32 currentSize = 0;
937 U32 n; for (n=1; n<max; n++) {
938 currentSize += dictList[n].length;
939 if (currentSize > targetDictSize) { currentSize -= dictList[n].length; break; }
940 }
941 dictList->pos = n;
942 dictContentSize = currentSize;
943 }
944
945 /* build dict content */
946 { U32 u;
947 BYTE* ptr = (BYTE*)dictBuffer + maxDictSize;
948 for (u=1; u<dictList->pos; u++) {
949 U32 l = dictList[u].length;
950 ptr -= l;
951 if (ptr<(BYTE*)dictBuffer) { free(dictList); return ERROR(GENERIC); } /* should not happen */
952 memcpy(ptr, (const char*)samplesBuffer+dictList[u].pos, l);
953 } }
954
955 dictSize = ZDICT_addEntropyTablesFromBuffer_advanced(dictBuffer, dictContentSize, maxDictSize,
956 samplesBuffer, samplesSizes, nbSamples,
957 params);
958 }
959
960 /* clean up */
961 free(dictList);
962 return dictSize;
963 }
964
965
966 /* issue : samplesBuffer needs to be followed by a noisy guard band.
967 * work-around : duplicate the buffer, and append the noise */
968 size_t ZDICT_trainFromBuffer_advanced(void* dictBuffer, size_t dictBufferCapacity,
969 const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
970 ZDICT_params_t params)
971 {
972 size_t result;
973 void* newBuff;
974 size_t const sBuffSize = ZDICT_totalSampleSize(samplesSizes, nbSamples);
975 if (sBuffSize < ZDICT_MIN_SAMPLES_SIZE) return 0; /* not enough content => no dictionary */
976
977 newBuff = malloc(sBuffSize + NOISELENGTH);
978 if (!newBuff) return ERROR(memory_allocation);
979
980 memcpy(newBuff, samplesBuffer, sBuffSize);
981 ZDICT_fillNoise((char*)newBuff + sBuffSize, NOISELENGTH); /* guard band, for end of buffer condition */
982
983 result = ZDICT_trainFromBuffer_unsafe(
984 dictBuffer, dictBufferCapacity,
985 newBuff, samplesSizes, nbSamples,
986 params);
987 free(newBuff);
988 return result;
989 }
990
991
992 size_t ZDICT_trainFromBuffer(void* dictBuffer, size_t dictBufferCapacity,
993 const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples)
994 {
995 ZDICT_params_t params;
996 memset(&params, 0, sizeof(params));
997 return ZDICT_trainFromBuffer_advanced(dictBuffer, dictBufferCapacity,
998 samplesBuffer, samplesSizes, nbSamples,
999 params);
1000 }
1001
1002 size_t ZDICT_addEntropyTablesFromBuffer(void* dictBuffer, size_t dictContentSize, size_t dictBufferCapacity,
1003 const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples)
1004 {
1005 ZDICT_params_t params;
1006 memset(&params, 0, sizeof(params));
1007 return ZDICT_addEntropyTablesFromBuffer_advanced(dictBuffer, dictContentSize, dictBufferCapacity,
1008 samplesBuffer, samplesSizes, nbSamples,
1009 params);
1010 }
@@ -0,0 +1,111 b''
1 /**
2 * Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
3 * All rights reserved.
4 *
5 * This source code is licensed under the BSD-style license found in the
6 * LICENSE file in the root directory of this source tree. An additional grant
7 * of patent rights can be found in the PATENTS file in the same directory.
8 */
9
10 #ifndef DICTBUILDER_H_001
11 #define DICTBUILDER_H_001
12
13 #if defined (__cplusplus)
14 extern "C" {
15 #endif
16
17
18 /*====== Dependencies ======*/
19 #include <stddef.h> /* size_t */
20
21
22 /*====== Export for Windows ======*/
23 /*!
24 * ZSTD_DLL_EXPORT :
25 * Enable exporting of functions when building a Windows DLL
26 */
27 #if defined(_WIN32) && defined(ZSTD_DLL_EXPORT) && (ZSTD_DLL_EXPORT==1)
28 # define ZDICTLIB_API __declspec(dllexport)
29 #else
30 # define ZDICTLIB_API
31 #endif
32
33
34 /*! ZDICT_trainFromBuffer() :
35 Train a dictionary from an array of samples.
36 Samples must be stored concatenated in a single flat buffer `samplesBuffer`,
37 supplied with an array of sizes `samplesSizes`, providing the size of each sample, in order.
38 The resulting dictionary will be saved into `dictBuffer`.
39 @return : size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`)
40 or an error code, which can be tested with ZDICT_isError().
41 Tips : In general, a reasonable dictionary has a size of ~ 100 KB.
42 It's obviously possible to target smaller or larger ones, just by specifying different `dictBufferCapacity`.
43 In general, it's recommended to provide a few thousand samples, but this can vary a lot.
44 It's recommended that the total size of all samples be about 100x the target size of the dictionary.
45 */
46 ZDICTLIB_API size_t ZDICT_trainFromBuffer(void* dictBuffer, size_t dictBufferCapacity,
47 const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples);
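/* Example, as a minimal sketch : train a ~100 KB dictionary, assuming `samples`,
   `sizes` and `nbSamples` (hypothetical names) already describe the sample set :

       size_t const dictCapacity = 100 * 1024;
       void* const dictBuffer = malloc(dictCapacity);
       size_t const dictSize = ZDICT_trainFromBuffer(dictBuffer, dictCapacity,
                                                     samples, sizes, nbSamples);
       if (ZDICT_isError(dictSize))
           return;                        // see ZDICT_getErrorName(dictSize)
*/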
48
49
50 /*====== Helper functions ======*/
51 ZDICTLIB_API unsigned ZDICT_getDictID(const void* dictBuffer, size_t dictSize); /**< extracts dictID; @return zero if error (not a valid dictionary) */
52 ZDICTLIB_API unsigned ZDICT_isError(size_t errorCode);
53 ZDICTLIB_API const char* ZDICT_getErrorName(size_t errorCode);
54
55
56
57 #ifdef ZDICT_STATIC_LINKING_ONLY
58
59 /* ====================================================================================
60 * The definitions in this section are considered experimental.
61 * They should never be used with a dynamic library, as they may change in the future.
62 * They are provided for advanced usages.
63 * Use them only in association with static linking.
64 * ==================================================================================== */
65
66 typedef struct {
67 unsigned selectivityLevel; /* 0 means default; larger => select more => larger dictionary */
68 int compressionLevel; /* 0 means default; target a specific zstd compression level */
69 unsigned notificationLevel; /* Write to stderr; 0 = none (default); 1 = errors; 2 = progression; 3 = details; 4 = debug; */
70 unsigned dictID; /* 0 means auto mode (32-bit random value); other : force dictID value */
71 unsigned reserved[2]; /* reserved space for future parameters */
72 } ZDICT_params_t;
73
74
75 /*! ZDICT_trainFromBuffer_advanced() :
76 Same as ZDICT_trainFromBuffer() with control over more parameters.
77 `parameters` is optional and can be provided with values set to 0 to mean "default".
78 @return : size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`),
79 or an error code, which can be tested by ZDICT_isError().
80 note : ZDICT_trainFromBuffer_advanced() will send notifications into stderr if instructed to, using notificationLevel>0.
81 */
82 size_t ZDICT_trainFromBuffer_advanced(void* dictBuffer, size_t dictBufferCapacity,
83 const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
84 ZDICT_params_t parameters);
85
86
87 /*! ZDICT_addEntropyTablesFromBuffer() :
88
89 Given a content-only dictionary (built using any 3rd party algorithm),
90 add entropy tables computed from an array of samples.
91 Samples must be stored concatenated in a flat buffer `samplesBuffer`,
92 supplied with an array of sizes `samplesSizes`, providing the size of each sample in order.
93
94 The input dictionary content must be stored *at the end* of `dictBuffer`.
95 Its size is `dictContentSize`.
96 The resulting dictionary with added entropy tables will be *written back to `dictBuffer`*,
97 starting from its beginning.
98 @return : size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`).
99 */
100 size_t ZDICT_addEntropyTablesFromBuffer(void* dictBuffer, size_t dictContentSize, size_t dictBufferCapacity,
101 const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples);
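/* Example, as a minimal sketch : `rawDict` / `rawDictSize` (hypothetical names)
   stand for a content-only dictionary produced by a 3rd party algorithm :

       memcpy((char*)dictBuffer + dictBufferCapacity - rawDictSize,
              rawDict, rawDictSize);                    // content at the end
       dictSize = ZDICT_addEntropyTablesFromBuffer(dictBuffer, rawDictSize, dictBufferCapacity,
                                                   samplesBuffer, samplesSizes, nbSamples);
       // the finished dictionary now starts at the beginning of dictBuffer
*/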
102
103
104
105 #endif /* ZDICT_STATIC_LINKING_ONLY */
106
107 #if defined (__cplusplus)
108 }
109 #endif
110
111 #endif /* DICTBUILDER_H_001 */
@@ -0,0 +1,640 b''
1 /*
2 * Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
3 * All rights reserved.
4 *
5 * This source code is licensed under the BSD-style license found in the
6 * LICENSE file in the root directory of this source tree. An additional grant
7 * of patent rights can be found in the PATENTS file in the same directory.
8 */
9
10 #ifndef ZSTD_H_235446
11 #define ZSTD_H_235446
12
13 #if defined (__cplusplus)
14 extern "C" {
15 #endif
16
17 /* ====== Dependency ======*/
18 #include <stddef.h> /* size_t */
19
20
21 /* ====== Export for Windows ======*/
22 /*
23 * ZSTD_DLL_EXPORT :
24 * Enable exporting of functions when building a Windows DLL
25 */
26 #if defined(_WIN32) && defined(ZSTD_DLL_EXPORT) && (ZSTD_DLL_EXPORT==1)
27 # define ZSTDLIB_API __declspec(dllexport)
28 #else
29 # define ZSTDLIB_API
30 #endif
31
32
33 /*******************************************************************************************************
34 Introduction
35
36 zstd, short for Zstandard, is a fast lossless compression algorithm, targeting real-time compression scenarios
37 at zlib-level and better compression ratios. The zstd compression library provides in-memory compression and
38 decompression functions. The library supports compression levels from 1 up to ZSTD_maxCLevel() which is 22.
39 Levels >= 20, labelled `--ultra`, should be used with caution, as they require more memory.
40 Compression can be done in:
41 - a single step (described as Simple API)
42 - a single step, reusing a context (described as Explicit memory management)
43 - unbounded multiple steps (described as Streaming compression)
44 The compression ratio achievable on small data can be greatly improved using compression with a dictionary in:
45 - a single step (described as Simple dictionary API)
46 - a single step, reusing a dictionary (described as Fast dictionary API)
47
48 Advanced experimental functions can be accessed using #define ZSTD_STATIC_LINKING_ONLY before including zstd.h.
49 These APIs shall never be used with a dynamic library.
50 They are not "stable", their definition may change in the future. Only static linking is allowed.
51 *********************************************************************************************************/
52
53 /*------ Version ------*/
54 ZSTDLIB_API unsigned ZSTD_versionNumber (void); /**< returns version number of ZSTD */
55
56 #define ZSTD_VERSION_MAJOR 1
57 #define ZSTD_VERSION_MINOR 1
58 #define ZSTD_VERSION_RELEASE 1
59
60 #define ZSTD_LIB_VERSION ZSTD_VERSION_MAJOR.ZSTD_VERSION_MINOR.ZSTD_VERSION_RELEASE
61 #define ZSTD_QUOTE(str) #str
62 #define ZSTD_EXPAND_AND_QUOTE(str) ZSTD_QUOTE(str)
63 #define ZSTD_VERSION_STRING ZSTD_EXPAND_AND_QUOTE(ZSTD_LIB_VERSION)
64
65 #define ZSTD_VERSION_NUMBER (ZSTD_VERSION_MAJOR *100*100 + ZSTD_VERSION_MINOR *100 + ZSTD_VERSION_RELEASE)
66
67
68 /***************************************
69 * Simple API
70 ***************************************/
71 /*! ZSTD_compress() :
72 Compresses `src` content as a single zstd compressed frame into already allocated `dst`.
73 Hint : compression runs faster if `dstCapacity` >= `ZSTD_compressBound(srcSize)`.
74 @return : compressed size written into `dst` (<= `dstCapacity`),
75 or an error code if it fails (which can be tested using ZSTD_isError()) */
76 ZSTDLIB_API size_t ZSTD_compress( void* dst, size_t dstCapacity,
77 const void* src, size_t srcSize,
78 int compressionLevel);
79
80 /*! ZSTD_decompress() :
81 `compressedSize` : must be the _exact_ size of a single compressed frame.
82 `dstCapacity` is an upper bound of originalSize.
83 If the user cannot infer an upper bound, it's better to use streaming mode to decompress data.
84 @return : the number of bytes decompressed into `dst` (<= `dstCapacity`),
85 or an errorCode if it fails (which can be tested using ZSTD_isError()) */
86 ZSTDLIB_API size_t ZSTD_decompress( void* dst, size_t dstCapacity,
87 const void* src, size_t compressedSize);
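/* Example, as a minimal sketch : a single-step round trip (allocation failure
   checks elided; `src` / `srcSize` are assumed given) :

       size_t const bound = ZSTD_compressBound(srcSize);
       void* const cBuff = malloc(bound);
       void* const rBuff = malloc(srcSize);
       size_t const cSize = ZSTD_compress(cBuff, bound, src, srcSize, 3);
       size_t const rSize = ZSTD_isError(cSize) ? cSize
                          : ZSTD_decompress(rBuff, srcSize, cBuff, cSize);
       // on success, rSize == srcSize and rBuff matches src
*/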
88
89 /*! ZSTD_getDecompressedSize() :
90 * 'src' is the start of a zstd compressed frame.
91 * @return : content size to be decompressed, as a 64-bits value _if known_, 0 otherwise.
92 * note 1 : decompressed size is an optional field, that may not be present, especially in streaming mode.
93 * When `return==0`, data to decompress could be any size.
94 * In which case, it's necessary to use streaming mode to decompress data.
95 * Optionally, application can still use ZSTD_decompress() while relying on implied limits.
96 * (For example, data may necessarily be cut into blocks <= 16 KB.)
97 * note 2 : decompressed size is always present when compression is done with ZSTD_compress()
98 * note 3 : decompressed size can be very large (64-bits value),
99 * potentially larger than what local system can handle as a single memory segment.
100 * In which case, it's necessary to use streaming mode to decompress data.
101 * note 4 : If source is untrusted, decompressed size could be wrong or intentionally modified.
102 * Always ensure result fits within application's authorized limits.
103 * Each application can set its own limits.
104 * note 5 : when `return==0`, if precise failure cause is needed, use ZSTD_getFrameParams() to know more. */
105 ZSTDLIB_API unsigned long long ZSTD_getDecompressedSize(const void* src, size_t srcSize);
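/* Example, as a minimal sketch honoring notes 3 and 4 above
   (MAX_ALLOWED_SIZE is a hypothetical application limit) :

       unsigned long long const rSize = ZSTD_getDecompressedSize(cSrc, cSrcSize);
       if ((rSize == 0) || (rSize > MAX_ALLOWED_SIZE)) {
           // size unknown or beyond authorized limit : use streaming mode instead
       } else {
           void* const dst = malloc((size_t)rSize);
           size_t const dSize = ZSTD_decompress(dst, (size_t)rSize, cSrc, cSrcSize);
           // check dSize with ZSTD_isError()
       }
*/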
106
107
108 /*====== Helper functions ======*/
109 ZSTDLIB_API int ZSTD_maxCLevel(void); /*!< maximum compression level available */
110 ZSTDLIB_API size_t ZSTD_compressBound(size_t srcSize); /*!< maximum compressed size in worst case scenario */
111 ZSTDLIB_API unsigned ZSTD_isError(size_t code); /*!< tells if a `size_t` function result is an error code */
112 ZSTDLIB_API const char* ZSTD_getErrorName(size_t code); /*!< provides readable string from an error code */
113
114
115 /***************************************
116 * Explicit memory management
117 ***************************************/
118 /*= Compression context
119 * When compressing many messages / blocks,
120 * it is recommended to allocate a context just once, and re-use it for each successive compression operation.
121 * This avoids repeated allocations and is much easier on the system's memory.
122 * Use one context per thread for parallel execution in multi-threaded environments. */
123 typedef struct ZSTD_CCtx_s ZSTD_CCtx;
124 ZSTDLIB_API ZSTD_CCtx* ZSTD_createCCtx(void);
125 ZSTDLIB_API size_t ZSTD_freeCCtx(ZSTD_CCtx* cctx);
126
127 /*! ZSTD_compressCCtx() :
128 Same as ZSTD_compress(), requires an allocated ZSTD_CCtx (see ZSTD_createCCtx()) */
129 ZSTDLIB_API size_t ZSTD_compressCCtx(ZSTD_CCtx* ctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, int compressionLevel);
130
131 /*= Decompression context */
132 typedef struct ZSTD_DCtx_s ZSTD_DCtx;
133 ZSTDLIB_API ZSTD_DCtx* ZSTD_createDCtx(void);
134 ZSTDLIB_API size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx);
135
136 /*! ZSTD_decompressDCtx() :
137 * Same as ZSTD_decompress(), requires an allocated ZSTD_DCtx (see ZSTD_createDCtx()) */
138 ZSTDLIB_API size_t ZSTD_decompressDCtx(ZSTD_DCtx* ctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
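/* Example, as a minimal sketch : re-using one context for many messages
   (`msgs` / `msgSizes` / `nbMsgs` are hypothetical inputs) :

       ZSTD_CCtx* const cctx = ZSTD_createCCtx();
       unsigned i;
       for (i = 0; i < nbMsgs; i++) {
           size_t const cSize = ZSTD_compressCCtx(cctx, dst, dstCapacity,
                                                  msgs[i], msgSizes[i], 3);
           if (ZSTD_isError(cSize)) break;
       }
       ZSTD_freeCCtx(cctx);
*/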
139
140
141 /**************************
142 * Simple dictionary API
143 ***************************/
144 /*! ZSTD_compress_usingDict() :
145 * Compression using a predefined Dictionary (see dictBuilder/zdict.h).
146 * Note : This function loads the dictionary, resulting in significant startup delay. */
147 ZSTDLIB_API size_t ZSTD_compress_usingDict(ZSTD_CCtx* ctx,
148 void* dst, size_t dstCapacity,
149 const void* src, size_t srcSize,
150 const void* dict,size_t dictSize,
151 int compressionLevel);
152
153 /*! ZSTD_decompress_usingDict() :
154 * Decompression using a predefined Dictionary (see dictBuilder/zdict.h).
155 * Dictionary must be identical to the one used during compression.
156 * Note : This function loads the dictionary, resulting in significant startup delay */
157 ZSTDLIB_API size_t ZSTD_decompress_usingDict(ZSTD_DCtx* dctx,
158 void* dst, size_t dstCapacity,
159 const void* src, size_t srcSize,
160 const void* dict,size_t dictSize);
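/* Example, as a minimal sketch : a dictionary round trip, where `dict` may be a
   dictionary created with ZDICT_trainFromBuffer() or any raw content :

       size_t const cSize = ZSTD_compress_usingDict(cctx, dst, dstCapacity,
                                                    src, srcSize,
                                                    dict, dictSize, 3);
       size_t const rSize = ZSTD_decompress_usingDict(dctx, rBuff, rCapacity,
                                                      dst, cSize,
                                                      dict, dictSize);
*/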
161
162
163 /****************************
164 * Fast dictionary API
165 ****************************/
166 typedef struct ZSTD_CDict_s ZSTD_CDict;
167
168 /*! ZSTD_createCDict() :
169 * When compressing multiple messages / blocks with the same dictionary, it's recommended to load it just once.
170 * ZSTD_createCDict() will create a digested dictionary, ready to start future compression operations without startup delay.
171 * ZSTD_CDict can be created once and used by multiple threads concurrently, as its usage is read-only.
172 * `dict` can be released after ZSTD_CDict creation */
173 ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict(const void* dict, size_t dictSize, int compressionLevel);
174
175 /*! ZSTD_freeCDict() :
176 * Function frees memory allocated by ZSTD_createCDict() */
177 ZSTDLIB_API size_t ZSTD_freeCDict(ZSTD_CDict* CDict);
178
179 /*! ZSTD_compress_usingCDict() :
180 * Compression using a digested Dictionary.
181 * Faster startup than ZSTD_compress_usingDict(), recommended when same dictionary is used multiple times.
182 * Note that compression level is decided during dictionary creation */
183 ZSTDLIB_API size_t ZSTD_compress_usingCDict(ZSTD_CCtx* cctx,
184 void* dst, size_t dstCapacity,
185 const void* src, size_t srcSize,
186 const ZSTD_CDict* cdict);
187
188
189 typedef struct ZSTD_DDict_s ZSTD_DDict;
190
191 /*! ZSTD_createDDict() :
192 * Create a digested dictionary, ready to start decompression operations without startup delay.
193 * `dict` can be released after creation */
194 ZSTDLIB_API ZSTD_DDict* ZSTD_createDDict(const void* dict, size_t dictSize);
195
196 /*! ZSTD_freeDDict() :
197 * Function frees memory allocated with ZSTD_createDDict() */
198 ZSTDLIB_API size_t ZSTD_freeDDict(ZSTD_DDict* ddict);
199
200 /*! ZSTD_decompress_usingDDict() :
201 * Decompression using a digested Dictionary
202 * Faster startup than ZSTD_decompress_usingDict(), recommended when same dictionary is used multiple times. */
203 ZSTDLIB_API size_t ZSTD_decompress_usingDDict(ZSTD_DCtx* dctx,
204 void* dst, size_t dstCapacity,
205 const void* src, size_t srcSize,
206 const ZSTD_DDict* ddict);
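/* Example, as a minimal sketch : digest the dictionary once, then amortize it
   across many operations :

       ZSTD_CDict* const cdict = ZSTD_createCDict(dict, dictSize, 3);
       ZSTD_DDict* const ddict = ZSTD_createDDict(dict, dictSize);
       size_t const cSize = ZSTD_compress_usingCDict(cctx, dst, dstCapacity,
                                                     src, srcSize, cdict);
       size_t const rSize = ZSTD_decompress_usingDDict(dctx, rBuff, rCapacity,
                                                       dst, cSize, ddict);
       ZSTD_freeCDict(cdict); ZSTD_freeDDict(ddict);
*/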
207
208
209 /****************************
210 * Streaming
211 ****************************/
212
213 typedef struct ZSTD_inBuffer_s {
214 const void* src; /**< start of input buffer */
215 size_t size; /**< size of input buffer */
216 size_t pos; /**< position where reading stopped. Will be updated. Necessarily 0 <= pos <= size */
217 } ZSTD_inBuffer;
218
219 typedef struct ZSTD_outBuffer_s {
220 void* dst; /**< start of output buffer */
221 size_t size; /**< size of output buffer */
222 size_t pos; /**< position where writing stopped. Will be updated. Necessarily 0 <= pos <= size */
223 } ZSTD_outBuffer;
224
225
226
227 /*-***********************************************************************
228 * Streaming compression - HowTo
229 *
230 * A ZSTD_CStream object is required to track streaming operation.
231 * Use ZSTD_createCStream() and ZSTD_freeCStream() to create/release resources.
232 * ZSTD_CStream objects can be reused multiple times on consecutive compression operations.
233 * It is recommended to re-use a ZSTD_CStream in situations where many streaming operations will be performed consecutively,
234 * since it will play nicer with the system's memory, by re-using already allocated memory.
235 * Use one separate ZSTD_CStream per thread for parallel execution.
236 *
237 * Start a new compression by initializing ZSTD_CStream.
238 * Use ZSTD_initCStream() to start a new compression operation.
239 * Use ZSTD_initCStream_usingDict() for a compression which requires a dictionary.
240 *
241 * Use ZSTD_compressStream() repetitively to consume input stream.
242 * The function will automatically update both `pos` fields.
243 * Note that it may not consume the entire input, in which case `pos < size`,
244 * and it's up to the caller to present again remaining data.
245 * @return : a size hint, preferred nb of bytes to use as input for next function call
246 * (it's just a hint, to help latency a little, any other value will work fine)
247 * (note : the size hint is guaranteed to be <= ZSTD_CStreamInSize() )
248 * or an error code, which can be tested using ZSTD_isError().
249 *
250 * At any moment, it's possible to flush whatever data remains within buffer, using ZSTD_flushStream().
251 * `output->pos` will be updated.
252 * Note some content might still be left within internal buffer if `output->size` is too small.
253 * @return : nb of bytes still present within internal buffer (0 if it's empty)
254 * or an error code, which can be tested using ZSTD_isError().
255 *
256 * ZSTD_endStream() instructs to finish a frame.
257 * It will perform a flush and write frame epilogue.
258 * The epilogue is required for decoders to consider a frame completed.
259 * Similar to ZSTD_flushStream(), it may not be able to flush the full content if `output->size` is too small.
260 * In which case, call again ZSTD_endStream() to complete the flush.
261 * @return : nb of bytes still present within internal buffer (0 if it's empty)
262 * or an error code, which can be tested using ZSTD_isError().
263 *
264 * *******************************************************************/
265
266 /*===== Streaming compression functions ======*/
267 typedef struct ZSTD_CStream_s ZSTD_CStream;
268 ZSTDLIB_API ZSTD_CStream* ZSTD_createCStream(void);
269 ZSTDLIB_API size_t ZSTD_freeCStream(ZSTD_CStream* zcs);
270 ZSTDLIB_API size_t ZSTD_initCStream(ZSTD_CStream* zcs, int compressionLevel);
271 ZSTDLIB_API size_t ZSTD_compressStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output, ZSTD_inBuffer* input);
272 ZSTDLIB_API size_t ZSTD_flushStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output);
273 ZSTDLIB_API size_t ZSTD_endStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output);
274
275 ZSTDLIB_API size_t ZSTD_CStreamInSize(void); /**< recommended size for input buffer */
276 ZSTDLIB_API size_t ZSTD_CStreamOutSize(void); /**< recommended size for output buffer. Guarantee to successfully flush at least one complete compressed block in all circumstances. */
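/* Example, as a minimal sketch : compress a single in-memory buffer as one frame
   (error checks abbreviated; `dstCapacity` is assumed >= ZSTD_compressBound(srcSize)) :

       ZSTD_CStream* const zcs = ZSTD_createCStream();
       ZSTD_initCStream(zcs, 3);
       { ZSTD_inBuffer input = { src, srcSize, 0 };
         ZSTD_outBuffer output = { dst, dstCapacity, 0 };
         while (input.pos < input.size)
             ZSTD_compressStream(zcs, &output, &input);   // check with ZSTD_isError()
         ZSTD_endStream(zcs, &output);                    // 0 expected : frame fully flushed
       }
       ZSTD_freeCStream(zcs);
*/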
277
278
279
280 /*-***************************************************************************
281 * Streaming decompression - HowTo
282 *
283 * A ZSTD_DStream object is required to track streaming operations.
284 * Use ZSTD_createDStream() and ZSTD_freeDStream() to create/release resources.
285 * ZSTD_DStream objects can be re-used multiple times.
286 *
287 * Use ZSTD_initDStream() to start a new decompression operation,
288 * or ZSTD_initDStream_usingDict() if decompression requires a dictionary.
289 * @return : recommended first input size
290 *
291 * Use ZSTD_decompressStream() repetitively to consume your input.
292 * The function will update both `pos` fields.
293 * If `input.pos < input.size`, some input has not been consumed.
294 * It's up to the caller to present again remaining data.
295 * If `output.pos < output.size`, decoder has flushed everything it could.
296 * @return : 0 when a frame is completely decoded and fully flushed,
297 * an error code, which can be tested using ZSTD_isError(),
298 * any other value > 0, which means there is still some work to do to complete the frame.
299 * The return value is a suggested next input size (just a hint, to help latency).
300 * *******************************************************************************/
301
302 /*===== Streaming decompression functions =====*/
303 typedef struct ZSTD_DStream_s ZSTD_DStream;
304 ZSTDLIB_API ZSTD_DStream* ZSTD_createDStream(void);
305 ZSTDLIB_API size_t ZSTD_freeDStream(ZSTD_DStream* zds);
306 ZSTDLIB_API size_t ZSTD_initDStream(ZSTD_DStream* zds);
307 ZSTDLIB_API size_t ZSTD_decompressStream(ZSTD_DStream* zds, ZSTD_outBuffer* output, ZSTD_inBuffer* input);
308
309 ZSTDLIB_API size_t ZSTD_DStreamInSize(void); /*!< recommended size for input buffer */
310 ZSTDLIB_API size_t ZSTD_DStreamOutSize(void); /*!< recommended size for output buffer. Guarantee to successfully flush at least one complete block in all circumstances. */
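/* Example, as a minimal sketch : decompress one frame from a single in-memory
   buffer (error checks abbreviated) :

       ZSTD_DStream* const zds = ZSTD_createDStream();
       ZSTD_initDStream(zds);
       { ZSTD_inBuffer input = { cSrc, cSrcSize, 0 };
         ZSTD_outBuffer output = { dst, dstCapacity, 0 };
         size_t ret = 1;
         while (ret != 0) {                      // 0 == frame decoded and flushed
             ret = ZSTD_decompressStream(zds, &output, &input);
             if (ZSTD_isError(ret)) break;       // see ZSTD_getErrorName()
         }
       }
       ZSTD_freeDStream(zds);
*/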
311
312
313
314 #ifdef ZSTD_STATIC_LINKING_ONLY
315
316 /****************************************************************************************
317 * START OF ADVANCED AND EXPERIMENTAL FUNCTIONS
318 * The definitions in this section are considered experimental.
319 * They should never be used with a dynamic library, as they may change in the future.
320 * They are provided for advanced usages.
321 * Use them only in association with static linking.
322 * ***************************************************************************************/
323
324 /* --- Constants ---*/
325 #define ZSTD_MAGICNUMBER 0xFD2FB528 /* v0.8 */
326 #define ZSTD_MAGIC_SKIPPABLE_START 0x184D2A50U
327
328 #define ZSTD_WINDOWLOG_MAX_32 25
329 #define ZSTD_WINDOWLOG_MAX_64 27
330 #define ZSTD_WINDOWLOG_MAX ((U32)(MEM_32bits() ? ZSTD_WINDOWLOG_MAX_32 : ZSTD_WINDOWLOG_MAX_64))
331 #define ZSTD_WINDOWLOG_MIN 10
332 #define ZSTD_HASHLOG_MAX ZSTD_WINDOWLOG_MAX
333 #define ZSTD_HASHLOG_MIN 6
334 #define ZSTD_CHAINLOG_MAX (ZSTD_WINDOWLOG_MAX+1)
335 #define ZSTD_CHAINLOG_MIN ZSTD_HASHLOG_MIN
336 #define ZSTD_HASHLOG3_MAX 17
337 #define ZSTD_SEARCHLOG_MAX (ZSTD_WINDOWLOG_MAX-1)
338 #define ZSTD_SEARCHLOG_MIN 1
339 #define ZSTD_SEARCHLENGTH_MAX 7 /* only for ZSTD_fast, other strategies are limited to 6 */
340 #define ZSTD_SEARCHLENGTH_MIN 3 /* only for ZSTD_btopt, other strategies are limited to 4 */
341 #define ZSTD_TARGETLENGTH_MIN 4
342 #define ZSTD_TARGETLENGTH_MAX 999
343
344 #define ZSTD_FRAMEHEADERSIZE_MAX 18 /* for static allocation */
345 static const size_t ZSTD_frameHeaderSize_prefix = 5;
346 static const size_t ZSTD_frameHeaderSize_min = 6;
347 static const size_t ZSTD_frameHeaderSize_max = ZSTD_FRAMEHEADERSIZE_MAX;
348 static const size_t ZSTD_skippableHeaderSize = 8; /* magic number + skippable frame length */
349
350
351 /*--- Advanced types ---*/
352 typedef enum { ZSTD_fast, ZSTD_dfast, ZSTD_greedy, ZSTD_lazy, ZSTD_lazy2, ZSTD_btlazy2, ZSTD_btopt, ZSTD_btopt2 } ZSTD_strategy; /* from faster to stronger */
353
354 typedef struct {
355 unsigned windowLog; /**< largest match distance : larger == more compression, more memory needed during decompression */
356 unsigned chainLog; /**< fully searched segment : larger == more compression, slower, more memory (useless for fast) */
357 unsigned hashLog; /**< dispatch table : larger == faster, more memory */
358 unsigned searchLog; /**< nb of searches : larger == more compression, slower */
359 unsigned searchLength; /**< match length searched : larger == faster decompression, sometimes less compression */
360 unsigned targetLength; /**< acceptable match size for optimal parser (only) : larger == more compression, slower */
361 ZSTD_strategy strategy;
362 } ZSTD_compressionParameters;
363
364 typedef struct {
365 unsigned contentSizeFlag; /**< 1: content size will be in frame header (if known). */
366 unsigned checksumFlag; /**< 1: will generate a 32-bit checksum at end of frame, to be used for error detection by decompressor */
367 unsigned noDictIDFlag; /**< 1: no dict ID will be saved into frame header (if dictionary compression) */
368 } ZSTD_frameParameters;
369
370 typedef struct {
371 ZSTD_compressionParameters cParams;
372 ZSTD_frameParameters fParams;
373 } ZSTD_parameters;
374
375 /*= Custom memory allocation functions */
376 typedef void* (*ZSTD_allocFunction) (void* opaque, size_t size);
377 typedef void (*ZSTD_freeFunction) (void* opaque, void* address);
378 typedef struct { ZSTD_allocFunction customAlloc; ZSTD_freeFunction customFree; void* opaque; } ZSTD_customMem;
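/* Example, as a minimal sketch : route zstd's allocations through custom
   functions (the wrappers below are hypothetical) :

       static void* myAlloc(void* opaque, size_t size) { (void)opaque; return malloc(size); }
       static void myFree(void* opaque, void* address) { (void)opaque; free(address); }

       static const ZSTD_customMem myMem = { myAlloc, myFree, NULL };
       ZSTD_CCtx* const cctx = ZSTD_createCCtx_advanced(myMem);
*/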
379
380
381 /***************************************
382 * Advanced compression functions
383 ***************************************/
384 /*! ZSTD_estimateCCtxSize() :
385 * Gives the amount of memory allocated for a ZSTD_CCtx given a set of compression parameters.
386 * (a cParams set can be obtained using ZSTD_getCParams()) */
387 ZSTDLIB_API size_t ZSTD_estimateCCtxSize(ZSTD_compressionParameters cParams);
388
389 /*! ZSTD_createCCtx_advanced() :
390 * Create a ZSTD compression context using external alloc and free functions */
391 ZSTDLIB_API ZSTD_CCtx* ZSTD_createCCtx_advanced(ZSTD_customMem customMem);
392
393 /*! ZSTD_sizeof_CCtx() :
394 * Gives the amount of memory used by a given ZSTD_CCtx */
395 ZSTDLIB_API size_t ZSTD_sizeof_CCtx(const ZSTD_CCtx* cctx);
396
397 /*! ZSTD_createCDict_advanced() :
398 * Create a ZSTD_CDict using external alloc and free, and customized compression parameters */
399 ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict_advanced(const void* dict, size_t dictSize,
400 ZSTD_parameters params, ZSTD_customMem customMem);
401
402 /*! ZSTD_sizeof_CDict() :
403 * Gives the amount of memory used by a given ZSTD_CDict */
404 ZSTDLIB_API size_t ZSTD_sizeof_CDict(const ZSTD_CDict* cdict);
405
406 /*! ZSTD_getParams() :
407 * same as ZSTD_getCParams(), but @return a full `ZSTD_parameters` object instead of a `ZSTD_compressionParameters`.
408 * All fields of `ZSTD_frameParameters` are set to default (0) */
409 ZSTDLIB_API ZSTD_parameters ZSTD_getParams(int compressionLevel, unsigned long long srcSize, size_t dictSize);
410
411 /*! ZSTD_getCParams() :
412 * @return ZSTD_compressionParameters structure for a selected compression level and srcSize.
413 * `srcSize` value is optional, select 0 if not known */
414 ZSTDLIB_API ZSTD_compressionParameters ZSTD_getCParams(int compressionLevel, unsigned long long srcSize, size_t dictSize);
415
416 /*! ZSTD_checkCParams() :
417 * Ensure param values remain within authorized range */
418 ZSTDLIB_API size_t ZSTD_checkCParams(ZSTD_compressionParameters params);
419
420 /*! ZSTD_adjustCParams() :
421 * optimize params for a given `srcSize` and `dictSize`.
422 * both values are optional, select `0` if unknown. */
423 ZSTDLIB_API ZSTD_compressionParameters ZSTD_adjustCParams(ZSTD_compressionParameters cPar, unsigned long long srcSize, size_t dictSize);
424
425 /*! ZSTD_compress_advanced() :
426 * Same as ZSTD_compress_usingDict(), with fine-tune control of each compression parameter */
427 ZSTDLIB_API size_t ZSTD_compress_advanced (ZSTD_CCtx* ctx,
428 void* dst, size_t dstCapacity,
429 const void* src, size_t srcSize,
430 const void* dict,size_t dictSize,
431 ZSTD_parameters params);
432
433
434 /*--- Advanced decompression functions ---*/
435
436 /*! ZSTD_estimateDCtxSize() :
437 * Gives the potential amount of memory allocated to create a ZSTD_DCtx */
438 ZSTDLIB_API size_t ZSTD_estimateDCtxSize(void);
439
440 /*! ZSTD_createDCtx_advanced() :
441 * Create a ZSTD decompression context using external alloc and free functions */
442 ZSTDLIB_API ZSTD_DCtx* ZSTD_createDCtx_advanced(ZSTD_customMem customMem);
443
444 /*! ZSTD_sizeof_DCtx() :
445 * Gives the amount of memory used by a given ZSTD_DCtx */
446 ZSTDLIB_API size_t ZSTD_sizeof_DCtx(const ZSTD_DCtx* dctx);
447
448 /*! ZSTD_sizeof_DDict() :
449 * Gives the amount of memory used by a given ZSTD_DDict */
450 ZSTDLIB_API size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict);
451

/********************************************************************
*  Advanced streaming functions
********************************************************************/

/*===== Advanced Streaming compression functions =====*/
ZSTDLIB_API ZSTD_CStream* ZSTD_createCStream_advanced(ZSTD_customMem customMem);
ZSTDLIB_API size_t ZSTD_initCStream_usingDict(ZSTD_CStream* zcs, const void* dict, size_t dictSize, int compressionLevel);
ZSTDLIB_API size_t ZSTD_initCStream_advanced(ZSTD_CStream* zcs, const void* dict, size_t dictSize,
                                             ZSTD_parameters params, unsigned long long pledgedSrcSize);  /**< pledgedSrcSize is optional; zero means unknown */
ZSTDLIB_API size_t ZSTD_initCStream_usingCDict(ZSTD_CStream* zcs, const ZSTD_CDict* cdict);  /**< note : cdict is just referenced, and must outlive the compression session */
ZSTDLIB_API size_t ZSTD_resetCStream(ZSTD_CStream* zcs, unsigned long long pledgedSrcSize);  /**< re-use compression parameters from previous init; skips dictionary loading stage; zcs must have been initialized at least once before */
ZSTDLIB_API size_t ZSTD_sizeof_CStream(const ZSTD_CStream* zcs);
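
/* Illustrative sketch (not part of the zstd API) : initialize a CStream with
 * explicit parameters and a pledged source size, then reset it for a second
 * frame with the same parameters. The streaming loop itself is elided. */
static size_t example_two_frames(ZSTD_CStream* zcs, unsigned long long pledgedSrcSize)
{
    ZSTD_parameters const params = ZSTD_getParams(5, pledgedSrcSize, 0);
    size_t const initErr = ZSTD_initCStream_advanced(zcs, NULL, 0, params, pledgedSrcSize);
    if (ZSTD_isError(initErr)) return initErr;
    /* ... stream the first frame with ZSTD_compressStream() / ZSTD_endStream() ... */
    return ZSTD_resetCStream(zcs, pledgedSrcSize);   /* re-uses params, skips dictionary loading */
}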


/*===== Advanced Streaming decompression functions =====*/
typedef enum { ZSTDdsp_maxWindowSize } ZSTD_DStreamParameter_e;
ZSTDLIB_API ZSTD_DStream* ZSTD_createDStream_advanced(ZSTD_customMem customMem);
ZSTDLIB_API size_t ZSTD_initDStream_usingDict(ZSTD_DStream* zds, const void* dict, size_t dictSize);
ZSTDLIB_API size_t ZSTD_setDStreamParameter(ZSTD_DStream* zds, ZSTD_DStreamParameter_e paramType, unsigned paramValue);
ZSTDLIB_API size_t ZSTD_initDStream_usingDDict(ZSTD_DStream* zds, const ZSTD_DDict* ddict);  /**< note : ddict is just referenced, and must outlive the decompression session */
ZSTDLIB_API size_t ZSTD_resetDStream(ZSTD_DStream* zds);  /**< re-use decompression parameters from previous init; saves dictionary loading */
ZSTDLIB_API size_t ZSTD_sizeof_DStream(const ZSTD_DStream* zds);
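
/* Illustrative sketch (not part of the zstd API) : cap the window size a
 * DStream will accept before decompressing untrusted input. The 8 MB value
 * follows the interoperability recommendation later in this header; calling
 * ZSTD_setDStreamParameter() before the init is an assumption of this sketch. */
static size_t example_limited_dstream(ZSTD_DStream* zds)
{
    size_t const limitErr = ZSTD_setDStreamParameter(zds, ZSTDdsp_maxWindowSize, 8U << 20);
    if (ZSTD_isError(limitErr)) return limitErr;
    return ZSTD_initDStream_usingDict(zds, NULL, 0);   /* no dictionary */
}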


/*********************************************************************
*  Buffer-less and synchronous inner streaming functions
*
*  This is an advanced API, giving full control over buffer management, for users who need direct control over memory.
*  But it's also a complex one, with many restrictions (documented below).
*  Prefer the normal streaming API for an easier experience.
********************************************************************* */

/**
  Buffer-less streaming compression (synchronous mode)

  A ZSTD_CCtx object is required to track streaming operations.
  Use ZSTD_createCCtx() / ZSTD_freeCCtx() to manage resources.
  A ZSTD_CCtx object can be re-used multiple times within successive compression operations.

  Start by initializing a context.
  Use ZSTD_compressBegin(), or ZSTD_compressBegin_usingDict() for dictionary compression,
  or ZSTD_compressBegin_advanced() for finer parameter control.
  It's also possible to duplicate a reference context which has already been initialized, using ZSTD_copyCCtx().

  Then, consume your input using ZSTD_compressContinue().
  There are some important considerations to keep in mind when using this advanced function :
  - ZSTD_compressContinue() has no internal buffer. It uses externally provided buffers only.
  - The interface is synchronous : input is consumed entirely and produces one or more compressed blocks.
  - The caller must ensure there is enough space in `dst` to store the compressed data under the worst-case scenario.
    Worst-case evaluation is provided by ZSTD_compressBound().
    ZSTD_compressContinue() doesn't guarantee recovery after a failed compression.
  - ZSTD_compressContinue() presumes prior input ***is still accessible and unmodified*** (up to the maximum distance size, see WindowLog).
    It remembers all previous contiguous blocks, plus one separate memory segment (which can itself consist of multiple contiguous blocks).
  - ZSTD_compressContinue() detects that prior input has been overwritten when the `src` buffer overlaps,
    in which case it will "discard" the relevant memory section from its history.

  Finish a frame with ZSTD_compressEnd(), which will write the last block(s) and optional checksum.
  It's possible to use a NULL,0 src content, in which case it will write a final empty block to end the frame.
  Without the last block mark, frames are considered unfinished (broken) by decoders.

  You can then re-use the `ZSTD_CCtx` (via ZSTD_compressBegin()) to compress a new frame.
  A minimal sketch of this flow follows the declarations below.
*/

/*===== Buffer-less streaming compression functions =====*/
ZSTDLIB_API size_t ZSTD_compressBegin(ZSTD_CCtx* cctx, int compressionLevel);
ZSTDLIB_API size_t ZSTD_compressBegin_usingDict(ZSTD_CCtx* cctx, const void* dict, size_t dictSize, int compressionLevel);
ZSTDLIB_API size_t ZSTD_compressBegin_advanced(ZSTD_CCtx* cctx, const void* dict, size_t dictSize, ZSTD_parameters params, unsigned long long pledgedSrcSize);
ZSTDLIB_API size_t ZSTD_copyCCtx(ZSTD_CCtx* cctx, const ZSTD_CCtx* preparedCCtx, unsigned long long pledgedSrcSize);
ZSTDLIB_API size_t ZSTD_compressContinue(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
ZSTDLIB_API size_t ZSTD_compressEnd(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
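
/* Illustrative sketch (not part of the zstd API) of the buffer-less flow
 * described above : begin, consume the input in a single chunk, end the frame.
 * Real callers would loop over input segments; `dstCapacity` is assumed to be
 * at least ZSTD_compressBound(srcSize). */
static size_t example_bufferless_compress(ZSTD_CCtx* cctx,
                                          void* dst, size_t dstCapacity,
                                          const void* src, size_t srcSize)
{
    size_t cSize, endSize;
    size_t const initErr = ZSTD_compressBegin(cctx, 3 /* arbitrary level */);
    if (ZSTD_isError(initErr)) return initErr;
    cSize = ZSTD_compressContinue(cctx, dst, dstCapacity, src, srcSize);
    if (ZSTD_isError(cSize)) return cSize;
    /* NULL,0 input : just write the last block(s) and optional checksum */
    endSize = ZSTD_compressEnd(cctx, (char*)dst + cSize, dstCapacity - cSize, NULL, 0);
    if (ZSTD_isError(endSize)) return endSize;
    return cSize + endSize;
}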



/*-
  Buffer-less streaming decompression (synchronous mode)

  A ZSTD_DCtx object is required to track streaming operations.
  Use ZSTD_createDCtx() / ZSTD_freeDCtx() to manage it.
  A ZSTD_DCtx object can be re-used multiple times.

  The first typical operation is to retrieve frame parameters, using ZSTD_getFrameParams().
  It fills a ZSTD_frameParams structure which provides important information to correctly decode the frame,
  such as the minimum rolling buffer size to allocate to decompress data (`windowSize`),
  and the dictionary ID used.
  (Note : content size is optional; it may not be present. 0 means : content size unknown.)
  Note that these values could be wrong, either because of data malformation, or because an attacker is deliberately spoofing false information.
  As a consequence, check that values remain within a valid application range, especially `windowSize`, before allocation.
  Each application can set its own limit, depending on local restrictions. For extended interoperability, it is recommended to support at least 8 MB.
  Frame parameters are extracted from the beginning of the compressed frame.
  The data fragment must be large enough to ensure successful decoding, typically `ZSTD_frameHeaderSize_max` bytes.
  @result : 0 : successful decoding, the `ZSTD_frameParams` structure is correctly filled.
           >0 : `srcSize` is too small, please provide at least @result bytes on next attempt.
            or an error code, which can be tested using ZSTD_isError().

  Start decompression with ZSTD_decompressBegin() or ZSTD_decompressBegin_usingDict().
  Alternatively, you can copy a prepared context, using ZSTD_copyDCtx().

  Then use ZSTD_nextSrcSizeToDecompress() and ZSTD_decompressContinue() alternately.
  ZSTD_nextSrcSizeToDecompress() tells how many bytes to provide as 'srcSize' to ZSTD_decompressContinue().
  ZSTD_decompressContinue() requires this _exact_ amount of bytes, or it will fail.

  @result of ZSTD_decompressContinue() is the number of bytes regenerated within 'dst' (necessarily <= dstCapacity).
  It can be zero, which is not an error; it just means ZSTD_decompressContinue() has decoded some metadata item.
  It can also be an error code, which can be tested with ZSTD_isError().

  ZSTD_decompressContinue() needs previous data blocks during decompression, up to `windowSize`.
  They should preferably be located contiguously, prior to the current block.
  Alternatively, a round buffer of sufficient size is also possible. Sufficient size is determined by frame parameters.
  ZSTD_decompressContinue() is very sensitive to contiguity :
  if 2 blocks don't follow each other, make sure that either the compressor breaks contiguity at the same place,
  or that the previous contiguous segment is large enough to properly handle the maximum back-reference distance.

  A frame is fully decoded when ZSTD_nextSrcSizeToDecompress() returns zero.
  The context can then be reset to start a new decompression.
  A minimal sketch of this loop follows the declarations below.

  Note : it's possible to know whether the next input to present is a header or a block, using ZSTD_nextInputType().
  This information is not required to properly decode a frame.

  == Special case : skippable frames ==

  Skippable frames allow the integration of user-defined data into a flow of concatenated frames.
  Skippable frames will be ignored (skipped) by a decompressor. The format of skippable frames is as follows :
  a) Skippable frame ID - 4 Bytes, little-endian format, any value from 0x184D2A50 to 0x184D2A5F
  b) Frame Size - 4 Bytes, little-endian format, unsigned 32-bits
  c) Frame Content - any content (User Data) of length equal to Frame Size
  For skippable frames ZSTD_decompressContinue() always returns 0.
  For skippable frames ZSTD_getFrameParams() returns fparamsPtr->windowSize==0, which means the frame is skippable.
  It also returns the Frame Size as fparamsPtr->frameContentSize.
  A sketch of writing such a frame follows the ZSTD_frameParams declaration below.
*/

typedef struct {
    unsigned long long frameContentSize;
    unsigned windowSize;
    unsigned dictID;
    unsigned checksumFlag;
} ZSTD_frameParams;
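
/* Illustrative sketch (not part of the zstd API) : emit a skippable frame
 * following the layout described above (4-byte little-endian ID, 4-byte
 * little-endian content size, then the user data). Returning 0 on failure is
 * a hypothetical convention of this sketch. */
static size_t example_write_skippable_frame(void* dst, size_t dstCapacity,
                                            const void* userData, size_t userSize)
{
    unsigned char* const op = (unsigned char*)dst;
    unsigned const magic = 0x184D2A50;   /* any value in 0x184D2A50..0x184D2A5F */
    size_t i;
    if (userSize > 0xFFFFFFFFU) return 0;       /* Frame Size field is 32-bit */
    if (dstCapacity < 8 + userSize) return 0;   /* not enough room */
    for (i = 0; i < 4; i++) op[i]   = (unsigned char)(magic >> (8*i));
    for (i = 0; i < 4; i++) op[4+i] = (unsigned char)((unsigned)userSize >> (8*i));
    for (i = 0; i < userSize; i++) op[8+i] = ((const unsigned char*)userData)[i];
    return 8 + userSize;
}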

/*===== Buffer-less streaming decompression functions =====*/
ZSTDLIB_API size_t ZSTD_getFrameParams(ZSTD_frameParams* fparamsPtr, const void* src, size_t srcSize);  /**< doesn't consume input; see details above */
ZSTDLIB_API size_t ZSTD_decompressBegin(ZSTD_DCtx* dctx);
ZSTDLIB_API size_t ZSTD_decompressBegin_usingDict(ZSTD_DCtx* dctx, const void* dict, size_t dictSize);
ZSTDLIB_API void   ZSTD_copyDCtx(ZSTD_DCtx* dctx, const ZSTD_DCtx* preparedDCtx);
ZSTDLIB_API size_t ZSTD_nextSrcSizeToDecompress(ZSTD_DCtx* dctx);
ZSTDLIB_API size_t ZSTD_decompressContinue(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
typedef enum { ZSTDnit_frameHeader, ZSTDnit_blockHeader, ZSTDnit_block, ZSTDnit_lastBlock, ZSTDnit_checksum, ZSTDnit_skippableFrame } ZSTD_nextInputType_e;
ZSTDLIB_API ZSTD_nextInputType_e ZSTD_nextInputType(ZSTD_DCtx* dctx);
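
/* Illustrative sketch (not part of the zstd API) of the decompression loop
 * described above. Assumes `src` holds one complete frame and `dst` can hold
 * the whole regenerated content; a real application must validate
 * fParams.windowSize against its own limit before allocating. The (size_t)-1
 * return on truncated input is a hypothetical convention of this sketch. */
static size_t example_bufferless_decompress(ZSTD_DCtx* dctx,
                                            void* dst, size_t dstCapacity,
                                            const void* src, size_t srcSize)
{
    const char* ip = (const char*)src;
    const char* const iend = ip + srcSize;
    char* op = (char*)dst;
    ZSTD_frameParams fParams;
    size_t const hResult = ZSTD_getFrameParams(&fParams, src, srcSize);
    if (hResult != 0) return hResult;   /* >0 : header needs more input; or an error code */
    /* check fParams.windowSize against an application-defined limit here */
    {   size_t const initErr = ZSTD_decompressBegin(dctx);
        if (ZSTD_isError(initErr)) return initErr;
    }
    while (1) {
        size_t const toRead = ZSTD_nextSrcSizeToDecompress(dctx);
        size_t decoded;
        if (toRead == 0) break;                              /* frame fully decoded */
        if (toRead > (size_t)(iend - ip)) return (size_t)-1; /* truncated input */
        decoded = ZSTD_decompressContinue(dctx, op, dstCapacity - (size_t)(op - (char*)dst), ip, toRead);
        if (ZSTD_isError(decoded)) return decoded;
        ip += toRead;
        op += decoded;   /* may be 0 when a metadata item was decoded */
    }
    return (size_t)(op - (char*)dst);
}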

/**
  Block functions

  Block functions produce and decode raw zstd blocks, without frame metadata.
  Frame metadata cost is typically ~18 bytes, which can be non-negligible for very small blocks (< 100 bytes).
  The user must keep track of the information required to regenerate the data, such as compressed and content sizes.

  A few rules to respect :
  - Compressing and decompressing require a context structure
    + Use ZSTD_createCCtx() and ZSTD_createDCtx()
  - It is necessary to initialize the context before starting
    + compression : ZSTD_compressBegin()
    + decompression : ZSTD_decompressBegin()
    + variants _usingDict() are also allowed
    + copyCCtx() and copyDCtx() work too
  - Block size is limited, it must be <= ZSTD_getBlockSizeMax()
    + If you need to compress more, cut data into multiple blocks
    + Consider using the regular ZSTD_compress() instead, as frame metadata costs become negligible when source size is large.
  - When a block is judged not compressible enough, the result of ZSTD_compressBlock() will be zero,
    in which case nothing is written into `dst`.
    + The user must test for this outcome and handle the uncompressed data directly
    + ZSTD_decompressBlock() doesn't accept uncompressed data as input !!!
    + In case of multiple successive blocks, the decoder must be informed of the uncompressed block's existence to follow proper history.
      Use ZSTD_insertBlock() in such a case. A sketch covering this case follows the declarations below.
*/

#define ZSTD_BLOCKSIZE_ABSOLUTEMAX (128 * 1024)   /* define, for static allocation */
/*===== Raw zstd block functions =====*/
ZSTDLIB_API size_t ZSTD_getBlockSizeMax(ZSTD_CCtx* cctx);
ZSTDLIB_API size_t ZSTD_compressBlock  (ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
ZSTDLIB_API size_t ZSTD_decompressBlock(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
ZSTDLIB_API size_t ZSTD_insertBlock(ZSTD_DCtx* dctx, const void* blockStart, size_t blockSize);  /**< insert block into `dctx` history. Useful for uncompressed blocks */
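
/* Illustrative sketch (not part of the zstd API) : compress one block and
 * handle the non-compressible case with ZSTD_insertBlock(), as the rules above
 * require. Assumes `cctx` and `dctx` were already initialized with
 * ZSTD_compressBegin() / ZSTD_decompressBegin(). */
static size_t example_block_roundtrip(ZSTD_CCtx* cctx, ZSTD_DCtx* dctx,
                                      void* cBuf, size_t cBufCapacity,
                                      void* out, size_t outCapacity,
                                      const void* src, size_t srcSize)
{
    size_t cSize;
    if (srcSize > ZSTD_getBlockSizeMax(cctx)) return 0;   /* caller must split input */
    cSize = ZSTD_compressBlock(cctx, cBuf, cBufCapacity, src, srcSize);
    if (ZSTD_isError(cSize)) return cSize;
    if (cSize == 0) {
        /* block judged not compressible : nothing was written into cBuf.
         * The caller stores `src` raw; keep decoder history in sync : */
        return ZSTD_insertBlock(dctx, src, srcSize);
    }
    return ZSTD_decompressBlock(dctx, out, outCapacity, cBuf, cSize);
}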


#endif   /* ZSTD_STATIC_LINKING_ONLY */

#if defined (__cplusplus)
}
#endif

#endif   /* ZSTD_H_235446 */