upstream/mercurial-mirror Files · contrib/python-zstandard/zstd/dictBuilder/zdict.h


zstd: vendor zstd 1.1.1

zstd is a new compression format and it is awesome, yielding higher compression ratios and significantly faster compression and decompression operations compared to zlib (our current compression engine of choice) across the board.

We want zstd to be a first-class citizen in Mercurial and to eventually be the preferred compression format for various operations. This patch starts the formal process of supporting zstd by vendoring a copy of zstd.

Why do we need to vendor zstd? Good question.

First, zstd is relatively new and not widely available yet. If we didn't vendor zstd or distribute it with Mercurial, most users likely wouldn't have zstd installed or even available to install. What good is a feature if you can't use it? Vendoring and distributing the zstd sources gives us the highest likelihood that zstd will be available to Mercurial installs.

Second, the Python bindings to zstd (which will be vendored in a separate changeset) make use of zstd APIs that are only available via static linking. One reason they are only available via static linking is that they are unstable and could change at any time. While it might be possible for the Python bindings to attempt to talk to different versions of the zstd C library, the safest thing to do is link against a specific, known-working version of zstd. This is why the Python zstd bindings themselves vendor zstd and why we must as well. This also explains why the added files are in a "python-zstandard" directory.

The added files are from the 1.1.1 release of zstd (Git commit from https://github.com/facebook/zstd) and are added without modifications. Not all files from the zstd "distribution" have been added. Notably missing are files to support interacting with "legacy," pre-1.0 versions of zstd. The decision of which files to include is made by the upstream python-zstandard project (which I'm the author of). The files in this commit are a snapshot of the files from the 0.5.0 release of that project, Git commit from https://github.com/indygreg/python-zstandard.

Gregory Szorc

File last commit: r30434:2e484bde default

zdict.h | 111 lines | 4.8 KiB | text/x-c

      /**
       * Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
       * All rights reserved.
       *
       * This source code is licensed under the BSD-style license found in the
       * LICENSE file in the root directory of this source tree. An additional grant
       * of patent rights can be found in the PATENTS file in the same directory.
       */

      #ifndef DICTBUILDER_H_001
      #define DICTBUILDER_H_001

      #if defined (__cplusplus)
      extern "C" {
      #endif


      /*======  Dependencies  ======*/
      #include <stddef.h>  /* size_t */


      /*======  Export for Windows  ======*/
      /*!
      *  ZSTD_DLL_EXPORT :
      *  Enable exporting of functions when building a Windows DLL
      */
      #if defined(_WIN32) && defined(ZSTD_DLL_EXPORT) && (ZSTD_DLL_EXPORT==1)
      #  define ZDICTLIB_API __declspec(dllexport)
      #else
      #  define ZDICTLIB_API
      #endif


      /*! ZDICT_trainFromBuffer() :
          Train a dictionary from an array of samples.
          Samples must be stored concatenated in a single flat buffer `samplesBuffer`,
          supplied with an array of sizes `samplesSizes`, providing the size of each sample, in order.
          The resulting dictionary will be saved into `dictBuffer`.
          @return : size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`)
                    or an error code, which can be tested with ZDICT_isError().
          Tips : In general, a reasonable dictionary has a size of ~ 100 KB.
                 It's obviously possible to target smaller or larger ones, just by specifying different `dictBufferCapacity`.
                 In general, it's recommended to provide a few thousand samples, but this can vary a lot.
                 It's recommended that the total size of all samples be about 100 times the target size of the dictionary.
      */
      ZDICTLIB_API size_t ZDICT_trainFromBuffer(void* dictBuffer, size_t dictBufferCapacity,
                             const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples);


      /*======   Helper functions   ======*/
      ZDICTLIB_API unsigned ZDICT_getDictID(const void* dictBuffer, size_t dictSize);  /**< extracts dictID; @return zero if error (not a valid dictionary) */
      ZDICTLIB_API unsigned ZDICT_isError(size_t errorCode);
      ZDICTLIB_API const char* ZDICT_getErrorName(size_t errorCode);



      #ifdef ZDICT_STATIC_LINKING_ONLY

      /* ====================================================================================
       * The definitions in this section are considered experimental.
       * They should never be used with a dynamic library, as they may change in the future.
       * They are provided for advanced usages.
       * Use them only in association with static linking.
       * ==================================================================================== */

      typedef struct {
          unsigned selectivityLevel;   /* 0 means default; larger => select more => larger dictionary */
          int      compressionLevel;   /* 0 means default; target a specific zstd compression level */
          unsigned notificationLevel;  /* Write to stderr; 0 = none (default); 1 = errors; 2 = progression; 3 = details; 4 = debug; */
          unsigned dictID;             /* 0 means auto mode (32-bits random value); other : force dictID value */
          unsigned reserved[2];        /* reserved space for future parameters */
      } ZDICT_params_t;


      /*! ZDICT_trainFromBuffer_advanced() :
          Same as ZDICT_trainFromBuffer() with control over more parameters.
          `parameters` is optional and can be provided with values set to 0 to mean "default".
          @return : size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`),
                    or an error code, which can be tested by ZDICT_isError().
          note : ZDICT_trainFromBuffer_advanced() will send notifications into stderr if instructed to, using notificationLevel>0.
      */
      size_t ZDICT_trainFromBuffer_advanced(void* dictBuffer, size_t dictBufferCapacity,
                                      const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
                                      ZDICT_params_t parameters);


      /*! ZDICT_addEntropyTablesFromBuffer() :

          Given a content-only dictionary (built using any 3rd party algorithm),
          add entropy tables computed from an array of samples.
          Samples must be stored concatenated in a flat buffer `samplesBuffer`,
          supplied with an array of sizes `samplesSizes`, providing the size of each sample in order.

          The input dictionary content must be stored *at the end* of `dictBuffer`.
          Its size is `dictContentSize`.
          The resulting dictionary with added entropy tables will be *written back to `dictBuffer`*,
          starting from its beginning.
          @return : size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`).
      */
      size_t ZDICT_addEntropyTablesFromBuffer(void* dictBuffer, size_t dictContentSize, size_t dictBufferCapacity,
                                              const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples);



      #endif   /* ZDICT_STATIC_LINKING_ONLY */

      #if defined (__cplusplus)
      }
      #endif

      #endif   /* DICTBUILDER_H_001 */
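
The comment on ZDICT_trainFromBuffer() asks for samples concatenated in one flat buffer, plus an array giving each sample's size in order. The following is a minimal sketch of building that layout from separately stored samples. The names `packed_samples_t`, `pack_samples` and `free_packed_samples` are hypothetical helpers invented for this illustration and are not part of zdict.h. The sketch only prepares the inputs and does not call into zstd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type (not part of zdict.h): samples packed in the
 * layout that the ZDICT_trainFromBuffer() comment describes. */
typedef struct {
    void*    samplesBuffer;  /* all samples, concatenated back to back */
    size_t*  samplesSizes;   /* samplesSizes[i] = size of sample i, in order */
    unsigned nbSamples;      /* number of entries in samplesSizes */
} packed_samples_t;

/* Copy `n` separately stored samples into one flat buffer and record their
 * sizes. Returns 0 on success, -1 if an allocation fails. */
int pack_samples(const void* const* samples, const size_t* sizes,
                 unsigned n, packed_samples_t* out)
{
    size_t total = 0;
    char* dst;
    unsigned i;

    assert(out != NULL);
    for (i = 0; i < n; i++) total += sizes[i];

    /* Allocate at least one byte/entry so malloc(0) is never relied upon. */
    out->samplesBuffer = malloc(total ? total : 1);
    out->samplesSizes  = (size_t*)malloc((n ? n : 1) * sizeof(size_t));
    out->nbSamples     = 0;
    if (out->samplesBuffer == NULL || out->samplesSizes == NULL) {
        free(out->samplesBuffer);
        free(out->samplesSizes);
        out->samplesBuffer = NULL;
        out->samplesSizes  = NULL;
        return -1;
    }

    /* Append each sample directly after the previous one. */
    dst = (char*)out->samplesBuffer;
    for (i = 0; i < n; i++) {
        if (sizes[i] > 0) memcpy(dst, samples[i], sizes[i]);
        dst += sizes[i];
        out->samplesSizes[i] = sizes[i];
    }
    out->nbSamples = n;
    return 0;
}

/* Release the buffers allocated by pack_samples(). */
void free_packed_samples(packed_samples_t* p)
{
    free(p->samplesBuffer);
    free(p->samplesSizes);
    p->samplesBuffer = NULL;
    p->samplesSizes  = NULL;
    p->nbSamples     = 0;
}
```

With zdict.h included and zstd linked, the packed fields would map onto the last three arguments of the declaration above: ZDICT_trainFromBuffer(dictBuffer, dictBufferCapacity, p.samplesBuffer, p.samplesSizes, p.nbSamples), with the return value checked through ZDICT_isError().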
