upstream/mercurial-mirror Files · contrib/python-zstandard/tests/test_compressor_fuzzing.py

revlog: skeleton support for version 2 revlogs

There are a number of improvements we want to make to revlogs that will require a new version - version 2. It is unclear what the full set of improvements will be or when we'll be done with them. What I do know is that the process will likely take longer than a single release, will require input from various stakeholders to evaluate changes, and will have many contentious debates and bikeshedding.

It is unrealistic to develop revlog version 2 up front: there are just too many uncertainties that we won't know until things are implemented and experiments are run. Some changes will also be invasive and prone to bit rot, so sitting on dozens of patches is not practical.

This commit introduces skeleton support for version 2 revlogs in a way that is flexible and not bound by backwards compatibility concerns.

An experimental repo requirement for denoting revlog v2 has been added. The requirement string has a sub-version component to it. This will allow us to declare multiple requirements in the course of developing revlog v2. Whenever we change the in-development revlog v2 format, we can tweak the string, creating a new requirement and locking out old clients. This will allow us to make as many backwards incompatible changes and experiments to revlog v2 as we want. In other words, we can land code and make meaningful progress towards revlog v2 while still maintaining extreme format flexibility up until the point we freeze the format and remove the experimental labels.

To enable the new repo requirement, you must supply an experimental and undocumented config option. But not just any boolean flag will do: you need to explicitly use a value that no sane person should ever type. This is an additional guard against enabling revlog v2 on an installation it shouldn't be enabled on. The specific scenario I'm trying to prevent is say a user with a 4.4 client with a frozen format enabling the option but then downgrading to 4.3 and accidentally creating repos with an outdated and unsupported repo format. Requiring a "challenge" string should prevent this.

Because the format is not yet finalized and I don't want to take any chances, revlog v2's version is currently 0xDEAD. I figure squatting on a value we're likely never to use as an actual revlog version to mean "internal testing only" is acceptable. And "dead" is easily recognized as something meaningful.

There is a bunch of cleanup that is needed before work on revlog v2 begins in earnest. I plan on doing that work once this patch is accepted and we're comfortable with the idea of starting down this path.
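The gating scheme described in the commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Mercurial's actual code: the requirement string, the config key, the challenge value, and both function names are hypothetical placeholders. Only the 0xDEAD version number comes from the message itself.

```python
# Hypothetical names throughout; only the 0xDEAD value is taken from the
# commit message. This is not Mercurial's implementation.
REVLOGV2_VERSION = 0xDEAD

# The trailing ".0" stands in for the sub-version component. Changing it
# creates a new requirement string that older clients do not recognize.
REVLOGV2_REQUIREMENT = "example-revlogv2.0"

# A deliberately awkward value, so the format is never enabled by accident.
CONFIG_KEY = "experimental.example-revlogv2"
CHALLENGE = "yes-i-really-want-an-unstable-format"

SUPPORTED_REQUIREMENTS = {"revlogv1", "store", REVLOGV2_REQUIREMENT}


def new_repo_requirements(config):
    """Return the requirement set for a newly created repository."""
    requirements = {"revlogv1", "store"}
    # A plain boolean is not enough: the exact challenge string is required.
    if config.get(CONFIG_KEY) == CHALLENGE:
        requirements.discard("revlogv1")
        requirements.add(REVLOGV2_REQUIREMENT)
    return requirements


def check_can_open(repo_requirements):
    """Refuse to open a repository that declares an unknown requirement."""
    unknown = set(repo_requirements) - SUPPORTED_REQUIREMENTS
    if unknown:
        raise RuntimeError(
            "unsupported requirements: %s" % ", ".join(sorted(unknown)))
```

In this sketch, a client built against `example-revlogv2.0` would reject a repository that declares `example-revlogv2.1`, which is the lock-out behaviour the message relies on while the format is still changing.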

Gregory Szorc

File last commit: r31796:e0dc4053 default

r32697:19b9fc40 default

test_compressor_fuzzing.py | 143 lines | 5.5 KiB | text/x-python

import io
import os

try:
    import unittest2 as unittest
except ImportError:
    import unittest

try:
    import hypothesis
    import hypothesis.strategies as strategies
except ImportError:
    raise unittest.SkipTest('hypothesis not available')

import zstd

from .common import (
    make_cffi,
    random_input_data,
)


@unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set')
@make_cffi
class TestCompressor_write_to_fuzzing(unittest.TestCase):
    @hypothesis.given(original=strategies.sampled_from(random_input_data()),
                      level=strategies.integers(min_value=1, max_value=5),
                      write_size=strategies.integers(min_value=1, max_value=1048576))
    def test_write_size_variance(self, original, level, write_size):
        refctx = zstd.ZstdCompressor(level=level)
        ref_frame = refctx.compress(original)

        cctx = zstd.ZstdCompressor(level=level)
        b = io.BytesIO()
        with cctx.write_to(b, size=len(original), write_size=write_size) as compressor:
            compressor.write(original)

        self.assertEqual(b.getvalue(), ref_frame)


@unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set')
@make_cffi
class TestCompressor_copy_stream_fuzzing(unittest.TestCase):
    @hypothesis.given(original=strategies.sampled_from(random_input_data()),
                      level=strategies.integers(min_value=1, max_value=5),
                      read_size=strategies.integers(min_value=1, max_value=1048576),
                      write_size=strategies.integers(min_value=1, max_value=1048576))
    def test_read_write_size_variance(self, original, level, read_size, write_size):
        refctx = zstd.ZstdCompressor(level=level)
        ref_frame = refctx.compress(original)

        cctx = zstd.ZstdCompressor(level=level)
        source = io.BytesIO(original)
        dest = io.BytesIO()

        cctx.copy_stream(source, dest, size=len(original), read_size=read_size,
                         write_size=write_size)

        self.assertEqual(dest.getvalue(), ref_frame)


@unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set')
@make_cffi
class TestCompressor_compressobj_fuzzing(unittest.TestCase):
    @hypothesis.given(original=strategies.sampled_from(random_input_data()),
                      level=strategies.integers(min_value=1, max_value=5),
                      chunk_sizes=strategies.streaming(
                          strategies.integers(min_value=1, max_value=4096)))
    def test_random_input_sizes(self, original, level, chunk_sizes):
        chunk_sizes = iter(chunk_sizes)

        refctx = zstd.ZstdCompressor(level=level)
        ref_frame = refctx.compress(original)

        cctx = zstd.ZstdCompressor(level=level)
        cobj = cctx.compressobj(size=len(original))

        chunks = []
        i = 0
        while True:
            chunk_size = next(chunk_sizes)
            source = original[i:i + chunk_size]
            if not source:
                break

            chunks.append(cobj.compress(source))
            i += chunk_size

        chunks.append(cobj.flush())

        self.assertEqual(b''.join(chunks), ref_frame)


@unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set')
@make_cffi
class TestCompressor_read_from_fuzzing(unittest.TestCase):
    @hypothesis.given(original=strategies.sampled_from(random_input_data()),
                      level=strategies.integers(min_value=1, max_value=5),
                      read_size=strategies.integers(min_value=1, max_value=4096),
                      write_size=strategies.integers(min_value=1, max_value=4096))
    def test_read_write_size_variance(self, original, level, read_size, write_size):
        refcctx = zstd.ZstdCompressor(level=level)
        ref_frame = refcctx.compress(original)

        source = io.BytesIO(original)

        cctx = zstd.ZstdCompressor(level=level)
        chunks = list(cctx.read_from(source, size=len(original), read_size=read_size,
                                     write_size=write_size))

        self.assertEqual(b''.join(chunks), ref_frame)


@unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set')
class TestCompressor_multi_compress_to_buffer_fuzzing(unittest.TestCase):
    @hypothesis.given(original=strategies.lists(strategies.sampled_from(random_input_data()),
                                                min_size=1, max_size=1024),
                      threads=strategies.integers(min_value=1, max_value=8),
                      use_dict=strategies.booleans())
    def test_data_equivalence(self, original, threads, use_dict):
        kwargs = {}

        # Use a content dictionary because it is cheap to create.
        if use_dict:
            kwargs['dict_data'] = zstd.ZstdCompressionDict(original[0])

        cctx = zstd.ZstdCompressor(level=1,
                                   write_content_size=True,
                                   write_checksum=True,
                                   **kwargs)

        result = cctx.multi_compress_to_buffer(original, threads=-1)

        self.assertEqual(len(result), len(original))

        # The frame produced via the batch APIs may not be bit identical to that
        # produced by compress() because compression parameters are adjusted
        # from the first input in batch mode. So the only thing we can do is
        # verify the decompressed data matches the input.
        dctx = zstd.ZstdDecompressor(**kwargs)

        for i, frame in enumerate(result):
            self.assertEqual(dctx.decompress(frame), original[i])
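
The tests above share one pattern: push the same input through a streaming API in arbitrarily sized pieces, then compare the result against a reference. A rough, self-contained sketch of that pattern follows. It uses the standard library's `zlib` as a stand-in for `zstd`, and a seeded `random.Random` in place of hypothesis. Both are assumptions made so the snippet runs without extra dependencies. `compress_in_chunks` is a hypothetical helper, not part of python-zstandard. The sketch checks only that the data round-trips, as `test_data_equivalence` does, and makes no claim about byte-identical frames.

```python
import random
import zlib


def compress_in_chunks(data, chunk_sizes, level=1):
    """Compress data by feeding a compressobj variable-size chunks."""
    cobj = zlib.compressobj(level)
    out = []
    i = 0
    for size in chunk_sizes:
        piece = data[i:i + size]
        if not piece:
            break
        out.append(cobj.compress(piece))
        i += size
    # Feed anything the size sequence did not cover, then finish the stream.
    out.append(cobj.compress(data[i:]))
    out.append(cobj.flush())
    return b"".join(out)


rng = random.Random(0)  # fixed seed keeps the run reproducible
original = bytes(rng.getrandbits(8) for _ in range(20000)) + b"abc" * 5000

for _ in range(25):
    sizes = [rng.randint(1, 4096) for _ in range(64)]
    frame = compress_in_chunks(original, sizes, level=rng.randint(1, 5))
    # Round-trip check: decompressed output must equal the input.
    assert zlib.decompress(frame) == original
```

A property-based tool such as hypothesis adds input shrinking and broader exploration on top of this loop, which is why the real tests use it when it is available.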

