upstream/mercurial-mirror Files · contrib/python-zstandard/tests/test_decompressor.py

util: implement zstd compression engine

Now that zstd is vendored and being built (in some configurations), we can implement a compression engine for zstd!

The zstd engine is a little different from existing engines. Because it may not always be present, we have to defer loading the module in case importing it fails. We facilitate this via a cached property that holds a reference to the module or None. The "available" method is implemented to reflect reality.

The zstd engine declares its ability to handle bundles using the "zstd" human name and the "ZS" internal name. The latter was chosen because internal names are 2 characters (only by convention, I think) and "ZS" seems reasonable.

The engine, like others, supports specifying the compression level. However, there are no consumers of this API that yet pass in that argument. I have plans to change that, so stay tuned.

Since all we need to do to support bundle generation with a new compression engine is implement and register the compression engine, bundle generation with zstd "just works!" Tests demonstrating this have been added.

How does performance of zstd for bundle generation compare?
On the mozilla-unified repo, `hg bundle --all -t <engine>-v2` yields the following on my i7-6700K on Linux:

engine         CPU time     bundle size   vs orig size   throughput
none              97.0s   4,054,405,584         100.0%    41.8 MB/s
bzip2 (l=9)      393.6s     975,343,098          24.0%    10.3 MB/s
gzip (l=6)       184.0s   1,140,533,074          28.1%    22.0 MB/s
zstd (l=1)       108.2s   1,119,434,718          27.6%    37.5 MB/s
zstd (l=2)       111.3s   1,078,328,002          26.6%    36.4 MB/s
zstd (l=3)       113.7s   1,011,823,727          25.0%    35.7 MB/s
zstd (l=4)       116.0s   1,008,965,888          24.9%    35.0 MB/s
zstd (l=5)       121.0s     977,203,148          24.1%    33.5 MB/s
zstd (l=6)       131.7s     927,360,198          22.9%    30.8 MB/s
zstd (l=7)       139.0s     912,808,505          22.5%    29.2 MB/s
zstd (l=12)      198.1s     854,527,714          21.1%    20.5 MB/s
zstd (l=18)      681.6s     789,750,690          19.5%     5.9 MB/s

On compression, zstd for bundle generation delivers:

* better compression than gzip with significantly less CPU utilization
* better than bzip2 compression ratios while still being significantly faster than gzip
* ability to aggressively tune compression level to achieve significantly smaller bundles

That last point is important. With clone bundles, a server can pre-generate a bundle file, upload it to a static file server, and redirect clients to transparently download it during clone. The server could choose to produce a zstd bundle with the highest compression settings possible. This would take a very long time - an order of magnitude longer than a typical zstd bundle generation - but the result would be hundreds of megabytes smaller! For the clone volume we do at Mozilla, this could translate to petabytes of bandwidth savings per year and faster clones (due to smaller transfer size).

I don't have detailed numbers to report on decompression. However, zstd decompression is fast: >1 GB/s output throughput on this machine, even through the Python bindings. And it can do that regardless of the compression level of the input. By the time you have enough data to worry about the overhead of decompression, you have plenty of other things to worry about performance-wise.
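The derived columns in the table above can be recomputed from the raw figures. The sketch below assumes that "vs orig size" is the bundle size divided by the uncompressed ("none") size, and that throughput is the uncompressed size divided by CPU time, in units of 10^6 bytes per second. The commit message does not spell out either formula, but both assumptions reproduce the reported values. The function names are illustrative only.

```python
# Raw figures copied from the table in the commit message.
ORIG_SIZE = 4054405584   # bytes, the "none" (uncompressed) bundle
ZSTD_L3_SIZE = 1011823727
ZSTD_L18_SIZE = 789750690


def pct_of_original(bundle_size):
    """Bundle size as a percentage of the uncompressed bundle."""
    return 100.0 * bundle_size / ORIG_SIZE


def throughput_mb_s(cpu_seconds):
    """Assumed definition: uncompressed bytes processed per CPU second, in MB/s."""
    return ORIG_SIZE / float(cpu_seconds) / 1e6


# Bytes saved by choosing level 18 over level 3 for a pre-generated bundle.
savings = ZSTD_L3_SIZE - ZSTD_L18_SIZE

print("zstd l=3:  %.1f%% of original" % pct_of_original(ZSTD_L3_SIZE))
print("zstd l=18: %.1f%% of original" % pct_of_original(ZSTD_L18_SIZE))
print("none:      %.1f MB/s" % throughput_mb_s(97.0))
print("l=18 saves %d bytes over l=3" % savings)
```

The difference between levels 3 and 18 comes to roughly 222 MB, which matches the "hundreds of megabytes smaller" claim for a maximally compressed clone bundle.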
zstd wins all around. I can't wait to implement support for it on the wire protocol and in revlogs.
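The deferred-load pattern described in the commit message (a cached property holding the module or None, plus an "available" method) might look roughly like the following. This is a minimal sketch and not Mercurial's actual engine code: the class name `LazyModuleEngine`, the `module_name` parameter, and the hand-rolled caching are all assumptions made for illustration.

```python
import importlib


class LazyModuleEngine(object):
    """Hypothetical engine whose backing module may be missing at runtime."""

    def __init__(self, module_name):
        self._module_name = module_name
        self._loaded = False
        self._cached_module = None

    @property
    def _module(self):
        # Attempt the import only once; remember either the module or None.
        if not self._loaded:
            try:
                self._cached_module = importlib.import_module(self._module_name)
            except ImportError:
                self._cached_module = None
            self._loaded = True
        return self._cached_module

    def available(self):
        # Reflect reality: the engine is usable only if the import succeeded.
        return self._module is not None


if __name__ == "__main__":
    # Prints True or False depending on whether a "zstd" module is installed.
    print(LazyModuleEngine("zstd").available())
```

Deferring the import this way means that merely registering the engine never fails; callers check `available()` before trying to use it.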

Gregory Szorc

File last commit: r30435:b86a448a default

r30442:41a81067 default

test_decompressor.py | 478 lines | 15.0 KiB | text/x-python | PythonLexer

import io
import random
import struct
import sys

try:
    import unittest2 as unittest
except ImportError:
    import unittest

import zstd

from .common import OpCountingBytesIO


if sys.version_info[0] >= 3:
    next = lambda it: it.__next__()
else:
    next = lambda it: it.next()


class TestDecompressor_decompress(unittest.TestCase):
    def test_empty_input(self):
        dctx = zstd.ZstdDecompressor()

        with self.assertRaisesRegexp(zstd.ZstdError, 'input data invalid'):
            dctx.decompress(b'')

    def test_invalid_input(self):
        dctx = zstd.ZstdDecompressor()

        with self.assertRaisesRegexp(zstd.ZstdError, 'input data invalid'):
            dctx.decompress(b'foobar')

    def test_no_content_size_in_frame(self):
        cctx = zstd.ZstdCompressor(write_content_size=False)
        compressed = cctx.compress(b'foobar')

        dctx = zstd.ZstdDecompressor()
        with self.assertRaisesRegexp(zstd.ZstdError, 'input data invalid'):
            dctx.decompress(compressed)

    def test_content_size_present(self):
        cctx = zstd.ZstdCompressor(write_content_size=True)
        compressed = cctx.compress(b'foobar')

        dctx = zstd.ZstdDecompressor()
        decompressed = dctx.decompress(compressed)
        self.assertEqual(decompressed, b'foobar')

    def test_max_output_size(self):
        cctx = zstd.ZstdCompressor(write_content_size=False)
        source = b'foobar' * 256
        compressed = cctx.compress(source)

        dctx = zstd.ZstdDecompressor()
        # Will fit into buffer exactly the size of input.
        decompressed = dctx.decompress(compressed, max_output_size=len(source))
        self.assertEqual(decompressed, source)

        # Input size - 1 fails
        with self.assertRaisesRegexp(zstd.ZstdError, 'Destination buffer is too small'):
            dctx.decompress(compressed, max_output_size=len(source) - 1)

        # Input size + 1 works
        decompressed = dctx.decompress(compressed, max_output_size=len(source) + 1)
        self.assertEqual(decompressed, source)

        # A much larger buffer works.
        decompressed = dctx.decompress(compressed, max_output_size=len(source) * 64)
        self.assertEqual(decompressed, source)

    def test_stupidly_large_output_buffer(self):
        cctx = zstd.ZstdCompressor(write_content_size=False)
        compressed = cctx.compress(b'foobar' * 256)
        dctx = zstd.ZstdDecompressor()

        # Will get OverflowError on some Python distributions that can't
        # handle really large integers.
        with self.assertRaises((MemoryError, OverflowError)):
            dctx.decompress(compressed, max_output_size=2**62)

    def test_dictionary(self):
        samples = []
        for i in range(128):
            samples.append(b'foo' * 64)
            samples.append(b'bar' * 64)
            samples.append(b'foobar' * 64)

        d = zstd.train_dictionary(8192, samples)

        orig = b'foobar' * 16384
        cctx = zstd.ZstdCompressor(level=1, dict_data=d, write_content_size=True)
        compressed = cctx.compress(orig)

        dctx = zstd.ZstdDecompressor(dict_data=d)
        decompressed = dctx.decompress(compressed)

        self.assertEqual(decompressed, orig)

    def test_dictionary_multiple(self):
        samples = []
        for i in range(128):
            samples.append(b'foo' * 64)
            samples.append(b'bar' * 64)
            samples.append(b'foobar' * 64)

        d = zstd.train_dictionary(8192, samples)

        sources = (b'foobar' * 8192, b'foo' * 8192, b'bar' * 8192)
        compressed = []
        cctx = zstd.ZstdCompressor(level=1, dict_data=d, write_content_size=True)
        for source in sources:
            compressed.append(cctx.compress(source))

        dctx = zstd.ZstdDecompressor(dict_data=d)
        for i in range(len(sources)):
            decompressed = dctx.decompress(compressed[i])
            self.assertEqual(decompressed, sources[i])


class TestDecompressor_copy_stream(unittest.TestCase):
    def test_no_read(self):
        source = object()
        dest = io.BytesIO()

        dctx = zstd.ZstdDecompressor()
        with self.assertRaises(ValueError):
            dctx.copy_stream(source, dest)

    def test_no_write(self):
        source = io.BytesIO()
        dest = object()

        dctx = zstd.ZstdDecompressor()
        with self.assertRaises(ValueError):
            dctx.copy_stream(source, dest)

    def test_empty(self):
        source = io.BytesIO()
        dest = io.BytesIO()

        dctx = zstd.ZstdDecompressor()
        # TODO should this raise an error?
        r, w = dctx.copy_stream(source, dest)

        self.assertEqual(r, 0)
        self.assertEqual(w, 0)
        self.assertEqual(dest.getvalue(), b'')

    def test_large_data(self):
        source = io.BytesIO()
        for i in range(255):
            source.write(struct.Struct('>B').pack(i) * 16384)
        source.seek(0)

        compressed = io.BytesIO()
        cctx = zstd.ZstdCompressor()
        cctx.copy_stream(source, compressed)

        compressed.seek(0)
        dest = io.BytesIO()
        dctx = zstd.ZstdDecompressor()
        r, w = dctx.copy_stream(compressed, dest)

        self.assertEqual(r, len(compressed.getvalue()))
        self.assertEqual(w, len(source.getvalue()))

    def test_read_write_size(self):
        source = OpCountingBytesIO(zstd.ZstdCompressor().compress(
            b'foobarfoobar'))

        dest = OpCountingBytesIO()
        dctx = zstd.ZstdDecompressor()
        r, w = dctx.copy_stream(source, dest, read_size=1, write_size=1)

        self.assertEqual(r, len(source.getvalue()))
        self.assertEqual(w, len(b'foobarfoobar'))
        self.assertEqual(source._read_count, len(source.getvalue()) + 1)
        self.assertEqual(dest._write_count, len(dest.getvalue()))


class TestDecompressor_decompressobj(unittest.TestCase):
    def test_simple(self):
        data = zstd.ZstdCompressor(level=1).compress(b'foobar')

        dctx = zstd.ZstdDecompressor()
        dobj = dctx.decompressobj()
        self.assertEqual(dobj.decompress(data), b'foobar')

    def test_reuse(self):
        data = zstd.ZstdCompressor(level=1).compress(b'foobar')

        dctx = zstd.ZstdDecompressor()
        dobj = dctx.decompressobj()
        dobj.decompress(data)

        with self.assertRaisesRegexp(zstd.ZstdError, 'cannot use a decompressobj'):
            dobj.decompress(data)


def decompress_via_writer(data):
    buffer = io.BytesIO()
    dctx = zstd.ZstdDecompressor()
    with dctx.write_to(buffer) as decompressor:
        decompressor.write(data)
    return buffer.getvalue()


class TestDecompressor_write_to(unittest.TestCase):
    def test_empty_roundtrip(self):
        cctx = zstd.ZstdCompressor()
        empty = cctx.compress(b'')
        self.assertEqual(decompress_via_writer(empty), b'')

    def test_large_roundtrip(self):
        chunks = []
        for i in range(255):
            chunks.append(struct.Struct('>B').pack(i) * 16384)
        orig = b''.join(chunks)
        cctx = zstd.ZstdCompressor()
        compressed = cctx.compress(orig)

        self.assertEqual(decompress_via_writer(compressed), orig)

    def test_multiple_calls(self):
        chunks = []
        for i in range(255):
            for j in range(255):
                chunks.append(struct.Struct('>B').pack(j) * i)

        orig = b''.join(chunks)
        cctx = zstd.ZstdCompressor()
        compressed = cctx.compress(orig)

        buffer = io.BytesIO()
        dctx = zstd.ZstdDecompressor()
        with dctx.write_to(buffer) as decompressor:
            pos = 0
            while pos < len(compressed):
                pos2 = pos + 8192
                decompressor.write(compressed[pos:pos2])
                pos += 8192
        self.assertEqual(buffer.getvalue(), orig)

    def test_dictionary(self):
        samples = []
        for i in range(128):
            samples.append(b'foo' * 64)
            samples.append(b'bar' * 64)
            samples.append(b'foobar' * 64)

        d = zstd.train_dictionary(8192, samples)

        orig = b'foobar' * 16384
        buffer = io.BytesIO()
        cctx = zstd.ZstdCompressor(dict_data=d)
        with cctx.write_to(buffer) as compressor:
            compressor.write(orig)

        compressed = buffer.getvalue()
        buffer = io.BytesIO()

        dctx = zstd.ZstdDecompressor(dict_data=d)
        with dctx.write_to(buffer) as decompressor:
            decompressor.write(compressed)

        self.assertEqual(buffer.getvalue(), orig)

    def test_memory_size(self):
        dctx = zstd.ZstdDecompressor()
        buffer = io.BytesIO()
        with dctx.write_to(buffer) as decompressor:
            size = decompressor.memory_size()

        self.assertGreater(size, 100000)

    def test_write_size(self):
        source = zstd.ZstdCompressor().compress(b'foobarfoobar')
        dest = OpCountingBytesIO()
        dctx = zstd.ZstdDecompressor()
        with dctx.write_to(dest, write_size=1) as decompressor:
            s = struct.Struct('>B')
            for c in source:
                if not isinstance(c, str):
                    c = s.pack(c)
                decompressor.write(c)

        self.assertEqual(dest.getvalue(), b'foobarfoobar')
        self.assertEqual(dest._write_count, len(dest.getvalue()))


class TestDecompressor_read_from(unittest.TestCase):
    def test_type_validation(self):
        dctx = zstd.ZstdDecompressor()

        # Object with read() works.
        dctx.read_from(io.BytesIO())

        # Buffer protocol works.
        dctx.read_from(b'foobar')

        with self.assertRaisesRegexp(ValueError, 'must pass an object with a read'):
            dctx.read_from(True)

    def test_empty_input(self):
        dctx = zstd.ZstdDecompressor()

        source = io.BytesIO()
        it = dctx.read_from(source)
        # TODO this is arguably wrong. Should get an error about missing frame foo.
        with self.assertRaises(StopIteration):
            next(it)

        it = dctx.read_from(b'')
        with self.assertRaises(StopIteration):
            next(it)

    def test_invalid_input(self):
        dctx = zstd.ZstdDecompressor()

        source = io.BytesIO(b'foobar')
        it = dctx.read_from(source)
        with self.assertRaisesRegexp(zstd.ZstdError, 'Unknown frame descriptor'):
            next(it)

        it = dctx.read_from(b'foobar')
        with self.assertRaisesRegexp(zstd.ZstdError, 'Unknown frame descriptor'):
            next(it)

    def test_empty_roundtrip(self):
        cctx = zstd.ZstdCompressor(level=1, write_content_size=False)
        empty = cctx.compress(b'')

        source = io.BytesIO(empty)
        source.seek(0)

        dctx = zstd.ZstdDecompressor()
        it = dctx.read_from(source)

        # No chunks should be emitted since there is no data.
        with self.assertRaises(StopIteration):
            next(it)

        # Again for good measure.
        with self.assertRaises(StopIteration):
            next(it)

    def test_skip_bytes_too_large(self):
        dctx = zstd.ZstdDecompressor()

        with self.assertRaisesRegexp(ValueError, 'skip_bytes must be smaller than read_size'):
            dctx.read_from(b'', skip_bytes=1, read_size=1)

        with self.assertRaisesRegexp(ValueError, 'skip_bytes larger than first input chunk'):
            b''.join(dctx.read_from(b'foobar', skip_bytes=10))

    def test_skip_bytes(self):
        cctx = zstd.ZstdCompressor(write_content_size=False)
        compressed = cctx.compress(b'foobar')

        dctx = zstd.ZstdDecompressor()
        output = b''.join(dctx.read_from(b'hdr' + compressed, skip_bytes=3))
        self.assertEqual(output, b'foobar')

    def test_large_output(self):
        source = io.BytesIO()
        source.write(b'f' * zstd.DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE)
        source.write(b'o')
        source.seek(0)

        cctx = zstd.ZstdCompressor(level=1)
        compressed = io.BytesIO(cctx.compress(source.getvalue()))
        compressed.seek(0)

        dctx = zstd.ZstdDecompressor()
        it = dctx.read_from(compressed)

        chunks = []
        chunks.append(next(it))
        chunks.append(next(it))

        with self.assertRaises(StopIteration):
            next(it)

        decompressed = b''.join(chunks)
        self.assertEqual(decompressed, source.getvalue())

        # And again with buffer protocol.
        it = dctx.read_from(compressed.getvalue())
        chunks = []
        chunks.append(next(it))
        chunks.append(next(it))

        with self.assertRaises(StopIteration):
            next(it)

        decompressed = b''.join(chunks)
        self.assertEqual(decompressed, source.getvalue())

    def test_large_input(self):
        bytes = list(struct.Struct('>B').pack(i) for i in range(256))
        compressed = io.BytesIO()
        input_size = 0
        cctx = zstd.ZstdCompressor(level=1)
        with cctx.write_to(compressed) as compressor:
            while True:
                compressor.write(random.choice(bytes))
                input_size += 1

                have_compressed = len(compressed.getvalue()) > zstd.DECOMPRESSION_RECOMMENDED_INPUT_SIZE
                have_raw = input_size > zstd.DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE * 2
                if have_compressed and have_raw:
                    break

        compressed.seek(0)
        self.assertGreater(len(compressed.getvalue()),
                           zstd.DECOMPRESSION_RECOMMENDED_INPUT_SIZE)

        dctx = zstd.ZstdDecompressor()
        it = dctx.read_from(compressed)

        chunks = []
        chunks.append(next(it))
        chunks.append(next(it))
        chunks.append(next(it))

        with self.assertRaises(StopIteration):
            next(it)

        decompressed = b''.join(chunks)
        self.assertEqual(len(decompressed), input_size)

        # And again with buffer protocol.
        it = dctx.read_from(compressed.getvalue())

        chunks = []
        chunks.append(next(it))
        chunks.append(next(it))
        chunks.append(next(it))

        with self.assertRaises(StopIteration):
            next(it)

        decompressed = b''.join(chunks)
        self.assertEqual(len(decompressed), input_size)

    def test_interesting(self):
        # Found this edge case via fuzzing.
        cctx = zstd.ZstdCompressor(level=1)

        source = io.BytesIO()

        compressed = io.BytesIO()
        with cctx.write_to(compressed) as compressor:
            for i in range(256):
                chunk = b'\0' * 1024
                compressor.write(chunk)
                source.write(chunk)

        dctx = zstd.ZstdDecompressor()

        simple = dctx.decompress(compressed.getvalue(),
                                 max_output_size=len(source.getvalue()))
        self.assertEqual(simple, source.getvalue())

        compressed.seek(0)
        streamed = b''.join(dctx.read_from(compressed))
        self.assertEqual(streamed, source.getvalue())

    def test_read_write_size(self):
        source = OpCountingBytesIO(zstd.ZstdCompressor().compress(b'foobarfoobar'))
        dctx = zstd.ZstdDecompressor()
        for chunk in dctx.read_from(source, read_size=1, write_size=1):
            self.assertEqual(len(chunk), 1)

        self.assertEqual(source._read_count, len(source.getvalue()))
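
Several tests above rely on OpCountingBytesIO from the sibling common module, which is not shown on this page. Judging only by how the tests use it (`_read_count`, `_write_count`, `getvalue()`), a plausible minimal sketch is a BytesIO subclass that counts calls to read() and write(). The implementation below is an assumption, not the project's actual helper.

```python
import io


class OpCountingBytesIO(io.BytesIO):
    """Assumed shape of the test helper: a BytesIO that counts I/O calls."""

    def __init__(self, *args, **kwargs):
        self._read_count = 0
        self._write_count = 0
        super(OpCountingBytesIO, self).__init__(*args, **kwargs)

    def read(self, *args):
        # Count every read() call, including the final one that returns b''.
        self._read_count += 1
        return super(OpCountingBytesIO, self).read(*args)

    def write(self, data):
        # Count every write() call regardless of how many bytes it carries.
        self._write_count += 1
        return super(OpCountingBytesIO, self).write(data)
```

With read_size=1 and write_size=1, counters like these let a test check that the decompressor honors the requested I/O sizes: one write call per output byte, and roughly one read call per input byte.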

	Site-wide shortcuts
/	Use quick search box
g h	Goto home page
g g	Goto my private gists page
g G	Goto my public gists page
g 0-9	Goto bookmarked items from 0-9
n r	New repository page
n g	New gist page

	Repositories
g s	Goto summary page
g c	Goto changelog page
g f	Goto files page
g F	Goto files page with file search activated
g p	Goto pull requests page
g o	Goto repository settings
g O	Goto repository access permissions settings
t s	Toggle sidebar on some pages

				import io
				import random
				import struct
				import sys

				try:
				import unittest2 as unittest
				except ImportError:
				import unittest

				import zstd

				from .common import OpCountingBytesIO


				if sys.version_info[0] >= 3:
				next = lambda it: it.__next__()
				else:
				next = lambda it: it.next()


				class TestDecompressor_decompress(unittest.TestCase):
				def test_empty_input(self):
				dctx = zstd.ZstdDecompressor()

				with self.assertRaisesRegexp(zstd.ZstdError, 'input data invalid'):
				dctx.decompress(b'')

				def test_invalid_input(self):
				dctx = zstd.ZstdDecompressor()

				with self.assertRaisesRegexp(zstd.ZstdError, 'input data invalid'):
				dctx.decompress(b'foobar')

				def test_no_content_size_in_frame(self):
				cctx = zstd.ZstdCompressor(write_content_size=False)
				compressed = cctx.compress(b'foobar')

				dctx = zstd.ZstdDecompressor()
				with self.assertRaisesRegexp(zstd.ZstdError, 'input data invalid'):
				dctx.decompress(compressed)

				def test_content_size_present(self):
				cctx = zstd.ZstdCompressor(write_content_size=True)
				compressed = cctx.compress(b'foobar')

				dctx = zstd.ZstdDecompressor()
				decompressed = dctx.decompress(compressed)
				self.assertEqual(decompressed, b'foobar')

				def test_max_output_size(self):
				cctx = zstd.ZstdCompressor(write_content_size=False)
				source = b'foobar' * 256
				compressed = cctx.compress(source)

				dctx = zstd.ZstdDecompressor()
				# Will fit into buffer exactly the size of input.
				decompressed = dctx.decompress(compressed, max_output_size=len(source))
				self.assertEqual(decompressed, source)

				# Input size - 1 fails
				with self.assertRaisesRegexp(zstd.ZstdError, 'Destination buffer is too small'):
				dctx.decompress(compressed, max_output_size=len(source) - 1)

				# Input size + 1 works
				decompressed = dctx.decompress(compressed, max_output_size=len(source) + 1)
				self.assertEqual(decompressed, source)

				# A much larger buffer works.
				decompressed = dctx.decompress(compressed, max_output_size=len(source) * 64)
				self.assertEqual(decompressed, source)

				def test_stupidly_large_output_buffer(self):
				cctx = zstd.ZstdCompressor(write_content_size=False)
				compressed = cctx.compress(b'foobar' * 256)
				dctx = zstd.ZstdDecompressor()

				# Will get OverflowError on some Python distributions that can't
				# handle really large integers.
				with self.assertRaises((MemoryError, OverflowError)):
				dctx.decompress(compressed, max_output_size=2**62)

				def test_dictionary(self):
				samples = []
				for i in range(128):
				samples.append(b'foo' * 64)
				samples.append(b'bar' * 64)
				samples.append(b'foobar' * 64)

				d = zstd.train_dictionary(8192, samples)

				orig = b'foobar' * 16384
				cctx = zstd.ZstdCompressor(level=1, dict_data=d, write_content_size=True)
				compressed = cctx.compress(orig)

				dctx = zstd.ZstdDecompressor(dict_data=d)
				decompressed = dctx.decompress(compressed)

				self.assertEqual(decompressed, orig)

				def test_dictionary_multiple(self):
				samples = []
				for i in range(128):
				samples.append(b'foo' * 64)
				samples.append(b'bar' * 64)
				samples.append(b'foobar' * 64)

				d = zstd.train_dictionary(8192, samples)

				sources = (b'foobar' * 8192, b'foo' * 8192, b'bar' * 8192)
				compressed = []
				cctx = zstd.ZstdCompressor(level=1, dict_data=d, write_content_size=True)
				for source in sources:
				compressed.append(cctx.compress(source))

				dctx = zstd.ZstdDecompressor(dict_data=d)
				for i in range(len(sources)):
				decompressed = dctx.decompress(compressed[i])
				self.assertEqual(decompressed, sources[i])


				class TestDecompressor_copy_stream(unittest.TestCase):
				def test_no_read(self):
				source = object()
				dest = io.BytesIO()

				dctx = zstd.ZstdDecompressor()
				with self.assertRaises(ValueError):
				dctx.copy_stream(source, dest)

				def test_no_write(self):
				source = io.BytesIO()
				dest = object()

				dctx = zstd.ZstdDecompressor()
				with self.assertRaises(ValueError):
				dctx.copy_stream(source, dest)

				def test_empty(self):
				source = io.BytesIO()
				dest = io.BytesIO()

				dctx = zstd.ZstdDecompressor()
				# TODO should this raise an error?
				r, w = dctx.copy_stream(source, dest)

				self.assertEqual(r, 0)
				self.assertEqual(w, 0)
				self.assertEqual(dest.getvalue(), b'')

				def test_large_data(self):
				source = io.BytesIO()
				for i in range(255):
				source.write(struct.Struct('>B').pack(i) * 16384)
				source.seek(0)

				compressed = io.BytesIO()
				cctx = zstd.ZstdCompressor()
				cctx.copy_stream(source, compressed)

				compressed.seek(0)
				dest = io.BytesIO()
				dctx = zstd.ZstdDecompressor()
				r, w = dctx.copy_stream(compressed, dest)

				self.assertEqual(r, len(compressed.getvalue()))
				self.assertEqual(w, len(source.getvalue()))

				def test_read_write_size(self):
				source = OpCountingBytesIO(zstd.ZstdCompressor().compress(
				b'foobarfoobar'))

				dest = OpCountingBytesIO()
				dctx = zstd.ZstdDecompressor()
				r, w = dctx.copy_stream(source, dest, read_size=1, write_size=1)

				self.assertEqual(r, len(source.getvalue()))
				self.assertEqual(w, len(b'foobarfoobar'))
				self.assertEqual(source._read_count, len(source.getvalue()) + 1)
				self.assertEqual(dest._write_count, len(dest.getvalue()))


				class TestDecompressor_decompressobj(unittest.TestCase):
				def test_simple(self):
				data = zstd.ZstdCompressor(level=1).compress(b'foobar')

				dctx = zstd.ZstdDecompressor()
				dobj = dctx.decompressobj()
				self.assertEqual(dobj.decompress(data), b'foobar')

				def test_reuse(self):
				data = zstd.ZstdCompressor(level=1).compress(b'foobar')

				dctx = zstd.ZstdDecompressor()
				dobj = dctx.decompressobj()
				dobj.decompress(data)

				with self.assertRaisesRegexp(zstd.ZstdError, 'cannot use a decompressobj'):
				dobj.decompress(data)


				def decompress_via_writer(data):
				buffer = io.BytesIO()
				dctx = zstd.ZstdDecompressor()
				with dctx.write_to(buffer) as decompressor:
				decompressor.write(data)
				return buffer.getvalue()


				class TestDecompressor_write_to(unittest.TestCase):
				def test_empty_roundtrip(self):
				cctx = zstd.ZstdCompressor()
				empty = cctx.compress(b'')
				self.assertEqual(decompress_via_writer(empty), b'')

				def test_large_roundtrip(self):
				chunks = []
				for i in range(255):
				chunks.append(struct.Struct('>B').pack(i) * 16384)
				orig = b''.join(chunks)
				cctx = zstd.ZstdCompressor()
				compressed = cctx.compress(orig)

				self.assertEqual(decompress_via_writer(compressed), orig)

				def test_multiple_calls(self):
				chunks = []
				for i in range(255):
				for j in range(255):
				chunks.append(struct.Struct('>B').pack(j) * i)

				orig = b''.join(chunks)
				cctx = zstd.ZstdCompressor()
				compressed = cctx.compress(orig)

				buffer = io.BytesIO()
				dctx = zstd.ZstdDecompressor()
				with dctx.write_to(buffer) as decompressor:
				pos = 0
				while pos < len(compressed):
				pos2 = pos + 8192
				decompressor.write(compressed[pos:pos2])
				pos += 8192
				self.assertEqual(buffer.getvalue(), orig)

				def test_dictionary(self):
				samples = []
				for i in range(128):
				samples.append(b'foo' * 64)
				samples.append(b'bar' * 64)
				samples.append(b'foobar' * 64)

				d = zstd.train_dictionary(8192, samples)

				orig = b'foobar' * 16384
				buffer = io.BytesIO()
				cctx = zstd.ZstdCompressor(dict_data=d)
				with cctx.write_to(buffer) as compressor:
				compressor.write(orig)

				compressed = buffer.getvalue()
				buffer = io.BytesIO()

				dctx = zstd.ZstdDecompressor(dict_data=d)
				with dctx.write_to(buffer) as decompressor:
				decompressor.write(compressed)

				self.assertEqual(buffer.getvalue(), orig)

				def test_memory_size(self):
				dctx = zstd.ZstdDecompressor()
				buffer = io.BytesIO()
				with dctx.write_to(buffer) as decompressor:
				size = decompressor.memory_size()

				self.assertGreater(size, 100000)

				def test_write_size(self):
				source = zstd.ZstdCompressor().compress(b'foobarfoobar')
				dest = OpCountingBytesIO()
				dctx = zstd.ZstdDecompressor()
				with dctx.write_to(dest, write_size=1) as decompressor:
				s = struct.Struct('>B')
				for c in source:
				if not isinstance(c, str):
				c = s.pack(c)
				decompressor.write(c)


				self.assertEqual(dest.getvalue(), b'foobarfoobar')
				self.assertEqual(dest._write_count, len(dest.getvalue()))


				class TestDecompressor_read_from(unittest.TestCase):
				def test_type_validation(self):
				dctx = zstd.ZstdDecompressor()

				# Object with read() works.
				dctx.read_from(io.BytesIO())

				# Buffer protocol works.
				dctx.read_from(b'foobar')

				with self.assertRaisesRegexp(ValueError, 'must pass an object with a read'):
				dctx.read_from(True)

				def test_empty_input(self):
				dctx = zstd.ZstdDecompressor()

				source = io.BytesIO()
				it = dctx.read_from(source)
				# TODO this is arguably wrong. Should get an error about missing frame foo.
				with self.assertRaises(StopIteration):
				next(it)

				it = dctx.read_from(b'')
				with self.assertRaises(StopIteration):
				next(it)

				def test_invalid_input(self):
				dctx = zstd.ZstdDecompressor()

				source = io.BytesIO(b'foobar')
				it = dctx.read_from(source)
				with self.assertRaisesRegexp(zstd.ZstdError, 'Unknown frame descriptor'):
				next(it)

				it = dctx.read_from(b'foobar')
				with self.assertRaisesRegexp(zstd.ZstdError, 'Unknown frame descriptor'):
				next(it)

    def test_empty_roundtrip(self):
        cctx = zstd.ZstdCompressor(level=1, write_content_size=False)
        empty = cctx.compress(b'')

        source = io.BytesIO(empty)
        source.seek(0)

        dctx = zstd.ZstdDecompressor()
        it = dctx.read_from(source)

        # No chunks should be emitted since there is no data.
        with self.assertRaises(StopIteration):
            next(it)

        # Again for good measure.
        with self.assertRaises(StopIteration):
            next(it)

    def test_skip_bytes_too_large(self):
        dctx = zstd.ZstdDecompressor()

        with self.assertRaisesRegexp(ValueError, 'skip_bytes must be smaller than read_size'):
            dctx.read_from(b'', skip_bytes=1, read_size=1)

        with self.assertRaisesRegexp(ValueError, 'skip_bytes larger than first input chunk'):
            b''.join(dctx.read_from(b'foobar', skip_bytes=10))

    def test_skip_bytes(self):
        cctx = zstd.ZstdCompressor(write_content_size=False)
        compressed = cctx.compress(b'foobar')

        dctx = zstd.ZstdDecompressor()
        output = b''.join(dctx.read_from(b'hdr' + compressed, skip_bytes=3))
        self.assertEqual(output, b'foobar')

    def test_large_output(self):
        source = io.BytesIO()
        source.write(b'f' * zstd.DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE)
        source.write(b'o')
        source.seek(0)

        cctx = zstd.ZstdCompressor(level=1)
        compressed = io.BytesIO(cctx.compress(source.getvalue()))
        compressed.seek(0)

        dctx = zstd.ZstdDecompressor()
        it = dctx.read_from(compressed)

        chunks = []
        chunks.append(next(it))
        chunks.append(next(it))

        with self.assertRaises(StopIteration):
            next(it)

        decompressed = b''.join(chunks)
        self.assertEqual(decompressed, source.getvalue())

        # And again with buffer protocol.
        it = dctx.read_from(compressed.getvalue())
        chunks = []
        chunks.append(next(it))
        chunks.append(next(it))

        with self.assertRaises(StopIteration):
            next(it)

        decompressed = b''.join(chunks)
        self.assertEqual(decompressed, source.getvalue())

    def test_large_input(self):
        bytes = list(struct.Struct('>B').pack(i) for i in range(256))
        compressed = io.BytesIO()
        input_size = 0
        cctx = zstd.ZstdCompressor(level=1)
        with cctx.write_to(compressed) as compressor:
            while True:
                compressor.write(random.choice(bytes))
                input_size += 1

                have_compressed = len(compressed.getvalue()) > zstd.DECOMPRESSION_RECOMMENDED_INPUT_SIZE
                have_raw = input_size > zstd.DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE * 2
                if have_compressed and have_raw:
                    break

        compressed.seek(0)
        self.assertGreater(len(compressed.getvalue()),
                           zstd.DECOMPRESSION_RECOMMENDED_INPUT_SIZE)

        dctx = zstd.ZstdDecompressor()
        it = dctx.read_from(compressed)

        chunks = []
        chunks.append(next(it))
        chunks.append(next(it))
        chunks.append(next(it))

        with self.assertRaises(StopIteration):
            next(it)

        decompressed = b''.join(chunks)
        self.assertEqual(len(decompressed), input_size)

        # And again with buffer protocol.
        it = dctx.read_from(compressed.getvalue())

        chunks = []
        chunks.append(next(it))
        chunks.append(next(it))
        chunks.append(next(it))

        with self.assertRaises(StopIteration):
            next(it)

        decompressed = b''.join(chunks)
        self.assertEqual(len(decompressed), input_size)

    def test_interesting(self):
        # Found this edge case via fuzzing.
        cctx = zstd.ZstdCompressor(level=1)

        source = io.BytesIO()

        compressed = io.BytesIO()
        with cctx.write_to(compressed) as compressor:
            for i in range(256):
                chunk = b'\0' * 1024
                compressor.write(chunk)
                source.write(chunk)

        dctx = zstd.ZstdDecompressor()

        simple = dctx.decompress(compressed.getvalue(),
                                 max_output_size=len(source.getvalue()))
        self.assertEqual(simple, source.getvalue())

        compressed.seek(0)
        streamed = b''.join(dctx.read_from(compressed))
        self.assertEqual(streamed, source.getvalue())

    def test_read_write_size(self):
        source = OpCountingBytesIO(zstd.ZstdCompressor().compress(b'foobarfoobar'))
        dctx = zstd.ZstdDecompressor()
        for chunk in dctx.read_from(source, read_size=1, write_size=1):
            self.assertEqual(len(chunk), 1)

        self.assertEqual(source._read_count, len(source.getvalue()))