merge: default into stable for release candidate
Augie Fackler -
r42707:4a8d9ed8 merge 5.0rc0 stable

@@ -0,0 +1,127 b''
1 ====================
2 Mercurial Automation
3 ====================
4
5 This directory contains code and utilities for building and testing Mercurial
6 on remote machines.
7
8 The ``automation.py`` Script
9 ============================
10
11 ``automation.py`` is an executable Python script (requires Python 3.5+)
12 that serves as a driver for common automation tasks.
13
14 When executed, the script will *bootstrap* a virtualenv in
15 ``<source-root>/build/venv-automation`` then re-execute itself using
16 that virtualenv. So there is no need for the caller to have a virtualenv
17 explicitly activated. This virtualenv will be populated with various
18 dependencies (as defined by the ``requirements.txt`` file).
19
20 To see what you can do with this script, simply run it::
21
22 $ ./automation.py
23
24 Local State
25 ===========
26
27 By default, local state required to interact with remote servers is stored
28 in the ``~/.hgautomation`` directory.
29
30 We attempt to limit persistent state to this directory. Even when
31 performing tasks that may have side-effects, we try to limit those
32 side-effects so they don't impact the local system. For example, when we SSH
33 into a remote machine, we create a temporary directory for the SSH
34 config so the user's known hosts file isn't updated.
35
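The isolation described above could be sketched as follows. This is a minimal illustration and not the code ``automation.py`` uses: the helper name ``ssh_command`` and the host name are hypothetical, and the sketch points ``ssh`` at a throwaway known hosts file through the standard OpenSSH ``UserKnownHostsFile`` option rather than writing a full SSH config. It only builds the command line; it does not connect anywhere.

```python
import pathlib
import tempfile


def ssh_command(host, temp_dir):
    """Build an ssh argv whose known hosts file lives in temp_dir.

    Hypothetical helper, for illustration only.
    """
    known_hosts = pathlib.Path(temp_dir) / 'known_hosts'
    return [
        'ssh',
        # Record host keys in the throwaway file, not ~/.ssh/known_hosts.
        '-o', 'UserKnownHostsFile=%s' % known_hosts,
        host,
    ]


with tempfile.TemporaryDirectory() as td:
    args = ssh_command('example.invalid', td)
    print(' '.join(args))
# The temporary directory, and any host keys recorded in it, are gone here.
```

Because the directory is removed when the ``with`` block exits, nothing learned about the remote host persists on the local system.
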
36 AWS Integration
37 ===============
38
39 Various automation tasks integrate with AWS to provide access to
40 resources such as EC2 instances for generic compute.
41
42 This requires an AWS account and credentials.
43
44 We use the ``boto3`` library for interacting with AWS APIs. We do not employ
45 any special functionality for telling ``boto3`` where to find AWS credentials. See
46 https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html
47 for how ``boto3`` locates credentials. Once you have configured your environment such
48 that ``boto3`` can find credentials, interaction with AWS should *just work*.
49
50 .. hint::
51
52 Typically you have a ``~/.aws/credentials`` file containing AWS
53 credentials. If you manage multiple credentials, you can override which
54 *profile* to use at run-time by setting the ``AWS_PROFILE`` environment
55 variable.
56
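To illustrate how profiles are laid out, the sketch below parses text in the INI layout used by ``~/.aws/credentials`` and picks a profile the way the hint describes. It is a stdlib-only approximation for illustration; ``boto3`` performs its own lookup and this is not its implementation. The function name ``select_profile``, the profile names and the key values are all made up.

```python
import configparser

# Made-up sample content in the INI layout used by ~/.aws/credentials.
SAMPLE = '''
[default]
aws_access_key_id = AKIAEXAMPLEDEFAULT
aws_secret_access_key = default-secret

[hg-automation]
aws_access_key_id = AKIAEXAMPLEHG
aws_secret_access_key = hg-secret
'''


def select_profile(text, environ):
    """Return (profile name, access key id) for the active profile."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # AWS_PROFILE overrides the profile; otherwise "default" is used.
    name = environ.get('AWS_PROFILE', 'default')
    return name, parser[name]['aws_access_key_id']


print(select_profile(SAMPLE, {}))
print(select_profile(SAMPLE, {'AWS_PROFILE': 'hg-automation'}))
```

In real use, the second argument would be ``os.environ`` and the profile would typically be chosen on the command line, for example by prefixing the ``automation.py`` invocation with ``AWS_PROFILE=<name>``.
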
57 Resource Management
58 -------------------
59
60 Depending on the task being performed, various AWS services will be accessed.
61 This of course requires AWS credentials with permissions to access these
62 services.
63
64 The following AWS services can be accessed by automation tasks:
65
66 * EC2
67 * IAM
68 * Simple Systems Manager (SSM)
69
70 Tasks may also create AWS resources, which requires credentials with
71 permission to create them.
72
73 The following AWS resources can be created by automation tasks:
74
75 * EC2 key pairs
76 * EC2 security groups
77 * EC2 instances
78 * IAM roles and instance profiles
79 * SSM command invocations
80
81 When possible, we prefix resource names with ``hg-`` so they can easily
82 be identified as belonging to Mercurial.
83
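The prefix convention amounts to a filter like the one sketched here. The function name ``managed_names`` and the sample resource names are hypothetical; the sketch only shows how a shared prefix lets our resources be told apart from everything else in the account.

```python
PREFIX = 'hg-'


def managed_names(remote_names, prefix=PREFIX):
    """Map unprefixed name -> full name for resources carrying the prefix."""
    return {
        name[len(prefix):]: name
        for name in remote_names
        if name.startswith(prefix)
    }


# Sample names as they might be listed from an AWS account.
names = ['hg-windows-dev-1', 'default', 'hg-automation', 'someone-else']
print(managed_names(names))
```

Anything without the prefix is ignored, which is also why resources created by hand with an ``hg-`` prefix risk being treated as ours.
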
84 .. important::
85
86 We currently assume that AWS accounts utilized by *us* are single
87 tenancy. Attempts to have discrete users of ``automation.py`` (including
88 sharing credentials across machines) using the same AWS account can result
89 in them interfering with each other and things breaking.
90
91 Cost of Operation
92 -----------------
93
94 ``automation.py`` tries to be frugal with regard to utilization of remote
95 resources. Persistent remote resources are minimized in order to keep costs
96 in check. For example, EC2 instances are often ephemeral and only live as long
97 as the operation being performed.
98
99 Under normal operation, recurring costs are limited to:
100
101 * Storage costs for AMI / EBS snapshots. This should be just a few pennies
102 per month.
103
104 When running EC2 instances, you'll be billed accordingly. By default, we
105 use *small* instances, like ``t3.medium``. This instance type costs ~$0.07 per
106 hour.
107
108 .. note::
109
110 When running Windows EC2 instances, AWS bills at the full hourly cost, even
111 if the instance doesn't run for a full hour (per-second billing doesn't
112 apply to Windows AMIs).
113
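A rough worked example of the billing difference follows. It is a back-of-the-envelope sketch: the rate is the approximate ``t3.medium`` figure quoted above, the non-Windows branch assumes simple per-second proration, and ``estimate_cost`` is a hypothetical helper. Actual prices vary by region and operating system.

```python
import math

# Approximate hourly rate quoted above for a t3.medium.
HOURLY_RATE = 0.07


def estimate_cost(seconds, windows, rate=HOURLY_RATE):
    """Estimate the cost of one instance running for `seconds`."""
    hours = seconds / 3600
    if windows:
        # Partial hours are billed as full hours.
        hours = math.ceil(hours)
    return hours * rate


# A 20 minute run, prorated versus rounded up to a full hour.
print('%.4f' % estimate_cost(20 * 60, windows=False))
print('%.4f' % estimate_cost(20 * 60, windows=True))
```

Under these assumptions a 20 minute Windows run costs the same as a full hour, so several short runs can cost noticeably more than one longer run.
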
114 Managing Remote Resources
115 -------------------------
116
117 Occasionally, there may be an error purging a temporary resource. Or you
118 may wish to forcefully purge remote state. Commands can be invoked to manually
119 purge remote resources.
120
121 To terminate all EC2 instances that we manage::
122
123 $ automation.py terminate-ec2-instances
124
125 To purge all EC2 resources that we manage::
126
127 $ automation.py purge-ec2-resources
@@ -0,0 +1,70 b''
1 #!/usr/bin/env python3
2 #
3 # automation.py - Perform tasks on remote machines
4 #
5 # Copyright 2019 Gregory Szorc <gregory.szorc@gmail.com>
6 #
7 # This software may be used and distributed according to the terms of the
8 # GNU General Public License version 2 or any later version.
9
10 import os
11 import pathlib
12 import subprocess
13 import sys
14 import venv
15
16
17 HERE = pathlib.Path(os.path.abspath(__file__)).parent
18 REQUIREMENTS_TXT = HERE / 'requirements.txt'
19 SOURCE_DIR = HERE.parent.parent
20 VENV = SOURCE_DIR / 'build' / 'venv-automation'
21
22
23 def bootstrap():
24 venv_created = not VENV.exists()
25
26 VENV.parent.mkdir(exist_ok=True)
27
28 venv.create(VENV, with_pip=True)
29
30 if os.name == 'nt':
31 venv_bin = VENV / 'Scripts'
32 pip = venv_bin / 'pip.exe'
33 python = venv_bin / 'python.exe'
34 else:
35 venv_bin = VENV / 'bin'
36 pip = venv_bin / 'pip'
37 python = venv_bin / 'python'
38
39 args = [str(pip), 'install', '-r', str(REQUIREMENTS_TXT),
40 '--disable-pip-version-check']
41
42 if not venv_created:
43 args.append('-q')
44
45 subprocess.run(args, check=True)
46
47 os.environ['HGAUTOMATION_BOOTSTRAPPED'] = '1'
48 os.environ['PATH'] = '%s%s%s' % (
49 venv_bin, os.pathsep, os.environ['PATH'])
50
51 subprocess.run([str(python), __file__] + sys.argv[1:], check=True)
52
53
54 def run():
55 import hgautomation.cli as cli
56
57 # Need to strip off main Python executable.
58 cli.main()
59
60
61 if __name__ == '__main__':
62 try:
63 if 'HGAUTOMATION_BOOTSTRAPPED' not in os.environ:
64 bootstrap()
65 else:
66 run()
67 except subprocess.CalledProcessError as e:
68 sys.exit(e.returncode)
69 except KeyboardInterrupt:
70 sys.exit(1)
@@ -0,0 +1,59 b''
1 # __init__.py - High-level automation interfaces
2 #
3 # Copyright 2019 Gregory Szorc <gregory.szorc@gmail.com>
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7
8 # no-check-code because Python 3 native.
9
10 import pathlib
11 import secrets
12
13 from .aws import (
14 AWSConnection,
15 )
16
17
18 class HGAutomation:
19 """High-level interface for Mercurial automation.
20
21 Holds global state, provides access to other primitives, etc.
22 """
23
24 def __init__(self, state_path: pathlib.Path):
25 self.state_path = state_path
26
27 state_path.mkdir(exist_ok=True)
28
29 def default_password(self):
30 """Obtain the default password to use for remote machines.
31
32 A new password will be generated if one is not stored.
33 """
34 p = self.state_path / 'default-password'
35
36 try:
37 with p.open('r', encoding='ascii') as fh:
38 data = fh.read().strip()
39
40 if data:
41 return data
42
43 except FileNotFoundError:
44 pass
45
46 password = secrets.token_urlsafe(24)
47
48 with p.open('w', encoding='ascii') as fh:
49 fh.write(password)
50 fh.write('\n')
51
52 p.chmod(0o0600)
53
54 return password
55
56 def aws_connection(self, region: str):
57 """Obtain an AWSConnection instance bound to a specific region."""
58
59 return AWSConnection(self, region)
@@ -0,0 +1,879 b''
1 # aws.py - Automation code for Amazon Web Services
2 #
3 # Copyright 2019 Gregory Szorc <gregory.szorc@gmail.com>
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7
8 # no-check-code because Python 3 native.
9
10 import contextlib
11 import copy
12 import hashlib
13 import json
14 import os
15 import pathlib
16 import subprocess
17 import time
18
19 import boto3
20 import botocore.exceptions
21
22 from .winrm import (
23 run_powershell,
24 wait_for_winrm,
25 )
26
27
28 SOURCE_ROOT = pathlib.Path(os.path.abspath(__file__)).parent.parent.parent.parent
29
30 INSTALL_WINDOWS_DEPENDENCIES = (SOURCE_ROOT / 'contrib' /
31 'install-windows-dependencies.ps1')
32
33
34 KEY_PAIRS = {
35 'automation',
36 }
37
38
39 SECURITY_GROUPS = {
40 'windows-dev-1': {
41 'description': 'Mercurial Windows instances that perform build automation',
42 'ingress': [
43 {
44 'FromPort': 22,
45 'ToPort': 22,
46 'IpProtocol': 'tcp',
47 'IpRanges': [
48 {
49 'CidrIp': '0.0.0.0/0',
50 'Description': 'SSH from entire Internet',
51 },
52 ],
53 },
54 {
55 'FromPort': 3389,
56 'ToPort': 3389,
57 'IpProtocol': 'tcp',
58 'IpRanges': [
59 {
60 'CidrIp': '0.0.0.0/0',
61 'Description': 'RDP from entire Internet',
62 },
63 ],
64
65 },
66 {
67 'FromPort': 5985,
68 'ToPort': 5986,
69 'IpProtocol': 'tcp',
70 'IpRanges': [
71 {
72 'CidrIp': '0.0.0.0/0',
73 'Description': 'PowerShell Remoting (Windows Remote Management)',
74 },
75 ],
76 }
77 ],
78 },
79 }
80
81
82 IAM_ROLES = {
83 'ephemeral-ec2-role-1': {
84 'description': 'Mercurial temporary EC2 instances',
85 'policy_arns': [
86 'arn:aws:iam::aws:policy/service-role/AmazonEC2RoleforSSM',
87 ],
88 },
89 }
90
91
92 ASSUME_ROLE_POLICY_DOCUMENT = '''
93 {
94 "Version": "2012-10-17",
95 "Statement": [
96 {
97 "Effect": "Allow",
98 "Principal": {
99 "Service": "ec2.amazonaws.com"
100 },
101 "Action": "sts:AssumeRole"
102 }
103 ]
104 }
105 '''.strip()
106
107
108 IAM_INSTANCE_PROFILES = {
109 'ephemeral-ec2-1': {
110 'roles': [
111 'ephemeral-ec2-role-1',
112 ],
113 }
114 }
115
116
117 # User Data for Windows EC2 instance. Mainly used to set the password
118 # and configure WinRM.
119 # Inspired by the User Data script used by Packer
120 # (from https://www.packer.io/intro/getting-started/build-image.html).
121 WINDOWS_USER_DATA = r'''
122 <powershell>
123
124 # TODO enable this once we figure out what is failing.
125 #$ErrorActionPreference = "stop"
126
127 # Set administrator password
128 net user Administrator "%s"
129 wmic useraccount where "name='Administrator'" set PasswordExpires=FALSE
130
131 # First, make sure WinRM can't be connected to
132 netsh advfirewall firewall set rule name="Windows Remote Management (HTTP-In)" new enable=yes action=block
133
134 # Delete any existing WinRM listeners
135 winrm delete winrm/config/listener?Address=*+Transport=HTTP 2>$Null
136 winrm delete winrm/config/listener?Address=*+Transport=HTTPS 2>$Null
137
138 # Create a new WinRM listener and configure
139 winrm create winrm/config/listener?Address=*+Transport=HTTP
140 winrm set winrm/config/winrs '@{MaxMemoryPerShellMB="0"}'
141 winrm set winrm/config '@{MaxTimeoutms="7200000"}'
142 winrm set winrm/config/service '@{AllowUnencrypted="true"}'
143 winrm set winrm/config/service '@{MaxConcurrentOperationsPerUser="12000"}'
144 winrm set winrm/config/service/auth '@{Basic="true"}'
145 winrm set winrm/config/client/auth '@{Basic="true"}'
146
147 # Configure UAC to allow privilege elevation in remote shells
148 $Key = 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System'
149 $Setting = 'LocalAccountTokenFilterPolicy'
150 Set-ItemProperty -Path $Key -Name $Setting -Value 1 -Force
151
152 # Configure and restart the WinRM Service; Enable the required firewall exception
153 Stop-Service -Name WinRM
154 Set-Service -Name WinRM -StartupType Automatic
155 netsh advfirewall firewall set rule name="Windows Remote Management (HTTP-In)" new action=allow localip=any remoteip=any
156 Start-Service -Name WinRM
157
158 # Disable firewall on private network interfaces so prompts don't appear.
159 Set-NetFirewallProfile -Name private -Enabled false
160 </powershell>
161 '''.lstrip()
162
163
164 WINDOWS_BOOTSTRAP_POWERSHELL = '''
165 Write-Output "installing PowerShell dependencies"
166 Install-PackageProvider -Name NuGet -MinimumVersion 2.8.5.201 -Force
167 Set-PSRepository -Name PSGallery -InstallationPolicy Trusted
168 Install-Module -Name OpenSSHUtils -RequiredVersion 0.0.2.0
169
170 Write-Output "installing OpenSSH server"
171 Add-WindowsCapability -Online -Name OpenSSH.Server~~~~0.0.1.0
172 # Various tools will attempt to use older versions of .NET. So we enable
173 # the feature that provides them so it doesn't have to be auto-enabled
174 # later.
175 Write-Output "enabling .NET Framework feature"
176 Install-WindowsFeature -Name Net-Framework-Core
177 '''
178
179
180 class AWSConnection:
181 """Manages the state of a connection with AWS."""
182
183 def __init__(self, automation, region: str):
184 self.automation = automation
185 self.local_state_path = automation.state_path
186
187 self.prefix = 'hg-'
188
189 self.session = boto3.session.Session(region_name=region)
190 self.ec2client = self.session.client('ec2')
191 self.ec2resource = self.session.resource('ec2')
192 self.iamclient = self.session.client('iam')
193 self.iamresource = self.session.resource('iam')
194
195 ensure_key_pairs(automation.state_path, self.ec2resource)
196
197 self.security_groups = ensure_security_groups(self.ec2resource)
198 ensure_iam_state(self.iamresource)
199
200 def key_pair_path_private(self, name):
201 """Path to a key pair private key file."""
202 return self.local_state_path / 'keys' / ('keypair-%s' % name)
203
204 def key_pair_path_public(self, name):
205 return self.local_state_path / 'keys' / ('keypair-%s.pub' % name)
206
207
208 def rsa_key_fingerprint(p: pathlib.Path):
209 """Compute the fingerprint of an RSA private key."""
210
211 # TODO use rsa package.
212 res = subprocess.run(
213 ['openssl', 'pkcs8', '-in', str(p), '-nocrypt', '-topk8',
214 '-outform', 'DER'],
215 capture_output=True,
216 check=True)
217
218 sha1 = hashlib.sha1(res.stdout).hexdigest()
219 return ':'.join(a + b for a, b in zip(sha1[::2], sha1[1::2]))
220
221
222 def ensure_key_pairs(state_path: pathlib.Path, ec2resource, prefix='hg-'):
223 remote_existing = {}
224
225 for kpi in ec2resource.key_pairs.all():
226 if kpi.name.startswith(prefix):
227 remote_existing[kpi.name[len(prefix):]] = kpi.key_fingerprint
228
229 # Validate that we have these keys locally.
230 key_path = state_path / 'keys'
231 key_path.mkdir(exist_ok=True, mode=0o700)
232
233 def remove_remote(name):
234 print('deleting key pair %s' % name)
235 key = ec2resource.KeyPair(name)
236 key.delete()
237
238 def remove_local(name):
239 pub_full = key_path / ('keypair-%s.pub' % name)
240 priv_full = key_path / ('keypair-%s' % name)
241
242 print('removing %s' % pub_full)
243 pub_full.unlink()
244 print('removing %s' % priv_full)
245 priv_full.unlink()
246
247 local_existing = {}
248
249 for f in sorted(os.listdir(key_path)):
250 if not f.startswith('keypair-') or not f.endswith('.pub'):
251 continue
252
253 name = f[len('keypair-'):-len('.pub')]
254
255 pub_full = key_path / f
256 priv_full = key_path / ('keypair-%s' % name)
257
258 with open(pub_full, 'r', encoding='ascii') as fh:
259 data = fh.read()
260
261 if not data.startswith('ssh-rsa '):
262 print('unexpected format for key pair file: %s; removing' %
263 pub_full)
264 pub_full.unlink()
265 priv_full.unlink()
266 continue
267
268 local_existing[name] = rsa_key_fingerprint(priv_full)
269
270 for name in sorted(set(remote_existing) | set(local_existing)):
271 if name not in local_existing:
272 actual = '%s%s' % (prefix, name)
273 print('remote key %s does not exist locally' % name)
274 remove_remote(actual)
275 del remote_existing[name]
276
277 elif name not in remote_existing:
278 print('local key %s does not exist remotely' % name)
279 remove_local(name)
280 del local_existing[name]
281
282 elif remote_existing[name] != local_existing[name]:
283 print('key fingerprint mismatch for %s; '
284 'removing from local and remote' % name)
285 remove_local(name)
286 remove_remote('%s%s' % (prefix, name))
287 del local_existing[name]
288 del remote_existing[name]
289
290 missing = KEY_PAIRS - set(remote_existing)
291
292 for name in sorted(missing):
293 actual = '%s%s' % (prefix, name)
294 print('creating key pair %s' % actual)
295
296 priv_full = key_path / ('keypair-%s' % name)
297 pub_full = key_path / ('keypair-%s.pub' % name)
298
299 kp = ec2resource.create_key_pair(KeyName=actual)
300
301 with priv_full.open('w', encoding='ascii') as fh:
302 fh.write(kp.key_material)
303 fh.write('\n')
304
305 priv_full.chmod(0o0600)
306
307 # SSH public key can be extracted via `ssh-keygen`.
308 with pub_full.open('w', encoding='ascii') as fh:
309 subprocess.run(
310 ['ssh-keygen', '-y', '-f', str(priv_full)],
311 stdout=fh,
312 check=True)
313
314 pub_full.chmod(0o0600)
315
316
317 def delete_instance_profile(profile):
318 for role in profile.roles:
319 print('removing role %s from instance profile %s' % (role.name,
320 profile.name))
321 profile.remove_role(RoleName=role.name)
322
323 print('deleting instance profile %s' % profile.name)
324 profile.delete()
325
326
327 def ensure_iam_state(iamresource, prefix='hg-'):
328 """Ensure IAM state is in sync with our canonical definition."""
329
330 remote_profiles = {}
331
332 for profile in iamresource.instance_profiles.all():
333 if profile.name.startswith(prefix):
334 remote_profiles[profile.name[len(prefix):]] = profile
335
336 for name in sorted(set(remote_profiles) - set(IAM_INSTANCE_PROFILES)):
337 delete_instance_profile(remote_profiles[name])
338 del remote_profiles[name]
339
340 remote_roles = {}
341
342 for role in iamresource.roles.all():
343 if role.name.startswith(prefix):
344 remote_roles[role.name[len(prefix):]] = role
345
346 for name in sorted(set(remote_roles) - set(IAM_ROLES)):
347 role = remote_roles[name]
348
349 print('removing role %s' % role.name)
350 role.delete()
351 del remote_roles[name]
352
353 # We've purged remote state that doesn't belong. Create missing
354 # instance profiles and roles.
355 for name in sorted(set(IAM_INSTANCE_PROFILES) - set(remote_profiles)):
356 actual = '%s%s' % (prefix, name)
357 print('creating IAM instance profile %s' % actual)
358
359 profile = iamresource.create_instance_profile(
360 InstanceProfileName=actual)
361 remote_profiles[name] = profile
362
363 for name in sorted(set(IAM_ROLES) - set(remote_roles)):
364 entry = IAM_ROLES[name]
365
366 actual = '%s%s' % (prefix, name)
367 print('creating IAM role %s' % actual)
368
369 role = iamresource.create_role(
370 RoleName=actual,
371 Description=entry['description'],
372 AssumeRolePolicyDocument=ASSUME_ROLE_POLICY_DOCUMENT,
373 )
374
375 remote_roles[name] = role
376
377 for arn in entry['policy_arns']:
378 print('attaching policy %s to %s' % (arn, role.name))
379 role.attach_policy(PolicyArn=arn)
380
381 # Now reconcile state of profiles.
382 for name, meta in sorted(IAM_INSTANCE_PROFILES.items()):
383 profile = remote_profiles[name]
384 wanted = {'%s%s' % (prefix, role) for role in meta['roles']}
385 have = {role.name for role in profile.roles}
386
387 for role in sorted(have - wanted):
388 print('removing role %s from %s' % (role, profile.name))
389 profile.remove_role(RoleName=role)
390
391 for role in sorted(wanted - have):
392 print('adding role %s to %s' % (role, profile.name))
393 profile.add_role(RoleName=role)
394
395
396 def find_windows_server_2019_image(ec2resource):
397 """Find the Amazon published Windows Server 2019 base image."""
398
399 images = ec2resource.images.filter(
400 Filters=[
401 {
402 'Name': 'owner-alias',
403 'Values': ['amazon'],
404 },
405 {
406 'Name': 'state',
407 'Values': ['available'],
408 },
409 {
410 'Name': 'image-type',
411 'Values': ['machine'],
412 },
413 {
414 'Name': 'name',
415 'Values': ['Windows_Server-2019-English-Full-Base-2019.02.13'],
416 },
417 ])
418
419 for image in images:
420 return image
421
422 raise Exception('unable to find Windows Server 2019 image')
423
424
425 def ensure_security_groups(ec2resource, prefix='hg-'):
426 """Ensure all necessary Mercurial security groups are present.
427
428 All security groups are prefixed with ``hg-`` by default. Any security
429 groups having this prefix but aren't in our list are deleted.
430 """
431 existing = {}
432
433 for group in ec2resource.security_groups.all():
434 if group.group_name.startswith(prefix):
435 existing[group.group_name[len(prefix):]] = group
436
437 purge = set(existing) - set(SECURITY_GROUPS)
438
439 for name in sorted(purge):
440 group = existing[name]
441 print('removing legacy security group: %s' % group.group_name)
442 group.delete()
443
444 security_groups = {}
445
446 for name, group in sorted(SECURITY_GROUPS.items()):
447 if name in existing:
448 security_groups[name] = existing[name]
449 continue
450
451 actual = '%s%s' % (prefix, name)
452 print('adding security group %s' % actual)
453
454 group_res = ec2resource.create_security_group(
455 Description=group['description'],
456 GroupName=actual,
457 )
458
459 group_res.authorize_ingress(
460 IpPermissions=group['ingress'],
461 )
462
463 security_groups[name] = group_res
464
465 return security_groups
466
467
468 def terminate_ec2_instances(ec2resource, prefix='hg-'):
469 """Terminate all EC2 instances managed by us."""
470 waiting = []
471
472 for instance in ec2resource.instances.all():
473 if instance.state['Name'] == 'terminated':
474 continue
475
476 for tag in instance.tags or []:
477 if tag['Key'] == 'Name' and tag['Value'].startswith(prefix):
478 print('terminating %s' % instance.id)
479 instance.terminate()
480 waiting.append(instance)
481
482 for instance in waiting:
483 instance.wait_until_terminated()
484
485
486 def remove_resources(c, prefix='hg-'):
487 """Purge all of our resources in this EC2 region."""
488 ec2resource = c.ec2resource
489 iamresource = c.iamresource
490
491 terminate_ec2_instances(ec2resource, prefix=prefix)
492
493 for image in ec2resource.images.all():
494 if image.name.startswith(prefix):
495 remove_ami(ec2resource, image)
496
497 for group in ec2resource.security_groups.all():
498 if group.group_name.startswith(prefix):
499 print('removing security group %s' % group.group_name)
500 group.delete()
501
502 for profile in iamresource.instance_profiles.all():
503 if profile.name.startswith(prefix):
504 delete_instance_profile(profile)
505
506 for role in iamresource.roles.all():
507 if role.name.startswith(prefix):
508 print('removing role %s' % role.name)
509 role.delete()
510
511
512 def wait_for_ip_addresses(instances):
513 """Wait for the public IP addresses of an iterable of instances."""
514 for instance in instances:
515 while True:
516 if not instance.public_ip_address:
517 time.sleep(2)
518 instance.reload()
519 continue
520
521 print('public IP address for %s: %s' % (
522 instance.id, instance.public_ip_address))
523 break
524
525
526 def remove_ami(ec2resource, image):
527 """Remove an AMI and its underlying snapshots."""
528 snapshots = []
529
530 for device in image.block_device_mappings:
531 if 'Ebs' in device:
532 snapshots.append(ec2resource.Snapshot(device['Ebs']['SnapshotId']))
533
534 print('deregistering %s' % image.id)
535 image.deregister()
536
537 for snapshot in snapshots:
538 print('deleting snapshot %s' % snapshot.id)
539 snapshot.delete()
540
541
542 def wait_for_ssm(ssmclient, instances):
543 """Wait for SSM to come online for a list of instances."""
544 while True:
545 res = ssmclient.describe_instance_information(
546 Filters=[
547 {
548 'Key': 'InstanceIds',
549 'Values': [i.id for i in instances],
550 },
551 ],
552 )
553
554 available = len(res['InstanceInformationList'])
555 wanted = len(instances)
556
557 print('%d/%d instances available in SSM' % (available, wanted))
558
559 if available == wanted:
560 return
561
562 time.sleep(2)
563
564
565 def run_ssm_command(ssmclient, instances, document_name, parameters):
566 """Run a PowerShell script on an EC2 instance."""
567
568 res = ssmclient.send_command(
569 InstanceIds=[i.id for i in instances],
570 DocumentName=document_name,
571 Parameters=parameters,
572 CloudWatchOutputConfig={
573 'CloudWatchOutputEnabled': True,
574 },
575 )
576
577 command_id = res['Command']['CommandId']
578
579 for instance in instances:
580 while True:
581 try:
582 res = ssmclient.get_command_invocation(
583 CommandId=command_id,
584 InstanceId=instance.id,
585 )
586 except botocore.exceptions.ClientError as e:
587 if e.response['Error']['Code'] == 'InvocationDoesNotExist':
588 print('could not find SSM command invocation; waiting')
589 time.sleep(1)
590 continue
591 else:
592 raise
593
594 if res['Status'] == 'Success':
595 break
596 elif res['Status'] in ('Pending', 'InProgress', 'Delayed'):
597 time.sleep(2)
598 else:
599 raise Exception('command failed on %s: %s' % (
600 instance.id, res['Status']))
601
602
603 @contextlib.contextmanager
604 def temporary_ec2_instances(ec2resource, config):
605 """Create temporary EC2 instances.
606
607 This is a proxy to ``ec2resource.create_instances(**config)`` that takes care of
608 managing the lifecycle of the instances.
609
610 When the context manager exits, the instances are terminated.
611
612 The context manager evaluates to the list of data structures
613 describing each created instance. The instances may not be available
614 for work immediately: it is up to the caller to wait for the instance
615 to start responding.
616 """
617
618 ids = None
619
620 try:
621 res = ec2resource.create_instances(**config)
622
623 ids = [i.id for i in res]
624 print('started instances: %s' % ' '.join(ids))
625
626 yield res
627 finally:
628 if ids:
629 print('terminating instances: %s' % ' '.join(ids))
630 for instance in res:
631 instance.terminate()
632 print('terminated %d instances' % len(ids))
633
634
635 @contextlib.contextmanager
636 def create_temp_windows_ec2_instances(c: AWSConnection, config):
637 """Create temporary Windows EC2 instances.
638
639 This is a higher-level wrapper around ``temporary_ec2_instances()`` that
640 configures the Windows instance for Windows Remote Management. The emitted
641 instances will have a ``winrm_client`` attribute containing a
642 ``pypsrp.client.Client`` instance bound to the instance.
643 """
644 if 'IamInstanceProfile' in config:
645 raise ValueError('IamInstanceProfile cannot be provided in config')
646 if 'UserData' in config:
647 raise ValueError('UserData cannot be provided in config')
648
649 password = c.automation.default_password()
650
651 config = copy.deepcopy(config)
652 config['IamInstanceProfile'] = {
653 'Name': 'hg-ephemeral-ec2-1',
654 }
655 config.setdefault('TagSpecifications', []).append({
656 'ResourceType': 'instance',
657 'Tags': [{'Key': 'Name', 'Value': 'hg-temp-windows'}],
658 })
659 config['UserData'] = WINDOWS_USER_DATA % password
660
661 with temporary_ec2_instances(c.ec2resource, config) as instances:
662 wait_for_ip_addresses(instances)
663
664 print('waiting for Windows Remote Management service...')
665
666 for instance in instances:
667 client = wait_for_winrm(instance.public_ip_address, 'Administrator', password)
668 print('established WinRM connection to %s' % instance.id)
669 instance.winrm_client = client
670
671 yield instances
672
673
674 def ensure_windows_dev_ami(c: AWSConnection, prefix='hg-'):
675 """Ensure Windows Development AMI is available and up-to-date.
676
677 If necessary, a modern AMI will be built by starting a temporary EC2
678 instance and bootstrapping it.
679
680 Obsolete AMIs will be deleted so there is only a single AMI having the
681 desired name.
682
683 Returns an ``ec2.Image`` of either an existing AMI or a newly-built
684 one.
685 """
686 ec2client = c.ec2client
687 ec2resource = c.ec2resource
688 ssmclient = c.session.client('ssm')
689
690 name = '%s%s' % (prefix, 'windows-dev')
691
692 config = {
693 'BlockDeviceMappings': [
694 {
695 'DeviceName': '/dev/sda1',
696 'Ebs': {
697 'DeleteOnTermination': True,
698 'VolumeSize': 32,
699 'VolumeType': 'gp2',
700 },
701 }
702 ],
703 'ImageId': find_windows_server_2019_image(ec2resource).id,
704 'InstanceInitiatedShutdownBehavior': 'stop',
705 'InstanceType': 't3.medium',
706 'KeyName': '%sautomation' % prefix,
707 'MaxCount': 1,
708 'MinCount': 1,
709 'SecurityGroupIds': [c.security_groups['windows-dev-1'].id],
710 }
711
712 commands = [
713 # Need to start the service so sshd_config is generated.
714 'Start-Service sshd',
715 'Write-Output "modifying sshd_config"',
716 r'$content = Get-Content C:\ProgramData\ssh\sshd_config',
717 '$content = $content -replace "Match Group administrators","" -replace "AuthorizedKeysFile __PROGRAMDATA__/ssh/administrators_authorized_keys",""',
718 r'$content | Set-Content C:\ProgramData\ssh\sshd_config',
719 'Import-Module OpenSSHUtils',
720 r'Repair-SshdConfigPermission C:\ProgramData\ssh\sshd_config -Confirm:$false',
721 'Restart-Service sshd',
722 'Write-Output "installing OpenSSH client"',
723 'Add-WindowsCapability -Online -Name OpenSSH.Client~~~~0.0.1.0',
724 'Set-Service -Name sshd -StartupType "Automatic"',
725 'Write-Output "OpenSSH server running"',
726 ]
727
728 with INSTALL_WINDOWS_DEPENDENCIES.open('r', encoding='utf-8') as fh:
729 commands.extend(l.rstrip() for l in fh)
730
731 # Disable Windows Defender when bootstrapping because it just slows
732 # things down.
733 commands.insert(0, 'Set-MpPreference -DisableRealtimeMonitoring $true')
734 commands.append('Set-MpPreference -DisableRealtimeMonitoring $false')
735
736 # Compute a deterministic fingerprint to determine whether image needs
737 # to be regenerated.
738 fingerprint = {
739 'instance_config': config,
740 'user_data': WINDOWS_USER_DATA,
741 'initial_bootstrap': WINDOWS_BOOTSTRAP_POWERSHELL,
742 'bootstrap_commands': commands,
743 }
744
745 fingerprint = json.dumps(fingerprint, sort_keys=True)
746 fingerprint = hashlib.sha256(fingerprint.encode('utf-8')).hexdigest()
747
748 # Find existing AMIs with this name and delete the ones that are invalid.
749 # Store a reference to a good image so it can be returned once the
750 # image state is reconciled.
751 images = ec2resource.images.filter(
752 Filters=[{'Name': 'name', 'Values': [name]}])
753
754 existing_image = None
755
756 for image in images:
757 if image.tags is None:
758 print('image %s for %s lacks required tags; removing' % (
759 image.id, image.name))
760 remove_ami(ec2resource, image)
761 else:
762 tags = {t['Key']: t['Value'] for t in image.tags}
763
764 if tags.get('HGIMAGEFINGERPRINT') == fingerprint:
765 existing_image = image
766 else:
767 print('image %s for %s has wrong fingerprint; removing' % (
768 image.id, image.name))
769 remove_ami(ec2resource, image)
770
771 if existing_image:
772 return existing_image
773
774 print('no suitable Windows development image found; creating one...')
775
776 with create_temp_windows_ec2_instances(c, config) as instances:
777 assert len(instances) == 1
778 instance = instances[0]
779
780 wait_for_ssm(ssmclient, [instance])
781
782 # On first boot, install various Windows updates.
783 # We would ideally use PowerShell Remoting for this. However, there are
784 # trust issues that make it difficult to invoke Windows Update
785 # remotely. So we use SSM, which has a mechanism for running Windows
786 # Update.
787 print('installing Windows features...')
788 run_ssm_command(
789 ssmclient,
790 [instance],
791 'AWS-RunPowerShellScript',
792 {
793 'commands': WINDOWS_BOOTSTRAP_POWERSHELL.split('\n'),
794 },
795 )
796
797 # Reboot so all updates are fully applied.
798 print('rebooting instance %s' % instance.id)
799 ec2client.reboot_instances(InstanceIds=[instance.id])
800
801 time.sleep(15)
802
803 print('waiting for Windows Remote Management to come back...')
804 client = wait_for_winrm(instance.public_ip_address, 'Administrator',
805 c.automation.default_password())
806 print('established WinRM connection to %s' % instance.id)
807 instance.winrm_client = client
808
809 print('bootstrapping instance...')
810 run_powershell(instance.winrm_client, '\n'.join(commands))
811
812 print('bootstrap completed; stopping %s to create image' % instance.id)
813 instance.stop()
814
815 ec2client.get_waiter('instance_stopped').wait(
816 InstanceIds=[instance.id],
817 WaiterConfig={
818 'Delay': 5,
819 })
820 print('%s is stopped' % instance.id)
821
822 image = instance.create_image(
823 Name=name,
824 Description='Mercurial Windows development environment',
825 )
826
827 image.create_tags(Tags=[
828 {
829 'Key': 'HGIMAGEFINGERPRINT',
830 'Value': fingerprint,
831 },
832 ])
833
834 print('waiting for image %s' % image.id)
835
836 ec2client.get_waiter('image_available').wait(
837 ImageIds=[image.id],
838 )
839
840 print('image %s available as %s' % (image.id, image.name))
841
842 return image
843
844
845 @contextlib.contextmanager
846 def temporary_windows_dev_instances(c: AWSConnection, image, instance_type,
847 prefix='hg-', disable_antivirus=False):
848 """Create a temporary Windows development EC2 instance.
849
850 Context manager resolves to the list of ``EC2.Instance`` that were created.
851 """
852 config = {
853 'BlockDeviceMappings': [
854 {
855 'DeviceName': '/dev/sda1',
856 'Ebs': {
857 'DeleteOnTermination': True,
858 'VolumeSize': 32,
859 'VolumeType': 'gp2',
860 },
861 }
862 ],
863 'ImageId': image.id,
864 'InstanceInitiatedShutdownBehavior': 'stop',
865 'InstanceType': instance_type,
866 'KeyName': '%sautomation' % prefix,
867 'MaxCount': 1,
868 'MinCount': 1,
869 'SecurityGroupIds': [c.security_groups['windows-dev-1'].id],
870 }
871
872 with create_temp_windows_ec2_instances(c, config) as instances:
873 if disable_antivirus:
874 for instance in instances:
875 run_powershell(
876 instance.winrm_client,
877 'Set-MpPreference -DisableRealtimeMonitoring $true')
878
879 yield instances
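
The image fingerprint computed in ``ensure_windows_dev_ami`` above depends on serializing the configuration with sorted keys before hashing, so two dictionaries with the same content always yield the same digest regardless of key order. A minimal standalone sketch of that idea follows. The name ``config_fingerprint`` and the sample dictionaries are hypothetical and are not part of the automation code.

```python
import hashlib
import json


def config_fingerprint(config: dict) -> str:
    """Return a stable SHA-256 hex digest for a JSON-serializable dict."""
    # sort_keys=True makes the serialized form independent of insertion order.
    serialized = json.dumps(config, sort_keys=True)
    return hashlib.sha256(serialized.encode('utf-8')).hexdigest()


# Same content, different key order (hypothetical sample data).
a = {'instance_config': {'MaxCount': 1, 'MinCount': 1},
     'bootstrap_commands': ['x']}
b = {'bootstrap_commands': ['x'],
     'instance_config': {'MinCount': 1, 'MaxCount': 1}}

# A changed bootstrap command should produce a different fingerprint.
c = dict(a, bootstrap_commands=['y'])
```

Because any change to the instance configuration or bootstrap commands changes the digest, comparing it against the ``HGIMAGEFINGERPRINT`` tag is enough to decide whether an existing AMI can be reused.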
@@ -0,0 +1,273 b''
1 # cli.py - Command line interface for automation
2 #
3 # Copyright 2019 Gregory Szorc <gregory.szorc@gmail.com>
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7
8 # no-check-code because Python 3 native.
9
10 import argparse
11 import os
12 import pathlib
13
14 from . import (
15 aws,
16 HGAutomation,
17 windows,
18 )
19
20
21 SOURCE_ROOT = pathlib.Path(os.path.abspath(__file__)).parent.parent.parent.parent
22 DIST_PATH = SOURCE_ROOT / 'dist'
23
24
25 def bootstrap_windows_dev(hga: HGAutomation, aws_region):
26 c = hga.aws_connection(aws_region)
27 image = aws.ensure_windows_dev_ami(c)
28 print('Windows development AMI available as %s' % image.id)
29
30
31 def build_inno(hga: HGAutomation, aws_region, arch, revision, version):
32 c = hga.aws_connection(aws_region)
33 image = aws.ensure_windows_dev_ami(c)
34 DIST_PATH.mkdir(exist_ok=True)
35
36 with aws.temporary_windows_dev_instances(c, image, 't3.medium') as insts:
37 instance = insts[0]
38
39 windows.synchronize_hg(SOURCE_ROOT, revision, instance)
40
41 for a in arch:
42 windows.build_inno_installer(instance.winrm_client, a,
43 DIST_PATH,
44 version=version)
45
46
47 def build_wix(hga: HGAutomation, aws_region, arch, revision, version):
48 c = hga.aws_connection(aws_region)
49 image = aws.ensure_windows_dev_ami(c)
50 DIST_PATH.mkdir(exist_ok=True)
51
52 with aws.temporary_windows_dev_instances(c, image, 't3.medium') as insts:
53 instance = insts[0]
54
55 windows.synchronize_hg(SOURCE_ROOT, revision, instance)
56
57 for a in arch:
58 windows.build_wix_installer(instance.winrm_client, a,
59 DIST_PATH, version=version)
60
61
62 def build_windows_wheel(hga: HGAutomation, aws_region, arch, revision):
63 c = hga.aws_connection(aws_region)
64 image = aws.ensure_windows_dev_ami(c)
65 DIST_PATH.mkdir(exist_ok=True)
66
67 with aws.temporary_windows_dev_instances(c, image, 't3.medium') as insts:
68 instance = insts[0]
69
70 windows.synchronize_hg(SOURCE_ROOT, revision, instance)
71
72 for a in arch:
73 windows.build_wheel(instance.winrm_client, a, DIST_PATH)
74
75
76 def build_all_windows_packages(hga: HGAutomation, aws_region, revision):
77 c = hga.aws_connection(aws_region)
78 image = aws.ensure_windows_dev_ami(c)
79 DIST_PATH.mkdir(exist_ok=True)
80
81 with aws.temporary_windows_dev_instances(c, image, 't3.medium') as insts:
82 instance = insts[0]
83
84 winrm_client = instance.winrm_client
85
86 windows.synchronize_hg(SOURCE_ROOT, revision, instance)
87
88 for arch in ('x86', 'x64'):
89 windows.purge_hg(winrm_client)
90 windows.build_wheel(winrm_client, arch, DIST_PATH)
91 windows.purge_hg(winrm_client)
92 windows.build_inno_installer(winrm_client, arch, DIST_PATH)
93 windows.purge_hg(winrm_client)
94 windows.build_wix_installer(winrm_client, arch, DIST_PATH)
95
96
97 def terminate_ec2_instances(hga: HGAutomation, aws_region):
98 c = hga.aws_connection(aws_region)
99 aws.terminate_ec2_instances(c.ec2resource)
100
101
102 def purge_ec2_resources(hga: HGAutomation, aws_region):
103 c = hga.aws_connection(aws_region)
104 aws.remove_resources(c)
105
106
107 def run_tests_windows(hga: HGAutomation, aws_region, instance_type,
108 python_version, arch, test_flags):
109 c = hga.aws_connection(aws_region)
110 image = aws.ensure_windows_dev_ami(c)
111
112 with aws.temporary_windows_dev_instances(c, image, instance_type,
113 disable_antivirus=True) as insts:
114 instance = insts[0]
115
116 windows.synchronize_hg(SOURCE_ROOT, '.', instance)
117 windows.run_tests(instance.winrm_client, python_version, arch,
118 test_flags)
119
120
121 def get_parser():
122 parser = argparse.ArgumentParser()
123
124 parser.add_argument(
125 '--state-path',
126 default='~/.hgautomation',
127 help='Path for local state files',
128 )
129 parser.add_argument(
130 '--aws-region',
131 help='AWS region to use',
132 default='us-west-1',
133 )
134
135 subparsers = parser.add_subparsers()
136
137 sp = subparsers.add_parser(
138 'bootstrap-windows-dev',
139 help='Bootstrap the Windows development environment',
140 )
141 sp.set_defaults(func=bootstrap_windows_dev)
142
143 sp = subparsers.add_parser(
144 'build-all-windows-packages',
145 help='Build all Windows packages',
146 )
147 sp.add_argument(
148 '--revision',
149 help='Mercurial revision to build',
150 default='.',
151 )
152 sp.set_defaults(func=build_all_windows_packages)
153
154 sp = subparsers.add_parser(
155 'build-inno',
156 help='Build Inno Setup installer(s)',
157 )
158 sp.add_argument(
159 '--arch',
160 help='Architecture to build for',
161 choices={'x86', 'x64'},
162 nargs='*',
163 default=['x64'],
164 )
165 sp.add_argument(
166 '--revision',
167 help='Mercurial revision to build',
168 default='.',
169 )
170 sp.add_argument(
171 '--version',
172 help='Mercurial version string to use in installer',
173 )
174 sp.set_defaults(func=build_inno)
175
176 sp = subparsers.add_parser(
177 'build-windows-wheel',
178 help='Build Windows wheel(s)',
179 )
180 sp.add_argument(
181 '--arch',
182 help='Architecture to build for',
183 choices={'x86', 'x64'},
184 nargs='*',
185 default=['x64'],
186 )
187 sp.add_argument(
188 '--revision',
189 help='Mercurial revision to build',
190 default='.',
191 )
192 sp.set_defaults(func=build_windows_wheel)
193
194 sp = subparsers.add_parser(
195 'build-wix',
196 help='Build WiX installer(s)'
197 )
198 sp.add_argument(
199 '--arch',
200 help='Architecture to build for',
201 choices={'x86', 'x64'},
202 nargs='*',
203 default=['x64'],
204 )
205 sp.add_argument(
206 '--revision',
207 help='Mercurial revision to build',
208 default='.',
209 )
210 sp.add_argument(
211 '--version',
212 help='Mercurial version string to use in installer',
213 )
214 sp.set_defaults(func=build_wix)
215
216 sp = subparsers.add_parser(
217 'terminate-ec2-instances',
218 help='Terminate all active EC2 instances managed by us',
219 )
220 sp.set_defaults(func=terminate_ec2_instances)
221
222 sp = subparsers.add_parser(
223 'purge-ec2-resources',
224 help='Purge all EC2 resources managed by us',
225 )
226 sp.set_defaults(func=purge_ec2_resources)
227
228 sp = subparsers.add_parser(
229 'run-tests-windows',
230 help='Run tests on Windows',
231 )
232 sp.add_argument(
233 '--instance-type',
234 help='EC2 instance type to use',
235 default='t3.medium',
236 )
237 sp.add_argument(
238 '--python-version',
239 help='Python version to use',
240 choices={'2.7', '3.5', '3.6', '3.7', '3.8'},
241 default='2.7',
242 )
243 sp.add_argument(
244 '--arch',
245 help='Architecture to test',
246 choices={'x86', 'x64'},
247 default='x64',
248 )
249 sp.add_argument(
250 '--test-flags',
251 help='Extra command line flags to pass to run-tests.py',
252 )
253 sp.set_defaults(func=run_tests_windows)
254
255 return parser
256
257
258 def main():
259 parser = get_parser()
260 args = parser.parse_args()
261
262 local_state_path = pathlib.Path(os.path.expanduser(args.state_path))
263 automation = HGAutomation(local_state_path)
264
265 if not hasattr(args, 'func'):
266 parser.print_help()
267 return
268
269 kwargs = dict(vars(args))
270 del kwargs['func']
271 del kwargs['state_path']
272
273 args.func(automation, **kwargs)
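
The dispatch pattern used by ``main()`` above (each subparser registers its handler with ``set_defaults(func=...)``, and the parsed namespace is turned into keyword arguments) can be sketched in isolation. Everything named here (``greet``, ``dispatch``, the ``--name`` and ``--shout`` options) is hypothetical and only illustrates the mechanism; the plain string passed as ``state`` stands in for the ``HGAutomation`` instance.

```python
import argparse


def greet(state, name, shout):
    """Hypothetical subcommand handler; ``state`` stands in for HGAutomation."""
    message = 'hello %s (state=%s)' % (name, state)
    return message.upper() if shout else message


def get_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument('--state-path', default='~/.example')
    subparsers = parser.add_subparsers()

    sp = subparsers.add_parser('greet', help='Print a greeting')
    sp.add_argument('--name', default='world')
    sp.add_argument('--shout', action='store_true')
    # The handler travels with the parsed namespace as ``args.func``.
    sp.set_defaults(func=greet)
    return parser


def dispatch(argv):
    parser = get_parser()
    args = parser.parse_args(argv)

    # No subcommand given: there is no ``func`` attribute, so show help.
    if not hasattr(args, 'func'):
        parser.print_help()
        return None

    # Remaining namespace entries become keyword arguments of the handler,
    # so option names must match the handler's parameter names.
    kwargs = dict(vars(args))
    func = kwargs.pop('func')
    state = kwargs.pop('state_path')
    return func(state, **kwargs)
```

One consequence of this design is that adding an option to a subparser without adding the matching parameter to its handler raises a ``TypeError`` at call time.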
@@ -0,0 +1,287 b''
1 # windows.py - Automation specific to Windows
2 #
3 # Copyright 2019 Gregory Szorc <gregory.szorc@gmail.com>
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7
8 # no-check-code because Python 3 native.
9
10 import os
11 import pathlib
12 import re
13 import subprocess
14 import tempfile
15
16 from .winrm import (
17 run_powershell,
18 )
19
20
21 # PowerShell commands to activate a Visual Studio 2008 environment.
22 # This is essentially a port of vcvarsall.bat to PowerShell.
23 ACTIVATE_VC9_AMD64 = r'''
24 Write-Output "activating Visual Studio 2008 environment for AMD64"
25 $root = "$env:LOCALAPPDATA\Programs\Common\Microsoft\Visual C++ for Python\9.0"
26 $Env:VCINSTALLDIR = "${root}\VC\"
27 $Env:WindowsSdkDir = "${root}\WinSDK\"
28 $Env:PATH = "${root}\VC\Bin\amd64;${root}\WinSDK\Bin\x64;${root}\WinSDK\Bin;$Env:PATH"
29 $Env:INCLUDE = "${root}\VC\Include;${root}\WinSDK\Include;$Env:INCLUDE"
30 $Env:LIB = "${root}\VC\Lib\amd64;${root}\WinSDK\Lib\x64;$Env:LIB"
31 $Env:LIBPATH = "${root}\VC\Lib\amd64;${root}\WinSDK\Lib\x64;$Env:LIBPATH"
32 '''.lstrip()
33
34 ACTIVATE_VC9_X86 = r'''
35 Write-Output "activating Visual Studio 2008 environment for x86"
36 $root = "$env:LOCALAPPDATA\Programs\Common\Microsoft\Visual C++ for Python\9.0"
37 $Env:VCINSTALLDIR = "${root}\VC\"
38 $Env:WindowsSdkDir = "${root}\WinSDK\"
39 $Env:PATH = "${root}\VC\Bin;${root}\WinSDK\Bin;$Env:PATH"
40 $Env:INCLUDE = "${root}\VC\Include;${root}\WinSDK\Include;$Env:INCLUDE"
41 $Env:LIB = "${root}\VC\Lib;${root}\WinSDK\Lib;$Env:LIB"
42 $Env:LIBPATH = "${root}\VC\lib;${root}\WinSDK\Lib;$Env:LIBPATH"
43 '''.lstrip()
44
45 HG_PURGE = r'''
46 $Env:PATH = "C:\hgdev\venv-bootstrap\Scripts;$Env:PATH"
47 Set-Location C:\hgdev\src
48 hg.exe --config extensions.purge= purge --all
49 if ($LASTEXITCODE -ne 0) {
50 throw "process exited non-0: $LASTEXITCODE"
51 }
52 Write-Output "purged Mercurial repo"
53 '''
54
55 HG_UPDATE_CLEAN = r'''
56 $Env:PATH = "C:\hgdev\venv-bootstrap\Scripts;$Env:PATH"
57 Set-Location C:\hgdev\src
58 hg.exe --config extensions.purge= purge --all
59 if ($LASTEXITCODE -ne 0) {{
60 throw "process exited non-0: $LASTEXITCODE"
61 }}
62 hg.exe update -C {revision}
63 if ($LASTEXITCODE -ne 0) {{
64 throw "process exited non-0: $LASTEXITCODE"
65 }}
66 hg.exe log -r .
67 Write-Output "updated Mercurial working directory to {revision}"
68 '''.lstrip()
69
70 BUILD_INNO = r'''
71 Set-Location C:\hgdev\src
72 $python = "C:\hgdev\python27-{arch}\python.exe"
73 C:\hgdev\python37-x64\python.exe contrib\packaging\inno\build.py --python $python {extra_args}
74 if ($LASTEXITCODE -ne 0) {{
75 throw "process exited non-0: $LASTEXITCODE"
76 }}
77 '''.lstrip()
78
79 BUILD_WHEEL = r'''
80 Set-Location C:\hgdev\src
81 C:\hgdev\python27-{arch}\Scripts\pip.exe wheel --wheel-dir dist .
82 if ($LASTEXITCODE -ne 0) {{
83 throw "process exited non-0: $LASTEXITCODE"
84 }}
85 '''
86
87 BUILD_WIX = r'''
88 Set-Location C:\hgdev\src
89 $python = "C:\hgdev\python27-{arch}\python.exe"
90 C:\hgdev\python37-x64\python.exe contrib\packaging\wix\build.py --python $python {extra_args}
91 if ($LASTEXITCODE -ne 0) {{
92 throw "process exited non-0: $LASTEXITCODE"
93 }}
94 '''
95
96 RUN_TESTS = r'''
97 C:\hgdev\MinGW\msys\1.0\bin\sh.exe --login -c "cd /c/hgdev/src/tests && /c/hgdev/{python_path}/python.exe run-tests.py {test_flags}"
98 if ($LASTEXITCODE -ne 0) {{
99 throw "process exited non-0: $LASTEXITCODE"
100 }}
101 '''
102
103
104 def get_vc_prefix(arch):
105 if arch == 'x86':
106 return ACTIVATE_VC9_X86
107 elif arch == 'x64':
108 return ACTIVATE_VC9_AMD64
109 else:
110 raise ValueError('illegal arch: %s; must be x86 or x64' % arch)
111
112
113 def fix_authorized_keys_permissions(winrm_client, path):
114 commands = [
115 '$ErrorActionPreference = "Stop"',
116 'Repair-AuthorizedKeyPermission -FilePath %s -Confirm:$false' % path,
117 r'icacls %s /remove:g "NT Service\sshd"' % path,
118 ]
119
120 run_powershell(winrm_client, '\n'.join(commands))
121
122
123 def synchronize_hg(hg_repo: pathlib.Path, revision: str, ec2_instance):
124 """Synchronize local Mercurial repo to remote EC2 instance."""
125
126 winrm_client = ec2_instance.winrm_client
127
128 with tempfile.TemporaryDirectory() as temp_dir:
129 temp_dir = pathlib.Path(temp_dir)
130
131 ssh_dir = temp_dir / '.ssh'
132 ssh_dir.mkdir()
133 ssh_dir.chmod(0o0700)
134
135 # Generate SSH key to use for communication.
136 subprocess.run([
137 'ssh-keygen', '-t', 'rsa', '-b', '4096', '-N', '',
138 '-f', str(ssh_dir / 'id_rsa')],
139 check=True, capture_output=True)
140
141 # Add it to ~/.ssh/authorized_keys on remote.
142 # This assumes the file doesn't already exist.
143 authorized_keys = r'c:\Users\Administrator\.ssh\authorized_keys'
144 winrm_client.execute_cmd(r'mkdir c:\Users\Administrator\.ssh')
145 winrm_client.copy(str(ssh_dir / 'id_rsa.pub'), authorized_keys)
146 fix_authorized_keys_permissions(winrm_client, authorized_keys)
147
148 public_ip = ec2_instance.public_ip_address
149
150 ssh_config = temp_dir / '.ssh' / 'config'
151
152 with open(ssh_config, 'w', encoding='utf-8') as fh:
153 fh.write('Host %s\n' % public_ip)
154 fh.write(' User Administrator\n')
155 fh.write(' StrictHostKeyChecking no\n')
156 fh.write(' UserKnownHostsFile %s\n' % (ssh_dir / 'known_hosts'))
157 fh.write(' IdentityFile %s\n' % (ssh_dir / 'id_rsa'))
158
159 env = dict(os.environ)
160 env['HGPLAIN'] = '1'
161 env['HGENCODING'] = 'utf-8'
162
163 hg_bin = hg_repo / 'hg'
164
165 res = subprocess.run(
166 ['python2.7', str(hg_bin), 'log', '-r', revision, '-T', '{node}'],
167 cwd=str(hg_repo), env=env, check=True, capture_output=True)
168
169 full_revision = res.stdout.decode('ascii')
170
171 args = [
172 'python2.7', str(hg_bin),
173 '--config', 'ui.ssh=ssh -F %s' % ssh_config,
174 '--config', 'ui.remotecmd=c:/hgdev/venv-bootstrap/Scripts/hg.exe',
175 'push', '-r', full_revision, 'ssh://%s/c:/hgdev/src' % public_ip,
176 ]
177
178 subprocess.run(args, cwd=str(hg_repo), env=env, check=True)
179
180 run_powershell(winrm_client,
181 HG_UPDATE_CLEAN.format(revision=full_revision))
182
183 # TODO detect dirty local working directory and synchronize accordingly.
184
185
186 def purge_hg(winrm_client):
187 """Purge the Mercurial source repository on an EC2 instance."""
188 run_powershell(winrm_client, HG_PURGE)
189
190
191 def find_latest_dist(winrm_client, pattern):
192 """Find path to newest file in dist/ directory matching a pattern."""
193
194 res = winrm_client.execute_ps(
195 r'$v = Get-ChildItem -Path C:\hgdev\src\dist -Filter "%s" '
196 '| Sort-Object LastWriteTime -Descending '
197 '| Select-Object -First 1\n'
198 '$v.name' % pattern
199 )
200 return res[0]
201
202
203 def copy_latest_dist(winrm_client, pattern, dest_path):
204 """Copy latest file matching pattern in dist/ directory.
205
206 Given a WinRM client and a file pattern, find the latest file on the remote
207 matching that pattern and copy it to the ``dest_path`` directory on the
208 local machine.
209 """
210 latest = find_latest_dist(winrm_client, pattern)
211 source = r'C:\hgdev\src\dist\%s' % latest
212 dest = dest_path / latest
213 print('copying %s to %s' % (source, dest))
214 winrm_client.fetch(source, str(dest))
215
216
217 def build_inno_installer(winrm_client, arch: str, dest_path: pathlib.Path,
218 version=None):
219 """Build the Inno Setup installer on a remote machine.
220
221 Using a WinRM client, remote commands are executed to build
222 a Mercurial Inno Setup installer.
223 """
224 print('building Inno Setup installer for %s' % arch)
225
226 extra_args = []
227 if version:
228 extra_args.extend(['--version', version])
229
230 ps = get_vc_prefix(arch) + BUILD_INNO.format(arch=arch,
231 extra_args=' '.join(extra_args))
232 run_powershell(winrm_client, ps)
233 copy_latest_dist(winrm_client, '*.exe', dest_path)
234
235
236 def build_wheel(winrm_client, arch: str, dest_path: pathlib.Path):
237 """Build Python wheels on a remote machine.
238
239 Using a WinRM client, remote commands are executed to build a Python wheel
240 for Mercurial.
241 """
242 print('Building Windows wheel for %s' % arch)
243 ps = get_vc_prefix(arch) + BUILD_WHEEL.format(arch=arch)
244 run_powershell(winrm_client, ps)
245 copy_latest_dist(winrm_client, '*.whl', dest_path)
246
247
248 def build_wix_installer(winrm_client, arch: str, dest_path: pathlib.Path,
249 version=None):
250 """Build the WiX installer on a remote machine.
251
252 Using a WinRM client, remote commands are executed to build a WiX installer.
253 """
254 print('Building WiX installer for %s' % arch)
255 extra_args = []
256 if version:
257 extra_args.extend(['--version', version])
258
259 ps = get_vc_prefix(arch) + BUILD_WIX.format(arch=arch,
260 extra_args=' '.join(extra_args))
261 run_powershell(winrm_client, ps)
262 copy_latest_dist(winrm_client, '*.msi', dest_path)
263
264
265 def run_tests(winrm_client, python_version, arch, test_flags=''):
266 """Run tests on a remote Windows machine.
267
268 ``python_version`` is a ``X.Y`` string like ``2.7`` or ``3.7``.
269 ``arch`` is ``x86`` or ``x64``.
270 ``test_flags`` is a str representing extra arguments to pass to
271 ``run-tests.py``.
272 """
273 if not re.fullmatch(r'\d\.\d', python_version):
274 raise ValueError(r'python_version must be \d.\d; got %s' %
275 python_version)
276
277 if arch not in ('x86', 'x64'):
278 raise ValueError('arch must be x86 or x64; got %s' % arch)
279
280 python_path = 'python%s-%s' % (python_version.replace('.', ''), arch)
281
282 ps = RUN_TESTS.format(
283 python_path=python_path,
284 test_flags=test_flags or '',
285 )
286
287 run_powershell(winrm_client, ps)
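
Several PowerShell templates in ``windows.py`` above double their braces (``{{`` and ``}}``) while ``HG_PURGE`` does not. The difference is that only the doubled-brace templates are passed through ``str.format()``, which treats single braces as placeholders. A small sketch of that behavior, using a hypothetical template modeled on ``HG_UPDATE_CLEAN``:

```python
# Hypothetical template; not one of the constants defined in windows.py.
EXAMPLE_TEMPLATE = r'''
Set-Location C:\hgdev\src
hg.exe update -C {revision}
if ($LASTEXITCODE -ne 0) {{
    throw "process exited non-0: $LASTEXITCODE"
}}
'''.lstrip()


def render(revision: str) -> str:
    # str.format() substitutes {revision} and collapses {{ / }} to { / },
    # leaving valid PowerShell block delimiters in the rendered script.
    return EXAMPLE_TEMPLATE.format(revision=revision)


script = render('abc123')
```

A template that is sent to the remote verbatim, without ``.format()``, must keep single braces; otherwise PowerShell receives the doubled braces literally.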
@@ -0,0 +1,82 b''
1 # winrm.py - Interact with Windows Remote Management (WinRM)
2 #
3 # Copyright 2019 Gregory Szorc <gregory.szorc@gmail.com>
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7
8 # no-check-code because Python 3 native.
9
10 import logging
11 import pprint
12 import time
13
14 from pypsrp.client import (
15 Client,
16 )
17 from pypsrp.powershell import (
18 PowerShell,
19 PSInvocationState,
20 RunspacePool,
21 )
22 import requests.exceptions
23
24
25 logger = logging.getLogger(__name__)
26
27
28 def wait_for_winrm(host, username, password, timeout=120, ssl=False):
29 """Wait for the Windows Remoting (WinRM) service to become available.
30
31 Returns a ``pypsrp.client.Client`` instance.
32 """
33
34 end_time = time.time() + timeout
35
36 while True:
37 try:
38 client = Client(host, username=username, password=password,
39 ssl=ssl, connection_timeout=5)
40 client.execute_cmd('echo "hello world"')
41 return client
42 except requests.exceptions.ConnectionError:
43 if time.time() >= end_time:
44 raise
45
46 time.sleep(1)
47
48
49 def format_object(o):
50 if isinstance(o, str):
51 return o
52
53 try:
54 o = str(o)
55 except TypeError:
56 o = pprint.pformat(o.extended_properties)
57
58 return o
59
60
61 def run_powershell(client, script):
62 with RunspacePool(client.wsman) as pool:
63 ps = PowerShell(pool)
64 ps.add_script(script)
65
66 ps.begin_invoke()
67
68 while ps.state == PSInvocationState.RUNNING:
69 ps.poll_invoke()
70 for o in ps.output:
71 print(format_object(o))
72
73 ps.output[:] = []
74
75 ps.end_invoke()
76
77 for o in ps.output:
78 print(format_object(o))
79
80 if ps.state == PSInvocationState.FAILED:
81 raise Exception('PowerShell execution failed: %s' %
82 ' '.join(map(format_object, ps.streams.error)))
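
``wait_for_winrm`` above retries a connection attempt until a deadline passes and re-raises the last error once time runs out. The same retry-until-deadline pattern can be written generically. This is a sketch under stated assumptions: ``wait_until`` and ``flaky_probe`` are hypothetical names, and the built-in ``ConnectionError`` is used so the example stays self-contained, whereas the real code catches ``requests.exceptions.ConnectionError``.

```python
import time


def wait_until(probe, timeout=120.0, interval=1.0,
               retry_on=(ConnectionError,)):
    """Call ``probe`` until it succeeds or ``timeout`` seconds elapse."""
    # time.monotonic() is unaffected by system clock adjustments.
    deadline = time.monotonic() + timeout

    while True:
        try:
            return probe()
        except retry_on:
            # Out of time: propagate the most recent failure to the caller.
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)


# Hypothetical probe that fails twice before succeeding.
attempts = []


def flaky_probe():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError('not ready yet')
    return 'connected'


result = wait_until(flaky_probe, timeout=5.0, interval=0.01)
```

Exceptions outside ``retry_on`` propagate immediately, so authentication or programming errors are not silently retried.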
@@ -0,0 +1,119 b''
1 #
2 # This file is autogenerated by pip-compile
3 # To update, run:
4 #
5 # pip-compile -U --generate-hashes --output-file contrib/automation/requirements.txt contrib/automation/requirements.txt.in
6 #
7 asn1crypto==0.24.0 \
8 --hash=sha256:2f1adbb7546ed199e3c90ef23ec95c5cf3585bac7d11fb7eb562a3fe89c64e87 \
9 --hash=sha256:9d5c20441baf0cb60a4ac34cc447c6c189024b6b4c6cd7877034f4965c464e49 \
10 # via cryptography
11 boto3==1.9.111 \
12 --hash=sha256:06414c75d1f62af7d04fd652b38d1e4fd3cfd6b35bad978466af88e2aaecd00d \
13 --hash=sha256:f3b77dff382374773d02411fa47ee408f4f503aeebd837fd9dc9ed8635bc5e8e
14 botocore==1.12.111 \
15 --hash=sha256:6af473c52d5e3e7ff82de5334e9fee96b2d5ec2df5d78bc00cd9937e2573a7a8 \
16 --hash=sha256:9f5123c7be704b17aeacae99b5842ab17bda1f799dd29134de8c70e0a50a45d7 \
17 # via boto3, s3transfer
18 certifi==2019.3.9 \
19 --hash=sha256:59b7658e26ca9c7339e00f8f4636cdfe59d34fa37b9b04f6f9e9926b3cece1a5 \
20 --hash=sha256:b26104d6835d1f5e49452a26eb2ff87fe7090b89dfcaee5ea2212697e1e1d7ae \
21 # via requests
22 cffi==1.12.2 \
23 --hash=sha256:00b97afa72c233495560a0793cdc86c2571721b4271c0667addc83c417f3d90f \
24 --hash=sha256:0ba1b0c90f2124459f6966a10c03794082a2f3985cd699d7d63c4a8dae113e11 \
25 --hash=sha256:0bffb69da295a4fc3349f2ec7cbe16b8ba057b0a593a92cbe8396e535244ee9d \
26 --hash=sha256:21469a2b1082088d11ccd79dd84157ba42d940064abbfa59cf5f024c19cf4891 \
27 --hash=sha256:2e4812f7fa984bf1ab253a40f1f4391b604f7fc424a3e21f7de542a7f8f7aedf \
28 --hash=sha256:2eac2cdd07b9049dd4e68449b90d3ef1adc7c759463af5beb53a84f1db62e36c \
29 --hash=sha256:2f9089979d7456c74d21303c7851f158833d48fb265876923edcb2d0194104ed \
30 --hash=sha256:3dd13feff00bddb0bd2d650cdb7338f815c1789a91a6f68fdc00e5c5ed40329b \
31 --hash=sha256:4065c32b52f4b142f417af6f33a5024edc1336aa845b9d5a8d86071f6fcaac5a \
32 --hash=sha256:51a4ba1256e9003a3acf508e3b4f4661bebd015b8180cc31849da222426ef585 \
33 --hash=sha256:59888faac06403767c0cf8cfb3f4a777b2939b1fbd9f729299b5384f097f05ea \
34 --hash=sha256:59c87886640574d8b14910840327f5cd15954e26ed0bbd4e7cef95fa5aef218f \
35 --hash=sha256:610fc7d6db6c56a244c2701575f6851461753c60f73f2de89c79bbf1cc807f33 \
36 --hash=sha256:70aeadeecb281ea901bf4230c6222af0248c41044d6f57401a614ea59d96d145 \
37 --hash=sha256:71e1296d5e66c59cd2c0f2d72dc476d42afe02aeddc833d8e05630a0551dad7a \
38 --hash=sha256:8fc7a49b440ea752cfdf1d51a586fd08d395ff7a5d555dc69e84b1939f7ddee3 \
39 --hash=sha256:9b5c2afd2d6e3771d516045a6cfa11a8da9a60e3d128746a7fe9ab36dfe7221f \
40 --hash=sha256:9c759051ebcb244d9d55ee791259ddd158188d15adee3c152502d3b69005e6bd \
41 --hash=sha256:b4d1011fec5ec12aa7cc10c05a2f2f12dfa0adfe958e56ae38dc140614035804 \
42 --hash=sha256:b4f1d6332339ecc61275bebd1f7b674098a66fea11a00c84d1c58851e618dc0d \
43 --hash=sha256:c030cda3dc8e62b814831faa4eb93dd9a46498af8cd1d5c178c2de856972fd92 \
44 --hash=sha256:c2e1f2012e56d61390c0e668c20c4fb0ae667c44d6f6a2eeea5d7148dcd3df9f \
45 --hash=sha256:c37c77d6562074452120fc6c02ad86ec928f5710fbc435a181d69334b4de1d84 \
46 --hash=sha256:c8149780c60f8fd02752d0429246088c6c04e234b895c4a42e1ea9b4de8d27fb \
47 --hash=sha256:cbeeef1dc3c4299bd746b774f019de9e4672f7cc666c777cd5b409f0b746dac7 \
48 --hash=sha256:e113878a446c6228669144ae8a56e268c91b7f1fafae927adc4879d9849e0ea7 \
49 --hash=sha256:e21162bf941b85c0cda08224dade5def9360f53b09f9f259adb85fc7dd0e7b35 \
50 --hash=sha256:fb6934ef4744becbda3143d30c6604718871495a5e36c408431bf33d9c146889 \
51 # via cryptography
52 chardet==3.0.4 \
53 --hash=sha256:84ab92ed1c4d4f16916e05906b6b75a6c0fb5db821cc65e70cbd64a3e2a5eaae \
54 --hash=sha256:fc323ffcaeaed0e0a02bf4d117757b98aed530d9ed4531e3e15460124c106691 \
55 # via requests
56 cryptography==2.6.1 \
57 --hash=sha256:066f815f1fe46020877c5983a7e747ae140f517f1b09030ec098503575265ce1 \
58 --hash=sha256:210210d9df0afba9e000636e97810117dc55b7157c903a55716bb73e3ae07705 \
59 --hash=sha256:26c821cbeb683facb966045e2064303029d572a87ee69ca5a1bf54bf55f93ca6 \
60 --hash=sha256:2afb83308dc5c5255149ff7d3fb9964f7c9ee3d59b603ec18ccf5b0a8852e2b1 \
61 --hash=sha256:2db34e5c45988f36f7a08a7ab2b69638994a8923853dec2d4af121f689c66dc8 \
62 --hash=sha256:409c4653e0f719fa78febcb71ac417076ae5e20160aec7270c91d009837b9151 \
63 --hash=sha256:45a4f4cf4f4e6a55c8128f8b76b4c057027b27d4c67e3fe157fa02f27e37830d \
64 --hash=sha256:48eab46ef38faf1031e58dfcc9c3e71756a1108f4c9c966150b605d4a1a7f659 \
65 --hash=sha256:6b9e0ae298ab20d371fc26e2129fd683cfc0cfde4d157c6341722de645146537 \
66 --hash=sha256:6c4778afe50f413707f604828c1ad1ff81fadf6c110cb669579dea7e2e98a75e \
67 --hash=sha256:8c33fb99025d353c9520141f8bc989c2134a1f76bac6369cea060812f5b5c2bb \
68 --hash=sha256:9873a1760a274b620a135054b756f9f218fa61ca030e42df31b409f0fb738b6c \
69 --hash=sha256:9b069768c627f3f5623b1cbd3248c5e7e92aec62f4c98827059eed7053138cc9 \
70 --hash=sha256:9e4ce27a507e4886efbd3c32d120db5089b906979a4debf1d5939ec01b9dd6c5 \
71 --hash=sha256:acb424eaca214cb08735f1a744eceb97d014de6530c1ea23beb86d9c6f13c2ad \
72 --hash=sha256:c8181c7d77388fe26ab8418bb088b1a1ef5fde058c6926790c8a0a3d94075a4a \
73 --hash=sha256:d4afbb0840f489b60f5a580a41a1b9c3622e08ecb5eec8614d4fb4cd914c4460 \
74 --hash=sha256:d9ed28030797c00f4bc43c86bf819266c76a5ea61d006cd4078a93ebf7da6bfd \
75 --hash=sha256:e603aa7bb52e4e8ed4119a58a03b60323918467ef209e6ff9db3ac382e5cf2c6 \
76 # via pypsrp
77 docutils==0.14 \
78 --hash=sha256:02aec4bd92ab067f6ff27a38a38a41173bf01bed8f89157768c1573f53e474a6 \
79 --hash=sha256:51e64ef2ebfb29cae1faa133b3710143496eca21c530f3f71424d77687764274 \
80 --hash=sha256:7a4bd47eaf6596e1295ecb11361139febe29b084a87bf005bf899f9a42edc3c6 \
81 # via botocore
82 idna==2.8 \
83 --hash=sha256:c357b3f628cf53ae2c4c05627ecc484553142ca23264e593d327bcde5e9c3407 \
84 --hash=sha256:ea8b7f6188e6fa117537c3df7da9fc686d485087abf6ac197f9c46432f7e4a3c \
85 # via requests
86 jmespath==0.9.4 \
87 --hash=sha256:3720a4b1bd659dd2eecad0666459b9788813e032b83e7ba58578e48254e0a0e6 \
88 --hash=sha256:bde2aef6f44302dfb30320115b17d030798de8c4110e28d5cf6cf91a7a31074c \
89 # via boto3, botocore
90 ntlm-auth==1.2.0 \
91 --hash=sha256:7bc02a3fbdfee7275d3dc20fce8028ed8eb6d32364637f28be9e9ae9160c6d5c \
92 --hash=sha256:9b13eaf88f16a831637d75236a93d60c0049536715aafbf8190ba58a590b023e \
93 # via pypsrp
94 pycparser==2.19 \
95 --hash=sha256:a988718abfad80b6b157acce7bf130a30876d27603738ac39f140993246b25b3 \
96 # via cffi
97 pypsrp==0.3.1 \
98 --hash=sha256:309853380fe086090a03cc6662a778ee69b1cae355ae4a932859034fd76e9d0b \
99 --hash=sha256:90f946254f547dc3493cea8493c819ab87e152a755797c93aa2668678ba8ae85
100 python-dateutil==2.8.0 \
101 --hash=sha256:7e6584c74aeed623791615e26efd690f29817a27c73085b78e4bad02493df2fb \
102 --hash=sha256:c89805f6f4d64db21ed966fda138f8a5ed7a4fdbc1a8ee329ce1b74e3c74da9e \
103 # via botocore
104 requests==2.21.0 \
105 --hash=sha256:502a824f31acdacb3a35b6690b5fbf0bc41d63a24a45c4004352b0242707598e \
106 --hash=sha256:7bf2a778576d825600030a110f3c0e3e8edc51dfaafe1c146e39a2027784957b \
107 # via pypsrp
108 s3transfer==0.2.0 \
109 --hash=sha256:7b9ad3213bff7d357f888e0fab5101b56fa1a0548ee77d121c3a3dbfbef4cb2e \
110 --hash=sha256:f23d5cb7d862b104401d9021fc82e5fa0e0cf57b7660a1331425aab0c691d021 \
111 # via boto3
112 six==1.12.0 \
113 --hash=sha256:3350809f0555b11f552448330d0b52d5f24c91a322ea4a15ef22629740f3761c \
114 --hash=sha256:d16a0141ec1a18405cd4ce8b4613101da75da0e9a7aec5bdd4fa804d0e0eba73 \
115 # via cryptography, pypsrp, python-dateutil
116 urllib3==1.24.1 \
117 --hash=sha256:61bf29cada3fc2fbefad4fdf059ea4bd1b4a86d2b6d15e1c7c0b582b9752fe39 \
118 --hash=sha256:de9529817c93f27c8ccbfead6985011db27bd0ddfcdb2d86f3f663385c6a9c22 \
119 # via botocore, requests
@@ -0,0 +1,2 b''
1 boto3
2 pypsrp
@@ -0,0 +1,200 b''
1 # install-dependencies.ps1 - Install Windows dependencies for building Mercurial
2 #
3 # Copyright 2019 Gregory Szorc <gregory.szorc@gmail.com>
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7
8 # This script can be used to bootstrap a Mercurial build environment on
9 # Windows.
10 #
11 # The script makes a lot of assumptions about how things should work.
12 # For example, the install location of Python is hardcoded to c:\hgdev\*.
13 #
14 # The script should be executed from a PowerShell session with elevated
15 # privileges if you don't want to see a UAC prompt for various installers.
16 #
17 # The script is tested on Windows 10 and Windows Server 2019 (in EC2).
18
19 $VS_BUILD_TOOLS_URL = "https://download.visualstudio.microsoft.com/download/pr/a1603c02-8a66-4b83-b821-811e3610a7c4/aa2db8bb39e0cbd23e9940d8951e0bc3/vs_buildtools.exe"
20 $VS_BUILD_TOOLS_SHA256 = "911E292B8E6E5F46CBC17003BDCD2D27A70E616E8D5E6E69D5D489A605CAA139"
21
22 $VC9_PYTHON_URL = "https://download.microsoft.com/download/7/9/6/796EF2E4-801B-4FC4-AB28-B59FBF6D907B/VCForPython27.msi"
23 $VC9_PYTHON_SHA256 = "070474db76a2e625513a5835df4595df9324d820f9cc97eab2a596dcbc2f5cbf"
24
25 $PYTHON27_x64_URL = "https://www.python.org/ftp/python/2.7.16/python-2.7.16.amd64.msi"
26 $PYTHON27_x64_SHA256 = "7c0f45993019152d46041a7db4b947b919558fdb7a8f67bcd0535bc98d42b603"
27 $PYTHON27_X86_URL = "https://www.python.org/ftp/python/2.7.16/python-2.7.16.msi"
28 $PYTHON27_X86_SHA256 = "d57dc3e1ba490aee856c28b4915d09e3f49442461e46e481bc6b2d18207831d7"
29
30 $PYTHON35_x86_URL = "https://www.python.org/ftp/python/3.5.4/python-3.5.4.exe"
31 $PYTHON35_x86_SHA256 = "F27C2D67FD9688E4970F3BFF799BB9D722A0D6C2C13B04848E1F7D620B524B0E"
32 $PYTHON35_x64_URL = "https://www.python.org/ftp/python/3.5.4/python-3.5.4-amd64.exe"
33 $PYTHON35_x64_SHA256 = "9B7741CC32357573A77D2EE64987717E527628C38FD7EAF3E2AACA853D45A1EE"
34
35 $PYTHON36_x86_URL = "https://www.python.org/ftp/python/3.6.8/python-3.6.8.exe"
36 $PYTHON36_x86_SHA256 = "89871D432BC06E4630D7B64CB1A8451E53C80E68DE29029976B12AAD7DBFA5A0"
37 $PYTHON36_x64_URL = "https://www.python.org/ftp/python/3.6.8/python-3.6.8-amd64.exe"
38 $PYTHON36_x64_SHA256 = "96088A58B7C43BC83B84E6B67F15E8706C614023DD64F9A5A14E81FF824ADADC"
39
40 $PYTHON37_x86_URL = "https://www.python.org/ftp/python/3.7.2/python-3.7.2.exe"
41 $PYTHON37_x86_SHA256 = "8BACE330FB409E428B04EEEE083DD9CA7F6C754366D07E23B3853891D8F8C3D0"
42 $PYTHON37_x64_URL = "https://www.python.org/ftp/python/3.7.2/python-3.7.2-amd64.exe"
43 $PYTHON37_x64_SHA256 = "0FE2A696F5A3E481FED795EF6896ED99157BCEF273EF3C4A96F2905CBDB3AA13"
44
45 $PYTHON38_x86_URL = "https://www.python.org/ftp/python/3.8.0/python-3.8.0a2.exe"
46 $PYTHON38_x86_SHA256 = "013A7DDD317679FE51223DE627688CFCB2F0F1128FD25A987F846AEB476D3FEF"
47 $PYTHON38_x64_URL = "https://www.python.org/ftp/python/3.8.0/python-3.8.0a2-amd64.exe"
48 $PYTHON38_X64_SHA256 = "560BC6D1A76BCD6D544AC650709F3892956890753CDCF9CE67E3D7302D76FB41"
49
50 # PIP 19.0.3.
51 $PIP_URL = "https://github.com/pypa/get-pip/raw/fee32c376da1ff6496a798986d7939cd51e1644f/get-pip.py"
52 $PIP_SHA256 = "efe99298f3fbb1f56201ce6b81d2658067d2f7d7dfc2d412e0d3cacc9a397c61"
53
54 $VIRTUALENV_URL = "https://files.pythonhosted.org/packages/37/db/89d6b043b22052109da35416abc3c397655e4bd3cff031446ba02b9654fa/virtualenv-16.4.3.tar.gz"
55 $VIRTUALENV_SHA256 = "984d7e607b0a5d1329425dd8845bd971b957424b5ba664729fab51ab8c11bc39"
56
57 $INNO_SETUP_URL = "http://files.jrsoftware.org/is/5/innosetup-5.6.1-unicode.exe"
58 $INNO_SETUP_SHA256 = "27D49E9BC769E9D1B214C153011978DB90DC01C2ACD1DDCD9ED7B3FE3B96B538"
59
60 $MINGW_BIN_URL = "https://osdn.net/frs/redir.php?m=constant&f=mingw%2F68260%2Fmingw-get-0.6.3-mingw32-pre-20170905-1-bin.zip"
61 $MINGW_BIN_SHA256 = "2AB8EFD7C7D1FC8EAF8B2FA4DA4EEF8F3E47768284C021599BC7435839A046DF"
62
63 $MERCURIAL_WHEEL_FILENAME = "mercurial-4.9-cp27-cp27m-win_amd64.whl"
64 $MERCURIAL_WHEEL_URL = "https://files.pythonhosted.org/packages/fe/e8/b872d53dfbbf986bdc46af0b30f580b227fb59bddd2587152a55e205b0cc/$MERCURIAL_WHEEL_FILENAME"
65 $MERCURIAL_WHEEL_SHA256 = "218cc2e7c3f1d535007febbb03351663897edf27df0e57d6842e3b686492b429"
66
67 # Writing progress slows down downloads substantially. So disable it.
68 $progressPreference = 'silentlyContinue'
69
70 function Secure-Download($url, $path, $sha256) {
71 if (Test-Path -Path $path) {
72 Get-FileHash -Path $path -Algorithm SHA256 -OutVariable hash
73
74 if ($hash.Hash -eq $sha256) {
75 Write-Output "SHA256 of $path verified as $sha256"
76 return
77 }
78
79 Write-Output "hash mismatch on $path; downloading again"
80 }
81
82 Write-Output "downloading $url to $path"
83 Invoke-WebRequest -Uri $url -OutFile $path
84 Get-FileHash -Path $path -Algorithm SHA256 -OutVariable hash
85
86 if ($hash.Hash -ne $sha256) {
87 Remove-Item -Path $path
88 throw "hash mismatch when downloading $url; got $($hash.Hash), expected $sha256"
89 }
90 }
91
92 function Invoke-Process($path, $arguments) {
93 $p = Start-Process -FilePath $path -ArgumentList $arguments -Wait -PassThru -WindowStyle Hidden
94
95 if ($p.ExitCode -ne 0) {
96 throw "process exited non-0: $($p.ExitCode)"
97 }
98 }
99
100 function Install-Python3($name, $installer, $dest, $pip) {
101 Write-Output "installing $name"
102
103 # We hit this when running the script as part of Simple Systems Manager in
104 # EC2. The Python 3 installer doesn't seem to like per-user installs
105 # when running as the SYSTEM user. So enable global installs if executed in
106 # this mode.
107 if ($env:USERPROFILE -eq "C:\Windows\system32\config\systemprofile") {
108 Write-Output "running with SYSTEM account; installing for all users"
109 $allusers = "1"
110 }
111 else {
112 $allusers = "0"
113 }
114
115 Invoke-Process $installer "/quiet TargetDir=${dest} InstallAllUsers=${allusers} AssociateFiles=0 CompileAll=0 PrependPath=0 Include_doc=0 Include_launcher=0 InstallLauncherAllUsers=0 Include_pip=0 Include_test=0"
116 Invoke-Process ${dest}\python.exe $pip
117 }
118
119 function Install-Dependencies($prefix) {
120 if (!(Test-Path -Path $prefix\assets)) {
121 New-Item -Path $prefix\assets -ItemType Directory
122 }
123
124 $pip = "${prefix}\assets\get-pip.py"
125
126 Secure-Download $VC9_PYTHON_URL ${prefix}\assets\VCForPython27.msi $VC9_PYTHON_SHA256
127 Secure-Download $PYTHON27_x86_URL ${prefix}\assets\python27-x86.msi $PYTHON27_x86_SHA256
128 Secure-Download $PYTHON27_x64_URL ${prefix}\assets\python27-x64.msi $PYTHON27_x64_SHA256
129 Secure-Download $PYTHON35_x86_URL ${prefix}\assets\python35-x86.exe $PYTHON35_x86_SHA256
130 Secure-Download $PYTHON35_x64_URL ${prefix}\assets\python35-x64.exe $PYTHON35_x64_SHA256
131 Secure-Download $PYTHON36_x86_URL ${prefix}\assets\python36-x86.exe $PYTHON36_x86_SHA256
132 Secure-Download $PYTHON36_x64_URL ${prefix}\assets\python36-x64.exe $PYTHON36_x64_SHA256
133 Secure-Download $PYTHON37_x86_URL ${prefix}\assets\python37-x86.exe $PYTHON37_x86_SHA256
134 Secure-Download $PYTHON37_x64_URL ${prefix}\assets\python37-x64.exe $PYTHON37_x64_SHA256
135 Secure-Download $PYTHON38_x86_URL ${prefix}\assets\python38-x86.exe $PYTHON38_x86_SHA256
136 Secure-Download $PYTHON38_x64_URL ${prefix}\assets\python38-x64.exe $PYTHON38_x64_SHA256
137 Secure-Download $PIP_URL ${pip} $PIP_SHA256
138 Secure-Download $VIRTUALENV_URL ${prefix}\assets\virtualenv.tar.gz $VIRTUALENV_SHA256
139 Secure-Download $VS_BUILD_TOOLS_URL ${prefix}\assets\vs_buildtools.exe $VS_BUILD_TOOLS_SHA256
140 Secure-Download $INNO_SETUP_URL ${prefix}\assets\InnoSetup.exe $INNO_SETUP_SHA256
141 Secure-Download $MINGW_BIN_URL ${prefix}\assets\mingw-get-bin.zip $MINGW_BIN_SHA256
142 Secure-Download $MERCURIAL_WHEEL_URL ${prefix}\assets\${MERCURIAL_WHEEL_FILENAME} $MERCURIAL_WHEEL_SHA256
143
144 Write-Output "installing Python 2.7 32-bit"
145 Invoke-Process msiexec.exe "/i ${prefix}\assets\python27-x86.msi /l* ${prefix}\assets\python27-x86.log /q TARGETDIR=${prefix}\python27-x86 ALLUSERS="
146 Invoke-Process ${prefix}\python27-x86\python.exe ${prefix}\assets\get-pip.py
147 Invoke-Process ${prefix}\python27-x86\Scripts\pip.exe "install ${prefix}\assets\virtualenv.tar.gz"
148
149 Write-Output "installing Python 2.7 64-bit"
150 Invoke-Process msiexec.exe "/i ${prefix}\assets\python27-x64.msi /l* ${prefix}\assets\python27-x64.log /q TARGETDIR=${prefix}\python27-x64 ALLUSERS="
151 Invoke-Process ${prefix}\python27-x64\python.exe ${prefix}\assets\get-pip.py
152 Invoke-Process ${prefix}\python27-x64\Scripts\pip.exe "install ${prefix}\assets\virtualenv.tar.gz"
153
154 Install-Python3 "Python 3.5 32-bit" ${prefix}\assets\python35-x86.exe ${prefix}\python35-x86 ${pip}
155 Install-Python3 "Python 3.5 64-bit" ${prefix}\assets\python35-x64.exe ${prefix}\python35-x64 ${pip}
156 Install-Python3 "Python 3.6 32-bit" ${prefix}\assets\python36-x86.exe ${prefix}\python36-x86 ${pip}
157 Install-Python3 "Python 3.6 64-bit" ${prefix}\assets\python36-x64.exe ${prefix}\python36-x64 ${pip}
158 Install-Python3 "Python 3.7 32-bit" ${prefix}\assets\python37-x86.exe ${prefix}\python37-x86 ${pip}
159 Install-Python3 "Python 3.7 64-bit" ${prefix}\assets\python37-x64.exe ${prefix}\python37-x64 ${pip}
160 Install-Python3 "Python 3.8 32-bit" ${prefix}\assets\python38-x86.exe ${prefix}\python38-x86 ${pip}
161 Install-Python3 "Python 3.8 64-bit" ${prefix}\assets\python38-x64.exe ${prefix}\python38-x64 ${pip}
162
163 Write-Output "installing Visual Studio 2017 Build Tools and SDKs"
164 Invoke-Process ${prefix}\assets\vs_buildtools.exe "--quiet --wait --norestart --nocache --channelUri https://aka.ms/vs/15/release/channel --add Microsoft.VisualStudio.Workload.MSBuildTools --add Microsoft.VisualStudio.Component.Windows10SDK.17763 --add Microsoft.VisualStudio.Workload.VCTools --add Microsoft.VisualStudio.Component.Windows10SDK --add Microsoft.VisualStudio.Component.VC.140"
165
166 Write-Output "installing Visual C++ 9.0 for Python 2.7"
167 Invoke-Process msiexec.exe "/i ${prefix}\assets\VCForPython27.msi /l* ${prefix}\assets\VCForPython27.log /q"
168
169 Write-Output "installing Inno Setup"
170 Invoke-Process ${prefix}\assets\InnoSetup.exe "/SP- /VERYSILENT /SUPPRESSMSGBOXES"
171
172 Write-Output "extracting MinGW base archive"
173 Expand-Archive -Path ${prefix}\assets\mingw-get-bin.zip -DestinationPath "${prefix}\MinGW" -Force
174
175 Write-Output "updating MinGW package catalogs"
176 Invoke-Process ${prefix}\MinGW\bin\mingw-get.exe "update"
177
178 Write-Output "installing MinGW packages"
179 Invoke-Process ${prefix}\MinGW\bin\mingw-get.exe "install msys-base msys-coreutils msys-diffutils msys-unzip"
180
181 # Construct a virtualenv useful for bootstrapping. It conveniently contains a
182 # Mercurial install.
183 Write-Output "creating bootstrap virtualenv with Mercurial"
184 Invoke-Process "$prefix\python27-x64\Scripts\virtualenv.exe" "${prefix}\venv-bootstrap"
185 Invoke-Process "${prefix}\venv-bootstrap\Scripts\pip.exe" "install ${prefix}\assets\${MERCURIAL_WHEEL_FILENAME}"
186 }
187
188 function Clone-Mercurial-Repo($prefix, $repo_url, $dest) {
189 Write-Output "cloning $repo_url to $dest"
190 # TODO Figure out why CA verification isn't working in EC2 and remove
191 # --insecure.
192 Invoke-Process "${prefix}\venv-bootstrap\Scripts\hg.exe" "clone --insecure $repo_url $dest"
193
194 # Mark repo as non-publishing by default for convenience.
195 Add-Content -Path "$dest\.hg\hgrc" -Value "`n[phases]`npublish = false"
196 }
197
198 $prefix = "c:\hgdev"
199 Install-Dependencies $prefix
200 Clone-Mercurial-Repo $prefix "https://www.mercurial-scm.org/repo/hg" $prefix\src
1 NO CONTENT: new file 100644
@@ -0,0 +1,175 b''
1 # downloads.py - Code for downloading dependencies.
2 #
3 # Copyright 2019 Gregory Szorc <gregory.szorc@gmail.com>
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7
8 # no-check-code because Python 3 native.
9
10 import gzip
11 import hashlib
12 import pathlib
13 import urllib.request
14
15
16 DOWNLOADS = {
17 'gettext': {
18 'url': 'https://versaweb.dl.sourceforge.net/project/gnuwin32/gettext/0.14.4/gettext-0.14.4-bin.zip',
19 'size': 1606131,
20 'sha256': '60b9ef26bc5cceef036f0424e542106cf158352b2677f43a01affd6d82a1d641',
21 'version': '0.14.4',
22 },
23 'gettext-dep': {
24 'url': 'https://versaweb.dl.sourceforge.net/project/gnuwin32/gettext/0.14.4/gettext-0.14.4-dep.zip',
25 'size': 715086,
26 'sha256': '411f94974492fd2ecf52590cb05b1023530aec67e64154a88b1e4ebcd9c28588',
27 },
28 'py2exe': {
29 'url': 'https://versaweb.dl.sourceforge.net/project/py2exe/py2exe/0.6.9/py2exe-0.6.9.zip',
30 'size': 149687,
31 'sha256': '6bd383312e7d33eef2e43a5f236f9445e4f3e0f6b16333c6f183ed445c44ddbd',
32 'version': '0.6.9',
33 },
34 # The VC9 CRT merge modules aren't readily available on most systems because
35 # they are only installed as part of a full Visual Studio 2008 install.
36 # While we could potentially extract them from a Visual Studio 2008
37 # installer, it is easier to just fetch them from a known URL.
38 'vc9-crt-x86-msm': {
39 'url': 'https://github.com/indygreg/vc90-merge-modules/raw/9232f8f0b2135df619bf7946eaa176b4ac35ccff/Microsoft_VC90_CRT_x86.msm',
40 'size': 615424,
41 'sha256': '837e887ef31b332feb58156f429389de345cb94504228bb9a523c25a9dd3d75e',
42 },
43 'vc9-crt-x86-msm-policy': {
44 'url': 'https://github.com/indygreg/vc90-merge-modules/raw/9232f8f0b2135df619bf7946eaa176b4ac35ccff/policy_9_0_Microsoft_VC90_CRT_x86.msm',
45 'size': 71168,
46 'sha256': '3fbcf92e3801a0757f36c5e8d304e134a68d5cafd197a6df7734ae3e8825c940',
47 },
48 'vc9-crt-x64-msm': {
49 'url': 'https://github.com/indygreg/vc90-merge-modules/raw/9232f8f0b2135df619bf7946eaa176b4ac35ccff/Microsoft_VC90_CRT_x86_x64.msm',
50 'size': 662528,
51 'sha256': '50d9639b5ad4844a2285269c7551bf5157ec636e32396ddcc6f7ec5bce487a7c',
52 },
53 'vc9-crt-x64-msm-policy': {
54 'url': 'https://github.com/indygreg/vc90-merge-modules/raw/9232f8f0b2135df619bf7946eaa176b4ac35ccff/policy_9_0_Microsoft_VC90_CRT_x86_x64.msm',
55 'size': 71168,
56 'sha256': '0550ea1929b21239134ad3a678c944ba0f05f11087117b6cf0833e7110686486',
57 },
58 'virtualenv': {
59 'url': 'https://files.pythonhosted.org/packages/37/db/89d6b043b22052109da35416abc3c397655e4bd3cff031446ba02b9654fa/virtualenv-16.4.3.tar.gz',
60 'size': 3713208,
61 'sha256': '984d7e607b0a5d1329425dd8845bd971b957424b5ba664729fab51ab8c11bc39',
62 'version': '16.4.3',
63 },
64 'wix': {
65 'url': 'https://github.com/wixtoolset/wix3/releases/download/wix3111rtm/wix311-binaries.zip',
66 'size': 34358269,
67 'sha256': '37f0a533b0978a454efb5dc3bd3598becf9660aaf4287e55bf68ca6b527d051d',
68 'version': '3.11.1',
69 },
70 }
71
72
73 def hash_path(p: pathlib.Path):
74 h = hashlib.sha256()
75
76 with p.open('rb') as fh:
77 while True:
78 chunk = fh.read(65536)
79 if not chunk:
80 break
81
82 h.update(chunk)
83
84 return h.hexdigest()
85
86
87 class IntegrityError(Exception):
88 """Represents an integrity error when downloading a URL."""
89
90
91 def secure_download_stream(url, size, sha256):
92 """Securely download a URL to a stream of chunks.
93
94 If the integrity of the download fails, an IntegrityError is
95 raised.
96 """
97 h = hashlib.sha256()
98 length = 0
99
100 with urllib.request.urlopen(url) as fh:
101 if not url.endswith('.gz') and fh.info().get('Content-Encoding') == 'gzip':
102 fh = gzip.GzipFile(fileobj=fh)
103
104 while True:
105 chunk = fh.read(65536)
106 if not chunk:
107 break
108
109 h.update(chunk)
110 length += len(chunk)
111
112 yield chunk
113
114 digest = h.hexdigest()
115
116 if length != size:
117 raise IntegrityError('size mismatch on %s: wanted %d; got %d' % (
118 url, size, length))
119
120 if digest != sha256:
121 raise IntegrityError('sha256 mismatch on %s: wanted %s; got %s' % (
122 url, sha256, digest))
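
Note on ``secure_download_stream``: because it is a generator, every chunk is handed to the caller before the size and digest checks run, so a mismatch only surfaces once the stream is exhausted. A minimal sketch of that behavior follows, using an in-memory list of chunks instead of a network response. ``verified_chunks`` and the sample data are hypothetical names for illustration and are not part of ``downloads.py``.

```python
import hashlib


class IntegrityError(Exception):
    """Raised when streamed data does not match the expected size or digest."""


def verified_chunks(chunks, size, sha256):
    """Yield chunks unchanged, then verify total length and SHA-256."""
    h = hashlib.sha256()
    length = 0

    for chunk in chunks:
        h.update(chunk)
        length += len(chunk)
        yield chunk

    # Verification only happens once the source is exhausted.
    if length != size:
        raise IntegrityError('size mismatch: wanted %d; got %d' % (size, length))

    if h.hexdigest() != sha256:
        raise IntegrityError('sha256 mismatch')


data = [b'hello ', b'world']
good_digest = hashlib.sha256(b'hello world').hexdigest()

# A matching size and digest lets the stream complete normally.
received = b''.join(verified_chunks(data, 11, good_digest))

# With a wrong digest, every chunk is still yielded before the error surfaces.
seen = []
try:
    for chunk in verified_chunks(data, 11, '0' * 64):
        seen.append(chunk)
    failed = False
except IntegrityError:
    failed = True
```

A consumer that writes chunks straight to their final location would therefore already hold bad data when the exception arrives, which is presumably why ``download_to_path`` writes to a ``.tmp`` file first.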
123
124
125 def download_to_path(url: str, path: pathlib.Path, size: int, sha256: str):
126 """Download a URL to a filesystem path, verifying its size and SHA-256."""
127
128 # We download to a temporary file and rename at the end so there's
129 # no chance of the final file being partially written or containing
130 # bad data.
131 print('downloading %s to %s' % (url, path))
132
133 if path.exists():
134 good = True
135
136 if path.stat().st_size != size:
137 print('existing file size is wrong; removing')
138 good = False
139
140 if good:
141 if hash_path(path) != sha256:
142 print('existing file hash is wrong; removing')
143 good = False
144
145 if good:
146 print('%s exists and passes integrity checks' % path)
147 return
148
149 path.unlink()
150
151 tmp = path.with_name('%s.tmp' % path.name)
152
153 try:
154 with tmp.open('wb') as fh:
155 for chunk in secure_download_stream(url, size, sha256):
156 fh.write(chunk)
157 except IntegrityError:
158 tmp.unlink()
159 raise
160
161 tmp.rename(path)
162 print('successfully downloaded %s' % url)
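
Note on ``download_to_path``: the write-to-temporary-then-rename pattern can be sketched in isolation as below. This is an illustration under assumptions, not the code from this change: ``write_then_rename`` and ``failing_chunks`` are hypothetical helpers, and the sketch uses ``Path.replace()`` where the original uses ``Path.rename()`` after unlinking any stale destination.

```python
import pathlib
import tempfile


def write_then_rename(path: pathlib.Path, chunks):
    """Write chunks to '<name>.tmp' and move it to the final path on success."""
    tmp = path.with_name('%s.tmp' % path.name)

    try:
        with tmp.open('wb') as fh:
            for chunk in chunks:
                fh.write(chunk)
    except Exception:
        # Leave no partial file behind if the source fails midway.
        tmp.unlink()
        raise

    # replace() overwrites an existing destination; rename() raises on
    # Windows when the target already exists.
    tmp.replace(path)


def failing_chunks():
    """Simulate a stream that fails after producing some data."""
    yield b'partial'
    raise ValueError('simulated integrity failure')


with tempfile.TemporaryDirectory() as td:
    dest = pathlib.Path(td) / 'asset.bin'

    write_then_rename(dest, [b'abc', b'def'])
    ok_contents = dest.read_bytes()

    broken = pathlib.Path(td) / 'broken.bin'
    try:
        write_then_rename(broken, failing_chunks())
    except ValueError:
        pass

    broken_exists = broken.exists()
    tmp_left_behind = broken.with_name('broken.bin.tmp').exists()
```

With this shape, the final path either holds a complete file or does not exist at all.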
163
164
165 def download_entry(name: str, dest_path: pathlib.Path, local_name=None) -> tuple:
166 entry = DOWNLOADS[name]
167
168 url = entry['url']
169
170 local_name = local_name or url[url.rindex('/') + 1:]
171
172 local_path = dest_path / local_name
173 download_to_path(url, local_path, entry['size'], entry['sha256'])
174
175 return local_path, entry
@@ -0,0 +1,78 b''
1 # inno.py - Inno Setup functionality.
2 #
3 # Copyright 2019 Gregory Szorc <gregory.szorc@gmail.com>
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7
8 # no-check-code because Python 3 native.
9
10 import os
11 import pathlib
12 import shutil
13 import subprocess
14
15 from .py2exe import (
16 build_py2exe,
17 )
18 from .util import (
19 find_vc_runtime_files,
20 )
21
22
23 EXTRA_PACKAGES = {
24 'dulwich',
25 'keyring',
26 'pygments',
27 'win32ctypes',
28 }
29
30
31 def build(source_dir: pathlib.Path, build_dir: pathlib.Path,
32 python_exe: pathlib.Path, iscc_exe: pathlib.Path,
33 version=None):
34 """Build the Inno installer.
35
36 Build files will be placed in ``build_dir``.
37
38 py2exe's setup.py doesn't use setuptools. It doesn't have modern logic
39 for finding the Python 2.7 toolchain. So, we require the environment
40 to already be configured with an active toolchain.
41 """
42 if not iscc_exe.exists():
43 raise Exception('%s does not exist' % iscc_exe)
44
45 vc_x64 = r'\x64' in os.environ.get('LIB', '')
46
47 requirements_txt = (source_dir / 'contrib' / 'packaging' /
48 'inno' / 'requirements.txt')
49
50 build_py2exe(source_dir, build_dir, python_exe, 'inno',
51 requirements_txt, extra_packages=EXTRA_PACKAGES)
52
53 # hg.exe depends on VC9 runtime DLLs. Copy those into place.
54 for f in find_vc_runtime_files(vc_x64):
55 if f.name.endswith('.manifest'):
56 basename = 'Microsoft.VC90.CRT.manifest'
57 else:
58 basename = f.name
59
60 dest_path = source_dir / 'dist' / basename
61
62 print('copying %s to %s' % (f, dest_path))
63 shutil.copyfile(f, dest_path)
64
65 print('creating installer')
66
67 args = [str(iscc_exe)]
68
69 if vc_x64:
70 args.append('/dARCH=x64')
71
72 if version:
73 args.append('/dVERSION=%s' % version)
74
75 args.append('/Odist')
76 args.append('contrib/packaging/inno/mercurial.iss')
77
78 subprocess.run(args, cwd=str(source_dir), check=True)
@@ -0,0 +1,150 b''
1 # py2exe.py - Functionality for performing py2exe builds.
2 #
3 # Copyright 2019 Gregory Szorc <gregory.szorc@gmail.com>
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7
8 # no-check-code because Python 3 native.
9
10 import os
11 import pathlib
12 import subprocess
13
14 from .downloads import (
15 download_entry,
16 )
17 from .util import (
18 extract_tar_to_directory,
19 extract_zip_to_directory,
20 python_exe_info,
21 )
22
23
24 def build_py2exe(source_dir: pathlib.Path, build_dir: pathlib.Path,
25 python_exe: pathlib.Path, build_name: str,
26 venv_requirements_txt: pathlib.Path,
27 extra_packages=None, extra_excludes=None,
28 extra_dll_excludes=None,
29 extra_packages_script=None):
30 """Build Mercurial with py2exe.
31
32 Build files will be placed in ``build_dir``.
33
34 py2exe's setup.py doesn't use setuptools. It doesn't have modern logic
35 for finding the Python 2.7 toolchain. So, we require the environment
36 to already be configured with an active toolchain.
37 """
38 if 'VCINSTALLDIR' not in os.environ:
39 raise Exception('not running from a Visual C++ build environment; '
40 'execute the "Visual C++ <version> Command Prompt" '
41 'application shortcut or a vcvarsall.bat file')
42
43 # Identify x86/x64 and validate the environment matches the Python
44 # architecture.
45 vc_x64 = r'\x64' in os.environ['LIB']
46
47 py_info = python_exe_info(python_exe)
48
49 if vc_x64:
50 if py_info['arch'] != '64bit':
51 raise Exception('architecture mismatch: Visual C++ environment '
52 'is configured for 64-bit but Python is 32-bit')
53 else:
54 if py_info['arch'] != '32bit':
55 raise Exception('architecture mismatch: Visual C++ environment '
56 'is configured for 32-bit but Python is 64-bit')
57
58 if py_info['py3']:
59 raise Exception('Only Python 2 is currently supported')
60
61 build_dir.mkdir(exist_ok=True)
62
63 gettext_pkg, gettext_entry = download_entry('gettext', build_dir)
64 gettext_dep_pkg = download_entry('gettext-dep', build_dir)[0]
65 virtualenv_pkg, virtualenv_entry = download_entry('virtualenv', build_dir)
66 py2exe_pkg, py2exe_entry = download_entry('py2exe', build_dir)
67
68 venv_path = build_dir / ('venv-%s-%s' % (build_name,
69 'x64' if vc_x64 else 'x86'))
70
71 gettext_root = build_dir / (
72 'gettext-win-%s' % gettext_entry['version'])
73
74 if not gettext_root.exists():
75 extract_zip_to_directory(gettext_pkg, gettext_root)
76 extract_zip_to_directory(gettext_dep_pkg, gettext_root)
77
78 # This assumes Python 2. We don't need virtualenv on Python 3.
79 virtualenv_src_path = build_dir / (
80 'virtualenv-%s' % virtualenv_entry['version'])
81 virtualenv_py = virtualenv_src_path / 'virtualenv.py'
82
83 if not virtualenv_src_path.exists():
84 extract_tar_to_directory(virtualenv_pkg, build_dir)
85
86 py2exe_source_path = build_dir / ('py2exe-%s' % py2exe_entry['version'])
87
88 if not py2exe_source_path.exists():
89 extract_zip_to_directory(py2exe_pkg, build_dir)
90
91 if not venv_path.exists():
92 print('creating virtualenv with dependencies')
93 subprocess.run(
94 [str(python_exe), str(virtualenv_py), str(venv_path)],
95 check=True)
96
97 venv_python = venv_path / 'Scripts' / 'python.exe'
98 venv_pip = venv_path / 'Scripts' / 'pip.exe'
99
100 subprocess.run([str(venv_pip), 'install', '-r', str(venv_requirements_txt)],
101 check=True)
102
103 # Force distutils to use VC++ settings from environment, which was
104 # validated above.
105 env = dict(os.environ)
106 env['DISTUTILS_USE_SDK'] = '1'
107 env['MSSdk'] = '1'
108
109 if extra_packages_script:
110 more_packages = set(subprocess.check_output(
111 extra_packages_script,
112 cwd=build_dir).split(b'\0')[-1].strip().decode('utf-8').splitlines())
113 if more_packages:
114 if not extra_packages:
115 extra_packages = more_packages
116 else:
117 extra_packages |= more_packages
118
119 if extra_packages:
120 env['HG_PY2EXE_EXTRA_PACKAGES'] = ' '.join(sorted(extra_packages))
121 hgext3rd_extras = sorted(
122 e for e in extra_packages if e.startswith('hgext3rd.'))
123 if hgext3rd_extras:
124 env['HG_PY2EXE_EXTRA_INSTALL_PACKAGES'] = ' '.join(hgext3rd_extras)
125 if extra_excludes:
126 env['HG_PY2EXE_EXTRA_EXCLUDES'] = ' '.join(sorted(extra_excludes))
127 if extra_dll_excludes:
128 env['HG_PY2EXE_EXTRA_DLL_EXCLUDES'] = ' '.join(
129 sorted(extra_dll_excludes))
130
131 py2exe_py_path = venv_path / 'Lib' / 'site-packages' / 'py2exe'
132 if not py2exe_py_path.exists():
133 print('building py2exe')
134 subprocess.run([str(venv_python), 'setup.py', 'install'],
135 cwd=py2exe_source_path,
136 env=env,
137 check=True)
138
139 # Register location of msgfmt and other binaries.
140 env['PATH'] = '%s%s%s' % (
141 env['PATH'], os.pathsep, str(gettext_root / 'bin'))
142
143 print('building Mercurial')
144 subprocess.run(
145 [str(venv_python), 'setup.py',
146 'py2exe',
147 'build_doc', '--html'],
148 cwd=str(source_dir),
149 env=env,
150 check=True)
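
Note on ``extra_packages_script`` in ``build_py2exe``: the script's stdout is split on null bytes and only the text after the last one is read as a newline-separated package list, so any build chatter printed earlier is ignored. The sketch below isolates that parsing step. ``parse_extra_packages`` is a hypothetical helper name, and the sample output is made up for illustration.

```python
def parse_extra_packages(output: bytes):
    """Return the set of package names printed after the last null byte."""
    listing = output.split(b'\0')[-1].strip().decode('utf-8')
    return set(listing.splitlines())


# Hypothetical script output: build chatter, a null byte, then package names.
sample = b'Collecting foo\nInstalling...\n\0hgext3rd.evolve\nkeyring\n'
packages = parse_extra_packages(sample)

# Output with no null byte is treated entirely as a package listing.
no_marker = parse_extra_packages(b'dulwich\n')
```

One consequence worth keeping in mind: a script that forgets the null byte has all of its output, chatter included, interpreted as package names.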
@@ -0,0 +1,155 b''
1 # util.py - Common packaging utility code.
2 #
3 # Copyright 2019 Gregory Szorc <gregory.szorc@gmail.com>
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7
8 # no-check-code because Python 3 native.
9
10 import distutils.version
11 import getpass
12 import os
13 import pathlib
14 import subprocess
15 import tarfile
16 import zipfile
17
18
19 def extract_tar_to_directory(source: pathlib.Path, dest: pathlib.Path):
20 with tarfile.open(source, 'r') as tf:
21 tf.extractall(dest)
22
23
24 def extract_zip_to_directory(source: pathlib.Path, dest: pathlib.Path):
25 with zipfile.ZipFile(source, 'r') as zf:
26 zf.extractall(dest)
27
28
29 def find_vc_runtime_files(x64=False):
30 """Finds Visual C++ Runtime DLLs to include in distribution."""
31 winsxs = pathlib.Path(os.environ['SYSTEMROOT']) / 'WinSxS'
32
33 prefix = 'amd64' if x64 else 'x86'
34
35 candidates = sorted(p for p in os.listdir(winsxs)
36 if p.lower().startswith('%s_microsoft.vc90.crt_' % prefix))
37
38 for p in candidates:
39 print('found candidate VC runtime: %s' % p)
40
41 # Take the newest version.
42 version = candidates[-1]
43
44 d = winsxs / version
45
46 return [
47 d / 'msvcm90.dll',
48 d / 'msvcp90.dll',
49 d / 'msvcr90.dll',
50 winsxs / 'Manifests' / ('%s.manifest' % version),
51 ]
52
53
54 def windows_10_sdk_info():
55 """Resolves information about the Windows 10 SDK."""
56
57 base = pathlib.Path(os.environ['ProgramFiles(x86)']) / 'Windows Kits' / '10'
58
59 if not base.is_dir():
60 raise Exception('unable to find Windows 10 SDK at %s' % base)
61
62 # Find the latest version.
63 bin_base = base / 'bin'
64
65 versions = [v for v in os.listdir(bin_base) if v.startswith('10.')]
66 version = sorted(versions, reverse=True)[0]
67
68 bin_version = bin_base / version
69
70 return {
71 'root': base,
72 'version': version,
73 'bin_root': bin_version,
74 'bin_x86': bin_version / 'x86',
75 'bin_x64': bin_version / 'x64'
76 }
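
Note on ``windows_10_sdk_info``: the latest SDK is chosen by sorting directory names as strings. That matches numeric order as long as each dotted component has the same number of digits, but the two orderings can diverge otherwise. The sketch below shows the difference. The directory names are illustrative rather than taken from a real SDK install, and ``version_key`` is a hypothetical helper, not something this change defines.

```python
def version_key(name: str):
    """Turn '10.0.17763.0' into (10, 0, 17763, 0) for numeric comparison."""
    return tuple(int(part) for part in name.split('.'))


# Illustrative directory names; not taken from a real SDK install.
versions = ['10.0.9600.0', '10.0.17763.0', '10.0.10240.0']

# String ordering compares character by character, so '9...' beats '1...'.
lexicographic_latest = sorted(versions, reverse=True)[0]

# Numeric ordering compares each component as an integer.
numeric_latest = max(versions, key=version_key)
```

The same consideration applies to ``find_vc_runtime_files``, which takes the last entry of a string-sorted candidate list.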
77
78
79 def find_signtool():
80 """Find signtool.exe from the Windows SDK."""
81 sdk = windows_10_sdk_info()
82
83 for key in ('bin_x64', 'bin_x86'):
84 p = sdk[key] / 'signtool.exe'
85
86 if p.exists():
87 return p
88
89 raise Exception('could not find signtool.exe in Windows 10 SDK')
90
91
92 def sign_with_signtool(file_path, description, subject_name=None,
93 cert_path=None, cert_password=None,
94 timestamp_url=None):
95 """Digitally sign a file with signtool.exe.
96
97 ``file_path`` is the file to sign.
98 ``description`` is text that goes in the signature.
99
100 The signing certificate can be specified by ``cert_path`` or
101 ``subject_name``. These correspond to the ``/f`` and ``/n`` arguments
102 to signtool.exe, respectively.
103
104 The certificate password can be specified via ``cert_password``. If
105 not provided, you will be prompted for the password.
106
107 ``timestamp_url`` is the URL of an RFC 3161 timestamp server (``/tr``
108 argument to signtool.exe).
109 """
110 if cert_path and subject_name:
111 raise ValueError('cannot specify both cert_path and subject_name')
112
113 while cert_path and not cert_password:
114 cert_password = getpass.getpass('password for %s: ' % cert_path)
115
116 args = [
117 str(find_signtool()), 'sign',
118 '/v',
119 '/fd', 'sha256',
120 '/d', description,
121 ]
122
123 if cert_path:
124 args.extend(['/f', str(cert_path), '/p', cert_password])
125 elif subject_name:
126 args.extend(['/n', subject_name])
127
128 if timestamp_url:
129 args.extend(['/tr', timestamp_url, '/td', 'sha256'])
130
131 args.append(str(file_path))
132
133 print('signing %s' % file_path)
134 subprocess.run(args, check=True)
135
136
137 PRINT_PYTHON_INFO = '''
138 import platform; print("%s:%s" % (platform.architecture()[0], platform.python_version()))
139 '''.strip()
140
141
142 def python_exe_info(python_exe: pathlib.Path):
143 """Obtain information about a Python executable."""
144
145 res = subprocess.check_output([str(python_exe), '-c', PRINT_PYTHON_INFO])
146
147 arch, version = res.decode('utf-8').strip().split(':')
148
149 version = distutils.version.LooseVersion(version)
150
151 return {
152 'arch': arch,
153 'version': version,
154 'py3': version >= distutils.version.LooseVersion('3'),
155 }
@@ -0,0 +1,327 b''
1 # wix.py - WiX installer functionality
2 #
3 # Copyright 2019 Gregory Szorc <gregory.szorc@gmail.com>
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7
8 # no-check-code because Python 3 native.
9
10 import os
11 import pathlib
12 import re
13 import subprocess
14 import tempfile
15 import typing
16 import xml.dom.minidom
17
18 from .downloads import (
19 download_entry,
20 )
21 from .py2exe import (
22 build_py2exe,
23 )
24 from .util import (
25 extract_zip_to_directory,
26 sign_with_signtool,
27 )
28
29
30 SUPPORT_WXS = [
31 ('contrib.wxs', r'contrib'),
32 ('dist.wxs', r'dist'),
33 ('doc.wxs', r'doc'),
34 ('help.wxs', r'mercurial\help'),
35 ('i18n.wxs', r'i18n'),
36 ('locale.wxs', r'mercurial\locale'),
37 ('templates.wxs', r'mercurial\templates'),
38 ]
39
40
41 EXTRA_PACKAGES = {
42 'distutils',
43 'pygments',
44 }
45
46
47 def find_version(source_dir: pathlib.Path):
48 version_py = source_dir / 'mercurial' / '__version__.py'
49
50 with version_py.open('r', encoding='utf-8') as fh:
51 source = fh.read().strip()
52
53 m = re.search('version = b"(.*)"', source)
54 return m.group(1)
55
56
57 def normalize_version(version):
58 """Normalize Mercurial version string so WiX accepts it.
59
60 Version strings have to be numeric X.Y.Z.
61 """
62
63 if '+' in version:
64 version, extra = version.split('+', 1)
65 else:
66 extra = None
67
68 # 4.9rc0
69 if version[:-1].endswith('rc'):
70 version = version[:-3]
71
72 versions = [int(v) for v in version.split('.')]
73 while len(versions) < 3:
74 versions.append(0)
75
76 major, minor, build = versions[:3]
77
78 if extra:
79 # <commit count>-<hash>+<date>
80 build = int(extra.split('-')[0])
81
82 return '.'.join('%d' % x for x in (major, minor, build))
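
Note on ``normalize_version``: a few worked inputs make the normalization rules concrete. The function body is restated here only so the sketch runs standalone, and the sample version strings are illustrative.

```python
def normalize_version(version):
    """Reduce a Mercurial version string to numeric X.Y.Z."""
    if '+' in version:
        version, extra = version.split('+', 1)
    else:
        extra = None

    # Drop a release candidate suffix such as "rc0".
    if version[:-1].endswith('rc'):
        version = version[:-3]

    versions = [int(v) for v in version.split('.')]
    while len(versions) < 3:
        versions.append(0)

    major, minor, build = versions[:3]

    if extra:
        # The leading field of the suffix is the commit count.
        build = int(extra.split('-')[0])

    return '.'.join('%d' % x for x in (major, minor, build))


results = {
    v: normalize_version(v)
    for v in ('4.9', '4.9.1', '4.9rc0', '4.9.1+23-0123abcd')
}
```

Two details follow from the logic as written: a ``+<commit count>-...`` suffix replaces the third component rather than adding to it, and only single-digit ``rcN`` suffixes are stripped.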
83
84
85 def ensure_vc90_merge_modules(build_dir):
86 x86 = (
87 download_entry('vc9-crt-x86-msm', build_dir,
88 local_name='microsoft.vcxx.crt.x86_msm.msm')[0],
89 download_entry('vc9-crt-x86-msm-policy', build_dir,
90 local_name='policy.x.xx.microsoft.vcxx.crt.x86_msm.msm')[0]
91 )
92
93 x64 = (
94 download_entry('vc9-crt-x64-msm', build_dir,
95 local_name='microsoft.vcxx.crt.x64_msm.msm')[0],
96 download_entry('vc9-crt-x64-msm-policy', build_dir,
97 local_name='policy.x.xx.microsoft.vcxx.crt.x64_msm.msm')[0]
98 )
99 return {
100 'x86': x86,
101 'x64': x64,
102 }
103
104
105 def run_candle(wix, cwd, wxs, source_dir, defines=None):
106 args = [
107 str(wix / 'candle.exe'),
108 '-nologo',
109 str(wxs),
110 '-dSourceDir=%s' % source_dir,
111 ]
112
113 if defines:
114 args.extend('-d%s=%s' % define for define in sorted(defines.items()))
115
116 subprocess.run(args, cwd=str(cwd), check=True)
117
118
119 def make_post_build_signing_fn(name, subject_name=None, cert_path=None,
120 cert_password=None, timestamp_url=None):
121 """Create a callable that will use signtool to sign hg.exe."""
122
123 def post_build_sign(source_dir, build_dir, dist_dir, version):
124 description = '%s %s' % (name, version)
125
126 sign_with_signtool(dist_dir / 'hg.exe', description,
127 subject_name=subject_name, cert_path=cert_path,
128 cert_password=cert_password,
129 timestamp_url=timestamp_url)
130
131 return post_build_sign
132
133
134 LIBRARIES_XML = '''
135 <?xml version="1.0" encoding="utf-8"?>
136 <Wix xmlns="http://schemas.microsoft.com/wix/2006/wi">
137
138 <?include {wix_dir}/guids.wxi ?>
139 <?include {wix_dir}/defines.wxi ?>
140
141 <Fragment>
142 <DirectoryRef Id="INSTALLDIR" FileSource="$(var.SourceDir)">
143 <Directory Id="libdir" Name="lib" FileSource="$(var.SourceDir)/lib">
144 <Component Id="libOutput" Guid="$(var.lib.guid)" Win64='$(var.IsX64)'>
145 </Component>
146 </Directory>
147 </DirectoryRef>
148 </Fragment>
149 </Wix>
150 '''.lstrip()
151
152
153 def make_libraries_xml(wix_dir: pathlib.Path, dist_dir: pathlib.Path):
154 """Make XML data for library components WXS."""
155 # We can't use ElementTree because it doesn't handle the
156 # <?include ?> directives.
157 doc = xml.dom.minidom.parseString(
158 LIBRARIES_XML.format(wix_dir=str(wix_dir)))
159
160 component = doc.getElementsByTagName('Component')[0]
161
162 f = doc.createElement('File')
163 f.setAttribute('Name', 'library.zip')
164 f.setAttribute('KeyPath', 'yes')
165 component.appendChild(f)
166
167 lib_dir = dist_dir / 'lib'
168
169 for p in sorted(lib_dir.iterdir()):
170 if not p.name.endswith(('.dll', '.pyd')):
171 continue
172
173 f = doc.createElement('File')
174 f.setAttribute('Name', p.name)
175 component.appendChild(f)
176
177 return doc.toprettyxml()
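
Note on ``make_libraries_xml``: the approach relies on ``xml.dom.minidom`` keeping ``<?include ?>`` processing instructions in the parsed tree and writing them back out. The sketch below demonstrates this on a trimmed-down template. The template, the include target, and the file names are placeholders, not the real WiX sources.

```python
import xml.dom.minidom

# Trimmed-down template; the include target is a placeholder file name.
# lstrip() matters: the XML declaration must be the first thing in the input.
TEMPLATE = '''
<?xml version="1.0" encoding="utf-8"?>
<Wix>
  <?include guids.wxi ?>
  <Component Id="libOutput">
  </Component>
</Wix>
'''.lstrip()

doc = xml.dom.minidom.parseString(TEMPLATE)

component = doc.getElementsByTagName('Component')[0]

# Append one <File> element per library, as make_libraries_xml does.
for name in ('library.zip', 'python27.dll'):
    f = doc.createElement('File')
    f.setAttribute('Name', name)
    component.appendChild(f)

rendered = doc.toprettyxml()
file_names = [e.getAttribute('Name') for e in doc.getElementsByTagName('File')]
```

The rendered document still carries the ``<?include ?>`` directive alongside the appended ``File`` elements, which is what ``candle.exe`` needs to resolve the shared definitions.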
178
179
180 def build_installer(source_dir: pathlib.Path, python_exe: pathlib.Path,
181 msi_name='mercurial', version=None, post_build_fn=None,
182 extra_packages_script=None,
183 extra_wxs:typing.Optional[typing.Dict[str,str]]=None,
184 extra_features:typing.Optional[typing.List[str]]=None):
185 """Build a WiX MSI installer.
186
187 ``source_dir`` is the path to the Mercurial source tree to use.
188 The target architecture (``x86`` or ``x64``) is inferred from the ``LIB`` environment variable.
189 ``python_exe`` is the path to the Python executable to use/bundle.
190 ``version`` is the Mercurial version string. If not defined,
191 ``mercurial/__version__.py`` will be consulted.
192 ``post_build_fn`` is a callable that will be called after building
193 Mercurial but before invoking WiX. It can be used to e.g. facilitate
194 signing. It is passed the paths to the Mercurial source, build, and
195 dist directories and the resolved Mercurial version.
196 ``extra_packages_script`` is a command to be run to inject extra packages
197 into the py2exe binary. It should stage packages into the virtualenv and
198 print a null byte followed by a newline-separated list of packages that
199 should be included in the exe.
200 ``extra_wxs`` is a dict of {wxs_name: working_dir_for_wxs_build}.
201 ``extra_features`` is a list of additional named Features to include in
202 the build. These must match Feature names in one of the wxs scripts.
203 """
204 arch = 'x64' if r'\x64' in os.environ.get('LIB', '') else 'x86'
205
206 hg_build_dir = source_dir / 'build'
207 dist_dir = source_dir / 'dist'
208 wix_dir = source_dir / 'contrib' / 'packaging' / 'wix'
209
210 requirements_txt = wix_dir / 'requirements.txt'
211
212 build_py2exe(source_dir, hg_build_dir,
213 python_exe, 'wix', requirements_txt,
214 extra_packages=EXTRA_PACKAGES,
215 extra_packages_script=extra_packages_script)
216
217 version = version or normalize_version(find_version(source_dir))
218 print('using version string: %s' % version)
219
220 if post_build_fn:
221 post_build_fn(source_dir, hg_build_dir, dist_dir, version)
222
223 build_dir = hg_build_dir / ('wix-%s' % arch)
224
225 build_dir.mkdir(exist_ok=True)
226
227 wix_pkg, wix_entry = download_entry('wix', hg_build_dir)
228 wix_path = hg_build_dir / ('wix-%s' % wix_entry['version'])
229
230 if not wix_path.exists():
231 extract_zip_to_directory(wix_pkg, wix_path)
232
233 ensure_vc90_merge_modules(hg_build_dir)
234
235 source_build_rel = pathlib.Path(os.path.relpath(source_dir, build_dir))
236
237 defines = {'Platform': arch}
238
239 for wxs, rel_path in SUPPORT_WXS:
240 wxs = wix_dir / wxs
241 wxs_source_dir = source_dir / rel_path
242 run_candle(wix_path, build_dir, wxs, wxs_source_dir, defines=defines)
243
244 for source, rel_path in sorted((extra_wxs or {}).items()):
245 run_candle(wix_path, build_dir, source, rel_path, defines=defines)
246
247 # candle.exe doesn't like when we have an open handle on the file.
248 # So use TemporaryDirectory() instead of NamedTemporaryFile().
249 with tempfile.TemporaryDirectory() as td:
250 td = pathlib.Path(td)
251
252 tf = td / 'library.wxs'
253 with tf.open('w') as fh:
254 fh.write(make_libraries_xml(wix_dir, dist_dir))
255
256 run_candle(wix_path, build_dir, tf, dist_dir, defines=defines)
257
258 source = wix_dir / 'mercurial.wxs'
259 defines['Version'] = version
260 defines['Comments'] = 'Installs Mercurial version %s' % version
261 defines['VCRedistSrcDir'] = str(hg_build_dir)
262 if extra_features:
263 assert all(';' not in f for f in extra_features)
264 defines['MercurialExtraFeatures'] = ';'.join(extra_features)
265
266 run_candle(wix_path, build_dir, source, source_build_rel, defines=defines)
267
268 msi_path = source_dir / 'dist' / (
269 '%s-%s-%s.msi' % (msi_name, version, arch))
270
271 args = [
272 str(wix_path / 'light.exe'),
273 '-nologo',
274 '-ext', 'WixUIExtension',
275 '-sw1076',
276 '-spdb',
277 '-o', str(msi_path),
278 ]
279
280 for source, rel_path in SUPPORT_WXS:
281 assert source.endswith('.wxs')
282 args.append(str(build_dir / ('%s.wixobj' % source[:-4])))
283
284 for source, rel_path in sorted((extra_wxs or {}).items()):
285 assert source.endswith('.wxs')
286 source = os.path.basename(source)
287 args.append(str(build_dir / ('%s.wixobj' % source[:-4])))
288
289 args.extend([
290 str(build_dir / 'library.wixobj'),
291 str(build_dir / 'mercurial.wixobj'),
292 ])
293
294 subprocess.run(args, cwd=str(source_dir), check=True)
295
296 print('%s created' % msi_path)
297
298 return {
299 'msi_path': msi_path,
300 }
301
302
303 def build_signed_installer(source_dir: pathlib.Path, python_exe: pathlib.Path,
304 name: str, version=None, subject_name=None,
305 cert_path=None, cert_password=None,
306 timestamp_url=None, extra_packages_script=None,
307 extra_wxs=None, extra_features=None):
308 """Build an installer with signed executables."""
309
310 post_build_fn = make_post_build_signing_fn(
311 name,
312 subject_name=subject_name,
313 cert_path=cert_path,
314 cert_password=cert_password,
315 timestamp_url=timestamp_url)
316
317 info = build_installer(source_dir, python_exe=python_exe,
318 msi_name=name.lower(), version=version,
319 post_build_fn=post_build_fn,
320 extra_packages_script=extra_packages_script,
321 extra_wxs=extra_wxs, extra_features=extra_features)
322
323 description = '%s %s' % (name, version)
324
325 sign_with_signtool(info['msi_path'], description,
326 subject_name=subject_name, cert_path=cert_path,
327 cert_password=cert_password, timestamp_url=timestamp_url)
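The ``build_installer`` docstring above says ``extra_packages_script`` should print a null byte followed by a newline-separated list of packages. The code that consumes this output is not shown in this diff, so the following is only a minimal sketch of how such output could be parsed. The function name ``parse_extra_packages`` and the UTF-8 decoding are assumptions made for illustration.

```python
def parse_extra_packages(output: bytes) -> list:
    """Return package names listed after the first null byte in ``output``.

    Hypothetical helper: anything printed before the null byte (for
    example pip log noise) is ignored, and blank lines are skipped.
    """
    _, sep, tail = output.partition(b'\0')
    if not sep:
        # No null byte means the script announced no extra packages.
        return []
    return [line.strip().decode('utf-8')
            for line in tail.splitlines() if line.strip()]


if __name__ == '__main__':
    print(parse_extra_packages(b'installing...\n\0foo\nbar\n'))
```

Using the null byte as a marker lets the script emit arbitrary diagnostic output first without it being mistaken for package names.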
@@ -0,0 +1,51 b''
1 #!/usr/bin/env python3
2 # build.py - Inno installer build script.
3 #
4 # Copyright 2019 Gregory Szorc <gregory.szorc@gmail.com>
5 #
6 # This software may be used and distributed according to the terms of the
7 # GNU General Public License version 2 or any later version.
8
9 # This script automates the building of the Inno Setup installer for Mercurial.
10
11 # no-check-code because Python 3 native.
12
13 import argparse
14 import os
15 import pathlib
16 import sys
17
18
19 if __name__ == '__main__':
20 parser = argparse.ArgumentParser()
21
22 parser.add_argument('--python',
23 required=True,
24 help='path to python.exe to use')
25 parser.add_argument('--iscc',
26 help='path to iscc.exe to use')
27 parser.add_argument('--version',
28 help='Mercurial version string to use '
29 '(detected from __version__.py if not defined)')
30
31 args = parser.parse_args()
32
33 if not os.path.isabs(args.python):
34 raise Exception('--python arg must be an absolute path')
35
36 if args.iscc:
37 iscc = pathlib.Path(args.iscc)
38 else:
39 iscc = (pathlib.Path(os.environ['ProgramFiles(x86)']) / 'Inno Setup 5' /
40 'ISCC.exe')
41
42 here = pathlib.Path(os.path.abspath(os.path.dirname(__file__)))
43 source_dir = here.parent.parent.parent
44 build_dir = source_dir / 'build'
45
46 sys.path.insert(0, str(source_dir / 'contrib' / 'packaging'))
47
48 from hgpackaging.inno import build
49
50 build(source_dir, build_dir, pathlib.Path(args.python), iscc,
51 version=args.version)
@@ -0,0 +1,219 b''
1 // ----------------------------------------------------------------------------
2 //
3 // Inno Setup Ver: 5.4.2
4 // Script Version: 1.4.2
5 // Author: Jared Breland <jbreland@legroom.net>
6 // Homepage: http://www.legroom.net/software
7 // License: GNU Lesser General Public License (LGPL), version 3
8 // http://www.gnu.org/licenses/lgpl.html
9 //
10 // Script Function:
11 // Allow modification of environmental path directly from Inno Setup installers
12 //
13 // Instructions:
14 // Copy modpath.iss to the same directory as your setup script
15 //
16 // Add this statement to your [Setup] section
17 // ChangesEnvironment=true
18 //
19 // Add this statement to your [Tasks] section
20 // You can change the Description or Flags
21 // You can change the Name, but it must match the ModPathName setting below
22 // Name: modifypath; Description: &Add application directory to your environmental path; Flags: unchecked
23 //
24 // Add the following to the end of your [Code] section
25 // ModPathName defines the name of the task defined above
26 // ModPathType defines whether the 'user' or 'system' path will be modified;
27 // this will default to user if anything other than system is set
28 // setArrayLength must specify the total number of dirs to be added
29 // Result[0] contains first directory, Result[1] contains second, etc.
30 // const
31 // ModPathName = 'modifypath';
32 // ModPathType = 'user';
33 //
34 // function ModPathDir(): TArrayOfString;
35 // begin
36 // setArrayLength(Result, 1);
37 // Result[0] := ExpandConstant('{app}');
38 // end;
39 // #include "modpath.iss"
40 // ----------------------------------------------------------------------------
41
42 procedure ModPath();
43 var
44 oldpath: String;
45 newpath: String;
46 updatepath: Boolean;
47 pathArr: TArrayOfString;
48 aExecFile: String;
49 aExecArr: TArrayOfString;
50 i, d: Integer;
51 pathdir: TArrayOfString;
52 regroot: Integer;
53 regpath: String;
54
55 begin
56 // Get constants from main script and adjust behavior accordingly
57 // ModPathType MUST be 'system' or 'user'; force 'user' if invalid
58 if ModPathType = 'system' then begin
59 regroot := HKEY_LOCAL_MACHINE;
60 regpath := 'SYSTEM\CurrentControlSet\Control\Session Manager\Environment';
61 end else begin
62 regroot := HKEY_CURRENT_USER;
63 regpath := 'Environment';
64 end;
65
66 // Get array of new directories and act on each individually
67 pathdir := ModPathDir();
68 for d := 0 to GetArrayLength(pathdir)-1 do begin
69 updatepath := true;
70
71 // Modify WinNT path
72 if UsingWinNT() = true then begin
73
74 // Get current path, split into an array
75 RegQueryStringValue(regroot, regpath, 'Path', oldpath);
76 oldpath := oldpath + ';';
77 i := 0;
78
79 while (Pos(';', oldpath) > 0) do begin
80 SetArrayLength(pathArr, i+1);
81 pathArr[i] := Copy(oldpath, 0, Pos(';', oldpath)-1);
82 oldpath := Copy(oldpath, Pos(';', oldpath)+1, Length(oldpath));
83 i := i + 1;
84
85 // Check if current directory matches app dir
86 if pathdir[d] = pathArr[i-1] then begin
87 // if uninstalling, remove dir from path
88 if IsUninstaller() = true then begin
89 continue;
90 // if installing, flag that dir already exists in path
91 end else begin
92 updatepath := false;
93 end;
94 end;
95
96 // Add current directory to new path
97 if i = 1 then begin
98 newpath := pathArr[i-1];
99 end else begin
100 newpath := newpath + ';' + pathArr[i-1];
101 end;
102 end;
103
104 // Append app dir to path if not already included
105 if (IsUninstaller() = false) AND (updatepath = true) then
106 newpath := newpath + ';' + pathdir[d];
107
108 // Write new path
109 RegWriteStringValue(regroot, regpath, 'Path', newpath);
110
111 // Modify Win9x path
112 end else begin
113
114 // Convert to shortened dirname
115 pathdir[d] := GetShortName(pathdir[d]);
116
117 // If autoexec.bat exists, check if app dir already exists in path
118 aExecFile := 'C:\AUTOEXEC.BAT';
119 if FileExists(aExecFile) then begin
120 LoadStringsFromFile(aExecFile, aExecArr);
121 for i := 0 to GetArrayLength(aExecArr)-1 do begin
122 if IsUninstaller() = false then begin
123 // If app dir already exists while installing, skip add
124 if (Pos(pathdir[d], aExecArr[i]) > 0) then
125 updatepath := false;
126 break;
127 end else begin
128 // If app dir exists and = what we originally set, then delete at uninstall
129 if aExecArr[i] = 'SET PATH=%PATH%;' + pathdir[d] then
130 aExecArr[i] := '';
131 end;
132 end;
133 end;
134
135 // If app dir not found, or autoexec.bat didn't exist, then (create and) append to current path
136 if (IsUninstaller() = false) AND (updatepath = true) then begin
137 SaveStringToFile(aExecFile, #13#10 + 'SET PATH=%PATH%;' + pathdir[d], True);
138
139 // If uninstalling, write the full autoexec out
140 end else begin
141 SaveStringsToFile(aExecFile, aExecArr, False);
142 end;
143 end;
144 end;
145 end;
146
147 // Split a string into an array using passed delimiter
148 procedure MPExplode(var Dest: TArrayOfString; Text: String; Separator: String);
149 var
150 i: Integer;
151 begin
152 i := 0;
153 repeat
154 SetArrayLength(Dest, i+1);
155 if Pos(Separator,Text) > 0 then begin
156 Dest[i] := Copy(Text, 1, Pos(Separator, Text)-1);
157 Text := Copy(Text, Pos(Separator,Text) + Length(Separator), Length(Text));
158 i := i + 1;
159 end else begin
160 Dest[i] := Text;
161 Text := '';
162 end;
163 until Length(Text)=0;
164 end;
165
166
167 procedure CurStepChanged(CurStep: TSetupStep);
168 var
169 taskname: String;
170 begin
171 taskname := ModPathName;
172 if CurStep = ssPostInstall then
173 if IsTaskSelected(taskname) then
174 ModPath();
175 end;
176
177 procedure CurUninstallStepChanged(CurUninstallStep: TUninstallStep);
178 var
179 aSelectedTasks: TArrayOfString;
180 i: Integer;
181 taskname: String;
182 regpath: String;
183 regstring: String;
184 appid: String;
185 begin
186 // only run during actual uninstall
187 if CurUninstallStep = usUninstall then begin
188 // get list of selected tasks saved in registry at install time
189 appid := '{#emit SetupSetting("AppId")}';
190 if appid = '' then appid := '{#emit SetupSetting("AppName")}';
191 regpath := ExpandConstant('Software\Microsoft\Windows\CurrentVersion\Uninstall\'+appid+'_is1');
192 RegQueryStringValue(HKLM, regpath, 'Inno Setup: Selected Tasks', regstring);
193 if regstring = '' then RegQueryStringValue(HKCU, regpath, 'Inno Setup: Selected Tasks', regstring);
194
195 // check each task; if matches modpath taskname, trigger path removal
196 if regstring <> '' then begin
197 taskname := ModPathName;
198 MPExplode(aSelectedTasks, regstring, ',');
199 if GetArrayLength(aSelectedTasks) > 0 then begin
200 for i := 0 to GetArrayLength(aSelectedTasks)-1 do begin
201 if comparetext(aSelectedTasks[i], taskname) = 0 then
202 ModPath();
203 end;
204 end;
205 end;
206 end;
207 end;
208
209 function NeedRestart(): Boolean;
210 var
211 taskname: String;
212 begin
213 taskname := ModPathName;
214 if IsTaskSelected(taskname) and not UsingWinNT() then begin
215 Result := True;
216 end else begin
217 Result := False;
218 end;
219 end;
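The WinNT branch of ``ModPath()`` above splits the ``Path`` registry value on ``;``, drops the application directory when uninstalling, and appends it when installing if it is not already present. A rough Python sketch of that logic follows. It is not a port of the Pascal script: the function name ``update_path`` is hypothetical, and unlike the script this version discards empty entries.

```python
def update_path(path_value, app_dir, uninstall=False):
    """Return a ';'-separated path with ``app_dir`` added or removed.

    Sketch only: comparison is an exact string match, as in the
    script, and empty entries are dropped (an assumption of this
    example, not behavior of modpath.iss).
    """
    entries = [e for e in path_value.split(';') if e]
    if uninstall:
        # Remove every exact occurrence of the application directory.
        entries = [e for e in entries if e != app_dir]
    elif app_dir not in entries:
        # Append only if it is not already on the path.
        entries.append(app_dir)
    return ';'.join(entries)


if __name__ == '__main__':
    print(update_path(r'C:\Windows;C:\Tools', r'C:\Mercurial'))
```

Installing twice leaves the path unchanged the second time, which is the property the ``updatepath`` flag protects in the original script.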
@@ -0,0 +1,38 b''
1 #
2 # This file is autogenerated by pip-compile
3 # To update, run:
4 #
5 # pip-compile --generate-hashes contrib/packaging/inno/requirements.txt.in -o contrib/packaging/inno/requirements.txt
6 #
7 certifi==2018.11.29 \
8 --hash=sha256:47f9c83ef4c0c621eaef743f133f09fa8a74a9b75f037e8624f83bd1b6626cb7 \
9 --hash=sha256:993f830721089fef441cdfeb4b2c8c9df86f0c63239f06bd025a76a7daddb033 \
10 # via dulwich
11 configparser==3.7.3 \
12 --hash=sha256:27594cf4fc279f321974061ac69164aaebd2749af962ac8686b20503ac0bcf2d \
13 --hash=sha256:9d51fe0a382f05b6b117c5e601fc219fede4a8c71703324af3f7d883aef476a3 \
14 # via entrypoints
15 docutils==0.14 \
16 --hash=sha256:02aec4bd92ab067f6ff27a38a38a41173bf01bed8f89157768c1573f53e474a6 \
17 --hash=sha256:51e64ef2ebfb29cae1faa133b3710143496eca21c530f3f71424d77687764274 \
18 --hash=sha256:7a4bd47eaf6596e1295ecb11361139febe29b084a87bf005bf899f9a42edc3c6
19 dulwich==0.19.11 \
20 --hash=sha256:afbe070f6899357e33f63f3f3696e601731fef66c64a489dea1bc9f539f4a725
21 entrypoints==0.3 \
22 --hash=sha256:589f874b313739ad35be6e0cd7efde2a4e9b6fea91edcc34e58ecbb8dbe56d19 \
23 --hash=sha256:c70dd71abe5a8c85e55e12c19bd91ccfeec11a6e99044204511f9ed547d48451 \
24 # via keyring
25 keyring==18.0.0 \
26 --hash=sha256:12833d2b05d2055e0e25931184af9cd6a738f320a2264853cabbd8a3a0f0b65d \
27 --hash=sha256:ca33f5ccc542b9ffaa196ee9a33488069e5e7eac77d5b81969f8a3ce74d0230c
28 pygments==2.3.1 \
29 --hash=sha256:5ffada19f6203563680669ee7f53b64dabbeb100eb51b61996085e99c03b284a \
30 --hash=sha256:e8218dd399a61674745138520d0d4cf2621d7e032439341bc3f647bff125818d
31 pywin32-ctypes==0.2.0 \
32 --hash=sha256:24ffc3b341d457d48e8922352130cf2644024a4ff09762a2261fd34c36ee5942 \
33 --hash=sha256:9dc2d991b3479cc2df15930958b674a48a227d5361d413827a4cfd0b5876fc98 \
34 # via keyring
35 urllib3==1.24.1 \
36 --hash=sha256:61bf29cada3fc2fbefad4fdf059ea4bd1b4a86d2b6d15e1c7c0b582b9752fe39 \
37 --hash=sha256:de9529817c93f27c8ccbfead6985011db27bd0ddfcdb2d86f3f663385c6a9c22 \
38 # via dulwich
@@ -0,0 +1,4 b''
1 docutils
2 dulwich
3 keyring
4 pygments
@@ -0,0 +1,84 b''
1 #!/usr/bin/env python3
2 # Copyright 2019 Gregory Szorc <gregory.szorc@gmail.com>
3 #
4 # This software may be used and distributed according to the terms of the
5 # GNU General Public License version 2 or any later version.
6
7 # no-check-code because Python 3 native.
8
9 """Code to build Mercurial WiX installer."""
10
11 import argparse
12 import os
13 import pathlib
14 import sys
15
16
17 if __name__ == '__main__':
18 parser = argparse.ArgumentParser()
19
20 parser.add_argument('--name',
21 help='Application name',
22 default='Mercurial')
23 parser.add_argument('--python',
24 help='Path to Python executable to use',
25 required=True)
26 parser.add_argument('--sign-sn',
27 help='Subject name (or fragment thereof) of certificate '
28 'to use for signing')
29 parser.add_argument('--sign-cert',
30 help='Path to certificate to use for signing')
31 parser.add_argument('--sign-password',
32 help='Password for signing certificate')
33 parser.add_argument('--sign-timestamp-url',
34 help='URL of timestamp server to use for signing')
35 parser.add_argument('--version',
36 help='Version string to use')
37 parser.add_argument('--extra-packages-script',
38 help=('Script to execute to include extra packages in '
39 'py2exe binary.'))
40 parser.add_argument('--extra-wxs',
41 help='CSV of path_to_wxs_file=working_dir_for_wxs_file')
42 parser.add_argument('--extra-features',
43 help=('CSV of extra feature names to include '
44 'in the installer from the extra wxs files'))
45
46 args = parser.parse_args()
47
48 here = pathlib.Path(os.path.abspath(os.path.dirname(__file__)))
49 source_dir = here.parent.parent.parent
50
51 sys.path.insert(0, str(source_dir / 'contrib' / 'packaging'))
52
53 from hgpackaging.wix import (
54 build_installer,
55 build_signed_installer,
56 )
57
58 fn = build_installer
59 kwargs = {
60 'source_dir': source_dir,
61 'python_exe': pathlib.Path(args.python),
62 'version': args.version,
63 }
64
65 if not os.path.isabs(args.python):
66 raise Exception('--python arg must be an absolute path')
67
68 if args.extra_packages_script:
69 kwargs['extra_packages_script'] = args.extra_packages_script
70 if args.extra_wxs:
71 kwargs['extra_wxs'] = dict(
72 thing.split("=") for thing in args.extra_wxs.split(','))
73 if args.extra_features:
74 kwargs['extra_features'] = args.extra_features.split(',')
75
76 if args.sign_sn or args.sign_cert:
77 fn = build_signed_installer
78 kwargs['name'] = args.name
79 kwargs['subject_name'] = args.sign_sn
80 kwargs['cert_path'] = args.sign_cert
81 kwargs['cert_password'] = args.sign_password
82 kwargs['timestamp_url'] = args.sign_timestamp_url
83
84 fn(**kwargs)
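The ``--extra-wxs`` handling above builds a dict from a ``path=working_dir`` CSV with ``thing.split("=")``, which raises a fairly opaque error if an entry lacks ``=`` or contains more than one. A hedged alternative is sketched below. The function name ``parse_extra_wxs`` and the error message are assumptions for illustration and are not part of the script.

```python
def parse_extra_wxs(value):
    """Parse 'a.wxs=dir1,b.wxs=dir2' into {'a.wxs': 'dir1', 'b.wxs': 'dir2'}.

    Hypothetical helper: splits each entry at the first '=' only and
    reports malformed entries explicitly.
    """
    result = {}
    for item in value.split(','):
        path, sep, working_dir = item.partition('=')
        if not sep or not path or not working_dir:
            raise ValueError('expected path=working_dir, got %r' % item)
        result[path] = working_dir
    return result


if __name__ == '__main__':
    print(parse_extra_wxs('a.wxs=dir1,b.wxs=dir2'))
```

Splitting at the first ``=`` only means a working directory that itself contains ``=`` still parses; whether that case matters in practice is not stated in the source.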
@@ -0,0 +1,13 b''
1 #
2 # This file is autogenerated by pip-compile
3 # To update, run:
4 #
5 # pip-compile --generate-hashes contrib/packaging/wix/requirements.txt.in -o contrib/packaging/wix/requirements.txt
6 #
7 docutils==0.14 \
8 --hash=sha256:02aec4bd92ab067f6ff27a38a38a41173bf01bed8f89157768c1573f53e474a6 \
9 --hash=sha256:51e64ef2ebfb29cae1faa133b3710143496eca21c530f3f71424d77687764274 \
10 --hash=sha256:7a4bd47eaf6596e1295ecb11361139febe29b084a87bf005bf899f9a42edc3c6
11 pygments==2.3.1 \
12 --hash=sha256:5ffada19f6203563680669ee7f53b64dabbeb100eb51b61996085e99c03b284a \
13 --hash=sha256:e8218dd399a61674745138520d0d4cf2621d7e032439341bc3f647bff125818d
@@ -0,0 +1,2 b''
1 docutils
2 pygments
1 NO CONTENT: new file 100644
The requested commit or file is too big and content was truncated. Show full diff
@@ -5,7 +5,7 b''
5 5 # % make PREFIX=/opt/ install
6 6
7 7 export PREFIX=/usr/local
8 PYTHON=python
8 PYTHON?=python
9 9 $(eval HGROOT := $(shell pwd))
10 10 HGPYTHONS ?= $(HGROOT)/build/pythons
11 11 PURE=
@@ -47,3 +47,6 b' parents(20000)'
47 47 # The one below is used by rebase
48 48 (children(ancestor(tip~5, tip)) and ::(tip~5))::
49 49 heads(commonancestors(last(head(), 2)))
50 heads(-10000:-1)
51 roots(-10000:-1)
52 only(max(head()), min(head()))
@@ -25,7 +25,7 b' def reducetest(a, b):'
25 25
26 26 try:
27 27 test1(a, b)
28 except Exception as inst:
28 except Exception:
29 29 reductions += 1
30 30 tries = 0
31 31 a = a2
@@ -40,6 +40,8 b' try:'
40 40 except ImportError:
41 41 re2 = None
42 42
43 import testparseutil
44
43 45 def compilere(pat, multiline=False):
44 46 if multiline:
45 47 pat = '(?m)' + pat
@@ -231,8 +233,10 b' utestfilters = ['
231 233 (r"( +)(#([^!][^\n]*\S)?)", repcomment),
232 234 ]
233 235
234 pypats = [
236 # common patterns to check *.py
237 commonpypats = [
235 238 [
239 (r'\\$', 'Use () to wrap long lines in Python, not \\'),
236 240 (r'^\s*def\s*\w+\s*\(.*,\s*\(',
237 241 "tuple parameter unpacking not available in Python 3+"),
238 242 (r'lambda\s*\(.*,.*\)',
@@ -261,7 +265,6 b' pypats = ['
261 265 # a pass at the same indent level, which is bogus
262 266 r'(?P=indent)pass[ \t\n#]'
263 267 ), 'omit superfluous pass'),
264 (r'.{81}', "line too long"),
265 268 (r'[^\n]\Z', "no trailing newline"),
266 269 (r'(\S[ \t]+|^[ \t]+)\n', "trailing whitespace"),
267 270 # (r'^\s+[^_ \n][^_. \n]+_[^_\n]+\s*=',
@@ -299,7 +302,6 b' pypats = ['
299 302 "wrong whitespace around ="),
300 303 (r'\([^()]*( =[^=]|[^<>!=]= )',
301 304 "no whitespace around = for named parameters"),
302 (r'raise Exception', "don't raise generic exceptions"),
303 305 (r'raise [^,(]+, (\([^\)]+\)|[^,\(\)]+)$',
304 306 "don't use old-style two-argument raise, use Exception(message)"),
305 307 (r' is\s+(not\s+)?["\'0-9-]', "object comparison with literal"),
@@ -315,21 +317,12 b' pypats = ['
315 317 "use opener.read() instead"),
316 318 (r'opener\([^)]*\).write\(',
317 319 "use opener.write() instead"),
318 (r'[\s\(](open|file)\([^)]*\)\.read\(',
319 "use util.readfile() instead"),
320 (r'[\s\(](open|file)\([^)]*\)\.write\(',
321 "use util.writefile() instead"),
322 (r'^[\s\(]*(open(er)?|file)\([^)]*\)(?!\.close\(\))',
323 "always assign an opened file to a variable, and close it afterwards"),
324 (r'[\s\(](open|file)\([^)]*\)\.(?!close\(\))',
325 "always assign an opened file to a variable, and close it afterwards"),
326 320 (r'(?i)descend[e]nt', "the proper spelling is descendAnt"),
327 321 (r'\.debug\(\_', "don't mark debug messages for translation"),
328 322 (r'\.strip\(\)\.split\(\)', "no need to strip before splitting"),
329 323 (r'^\s*except\s*:', "naked except clause", r'#.*re-raises'),
330 324 (r'^\s*except\s([^\(,]+|\([^\)]+\))\s*,',
331 325 'legacy exception syntax; use "as" instead of ","'),
332 (r':\n( )*( ){1,3}[^ ]', "must indent 4 spaces"),
333 326 (r'release\(.*wlock, .*lock\)', "wrong lock release order"),
334 327 (r'\bdef\s+__bool__\b', "__bool__ should be __nonzero__ in Python 2"),
335 328 (r'os\.path\.join\(.*, *(""|\'\')\)',
@@ -339,7 +332,6 b' pypats = ['
339 332 (r'def.*[( ]\w+=\{\}', "don't use mutable default arguments"),
340 333 (r'\butil\.Abort\b', "directly use error.Abort"),
341 334 (r'^@(\w*\.)?cachefunc', "module-level @cachefunc is risky, please avoid"),
342 (r'^import atexit', "don't use atexit, use ui.atexit"),
343 335 (r'^import Queue', "don't use Queue, use pycompat.queue.Queue + "
344 336 "pycompat.queue.Empty"),
345 337 (r'^import cStringIO', "don't use cStringIO.StringIO, use util.stringio"),
@@ -358,6 +350,34 b' pypats = ['
358 350 "don't convert rev to node before passing to revision(nodeorrev)"),
359 351 (r'platform\.system\(\)', "don't use platform.system(), use pycompat"),
360 352
353 ],
354 # warnings
355 [
356 ]
357 ]
358
359 # patterns to check normal *.py files
360 pypats = [
361 [
362 # Ideally, these should be placed in "commonpypats" for
363 # consistency of coding rules in Mercurial source tree.
364 # On the other hand, these rules matter less for Python code
365 # fragments embedded in test scripts. Fixing test scripts to
366 # satisfy these patterns would require many changes, and the
367 # effort would outweigh the benefit.
368 (r'.{81}', "line too long"),
369 (r'raise Exception', "don't raise generic exceptions"),
370 (r'[\s\(](open|file)\([^)]*\)\.read\(',
371 "use util.readfile() instead"),
372 (r'[\s\(](open|file)\([^)]*\)\.write\(',
373 "use util.writefile() instead"),
374 (r'^[\s\(]*(open(er)?|file)\([^)]*\)(?!\.close\(\))',
375 "always assign an opened file to a variable, and close it afterwards"),
376 (r'[\s\(](open|file)\([^)]*\)\.(?!close\(\))',
377 "always assign an opened file to a variable, and close it afterwards"),
378 (r':\n( )*( ){1,3}[^ ]', "must indent 4 spaces"),
379 (r'^import atexit', "don't use atexit, use ui.atexit"),
380
361 381 # rules depending on implementation of repquote()
362 382 (r' x+[xpqo%APM][\'"]\n\s+[\'"]x',
363 383 'string join across lines with no space'),
@@ -376,21 +396,35 b' pypats = ['
376 396 # because _preparepats forcibly adds "\n" into [^...],
377 397 # even though this regexp wants match it against "\n")''',
378 398 "missing _() in ui message (use () to hide false-positives)"),
379 ],
399 ] + commonpypats[0],
380 400 # warnings
381 401 [
382 402 # rules depending on implementation of repquote()
383 403 (r'(^| )pp +xxxxqq[ \n][^\n]', "add two newlines after '.. note::'"),
384 ]
404 ] + commonpypats[1]
385 405 ]
386 406
387 pyfilters = [
407 # patterns to check Python code embedded in test scripts
408 embeddedpypats = [
409 [
410 ] + commonpypats[0],
411 # warnings
412 [
413 ] + commonpypats[1]
414 ]
415
416 # common filters to convert *.py
417 commonpyfilters = [
388 418 (r"""(?msx)(?P<comment>\#.*?$)|
389 419 ((?P<quote>('''|\"\"\"|(?<!')'(?!')|(?<!")"(?!")))
390 420 (?P<text>(([^\\]|\\.)*?))
391 421 (?P=quote))""", reppython),
392 422 ]
393 423
424 # filters to convert normal *.py files
425 pyfilters = [
426 ] + commonpyfilters
427
394 428 # non-filter patterns
395 429 pynfpats = [
396 430 [
@@ -403,6 +437,10 b' pynfpats = ['
403 437 [],
404 438 ]
405 439
440 # filters to convert Python code embedded in test scripts
441 embeddedpyfilters = [
442 ] + commonpyfilters
443
406 444 # extension non-filter patterns
407 445 pyextnfpats = [
408 446 [(r'^"""\n?[A-Z]', "don't capitalize docstring title")],
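The hunks above split the old ``pypats`` list into ``commonpypats`` (shared rules) plus rule sets for normal files and for embedded fragments, each built by list concatenation. The sketch below illustrates the effect with two rules taken from the diff. The names ``COMMON_ERRORS``, ``NORMAL_ERRORS``, ``EMBEDDED_ERRORS`` and ``find_errors`` are hypothetical, and the real script preprocesses and compiles its patterns rather than calling ``re.search`` directly.

```python
import re

# Rule shared by normal files and embedded fragments (from commonpypats).
COMMON_ERRORS = [
    (r'\.strip\(\)\.split\(\)', "no need to strip before splitting"),
]

# Rule applied only to normal *.py files (from pypats).
NORMAL_ERRORS = [(r'.{81}', "line too long")] + COMMON_ERRORS

# Embedded fragments get only the common rules.
EMBEDDED_ERRORS = [] + COMMON_ERRORS


def find_errors(line, patterns):
    """Return the messages of all patterns that match ``line``."""
    return [msg for pat, msg in patterns if re.search(pat, line)]


if __name__ == '__main__':
    long_line = 'x = ' + '1 + ' * 20 + '1'  # 85 characters
    print(find_errors(long_line, NORMAL_ERRORS))
    print(find_errors(long_line, EMBEDDED_ERRORS))
```

Keeping the shared rules in one list means a rule added there applies to both kinds of input without being duplicated.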
@@ -414,7 +452,7 b' txtfilters = []'
414 452
415 453 txtpats = [
416 454 [
417 ('\s$', 'trailing whitespace'),
455 (r'\s$', 'trailing whitespace'),
418 456 ('.. note::[ \n][^\n]', 'add two newlines after note::')
419 457 ],
420 458 []
@@ -537,9 +575,17 b' checks = ['
537 575 allfilesfilters, allfilespats),
538 576 ]
539 577
578 # (desc,
579 # func to pick up embedded code fragments,
580 # list of patterns to convert target files
581 # list of patterns to detect errors/warnings)
582 embeddedchecks = [
583 ('embedded python',
584 testparseutil.pyembedded, embeddedpyfilters, embeddedpypats)
585 ]
586
540 587 def _preparepats():
541 for c in checks:
542 failandwarn = c[-1]
588 def preparefailandwarn(failandwarn):
543 589 for pats in failandwarn:
544 590 for i, pseq in enumerate(pats):
545 591 # fix-up regexes for multi-line searches
@@ -553,10 +599,19 b' def _preparepats():'
553 599 p = re.sub(r'(?<!\\)\[\^', r'[^\\n', p)
554 600
555 601 pats[i] = (re.compile(p, re.MULTILINE),) + pseq[1:]
556 filters = c[3]
602
603 def preparefilters(filters):
557 604 for i, flt in enumerate(filters):
558 605 filters[i] = re.compile(flt[0]), flt[1]
559 606
607 for cs in (checks, embeddedchecks):
608 for c in cs:
609 failandwarn = c[-1]
610 preparefailandwarn(failandwarn)
611
612 filters = c[-2]
613 preparefilters(filters)
614
560 615 class norepeatlogger(object):
561 616 def __init__(self):
562 617 self._lastseen = None
@@ -604,13 +659,12 b' def checkfile(f, logfunc=_defaultlogger.'
604 659
605 660 return True if no error is found, False otherwise.
606 661 """
607 blamecache = None
608 662 result = True
609 663
610 664 try:
611 665 with opentext(f) as fp:
612 666 try:
613 pre = post = fp.read()
667 pre = fp.read()
614 668 except UnicodeDecodeError as e:
615 669 print("%s while reading %s" % (e, f))
616 670 return result
@@ -618,11 +672,12 b' def checkfile(f, logfunc=_defaultlogger.'
618 672 print("Skipping %s, %s" % (f, str(e).split(':', 1)[0]))
619 673 return result
620 674
675 # context information shared during a single checkfile() invocation
676 context = {'blamecache': None}
677
621 678 for name, match, magic, filters, pats in checks:
622 post = pre # discard filtering result of previous check
623 679 if debug:
624 680 print(name, f)
625 fc = 0
626 681 if not (re.match(match, f) or (magic and re.search(magic, pre))):
627 682 if debug:
628 683 print("Skipping %s for %s it doesn't match %s" % (
@@ -637,6 +692,74 b' def checkfile(f, logfunc=_defaultlogger.'
637 692 # tests/test-check-code.t
638 693 print("Skipping %s it has no-che?k-code (glob)" % f)
639 694 return "Skip" # skip checking this file
695
696 fc = _checkfiledata(name, f, pre, filters, pats, context,
697 logfunc, maxerr, warnings, blame, debug, lineno)
698 if fc:
699 result = False
700
701 if f.endswith('.t') and "no-" "check-code" not in pre:
702 if debug:
703 print("Checking embedded code in %s" % (f))
704
705 prelines = pre.splitlines()
706 embeddederros = []
707 for name, embedded, filters, pats in embeddedchecks:
708 # "reset curmax at each repetition" treats maxerr as "max
709 # number of errors in an actual file per entry of
710 # (embedded)checks"
711 curmaxerr = maxerr
712
713 for found in embedded(f, prelines, embeddederros):
714 filename, starts, ends, code = found
715 fc = _checkfiledata(name, f, code, filters, pats, context,
716 logfunc, curmaxerr, warnings, blame, debug,
717 lineno, offset=starts - 1)
718 if fc:
719 result = False
720 if curmaxerr:
721 if fc >= curmaxerr:
722 break
723 curmaxerr -= fc
724
725 return result
726
727 def _checkfiledata(name, f, filedata, filters, pats, context,
728 logfunc, maxerr, warnings, blame, debug, lineno,
729 offset=None):
730 """Execute actual error check for file data
731
732 :name: of the checking category
733 :f: filepath
734 :filedata: content of a file
735 :filters: to be applied before checking
736 :pats: to detect errors
737 :context: a dict of information shared during a single checkfile() invocation
738 Valid keys: 'blamecache'.
739 :logfunc: function used to report error
740 logfunc(filename, linenumber, linecontent, errormessage)
741 :maxerr: number of error to display before aborting, or False to
742 report all errors
743 :warnings: whether warning level checks should be applied
744 :blame: whether blame information should be displayed at error reporting
745 :debug: whether debug information should be displayed
746 :lineno: whether lineno should be displayed at error reporting
747 :offset: line number offset of 'filedata' in 'f' for checking
748 an embedded code fragment, or None (offset=0 is different
749 from offset=None)
750
751 returns number of detected errors.
752 """
753 blamecache = context['blamecache']
754 if offset is None:
755 lineoffset = 0
756 else:
757 lineoffset = offset
758
759 fc = 0
760 pre = post = filedata
761
762 if True: # TODO: get rid of this redundant 'if' block
640 763 for p, r in filters:
641 764 post = re.sub(p, r, post)
642 765 nerrs = len(pats[0]) # nerr elements are errors
@@ -679,20 +802,30 b' def checkfile(f, logfunc=_defaultlogger.'
679 802 if ignore and re.search(ignore, l, re.MULTILINE):
680 803 if debug:
681 804 print("Skipping %s for %s:%s (ignore pattern)" % (
682 name, f, n))
805 name, f, (n + lineoffset)))
683 806 continue
684 807 bd = ""
685 808 if blame:
686 809 bd = 'working directory'
687 if not blamecache:
810 if blamecache is None:
688 811 blamecache = getblame(f)
689 if n < len(blamecache):
690 bl, bu, br = blamecache[n]
691 if bl == l:
812 context['blamecache'] = blamecache
813 if (n + lineoffset) < len(blamecache):
814 bl, bu, br = blamecache[(n + lineoffset)]
815 if offset is None and bl == l:
692 816 bd = '%s@%s' % (bu, br)
817 elif offset is not None and bl.endswith(l):
818 # "offset is not None" means "checking
819 # embedded code fragment". In this case,
820 # "l" does not have information about the
821 # beginning of an *original* line in the
822 # file (e.g. ' > ').
823 # Therefore, use "str.endswith()", and
824 # show "maybe" for a little loose
825 # examination.
826 bd = '%s@%s, maybe' % (bu, br)
693 827
694 errors.append((f, lineno and n + 1, l, msg, bd))
695 result = False
828 errors.append((f, lineno and (n + lineoffset + 1), l, msg, bd))
696 829
697 830 errors.sort()
698 831 for e in errors:
@@ -702,7 +835,7 b' def checkfile(f, logfunc=_defaultlogger.'
702 835 print(" (too many errors, giving up)")
703 836 break
704 837
705 return result
838 return fc
706 839
707 840 def main():
708 841 parser = optparse.OptionParser("%prog [options] [files | -]")
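Reviewer note: the blame lookup changed in this hunk can be read in isolation as the sketch below. It is a minimal reconstruction, not the code from the changeset: the function name `describe_blame` is hypothetical, and the `(line, user, revision)` tuple layout is inferred from the `bl, bu, br` unpacking above.

```python
def describe_blame(blamecache, n, line, offset=None):
    """Return a blame description for 0-based index n of the checked data.

    blamecache is assumed to be a list of (line, user, revision) tuples
    covering the whole file; offset is the position of the embedded
    fragment inside that file, or None when the whole file is checked.
    """
    lineoffset = 0 if offset is None else offset
    bd = 'working directory'
    idx = n + lineoffset
    if idx < len(blamecache):
        bl, bu, br = blamecache[idx]
        if offset is None and bl == line:
            # Whole-file check: the blamed line must match exactly.
            bd = '%s@%s' % (bu, br)
        elif offset is not None and bl.endswith(line):
            # Embedded fragment: the original line may carry a prefix
            # such as ' > ', so only the tail is compared.
            bd = '%s@%s, maybe' % (bu, br)
    return bd
```

Note that `offset=0` still selects the looser `endswith` comparison, which is why the docstring in the hunk stresses that `offset=0` differs from `offset=None`.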
@@ -47,7 +47,7 b' errors = ['
47 47 "adds a function with foo_bar naming"),
48 48 ]
49 49
50 word = re.compile('\S')
50 word = re.compile(r'\S')
51 51 def nonempty(first, second):
52 52 if word.search(first):
53 53 return first
@@ -25,7 +25,7 b" configre = re.compile(br'''"
25 25 (?:default=)?(?P<default>\S+?))?
26 26 \)''', re.VERBOSE | re.MULTILINE)
27 27
28 configwithre = re.compile(b'''
28 configwithre = re.compile(br'''
29 29 ui\.config(?P<ctype>with)\(
30 30 # First argument is callback function. This doesn't parse robustly
31 31 # if it is e.g. a function call.
@@ -61,10 +61,10 b' def main(args):'
61 61 linenum += 1
62 62
63 63 # check topic-like bits
64 m = re.match(b'\s*``(\S+)``', l)
64 m = re.match(br'\s*``(\S+)``', l)
65 65 if m:
66 66 prevname = m.group(1)
67 if re.match(b'^\s*-+$', l):
67 if re.match(br'^\s*-+$', l):
68 68 sect = prevname
69 69 prevname = b''
70 70
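Reviewer note: the last three hunks all switch regular-expression literals to raw strings. The short sketch below shows why that matters; the sample input `ui.debugger` is illustrative only. In a non-raw literal, `\S` is an unrecognized escape that Python 3.6 and later report with a warning, while a raw literal passes the backslash to the regex engine unchanged.

```python
import re

# In a raw literal the backslash is kept as-is: two characters, '\' and 'S'.
assert r'\S' == '\\S'
assert len(r'\S') == 2

# The same patterns as in the hunks above, written as raw (bytes) literals.
word = re.compile(r'\S')
topic = re.compile(br'\s*``(\S+)``')
rule = re.compile(br'^\s*-+$')


def nonempty(first, second):
    """Return first if it contains a non-whitespace character, else second."""
    if word.search(first):
        return first
    return second
```

For example, `topic.match(b'  ``ui.debugger``').group(1)` yields `b'ui.debugger'`, and `rule.match(b'  -----')` matches a section underline.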
@@ -14,6 +14,7 b' import importlib'
14 14 import os
15 15 import sys
16 16 import traceback
17 import warnings
17 18
18 19 def check_compat_py2(f):
19 20 """Check Python 3 compatibility for a file with Python 2"""
@@ -45,7 +46,7 b' def check_compat_py3(f):'
45 46 content = fh.read()
46 47
47 48 try:
48 ast.parse(content)
49 ast.parse(content, filename=f)
49 50 except SyntaxError as e:
50 51 print('%s: invalid syntax: %s' % (f, e))
51 52 return
@@ -91,6 +92,11 b" if __name__ == '__main__':"
91 92 fn = check_compat_py3
92 93
93 94 for f in sys.argv[1:]:
94 fn(f)
95 with warnings.catch_warnings(record=True) as warns:
96 fn(f)
97
98 for w in warns:
99 print(warnings.formatwarning(w.message, w.category,
100 w.filename, w.lineno).rstrip())
95 101
96 102 sys.exit(0)
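Reviewer note: the loop added above records warnings per file and prints them in a compact form. A self-contained sketch of the same pattern follows. The `check` function is a hypothetical stand-in for `check_compat_py3`, and the `simplefilter("always")` call is an addition made here so the demonstration does not depend on the interpreter's default warning filters.

```python
import warnings


def check(path):
    # Hypothetical checker that emits a warning while examining a file.
    warnings.warn("%s: something questionable" % path, DeprecationWarning)


def run_checks(paths, fn):
    """Run fn on each path and return the formatted warnings it raised."""
    lines = []
    for f in paths:
        with warnings.catch_warnings(record=True) as warns:
            # Assumption of this sketch: record every warning, even repeats.
            warnings.simplefilter("always")
            fn(f)

        for w in warns:
            lines.append(warnings.formatwarning(
                w.message, w.category, w.filename, w.lineno).rstrip())
    return lines
```

Recording inside the `with` block and formatting afterwards keeps each file's warnings grouped with that file instead of letting them reach stderr in whatever order the interpreter emits them.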
@@ -84,8 +84,9 b' static void initcontext(context_t *ctx)'
84 84
85 85 static void enlargecontext(context_t *ctx, size_t newsize)
86 86 {
87 if (newsize <= ctx->maxdatasize)
87 if (newsize <= ctx->maxdatasize) {
88 88 return;
89 }
89 90
90 91 newsize = defaultdatasize *
91 92 ((newsize + defaultdatasize - 1) / defaultdatasize);
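Reviewer note: the arithmetic in `enlargecontext` rounds the requested size up to the next multiple of `defaultdatasize`. The sketch below restates that in Python. The block size of 4096 used in the examples is an assumption for illustration; the real value of `defaultdatasize` is not shown in this hunk.

```python
def round_up_to_multiple(newsize, blocksize):
    """Round newsize up to the nearest multiple of blocksize.

    Mirrors the C expression
        blocksize * ((newsize + blocksize - 1) / blocksize)
    where '/' is integer division.
    """
    return blocksize * ((newsize + blocksize - 1) // blocksize)
```

With a block size of 4096, a request for 1 byte or for exactly 4096 bytes both yield 4096, while 4097 yields 8192.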
@@ -117,22 +118,25 b' static void readchannel(hgclient_t *hgc)'
117 118
118 119 uint32_t datasize_n;
119 120 rsize = recv(hgc->sockfd, &datasize_n, sizeof(datasize_n), 0);
120 if (rsize != sizeof(datasize_n))
121 if (rsize != sizeof(datasize_n)) {
121 122 abortmsg("failed to read data size");
123 }
122 124
123 125 /* datasize denotes the maximum size to write if input request */
124 126 hgc->ctx.datasize = ntohl(datasize_n);
125 127 enlargecontext(&hgc->ctx, hgc->ctx.datasize);
126 128
127 if (isupper(hgc->ctx.ch) && hgc->ctx.ch != 'S')
129 if (isupper(hgc->ctx.ch) && hgc->ctx.ch != 'S') {
128 130 return; /* assumes input request */
131 }
129 132
130 133 size_t cursize = 0;
131 134 while (cursize < hgc->ctx.datasize) {
132 135 rsize = recv(hgc->sockfd, hgc->ctx.data + cursize,
133 136 hgc->ctx.datasize - cursize, 0);
134 if (rsize < 1)
137 if (rsize < 1) {
135 138 abortmsg("failed to read data block");
139 }
136 140 cursize += rsize;
137 141 }
138 142 }
@@ -143,8 +147,9 b' static void sendall(int sockfd, const vo'
143 147 const char *const endp = p + datasize;
144 148 while (p < endp) {
145 149 ssize_t r = send(sockfd, p, endp - p, 0);
146 if (r < 0)
150 if (r < 0) {
147 151 abortmsgerrno("cannot communicate");
152 }
148 153 p += r;
149 154 }
150 155 }
@@ -186,8 +191,9 b' static void packcmdargs(context_t *ctx, '
186 191 ctx->datasize += n;
187 192 }
188 193
189 if (ctx->datasize > 0)
194 if (ctx->datasize > 0) {
190 195 --ctx->datasize; /* strip last '\0' */
196 }
191 197 }
192 198
193 199 /* Extract '\0'-separated list of args to new buffer, terminated by NULL */
@@ -205,8 +211,9 b' static const char **unpackcmdargsnul(con'
205 211 args[nargs] = s;
206 212 nargs++;
207 213 s = memchr(s, '\0', e - s);
208 if (!s)
214 if (!s) {
209 215 break;
216 }
210 217 s++;
211 218 }
212 219 args[nargs] = NULL;
@@ -225,8 +232,9 b' static void handlereadrequest(hgclient_t'
225 232 static void handlereadlinerequest(hgclient_t *hgc)
226 233 {
227 234 context_t *ctx = &hgc->ctx;
228 if (!fgets(ctx->data, ctx->datasize, stdin))
235 if (!fgets(ctx->data, ctx->datasize, stdin)) {
229 236 ctx->data[0] = '\0';
237 }
230 238 ctx->datasize = strlen(ctx->data);
231 239 writeblock(hgc);
232 240 }
@@ -239,8 +247,9 b' static void handlesystemrequest(hgclient'
239 247 ctx->data[ctx->datasize] = '\0'; /* terminate last string */
240 248
241 249 const char **args = unpackcmdargsnul(ctx);
242 if (!args[0] || !args[1] || !args[2])
250 if (!args[0] || !args[1] || !args[2]) {
243 251 abortmsg("missing type or command or cwd in system request");
252 }
244 253 if (strcmp(args[0], "system") == 0) {
245 254 debugmsg("run '%s' at '%s'", args[1], args[2]);
246 255 int32_t r = runshellcmd(args[1], args + 3, args[2]);
@@ -252,8 +261,9 b' static void handlesystemrequest(hgclient'
252 261 writeblock(hgc);
253 262 } else if (strcmp(args[0], "pager") == 0) {
254 263 setuppager(args[1], args + 3);
255 if (hgc->capflags & CAP_ATTACHIO)
264 if (hgc->capflags & CAP_ATTACHIO) {
256 265 attachio(hgc);
266 }
257 267 /* unblock the server */
258 268 static const char emptycmd[] = "\n";
259 269 sendall(hgc->sockfd, emptycmd, sizeof(emptycmd) - 1);
@@ -296,9 +306,10 b' static void handleresponse(hgclient_t *h'
296 306 handlesystemrequest(hgc);
297 307 break;
298 308 default:
299 if (isupper(ctx->ch))
309 if (isupper(ctx->ch)) {
300 310 abortmsg("cannot handle response (ch = %c)",
301 311 ctx->ch);
312 }
302 313 }
303 314 }
304 315 }
@@ -308,8 +319,9 b' static unsigned int parsecapabilities(co'
308 319 unsigned int flags = 0;
309 320 while (s < e) {
310 321 const char *t = strchr(s, ' ');
311 if (!t || t > e)
322 if (!t || t > e) {
312 323 t = e;
324 }
313 325 const cappair_t *cap;
314 326 for (cap = captable; cap->flag; ++cap) {
315 327 size_t n = t - s;
@@ -346,11 +358,13 b' static void readhello(hgclient_t *hgc)'
346 358 const char *const dataend = ctx->data + ctx->datasize;
347 359 while (s < dataend) {
348 360 const char *t = strchr(s, ':');
349 if (!t || t[1] != ' ')
361 if (!t || t[1] != ' ') {
350 362 break;
363 }
351 364 const char *u = strchr(t + 2, '\n');
352 if (!u)
365 if (!u) {
353 366 u = dataend;
367 }
354 368 if (strncmp(s, "capabilities:", t - s + 1) == 0) {
355 369 hgc->capflags = parsecapabilities(t + 2, u);
356 370 } else if (strncmp(s, "pgid:", t - s + 1) == 0) {
@@ -367,8 +381,9 b' static void updateprocname(hgclient_t *h'
367 381 {
368 382 int r = snprintf(hgc->ctx.data, hgc->ctx.maxdatasize, "chg[worker/%d]",
369 383 (int)getpid());
370 if (r < 0 || (size_t)r >= hgc->ctx.maxdatasize)
384 if (r < 0 || (size_t)r >= hgc->ctx.maxdatasize) {
371 385 abortmsg("insufficient buffer to write procname (r = %d)", r);
386 }
372 387 hgc->ctx.datasize = (size_t)r;
373 388 writeblockrequest(hgc, "setprocname");
374 389 }
@@ -380,8 +395,9 b' static void attachio(hgclient_t *hgc)'
380 395 sendall(hgc->sockfd, chcmd, sizeof(chcmd) - 1);
381 396 readchannel(hgc);
382 397 context_t *ctx = &hgc->ctx;
383 if (ctx->ch != 'I')
398 if (ctx->ch != 'I') {
384 399 abortmsg("unexpected response for attachio (ch = %c)", ctx->ch);
400 }
385 401
386 402 static const int fds[3] = {STDIN_FILENO, STDOUT_FILENO, STDERR_FILENO};
387 403 struct msghdr msgh;
@@ -399,23 +415,27 b' static void attachio(hgclient_t *hgc)'
399 415 memcpy(CMSG_DATA(cmsg), fds, sizeof(fds));
400 416 msgh.msg_controllen = cmsg->cmsg_len;
401 417 ssize_t r = sendmsg(hgc->sockfd, &msgh, 0);
402 if (r < 0)
418 if (r < 0) {
403 419 abortmsgerrno("sendmsg failed");
420 }
404 421
405 422 handleresponse(hgc);
406 423 int32_t n;
407 if (ctx->datasize != sizeof(n))
424 if (ctx->datasize != sizeof(n)) {
408 425 abortmsg("unexpected size of attachio result");
426 }
409 427 memcpy(&n, ctx->data, sizeof(n));
410 428 n = ntohl(n);
411 if (n != sizeof(fds) / sizeof(fds[0]))
429 if (n != sizeof(fds) / sizeof(fds[0])) {
412 430 abortmsg("failed to send fds (n = %d)", n);
431 }
413 432 }
414 433
415 434 static void chdirtocwd(hgclient_t *hgc)
416 435 {
417 if (!getcwd(hgc->ctx.data, hgc->ctx.maxdatasize))
436 if (!getcwd(hgc->ctx.data, hgc->ctx.maxdatasize)) {
418 437 abortmsgerrno("failed to getcwd");
438 }
419 439 hgc->ctx.datasize = strlen(hgc->ctx.data);
420 440 writeblockrequest(hgc, "chdir");
421 441 }
@@ -440,8 +460,9 b' static void forwardumask(hgclient_t *hgc'
440 460 hgclient_t *hgc_open(const char *sockname)
441 461 {
442 462 int fd = socket(AF_UNIX, SOCK_STREAM, 0);
443 if (fd < 0)
463 if (fd < 0) {
444 464 abortmsgerrno("cannot create socket");
465 }
445 466
446 467 /* don't keep fd on fork(), so that it can be closed when the parent
447 468 * process get terminated. */
@@ -456,34 +477,39 b' hgclient_t *hgc_open(const char *socknam'
456 477 {
457 478 const char *split = strrchr(sockname, '/');
458 479 if (split && split != sockname) {
459 if (split[1] == '\0')
480 if (split[1] == '\0') {
460 481 abortmsg("sockname cannot end with a slash");
482 }
461 483 size_t len = split - sockname;
462 484 char sockdir[len + 1];
463 485 memcpy(sockdir, sockname, len);
464 486 sockdir[len] = '\0';
465 487
466 488 bakfd = open(".", O_DIRECTORY);
467 if (bakfd == -1)
489 if (bakfd == -1) {
468 490 abortmsgerrno("cannot open cwd");
491 }
469 492
470 493 int r = chdir(sockdir);
471 if (r != 0)
494 if (r != 0) {
472 495 abortmsgerrno("cannot chdir %s", sockdir);
496 }
473 497
474 498 basename = split + 1;
475 499 }
476 500 }
477 if (strlen(basename) >= sizeof(addr.sun_path))
501 if (strlen(basename) >= sizeof(addr.sun_path)) {
478 502 abortmsg("sockname is too long: %s", basename);
503 }
479 504 strncpy(addr.sun_path, basename, sizeof(addr.sun_path));
480 505 addr.sun_path[sizeof(addr.sun_path) - 1] = '\0';
481 506
482 507 /* real connect */
483 508 int r = connect(fd, (struct sockaddr *)&addr, sizeof(addr));
484 509 if (r < 0) {
485 if (errno != ENOENT && errno != ECONNREFUSED)
510 if (errno != ENOENT && errno != ECONNREFUSED) {
486 511 abortmsgerrno("cannot connect to %s", sockname);
512 }
487 513 }
488 514 if (bakfd != -1) {
489 515 fchdirx(bakfd);
@@ -501,16 +527,21 b' hgclient_t *hgc_open(const char *socknam'
501 527 initcontext(&hgc->ctx);
502 528
503 529 readhello(hgc);
504 if (!(hgc->capflags & CAP_RUNCOMMAND))
530 if (!(hgc->capflags & CAP_RUNCOMMAND)) {
505 531 abortmsg("insufficient capability: runcommand");
506 if (hgc->capflags & CAP_SETPROCNAME)
532 }
533 if (hgc->capflags & CAP_SETPROCNAME) {
507 534 updateprocname(hgc);
508 if (hgc->capflags & CAP_ATTACHIO)
535 }
536 if (hgc->capflags & CAP_ATTACHIO) {
509 537 attachio(hgc);
510 if (hgc->capflags & CAP_CHDIR)
538 }
539 if (hgc->capflags & CAP_CHDIR) {
511 540 chdirtocwd(hgc);
512 if (hgc->capflags & CAP_SETUMASK2)
541 }
542 if (hgc->capflags & CAP_SETUMASK2) {
513 543 forwardumask(hgc);
544 }
514 545
515 546 return hgc;
516 547 }
@@ -555,16 +586,18 b' const char **hgc_validate(hgclient_t *hg'
555 586 size_t argsize)
556 587 {
557 588 assert(hgc);
558 if (!(hgc->capflags & CAP_VALIDATE))
589 if (!(hgc->capflags & CAP_VALIDATE)) {
559 590 return NULL;
591 }
560 592
561 593 packcmdargs(&hgc->ctx, args, argsize);
562 594 writeblockrequest(hgc, "validate");
563 595 handleresponse(hgc);
564 596
565 597 /* the server returns '\0' if it can handle our request */
566 if (hgc->ctx.datasize <= 1)
598 if (hgc->ctx.datasize <= 1) {
567 599 return NULL;
600 }
568 601
569 602 /* make sure the buffer is '\0' terminated */
570 603 enlargecontext(&hgc->ctx, hgc->ctx.datasize + 1);
@@ -599,8 +632,9 b' int hgc_runcommand(hgclient_t *hgc, cons'
599 632 void hgc_attachio(hgclient_t *hgc)
600 633 {
601 634 assert(hgc);
602 if (!(hgc->capflags & CAP_ATTACHIO))
635 if (!(hgc->capflags & CAP_ATTACHIO)) {
603 636 return;
637 }
604 638 attachio(hgc);
605 639 }
606 640
@@ -613,8 +647,9 b' void hgc_attachio(hgclient_t *hgc)'
613 647 void hgc_setenv(hgclient_t *hgc, const char *const envp[])
614 648 {
615 649 assert(hgc && envp);
616 if (!(hgc->capflags & CAP_SETENV))
650 if (!(hgc->capflags & CAP_SETENV)) {
617 651 return;
652 }
618 653 packcmdargs(&hgc->ctx, envp, /*argsize*/ -1);
619 654 writeblockrequest(hgc, "setenv");
620 655 }
@@ -25,8 +25,9 b' static pid_t peerpid = 0;'
25 25 static void forwardsignal(int sig)
26 26 {
27 27 assert(peerpid > 0);
28 if (kill(peerpid, sig) < 0)
28 if (kill(peerpid, sig) < 0) {
29 29 abortmsgerrno("cannot kill %d", peerpid);
30 }
30 31 debugmsg("forward signal %d", sig);
31 32 }
32 33
@@ -34,8 +35,9 b' static void forwardsignaltogroup(int sig'
34 35 {
35 36 /* prefer kill(-pgid, sig), fallback to pid if pgid is invalid */
36 37 pid_t killpid = peerpgid > 1 ? -peerpgid : peerpid;
37 if (kill(killpid, sig) < 0)
38 if (kill(killpid, sig) < 0) {
38 39 abortmsgerrno("cannot kill %d", killpid);
40 }
39 41 debugmsg("forward signal %d to %d", sig, killpid);
40 42 }
41 43
@@ -43,28 +45,36 b' static void handlestopsignal(int sig)'
43 45 {
44 46 sigset_t unblockset, oldset;
45 47 struct sigaction sa, oldsa;
46 if (sigemptyset(&unblockset) < 0)
48 if (sigemptyset(&unblockset) < 0) {
47 49 goto error;
48 if (sigaddset(&unblockset, sig) < 0)
50 }
51 if (sigaddset(&unblockset, sig) < 0) {
49 52 goto error;
53 }
50 54 memset(&sa, 0, sizeof(sa));
51 55 sa.sa_handler = SIG_DFL;
52 56 sa.sa_flags = SA_RESTART;
53 if (sigemptyset(&sa.sa_mask) < 0)
57 if (sigemptyset(&sa.sa_mask) < 0) {
54 58 goto error;
59 }
55 60
56 61 forwardsignal(sig);
57 if (raise(sig) < 0) /* resend to self */
62 if (raise(sig) < 0) { /* resend to self */
58 63 goto error;
59 if (sigaction(sig, &sa, &oldsa) < 0)
64 }
65 if (sigaction(sig, &sa, &oldsa) < 0) {
60 66 goto error;
61 if (sigprocmask(SIG_UNBLOCK, &unblockset, &oldset) < 0)
67 }
68 if (sigprocmask(SIG_UNBLOCK, &unblockset, &oldset) < 0) {
62 69 goto error;
70 }
63 71 /* resent signal will be handled before sigprocmask() returns */
64 if (sigprocmask(SIG_SETMASK, &oldset, NULL) < 0)
72 if (sigprocmask(SIG_SETMASK, &oldset, NULL) < 0) {
65 73 goto error;
66 if (sigaction(sig, &oldsa, NULL) < 0)
74 }
75 if (sigaction(sig, &oldsa, NULL) < 0) {
67 76 goto error;
77 }
68 78 return;
69 79
70 80 error:
@@ -73,19 +83,22 b' error:'
73 83
74 84 static void handlechildsignal(int sig UNUSED_)
75 85 {
76 if (peerpid == 0 || pagerpid == 0)
86 if (peerpid == 0 || pagerpid == 0) {
77 87 return;
88 }
78 89 /* if pager exits, notify the server with SIGPIPE immediately.
79 90 * otherwise the server won't get SIGPIPE if it does not write
80 91 * anything. (issue5278) */
81 if (waitpid(pagerpid, NULL, WNOHANG) == pagerpid)
92 if (waitpid(pagerpid, NULL, WNOHANG) == pagerpid) {
82 93 kill(peerpid, SIGPIPE);
94 }
83 95 }
84 96
85 97 void setupsignalhandler(pid_t pid, pid_t pgid)
86 98 {
87 if (pid <= 0)
99 if (pid <= 0) {
88 100 return;
101 }
89 102 peerpid = pid;
90 103 peerpgid = (pgid <= 1 ? 0 : pgid);
91 104
@@ -98,42 +111,52 b' void setupsignalhandler(pid_t pid, pid_t'
98 111 * - SIGINT: usually generated by the terminal */
99 112 sa.sa_handler = forwardsignaltogroup;
100 113 sa.sa_flags = SA_RESTART;
101 if (sigemptyset(&sa.sa_mask) < 0)
114 if (sigemptyset(&sa.sa_mask) < 0) {
115 goto error;
116 }
117 if (sigaction(SIGHUP, &sa, NULL) < 0) {
102 118 goto error;
103 if (sigaction(SIGHUP, &sa, NULL) < 0)
119 }
120 if (sigaction(SIGINT, &sa, NULL) < 0) {
104 121 goto error;
105 if (sigaction(SIGINT, &sa, NULL) < 0)
106 goto error;
122 }
107 123
108 124 /* terminate frontend by double SIGTERM in case of server freeze */
109 125 sa.sa_handler = forwardsignal;
110 126 sa.sa_flags |= SA_RESETHAND;
111 if (sigaction(SIGTERM, &sa, NULL) < 0)
127 if (sigaction(SIGTERM, &sa, NULL) < 0) {
112 128 goto error;
129 }
113 130
114 131 /* notify the worker about window resize events */
115 132 sa.sa_flags = SA_RESTART;
116 if (sigaction(SIGWINCH, &sa, NULL) < 0)
133 if (sigaction(SIGWINCH, &sa, NULL) < 0) {
117 134 goto error;
135 }
118 136 /* forward user-defined signals */
119 if (sigaction(SIGUSR1, &sa, NULL) < 0)
137 if (sigaction(SIGUSR1, &sa, NULL) < 0) {
120 138 goto error;
121 if (sigaction(SIGUSR2, &sa, NULL) < 0)
139 }
140 if (sigaction(SIGUSR2, &sa, NULL) < 0) {
122 141 goto error;
142 }
123 143 /* propagate job control requests to worker */
124 144 sa.sa_handler = forwardsignal;
125 145 sa.sa_flags = SA_RESTART;
126 if (sigaction(SIGCONT, &sa, NULL) < 0)
146 if (sigaction(SIGCONT, &sa, NULL) < 0) {
127 147 goto error;
148 }
128 149 sa.sa_handler = handlestopsignal;
129 150 sa.sa_flags = SA_RESTART;
130 if (sigaction(SIGTSTP, &sa, NULL) < 0)
151 if (sigaction(SIGTSTP, &sa, NULL) < 0) {
131 152 goto error;
153 }
132 154 /* get notified when pager exits */
133 155 sa.sa_handler = handlechildsignal;
134 156 sa.sa_flags = SA_RESTART;
135 if (sigaction(SIGCHLD, &sa, NULL) < 0)
157 if (sigaction(SIGCHLD, &sa, NULL) < 0) {
136 158 goto error;
159 }
137 160
138 161 return;
139 162
@@ -147,26 +170,34 b' void restoresignalhandler(void)'
147 170 memset(&sa, 0, sizeof(sa));
148 171 sa.sa_handler = SIG_DFL;
149 172 sa.sa_flags = SA_RESTART;
150 if (sigemptyset(&sa.sa_mask) < 0)
173 if (sigemptyset(&sa.sa_mask) < 0) {
151 174 goto error;
175 }
152 176
153 if (sigaction(SIGHUP, &sa, NULL) < 0)
177 if (sigaction(SIGHUP, &sa, NULL) < 0) {
154 178 goto error;
155 if (sigaction(SIGTERM, &sa, NULL) < 0)
179 }
180 if (sigaction(SIGTERM, &sa, NULL) < 0) {
156 181 goto error;
157 if (sigaction(SIGWINCH, &sa, NULL) < 0)
182 }
183 if (sigaction(SIGWINCH, &sa, NULL) < 0) {
158 184 goto error;
159 if (sigaction(SIGCONT, &sa, NULL) < 0)
185 }
186 if (sigaction(SIGCONT, &sa, NULL) < 0) {
160 187 goto error;
161 if (sigaction(SIGTSTP, &sa, NULL) < 0)
188 }
189 if (sigaction(SIGTSTP, &sa, NULL) < 0) {
162 190 goto error;
163 if (sigaction(SIGCHLD, &sa, NULL) < 0)
191 }
192 if (sigaction(SIGCHLD, &sa, NULL) < 0) {
164 193 goto error;
194 }
165 195
166 196 /* ignore Ctrl+C while shutting down to make pager exits cleanly */
167 197 sa.sa_handler = SIG_IGN;
168 if (sigaction(SIGINT, &sa, NULL) < 0)
198 if (sigaction(SIGINT, &sa, NULL) < 0) {
169 199 goto error;
200 }
170 201
171 202 peerpid = 0;
172 203 return;
@@ -180,22 +211,27 b' error:'
180 211 pid_t setuppager(const char *pagercmd, const char *envp[])
181 212 {
182 213 assert(pagerpid == 0);
183 if (!pagercmd)
214 if (!pagercmd) {
184 215 return 0;
216 }
185 217
186 218 int pipefds[2];
187 if (pipe(pipefds) < 0)
219 if (pipe(pipefds) < 0) {
188 220 return 0;
221 }
189 222 pid_t pid = fork();
190 if (pid < 0)
223 if (pid < 0) {
191 224 goto error;
225 }
192 226 if (pid > 0) {
193 227 close(pipefds[0]);
194 if (dup2(pipefds[1], fileno(stdout)) < 0)
228 if (dup2(pipefds[1], fileno(stdout)) < 0) {
195 229 goto error;
230 }
196 231 if (isatty(fileno(stderr))) {
197 if (dup2(pipefds[1], fileno(stderr)) < 0)
232 if (dup2(pipefds[1], fileno(stderr)) < 0) {
198 233 goto error;
234 }
199 235 }
200 236 close(pipefds[1]);
201 237 pagerpid = pid;
@@ -222,16 +258,18 b' error:'
222 258
223 259 void waitpager(void)
224 260 {
225 if (pagerpid == 0)
261 if (pagerpid == 0) {
226 262 return;
263 }
227 264
228 265 /* close output streams to notify the pager its input ends */
229 266 fclose(stdout);
230 267 fclose(stderr);
231 268 while (1) {
232 269 pid_t ret = waitpid(pagerpid, NULL, 0);
233 if (ret == -1 && errno == EINTR)
270 if (ret == -1 && errno == EINTR) {
234 271 continue;
272 }
235 273 break;
236 274 }
237 275 }
@@ -25,8 +25,9 b' static int colorenabled = 0;'
25 25
26 26 static inline void fsetcolor(FILE *fp, const char *code)
27 27 {
28 if (!colorenabled)
28 if (!colorenabled) {
29 29 return;
30 }
30 31 fprintf(fp, "\033[%sm", code);
31 32 }
32 33
@@ -35,8 +36,9 b' static void vabortmsgerrno(int no, const'
35 36 fsetcolor(stderr, "1;31");
36 37 fputs("chg: abort: ", stderr);
37 38 vfprintf(stderr, fmt, args);
38 if (no != 0)
39 if (no != 0) {
39 40 fprintf(stderr, " (errno = %d, %s)", no, strerror(no));
41 }
40 42 fsetcolor(stderr, "");
41 43 fputc('\n', stderr);
42 44 exit(255);
@@ -82,8 +84,9 b' void enabledebugmsg(void)'
82 84
83 85 void debugmsg(const char *fmt, ...)
84 86 {
85 if (!debugmsgenabled)
87 if (!debugmsgenabled) {
86 88 return;
89 }
87 90
88 91 va_list args;
89 92 va_start(args, fmt);
@@ -98,32 +101,37 b' void debugmsg(const char *fmt, ...)'
98 101 void fchdirx(int dirfd)
99 102 {
100 103 int r = fchdir(dirfd);
101 if (r == -1)
104 if (r == -1) {
102 105 abortmsgerrno("failed to fchdir");
106 }
103 107 }
104 108
105 109 void fsetcloexec(int fd)
106 110 {
107 111 int flags = fcntl(fd, F_GETFD);
108 if (flags < 0)
112 if (flags < 0) {
109 113 abortmsgerrno("cannot get flags of fd %d", fd);
110 if (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0)
114 }
115 if (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0) {
111 116 abortmsgerrno("cannot set flags of fd %d", fd);
117 }
112 118 }
113 119
114 120 void *mallocx(size_t size)
115 121 {
116 122 void *result = malloc(size);
117 if (!result)
123 if (!result) {
118 124 abortmsg("failed to malloc");
125 }
119 126 return result;
120 127 }
121 128
122 129 void *reallocx(void *ptr, size_t size)
123 130 {
124 131 void *result = realloc(ptr, size);
125 if (!result)
132 if (!result) {
126 133 abortmsg("failed to realloc");
134 }
127 135 return result;
128 136 }
129 137
@@ -144,30 +152,37 b' int runshellcmd(const char *cmd, const c'
144 152 memset(&newsa, 0, sizeof(newsa));
145 153 newsa.sa_handler = SIG_IGN;
146 154 newsa.sa_flags = 0;
147 if (sigemptyset(&newsa.sa_mask) < 0)
155 if (sigemptyset(&newsa.sa_mask) < 0) {
148 156 goto done;
149 if (sigaction(SIGINT, &newsa, &oldsaint) < 0)
157 }
158 if (sigaction(SIGINT, &newsa, &oldsaint) < 0) {
150 159 goto done;
160 }
151 161 doneflags |= F_SIGINT;
152 if (sigaction(SIGQUIT, &newsa, &oldsaquit) < 0)
162 if (sigaction(SIGQUIT, &newsa, &oldsaquit) < 0) {
153 163 goto done;
164 }
154 165 doneflags |= F_SIGQUIT;
155 166
156 if (sigaddset(&newsa.sa_mask, SIGCHLD) < 0)
167 if (sigaddset(&newsa.sa_mask, SIGCHLD) < 0) {
157 168 goto done;
158 if (sigprocmask(SIG_BLOCK, &newsa.sa_mask, &oldmask) < 0)
169 }
170 if (sigprocmask(SIG_BLOCK, &newsa.sa_mask, &oldmask) < 0) {
159 171 goto done;
172 }
160 173 doneflags |= F_SIGMASK;
161 174
162 175 pid_t pid = fork();
163 if (pid < 0)
176 if (pid < 0) {
164 177 goto done;
178 }
165 179 if (pid == 0) {
166 180 sigaction(SIGINT, &oldsaint, NULL);
167 181 sigaction(SIGQUIT, &oldsaquit, NULL);
168 182 sigprocmask(SIG_SETMASK, &oldmask, NULL);
169 if (cwd && chdir(cwd) < 0)
183 if (cwd && chdir(cwd) < 0) {
170 184 _exit(127);
185 }
171 186 const char *argv[] = {"sh", "-c", cmd, NULL};
172 187 if (envp) {
173 188 execve("/bin/sh", (char **)argv, (char **)envp);
@@ -176,25 +191,32 b' int runshellcmd(const char *cmd, const c'
176 191 }
177 192 _exit(127);
178 193 } else {
179 if (waitpid(pid, &status, 0) < 0)
194 if (waitpid(pid, &status, 0) < 0) {
180 195 goto done;
196 }
181 197 doneflags |= F_WAITPID;
182 198 }
183 199
184 200 done:
185 if (doneflags & F_SIGINT)
201 if (doneflags & F_SIGINT) {
186 202 sigaction(SIGINT, &oldsaint, NULL);
187 if (doneflags & F_SIGQUIT)
203 }
204 if (doneflags & F_SIGQUIT) {
188 205 sigaction(SIGQUIT, &oldsaquit, NULL);
189 if (doneflags & F_SIGMASK)
206 }
207 if (doneflags & F_SIGMASK) {
190 208 sigprocmask(SIG_SETMASK, &oldmask, NULL);
209 }
191 210
192 211 /* no way to report other errors, use 127 (= shell termination) */
193 if (!(doneflags & F_WAITPID))
212 if (!(doneflags & F_WAITPID)) {
194 213 return 127;
195 if (WIFEXITED(status))
214 }
215 if (WIFEXITED(status)) {
196 216 return WEXITSTATUS(status);
197 if (WIFSIGNALED(status))
217 }
218 if (WIFSIGNALED(status)) {
198 219 return -WTERMSIG(status);
220 }
199 221 return 127;
200 222 }
@@ -62,6 +62,11 b' contrib/python-zstandard/zstd/compress/z'
62 62 contrib/python-zstandard/zstd/compress/zstd_opt.c
63 63 contrib/python-zstandard/zstd/compress/zstd_opt.h
64 64 contrib/python-zstandard/zstd/decompress/huf_decompress.c
65 contrib/python-zstandard/zstd/decompress/zstd_ddict.c
66 contrib/python-zstandard/zstd/decompress/zstd_ddict.h
67 contrib/python-zstandard/zstd/decompress/zstd_decompress_block.c
68 contrib/python-zstandard/zstd/decompress/zstd_decompress_block.h
69 contrib/python-zstandard/zstd/decompress/zstd_decompress_internal.h
65 70 contrib/python-zstandard/zstd/decompress/zstd_decompress.c
66 71 contrib/python-zstandard/zstd/deprecated/zbuff_common.c
67 72 contrib/python-zstandard/zstd/deprecated/zbuff_compress.c
@@ -7,6 +7,7 b' import mercurial'
7 7 import sys
8 8 from mercurial import (
9 9 demandimport,
10 pycompat,
10 11 registrar,
11 12 )
12 13
@@ -32,28 +33,30 b' def ipdb(ui, repo, msg, **opts):'
32 33
33 34 IPython.embed()
34 35
35 @command('debugshell|dbsh', [])
36 @command(b'debugshell|dbsh', [])
36 37 def debugshell(ui, repo, **opts):
37 bannermsg = "loaded repo : %s\n" \
38 "using source: %s" % (repo.root,
39 mercurial.__path__[0])
38 bannermsg = ("loaded repo : %s\n"
39 "using source: %s" % (pycompat.sysstr(repo.root),
40 mercurial.__path__[0]))
40 41
41 42 pdbmap = {
42 43 'pdb' : 'code',
43 44 'ipdb' : 'IPython'
44 45 }
45 46
46 debugger = ui.config("ui", "debugger")
47 debugger = ui.config(b"ui", b"debugger")
47 48 if not debugger:
48 49 debugger = 'pdb'
50 else:
51 debugger = pycompat.sysstr(debugger)
49 52
50 53 # if IPython doesn't exist, fallback to code.interact
51 54 try:
52 55 with demandimport.deactivated():
53 56 __import__(pdbmap[debugger])
54 57 except ImportError:
55 ui.warn(("%s debugger specified but %s module was not found\n")
58 ui.warn((b"%s debugger specified but %s module was not found\n")
56 59 % (debugger, pdbmap[debugger]))
57 debugger = 'pdb'
60 debugger = b'pdb'
58 61
59 62 getattr(sys.modules[__name__], debugger)(ui, repo, bannermsg, **opts)
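Reviewer note: the `pycompat.sysstr` calls added in this hunk address a Python 3 formatting pitfall, sketched below. `to_native_str` is a hypothetical stand-in, and the latin-1 decoding is an assumption of this sketch; the behavior of the real helper is not shown in the diff.

```python
def to_native_str(value):
    """Hypothetical stand-in for a bytes-to-str conversion helper."""
    if isinstance(value, str):
        return value
    # Assumption: latin-1, which maps every byte to a code point.
    return value.decode('latin-1')


def banner(root, source):
    """Build the banner text from a bytes repo root and a str source path."""
    return ("loaded repo : %s\n"
            "using source: %s" % (to_native_str(root), source))
```

Without the conversion, formatting a bytes value with `%s` into a `str` on Python 3 produces the repr, so `'%s' % b'/tmp/repo'` gives `"b'/tmp/repo'"` rather than the path itself.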
@@ -20,11 +20,19 b' try:'
20 20 lm = lazymanifest(mdata)
21 21 # iterate the whole thing, which causes the code to fully parse
22 22 # every line in the manifest
23 list(lm.iterentries())
23 for e, _, _ in lm.iterentries():
24 # also exercise __getitem__ et al
25 lm[e]
26 e in lm
27 (e + 'nope') in lm
24 28 lm[b'xyzzy'] = (b'\0' * 20, 'x')
25 29 # do an insert, text should change
26 30 assert lm.text() != mdata, "insert should change text and didn't: %r %r" % (lm.text(), mdata)
31 cloned = lm.filtercopy(lambda x: x != 'xyzzy')
32 assert cloned.text() == mdata, 'cloned text should equal mdata'
33 cloned.diff(lm)
27 34 del lm[b'xyzzy']
35 cloned.diff(lm)
28 36 # should be back to the same
29 37 assert lm.text() == mdata, "delete should have restored text but didn't: %r %r" % (lm.text(), mdata)
30 38 except Exception as e:
@@ -39,6 +47,11 b' except Exception as e:'
39 47
40 48 int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size)
41 49 {
50 // Don't allow fuzzer inputs larger than 100k, since we'll just bog
51 // down and not accomplish much.
52 if (Size > 100000) {
53 return 0;
54 }
42 55 PyObject *mtext =
43 56 PyBytes_FromStringAndSize((const char *)Data, (Py_ssize_t)Size);
44 57 PyObject *locals = PyDict_New();
@@ -19,6 +19,11 b' from parsers import parse_index2'
19 19 for inline in (True, False):
20 20 try:
21 21 index, cache = parse_index2(data, inline)
22 index.slicechunktodensity(list(range(len(index))), 0.5, 262144)
23 for rev in range(len(index)):
24 node = index[rev][7]
25 partial = index.shortest(node)
26 index.partialmatch(node[:partial])
22 27 except Exception as e:
23 28 pass
24 29 # uncomment this print if you're editing this Python code
@@ -31,6 +36,11 b' for inline in (True, False):'
31 36
32 37 int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size)
33 38 {
39 // Don't allow fuzzer inputs larger than 60k, since we'll just bog
40 // down and not accomplish much.
41 if (Size > 60000) {
42 return 0;
43 }
34 44 PyObject *text =
35 45 PyBytes_FromStringAndSize((const char *)Data, (Py_ssize_t)Size);
36 46 PyObject *locals = PyDict_New();
@@ -53,4 +53,45 b''
53 53 (setq mode-name "hg-test")
54 54 (run-hooks 'hg-test-mode-hook))
55 55
56 (with-eval-after-load "compile"
57 ;; Link to Python sources in tracebacks in .t failures.
58 (add-to-list 'compilation-error-regexp-alist-alist
59 '(hg-test-output-python-tb
60 "^\\+ +File ['\"]\\([^'\"]+\\)['\"], line \\([0-9]+\\)," 1 2))
61 (add-to-list 'compilation-error-regexp-alist 'hg-test-output-python-tb)
62 ;; Link to source files in test-check-code.t violations.
63 (add-to-list 'compilation-error-regexp-alist-alist
64 '(hg-test-check-code-output
65 "\\+ \\([^:\n]+\\):\\([0-9]+\\):$" 1 2))
66 (add-to-list 'compilation-error-regexp-alist 'hg-test-check-code-output))
67
68 (defun hg-test-mode--test-one-error-line-regexp (test)
69 (erase-buffer)
70 (setq compilation-locs (make-hash-table))
71 (insert (car test))
72 (compilation-parse-errors (point-min) (point-max))
73 (let ((msg (get-text-property 1 'compilation-message)))
74 (should msg)
75 (let ((loc (compilation--message->loc msg))
76 (line (nth 1 test))
77 (file (nth 2 test)))
78 (should (equal (compilation--loc->line loc) line))
79 (should (equal (caar (compilation--loc->file-struct loc)) file)))
80 msg))
81
82 (require 'ert)
83 (ert-deftest hg-test-mode--compilation-mode-support ()
84 "Test hg-specific compilation-mode regular expressions"
85 (require 'compile)
86 (with-temp-buffer
87 (font-lock-mode -1)
88 (mapc 'hg-test-mode--test-one-error-line-regexp
89 '(
90 ("+ contrib/debugshell.py:37:" 37 "contrib/debugshell.py")
91 ("+ File \"/tmp/hg/mercurial/commands.py\", line 3115, in help_"
92 3115 "/tmp/hg/mercurial/commands.py")
93 ("+ File \"mercurial/dispatch.py\", line 225, in dispatch"
94 225 "mercurial/dispatch.py")))))
95
96
56 97 (provide 'hg-test-mode)
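Reviewer note: the two `compilation-error-regexp-alist-alist` entries added above can be checked outside Emacs. The sketch below transcribes them to Python's `re` syntax (Emacs `\\(` groups become plain parentheses) and runs them against the same three sample lines as the ERT test. The function name `locate` is hypothetical, and the amount of whitespace after the leading `+` in the traceback samples is assumed.

```python
import re

# "^\\+ +File ['\"]\\([^'\"]+\\)['\"], line \\([0-9]+\\)," in Emacs syntax
PYTHON_TB = re.compile(r"""^\+ +File ['"]([^'"]+)['"], line ([0-9]+),""")
# "\\+ \\([^:\n]+\\):\\([0-9]+\\):$" in Emacs syntax
CHECK_CODE = re.compile(r"\+ ([^:\n]+):([0-9]+):$")


def locate(line):
    """Return (file, line number) for a recognized test-output line, else None."""
    for pattern in (PYTHON_TB, CHECK_CODE):
        m = pattern.search(line)
        if m:
            return m.group(1), int(m.group(2))
    return None
```

The traceback pattern is tried first, matching the order in which the entries are registered; the check-code pattern only applies to lines that end in `file:line:`.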
@@ -76,7 +76,7 b' def build_docker_image(dockerfile: pathl'
76 76 p.communicate(input=dockerfile)
77 77 if p.returncode:
78 78 raise subprocess.CalledProcessException(
79 p.returncode, 'failed to build docker image: %s %s' \
79 p.returncode, 'failed to build docker image: %s %s'
80 80 % (p.stdout, p.stderr))
81 81
82 82 def command_build(args):
@@ -5,7 +5,7 b''
5 5 #define FileHandle
6 6 #define FileLine
7 7 #define VERSION = "unknown"
8 #if FileHandle = FileOpen(SourcePath + "\..\..\mercurial\__version__.py")
8 #if FileHandle = FileOpen(SourcePath + "\..\..\..\mercurial\__version__.py")
9 9 #expr FileLine = FileRead(FileHandle)
10 10 #expr FileLine = FileRead(FileHandle)
11 11 #define VERSION = Copy(FileLine, Pos('"', FileLine)+1, Len(FileLine)-Pos('"', FileLine)-1)
@@ -43,7 +43,7 b' AppUpdatesURL=https://mercurial-scm.org/'
43 43 AppID={{4B95A5F1-EF59-4B08-BED8-C891C46121B3}
44 44 AppContact=mercurial@mercurial-scm.org
45 45 DefaultDirName={pf}\Mercurial
46 SourceDir=..\..
46 SourceDir=..\..\..
47 47 VersionInfoDescription=Mercurial distributed SCM (version {#VERSION})
48 48 VersionInfoCopyright=Copyright 2005-2019 Matt Mackall and others
49 49 VersionInfoCompany=Matt Mackall and others
@@ -53,6 +53,7 b' SetupIconFile=contrib\\win32\\mercurial.ic'
53 53 AllowNoIcons=true
54 54 DefaultGroupName=Mercurial
55 55 PrivilegesRequired=none
56 ChangesEnvironment=true
56 57
57 58 [Files]
58 59 Source: contrib\mercurial.el; DestDir: {app}/Contrib
@@ -70,17 +71,12 b' Source: contrib\\hgweb.wsgi; DestDir: {ap'
70 71 Source: contrib\win32\ReadMe.html; DestDir: {app}; Flags: isreadme
71 72 Source: contrib\win32\postinstall.txt; DestDir: {app}; DestName: ReleaseNotes.txt
72 73 Source: dist\hg.exe; DestDir: {app}; AfterInstall: Touch('{app}\hg.exe.local')
73 #if ARCH == "x64"
74 74 Source: dist\lib\*.dll; Destdir: {app}\lib
75 75 Source: dist\lib\*.pyd; Destdir: {app}\lib
76 #else
77 Source: dist\w9xpopen.exe; DestDir: {app}
78 #endif
79 76 Source: dist\python*.dll; Destdir: {app}; Flags: skipifsourcedoesntexist
80 77 Source: dist\msvc*.dll; DestDir: {app}; Flags: skipifsourcedoesntexist
81 78 Source: dist\Microsoft.VC*.CRT.manifest; DestDir: {app}; Flags: skipifsourcedoesntexist
82 79 Source: dist\lib\library.zip; DestDir: {app}\lib
83 Source: dist\add_path.exe; DestDir: {app}
84 80 Source: doc\*.html; DestDir: {app}\Docs
85 81 Source: doc\style.css; DestDir: {app}\Docs
86 82 Source: mercurial\help\*.txt; DestDir: {app}\help
@@ -107,14 +103,22 b' Name: {group}\\Mercurial Configuration Fi'
107 103 Name: {group}\Mercurial Ignore Files; Filename: {app}\Docs\hgignore.5.html
108 104 Name: {group}\Mercurial Web Site; Filename: {app}\Mercurial.url
109 105
110 [Run]
111 Filename: "{app}\add_path.exe"; Parameters: "{app}"; Flags: postinstall; Description: "Add the installation path to the search path"
112
113 [UninstallRun]
114 Filename: "{app}\add_path.exe"; Parameters: "/del {app}"
106 [Tasks]
107 Name: modifypath; Description: Add the installation path to the search path; Flags: unchecked
115 108
116 109 [Code]
117 110 procedure Touch(fn: String);
118 111 begin
119 112 SaveStringToFile(ExpandConstant(fn), '', False);
120 113 end;
114
115 const
116 ModPathName = 'modifypath';
117 ModPathType = 'user';
118
119 function ModPathDir(): TArrayOfString;
120 begin
121 setArrayLength(Result, 1)
122 Result[0] := ExpandConstant('{app}');
123 end;
124 #include "modpath.iss"
@@ -1,130 +1,61 b''
1 The standalone Windows installer for Mercurial is built in a somewhat
2 jury-rigged fashion.
1 Requirements
2 ============
3 3
4 It has the following prerequisites. Ensure to take the packages
5 matching the mercurial version you want to build (32-bit or 64-bit).
4 Building the Inno installer requires a Windows machine.
6 5
7 Python 2.6 for Windows
8 http://www.python.org/download/releases/
6 The following system dependencies must be installed:
9 7
10 A compiler:
11 either MinGW
12 http://www.mingw.org/
13 or Microsoft Visual C++ 2008 SP1 Express Edition
14 http://www.microsoft.com/express/Downloads/Download-2008.aspx
15
16 Python for Windows Extensions
17 http://sourceforge.net/projects/pywin32/
18
19 mfc71.dll (just download, don't install; not needed for Python 2.6)
20 http://starship.python.net/crew/mhammond/win32/
21
22 Visual C++ 2008 redistributable package (needed for >= Python 2.6 or if you compile with MSVC)
23 for 32-bit:
24 http://www.microsoft.com/downloads/details.aspx?FamilyID=9b2da534-3e03-4391-8a4d-074b9f2bc1bf
25 for 64-bit:
26 http://www.microsoft.com/downloads/details.aspx?familyid=bd2a6171-e2d6-4230-b809-9a8d7548c1b6
8 * Python 2.7 (download from https://www.python.org/downloads/)
9 * Microsoft Visual C++ Compiler for Python 2.7
10 (https://www.microsoft.com/en-us/download/details.aspx?id=44266)
11 * Inno Setup (http://jrsoftware.org/isdl.php) version 5.4 or newer.
12 Be sure to install the optional Inno Setup Preprocessor feature,
13 which is required.
14 * Python 3.5+ (to run the ``build.py`` script)
27 15
28 The py2exe distutils extension
29 http://sourceforge.net/projects/py2exe/
30
31 GnuWin32 gettext utility (if you want to build translations)
32 http://gnuwin32.sourceforge.net/packages/gettext.htm
33
34 Inno Setup
35 http://www.jrsoftware.org/isdl.php#qsp
36
37 Get and install ispack-5.3.10.exe or later (includes Inno Setup Processor),
38 which is necessary to package Mercurial.
39
40 ISTool - optional
41 http://www.istool.org/default.aspx/
16 Building
17 ========
42 18
43 add_path (you need only add_path.exe in the zip file)
44 http://www.barisione.org/apps.html#add_path
45
46 Docutils
47 http://docutils.sourceforge.net/
19 The ``build.py`` script automates the process of producing an
20 Inno installer. It manages fetching and configuring the
21 non-system dependencies (such as py2exe, gettext, and various
22 Python packages).
48 23
49 CA Certs file
50 http://curl.haxx.se/ca/cacert.pem
51
52 And, of course, Mercurial itself.
53
54 Once you have all this installed and built, clone a copy of the
55 Mercurial repository you want to package, and name the repo
56 C:\hg\hg-release.
57
58 In a shell, build a standalone copy of the hg.exe program.
24 The script requires an activated ``Visual C++ 2008`` command prompt.
25 A shortcut to such a prompt was installed with ``Microsoft Visual C++
26 Compiler for Python 2.7``. From your Start Menu, look for
27 ``Microsoft Visual C++ Compiler Package for Python 2.7`` then launch
28 either ``Visual C++ 2008 32-bit Command Prompt`` or
29 ``Visual C++ 2008 64-bit Command Prompt``.
59 30
60 Building instructions for MinGW:
61 python setup.py build -c mingw32
62 python setup.py py2exe -b 2
63 Note: the previously suggested combined command of "python setup.py build -c
64 mingw32 py2exe -b 2" doesn't work correctly anymore as it doesn't include the
65 extensions in the mercurial subdirectory.
66 If you want to create a file named setup.cfg with the contents:
67 [build]
68 compiler=mingw32
69 you can skip the first build step.
31 From the prompt, change to the Mercurial source directory. e.g.
32 ``cd c:\src\hg``.
33
34 Next, invoke ``build.py`` to produce an Inno installer. You will
35 need to supply the path to the Python interpreter to use::
70 36
71 Building instructions with MSVC 2008 Express Edition:
72 for 32-bit:
73 "C:\Program Files\Microsoft Visual Studio 9.0\VC\vcvarsall.bat" x86
74 python setup.py py2exe -b 2
75 for 64-bit:
76 "C:\Program Files\Microsoft Visual Studio 9.0\VC\vcvarsall.bat" x86_amd64
77 python setup.py py2exe -b 3
37 $ python3.exe contrib\packaging\inno\build.py \
38 --python c:\python27\python.exe
78 39
79 Copy add_path.exe and cacert.pem files into the dist directory that just got created.
40 .. note::
80 41
81 If you are using Python 2.6 or later, or if you are using MSVC 2008 to compile
82 mercurial, you must include the C runtime libraries in the installer. To do so,
83 install the Visual C++ 2008 redistributable package. Then in your windows\winsxs
84 folder, locate the folder containing the dlls version 9.0.21022.8.
85 For x86, it should be named like x86_Microsoft.VC90.CRT_(...)_9.0.21022.8(...).
86 For x64, it should be named like amd64_Microsoft.VC90.CRT_(...)_9.0.21022.8(...).
87 Copy the files named msvcm90.dll, msvcp90.dll and msvcr90.dll into the dist
88 directory.
89 Then in the windows\winsxs\manifests folder, locate the corresponding manifest
90 file (x86_Microsoft.VC90.CRT_(...)_9.0.21022.8(...).manifest for x86,
91 amd64_Microsoft.VC90.CRT_(...)_9.0.21022.8(...).manifest for x64), copy it in the
92 dist directory and rename it to Microsoft.VC90.CRT.manifest.
42 The script validates that the Visual C++ environment is
43 active and that the architecture of the specified Python
44 interpreter matches the Visual C++ environment and errors
45 if not.
93 46
94 Before building the installer, you have to build Mercurial HTML documentation
95 (or fix mercurial.iss to not reference the doc directory):
96
97 cd doc
98 mingw32-make html
99 cd ..
47 If everything runs as intended, dependencies will be fetched and
48 configured into the ``build`` sub-directory, Mercurial will be built,
49 and an installer placed in the ``dist`` sub-directory. The final
50 line of output should print the name of the generated installer.
100 51
101 If you use ISTool, you open the C:\hg\hg-release\contrib\win32\mercurial.iss
102 file and type Ctrl-F9 to compile the installer file.
103
104 Otherwise you run the Inno Setup compiler. Assuming it's in the path
105 you should execute:
106
107 iscc contrib\win32\mercurial.iss /dVERSION=foo
52 Additional options may be configured. Run ``build.py --help`` to
53 see a list of program flags.
108 54
109 Where 'foo' is the version number you would like to see in the
110 'Add/Remove Applications' tool. The installer will be placed into
111 a directory named Output/ at the root of your repository.
112 If the /dVERSION=foo parameter is not given in the command line, the
113 installer will retrieve the version information from the __version__.py file.
114
115 If you want to build an installer for a 64-bit mercurial, add /dARCH=x64 to
116 your command line:
117 iscc contrib\win32\mercurial.iss /dARCH=x64
55 MinGW
56 =====
118 57
119 To automate the steps above you may want to create a batchfile based on the
120 following (MinGW build chain):
121
122 echo [build] > setup.cfg
123 echo compiler=mingw32 >> setup.cfg
124 python setup.py py2exe -b 2
125 cd doc
126 mingw32-make html
127 cd ..
128 iscc contrib\win32\mercurial.iss /dVERSION=snapshot
129
130 and run it from the root of the hg repository (c:\hg\hg-release).
58 It is theoretically possible to generate an installer that uses
59 MinGW. This isn't well tested and ``build.py`` may not properly
60 support it. See old versions of this file in version control for
61 potentially useful hints as to how to achieve this.
1 NO CONTENT: file renamed from contrib/wix/COPYING.rtf to contrib/packaging/wix/COPYING.rtf, binary diff hidden
1 NO CONTENT: file renamed from contrib/wix/contrib.wxs to contrib/packaging/wix/contrib.wxs
1 NO CONTENT: file renamed from contrib/wix/defines.wxi to contrib/packaging/wix/defines.wxi
@@ -9,28 +9,6 b''
9 9 <Component Id="distOutput" Guid="$(var.dist.guid)" Win64='$(var.IsX64)'>
10 10 <File Name="python27.dll" KeyPath="yes" />
11 11 </Component>
12 <Directory Id="libdir" Name="lib" FileSource="$(var.SourceDir)/lib">
13 <Component Id="libOutput" Guid="$(var.lib.guid)" Win64='$(var.IsX64)'>
14 <File Name="library.zip" KeyPath="yes" />
15 <File Name="mercurial.cext.base85.pyd" />
16 <File Name="mercurial.cext.bdiff.pyd" />
17 <File Name="mercurial.cext.mpatch.pyd" />
18 <File Name="mercurial.cext.osutil.pyd" />
19 <File Name="mercurial.cext.parsers.pyd" />
20 <File Name="mercurial.zstd.pyd" />
21 <File Name="hgext.fsmonitor.pywatchman.bser.pyd" />
22 <File Name="pyexpat.pyd" />
23 <File Name="bz2.pyd" />
24 <File Name="select.pyd" />
25 <File Name="unicodedata.pyd" />
26 <File Name="_ctypes.pyd" />
27 <File Name="_elementtree.pyd" />
28 <File Name="_testcapi.pyd" />
29 <File Name="_hashlib.pyd" />
30 <File Name="_socket.pyd" />
31 <File Name="_ssl.pyd" />
32 </Component>
33 </Directory>
34 12 </DirectoryRef>
35 13 </Fragment>
36 14
1 NO CONTENT: file renamed from contrib/wix/doc.wxs to contrib/packaging/wix/doc.wxs
1 NO CONTENT: file renamed from contrib/wix/guids.wxi to contrib/packaging/wix/guids.wxi
1 NO CONTENT: file renamed from contrib/wix/help.wxs to contrib/packaging/wix/help.wxs
1 NO CONTENT: file renamed from contrib/wix/i18n.wxs to contrib/packaging/wix/i18n.wxs
1 NO CONTENT: file renamed from contrib/wix/locale.wxs to contrib/packaging/wix/locale.wxs
@@ -69,7 +69,7 b''
69 69 KeyPath='yes'/>
70 70 </Component>
71 71 <Component Id='COPYING' Guid='$(var.COPYING.guid)' Win64='$(var.IsX64)'>
72 <File Id='COPYING' Name='COPYING.rtf' Source='contrib\wix\COPYING.rtf'
72 <File Id='COPYING' Name='COPYING.rtf' Source='contrib\packaging\wix\COPYING.rtf'
73 73 KeyPath='yes'/>
74 74 </Component>
75 75
@@ -129,6 +129,11 b''
129 129 <MergeRef Id='VCRuntime' />
130 130 <MergeRef Id='VCRuntimePolicy' />
131 131 </Feature>
132 <?ifdef MercurialExtraFeatures?>
133 <?foreach EXTRAFEAT in $(var.MercurialExtraFeatures)?>
134 <FeatureRef Id="$(var.EXTRAFEAT)" />
135 <?endforeach?>
136 <?endif?>
132 137 <Feature Id='Locales' Title='Translations' Description='Translations' Level='1'>
133 138 <ComponentGroupRef Id='localeFolder' />
134 139 <ComponentRef Id='i18nFolder' />
@@ -144,7 +149,7 b''
144 149 <UIRef Id="WixUI_FeatureTree" />
145 150 <UIRef Id="WixUI_ErrorProgressText" />
146 151
147 <WixVariable Id="WixUILicenseRtf" Value="contrib\wix\COPYING.rtf" />
152 <WixVariable Id="WixUILicenseRtf" Value="contrib\packaging\wix\COPYING.rtf" />
148 153
149 154 <Icon Id="hgIcon.ico" SourceFile="contrib/win32/mercurial.ico" />
150 155
@@ -1,31 +1,71 b''
1 WiX installer source files
1 WiX Installer
2 =============
3
4 The files in this directory are used to produce an MSI installer using
5 the WiX Toolset (http://wixtoolset.org/).
6
7 The MSI installers require elevated (admin) privileges due to the
8 installation of MSVC CRT libraries into the Windows system store. See
9 the Inno Setup installers in the ``inno`` sibling directory for installers
10 that do not have this requirement.
11
12 Requirements
13 ============
14
15 Building the WiX installers requires a Windows machine. The following
16 dependencies must be installed:
17
18 * Python 2.7 (download from https://www.python.org/downloads/)
19 * Microsoft Visual C++ Compiler for Python 2.7
20 (https://www.microsoft.com/en-us/download/details.aspx?id=44266)
21 * Python 3.5+ (to run the ``build.py`` script)
22
23 Building
24 ========
25
26 The ``build.py`` script automates the process of producing an MSI
27 installer. It manages fetching and configuring non-system dependencies
28 (such as py2exe, gettext, and various Python packages).
29
30 The script requires an activated ``Visual C++ 2008`` command prompt.
31 A shortcut to such a prompt was installed with ``Microsoft Visual
32 C++ Compiler for Python 2.7``. From your Start Menu, look for
33 ``Microsoft Visual C++ Compiler Package for Python 2.7`` then
34 launch either ``Visual C++ 2008 32-bit Command Prompt`` or
35 ``Visual C++ 2008 64-bit Command Prompt``.
36
37 From the prompt, change to the Mercurial source directory. e.g.
38 ``cd c:\src\hg``.
39
40 Next, invoke ``build.py`` to produce an MSI installer. You will need
41 to supply the path to the Python interpreter to use.::
42
43 $ python3 contrib\packaging\wix\build.py \
44 --python c:\python27\python.exe
45
46 .. note::
47
48 The script validates that the Visual C++ environment is active and
49 that the architecture of the specified Python interpreter matches the
50 Visual C++ environment. An error is raised otherwise.
51
52 If everything runs as intended, dependencies will be fetched and
53 configured into the ``build`` sub-directory, Mercurial will be built,
54 and an installer placed in the ``dist`` sub-directory. The final line
55 of output should print the name of the generated installer.
56
57 Additional options may be configured. Run ``build.py --help`` to see
58 a list of program flags.
59
60 Relationship to TortoiseHG
2 61 ==========================
3 62
4 The files in this folder are used by the thg-winbuild [1] package
5 building architecture to create a Mercurial MSI installer. These files
6 are versioned within the Mercurial source tree because the WXS files
7 must kept up to date with distribution changes within their branch. In
8 other words, the default branch WXS files are expected to diverge from
9 the stable branch WXS files. Storing them within the same repository is
10 the only sane way to keep the source tree and the installer in sync.
11
12 The MSI installer builder uses only the mercurial.ini file from the
13 contrib/win32 folder, the contents of which have been historically used
14 to create an InnoSetup based installer. The rest of the files there are
15 ignored.
63 TortoiseHG uses the WiX files in this directory.
16 64
17 The MSI packages built by thg-winbuild require elevated (admin)
18 privileges to be installed due to the installation of MSVC CRT libraries
19 under the C:\WINDOWS\WinSxS folder. Thus the InnoSetup installers may
20 still be useful to some users.
65 The code for building TortoiseHG installers lives at
66 https://bitbucket.org/tortoisehg/thg-winbuild and is maintained by
67 Steve Borho (steve@borho.org).
21 68
22 To build your own MSI packages, clone the thg-winbuild [1] repository
23 and follow the README.txt [2] instructions closely. There are fewer
24 prerequisites for a WiX [3] installer than an InnoSetup installer, but
25 they are more specific.
26
27 Direct questions or comments to Steve Borho <steve@borho.org>
28
29 [1] http://bitbucket.org/tortoisehg/thg-winbuild
30 [2] http://bitbucket.org/tortoisehg/thg-winbuild/src/tip/README.txt
31 [3] http://wix.sourceforge.net/
69 When changing behavior of the WiX installer, be sure to notify
70 the TortoiseHG Project of the changes so they have ample time
71 to provide feedback and react to those changes.
1 NO CONTENT: file renamed from contrib/wix/templates.wxs to contrib/packaging/wix/templates.wxs
@@ -28,9 +28,13 b''
28 28
29 29 set -euo pipefail
30 30
31 printusage () {
32 echo "usage: `basename $0` REPO NBHEADS DEPTH [left|right]" >&2
33 }
34
31 35 if [ $# -lt 3 ]; then
32 echo "usage: `basename $0` REPO NBHEADS DEPTH"
33 exit 64
36 printusage
37 exit 64
34 38 fi
35 39
36 40 repo="$1"
@@ -42,8 +46,26 b' shift'
42 46 depth="$1"
43 47 shift
44 48
45 leftrepo="${repo}-left"
46 rightrepo="${repo}-right"
49 doleft=1
50 doright=1
51 if [ $# -gt 1 ]; then
52 printusage
53 exit 64
54 elif [ $# -eq 1 ]; then
55 if [ "$1" == "left" ]; then
56 doleft=1
57 doright=0
58 elif [ "$1" == "right" ]; then
59 doleft=0
60 doright=1
61 else
62 printusage
63 exit 64
64 fi
65 fi
66
67 leftrepo="${repo}-${nbheads}h-${depth}d-left"
68 rightrepo="${repo}-${nbheads}h-${depth}d-right"
47 69
48 70 left="first(sort(heads(all()), 'desc'), $nbheads)"
49 71 right="last(sort(heads(all()), 'desc'), $nbheads)"
@@ -51,14 +73,35 b' right="last(sort(heads(all()), \'desc\'), '
51 73 leftsubset="ancestors($left, $depth) and only($left, heads(all() - $left))"
52 74 rightsubset="ancestors($right, $depth) and only($right, heads(all() - $right))"
53 75
54 echo '### building left repository:' $left-repo
55 echo '# cloning'
56 hg clone --noupdate "${repo}" "${leftrepo}"
57 echo '# stripping' '"'${leftsubset}'"'
58 hg -R "${leftrepo}" --config extensions.strip= strip --rev "$leftsubset" --no-backup
76 echo '### creating left/right repositories with missing changesets:'
77 if [ $doleft -eq 1 ]; then
78 echo '# left revset:' '"'${leftsubset}'"'
79 fi
80 if [ $doright -eq 1 ]; then
81 echo '# right revset:' '"'${rightsubset}'"'
82 fi
59 83
60 echo '### building right repository:' $right-repo
61 echo '# cloning'
62 hg clone --noupdate "${repo}" "${rightrepo}"
63 echo '# stripping:' '"'${rightsubset}'"'
64 hg -R "${rightrepo}" --config extensions.strip= strip --rev "$rightsubset" --no-backup
84 buildone() {
85 side="$1"
86 dest="$2"
87 revset="$3"
88 echo "### building $side repository: $dest"
89 if [ -e "$dest" ]; then
90 echo "destination repo already exists: $dest" >&2
91 exit 1
92 fi
93 echo '# cloning'
94 if ! cp --recursive --reflink=always "${repo}" "${dest}"; then
95 hg clone --noupdate "${repo}" "${dest}"
96 fi
97 echo '# stripping' '"'${revset}'"'
98 hg -R "${dest}" --config extensions.strip= strip --rev "$revset" --no-backup
99 }
100
101 if [ $doleft -eq 1 ]; then
102 buildone left "$leftrepo" "$leftsubset"
103 fi
104
105 if [ $doright -eq 1 ]; then
106 buildone right "$rightrepo" "$rightsubset"
107 fi
@@ -1,5 +1,34 b''
1 1 # perf.py - performance test routines
2 '''helper extension to measure performance'''
2 '''helper extension to measure performance
3
4 Configurations
5 ==============
6
7 ``perf``
8 --------
9
10 ``all-timing``
11 When set, additional statistics will be reported for each benchmark: best,
12 worst, median, average. If not set, only the best timing is reported
13 (default: off).
14
15 ``presleep``
16 Number of seconds to wait before any group of runs (default: 1).
17
18 ``run-limits``
19 Control the number of runs each benchmark will perform. The option value
20 should be a list of `<time>-<numberofrun>` pairs. After each run the
21 conditions are considered in order with the following logic:
22
23 If the benchmark has been running for at least <time> seconds and has
24 performed at least <numberofrun> iterations, stop the benchmark.
25
26 The default value is: `3.0-100, 10.0-3`
27
28 ``stub``
29 When set, benchmarks will only be run once, useful for testing
30 (default: off)
31 '''
3 32
4 33 # "historical portability" policy of perf.py:
5 34 #
@@ -65,6 +94,10 b' try:'
65 94 except ImportError:
66 95 pass
67 96 try:
97 from mercurial.utils import repoviewutil # since 5.0
98 except ImportError:
99 repoviewutil = None
100 try:
68 101 from mercurial import scmutil # since 1.9 (or 8b252e826c68)
69 102 except ImportError:
70 103 pass
@@ -207,6 +240,9 b' try:'
207 240 configitem(b'perf', b'all-timing',
208 241 default=mercurial.configitems.dynamicdefault,
209 242 )
243 configitem(b'perf', b'run-limits',
244 default=mercurial.configitems.dynamicdefault,
245 )
210 246 except (ImportError, AttributeError):
211 247 pass
212 248
@@ -279,7 +315,34 b' def gettimer(ui, opts=None):'
279 315
280 316 # experimental config: perf.all-timing
281 317 displayall = ui.configbool(b"perf", b"all-timing", False)
282 return functools.partial(_timer, fm, displayall=displayall), fm
318
319 # experimental config: perf.run-limits
320 limitspec = ui.configlist(b"perf", b"run-limits", [])
321 limits = []
322 for item in limitspec:
323 parts = item.split(b'-', 1)
324 if len(parts) < 2:
325 ui.warn((b'malformatted run limit entry, missing "-": %s\n'
326 % item))
327 continue
328 try:
329 time_limit = float(pycompat.sysstr(parts[0]))
330 except ValueError as e:
331 ui.warn((b'malformatted run limit entry, %s: %s\n'
332 % (pycompat.bytestr(e), item)))
333 continue
334 try:
335 run_limit = int(pycompat.sysstr(parts[1]))
336 except ValueError as e:
337 ui.warn((b'malformatted run limit entry, %s: %s\n'
338 % (pycompat.bytestr(e), item)))
339 continue
340 limits.append((time_limit, run_limit))
341 if not limits:
342 limits = DEFAULTLIMITS
343
344 t = functools.partial(_timer, fm, displayall=displayall, limits=limits)
345 return t, fm
283 346
284 347 def stub_timer(fm, func, setup=None, title=None):
285 348 if setup is not None:
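The `perf.run-limits` parsing added to `gettimer` above can be tried outside Mercurial. The following is a minimal standalone sketch, not the code from `perf.py`: it works on plain `str` values instead of bytes, and `parse_run_limits` with its `warn` parameter are hypothetical names introduced only for illustration.

```python
# Default stop conditions: (elapsed seconds, minimal run count).
DEFAULT_LIMITS = ((3.0, 100), (10.0, 3))


def parse_run_limits(entries, warn=print):
    """Turn '<time>-<numberofrun>' strings into (seconds, min_runs) pairs.

    Malformed entries are reported through ``warn`` and skipped. If no
    entry survives, the defaults are returned.
    """
    limits = []
    for item in entries:
        parts = item.split('-', 1)
        if len(parts) < 2:
            warn('malformatted run limit entry, missing "-": %s' % item)
            continue
        try:
            # float() is evaluated first, then int(); either may raise.
            limits.append((float(parts[0]), int(parts[1])))
        except ValueError as e:
            warn('malformatted run limit entry, %s: %s' % (e, item))
    return limits or list(DEFAULT_LIMITS)


if __name__ == '__main__':
    print(parse_run_limits(['3.0-100', '10.0-3']))
    print(parse_run_limits(['no-dash-number', '2-5']))
```

Skipping a bad entry while keeping the good ones mirrors the `continue` statements in the hunk: one typo in the configuration does not discard the other limits.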
@@ -297,12 +360,21 b' def timeone():'
297 360 a, b = ostart, ostop
298 361 r.append((cstop - cstart, b[0] - a[0], b[1]-a[1]))
299 362
300 def _timer(fm, func, setup=None, title=None, displayall=False):
363
364 # list of stop conditions (elapsed time, minimal run count)
365 DEFAULTLIMITS = (
366 (3.0, 100),
367 (10.0, 3),
368 )
369
370 def _timer(fm, func, setup=None, title=None, displayall=False,
371 limits=DEFAULTLIMITS):
301 372 gc.collect()
302 373 results = []
303 374 begin = util.timer()
304 375 count = 0
305 while True:
376 keepgoing = True
377 while keepgoing:
306 378 if setup is not None:
307 379 setup()
308 380 with timeone() as item:
@@ -310,10 +382,12 b' def _timer(fm, func, setup=None, title=N'
310 382 count += 1
311 383 results.append(item[0])
312 384 cstop = util.timer()
313 if cstop - begin > 3 and count >= 100:
314 break
315 if cstop - begin > 10 and count >= 3:
316 break
385 # Look for a stop condition.
386 elapsed = cstop - begin
387 for t, mincount in limits:
388 if elapsed >= t and count >= mincount:
389 keepgoing = False
390 break
317 391
318 392 formatone(fm, results, title=title, result=r,
319 393 displayall=displayall)
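The stop-condition loop in `_timer` can be illustrated on its own. The sketch below is an approximation under stated assumptions, not the `perf.py` implementation: `run_until_limit` and `FakeClock` are hypothetical helpers, and the clock is simulated so the behavior is deterministic. As in the added lines of the hunk, the comparison uses `>=`, where the removed lines used `>` for the elapsed time.

```python
class FakeClock:
    """Deterministic stand-in for a timer; advanced manually via tick()."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def tick(self, seconds):
        self.now += seconds


def run_until_limit(func, limits, clock):
    """Call func repeatedly until one (seconds, min_runs) pair is satisfied.

    Returns the number of times func was called.
    """
    begin = clock()
    count = 0
    keepgoing = True
    while keepgoing:
        func()
        count += 1
        elapsed = clock() - begin
        # Conditions are checked in order; the first match ends the run.
        for seconds, mincount in limits:
            if elapsed >= seconds and count >= mincount:
                keepgoing = False
                break
    return count


if __name__ == '__main__':
    clock = FakeClock()
    # Each simulated benchmark run takes one second.
    runs = run_until_limit(lambda: clock.tick(1.0),
                           ((3.0, 100), (10.0, 3)), clock)
    print(runs)
```

With the default limits, a slow benchmark stops on the second pair (enough time has passed and a few runs exist), while a fast one stops on the first pair once 100 runs have accumulated.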
@@ -401,7 +475,8 b' def getbranchmapsubsettable():'
401 475 # subsettable is defined in:
402 476 # - branchmap since 2.9 (or 175c6fd8cacc)
403 477 # - repoview since 2.5 (or 59a9f18d4587)
404 for mod in (branchmap, repoview):
478 # - repoviewutil since 5.0
479 for mod in (branchmap, repoview, repoviewutil):
405 480 subsettable = getattr(mod, 'subsettable', None)
406 481 if subsettable:
407 482 return subsettable
@@ -519,7 +594,11 b' def perfaddremove(ui, repo, **opts):'
519 594 repo.ui.quiet = True
520 595 matcher = scmutil.match(repo[None])
521 596 opts[b'dry_run'] = True
522 timer(lambda: scmutil.addremove(repo, matcher, b"", opts))
597 if b'uipathfn' in getargspec(scmutil.addremove).args:
598 uipathfn = scmutil.getuipathfn(repo)
599 timer(lambda: scmutil.addremove(repo, matcher, b"", uipathfn, opts))
600 else:
601 timer(lambda: scmutil.addremove(repo, matcher, b"", opts))
523 602 finally:
524 603 repo.ui.quiet = oldquiet
525 604 fm.end()
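The `perfaddremove` hunk above chooses a call form by checking the target function's argument names, so one copy of `perf.py` keeps working against several Mercurial versions. A minimal standalone sketch of that technique follows. It assumes Python 3's `inspect.getfullargspec` in place of the `getargspec` helper that `perf.py` uses, and `call_addremove`, `old_addremove`, and `new_addremove` are hypothetical stand-ins, not Mercurial APIs.

```python
import inspect


def call_addremove(addremove, repo, matcher, prefix, opts, make_uipathfn):
    """Call addremove with or without a uipathfn argument.

    The decision is based on the callee's declared argument names, so the
    same caller works with both the older and the newer signature.
    """
    if 'uipathfn' in inspect.getfullargspec(addremove).args:
        return addremove(repo, matcher, prefix, make_uipathfn(repo), opts)
    return addremove(repo, matcher, prefix, opts)


# Hypothetical stand-ins for the two signatures.
def old_addremove(repo, matcher, prefix, opts):
    return 'old signature'


def new_addremove(repo, matcher, prefix, uipathfn, opts):
    return 'new signature'


if __name__ == '__main__':
    for fn in (old_addremove, new_addremove):
        print(call_addremove(fn, 'repo', 'matcher', '', {}, lambda repo: None))
```

Inspecting the signature keeps the compatibility check next to the call it guards, which fits the "historical portability" policy stated at the top of `perf.py`.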
@@ -535,13 +614,15 b' def clearcaches(cl):'
535 614
536 615 @command(b'perfheads', formatteropts)
537 616 def perfheads(ui, repo, **opts):
617 """benchmark the computation of changelog heads"""
538 618 opts = _byteskwargs(opts)
539 619 timer, fm = gettimer(ui, opts)
540 620 cl = repo.changelog
621 def s():
622 clearcaches(cl)
541 623 def d():
542 624 len(cl.headrevs())
543 clearcaches(cl)
544 timer(d)
625 timer(d, setup=s)
545 626 fm.end()
546 627
547 628 @command(b'perftags', formatteropts+
@@ -911,9 +992,7 b' def perfphasesremote(ui, repo, dest=None'
911 992 raise error.Abort((b'default repository not configured!'),
912 993 hint=(b"see 'hg help config.paths'"))
913 994 dest = path.pushloc or path.loc
914 branches = (path.branch, opts.get(b'branch') or [])
915 995 ui.status((b'analysing phase of %s\n') % util.hidepassword(dest))
916 revs, checkout = hg.addbranchrevs(repo, repo, branches, opts.get(b'rev'))
917 996 other = hg.peer(repo, opts, dest)
918 997
919 998 # easier to perform discovery through the operation
@@ -1014,18 +1093,44 b' def perfignore(ui, repo, **opts):'
1014 1093 fm.end()
1015 1094
1016 1095 @command(b'perfindex', [
1017 (b'', b'rev', b'', b'revision to be looked up (default tip)'),
1096 (b'', b'rev', [], b'revision to be looked up (default tip)'),
1097 (b'', b'no-lookup', None, b'do not perform a revision lookup after creation'),
1018 1098 ] + formatteropts)
1019 1099 def perfindex(ui, repo, **opts):
1100 """benchmark index creation time followed by a lookup
1101
1102 The default is to look `tip` up. Depending on the index implementation,
1103 the revision looked up can matter. For example, an implementation
1104 scanning the index will have a faster lookup time for `--rev tip` than for
1105 `--rev 0`. The number of looked up revisions and their order can also
1106 matter.
1107
1108 Examples of useful sets to test:
1109 * tip
1110 * 0
1111 * -10:
1112 * :10
1113 * -10: + :10
1114 * :10: + -10:
1115 * -10000:
1116 * -10000: + 0
1117
1118 It is not currently possible to check for lookup of a missing node. For
1119 deeper lookup benchmarking, check out the `perfnodemap` command."""
1020 1120 import mercurial.revlog
1021 1121 opts = _byteskwargs(opts)
1022 1122 timer, fm = gettimer(ui, opts)
1023 1123 mercurial.revlog._prereadsize = 2**24 # disable lazy parser in old hg
1024 if opts[b'rev'] is None:
1025 n = repo[b"tip"].node()
1124 if opts[b'no_lookup']:
1125 if opts[b'rev']:
1126 raise error.Abort(b'--no-lookup and --rev are mutually exclusive')
1127 nodes = []
1128 elif not opts[b'rev']:
1129 nodes = [repo[b"tip"].node()]
1026 1130 else:
1027 rev = scmutil.revsingle(repo, opts[b'rev'])
1028 n = repo[rev].node()
1131 revs = scmutil.revrange(repo, opts[b'rev'])
1132 cl = repo.changelog
1133 nodes = [cl.node(r) for r in revs]
1029 1134
1030 1135 unfi = repo.unfiltered()
1031 1136 # find the filecache func directly
@@ -1036,7 +1141,67 b' def perfindex(ui, repo, **opts):'
1036 1141 clearchangelog(unfi)
1037 1142 def d():
1038 1143 cl = makecl(unfi)
1039 cl.rev(n)
1144 for n in nodes:
1145 cl.rev(n)
1146 timer(d, setup=setup)
1147 fm.end()
1148
1149 @command(b'perfnodemap', [
1150 (b'', b'rev', [], b'revision to be looked up (default tip)'),
1151 (b'', b'clear-caches', True, b'clear revlog cache between calls'),
1152 ] + formatteropts)
1153 def perfnodemap(ui, repo, **opts):
1154 """benchmark the time necessary to look up revision from a cold nodemap
1155
1156 Depending on the implementation, results can vary with the number and
1157 order of revisions we look up. Examples of useful sets to test:
1158 * tip
1159 * 0
1160 * -10:
1161 * :10
1162 * -10: + :10
1163 * :10: + -10:
1164 * -10000:
1165 * -10000: + 0
1166
1167 The command currently focuses on valid binary lookup. Benchmarking for
1168 hex lookup, prefix lookup and missing lookup would also be valuable.
1169 """
1170 import mercurial.revlog
1171 opts = _byteskwargs(opts)
1172 timer, fm = gettimer(ui, opts)
1173 mercurial.revlog._prereadsize = 2**24 # disable lazy parser in old hg
1174
1175 unfi = repo.unfiltered()
1176 clearcaches = opts[b'clear_caches']
1177 # find the filecache func directly
1178 # This avoids polluting the benchmark with the filecache logic
1179 makecl = unfi.__class__.changelog.func
1180 if not opts[b'rev']:
1181 raise error.Abort(b'use --rev to specify revisions to look up')
1182 revs = scmutil.revrange(repo, opts[b'rev'])
1183 cl = repo.changelog
1184 nodes = [cl.node(r) for r in revs]
1185
1186 # use a list to pass reference to a nodemap from one closure to the next
1187 nodeget = [None]
1188 def setnodeget():
1189 # probably not necessary, but for good measure
1190 clearchangelog(unfi)
1191 nodeget[0] = makecl(unfi).nodemap.get
1192
1193 def d():
1194 get = nodeget[0]
1195 for n in nodes:
1196 get(n)
1197
1198 setup = None
1199 if clearcaches:
1200 def setup():
1201 setnodeget()
1202 else:
1203 setnodeget()
1204 d() # prewarm the data structure
1040 1205 timer(d, setup=setup)
1041 1206 fm.end()
1042 1207
@@ -1056,6 +1221,13 b' def perfstartup(ui, repo, **opts):'
1056 1221
1057 1222 @command(b'perfparents', formatteropts)
1058 1223 def perfparents(ui, repo, **opts):
1224 """benchmark the time necessary to fetch one changeset's parents.
1225
1226 The fetch is done using the `node identifier`, traversing all object layers
1227 from the repository object. The first N revisions will be used for this
1228 benchmark. N is controlled by the ``perf.parentscount`` config option
1229 (default: 1000).
1230 """
1059 1231 opts = _byteskwargs(opts)
1060 1232 timer, fm = gettimer(ui, opts)
1061 1233 # control the number of commits perfparents iterates over
@@ -2290,13 +2462,18 b' def perfbranchmap(ui, repo, *filternames'
2290 2462 view = repo
2291 2463 else:
2292 2464 view = repo.filtered(filtername)
2465 if util.safehasattr(view._branchcaches, '_per_filter'):
2466 filtered = view._branchcaches._per_filter
2467 else:
2468 # older versions
2469 filtered = view._branchcaches
2293 2470 def d():
2294 2471 if clear_revbranch:
2295 2472 repo.revbranchcache()._clear()
2296 2473 if full:
2297 2474 view._branchcaches.clear()
2298 2475 else:
2299 view._branchcaches.pop(filtername, None)
2476 filtered.pop(filtername, None)
2300 2477 view.branchmap()
2301 2478 return d
2302 2479 # add filter in smaller subset to bigger subset
@@ -2323,10 +2500,15 b' def perfbranchmap(ui, repo, *filternames'
2323 2500 # add unfiltered
2324 2501 allfilters.append(None)
2325 2502
2326 branchcacheread = safeattrsetter(branchmap, b'read')
2503 if util.safehasattr(branchmap.branchcache, 'fromfile'):
2504 branchcacheread = safeattrsetter(branchmap.branchcache, b'fromfile')
2505 branchcacheread.set(classmethod(lambda *args: None))
2506 else:
2507 # older versions
2508 branchcacheread = safeattrsetter(branchmap, b'read')
2509 branchcacheread.set(lambda *args: None)
2327 2510 branchcachewrite = safeattrsetter(branchmap.branchcache, b'write')
2328 branchcacheread.set(lambda repo: None)
2329 branchcachewrite.set(lambda bc, repo: None)
2511 branchcachewrite.set(lambda *args: None)
2330 2512 try:
2331 2513 for name in allfilters:
2332 2514 printname = name
@@ -2470,9 +2652,15 b' def perfbranchmapload(ui, repo, filter=b'
2470 2652
2471 2653 repo.branchmap() # make sure we have a relevant, up to date branchmap
2472 2654
2655 try:
2656 fromfile = branchmap.branchcache.fromfile
2657 except AttributeError:
2658 # older versions
2659 fromfile = branchmap.read
2660
2473 2661 currentfilter = filter
2474 2662 # try once without timer, the filter may not be cached
2475 while branchmap.read(repo) is None:
2663 while fromfile(repo) is None:
2476 2664 currentfilter = subsettable.get(currentfilter)
2477 2665 if currentfilter is None:
2478 2666 raise error.Abort(b'No branchmap cached for %s repo'
@@ -2483,7 +2671,7 b' def perfbranchmapload(ui, repo, filter=b'
2483 2671 if clearrevlogs:
2484 2672 clearchangelog(repo)
2485 2673 def bench():
2486 branchmap.read(repo)
2674 fromfile(repo)
2487 2675 timer(bench, setup=setup)
2488 2676 fm.end()
2489 2677
@@ -5,6 +5,5 b' graft tests'
5 5 include make_cffi.py
6 6 include setup_zstd.py
7 7 include zstd.c
8 include zstd_cffi.py
9 8 include LICENSE
10 9 include NEWS.rst
@@ -8,8 +8,18 b' 1.0.0 (not yet released)'
8 8 Actions Blocking Release
9 9 ------------------------
10 10
11 * compression and decompression APIs that support ``io.rawIOBase`` interface
11 * compression and decompression APIs that support ``io.RawIOBase`` interface
12 12 (#13).
13 * ``stream_writer()`` APIs should support ``io.RawIOBase`` interface.
14 * Properly handle non-blocking I/O and partial writes for objects implementing
15 ``io.RawIOBase``.
16 * Make ``write_return_read=True`` the default for objects implementing
17 ``io.RawIOBase``.
18 * Audit for consistent and proper behavior of ``flush()`` and ``close()`` for
19 all objects implementing ``io.RawIOBase``. Open questions: is calling ``close()``
20 on the wrapped stream acceptable, should ``__exit__`` always call ``close()``,
21 should ``close()`` imply ``flush()``, etc.
22 * Consider making reads across frames configurable behavior.
13 23 * Refactor module names so C and CFFI extensions live under ``zstandard``
14 24 package.
15 25 * Overall API design review.
@@ -43,6 +53,11 b' Actions Blocking Release'
43 53 * Consider a ``chunker()`` API for decompression.
44 54 * Consider stats for ``chunker()`` API, including finding the last consumed
45 55 offset of input data.
56 * Consider exposing ``ZSTD_cParam_getBounds()`` and
57 ``ZSTD_dParam_getBounds()`` APIs.
58 * Consider controls over resetting compression contexts (session only, parameters,
59 or session and parameters).
60 * Actually use the CFFI backend in fuzzing tests.
46 61
47 62 Other Actions Not Blocking Release
48 63 ---------------------------------------
@@ -51,6 +66,207 b' Other Actions Not Blocking Release'
51 66 * API for ensuring max memory ceiling isn't exceeded.
52 67 * Move off nose for testing.
53 68
69 0.11.0 (released 2019-02-24)
70 ============================
71
72 Backwards Compatibility Notes
73 -----------------------------
74
75 * ``ZstdDecompressor.read()`` now allows reading sizes of ``-1`` or ``0``
76 and defaults to ``-1``, per the documented behavior of
77 ``io.RawIOBase.read()``. Previously, we required an argument that was
78 a positive value.
79 * The ``readline()``, ``readlines()``, ``__iter__``, and ``__next__`` methods
80 of ``ZstdDecompressionReader()`` now raise ``io.UnsupportedOperation``
81 instead of ``NotImplementedError``.
82 * ``ZstdDecompressor.stream_reader()`` now accepts a ``read_across_frames``
83 argument. The default value will likely be changed in a future release
84 and consumers are advised to pass the argument to avoid unwanted change
85 of behavior in the future.
86 * ``setup.py`` now always disables the CFFI backend if the installed
87 CFFI package does not meet the minimum version requirements. Before, it was
88 possible for the CFFI backend to be generated and a run-time error to
89 occur.
90 * In the CFFI backend, ``CompressionReader`` and ``DecompressionReader``
91 were renamed to ``ZstdCompressionReader`` and ``ZstdDecompressionReader``,
92 respectively so naming is identical to the C extension. This should have
93 no meaningful end-user impact, as instances aren't meant to be
94 constructed directly.
95 * ``ZstdDecompressor.stream_writer()`` now accepts a ``write_return_read``
96 argument to control whether ``write()`` returns the number of bytes
97 read from the source / written to the decompressor. It defaults to off,
98 which preserves the existing behavior of returning the number of bytes
99 emitted from the decompressor. The default will change in a future release
100 so behavior aligns with the specified behavior of ``io.RawIOBase``.
101 * ``ZstdDecompressionWriter.__exit__`` now calls ``self.close()``. This
102 will result in that stream plus the underlying stream being closed as
103 well. If this behavior is not desirable, do not use instances as
104 context managers.
105 * ``ZstdCompressor.stream_writer()`` now accepts a ``write_return_read``
106 argument to control whether ``write()`` returns the number of bytes read
107 from the source / written to the compressor. It defaults to off, which
108 preserves the existing behavior of returning the number of bytes emitted
109 from the compressor. The default will change in a future release so
110 behavior aligns with the specified behavior of ``io.RawIOBase``.
111 * ``ZstdCompressionWriter.__exit__`` now calls ``self.close()``. This will
112 result in that stream plus any underlying stream being closed as well. If
113 this behavior is not desirable, do not use instances as context managers.
114 * ``ZstdDecompressionWriter`` no longer requires being used as a context
115 manager (#57).
116 * ``ZstdCompressionWriter`` no longer requires being used as a context
117 manager (#57).
118 * The ``overlap_size_log`` attribute on ``CompressionParameters`` instances
119 has been deprecated and will be removed in a future release. The
120 ``overlap_log`` attribute should be used instead.
121 * The ``overlap_size_log`` argument to ``CompressionParameters`` has been
122 deprecated and will be removed in a future release. The ``overlap_log``
123 argument should be used instead.
124 * The ``ldm_hash_every_log`` attribute on ``CompressionParameters`` instances
125 has been deprecated and will be removed in a future release. The
126 ``ldm_hash_rate_log`` attribute should be used instead.
127 * The ``ldm_hash_every_log`` argument to ``CompressionParameters`` has been
128 deprecated and will be removed in a future release. The ``ldm_hash_rate_log``
129 argument should be used instead.
130 * The ``compression_strategy`` argument to ``CompressionParameters`` has been
131 deprecated and will be removed in a future release. The ``strategy``
132 argument should be used instead.
133 * The ``SEARCHLENGTH_MIN`` and ``SEARCHLENGTH_MAX`` constants are deprecated
134 and will be removed in a future release. Use ``MINMATCH_MIN`` and
135 ``MINMATCH_MAX`` instead.
136 * The ``zstd_cffi`` module has been renamed to ``zstandard.cffi``. As had
137 been documented in the ``README`` file since the ``0.9.0`` release, the
138 module should not be imported directly at its new location. Instead,
139 ``import zstandard`` to cause an appropriate backend module to be loaded
140 automatically.
141
142 Bug Fixes
143 ---------
144
145 * CFFI backend could encounter a failure when sending an empty chunk into
146 ``ZstdDecompressionObj.decompress()``. The issue has been fixed.
147 * CFFI backend could encounter an error when calling
148 ``ZstdDecompressionReader.read()`` if there was data remaining in an
149 internal buffer. The issue has been fixed. (#71)
150
151 Changes
152 -------
153
154 * ``ZstdDecompressionObj.decompress()`` now properly handles empty inputs in
155 the CFFI backend.
156 * ``ZstdCompressionReader`` now implements ``read1()`` and ``readinto1()``.
157 These are part of the ``io.BufferedIOBase`` interface.
158 * ``ZstdCompressionReader`` has gained a ``readinto(b)`` method for reading
159 compressed output into an existing buffer.
160 * ``ZstdCompressionReader.read()`` now defaults to ``size=-1`` and accepts
161 read sizes of ``-1`` and ``0``. The new behavior aligns with the documented
162 behavior of ``io.RawIOBase``.
163 * ``ZstdCompressionReader`` now implements ``readall()``. Previously, this
164 method raised ``NotImplementedError``.
165 * ``ZstdDecompressionReader`` now implements ``read1()`` and ``readinto1()``.
166 These are part of the ``io.BufferedIOBase`` interface.
167 * ``ZstdDecompressionReader.read()`` now defaults to ``size=-1`` and accepts
168 read sizes of ``-1`` and ``0``. The new behavior aligns with the documented
169 behavior of ``io.RawIOBase``.
170 * ``ZstdDecompressionReader()`` now implements ``readall()``. Previously, this
171 method raised ``NotImplementedError``.
172 * The ``readline()``, ``readlines()``, ``__iter__``, and ``__next__`` methods
173 of ``ZstdDecompressionReader()`` now raise ``io.UnsupportedOperation``
174 instead of ``NotImplementedError``. This reflects a decision to never
175 implement text-based I/O on (de)compressors and keep the low-level API
176 operating in the binary domain. (#13)
177 * ``README.rst`` now documents how to achieve linewise iteration using
178 an ``io.TextIOWrapper`` with a ``ZstdDecompressionReader``.
179 * ``ZstdDecompressionReader`` has gained a ``readinto(b)`` method for
180 reading decompressed output into an existing buffer. This allows chaining
181 to an ``io.TextIOWrapper`` on Python 3 without using an ``io.BufferedReader``.
182 * ``ZstdDecompressor.stream_reader()`` now accepts a ``read_across_frames``
183 argument to control behavior when the input data has multiple zstd
184 *frames*. When ``False`` (the default for backwards compatibility), a
185 ``read()`` will stop when the end of a zstd *frame* is encountered. When
186 ``True``, ``read()`` can potentially return data spanning multiple zstd
187 *frames*. The default will likely be changed to ``True`` in a future
188 release.
189 * ``setup.py`` now performs CFFI version sniffing and disables the CFFI
190 backend if CFFI is too old. Previously, we only used ``install_requires``
191 to enforce the CFFI version and not all build modes would properly enforce
192 the minimum CFFI version. (#69)
193 * CFFI's ``ZstdDecompressionReader.read()`` now properly handles data
194 remaining in any internal buffer. Before, repeated ``read()`` could
195 result in *random* errors. (#71)
196 * Upgraded various Python packages in CI environment.
197 * Upgrade to hypothesis 4.5.11.
198 * In the CFFI backend, ``CompressionReader`` and ``DecompressionReader``
199 were renamed to ``ZstdCompressionReader`` and ``ZstdDecompressionReader``,
200 respectively.
201 * ``ZstdDecompressor.stream_writer()`` now accepts a ``write_return_read``
202 argument to control whether ``write()`` returns the number of bytes read
203 from the source. It defaults to ``False`` to preserve backwards
204 compatibility.
205 * ``ZstdDecompressor.stream_writer()`` now implements the ``io.RawIOBase``
206 interface and behaves as a proper stream object.
207 * ``ZstdCompressor.stream_writer()`` now accepts a ``write_return_read``
208 argument to control whether ``write()`` returns the number of bytes read
209 from the source. It defaults to ``False`` to preserve backwards
210 compatibility.
211 * ``ZstdCompressionWriter`` now implements the ``io.RawIOBase`` interface and
212 behaves as a proper stream object. ``close()`` will now close the stream
213 and the underlying stream (if possible). ``__exit__`` will now call
214 ``close()``. Methods like ``writable()`` and ``fileno()`` are implemented.
215 * ``ZstdDecompressionWriter`` no longer must be used as a context manager.
216 * ``ZstdCompressionWriter`` no longer must be used as a context manager.
217 When not using as a context manager, it is important to call
218 ``flush(FLUSH_FRAME)`` or the compression stream won't be properly
219 terminated and decoders may complain about malformed input.
220 * ``ZstdCompressionWriter.flush()`` (what is returned from
221 ``ZstdCompressor.stream_writer()``) now accepts an argument controlling the
222 flush behavior. Its value can be one of the new constants
223 ``FLUSH_BLOCK`` or ``FLUSH_FRAME``.
224 * ``ZstdDecompressionObj`` instances now have a ``flush([length=None])`` method.
225 This provides parity with standard library equivalent types. (#65)
226 * ``CompressionParameters`` no longer redundantly store individual compression
227 parameters on each instance. Instead, compression parameters are stored inside
228 the underlying ``ZSTD_CCtx_params`` instance. Attributes for obtaining
229 parameters are now properties rather than instance variables.
230 * Exposed the ``STRATEGY_BTULTRA2`` constant.
231 * ``CompressionParameters`` instances now expose an ``overlap_log`` attribute.
232 This behaves identically to the ``overlap_size_log`` attribute.
233 * ``CompressionParameters()`` now accepts an ``overlap_log`` argument that
234 behaves identically to the ``overlap_size_log`` argument. An error will be
235 raised if both arguments are specified.
236 * ``CompressionParameters`` instances now expose an ``ldm_hash_rate_log``
237 attribute. This behaves identically to the ``ldm_hash_every_log`` attribute.
238 * ``CompressionParameters()`` now accepts a ``ldm_hash_rate_log`` argument that
239 behaves identically to the ``ldm_hash_every_log`` argument. An error will be
240 raised if both arguments are specified.
241 * ``CompressionParameters()`` now accepts a ``strategy`` argument that behaves
242 identically to the ``compression_strategy`` argument. An error will be raised
243 if both arguments are specified.
244 * The ``MINMATCH_MIN`` and ``MINMATCH_MAX`` constants were added. They are
245 semantically equivalent to the old ``SEARCHLENGTH_MIN`` and
246 ``SEARCHLENGTH_MAX`` constants.
247 * Bundled zstandard library upgraded from 1.3.7 to 1.3.8.
248 * ``setup.py`` denotes support for Python 3.7 (Python 3.7 was supported and
249 tested in the 0.10 release).
250 * ``zstd_cffi`` module has been renamed to ``zstandard.cffi``.
251 * ``ZstdCompressor.stream_writer()`` now reuses a buffer in order to avoid
252 allocating a new buffer for every operation. This should result in faster
253 performance in cases where ``write()`` or ``flush()`` are being called
254 frequently. (#62)
255 * Bundled zstandard library upgraded from 1.3.6 to 1.3.7.
256
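The deprecated and new argument pairs listed above (for example ``overlap_size_log`` and ``overlap_log``) can be pictured as one value reachable under two names, with an error when both are given. The sketch below is an assumption about the shape of such a check, not the library's implementation: ``resolve_alias`` is a hypothetical helper, and the ``-1`` "not provided" sentinel and the ``ValueError`` exception type are illustrative choices.

```python
UNSET = -1  # Assumed sentinel meaning "the caller did not pass this argument".


def resolve_alias(new_name, new_value, old_name, old_value, default=0):
    """Return the value to use for a parameter that has a deprecated alias."""
    if new_value != UNSET and old_value != UNSET:
        # Both spellings were supplied; refuse to guess which one wins.
        raise ValueError(
            'cannot specify both %s and %s' % (new_name, old_name)
        )
    if new_value != UNSET:
        return new_value
    if old_value != UNSET:
        return old_value
    return default


from_new = resolve_alias('overlap_log', 6, 'overlap_size_log', UNSET)
from_old = resolve_alias('overlap_log', UNSET, 'overlap_size_log', 4)
```

Callers migrating off the deprecated names only need to rename the keyword; the value keeps the same meaning.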
257 0.10.2 (released 2018-11-03)
258 ============================
259
260 Bug Fixes
261 ---------
262
263 * ``zstd_cffi.py`` added to ``setup.py`` (#60).
264
265 Changes
266 -------
267
268 * Change some integer casts to avoid ``ssize_t`` (#61).
269
54 270 0.10.1 (released 2018-10-08)
55 271 ============================
56 272
@@ -20,9 +20,9 b' https://github.com/indygreg/python-zstan'
20 20 Requirements
21 21 ============
22 22
23 This extension is designed to run with Python 2.7, 3.4, 3.5, and 3.6
24 on common platforms (Linux, Windows, and OS X). x86 and x86_64 are well-tested
25 on Windows. Only x86_64 is well-tested on Linux and macOS.
23 This extension is designed to run with Python 2.7, 3.4, 3.5, 3.6, and 3.7
24 on common platforms (Linux, Windows, and OS X). On PyPy (both PyPy2 and PyPy3) we support version 6.0.0 and above.
25 x86 and x86_64 are well-tested on Windows. Only x86_64 is well-tested on Linux and macOS.
26 26
27 27 Installing
28 28 ==========
@@ -215,7 +215,7 b' Instances can also be used as context ma'
215 215
216 216 # Do something with compressed chunk.
217 217
218 When the context manager exists or ``close()`` is called, the stream is closed,
218 When the context manager exits or ``close()`` is called, the stream is closed,
219 219 underlying resources are released, and future operations against the compression
220 220 stream will fail.
221 221
@@ -251,8 +251,54 b' emitted so far.'
251 251 Streaming Input API
252 252 ^^^^^^^^^^^^^^^^^^^
253 253
254 ``stream_writer(fh)`` (which behaves as a context manager) allows you to *stream*
255 data into a compressor.::
254 ``stream_writer(fh)`` allows you to *stream* data into a compressor.
255
256 Returned instances implement the ``io.RawIOBase`` interface. Only methods
257 that involve writing will do useful things.
258
259 The argument to ``stream_writer()`` must have a ``write(data)`` method. As
260 compressed data is available, ``write()`` will be called with the compressed
261 data as its argument. Many common Python types implement ``write()``, including
262 open file handles and ``io.BytesIO``.
263
264 The ``write(data)`` method is used to feed data into the compressor.
265
266 The ``flush([flush_mode=FLUSH_BLOCK])`` method can be called to evict whatever
267 data remains within the compressor's internal state into the output object. This
268 may result in 0 or more ``write()`` calls to the output object. This method
269 accepts an optional ``flush_mode`` argument to control the flushing behavior.
270 Its value can be any of the ``FLUSH_*`` constants.
271
272 Both ``write()`` and ``flush()`` return the number of bytes written to the
273 object's ``write()``. In many cases, small inputs do not accumulate enough
274 data to cause a write and ``write()`` will return ``0``.
275
276 Calling ``close()`` will mark the stream as closed and subsequent I/O
277 operations will raise ``ValueError`` (per the documented behavior of
278 ``io.RawIOBase``). ``close()`` will also call ``close()`` on the underlying
279 stream if such a method exists.
280
281 Typical usage is as follows::
282
283 cctx = zstd.ZstdCompressor(level=10)
284 compressor = cctx.stream_writer(fh)
285
286 compressor.write(b'chunk 0\n')
287 compressor.write(b'chunk 1\n')
288 compressor.flush()
289 # Receiver will be able to decode ``chunk 0\nchunk 1\n`` at this point.
290 # Receiver is also expecting more data in the zstd *frame*.
291
292 compressor.write(b'chunk 2\n')
293 compressor.flush(zstd.FLUSH_FRAME)
294 # Receiver will be able to decode ``chunk 0\nchunk 1\nchunk 2\n``.
295 # Receiver is expecting no more data, as the zstd frame is closed.
296 # Any future calls to ``write()`` at this point will construct a new
297 # zstd frame.
298
299 Instances can be used as context managers. Exiting the context manager is
300 the equivalent of calling ``close()``, which is equivalent to calling
301 ``flush(zstd.FLUSH_FRAME)``::
256 302
257 303 cctx = zstd.ZstdCompressor(level=10)
258 304 with cctx.stream_writer(fh) as compressor:
@@ -260,22 +306,12 b' data into a compressor.::'
260 306 compressor.write(b'chunk 1')
261 307 ...
262 308
263 The argument to ``stream_writer()`` must have a ``write(data)`` method. As
264 compressed data is available, ``write()`` will be called with the compressed
265 data as its argument. Many common Python types implement ``write()``, including
266 open file handles and ``io.BytesIO``.
309 .. important::
267 310
268 ``stream_writer()`` returns an object representing a streaming compressor
269 instance. It **must** be used as a context manager. That object's
270 ``write(data)`` method is used to feed data into the compressor.
271
272 A ``flush()`` method can be called to evict whatever data remains within the
273 compressor's internal state into the output object. This may result in 0 or
274 more ``write()`` calls to the output object.
275
276 Both ``write()`` and ``flush()`` return the number of bytes written to the
277 object's ``write()``. In many cases, small inputs do not accumulate enough
278 data to cause a write and ``write()`` will return ``0``.
311 If ``flush(FLUSH_FRAME)`` is not called, emitted data doesn't constitute
312 a full zstd *frame* and consumers of this data may complain about malformed
313 input. It is recommended to use instances as a context manager to ensure
314 *frames* are properly finished.
279 315
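The difference between flushing a block and finishing a frame can be seen with any streaming compressor. The sketch below deliberately uses the standard library's ``zlib`` as a stand-in, not zstd: ``Z_SYNC_FLUSH`` plays a role loosely analogous to ``FLUSH_BLOCK`` and ``Z_FINISH`` to ``FLUSH_FRAME``. It illustrates the concept only and says nothing about zstd's byte-level format.

```python
import zlib

compressor = zlib.compressobj()
decompressor = zlib.decompressobj()

# Flush without finishing: everything written so far becomes decodable,
# but the receiver still expects more data in the stream.
partial = compressor.compress(b'chunk 0\n')
partial += compressor.flush(zlib.Z_SYNC_FLUSH)
decoded_so_far = decompressor.decompress(partial)
finished_after_block_flush = decompressor.eof

# Finish the stream: the receiver now sees a complete, terminated stream.
final = compressor.compress(b'chunk 1\n')
final += compressor.flush(zlib.Z_FINISH)
decoded_rest = decompressor.decompress(final)
finished_after_final_flush = decompressor.eof
```

If the final flush is skipped, the receiver never observes the end of the stream, which is the situation the note above warns about.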
280 316 If the size of the data being fed to this streaming compressor is known,
281 317 you can declare it before compression begins::
@@ -310,6 +346,14 b' Thte total number of bytes written so fa'
310 346 ...
311 347 total_written = compressor.tell()
312 348
349 ``stream_writer()`` accepts a ``write_return_read`` boolean argument to control
350 the return value of ``write()``. When ``False`` (the default), ``write()`` returns
351 the number of bytes written to the underlying object via its ``write()``. When
352 ``True``, ``write()`` returns the number of bytes read from the input that
353 were subsequently written to the compressor. ``True`` is the *proper* behavior
354 for ``write()`` as specified by the ``io.RawIOBase`` interface and will become
355 the default value in a future release.
356
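The two return-value conventions can be compared side by side with a toy writer. ``ToyCompressionWriter`` below is hypothetical and backed by the standard library's ``zlib`` rather than zstd; only the ``write_return_read`` switch mirrors the argument described above.

```python
import io
import zlib


class ToyCompressionWriter(io.RawIOBase):
    """Hypothetical zlib-backed stand-in for a compressing stream writer."""

    def __init__(self, dest, write_return_read=False):
        self._dest = dest
        self._compressor = zlib.compressobj()
        self._write_return_read = write_return_read

    def writable(self):
        return True

    def write(self, data):
        chunk = self._compressor.compress(data)
        # Small inputs are often buffered, so nothing may be emitted yet.
        emitted = self._dest.write(chunk) if chunk else 0
        if self._write_return_read:
            # io.RawIOBase convention: bytes consumed from the caller.
            return len(data)
        # Legacy convention: bytes emitted to the underlying object.
        return emitted


legacy_dest = io.BytesIO()
legacy_count = ToyCompressionWriter(legacy_dest).write(b'chunk 0\n')

proper_dest = io.BytesIO()
proper_count = ToyCompressionWriter(
    proper_dest, write_return_read=True
).write(b'chunk 0\n')
```

Code that loops until all input is consumed, as generic ``io`` helpers do, relies on the second convention, which is why passing the argument explicitly is the safer choice.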
313 357 Streaming Output API
314 358 ^^^^^^^^^^^^^^^^^^^^
315 359
@@ -654,27 +698,63 b' will raise ``ValueError`` if attempted.'
654 698 ``tell()`` returns the number of decompressed bytes read so far.
655 699
656 700 Not all I/O methods are implemented. Notably missing is support for
657 ``readline()``, ``readlines()``, and linewise iteration support. Support for
658 these is planned for a future release.
701 ``readline()``, ``readlines()``, and linewise iteration support. This is
702 because streams operate on binary data - not text data. If you want to
703 convert decompressed output to text, you can chain an ``io.TextIOWrapper``
704 to the stream::
705
706 with open(path, 'rb') as fh:
707 dctx = zstd.ZstdDecompressor()
708 stream_reader = dctx.stream_reader(fh)
709 text_stream = io.TextIOWrapper(stream_reader, encoding='utf-8')
710
711 for line in text_stream:
712 ...
713
714 The ``read_across_frames`` argument to ``stream_reader()`` controls the
715 behavior of read operations when the end of a zstd *frame* is encountered.
716 When ``False`` (the default), a read will complete when the end of a
717 zstd *frame* is encountered. When ``True``, a read can potentially
718 return data spanning multiple zstd *frames*.
659 719
660 720 Streaming Input API
661 721 ^^^^^^^^^^^^^^^^^^^
662 722
663 ``stream_writer(fh)`` can be used to incrementally send compressed data to a
664 decompressor.::
723 ``stream_writer(fh)`` allows you to *stream* data into a decompressor.
724
725 Returned instances implement the ``io.RawIOBase`` interface. Only methods
726 that involve writing will do useful things.
727
728 The argument to ``stream_writer()`` is typically an object that also implements
729 ``io.RawIOBase``. But any object with a ``write(data)`` method will work. Many
730 common Python types conform to this interface, including open file handles
731 and ``io.BytesIO``.
732
733 Behavior is similar to ``ZstdCompressor.stream_writer()``: compressed data
734 is sent to the decompressor by calling ``write(data)`` and decompressed
735 output is written to the underlying stream by calling its ``write(data)``
736 method::
665 737
666 738 dctx = zstd.ZstdDecompressor()
667 with dctx.stream_writer(fh) as decompressor:
668 decompressor.write(compressed_data)
739 decompressor = dctx.stream_writer(fh)
669 740
670 This behaves similarly to ``zstd.ZstdCompressor``: compressed data is written to
671 the decompressor by calling ``write(data)`` and decompressed output is written
672 to the output object by calling its ``write(data)`` method.
741 decompressor.write(compressed_data)
742 ...
743
673 744
674 745 Calls to ``write()`` will return the number of bytes written to the output
675 746 object. Not all inputs will result in bytes being written, so return values
676 747 of ``0`` are possible.
677 748
749 Like the ``stream_writer()`` compressor, instances can be used as context
750 managers. However, using an instance as a context manager adds no special
751 behavior and offers little to no benefit.
752
753 Calling ``close()`` will mark the stream as closed and subsequent I/O operations
754 will raise ``ValueError`` (per the documented behavior of ``io.RawIOBase``).
755 ``close()`` will also call ``close()`` on the underlying stream if such a
756 method exists.
757
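The ``close()`` behavior described above can be sketched with a small wrapper. ``ToyWriter`` is hypothetical and does no decompression; it only shows close propagation to the underlying stream and the ``ValueError`` raised by I/O on a closed stream.

```python
import io


class ToyWriter(io.RawIOBase):
    """Hypothetical pass-through writer illustrating close() propagation."""

    def __init__(self, dest):
        self._dest = dest

    def writable(self):
        return True

    def write(self, data):
        if self.closed:
            raise ValueError('stream is closed')
        return self._dest.write(data)

    def close(self):
        if self.closed:
            return
        super().close()
        # Close the underlying stream only if it exposes close().
        closer = getattr(self._dest, 'close', None)
        if closer is not None:
            closer()


dest = io.BytesIO()
writer = ToyWriter(dest)
written = writer.write(b'data')
writer.close()
dest_closed = dest.closed
```

Because the underlying stream is closed too, callers that need to keep using it after the writer is done should avoid calling ``close()`` on the writer.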
678 758 The size of the chunks passed to the destination's ``write()`` can be specified::
679 759
680 760 dctx = zstd.ZstdDecompressor()
@@ -687,6 +767,13 b' You can see how much memory is being use'
687 767 with dctx.stream_writer(fh) as decompressor:
688 768 byte_size = decompressor.memory_size()
689 769
770 ``stream_writer()`` accepts a ``write_return_read`` boolean argument to control
771 the return value of ``write()``. When ``False`` (the default), ``write()``
772 returns the number of bytes written to the underlying stream via its ``write()``.
773 When ``True``, ``write()`` returns the number of bytes read from the input.
774 ``True`` is the *proper* behavior for ``write()`` as specified by the
775 ``io.RawIOBase`` interface and will become the default in a future release.
776
690 777 Streaming Output API
691 778 ^^^^^^^^^^^^^^^^^^^^
692 779
@@ -791,6 +878,10 b' these temporary chunks by passing ``writ'
791 878 memory (re)allocations, this streaming decompression API isn't as
792 879 efficient as other APIs.
793 880
881 For compatibility with the standard library APIs, instances expose a
882 ``flush([length=None])`` method. This method no-ops and has no meaningful
883 side-effects, making it safe to call any time.
884
794 885 Batch Decompression API
795 886 ^^^^^^^^^^^^^^^^^^^^^^^
796 887
@@ -1147,18 +1238,21 b' follows:'
1147 1238 * search_log
1148 1239 * min_match
1149 1240 * target_length
1150 * compression_strategy
1241 * strategy
1242 * compression_strategy (deprecated: same as ``strategy``)
1151 1243 * write_content_size
1152 1244 * write_checksum
1153 1245 * write_dict_id
1154 1246 * job_size
1155 * overlap_size_log
1247 * overlap_log
1248 * overlap_size_log (deprecated: same as ``overlap_log``)
1156 1249 * force_max_window
1157 1250 * enable_ldm
1158 1251 * ldm_hash_log
1159 1252 * ldm_min_match
1160 1253 * ldm_bucket_size_log
1161 * ldm_hash_every_log
1254 * ldm_hash_rate_log
1255 * ldm_hash_every_log (deprecated: same as ``ldm_hash_rate_log``)
1162 1256 * threads
1163 1257
1164 1258 Some of these are very low-level settings. It may help to consult the official
@@ -1240,6 +1334,13 b' FRAME_HEADER'
1240 1334 MAGIC_NUMBER
1241 1335 Frame header as an integer
1242 1336
1337 FLUSH_BLOCK
1338 Flushing behavior that flushes the current zstd block. A decompressor will
1339 be able to decode all data fed into the compressor so far.
1340 FLUSH_FRAME
1341 Flushing behavior that ends the current zstd frame. Any new data fed
1342 to the compressor will start a new frame.
1343
1243 1344 CONTENTSIZE_UNKNOWN
1244 1345 Value for content size when the content size is unknown.
1245 1346 CONTENTSIZE_ERROR
@@ -1261,10 +1362,18 b' SEARCHLOG_MIN'
1261 1362 Minimum value for compression parameter
1262 1363 SEARCHLOG_MAX
1263 1364 Maximum value for compression parameter
1365 MINMATCH_MIN
1366 Minimum value for compression parameter
1367 MINMATCH_MAX
1368 Maximum value for compression parameter
1264 1369 SEARCHLENGTH_MIN
1265 1370 Minimum value for compression parameter
1371
1372 Deprecated: use ``MINMATCH_MIN``
1266 1373 SEARCHLENGTH_MAX
1267 1374 Maximum value for compression parameter
1375
1376 Deprecated: use ``MINMATCH_MAX``
1268 1377 TARGETLENGTH_MIN
1269 1378 Minimum value for compression parameter
1270 1379 STRATEGY_FAST
@@ -1283,6 +1392,8 b' STRATEGY_BTOPT'
1283 1392 Compression strategy
1284 1393 STRATEGY_BTULTRA
1285 1394 Compression strategy
1395 STRATEGY_BTULTRA2
1396 Compression strategy
1286 1397
1287 1398 FORMAT_ZSTD1
1288 1399 Zstandard frame format
@@ -43,7 +43,7 b' static PyObject* ZstdCompressionChunkerI'
43 43 /* If we have data left in the input, consume it. */
44 44 while (chunker->input.pos < chunker->input.size) {
45 45 Py_BEGIN_ALLOW_THREADS
46 zresult = ZSTD_compress_generic(chunker->compressor->cctx, &chunker->output,
46 zresult = ZSTD_compressStream2(chunker->compressor->cctx, &chunker->output,
47 47 &chunker->input, ZSTD_e_continue);
48 48 Py_END_ALLOW_THREADS
49 49
@@ -104,7 +104,7 b' static PyObject* ZstdCompressionChunkerI'
104 104 }
105 105
106 106 Py_BEGIN_ALLOW_THREADS
107 zresult = ZSTD_compress_generic(chunker->compressor->cctx, &chunker->output,
107 zresult = ZSTD_compressStream2(chunker->compressor->cctx, &chunker->output,
108 108 &chunker->input, zFlushMode);
109 109 Py_END_ALLOW_THREADS
110 110
@@ -298,13 +298,9 b' static PyObject* ZstdCompressionDict_pre'
298 298 cParams = ZSTD_getCParams(level, 0, self->dictSize);
299 299 }
300 300 else {
301 cParams.chainLog = compressionParams->chainLog;
302 cParams.hashLog = compressionParams->hashLog;
303 cParams.searchLength = compressionParams->minMatch;
304 cParams.searchLog = compressionParams->searchLog;
305 cParams.strategy = compressionParams->compressionStrategy;
306 cParams.targetLength = compressionParams->targetLength;
307 cParams.windowLog = compressionParams->windowLog;
301 if (to_cparams(compressionParams, &cParams)) {
302 return NULL;
303 }
308 304 }
309 305
310 306 assert(!self->cdict);
@@ -10,7 +10,7 b''
10 10
11 11 extern PyObject* ZstdError;
12 12
13 int set_parameter(ZSTD_CCtx_params* params, ZSTD_cParameter param, unsigned value) {
13 int set_parameter(ZSTD_CCtx_params* params, ZSTD_cParameter param, int value) {
14 14 size_t zresult = ZSTD_CCtxParam_setParameter(params, param, value);
15 15 if (ZSTD_isError(zresult)) {
16 16 PyErr_Format(ZstdError, "unable to set compression context parameter: %s",
@@ -23,28 +23,41 b' int set_parameter(ZSTD_CCtx_params* para'
23 23
24 24 #define TRY_SET_PARAMETER(params, param, value) if (set_parameter(params, param, value)) return -1;
25 25
26 #define TRY_COPY_PARAMETER(source, dest, param) { \
27 int result; \
28 size_t zresult = ZSTD_CCtxParam_getParameter(source, param, &result); \
29 if (ZSTD_isError(zresult)) { \
30 return 1; \
31 } \
32 zresult = ZSTD_CCtxParam_setParameter(dest, param, result); \
33 if (ZSTD_isError(zresult)) { \
34 return 1; \
35 } \
36 }
37
26 38 int set_parameters(ZSTD_CCtx_params* params, ZstdCompressionParametersObject* obj) {
27 TRY_SET_PARAMETER(params, ZSTD_p_format, obj->format);
28 TRY_SET_PARAMETER(params, ZSTD_p_compressionLevel, (unsigned)obj->compressionLevel);
29 TRY_SET_PARAMETER(params, ZSTD_p_windowLog, obj->windowLog);
30 TRY_SET_PARAMETER(params, ZSTD_p_hashLog, obj->hashLog);
31 TRY_SET_PARAMETER(params, ZSTD_p_chainLog, obj->chainLog);
32 TRY_SET_PARAMETER(params, ZSTD_p_searchLog, obj->searchLog);
33 TRY_SET_PARAMETER(params, ZSTD_p_minMatch, obj->minMatch);
34 TRY_SET_PARAMETER(params, ZSTD_p_targetLength, obj->targetLength);
35 TRY_SET_PARAMETER(params, ZSTD_p_compressionStrategy, obj->compressionStrategy);
36 TRY_SET_PARAMETER(params, ZSTD_p_contentSizeFlag, obj->contentSizeFlag);
37 TRY_SET_PARAMETER(params, ZSTD_p_checksumFlag, obj->checksumFlag);
38 TRY_SET_PARAMETER(params, ZSTD_p_dictIDFlag, obj->dictIDFlag);
39 TRY_SET_PARAMETER(params, ZSTD_p_nbWorkers, obj->threads);
40 TRY_SET_PARAMETER(params, ZSTD_p_jobSize, obj->jobSize);
41 TRY_SET_PARAMETER(params, ZSTD_p_overlapSizeLog, obj->overlapSizeLog);
42 TRY_SET_PARAMETER(params, ZSTD_p_forceMaxWindow, obj->forceMaxWindow);
43 TRY_SET_PARAMETER(params, ZSTD_p_enableLongDistanceMatching, obj->enableLongDistanceMatching);
44 TRY_SET_PARAMETER(params, ZSTD_p_ldmHashLog, obj->ldmHashLog);
45 TRY_SET_PARAMETER(params, ZSTD_p_ldmMinMatch, obj->ldmMinMatch);
46 TRY_SET_PARAMETER(params, ZSTD_p_ldmBucketSizeLog, obj->ldmBucketSizeLog);
47 TRY_SET_PARAMETER(params, ZSTD_p_ldmHashEveryLog, obj->ldmHashEveryLog);
39 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_nbWorkers);
40
41 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_format);
42 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_compressionLevel);
43 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_windowLog);
44 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_hashLog);
45 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_chainLog);
46 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_searchLog);
47 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_minMatch);
48 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_targetLength);
49 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_strategy);
50 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_contentSizeFlag);
51 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_checksumFlag);
52 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_dictIDFlag);
53 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_jobSize);
54 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_overlapLog);
55 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_forceMaxWindow);
56 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_enableLongDistanceMatching);
57 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_ldmHashLog);
58 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_ldmMinMatch);
59 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_ldmBucketSizeLog);
60 TRY_COPY_PARAMETER(obj->params, params, ZSTD_c_ldmHashRateLog);
48 61
49 62 return 0;
50 63 }
@@ -64,6 +77,41 b' int reset_params(ZstdCompressionParamete'
64 77 return set_parameters(params->params, params);
65 78 }
66 79
80 #define TRY_GET_PARAMETER(params, param, value) { \
81 size_t zresult = ZSTD_CCtxParam_getParameter(params, param, value); \
82 if (ZSTD_isError(zresult)) { \
83 PyErr_Format(ZstdError, "unable to retrieve parameter: %s", ZSTD_getErrorName(zresult)); \
84 return 1; \
85 } \
86 }
87
88 int to_cparams(ZstdCompressionParametersObject* params, ZSTD_compressionParameters* cparams) {
89 int value;
90
91 TRY_GET_PARAMETER(params->params, ZSTD_c_windowLog, &value);
92 cparams->windowLog = value;
93
94 TRY_GET_PARAMETER(params->params, ZSTD_c_chainLog, &value);
95 cparams->chainLog = value;
96
97 TRY_GET_PARAMETER(params->params, ZSTD_c_hashLog, &value);
98 cparams->hashLog = value;
99
100 TRY_GET_PARAMETER(params->params, ZSTD_c_searchLog, &value);
101 cparams->searchLog = value;
102
103 TRY_GET_PARAMETER(params->params, ZSTD_c_minMatch, &value);
104 cparams->minMatch = value;
105
106 TRY_GET_PARAMETER(params->params, ZSTD_c_targetLength, &value);
107 cparams->targetLength = value;
108
109 TRY_GET_PARAMETER(params->params, ZSTD_c_strategy, &value);
110 cparams->strategy = value;
111
112 return 0;
113 }
114
67 115 static int ZstdCompressionParameters_init(ZstdCompressionParametersObject* self, PyObject* args, PyObject* kwargs) {
68 116 static char* kwlist[] = {
69 117 "format",
@@ -75,50 +123,60 @@ static int ZstdCompressionParameters_ini
75 123 "min_match",
76 124 "target_length",
77 125 "compression_strategy",
126 "strategy",
78 127 "write_content_size",
79 128 "write_checksum",
80 129 "write_dict_id",
81 130 "job_size",
131 "overlap_log",
82 132 "overlap_size_log",
83 133 "force_max_window",
84 134 "enable_ldm",
85 135 "ldm_hash_log",
86 136 "ldm_min_match",
87 137 "ldm_bucket_size_log",
138 "ldm_hash_rate_log",
88 139 "ldm_hash_every_log",
89 140 "threads",
90 141 NULL
91 142 };
92 143
93 unsigned format = 0;
144 int format = 0;
94 145 int compressionLevel = 0;
95 unsigned windowLog = 0;
96 unsigned hashLog = 0;
97 unsigned chainLog = 0;
98 unsigned searchLog = 0;
99 unsigned minMatch = 0;
100 unsigned targetLength = 0;
101 unsigned compressionStrategy = 0;
102 unsigned contentSizeFlag = 1;
103 unsigned checksumFlag = 0;
104 unsigned dictIDFlag = 0;
105 unsigned jobSize = 0;
106 unsigned overlapSizeLog = 0;
107 unsigned forceMaxWindow = 0;
108 unsigned enableLDM = 0;
109 unsigned ldmHashLog = 0;
110 unsigned ldmMinMatch = 0;
111 unsigned ldmBucketSizeLog = 0;
112 unsigned ldmHashEveryLog = 0;
146 int windowLog = 0;
147 int hashLog = 0;
148 int chainLog = 0;
149 int searchLog = 0;
150 int minMatch = 0;
151 int targetLength = 0;
152 int compressionStrategy = -1;
153 int strategy = -1;
154 int contentSizeFlag = 1;
155 int checksumFlag = 0;
156 int dictIDFlag = 0;
157 int jobSize = 0;
158 int overlapLog = -1;
159 int overlapSizeLog = -1;
160 int forceMaxWindow = 0;
161 int enableLDM = 0;
162 int ldmHashLog = 0;
163 int ldmMinMatch = 0;
164 int ldmBucketSizeLog = 0;
165 int ldmHashRateLog = -1;
166 int ldmHashEveryLog = -1;
113 167 int threads = 0;
114 168
115 169 if (!PyArg_ParseTupleAndKeywords(args, kwargs,
116 "|IiIIIIIIIIIIIIIIIIIIi:CompressionParameters",
170 "|iiiiiiiiiiiiiiiiiiiiiiii:CompressionParameters",
117 171 kwlist, &format, &compressionLevel, &windowLog, &hashLog, &chainLog,
118 &searchLog, &minMatch, &targetLength, &compressionStrategy,
119 &contentSizeFlag, &checksumFlag, &dictIDFlag, &jobSize, &overlapSizeLog,
120 &forceMaxWindow, &enableLDM, &ldmHashLog, &ldmMinMatch, &ldmBucketSizeLog,
121 &ldmHashEveryLog, &threads)) {
172 &searchLog, &minMatch, &targetLength, &compressionStrategy, &strategy,
173 &contentSizeFlag, &checksumFlag, &dictIDFlag, &jobSize, &overlapLog,
174 &overlapSizeLog, &forceMaxWindow, &enableLDM, &ldmHashLog, &ldmMinMatch,
175 &ldmBucketSizeLog, &ldmHashRateLog, &ldmHashEveryLog, &threads)) {
176 return -1;
177 }
178
179 if (reset_params(self)) {
122 180 return -1;
123 181 }
124 182
@@ -126,32 +184,70 @@ static int ZstdCompressionParameters_ini
126 184 threads = cpu_count();
127 185 }
128 186
129 self->format = format;
130 self->compressionLevel = compressionLevel;
131 self->windowLog = windowLog;
132 self->hashLog = hashLog;
133 self->chainLog = chainLog;
134 self->searchLog = searchLog;
135 self->minMatch = minMatch;
136 self->targetLength = targetLength;
137 self->compressionStrategy = compressionStrategy;
138 self->contentSizeFlag = contentSizeFlag;
139 self->checksumFlag = checksumFlag;
140 self->dictIDFlag = dictIDFlag;
141 self->threads = threads;
142 self->jobSize = jobSize;
143 self->overlapSizeLog = overlapSizeLog;
144 self->forceMaxWindow = forceMaxWindow;
145 self->enableLongDistanceMatching = enableLDM;
146 self->ldmHashLog = ldmHashLog;
147 self->ldmMinMatch = ldmMinMatch;
148 self->ldmBucketSizeLog = ldmBucketSizeLog;
149 self->ldmHashEveryLog = ldmHashEveryLog;
187 /* We need to set ZSTD_c_nbWorkers before ZSTD_c_jobSize and ZSTD_c_overlapLog
188 * because setting ZSTD_c_nbWorkers resets the other parameters. */
189 TRY_SET_PARAMETER(self->params, ZSTD_c_nbWorkers, threads);
190
191 TRY_SET_PARAMETER(self->params, ZSTD_c_format, format);
192 TRY_SET_PARAMETER(self->params, ZSTD_c_compressionLevel, compressionLevel);
193 TRY_SET_PARAMETER(self->params, ZSTD_c_windowLog, windowLog);
194 TRY_SET_PARAMETER(self->params, ZSTD_c_hashLog, hashLog);
195 TRY_SET_PARAMETER(self->params, ZSTD_c_chainLog, chainLog);
196 TRY_SET_PARAMETER(self->params, ZSTD_c_searchLog, searchLog);
197 TRY_SET_PARAMETER(self->params, ZSTD_c_minMatch, minMatch);
198 TRY_SET_PARAMETER(self->params, ZSTD_c_targetLength, targetLength);
150 199
151 if (reset_params(self)) {
200 if (compressionStrategy != -1 && strategy != -1) {
201 PyErr_SetString(PyExc_ValueError, "cannot specify both compression_strategy and strategy");
202 return -1;
203 }
204
205 if (compressionStrategy != -1) {
206 strategy = compressionStrategy;
207 }
208 else if (strategy == -1) {
209 strategy = 0;
210 }
211
212 TRY_SET_PARAMETER(self->params, ZSTD_c_strategy, strategy);
213 TRY_SET_PARAMETER(self->params, ZSTD_c_contentSizeFlag, contentSizeFlag);
214 TRY_SET_PARAMETER(self->params, ZSTD_c_checksumFlag, checksumFlag);
215 TRY_SET_PARAMETER(self->params, ZSTD_c_dictIDFlag, dictIDFlag);
216 TRY_SET_PARAMETER(self->params, ZSTD_c_jobSize, jobSize);
217
218 if (overlapLog != -1 && overlapSizeLog != -1) {
219 PyErr_SetString(PyExc_ValueError, "cannot specify both overlap_log and overlap_size_log");
152 220 return -1;
153 221 }
154 222
223 if (overlapSizeLog != -1) {
224 overlapLog = overlapSizeLog;
225 }
226 else if (overlapLog == -1) {
227 overlapLog = 0;
228 }
229
230 TRY_SET_PARAMETER(self->params, ZSTD_c_overlapLog, overlapLog);
231 TRY_SET_PARAMETER(self->params, ZSTD_c_forceMaxWindow, forceMaxWindow);
232 TRY_SET_PARAMETER(self->params, ZSTD_c_enableLongDistanceMatching, enableLDM);
233 TRY_SET_PARAMETER(self->params, ZSTD_c_ldmHashLog, ldmHashLog);
234 TRY_SET_PARAMETER(self->params, ZSTD_c_ldmMinMatch, ldmMinMatch);
235 TRY_SET_PARAMETER(self->params, ZSTD_c_ldmBucketSizeLog, ldmBucketSizeLog);
236
237 if (ldmHashRateLog != -1 && ldmHashEveryLog != -1) {
238 PyErr_SetString(PyExc_ValueError, "cannot specify both ldm_hash_rate_log and ldm_hash_every_log");
239 return -1;
240 }
241
242 if (ldmHashEveryLog != -1) {
243 ldmHashRateLog = ldmHashEveryLog;
244 }
245 else if (ldmHashRateLog == -1) {
246 ldmHashRateLog = 0;
247 }
248
249 TRY_SET_PARAMETER(self->params, ZSTD_c_ldmHashRateLog, ldmHashRateLog);
250
155 251 return 0;
156 252 }
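
The three alias checks in the hunk above (``compression_strategy``/``strategy``, ``overlap_size_log``/``overlap_log`` and ``ldm_hash_every_log``/``ldm_hash_rate_log``) repeat one pattern: both integers default to -1, passing both keywords is an error, and an unset pair falls back to 0. A minimal sketch of that pattern in plain C follows. ``resolve_alias`` and ``PARAM_UNSET`` are hypothetical names used only for illustration; the patch writes the checks out inline and defines no such helper.

```c
#include <assert.h>
#include <stddef.h>

/* Sentinel meaning "the caller did not pass this keyword". */
#define PARAM_UNSET (-1)

/*
 * Hypothetical helper, not part of the patch: merge a deprecated keyword
 * and its replacement into a single value.
 *
 * Returns 0 and stores the merged value in *out on success.
 * Returns -1 when both keywords were supplied, leaving *out untouched.
 */
static int resolve_alias(int deprecated, int current, int* out) {
    assert(out != NULL);

    if (deprecated != PARAM_UNSET && current != PARAM_UNSET) {
        return -1;
    }

    if (deprecated != PARAM_UNSET) {
        *out = deprecated;
    }
    else if (current != PARAM_UNSET) {
        *out = current;
    }
    else {
        /* Neither keyword given: use 0, as the patch does. */
        *out = 0;
    }

    return 0;
}
```

One consequence of the sentinel approach is that a caller who explicitly passes -1 cannot be told apart from a caller who omitted the keyword.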
157 253
@@ -259,7 +355,7 @@ ZstdCompressionParametersObject* Compres
259 355
260 356 val = PyDict_GetItemString(kwargs, "min_match");
261 357 if (!val) {
262 val = PyLong_FromUnsignedLong(params.searchLength);
358 val = PyLong_FromUnsignedLong(params.minMatch);
263 359 if (!val) {
264 360 goto cleanup;
265 361 }
@@ -336,6 +432,41 @@ static void ZstdCompressionParameters_de
336 432 PyObject_Del(self);
337 433 }
338 434
435 #define PARAM_GETTER(name, param) PyObject* ZstdCompressionParameters_get_##name(PyObject* self, void* unused) { \
436 int result; \
437 size_t zresult; \
438 ZstdCompressionParametersObject* p = (ZstdCompressionParametersObject*)(self); \
439 zresult = ZSTD_CCtxParam_getParameter(p->params, param, &result); \
440 if (ZSTD_isError(zresult)) { \
441 PyErr_Format(ZstdError, "unable to get compression parameter: %s", \
442 ZSTD_getErrorName(zresult)); \
443 return NULL; \
444 } \
445 return PyLong_FromLong(result); \
446 }
447
448 PARAM_GETTER(format, ZSTD_c_format)
449 PARAM_GETTER(compression_level, ZSTD_c_compressionLevel)
450 PARAM_GETTER(window_log, ZSTD_c_windowLog)
451 PARAM_GETTER(hash_log, ZSTD_c_hashLog)
452 PARAM_GETTER(chain_log, ZSTD_c_chainLog)
453 PARAM_GETTER(search_log, ZSTD_c_searchLog)
454 PARAM_GETTER(min_match, ZSTD_c_minMatch)
455 PARAM_GETTER(target_length, ZSTD_c_targetLength)
456 PARAM_GETTER(compression_strategy, ZSTD_c_strategy)
457 PARAM_GETTER(write_content_size, ZSTD_c_contentSizeFlag)
458 PARAM_GETTER(write_checksum, ZSTD_c_checksumFlag)
459 PARAM_GETTER(write_dict_id, ZSTD_c_dictIDFlag)
460 PARAM_GETTER(job_size, ZSTD_c_jobSize)
461 PARAM_GETTER(overlap_log, ZSTD_c_overlapLog)
462 PARAM_GETTER(force_max_window, ZSTD_c_forceMaxWindow)
463 PARAM_GETTER(enable_ldm, ZSTD_c_enableLongDistanceMatching)
464 PARAM_GETTER(ldm_hash_log, ZSTD_c_ldmHashLog)
465 PARAM_GETTER(ldm_min_match, ZSTD_c_ldmMinMatch)
466 PARAM_GETTER(ldm_bucket_size_log, ZSTD_c_ldmBucketSizeLog)
467 PARAM_GETTER(ldm_hash_rate_log, ZSTD_c_ldmHashRateLog)
468 PARAM_GETTER(threads, ZSTD_c_nbWorkers)
469
339 470 static PyMethodDef ZstdCompressionParameters_methods[] = {
340 471 {
341 472 "from_level",
@@ -352,70 +483,34 @@ static PyMethodDef ZstdCompressionParame
352 483 { NULL, NULL }
353 484 };
354 485
355 static PyMemberDef ZstdCompressionParameters_members[] = {
356 { "format", T_UINT,
357 offsetof(ZstdCompressionParametersObject, format), READONLY,
358 "compression format" },
359 { "compression_level", T_INT,
360 offsetof(ZstdCompressionParametersObject, compressionLevel), READONLY,
361 "compression level" },
362 { "window_log", T_UINT,
363 offsetof(ZstdCompressionParametersObject, windowLog), READONLY,
364 "window log" },
365 { "hash_log", T_UINT,
366 offsetof(ZstdCompressionParametersObject, hashLog), READONLY,
367 "hash log" },
368 { "chain_log", T_UINT,
369 offsetof(ZstdCompressionParametersObject, chainLog), READONLY,
370 "chain log" },
371 { "search_log", T_UINT,
372 offsetof(ZstdCompressionParametersObject, searchLog), READONLY,
373 "search log" },
374 { "min_match", T_UINT,
375 offsetof(ZstdCompressionParametersObject, minMatch), READONLY,
376 "search length" },
377 { "target_length", T_UINT,
378 offsetof(ZstdCompressionParametersObject, targetLength), READONLY,
379 "target length" },
380 { "compression_strategy", T_UINT,
381 offsetof(ZstdCompressionParametersObject, compressionStrategy), READONLY,
382 "compression strategy" },
383 { "write_content_size", T_UINT,
384 offsetof(ZstdCompressionParametersObject, contentSizeFlag), READONLY,
385 "whether to write content size in frames" },
386 { "write_checksum", T_UINT,
387 offsetof(ZstdCompressionParametersObject, checksumFlag), READONLY,
388 "whether to write checksum in frames" },
389 { "write_dict_id", T_UINT,
390 offsetof(ZstdCompressionParametersObject, dictIDFlag), READONLY,
391 "whether to write dictionary ID in frames" },
392 { "threads", T_UINT,
393 offsetof(ZstdCompressionParametersObject, threads), READONLY,
394 "number of threads to use" },
395 { "job_size", T_UINT,
396 offsetof(ZstdCompressionParametersObject, jobSize), READONLY,
397 "size of compression job when using multiple threads" },
398 { "overlap_size_log", T_UINT,
399 offsetof(ZstdCompressionParametersObject, overlapSizeLog), READONLY,
400 "Size of previous input reloaded at the beginning of each job" },
401 { "force_max_window", T_UINT,
402 offsetof(ZstdCompressionParametersObject, forceMaxWindow), READONLY,
403 "force back references to remain smaller than window size" },
404 { "enable_ldm", T_UINT,
405 offsetof(ZstdCompressionParametersObject, enableLongDistanceMatching), READONLY,
406 "whether to enable long distance matching" },
407 { "ldm_hash_log", T_UINT,
408 offsetof(ZstdCompressionParametersObject, ldmHashLog), READONLY,
409 "Size of the table for long distance matching, as a power of 2" },
410 { "ldm_min_match", T_UINT,
411 offsetof(ZstdCompressionParametersObject, ldmMinMatch), READONLY,
412 "minimum size of searched matches for long distance matcher" },
413 { "ldm_bucket_size_log", T_UINT,
414 offsetof(ZstdCompressionParametersObject, ldmBucketSizeLog), READONLY,
415 "log size of each bucket in the LDM hash table for collision resolution" },
416 { "ldm_hash_every_log", T_UINT,
417 offsetof(ZstdCompressionParametersObject, ldmHashEveryLog), READONLY,
418 "frequency of inserting/looking up entries in the LDM hash table" },
486 #define GET_SET_ENTRY(name) { #name, ZstdCompressionParameters_get_##name, NULL, NULL, NULL }
487
488 static PyGetSetDef ZstdCompressionParameters_getset[] = {
489 GET_SET_ENTRY(format),
490 GET_SET_ENTRY(compression_level),
491 GET_SET_ENTRY(window_log),
492 GET_SET_ENTRY(hash_log),
493 GET_SET_ENTRY(chain_log),
494 GET_SET_ENTRY(search_log),
495 GET_SET_ENTRY(min_match),
496 GET_SET_ENTRY(target_length),
497 GET_SET_ENTRY(compression_strategy),
498 GET_SET_ENTRY(write_content_size),
499 GET_SET_ENTRY(write_checksum),
500 GET_SET_ENTRY(write_dict_id),
501 GET_SET_ENTRY(threads),
502 GET_SET_ENTRY(job_size),
503 GET_SET_ENTRY(overlap_log),
504 /* TODO remove this deprecated attribute */
505 { "overlap_size_log", ZstdCompressionParameters_get_overlap_log, NULL, NULL, NULL },
506 GET_SET_ENTRY(force_max_window),
507 GET_SET_ENTRY(enable_ldm),
508 GET_SET_ENTRY(ldm_hash_log),
509 GET_SET_ENTRY(ldm_min_match),
510 GET_SET_ENTRY(ldm_bucket_size_log),
511 GET_SET_ENTRY(ldm_hash_rate_log),
512 /* TODO remove this deprecated attribute */
513 { "ldm_hash_every_log", ZstdCompressionParameters_get_ldm_hash_rate_log, NULL, NULL, NULL },
419 514 { NULL }
420 515 };
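
The ``PARAM_GETTER`` and ``GET_SET_ENTRY`` macros above rely on token pasting (``##``) to build the getter's function name and on stringification (``#``) to build the attribute name, with the deprecated attributes added as hand-written entries that reuse the new getter. The sketch below shows the same idea without the CPython or zstd APIs. Every ``toy_`` and ``TOY_`` name is hypothetical, and the struct fields stand in for values the real code fetches through ``ZSTD_CCtxParam_getParameter``.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    int overlap_log;
    int ldm_hash_rate_log;
} toy_params;

/* Generate one getter per field; ## pastes the field name into the function name. */
#define TOY_GETTER(name) \
    static int toy_get_##name(const toy_params* p) { return p->name; }

TOY_GETTER(overlap_log)
TOY_GETTER(ldm_hash_rate_log)

typedef struct {
    const char* name;
    int (*get)(const toy_params*);
} toy_getset;

/* #name turns the field name into a string, as GET_SET_ENTRY does in the patch. */
#define TOY_ENTRY(name) { #name, toy_get_##name }

static const toy_getset toy_table[] = {
    TOY_ENTRY(overlap_log),
    /* Deprecated alias: old attribute name, new getter. */
    { "overlap_size_log", toy_get_overlap_log },
    TOY_ENTRY(ldm_hash_rate_log),
    /* Deprecated alias: old attribute name, new getter. */
    { "ldm_hash_every_log", toy_get_ldm_hash_rate_log },
    { NULL, NULL } /* sentinel terminating the table */
};

/* Look up an attribute by name. Returns 0 and fills *value, or -1 if unknown. */
static int toy_getattr(const toy_params* p, const char* name, int* value) {
    const toy_getset* entry;

    assert(p != NULL && name != NULL && value != NULL);

    for (entry = toy_table; entry->name != NULL; entry++) {
        if (strcmp(entry->name, name) == 0) {
            *value = entry->get(p);
            return 0;
        }
    }

    return -1;
}
```

Because both names resolve to the same getter, the old and new attributes cannot drift apart while the deprecated one is still exposed.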
421 516
@@ -448,8 +543,8 @@ PyTypeObject ZstdCompressionParametersTy
448 543 0, /* tp_iter */
449 544 0, /* tp_iternext */
450 545 ZstdCompressionParameters_methods, /* tp_methods */
451 ZstdCompressionParameters_members, /* tp_members */
452 0, /* tp_getset */
546 0, /* tp_members */
547 ZstdCompressionParameters_getset, /* tp_getset */
453 548 0, /* tp_base */
454 549 0, /* tp_dict */
455 550 0, /* tp_descr_get */
@@ -128,6 +128,96 @@ static PyObject* reader_tell(ZstdCompres
128 128 return PyLong_FromUnsignedLongLong(self->bytesCompressed);
129 129 }
130 130
131 int read_compressor_input(ZstdCompressionReader* self) {
132 if (self->finishedInput) {
133 return 0;
134 }
135
136 if (self->input.pos != self->input.size) {
137 return 0;
138 }
139
140 if (self->reader) {
141 Py_buffer buffer;
142
143 assert(self->readResult == NULL);
144
145 self->readResult = PyObject_CallMethod(self->reader, "read",
146 "k", self->readSize);
147
148 if (NULL == self->readResult) {
149 return -1;
150 }
151
152 memset(&buffer, 0, sizeof(buffer));
153
154 if (0 != PyObject_GetBuffer(self->readResult, &buffer, PyBUF_CONTIG_RO)) {
155 return -1;
156 }
157
158 /* EOF */
159 if (0 == buffer.len) {
160 self->finishedInput = 1;
161 Py_CLEAR(self->readResult);
162 }
163 else {
164 self->input.src = buffer.buf;
165 self->input.size = buffer.len;
166 self->input.pos = 0;
167 }
168
169 PyBuffer_Release(&buffer);
170 }
171 else {
172 assert(self->buffer.buf);
173
174 self->input.src = self->buffer.buf;
175 self->input.size = self->buffer.len;
176 self->input.pos = 0;
177 }
178
179 return 1;
180 }
181
182 int compress_input(ZstdCompressionReader* self, ZSTD_outBuffer* output) {
183 size_t oldPos;
184 size_t zresult;
185
186 /* If we have data left over, consume it. */
187 if (self->input.pos < self->input.size) {
188 oldPos = output->pos;
189
190 Py_BEGIN_ALLOW_THREADS
191 zresult = ZSTD_compressStream2(self->compressor->cctx,
192 output, &self->input, ZSTD_e_continue);
193 Py_END_ALLOW_THREADS
194
195 self->bytesCompressed += output->pos - oldPos;
196
197 /* Input exhausted. Clear out state tracking. */
198 if (self->input.pos == self->input.size) {
199 memset(&self->input, 0, sizeof(self->input));
200 Py_CLEAR(self->readResult);
201
202 if (self->buffer.buf) {
203 self->finishedInput = 1;
204 }
205 }
206
207 if (ZSTD_isError(zresult)) {
208 PyErr_Format(ZstdError, "zstd compress error: %s", ZSTD_getErrorName(zresult));
209 return -1;
210 }
211 }
212
213 if (output->pos && output->pos == output->size) {
214 return 1;
215 }
216 else {
217 return 0;
218 }
219 }
220
131 221 static PyObject* reader_read(ZstdCompressionReader* self, PyObject* args, PyObject* kwargs) {
132 222 static char* kwlist[] = {
133 223 "size",
@@ -140,25 +230,30 @@ static PyObject* reader_read(ZstdCompres
140 230 Py_ssize_t resultSize;
141 231 size_t zresult;
142 232 size_t oldPos;
233 int readResult, compressResult;
143 234
144 235 if (self->closed) {
145 236 PyErr_SetString(PyExc_ValueError, "stream is closed");
146 237 return NULL;
147 238 }
148 239
149 if (self->finishedOutput) {
150 return PyBytes_FromStringAndSize("", 0);
151 }
152
153 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "n", kwlist, &size)) {
240 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|n", kwlist, &size)) {
154 241 return NULL;
155 242 }
156 243
157 if (size < 1) {
158 PyErr_SetString(PyExc_ValueError, "cannot read negative or size 0 amounts");
244 if (size < -1) {
245 PyErr_SetString(PyExc_ValueError, "cannot read negative amounts less than -1");
159 246 return NULL;
160 247 }
161 248
249 if (size == -1) {
250 return PyObject_CallMethod((PyObject*)self, "readall", NULL);
251 }
252
253 if (self->finishedOutput || size == 0) {
254 return PyBytes_FromStringAndSize("", 0);
255 }
256
162 257 result = PyBytes_FromStringAndSize(NULL, size);
163 258 if (NULL == result) {
164 259 return NULL;
@@ -172,86 +267,34 @@ static PyObject* reader_read(ZstdCompres
172 267
173 268 readinput:
174 269
175 /* If we have data left over, consume it. */
176 if (self->input.pos < self->input.size) {
177 oldPos = self->output.pos;
178
179 Py_BEGIN_ALLOW_THREADS
180 zresult = ZSTD_compress_generic(self->compressor->cctx,
181 &self->output, &self->input, ZSTD_e_continue);
182
183 Py_END_ALLOW_THREADS
184
185 self->bytesCompressed += self->output.pos - oldPos;
186
187 /* Input exhausted. Clear out state tracking. */
188 if (self->input.pos == self->input.size) {
189 memset(&self->input, 0, sizeof(self->input));
190 Py_CLEAR(self->readResult);
270 compressResult = compress_input(self, &self->output);
191 271
192 if (self->buffer.buf) {
193 self->finishedInput = 1;
194 }
195 }
196
197 if (ZSTD_isError(zresult)) {
198 PyErr_Format(ZstdError, "zstd compress error: %s", ZSTD_getErrorName(zresult));
199 return NULL;
200 }
201
202 if (self->output.pos) {
203 /* If no more room in output, emit it. */
204 if (self->output.pos == self->output.size) {
205 memset(&self->output, 0, sizeof(self->output));
206 return result;
207 }
208
209 /*
210 * There is room in the output. We fall through to below, which will either
211 * get more input for us or will attempt to end the stream.
212 */
213 }
214
215 /* Fall through to gather more input. */
272 if (-1 == compressResult) {
273 Py_XDECREF(result);
274 return NULL;
275 }
276 else if (0 == compressResult) {
277 /* There is room in the output. We fall through to below, which will
278 * either get more input for us or will attempt to end the stream.
279 */
280 }
281 else if (1 == compressResult) {
282 memset(&self->output, 0, sizeof(self->output));
283 return result;
284 }
285 else {
286 assert(0);
216 287 }
217 288
218 if (!self->finishedInput) {
219 if (self->reader) {
220 Py_buffer buffer;
221
222 assert(self->readResult == NULL);
223 self->readResult = PyObject_CallMethod(self->reader, "read",
224 "k", self->readSize);
225 if (self->readResult == NULL) {
226 return NULL;
227 }
228
229 memset(&buffer, 0, sizeof(buffer));
230
231 if (0 != PyObject_GetBuffer(self->readResult, &buffer, PyBUF_CONTIG_RO)) {
232 return NULL;
233 }
289 readResult = read_compressor_input(self);
234 290
235 /* EOF */
236 if (0 == buffer.len) {
237 self->finishedInput = 1;
238 Py_CLEAR(self->readResult);
239 }
240 else {
241 self->input.src = buffer.buf;
242 self->input.size = buffer.len;
243 self->input.pos = 0;
244 }
245
246 PyBuffer_Release(&buffer);
247 }
248 else {
249 assert(self->buffer.buf);
250
251 self->input.src = self->buffer.buf;
252 self->input.size = self->buffer.len;
253 self->input.pos = 0;
254 }
291 if (-1 == readResult) {
292 return NULL;
293 }
294 else if (0 == readResult) { }
295 else if (1 == readResult) { }
296 else {
297 assert(0);
255 298 }
256 299
257 300 if (self->input.size) {
@@ -261,7 +304,7 @@ readinput:
261 304 /* Else EOF */
262 305 oldPos = self->output.pos;
263 306
264 zresult = ZSTD_compress_generic(self->compressor->cctx, &self->output,
307 zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output,
265 308 &self->input, ZSTD_e_end);
266 309
267 310 self->bytesCompressed += self->output.pos - oldPos;
@@ -269,6 +312,7 @@ readinput:
269 312 if (ZSTD_isError(zresult)) {
270 313 PyErr_Format(ZstdError, "error ending compression stream: %s",
271 314 ZSTD_getErrorName(zresult));
315 Py_XDECREF(result);
272 316 return NULL;
273 317 }
274 318
@@ -288,9 +332,394 @@ readinput:
288 332 return result;
289 333 }
290 334
335 static PyObject* reader_read1(ZstdCompressionReader* self, PyObject* args, PyObject* kwargs) {
336 static char* kwlist[] = {
337 "size",
338 NULL
339 };
340
341 Py_ssize_t size = -1;
342 PyObject* result = NULL;
343 char* resultBuffer;
344 Py_ssize_t resultSize;
345 ZSTD_outBuffer output;
346 int compressResult;
347 size_t oldPos;
348 size_t zresult;
349
350 if (self->closed) {
351 PyErr_SetString(PyExc_ValueError, "stream is closed");
352 return NULL;
353 }
354
355 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|n:read1", kwlist, &size)) {
356 return NULL;
357 }
358
359 if (size < -1) {
360 PyErr_SetString(PyExc_ValueError, "cannot read negative amounts less than -1");
361 return NULL;
362 }
363
364 if (self->finishedOutput || size == 0) {
365 return PyBytes_FromStringAndSize("", 0);
366 }
367
368 if (size == -1) {
369 size = ZSTD_CStreamOutSize();
370 }
371
372 result = PyBytes_FromStringAndSize(NULL, size);
373 if (NULL == result) {
374 return NULL;
375 }
376
377 PyBytes_AsStringAndSize(result, &resultBuffer, &resultSize);
378
379 output.dst = resultBuffer;
380 output.size = resultSize;
381 output.pos = 0;
382
383 /* read1() is supposed to use at most 1 read() from the underlying stream.
384 However, we can't satisfy this requirement with compression because
385 not every input will generate output. We /could/ flush the compressor,
386 but this may not be desirable. We allow multiple read() from the
387 underlying stream. But unlike read(), we return as soon as output data
388 is available.
389 */
390
391 compressResult = compress_input(self, &output);
392
393 if (-1 == compressResult) {
394 Py_XDECREF(result);
395 return NULL;
396 }
397 else if (0 == compressResult || 1 == compressResult) { }
398 else {
399 assert(0);
400 }
401
402 if (output.pos) {
403 goto finally;
404 }
405
406 while (!self->finishedInput) {
407 int readResult = read_compressor_input(self);
408
409 if (-1 == readResult) {
410 Py_XDECREF(result);
411 return NULL;
412 }
413 else if (0 == readResult || 1 == readResult) { }
414 else {
415 assert(0);
416 }
417
418 compressResult = compress_input(self, &output);
419
420 if (-1 == compressResult) {
421 Py_XDECREF(result);
422 return NULL;
423 }
424 else if (0 == compressResult || 1 == compressResult) { }
425 else {
426 assert(0);
427 }
428
429 if (output.pos) {
430 goto finally;
431 }
432 }
433
434 /* EOF */
435 oldPos = output.pos;
436
437 zresult = ZSTD_compressStream2(self->compressor->cctx, &output, &self->input,
438 ZSTD_e_end);
439
440 self->bytesCompressed += output.pos - oldPos;
441
442 if (ZSTD_isError(zresult)) {
443 PyErr_Format(ZstdError, "error ending compression stream: %s",
444 ZSTD_getErrorName(zresult));
445 Py_XDECREF(result);
446 return NULL;
447 }
448
449 if (zresult == 0) {
450 self->finishedOutput = 1;
451 }
452
453 finally:
454 if (result) {
455 if (safe_pybytes_resize(&result, output.pos)) {
456 Py_XDECREF(result);
457 return NULL;
458 }
459 }
460
461 return result;
462 }
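
The comment in ``reader_read1`` describes the difference from ``read()``: both may pull several chunks from the underlying stream, but ``read1()`` returns as soon as any output exists while ``read()`` keeps going until the output buffer is full or input runs out. The sketch below models only that control flow. It is a toy under stated assumptions: bytes are copied instead of compressed, the final ``ZSTD_e_end`` flush is left out, and all ``toy_`` names are hypothetical stand-ins for ``compress_input`` and ``read_compressor_input``.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { const char* src; size_t size; size_t pos; } toy_in;
typedef struct { char* dst; size_t size; size_t pos; } toy_out;

/* Source of input chunks, standing in for the wrapped Python stream. */
typedef struct {
    const char** chunks;
    size_t count;
    size_t next;
    toy_in in;
} toy_source;

/* Stand-in for compress_input(): copies bytes instead of compressing.
 * Returns 1 when the output buffer is non-empty and full, else 0. */
static int toy_consume(toy_in* in, toy_out* out) {
    size_t n;
    size_t room;

    assert(in->pos <= in->size && out->pos <= out->size);

    n = in->size - in->pos;
    room = out->size - out->pos;
    if (n > room) {
        n = room;
    }
    if (n) {
        memcpy(out->dst + out->pos, in->src + in->pos, n);
        in->pos += n;
        out->pos += n;
    }

    return (out->pos && out->pos == out->size) ? 1 : 0;
}

/* Stand-in for read_compressor_input(): load the next chunk once the
 * current one is used up. Returns 1 if a chunk was loaded, else 0. */
static int toy_refill(toy_source* s) {
    if (s->in.pos != s->in.size || s->next == s->count) {
        return 0;
    }
    s->in.src = s->chunks[s->next++];
    s->in.size = strlen(s->in.src);
    s->in.pos = 0;
    return 1;
}

/* stop_early == 0 behaves like read(): fill dst unless input runs out.
 * stop_early != 0 behaves like read1(): return once any output exists. */
static size_t toy_read(toy_source* s, char* dst, size_t size, int stop_early) {
    toy_out out = { dst, size, 0 };

    for (;;) {
        int full = toy_consume(&s->in, &out);
        if (full || (stop_early && out.pos)) {
            break;
        }
        if (!toy_refill(s)) {
            break;
        }
    }

    return out.pos;
}
```

With real compression an input chunk can produce no output at all, which is why the loop may need more than one refill before it has anything to return.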
463
291 464 static PyObject* reader_readall(PyObject* self) {
292 PyErr_SetNone(PyExc_NotImplementedError);
293 return NULL;
465 PyObject* chunks = NULL;
466 PyObject* empty = NULL;
467 PyObject* result = NULL;
468
469 /* Our strategy is to collect chunks into a list then join all the
470 * chunks at the end. We could potentially use e.g. an io.BytesIO. But
471 * this feels simple enough to implement and avoids potentially expensive
472 * reallocations of large buffers.
473 */
474 chunks = PyList_New(0);
475 if (NULL == chunks) {
476 return NULL;
477 }
478
479 while (1) {
480 PyObject* chunk = PyObject_CallMethod(self, "read", "i", 1048576);
481 if (NULL == chunk) {
482 Py_DECREF(chunks);
483 return NULL;
484 }
485
486 if (!PyBytes_Size(chunk)) {
487 Py_DECREF(chunk);
488 break;
489 }
490
491 if (PyList_Append(chunks, chunk)) {
492 Py_DECREF(chunk);
493 Py_DECREF(chunks);
494 return NULL;
495 }
496
497 Py_DECREF(chunk);
498 }
499
500 empty = PyBytes_FromStringAndSize("", 0);
501 if (NULL == empty) {
502 Py_DECREF(chunks);
503 return NULL;
504 }
505
506 result = PyObject_CallMethod(empty, "join", "O", chunks);
507
508 Py_DECREF(empty);
509 Py_DECREF(chunks);
510
511 return result;
512 }
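
``reader_readall`` collects fixed-size chunks in a Python list and joins them once at the end, which avoids repeatedly growing one large buffer. The same strategy can be sketched in plain C as a list of chunk copies followed by a single exact-size allocation. This is an illustration only: the ``toy_chunks`` type and its functions are hypothetical, and the patch itself uses ``PyList_Append`` and the bytes ``join`` method.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct { char* data; size_t len; } toy_chunk;

typedef struct {
    toy_chunk* items;
    size_t count;
    size_t cap;
    size_t total; /* sum of all chunk lengths */
} toy_chunks;

/* Append a copy of data[0..len). Returns 0 on success, -1 on allocation failure. */
static int toy_chunks_append(toy_chunks* l, const char* data, size_t len) {
    char* copy;

    if (l->count == l->cap) {
        /* Only the small array of chunk records grows, never the payload. */
        size_t cap = l->cap ? l->cap * 2 : 8;
        toy_chunk* items = realloc(l->items, cap * sizeof(*items));
        if (!items) {
            return -1;
        }
        l->items = items;
        l->cap = cap;
    }

    copy = malloc(len ? len : 1);
    if (!copy) {
        return -1;
    }
    if (len) {
        memcpy(copy, data, len);
    }

    l->items[l->count].data = copy;
    l->items[l->count].len = len;
    l->count++;
    l->total += len;
    return 0;
}

/* Join all chunks into one buffer that is sized exactly once, then free
 * the list. Returns NULL on allocation failure; the list is freed either way. */
static char* toy_chunks_join(toy_chunks* l, size_t* out_len) {
    char* result = malloc(l->total ? l->total : 1);
    size_t offset = 0;
    size_t i;

    for (i = 0; i < l->count; i++) {
        if (result && l->items[i].len) {
            memcpy(result + offset, l->items[i].data, l->items[i].len);
            offset += l->items[i].len;
        }
        free(l->items[i].data);
    }

    assert(!result || offset == l->total);

    free(l->items);
    memset(l, 0, sizeof(*l));
    *out_len = offset;
    return result;
}
```

The trade-off is one extra copy of every byte at join time in exchange for never reallocating a large, partially filled result buffer.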
513
514 static PyObject* reader_readinto(ZstdCompressionReader* self, PyObject* args) {
515 Py_buffer dest;
516 ZSTD_outBuffer output;
517 int readResult, compressResult;
518 PyObject* result = NULL;
519 size_t zresult;
520 size_t oldPos;
521
522 if (self->closed) {
523 PyErr_SetString(PyExc_ValueError, "stream is closed");
524 return NULL;
525 }
526
527 if (self->finishedOutput) {
528 return PyLong_FromLong(0);
529 }
530
531 if (!PyArg_ParseTuple(args, "w*:readinto", &dest)) {
532 return NULL;
533 }
534
535 if (!PyBuffer_IsContiguous(&dest, 'C') || dest.ndim > 1) {
536 PyErr_SetString(PyExc_ValueError,
537 "destination buffer should be contiguous and have at most one dimension");
538 goto finally;
539 }
540
541 output.dst = dest.buf;
542 output.size = dest.len;
543 output.pos = 0;
544
545 compressResult = compress_input(self, &output);
546
547 if (-1 == compressResult) {
548 goto finally;
549 }
550 else if (0 == compressResult) { }
551 else if (1 == compressResult) {
552 result = PyLong_FromSize_t(output.pos);
553 goto finally;
554 }
555 else {
556 assert(0);
557 }
558
559 while (!self->finishedInput) {
560 readResult = read_compressor_input(self);
561
562 if (-1 == readResult) {
563 goto finally;
564 }
565 else if (0 == readResult || 1 == readResult) {}
566 else {
567 assert(0);
568 }
569
570 compressResult = compress_input(self, &output);
571
572 if (-1 == compressResult) {
573 goto finally;
574 }
575 else if (0 == compressResult) { }
576 else if (1 == compressResult) {
577 result = PyLong_FromSize_t(output.pos);
578 goto finally;
579 }
580 else {
581 assert(0);
582 }
583 }
584
585 /* EOF */
586 oldPos = output.pos;
587
588 zresult = ZSTD_compressStream2(self->compressor->cctx, &output, &self->input,
589 ZSTD_e_end);
590
591 self->bytesCompressed += output.pos - oldPos;
592
593 if (ZSTD_isError(zresult)) {
594 PyErr_Format(ZstdError, "error ending compression stream: %s",
595 ZSTD_getErrorName(zresult));
596 goto finally;
597 }
598
599 assert(output.pos);
600
601 if (0 == zresult) {
602 self->finishedOutput = 1;
603 }
604
605 result = PyLong_FromSize_t(output.pos);
606
607 finally:
608 PyBuffer_Release(&dest);
609
610 return result;
611 }
612
613 static PyObject* reader_readinto1(ZstdCompressionReader* self, PyObject* args) {
614 Py_buffer dest;
615 PyObject* result = NULL;
616 ZSTD_outBuffer output;
617 int compressResult;
618 size_t oldPos;
619 size_t zresult;
620
621 if (self->closed) {
622 PyErr_SetString(PyExc_ValueError, "stream is closed");
623 return NULL;
624 }
625
626 if (self->finishedOutput) {
627 return PyLong_FromLong(0);
628 }
629
630 if (!PyArg_ParseTuple(args, "w*:readinto1", &dest)) {
631 return NULL;
632 }
633
634 if (!PyBuffer_IsContiguous(&dest, 'C') || dest.ndim > 1) {
635 PyErr_SetString(PyExc_ValueError,
636 "destination buffer should be contiguous and have at most one dimension");
637 goto finally;
638 }
639
640 output.dst = dest.buf;
641 output.size = dest.len;
642 output.pos = 0;
643
644 compressResult = compress_input(self, &output);
645
646 if (-1 == compressResult) {
647 goto finally;
648 }
649 else if (0 == compressResult || 1 == compressResult) { }
650 else {
651 assert(0);
652 }
653
654 if (output.pos) {
655 result = PyLong_FromSize_t(output.pos);
656 goto finally;
657 }
658
659 while (!self->finishedInput) {
660 int readResult = read_compressor_input(self);
661
662 if (-1 == readResult) {
663 goto finally;
664 }
665 else if (0 == readResult || 1 == readResult) { }
666 else {
667 assert(0);
668 }
669
670 compressResult = compress_input(self, &output);
671
672 if (-1 == compressResult) {
673 goto finally;
674 }
675 else if (0 == compressResult) { }
676 else if (1 == compressResult) {
677 result = PyLong_FromSize_t(output.pos);
678 goto finally;
679 }
680 else {
681 assert(0);
682 }
683
684 /* If we produced output and we're not done with input, emit
685 * that output now, as we've hit restrictions of read1().
686 */
687 if (output.pos && !self->finishedInput) {
688 result = PyLong_FromSize_t(output.pos);
689 goto finally;
690 }
691
692 /* Otherwise we either have no output or we've exhausted the
693 * input. Either we try to get more input or we fall through
694 * to EOF below */
695 }
696
697 /* EOF */
698 oldPos = output.pos;
699
700 zresult = ZSTD_compressStream2(self->compressor->cctx, &output, &self->input,
701 ZSTD_e_end);
702
703 self->bytesCompressed += output.pos - oldPos;
704
705 if (ZSTD_isError(zresult)) {
706 PyErr_Format(ZstdError, "error ending compression stream: %s",
707 ZSTD_getErrorName(zresult));
708 goto finally;
709 }
710
711 assert(output.pos);
712
713 if (0 == zresult) {
714 self->finishedOutput = 1;
715 }
716
717 result = PyLong_FromSize_t(output.pos);
718
719 finally:
720 PyBuffer_Release(&dest);
721
722 return result;
294 723 }
295 724
296 725 static PyObject* reader_iter(PyObject* self) {
@@ -315,7 +744,10 @@ static PyMethodDef reader_methods[] = {
315 744 { "readable", (PyCFunction)reader_readable, METH_NOARGS,
316 745 PyDoc_STR("Returns True") },
317 746 { "read", (PyCFunction)reader_read, METH_VARARGS | METH_KEYWORDS, PyDoc_STR("read compressed data") },
747 { "read1", (PyCFunction)reader_read1, METH_VARARGS | METH_KEYWORDS, NULL },
318 748 { "readall", (PyCFunction)reader_readall, METH_NOARGS, PyDoc_STR("read and compress all remaining input") },
749 { "readinto", (PyCFunction)reader_readinto, METH_VARARGS, NULL },
750 { "readinto1", (PyCFunction)reader_readinto1, METH_VARARGS, NULL },
319 751 { "readline", (PyCFunction)reader_readline, METH_VARARGS, PyDoc_STR("Not implemented") },
320 752 { "readlines", (PyCFunction)reader_readlines, METH_VARARGS, PyDoc_STR("Not implemented") },
321 753 { "seekable", (PyCFunction)reader_seekable, METH_NOARGS,
@@ -18,24 +18,23 @@ static void ZstdCompressionWriter_deallo
18 18 Py_XDECREF(self->compressor);
19 19 Py_XDECREF(self->writer);
20 20
21 PyMem_Free(self->output.dst);
22 self->output.dst = NULL;
23
21 24 PyObject_Del(self);
22 25 }
23 26
24 27 static PyObject* ZstdCompressionWriter_enter(ZstdCompressionWriter* self) {
25 size_t zresult;
28 if (self->closed) {
29 PyErr_SetString(PyExc_ValueError, "stream is closed");
30 return NULL;
31 }
26 32
27 33 if (self->entered) {
28 34 PyErr_SetString(ZstdError, "cannot __enter__ multiple times");
29 35 return NULL;
30 36 }
31 37
32 zresult = ZSTD_CCtx_setPledgedSrcSize(self->compressor->cctx, self->sourceSize);
33 if (ZSTD_isError(zresult)) {
34 PyErr_Format(ZstdError, "error setting source size: %s",
35 ZSTD_getErrorName(zresult));
36 return NULL;
37 }
38
39 38 self->entered = 1;
40 39
41 40 Py_INCREF(self);
@@ -46,10 +45,6 @@ static PyObject* ZstdCompressionWriter_e
46 45 PyObject* exc_type;
47 46 PyObject* exc_value;
48 47 PyObject* exc_tb;
49 size_t zresult;
50
51 ZSTD_outBuffer output;
52 PyObject* res;
53 48
54 49 if (!PyArg_ParseTuple(args, "OOO:__exit__", &exc_type, &exc_value, &exc_tb)) {
55 50 return NULL;
@@ -58,46 +53,11 @@ static PyObject* ZstdCompressionWriter_e
58 53 self->entered = 0;
59 54
60 55 if (exc_type == Py_None && exc_value == Py_None && exc_tb == Py_None) {
61 ZSTD_inBuffer inBuffer;
62
63 inBuffer.src = NULL;
64 inBuffer.size = 0;
65 inBuffer.pos = 0;
66
67 output.dst = PyMem_Malloc(self->outSize);
68 if (!output.dst) {
69 return PyErr_NoMemory();
70 }
71 output.size = self->outSize;
72 output.pos = 0;
56 PyObject* result = PyObject_CallMethod((PyObject*)self, "close", NULL);
73 57
74 while (1) {
75 zresult = ZSTD_compress_generic(self->compressor->cctx, &output, &inBuffer, ZSTD_e_end);
76 if (ZSTD_isError(zresult)) {
77 PyErr_Format(ZstdError, "error ending compression stream: %s",
78 ZSTD_getErrorName(zresult));
79 PyMem_Free(output.dst);
80 return NULL;
81 }
82
83 if (output.pos) {
84 #if PY_MAJOR_VERSION >= 3
85 res = PyObject_CallMethod(self->writer, "write", "y#",
86 #else
87 res = PyObject_CallMethod(self->writer, "write", "s#",
88 #endif
89 output.dst, output.pos);
90 Py_XDECREF(res);
91 }
92
93 if (!zresult) {
94 break;
95 }
96
97 output.pos = 0;
58 if (NULL == result) {
59 return NULL;
98 60 }
99
100 PyMem_Free(output.dst);
101 61 }
102 62
103 63 Py_RETURN_FALSE;
@@ -117,7 +77,6 b' static PyObject* ZstdCompressionWriter_w'
117 77 Py_buffer source;
118 78 size_t zresult;
119 79 ZSTD_inBuffer input;
120 ZSTD_outBuffer output;
121 80 PyObject* res;
122 81 Py_ssize_t totalWrite = 0;
123 82
@@ -130,143 +89,240 b' static PyObject* ZstdCompressionWriter_w'
130 89 return NULL;
131 90 }
132 91
133 if (!self->entered) {
134 PyErr_SetString(ZstdError, "compress must be called from an active context manager");
135 goto finally;
136 }
137
138 92 if (!PyBuffer_IsContiguous(&source, 'C') || source.ndim > 1) {
139 93 PyErr_SetString(PyExc_ValueError,
140 94 "data buffer should be contiguous and have at most one dimension");
141 95 goto finally;
142 96 }
143 97
144 output.dst = PyMem_Malloc(self->outSize);
145 if (!output.dst) {
146 PyErr_NoMemory();
147 goto finally;
98 if (self->closed) {
99 PyErr_SetString(PyExc_ValueError, "stream is closed");
100 return NULL;
148 101 }
149 output.size = self->outSize;
150 output.pos = 0;
102
103 self->output.pos = 0;
151 104
152 105 input.src = source.buf;
153 106 input.size = source.len;
154 107 input.pos = 0;
155 108
156 while ((ssize_t)input.pos < source.len) {
109 while (input.pos < (size_t)source.len) {
157 110 Py_BEGIN_ALLOW_THREADS
158 zresult = ZSTD_compress_generic(self->compressor->cctx, &output, &input, ZSTD_e_continue);
111 zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output, &input, ZSTD_e_continue);
159 112 Py_END_ALLOW_THREADS
160 113
161 114 if (ZSTD_isError(zresult)) {
162 PyMem_Free(output.dst);
163 115 PyErr_Format(ZstdError, "zstd compress error: %s", ZSTD_getErrorName(zresult));
164 116 goto finally;
165 117 }
166 118
167 119 /* Copy data from output buffer to writer. */
168 if (output.pos) {
120 if (self->output.pos) {
169 121 #if PY_MAJOR_VERSION >= 3
170 122 res = PyObject_CallMethod(self->writer, "write", "y#",
171 123 #else
172 124 res = PyObject_CallMethod(self->writer, "write", "s#",
173 125 #endif
174 output.dst, output.pos);
126 self->output.dst, self->output.pos);
175 127 Py_XDECREF(res);
176 totalWrite += output.pos;
177 self->bytesCompressed += output.pos;
128 totalWrite += self->output.pos;
129 self->bytesCompressed += self->output.pos;
178 130 }
179 output.pos = 0;
131 self->output.pos = 0;
180 132 }
181 133
182 PyMem_Free(output.dst);
183
184 result = PyLong_FromSsize_t(totalWrite);
134 if (self->writeReturnRead) {
135 result = PyLong_FromSize_t(input.pos);
136 }
137 else {
138 result = PyLong_FromSsize_t(totalWrite);
139 }
185 140
186 141 finally:
187 142 PyBuffer_Release(&source);
188 143 return result;
189 144 }
190 145
191 static PyObject* ZstdCompressionWriter_flush(ZstdCompressionWriter* self, PyObject* args) {
146 static PyObject* ZstdCompressionWriter_flush(ZstdCompressionWriter* self, PyObject* args, PyObject* kwargs) {
147 static char* kwlist[] = {
148 "flush_mode",
149 NULL
150 };
151
192 152 size_t zresult;
193 ZSTD_outBuffer output;
194 153 ZSTD_inBuffer input;
195 154 PyObject* res;
196 155 Py_ssize_t totalWrite = 0;
156 unsigned flush_mode = 0;
157 ZSTD_EndDirective flush;
197 158
198 if (!self->entered) {
199 PyErr_SetString(ZstdError, "flush must be called from an active context manager");
159 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|I:flush",
160 kwlist, &flush_mode)) {
200 161 return NULL;
201 162 }
202 163
164 switch (flush_mode) {
165 case 0:
166 flush = ZSTD_e_flush;
167 break;
168 case 1:
169 flush = ZSTD_e_end;
170 break;
171 default:
172 PyErr_Format(PyExc_ValueError, "unknown flush_mode: %d", flush_mode);
173 return NULL;
174 }
175
176 if (self->closed) {
177 PyErr_SetString(PyExc_ValueError, "stream is closed");
178 return NULL;
179 }
180
181 self->output.pos = 0;
182
203 183 input.src = NULL;
204 184 input.size = 0;
205 185 input.pos = 0;
206 186
207 output.dst = PyMem_Malloc(self->outSize);
208 if (!output.dst) {
209 return PyErr_NoMemory();
210 }
211 output.size = self->outSize;
212 output.pos = 0;
213
214 187 while (1) {
215 188 Py_BEGIN_ALLOW_THREADS
216 zresult = ZSTD_compress_generic(self->compressor->cctx, &output, &input, ZSTD_e_flush);
189 zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output, &input, flush);
217 190 Py_END_ALLOW_THREADS
218 191
219 192 if (ZSTD_isError(zresult)) {
220 PyMem_Free(output.dst);
221 193 PyErr_Format(ZstdError, "zstd compress error: %s", ZSTD_getErrorName(zresult));
222 194 return NULL;
223 195 }
224 196
225 197 /* Copy data from output buffer to writer. */
226 if (output.pos) {
198 if (self->output.pos) {
227 199 #if PY_MAJOR_VERSION >= 3
228 200 res = PyObject_CallMethod(self->writer, "write", "y#",
229 201 #else
230 202 res = PyObject_CallMethod(self->writer, "write", "s#",
231 203 #endif
232 output.dst, output.pos);
204 self->output.dst, self->output.pos);
233 205 Py_XDECREF(res);
234 totalWrite += output.pos;
235 self->bytesCompressed += output.pos;
206 totalWrite += self->output.pos;
207 self->bytesCompressed += self->output.pos;
236 208 }
237 209
238 output.pos = 0;
210 self->output.pos = 0;
239 211
240 212 if (!zresult) {
241 213 break;
242 214 }
243 215 }
244 216
245 PyMem_Free(output.dst);
217 return PyLong_FromSsize_t(totalWrite);
218 }
219
220 static PyObject* ZstdCompressionWriter_close(ZstdCompressionWriter* self) {
221 PyObject* result;
222
223 if (self->closed) {
224 Py_RETURN_NONE;
225 }
226
227 result = PyObject_CallMethod((PyObject*)self, "flush", "I", 1);
228 self->closed = 1;
229
230 if (NULL == result) {
231 return NULL;
232 }
246 233
247 return PyLong_FromSsize_t(totalWrite);
234 /* Call close on underlying stream as well. */
235 if (PyObject_HasAttrString(self->writer, "close")) {
236 return PyObject_CallMethod(self->writer, "close", NULL);
237 }
238
239 Py_RETURN_NONE;
240 }
241
242 static PyObject* ZstdCompressionWriter_fileno(ZstdCompressionWriter* self) {
243 if (PyObject_HasAttrString(self->writer, "fileno")) {
244 return PyObject_CallMethod(self->writer, "fileno", NULL);
245 }
246 else {
247 PyErr_SetString(PyExc_OSError, "fileno not available on underlying writer");
248 return NULL;
249 }
248 250 }
249 251
250 252 static PyObject* ZstdCompressionWriter_tell(ZstdCompressionWriter* self) {
251 253 return PyLong_FromUnsignedLongLong(self->bytesCompressed);
252 254 }
253 255
256 static PyObject* ZstdCompressionWriter_writelines(PyObject* self, PyObject* args) {
257 PyErr_SetNone(PyExc_NotImplementedError);
258 return NULL;
259 }
260
261 static PyObject* ZstdCompressionWriter_false(PyObject* self, PyObject* args) {
262 Py_RETURN_FALSE;
263 }
264
265 static PyObject* ZstdCompressionWriter_true(PyObject* self, PyObject* args) {
266 Py_RETURN_TRUE;
267 }
268
269 static PyObject* ZstdCompressionWriter_unsupported(PyObject* self, PyObject* args, PyObject* kwargs) {
270 PyObject* iomod;
271 PyObject* exc;
272
273 iomod = PyImport_ImportModule("io");
274 if (NULL == iomod) {
275 return NULL;
276 }
277
278 exc = PyObject_GetAttrString(iomod, "UnsupportedOperation");
279 if (NULL == exc) {
280 Py_DECREF(iomod);
281 return NULL;
282 }
283
284 PyErr_SetNone(exc);
285 Py_DECREF(exc);
286 Py_DECREF(iomod);
287
288 return NULL;
289 }
290
254 291 static PyMethodDef ZstdCompressionWriter_methods[] = {
255 292 { "__enter__", (PyCFunction)ZstdCompressionWriter_enter, METH_NOARGS,
256 293 PyDoc_STR("Enter a compression context.") },
257 294 { "__exit__", (PyCFunction)ZstdCompressionWriter_exit, METH_VARARGS,
258 295 PyDoc_STR("Exit a compression context.") },
296 { "close", (PyCFunction)ZstdCompressionWriter_close, METH_NOARGS, NULL },
297 { "fileno", (PyCFunction)ZstdCompressionWriter_fileno, METH_NOARGS, NULL },
298 { "isatty", (PyCFunction)ZstdCompressionWriter_false, METH_NOARGS, NULL },
299 { "readable", (PyCFunction)ZstdCompressionWriter_false, METH_NOARGS, NULL },
300 { "readline", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
301 { "readlines", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
302 { "seek", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
303 { "seekable", ZstdCompressionWriter_false, METH_NOARGS, NULL },
304 { "truncate", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
305 { "writable", ZstdCompressionWriter_true, METH_NOARGS, NULL },
306 { "writelines", ZstdCompressionWriter_writelines, METH_VARARGS, NULL },
307 { "read", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
308 { "readall", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
309 { "readinto", (PyCFunction)ZstdCompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
259 310 { "memory_size", (PyCFunction)ZstdCompressionWriter_memory_size, METH_NOARGS,
260 311 PyDoc_STR("Obtain the memory size of the underlying compressor") },
261 312 { "write", (PyCFunction)ZstdCompressionWriter_write, METH_VARARGS | METH_KEYWORDS,
262 313 PyDoc_STR("Compress data") },
263 { "flush", (PyCFunction)ZstdCompressionWriter_flush, METH_NOARGS,
314 { "flush", (PyCFunction)ZstdCompressionWriter_flush, METH_VARARGS | METH_KEYWORDS,
264 315 PyDoc_STR("Flush data and finish a zstd frame") },
265 316 { "tell", (PyCFunction)ZstdCompressionWriter_tell, METH_NOARGS,
266 317 PyDoc_STR("Returns current number of bytes compressed") },
267 318 { NULL, NULL }
268 319 };
269 320
321 static PyMemberDef ZstdCompressionWriter_members[] = {
322 { "closed", T_BOOL, offsetof(ZstdCompressionWriter, closed), READONLY, NULL },
323 { NULL }
324 };
325
270 326 PyTypeObject ZstdCompressionWriterType = {
271 327 PyVarObject_HEAD_INIT(NULL, 0)
272 328 "zstd.ZstdCompressionWriter", /* tp_name */
@@ -296,7 +352,7 b' PyTypeObject ZstdCompressionWriterType ='
296 352 0, /* tp_iter */
297 353 0, /* tp_iternext */
298 354 ZstdCompressionWriter_methods, /* tp_methods */
299 0, /* tp_members */
355 ZstdCompressionWriter_members, /* tp_members */
300 356 0, /* tp_getset */
301 357 0, /* tp_base */
302 358 0, /* tp_dict */
@@ -59,9 +59,9 b' static PyObject* ZstdCompressionObj_comp'
59 59 input.size = source.len;
60 60 input.pos = 0;
61 61
62 while ((ssize_t)input.pos < source.len) {
62 while (input.pos < (size_t)source.len) {
63 63 Py_BEGIN_ALLOW_THREADS
64 zresult = ZSTD_compress_generic(self->compressor->cctx, &self->output,
64 zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output,
65 65 &input, ZSTD_e_continue);
66 66 Py_END_ALLOW_THREADS
67 67
@@ -154,7 +154,7 b' static PyObject* ZstdCompressionObj_flus'
154 154
155 155 while (1) {
156 156 Py_BEGIN_ALLOW_THREADS
157 zresult = ZSTD_compress_generic(self->compressor->cctx, &self->output,
157 zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output,
158 158 &input, zFlushMode);
159 159 Py_END_ALLOW_THREADS
160 160
@@ -204,27 +204,27 b' static int ZstdCompressor_init(ZstdCompr'
204 204 }
205 205 }
206 206 else {
207 if (set_parameter(self->params, ZSTD_p_compressionLevel, level)) {
207 if (set_parameter(self->params, ZSTD_c_compressionLevel, level)) {
208 208 return -1;
209 209 }
210 210
211 if (set_parameter(self->params, ZSTD_p_contentSizeFlag,
211 if (set_parameter(self->params, ZSTD_c_contentSizeFlag,
212 212 writeContentSize ? PyObject_IsTrue(writeContentSize) : 1)) {
213 213 return -1;
214 214 }
215 215
216 if (set_parameter(self->params, ZSTD_p_checksumFlag,
216 if (set_parameter(self->params, ZSTD_c_checksumFlag,
217 217 writeChecksum ? PyObject_IsTrue(writeChecksum) : 0)) {
218 218 return -1;
219 219 }
220 220
221 if (set_parameter(self->params, ZSTD_p_dictIDFlag,
221 if (set_parameter(self->params, ZSTD_c_dictIDFlag,
222 222 writeDictID ? PyObject_IsTrue(writeDictID) : 1)) {
223 223 return -1;
224 224 }
225 225
226 226 if (threads) {
227 if (set_parameter(self->params, ZSTD_p_nbWorkers, threads)) {
227 if (set_parameter(self->params, ZSTD_c_nbWorkers, threads)) {
228 228 return -1;
229 229 }
230 230 }
@@ -344,7 +344,7 b' static PyObject* ZstdCompressor_copy_str'
344 344 return NULL;
345 345 }
346 346
347 ZSTD_CCtx_reset(self->cctx);
347 ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only);
348 348
349 349 zresult = ZSTD_CCtx_setPledgedSrcSize(self->cctx, sourceSize);
350 350 if (ZSTD_isError(zresult)) {
@@ -391,7 +391,7 b' static PyObject* ZstdCompressor_copy_str'
391 391
392 392 while (input.pos < input.size) {
393 393 Py_BEGIN_ALLOW_THREADS
394 zresult = ZSTD_compress_generic(self->cctx, &output, &input, ZSTD_e_continue);
394 zresult = ZSTD_compressStream2(self->cctx, &output, &input, ZSTD_e_continue);
395 395 Py_END_ALLOW_THREADS
396 396
397 397 if (ZSTD_isError(zresult)) {
@@ -421,7 +421,7 b' static PyObject* ZstdCompressor_copy_str'
421 421
422 422 while (1) {
423 423 Py_BEGIN_ALLOW_THREADS
424 zresult = ZSTD_compress_generic(self->cctx, &output, &input, ZSTD_e_end);
424 zresult = ZSTD_compressStream2(self->cctx, &output, &input, ZSTD_e_end);
425 425 Py_END_ALLOW_THREADS
426 426
427 427 if (ZSTD_isError(zresult)) {
@@ -517,7 +517,7 b' static ZstdCompressionReader* ZstdCompre'
517 517 goto except;
518 518 }
519 519
520 ZSTD_CCtx_reset(self->cctx);
520 ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only);
521 521
522 522 zresult = ZSTD_CCtx_setPledgedSrcSize(self->cctx, sourceSize);
523 523 if (ZSTD_isError(zresult)) {
@@ -577,7 +577,7 b' static PyObject* ZstdCompressor_compress'
577 577 goto finally;
578 578 }
579 579
580 ZSTD_CCtx_reset(self->cctx);
580 ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only);
581 581
582 582 destSize = ZSTD_compressBound(source.len);
583 583 output = PyBytes_FromStringAndSize(NULL, destSize);
@@ -605,7 +605,7 b' static PyObject* ZstdCompressor_compress'
605 605 /* By avoiding ZSTD_compress(), we don't necessarily write out content
606 606 size. This means the argument to ZstdCompressor to control frame
607 607 parameters is honored. */
608 zresult = ZSTD_compress_generic(self->cctx, &outBuffer, &inBuffer, ZSTD_e_end);
608 zresult = ZSTD_compressStream2(self->cctx, &outBuffer, &inBuffer, ZSTD_e_end);
609 609 Py_END_ALLOW_THREADS
610 610
611 611 if (ZSTD_isError(zresult)) {
@@ -651,7 +651,7 b' static ZstdCompressionObj* ZstdCompresso'
651 651 return NULL;
652 652 }
653 653
654 ZSTD_CCtx_reset(self->cctx);
654 ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only);
655 655
656 656 zresult = ZSTD_CCtx_setPledgedSrcSize(self->cctx, inSize);
657 657 if (ZSTD_isError(zresult)) {
@@ -740,7 +740,7 b' static ZstdCompressorIterator* ZstdCompr'
740 740 goto except;
741 741 }
742 742
743 ZSTD_CCtx_reset(self->cctx);
743 ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only);
744 744
745 745 zresult = ZSTD_CCtx_setPledgedSrcSize(self->cctx, sourceSize);
746 746 if (ZSTD_isError(zresult)) {
@@ -794,16 +794,19 b' static ZstdCompressionWriter* ZstdCompre'
794 794 "writer",
795 795 "size",
796 796 "write_size",
797 "write_return_read",
797 798 NULL
798 799 };
799 800
800 801 PyObject* writer;
801 802 ZstdCompressionWriter* result;
803 size_t zresult;
802 804 unsigned long long sourceSize = ZSTD_CONTENTSIZE_UNKNOWN;
803 805 size_t outSize = ZSTD_CStreamOutSize();
806 PyObject* writeReturnRead = NULL;
804 807
805 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|Kk:stream_writer", kwlist,
806 &writer, &sourceSize, &outSize)) {
808 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|KkO:stream_writer", kwlist,
809 &writer, &sourceSize, &outSize, &writeReturnRead)) {
807 810 return NULL;
808 811 }
809 812
@@ -812,22 +815,38 b' static ZstdCompressionWriter* ZstdCompre'
812 815 return NULL;
813 816 }
814 817
815 ZSTD_CCtx_reset(self->cctx);
818 ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only);
819
820 zresult = ZSTD_CCtx_setPledgedSrcSize(self->cctx, sourceSize);
821 if (ZSTD_isError(zresult)) {
822 PyErr_Format(ZstdError, "error setting source size: %s",
823 ZSTD_getErrorName(zresult));
824 return NULL;
825 }
816 826
817 827 result = (ZstdCompressionWriter*)PyObject_CallObject((PyObject*)&ZstdCompressionWriterType, NULL);
818 828 if (!result) {
819 829 return NULL;
820 830 }
821 831
832 result->output.dst = PyMem_Malloc(outSize);
833 if (!result->output.dst) {
834 Py_DECREF(result);
835 return (ZstdCompressionWriter*)PyErr_NoMemory();
836 }
837
838 result->output.pos = 0;
839 result->output.size = outSize;
840
822 841 result->compressor = self;
823 842 Py_INCREF(result->compressor);
824 843
825 844 result->writer = writer;
826 845 Py_INCREF(result->writer);
827 846
828 result->sourceSize = sourceSize;
829 847 result->outSize = outSize;
830 848 result->bytesCompressed = 0;
849 result->writeReturnRead = writeReturnRead ? PyObject_IsTrue(writeReturnRead) : 0;
831 850
832 851 return result;
833 852 }
@@ -853,7 +872,7 b' static ZstdCompressionChunker* ZstdCompr'
853 872 return NULL;
854 873 }
855 874
856 ZSTD_CCtx_reset(self->cctx);
875 ZSTD_CCtx_reset(self->cctx, ZSTD_reset_session_only);
857 876
858 877 zresult = ZSTD_CCtx_setPledgedSrcSize(self->cctx, sourceSize);
859 878 if (ZSTD_isError(zresult)) {
@@ -1115,7 +1134,7 b' static void compress_worker(WorkerState*'
1115 1134 break;
1116 1135 }
1117 1136
1118 zresult = ZSTD_compress_generic(state->cctx, &opOutBuffer, &opInBuffer, ZSTD_e_end);
1137 zresult = ZSTD_compressStream2(state->cctx, &opOutBuffer, &opInBuffer, ZSTD_e_end);
1119 1138 if (ZSTD_isError(zresult)) {
1120 1139 state->error = WorkerError_zstd;
1121 1140 state->zresult = zresult;
@@ -57,7 +57,7 b' feedcompressor:'
57 57 /* If we have data left in the input, consume it. */
58 58 if (self->input.pos < self->input.size) {
59 59 Py_BEGIN_ALLOW_THREADS
60 zresult = ZSTD_compress_generic(self->compressor->cctx, &self->output,
60 zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output,
61 61 &self->input, ZSTD_e_continue);
62 62 Py_END_ALLOW_THREADS
63 63
@@ -127,7 +127,7 b' feedcompressor:'
127 127 self->input.size = 0;
128 128 self->input.pos = 0;
129 129
130 zresult = ZSTD_compress_generic(self->compressor->cctx, &self->output,
130 zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output,
131 131 &self->input, ZSTD_e_end);
132 132 if (ZSTD_isError(zresult)) {
133 133 PyErr_Format(ZstdError, "error ending compression stream: %s",
@@ -152,7 +152,7 b' feedcompressor:'
152 152 self->input.pos = 0;
153 153
154 154 Py_BEGIN_ALLOW_THREADS
155 zresult = ZSTD_compress_generic(self->compressor->cctx, &self->output,
155 zresult = ZSTD_compressStream2(self->compressor->cctx, &self->output,
156 156 &self->input, ZSTD_e_continue);
157 157 Py_END_ALLOW_THREADS
158 158
@@ -32,6 +32,9 b' void constants_module_init(PyObject* mod'
32 32 ZstdError = PyErr_NewException("zstd.ZstdError", NULL, NULL);
33 33 PyModule_AddObject(mod, "ZstdError", ZstdError);
34 34
35 PyModule_AddIntConstant(mod, "FLUSH_BLOCK", 0);
36 PyModule_AddIntConstant(mod, "FLUSH_FRAME", 1);
37
35 38 PyModule_AddIntConstant(mod, "COMPRESSOBJ_FLUSH_FINISH", compressorobj_flush_finish);
36 39 PyModule_AddIntConstant(mod, "COMPRESSOBJ_FLUSH_BLOCK", compressorobj_flush_block);
37 40
@@ -77,8 +80,11 b' void constants_module_init(PyObject* mod'
77 80 PyModule_AddIntConstant(mod, "HASHLOG3_MAX", ZSTD_HASHLOG3_MAX);
78 81 PyModule_AddIntConstant(mod, "SEARCHLOG_MIN", ZSTD_SEARCHLOG_MIN);
79 82 PyModule_AddIntConstant(mod, "SEARCHLOG_MAX", ZSTD_SEARCHLOG_MAX);
80 PyModule_AddIntConstant(mod, "SEARCHLENGTH_MIN", ZSTD_SEARCHLENGTH_MIN);
81 PyModule_AddIntConstant(mod, "SEARCHLENGTH_MAX", ZSTD_SEARCHLENGTH_MAX);
83 PyModule_AddIntConstant(mod, "MINMATCH_MIN", ZSTD_MINMATCH_MIN);
84 PyModule_AddIntConstant(mod, "MINMATCH_MAX", ZSTD_MINMATCH_MAX);
85 /* TODO SEARCHLENGTH_* is deprecated. */
86 PyModule_AddIntConstant(mod, "SEARCHLENGTH_MIN", ZSTD_MINMATCH_MIN);
87 PyModule_AddIntConstant(mod, "SEARCHLENGTH_MAX", ZSTD_MINMATCH_MAX);
82 88 PyModule_AddIntConstant(mod, "TARGETLENGTH_MIN", ZSTD_TARGETLENGTH_MIN);
83 89 PyModule_AddIntConstant(mod, "TARGETLENGTH_MAX", ZSTD_TARGETLENGTH_MAX);
84 90 PyModule_AddIntConstant(mod, "LDM_MINMATCH_MIN", ZSTD_LDM_MINMATCH_MIN);
@@ -93,6 +99,7 b' void constants_module_init(PyObject* mod'
93 99 PyModule_AddIntConstant(mod, "STRATEGY_BTLAZY2", ZSTD_btlazy2);
94 100 PyModule_AddIntConstant(mod, "STRATEGY_BTOPT", ZSTD_btopt);
95 101 PyModule_AddIntConstant(mod, "STRATEGY_BTULTRA", ZSTD_btultra);
102 PyModule_AddIntConstant(mod, "STRATEGY_BTULTRA2", ZSTD_btultra2);
96 103
97 104 PyModule_AddIntConstant(mod, "DICT_TYPE_AUTO", ZSTD_dct_auto);
98 105 PyModule_AddIntConstant(mod, "DICT_TYPE_RAWCONTENT", ZSTD_dct_rawContent);
@@ -102,6 +102,114 b' static PyObject* reader_isatty(PyObject*'
102 102 Py_RETURN_FALSE;
103 103 }
104 104
105 /**
106 * Read available input.
107 *
108 * Returns 0 if no data was added to input.
109 * Returns 1 if new input data is available.
110 * Returns -1 on error and sets a Python exception as a side-effect.
111 */
112 int read_decompressor_input(ZstdDecompressionReader* self) {
113 if (self->finishedInput) {
114 return 0;
115 }
116
117 if (self->input.pos != self->input.size) {
118 return 0;
119 }
120
121 if (self->reader) {
122 Py_buffer buffer;
123
124 assert(self->readResult == NULL);
125 self->readResult = PyObject_CallMethod(self->reader, "read",
126 "k", self->readSize);
127 if (NULL == self->readResult) {
128 return -1;
129 }
130
131 memset(&buffer, 0, sizeof(buffer));
132
133 if (0 != PyObject_GetBuffer(self->readResult, &buffer, PyBUF_CONTIG_RO)) {
134 return -1;
135 }
136
137 /* EOF */
138 if (0 == buffer.len) {
139 self->finishedInput = 1;
140 Py_CLEAR(self->readResult);
141 }
142 else {
143 self->input.src = buffer.buf;
144 self->input.size = buffer.len;
145 self->input.pos = 0;
146 }
147
148 PyBuffer_Release(&buffer);
149 }
150 else {
151 assert(self->buffer.buf);
152 /*
153 * We should only get here once since expectation is we always
154 * exhaust input buffer before reading again.
155 */
156 assert(self->input.src == NULL);
157
158 self->input.src = self->buffer.buf;
159 self->input.size = self->buffer.len;
160 self->input.pos = 0;
161 }
162
163 return 1;
164 }
165
166 /**
167 * Decompresses available input into an output buffer.
168 *
169 * Returns 0 if we need more input.
170 * Returns 1 if output buffer should be emitted.
171 * Returns -1 on error and sets a Python exception.
172 */
173 int decompress_input(ZstdDecompressionReader* self, ZSTD_outBuffer* output) {
174 size_t zresult;
175
176 if (self->input.pos >= self->input.size) {
177 return 0;
178 }
179
180 Py_BEGIN_ALLOW_THREADS
181 zresult = ZSTD_decompressStream(self->decompressor->dctx, output, &self->input);
182 Py_END_ALLOW_THREADS
183
184 /* Input exhausted. Clear our state tracking. */
185 if (self->input.pos == self->input.size) {
186 memset(&self->input, 0, sizeof(self->input));
187 Py_CLEAR(self->readResult);
188
189 if (self->buffer.buf) {
190 self->finishedInput = 1;
191 }
192 }
193
194 if (ZSTD_isError(zresult)) {
195 PyErr_Format(ZstdError, "zstd decompress error: %s", ZSTD_getErrorName(zresult));
196 return -1;
197 }
198
199 /* We fulfilled the full read request. Signal to emit. */
200 if (output->pos && output->pos == output->size) {
201 return 1;
202 }
203 /* We're at the end of a frame and we aren't allowed to return data
204 spanning frames. */
205 else if (output->pos && zresult == 0 && !self->readAcrossFrames) {
206 return 1;
207 }
208
209 /* There is more room in the output. Signal to collect more data. */
210 return 0;
211 }
212
105 213 static PyObject* reader_read(ZstdDecompressionReader* self, PyObject* args, PyObject* kwargs) {
106 214 static char* kwlist[] = {
107 215 "size",
@@ -113,26 +221,30 b' static PyObject* reader_read(ZstdDecompr'
113 221 char* resultBuffer;
114 222 Py_ssize_t resultSize;
115 223 ZSTD_outBuffer output;
116 size_t zresult;
224 int decompressResult, readResult;
117 225
118 226 if (self->closed) {
119 227 PyErr_SetString(PyExc_ValueError, "stream is closed");
120 228 return NULL;
121 229 }
122 230
123 if (self->finishedOutput) {
124 return PyBytes_FromStringAndSize("", 0);
125 }
126
127 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "n", kwlist, &size)) {
231 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|n", kwlist, &size)) {
128 232 return NULL;
129 233 }
130 234
131 if (size < 1) {
132 PyErr_SetString(PyExc_ValueError, "cannot read negative or size 0 amounts");
235 if (size < -1) {
236 PyErr_SetString(PyExc_ValueError, "cannot read negative amounts less than -1");
133 237 return NULL;
134 238 }
135 239
240 if (size == -1) {
241 return PyObject_CallMethod((PyObject*)self, "readall", NULL);
242 }
243
244 if (self->finishedOutput || size == 0) {
245 return PyBytes_FromStringAndSize("", 0);
246 }
247
136 248 result = PyBytes_FromStringAndSize(NULL, size);
137 249 if (NULL == result) {
138 250 return NULL;
@@ -146,85 +258,38 b' static PyObject* reader_read(ZstdDecompr'
146 258
147 259 readinput:
148 260
149 /* Consume input data left over from last time. */
150 if (self->input.pos < self->input.size) {
151 Py_BEGIN_ALLOW_THREADS
152 zresult = ZSTD_decompress_generic(self->decompressor->dctx,
153 &output, &self->input);
154 Py_END_ALLOW_THREADS
261 decompressResult = decompress_input(self, &output);
155 262
156 /* Input exhausted. Clear our state tracking. */
157 if (self->input.pos == self->input.size) {
158 memset(&self->input, 0, sizeof(self->input));
159 Py_CLEAR(self->readResult);
263 if (-1 == decompressResult) {
264 Py_XDECREF(result);
265 return NULL;
266 }
267 else if (0 == decompressResult) { }
268 else if (1 == decompressResult) {
269 self->bytesDecompressed += output.pos;
160 270
161 if (self->buffer.buf) {
162 self->finishedInput = 1;
271 if (output.pos != output.size) {
272 if (safe_pybytes_resize(&result, output.pos)) {
273 Py_XDECREF(result);
274 return NULL;
163 275 }
164 276 }
165
166 if (ZSTD_isError(zresult)) {
167 PyErr_Format(ZstdError, "zstd decompress error: %s", ZSTD_getErrorName(zresult));
168 return NULL;
169 }
170 else if (0 == zresult) {
171 self->finishedOutput = 1;
172 }
173
174 /* We fulfilled the full read request. Emit it. */
175 if (output.pos && output.pos == output.size) {
176 self->bytesDecompressed += output.size;
177 return result;
178 }
179
180 /*
181 * There is more room in the output. Fall through to try to collect
182 * more data so we can try to fill the output.
183 */
277 return result;
278 }
279 else {
280 assert(0);
184 281 }
185 282
186 if (!self->finishedInput) {
187 if (self->reader) {
188 Py_buffer buffer;
189
190 assert(self->readResult == NULL);
191 self->readResult = PyObject_CallMethod(self->reader, "read",
192 "k", self->readSize);
193 if (NULL == self->readResult) {
194 return NULL;
195 }
196
197 memset(&buffer, 0, sizeof(buffer));
198
199 if (0 != PyObject_GetBuffer(self->readResult, &buffer, PyBUF_CONTIG_RO)) {
200 return NULL;
201 }
283 readResult = read_decompressor_input(self);
202 284
203 /* EOF */
204 if (0 == buffer.len) {
205 self->finishedInput = 1;
206 Py_CLEAR(self->readResult);
207 }
208 else {
209 self->input.src = buffer.buf;
210 self->input.size = buffer.len;
211 self->input.pos = 0;
212 }
213
214 PyBuffer_Release(&buffer);
215 }
216 else {
217 assert(self->buffer.buf);
218 /*
219 * We should only get here once since above block will exhaust
220 * source buffer until finishedInput is set.
221 */
222 assert(self->input.src == NULL);
223
224 self->input.src = self->buffer.buf;
225 self->input.size = self->buffer.len;
226 self->input.pos = 0;
227 }
285 if (-1 == readResult) {
286 Py_XDECREF(result);
287 return NULL;
288 }
289 else if (0 == readResult) {}
290 else if (1 == readResult) {}
291 else {
292 assert(0);
228 293 }
229 294
230 295 if (self->input.size) {
@@ -242,18 +307,288 b' readinput:'
242 307 return result;
243 308 }
244 309
310 static PyObject* reader_read1(ZstdDecompressionReader* self, PyObject* args, PyObject* kwargs) {
311 static char* kwlist[] = {
312 "size",
313 NULL
314 };
315
316 Py_ssize_t size = -1;
317 PyObject* result = NULL;
318 char* resultBuffer;
319 Py_ssize_t resultSize;
320 ZSTD_outBuffer output;
321
322 if (self->closed) {
323 PyErr_SetString(PyExc_ValueError, "stream is closed");
324 return NULL;
325 }
326
327 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|n", kwlist, &size)) {
328 return NULL;
329 }
330
331 if (size < -1) {
332 PyErr_SetString(PyExc_ValueError, "cannot read negative amounts less than -1");
333 return NULL;
334 }
335
336 if (self->finishedOutput || size == 0) {
337 return PyBytes_FromStringAndSize("", 0);
338 }
339
340 if (size == -1) {
341 size = ZSTD_DStreamOutSize();
342 }
343
344 result = PyBytes_FromStringAndSize(NULL, size);
345 if (NULL == result) {
346 return NULL;
347 }
348
349 PyBytes_AsStringAndSize(result, &resultBuffer, &resultSize);
350
351 output.dst = resultBuffer;
352 output.size = resultSize;
353 output.pos = 0;
354
355 /* read1() is supposed to use at most 1 read() from the underlying stream.
356 * However, we can't satisfy this requirement with decompression due to the
357 * nature of how decompression works. Our strategy is to read + decompress
358 * until we get any output, at which point we return. This satisfies the
359 * intent of the read1() API to limit read operations.
360 */
361 while (!self->finishedInput) {
362 int readResult, decompressResult;
363
364 readResult = read_decompressor_input(self);
365 if (-1 == readResult) {
366 Py_XDECREF(result);
367 return NULL;
368 }
369 else if (0 == readResult || 1 == readResult) { }
370 else {
371 assert(0);
372 }
373
374 decompressResult = decompress_input(self, &output);
375
376 if (-1 == decompressResult) {
377 Py_XDECREF(result);
378 return NULL;
379 }
380 else if (0 == decompressResult || 1 == decompressResult) { }
381 else {
382 assert(0);
383 }
384
385 if (output.pos) {
386 break;
387 }
388 }
389
390 self->bytesDecompressed += output.pos;
391 if (safe_pybytes_resize(&result, output.pos)) {
392 Py_XDECREF(result);
393 return NULL;
394 }
395
396 return result;
397 }
398
399 static PyObject* reader_readinto(ZstdDecompressionReader* self, PyObject* args) {
400 Py_buffer dest;
401 ZSTD_outBuffer output;
402 int decompressResult, readResult;
403 PyObject* result = NULL;
404
405 if (self->closed) {
406 PyErr_SetString(PyExc_ValueError, "stream is closed");
407 return NULL;
408 }
409
410 if (self->finishedOutput) {
411 return PyLong_FromLong(0);
412 }
413
414 if (!PyArg_ParseTuple(args, "w*:readinto", &dest)) {
415 return NULL;
416 }
417
418 if (!PyBuffer_IsContiguous(&dest, 'C') || dest.ndim > 1) {
419 PyErr_SetString(PyExc_ValueError,
420 "destination buffer should be contiguous and have at most one dimension");
421 goto finally;
422 }
423
424 output.dst = dest.buf;
425 output.size = dest.len;
426 output.pos = 0;
427
428 readinput:
429
430 decompressResult = decompress_input(self, &output);
431
432 if (-1 == decompressResult) {
433 goto finally;
434 }
435 else if (0 == decompressResult) { }
436 else if (1 == decompressResult) {
437 self->bytesDecompressed += output.pos;
438 result = PyLong_FromSize_t(output.pos);
439 goto finally;
440 }
441 else {
442 assert(0);
443 }
444
445 readResult = read_decompressor_input(self);
446
447 if (-1 == readResult) {
448 goto finally;
449 }
450 else if (0 == readResult) {}
451 else if (1 == readResult) {}
452 else {
453 assert(0);
454 }
455
456 if (self->input.size) {
457 goto readinput;
458 }
459
460 /* EOF */
461 self->bytesDecompressed += output.pos;
462 result = PyLong_FromSize_t(output.pos);
463
464 finally:
465 PyBuffer_Release(&dest);
466
467 return result;
468 }
469
470 static PyObject* reader_readinto1(ZstdDecompressionReader* self, PyObject* args) {
471 Py_buffer dest;
472 ZSTD_outBuffer output;
473 PyObject* result = NULL;
474
475 if (self->closed) {
476 PyErr_SetString(PyExc_ValueError, "stream is closed");
477 return NULL;
478 }
479
480 if (self->finishedOutput) {
481 return PyLong_FromLong(0);
482 }
483
484 if (!PyArg_ParseTuple(args, "w*:readinto1", &dest)) {
485 return NULL;
486 }
487
488 if (!PyBuffer_IsContiguous(&dest, 'C') || dest.ndim > 1) {
489 PyErr_SetString(PyExc_ValueError,
490 "destination buffer should be contiguous and have at most one dimension");
491 goto finally;
492 }
493
494 output.dst = dest.buf;
495 output.size = dest.len;
496 output.pos = 0;
497
498 while (!self->finishedInput && !self->finishedOutput) {
499 int decompressResult, readResult;
500
501 readResult = read_decompressor_input(self);
502
503 if (-1 == readResult) {
504 goto finally;
505 }
506 else if (0 == readResult || 1 == readResult) {}
507 else {
508 assert(0);
509 }
510
511 decompressResult = decompress_input(self, &output);
512
513 if (-1 == decompressResult) {
514 goto finally;
515 }
516 else if (0 == decompressResult || 1 == decompressResult) {}
517 else {
518 assert(0);
519 }
520
521 if (output.pos) {
522 break;
523 }
524 }
525
526 self->bytesDecompressed += output.pos;
527 result = PyLong_FromSize_t(output.pos);
528
529 finally:
530 PyBuffer_Release(&dest);
531
532 return result;
533 }
534
245 535 static PyObject* reader_readall(PyObject* self) {
246 PyErr_SetNone(PyExc_NotImplementedError);
247 return NULL;
536 PyObject* chunks = NULL;
537 PyObject* empty = NULL;
538 PyObject* result = NULL;
539
540 /* Our strategy is to collect chunks into a list then join all the
541 * chunks at the end. We could potentially use e.g. an io.BytesIO. But
542 * this feels simple enough to implement and avoids potentially expensive
543 * reallocations of large buffers.
544 */
545 chunks = PyList_New(0);
546 if (NULL == chunks) {
547 return NULL;
548 }
549
550 while (1) {
551 PyObject* chunk = PyObject_CallMethod(self, "read", "i", 1048576);
552 if (NULL == chunk) {
553 Py_DECREF(chunks);
554 return NULL;
555 }
556
557 if (!PyBytes_Size(chunk)) {
558 Py_DECREF(chunk);
559 break;
560 }
561
562 if (PyList_Append(chunks, chunk)) {
563 Py_DECREF(chunk);
564 Py_DECREF(chunks);
565 return NULL;
566 }
567
568 Py_DECREF(chunk);
569 }
570
571 empty = PyBytes_FromStringAndSize("", 0);
572 if (NULL == empty) {
573 Py_DECREF(chunks);
574 return NULL;
575 }
576
577 result = PyObject_CallMethod(empty, "join", "O", chunks);
578
579 Py_DECREF(empty);
580 Py_DECREF(chunks);
581
582 return result;
248 583 }
249 584
250 585 static PyObject* reader_readline(PyObject* self) {
251 PyErr_SetNone(PyExc_NotImplementedError);
586 set_unsupported_operation();
252 587 return NULL;
253 588 }
254 589
255 590 static PyObject* reader_readlines(PyObject* self) {
256 PyErr_SetNone(PyExc_NotImplementedError);
591 set_unsupported_operation();
257 592 return NULL;
258 593 }
259 594
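Reviewer note: the new ``reader_readall`` collects 1 MiB chunks in a list and joins them once at the end. The same strategy can be sketched in Python as below. This is only an illustration of the approach described in the C comment; the ``read_all`` helper and its ``chunk_size`` parameter are hypothetical names, not part of the zstandard API.

```python
import io


def read_all(stream, chunk_size=1048576):
    """Drain `stream` by collecting fixed-size reads, then join once.

    Hypothetical helper mirroring the strategy in reader_readall.
    """
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # An empty read signals end of stream.
            break
        chunks.append(chunk)
    # A single join avoids repeatedly reallocating one growing buffer.
    return b"".join(chunks)


data = b"foo" * 1024
result = read_all(io.BytesIO(data), chunk_size=100)
```

Any object with a ``read(size)`` method that returns ``b''`` at end of stream should work as ``stream`` here.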
@@ -345,12 +680,12 b' static PyObject* reader_writelines(PyObj'
345 680 }
346 681
347 682 static PyObject* reader_iter(PyObject* self) {
348 PyErr_SetNone(PyExc_NotImplementedError);
683 set_unsupported_operation();
349 684 return NULL;
350 685 }
351 686
352 687 static PyObject* reader_iternext(PyObject* self) {
353 PyErr_SetNone(PyExc_NotImplementedError);
688 set_unsupported_operation();
354 689 return NULL;
355 690 }
356 691
@@ -367,6 +702,10 b' static PyMethodDef reader_methods[] = {'
367 702 PyDoc_STR("Returns True") },
368 703 { "read", (PyCFunction)reader_read, METH_VARARGS | METH_KEYWORDS,
369 704 PyDoc_STR("read compressed data") },
705 { "read1", (PyCFunction)reader_read1, METH_VARARGS | METH_KEYWORDS,
706 PyDoc_STR("read compressed data") },
707 { "readinto", (PyCFunction)reader_readinto, METH_VARARGS, NULL },
708 { "readinto1", (PyCFunction)reader_readinto1, METH_VARARGS, NULL },
370 709 { "readall", (PyCFunction)reader_readall, METH_NOARGS, PyDoc_STR("Not implemented") },
371 710 { "readline", (PyCFunction)reader_readline, METH_NOARGS, PyDoc_STR("Not implemented") },
372 711 { "readlines", (PyCFunction)reader_readlines, METH_NOARGS, PyDoc_STR("Not implemented") },
@@ -22,12 +22,13 b' static void ZstdDecompressionWriter_deal'
22 22 }
23 23
24 24 static PyObject* ZstdDecompressionWriter_enter(ZstdDecompressionWriter* self) {
25 if (self->entered) {
26 PyErr_SetString(ZstdError, "cannot __enter__ multiple times");
25 if (self->closed) {
26 PyErr_SetString(PyExc_ValueError, "stream is closed");
27 27 return NULL;
28 28 }
29 29
30 if (ensure_dctx(self->decompressor, 1)) {
30 if (self->entered) {
31 PyErr_SetString(ZstdError, "cannot __enter__ multiple times");
31 32 return NULL;
32 33 }
33 34
@@ -40,6 +41,10 b' static PyObject* ZstdDecompressionWriter'
40 41 static PyObject* ZstdDecompressionWriter_exit(ZstdDecompressionWriter* self, PyObject* args) {
41 42 self->entered = 0;
42 43
44 if (NULL == PyObject_CallMethod((PyObject*)self, "close", NULL)) {
45 return NULL;
46 }
47
43 48 Py_RETURN_FALSE;
44 49 }
45 50
@@ -76,9 +81,9 b' static PyObject* ZstdDecompressionWriter'
76 81 goto finally;
77 82 }
78 83
79 if (!self->entered) {
80 PyErr_SetString(ZstdError, "write must be called from an active context manager");
81 goto finally;
84 if (self->closed) {
85 PyErr_SetString(PyExc_ValueError, "stream is closed");
86 return NULL;
82 87 }
83 88
84 89 output.dst = PyMem_Malloc(self->outSize);
@@ -93,9 +98,9 b' static PyObject* ZstdDecompressionWriter'
93 98 input.size = source.len;
94 99 input.pos = 0;
95 100
96 while ((ssize_t)input.pos < source.len) {
101 while (input.pos < (size_t)source.len) {
97 102 Py_BEGIN_ALLOW_THREADS
98 zresult = ZSTD_decompress_generic(self->decompressor->dctx, &output, &input);
103 zresult = ZSTD_decompressStream(self->decompressor->dctx, &output, &input);
99 104 Py_END_ALLOW_THREADS
100 105
101 106 if (ZSTD_isError(zresult)) {
@@ -120,13 +125,94 b' static PyObject* ZstdDecompressionWriter'
120 125
121 126 PyMem_Free(output.dst);
122 127
123 result = PyLong_FromSsize_t(totalWrite);
128 if (self->writeReturnRead) {
129 result = PyLong_FromSize_t(input.pos);
130 }
131 else {
132 result = PyLong_FromSsize_t(totalWrite);
133 }
124 134
125 135 finally:
126 136 PyBuffer_Release(&source);
127 137 return result;
128 138 }
129 139
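Reviewer note: with ``write_return_read`` set, ``write()`` reports how many input bytes were consumed rather than how many decompressed bytes were written to the underlying stream. A minimal Python sketch of that distinction follows. It uses ``zlib`` from the standard library as a stand-in for zstd, and the ``DecompressingWriter`` class is a hypothetical illustration, not the extension's implementation.

```python
import io
import zlib


class DecompressingWriter:
    """Hypothetical writer: decompresses input and forwards it to `writer`."""

    def __init__(self, writer, write_return_read=False):
        self._writer = writer
        self._dobj = zlib.decompressobj()  # stand-in for a zstd context
        self._write_return_read = write_return_read

    def write(self, data):
        out = self._dobj.decompress(data)
        self._writer.write(out)
        if self._write_return_read:
            # Bytes consumed from the caller's input.
            return len(data)
        # Bytes written to the underlying stream.
        return len(out)


compressed = zlib.compress(b"a" * 1000)

consumed = DecompressingWriter(io.BytesIO(), write_return_read=True).write(compressed)
written = DecompressingWriter(io.BytesIO(), write_return_read=False).write(compressed)
```

Returning the number of bytes consumed matches what ``io`` stream wrappers generally expect from ``write()``, which is presumably why the option exists.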
140 static PyObject* ZstdDecompressionWriter_close(ZstdDecompressionWriter* self) {
141 PyObject* result;
142
143 if (self->closed) {
144 Py_RETURN_NONE;
145 }
146
147 result = PyObject_CallMethod((PyObject*)self, "flush", NULL);
148 self->closed = 1;
149
150 if (NULL == result) {
151 return NULL;
152 }
153
154 /* Call close on underlying stream as well. */
155 if (PyObject_HasAttrString(self->writer, "close")) {
156 return PyObject_CallMethod(self->writer, "close", NULL);
157 }
158
159 Py_RETURN_NONE;
160 }
161
162 static PyObject* ZstdDecompressionWriter_fileno(ZstdDecompressionWriter* self) {
163 if (PyObject_HasAttrString(self->writer, "fileno")) {
164 return PyObject_CallMethod(self->writer, "fileno", NULL);
165 }
166 else {
167 PyErr_SetString(PyExc_OSError, "fileno not available on underlying writer");
168 return NULL;
169 }
170 }
171
172 static PyObject* ZstdDecompressionWriter_flush(ZstdDecompressionWriter* self) {
173 if (self->closed) {
174 PyErr_SetString(PyExc_ValueError, "stream is closed");
175 return NULL;
176 }
177
178 if (PyObject_HasAttrString(self->writer, "flush")) {
179 return PyObject_CallMethod(self->writer, "flush", NULL);
180 }
181 else {
182 Py_RETURN_NONE;
183 }
184 }
185
186 static PyObject* ZstdDecompressionWriter_false(PyObject* self, PyObject* args) {
187 Py_RETURN_FALSE;
188 }
189
190 static PyObject* ZstdDecompressionWriter_true(PyObject* self, PyObject* args) {
191 Py_RETURN_TRUE;
192 }
193
194 static PyObject* ZstdDecompressionWriter_unsupported(PyObject* self, PyObject* args, PyObject* kwargs) {
195 PyObject* iomod;
196 PyObject* exc;
197
198 iomod = PyImport_ImportModule("io");
199 if (NULL == iomod) {
200 return NULL;
201 }
202
203 exc = PyObject_GetAttrString(iomod, "UnsupportedOperation");
204 if (NULL == exc) {
205 Py_DECREF(iomod);
206 return NULL;
207 }
208
209 PyErr_SetNone(exc);
210 Py_DECREF(exc);
211 Py_DECREF(iomod);
212
213 return NULL;
214 }
215
130 216 static PyMethodDef ZstdDecompressionWriter_methods[] = {
131 217 { "__enter__", (PyCFunction)ZstdDecompressionWriter_enter, METH_NOARGS,
132 218 PyDoc_STR("Enter a decompression context.") },
@@ -134,11 +220,32 b' static PyMethodDef ZstdDecompressionWrit'
134 220 PyDoc_STR("Exit a decompression context.") },
135 221 { "memory_size", (PyCFunction)ZstdDecompressionWriter_memory_size, METH_NOARGS,
136 222 PyDoc_STR("Obtain the memory size in bytes of the underlying decompressor.") },
223 { "close", (PyCFunction)ZstdDecompressionWriter_close, METH_NOARGS, NULL },
224 { "fileno", (PyCFunction)ZstdDecompressionWriter_fileno, METH_NOARGS, NULL },
225 { "flush", (PyCFunction)ZstdDecompressionWriter_flush, METH_NOARGS, NULL },
226 { "isatty", ZstdDecompressionWriter_false, METH_NOARGS, NULL },
227 { "readable", ZstdDecompressionWriter_false, METH_NOARGS, NULL },
228 { "readline", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
229 { "readlines", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
230 { "seek", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
231 { "seekable", ZstdDecompressionWriter_false, METH_NOARGS, NULL },
232 { "tell", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
233 { "truncate", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
234 { "writable", ZstdDecompressionWriter_true, METH_NOARGS, NULL },
235 { "writelines" , (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
236 { "read", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
237 { "readall", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
238 { "readinto", (PyCFunction)ZstdDecompressionWriter_unsupported, METH_VARARGS | METH_KEYWORDS, NULL },
137 239 { "write", (PyCFunction)ZstdDecompressionWriter_write, METH_VARARGS | METH_KEYWORDS,
138 240 PyDoc_STR("Compress data") },
139 241 { NULL, NULL }
140 242 };
141 243
244 static PyMemberDef ZstdDecompressionWriter_members[] = {
245 { "closed", T_BOOL, offsetof(ZstdDecompressionWriter, closed), READONLY, NULL },
246 { NULL }
247 };
248
142 249 PyTypeObject ZstdDecompressionWriterType = {
143 250 PyVarObject_HEAD_INIT(NULL, 0)
144 251 "zstd.ZstdDecompressionWriter", /* tp_name */
@@ -168,7 +275,7 b' PyTypeObject ZstdDecompressionWriterType'
168 275 0, /* tp_iter */
169 276 0, /* tp_iternext */
170 277 ZstdDecompressionWriter_methods,/* tp_methods */
171 0, /* tp_members */
278 ZstdDecompressionWriter_members,/* tp_members */
172 279 0, /* tp_getset */
173 280 0, /* tp_base */
174 281 0, /* tp_dict */
@@ -75,7 +75,7 b' static PyObject* DecompressionObj_decomp'
75 75
76 76 while (1) {
77 77 Py_BEGIN_ALLOW_THREADS
78 zresult = ZSTD_decompress_generic(self->decompressor->dctx, &output, &input);
78 zresult = ZSTD_decompressStream(self->decompressor->dctx, &output, &input);
79 79 Py_END_ALLOW_THREADS
80 80
81 81 if (ZSTD_isError(zresult)) {
@@ -130,9 +130,26 b' finally:'
130 130 return result;
131 131 }
132 132
133 static PyObject* DecompressionObj_flush(ZstdDecompressionObj* self, PyObject* args, PyObject* kwargs) {
134 static char* kwlist[] = {
135 "length",
136 NULL
137 };
138
139 PyObject* length = NULL;
140
141 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|O:flush", kwlist, &length)) {
142 return NULL;
143 }
144
145 Py_RETURN_NONE;
146 }
147
133 148 static PyMethodDef DecompressionObj_methods[] = {
134 149 { "decompress", (PyCFunction)DecompressionObj_decompress,
135 150 METH_VARARGS | METH_KEYWORDS, PyDoc_STR("decompress data") },
151 { "flush", (PyCFunction)DecompressionObj_flush,
152 METH_VARARGS | METH_KEYWORDS, PyDoc_STR("no-op") },
136 153 { NULL, NULL }
137 154 };
138 155
@@ -17,7 +17,7 b' extern PyObject* ZstdError;'
17 17 int ensure_dctx(ZstdDecompressor* decompressor, int loadDict) {
18 18 size_t zresult;
19 19
20 ZSTD_DCtx_reset(decompressor->dctx);
20 ZSTD_DCtx_reset(decompressor->dctx, ZSTD_reset_session_only);
21 21
22 22 if (decompressor->maxWindowSize) {
23 23 zresult = ZSTD_DCtx_setMaxWindowSize(decompressor->dctx, decompressor->maxWindowSize);
@@ -229,7 +229,7 b' static PyObject* Decompressor_copy_strea'
229 229
230 230 while (input.pos < input.size) {
231 231 Py_BEGIN_ALLOW_THREADS
232 zresult = ZSTD_decompress_generic(self->dctx, &output, &input);
232 zresult = ZSTD_decompressStream(self->dctx, &output, &input);
233 233 Py_END_ALLOW_THREADS
234 234
235 235 if (ZSTD_isError(zresult)) {
@@ -379,7 +379,7 b' PyObject* Decompressor_decompress(ZstdDe'
379 379 inBuffer.pos = 0;
380 380
381 381 Py_BEGIN_ALLOW_THREADS
382 zresult = ZSTD_decompress_generic(self->dctx, &outBuffer, &inBuffer);
382 zresult = ZSTD_decompressStream(self->dctx, &outBuffer, &inBuffer);
383 383 Py_END_ALLOW_THREADS
384 384
385 385 if (ZSTD_isError(zresult)) {
@@ -550,28 +550,35 b' finally:'
550 550 }
551 551
552 552 PyDoc_STRVAR(Decompressor_stream_reader__doc__,
553 "stream_reader(source, [read_size=default])\n"
553 "stream_reader(source, [read_size=default, [read_across_frames=False]])\n"
554 554 "\n"
555 555 "Obtain an object that behaves like an I/O stream that can be used for\n"
556 556 "reading decompressed output from an object.\n"
557 557 "\n"
558 558 "The source object can be any object with a ``read(size)`` method or that\n"
559 559 "conforms to the buffer protocol.\n"
560 "\n"
561 "``read_across_frames`` controls the behavior of ``read()`` when the end\n"
562 "of a zstd frame is reached. When ``True``, ``read()`` can potentially\n"
563 "return data belonging to multiple zstd frames. When ``False``, ``read()``\n"
564 "will return when the end of a frame is reached.\n"
560 565 );
561 566
562 567 static ZstdDecompressionReader* Decompressor_stream_reader(ZstdDecompressor* self, PyObject* args, PyObject* kwargs) {
563 568 static char* kwlist[] = {
564 569 "source",
565 570 "read_size",
571 "read_across_frames",
566 572 NULL
567 573 };
568 574
569 575 PyObject* source;
570 576 size_t readSize = ZSTD_DStreamInSize();
577 PyObject* readAcrossFrames = NULL;
571 578 ZstdDecompressionReader* result;
572 579
573 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|k:stream_reader", kwlist,
574 &source, &readSize)) {
580 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|kO:stream_reader", kwlist,
581 &source, &readSize, &readAcrossFrames)) {
575 582 return NULL;
576 583 }
577 584
@@ -604,6 +611,7 b' static ZstdDecompressionReader* Decompre'
604 611
605 612 result->decompressor = self;
606 613 Py_INCREF(self);
614 result->readAcrossFrames = readAcrossFrames ? PyObject_IsTrue(readAcrossFrames) : 0;
607 615
608 616 return result;
609 617 }
@@ -625,15 +633,17 b' static ZstdDecompressionWriter* Decompre'
625 633 static char* kwlist[] = {
626 634 "writer",
627 635 "write_size",
636 "write_return_read",
628 637 NULL
629 638 };
630 639
631 640 PyObject* writer;
632 641 size_t outSize = ZSTD_DStreamOutSize();
642 PyObject* writeReturnRead = NULL;
633 643 ZstdDecompressionWriter* result;
634 644
635 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|k:stream_writer", kwlist,
636 &writer, &outSize)) {
645 if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|kO:stream_writer", kwlist,
646 &writer, &outSize, &writeReturnRead)) {
637 647 return NULL;
638 648 }
639 649
@@ -642,6 +652,10 b' static ZstdDecompressionWriter* Decompre'
642 652 return NULL;
643 653 }
644 654
655 if (ensure_dctx(self, 1)) {
656 return NULL;
657 }
658
645 659 result = (ZstdDecompressionWriter*)PyObject_CallObject((PyObject*)&ZstdDecompressionWriterType, NULL);
646 660 if (!result) {
647 661 return NULL;
@@ -654,6 +668,7 b' static ZstdDecompressionWriter* Decompre'
654 668 Py_INCREF(result->writer);
655 669
656 670 result->outSize = outSize;
671 result->writeReturnRead = writeReturnRead ? PyObject_IsTrue(writeReturnRead) : 0;
657 672
658 673 return result;
659 674 }
@@ -756,7 +771,7 b' static PyObject* Decompressor_decompress'
756 771 inBuffer.pos = 0;
757 772
758 773 Py_BEGIN_ALLOW_THREADS
759 zresult = ZSTD_decompress_generic(self->dctx, &outBuffer, &inBuffer);
774 zresult = ZSTD_decompressStream(self->dctx, &outBuffer, &inBuffer);
760 775 Py_END_ALLOW_THREADS
761 776 if (ZSTD_isError(zresult)) {
762 777 PyErr_Format(ZstdError, "could not decompress chunk 0: %s", ZSTD_getErrorName(zresult));
@@ -852,7 +867,7 b' static PyObject* Decompressor_decompress'
852 867 outBuffer.pos = 0;
853 868
854 869 Py_BEGIN_ALLOW_THREADS
855 zresult = ZSTD_decompress_generic(self->dctx, &outBuffer, &inBuffer);
870 zresult = ZSTD_decompressStream(self->dctx, &outBuffer, &inBuffer);
856 871 Py_END_ALLOW_THREADS
857 872 if (ZSTD_isError(zresult)) {
858 873 PyErr_Format(ZstdError, "could not decompress chunk %zd: %s",
@@ -892,7 +907,7 b' static PyObject* Decompressor_decompress'
892 907 outBuffer.pos = 0;
893 908
894 909 Py_BEGIN_ALLOW_THREADS
895 zresult = ZSTD_decompress_generic(self->dctx, &outBuffer, &inBuffer);
910 zresult = ZSTD_decompressStream(self->dctx, &outBuffer, &inBuffer);
896 911 Py_END_ALLOW_THREADS
897 912 if (ZSTD_isError(zresult)) {
898 913 PyErr_Format(ZstdError, "could not decompress chunk %zd: %s",
@@ -1176,7 +1191,7 b' static void decompress_worker(WorkerStat'
1176 1191 inBuffer.size = sourceSize;
1177 1192 inBuffer.pos = 0;
1178 1193
1179 zresult = ZSTD_decompress_generic(state->dctx, &outBuffer, &inBuffer);
1194 zresult = ZSTD_decompressStream(state->dctx, &outBuffer, &inBuffer);
1180 1195 if (ZSTD_isError(zresult)) {
1181 1196 state->error = WorkerError_zstd;
1182 1197 state->zresult = zresult;
@@ -57,7 +57,7 b' static DecompressorIteratorResult read_d'
57 57 self->output.pos = 0;
58 58
59 59 Py_BEGIN_ALLOW_THREADS
60 zresult = ZSTD_decompress_generic(self->decompressor->dctx, &self->output, &self->input);
60 zresult = ZSTD_decompressStream(self->decompressor->dctx, &self->output, &self->input);
61 61 Py_END_ALLOW_THREADS
62 62
63 63 /* We're done with the pointer. Nullify to prevent anyone from getting a
@@ -16,7 +16,7 b''
16 16 #include <zdict.h>
17 17
18 18 /* Remember to change the string in zstandard/__init__ as well */
19 #define PYTHON_ZSTANDARD_VERSION "0.10.1"
19 #define PYTHON_ZSTANDARD_VERSION "0.11.0"
20 20
21 21 typedef enum {
22 22 compressorobj_flush_finish,
@@ -31,27 +31,6 b' typedef enum {'
31 31 typedef struct {
32 32 PyObject_HEAD
33 33 ZSTD_CCtx_params* params;
34 unsigned format;
35 int compressionLevel;
36 unsigned windowLog;
37 unsigned hashLog;
38 unsigned chainLog;
39 unsigned searchLog;
40 unsigned minMatch;
41 unsigned targetLength;
42 unsigned compressionStrategy;
43 unsigned contentSizeFlag;
44 unsigned checksumFlag;
45 unsigned dictIDFlag;
46 unsigned threads;
47 unsigned jobSize;
48 unsigned overlapSizeLog;
49 unsigned forceMaxWindow;
50 unsigned enableLongDistanceMatching;
51 unsigned ldmHashLog;
52 unsigned ldmMinMatch;
53 unsigned ldmBucketSizeLog;
54 unsigned ldmHashEveryLog;
55 34 } ZstdCompressionParametersObject;
56 35
57 36 extern PyTypeObject ZstdCompressionParametersType;
@@ -129,9 +108,11 b' typedef struct {'
129 108
130 109 ZstdCompressor* compressor;
131 110 PyObject* writer;
132 unsigned long long sourceSize;
111 ZSTD_outBuffer output;
133 112 size_t outSize;
134 113 int entered;
114 int closed;
115 int writeReturnRead;
135 116 unsigned long long bytesCompressed;
136 117 } ZstdCompressionWriter;
137 118
@@ -235,6 +216,8 b' typedef struct {'
235 216 PyObject* reader;
236 217 /* Size for read() operations on reader. */
237 218 size_t readSize;
219 /* Whether a read() can return data spanning multiple zstd frames. */
220 int readAcrossFrames;
238 221 /* Buffer to read from (if reading from a buffer). */
239 222 Py_buffer buffer;
240 223
@@ -267,6 +250,8 b' typedef struct {'
267 250 PyObject* writer;
268 251 size_t outSize;
269 252 int entered;
253 int closed;
254 int writeReturnRead;
270 255 } ZstdDecompressionWriter;
271 256
272 257 extern PyTypeObject ZstdDecompressionWriterType;
@@ -360,8 +345,9 b' typedef struct {'
360 345
361 346 extern PyTypeObject ZstdBufferWithSegmentsCollectionType;
362 347
363 int set_parameter(ZSTD_CCtx_params* params, ZSTD_cParameter param, unsigned value);
348 int set_parameter(ZSTD_CCtx_params* params, ZSTD_cParameter param, int value);
364 349 int set_parameters(ZSTD_CCtx_params* params, ZstdCompressionParametersObject* obj);
350 int to_cparams(ZstdCompressionParametersObject* params, ZSTD_compressionParameters* cparams);
365 351 FrameParametersObject* get_frame_parameters(PyObject* self, PyObject* args, PyObject* kwargs);
366 352 int ensure_ddict(ZstdCompressionDict* dict);
367 353 int ensure_dctx(ZstdDecompressor* decompressor, int loadDict);
@@ -36,7 +36,9 b" SOURCES = ['zstd/%s' % p for p in ("
36 36 'compress/zstd_opt.c',
37 37 'compress/zstdmt_compress.c',
38 38 'decompress/huf_decompress.c',
39 'decompress/zstd_ddict.c',
39 40 'decompress/zstd_decompress.c',
41 'decompress/zstd_decompress_block.c',
40 42 'dictBuilder/cover.c',
41 43 'dictBuilder/fastcover.c',
42 44 'dictBuilder/divsufsort.c',
@@ -5,12 +5,32 b''
5 5 # This software may be modified and distributed under the terms
6 6 # of the BSD license. See the LICENSE file for details.
7 7
8 from __future__ import print_function
9
10 from distutils.version import LooseVersion
8 11 import os
9 12 import sys
10 13 from setuptools import setup
11 14
15 # Need change in 1.10 for ffi.from_buffer() to handle all buffer types
16 # (like memoryview).
17 # Need feature in 1.11 for ffi.gc() to declare size of objects so we avoid
18 # garbage collection pitfalls.
19 MINIMUM_CFFI_VERSION = '1.11'
20
12 21 try:
13 22 import cffi
23
24 # PyPy (and possibly other distros) have CFFI distributed as part of
25 # them. The install_requires for CFFI below won't work. We need to sniff
26 # out the CFFI version here and reject CFFI if it is too old.
27 cffi_version = LooseVersion(cffi.__version__)
28 if cffi_version < LooseVersion(MINIMUM_CFFI_VERSION):
29 print('CFFI 1.11 or newer required (%s found); '
30 'not building CFFI backend' % cffi_version,
31 file=sys.stderr)
32 cffi = None
33
14 34 except ImportError:
15 35 cffi = None
16 36
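Reviewer note: the gate above compares versions with ``distutils.version.LooseVersion``. Since ``distutils`` was removed from the standard library in Python 3.12, the same check can be sketched with a plain tuple comparison. The ``parse_version`` and ``cffi_is_new_enough`` helpers below are hypothetical and are not what ``setup.py`` uses.

```python
def parse_version(text):
    """Turn the leading numeric components of '1.11.5' into (1, 11, 5)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            # Stop at the first component with no leading digits.
            break
        parts.append(int(digits))
    return tuple(parts)


def cffi_is_new_enough(found, minimum="1.11"):
    """Return True when `found` is at least `minimum` (numeric comparison)."""
    return parse_version(found) >= parse_version(minimum)
```

Comparing integer tuples rather than strings matters here: as strings, ``'1.9'`` sorts after ``'1.11'``, which would wrongly accept an old CFFI.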
@@ -49,12 +69,7 b' install_requires = []'
49 69 if cffi:
50 70 import make_cffi
51 71 extensions.append(make_cffi.ffi.distutils_extension())
52
53 # Need change in 1.10 for ffi.from_buffer() to handle all buffer types
54 # (like memoryview).
55 # Need feature in 1.11 for ffi.gc() to declare size of objects so we avoid
56 # garbage collection pitfalls.
57 install_requires.append('cffi>=1.11')
72 install_requires.append('cffi>=%s' % MINIMUM_CFFI_VERSION)
58 73
59 74 version = None
60 75
@@ -88,6 +103,7 b' setup('
88 103 'Programming Language :: Python :: 3.4',
89 104 'Programming Language :: Python :: 3.5',
90 105 'Programming Language :: Python :: 3.6',
106 'Programming Language :: Python :: 3.7',
91 107 ],
92 108 keywords='zstandard zstd compression',
93 109 packages=['zstandard'],
@@ -30,7 +30,9 b" zstd_sources = ['zstd/%s' % p for p in ("
30 30 'compress/zstd_opt.c',
31 31 'compress/zstdmt_compress.c',
32 32 'decompress/huf_decompress.c',
33 'decompress/zstd_ddict.c',
33 34 'decompress/zstd_decompress.c',
35 'decompress/zstd_decompress_block.c',
34 36 'dictBuilder/cover.c',
35 37 'dictBuilder/divsufsort.c',
36 38 'dictBuilder/fastcover.c',
@@ -79,12 +79,37 b' def make_cffi(cls):'
79 79 return cls
80 80
81 81
82 class OpCountingBytesIO(io.BytesIO):
82 class NonClosingBytesIO(io.BytesIO):
83 """BytesIO that saves the underlying buffer on close().
84
85 This allows us to access written data after close().
86 """
83 87 def __init__(self, *args, **kwargs):
88 super(NonClosingBytesIO, self).__init__(*args, **kwargs)
89 self._saved_buffer = None
90
91 def close(self):
92 self._saved_buffer = self.getvalue()
93 return super(NonClosingBytesIO, self).close()
94
95 def getvalue(self):
96 if self.closed:
97 return self._saved_buffer
98 else:
99 return super(NonClosingBytesIO, self).getvalue()
100
101
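Reviewer note: this helper is needed because the stream writers in this change now call ``close()`` on the underlying writer when they are closed, and a plain ``io.BytesIO`` raises ``ValueError`` from ``getvalue()`` after ``close()``. A short usage sketch follows; the class is restated with Python 3 ``super()`` so the snippet runs on its own.

```python
import io


class NonClosingBytesIO(io.BytesIO):
    """BytesIO that keeps a copy of its contents when closed."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._saved_buffer = None

    def close(self):
        # Snapshot the contents before the real close discards them.
        self._saved_buffer = self.getvalue()
        return super().close()

    def getvalue(self):
        if self.closed:
            return self._saved_buffer
        return super().getvalue()


buf = NonClosingBytesIO()
buf.write(b"payload")
buf.close()
after_close = buf.getvalue()  # still readable after close()
```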
102 class OpCountingBytesIO(NonClosingBytesIO):
103 def __init__(self, *args, **kwargs):
104 self._flush_count = 0
84 105 self._read_count = 0
85 106 self._write_count = 0
86 107 return super(OpCountingBytesIO, self).__init__(*args, **kwargs)
87 108
109 def flush(self):
110 self._flush_count += 1
111 return super(OpCountingBytesIO, self).flush()
112
88 113 def read(self, *args):
89 114 self._read_count += 1
90 115 return super(OpCountingBytesIO, self).read(*args)
@@ -117,6 +142,13 b' def random_input_data():'
117 142 except OSError:
118 143 pass
119 144
145 # Also add some actual random data.
146 _source_files.append(os.urandom(100))
147 _source_files.append(os.urandom(1000))
148 _source_files.append(os.urandom(10000))
149 _source_files.append(os.urandom(100000))
150 _source_files.append(os.urandom(1000000))
151
120 152 return _source_files
121 153
122 154
@@ -140,12 +172,14 b' def generate_samples():'
140 172
141 173
142 174 if hypothesis:
143 default_settings = hypothesis.settings()
175 default_settings = hypothesis.settings(deadline=10000)
144 176 hypothesis.settings.register_profile('default', default_settings)
145 177
146 ci_settings = hypothesis.settings(max_examples=2500,
147 max_iterations=2500)
178 ci_settings = hypothesis.settings(deadline=20000, max_examples=1000)
148 179 hypothesis.settings.register_profile('ci', ci_settings)
149 180
181 expensive_settings = hypothesis.settings(deadline=None, max_examples=10000)
182 hypothesis.settings.register_profile('expensive', expensive_settings)
183
150 184 hypothesis.settings.load_profile(
151 185 os.environ.get('HYPOTHESIS_PROFILE', 'default'))
@@ -8,6 +8,9 b" ss = struct.Struct('=QQ')"
8 8
9 9 class TestBufferWithSegments(unittest.TestCase):
10 10 def test_arguments(self):
11 if not hasattr(zstd, 'BufferWithSegments'):
12 self.skipTest('BufferWithSegments not available')
13
11 14 with self.assertRaises(TypeError):
12 15 zstd.BufferWithSegments()
13 16
@@ -19,10 +22,16 b' class TestBufferWithSegments(unittest.Te'
19 22 zstd.BufferWithSegments(b'foo', b'\x00\x00')
20 23
21 24 def test_invalid_offset(self):
25 if not hasattr(zstd, 'BufferWithSegments'):
26 self.skipTest('BufferWithSegments not available')
27
22 28 with self.assertRaisesRegexp(ValueError, 'offset within segments array references memory'):
23 29 zstd.BufferWithSegments(b'foo', ss.pack(0, 4))
24 30
25 31 def test_invalid_getitem(self):
32 if not hasattr(zstd, 'BufferWithSegments'):
33 self.skipTest('BufferWithSegments not available')
34
26 35 b = zstd.BufferWithSegments(b'foo', ss.pack(0, 3))
27 36
28 37 with self.assertRaisesRegexp(IndexError, 'offset must be non-negative'):
@@ -35,6 +44,9 b' class TestBufferWithSegments(unittest.Te'
35 44 test = b[2]
36 45
37 46 def test_single(self):
47 if not hasattr(zstd, 'BufferWithSegments'):
48 self.skipTest('BufferWithSegments not available')
49
38 50 b = zstd.BufferWithSegments(b'foo', ss.pack(0, 3))
39 51 self.assertEqual(len(b), 1)
40 52 self.assertEqual(b.size, 3)
@@ -45,6 +57,9 b' class TestBufferWithSegments(unittest.Te'
45 57 self.assertEqual(b[0].tobytes(), b'foo')
46 58
47 59 def test_multiple(self):
60 if not hasattr(zstd, 'BufferWithSegments'):
61 self.skipTest('BufferWithSegments not available')
62
48 63 b = zstd.BufferWithSegments(b'foofooxfooxy', b''.join([ss.pack(0, 3),
49 64 ss.pack(3, 4),
50 65 ss.pack(7, 5)]))
@@ -59,10 +74,16 b' class TestBufferWithSegments(unittest.Te'
59 74
60 75 class TestBufferWithSegmentsCollection(unittest.TestCase):
61 76 def test_empty_constructor(self):
77 if not hasattr(zstd, 'BufferWithSegmentsCollection'):
78 self.skipTest('BufferWithSegmentsCollection not available')
79
62 80 with self.assertRaisesRegexp(ValueError, 'must pass at least 1 argument'):
63 81 zstd.BufferWithSegmentsCollection()
64 82
65 83 def test_argument_validation(self):
84 if not hasattr(zstd, 'BufferWithSegmentsCollection'):
85 self.skipTest('BufferWithSegmentsCollection not available')
86
66 87 with self.assertRaisesRegexp(TypeError, 'arguments must be BufferWithSegments'):
67 88 zstd.BufferWithSegmentsCollection(None)
68 89
@@ -74,6 +95,9 b' class TestBufferWithSegmentsCollection(u'
74 95 zstd.BufferWithSegmentsCollection(zstd.BufferWithSegments(b'', b''))
75 96
76 97 def test_length(self):
98 if not hasattr(zstd, 'BufferWithSegmentsCollection'):
99 self.skipTest('BufferWithSegmentsCollection not available')
100
77 101 b1 = zstd.BufferWithSegments(b'foo', ss.pack(0, 3))
78 102 b2 = zstd.BufferWithSegments(b'barbaz', b''.join([ss.pack(0, 3),
79 103 ss.pack(3, 3)]))
@@ -91,6 +115,9 b' class TestBufferWithSegmentsCollection(u'
91 115 self.assertEqual(c.size(), 9)
92 116
93 117 def test_getitem(self):
118 if not hasattr(zstd, 'BufferWithSegmentsCollection'):
119 self.skipTest('BufferWithSegmentsCollection not available')
120
94 121 b1 = zstd.BufferWithSegments(b'foo', ss.pack(0, 3))
95 122 b2 = zstd.BufferWithSegments(b'barbaz', b''.join([ss.pack(0, 3),
96 123 ss.pack(3, 3)]))
@@ -1,14 +1,17 b''
1 1 import hashlib
2 2 import io
3 import os
3 4 import struct
4 5 import sys
5 6 import tarfile
7 import tempfile
6 8 import unittest
7 9
8 10 import zstandard as zstd
9 11
10 12 from .common import (
11 13 make_cffi,
14 NonClosingBytesIO,
12 15 OpCountingBytesIO,
13 16 )
14 17
@@ -272,7 +275,7 b' class TestCompressor_compressobj(unittes'
272 275
273 276 params = zstd.get_frame_parameters(result)
274 277 self.assertEqual(params.content_size, zstd.CONTENTSIZE_UNKNOWN)
275 self.assertEqual(params.window_size, 1048576)
278 self.assertEqual(params.window_size, 2097152)
276 279 self.assertEqual(params.dict_id, 0)
277 280 self.assertFalse(params.has_checksum)
278 281
@@ -321,7 +324,7 b' class TestCompressor_compressobj(unittes'
321 324 cobj.compress(b'foo')
322 325 cobj.flush()
323 326
324 with self.assertRaisesRegexp(zstd.ZstdError, 'cannot call compress\(\) after compressor'):
327 with self.assertRaisesRegexp(zstd.ZstdError, r'cannot call compress\(\) after compressor'):
325 328 cobj.compress(b'foo')
326 329
327 330 with self.assertRaisesRegexp(zstd.ZstdError, 'compressor object already finished'):
@@ -453,7 +456,7 b' class TestCompressor_copy_stream(unittes'
453 456
454 457 params = zstd.get_frame_parameters(dest.getvalue())
455 458 self.assertEqual(params.content_size, zstd.CONTENTSIZE_UNKNOWN)
456 self.assertEqual(params.window_size, 1048576)
459 self.assertEqual(params.window_size, 2097152)
457 460 self.assertEqual(params.dict_id, 0)
458 461 self.assertFalse(params.has_checksum)
459 462
@@ -605,10 +608,6 b' class TestCompressor_stream_reader(unitt'
605 608 with self.assertRaises(io.UnsupportedOperation):
606 609 reader.readlines()
607 610
608 # This could probably be implemented someday.
609 with self.assertRaises(NotImplementedError):
610 reader.readall()
611
612 611 with self.assertRaises(io.UnsupportedOperation):
613 612 iter(reader)
614 613
@@ -644,15 +643,16 b' class TestCompressor_stream_reader(unitt'
644 643 with self.assertRaisesRegexp(ValueError, 'stream is closed'):
645 644 reader.read(10)
646 645
647 def test_read_bad_size(self):
646 def test_read_sizes(self):
648 647 cctx = zstd.ZstdCompressor()
648 foo = cctx.compress(b'foo')
649 649
650 650 with cctx.stream_reader(b'foo') as reader:
651 with self.assertRaisesRegexp(ValueError, 'cannot read negative or size 0 amounts'):
652 reader.read(-1)
651 with self.assertRaisesRegexp(ValueError, 'cannot read negative amounts less than -1'):
652 reader.read(-2)
653 653
654 with self.assertRaisesRegexp(ValueError, 'cannot read negative or size 0 amounts'):
655 reader.read(0)
654 self.assertEqual(reader.read(0), b'')
655 self.assertEqual(reader.read(), foo)
656 656
657 657 def test_read_buffer(self):
658 658 cctx = zstd.ZstdCompressor()
@@ -746,11 +746,202 b' class TestCompressor_stream_reader(unitt'
746 746 with cctx.stream_reader(source, size=42):
747 747 pass
748 748
749 def test_readall(self):
750 cctx = zstd.ZstdCompressor()
751 frame = cctx.compress(b'foo' * 1024)
752
753 reader = cctx.stream_reader(b'foo' * 1024)
754 self.assertEqual(reader.readall(), frame)
755
756 def test_readinto(self):
757 cctx = zstd.ZstdCompressor()
758 foo = cctx.compress(b'foo')
759
760 reader = cctx.stream_reader(b'foo')
761 with self.assertRaises(Exception):
762 reader.readinto(b'foobar')
763
764 # readinto() with sufficiently large destination.
765 b = bytearray(1024)
766 reader = cctx.stream_reader(b'foo')
767 self.assertEqual(reader.readinto(b), len(foo))
768 self.assertEqual(b[0:len(foo)], foo)
769 self.assertEqual(reader.readinto(b), 0)
770 self.assertEqual(b[0:len(foo)], foo)
771
772 # readinto() with small reads.
773 b = bytearray(1024)
774 reader = cctx.stream_reader(b'foo', read_size=1)
775 self.assertEqual(reader.readinto(b), len(foo))
776 self.assertEqual(b[0:len(foo)], foo)
777
778 # Too small destination buffer.
779 b = bytearray(2)
780 reader = cctx.stream_reader(b'foo')
781 self.assertEqual(reader.readinto(b), 2)
782 self.assertEqual(b[:], foo[0:2])
783 self.assertEqual(reader.readinto(b), 2)
784 self.assertEqual(b[:], foo[2:4])
785 self.assertEqual(reader.readinto(b), 2)
786 self.assertEqual(b[:], foo[4:6])
787
788 def test_readinto1(self):
789 cctx = zstd.ZstdCompressor()
790 foo = b''.join(cctx.read_to_iter(io.BytesIO(b'foo')))
791
792 reader = cctx.stream_reader(b'foo')
793 with self.assertRaises(Exception):
794 reader.readinto1(b'foobar')
795
796 b = bytearray(1024)
797 source = OpCountingBytesIO(b'foo')
798 reader = cctx.stream_reader(source)
799 self.assertEqual(reader.readinto1(b), len(foo))
800 self.assertEqual(b[0:len(foo)], foo)
801 self.assertEqual(source._read_count, 2)
802
803 # readinto1() with small reads.
804 b = bytearray(1024)
805 source = OpCountingBytesIO(b'foo')
806 reader = cctx.stream_reader(source, read_size=1)
807 self.assertEqual(reader.readinto1(b), len(foo))
808 self.assertEqual(b[0:len(foo)], foo)
809 self.assertEqual(source._read_count, 4)
810
811 def test_read1(self):
812 cctx = zstd.ZstdCompressor()
813 foo = b''.join(cctx.read_to_iter(io.BytesIO(b'foo')))
814
815 b = OpCountingBytesIO(b'foo')
816 reader = cctx.stream_reader(b)
817
818 self.assertEqual(reader.read1(), foo)
819 self.assertEqual(b._read_count, 2)
820
821 b = OpCountingBytesIO(b'foo')
822 reader = cctx.stream_reader(b)
823
824 self.assertEqual(reader.read1(0), b'')
825 self.assertEqual(reader.read1(2), foo[0:2])
826 self.assertEqual(b._read_count, 2)
827 self.assertEqual(reader.read1(2), foo[2:4])
828 self.assertEqual(reader.read1(1024), foo[4:])
829
749 830
750 831 @make_cffi
751 832 class TestCompressor_stream_writer(unittest.TestCase):
833 def test_io_api(self):
834 buffer = io.BytesIO()
835 cctx = zstd.ZstdCompressor()
836 writer = cctx.stream_writer(buffer)
837
838 self.assertFalse(writer.isatty())
839 self.assertFalse(writer.readable())
840
841 with self.assertRaises(io.UnsupportedOperation):
842 writer.readline()
843
844 with self.assertRaises(io.UnsupportedOperation):
845 writer.readline(42)
846
847 with self.assertRaises(io.UnsupportedOperation):
848 writer.readline(size=42)
849
850 with self.assertRaises(io.UnsupportedOperation):
851 writer.readlines()
852
853 with self.assertRaises(io.UnsupportedOperation):
854 writer.readlines(42)
855
856 with self.assertRaises(io.UnsupportedOperation):
857 writer.readlines(hint=42)
858
859 with self.assertRaises(io.UnsupportedOperation):
860 writer.seek(0)
861
862 with self.assertRaises(io.UnsupportedOperation):
863 writer.seek(10, os.SEEK_SET)
864
865 self.assertFalse(writer.seekable())
866
867 with self.assertRaises(io.UnsupportedOperation):
868 writer.truncate()
869
870 with self.assertRaises(io.UnsupportedOperation):
871 writer.truncate(42)
872
873 with self.assertRaises(io.UnsupportedOperation):
874 writer.truncate(size=42)
875
876 self.assertTrue(writer.writable())
877
878 with self.assertRaises(NotImplementedError):
879 writer.writelines([])
880
881 with self.assertRaises(io.UnsupportedOperation):
882 writer.read()
883
884 with self.assertRaises(io.UnsupportedOperation):
885 writer.read(42)
886
887 with self.assertRaises(io.UnsupportedOperation):
888 writer.read(size=42)
889
890 with self.assertRaises(io.UnsupportedOperation):
891 writer.readall()
892
893 with self.assertRaises(io.UnsupportedOperation):
894 writer.readinto(None)
895
896 with self.assertRaises(io.UnsupportedOperation):
897 writer.fileno()
898
899 self.assertFalse(writer.closed)
900
901 def test_fileno_file(self):
902 with tempfile.TemporaryFile('wb') as tf:
903 cctx = zstd.ZstdCompressor()
904 writer = cctx.stream_writer(tf)
905
906 self.assertEqual(writer.fileno(), tf.fileno())
907
908 def test_close(self):
909 buffer = NonClosingBytesIO()
910 cctx = zstd.ZstdCompressor(level=1)
911 writer = cctx.stream_writer(buffer)
912
913 writer.write(b'foo' * 1024)
914 self.assertFalse(writer.closed)
915 self.assertFalse(buffer.closed)
916 writer.close()
917 self.assertTrue(writer.closed)
918 self.assertTrue(buffer.closed)
919
920 with self.assertRaisesRegexp(ValueError, 'stream is closed'):
921 writer.write(b'foo')
922
923 with self.assertRaisesRegexp(ValueError, 'stream is closed'):
924 writer.flush()
925
926 with self.assertRaisesRegexp(ValueError, 'stream is closed'):
927 with writer:
928 pass
929
930 self.assertEqual(buffer.getvalue(),
931 b'\x28\xb5\x2f\xfd\x00\x48\x55\x00\x00\x18\x66\x6f'
932 b'\x6f\x01\x00\xfa\xd3\x77\x43')
933
934 # Context manager exit should close stream.
935 buffer = io.BytesIO()
936 writer = cctx.stream_writer(buffer)
937
938 with writer:
939 writer.write(b'foo')
940
941 self.assertTrue(writer.closed)
942
752 943 def test_empty(self):
753 buffer = io.BytesIO()
944 buffer = NonClosingBytesIO()
754 945 cctx = zstd.ZstdCompressor(level=1, write_content_size=False)
755 946 with cctx.stream_writer(buffer) as compressor:
756 947 compressor.write(b'')
@@ -764,6 +955,25 b' class TestCompressor_stream_writer(unitt'
764 955 self.assertEqual(params.dict_id, 0)
765 956 self.assertFalse(params.has_checksum)
766 957
958 # Test without context manager.
959 buffer = io.BytesIO()
960 compressor = cctx.stream_writer(buffer)
961 self.assertEqual(compressor.write(b''), 0)
962 self.assertEqual(buffer.getvalue(), b'')
963 self.assertEqual(compressor.flush(zstd.FLUSH_FRAME), 9)
964 result = buffer.getvalue()
965 self.assertEqual(result, b'\x28\xb5\x2f\xfd\x00\x48\x01\x00\x00')
966
967 params = zstd.get_frame_parameters(result)
968 self.assertEqual(params.content_size, zstd.CONTENTSIZE_UNKNOWN)
969 self.assertEqual(params.window_size, 524288)
970 self.assertEqual(params.dict_id, 0)
971 self.assertFalse(params.has_checksum)
972
973 # Test write_return_read=True
974 compressor = cctx.stream_writer(buffer, write_return_read=True)
975 self.assertEqual(compressor.write(b''), 0)
976
767 977 def test_input_types(self):
768 978 expected = b'\x28\xb5\x2f\xfd\x00\x48\x19\x00\x00\x66\x6f\x6f'
769 979 cctx = zstd.ZstdCompressor(level=1)
@@ -778,14 +988,17 b' class TestCompressor_stream_writer(unitt'
778 988 ]
779 989
780 990 for source in sources:
781 buffer = io.BytesIO()
991 buffer = NonClosingBytesIO()
782 992 with cctx.stream_writer(buffer) as compressor:
783 993 compressor.write(source)
784 994
785 995 self.assertEqual(buffer.getvalue(), expected)
786 996
997 compressor = cctx.stream_writer(buffer, write_return_read=True)
998 self.assertEqual(compressor.write(source), len(source))
999
787 1000 def test_multiple_compress(self):
788 buffer = io.BytesIO()
1001 buffer = NonClosingBytesIO()
789 1002 cctx = zstd.ZstdCompressor(level=5)
790 1003 with cctx.stream_writer(buffer) as compressor:
791 1004 self.assertEqual(compressor.write(b'foo'), 0)
@@ -794,9 +1007,27 b' class TestCompressor_stream_writer(unitt'
794 1007
795 1008 result = buffer.getvalue()
796 1009 self.assertEqual(result,
797 b'\x28\xb5\x2f\xfd\x00\x50\x75\x00\x00\x38\x66\x6f'
1010 b'\x28\xb5\x2f\xfd\x00\x58\x75\x00\x00\x38\x66\x6f'
798 1011 b'\x6f\x62\x61\x72\x78\x01\x00\xfc\xdf\x03\x23')
799 1012
1013 # Test without context manager.
1014 buffer = io.BytesIO()
1015 compressor = cctx.stream_writer(buffer)
1016 self.assertEqual(compressor.write(b'foo'), 0)
1017 self.assertEqual(compressor.write(b'bar'), 0)
1018 self.assertEqual(compressor.write(b'x' * 8192), 0)
1019 self.assertEqual(compressor.flush(zstd.FLUSH_FRAME), 23)
1020 result = buffer.getvalue()
1021 self.assertEqual(result,
1022 b'\x28\xb5\x2f\xfd\x00\x58\x75\x00\x00\x38\x66\x6f'
1023 b'\x6f\x62\x61\x72\x78\x01\x00\xfc\xdf\x03\x23')
1024
1025 # Test with write_return_read=True.
1026 compressor = cctx.stream_writer(buffer, write_return_read=True)
1027 self.assertEqual(compressor.write(b'foo'), 3)
1028 self.assertEqual(compressor.write(b'barbiz'), 6)
1029 self.assertEqual(compressor.write(b'x' * 8192), 8192)
1030
800 1031 def test_dictionary(self):
801 1032 samples = []
802 1033 for i in range(128):
@@ -807,9 +1038,9 b' class TestCompressor_stream_writer(unitt'
807 1038 d = zstd.train_dictionary(8192, samples)
808 1039
809 1040 h = hashlib.sha1(d.as_bytes()).hexdigest()
810 self.assertEqual(h, '2b3b6428da5bf2c9cc9d4bb58ba0bc5990dd0e79')
1041 self.assertEqual(h, '88ca0d38332aff379d4ced166a51c280a7679aad')
811 1042
812 buffer = io.BytesIO()
1043 buffer = NonClosingBytesIO()
813 1044 cctx = zstd.ZstdCompressor(level=9, dict_data=d)
814 1045 with cctx.stream_writer(buffer) as compressor:
815 1046 self.assertEqual(compressor.write(b'foo'), 0)
@@ -825,7 +1056,7 b' class TestCompressor_stream_writer(unitt'
825 1056 self.assertFalse(params.has_checksum)
826 1057
827 1058 h = hashlib.sha1(compressed).hexdigest()
828 self.assertEqual(h, '23f88344263678478f5f82298e0a5d1833125786')
1059 self.assertEqual(h, '8703b4316f274d26697ea5dd480f29c08e85d940')
829 1060
830 1061 source = b'foo' + b'bar' + (b'foo' * 16384)
831 1062
@@ -842,9 +1073,9 b' class TestCompressor_stream_writer(unitt'
842 1073 min_match=5,
843 1074 search_log=4,
844 1075 target_length=10,
845 compression_strategy=zstd.STRATEGY_FAST)
1076 strategy=zstd.STRATEGY_FAST)
846 1077
847 buffer = io.BytesIO()
1078 buffer = NonClosingBytesIO()
848 1079 cctx = zstd.ZstdCompressor(compression_params=params)
849 1080 with cctx.stream_writer(buffer) as compressor:
850 1081 self.assertEqual(compressor.write(b'foo'), 0)
@@ -863,12 +1094,12 b' class TestCompressor_stream_writer(unitt'
863 1094 self.assertEqual(h, '2a8111d72eb5004cdcecbdac37da9f26720d30ef')
864 1095
865 1096 def test_write_checksum(self):
866 no_checksum = io.BytesIO()
1097 no_checksum = NonClosingBytesIO()
867 1098 cctx = zstd.ZstdCompressor(level=1)
868 1099 with cctx.stream_writer(no_checksum) as compressor:
869 1100 self.assertEqual(compressor.write(b'foobar'), 0)
870 1101
871 with_checksum = io.BytesIO()
1102 with_checksum = NonClosingBytesIO()
872 1103 cctx = zstd.ZstdCompressor(level=1, write_checksum=True)
873 1104 with cctx.stream_writer(with_checksum) as compressor:
874 1105 self.assertEqual(compressor.write(b'foobar'), 0)
@@ -886,12 +1117,12 b' class TestCompressor_stream_writer(unitt'
886 1117 len(no_checksum.getvalue()) + 4)
887 1118
888 1119 def test_write_content_size(self):
889 no_size = io.BytesIO()
1120 no_size = NonClosingBytesIO()
890 1121 cctx = zstd.ZstdCompressor(level=1, write_content_size=False)
891 1122 with cctx.stream_writer(no_size) as compressor:
892 1123 self.assertEqual(compressor.write(b'foobar' * 256), 0)
893 1124
894 with_size = io.BytesIO()
1125 with_size = NonClosingBytesIO()
895 1126 cctx = zstd.ZstdCompressor(level=1)
896 1127 with cctx.stream_writer(with_size) as compressor:
897 1128 self.assertEqual(compressor.write(b'foobar' * 256), 0)
@@ -902,7 +1133,7 b' class TestCompressor_stream_writer(unitt'
902 1133 len(no_size.getvalue()))
903 1134
904 1135 # Declaring size will write the header.
905 with_size = io.BytesIO()
1136 with_size = NonClosingBytesIO()
906 1137 with cctx.stream_writer(with_size, size=len(b'foobar' * 256)) as compressor:
907 1138 self.assertEqual(compressor.write(b'foobar' * 256), 0)
908 1139
@@ -927,7 +1158,7 b' class TestCompressor_stream_writer(unitt'
927 1158
928 1159 d = zstd.train_dictionary(1024, samples)
929 1160
930 with_dict_id = io.BytesIO()
1161 with_dict_id = NonClosingBytesIO()
931 1162 cctx = zstd.ZstdCompressor(level=1, dict_data=d)
932 1163 with cctx.stream_writer(with_dict_id) as compressor:
933 1164 self.assertEqual(compressor.write(b'foobarfoobar'), 0)
@@ -935,7 +1166,7 b' class TestCompressor_stream_writer(unitt'
935 1166 self.assertEqual(with_dict_id.getvalue()[4:5], b'\x03')
936 1167
937 1168 cctx = zstd.ZstdCompressor(level=1, dict_data=d, write_dict_id=False)
938 no_dict_id = io.BytesIO()
1169 no_dict_id = NonClosingBytesIO()
939 1170 with cctx.stream_writer(no_dict_id) as compressor:
940 1171 self.assertEqual(compressor.write(b'foobarfoobar'), 0)
941 1172
@@ -1009,8 +1240,32 b' class TestCompressor_stream_writer(unitt'
1009 1240 header = trailing[0:3]
1010 1241 self.assertEqual(header, b'\x01\x00\x00')
1011 1242
1243 def test_flush_frame(self):
1244 cctx = zstd.ZstdCompressor(level=3)
1245 dest = OpCountingBytesIO()
1246
1247 with cctx.stream_writer(dest) as compressor:
1248 self.assertEqual(compressor.write(b'foobar' * 8192), 0)
1249 self.assertEqual(compressor.flush(zstd.FLUSH_FRAME), 23)
1250 compressor.write(b'biz' * 16384)
1251
1252 self.assertEqual(dest.getvalue(),
1253 # Frame 1.
1254 b'\x28\xb5\x2f\xfd\x00\x58\x75\x00\x00\x30\x66\x6f\x6f'
1255 b'\x62\x61\x72\x01\x00\xf7\xbf\xe8\xa5\x08'
1256 # Frame 2.
1257 b'\x28\xb5\x2f\xfd\x00\x58\x5d\x00\x00\x18\x62\x69\x7a'
1258 b'\x01\x00\xfa\x3f\x75\x37\x04')
1259
1260 def test_bad_flush_mode(self):
1261 cctx = zstd.ZstdCompressor()
1262 dest = io.BytesIO()
1263 with cctx.stream_writer(dest) as compressor:
1264 with self.assertRaisesRegexp(ValueError, 'unknown flush_mode: 42'):
1265 compressor.flush(flush_mode=42)
1266
1012 1267 def test_multithreaded(self):
1013 dest = io.BytesIO()
1268 dest = NonClosingBytesIO()
1014 1269 cctx = zstd.ZstdCompressor(threads=2)
1015 1270 with cctx.stream_writer(dest) as compressor:
1016 1271 compressor.write(b'a' * 1048576)
@@ -1043,22 +1298,21 b' class TestCompressor_stream_writer(unitt'
1043 1298 pass
1044 1299
1045 1300 def test_tarfile_compat(self):
1046 raise unittest.SkipTest('not yet fully working')
1047
1048 dest = io.BytesIO()
1301 dest = NonClosingBytesIO()
1049 1302 cctx = zstd.ZstdCompressor()
1050 1303 with cctx.stream_writer(dest) as compressor:
1051 with tarfile.open('tf', mode='w', fileobj=compressor) as tf:
1304 with tarfile.open('tf', mode='w|', fileobj=compressor) as tf:
1052 1305 tf.add(__file__, 'test_compressor.py')
1053 1306
1054 dest.seek(0)
1307 dest = io.BytesIO(dest.getvalue())
1055 1308
1056 1309 dctx = zstd.ZstdDecompressor()
1057 1310 with dctx.stream_reader(dest) as reader:
1058 with tarfile.open(mode='r:', fileobj=reader) as tf:
1311 with tarfile.open(mode='r|', fileobj=reader) as tf:
1059 1312 for member in tf:
1060 1313 self.assertEqual(member.name, 'test_compressor.py')
1061 1314
1315
1062 1316 @make_cffi
1063 1317 class TestCompressor_read_to_iter(unittest.TestCase):
1064 1318 def test_type_validation(self):
@@ -1192,7 +1446,7 b' class TestCompressor_chunker(unittest.Te'
1192 1446
1193 1447 it = chunker.finish()
1194 1448
1195 self.assertEqual(next(it), b'\x28\xb5\x2f\xfd\x00\x50\x01\x00\x00')
1449 self.assertEqual(next(it), b'\x28\xb5\x2f\xfd\x00\x58\x01\x00\x00')
1196 1450
1197 1451 with self.assertRaises(StopIteration):
1198 1452 next(it)
@@ -1214,7 +1468,7 b' class TestCompressor_chunker(unittest.Te'
1214 1468 it = chunker.finish()
1215 1469
1216 1470 self.assertEqual(next(it),
1217 b'\x28\xb5\x2f\xfd\x00\x50\x7d\x00\x00\x48\x66\x6f'
1471 b'\x28\xb5\x2f\xfd\x00\x58\x7d\x00\x00\x48\x66\x6f'
1218 1472 b'\x6f\x62\x61\x72\x62\x61\x7a\x01\x00\xe4\xe4\x8e')
1219 1473
1220 1474 with self.assertRaises(StopIteration):
@@ -1258,7 +1512,7 b' class TestCompressor_chunker(unittest.Te'
1258 1512
1259 1513 self.assertEqual(
1260 1514 b''.join(chunks),
1261 b'\x28\xb5\x2f\xfd\x00\x50\x55\x00\x00\x18\x66\x6f\x6f\x01\x00'
1515 b'\x28\xb5\x2f\xfd\x00\x58\x55\x00\x00\x18\x66\x6f\x6f\x01\x00'
1262 1516 b'\xfa\xd3\x77\x43')
1263 1517
1264 1518 dctx = zstd.ZstdDecompressor()
@@ -1283,7 +1537,7 b' class TestCompressor_chunker(unittest.Te'
1283 1537
1284 1538 self.assertEqual(list(chunker.compress(source)), [])
1285 1539 self.assertEqual(list(chunker.finish()), [
1286 b'\x28\xb5\x2f\xfd\x00\x50\x19\x00\x00\x66\x6f\x6f'
1540 b'\x28\xb5\x2f\xfd\x00\x58\x19\x00\x00\x66\x6f\x6f'
1287 1541 ])
1288 1542
1289 1543 def test_flush(self):
@@ -1296,7 +1550,7 b' class TestCompressor_chunker(unittest.Te'
1296 1550 chunks1 = list(chunker.flush())
1297 1551
1298 1552 self.assertEqual(chunks1, [
1299 b'\x28\xb5\x2f\xfd\x00\x50\x8c\x00\x00\x30\x66\x6f\x6f\x62\x61\x72'
1553 b'\x28\xb5\x2f\xfd\x00\x58\x8c\x00\x00\x30\x66\x6f\x6f\x62\x61\x72'
1300 1554 b'\x02\x00\xfa\x03\xfe\xd0\x9f\xbe\x1b\x02'
1301 1555 ])
1302 1556
@@ -1326,7 +1580,7 b' class TestCompressor_chunker(unittest.Te'
1326 1580
1327 1581 with self.assertRaisesRegexp(
1328 1582 zstd.ZstdError,
1329 'cannot call compress\(\) after compression finished'):
1583 r'cannot call compress\(\) after compression finished'):
1330 1584 list(chunker.compress(b'foo'))
1331 1585
1332 1586 def test_flush_after_finish(self):
@@ -1338,7 +1592,7 b' class TestCompressor_chunker(unittest.Te'
1338 1592
1339 1593 with self.assertRaisesRegexp(
1340 1594 zstd.ZstdError,
1341 'cannot call flush\(\) after compression finished'):
1595 r'cannot call flush\(\) after compression finished'):
1342 1596 list(chunker.flush())
1343 1597
1344 1598 def test_finish_after_finish(self):
@@ -1350,7 +1604,7 b' class TestCompressor_chunker(unittest.Te'
1350 1604
1351 1605 with self.assertRaisesRegexp(
1352 1606 zstd.ZstdError,
1353 'cannot call finish\(\) after compression finished'):
1607 r'cannot call finish\(\) after compression finished'):
1354 1608 list(chunker.finish())
1355 1609
1356 1610
@@ -1358,6 +1612,9 b' class TestCompressor_multi_compress_to_b'
1358 1612 def test_invalid_inputs(self):
1359 1613 cctx = zstd.ZstdCompressor()
1360 1614
1615 if not hasattr(cctx, 'multi_compress_to_buffer'):
1616 self.skipTest('multi_compress_to_buffer not available')
1617
1361 1618 with self.assertRaises(TypeError):
1362 1619 cctx.multi_compress_to_buffer(True)
1363 1620
@@ -1370,6 +1627,9 b' class TestCompressor_multi_compress_to_b'
1370 1627 def test_empty_input(self):
1371 1628 cctx = zstd.ZstdCompressor()
1372 1629
1630 if not hasattr(cctx, 'multi_compress_to_buffer'):
1631 self.skipTest('multi_compress_to_buffer not available')
1632
1373 1633 with self.assertRaisesRegexp(ValueError, 'no source elements found'):
1374 1634 cctx.multi_compress_to_buffer([])
1375 1635
@@ -1379,6 +1639,9 b' class TestCompressor_multi_compress_to_b'
1379 1639 def test_list_input(self):
1380 1640 cctx = zstd.ZstdCompressor(write_checksum=True)
1381 1641
1642 if not hasattr(cctx, 'multi_compress_to_buffer'):
1643 self.skipTest('multi_compress_to_buffer not available')
1644
1382 1645 original = [b'foo' * 12, b'bar' * 6]
1383 1646 frames = [cctx.compress(c) for c in original]
1384 1647 b = cctx.multi_compress_to_buffer(original)
@@ -1394,6 +1657,9 b' class TestCompressor_multi_compress_to_b'
1394 1657 def test_buffer_with_segments_input(self):
1395 1658 cctx = zstd.ZstdCompressor(write_checksum=True)
1396 1659
1660 if not hasattr(cctx, 'multi_compress_to_buffer'):
1661 self.skipTest('multi_compress_to_buffer not available')
1662
1397 1663 original = [b'foo' * 4, b'bar' * 6]
1398 1664 frames = [cctx.compress(c) for c in original]
1399 1665
@@ -1412,6 +1678,9 b' class TestCompressor_multi_compress_to_b'
1412 1678 def test_buffer_with_segments_collection_input(self):
1413 1679 cctx = zstd.ZstdCompressor(write_checksum=True)
1414 1680
1681 if not hasattr(cctx, 'multi_compress_to_buffer'):
1682 self.skipTest('multi_compress_to_buffer not available')
1683
1415 1684 original = [
1416 1685 b'foo1',
1417 1686 b'foo2' * 2,
@@ -1449,6 +1718,9 b' class TestCompressor_multi_compress_to_b'
1449 1718
1450 1719 cctx = zstd.ZstdCompressor(write_checksum=True)
1451 1720
1721 if not hasattr(cctx, 'multi_compress_to_buffer'):
1722 self.skipTest('multi_compress_to_buffer not available')
1723
1452 1724 frames = []
1453 1725 frames.extend(b'x' * 64 for i in range(256))
1454 1726 frames.extend(b'y' * 64 for i in range(256))
@@ -12,6 +12,7 b' import zstandard as zstd'
12 12
13 13 from . common import (
14 14 make_cffi,
15 NonClosingBytesIO,
15 16 random_input_data,
16 17 )
17 18
@@ -19,6 +20,62 b' from . common import ('
19 20 @unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set')
20 21 @make_cffi
21 22 class TestCompressor_stream_reader_fuzzing(unittest.TestCase):
23 @hypothesis.settings(
24 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
25 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
26 level=strategies.integers(min_value=1, max_value=5),
27 source_read_size=strategies.integers(1, 16384),
28 read_size=strategies.integers(-1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE))
29 def test_stream_source_read(self, original, level, source_read_size,
30 read_size):
31 if read_size == 0:
32 read_size = -1
33
34 refctx = zstd.ZstdCompressor(level=level)
35 ref_frame = refctx.compress(original)
36
37 cctx = zstd.ZstdCompressor(level=level)
38 with cctx.stream_reader(io.BytesIO(original), size=len(original),
39 read_size=source_read_size) as reader:
40 chunks = []
41 while True:
42 chunk = reader.read(read_size)
43 if not chunk:
44 break
45
46 chunks.append(chunk)
47
48 self.assertEqual(b''.join(chunks), ref_frame)
49
50 @hypothesis.settings(
51 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
52 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
53 level=strategies.integers(min_value=1, max_value=5),
54 source_read_size=strategies.integers(1, 16384),
55 read_size=strategies.integers(-1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE))
56 def test_buffer_source_read(self, original, level, source_read_size,
57 read_size):
58 if read_size == 0:
59 read_size = -1
60
61 refctx = zstd.ZstdCompressor(level=level)
62 ref_frame = refctx.compress(original)
63
64 cctx = zstd.ZstdCompressor(level=level)
65 with cctx.stream_reader(original, size=len(original),
66 read_size=source_read_size) as reader:
67 chunks = []
68 while True:
69 chunk = reader.read(read_size)
70 if not chunk:
71 break
72
73 chunks.append(chunk)
74
75 self.assertEqual(b''.join(chunks), ref_frame)
76
77 @hypothesis.settings(
78 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
22 79 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
23 80 level=strategies.integers(min_value=1, max_value=5),
24 81 source_read_size=strategies.integers(1, 16384),
@@ -33,15 +90,17 b' class TestCompressor_stream_reader_fuzzi'
33 90 read_size=source_read_size) as reader:
34 91 chunks = []
35 92 while True:
36 read_size = read_sizes.draw(strategies.integers(1, 16384))
93 read_size = read_sizes.draw(strategies.integers(-1, 16384))
37 94 chunk = reader.read(read_size)
95 if not chunk and read_size:
96 break
38 97
39 if not chunk:
40 break
41 98 chunks.append(chunk)
42 99
43 100 self.assertEqual(b''.join(chunks), ref_frame)
44 101
102 @hypothesis.settings(
103 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
45 104 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
46 105 level=strategies.integers(min_value=1, max_value=5),
47 106 source_read_size=strategies.integers(1, 16384),
@@ -57,14 +116,343 b' class TestCompressor_stream_reader_fuzzi'
57 116 read_size=source_read_size) as reader:
58 117 chunks = []
59 118 while True:
119 read_size = read_sizes.draw(strategies.integers(-1, 16384))
120 chunk = reader.read(read_size)
121 if not chunk and read_size:
122 break
123
124 chunks.append(chunk)
125
126 self.assertEqual(b''.join(chunks), ref_frame)
127
128 @hypothesis.settings(
129 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
130 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
131 level=strategies.integers(min_value=1, max_value=5),
132 source_read_size=strategies.integers(1, 16384),
133 read_size=strategies.integers(1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE))
134 def test_stream_source_readinto(self, original, level,
135 source_read_size, read_size):
136 refctx = zstd.ZstdCompressor(level=level)
137 ref_frame = refctx.compress(original)
138
139 cctx = zstd.ZstdCompressor(level=level)
140 with cctx.stream_reader(io.BytesIO(original), size=len(original),
141 read_size=source_read_size) as reader:
142 chunks = []
143 while True:
144 b = bytearray(read_size)
145 count = reader.readinto(b)
146
147 if not count:
148 break
149
150 chunks.append(bytes(b[0:count]))
151
152 self.assertEqual(b''.join(chunks), ref_frame)
153
154 @hypothesis.settings(
155 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
156 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
157 level=strategies.integers(min_value=1, max_value=5),
158 source_read_size=strategies.integers(1, 16384),
159 read_size=strategies.integers(1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE))
160 def test_buffer_source_readinto(self, original, level,
161 source_read_size, read_size):
162
163 refctx = zstd.ZstdCompressor(level=level)
164 ref_frame = refctx.compress(original)
165
166 cctx = zstd.ZstdCompressor(level=level)
167 with cctx.stream_reader(original, size=len(original),
168 read_size=source_read_size) as reader:
169 chunks = []
170 while True:
171 b = bytearray(read_size)
172 count = reader.readinto(b)
173
174 if not count:
175 break
176
177 chunks.append(bytes(b[0:count]))
178
179 self.assertEqual(b''.join(chunks), ref_frame)
180
181 @hypothesis.settings(
182 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
183 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
184 level=strategies.integers(min_value=1, max_value=5),
185 source_read_size=strategies.integers(1, 16384),
186 read_sizes=strategies.data())
187 def test_stream_source_readinto_variance(self, original, level,
188 source_read_size, read_sizes):
189 refctx = zstd.ZstdCompressor(level=level)
190 ref_frame = refctx.compress(original)
191
192 cctx = zstd.ZstdCompressor(level=level)
193 with cctx.stream_reader(io.BytesIO(original), size=len(original),
194 read_size=source_read_size) as reader:
195 chunks = []
196 while True:
60 197 read_size = read_sizes.draw(strategies.integers(1, 16384))
61 chunk = reader.read(read_size)
198 b = bytearray(read_size)
199 count = reader.readinto(b)
200
201 if not count:
202 break
203
204 chunks.append(bytes(b[0:count]))
205
206 self.assertEqual(b''.join(chunks), ref_frame)
207
208 @hypothesis.settings(
209 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
210 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
211 level=strategies.integers(min_value=1, max_value=5),
212 source_read_size=strategies.integers(1, 16384),
213 read_sizes=strategies.data())
214 def test_buffer_source_readinto_variance(self, original, level,
215 source_read_size, read_sizes):
216
217 refctx = zstd.ZstdCompressor(level=level)
218 ref_frame = refctx.compress(original)
219
220 cctx = zstd.ZstdCompressor(level=level)
221 with cctx.stream_reader(original, size=len(original),
222 read_size=source_read_size) as reader:
223 chunks = []
224 while True:
225 read_size = read_sizes.draw(strategies.integers(1, 16384))
226 b = bytearray(read_size)
227 count = reader.readinto(b)
228
229 if not count:
230 break
231
232 chunks.append(bytes(b[0:count]))
233
234 self.assertEqual(b''.join(chunks), ref_frame)
235
236 @hypothesis.settings(
237 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
238 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
239 level=strategies.integers(min_value=1, max_value=5),
240 source_read_size=strategies.integers(1, 16384),
241 read_size=strategies.integers(-1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE))
242 def test_stream_source_read1(self, original, level, source_read_size,
243 read_size):
244 if read_size == 0:
245 read_size = -1
246
247 refctx = zstd.ZstdCompressor(level=level)
248 ref_frame = refctx.compress(original)
249
250 cctx = zstd.ZstdCompressor(level=level)
251 with cctx.stream_reader(io.BytesIO(original), size=len(original),
252 read_size=source_read_size) as reader:
253 chunks = []
254 while True:
255 chunk = reader.read1(read_size)
62 256 if not chunk:
63 257 break
258
64 259 chunks.append(chunk)
65 260
66 261 self.assertEqual(b''.join(chunks), ref_frame)
67 262
263 @hypothesis.settings(
264 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
265 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
266 level=strategies.integers(min_value=1, max_value=5),
267 source_read_size=strategies.integers(1, 16384),
268 read_size=strategies.integers(-1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE))
269 def test_buffer_source_read1(self, original, level, source_read_size,
270 read_size):
271 if read_size == 0:
272 read_size = -1
273
274 refctx = zstd.ZstdCompressor(level=level)
275 ref_frame = refctx.compress(original)
276
277 cctx = zstd.ZstdCompressor(level=level)
278 with cctx.stream_reader(original, size=len(original),
279 read_size=source_read_size) as reader:
280 chunks = []
281 while True:
282 chunk = reader.read1(read_size)
283 if not chunk:
284 break
285
286 chunks.append(chunk)
287
288 self.assertEqual(b''.join(chunks), ref_frame)
289
290 @hypothesis.settings(
291 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
292 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
293 level=strategies.integers(min_value=1, max_value=5),
294 source_read_size=strategies.integers(1, 16384),
295 read_sizes=strategies.data())
296 def test_stream_source_read1_variance(self, original, level, source_read_size,
297 read_sizes):
298 refctx = zstd.ZstdCompressor(level=level)
299 ref_frame = refctx.compress(original)
300
301 cctx = zstd.ZstdCompressor(level=level)
302 with cctx.stream_reader(io.BytesIO(original), size=len(original),
303 read_size=source_read_size) as reader:
304 chunks = []
305 while True:
306 read_size = read_sizes.draw(strategies.integers(-1, 16384))
307 chunk = reader.read1(read_size)
308 if not chunk and read_size:
309 break
310
311 chunks.append(chunk)
312
313 self.assertEqual(b''.join(chunks), ref_frame)
314
315 @hypothesis.settings(
316 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
317 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
318 level=strategies.integers(min_value=1, max_value=5),
319 source_read_size=strategies.integers(1, 16384),
320 read_sizes=strategies.data())
321 def test_buffer_source_read1_variance(self, original, level, source_read_size,
322 read_sizes):
323
324 refctx = zstd.ZstdCompressor(level=level)
325 ref_frame = refctx.compress(original)
326
327 cctx = zstd.ZstdCompressor(level=level)
328 with cctx.stream_reader(original, size=len(original),
329 read_size=source_read_size) as reader:
330 chunks = []
331 while True:
332 read_size = read_sizes.draw(strategies.integers(-1, 16384))
333 chunk = reader.read1(read_size)
334 if not chunk and read_size:
335 break
336
337 chunks.append(chunk)
338
339 self.assertEqual(b''.join(chunks), ref_frame)
340
341
342 @hypothesis.settings(
343 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
344 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
345 level=strategies.integers(min_value=1, max_value=5),
346 source_read_size=strategies.integers(1, 16384),
347 read_size=strategies.integers(1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE))
348 def test_stream_source_readinto1(self, original, level, source_read_size,
349 read_size):
350 if read_size == 0:
351 read_size = -1
352
353 refctx = zstd.ZstdCompressor(level=level)
354 ref_frame = refctx.compress(original)
355
356 cctx = zstd.ZstdCompressor(level=level)
357 with cctx.stream_reader(io.BytesIO(original), size=len(original),
358 read_size=source_read_size) as reader:
359 chunks = []
360 while True:
361 b = bytearray(read_size)
362 count = reader.readinto1(b)
363
364 if not count:
365 break
366
367 chunks.append(bytes(b[0:count]))
368
369 self.assertEqual(b''.join(chunks), ref_frame)
370
371 @hypothesis.settings(
372 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
373 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
374 level=strategies.integers(min_value=1, max_value=5),
375 source_read_size=strategies.integers(1, 16384),
376 read_size=strategies.integers(1, zstd.COMPRESSION_RECOMMENDED_OUTPUT_SIZE))
377 def test_buffer_source_readinto1(self, original, level, source_read_size,
378 read_size):
379 if read_size == 0:
380 read_size = -1
381
382 refctx = zstd.ZstdCompressor(level=level)
383 ref_frame = refctx.compress(original)
384
385 cctx = zstd.ZstdCompressor(level=level)
386 with cctx.stream_reader(original, size=len(original),
387 read_size=source_read_size) as reader:
388 chunks = []
389 while True:
390 b = bytearray(read_size)
391 count = reader.readinto1(b)
392
393 if not count:
394 break
395
396 chunks.append(bytes(b[0:count]))
397
398 self.assertEqual(b''.join(chunks), ref_frame)
399
400 @hypothesis.settings(
401 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
402 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
403 level=strategies.integers(min_value=1, max_value=5),
404 source_read_size=strategies.integers(1, 16384),
405 read_sizes=strategies.data())
406 def test_stream_source_readinto1_variance(self, original, level, source_read_size,
407 read_sizes):
408 refctx = zstd.ZstdCompressor(level=level)
409 ref_frame = refctx.compress(original)
410
411 cctx = zstd.ZstdCompressor(level=level)
412 with cctx.stream_reader(io.BytesIO(original), size=len(original),
413 read_size=source_read_size) as reader:
414 chunks = []
415 while True:
416 read_size = read_sizes.draw(strategies.integers(1, 16384))
417 b = bytearray(read_size)
418 count = reader.readinto1(b)
419
420 if not count:
421 break
422
423 chunks.append(bytes(b[0:count]))
424
425 self.assertEqual(b''.join(chunks), ref_frame)
426
427 @hypothesis.settings(
428 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
429 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
430 level=strategies.integers(min_value=1, max_value=5),
431 source_read_size=strategies.integers(1, 16384),
432 read_sizes=strategies.data())
433 def test_buffer_source_readinto1_variance(self, original, level, source_read_size,
434 read_sizes):
435
436 refctx = zstd.ZstdCompressor(level=level)
437 ref_frame = refctx.compress(original)
438
439 cctx = zstd.ZstdCompressor(level=level)
440 with cctx.stream_reader(original, size=len(original),
441 read_size=source_read_size) as reader:
442 chunks = []
443 while True:
444 read_size = read_sizes.draw(strategies.integers(1, 16384))
445 b = bytearray(read_size)
446 count = reader.readinto1(b)
447
448 if not count:
449 break
450
451 chunks.append(bytes(b[0:count]))
452
453 self.assertEqual(b''.join(chunks), ref_frame)
454
455
68 456
69 457 @unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set')
70 458 @make_cffi
@@ -77,7 +465,7 b' class TestCompressor_stream_writer_fuzzi'
77 465 ref_frame = refctx.compress(original)
78 466
79 467 cctx = zstd.ZstdCompressor(level=level)
80 b = io.BytesIO()
468 b = NonClosingBytesIO()
81 469 with cctx.stream_writer(b, size=len(original), write_size=write_size) as compressor:
82 470 compressor.write(original)
83 471
@@ -219,6 +607,9 b' class TestCompressor_multi_compress_to_b'
219 607 write_checksum=True,
220 608 **kwargs)
221 609
610 if not hasattr(cctx, 'multi_compress_to_buffer'):
611 self.skipTest('multi_compress_to_buffer not available')
612
222 613 result = cctx.multi_compress_to_buffer(original, threads=-1)
223 614
224 615 self.assertEqual(len(result), len(original))
@@ -15,17 +15,17 b' class TestCompressionParameters(unittest'
15 15 chain_log=zstd.CHAINLOG_MIN,
16 16 hash_log=zstd.HASHLOG_MIN,
17 17 search_log=zstd.SEARCHLOG_MIN,
18 min_match=zstd.SEARCHLENGTH_MIN + 1,
18 min_match=zstd.MINMATCH_MIN + 1,
19 19 target_length=zstd.TARGETLENGTH_MIN,
20 compression_strategy=zstd.STRATEGY_FAST)
20 strategy=zstd.STRATEGY_FAST)
21 21
22 22 zstd.ZstdCompressionParameters(window_log=zstd.WINDOWLOG_MAX,
23 23 chain_log=zstd.CHAINLOG_MAX,
24 24 hash_log=zstd.HASHLOG_MAX,
25 25 search_log=zstd.SEARCHLOG_MAX,
26 min_match=zstd.SEARCHLENGTH_MAX - 1,
26 min_match=zstd.MINMATCH_MAX - 1,
27 27 target_length=zstd.TARGETLENGTH_MAX,
28 compression_strategy=zstd.STRATEGY_BTULTRA)
28 strategy=zstd.STRATEGY_BTULTRA2)
29 29
30 30 def test_from_level(self):
31 31 p = zstd.ZstdCompressionParameters.from_level(1)
@@ -43,7 +43,7 b' class TestCompressionParameters(unittest'
43 43 search_log=4,
44 44 min_match=5,
45 45 target_length=8,
46 compression_strategy=1)
46 strategy=1)
47 47 self.assertEqual(p.window_log, 10)
48 48 self.assertEqual(p.chain_log, 6)
49 49 self.assertEqual(p.hash_log, 7)
@@ -59,9 +59,10 b' class TestCompressionParameters(unittest'
59 59 self.assertEqual(p.threads, 4)
60 60
61 61 p = zstd.ZstdCompressionParameters(threads=2, job_size=1048576,
62 overlap_size_log=6)
62 overlap_log=6)
63 63 self.assertEqual(p.threads, 2)
64 64 self.assertEqual(p.job_size, 1048576)
65 self.assertEqual(p.overlap_log, 6)
65 66 self.assertEqual(p.overlap_size_log, 6)
66 67
67 68 p = zstd.ZstdCompressionParameters(compression_level=-1)
@@ -85,8 +86,9 b' class TestCompressionParameters(unittest'
85 86 p = zstd.ZstdCompressionParameters(ldm_bucket_size_log=7)
86 87 self.assertEqual(p.ldm_bucket_size_log, 7)
87 88
88 p = zstd.ZstdCompressionParameters(ldm_hash_every_log=8)
89 p = zstd.ZstdCompressionParameters(ldm_hash_rate_log=8)
89 90 self.assertEqual(p.ldm_hash_every_log, 8)
91 self.assertEqual(p.ldm_hash_rate_log, 8)
90 92
91 93 def test_estimated_compression_context_size(self):
92 94 p = zstd.ZstdCompressionParameters(window_log=20,
@@ -95,12 +97,44 b' class TestCompressionParameters(unittest'
95 97 search_log=1,
96 98 min_match=5,
97 99 target_length=16,
98 compression_strategy=zstd.STRATEGY_DFAST)
100 strategy=zstd.STRATEGY_DFAST)
99 101
100 102 # 32-bit has slightly different values from 64-bit.
101 103 self.assertAlmostEqual(p.estimated_compression_context_size(), 1294072,
102 104 delta=250)
103 105
106 def test_strategy(self):
107 with self.assertRaisesRegexp(ValueError, 'cannot specify both compression_strategy'):
108 zstd.ZstdCompressionParameters(strategy=0, compression_strategy=0)
109
110 p = zstd.ZstdCompressionParameters(strategy=2)
111 self.assertEqual(p.compression_strategy, 2)
112
113 p = zstd.ZstdCompressionParameters(strategy=3)
114 self.assertEqual(p.compression_strategy, 3)
115
116 def test_ldm_hash_rate_log(self):
117 with self.assertRaisesRegexp(ValueError, 'cannot specify both ldm_hash_rate_log'):
118 zstd.ZstdCompressionParameters(ldm_hash_rate_log=8, ldm_hash_every_log=4)
119
120 p = zstd.ZstdCompressionParameters(ldm_hash_rate_log=8)
121 self.assertEqual(p.ldm_hash_every_log, 8)
122
123 p = zstd.ZstdCompressionParameters(ldm_hash_every_log=16)
124 self.assertEqual(p.ldm_hash_every_log, 16)
125
126 def test_overlap_log(self):
127 with self.assertRaisesRegexp(ValueError, 'cannot specify both overlap_log'):
128 zstd.ZstdCompressionParameters(overlap_log=1, overlap_size_log=9)
129
130 p = zstd.ZstdCompressionParameters(overlap_log=2)
131 self.assertEqual(p.overlap_log, 2)
132 self.assertEqual(p.overlap_size_log, 2)
133
134 p = zstd.ZstdCompressionParameters(overlap_size_log=4)
135 self.assertEqual(p.overlap_log, 4)
136 self.assertEqual(p.overlap_size_log, 4)
137
104 138
105 139 @make_cffi
106 140 class TestFrameParameters(unittest.TestCase):
@@ -24,8 +24,8 b' s_hashlog = strategies.integers(min_valu'
24 24 max_value=zstd.HASHLOG_MAX)
25 25 s_searchlog = strategies.integers(min_value=zstd.SEARCHLOG_MIN,
26 26 max_value=zstd.SEARCHLOG_MAX)
27 s_searchlength = strategies.integers(min_value=zstd.SEARCHLENGTH_MIN,
28 max_value=zstd.SEARCHLENGTH_MAX)
27 s_minmatch = strategies.integers(min_value=zstd.MINMATCH_MIN,
28 max_value=zstd.MINMATCH_MAX)
29 29 s_targetlength = strategies.integers(min_value=zstd.TARGETLENGTH_MIN,
30 30 max_value=zstd.TARGETLENGTH_MAX)
31 31 s_strategy = strategies.sampled_from((zstd.STRATEGY_FAST,
@@ -35,41 +35,42 b' s_strategy = strategies.sampled_from((zs'
35 35 zstd.STRATEGY_LAZY2,
36 36 zstd.STRATEGY_BTLAZY2,
37 37 zstd.STRATEGY_BTOPT,
38 zstd.STRATEGY_BTULTRA))
38 zstd.STRATEGY_BTULTRA,
39 zstd.STRATEGY_BTULTRA2))
39 40
40 41
41 42 @make_cffi
42 43 @unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set')
43 44 class TestCompressionParametersHypothesis(unittest.TestCase):
44 45 @hypothesis.given(s_windowlog, s_chainlog, s_hashlog, s_searchlog,
45 s_searchlength, s_targetlength, s_strategy)
46 s_minmatch, s_targetlength, s_strategy)
46 47 def test_valid_init(self, windowlog, chainlog, hashlog, searchlog,
47 searchlength, targetlength, strategy):
48 minmatch, targetlength, strategy):
48 49 zstd.ZstdCompressionParameters(window_log=windowlog,
49 50 chain_log=chainlog,
50 51 hash_log=hashlog,
51 52 search_log=searchlog,
52 min_match=searchlength,
53 min_match=minmatch,
53 54 target_length=targetlength,
54 compression_strategy=strategy)
55 strategy=strategy)
55 56
56 57 @hypothesis.given(s_windowlog, s_chainlog, s_hashlog, s_searchlog,
57 s_searchlength, s_targetlength, s_strategy)
58 s_minmatch, s_targetlength, s_strategy)
58 59 def test_estimated_compression_context_size(self, windowlog, chainlog,
59 60 hashlog, searchlog,
60 searchlength, targetlength,
61 minmatch, targetlength,
61 62 strategy):
62 if searchlength == zstd.SEARCHLENGTH_MIN and strategy in (zstd.STRATEGY_FAST, zstd.STRATEGY_GREEDY):
63 searchlength += 1
64 elif searchlength == zstd.SEARCHLENGTH_MAX and strategy != zstd.STRATEGY_FAST:
65 searchlength -= 1
63 if minmatch == zstd.MINMATCH_MIN and strategy in (zstd.STRATEGY_FAST, zstd.STRATEGY_GREEDY):
64 minmatch += 1
65 elif minmatch == zstd.MINMATCH_MAX and strategy != zstd.STRATEGY_FAST:
66 minmatch -= 1
66 67
67 68 p = zstd.ZstdCompressionParameters(window_log=windowlog,
68 69 chain_log=chainlog,
69 70 hash_log=hashlog,
70 71 search_log=searchlog,
71 min_match=searchlength,
72 min_match=minmatch,
72 73 target_length=targetlength,
73 compression_strategy=strategy)
74 strategy=strategy)
74 75 size = p.estimated_compression_context_size()
75 76
@@ -3,6 +3,7 b' import os'
3 3 import random
4 4 import struct
5 5 import sys
6 import tempfile
6 7 import unittest
7 8
8 9 import zstandard as zstd
@@ -10,6 +11,7 b' import zstandard as zstd'
10 11 from .common import (
11 12 generate_samples,
12 13 make_cffi,
14 NonClosingBytesIO,
13 15 OpCountingBytesIO,
14 16 )
15 17
@@ -219,7 +221,7 b' class TestDecompressor_decompress(unitte'
219 221 cctx = zstd.ZstdCompressor(write_content_size=False)
220 222 frame = cctx.compress(source)
221 223
222 dctx = zstd.ZstdDecompressor(max_window_size=1)
224 dctx = zstd.ZstdDecompressor(max_window_size=2**zstd.WINDOWLOG_MIN)
223 225
224 226 with self.assertRaisesRegexp(
225 227 zstd.ZstdError, 'decompression error: Frame requires too much memory'):
@@ -302,19 +304,16 b' class TestDecompressor_stream_reader(uni'
302 304 dctx = zstd.ZstdDecompressor()
303 305
304 306 with dctx.stream_reader(b'foo') as reader:
305 with self.assertRaises(NotImplementedError):
307 with self.assertRaises(io.UnsupportedOperation):
306 308 reader.readline()
307 309
308 with self.assertRaises(NotImplementedError):
310 with self.assertRaises(io.UnsupportedOperation):
309 311 reader.readlines()
310 312
311 with self.assertRaises(NotImplementedError):
312 reader.readall()
313
314 with self.assertRaises(NotImplementedError):
313 with self.assertRaises(io.UnsupportedOperation):
315 314 iter(reader)
316 315
317 with self.assertRaises(NotImplementedError):
316 with self.assertRaises(io.UnsupportedOperation):
318 317 next(reader)
319 318
320 319 with self.assertRaises(io.UnsupportedOperation):
@@ -347,15 +346,18 b' class TestDecompressor_stream_reader(uni'
347 346 with self.assertRaisesRegexp(ValueError, 'stream is closed'):
348 347 reader.read(1)
349 348
350 def test_bad_read_size(self):
349 def test_read_sizes(self):
350 cctx = zstd.ZstdCompressor()
351 foo = cctx.compress(b'foo')
352
351 353 dctx = zstd.ZstdDecompressor()
352 354
353 with dctx.stream_reader(b'foo') as reader:
354 with self.assertRaisesRegexp(ValueError, 'cannot read negative or size 0 amounts'):
355 reader.read(-1)
355 with dctx.stream_reader(foo) as reader:
356 with self.assertRaisesRegexp(ValueError, 'cannot read negative amounts less than -1'):
357 reader.read(-2)
356 358
357 with self.assertRaisesRegexp(ValueError, 'cannot read negative or size 0 amounts'):
358 reader.read(0)
359 self.assertEqual(reader.read(0), b'')
360 self.assertEqual(reader.read(), b'foo')
359 361
360 362 def test_read_buffer(self):
361 363 cctx = zstd.ZstdCompressor()
@@ -524,13 +526,243 b' class TestDecompressor_stream_reader(uni'
524 526 reader = dctx.stream_reader(source)
525 527
526 528 with reader:
527 with self.assertRaises(TypeError):
528 reader.read()
529 reader.read(0)
529 530
530 531 with reader:
531 532 with self.assertRaisesRegexp(ValueError, 'stream is closed'):
532 533 reader.read(100)
533 534
535 def test_partial_read(self):
536 # Inspired by https://github.com/indygreg/python-zstandard/issues/71.
537 buffer = io.BytesIO()
538 cctx = zstd.ZstdCompressor()
539 writer = cctx.stream_writer(buffer)
540 writer.write(bytearray(os.urandom(1000000)))
541 writer.flush(zstd.FLUSH_FRAME)
542 buffer.seek(0)
543
544 dctx = zstd.ZstdDecompressor()
545 reader = dctx.stream_reader(buffer)
546
547 while True:
548 chunk = reader.read(8192)
549 if not chunk:
550 break
551
552 def test_read_multiple_frames(self):
553 cctx = zstd.ZstdCompressor()
554 source = io.BytesIO()
555 writer = cctx.stream_writer(source)
556 writer.write(b'foo')
557 writer.flush(zstd.FLUSH_FRAME)
558 writer.write(b'bar')
559 writer.flush(zstd.FLUSH_FRAME)
560
561 dctx = zstd.ZstdDecompressor()
562
563 reader = dctx.stream_reader(source.getvalue())
564 self.assertEqual(reader.read(2), b'fo')
565 self.assertEqual(reader.read(2), b'o')
566 self.assertEqual(reader.read(2), b'ba')
567 self.assertEqual(reader.read(2), b'r')
568
569 source.seek(0)
570 reader = dctx.stream_reader(source)
571 self.assertEqual(reader.read(2), b'fo')
572 self.assertEqual(reader.read(2), b'o')
573 self.assertEqual(reader.read(2), b'ba')
574 self.assertEqual(reader.read(2), b'r')
575
576 reader = dctx.stream_reader(source.getvalue())
577 self.assertEqual(reader.read(3), b'foo')
578 self.assertEqual(reader.read(3), b'bar')
579
580 source.seek(0)
581 reader = dctx.stream_reader(source)
582 self.assertEqual(reader.read(3), b'foo')
583 self.assertEqual(reader.read(3), b'bar')
584
585 reader = dctx.stream_reader(source.getvalue())
586 self.assertEqual(reader.read(4), b'foo')
587 self.assertEqual(reader.read(4), b'bar')
588
589 source.seek(0)
590 reader = dctx.stream_reader(source)
591 self.assertEqual(reader.read(4), b'foo')
592 self.assertEqual(reader.read(4), b'bar')
593
594 reader = dctx.stream_reader(source.getvalue())
595 self.assertEqual(reader.read(128), b'foo')
596 self.assertEqual(reader.read(128), b'bar')
597
598 source.seek(0)
599 reader = dctx.stream_reader(source)
600 self.assertEqual(reader.read(128), b'foo')
601 self.assertEqual(reader.read(128), b'bar')
602
603 # Now tests for reads spanning frames.
604 reader = dctx.stream_reader(source.getvalue(), read_across_frames=True)
605 self.assertEqual(reader.read(3), b'foo')
606 self.assertEqual(reader.read(3), b'bar')
607
608 source.seek(0)
609 reader = dctx.stream_reader(source, read_across_frames=True)
610 self.assertEqual(reader.read(3), b'foo')
611 self.assertEqual(reader.read(3), b'bar')
612
613 reader = dctx.stream_reader(source.getvalue(), read_across_frames=True)
614 self.assertEqual(reader.read(6), b'foobar')
615
616 source.seek(0)
617 reader = dctx.stream_reader(source, read_across_frames=True)
618 self.assertEqual(reader.read(6), b'foobar')
619
620 reader = dctx.stream_reader(source.getvalue(), read_across_frames=True)
621 self.assertEqual(reader.read(7), b'foobar')
622
623 source.seek(0)
624 reader = dctx.stream_reader(source, read_across_frames=True)
625 self.assertEqual(reader.read(7), b'foobar')
626
627 reader = dctx.stream_reader(source.getvalue(), read_across_frames=True)
628 self.assertEqual(reader.read(128), b'foobar')
629
630 source.seek(0)
631 reader = dctx.stream_reader(source, read_across_frames=True)
632 self.assertEqual(reader.read(128), b'foobar')
633
634 def test_readinto(self):
635 cctx = zstd.ZstdCompressor()
636 foo = cctx.compress(b'foo')
637
638 dctx = zstd.ZstdDecompressor()
639
640 # Attempting to readinto() a non-writable buffer fails.
641 # The exact exception varies based on the backend.
642 reader = dctx.stream_reader(foo)
643 with self.assertRaises(Exception):
644 reader.readinto(b'foobar')
645
646 # readinto() with sufficiently large destination.
647 b = bytearray(1024)
648 reader = dctx.stream_reader(foo)
649 self.assertEqual(reader.readinto(b), 3)
650 self.assertEqual(b[0:3], b'foo')
651 self.assertEqual(reader.readinto(b), 0)
652 self.assertEqual(b[0:3], b'foo')
653
654 # readinto() with small reads.
655 b = bytearray(1024)
656 reader = dctx.stream_reader(foo, read_size=1)
657 self.assertEqual(reader.readinto(b), 3)
658 self.assertEqual(b[0:3], b'foo')
659
660 # Too small destination buffer.
661 b = bytearray(2)
662 reader = dctx.stream_reader(foo)
663 self.assertEqual(reader.readinto(b), 2)
664 self.assertEqual(b[:], b'fo')
665
666 def test_readinto1(self):
667 cctx = zstd.ZstdCompressor()
668 foo = cctx.compress(b'foo')
669
670 dctx = zstd.ZstdDecompressor()
671
672 reader = dctx.stream_reader(foo)
673 with self.assertRaises(Exception):
674 reader.readinto1(b'foobar')
675
676 # Sufficiently large destination.
677 b = bytearray(1024)
678 reader = dctx.stream_reader(foo)
679 self.assertEqual(reader.readinto1(b), 3)
680 self.assertEqual(b[0:3], b'foo')
681 self.assertEqual(reader.readinto1(b), 0)
682 self.assertEqual(b[0:3], b'foo')
683
684 # readinto1() with small reads.
685 b = bytearray(1024)
686 reader = dctx.stream_reader(foo, read_size=1)
687 self.assertEqual(reader.readinto1(b), 3)
688 self.assertEqual(b[0:3], b'foo')
689
690 # Too small destination buffer.
691 b = bytearray(2)
692 reader = dctx.stream_reader(foo)
693 self.assertEqual(reader.readinto1(b), 2)
694 self.assertEqual(b[:], b'fo')
695
696 def test_readall(self):
697 cctx = zstd.ZstdCompressor()
698 foo = cctx.compress(b'foo')
699
700 dctx = zstd.ZstdDecompressor()
701 reader = dctx.stream_reader(foo)
702
703 self.assertEqual(reader.readall(), b'foo')
704
705 def test_read1(self):
706 cctx = zstd.ZstdCompressor()
707 foo = cctx.compress(b'foo')
708
709 dctx = zstd.ZstdDecompressor()
710
711 b = OpCountingBytesIO(foo)
712 reader = dctx.stream_reader(b)
713
714 self.assertEqual(reader.read1(), b'foo')
715 self.assertEqual(b._read_count, 1)
716
717 b = OpCountingBytesIO(foo)
718 reader = dctx.stream_reader(b)
719
720 self.assertEqual(reader.read1(0), b'')
721 self.assertEqual(reader.read1(2), b'fo')
722 self.assertEqual(b._read_count, 1)
723 self.assertEqual(reader.read1(1), b'o')
724 self.assertEqual(b._read_count, 1)
725 self.assertEqual(reader.read1(1), b'')
726 self.assertEqual(b._read_count, 2)
727
728 def test_read_lines(self):
729 cctx = zstd.ZstdCompressor()
730 source = b'\n'.join(('line %d' % i).encode('ascii') for i in range(1024))
731
732 frame = cctx.compress(source)
733
734 dctx = zstd.ZstdDecompressor()
735 reader = dctx.stream_reader(frame)
736 tr = io.TextIOWrapper(reader, encoding='utf-8')
737
738 lines = []
739 for line in tr:
740 lines.append(line.encode('utf-8'))
741
742 self.assertEqual(len(lines), 1024)
743 self.assertEqual(b''.join(lines), source)
744
745 reader = dctx.stream_reader(frame)
746 tr = io.TextIOWrapper(reader, encoding='utf-8')
747
748 lines = tr.readlines()
749 self.assertEqual(len(lines), 1024)
750 self.assertEqual(''.join(lines).encode('utf-8'), source)
751
752 reader = dctx.stream_reader(frame)
753 tr = io.TextIOWrapper(reader, encoding='utf-8')
754
755 lines = []
756 while True:
757 line = tr.readline()
758 if not line:
759 break
760
761 lines.append(line.encode('utf-8'))
762
763 self.assertEqual(len(lines), 1024)
764 self.assertEqual(b''.join(lines), source)
765
534 766
535 767 @make_cffi
536 768 class TestDecompressor_decompressobj(unittest.TestCase):
@@ -540,6 +772,9 b' class TestDecompressor_decompressobj(uni'
540 772 dctx = zstd.ZstdDecompressor()
541 773 dobj = dctx.decompressobj()
542 774 self.assertEqual(dobj.decompress(data), b'foobar')
775 self.assertIsNone(dobj.flush())
776 self.assertIsNone(dobj.flush(10))
777 self.assertIsNone(dobj.flush(length=100))
543 778
544 779 def test_input_types(self):
545 780 compressed = zstd.ZstdCompressor(level=1).compress(b'foo')
@@ -557,7 +792,11 b' class TestDecompressor_decompressobj(uni'
557 792
558 793 for source in sources:
559 794 dobj = dctx.decompressobj()
795 self.assertIsNone(dobj.flush())
796 self.assertIsNone(dobj.flush(10))
797 self.assertIsNone(dobj.flush(length=100))
560 798 self.assertEqual(dobj.decompress(source), b'foo')
799 self.assertIsNone(dobj.flush())
561 800
562 801 def test_reuse(self):
563 802 data = zstd.ZstdCompressor(level=1).compress(b'foobar')
@@ -568,6 +807,7 b' class TestDecompressor_decompressobj(uni'
568 807
569 808 with self.assertRaisesRegexp(zstd.ZstdError, 'cannot use a decompressobj'):
570 809 dobj.decompress(data)
810 self.assertIsNone(dobj.flush())
571 811
572 812 def test_bad_write_size(self):
573 813 dctx = zstd.ZstdDecompressor()
@@ -585,16 +825,141 b' class TestDecompressor_decompressobj(uni'
585 825 dobj = dctx.decompressobj(write_size=i + 1)
586 826 self.assertEqual(dobj.decompress(data), source)
587 827
828
588 829 def decompress_via_writer(data):
589 830 buffer = io.BytesIO()
590 831 dctx = zstd.ZstdDecompressor()
591 with dctx.stream_writer(buffer) as decompressor:
592 decompressor.write(data)
832 decompressor = dctx.stream_writer(buffer)
833 decompressor.write(data)
834
593 835 return buffer.getvalue()
594 836
595 837
596 838 @make_cffi
597 839 class TestDecompressor_stream_writer(unittest.TestCase):
840 def test_io_api(self):
841 buffer = io.BytesIO()
842 dctx = zstd.ZstdDecompressor()
843 writer = dctx.stream_writer(buffer)
844
845 self.assertFalse(writer.closed)
846 self.assertFalse(writer.isatty())
847 self.assertFalse(writer.readable())
848
849 with self.assertRaises(io.UnsupportedOperation):
850 writer.readline()
851
852 with self.assertRaises(io.UnsupportedOperation):
853 writer.readline(42)
854
855 with self.assertRaises(io.UnsupportedOperation):
856 writer.readline(size=42)
857
858 with self.assertRaises(io.UnsupportedOperation):
859 writer.readlines()
860
861 with self.assertRaises(io.UnsupportedOperation):
862 writer.readlines(42)
863
864 with self.assertRaises(io.UnsupportedOperation):
865 writer.readlines(hint=42)
866
867 with self.assertRaises(io.UnsupportedOperation):
868 writer.seek(0)
869
870 with self.assertRaises(io.UnsupportedOperation):
871 writer.seek(10, os.SEEK_SET)
872
873 self.assertFalse(writer.seekable())
874
875 with self.assertRaises(io.UnsupportedOperation):
876 writer.tell()
877
878 with self.assertRaises(io.UnsupportedOperation):
879 writer.truncate()
880
881 with self.assertRaises(io.UnsupportedOperation):
882 writer.truncate(42)
883
884 with self.assertRaises(io.UnsupportedOperation):
885 writer.truncate(size=42)
886
887 self.assertTrue(writer.writable())
888
889 with self.assertRaises(io.UnsupportedOperation):
890 writer.writelines([])
891
892 with self.assertRaises(io.UnsupportedOperation):
893 writer.read()
894
895 with self.assertRaises(io.UnsupportedOperation):
896 writer.read(42)
897
898 with self.assertRaises(io.UnsupportedOperation):
899 writer.read(size=42)
900
901 with self.assertRaises(io.UnsupportedOperation):
902 writer.readall()
903
904 with self.assertRaises(io.UnsupportedOperation):
905 writer.readinto(None)
906
907 with self.assertRaises(io.UnsupportedOperation):
908 writer.fileno()
909
910 def test_fileno_file(self):
911 with tempfile.TemporaryFile('wb') as tf:
912 dctx = zstd.ZstdDecompressor()
913 writer = dctx.stream_writer(tf)
914
915 self.assertEqual(writer.fileno(), tf.fileno())
916
917 def test_close(self):
918 foo = zstd.ZstdCompressor().compress(b'foo')
919
920 buffer = NonClosingBytesIO()
921 dctx = zstd.ZstdDecompressor()
922 writer = dctx.stream_writer(buffer)
923
924 writer.write(foo)
925 self.assertFalse(writer.closed)
926 self.assertFalse(buffer.closed)
927 writer.close()
928 self.assertTrue(writer.closed)
929 self.assertTrue(buffer.closed)
930
931 with self.assertRaisesRegexp(ValueError, 'stream is closed'):
932 writer.write(b'')
933
934 with self.assertRaisesRegexp(ValueError, 'stream is closed'):
935 writer.flush()
936
937 with self.assertRaisesRegexp(ValueError, 'stream is closed'):
938 with writer:
939 pass
940
941 self.assertEqual(buffer.getvalue(), b'foo')
942
943 # Context manager exit should close stream.
944 buffer = NonClosingBytesIO()
945 writer = dctx.stream_writer(buffer)
946
947 with writer:
948 writer.write(foo)
949
950 self.assertTrue(writer.closed)
951 self.assertEqual(buffer.getvalue(), b'foo')
952
953 def test_flush(self):
954 buffer = OpCountingBytesIO()
955 dctx = zstd.ZstdDecompressor()
956 writer = dctx.stream_writer(buffer)
957
958 writer.flush()
959 self.assertEqual(buffer._flush_count, 1)
960 writer.flush()
961 self.assertEqual(buffer._flush_count, 2)
962
598 963 def test_empty_roundtrip(self):
599 964 cctx = zstd.ZstdCompressor()
600 965 empty = cctx.compress(b'')
@@ -616,9 +981,21 b' class TestDecompressor_stream_writer(uni'
616 981 dctx = zstd.ZstdDecompressor()
617 982 for source in sources:
618 983 buffer = io.BytesIO()
984
985 decompressor = dctx.stream_writer(buffer)
986 decompressor.write(source)
987 self.assertEqual(buffer.getvalue(), b'foo')
988
989 buffer = NonClosingBytesIO()
990
619 991 with dctx.stream_writer(buffer) as decompressor:
620 decompressor.write(source)
992 self.assertEqual(decompressor.write(source), 3)
993
994 self.assertEqual(buffer.getvalue(), b'foo')
621 995
996 buffer = io.BytesIO()
997 writer = dctx.stream_writer(buffer, write_return_read=True)
998 self.assertEqual(writer.write(source), len(source))
622 999 self.assertEqual(buffer.getvalue(), b'foo')
623 1000
624 1001 def test_large_roundtrip(self):
@@ -641,7 +1018,7 b' class TestDecompressor_stream_writer(uni'
641 1018 cctx = zstd.ZstdCompressor()
642 1019 compressed = cctx.compress(orig)
643 1020
644 buffer = io.BytesIO()
1021 buffer = NonClosingBytesIO()
645 1022 dctx = zstd.ZstdDecompressor()
646 1023 with dctx.stream_writer(buffer) as decompressor:
647 1024 pos = 0
@@ -651,6 +1028,17 b' class TestDecompressor_stream_writer(uni'
651 1028 pos += 8192
652 1029 self.assertEqual(buffer.getvalue(), orig)
653 1030
1031 # Again with write_return_read=True
1032 buffer = io.BytesIO()
1033 writer = dctx.stream_writer(buffer, write_return_read=True)
1034 pos = 0
1035 while pos < len(compressed):
1036 pos2 = pos + 8192
1037 chunk = compressed[pos:pos2]
1038 self.assertEqual(writer.write(chunk), len(chunk))
1039 pos += 8192
1040 self.assertEqual(buffer.getvalue(), orig)
1041
654 1042 def test_dictionary(self):
655 1043 samples = []
656 1044 for i in range(128):
@@ -661,7 +1049,7 b' class TestDecompressor_stream_writer(uni'
661 1049 d = zstd.train_dictionary(8192, samples)
662 1050
663 1051 orig = b'foobar' * 16384
664 buffer = io.BytesIO()
1052 buffer = NonClosingBytesIO()
665 1053 cctx = zstd.ZstdCompressor(dict_data=d)
666 1054 with cctx.stream_writer(buffer) as compressor:
667 1055 self.assertEqual(compressor.write(orig), 0)
@@ -670,6 +1058,12 b' class TestDecompressor_stream_writer(uni'
670 1058 buffer = io.BytesIO()
671 1059
672 1060 dctx = zstd.ZstdDecompressor(dict_data=d)
1061 decompressor = dctx.stream_writer(buffer)
1062 self.assertEqual(decompressor.write(compressed), len(orig))
1063 self.assertEqual(buffer.getvalue(), orig)
1064
1065 buffer = NonClosingBytesIO()
1066
673 1067 with dctx.stream_writer(buffer) as decompressor:
674 1068 self.assertEqual(decompressor.write(compressed), len(orig))
675 1069
@@ -678,6 +1072,11 b' class TestDecompressor_stream_writer(uni'
678 1072 def test_memory_size(self):
679 1073 dctx = zstd.ZstdDecompressor()
680 1074 buffer = io.BytesIO()
1075
1076 decompressor = dctx.stream_writer(buffer)
1077 size = decompressor.memory_size()
1078 self.assertGreater(size, 100000)
1079
681 1080 with dctx.stream_writer(buffer) as decompressor:
682 1081 size = decompressor.memory_size()
683 1082
@@ -810,7 +1209,7 b' class TestDecompressor_read_to_iter(unit'
810 1209 @unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set')
811 1210 def test_large_input(self):
812 1211 bytes = list(struct.Struct('>B').pack(i) for i in range(256))
813 compressed = io.BytesIO()
1212 compressed = NonClosingBytesIO()
814 1213 input_size = 0
815 1214 cctx = zstd.ZstdCompressor(level=1)
816 1215 with cctx.stream_writer(compressed) as compressor:
@@ -823,7 +1222,7 b' class TestDecompressor_read_to_iter(unit'
823 1222 if have_compressed and have_raw:
824 1223 break
825 1224
826 compressed.seek(0)
1225 compressed = io.BytesIO(compressed.getvalue())
827 1226 self.assertGreater(len(compressed.getvalue()),
828 1227 zstd.DECOMPRESSION_RECOMMENDED_INPUT_SIZE)
829 1228
@@ -861,7 +1260,7 b' class TestDecompressor_read_to_iter(unit'
861 1260
862 1261 source = io.BytesIO()
863 1262
864 compressed = io.BytesIO()
1263 compressed = NonClosingBytesIO()
865 1264 with cctx.stream_writer(compressed) as compressor:
866 1265 for i in range(256):
867 1266 chunk = b'\0' * 1024
@@ -874,7 +1273,7 b' class TestDecompressor_read_to_iter(unit'
874 1273 max_output_size=len(source.getvalue()))
875 1274 self.assertEqual(simple, source.getvalue())
876 1275
877 compressed.seek(0)
1276 compressed = io.BytesIO(compressed.getvalue())
878 1277 streamed = b''.join(dctx.read_to_iter(compressed))
879 1278 self.assertEqual(streamed, source.getvalue())
880 1279
@@ -1001,6 +1400,9 b' class TestDecompressor_multi_decompress_'
1001 1400 def test_invalid_inputs(self):
1002 1401 dctx = zstd.ZstdDecompressor()
1003 1402
1403 if not hasattr(dctx, 'multi_decompress_to_buffer'):
1404 self.skipTest('multi_decompress_to_buffer not available')
1405
1004 1406 with self.assertRaises(TypeError):
1005 1407 dctx.multi_decompress_to_buffer(True)
1006 1408
@@ -1020,6 +1422,10 b' class TestDecompressor_multi_decompress_'
1020 1422 frames = [cctx.compress(d) for d in original]
1021 1423
1022 1424 dctx = zstd.ZstdDecompressor()
1425
1426 if not hasattr(dctx, 'multi_decompress_to_buffer'):
1427 self.skipTest('multi_decompress_to_buffer not available')
1428
1023 1429 result = dctx.multi_decompress_to_buffer(frames)
1024 1430
1025 1431 self.assertEqual(len(result), len(frames))
@@ -1041,6 +1447,10 b' class TestDecompressor_multi_decompress_'
1041 1447 sizes = struct.pack('=' + 'Q' * len(original), *map(len, original))
1042 1448
1043 1449 dctx = zstd.ZstdDecompressor()
1450
1451 if not hasattr(dctx, 'multi_decompress_to_buffer'):
1452 self.skipTest('multi_decompress_to_buffer not available')
1453
1044 1454 result = dctx.multi_decompress_to_buffer(frames, decompressed_sizes=sizes)
1045 1455
1046 1456 self.assertEqual(len(result), len(frames))
@@ -1057,6 +1467,9 b' class TestDecompressor_multi_decompress_'
1057 1467
1058 1468 dctx = zstd.ZstdDecompressor()
1059 1469
1470 if not hasattr(dctx, 'multi_decompress_to_buffer'):
1471 self.skipTest('multi_decompress_to_buffer not available')
1472
1060 1473 segments = struct.pack('=QQQQ', 0, len(frames[0]), len(frames[0]), len(frames[1]))
1061 1474 b = zstd.BufferWithSegments(b''.join(frames), segments)
1062 1475
@@ -1074,12 +1487,16 b' class TestDecompressor_multi_decompress_'
1074 1487 frames = [cctx.compress(d) for d in original]
1075 1488 sizes = struct.pack('=' + 'Q' * len(original), *map(len, original))
1076 1489
1490 dctx = zstd.ZstdDecompressor()
1491
1492 if not hasattr(dctx, 'multi_decompress_to_buffer'):
1493 self.skipTest('multi_decompress_to_buffer not available')
1494
1077 1495 segments = struct.pack('=QQQQQQ', 0, len(frames[0]),
1078 1496 len(frames[0]), len(frames[1]),
1079 1497 len(frames[0]) + len(frames[1]), len(frames[2]))
1080 1498 b = zstd.BufferWithSegments(b''.join(frames), segments)
1081 1499
1082 dctx = zstd.ZstdDecompressor()
1083 1500 result = dctx.multi_decompress_to_buffer(b, decompressed_sizes=sizes)
1084 1501
1085 1502 self.assertEqual(len(result), len(frames))
@@ -1099,10 +1516,14 b' class TestDecompressor_multi_decompress_'
1099 1516 b'foo4' * 6,
1100 1517 ]
1101 1518
1519 if not hasattr(cctx, 'multi_compress_to_buffer'):
1520 self.skipTest('multi_compress_to_buffer not available')
1521
1102 1522 frames = cctx.multi_compress_to_buffer(original)
1103 1523
1104 1524 # Check round trip.
1105 1525 dctx = zstd.ZstdDecompressor()
1526
1106 1527 decompressed = dctx.multi_decompress_to_buffer(frames, threads=3)
1107 1528
1108 1529 self.assertEqual(len(decompressed), len(original))
@@ -1138,7 +1559,12 b' class TestDecompressor_multi_decompress_'
1138 1559 frames = [cctx.compress(s) for s in generate_samples()]
1139 1560
1140 1561 dctx = zstd.ZstdDecompressor(dict_data=d)
1562
1563 if not hasattr(dctx, 'multi_decompress_to_buffer'):
1564 self.skipTest('multi_decompress_to_buffer not available')
1565
1141 1566 result = dctx.multi_decompress_to_buffer(frames)
1567
1142 1568 self.assertEqual([o.tobytes() for o in result], generate_samples())
1143 1569
1144 1570 def test_multiple_threads(self):
@@ -1149,6 +1575,10 b' class TestDecompressor_multi_decompress_'
1149 1575 frames.extend(cctx.compress(b'y' * 64) for i in range(256))
1150 1576
1151 1577 dctx = zstd.ZstdDecompressor()
1578
1579 if not hasattr(dctx, 'multi_decompress_to_buffer'):
1580 self.skipTest('multi_decompress_to_buffer not available')
1581
1152 1582 result = dctx.multi_decompress_to_buffer(frames, threads=-1)
1153 1583
1154 1584 self.assertEqual(len(result), len(frames))
@@ -1164,6 +1594,9 b' class TestDecompressor_multi_decompress_'
1164 1594
1165 1595 dctx = zstd.ZstdDecompressor()
1166 1596
1597 if not hasattr(dctx, 'multi_decompress_to_buffer'):
1598 self.skipTest('multi_decompress_to_buffer not available')
1599
1167 1600 with self.assertRaisesRegexp(zstd.ZstdError,
1168 1601 'error decompressing item 1: ('
1169 1602 'Corrupted block|'
@@ -12,6 +12,7 b' import zstandard as zstd'
12 12
13 13 from . common import (
14 14 make_cffi,
15 NonClosingBytesIO,
15 16 random_input_data,
16 17 )
17 18
@@ -23,22 +24,200 b' class TestDecompressor_stream_reader_fuz'
23 24 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
24 25 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
25 26 level=strategies.integers(min_value=1, max_value=5),
26 source_read_size=strategies.integers(1, 16384),
27 streaming=strategies.booleans(),
28 source_read_size=strategies.integers(1, 1048576),
27 29 read_sizes=strategies.data())
28 def test_stream_source_read_variance(self, original, level, source_read_size,
29 read_sizes):
30 def test_stream_source_read_variance(self, original, level, streaming,
31 source_read_size, read_sizes):
30 32 cctx = zstd.ZstdCompressor(level=level)
31 frame = cctx.compress(original)
33
34 if streaming:
35 source = io.BytesIO()
36 writer = cctx.stream_writer(source)
37 writer.write(original)
38 writer.flush(zstd.FLUSH_FRAME)
39 source.seek(0)
40 else:
41 frame = cctx.compress(original)
42 source = io.BytesIO(frame)
32 43
33 44 dctx = zstd.ZstdDecompressor()
34 source = io.BytesIO(frame)
35 45
36 46 chunks = []
37 47 with dctx.stream_reader(source, read_size=source_read_size) as reader:
38 48 while True:
39 read_size = read_sizes.draw(strategies.integers(1, 16384))
49 read_size = read_sizes.draw(strategies.integers(-1, 131072))
50 chunk = reader.read(read_size)
51 if not chunk and read_size:
52 break
53
54 chunks.append(chunk)
55
56 self.assertEqual(b''.join(chunks), original)
57
58 # Similar to above except we have a constant read() size.
59 @hypothesis.settings(
60 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
61 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
62 level=strategies.integers(min_value=1, max_value=5),
63 streaming=strategies.booleans(),
64 source_read_size=strategies.integers(1, 1048576),
65 read_size=strategies.integers(-1, 131072))
66 def test_stream_source_read_size(self, original, level, streaming,
67 source_read_size, read_size):
68 if read_size == 0:
69 read_size = 1
70
71 cctx = zstd.ZstdCompressor(level=level)
72
73 if streaming:
74 source = io.BytesIO()
75 writer = cctx.stream_writer(source)
76 writer.write(original)
77 writer.flush(zstd.FLUSH_FRAME)
78 source.seek(0)
79 else:
80 frame = cctx.compress(original)
81 source = io.BytesIO(frame)
82
83 dctx = zstd.ZstdDecompressor()
84
85 chunks = []
86 reader = dctx.stream_reader(source, read_size=source_read_size)
87 while True:
88 chunk = reader.read(read_size)
89 if not chunk and read_size:
90 break
91
92 chunks.append(chunk)
93
94 self.assertEqual(b''.join(chunks), original)
95
96 @hypothesis.settings(
97 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
98 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
99 level=strategies.integers(min_value=1, max_value=5),
100 streaming=strategies.booleans(),
101 source_read_size=strategies.integers(1, 1048576),
102 read_sizes=strategies.data())
103 def test_buffer_source_read_variance(self, original, level, streaming,
104 source_read_size, read_sizes):
105 cctx = zstd.ZstdCompressor(level=level)
106
107 if streaming:
108 source = io.BytesIO()
109 writer = cctx.stream_writer(source)
110 writer.write(original)
111 writer.flush(zstd.FLUSH_FRAME)
112 frame = source.getvalue()
113 else:
114 frame = cctx.compress(original)
115
116 dctx = zstd.ZstdDecompressor()
117 chunks = []
118
119 with dctx.stream_reader(frame, read_size=source_read_size) as reader:
120 while True:
121 read_size = read_sizes.draw(strategies.integers(-1, 131072))
40 122 chunk = reader.read(read_size)
41 if not chunk:
123 if not chunk and read_size:
124 break
125
126 chunks.append(chunk)
127
128 self.assertEqual(b''.join(chunks), original)
129
130 # Similar to above except we have a constant read() size.
131 @hypothesis.settings(
132 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
133 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
134 level=strategies.integers(min_value=1, max_value=5),
135 streaming=strategies.booleans(),
136 source_read_size=strategies.integers(1, 1048576),
137 read_size=strategies.integers(-1, 131072))
138 def test_buffer_source_constant_read_size(self, original, level, streaming,
139 source_read_size, read_size):
140 if read_size == 0:
141 read_size = -1
142
143 cctx = zstd.ZstdCompressor(level=level)
144
145 if streaming:
146 source = io.BytesIO()
147 writer = cctx.stream_writer(source)
148 writer.write(original)
149 writer.flush(zstd.FLUSH_FRAME)
150 frame = source.getvalue()
151 else:
152 frame = cctx.compress(original)
153
154 dctx = zstd.ZstdDecompressor()
155 chunks = []
156
157 reader = dctx.stream_reader(frame, read_size=source_read_size)
158 while True:
159 chunk = reader.read(read_size)
160 if not chunk and read_size:
161 break
162
163 chunks.append(chunk)
164
165 self.assertEqual(b''.join(chunks), original)
166
167 @hypothesis.settings(
168 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
169 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
170 level=strategies.integers(min_value=1, max_value=5),
171 streaming=strategies.booleans(),
172 source_read_size=strategies.integers(1, 1048576))
173 def test_stream_source_readall(self, original, level, streaming,
174 source_read_size):
175 cctx = zstd.ZstdCompressor(level=level)
176
177 if streaming:
178 source = io.BytesIO()
179 writer = cctx.stream_writer(source)
180 writer.write(original)
181 writer.flush(zstd.FLUSH_FRAME)
182 source.seek(0)
183 else:
184 frame = cctx.compress(original)
185 source = io.BytesIO(frame)
186
187 dctx = zstd.ZstdDecompressor()
188
189 data = dctx.stream_reader(source, read_size=source_read_size).readall()
190 self.assertEqual(data, original)
191
192 @hypothesis.settings(
193 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
194 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
195 level=strategies.integers(min_value=1, max_value=5),
196 streaming=strategies.booleans(),
197 source_read_size=strategies.integers(1, 1048576),
198 read_sizes=strategies.data())
199 def test_stream_source_read1_variance(self, original, level, streaming,
200 source_read_size, read_sizes):
201 cctx = zstd.ZstdCompressor(level=level)
202
203 if streaming:
204 source = io.BytesIO()
205 writer = cctx.stream_writer(source)
206 writer.write(original)
207 writer.flush(zstd.FLUSH_FRAME)
208 source.seek(0)
209 else:
210 frame = cctx.compress(original)
211 source = io.BytesIO(frame)
212
213 dctx = zstd.ZstdDecompressor()
214
215 chunks = []
216 with dctx.stream_reader(source, read_size=source_read_size) as reader:
217 while True:
218 read_size = read_sizes.draw(strategies.integers(-1, 131072))
219 chunk = reader.read1(read_size)
220 if not chunk and read_size:
42 221 break
43 222
44 223 chunks.append(chunk)
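Review note: the loop condition changes from ``if not chunk`` to ``if not chunk and read_size`` because the drawn read sizes now start at -1 and can therefore be 0. A zero-length read returns an empty result without implying end of stream. The sketch below reproduces that loop against a plain ``io.BytesIO``; the ``drain`` helper is illustrative only and assumes the size sequence contains at least one non-zero entry, since otherwise the loop would never terminate:

```python
import io


def drain(reader, read_sizes):
    """Read until EOF, cycling through the given sequence of read sizes."""
    chunks = []
    i = 0
    while True:
        size = read_sizes[i % len(read_sizes)]
        i += 1
        chunk = reader.read(size)
        # read(0) legitimately returns b''; only an empty result from a
        # non-zero request signals end of stream.
        if not chunk and size:
            break
        chunks.append(chunk)
    return b''.join(chunks)


payload = b'x' * 100
result = drain(io.BytesIO(payload), [0, 7, 0, -1])
```

The constant-size variants of these tests sidestep the non-termination case by coercing a drawn ``read_size`` of 0 to 1 or -1 before looping.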
@@ -49,24 +228,36 b' class TestDecompressor_stream_reader_fuz'
49 228 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
50 229 @hypothesis.given(original=strategies.sampled_from(random_input_data()),
51 230 level=strategies.integers(min_value=1, max_value=5),
52 source_read_size=strategies.integers(1, 16384),
231 streaming=strategies.booleans(),
232 source_read_size=strategies.integers(1, 1048576),
53 233 read_sizes=strategies.data())
54 def test_buffer_source_read_variance(self, original, level, source_read_size,
55 read_sizes):
234 def test_stream_source_readinto1_variance(self, original, level, streaming,
235 source_read_size, read_sizes):
56 236 cctx = zstd.ZstdCompressor(level=level)
57 frame = cctx.compress(original)
237
238 if streaming:
239 source = io.BytesIO()
240 writer = cctx.stream_writer(source)
241 writer.write(original)
242 writer.flush(zstd.FLUSH_FRAME)
243 source.seek(0)
244 else:
245 frame = cctx.compress(original)
246 source = io.BytesIO(frame)
58 247
59 248 dctx = zstd.ZstdDecompressor()
249
60 250 chunks = []
61
62 with dctx.stream_reader(frame, read_size=source_read_size) as reader:
251 with dctx.stream_reader(source, read_size=source_read_size) as reader:
63 252 while True:
64 read_size = read_sizes.draw(strategies.integers(1, 16384))
65 chunk = reader.read(read_size)
66 if not chunk:
253 read_size = read_sizes.draw(strategies.integers(1, 131072))
254 b = bytearray(read_size)
255 count = reader.readinto1(b)
256
257 if not count:
67 258 break
68 259
69 chunks.append(chunk)
260 chunks.append(bytes(b[0:count]))
70 261
71 262 self.assertEqual(b''.join(chunks), original)
72 263
@@ -75,7 +266,7 b' class TestDecompressor_stream_reader_fuz'
75 266 @hypothesis.given(
76 267 original=strategies.sampled_from(random_input_data()),
77 268 level=strategies.integers(min_value=1, max_value=5),
78 source_read_size=strategies.integers(1, 16384),
269 source_read_size=strategies.integers(1, 1048576),
79 270 seek_amounts=strategies.data(),
80 271 read_sizes=strategies.data())
81 272 def test_relative_seeks(self, original, level, source_read_size, seek_amounts,
@@ -99,6 +290,46 b' class TestDecompressor_stream_reader_fuz'
99 290
100 291 self.assertEqual(original[offset:offset + len(chunk)], chunk)
101 292
293 @hypothesis.settings(
294 suppress_health_check=[hypothesis.HealthCheck.large_base_example])
295 @hypothesis.given(
296 originals=strategies.data(),
297 frame_count=strategies.integers(min_value=2, max_value=10),
298 level=strategies.integers(min_value=1, max_value=5),
299 source_read_size=strategies.integers(1, 1048576),
300 read_sizes=strategies.data())
301 def test_multiple_frames(self, originals, frame_count, level,
302 source_read_size, read_sizes):
303
304 cctx = zstd.ZstdCompressor(level=level)
305 source = io.BytesIO()
306 buffer = io.BytesIO()
307 writer = cctx.stream_writer(buffer)
308
309 for i in range(frame_count):
310 data = originals.draw(strategies.sampled_from(random_input_data()))
311 source.write(data)
312 writer.write(data)
313 writer.flush(zstd.FLUSH_FRAME)
314
315 dctx = zstd.ZstdDecompressor()
316 buffer.seek(0)
317 reader = dctx.stream_reader(buffer, read_size=source_read_size,
318 read_across_frames=True)
319
320 chunks = []
321
322 while True:
323 read_amount = read_sizes.draw(strategies.integers(-1, 16384))
324 chunk = reader.read(read_amount)
325
326 if not chunk and read_amount:
327 break
328
329 chunks.append(chunk)
330
331 self.assertEqual(source.getvalue(), b''.join(chunks))
332
102 333
103 334 @unittest.skipUnless('ZSTD_SLOW_TESTS' in os.environ, 'ZSTD_SLOW_TESTS not set')
104 335 @make_cffi
@@ -113,7 +344,7 b' class TestDecompressor_stream_writer_fuz'
113 344
114 345 dctx = zstd.ZstdDecompressor()
115 346 source = io.BytesIO(frame)
116 dest = io.BytesIO()
347 dest = NonClosingBytesIO()
117 348
118 349 with dctx.stream_writer(dest, write_size=write_size) as decompressor:
119 350 while True:
@@ -234,10 +465,12 b' class TestDecompressor_multi_decompress_'
234 465 write_checksum=True,
235 466 **kwargs)
236 467
468 if not hasattr(cctx, 'multi_compress_to_buffer'):
469 self.skipTest('multi_compress_to_buffer not available')
470
237 471 frames_buffer = cctx.multi_compress_to_buffer(original, threads=-1)
238 472
239 473 dctx = zstd.ZstdDecompressor(**kwargs)
240
241 474 result = dctx.multi_decompress_to_buffer(frames_buffer)
242 475
243 476 self.assertEqual(len(result), len(original))
@@ -12,9 +12,9 b' from . common import ('
12 12 @make_cffi
13 13 class TestModuleAttributes(unittest.TestCase):
14 14 def test_version(self):
15 self.assertEqual(zstd.ZSTD_VERSION, (1, 3, 6))
15 self.assertEqual(zstd.ZSTD_VERSION, (1, 3, 8))
16 16
17 self.assertEqual(zstd.__version__, '0.10.1')
17 self.assertEqual(zstd.__version__, '0.11.0')
18 18
19 19 def test_constants(self):
20 20 self.assertEqual(zstd.MAX_COMPRESSION_LEVEL, 22)
@@ -29,6 +29,8 b' class TestModuleAttributes(unittest.Test'
29 29 'DECOMPRESSION_RECOMMENDED_INPUT_SIZE',
30 30 'DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE',
31 31 'MAGIC_NUMBER',
32 'FLUSH_BLOCK',
33 'FLUSH_FRAME',
32 34 'BLOCKSIZELOG_MAX',
33 35 'BLOCKSIZE_MAX',
34 36 'WINDOWLOG_MIN',
@@ -38,6 +40,8 b' class TestModuleAttributes(unittest.Test'
38 40 'HASHLOG_MIN',
39 41 'HASHLOG_MAX',
40 42 'HASHLOG3_MAX',
43 'MINMATCH_MIN',
44 'MINMATCH_MAX',
41 45 'SEARCHLOG_MIN',
42 46 'SEARCHLOG_MAX',
43 47 'SEARCHLENGTH_MIN',
@@ -55,6 +59,7 b' class TestModuleAttributes(unittest.Test'
55 59 'STRATEGY_BTLAZY2',
56 60 'STRATEGY_BTOPT',
57 61 'STRATEGY_BTULTRA',
62 'STRATEGY_BTULTRA2',
58 63 'DICT_TYPE_AUTO',
59 64 'DICT_TYPE_RAWCONTENT',
60 65 'DICT_TYPE_FULLDICT',
@@ -35,31 +35,31 b" if _module_policy == 'default':"
35 35 from zstd import *
36 36 backend = 'cext'
37 37 elif platform.python_implementation() in ('PyPy',):
38 from zstd_cffi import *
38 from .cffi import *
39 39 backend = 'cffi'
40 40 else:
41 41 try:
42 42 from zstd import *
43 43 backend = 'cext'
44 44 except ImportError:
45 from zstd_cffi import *
45 from .cffi import *
46 46 backend = 'cffi'
47 47 elif _module_policy == 'cffi_fallback':
48 48 try:
49 49 from zstd import *
50 50 backend = 'cext'
51 51 except ImportError:
52 from zstd_cffi import *
52 from .cffi import *
53 53 backend = 'cffi'
54 54 elif _module_policy == 'cext':
55 55 from zstd import *
56 56 backend = 'cext'
57 57 elif _module_policy == 'cffi':
58 from zstd_cffi import *
58 from .cffi import *
59 59 backend = 'cffi'
60 60 else:
61 61 raise ImportError('unknown module import policy: %s; use default, cffi_fallback, '
62 62 'cext, or cffi' % _module_policy)
63 63
64 64 # Keep this in sync with python-zstandard.h.
65 __version__ = '0.10.1'
65 __version__ = '0.11.0'
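Review note: the ``default`` and ``cffi_fallback`` policies above both try the C extension first and fall back to the CFFI module on ``ImportError``. A generic sketch of that fallback using ``importlib`` is shown below. The ``load_backend`` helper and the module names are assumptions for illustration; the stdlib ``json`` module stands in for the fallback backend so the snippet runs anywhere:

```python
import importlib


def load_backend(candidates):
    """Return (label, module) for the first importable candidate."""
    for label, name in candidates:
        try:
            return label, importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError('no backend available: %s'
                      % ', '.join(name for _, name in candidates))


backend, module = load_backend([
    ('cext', '_hypothetical_missing_cext'),  # assumed absent; forces the fallback
    ('cffi', 'json'),                        # stdlib stand-in for the fallback module
])
```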
This diff has been collapsed as it changes many lines (1203 lines changed).
@@ -28,6 +28,8 b' from __future__ import absolute_import, '
28 28 'train_dictionary',
29 29
30 30 # Constants.
31 'FLUSH_BLOCK',
32 'FLUSH_FRAME',
31 33 'COMPRESSOBJ_FLUSH_FINISH',
32 34 'COMPRESSOBJ_FLUSH_BLOCK',
33 35 'ZSTD_VERSION',
@@ -49,6 +51,8 b' from __future__ import absolute_import, '
49 51 'HASHLOG_MIN',
50 52 'HASHLOG_MAX',
51 53 'HASHLOG3_MAX',
54 'MINMATCH_MIN',
55 'MINMATCH_MAX',
52 56 'SEARCHLOG_MIN',
53 57 'SEARCHLOG_MAX',
54 58 'SEARCHLENGTH_MIN',
@@ -66,6 +70,7 b' from __future__ import absolute_import, '
66 70 'STRATEGY_BTLAZY2',
67 71 'STRATEGY_BTOPT',
68 72 'STRATEGY_BTULTRA',
73 'STRATEGY_BTULTRA2',
69 74 'DICT_TYPE_AUTO',
70 75 'DICT_TYPE_RAWCONTENT',
71 76 'DICT_TYPE_FULLDICT',
@@ -114,10 +119,12 b' CHAINLOG_MAX = lib.ZSTD_CHAINLOG_MAX'
114 119 HASHLOG_MIN = lib.ZSTD_HASHLOG_MIN
115 120 HASHLOG_MAX = lib.ZSTD_HASHLOG_MAX
116 121 HASHLOG3_MAX = lib.ZSTD_HASHLOG3_MAX
122 MINMATCH_MIN = lib.ZSTD_MINMATCH_MIN
123 MINMATCH_MAX = lib.ZSTD_MINMATCH_MAX
117 124 SEARCHLOG_MIN = lib.ZSTD_SEARCHLOG_MIN
118 125 SEARCHLOG_MAX = lib.ZSTD_SEARCHLOG_MAX
119 SEARCHLENGTH_MIN = lib.ZSTD_SEARCHLENGTH_MIN
120 SEARCHLENGTH_MAX = lib.ZSTD_SEARCHLENGTH_MAX
126 SEARCHLENGTH_MIN = lib.ZSTD_MINMATCH_MIN
127 SEARCHLENGTH_MAX = lib.ZSTD_MINMATCH_MAX
121 128 TARGETLENGTH_MIN = lib.ZSTD_TARGETLENGTH_MIN
122 129 TARGETLENGTH_MAX = lib.ZSTD_TARGETLENGTH_MAX
123 130 LDM_MINMATCH_MIN = lib.ZSTD_LDM_MINMATCH_MIN
@@ -132,6 +139,7 b' STRATEGY_LAZY2 = lib.ZSTD_lazy2'
132 139 STRATEGY_BTLAZY2 = lib.ZSTD_btlazy2
133 140 STRATEGY_BTOPT = lib.ZSTD_btopt
134 141 STRATEGY_BTULTRA = lib.ZSTD_btultra
142 STRATEGY_BTULTRA2 = lib.ZSTD_btultra2
135 143
136 144 DICT_TYPE_AUTO = lib.ZSTD_dct_auto
137 145 DICT_TYPE_RAWCONTENT = lib.ZSTD_dct_rawContent
@@ -140,6 +148,9 b' DICT_TYPE_FULLDICT = lib.ZSTD_dct_fullDi'
140 148 FORMAT_ZSTD1 = lib.ZSTD_f_zstd1
141 149 FORMAT_ZSTD1_MAGICLESS = lib.ZSTD_f_zstd1_magicless
142 150
151 FLUSH_BLOCK = 0
152 FLUSH_FRAME = 1
153
143 154 COMPRESSOBJ_FLUSH_FINISH = 0
144 155 COMPRESSOBJ_FLUSH_BLOCK = 1
145 156
@@ -182,27 +193,27 b' def _make_cctx_params(params):'
182 193 res = ffi.gc(res, lib.ZSTD_freeCCtxParams)
183 194
184 195 attrs = [
185 (lib.ZSTD_p_format, params.format),
186 (lib.ZSTD_p_compressionLevel, params.compression_level),
187 (lib.ZSTD_p_windowLog, params.window_log),
188 (lib.ZSTD_p_hashLog, params.hash_log),
189 (lib.ZSTD_p_chainLog, params.chain_log),
190 (lib.ZSTD_p_searchLog, params.search_log),
191 (lib.ZSTD_p_minMatch, params.min_match),
192 (lib.ZSTD_p_targetLength, params.target_length),
193 (lib.ZSTD_p_compressionStrategy, params.compression_strategy),
194 (lib.ZSTD_p_contentSizeFlag, params.write_content_size),
195 (lib.ZSTD_p_checksumFlag, params.write_checksum),
196 (lib.ZSTD_p_dictIDFlag, params.write_dict_id),
197 (lib.ZSTD_p_nbWorkers, params.threads),
198 (lib.ZSTD_p_jobSize, params.job_size),
199 (lib.ZSTD_p_overlapSizeLog, params.overlap_size_log),
200 (lib.ZSTD_p_forceMaxWindow, params.force_max_window),
201 (lib.ZSTD_p_enableLongDistanceMatching, params.enable_ldm),
202 (lib.ZSTD_p_ldmHashLog, params.ldm_hash_log),
203 (lib.ZSTD_p_ldmMinMatch, params.ldm_min_match),
204 (lib.ZSTD_p_ldmBucketSizeLog, params.ldm_bucket_size_log),
205 (lib.ZSTD_p_ldmHashEveryLog, params.ldm_hash_every_log),
196 (lib.ZSTD_c_format, params.format),
197 (lib.ZSTD_c_compressionLevel, params.compression_level),
198 (lib.ZSTD_c_windowLog, params.window_log),
199 (lib.ZSTD_c_hashLog, params.hash_log),
200 (lib.ZSTD_c_chainLog, params.chain_log),
201 (lib.ZSTD_c_searchLog, params.search_log),
202 (lib.ZSTD_c_minMatch, params.min_match),
203 (lib.ZSTD_c_targetLength, params.target_length),
204 (lib.ZSTD_c_strategy, params.compression_strategy),
205 (lib.ZSTD_c_contentSizeFlag, params.write_content_size),
206 (lib.ZSTD_c_checksumFlag, params.write_checksum),
207 (lib.ZSTD_c_dictIDFlag, params.write_dict_id),
208 (lib.ZSTD_c_nbWorkers, params.threads),
209 (lib.ZSTD_c_jobSize, params.job_size),
210 (lib.ZSTD_c_overlapLog, params.overlap_log),
211 (lib.ZSTD_c_forceMaxWindow, params.force_max_window),
212 (lib.ZSTD_c_enableLongDistanceMatching, params.enable_ldm),
213 (lib.ZSTD_c_ldmHashLog, params.ldm_hash_log),
214 (lib.ZSTD_c_ldmMinMatch, params.ldm_min_match),
215 (lib.ZSTD_c_ldmBucketSizeLog, params.ldm_bucket_size_log),
216 (lib.ZSTD_c_ldmHashRateLog, params.ldm_hash_rate_log),
206 217 ]
207 218
208 219 for param, value in attrs:
@@ -220,7 +231,7 b' class ZstdCompressionParameters(object):'
220 231 'chain_log': 'chainLog',
221 232 'hash_log': 'hashLog',
222 233 'search_log': 'searchLog',
223 'min_match': 'searchLength',
234 'min_match': 'minMatch',
224 235 'target_length': 'targetLength',
225 236 'compression_strategy': 'strategy',
226 237 }
@@ -233,41 +244,170 b' class ZstdCompressionParameters(object):'
233 244
234 245 def __init__(self, format=0, compression_level=0, window_log=0, hash_log=0,
235 246 chain_log=0, search_log=0, min_match=0, target_length=0,
236 compression_strategy=0, write_content_size=1, write_checksum=0,
237 write_dict_id=0, job_size=0, overlap_size_log=0,
238 force_max_window=0, enable_ldm=0, ldm_hash_log=0,
239 ldm_min_match=0, ldm_bucket_size_log=0, ldm_hash_every_log=0,
240 threads=0):
247 strategy=-1, compression_strategy=-1,
248 write_content_size=1, write_checksum=0,
249 write_dict_id=0, job_size=0, overlap_log=-1,
250 overlap_size_log=-1, force_max_window=0, enable_ldm=0,
251 ldm_hash_log=0, ldm_min_match=0, ldm_bucket_size_log=0,
252 ldm_hash_rate_log=-1, ldm_hash_every_log=-1, threads=0):
253
254 params = lib.ZSTD_createCCtxParams()
255 if params == ffi.NULL:
256 raise MemoryError()
257
258 params = ffi.gc(params, lib.ZSTD_freeCCtxParams)
259
260 self._params = params
241 261
242 262 if threads < 0:
243 263 threads = _cpu_count()
244 264
245 self.format = format
246 self.compression_level = compression_level
247 self.window_log = window_log
248 self.hash_log = hash_log
249 self.chain_log = chain_log
250 self.search_log = search_log
251 self.min_match = min_match
252 self.target_length = target_length
253 self.compression_strategy = compression_strategy
254 self.write_content_size = write_content_size
255 self.write_checksum = write_checksum
256 self.write_dict_id = write_dict_id
257 self.job_size = job_size
258 self.overlap_size_log = overlap_size_log
259 self.force_max_window = force_max_window
260 self.enable_ldm = enable_ldm
261 self.ldm_hash_log = ldm_hash_log
262 self.ldm_min_match = ldm_min_match
263 self.ldm_bucket_size_log = ldm_bucket_size_log
264 self.ldm_hash_every_log = ldm_hash_every_log
265 self.threads = threads
266
267 self.params = _make_cctx_params(self)
265 # We need to set ZSTD_c_nbWorkers before ZSTD_c_jobSize and ZSTD_c_overlapLog
266 # because setting ZSTD_c_nbWorkers resets the other parameters.
267 _set_compression_parameter(params, lib.ZSTD_c_nbWorkers, threads)
268
269 _set_compression_parameter(params, lib.ZSTD_c_format, format)
270 _set_compression_parameter(params, lib.ZSTD_c_compressionLevel, compression_level)
271 _set_compression_parameter(params, lib.ZSTD_c_windowLog, window_log)
272 _set_compression_parameter(params, lib.ZSTD_c_hashLog, hash_log)
273 _set_compression_parameter(params, lib.ZSTD_c_chainLog, chain_log)
274 _set_compression_parameter(params, lib.ZSTD_c_searchLog, search_log)
275 _set_compression_parameter(params, lib.ZSTD_c_minMatch, min_match)
276 _set_compression_parameter(params, lib.ZSTD_c_targetLength, target_length)
277
278 if strategy != -1 and compression_strategy != -1:
279 raise ValueError('cannot specify both compression_strategy and strategy')
280
281 if compression_strategy != -1:
282 strategy = compression_strategy
283 elif strategy == -1:
284 strategy = 0
285
286 _set_compression_parameter(params, lib.ZSTD_c_strategy, strategy)
287 _set_compression_parameter(params, lib.ZSTD_c_contentSizeFlag, write_content_size)
288 _set_compression_parameter(params, lib.ZSTD_c_checksumFlag, write_checksum)
289 _set_compression_parameter(params, lib.ZSTD_c_dictIDFlag, write_dict_id)
290 _set_compression_parameter(params, lib.ZSTD_c_jobSize, job_size)
291
292 if overlap_log != -1 and overlap_size_log != -1:
293 raise ValueError('cannot specify both overlap_log and overlap_size_log')
294
295 if overlap_size_log != -1:
296 overlap_log = overlap_size_log
297 elif overlap_log == -1:
298 overlap_log = 0
299
300 _set_compression_parameter(params, lib.ZSTD_c_overlapLog, overlap_log)
301 _set_compression_parameter(params, lib.ZSTD_c_forceMaxWindow, force_max_window)
302 _set_compression_parameter(params, lib.ZSTD_c_enableLongDistanceMatching, enable_ldm)
303 _set_compression_parameter(params, lib.ZSTD_c_ldmHashLog, ldm_hash_log)
304 _set_compression_parameter(params, lib.ZSTD_c_ldmMinMatch, ldm_min_match)
305 _set_compression_parameter(params, lib.ZSTD_c_ldmBucketSizeLog, ldm_bucket_size_log)
306
307 if ldm_hash_rate_log != -1 and ldm_hash_every_log != -1:
308 raise ValueError('cannot specify both ldm_hash_rate_log and ldm_hash_every_log')
309
310 if ldm_hash_every_log != -1:
311 ldm_hash_rate_log = ldm_hash_every_log
312 elif ldm_hash_rate_log == -1:
313 ldm_hash_rate_log = 0
314
315 _set_compression_parameter(params, lib.ZSTD_c_ldmHashRateLog, ldm_hash_rate_log)
316
317 @property
318 def format(self):
319 return _get_compression_parameter(self._params, lib.ZSTD_c_format)
320
321 @property
322 def compression_level(self):
323 return _get_compression_parameter(self._params, lib.ZSTD_c_compressionLevel)
324
325 @property
326 def window_log(self):
327 return _get_compression_parameter(self._params, lib.ZSTD_c_windowLog)
328
329 @property
330 def hash_log(self):
331 return _get_compression_parameter(self._params, lib.ZSTD_c_hashLog)
332
333 @property
334 def chain_log(self):
335 return _get_compression_parameter(self._params, lib.ZSTD_c_chainLog)
336
337 @property
338 def search_log(self):
339 return _get_compression_parameter(self._params, lib.ZSTD_c_searchLog)
340
341 @property
342 def min_match(self):
343 return _get_compression_parameter(self._params, lib.ZSTD_c_minMatch)
344
345 @property
346 def target_length(self):
347 return _get_compression_parameter(self._params, lib.ZSTD_c_targetLength)
348
349 @property
350 def compression_strategy(self):
351 return _get_compression_parameter(self._params, lib.ZSTD_c_strategy)
352
353 @property
354 def write_content_size(self):
355 return _get_compression_parameter(self._params, lib.ZSTD_c_contentSizeFlag)
356
357 @property
358 def write_checksum(self):
359 return _get_compression_parameter(self._params, lib.ZSTD_c_checksumFlag)
360
361 @property
362 def write_dict_id(self):
363 return _get_compression_parameter(self._params, lib.ZSTD_c_dictIDFlag)
364
365 @property
366 def job_size(self):
367 return _get_compression_parameter(self._params, lib.ZSTD_c_jobSize)
368
369 @property
370 def overlap_log(self):
371 return _get_compression_parameter(self._params, lib.ZSTD_c_overlapLog)
372
373 @property
374 def overlap_size_log(self):
375 return self.overlap_log
376
377 @property
378 def force_max_window(self):
379 return _get_compression_parameter(self._params, lib.ZSTD_c_forceMaxWindow)
380
381 @property
382 def enable_ldm(self):
383 return _get_compression_parameter(self._params, lib.ZSTD_c_enableLongDistanceMatching)
384
385 @property
386 def ldm_hash_log(self):
387 return _get_compression_parameter(self._params, lib.ZSTD_c_ldmHashLog)
388
389 @property
390 def ldm_min_match(self):
391 return _get_compression_parameter(self._params, lib.ZSTD_c_ldmMinMatch)
392
393 @property
394 def ldm_bucket_size_log(self):
395 return _get_compression_parameter(self._params, lib.ZSTD_c_ldmBucketSizeLog)
396
397 @property
398 def ldm_hash_rate_log(self):
399 return _get_compression_parameter(self._params, lib.ZSTD_c_ldmHashRateLog)
400
401 @property
402 def ldm_hash_every_log(self):
403 return self.ldm_hash_rate_log
404
405 @property
406 def threads(self):
407 return _get_compression_parameter(self._params, lib.ZSTD_c_nbWorkers)
268 408
269 409 def estimated_compression_context_size(self):
270 return lib.ZSTD_estimateCCtxSize_usingCCtxParams(self.params)
410 return lib.ZSTD_estimateCCtxSize_usingCCtxParams(self._params)
271 411
272 412 CompressionParameters = ZstdCompressionParameters
273 413
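Review note: the new ``__init__`` resolves three renamed parameters (``strategy``/``compression_strategy``, ``overlap_log``/``overlap_size_log``, ``ldm_hash_rate_log``/``ldm_hash_every_log``) with the same steps each time: -1 means "not given", passing both names is an error, the older name wins if it was given, and the default is 0. That logic can be sketched as a single helper. ``resolve_alias`` is a hypothetical name and does not appear in this diff:

```python
def resolve_alias(new_value, old_value, new_name, old_name, default=0):
    """Pick between a parameter and its older alias; -1 means 'not given'."""
    if new_value != -1 and old_value != -1:
        raise ValueError('cannot specify both %s and %s' % (new_name, old_name))
    if old_value != -1:
        return old_value
    if new_value == -1:
        return default
    return new_value


# Only the older spelling was supplied, so its value is used.
overlap_log = resolve_alias(-1, 6, 'overlap_log', 'overlap_size_log')
```

Using -1 as the sentinel keeps 0 available as a meaningful explicit value, which a default of 0 for both names would not allow.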
@@ -276,31 +416,53 b' def estimate_decompression_context_size('
276 416
277 417
278 418 def _set_compression_parameter(params, param, value):
279 zresult = lib.ZSTD_CCtxParam_setParameter(params, param,
280 ffi.cast('unsigned', value))
419 zresult = lib.ZSTD_CCtxParam_setParameter(params, param, value)
281 420 if lib.ZSTD_isError(zresult):
282 421 raise ZstdError('unable to set compression context parameter: %s' %
283 422 _zstd_error(zresult))
284 423
424
425 def _get_compression_parameter(params, param):
426 result = ffi.new('int *')
427
428 zresult = lib.ZSTD_CCtxParam_getParameter(params, param, result)
429 if lib.ZSTD_isError(zresult):
430 raise ZstdError('unable to get compression context parameter: %s' %
431 _zstd_error(zresult))
432
433 return result[0]
434
435
285 436 class ZstdCompressionWriter(object):
286 def __init__(self, compressor, writer, source_size, write_size):
437 def __init__(self, compressor, writer, source_size, write_size,
438 write_return_read):
287 439 self._compressor = compressor
288 440 self._writer = writer
289 self._source_size = source_size
290 441 self._write_size = write_size
442 self._write_return_read = bool(write_return_read)
291 443 self._entered = False
444 self._closed = False
292 445 self._bytes_compressed = 0
293 446
294 def __enter__(self):
295 if self._entered:
296 raise ZstdError('cannot __enter__ multiple times')
297
298 zresult = lib.ZSTD_CCtx_setPledgedSrcSize(self._compressor._cctx,
299 self._source_size)
447 self._dst_buffer = ffi.new('char[]', write_size)
448 self._out_buffer = ffi.new('ZSTD_outBuffer *')
449 self._out_buffer.dst = self._dst_buffer
450 self._out_buffer.size = len(self._dst_buffer)
451 self._out_buffer.pos = 0
452
453 zresult = lib.ZSTD_CCtx_setPledgedSrcSize(compressor._cctx,
454 source_size)
300 455 if lib.ZSTD_isError(zresult):
301 456 raise ZstdError('error setting source size: %s' %
302 457 _zstd_error(zresult))
303 458
459 def __enter__(self):
460 if self._closed:
461 raise ValueError('stream is closed')
462
463 if self._entered:
464 raise ZstdError('cannot __enter__ multiple times')
465
304 466 self._entered = True
305 467 return self
306 468
@@ -308,50 +470,79 b' class ZstdCompressionWriter(object):'
308 470 self._entered = False
309 471
310 472 if not exc_type and not exc_value and not exc_tb:
311 dst_buffer = ffi.new('char[]', self._write_size)
312
313 out_buffer = ffi.new('ZSTD_outBuffer *')
314 in_buffer = ffi.new('ZSTD_inBuffer *')
315
316 out_buffer.dst = dst_buffer
317 out_buffer.size = len(dst_buffer)
318 out_buffer.pos = 0
319
320 in_buffer.src = ffi.NULL
321 in_buffer.size = 0
322 in_buffer.pos = 0
323
324 while True:
325 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
326 out_buffer, in_buffer,
327 lib.ZSTD_e_end)
328
329 if lib.ZSTD_isError(zresult):
330 raise ZstdError('error ending compression stream: %s' %
331 _zstd_error(zresult))
332
333 if out_buffer.pos:
334 self._writer.write(ffi.buffer(out_buffer.dst, out_buffer.pos)[:])
335 out_buffer.pos = 0
336
337 if zresult == 0:
338 break
473 self.close()
339 474
340 475 self._compressor = None
341 476
342 477 return False
343 478
344 479 def memory_size(self):
345 if not self._entered:
346 raise ZstdError('cannot determine size of an inactive compressor; '
347 'call when a context manager is active')
348
349 480 return lib.ZSTD_sizeof_CCtx(self._compressor._cctx)
350 481
482 def fileno(self):
483 f = getattr(self._writer, 'fileno', None)
484 if f:
485 return f()
486 else:
487 raise OSError('fileno not available on underlying writer')
488
489 def close(self):
490 if self._closed:
491 return
492
493 try:
494 self.flush(FLUSH_FRAME)
495 finally:
496 self._closed = True
497
498 # Call close() on underlying stream as well.
499 f = getattr(self._writer, 'close', None)
500 if f:
501 f()
502
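Review note: the new ``close()`` is idempotent and marks the stream closed even if the final flush raises. As written, though, an exception from the flush propagates before the underlying stream's ``close()`` is reached. The sketch below reproduces only that control flow so the behaviour can be checked in isolation. ``ClosingWrapper`` is a hypothetical stand-in, not part of the patch, and its ``flush()`` simply forwards to the inner stream instead of ending a zstd frame.

```python
import io


class ClosingWrapper:
    """Hypothetical stand-in mirroring the close() control flow in the diff."""

    def __init__(self, writer):
        self._writer = writer
        self._closed = False

    def flush(self):
        # Assumption: forward to the inner stream; the real method would
        # flush compressed data with FLUSH_FRAME instead.
        f = getattr(self._writer, 'flush', None)
        if f:
            f()

    def close(self):
        if self._closed:
            return

        try:
            self.flush()
        finally:
            # Runs even when flush() raises, so the wrapper never stays open.
            self._closed = True

        # Only reached when flush() succeeded.
        f = getattr(self._writer, 'close', None)
        if f:
            f()

    @property
    def closed(self):
        return self._closed


inner = io.BytesIO()
wrapper = ClosingWrapper(inner)
wrapper.close()
wrapper.close()  # second call is a no-op
```

If the inner stream should always be closed, one option would be to move the inner ``close()`` call into the ``finally`` block. Whether that is desirable here is a design decision for the author.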
503 @property
504 def closed(self):
505 return self._closed
506
507 def isatty(self):
508 return False
509
510 def readable(self):
511 return False
512
513 def readline(self, size=-1):
514 raise io.UnsupportedOperation()
515
516 def readlines(self, hint=-1):
517 raise io.UnsupportedOperation()
518
519 def seek(self, offset, whence=None):
520 raise io.UnsupportedOperation()
521
522 def seekable(self):
523 return False
524
525 def truncate(self, size=None):
526 raise io.UnsupportedOperation()
527
528 def writable(self):
529 return True
530
531 def writelines(self, lines):
532 raise NotImplementedError('writelines() is not yet implemented')
533
534 def read(self, size=-1):
535 raise io.UnsupportedOperation()
536
537 def readall(self):
538 raise io.UnsupportedOperation()
539
540 def readinto(self, b):
541 raise io.UnsupportedOperation()
542
351 543 def write(self, data):
352 if not self._entered:
353 raise ZstdError('write() must be called from an active context '
354 'manager')
544 if self._closed:
545 raise ValueError('stream is closed')
355 546
356 547 total_write = 0
357 548
@@ -362,16 +553,13 b' class ZstdCompressionWriter(object):'
362 553 in_buffer.size = len(data_buffer)
363 554 in_buffer.pos = 0
364 555
365 out_buffer = ffi.new('ZSTD_outBuffer *')
366 dst_buffer = ffi.new('char[]', self._write_size)
367 out_buffer.dst = dst_buffer
368 out_buffer.size = self._write_size
556 out_buffer = self._out_buffer
369 557 out_buffer.pos = 0
370 558
371 559 while in_buffer.pos < in_buffer.size:
372 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
373 out_buffer, in_buffer,
374 lib.ZSTD_e_continue)
560 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
561 out_buffer, in_buffer,
562 lib.ZSTD_e_continue)
375 563 if lib.ZSTD_isError(zresult):
376 564 raise ZstdError('zstd compress error: %s' %
377 565 _zstd_error(zresult))
@@ -382,18 +570,25 b' class ZstdCompressionWriter(object):'
382 570 self._bytes_compressed += out_buffer.pos
383 571 out_buffer.pos = 0
384 572
385 return total_write
386
387 def flush(self):
388 if not self._entered:
389 raise ZstdError('flush must be called from an active context manager')
573 if self._write_return_read:
574 return in_buffer.pos
575 else:
576 return total_write
577
578 def flush(self, flush_mode=FLUSH_BLOCK):
579 if flush_mode == FLUSH_BLOCK:
580 flush = lib.ZSTD_e_flush
581 elif flush_mode == FLUSH_FRAME:
582 flush = lib.ZSTD_e_end
583 else:
584 raise ValueError('unknown flush_mode: %r' % flush_mode)
585
586 if self._closed:
587 raise ValueError('stream is closed')
390 588
391 589 total_write = 0
392 590
393 out_buffer = ffi.new('ZSTD_outBuffer *')
394 dst_buffer = ffi.new('char[]', self._write_size)
395 out_buffer.dst = dst_buffer
396 out_buffer.size = self._write_size
591 out_buffer = self._out_buffer
397 592 out_buffer.pos = 0
398 593
399 594 in_buffer = ffi.new('ZSTD_inBuffer *')
@@ -402,9 +597,9 b' class ZstdCompressionWriter(object):'
402 597 in_buffer.pos = 0
403 598
404 599 while True:
405 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
406 out_buffer, in_buffer,
407 lib.ZSTD_e_flush)
600 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
601 out_buffer, in_buffer,
602 flush)
408 603 if lib.ZSTD_isError(zresult):
409 604 raise ZstdError('zstd compress error: %s' %
410 605 _zstd_error(zresult))
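Review note: two behaviours change in this hunk. ``write()`` can now return the number of input bytes consumed (``write_return_read=True``) instead of the number of compressed bytes written to the inner stream, and ``flush()`` accepts a mode that either flushes a block or ends the frame. The sketch below illustrates both ideas with ``zlib`` from the standard library as a stand-in, so it runs without the zstd bindings. ``SketchCompressionWriter`` and the constant values are assumptions made for illustration only. Unlike zstd, a zlib compressor cannot be reused after ``Z_FINISH``.

```python
import io
import zlib

# Hypothetical constants that mirror the names used in the diff.
FLUSH_BLOCK = 0
FLUSH_FRAME = 1


class SketchCompressionWriter:
    """zlib-based stand-in for a compressing writer (illustration only)."""

    def __init__(self, writer, write_return_read=False):
        self._writer = writer
        self._compressor = zlib.compressobj()
        self._write_return_read = bool(write_return_read)
        self._closed = False

    def write(self, data):
        if self._closed:
            raise ValueError('stream is closed')

        chunk = self._compressor.compress(data)
        if chunk:
            self._writer.write(chunk)

        # Report either the input consumed (the usual io convention) or
        # the compressed bytes handed to the inner writer.
        return len(data) if self._write_return_read else len(chunk)

    def flush(self, flush_mode=FLUSH_BLOCK):
        # Validate the mode first, as the diff does.
        if flush_mode == FLUSH_BLOCK:
            mode = zlib.Z_SYNC_FLUSH  # emit pending data, keep the stream open
        elif flush_mode == FLUSH_FRAME:
            mode = zlib.Z_FINISH      # terminate the compressed stream
        else:
            raise ValueError('unknown flush_mode: %r' % flush_mode)

        if self._closed:
            raise ValueError('stream is closed')

        chunk = self._compressor.flush(mode)
        if chunk:
            self._writer.write(chunk)
        return len(chunk)


dest = io.BytesIO()
writer = SketchCompressionWriter(dest, write_return_read=True)
consumed = writer.write(b'hello' * 100)
writer.flush(FLUSH_FRAME)
```

Returning the consumed input length matches what callers such as ``io.BufferedWriter`` expect from a raw stream. That appears to be the motivation for the new flag, though the patch itself does not say so.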
@@ -438,10 +633,10 b' class ZstdCompressionObj(object):'
438 633 chunks = []
439 634
440 635 while source.pos < len(data):
441 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
442 self._out,
443 source,
444 lib.ZSTD_e_continue)
636 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
637 self._out,
638 source,
639 lib.ZSTD_e_continue)
445 640 if lib.ZSTD_isError(zresult):
446 641 raise ZstdError('zstd compress error: %s' %
447 642 _zstd_error(zresult))
@@ -477,10 +672,10 b' class ZstdCompressionObj(object):'
477 672 chunks = []
478 673
479 674 while True:
480 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
481 self._out,
482 in_buffer,
483 z_flush_mode)
675 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
676 self._out,
677 in_buffer,
678 z_flush_mode)
484 679 if lib.ZSTD_isError(zresult):
485 680 raise ZstdError('error ending compression stream: %s' %
486 681 _zstd_error(zresult))
@@ -528,10 +723,10 b' class ZstdCompressionChunker(object):'
528 723 self._in.pos = 0
529 724
530 725 while self._in.pos < self._in.size:
531 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
532 self._out,
533 self._in,
534 lib.ZSTD_e_continue)
726 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
727 self._out,
728 self._in,
729 lib.ZSTD_e_continue)
535 730
536 731 if self._in.pos == self._in.size:
537 732 self._in.src = ffi.NULL
@@ -555,9 +750,9 b' class ZstdCompressionChunker(object):'
555 750 'previous operation')
556 751
557 752 while True:
558 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
559 self._out, self._in,
560 lib.ZSTD_e_flush)
753 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
754 self._out, self._in,
755 lib.ZSTD_e_flush)
561 756 if lib.ZSTD_isError(zresult):
562 757 raise ZstdError('zstd compress error: %s' % _zstd_error(zresult))
563 758
@@ -577,9 +772,9 b' class ZstdCompressionChunker(object):'
577 772 'previous operation')
578 773
579 774 while True:
580 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
581 self._out, self._in,
582 lib.ZSTD_e_end)
775 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
776 self._out, self._in,
777 lib.ZSTD_e_end)
583 778 if lib.ZSTD_isError(zresult):
584 779 raise ZstdError('zstd compress error: %s' % _zstd_error(zresult))
585 780
@@ -592,7 +787,7 b' class ZstdCompressionChunker(object):'
592 787 return
593 788
594 789
595 class CompressionReader(object):
790 class ZstdCompressionReader(object):
596 791 def __init__(self, compressor, source, read_size):
597 792 self._compressor = compressor
598 793 self._source = source
@@ -661,7 +856,16 b' class CompressionReader(object):'
661 856 return self._bytes_compressed
662 857
663 858 def readall(self):
664 raise NotImplementedError()
859 chunks = []
860
861 while True:
862 chunk = self.read(1048576)
863 if not chunk:
864 break
865
866 chunks.append(chunk)
867
868 return b''.join(chunks)
665 869
666 870 def __iter__(self):
667 871 raise io.UnsupportedOperation()
@@ -671,16 +875,67 b' class CompressionReader(object):'
671 875
672 876 next = __next__
673 877
878 def _read_input(self):
879 if self._finished_input:
880 return
881
882 if hasattr(self._source, 'read'):
883 data = self._source.read(self._read_size)
884
885 if not data:
886 self._finished_input = True
887 return
888
889 self._source_buffer = ffi.from_buffer(data)
890 self._in_buffer.src = self._source_buffer
891 self._in_buffer.size = len(self._source_buffer)
892 self._in_buffer.pos = 0
893 else:
894 self._source_buffer = ffi.from_buffer(self._source)
895 self._in_buffer.src = self._source_buffer
896 self._in_buffer.size = len(self._source_buffer)
897 self._in_buffer.pos = 0
898
899 def _compress_into_buffer(self, out_buffer):
900 if self._in_buffer.pos >= self._in_buffer.size:
901 return
902
903 old_pos = out_buffer.pos
904
905 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
906 out_buffer, self._in_buffer,
907 lib.ZSTD_e_continue)
908
909 self._bytes_compressed += out_buffer.pos - old_pos
910
911 if self._in_buffer.pos == self._in_buffer.size:
912 self._in_buffer.src = ffi.NULL
913 self._in_buffer.pos = 0
914 self._in_buffer.size = 0
915 self._source_buffer = None
916
917 if not hasattr(self._source, 'read'):
918 self._finished_input = True
919
920 if lib.ZSTD_isError(zresult):
921 raise ZstdError('zstd compress error: %s' %
922 _zstd_error(zresult))
923
924 return out_buffer.pos and out_buffer.pos == out_buffer.size
925
674 926 def read(self, size=-1):
675 927 if self._closed:
676 928 raise ValueError('stream is closed')
677 929
678 if self._finished_output:
930 if size < -1:
931 raise ValueError('cannot read negative amounts less than -1')
932
933 if size == -1:
934 return self.readall()
935
936 if self._finished_output or size == 0:
679 937 return b''
680 938
681 if size < 1:
682 raise ValueError('cannot read negative or size 0 amounts')
683
684 939 # Need a dedicated ref to dest buffer otherwise it gets collected.
685 940 dst_buffer = ffi.new('char[]', size)
686 941 out_buffer = ffi.new('ZSTD_outBuffer *')
@@ -688,71 +943,21 b' class CompressionReader(object):'
688 943 out_buffer.size = size
689 944 out_buffer.pos = 0
690 945
691 def compress_input():
692 if self._in_buffer.pos >= self._in_buffer.size:
693 return
694
695 old_pos = out_buffer.pos
696
697 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
698 out_buffer, self._in_buffer,
699 lib.ZSTD_e_continue)
700
701 self._bytes_compressed += out_buffer.pos - old_pos
702
703 if self._in_buffer.pos == self._in_buffer.size:
704 self._in_buffer.src = ffi.NULL
705 self._in_buffer.pos = 0
706 self._in_buffer.size = 0
707 self._source_buffer = None
708
709 if not hasattr(self._source, 'read'):
710 self._finished_input = True
711
712 if lib.ZSTD_isError(zresult):
713 raise ZstdError('zstd compress error: %s',
714 _zstd_error(zresult))
715
716 if out_buffer.pos and out_buffer.pos == out_buffer.size:
717 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
718
719 def get_input():
720 if self._finished_input:
721 return
722
723 if hasattr(self._source, 'read'):
724 data = self._source.read(self._read_size)
725
726 if not data:
727 self._finished_input = True
728 return
729
730 self._source_buffer = ffi.from_buffer(data)
731 self._in_buffer.src = self._source_buffer
732 self._in_buffer.size = len(self._source_buffer)
733 self._in_buffer.pos = 0
734 else:
735 self._source_buffer = ffi.from_buffer(self._source)
736 self._in_buffer.src = self._source_buffer
737 self._in_buffer.size = len(self._source_buffer)
738 self._in_buffer.pos = 0
739
740 result = compress_input()
741 if result:
742 return result
946 if self._compress_into_buffer(out_buffer):
947 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
743 948
744 949 while not self._finished_input:
745 get_input()
746 result = compress_input()
747 if result:
748 return result
950 self._read_input()
951
952 if self._compress_into_buffer(out_buffer):
953 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
749 954
750 955 # EOF
751 956 old_pos = out_buffer.pos
752 957
753 zresult = lib.ZSTD_compress_generic(self._compressor._cctx,
754 out_buffer, self._in_buffer,
755 lib.ZSTD_e_end)
958 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
959 out_buffer, self._in_buffer,
960 lib.ZSTD_e_end)
756 961
757 962 self._bytes_compressed += out_buffer.pos - old_pos
758 963
@@ -765,6 +970,159 b' class CompressionReader(object):'
765 970
766 971 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
767 972
973 def read1(self, size=-1):
974 if self._closed:
975 raise ValueError('stream is closed')
976
977 if size < -1:
978 raise ValueError('cannot read negative amounts less than -1')
979
980 if self._finished_output or size == 0:
981 return b''
982
983 # -1 returns arbitrary number of bytes.
984 if size == -1:
985 size = COMPRESSION_RECOMMENDED_OUTPUT_SIZE
986
987 dst_buffer = ffi.new('char[]', size)
988 out_buffer = ffi.new('ZSTD_outBuffer *')
989 out_buffer.dst = dst_buffer
990 out_buffer.size = size
991 out_buffer.pos = 0
992
993 # read1() dictates that we can perform at most 1 call to the
994 # underlying stream to get input. However, we can't satisfy this
995 # restriction with compression because not all input generates output.
996 # It is possible to perform a block flush in order to ensure output.
997 # But this may not be desirable behavior. So we allow multiple read()
998 # to the underlying stream. But unlike read(), we stop once we have
999 # any output.
1000
1001 self._compress_into_buffer(out_buffer)
1002 if out_buffer.pos:
1003 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
1004
1005 while not self._finished_input:
1006 self._read_input()
1007
1008 # If we've filled the output buffer, return immediately.
1009 if self._compress_into_buffer(out_buffer):
1010 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
1011
1012 # If we've populated the output buffer and we're not at EOF,
1013 # also return, as we've satisfied the read1() limits.
1014 if out_buffer.pos and not self._finished_input:
1015 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
1016
1017 # Else if we're at EOS and we have room left in the buffer,
1018 # fall through to below and try to add more data to the output.
1019
1020 # EOF.
1021 old_pos = out_buffer.pos
1022
1023 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
1024 out_buffer, self._in_buffer,
1025 lib.ZSTD_e_end)
1026
1027 self._bytes_compressed += out_buffer.pos - old_pos
1028
1029 if lib.ZSTD_isError(zresult):
1030 raise ZstdError('error ending compression stream: %s' %
1031 _zstd_error(zresult))
1032
1033 if zresult == 0:
1034 self._finished_output = True
1035
1036 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
1037
1038 def readinto(self, b):
1039 if self._closed:
1040 raise ValueError('stream is closed')
1041
1042 if self._finished_output:
1043 return 0
1044
1045 # TODO use writable=True once we require CFFI >= 1.12.
1046 dest_buffer = ffi.from_buffer(b)
1047 ffi.memmove(b, b'', 0)
1048 out_buffer = ffi.new('ZSTD_outBuffer *')
1049 out_buffer.dst = dest_buffer
1050 out_buffer.size = len(dest_buffer)
1051 out_buffer.pos = 0
1052
1053 if self._compress_into_buffer(out_buffer):
1054 return out_buffer.pos
1055
1056 while not self._finished_input:
1057 self._read_input()
1058 if self._compress_into_buffer(out_buffer):
1059 return out_buffer.pos
1060
1061 # EOF.
1062 old_pos = out_buffer.pos
1063 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
1064 out_buffer, self._in_buffer,
1065 lib.ZSTD_e_end)
1066
1067 self._bytes_compressed += out_buffer.pos - old_pos
1068
1069 if lib.ZSTD_isError(zresult):
1070 raise ZstdError('error ending compression stream: %s' %
1071 _zstd_error(zresult))
1072
1073 if zresult == 0:
1074 self._finished_output = True
1075
1076 return out_buffer.pos
1077
1078 def readinto1(self, b):
1079 if self._closed:
1080 raise ValueError('stream is closed')
1081
1082 if self._finished_output:
1083 return 0
1084
1085 # TODO use writable=True once we require CFFI >= 1.12.
1086 dest_buffer = ffi.from_buffer(b)
1087 ffi.memmove(b, b'', 0)
1088
1089 out_buffer = ffi.new('ZSTD_outBuffer *')
1090 out_buffer.dst = dest_buffer
1091 out_buffer.size = len(dest_buffer)
1092 out_buffer.pos = 0
1093
1094 self._compress_into_buffer(out_buffer)
1095 if out_buffer.pos:
1096 return out_buffer.pos
1097
1098 while not self._finished_input:
1099 self._read_input()
1100
1101 if self._compress_into_buffer(out_buffer):
1102 return out_buffer.pos
1103
1104 if out_buffer.pos and not self._finished_input:
1105 return out_buffer.pos
1106
1107 # EOF.
1108 old_pos = out_buffer.pos
1109
1110 zresult = lib.ZSTD_compressStream2(self._compressor._cctx,
1111 out_buffer, self._in_buffer,
1112 lib.ZSTD_e_end)
1113
1114 self._bytes_compressed += out_buffer.pos - old_pos
1115
1116 if lib.ZSTD_isError(zresult):
1117 raise ZstdError('error ending compression stream: %s' %
1118 _zstd_error(zresult))
1119
1120 if zresult == 0:
1121 self._finished_output = True
1122
1123 return out_buffer.pos
1124
1125
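Review note: the comment in ``read1()`` explains why the "at most one underlying read" rule cannot hold for a compressor: some input produces no output yet. The sketch below shows that loop shape with ``zlib`` as a stand-in, pulling input until any output appears or the source is exhausted. ``read1_sketch`` is a hypothetical helper, and it ignores the output size limit that the real method enforces.

```python
import io
import zlib


def read1_sketch(source, compressor, read_size=8192):
    """Pull input until the compressor emits any output or input ends.

    Returns a (data, at_eof) tuple. Illustration only: no cap is placed
    on the amount of output returned.
    """
    while True:
        data = source.read(read_size)
        if not data:
            # Source exhausted: finish the stream so buffered data is emitted.
            return compressor.flush(zlib.Z_FINISH), True

        out = compressor.compress(data)
        if out:
            # Unlike read(), stop as soon as there is any output at all.
            return out, False
        # No output yet, so another read from the source is needed.


payload = b'abc' * 50000
src = io.BytesIO(payload)
cobj = zlib.compressobj()
parts = []
while True:
    chunk, at_eof = read1_sketch(src, cobj, read_size=16)
    parts.append(chunk)
    if at_eof:
        break
```

With a small ``read_size`` most calls need several source reads before anything comes back, which is the situation the comment in the diff describes.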
768 1126 class ZstdCompressor(object):
769 1127 def __init__(self, level=3, dict_data=None, compression_params=None,
770 1128 write_checksum=None, write_content_size=None,
@@ -803,25 +1161,25 b' class ZstdCompressor(object):'
803 1161 self._params = ffi.gc(params, lib.ZSTD_freeCCtxParams)
804 1162
805 1163 _set_compression_parameter(self._params,
806 lib.ZSTD_p_compressionLevel,
1164 lib.ZSTD_c_compressionLevel,
807 1165 level)
808 1166
809 1167 _set_compression_parameter(
810 1168 self._params,
811 lib.ZSTD_p_contentSizeFlag,
1169 lib.ZSTD_c_contentSizeFlag,
812 1170 write_content_size if write_content_size is not None else 1)
813 1171
814 1172 _set_compression_parameter(self._params,
815 lib.ZSTD_p_checksumFlag,
1173 lib.ZSTD_c_checksumFlag,
816 1174 1 if write_checksum else 0)
817 1175
818 1176 _set_compression_parameter(self._params,
819 lib.ZSTD_p_dictIDFlag,
1177 lib.ZSTD_c_dictIDFlag,
820 1178 1 if write_dict_id else 0)
821 1179
822 1180 if threads:
823 1181 _set_compression_parameter(self._params,
824 lib.ZSTD_p_nbWorkers,
1182 lib.ZSTD_c_nbWorkers,
825 1183 threads)
826 1184
827 1185 cctx = lib.ZSTD_createCCtx()
@@ -864,7 +1222,7 b' class ZstdCompressor(object):'
864 1222 return lib.ZSTD_sizeof_CCtx(self._cctx)
865 1223
866 1224 def compress(self, data):
867 lib.ZSTD_CCtx_reset(self._cctx)
1225 lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only)
868 1226
869 1227 data_buffer = ffi.from_buffer(data)
870 1228
@@ -887,10 +1245,10 b' class ZstdCompressor(object):'
887 1245 in_buffer.size = len(data_buffer)
888 1246 in_buffer.pos = 0
889 1247
890 zresult = lib.ZSTD_compress_generic(self._cctx,
891 out_buffer,
892 in_buffer,
893 lib.ZSTD_e_end)
1248 zresult = lib.ZSTD_compressStream2(self._cctx,
1249 out_buffer,
1250 in_buffer,
1251 lib.ZSTD_e_end)
894 1252
895 1253 if lib.ZSTD_isError(zresult):
896 1254 raise ZstdError('cannot compress: %s' %
@@ -901,7 +1259,7 b' class ZstdCompressor(object):'
901 1259 return ffi.buffer(out, out_buffer.pos)[:]
902 1260
903 1261 def compressobj(self, size=-1):
904 lib.ZSTD_CCtx_reset(self._cctx)
1262 lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only)
905 1263
906 1264 if size < 0:
907 1265 size = lib.ZSTD_CONTENTSIZE_UNKNOWN
@@ -923,7 +1281,7 b' class ZstdCompressor(object):'
923 1281 return cobj
924 1282
925 1283 def chunker(self, size=-1, chunk_size=COMPRESSION_RECOMMENDED_OUTPUT_SIZE):
926 lib.ZSTD_CCtx_reset(self._cctx)
1284 lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only)
927 1285
928 1286 if size < 0:
929 1287 size = lib.ZSTD_CONTENTSIZE_UNKNOWN
@@ -944,7 +1302,7 b' class ZstdCompressor(object):'
944 1302 if not hasattr(ofh, 'write'):
945 1303 raise ValueError('second argument must have a write() method')
946 1304
947 lib.ZSTD_CCtx_reset(self._cctx)
1305 lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only)
948 1306
949 1307 if size < 0:
950 1308 size = lib.ZSTD_CONTENTSIZE_UNKNOWN
@@ -976,10 +1334,10 b' class ZstdCompressor(object):'
976 1334 in_buffer.pos = 0
977 1335
978 1336 while in_buffer.pos < in_buffer.size:
979 zresult = lib.ZSTD_compress_generic(self._cctx,
980 out_buffer,
981 in_buffer,
982 lib.ZSTD_e_continue)
1337 zresult = lib.ZSTD_compressStream2(self._cctx,
1338 out_buffer,
1339 in_buffer,
1340 lib.ZSTD_e_continue)
983 1341 if lib.ZSTD_isError(zresult):
984 1342 raise ZstdError('zstd compress error: %s' %
985 1343 _zstd_error(zresult))
@@ -991,10 +1349,10 b' class ZstdCompressor(object):'
991 1349
992 1350 # We've finished reading. Flush the compressor.
993 1351 while True:
994 zresult = lib.ZSTD_compress_generic(self._cctx,
995 out_buffer,
996 in_buffer,
997 lib.ZSTD_e_end)
1352 zresult = lib.ZSTD_compressStream2(self._cctx,
1353 out_buffer,
1354 in_buffer,
1355 lib.ZSTD_e_end)
998 1356 if lib.ZSTD_isError(zresult):
999 1357 raise ZstdError('error ending compression stream: %s' %
1000 1358 _zstd_error(zresult))
@@ -1011,7 +1369,7 b' class ZstdCompressor(object):'
1011 1369
1012 1370 def stream_reader(self, source, size=-1,
1013 1371 read_size=COMPRESSION_RECOMMENDED_INPUT_SIZE):
1014 lib.ZSTD_CCtx_reset(self._cctx)
1372 lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only)
1015 1373
1016 1374 try:
1017 1375 size = len(source)
@@ -1026,20 +1384,22 b' class ZstdCompressor(object):'
1026 1384 raise ZstdError('error setting source size: %s' %
1027 1385 _zstd_error(zresult))
1028 1386
1029 return CompressionReader(self, source, read_size)
1387 return ZstdCompressionReader(self, source, read_size)
1030 1388
1031 1389 def stream_writer(self, writer, size=-1,
1032 write_size=COMPRESSION_RECOMMENDED_OUTPUT_SIZE):
1390 write_size=COMPRESSION_RECOMMENDED_OUTPUT_SIZE,
1391 write_return_read=False):
1033 1392
1034 1393 if not hasattr(writer, 'write'):
1035 1394 raise ValueError('must pass an object with a write() method')
1036 1395
1037 lib.ZSTD_CCtx_reset(self._cctx)
1396 lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only)
1038 1397
1039 1398 if size < 0:
1040 1399 size = lib.ZSTD_CONTENTSIZE_UNKNOWN
1041 1400
1042 return ZstdCompressionWriter(self, writer, size, write_size)
1401 return ZstdCompressionWriter(self, writer, size, write_size,
1402 write_return_read)
1043 1403
1044 1404 write_to = stream_writer
1045 1405
@@ -1056,7 +1416,7 b' class ZstdCompressor(object):'
1056 1416 raise ValueError('must pass an object with a read() method or '
1057 1417 'conforms to buffer protocol')
1058 1418
1059 lib.ZSTD_CCtx_reset(self._cctx)
1419 lib.ZSTD_CCtx_reset(self._cctx, lib.ZSTD_reset_session_only)
1060 1420
1061 1421 if size < 0:
1062 1422 size = lib.ZSTD_CONTENTSIZE_UNKNOWN
@@ -1104,8 +1464,8 b' class ZstdCompressor(object):'
1104 1464 in_buffer.pos = 0
1105 1465
1106 1466 while in_buffer.pos < in_buffer.size:
1107 zresult = lib.ZSTD_compress_generic(self._cctx, out_buffer, in_buffer,
1108 lib.ZSTD_e_continue)
1467 zresult = lib.ZSTD_compressStream2(self._cctx, out_buffer, in_buffer,
1468 lib.ZSTD_e_continue)
1109 1469 if lib.ZSTD_isError(zresult):
1110 1470 raise ZstdError('zstd compress error: %s' %
1111 1471 _zstd_error(zresult))
@@ -1124,10 +1484,10 b' class ZstdCompressor(object):'
1124 1484 # remains.
1125 1485 while True:
1126 1486 assert out_buffer.pos == 0
1127 zresult = lib.ZSTD_compress_generic(self._cctx,
1128 out_buffer,
1129 in_buffer,
1130 lib.ZSTD_e_end)
1487 zresult = lib.ZSTD_compressStream2(self._cctx,
1488 out_buffer,
1489 in_buffer,
1490 lib.ZSTD_e_end)
1131 1491 if lib.ZSTD_isError(zresult):
1132 1492 raise ZstdError('error ending compression stream: %s' %
1133 1493 _zstd_error(zresult))
@@ -1234,7 +1594,7 b' class ZstdCompressionDict(object):'
1234 1594 cparams = ffi.new('ZSTD_compressionParameters')
1235 1595 cparams.chainLog = compression_params.chain_log
1236 1596 cparams.hashLog = compression_params.hash_log
1237 cparams.searchLength = compression_params.min_match
1597 cparams.minMatch = compression_params.min_match
1238 1598 cparams.searchLog = compression_params.search_log
1239 1599 cparams.strategy = compression_params.compression_strategy
1240 1600 cparams.targetLength = compression_params.target_length
@@ -1345,6 +1705,10 b' class ZstdDecompressionObj(object):'
1345 1705 out_buffer = ffi.new('ZSTD_outBuffer *')
1346 1706
1347 1707 data_buffer = ffi.from_buffer(data)
1708
1709 if len(data_buffer) == 0:
1710 return b''
1711
1348 1712 in_buffer.src = data_buffer
1349 1713 in_buffer.size = len(data_buffer)
1350 1714 in_buffer.pos = 0
@@ -1357,8 +1721,8 b' class ZstdDecompressionObj(object):'
1357 1721 chunks = []
1358 1722
1359 1723 while True:
1360 zresult = lib.ZSTD_decompress_generic(self._decompressor._dctx,
1361 out_buffer, in_buffer)
1724 zresult = lib.ZSTD_decompressStream(self._decompressor._dctx,
1725 out_buffer, in_buffer)
1362 1726 if lib.ZSTD_isError(zresult):
1363 1727 raise ZstdError('zstd decompressor error: %s' %
1364 1728 _zstd_error(zresult))
@@ -1378,12 +1742,16 b' class ZstdDecompressionObj(object):'
1378 1742
1379 1743 return b''.join(chunks)
1380 1744
1381
1382 class DecompressionReader(object):
1383 def __init__(self, decompressor, source, read_size):
1745 def flush(self, length=0):
1746 pass
1747
1748
1749 class ZstdDecompressionReader(object):
1750 def __init__(self, decompressor, source, read_size, read_across_frames):
1384 1751 self._decompressor = decompressor
1385 1752 self._source = source
1386 1753 self._read_size = read_size
1754 self._read_across_frames = bool(read_across_frames)
1387 1755 self._entered = False
1388 1756 self._closed = False
1389 1757 self._bytes_decompressed = 0
@@ -1418,10 +1786,10 b' class DecompressionReader(object):'
1418 1786 return True
1419 1787
1420 1788 def readline(self):
1421 raise NotImplementedError()
1789 raise io.UnsupportedOperation()
1422 1790
1423 1791 def readlines(self):
1424 raise NotImplementedError()
1792 raise io.UnsupportedOperation()
1425 1793
1426 1794 def write(self, data):
1427 1795 raise io.UnsupportedOperation()
@@ -1447,25 +1815,158 b' class DecompressionReader(object):'
1447 1815 return self._bytes_decompressed
1448 1816
1449 1817 def readall(self):
1450 raise NotImplementedError()
1818 chunks = []
1819
1820 while True:
1821 chunk = self.read(1048576)
1822 if not chunk:
1823 break
1824
1825 chunks.append(chunk)
1826
1827 return b''.join(chunks)
1451 1828
1452 1829 def __iter__(self):
1453 raise NotImplementedError()
1830 raise io.UnsupportedOperation()
1454 1831
1455 1832 def __next__(self):
1456 raise NotImplementedError()
1833 raise io.UnsupportedOperation()
1457 1834
1458 1835 next = __next__
1459 1836
1460 def read(self, size):
1837 def _read_input(self):
1838 # We have data left over in the input buffer. Use it.
1839 if self._in_buffer.pos < self._in_buffer.size:
1840 return
1841
1842 # All input data exhausted. Nothing to do.
1843 if self._finished_input:
1844 return
1845
1846 # Else populate the input buffer from our source.
1847 if hasattr(self._source, 'read'):
1848 data = self._source.read(self._read_size)
1849
1850 if not data:
1851 self._finished_input = True
1852 return
1853
1854 self._source_buffer = ffi.from_buffer(data)
1855 self._in_buffer.src = self._source_buffer
1856 self._in_buffer.size = len(self._source_buffer)
1857 self._in_buffer.pos = 0
1858 else:
1859 self._source_buffer = ffi.from_buffer(self._source)
1860 self._in_buffer.src = self._source_buffer
1861 self._in_buffer.size = len(self._source_buffer)
1862 self._in_buffer.pos = 0
1863
1864 def _decompress_into_buffer(self, out_buffer):
1865 """Decompress available input into an output buffer.
1866
1867 Returns True if data in output buffer should be emitted.
1868 """
1869 zresult = lib.ZSTD_decompressStream(self._decompressor._dctx,
1870 out_buffer, self._in_buffer)
1871
1872 if self._in_buffer.pos == self._in_buffer.size:
1873 self._in_buffer.src = ffi.NULL
1874 self._in_buffer.pos = 0
1875 self._in_buffer.size = 0
1876 self._source_buffer = None
1877
1878 if not hasattr(self._source, 'read'):
1879 self._finished_input = True
1880
1881 if lib.ZSTD_isError(zresult):
1882 raise ZstdError('zstd decompress error: %s' %
1883 _zstd_error(zresult))
1884
1885 # Emit data if there is data AND either:
1886 # a) output buffer is full (read amount is satisfied)
1887 # b) we're at end of a frame and not in frame spanning mode
1888 return (out_buffer.pos and
1889 (out_buffer.pos == out_buffer.size or
1890 zresult == 0 and not self._read_across_frames))
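Review note: the return expression of ``_decompress_into_buffer()`` encodes the emit rule stated in the comment above it. Restated as a standalone predicate, the cases are easier to check. ``should_emit`` is a hypothetical name, and its arguments stand for ``out_buffer.pos``, ``out_buffer.size``, the ``ZSTD_decompressStream`` result and the ``read_across_frames`` flag.

```python
def should_emit(out_pos, out_size, zresult, read_across_frames):
    """Return True when buffered output should be handed to the caller.

    Emit only if there is data AND either the buffer is full or a frame
    just ended (zresult == 0) while frame spanning is disabled.
    """
    return bool(out_pos and
                (out_pos == out_size or
                 (zresult == 0 and not read_across_frames)))


# Partial buffer at the end of a frame, frame spanning disabled: emit.
end_of_frame = should_emit(4, 10, 0, False)
# Same state with frame spanning enabled: keep filling the buffer.
spanning = should_emit(4, 10, 0, True)
```

An empty buffer is never emitted, even at the end of a frame, so callers rely on ``_finished_input`` to detect the end of the stream.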
1891
1892 def read(self, size=-1):
1893 if self._closed:
1894 raise ValueError('stream is closed')
1895
1896 if size < -1:
1897 raise ValueError('cannot read negative amounts less than -1')
1898
1899 if size == -1:
1900 # This is recursive. But it gets the job done.
1901 return self.readall()
1902
1903 if self._finished_output or size == 0:
1904 return b''
1905
1906 # We /could/ call into readinto() here. But that introduces more
1907 # overhead.
1908 dst_buffer = ffi.new('char[]', size)
1909 out_buffer = ffi.new('ZSTD_outBuffer *')
1910 out_buffer.dst = dst_buffer
1911 out_buffer.size = size
1912 out_buffer.pos = 0
1913
1914 self._read_input()
1915 if self._decompress_into_buffer(out_buffer):
1916 self._bytes_decompressed += out_buffer.pos
1917 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
1918
1919 while not self._finished_input:
1920 self._read_input()
1921 if self._decompress_into_buffer(out_buffer):
1922 self._bytes_decompressed += out_buffer.pos
1923 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
1924
1925 self._bytes_decompressed += out_buffer.pos
1926 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
1927
1928 def readinto(self, b):
1461 1929 if self._closed:
1462 1930 raise ValueError('stream is closed')
1463 1931
1464 1932 if self._finished_output:
1933 return 0
1934
1935 # TODO use writable=True once we require CFFI >= 1.12.
1936 dest_buffer = ffi.from_buffer(b)
1937 ffi.memmove(b, b'', 0)
1938 out_buffer = ffi.new('ZSTD_outBuffer *')
1939 out_buffer.dst = dest_buffer
1940 out_buffer.size = len(dest_buffer)
1941 out_buffer.pos = 0
1942
1943 self._read_input()
1944 if self._decompress_into_buffer(out_buffer):
1945 self._bytes_decompressed += out_buffer.pos
1946 return out_buffer.pos
1947
1948 while not self._finished_input:
1949 self._read_input()
1950 if self._decompress_into_buffer(out_buffer):
1951 self._bytes_decompressed += out_buffer.pos
1952 return out_buffer.pos
1953
1954 self._bytes_decompressed += out_buffer.pos
1955 return out_buffer.pos
1956
1957 def read1(self, size=-1):
1958 if self._closed:
1959 raise ValueError('stream is closed')
1960
1961 if size < -1:
1962 raise ValueError('cannot read negative amounts less than -1')
1963
1964 if self._finished_output or size == 0:
1465 1965 return b''
1466 1966
1467 if size < 1:
1468 raise ValueError('cannot read negative or size 0 amounts')
1967 # -1 returns arbitrary number of bytes.
1968 if size == -1:
1969 size = DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE
1469 1970
1470 1971 dst_buffer = ffi.new('char[]', size)
1471 1972 out_buffer = ffi.new('ZSTD_outBuffer *')
@@ -1473,64 +1974,46 b' class DecompressionReader(object):'
1473 1974 out_buffer.size = size
1474 1975 out_buffer.pos = 0
1475 1976
1476 def decompress():
1477 zresult = lib.ZSTD_decompress_generic(self._decompressor._dctx,
1478 out_buffer, self._in_buffer)
1479
1480 if self._in_buffer.pos == self._in_buffer.size:
1481 self._in_buffer.src = ffi.NULL
1482 self._in_buffer.pos = 0
1483 self._in_buffer.size = 0
1484 self._source_buffer = None
1485
1486 if not hasattr(self._source, 'read'):
1487 self._finished_input = True
1488
1489 if lib.ZSTD_isError(zresult):
1490 raise ZstdError('zstd decompress error: %s',
1491 _zstd_error(zresult))
1492 elif zresult == 0:
1493 self._finished_output = True
1494
1495 if out_buffer.pos and out_buffer.pos == out_buffer.size:
1496 self._bytes_decompressed += out_buffer.size
1497 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
1498
1499 def get_input():
1500 if self._finished_input:
1501 return
1502
1503 if hasattr(self._source, 'read'):
1504 data = self._source.read(self._read_size)
1505
1506 if not data:
1507 self._finished_input = True
1508 return
1509
1510 self._source_buffer = ffi.from_buffer(data)
1511 self._in_buffer.src = self._source_buffer
1512 self._in_buffer.size = len(self._source_buffer)
1513 self._in_buffer.pos = 0
1514 else:
1515 self._source_buffer = ffi.from_buffer(self._source)
1516 self._in_buffer.src = self._source_buffer
1517 self._in_buffer.size = len(self._source_buffer)
1518 self._in_buffer.pos = 0
1519
1520 get_input()
1521 result = decompress()
1522 if result:
1523 return result
1524
1977 # read1() dictates that we can perform at most 1 call to underlying
1978 # stream to get input. However, we can't satisfy this restriction with
1979 # decompression because not all input generates output. So we allow
1980 # multiple read(). But unlike read(), we stop once we have any output.
1525 1981 while not self._finished_input:
1526 get_input()
1527 result = decompress()
1528 if result:
1529 return result
1982 self._read_input()
1983 self._decompress_into_buffer(out_buffer)
1984
1985 if out_buffer.pos:
1986 break
1530 1987
1531 1988 self._bytes_decompressed += out_buffer.pos
1532 1989 return ffi.buffer(out_buffer.dst, out_buffer.pos)[:]
1533 1990
1991 def readinto1(self, b):
1992 if self._closed:
1993 raise ValueError('stream is closed')
1994
1995 if self._finished_output:
1996 return 0
1997
1998 # TODO use writable=True once we require CFFI >= 1.12.
1999 dest_buffer = ffi.from_buffer(b)
2000 ffi.memmove(b, b'', 0)
2001
2002 out_buffer = ffi.new('ZSTD_outBuffer *')
2003 out_buffer.dst = dest_buffer
2004 out_buffer.size = len(dest_buffer)
2005 out_buffer.pos = 0
2006
2007 while not self._finished_input and not self._finished_output:
2008 self._read_input()
2009 self._decompress_into_buffer(out_buffer)
2010
2011 if out_buffer.pos:
2012 break
2013
2014 self._bytes_decompressed += out_buffer.pos
2015 return out_buffer.pos
2016
1534 2017 def seek(self, pos, whence=os.SEEK_SET):
1535 2018 if self._closed:
1536 2019 raise ValueError('stream is closed')
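Note on the ``read1()`` hunk above: the new comment says a single underlying read cannot always be honored, because some compressed input yields no output, so the loop keeps reading until any output exists. The sketch below illustrates only that loop shape. It is an assumption-laden stand-in, not the python-zstandard implementation: it uses ``zlib`` from the standard library instead of zstd, and ``Read1Sketch``, ``source_reads`` and ``_pending`` are hypothetical names.

```python
import io
import zlib


class Read1Sketch:
    """Hypothetical reader showing the read1() loop; zlib stands in for zstd."""

    def __init__(self, source, read_size):
        self._source = source
        self._read_size = read_size
        self._dobj = zlib.decompressobj()
        self._pending = b''            # decompressed bytes not yet returned
        self._finished_input = False
        self.source_reads = 0          # instrumentation for this example only

    def read1(self, size=-1):
        # Keep pulling input until some output exists or the input runs out.
        # Unlike read(), stop as soon as there is anything to return.
        while not self._pending and not self._finished_input:
            chunk = self._source.read(self._read_size)
            self.source_reads += 1
            if not chunk:
                self._finished_input = True
                self._pending += self._dobj.flush()
                break
            self._pending += self._dobj.decompress(chunk)

        if size < 0:
            size = len(self._pending)
        out, self._pending = self._pending[:size], self._pending[size:]
        return out


payload = b'hello world ' * 100
reader = Read1Sketch(io.BytesIO(zlib.compress(payload)), read_size=1)
first = reader.read1(64)
```

With a one-byte read size, the first call has to read past the stream header before any output appears, which is why ``source_reads`` ends up greater than 1 even though the caller asked for a single ``read1()``.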
@@ -1569,34 +2052,108 b' class DecompressionReader(object):'
1569 2052 return self._bytes_decompressed
1570 2053
1571 2054 class ZstdDecompressionWriter(object):
1572 def __init__(self, decompressor, writer, write_size):
2055 def __init__(self, decompressor, writer, write_size, write_return_read):
2056 decompressor._ensure_dctx()
2057
1573 2058 self._decompressor = decompressor
1574 2059 self._writer = writer
1575 2060 self._write_size = write_size
2061 self._write_return_read = bool(write_return_read)
1576 2062 self._entered = False
2063 self._closed = False
1577 2064
1578 2065 def __enter__(self):
2066 if self._closed:
2067 raise ValueError('stream is closed')
2068
1579 2069 if self._entered:
1580 2070 raise ZstdError('cannot __enter__ multiple times')
1581 2071
1582 self._decompressor._ensure_dctx()
1583 2072 self._entered = True
1584 2073
1585 2074 return self
1586 2075
1587 2076 def __exit__(self, exc_type, exc_value, exc_tb):
1588 2077 self._entered = False
2078 self.close()
1589 2079
1590 2080 def memory_size(self):
1591 if not self._decompressor._dctx:
1592 raise ZstdError('cannot determine size of inactive decompressor '
1593 'call when context manager is active')
1594
1595 2081 return lib.ZSTD_sizeof_DCtx(self._decompressor._dctx)
1596 2082
2083 def close(self):
2084 if self._closed:
2085 return
2086
2087 try:
2088 self.flush()
2089 finally:
2090 self._closed = True
2091
2092 f = getattr(self._writer, 'close', None)
2093 if f:
2094 f()
2095
2096 @property
2097 def closed(self):
2098 return self._closed
2099
2100 def fileno(self):
2101 f = getattr(self._writer, 'fileno', None)
2102 if f:
2103 return f()
2104 else:
2105 raise OSError('fileno not available on underlying writer')
2106
2107 def flush(self):
2108 if self._closed:
2109 raise ValueError('stream is closed')
2110
2111 f = getattr(self._writer, 'flush', None)
2112 if f:
2113 return f()
2114
2115 def isatty(self):
2116 return False
2117
2118 def readable(self):
2119 return False
2120
2121 def readline(self, size=-1):
2122 raise io.UnsupportedOperation()
2123
2124 def readlines(self, hint=-1):
2125 raise io.UnsupportedOperation()
2126
2127 def seek(self, offset, whence=None):
2128 raise io.UnsupportedOperation()
2129
2130 def seekable(self):
2131 return False
2132
2133 def tell(self):
2134 raise io.UnsupportedOperation()
2135
2136 def truncate(self, size=None):
2137 raise io.UnsupportedOperation()
2138
2139 def writable(self):
2140 return True
2141
2142 def writelines(self, lines):
2143 raise io.UnsupportedOperation()
2144
2145 def read(self, size=-1):
2146 raise io.UnsupportedOperation()
2147
2148 def readall(self):
2149 raise io.UnsupportedOperation()
2150
2151 def readinto(self, b):
2152 raise io.UnsupportedOperation()
2153
1597 2154 def write(self, data):
1598 if not self._entered:
1599 raise ZstdError('write must be called from an active context manager')
2155 if self._closed:
2156 raise ValueError('stream is closed')
1600 2157
1601 2158 total_write = 0
1602 2159
@@ -1616,7 +2173,7 b' class ZstdDecompressionWriter(object):'
1616 2173 dctx = self._decompressor._dctx
1617 2174
1618 2175 while in_buffer.pos < in_buffer.size:
1619 zresult = lib.ZSTD_decompress_generic(dctx, out_buffer, in_buffer)
2176 zresult = lib.ZSTD_decompressStream(dctx, out_buffer, in_buffer)
1620 2177 if lib.ZSTD_isError(zresult):
1621 2178 raise ZstdError('zstd decompress error: %s' %
1622 2179 _zstd_error(zresult))
@@ -1626,7 +2183,10 b' class ZstdDecompressionWriter(object):'
1626 2183 total_write += out_buffer.pos
1627 2184 out_buffer.pos = 0
1628 2185
1629 return total_write
2186 if self._write_return_read:
2187 return in_buffer.pos
2188 else:
2189 return total_write
1630 2190
1631 2191
1632 2192 class ZstdDecompressor(object):
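Note on ``write_return_read`` in the hunks above: ``write()`` now returns either ``in_buffer.pos`` (bytes consumed from the caller's data) or ``total_write`` (bytes handed to the inner writer). A minimal sketch of the difference follows. It is an illustration under stated assumptions, not the library's code: ``zlib`` stands in for zstd and ``DecompressionWriterSketch`` is a hypothetical class.

```python
import io
import zlib


class DecompressionWriterSketch:
    """Hypothetical writer; decompresses with zlib and forwards the output."""

    def __init__(self, writer, write_return_read=False):
        self._writer = writer
        self._dobj = zlib.decompressobj()
        self._write_return_read = bool(write_return_read)

    def write(self, data):
        out = self._dobj.decompress(data)
        self._writer.write(out)
        if self._write_return_read:
            # Bytes consumed from the caller's input.
            return len(data)
        # Bytes written to the inner writer (decompressed size).
        return len(out)


payload = b'a' * 1000
compressed = zlib.compress(payload)

legacy = DecompressionWriterSketch(io.BytesIO(), write_return_read=False)
io_style = DecompressionWriterSketch(io.BytesIO(), write_return_read=True)

written = legacy.write(compressed)     # size of the decompressed output
consumed = io_style.write(compressed)  # size of the compressed input
```

Returning the consumed count matches what callers of Python's ``io`` stream ``write()`` methods generally expect; the default of ``False`` in the diff keeps the older return value.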
@@ -1684,7 +2244,7 b' class ZstdDecompressor(object):'
1684 2244 in_buffer.size = len(data_buffer)
1685 2245 in_buffer.pos = 0
1686 2246
1687 zresult = lib.ZSTD_decompress_generic(self._dctx, out_buffer, in_buffer)
2247 zresult = lib.ZSTD_decompressStream(self._dctx, out_buffer, in_buffer)
1688 2248 if lib.ZSTD_isError(zresult):
1689 2249 raise ZstdError('decompression error: %s' %
1690 2250 _zstd_error(zresult))
@@ -1696,9 +2256,10 b' class ZstdDecompressor(object):'
1696 2256
1697 2257 return ffi.buffer(result_buffer, out_buffer.pos)[:]
1698 2258
1699 def stream_reader(self, source, read_size=DECOMPRESSION_RECOMMENDED_INPUT_SIZE):
2259 def stream_reader(self, source, read_size=DECOMPRESSION_RECOMMENDED_INPUT_SIZE,
2260 read_across_frames=False):
1700 2261 self._ensure_dctx()
1701 return DecompressionReader(self, source, read_size)
2262 return ZstdDecompressionReader(self, source, read_size, read_across_frames)
1702 2263
1703 2264 def decompressobj(self, write_size=DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE):
1704 2265 if write_size < 1:
@@ -1767,7 +2328,7 b' class ZstdDecompressor(object):'
1767 2328 while in_buffer.pos < in_buffer.size:
1768 2329 assert out_buffer.pos == 0
1769 2330
1770 zresult = lib.ZSTD_decompress_generic(self._dctx, out_buffer, in_buffer)
2331 zresult = lib.ZSTD_decompressStream(self._dctx, out_buffer, in_buffer)
1771 2332 if lib.ZSTD_isError(zresult):
1772 2333 raise ZstdError('zstd decompress error: %s' %
1773 2334 _zstd_error(zresult))
@@ -1787,11 +2348,13 b' class ZstdDecompressor(object):'
1787 2348
1788 2349 read_from = read_to_iter
1789 2350
1790 def stream_writer(self, writer, write_size=DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE):
2351 def stream_writer(self, writer, write_size=DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE,
2352 write_return_read=False):
1791 2353 if not hasattr(writer, 'write'):
1792 2354 raise ValueError('must pass an object with a write() method')
1793 2355
1794 return ZstdDecompressionWriter(self, writer, write_size)
2356 return ZstdDecompressionWriter(self, writer, write_size,
2357 write_return_read)
1795 2358
1796 2359 write_to = stream_writer
1797 2360
@@ -1829,7 +2392,7 b' class ZstdDecompressor(object):'
1829 2392
1830 2393 # Flush all read data to output.
1831 2394 while in_buffer.pos < in_buffer.size:
1832 zresult = lib.ZSTD_decompress_generic(self._dctx, out_buffer, in_buffer)
2395 zresult = lib.ZSTD_decompressStream(self._dctx, out_buffer, in_buffer)
1833 2396 if lib.ZSTD_isError(zresult):
1834 2397 raise ZstdError('zstd decompressor error: %s' %
1835 2398 _zstd_error(zresult))
@@ -1881,7 +2444,7 b' class ZstdDecompressor(object):'
1881 2444 in_buffer.size = len(chunk_buffer)
1882 2445 in_buffer.pos = 0
1883 2446
1884 zresult = lib.ZSTD_decompress_generic(self._dctx, out_buffer, in_buffer)
2447 zresult = lib.ZSTD_decompressStream(self._dctx, out_buffer, in_buffer)
1885 2448 if lib.ZSTD_isError(zresult):
1886 2449 raise ZstdError('could not decompress chunk 0: %s' %
1887 2450 _zstd_error(zresult))
@@ -1918,7 +2481,7 b' class ZstdDecompressor(object):'
1918 2481 in_buffer.size = len(chunk_buffer)
1919 2482 in_buffer.pos = 0
1920 2483
1921 zresult = lib.ZSTD_decompress_generic(self._dctx, out_buffer, in_buffer)
2484 zresult = lib.ZSTD_decompressStream(self._dctx, out_buffer, in_buffer)
1922 2485 if lib.ZSTD_isError(zresult):
1923 2486 raise ZstdError('could not decompress chunk %d: %s' %
1924 2487 _zstd_error(zresult))
@@ -1931,7 +2494,7 b' class ZstdDecompressor(object):'
1931 2494 return ffi.buffer(last_buffer, len(last_buffer))[:]
1932 2495
1933 2496 def _ensure_dctx(self, load_dict=True):
1934 lib.ZSTD_DCtx_reset(self._dctx)
2497 lib.ZSTD_DCtx_reset(self._dctx, lib.ZSTD_reset_session_only)
1935 2498
1936 2499 if self._max_window_size:
1937 2500 zresult = lib.ZSTD_DCtx_setMaxWindowSize(self._dctx,
@@ -210,7 +210,7 b' void zstd_module_init(PyObject* m) {'
210 210 We detect this mismatch here and refuse to load the module if this
211 211 scenario is detected.
212 212 */
213 if (ZSTD_VERSION_NUMBER != 10306 || ZSTD_versionNumber() != 10306) {
213 if (ZSTD_VERSION_NUMBER != 10308 || ZSTD_versionNumber() != 10308) {
214 214 PyErr_SetString(PyExc_ImportError, "zstd C API mismatch; Python bindings not compiled against expected zstd version");
215 215 return;
216 216 }
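Note on the version check above: zstd encodes its version number as ``MAJOR * 10000 + MINOR * 100 + RELEASE``, so the change from 10306 to 10308 corresponds to moving from 1.3.6 to 1.3.8. A small sketch of the decoding, where ``decode_zstd_version`` is a hypothetical helper name used only for illustration:

```python
def decode_zstd_version(number):
    """Split a zstd version number into (major, minor, release)."""
    major, remainder = divmod(number, 10000)
    minor, release = divmod(remainder, 100)
    return major, minor, release


old_version = decode_zstd_version(10306)
new_version = decode_zstd_version(10308)
```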
@@ -339,17 +339,10 b' MEM_STATIC size_t BIT_getUpperBits(size_'
339 339
340 340 MEM_STATIC size_t BIT_getMiddleBits(size_t bitContainer, U32 const start, U32 const nbBits)
341 341 {
342 #if defined(__BMI__) && defined(__GNUC__) && __GNUC__*1000+__GNUC_MINOR__ >= 4008 /* experimental */
343 # if defined(__x86_64__)
344 if (sizeof(bitContainer)==8)
345 return _bextr_u64(bitContainer, start, nbBits);
346 else
347 # endif
348 return _bextr_u32(bitContainer, start, nbBits);
349 #else
342 U32 const regMask = sizeof(bitContainer)*8 - 1;
343 /* if start > regMask, bitstream is corrupted, and result is undefined */
350 344 assert(nbBits < BIT_MASK_SIZE);
351 return (bitContainer >> start) & BIT_mask[nbBits];
352 #endif
345 return (bitContainer >> (start & regMask)) & BIT_mask[nbBits];
353 346 }
354 347
355 348 MEM_STATIC size_t BIT_getLowerBits(size_t bitContainer, U32 const nbBits)
@@ -366,9 +359,13 b' MEM_STATIC size_t BIT_getLowerBits(size_'
366 359 * @return : value extracted */
367 360 MEM_STATIC size_t BIT_lookBits(const BIT_DStream_t* bitD, U32 nbBits)
368 361 {
369 #if defined(__BMI__) && defined(__GNUC__) /* experimental; fails if bitD->bitsConsumed + nbBits > sizeof(bitD->bitContainer)*8 */
362 /* arbitrate between double-shift and shift+mask */
363 #if 1
364 /* if bitD->bitsConsumed + nbBits > sizeof(bitD->bitContainer)*8,
365 * bitstream is likely corrupted, and result is undefined */
370 366 return BIT_getMiddleBits(bitD->bitContainer, (sizeof(bitD->bitContainer)*8) - bitD->bitsConsumed - nbBits, nbBits);
371 367 #else
368 /* this code path is slower on my os-x laptop */
372 369 U32 const regMask = sizeof(bitD->bitContainer)*8 - 1;
373 370 return ((bitD->bitContainer << (bitD->bitsConsumed & regMask)) >> 1) >> ((regMask-nbBits) & regMask);
374 371 #endif
@@ -392,7 +389,7 b' MEM_STATIC void BIT_skipBits(BIT_DStream'
392 389 * Read (consume) next n bits from local register and update.
393 390 * Pay attention to not read more than nbBits contained into local register.
394 391 * @return : extracted value. */
395 MEM_STATIC size_t BIT_readBits(BIT_DStream_t* bitD, U32 nbBits)
392 MEM_STATIC size_t BIT_readBits(BIT_DStream_t* bitD, unsigned nbBits)
396 393 {
397 394 size_t const value = BIT_lookBits(bitD, nbBits);
398 395 BIT_skipBits(bitD, nbBits);
@@ -401,7 +398,7 b' MEM_STATIC size_t BIT_readBits(BIT_DStre'
401 398
402 399 /*! BIT_readBitsFast() :
403 400 * unsafe version; only works if nbBits >= 1 */
404 MEM_STATIC size_t BIT_readBitsFast(BIT_DStream_t* bitD, U32 nbBits)
401 MEM_STATIC size_t BIT_readBitsFast(BIT_DStream_t* bitD, unsigned nbBits)
405 402 {
406 403 size_t const value = BIT_lookBitsFast(bitD, nbBits);
407 404 assert(nbBits >= 1);
@@ -15,6 +15,8 b''
15 15 * Compiler specifics
16 16 *********************************************************/
17 17 /* force inlining */
18
19 #if !defined(ZSTD_NO_INLINE)
18 20 #if defined (__GNUC__) || defined(__cplusplus) || defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L /* C99 */
19 21 # define INLINE_KEYWORD inline
20 22 #else
@@ -29,6 +31,13 b''
29 31 # define FORCE_INLINE_ATTR
30 32 #endif
31 33
34 #else
35
36 #define INLINE_KEYWORD
37 #define FORCE_INLINE_ATTR
38
39 #endif
40
32 41 /**
33 42 * FORCE_INLINE_TEMPLATE is used to define C "templates", which take constant
34 43 * parameters. They must be inlined for the compiler to eliminate the constant
@@ -89,23 +98,21 b''
89 98 #endif
90 99
91 100 /* prefetch
92 * can be disabled, by declaring NO_PREFETCH macro
93 * All prefetch invocations use a single default locality 2,
94 * generating instruction prefetcht1,
95 * which, according to Intel, means "load data into L2 cache".
96 * This is a good enough "middle ground" for the time being,
97 * though in theory, it would be better to specialize locality depending on data being prefetched.
98 * Tests could not determine any sensible difference based on locality value. */
101 * can be disabled, by declaring NO_PREFETCH build macro */
99 102 #if defined(NO_PREFETCH)
100 # define PREFETCH(ptr) (void)(ptr) /* disabled */
103 # define PREFETCH_L1(ptr) (void)(ptr) /* disabled */
104 # define PREFETCH_L2(ptr) (void)(ptr) /* disabled */
101 105 #else
102 106 # if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_I86)) /* _mm_prefetch() is not defined outside of x86/x64 */
103 107 # include <mmintrin.h> /* https://msdn.microsoft.com/fr-fr/library/84szxsww(v=vs.90).aspx */
104 # define PREFETCH(ptr) _mm_prefetch((const char*)(ptr), _MM_HINT_T1)
108 # define PREFETCH_L1(ptr) _mm_prefetch((const char*)(ptr), _MM_HINT_T0)
109 # define PREFETCH_L2(ptr) _mm_prefetch((const char*)(ptr), _MM_HINT_T1)
105 110 # elif defined(__GNUC__) && ( (__GNUC__ >= 4) || ( (__GNUC__ == 3) && (__GNUC_MINOR__ >= 1) ) )
106 # define PREFETCH(ptr) __builtin_prefetch((ptr), 0 /* rw==read */, 2 /* locality */)
111 # define PREFETCH_L1(ptr) __builtin_prefetch((ptr), 0 /* rw==read */, 3 /* locality */)
112 # define PREFETCH_L2(ptr) __builtin_prefetch((ptr), 0 /* rw==read */, 2 /* locality */)
107 113 # else
108 # define PREFETCH(ptr) (void)(ptr) /* disabled */
114 # define PREFETCH_L1(ptr) (void)(ptr) /* disabled */
115 # define PREFETCH_L2(ptr) (void)(ptr) /* disabled */
109 116 # endif
110 117 #endif /* NO_PREFETCH */
111 118
@@ -116,7 +123,7 b''
116 123 size_t const _size = (size_t)(s); \
117 124 size_t _pos; \
118 125 for (_pos=0; _pos<_size; _pos+=CACHELINE_SIZE) { \
119 PREFETCH(_ptr + _pos); \
126 PREFETCH_L2(_ptr + _pos); \
120 127 } \
121 128 }
122 129
@@ -78,7 +78,7 b' MEM_STATIC ZSTD_cpuid_t ZSTD_cpuid(void)'
78 78 __asm__(
79 79 "pushl %%ebx\n\t"
80 80 "cpuid\n\t"
81 "movl %%ebx, %%eax\n\r"
81 "movl %%ebx, %%eax\n\t"
82 82 "popl %%ebx"
83 83 : "=a"(f7b), "=c"(f7c)
84 84 : "a"(7), "c"(0)
@@ -57,9 +57,9 b' extern "C" {'
57 57 #endif
58 58
59 59
60 /* static assert is triggered at compile time, leaving no runtime artefact,
61 * but can only work with compile-time constants.
62 * This variant can only be used inside a function. */
60 /* static assert is triggered at compile time, leaving no runtime artefact.
61 * static assert only works with compile-time constants.
62 * Also, this variant can only be used inside a function. */
63 63 #define DEBUG_STATIC_ASSERT(c) (void)sizeof(char[(c) ? 1 : -1])
64 64
65 65
@@ -70,9 +70,19 b' extern "C" {'
70 70 # define DEBUGLEVEL 0
71 71 #endif
72 72
73
74 /* DEBUGFILE can be defined externally,
75 * typically through compiler command line.
76 * note : currently useless.
77 * Value must be stderr or stdout */
78 #ifndef DEBUGFILE
79 # define DEBUGFILE stderr
80 #endif
81
82
73 83 /* recommended values for DEBUGLEVEL :
74 * 0 : no debug, all run-time functions disabled
75 * 1 : no display, enables assert() only
84 * 0 : release mode, no debug, all run-time checks disabled
85 * 1 : enables assert() only, no display
76 86 * 2 : reserved, for currently active debug path
77 87 * 3 : events once per object lifetime (CCtx, CDict, etc.)
78 88 * 4 : events once per frame
@@ -81,7 +91,7 b' extern "C" {'
81 91 * 7+: events at every position (*very* verbose)
82 92 *
83 93 * It's generally inconvenient to output traces > 5.
84 * In which case, it's possible to selectively enable higher verbosity levels
94 * In which case, it's possible to selectively trigger high verbosity levels
85 95 * by modifying g_debug_level.
86 96 */
87 97
@@ -95,11 +105,12 b' extern "C" {'
95 105
96 106 #if (DEBUGLEVEL>=2)
97 107 # include <stdio.h>
98 extern int g_debuglevel; /* here, this variable is only declared,
99 it actually lives in debug.c,
100 and is shared by the whole process.
101 It's typically used to enable very verbose levels
102 on selective conditions (such as position in src) */
108 extern int g_debuglevel; /* the variable is only declared,
109 it actually lives in debug.c,
110 and is shared by the whole process.
111 It's not thread-safe.
112 It's useful when enabling very verbose levels
113 on selective conditions (such as position in src) */
103 114
104 115 # define RAWLOG(l, ...) { \
105 116 if (l<=g_debuglevel) { \
@@ -14,6 +14,10 b''
14 14
15 15 const char* ERR_getErrorString(ERR_enum code)
16 16 {
17 #ifdef ZSTD_STRIP_ERROR_STRINGS
18 (void)code;
19 return "Error strings stripped";
20 #else
17 21 static const char* const notErrorCode = "Unspecified error code";
18 22 switch( code )
19 23 {
@@ -39,10 +43,12 b' const char* ERR_getErrorString(ERR_enum '
39 43 case PREFIX(dictionaryCreation_failed): return "Cannot create Dictionary from provided samples";
40 44 case PREFIX(dstSize_tooSmall): return "Destination buffer is too small";
41 45 case PREFIX(srcSize_wrong): return "Src size is incorrect";
46 case PREFIX(dstBuffer_null): return "Operation on NULL destination buffer";
42 47 /* following error codes are not stable and may be removed or changed in a future version */
43 48 case PREFIX(frameIndex_tooLarge): return "Frame index is too large";
44 49 case PREFIX(seekableIO): return "An I/O error occurred when reading/seeking";
45 50 case PREFIX(maxCode):
46 51 default: return notErrorCode;
47 52 }
53 #endif
48 54 }
@@ -512,7 +512,7 b' MEM_STATIC void FSE_initCState(FSE_CStat'
512 512 const U32 tableLog = MEM_read16(ptr);
513 513 statePtr->value = (ptrdiff_t)1<<tableLog;
514 514 statePtr->stateTable = u16ptr+2;
515 statePtr->symbolTT = ((const U32*)ct + 1 + (tableLog ? (1<<(tableLog-1)) : 1));
515 statePtr->symbolTT = ct + 1 + (tableLog ? (1<<(tableLog-1)) : 1);
516 516 statePtr->stateLog = tableLog;
517 517 }
518 518
@@ -531,7 +531,7 b' MEM_STATIC void FSE_initCState2(FSE_CSta'
531 531 }
532 532 }
533 533
534 MEM_STATIC void FSE_encodeSymbol(BIT_CStream_t* bitC, FSE_CState_t* statePtr, U32 symbol)
534 MEM_STATIC void FSE_encodeSymbol(BIT_CStream_t* bitC, FSE_CState_t* statePtr, unsigned symbol)
535 535 {
536 536 FSE_symbolCompressionTransform const symbolTT = ((const FSE_symbolCompressionTransform*)(statePtr->symbolTT))[symbol];
537 537 const U16* const stateTable = (const U16*)(statePtr->stateTable);
@@ -173,15 +173,19 b' typedef U32 HUF_DTable;'
173 173 * Advanced decompression functions
174 174 ******************************************/
175 175 size_t HUF_decompress4X1 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< single-symbol decoder */
176 #ifndef HUF_FORCE_DECOMPRESS_X1
176 177 size_t HUF_decompress4X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< double-symbols decoder */
178 #endif
177 179
178 180 size_t HUF_decompress4X_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< decodes RLE and uncompressed */
179 181 size_t HUF_decompress4X_hufOnly(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< considers RLE and uncompressed as errors */
180 182 size_t HUF_decompress4X_hufOnly_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< considers RLE and uncompressed as errors */
181 183 size_t HUF_decompress4X1_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< single-symbol decoder */
182 184 size_t HUF_decompress4X1_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< single-symbol decoder */
185 #ifndef HUF_FORCE_DECOMPRESS_X1
183 186 size_t HUF_decompress4X2_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< double-symbols decoder */
184 187 size_t HUF_decompress4X2_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< double-symbols decoder */
188 #endif
185 189
186 190
187 191 /* ****************************************
@@ -228,7 +232,7 b' size_t HUF_compress4X_repeat(void* dst, '
228 232 #define HUF_CTABLE_WORKSPACE_SIZE_U32 (2*HUF_SYMBOLVALUE_MAX +1 +1)
229 233 #define HUF_CTABLE_WORKSPACE_SIZE (HUF_CTABLE_WORKSPACE_SIZE_U32 * sizeof(unsigned))
230 234 size_t HUF_buildCTable_wksp (HUF_CElt* tree,
231 const U32* count, U32 maxSymbolValue, U32 maxNbBits,
235 const unsigned* count, U32 maxSymbolValue, U32 maxNbBits,
232 236 void* workSpace, size_t wkspSize);
233 237
234 238 /*! HUF_readStats() :
@@ -277,14 +281,22 b' U32 HUF_selectDecoder (size_t dstSize, s'
277 281 #define HUF_DECOMPRESS_WORKSPACE_SIZE (2 << 10)
278 282 #define HUF_DECOMPRESS_WORKSPACE_SIZE_U32 (HUF_DECOMPRESS_WORKSPACE_SIZE / sizeof(U32))
279 283
284 #ifndef HUF_FORCE_DECOMPRESS_X2
280 285 size_t HUF_readDTableX1 (HUF_DTable* DTable, const void* src, size_t srcSize);
281 286 size_t HUF_readDTableX1_wksp (HUF_DTable* DTable, const void* src, size_t srcSize, void* workSpace, size_t wkspSize);
287 #endif
288 #ifndef HUF_FORCE_DECOMPRESS_X1
282 289 size_t HUF_readDTableX2 (HUF_DTable* DTable, const void* src, size_t srcSize);
283 290 size_t HUF_readDTableX2_wksp (HUF_DTable* DTable, const void* src, size_t srcSize, void* workSpace, size_t wkspSize);
291 #endif
284 292
285 293 size_t HUF_decompress4X_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable);
294 #ifndef HUF_FORCE_DECOMPRESS_X2
286 295 size_t HUF_decompress4X1_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable);
296 #endif
297 #ifndef HUF_FORCE_DECOMPRESS_X1
287 298 size_t HUF_decompress4X2_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable);
299 #endif
288 300
289 301
290 302 /* ====================== */
@@ -306,24 +318,36 b' size_t HUF_compress1X_repeat(void* dst, '
306 318 HUF_CElt* hufTable, HUF_repeat* repeat, int preferRepeat, int bmi2);
307 319
308 320 size_t HUF_decompress1X1 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /* single-symbol decoder */
321 #ifndef HUF_FORCE_DECOMPRESS_X1
309 322 size_t HUF_decompress1X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /* double-symbol decoder */
323 #endif
310 324
311 325 size_t HUF_decompress1X_DCtx (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);
312 326 size_t HUF_decompress1X_DCtx_wksp (HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize);
327 #ifndef HUF_FORCE_DECOMPRESS_X2
313 328 size_t HUF_decompress1X1_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< single-symbol decoder */
314 329 size_t HUF_decompress1X1_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< single-symbol decoder */
330 #endif
331 #ifndef HUF_FORCE_DECOMPRESS_X1
315 332 size_t HUF_decompress1X2_DCtx(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< double-symbols decoder */
316 333 size_t HUF_decompress1X2_DCtx_wksp(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize); /**< double-symbols decoder */
334 #endif
317 335
318 336 size_t HUF_decompress1X_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable); /**< automatic selection of single or double symbol decoder, based on DTable */
337 #ifndef HUF_FORCE_DECOMPRESS_X2
319 338 size_t HUF_decompress1X1_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable);
339 #endif
340 #ifndef HUF_FORCE_DECOMPRESS_X1
320 341 size_t HUF_decompress1X2_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable);
342 #endif
321 343
322 344 /* BMI2 variants.
323 345 * If the CPU has BMI2 support, pass bmi2=1, otherwise pass bmi2=0.
324 346 */
325 347 size_t HUF_decompress1X_usingDTable_bmi2(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable, int bmi2);
348 #ifndef HUF_FORCE_DECOMPRESS_X2
326 349 size_t HUF_decompress1X1_DCtx_wksp_bmi2(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize, int bmi2);
350 #endif
327 351 size_t HUF_decompress4X_usingDTable_bmi2(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUF_DTable* DTable, int bmi2);
328 352 size_t HUF_decompress4X_hufOnly_wksp_bmi2(HUF_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize, void* workSpace, size_t wkspSize, int bmi2);
329 353
@@ -39,6 +39,10 b' extern "C" {'
39 39 # define MEM_STATIC static /* this version may generate warnings for unused static functions; disable the relevant warning */
40 40 #endif
41 41
42 #ifndef __has_builtin
43 # define __has_builtin(x) 0 /* compat. with non-clang compilers */
44 #endif
45
42 46 /* code only tested on 32 and 64 bits systems */
43 47 #define MEM_STATIC_ASSERT(c) { enum { MEM_static_assert = 1/(int)(!!(c)) }; }
44 48 MEM_STATIC void MEM_check(void) { MEM_STATIC_ASSERT((sizeof(size_t)==4) || (sizeof(size_t)==8)); }
@@ -198,7 +202,8 b' MEM_STATIC U32 MEM_swap32(U32 in)'
198 202 {
199 203 #if defined(_MSC_VER) /* Visual Studio */
200 204 return _byteswap_ulong(in);
201 #elif defined (__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__ >= 403)
205 #elif (defined (__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__ >= 403)) \
206 || (defined(__clang__) && __has_builtin(__builtin_bswap32))
202 207 return __builtin_bswap32(in);
203 208 #else
204 209 return ((in << 24) & 0xff000000 ) |
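Note on ``MEM_swap32`` above: when no compiler builtin is available, the function falls back to a shift-and-mask byte swap, of which only the first term is visible in this hunk. The sketch below shows the usual four-term form of a 32-bit byte swap. The three terms after the first are an assumption based on the standard technique, not text copied from the file, and ``swap32`` and ``swap32_reference`` are hypothetical names.

```python
import struct


def swap32(value):
    """Reverse the byte order of a 32-bit unsigned integer via shifts and masks."""
    value &= 0xFFFFFFFF
    return (((value << 24) & 0xFF000000) |
            ((value << 8) & 0x00FF0000) |
            ((value >> 8) & 0x0000FF00) |
            ((value >> 24) & 0x000000FF))


def swap32_reference(value):
    """Same result obtained by repacking with the opposite endianness."""
    return struct.unpack('<I', struct.pack('>I', value))[0]


swapped = swap32(0x11223344)
```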
@@ -212,7 +217,8 b' MEM_STATIC U64 MEM_swap64(U64 in)'
212 217 {
213 218 #if defined(_MSC_VER) /* Visual Studio */
214 219 return _byteswap_uint64(in);
215 #elif defined (__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__ >= 403)
220 #elif (defined (__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__ >= 403)) \
221 || (defined(__clang__) && __has_builtin(__builtin_bswap64))
216 222 return __builtin_bswap64(in);
217 223 #else
218 224 return ((in << 56) & 0xff00000000000000ULL) |
@@ -88,8 +88,8 b' static void* POOL_thread(void* opaque) {'
88 88 ctx->numThreadsBusy++;
89 89 ctx->queueEmpty = ctx->queueHead == ctx->queueTail;
90 90 /* Unlock the mutex, signal a pusher, and run the job */
91 ZSTD_pthread_cond_signal(&ctx->queuePushCond);
91 92 ZSTD_pthread_mutex_unlock(&ctx->queueMutex);
92 ZSTD_pthread_cond_signal(&ctx->queuePushCond);
93 93
94 94 job.function(job.opaque);
95 95
@@ -30,8 +30,10 b' const char* ZSTD_versionString(void) { r'
30 30 /*-****************************************
31 31 * ZSTD Error Management
32 32 ******************************************/
33 #undef ZSTD_isError /* defined within zstd_internal.h */
33 34 /*! ZSTD_isError() :
34 * tells if a return value is an error code */
35 * tells if a return value is an error code
36 * symbol is required for external callers */
35 37 unsigned ZSTD_isError(size_t code) { return ERR_isError(code); }
36 38
37 39 /*! ZSTD_getErrorName() :
@@ -72,6 +72,7 b' typedef enum {'
72 72 ZSTD_error_workSpace_tooSmall= 66,
73 73 ZSTD_error_dstSize_tooSmall = 70,
74 74 ZSTD_error_srcSize_wrong = 72,
75 ZSTD_error_dstBuffer_null = 74,
75 76 /* following error codes are __NOT STABLE__, they can be removed or changed in future versions */
76 77 ZSTD_error_frameIndex_tooLarge = 100,
77 78 ZSTD_error_seekableIO = 102,
@@ -41,6 +41,9 b' extern "C" {'
41 41
42 42 /* ---- static assert (debug) --- */
43 43 #define ZSTD_STATIC_ASSERT(c) DEBUG_STATIC_ASSERT(c)
44 #define ZSTD_isError ERR_isError /* for inlining */
45 #define FSE_isError ERR_isError
46 #define HUF_isError ERR_isError
44 47
45 48
46 49 /*-*************************************
@@ -75,7 +78,6 b' static const U32 repStartValue[ZSTD_REP_'
75 78 #define BIT0 1
76 79
77 80 #define ZSTD_WINDOWLOG_ABSOLUTEMIN 10
78 #define ZSTD_WINDOWLOG_DEFAULTMAX 27 /* Default maximum allowed window log */
79 81 static const size_t ZSTD_fcs_fieldSize[4] = { 0, 2, 4, 8 };
80 82 static const size_t ZSTD_did_fieldSize[4] = { 0, 1, 2, 4 };
81 83
@@ -242,7 +244,7 b' typedef struct {'
242 244 blockType_e blockType;
243 245 U32 lastBlock;
244 246 U32 origSize;
245 } blockProperties_t;
247 } blockProperties_t; /* declared here for decompress and fullbench */
246 248
247 249 /*! ZSTD_getcBlockSize() :
248 250 * Provides the size of compressed block from block header `src` */
@@ -250,6 +252,13 b' typedef struct {'
250 252 size_t ZSTD_getcBlockSize(const void* src, size_t srcSize,
251 253 blockProperties_t* bpPtr);
252 254
255 /*! ZSTD_decodeSeqHeaders() :
256 * decode sequence header from src */
257 /* Used by: decompress, fullbench (does not get its definition from here) */
258 size_t ZSTD_decodeSeqHeaders(ZSTD_DCtx* dctx, int* nbSeqPtr,
259 const void* src, size_t srcSize);
260
261
253 262 #if defined (__cplusplus)
254 263 }
255 264 #endif
@@ -115,7 +115,7 b' size_t FSE_buildCTable_wksp(FSE_CTable* '
115 115 /* symbol start positions */
116 116 { U32 u;
117 117 cumul[0] = 0;
118 for (u=1; u<=maxSymbolValue+1; u++) {
118 for (u=1; u <= maxSymbolValue+1; u++) {
119 119 if (normalizedCounter[u-1]==-1) { /* Low proba symbol */
120 120 cumul[u] = cumul[u-1] + 1;
121 121 tableSymbol[highThreshold--] = (FSE_FUNCTION_TYPE)(u-1);
@@ -658,7 +658,7 b' size_t FSE_compress_wksp (void* dst, siz'
658 658 BYTE* op = ostart;
659 659 BYTE* const oend = ostart + dstSize;
660 660
661 U32 count[FSE_MAX_SYMBOL_VALUE+1];
661 unsigned count[FSE_MAX_SYMBOL_VALUE+1];
662 662 S16 norm[FSE_MAX_SYMBOL_VALUE+1];
663 663 FSE_CTable* CTable = (FSE_CTable*)workSpace;
664 664 size_t const CTableSize = FSE_CTABLE_SIZE_U32(tableLog, maxSymbolValue);
@@ -672,7 +672,7 b' size_t FSE_compress_wksp (void* dst, siz'
672 672 if (!tableLog) tableLog = FSE_DEFAULT_TABLELOG;
673 673
674 674 /* Scan input and build symbol stats */
675 { CHECK_V_F(maxCount, HIST_count_wksp(count, &maxSymbolValue, src, srcSize, (unsigned*)scratchBuffer) );
675 { CHECK_V_F(maxCount, HIST_count_wksp(count, &maxSymbolValue, src, srcSize, scratchBuffer, scratchBufferSize) );
676 676 if (maxCount == srcSize) return 1; /* only a single symbol in src : rle */
677 677 if (maxCount == 1) return 0; /* each symbol present maximum once => not compressible */
678 678 if (maxCount < (srcSize >> 7)) return 0; /* Heuristic : not compressible enough */
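Note on the three early returns above: after building the histogram, ``FSE_compress_wksp`` bails out when the input is a single repeated symbol, when no symbol repeats, or when the most frequent symbol is rarer than ``srcSize >> 7``. A minimal sketch of that decision in Python follows. The function name, the string labels, and the empty-input guard are assumptions made for illustration; the C code returns ``1`` or ``0`` instead.

```python
from collections import Counter


def classify_for_entropy_coding(src):
    """Mirror the three heuristics applied to the largest symbol count."""
    if not src:
        return 'skip'  # assumption: empty input is handled elsewhere in the C code

    max_count = max(Counter(src).values())
    src_size = len(src)

    if max_count == src_size:
        return 'rle'                # only a single symbol in src
    if max_count == 1:
        return 'not compressible'   # each symbol present at most once
    if max_count < (src_size >> 7):
        return 'not compressible'   # heuristic: distribution too flat
    return 'try compression'


single_symbol = classify_for_entropy_coding(b'aaaa')
all_distinct = classify_for_entropy_coding(bytes(range(256)))
too_flat = classify_for_entropy_coding(bytes(range(256)) * 2)
skewed = classify_for_entropy_coding(b'aab')
```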
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: file copied from mercurial/util.py to mercurial/utils/compression.py
The requested commit or file is too big and content was truncated.
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
The requested commit or file is too big and content was truncated. Show full diff
1 NO CONTENT: modified file
1 NO CONTENT: file was removed