util: implement zstd compression engine

Now that zstd is vendored and being built (in some configurations), we can implement a compression engine for zstd!

The zstd engine is a little different from existing engines. Because it may not always be present, we have to defer loading the module in case importing it fails. We facilitate this via a cached property that holds a reference to the module or None. The "available" method is implemented to reflect reality.

The zstd engine declares its ability to handle bundles using the "zstd" human name and the "ZS" internal name. The latter was chosen because internal names are 2 characters (only by convention, I think) and "ZS" seems reasonable.

The engine, like others, supports specifying the compression level. However, no consumers of this API pass in that argument yet. I have plans to change that, so stay tuned.

Since all we need to do to support bundle generation with a new compression engine is implement and register the compression engine, bundle generation with zstd "just works!" Tests demonstrating this have been added.

How does performance of zstd for bundle generation compare?
On the mozilla-unified repo, `hg bundle --all -t <engine>-v2` yields the following on my i7-6700K on Linux:

engine         CPU time     bundle size   vs orig size   throughput
none             97.0s    4,054,405,584      100.0%       41.8 MB/s
bzip2 (l=9)     393.6s      975,343,098       24.0%       10.3 MB/s
gzip (l=6)      184.0s    1,140,533,074       28.1%       22.0 MB/s
zstd (l=1)      108.2s    1,119,434,718       27.6%       37.5 MB/s
zstd (l=2)      111.3s    1,078,328,002       26.6%       36.4 MB/s
zstd (l=3)      113.7s    1,011,823,727       25.0%       35.7 MB/s
zstd (l=4)      116.0s    1,008,965,888       24.9%       35.0 MB/s
zstd (l=5)      121.0s      977,203,148       24.1%       33.5 MB/s
zstd (l=6)      131.7s      927,360,198       22.9%       30.8 MB/s
zstd (l=7)      139.0s      912,808,505       22.5%       29.2 MB/s
zstd (l=12)     198.1s      854,527,714       21.1%       20.5 MB/s
zstd (l=18)     681.6s      789,750,690       19.5%        5.9 MB/s

On compression, zstd for bundle generation delivers:

* better compression than gzip with significantly less CPU utilization
* better compression ratios than bzip2 while still being significantly faster than gzip
* ability to aggressively tune compression level to achieve significantly smaller bundles

That last point is important. With clone bundles, a server can pre-generate a bundle file, upload it to a static file server, and redirect clients to transparently download it during clone. The server could choose to produce a zstd bundle with the highest compression settings possible. This would take a very long time - an order of magnitude longer than a typical zstd bundle generation - but the result would be hundreds of megabytes smaller! For the clone volume we do at Mozilla, this could translate to petabytes of bandwidth savings per year and faster clones (due to smaller transfer size).

I don't have detailed numbers to report on decompression. However, zstd decompression is fast: >1 GB/s output throughput on this machine, even through the Python bindings. And it can do that regardless of the compression level of the input. By the time you have enough data to worry about the overhead of decompression, you have plenty of other things to worry about performance-wise.
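As a sanity check on the table above, the throughput column appears to be the uncompressed bundle size divided by CPU time, in decimal megabytes per second, and "vs orig size" the compressed size over the uncompressed size. That reading is an inference from the numbers, not something the commit message states. A small sketch under that assumption, with a few rows copied from the table (the helper names are made up for illustration):

```python
# Uncompressed bundle size in bytes (the "none" row of the table).
ORIG_SIZE = 4_054_405_584

# engine -> (CPU seconds, bundle size in bytes), copied from the table.
ROWS = {
    "none": (97.0, 4_054_405_584),
    "gzip (l=6)": (184.0, 1_140_533_074),
    "zstd (l=3)": (113.7, 1_011_823_727),
    "zstd (l=18)": (681.6, 789_750_690),
}


def throughput_mb_s(cpu_seconds, input_bytes=ORIG_SIZE):
    """Input bytes consumed per CPU second, in units of 10**6 bytes."""
    return input_bytes / cpu_seconds / 1e6


def ratio_percent(bundle_bytes, input_bytes=ORIG_SIZE):
    """Compressed size as a percentage of the uncompressed size."""
    return 100.0 * bundle_bytes / input_bytes


if __name__ == "__main__":
    for engine, (cpu, size) in ROWS.items():
        print("%-12s %5.1f MB/s  %5.1f%%"
              % (engine, throughput_mb_s(cpu), ratio_percent(size)))
```

Read this way, the throughput figures measure how fast input is consumed, not how fast compressed output is produced.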
zstd wins all around. I can't wait to implement support for it on the wire protocol and in revlogs.
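The deferred-load approach described in the commit message (a cached property holding the module or None, plus an "available" method) could look roughly like the sketch below. This is not Mercurial's actual code: the class name `LazyEngine`, its constructor arguments, and the use of `functools.cached_property` (Python 3.8+) are assumptions made for illustration. The module name is a parameter so the same class can be exercised with a module that exists (`zlib`) and one that does not.

```python
import importlib
from functools import cached_property


class LazyEngine:
    """Hypothetical compression engine whose backing module may be absent."""

    def __init__(self, name, bundle_id, module_name):
        self.name = name                 # human name, e.g. "zstd"
        self.bundle_id = bundle_id       # 2-character internal name, e.g. "ZS"
        self._module_name = module_name  # module to import lazily

    @cached_property
    def _module(self):
        # Imported on first access only; the result (module or None) is
        # cached on the instance so the import is never retried.
        try:
            return importlib.import_module(self._module_name)
        except ImportError:
            return None

    def available(self):
        """Report whether the backing module could actually be imported."""
        return self._module is not None

    def compress(self, data, level=3):
        # Assumes the backing module exposes compress(data, level),
        # which zlib does; a real zstd binding may differ.
        if not self.available():
            raise RuntimeError("%s engine is not available" % self.name)
        return self._module.compress(data, level)
```

With this shape, registering the engine is always safe; callers check `available()` before offering the format, and an installation without the module simply never advertises it.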

File last commit:

r27878:e7bd55db default
r30442:41a81067 default
buildrpm
161 lines | 4.0 KiB | text/plain | TextLexer
Mathias De Maré
buildrpm: use bash shebang, since we use bash features in the script...
r27878 #!/bin/bash -e
mpm@selenic.com
[PATCH] Add contrib/buildrpm script...
r564 #
Mads Kiilerich
buildrpm: various minor cleanup
r21638 # Build a Mercurial RPM from the current repo
mpm@selenic.com
[PATCH] Add contrib/buildrpm script...
r564 #
Mads Kiilerich
contrib/buildrpm: Support python 2.4 and 2.6
r8867 # Tested on
Mads Kiilerich
buildrpm: various minor cleanup
r21638 # - Fedora 20
# - CentOS 5
# - CentOS 6
mpm@selenic.com
[PATCH] Add contrib/buildrpm script...
r564
Augie Fackler
packaging: extract packagelib for common code from builddeb and buildrpm
r24972 . $(dirname $0)/packagelib.sh
Mads Kiilerich
buildrpm: introduce --prepare for preparing without actually building rpms
r22435 BUILD=1
Mads Kiilerich
buildrpm: introduce --rpmdir instead of using hardcoded rpmbuild dir...
r22437 RPMBUILDDIR="$PWD/rpmbuild"
Mads Kiilerich
buildrpm: introduce --prepare for preparing without actually building rpms
r22435 while [ "$1" ]; do
case "$1" in
--prepare )
shift
BUILD=
;;
Mads Kiilerich
buildrpm: introduce --withpython for building rpms that includes Python 2.7
r22436 --withpython | --with-python)
shift
Mads Kiilerich
contrib: offer Python 2.7.10
r26735 PYTHONVER=2.7.10
PYTHONMD5=d7547558fd673bd9d38e2108c6b42521
Mads Kiilerich
buildrpm: introduce --withpython for building rpms that includes Python 2.7
r22436 ;;
Mads Kiilerich
buildrpm: introduce --rpmdir instead of using hardcoded rpmbuild dir...
r22437 --rpmbuilddir )
shift
RPMBUILDDIR="$1"
shift
;;
Mads Kiilerich
buildrpm: introduce --prepare for preparing without actually building rpms
r22435 * )
echo "Invalid parameter $1!" 1>&2
exit 1
;;
esac
done
Gilles Moris
buildrpm: enable to start the script from anywhere...
r9811 cd "`dirname $0`/.."
Mads Kiilerich
buildrpm: complain when hg command isn't available...
r7431
Mads Kiilerich
buildrpm: introduce --withpython for building rpms that includes Python 2.7
r22436 specfile=$PWD/contrib/mercurial.spec
Gilles Moris
buildrpm: cleanup script
r9812 if [ ! -f $specfile ]; then
echo "Cannot find $specfile!" 1>&2
exit 1
fi
mpm@selenic.com
[PATCH] Add contrib/buildrpm script...
r564
Gilles Moris
buildrpm: cleanup script
r9812 if [ ! -d .hg ]; then
mpm@selenic.com
[PATCH] Add contrib/buildrpm script...
r564 echo 'You are not inside a Mercurial repository!' 1>&2
exit 1
fi
Augie Fackler
packaging: extract packagelib for common code from builddeb and buildrpm
r24972 gethgversion
Gilles Moris
buildrpm: build from working dir parent and use hg version for RPM versioning...
r9809
Augie Fackler
packaging: rework version detection and declaration (issue4912)...
r26833 # TODO: handle distance/node set, and type set
if [ -z "$type" ] ; then
release=1
else
release=0.9_$type
fi
if [ -n "$distance" ] ; then
release=$release+${distance}_$node
fi
Mads Kiilerich
buildrpm: introduce --withpython for building rpms that includes Python 2.7
r22436 if [ "$PYTHONVER" ]; then
release=$release+$PYTHONVER
RPMPYTHONVER=$PYTHONVER
else
RPMPYTHONVER=%{nil}
fi
mpm@selenic.com
[PATCH] Add contrib/buildrpm script...
r564
Mathias De Maré
buildrpm: move creation of RPM directories from dockerrpm...
r27788 mkdir -p $RPMBUILDDIR/{SOURCES,BUILD,SRPMS,RPMS}
Mads Kiilerich
buildrpm: introduce --rpmdir instead of using hardcoded rpmbuild dir...
r22437 $HG archive -t tgz $RPMBUILDDIR/SOURCES/mercurial-$version-$release.tar.gz
Mads Kiilerich
buildrpm: introduce --withpython for building rpms that includes Python 2.7
r22436 if [ "$PYTHONVER" ]; then
(
Mads Kiilerich
rpms: create missing builds dir if it doesn't exist
r24730 mkdir -p build
Mads Kiilerich
buildrpm: introduce --withpython for building rpms that includes Python 2.7
r22436 cd build
PYTHON_SRCFILE=Python-$PYTHONVER.tgz
[ -f $PYTHON_SRCFILE ] || curl -Lo $PYTHON_SRCFILE http://www.python.org/ftp/python/$PYTHONVER/$PYTHON_SRCFILE
Mads Kiilerich
contrib: buildrpm checking of md5 checksums of downloaded Python and Docutils
r23141 if [ "$PYTHONMD5" ]; then
echo "$PYTHONMD5 $PYTHON_SRCFILE" | md5sum -w -c
fi
Mads Kiilerich
buildrpm: introduce --rpmdir instead of using hardcoded rpmbuild dir...
r22437 ln -f $PYTHON_SRCFILE $RPMBUILDDIR/SOURCES/$PYTHON_SRCFILE
Mads Kiilerich
buildrpm: introduce --withpython for building rpms that includes Python 2.7
r22436
DOCUTILSVER=`sed -ne "s/^%global docutilsname docutils-//p" $specfile`
DOCUTILS_SRCFILE=docutils-$DOCUTILSVER.tar.gz
[ -f $DOCUTILS_SRCFILE ] || curl -Lo $DOCUTILS_SRCFILE http://downloads.sourceforge.net/project/docutils/docutils/$DOCUTILSVER/$DOCUTILS_SRCFILE
Mads Kiilerich
contrib: buildrpm checking of md5 checksums of downloaded Python and Docutils
r23141 DOCUTILSMD5=`sed -ne "s/^%global docutilsmd5 //p" $specfile`
if [ "$DOCUTILSMD5" ]; then
echo "$DOCUTILSMD5 $DOCUTILS_SRCFILE" | md5sum -w -c
fi
Mads Kiilerich
buildrpm: introduce --rpmdir instead of using hardcoded rpmbuild dir...
r22437 ln -f $DOCUTILS_SRCFILE $RPMBUILDDIR/SOURCES/$DOCUTILS_SRCFILE
Mads Kiilerich
buildrpm: introduce --withpython for building rpms that includes Python 2.7
r22436 )
fi
Augie Fackler
buildrpm: mkdir -p two needed directories (issue4779)...
r26139 mkdir -p $RPMBUILDDIR/SPECS
Mads Kiilerich
buildrpm: introduce --rpmdir instead of using hardcoded rpmbuild dir...
r22437 rpmspec=$RPMBUILDDIR/SPECS/mercurial.spec
Gilles Moris
buildrpm: build full RPM package including sources
r9813
Gilles Moris
buildrpm: cleanup script
r9812 sed -e "s,^Version:.*,Version: $version," \
mpm@selenic.com
[PATCH] Add contrib/buildrpm script...
r564 -e "s,^Release:.*,Release: $release," \
Gilles Moris
buildrpm: build full RPM package including sources
r9813 $specfile > $rpmspec
mpm@selenic.com
[PATCH] Add contrib/buildrpm script...
r564
Gilles Moris
buildrpm: enhance changelog of the RPM file...
r9814 echo >> $rpmspec
echo "%changelog" >> $rpmspec
if echo $version | grep '+' > /dev/null 2>&1; then
latesttag="`echo $version | sed -e 's/+.*//'`"
$HG log -r .:"$latesttag" -fM \
--template '{date|hgdate}\t{author}\t{desc|firstline}\n' | python -c '
import sys, time
def datestr(date, format):
    return time.strftime(format, time.gmtime(float(date[0]) - date[1]))
Adam Spiers
buildrpm: auto-generate %changelog in .spec file...
r4754
Gilles Moris
buildrpm: enhance changelog of the RPM file...
r9814 changelog = []
for l in sys.stdin.readlines():
    tok = l.split("\t")
    hgdate = tuple(int(v) for v in tok[0].split())
    changelog.append((datestr(hgdate, "%F"), tok[1], hgdate, tok[2]))
prevtitle = ""
for l in sorted(changelog, reverse=True):
    title = "* %s %s" % (datestr(l[2], "%a %b %d %Y"), l[1])
    if prevtitle != title:
        prevtitle = title
        print
        print title
    print "- %s" % l[3].strip()
' >> $rpmspec
else
$HG log \
--template '{date|hgdate}\t{author}\t{desc|firstline}\n' \
.hgtags | python -c '
import sys, time
def datestr(date, format):
    return time.strftime(format, time.gmtime(float(date[0]) - date[1]))
for l in sys.stdin.readlines():
    tok = l.split("\t")
    hgdate = tuple(int(v) for v in tok[0].split())
    print "* %s %s\n- %s" % (datestr(hgdate, "%a %b %d %Y"), tok[1], tok[2])
' >> $rpmspec
fi
Adam Spiers
buildrpm: auto-generate %changelog in .spec file...
r4754
Mads Kiilerich
buildrpm: introduce --withpython for building rpms that includes Python 2.7
r22436 sed -i \
-e "s/^%define withpython.*$/%define withpython $RPMPYTHONVER/" \
$rpmspec
Mads Kiilerich
buildrpm: introduce --prepare for preparing without actually building rpms
r22435 if [ "$BUILD" ]; then
Mads Kiilerich
buildrpm: introduce --rpmdir instead of using hardcoded rpmbuild dir...
r22437 rpmbuild --define "_topdir $RPMBUILDDIR" -ba $rpmspec --clean
Mads Kiilerich
buildrpm: introduce --prepare for preparing without actually building rpms
r22435 if [ $? = 0 ]; then
echo
echo "Built packages for $version-$release:"
Mads Kiilerich
buildrpm: introduce --rpmdir instead of using hardcoded rpmbuild dir...
r22437 find $RPMBUILDDIR/*RPMS/ -type f -newer $rpmspec
Mads Kiilerich
buildrpm: introduce --prepare for preparing without actually building rpms
r22435 fi
else
Mads Kiilerich
buildrpm: introduce --rpmdir instead of using hardcoded rpmbuild dir...
r22437 echo "Prepared sources for $version-$release $rpmspec are in $RPMBUILDDIR/SOURCES/ - use like:"
echo "rpmbuild --define '_topdir $RPMBUILDDIR' -ba $rpmspec --clean"
mpm@selenic.com
[PATCH] Add contrib/buildrpm script...
r564 fi
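The changelog generator embedded in the script above is Python 2 (bare `print` statements). For readers on Python 3, the grouping logic of the first branch might be sketched as follows. It assumes the same tab-separated `{date|hgdate}`, `{author}`, `{desc|firstline}` input that the `hg log` template produces. The function name `format_changelog` is hypothetical, and `"%Y-%m-%d"` stands in for the script's `"%F"`, which is not portable across platforms.

```python
import time


def datestr(hgdate, fmt):
    """Format a (unixtime, tz_offset) pair the way the script does."""
    unixtime, offset = hgdate
    return time.strftime(fmt, time.gmtime(float(unixtime) - offset))


def format_changelog(lines):
    """Build RPM %changelog text from tab-separated date/author/summary lines."""
    entries = []
    for line in lines:
        date, author, summary = line.rstrip("\n").split("\t", 2)
        hgdate = tuple(int(v) for v in date.split())
        # The ISO day goes first so sorting groups entries by day, then author.
        entries.append((datestr(hgdate, "%Y-%m-%d"), author, hgdate, summary))

    out = []
    prevtitle = ""
    for _day, author, hgdate, summary in sorted(entries, reverse=True):
        title = "* %s %s" % (datestr(hgdate, "%a %b %d %Y"), author)
        if title != prevtitle:
            # New day or new author: start a new changelog stanza.
            prevtitle = title
            out.append("")
            out.append(title)
        out.append("- %s" % summary.strip())
    return "\n".join(out)
```

Consecutive commits by the same author on the same day share one `* date author` header, which matches what the `prevtitle` check in the original script is doing.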