upstream/mercurial-mirror Files · mercurial/revlogutils/sidedata.py

sidedata: replace sidedata upgrade mechanism with the new one

Note: this is split into a separate change (like some other patches in this series) because it's not easy to have all patches work 100% and this seemed easier for reviewers.

When cloning or upgrading a repo, we may need to compute (or remove) sidedata. This is the same mechanism that is used in exchange, so we re-use the new system to simplify the code and fix the remaining issues (correctly dropping flags and handling partial removal, etc.).

This also highlighted an issue with `test-copies-in-changeset.t` that kept sidedata categories that are not relevant anymore. They should probably be dropped entirely, but that would be for another patch.

Differential Revision: https://phab.mercurial-scm.org/D10359

Raphaël Gomès

File last commit: r47443:3d740058 default

r47847:27f1191b default

      # sidedata.py - Logic around storing extra data alongside revlog revisions
      #
      # Copyright 2019 Pierre-Yves David <pierre-yves.david@octobus.net>
      #
      # This software may be used and distributed according to the terms of the
      # GNU General Public License version 2 or any later version.
      """core code for "sidedata" support

      The "sidedata" are stored alongside the revision without actually being part of
      its content and not affecting its hash. Its main use case is to cache
      important information related to a changeset.

      The current implementation is experimental and subject to change. Do not rely
      on it in production.

      Sidedata are stored in the revlog itself, thanks to a new version of the
      revlog. The following format is currently used::

          initial header:
              <number of sidedata; 2 bytes>
          sidedata entries (repeated N times):
              <sidedata-key; 2 bytes>
              <sidedata-entry-length; 4 bytes>
              <sidedata-content-sha1-digest; 20 bytes>
          sidedata contents (repeated N times, in the same order as the entries):
              <sidedata-content; X bytes>

      This is a simple and effective format. It should be enough to experiment with
      the concept.
      """

      from __future__ import absolute_import

      import struct

      from .. import error
      from ..utils import hashutil

      ## sidedata type constant
      # reserve a block for testing purposes.
      SD_TEST1 = 1
      SD_TEST2 = 2
      SD_TEST3 = 3
      SD_TEST4 = 4
      SD_TEST5 = 5
      SD_TEST6 = 6
      SD_TEST7 = 7

      # key to store copies related information
      SD_P1COPIES = 8
      SD_P2COPIES = 9
      SD_FILESADDED = 10
      SD_FILESREMOVED = 11
      SD_FILES = 12

      # internal format constant
      # header: number of entries (big-endian, 2 bytes)
      SIDEDATA_HEADER = struct.Struct('>H')
      # entry: key (2 bytes), content length (4 bytes), SHA-1 digest (20 bytes)
      SIDEDATA_ENTRY = struct.Struct('>HL20s')

      def serialize_sidedata(sidedata):
          # sort by key so the output is deterministic
          sidedata = list(sidedata.items())
          sidedata.sort()
          buf = [SIDEDATA_HEADER.pack(len(sidedata))]
          # first pass: one fixed-size entry (key, length, digest) per item
          for key, value in sidedata:
              digest = hashutil.sha1(value).digest()
              buf.append(SIDEDATA_ENTRY.pack(key, len(value), digest))
          # second pass: the contents themselves, in the same order
          for key, value in sidedata:
              buf.append(value)
          buf = b''.join(buf)
          return buf

      def deserialize_sidedata(blob):
          sidedata = {}
          offset = 0
          (nbentry,) = SIDEDATA_HEADER.unpack(blob[: SIDEDATA_HEADER.size])
          offset += SIDEDATA_HEADER.size
          # contents start right after the header and the N fixed-size entries
          dataoffset = SIDEDATA_HEADER.size + (SIDEDATA_ENTRY.size * nbentry)
          for i in range(nbentry):
              nextoffset = offset + SIDEDATA_ENTRY.size
              key, size, storeddigest = SIDEDATA_ENTRY.unpack(blob[offset:nextoffset])
              offset = nextoffset
              # read the data associated with that entry
              nextdataoffset = dataoffset + size
              entrytext = bytes(blob[dataoffset:nextdataoffset])
              # check the content against the digest stored in its entry
              readdigest = hashutil.sha1(entrytext).digest()
              if storeddigest != readdigest:
                  raise error.SidedataHashError(key, storeddigest, readdigest)
              sidedata[key] = entrytext
              dataoffset = nextdataoffset

          return sidedata
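
The byte layout used by the two functions above (a 2-byte count, then N fixed-size entries, then N contents) can be tried outside Mercurial. What follows is a minimal standalone sketch, not the module itself. It assumes `hashlib.sha1` in place of `hashutil.sha1` and a plain `ValueError` in place of `error.SidedataHashError`; the names `pack_sidedata`, `unpack_sidedata`, `HEADER` and `ENTRY` are hypothetical and exist only for this illustration.

```python
import hashlib
import struct

# Same struct formats as SIDEDATA_HEADER and SIDEDATA_ENTRY in the module.
HEADER = struct.Struct('>H')     # number of entries
ENTRY = struct.Struct('>HL20s')  # key, content length, SHA-1 digest


def pack_sidedata(sidedata):
    """Hypothetical stand-in for serialize_sidedata (uses hashlib)."""
    items = sorted(sidedata.items())
    buf = [HEADER.pack(len(items))]
    # All fixed-size entries come first...
    for key, value in items:
        buf.append(ENTRY.pack(key, len(value), hashlib.sha1(value).digest()))
    # ...followed by all contents, in the same key order.
    for _key, value in items:
        buf.append(value)
    return b''.join(buf)


def unpack_sidedata(blob):
    """Hypothetical stand-in for deserialize_sidedata (raises ValueError)."""
    (nbentry,) = HEADER.unpack(blob[: HEADER.size])
    offset = HEADER.size
    dataoffset = HEADER.size + ENTRY.size * nbentry
    result = {}
    for _ in range(nbentry):
        key, size, stored = ENTRY.unpack(blob[offset : offset + ENTRY.size])
        offset += ENTRY.size
        content = bytes(blob[dataoffset : dataoffset + size])
        dataoffset += size
        if hashlib.sha1(content).digest() != stored:
            raise ValueError('sidedata digest mismatch for key %d' % key)
        result[key] = content
    return result


# Keys 8 and 12 mirror SD_P1COPIES and SD_FILES; the values are made up.
example = {12: b'files', 8: b'p1-copies'}
blob = pack_sidedata(example)
roundtrip = unpack_sidedata(blob)
```

With two entries the blob is 2 + 2 × 26 + 14 bytes long, and because entries are sorted by key the content for key 8 precedes the content for key 12 regardless of the dictionary's insertion order.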

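
The digest stored in each entry is what lets `deserialize_sidedata` notice damaged content. The sketch below builds a one-entry blob by hand and flips a single bit to show the check failing. It is an illustration under assumptions: `hashlib.sha1` stands in for `hashutil.sha1`, and `first_entry_is_intact` is a hypothetical helper that returns a boolean, whereas the module raises `error.SidedataHashError`.

```python
import hashlib
import struct

HEADER = struct.Struct('>H')     # number of entries
ENTRY = struct.Struct('>HL20s')  # key, content length, SHA-1 digest


def first_entry_is_intact(blob):
    """Compare the first entry's content with its stored SHA-1 digest.

    Hypothetical helper for illustration only.
    """
    (nbentry,) = HEADER.unpack(blob[: HEADER.size])
    if nbentry == 0:
        return True  # nothing to verify
    _key, size, stored = ENTRY.unpack(
        blob[HEADER.size : HEADER.size + ENTRY.size]
    )
    # Contents begin after the header and all fixed-size entries.
    start = HEADER.size + ENTRY.size * nbentry
    content = bytes(blob[start : start + size])
    return hashlib.sha1(content).digest() == stored


content = b'abc'
good = (
    HEADER.pack(1)
    + ENTRY.pack(1, len(content), hashlib.sha1(content).digest())
    + content
)
# Flip the lowest bit of the last content byte to simulate corruption.
bad = good[:-1] + bytes([good[-1] ^ 0x01])

intact = first_entry_is_intact(good)
damaged = first_entry_is_intact(bad)
```

Here `intact` is `True` and `damaged` is `False`: the entry still claims the original digest, but the content it points at no longer hashes to it.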