upstream/mercurial-mirror Files · mercurial/worker.py

repoview: discard filtered changelog if index isn't shared with unfiltered

Before this patch, revisions rolled back at the failure of a previous transaction might unintentionally be visible to subsequent operations if the repoview object is reused after the failure. The command server and the HTTP server are typical cases.

'repoview' uses the tuple of values below, taken from the unfiltered changelog, as "the key" to examine the validity of the filtered changelog cache.

- length
- tip node
- filtered revisions (as hashed value)
- '_delayed' field

'repoview' compares "the key" of the unfiltered changelog at the previous caching with the current one, and reuses the filtered changelog cache if no change is detected.

But this comparison indicates only that there is no change between the unfiltered 'repo.changelog' at the last caching and now, not that the filtered changelog cache is valid for the current unfiltered one.

'repoview' uses a "shallow copy" of the unfiltered changelog to create the filtered changelog cache. In this case, the 'index' buffer of the unfiltered changelog is also referenced by the filtered changelog.

At the failure of a transaction, the unfiltered changelog itself is invalidated (= un-referred) on the 'repo' side (see also). But its 'index' still contains the revisions to be rolled back at this failure, and is still referenced by the filtered changelog.

Therefore, even if there is no change between the unfiltered 'repo.changelog' at the last caching and now, the steps below make rolled-back revisions visible via the filtered changelog unintentionally.

1. instantiate unfiltered changelog as 'repo.changelog' (call it CL1)
2. make filtered (= shallow copy of) CL1 (call it FCL1)
3. cache FCL1 with "the key" of CL1
4. revisions are appended to 'index', which is shared by CL1 and FCL1
5. invalidate 'repo.changelog' (= CL1) at failure of transaction
6. instantiate 'repo.changelog' again at next operation (call it CL2)

   CL2 doesn't have the revisions added at (4), because it is instantiated from '00changelog.i', which isn't changed during the failed transaction.

7. compare "the key" of CL1 and CL2
8. FCL1 cached at (3) is reused, because the comparison at (7) doesn't detect any change between CL1 at (1) and CL2
9. revisions rolled back at (5) are visible via FCL1 unintentionally, because FCL1 still refers to the 'index' changed at (4)

The root cause of this issue is that the validity of the filtered changelog cache is never examined against the current unfiltered changelog.

This patch discards the filtered changelog cache if its 'index' object isn't shared with the unfiltered one.

BTW, at the time of this patch, redundant truncation of '00changelog.i' at failure of transaction (see for detail) often prevents "hg serve" from making already rolled-back revisions visible, because updating the timestamp of '00changelog.i' by truncation makes "hg serve" discard the old repoview object with its invalid filtered changelog cache.

This is the reason why this issue was overlooked before this patch, even though test-bundle2-exchange.t has tests for a similar situation: failure of "hg push" via HTTP caused by a pretxnclose hook on the server side doesn't prevent subsequent commands from looking up outgoing revisions correctly.

But timestamps on the filesystem don't have enough resolution for recent computation power, and it can't be assumed that this avoidance always works as expected. Therefore, without this patch, this issue might appear occasionally.
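The nine steps above can be reproduced with a small toy model. This is only an illustrative sketch, not Mercurial's implementation: the classes `Changelog`, `FilteredView` and `Repo`, the helper `visible_after_failed_transaction`, and the simplified two-element key (length and tip only) are all hypothetical names and simplifications introduced here.

```python
class Changelog(object):
    """Toy unfiltered changelog: 'index' is just a list of revision names."""
    def __init__(self, index):
        self.index = index

    def key(self):
        # Simplified stand-in for "the key": (length, tip) only.
        return (len(self.index), self.index[-1] if self.index else None)


class FilteredView(object):
    """Toy filtered changelog: a shallow copy that shares 'index'."""
    def __init__(self, cl):
        self.index = cl.index  # same list object, not a copy


class Repo(object):
    def __init__(self, ondisk):
        self.ondisk = ondisk                      # stands in for 00changelog.i
        self.changelog = Changelog(list(ondisk))  # CL1
        self._cache = None                        # (key, filtered changelog)

    def invalidate(self):
        # Re-instantiate the changelog from unchanged on-disk data (CL2).
        self.changelog = Changelog(list(self.ondisk))

    def filtered(self, check_shared_index):
        cl = self.changelog
        key = cl.key()
        if self._cache is not None:
            oldkey, fcl = self._cache
            shared = fcl.index is cl.index
            if oldkey == key and (shared or not check_shared_index):
                return fcl  # cache reused
        fcl = FilteredView(cl)
        self._cache = (key, fcl)
        return fcl


def visible_after_failed_transaction(check_shared_index):
    repo = Repo(['a', 'b'])
    repo.filtered(check_shared_index)   # steps 1-3: build and cache FCL1
    repo.changelog.index.append('c')    # step 4: append to the shared index
    repo.invalidate()                   # steps 5-6: transaction fails, CL2
    # steps 7-9: key comparison, possible reuse of FCL1
    return list(repo.filtered(check_shared_index).index)


print(visible_after_failed_transaction(False))  # ['a', 'b', 'c']
print(visible_after_failed_transaction(True))   # ['a', 'b']
```

Without the identity check on 'index', the rolled-back revision 'c' stays visible through the stale cache. With the check, the cache is discarded because the cached view no longer shares its 'index' with the current unfiltered changelog.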

Gregory Szorc
File last commit: r28181:f8efc8a3 default
r28265:33292621 default
worker.py | 162 lines | 4.5 KiB | text/x-python

        Bryan O'Sullivan
    
worker: count the number of CPUs...

              r18635
            
      # worker.py - master-slave parallelism support

      #

      # Copyright 2013 Facebook, Inc.

      #

      # This software may be used and distributed according to the terms of the

      # GNU General Public License version 2 or any later version.

        Gregory Szorc
    
worker: use absolute_import

              r25992
            
      from __future__ import absolute_import

      import errno

      import os

      import signal

      import sys

      import threading

      from .i18n import _

        Pierre-Yves David
    
error: get Abort from 'error' instead of 'util'...

              r26587
            
      from . import error

        Bryan O'Sullivan
    
worker: count the number of CPUs...

              r18635
            
      def countcpus():

          '''try to count the number of CPUs on the system'''

        Gregory Szorc
    
worker: restore old countcpus code (issue4869)...

              r26568
            
          # posix

        Bryan O'Sullivan
    
worker: count the number of CPUs...

              r18635
            
          try:

        Gregory Szorc
    
worker: restore old countcpus code (issue4869)...

              r26568
            
              n = int(os.sysconf('SC_NPROCESSORS_ONLN'))

              if n > 0:

                  return n

          except (AttributeError, ValueError):

              pass

          # windows

          try:

              n = int(os.environ['NUMBER_OF_PROCESSORS'])

              if n > 0:

                  return n

          except (KeyError, ValueError):

              pass

          return 1

        Bryan O'Sullivan
    
worker: estimate whether it's worth running a task in parallel...

              r18636
            
      def _numworkers(ui):

          s = ui.config('worker', 'numcpus')

          if s:

              try:

                  n = int(s)

                  if n >= 1:

                      return n

              except ValueError:

        Pierre-Yves David
    
error: get Abort from 'error' instead of 'util'...

              r26587
            
                  raise error.Abort(_('number of cpus must be an integer'))

        Bryan O'Sullivan
    
worker: estimate whether it's worth running a task in parallel...

              r18636
            
          return min(max(countcpus(), 4), 32)

      if os.name == 'posix':

          _startupcost = 0.01

      else:

          _startupcost = 1e30

      def worthwhile(ui, costperop, nops):

          '''try to determine whether the benefit of multiple processes can

          outweigh the cost of starting them'''

          linear = costperop * nops

          workers = _numworkers(ui)

          benefit = linear - (_startupcost * workers + linear / workers)

          return benefit >= 0.15
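As a worked example of the cost model in worthwhile(), the sketch below repeats the same arithmetic with the worker count and startup cost passed in explicitly rather than read from `ui` and `os.name`. The name `worthwhile_sketch` and its extra parameters are assumptions made for illustration, and the sample costs are invented numbers.

```python
def worthwhile_sketch(costperop, nops, workers, startupcost=0.01):
    """Same formula as worthwhile(), with explicit inputs (illustrative)."""
    linear = costperop * nops  # estimated cost of running serially
    # estimated parallel cost: process startup plus an even share of the work
    parallel = startupcost * workers + linear / workers
    benefit = linear - parallel
    return benefit >= 0.15


# 10000 ops at 0.001 each: 10 - (0.04 + 2.5) = 7.46, so parallelism pays off
print(worthwhile_sketch(0.001, 10000, 4))                    # True
# 100 ops at 0.001 each: 0.1 - (0.04 + 0.025) = 0.035, below the threshold
print(worthwhile_sketch(0.001, 100, 4))                      # False
# a startup cost of 1e30 (the non-posix value) never pays off
print(worthwhile_sketch(0.001, 10000, 4, startupcost=1e30))  # False
```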

        Bryan O'Sullivan
    
worker: partition a list (of tasks) into equal-sized chunks

              r18637
            
        Bryan O'Sullivan
    
worker: allow a function to be run in multiple worker processes...

              r18638
            
      def worker(ui, costperarg, func, staticargs, args):

          '''run a function, possibly in parallel in multiple worker

          processes.

          returns a progress iterator

          costperarg - cost of a single task

          func - function to run

          staticargs - arguments to pass to every invocation of the function

          args - arguments to split into chunks, to pass to individual

          workers

          '''

          if worthwhile(ui, costperarg, len(args)):

              return _platformworker(ui, func, staticargs, args)

          return func(*staticargs + (args,))

      def _posixworker(ui, func, staticargs, args):

          rfd, wfd = os.pipe()

          workers = _numworkers(ui)

        Bryan O'Sullivan
    
worker: fix a race in SIGINT handling...

              r18708
            
          oldhandler = signal.getsignal(signal.SIGINT)

          signal.signal(signal.SIGINT, signal.SIG_IGN)

        Bryan O'Sullivan
    
worker: handle worker failures more aggressively...

              r18709
            
          pids, problem = [], [0]

        Bryan O'Sullivan
    
worker: allow a function to be run in multiple worker processes...

              r18638
            
          for pargs in partition(args, workers):

              pid = os.fork()

              if pid == 0:

        Bryan O'Sullivan
    
worker: fix a race in SIGINT handling...

              r18708
            
                  signal.signal(signal.SIGINT, oldhandler)

        Bryan O'Sullivan
    
worker: allow a function to be run in multiple worker processes...

              r18638
            
                  try:

                      os.close(rfd)

                      for i, item in func(*(staticargs + (pargs,))):

                          os.write(wfd, '%d %s\n' % (i, item))

                      os._exit(0)

                  except KeyboardInterrupt:

                      os._exit(255)

        Matt Mackall
    
worker: properly report errors from worker processes (issue3982)

              r19408
            
                      # other exceptions are allowed to propagate, we rely

                      # on lock.py's pid checks to avoid release callbacks

        Bryan O'Sullivan
    
worker: handle worker failures more aggressively...

              r18709
            
              pids.append(pid)

          pids.reverse()

        Bryan O'Sullivan
    
worker: allow a function to be run in multiple worker processes...

              r18638
            
          os.close(wfd)

          fp = os.fdopen(rfd, 'rb', 0)

        Bryan O'Sullivan
    
worker: handle worker failures more aggressively...

              r18709
            
          def killworkers():

              # if one worker bails, there's no good reason to wait for the rest

              for p in pids:

                  try:

                      os.kill(p, signal.SIGTERM)

        Gregory Szorc
    
global: mass rewrite to use modern exception syntax...

              r25660
            
                  except OSError as err:

        Bryan O'Sullivan
    
worker: handle worker failures more aggressively...

              r18709
            
                      if err.errno != errno.ESRCH:

                          raise

          def waitforworkers():

        Mads Kiilerich
    
cleanup: avoid _ for local unused tmp variables - that is reserved for i18n...

              r22199
            
              for _pid in pids:

        Bryan O'Sullivan
    
worker: handle worker failures more aggressively...

              r18709
            
                  st = _exitstatus(os.wait()[1])

        Matt Mackall
    
worker: check problem state correctly (issue3982)...

              r19406
            
                  if st and not problem[0]:

        Bryan O'Sullivan
    
worker: handle worker failures more aggressively...

              r18709
            
                      problem[0] = st

                      killworkers()

          t = threading.Thread(target=waitforworkers)

          t.start()

        Bryan O'Sullivan
    
worker: allow a function to be run in multiple worker processes...

              r18638
            
          def cleanup():

              signal.signal(signal.SIGINT, oldhandler)

        Bryan O'Sullivan
    
worker: handle worker failures more aggressively...

              r18709
            
              t.join()

              status = problem[0]

              if status:

                  if status < 0:

                      os.kill(os.getpid(), -status)

                  sys.exit(status)

        Bryan O'Sullivan
    
worker: allow a function to be run in multiple worker processes...

              r18638
            
          try:

              for line in fp:

                  l = line.split(' ', 1)

                  yield int(l[0]), l[1][:-1]

          except: # re-raises

        Bryan O'Sullivan
    
worker: handle worker failures more aggressively...

              r18709
            
              killworkers()

        Bryan O'Sullivan
    
worker: allow a function to be run in multiple worker processes...

              r18638
            
              cleanup()

              raise

          cleanup()

        Bryan O'Sullivan
    
worker: on error, exit similarly to the first failing worker...

              r18707
            
      def _posixexitstatus(code):

          '''convert a posix exit status into the same form returned by

          os.spawnv

          returns None if the process was stopped instead of exiting'''

          if os.WIFEXITED(code):

              return os.WEXITSTATUS(code)

          elif os.WIFSIGNALED(code):

              return -os.WTERMSIG(code)

        Bryan O'Sullivan
    
worker: allow a function to be run in multiple worker processes...

              r18638
            
      if os.name != 'nt':

          _platformworker = _posixworker

        Bryan O'Sullivan
    
worker: on error, exit similarly to the first failing worker...

              r18707
            
          _exitstatus = _posixexitstatus

        Bryan O'Sullivan
    
worker: allow a function to be run in multiple worker processes...

              r18638
            
        Bryan O'Sullivan
    
worker: partition a list (of tasks) into equal-sized chunks

              r18637
            
      def partition(lst, nslices):

        Gregory Szorc
    
worker: change partition strategy to every Nth element...

              r28181
            
          '''partition a list into N slices of roughly equal size

          The current strategy takes every Nth element from the input. If

          we ever write workers that need to preserve grouping in input

          we should consider allowing callers to specify a partition strategy.

          '''

          for i in range(nslices):

              yield lst[i::nslices]
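To make the "every Nth element" strategy concrete, the sketch below runs the same `partition` generator on a small list and contrasts it with a contiguous-chunk split. `partition_contiguous` is a hypothetical helper written only for comparison. It is not taken from worker.py and is not claimed to match any earlier version of it.

```python
def partition(lst, nslices):
    """Every-Nth-element strategy, as in the listing above."""
    for i in range(nslices):
        yield lst[i::nslices]


def partition_contiguous(lst, nslices):
    """Hypothetical alternative that keeps neighbouring items together."""
    n = len(lst)
    for i in range(nslices):
        yield lst[i * n // nslices:(i + 1) * n // nslices]


items = list(range(10))
print(list(partition(items, 3)))
# [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
print(list(partition_contiguous(items, 3)))
# [[0, 1, 2], [3, 4, 5], [6, 7, 8, 9]]
```

Both strategies produce slices of roughly equal size. The strided version interleaves the input across workers, while the contiguous version preserves grouping, which is the trade-off the docstring mentions.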
