upstream/mercurial-mirror Files · mercurial/minirst.py


replace Python standard textwrap by MBCS sensitive one for i18n text

Mercurial has problem around text wrapping/filling in MBCS encoding environment, because standard 'textwrap' module of Python can not treat it correctly. It splits byte sequence for one character into two lines.

According to unicode specification, "east asian width" classifies characters into:

  W(ide), N(arrow), F(ull-width), H(alf-width), A(mbiguous)

W/N/F/H can be always recognized as 2/1/2/1 bytes in byte sequence, but 'A' can not. Size of 'A' depends on language in which it is used.

Unicode specification says:

  If the context(= language) cannot be established reliably they should be treated as narrow characters by default

but many of class 'A' characters are full-width, at least, in Japanese environment. So, this patch treats class 'A' characters as full-width always for safety wrapping.

This patch focuses only on MBCS safe-ness, not on writing/printing rule strict wrapping for each languages

MBCS sensitive textwrap class is originally implemented by ITO Nobuaki <daydream.trippers@gmail.com>.
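The width rule described in this commit message can be sketched with Python's standard `unicodedata.east_asian_width`. This is only an illustration, not the code added to `util.py` by this change: the name `display_width` and the `ambiguous_wide` flag are assumptions made for the example, and it works on Python 3 text strings rather than the byte strings Mercurial handled at this revision.

```python
import unicodedata


def display_width(text, ambiguous_wide=True):
    """Estimate how many terminal columns `text` occupies.

    Wide (W) and full-width (F) characters count as two columns.
    Ambiguous (A) characters count as two columns when
    `ambiguous_wide` is true, mirroring the "treat class 'A' as
    full-width" choice described in the commit message.
    """
    wide_classes = 'WFA' if ambiguous_wide else 'WF'
    return sum(2 if unicodedata.east_asian_width(ch) in wide_classes else 1
               for ch in text)
```

With this sketch, a plain ASCII word is counted one column per character, while a Greek letter such as U+03B1 (class 'A') is counted as one or two columns depending on the flag.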

FUJIWARA Katsunori

File last commit:

r11297:d320e704 default



minirst.py | 385 lines | 12.8 KiB | text/x-python

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
      # minirst.py - minimal reStructuredText parser

      #

        Martin Geisler
    
minirst: support containers...

              r10443
            
      # Copyright 2009, 2010 Matt Mackall <mpm@selenic.com> and others

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
      #

      # This software may be used and distributed according to the terms of the

        Matt Mackall
    
Update license to GPLv2+

              r10263
            
      # GNU General Public License version 2 or any later version.

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
      """simplified reStructuredText parser.

      This parser knows just enough about reStructuredText to parse the

      Mercurial docstrings.

      It cheats in a major way: nested blocks are not really nested. They

      are just indented blocks that look like they are nested. This relies

      on the user to keep the right indentation for the blocks.

      It only supports a small subset of reStructuredText:

        Martin Geisler
    
minirst: update module docstring

              r9741
            
      - sections

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
      - paragraphs

        Martin Geisler
    
minirst: update module docstring

              r9741
            
      - literal blocks

      - definition lists

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
        Martin Geisler
    
minirst: update module docstring

              r9741
            
      - bullet lists (items must start with '-')

      - enumerated lists (no autonumbering)

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
        Martin Geisler
    
minirst: parse field lists

              r9293
            
      - field lists (colons cannot be escaped)

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
      - option lists (supports only long options without arguments)

        Martin Geisler
    
minirst: update module docstring

              r9741
            
      - inline literals (no other inline markup is recognized)

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
      """

        FUJIWARA Katsunori
    
replace Python standard textwrap by MBCS sensitive one for i18n text...

              r11297
            
      import re, sys

      import util

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
      def findblocks(text):

          """Find continuous blocks of lines in text.

          Returns a list of dictionaries representing the blocks. Each block

          has an 'indent' field and a 'lines' field.

          """

          blocks = [[]]

          lines = text.splitlines()

          for line in lines:

              if line.strip():

                  blocks[-1].append(line)

              elif blocks[-1]:

                  blocks.append([])

          if not blocks[-1]:

              del blocks[-1]

          for i, block in enumerate(blocks):

              indent = min((len(l) - len(l.lstrip())) for l in block)

              blocks[i] = dict(indent=indent, lines=[l[indent:] for l in block])

          return blocks
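To make the block structure concrete, here is a small sketch that repeats the logic of `findblocks` so it can run on its own under Python 3, applied to a hypothetical two-paragraph input. The sample text is an assumption chosen for illustration.

```python
def findblocks(text):
    """Split text into blocks separated by blank lines.

    Same logic as the function in this file, repeated here so the
    example is self-contained.
    """
    blocks = [[]]
    for line in text.splitlines():
        if line.strip():
            blocks[-1].append(line)
        elif blocks[-1]:
            blocks.append([])
    if not blocks[-1]:
        del blocks[-1]
    for i, block in enumerate(blocks):
        # The common leading whitespace becomes the block's indent.
        indent = min(len(l) - len(l.lstrip()) for l in block)
        blocks[i] = dict(indent=indent, lines=[l[indent:] for l in block])
    return blocks


sample = "First paragraph\nspans two lines.\n\n    indented\n      block\n"
blocks = findblocks(sample)
```

The second block keeps the two extra spaces on its last line because only the indentation shared by all of its lines is stripped.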

      def findliteralblocks(blocks):

          """Finds literal blocks and adds a 'type' field to the blocks.

          Literal blocks are given the type 'literal', all other blocks are

          given the type 'paragraph'.

          """

          i = 0

          while i < len(blocks):

              # Searching for a block that looks like this:

              #

              # +------------------------------+

              # | paragraph                    |

              # | (ends with "::")             |

              # +------------------------------+

              #    +---------------------------+

              #    | indented literal block    |

              #    +---------------------------+

              blocks[i]['type'] = 'paragraph'

        Matt Mackall
    
many, many trivial check-code fixups

              r10282
            
              if blocks[i]['lines'][-1].endswith('::') and i + 1 < len(blocks):

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
                  indent = blocks[i]['indent']

        Matt Mackall
    
many, many trivial check-code fixups

              r10282
            
                  adjustment = blocks[i + 1]['indent'] - indent

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
                  if blocks[i]['lines'] == ['::']:

                      # Expanded form: remove block

                      del blocks[i]

                      i -= 1

                  elif blocks[i]['lines'][-1].endswith(' ::'):

                      # Partially minimized form: remove space and both

                      # colons.

                      blocks[i]['lines'][-1] = blocks[i]['lines'][-1][:-3]

                  else:

                      # Fully minimized form: remove just one colon.

                      blocks[i]['lines'][-1] = blocks[i]['lines'][-1][:-1]

                  # List items are formatted with a hanging indent. We must

                  # correct for this here while we still have the original

                  # information on the indentation of the subsequent literal

                  # blocks available.

        Martin Geisler
    
minirst: prepare for general types of bullet lists...

              r9738
            
                  m = _bulletre.match(blocks[i]['lines'][0])

                  if m:

                      indent += m.end()

                      adjustment -= m.end()

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
                  # Mark the following indented blocks.

        Matt Mackall
    
many, many trivial check-code fixups

              r10282
            
                  while i + 1 < len(blocks) and blocks[i + 1]['indent'] > indent:

                      blocks[i + 1]['type'] = 'literal'

                      blocks[i + 1]['indent'] -= adjustment

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
                      i += 1

              i += 1

          return blocks
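The three ways of writing the `::` marker handled above (expanded, partially minimized, fully minimized) can be isolated in a small helper. This is a sketch for illustration: `strip_literal_marker` is a hypothetical name that does not exist in this file, and it handles only the marker, not the indentation bookkeeping.

```python
def strip_literal_marker(lines):
    """Return the paragraph lines with the literal-block marker removed.

    Returns None when the paragraph consists only of '::' (expanded
    form), meaning the whole paragraph should be dropped.
    """
    if lines == ['::']:
        # Expanded form: the marker stands alone.
        return None
    last = lines[-1]
    if last.endswith(' ::'):
        # Partially minimized form: drop the space and both colons.
        last = last[:-3]
    elif last.endswith('::'):
        # Fully minimized form: drop one colon, keep the other.
        last = last[:-1]
    return lines[:-1] + [last]
```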

        Martin Geisler
    
minirst: support line blocks

              r10447
            
      _bulletre = re.compile(r'(-|[0-9A-Za-z]+\.|\(?[0-9A-Za-z]+\)|\|) ')

        Martin Geisler
    
minirst: combine list parsing in one function...

              r9737
            
      _optionre = re.compile(r'^(--[a-z-]+)((?:[ =][a-zA-Z][\w-]*)?  +)(.*)$')

        Martin Geisler
    
minirst: improve layout of field lists...

              r10065
            
      _fieldre = re.compile(r':(?![: ])([^:]*)(?<! ):[ ]+(.*)')

        Martin Geisler
    
minirst: combine list parsing in one function...

              r9737
            
      _definitionre = re.compile(r'[^ ]')

      def splitparagraphs(blocks):

          """Split paragraphs into lists."""

          # Tuples with (list type, item regexp, single line items?). Order

          # matters: definition lists have the least specific regexp and must

          # come last.

          listtypes = [('bullet', _bulletre, True),

                       ('option', _optionre, True),

                       ('field', _fieldre, True),

                       ('definition', _definitionre, False)]

          def match(lines, i, itemre, singleline):

              """Does itemre match an item at line i?

              A list item can be followed by an indented line or another list

              item (but only if singleline is True).

              """

              line1 = lines[i]

        Matt Mackall
    
many, many trivial check-code fixups

              r10282
            
              line2 = i + 1 < len(lines) and lines[i + 1] or ''

        Martin Geisler
    
minirst: combine list parsing in one function...

              r9737
            
              if not itemre.match(line1):

                  return False

              if singleline:

                  return line2 == '' or line2[0] == ' ' or itemre.match(line2)

              else:

                  return line2.startswith(' ')

          i = 0

          while i < len(blocks):

              if blocks[i]['type'] == 'paragraph':

                  lines = blocks[i]['lines']

                  for type, itemre, singleline in listtypes:

                      if match(lines, 0, itemre, singleline):

                          items = []

                          for j, line in enumerate(lines):

                              if match(lines, j, itemre, singleline):

                                  items.append(dict(type=type, lines=[],

                                                    indent=blocks[i]['indent']))

                              items[-1]['lines'].append(line)

        Matt Mackall
    
many, many trivial check-code fixups

              r10282
            
                          blocks[i:i + 1] = items

        Martin Geisler
    
minirst: combine list parsing in one function...

              r9737
            
                          break

              i += 1

          return blocks

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
        Martin Geisler
    
minirst: improve layout of field lists...

              r10065
            
      _fieldwidth = 12

      def updatefieldlists(blocks):

          """Find key and maximum key width for field lists."""

          i = 0

          while i < len(blocks):

              if blocks[i]['type'] != 'field':

                  i += 1

                  continue

              keywidth = 0

              j = i

              while j < len(blocks) and blocks[j]['type'] == 'field':

                  m = _fieldre.match(blocks[j]['lines'][0])

                  key, rest = m.groups()

                  blocks[j]['lines'][0] = rest

                  blocks[j]['key'] = key

                  keywidth = max(keywidth, len(key))

                  j += 1

              for block in blocks[i:j]:

                  block['keywidth'] = keywidth

              i = j + 1

          return blocks

        Martin Geisler
    
minirst: support containers...

              r10443
            
      def prunecontainers(blocks, keep):

          """Prune unwanted containers.

          The blocks must have a 'type' field, i.e., they should have been

          run through findliteralblocks first.

          """

        Martin Geisler
    
minirst: report pruned container types

              r10444
            
          pruned = []

        Martin Geisler
    
minirst: support containers...

              r10443
            
          i = 0

          while i + 1 < len(blocks):

              # Searching for a block that looks like this:

              #

              # +-------+---------------------------+

              # | ".. container::" type             |

              # +---+                               |

              #     | blocks                        |

              #     +-------------------------------+

              if (blocks[i]['type'] == 'paragraph' and

                  blocks[i]['lines'][0].startswith('.. container::')):

                  indent = blocks[i]['indent']

                  adjustment = blocks[i + 1]['indent'] - indent

                  containertype = blocks[i]['lines'][0][15:]

                  prune = containertype not in keep

        Martin Geisler
    
minirst: report pruned container types

              r10444
            
                  if prune:

                      pruned.append(containertype)

        Martin Geisler
    
minirst: support containers...

              r10443
            
                  # Always delete ".. container:: type" block

                  del blocks[i]

                  j = i

                  while j < len(blocks) and blocks[j]['indent'] > indent:

                      if prune:

                          del blocks[j]

                          i -= 1 # adjust outer index

                      else:

                          blocks[j]['indent'] -= adjustment

                          j += 1

              i += 1

        Martin Geisler
    
minirst: report pruned container types

              r10444
            
          return blocks, pruned

        Martin Geisler
    
minirst: support containers...

              r10443
            
        Martin Geisler
    
minirst: support all recommended title adornments

              r10984
            
      _sectionre = re.compile(r"""^([-=`:.'"~^_*+#])\1+$""")

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
      def findsections(blocks):

          """Finds sections.

          The blocks must have a 'type' field, i.e., they should have been

          run through findliteralblocks first.

          """

          for block in blocks:

              # Searching for a block that looks like this:

              #

              # +------------------------------+

              # | Section title                |

              # | -------------                |

              # +------------------------------+

              if (block['type'] == 'paragraph' and

                  len(block['lines']) == 2 and

        Martin Geisler
    
minirst: support all recommended title adornments

              r10984
            
                  len(block['lines'][0]) == len(block['lines'][1]) and

                  _sectionre.match(block['lines'][1])):

        Martin Geisler
    
minirst: correctly format sections containing inline markup...

              r10983
            
                  block['underline'] = block['lines'][1][0]

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
                  block['type'] = 'section'

        Martin Geisler
    
minirst: correctly format sections containing inline markup...

              r10983
            
                  del block['lines'][1]

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
          return blocks
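The section test above combines three conditions: exactly two lines, equal lengths, and an underline made of one repeated adornment character. A minimal standalone sketch follows; `is_section` is a hypothetical helper name used only for this example.

```python
import re

# Copied from the definition above.
_sectionre = re.compile(r"""^([-=`:.'"~^_*+#])\1+$""")


def is_section(lines):
    """Return True if the two lines look like a title plus underline."""
    return (len(lines) == 2 and
            len(lines[0]) == len(lines[1]) and
            bool(_sectionre.match(lines[1])))
```

Because the pattern is one adornment character followed by one or more repeats, an underline must be at least two characters long, and mixing adornment characters is rejected.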

        Martin Geisler
    
minirst: convert ``foo`` into "foo" upon display...

              r9623
            
      def inlineliterals(blocks):

          for b in blocks:

        Martin Geisler
    
minirst: correctly format sections containing inline markup...

              r10983
            
              if b['type'] in ('paragraph', 'section'):

        Martin Geisler
    
minirst: convert ``foo`` into "foo" upon display...

              r9623
            
                  b['lines'] = [l.replace('``', '"') for l in b['lines']]

          return blocks

        Martin Geisler
    
doc, minirst: support hg interpreted text role

              r10972
            
      def hgrole(blocks):

          for b in blocks:

        Martin Geisler
    
minirst: correctly format sections containing inline markup...

              r10983
            
              if b['type'] in ('paragraph', 'section'):

        Martin Geisler
    
minirst: handle line breaks in hg role

              r11192
            
                  # Turn :hg:`command` into "hg command". This also works

                  # when there is a line break in the command and relies on

                  # the fact that we have no stray back-quotes in the input

                  # (run the blocks through inlineliterals first).

                  b['lines'] = [l.replace(':hg:`', '"hg ').replace('`', '"')

                                for l in b['lines']]

        Martin Geisler
    
doc, minirst: support hg interpreted text role

              r10972
            
          return blocks
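The two replacement passes depend on their order: `inlineliterals` must remove the double back-quotes first, so that the only back-quotes left for `hgrole` are those of the `:hg:` role. A small sketch of both steps on one line, where `render_inline` is a hypothetical helper name and the sample line is an assumption:

```python
def render_inline(line):
    """Apply the inlineliterals step, then the hgrole step, to one line."""
    # Step 1 (inlineliterals): ``foo`` becomes "foo".
    line = line.replace('``', '"')
    # Step 2 (hgrole): :hg:`command` becomes "hg command".
    return line.replace(':hg:`', '"hg ').replace('`', '"')


rendered = render_inline('Use :hg:`help revsets` and ``--rev``')
```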

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
      def addmargins(blocks):

          """Adds empty blocks for vertical spacing.

          This groups bullets, options, and fields together with no vertical

          space between them, and adds an empty block between all other blocks.

          """

          i = 1

          while i < len(blocks):

        Matt Mackall
    
many, many trivial check-code fixups

              r10282
            
              if (blocks[i]['type'] == blocks[i - 1]['type'] and

        Martin Geisler
    
minirst: add margin around definition items...

              r10936
            
                  blocks[i]['type'] in ('bullet', 'option', 'field')):

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
                  i += 1

              else:

                  blocks.insert(i, dict(lines=[''], indent=0, type='margin'))

                  i += 2

          return blocks
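The margin rule can be illustrated on a list of block types alone. The sketch below is a simplified restatement, not the function above: `margin_types` is a hypothetical name, and it works on type strings instead of block dictionaries.

```python
def margin_types(types):
    """Insert 'margin' between block types, following the rule above.

    Consecutive blocks of the same type are kept together only when
    that type is 'bullet', 'option', or 'field'.
    """
    grouped = ('bullet', 'option', 'field')
    out = types[:1]
    for prev, cur in zip(types, types[1:]):
        if not (cur == prev and cur in grouped):
            out.append('margin')
        out.append(cur)
    return out
```

Consecutive definition items are separated by a margin here, which matches the tuple of grouped types in the code above.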

      def formatblock(block, width):

          """Format a block according to width."""

        Martin Geisler
    
util, minirst: do not crash with COLUMNS=0

              r9417
            
          if width <= 0:

              width = 78

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
          indent = ' ' * block['indent']

          if block['type'] == 'margin':

              return ''

        Martin Geisler
    
minirst: remove unnecessary "elif:" statements

              r9735
            
          if block['type'] == 'literal':

        Martin Geisler
    
minirst: indent literal blocks with two spaces...

              r9291
            
              indent += '  '

              return indent + ('\n' + indent).join(block['lines'])

        Martin Geisler
    
minirst: remove unnecessary "elif:" statements

              r9735
            
          if block['type'] == 'section':

        Martin Geisler
    
minirst: correctly format sections containing inline markup...

              r10983
            
              underline = len(block['lines'][0]) * block['underline']

              return "%s%s\n%s%s" % (indent, block['lines'][0], indent, underline)

        Martin Geisler
    
minirst: remove unnecessary "elif:" statements

              r9735
            
          if block['type'] == 'definition':

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
              term = indent + block['lines'][0]

        Martin Geisler
    
minirst: combine list parsing in one function...

              r9737
            
              hang = len(block['lines'][-1]) - len(block['lines'][-1].lstrip())

              defindent = indent + hang * ' '

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
              text = ' '.join(map(str.strip, block['lines'][1:]))

        FUJIWARA Katsunori
    
replace Python standard textwrap by MBCS sensitive one for i18n text...

              r11297
            
              return '%s\n%s' % (term, util.wrap(text, width=width,

                                                 initindent=defindent,

                                                 hangindent=defindent))

        Martin Geisler
    
minirst: removed unnecessary initindent variable

              r10937
            
          subindent = indent

        Martin Geisler
    
minirst: remove unnecessary "elif:" statements

              r9735
            
          if block['type'] == 'bullet':

        Martin Geisler
    
minirst: support line blocks

              r10447
            
              if block['lines'][0].startswith('| '):

                  # Remove bullet for line blocks and add no extra

                  # indention.

                  block['lines'][0] = block['lines'][0][2:]

              else:

                  m = _bulletre.match(block['lines'][0])

                  subindent = indent + m.end() * ' '

        Martin Geisler
    
minirst: combine list parsing in one function...

              r9737
            
          elif block['type'] == 'field':

        Martin Geisler
    
minirst: improve layout of field lists...

              r10065
            
              keywidth = block['keywidth']

              key = block['key']

              subindent = indent + _fieldwidth * ' '

              if len(key) + 2 > _fieldwidth:

                  # key too large, use full line width

                  key = key.ljust(width)

              elif keywidth + 2 < _fieldwidth:

                  # all keys are small, add only two spaces

                  key = key.ljust(keywidth + 2)

                  subindent = indent + (keywidth + 2) * ' '

              else:

                  # mixed sizes, use fieldwidth for this one

                  key = key.ljust(_fieldwidth)

              block['lines'][0] = key + block['lines'][0]

        Martin Geisler
    
minirst: combine list parsing in one function...

              r9737
            
          elif block['type'] == 'option':

              m = _optionre.match(block['lines'][0])

        Martin Geisler
    
minirst: don't test regexps twice...

              r10064
            
              option, arg, rest = m.groups()

              subindent = indent + (len(option) + len(arg)) * ' '

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
        Martin Geisler
    
minirst: combine list parsing in one function...

              r9737
            
          text = ' '.join(map(str.strip, block['lines']))

        FUJIWARA Katsunori
    
replace Python standard textwrap by MBCS sensitive one for i18n text...

              r11297
            
          return util.wrap(text, width=width,

                           initindent=indent,

                           hangindent=subindent)
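The field-list branch of `formatblock` picks one of three layouts for the key column. The sketch below isolates that decision. `pad_key` and `FIELDWIDTH` are hypothetical names for this example, and the function returns the hanging indent as a number of columns without the block's own indent.

```python
FIELDWIDTH = 12  # same value as _fieldwidth above


def pad_key(key, keywidth, width):
    """Return (padded key, hanging indent) for one field item.

    `keywidth` is the longest key in the field list and `width` is
    the output line width.
    """
    if len(key) + 2 > FIELDWIDTH:
        # Key too large: it fills a whole line by itself.
        return key.ljust(width), FIELDWIDTH
    elif keywidth + 2 < FIELDWIDTH:
        # All keys are small: pad to the longest key plus two spaces.
        return key.ljust(keywidth + 2), keywidth + 2
    else:
        # Mixed sizes: use the fixed field width for this key.
        return key.ljust(FIELDWIDTH), FIELDWIDTH
```

Padding an oversized key to the full line width pushes the field body onto the following line once the text is wrapped.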

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
        Martin Geisler
    
minirst: report pruned container types

              r10444
            
      def format(text, width, indent=0, keep=None):

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
          """Parse and format the text according to width."""

          blocks = findblocks(text)

        Martin Geisler
    
help: un-indent help topics...

              r9540
            
          for b in blocks:

              b['indent'] += indent

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
          blocks = findliteralblocks(blocks)

        Martin Geisler
    
minirst: report pruned container types

              r10444
            
          blocks, pruned = prunecontainers(blocks, keep or [])

        Martin Geisler
    
minirst: correctly format sections containing inline markup...

              r10983
            
          blocks = findsections(blocks)

        Martin Geisler
    
minirst: convert ``foo`` into "foo" upon display...

              r9623
            
          blocks = inlineliterals(blocks)

        Martin Geisler
    
doc, minirst: support hg interpreted text role

              r10972
            
          blocks = hgrole(blocks)

        Martin Geisler
    
minirst: combine list parsing in one function...

              r9737
            
          blocks = splitparagraphs(blocks)

        Martin Geisler
    
minirst: improve layout of field lists...

              r10065
            
          blocks = updatefieldlists(blocks)

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
          blocks = addmargins(blocks)

        Martin Geisler
    
minirst: report pruned container types

              r10444
            
          text = '\n'.join(formatblock(b, width) for b in blocks)

          if keep is None:

              return text

          else:

              return text, pruned

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
      if __name__ == "__main__":

          from pprint import pprint

        Martin Geisler
    
minirst: support containers...

              r10443
            
          def debug(func, *args):

              blocks = func(*args)

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
              print "*** after %s:" % func.__name__

              pprint(blocks)

              print

              return blocks

          text = open(sys.argv[1]).read()

          blocks = debug(findblocks, text)

          blocks = debug(findliteralblocks, blocks)

        Martin Geisler
    
minirst: fix debug code

              r11187
            
          blocks, pruned = debug(prunecontainers, blocks, sys.argv[2:])

        Martin Geisler
    
minirst: run inlineliterals too in debug mode

              r10063
            
          blocks = debug(inlineliterals, blocks)

        Martin Geisler
    
minirst: combine list parsing in one function...

              r9737
            
          blocks = debug(splitparagraphs, blocks)

        Martin Geisler
    
minirst: improve layout of field lists...

              r10065
            
          blocks = debug(updatefieldlists, blocks)

        Martin Geisler
    
minimal reStructuredText parser

              r9156
            
          blocks = debug(findsections, blocks)

          blocks = debug(addmargins, blocks)

          print '\n'.join(formatblock(b, 30) for b in blocks)


Martin Geisler minimal reStructuredText parser	r9156	# minirst.py - minimal reStructuredText parser
		#
Martin Geisler minirst: support containers...	r10443	# Copyright 2009, 2010 Matt Mackall <mpm@selenic.com> and others
Martin Geisler minimal reStructuredText parser	r9156	#
		# This software may be used and distributed according to the terms of the
Matt Mackall Update license to GPLv2+	r10263	# GNU General Public License version 2 or any later version.
Martin Geisler minimal reStructuredText parser	r9156
		"""simplified reStructuredText parser.

		This parser knows just enough about reStructuredText to parse the
		Mercurial docstrings.

		It cheats in a major way: nested blocks are not really nested. They
		are just indented blocks that look like they are nested. This relies
		on the user to keep the right indentation for the blocks.

		It only supports a small subset of reStructuredText:

Martin Geisler minirst: update module docstring	r9741	- sections

Martin Geisler minimal reStructuredText parser	r9156	- paragraphs

Martin Geisler minirst: update module docstring	r9741	- literal blocks

		- definition lists
Martin Geisler minimal reStructuredText parser	r9156
Martin Geisler minirst: update module docstring	r9741	- bullet lists (items must start with '-')

		- enumerated lists (no autonumbering)
Martin Geisler minimal reStructuredText parser	r9156
Martin Geisler minirst: parse field lists	r9293	- field lists (colons cannot be escaped)

Martin Geisler minimal reStructuredText parser	r9156	- option lists (supports only long options without arguments)

Martin Geisler minirst: update module docstring	r9741	- inline literals (no other inline markup is not recognized)
Martin Geisler minimal reStructuredText parser	r9156	"""

FUJIWARA Katsunori replace Python standard textwrap by MBCS sensitive one for i18n text...	r11297	import re, sys
		import util
Martin Geisler minimal reStructuredText parser	r9156
		def findblocks(text):
		"""Find continuous blocks of lines in text.

		Returns a list of dictionaries representing the blocks. Each block
		has an 'indent' field and a 'lines' field.
		"""
		blocks = [[]]
		lines = text.splitlines()
		for line in lines:
		if line.strip():
		blocks[-1].append(line)
		elif blocks[-1]:
		blocks.append([])
		if not blocks[-1]:
		del blocks[-1]

		for i, block in enumerate(blocks):
		indent = min((len(l) - len(l.lstrip())) for l in block)
		blocks[i] = dict(indent=indent, lines=[l[indent:] for l in block])
		return blocks


		def findliteralblocks(blocks):
		    """Finds literal blocks and adds a 'type' field to the blocks.

		    Literal blocks are given the type 'literal', all other blocks are
		    given the type 'paragraph'.
		    """
		    i = 0
		    while i < len(blocks):
		        # Searching for a block that looks like this:
		        #
		        # +------------------------------+
		        # | paragraph                    |
		        # | (ends with "::")             |
		        # +------------------------------+
		        #    +---------------------------+
		        #    | indented literal block    |
		        #    +---------------------------+
		        blocks[i]['type'] = 'paragraph'
Matt Mackall many, many trivial check-code fixups	r10282	        if blocks[i]['lines'][-1].endswith('::') and i + 1 < len(blocks):
Martin Geisler minimal reStructuredText parser	r9156	            indent = blocks[i]['indent']
Matt Mackall many, many trivial check-code fixups	r10282	            adjustment = blocks[i + 1]['indent'] - indent
Martin Geisler minimal reStructuredText parser	r9156
		            if blocks[i]['lines'] == ['::']:
		                # Expanded form: remove block
		                del blocks[i]
		                i -= 1
		            elif blocks[i]['lines'][-1].endswith(' ::'):
		                # Partially minimized form: remove space and both
		                # colons.
		                blocks[i]['lines'][-1] = blocks[i]['lines'][-1][:-3]
		            else:
		                # Fully minimized form: remove just one colon.
		                blocks[i]['lines'][-1] = blocks[i]['lines'][-1][:-1]

		            # List items are formatted with a hanging indent. We must
		            # correct for this here while we still have the original
		            # information on the indentation of the subsequent literal
		            # blocks available.
Martin Geisler minirst: prepare for general types of bullet lists...	r9738	            m = _bulletre.match(blocks[i]['lines'][0])
		            if m:
		                indent += m.end()
		                adjustment -= m.end()
Martin Geisler minimal reStructuredText parser	r9156
		            # Mark the following indented blocks.
Matt Mackall many, many trivial check-code fixups	r10282	            while i + 1 < len(blocks) and blocks[i + 1]['indent'] > indent:
		                blocks[i + 1]['type'] = 'literal'
		                blocks[i + 1]['indent'] -= adjustment
Martin Geisler minimal reStructuredText parser	r9156	                i += 1
		        i += 1
		    return blocks
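
The three "::" forms handled above can be tried in isolation. This is a minimal sketch, not part of minirst.py: `strip_literal_marker` is a hypothetical helper that works on a plain list of lines and returns None where the real code deletes the block.

```python
def strip_literal_marker(lines):
    """Handle the '::' marker at the end of a paragraph (hypothetical helper).

    Returns None for the expanded form, where the paragraph consists of
    '::' only and would be removed altogether.
    """
    if lines == ['::']:
        return None
    last = lines[-1]
    if last.endswith(' ::'):
        # Partially minimized form: drop the space and both colons.
        return lines[:-1] + [last[:-3]]
    if last.endswith('::'):
        # Fully minimized form: drop one colon, keep the other.
        return lines[:-1] + [last[:-1]]
    return lines


expanded = strip_literal_marker(['::'])
partial = strip_literal_marker(['Example ::'])
minimized = strip_literal_marker(['Example::'])
```

In this sketch `partial` ends without any colon, while `minimized` keeps a single trailing colon.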

Martin Geisler minirst: support line blocks	r10447	_bulletre = re.compile(r'(-|[0-9A-Za-z]+\.|\(?[0-9A-Za-z]+\)|\|) ')
Martin Geisler minirst: combine list parsing in one function...	r9737	_optionre = re.compile(r'^(--[a-z-]+)((?:[ =][a-zA-Z][\w-]*)? +)(.*)$')
Martin Geisler minirst: improve layout of field lists...	r10065	_fieldre = re.compile(r':(?![: ])([^:]*)(?<! ):[ ]+(.*)')
Martin Geisler minirst: combine list parsing in one function...	r9737	_definitionre = re.compile(r'[^ ]')
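
To see what the list-item patterns capture, they can be run against sample lines. This is a sketch only: the patterns are copied under local names, with the `*` quantifiers and unescaped `|` assumed to be what the listing intends, and the sample lines are invented.

```python
import re

# Local copies of the patterns; the quantifiers are an assumed reading.
bulletre = re.compile(r'(-|[0-9A-Za-z]+\.|\(?[0-9A-Za-z]+\)|\|) ')
optionre = re.compile(r'^(--[a-z-]+)((?:[ =][a-zA-Z][\w-]*)? +)(.*)$')
fieldre = re.compile(r':(?![: ])([^:]*)(?<! ):[ ]+(.*)')

# m.end() is the width of the bullet marker plus its trailing space,
# which is what the hanging indent is derived from.
bullet_end = bulletre.match('- first item').end()
numbered_end = bulletre.match('(12) twelfth item').end()

# Three groups: the option, its argument plus padding, and the description.
option_groups = optionre.match('--rev REV  revision to use').groups()

# Two groups: the field key and the rest of the line.
field_groups = fieldre.match(':author: Jane Doe').groups()
```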

		def splitparagraphs(blocks):
		    """Split paragraphs into lists."""
		    # Tuples with (list type, item regexp, single line items?). Order
		    # matters: definition lists have the least specific regexp and must
		    # come last.
		    listtypes = [('bullet', _bulletre, True),
		                 ('option', _optionre, True),
		                 ('field', _fieldre, True),
		                 ('definition', _definitionre, False)]

		    def match(lines, i, itemre, singleline):
		        """Does itemre match an item at line i?

		        A list item can be followed by an indented line or another list
		        item (but only if singleline is True).
		        """
		        line1 = lines[i]
Matt Mackall many, many trivial check-code fixups	r10282	        line2 = i + 1 < len(lines) and lines[i + 1] or ''
Martin Geisler minirst: combine list parsing in one function...	r9737	        if not itemre.match(line1):
		            return False
		        if singleline:
		            return line2 == '' or line2[0] == ' ' or itemre.match(line2)
		        else:
		            return line2.startswith(' ')

		    i = 0
		    while i < len(blocks):
		        if blocks[i]['type'] == 'paragraph':
		            lines = blocks[i]['lines']
		            for type, itemre, singleline in listtypes:
		                if match(lines, 0, itemre, singleline):
		                    items = []
		                    for j, line in enumerate(lines):
		                        if match(lines, j, itemre, singleline):
		                            items.append(dict(type=type, lines=[],
		                                              indent=blocks[i]['indent']))
		                        items[-1]['lines'].append(line)
Matt Mackall many, many trivial check-code fixups	r10282	                    blocks[i:i + 1] = items
Martin Geisler minirst: combine list parsing in one function...	r9737	                    break
		        i += 1
		    return blocks

Martin Geisler minimal reStructuredText parser	r9156
Martin Geisler minirst: improve layout of field lists...	r10065	_fieldwidth = 12

		def updatefieldlists(blocks):
		    """Find key and maximum key width for field lists."""
		    i = 0
		    while i < len(blocks):
		        if blocks[i]['type'] != 'field':
		            i += 1
		            continue

		        keywidth = 0
		        j = i
		        while j < len(blocks) and blocks[j]['type'] == 'field':
		            m = _fieldre.match(blocks[j]['lines'][0])
		            key, rest = m.groups()
		            blocks[j]['lines'][0] = rest
		            blocks[j]['key'] = key
		            keywidth = max(keywidth, len(key))
		            j += 1

		        for block in blocks[i:j]:
		            block['keywidth'] = keywidth
		        i = j + 1

		    return blocks


Martin Geisler minirst: support containers...	r10443	def prunecontainers(blocks, keep):
		    """Prune unwanted containers.

		    The blocks must have a 'type' field, i.e., they should have been
		    run through findliteralblocks first.
		    """
Martin Geisler minirst: report pruned container types	r10444	    pruned = []
Martin Geisler minirst: support containers...	r10443	    i = 0
		    while i + 1 < len(blocks):
		        # Searching for a block that looks like this:
		        #
		        # +-------+---------------------------+
		        # | ".. container ::" type            |
		        # +---+                               |
		        #     | blocks                        |
		        #     +-------------------------------+
		        if (blocks[i]['type'] == 'paragraph' and
		            blocks[i]['lines'][0].startswith('.. container::')):
		            indent = blocks[i]['indent']
		            adjustment = blocks[i + 1]['indent'] - indent
		            containertype = blocks[i]['lines'][0][15:]
		            prune = containertype not in keep
Martin Geisler minirst: report pruned container types	r10444	            if prune:
		                pruned.append(containertype)
Martin Geisler minirst: support containers...	r10443
		            # Always delete "..container:: type" block
		            del blocks[i]
		            j = i
		            while j < len(blocks) and blocks[j]['indent'] > indent:
		                if prune:
		                    del blocks[j]
		                    i -= 1 # adjust outer index
		                else:
		                    blocks[j]['indent'] -= adjustment
		                    j += 1
		        i += 1
Martin Geisler minirst: report pruned container types	r10444	    return blocks, pruned
Martin Geisler minirst: support containers...	r10443

Martin Geisler minirst: support all recommended title adornments	r10984	_sectionre = re.compile(r"""^([-=`:.'"~^_*+#])\1+$""")

Martin Geisler minimal reStructuredText parser	r9156	def findsections(blocks):
		    """Finds sections.

		    The blocks must have a 'type' field, i.e., they should have been
		    run through findliteralblocks first.
		    """
		    for block in blocks:
		        # Searching for a block that looks like this:
		        #
		        # +------------------------------+
		        # | Section title                |
		        # | -------------                |
		        # +------------------------------+
		        if (block['type'] == 'paragraph' and
		            len(block['lines']) == 2 and
Martin Geisler minirst: support all recommended title adornments	r10984	            len(block['lines'][0]) == len(block['lines'][1]) and
		            _sectionre.match(block['lines'][1])):
Martin Geisler minirst: correctly format sections containing inline markup...	r10983	            block['underline'] = block['lines'][1][0]
Martin Geisler minimal reStructuredText parser	r9156	            block['type'] = 'section'
Martin Geisler minirst: correctly format sections containing inline markup...	r10983	            del block['lines'][1]
Martin Geisler minimal reStructuredText parser	r9156	    return blocks
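
The section test combines three conditions: exactly two lines, equal lengths, and an underline made of one repeated adornment character. A small sketch of that check, with `is_section` as a hypothetical stand-alone helper and the pattern copied under a local name:

```python
import re

# Same adornment pattern as _sectionre, copied under a local name.
sectionre = re.compile(r"""^([-=`:.'"~^_*+#])\1+$""")


def is_section(lines):
    """Return True if the two lines look like a title plus underline."""
    return (len(lines) == 2 and
            len(lines[0]) == len(lines[1]) and
            sectionre.match(lines[1]) is not None)


titled = is_section(['Title', '====='])
too_short = is_section(['Title', '===='])      # underline length differs
mixed = is_section(['Title', '=-=-='])         # not one repeated character
```

The backreference `\1+` is what rejects mixed underlines: every character after the first must repeat the first one.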


Martin Geisler minirst: convert ``foo`` into "foo" upon display...	r9623	def inlineliterals(blocks):
		    for b in blocks:
Martin Geisler minirst: correctly format sections containing inline markup...	r10983	        if b['type'] in ('paragraph', 'section'):
Martin Geisler minirst: convert ``foo`` into "foo" upon display...	r9623	            b['lines'] = [l.replace('``', '"') for l in b['lines']]
		    return blocks


Martin Geisler doc, minirst: support hg interpreted text role	r10972	def hgrole(blocks):
		    for b in blocks:
Martin Geisler minirst: correctly format sections containing inline markup...	r10983	        if b['type'] in ('paragraph', 'section'):
Martin Geisler minirst: handle line breaks in hg role	r11192	            # Turn :hg:`command` into "hg command". This also works
		            # when there is a line break in the command and relies on
		            # the fact that we have no stray back-quotes in the input
		            # (run the blocks through inlineliterals first).
		            b['lines'] = [l.replace(':hg:`', '"hg ').replace('`', '"')
		                          for l in b['lines']]
Martin Geisler doc, minirst: support hg interpreted text role	r10972	    return blocks
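
The comment in hgrole says inline literals must be converted first. A short sketch of why, using two hypothetical single-line helpers and an invented sample sentence:

```python
def render_inline(line):
    """Apply the two substitutions in the order the parser uses."""
    line = line.replace('``', '"')    # inline literals first
    return line.replace(':hg:`', '"hg ').replace('`', '"')


def render_wrong_order(line):
    """Same substitutions with the hg role handled first (for contrast)."""
    line = line.replace(':hg:`', '"hg ').replace('`', '"')
    return line.replace('``', '"')


sample = 'Run :hg:`commit` or pass ``--rev``.'
good = render_inline(sample)
bad = render_wrong_order(sample)
```

In the wrong order, each back-quote of a double back-quote pair is turned into its own quotation mark, so the literal ends up wrapped in doubled quotes.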


Martin Geisler minimal reStructuredText parser	r9156	def addmargins(blocks):
		    """Adds empty blocks for vertical spacing.

		    This groups bullets, options, and fields together with no vertical
		    space between them, and adds an empty block between all other blocks.
		    """
		    i = 1
		    while i < len(blocks):
Matt Mackall many, many trivial check-code fixups	r10282	        if (blocks[i]['type'] == blocks[i - 1]['type'] and
Martin Geisler minirst: add margin around definition items...	r10936	            blocks[i]['type'] in ('bullet', 'option', 'field')):
Martin Geisler minimal reStructuredText parser	r9156	            i += 1
		        else:
		            blocks.insert(i, dict(lines=[''], indent=0, type='margin'))
		            i += 2
		    return blocks
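
The effect of the margin rule is easiest to see on block types alone. A minimal sketch, where `add_margins` is a hypothetical simplification that takes a list of type names instead of block dictionaries:

```python
def add_margins(types):
    """Insert 'margin' between block types, mirroring the loop above."""
    grouped = ('bullet', 'option', 'field')
    out = list(types)
    i = 1
    while i < len(out):
        if out[i] == out[i - 1] and out[i] in grouped:
            i += 1                    # consecutive list items stay together
        else:
            out.insert(i, 'margin')
            i += 2                    # skip over the margin just inserted
    return out


result = add_margins(['paragraph', 'bullet', 'bullet',
                      'definition', 'definition'])
```

Here the two bullets stay adjacent, while the two definitions are separated by a margin because 'definition' is not in the grouped types.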


		def formatblock(block, width):
		    """Format a block according to width."""
Martin Geisler util, minirst: do not crash with COLUMNS=0	r9417	    if width <= 0:
		        width = 78
Martin Geisler minimal reStructuredText parser	r9156	    indent = ' ' * block['indent']
		    if block['type'] == 'margin':
		        return ''
Martin Geisler minirst: remove unnecessary "elif:" statements	r9735	    if block['type'] == 'literal':
Martin Geisler minirst: indent literal blocks with two spaces...	r9291	        indent += '  '
		        return indent + ('\n' + indent).join(block['lines'])
Martin Geisler minirst: remove unnecessary "elif:" statements	r9735	    if block['type'] == 'section':
Martin Geisler minirst: correctly format sections containing inline markup...	r10983	        underline = len(block['lines'][0]) * block['underline']
		        return "%s%s\n%s%s" % (indent, block['lines'][0], indent, underline)
Martin Geisler minirst: remove unnecessary "elif:" statements	r9735	    if block['type'] == 'definition':
Martin Geisler minimal reStructuredText parser	r9156	        term = indent + block['lines'][0]
Martin Geisler minirst: combine list parsing in one function...	r9737	        hang = len(block['lines'][-1]) - len(block['lines'][-1].lstrip())
		        defindent = indent + hang * ' '
Martin Geisler minimal reStructuredText parser	r9156	        text = ' '.join(map(str.strip, block['lines'][1:]))
FUJIWARA Katsunori replace Python standard textwrap by MBCS sensitive one for i18n text...	r11297	        return '%s\n%s' % (term, util.wrap(text, width=width,
		                                           initindent=defindent,
		                                           hangindent=defindent))
Martin Geisler minirst: removed unnecessary initindent variable	r10937	    subindent = indent
Martin Geisler minirst: remove unnecessary "elif:" statements	r9735	    if block['type'] == 'bullet':
Martin Geisler minirst: support line blocks	r10447	        if block['lines'][0].startswith('| '):
		            # Remove bullet for line blocks and add no extra
		            # indentation.
		            block['lines'][0] = block['lines'][0][2:]
		        else:
		            m = _bulletre.match(block['lines'][0])
		            subindent = indent + m.end() * ' '
Martin Geisler minirst: combine list parsing in one function...	r9737	    elif block['type'] == 'field':
Martin Geisler minirst: improve layout of field lists...	r10065	        keywidth = block['keywidth']
		        key = block['key']

		        subindent = indent + _fieldwidth * ' '
		        if len(key) + 2 > _fieldwidth:
		            # key too large, use full line width
		            key = key.ljust(width)
		        elif keywidth + 2 < _fieldwidth:
		            # all keys are small, add only two spaces
		            key = key.ljust(keywidth + 2)
		            subindent = indent + (keywidth + 2) * ' '
		        else:
		            # mixed sizes, use fieldwidth for this one
		            key = key.ljust(_fieldwidth)
		        block['lines'][0] = key + block['lines'][0]
Martin Geisler minirst: combine list parsing in one function...	r9737	    elif block['type'] == 'option':
		        m = _optionre.match(block['lines'][0])
Martin Geisler minirst: don't test regexps twice...	r10064	        option, arg, rest = m.groups()
		        subindent = indent + (len(option) + len(arg)) * ' '
Martin Geisler minimal reStructuredText parser	r9156
Martin Geisler minirst: combine list parsing in one function...	r9737	    text = ' '.join(map(str.strip, block['lines']))
FUJIWARA Katsunori replace Python standard textwrap by MBCS sensitive one for i18n text...	r11297	    return util.wrap(text, width=width,
		                     initindent=indent,
		                     hangindent=subindent)
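
The three-way choice for field keys can be isolated from the wrapping code. This is a sketch under assumptions: `pad_key` is a hypothetical helper, `FIELDWIDTH` stands in for `_fieldwidth`, and the function returns the padded key together with the hanging indent width instead of building indent strings.

```python
FIELDWIDTH = 12   # stands in for _fieldwidth


def pad_key(key, keywidth, width):
    """Return (padded key, hanging indent width) for one field item.

    keywidth is the longest key in the field list, width the line width.
    """
    if len(key) + 2 > FIELDWIDTH:
        # Key too large: it fills a whole line, the text wraps below it.
        return key.ljust(width), FIELDWIDTH
    elif keywidth + 2 < FIELDWIDTH:
        # All keys are small: pad to the longest key plus two spaces.
        return key.ljust(keywidth + 2), keywidth + 2
    else:
        # Mixed sizes: fall back to the fixed field width.
        return key.ljust(FIELDWIDTH), FIELDWIDTH


small = pad_key('id', 4, 30)
mixed = pad_key('id', 11, 30)
large = pad_key('averyveryverylongkey', 20, 30)
```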
Martin Geisler minimal reStructuredText parser	r9156

Martin Geisler minirst: report pruned container types	r10444	def format(text, width, indent=0, keep=None):
Martin Geisler minimal reStructuredText parser	r9156	    """Parse and format the text according to width."""
		    blocks = findblocks(text)
Martin Geisler help: un-indent help topics...	r9540	    for b in blocks:
		        b['indent'] += indent
Martin Geisler minimal reStructuredText parser	r9156	    blocks = findliteralblocks(blocks)
Martin Geisler minirst: report pruned container types	r10444	    blocks, pruned = prunecontainers(blocks, keep or [])
Martin Geisler minirst: correctly format sections containing inline markup...	r10983	    blocks = findsections(blocks)
Martin Geisler minirst: convert ``foo`` into "foo" upon display...	r9623	    blocks = inlineliterals(blocks)
Martin Geisler doc, minirst: support hg interpreted text role	r10972	    blocks = hgrole(blocks)
Martin Geisler minirst: combine list parsing in one function...	r9737	    blocks = splitparagraphs(blocks)
Martin Geisler minirst: improve layout of field lists...	r10065	    blocks = updatefieldlists(blocks)
Martin Geisler minimal reStructuredText parser	r9156	    blocks = addmargins(blocks)
Martin Geisler minirst: report pruned container types	r10444	    text = '\n'.join(formatblock(b, width) for b in blocks)
		    if keep is None:
		        return text
		    else:
		        return text, pruned
Martin Geisler minimal reStructuredText parser	r9156

		if __name__ == "__main__":
		    from pprint import pprint

Martin Geisler minirst: support containers...	r10443	    def debug(func, *args):
		        blocks = func(*args)
Martin Geisler minimal reStructuredText parser	r9156	        print "*** after %s:" % func.__name__
		        pprint(blocks)
		        print
		        return blocks

		    text = open(sys.argv[1]).read()
		    blocks = debug(findblocks, text)
		    blocks = debug(findliteralblocks, blocks)
Martin Geisler minirst: fix debug code	r11187	    blocks, pruned = debug(prunecontainers, blocks, sys.argv[2:])
Martin Geisler minirst: run inlineliterals too in debug mode	r10063	    blocks = debug(inlineliterals, blocks)
Martin Geisler minirst: combine list parsing in one function...	r9737	    blocks = debug(splitparagraphs, blocks)
Martin Geisler minirst: improve layout of field lists...	r10065	    blocks = debug(updatefieldlists, blocks)
Martin Geisler minimal reStructuredText parser	r9156	    blocks = debug(findsections, blocks)
		    blocks = debug(addmargins, blocks)
		    print '\n'.join(formatblock(b, 30) for b in blocks)