upstream/ipython Files · IPython/nbconvert/filters/strings.py

Backport PR : Fix for incorrect default encoding on Windows.

Whilst trying out rendering notebooks in a flask app under Apache on Windows I got the below error when simply trying to import `SlidesExporter`

```python
mod_wsgi (pid=6260): Exception occurred processing WSGI script 'flask_test.wsgi'.
Traceback (most recent call last):
  File "flask_test.py", line 81, in render_notebook
    from IPython.nbconvert.exporters import SlidesExporter
  File "c:\\dev\\code\\ipython\\IPython\\__init__.py", line 47, in <module>
    from .terminal.embed import embed
  File "c:\\dev\\code\\ipython\\IPython\\terminal\\embed.py", line 32, in <module>
    from IPython.terminal.interactiveshell import TerminalInteractiveShell
  File "c:\\dev\\code\\ipython\\IPython\\terminal\\interactiveshell.py", line 25, in <module>
    from IPython.core.interactiveshell import InteractiveShell, InteractiveShellABC
  File "c:\\dev\\code\\ipython\\IPython\\core\\interactiveshell.py", line 59, in <module>
    from IPython.core.prompts import PromptManager
  File "c:\\dev\\code\\ipython\\IPython\\core\\prompts.py", line 138, in <module>
    HOME = py3compat.str_to_unicode(os.environ.get("HOME","//////:::::ZZZZZ,,,~~~"))
  File "c:\\dev\\code\\ipython\\IPython\\utils\\py3compat.py", line 18, in decode
    return s.decode(encoding, "replace")
LookupError: unknown encoding: cp0
```

A little bit of [googling](http://bugs.python.org/issue6501) suggests that Windows returns 'cp0' to indicate there is no code page. This fix simply looks for this invalid value and replaces it with something valid. With this change it works for me.
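The idea behind the fix can be pictured roughly as follows. This is a minimal sketch, not the patch itself: the helper name `replace_invalid_codepage` and the `cp1252` fallback are assumptions made for illustration.

```python
import codecs


def replace_invalid_codepage(encoding, fallback="cp1252"):
    """Return a usable codec name, swapping the bogus 'cp0' for `fallback`.

    Hypothetical helper: both the name and the cp1252 fallback are
    illustrative assumptions, not taken from the backported change.
    """
    if encoding == "cp0":
        return fallback
    return encoding


# 'cp0' is not a codec Python knows about, so looking it up fails ...
try:
    codecs.lookup("cp0")
    cp0_is_known = True
except LookupError:
    cp0_is_known = False

# ... while the substituted name decodes without error.
decoded = b"HOME".decode(replace_invalid_codepage("cp0"), "replace")
```

Any other encoding name passes through the helper unchanged, so only the one known-bad value is affected.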

MinRK

File last commit:

r12428:6d6f830e

r12463:516353d0


# coding: utf-8
"""String filters.

Contains a collection of useful string manipulation filters for use in Jinja
templates.
"""
#-----------------------------------------------------------------------------
# Copyright (c) 2013, the IPython Development Team.
#
# Distributed under the terms of the Modified BSD License.
#
# The full license is in the file COPYING.txt, distributed with this software.
#-----------------------------------------------------------------------------

#-----------------------------------------------------------------------------
# Imports
#-----------------------------------------------------------------------------
import os
import re
import textwrap
from xml.etree import ElementTree

from IPython.core.interactiveshell import InteractiveShell
from IPython.utils import py3compat

#-----------------------------------------------------------------------------
# Functions
#-----------------------------------------------------------------------------

__all__ = [
    'wrap_text',
    'html2text',
    'add_anchor',
    'strip_dollars',
    'strip_files_prefix',
    'comment_lines',
    'get_lines',
    'ipython2python',
    'posix_path',
]

def wrap_text(text, width=100):
    """
    Intelligently wrap text.
    Wrap text without breaking words if possible.

    Parameters
    ----------
    text : str
        Text to wrap.
    width : int, optional
        Number of characters to wrap to, default 100.
    """

    split_text = text.split('\n')
    wrp = map(lambda x:textwrap.wrap(x,width), split_text)
    wrpd = map('\n'.join, wrp)
    return '\n'.join(wrpd)
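The wrapping above works line by line, so existing line breaks (including blank lines) survive. A standalone sketch of the same approach, using only `textwrap`; the name `wrap_paragraphs` and the sample text are made up for illustration:

```python
import textwrap


def wrap_paragraphs(text, width=100):
    """Wrap each input line separately, keeping the original line breaks."""
    # textwrap.wrap returns a list of pieces per line (an empty list for "").
    wrapped = [textwrap.wrap(line, width) for line in text.split("\n")]
    return "\n".join("\n".join(pieces) for pieces in wrapped)


sample = "aaa bbb ccc\n\nddd"
result = wrap_paragraphs(sample, width=7)
# The first line is split after "aaa bbb"; the blank line is preserved.
```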

def html2text(element):
    """extract inner text from html

    Analog of jQuery's $(element).text()
    """
    if isinstance(element, py3compat.string_types):
        element = ElementTree.fromstring(element)

    text = element.text or ""
    for child in element:
        text += html2text(child)
    text += (element.tail or "")
    return text
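To try the recursion without IPython installed, here is a Python 3 only sketch of the same traversal. It assumes plain `str` in place of `py3compat.string_types`, and the name `inner_text` is hypothetical.

```python
from xml.etree import ElementTree


def inner_text(element):
    """Collect the text of an element and all of its descendants."""
    if isinstance(element, str):
        element = ElementTree.fromstring(element)
    text = element.text or ""          # text before the first child
    for child in element:
        text += inner_text(child)      # child text plus the child's tail
    text += element.tail or ""         # text following this element
    return text


heading_text = inner_text("<h1>Hello <b>big</b> world</h1>")
```

The `tail` handling is what picks up " world", which sits after the closing `</b>` rather than inside any element's `text`.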

def add_anchor(html):
    """Add an anchor-link to an html header tag

    For use in heading cells
    """
    h = ElementTree.fromstring(py3compat.cast_bytes_py2(html, encoding='utf-8'))
    link = html2text(h).replace(' ', '-')
    h.set('id', link)
    a = ElementTree.Element("a", {"class" : "anchor-link", "href" : "#" + link})
    a.text = u'¶'
    h.append(a)

    # Known issue of Python3.x, ElementTree.tostring() returns a byte string
    # instead of a text string.  See issue http://bugs.python.org/issue10942
    # Workaround is to make sure the bytes are cast to a string.
    return py3compat.decode(ElementTree.tostring(h), 'utf-8')
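For comparison, a Python 3 only sketch of the same transformation. It is an assumption-laden variant, not the code above: it asks `ElementTree.tostring` for text directly via `encoding="unicode"` instead of decoding bytes, uses `itertext()` for the heading text, and the name `add_anchor_py3` is hypothetical.

```python
from xml.etree import ElementTree


def add_anchor_py3(html):
    """Give a heading an id and append a pilcrow link pointing at it."""
    h = ElementTree.fromstring(html)
    # Build the id from the heading text before the anchor is appended.
    link = "".join(h.itertext()).replace(" ", "-")
    h.set("id", link)
    a = ElementTree.SubElement(
        h, "a", {"class": "anchor-link", "href": "#" + link}
    )
    a.text = u"\u00b6"  # the pilcrow sign
    # encoding="unicode" makes tostring return str rather than bytes.
    return ElementTree.tostring(h, encoding="unicode")


anchored = add_anchor_py3("<h2>Getting Started</h2>")
```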

def strip_dollars(text):
    """
    Strip leading and trailing dollar symbols from text

    Parameters
    ----------
    text : str
        Text to remove dollars from
    """

    return text.strip('$')
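Because the function relies on `str.strip`, only dollar signs at the two ends of the string are removed. A short illustration with made-up sample strings:

```python
# Delimiters at the ends are removed, however many there are.
inline_math = "$x^2$".strip("$")
display_math = "$$\\alpha + \\beta$$".strip("$")

# Dollar signs in the middle of the string are left alone.
mixed = "costs $5, or $x$ in math".strip("$")
```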

files_url_pattern = re.compile(r'(src|href)\=([\'"]?)files/')

def strip_files_prefix(text):
    """
    Fix all fake URLs that start with `files/`,
    stripping out the `files/` prefix.

    Parameters
    ----------
    text : str
        Text in which to replace 'src="files/real...' with 'src="real...'
    """
    return files_url_pattern.sub(r"\1=\2", text)
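The substitution keeps the attribute name (group 1) and the optional opening quote (group 2) and drops only the `files/` that follows. A small standalone check of that behaviour, with a made-up HTML fragment:

```python
import re

# Same pattern as above, without the redundant escape before "=".
files_url = re.compile(r'(src|href)=([\'"]?)files/')

html = (
    '<img src="files/fig.png"> '
    "<a href='files/data.csv'>data</a> "
    "see files/readme"
)
cleaned = files_url.sub(r"\1=\2", html)
# Only "files/" directly after src= or href= is removed; the bare
# "files/readme" in running text is untouched.
```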

def comment_lines(text, prefix='# '):
    """
    Build a Python comment line from input text.

    Parameters
    ----------
    text : str
        Text to comment out.
    prefix : str
        String to prepend to the start of each line.
    """

    #Replace line breaks with line breaks and comment symbols.
    #Also add a comment symbol at the beginning to comment out
    #the first line.
    return prefix + ('\n'+prefix).join(text.split('\n'))
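A quick standalone illustration of the join trick; `prefix_lines` is a hypothetical stand-in with the same body, and the `% ` prefix is just an example of a non-Python comment marker:

```python
def prefix_lines(text, prefix="# "):
    """Prepend `prefix` to every line of `text`, including empty ones."""
    return prefix + ("\n" + prefix).join(text.split("\n"))


python_comment = prefix_lines("first\nsecond")
latex_comment = prefix_lines("first\n\nsecond", prefix="% ")
```

Note that an empty line still receives the full prefix, trailing space included.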

def get_lines(text, start=None,end=None):
    """
    Split the input text into separate lines and then return the
    lines that the caller is interested in.

    Parameters
    ----------
    text : str
        Text to parse lines from.
    start : int, optional
        Index of the first line to return (zero-based).
    end : int, optional
        Index at which to stop; the line at this index is not included.
    """

    # Split the input into lines.
    lines = text.split("\n")

    # Return the right lines.
    return "\n".join(lines[start:end]) #re-join
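Since `start` and `end` are passed straight into a slice, the usual Python slicing rules apply: indices are zero-based, `end` is exclusive, `None` means "no bound", and negative values count from the end. A short illustration with made-up text:

```python
text = "zero\none\ntwo\nthree"
lines = text.split("\n")

middle = "\n".join(lines[1:3])           # lines at index 1 and 2
head = "\n".join(lines[None:2])          # no lower bound
all_but_last = "\n".join(lines[None:-1]) # negative end counts from the end
everything = "\n".join(lines[None:None]) # both defaults: unchanged text
```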

def ipython2python(code):
    """Transform IPython syntax to pure Python syntax

    Parameters
    ----------

    code : str
        IPython code, to be transformed to pure Python
    """
    shell = InteractiveShell.instance()
    return shell.input_transformer_manager.transform_cell(code)

def posix_path(path):
    """Turn a path into posix-style path/to/etc

    Mainly for use in latex on Windows,
    where native Windows paths are not allowed.
    """
    if os.path.sep != '/':
        return path.replace(os.path.sep, '/')
    return path
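The function above depends on the platform's `os.path.sep`, so its effect is only visible on Windows. To see both branches on any platform, this sketch takes the separator as an explicit argument; the `sep` parameter and the name `to_posix` are assumptions added for illustration.

```python
import ntpath
import posixpath


def to_posix(path, sep):
    """Replace `sep` with '/' unless the path is already posix-style."""
    if sep != "/":
        return path.replace(sep, "/")
    return path


# ntpath.sep is the backslash used on Windows; posixpath.sep is '/'.
windows_style = to_posix("figs\\plot.png", ntpath.sep)
posix_style = to_posix("figs/plot.png", posixpath.sep)
```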
