upstream/ipython Commit - r24171:f3bd056d

Document the new input transformation API

Thomas Kluyver

docs/source/whatsnew/pr/inputtransformer2.rst

0 created 644 +3 0

			@@ -0,0 +1,3 b''
		1	* The API for transforming input before it is parsed as Python code has been
		2	completely redesigned, and any custom input transformations will need to be
		3	rewritten. See :doc:`/config/inputtransforms` for details of the new API.

docs/source/config/inputtransforms.rst

0 +113 -139

		@@ -15,36 +15,31 b' String based transformations'
15	15
16	16	.. currentmodule:: IPython.core.inputtransforms
17	17
18		When the user enters ~~a line of~~ code, it is first processed as a string. By the
	18	When the user enters code, it is first processed as a string. By the
19	19	end of this stage, it must be valid Python syntax.
20	20
21		These transformers all subclass :class:`IPython.core.inputtransformer.InputTransformer`,
22		and are used by :class:`IPython.core.inputsplitter.IPythonInputSplitter`.
23
24		These transformers act in three groups, stored separately as lists of instances
25		in attributes of :class:`~IPython.core.inputsplitter.IPythonInputSplitter`:
26
27		* ``physical_line_transforms`` act on the lines as the user enters them. For
28		example, these strip Python prompts from examples pasted in.
29		* ``logical_line_transforms`` act on lines as connected by explicit line
30		continuations, i.e. ``\`` at the end of physical lines. They are skipped
31		inside multiline Python statements. This is the point where IPython recognises
32		``%magic`` commands, for instance.
33		* ``python_line_transforms`` act on blocks containing complete Python statements.
34		Multi-line strings, lists and function calls are reassembled before being
35		passed to these, but note that function and class definitions are still a
36		series of separate statements. IPython does not use any of these by default.
37
38		An InteractiveShell instance actually has two
39		:class:`~IPython.core.inputsplitter.IPythonInputSplitter` instances, as the
40		attributes :attr:`~IPython.core.interactiveshell.InteractiveShell.input_splitter`,
41		to tell when a block of input is complete, and
42		:attr:`~IPython.core.interactiveshell.InteractiveShell.input_transformer_manager`,
43		to transform complete cells. If you add a transformer, you should make sure that
44		it gets added to both, e.g.::
45
46		ip.input_splitter.logical_line_transforms.append(my_transformer())
47		ip.input_transformer_manager.logical_line_transforms.append(my_transformer())
	21	.. versionchanged:: 7.0
	22
	23	The API for string and token-based transformations has been completely
	24	redesigned. Any third party code extending input transformation will need to
	25	be rewritten. The new API is, hopefully, simpler.
	26
	27	String based transformations are managed by
	28	:class:`IPython.core.inputtransformer2.TransformerManager`, which is attached to
	29	the :class:`~IPython.core.interactiveshell.InteractiveShell` instance as
	30	``input_transformer_manager``. This passes the
	31	data through a series of individual transformers. There are two kinds of
	32	transformers stored in three groups:
	33
	34	* ``cleanup_transforms`` and ``line_transforms`` are lists of functions. Each
	35	function is called with a list of input lines (which include trailing
	36	newlines), and they return a list in the same format. ``cleanup_transforms``
	37	are run first; they strip prompts and leading indentation from input.
	38	The only default transform in ``line_transforms`` processes cell magics.
	39	* ``token_transformers`` is a list of :class:`IPython.core.inputtransformer2.TokenTransformBase`
	40	subclasses (not instances). They recognise special syntax like
	41	``%line magics`` and ``help?``, and transform them to Python syntax. The
	42	interface for these is more complex; see below.
48	43
49	44	These transformers may raise :exc:`SyntaxError` if the input code is invalid, but
50	45	in most cases it is clearer to pass unrecognised code through unmodified and let
		@@ -54,124 +49,103 b" Python's own parser decide whether it is valid."
54	49
55	50	Added the option to raise :exc:`SyntaxError`.
56	51
57		~~Stateless~~ transformations
58		-------------------------
	52	Line based transformations
	53	--------------------------
59	54
60		The simplest kind of transformations work one line at a time. Write a function
61		which takes a line and returns a line, and decorate it with
62		:meth:`StatelessInputTransformer.wrap`::
	55	For example, imagine we want to obfuscate our code by reversing each line, so
	56	we'd write ``)5(f =+ a`` instead of ``a += f(5)``. Here's how we could swap it
	57	back the right way before IPython tries to run it::
63	58
64		@StatelessInputTransformer.wrap
65		def my_special_commands(line):
66		if line.startswith("¬"):
67		return "specialcommand(" + repr(line) + ")"
68		return line
	59	def reverse_line_chars(lines):
	60	new_lines = []
	61	for line in lines:
	62	chars = line[:-1] # the newline needs to stay at the end
	63	new_lines.append(chars[::-1] + '\n')
	64	return new_lines
69	65
70		The decorator returns a factory function which will produce instances of
71		:class:`~IPython.core.inputtransformer.StatelessInputTransformer` using your
72		function.
	66	To start using this::
73	67
74		Transforming a full block
75		-------------------------
76
77		.. warning::
78
79		Transforming a full block at once will break the automatic detection of
80		whether a block of code is complete in interfaces relying on this
81		functionality, such as terminal IPython. You will need to use a
82		shortcut to force-execute your cells.
83
84		Transforming a full block of python code is possible by implementing a
85		:class:`~IPython.core.inputtransformer.Inputtransformer` and overwriting the
86		``push`` and ``reset`` methods. The reset method should send the full block of
87		transformed text. As an example a transformer the reversed the lines from last
88		to first.
89
90		from IPython.core.inputtransformer import InputTransformer
91
92		class ReverseLineTransformer(InputTransformer):
93
94		def __init__(self):
95		self.acc = []
96
97		def push(self, line):
98		self.acc.append(line)
99		return None
100
101		def reset(self):
102		ret = '\n'.join(self.acc[::-1])
103		self.acc = []
104		return ret
105
106
107		Coroutine transformers
108		----------------------
109
110		More advanced transformers can be written as coroutines. The coroutine will be
111		sent each line in turn, followed by ``None`` to reset it. It can yield lines, or
112		``None`` if it is accumulating text to yield at a later point. When reset, it
113		should give up any code it has accumulated.
114
115		You may use :meth:`CoroutineInputTransformer.wrap` to simplify the creation of
116		such a transformer.
117
118		Here is a simple :class:`CoroutineInputTransformer` that can be thought of
119		being the identity::
120
121		from IPython.core.inputtransformer import CoroutineInputTransformer
122
123		@CoroutineInputTransformer.wrap
124		def noop():
125		line = ''
126		while True:
127		line = (yield line)
	68	ip = get_ipython()
	69	ip.input_transformer_manager.line_transforms.append(reverse_line_chars)
	70
	71	Token based transformations
	72	---------------------------
	73
	74	These recognise special syntax like ``%magics`` and ``help?``, and transform it
	75	into valid Python code. Using tokens makes it easy to avoid transforming similar
	76	patterns inside comments or strings.
	77
	78	The API for a token-based transformation looks like this::
	79
	80	.. class:: MyTokenTransformer
	81
	82	.. classmethod:: find(tokens_by_line)
	83
	84	Takes a list of lists of :class:`tokenize.TokenInfo` objects. Each sublist
	85	is the tokens from one Python line, which may span several physical lines,
	86	because of line continuations, multiline strings or expressions. If it
	87	finds a pattern to transform, it returns an instance of the class.
	88	Otherwise, it returns None.
	89
	90	.. attribute:: start_lineno
	91	start_col
	92	priority
	93
	94	These attributes are used to select which transformation to run first.
	95	``start_lineno`` is 0-indexed (whereas the locations on
	96	:class:`~tokenize.TokenInfo` use 1-indexed line numbers). If there are
	97	multiple matches in the same location, the one with the smaller
	98	``priority`` number is used.
	99
	100	.. method:: transform(lines)
	101
	102	This should transform the individual recognised pattern that was
	103	previously found. As with line-based transforms, it takes a list of
	104	lines as strings, and returns a similar list.
	105
	106	Because each transformation may affect the parsing of the code after it,
	107	``TransformerManager`` takes a careful approach. It calls ``find()`` on all
	108	available transformers. If any find a match, the transformation which matched
	109	closest to the start is run. Then it tokenises the transformed code again,
	110	and starts the process again. This continues until none of the transformers
	111	return a match. So it's important that the transformation removes the pattern
	112	which ``find()`` recognises, otherwise it will enter an infinite loop.
	113
	114	For example, here's a transformer which will recognise ``¬`` as a prefix for a
	115	new kind of special command::
	116
	117	import tokenize
	118	from IPython.core.inputtransformer2 import TokenTransformBase
	119
	120	class MySpecialCommand(TokenTransformBase):
	121	@classmethod
	122	def find(cls, tokens_by_line):
	123	"""Find the first escaped command (¬foo) in the cell.
	124	"""
	125	for line in tokens_by_line:
	126	ix = 0
	127	# Find the first token that's not INDENT/DEDENT
	128	while line[ix].type in {tokenize.INDENT, tokenize.DEDENT}:
	129	ix += 1
	130	if line[ix].string == '¬':
	131	return cls(line[ix].start)
	132
	133	def transform(self, lines):
	134	indent = lines[self.start_line][:self.start_col]
	135	content = lines[self.start_line][self.start_col+1:]
	136
	137	lines_before = lines[:self.start_line]
	138	call = "specialcommand(%r)" % content
	139	new_line = indent + call + '\n'
	140	lines_after = lines[self.start_line + 1:]
	141
	142	return lines_before + [new_line] + lines_after
	143
	144	And here's how you'd use it::
128	145
129	146	ip = get_ipython()
	147	ip.input_transformer_manager.token_transformers.append(MySpecialCommand)
130	148
131		ip.input_splitter.logical_line_transforms.append(noop())
132		ip.input_transformer_manager.logical_line_transforms.append(noop())
133
134		This code in IPython strips a constant amount of leading indentation from each
135		line in a cell::
136
137		from IPython.core.inputtransformer import CoroutineInputTransformer
138
139		@CoroutineInputTransformer.wrap
140		def leading_indent():
141		"""Remove leading indentation.
142
143		If the first line starts with a spaces or tabs, the same whitespace will be
144		removed from each following line until it is reset.
145		"""
146		space_re = re.compile(r'^[ \t]+')
147		line = ''
148		while True:
149		line = (yield line)
150
151		if line is None:
152		continue
153
154		m = space_re.match(line)
155		if m:
156		space = m.group(0)
157		while line is not None:
158		if line.startswith(space):
159		line = line[len(space):]
160		line = (yield line)
161		else:
162		# No leading spaces - wait for reset
163		while line is not None:
164		line = (yield line)
165
166
167		Token-based transformers
168		------------------------
169
170		There is an experimental framework that takes care of tokenizing and
171		untokenizing lines of code. Define a function that accepts a list of tokens, and
172		returns an iterable of output tokens, and decorate it with
173		:meth:`TokenInputTransformer.wrap`. These should only be used in
174		``python_line_transforms``.
175	149
176	150	AST transformations
177	151	===================
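
The new documentation for ``find(tokens_by_line)`` describes a list of lists of ``tokenize.TokenInfo`` objects, one sublist per Python line, and notes that ``start_lineno`` is 0-indexed while ``TokenInfo`` rows are 1-indexed. The following is a minimal standard-library sketch of that data shape, not IPython's own implementation: ``group_tokens_by_line`` is a hypothetical helper written for illustration, and it splits only on ``NEWLINE`` tokens, which is a simplification.

```python
import io
import tokenize


def group_tokens_by_line(source):
    """Group tokens into one list per logical line (hypothetical helper).

    Splitting only on NEWLINE tokens keeps a statement that spans several
    physical lines (for example an open parenthesis) in a single sublist.
    """
    groups = [[]]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        groups[-1].append(tok)
        if tok.type == tokenize.NEWLINE:
            groups.append([])
    if not groups[-1]:
        groups.pop()
    return groups


# The first statement spans two physical lines; the second starts on line 3.
source = "a = (1,\n     2)\nb = 3\n"
groups = group_tokens_by_line(source)

# tokenize reports 1-indexed rows, so subtract 1 to get the 0-indexed
# line number that can be used to index into a list of input lines.
first = groups[1][0]
start_lineno = first.start[0] - 1
start_col = first.start[1]
```

With this input, the token ``b`` is reported by ``tokenize`` on row 3, so the 0-indexed line number is 2, which is the index of ``"b = 3\n"`` in ``source.splitlines(keepends=True)``.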
