upstream/ipython Commit - r24171:f3bd056d

Document the new input transformation API

Thomas Kluyver


docs/source/whatsnew/pr/inputtransformer2.rst

0 created 644 +3 0

@@ -0,0 +1,3 b''
	1	* The API for transforming input before it is parsed as Python code has been
	2	completely redesigned, and any custom input transformations will need to be
	3	rewritten. See :doc:`/config/inputtransforms` for details of the new API.

docs/source/config/inputtransforms.rst

0 +96 -122

@@ -15,36 +15,31 b' String based transformations'
15		15
16	.. currentmodule:: IPython.core.inputtransforms	16	.. currentmodule:: IPython.core.inputtransforms
17		17
18	When the user enters ~~a line of~~ code, it is first processed as a string. By the	18	When the user enters code, it is first processed as a string. By the
19	end of this stage, it must be valid Python syntax.	19	end of this stage, it must be valid Python syntax.
20		20
21	These transformers all subclass :class:`IPython.core.inputtransformer.InputTransformer`,	21	.. versionchanged:: 7.0
22	and are used by :class:`IPython.core.inputsplitter.IPythonInputSplitter`.	22
23		23	The API for string and token-based transformations has been completely
24	These transformers act in three groups, stored separately as lists of instances	24	redesigned. Any third party code extending input transformation will need to
25	in attributes of :class:`~IPython.core.inputsplitter.IPythonInputSplitter`:	25	be rewritten. The new API is, hopefully, simpler.
26		26
27	* ``physical_line_transforms`` act on the lines as the user enters them. For	27	String based transformations are managed by
28	example, these strip Python prompts from examples pasted in.	28	:class:`IPython.core.inputtransformer2.TransformerManager`, which is attached to
29	* ``logical_line_transforms`` act on lines as connected by explicit line	29	the :class:`~IPython.core.interactiveshell.InteractiveShell` instance as
30	continuations, i.e. ``\`` at the end of physical lines. They are skipped	30	``input_transformer_manager``. This passes the
31	inside multiline Python statements. This is the point where IPython recognises	31	data through a series of individual transformers. There are two kinds of
32	``%magic`` commands, for instance.	32	transformers stored in three groups:
33	* ``python_line_transforms`` act on blocks containing complete Python statements.	33
34	Multi-line strings, lists and function calls are reassembled before being	34	* ``cleanup_transforms`` and ``line_transforms`` are lists of functions. Each
35	passed to these, but note that function and class definitions are still a	35	function is called with a list of input lines (which include trailing
36	series of separate statements. IPython does not use any of these by default.	36	newlines), and they return a list in the same format. ``cleanup_transforms``
37		37	are run first; they strip prompts and leading indentation from input.
38	An InteractiveShell instance actually has two	38	The only default transform in ``line_transforms`` processes cell magics.
39	:class:`~IPython.core.inputsplitter.IPythonInputSplitter` instances, as the	39	* ``token_transformers`` is a list of :class:`IPython.core.inputtransformer2.TokenTransformBase`
40	attributes :attr:`~IPython.core.interactiveshell.InteractiveShell.input_splitter`,	40	subclasses (not instances). They recognise special syntax like
41	to tell when a block of input is complete, and	41	``%line magics`` and ``help?``, and transform them to Python syntax. The
42	:attr:`~IPython.core.interactiveshell.InteractiveShell.input_transformer_manager`,	42	interface for these is more complex; see below.
43	to transform complete cells. If you add a transformer, you should make sure that
44	it gets added to both, e.g.::
45
46	ip.input_splitter.logical_line_transforms.append(my_transformer())
47	ip.input_transformer_manager.logical_line_transforms.append(my_transformer())
48		43
49	These transformers may raise :exc:`SyntaxError` if the input code is invalid, but	44	These transformers may raise :exc:`SyntaxError` if the input code is invalid, but
50	in most cases it is clearer to pass unrecognised code through unmodified and let	45	in most cases it is clearer to pass unrecognised code through unmodified and let
@@ -54,124 +49,103 b" Python's own parser decide whether it is valid."
54		49
55	Added the option to raise :exc:`SyntaxError`.	50	Added the option to raise :exc:`SyntaxError`.
56		51
57	~~Stateless~~ transformations	52	Line based transformations
58	-------------------------	53	--------------------------
59		54
60	The simplest kind of transformations work one line at a time. Write a function	55	For example, imagine we want to obfuscate our code by reversing each line, so
61	which takes a line and returns a line, and decorate it with	56	we'd write ``)5(f =+ a`` instead of ``a += f(5)``. Here's how we could swap it
62	:meth:`StatelessInputTransformer.wrap`::	57	back the right way before IPython tries to run it::
63		58
64	@StatelessInputTransformer.wrap	59	def reverse_line_chars(lines):
65	def my_special_commands(line):	60	new_lines = []
66	if line.startswith("¬"):	61	for line in lines:
67	return "specialcommand(" + repr(line) + ")"	62	chars = line[:-1] # the newline needs to stay at the end
68	return line	63	new_lines.append(chars[::-1] + '\n')
		64	return new_lines
69		65
70	The decorator returns a factory function which will produce instances of	66	To start using this::
71	:class:`~IPython.core.inputtransformer.StatelessInputTransformer` using your
72	function.
73		67
74	Transforming a full block	68	ip = get_ipython()
75	-------------------------	69	ip.input_transformer_manager.line_transforms.append(reverse_line_chars)
76
77	.. warning::
78		70
79	Transforming a full block at once will break the automatic detection of	71	Token based transformations
80	whether a block of code is complete in interfaces relying on this	72	---------------------------
81	functionality, such as terminal IPython. You will need to use a
82	shortcut to force-execute your cells.
83		73
84	Transforming a full block of python code is possible by implementing a	74	These recognise special syntax like ``%magics`` and ``help?``, and transform it
85	:class:`~IPython.core.inputtransformer.Inputtransformer` and overwriting the	75	into valid Python code. Using tokens makes it easy to avoid transforming similar
86	``push`` and ``reset`` methods. The reset method should send the full block of	76	patterns inside comments or strings.
87	transformed text. As an example a transformer the reversed the lines from last
88	to first.
89		77
90	from IPython.core.inputtransformer import InputTransformer	78	The API for a token-based transformation looks like this::
91		79
92	class ReverseLineTransformer(InputTransformer):	80	.. class:: MyTokenTransformer
93		81
94	def __init__(self):	82	.. classmethod:: find(tokens_by_line)
95	self.acc = []
96		83
97	def push(self, line):	84	Takes a list of lists of :class:`tokenize.TokenInfo` objects. Each sublist
98	self.acc.append(line)	85	is the tokens from one Python line, which may span several physical lines,
99	return None	86	because of line continuations, multiline strings or expressions. If it
		87	finds a pattern to transform, it returns an instance of the class.
		88	Otherwise, it returns None.
100		89
101	def reset(self):	90	.. attribute:: start_lineno
102	ret = '\n'.join(self.acc[::-1])	91	start_col
103	self.acc = []	92	priority
104	return ret
105		93
		94	These attributes are used to select which transformation to run first.
		95	``start_lineno`` is 0-indexed (whereas the locations on
		96	:class:`~tokenize.TokenInfo` use 1-indexed line numbers). If there are
		97	multiple matches in the same location, the one with the smaller
		98	``priority`` number is used.
106		99
107	Coroutine transformers	100	.. method:: transform(lines)
108	----------------------
109		101
110	More advanced transformers can be written as coroutines. The coroutine will be	102	This should transform the individual recognised pattern that was
111	sent each line in turn, followed by ``None`` to reset it. It can yield lines, or	103	previously found. As with line-based transforms, it takes a list of
112	``None`` if it is accumulating text to yield at a later point. When reset, it	104	lines as strings, and returns a similar list.
113	should give up any code it has accumulated.
114		105
115	You may use :meth:`CoroutineInputTransformer.wrap` to simplify the creation of	106	Because each transformation may affect the parsing of the code after it,
116	such a transformer.	107	``TransformerManager`` takes a careful approach. It calls ``find()`` on all
		108	available transformers. If any find a match, the transformation which matched
		109	closest to the start is run. Then it tokenises the transformed code again,
		110	and starts the process again. This continues until none of the transformers
		111	return a match. So it's important that the transformation removes the pattern
		112	which ``find()`` recognises, otherwise it will enter an infinite loop.
117		113
118	Here is a simple :class:`CoroutineInputTransformer` that can be thought of	114	For example, here's a transformer which will recognise ``¬`` as a prefix for a
119	being the identity::	115	new kind of special command::
120		116
121	from IPython.core.inputtransformer import CoroutineInputTransformer	117	import tokenize
		118	from IPython.core.inputtransformer2 import TokenTransformBase
122		119
123	@CoroutineInputTransformer.wrap	120	class MySpecialCommand(TokenTransformBase):
124	def noop():	121	@classmethod
125	line = ''	122	def find(cls, tokens_by_line):
126	while True:	123	"""Find the first escaped command (¬foo) in the cell.
127	line = (yield line)	124	"""
		125	for line in tokens_by_line:
		126	ix = 0
		127	# Find the first token that's not INDENT/DEDENT
		128	while line[ix].type in {tokenize.INDENT, tokenize.DEDENT}:
		129	ix += 1
		130	if line[ix].string == '¬':
		131	return cls(line[ix].start)
128		132
129	ip = get_ipython()	133	def transform(self, lines):
		134	indent = lines[self.start_line][:self.start_col]
		135	content = lines[self.start_line][self.start_col+1:]
130		136
131	ip.input_splitter.logical_line_transforms.append(noop())	137	lines_before = lines[:self.start_line]
132	ip.input_transformer_manager.logical_line_transforms.append(noop())	138	call = "specialcommand(%r)" % content
		139	new_line = indent + call + '\n'
		140	lines_after = lines[self.start_line + 1:]
133		141
134	This code in IPython strips a constant amount of leading indentation from each	142	return lines_before + [new_line] + lines_after
135	line in a cell::
136		143
137	from IPython.core.inputtransformer import CoroutineInputTransformer	144	And here's how you'd use it::
138		145
139	@CoroutineInputTransformer.wrap	146	ip = get_ipython()
140	def leading_indent():	147	ip.input_transformer_manager.token_transformers.append(MySpecialCommand)
141	"""Remove leading indentation.
142		148
143	If the first line starts with a spaces or tabs, the same whitespace will be
144	removed from each following line until it is reset.
145	"""
146	space_re = re.compile(r'^[ \t]+')
147	line = ''
148	while True:
149	line = (yield line)
150
151	if line is None:
152	continue
153
154	m = space_re.match(line)
155	if m:
156	space = m.group(0)
157	while line is not None:
158	if line.startswith(space):
159	line = line[len(space):]
160	line = (yield line)
161	else:
162	# No leading spaces - wait for reset
163	while line is not None:
164	line = (yield line)
165
166
167	Token-based transformers
168	------------------------
169
170	There is an experimental framework that takes care of tokenizing and
171	untokenizing lines of code. Define a function that accepts a list of tokens, and
172	returns an iterable of output tokens, and decorate it with
173	:meth:`TokenInputTransformer.wrap`. These should only be used in
174	``python_line_transforms``.
175		149
176	AST transformations	150	AST transformations
177	===================	151	===================
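
The line-based transform added in this commit is a plain function from a list of lines to a list of lines, so it can be exercised without a running IPython session. The sketch below copies ``reverse_line_chars`` from the new documentation and applies it to a two-line cell. The sample cell and the ``cell``, ``lines`` and ``restored`` names are illustrative assumptions, not part of the commit.

```python
def reverse_line_chars(lines):
    """Reverse the characters of each line, keeping the newline at the end."""
    new_lines = []
    for line in lines:
        chars = line[:-1]  # drop the trailing newline before reversing
        new_lines.append(chars[::-1] + '\n')
    return new_lines


# Hypothetical obfuscated cell, one reversed statement per line.
cell = ")5(f =+ a\n)a(tnirp\n"

# Line transforms receive lines that still carry their trailing newlines.
lines = cell.splitlines(keepends=True)

restored = reverse_line_chars(lines)
print("".join(restored), end="")
```

The function relies on every line ending in a newline, as the new text says the manager guarantees. Called directly on a final line without one, it would drop that line's last character.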
