@@ -0,0 +1,3 b'' | |||
|
1 | * The API for transforming input before it is parsed as Python code has been | |
|
2 | completely redesigned, and any custom input transformations will need to be | |
|
3 | rewritten. See :doc:`/config/inputtransforms` for details of the new API. |
@@ -15,36 +15,31 b' String based transformations' | |||
|
15 | 15 | |
|
16 | 16 | .. currentmodule:: IPython.core.inputtransforms |
|
17 | 17 | |
|
18 |
When the user enters |
|
|
18 | When the user enters code, it is first processed as a string. By the | |
|
19 | 19 | end of this stage, it must be valid Python syntax. |
|
20 | 20 | |
|
21 | These transformers all subclass :class:`IPython.core.inputtransformer.InputTransformer`, | |
|
22 | and are used by :class:`IPython.core.inputsplitter.IPythonInputSplitter`. | |
|
23 | ||
|
24 | These transformers act in three groups, stored separately as lists of instances | |
|
25 | in attributes of :class:`~IPython.core.inputsplitter.IPythonInputSplitter`: | |
|
26 | ||
|
27 | * ``physical_line_transforms`` act on the lines as the user enters them. For | |
|
28 | example, these strip Python prompts from examples pasted in. | |
|
29 | * ``logical_line_transforms`` act on lines as connected by explicit line | |
|
30 | continuations, i.e. ``\`` at the end of physical lines. They are skipped | |
|
31 | inside multiline Python statements. This is the point where IPython recognises | |
|
32 | ``%magic`` commands, for instance. | |
|
33 | * ``python_line_transforms`` act on blocks containing complete Python statements. | |
|
34 | Multi-line strings, lists and function calls are reassembled before being | |
|
35 | passed to these, but note that function and class *definitions* are still a | |
|
36 | series of separate statements. IPython does not use any of these by default. | |
|
37 | ||
|
38 | An InteractiveShell instance actually has two | |
|
39 | :class:`~IPython.core.inputsplitter.IPythonInputSplitter` instances, as the | |
|
40 | attributes :attr:`~IPython.core.interactiveshell.InteractiveShell.input_splitter`, | |
|
41 | to tell when a block of input is complete, and | |
|
42 | :attr:`~IPython.core.interactiveshell.InteractiveShell.input_transformer_manager`, | |
|
43 | to transform complete cells. If you add a transformer, you should make sure that | |
|
44 | it gets added to both, e.g.:: | |
|
45 | ||
|
46 | ip.input_splitter.logical_line_transforms.append(my_transformer()) | |
|
47 | ip.input_transformer_manager.logical_line_transforms.append(my_transformer()) | |
|
21 | .. versionchanged:: 7.0 | |
|
22 | ||
|
23 | The API for string and token-based transformations has been completely | |
|
24 | redesigned. Any third party code extending input transformation will need to | |
|
25 | be rewritten. The new API is, hopefully, simpler. | |
|
26 | ||
|
27 | String based transformations are managed by | |
|
28 | :class:`IPython.core.inputtransformer2.TransformerManager`, which is attached to | |
|
29 | the :class:`~IPython.core.interactiveshell.InteractiveShell` instance as | |
|
30 | ``input_transformer_manager``. This passes the | |
|
31 | data through a series of individual transformers. There are two kinds of | |
|
32 | transformers stored in three groups: | |
|
33 | ||
|
34 | * ``cleanup_transforms`` and ``line_transforms`` are lists of functions. Each | |
|
35 | function is called with a list of input lines (which include trailing | |
|
36 | newlines), and they return a list in the same format. ``cleanup_transforms`` | |
|
37 | are run first; they strip prompts and leading indentation from input. | |
|
38 | The only default transform in ``line_transforms`` processes cell magics. | |
|
39 | * ``token_transformers`` is a list of :class:`IPython.core.inputtransformer2.TokenTransformBase` | |
|
40 | subclasses (not instances). They recognise special syntax like | |
|
41 | ``%line magics`` and ``help?``, and transform them to Python syntax. The | |
|
42 | interface for these is more complex; see below. | |
|
48 | 43 | |
|
49 | 44 | These transformers may raise :exc:`SyntaxError` if the input code is invalid, but |
|
50 | 45 | in most cases it is clearer to pass unrecognised code through unmodified and let |
@@ -54,124 +49,103 b" Python's own parser decide whether it is valid." | |||
|
54 | 49 | |
|
55 | 50 | Added the option to raise :exc:`SyntaxError`. |
|
56 | 51 | |
|
57 |
|
|
|
58 | ------------------------- | |
|
52 | Line based transformations | |
|
53 | -------------------------- | |
|
59 | 54 | |
|
60 | The simplest kind of transformations work one line at a time. Write a function | |
|
61 | which takes a line and returns a line, and decorate it with | |
|
62 | :meth:`StatelessInputTransformer.wrap`:: | |
|
55 | For example, imagine we want to obfuscate our code by reversing each line, so | |
|
56 | we'd write ``)5(f =+ a`` instead of ``a += f(5)``. Here's how we could swap it | |
|
57 | back the right way before IPython tries to run it:: | |
|
63 | 58 | |
|
64 | @StatelessInputTransformer.wrap | |
|
65 | def my_special_commands(line): | |
|
66 | if line.startswith("¬"): | |
|
67 | return "specialcommand(" + repr(line) + ")" | |
|
68 | return line | |
|
59 | def reverse_line_chars(lines): | |
|
60 | new_lines = [] | |
|
61 | for line in lines: | |
|
62 | chars = line[:-1] # the newline needs to stay at the end | |
|
63 | new_lines.append(chars[::-1] + '\n') | |
|
64 | return new_lines | |
|
69 | 65 | |
|
70 | The decorator returns a factory function which will produce instances of | |
|
71 | :class:`~IPython.core.inputtransformer.StatelessInputTransformer` using your | |
|
72 | function. | |
|
66 | To start using this:: | |
|
73 | 67 |
|
|
74 | Transforming a full block | |
|
75 | ------------------------- | |
|
76 |
|
|
|
77 | .. warning:: | |
|
78 | ||
|
79 | Transforming a full block at once will break the automatic detection of | |
|
80 | whether a block of code is complete in interfaces relying on this | |
|
81 | functionality, such as terminal IPython. You will need to use a | |
|
82 | shortcut to force-execute your cells. | |
|
83 | ||
|
84 | Transforming a full block of python code is possible by implementing a | |
|
85 | :class:`~IPython.core.inputtransformer.Inputtransformer` and overwriting the | |
|
86 | ``push`` and ``reset`` methods. The reset method should send the full block of | |
|
87 | transformed text. As an example a transformer the reversed the lines from last | |
|
88 | to first. | |
|
89 | ||
|
90 | from IPython.core.inputtransformer import InputTransformer | |
|
91 | ||
|
92 | class ReverseLineTransformer(InputTransformer): | |
|
93 | ||
|
94 | def __init__(self): | |
|
95 | self.acc = [] | |
|
96 | ||
|
97 | def push(self, line): | |
|
98 | self.acc.append(line) | |
|
99 | return None | |
|
100 | ||
|
101 | def reset(self): | |
|
102 | ret = '\n'.join(self.acc[::-1]) | |
|
103 | self.acc = [] | |
|
104 | return ret | |
|
105 | ||
|
106 | ||
|
107 | Coroutine transformers | |
|
108 | ---------------------- | |
|
109 | ||
|
110 | More advanced transformers can be written as coroutines. The coroutine will be | |
|
111 | sent each line in turn, followed by ``None`` to reset it. It can yield lines, or | |
|
112 | ``None`` if it is accumulating text to yield at a later point. When reset, it | |
|
113 | should give up any code it has accumulated. | |
|
114 | ||
|
115 | You may use :meth:`CoroutineInputTransformer.wrap` to simplify the creation of | |
|
116 | such a transformer. | |
|
117 | ||
|
118 | Here is a simple :class:`CoroutineInputTransformer` that can be thought of | |
|
119 | being the identity:: | |
|
120 | ||
|
121 | from IPython.core.inputtransformer import CoroutineInputTransformer | |
|
122 |
|
|
|
123 | @CoroutineInputTransformer.wrap | |
|
124 | def noop(): | |
|
125 | line = '' | |
|
126 | while True: | |
|
127 | line = (yield line) | |
|
68 | ip = get_ipython() | |
|
69 | ip.input_transformer_manager.line_transforms.append(reverse_line_chars) | |
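As a quick sanity check outside IPython, the function above can be called directly on a list of lines. This is only a minimal sketch: the sample input is made up for illustration, and it assumes every line ends with a newline, as the function itself does.

```python
def reverse_line_chars(lines):
    """Reverse the characters of each line, keeping the trailing newline in place."""
    new_lines = []
    for line in lines:
        chars = line[:-1]  # drop the newline so it stays at the end
        new_lines.append(chars[::-1] + '\n')
    return new_lines


# Hypothetical obfuscated input: each line is written backwards.
obfuscated = [")5(f =+ a\n", ")a(tnirp\n"]
restored = reverse_line_chars(obfuscated)
print(restored)  # ['a += f(5)\n', 'print(a)\n']
```

Because reversing twice gives back the original text, applying the function to ``restored`` returns the obfuscated lines again.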
|
70 | ||
|
71 | Token based transformations | |
|
72 | --------------------------- | |
|
73 | ||
|
74 | These recognise special syntax like ``%magics`` and ``help?``, and transform it | |
|
75 | into valid Python code. Using tokens makes it easy to avoid transforming similar | |
|
76 | patterns inside comments or strings. | |
|
77 | ||
|
78 | The API for a token-based transformation looks like this:: | |
|
79 | ||
|
80 | .. class:: MyTokenTransformer | |
|
81 | ||
|
82 | .. classmethod:: find(tokens_by_line) | |
|
83 | ||
|
84 | Takes a list of lists of :class:`tokenize.TokenInfo` objects. Each sublist | |
|
85 | is the tokens from one Python line, which may span several physical lines, | |
|
86 | because of line continuations, multiline strings or expressions. If it | |
|
87 | finds a pattern to transform, it returns an instance of the class. | |
|
88 | Otherwise, it returns None. | |
|
89 | ||
|
90 | .. attribute:: start_line | |
|
91 | start_col | |
|
92 | priority | |
|
93 | ||
|
94 | These attributes are used to select which transformation to run first. | |
|
95 | ``start_line`` is 0-indexed (whereas the locations on | |
|
96 | :class:`~tokenize.TokenInfo` use 1-indexed line numbers). If there are | |
|
97 | multiple matches in the same location, the one with the smaller | |
|
98 | ``priority`` number is used. | |
|
99 | ||
|
100 | .. method:: transform(lines) | |
|
101 | ||
|
102 | This should transform the individual recognised pattern that was | |
|
103 | previously found. As with line-based transforms, it takes a list of | |
|
104 | lines as strings, and returns a similar list. | |
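The difference in line indexing noted above is easy to trip over, so here is a small standard-library sketch of it. The source string and variable names are invented for illustration; only the ``tokenize`` behaviour is taken as given.

```python
import io
import tokenize

source = "a = 1\nb = 2\n"
lines = source.splitlines(keepends=True)

tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
b_token = next(tok for tok in tokens if tok.string == "b")

# TokenInfo.start is a (row, column) pair; rows count from 1, columns from 0.
row, col = b_token.start

# Subtract 1 from the row to index into a plain list of lines.
line_index = row - 1
matched_line = lines[line_index]
print(row, col, repr(matched_line))
```

A transformer that stores a 0-indexed line number can then slice the list of lines passed to ``transform()`` without any further adjustment.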
|
105 | ||
|
106 | Because each transformation may affect the parsing of the code after it, | |
|
107 | ``TransformerManager`` takes a careful approach. It calls ``find()`` on all | |
|
108 | available transformers. If any find a match, the transformation which matched | |
|
109 | closest to the start is run. Then it tokenises the transformed code again, | |
|
110 | and starts the process again. This continues until none of the transformers | |
|
111 | return a match. So it's important that the transformation removes the pattern | |
|
112 | which ``find()`` recognises, otherwise it will enter an infinite loop. | |
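The loop described above can be modelled roughly as follows. This is a simplified sketch, not IPython's implementation: to stay self-contained it matches on raw lines instead of tokens, and ``ToyTransformer``, ``SpecialCommand``, ``HelpQuery``, ``show_help`` and ``run_transformers`` are hypothetical names introduced only for this illustration.

```python
class ToyTransformer:
    """Hypothetical stand-in for a token transformer; matches a line prefix."""
    prefix = None
    template = None
    priority = 10

    def __init__(self, start_line, start_col):
        self.start_line = start_line  # 0-indexed line of the match
        self.start_col = start_col    # column where the prefix starts

    @classmethod
    def find(cls, lines):
        """Return an instance for the first matching line, or None."""
        for lineno, line in enumerate(lines):
            stripped = line.lstrip()
            if stripped.startswith(cls.prefix):
                return cls(lineno, len(line) - len(stripped))
        return None

    def transform(self, lines):
        """Rewrite the matched line so that find() no longer recognises it."""
        line = lines[self.start_line]
        indent = line[:self.start_col]
        content = line[self.start_col + len(self.prefix):].rstrip("\n")
        new_line = indent + self.template % content + "\n"
        return lines[:self.start_line] + [new_line] + lines[self.start_line + 1:]


class SpecialCommand(ToyTransformer):
    prefix = "¬"
    template = "specialcommand(%r)"


class HelpQuery(ToyTransformer):
    prefix = "?"
    template = "show_help(%r)"


def run_transformers(lines, transformer_classes):
    """Apply the earliest match, then search again, until nothing matches."""
    while True:
        found = (cls.find(lines) for cls in transformer_classes)
        matches = [m for m in found if m is not None]
        if not matches:
            return lines
        # Earliest position wins; ties go to the smaller priority number.
        first = min(matches, key=lambda m: (m.start_line, m.start_col, m.priority))
        lines = first.transform(lines)


cell = ["¬foo\n", "x = 1\n", "    ?bar\n"]
result = run_transformers(cell, [SpecialCommand, HelpQuery])
print(result)
```

Each pass rewrites exactly one match and then searches again from scratch, which is why a ``transform()`` that leaves its own pattern in place would never let the loop finish.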
|
113 | ||
|
114 | For example, here's a transformer which will recognise ``¬`` as a prefix for a | |
|
115 | new kind of special command:: | |
|
116 | ||
|
117 | import tokenize | |
|
118 | from IPython.core.inputtransformer2 import TokenTransformBase | |
|
119 | ||
|
120 | class MySpecialCommand(TokenTransformBase): | |
|
121 | @classmethod | |
|
122 | def find(cls, tokens_by_line): | |
|
123 | """Find the first escaped command (¬foo) in the cell. | |
|
124 | """ | |
|
125 | for line in tokens_by_line: | |
|
126 | ix = 0 | |
|
127 | # Find the first token that's not INDENT/DEDENT | |
|
128 | while ix < len(line) and line[ix].type in {tokenize.INDENT, tokenize.DEDENT}: | |
|
129 | ix += 1 | |
|
130 | if ix < len(line) and line[ix].string == '¬': | |
|
131 | return cls(line[ix].start) | |
|
132 | ||
|
133 | def transform(self, lines): | |
|
134 | indent = lines[self.start_line][:self.start_col] | |
|
135 | content = lines[self.start_line][self.start_col+1:] | |
|
136 | ||
|
137 | lines_before = lines[:self.start_line] | |
|
138 | call = "specialcommand(%r)" % content | |
|
139 | new_line = indent + call + '\n' | |
|
140 | lines_after = lines[self.start_line + 1:] | |
|
141 | ||
|
142 | return lines_before + [new_line] + lines_after | |
|
143 | ||
|
144 | And here's how you'd use it:: | |
|
128 | 145 | |
|
129 | 146 | ip = get_ipython() |
|
147 | ip.input_transformer_manager.token_transformers.append(MySpecialCommand) | |
|
130 | 148 | |
|
131 | ip.input_splitter.logical_line_transforms.append(noop()) | |
|
132 | ip.input_transformer_manager.logical_line_transforms.append(noop()) | |
|
133 | ||
|
134 | This code in IPython strips a constant amount of leading indentation from each | |
|
135 | line in a cell:: | |
|
136 | ||
|
137 | from IPython.core.inputtransformer import CoroutineInputTransformer | |
|
138 | ||
|
139 | @CoroutineInputTransformer.wrap | |
|
140 | def leading_indent(): | |
|
141 | """Remove leading indentation. | |
|
142 | ||
|
143 | If the first line starts with a spaces or tabs, the same whitespace will be | |
|
144 | removed from each following line until it is reset. | |
|
145 | """ | |
|
146 | space_re = re.compile(r'^[ \t]+') | |
|
147 | line = '' | |
|
148 | while True: | |
|
149 | line = (yield line) | |
|
150 | ||
|
151 | if line is None: | |
|
152 | continue | |
|
153 | ||
|
154 | m = space_re.match(line) | |
|
155 | if m: | |
|
156 | space = m.group(0) | |
|
157 | while line is not None: | |
|
158 | if line.startswith(space): | |
|
159 | line = line[len(space):] | |
|
160 | line = (yield line) | |
|
161 | else: | |
|
162 | # No leading spaces - wait for reset | |
|
163 | while line is not None: | |
|
164 | line = (yield line) | |
|
165 | ||
|
166 | ||
|
167 | Token-based transformers | |
|
168 | ------------------------ | |
|
169 | ||
|
170 | There is an experimental framework that takes care of tokenizing and | |
|
171 | untokenizing lines of code. Define a function that accepts a list of tokens, and | |
|
172 | returns an iterable of output tokens, and decorate it with | |
|
173 | :meth:`TokenInputTransformer.wrap`. These should only be used in | |
|
174 | ``python_line_transforms``. | |
|
175 | 149 |
|
|
176 | 150 | AST transformations |
|
177 | 151 | =================== |