Require blank line to end input cell immediately after dedenting....
Thomas Kluyver -
@@ -1,1005 +1,1009 b''
1 1 """Analysis of text input into executable blocks.
2 2
3 3 The main class in this module, :class:`InputSplitter`, is designed to break
4 4 input from either interactive, line-by-line environments or block-based ones,
5 5 into standalone blocks that can be executed by Python as 'single' statements
6 6 (thus triggering sys.displayhook).
7 7
8 8 A companion, :class:`IPythonInputSplitter`, provides the same functionality but
9 9 with full support for the extended IPython syntax (magics, system calls, etc).
10 10
11 11 For more details, see the class docstring below.
12 12
13 13 Syntax Transformations
14 14 ----------------------
15 15
16 16 One of the main jobs of the code in this file is to apply all syntax
17 17 transformations that make up 'the IPython language', i.e. magics, shell
18 18 escapes, etc. All transformations should be implemented as *fully stateless*
19 19 entities, that simply take one line as their input and return a line.
20 20 Internally for implementation purposes they may be a normal function or a
21 21 callable object, but the only input they receive will be a single line and they
22 22 should only return a line, without holding any data-dependent state between
23 23 calls.
24 24
25 25 As an example, the EscapedTransformer is a class so we can more clearly group
26 26 together the functionality of dispatching to individual functions based on the
27 27 starting escape character, but the only method for public use is its call
28 28 method.
29 29
30 30
31 31 ToDo
32 32 ----
33 33
34 34 - Should we make push() actually raise an exception once push_accepts_more()
35 35 returns False?
36 36
37 37 - Naming cleanups. The tr_* names aren't the most elegant, though now they are
38 38 at least just attributes of a class so not really very exposed.
39 39
40 40 - Think about the best way to support dynamic things: automagic, autocall,
41 41 macros, etc.
42 42
43 43 - Think of a better heuristic for the application of the transforms in
44 44 IPythonInputSplitter.push() than looking at the buffer ending in ':'. Idea:
45 45 track indentation change events (indent, dedent, nothing) and apply them only
46 46 if the indentation went up, but not otherwise.
47 47
48 48 - Think of the cleanest way for supporting user-specified transformations (the
49 49 user prefilters we had before).
50 50
51 51 Authors
52 52 -------
53 53
54 54 * Fernando Perez
55 55 * Brian Granger
56 56 """
57 57 #-----------------------------------------------------------------------------
58 58 # Copyright (C) 2010 The IPython Development Team
59 59 #
60 60 # Distributed under the terms of the BSD License. The full license is in
61 61 # the file COPYING, distributed as part of this software.
62 62 #-----------------------------------------------------------------------------
63 63 from __future__ import print_function
64 64
65 65 #-----------------------------------------------------------------------------
66 66 # Imports
67 67 #-----------------------------------------------------------------------------
68 68 # stdlib
69 69 import ast
70 70 import codeop
71 71 import re
72 72 import sys
73 73
74 74 # IPython modules
75 75 from IPython.utils.text import make_quoted_expr
76 76
77 77 #-----------------------------------------------------------------------------
78 78 # Globals
79 79 #-----------------------------------------------------------------------------
80 80
81 81 # The escape sequences that define the syntax transformations IPython will
82 82 # apply to user input. These can NOT be just changed here: many regular
83 83 # expressions and other parts of the code may use their hardcoded values, and
84 84 # for all intents and purposes they constitute the 'IPython syntax', so they
85 85 # should be considered fixed.
86 86
87 87 ESC_SHELL = '!' # Send line to underlying system shell
88 88 ESC_SH_CAP = '!!' # Send line to system shell and capture output
89 89 ESC_HELP = '?' # Find information about object
90 90 ESC_HELP2 = '??' # Find extra-detailed information about object
91 91 ESC_MAGIC = '%' # Call magic function
92 92 ESC_QUOTE = ',' # Split args on whitespace, quote each as string and call
93 93 ESC_QUOTE2 = ';' # Quote all args as a single string, call
94 94 ESC_PAREN = '/' # Call first argument with rest of line as arguments
95 95
96 96 #-----------------------------------------------------------------------------
97 97 # Utilities
98 98 #-----------------------------------------------------------------------------
99 99
100 100 # FIXME: These are general-purpose utilities that later can be moved to the
101 101 # general ward. Kept here for now because we're being very strict about test
102 102 # coverage with this code, and this lets us ensure that we keep 100% coverage
103 103 # while developing.
104 104
105 105 # compiled regexps for autoindent management
106 106 dedent_re = re.compile(r'^\s+raise|^\s+return|^\s+pass')
107 107 ini_spaces_re = re.compile(r'^([ \t\r\f\v]+)')
108 108
109 109 # regexp to match pure comment lines so we don't accidentally insert 'if 1:'
110 110 # before pure comments
111 111 comment_line_re = re.compile('^\s*\#')
112 112
113 113
114 114 def num_ini_spaces(s):
115 115 """Return the number of initial spaces in a string.
116 116
117 117 Note that tabs are counted as a single space. For now, we do *not* support
118 118 mixing of tabs and spaces in the user's input.
119 119
120 120 Parameters
121 121 ----------
122 122 s : string
123 123
124 124 Returns
125 125 -------
126 126 n : int
127 127 """
128 128
129 129 ini_spaces = ini_spaces_re.match(s)
130 130 if ini_spaces:
131 131 return ini_spaces.end()
132 132 else:
133 133 return 0
134 134
135 135
136 136 def remove_comments(src):
137 137 """Remove all comments from input source.
138 138
139 139 Note: comments are NOT recognized inside of strings!
140 140
141 141 Parameters
142 142 ----------
143 143 src : string
144 144 A single or multiline input string.
145 145
146 146 Returns
147 147 -------
148 148 String with all Python comments removed.
149 149 """
150 150
151 151 return re.sub('#.*', '', src)
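Review note: the docstring's caveat is easy to miss. Because the regex is not string-aware, a `#` inside a string literal is treated as the start of a comment. The following is a minimal standalone sketch (Python 3 syntax, reproducing the one-line function from the diff) that shows both the intended case and the caveat:

```python
import re

def remove_comments(src):
    """Strip everything from '#' to end of line, as in the diff above."""
    return re.sub('#.*', '', src)

# Intended case: a trailing comment is removed, the newline is kept.
plain = remove_comments("x = 1  # set x\ny = 2\n")

# Caveat: the '#' inside the string literal is also treated as a comment.
in_string = remove_comments("s = 'a#b'\n")

print(repr(plain))
print(repr(in_string))
```

This looks tolerable for how the diff uses it, since `_update_indent` only inspects leading whitespace and line endings of the stripped text.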
152 152
153 153
154 154 def get_input_encoding():
155 155 """Return the default standard input encoding.
156 156
157 157 If sys.stdin has no encoding, 'ascii' is returned."""
158 158 # There are strange environments for which sys.stdin.encoding is None. We
159 159 # ensure that a valid encoding is returned.
160 160 encoding = getattr(sys.stdin, 'encoding', None)
161 161 if encoding is None:
162 162 encoding = 'ascii'
163 163 return encoding
164 164
165 165 #-----------------------------------------------------------------------------
166 166 # Classes and functions for normal Python syntax handling
167 167 #-----------------------------------------------------------------------------
168 168
169 169 # HACK! This implementation, written by Robert K a while ago using the
170 170 # compiler module, is more robust than the other one below, but it expects its
171 171 # input to be pure python (no ipython syntax). For now we're using it as a
172 172 # second-pass splitter after the first pass transforms the input to pure
173 173 # python.
174 174
175 175 def split_blocks(python):
176 176 """ Split multiple lines of code into discrete commands that can be
177 177 executed singly.
178 178
179 179 Parameters
180 180 ----------
181 181 python : str
182 182 Pure, exec'able Python code.
183 183
184 184 Returns
185 185 -------
186 186 commands : list of str
187 187 Separate commands that can be exec'ed independently.
188 188 """
189 189 # compiler.parse treats trailing spaces after a newline as a
190 190 # SyntaxError. This is different than codeop.CommandCompiler, which
191 191 # will compile the trailing spaces just fine. We simply strip any
192 192 # trailing whitespace off. Passing a string with trailing whitespace
193 193 # to exec will fail however. There seems to be some inconsistency in
194 194 # how trailing whitespace is handled, but this seems to work.
195 195 python_ori = python # save original in case we bail on error
196 196 python = python.strip()
197 197
198 198 # The compiler module will parse the code into an abstract syntax tree.
199 199 # This has a bug with str("a\nb"), but not str("""a\nb""")!!!
200 200 try:
201 201 code_ast = ast.parse(python)
202 202 except:
203 203 return [python_ori]
204 204
205 205 # Uncomment to help debug the ast tree
206 206 # for n in code_ast.body:
207 207 # print n.lineno,'->',n
208 208
209 209 # Each separate command is available by iterating over ast.node. The
210 210 # lineno attribute is the line number (1-indexed) beginning the command's
211 211 # suite.
212 212 # lines ending with ";" yield a Discard Node that doesn't have a lineno
213 213 # attribute. These nodes can and should be discarded. But there are
214 214 # other situations that cause Discard nodes that shouldn't be discarded.
215 215 # We might eventually discover other cases where lineno is None and have
216 216 # to put in a more sophisticated test.
217 217 linenos = [x.lineno-1 for x in code_ast.body if x.lineno is not None]
218 218
219 219 # When we finally get the slices, we will need to slice all the way to
220 220 # the end even though we don't have a line number for it. Fortunately,
221 221 # None does the job nicely.
222 222 linenos.append(None)
223 223
224 224 # Same problem at the other end: sometimes the ast tree has its
225 225 # first complete statement not starting on line 0. In this case
226 226 # we might miss part of it. This fixes ticket 266993. Thanks Gael!
227 227 linenos[0] = 0
228 228
229 229 lines = python.splitlines()
230 230
231 231 # Create a list of atomic commands.
232 232 cmds = []
233 233 for i, j in zip(linenos[:-1], linenos[1:]):
234 234 cmd = lines[i:j]
235 235 if cmd:
236 236 cmds.append('\n'.join(cmd)+'\n')
237 237
238 238 return cmds
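Review note: the slicing scheme above (start line of each top-level AST node, with `None` as the final end index) can be sketched in isolation. This is an illustrative restatement in Python 3 syntax, not the module's code; the name `split_blocks_sketch` is hypothetical, and it catches only `SyntaxError` where the original uses a bare `except`.

```python
import ast

def split_blocks_sketch(source):
    """Split source into top-level statements using AST line numbers."""
    original = source          # returned untouched if parsing fails
    source = source.strip()
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return [original]

    # 0-based start line of every top-level statement.
    starts = [node.lineno - 1 for node in tree.body]
    if not starts:
        return []
    starts[0] = 0              # never lose leading lines of the first block
    starts.append(None)        # slice the last block through to the end

    lines = source.splitlines()
    blocks = []
    for i, j in zip(starts[:-1], starts[1:]):
        chunk = lines[i:j]
        if chunk:              # two statements on one line give an empty slice
            blocks.append('\n'.join(chunk) + '\n')
    return blocks

blocks = split_blocks_sketch("x = 1\nfor i in range(3):\n    x += i\nprint(x)\n")
broken = split_blocks_sketch("def f(:\n")
print(blocks)
```

Each returned string ends with a newline, matching the `'\n'.join(cmd)+'\n'` convention in the function above.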
239 239
240 240
241 241 class InputSplitter(object):
242 242 """An object that can split Python source input in executable blocks.
243 243
244 244 This object is designed to be used in one of two basic modes:
245 245
246 246 1. By feeding it python source line-by-line, using :meth:`push`. In this
247 247 mode, it will return on each push whether the currently pushed code
248 248 could be executed already. In addition, it provides a method called
249 249 :meth:`push_accepts_more` that can be used to query whether more input
250 250 can be pushed into a single interactive block.
251 251
252 252 2. By calling :meth:`split_blocks` with a single, multiline Python string,
253 253 that is then split into blocks each of which can be executed
254 254 interactively as a single statement.
255 255
256 256 This is a simple example of how an interactive terminal-based client can use
257 257 this tool::
258 258
259 259 isp = InputSplitter()
260 260 while isp.push_accepts_more():
261 261 indent = ' '*isp.indent_spaces
262 262 prompt = '>>> ' + indent
263 263 line = indent + raw_input(prompt)
264 264 isp.push(line)
265 265 print 'Input source was:\n', isp.source_reset(),
266 266 """
267 267 # Number of spaces of indentation computed from input that has been pushed
268 268 # so far. This is the attribute callers should query to get the current
269 269 # indentation level, in order to provide auto-indent facilities.
270 270 indent_spaces = 0
271 271 # String, indicating the default input encoding. It is computed by default
272 272 # at initialization time via get_input_encoding(), but it can be reset by a
273 273 # client with specific knowledge of the encoding.
274 274 encoding = ''
275 275 # String where the current full source input is stored, properly encoded.
276 276 # Reading this attribute is the normal way of querying the currently pushed
277 277 # source code, that has been properly encoded.
278 278 source = ''
279 279 # Code object corresponding to the current source. It is automatically
280 280 # synced to the source, so it can be queried at any time to obtain the code
281 281 # object; it will be None if the source doesn't compile to valid Python.
282 282 code = None
283 283 # Input mode
284 284 input_mode = 'line'
285 285
286 286 # Private attributes
287 287
288 288 # List with lines of input accumulated so far
289 289 _buffer = None
290 290 # Command compiler
291 291 _compile = None
292 292 # Mark when input has changed indentation all the way back to flush-left
293 293 _full_dedent = False
294 294 # Boolean indicating whether the current block is complete
295 295 _is_complete = None
296 296
297 297 def __init__(self, input_mode=None):
298 298 """Create a new InputSplitter instance.
299 299
300 300 Parameters
301 301 ----------
302 302 input_mode : str
303 303
304 304 One of ['line', 'cell']; default is 'line'.
305 305
306 306 The input_mode parameter controls how new inputs are used when fed via
307 307 the :meth:`push` method:
308 308
309 309 - 'line': meant for line-oriented clients, inputs are appended one at a
310 310 time to the internal buffer and the whole buffer is compiled.
311 311
312 312 - 'cell': meant for clients that can edit multi-line 'cells' of text at
313 313 a time. A cell can contain one or more blocks that can be compiled in
314 314 'single' mode by Python. In this mode, each new input
315 315 completely replaces all prior inputs. Cell mode is thus equivalent
316 316 to prepending a full reset() to every push() call.
317 317 """
318 318 self._buffer = []
319 319 self._compile = codeop.CommandCompiler()
320 320 self.encoding = get_input_encoding()
321 321 self.input_mode = InputSplitter.input_mode if input_mode is None \
322 322 else input_mode
323 323
324 324 def reset(self):
325 325 """Reset the input buffer and associated state."""
326 326 self.indent_spaces = 0
327 327 self._buffer[:] = []
328 328 self.source = ''
329 329 self.code = None
330 330 self._is_complete = False
331 331 self._full_dedent = False
332 332
333 333 def source_reset(self):
334 334 """Return the input source and perform a full reset.
335 335 """
336 336 out = self.source
337 337 self.reset()
338 338 return out
339 339
340 340 def push(self, lines):
341 341 """Push one or more lines of input.
342 342
343 343 This stores the given lines and returns a status code indicating
344 344 whether the code forms a complete Python block or not.
345 345
346 346 Any exceptions generated in compilation are swallowed, but if an
347 347 exception was produced, the method returns True.
348 348
349 349 Parameters
350 350 ----------
351 351 lines : string
352 352 One or more lines of Python input.
353 353
354 354 Returns
355 355 -------
356 356 is_complete : boolean
357 357 True if the current input source (the result of the current input
358 358 plus prior inputs) forms a complete Python execution block. Note that
359 359 this value is also stored as a private attribute (_is_complete), so it
360 360 can be queried at any time.
361 361 """
362 362 if self.input_mode == 'cell':
363 363 self.reset()
364 364
365 365 self._store(lines)
366 366 source = self.source
367 367
368 368 # Before calling _compile(), reset the code object to None so that if an
369 369 # exception is raised in compilation, we don't mislead by having
370 370 # inconsistent code/source attributes.
371 371 self.code, self._is_complete = None, None
372 372
373 373 # Honor termination lines properly
374 374 if source.rstrip().endswith('\\'):
375 375 return False
376 376
377 377 self._update_indent(lines)
378 378 try:
379 379 self.code = self._compile(source)
380 380 # Invalid syntax can produce any of a number of different errors from
381 381 # inside the compiler, so we have to catch them all. Syntax errors
382 382 # immediately produce a 'ready' block, so the invalid Python can be
383 383 # sent to the kernel for evaluation with possible ipython
384 384 # special-syntax conversion.
385 385 except (SyntaxError, OverflowError, ValueError, TypeError,
386 386 MemoryError):
387 387 self._is_complete = True
388 388 else:
389 389 # Compilation didn't produce any exceptions (though it may not have
390 390 # given a complete code object)
391 391 self._is_complete = self.code is not None
392 392
393 393 return self._is_complete
394 394
395 395 def push_accepts_more(self):
396 396 """Return whether a block of interactive input can accept more input.
397 397
398 398 This method is meant to be used by line-oriented frontends, who need to
399 399 guess whether a block is complete or not based solely on prior and
400 400 current input lines. The InputSplitter considers it has a complete
401 401 interactive block and will not accept more input only when either a
402 402 SyntaxError is raised, or *all* of the following are true:
403 403
404 404 1. The input compiles to a complete statement.
405 405
406 406 2. The indentation level is flush-left (because if we are indented,
407 407 like inside a function definition or for loop, we need to keep
408 408 reading new input).
409 409
410 410 3. There is one extra line consisting only of whitespace.
411 411
412 412 Because of condition #3, this method should be used only by
413 413 *line-oriented* frontends, since it means that intermediate blank lines
414 414 are not allowed in function definitions (or any other indented block).
415 415
416 416 Block-oriented frontends that have a separate keyboard event to
417 417 indicate execution should use the :meth:`split_blocks` method instead.
418 418
419 419 If the current input produces a syntax error, this method immediately
420 420 returns False but does *not* raise the syntax error exception, as
421 421 typically clients will want to send invalid syntax to an execution
422 422 backend which might convert the invalid syntax into valid Python via
423 423 one of the dynamic IPython mechanisms.
424 424 """
425 425
426 426 # With incomplete input, unconditionally accept more
427 427 if not self._is_complete:
428 428 return True
429 429
430 430 # If we already have complete input and we're flush left, the answer
431 # depends. In line mode, we're done. But in cell mode, we need to
432 # check how many blocks the input so far compiles into, because if
433 # there's already more than one full independent block of input, then
434 # the client has entered full 'cell' mode and is feeding lines that
435 # each is complete. In this case we should then keep accepting.
436 # The Qt terminal-like console does precisely this, to provide the
437 # convenience of terminal-like input of single expressions, but
438 # allowing the user (with a separate keystroke) to switch to 'cell'
439 # mode and type multiple expressions in one shot.
431 # depends. In line mode, if there hasn't been any indentation,
432 # that's it. If we've come back from some indentation, we need
433 # the blank final line to finish.
434 # In cell mode, we need to check how many blocks the input so far
435 # compiles into, because if there's already more than one full
436 # independent block of input, then the client has entered full
437 # 'cell' mode and is feeding lines that each is complete. In this
438 # case we should then keep accepting. The Qt terminal-like console
439 # does precisely this, to provide the convenience of terminal-like
440 # input of single expressions, but allowing the user (with a
441 # separate keystroke) to switch to 'cell' mode and type multiple
442 # expressions in one shot.
440 443 if self.indent_spaces==0:
441 444 if self.input_mode=='line':
442 return False
445 if not self._full_dedent:
446 return False
443 447 else:
444 448 nblocks = len(split_blocks(''.join(self._buffer)))
445 449 if nblocks==1:
446 450 return False
447 451
448 452 # When input is complete, then termination is marked by an extra blank
449 453 # line at the end.
450 454 last_line = self.source.splitlines()[-1]
451 455 return bool(last_line and not last_line.isspace())
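Review note: the changed branch is easier to check as a pure function of the splitter's state. The sketch below is a hypothetical standalone restatement (Python 3 syntax); `accepts_more` and its parameters are illustrative names, and the nesting of the `else:` under the `input_mode` test is my reading of the hunk, since the diff view has lost the indentation.

```python
def accepts_more(is_complete, indent_spaces, full_dedent, last_line,
                 input_mode='line', nblocks=1):
    """Standalone restatement of the push_accepts_more() decision."""
    # Incomplete input always accepts more.
    if not is_complete:
        return True

    if indent_spaces == 0:
        if input_mode == 'line':
            # Never indented: complete and flush-left means we are done.
            if not full_dedent:
                return False
        elif nblocks == 1:
            # Cell mode holding a single block: done.
            return False

    # Otherwise termination is marked by a blank final line.
    return bool(last_line and not last_line.isspace())

# A plain statement that was never indented ends the input at once.
simple = accepts_more(True, 0, False, 'x = 1')
# Right after a full dedent, a non-blank last line keeps the input open ...
after_dedent = accepts_more(True, 0, True, 'y = 2')
# ... until a blank line arrives.
after_blank = accepts_more(True, 0, True, '')
print(simple, after_dedent, after_blank)
```

Under this reading, the only behavioural change in line mode is the middle case: before the patch it would have returned `False` unconditionally.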
452 456
453 457 def split_blocks(self, lines):
454 458 """Split a multiline string into multiple input blocks.
455 459
456 460 Note: this method starts by performing a full reset().
457 461
458 462 Parameters
459 463 ----------
460 464 lines : str
461 465 A possibly multiline string.
462 466
463 467 Returns
464 468 -------
465 469 blocks : list
466 470 A list of strings, each possibly multiline. Each string corresponds
467 471 to a single block that can be compiled in 'single' mode (unless it
468 472 has a syntax error)."""
469 473
470 474 # This code is fairly delicate. If you make any changes here, make
471 475 # absolutely sure that you do run the full test suite and ALL tests
472 476 # pass.
473 477
474 478 self.reset()
475 479 blocks = []
476 480
477 481 # Reversed copy so we can use pop() efficiently and consume the input
478 482 # as a stack
479 483 lines = lines.splitlines()[::-1]
480 484 # Outer loop over all input
481 485 while lines:
482 486 #print 'Current lines:', lines # dbg
483 487 # Inner loop to build each block
484 488 while True:
485 489 # Safety exit from inner loop
486 490 if not lines:
487 491 break
488 492 # Grab next line but don't push it yet
489 493 next_line = lines.pop()
490 494 # Blank/empty lines are pushed as-is
491 495 if not next_line or next_line.isspace():
492 496 self.push(next_line)
493 497 continue
494 498
495 499 # Check indentation changes caused by the *next* line
496 500 indent_spaces, _full_dedent = self._find_indent(next_line)
497 501
498 502 # If the next line causes a dedent, it can be for two different
499 503 # reasons: either an explicit de-dent by the user or a
500 504 # return/raise/pass statement. These MUST be handled
501 505 # separately:
502 506 #
503 507 # 1. the first case is only detected when the actual explicit
504 508 # dedent happens, and that would be the *first* line of a *new*
505 509 # block. Thus, we must put the line back into the input buffer
506 510 # so that it starts a new block on the next pass.
507 511 #
508 512 # 2. the second case is detected in the line before the actual
509 513 # dedent happens, so we consume the line and we can break out
510 514 # to start a new block.
511 515
512 516 # Case 1, explicit dedent causes a break.
513 517 # Note: check that we weren't on the very last line, else we'll
514 518 # enter an infinite loop adding/removing the last line.
515 519 if _full_dedent and lines and not next_line.startswith(' '):
516 520 lines.append(next_line)
517 521 break
518 522
519 523 # Otherwise any line is pushed
520 524 self.push(next_line)
521 525
522 526 # Case 2, full dedent with full block ready:
523 527 if _full_dedent or \
524 528 self.indent_spaces==0 and not self.push_accepts_more():
525 529 break
526 530 # Form the new block with the current source input
527 531 blocks.append(self.source_reset())
528 532
529 533 #return blocks
530 534 # HACK!!! Now that our input is in blocks but guaranteed to be pure
531 535 # python syntax, feed it back a second time through the AST-based
532 536 # splitter, which is more accurate than ours.
533 537 return split_blocks(''.join(blocks))
534 538
535 539 #------------------------------------------------------------------------
536 540 # Private interface
537 541 #------------------------------------------------------------------------
538 542
539 543 def _find_indent(self, line):
540 544 """Compute the new indentation level for a single line.
541 545
542 546 Parameters
543 547 ----------
544 548 line : str
545 549 A single new line of non-whitespace, non-comment Python input.
546 550
547 551 Returns
548 552 -------
549 553 indent_spaces : int
550 554 New value for the indent level (it may be equal to self.indent_spaces
551 555 if indentation doesn't change).
552 556
553 557 full_dedent : boolean
554 558 Whether the new line causes a full flush-left dedent.
555 559 """
556 560 indent_spaces = self.indent_spaces
557 561 full_dedent = self._full_dedent
558 562
559 563 inisp = num_ini_spaces(line)
560 564 if inisp < indent_spaces:
561 565 indent_spaces = inisp
562 566 if indent_spaces <= 0:
563 567 #print 'Full dedent in text',self.source # dbg
564 568 full_dedent = True
565 569
566 570 if line[-1] == ':':
567 571 indent_spaces += 4
568 572 elif dedent_re.match(line):
569 573 indent_spaces -= 4
570 574 if indent_spaces <= 0:
571 575 full_dedent = True
572 576
573 577 # Safety
574 578 if indent_spaces < 0:
575 579 indent_spaces = 0
576 580 #print 'safety' # dbg
577 581
578 582 return indent_spaces, full_dedent
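Review note: since `_full_dedent` now feeds into `push_accepts_more()`, it helps to see when it is set. The sketch below restates `_find_indent` as a free function that takes the current state as arguments (Python 3 syntax). The name `find_indent` and the explicit state parameters are illustrative, and the nesting of the two `full_dedent = True` assignments is my reading of the flattened diff.

```python
import re

dedent_re = re.compile(r'^\s+raise|^\s+return|^\s+pass')
ini_spaces_re = re.compile(r'^([ \t\r\f\v]+)')

def find_indent(line, indent_spaces, full_dedent):
    """Return the (indent_spaces, full_dedent) state after seeing `line`."""
    m = ini_spaces_re.match(line)
    inisp = m.end() if m else 0

    # Explicit dedent: the line starts further left than expected.
    if inisp < indent_spaces:
        indent_spaces = inisp
        if indent_spaces <= 0:
            full_dedent = True

    if line[-1] == ':':
        # A block opener indents the following line.
        indent_spaces += 4
    elif dedent_re.match(line):
        # return/raise/pass close the current block.
        indent_spaces -= 4
        if indent_spaces <= 0:
            full_dedent = True

    return max(indent_spaces, 0), full_dedent

state = (0, False)
trace = []
for text in ["def f(x):", "    return x"]:
    state = find_indent(text, *state)
    trace.append(state)
print(trace)
```

Note that the flag is sticky: once set it stays `True` until `reset()` clears it, which is what lets the line-mode check distinguish "never indented" from "came back from an indent".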
579 583
580 584 def _update_indent(self, lines):
581 585 for line in remove_comments(lines).splitlines():
582 586 if line and not line.isspace():
583 587 self.indent_spaces, self._full_dedent = self._find_indent(line)
584 588
585 589 def _store(self, lines, buffer=None, store='source'):
586 590 """Store one or more lines of input.
587 591
588 592 If input lines are not newline-terminated, a newline is automatically
589 593 appended."""
590 594
591 595 if buffer is None:
592 596 buffer = self._buffer
593 597
594 598 if lines.endswith('\n'):
595 599 buffer.append(lines)
596 600 else:
597 601 buffer.append(lines+'\n')
598 602 setattr(self, store, self._set_source(buffer))
599 603
600 604 def _set_source(self, buffer):
601 605 return u''.join(buffer)
602 606
603 607
604 608 #-----------------------------------------------------------------------------
605 609 # Functions and classes for IPython-specific syntactic support
606 610 #-----------------------------------------------------------------------------
607 611
608 612 # RegExp for splitting line contents into pre-char//first word-method//rest.
609 613 # For clarity, each group is on one line.
610 614
611 615 line_split = re.compile("""
612 616 ^(\s*) # any leading space
613 617 ([,;/%]|!!?|\?\??) # escape character or characters
614 618 \s*(%?[\w\.\*]*) # function/method, possibly with leading %
615 619 # to correctly treat things like '?%magic'
616 620 (\s+.*$|$) # rest of line
617 621 """, re.VERBOSE)
618 622
619 623
620 624 def split_user_input(line):
621 625 """Split user input into early whitespace, esc-char, function part and rest.
622 626
623 627 This currently handles lines with '=' in them in a very inconsistent
624 628 manner.
625 629
626 630 Examples
627 631 ========
628 632 >>> split_user_input('x=1')
629 633 ('', '', 'x=1', '')
630 634 >>> split_user_input('?')
631 635 ('', '?', '', '')
632 636 >>> split_user_input('??')
633 637 ('', '??', '', '')
634 638 >>> split_user_input(' ?')
635 639 (' ', '?', '', '')
636 640 >>> split_user_input(' ??')
637 641 (' ', '??', '', '')
638 642 >>> split_user_input('??x')
639 643 ('', '??', 'x', '')
640 644 >>> split_user_input('?x=1')
641 645 ('', '', '?x=1', '')
642 646 >>> split_user_input('!ls')
643 647 ('', '!', 'ls', '')
644 648 >>> split_user_input(' !ls')
645 649 (' ', '!', 'ls', '')
646 650 >>> split_user_input('!!ls')
647 651 ('', '!!', 'ls', '')
648 652 >>> split_user_input(' !!ls')
649 653 (' ', '!!', 'ls', '')
650 654 >>> split_user_input(',ls')
651 655 ('', ',', 'ls', '')
652 656 >>> split_user_input(';ls')
653 657 ('', ';', 'ls', '')
654 658 >>> split_user_input(' ;ls')
655 659 (' ', ';', 'ls', '')
656 660 >>> split_user_input('f.g(x)')
657 661 ('', '', 'f.g(x)', '')
658 662 >>> split_user_input('f.g (x)')
659 663 ('', '', 'f.g', '(x)')
660 664 >>> split_user_input('?%hist')
661 665 ('', '?', '%hist', '')
662 666 >>> split_user_input('?x*')
663 667 ('', '?', 'x*', '')
664 668 """
665 669 match = line_split.match(line)
666 670 if match:
667 671 lspace, esc, fpart, rest = match.groups()
668 672 else:
669 673 # print "match failed for line '%s'" % line
670 674 try:
671 675 fpart, rest = line.split(None, 1)
672 676 except ValueError:
673 677 # print "split failed for line '%s'" % line
674 678 fpart, rest = line,''
675 679 lspace = re.match('^(\s*)(.*)', line).groups()[0]
676 680 esc = ''
677 681
678 682 # fpart has to be a valid python identifier, so it better be only pure
679 683 # ascii, no unicode:
680 684 try:
681 685 fpart = fpart.encode('ascii')
682 686 except UnicodeEncodeError:
683 687 lspace = unicode(lspace)
684 688 rest = fpart + u' ' + rest
685 689 fpart = u''
686 690
687 691 #print 'line:<%s>' % line # dbg
688 692 #print 'esc <%s> fpart <%s> rest <%s>' % (esc,fpart.strip(),rest) # dbg
689 693 return lspace, esc, fpart.strip(), rest.lstrip()
690 694
691 695
692 696 # The escaped translators ALL receive a line where their own escape has been
693 697 # stripped. Only '?' is valid at the end of the line, all others can only be
694 698 # placed at the start.
695 699
696 700 class LineInfo(object):
697 701 """A single line of input and associated info.
698 702
699 703 This is a utility class that mostly wraps the output of
700 704 :func:`split_user_input` into a convenient object to be passed around
701 705 during input transformations.
702 706
703 707 Includes the following as properties:
704 708
705 709 line
706 710 The original, raw line
707 711
708 712 lspace
709 713 Any early whitespace before actual text starts.
710 714
711 715 esc
712 716 The initial esc character (or characters, for double-char escapes like
713 717 '??' or '!!').
714 718
715 719 fpart
716 720 The 'function part', which is basically the maximal initial sequence
717 721 of valid python identifiers and the '.' character. This is what is
718 722 checked for alias and magic transformations, used for auto-calling,
719 723 etc.
720 724
721 725 rest
722 726 Everything else on the line.
723 727 """
724 728 def __init__(self, line):
725 729 self.line = line
726 730 self.lspace, self.esc, self.fpart, self.rest = \
727 731 split_user_input(line)
728 732
729 733 def __str__(self):
730 734 return "LineInfo [%s|%s|%s|%s]" % (self.lspace, self.esc,
731 735 self.fpart, self.rest)
732 736
733 737
734 738 # Transformations of the special syntaxes that don't rely on an explicit escape
735 739 # character but instead on patterns on the input line
736 740
737 741 # The core transformations are implemented as standalone functions that can be
738 742 # tested and validated in isolation. Each of these uses a regexp, we
739 743 # pre-compile these and keep them close to each function definition for clarity
740 744
741 745 _assign_system_re = re.compile(r'(?P<lhs>(\s*)([\w\.]+)((\s*,\s*[\w\.]+)*))'
742 746 r'\s*=\s*!\s*(?P<cmd>.*)')
743 747
744 748 def transform_assign_system(line):
745 749 """Handle the `files = !ls` syntax."""
746 750 m = _assign_system_re.match(line)
747 751 if m is not None:
748 752 cmd = m.group('cmd')
749 753 lhs = m.group('lhs')
750 754 expr = make_quoted_expr(cmd)
751 755 new_line = '%s = get_ipython().getoutput(%s)' % (lhs, expr)
752 756 return new_line
753 757 return line
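Review note: a quick way to see what the assignment regex accepts is to run it outside IPython. In the sketch below (Python 3 syntax) the regex is copied from the diff, while `repr()` stands in for `make_quoted_expr`, which is an assumption made only so the example is self-contained; the real helper may quote differently. The function name is hypothetical.

```python
import re

_assign_system_re = re.compile(r'(?P<lhs>(\s*)([\w\.]+)((\s*,\s*[\w\.]+)*))'
                               r'\s*=\s*!\s*(?P<cmd>.*)')

def transform_assign_system_sketch(line):
    """Rewrite `lhs = !cmd` into a getoutput() call; pass other lines through."""
    m = _assign_system_re.match(line)
    if m is None:
        return line
    # repr() is a stand-in for IPython's make_quoted_expr (assumption).
    return '%s = get_ipython().getoutput(%s)' % (m.group('lhs'),
                                                 repr(m.group('cmd')))

single = transform_assign_system_sketch('files = !ls -l')
multi = transform_assign_system_sketch('a, b = !echo 1 2')
untouched = transform_assign_system_sketch('x = 1')
print(single)
print(multi)
print(untouched)
```

The `lhs` group also accepts comma-separated targets, so tuple unpacking on the left-hand side is rewritten the same way.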
754 758
755 759
756 760 _assign_magic_re = re.compile(r'(?P<lhs>(\s*)([\w\.]+)((\s*,\s*[\w\.]+)*))'
757 761 r'\s*=\s*%\s*(?P<cmd>.*)')
758 762
759 763 def transform_assign_magic(line):
760 764 """Handle the `a = %who` syntax."""
761 765 m = _assign_magic_re.match(line)
762 766 if m is not None:
763 767 cmd = m.group('cmd')
764 768 lhs = m.group('lhs')
765 769 expr = make_quoted_expr(cmd)
766 770 new_line = '%s = get_ipython().magic(%s)' % (lhs, expr)
767 771 return new_line
768 772 return line
769 773
770 774
771 775 _classic_prompt_re = re.compile(r'^([ \t]*>>> |^[ \t]*\.\.\. )')
772 776
773 777 def transform_classic_prompt(line):
774 778 """Handle inputs that start with '>>> ' syntax."""
775 779
776 780 if not line or line.isspace():
777 781 return line
778 782 m = _classic_prompt_re.match(line)
779 783 if m:
780 784 return line[len(m.group(0)):]
781 785 else:
782 786 return line
783 787
784 788
785 789 _ipy_prompt_re = re.compile(r'^([ \t]*In \[\d+\]: |^[ \t]*\ \ \ \.\.\.+: )')
786 790
787 791 def transform_ipy_prompt(line):
788 792 """Handle inputs that start with classic IPython prompt syntax."""
789 793
790 794 if not line or line.isspace():
791 795 return line
792 796 #print 'LINE: %r' % line # dbg
793 797 m = _ipy_prompt_re.match(line)
794 798 if m:
795 799 #print 'MATCH! %r -> %r' % (line, line[len(m.group(0)):]) # dbg
796 800 return line[len(m.group(0)):]
797 801 else:
798 802 return line
799 803
800 804
801 805 class EscapedTransformer(object):
802 806 """Class to transform lines that are explicitly escaped out."""
803 807
804 808 def __init__(self):
805 809 tr = { ESC_SHELL : self._tr_system,
806 810 ESC_SH_CAP : self._tr_system2,
807 811 ESC_HELP : self._tr_help,
808 812 ESC_HELP2 : self._tr_help,
809 813 ESC_MAGIC : self._tr_magic,
810 814 ESC_QUOTE : self._tr_quote,
811 815 ESC_QUOTE2 : self._tr_quote2,
812 816 ESC_PAREN : self._tr_paren }
813 817 self.tr = tr
814 818
815 819 # Support for syntax transformations that use explicit escapes typed by the
816 820 # user at the beginning of a line
817 821 @staticmethod
818 822 def _tr_system(line_info):
819 823 "Translate lines escaped with: !"
820 824 cmd = line_info.line.lstrip().lstrip(ESC_SHELL)
821 825 return '%sget_ipython().system(%s)' % (line_info.lspace,
822 826 make_quoted_expr(cmd))
823 827
824 828 @staticmethod
825 829 def _tr_system2(line_info):
826 830 "Translate lines escaped with: !!"
827 831 cmd = line_info.line.lstrip()[2:]
828 832 return '%sget_ipython().getoutput(%s)' % (line_info.lspace,
829 833 make_quoted_expr(cmd))
830 834
831 835 @staticmethod
832 836 def _tr_help(line_info):
833 837 "Translate lines escaped with: ?/??"
834 838 # A naked help line should just fire the intro help screen
835 839 if not line_info.line[1:]:
836 840 return 'get_ipython().show_usage()'
837 841
838 842 # There may be one or two '?' at the end, move them to the front so that
839 843 # the rest of the logic can assume escapes are at the start
840 844 l_ori = line_info
841 845 line = line_info.line
842 846 if line.endswith('?'):
843 847 line = line[-1] + line[:-1]
844 848 if line.endswith('?'):
845 849 line = line[-1] + line[:-1]
846 850 line_info = LineInfo(line)
847 851
848 852 # From here on, simply choose which level of detail to get, and
849 853 # special-case the psearch syntax
850 854 pinfo = 'pinfo' # default
851 855 if '*' in line_info.line:
852 856 pinfo = 'psearch'
853 857 elif line_info.esc == '??':
854 858 pinfo = 'pinfo2'
855 859
856 860 tpl = '%sget_ipython().magic("%s %s")'
857 861 return tpl % (line_info.lspace, pinfo,
858 862 ' '.join([line_info.fpart, line_info.rest]).strip())
859 863
860 864 @staticmethod
861 865 def _tr_magic(line_info):
862 866 "Translate lines escaped with: %"
863 867 tpl = '%sget_ipython().magic(%s)'
864 868 cmd = make_quoted_expr(' '.join([line_info.fpart,
865 869 line_info.rest]).strip())
866 870 return tpl % (line_info.lspace, cmd)
867 871
868 872 @staticmethod
869 873 def _tr_quote(line_info):
870 874 "Translate lines escaped with: ,"
871 875 return '%s%s("%s")' % (line_info.lspace, line_info.fpart,
872 876 '", "'.join(line_info.rest.split()) )
873 877
874 878 @staticmethod
875 879 def _tr_quote2(line_info):
876 880 "Translate lines escaped with: ;"
877 881 return '%s%s("%s")' % (line_info.lspace, line_info.fpart,
878 882 line_info.rest)
879 883
880 884 @staticmethod
881 885 def _tr_paren(line_info):
882 886 "Translate lines escaped with: /"
883 887 return '%s%s(%s)' % (line_info.lspace, line_info.fpart,
884 888 ", ".join(line_info.rest.split()))
885 889
886 890 def __call__(self, line):
887 891 """Transform lines that are explicitly escaped out.
888 892
889 893 This calls the above _tr_* static methods for the actual line
890 894 translations."""
891 895
892 896 # Empty lines just get returned unmodified
893 897 if not line or line.isspace():
894 898 return line
895 899
896 900 # Get line endpoints, where the escapes can be
897 901 line_info = LineInfo(line)
898 902
899 903 # If the escape is not at the start, only '?' needs to be special-cased.
900 904 # All other escapes are only valid at the start
901 905 if line_info.esc not in self.tr:
902 906 if line.endswith(ESC_HELP):
903 907 return self._tr_help(line_info)
904 908 else:
905 909 # If we don't recognize the escape, don't modify the line
906 910 return line
907 911
908 912 return self.tr[line_info.esc](line_info)
909 913
910 914
911 915 # A function-looking object to be used by the rest of the code. The purpose of
912 916 # the class in this case is to organize related functionality, more than to
913 917 # manage state.
914 918 transform_escaped = EscapedTransformer()
915 919
916 920
917 921 class IPythonInputSplitter(InputSplitter):
918 922 """An input splitter that recognizes all of IPython's special syntax."""
919 923
920 924 # String with raw, untransformed input.
921 925 source_raw = ''
922 926
923 927 # Private attributes
924 928
925 929 # List with lines of raw input accumulated so far.
926 930 _buffer_raw = None
927 931
928 932 def __init__(self, input_mode=None):
929 933 InputSplitter.__init__(self, input_mode)
930 934 self._buffer_raw = []
931 935
932 936 def reset(self):
933 937 """Reset the input buffer and associated state."""
934 938 InputSplitter.reset(self)
935 939 self._buffer_raw[:] = []
936 940 self.source_raw = ''
937 941
938 942 def source_raw_reset(self):
939 943 """Return input and raw source and perform a full reset.
940 944 """
941 945 out = self.source
942 946 out_r = self.source_raw
943 947 self.reset()
944 948 return out, out_r
945 949
946 950 def push(self, lines):
947 951 """Push one or more lines of IPython input.
948 952 """
949 953 if not lines:
950 954 return super(IPythonInputSplitter, self).push(lines)
951 955
952 956 # We must ensure all input is pure unicode
953 957 if type(lines)==str:
954 958 lines = lines.decode(self.encoding)
955 959
956 960 lines_list = lines.splitlines()
957 961
958 962 transforms = [transform_escaped, transform_assign_system,
959 963 transform_assign_magic, transform_ipy_prompt,
960 964 transform_classic_prompt]
961 965
962 966 # Transform logic
963 967 #
964 968 # We only apply the line transformers to the input if we have either no
965 969 # input yet, or complete input, or if the last line of the buffer ends
966 970 # with ':' (opening an indented block). This prevents the accidental
967 971 # transformation of escapes inside multiline expressions like
968 972 # triple-quoted strings or parenthesized expressions.
969 973 #
970 974 # The last heuristic, while ugly, ensures that the first line of an
971 975 # indented block is correctly transformed.
972 976 #
973 977 # FIXME: try to find a cleaner approach for this last bit.
974 978
975 979 # If we were in 'cell' mode, since we're going to pump the parent
976 980 # class by hand line by line, we need to temporarily switch out to
977 981 # 'line' mode, do a single manual reset and then feed the lines one
978 982 # by one. Note that this only matters if the input has more than one
979 983 # line.
980 984 changed_input_mode = False
981 985
982 986 if self.input_mode == 'cell':
983 987 self.reset()
984 988 changed_input_mode = True
985 989 saved_input_mode = 'cell'
986 990 self.input_mode = 'line'
987 991
988 992 # Store raw source before applying any transformations to it. Note
989 993 # that this must be done *after* the reset() call that would otherwise
990 994 # flush the buffer.
991 995 self._store(lines, self._buffer_raw, 'source_raw')
992 996
993 997 try:
994 998 push = super(IPythonInputSplitter, self).push
995 999 for line in lines_list:
996 1000 if self._is_complete or not self._buffer or \
997 1001 (self._buffer and self._buffer[-1].rstrip().endswith(':')):
998 1002 for f in transforms:
999 1003 line = f(line)
1000 1004
1001 1005 out = push(line)
1002 1006 finally:
1003 1007 if changed_input_mode:
1004 1008 self.input_mode = saved_input_mode
1005 1009 return out
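The line transformers above are plain functions from one line to one line, applied in sequence inside `push()`. The following is a minimal Python 3 sketch of two of them, not the module's own code: the regular expressions are copied from the listing, while `quote` is a hypothetical stand-in for `make_quoted_expr` (whose definition is not shown here) and `apply_transforms` is an illustrative helper that does not exist in the module.

```python
import re

# Patterns copied from the listing above.
_assign_system_re = re.compile(r'(?P<lhs>(\s*)([\w\.]+)((\s*,\s*[\w\.]+)*))'
                               r'\s*=\s*!\s*(?P<cmd>.*)')
_classic_prompt_re = re.compile(r'^([ \t]*>>> |^[ \t]*\.\.\. )')


def quote(cmd):
    # Hypothetical stand-in for make_quoted_expr; repr() produces
    # single-quoted literals, unlike the double quotes seen in the tests.
    return repr(cmd)


def transform_assign_system(line):
    """Rewrite `files = !ls` into a get_ipython().getoutput(...) call."""
    m = _assign_system_re.match(line)
    if m is None:
        return line
    return '%s = get_ipython().getoutput(%s)' % (m.group('lhs'),
                                                 quote(m.group('cmd')))


def transform_classic_prompt(line):
    """Strip a leading '>>> ' or '... ' prompt, if present."""
    if not line or line.isspace():
        return line
    m = _classic_prompt_re.match(line)
    return line[len(m.group(0)):] if m else line


def apply_transforms(line, transforms):
    # Illustrative helper: each transform sees the output of the previous one.
    for f in transforms:
        line = f(line)
    return line


transforms = [transform_assign_system, transform_classic_prompt]
for raw in ['b = !ls', '>>> x=1', 'x=1']:
    print(repr(raw), '->', repr(apply_transforms(raw, transforms)))
```

Because each function returns unmatched input unchanged, the transformers can be chained in a simple loop without coordinating with each other, which is what the stateless design described in the module docstring relies on.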
@@ -1,693 +1,704 b''
1 1 # -*- coding: utf-8 -*-
2 2 """Tests for the inputsplitter module.
3 3
4 4 Authors
5 5 -------
6 6 * Fernando Perez
7 7 * Robert Kern
8 8 """
9 9 #-----------------------------------------------------------------------------
10 10 # Copyright (C) 2010 The IPython Development Team
11 11 #
12 12 # Distributed under the terms of the BSD License. The full license is in
13 13 # the file COPYING, distributed as part of this software.
14 14 #-----------------------------------------------------------------------------
15 15
16 16 #-----------------------------------------------------------------------------
17 17 # Imports
18 18 #-----------------------------------------------------------------------------
19 19 # stdlib
20 20 import unittest
21 21 import sys
22 22
23 23 # Third party
24 24 import nose.tools as nt
25 25
26 26 # Our own
27 27 from IPython.core import inputsplitter as isp
28 28
29 29 #-----------------------------------------------------------------------------
30 30 # Semi-complete examples (also used as tests)
31 31 #-----------------------------------------------------------------------------
32 32
33 33 # Note: at the bottom, there's a slightly more complete version of this that
34 34 # can be useful during development of code here.
35 35
36 36 def mini_interactive_loop(input_func):
37 37 """Minimal example of the logic of an interactive interpreter loop.
38 38
39 39 This serves as an example, and it is used by the test system with a fake
40 40 raw_input that simulates interactive input."""
41 41
42 42 from IPython.core.inputsplitter import InputSplitter
43 43
44 44 isp = InputSplitter()
45 45 # In practice, this input loop would be wrapped in an outside loop to read
46 46 # input indefinitely, until some exit/quit command was issued. Here we
47 47 # only illustrate the basic inner loop.
48 48 while isp.push_accepts_more():
49 49 indent = ' '*isp.indent_spaces
50 50 prompt = '>>> ' + indent
51 51 line = indent + input_func(prompt)
52 52 isp.push(line)
53 53
54 54 # Here we just return input so we can use it in a test suite, but a real
55 55 # interpreter would instead send it for execution somewhere.
56 56 src = isp.source_reset()
57 57 #print 'Input source was:\n', src # dbg
58 58 return src
59 59
60 60 #-----------------------------------------------------------------------------
61 61 # Test utilities, just for local use
62 62 #-----------------------------------------------------------------------------
63 63
64 64 def assemble(block):
65 65 """Assemble a block into multi-line sub-blocks."""
66 66 return ['\n'.join(sub_block)+'\n' for sub_block in block]
67 67
68 68
69 69 def pseudo_input(lines):
70 70 """Return a function that acts like raw_input but feeds the input list."""
71 71 ilines = iter(lines)
72 72 def raw_in(prompt):
73 73 try:
74 74 return next(ilines)
75 75 except StopIteration:
76 76 return ''
77 77 return raw_in
78 78
79 79 #-----------------------------------------------------------------------------
80 80 # Tests
81 81 #-----------------------------------------------------------------------------
82 82 def test_spaces():
83 83 tests = [('', 0),
84 84 (' ', 1),
85 85 ('\n', 0),
86 86 (' \n', 1),
87 87 ('x', 0),
88 88 (' x', 1),
89 89 (' x',2),
90 90 (' x',4),
91 91 # Note: tabs are counted as a single whitespace!
92 92 ('\tx', 1),
93 93 ('\t x', 2),
94 94 ]
95 95
96 96 for s, nsp in tests:
97 97 nt.assert_equal(isp.num_ini_spaces(s), nsp)
98 98
99 99
100 100 def test_remove_comments():
101 101 tests = [('text', 'text'),
102 102 ('text # comment', 'text '),
103 103 ('text # comment\n', 'text \n'),
104 104 ('text # comment \n', 'text \n'),
105 105 ('line # c \nline\n','line \nline\n'),
106 106 ('line # c \nline#c2 \nline\nline #c\n\n',
107 107 'line \nline\nline\nline \n\n'),
108 108 ]
109 109
110 110 for inp, out in tests:
111 111 nt.assert_equal(isp.remove_comments(inp), out)
112 112
113 113
114 114 def test_get_input_encoding():
115 115 encoding = isp.get_input_encoding()
116 116 nt.assert_true(isinstance(encoding, basestring))
117 117 # simple-minded check that at least encoding a simple string works with the
118 118 # encoding we got.
119 119 nt.assert_equal('test'.encode(encoding), 'test')
120 120
121 121
122 122 class NoInputEncodingTestCase(unittest.TestCase):
123 123 def setUp(self):
124 124 self.old_stdin = sys.stdin
125 125 class X: pass
126 126 fake_stdin = X()
127 127 sys.stdin = fake_stdin
128 128
129 129 def test(self):
130 130 # Verify that if sys.stdin has no 'encoding' attribute we do the right
131 131 # thing
132 132 enc = isp.get_input_encoding()
133 133 self.assertEqual(enc, 'ascii')
134 134
135 135 def tearDown(self):
136 136 sys.stdin = self.old_stdin
137 137
138 138
139 139 class InputSplitterTestCase(unittest.TestCase):
140 140 def setUp(self):
141 141 self.isp = isp.InputSplitter()
142 142
143 143 def test_reset(self):
144 144 isp = self.isp
145 145 isp.push('x=1')
146 146 isp.reset()
147 147 self.assertEqual(isp._buffer, [])
148 148 self.assertEqual(isp.indent_spaces, 0)
149 149 self.assertEqual(isp.source, '')
150 150 self.assertEqual(isp.code, None)
151 151 self.assertEqual(isp._is_complete, False)
152 152
153 153 def test_source(self):
154 154 self.isp._store('1')
155 155 self.isp._store('2')
156 156 self.assertEqual(self.isp.source, '1\n2\n')
157 157 self.assertTrue(len(self.isp._buffer)>0)
158 158 self.assertEqual(self.isp.source_reset(), '1\n2\n')
159 159 self.assertEqual(self.isp._buffer, [])
160 160 self.assertEqual(self.isp.source, '')
161 161
162 162 def test_indent(self):
163 163 isp = self.isp # shorthand
164 164 isp.push('x=1')
165 165 self.assertEqual(isp.indent_spaces, 0)
166 166 isp.push('if 1:\n x=1')
167 167 self.assertEqual(isp.indent_spaces, 4)
168 168 isp.push('y=2\n')
169 169 self.assertEqual(isp.indent_spaces, 0)
170 170
171 171 def test_indent2(self):
172 172 # In cell mode, inputs must be fed in whole blocks, so skip this test
173 173 if self.isp.input_mode == 'cell': return
174 174
175 175 isp = self.isp
176 176 isp.push('if 1:')
177 177 self.assertEqual(isp.indent_spaces, 4)
178 178 isp.push(' x=1')
179 179 self.assertEqual(isp.indent_spaces, 4)
180 180 # Blank lines shouldn't change the indent level
181 181 isp.push(' '*2)
182 182 self.assertEqual(isp.indent_spaces, 4)
183 183
184 184 def test_indent3(self):
185 185 # In cell mode, inputs must be fed in whole blocks, so skip this test
186 186 if self.isp.input_mode == 'cell': return
187 187
188 188 isp = self.isp
189 189 # When a multiline statement contains parens or multiline strings, we
190 190 # shouldn't get confused.
191 191 isp.push("if 1:")
192 192 isp.push(" x = (1+\n 2)")
193 193 self.assertEqual(isp.indent_spaces, 4)
194 194
195 195 def test_dedent(self):
196 196 isp = self.isp # shorthand
197 197 isp.push('if 1:')
198 198 self.assertEqual(isp.indent_spaces, 4)
199 199 isp.push(' pass')
200 200 self.assertEqual(isp.indent_spaces, 0)
201 201
202 202 def test_push(self):
203 203 isp = self.isp
204 204 self.assertTrue(isp.push('x=1'))
205 205
206 206 def test_push2(self):
207 207 isp = self.isp
208 208 self.assertFalse(isp.push('if 1:'))
209 209 for line in [' x=1', '# a comment', ' y=2']:
210 210 self.assertTrue(isp.push(line))
211 211
212 212 def test_replace_mode(self):
213 213 isp = self.isp
214 214 isp.input_mode = 'cell'
215 215 isp.push('x=1')
216 216 self.assertEqual(isp.source, 'x=1\n')
217 217 isp.push('x=2')
218 218 self.assertEqual(isp.source, 'x=2\n')
219 219
220 220 def test_push_accepts_more(self):
221 221 isp = self.isp
222 222 isp.push('x=1')
223 223 self.assertFalse(isp.push_accepts_more())
224 224
225 225 def test_push_accepts_more2(self):
226 226 # In cell mode, inputs must be fed in whole blocks, so skip this test
227 227 if self.isp.input_mode == 'cell': return
228 228
229 229 isp = self.isp
230 230 isp.push('if 1:')
231 231 self.assertTrue(isp.push_accepts_more())
232 232 isp.push(' x=1')
233 233 self.assertTrue(isp.push_accepts_more())
234 234 isp.push('')
235 235 self.assertFalse(isp.push_accepts_more())
236 236
237 237 def test_push_accepts_more3(self):
238 238 isp = self.isp
239 239 isp.push("x = (2+\n3)")
240 240 self.assertFalse(isp.push_accepts_more())
241 241
242 242 def test_push_accepts_more4(self):
243 243 # In cell mode, inputs must be fed in whole blocks, so skip this test
244 244 if self.isp.input_mode == 'cell': return
245 245
246 246 isp = self.isp
247 247 # When a multiline statement contains parens or multiline strings, we
248 248 # shouldn't get confused.
249 249 # FIXME: we should be able to better handle de-dents in statements like
250 250 # multiline strings and multiline expressions (continued with \ or
251 251 # parens). Right now we aren't handling the indentation tracking quite
252 252 # correctly with this, though in practice it may not be too much of a
253 253 # problem. We'll need to see.
254 254 isp.push("if 1:")
255 255 isp.push(" x = (2+")
256 256 isp.push(" 3)")
257 257 self.assertTrue(isp.push_accepts_more())
258 258 isp.push(" y = 3")
259 259 self.assertTrue(isp.push_accepts_more())
260 260 isp.push('')
261 261 self.assertFalse(isp.push_accepts_more())
262
263 def test_push_accepts_more5(self):
264 # In cell mode, inputs must be fed in whole blocks, so skip this test
265 if self.isp.input_mode == 'cell': return
266
267 isp = self.isp
268 isp.push('try:')
269 isp.push(' a = 5')
270 isp.push('except:')
271 isp.push(' raise')
272 self.assertTrue(isp.push_accepts_more())
262 273
263 274 def test_continuation(self):
264 275 isp = self.isp
265 276 isp.push("import os, \\")
266 277 self.assertTrue(isp.push_accepts_more())
267 278 isp.push("sys")
268 279 self.assertFalse(isp.push_accepts_more())
269 280
270 281 def test_syntax_error(self):
271 282 isp = self.isp
272 283 # Syntax errors immediately produce a 'ready' block, so the invalid
273 284 # Python can be sent to the kernel for evaluation with possible ipython
274 285 # special-syntax conversion.
275 286 isp.push('run foo')
276 287 self.assertFalse(isp.push_accepts_more())
277 288
278 289 def check_split(self, block_lines, compile=True):
279 290 blocks = assemble(block_lines)
280 291 lines = ''.join(blocks)
281 292 oblock = self.isp.split_blocks(lines)
282 293 self.assertEqual(oblock, blocks)
283 294 if compile:
284 295 for block in blocks:
285 296 self.isp._compile(block)
286 297
287 298 def test_split(self):
288 299 # All blocks of input we want to test in a list. The format for each
289 300 # block is a list of lists, with each inner list consisting of all the
290 301 # lines (as single-lines) that should make up a sub-block.
291 302
292 303 # Note: do NOT put here sub-blocks that don't compile, as the
293 304 # check_split() routine makes a final verification pass to check that
294 305 # each sub_block, as returned by split_blocks(), does compile
295 306 # correctly.
296 307 all_blocks = [ [['x=1']],
297 308
298 309 [['x=1'],
299 310 ['y=2']],
300 311
301 312 [['x=1',
302 313 '# a comment'],
303 314 ['y=11']],
304 315
305 316 [['if 1:',
306 317 ' x=1'],
307 318 ['y=3']],
308 319
309 320 [['def f(x):',
310 321 ' return x'],
311 322 ['x=1']],
312 323
313 324 [['def f(x):',
314 325 ' x+=1',
315 326 ' ',
316 327 ' return x'],
317 328 ['x=1']],
318 329
319 330 [['def f(x):',
320 331 ' if x>0:',
321 332 ' y=1',
322 333 ' # a comment',
323 334 ' else:',
324 335 ' y=4',
325 336 ' ',
326 337 ' return y'],
327 338 ['x=1'],
328 339 ['if 1:',
329 340 ' y=11'] ],
330 341
331 342 [['for i in range(10):'
332 343 ' x=i**2']],
333 344
334 345 [['for i in range(10):'
335 346 ' x=i**2'],
336 347 ['z = 1']],
337 348
338 349 [['"asdf"']],
339 350
340 351 [['"asdf"'],
341 352 ['10'],
342 353 ],
343 354
344 355 [['"""foo',
345 356 'bar"""']],
346 357 ]
347 358 for block_lines in all_blocks:
348 359 self.check_split(block_lines)
349 360
350 361 def test_split_syntax_errors(self):
351 362 # Block splitting with invalid syntax
352 363 all_blocks = [ [['a syntax error']],
353 364
354 365 [['x=1',
355 366 'another syntax error']],
356 367
357 368 [['for i in range(10):'
358 369 ' yet another error']],
359 370
360 371 ]
361 372 for block_lines in all_blocks:
362 373 self.check_split(block_lines, compile=False)
363 374
364 375 def test_unicode(self):
365 376 self.isp.push(u"Pérez")
366 377 self.isp.push(u'\xc3\xa9')
367 378 self.isp.push(u"u'\xc3\xa9'")
368 379
369 380 class InteractiveLoopTestCase(unittest.TestCase):
370 381 """Tests for an interactive loop like a python shell.
371 382 """
372 383 def check_ns(self, lines, ns):
373 384 """Validate that the given input lines produce the resulting namespace.
374 385
375 386 Note: the input lines are given exactly as they would be typed in an
376 387 auto-indenting environment, as mini_interactive_loop above already does
377 388 auto-indenting and prepends spaces to the input.
378 389 """
379 390 src = mini_interactive_loop(pseudo_input(lines))
380 391 test_ns = {}
381 392 exec src in test_ns
382 393 # We can't check that the provided ns is identical to the test_ns,
383 394 # because Python fills test_ns with extra keys (copyright, etc). But
384 395 # we can check that the given dict is *contained* in test_ns
385 396 for k,v in ns.iteritems():
386 397 self.assertEqual(test_ns[k], v)
387 398
388 399 def test_simple(self):
389 400 self.check_ns(['x=1'], dict(x=1))
390 401
391 402 def test_simple2(self):
392 403 self.check_ns(['if 1:', 'x=2'], dict(x=2))
393 404
394 405 def test_xy(self):
395 406 self.check_ns(['x=1; y=2'], dict(x=1, y=2))
396 407
397 408 def test_abc(self):
398 409 self.check_ns(['if 1:','a=1','b=2','c=3'], dict(a=1, b=2, c=3))
399 410
400 411 def test_multi(self):
401 412 self.check_ns(['x =(1+','1+','2)'], dict(x=4))
402 413
403 414
404 415 def test_LineInfo():
405 416 """Simple test for LineInfo construction and str()"""
406 417 linfo = isp.LineInfo(' %cd /home')
407 418 nt.assert_equals(str(linfo), 'LineInfo [ |%|cd|/home]')
408 419
409 420
410 421 def test_split_user_input():
411 422 """Unicode test - split_user_input already has good doctests"""
412 423 line = u"Pérez Fernando"
413 424 parts = isp.split_user_input(line)
414 425 parts_expected = (u'', u'', u'', line)
415 426 nt.assert_equal(parts, parts_expected)
416 427
417 428
418 429 # Transformer tests
419 430 def transform_checker(tests, func):
420 431 """Utility to loop over test inputs"""
421 432 for inp, tr in tests:
422 433 nt.assert_equals(func(inp), tr)
423 434
424 435 # Data for all the syntax tests in the form of lists of pairs of
425 436 # raw/transformed input. We store it here as a global dict so that we can use
426 437 # it both within single-function tests and also to validate the behavior of the
427 438 # larger objects
428 439
429 440 syntax = \
430 441 dict(assign_system =
431 442 [('a =! ls', 'a = get_ipython().getoutput("ls")'),
432 443 ('b = !ls', 'b = get_ipython().getoutput("ls")'),
433 444 ('x=1', 'x=1'), # normal input is unmodified
434 445 (' ',' '), # blank lines are kept intact
435 446 ],
436 447
437 448 assign_magic =
438 449 [('a =% who', 'a = get_ipython().magic("who")'),
439 450 ('b = %who', 'b = get_ipython().magic("who")'),
440 451 ('x=1', 'x=1'), # normal input is unmodified
441 452 (' ',' '), # blank lines are kept intact
442 453 ],
443 454
444 455 classic_prompt =
445 456 [('>>> x=1', 'x=1'),
446 457 ('x=1', 'x=1'), # normal input is unmodified
447 458 (' ', ' '), # blank lines are kept intact
448 459 ('... ', ''), # continuation prompts
449 460 ],
450 461
451 462 ipy_prompt =
452 463 [('In [1]: x=1', 'x=1'),
453 464 ('x=1', 'x=1'), # normal input is unmodified
454 465 (' ',' '), # blank lines are kept intact
455 466 (' ....: ', ''), # continuation prompts
456 467 ],
457 468
458 469 # Tests for the escape transformer to leave normal code alone
459 470 escaped_noesc =
460 471 [ (' ', ' '),
461 472 ('x=1', 'x=1'),
462 473 ],
463 474
464 475 # System calls
465 476 escaped_shell =
466 477 [ ('!ls', 'get_ipython().system("ls")'),
467 478 # Double-escape shell, this means to capture the output of the
468 479 # subprocess and return it
469 480 ('!!ls', 'get_ipython().getoutput("ls")'),
470 481 ],
471 482
472 483 # Help/object info
473 484 escaped_help =
474 485 [ ('?', 'get_ipython().show_usage()'),
475 486 ('?x1', 'get_ipython().magic("pinfo x1")'),
476 487 ('??x2', 'get_ipython().magic("pinfo2 x2")'),
477 488 ('x3?', 'get_ipython().magic("pinfo x3")'),
478 489 ('x4??', 'get_ipython().magic("pinfo2 x4")'),
479 490 ('%hist?', 'get_ipython().magic("pinfo %hist")'),
480 491 ('f*?', 'get_ipython().magic("psearch f*")'),
481 492 ('ax.*aspe*?', 'get_ipython().magic("psearch ax.*aspe*")'),
482 493 ],
483 494
484 495 # Explicit magic calls
485 496 escaped_magic =
486 497 [ ('%cd', 'get_ipython().magic("cd")'),
487 498 ('%cd /home', 'get_ipython().magic("cd /home")'),
488 499 (' %magic', ' get_ipython().magic("magic")'),
489 500 ],
490 501
491 502 # Quoting with separate arguments
492 503 escaped_quote =
493 504 [ (',f', 'f("")'),
494 505 (',f x', 'f("x")'),
495 506 (' ,f y', ' f("y")'),
496 507 (',f a b', 'f("a", "b")'),
497 508 ],
498 509
499 510 # Quoting with single argument
500 511 escaped_quote2 =
501 512 [ (';f', 'f("")'),
502 513 (';f x', 'f("x")'),
503 514 (' ;f y', ' f("y")'),
504 515 (';f a b', 'f("a b")'),
505 516 ],
506 517
507 518 # Simply apply parens
508 519 escaped_paren =
509 520 [ ('/f', 'f()'),
510 521 ('/f x', 'f(x)'),
511 522 (' /f y', ' f(y)'),
512 523 ('/f a b', 'f(a, b)'),
513 524 ],
514 525
515 526 )
516 527
517 528 # multiline syntax examples. Each of these should be a list of lists, with
518 529 # each entry itself having pairs of raw/transformed input. The union (with
519 530 # '\n'.join()) of the transformed inputs is what the splitter should produce
520 531 # when fed the raw lines one at a time via push.
521 532 syntax_ml = \
522 533 dict(classic_prompt =
523 534 [ [('>>> for i in range(10):','for i in range(10):'),
524 535 ('... print i',' print i'),
525 536 ('... ', ''),
526 537 ],
527 538 ],
528 539
529 540 ipy_prompt =
530 541 [ [('In [24]: for i in range(10):','for i in range(10):'),
531 542 (' ....: print i',' print i'),
532 543 (' ....: ', ''),
533 544 ],
534 545 ],
535 546 )
536 547
537 548
538 549 def test_assign_system():
539 550 transform_checker(syntax['assign_system'], isp.transform_assign_system)
540 551
541 552
542 553 def test_assign_magic():
543 554 transform_checker(syntax['assign_magic'], isp.transform_assign_magic)
544 555
545 556
546 557 def test_classic_prompt():
547 558 transform_checker(syntax['classic_prompt'], isp.transform_classic_prompt)
548 559 for example in syntax_ml['classic_prompt']:
549 560 transform_checker(example, isp.transform_classic_prompt)
550 561
551 562
552 563 def test_ipy_prompt():
553 564 transform_checker(syntax['ipy_prompt'], isp.transform_ipy_prompt)
554 565 for example in syntax_ml['ipy_prompt']:
555 566 transform_checker(example, isp.transform_ipy_prompt)
556 567
557 568
558 569 def test_escaped_noesc():
559 570 transform_checker(syntax['escaped_noesc'], isp.transform_escaped)
560 571
561 572
562 573 def test_escaped_shell():
563 574 transform_checker(syntax['escaped_shell'], isp.transform_escaped)
564 575
565 576
566 577 def test_escaped_help():
567 578 transform_checker(syntax['escaped_help'], isp.transform_escaped)
568 579
569 580
570 581 def test_escaped_magic():
571 582 transform_checker(syntax['escaped_magic'], isp.transform_escaped)
572 583
573 584
574 585 def test_escaped_quote():
575 586 transform_checker(syntax['escaped_quote'], isp.transform_escaped)
576 587
577 588
578 589 def test_escaped_quote2():
579 590 transform_checker(syntax['escaped_quote2'], isp.transform_escaped)
580 591
581 592
582 593 def test_escaped_paren():
583 594 transform_checker(syntax['escaped_paren'], isp.transform_escaped)
584 595
585 596
586 597 class IPythonInputTestCase(InputSplitterTestCase):
587 598 """By just creating a new class whose .isp is a different instance, we
588 599 re-run the same test battery on the new input splitter.
589 600
590 601 In addition, this runs the tests over the syntax and syntax_ml dicts that
591 602 were tested by individual functions, as part of the OO interface.
592 603
593 604 It also makes some checks on the raw buffer storage.
594 605 """
595 606
596 607 def setUp(self):
597 608 self.isp = isp.IPythonInputSplitter(input_mode='line')
598 609
599 610 def test_syntax(self):
600 611 """Call all single-line syntax tests from the main object"""
601 612 isp = self.isp
602 613 for example in syntax.itervalues():
603 614 for raw, out_t in example:
604 615 if raw.startswith(' '):
605 616 continue
606 617
607 618 isp.push(raw)
608 619 out, out_raw = isp.source_raw_reset()
609 620 self.assertEqual(out.rstrip(), out_t)
610 621 self.assertEqual(out_raw.rstrip(), raw.rstrip())
611 622
612 623 def test_syntax_multiline(self):
613 624 isp = self.isp
614 625 for example in syntax_ml.itervalues():
615 626 out_t_parts = []
616 627 raw_parts = []
617 628 for line_pairs in example:
618 629 for lraw, out_t_part in line_pairs:
619 630 isp.push(lraw)
620 631 out_t_parts.append(out_t_part)
621 632 raw_parts.append(lraw)
622 633
623 634 out, out_raw = isp.source_raw_reset()
624 635 out_t = '\n'.join(out_t_parts).rstrip()
625 636 raw = '\n'.join(raw_parts).rstrip()
626 637 self.assertEqual(out.rstrip(), out_t)
627 638 self.assertEqual(out_raw.rstrip(), raw)
628 639
629 640
630 641 class BlockIPythonInputTestCase(IPythonInputTestCase):
631 642
632 643 # Deactivate tests that don't make sense for the block mode
633 644 test_push3 = test_split = lambda s: None
634 645
635 646 def setUp(self):
636 647 self.isp = isp.IPythonInputSplitter(input_mode='cell')
637 648
638 649 def test_syntax_multiline(self):
639 650 isp = self.isp
640 651 for example in syntax_ml.itervalues():
641 652 raw_parts = []
642 653 out_t_parts = []
643 654 for line_pairs in example:
644 655 for raw, out_t_part in line_pairs:
645 656 raw_parts.append(raw)
646 657 out_t_parts.append(out_t_part)
647 658
648 659 raw = '\n'.join(raw_parts)
649 660 out_t = '\n'.join(out_t_parts)
650 661
651 662 isp.push(raw)
652 663 out, out_raw = isp.source_raw_reset()
653 664 # Match ignoring trailing whitespace
654 665 self.assertEqual(out.rstrip(), out_t.rstrip())
655 666 self.assertEqual(out_raw.rstrip(), raw.rstrip())
656 667
657 668
658 669 #-----------------------------------------------------------------------------
659 670 # Main - use as a script, mostly for developer experiments
660 671 #-----------------------------------------------------------------------------
661 672
662 673 if __name__ == '__main__':
663 674 # A simple demo for interactive experimentation. This code will not get
664 675 # picked up by any test suite.
665 676 from IPython.core.inputsplitter import InputSplitter, IPythonInputSplitter
666 677
667 678 # configure here the syntax to use, prompt and whether to autoindent
668 679 #isp, start_prompt = InputSplitter(), '>>> '
669 680 isp, start_prompt = IPythonInputSplitter(), 'In> '
670 681
671 682 autoindent = True
672 683 #autoindent = False
673 684
674 685 try:
675 686 while True:
676 687 prompt = start_prompt
677 688 while isp.push_accepts_more():
678 689 indent = ' '*isp.indent_spaces
679 690 if autoindent:
680 691 line = indent + raw_input(prompt+indent)
681 692 else:
682 693 line = raw_input(prompt)
683 694 isp.push(line)
684 695 prompt = '... '
685 696
686 697 # Here we just return input so we can use it in a test suite, but a
687 698 # real interpreter would instead send it for execution somewhere.
688 699 #src = isp.source; raise EOFError # dbg
689 700 src, raw = isp.source_raw_reset()
690 701 print 'Input source was:\n', src
691 702 print 'Raw source was:\n', raw
692 703 except EOFError:
693 704 print 'Bye'
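The new `test_push_accepts_more5` checks that a `try`/`except` block is still treated as incomplete after the indented `raise`, so a blank line is needed to end it. A rough Python 3 sketch of the same idea using the standard library's `codeop` module is given below. It only illustrates the "incomplete until a blank line" rule and is not a copy of `InputSplitter`, whose relevant code is not shown in this part of the diff; `accepts_more` is a hypothetical helper name.

```python
import codeop


def accepts_more(lines):
    """Return True if the lines typed so far still form an incomplete block.

    codeop.compile_command returns None when the source is a valid prefix
    of a statement but not yet complete.
    """
    source = '\n'.join(lines)
    return codeop.compile_command(source, '<input>', 'single') is None


typed = ['try:', '    a = 5', 'except:', '    raise']
print(accepts_more(typed))         # block is still open
print(accepts_more(typed + ['']))  # a blank line closes the block
```

A single-line statement such as `x=1` is complete as soon as it is typed, so the helper reports that no more input is needed in that case.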