Fix bug where bare strings would be silently ignored in input....
Robert Kern -
@@ -1,1014 +1,1021 b''
1 1 """Analysis of text input into executable blocks.
2 2
3 3 The main class in this module, :class:`InputSplitter`, is designed to break
4 4 input from either interactive, line-by-line environments or block-based ones,
5 5 into standalone blocks that can be executed by Python as 'single' statements
6 6 (thus triggering sys.displayhook).
7 7
8 8 A companion, :class:`IPythonInputSplitter`, provides the same functionality but
9 9 with full support for the extended IPython syntax (magics, system calls, etc).
10 10
11 11 For more details, see the class docstring below.
12 12
13 13 Syntax Transformations
14 14 ----------------------
15 15
16 16 One of the main jobs of the code in this file is to apply all syntax
17 17 transformations that make up 'the IPython language', i.e. magics, shell
18 18 escapes, etc. All transformations should be implemented as *fully stateless*
19 19 entities, that simply take one line as their input and return a line.
20 20 Internally for implementation purposes they may be a normal function or a
21 21 callable object, but the only input they receive will be a single line and they
22 22 should only return a line, without holding any data-dependent state between
23 23 calls.
24 24
25 25 As an example, the EscapedTransformer is a class so we can more clearly group
26 26 together the functionality of dispatching to individual functions based on the
27 27 starting escape character, but the only method for public use is its call
28 28 method.
29 29
30 30
31 31 ToDo
32 32 ----
33 33
34 34 - Should we make push() actually raise an exception once push_accepts_more()
35 35 returns False?
36 36
37 37 - Naming cleanups. The tr_* names aren't the most elegant, though now they are
38 38 at least just attributes of a class so not really very exposed.
39 39
40 40 - Think about the best way to support dynamic things: automagic, autocall,
41 41 macros, etc.
42 42
43 43 - Think of a better heuristic for the application of the transforms in
44 44 IPythonInputSplitter.push() than looking at the buffer ending in ':'. Idea:
45 45 track indentation change events (indent, dedent, nothing) and apply them only
46 46 if the indentation went up, but not otherwise.
47 47
48 48 - Think of the cleanest way for supporting user-specified transformations (the
49 49 user prefilters we had before).
50 50
51 51 Authors
52 52 -------
53 53
54 54 * Fernando Perez
55 55 * Brian Granger
56 56 """
57 57 #-----------------------------------------------------------------------------
58 58 # Copyright (C) 2010 The IPython Development Team
59 59 #
60 60 # Distributed under the terms of the BSD License. The full license is in
61 61 # the file COPYING, distributed as part of this software.
62 62 #-----------------------------------------------------------------------------
63 63 from __future__ import print_function
64 64
65 65 #-----------------------------------------------------------------------------
66 66 # Imports
67 67 #-----------------------------------------------------------------------------
68 68 # stdlib
69 69 import codeop
70 70 import re
71 71 import sys
72 72
73 73 # IPython modules
74 74 from IPython.utils.text import make_quoted_expr
75 75
76 76 #-----------------------------------------------------------------------------
77 77 # Globals
78 78 #-----------------------------------------------------------------------------
79 79
80 80 # The escape sequences that define the syntax transformations IPython will
81 81 # apply to user input. These can NOT be just changed here: many regular
82 82 # expressions and other parts of the code may use their hardcoded values, and
83 83 # for all intents and purposes they constitute the 'IPython syntax', so they
84 84 # should be considered fixed.
85 85
86 86 ESC_SHELL = '!' # Send line to underlying system shell
87 87 ESC_SH_CAP = '!!' # Send line to system shell and capture output
88 88 ESC_HELP = '?' # Find information about object
89 89 ESC_HELP2 = '??' # Find extra-detailed information about object
90 90 ESC_MAGIC = '%' # Call magic function
91 91 ESC_QUOTE = ',' # Split args on whitespace, quote each as string and call
92 92 ESC_QUOTE2 = ';' # Quote all args as a single string, call
93 93 ESC_PAREN = '/' # Call first argument with rest of line as arguments
94 94
95 95 #-----------------------------------------------------------------------------
96 96 # Utilities
97 97 #-----------------------------------------------------------------------------
98 98
99 99 # FIXME: These are general-purpose utilities that later can be moved to the
100 100 # general ward. Kept here for now because we're being very strict about test
101 101 # coverage with this code, and this lets us ensure that we keep 100% coverage
102 102 # while developing.
103 103
104 104 # compiled regexps for autoindent management
105 105 dedent_re = re.compile(r'^\s+raise|^\s+return|^\s+pass')
106 106 ini_spaces_re = re.compile(r'^([ \t\r\f\v]+)')
107 107
108 108 # regexp to match pure comment lines so we don't accidentally insert 'if 1:'
109 109 # before pure comments
110 110 comment_line_re = re.compile('^\s*\#')
111 111
112 112
113 113 def num_ini_spaces(s):
114 114 """Return the number of initial spaces in a string.
115 115
116 116 Note that tabs are counted as a single space. For now, we do *not* support
117 117 mixing of tabs and spaces in the user's input.
118 118
119 119 Parameters
120 120 ----------
121 121 s : string
122 122
123 123 Returns
124 124 -------
125 125 n : int
126 126 """
127 127
128 128 ini_spaces = ini_spaces_re.match(s)
129 129 if ini_spaces:
130 130 return ini_spaces.end()
131 131 else:
132 132 return 0
133 133
134 134
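A minimal standalone sketch of the leading-whitespace count above, written for Python 3. It reuses the `ini_spaces_re` pattern from the listing; the sample inputs are illustrative only. As the docstring says, a tab counts as one character.

```python
import re

# Same pattern as in the listing: a run of leading whitespace characters.
ini_spaces_re = re.compile(r'^([ \t\r\f\v]+)')


def num_ini_spaces(s):
    """Return the number of leading whitespace characters in s."""
    match = ini_spaces_re.match(s)
    # match.end() is the index just past the leading whitespace run.
    return match.end() if match else 0


print(num_ini_spaces('    x = 1'))  # four spaces
print(num_ini_spaces('\tx = 1'))    # one tab, counted as a single character
print(num_ini_spaces('x = 1'))      # no indentation
```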
135 135 def remove_comments(src):
136 136 """Remove all comments from input source.
137 137
138 138 Note: comments are NOT recognized inside of strings!
139 139
140 140 Parameters
141 141 ----------
142 142 src : string
143 143 A single or multiline input string.
144 144
145 145 Returns
146 146 -------
147 147 String with all Python comments removed.
148 148 """
149 149
150 150 return re.sub('#.*', '', src)
151 151
152 152
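The caveat in the `remove_comments` docstring is easy to overlook, so here is a small Python 3 sketch of the same substitution. It shows that a `#` inside a string literal is stripped as well; the sample strings are made up for illustration.

```python
import re


def remove_comments(src):
    """Strip everything from '#' to the end of each line."""
    # '.' does not match a newline, so each line is handled independently.
    return re.sub('#.*', '', src)


plain = remove_comments('x = 1  # set x\ny = 2\n')
inside_string = remove_comments("tag = '#not a comment'\n")

print(repr(plain))
print(repr(inside_string))  # the string literal is truncated at the '#'
```

This is why the function is only used for indentation tracking here and not for producing code that will be executed.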
153 153 def get_input_encoding():
154 154 """Return the default standard input encoding.
155 155
156 156 If sys.stdin has no encoding, 'ascii' is returned."""
157 157 # There are strange environments for which sys.stdin.encoding is None. We
158 158 # ensure that a valid encoding is returned.
159 159 encoding = getattr(sys.stdin, 'encoding', None)
160 160 if encoding is None:
161 161 encoding = 'ascii'
162 162 return encoding
163 163
164 164 #-----------------------------------------------------------------------------
165 165 # Classes and functions for normal Python syntax handling
166 166 #-----------------------------------------------------------------------------
167 167
168 168 # HACK! This implementation, written by Robert K a while ago using the
169 169 # compiler module, is more robust than the other one below, but it expects its
170 170 # input to be pure python (no ipython syntax). For now we're using it as a
171 171 # second-pass splitter after the first pass transforms the input to pure
172 172 # python.
173 173
174 174 def split_blocks(python):
175 175 """ Split multiple lines of code into discrete commands that can be
176 176 executed singly.
177 177
178 178 Parameters
179 179 ----------
180 180 python : str
181 181 Pure, exec'able Python code.
182 182
183 183 Returns
184 184 -------
185 185 commands : list of str
186 186 Separate commands that can be exec'ed independently.
187 187 """
188 188
189 189 import compiler
190 190
191 191 # compiler.parse treats trailing spaces after a newline as a
192 192 # SyntaxError. This is different than codeop.CommandCompiler, which
193 193 # will compile the trailing spaces just fine. We simply strip any
194 194 # trailing whitespace off. Passing a string with trailing whitespace
195 195 # to exec will fail however. There seems to be some inconsistency in
196 196 # how trailing whitespace is handled, but this seems to work.
197 197 python_ori = python # save original in case we bail on error
198 198 python = python.strip()
199 199
200 200 # The compiler module does not like unicode, so we need to
201 201 # encode it:
202 202 if isinstance(python, unicode):
203 203 # Use the utf-8-sig BOM so the compiler detects this as a UTF-8
204 204 # encoded string.
205 205 python = '\xef\xbb\xbf' + python.encode('utf-8')
206 206
207 207 # The compiler module will parse the code into an abstract syntax tree.
208 208 # This has a bug with str("a\nb"), but not str("""a\nb""")!!!
209 209 try:
210 210 ast = compiler.parse(python)
211 211 except:
212 212 return [python_ori]
213 213
214 214 # Uncomment to help debug the ast tree
215 215 # for n in ast.node:
216 216 # print n.lineno,'->',n
217 217
218 218 # Each separate command is available by iterating over ast.node. The
219 219 # lineno attribute is the line number (1-indexed) beginning the command's
220 220 # suite.
221 221 # lines ending with ";" yield a Discard Node that doesn't have a lineno
222 222 # attribute. These nodes can and should be discarded. But there are
223 223 # other situations that cause Discard nodes that shouldn't be discarded.
224 224 # We might eventually discover other cases where lineno is None and have
225 225 # to put in a more sophisticated test.
226 226 linenos = [x.lineno-1 for x in ast.node if x.lineno is not None]
227 227
228 # When we have a bare string as the first statement, it does not end up as
229 # a Discard Node in the AST as we might expect. Instead, it gets interpreted
230 # as the docstring of the module. Check for this case and prepend 0 (the
231 # first line number) to the list of linenos to account for it.
232 if ast.doc is not None:
233 linenos.insert(0, 0)
234
228 235 # When we finally get the slices, we will need to slice all the way to
229 236 # the end even though we don't have a line number for it. Fortunately,
230 237 # None does the job nicely.
231 238 linenos.append(None)
232 239
233 240 # Same problem at the other end: sometimes the ast tree has its
234 241 # first complete statement not starting on line 0. In this case
235 242 # we might miss part of it. This fixes ticket 266993. Thanks Gael!
236 243 linenos[0] = 0
237 244
238 245 lines = python.splitlines()
239 246
240 247 # Create a list of atomic commands.
241 248 cmds = []
242 249 for i, j in zip(linenos[:-1], linenos[1:]):
243 250 cmd = lines[i:j]
244 251 if cmd:
245 252 cmds.append('\n'.join(cmd)+'\n')
246 253
247 254 return cmds
248 255
249 256
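The `compiler` module used by `split_blocks` exists only in Python 2. The sketch below illustrates the same slice-by-starting-line idea with the standard `ast` module in Python 3. It is a swapped-in technique, not the code from this changeset, and `split_blocks_sketch` is a hypothetical name. In the modern `ast` tree a leading bare string stays in `tree.body`, so the docstring special case added in this change is not needed there.

```python
import ast


def split_blocks_sketch(source):
    """Split pure Python source into independently executable chunks."""
    source = source.strip()
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # Mirror the listing: on a parse failure, hand back the input whole.
        return [source]
    if not tree.body:
        return []

    starts = []
    for node in tree.body:
        first = node.lineno
        # A decorated definition starts at its first decorator line.
        for deco in getattr(node, 'decorator_list', []):
            first = min(first, deco.lineno)
        starts.append(first - 1)  # convert to 0-based line indices

    starts[0] = 0         # never lose lines before the first statement
    starts.append(None)   # slice the last chunk to the end of the input

    lines = source.splitlines()
    chunks = []
    for i, j in zip(starts[:-1], starts[1:]):
        chunk = lines[i:j]
        if chunk:  # statements sharing one line give an empty slice
            chunks.append('\n'.join(chunk) + '\n')
    return chunks


print(split_blocks_sketch('"hello"\nx = 1\ndef f():\n    return 2\n'))
```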
250 257 class InputSplitter(object):
251 258 """An object that can split Python source input in executable blocks.
252 259
253 260 This object is designed to be used in one of two basic modes:
254 261
255 262 1. By feeding it python source line-by-line, using :meth:`push`. In this
256 263 mode, it will return on each push whether the currently pushed code
257 264 could be executed already. In addition, it provides a method called
258 265 :meth:`push_accepts_more` that can be used to query whether more input
259 266 can be pushed into a single interactive block.
260 267
261 268 2. By calling :meth:`split_blocks` with a single, multiline Python string,
262 269 that is then split into blocks each of which can be executed
263 270 interactively as a single statement.
264 271
265 272 This is a simple example of how an interactive terminal-based client can use
266 273 this tool::
267 274
268 275 isp = InputSplitter()
269 276 while isp.push_accepts_more():
270 277 indent = ' '*isp.indent_spaces
271 278 prompt = '>>> ' + indent
272 279 line = indent + raw_input(prompt)
273 280 isp.push(line)
274 281 print 'Input source was:\n', isp.source_reset(),
275 282 """
276 283 # Number of spaces of indentation computed from input that has been pushed
277 284 # so far. This is the attribute callers should query to get the current
278 285 # indentation level, in order to provide auto-indent facilities.
279 286 indent_spaces = 0
280 287 # String, indicating the default input encoding. It is computed by default
281 288 # at initialization time via get_input_encoding(), but it can be reset by a
282 289 # client with specific knowledge of the encoding.
283 290 encoding = ''
284 291 # String where the current full source input is stored, properly encoded.
285 292 # Reading this attribute is the normal way of querying the currently pushed
286 293 # source code, that has been properly encoded.
287 294 source = ''
288 295 # Code object corresponding to the current source. It is automatically
289 296 # synced to the source, so it can be queried at any time to obtain the code
290 297 # object; it will be None if the source doesn't compile to valid Python.
291 298 code = None
292 299 # Input mode
293 300 input_mode = 'line'
294 301
295 302 # Private attributes
296 303
297 304 # List with lines of input accumulated so far
298 305 _buffer = None
299 306 # Command compiler
300 307 _compile = None
301 308 # Mark when input has changed indentation all the way back to flush-left
302 309 _full_dedent = False
303 310 # Boolean indicating whether the current block is complete
304 311 _is_complete = None
305 312
306 313 def __init__(self, input_mode=None):
307 314 """Create a new InputSplitter instance.
308 315
309 316 Parameters
310 317 ----------
311 318 input_mode : str
312 319
313 320 One of ['line', 'cell']; default is 'line'.
314 321
315 322 The input_mode parameter controls how new inputs are used when fed via
316 323 the :meth:`push` method:
317 324
318 325 - 'line': meant for line-oriented clients, inputs are appended one at a
319 326 time to the internal buffer and the whole buffer is compiled.
320 327
321 328 - 'cell': meant for clients that can edit multi-line 'cells' of text at
322 329 a time. A cell can contain one or more blocks that can be compiled in
323 330 'single' mode by Python. In this mode, each new input
324 331 completely replaces all prior inputs. Cell mode is thus equivalent
325 332 to prepending a full reset() to every push() call.
326 333 """
327 334 self._buffer = []
328 335 self._compile = codeop.CommandCompiler()
329 336 self.encoding = get_input_encoding()
330 337 self.input_mode = InputSplitter.input_mode if input_mode is None \
331 338 else input_mode
332 339
333 340 def reset(self):
334 341 """Reset the input buffer and associated state."""
335 342 self.indent_spaces = 0
336 343 self._buffer[:] = []
337 344 self.source = ''
338 345 self.code = None
339 346 self._is_complete = False
340 347 self._full_dedent = False
341 348
342 349 def source_reset(self):
343 350 """Return the input source and perform a full reset.
344 351 """
345 352 out = self.source
346 353 self.reset()
347 354 return out
348 355
349 356 def push(self, lines):
350 """Push one ore more lines of input.
357 """Push one or more lines of input.
351 358
352 359 This stores the given lines and returns a status code indicating
353 360 whether the code forms a complete Python block or not.
354 361
355 362 Any exceptions generated in compilation are swallowed, but if an
356 363 exception was produced, the method returns True.
357 364
358 365 Parameters
359 366 ----------
360 367 lines : string
361 368 One or more lines of Python input.
362 369
363 370 Returns
364 371 -------
365 372 is_complete : boolean
366 373 True if the current input source (the result of the current input
367 374 plus prior inputs) forms a complete Python execution block. Note that
368 375 this value is also stored as a private attribute (_is_complete), so it
369 376 can be queried at any time.
370 377 """
371 378 if self.input_mode == 'cell':
372 379 self.reset()
373 380
374 381 self._store(lines)
375 382 source = self.source
376 383
377 384 # Before calling _compile(), reset the code object to None so that if an
378 385 # exception is raised in compilation, we don't mislead by having
379 386 # inconsistent code/source attributes.
380 387 self.code, self._is_complete = None, None
381 388
382 389 # Honor termination lines properly
383 390 if source.rstrip().endswith('\\'):
384 391 return False
385 392
386 393 self._update_indent(lines)
387 394 try:
388 395 self.code = self._compile(source)
389 396 # Invalid syntax can produce any of a number of different errors from
390 397 # inside the compiler, so we have to catch them all. Syntax errors
391 398 # immediately produce a 'ready' block, so the invalid Python can be
392 399 # sent to the kernel for evaluation with possible ipython
393 400 # special-syntax conversion.
394 401 except (SyntaxError, OverflowError, ValueError, TypeError,
395 402 MemoryError):
396 403 self._is_complete = True
397 404 else:
398 405 # Compilation didn't produce any exceptions (though it may not have
399 406 # given a complete code object)
400 407 self._is_complete = self.code is not None
401 408
402 409 return self._is_complete
403 410
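The completeness check in `push` rests on how `codeop.CommandCompiler` reports its result: a code object for complete input, `None` for input that needs more lines, and an exception for invalid syntax. A minimal Python 3 sketch of that three-way outcome follows; `is_complete` is a hypothetical helper, not part of the class.

```python
import codeop

compiler = codeop.CommandCompiler()


def is_complete(source):
    """Return True if source is a complete block or is invalid Python."""
    try:
        code = compiler(source)
    except (SyntaxError, OverflowError, ValueError):
        # Invalid input counts as 'ready', so it can be passed on for
        # possible special-syntax conversion, as in the listing.
        return True
    # None means the compiler is still waiting for more input.
    return code is not None


print(is_complete('x = 1\n'))      # complete statement
print(is_complete('def f():\n'))   # block header only, needs a body
print(is_complete('x = = 1\n'))    # syntax error, reported as complete
```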
404 411 def push_accepts_more(self):
405 412 """Return whether a block of interactive input can accept more input.
406 413
407 414 This method is meant to be used by line-oriented frontends, who need to
408 415 guess whether a block is complete or not based solely on prior and
409 416 current input lines. The InputSplitter considers it has a complete
410 417 interactive block and will not accept more input only when either a
411 418 SyntaxError is raised, or *all* of the following are true:
412 419
413 420 1. The input compiles to a complete statement.
414 421
415 422 2. The indentation level is flush-left (because if we are indented,
416 423 like inside a function definition or for loop, we need to keep
417 424 reading new input).
418 425
419 426 3. There is one extra line consisting only of whitespace.
420 427
421 428 Because of condition #3, this method should be used only by
422 429 *line-oriented* frontends, since it means that intermediate blank lines
423 430 are not allowed in function definitions (or any other indented block).
424 431
425 432 Block-oriented frontends that have a separate keyboard event to
426 433 indicate execution should use the :meth:`split_blocks` method instead.
427 434
428 435 If the current input produces a syntax error, this method immediately
429 436 returns False but does *not* raise the syntax error exception, as
430 437 typically clients will want to send invalid syntax to an execution
431 438 backend which might convert the invalid syntax into valid Python via
432 439 one of the dynamic IPython mechanisms.
433 440 """
434 441
435 442 # With incomplete input, unconditionally accept more
436 443 if not self._is_complete:
437 444 return True
438 445
439 446 # If we already have complete input and we're flush left, the answer
440 447 # depends. In line mode, we're done. But in cell mode, we need to
441 448 # check how many blocks the input so far compiles into, because if
442 449 # there's already more than one full independent block of input, then
443 450 # the client has entered full 'cell' mode and is feeding lines that
444 451 # are each complete. In this case we should then keep accepting.
445 452 # The Qt terminal-like console does precisely this, to provide the
446 453 # convenience of terminal-like input of single expressions, but
447 454 # allowing the user (with a separate keystroke) to switch to 'cell'
448 455 # mode and type multiple expressions in one shot.
449 456 if self.indent_spaces==0:
450 457 if self.input_mode=='line':
451 458 return False
452 459 else:
453 460 nblocks = len(split_blocks(''.join(self._buffer)))
454 461 if nblocks==1:
455 462 return False
456 463
457 464 # When input is complete, then termination is marked by an extra blank
458 465 # line at the end.
459 466 last_line = self.source.splitlines()[-1]
460 467 return bool(last_line and not last_line.isspace())
461 468
462 469 def split_blocks(self, lines):
463 470 """Split a multiline string into multiple input blocks.
464 471
465 472 Note: this method starts by performing a full reset().
466 473
467 474 Parameters
468 475 ----------
469 476 lines : str
470 477 A possibly multiline string.
471 478
472 479 Returns
473 480 -------
474 481 blocks : list
475 482 A list of strings, each possibly multiline. Each string corresponds
476 483 to a single block that can be compiled in 'single' mode (unless it
477 484 has a syntax error)."""
478 485
479 486 # This code is fairly delicate. If you make any changes here, make
480 487 # absolutely sure that you do run the full test suite and ALL tests
481 488 # pass.
482 489
483 490 self.reset()
484 491 blocks = []
485 492
486 493 # Reversed copy so we can use pop() efficiently and consume the input
487 494 # as a stack
488 495 lines = lines.splitlines()[::-1]
489 496 # Outer loop over all input
490 497 while lines:
491 498 #print 'Current lines:', lines # dbg
492 499 # Inner loop to build each block
493 500 while True:
494 501 # Safety exit from inner loop
495 502 if not lines:
496 503 break
497 504 # Grab next line but don't push it yet
498 505 next_line = lines.pop()
499 506 # Blank/empty lines are pushed as-is
500 507 if not next_line or next_line.isspace():
501 508 self.push(next_line)
502 509 continue
503 510
504 511 # Check indentation changes caused by the *next* line
505 512 indent_spaces, _full_dedent = self._find_indent(next_line)
506 513
507 514 # If the next line causes a dedent, it can be for two different
508 515 # reasons: either an explicit de-dent by the user or a
509 516 # return/raise/pass statement. These MUST be handled
510 517 # separately:
511 518 #
512 519 # 1. the first case is only detected when the actual explicit
513 520 # dedent happens, and that would be the *first* line of a *new*
514 521 # block. Thus, we must put the line back into the input buffer
515 522 # so that it starts a new block on the next pass.
516 523 #
517 524 # 2. the second case is detected in the line before the actual
518 525 # dedent happens, so we consume the line and we can break out
519 526 # to start a new block.
520 527
521 528 # Case 1, explicit dedent causes a break.
522 529 # Note: check that we weren't on the very last line, else we'll
523 530 # enter an infinite loop adding/removing the last line.
524 531 if _full_dedent and lines and not next_line.startswith(' '):
525 532 lines.append(next_line)
526 533 break
527 534
528 535 # Otherwise any line is pushed
529 536 self.push(next_line)
530 537
531 538 # Case 2, full dedent with full block ready:
532 539 if _full_dedent or \
533 540 self.indent_spaces==0 and not self.push_accepts_more():
534 541 break
535 542 # Form the new block with the current source input
536 543 blocks.append(self.source_reset())
537 544
538 545 #return blocks
539 546 # HACK!!! Now that our input is in blocks but guaranteed to be pure
540 547 # python syntax, feed it back a second time through the AST-based
541 548 # splitter, which is more accurate than ours.
542 549 return split_blocks(''.join(blocks))
543 550
544 551 #------------------------------------------------------------------------
545 552 # Private interface
546 553 #------------------------------------------------------------------------
547 554
548 555 def _find_indent(self, line):
549 556 """Compute the new indentation level for a single line.
550 557
551 558 Parameters
552 559 ----------
553 560 line : str
554 561 A single new line of non-whitespace, non-comment Python input.
555 562
556 563 Returns
557 564 -------
558 565 indent_spaces : int
559 566 New value for the indent level (it may be equal to self.indent_spaces
560 567 if indentation doesn't change).
561 568
562 569 full_dedent : boolean
563 570 Whether the new line causes a full flush-left dedent.
564 571 """
565 572 indent_spaces = self.indent_spaces
566 573 full_dedent = self._full_dedent
567 574
568 575 inisp = num_ini_spaces(line)
569 576 if inisp < indent_spaces:
570 577 indent_spaces = inisp
571 578 if indent_spaces <= 0:
572 579 #print 'Full dedent in text',self.source # dbg
573 580 full_dedent = True
574 581
575 582 if line[-1] == ':':
576 583 indent_spaces += 4
577 584 elif dedent_re.match(line):
578 585 indent_spaces -= 4
579 586 if indent_spaces <= 0:
580 587 full_dedent = True
581 588
582 589 # Safety
583 590 if indent_spaces < 0:
584 591 indent_spaces = 0
585 592 #print 'safety' # dbg
586 593
587 594 return indent_spaces, full_dedent
588 595
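The indentation rule in `_find_indent` can be tried in isolation. The sketch below restates it as a pure function for Python 3, passing the current state in as arguments instead of reading it from `self`. The function name and the default arguments are assumptions made for illustration.

```python
import re

# Patterns copied from the listing.
dedent_re = re.compile(r'^\s+raise|^\s+return|^\s+pass')
ini_spaces_re = re.compile(r'^([ \t\r\f\v]+)')


def find_indent(line, indent_spaces=0, full_dedent=False):
    """Return (new_indent, full_dedent) after seeing one non-blank line."""
    match = ini_spaces_re.match(line)
    inisp = match.end() if match else 0

    # An explicit dedent by the user.
    if inisp < indent_spaces:
        indent_spaces = inisp
        if indent_spaces <= 0:
            full_dedent = True

    if line[-1] == ':':
        # A block opener indents the next line.
        indent_spaces += 4
    elif dedent_re.match(line):
        # return/raise/pass end the current block.
        indent_spaces -= 4
        if indent_spaces <= 0:
            full_dedent = True

    return max(indent_spaces, 0), full_dedent


print(find_indent('if x:', 0))
print(find_indent('    return 1', 4))
print(find_indent('x = 1', 4))
```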
589 596 def _update_indent(self, lines):
590 597 for line in remove_comments(lines).splitlines():
591 598 if line and not line.isspace():
592 599 self.indent_spaces, self._full_dedent = self._find_indent(line)
593 600
594 601 def _store(self, lines, buffer=None, store='source'):
595 602 """Store one or more lines of input.
596 603
597 604 If input lines are not newline-terminated, a newline is automatically
598 605 appended."""
599 606
600 607 if buffer is None:
601 608 buffer = self._buffer
602 609
603 610 if lines.endswith('\n'):
604 611 buffer.append(lines)
605 612 else:
606 613 buffer.append(lines+'\n')
607 614 setattr(self, store, self._set_source(buffer))
608 615
609 616 def _set_source(self, buffer):
610 617 return ''.join(buffer).encode(self.encoding)
611 618
612 619
613 620 #-----------------------------------------------------------------------------
614 621 # Functions and classes for IPython-specific syntactic support
615 622 #-----------------------------------------------------------------------------
616 623
617 624 # RegExp for splitting line contents into pre-char//first word-method//rest.
618 625 # For clarity, each group is on one line.
619 626
620 627 line_split = re.compile("""
621 628 ^(\s*) # any leading space
622 629 ([,;/%]|!!?|\?\??) # escape character or characters
623 630 \s*(%?[\w\.\*]*) # function/method, possibly with leading %
624 631 # to correctly treat things like '?%magic'
625 632 (\s+.*$|$) # rest of line
626 633 """, re.VERBOSE)
627 634
628 635
629 636 def split_user_input(line):
630 637 """Split user input into early whitespace, esc-char, function part and rest.
631 638
632 639 This currently handles lines with '=' in them in a very inconsistent
633 640 manner.
634 641
635 642 Examples
636 643 ========
637 644 >>> split_user_input('x=1')
638 645 ('', '', 'x=1', '')
639 646 >>> split_user_input('?')
640 647 ('', '?', '', '')
641 648 >>> split_user_input('??')
642 649 ('', '??', '', '')
643 650 >>> split_user_input(' ?')
644 651 (' ', '?', '', '')
645 652 >>> split_user_input(' ??')
646 653 (' ', '??', '', '')
647 654 >>> split_user_input('??x')
648 655 ('', '??', 'x', '')
649 656 >>> split_user_input('?x=1')
650 657 ('', '', '?x=1', '')
651 658 >>> split_user_input('!ls')
652 659 ('', '!', 'ls', '')
653 660 >>> split_user_input(' !ls')
654 661 (' ', '!', 'ls', '')
655 662 >>> split_user_input('!!ls')
656 663 ('', '!!', 'ls', '')
657 664 >>> split_user_input(' !!ls')
658 665 (' ', '!!', 'ls', '')
659 666 >>> split_user_input(',ls')
660 667 ('', ',', 'ls', '')
661 668 >>> split_user_input(';ls')
662 669 ('', ';', 'ls', '')
663 670 >>> split_user_input(' ;ls')
664 671 (' ', ';', 'ls', '')
665 672 >>> split_user_input('f.g(x)')
666 673 ('', '', 'f.g(x)', '')
667 674 >>> split_user_input('f.g (x)')
668 675 ('', '', 'f.g', '(x)')
669 676 >>> split_user_input('?%hist')
670 677 ('', '?', '%hist', '')
671 678 >>> split_user_input('?x*')
672 679 ('', '?', 'x*', '')
673 680 """
674 681 match = line_split.match(line)
675 682 if match:
676 683 lspace, esc, fpart, rest = match.groups()
677 684 else:
678 685 # print "match failed for line '%s'" % line
679 686 try:
680 687 fpart, rest = line.split(None, 1)
681 688 except ValueError:
682 689 # print "split failed for line '%s'" % line
683 690 fpart, rest = line,''
684 691 lspace = re.match('^(\s*)(.*)', line).groups()[0]
685 692 esc = ''
686 693
687 694 # fpart has to be a valid python identifier, so it better be only pure
688 695 # ascii, no unicode:
689 696 try:
690 697 fpart = fpart.encode('ascii')
691 698 except UnicodeEncodeError:
692 699 lspace = unicode(lspace)
693 700 rest = fpart + u' ' + rest
694 701 fpart = u''
695 702
696 703 #print 'line:<%s>' % line # dbg
697 704 #print 'esc <%s> fpart <%s> rest <%s>' % (esc,fpart.strip(),rest) # dbg
698 705 return lspace, esc, fpart.strip(), rest.lstrip()
699 706
700 707
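To see which inputs take the regexp path and which take the fallback path in `split_user_input`, the `line_split` pattern can be exercised directly. This Python 3 sketch copies the pattern as a raw string, which is an adjustment to avoid invalid-escape warnings. A result of `None` corresponds to the fallback branch in the function above.

```python
import re

line_split = re.compile(r"""
    ^(\s*)               # any leading space
    ([,;/%]|!!?|\?\??)   # escape character or characters
    \s*(%?[\w\.\*]*)     # function/method, possibly with leading %
    (\s+.*$|$)           # rest of line
    """, re.VERBOSE)


def groups_or_none(line):
    """Return the four captured groups, or None when the line has no escape."""
    match = line_split.match(line)
    return match.groups() if match else None


print(groups_or_none('!!ls'))
print(groups_or_none(' ?x*'))
print(groups_or_none('x=1'))    # no escape character: fallback path
print(groups_or_none('?x=1'))   # '=' right after the name: fallback path
```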
701 708 # The escaped translators ALL receive a line where their own escape has been
702 709 # stripped. Only '?' is valid at the end of the line, all others can only be
703 710 # placed at the start.
704 711
705 712 class LineInfo(object):
706 713 """A single line of input and associated info.
707 714
708 715 This is a utility class that mostly wraps the output of
709 716 :func:`split_user_input` into a convenient object to be passed around
710 717 during input transformations.
711 718
712 719 Includes the following as properties:
713 720
714 721 line
715 722 The original, raw line
716 723
717 724 lspace
718 725 Any early whitespace before actual text starts.
719 726
720 727 esc
721 728 The initial esc character (or characters, for double-char escapes like
722 729 '??' or '!!').
723 730
724 731 fpart
725 732 The 'function part', which is basically the maximal initial sequence
726 733 of valid python identifiers and the '.' character. This is what is
727 734 checked for alias and magic transformations, used for auto-calling,
728 735 etc.
729 736
730 737 rest
731 738 Everything else on the line.
732 739 """
733 740 def __init__(self, line):
734 741 self.line = line
735 742 self.lspace, self.esc, self.fpart, self.rest = \
736 743 split_user_input(line)
737 744
738 745 def __str__(self):
739 746 return "LineInfo [%s|%s|%s|%s]" % (self.lspace, self.esc,
740 747 self.fpart, self.rest)
741 748
742 749
743 750 # Transformations of the special syntaxes that don't rely on an explicit escape
744 751 # character but instead on patterns on the input line
745 752
746 753 # The core transformations are implemented as standalone functions that can be
747 754 # tested and validated in isolation. Each of these uses a regexp, we
748 755 # pre-compile these and keep them close to each function definition for clarity
749 756
750 757 _assign_system_re = re.compile(r'(?P<lhs>(\s*)([\w\.]+)((\s*,\s*[\w\.]+)*))'
751 758 r'\s*=\s*!\s*(?P<cmd>.*)')
752 759
753 760 def transform_assign_system(line):
754 761 """Handle the `files = !ls` syntax."""
755 762 m = _assign_system_re.match(line)
756 763 if m is not None:
757 764 cmd = m.group('cmd')
758 765 lhs = m.group('lhs')
759 766 expr = make_quoted_expr(cmd)
760 767 new_line = '%s = get_ipython().getoutput(%s)' % (lhs, expr)
761 768 return new_line
762 769 return line
763 770
764 771
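A rough Python 3 sketch of the `files = !ls` rewrite follows. The listing quotes the command with `make_quoted_expr` from `IPython.utils.text`; here `repr()` stands in for it as an assumption, so the exact quoting of the output may differ from what IPython produces.

```python
import re

_assign_system_re = re.compile(
    r'(?P<lhs>(\s*)([\w\.]+)((\s*,\s*[\w\.]+)*))'
    r'\s*=\s*!\s*(?P<cmd>.*)')


def transform_assign_system(line):
    """Rewrite 'name = !cmd' into a get_ipython().getoutput(...) call."""
    m = _assign_system_re.match(line)
    if m is None:
        return line  # not an assignment from a shell command
    # repr() is a stand-in for IPython's make_quoted_expr (assumption).
    return '%s = get_ipython().getoutput(%r)' % (m.group('lhs'),
                                                 m.group('cmd'))


print(transform_assign_system('files = !ls -l'))
print(transform_assign_system('a, b = !ls'))
print(transform_assign_system('x = 1'))
```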
765 772 _assign_magic_re = re.compile(r'(?P<lhs>(\s*)([\w\.]+)((\s*,\s*[\w\.]+)*))'
766 773 r'\s*=\s*%\s*(?P<cmd>.*)')
767 774
768 775 def transform_assign_magic(line):
769 776 """Handle the `a = %who` syntax."""
770 777 m = _assign_magic_re.match(line)
771 778 if m is not None:
772 779 cmd = m.group('cmd')
773 780 lhs = m.group('lhs')
774 781 expr = make_quoted_expr(cmd)
775 782 new_line = '%s = get_ipython().magic(%s)' % (lhs, expr)
776 783 return new_line
777 784 return line
778 785
779 786
780 787 _classic_prompt_re = re.compile(r'^([ \t]*>>> |^[ \t]*\.\.\. )')
781 788
782 789 def transform_classic_prompt(line):
783 790 """Handle inputs that start with '>>> ' syntax."""
784 791
785 792 if not line or line.isspace():
786 793 return line
787 794 m = _classic_prompt_re.match(line)
788 795 if m:
789 796 return line[len(m.group(0)):]
790 797 else:
791 798 return line
792 799
793 800
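The prompt-stripping transform is stateless and works one line at a time, so it can be checked on its own. The sketch below is a Python 3 copy of the function above applied to a pasted session; the sample lines are illustrative.

```python
import re

_classic_prompt_re = re.compile(r'^([ \t]*>>> |^[ \t]*\.\.\. )')


def transform_classic_prompt(line):
    """Strip a leading '>>> ' or '... ' prompt, if present."""
    if not line or line.isspace():
        return line
    m = _classic_prompt_re.match(line)
    # Drop exactly the matched prompt text and keep the rest untouched.
    return line[len(m.group(0)):] if m else line


pasted = ['>>> for i in range(2):', '...     print(i)', 'x = 1']
cleaned = [transform_classic_prompt(l) for l in pasted]
print(cleaned)
```

Only one space after the `...` prompt is consumed, so the indentation of continuation lines survives.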
794 801 _ipy_prompt_re = re.compile(r'^([ \t]*In \[\d+\]: |^[ \t]*\ \ \ \.\.\.+: )')
795 802
796 803 def transform_ipy_prompt(line):
797 804 """Handle inputs that start with the classic IPython prompt syntax."""
798 805
799 806 if not line or line.isspace():
800 807 return line
801 808 #print 'LINE: %r' % line # dbg
802 809 m = _ipy_prompt_re.match(line)
803 810 if m:
804 811 #print 'MATCH! %r -> %r' % (line, line[len(m.group(0)):]) # dbg
805 812 return line[len(m.group(0)):]
806 813 else:
807 814 return line
808 815
809 816
810 817 class EscapedTransformer(object):
811 818 """Class to transform lines that are explicitly escaped out."""
812 819
813 820 def __init__(self):
814 821 tr = { ESC_SHELL : self._tr_system,
815 822 ESC_SH_CAP : self._tr_system2,
816 823 ESC_HELP : self._tr_help,
817 824 ESC_HELP2 : self._tr_help,
818 825 ESC_MAGIC : self._tr_magic,
819 826 ESC_QUOTE : self._tr_quote,
820 827 ESC_QUOTE2 : self._tr_quote2,
821 828 ESC_PAREN : self._tr_paren }
822 829 self.tr = tr
823 830
824 831 # Support for syntax transformations that use explicit escapes typed by the
825 832 # user at the beginning of a line
826 833 @staticmethod
827 834 def _tr_system(line_info):
828 835 "Translate lines escaped with: !"
829 836 cmd = line_info.line.lstrip().lstrip(ESC_SHELL)
830 837 return '%sget_ipython().system(%s)' % (line_info.lspace,
831 838 make_quoted_expr(cmd))
832 839
833 840 @staticmethod
834 841 def _tr_system2(line_info):
835 842 "Translate lines escaped with: !!"
836 843 cmd = line_info.line.lstrip()[2:]
837 844 return '%sget_ipython().getoutput(%s)' % (line_info.lspace,
838 845 make_quoted_expr(cmd))
839 846
840 847 @staticmethod
841 848 def _tr_help(line_info):
842 849 "Translate lines escaped with: ?/??"
843 850 # A naked help line should just fire the intro help screen
844 851 if not line_info.line[1:]:
845 852 return 'get_ipython().show_usage()'
846 853
847 854 # There may be one or two '?' at the end, move them to the front so that
848 855 # the rest of the logic can assume escapes are at the start
849 856 l_ori = line_info
850 857 line = line_info.line
851 858 if line.endswith('?'):
852 859 line = line[-1] + line[:-1]
853 860 if line.endswith('?'):
854 861 line = line[-1] + line[:-1]
855 862 line_info = LineInfo(line)
856 863
857 864 # From here on, simply choose which level of detail to get, and
858 865 # special-case the psearch syntax
859 866 pinfo = 'pinfo' # default
860 867 if '*' in line_info.line:
861 868 pinfo = 'psearch'
862 869 elif line_info.esc == '??':
863 870 pinfo = 'pinfo2'
864 871
865 872 tpl = '%sget_ipython().magic("%s %s")'
866 873 return tpl % (line_info.lspace, pinfo,
867 874 ' '.join([line_info.fpart, line_info.rest]).strip())
868 875
869 876 @staticmethod
870 877 def _tr_magic(line_info):
871 878 "Translate lines escaped with: %"
872 879 tpl = '%sget_ipython().magic(%s)'
873 880 cmd = make_quoted_expr(' '.join([line_info.fpart,
874 881 line_info.rest]).strip())
875 882 return tpl % (line_info.lspace, cmd)
876 883
877 884 @staticmethod
878 885 def _tr_quote(line_info):
879 886 "Translate lines escaped with: ,"
880 887 return '%s%s("%s")' % (line_info.lspace, line_info.fpart,
881 888 '", "'.join(line_info.rest.split()) )
882 889
883 890 @staticmethod
884 891 def _tr_quote2(line_info):
885 892 "Translate lines escaped with: ;"
886 893 return '%s%s("%s")' % (line_info.lspace, line_info.fpart,
887 894 line_info.rest)
888 895
889 896 @staticmethod
890 897 def _tr_paren(line_info):
891 898 "Translate lines escaped with: /"
892 899 return '%s%s(%s)' % (line_info.lspace, line_info.fpart,
893 900 ", ".join(line_info.rest.split()))
894 901
895 902 def __call__(self, line):
896 903 """Transform a line that is explicitly escaped out.
897 904
898 905 This calls the above _tr_* static methods for the actual line
899 906 translations."""
900 907
901 908 # Empty lines just get returned unmodified
902 909 if not line or line.isspace():
903 910 return line
904 911
905 912 # Get line endpoints, where the escapes can be
906 913 line_info = LineInfo(line)
907 914
908 915 # If the escape is not at the start, only '?' needs to be special-cased.
909 916 # All other escapes are only valid at the start
910 917 if line_info.esc not in self.tr:
911 918 if line.endswith(ESC_HELP):
912 919 return self._tr_help(line_info)
913 920 else:
914 921 # If we don't recognize the escape, don't modify the line
915 922 return line
916 923
917 924 return self.tr[line_info.esc](line_info)
918 925
919 926
920 927 # A function-looking object to be used by the rest of the code. The purpose of
921 928 # the class in this case is to organize related functionality, more than to
922 929 # manage state.
923 930 transform_escaped = EscapedTransformer()
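The class above is essentially a dispatch table keyed on the leading escape character. A reduced Python 3 sketch of that idea, covering only the `,`, `;` and `/` escapes, is given below. All names are hypothetical, and `str.partition` is an assumed stand-in for the `LineInfo` parsing, which is more involved in the module.

```python
def tr_quote(indent, func, rest):
    # ',f a b' -> f("a", "b"): each whitespace-separated word is quoted.
    return '%s%s("%s")' % (indent, func, '", "'.join(rest.split()))


def tr_quote2(indent, func, rest):
    # ';f a b' -> f("a b"): the whole remainder is one quoted argument.
    return '%s%s("%s")' % (indent, func, rest)


def tr_paren(indent, func, rest):
    # '/f a b' -> f(a, b): words become unquoted arguments.
    return '%s%s(%s)' % (indent, func, ', '.join(rest.split()))


# Escape character -> handler, mirroring the self.tr dict in the listing.
HANDLERS = {',': tr_quote, ';': tr_quote2, '/': tr_paren}


def sketch_transform_escaped(line):
    """Dispatch on the first non-blank character; unknown lines pass through."""
    if not line or line.isspace():
        return line
    body = line.lstrip()
    indent = line[:len(line) - len(body)]
    handler = HANDLERS.get(body[0])
    if handler is None:
        return line
    # Assumption: function name and arguments are separated by one space.
    func, _, rest = body[1:].partition(' ')
    return handler(indent, func, rest.strip())


for raw in [',f a b', ';f a b', '/f a b', 'x=1']:
    print('%-10r -> %r' % (raw, sketch_transform_escaped(raw)))
```

Keeping the handlers as plain functions in a dict makes it cheap to add an escape: one function plus one table entry, with no change to the dispatch code.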
924 931
925 932
926 933 class IPythonInputSplitter(InputSplitter):
927 934 """An input splitter that recognizes all of IPython's special syntax."""
928 935
929 936 # String with raw, untransformed input.
930 937 source_raw = ''
931 938
932 939 # Private attributes
933 940
934 941 # List with lines of raw input accumulated so far.
935 942 _buffer_raw = None
936 943
937 944 def __init__(self, input_mode=None):
938 945 InputSplitter.__init__(self, input_mode)
939 946 self._buffer_raw = []
940 947
941 948 def reset(self):
942 949 """Reset the input buffer and associated state."""
943 950 InputSplitter.reset(self)
944 951 self._buffer_raw[:] = []
945 952 self.source_raw = ''
946 953
947 954 def source_raw_reset(self):
948 955 """Return input and raw source and perform a full reset.
949 956 """
950 957 out = self.source
951 958 out_r = self.source_raw
952 959 self.reset()
953 960 return out, out_r
954 961
955 962 def push(self, lines):
956 963 """Push one or more lines of IPython input.
957 964 """
958 965 if not lines:
959 966 return super(IPythonInputSplitter, self).push(lines)
960 967
961 968 # We must ensure all input is pure unicode
962 969 if type(lines)==str:
963 970 lines = lines.decode(self.encoding)
964 971
965 972 lines_list = lines.splitlines()
966 973
967 974 transforms = [transform_escaped, transform_assign_system,
968 975 transform_assign_magic, transform_ipy_prompt,
969 976 transform_classic_prompt]
970 977
971 978 # Transform logic
972 979 #
973 980 # We only apply the line transformers to the input if we have either no
974 981 # input yet, or complete input, or if the last line of the buffer ends
975 982 # with ':' (opening an indented block). This prevents the accidental
976 983 # transformation of escapes inside multiline expressions like
977 984 # triple-quoted strings or parenthesized expressions.
978 985 #
979 986 # The last heuristic, while ugly, ensures that the first line of an
980 987 # indented block is correctly transformed.
981 988 #
982 989 # FIXME: try to find a cleaner approach for this last bit.
983 990
984 991 # If we were in 'cell' mode, since we're going to pump the parent
985 992 # class by hand line by line, we need to temporarily switch out to
986 993 # 'line' mode, do a single manual reset and then feed the lines one
987 994 # by one. Note that this only matters if the input has more than one
988 995 # line.
989 996 changed_input_mode = False
990 997
991 998 if self.input_mode == 'cell':
992 999 self.reset()
993 1000 changed_input_mode = True
994 1001 saved_input_mode = 'cell'
995 1002 self.input_mode = 'line'
996 1003
997 1004 # Store raw source before applying any transformations to it. Note
998 1005 # that this must be done *after* the reset() call that would otherwise
999 1006 # flush the buffer.
1000 1007 self._store(lines, self._buffer_raw, 'source_raw')
1001 1008
1002 1009 try:
1003 1010 push = super(IPythonInputSplitter, self).push
1004 1011 for line in lines_list:
1005 1012 if self._is_complete or not self._buffer or \
1006 1013 (self._buffer and self._buffer[-1].rstrip().endswith(':')):
1007 1014 for f in transforms:
1008 1015 line = f(line)
1009 1016
1010 1017 out = push(line)
1011 1018 finally:
1012 1019 if changed_input_mode:
1013 1020 self.input_mode = saved_input_mode
1014 1021 return out
@@ -1,679 +1,688 b''
1 1 # -*- coding: utf-8 -*-
2 2 """Tests for the inputsplitter module.
3 3 """
4 4 #-----------------------------------------------------------------------------
5 5 # Copyright (C) 2010 The IPython Development Team
6 6 #
7 7 # Distributed under the terms of the BSD License. The full license is in
8 8 # the file COPYING, distributed as part of this software.
9 9 #-----------------------------------------------------------------------------
10 10
11 11 #-----------------------------------------------------------------------------
12 12 # Imports
13 13 #-----------------------------------------------------------------------------
14 14 # stdlib
15 15 import unittest
16 16 import sys
17 17
18 18 # Third party
19 19 import nose.tools as nt
20 20
21 21 # Our own
22 22 from IPython.core import inputsplitter as isp
23 23
24 24 #-----------------------------------------------------------------------------
25 25 # Semi-complete examples (also used as tests)
26 26 #-----------------------------------------------------------------------------
27 27
28 28 # Note: at the bottom, there's a slightly more complete version of this that
29 29 # can be useful during development of code here.
30 30
31 31 def mini_interactive_loop(input_func):
32 32 """Minimal example of the logic of an interactive interpreter loop.
33 33
34 34 This serves as an example, and it is used by the test system with a fake
35 35 raw_input that simulates interactive input."""
36 36
37 37 from IPython.core.inputsplitter import InputSplitter
38 38
39 39 isp = InputSplitter()
40 40 # In practice, this input loop would be wrapped in an outside loop to read
41 41 # input indefinitely, until some exit/quit command was issued. Here we
42 42 # only illustrate the basic inner loop.
43 43 while isp.push_accepts_more():
44 44 indent = ' '*isp.indent_spaces
45 45 prompt = '>>> ' + indent
46 46 line = indent + input_func(prompt)
47 47 isp.push(line)
48 48
49 49 # Here we just return input so we can use it in a test suite, but a real
50 50 # interpreter would instead send it for execution somewhere.
51 51 src = isp.source_reset()
52 52 #print 'Input source was:\n', src # dbg
53 53 return src
54 54
55 55 #-----------------------------------------------------------------------------
56 56 # Test utilities, just for local use
57 57 #-----------------------------------------------------------------------------
58 58
59 59 def assemble(block):
60 60 """Assemble a block into multi-line sub-blocks."""
61 61 return ['\n'.join(sub_block)+'\n' for sub_block in block]
62 62
63 63
64 64 def pseudo_input(lines):
65 65 """Return a function that acts like raw_input but feeds the input list."""
66 66 ilines = iter(lines)
67 67 def raw_in(prompt):
68 68 try:
69 69 return next(ilines)
70 70 except StopIteration:
71 71 return ''
72 72 return raw_in
73 73
74 74 #-----------------------------------------------------------------------------
75 75 # Tests
76 76 #-----------------------------------------------------------------------------
77 77 def test_spaces():
78 78 tests = [('', 0),
79 79 (' ', 1),
80 80 ('\n', 0),
81 81 (' \n', 1),
82 82 ('x', 0),
83 83 (' x', 1),
84 84 (' x',2),
85 85 (' x',4),
86 86 # Note: tabs are counted as a single whitespace!
87 87 ('\tx', 1),
88 88 ('\t x', 2),
89 89 ]
90 90
91 91 for s, nsp in tests:
92 92 nt.assert_equal(isp.num_ini_spaces(s), nsp)
93 93
94 94
95 95 def test_remove_comments():
96 96 tests = [('text', 'text'),
97 97 ('text # comment', 'text '),
98 98 ('text # comment\n', 'text \n'),
99 99 ('text # comment \n', 'text \n'),
100 100 ('line # c \nline\n','line \nline\n'),
101 101 ('line # c \nline#c2 \nline\nline #c\n\n',
102 102 'line \nline\nline\nline \n\n'),
103 103 ]
104 104
105 105 for inp, out in tests:
106 106 nt.assert_equal(isp.remove_comments(inp), out)
107 107
108 108
109 109 def test_get_input_encoding():
110 110 encoding = isp.get_input_encoding()
111 111 nt.assert_true(isinstance(encoding, basestring))
112 112 # simple-minded check that at least encoding a simple string works with the
113 113 # encoding we got.
114 114 nt.assert_equal('test'.encode(encoding), 'test')
115 115
116 116
117 117 class NoInputEncodingTestCase(unittest.TestCase):
118 118 def setUp(self):
119 119 self.old_stdin = sys.stdin
120 120 class X: pass
121 121 fake_stdin = X()
122 122 sys.stdin = fake_stdin
123 123
124 124 def test(self):
125 125 # Verify that if sys.stdin has no 'encoding' attribute we do the right
126 126 # thing
127 127 enc = isp.get_input_encoding()
128 128 self.assertEqual(enc, 'ascii')
129 129
130 130 def tearDown(self):
131 131 sys.stdin = self.old_stdin
132 132
133 133
134 134 class InputSplitterTestCase(unittest.TestCase):
135 135 def setUp(self):
136 136 self.isp = isp.InputSplitter()
137 137
138 138 def test_reset(self):
139 139 isp = self.isp
140 140 isp.push('x=1')
141 141 isp.reset()
142 142 self.assertEqual(isp._buffer, [])
143 143 self.assertEqual(isp.indent_spaces, 0)
144 144 self.assertEqual(isp.source, '')
145 145 self.assertEqual(isp.code, None)
146 146 self.assertEqual(isp._is_complete, False)
147 147
148 148 def test_source(self):
149 149 self.isp._store('1')
150 150 self.isp._store('2')
151 151 self.assertEqual(self.isp.source, '1\n2\n')
152 152 self.assertTrue(len(self.isp._buffer)>0)
153 153 self.assertEqual(self.isp.source_reset(), '1\n2\n')
154 154 self.assertEqual(self.isp._buffer, [])
155 155 self.assertEqual(self.isp.source, '')
156 156
157 157 def test_indent(self):
158 158 isp = self.isp # shorthand
159 159 isp.push('x=1')
160 160 self.assertEqual(isp.indent_spaces, 0)
161 161 isp.push('if 1:\n x=1')
162 162 self.assertEqual(isp.indent_spaces, 4)
163 163 isp.push('y=2\n')
164 164 self.assertEqual(isp.indent_spaces, 0)
165 165
166 166 def test_indent2(self):
167 167 # In cell mode, inputs must be fed in whole blocks, so skip this test
168 168 if self.isp.input_mode == 'cell': return
169 169
170 170 isp = self.isp
171 171 isp.push('if 1:')
172 172 self.assertEqual(isp.indent_spaces, 4)
173 173 isp.push(' x=1')
174 174 self.assertEqual(isp.indent_spaces, 4)
175 175 # Blank lines shouldn't change the indent level
176 176 isp.push(' '*2)
177 177 self.assertEqual(isp.indent_spaces, 4)
178 178
179 179 def test_indent3(self):
180 180 # In cell mode, inputs must be fed in whole blocks, so skip this test
181 181 if self.isp.input_mode == 'cell': return
182 182
183 183 isp = self.isp
184 184 # When a multiline statement contains parens or multiline strings, we
185 185 # shouldn't get confused.
186 186 isp.push("if 1:")
187 187 isp.push(" x = (1+\n 2)")
188 188 self.assertEqual(isp.indent_spaces, 4)
189 189
190 190 def test_dedent(self):
191 191 isp = self.isp # shorthand
192 192 isp.push('if 1:')
193 193 self.assertEqual(isp.indent_spaces, 4)
194 194 isp.push(' pass')
195 195 self.assertEqual(isp.indent_spaces, 0)
196 196
197 197 def test_push(self):
198 198 isp = self.isp
199 199 self.assertTrue(isp.push('x=1'))
200 200
201 201 def test_push2(self):
202 202 isp = self.isp
203 203 self.assertFalse(isp.push('if 1:'))
204 204 for line in [' x=1', '# a comment', ' y=2']:
205 205 self.assertTrue(isp.push(line))
206 206
207 207 def test_replace_mode(self):
208 208 isp = self.isp
209 209 isp.input_mode = 'cell'
210 210 isp.push('x=1')
211 211 self.assertEqual(isp.source, 'x=1\n')
212 212 isp.push('x=2')
213 213 self.assertEqual(isp.source, 'x=2\n')
214 214
215 215 def test_push_accepts_more(self):
216 216 isp = self.isp
217 217 isp.push('x=1')
218 218 self.assertFalse(isp.push_accepts_more())
219 219
220 220 def test_push_accepts_more2(self):
221 221 # In cell mode, inputs must be fed in whole blocks, so skip this test
222 222 if self.isp.input_mode == 'cell': return
223 223
224 224 isp = self.isp
225 225 isp.push('if 1:')
226 226 self.assertTrue(isp.push_accepts_more())
227 227 isp.push(' x=1')
228 228 self.assertTrue(isp.push_accepts_more())
229 229 isp.push('')
230 230 self.assertFalse(isp.push_accepts_more())
231 231
232 232 def test_push_accepts_more3(self):
233 233 isp = self.isp
234 234 isp.push("x = (2+\n3)")
235 235 self.assertFalse(isp.push_accepts_more())
236 236
237 237 def test_push_accepts_more4(self):
238 238 # In cell mode, inputs must be fed in whole blocks, so skip this test
239 239 if self.isp.input_mode == 'cell': return
240 240
241 241 isp = self.isp
242 242 # When a multiline statement contains parens or multiline strings, we
243 243 # shouldn't get confused.
244 244 # FIXME: we should be able to better handle de-dents in statements like
245 245 # multiline strings and multiline expressions (continued with \ or
246 246 # parens). Right now we aren't handling the indentation tracking quite
247 247 # correctly with this, though in practice it may not be too much of a
248 248 # problem. We'll need to see.
249 249 isp.push("if 1:")
250 250 isp.push(" x = (2+")
251 251 isp.push(" 3)")
252 252 self.assertTrue(isp.push_accepts_more())
253 253 isp.push(" y = 3")
254 254 self.assertTrue(isp.push_accepts_more())
255 255 isp.push('')
256 256 self.assertFalse(isp.push_accepts_more())
257 257
258 258 def test_continuation(self):
259 259 isp = self.isp
260 260 isp.push("import os, \\")
261 261 self.assertTrue(isp.push_accepts_more())
262 262 isp.push("sys")
263 263 self.assertFalse(isp.push_accepts_more())
264 264
265 265 def test_syntax_error(self):
266 266 isp = self.isp
267 267 # Syntax errors immediately produce a 'ready' block, so the invalid
268 268 # Python can be sent to the kernel for evaluation with possible ipython
269 269 # special-syntax conversion.
270 270 isp.push('run foo')
271 271 self.assertFalse(isp.push_accepts_more())
272 272
273 273 def check_split(self, block_lines, compile=True):
274 274 blocks = assemble(block_lines)
275 275 lines = ''.join(blocks)
276 276 oblock = self.isp.split_blocks(lines)
277 277 self.assertEqual(oblock, blocks)
278 278 if compile:
279 279 for block in blocks:
280 280 self.isp._compile(block)
281 281
282 282 def test_split(self):
283 283 # All blocks of input we want to test in a list. The format for each
284 285 # block is a list of lists, with each inner list consisting of all the
285 285 # lines (as single-lines) that should make up a sub-block.
286 286
287 287 # Note: do NOT put here sub-blocks that don't compile, as the
288 288 # check_split() routine makes a final verification pass to check that
289 289 # each sub_block, as returned by split_blocks(), does compile
290 290 # correctly.
291 291 all_blocks = [ [['x=1']],
292 292
293 293 [['x=1'],
294 294 ['y=2']],
295 295
296 296 [['x=1',
297 297 '# a comment'],
298 298 ['y=11']],
299 299
300 300 [['if 1:',
301 301 ' x=1'],
302 302 ['y=3']],
303 303
304 304 [['def f(x):',
305 305 ' return x'],
306 306 ['x=1']],
307 307
308 308 [['def f(x):',
309 309 ' x+=1',
310 310 ' ',
311 311 ' return x'],
312 312 ['x=1']],
313 313
314 314 [['def f(x):',
315 315 ' if x>0:',
316 316 ' y=1',
317 317 ' # a comment',
318 318 ' else:',
319 319 ' y=4',
320 320 ' ',
321 321 ' return y'],
322 322 ['x=1'],
323 323 ['if 1:',
324 324 ' y=11'] ],
325 325
326 326 [['for i in range(10):'
327 327 ' x=i**2']],
328 328
329 329 [['for i in range(10):'
330 330 ' x=i**2'],
331 331 ['z = 1']],
332
333 [['"asdf"']],
334
335 [['"asdf"'],
336 ['10'],
337 ],
338
339 [['"""foo',
340 'bar"""']],
332 341 ]
333 342 for block_lines in all_blocks:
334 343 self.check_split(block_lines)
335 344
336 345 def test_split_syntax_errors(self):
337 346 # Block splitting with invalid syntax
338 347 all_blocks = [ [['a syntax error']],
339 348
340 349 [['x=1',
341 350 'another syntax error']],
342 351
343 352 [['for i in range(10):'
344 353 ' yet another error']],
345 354
346 355 ]
347 356 for block_lines in all_blocks:
348 357 self.check_split(block_lines, compile=False)
349 358
350 359 def test_unicode(self):
351 360 self.isp.push(u"Pérez")
352 361 self.isp.push(u'\xc3\xa9')
353 362 self.isp.push("u'\xc3\xa9'")
354 363
355 364 class InteractiveLoopTestCase(unittest.TestCase):
356 365 """Tests for an interactive loop like a python shell.
357 366 """
358 367 def check_ns(self, lines, ns):
359 368 """Validate that the given input lines produce the resulting namespace.
360 369
361 370 Note: the input lines are given exactly as they would be typed in an
362 371 auto-indenting environment, as mini_interactive_loop above already does
363 372 auto-indenting and prepends spaces to the input.
364 373 """
365 374 src = mini_interactive_loop(pseudo_input(lines))
366 375 test_ns = {}
367 376 exec src in test_ns
368 377 # We can't check that the provided ns is identical to the test_ns,
369 378 # because Python fills test_ns with extra keys (copyright, etc). But
370 379 # we can check that the given dict is *contained* in test_ns
371 380 for k,v in ns.iteritems():
372 381 self.assertEqual(test_ns[k], v)
373 382
374 383 def test_simple(self):
375 384 self.check_ns(['x=1'], dict(x=1))
376 385
377 386 def test_simple2(self):
378 387 self.check_ns(['if 1:', 'x=2'], dict(x=2))
379 388
380 389 def test_xy(self):
381 390 self.check_ns(['x=1; y=2'], dict(x=1, y=2))
382 391
383 392 def test_abc(self):
384 393 self.check_ns(['if 1:','a=1','b=2','c=3'], dict(a=1, b=2, c=3))
385 394
386 395 def test_multi(self):
387 396 self.check_ns(['x =(1+','1+','2)'], dict(x=4))
388 397
389 398
390 399 def test_LineInfo():
391 400 """Simple test for LineInfo construction and str()"""
392 401 linfo = isp.LineInfo(' %cd /home')
393 402 nt.assert_equals(str(linfo), 'LineInfo [ |%|cd|/home]')
394 403
395 404
396 405 def test_split_user_input():
397 406 """Unicode test - split_user_input already has good doctests"""
398 407 line = u"Pérez Fernando"
399 408 parts = isp.split_user_input(line)
400 409 parts_expected = (u'', u'', u'', line)
401 410 nt.assert_equal(parts, parts_expected)
402 411
403 412
404 413 # Transformer tests
405 414 def transform_checker(tests, func):
406 415 """Utility to loop over test inputs"""
407 416 for inp, tr in tests:
408 417 nt.assert_equals(func(inp), tr)
409 418
410 419 # Data for all the syntax tests in the form of lists of pairs of
411 420 # raw/transformed input. We store it here as a global dict so that we can use
412 421 # it both within single-function tests and also to validate the behavior of the
413 422 # larger objects
414 423
415 424 syntax = \
416 425 dict(assign_system =
417 426 [('a =! ls', 'a = get_ipython().getoutput("ls")'),
418 427 ('b = !ls', 'b = get_ipython().getoutput("ls")'),
419 428 ('x=1', 'x=1'), # normal input is unmodified
420 429 (' ',' '), # blank lines are kept intact
421 430 ],
422 431
423 432 assign_magic =
424 433 [('a =% who', 'a = get_ipython().magic("who")'),
425 434 ('b = %who', 'b = get_ipython().magic("who")'),
426 435 ('x=1', 'x=1'), # normal input is unmodified
427 436 (' ',' '), # blank lines are kept intact
428 437 ],
429 438
430 439 classic_prompt =
431 440 [('>>> x=1', 'x=1'),
432 441 ('x=1', 'x=1'), # normal input is unmodified
433 442 (' ', ' '), # blank lines are kept intact
434 443 ('... ', ''), # continuation prompts
435 444 ],
436 445
437 446 ipy_prompt =
438 447 [('In [1]: x=1', 'x=1'),
439 448 ('x=1', 'x=1'), # normal input is unmodified
440 449 (' ',' '), # blank lines are kept intact
441 450 (' ....: ', ''), # continuation prompts
442 451 ],
443 452
444 453 # Tests for the escape transformer to leave normal code alone
445 454 escaped_noesc =
446 455 [ (' ', ' '),
447 456 ('x=1', 'x=1'),
448 457 ],
449 458
450 459 # System calls
451 460 escaped_shell =
452 461 [ ('!ls', 'get_ipython().system("ls")'),
453 462 # Double-escape shell, this means to capture the output of the
454 463 # subprocess and return it
455 464 ('!!ls', 'get_ipython().getoutput("ls")'),
456 465 ],
457 466
458 467 # Help/object info
459 468 escaped_help =
460 469 [ ('?', 'get_ipython().show_usage()'),
461 470 ('?x1', 'get_ipython().magic("pinfo x1")'),
462 471 ('??x2', 'get_ipython().magic("pinfo2 x2")'),
463 472 ('x3?', 'get_ipython().magic("pinfo x3")'),
464 473 ('x4??', 'get_ipython().magic("pinfo2 x4")'),
465 474 ('%hist?', 'get_ipython().magic("pinfo %hist")'),
466 475 ('f*?', 'get_ipython().magic("psearch f*")'),
467 476 ('ax.*aspe*?', 'get_ipython().magic("psearch ax.*aspe*")'),
468 477 ],
469 478
470 479 # Explicit magic calls
471 480 escaped_magic =
472 481 [ ('%cd', 'get_ipython().magic("cd")'),
473 482 ('%cd /home', 'get_ipython().magic("cd /home")'),
474 483 (' %magic', ' get_ipython().magic("magic")'),
475 484 ],
476 485
477 486 # Quoting with separate arguments
478 487 escaped_quote =
479 488 [ (',f', 'f("")'),
480 489 (',f x', 'f("x")'),
481 490 (' ,f y', ' f("y")'),
482 491 (',f a b', 'f("a", "b")'),
483 492 ],
484 493
485 494 # Quoting with single argument
486 495 escaped_quote2 =
487 496 [ (';f', 'f("")'),
488 497 (';f x', 'f("x")'),
489 498 (' ;f y', ' f("y")'),
490 499 (';f a b', 'f("a b")'),
491 500 ],
492 501
493 502 # Simply apply parens
494 503 escaped_paren =
495 504 [ ('/f', 'f()'),
496 505 ('/f x', 'f(x)'),
497 506 (' /f y', ' f(y)'),
498 507 ('/f a b', 'f(a, b)'),
499 508 ],
500 509
501 510 )
502 511
503 512 # multiline syntax examples. Each of these should be a list of lists, with
504 513 # each entry itself having pairs of raw/transformed input. The union (with
505 514 # '\n'.join()) of the transformed inputs is what the splitter should produce
506 515 # when fed the raw lines one at a time via push.
507 516 syntax_ml = \
508 517 dict(classic_prompt =
509 518 [ [('>>> for i in range(10):','for i in range(10):'),
510 519 ('... print i',' print i'),
511 520 ('... ', ''),
512 521 ],
513 522 ],
514 523
515 524 ipy_prompt =
516 525 [ [('In [24]: for i in range(10):','for i in range(10):'),
517 526 (' ....: print i',' print i'),
518 527 (' ....: ', ''),
519 528 ],
520 529 ],
521 530 )
522 531
523 532
524 533 def test_assign_system():
525 534 transform_checker(syntax['assign_system'], isp.transform_assign_system)
526 535
527 536
528 537 def test_assign_magic():
529 538 transform_checker(syntax['assign_magic'], isp.transform_assign_magic)
530 539
531 540
532 541 def test_classic_prompt():
533 542 transform_checker(syntax['classic_prompt'], isp.transform_classic_prompt)
534 543 for example in syntax_ml['classic_prompt']:
535 544 transform_checker(example, isp.transform_classic_prompt)
536 545
537 546
538 547 def test_ipy_prompt():
539 548 transform_checker(syntax['ipy_prompt'], isp.transform_ipy_prompt)
540 549 for example in syntax_ml['ipy_prompt']:
541 550 transform_checker(example, isp.transform_ipy_prompt)
542 551
543 552
544 553 def test_escaped_noesc():
545 554 transform_checker(syntax['escaped_noesc'], isp.transform_escaped)
546 555
547 556
548 557 def test_escaped_shell():
549 558 transform_checker(syntax['escaped_shell'], isp.transform_escaped)
550 559
551 560
552 561 def test_escaped_help():
553 562 transform_checker(syntax['escaped_help'], isp.transform_escaped)
554 563
555 564
556 565 def test_escaped_magic():
557 566 transform_checker(syntax['escaped_magic'], isp.transform_escaped)
558 567
559 568
560 569 def test_escaped_quote():
561 570 transform_checker(syntax['escaped_quote'], isp.transform_escaped)
562 571
563 572
564 573 def test_escaped_quote2():
565 574 transform_checker(syntax['escaped_quote2'], isp.transform_escaped)
566 575
567 576
568 577 def test_escaped_paren():
569 578 transform_checker(syntax['escaped_paren'], isp.transform_escaped)
570 579
571 580
572 581 class IPythonInputTestCase(InputSplitterTestCase):
573 582 """By just creating a new class whose .isp is a different instance, we
574 583 re-run the same test battery on the new input splitter.
575 584
576 585 In addition, this runs the tests over the syntax and syntax_ml dicts that
577 586 were tested by individual functions, as part of the OO interface.
578 587
579 588 It also makes some checks on the raw buffer storage.
580 589 """
581 590
582 591 def setUp(self):
583 592 self.isp = isp.IPythonInputSplitter(input_mode='line')
584 593
585 594 def test_syntax(self):
586 595 """Call all single-line syntax tests from the main object"""
587 596 isp = self.isp
588 597 for example in syntax.itervalues():
589 598 for raw, out_t in example:
590 599 if raw.startswith(' '):
591 600 continue
592 601
593 602 isp.push(raw)
594 603 out, out_raw = isp.source_raw_reset()
595 604 self.assertEqual(out.rstrip(), out_t)
596 605 self.assertEqual(out_raw.rstrip(), raw.rstrip())
597 606
598 607 def test_syntax_multiline(self):
599 608 isp = self.isp
600 609 for example in syntax_ml.itervalues():
601 610 out_t_parts = []
602 611 raw_parts = []
603 612 for line_pairs in example:
604 613 for lraw, out_t_part in line_pairs:
605 614 isp.push(lraw)
606 615 out_t_parts.append(out_t_part)
607 616 raw_parts.append(lraw)
608 617
609 618 out, out_raw = isp.source_raw_reset()
610 619 out_t = '\n'.join(out_t_parts).rstrip()
611 620 raw = '\n'.join(raw_parts).rstrip()
612 621 self.assertEqual(out.rstrip(), out_t)
613 622 self.assertEqual(out_raw.rstrip(), raw)
614 623
615 624
616 625 class BlockIPythonInputTestCase(IPythonInputTestCase):
617 626
618 627 # Deactivate tests that don't make sense for the block mode
619 628 test_push3 = test_split = lambda s: None
620 629
621 630 def setUp(self):
622 631 self.isp = isp.IPythonInputSplitter(input_mode='cell')
623 632
624 633 def test_syntax_multiline(self):
625 634 isp = self.isp
626 635 for example in syntax_ml.itervalues():
627 636 raw_parts = []
628 637 out_t_parts = []
629 638 for line_pairs in example:
630 639 for raw, out_t_part in line_pairs:
631 640 raw_parts.append(raw)
632 641 out_t_parts.append(out_t_part)
633 642
634 643 raw = '\n'.join(raw_parts)
635 644 out_t = '\n'.join(out_t_parts)
636 645
637 646 isp.push(raw)
638 647 out, out_raw = isp.source_raw_reset()
639 648 # Match ignoring trailing whitespace
640 649 self.assertEqual(out.rstrip(), out_t.rstrip())
641 650 self.assertEqual(out_raw.rstrip(), raw.rstrip())
642 651
643 652
644 653 #-----------------------------------------------------------------------------
645 654 # Main - use as a script, mostly for developer experiments
646 655 #-----------------------------------------------------------------------------
647 656
648 657 if __name__ == '__main__':
649 658 # A simple demo for interactive experimentation. This code will not get
650 659 # picked up by any test suite.
651 660 from IPython.core.inputsplitter import InputSplitter, IPythonInputSplitter
652 661
653 662 # configure here the syntax to use, prompt and whether to autoindent
654 663 #isp, start_prompt = InputSplitter(), '>>> '
655 664 isp, start_prompt = IPythonInputSplitter(), 'In> '
656 665
657 666 autoindent = True
658 667 #autoindent = False
659 668
660 669 try:
661 670 while True:
662 671 prompt = start_prompt
663 672 while isp.push_accepts_more():
664 673 indent = ' '*isp.indent_spaces
665 674 if autoindent:
666 675 line = indent + raw_input(prompt+indent)
667 676 else:
668 677 line = raw_input(prompt)
669 678 isp.push(line)
670 679 prompt = '... '
671 680
672 681 # Here we just return input so we can use it in a test suite, but a
673 682 # real interpreter would instead send it for execution somewhere.
674 683 #src = isp.source; raise EOFError # dbg
675 684 src, raw = isp.source_raw_reset()
676 685 print 'Input source was:\n', src
677 686 print 'Raw source was:\n', raw
678 687 except EOFError:
679 688 print 'Bye'
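The loops in this test file keep pushing lines while `push_accepts_more()` is true, then hand the accumulated source off for execution. The same accumulate-until-complete idea can be tried without IPython using the standard library's `codeop.compile_command`, which returns `None` for incomplete input. The Python 3 sketch below is an analogue only, not the `InputSplitter` implementation: `sketch_interactive_loop` is a hypothetical name, and it does no auto-indenting, so indented lines must be supplied as such.

```python
import codeop


def sketch_interactive_loop(lines):
    """Feed lines one at a time until they form a complete statement."""
    buffer = []
    # The extra '' stands for the blank line that closes an indented block.
    for line in list(lines) + ['']:
        buffer.append(line)
        source = '\n'.join(buffer)
        try:
            code = codeop.compile_command(source, '<input>', 'single')
        except (SyntaxError, ValueError, OverflowError):
            break  # invalid input: hand it back as-is for the caller to report
        if code is not None:
            break  # a complete statement has been collected
    return '\n'.join(buffer) + '\n'


# Usage: collect a multi-line expression and execute it in a fresh namespace.
ns = {}
exec(sketch_interactive_loop(['x =(1+', '1+', '2)']), ns)
print(ns['x'])
```

As in `check_ns` above, the result is checked by executing the returned source in a scratch namespace rather than by comparing source text, which keeps the check independent of exactly when the loop decided the input was complete.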