faq.txt
115 lines
| 4.7 KiB
| text/plain
|
TextLexer
Brian E Granger
|
r1256 | .. _faq: | ||
================ | ||||
FAQ for IPython | ||||
================ | ||||
General questions | ||||
================= | ||||
What is the difference between IPython and IPython? | ||||
---------------------------------------------------- | ||||
IPython is the next generation of IPython. It is being created with three main goals in | ||||
mind: | ||||
1. Clean up the existing codebase and write lots of tests. | ||||
2. Separate the core functionality of IPython from the terminal to enable IPython | ||||
to be used from within a variety of GUI applications. | ||||
3. Implement a system for interactive parallel computing. | ||||
Currently, IPython is not a full replacement for IPython and until that happens, | ||||
IPython will be developed as a separate project. IPython currently provides a stable | ||||
and powerful architecture for parallel computing that can be used with IPython or even | ||||
the default Python shell. For more information, see our `introduction to parallel | ||||
computing with IPython`__. | ||||
.. __: ./parallel_intro | ||||
What is the history of IPython? | ||||
-------------------------------- | ||||
Questions about parallel computing with IPython | ||||
================================================ | ||||
Will IPython speed my Python code up? | ||||
-------------------------------------- | ||||
Yes and no. When converting a serial code to run in parallel, there often many | ||||
difficulty questions that need to be answered, such as: | ||||
* How should data be decomposed onto the set of processors? | ||||
* What are the data movement patterns? | ||||
* Can the algorithm be structured to minimize data movement? | ||||
* Is dynamic load balancing important? | ||||
We can't answer such questions for you. This is the hard (but fun) work of parallel | ||||
computing. But, once you understand these things IPython will make it easier for you to | ||||
implement a good solution quickly. Most importantly, you will be able to use the | ||||
resulting parallel code interactively. | ||||
With that said, if your problem is trivial to parallelize, IPython has a number of | ||||
different interfaces that will enable you to parallelize things is almost no time at | ||||
all. A good place to start is the ``map`` method of our `multiengine interface`_. | ||||
.. _multiengine interface: ./parallel_multiengine | ||||
What is the best way to use MPI from Python? | ||||
-------------------------------------------- | ||||
What about all the other parallel computing packages in Python? | ||||
--------------------------------------------------------------- | ||||
Some of the unique characteristic of IPython are: | ||||
* IPython is the only architecture that abstracts out the notion of a | ||||
parallel computation in such a way that new models of parallel computing | ||||
can be explored quickly and easily. If you don't like the models we | ||||
provide, you can simply create your own using the capabilities we provide. | ||||
* IPython is asynchronous from the ground up (we use `Twisted`_). | ||||
* IPython's architecture is designed to avoid subtle problems | ||||
that emerge because of Python's global interpreter lock (GIL). | ||||
* While IPython'1 architecture is designed to support a wide range | ||||
of novel parallel computing models, it is fully interoperable with | ||||
traditional MPI applications. | ||||
* IPython has been used and tested extensively on modern supercomputers. | ||||
* IPython's networking layers are completely modular. Thus, is | ||||
straightforward to replace our existing network protocols with | ||||
high performance alternatives (ones based upon Myranet/Infiniband). | ||||
* IPython is designed from the ground up to support collaborative | ||||
parallel computing. This enables multiple users to actively develop | ||||
and run the *same* parallel computation. | ||||
* Interactivity is a central goal for us. While IPython does not have | ||||
to be used interactivly, is can be. | ||||
.. _Twisted: http://www.twistedmatrix.com | ||||
Why The IPython controller a bottleneck in my parallel calculation? | ||||
------------------------------------------------------------------- | ||||
A golden rule in parallel computing is that you should only move data around if you | ||||
absolutely need to. The main reason that the controller becomes a bottleneck is that | ||||
too much data is being pushed and pulled to and from the engines. If your algorithm | ||||
is structured in this way, you really should think about alternative ways of | ||||
handling the data movement. Here are some ideas: | ||||
1. Have the engines write data to files on the locals disks of the engines. | ||||
2. Have the engines write data to files on a file system that is shared by | ||||
the engines. | ||||
3. Have the engines write data to a database that is shared by the engines. | ||||
4. Simply keep data in the persistent memory of the engines and move the | ||||
computation to the data (rather than the data to the computation). | ||||
5. See if you can pass data directly between engines using MPI. | ||||
Isn't Python slow to be used for high-performance parallel computing? | ||||
--------------------------------------------------------------------- | ||||