faq.txt
105 lines
| 3.8 KiB
| text/plain
|
TextLexer
Brian E Granger
|
r1256 | .. _faq: | ||
Brian E Granger
|
r1258 | ======================================== | ||
Frequently asked questions | ||||
======================================== | ||||
Brian E Granger
|
r1256 | |||
General questions | ||||
================= | ||||
Questions about parallel computing with IPython | ||||
================================================ | ||||
Will IPython speed my Python code up? | ||||
-------------------------------------- | ||||
Yes and no. When converting a serial code to run in parallel, there often many | ||||
difficulty questions that need to be answered, such as: | ||||
Brian Granger
|
r1677 | * How should data be decomposed onto the set of processors? | ||
* What are the data movement patterns? | ||||
* Can the algorithm be structured to minimize data movement? | ||||
* Is dynamic load balancing important? | ||||
Brian E Granger
|
r1256 | |||
We can't answer such questions for you. This is the hard (but fun) work of parallel | ||||
computing. But, once you understand these things IPython will make it easier for you to | ||||
implement a good solution quickly. Most importantly, you will be able to use the | ||||
resulting parallel code interactively. | ||||
With that said, if your problem is trivial to parallelize, IPython has a number of | ||||
different interfaces that will enable you to parallelize things is almost no time at | ||||
Brian Granger
|
r1677 | all. A good place to start is the ``map`` method of our :class:`MultiEngineClient`. | ||
Brian E Granger
|
r1256 | |||
What is the best way to use MPI from Python? | ||||
-------------------------------------------- | ||||
What about all the other parallel computing packages in Python? | ||||
--------------------------------------------------------------- | ||||
Some of the unique characteristic of IPython are: | ||||
Brian Granger
|
r1677 | * IPython is the only architecture that abstracts out the notion of a | ||
parallel computation in such a way that new models of parallel computing | ||||
can be explored quickly and easily. If you don't like the models we | ||||
provide, you can simply create your own using the capabilities we provide. | ||||
* IPython is asynchronous from the ground up (we use `Twisted`_). | ||||
* IPython's architecture is designed to avoid subtle problems | ||||
that emerge because of Python's global interpreter lock (GIL). | ||||
* While IPython's architecture is designed to support a wide range | ||||
of novel parallel computing models, it is fully interoperable with | ||||
traditional MPI applications. | ||||
* IPython has been used and tested extensively on modern supercomputers. | ||||
* IPython's networking layers are completely modular. Thus, is | ||||
straightforward to replace our existing network protocols with | ||||
high performance alternatives (ones based upon Myranet/Infiniband). | ||||
* IPython is designed from the ground up to support collaborative | ||||
parallel computing. This enables multiple users to actively develop | ||||
and run the *same* parallel computation. | ||||
* Interactivity is a central goal for us. While IPython does not have | ||||
to be used interactivly, it can be. | ||||
Brian E Granger
|
r1256 | .. _Twisted: http://www.twistedmatrix.com | ||
Why The IPython controller a bottleneck in my parallel calculation? | ||||
------------------------------------------------------------------- | ||||
A golden rule in parallel computing is that you should only move data around if you | ||||
absolutely need to. The main reason that the controller becomes a bottleneck is that | ||||
too much data is being pushed and pulled to and from the engines. If your algorithm | ||||
is structured in this way, you really should think about alternative ways of | ||||
handling the data movement. Here are some ideas: | ||||
Brian Granger
|
r1677 | 1. Have the engines write data to files on the locals disks of the engines. | ||
2. Have the engines write data to files on a file system that is shared by | ||||
the engines. | ||||
3. Have the engines write data to a database that is shared by the engines. | ||||
4. Simply keep data in the persistent memory of the engines and move the | ||||
computation to the data (rather than the data to the computation). | ||||
5. See if you can pass data directly between engines using MPI. | ||||
Brian E Granger
|
r1256 | |||
Isn't Python slow to be used for high-performance parallel computing? | ||||
--------------------------------------------------------------------- | ||||