Numerical Computing Environments
Does getting rid of explicit loops in your really buy you a whole lot? Ie, is writing
z = x*y
instead of
float z[N]; for (i=0; i<N; i++) z[i] = x[i]*y[i];
such a big deal? To my mind, yes, it's a very big deal, and the win is not only in terms of how long it takes to write the thing in the first place, but also in terms of read/understandability later on. If you agree, read on.
The usual suspects are:
- IDL -- Works, but the language feels cobwebby and it's expensive
- Matlab -- Nice program, but expensive (for extra packages or non-student versions)
- Mathematica -- Great symbolic manipulation, but expensive (for non-student versions)
Some programs/environments I've tried are:
Sci Py with iPython -- The one I currently use the most. It's a nice, modern language, but can be dog-slow. This is ameliorated by the fact that you can call C code without an enormous amount of pain, so you can (hopefully) use C for the heavy-duty number crunching and Python for the fussy "management" code that isn't performance-critical. Plotting isn't so slick right now but check out matplotlib, pyx. This environment has the advantage that StSci seems to be throwing some weight behind it so it can be expected to be around for a while.
Octave -- Nice matlab clone, great for playing around with numerical algorithms. Speed eventually becomes a problem, though, and at this point you can't have 3-d matricies, so I don't use it so much anymore. However, this is the program that launched my search for the Nirvana of numerical computing environments. Suddenly writing code was fun again.
Maxima -- A computer algebra system that predates Mathematica by about three decades. Not as nice, pretty, or slick as Mathematica and it's written in (gasp!) Lisp, but it gets the job done for many problems. And it's free. If you use it, this site is also useful.
Lush -- If you're into Lisp, check this out. For many plain-vanilla matrix-based applications, the prefix notation of Lisp gets in the way, but if you're doing something more complicated that also needs to be fast in places, this may suit your needs. The two main authors are big shots in the machine learning community (they achieved the best automated handwritten digit recognition in a study commissioned by the US Postal Service in the early 90's), so they're not to be trifled with.
Scilab -- a Matlab clone that seems to serve a large community. Haven't tried it, but it looks promising.
PDL -- An attempt to make Perl into something like IDL. Interesting proof-of-concept, but seems to lack a large user base and isn't really "there" yet, when I tried to use it (admittedly many years ago).
Blitz++ -- a set of C++ array classes using Expression Template mechanisms to avoid (most) runtime penalties. This is what I use for the Monte-Carlo code, not as high-level as dedicated linear algebra systems but avoids the use of loops and explicit allocations. Drawback would be LONG compliation times...
There are lots of others, in particular the ones listed here. If you try any of them, please list your experiences here! It's easy to find great lists of software, but there's almost no editorial content guiding you to which ones might be right for you...
Also, it's worth knowing about just a bit of the incredible amount of numerical software that's been written over the years. These two links will get you on the right track: http://www.netlib.org/ http://gams.nist.gov/
Finally, this is a nice FAQ about various matematical software/environements/methods out there.
Visualization
Visualization is a perennial problem, but the three best solutions I've seen so far are http://public.kitware.com/VTK/ http://mayavi.sourceforge.net/ http://casa.colorado.edu/~gnedin/IFRIT/ . The second two are based on the first, perhaps not coincidentally.
Generating documentation with [http://www.stack.nl/~dimitri/doxygen/index.html Doxygen]
If you are writing in C++ or a few other languages for which it works, Doxygen can parse your code and use inline comments to generate a complete code documentation, including stuff like inheritance graphs. It's also useful for browsing code you haven't written since it makes it easier to see how stuff is organized. (I generated a web page with my Monte Carlo code docs with it but I won't link to it here because I don't want my code to show up in search engines yet...
If you are interested in giving it a try, I have a copy for Sun in /home/public/patrik/bin.
Performance with threads
If you are running a code that has several threads executing in parallel on a multiprocessor machine, and is noticing a performance drop with increasing number of threads even though there is no communication between the threads (most of you have probably stopped reading by now...), then check this out. Even if the threads are not using mutexes to serialize access to shared structures, they can still affect each other. If the threads are using data that lie on the same cache line, writing to these will invalidate the cache lines on all the processors. If the threads are dynamically allocating memory, quite likely these memory blocks will be interleaved among each other, so if the blocks allocated are smaller than the cache line size this will happen. The thing is, it's very hard to know because you have no idea where the blocks are... just something to think about. (Thanks to Greg for putting me on the right track figuring this very convoluted thing out, it led to a *3x* speed up for my Monte Carlo code!
Intel Floating Point Exceptions
In my experience, it's easier to get a numerical code to divide by zero than you might think, and it's difficult to track down what's going on when the error happens only after a long time. Intel has made the default behavior in response to floating point exceptions to be to try to patch up the situation and pretend nothing's wrong. This is good for production code, but when debugging I'd like a core dump. You can tell the processor to call a software exception handler, but only via assembly langauge. Some compilers have switches to determine this, but notably gcc and Intel's C compiler don't. Here's some C code with inlined assembly language to turn individual exceptions on and off. Under *nix, this will generate a signal which you can catch or just ignore (to generate a core dump). Here's the code: Answers/Code/IntelFloatingPointExceptionsDotC and Answers/Code/IntelFloatingPointExceptionsDotH
Another way to do it is to put this at the top of your code:
#define _GNU_SOURCE 1 #include <fenv.h> static void attribute ((constructor)) trapfpe(void) {
- /* Enable some exceptions. At startup all exceptions are masked. */
- feenableexcept(FE_INVALID|FE_DIVBYZERO|FE_OVERFLOW);
} Finally, you can look for information about fpu_control.h, which should let you do the same thing.
You may also want to look at the Intel IA-32 documentation if you want to do something more compilicated than what these files allow: Vol 1 Vol 2A Vol 2B Vol 3 Optimization Guide (They're not as scary as they look, they're just references).
Word Expansion
What if you want to do all the fancy stuff that the shell does, like ~ expansion, environment variable expansion, cmd expansion, etc., in your program? The GNU C library comes to the rescue: wordexp() is the function to look for. Glibc also has many other useful functions.
