Numerical Computing Environments

Does getting rid of explicit loops in your really buy you a whole lot? Ie, is writing

z = x*y

instead of

float z[N];
for (i=0; i<N; i++) 
  z[i] = x[i]*y[i];

such a big deal? To my mind, yes, it's a very big deal, and the win is not only in terms of how long it takes to write the thing in the first place, but also in terms of read/understandability later on. If you agree, read on.

The usual suspects are:

Some programs/environments I've tried are:

There are lots of others, in particular the ones listed here. If you try any of them, please list your experiences here! It's easy to find great lists of software, but there's almost no editorial content guiding you to which ones might be right for you...

Also, it's worth knowing about just a bit of the incredible amount of numerical software that's been written over the years. These two links will get you on the right track: http://www.netlib.org/ http://gams.nist.gov/

Finally, this is a nice FAQ about various matematical software/environements/methods out there.

Visualization

Visualization is a perennial problem, but the three best solutions I've seen so far are http://public.kitware.com/VTK/ http://mayavi.sourceforge.net/ http://casa.colorado.edu/~gnedin/IFRIT/ . The second two are based on the first, perhaps not coincidentally.

Generating documentation with [http://www.stack.nl/~dimitri/doxygen/index.html Doxygen]

If you are writing in C++ or a few other languages for which it works, Doxygen can parse your code and use inline comments to generate a complete code documentation, including stuff like inheritance graphs. It's also useful for browsing code you haven't written since it makes it easier to see how stuff is organized. (I generated a web page with my Monte Carlo code docs with it but I won't link to it here because I don't want my code to show up in search engines yet... ;-) If you are interested in giving it a try, I have a copy for Sun in /home/public/patrik/bin.

Performance with threads

If you are running a code that has several threads executing in parallel on a multiprocessor machine, and is noticing a performance drop with increasing number of threads even though there is no communication between the threads (most of you have probably stopped reading by now...), then check this out. Even if the threads are not using mutexes to serialize access to shared structures, they can still affect each other. If the threads are using data that lie on the same cache line, writing to these will invalidate the cache lines on all the processors. If the threads are dynamically allocating memory, quite likely these memory blocks will be interleaved among each other, so if the blocks allocated are smaller than the cache line size this will happen. The thing is, it's very hard to know because you have no idea where the blocks are... just something to think about. (Thanks to Greg for putting me on the right track figuring this very convoluted thing out, it led to a *3x* speed up for my Monte Carlo code! :-)

Intel Floating Point Exceptions

In my experience, it's easier to get a numerical code to divide by zero than you might think, and it's difficult to track down what's going on when the error happens only after a long time. Intel has made the default behavior in response to floating point exceptions to be to try to patch up the situation and pretend nothing's wrong. This is good for production code, but when debugging I'd like a core dump. You can tell the processor to call a software exception handler, but only via assembly langauge. Some compilers have switches to determine this, but notably gcc and Intel's C compiler don't. Here's some C code with inlined assembly language to turn individual exceptions on and off. Under *nix, this will generate a signal which you can catch or just ignore (to generate a core dump). Here's the code: Answers/Code/IntelFloatingPointExceptionsDotC and Answers/Code/IntelFloatingPointExceptionsDotH

Another way to do it is to put this at the top of your code:

#define _GNU_SOURCE 1 #include <fenv.h> static void attribute ((constructor)) trapfpe(void) {

} Finally, you can look for information about fpu_control.h, which should let you do the same thing.

You may also want to look at the Intel IA-32 documentation if you want to do something more compilicated than what these files allow: Vol 1 Vol 2A Vol 2B Vol 3 Optimization Guide (They're not as scary as they look, they're just references).

Word Expansion

What if you want to do all the fancy stuff that the shell does, like ~ expansion, environment variable expansion, cmd expansion, etc., in your program? The GNU C library comes to the rescue: wordexp() is the function to look for. Glibc also has many other useful functions.

Answers/Code (last edited 2007-09-12 23:35:13 by GregNovak)