Monthly Archives: January 2012

Parallel Computing with R

One of the reasons that R can be quite slow is that by default it uses only one core, regardless of how many your machine actually runs. There are a number of ways to get better computing time using R and with almost no code overhead increase performance by at a factor of at least the number cores locally available. Most of the packages are designed for running network clusters, but they work equally, albeit likely not as quickly, well with just one machine. Luckily many of them have very nice high-level wrappers that essentially hide all of the low-level maintenance. In addition, the examples to follow provide a good introduction to parallel computing in the case you decided to take it to the next level, linking multiple machines together, etc.

I will give a brief survey on the workings of a few of these packages in view of just one machine (extending this to a ‘real’ cluster basically only requires making sure that all packages and dependencies are installed in all machines and passwordless ssh login is enabled).

Continue reading