This Wednesday, Feb. 29th, SIAM (CU Society of Applied and Industrial Mathematics) and ADI (CU Application Development Initiative) will host an event geared towards getting the word out that math skills open doors to cool tech jobs. There will be mingling, presentations from Columbia alumni (including yours truly) and some more mingling. Spread the word!

Parallel Computing with R

One of the reasons that R can be quite slow is that by default it uses only one core, regardless of how many your machine actually has. There are a number of ways to get better computing time in R and, with almost no code overhead, increase performance by a factor of up to roughly the number of cores locally available. Most of the packages are designed for running on network clusters, but they work just as well, albeit likely not as quickly, with just one machine. Luckily, many of them have very nice high-level wrappers that essentially hide all of the low-level maintenance. In addition, the examples to follow provide a good introduction to parallel computing in case you decide to take it to the next level, linking multiple machines together, etc.

I will give a brief survey of how a few of these packages work on just one machine (extending this to a ‘real’ cluster basically only requires making sure that all packages and dependencies are installed on all machines and that passwordless ssh login is enabled).
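To make the single-machine case concrete, here is a minimal sketch. It assumes the base `parallel` package, which may not be one of the packages surveyed in the post, and `slow_square` is a hypothetical stand-in for real work.

```r
# Minimal sketch, assuming the base 'parallel' package.
library(parallel)

# Hypothetical CPU-bound task, used only for illustration.
slow_square <- function(x) {
  Sys.sleep(0.1)  # stand-in for an expensive computation
  x^2
}

# detectCores() may return NA on some platforms, so fall back to 2 workers.
n_cores <- detectCores()
if (is.na(n_cores)) n_cores <- 2L

# Start one local worker process per core.
cl <- makeCluster(n_cores)

# parLapply() is a drop-in parallel counterpart of lapply():
# the inputs are split across the workers and the results come back as a list.
results <- parLapply(cl, 1:8, slow_square)

# Always shut the workers down when finished.
stopCluster(cl)

# The parallel result matches the serial one.
serial <- lapply(1:8, slow_square)
stopifnot(identical(results, serial))
```

The only change from the serial version is creating the cluster and swapping `lapply` for `parLapply`, which is what "almost no code overhead" looks like in practice. On a real cluster, `makeCluster` would be given host names instead of a core count.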

Isn’t there a recession?…

According to the news there seems to be a recession going on: economic instability, high unemployment and all that. Somehow that is very hard to see from within the tech/startup community. Almost every strong developer and data scientist I know is happily employed, and competent undergrads are entertaining offers 6 months before graduation. Startups are raising capital and data science is seeing more and more funding – for a big example, Mu Sigma, a data analytics firm, just raised $108 million, see here. The data scientists I have tried to recruit for Sailthru all have great jobs and multiple others waiting for them if the need arises. As a matter of fact, the company has been desperately trying to find an experienced sysadmin, so if you are reading this and know someone out there, do let me know. Apparently, the recession has decided to skip the tech industry…

Statistics and Algebra. An Example.

Written by: Matt

There is a developing field called algebraic statistics which explores probability and statistics problems involving discrete random variables using methods from commutative algebra and algebraic geometry. The basic point is that the parameters of such statistical models are often constrained by polynomial relationships – and these are exactly the subject of commutative algebra and algebraic geometry. I would like to learn more about this relationship, so in this post I’ll describe one example that I worked through – it comes from a book on the subject written by Bernd Sturmfels. Disclaimer: the rest of this post is technical.
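As a small illustration of what such a polynomial constraint looks like (a standard textbook case, not necessarily the example worked through in this post): take two binary random variables X and Y with joint probabilities p_ij = P(X = i, Y = j). They are independent exactly when the 2×2 table of probabilities has rank one, which is a single polynomial equation in the p_ij.

```latex
% Independence of two binary random variables as a polynomial constraint.
% Notation p_{ij} = P(X = i, Y = j) is introduced here for illustration.
\[
  \det\begin{pmatrix} p_{00} & p_{01} \\ p_{10} & p_{11} \end{pmatrix}
  = p_{00}\,p_{11} - p_{01}\,p_{10} = 0,
  \qquad
  p_{00} + p_{01} + p_{10} + p_{11} = 1,
  \quad p_{ij} \ge 0 .
\]
```

The set of probability tables satisfying this equation is an algebraic variety cut out inside the probability simplex, which is the kind of object commutative algebra and algebraic geometry study.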

