Monthly Archives: December 2011

Isn’t there a recession?….

According to the news there is a recession going on: economic instability, high unemployment and all that. Somehow that is very hard to see from within the tech/startup community. Almost every strong developer and data scientist I know is happily employed, and competent undergrads are entertaining offers 6 months before graduation. Startups are raising capital and data science is seeing more and more funding – for a big example, Mu Sigma, a data analytics firm, just raised $108 million, see here. The data scientists I have tried to recruit for Sailthru all have great jobs and multiple others waiting for them if the need arises. As a matter of fact, the company has been desperately seeking an experienced sysadmin, so if you are reading this and know someone out there, do let me know. Apparently, the recession has decided to skip the tech industry…


Statistics and Algebra. An Example.

Written by : Matt

There is a developing field called algebraic statistics, which explores probability and statistics problems involving discrete random variables using methods from commutative algebra and algebraic geometry. The basic point is that the parameters of such statistical models are often constrained by polynomial relationships – and these are exactly the subject of commutative algebra and algebraic geometry. I would like to learn more about this relationship, so in this post I’ll describe one example that I worked through – it comes from a book on the subject written by Bernd Sturmfels. Disclaimer: the rest of this post is technical.


Gambling and Shannon’s Entropy Function Part 2.

In the last post I gave an introduction to Kelly’s paper, where he describes optimal gambling strategies based on information received over a potentially noisy channel. Here I’ll talk about the general case, where the channel has several input symbols, each with a given probability of being transmitted, and which represent the outcomes of some chance event. First, we need to set up some notation:

p(s) –  probability that the transmitted symbol is s.

p(r | s) – the conditional probability that the received symbol is r given that the transmitted symbol is s.

p(r, s) –  the joint probability that the received symbol is r and the transmitted symbol is s.

q(r) –  probability that the received symbol is r.

q(s | r) – the conditional probability that the transmitted symbol is s given that the received symbol is r.

\alpha_s – the odds paid on the occurrence of s, i.e. the number of dollars returned on a one-dollar bet on s.

a(s/r) – the fraction of capital that the gambler decides to bet on the occurrence of s after receiving r.
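The probabilities in this notation are tied together by the usual rules: p(r, s) = p(s) p(r | s), q(r) = \sum_s p(r, s), and q(s | r) = p(r, s) / q(r). As a minimal sketch of how they fit together, the snippet below builds them for a two-symbol channel. The symbol names and all of the numbers are made up for illustration and do not come from Kelly's paper.

```python
# Hypothetical two-symbol channel; every number here is an illustrative assumption.
p = {"A": 0.6, "B": 0.4}            # p(s): probability that s is transmitted
p_r_given_s = {                     # p(r | s), indexed as p_r_given_s[s][r]
    "A": {"A": 0.9, "B": 0.1},
    "B": {"A": 0.2, "B": 0.8},
}

# Joint probability p(r, s) = p(s) * p(r | s), keyed by (r, s)
joint = {(r, s): p[s] * p_r_given_s[s][r] for s in p for r in p_r_given_s[s]}

# Marginal q(r) = sum over s of p(r, s)
q = {}
for (r, s), value in joint.items():
    q[r] = q.get(r, 0.0) + value

# Posterior q(s | r) = p(r, s) / q(r), keyed by (s, r)
q_s_given_r = {(s, r): joint[(r, s)] / q[r] for (r, s) in joint}
```

The posterior q(s | r) is what the gambler actually knows after hearing r, so it is the natural quantity for the betting fractions a(s/r) to depend on.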


Gambling and Shannon’s entropy function.

A little while ago Matt posted an overview of the definition of Shannon’s entropy function and its use in assessing interaction between random variables. I’d like to stay with the 1950s Bell Labs team and describe some of the results from J. L. Kelly’s wonderful paper, which builds on the definitions and ideas in Shannon’s original work.

Consider first the trivial scenario where we have a noiseless binary channel that transmits the results of, say, a baseball game. If the odds are even and the gambler has access to this channel, he can grow his capital exponentially by betting on the outcome he already knows (since he learns the outcome before betting). Staking everything each time, his capital would be 2^N times its initial value after N bets. Since any sensible measure of growth should be maximal in this scenario, Kelly defines the exponential rate of growth of capital as

G = \lim_{N \rightarrow \infty} \frac{1}{N} \log_2 \left( \frac{V_N}{V_0} \right),

where V_0 is the initial capital and V_N is the capital after N bets.
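As a quick sanity check of this definition, here is a minimal sketch that evaluates the finite-N version of G for the noiseless case, where the capital doubles on every bet. The function name and the starting capital are assumptions chosen for illustration.

```python
import math

def growth_rate(v0, vn, n):
    """Finite-N estimate of G: (1/N) * log2(V_N / V_0)."""
    return math.log2(vn / v0) / n

# Noiseless channel at even odds: the gambler stakes everything and doubles each time.
v0 = 100.0            # hypothetical initial capital
n = 20                # number of bets
vn = v0 * 2 ** n      # capital after n sure wins
g = growth_rate(v0, vn, n)
```

Here g comes out as 1, i.e. one doubling per bet, which is the largest growth rate available at even odds.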

Now suppose that the channel is noisy, so that a given symbol has probability p of error and probability q = 1 - p of correct transmission. The question now is: how much should the gambler bet?
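One way to get a feel for the question numerically is to assume the gambler bets the same fraction f of his capital every time at even odds. Each bet then multiplies his capital by (1 + f) with probability q and by (1 - f) with probability p, so the expected growth per bet in the sense of G is q \log_2(1 + f) + p \log_2(1 - f). The sketch below scans f over a grid; the fixed-fraction strategy, the value q = 0.75 and the grid are all assumptions made for illustration.

```python
import math

def expected_log_growth(f, q):
    """Expected log2 growth per bet when staking a fraction f at even odds."""
    p = 1.0 - q
    return q * math.log2(1.0 + f) + p * math.log2(1.0 - f)

q = 0.75                                  # assumed probability of correct transmission
grid = [i / 1000 for i in range(1000)]    # candidate fractions f in [0, 0.999]
best_f = max(grid, key=lambda f: expected_log_growth(f, q))
best_g = expected_log_growth(best_f, q)
```

The scan peaks strictly between 0 and 1: staking everything is no longer optimal, because a single transmission error would wipe out the whole capital.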
