John Nunemaker on the implementation of analytics at GitHub using Cassandra:

The collector became a Rails app with one purpose – to receive events and queue them in Kestrel, which I used on Gauges.


The processor pulled from the queue and stored the raw data in Cassandra. The other component of processing (Hadoop) then iterated the raw data on intervals and turned it into aggregated “indexes”.


The reporter became a Rails app with one purpose as well, to receive API requests from github.com and read the data required to fulfil the request from Cassandra.
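
The flow described above (collector, queue, processor, periodic aggregation, reporter) can be sketched in a few lines. This is only an illustration, not GitHub's code: the queue, raw store, and index are in-memory stand-ins for Kestrel, Cassandra, and the Hadoop output, and the function names and the `repo`/`day` event fields are assumptions made up for the example.

```python
from collections import Counter, deque

# Hypothetical in-memory stand-ins: a deque for the Kestrel queue,
# a list for the raw event rows, and a Counter for the aggregated "index".
queue = deque()
raw_events = []
daily_views = Counter()


def collect(event):
    """Collector: accept an event and enqueue it, nothing more."""
    queue.append(event)


def process():
    """Processor: drain the queue and persist each raw event."""
    while queue:
        raw_events.append(queue.popleft())


def aggregate():
    """Batch step: rebuild per-repo, per-day counts from the raw events."""
    daily_views.clear()
    for event in raw_events:
        daily_views[(event["repo"], event["day"])] += 1


def report(repo, day):
    """Reporter: answer a request by reading only the aggregated index."""
    return daily_views[(repo, day)]


collect({"repo": "rails/rails", "day": "2013-01-01"})
collect({"repo": "rails/rails", "day": "2013-01-01"})
collect({"repo": "rails/rails", "day": "2013-01-02"})
process()
aggregate()
print(report("rails/rails", "2013-01-01"))
```

The point of the split is that each piece has one job: the collector never touches storage, and the reporter never scans raw events, only the precomputed aggregates.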