Details behind "Is Caltrain Running Late?"

We use social media data to give real-time estimates of Caltrain delays. The idea is that when a public service is experiencing delays, people will complain about them online.

Our app compares the predicted and actual amount of attention Caltrain is getting on social media and reports to you when it detects anomalies. For example, check out the last 3 days of activity (updated nightly):

The last three days of data.
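To make the comparison concrete, here is a minimal sketch of one way such a check could work. It is not the app's actual code: the function names, the input layout (tweet counts keyed by hour of day), and the rule "flag an hour when its count exceeds the historical 95th percentile" are all assumptions made for illustration.

```python
from statistics import quantiles


def percentile_95(values):
    """95th percentile of a list of historical counts (needs >= 2 values)."""
    # With n=20, quantiles() returns 19 cut points; the last one is the 95th percentile.
    return quantiles(values, n=20, method="inclusive")[-1]


def flag_anomalies(history_by_hour, observed_by_hour):
    """Return the hours whose observed tweet count exceeds the usual range.

    history_by_hour:  {hour: [counts seen at that hour on past days]}  (hypothetical layout)
    observed_by_hour: {hour: count seen at that hour today}            (hypothetical layout)
    """
    flagged = []
    for hour, observed in sorted(observed_by_hour.items()):
        past = history_by_hour.get(hour, [])
        if len(past) < 2:
            # Not enough history to estimate a baseline for this hour.
            continue
        # Assumed rule: anything above the historical 95th percentile is an anomaly.
        if observed > percentile_95(past):
            flagged.append(hour)
    return flagged
```

For example, with ten past days of counts for 8 a.m. that range from 3 to 6, an observed count of 20 at 8 a.m. would be flagged, while a quiet afternoon hour would not. A real system would likely use a richer prediction than a fixed percentile.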

Another cool thing you can see in this data is the noticeable spikes in tweets during commute hours, when people are normally going to and from work. (The plot shows the 5th, 10th, 50th, 90th, and 95th percentiles.)

Average hourly rates.
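As a rough illustration of how hourly percentiles like these could be computed, here is a small sketch. It assumes the input is simply a list of tweet timestamps as `datetime` objects; the function name `hourly_percentiles` and the `weekend` flag are hypothetical, not taken from the site's code.

```python
from collections import Counter
from statistics import quantiles

PERCENTILES = (5, 10, 50, 90, 95)


def hourly_percentiles(timestamps, weekend=False):
    """Percentiles of tweets-per-hour for each hour of the day.

    timestamps: iterable of datetime objects, one per tweet (assumed input format).
    weekend:    False to use Monday-Friday, True to use Saturday-Sunday.
    """
    # Keep only tweets from the requested kind of day (weekday() >= 5 is Sat/Sun).
    selected = [t for t in timestamps if (t.weekday() >= 5) == weekend]
    counts = Counter((t.date(), t.hour) for t in selected)
    # Caveat: days with no tweets at all never appear here.
    days = sorted({t.date() for t in selected})
    if len(days) < 2:
        raise ValueError("need tweets from at least two days")

    result = {}
    for hour in range(24):
        # One count per day for this hour, with 0 for hours that had no tweets.
        per_day = [counts.get((day, hour), 0) for day in days]
        # n=100 gives 99 cut points; index p-1 is the p-th percentile.
        cuts = quantiles(per_day, n=100, method="inclusive")
        result[hour] = {p: cuts[p - 1] for p in PERCENTILES}
    return result
```

Calling the function once with `weekend=False` and once with `weekend=True` gives two separate hourly profiles that can be plotted side by side.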

The dynamics are different on the weekends.

Average hourly rates on weekends.

Development Stack

This site runs on a Node.js server on an Amazon Web Services instance. Data analytics are done in a combination of Python and R.