Details behind "Is Caltrain Running Late?"
We use social data to give real-time estimates of Caltrain delays. The idea is that when a public service is experiencing delays, people will complain about the issues.
Our app compares the predicted and actual amount of attention Caltrain is getting on social media and reports to you when it detects anomalies. For example, check out the last 3 days of activity (updated nightly):
Another cool thing you can see in this data is the noticeable spikes in tweets at the times people normally commute to and from work. (Shown: the 5th, 10th, 50th, 90th and 95th percentiles.)
The dynamics are different on the weekends.
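To make the idea of comparing predicted and actual attention concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the site's actual method: it assumes hourly tweet counts are already available, uses the 95th percentile of past counts for the same hour as the "predicted" upper bound, and keeps weekdays and weekends in separate buckets because their dynamics differ. The names `percentile`, `build_baseline` and `is_anomalous` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
import math


def percentile(sorted_values, q):
    """Return the q-th percentile (0-100) of a sorted list, with linear interpolation."""
    if not sorted_values:
        raise ValueError("no data")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def build_baseline(history, q=95):
    """Build a per-hour threshold from past observations.

    history: iterable of (datetime, tweet_count) pairs, one per hour (assumed format).
    Returns a dict mapping (is_weekend, hour) to the q-th percentile of counts.
    """
    buckets = defaultdict(list)
    for ts, count in history:
        # weekday() is 5 for Saturday and 6 for Sunday
        key = (ts.weekday() >= 5, ts.hour)
        buckets[key].append(count)
    return {key: percentile(sorted(vals), q) for key, vals in buckets.items()}


def is_anomalous(baseline, ts, count):
    """Flag a count that exceeds the historical threshold for that kind of hour."""
    threshold = baseline.get((ts.weekday() >= 5, ts.hour))
    if threshold is None:
        # No history for this bucket, so make no claim either way.
        return False
    return count > threshold
```

For example, after calling `build_baseline` on a few weeks of hourly counts, `is_anomalous(baseline, datetime.now(), current_count)` would flag an hour with unusually many tweets for that time of day. A real system would likely need a more careful prediction model; the percentile threshold here is only the simplest choice that matches the percentile bands described above.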
This site runs on a node.js server on an Amazon Web Services instance. Data analytics are done in a combination of Python and R.