So I was thinking that if there were an earthquake in Los Angeles, the internet would react as follows:
1. 100,000 people in Los Angeles search for "earthquake" and related terms
2. Web traffic from the affected areas drops off
3. CNN adds a story to their front page about a tsunami in Santa Monica
4. Other news outlets eventually post stories with official information regarding the quake
5. CNN fixes their front page to let people know that it was actually a 5.8 quake around Pasadena
The timing, I would guess, would be as follows. (1) happens from about 5 seconds after the shaking starts until roughly six hours after it ends. (2) starts as soon as people in the area begin losing power. (3) takes 20 minutes to go live and is based on bad information; the tsunami is obviously an exaggeration, but the idea is that they will get some key facts wrong. (4) starts an hour or two after the quake, around the same time as (5).
So what do we see here? We could have guessed at the earthquake from the hundred thousand related queries in the immediate area, and probably generated some useful information by bumping that up against the region that stopped generating as much traffic as it usually does.
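A minimal sketch of that detection idea, with made-up thresholds and data: flag a likely quake when query volume for related terms spikes in a region while overall traffic from that same region sags below its baseline. The function name and all numbers here are hypothetical.

```python
def likely_quake(query_count, query_baseline, traffic, traffic_baseline,
                 spike_factor=50.0, drop_factor=0.5):
    """Return True when queries spike AND traffic sags in the same region.

    spike_factor and drop_factor are arbitrary illustrative thresholds;
    a real system would tune these against historical regional data.
    """
    spike = query_count > spike_factor * max(query_baseline, 1.0)
    sag = traffic < drop_factor * traffic_baseline
    return spike and sag

# Example: 100,000 "earthquake" queries against a baseline of ~200/hour,
# while regional traffic falls to 40% of normal.
print(likely_quake(100_000, 200, 0.4, 1.0))  # True
```

The point of requiring both signals together is that either one alone is noisy: a query spike can come from a news story elsewhere, and a traffic dip can be an ordinary outage.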
This would require some sort of simplistic combination of search terms to generate the "news" article. It would obviously be susceptible to clever hacking, but (3) is all we get now, and that's typically bad information anyway. You can pull a location, an "event", and all sorts of other data from the information that's already out there.
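The "simplistic combination of search terms" could be as crude as counting the most common location and event terms in recent queries and filling a template. This is a hypothetical sketch; real query logs would need geo-tagging and filtering against the clever hacking mentioned above, which this skips entirely.

```python
from collections import Counter

def draft_alert(queries):
    """Fill a one-line alert template from the most common location
    and event terms in a batch of (hypothetical) parsed queries."""
    locations = Counter(q["location"] for q in queries)
    events = Counter(q["event"] for q in queries)
    place, n = locations.most_common(1)[0]
    event, _ = events.most_common(1)[0]
    return (f"Possible {event} near {place}: "
            f"{n} related searches in the last few minutes.")

# Toy input standing in for parsed query logs.
queries = [
    {"location": "Pasadena", "event": "earthquake"},
    {"location": "Pasadena", "event": "earthquake"},
    {"location": "Santa Monica", "event": "earthquake"},
]
print(draft_alert(queries))
# Possible earthquake near Pasadena: 2 related searches in the last few minutes.
```

Even something this dumb beats the hypothetical (3) above on one axis: it never asserts facts it doesn't have, only that people in a place are searching for a thing.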
If you think this is a bad idea, think about it some more. If there's a deal breaker I'm missing, please let me know. A lot of this sort of information is out there already, and I would bet news outlets actually monitor it and have people write reports from it… but this could be fully automated. It's almost simple.