News Sniffer hits one million news article versions

Monday, July 8th, 2013 at 1:24 pm

News Sniffer retrieved its one millionth news article version last week, after running continuously for almost 7 years!

The first version ever collected was on the 29th August 2006 – 6 years 10 months 10 days ago.

There are currently 1,004,651 versions of 394,967 articles, so each article has on average about 2.5 versions. It’s collected 185,949 versions this year alone, which is about 1,000 versions discovered each day.
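For anyone who wants to check the arithmetic, here is a quick sketch in Python. The counts are the ones quoted above; treating "this year" as 1 January 2013 up to the date of this post is an assumption, and the variable names are purely illustrative.

```python
from datetime import date

# Figures quoted in the post.
total_versions = 1_004_651
total_articles = 394_967
versions_this_year = 185_949

# Assumption: "this year" runs from 1 January 2013 to the post date (8 July 2013).
days_elapsed = (date(2013, 7, 8) - date(2013, 1, 1)).days

versions_per_article = total_versions / total_articles
versions_per_day = versions_this_year / days_elapsed

print(f"{versions_per_article:.2f} versions per article")
print(f"{versions_per_day:.0f} versions per day")
```

The per-day figure shifts a little depending on exactly which day the count was taken, but it stays close to 1,000.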

For the techies, it currently takes up about 7 gigabytes of data on disk (in MySQL) with an additional 29 gigabytes of search index (in Xapian).

Looking forward to another million versions, which will come sooner as we start monitoring more news sources.

Remember that the News Sniffer project is open source, so if you’re a Ruby programmer (or can hire Ruby programmers) you can make it better, or even run your own sniffer! Join in!
