Upgrade: wym comment cleanup, downtime, and improved search

Friday, April 20th, 2007 at 1:13 pm

News Sniffer was down for a couple of hours yesterday afternoon whilst I upgraded the software that runs it. The upgrade fixes a few display bugs and, more importantly, patches up another problem where BBC comments were being misclassified as censored. The misclassification was due to a mistake in handling British Summer Time, so it has only been prevalent since the end of March. We double-checked all censored comments from the last few months and removed any misclassified ones.
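
The exact mistake isn't spelled out here, but as a rough illustration of how this class of bug tends to arise, here is a minimal Python sketch. It assumes (and this is only an assumption, not a description of the News Sniffer code) that a UK local timestamp was treated as if it were already UTC. The comment time used is made up for the example.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
BST = timezone(timedelta(hours=1), "BST")  # British Summer Time is UTC+1

# Hypothetical comment posted at 14:30 UK local time on 10 April 2007,
# i.e. after the clocks went forward at the end of March.
posted_local = datetime(2007, 4, 10, 14, 30, tzinfo=BST)

# Correct handling: convert the local time to UTC before storing or comparing.
correct_utc = posted_local.astimezone(UTC)

# Assumed buggy handling: keep the wall-clock time but label it as UTC,
# so the one-hour offset is never applied.
buggy_utc = posted_local.replace(tzinfo=UTC)

# The two interpretations disagree by exactly the BST offset.
skew = buggy_utc - correct_utc
print(skew)  # 1:00:00
```

During winter the two interpretations agree, which would explain why a bug like this only shows up once summer time starts.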

Due to the downtime whilst we cleaned and reindexed the database, some news articles and BBC comments were not added to the database, so they will not be tracked. If you’re looking for a particular news article published yesterday afternoon, it’s likely that it is missing. Apologies for the inconvenience.

The biggest new feature is a vastly improved search system.

It works like most search engines, so just type keywords in and you get results. You can be a bit more advanced too, though. Say you want to find all censored BBC comments by the author ‘John Smith’. Do a search for:

author:"john smith"


Or say you want all the comments mentioning the NHS, but want to exclude any mentioning Patricia Hewitt:

nhs -hewitt

Maybe you want to find all censored BBC comments with the words ‘iran’ and ‘bomb’ made in April 2007:

iran bomb created_at:200704*

To find a particular BBC comment, you can search on the BBC’s own comment id number:

bbcid:2481215

To get all Revisionista news articles with the word ‘iraq’ in the title published today (19th April 2007):

title:iraq created_at:20070419
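
For the curious, here is a rough Python sketch of how queries like the ones above can be read. It is not how the search is actually implemented on the site; it is a toy model of the syntax, and it assumes that every term must match, that unfielded keywords search a field called `text` (a made-up name), and that records are plain dictionaries. The names `Term`, `parse_query` and `matches` are invented for this example.

```python
import shlex
from dataclasses import dataclass


@dataclass
class Term:
    field: str     # field to search; "text" is an assumed default
    value: str     # lowercased search value, without any trailing "*"
    prefix: bool   # True when the value was written with a trailing "*"
    exclude: bool  # True when the term was written with a leading "-"


def parse_query(query, default_field="text"):
    """Split a query such as 'nhs -hewitt' into a list of Term objects."""
    terms = []
    # shlex keeps a quoted phrase like author:"john smith" as one token.
    for token in shlex.split(query):
        exclude = token.startswith("-")
        if exclude:
            token = token[1:]
        field, sep, value = token.partition(":")
        if not sep:  # plain keyword with no field given
            field, value = default_field, token
        prefix = value.endswith("*")
        if prefix:
            value = value[:-1]
        if value:
            terms.append(Term(field, value.lower(), prefix, exclude))
    return terms


def term_matches(term, record):
    """Check one term against one record (whole words, case-insensitive)."""
    words = str(record.get(term.field, "")).lower().split()
    if term.prefix:
        return any(word.startswith(term.value) for word in words)
    # Pad with spaces so a phrase only matches on whole-word boundaries.
    return f" {term.value} " in f" {' '.join(words)} "


def matches(record, query):
    """A record matches when every plain term hits and no excluded term does."""
    return all(term_matches(t, record) != t.exclude for t in parse_query(query))


comment = {
    "author": "John Smith",
    "text": "The NHS needs more funding",
    "created_at": "20070412",
    "bbcid": "2481215",
}
print(matches(comment, 'author:"john smith"'))
print(matches(comment, "nhs -hewitt"))
print(matches(comment, "iran bomb created_at:200704*"))
```

The sample comment matches the first two queries but not the third, because it is from April 2007 yet mentions neither ‘iran’ nor ‘bomb’.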

The BBC Have Your Say search is here, and the Revisionista news article search is here. Happy searching!

One response to “Upgrade: wym comment cleanup, downtime, and improved search”

  1. […] I’ve been working on my News Sniffer project for the last few days, finishing up a two month experiment with using the Ruby Lucene implementation, Ferret, to index news articles and comments.  More info on the News Sniffer blog.  The project spanned two months due to some instability in the newer versions of Ferret, but the author responded to the bug reports and managed to fix all the problems so I decided to deploy. […]
