No updates for the last 2 days

March 1st, 2007

Apologies, but due to a VPN problem, News Sniffer has not been monitoring news articles or Have Your Say forums for the last 2 days. All working again now though. I’ll set up an alerting system to prevent this happening again.

BBC UK Politics feed added to Revisionista

January 18th, 2007

I’ve just added the BBC UK Politics section to Revisionista, so articles from it will be monitored for changes from now on.

BBC fixes RSS feeds – breaks Watch Your Mouth

December 8th, 2006

When the BBC discovered News Sniffer, I was invited to discuss it on their techie lists. I mentioned a few of the problems I’d had with the feeds, such as duplicate entries, a lack of useful caching HTTP headers and the huge size of the feeds. In response, they looked into it and fixed the duplicate entries within a couple of weeks.

Yesterday they changed the default size of the feeds but also rejigged the RSS format. This broke Watch Your Mouth in a number of ways (mostly affecting only new threads since yesterday):

  • The timestamps of new comments weren’t reported correctly
  • The author details of new comments were missing
  • The description details of new threads were missing

Due to a combination of some of the above, some comments were marked censored when they were in fact published. I’ve adjusted Watch Your Mouth in response to these changes and it’s working ok again now. I’ve also run the clean-up scripts so any published comments marked censored have been restored – you might notice a bunch of “censored” comments disappear from the indexes.

These kinds of problems are expected when you’re monitoring data that’s in the control of someone else (especially in ways they might not have ever intended). I just need to keep an eye on the situation and make alterations accordingly when problems arise.

To compare it to an “arms race” isn’t quite right because there is no evidence that the BBC are purposefully making life difficult for us. In fact, these changes are actually helpful.

UPDATE: Due to the combination of malformed BBC RSS timestamps (hours going from 0000 to 2400?!) and a bug in Watch Your Mouth, we’ve been missing a lot of potentially censored comments on some threads for the last 3 or 4 days. I’ve now written a workaround to this quirk so things should be back to normal.
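For anyone hitting the same timestamp quirk, here is a minimal sketch of one way to tolerate an hour of 24. It is not the actual Watch Your Mouth workaround. The function name `parse_feed_timestamp`, the default format string, and the decision to treat hour 24 as hour 00 of the following day are all assumptions made for illustration.

```python
import re
from datetime import datetime, timedelta

# Matches a time-of-day whose hour field is the out-of-range value 24.
_HOUR_24 = re.compile(r"\b24:(\d{2}):(\d{2})\b")


def parse_feed_timestamp(text, fmt="%d %b %Y %H:%M:%S"):
    """Parse a timestamp string, tolerating an hour of 24.

    Assumption: hour 24 is read as hour 00 of the next day.
    The format string is hypothetical, not the BBC's actual one.
    """
    match = _HOUR_24.search(text)
    if match is None:
        return datetime.strptime(text, fmt)
    # Rewrite the hour to 00 so strptime accepts it, then add a day.
    cleaned = text[:match.start()] + "00:%s:%s" % match.groups() + text[match.end():]
    return datetime.strptime(cleaned, fmt) + timedelta(days=1)
```

A parser that simply rejects such timestamps would drop or misdate the affected comments, which is the kind of failure described above.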

News Sniffer used by NHS Blog Doctor

December 4th, 2006

The NHS Blog Doctor blog used Revisionista to expose a bit of a “cover-up” on a BBC News article.

Basically, NHS Blog Doctor criticised a story on babies with milk allergies published by the BBC. They even made a formal written complaint. The article was then changed and readers started accusing NHS Blog Doctor of misrepresenting the BBC. They never received a reply from the BBC about their complaint. Not even an acknowledgement.

They used News Sniffer to show the article had been changed. Go read the whole thing.

The Revisionista diff of the particular change is here.

Revisionista parser changes – BBC flurry

November 2nd, 2006

I’ve tweaked the way BBC news articles are parsed for Revisionista. Unfortunately this means you’ll see a flurry of new revisions, with no actual changes (though the whole article will be marked as changed).

I also fixed the Guardian news article title parsing – they changed things around a bit. This shouldn’t result in any flurries.

BBC Editors blog about News Sniffer

October 31st, 2006

The BBC Editors blog has mentioned News Sniffer today.

The post largely just marginalises us, but it’s rather ambiguous. When they suggest that Revisionista will not find examples of bias, I can’t decide if they mean that the BBC is not biased, or that they are just very good at it.

Some of the recommended revisions are interesting, but maybe we need a comments feature so people can explain and discuss their recommendations.

And they also describe their censoring of ‘Have Your Say’ comments as “censoring” in quotation marks. I’m not sure what this is supposed to mean either. Is it not censorship when they remove comments? Or do they not remove comments?

See the top recommended censored comments for some interesting examples of “censorship”.

Watch Your Mouth updates

October 30th, 2006

Back end changes

I rolled out a new version of News Sniffer last night. A lot of work went into rejigging the censored comment detection to make sure we don’t make mistakes. I also wrote a system to confirm censored comments by html scraping, which gives us a way to double check censorship.

This also allowed me to check the entire backlog of censored comments. There were a number of comments that we thought were censored but were not and those are now fixed. We also found a number of comments that we didn’t know had been censored.

Apologies for this. My understanding of the BBC HYS RSS feeds was flawed (to be frank, largely due to some brain-deadness on the part of the BBC forum software).

So you might notice that some existing censored comments disappeared but other ones appeared for the first time. I’ll be running the confirmation script regularly to monitor the new checking system (though not as regularly as the others as it’s a bit intensive).
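The two-stage check described above can be sketched roughly like this. It is an illustration only, not the real News Sniffer code: the function names, the use of plain comment ids, and the idea that a comment becomes a candidate simply by vanishing from the feed are assumptions.

```python
def find_censored_candidates(seen_ids, feed_ids):
    """Stage 1: comments seen in an earlier feed but absent from the latest one."""
    return set(seen_ids) - set(feed_ids)


def confirm_censored(candidates, page_ids):
    """Stage 2: keep only candidates that are also absent from the scraped HTML.

    page_ids is assumed to be the comment ids found by scraping the
    thread's web pages; the scraping itself is not shown here.
    """
    on_page = set(page_ids)
    return {comment_id for comment_id in candidates if comment_id not in on_page}
```

In this sketch a comment only counts as censored when both sources agree it is gone, so a quirk in the feed alone cannot produce a false positive. The scraping stage is the expensive one, which fits with running it less often.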

Front end changes

The “Recommended Comments” page now lists the latest recommended comments, not the highest recommended as before. Comments in order of highest recommendations can now be found on the “Top Recommended Comments” page.

The thread listing pages now display the number of published comments along with the number of censored ones. This helps give an idea of how busy a thread is (you might expect busier threads to be more censored).

And lastly, the thread display page now includes the BBC HYS description, which gives a bit of background to the thread and links to any news articles that might have prompted it.

Thanks to datamining.typepad.com and currybet.net for the feedback that led to some of these improvements.

BBC Radar

October 27th, 2006

News Sniffer blipped on the BBC radar this week. Someone mailed their public “backstage” mailing list with a link and it eventually turned up on the backstage.bbc.co.uk blog.

Richard Sambrook, Director of BBC Global News, has commented too.

I’m not sure why he’s so interested in what my name is, but I’ve added it clearly to the about page now. It’s not like it was a big secret.

An ex-BBC employee, Martin Belam, commented extensively on it too.

I also got invited to discuss News Sniffer on the backstage mailing list by Ian Forrester of the BBC. An archive of the mailing list is available, with a notable thread starting here. I posted some details of the inner workings of News Sniffer too.

‘Watch Your Mouth’ malfunction – fixed

October 26th, 2006

The ‘Watch Your Mouth’ system is currently marking some comments as censored when they are not. This seems to be because the BBC’s servers are out of sync with each other, so I sometimes get out-of-date RSS feeds. I have a solution for this and am working on it.

This was brought to my attention (very gracefully) by a BBC employee.

UPDATE: Problem fixed. No more comments should be misclassified. I’m working on verifying the backlog, but it should only have been a small number of comments. It was a bit of a corner case causing the problem.
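To illustrate the kind of problem an out-of-date feed causes, here is a hedged sketch of one possible guard. It is not the fix that was actually deployed. The `FeedTracker` class and the assumption that every feed snapshot carries a usable build time are hypothetical.

```python
from datetime import datetime


class FeedTracker:
    """Tracks comment ids across feed snapshots and ignores stale snapshots."""

    def __init__(self):
        self.latest_build = None  # build time of the newest snapshot processed
        self.seen_ids = set()     # every comment id seen so far

    def accept(self, build_time, comment_ids):
        """Return the ids missing from this snapshot, or None if it is stale."""
        if self.latest_build is not None and build_time < self.latest_build:
            # An out-of-sync server returned an older feed: draw no conclusions.
            return None
        current = set(comment_ids)
        missing = self.seen_ids - current
        self.latest_build = build_time
        self.seen_ids |= current
        return missing
```

Without a guard like this, an older feed that predates a comment looks identical to a newer feed from which the comment has been removed.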

News Sniffer Bias

October 22nd, 2006

While drumming up some more attention for News Sniffer, I e-mailed the Biased-BBC blog, which then linked to us in a post. Some comments on their post brought up the issue of News Sniffer bias. Here is a quick response I put together:

Read the rest of this entry »