News Scraper is a library that helps track news articles on the web. Many people use news aggregators such as O'Reilly's Meerkat or desktop solutions such as AmphetaDesk to combine news items from many different sources. However, these tools require the web pages to be published in RSS format. Although many web sites are available in RSS format (notably blogs), a lot of my favorite ones are not.
Therefore, this project provides a set of scripts that scrape web pages and reformat them as RSS feeds. It was inspired by Jamie Zawinski's cheesegrater, but is designed to be more easily extensible. The project also provides a rudimentary script that formats RSS files into an HTML page (see sample; please do not link to this page). However, I imagine that the primary use will be generating RSS feeds for much nicer front ends.
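To give a rough idea of what "scrape a page and reformat it as RSS" involves, here is a minimal sketch that uses only the Python standard library. It is not News Scraper's actual code or API: the names HeadlineParser and page_to_rss, the sample page, and the assumption that headlines are links inside h2 elements are all hypothetical and chosen only for illustration. A real scraper would need site-specific extraction rules.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import xml.etree.ElementTree as ET


class HeadlineParser(HTMLParser):
    """Collect (title, href) pairs from <a> tags found inside <h2> elements.

    Assumption: the target page marks each headline as <h2><a href=...>.
    """

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_h2 = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
        elif tag == "a" and self._in_h2:
            # Stays None if the link has no href, so it is skipped.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            if title:
                self.items.append((title, self._href))
            self._href = None
        elif tag == "h2":
            self._in_h2 = False
            self._href = None


def page_to_rss(html, page_url, feed_title):
    """Return an RSS 2.0 document (as a string) for the headlines in html."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = page_url
    ET.SubElement(channel, "description").text = "Scraped from " + page_url
    for title, href in parser.items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        # Resolve relative links against the page they were scraped from.
        ET.SubElement(item, "link").text = urljoin(page_url, href)
    return ET.tostring(rss, encoding="unicode")


# Hypothetical page content; a real script would download this first.
SAMPLE = """
<html><body>
<h2><a href="/story/1">First story</a></h2>
<p><a href="/about">About</a></p>
<h2><a href="http://example.org/2">Second story</a></h2>
</body></html>
"""

feed = page_to_rss(SAMPLE, "http://example.com/news", "Example News")
print(feed)
```

In this sketch the "About" link is ignored because it is not inside an h2, and the relative link is resolved against the page URL. Keeping the extraction rule in one small parser class is one way to make per-site scrapers easy to swap out, which is the kind of extensibility described above.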
News Scraper is implemented in Python and runs anywhere Python does. It is distributed under the BSD license. News Scraper is currently developed and maintained by Jeffrey Chang.