Ever since I started using Sharpreader I have been missing out on some feeds I used to read. Newzcrawler can read NNTP and show it in the same style of interface as RSS feeds, which is really nice for reading mailing lists via gmane.org's SMTP to NNTP service.
At the moment I read FreeBSD Stable, FreeBSD Security Announce, NT Bugtraq and Bugtraq through gmane.org, so you can understand that it was rather traumatic to lose this ability.
I spent hours looking for something to convert newsgroups to RSS feeds and came across one promising project, but it has some limitations. It does not do full bodies, and it's a PHP script, so it does the NNTP download at the time you request the feed. Even with body support coded in, it would no doubt be very slow to respond.
I set out to write my own gmane-specific implementation in Perl, using News::NNTPClient, XML::RSS and DateTime::Format::W3CDTF.
You can get the full Perl source here. Be sure to satisfy the required modules above; they are all in the FreeBSD Ports Tree.
I run the script from a crontab entry regularly and it outputs RSS feeds. Currently the major features are:
- Channel Link redirects the browser to the main Gmane.org web view of the newsgroup
- Metadata such as author, date and subject is retrieved from the original NNTP article
- Full Bodies are displayed
- Configurable channel subject and creator
- Valid RSS 1.0 code as produced by XML::RSS
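To make the metadata and full-body features above a little more concrete, here is a rough Python sketch of how one raw NNTP article could be mapped to the fields of an RSS item. It is not the Perl script itself: the function name `article_to_item` and the returned dictionary keys are made up for illustration, and the Latin-1 fallback for articles without a declared charset is an assumption.

```python
from email import message_from_bytes
from email.utils import parseaddr, parsedate_to_datetime
from xml.sax.saxutils import escape


def article_to_item(raw_article: bytes) -> dict:
    """Map one raw NNTP article (headers + body) to RSS item fields.

    Hypothetical helper for illustration only.
    """
    msg = message_from_bytes(raw_article)

    # "Jane Doe <jane@example.org>" -> display name, else the bare address.
    name, address = parseaddr(msg.get("From", ""))

    # RFC 2822 Date header -> ISO 8601 with offset, which is valid W3CDTF.
    date = parsedate_to_datetime(msg["Date"]).isoformat()

    # Decode the body with the charset the article declares. Falling back to
    # Latin-1 is an assumption; it never fails on arbitrary bytes.
    # Multipart articles yield no single payload and end up with an empty body.
    payload = msg.get_payload(decode=True) or b""
    charset = msg.get_content_charset() or "latin-1"
    try:
        body = payload.decode(charset, errors="replace")
    except LookupError:  # unknown charset label in the article
        body = payload.decode("latin-1")

    # Escape &, < and > so the text can be placed inside XML elements.
    return {
        "title": escape(msg.get("Subject", "(no subject)")),
        "creator": escape(name or address),
        "date": date,
        "description": escape(body),
    }
```

Decoding the body before escaping it means accented characters reach the feed as proper Unicode text rather than raw bytes. Subjects encoded with RFC 2047 are passed through untouched in this sketch.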
At the moment, in version 1.1, no caching or intelligent retrieval of articles exists. Future versions will retrieve only articles that are new since the previous poll, and the resulting RSS file will contain only these entries, which will conserve bandwidth and CPU time. It also seems that XML::RSS does not handle all special characters in the bodies correctly, so posts with accented characters fail validation. I will either cater for these in my code or contact the XML::RSS authors to see if they can fix it.
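The planned "only new articles" behaviour could work roughly as in the Python sketch below. It is only an illustration of the idea, not code from gmane2rss.pl: it assumes the NNTP server reports the first and last article numbers of the group, and that the highest number fetched so far is kept in a small JSON state file. All function names and the state file format are hypothetical.

```python
import json
from pathlib import Path


def articles_to_fetch(first: int, last: int, last_seen: int, limit: int) -> range:
    """Return the article numbers that are new since the previous poll.

    first/last: bounds the server reports for the group.
    last_seen:  highest number fetched previously (0 if never polled).
    limit:      most articles a single run may fetch.
    """
    start = max(first, last_seen + 1, last - limit + 1)
    return range(start, last + 1)


def load_last_seen(state_file: Path, group: str) -> int:
    """Read the highest article number already fetched for this group."""
    if not state_file.exists():
        return 0
    return json.loads(state_file.read_text()).get(group, 0)


def save_last_seen(state_file: Path, group: str, number: int) -> None:
    """Record the highest article number fetched for this group."""
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    state[group] = number
    state_file.write_text(json.dumps(state))
```

On the first run this falls back to the newest `limit` articles. On later runs it returns only the numbers above the stored one, and an empty range when nothing new has arrived, in which case the NNTP download can be skipped entirely.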
A sample execution for the gmane.comp.security.bugtraq group looks like this:
gmane2rss.pl -n 25 -g gmane.comp.security.bugtraq -s "Bugtraq" -c rip-blog@devco.net -o bugtraq.xml
The resulting RSS feed can be seen here and can be previewed here in my RSS Previewer.