wp-archivebot
Version 0.1 revision 0 uploaded by GwernBranwen.
Package meta
- Synopsis
- Subscribe to a wiki's RSS feed and archive external links
- Description
A MediaWiki wiki's RecentChanges and NewPages pages link to every new edit or article. This bot polls the corresponding RSS feed (easier and more reliable than parsing the HTML), follows the links to each new edit or article, and then uses TagSoup to extract every off-wiki link (e.g. to http://cnn.com).
With this list of external links, the bot then sends requests to http://webcitation.org/, which makes a backup copy of each linked page (similar to the Internet Archive, but on demand).
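The off-wiki link extraction described above could look roughly like the following. This is a minimal sketch, not the package's actual source: the names `hrefs`, `isExternal`, and `externalLinks` are hypothetical, "off-wiki" is assumed to mean an absolute http(s) URL that does not start with the wiki's base URL, and the `tagsoup` package is assumed to be installed.

```haskell
import Data.List (isPrefixOf, nub)
import Text.HTML.TagSoup (Tag (TagOpen), parseTags)

-- | Collect the href attribute of every <a> tag in an HTML page.
hrefs :: String -> [String]
hrefs html =
  [url | TagOpen "a" attrs <- parseTags html, ("href", url) <- attrs]

-- | Assumption: a link is off-wiki when it is an absolute http(s) URL
-- that does not start with the wiki's own base URL.
isExternal :: String -> String -> Bool
isExternal wikiBase url =
  any (`isPrefixOf` url) ["http://", "https://"]
    && not (wikiBase `isPrefixOf` url)

-- | Distinct off-wiki links found in a page, in order of first appearance.
externalLinks :: String -> String -> [String]
externalLinks wikiBase = nub . filter (isExternal wikiBase) . hrefs

main :: IO ()
main = mapM_ putStrLn (externalLinks "http://en.wikipedia.org/" page)
  where
    -- Hypothetical sample page: one relative link, one off-wiki link,
    -- and one absolute link back into the wiki.
    page =
      "<p><a href=\"/wiki/Foo\">Foo</a> "
        ++ "<a href=\"http://cnn.com\">CNN</a> "
        ++ "<a href=\"http://en.wikipedia.org/wiki/Bar\">Bar</a></p>"
```

On the sample page, only the http://cnn.com link should survive the filter. Relative links such as /wiki/Foo are dropped because they can only point back into the wiki.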
Example: to archive links from every article in the English Wikipedia's RecentChanges:
wp-archivebot gwern0@gmail.com 'http://en.wikipedia.org/w/index.php?title=Special:RecentChanges&feed=rss'
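The first argument in the example is an email address, presumably the one passed along with each archive request. As an illustration only, the sketch below builds such a request URL using nothing beyond `base`. The `archive?url=…&email=…` query form is an assumption about WebCite's interface, not something stated in this package description, and `percentEncode` and `archiveRequest` are hypothetical names.

```haskell
import Data.Char (isAlphaNum, isAscii, ord, toUpper)
import Numeric (showHex)

-- | Percent-encode every character except the RFC 3986 unreserved set.
-- Assumes ASCII input; other characters would need UTF-8 encoding first.
percentEncode :: String -> String
percentEncode = concatMap enc
  where
    enc c
      | isAscii c && (isAlphaNum c || c `elem` "-._~") = [c]
      | otherwise = '%' : map toUpper (pad (showHex (ord c) ""))
    -- Make sure single-digit hex values get a leading zero.
    pad [d] = ['0', d]
    pad ds = ds

-- | Build an archive request URL for one external link.
-- The endpoint and parameter names are assumptions (see the text above).
archiveRequest :: String -> String -> String
archiveRequest email url =
  "http://www.webcitation.org/archive?url="
    ++ percentEncode url
    ++ "&email="
    ++ percentEncode email

main :: IO ()
main = putStrLn (archiveRequest "user@example.com" "http://cnn.com")
```

Encoding the target URL keeps its own `?` and `&` characters from being read as part of the archive request's query string.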
- Author
- Gwern
- Bug reports
- n/a
- Category
- Network
- Copyright
- n/a
- Homepage
- n/a
- Maintainer
- gwern0@gmail.com
- Package URL
- n/a
- Stability
- Experimental