The Herd Of Kittens Python Development - Comics

I read a lot of web comics. Now that I've started using an RSS reader as my primary source of distraction, it turns out to be useful to collide the two and produce an RSS feed of recently updated comics.

Note that this does not generate deep links directly to the comic strip -- it merely looks for specific changes and drops in a new RSS "item" when there's new content. The reasons for this are twofold:

So the way this works is that you edit the sites array (assuming you have different comic interests than I do, or less time) and run

python comick.py mycomics update

This produces mycomics.rss, which you can feed to your RSS reader, and mycomics.db, which you want to keep around for future runs. The database contains the string matches that were used to identify the content, plus last-looked and last-changed timestamps for each site.
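
To make the database contents a bit more concrete, here is a minimal sketch of what one per-site record and its update step might look like. This is not the code from comick.py: the names SiteRecord and update_record are made up for illustration, and db is assumed to be any dict-like store keyed by site URL.

```python
import time
from dataclasses import dataclass


@dataclass
class SiteRecord:
    # Hypothetical record layout; the real mycomics.db format may differ.
    match: str            # string matched on the page at the last look
    last_looked: float    # when the page was last fetched (epoch seconds)
    last_changed: float   # when the matched string last differed


def update_record(db, url, match, now=None):
    """Record a look at `url`; return True if the matched string changed."""
    now = time.time() if now is None else now
    old = db.get(url)
    # A site seen for the first time counts as changed.
    changed = old is None or old.match != match
    last_changed = now if changed else old.last_changed
    db[url] = SiteRecord(match, now, last_changed)
    return changed
```

A True return is the point at which a new RSS "item" would be dropped into the feed; a False return only refreshes the last-looked timestamp.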

It currently assumes all comics are roughly daily, so once it sees a new comic it won't re-check that site until 6 hours have passed. Sites whose last change is older than that are still not checked more than once an hour. This will get further tuning - but note that even checking this often is much more efficient (i.e. costs the artist less) than checking with your browser, because it only grabs the top-level HTML page, which is usually relatively small, and doesn't grab any images or extra frames (with the exception of Helen, where the comic is actually on a secondary frame with a fixed name).
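
The re-check rule described above can be sketched roughly as follows. This is one reading of the two thresholds, not the script's actual logic: the function name should_check and the constant names are hypothetical, and all times are assumed to be epoch seconds.

```python
# Thresholds taken from the description above; the names are made up here.
RECHECK_AFTER_CHANGE = 6 * 60 * 60   # leave a site alone for 6 hours after a new comic
MIN_CHECK_INTERVAL = 60 * 60         # otherwise look at most once an hour


def should_check(last_looked, last_changed, now):
    """Return True if the site is due for another fetch."""
    # A comic that changed recently is assumed not to change again soon.
    if now - last_changed < RECHECK_AFTER_CHANGE:
        return False
    # Older sites are polled, but no more often than hourly.
    return now - last_looked >= MIN_CHECK_INTERVAL
```

With these numbers, a site that updates once a day gets its top-level page fetched at most about 18 times between updates.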