The Herd Of Kittens Python Development - Comics

I read a lot of web comics. Now that I've started using an RSS reader as my primary source of distraction, it turns out to be useful to collide the two, and produce an RSS-feed of recently updated comics.

Note that this does not generate deep links directly to the comic strip -- it merely looks for specific changes and drops in a new RSS "item" when there's new content. The reasons for this are twofold:

Some artists consider the full presentation of the comic to be important, not merely the strip itself. This includes collateral text (sinfest's "Resistance" or ozyandmillie's background commentary) as well as the look of the page. (Yes, if they're syndicated, this is lost - but I don't especially want to lose it, I feel it gives more of a connection with the artist.)
Some artists actually make some money (I doubt any make a lot, but that's even more reason to respect what little they do extract) from on-page ads, auctions that they advertise in their own collateral, and "donate" buttons and progress bars.
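
As a rough illustration of what "no deep links" means in practice, here is a minimal sketch of building an RSS item that points at the comic's own page rather than at the strip image. It is not taken from comick.py; the function name make_item, the title wording, and the example URL are all made up for the example.

```python
import xml.etree.ElementTree as ET
from email.utils import formatdate


def make_item(name, page_url, changed_at):
    """Build an RSS <item> that links to the comic's page, not the image.

    name, page_url and changed_at (seconds since the epoch) are
    hypothetical parameters chosen for this sketch.
    """
    item = ET.Element("item")
    ET.SubElement(item, "title").text = f"{name} updated"
    # Link to the artist's page so ads, commentary and layout stay intact.
    ET.SubElement(item, "link").text = page_url
    # RSS 2.0 expects an RFC 822 style date.
    ET.SubElement(item, "pubDate").text = formatdate(changed_at, usegmt=True)
    return item


if __name__ == "__main__":
    example = make_item("Example Comic", "http://comic.example/", 0)
    print(ET.tostring(example, encoding="unicode"))
```

The reader then shows "Example Comic updated" and clicking through lands on the full page, which is the whole point.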

So the way this works is that you edit the sites array (assuming you have different comic interests than I do, or less time) and run

python comick.py mycomics update

This produces mycomics.rss, which you can feed to your RSS reader, and mycomics.db, which you want to keep around for future runs. The database contains the string matches that were used to identify the content, plus last-looked and last-changed timestamps for each site.
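
The bookkeeping described above can be sketched roughly as follows. This is only an illustration under assumptions: the real on-disk format of mycomics.db is not shown here, so the record is modelled as a plain dict, and the key names (match, last_looked, last_changed) and the function record_check are invented for the example.

```python
def record_check(record, match, now):
    """Return (updated_record, changed) after looking at a site once.

    record -- dict with hypothetical keys "match", "last_looked",
              "last_changed" (an empty dict for a site never seen before)
    match  -- the identifying string pulled out of the page on this run
    now    -- current time in seconds since the epoch
    """
    updated = dict(record)          # leave the caller's record untouched
    updated["last_looked"] = now    # every look is recorded
    changed = match != record.get("match")
    if changed:
        # New content: remember the new string and when it appeared.
        updated["match"] = match
        updated["last_changed"] = now
    return updated, changed


if __name__ == "__main__":
    rec, changed = record_check({}, "strip-0412.png", 1000)
    print(rec, changed)
    rec, changed = record_check(rec, "strip-0412.png", 5000)
    print(rec, changed)
```

A changed result is what would trigger a new RSS item; an unchanged one only moves the last-looked timestamp forward.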

It currently assumes all comics are roughly daily, so once it sees a new comic it won't re-check that site until 6 hours have passed. Sites whose last change is older than that are still not checked more than once an hour. This will get further tuning -- but note that even checking this often is much more efficient (i.e. costs the artist less) than checking with your browser, because it only grabs the top-level HTML page, which is usually relatively small, and doesn't grab any images or extra frames (with the exception of Helen, where the comic is actually on a secondary frame with a fixed name).
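
The polling rule above boils down to two time comparisons. Here is a minimal sketch of it, assuming timestamps are kept as seconds since the epoch; the function name should_check and the two constants are made up for the example and are not necessarily how comick.py spells it.

```python
HOUR = 60 * 60

# Both thresholds come from the description above; the names are hypothetical.
QUIET_AFTER_CHANGE = 6 * HOUR   # leave a site alone after a new comic
MIN_RECHECK = 1 * HOUR          # otherwise look at most once an hour


def should_check(last_looked, last_changed, now):
    """Decide whether a site is due for another fetch."""
    if now - last_changed < QUIET_AFTER_CHANGE:
        # A new comic showed up recently; assume no second one today yet.
        return False
    # Stale site: poll it, but no more than once per hour.
    return now - last_looked >= MIN_RECHECK


if __name__ == "__main__":
    now = 100 * HOUR
    print(should_check(now - 2 * HOUR, now - 2 * HOUR, now))    # changed 2h ago
    print(should_check(now - HOUR // 2, now - 30 * HOUR, now))  # looked 30min ago
    print(should_check(now - 2 * HOUR, now - 30 * HOUR, now))   # due for a look
```

With these numbers a daily comic costs its site a handful of small page fetches per day at most, rather than a full page load with images every time you wonder whether it has updated.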

comick.py
A parsed out list of the comics for linkage and direct browsing
comickpage.kid (the template for the list above)
The README serves as a changelog for now.