Simple twitter to zephyr gateway (one way, so I can get twitter messages in my zephyr client.) Now lives on github and the commit history there is probably more useful than the early dev history here.

ztwit: Wed Dec 10 00:15:00 2008

Wed Dec 10 00:15:00 2008

Got one negative comment about the flickr updates, decided to interpret it as "make them more interesting" rather than actually stopping them (after all, playing with the tools is more than half of what this is about - I'm not trying to find strangers here, or anything...)

The flickr updates now include as many tags as they can, sorted by popularity and length, so the latest one says [Concord,Great Meadows National Wildlife Refuge,Massachusetts,ice,sunshine,winter,marsh...] which perhaps gives the reader an idea of what pictures they'll find. (Granted, most of my readers are local, so knowing me and knowing the weather is probably sufficient to guess :-) Not realizing that python's sort is stable in 2.4 and later, I counted tags and used a key of tag_popularity[tag] - 1/float(len(tag)) (the value itself is the major key, putting everything into integer buckets, and the tag length reduces that - the shorter the tag, the more it reduces, so the closer to the start of the list of tags that length, but bounded between 1/1 and 1/longest-tag-len, approaching zero, so the values fit. Of course we want high popularity first, so this is a reverse sort, so we're actually favoring longer tags, based on them likely having more inherent detail.) This is what we call "excessively clever" - given a stable sort, we could just sort the list twice:

tags = sorted(tag_popularity, key=len)
tags = sorted(tags, key=tag_popularity.__getitem__)

and then reverse the whole thing. (Notice that sort-stability is the place where sorted(reverse=True) is different from reversed(sorted()) - the former reverses the direction of the current key, but preserves the order of things that were equal; the later reverses everything.)

Future directions could include

ztwit: Sat Nov 22 23:02:00 2008

Sat Nov 22 23:02:00 2008

Fixed another error handling case (urllib is ancient, and uses the poor practice of raising IOError with several different shapes of argument list, making it harder to catch cleanly.) Also implemented a simple posting client, so that I could call it from my KPhotoAlbum to Flickr uploader and announce when I had new pictures (something I sometimes announce on Zephyr too, but it seems slightly less appropriate to automatically post there, and I currently only do when there's something of note.)

Found an interesting article comparing the author/audience aspects of RSS vs. Twitter - it seems that the Twitter dynamic manages to be simultaneously more casual and more engaging, which perhaps deserves a little thought.


ztwit: Tue Nov 18 22:09:00 2008

Tue Nov 18 22:09:00 2008

On a whim, refactored to support @replies too; I'm not sure if the 100/hour applies to each interface or any, but "two every 90s" is still only 80/hour. (Also not sure if the change particularly works, since the one @-reference to me wasn't the first one in the message. It's rather straightforward, though.)

It's run quietly for two days now; I did get one 502 Bad Gateway which is apparently "returned if Twitter is down or being upgraded"; looks like all of the 5xx codes can be handled by sleeping a little extra rather than outright failing, so I've added that too.

ztwit: Sun Nov 16 17:02:00 2008

Sun Nov 16 17:02:00 2008

The API docs say "100/hour", "please use if-modified-since", "basic auth only", and "xml, json, rss, or atom."

json is first-class in later versions of python, so let's use simplejson for now. That gives us an array of twits, which are just dicts; looks like I want t["text"] for the body, t["user"]["screen_name"] for the fake sender, and ignore the rest for now; maybe add the reply, location, or user.name in the zsig later, and maybe filter on user.protected if that actually turns out to mean something.

Since etag/if-modified-since seem reliable enough, I won't bother stashing messages at all, just history.

Yay brute force. 27m total including some manual testing.