Worklog for raw wsgi album code.
Realized when half-awake this morning that the album-search cache was wrong at an entirely different level - reloading the album needs to invalidate all prior searches, whereas the code was only reloading the one search that was in progress when we noticed the base album was stale. This is also the first hint that, at 900 lines of code, I already need bugtracking; given my offline working mode, I should probably look at SD - having a bug list that can be checked in with the code makes a lot of sense, especially if it still supports real management tools and isn't just a free-form list.
Pushed the change out as 0.2a after testing. (Still needs to be made "pure functional with caching" to eliminate some other rarely-seen problems. Also need to think about adding this album to nagaina monitoring...)
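The invalidation rule above (a stale base album throws away every cached search, not just the one in flight) can be sketched roughly as follows. This is not the actual album code: the AlbumCache class, its load_album and run_search callables, and the use of the index file's mtime as the staleness check are all assumptions made for illustration.

```python
import os


class AlbumCache:
    """Hypothetical cache: one parsed album plus every search run against it."""

    def __init__(self, index_path, load_album, run_search):
        self.index_path = index_path
        self.load_album = load_album    # assumed callable: path -> parsed album
        self.run_search = run_search    # assumed callable: (album, keyword) -> results
        self.album = None
        self.album_mtime = None
        self.searches = {}              # keyword -> cached results

    def _refresh(self):
        """Reload the album if the index changed, dropping all dependent searches."""
        mtime = os.stat(self.index_path).st_mtime
        if self.album is None or mtime != self.album_mtime:
            self.album = self.load_album(self.index_path)
            self.album_mtime = mtime
            # Every cached search was computed from the old album, so none survive.
            self.searches.clear()

    def search(self, keyword):
        self._refresh()
        if keyword not in self.searches:
            self.searches[keyword] = self.run_search(self.album, keyword)
        return self.searches[keyword]
```

Keeping the album and its searches in one object makes it hard to reload one without clearing the other, which is roughly the bug described here.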
inchpebble 21: Took some pictures at a party, which triggered adding an external list of exported albums (conveniently mapping short names from urls to actual kphotoalbum keywords; eventually it'll map to full blown queries, but not soon.) That's enough to call this 0.2 and roll it out as the primary version, for the half dozen people who should see these pictures... did use one of the speculatively-written caching functions after all, though not on the code path I expected (the ad-hoc caching code that was hand-fixed earlier should use it, but doesn't yet.)
(This is 21, not 20, because the internal rewrite counts as an inchsomething... but I already used inchboulder, and it's bigger than that was.)
Violating my "inchpebble" model cost me several weeks (probably a factor of 4 of extra friction over actual coding time) but the major changes are integrated; there is now
Of course this turns out to be entirely internal - it doesn't change any visible functionality (though some obscure bugs have been fixed just by having to write doctests, they're mostly hidden from users in the old version) - so I can't quite bring myself to deploy it (though one could also argue that this is the best time to deploy it, since users won't notice.) I suppose I'll have new visible-features soon, though, once I get back to an inchpebble a day.
What really got me over the friction of folding in the broader set of changes, though, turned out to be "seeing it all at once" - I've done all of this development on the EEEpc, but 88x27 isn't really enough room to see structural changes across multiple files. The real breakthrough was sitting down and applying some 80's technology to the situation - setting up X forwarding between the EEE and the Thinkpad :-) M-x make-frame-on-display and I suddenly had a much more expansive view of the new code, and was able to put everything back together. It mostly meant that I needed to concentrate less, and could just look at the code; while it reduced some of the friction, I'm not sure it was actually for the best.
In any case, it'll now be easier to add some better album management and export-selection features, as well as some slightly better rendering; clearly marking "new pictures" looks like the most interesting short-term viewer (as opposed to publisher) feature.
Writing more tests as part of the caching structure. After about the third time I added a sleep(1) to allow the file timestamp to change within the test, I realized - I should just trigger the cache on length (and maybe inode number) too. The album application doesn't really need that, but once I push the caching under the abstraction of "documented interface" it should behave more intuitively, and that means "file changed" not merely "timestamp changed".
This brings up an interesting aspect of doctest - "hard to test" suddenly explicitly means "hard to document" because the tests are documentation. I've traditionally used "this is hard to document" as a valid reason to change an interface, but "hard to test" just means "work harder on the tests" - even when I'm writing them, I've traditionally not wanted the tests themselves to distort an interface simply due to weak testing infrastructure. Doctest binds these together much more clearly - not only are the tests serving as documentation, they're serving as realistic users of the API, which increases their legitimacy as potential critics of that API.
(Of course, it turns out that the function I was working on that triggered this rant really does have make(1)-like semantics, so it can only make reference to the contents on disk, not an in-memory cache; I'll have to consider using a stampfile to record the source-size and source-inode along with the generated file, but that's more clearly far enough ahead that I should defer it until I find an actual need for it.)
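For the in-memory case, triggering the cache on length and inode as well as timestamp might look like the sketch below. The names file_signature and FileCache are made up for this example, and parse stands in for whatever expensive load is being cached; none of this is the real helper code.

```python
import os


def file_signature(path):
    """Return (mtime, size, inode): a cheap stand-in for "has the file changed"."""
    st = os.stat(path)
    return (st.st_mtime, st.st_size, st.st_ino)


class FileCache:
    """Hypothetical cache of parse(path), keyed on the file's signature."""

    def __init__(self, parse):
        self.parse = parse      # assumed callable: path -> parsed value
        self._entries = {}      # path -> (signature, value)

    def get(self, path):
        sig = file_signature(path)
        entry = self._entries.get(path)
        if entry is None or entry[0] != sig:
            # Any of mtime, size, or inode differing counts as "file changed".
            entry = (sig, self.parse(path))
            self._entries[path] = entry
        return entry[1]
```

With this, a test can rewrite the file with different-length content and see the change without a sleep(1); a same-length rewrite within the timestamp granularity would still be missed unless the inode changes (as it does with write-to-temp-and-rename).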
Since I'd actually published a couple of albums with the 0.1 deployment, I've left that alone and cloned the deployment so I can test my main development line. I did find one bug in the 0.1 deployment, an artifact of caching the kphotoalbum lookups (which take up to a minute, the first time): when I extended it to support multiple albums/searches, I neglected to treat each album separately, so adding a new album would fail unless it was the first load after an index change. According to my logs, none (ok, "neither") of the users noticed it, but it's still embarrassing.
The fix was trivial - but it did underscore that even a couple of lines of ad hoc caching code was the wrong thing to do. Fortunately, the main effort in the 0.2 line has been to come up with some helpers and decorators that abstract out the rest of the web-app-related functions that I've been hardcoding. None of this is actually hard (though I have learned about some new (to me) python features like inspect.getargspec in the process) but getting the result to seem reasonable and not too "magic" is a bit more subtle. (Yes, this is the beginning of the "second system" slippery slope :-)
Note that the world probably doesn't need another python WSGI framework - but given my background, I'm not willing to use one without enough experience with the lower levels to be able to debug it (and to appreciate why those lower levels are there.) So far it's actually worked well in terms of allowing me to tune my apps, and get more perspective on how different the questions are when you're thinking in the web framework context.
Did a friendly test, which was much more successful than I expected (the tester in question is somewhat famous for finding 5 bugs in 5 minutes in a 100 line palm app that was ostensibly "done" :-)
inchstone 19: added video thumbnailing, and correct delivery of video/avi in /image. The tricky bit is that convert will take frame 0 if you add [0] to the end of the path... but still reads the whole AVI, so you really want to "cache" the results.
This inchstone let me expand the test to include someone to whom I happened to owe some pictures and video from a construction party; we'll see how that goes. (A long ago design feature for the album-generators I wrote on top of bins was to be able to automatically produce protected albums of "all the pictures containing you" for people I had pictures of; once I make a few structural improvements, I finally should be able to do that with this code...)
Finally realized that the principal name in this context should just be an instance of my primary, since I'm actually running the server on my behalf, not the machine's.
inchboulder 18: the required kerberos and AFS configuration to make this actually work. Actual commands below; particularly expensive (re)learnings are
Given all that, if it's still running in the morning, it might be time to show it to strangers :-)
inchpebble 17: momentum carried me through to implementing smallimage (page with next/previous nav and an inline smaller image) and smallrender (which actually produces that image, and caches it.) Next step is to actually run this under k5start, which involves picking a principal name and giving it a pts entry, which should really wait until I'm more awake :-)
inchpebble 14: determined that SCGI does allow HTTP_AUTHORIZATION through to the application, so we can trivially unpack basic auth. Needed to extend @check_http slightly to handle sending back error-type-specific headers; this would be useful for 301 as well, but @directory_like shouldn't rely on @check_http and nothing else needs 301 or 302 support yet.
inchpebble 15: construct an @authby decorator and an application-local authorizer function using the above; allows a simple acl file per keyword.
inchpebble 16: make the image renderer check the cached image list before allowing the fetch. (Implemented multiple cached image lists since we'll need them almost immediately.)
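Inchpebbles 14 and 15 together (unpack basic auth from HTTP_AUTHORIZATION, then gate an app with @authby and an application-local authorizer) could be sketched like this. The signatures are guesses: unpack_basic_auth, the authorizer(environ, user, password) calling convention, and the realm argument are assumptions, and the 401 response is built inline rather than through the real @check_http.

```python
import base64
from functools import wraps


def unpack_basic_auth(environ):
    """Return (user, password) from HTTP_AUTHORIZATION, or None if absent/malformed."""
    header = environ.get("HTTP_AUTHORIZATION", "")
    scheme, _, encoded = header.partition(" ")
    if scheme.lower() != "basic":
        return None
    try:
        decoded = base64.b64decode(encoded.strip()).decode("utf-8")
    except ValueError:
        # Covers both bad base64 (binascii.Error) and bad UTF-8 (UnicodeDecodeError).
        return None
    user, sep, password = decoded.partition(":")
    if not sep:
        return None
    return user, password


def authby(authorizer, realm="album"):
    """Hypothetical decorator: only call the WSGI app if authorizer() approves."""
    def decorate(app):
        @wraps(app)
        def wrapper(environ, start_response):
            creds = unpack_basic_auth(environ)
            if creds is None or not authorizer(environ, *creds):
                # The error-type-specific header: 401 must carry WWW-Authenticate.
                start_response("401 Unauthorized", [
                    ("Content-Type", "text/plain"),
                    ("WWW-Authenticate", 'Basic realm="%s"' % realm),
                ])
                return [b"authorization required\n"]
            return app(environ, start_response)
        return wrapper
    return decorate
```

An authorizer backed by a per-keyword acl file would plug in as the authorizer argument; how it maps the request to a keyword is left out here.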
inchpebble 11: cooked up a light wsgi-wrapping framework, consisting of an @url and an @check_http decorator, a "structured" dispatcher to use the url paths, and split that off into a file of its own. Learned that WSGI seems to be at just the right level to inspire such framework-building. Also learned about functools.wraps which is new in python 2.5 and very nice for making decorators "play nice" with the existing function's attributes.
inchpebble 12: made the keyword an argument (for now it's a direct search key, but by production it needs to be a lookup in a publish/acl table.)
inchpebble 13: made a @directory_like decorator to tag functions that generate html and need relative links to work.
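As an illustration of why functools.wraps matters for this style of decorator: it copies the wrapped function's name, docstring, and attribute dict onto the wrapper, so a tag set by one decorator survives being wrapped by another. The bodies below are guesses at the general shape only; in particular, having check_http catch a hypothetical HTTPError exception is an assumption, not how the real decorator necessarily works.

```python
from functools import wraps


class HTTPError(Exception):
    """Hypothetical exception carrying a status line and extra headers."""

    def __init__(self, status, headers=()):
        Exception.__init__(self, status)
        self.status = status
        self.headers = list(headers)


def directory_like(func):
    """Tag func as generating html whose relative links need a trailing-slash URL."""
    func.directory_like = True
    return func


def check_http(func):
    """Assumed behaviour: turn a raised HTTPError into a plain-text WSGI response."""
    @wraps(func)  # copies __name__, __doc__, and func.__dict__ onto wrapper
    def wrapper(environ, start_response):
        try:
            return func(environ, start_response)
        except HTTPError as err:
            headers = [("Content-Type", "text/plain")] + err.headers
            start_response(err.status, headers)
            return [err.status.encode("ascii") + b"\n"]
    return wrapper


@check_http
@directory_like
def album_page(environ, start_response):
    """Render an album (placeholder body for the sketch)."""
    raise HTTPError("404 Not Found")
```

Without @wraps, album_page.directory_like would be lost behind the wrapper and a dispatcher inspecting the tag would never see it.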
In terms of visible function, this is only superior to inchpebble 8 in that albums have names - and album.py is only 25% shorter (though much more of it is actually about album generation and less about HTTP than before, I think there's more room to improve that without obscuring what's going on.)
Next steps (since I'm offline for a little while, not long enough to actually work on it...)
inchpebble 9: use extend rather than append to declutter the hand-synthesis of html (I'm nowhere near wanting to template anything, and particularly don't want any of it separate from the code - the whole point is to know everything that's going on...)
inchpebble 10: figured out that WSGIDaemonProcess is pretty flexible but apparently not enough so to be able to run the daemons under k5start. That means I should probably be running it under flup, in fcgi or scgi mode...
At least we know that
works, and can use that to run the server.
inchpebble 11: switched to flup.server.scgi for running the process; kludged the apache config to avoid the global / alias and just aliased the pieces by hand (eww, but it passes the site-wide testsuite.) PATH_INFO doesn't work in scgi; switched to SCRIPT_NAME because it was there (but need to dig up some real docs and pick something sane.) Also added a file-not-found handler for the image fetcher.
This could be run with k5start, but it needs a little more thought. (At least having it under flup gets better tracebacks, and I can just kill it off when I'm done hacking...)
inchpebble 4: implemented a classic printenv application function to see what values were handy in the environ arg.
inchpebble 5: implemented trivial url-routing based on a dict and longest-match, so I could call the printenv app and still call the album generator.
inchpebble 6: implemented foo to foo/ redirection, called automatically by the trivial url-router if the name would match with an appended slash; fixes a whole class of url-construction problems up front.
inchpebble 7: changed album to list all of the returned results in a list; faked up direct image links and a 404-handler for them.
inchpebble 8: implement a trivial file delivery function (for WSGI, this just means coming up with a Content-Type and returning the open file handle as the content.) Needs safer pathname construction, better handling of File Not Found, and real (non-hardcoded) content type matching.
(This serves as the first almost end-to-end does-its-job test. It's still working with local copies of the index and images, since it doesn't have tokens yet... also doesn't have any access control...)
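The dict-based longest-match routing from inchpebble 5 and the foo to foo/ redirection from inchpebble 6 might be sketched as below. make_router and the route table are invented for the example, and it assumes the path arrives in PATH_INFO (which depends on how the server hands the request over).

```python
def make_router(routes):
    """Build a WSGI app from a dict mapping URL prefix -> WSGI app (hypothetical)."""
    def router(environ, start_response):
        path = environ.get("PATH_INFO", "")
        # foo -> foo/ redirection: only when appending the slash names a route.
        if path not in routes and path + "/" in routes:
            start_response("301 Moved Permanently", [
                ("Location", environ.get("SCRIPT_NAME", "") + path + "/"),
                ("Content-Type", "text/plain"),
            ])
            return [b"moved\n"]
        # Longest match: try the longest prefixes first.
        for prefix in sorted(routes, key=len, reverse=True):
            if path.startswith(prefix):
                return routes[prefix](environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found\n"]
    return router
```

Doing the redirect in the router means every app registered under a trailing-slash prefix can emit relative links without worrying about how the browser resolves them.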
inchpebble 0: got a "hello world" mod_wsgi app running. (Learned more about modern apache config and "Alias" vs. "WSGIScriptAlias", namely that the former actually works if you're using "Alias" for other things already...)
inchpebble 1: got it to import kimdaba_album and look up a list of pictures based on a single keyword. (Learned about mod_wsgi and module reloading, namely that it doesn't happen by default - the application itself is special, and not actually a module. This is ok, since ultimately I need to run the app as an external k5start'ed process anyway...) (Also learned that loading the album is slow and should be cached...)
inchpebble 2: cached the album-parse and the search (which also turned out to be slow.) Hardcoded for one search now, will parameterize (and memcache) later.
inchpebble 3: extract the label, description, and path from the image. Tomorrow we can try rendering a reference to it...