README: 2004 Apr 11 - 2008 Dec 10

Description

Blogging-related code - RSS generators, parsers, and the make_readme_blog tool itself.

Wed Dec 10 00:02:00 2008

I've gone another year without significant change to the blogging software. I did just add http://rpc.weblogs.com pinging to go along with the year-old google-pinging support. (Embarrassingly, the only difference between the two is the noise they return on success, and I just print that anyway; I was reminded because it showed up on http://scripting.com recently.)

I have been putting a little time into using Sphinx to build one of my documentation/ranting sites, though, because Sphinx sites actually look pretty nice as-is, and maybe it's time to clean up the look around here a bit (especially code display.) It looks like I'll be able to use the existing parsers to do at least a rough draft automatic conversion to RST, which is nice. I'm seriously considering WikiCreole as well, but it looks like RST still wins on tools.

Fri Dec 28 00:52:00 2007

http://www.google.com/help/blogsearch/pinging_API.html now lists a REST interface, much simpler than the XMLRPC one. It's easier to just have the top-level Makefile invoke it, instead of making it part of the aggregator, so feedping.py is standalone for now, not even usefully importable. Probably just as well; the feed url identification is a bit of a kludge...

Thu Feb 22 20:52:00 2007

I suppose it's a good sign that I've gone over three months without tweaking the blogging software, though I haven't done a huge amount of blogging in that time either :-)

In any case, I'm doing more code, so I need moinmoin's three-curly-brace codesetting syntax... but first, needed to implement a basic regression test, so I can make the changes without fear.

Tue Nov 28 23:50:00 2006

kcr noticed that the comment links were broken. Turns out that my method of generating them was entirely non-functional. New approach: use the nearest comments.html file, crawling upwards from the current directory. If you don't find one, don't allow comments. Much clearer.
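A minimal sketch of the "nearest comments.html" lookup, not the actual make_readme_blog code. The function name, its arguments, and the idea of stopping at a site root are all assumptions made for illustration.

```python
import os

def find_comments_page(start_dir, root_dir, name="comments.html"):
    """Return the path of the nearest `name` at or above start_dir, or None.

    Hypothetical helper: the search stops at root_dir so it never
    wanders outside the site tree.
    """
    current = os.path.abspath(start_dir)
    root = os.path.abspath(root_dir)
    while True:
        candidate = os.path.join(current, name)
        if os.path.isfile(candidate):
            return candidate
        parent = os.path.dirname(current)
        # Give up at the top of the tree (or at the filesystem root).
        if current == root or parent == current:
            return None
        current = parent
```

A None result is the "don't allow comments" case: the caller would simply leave the comment link out of that item.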

Tue Nov 28 00:40:00 2006

In the same spirit, but not particularly blog-related: the various index.html files in this tree are hand generated, which has always bothered me. Now that we have per-README descriptions, those can be collected up into index files directly (and existing index files can be mined for descriptions.)

Looks like the main trick will be picking the "right" file, which isn't always README, it is sometimes NOTES... hmm, ok, only in one case... the clear answer is to just rename that case so README really is the right thing, even if this is still in CVS and renames aren't really available. (I suppose I can resurrect my cvsrename hack, or I can just copy the files and not care especially much - they're inherently self-versioned anyway.)

Thu Nov 16 19:27:00 2006

More suggestions from readers: add a label to the feed title (for the aggregation), add a visible hint (rather than just the date) as to where the individual aggregated items came from (implemented as a directory name based "category".) Sorry about everything looking changed, though since the guids haven't changed it shouldn't be that dramatic...

Still to come - the previously described per-feed description, and some (optional) syntax for article titles...

Sat Nov 11 21:21:00 2006

Now that I'm starting to add articles outside the python space, I think there needs to be an (optional) top level description, particularly for the generated HTML page, though I'm using the first post for that now. Perhaps "any content before the first item" is enough, and I can treat it as an item in every other way.

update: done as a non-stamped initial item. 20 lines of code.
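Roughly what "a non-stamped initial item" could look like in a parser, as a sketch only. The stamp regex, the split_items name, and the (stamp, body) tuple shape are assumptions, not the real make_readme_blog internals.

```python
import re
import time

# Matches stamp lines such as "Thu Nov 9 02:20:00 2006".
STAMP_RE = re.compile(
    r"^[A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}$")

def split_items(text):
    """Split README text into (stamp, body) pairs.

    Anything before the first stamp line becomes an item whose stamp is
    None; otherwise it is treated like every other item.
    """
    items = []
    stamp, body = None, []

    def flush():
        # Skip an empty unstamped prefix, keep everything else.
        if stamp is not None or any(line.strip() for line in body):
            items.append((stamp, "\n".join(body).strip()))

    for line in text.splitlines():
        if STAMP_RE.match(line.strip()):
            flush()
            stamp = time.strptime(line.strip(), "%a %b %d %H:%M:%S %Y")
            body = []
        else:
            body.append(line)
    flush()
    return items
```

The html and rss writers can then check for a None stamp and skip the date, which keeps the special case to a line or two.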

Thu Nov 9 02:20:00 2006

I don't really want to launch right into a comment system, so I set up a comment page that just tells readers where to email. It's linked in to the rss feed, as rss.channel.item.comment, but I don't know if any readers actually support that; eventually I'll add it to the html pages as well, I suppose.

In the meantime, you can also go directly to http://www.thok.org/intranet/python/comments.html for instructions on how to comment.

update: kcr points out that liferea, at least, uses the comment element.

Sun Nov 5 18:45:00 2006

Added support for a "localfile" pseudo-url type, to allow easy reference to files in the same project directory.

Sadly, relative hrefs don't work from html-in-rss, at least in akregator (and I want it to work there, regardless of what it does elsewhere, and there's no spec to point at in a bug report.) So, make them absolute when generating RSS, since we have a baselink around anyway.

Come to think of it, static_aggregator isn't (and shouldn't be) smart enough to fix the relative references when "lifting" the items into the aggregated feed, so it's more useful to just generate them as fully qualified urls anyway.
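A sketch of making hrefs absolute against a baselink, assuming the item html only uses double-quoted href attributes. The absolutize_hrefs name and the example base URL are made up for illustration; urljoin is the standard library call.

```python
import re
from urllib.parse import urljoin

# Assumption: generated html always writes href="..." with double quotes.
HREF_RE = re.compile(r'href="([^"]*)"')

def absolutize_hrefs(html, baselink):
    """Rewrite every double-quoted href in html relative to baselink."""
    def fix(match):
        # urljoin leaves already-absolute urls untouched.
        return 'href="%s"' % urljoin(baselink, match.group(1))
    return HREF_RE.sub(fix, html)

example = absolutize_hrefs(
    '<a href="feedping.py">feedping</a>',
    "http://www.example.org/python/blogthing/")
```

Doing this at generation time means the aggregator can copy items between feeds without knowing where they came from.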

Sat Nov 4 00:25:00 2006

Learning of the day - if you're lazy and use stdout for a generated file, and then you add "commit to cvs" options... and can't tell if it works because you don't see cvs output, don't relax when you get the commit mail. "There Are No Mysteries". And rss doesn't parse so well when it starts with "new revision: 1.12; previous revision: 1.11"...

(Hmm. I should rant on There Are No Mysteries. It's not strictly python, but I'm starting to think that since all of my dev talks will be under python anyway, rather than recategorize up a level - in the words of Arlo Guthrie, "and rather than bring that one up we decided to throw ours down.")

update 2006-11-11: done, see http://www.thok.org/intranet/python/rants/README.html

Thu Nov 2 00:17:00 2006

I've been in the habit of using make (and cut&pasted Makefile fragments) to define html- and rss-building targets, mostly because these directories are primarily build directories for their projects. Since the point of static_aggregator is to pull in common rss files from a pile of directories, it already knows how to find them and probably update them. Fixing that is my first http://codemonth.org project, I think.

I will note that while trying to make the use of make more clever, I found http://www.cmcrossroads.com/content/view/6507/120/ and a bunch of other make articles on the same site, which led to the GNU Make Debugger http://gmd.sf.net and the GNU Make Standard Library http://gmsl.sf.net which would have been really useful... though not in a good way... last decade. These days I actually understand that if you're being that clever in Make, you're probably doing something wrong :-)

update: turned out to be about an hour to add --update to static_aggregator directly; as a pleasant side effect, since it is generating the baselinks directly from the treewalk, it gets them right, where at least one of them was wrong in my cut-and-pastes :-}

Tue Oct 31 02:03:00 2006

Python has rfc822.formatdate and rfc822.parsedate... which make it trivial to

Thus, static_aggregator. (Since it emits rss as well, this led to a bunch of refactoring of make_readme_blog; they now share rss generation code.) Currently only generating a python feed, and that only from two projects (blogthing and nagaina) but that'll expand rapidly.
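A sketch of the sort of date handling an aggregator might need, using email.utils, where current Python keeps the equivalents of the old rfc822 functions. The helper names and the (pubdate, title) pair shape are hypothetical, and whether static_aggregator orders items this way is an assumption.

```python
from email.utils import formatdate, parsedate_tz, mktime_tz

def newest_first(items):
    """Sort (pubdate_string, title) pairs by parsed date, newest first."""
    return sorted(
        items,
        key=lambda item: mktime_tz(parsedate_tz(item[0])),
        reverse=True)

def stamp(epoch_seconds):
    """Format an epoch time as an RFC 822 style date in GMT."""
    return formatdate(epoch_seconds, usegmt=True)
```

Parsing to epoch seconds first means items from feeds with different timezone offsets still sort correctly.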

Mon Oct 30 01:31:00 2006

Implemented the html items below. Next:

Also, now I'll start making formatting changes to the readmes, since I've reached the point of diminishing returns in terms of adhoc parsing. A lot of what's left really is ambiguous, so it's easier to just fix it, and at least this particular readme simply isn't going to need much of that.

Mon Oct 30 00:33:00 2006

While working on make_readme_blog, I came upon the need to pairwise iterate through a list. itertools lacks anything for this, but instead of stopping with the pydoc like I usually do, I found http://www.python.org/doc/current/lib/itertools-recipes.html and the definition of example grouper. It is interesting in that it works simply using izip(a,a) - and the side effect that reading an iterator mutates it, so the second read gets the second value.

The dangerous bit is that this only works on real iterators... it doesn't work on lists at all (it silently doubles up the values.) Still a good trick, but if those recipes ever get "promoted" to batteries, it needs some defensive exceptions...
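A small illustration of both behaviours, written with Python 3's built-in zip standing in for izip. The pairwise_chunks name is made up; calling iter() first is one way to get the defensive behaviour.

```python
def pairwise_chunks(iterable):
    """Yield non-overlapping pairs: (s0, s1), (s2, s3), ..."""
    it = iter(iterable)  # one shared iterator, so each read advances it
    return zip(it, it)

values = ["a", 1, "b", 2]

# Through a shared iterator: the second slot gets the *next* value.
pairs = list(pairwise_chunks(values))

# Straight on the list: each zip argument gets its own iterator,
# so every value is silently doubled up instead.
doubled = list(zip(values, values))
```

Here pairs comes out as [("a", 1), ("b", 2)] while doubled is [("a", "a"), (1, 1), ("b", "b"), (2, 2)], and neither call raises anything.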

Sun Oct 29 15:27:00 2006

Gee, 2.5 years later and I still haven't done anything with this. However, I haven't stopped writing README files like this, semi-stylized but mostly with enough explicit structure to break things into individual items. make_release_rss points out some nice ways to make RSS (and html) using cElementTree which would be useful here too...

So, taking the "simplest thing that could possibly work" approach, I set a goal of turning this README file into a blog. (It turns out there are only 28 of these anyhow, which limits the amount of effort necessary... the point of diminishing returns on the parser is the point at which it's cheaper to tweak the file, as long as the tweak is something that I can keep a consistent part of my habits when writing these.)

As of now, make_readme_blog produces:

Next items:

Wed Aug 11 00:02:00 2004

Just finished implementing a refactoring of extract_stuff into "incremental" components, with completion-stamp files. Suddenly building date-range files is easier...

Sun Aug 8 23:52:00 2004

how about at least fixing extract_stuff: extract txns and use a datestamp to know when to stop. then build that into the rss feed. faster, no?

Sun Aug 1 06:02:00 2004

think about turning this around - start with a basic blog entry, as rich as I want, then an rss-generator, then an rss->html tool that makes split pages?

Wed Jul 14 14:16:00 2004

http://java.sun.com/developer/technicalArticles/Interviews/bray_qa.html

bray talks about "rss feed to bank account". That sounds stupid at first, but perhaps just argues for a class distinction - critic-feeds, vs. casual-feeds.

Sat Jun 26 19:21:00 2004

another workflow: web browser to zephyr (sort of linkblogging, but includes a quote (with charset fixes) and comments)

Mon May 24 04:09:00 2004

Some more prototyping done. Perhaps I really need a meta-form mechanism...

Wed May 12 02:26:00 2004

relayer it with rsswriter in mind. (oops.) then generate at least a daffodil-blog from it

Sun May 9 03:24:00 2004

Finally have a working abstraction-based trackforward generator, to a first approximation. It needs work, and I have to think through the stamping, but it generated a page, at least. Yay.

Thu May 6 01:49:00 2004

searchstuff has worked moderately well, in that it has helped me find a few things I'd been looking for, but sometimes it's very noisy. trigrams may not be suitable at all, and I should start indexing words and wordpairs.
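For comparison, a sketch of the two kinds of index keys. This is not searchstuff's code; the function names and the tokenizing regex are assumptions.

```python
import re

def trigrams(text):
    """Every 3-character window of the lowercased text."""
    s = text.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)}

def words_and_pairs(text):
    """Lowercased words plus adjacent word pairs."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    pairs = {" ".join(pair) for pair in zip(words, words[1:])}
    return set(words) | pairs

# "remake" and "makefile" share trigrams ("mak", "ake") but no words,
# which is one plausible source of noisy matches.
shared = trigrams("remake") & trigrams("makefile")
```

Word and wordpair keys only match on whole tokens, at the cost of missing partial-word hits that trigrams would catch.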

Andras' talk on positive evidence (http://www.kornai.com/Papers/vizsla.pdf, the Hungarian Web paper) is something I should review, along with Maciej's vector space search engine paper. Can I do naive autotopic'ing with cosine-vector collation?

Sun May 2 14:10:00 2004

remember to make a server out of stuffthis, and cvs it

Sun Apr 25 04:58:00 2004

start with the data already in bbr, and upconvert to xml/rdf (snarf book back from office.)

also do an outright bins replacement. Think about dependencies and rules; think about high level descriptions of what's done now -

Also figure out what blogs the galleries get announced in (how about an autoblog of changes, or at least grazing off of galleries? "new flowers", "new hardware" and such, picture or album level?)

RDF thoughts: we want to normalize locations, maybe that will do it (or maybe these fields should just have a reference tag for common things, like description/event/location. people too?)

Sat Apr 24 04:52:00 2004

padml looks way too adhoc after seeing rdf, but perhaps an rdf profile would serve the same purpose. Also, chronologically close pictures are probably spatially close, if not topic/project.

Fri Apr 23 04:07:00 2004

Generics:

Wed Apr 21 13:03:00 2004

Some good comments from cfox led me to realize that these pages all have "title" fields which I'm already capturing, and those work much better than the "home" link parsing.

Added a debug mode for trying stuff out locally without copying, but default is still to publish.

Tue Apr 20 23:30:00 2004

announced (last night) on lj, thok/bloggery has the html and rss. A few friends and googlebot have picked it up - so it is already #8 for "trackforward" on google :-)

A simple next project would be to do an apache log-munger that generated categorized streams (and could be taught about new categories), out of which something referrer-based might be built...

also need a picture blogthing. and another attempt at using albumshaper...

Tue Apr 20 00:15:00 2004

ok, ok, cleaned up the html, now generates two files, I can scp them somewhere for now.

Mon Apr 19 23:55:00 2004

extract_stuff now generates crude html and valid rss. Should clean up the html, and figure out a way to "publish" the whole thing, reliably...

Sat Apr 17 22:14:00 2004

another use case: turn the trackforward capturer into a blog!

another thought: aap fails to solve the building problem in a different way than make does.

Sun Apr 11 12:59:00 2004

After looking at pyllbox, I want something that I extend in different directions, with more underlying power - but I also want something soon, so it's time to start cracking.

concepts:

use cases: