README: 2007 Feb 27 - 2008 Jan 14
Description
femtocomment: An attempt at a "least reasonable" comment system. Anonymity being a priori "not reasonable"...
Mon Jan 14 22:47:00 2008
The README-blog has enough content that it is starting to pick up
traffic about random topics, even with only email and zephyr
backchannels. femtocomment got sidetracked into being an OpenID
playground, which made sense a year ago, but looking at
http://openid.net/ today, it's a lot more real, and I should start
focusing on the comment side of the story again.
(I'll probably start with something that works off of a simple
hand-set cookie just to get it off the ground, but that's not
inconsistent with having an OpenID interface, it's just a shortcut
path.)
Inputs:
- the comment itself
- the authentication metadata
- the context of the comment
- replying to what posting
- (later) replying to what comment
- when
Outputs:
- rendering of comments (threads) on each topic page
- global comment-review page
  - site-wide, for the administrator (me)
  - comment deletion
  - cookie revocation / (later) id blocking
- (later) RSS feeds for comment threads you're in
- (later) email for comments
Constraints:
- Can't be generated offline the way the blog is
  - README-blog can insert a "talk to the service" bit, though
  - maybe README-blog can pull static copies of threads, so the service bit is only the freshest stuff?
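A minimal sketch of how the inputs and outputs above might fit together, assuming an in-memory store. Nothing here exists yet: the names Comment and CommentStore, the field names, and the choice to key revocation on the author string are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class Comment:
    body: str                        # the comment itself
    author: str                      # authentication metadata (cookie id or OpenID URL)
    posting: str                     # which posting this replies to
    parent_id: Optional[int] = None  # (later) which comment this replies to
    when: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class CommentStore:
    """Hypothetical in-memory store; a real service would persist this."""

    def __init__(self):
        self._comments: Dict[int, Comment] = {}
        self._next_id = 1
        self._revoked = set()

    def add(self, comment):
        if comment.author in self._revoked:
            raise PermissionError("author has been revoked")
        cid = self._next_id
        self._next_id += 1
        self._comments[cid] = comment
        return cid

    def thread(self, posting):
        """Comments for one topic page, oldest first."""
        items = [(cid, c) for cid, c in self._comments.items() if c.posting == posting]
        return sorted(items, key=lambda pair: (pair[1].when, pair[0]))

    def review(self):
        """Every comment on the site, newest first, for the administrator."""
        return sorted(self._comments.items(),
                      key=lambda pair: (pair[1].when, pair[0]), reverse=True)

    def delete(self, cid):
        self._comments.pop(cid, None)

    def revoke(self, author):
        """Cookie revocation: refuse further comments from this author."""
        self._revoked.add(author)
```

The static-copy idea would then amount to README-blog calling something like thread() at generation time and leaving only newer comments to the live service.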
This is probably enough to start slinging bits around and see what it
looks like :-)
Footnotes:
Sun Apr 8 03:08:00 2007
Some random browsing (specifically, a mention of Mark Novak from
Microsoft doing a lightning talk at Ignite Seattle on OpenID Security,
via planet python, and my subsequent attempts to find things about it)
led me to notice the January 2007 "Implementors Draft 11" of OpenID
2.0.
This draft fixes a lot of my complaints about the form of the 1.1 spec:
- it defines btwoc up front (though regrettably it still uses it)
- the terms "smart mode" and "dumb mode" are gone
- a much more convincing Security Considerations section
All in all a much better effort at "phrasing your RFC in the form of an RFC."
Given that, I think it's worth the effort to discard the 1.1
experiments and try again with 2.0draft11 versions; given the flaws in
some of the 1.1 paths I clearly don't want to support it anyway, so
I'm not losing anything by only looking at 2.0 from here on out.
Thu Mar 29 01:09:00 2007
Today we grind through "association". This sets up a shared secret
between the Consumer and the Identity Provider, up front. It's not
really spontaneous, since you don't actually know of the existence of
the Identity Provider until you get some user to tell you about it,
though 3.4.1 points out that you can cache them, and they do expire.
Looks like checkid_immediate always results in another browser
redirect, to the approve.bml page, but that may be specific to
the livejournal implementation (and kind of makes sense in the "must
approve" case, to redirect instead of just showing the query, for
implementation rather than spec reasons.)
More spec flaws:
- btwoc is defined after the three times it is used.
- also, base64(btwoc) seems like a silly representation, when "decimal integer" would do just fine - there's no particular point in making these values compact, especially when they're just not that big and they're surrounded by other HTTP verbosity.
- google(btwoc -openid) yields nothing sensible, which is a bad sign.
- the "plaintext shared secret" (mac_key) is discussed in 4.1.2 and 4.1.3, but its use is never explained.
- it helps to ignore the names of the fields, and follow the references to secret(assoc_handle)
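For reference, a sketch of btwoc as I read it (big-endian two's complement, in the fewest bytes that keep the sign bit clear), next to the plain decimal form it is being compared against. The function names are illustrative, and this is written for Python 3.

```python
import base64


def btwoc(n):
    """Big-endian two's complement of a non-negative integer, minimal length."""
    if n < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    # bit_length() // 8 + 1 always leaves room for a zero sign bit up front.
    return n.to_bytes(n.bit_length() // 8 + 1, "big")


def unbtwoc(raw):
    """Inverse of btwoc."""
    return int.from_bytes(raw, "big", signed=True)


for n in (0, 127, 128, 255, 32768):
    wire = base64.b64encode(btwoc(n)).decode("ascii")
    print(n, repr(btwoc(n)), "base64:", wire, "decimal:", str(n))
```

Note the extra leading zero byte for 128 and 255: that is the "two's complement" part, and it is exactly the detail a decimal integer would not need.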
More useful python tools:
- hmac and sha modules
- base64 as a string codec
Apparently I've just executed "smart mode" ("pre-associated", so I
could use the result of checkid_immediate directly) without
doing any further actual crypto - just verifying an HMAC using a
plaintext key. This is vulnerable to a "modern" variant of the
Zanarotti attack (or "over-shared secrets"):
- attacker spoofs, or even forces a pre-caching of, a mac_key for this Identity Provider
- attacker then spoofs an openid.sig response signed with that known mac_key
If we assume the attacker isn't a full MITM - can't disrupt
connections initiated by others, or spontaneously, but can interpose
responses to connections it instigates - the attacker can
- watch the first one go by
- wait for expires_in to pass
- then send a new request for that Identity Provider and forge a response with a known mac_key
- DNS spoofing is an easy way to do this
- ignore the redirect and instead synthesize a 4.2.2.3 response signed in the known mac_key
- the Consumer now verifies the response and accepts it, letting the attacker impersonate the target user.
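To make the last two steps concrete, a sketch assuming (as I read the 1.1 spec) that openid.sig is the base64 of an HMAC-SHA1 over "key:value" lines for the fields named in openid.signed. The helper names, the key, and the URLs are all made up for illustration.

```python
import base64
import hashlib
import hmac


def token_contents(fields, signed):
    """One 'key:value' line per signed field, in the order openid.signed lists them."""
    return "".join("%s:%s\n" % (name, fields["openid." + name])
                   for name in signed.split(","))


def sign(mac_key, fields, signed):
    token = token_contents(fields, signed).encode("utf-8")
    digest = hmac.new(mac_key, token, hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def consumer_accepts(cached_mac_key, response):
    """What a pre-associated Consumer does: recompute the HMAC and compare."""
    expected = sign(cached_mac_key, response, response["openid.signed"])
    return hmac.compare_digest(expected, response["openid.sig"])


# Hypothetical: the key the attacker got the Consumer to cache in plaintext.
known_mac_key = b"attacker-chosen-secret"

forged = {
    "openid.mode": "id_res",
    "openid.identity": "http://victim.example.com/",
    "openid.return_to": "http://localhost:8000/",
    "openid.signed": "mode,identity,return_to",
}
forged["openid.sig"] = sign(known_mac_key, forged, forged["openid.signed"])

print(consumer_accepts(known_mac_key, forged))         # verifies under the planted key
print(consumer_accepts(b"some-other-secret", forged))  # fails under any other key
```

The verification itself is sound; the problem is only that the key it trusts arrived in the clear.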
Note that this isn't a flaw in OpenID by itself - just an indication
that 4.1.3 needs to explicitly forbid using plaintext-association
mode, or perhaps that mode should be removed from the spec, as it cannot
really be said to provide repeated authentication, which is the
whole point.
Next time, we move on to association with openid.session_type
set to DH-SHA1 and see how it does against weak-MITM.
Wed Mar 28 01:32:00 2007
The Kathy Sierra mess got me to put some more effort into this (after
not touching it for a month.) I've successfully, if crudely,
"Consumed" (via the "dumb" path) a livejournal-based OpenID identity.
It took a while to grok the workflow, partly because the spec is weak
on actually defining terms consistently (it makes one appreciate the
effort that goes into real RFCs that much more.) Specific flaws include:
- talking about "smart mode" and "dumb mode" (3.4) without ever clarifying which path through the protocol is smart or dumb
- using smart/dumb instead of more descriptive terms ("pre-associated" vs. "stateless" seems accurate, but I'm not yet sure...)
- leaving holes like allowing openid.signed to include mode (4.4.1 simply says you MUST copy the value, but if you do, the Identity Provider won't have any idea what request you're making.)
- not clarifying 301 vs. 302 for the fundamental redirects used
- not spell-checking the spec (the description of openid.signed is reliably misspelled)
That said, there is a good collection of python tools to handle the pieces:
- HTMLParser with a custom handle_starttag is a good http-equiv extractor, useful for pulling out X-XRDS-Location
- cElementTree.iterparse to actually parse the XRDS data (to then extract openid.server)
- urllib.urlopen out of laziness - I should wrap it in something to protect against the Consumer-fetch risks in 3.3.1, later
- urllib.urlencode to pack the request arguments
- BaseHTTPServer.BaseHTTPRequestHandler to handle getting the browser-redirected responses back, and send 302's to crank the protocol along
- SocketServer.ForkingMixIn and BaseHTTPServer.HTTPServer to actually serve the requests. (Yes, http://localhost:port/ can be a Consumer)
- urllib.splitquery and cgi.parse_qs to extract response fields (like openid.user_setup_url)
- subprocess.call(["firefox", "-remote", "openurl(%s,new-tab)" % target_url]) to pass the initial bits to firefox
In "smart" ("pre-associated") mode I'd also use hmac.new, among others.
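As an illustration of the first tool in that list, a sketch of the http-equiv extractor. It is written for Python 3, where HTMLParser lives in html.parser; the class name and the sample page are made up.

```python
from html.parser import HTMLParser


class HttpEquivExtractor(HTMLParser):
    """Collect <meta http-equiv=... content=...> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.http_equiv = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)  # the parser lowercases attribute names
        name = attrs.get("http-equiv")
        if name and attrs.get("content") is not None:
            # Header names are case-insensitive, so normalize the key.
            self.http_equiv[name.lower()] = attrs["content"]


page = """<html><head>
<meta http-equiv="X-XRDS-Location" content="http://example.com/yadis.xrds">
</head><body>hello</body></html>"""

parser = HttpEquivExtractor()
parser.feed(page)
print(parser.http_equiv.get("x-xrds-location"))
```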
So in stateless/dumb mode, it looks like the "tracer bullet" path is
- look for the XRDS header in the user-supplied URL
- find the openid.server in that data block
- cheat and don't look for delegation info
- construct a checkid_immediate (4.2.1) query, pointing at a localhost url
- leave out assoc_handle, so we do the "dumb" instead of "smart" path
- leave out trust_root because the Identity Provider will fix it up for us
- feed the constructed url to the browser
- do a single handle_request to get the first response, and redirect it back because it has an openid.user_setup_url (4.2.2.2)
- do another single handle_request and actually find a signed response in it (4.2.2.3)
- pull out the signed fields and construct a check_authentication request (4.4.1)
- as mentioned above, do not copy openid.mode
- POST the request and read the response, seeing is_valid:true\n\n
- mutate the openid.sig field and try again, verifying the is_valid:false\n\n response, to show that it is doing something
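The request-building parts of that path can be sketched roughly as follows. This is Python 3 (urllib.parse stands in for urllib.urlencode and cgi.parse_qs), the function names are hypothetical, and the network steps (the browser hand-off and the POST) are left out.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def checkid_immediate_url(server, identity, return_to):
    """The dumb-mode query: no assoc_handle and no trust_root."""
    query = urlencode({
        "openid.mode": "checkid_immediate",
        "openid.identity": identity,
        "openid.return_to": return_to,
    })
    separator = "&" if urlsplit(server).query else "?"
    return server + separator + query


def response_fields(path):
    """Pull the openid.* fields out of a browser-redirected request path."""
    fields = parse_qs(urlsplit(path).query)
    return {k: v[0] for k, v in fields.items() if k.startswith("openid.")}


def check_authentication_body(fields):
    """Copy the signed fields back, replacing openid.mode instead of copying it."""
    body = {
        "openid.mode": "check_authentication",
        "openid.assoc_handle": fields["openid.assoc_handle"],
        "openid.sig": fields["openid.sig"],
        "openid.signed": fields["openid.signed"],
    }
    for name in fields["openid.signed"].split(","):
        if name != "mode":
            body["openid." + name] = fields["openid." + name]
    return urlencode(body).encode("ascii")


def is_valid(reply):
    """Parse the key:value reply to check_authentication."""
    pairs = dict(line.split(":", 1) for line in reply.splitlines() if ":" in line)
    return pairs.get("is_valid") == "true"


print(checkid_immediate_url("http://idp.example.com/server",
                            "http://user.example.com/",
                            "http://localhost:8000/"))
```

The mutate-and-retry check is then just a matter of changing fields["openid.sig"] before building the body and confirming that is_valid() flips.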
One reason that "stateless" may be the wrong name for this mode is
that the check_authentication pass does involve the Consumer
waiting around for a response from the Identity Provider - still
stateless in the traditional "web app" sense in that you didn't need
to preserve anything across User-Agent to Consumer transactions (http
requests), but you still need to handle waiting for the Identity
Provider, and in an asynchronous system that's going to be explicit
state, even if it's bounded.
Next step will be to walk through the "pre-associated" mode and
compare the complexity (especially relative to the possible security
benefits.) After that, safety-armoring and a prototype comment
page...
Wed Feb 28 02:59:00 2007
Looks like the best way to understand OpenID is going to be a "drunken
walk" through the spec... implementing a minimum identity-consumer so
that I can get a good handle on what the problem points are. For
example, the first thing the identity-consumer does is fetch the url
the user handed them; I wonder how many implementations get unhappy if
they're fed a youtube video, in spite of the suggestions in 3.3.1.
I'll probably go back and look at one of the other implementations
too, but I expect that they'll make a lot more sense after this
exercise.
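One cheap defense against the youtube-video case is to cap how much of the response the consumer is willing to read. A sketch, assuming any file-like response object; the 64 KiB limit and the names are arbitrary choices of mine, not anything from the spec.

```python
import io

MAX_DISCOVERY_BYTES = 64 * 1024  # arbitrary cap, an assumption for this sketch


class TooLarge(Exception):
    pass


def read_bounded(stream, limit=MAX_DISCOVERY_BYTES):
    """Read at most `limit` bytes from a file-like object; refuse anything longer."""
    chunks = []
    remaining = limit + 1  # one extra byte tells us the body exceeded the limit
    while remaining > 0:
        chunk = stream.read(min(remaining, 8192))
        if not chunk:
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    data = b"".join(chunks)
    if len(data) > limit:
        raise TooLarge("refusing to read more than %d bytes" % limit)
    return data


# Stand-in for an HTTP response body.
print(len(read_bounded(io.BytesIO(b"<html></html>"))))
```

In real use the stream would be the response object from the URL fetch, opened with a timeout, so that a slow or endless response can't hold the consumer up either.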
Tue Feb 27 02:51:00 2007
Enough of my README-blog entries include questions that an in-line
comment system seems worthwhile, and I've been looking for an excuse
to "do something" with http://openid.net for a while now.
I don't happen to believe in giving people "options" when I can
instead provide something simple and "right" :-)
Thus, femtocomment, from the SI prefix for 1e-15 (microcomment
and nanocomment being already in use.)
The constraints are fairly basic -
- web-side: "comment on this" should be minimal; ideally, only
  - the comment entry box itself
  - "who am I"
    - eventually, pre-filled
    - use the "right" field name to support form-autofill, if there is one
    - this is the OpenID URL; if we can scrape a handle/alias/name from that we should, otherwise just display the URL
    - consider having the display form special-case popular services like LiveJournal
- site-admin side (i.e. me):
  - easy to add to any given page
  - easy to see all comments on my site(s)
  - easy to identify and destroy spam comments
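The "scrape a handle, otherwise show the URL" rule could look something like the sketch below. It assumes LiveJournal identities have the form http://username.livejournal.com/; the service table and the function name are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical table of "popular services": host suffix -> display label.
KNOWN_SERVICES = {".livejournal.com": "LiveJournal"}


def display_name(openid_url):
    """Return 'handle (Service)' for known services, else the URL unchanged."""
    host = (urlsplit(openid_url).hostname or "").lower()
    for suffix, label in KNOWN_SERVICES.items():
        if host.endswith(suffix):
            handle = host[: -len(suffix)]
            if handle and handle != "www":
                return "%s (%s)" % (handle, label)
    return openid_url


print(display_name("http://exampleuser.livejournal.com/"))
print(display_name("http://example.org/~someone"))
```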
I suppose the truly minimal version would support
- linear comments on a single page
- OpenID + comment form
- per-comment deletion by admin
Later improvements could include
- make it as single-signon-ish as possible; remember people's URLs, maybe even try to pre-authenticate them if the protocol allows it?
- let people subscribe to comment threads (just their own, or any?)
- threaded reply
- formatting (though since this is mostly for technical discussions, code-quoting syntax and auto-url might be enough :-)
- support for a Dave Winer-style "show it to the author but not to anyone else" path for first-time commenters
The first implementation choice is whether to use an existing openid
library, or to implement enough of the spec to learn something useful
about it...