DRAFT 11 -> FINAL?
Martin Atkins
mart at degeneration.co.uk
Thu Jan 25 18:51:14 UTC 2007
Since your list is long, I'm only going to address things I have an
answer to. I'll leave the rest to other people. :)
Claus Färber wrote:
> -----
> | 4.1.1. Key-Value Form Encoding
> |
> | A message in Key-Value form is a sequence of lines. Each line begins
> | with a key, followed by a colon, and the value associated with the
> | key. The line is terminated by a single newline (UCS codepoint 10,
> | "\n"). A key or value MUST NOT contain a newline and a key also MUST
> | NOT contain a colon.
>
> While it's ok for keys to have a limited alphabet, not allowing them in
> the value may be a problem for future extensions (say, if AX wants to
> transfer a multiline postal address).
>
> I don't think it's a good idea to invent a new format here. What's wrong
> with URI percent-encoding, which has to be implemented by OpenID
> software anyway?
>
As far as I'm aware the protocol does not allow for extensions to add to
the messages sent in this encoding. Extensions are only allowed to take
part in the messages passed by redirecting the UA, in urlencoded format.
The OpenID protocol itself has no need for newlines or other fancy
characters here.
I agree with you that inventing a new format here was an odd choice, but
unfortunately it's already in use by deployed OpenID 1.1 implementations
and so OpenID 2 implementations will have to support it anyway.
>
> | The response body MUST be a Key-Value Form (Key-Value Form Encoding)
> | message with the following fields:
> ...
> | *error
> | Value: A human-readable message indicating the cause of the error.
>
> What about different languages?
>
> This is actually tricky. The Accept-Language header from the user is not
> available to the OP if it is contacted by the RP directly, so the OP
> can't choose an appropriate language.
This "human-readable message" is really addressed at the developer of
the non-compliant implementation, not the end-user. Therefore support
for multiple languages was thought not to be that important. All current
implementations use straightforward English messages.
The end-user should never see such a message since these error responses
only arise when there is a bug in someone's implementation. (In theory,
at least!)
>
> | 7.3.3. HTML-Based Discovery
> |
> | HTML-Based discovery MUST be supported by Relying Parties. HTML-Based
> | discovery is only usable for discovery of Claimed Identifiers. OP
> | Identifiers must be XRIs or URLs that support XRDS discovery.
> |
> | To use HTML-Based discovery, an HTML document MUST be available at the
> | URL of the Claimed Identifier. In the HEAD section of the document:
>
> It seems that it has been unclear to some implementers what "HEAD
> section" means. This is not surprising as the correct term is "HEAD
> element" (HTML 4.01, section 3.2.1).
Agreed. This is just a simple language change.
> | A <LINK> tag MUST be included with attributes "rel" set to
> | "openid2.provider" and "href" set to an OP Endpoint URL
> | A <LINK> tag MAY be included with attributes "rel" set to
> | "openid2.local_id" and "href" set to the end user's OP-Local
> | Identifier
>
> These should read "LINK element", not "<LINK> tag".
While you're correct, I don't think it's that erroneous to say "<LINK>
tag" here. Still, it can't hurt to correct it.
> | The protocol version when HTML discovery is performed is "http://
> | specs.openid.net/auth/2.0/signon".
> |
> | The host of the HTML document MAY be different from the end user's
> | OP's host.
> |
> | The "openid2.provider" and "openid2.local_id" URLs MUST NOT include
> | entities other than "&", "<", ">", and """. Other
> | characters that would not be valid in the HTML document or that cannot
> | be represented in the document's character encoding MUST be escaped
> | using the percent-encoding (%xx) mechanism described in [RFC3986]
> | (Berners-Lee, T., "Uniform Resource Identifiers (URI): Generic
> | Syntax," .).
>
> This is incompatible with HTML parsing rules.
>
[snip]
>
> I think the whole section is flawed and should be replaced with a small
> note as a warning:
>
> Note: If the "openid2.provider" and "openid2.local_id" contain an
> ampersand (&), this character MUST be escaped as "&" according to
> HTML rules.
In practice, few implementations actually use an HTML parser to find
these elements. These extra restrictions are present to facilitate
regex-based parsing.
Since there is no regression if an implementation should choose to use a
*real* HTML parser, I don't see the harm here. It's imposing extra
restrictions only on people who want to use OpenID HTML discovery; the
rest of the document uses the normal HTML rules, though there is one
exception which I'll describe below.
>
> BTW, good HTML parsers are mandatory to protect against HTML injection
> attacks. An ad-hoc parser might simply fall for something like that:
>
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
> <link rel=openid2.provider href='http://openid.example.com/~user'>
> <title>Nice test</title>
> <form action="doit">
> <p>Send me your comment:
> <input type=text name=comment value='<html><head><link rel=openid2.provider href="http://bogous.example.net"></head>'>
> <input type=submit>
> </form>
>
> Yes, this is valid HTML and yes, a HTML "sanitizer" that intends to do
> the shortest possible representation would produce something like this.
The regex-based parsers employed by existing implementations require
explicit <head> start and end tags. I agree that this is not ideal, but
it's hardly an onerous requirement on document authors.
This is mostly an ideological argument founded on whether we're allowed
to impose additional restrictions on HTML documents that are making use
of OpenID discovery. There is certainly no *practical* reason why this
shouldn't be done, assuming that the restrictions are sufficient to
prevent the above attack.
(The spec is currently lacking specifics on how to do this safely, by
the way. While I'm in two minds about the spec describing particular
implementations, it *does* need some normative parsing requirements in
sufficient detail that all complying implementations would not fall for
the above attack.)
>
> | 11.1. Verifying the Return URL
> |
> | To verify that the "openid.return_to" URL matches the URL that is processing this assertion:
> |
> | * The URL scheme, authority, and path MUST be the same between the two URLs.
> | * Any query parameters that are present in the "openid.return_to"
> | URL MUST also be present with the same values in the URL of the
> | HTTP request the RP received.
>
> http://example.com/do?dosomethingharmless=yes
> matches
> http://example.com/do?dosomethingharmless=yes&deleteverything=YES
>
While this is true,
* Good sites will not have a GET request performing an action
* If an attacker has something to gain from sending the browser to
that latter URL, you don't need to go through all the overhead of an
OpenID request to send the browser there.
More information about the specs
mailing list