About Me

I'm just someone struggling against my own inertia to be creative. My current favorite book is "Oh the places you'll go" by Dr. Seuss

Friday, September 14, 2007

How I revived the semicolon

I was searching my own name in google, (as I often obsessively do), to see what the e-world thinks of me. Just as a lark, I searched a common misspelling, and found someone on del.icio.us that linked to a post I made to a mailing list. This was the post, and reasoning that influenced David Heinemier Hannson, to use semicolons in the urls of his popular Ruby on Rails framework. Since David is one of what they call "The Alpha Nerds" on the interwebs, this is kind of like the uber nerd version of someone like Bruce Lee deciding that aviator hats are a really cool idea, after having a deep philisophical conversation with me. And then all the other martial arts experts proceed to comment "Bruce Lee is pretty cool, but what the fuck is with the aviator hat?"

so that I may bask in my own glory and personal brillliance, I give you, the people who haven't given up reading in frustration, the post that will ensure the semicolon's place in your browser bars for years to come:

It would seem that my discovery of a rarely used aspect of the HTTP
url scheme, namely the semicolon ; has led to perhaps a level of
unjustified excitement and advocacy. Intent is very difficult to
gleam from carefully reading a spec, despite, or perhaps because of
the high standards specs are held to for preciseness and lack of
ambiguity, which encourage a sort of unnatural writing style which in
all honesty is hard to read and easy to misinterpret. That aside, I
have a question after reading a set of documents by Tim Berners-Lee,
which are more geared toward the intentional side of things rather
than the specification side of things, which you will find here: <
http://www.w3.org/DesignIssues/Axioms.html >

Regarding the issues I'm concerned with, this is my interpretation of
the document, and you can tell me whether it's incorrect or not:

*For REST, it is mostly important that GET have no side effects, and
anything which does have side effects be implemented with POST (or

*forward slashes, as mapped to unix heirarchical structures are used
merely as a matter of convenience, and due to the principals of
opacity, are not neccesarily significant to the resolution of a URI,
except regarding the resolution of relative URI's in which case, such
interpretation should be defined per URI scheme, with client support.
Forward slashes do not indicate a resource, but the entire URI
indicates the resource, and it's entirely up to the server how to
find that resource based on the URL. No procedure for internal URI
resolution is explicitly defined.

*That http defines such a scheme for resolving relative URI's for /
style urls, but not yet for semicolons as indicators of parameters
(though TBL provides such a scheme as a possibility for implementation)

*Therefore, virtually any symbol, including but not limited to
forward slashes and semicolons can be used for the purposes of
mapping to arbitrary data topologies, at the discretion of the URI
scheme designer, which can be interpreted and resolved by the server
in whatever way is appropriate. In addition, a great deal of this
flexibility is left in the http scheme to be used at the discretion
of server programmers. As long as opacity is observed, and URI's
remain idempotent nouns.

*To this add the additional constraint given by roy fielding that
this can only be done as long as the nature of the URI scheme is
clearly communicated by the server to the client. In other words,
there is no assumption made that the client will be able to infer a
URI scheme from a base URI without explicit communication that such a
URI scheme is in use.

*Notable exception is the # fragment identifier which is reserved for
use by clients, and to be defined when registering the mime type for
a document.

*TBL then demonstrates the flexibility of URI's by laying out his
Matrix URI scheme, that is using the ; and = symbols to specify a
matrix information space, which is distinctly incompatible with a
heirarchical space. This does not imply that ; has any inherent
meaning relative to the REST or uri model, but as a scheme designer,
TBL is using it to represent a type of information topology being
made available by a server.

If one is to take all the above statements as true and accurate
interpretations of REST, then solutions to certain problems become
available, which are not possible, or are awkward under a naive
surface impression that REST requires URLS to contain only forward
slashes / to represent information topologies. It is beginning to
look to me that this impression has more to do with Aesthetics, and
search engine technologies, than any specific requirements of REST.

A problem this interpretation potentially solves, for instance, is
that of multiple views of the same information resource. Usually,
this would be accomplished through content negotiation. However this
is only effective if your multiple views all have distinct mime
types. If you have for example, a resource such as < http://
example.com/blog/posts/53 > representing an article on a blog, you
may have a non editable view, and an editable view.

In an ideal world this sort of thing could be specified using a
fragment identifier, such as < http://example.com/blog/posts/53#edit
>, however for this to work on a practical level, it requires client
support which is not likely, and somewhat incompatible with the
fragment identifier scheme defined for html/xml

Another option is to use the forward slash < http://example.com/blog/
posts/53/edit >
However it seems logically innacurate to state that a view is a
hierarchical child of the resource, rather than something more
lateral, or directly related.

If the bullet points above are accurate, then it becomes possible and
reasonable to use a semi colon, or any other appropriate symbol to
represent a lateral space, such as a view. Matrix notation seems
appropriate, since such a view system can be seen as a "matrix" (or
tensor) with 1 axis, and english words representing named points
along that axis.
< http://example.com/blog/posts/53;view=edit > < http://example.com/
blog/posts/53;view=display> < http://example.com/blog/posts/
53;view=hatom>. This makes sense to me as viewing a slice or
subsection of a single resource, which includes multiple possible views.

The caveats being how relative URL's are resolved by clients, whether
the server supports this level of flexibility, and whether
appropriate mechanisms are employed for informing the client of the
URI scheme in use (such as hrefs along all points on a matrix
allowing navigation)

This may be overintellectualizing a simple problem, but the main
purpose of this post is to verify if I have my bulleted
interpretations correct, otherwise my conclusion may be false.

No comments: