Parsing HTTP headers in Factor with multi-assocs
Saturday, February 2, 2008
The code for setting and parsing HTTP headers in Factor previously used a hashtable that mapped each header name to a single value. However, this is broken because certain fields can be sent more than once, e.g. set-cookie. The new implementation is a hashtable that maps each key to a vector, so multiple values can be stored for the same key.
I originally tried to make this obey the assoc protocol, so that you could convert from a hashtable of vectors back to any type of assoc (hashtable/alist/AVL tree/etc). This turned out to be a really bad idea: not only was it not useful, but it breaks the semantics of the assoc protocol if set-at inserts an element instead of setting it.
So the implementation is in assocs.lib as a few helper words:
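    ! Push a value onto the vector stored at key, creating the vector if needed.
    : insert-at ( value key assoc -- )
        [ ?push ] change-at ;
    ! Look up the last value pushed for key; the flag says whether the key exists.
    : peek-at* ( key assoc -- obj ? )
        at* dup [ >r peek r> ] when ;
    ! Like peek-at*, but without the flag.
    : peek-at ( key assoc -- obj )
        peek-at* drop ;
    ! Wrap every value of an ordinary assoc in a one-element vector.
    : >multi-assoc ( assoc -- new-assoc )
        [ 1vector ] assoc-map ;
    ! Call quot on every key/value pair, repeating the key for each of its values.
    : multi-assoc-each ( assoc quot -- )
        [ with each ] curry assoc-each ; inline
    ! insert-at on the current namespace.
    : insert ( value variable -- ) namespace insert-at ;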
Of course, set-at and at still set and access the values, but there are a couple new utility words. The insert-at word has the same stack effect as set-at but pushes a value instead of setting it. peek-at will give you the last value set for a given key, and this is the standard way of accessing values when you only care about the last one.
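As a rough sketch of how these words fit together, here is a short listener session. It assumes the helper words are loaded from assocs.lib as defined above; the cookie strings are made up for illustration.

```factor
! Assumes the assocs.lib helper words shown above are available.
USING: assocs assocs.lib kernel prettyprint ;

H{ } clone                          ! start with an empty multi-assoc
"a=1" "set-cookie" pick insert-at   ! first value for the key
"b=2" "set-cookie" pick insert-at   ! second value is appended, not overwritten
"set-cookie" over at .              ! at returns the whole vector of values
"set-cookie" swap peek-at .         ! peek-at returns only the most recent value
```

If the words behave as described, the first print should show a vector holding both strings and the second should show only "b=2".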
To turn an assoc into a multi-assoc, call >multi-assoc. To iterate over all the key/value pairs, use multi-assoc-each.
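A minimal sketch of both words, again assuming the definitions from assocs.lib above; the two headers are sample data. The quotation passed to multi-assoc-each receives the key and one value at a time.

```factor
! Assumes the assocs.lib helper words shown above are available.
USING: assocs assocs.lib io kernel ;

H{ { "server" "Server" } { "connection" "close" } }
>multi-assoc                         ! every value becomes a one-element vector
[ swap write ": " write print ]      ! quot sees ( key value -- )
multi-assoc-each                     ! prints one "key: value" line per stored value
```

A key with several values would be printed once per value. The order of the lines depends on the hashtable's iteration order.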
The insert word is for use with the make-assoc word, which executes inside a new namespace and outputs the variables you set as a hashtable.
Here’s an example of what the headers look like for a website:
    ( scratchpad ) USE: http.client "amazon.com" http-get drop .
    H{
        { "connection" V{ "close" } }
        { "content-type" V{ "text/html; charset=ISO-8859-1" } }
        { "server" V{ "Server" } }
        { "x-amz-id-2" V{ "L0oid1yo1Z6cuq+VgwWCv0G/UdPov/0v" } }
        { "x-amz-id-1" V{ "15CPXN68HXB35FXE62CX" } }
        {
            "set-cookie"
            V{
                "skin=noskin; path=/; domain=.amazon.com; expires=Sun, 03-Feb-2008 04:57:59 GMT"
                "session-id-time=1202544000l; path=/; domain=.amazon.com; expires=Sat Feb 09 08:00:00 2008 GMT"
                "session-id=002-3595241-4867224; path=/; domain=.amazon.com; expires=Sat Feb 09 08:00:00 2008 GMT"
            }
        }
        { "vary" V{ "Accept-Encoding,User-Agent" } }
        { "date" V{ "Sun, 03 Feb 2008 04:57:59 GMT" } }
    }
I have normalized the keys by converting them to lowercase. For some reason, Amazon sends the first two cookie headers as Set-Cookie and the last one as Set-cookie, which is pretty weird.
Since the prettyprinter outputs valid Factor code, you can copy/paste the above headers into a Factor listener and run some of the multi-assoc words on them.
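For instance, something along these lines should work on a trimmed copy of the headers above. The cookie attributes are cut for brevity, and the helper words are assumed to come from assocs.lib.

```factor
! Assumes the assocs.lib helper words shown above are available.
USING: assocs assocs.lib kernel prettyprint sequences ;

! Trimmed copy of the response headers; cookie attributes omitted.
H{
    { "content-type" V{ "text/html; charset=ISO-8859-1" } }
    {
        "set-cookie"
        V{
            "skin=noskin"
            "session-id-time=1202544000l"
            "session-id=002-3595241-4867224"
        }
    }
}
"set-cookie" over at length .     ! how many Set-Cookie headers were received
"content-type" swap peek-at .     ! a single-valued header, read via peek-at
```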