Tuesday, December 14, 2004

Del.icio.us clusters

Del.icio.us is interesting. It lends
itself well to the following experiment/analysis: Some tags co-occur
with other tags much more often than others. This creates natural
clusters. For example, "php" and "mysql" are a pair of tags that are
more likely to apply to the same URL than say "philosophy" and "web".
So, it's pretty straightforward to analyze the RSS feeds they provide
and automatically generate the clusters, so I did it.
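
A minimal sketch of the co-occurrence idea, not the exact analysis I ran: it assumes the feed has already been parsed into a mapping from URL to tag list, and it uses Jaccard overlap as the closeness measure, which is one choice among several. The sample data and function names are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(tagged_urls):
    """Count how often each tag, and each unordered tag pair, appears."""
    tag_counts = Counter()
    pair_counts = Counter()
    for tags in tagged_urls.values():
        unique = sorted(set(tags))  # ignore duplicate tags on one URL
        tag_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return tag_counts, pair_counts


def jaccard(a, b, tag_counts, pair_counts):
    """Share of the URLs carrying either tag that carry both."""
    both = pair_counts[tuple(sorted((a, b)))]
    either = tag_counts[a] + tag_counts[b] - both
    return both / either if either else 0.0


# Hypothetical sample; real input would come from the parsed RSS feeds.
sample = {
    "http://example.com/1": ["php", "mysql", "web"],
    "http://example.com/2": ["php", "mysql"],
    "http://example.com/3": ["philosophy"],
    "http://example.com/4": ["web", "css"],
}

tag_counts, pair_counts = cooccurrence(sample)
print(jaccard("php", "mysql", tag_counts, pair_counts))
print(jaccard("philosophy", "web", tag_counts, pair_counts))
```

Pairs that score high here are the candidates for ending up in the same cluster.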

(Note: to illustrate some of these clusters I use a stylesheet which
will not get picked up by most RSS readers, so visit the actual page
if the diagrams make no sense.)

Most clusters are exactly the kind of
thing you'd expect, though some have interesting structure. Others
are amusing or surprising. One "standard" cluster is essentially a
web development cluster. I've collapsed some of the sub-clusters for clarity:



Web development cluster





php mysql lamp


apache mod_perl perl work





creative css html design

javascript webdev





foaf semantic xml rdf

software web

jsp java j2ee programming



So that's a pretty coherent, straightforward cluster. There are others including the Blogging/RSS cluster (rss atom syndication cool tech blog), the Recreation cluster (photography art photo annotation flickr geo photos games fun flash humor funny comics comic) and the Politics cluster, drawn out here.


Politics cluster


rumsfeld iraq
rnc dnc
democrats
political gop
bush
politics
kerry
election fraud
usa



When you get to the somewhat higher level connections (which are that much more tenuous) there are some less obvious arrangements. All caps words represent an entire subcluster that doesn't have its structure shown:


RSS/BLOGGING


LINUX
mail windows
ftp
spam


IETF RFCs
imap email
sputnik darpa
1950s 1960s


INTERNET HISTORY
free reference
snort security
tools

WEB DEVELOPMENT




Some of the odd connections (eg "mail windows" and "ftp") may just be an artifact of the relatively small sample (fewer than 6000 tagged URLs).
I'll have to collect more data and continue the experiment. In any case, it's interesting to me that it exposes a hierarchical ontology in a rather straightforward way.
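
One way a hierarchy like this can fall out of pairwise tag scores is greedy agglomerative clustering: repeatedly merge the two closest groups. The sketch below assumes average linkage and a hand-made similarity table; both are illustrative assumptions, not a description of the code that produced the diagrams.

```python
def build_hierarchy(tags, similarity):
    """Greedy agglomerative clustering with average linkage.

    tags: list of tag names.
    similarity: dict mapping frozenset({a, b}) to a score; missing pairs
    count as 0.0.
    Returns a nested tuple; tags merged earlier sit deeper in the tree.
    """
    # Each cluster is (tree, flat list of member tags).
    clusters = [(tag, [tag]) for tag in tags]

    def link(x, y):
        # Average similarity over every cross-cluster pair of tags.
        scores = [
            similarity.get(frozenset((a, b)), 0.0)
            for a in x[1]
            for b in y[1]
        ]
        return sum(scores) / len(scores)

    while len(clusters) > 1:
        candidates = [
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        i, j = max(candidates, key=lambda ij: link(clusters[ij[0]], clusters[ij[1]]))
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Hypothetical scores, chosen only to show the shape of the result.
scores = {
    frozenset(("php", "mysql")): 0.9,
    frozenset(("css", "html")): 0.8,
    frozenset(("php", "css")): 0.2,
    frozenset(("mysql", "html")): 0.1,
}
tree = build_hierarchy(["php", "mysql", "css", "html"], scores)
print(tree)
```

With a small sample, a single stray co-occurrence can drive a merge, which is consistent with the odd pairings showing up at the higher, more tenuous levels.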

Monday, December 6, 2004

November 2004 Games

36 games played, 20 titles (8 new to me) over 8 sessions with 38 different people.

Hot Games for November, 2004




Buy Word (6 plays)
This is a fun word game. I'll try to write a full review soon.
Leapfrog (3 plays)
Interesting simple game of simultaneous action selection.
Light Speed (6 plays)
Still one of the best speed games around.
Princes of Florence (1 play)
I tried a prestige card focused strategy. Not a win, but some success.
Victory & Honor (1 play)
I'm wondering about its staying power, but it's fun.
Electronic Catchphrase (3 plays)
beep beep beep
Heroscape (1 play)
My main current wish on this is that it didn't take so long to set up. Still fun.
San Juan (1 play)
Production buildings just aren't that useful.
Plupsack (2 plays)
I've come up with some new mnemonic approaches that make me less awful at this.
Typo (1 play)
Buy Word is better, but this is still a fun word game.

Sunday, December 5, 2004

NOAA Weather data

Back in 1993, I set up a web server and one of the things I put on it was a gateway to National Weather Service data. In the intervening decade, a great many superior weather services have appeared on the web. I'm not particularly a weather geek, but my early contribution in the area has always made me follow web weather information a little more closely than I might have otherwise.

I was pleased to see recently that the NOAA was providing forecast
data in XML. I was disappointed to see that the query mechanism was
SOAP. Allow me to rant for a moment: SOAP is a bit of an atrocity.
SOAP is the "simple object access protocol" which is this abomination
in which you tunnel gigantic XML wrapped queries over HTTP in order to
access various web services. Then, XML documents are returned.
Great, I'm all for XML in appropriate places. If you're accessing a
web service that is returning structured data, an XML document is
almost the perfect response. However, an XML query?
Why? Why? Ok, I know some of the reasons, but when it comes down to
it, they're not very good. Just because you can beat a nail into the
wall with the end of a screwdriver doesn't make it the right tool for
the job. If only we had a straightforward method for requesting named
objects with arbitrary named parameters as a query, right? We do.
It's called HTTP and url-encoded form data. So, if I have a really
simple object "myObject" I want to call "myMethod" on with the named
parameter "myParameter" with a value of "myValue", in SOAP, I have to send:



POST /myProxy HTTP/1.1
standard HTTP headers removed for clarity
SOAPAction: "urn:myObject#myMethod"

<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:xsd="http://www.w3.org/1999/XMLSchema"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/">
<SOAP-ENV:Body>
<namesp1:myMethod xmlns:namesp1="urn:myObject">
<myParameter xsi:type="xsd:string">myValue</myParameter>
</namesp1:myMethod>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>


in contrast, if I were to use just a URL, it might look something like this:



GET /myProxy/myObject/myMethod?myParameter=myValue HTTP/1.1
standard HTTP headers removed for clarity



Both approaches return the exact same data. With SOAP, I have to use third
party libraries which may be incomplete, poorly documented or buggy.
In the latter approach, I can use curl, lynx, wget, any web
browser, third party libraries or I can hack together a raw
socket implementation in minutes. Some people have taken to calling
this latter approach "REST" which refers to Fielding's
"Representational State Transfer". I'm fine with calling it REST if
the goal is to have a name for it but the real name for it is "the
world wide web". Having a queryable URL which returns XML is great.
Having to send it XML in the first place is unnecessary. Emerson
commented: "A foolish consistency is the hobgoblin of little minds."
I'm sure this is exactly what he was thinking of. XML is cool, but
let's not over apply it til we get sick of it. WSDL is cool, but
let's just skip section 3 (SOAP binding) and focus on section 4 (HTTP
binding). If you have a strong reason to use SOAP rather than
straight HTTP, go ahead, but in essentially every case I've ever seen,
an HTTP binding would be more than sufficient and vastly superior.
Ok, the rant part is over.
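
To give a sense of how little the raw socket version needs, here is a rough sketch in Python. The host name is a placeholder, the path and parameter are the toy names from the example above, and `send_request` is included only to show the shape of the socket code; it is defined but not called here.

```python
import socket
from urllib.parse import urlencode


def build_get_request(host, path, params):
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    query = urlencode(params)
    target = f"{path}?{query}" if query else path
    lines = [
        f"GET {target} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close when it is done
        "",
        "",  # blank line terminates the header block
    ]
    return "\r\n".join(lines).encode("ascii")


def send_request(host, port, request):
    """Send the request over a plain TCP socket and return the raw reply."""
    chunks = []
    with socket.create_connection((host, port)) as conn:
        conn.sendall(request)
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


# "example.com" is a placeholder host, not a real service endpoint.
request = build_get_request(
    "example.com",
    "/myProxy/myObject/myMethod",
    {"myParameter": "myValue"},
)
print(request.decode("ascii"))
```

The whole query is the first line of the request; everything the SOAP envelope carried is in the path and the query string.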

So, in an effort to make the world a better place, I made a RESTish
HTTP gateway to the NOAA SOAP interface.

The URL is http://mkgray.com:8000/noaaforecast and it takes parameters longitude, latitude, product, optionally startTime, optionally endTime and at least one of the weather parameters (maxt, mint, temp, dew, pop12, qpf, snow, sky, wspd, wdir, wx, icons, waveh) if you are using time-series mode. For example, if you wanted to see what was forecast to fall out of the sky near Boston:
If you hit the page too often (more than 30 times in a 6 hour period) you'll be throttled and given 503 errors. If you omit latitude, longitude, product or parameters when required, you'll get a 404. See the NOAA page for details on the format of the inputs (ie, the times) and the format of the returned XML.
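
A client-side sketch of how a script might build a query and stay under the throttle. Several things here are assumptions made for illustration: that each weather parameter is passed as `name=name`, that "time-series" is the product string, the approximate Boston coordinates, and that the 30-per-6-hours limit behaves like a sliding window. The helper names are hypothetical too.

```python
from collections import deque
from urllib.parse import urlencode

GATEWAY = "http://mkgray.com:8000/noaaforecast"


def forecast_url(latitude, longitude, product, weather=(),
                 start_time=None, end_time=None):
    """Build a gateway query URL from the documented parameter names."""
    params = [
        ("latitude", latitude),
        ("longitude", longitude),
        ("product", product),
    ]
    if start_time is not None:
        params.append(("startTime", start_time))
    if end_time is not None:
        params.append(("endTime", end_time))
    # Assumption: each requested weather parameter is sent as name=name.
    params.extend((name, name) for name in weather)
    return f"{GATEWAY}?{urlencode(params)}"


class RequestBudget:
    """Track request times so a client stays within the stated limit."""

    def __init__(self, limit=30, window_seconds=6 * 60 * 60):
        self.limit = limit
        self.window = window_seconds
        self.sent = deque()

    def allow(self, now):
        """Return True and record the request if it fits in the window."""
        # Forget requests that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) >= self.limit:
            return False
        self.sent.append(now)
        return True


# Approximate Boston coordinates; precipitation-related parameters.
url = forecast_url(42.36, -71.06, "time-series",
                   weather=("pop12", "qpf", "snow"))
print(url)
```

In use, a script would call `budget.allow(time.time())` before each fetch and skip or delay the request when it returns False, rather than waiting to be handed a 503.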

Jordan's Furniture megastore

We visited the new gigantic Jordan's Furniture in Reading,
Massachusetts. Wow, it's big, but lacking in most of the charm of the
other Jordan's stores.

I've been reasonably impressed with the other Jordan's stores, even
the large ones. They are famous for their low-pressure sales approach
and quirky stores. They'll have free coffee, cookies, and popcorn.
One of them has a motion movie ride. The new store has a 3-d IMAX
theatre, a "trapeze school", a brightly colored illuminated fountain,
a Jelly Bellies store and an ice cream stand. Plus, a great deal more.
Unfortunately, the net effect is that the store feels extremely
over-commercialized. It's not that the other stores are
non-commercial feeling; they are retail stores. The new one feels crass rather than quirky though.

To top it all off, despite their overwhelming size, their selection is
only somewhat better. We went looking for entertainment centers. If
you want an entertainment center 46" wide and 68" tall, with swing out
doors and drawers underneath, they have a wide range of styles and
appearances. You want a narrower one? Well, mostly not. One without
doors? No. A taller one with cabinets above? No. The uniformity of
choice in a smaller store is unsurprising. In a store this large it's
disappointing.

Thursday, December 2, 2004

Why didn't anyone tell me?

JavaScript isn't awful. Well, it isn't awful anymore. When
JavaScript first came out, freshly renamed from the less misleading
"LiveScript", it was awful. It was ill-defined, clunky, full of
security holes, and awkward in a great many ways. I wrote it off as a
tool used to make web sites do things they really didn't need to or
shouldn't do anyway. At some point, seemingly around 1999, this changed.

JavaScript is a clever language with an interesting twist on the
"standard" object model of languages like Java or C++. It's got a
relatively clean and useful set of built-in libraries, including the
valuable DOM. Its threading/timing model is a bit bizarre but
surprisingly useful. It interacts with XHTML in powerful ways to
enable some very useful bits of web UI. Through the use of
"bookmarklets" it puts a lot of power back in the browser which, in
the interest of "interactive" web sites has been gradually leached
away.

JavaScript still has its issues. The three biggest as I see it are:


  • It used to be bad. It really was bad. No DOM. No good documentation. No prototype inheritance model. Bad. First impressions make a big difference.
  • It's mostly used by non-developers. At some level this isn't bad, but it creates the problem that a lot of the JS code out there is awful because the people writing it don't really know what they're doing. It's horrible voodoo code which is unintelligible and barely functional, never mind maintainable or usable by others.
  • Most use of JavaScript is abuse. The most ubiquitous applications of JS are things like popup and popunder ads. Even image rollovers and other stupid UI tricks could be called abuse. Modern browsers (eg, Firefox) address the whole popup issue well, making this abuse less apparent. Unfortunately, a tool that is mostly used for bad things is often assumed to be a bad thing. JavaScript was used for a lot of bad things.


Now, though, it's a nice language. It's being accepted more by
developers. It's being used in useful ways. So, if you're like me
and thought JavaScript was an atrocity deserving no attention, look
again. It's grown up a lot. The question remains: "Why didn't anyone
tell me, in the last 5 years, that it was so much better?"

For reference, I highly recommend the Rhino book by O'Reilly.