Tuesday, October 26, 2004

The Triumph of Semantic Markup

Semantic markup is the new big thing: the semantic web and all that. I see it less as a cool new thing and more as a return to our roots,
or at least to the roots of the web as I saw them when it first started in the early 90s.

Back in the early days of the web (1993), I had a mantra I used when explaining HTML markup to people: "Semantic, not Literal". That is, an HTML H1
tag does not mean "bigger font". It means "1st level heading". HTML markup was meant to be semantic, and not for any aesthetic reason. It wasn't
even for the reason of browser independence (though, had people stuck to semantic markup, this whole WML thing would have been simpler). It was
for the reason that semantic markup is much more powerful from an information access point of view. Specifying a font color is nice, but it comes
at a cost of semantic ambiguity. Of course, HTML was never really fully semantic markup; that was just an appealing dream. People fell in love
with the FONT tag and a variety of other literal markup tags and mechanisms. The hope for semantic
markup was lost, and HTML became very much a literal presentation markup language.
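
To make the "information access" point concrete, here is a minimal sketch in Python. It assumes two made-up page fragments (SEMANTIC and LITERAL are hypothetical samples, not taken from any real site) and a small helper, extract_outline, written just for this illustration. With heading tags, a program can recover the document outline; with FONT tags, all it can see is font sizes.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collects (level, text) pairs for h1..h6 headings."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None  # heading level we are currently inside, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "H1" arrives as "h1".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def extract_outline(html):
    """Return the heading outline of an HTML fragment (hypothetical helper)."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


# Hypothetical sample fragments: same visible text, different markup.
SEMANTIC = "<h1>Markup</h1><p>Intro text.</p><h2>History</h2><p>More text.</p>"
LITERAL = (
    '<font size="7">Markup</font><br>Intro text.<br>'
    '<font size="5">History</font><br>More text.'
)

print(extract_outline(SEMANTIC))
print(extract_outline(LITERAL))
```

The semantic fragment yields a two-entry outline; the literal one yields nothing, because "size 7" does not say whether the text is a heading, a pull quote, or just someone shouting.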

Then, to add insult to injury, things like DHTML came along, flaunting the fact that HTML had become a presentation markup language, not a semantic
markup language. I resisted learning DHTML for a while because of this. Once you give in and accept HTML as a presentation language, DHTML is
sort of neat. When I first learned about CSS, I had some hope that perhaps it restored some of that semantic markup quality. Sadly, it did not.
It does do a decent job of abstracting out "style", but HTML is still essentially purely presentational. CSS allows it to be presented in a variety
of styles, though, and it's nice having a relatively universal standard for those style details.

But looking at the web now, the power of semantic markup is winning out again. RSS (http://www.xml.com/pub/a/2002/12/18/dive-into-xml.html) has become a popular and rapidly growing scheme
for information delivery. RSS
is a pure semantic markup with all the presentation details left up to HTML. I currently read well over 100 RSS feeds, only a fraction of which
are from blogs, which prompted the success of RSS. This blog itself is accessible via RSS and many of its readers
access it that way. I read the
comics, the Times, a grab bag of web sites, a few search engine queries, and even a mailing list, all via RSS. It's very gratifying to see
"semantic, not literal" finally getting a lot of traction.
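
As a rough illustration of what "presentation left up to someone else" buys you, here is a small Python sketch. The feed text (FEED) is invented for the example, and parse_items, render_text, and render_html are hypothetical helper names; only the item elements title, link, and description come from RSS itself. The same parsed items are rendered two different ways without touching the feed.

```python
import xml.etree.ElementTree as ET
from html import escape

# A made-up RSS 2.0 feed, used only for illustration.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://example.com/</link>
    <description>A made-up feed for illustration</description>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <description>Semantic, not literal.</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
      <description>Tags &amp; meaning.</description>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Pull out what each item *is*, with no presentation attached."""
    root = ET.fromstring(feed_xml)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        }
        for item in root.iter("item")
    ]


def render_text(items):
    """One possible presentation: a plain-text list."""
    return "\n".join(f"* {i['title']} <{i['link']}>" for i in items)


def render_html(items):
    """Another presentation of the same items: an HTML list of links."""
    rows = [
        f'<li><a href="{escape(i["link"])}">{escape(i["title"])}</a></li>'
        for i in items
    ]
    return "<ul>" + "".join(rows) + "</ul>"


items = parse_items(FEED)
print(render_text(items))
print(render_html(items))
```

Nothing in the feed says "bold" or "blue"; each reader decides that for itself, which is the whole point.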

Two other random thoughts on RSS: I expect it won't be long before some RSS feeds start including advertising entries. It surprises me I haven't
really seen them yet. I just hope when the time comes it isn't overwhelming. There are some feeds I would keep reading even with a reasonable
dose of ads. Most, however, I'd stop reading if ads started being included. Until then, it's a nice little garden. The other thought is that
blogs, as the progenitor of RSS, share other qualities with the early days of the web. In the early days of the web, the thing to do was have your
own personal homepage that you created. But since most people had nothing they wanted to put on such a page, a homepage rapidly became something
a university or corporation had more commonly than an individual, even though many people still had them. Further, in the early days of the
web, the goal (as much as there was a "goal") was to share information. You weren't trying to drive banner hits, collect demographic data or
derive revenue. While some (many?) blog writers may now have those goals, the biggest goal I've observed is wanting to be read. RSS is a great
way to make it easier to be read.

(Footnote: If you're looking for a great way to read RSS feeds, there are a variety of RSS aggregator applications (http://www.google.com/search?hl=en&ie=ISO-8859-1&q=rss%20aggregator&btnG=Google+Search), but I highly
recommend Bloglines as an outstanding web-based reader.)
