The World Wide Web Consortium has finally released a new version of the official HTML specification for public review and comment. The new 4.0 version follows the January release of HTML 3.2 with a number of improvements and extensions.
At first glance, it would appear that the true innovation is rather limited. Most Web users, after all, have seen much of this functionality in their browsers for some time now. Still, it's important to remember that much of the consortium's work is focused on documenting current practice. While Netscape and Microsoft battle over the coolest features, the HTML 4.0 spec offers Web developers a sane, standards-based intersection in functionality.
The consortium has also drawn attention to a number of accessibility features included in this spec. This has been primarily accomplished through new functionality attached to forms and tables, coupled with a clear path for separating presentation and structure through stylesheets.
For all the details, check out the HTML 4.0 Press Release and the Director's Perspective, written by the Web's founder, Tim Berners-Lee.
Getting the goods
Ready to dig in? Your first step, of course, is to get hold of the current specification. It comes in a number of formats, and eventually there will be dozens of language translations. For now, you can choose from the straight HTML version, a plain ASCII text version, a formatted and printable PostScript version, or an Adobe PDF version. For the SGML geeks out there, or those of you interested in developing validation tools, the HTML 4.0 Document Type Definition (or DTD) is online as well.
The specification itself contains more than you might expect, and the material is structured differently from previous standards documents released by the W3C.
For example, the introductory materials for the spec include an SGML Tutorial. This feature attempts to give an overview of the Standard Generalized Markup Language, how HTML fits into SGML, and exactly what a DTD really is.
Web authors will also appreciate the list of changes from HTML 3.2. This appendix to the HTML 4.0 spec gives a concise description of what has been added, deprecated, and changed.
Stuff to know about
We've covered quite a few of the improvements that HTML 4.0 would bring to the Web environment. The links below lead back to relevant columns that explain both the technical advances being proposed and the reasons they're important.
Links are the foundation of the Web; they tie the world's content together, letting us surf from page to page. But they could be far more powerful. We've seen examples of linking other resources to a page - stylesheets and scripts are good examples. Check out some of the other research that's been going on in this area.
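To make the idea of linking resources to a page concrete, here is a minimal sketch of a document head that uses the LINK element. The file names are hypothetical, and the link types shown (stylesheet, next, prev, contents) are taken from HTML 4.0 as it was later finalized, so the draft under review may differ in detail.

```html
<head>
<title>Chapter 2</title>
<!-- Attach an external stylesheet to this page -->
<link rel="stylesheet" type="text/css" href="site.css">
<!-- Describe where this page sits in a larger collection -->
<link rel="prev" href="chapter1.html">
<link rel="next" href="chapter3.html">
<link rel="contents" href="toc.html">
</head>
```

A browser is free to ignore the navigational links, which is why they work as hints layered on top of ordinary anchors rather than as a replacement for them.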
We're starting to get comfortable with the idea of stylesheets now that the power of CSS has crept into recent browsers. HTML 4.0 includes a number of hooks for adding those presentation features to a basic Web document. The spec also includes a couple of new tags for adding more structure to your content.
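The hooks and structural tags mentioned above probably look something like the fragment below. It is only a sketch: the class and id names are made up for illustration, and the elements used (STYLE, DIV, SPAN, plus the class, id, and style attributes) follow HTML 4.0 as later finalized.

```html
<head>
<title>Style hooks</title>
<style type="text/css">
  /* Rules keyed to the hooks used in the body below */
  #masthead { background: #cccccc }
  .warning  { color: red; font-weight: bold }
</style>
</head>
<body>
<!-- DIV groups block-level content; SPAN marks up a run of inline text -->
<div id="masthead">Example Site</div>
<p>Most of this sentence is plain, but
<span class="warning">this part stands out</span>.</p>
<!-- The style attribute applies a rule to a single element -->
<p style="margin-left: 2em">An indented paragraph.</p>
</body>
```

Neither DIV nor SPAN carries any meaning of its own, which is the point: they give a stylesheet something to attach to without adding presentational markup to the content.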
Even more than CSS, client-side scripting has become a baseline requirement for most Web pages. Yet HTML has had no official syntax for hooking those scripts to your content - until now. Both linked and embedded scripts, plus a handful of event triggers, are defined in the new spec.
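As a rough illustration of linked scripts, embedded scripts, and event triggers, consider the fragment below. The file name menus.js and the function greet are hypothetical, and the attribute names follow HTML 4.0 as later finalized (the draft may spell some of them differently).

```html
<!-- A linked script, loaded from a separate file -->
<script type="text/javascript" src="menus.js"></script>

<!-- An embedded script, written directly in the page -->
<script type="text/javascript">
function greet() {
  alert("Hello from an embedded script");
}
</script>

<!-- Event triggers: attributes that run script when something happens -->
<form action="#">
  <input type="button" value="Greet" onclick="greet()">
</form>
<a href="next.html"
   onmouseover="window.status='Go to the next page'; return true">Next</a>
```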
Way back in the days of HTML 3.0, the concept of a nonscrolling area of the screen, or "banner," was proposed as a way to offer unified navigation, branding, or advertising for Web sites. While that syntax was never actually implemented commercially, Netscape took the idea a step further and developed frames. Then everyone did frames. Now there is an official spec. Cool.
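For readers who have not hand-coded frames, here is a minimal sketch of a nonscrolling banner above a content area. The file names and frame names are placeholders, and the markup follows the frameset syntax of HTML 4.0 as later finalized.

```html
<html>
<head><title>Framed site</title></head>
<!-- Two rows: an 80-pixel banner, then whatever space is left -->
<frameset rows="80,*">
  <frame src="banner.html" name="banner" scrolling="no" noresize>
  <frame src="content.html" name="main">
  <!-- Fallback for browsers that do not display frames -->
  <noframes>
    <body>
    <p><a href="content.html">Go straight to the content</a></p>
    </body>
  </noframes>
</frameset>
</html>
```

Links inside banner.html would then use target="main" to load pages into the lower frame while the banner stays put.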
Wouldn't it be nice if there were one way to add any sort of media to any page? Imagine one standard tag that allowed you to include digital video, audio, Java applets, or even other HTML pages in your document. Soon, you very well may be able to.
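The "one standard tag" here is the OBJECT element. The fragment below is a sketch of how it might be used, assuming the syntax of HTML 4.0 as later finalized; the media files are hypothetical.

```html
<!-- Try the video first; if the browser cannot show it, fall back to
     the nested image, and finally to plain text -->
<object data="intro.mpeg" type="video/mpeg" width="320" height="240">
  <object data="intro.gif" type="image/gif" width="320" height="240">
    A short clip introducing the site.
  </object>
</object>

<!-- The same element can embed another HTML page -->
<object data="sidebar.html" type="text/html" width="200" height="400">
  <a href="sidebar.html">Read the sidebar</a>
</object>
```

The nesting is the interesting design choice: whatever sits inside an OBJECT is shown only when the object itself cannot be rendered, so the fallback lives right next to the media it replaces.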
There are a bunch of special characters you can include in your HTML by using character references (the syntax is an ampersand followed by a name, or by a pound sign and a number, and then a semicolon). That's how we currently get things like ©, ñ, and ¿. The new HTML 4.0 spec extends and codifies how these are included in the language.
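Here is how the three characters mentioned above can be written, first by name and then by number. The decimal values are the characters' ISO 8859-1 codes; the hexadecimal form in the last line is one of the additions in HTML 4.0 as later finalized, so treat it as an assumption about the draft.

```html
<!-- Named character references -->
<p>&copy; 1997 Example Co.</p>
<p>Espa&ntilde;ol: &iquest;Qu&eacute; tal?</p>

<!-- The same characters as decimal numeric references -->
<p>&#169; (copyright), &#241; (n with tilde), &#191; (inverted question mark)</p>

<!-- Hexadecimal numeric reference for the copyright sign -->
<p>&#xA9; 1997 Example Co.</p>
```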
The basic HTML forms we've been using (like text fields and pop-up menus) are a good start, but certainly don't allow for true interactivity or graphical interface building. The new spec includes a number of enhancements and new elements for building better ways to gather input from your users.
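To give a flavor of the form enhancements, the sketch below groups and labels controls and uses a richer button. The element names (FIELDSET, LEGEND, LABEL, BUTTON) and the accesskey and tabindex attributes follow HTML 4.0 as later finalized; the action URL, field names, and image file are placeholders.

```html
<form action="/subscribe" method="post">
  <!-- FIELDSET and LEGEND group related controls under a caption -->
  <fieldset>
    <legend>Contact details</legend>
    <!-- LABEL ties the caption text to its control by id -->
    <label for="email" accesskey="e">Email address</label>
    <input type="text" id="email" name="email" tabindex="1">
  </fieldset>
  <!-- BUTTON can contain markup, unlike a plain submit input -->
  <button type="submit" name="action" value="subscribe" tabindex="2">
    <img src="go.gif" alt=""> Subscribe
  </button>
</form>
```

The labels and grouping are also part of the accessibility story: a nonvisual browser can announce which caption belongs to which field.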
There has been plenty of other work done lately to advance HTML. Much thought is currently going into how tables should work, especially since soon we won't have to rely on them for layout. You can also find information on HTTP upload via forms (users would be able to send files to a server), and more in-depth recommendations on true internationalization in HTML. (How, for example, do you deal with languages that don't read from left to right, top to bottom?)
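Two of these ideas are easy to show in markup. The fragment below sketches a file-upload form and a paragraph marked as right-to-left text. The action URL and field name are hypothetical, and the dir and lang attributes follow HTML 4.0 as later finalized.

```html
<!-- File upload: the form must be sent as multipart/form-data -->
<form action="/upload" method="post" enctype="multipart/form-data">
  <input type="file" name="attachment">
  <input type="submit" value="Send file">
</form>

<!-- Internationalization: declare the language and the text direction -->
<p lang="he" dir="rtl">&#1513;&#1500;&#1493;&#1501;</p>
<p lang="en" dir="ltr">The paragraph above reads from right to left.</p>
```

The Hebrew word is written with numeric character references so the example survives in a plain ASCII file.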