I am a programmer and architect (the kind that writes code) with a focus on testing and open source; I maintain the PHPUnit_Selenium project. I believe programming is one of the hardest and most beautiful jobs in the world. Giorgio is a DZone MVB and is not an employee of DZone and has posted 637 posts at DZone.

Where has XHTML gone?

02.01.2011

When I started working in web development, XHTML had recently been introduced and was all the rage. I even included XHTML templates and a copy of the DTD in our enterprise CMS, believing that in a few years the publicly hosted DTD would be hit by millions of users' browsers trying to validate XHTML code and rejecting malformed documents.

But now, in 2011, where has XHTML gone?

What XHTML is

XHTML is a specification that defines the XML serialization of HTML: while HTML itself is not a strict language, and browsers ignore most malformed tags and nesting structures, XML is much more draconian. In its original versions, XHTML 1.0 and 1.1, XHTML was a reformulation of HTML 4 meant to turn HTML documents into valid XML ones, agnostic with respect to the graphic presentation or the media type.

For example, XHTML 1.0 Strict dropped tags strictly related to presentation, like <font>, and favored semantic tags such as <strong> over <b>. Ideally, XHTML documents could be viewed on different media just by specifying a different CSS.

Another interesting idea of XHTML was providing different modules via XML namespaces. You can compose different markup languages in a document, in addition to the standard one: a language for forms, one for mathematical formulas, one for vector graphics.

Here's an example of an XHTML snippet that includes a MathML expression.

<p>Some random text.</p>
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <plus/>
    <apply>
      <times/>
      <ci>a</ci>
      <apply>
        <power/>
        <ci>x</ci>
        <cn>2</cn>
      </apply>
    </apply>
    <apply>
      <times/>
      <ci>b</ci>
      <ci>x</ci>
    </apply>
    <ci>c</ci>
  </apply>
</math>

A bit verbose, but compared to cryptic LaTeX notation, which must be parsed on the server side, it's not so ugly.
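
For comparison, my reading of the MathML tree above is a sum of three terms (a times x squared, b times x, and c), which in LaTeX notation fits on one line:

```latex
a x^{2} + b x + c
```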

HTML 5 has instead taken the approach of building support directly into a single specification: the <input> element has been extended to provide user-friendly forms without the need for additional JavaScript libraries; <canvas> can be used for scripted drawing, while SVG and MathML can be embedded inline; and so on. Manipulating the DOM via JavaScript is also an important part of the spec.

Another advantage of XHTML may be the use of XML tools for web pages too: nearly every language has an XML parser, but an HTML one is more difficult to find or write.
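
To give an idea of what "XML tools for web pages" can look like, here is a minimal sketch in Python that uses only the standard library's generic XML parser to pull the links out of an XHTML page. The sample document and the extract_links name are made up for this example; any well-formed XHTML source should work the same way.

```python
import xml.etree.ElementTree as ET

# The namespace URI that XHTML elements live in.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# A small, well-formed XHTML document (invented for this example).
DOCUMENT = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Sample page</title></head>
  <body>
    <p>Read the <a href="http://www.w3.org/TR/xhtml1/">XHTML 1.0 spec</a>.</p>
    <p>Or go back to the <a href="/index.html">home page</a>.</p>
  </body>
</html>"""


def extract_links(xhtml_source):
    """Return (href, link text) pairs for every <a> element in the document."""
    root = ET.fromstring(xhtml_source)
    links = []
    # The tag name must be qualified with the XHTML namespace.
    for anchor in root.iter("{%s}a" % XHTML_NS):
        links.append((anchor.get("href"), anchor.text))
    return links


if __name__ == "__main__":
    for href, text in extract_links(DOCUMENT):
        print(href, "->", text)
```

No HTML-specific library is involved: the same parser would read an RSS feed or a configuration file.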

The issues

XHTML 1.0 dates back to 2000. If it is so powerful, why has it not been widely adopted?

I think here's why:

XML has a very strict syntax compared to SGML-derived languages like HTML. If there is a syntax error, a missing closing tag or a missing attribute double quote in even one line of your XHTML document, the browser will refuse to render it when it is served as XML.
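
The difference is easy to reproduce outside the browser. The sketch below feeds the same malformed fragment to Python's XML parser and to its tolerant HTML parser; these are stand-ins for a browser's XML and HTML modes, not the real thing, and the helper names are my own.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# The second <p> is never closed: tolerable for HTML, fatal for XML.
MALFORMED = "<div><p>First paragraph</p><p>Second paragraph</div>"


class TagCollector(HTMLParser):
    """Records every start tag the lenient HTML parser encounters."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def parses_as_xml(source):
    """Return True if the source is well-formed XML, False otherwise."""
    try:
        ET.fromstring(source)
        return True
    except ET.ParseError:
        return False


def tags_seen_as_html(source):
    """Return the start tags found by the tolerant HTML parser."""
    collector = TagCollector()
    collector.feed(source)
    collector.close()
    return collector.start_tags


if __name__ == "__main__":
    print("Well-formed XML?", parses_as_xml(MALFORMED))
    print("Tags seen as HTML:", tags_seen_as_html(MALFORMED))
```

The XML parser gives up at the first mismatched tag, while the HTML parser reports everything it can make sense of.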

Moreover, character encoding and JavaScript access to the DOM are more difficult in XHTML documents: try writing a & character (it must be escaped as &amp;) or accessing an element without its XHTML namespace.
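
Both pitfalls can be sketched with Python's XML library standing in for the JavaScript DOM (an analogy, not the browser API). The build_paragraph and is_well_formed helpers are hypothetical names introduced here.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_paragraph(text):
    """Wrap arbitrary text in an XHTML <p>, escaping &, < and > first."""
    return '<p xmlns="%s">%s</p>' % (XHTML_NS, escape(text))


def is_well_formed(source):
    """Return True if the source parses as XML."""
    try:
        ET.fromstring(source)
        return True
    except ET.ParseError:
        return False


# Pitfall 1: a bare ampersand makes the whole document ill-formed.
raw = '<p xmlns="%s">Tom & Jerry</p>' % XHTML_NS
safe = build_paragraph("Tom & Jerry")  # contains "Tom &amp; Jerry"

# Pitfall 2: lookups that forget the namespace silently find nothing.
document = ET.fromstring(
    '<html xmlns="%s"><body>%s</body></html>' % (XHTML_NS, safe)
)
without_namespace = document.find("body")
with_namespace = document.find("{%s}body/{%s}p" % (XHTML_NS, XHTML_NS))

if __name__ == "__main__":
    print("raw parses:", is_well_formed(raw))
    print("safe parses:", is_well_formed(safe))
    print("lookup without namespace:", without_namespace)
    print("lookup with namespace:", with_namespace.text)
```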

The Facebook case

Facebook includes an XHTML 1.0 Strict doctype in each page. However, it serves documents with the text/html content type in the HTTP response header, which means browsers do not treat the content as XML.

XHTML has spread within Facebook in recent years: the last time I saw it was in FBML, a language used to extend the capabilities of Facebook applications on the client side. FBML tags are included in an XML namespace and are interpreted via JavaScript to produce an effect (not by the browser itself).
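
Since the content type decides how the page is parsed, a server that wants real XML handling has to choose it per request. The following is a minimal sketch of that choice based on the Accept request header; it is not what Facebook does, the function names are hypothetical, and a production version would need fuller header parsing.

```python
XHTML_TYPE = "application/xhtml+xml"
HTML_TYPE = "text/html"


def accepted_types(accept_header):
    """Parse an Accept header into {media type: q value}; q defaults to 1.0."""
    accepted = {}
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unreadable q value as a refusal
        accepted[media_type] = quality
    return accepted


def choose_content_type(accept_header):
    """Serve XHTML as XML only to clients that explicitly ask for it."""
    accepted = accepted_types(accept_header)
    if accepted.get(XHTML_TYPE, 0.0) > 0.0:
        return XHTML_TYPE
    return HTML_TYPE


if __name__ == "__main__":
    header = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
    print(choose_content_type(header))
    print(choose_content_type("text/html, */*"))
```

Falling back to text/html keeps the page readable for clients that never mention application/xhtml+xml, at the price of losing strict XML parsing for them.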

For example, the following snippet produces a friend selector, of course customized for the current user:

<form action="http://www.example.com/handler.php" id="testForm" method="post">
  <fb:friend-selector uid="12345" name="uid" idname="grab_me_please" prefill_id="7906796"/>
  <input type="submit" value="test" />
</form>


Rather than resorting to custom attributes on standard tags like many JS frameworks do, this declarative approach adds an fb XML namespace and makes available a whole new set of tags with extended capabilities. And it relies on a well-documented standard like XHTML.

However, Facebook is in the process of deprecating FBML (moving to iframe-based applications), and recommends simply using more widely adopted standards: HTML and CSS. The same features of FBML are now available via a JavaScript SDK and a bunch of Social Plugins.
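
To illustrate how namespaced tags can be picked out of a page by a generic tool, here is a rough Python sketch that scans the form above for elements in the fb namespace. It mimics only the "find the custom tags" step, not Facebook's JavaScript; the namespace URI is a placeholder I chose for the example, and find_custom_tags is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, chosen for this example only.
FB_NS = "http://example.com/ns/fb"

SNIPPET = """<form xmlns:fb="http://example.com/ns/fb"
      action="http://www.example.com/handler.php" id="testForm" method="post">
  <fb:friend-selector uid="12345" name="uid" idname="grab_me_please" prefill_id="7906796"/>
  <input type="submit" value="test" />
</form>"""


def find_custom_tags(source, namespace):
    """Return (local name, attributes) for every element in the given namespace."""
    prefix = "{%s}" % namespace
    found = []
    for element in ET.fromstring(source).iter():
        if element.tag.startswith(prefix):
            # Strip the "{namespace}" part to get the local tag name.
            found.append((element.tag[len(prefix):], dict(element.attrib)))
    return found


if __name__ == "__main__":
    for name, attributes in find_custom_tags(SNIPPET, FB_NS):
        print(name, attributes)
```

Because the custom tags sit in their own namespace, they cannot collide with standard elements such as <form> and <input>.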

XHTML5!

Just when you thought XHTML might disappear, I have to tell you that XHTML has evolved to accommodate the HTML 5 specification. You can write (only if you really want to, of course) HTML 5 as valid XML. However, it seems that XHTML 5 will try to remain backward compatible with HTML, for example by allowing elements like <i> and <b> in certain use cases.

As a web developer, will you use XHTML [5] in the future?

Resources

http://xhtml.com/en/future/x-html-5-versus-xhtml-2/

http://www.w3.org/MarkUp/2004/xhtml-faq#advantages admits that translating an HTML document into XHTML won't make a difference, unless you include other languages.

Wikipedia's article on MathML shows you an example of a language that can be used in an XHTML document as a module.

Published at DZone with permission of Giorgio Sironi, author and DZone MVB.


Comments

Sam S replied on Tue, 2011/02/01 - 9:03am

In 2011, XHTML is doing quite well: http://w3techs.com/technologies/history_overview/markup_language.

Whether it will still be around in 2015, that's another question.

Mario T. replied on Tue, 2011/02/01 - 9:40am

I think your example highlights the XHTML problem quite well. Most developers have been using the XHTML syntax solely, without any intent to embed MathML or SVG namespaces. Facebook is the exception. And MSIE successfully blocked usage of the correct MIME type anyway, so most developers didn't notice that target=_blank was forbidden in XHTML-strict. So there was never much real usage, just mediocre buzzword application.

Giorgio Sironi replied on Wed, 2011/02/02 - 7:58am in response to: Mario T.

I remember defining class="blank" on links to modify their target via JavaScript... Weird.
