XNX? (that's short for "XHTML's not XML?")

XHTML documents are (at least I always believed so) XML documents and, as such, should be parsed with an XML parser. I don’t know of any XML parser or data model that would even be able to distinguish

<div></div>

from

<div/>

So I would assume that both are valid markup. Or maybe both are invalid. But as far as I know, neither a DTD nor an XSD can allow mixed content in an element and at the same time forbid that element to be empty, so both forms must be well-formed and valid (validated) XML. Maybe there are rules for XHTML that go beyond pure XML validation, but there are special validators to check those. And since application-level validation must take place on the data model, not on the serialized syntax, any such application-level rule would be defined on a layer of abstraction that cannot possibly see which syntax was used at the parsing level.
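To make the claim about the data model concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module (my choice for illustration; the `<body>` wrapper and the helper name `canonical` are made up for this example). It parses both spellings and compares the resulting trees:

```python
import xml.etree.ElementTree as ET

EXPLICIT = "<body><div></div></body>"
SHORT = "<body><div/></body>"


def canonical(markup: str) -> bytes:
    """Parse markup with an XML parser and serialize the resulting tree again."""
    return ET.tostring(ET.fromstring(markup))


explicit_div = ET.fromstring(EXPLICIT)[0]
short_div = ET.fromstring(SHORT)[0]

# After parsing, each <div> is an element with no text and no children;
# the tree keeps no record of which spelling was used in the source.
for div in (explicit_div, short_div):
    print(div.tag, repr(div.text), len(div))

# Serializing both trees again yields byte-identical output.
same_tree = canonical(EXPLICIT) == canonical(SHORT)
print("identical after a round trip:", same_tree)
```

At least for this parser, any rule that wanted to treat the two forms differently would have to be applied before parsing, because nothing after that point can tell them apart.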
So I am completely baffled to see that:

  • Both ways are accepted without any warnings by the W3C Validator.
  • The online version of HTML Tidy also accepts both as valid input, without warnings.
  • But while “tidying” the markup, it turns one of them into XML that is not even well-formed (!!!), by changing <div/> into <div> without a closing tag.
  • Firefox 3.6 completely refuses to display invalid XHTML, like the one generated by tidy.
  • But it accepts both <div></div> and <div/> without complaining.
  • It displays <div></div> correctly, but (that’s the second shocking thing) completely messes up my page when I use <div/>.
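The well-formedness part of these observations can be reproduced with any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the third sample is my own reconstruction of what a `<div>` without a closing tag looks like, not the actual output of Tidy, and `is_well_formed` is a helper name invented for this example:

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


samples = {
    "explicit end tag": "<body><div></div></body>",
    "empty-element tag": "<body><div/></body>",
    # Hypothetical reconstruction: <div/> replaced by <div> with no end tag.
    "unclosed div": "<body><div></body>",
}

for label, markup in samples.items():
    print(f"{label}: well-formed = {is_well_formed(markup)}")
```

The first two samples pass and the third is rejected, which matches what I would expect from a strict XML consumer.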

So there are subtle differences between two identical things, which are silently misinterpreted by three software tools (the validator, Tidy, and Firefox) that should either do the right thing or tell me that I’m doing something wrong.
There are folks on the internet who say that in HTML, <element/> is not valid syntax at all. Others say that in HTML, empty divs, even <div></div>, are not allowed. See here and here.
But whatever they say, I’m doing XHTML, and it is completely against any logic that there’s any difference between <div></div> and <div/>.
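One possible explanation, which nothing above confirms, is that the page was handled by an HTML parser rather than an XML parser (for example because it was served as text/html). HTML parsing rules ignore the trailing slash on non-void elements such as div, so <div/> is read as a plain opening tag. The sketch below only simulates that one rule on top of Python's standard `html.parser`; it is not how any browser is implemented, and `TagSoupPaths`, `element_paths` and the shortened `VOID_ELEMENTS` set are names I made up for illustration:

```python
from html.parser import HTMLParser

# A few HTML void elements; illustrative subset, not the full list.
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


class TagSoupPaths(HTMLParser):
    """Record the path of open elements at every start tag."""

    def __init__(self):
        super().__init__()
        self.open_elements = []
        self.paths = []

    def handle_starttag(self, tag, attrs):
        self.paths.append("/".join(self.open_elements + [tag]))
        if tag not in VOID_ELEMENTS:
            self.open_elements.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Simulated HTML rule: the "/" is ignored, so <div/> only opens a div.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in self.open_elements:
            # Close everything up to and including the matching element.
            while self.open_elements.pop() != tag:
                pass


def element_paths(markup: str) -> list:
    parser = TagSoupPaths()
    parser.feed(markup)
    parser.close()
    return parser.paths


print(element_paths("<div></div><p>x</p>"))  # ['div', 'p']
print(element_paths("<div/><p>x</p>"))       # ['div', 'div/p']
```

Under this assumption, everything that follows <div/> ends up nested inside the div, which would be enough to wreck a layout without producing a single error message.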
I don’t know whether to blame the W3C, the Firefox implementors, or the authors of Tidy for this, but if XML and XHTML were good for anything, then it would be to eliminate those syntactic subtleties and save me from wasting hours chasing errors in my markup which, by definition, cannot be errors at all.
(PS: If you don’t get the meaning of the title… it’s an allusion to “GNU”, which is an acronym for “GNU’s not Unix”. That is kind of misleading, because GNU is very Unix-like; actually, it’s an attempt to be another Unix. And since the whole XML-XHTML thing is kind of misleading as well…)

2 thoughts on “XNX? (that's short for "XHTML's not XML?")”

    1. Hi Dominik,
      thanks for your comment and all the effort you spent to get it into my blog. I slightly edited your comment by inserting the missing tags into the text. I hope this is OK with you.
