X and accessibility

July 12, 2008 / Standards, pragmatism, accessibility and HTML.

In April 2007, David Andersson summarised the development of, and differences between, HTML5 and XHTML2 and concluded that the web’s future lies with HTML5. I think he’s generally right, though XHTML2 has never been a likely successor to HTML4/XHTML1. The real question is what will become of the X in XHTML, given that most authors are doing it wrong.

HTML5 is looking so strong because it’s a pragmatically driven project that incorporates much of what people are already doing—stealing XHTML’s thunder by keeping the standards-based focus while decoupling the web’s primary language from XML. (HTML5 is homologous to XML—it can even be served as XML—but most browsers will never see it that way.) And because it’s well-grounded it’s already being implemented.

Despite the efforts of the W3C to absorb HTML under the XML project, it seems that the two vocabularies will remain on separate paths, running parallel for now. This threatens the W3C’s goal of a semantic (machine-readable) web in its idealist form. WHATWG’s efforts which, like those of the microformats community, are grounded in popular practice, will get us only part-way there, but unlike XHTML2 they promise us something we can use here and now.

So is HTML5 a fait accompli? Taking the contrary view in a recent article, James Edwards still favours the current XHTML standard, served as XML where possible, over HTML5 for accessibility reasons. He doesn’t mention XHTML5 explicitly (i.e. HTML5 served with an XML MIME-type), but he does say he’d rather stick with XHTML1 than adopt HTML5’s markup spec, which drops support for several accessibility features, including the alt attribute for images and the summary and headers attributes for tables.

Edwards and Gez Lemon, linked above, are right that this is a problem, especially regarding the alt attribute (given the prevalence of images over correctly marked-up complex data tables). This needn’t be a practical quandary: Edwards is taking a stand on principle in sticking with XHTML1 because the spec recognises these accessibility features.

So what should an organisation that is concerned about accessibility do? This is a question I’m trying to answer. Two new sets of guidelines are particularly relevant: WCAG+Samurai released on February 26 (see commentary by Joe Clark and Roger Johansson), and the W3C’s much-revised WCAG 2 Candidate Recommendation released on April 30 (discussed in interviews with Patrick Lauke and Lachlan Hunt).

The two questions that need to be addressed:

  • Which set of normative rules, if any, should guide the organisation?
  • Which X/HTML syntax, if any, maximises access for both assistive technologies and mobile user agents?

“If any” is important: I don’t want to presume that a single choice must be adopted in either case. It may be true, for example, that more than one version of HTML could be used without any significant detriment to accessibility, or that neither set of accessibility guidelines is completely appropriate or usable. (I doubt that, but let’s see.) Nevertheless, being able to specify one in each case is desirable, and so is testing and evaluating the results.
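As a rough illustration of the kind of testing mentioned above, here is a minimal sketch in Python that scans a page for two of the features under discussion: img elements with no alt attribute at all, and td cells that do or don’t carry a headers attribute. The class name, the function name and the sample markup are all hypothetical, invented for this example, and the check is deliberately naive: it says nothing about whether the alt text is any good, or whether a given table actually needs headers.

```python
from html.parser import HTMLParser


class AccessibilityAudit(HTMLParser):
    """Hypothetical helper: collects a few accessibility-related facts."""

    def __init__(self):
        super().__init__()
        self.images_missing_alt = []    # src of each <img> with no alt attribute
        self.cells_with_headers = 0     # <td> cells carrying a non-empty headers attribute
        self.cells_without_headers = 0  # <td> cells with no (or empty) headers attribute

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img":
            # alt="" is present (and valid for decorative images); only a
            # completely absent alt attribute is flagged here.
            if "alt" not in attributes:
                self.images_missing_alt.append(attributes.get("src", "(no src)"))
        elif tag == "td":
            if attributes.get("headers"):
                self.cells_with_headers += 1
            else:
                self.cells_without_headers += 1


def audit(markup):
    """Parse the markup and return the populated audit object."""
    parser = AccessibilityAudit()
    parser.feed(markup)
    parser.close()
    return parser


# Invented sample markup, for illustration only.
SAMPLE = """
<img src="chart.png" alt="Sales by region">
<img src="spacer.gif" alt="">
<img src="photo.jpg">
<table>
  <tr><th id="region">Region</th><th id="q1">Q1</th></tr>
  <tr><td headers="region">North</td><td headers="q1">120</td></tr>
  <tr><td>South</td><td headers="q1">95</td></tr>
</table>
"""

report = audit(SAMPLE)
print("Images missing alt:", report.images_missing_alt)
print("Cells with headers:", report.cells_with_headers)
print("Cells without headers:", report.cells_without_headers)
```

A count like this is only a starting point. It can show where the markup omits these attributes, but judging whether a page is actually accessible still means testing with real assistive technologies.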

I’m planning to follow up on this post once I’ve read the two documents and done some testing, and I’m interested in hearing what people think.

13 responses

  1. Ian Hickson

    HTML5 still has alt="" and still has headers=""; indeed, it defines them in much more detail than HTML4.

    It’s true that HTML5 won’t get us all the way to a completely machine-understandable Web, but it’s not clear to me that that’s an important destination. I think a human-readable and human-writable Web is a much more critical destination, and that’s the focus HTML5 takes. I think a much better way of getting a computer-understandable Web would be to make the computers understand the current Web. I think it’s going to be much easier to get computers to understand humans than it would be to get humans to understand how to write computer-understandable content like RDF, let alone motivating most humans to actually write such content.

    July 12th, 2008 at 8:49 pm #

  2. Adrian

    HTML5 still has alt="" and still has headers=""; indeed, it defines them in much more detail than HTML4.

    Ian, you’re right, I wasn’t specific enough. Regarding alt, I said “which drops support for several accessibility features, including the alt attribute for images” when I should have said “makes it possible for alt="" to be omitted.” The comments on Gez’s post flesh this out.[1] On headers, the issue is even more curly: they are in the spec but do not apply to the td element. Gez’s bug is still open on this,[2] so the accessibility issue remains a question. I’ve never made a table remotely like this,[3] let alone seen one marked up correctly, and I didn’t appreciate the complexity of the issues James raised in his post until you called them out, but from what I can tell these two accessibility issues are still unresolved for HTML5.

    I think it’s going to be much easier to get computers to understand humans than it would be to get humans to understand how to write computer-understandable content like RDF, let alone motivating most humans to actually write such content.

    I think I agree with you. The argument that it’s easier to change computer behaviour than human behaviour seems pretty sound.

    [1] http://juicystudio.com/article/html5-alt-text-authoring-tools.php
    [2] http://www.w3.org/Bugs/Public/show_bug.cgi?id=5822
    [3] http://juicystudio.com/wcag/tables/complexdatatable.html

    July 13th, 2008 at 2:31 am #

  3. steve faulkner

    As usual Ian Hickson is being disingenuous when he says that the spec still has headers. Indeed it does, but in a neutered form that does not provide the ability to mark up complex tables such that current AT will be able to decipher the relationships between header and data cells. As for the alt attribute, while it may be documented in much more detail than in HTML4, much of the documentation is of poor quality and in conflict with WCAG 1.0 and WCAG 2.0.

    Both of these issues are being reviewed by the W3C protocols and formats working group, which has expertise and knowledge in regards to accessibility and is responsible for ensuring accessibility considerations are taken into account in all specifications produced within the W3C. So we don’t have to rely upon only one person’s view of what is good for people with disabilities.

    July 13th, 2008 at 2:21 pm #

  4. Adrian

    Both of these issues are being reviewed by the W3C protocols and formats working group

    Steve, that’s good to hear. The arguments accessibility experts are making about alt and headers are convincing, especially once you follow them in depth. The bug report linked above is particularly insightful there.

    July 14th, 2008 at 5:56 pm #

  5. Steven Clark

    From a computer science perspective, while it might sound easier to get a computer to understand something than a human, I believe that “Artificial Intelligence” PhDs would argue that until the cows come home…

    There are some things that appear complex and computers solve them very fast. Other things were considered very simple 20+ years ago and still haven’t had much progress. Put simply, don’t forget computers are 1’s and 0’s – the dumbest creation on the planet.

    Given the wide-reaching proposal that computers are in fact smarter than the average human, we should be at a point where there are efficient Big O solvable algorithms for everything.

    Sorry, don’t mean to diverge. But when we’re talking about human to computer versus computer to computer interaction then it should really be in the conversation. When we talk about computer to computer, of course we mean interoperability of programming solutions, including databases and data stores, etc etc etc. For us to move into a context aware computing paradigm it would be a requirement at some point to have computers talking to computers.

    As long as we aren’t looking in that direction we’re really saying that our software will aim to work in unique or limited silos of interoperability.

    Anyway, diverging again Adrian. Sorry. Naturally the whole web won’t suddenly do this magic computer to computer inter-talking magic context thing all at once. There needs to still be mom and pop (and me) coding basic content. But software will get a lot out of that ability so big organisations with big apps would get a lot of advantage – web technologies in relation to service provisioning, etc etc. come to mind.

    It would be very sad to see the ultimate goal of a “meaningful” web disappear because it doesn’t appear attainable at the moment. This is still a young field and we’re just getting up and running.

    There’s something extremely boring about a web paradigm that doesn’t include my toilet taking medical stats to send to my health specialist etc.

    In short, machines need a common language to communicate. The more complex we make either end of that conversation the more programming effort we’re going to have to put in to interpretation. Some research on natural language understanding techniques and difficulties is worth investigating.

    I think the human to computer vs computer to computer argument does sound logical on face value, but in science it’s probably not that true.

    July 18th, 2008 at 9:33 pm #

  6. Steven Clark

    IMO also another issue that I see is that often “generality” is used in the form that, for example, most people wouldn’t know how to code all of that semantic stuff so it shouldn’t be there, why worry about it and we’ll just dumb down our ambitions (eg. Flickr users case study and alt attributes may come to mind in HTML 5).

    It’s a strange argument. Because most people in the end will always code junk from a WYSIWYG, we’re the exceptions. Most people wouldn’t code in C or Java either. They’re just different tools for different purposes. It’s just the generality argument comes up so often from proponents of dropping accessibility features because most authors don’t do them anyway yada yada…

    OK I’m early morning babbling lol. Don’t mean to sound like a flame, it’s an excellent article. I just think seriously that when we stop seeing the computer paradigm as someone sitting on a chair in front of a monitor then we might actually get a liberating computer experience. Or Big Brother lol.

    I’d like to think whatever the web is in 10 years time it will be nothing like I imagined it. As opposed to just better HTML for making websites.

    But, of course, I may be wrong about many things on a Saturday morning. :)

    July 18th, 2008 at 9:45 pm #

  7. Adrian

    There’s something extremely boring about a web paradigm that doesn’t include my toilet taking medical stats to send to my health specialist etc.

    One man’s utopia is another’s Orwellian nightmare.

    This is all a bit off-topic Steven. I don’t know whether Ian Hickson is right or wrong about the semantic web, and about teaching new tricks to humans versus computers. I do care about whether HTML5 is going to support important accessibility features. Whether HTML5 should play a large role in the web’s future or not, I think it is likely that it will. So it needs to be accessible and it needs to support existing adaptive technologies. Faulkner, Clark, Lemon and Edwards are all right that there is a problem here and the hardest part seems to be conveying it to the authors of the specification.

    most people wouldn’t know how to code all of that semantic stuff so it shouldn’t be there, why worry about it and we’ll just dumb down our ambitions (eg. Flickr users case study and alt attributes may come to mind in HTML 5)

    Semantics are extremely important to a rich and accessible web. I’m not sure whether XHTML is the best way to achieve that richness, but it seems fairly clear that XHTML as XML is not really working out the way the W3C planned. The big question is what to do with that. Regarding alt, it’s pretty clear that I’m in favour of requiring it.

    July 19th, 2008 at 4:27 pm #

  8. Steven Clark

    Adrian, all I meant by the Flickr case study reference was that the argument is “the general user doesn’t know how to do it anyway” as an excuse to reduce the accessibility of alt. I agree with what you are saying totally. That’s actually a very dangerous argument, the generalist one. Because we have to accept that by that account “most people don’t write valid anything”. There really needs to be a separation in that argument between “real developers” and “the junk that will exist regardless”. Yet, for some reason, this generalist argument keeps popping up. IMO if Flickr can’t meet a bar that’s their business case, not our excuse to dumb down to meet their current ability. They should be looking at altering their software, where appropriate, rather than dumbing down a spec to get them under the bar as valid anything. Especially at the cost of accessibility features that currently have support and are of benefit now.

    While admittedly off the topic, the broad statement about it being easier for computers to interpret the current web than for humans to author strict syntax is also a kind of easily digestible generalisation. That pre-supposes a whole lot of simplicity into the solution, and simply put computers aren’t the magic answer. It’s much easier for 2 computers to share information if the languages are the same on each end, and there is semantic meaning. It’s naive to think that garbage on either end will result in a clear conversation (even for humans).

    As for HTML5, I don’t think it can be all things to all people. It’s probably the way forward at this point because it’s progressing with support. Like you though, I don’t see it as healthy to drop accessibility features in the name of the ‘generalist’ argument. I apologise if my sarcasm didn’t come over that well, it was very early when I wrote the comment.

    Yes semantics are extremely important, too. Which goes back to what my computer talking to computer conversation was about. We need semantics to make that happen. Computers are really pretty dumb so if they have to think and filter and guess then that’s all more code and overhead and complexity, is my point.

    I agree too, Faulkner, Clark, Lemon and Edwards are right. I’ve agreed with that for a long time. Sometimes vocally. But like you say, my comment was slightly to the left of tangent, and I apologise.

    I just get a little miffed when generalisations get thrown around in the HTML5 conversation about reasons to clip accessibility features.

    Generalisations like “most users” and “computers can…”. I do see partly where Ian is coming from but I don’t agree with the arguments put forward in that way. Flickr is NO excuse to reduce the current accessibility benefit of alt. How? Again I come back to their argument about “most users” or “most people”. Most? Most people don’t care, and we shouldn’t be reducing the experience of some actual web users because “some developers” don’t know and aren’t willing to learn to code properly. Again, sarcastically: Apathetica 1.0 for dumb developers! It’s a call of frustration.

    If that makes sense. Again, sorry for sidetracking the comments. I’ll try to zip it and rant on my own bandwidth lol… regards… Steven Clark

    July 19th, 2008 at 7:40 pm #

  9. Adrian

    It’s naive to think that garbage on either end will result in a clear conversation (even for humans).

    Nicely put, though it’s all a matter of degree. If a particular site makes sense to people at all (and badly authored sites can still be understood by those who can access them) then it would be nice, utopian even, if computers could see that in a structured and meaningful way. Having said that, what I really wish is that people who ought to know better picked up their game and actually learned the craft.

    July 20th, 2008 at 9:39 pm #

