Making dasBlog (and ASP.net) XHTML 1.0 Strict Compliant

Over the last week or so, I’ve been modifying the dasBlog source code quite heavily to make it XHTML 1.0 Strict compliant. I also want it to support content negotiation, so that it can serve pages as application/xhtml+xml to clients that accept it, instead of only as text/html (more on that later).
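
To make the content negotiation idea concrete, here is a minimal sketch. It is written in Python rather than the C# that dasBlog uses, and it is not the dasBlog implementation; the function names parse_accept and choose_content_type are hypothetical. It assumes the server should fall back to text/html unless the client lists application/xhtml+xml explicitly, with a q-value at least as high as the one it gives text/html.

```python
def parse_accept(header):
    """Map each media range in an Accept header to its q-value (default 1.0)."""
    prefs = {}
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs[media_range] = q
    return prefs


def choose_content_type(accept_header):
    """Return application/xhtml+xml only when the client asks for it explicitly."""
    prefs = parse_accept(accept_header or "")
    # Wildcards never select XHTML here: many clients send */* without
    # being able to parse application/xhtml+xml.
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    html_q = prefs.get("text/html", prefs.get("text/*", prefs.get("*/*", 0.0)))
    if xhtml_q > 0 and xhtml_q >= html_q:
        return "application/xhtml+xml"
    return "text/html"
```

The conservative fallback is a design choice of the sketch: a page served as application/xhtml+xml to a client that cannot parse it is not displayed at all, whereas text/html always renders.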

The goals for this project:

  • Render only XML formatted XHTML 1.0 Strict markup;
  • Clean up the markup being used, and beautify it;
  • Give an option to minimize the size of a page.

The scope of the problem is large, because neither asp.net nor dasBlog provides much support for producing Strict markup. Over the next couple of weeks, I’ll publish several articles on how I fixed a bunch of issues:

  • Links don’t have proper character entity escaping (turning & into &amp; in urls like http://test.com?firstparam&secondparam ).
  • WebControls push style blocks in the body of the page.
  • Different components misuse the Write* family of methods on the HtmlTextWriter object that is used to render markup in the HTTP pipeline of asp.net.
  • Blog entries are not stored or processed as proper XHTML content. The markup needs to be cleaned up both when an entry is added and when an entry is pulled from the dasBlog store. Storing entries as XHTML-only content and publishing properly namespaced html content will be a next step in the project.
  • The __doPostBack function and the other javascript infrastructure code rendered by asp.net are not DOM compliant and rely on name attributes on several tags.
  • Several elements, including input and form, have a name attribute instead of / alongside an id attribute.
  • Input elements are not wrapped in the block-level elements (div, p, fieldset) that the Strict DTD requires inside a form.
  • Javascript blocks are not properly CDATA escaped, nor are style blocks.
  • Javascript blocks contain the language attribute that was deprecated in HTML 4.01 and is not supported by the strict DTD.
  • Several elements use inappropriate attributes (for example, a language attribute alongside script in an onclick handler).
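
The first item in this list, ampersand escaping in links, can be illustrated with a short sketch. This is Python rather than dasBlog’s C#, and the helper name escape_ampersands is hypothetical. It assumes that an ampersand already followed by something shaped like a character reference (such as &amp; or &#38;) should be left alone, so that escaping the same url twice does no harm.

```python
import re

# An ampersand that does NOT already start a character reference
# such as &amp;, &#38; or &#x26;.
_BARE_AMP = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def escape_ampersands(url):
    """Escape bare ampersands so the url is safe inside an XHTML attribute."""
    return _BARE_AMP.sub("&amp;", url)


escaped = escape_ampersands("http://test.com?firstparam&secondparam")
print('<a href="%s">link</a>' % escaped)
```

One caveat of this heuristic: a query string that really contains text shaped like an entity (for example ?a&copy;b) is left untouched, so code that knows its input is a raw, unescaped url is better off escaping every ampersand unconditionally.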

This non-exhaustive list is what must be fixed before we can start thinking of being compliant. There are also a couple of things I wanted to achieve to clean up the markup rendered by dasBlog:

  • Aggregate the different javascript pieces into one script block inside the page, for the code that changes on each request.
  • Move the static javascript code used by some controls into .js files embedded in the assemblies, and serve them separately.
  • Do these two things for style blocks as well.
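
The aggregation described in the first item can be sketched as a small registry that collects script fragments during a request and renders them as one CDATA-escaped block. This is an illustration in Python, not the dasBlog code; the ScriptRegistry class and its methods are hypothetical names. It assumes that fragments are registered under a key, and that registering the same key twice should keep only the first fragment.

```python
class ScriptRegistry:
    """Collects per-request script fragments and renders one script block."""

    def __init__(self):
        self._fragments = {}  # key -> code; dicts preserve insertion order

    def register(self, key, code):
        if "]]>" in code:
            # "]]>" would terminate the CDATA section early.
            raise ValueError("fragment %r contains ']]>'" % key)
        # A second registration under the same key is ignored.
        self._fragments.setdefault(key, code.strip())

    def render(self):
        if not self._fragments:
            return ""
        body = "\n".join(self._fragments.values())
        # The // comments hide the CDATA markers from text/html parsers,
        # while XML parsers still see a CDATA section.
        return (
            '<script type="text/javascript">\n'
            "//<![CDATA[\n"
            + body + "\n"
            "//]]>\n"
            "</script>"
        )


registry = ScriptRegistry()
registry.register("focus", "setFocus('title');")
registry.register("focus", "setFocus('title');")  # duplicate, ignored
registry.register("init", "initEditor();")
print(registry.render())
```

The same shape works for style blocks, with a style element and /* */ comments around the CDATA markers instead of // comments.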

I decided to fix these problems by:

  • Adding the proper API to the base asp.net page used in dasBlog to write javascript and style content in the header of the page.
  • Inheriting from HtmlTextWriter and building a new XhtmlTextWriter.
  • Overriding all the Write* methods on HtmlTextWriter and piping them through the proper AddAttribute / BeginElement methods (which involves parsing html content and cleaning it up on the fly).
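
To give an idea of what “parsing html content and cleaning it up on the fly” involves, here is a rough sketch. It uses Python’s standard html.parser module rather than C#, and it is not the XhtmlTextWriter itself; XhtmlFragmentCleaner and clean_fragment are hypothetical names. It assumes that raw markup written by a control is parsed and re-emitted with lowercase names, quoted and escaped attribute values, self-closed empty elements, and without the language attribute.

```python
from html import escape
from html.parser import HTMLParser

# Elements that have no content and must be self-closed in XHTML.
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}
# Attributes dropped on output (the Strict DTD does not allow them).
DROPPED_ATTRIBUTES = {"language"}


class XhtmlFragmentCleaner(HTMLParser):
    """Re-emits loosely written html as well-formed XHTML.

    Comments and doctypes are silently dropped in this sketch.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        self.out.append(self._open(tag, attrs, tag in VOID_ELEMENTS))

    def handle_startendtag(self, tag, attrs):
        self.out.append(self._open(tag, attrs, True))

    def handle_endtag(self, tag):
        if tag not in VOID_ELEMENTS:  # already closed by " />"
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Text is escaped as XML character data. Script bodies containing
        # < or & would need CDATA wrapping instead; not handled here.
        self.out.append(escape(data, quote=False))

    def _open(self, tag, attrs, self_close):
        # HTMLParser has already lowercased tag and attribute names.
        parts = [tag]
        for name, value in attrs:
            if name in DROPPED_ATTRIBUTES:
                continue
            if value is None:  # minimized attribute, e.g. "checked"
                value = name
            parts.append('%s="%s"' % (name, escape(value, quote=True)))
        return "<" + " ".join(parts) + (" />" if self_close else ">")


def clean_fragment(markup):
    cleaner = XhtmlFragmentCleaner()
    cleaner.feed(markup)
    cleaner.close()
    return "".join(cleaner.out)


print(clean_fragment('<A HREF="http://test.com?firstparam&secondparam">x</A><BR>'))
```

The sketch makes no attempt to balance unclosed tags or to validate against the DTD; it only normalizes the serialization of what the parser reports.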

I’ll write at least two articles on the subject over the next couple of weeks.
