The docType Class

This class is for people who want to use the PHP DOMDocument class to create valid XHTML documents sent with the proper MIME type, while still providing backwards compatibility for users whose web browsers do not properly support the application/xhtml+xml MIME type.

Using DOMDocument to generate your content is a little more tedious than typing content directly, but there are a lot of benefits. I will not go into much detail on them here, since they are discussed on the PHP list from time to time, but to highlight:

The class uses the $_SERVER['HTTP_ACCEPT'] variable, sent in a header by the requesting web client, to detect how well the client supports XHTML. When support is sufficient, it sets up the DOM as an XHTML document. Otherwise, it sets up the DOM as an HTML document.
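To give a rough idea of what this kind of detection involves, here is a minimal sketch. It is not taken from the class source: the function name clientPrefersXhtml() and the rule it applies (XHTML wins only if its q value is greater than zero and at least as high as the one for text/html) are my own assumptions for illustration.

```php
<?php
// Hypothetical helper, NOT the docType implementation: decide whether an
// Accept header rates application/xhtml+xml at least as highly as text/html.
function clientPrefersXhtml(string $accept): bool
{
    $q = ['application/xhtml+xml' => 0.0, 'text/html' => 0.0];
    foreach (explode(',', strtolower($accept)) as $range) {
        $parts = array_map('trim', explode(';', $range));
        $type = array_shift($parts);
        if (!array_key_exists($type, $q)) {
            continue; // ignore media types we do not care about
        }
        $quality = 1.0; // q defaults to 1 when the parameter is omitted
        foreach ($parts as $param) {
            if (strpos($param, 'q=') === 0) {
                $quality = (float) substr($param, 2);
            }
        }
        $q[$type] = $quality;
    }
    return $q['application/xhtml+xml'] > 0.0
        && $q['application/xhtml+xml'] >= $q['text/html'];
}

// The header may be missing entirely, so fall back to an empty string.
$accept = $_SERVER['HTTP_ACCEPT'] ?? '';
$useXhtml = clientPrefersXhtml($accept);
```

A client that never mentions application/xhtml+xml gets HTML under this rule, which is the conservative choice.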

When you are ready to serve the document, the class can send the appropriate header and the appropriate version of the content.

Class Source

Using the Class

Initialize the Class

<?php
require_once('xml_doctype.inc');
$dom = new DOMDocument('1.0','UTF-8');
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
 
$myPage = new docType($dom);
$xmlHtml = $myPage->document();
?>

The variable $xmlHtml represents the <html></html> node of your document. Add your child elements to that node.
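As a minimal sketch of adding children, the snippet below builds a head and a body with plain DOMDocument calls. So that it runs on its own, it creates a bare html element as a stand-in for the node that document() returns; the page title and paragraph text are made up for the example.

```php
<?php
// Stand-in for the node returned by $myPage->document(): a bare <html> root.
$dom = new DOMDocument('1.0', 'UTF-8');
$xmlHtml = $dom->appendChild($dom->createElement('html'));

// Build the head with a title.
$head = $dom->createElement('head');
$title = $dom->createElement('title', 'My Page');
$head->appendChild($title);

// Build the body. Text nodes are escaped for you when the DOM is serialized.
$body = $dom->createElement('body');
$para = $dom->createElement('p');
$para->appendChild($dom->createTextNode('Hello & welcome.'));
$body->appendChild($para);

// Attach both to the root node.
$xmlHtml->appendChild($head);
$xmlHtml->appendChild($body);
```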

Serving the Content

Once your document is fully constructed and you are ready to serve it:

<?php
$myPage->sendpage();
?>

That’s it! Your content has now been served. Browsers that properly support the application/xhtml+xml mime type will get an XHTML version of your page. Other browsers will get an HTML version of your page.
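For the curious, the difference between the two versions comes down to the Content-Type header and the serializer used. The sketch below is an assumption about the general approach, not the body of sendpage(); renderPage() is a hypothetical name. It relies only on the standard DOMDocument saveXML() and saveHTML() methods.

```php
<?php
// Hypothetical sketch, NOT sendpage() itself: pick a MIME type and a
// serialization for the same DOM depending on what the client supports.
function renderPage(DOMDocument $dom, bool $xhtml): array
{
    if ($xhtml) {
        // XML serialization: empty elements are self-closing (<br/>).
        return ['application/xhtml+xml; charset=UTF-8', $dom->saveXML()];
    }
    // HTML serialization: void elements have no closing tag (<br>).
    return ['text/html; charset=UTF-8', $dom->saveHTML()];
}

$dom = new DOMDocument('1.0', 'UTF-8');
$html = $dom->appendChild($dom->createElement('html'));
$body = $html->appendChild($dom->createElement('body'));
$body->appendChild($dom->createElement('br'));

[$type, $content] = renderPage($dom, false);
if (!headers_sent()) {
    header('Content-Type: ' . $type);
}
// $content now holds the markup to echo to the client.
```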

Public Variables

Public variables need to be defined after the class is initialized. The first three must be defined before the document() function is called. You can also extend the class to define them.

public $htmlDT
String. The DOCTYPE string to use for HTML documents. Defaults to HTML 4.01 strict.
public $xhtmlDT
String. The DOCTYPE string to use for XHTML documents. Defaults to XHTML 1.1.
public $xmlLang
String. The xml:lang attribute to use for XHTML documents. Defaults to en.
public $noChild
Array. In (X)HTML, some elements are never allowed to have child elements, for example meta, br, and hr. In XHTML they are self-closing, but when exported to HTML they are not. Because they may not have children, they take no closing tag: a closing tag technically implies a child, even if that child is 0 bytes. DOMDocument knows about most of these and gets them right on HTML export, but it may not know about the newest ones, such as the HTML 5 source element. This array lets you list additional elements that are never allowed to have children, so that if DOMDocument does not know about them, their spurious closing tags can be removed from HTML exports of the DOM.
public $keywordArray
Array. Array of keywords to add to the keyword meta tag. Defaults to empty (in which case no keyword metatag is generated.)
public $descriptionMeta
String. Used to generate a description meta tag if the string is not empty. Defaults to empty.
public $generator
Boolean. Whether or not you want a generator meta tag added to the document head. Defaults to true.
public $genstring
String. Specifies the string to put in the content attribute of the generator meta tag, if you want a generator tag. If empty, the content attribute will specify the version of PHP, the version of DOMDocument, and the version of libxml2 your PHP is compiled against. See the generated (X)HTML source of this page for an example.
public $chromeFrame
Boolean. If set to true and the requesting browser identifies itself as having the Chrome Frame plugin, the appropriate meta tag will be inserted into the head section when the document is sent. Defaults to false, but in the html5DT class that extends this class, it defaults to true.
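To illustrate the idea behind $noChild, here is a minimal sketch of removing spurious closing tags from an HTML string. The function stripClosingTags() is hypothetical and the regular expression is my own simplification; I am not claiming this is how the class does it internally.

```php
<?php
// Hypothetical helper, NOT the class internals: remove closing tags for
// elements that are never allowed to have children.
function stripClosingTags(string $html, array $noChild): string
{
    foreach ($noChild as $name) {
        // Matches </name> case-insensitively, allowing whitespace before ">".
        $pattern = '#</' . preg_quote($name, '#') . '\s*>#i';
        $html = preg_replace($pattern, '', $html);
    }
    return $html;
}

// HTML export in which the serializer did not recognize <source> as childless.
$html = '<video><source src="a.ogv"></source><source src="a.mp4"></source></video>';
$clean = stripClosingTags($html, ['source']);
```

Only the names you list are touched, so ordinary elements such as video keep their closing tags.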

Public Functions

constructor function docType($dom,$accept='')
Takes a DOMDocument object as the first argument and, optionally, an accept string as the second argument (this overrides what the browser sends, which is useful for debugging by forcing HTML or XHTML). This function is called when you initialize the class.
public function document()
Loads the Document Type and root element into the DOM object. Returns an object for the root html node that you can append children to.
public function addKeyword($keyword)
Requires one argument, a string. Adds the specified keyword to the $keywordArray public variable.
public function addKeyArray($keywords)
Requires one argument, an array. Adds the elements of the specified array to the $keywordArray public variable.
public function sendpage()
Sends the appropriate header and web page to the requesting client.
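As a rough sketch of what a keyword list turns into, the snippet below merges two arrays of keywords and writes them to a keywords meta tag, skipping the tag when the list is empty. The de-duplication step and the comma-and-space separator are assumptions on my part; the class may handle these differently.

```php
<?php
$dom = new DOMDocument('1.0', 'UTF-8');
$head = $dom->appendChild($dom->createElement('head'));

// Start with some keywords, then merge in a second array.
$keywords = ['php', 'DOMDocument'];
$keywords = array_merge($keywords, ['xhtml', 'php']);

// Assumption: duplicates are dropped, keeping the first occurrence.
$keywords = array_values(array_unique($keywords));

// Only generate the meta tag when there is something to put in it.
if (count($keywords) > 0) {
    $meta = $dom->createElement('meta');
    $meta->setAttribute('name', 'keywords');
    $meta->setAttribute('content', implode(', ', $keywords));
    $head->appendChild($meta);
}
```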

Gotchas

White Space

Make sure you do not have any blank lines or carriage returns before your opening <?php. Otherwise the server may send a header and content before you intend it to, which will result in a broken page.

Lower Case Elements

HTML is not case sensitive for element and attribute names. XML is, and XHTML uses lower case for element and attribute names. When you create elements and attributes, make sure they are lower case.
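A small demonstration of the case sensitivity, using only standard DOMDocument calls: in an XML DOM, IMG and img are two different element names, so a lookup for one does not find the other.

```php
<?php
$dom = new DOMDocument('1.0', 'UTF-8');
$root = $dom->appendChild($dom->createElement('html'));

$root->appendChild($dom->createElement('IMG')); // not a valid XHTML element name
$root->appendChild($dom->createElement('img')); // correct: lower case

// Lookups are case sensitive, so each name matches exactly one element.
$lower = $dom->getElementsByTagName('img')->length;
$upper = $dom->getElementsByTagName('IMG')->length;
```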

HTML Entities

HTML defines many entities that are popular in web design. For example, &nbsp; and &copy;.

These are not valid in XML unless they are defined in the Document Type. Do not use them: they will cause XML errors in clients that receive XHTML content. Use the entity number instead. For a non-breaking space, use &#160;. For the copyright symbol, either use &#169; or use a UTF-8 text editor and type a © directly.
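You can see the difference for yourself with loadXML(). In this minimal sketch the sample strings are made up; the named entities make the parse fail, while the numeric references load cleanly and come out as the real characters.

```php
<?php
// Collect parse errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$withEntities = '<p>&copy;&nbsp;Example</p>';   // named HTML entities
$withNumbers  = '<p>&#169;&#160;Example</p>';   // numeric references

$first = new DOMDocument('1.0', 'UTF-8');
$entitiesOk = $first->loadXML($withEntities);   // false: entities are undefined
libxml_clear_errors();

$second = new DOMDocument('1.0', 'UTF-8');
$numbersOk = $second->loadXML($withNumbers);    // true: well-formed XML
$text = $second->documentElement->textContent;  // real characters, not references
```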

JavaScript

Inline JavaScript is bad form. You really should keep your JavaScript in external script files and reference them in your document head. However, even though it is bad form, it is technically legal to define your scripts inline.

If you insist on using inline JavaScript, be aware that XHTML does not play nice with JavaScript contained in a comment. It needs to be contained in a CDATA block. Note that using a CDATA block will fail in some older browsers, so you really should just keep all your JavaScript external. It really is the best way.
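If you do go the inline route, DOMDocument can produce the CDATA block for you with createCDATASection(). This is a minimal sketch; the JavaScript string is a made-up example chosen because it contains < and &, which would otherwise need escaping in XML.

```php
<?php
$dom = new DOMDocument('1.0', 'UTF-8');
$script = $dom->appendChild($dom->createElement('script'));
$script->setAttribute('type', 'text/javascript');

// Characters such as < and & are left untouched inside a CDATA section.
$js = 'if (a < b && b > 0) { go(); }';
$script->appendChild($dom->createCDATASection($js));

// Serialize just the script element as XML.
$xml = $dom->saveXML($script);
```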

Additionally, the JavaScript document.write() function cannot be used with XHTML. If you have JavaScript that modifies the document, use the JavaScript DOM2 functions.

I am not a JavaScript guru. I personally try to use it as little as possible. However, I do think it is worth mentioning that by far, the very best web site I have ever seen discussing JavaScript is Quirksmode. His book on JavaScript is on my highly desired book list.

Tips and Tricks

When creating documents from scratch with DOMDocument, it can be a lot more tedious than just typing the raw (X)HTML:

<img src="funny.jpg" width="300" height="300" alt="[Kid with Pie]" style="float: left;">

That is one line of HTML, but it takes multiple lines to generate with DOMDocument: one to create the img element, one for each attribute, and finally one to add it to the document object.
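Spelled out with DOMDocument, the same image looks like the sketch below. The body element is only a stand-in parent so that the snippet runs on its own; in a real page you would append the image to whichever node it belongs in.

```php
<?php
$dom = new DOMDocument('1.0', 'UTF-8');
$body = $dom->appendChild($dom->createElement('body')); // stand-in parent node

// One line to create the element...
$img = $dom->createElement('img');
// ...one line per attribute...
$img->setAttribute('src', 'funny.jpg');
$img->setAttribute('width', '300');
$img->setAttribute('height', '300');
$img->setAttribute('alt', '[Kid with Pie]');
$img->setAttribute('style', 'float: left;');
// ...and one line to add it to the document.
$body->appendChild($img);
```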

Static Content

For static content, you can still create your content the old way. You just need to write it as vanilla XML and then import it into your document. For example, suppose you have a text file called 'content.xml' containing the following:

<?xml version="1.0" encoding="UTF-8"?>
<html>
<head><title>I am a web page</title></head>
<body>
<h1 class="funky">Hello World!</h1>
<p style="text-align: center;">I am a paragraph<br />
with a self closing break <span style="color: red; font-family: monospace;">
and a span</span>.</p>
</body>
</html>

Since it is clean XML, you can import that XML into your DOM using the following technique:

<?php
require_once('xml_doctype.inc');
$dom = new DOMDocument('1.0','UTF-8');
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;

// Set up the page and get the root html node
$myPage = new docType($dom);
$xmlHtml = $myPage->document();

// Load the static content into a temporary DOM
$xmlfile = 'content.xml';
$buffer = file_get_contents($xmlfile);
$tmpDOM = new DOMDocument('1.0','utf-8');
$tmpDOM->loadXML($buffer);

// Import the head node (and all its children) into the main DOM
$nodeList = $tmpDOM->getElementsByTagName('head');
$impHead = $nodeList->item(0);
$xmlHead = $dom->importNode($impHead,true);

// Import the body node (and all its children) into the main DOM
$nodeList = $tmpDOM->getElementsByTagName('body');
$impBody = $nodeList->item(0);
$xmlBody = $dom->importNode($impBody,true);

// Imported nodes are not part of the tree until they are appended
$xmlHtml->appendChild($xmlHead);
$xmlHtml->appendChild($xmlBody);

$myPage->sendpage();
?>

The result can be seen here: example.php. Try viewing the generated source in a browser that properly supports XML (like Firefox) and then in a browser that does not (like Internet Explorer).

Now you can edit content.xml at your leisure. Just make sure you keep the XML well-formed. DOMDocument also has a loadHTML() function that is a little more lenient, but I do not recommend using it. It tends to replace UTF-8 multibyte characters with entities, and that causes problems. So stick with loadXML() and keep your content as well-formed XML.
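Since a single unclosed tag will break the import, it can help to check the content before using it. Here is a minimal sketch using the standard libxml error functions; xmlErrors() is a hypothetical helper name, and the two sample strings are made up.

```php
<?php
// Hypothetical helper: return a list of parse error messages for an XML
// string. An empty array means the string is well-formed.
function xmlErrors(string $xml): array
{
    // Collect errors internally rather than emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $dom = new DOMDocument('1.0', 'UTF-8');
    $dom->loadXML($xml);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    // Restore the previous error handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $messages;
}

$good = xmlErrors('<p>one<br /></p>'); // self-closing break: well-formed
$bad  = xmlErrors('<p>one<br></p>');   // HTML-style break: not well-formed
```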

Dynamic Content

For dynamic content, life will be a lot easier if you write yourself a library of functions and classes to do the dirty work for you. For example, this page was generated using the functions described in my DOM Functions web page.
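To show the kind of helper I mean, here is a minimal sketch. The function el() is hypothetical and written just for this example; it is not one of the functions from the DOM Functions page. It folds element creation, attributes, and text content into a single call.

```php
<?php
// Hypothetical helper: create an element with attributes and optional text.
function el(DOMDocument $dom, string $name, array $attrs = [], string $text = ''): DOMElement
{
    $node = $dom->createElement($name);
    foreach ($attrs as $key => $value) {
        $node->setAttribute($key, (string) $value);
    }
    if ($text !== '') {
        // createTextNode() leaves escaping to the serializer.
        $node->appendChild($dom->createTextNode($text));
    }
    return $node;
}

$dom = new DOMDocument('1.0', 'UTF-8');
$body = $dom->appendChild(el($dom, 'body'));
$body->appendChild(el($dom, 'h1', ['class' => 'funky'], 'Hello World!'));
$body->appendChild(el($dom, 'img', ['src' => 'funny.jpg', 'alt' => '[Kid with Pie]']));
```

With a helper like this, each element is back to one line of PHP, which takes much of the tedium out of building pages by hand.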
