Understanding the Web series

Understanding HTML


This topic explores the principles and purpose of the HyperText Markup Language (HTML). HTML is officially defined in standards published by the World Wide Web (WWW) Consortium (W3C) at http://www.w3.org/MarkUp/.


W3C produces what are known as "Recommendations". These are specifications, developed by W3C working groups, and then reviewed by Members of the Consortium. A W3C Recommendation indicates that consensus has been reached among the Consortium Members that a specification is appropriate for widespread use.

A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR.


HTML is a non-proprietary application of the Standard Generalized Markup Language (SGML; see ISO 8879). It is designed for simple, human-readable markup of documents using plain ASCII text, for transmission and distribution over the Internet and the WWW. HTML marks up a document with tags such as <h1> and </h1> to structure text into headings, paragraphs, lists, hypertext links, and so on.
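
As a minimal sketch of such markup, the fragment below uses tags for a heading, a paragraph, a list, and a hypertext link. The wording inside the tags is illustrative only; the link target is the W3C technical reports address given earlier in this topic.

```html
<!-- A heading, marked up with a start tag and an end tag -->
<h1>Understanding HTML</h1>

<!-- A paragraph of ordinary text -->
<p>HTML structures plain text with tags.</p>

<!-- An unordered list; the second item contains a hypertext link -->
<ul>
  <li>A simple list item</li>
  <li>A link to the <a href="http://www.w3.org/TR">W3C technical reports</a></li>
</ul>
```

Because the markup is plain text, a fragment like this can be typed in any text editor and remains readable even before a user agent renders it.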

HTML can be created and processed by a wide range of tools, from simple plain text editors to sophisticated WYSIWYG authoring tools.

HTML is retrieved and read by a program or device known as a 'user agent', which may render and display the document, read it aloud, cause it to be printed, or convert it to another format. An 'HTML user agent' is one that supports the HTML 2.x, HTML 3.x, or HTML 4.x specifications (see HTML4).

Notes

The basic structure of an HTML table is described in Understanding HTML Tables.

Chronology (major releases)

HTML 2.0
HTML 2.0 (RFC 1866) was developed by the IETF's HTML Working Group, which closed in 1996. It set the standard for core HTML features based upon current practice in 1994. Note that with the release of RFC 2854, RFC 1866 has been obsoleted and its current status is HISTORIC.
HTML 3.2
W3C's first Recommendation for HTML, representing the consensus on HTML features for 1996. HTML 3.2 added widely deployed features such as tables, applets, text flow around images, superscripts, and subscripts, while providing backwards compatibility with the existing HTML 2.0 standard.
HTML 4.0
First released as a W3C Recommendation on 18 December 1997. A second release was issued on 24 April 1998 with changes limited to editorial corrections. This specification has now been superseded by HTML 4.01.
HTML 4.01
HTML 4.01 is a revision of the HTML 4.0 Recommendation which fixes minor errors found since its release.
It defined three document type definitions:
    'Strict', 'Transitional', and 'Frameset';
and two fragment identifiers:
    the 'name' attribute for the elements 'a', 'applet', 'form', 'frame', 'iframe', 'img', and 'map'; and
    the 'id' attribute.
XHTML 1.0
XHTML 1.0 is a reformulation of HTML 4.01 in XML, and combines the strength of HTML 4 with the power of XML. It was first released as a W3C Recommendation on 26 January 2000 and revised on 1 August 2002 with no substantive changes, only the integration of various errata. The XHTML 1.0 specification relies on HTML 4.01 for the meanings of XHTML elements and attributes.
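
The HTML 4.01 entry above mentions three document type definitions and two ways of naming a fragment target. A minimal sketch of a document that declares the 'Strict' definition and uses both the 'id' and 'name' attributes might look like the following. The title, headings, text, and target names are assumptions made for illustration.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Fragment identifier sketch</title>
</head>
<body>
  <!-- Links whose address ends in #target jump to the matching target below. -->
  <p><a href="#history">Go to History</a> or <a href="#notes">go to Notes</a>.</p>

  <!-- Target defined with the 'id' attribute. -->
  <h2 id="history">History</h2>
  <p>Section text goes here.</p>

  <!-- Target defined with the 'name' attribute on an 'a' element. -->
  <h2><a name="notes">Notes</a></h2>
  <p>Section text goes here.</p>
</body>
</html>
```

The 'Transitional' and 'Frameset' definitions are selected the same way, using the public identifiers "-//W3C//DTD HTML 4.01 Transitional//EN" and "-//W3C//DTD HTML 4.01 Frameset//EN" in the first line of the document.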
Mission of the HTML Working Group
The mission of the HTML Working Group is to develop the next generation of HTML as a suite of XML tag sets with a clean migration path from HTML 4. Some of the expected benefits include: reduced authoring costs, an improved match to database & workflow applications, a modular solution to the increasingly disparate capabilities of browsers, and the ability to cleanly integrate HTML with other XML applications.


The Extensible HyperText Markup Language (XHTML™) is a family of current and future document types and modules that reproduce, subset, and extend HTML, reformulated in XML. XHTML Family document types are all XML-based, and ultimately are designed to work in conjunction with XML-based user agents. XHTML is the successor of HTML.

XHTML 1.0 is the first major change to HTML since HTML 4.0 was released in 1997. It brings the rigor of XML to Web pages and is the keystone in W3C's work to create standards that provide richer Web pages on an ever-increasing range of browser platforms, including cell phones, televisions, cars, wallet-sized wireless communicators, kiosks, and desktops.

Differences with HTML 4

Documents written in XHTML (and other XML-based languages; see XHTML Differences) differ from HTML 4 documents in several notable ways:



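    documents must be well-formed XML;
    element and attribute names must be in lower case;
    non-empty elements require an end tag;
    attribute values must always be quoted;
    attribute minimization is not supported (for example, write checked="checked"); and
    empty elements must be closed explicitly (for example, <br />).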
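
Several of the differences listed in the XHTML 1.0 specification can be sketched in markup. The document below is only an illustration, not a complete treatment: the title, text, and the form's 'action' value are hypothetical, and the 'Strict' document type is an arbitrary choice.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>XHTML sketch</title>
</head>
<body>
  <!-- Element and attribute names are written in lower case. -->
  <h1>Differences in practice</h1>

  <!-- Non-empty elements need an end tag, even where HTML 4 lets it be omitted. -->
  <p>First paragraph.</p>
  <p>Second paragraph.</p>

  <!-- Empty elements are closed explicitly. -->
  <p>Line one<br />line two</p>

  <!-- Attribute values are quoted, and minimized attributes are written in full. -->
  <form action="subscribe">
    <p><input type="checkbox" name="subscribe" checked="checked" /></p>
  </form>
</body>
</html>
```

In HTML 4 the same content could legally be written with upper-case tag names, no closing </p> tags, a bare <br>, and a minimized 'checked' attribute; an XML parser would reject that form as not well-formed.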




Document type declarations

According to the HTML Compatibility Guidelines of XHTML 1.0:
When an XML declaration is not included in a document, the [XHTML] document can only use the default character encodings UTF-8 or UTF-16.
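
Conversely, a document that uses some other encoding needs the XML declaration to say so. A minimal sketch follows; the ISO-8859-1 encoding, the 'Transitional' document type, and the page content are assumptions chosen purely for illustration.

```html
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>Encoding sketch</title>
</head>
<body>
  <!-- The XML declaration on the first line names the encoding;
       without it, only UTF-8 or UTF-16 could be assumed. -->
  <p>This document declares an encoding other than UTF-8 or UTF-16.</p>
</body>
</html>
```

The XML declaration, when present, must be the very first thing in the file, ahead of the document type declaration.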

See Also

Lotech Solutions' Tips, Tricks, and Procedures
