HTML tutorial - Document structure

Document structure

Before we can start on the document itself, we have to tell the browser what version of HTML we will be using. At the time of writing, the current HTML version is HTML 4.01, so that is what I will concentrate on in this tutorial. I will not cover XHTML, but for those of you who want to use it, the elements in XHTML 1.0 map directly to those available in HTML 4.01.

There are three versions of HTML 4.01, and they each have their own purposes. You should choose the one that is most appropriate for your uses. In this tutorial, I will concentrate on the strict version, and give notes about the others where needed.

Strict
The cleanest and simplest version of HTML. It allows you to use only the parts of HTML that relate to structure and, in general, does not allow the parts that relate to styling. HTML can perform some basic styling, but that role has been taken over by CSS.
Transitional
This is the messy version. It allows you to use several styling tags and attributes that really have no place in HTML, but were introduced before CSS existed. In general, it is best to keep the markup and styling separate (that makes it easier to change styles later, and to share the same style on multiple pages). However, you will need this HTML version for pages shown inside a frameset if their links use the TARGET attribute, since that attribute is not available in strict HTML. Elements and attributes that are only available in transitional HTML (with the exceptions of IFRAME and TARGET) are referred to as deprecated, and you are advised not to use them.
Frameset
This allows you to use a frameset instead of a body, so you can combine multiple pages into one.
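
To illustrate the difference between the strict and transitional versions, here is a minimal sketch of the same heading written both ways. The heading text, the colour, and the class name "warning" are made up for this example.

```html
<!-- Transitional only: FONT and the ALIGN attribute are deprecated styling markup -->
<h1 align="center"><font color="red">Warning</font></h1>

<!-- Strict: the markup describes structure only (the class name is hypothetical) -->
<h1 class="warning">Warning</h1>

<!-- The styling then lives in CSS, for example in a STYLE element inside the HEAD -->
<style type="text/css">
  h1.warning { color: red; text-align: center; }
</style>
```

The strict form keeps the heading itself free of styling, so the same rule can be reused on other pages or changed later without touching the markup.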

In practice, browsers generally allow you to use any HTML no matter what HTML type you use, but note that this is not a recommended way to write pages. You can even omit the HTML version declaration. However, a browser would be within its rights to ignore anything that is not in the specified version of HTML. Browsers are very forgiving. They are designed to cope with a large number of mistakes, but there is no standard for how to do this. Each browser tries its best to deal with as many mistakes as possible, but they may each take a different approach to dealing with these mistakes. The best way to write your code is to declare the correct type of HTML you will be using, not to make mistakes, and not to rely on the browser to understand how to fix your mistakes.

If you do not define these document types correctly, then most browsers will treat your document as having problems. They will start making deliberate mistakes (mainly to replicate the bugs of certain older browsers). These deliberate mistakes are known as quirks, and will change the behaviour of CSS and JavaScript. It is very important that you define the document types as I will show you here, so that you get a reliable response in all current browsers.
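
If you want to see which behaviour a browser has chosen for a page, the sketch below is one way to check. It relies on the `document.compatMode` property, which is not covered in this tutorial but is provided by current browsers; the id "mode" and the message text are made up for the example.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <title>Rendering mode check</title>
  </head>
  <body>
    <p id="mode">Checking rendering mode...</p>
    <script type="text/javascript">
      // document.compatMode is "CSS1Compat" when the browser follows the standards,
      // and "BackCompat" when it has switched to quirks behaviour.
      var mode = document.compatMode === 'CSS1Compat' ? 'standards' : 'quirks';
      // Replace the text of the paragraph with the result.
      document.getElementById('mode').firstChild.nodeValue =
        'This page is being handled in ' + mode + ' mode.';
    </script>
  </body>
</html>
```

Removing the first line of this example and reloading the page should change the message, since the browser no longer has a document type to work with.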

The way we tell the browser what version of HTML we will be using is with the DOCTYPE declaration. This should be the first thing in the HTML file. The three doctypes for the three HTML 4.01 versions are:

Strict
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
Transitional
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
Frameset
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
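
As a rough sketch of how the frameset doctype is used, the document below replaces the body with a frameset that combines two pages. The file names menu.html and content.html, the frame names, and the column widths are assumptions made for the example.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
<html>
  <head>
    <title>Frameset example</title>
  </head>
  <!-- FRAMESET takes the place of BODY in this version of HTML -->
  <frameset cols="25%,75%">
    <!-- Each FRAME loads a separate page (hypothetical file names) -->
    <frame src="menu.html" name="menu">
    <frame src="content.html" name="content">
    <!-- Fallback content for browsers that do not display frames -->
    <noframes>
      <body>
        <p>This page uses frames. <a href="content.html">Go to the content</a>.</p>
      </body>
    </noframes>
  </frameset>
</html>
```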

Example Structure

Each HTML page should consist of two sections:

  1. The head, where information about the page is held, such as the title, a short description and keywords. It may also contain stylesheet information and script libraries.
  2. The body, where the text or images that the user is going to see are held.
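
A head section holding the kinds of information listed above might look something like this sketch. The title, description, keywords, and the file names styles.css and library.js are placeholders, not files that belong to this tutorial.

```html
<head>
  <!-- The title of the page -->
  <title>Hello world example</title>
  <!-- A short description and keywords for the page -->
  <meta name="description" content="A minimal example page.">
  <meta name="keywords" content="HTML, example, tutorial">
  <!-- Stylesheet information, kept in a separate file (hypothetical name) -->
  <link rel="stylesheet" type="text/css" href="styles.css">
  <!-- A script library (hypothetical name) -->
  <script type="text/javascript" src="library.js"></script>
</head>
```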

The following tags are used to build the basic structure (note that the tags for HTML, HEAD, and BODY are optional, but as I have already said, it is a good idea to include them anyway, as it helps you to keep track of where things are in your document):

<html>
Signifies the start (and end) of the document.
<head>
Signifies the start (and end) of the head section of the document.
<title>
The title of the document. This is displayed by most browsers in the window title bar, the tab, and the taskbar button. Search engines will usually use it as the title for search results. It is also used by most browsers as a bookmark title, so try to keep it short.
<body>
Signifies the start (and end) of the visible contents of the document - this is where the parts you want the user to see should go.
BODY contents
The part you want the user to see - according to the specification, this must contain at least one block element, such as a heading, paragraph, table, or bullet list. All contents of the body must be inside a block level (or equivalent) element. Text content and inline elements must not be put directly into the body.
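
The rule about block elements can be illustrated with the following sketch, which shows the same content written incorrectly and then correctly for strict HTML 4.01. The text itself is just an example.

```html
<!-- Not valid in strict HTML 4.01: text and an inline element directly inside BODY -->
<body>
  Hello <em>World!</em>
</body>

<!-- Valid: the same content wrapped in a block element (a paragraph) -->
<body>
  <p>Hello <em>World!</em></p>
</body>
```

Browsers will normally display both versions, but only the second follows the strict specification.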

A complete example of an HTML document would be:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <title>Hello world example</title>
  </head>
  <body>
    <p>Hello World!</p>
  </body>
</html>

Last modified: 4 September 2008

This site was created by Mark "Tarquin" Wilton-Jones.