HTML tutorial

The Basics
Document structure
Text formatting
Semantics
Links
Special characters
Images
Lists
Tables
Framesets
Forms
Image maps

The Basics

This is not intended to teach you everything there is to know. My hope is that after reading this, you will be able to use these examples to put together HTML formatted documents of your own.

This tutorial is based on the W3C recommendation, where the idea is to use CSS wherever possible, but to still include semantic HTML to support older browsers. I have also written a CSS tutorial to demonstrate its use.

HTML is a markup language, and is by far the most commonly used language on the Web. Markup languages give structure to a document. They say what parts are headings, what parts are paragraphs, what parts are bullet lists, etc.

The W3C HTML 4.01 specification is available if you need to check on what elements are available and what attributes they support.

Tags or elements are on/off switches for different types of formatting. Unless otherwise specified, every "on" tag (such as <head>) needs a closing ("off") tag at its end (</head>). Wherever possible, include the closing tags, even if they are not essential, as this makes your markup easier to read and easier for you to follow.

Whilst many tags can be 'on' at any one time, under no circumstances should tags overlap. For example, this is invalid:

<p><strong>strong text <em>strong and emphasised text</strong> just emphasised text</em></p>

This version is valid:

<p><strong>strong text <em>strong and emphasised text</em></strong> <em>just emphasised text</em></p>

This is one of the reasons you will frequently see designers indent HTML, as it makes it easy to check which closing tag relates to which opening tag. For example, this is clearly wrong:

<p>
  <strong>
    strong text
    <em>
      strong and emphasised text
  </strong>
    just emphasised text
    </em>
</p>

However, this is valid:

<p>
  <strong>
    strong text
    <em>
      strong and emphasised text
    </em>
  </strong>
  <em>
    just emphasised text
  </em>
</p>

Note here that none of the indents will show up when the HTML is displayed. If we look at the valid version of that last example, when displayed it will look like this:

strong text strong and emphasised text just emphasised text

The reason for this is that in HTML, there is never more than one space between words or characters, regardless of line breaks, extra spaces or tab characters in the source code. The only ways to make more than one space are to use the 'non breaking space' entity &nbsp; (see the section on special characters), to set the text to be preformatted using <pre> tags, or to use CSS to style the text so that whitespace is respected.
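As a rough sketch of those three approaches, the fragment below puts them side by side with a normal paragraph. The class name keepspace is made up for this example, and the STYLE element it relies on would normally sit in the HEAD of the document.

```html
<style type="text/css">
  /* "keepspace" is a hypothetical class name, used only for this sketch */
  p.keepspace { white-space: pre; }
</style>

<!-- Extra spaces in the source collapse to a single space when displayed -->
<p>one     two</p>

<!-- Each non breaking space entity is kept, so three spaces are displayed -->
<p>one&nbsp;&nbsp;&nbsp;two</p>

<!-- Preformatted text keeps the spaces exactly as typed -->
<pre>one     two</pre>

<!-- The spaces are kept only if the browser applies the stylesheet above -->
<p class="keepspace">one     two</p>
```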

Tags, elements, and attributes

We have already seen what a tag is, and that there are opening and (in most cases) closing tags. The browser will read these tags, and it will internally create a representation of what you gave it. This internal representation is known as an element. It will then work out how to display the element on the screen. Not all elements are displayed (such as the HEAD element), and some elements will always exist, even if you do not create the tags for them (such as the HTML, HEAD, or BODY elements). These elements are most obvious through scripts or CSS, but for now, just trust me, they are there.

Some elements accept extra parameters. For example, the A element can accept the HREF parameter, which converts it into a link. These parameters are known as attributes, and are created like this:

<a href="somefile.html">

Although it is possible to specify some attributes without quotes (depending on the value they hold), I advise you to always include them, as it makes the document easier to maintain, and will help to avoid mistakes later.

Attributes are separated by spaces or linebreaks. Some attributes do not expect a value, and are written just as the name of the attribute, without any equals sign, or quotes:

<select id="oselect"
  name="somechoice" multiple>

Note that in HTML, tags and attribute names can be written in any case. Some authors like to use upper case to make them stand out from their contents, and some like to use lower case to make them easier to translate to XHTML later if needed. It is perfectly OK to use whichever makes the most sense or is the most useful to you.
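To illustrate, the two lines in this small sketch should be treated identically by a browser. Only the tag and attribute names are covered by this rule; the value inside the quotes is left alone, and file names in particular are often case sensitive on the server.

```html
<!-- Upper case tag and attribute names -->
<A HREF="somefile.html">Link text</A>

<!-- Lower case tag and attribute names: the same element, the same attribute -->
<a href="somefile.html">Link text</a>
```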

Document structure

Before we can start on the document itself, we have to tell the browser what version of HTML we will be using. The current HTML version is HTML 4.01, so that is what I will concentrate on in this tutorial (I will not cover XHTML for those of you who want to use it, but XHTML 1.0 maps directly to the elements available in HTML 4.01).

There are three versions of HTML 4.01, and they each have their own purposes. You should choose the one that is most appropriate for your uses. In this tutorial, I will concentrate on the strict version, and give notes about the others where needed.

Strict: The cleanest and simplest version of HTML. It allows you to use only the parts of HTML that relate to structure, and in general, does not allow you to use parts that relate to styling (since HTML can perform some basic styling, but this has been replaced by CSS).
Transitional: This is the messy version. It allows you to use several styling tags and attributes, that really have no place in HTML, but were originally introduced before CSS existed. In general, it is best to keep the markup and styling separate (that makes it easier to change styles later, and to share the same style on multiple pages), but you will need this HTML version if you start using framesets. Elements and attributes that are only available in transitional HTML (with the exceptions of IFRAME and TARGET) are referred to as deprecated, and you are advised not to use them.
Frameset: This allows you to use a frameset instead of a body, so you can combine multiple pages into one.

In practice, browsers generally allow you to use any HTML no matter what HTML type you use, but note that this is not a recommended way to write pages. You can even omit the HTML version declaration. However, a browser would be within its rights to ignore anything that is not in the specified version of HTML. Browsers are very forgiving. They are designed to cope with a large number of mistakes, but there is no standard for how to do this. Each browser tries its best to deal with as many mistakes as possible, but they may each take a different approach to dealing with these mistakes. The best way to write your code is to declare the correct type of HTML you will be using, not to make mistakes, and not to rely on the browser to understand how to fix your mistakes.

If you do not define these document types correctly, then most browsers will treat your document as having problems. They will start making deliberate mistakes (mainly to replicate the bugs of certain older browsers). These deliberate mistakes are known as quirks, and will change the behaviour of CSS and JavaScript. It is very important that you define the document types as I will show you here, so that you get a reliable response in all current browsers.

The way we tell the browser what version of HTML we will be using is with the DOCTYPE declaration. This should be the first thing in the HTML file. The three doctypes for the three HTML 4.01 versions are:

Strict

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

Transitional

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

Frameset

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">

Example Structure

Each HTML page should consist of two sections:

The head, where information about the page is held, such as the title, a short description and keywords. It may also contain stylesheet information and script libraries.
The body, where the text or images that the user is going to see are held.

The following tags are used to build the basic structure (note that the tags for HTML, HEAD, and BODY are optional, but as I have already said, it is a good idea to include them anyway, as it helps you to keep track of where things are in your document):

<html>: Signifies the start (and end) of the document.
<head>: Signifies the start (and end) of the head section of the document.
<title>: The title of the document. This is displayed by most browsers in the window title bar, the tab, and the taskbar button. Search engines will usually use it as the title for search results. It is also used by most browsers as a bookmark title, so try to keep it short and concise.
<body>: Signifies the start (and end) of the visible contents of the document - this is where the parts you want the user to see should go.
BODY contents: The part you want the user to see - according to the specification, this must contain at least one block element, such as a heading, paragraph, table, or bullet list. All contents of the body must be inside a block level (or equivalent) element. Text content and inline elements must not be put directly into the body.

A complete example of an HTML document would be:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <title>Hello world example</title>
  </head>
  <body>
    <p>Hello World!</p>
  </body>
</html>
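The rule that text and inline elements must not sit directly inside the BODY is easy to break by accident, so here is a small sketch of the difference. It shows only the BODY part of two separate documents, and assumes the strict doctype is in use.

```html
<!-- Invalid in strict HTML 4.01: text and an inline element directly inside BODY -->
<body>
  Hello <em>World!</em>
</body>

<!-- Valid: the same content wrapped in a block level element -->
<body>
  <p>Hello <em>World!</em></p>
</body>
```

Most browsers will display both versions in much the same way, but only the second one follows the specification.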

Text formatting

Block level elements

A block level element is something like a paragraph or heading. Typically, when browsers display them, they are shown with gaps above and below them to separate them from other elements. There are a few block level elements that are of main interest at this stage. These are the headings, paragraph, preformatted text, and generic div elements. Browsers will have a default way of displaying these, so that even if there is no styling information, readers can still make sense out of the information.

Unless otherwise stated, these block level elements can only contain inline elements or text. They cannot contain other block elements.

<h1>: Generally this serves as the main heading on the page. It is often the same as the title, but it does not have to be. Normally, this would be the first element inside the BODY.
<h2> - <h6>: These are sub headings. You should step through these in sequence. If you need to give subheadings to anything after the main heading, then you should use H2. If you need to create subsections within these sections, they should use H3, etc.
<p>: This denotes a paragraph, just like a normal paragraph in a document. In theory you can omit the closing tag, but I advise you to always include it.
<pre>: This denotes a block of preformatted text. In general you should avoid this, but it can be useful for a few things, such as displaying a block of source code, or displaying a verse of a poem. Inside a PRE block, all spaces, tabs and linebreaks are preserved, and will be displayed on the page.
<address>: This denotes a special type of paragraph that contains contact information, such as a postal or email address.
<blockquote>: This is for use when quoting text from other pages, books, documents, speeches, etc. It cannot contain text directly, and should instead contain other block level elements. They can then contain the quoted text. It is also possible to use the cite attribute to give the URL of a page where the quote was taken from, but no current browsers have a useful way to use that.
<div>: This is a generic block element, and it can contain text directly, or it can contain other block elements. In pure HTML, it serves no purpose. The reason it exists is mainly to facilitate styling, or to allow you to denote arbitrary blocks of content, to give meaning where there is nothing more appropriate. For example, there is no footer element in HTML, but you may still want to create a footer for your document. If you cannot find a more appropriate element for what you want to put in your footer, you can create a DIV and use either the ID or CLASS attributes to give it an identifier of your choice. You can then use that identifier to denote a footer, which you can then style with CSS.
<hr>: Displays a horizontal rule between two blocks. Note that if you need to display horizontal rules, there are usually better ways, such as using CSS to apply a border to an element. The HR element itself has no real meaning in HTML.
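As a minimal sketch of how several of these block level elements might sit together inside the BODY, consider the fragment below. The text content and the ID of "footer" are invented for the example; any identifier of your choice would work just as well.

```html
<h1>My document</h1>

<h2>Section 1</h2>
<p>A paragraph of normal text.</p>

<!-- Spaces and linebreaks inside PRE are displayed as written -->
<pre>
first line of some source code
    an indented second line
</pre>

<!-- BLOCKQUOTE holds other block elements, which then hold the quoted text -->
<blockquote>
  <p>A paragraph quoted from another document.</p>
</blockquote>

<address>22 Example Street<br>Exampletown</address>

<!-- A generic block, identified so that CSS can style it as a footer -->
<div id="footer">
  <p>Text for the footer goes here.</p>
</div>
```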

Inline elements

Inline elements are fragments of the contents of a block level element. For example, a piece of emphasised text inside a sentence. HTML has a large number of these inline elements, and they each serve a specific purpose. Browsers may apply default styles to these elements, such as displaying a line through deleted text, and using italics for definition text. There is no strict rule as to how these should be rendered, and most users will be used to the response of their browser. If you need a specific response, use CSS to style the elements however you need.

Inline elements can contain other inline elements as long as they are correctly nested.

<em>

Indicates emphasised text - most browsers render this in italics.

<p>This is an <em>important</em> word.</p>

<strong>

Indicates strongly emphasised text - most browsers render this in bold.

<li>This is <strong>very important</strong>.</li>

<sub>

Indicates subscript text - most browsers render this in a small font, positioned near the bottom of normal text.

<h3>Oxygen is O<sub>2</sub></h3>

<sup>

Indicates superscript text - most browsers render this in a small font, positioned near the top of normal text.

<p>This is the 2<sup>nd</sup> street.</p>

<code>

Used for a short piece of programming code that is used as part of a sentence - most browsers render this in a monospace font.

<dd>This is done using the <code>x++</code> operator.</dd>

<samp>

Used for a sample output from a program, script, or form - most browsers render this in a monospace font.

<p>This script would output <samp>Hello world</samp></p>

<kbd>

Used to indicate a key combination or keyboard shortcut - most browsers render this in a monospace font.

<td>Press <kbd>Ctrl+C</kbd> to copy</td>

<var>

Used to indicate a program or code variable - most browsers render this in italics.

<li>Here, we can use the <var>window.document</var> object</li>

<dfn>

Used to indicate that the word(s) inside the DFN element are being defined in the current paragraph (or whatever the parent block element is) - most browsers render this in italics.

<p>A <dfn>heading</dfn> is a title for a section of a document.</p>

<ins>

Indicates that the inserted text has been inserted into the document after its initial creation - generally used along with the DATETIME attribute to say when the change occurred. It is also possible to use the cite attribute to give the URL of a page with more details about the change, but no current browsers have a useful way to use that. Most browsers render this with an underline or in italics. The underline can make it easy to confuse with links, but most browsers have still adopted the underline convention.

<p>This is <ins datetime="2006-02-22T17:43:32Z">not</ins> the only time this has happened.</p>

<del>

Indicates that the text has been deleted from the document after its initial creation - generally used along with the DATETIME attribute to say when the change occurred - most browsers render this with a line through it.

<p>There are <del datetime="2006-02-22T17:43:32Z">loads of</del> options.</p>

<abbr> and <acronym>

Used to indicate that the word or letters are a contracted form of more words. There is a lot of confusion over where each of these should be used, but in general, the ABBR indicates that the letters are not spoken as a word (such as HTTP), whereas ACRONYM indicates that the contents are spoken as a word (such as laser). Future HTML versions will only have the ABBR element, so you may want to avoid the ACRONYM element altogether, and use only the ABBR element for all abbreviations and acronyms. The title attribute is used to give the full expanded form of the abbreviated word. Most browsers display this with a dotted bottom border. Internet Explorer 6- does not recognise either of these elements. Internet Explorer 7+ recognises both, but does not apply any styles to them by default.

<dd>This uses the <abbr title="HyperText Transfer Protocol">HTTP</abbr> protocol.</dd>

<q>

This is for use when quoting text from other pages, books, documents, speeches, etc. In some browsers it will automatically be given quotes at each end. It is also possible to use the cite attribute to give the URL of a page where the quote was taken from, but no current browsers have a useful way to use that.

<p>According to him <q cite="http://example.com/">there is no spoon</q>.</p>

<cite>

Used to give the title of a cited source - most browsers render this in italics.

<li>More information can be found in <cite>A Tale of Two Cities</cite>.</li>

<span>

This is a generic inline element. In pure HTML, it serves no purpose. The reason it exists is mainly to facilitate styling, or to allow you to denote arbitrary inline content, to give meaning where there is nothing more appropriate. For example, you may want to show how to work through a menu to find the desired option. Since there is no menu path element in HTML, you could use a span, give it an appropriate CLASS or ID that you can use as an identifier, then use that identifier to style it in the CSS.

<li>Open the options dialog using <span class="menu">Tools - Options</span>.</li>

<br>

Inserts a line break into text, and does not have a closing tag. This should be avoided in most cases. There are very few cases where this is the right thing to use. The only places where it should be used are where the parent element has no other means of formatting but the contents require line breaks, such as a postal address inside an address element.

<address>22 Example Street<br>Exampletown</address>

<b>, <i>, <big>, <small>, <tt>

These elements make text bold, italic, big, small, and fixed width font respectively. HTML transitional also allows a few others such as STRIKE or S (line-through), U (underline), and FONT (font families and colours). I recommend that you avoid these, mainly because there is almost always something much more appropriate.
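As a sketch of what "something more appropriate" might look like, the document below replaces a bold red FONT with a STRONG element styled from CSS, and replaces I with EM. The class name warning and the text are assumptions made up for this example.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <title>Styling without presentational elements</title>
    <style type="text/css">
      /* "warning" is a hypothetical class name chosen for this sketch */
      strong.warning { color: red; }
    </style>
  </head>
  <body>
    <!-- Instead of: <p><b><font color="red">Do not delete this file.</font></b></p> -->
    <p><strong class="warning">Do not delete this file.</strong></p>

    <!-- Instead of: <p>This is an <i>important</i> word.</p> -->
    <p>This is an <em>important</em> word.</p>
  </body>
</html>
```

The markup now says what the text is (strongly emphasised, emphasised), and the stylesheet alone decides how it looks.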

Semantics

Semantics simply means trying to make sure your documents mean something, even if CSS is not available, even if your document is being displayed on a device that does not use the same default styles as you expected, or if it is being interpreted through a non-visual medium, such as braille or speech.

The idea is simple. Use the right elements for the right tasks. Browsers understand what those elements mean, and they can use many different techniques to convey that meaning to the user. But that only works if you use the elements the way they were meant to be used.

For example, the DFN element is usually shown in italics. Assume that you want to display italic text, to emphasise it. You could use the DFN element. But that makes no sense, since you are not defining anything. You are emphasising it. You could also use the I element, since that will always display in italics (assuming the browser can display italics). However, this means nothing. It does not emphasise the text. It just displays it in italics. What you want to do is to emphasise the text, so use the EM element; that is what it is there for. Most browsers already display this in italics, but just to make sure, you can include this in your stylesheet:

em { font-style: italic; }

By using the right elements, you have the benefit that without CSS, the browser will display it emphasised in some way. Some text based browsers may use bold or underline, speech browsers may say that part with a little more stress or volume, but in all cases, the browser can use that information to tell the user that the text is emphasised.

So you can see how using the right elements in a sentence is a useful approach. But it does not end there. It is equally important to use the right levels of headings in the right places, and not to use other elements to replace headings. If you use proper headings, some browsers will even allow users to jump from heading to heading. This is only possible if you actually use proper headings.

The biggest offender when it comes to semantics is the table. I will cover these later, but just accept that tables have a purpose. That is to display tabular data. Unfortunately, due to poor support for styling by some old browsers, tables were often abused to format the page, putting parts of their contents into columns, or specific arrangements. It is common to find pages made out of multiple tables nested inside each other, forcing the page into whatever shape the author desired. Tables were never meant to do this, but to a large extent, they filled a void before CSS was supported well enough to use properly. In fact, in some cases, Internet Explorer 7- still forces tables to be abused to do this. To make a proper semantic page, use tables only for tabular data. They denote a table structure, nothing else. If you want to position parts of your page in strategic places, use CSS, that is what it is there for.

As a final example, consider the common navigation used on a page. Typically, pages will have a list of links, and these are often displayed at the top of the page, or to one side of the main content. There are many ways to produce a series of links, but some are much better than others. The list of links is basically just that, a list. So use a list. Give it a heading (such as "Navigation") and then use a bullet or ordered list.

There is no single rule for how to make a semantic document, but just remember that HTML has a lot of element types available, and whenever you think that you would like to make a part of the document look or behave in a certain way, take a look at the list of available elements, and use the one that suits the purpose for what that part of the page represents. If you want to make it look a specific way, style it with CSS, and leave the HTML free to denote what the parts of the page represent, instead of how they should look.

Another factor when making a semantic document is making sure the order of the document makes sense. For example, you should make sure the navigation and main content are sensibly ordered (typically with the navigation either first or last - CSS can then display this wherever you want). Try to keep the markup free of clutter. Adding in several unrelated blocks (often for advertising) in places where the user would not expect them can cause problems, so try to make sure that the flow of the document still makes sense. The following is a typical example:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <title>My document</title>
  </head>
  <body>
    <h1>My document</h1>
    <h2>Section 1</h2>
      <p>Some text about the current subject.</p>
      <h3>Section 1.1</h3>
        <p>Some more text about the current subject.</p>
        <p>Yet more text about the current subject.</p>
        <blockquote><p>Some quoted text</p></blockquote>
    <h2>Section 2</h2>
      <p>Some introduction to the data table.</p>
      <table>
        ...contents omitted for clarity...
      </table>
    <h2>Navigation</h2>
      <ul>
        <li><a href="/">Home</a></li>
        <li><a href="/articles">Other articles</a></li>
      </ul>
  </body>
</html>

Links

HTML pages link to each other using the A element with the HREF attribute set:

<a href="otherpage.html">Link text</a>

The A element is an inline element, and must not contain block elements. All links require a closing tag.

There are several ways to define the HREF of the link, so that you can link to other files in the same directory, files in parent or child directories, files on other Web sites, email addresses, and several other things as well.

If the href starts with a protocol (such as http:, https:, mailto:, ftp: or file:), then the link will be absolute, and will need to include the full server and path information. If it does not start with a protocol, it will be relative to the current file, and will need to use the path format to make it jump up and down directories as needed.

Full HREFs are usually specified in the following format:

http://domain_name/directory_name/sub_directory_name/file_name.file_extension

Relative HREFs depend on many things, but there are a few simple formats:

foo.html: Go to the file called foo.html in the current directory
../: Go back up one directory
somename/: Go forward to the somename directory
./: Go to the root of the current directory (most servers will serve the index.html file in the current directory)
/: Go to the root of the current Web site
/somepath: Go to the root of the current Web site, then follow the path
#identifier: Scroll the page to the element with the ID "identifier" or the A element with the name attribute set to "identifier" - this is known as an anchor

Some of these can be combined, as with the following example, where the link points to a page two directories up, into the directory called foo, then the file called bar.html, where it will scroll to the internal anchor called baz:

<a href="../../foo/bar.html#baz">
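For that link to scroll anywhere, bar.html has to contain an anchor called baz. The fragment below is a sketch of what the target page might contain; the heading text is invented for the example. The two options are alternatives, since an ID and an anchor name may not share the same value within one document.

```html
<!-- Option 1: give an existing element the ID -->
<h2 id="baz">Section heading</h2>

<!-- Option 2 (use instead of option 1, not as well): a named A element -->
<h2><a name="baz">Section heading</a></h2>

<!-- A link from elsewhere in bar.html itself only needs the identifier -->
<p><a href="#baz">Jump to the section</a></p>
```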

The following set of examples show what various HREFs would link to:

http://www.example.com/foo.html: Links to "http://www.example.com/foo.html"
file://localhost/c:/foo.html: Links to "c:\foo.html" (for security reasons, some browsers will not allow online Web pages to link to files on the user's computer)
foo.html: Links to "foo.html" in the same directory as the current page
#sublinknumber1: Scroll to the anchor in the current page called sublinknumber1
foo.html#sublinknumber1: Go to "foo.html" and scroll to the internal anchor called sublinknumber1
mailto:jon@example.com: Use the mail client the user has defined to start an email to "jon@example.com"

Special characters

Since HTML itself uses certain characters for its markup, those characters cannot be used on a page, or the browser will think they are part of the markup, and will not display them the way you want. But HTML has to make it possible to display these characters. In order to do this, it uses entities. Entities are written in the following format:

&name_or_numeric_code;

In fact, it is possible to write all characters in entity format, including those not supported by the encoding used by your server. You should use entities whenever you want to display the characters that HTML reserves for its markup, either inside the normal page content, or inside the attributes of HTML elements:

The <code title="The &gt; means &quot;greater than&quot;">&gt;</code> character

The most important characters are these:

&amp;: A & character
&lt;: A < character
&gt;: A > character
&quot;: A " character
&nbsp;: This is a space when you want more than one space between things

There are more. See the Web Design Group pages for more details or view my entity summary.

My entity summary page includes extended characters, and browser compatibility information (for 5th generation browsers).
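To show the entity format in use, here is a short sketch covering page content, an attribute value, and the numeric form. The file name search.cgi and its parameters are made up for the example.

```html
<!-- Displayed as: Use <em> for emphasis, not <i>. -->
<p>Use &lt;em&gt; for emphasis, not &lt;i&gt;.</p>

<!-- An ampersand inside an attribute value is written as an entity too -->
<p><a href="search.cgi?term=caves&amp;page=2">Next page of results</a></p>

<!-- A named entity and its numeric code: both display the copyright symbol -->
<p>&copy; and &#169; are the same character.</p>
```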

HTML allows you to do several things to exceed the limitations of basic text. As well as allowing you to write entity characters that you would not normally be able to write, HTML allows you to add comments into your pages to remind yourself what you were doing, and what that part of the page represents. The browser will ignore them, and will act like they are not even there. These are defined as below, and to keep things easy and reliable, never include '--' anywhere inside your comment.

<!-- Comments go in here -->

For a more detailed look at comments, see my HTML and SGML comments article.

Images

There are three types of images that are usually used on the Web: JPEGs, GIFs and PNGs. Some browsers cannot handle alpha transparency in PNG images.

Images are embedded in Web pages using the IMG element. This allows you to specify the source of the image, and alternative text to use if the image cannot be displayed. It is also possible to specify its width and its height, but generally this is not needed, since images will be displayed at their natural size anyway. If you want to display an image at anything other than its natural size, you can use CSS to manipulate its height and width. Most browsers will also display a border on the image if it is inside a link. If you do not want this, then remove it with CSS.

Images are an inline element and can be inserted anywhere inside the normal flow of a paragraph, or other text content.

<img src="some_image.jpg" alt="Alternative text here">

The ALT attribute is required. There is no closing tag.
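Since the sizing and the border on linked images are both left to CSS, a minimal sketch of how that might be done is given here. The class name thumb, the width, and the file names are assumptions chosen for the example.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <title>Image example</title>
    <style type="text/css">
      /* Remove the border that most browsers draw around linked images */
      a img { border: none; }
      /* "thumb" is a hypothetical class; display those images 100 pixels wide */
      img.thumb { width: 100px; }
    </style>
  </head>
  <body>
    <p>
      <a href="otherpage.html"><img src="some_image.jpg" alt="Link text" class="thumb"></a>
    </p>
  </body>
</html>
```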

Choosing good alternative text

The alt text should be used to ensure that the document still makes sense without the image being displayed.

If you have text on the image, the alt text should match the text or say something relevant. If the image is a link, you should put text that is relevant to the link. If the image is not important, say for example a red ball that is not a link, then you should not put alt text of "a red ball", you should put no alt text at all, and instead write alt="".

Basically, imagine that every image is replaced directly with the alt text. Then try reading the page, and see if it all still makes sense.

Some browsers incorrectly use this attribute to produce tooltips. If you want tooltips, use the title attribute:

title="my tooltip"
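Putting the advice about alternative text together, the following sketch shows one possible choice for each case. All of the file names and texts are invented for the example.

```html
<!-- Image containing text: the alt text repeats that text -->
<h1><img src="logo.gif" alt="Example Caving Club"></h1>

<!-- Image used as a link: the alt text is relevant to the link -->
<p><a href="/"><img src="home.gif" alt="Home page"></a></p>

<!-- Unimportant decorative image: empty alt text -->
<p><img src="red_ball.gif" alt=""> Some text beside a decorative image.</p>

<!-- A tooltip is given with TITLE, and the alt text is still provided -->
<p><img src="map.png" alt="Map of the cave entrance" title="my tooltip"></p>
```

Reading each line with the image replaced by its alt text is a quick way to check that the page still makes sense.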

Lists

There are three types of lists:

Unordered: These are typically thought of as bullet lists. The items in the list have no specific numeric relationship to each other. Most browsers use bullet points when displaying list items.
Ordered: The items in the list have an incremental numeric relationship to each other. Most browsers display numbers beside the list items, and may change this to alternative numberings as lists are nested.
Definition: This contains a series of terms and definitions, and would typically be used in a glossary.

The UL, OL and DL elements are block elements. The LI and DD elements they contain may hold text directly, inline elements, or block elements.

The closing tag is optional for the <li> and <dd> tags, but as always, I recommend you include it anyway.

Unordered lists

The UL element can only contain LI elements directly. It must not contain any other elements unless they are inside the LI elements.

<ul>
  <li> list item 1 </li>
  <li> list item 2 </li>
</ul>

That will produce this output:

- list item 1
- list item 2

Nested lists

It is common to have lists inside lists, allowing you to have several levels of nesting. The nested UL should be put inside one of the LI elements of its parent:

<ul>
  <li> list item 1 </li>
  <li> list item 2
    <ul>
      <li> list item 2.1 </li>
      <li> list item 2.2 </li>
    </ul>
  </li>
</ul>

That will produce this output:

- list item 1
- list item 2
  - list item 2.1
  - list item 2.2

Ordered lists

The syntax of the ordered list is exactly the same as the unordered list, including the nesting. It is even possible to nest UL and OL lists inside each other.

<ol>
  <li> list item 1 </li>
  <li> list item 2 </li>
</ol>

That will produce this output:

1. list item 1
2. list item 2
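Because the nesting rules are the same for both list types, mixing them works the same way as nesting a UL inside a UL. As a small sketch, this puts an ordered list inside the second item of an unordered list:

```html
<ul>
  <li> list item 1 </li>
  <li> list item 2
    <!-- The nested OL sits inside the LI of its parent, before the closing tag -->
    <ol>
      <li> list item 2.1 </li>
      <li> list item 2.2 </li>
    </ol>
  </li>
</ul>
```

Most browsers would show bullets beside the outer items and numbers beside the nested ones.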

Browsers will have a limit to the number of items they can include in such a list. In general, 10'000'000 is the highest number that can be reliably used in almost all browsers, with Konqueror being the only browser that has a limit lower than that, at just 32'767.

Definition lists

Definition lists consist of a series of terms and definitions. It is also possible to have multiple terms and multiple definitions, if that is appropriate. The terms are given using the DT element, and the definitions are given using the DD element. It is possible (although unusual) to nest definition lists, where the nested list must be inside the DD of the parent list.

The following sample shows a definition list. The first term has only one definition, the second has two definitions, and the third and fourth terms share the same definition.

<dl>
  <dt>Sump</dt>
  <dd>A place where water completely fills the cave passage</dd>
  <dt>Rift</dt>
  <dd>A vertical fracture in the rock, created by geological stress</dd>
  <dd>A passage formed along such a fracture, usually tall and narrow</dd>
  <dt>Abseil</dt>
  <dt>Rappel</dt>
  <dd>To descend a rope using a device to control speed</dd>
</dl>

That will produce this output:

Sump: A place where water completely fills the cave passage
Rift: A vertical fracture in the rock, created by geological stress; A passage formed along such a fracture, usually tall and narrow
Abseil
Rappel: To descend a rope using a device to control speed
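Nesting was mentioned above but not shown, so here is a minimal sketch of it. The terms are chosen only for illustration; the point is that the inner DL is placed inside a DD of the outer list.

```html
<dl>
  <dt>Formation</dt>
  <dd>
    A mineral deposit found in a cave. Two common types:
    <!-- the nested list lives inside the DD of its parent -->
    <dl>
      <dt>Stalactite</dt>
      <dd>A formation that hangs down from the ceiling</dd>
      <dt>Stalagmite</dt>
      <dd>A formation that rises up from the floor</dd>
    </dl>
  </dd>
</dl>
```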

Tables

Tables in HTML should be used when you need to display tabular data. They are a block level element, and should not be put inside paragraphs. They can be put directly inside the BODY, DIV, LI or DD elements. (They can also be put inside other tables, but I advise against doing that.)

Unlike other block elements, tables do not take up the full width that is available to them (unless you specify a width). Instead, they shrink to fit their contents. As well as shrinking to fit, they can also grow to fit. If you specify a width for the table (using CSS), and the contents force it to be wider, the table will grow to fit the needs of its contents.

Tables are often abused in Web pages to define the page structure. This practice is outdated, and can cause problems since it removes the semantic meaning of the tables. If you are thinking of using tables to lay out your page, then you are not using them correctly. Use CSS for layout instead; it is also easier to set up and change.

Tables offer a large amount of control over their aspects, such as the heights and widths of rows and columns, whether borders should be shown, and what the paddings of each cell should be. I will not cover that here, since that relates to display, and should be done from CSS. There is only one display-related attribute I will cover, and that is because IE 7- does not support the CSS that replicates that attribute's behaviour.

For most tables, the following CSS will produce a normal bordered effect, commonly used when displaying data in tables:

table {
  border: 1px outset gray;
}
td, th {
  border: 1px inset gray;
  padding: 2px;
}

It should also be possible to remove the gaps between the cells using the border-spacing:0px; style on the TABLE element, but Internet Explorer 7- will not understand that, and requires you to use the cellspacing="0" attribute on the table element. Alternatively, you can use the border-collapse:collapse; style. Note that most browsers will also apply the following rules by default:

th {
  font-weight: bold;
  text-align: center;
}
th, td { vertical-align: middle; }

Empty table cells are not displayed by default in most browsers (so their borders are hidden). To change that, set the empty-cells:show; style on the TH and TD elements.
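A minimal sketch of the markup side of this, assuming the CSS rules shown above are already applied to the page. The cell contents are placeholders, and the last cell is deliberately left empty to show where the empty-cells:show; style would make a visible difference.

```html
<!-- cellspacing="0" removes the gaps between cells in browsers
     that do not understand the border-spacing style -->
<table cellspacing="0">
  <tr>
    <th>Heading 1</th>
    <th>Heading 2</th>
  </tr>
  <tr>
    <td>Data 1</td>
    <!-- an empty cell: its border may be hidden by default -->
    <td></td>
  </tr>
</table>
```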

A simple data table

Tables are defined a row at a time, using the TR element. Each of these can contain any number of TH (heading) and TD (data) cells. For a table to display correctly, you should have the same number of cells in each row.

<table>
  <tr>
    <th>Heading 1</th>
    <th>Heading 2</th>
  </tr>
  <tr>
    <td>Data 1</td>
    <td>Data 2</td>
  </tr>
</table>

That will produce a table like this:

Heading 1	Heading 2
Data 1	Data 2

Using a caption

You can optionally include a caption for your table. If you choose to use this, it must be the first element inside the table. By default, most browsers will display the caption above the table:

<table>
  <caption>Table n. Sample</caption>
  <tr>
    <th>Heading 1</th>
    <th>Heading 2</th>
  </tr>
  <tr>
    <td>Data 1</td>
    <td>Data 2</td>
  </tr>
</table>

That will produce a table like this:

Table n. Sample
Heading 1	Heading 2
Data 1	Data 2

Spanning rows and columns

Cells are permitted to span multiple rows or columns. Typically, this is most useful for headings, but it can be applied to either TH or TD cells. The ROWSPAN and COLSPAN attributes allow a cell to span as many rows or columns as you need. Just make sure that you do not span more rows and columns than are actually available, and make sure that at no point do a rowspan and colspan overlap - this is an error, and browser error handling is not very good at solving that particular problem:

<table>
  <caption>Table n. Sample</caption>
  <tr>
    <th rowspan="2">Heading 1</th>
    <th colspan="2">Heading 2</th>
  </tr>
  <tr>
    <th>Heading 2.1</th>
    <th>Heading 2.2</th>
  </tr>
  <tr>
    <td>Data 1</td>
    <td>Data 2</td>
    <td>Data 3</td>
  </tr>
</table>

That will produce a table like this:

Table n. Sample
Heading 1	Heading 2
Heading 1	Heading 2.1	Heading 2.2
Data 1	Data 2	Data 3

Adding a table head, body, and foot

With more complex tables, it may be necessary to have more than one dimension of headers. In this case, you can use a THEAD element to signify the headers at the top of the table, a TBODY for the normal data, which can also have its own headers, and a TFOOT for a footer. In theory, a browser can also detach the head and foot to keep them usefully positioned when scrolling or printing, but in practice, no browser does this. You can also have multiple TBODY elements, but these are rarely used. If you use a THEAD or TFOOT, these must be written before the TBODY, even though the TFOOT will actually be displayed after it:

<table>
  <caption>Table n. Sample</caption>
  <thead>
    <tr>
      <th>Test</th>
      <th>Result 1</th>
      <th>Result 2</th>
    </tr>
  </thead>
  <tfoot>
    <tr>
      <td></td>
      <td>5</td>
      <td>5.5</td>
    </tr>
  </tfoot>
  <tbody>
    <tr>
      <th>Type 1</th>
      <td>3</td>
      <td>7</td>
    </tr>
    <tr>
      <th>Type 2</th>
      <td>6</td>
      <td>5</td>
    </tr>
  </tbody>
</table>

That will produce a table like this:

Table n. Sample
Test	Result 1	Result 2
Type 1	3	7
Type 2	6	5
	5	5.5
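For completeness, here is a sketch of a table with more than one TBODY, which is allowed but rarely used. The split into two groups and the data values are invented for the example. Each TBODY holds its own set of rows, and the THEAD is still written first.

```html
<table>
  <caption>Table n. Sample</caption>
  <thead>
    <tr>
      <th>Test</th>
      <th>Result 1</th>
      <th>Result 2</th>
    </tr>
  </thead>
  <!-- first group of rows -->
  <tbody>
    <tr>
      <th>Type 1</th>
      <td>3</td>
      <td>7</td>
    </tr>
  </tbody>
  <!-- second group of rows -->
  <tbody>
    <tr>
      <th>Type 2</th>
      <td>6</td>
      <td>5</td>
    </tr>
  </tbody>
</table>
```

Without any extra styling, most browsers display this the same way as a table with a single TBODY; the grouping mainly gives you something to attach styles or scripts to.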

More table features

Tables have a vast array of extra features that can help you make sense out of complicated data tables. I will not cover these here, and instead, I will point you to my article about making accessible tables, where I cover the extra features in detail.

Framesets

Framesets allow you to have more than one page displayed as if they were one page, above each other or beside each other. It is also possible to nest framesets so that some pages are displayed above others, and some are displayed beside others. In addition, pages contained within a frameset can also contain framesets of their own.

Generally, framesets are not a good solution. If all you want to do is to use frames to force your page into a particular layout, then you are using them for the wrong purpose; you should be using CSS. Framesets exist for a very specific purpose. If you have a single page that has the navigation, and you want to keep that page visible at all times, then you can consider using a frameset (although it is much better to put the navigation on every page). With a frameset, you can display the navigation in one frame, and have it open pages inside another frame. The navigation would remain visible no matter what page was being viewed.

Framesets have several problems. They are a general problem for users that cannot view framesets, such as those who use a speech reader, as they make it very difficult to work out exactly what page the user is viewing. They are a problem for normal users because they cannot be bookmarked - users who try to bookmark individual pages only end up with a bookmark for the overall frameset, so the bookmark cannot open the correct page. Then they are also a problem for users who arrive at the linked pages via a search engine, as they cannot get back to the frameset to see the navigation (even if they can reopen the frameset, they usually lose the page they were looking at in the process).

Generally, I advise you not to use framesets. If you choose to use them, make sure that these limitations will not cause problems.

Pages that contain framesets should use the frameset document type declaration, as shown in the document structure section. Pages within framesets that use the target attribute on links or forms should use the transitional document type declaration.

The frameset element

Framesets are defined using the FRAMESET tag (which also requires a closing tag). This must have the ROWS or COLS attribute specified to say how the frames should be arranged. If you specify rows, the frames will be laid out top to bottom in the order that you define them. If you specify cols, the frames will be laid out left to right. If you specify both rows and cols, the frames will be laid out in a grid from left to right, one row at a time from top to bottom.

The ROWS and COLS attributes expect a comma separated list of frame sizes. The sizes can be written in a variety of different ways. These are: plain numbers (representing the number of pixels), percentages (representing a percentage of the available space), and the asterisk. The asterisk tells it to use whatever is still available after laying out the other frames. In addition, you can specify multiplication factors when combined with an asterisk (such as 2*), so that if more than one frame uses it, they will have the appropriate share of the available space.

For example, a COLS value of "200,30%,*,2*" would create four columns. Assuming there are 1000 pixels available, the first column would be 200 pixels wide and the second would be 300 pixels wide. This would leave 500 pixels free for the remaining columns. The third column will be half the size of the fourth, so the 500 pixels are split into three shares: the third column would be about 167 pixels wide (one share), and the fourth would be about 333 pixels wide (two shares).
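Written out as markup, that four column arrangement might look like the fragment below. This is only a sketch: the file names are hypothetical, the FRAME element it uses is described in the next section, and the fallback content for browsers without frame support is left out to keep it short.

```html
<!-- 200 pixels, then 30% of the width, then the remaining
     space shared between the last two columns in a 1:2 ratio -->
<frameset cols="200,30%,*,2*">
  <frame src="one.html">
  <frame src="two.html">
  <frame src="three.html">
  <frame src="four.html">
</frameset>
```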

The FRAMESET element replaces the BODY element from a normal document. In frameset documents, there is no BODY element.

The frame element

Frames are defined using the FRAME tag, which does not have a closing tag. This must have the SRC attribute specified to say what page should be displayed in the frame. The format of the SRC attribute is exactly the same as a HREF attribute of a link, with the exception that it can only load other pages. It cannot link to anchors within the current page. It can load other pages from the current site, or from other sites.

As well as the SRC attribute, there are a few other things that you can specify. By default, users can resize frames to make them larger or smaller. This is a good thing, since their screen may be too small for your layout, and they may need to change the sizes. If you have a specific need to prevent them from doing that, you can set the NORESIZE attribute (this is the type of attribute where you do not have to give it a value, just write the attribute inside the tag).

Most browsers will also show a thin border between frames so that the frames are visually separated. To remove this border, and make the framed pages display without anything separating them, set the frameborder="0" attribute on both frames on either side of the border.

The last of the important attributes is the NAME attribute. This is used to set a target that can be used by links and forms inside the pages in the frames. The name should be a name that makes sense to you, and for the sake of simplicity, try to use just letters and numbers (it can actually contain any characters, but some browsers will display these names when frames are disabled, so it helps if other people can understand them).

It is also possible to use attributes to specify if the frames can be scrolled, or the margins of the body element in the contained document. However, these are display related, so they should be controlled using CSS (set the overflow:hidden; style, and the margin and padding on the HTML and BODY elements).

It is very important to make sure you provide the correct number of frames according to the number of rows and columns. It is even more important to make sure that no frame loads the parent page, or you will end up with an infinitely nested frameset, and you may cause poorly designed browsers to hang up or crash.

Nested framesets

It is possible to use another frameset tag instead of a frame. This should then contain its own frames within it.

Noframes

To allow you to cater for browsers that do not support frames, or users that are unable to use them, you should always provide a noframes section inside your frameset. This should be placed inside the outermost frameset tag, usually at the end of it. It can contain almost anything that a normal document's BODY can contain.

This is not the place where you tell people to get a better browser. They will already be well aware that their browser does not support frames, and you can be quite sure that they will have their reasons for using what they use. The NOFRAMES element is where you give alternative content. Often this will be a list of links to the pages held inside the frames, or a sitemap giving them links to the pages they may want to visit. You could also use this part of the page to help to give them the overview they are missing without being able to use the frameset.

A complete example

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
<html>
  <head>
    <title>Database access</title>
  </head>
  <frameset rows="100,*">
    <frame src="header.html" frameborder="0" noresize>
    <frameset cols="100,*">
      <frame src="nav.html" name="navigation" frameborder="0">
      <frame src="main.html" name="mainpage" frameborder="0">
    </frameset>
    <noframes>
      <h1>The Database</h1>
      <p>This gives access to the database contents.</p>
      <ul>
        <li><a href="header.html">Database summary</a></li>
        <li><a href="nav.html">Database sections</a></li>
        <li><a href="main.html">Database overview</a></li>
      </ul>
    </noframes>
  </frameset>
</html>

Opening links in other frames

Pages in a frameset can cause links or forms to open in other frames of the frameset by using the TARGET attribute. The value of the attribute should match the name of the desired frame. Note that in most browsers, a page can only target a frame that comes from the same site. Using the example above, pages in the navigation could use this to target the main frame:

<a href="foo.html" target="mainpage">

There are also some extra targets that it can use, even though they are not defined as frame names:

_parent: This will load the page in place of the page that is holding the current page in a frameset. This means that the frameset will be removed, and the new page will be loaded instead.
_top: This will load the page in place of the topmost page that is holding the current page in a frameset. Even if there are multiple levels of nested frameset pages, they will all be replaced with the new page, so there are no longer any framesets.
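A short sketch of how these targets might appear together in a navigation page. It assumes the frame names from the complete example above; the link destinations other than foo.html are hypothetical.

```html
<ul>
  <!-- opens in the frame named "mainpage" -->
  <li><a href="foo.html" target="mainpage">Open in the main frame</a></li>
  <!-- replaces the page holding this frame -->
  <li><a href="bar.html" target="_parent">Replace the parent frameset</a></li>
  <!-- replaces every level of frameset with the new page -->
  <li><a href="home.html" target="_top">Leave the frames completely</a></li>
</ul>
```

When there is only one level of frameset, _parent and _top have the same effect.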

Inline frames

There is another type of frame, known as an inline frame. This can be used just like an image, and can be inserted anywhere in the normal flow of text and inline content. To insert an inline frame, use the IFRAME element. This is a little like the FRAME element, in that it uses the SRC attribute to set the location of the page to load. It is also possible to set the WIDTH, HEIGHT, and FRAMEBORDER attributes, but these are usually better done with CSS, setting the width, height, and border styles.

Unlike the FRAME element, the IFRAME has its own fallback content for when inline frames cannot be displayed. The IFRAME has a closing tag, and anything inside the iframe before the closing tag will be used if the inline frame itself cannot be rendered. It can contain anything that is block or inline. Perhaps just some text, or perhaps a list of links to resources. Whatever is appropriate for your purpose.

<iframe src="news.html" name="innerframe">
  <ul>
    <li><a href="news.html">News page</a></li>
    <li><a href="access.html">Accessibility notes</a></li>
  </ul>
</iframe>

Pages that use inline frames should use the transitional document type declaration.

Forms

Forms are used to allow the user to provide information that can be sent to the server. They are also often used as a way for a user to provide information to be used by JavaScripts. To make the most use of forms, you will have to have access to server side scripting, which can process the information. Some hosting services may provide automated scripts that can process the form data and send it as an email.

Different server side scripting implementations deal with these input values in different ways. You will need to check with the documentation of the relevant server side environment to see how to use the submitted values.

Forms are defined using the FORM tag, and there are a few attributes you will need to define. Firstly, you will need to say what method you want the form to use. There are two methods that are used with forms, and your server side environment may place restrictions on which you can use:

GET

This is the most common method, and is most useful for smaller forms, where the user will not provide much information. When they submit the form, it will build a page address that contains all of the form information, encoded as part of the address:

http://example.com/foo.php?bar=some+data&baz=test%2B%3D%5Bdata%5D

Because the information is encoded in the URL, it is limited to the length of a URL. In many browsers, this is 4 KB (and due to the encoding, this means about 3 KB of actual form data).

POST

This is the more capable method. You will need to use this if any of your inputs are file inputs, or if you might need to be able to handle more than 3 KB of form data. Alternatively, you might want to use this if you need to keep your page addresses clean. Note that if you use this method, your users will not be able to bookmark the resulting page addresses, so I advise you not to use this method for search engines. This method can also cause problems when using back and forward buttons.

The other attribute you will need to specify is the action attribute. This is the location of the page that you want to send the form information to. The syntax is similar to the HREF of a link, except that it should always point to a page, not to any internal links.

<form method="get" action="processform.php">
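As a rough sketch of how the method and action attributes combine with named inputs, the form below would produce the address shown for the GET method above, assuming the page is served from http://example.com/ and the user submits it without changing the initial values. The INPUT elements are covered later in this tutorial, and the button label is an arbitrary choice.

```html
<form method="get" action="foo.php">
  <!-- each named input becomes a name=value pair in the address -->
  <p><input type="text" name="bar" value="some data"></p>
  <p><input type="text" name="baz" value="test+=[data]"></p>
  <!-- the submit button has no name, so it is not sent -->
  <p><input type="submit" value="Send"></p>
</form>
```

The space is encoded as a plus sign, and the characters that have a special meaning in addresses (+, =, [ and ]) are replaced by their percent-encoded forms.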

If one of your inputs is a file, you must also set the ENCTYPE attribute to "multipart/form-data". Normally, you do not need to set this attribute, as it will assume its default value of "application/x-www-form-urlencoded".
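A minimal sketch of a form that uploads a file, reusing the processform.php name from the earlier sample. The input name and button label are only examples. Note that it uses the POST method, as file inputs require.

```html
<form method="post" action="processform.php" enctype="multipart/form-data">
  <!-- the file input is the reason for the enctype attribute -->
  <p><input name="filetoattach" type="file"></p>
  <p><input type="submit" value="Upload"></p>
</form>
```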

The FORM element is block level, but it cannot contain inline elements or text directly. It must contain other block level elements, such as paragraphs. These can then contain all the desired form controls.

Form controls

There are several types of form controls, and they serve different purposes. They may appear differently in different browsers; this is intentional, so that they can fit in with the theme of the relevant system. These inputs each have their own behaviour, and although it is possible to use scripts to make some of them behave like others (such as making checkboxes behave like radio inputs), I advise you not to attempt to do this, as it will confuse your users.

Inputs are all inline elements. The inputs created with the INPUT element do not have a closing tag. All other input element types require a closing tag.

The majority of inputs are created with the INPUT tag, and the type of input is specified using the TYPE attribute. All inputs accept the NAME attribute. This will be used as the name of the variable that will be sent to the server, with the value specified (normally) by the user.

Forms can contain as many form controls as they need, including multiple submit or reset buttons. These form controls can be in any order that you choose, but of course, you should try to make it make sense.

When a form is submitted to a server, the values of all relevant form controls are sent to the server as a text string. The server side script may convert them into other data types, and in the form control they may appear to look like numbers or timestamps, but as long as they are held in a form control, they are held as a string.

Text inputs

The basic text input is created using the INPUT tag by setting the TYPE to "text". The initial contents of the input can be specified using the VALUE attribute. If you do not include this attribute, or if you specify an empty value, there will be no contents by default. The text input is a single line input accepting any normal text characters, but not linebreaks. If the contents are too large to fit, the input will scroll, usually without a visible scrollbar.

<p><input type="text" name="street" value="Some initial content"></p>

That will create a single line text input that already contains the text "Some initial content".

There are a few other attributes that are of interest here. These are mainly the MAXLENGTH, READONLY and DISABLED attributes. MAXLENGTH can be used to specify the maximum number of characters that the user is allowed to enter in the input (such as maxlength="50").
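For instance, a sketch of the same input limited to 50 characters, reusing the "street" name from the sample above:

```html
<!-- the browser will refuse to accept more than 50 characters -->
<p><input type="text" name="street" maxlength="50"></p>
```

This limit is only enforced by the browser, so a server side script should still check the length of what it receives.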

READONLY does exactly what it says. This is most useful when working with scripts, with the idea being that you can put new content in the input, but the user cannot change it (although they can usually select text in it). I advise you not to overuse this, since it looks like a normal input and may confuse the user when it fails to respond. DISABLED is useful for a similar reason because it can be changed with script (for example, when another input is changed). This has the additional benefit that the input looks like it is disabled so that it will not confuse the user. I advise you to only set these attributes using scripts - otherwise if the script is not able to run for whatever reason, the inputs will be unusable. These attributes do not need a value; you only have to write the attribute name:

<input type="text" name="street" disabled>

You can also set the size attribute, but since that is display related, it is much better to set the width from CSS. Two other attributes may be of use. These are the TABINDEX and ACCESSKEY attributes. TABINDEX expects an integer value, and can be used to alter the sequence that the tab key uses to step between inputs, if their layout is not the same as the order they are defined in the source. However, since the layout changes depending on the CSS, this attribute is much less useful.

The ACCESSKEY attribute allows you to specify a key that can be used as a shortcut to focus the input. Different browsers will expect a different key combination to activate it, and often they will conflict with the browser's own shortcuts. In my opinion, the accesskey attribute is a nice idea, but fairly useless in practice. However it is there if you want to use it.
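A small sketch showing both attributes in use. The input names and the chosen keys are arbitrary. Here the second input in the source is given the lower TABINDEX, so the tab key should reach it before the first one.

```html
<!-- reached second when tabbing; "s" is its shortcut key -->
<p><input type="text" name="street" tabindex="2" accesskey="s"></p>
<!-- reached first when tabbing; "t" is its shortcut key -->
<p><input type="text" name="town" tabindex="1" accesskey="t"></p>
```

The modifier keys that have to be held down with "s" or "t" depend on the browser and operating system.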

Password inputs

This is functionally and syntactically identical to a normal text input. The only difference is that a password input does not display the typed characters that it contains. Instead, it displays a masked version of the value, typically with the characters replaced by asterisk ('*') characters.

This is created using the INPUT element with the TYPE attribute set to "password".

<p><input type="password" name="theirkey" value="Content"></p>

That will create an input that looks like a text input, but with each character of the initial value shown as a masking character.

Textareas

A textarea is a larger version of a text input that can contain multiple lines of text, including linebreaks. Most browsers will display a scrollbar if the contents are too large for the textarea.

Textareas are created using the TEXTAREA tag. This requires a closing tag, and has no VALUE attribute. Instead, the default value is written in between the opening and closing tags. It requires that you provide the ROWS and COLS attributes, which give a suggestion for an initial size (based on the number of characters that should be displayed vertically and horizontally). This is a little unusual, since it forces you to specify display related information in the HTML, but you can always override them using the height and width styles in CSS.

Most other attributes, such as DISABLED and ACCESSKEY, are available, but MAXLENGTH is not.

<textarea name="comments" rows="3" cols="30">This is the initial content of the textarea.
Generally, browsers will wrap it if needed, and linebreaks will also be displayed.</textarea>

That will create an input like this:

This is the initial content of the textarea. Generally, browsers will wrap it if needed, and linebreaks will also be displayed.

Select inputs

Select inputs are also known as dropdown menus, list boxes or combo boxes. They provide an input with a number of options that the user can select.

These are the most complicated of all the inputs, and they have a variety of different ways that they can be configured. They are also well known as being difficult to style. Most browsers allow you to colour the background and text of the SELECT element, and some also allow you to specify the same for the OPTION elements. Some may allow a little more, but in general you will have to accept the limitations. SELECT inputs accept the TABINDEX attribute, and SELECT, OPTGROUP and OPTION elements all accept the DISABLED attribute.

Select inputs are created using the SELECT element, and this requires a closing tag. The options are created using the OPTION element. These have an optional closing tag, and as always, I advise you to include it. In general the SELECT element only needs a NAME attribute. The OPTION elements will usually have a VALUE attribute (this will be what is sent to the server as the value of the input, if the user selects that option). If there is no value, then the content of the option will be used as the value. One option may also have the SELECTED attribute set to tell the browser to pre-select that option. If no option has this attribute, most browsers will pre-select the first option.

<select name="theirchoice">
  <option>First choice</option>
  <option value="val2" selected>Second choice</option>
  <option value="val3">Third choice</option>
</select>

That will create a dropdown menu with "Second choice" pre-selected.

Options can also be grouped into subsections, using the OPTGROUP element. This can only be put directly into the SELECT element. It cannot be put inside another OPTGROUP (this will change in future versions of HTML, and most browsers already allow it, but for now, it is not valid). Most browsers display an OPTGROUP as an indented section with a title (that cannot be selected). Internet Explorer 5 on Mac and the Links browser family are the only browsers I know that display it as a hierarchical menu. The OPTGROUP element uses the LABEL attribute to define what title should be displayed, and requires a closing tag.

<select name="theirchoice">
  <option value="val1">First choice</option>
  <optgroup label="Subsection 1">
    <option value="val2" selected>Second choice</option>
    <option value="val3">Third choice</option>
  </optgroup>
  <optgroup label="Subsection 2">
    <option value="val4">Fourth choice</option>
    <option value="val5">Fifth choice</option>
  </optgroup>
</select>

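That will create a dropdown menu with "Second choice" pre-selected, where the options after the first are grouped under the titles "Subsection 1" and "Subsection 2".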

Select inputs can be made into a scrolling list by setting the SIZE attribute to more than 1. The options list no longer drops down in the same way, and instead, it becomes a scrolling list showing the number of lines that you specify.

<select name="theirchoice" size="3">
  <option value="val1">First choice</option>
  <option value="val2" selected>Second choice</option>
  <option value="val3">Third choice</option>
  <option value="val4">Fourth choice</option>
  <option value="val5">Fifth choice</option>
</select>

That will create a scrolling list that is three lines tall, with "Second choice" pre-selected.

It is also possible to allow more than one option to be selected at a time by including the MULTIPLE attribute (another one of those attributes that does not need a value). If you include this attribute, then more than one option can have the SELECTED attribute. Typically, the user can then use the Ctrl/Cmd and Shift modifier keys to select multiple options.

<select name="theirchoice" size="3" multiple>
  <option value="val1">First choice</option>
  <option value="val2" selected>Second choice</option>
  <option value="val3" selected>Third choice</option>
  <option value="val4">Fourth choice</option>
  <option value="val5">Fifth choice</option>
</select>

That will create a scrolling list that is three lines tall, with both "Second choice" and "Third choice" pre-selected.

It is possible to set the SIZE to 1 on a multiple select input, to end up with something that looks like spin buttons. However, I advise you never to use this, because it does not automatically select options as you spin (so you have to spin to an option then click on it), meaning that its behaviour is unpredictable and very confusing for users.

Radio buttons

Radio buttons allow you to specify several different options, where only one can be selected at a time. This is similar to options in a basic select input, except that radio buttons can be put anywhere in the form, and do not have to be close together. Most browsers display radio inputs as either a small circle (with a dot in the middle of a selected button), or as a punched in/out diamond.

A radio button is created using the INPUT element, by setting the TYPE attribute to "radio". Radio buttons are put into groups by giving them the same NAME attribute, and you can have multiple radio groups in a form. To pre-select a radio input, set the CHECKED attribute on the desired input (this is another attribute that does not need a value). You should never attempt to pre-select more than one radio input in a group. Note that if you do not pre-select any radio input in a group, some browsers will not pre-select any, while others may pre-select the first one. For this reason, I advise you to always pre-select one input in each radio group you create.

Each radio input should have its VALUE attribute set, as that is what will be sent to the server if that input is selected. They will also accept many of the common attributes, such as DISABLED and ACCESSKEY.

<ul>
  <li><input name="vehicle" type="radio" value="cars"> Car
    <ul>
      <li><input name="cartype" type="radio" value="small" checked> 2 door</li>
      <li><input name="cartype" type="radio" value="medium"> 4 door</li>
      <li><input name="cartype" type="radio" value="large"> 17 door</li>
    </ul>
  </li>
  <li><input name="vehicle" type="radio" value="busses" checked> Bus</li>
  <li><input name="vehicle" type="radio" value="trains"> Train</li>
</ul>

That will create inputs like this:

Car
- 2 door
- 4 door
- 17 door
Bus
Train

Checkboxes

Checkboxes offer a simple interface, where the user can either select the option or not. If it is selected, then the value will be sent to the server when the form is submitted. Checkboxes cannot be grouped, and normally, they will not share the same name (it is possible for them to share the name, but the server side script will have to be capable of understanding multiple values for the same variable).

A checkbox is created using the INPUT element, by setting the TYPE attribute to "checkbox". To pre-select a checkbox, set the CHECKED attribute.

Each checkbox should have its VALUE attribute set, as that is what will be sent to the server if that input is selected. If the checkbox is not selected, the browser will either pass an empty value to the server, or not pass the variable at all. They will also accept many of the common attributes, such as DISABLED and ACCESSKEY.

<ul>
  <li><input name="car" type="checkbox" value="1" checked> Car</li>
  <li><input name="bus" type="checkbox" value="1" checked> Bus</li>
  <li><input name="train" type="checkbox" value="1"> Train</li>
</ul>

That will create inputs like this:

Car
Bus
Train
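If you do decide to let checkboxes share a name, the markup might look like the sketch below. The name and values are chosen only for illustration. Each checkbox needs a different VALUE so that the server side script can tell them apart.

```html
<ul>
  <!-- all three share the name "vehicle" but have different values -->
  <li><input name="vehicle" type="checkbox" value="car" checked> Car</li>
  <li><input name="vehicle" type="checkbox" value="bus" checked> Bus</li>
  <li><input name="vehicle" type="checkbox" value="train"> Train</li>
</ul>
```

Submitted with the GET method and the initial selection, this would add vehicle=car&vehicle=bus to the address, so the script receives the same variable twice.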

File inputs

This allows the user to choose a file that will be uploaded to the server. There are obvious security issues here, so in order to prevent pages uploading files without permission, browsers will not allow HTML or script to set the initial value of the input. In order to make sure that the user is aware of the input type, the wording used by the input also cannot be changed. Different browsers will have their own layout for the input. Most browsers on Windows will have what looks like a text input, with either a "Choose" or "Browse" button. Many browsers on Mac will have only a button with "Choose" on it. Some Linux browsers will use only a button as well.

Browsers will have several ways to prevent users from being tricked into filling in a file input. Some may prevent pasting into the input, and some will not allow scripting events to be used to trigger the file choosing functionality. Some browsers will not allow scripts to read the value (or maybe only the file name, not the path). Some may reset the input if a script triggers the file chooser. Most browsers only allow very limited styling of file inputs. This is all intentional, and I advise you not to attempt to use any tricks to alter the display or behaviour of a file input.

A file input is created using the INPUT element, by setting the TYPE attribute to "file". They will also accept many of the common attributes, such as DISABLED and ACCESSKEY.

<input name="filetoattach" type="file">

Submit buttons

The submit button is the button that causes the information in the form to be submitted to the server. Normally these are not given a name but if they are, then their name and value will also be sent to the server. This can be useful if you want to have multiple submit buttons, and you want the server side script to make decisions based on which button is clicked.

A submit button is created using the INPUT element, by setting the TYPE attribute to "submit". They will also accept many of the common attributes, such as DISABLED and ACCESSKEY.

<input type="submit" value="Go!">

When a form contains at least one submit button, most browsers will automatically click the first submit button in the form if the user presses the Enter/Return key in a text or password input. Some will also do the same for radio, checkbox, select, and file inputs. Some browsers will automatically submit the form even if there is no submit button and the user presses Enter/Return in a relevant input type, but this does not work in all browsers (such as mobile, television and console browsers, as well as some desktop browsers), and should not be relied on.
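
As a sketch of the multiple submit button idea described above, the two buttons below share a name but have different values. The action "editpost.php" and the names "posttitle" and "whataction" are invented for this example, and the server side script is assumed to check which value arrives.

<form method="post" action="editpost.php">
  <p><input type="text" name="posttitle"></p>
  <p>
    <input type="submit" name="whataction" value="Save">
    <input type="submit" name="whataction" value="Preview">
  </p>
</form>

Only the button that was used to submit the form is sent, so the script would receive either whataction=Save or whataction=Preview. Since pressing Enter/Return in the text input will activate the first submit button in most browsers, it is probably wise to put the safest action first.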

Reset buttons

This will reset the form back to the state that it was in when the page was first loaded. Note that these are almost never needed, and generally are pressed by mistake when trying to submit the form (meaning that the user loses all the information they entered). You should only include a reset button if you have a real use-case for it.

A reset button is created using the INPUT element, by setting the TYPE attribute to "reset". They will also accept many of the common attributes, such as DISABLED and ACCESSKEY.

<input type="reset" value="Reset changes">

Generic buttons

These serve no purpose on their own. The only reason they exist is to activate scripts (normally done with an onclick event handler).

A generic button is created using the INPUT element, by setting the TYPE attribute to "button". They will also accept many of the common attributes, such as DISABLED and ACCESSKEY.

<input type="button" value="Run script" onclick="alert('Hello');">

Smarter buttons

Internet Explorer 7 and earlier send the HTML content of the element to the server as the value, instead of using the value attribute.
Internet Explorer 7 and earlier default to treating each button as a generic button if the TYPE attribute is omitted, instead of a submit button.
Internet Explorer 6 and earlier send the content of all button elements, not just the one used to submit the form.

The basic button input is fairly limited, since it can only contain text. The BUTTON element is much more advanced, as it can contain virtually any content, including block and inline elements, such as lists or images. The limitation is that it cannot contain forms or form controls, links, or image maps. It requires a closing tag. Some old browsers (such as Netscape 4) do not support this element.

The BUTTON element accepts the usual NAME attribute, and has a VALUE attribute that will be sent to the server in the same way as with a normal submit button. By setting the TYPE attribute to "submit", "reset" or "button", you can choose if it should be a submit, reset or generic button. It will also accept many of the common attributes, such as DISABLED and ACCESSKEY.

<button type="submit">
  <ul>
    <li>Inside the button</li>
    <li>Also inside the button</li>
  </ul>
  <p><img src="someimage.png" alt=""></p>
</button>

Image inputs

These are a special kind of submit button that uses an image instead of a button interface. Clicking the image will submit the form. If the input is given a name, then when the image is clicked, the coordinates of the click are sent to the server using the name of the input. For example, if it is given the name "thesub", and you click 47 pixels from its left edge, and 8 pixels from its top edge, then the server will be sent thesub.x=47 and thesub.y=8.

An image input is created using the INPUT element, by setting the TYPE attribute to "image". You will then need to set the SRC and ALT attributes, just like with a normal image. It will also accept many of the common attributes, such as DISABLED and ACCESSKEY.

<input type="image" src="someimage.png" alt="Submit">

Hidden inputs

Sometimes it is necessary to pass values to a form as if the inputs had been filled in by a user, but without the user seeing those options.

An example of this would be a form that spans multiple pages, where the user fills in a page at a time, then moves on to the next page. You could use hidden inputs to pass the values from each page into the next page, so that the last page actually contains all the values, but the user only sees a few of them. Another example is the search form on this page. The search engine for this site allows you to search sections, but on this page, you do not see the options to choose sections. I use hidden inputs to pass the section selections (so that it only searches the tutorials by default). The search results page uses checkboxes instead, so you can choose which sections to search.

A hidden input is created using the INPUT element, by setting the TYPE attribute to "hidden". Set the NAME and VALUE attributes to say what value the input represents.

<input type="hidden" name="firstname" value="Mark &quot;Tarquin&quot;">

That will create an input that is invisible.
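
For the multiple page form described above, the second page could be sketched like this. The file name "step3.php" and the input names and values are assumptions for the example, and it assumes the server side script wrote the values from the first page into the hidden inputs when it generated this page.

<form method="post" action="step3.php">
  <p>
    <input type="hidden" name="firstname" value="Mark">
    <input type="hidden" name="favcolour" value="green">
  </p>
  <p>Email: <input type="text" name="email"></p>
  <p><input type="submit" value="Next page"></p>
</form>

When this page is submitted, the script receives firstname and favcolour along with email, as if the user had just filled them in. Hidden inputs are only hidden from view; the user can still read or change them by looking at the source, so the script should check them like any other input.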

Labels

On their own, some inputs are a little small, and difficult for users to interact with. For example, a checkbox is just a small input, but it usually has a large amount of text associated with it. Often it is useful to have the text be a proper label, where clicking the text is the same as clicking the input. This is done using the inline LABEL element. In theory you should be able to just wrap this around an input and the text, but some browsers do not understand this, so it is better to use the alternative.

Firstly give the input an ID attribute. Usually this can be the same as the NAME attribute, but note that two elements must never share the same ID (so radio inputs will each need their own ID, which cannot be the same as the NAME attribute). Now create the LABEL element and give it a FOR attribute, whose value matches the ID of the input that it should relate to. It will also accept the ACCESSKEY attribute. The closing tag is required.

You can associate as many labels as needed with each form control, but each label can only be associated with one form control. Some browsers will not allow labels to be associated with file inputs (for security reasons).

<ul>
  <li><input name="whatcar" id="whatcar" type="checkbox" value="1">
  <label for="whatcar">Car</label></li>
  <li><input name="whatedition" id="edition1" type="radio" value="1" checked>
  <label for="edition1">First edition</label></li>
  <li><input name="whatedition" id="edition2" type="radio" value="2">
  <label for="edition2">Second edition</label></li>
</ul>

That would create this (try clicking the words to see what happens):

Car
First edition
Second edition
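
For comparison, the wrapping approach mentioned above might look like the sketch below. Keeping the FOR and ID attributes as well means that browsers that do not understand the wrapping should still be able to make the association. The name "whatbus" is only an example.

<ul>
  <li><label for="whatbus"><input name="whatbus" id="whatbus" type="checkbox" value="1">
  Bus</label></li>
</ul>

This has the small advantage that the input itself is part of the label, so there is no gap between the two where a click would be ignored.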

Fieldsets

Fieldsets allow you to group form controls into sections. For example, you may want to put a set of options into a group, so they appear separated from the main part of the form, but they are still used by it. To do this, place all of the relevant form controls inside a FIELDSET element (which requires a closing tag).

All FIELDSET elements need to have a LEGEND element, containing the title for the fieldset. This should be the first element inside the fieldset, and is an inline element. It accepts the optional ACCESSKEY attribute. The closing tag is required.

The fieldset is usually displayed as a thin border surrounding the inputs, with the legend written on top of the border.

<fieldset>
  <legend>Options</legend>
  <ul>
    <li><input name="alpha" type="checkbox" value="1"> Alpha</li>
    <li><input name="beta" type="checkbox" value="1"> Beta</li>
    <li><input name="gamma" type="checkbox" value="1"> Gamma</li>
  </ul>
</fieldset>

That would create this:

Options

Alpha
Beta
Gamma

Fieldsets can be nested as needed.

<form method="get" action="/search">
  <fieldset>
    <legend>Search</legend>
    <p>
      <input type="text" name="searchforthis">
      <input type="submit" value="Search">
    </p>
    <fieldset>
      <legend>Search sections</legend>
      <ul>
        <li><input name="searchsc1" type="checkbox" value="1" checked> Section 1</li>
        <li><input name="searchsc2" type="checkbox" value="1"> Section 2</li>
        <li><input name="searchsc3" type="checkbox" value="1"> Section 3</li>
      </ul>
    </fieldset>
  </fieldset>
</form>

That would create this:

Search sections

Section 1
Section 2
Section 3

A complete example

Yes, a complete example would be nice, so here goes:

<form method="post" action="feedback.php" enctype="multipart/form-data">
  <fieldset>
    <legend>Personal</legend>
    <p><label for="persname">Name:
    <input type="text" name="persname" id="persname" maxlength="50"></label></p>
    <p><label for="perspswd">Password:
    <input type="password" name="perspswd" id="perspswd"></label></p>
    <fieldset>
      <legend>History</legend>
      <p><label for="waystohear">How did you hear about us:
      <select name="waystohear" id="waystohear" size="3" multiple>
        <option value="search" selected>Search engine</option>
        <option value="friend">Word of mouth</option>
        <option value="advert">Ads</option>
      </select></label></p>
      <p><label for="prevvisit">How many times have you visited:
      <select name="prevvisit" id="prevvisit">
        <option value="1" selected>1</option>
        <option value="2">2</option>
        <option value="3+">3 or more</option>
      </select></label></p>
    </fieldset>
  </fieldset>
  <fieldset>
    <legend>Subscribe</legend>
    <ul>
      <li><label for="subs1"><input type="checkbox" name="subs1" id="subs1" value="1">
      Email spam</label></li>
      <li><label for="subs2"><input type="checkbox" name="subs2" id="subs2" value="1">
      Postal spam</label></li>
    </ul>
  </fieldset>
  <fieldset>
    <legend>Comments</legend>
    <ul>
      <li><label for="cmnttype1"><input type="radio" name="cmnttype" id="cmnttype1" value="site" checked>
      About the site</label></li>
      <li><label for="cmnttype2"><input type="radio" name="cmnttype" id="cmnttype2" value="company">
      About the company</label></li>
    </ul>
    <p><label for="perscmnt">Let us know what you think:</label></p>
    <p><textarea rows="3" cols="30" name="perscmnt" id="perscmnt">Hi,
I like your &quot;stuff&quot;.</textarea></p>
    <p><label for="cmntfile">Upload comments as a file:
    <input type="file" name="cmntfile" id="cmntfile"></label></p>
  </fieldset>
  <p>
    <input type="reset" value="Reset">
    <button type="submit">Submit</button>
  </p>
</form>

That would create this form:

Personal

Name:

Password:

History

How did you hear about us:

How many times have you visited:

Email spam
Postal spam

Comments

About the site
About the company

Let us know what you think:

Hi, I like your "stuff".

Upload comments as a file:

Is there more?

There is more, but it is not widely supported. Web Forms 2.0, now included in HTML 5, adds many more input types and abilities. Most notably, it adds number inputs, date inputs, time inputs, slider inputs ('range' - like a volume control), text inputs with auto-completion, output status fields, and others. As well as that, they are self validating, so you can specify the format of the input, and it will require the user to enter valid data. There are also many other features, such as nested forms, multiple file uploads, and the ability to automatically repeat sets of inputs.

One of the extra useful features is that older browsers that do not understand the specific type of input will fall back to a basic text input, meaning that you can use the newer input types, and the form will work in browsers that support it, and still function as a basic form in browsers that do not.
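
A rough sketch of how some of these newer inputs could be written is shown below. The action, the input names, and the limits are all invented for the example. A browser that does not recognise the TYPE value should treat each of them as a normal text input and ignore the extra attributes, so the server side script should still check the values it receives.

<form method="post" action="booking.php">
  <p><label for="guests">Number of guests:
  <input type="number" name="guests" id="guests" min="1" max="8" value="2"></label></p>
  <p><label for="arrival">Arrival date:
  <input type="date" name="arrival" id="arrival" required></label></p>
  <p><label for="volume">Volume:
  <input type="range" name="volume" id="volume" min="0" max="10" value="5"></label></p>
  <p><label for="refcode">Reference code (3 letters then 3 digits):
  <input type="text" name="refcode" id="refcode" pattern="[A-Za-z]{3}[0-9]{3}" required></label></p>
  <p><input type="submit" value="Book"></p>
</form>

The MIN, MAX, REQUIRED and PATTERN attributes are what provide the self validation in browsers that support them; the PATTERN value is a regular expression that the whole value must match.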

Currently, Web Forms 2.0 is supported by Opera 9, and olav.dk has a behaviour file that can be used to add support for Internet Explorer 6+. Safari and Chrome also support the range input.

Image maps

Note that image maps can cause significant accessibility problems (and can be hard work to maintain), so you should restrict their use to places where they are really appropriate, such as (surprise) a map, where clicking on the parts of the map gives information about the relevant area. If you plan to make an image map out of a list of words just to make your navigation prettier, then you are using them for the wrong reason, and you should use a normal list styled with CSS.

Image maps allow you to make certain areas of an image into links. There are two types of image maps: server side and client side.

Server side image maps

For a server side image map, put an image inside a link, and set the ISMAP attribute on the IMG (just the name, it does not need a value). When the link is clicked, the browser will request the given link, and add ?x,y on the end of it, as the click offset from the left,top corner of the image (such as foo.html?47,8). If the user is not using a mouse (or equivalent), then the coordinates will be 0,0.

<a href="foo.html"><img src="bar.gif" alt="" ismap></a>

Client side image maps

Internet Explorer only understands image maps that use AREA elements, not A elements.

Client side image maps are generally more popular. With a client side image map, you can specify a list of areas that will be used as the links. This gives the user the chance to see immediately if where they are about to click is somewhere useful, as opposed to the server side image map where they must wait for the reply from the server to tell them. There are four types of these areas: rectangles, circles, polygons and default.

Firstly, you need to create the map that will be associated with the image. This is created using the MAP element, which must have a NAME attribute set, with a name that will be used to reference the map. Images that use the map should have their USEMAP attributes set to the same as the map name, with a '#' character in front of it. The closing tag is required. The MAP is a little strange, since it is an inline element, but it can contain block level contents.

Note: in theory, the map can contain a mix of AREA elements and block level content. The block level content will always be displayed, even if image maps are supported. Any links within the block level content will be interpreted by the map in the same way as AREA elements, so they can have the SHAPE and COORDS attributes. This allows you to use part of the normal content as the map areas, hopefully ending up with a more accessible document. Unfortunately, this capability is not well supported, and Internet Explorer in particular does not support it. Since that means that the majority of Web users cannot use these A areas, I recommend you stick with basic areas.

Image maps can be placed anywhere in the document (inside elements where inline content is allowed), and can be before or after the image(s) that use them.

The AREA element should be treated as a block level element, and must be directly inside the MAP element, not inside any of the other block level content inside it. If you intend to use normal block level content inside the map, I recommend you only put it where it makes sense, since it will be rendered, and also only put it inside an element where that sort of content is allowed (such as inside a DIV or LI element). Personally, I think the idea that a map should be inline is wrong, considering the way it is used, but that is what the spec says.

Areas are not rendered visually if image maps are supported. They remain invisible, and only create an area of the map that can be clicked. Browsers that cannot display image maps generally display a list of all the links and areas inside the map. To allow them to display the areas, each AREA needs the ALT attribute set, giving the text that should be displayed as the link content in these browsers. They will usually be displayed in the order that they are defined in the source, so make sure that it makes sense.

Creating areas

Internet Explorer does not understand the default shape.
No major browsers understand percentage coordinates correctly, and all interpret percentage coordinates as pixel coordinates.

For now, I will concentrate on the AREA element, but just remember that the SHAPE and COORDS attributes also apply to links inside the map (although again, I recommend that you do not use them).

Firstly, you need to plan what shapes you intend to use, and where they will go. Try to make sure the shapes make sense, and that the user will be able to recognise where those shapes might be on the map. In most browsers, the only way they will know there is an area is that their mouse cursor will change when they hover over it. Image map areas accept almost no styling. The three main shapes are rectangles, circles and polygons. The specification allows percentages for any of these, but most image maps use exact pixel values, as they work with fixed size images (and, as noted above, browsers treat percentages as pixels anyway). To create an area, add an AREA tag, and use the SHAPE attribute to define the shape; one of "rect", "circle" or "poly". Then use the COORDS attribute to specify the comma separated list of coordinates:

Rectangle

This expects four coordinates. The horizontal position of the top-left corner, the vertical position (from the top of the image) of the top-left corner, the horizontal position of the bottom-right corner and the vertical position of the bottom-right corner. An example would be:

shape="rect" coords="10,20,75,40"

Circle

This expects three coordinates. The horizontal position of the centre, the vertical position of the centre and the radius of the circle (percentage radii are taken as a percentage of the shorter side of the image). An example would be:

shape="circle" coords="50,80,20"

Polygon

This expects as many pairs of coordinates as you need to make your polygon. These can make any polygon shapes you need, and can have sloping lines. All coordinates are specified as horizontal position then vertical position, with all of them in a long comma separated list. The last pair of coordinates can optionally match the first. An example would be:

shape="poly" coords="217,305,218,306,218,306,228,316,243,316,243,325,229,325,229,322,217,310"

If any of these areas overlap, the one that is defined first will be used in the places where they overlap. There is also a "default" shape, which covers the entire image, and does not need the coords attribute. However, I advise you not to use this shape, as it makes it impossible for a user to know when they are over a proper area, since the mouse cursor will always show as an area link.

It is possible to use an AREA to punch a hole out of another one. Instead of giving it an HREF attribute, set the NOHREF attribute (without giving it a value). Then make sure that it appears before the other area in the source code, and it will be placed on top of it, as a dead space where the other area will not react.

Remember that every area must have an ALT attribute giving the alternative text to display. For areas with no HREF, it is best to provide an empty ALT attribute. If you use A elements instead, these cannot have an ALT attribute, but browsers can use their contents instead. I also recommend giving every area a TITLE attribute, which most browsers will display as a tooltip when hovering over the area. This makes it much easier to see what the area represents.

An image map example

In this example, I create four areas. One is a rectangle, representing a flag. One is a circle with another circle overlaying it. This creates the doughnut representing a life ring. Lastly there is the polygon representing a beach hut.

<div>
  <map name="beachmap">
    <area href="/" shape="poly" coords="17,51,42,35,66,51,66,89,17,89"
      alt="Beach hut" title="Beach hut - where you get changed">
    <area shape="circle" coords="99,92,12" nohref alt="">
    <area href="/" shape="circle" coords="99,92,23"
      alt="Life ring" title="Life ring - to help you swim">
    <area href="/" shape="rect" coords="129,27,171,52"
      alt="Flag" title="Flag - says if it is safe to swim">
  </map>
</div>
<p><img src="../jsexamples/imagemap.png" alt="" usemap="#beachmap"></p>

Last modified: 2 January 2012
