How to Write HTML Pages for PhysioNet

If you are preparing an HTML page to be posted on PhysioNet, please start with a copy of this page, which you can view or download as template.txt. Don't simply use your browser's File → Save feature to download this page, since what you will get by doing so is not the page source containing the SSI commands (described in the next section) that are interpreted by PhysioNet's Apache web server. It's important to use these SSI commands in your final page, because they ensure that your page contains the current version of the PhysioNet banner, the correct feedback links, and the correct modification date.

Structure of a PhysioNet page

To simplify page composition, and to provide a consistent navigation experience and a clean, uniform style, PhysioNet pages follow a simple structure, as shown below and in short-template.txt:

<!--#set var="TITLE" value="Insert your page's title here" -->
<!-- additional SSI variables may be set here -->
<!--#include virtual="/pn/head.shtml" -->
<p> Insert your page's HTML body content here. If you are creating a
HEADER.shtml file, omit the next line.
<!--#include virtual="/pn/footer.shtml" -->

(You can also view this as short-template.shtml.)

The three lines above that begin with <!--# are SSI commands that PhysioNet's Apache web server interprets. They should appear exactly as shown (except that you should substitute the title of your page where indicated).

The first line (<!--#set var="TITLE" value="..." -->) is optional but strongly recommended. It sets the title for the page, which most browsers show in the title bar of the browser window and/or in the tab title. If you omit this line, "PhysioNet" or "PhysioNetWorks" will appear in the browser's title bar.

As the comment in the second line notes, other SSI variables can be set at this point (see SSI Directives below).

When you view a PhysioNet page, the navigation bar appears at the top of the browser window. The third line above (<!--#include virtual="/pn/head.shtml" -->) provides the navigation bar and is required for all HTML pages added to PhysioNet. It loads the contents of head.shtml, which include not only the navigation bar, but also the DOCTYPE declaration for the page (HTML 4.01 Transitional), the character set declaration (utf-8), PhysioNet's CSS style sheet, an anchor point named "top" (see Hyperlinks below), the page's <h1> headline (if TITLE was set previously), and enhancements for JavaScript-enabled browsers (such as the page's table of contents visible near the upper right corner of the page).

Your page's content (body) follows next. Note that it should not contain <html>, <head>, or <body> tags, or the corresponding end tags.

The last line shown above defines the page footer. It should be included on all pages except HEADER.shtml pages (see the next section).
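The structural rules above lend themselves to a quick automated check before you submit a page. The following is a minimal sketch, not a PhysioNet tool: the function name check_structure and its messages are hypothetical, and it assumes the include lines appear exactly as shown in the template.

```python
import re

# The two include lines, exactly as they appear in the template above.
HEAD_INCLUDE = '<!--#include virtual="/pn/head.shtml" -->'
FOOTER_INCLUDE = '<!--#include virtual="/pn/footer.shtml" -->'
# Tags that the page body should not contain (opening or closing).
FORBIDDEN_TAGS = re.compile(r"</?(html|head|body)\b", re.IGNORECASE)


def check_structure(source: str, filename: str) -> list[str]:
    """Return a list of structural problems found in a page's source."""
    problems = []
    if '<!--#set var="TITLE"' not in source:
        problems.append("TITLE is not set (optional but strongly recommended)")
    if HEAD_INCLUDE not in source:
        problems.append("missing the required head.shtml include")
    is_header = filename == "HEADER.shtml"
    has_footer = FOOTER_INCLUDE in source
    if is_header and has_footer:
        problems.append("HEADER.shtml should omit the footer include")
    if not is_header and not has_footer:
        problems.append("missing the footer.shtml include")
    if FORBIDDEN_TAGS.search(source):
        problems.append("body must not contain <html>, <head>, or <body> tags")
    return problems
```

An empty list means that none of these particular checks failed; it does not guarantee that the page is otherwise valid HTML.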


File names for PhysioNet HTML pages

On many other websites, pages have file names ending in .html. (Those developed using MS Windows tools sometimes use .htm instead.) On PhysioNet, either of these is acceptable, but .shtml is preferred.

The special suffix .shtml signals PhysioNet's Apache web server that the HTML source file may contain so-called SSI (server-side include) directives, which are described below.

File names on PhysioNet (including those of HTML pages as well as all other files) may include upper- and lower-case roman (unaccented) letters, numerals, and the characters ".-+_" (period, hyphen, plus, underscore). Don't use file names that contain other characters such as spaces, commas, apostrophes, quotation marks, parentheses, slashes, backslashes, colons, semicolons, question marks, ampersands, etc. Don't begin file names with punctuation, and avoid unnecessarily long names and names that will be difficult to type or to remember.
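The character rules above can be expressed as a single regular expression. This is a minimal sketch; the helper name is_valid_name is hypothetical, and it checks only the permitted characters and the first character, not whether a name is too long or hard to remember.

```python
import re

# Letters, digits, and ".-+_" only; the first character must not be punctuation.
VALID_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9.+_-]*")


def is_valid_name(name: str) -> bool:
    """Check a single file name (not a path) against the character rules."""
    return VALID_NAME.fullmatch(name) is not None
```

For example, is_valid_name("index.shtml") is true, while is_valid_name("my file.shtml") and is_valid_name(".hidden") are false.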

Default page: When your browser requests a PhysioNet URL ending in "/", Apache returns one of three default pages, depending on the contents of the corresponding directory (folder):

  1. index.shtml, if it exists within the directory
  2. (if index.shtml does not exist) HEADER.shtml, followed by a list of files within the directory
  3. (if neither index.shtml nor HEADER.shtml exists) a bare file listing without a page header
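The selection order in the list above can be summarized in a few lines. This sketch only models the decision as described here; the function default_page is hypothetical and says nothing about how Apache is actually configured.

```python
def default_page(files: set[str]) -> str:
    """Describe what is served for a directory URL ending in "/"."""
    if "index.shtml" in files:
        return "index.shtml"
    if "HEADER.shtml" in files:
        return "HEADER.shtml followed by a file listing"
    return "bare file listing"
```

Note that index.shtml takes precedence: if both files exist, HEADER.shtml is not used and no file listing is shown.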

In most cases, the top-level directory for your project should contain a default page called "index.shtml". If your project includes subdirectories, each subdirectory should have its own "index.shtml". Other pages meant to be viewed in a web browser should normally have names ending in ".shtml".

The Apache web server can construct a list of files in a directory (see this page for an example; scroll to the end of the page to see the file listing). If you wish to use this feature in a directory, do not create an "index.shtml" in that directory; rather, create a file called "HEADER.shtml", which contains the top of the directory's default page, to be followed by the file listing. This template (the file you are now reading) can be used as a model to prepare "HEADER.shtml" files, but the final SSI command that includes the page footer should be omitted; Apache inserts both the file listing and the page footer automatically when it serves a HEADER.shtml file.

If your project's top-level directory or one of its subdirectories does not include "index.shtml" or "HEADER.shtml", Apache will display a bare file listing without a page header for that directory. This is generally discouraged unless your project contains a large number of subdirectories that do not require individual descriptions (see the subdirectories of the MIMIC II Waveform Database for examples).


Character set

The character encoding for PhysioNet pages is utf-8 (Unicode). Pages must not contain byte-order marks (BOMs). Note that the characters <, >, and & must be escaped (as &lt;, &gt;, and &amp;, respectively) in order to appear in body text; this is true even within <pre> sections. You may type accented and other Unicode characters into the body text directly, although the use of HTML entities and Unicode escapes is recommended where possible.
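Both requirements above (valid UTF-8 without a byte-order mark, and escaped special characters) are easy to check or apply with the Python standard library. The following is a minimal sketch; the function names are hypothetical.

```python
import codecs
import html


def find_encoding_problems(data: bytes) -> list[str]:
    """Report a leading BOM and any bytes that are not valid UTF-8."""
    problems = []
    if data.startswith(codecs.BOM_UTF8):
        problems.append("file starts with a UTF-8 byte-order mark")
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        problems.append(f"invalid UTF-8 at byte offset {err.start}")
    return problems


def escape_body_text(text: str) -> str:
    """Escape <, >, and & so they display literally, even inside <pre>."""
    return html.escape(text, quote=False)
```

Pass the raw bytes of the file (opened in binary mode) to find_encoding_problems. Use escape_body_text only on literal text that you are about to paste into a page, not on markup that should remain markup.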

The most common formatting error in contributed pages is the use of characters stored in a Windows-specific encoding rather than utf-8. These typically appear if you use software such as MS Word to compose your pages, since accented characters and punctuation such as quotation marks and apostrophes copied or saved from Word documents are often not encoded as utf-8.
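If a page turns out to contain such characters, one possible repair is to re-encode it. The sketch below assumes that any file that is not valid UTF-8 was saved as Windows-1252 (Python's cp1252 codec); that assumption, and the function name to_utf8, are illustrative only, so inspect the result before posting it.

```python
def to_utf8(data: bytes) -> bytes:
    """Return data as UTF-8, assuming Windows-1252 if it is not already UTF-8."""
    try:
        data.decode("utf-8")
        return data  # already valid UTF-8; leave it untouched
    except UnicodeDecodeError:
        # Assumption: the legacy encoding is Windows-1252 (cp1252).
        return data.decode("cp1252").encode("utf-8")
```

A few byte values are undefined in Python's cp1252 codec; if the file contains one of them, the second decode raises UnicodeDecodeError, which suggests that the file uses some other encoding.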


SSI directives

As the example above shows, SSI directives can be used to include the contents of one file within another. The third and last lines in the structure outlined above do this: the server interprets them as instructions to transmit the standard page header and footer before and after the body content whenever the page is requested. Other types of SSI directives can be used to set variables (as in the first line of the example) and test their values (as in head.shtml, visible without SSI interpretation as head.txt). Others can be used to display timestamps of files (as in footer.shtml, a.k.a. footer.txt).

SSI directives have the outer syntax of HTML comments: they begin with "<!--", and end with "-->". As a result, if uninterpreted SSI directives are received by a web browser, they are treated as comments and their contents are not displayed; they are invisible. This happens, for example, if you view a locally stored page containing SSI directives in your browser using its File → Open File... feature.

On PhysioNet, SSI directives are ignored unless they appear within files named with the suffix .shtml (so, for example, they are ignored in .html and .htm files). The SSI exec directive is disabled, and only files that have a text MIME-type (text/plain, text/html, etc.) can be included.
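The two restrictions above (directives are honored only in .shtml files, and exec is disabled) can be caught with a simple scan of the page source. This is a minimal sketch with a hypothetical function name, ssi_warnings; it recognizes directives only by their opening "<!--#" and command name.

```python
import re

# Matches the start of an SSI directive and captures its command name.
SSI_DIRECTIVE = re.compile(r"<!--#(\w+)")


def ssi_warnings(filename: str, source: str) -> list[str]:
    """Warn about SSI directives that will not work as described above."""
    commands = SSI_DIRECTIVE.findall(source)
    warnings = []
    if commands and not filename.endswith(".shtml"):
        warnings.append("SSI directives are ignored in files not named *.shtml")
    if "exec" in commands:
        warnings.append("the exec directive is disabled")
    return warnings
```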

These SSI directives may be useful:

<!--#set var="NOTOC" value="1"-->
To hide the table of contents, insert this directive before including head.shtml.
<!--#set var="CONTACT" value=""-->
To include an email address for feedback in the page footer, insert this directive before including footer.shtml.

In most cases, we don't recommend heavy use of SSI directives, since they may result in pages that load unnecessarily slowly. The directives used to display the standard headers and footers do not have this effect, however, since the content they display is kept in the server's cache (and, as such, can be served more quickly than otherwise, since no disk access is required to do so).


Section headings

If you define the TITLE of your .shtml file as in the example above, it will appear as the top-level (<h1>) heading for the page, just below the navigation bar. Your file should not include explicit <h1> tags; use <h2>-level headings for the major sections of your page, as on this page, and use <h3> headings for subsections if necessary.

These headings run across the page (or <div>, in multi-column layouts), with foreground and background colors for these headings (as well as the navigation bar and the top line of the footer) selected according to the portion of PhysioNet to which the page belongs (blue for PhysioBank, red for PhysioToolkit, brown for PhysioNetWorks, and green for the PhysioNet library and other pages not belonging to any of the other sections). The color selection is automatic (to see how it works, examine head.txt).

Subsubsections, etc.

Although <h4>, <h5>, and <h6> headings are available, use them sparingly if at all. Often it is more effective simply to mark up a few words in boldface (either set off in a paragraph of their own, as for the heading above this paragraph, or run in to the text of the paragraph they mark).

Table of Contents

If your page contains any <h2> headings, its table of contents will normally appear near the upper-right corner of the page. The table of contents, if present, contains links to the <h2>, <h3>, and <h4> headings. The table of contents is not displayed if the SSI variable NOTOC has been set (see SSI Directives).
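To preview which headings would be listed, you can collect the <h2>, <h3>, and <h4> headings from your page body yourself. The sketch below uses Python's standard HTML parser; it is not how PhysioNet builds the table of contents, and the names HeadingCollector and toc_entries are hypothetical.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) for each <h2>, <h3>, and <h4> heading."""

    TOC_TAGS = {"h2": 2, "h3": 3, "h4": 4}

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None  # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.TOC_TAGS:
            self._level = self.TOC_TAGS[tag]
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag in self.TOC_TAGS:
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def toc_entries(body: str) -> list[tuple[int, str]]:
    """Return the headings that a table of contents would link to."""
    parser = HeadingCollector()
    parser.feed(body)
    parser.close()
    return parser.headings
```

If toc_entries returns no level-2 entries, the page has no <h2> headings, in which case (per the paragraph above) a table of contents would not normally appear.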



Hyperlinks

When constructing hyperlinks, keep these points in mind:


Commands and their output, and code examples

If you need to show examples of commands to be typed by the reader, enclose them between <pre> and </pre> tags, as in this example, which shows a command, its output, and the command-line prompt:

$ echo "Hello, world!"
Hello, world!

The '$' represents the command-line prompt. It can be omitted if your example does not include output.

Similarly, if you want to show code to be compiled or interpreted, or text within a plain-text data file, enclose it between <pre class="code"> and </pre>, as in this example:

#include <stdio.h>

int main(int argc, char **argv)
{
    int i;

    /* Print each command-line argument on its own line. */
    for (i = 0; i < argc; i++)
        printf("%s\n", argv[i]);
    return 0;
}


PhysioNetWorks https:// pages

All new pages are initially developed on PhysioNetWorks, where they are served using secure HTTP (identifiable by URLs that begin with https://). Most browsers indicate whether https:// pages are secure (fully encrypted) or insecure (not fully encrypted). You can create a secure page by following the guidelines above.

If your page contains any http:// links pointing outside the server's domain, however, it will be classified as insecure, since form variables can be transmitted in unencrypted form if such links are followed.
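To find such links before posting, you can scan the href and src attributes in your page body. The following is a minimal sketch: the names InsecureLinkFinder and insecure_links are hypothetical, the host name you pass in is your own assumption about the server, and comparing exact host names is a simplification of "outside the server's domain".

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class InsecureLinkFinder(HTMLParser):
    """Collect http:// URLs in href/src attributes that point to other hosts."""

    def __init__(self, own_host: str):
        super().__init__()
        self.own_host = own_host.lower()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name not in ("href", "src") or not value:
                continue
            url = urlparse(value)
            # Relative links and https:// links are not flagged.
            if url.scheme == "http" and (url.hostname or "") != self.own_host:
                self.insecure.append(value)


def insecure_links(body: str, own_host: str) -> list[str]:
    """Return the external http:// links found in a page body."""
    finder = InsecureLinkFinder(own_host)
    finder.feed(body)
    finder.close()
    return finder.insecure
```

Each URL returned is a candidate for replacement with an https:// equivalent, if the target site offers one.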