HTML Programming

and

The World Wide Web

Researched and Prepared by

Adam Thodey

University of Michigan

December 12, 1994

Preface

This document is meant to give you a basic introduction to the World Wide Web and to show you how to program your own homepage. It is by no means a complete guide to programming in HTML or to using a specific browser. You can expect to gain an understanding of the basic functions of HTML and of some of the more advanced tools. I hope that, through both use and feedback, this document can become a major help to those who would like to write their own homepage.

Author's Note

When I first learned about the World Wide Web, I thought it was only a fad that would pass. After gaining more experience through trial and error with a browser, seeing what information I could get, and listening to both news stories and friends who had written their own homepages and were having fun with them, I decided that I should get involved and let the world know I exist. There really was no comprehensive guide to making your own homepage, so after gathering information from different places at the University and on the Web, I sat down and wrote my own. As I gathered more and more information on how to do different things in my homepage, I decided to compile it into one easy reference for anyone who wants to write a homepage or any other document to be accessed over the Web.

Table of Contents

Preface
Author's Note
Introduction
World Wide Web
HyperText Markup Language
Creating HTML Documents
General Format of an HTML Document

Introduction

With the growth of the world's population and of the amount of information that each person carries, there must be a place where it can all be stored. Today, almost everyone in the United States has a computer or has one available to them. This availability of computers has sparked a new wave of information exchange. Created by the Defense Department to carry information between defense contractors and the government, the internet today serves as a means of sharing information and services among a variety of disciplines, groups, and individuals. In an attempt to simplify the task of cruising the internet, the World Wide Web was born. The World Wide Web provides a standard for data transmission and retrieval through hypertext files.

World Wide Web

The World Wide Web (Web) is a simple tool with which a user can navigate his or her way through the internet using hypertext. Hypertext consists of both text and links that either expand upon that text or lead to other related information. This "text" is stored in files in different directories around the world, where it can be accessed by different people for different reasons.

Through the use of a browser, different types of media can be accessed. This can range from sounds of any kind, to movies, to pictures, to weather maps, to astronomical data. With these advancements in technology, almost everything a person wants to find is out there waiting to be discovered.

HyperText Markup Language

HyperText Markup Language (HTML) is the language used to define a document and its presentation on the World Wide Web. HTML is an agreed-upon standard for easy navigation and retrieval of documents over the World Wide Web. These documents can include text, pictures, graphics, multimedia, and sounds. The language is simple to use once you have learned the basic commands. There are many more advanced features of programming for the World Wide Web that this document will not go into, but it will point to where you can look up this information.

HTML is a simple markup language used to create hypertext documents that are portable from one platform to another. HTML documents have generic semantics that are appropriate for representing information from a wide range of applications. HTML markup can represent hypertext news, mail, documentation and hypermedia; menus of options; database query results; simple structured documents with in-lined graphics; and hypertext views of existing bodies of information. [1]

Creating HTML Documents

HTML documents are written in plain (ASCII) text format. This can be done in several different ways. You can use a simple text editor, such as SimpleText on the Macintosh or nedit on a UNIX workstation. You could even use a word processor, such as Word or WordPerfect, as long as you save your document as text.

General Format of an HTML document

This section is designed to outline the common format of an HTML document. The items used in the formatting of the document will be discussed later on in this document.

Each HTML document contains certain elements in a certain order, plus other elements wherever the writer needs them. The following is an example HTML document; its output follows.

<HEAD>

<TITLE> A sample document </TITLE>

</HEAD>

<BODY>

<H1> A Sample document </H1>

<HR>

Hi, My name is Adam, and this is a sample document to show you the different ways of HTML documents.

<P>

When you begin writing your HTML documents keep the basics in mind and try to have it compatible with as many browsers as possible.

<P>

I am a very busy person as you can see from my <A HREF = "schedule.html">Schedule</a>. It's loaded with everything I do every day. Also, I am involved in many <A HREF = "Cluborg.html">Clubs and Organizations</a> and doing a lot of different <A HREF = "activity.html">activities</a>.

This is just a sample of the things I do.

<HR>

<address>athodey@engin.umich.edu</address>

<I>updated 12 December 1994</I>

</BODY>



This prints out the following:

A Sample document

Hi, My name is Adam, and this is a sample document to show you the different ways of HTML documents.

When you begin writing your HTML documents keep the basics in mind and try to have it compatible with as many browsers as possible.

I am a very busy person as you can see from my Schedule. It's loaded with everything I do every day. Also, I am involved in many Clubs and Organizations and doing a lot of different activities.

This is just a sample of the things I do.

athodey@engin.umich.edu

updated 12 December 1994

In the browser, the words Schedule, Clubs and Organizations, and activities appear underlined. They are hyperlinks to other documents that contain information about each of them.

Programming

When beginning to program in any language, one must remember its structure. HTML has a very simple structure. There are a few essential tags that the document must contain and a few optional tags that I strongly suggest you include. Let me begin by explaining how these tags are used.

The HyperText Markup Language is made up of markup tags that either describe the format of the text they enclose or refer to something external. A tag may include a name, some attributes, and some text or hypertext. Tags appear in the HTML document as:

<tag_name> text </tag_name>

<tag_name attribute_name=argument> text </tag_name>

or just <tag_name>

To use a tag, you put the tag name between less-than and greater-than symbols (angle brackets), i.e. <tag_name>. This tells whatever program is viewing the HTML document that this is a tag containing instructions for the viewer. To close the region that the tag defines, you repeat the tag with a forward slash in front of the tag name, i.e. </tag_name>. Remember, not all tags need a closing tag.
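As a rough illustration of the three forms above, here is how they might look with real tags that are described later in this document. The file name schedule.html is only a placeholder.

```html
<!-- A tag with an opening and a closing form -->
<B>This text is bold.</B>

<!-- A tag with an attribute (HREF) and its argument -->
<A HREF="schedule.html">My schedule</A>

<!-- A tag that needs no closing tag -->
<HR>
```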

The first essential tag is the <HEAD> ... </HEAD> tag. This tag is the header of your HTML document. With it, someone can download just the information about your document, without having to download the entire thing, and then decide whether they want to view it. The elements that are placed within the head element are:

<title> ... </title>

Specifies the document's title. The title does not appear in the document itself, as is customary in printed documents; instead, it usually appears in the window's title bar, identifying the contents of the window.

<base>

Specifies the name of the file in which the current document is stored. This is useful when link references within the document do not include full pathnames (i.e. are partially qualified).

<link rev="RELATIONSHIP" rel="relationship" href="URL">

The link tag allows you to define relationships between the document containing the link tag and the document specified in the "URL". The rel attribute specifies the relationship between the HTML file and the Uniform Resource Locator (URL). The rev attribute (for "reverse") specifies the relationship between the URL and the HTML file. For example, <link rev="made" href="URL"> indicates that the file maker or owner is described in the document identified by the URL. (Note that link tags are not displayed on the screen as part of the document. They define static relationships, not hypertext links.)
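To show how these three elements fit together, here is a small sketch of a head section. The title, the server name www.example.edu, and the e-mail address are all made up for illustration; substitute your own.

```html
<HEAD>
<!-- Shown in the window's title bar, not in the page itself -->
<TITLE>Course Notes</TITLE>

<!-- Hypothetical location of this document; partial links are resolved against it -->
<BASE HREF="http://www.example.edu/~student/notes.html">

<!-- Hypothetical address of the person who made this document -->
<LINK REV="made" HREF="mailto:student@example.edu">
</HEAD>
```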

The next essential tag is the <BODY> ... </BODY> tag. This separates the body of the HTML document from the header. Within these tags goes everything else, as described below.

You should give each document a header. This header is not necessarily the same as the title of the document, but most of the time it is. Six header levels have been defined, numbered from one (1) to six (6), with one being the highest level. The syntax of the header tag is as follows:

<Hy> The Header Text </Hy>

where y is the header level number.
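For instance, a page with a main header and two lower levels might be marked up as in the sketch below. The header text is invented for illustration, and the exact size and weight of each level depend on the browser.

```html
<H1>My Home Page</H1>
Welcome to my page.

<H2>Classes</H2>
These are the classes I am taking.

<H3>Fall Term</H3>
Details for the fall term go here.
```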

The other essential tag that should be included in the body of an HTML document is

<address> ... </address>

Specifies an address-type statement and presents it as such. This is typically used to give the creator's e-mail address.

Another piece of information that is useful to the reader is the date your document was last updated. Simply put the date on which you last revised or updated the document. If you want to give the date it was created, please also include the date it was updated or revised. For example:

Updated on 12 December 1994

or

Last revised 11 December 1994

Basic Formats to HTML

In an HTML document, you can use a variety of tags that control formatting. Most of these tags can be nested, as long as you remember to end each one when its effect is no longer needed. The first thing to explain is how HTML interprets spacing within the document.

As when programming in any language, it is helpful to add "white space" to make the document readable. In HTML, however, the "white space" does not mean anything to the browser. That means if I type something like the following,

This is only an example.

I wonder what it will print.

In the browser one would see:

This is only an example. I wonder what it will print.

As you can see, the browser did not honor the line spacing. To get a break you need a tag. To separate paragraphs, one uses the <P> tag. It is one of the tags that does not need an ending tag. As an example, if I were to put

This is an example of the paragraph tag to separate paragraphs.

<P>

It really does work.

it would be the same as if I wrote

This is an example of the paragraph tag to separate paragraphs.<P>

It really does work.

or

This is an example of the paragraph tag to separate paragraphs.

<P>It really does work.

All these examples will print the following:

This is an example of the paragraph tag to separate paragraphs.

It really does work.

Another way of putting text on the next line, without adding an extra space between the lines, is to use the <BR> tag. It also does not require an ending tag. For example

An example of the line break tag.<BR>It moves to the next line.

is the same as

An example of the line break tag.<BR>

It moves to the next line.

or

An example of the line break tag.

<BR>It moves to the next line.

This produces the following in the browser

An example of the line break tag.

It moves to the next line.

If you want text to keep a certain layout, i.e. have spaces and CR/LF (Carriage Return/Line Feed) count, then you want to use the <pre> ... </pre> tags. These identify the text as already formatted, so the browser displays it as is. Preformatted text may include embedded tags, but not all tag types are permitted.
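A small table is a typical use for preformatted text. In the sketch below (the schedule itself is invented), the spaces that line up the columns and the line breaks are kept exactly as typed.

```html
<PRE>
Day        Class
Monday     Physics
Tuesday    Chemistry
Wednesday  Calculus
</PRE>
```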

You can also make text bold, italic, underlined, or typewriter font, among other styles, by using the following tags:

<b> ... </b> Boldface

<I> ... </I> Italics

<u> ... </u> Underline

<tt> ... </tt> Typewriter Font

<sup> ... </sup> Superscript

<sub> ... </sub> Subscript

<s> ... </s> Strikethrough

These are known as physical styles. They are absolutely defined; that is, they indicate directly how the text is to be rendered.

The following are logical styles. They describe what the text is (its logical role) rather than how it looks, and each is rendered in whatever style has been set up in your browser.
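As a short sketch of physical styles in use (the sentences are only sample text), note how the last line nests one style inside another and closes the inner tag first:

```html
This word is <B>bold</B>, this one is <I>italic</I>,
and this one is in <TT>typewriter font</TT>.
<P>
Styles can be nested: <B><I>bold italic</I></B>.
```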

<em> ... </em> Emphasis

<code> ... </code> Display an HTML directive

<samp> ... </samp> Include Sample Output

<kbd> ... </kbd> Display keyboard key

<var> ... </var> Define a variable

<dfn> ... </dfn> Display a definition (not widely supported)

<cite> ... </cite> Display a citation

<q> ... </q> A short quotation

<strong> ... </strong> Strong Emphasis
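Here is a brief sketch using a few of the logical styles above. The sentences and the cited title are invented for illustration, and how each style actually looks depends on your browser's settings.

```html
<EM>Note:</EM> type <KBD>ls</KBD> to list your files.
<P>
The <CODE>&lt;HR&gt;</CODE> tag draws a line across the page.
<P>
<STRONG>Do not</STRONG> forget the closing tag.
<P>
See <CITE>A Hypothetical Guide to HTML</CITE> for more.
```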

As you view many people's homepages or documents, you might wonder how they display people's names, each with a link, on separate lines. This is done by using a list tag. There are a few ways of displaying lists. Mosaic for the Macintosh presents some of them in exactly the same way.

Most lists have a beginning tag and an ending tag, with the <li> tag in the middle to mark each item in the list.

To present a list without numbers, one uses the <ul> ... </ul> set of tags. This is known as the unordered list; its code is seen below.

<ul>

<li> Your first item in your list

<li> Your second item in your list

: :

<li> Your last item in your list

</ul>

This prints the following in Mosaic for PowerPC version 2.0.0a6.

• Your first item in your list

• Your second item in your list

: :

• Your last item in your list

The next type of list is an ordered list. This puts a number next to each item in your list, starting with the number one (1), as seen here:

<ol>

<li> Your first item in your list

<li> Your second item in your list

: :

<li> Your last item in your list

</ol>

This prints the following in Mosaic for PowerPC version 2.0.0a6.

1. Your first item in your list

2. Your second item in your list

: :

n. Your last item in your list

where n is the nth item in your list.

Another type of list is the definition list, or glossary; its tag is <dl> ... </dl>. In this list you present the word to be defined using <dt>; then on the next line you put its definition using <dd>.

<dl>

<dt> A term to be defined

<dd> its definition

<dt> Another term to be defined

<dd> and this term's definition

</dl>



Displays

A term to be defined

its definition

Another term to be defined

and this term's definition

There are a few other types of lists, which I will describe briefly.

An interactive menu uses the tags <menu> ... </menu>, with its items listed using the <li> tag. Another type of list is the directory list, represented by <dir> ... </dir>, with its items also listed using the <li> tag. In Mosaic for PowerPC version 2.0.0a6, these two appear the same.
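Both kinds of list are written the same way as an unordered list, with only the outer tags changed. The sketch below uses invented items; as noted, your browser may show the two lists identically.

```html
<MENU>
<LI> Open a document
<LI> Save a document
</MENU>

<DIR>
<LI> index.html
<LI> schedule.html
</DIR>
```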

Linking to other Documents

The major concept behind HTML is being able to link different regions of text (and images as well) to another document someplace else. Browsers show these regions by either coloring or underlining the pertinent text to indicate that they are hypertext links (links).

To have a link in your HTML document to another object, you must anchor that link to some text (or graphics). This is done with the hypertext-related tag <A>, for anchor. You must also indicate that it is a hyperlink reference by using the HREF attribute. To include an anchor in your document:

1 - Start the anchor with <A (There is a space after the A)

2 - Specify the document that's being pointed to by entering the parameter HREF="filename" followed by a closing right angle bracket, >.

3 - Enter the text that will serve as the hypertext link in the current document

4 - Enter the ending anchor tag: </A>

A sample hypertext reference is as follows:

<A HREF="Fortran.html">FORTRAN Information</A>

This makes the words "FORTRAN Information" the hyperlink to the document Fortran.html, in the same directory as the document that contains the link. This is known as a relative link. If you instead give the absolute pathname of the file, it is known as an absolute link.

The advantage of relative links over absolute pathnames is that if you move a group of documents together to another location, the relative links among them will still be valid. You should use absolute pathnames when linking to documents that are not directly related to yours.
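To make the contrast concrete, here is the same link written both ways. The server name www.example.edu and the path /docs/ are hypothetical and stand in for wherever the file really lives.

```html
<!-- Relative link: found in the same directory as the current document -->
<A HREF="Fortran.html">FORTRAN Information</A>

<!-- Absolute link: the complete location is spelled out (hypothetical server and path) -->
<A HREF="http://www.example.edu/docs/Fortran.html">FORTRAN Information</A>
```

If both documents are moved together to a new directory, only the first link keeps working without being edited.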

Uniform Resource Locator

The Web uses Uniform Resource Locators (URLs) to specify the location of a file on other servers and to determine the type of resource that is being accessed. The syntax is:

scheme://host.domain[:port]/path/filename

where scheme is one of the following:

file

a file on your local system, or a file on an anonymous FTP server

http

a file on a Web server

gopher

a file on a Gopher server

wais

a file on a WAIS server

news

a UseNet newsgroup

telnet

a connection to a Telnet-based service

The port number can usually be omitted. Include it only if you have been told that the server uses a port other than the standard one.
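The sketch below shows how a few of these schemes might look inside anchors. All of the host names are hypothetical, and port 8080 is just an example of a nonstandard port. Note that a news URL names the newsgroup directly and does not use the double slash or a host name.

```html
<A HREF="http://www.example.edu/index.html">A Web page</A>
<A HREF="http://www.example.edu:8080/index.html">A Web page on a server using port 8080</A>
<A HREF="gopher://gopher.example.edu/">A Gopher server</A>
<A HREF="telnet://library.example.edu/">A Telnet-based service</A>
<A HREF="news:news.announce.newusers">A UseNet newsgroup</A>
```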

To include my homepage as a link in your document you would type:

<A HREF="http://www.engin.umich.edu/~athodey">Adam's Home Page</A>

To link to a specific section in another document, you first need to name an anchor at that section, inside the document you are linking to:

Here is <A NAME = "Information">some information</A>

Then, in the document that contains the link, you would put:

Here is some <A HREF = "otherdocumentname.html#Information"> information</A> in another document.

If you were to click on the word "information" it would send you directly to "some information" in the other document.

To include a link to someplace in the same file, you would do exactly the same thing, except leave out the file name.

Here is some <A HREF = "#Information"> information</A> in the same document.

If you were to click on the word "information" it would send you directly to "some information" in the same document.

Special Characters

Four of the ASCII characters have special meaning within HTML; therefore, they cannot be used "as is" in text. These are the left and right angle brackets (< and >), the ampersand (&), and the double quote ("). To use one of these characters in your document, you must use the escape sequence associated with it. The escape sequences are case sensitive:

&lt;

the escape sequence for <

&gt;

the escape sequence for >

&amp;

the escape sequence for &

&quot;

the escape sequence for "

A full list of supported characters can be found at CERN.
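As a quick sketch of the escape sequences in use (the sentences are only sample text):

```html
The &lt;P&gt; tag starts a new paragraph.
<P>
AT&amp;T is written with an ampersand.
<P>
She said, &quot;Hello.&quot;
```

In the browser, the first line reads "The <P> tag starts a new paragraph." with the tag name shown as ordinary text, the second shows AT&T, and the third shows the word Hello. inside double quotes.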

Horizontal Rules

You can include lines that extend all the way across the screen in the browser. The tag that produces this is <HR>.

In-Line Images

Most graphical browsers have the capability to display images alongside text. The images must be in either GIF format or X Bitmap (XBM) format. The tag for the in-line image is:

<IMG SRC=image_URL>

where image_URL is the URL of the image file. If the image file is in GIF format, its name must end in .gif; likewise, X Bitmap images must end in .xbm. You can control how an image lines up with the text beside it by using the ALIGN attribute. You can align the text with the top, middle, or bottom of the image by using ALIGN=place, where place is top, middle, or bottom. By default, the text is aligned with the bottom of the image.

Since there are certain browsers that cannot display in-line images, you can specify text to be displayed instead. To do this you use the alt attribute: <img alt="o" src="image.gif">
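Putting the SRC, ALIGN, and ALT attributes together, an in-line image might be written as in the sketch below. The file name logo.gif and the alternate text are made up for illustration.

```html
<!-- The text is lined up with the top of the image;
     browsers that cannot show images display [Logo] instead -->
<IMG SRC="logo.gif" ALIGN="top" ALT="[Logo]"> Welcome to my home page.
```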

You may want to have an image open as a separate document. You would link the image the same way as you would with any other document. Just keep in mind that certain extensions have been defined to help locate external programs to run that file or to let the browser know what type of file it is loading.
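A link to an external image is written like any other anchor. In this sketch the file names campus.jpeg and campus-small.gif are hypothetical; the second line uses a small in-line image as the thing to click on.

```html
<!-- Text link to a large image that opens as a separate document -->
<A HREF="campus.jpeg">A photograph of campus</A> (JPEG image)

<!-- A small in-line image used as the link to the full-size picture -->
<A HREF="campus.jpeg"><IMG SRC="campus-small.gif" ALT="[Campus photo]"></A>
```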

The file type extensions are as follows:

.txt

Plain text

.html

HTML document

.gif

GIF image

.tiff

TIFF image

.xbm

X Bitmap image

.jpg or .jpeg

JPEG image

.ps

PostScript file

.aiff

AIFF sound

.au

AU sound

.mov

QuickTime movie

.mpeg or .mpg

MPEG movie

Putting it All Together

Now that you have learned the basics of HTML programming, you can use this document as an aid when programming your own World Wide Web documents.

As a student at the University of Michigan, you have either an ITD or a CAEN account (and in some cases, both). I will describe what you must do in each case to give other people access to your HTML documents.

For both CAEN and ITD account holders, you need to have an html directory in your Public directory. If you don't already have one, all you need to type after you log in is

mkdir -p ~/Public/html

Now when you create your html documents, make sure they are located in your html directory.

The URL for your ITD account is: http://www.umich.edu/~uniqname/

The URL for your CAEN account is: http://www.engin.umich.edu/~loginid

For more information about ITD accounts, telnet umce.itd.umich.edu.

Now that you have created your html directory, you need to put all your HTML documents into that directory and give access rights to any user and the Web. Giving access rights to your html directory is simple. First make sure you are in that directory. Then type the following:

fs sa . -acl ip_addrs:www rl

fs sa . -acl system:anyuser rl

The name of your homepage is index.html (all lowercase). When you write it, please name it as such.

You can also add your homepage URL to the X.500 directory.

Type everything after the prompt and follow the instructions inside the square brackets.

(% represents your prompt in your unix account, * represents the prompt in the X.500 directory)

% ud

* modify [leave a space and then type your name]

then it will give you a list of the things you can change.

[Select More information and type in what it requests.]

To view your own entry, either before or after you modify it, type

* find [followed by your name]

Now you are all set up for people to have access to your homepage and World Wide Web HTML documents.

Getting Your Homepage Statistics

One good thing about having a CAEN account is that CAEN has set things up so that you can get access statistics for your homepage and other relevant pages. You can also get statistics for any of your friends' and acquaintances' homepages. The URL for this is

http://www-personal.engin.umich.edu/cache/stats/uniqname.html

Acknowledgments

I would like to thank all those on the Web whose work I researched and read in bringing this document to life.

In no particular order:

EDS - http://www.pcweek.ziff.com/~eamonn/about_me.html

Unsure - http://nearnet.gnn.com/gnn/news/feature/netizens/create.html

Michael Grobe, Academic Computing Services, The University of Kansas, 25 September 1994, grobe@kuhub.cc.ukans.edu - http://kufacts.cc.ukans.edu/lynx_help/HTML_quick.html

1. Tim Berners-Lee, timbl@quag.lcs.mit.edu

Daniel W. Connolly, Hal Software Systems, connolly@hal.com

http://www.hal.com/users/connolly/html-spec/spyglass-19941128/htmlspec281194_1.html