HyperText Markup Language

From EDM2

Revision as of 15:19, 19 February 2016

The HyperText Markup Language, commonly abbreviated as HTML, is literally what it says on the tin: a hypertext markup language used to create web pages, help systems, user interfaces and other similar documentation and user-facing interactive elements. It was originally an application of SGML, introduced in 1991 by Tim Berners-Lee with his Web browser for the NeXT computer system, and was initially a tool used by scientists working at CERN. In more recent years development of HTML has primarily been done by web browser manufacturers, and it has diverged quite substantially from its SGML roots.

XHTML

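There is also a variant called XHTML, which is basically HTML reformulated to follow the stricter syntax rules of XML. It was originally meant to replace HTML as the basis for future versions, but that never happened due to inertia: HTML5 continues to use the traditional HTML syntax.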

Basic HTML

For formatting, HTML uses tags enclosed in angle brackets, for example <body>. Tags usually come in pairs like <em> and </em>, in which case the first tag, called the opening tag, marks the start of a formatting or feature, and the tag with the forward slash, commonly referred to as the closing tag, marks its end. Some tags represent empty elements and are therefore unpaired, for example <img>, and such tags can also be written as self-closing with a forward slash at the end, such as <br />. The self-closing form is required in XHTML and tolerated in HTML.
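The difference between opening, closing and self-closing tags can be made visible with a small script. The following is a minimal sketch in Python using the standard library's html.parser module; the class name TagLogger, the helper tag_events and the sample markup (including the file name logo.png) are made up for illustration.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Records every tag event the parser reports, in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Called for an opening tag such as <em> or an unpaired tag such as <img>.
        self.events.append(("open", tag))

    def handle_endtag(self, tag):
        # Called for a closing tag such as </em>.
        self.events.append(("close", tag))

    def handle_startendtag(self, tag, attrs):
        # Called for a tag written with a trailing slash, such as <br />.
        self.events.append(("self-closing", tag))


def tag_events(markup):
    """Return the list of (kind, tag) events found in an HTML fragment."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


# Hypothetical sample fragment; logo.png is a placeholder file name.
sample = '<p>Some <em>emphasised</em> text<br />and an image <img src="logo.png"></p>'
for kind, tag in tag_events(sample):
    print(kind, tag)
```

Note that the parser reports <img> as an ordinary opening tag: it does not know by itself which elements are empty, it only sees whether a trailing slash was written.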

Below is an example of an HTML web page that is missing only a header:

 <body>
 <h1>Main headline goes here</h1>

 <p>The emphasis (em) tag is used to <em>place emphasis on a portion of text</em>,
 in most systems and in all graphic mode browsers this is done by converting the
 emphasised text to italics, but in a few text mode browsers and in some SGML
 systems that are mostly oriented towards printed output this may be converted
 into bold or underlined text. Browsers intended for those with vision impairments
 may also choose to format em differently and tools that convert the HTML page
 into something else like spoken word may choose or need to make other choices.</p>

 <h2>Alternatively.....</h2>
 <p>You may choose to do the <b>emphasising formatting yourself</b> by using either a
 bold (b) or italics (i) tag <i>in place of the em tag</i> although that is discouraged
 since em shows intent and may therefore translate correctly between media and languages
 while the i and b tags are presentational and may not.</p>
 </body>

You can convert the above example into a working web page by pasting it into a text editor, inserting a minimal header at the top of the file, for instance a line that says <!DOCTYPE html>, and then saving the text file with a .htm or .html extension.
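The same steps can also be scripted. The sketch below, in Python, prepends the doctype line to a body fragment and saves the result with a .html extension; the function name add_minimal_header, the shortened body text and the output location are assumptions made for the sake of the example.

```python
import tempfile
from pathlib import Path


def add_minimal_header(body_fragment):
    """Prepend the one-line HTML5 doctype to a <body>...</body> fragment."""
    return "<!DOCTYPE html>\n" + body_fragment.strip() + "\n"


# Shortened stand-in for the example page shown earlier.
body = "<body>\n<h1>Main headline goes here</h1>\n</body>"
page = add_minimal_header(body)

# Hypothetical output location: a scratch directory, file named example.html.
out_file = Path(tempfile.mkdtemp()) / "example.html"
out_file.write_text(page, encoding="utf-8")
print(out_file)
```

Opening the printed path in a browser should show the headline rendered as a web page.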

HTML as a file format

While HTML code can be embedded into other files and formats, it is usually delivered as an HTML file, which is a plain text file that has been saved with a .htm or .html extension. Line endings are not critical, since browsers can sort out the difference between UNIX and DOS (OS/2) style line terminations.

Codepages are a different matter. While most browsers can work out for themselves whether they are seeing a Unicode based file [1] or a plain text file, they have no mechanism for working out which codepage a plain text file uses. HTML was developed on a NeXT workstation that had the ISO-8859-1 character set as default, and since that was an international standard it was made the default codepage. It was, after all, supported on virtually all commercial Unices and on most home computers appearing at the time, such as the Atari Falcon, and was in addition available as an optional codepage for DOS, OS/2, MS Windows and most mini and mainframe computer systems.
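Why the codepage cannot simply be guessed becomes clearer when the same bytes are read under different assumptions. The following Python sketch encodes a short sample word as ISO-8859-1 and then decodes the identical bytes three ways; the sample word is arbitrary, and cp850 is used here only as one example of a common DOS and OS/2 codepage.

```python
# "Smörgåsbord", written with escapes so the script itself is plain ASCII.
text = "Sm\u00f6rg\u00e5sbord"

# The bytes as they would be stored in an ISO-8859-1 encoded HTML file.
raw = text.encode("iso-8859-1")

# Decoding with the codepage the file was written in restores the text.
as_iso = raw.decode("iso-8859-1")

# Decoding the same bytes as cp850 yields other characters in place of
# the two accented letters, with no error to signal the mistake.
as_cp850 = raw.decode("cp850")

# A decoder limited to 7 bit ASCII cannot represent the accented letters
# at all; here each offending byte becomes the replacement character.
as_ascii = raw.decode("ascii", errors="replace")

print(as_iso)
print(as_cp850)
print(as_ascii)
```

All three results have the same length, since every byte maps to exactly one character in each case. Nothing in the bytes themselves says which reading is the intended one.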

When open source operating systems modelled on UNIX started appearing, however, they had only very rudimentary support for internationalisation, or sometimes none at all, so they fell back not to the 8 bit ISO character set but to 7 bit ASCII. This meant that some early browsers originally designed for these systems did not show even ISO coded homepages correctly unless ISO was specified. It therefore became important to specify the character set (codepage) used in creating the file, to make sure it is interpreted correctly regardless of browser or operating system.

This is usually done by placing a line in the header that goes something like this:

<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
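To illustrate how a program might read such a declaration, here is a minimal Python sketch based on the standard library's html.parser module. The names CharsetFinder and find_charset are invented for this example, and the check for the shorter <meta charset="..."> form introduced with HTML5 is an addition not covered in the text above. Real browsers use considerably more elaborate rules.

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Remembers the first character set declared in a <meta> tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attrs = dict(attrs)
        if attrs.get("charset"):
            # Short HTML5 form: <meta charset="utf-8">
            self.charset = attrs["charset"].strip()
        elif (attrs.get("http-equiv") or "").lower() == "content-type":
            # Long form: content="text/html; charset=ISO-8859-1"
            for part in (attrs.get("content") or "").split(";"):
                name, _, value = part.strip().partition("=")
                if name.lower() == "charset" and value:
                    self.charset = value.strip()


def find_charset(markup):
    """Return the declared character set of an HTML string, or None."""
    finder = CharsetFinder()
    finder.feed(markup)
    finder.close()
    return finder.charset


line = '<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />'
print(find_charset(line))
```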

While HTML 5 is supposed to use UTF-8 (8 bit Unicode) as standard, in practice nothing has changed: if the browser senses a Unicode file it displays it as UTF-8, but if it thinks it is dealing with a plain text file it falls back to ISO-8859-1.
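The fallback behaviour described above can be approximated in a few lines of Python. This is only a rough sketch of the idea, not how any particular browser is implemented: the function name decode_page is made up, and the rule "try UTF-8 first, otherwise assume ISO-8859-1" is a simplification of the detection real browsers perform.

```python
def decode_page(raw):
    """Decode page bytes as UTF-8 if valid, else fall back to ISO-8859-1.

    Returns a (text, encoding_used) tuple.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Every byte value is defined in ISO-8859-1 as far as Python is
        # concerned, so this fallback never raises.
        return raw.decode("iso-8859-1"), "iso-8859-1"


# The same word stored in two different encodings.
utf8_bytes = "caf\u00e9".encode("utf-8")
iso_bytes = "caf\u00e9".encode("iso-8859-1")

print(decode_page(utf8_bytes))
print(decode_page(iso_bytes))
```

Both calls recover the same text, but a file written in any other 8 bit codepage would be silently misread by the fallback, which is why declaring the character set remains good practice.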

Text & programmer's editors with HTML support

  • Boxer - HTML syntax highlighting support built in - Commercial - DOS and OS/2 versions discontinued, Win32 version still sold.
  • Elvis - HTML syntax highlighting support built in - Open source freeware - Current.
  • Enhanced Editor - Has HTML syntax highlighting built in with some auto-formatting features, more advanced formatting options available as a separate download.
  • FTE - Has support for syntax highlighting, code folding and syntax-aware autoindent. - Open source - Current.
  • jEdit - Java based - HTML syntax highlighting built in, XHTML available as an optional download - Current.
  • Lugaru Epsilon - HTML syntax highlighting. - Commercial.
  • NEdit - XFree86 - Autoindent, autocomplete and syntax highlighting - Open source - Discontinued.
[1] Note though that some browsers have problems parsing 32 bit Unicode files.