EtText is a simple plain-text format which allows conversion to and from HTML.
Instead of editing HTML directly, it provides an easy-to-edit, easy-to-read and
intuitive way to write HTML, based on the plain-text markup conventions we've
been using for years.
Like most simple text markup formats (POD, setext, etc.), EtText markup handles
the usual things: insertion of P tags, header recognition and markup. However
it also adds a powerful link markup system.
EtText markup is simple and effective; it's very similar to setext, WikiWikiWeb
TextFormattingRules or Zope's StructuredText.
EtText is distributed under the same licensing terms as Perl itself.
Contributors to Text::EtText
Here's a list of people who've contributed to Text::EtText:
Justin Mason <jm /at/ jmason.org>: original author and maintainer
rudif /at/ bluemail.ch: lots of help with supporting Windows
Chris Barrett, chris /at/ getfrank.com: suggested CSS class support for the
Latte-style balanced tags
Thanks all! Patches and suggestions are welcomed -- send them in!
(By the way, patch contributors get listed at the top, 'cos patches save
me writing the code ;)
Using EtText
Like most simple text markup formats (POD, setext, etc.), EtText markup
handles the usual things: insertion of <P> tags, header
recognition and markup. However it adds a powerful link markup system
and several other useful features.
EtText markup is simple and effective; it's based loosely on setext, with bits
of WikiWikiWebTextFormattingRules thrown in.
EtText was previously part of WebMake, but is now distributed
as a standalone component.
If you leave blank lines between paragraphs, <p> and
</p> tags will be inserted in the correct places.
EtText does quite a good job of this.
Words wrap and fill automatically, so there's no need to worry about wrapping
before 80 characters. (It's good form to do so anyway, in case other people
ever need to edit your text, or you need to mail it around.)
A paragraph consisting of a line of 10 or more consecutive - or _ signs will be
converted to an <hr> tag.
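The paragraph and rule handling described above can be illustrated with a minimal Python sketch. This is not EtText's own code (EtText is a Perl module); the function name blocks_to_html is hypothetical, and the sketch assumes a rule line uses only one of the two characters rather than a mixture.

```python
import re

def blocks_to_html(text):
    """Split text on blank lines; emit <hr> for rule lines, <p> otherwise."""
    out = []
    for block in re.split(r"\n\s*\n", text.strip()):
        block = block.strip()
        # Assumption: a rule is a run of 10+ '-' or a run of 10+ '_', not mixed.
        if re.fullmatch(r"-{10,}|_{10,}", block):
            out.append("<hr>")
        else:
            # Join wrapped lines; the browser refills the paragraph anyway.
            out.append("<p>" + " ".join(block.split()) + "</p>")
    return "\n".join(out)

print(blocks_to_html("First para\nwrapped line.\n\n----------\n\nSecond."))
```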
Sections of text between pairs of certain characters will be turned into
markup, as follows:
EtText      Tag Used    Result
**text**    <strong>    <strong>text</strong>
__text__    <em>        <em>text</em>
##text##    <code>      <code>text</code>
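As a rough illustration of the paired-character markup, the following Python sketch applies one regular expression per marker. It is an assumption-laden stand-in, not EtText's real parser: the names INLINE_RULES and inline_markup are hypothetical, and it does not try to handle markers that span paragraphs or appear inside HTML tags.

```python
import re

# One (pattern, replacement) pair per marker: **strong**, __em__, ##code##.
INLINE_RULES = [
    (re.compile(r"\*\*(.+?)\*\*"), r"<strong>\1</strong>"),
    (re.compile(r"__(.+?)__"), r"<em>\1</em>"),
    (re.compile(r"##(.+?)##"), r"<code>\1</code>"),
]

def inline_markup(text):
    """Replace each pair of markers with the matching HTML tag pair."""
    for pattern, replacement in INLINE_RULES:
        text = pattern.sub(replacement, text)
    return text

print(inline_markup("a **b** __c__ ##d##"))
```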
& signs that have whitespace on either side will be converted
to &amp; entities automatically.
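A one-line Python sketch of the ampersand rule follows. It assumes that "whitespace on either side" means whitespace on both sides, so existing entities such as &copy; are left alone; the function name escape_bare_ampersands is hypothetical.

```python
import re

def escape_bare_ampersands(text):
    """Turn a free-standing & into &amp;, leaving existing entities untouched."""
    return re.sub(r"(?<=\s)&(?=\s)", "&amp;", text)

print(escape_bare_ampersands("Tom & Jerry &copy; 1940"))
```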
Text indented from the left margin will be converted into a <P>
paragraph wrapped in a <blockquote> -- unless it starts with a
*, -, + or o character
followed by whitespace, or is numbered -- 1., A) or a.,
etc. -- in which case it's interpreted as a list item; see Lists below.
Another exception to the above rule is that text indented by only 1 space, or
on lines starting in the first column with two colon characters, will be
surrounded by <pre> tags.
If you find writing HTML tag-pairs manually annoying, EtText includes an idea
from Latte; balanced-tag generation. Wrap the text to be tagged with
the name of the tag followed immediately by a { character on the left, and a }
character on the right. In other words,
strong{text}
will be rendered as
<strong>text</strong>
that is, text shown with strong emphasis. This can be nested, so strong{text
with i{italic} bits} will be rendered as
<strong>text with <i>italic</i> bits</strong>.
In addition, the balanced-tag support has a bonus feature, in that it supports
CSS classes; follow the name of the tag with a full stop and the class, and
it will use that class, like so:
i.green{foo}
will be rendered as
<i class="green">foo</i>
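The balanced-tag expansion, including nesting and the optional CSS class, can be sketched in Python as below. This is only an illustration of the idea under stated assumptions, not EtText's implementation: the names BALANCED and expand_balanced are hypothetical, and the sketch would also rewrite any other word{...} sequence it finds, which a real parser would need to guard against.

```python
import re

# Innermost group: tag name, optional .class, and a body containing no braces.
BALANCED = re.compile(r"\b([A-Za-z][A-Za-z0-9]*)(?:\.([\w-]+))?\{([^{}]*)\}")

def expand_balanced(text):
    """Expand tag{body} and tag.class{body}, resolving nesting inside-out."""
    def repl(m):
        tag, css, body = m.group(1), m.group(2), m.group(3)
        attr = f' class="{css}"' if css else ""
        return f"<{tag}{attr}>{body}</{tag}>"

    # Each pass removes the innermost brace pairs; stop when none remain.
    while True:
        text, count = BALANCED.subn(repl, text)
        if count == 0:
            return text

print(expand_balanced("strong{text with i{italic} bits}"))
print(expand_balanced("i.green{foo}"))
```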
Mail headers, and mail messages, are now marked up automatically.
Lists
A paragraph indented from the left margin (by either spaces or tabs, or both),
and starting with a *, -, + or
o character followed by whitespace, will be converted into a list
item (<li> tag).
The same goes for indented paragraphs that start with the string
1., a., A., 1), A), or a), followed by
whitespace. However the default list tag in this case will be an
<ol>...</ol> list. Any positive integer followed
immediately by a full stop and a space will do the trick. The <ol>
tag will use the correct type attribute to match the indexing you're using.
(Compatibility note: previous versions of EtText required that the
<ul> or <ol> tags be written manually. This is no
longer the case; they are now added automatically.)
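To make the list-marker rules concrete, here is a small Python sketch that classifies an indented paragraph as a bulleted or numbered item and picks an <ol> type attribute. It is an approximation written for illustration: classify_item is a hypothetical name, and only the "1", "A" and "a" numbering styles mentioned above are handled.

```python
import re

BULLET = re.compile(r"^[ \t]+[*\-+o]\s+(.*)", re.S)
NUMBERED = re.compile(r"^[ \t]+(\d+|[A-Za-z])[.)]\s+(.*)", re.S)

def classify_item(paragraph):
    """Return (list_tag, type_attr, text), or None if this is not a list item."""
    m = BULLET.match(paragraph)
    if m:
        return ("ul", None, m.group(1))
    m = NUMBERED.match(paragraph)
    if m:
        label = m.group(1)
        if label.isdigit():
            kind = "1"
        elif label.isupper():
            kind = "A"
        else:
            kind = "a"
        return ("ol", kind, m.group(2))
    return None

print(classify_item("  - a bullet"))
print(classify_item("  A) first"))
```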
When you're writing <ul> lists, note that some text editors (such as
vim) will reformat list items automatically, assuming that you want the
text to line up with the start of the text, instead of the bullet-point
character, on the previous line, like so:
  - this is a list item. We should make sure that
    blah blah etc. etc.
This is pretty handy, so using a - as the list bullet point character is
recommended.
Indented paragraphs that start with a term, a colon, a tab and then the rest
of the paragraph will be converted into definition lists (this is another
StolenFromWikiIdea). As a result, this:
Foo: Blah blah blah etc.
Will look like this:
Foo
Blah blah blah etc.
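The definition-list rule can be sketched in Python as follows. The sketch assumes the separator is exactly a colon followed by a tab, and it wraps each definition in its own <dl>; how EtText itself groups adjacent definitions is not covered here. The names DEFINITION and definition_to_html are hypothetical.

```python
import re

# Indentation, then a term without colons or tabs, then ":" and a tab.
DEFINITION = re.compile(r"^[ \t]+([^:\t\n]+):\t(.*)", re.S)

def definition_to_html(paragraph):
    """Convert an indented 'term:<tab>text' paragraph into a <dl> fragment."""
    m = DEFINITION.match(paragraph)
    if not m:
        return None
    term = m.group(1)
    body = " ".join(m.group(2).split())
    return f"<dl><dt>{term}</dt><dd>{body}</dd></dl>"

print(definition_to_html("  Foo:\tBlah blah blah etc."))
```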
Sidebars and Side Images
If you wish to display an image, or small sidebar, beside a paragraph of text,
use the <etleft> and <etright>
tags. These are rendered as a one-row, two-column
<table> wrapping the paragraph and the sidebar, as
follows:
<etleft><img src=bubba.png></etleft>This is the main
paragraph body. Foo bar baz blah blah blah etc.
Is displayed as:
This is the main paragraph body.
Foo bar baz blah blah blah etc.
<etright><img src=bubba.png></etright>This is the
main paragraph body. Foo bar baz blah blah blah etc.
Is displayed as:
This is the main paragraph body.
Foo bar baz blah blah blah etc.
Links in EtText
As well as the standard <a href=url>...</a> link
specification used in HTML, EtText will automatically add <a href> tags for URLs
and email addresses that occur in the text. In addition, EtText supports its
own link format, as follows.
To use labelled links, you surround the link text with double-square-brackets,
and (optionally) use a single open-square-bracket on the right-hand side with
the link label.
Here's an example:
WebMake's home page is [[at this website [WebMake]].
Alternatively, if the link text matches the link label, the link label is
optional.
Here's an example: [[WebMake]].
The href used in the link is then defined at another point in the document, as
an indented line like this:
[WebMake]: http://webmake.taint.org/
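The double-square-bracket link syntax and its indented definitions can be approximated with the Python sketch below. It is a simplified illustration rather than EtText's actual algorithm: the names LINK_DEF, LINK_USE and resolve_links are hypothetical, definitions are only looked up within the same text, and references with no matching definition are left as they are.

```python
import re

# An indented "[label]: url" definition line.
LINK_DEF = re.compile(r"^[ \t]+\[([^\]\n]+)\]:[ \t]*(\S+)[ \t]*$", re.M)
# "[[text [label]]" or "[[text]]" (label defaults to the text).
LINK_USE = re.compile(r"\[\[([^\[\]]+?)(?:\s*\[([^\[\]]+))?\]\]")

def resolve_links(text):
    """Replace [[...]] references using the definitions found in the text."""
    links = {label: url for label, url in LINK_DEF.findall(text)}
    body = LINK_DEF.sub("", text)  # definitions do not appear in the output

    def repl(m):
        shown = m.group(1)
        label = m.group(2) or shown
        url = links.get(label)
        if url is None:
            return m.group(0)  # unknown label: leave the text untouched
        return f'<a href="{url}">{shown}</a>'

    return LINK_USE.sub(repl, body)

sample = (
    "WebMake's home page is [[at this website [WebMake]].\n"
    "See [[WebMake]] too.\n"
    "\n"
    "  [WebMake]: http://webmake.taint.org/\n"
)
print(resolve_links(sample))
```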
Even simpler: if the link label has been set as an Auto link, you can omit the square
brackets altogether:
Here's an example: WebMake.
Text and markup can be enclosed in the double-square-brackets; everything
enclosed will become part of the link text. Unlike the older form of EtText
links (see below), even single words need to be enclosed in brackets
to become links. This protects against accidentally interpreting normal
text as a broken link.
The following text describes the old style for EtText links. Since it
was way too easy to produce links this way where they were not intended
to be, it has now been obsoleted by the method described above. However,
support for it will remain on by default for a few revisions.
To turn off this backwards compatibility, set the EtTextOldLinkStyle option
to 0, either using WebMake's <option> tag, or from your code.
The basic concept is of a word or "quoted set of words" followed by an
optional link label in [square brackets], like this: "this is a
link" [label].
The href used in the link is then defined at another point in the document, as
above.
Text and markup can be enclosed in the quotes; everything quoted will become
part of the link text. Single words or HTML tags do not need to be quoted.
EtText also supports a concept called glossary links; if you define a link,
the name of that link will automatically become a href if enclosed in
double-square-brackets or quotes. For example:
[Justin Mason]: http://jmason.org/
will mean that any occurrence of [[Justin Mason]], or
"Justin Mason", in any EtText content chunk or file in the
site, becomes a link to that address.
These links are stored in the WebMake cache file, if WebMake is being used.
If you use EtText in a standalone mode, without WebMake, you can provide an
implementation of the Text::EtText::LinkGlossary interface to store
defined links so that they can be used in other EtText files.
Quoted bits of text that do not map to an entry in the glossary are not
converted to links (unless they're followed by a square-bracketed link-label
reference).
In addition, if the link definition is preceded with Auto:, the quotes are
not required, and any occurrence of the link label -- with or without quotes or
double-square-brackets -- will become a link.
Auto: [WebMake]: http://webmake.taint.org/
Auto: [any occurrence of the words]: http://webmake.taint.org/
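The effect of Auto: definitions can be sketched in Python like this. The mapping of labels to URLs stands in for whatever glossary is in use, and apply_auto_links is a hypothetical name. The sketch matches whole words only and makes a single pass over the text; unlike a real implementation, it does not avoid text that is already inside an HTML tag.

```python
import re

def apply_auto_links(text, auto_links):
    """Link every bare occurrence of each Auto label in one pass."""
    if not auto_links:
        return text
    # Longest labels first, so a longer phrase wins over a shorter prefix.
    labels = sorted(auto_links, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, labels)) + r")\b")

    def repl(m):
        label = m.group(0)
        return f'<a href="{auto_links[label]}">{label}</a>'

    return pattern.sub(repl, text)

glossary = {"WebMake": "http://webmake.taint.org/"}
print(apply_auto_links("Try WebMake today.", glossary))
```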
To block interpretation as a link, replace square brackets with the HTML
entities &etsqi; and &etsqo;, which map to [ and ]
respectively; replace quote characters, ", with two apostrophes,
''. If that doesn't do the trick, wrap the entire section of text
with the <!--etsafe-->...<!--/etsafe--> tags.
Similar Systems
EtText-like plain-text-to-markup conversion systems have a long history. The
first time I came across the concept was with Setext, which was
included with Tony Sanders' Plexus web server, back in September 1993.
Yes, 1993. Setext has been around for a while!
WikiWikiWeb is quite a recent, well-established system which uses
a similar markup style.
txt2html provided a lot of impetus to rewrite the core of EtText since 2.0,
since its list-parsing engine was much better. However EtText is now up to
scratch again ;)
The real inspiration for EtText was Userland's Frontier; Dave
Winer's evangelisation of its easily-editable markup system convinced me that
it was worth polishing up the rudimentary EtText system I had then. In
addition, the name "EtText" is derived from "Edit This Text", in
a tip of the hat to Dave's "Edit This Page" concept.
Jorn Barger maintains an impressive summary of etext formats at his Robot
Wisdom site. Skip down to section 3, Internet etext
standards, for the directly-relevant stuff.
Zope and ZWiki use a format called StructuredText, which again comes from
WikiLand. There's some interesting work going on there with the STXDocument
object, which is "a web-managable object that contains information marked up
in the structured text format".
When HTML and EtText Collide
HTML tags can be used freely throughout an EtText document. However, in some
situations, you may wish to preserve whitespace, avoid paragraph tags being
added, etc.; to use your own HTML without meddling from EtText, wrap it in an
<!--etsafe-->...<!--/etsafe-->
tag pair; this will protect it.
Note that text blocks wrapped in <pre>,
<listing> and <xmp> tags are
automatically protected in this way; the <!--etsafe-->
tag pair is not required.
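One way to picture the etsafe protection is to split the text on the marker pairs and transform only what lies outside them, as in the Python sketch below. This is an illustrative assumption about the approach, not a description of EtText's internals; the names SAFE and transform_outside_safe are hypothetical, and the automatic protection of <pre>, <listing> and <xmp> blocks is not modelled.

```python
import re

# Capture each protected region, markers included, so split() keeps it.
SAFE = re.compile(r"(<!--etsafe-->.*?<!--/etsafe-->)", re.S)

def transform_outside_safe(text, transform):
    """Apply transform to unprotected text; pass etsafe regions through."""
    parts = SAFE.split(text)
    # With one capture group, odd-indexed parts are the protected regions.
    return "".join(p if i % 2 else transform(p) for i, p in enumerate(parts))

print(transform_outside_safe("a <!--etsafe-->b<!--/etsafe--> c", str.upper))
```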
EtText adds two entities, &etsqi; and &etsqo;. These represent
[ and ] respectively, and are used to protect a square-bracketed
piece of text from being interpreted as a link label (see Links in EtText above).
If this is insufficient, and you're using WebMake, the <safe> tag
will escape any type of code to protect it from interpretation by WebMake,
EtText or HTML.
Use the older EtText link-markup style, with quote characters and single square
brackets. This is easy to type, but if you're using text from other people, it
can easily destroy formatting; so the new link-markup style, with double square
brackets, can be used instead.
The base HREF to use for relative links. If set, all relative links
in tags with HREF attributes will be rewritten as absolute links,
making the output HTML independent of the URL tree structure.
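The rewriting of relative links against a base HREF can be illustrated with Python's standard urljoin. This sketch only handles double-quoted href attributes, and the function name absolutize and the example.com base URL are assumptions made for the example.

```python
import re
from urllib.parse import urljoin

# Matches href="..." with a double-quoted value, in any letter case.
HREF = re.compile(r'(href\s*=\s*")([^"]*)(")', re.I)

def absolutize(html, base_href):
    """Rewrite relative href values as absolute URLs under base_href."""
    def repl(m):
        return m.group(1) + urljoin(base_href, m.group(2)) + m.group(3)
    return HREF.sub(repl, html)

print(absolutize('<a href="docs/index.html">docs</a>', "http://example.com/site/"))
```

Links that are already absolute pass through unchanged, because urljoin returns an absolute URL as it is.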
Indicates that all EtText links are relative to the top of the WebMake document
tree. This (obviously) is only relevant if you are using EtText in conjunction
with WebMake, and WebMake sets it by default. If set, all relative links in
tags with HREF attributes will be rewritten as relative to the ''top'' of the
WebMake site, making the output HTML independent of the URL tree structure.
Provide a glossary for shared link definitions, allowing link definitions to be
shared and reused across multiple EtText files. $glosobj must implement the
interface defined by Text::EtText::LinkGlossary.
The Text::EtText::LinkGlossary is an interface which allows EtText to support
''link glossaries'', persistent collections of link text and its corresponding
HREF.
The interface which needs to be implemented is as follows:
The ethtml2text command is part of the HTML::WebMake Perl module set.
Install this as a normal Perl module, using perl -MCPAN -e shell, or by
installing WebMake.
The ettext2html command is part of the HTML::WebMake Perl module set.
Install this as a normal Perl module, using perl -MCPAN -e shell, or by
installing WebMake.