EtText Documentation (version 2.4)

Contents

Contents


The Blurb

EtText is a simple plain-text format which allows conversion to and from HTML. Instead of editing HTML directly, it provides an easy-to-edit, easy-to-read and intuitive way to write HTML, based on the plain-text markup conventions we've been using for years.

Like most simple text markup formats (POD, setext, etc.), EtText markup handles the usual things: insertion of P tags, header recognition and markup. However it also adds a powerful link markup system.

EtText markup is simple and effective; it's very similar to setext, WikiWikiWeb TextFormattingRules or Zope's StructuredText.

EtText is distributed under the same licensing terms as Perl itself.


Contributors to Text::EtText

Here's a list of people who've contributed to Text::EtText:

  • Justin Mason <jm /at/ jmason.org>: original author and maintainer
  • Caolan McNamara <caolan /at/ csn.ul.ie>: EtText contributions; lists, pre-formatted text, lots of suggestions
  • rudif /at/ bluemail.ch: lots of help with supporting Windows
  • Chris Barrett, chris /at/ getfrank.com: suggested CSS class support for the Latte-style balanced tags

Thanks all! Patches and suggestions are welcomed -- send them in! (By the way, patch contributors get listed at the top, 'cos patches save me writing the code ;)


Using EtText

Like most simple text markup formats (POD, setext, etc.), EtText markup handles the usual things: insertion of <P> tags, header recognition and markup. However it adds a powerful link markup system and several other useful features.

EtText markup is simple and effective; it's based loosely on setext, with bits of WikiWikiWeb TextFormattingRules thrown in.

EtText was previously part of WebMake, but is now distributed as a standalone component.

Basic Text Markup

If you leave blank lines between paragraphs, <p> and </p> tags will be inserted in the correct places. EtText does quite a good job of this.

Words wrap and fill automatically, so there's no need to worry about wrapping before 80 characters. (It's good form to do so anyway, in case other people ever need to edit your text, or you need to mail it around.)

A paragraph consisting of a line of 10 or more consecutive - or _ signs will be converted to a HR tag.

Sections of text between pairs of certain characters will be turned into markup, as follows:

EtText Tag Used Result
**text** <strong> text
__text__ <em> text
##text## <code> text

& signs that have whitespace on either side will be converted to &amp; signs automatically.

Text indented from the left margin will be converted into a <P> paragraph wrapped in a <blockquote> -- unless it starts with a *, -, +, o character followed by whitespace, or is numbered -- 1., A) or a., etc. -- in which case it's interpreted as a list item; see Lists below.

Another exception to the above rule is that text indented by only 1 space, or on lines starting in the first column with two colon characters, will be surrounded by <pre> tags.

If you find writing HTML tag-pairs manually annoying, EtText includes an idea from Latte; balanced-tag generation. Wrap the text to be tagged with the name of the tag followed immediately by a { character on the left, and a } character on the right. In other words,

strong{text}

will be rendered as

<strong>text</strong>

or, in other words, text . This can be nested, so strong{text with i{italic} bits} will be rendered as text with italic bits.

In addition, the balanced-tag support has a bonus feature, in that it supports CSS classes; follow the name of the tag with a full stop and the class, and it will use that class, like so:

i.green{foo}

will be rendered as

<i class="green>foo</i>

Mail headers, and mail messages, are now marked up automatically.


Lists

A paragraph indented from the left margin (by either spaces or tabs, or both), and starting with a *, -, + or o character followed by whitespace, will be converted into a list item (<li> tag).

The same goes for indented paragraphs that start with the string 1., a., A., 1), A), or a), followed by whitespace. However the default list tag in this case will be an <ol>...</ol> list. Any positive integer followed immediately by a full stop and a space will do the trick. The <ol> tag will use the correct type attribute to match the indexing you're using.

(Compatibility note: previous versions of EtText required that the <ul> or <ol> tags be written manually. This is no longer the case, they will be added automatically.)

When you're writing <ul> lists, note that some text editors (such as vim) will reformat list items automatically, assuming that you want the text to line up with the start of the text, instead of the bullet-point character, on the previous line, like so:

    - this is a list item. We should make sure that
      blah blah etc. etc.

This is pretty handy, so using a - as the list bullet point character is recommended.

Indented paragraphs that start with term: tab rest of paragraph will be converted into definition lists (this is another StolenFromWikiIdea). As a result, this:

    Foo:	Blah blah blah etc.

Will look like this:

Foo
Blah blah blah etc.

Sidebars and Side Images

If you wish to display an image, or small sidebar, beside a paragraph of text, use the <etleft> and <etright> tags. These are rendered as a one-row, two-column <table> wrapping the paragraph and the sidebar, as follows:

  <etleft><img src=bubba.png></etleft>This is the main
  paragraph body.  Foo bar baz blah blah blah etc.

Is displayed as:

This is the main paragraph body. Foo bar baz blah blah blah etc.

  <etright><img src=bubba.png></etright>This is the
  main paragraph body.  Foo bar baz blah blah blah etc.

Is displayed as:

This is the main paragraph body. Foo bar baz blah blah blah etc.


Links in EtText

As well as the standard <a href=url>...</a> link specification used in HTML, EtText will automatically add href tags for URLs and email addresses that occur in the text. In addition, EtText supports its own link format, as follows.

To use labelled links, you surround the link text with double-square-brackets, and (optionally) use a single open-square-bracket on the right-hand side with the link label.

Here's an example:

  WebMake's home page is [[at this website [WebMake]].

Alternatively, if the link text matches the link label, the link label is optional.

  Here's an example: [[WebMake]].

The href used in the link is then defined at another point in the document, as an indented line like this:

	[WebMake]: http://webmake.taint.org/

Even simpler: if the link label has been set as an Auto link, you can omit the square brackets altogether:

  Here's an example: WebMake.

Text and markup can be enclosed in the double-square-brackets, everything quoted will become part of the link text. Unlike the older form of EtText links (see below), even single words need to be enclosed in brackets to become links. This protects against accidentally interpreting normal text as a broken link.

EtText Linking, Backwards Compatibility

The following text describes the old style for EtText links. Since it was way too easy to produce links this way where they were not intended to be, it has now been obsoleted by the method described above. However, support for it will remain on by default for a few revisions.

To turn off this backwards compatibility, set the EtTextOldLinkStyle option to 0, either using WebMake's <option> tag, or from your code.

The basic concept is of a word or "quoted set of words" followed by an optional link label in [square brackets], like this: "this is a link" [label].

The href used in the link is then defined at another point in the document, as above.

Text and markup can be enclosed in the quotes, everything quoted will become part of the link text. Single words or HTML tags do not need to be quoted, so

  <img src="/license_plate.jpg" width="10" height="10"> [homepage]

will work correctly.

Glossary Links

EtText also supports a concept called glossary links; if you define a link, the name of that link will automatically become a href if enclosed in double-square-brackets or quotes. For example:

  [Justin Mason]: http://jmason.org/

will mean that any occurrence of [[Justin Mason]], or "Justin Mason", in any EtText content chunk or file in the site, becomes a link to that address.

These links are stored in the WebMake cache file, if WebMake is being used. If you use EtText in a standalone mode, without WebMake, you can provide an implementation of the Text::EtText::LinkGlossary interface to store defined links so that they can be used in other EtText files.

Quoted bits of text that do not map to an entry in the glossary are not converted to links (unless they're followed by a square-bracketed link-label reference).

Auto Links - Even More Convenient

In addition, if the link definition is preceded with Auto:, the quotes are not required, and any occurrence of the link label -- with or without quotes or double-square-brackets -- will become a link.

  Auto: [WebMake]: http://webmake.taint.org/
  Auto: [any occurrence of the words]: http://webmake.taint.org/

URLs and Email Addresses

URLs, such as http://webmake.taint.org/ , and email addresses, such as jm@nospam-jmason.org, are automatically converted into links to that same address.

Blocking EtText Link Interpretation

To block interpretation as a link, replace square brackets with the HTML entities &etsqi; and &etsqo;, which map to [ and ] respectively; replace quote characters, ", with two apostrophes, ''. If that doesn't do the trick, wrap the entire section of text with the <!--etsafe-->...<!--/etsafe--> tags.


Similar Systems

EtText-like plain-text-to-markup conversion systems have a long history. The first time I came across the concept was with Setext, which was included with Tony Sanders' Plexus web server, back in September 1993. Yes, 1993. Setext has been around for a while!

WikiWikiWeb is quite a recent, well-established system which uses a similar markup style.

txt2html provided a lot of impetus to rewrite the core of EtText since 2.0, since its list-parsing engine was much better. However EtText is now up to scratch again ;)

The real inspiration for EtText was Userland's Frontier; Dave Winer's evangelisation of its easily-editable markup system convinced me that it was worth polishing up the rudimentary EtText system I had then. In addition, the name "EtText" is derived from "Edit This Text", in a tip of the hat to Dave's "Edit This Page" concept.

Some well-known sites that use their own converters to convert plain-text to markup include http://www.blogger.com/, http://slashdot.org/ (for comments) and http://www.advogato.org/.

Jorn Barger maintains an impressive summary of etext formats at his Robot Wisdom site. Skip down to section 3, Internet etext standards, for the directly-relevant stuff.

Zope and ZWiki use a format called StructuredText, which again comes from WikiLand. There's some interesting work going on there with the STXDocument object, which is "a web-managable object that contains information marked up in the structured text format".


When HTML and EtText Collide

HTML tags can be used freely throughout an EtText document. However, in some situations, you may wish to preserve whitespace, avoid paragraph tags being added, etc.; to use your own HTML without meddling from EtText, wrap it in an <!--etsafe-->...<!--/etsafe--> tag pair; this will protect it.

Note that text blocks wrapped in <pre>, <listing> and <xmp> tags are automatically protected in this way; the <!--etsafe--> tag pair is not required.

EtText adds two entities, &etsqi; and &etsqo;. These represent [ and ] respectively, and are used to protect a square-bracketed piece of text from being interpreted as a link URL (see Link Markup below).

If this is insufficient, and you're using WebMake, the <safe> tag will escape any type of code to protect it from interpretation by WebMake, EtText or HTML.


Text::EtText::DefaultGlossary

x


NAME

Text::EtText::DefaultGlossary - default, non-persistent link glossary


SYNOPSIS


DESCRIPTION

The Text::EtText::DefaultGlossary is an implementation of Text::EtText::LinkGlossary which is used if no other implementation is registered.

It will not save glossary link details persistently.


METHODS


Text::EtText::EtText2HTML

x


NAME

Text::EtText::EtText2HTML - convert from the simple EtText editable-text format into HTML


SYNOPSIS

  my $t = new Text::EtText::EtText2HTML;
  print $t->text2html ($text);

or

  my $t = new Text::EtText::EtText2HTML;
  print $t->text2html ();               # from STDIN

DESCRIPTION

ettext2html will convert a text file in the EtText editable-text format into HTML.

For more information on the EtText format, check the WebMake documentation on the web at http://webmake.taint.org/ .


METHODS

$f = new Text::EtText::EtText2HTML
Constructs a new Text::EtText::EtText2HTML object.
$f->set_option ($optname, $optval);
Set an EtText option. (Options can also be set on the WebMake object itself, or from inside the WebMake file.) Currently supported options are:
EtTextOneCharMarkup (default: 0)
Allow one-character sets of asterisks etc. to mark up as strong, emphasis etc., instead of the default two-character markup.
EtTextOldLinkStyle (default: 1)
Use the older EtText link-markup style, with quote characters and single square brackets. This is easy to type, but if you're using text from other people, it can easily destroy formatting; so the new link-markup style, with double square brackets, can be used instead.
EtTextBaseHref (default: '')
The base HREF to use for relative links. If set, all relative links in tags with HREF attributes will be rewritten as absolute links, making the output HTML independent of the URL tree structure.
EtTextHrefsRelativeToTop (default: 0)
Indicates that all EtText links are relative to the top of the WebMake document tree. This (obviously) is only relevant if you are using EtText in conjunction with WebMake, and WebMake sets it by default. If set, all relative links in tags with HREF attributes will be rewritten as relative to the ''top'' of the WebMake site, making the output HTML independent of the URL tree structure.
$html = $f->set_glossary ($glosobj)
Provide a glossary for shared link definitions, allowing link definitions to be shared and reused across multiple EtText files. $glosobj must implement the interface defined by Text::EtText::LinkGlossary.

See below for more information on this interface.

$html = $f->text2html( [$text] )
Convert text, either from the argument or from STDIN, into HTML.

MORE DOCUMENTATION

See also http://webmake.taint.org/ for more information.


SEE ALSO

webmake ettext2html ethtml2text HTML::WebMake Text::EtText::EtText2HTML Text::EtText::HTML2EtText Text::EtText::LinkGlossary Text::EtText::DefaultGlossary


AUTHOR

Justin Mason <jm /at/ jmason.org>


COPYRIGHT

WebMake is distributed under the terms of the GNU Public License.


AVAILABILITY

The latest version of this library is likely to be available from CPAN as well as:

  http://webmake.taint.org/

Text::EtText::HTML2EtText

x


NAME

Text::EtText::HTML2EtText - convert from HTML to the EtText editable-text format


SYNOPSIS

  my $t = new Text::EtText::HTML2EtText;
  print $t->html2text ($html);

or

  my $t = new Text::EtText::HTML2EtText;
  print $t->html2text ();                       # from STDIN

DESCRIPTION

ethtml2text will convert a HTML file into the EtText editable-text format, for use with webmake or ettext2html.

For more information on the EtText format, check the WebMake documentation on the web at http://webmake.taint.org/ .


METHODS

$f = new Text::EtText::HTML2EtText
Constructs a new Text::EtText::HTML2EtText object.
$text = $f->html2text( [$html] )
Convert HTML, either from the argument or from STDIN, into EtText.

MORE DOCUMENTATION

See also http://webmake.taint.org/ for more information.


SEE ALSO

webmake ettext2html ethtml2text HTML::WebMake Text::EtText::EtText2HTML Text::EtText::HTML2EtText


AUTHOR

Justin Mason <jm /at/ jmason.org>


COPYRIGHT

WebMake is distributed under the terms of the GNU Public License.


AVAILABILITY

The latest version of this library is likely to be available from CPAN as well as:

  http://webmake.taint.org/

Text::EtText::LinkGlossary

x


NAME

Text::EtText::LinkGlossary - interface for EtText link glossaries to implement.


SYNOPSIS

  use Text::EtText::LinkGlossary;
  @ISA = qw(Text::EtText::LinkGlossary);
  sub open { ... }
  sub close { ... }
  ...

DESCRIPTION

The Text::EtText::LinkGlossary is an interface which allows EtText to support ''link glossaries'', persistent collections of link text and its corresponding HREF.

The interface which needs to be implemented is as follows:


METHODS

$g->open()
Open the link glossary $g for reading and writing.
$g->close()
Close the link glossary; no more links can be written or read.
$url = $g->get_link ($name)
Get a named link from the glossary.
$g->put_link ($name, $url)
Put a named link to the glossary.
$url = $g->get_auto_link ($name)
Get a named automatic link from the glossary.
$g->put_auto_link ($name, $url)
Put a named automatic link to the glossary.
@keys = $g->get_auto_link_keys ()
Get a list of the names of automatic links stored in the glossary.
$g->add_auto_link_keys (@keys)
Add to the list of names of automatic links stored in the glossary.

ethtml2text(1)

x


NAME

ethtml2text - convert from HTML to the EtText editable-text format


SYNOPSIS

  ethtml2text file.html > file.txt

DESCRIPTION

ethtml2text will convert a HTML file into the EtText editable-text format, for use with webmake or ettext2html.

For more information on the EtText format, check the WebMake documentation on the web at http://ettext.taint.org/ .


INSTALLATION

The ethtml2text command is part of the HTML::WebMake Perl module set. Install this as a normal Perl module, using perl -MCPAN -e shell, or by installing WebMake.


ENVIRONMENT

No environment variables, aside from those used by perl, are required to be set.


SEE ALSO

webmake ettext2html ethtml2text HTML::WebMake Text::EtText


AUTHOR

Justin Mason <jm /at/ jmason.org>


PREREQUISITES

HTML::Entities


ettext2html(1)

x


NAME

ettext2html - convert from the simple EtText editable-text format into HTML


SYNOPSIS

  ettext2html file.txt > file.html

DESCRIPTION

ettext2html will convert a text file in the EtText editable-text format into HTML.

For more information on the EtText format, check the WebMake documentation on the web at http://ettext.taint.org/ .


INSTALLATION

The ettext2html command is part of the HTML::WebMake Perl module set. Install this as a normal Perl module, using perl -MCPAN -e shell, or by installing WebMake.


ENVIRONMENT

No environment variables, aside from those used by perl, are required to be set.


SEE ALSO

webmake ettext2html ethtml2text HTML::WebMake Text::EtText


AUTHOR

Justin Mason <jm /at/ jmason.org>


PREREQUISITES

HTML::Entities


EtText Documentation (version 2.4)