Documentation for the AscToRTF conversion utility

The latest version of these files is available online at http://www.jafsoft.com/doco/docindex.html



Using the pre-processor

The pre-processor allows authors to add special lines to the source document to customise the conversion. It is most useful when you intend to regenerate RTF regularly from a master text document.

The pre-processor is described more fully in a separate document, the Tag manual.

The pre-processor works by giving the software hints and instructions on how to process the text. During the analysis process the software reads the source files line-by-line. The pre-processor recognises special keywords in two ways: as directives, which occupy a line of their own, and as in-line tags, which can appear anywhere within a line.

In both cases the tag or directive cannot be split over multiple lines; that is, directives must be on a line by themselves, and in-line tags must be wholly contained on a single line.

Contents of this section

Pre-processor Directives
Pre-processor In-line tags
Document commands
Pre-processor command: DESCRIPTION
Pre-processor command: KEYWORDS
Pre-processor command: TITLE
Section delimiters
Pre-processor command: ALLOW and DISALLOW
Pre-processor command: ASCII
Pre-processor command: CONTENTS
Pre-processor command: CODE
Pre-processor command: COMMA_DELIMITED_TABLE
Pre-processor command: DELIMITED_TABLE
Pre-processor command: DIAGRAM
Pre-processor command: IGNORE
Pre-processor command: PRE
Pre-processor command: TABLE
Pre-processor command: SECTION
Tagged Table commands
Tagged table command: BEGIN_USER_TABLE
Tagged table command: COLUMN_DETAILS
Tagged table command: NEW_ROW
Tagged table command: NEW_CELL
Tagged table: Cell contents
Table modifier commands
Pre-processor command: TABLE_HEADER_ROWS
Pre-processor command: TABLE_IGNORE_HEADER
Pre-processor command: TABLE_LAYOUT
Pre-processor command: TABLE_MAY_BE_SPARSE
Pre-processor command: TABLE_MIN_COLUMN_SEPARATION
Other commands
Pre-processor command: BR
Pre-processor command: CHANGE_POLICY
Pre-processor command: FILENAME
Pre-processor command: FO
Pre-processor command: FRACTION
Pre-processor command: GOTO
Pre-processor command: POPUP
Pre-processor command: SUPER and SUB
Pre-processor command: IGNORE_THIS
Pre-processor command: INCLUDE
Pre-processor command: PAGE
Pre-processor command: VERSION

Pre-processor Directives

"Directives" consist of a single line in the source file beginning with the string "$_$_" followed by a recognised keyword and any additional "attributes" that the directive supports.
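
As an illustration of the directive format described above, here is a minimal Python sketch that splits a directive line into its keyword and attributes. It is not AscToRTF's own code: the function name parse_directive is hypothetical, and the sketch assumes that the "$_$_" prefix starts in the first column and that keywords are matched case-insensitively.

```python
DIRECTIVE_PREFIX = "$_$_"


def parse_directive(line):
    """Return (keyword, attributes) for a directive line, or None otherwise.

    Assumptions (not confirmed by the manual): the prefix must start in
    column one, and keywords are normalised to upper case.
    """
    if not line.startswith(DIRECTIVE_PREFIX):
        return None
    parts = line[len(DIRECTIVE_PREFIX):].split(None, 1)
    if not parts:
        return None  # a bare "$_$_" carries no keyword
    keyword = parts[0].upper()
    attributes = parts[1].strip() if len(parts) > 1 else ""
    return keyword, attributes
```

For example, parse_directive("$_$_TITLE My document") yields the keyword "TITLE" and the attribute text "My document", while an ordinary text line yields None.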


Pre-processor In-line tags

In-line tags, as the name implies, can occur anywhere in the source lines. They are enclosed between the special strings "[[" and "]]". Between these strings the tag consists of a keyword and then any attributes that tag supports.
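
The in-line tag format can be sketched in the same way. The Python below is only an illustration, not the program's implementation: the names TAG_PATTERN and find_inline_tags are hypothetical, and upper-casing the keyword is an assumption suggested by the lower-case "fo" example later in this document.

```python
import re

# "[[", a keyword, optional attributes, then "]]", all on one line.
TAG_PATTERN = re.compile(r"\[\[\s*(\w+)(?:\s+(.*?))?\s*\]\]")


def find_inline_tags(line):
    """Return a list of (keyword, attributes) pairs found in one source line.

    Assumption: keywords are case-insensitive, so they are upper-cased here.
    """
    return [
        (match.group(1).upper(), (match.group(2) or "").strip())
        for match in TAG_PATTERN.finditer(line)
    ]
```

Because the pattern never matches across a line break, a tag split over two lines is simply not found, which mirrors the single-line rule stated earlier.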

Useful in-line tags include

Document commands

These commands are used to control tags placed in the document information section of the created RTF page(s).

Not yet implemented in this release.

DESCRIPTION   Add a description in the document properties
KEYWORDS      Add keywords to the document properties
TITLE         Add a Title to the RTF pages

Pre-processor command: DESCRIPTION

You can specify a description of your page to be added to the document information portion of your page by adding a line of the form

        $_$_DESCRIPTION <rest of line is used as a description>

This takes precedence over any description added via a policy file.

Not yet implemented

Pre-processor command: KEYWORDS

You can specify keywords that describe the contents of your page to be added to the document information section of your file by adding a line of the form

        $_$_KEYWORDS <rest of line is used as a list of keywords>

This takes precedence over any keywords added via a policy file.

Not yet implemented


Pre-processor command: TITLE

You can specify the TITLE to be added to your RTF page in the document information section, by adding a line of the form

        $_$_TITLE <rest of line is used as a title>

This title takes precedence over any title added via a policy file.

Not yet implemented

Section delimiters

These commands mark the start and end of various sections in your document

ALLOW and DISALLOW      Enable and disable certain types of detection.
ASCII                   Mark a section of text for the utility A2HDETAG.
CONTENTS                Mark a section as the contents list.
CODE                    Mark a section as a C-like code sample.
DIAGRAM                 Mark a section as a diagram or ASCII art.
IGNORE                  Ignore a section of the document.
PRE                     Mark a section as pre-formatted text.
TABLE                   Mark a section as a table.
COMMA_DELIMITED_TABLE   Mark a section as a comma-delimited data table.
DELIMITED_TABLE         Mark a section as a tab-delimited data table.
SECTION                 Mark the start of a user-specified section.

Pre-processor command: ALLOW and DISALLOW

New in version 2.0
AscToRTF will automatically try to detect various typographical features. You can turn this behaviour on and off in different sections by using the ALLOW and DISALLOW commands. This can be used, for example, to prevent a numbered list being wrongly detected as a numbered heading and vice versa.

The syntax for both commands is the same, namely

        ALLOW/DISALLOW  <comma-separated list of keywords>

Where the recognised keywords are as follows

Headings   Enables/disables the search for lines that could be treated
           as headings.
Lists      Enables/disables the search for lines that could be regarded
           as list items (either unordered bullets, or alphabetic or
           numeric list points).
All        Set (enable) all of the above.
Reset      Reset (disable) all of the above.

In each case the tag will simply add or subtract from the current list of allowable features. To aid control, two special keywords "all" and "reset" are available for inclusion in the list. "Reset" will disable all options, thus

        $_$_ALLOW reset, Headings

will have the effect of disabling everything (the "reset") and then adding "Headings" to the allowed list. In this respect "ALLOW all" and "DISALLOW reset" are identical commands.

Below is an example in which the DISALLOW tag is used to prevent numbered lines being regarded as lists or headings. The ALLOW tag at the end switches back to the default behaviour, so if there are any lists or numbered headings elsewhere in the document they will still be detected.

        $_$_DISALLOW headings
        ...
        1. Whatever this line is, it isn't a heading
        ...
        $_$_DISALLOW headings,lists
        ...
        2. Whatever this line is, it isn't a heading or a list item
        ...
        $_$_ALLOW reset


Pre-processor command: ASCII

New in version 2.0

As of version 2.0, the separate utility A2HDETAG is available to create a plain ASCII file by removing all pre-processor tags from your source file. In this way a source file designed for conversion to RTF by AscToRTF can be "cleaned up" and posted as plain text elsewhere.

To support this, the BEGIN_ASCII and END_ASCII tags can be used to delimit a section of text that will only appear in the version created by A2HDETAG. This allows you to add comments to the "plain text" version, that won't appear in the RTF conversion. Use these commands as follows

        $_$_BEGIN_ASCII
        You are reading this text in a "cleaned up" version of the
        source file.  This text won't get copied across when this
        file is converted to either RTF or HTML
        $_$_END_ASCII


Pre-processor command: CONTENTS

You can mark up a section of your document as a contents list. To do this use matching BEGIN_CONTENTS and END_CONTENTS commands as follows:

        $_$_BEGIN_CONTENTS
        ...
        $_$_END_CONTENTS

AscToRTF will then attempt to treat the enclosed text as a contents list.

See comments on contents policies

Pre-processor command: CODE

You can mark up a section of your document as being a piece of sample C-like code. To do this use matching BEGIN_CODE and END_CODE commands as follows:

        $_$_BEGIN_CODE
        ...
        $_$_END_CODE

AscToRTF will then mark up the enclosed text in fixed width fonts.

See comments on pre-formatted text

Pre-processor command: COMMA_DELIMITED_TABLE

New in version 2.0
These commands delimit a table of comma-delimited data

Syntax:

        $_$_BEGIN_COMMA_DELIMITED_TABLE
        ...
        (lines of comma-delimited data)
        ...
        $_$_END_COMMA_DELIMITED_TABLE

The BEGIN_COMMA_DELIMITED_TABLE ... END_COMMA_DELIMITED_TABLE directives can be used to delimit a series of comma-delimited data values that should be interpreted as a table (e.g. data originally exported from a spreadsheet such as Excel)

See comments in Pre-processor command: TABLE


Pre-processor command: DELIMITED_TABLE

These directives delimit a table of delimited data

Syntax:

        $_$_BEGIN_DELIMITED_TABLE [<delimiter>]
        ...
        (lines of delimited data)
        ...
        $_$_END_DELIMITED_TABLE

where

<delimiter>   The delimiter character to use. If omitted
              the default is tab-delimited. The delimiter
              can be any character except a comma. For
              comma-delimited tables use the
              COMMA_DELIMITED_TABLE command instead.

The BEGIN_DELIMITED_TABLE ... END_DELIMITED_TABLE directives can be used to delimit a series of delimited data values that should be interpreted as a table (e.g. data originally exported from a spreadsheet such as Excel)
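
To make the role of the optional delimiter concrete, the following Python sketch collects the rows of the first delimited table in a list of source lines. It is a simplified illustration under stated assumptions: the function name read_delimited_table is hypothetical, cell values are trimmed of surrounding spaces, and quoting rules (if the real program has any) are ignored.

```python
BEGIN = "$_$_BEGIN_DELIMITED_TABLE"
END = "$_$_END_DELIMITED_TABLE"


def read_delimited_table(lines):
    """Return the rows of the first BEGIN/END_DELIMITED_TABLE block.

    Assumption: when no delimiter follows the BEGIN directive, a tab is
    used, as the manual states for the default case.
    """
    rows = []
    delimiter = None  # None means "not inside a table yet"
    for line in lines:
        if line.startswith(BEGIN):
            delimiter = line[len(BEGIN):].strip() or "\t"
        elif line.startswith(END):
            break
        elif delimiter is not None:
            rows.append([cell.strip() for cell in line.split(delimiter)])
    return rows
```

Called on a block that opens with "$_$_BEGIN_DELIMITED_TABLE |", the sketch splits each data line on the vertical bar; with no argument it splits on tabs.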

See comments in Pre-processor command: TABLE


Pre-processor command: DIAGRAM

You can mark up a section of your document as being a diagram or a piece of ASCII art. To do this use matching BEGIN_DIAGRAM and END_DIAGRAM commands as follows:

        $_$_BEGIN_DIAGRAM
        ...
        $_$_END_DIAGRAM

AscToRTF will then mark up the enclosed text in fixed width fonts.

See comments on pre-formatted text


Pre-processor command: IGNORE

You can mark up a section in your document that you want ignored in the output. This can be used to store change history information or whatever you want.

Syntax:

      $_$_BEGIN_IGNORE
      ...
      (text to be ignored)
      ...
      $_$_END_IGNORE

This markup can be used to delimit a section to be wholly ignored. Any markup and tags in the ignored section will have no effect.
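
The effect of an ignored section can be pictured as a simple line filter. The Python below is a sketch only; strip_ignored is a hypothetical name, and how the real program treats an unmatched BEGIN_IGNORE is not documented here, so the sketch just ignores everything to the end of the input in that case.

```python
def strip_ignored(lines):
    """Drop every line between BEGIN_IGNORE and END_IGNORE, inclusive.

    Directives inside the ignored region are dropped like any other
    line, so they have no effect (END_IGNORE itself excepted).
    """
    kept = []
    ignoring = False
    for line in lines:
        if line.startswith("$_$_BEGIN_IGNORE"):
            ignoring = True
        elif line.startswith("$_$_END_IGNORE"):
            ignoring = False
        elif not ignoring:
            kept.append(line)
    return kept
```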


Pre-processor command: PRE

You can mark up a section of your document as being pre-formatted text. To do this use matching BEGIN_PRE and END_PRE commands as follows:

        $_$_BEGIN_PRE
        ...
        $_$_END_PRE

AscToRTF will then mark up the enclosed text in fixed width fonts.

See comments on pre-formatted text


Pre-processor command: TABLE

You can mark up a section of your document as being a text table. To do this use matching BEGIN_TABLE and END_TABLE commands as follows:

        $_$_BEGIN_TABLE
        ...
        $_$_END_TABLE

AscToRTF will then analyse the enclosed text to determine the table layout and will generate a proper RTF table.

General comments on marking up tables

AscToRTF has some ability to auto-detect tables (see comments on pre-formatted text), but this can be error prone. Marking up tables removes a lot of the ambiguity and so can give better results.

For tables of delimited data (as opposed to plain text tables) you should use the DELIMITED_TABLE and COMMA_DELIMITED_TABLE commands.

Note in each case the presence of these directives overrides any value set in the Attempt table generation policy, as that only refers to the auto-detection of tables. Placing markup in the source forces the text to be treated as tables.

Within each marked-up table, other pre-processor commands may be used to customise the table. For a full list see Table modifier commands.


Pre-processor command: SECTION

You can mark up sections of your document as being named sections. By default text belongs to a section called "all".

To do so insert a SECTION command at the start of each section as follows:

        $_$_SECTION <name>
        ...

All following text will be marked as belonging to the named section until another SECTION command is encountered. AscToRTF will only copy across those sections named in the allowable sections policy, and any text in "all" sections. In this way you can generate variants of your document for different audiences (e.g. Internet and Intranet).

If you want the rest of your document to be included in all conversions, insert an "all" SECTION command as follows:

        $_$_SECTION all
        ...
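
The way named sections select text for different audiences can be sketched as follows. This Python is illustrative rather than the program's own logic: select_sections is a hypothetical name, the list passed as allowed stands in for the allowable sections policy, and treating section names case-insensitively is an assumption.

```python
def select_sections(lines, allowed):
    """Keep lines from the allowed sections plus any text in "all".

    Text before the first SECTION command belongs to "all", matching
    the default described in the manual.
    """
    wanted = {name.lower() for name in allowed} | {"all"}
    current = "all"
    kept = []
    for line in lines:
        if line.startswith("$_$_SECTION"):
            # Assumption: a SECTION command with no name means "all".
            current = line[len("$_$_SECTION"):].strip().lower() or "all"
            continue
        if current in wanted:
            kept.append(line)
    return kept
```

With a document containing "internet" and "intranet" sections, passing ["internet"] keeps the shared text and the Internet-only text while dropping the Intranet-only text.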


Tagged Table commands

New in version 2.0

In addition to converting plain text files, and sets of delimited data into tables, the software also supports a method of explicitly tagging the input as to how it should be placed in a table.

This may seem extreme, as the point of the converters is to generate the desired markup and save work, but there are a couple of situations in which this approach can be useful.

Here's a sample of a user-tagged table (with blank lines added for clarity) :-

        $_$_BEGIN_USER_TABLE C,1 in
        $_$_COLUMN_DETAILS 1,,,L, 2 in
        $_$_COLUMN_DETAILS 2,,,C, 1 ins
        $_$_TABLE_BORDER 1

        $_$_NEW_ROW HEAD
        $_$_NEW_CELL
        Substance (units)
        $_$_NEW_CELL
        Year
        Sampled

        $_$_NEW_ROW DATA
        $_$_NEW_CELL
        Alpha emitters (pCi/L)
        $_$_NEW_CELL
        1999

        $_$_NEW_ROW DATA
        $_$_NEW_CELL
        Asbestos (MFL)
        $_$_NEW_CELL
        1993
        $_$_END_TABLE

Here's how this table appears when converted into the current format


Substance (units)           Year
                           Sampled
Alpha emitters (pCi/L)      1999
Asbestos (MFL)              1993



Tagged table command: BEGIN_USER_TABLE

To identify a section of a source file as a user table, it must be enclosed in the BEGIN_USER_TABLE ... END_TABLE commands as follows

        $_$_BEGIN_USER_TABLE <arguments>
        ...
        <other commands to layout the table>
        ...
        $_$_END_TABLE

The command line can take arguments as follows

        $_$_BEGIN_USER_TABLE <alignment>,<margin>

where

<alignment>   The alignment of the table. This can be
              L(eft), R(ight) or C(enter).
<margin>      The margin to be applied to the table. This
              consists of a number and a unit. Recognised
              units include points ("pts" or "pt"), inches
              ("ins" or "in") and centimetres ("cm"). In HTML
              generation these margins will be approximate only.
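
As a worked illustration of the number-plus-unit format, the Python sketch below converts a margin or width such as "1 in" into points. It is not taken from the program: length_in_points is a hypothetical name, and the conversion assumes the usual typographic convention of 72 points to the inch (and 2.54 cm to the inch).

```python
import re

# Assumption: 1 inch = 72 points, 1 inch = 2.54 cm.
POINTS_PER_UNIT = {
    "pt": 1.0,
    "pts": 1.0,
    "in": 72.0,
    "ins": 72.0,
    "cm": 72.0 / 2.54,
}


def length_in_points(text):
    """Convert a string like "1 in", "12pt" or "2.5 cm" to points."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError("expected a number followed by a unit: %r" % text)
    value, unit = match.groups()
    unit = unit.lower()
    if unit not in POINTS_PER_UNIT:
        raise ValueError("unrecognised unit: %r" % unit)
    return float(value) * POINTS_PER_UNIT[unit]
```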

Tagged table command: COLUMN_DETAILS

After the BEGIN_USER_TABLE line will appear a number of COLUMN_DETAILS lines. These are optional, but if present they give details of the characteristics of each column in the table as follows :-

        $_$_COLUMN_DETAILS <col_no>,<align>,<width>

where

<col_no>   This is the column number, starting at 1.
<align>    This is the alignment of data in this column.
           If omitted this will be auto-detected, but you can
           choose to set it to L(eft), R(ight) or C(enter).
<width>    The width of the column. If omitted the width will
           be calculated. As with the <margin> on the table
           the width can be specified in points, inches or
           centimetres. If a width is set too narrow, it may
           be ignored.

Tagged table command: NEW_ROW

Each new row is identified by the presence of a NEW_ROW command on a line by itself. The format is

        $_$_NEW_ROW <row_type>

where

<row_type>   This is the row type. Options include

                 HEAD   This is a header row
                 DATA   This is a data row
                 LINE   This is a line in the table

             The type may be omitted, in which case the default
             is "DATA".

Except when the NEW_ROW is a "LINE", this command should be followed by a series of NEW_CELL commands and their matching cell data, normally one per column.


Tagged table command: NEW_CELL

Except for "LINE" rows, each new cell in a row is identified by the presence of a NEW_CELL command on a line by itself. The contents of the cell follow on subsequent lines until either another NEW_CELL, NEW_ROW or END_TABLE command is encountered.

The format of the NEW_CELL command is

        $_$_NEW_CELL

At present the NEW_CELL command doesn't take any arguments.


Tagged table: Cell contents

Anything following a NEW_CELL command up until the next NEW_CELL, NEW_ROW or END_TABLE commands will be added into the current cell. The line structure will be preserved, so that if you have three lines of text following a NEW_CELL command, this will appear as a cell in the table with three lines of data in it.

The alignment of the cell will normally be that of the column the cell is in. This will either have been calculated automatically for the column as a whole, or will be the value passed in via the matching COLUMN_DETAILS line, earlier in the table definition.
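
The row and cell rules above can be summarised in a short Python sketch that turns a tagged table into a list of rows, each holding cells made of one or more lines. This is an illustration only: parse_user_table is a hypothetical name, other directives such as COLUMN_DETAILS are skipped, and blank lines are dropped on the assumption that (as in the sample earlier) they were added for clarity.

```python
def parse_user_table(lines):
    """Return [(row_type, [cell, ...]), ...] where each cell is a list of lines."""
    rows = []
    for line in lines:
        if line.startswith("$_$_NEW_ROW"):
            # The row type defaults to "DATA" when omitted.
            row_type = line[len("$_$_NEW_ROW"):].strip().upper() or "DATA"
            rows.append((row_type, []))
        elif line.startswith("$_$_NEW_CELL"):
            if rows:
                rows[-1][1].append([])
        elif line.startswith("$_$_END_TABLE"):
            break
        elif line.startswith("$_$_"):
            continue  # other directives are outside the scope of this sketch
        elif rows and rows[-1][1] and line.strip():
            # Cell text: the line structure is preserved, one entry per line.
            rows[-1][1][-1].append(line.strip())
    return rows
```

Applied to the sample table shown earlier, the header row comes out with two cells, the second of which holds the two lines "Year" and "Sampled".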


Table modifier commands

These commands can be used to tailor the appearance of a table. They're usually placed between the BEGIN_TABLE ... END_TABLE for the table they will affect, but they can also be placed at the top of the document to define defaults for all tables.

Pre-processor command: TABLE_HEADER_ROWS

This specifies how many rows in the table should be regarded as the table header.

Pre-processor command: TABLE_IGNORE_HEADER

New in version 2.0
This directive specifies that a table header should be ignored during the column analysis

Syntax:

        $_$_TABLE_IGNORE_HEADER

This tag has no attributes.

If present, this indicates that the first few lines of the table (assumed to be the header) should be ignored when calculating the table's column structure.

This should be enabled if the table has a particularly complex header that may confuse the program.

This command has the same effect as the policy Ignore table header when analysing columns, but can be applied on a table-by-table basis when enclosed between TABLE command markers.

Pre-processor command: TABLE_LAYOUT

New in version 2.0
This directive allows you to specify the column structure of a table

Syntax:

        $_$_TABLE_LAYOUT <number_of_cols>,"<col_1_spec>","<col_2_spec>",...

where

<number_of_cols>   Integer number of columns.
<col_n_spec>       Specification of the nth column. The
                   specification must be contained in quotes.
                   Currently the specification consists of
                   just the end position of the column.
                   More may be added in later versions.

An example would be

        $_$_TABLE_LAYOUT 3,"6","21","32"

which describes a 3-column table with column boundaries at the 6th, 21st and 32nd character positions.

Normally this directive should be placed between the BEGIN_TABLE...END_TABLE directives for the table it applies to, thereby overriding the "intelligent" analysis the program would otherwise attempt for a plain text table.
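
To show how a list of column end positions carves up a line of text, here is a small Python sketch. The names parse_table_layout and split_by_layout are hypothetical, and the sketch assumes that an end position counts characters from 1 and includes the character at that position, which is one reasonable reading of the description above.

```python
def parse_table_layout(directive):
    """Return the list of column end positions from a TABLE_LAYOUT line."""
    arguments = directive[len("$_$_TABLE_LAYOUT"):].strip()
    fields = [field.strip().strip('"') for field in arguments.split(",")]
    count = int(fields[0])
    ends = [int(field) for field in fields[1:]]
    if len(ends) != count:
        raise ValueError("expected %d column specs, found %d" % (count, len(ends)))
    return ends


def split_by_layout(line, ends):
    """Cut one table line into cells at the given end positions."""
    cells = []
    start = 0
    for end in ends:
        # Assumption: position N is 1-based and inclusive, so slice to N.
        cells.append(line[start:end].strip())
        start = end
    return cells
```

For the example directive above, parse_table_layout returns the positions 6, 21 and 32, and split_by_layout then takes characters 1 to 6, 7 to 21 and 22 to 32 of each line as the three cells.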


Pre-processor command: TABLE_MAY_BE_SPARSE

This specifies that the table may be sparse, i.e. largely empty in places. There is no data value required on this command.

See also expect sparse tables policy

Pre-processor command: TABLE_MIN_COLUMN_SEPARATION

This specifies the minimum number of spaces to be regarded as a column separator. The default value is 1, but occasionally this gives too many columns, especially in short tables. Increasing this value will reduce the number of columns calculated.
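
The effect of raising the minimum separation can be seen on a single line. The Python sketch below is a simplification: split_columns is a hypothetical name, and it looks at one line in isolation, whereas the real analysis presumably considers the table as a whole.

```python
import re


def split_columns(line, min_separation=1):
    """Split a line wherever at least min_separation spaces occur in a run."""
    separator = re.compile(r" {%d,}" % min_separation)
    return [cell for cell in separator.split(line.strip()) if cell]
```

With the default of 1, the line "New York   12  34" splits into four columns because the space inside "New York" counts as a separator; with a value of 2 it splits into three.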


Other commands

BR              Insert a line break
CHANGE_POLICY   Dynamically vary policies through the input file
FILENAME        Output the original filename
FO              Change the prevailing font
FRACTION        Output a fraction
GOTO            Add a hyperlink to a section title
IGNORE_THIS     Ignore some text in the source
INCLUDE         Include an external file into the source
PAGE            Create a page boundary at this location
POPUP           Add a pop-up hyperlink to a section title
SUPER and SUB   Add superscripts and subscripts
VERSION         Output the program version used in this conversion


Pre-processor command: BR

This command tells the software to output a line break at this point. Usually the default is to let all lines flow together to form a paragraph. This command can be used, for example, in address lines to make sure that each line is placed on a new line.


Pre-processor command: CHANGE_POLICY

This option allows you to embed policy lines in the source document. This can be used to avoid the need for separate policy files, or to change the policy at different locations within the document (although the effects can sometimes be unpredictable).

The syntax is

        $_$_CHANGE_POLICY <policy line as in policy file>

For example placing

        $_$_CHANGE_POLICY Convert mailto links : yes

would make all subsequent email addresses be converted into working hyperlinks. By adding several lines of this type you can toggle this behaviour on and off, controlling which email links become hyperlinks and which do not.
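
A rough Python sketch of how such an embedded policy line might be read is given below. It is not the program's parser: parse_change_policy is a hypothetical name, and splitting the policy text at the first colon into a name and a value is an assumption based only on the example above.

```python
def parse_change_policy(line):
    """Return (policy_name, value) from a CHANGE_POLICY line, else None.

    Assumption: a policy line has the form "<name> : <value>".
    """
    prefix = "$_$_CHANGE_POLICY"
    if not line.startswith(prefix):
        return None
    name, separator, value = line[len(prefix):].partition(":")
    if not separator:
        return None  # no colon, so not a "name : value" policy line
    return name.strip(), value.strip()
```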

Pre-processor command: FILENAME

This in-line tag substitutes the name of the file being converted

Syntax:

[[FILENAME]]

The tag will be replaced by the name of the file being converted. This facilitates the construction of sentences like

"This file was converted from [[FILENAME]] at [[TIMESTAMP]] "

which becomes

"This file was converted from asctortf.txt at 22-Feb-2004"


Pre-processor command: FO

New in version 2.0

NOTE: The FO tag is currently only supported in RTF generation.

This in-line tag allows the font used in a document to be changed, either locally within some text, or from this point onwards.

The FO tag should be used in conjunction with a Style Definition File (SDF), which can be used to define the "font id"s that are used.

Syntax:

FO [<font_id>],[<font_size>],[<font_weight>]

where

<font_id>       Identifies the font to be used. This must match the
                name of a font in the SDF file. If no name is given then
                the prevailing font will be used.
<font_size>     The font size in pts. Only needed if the default
                value in the font table is to be overridden.

                The size can be supplied as an absolute value or - if a plus or
                minus sign is present - as a relative size. So for example
                "4" means 4pt, whereas "+4" means 4pt larger than the
                surrounding text.

                A value of "-" will be taken as a reset to the prevailing
                default font size.
<font_weight>   The font weight. Only needed if the default value
                in the font table is to be overridden. Possible values
                are

                    it   (Italic)
                    bo   (Bold)
                    bi   (Bold Italic)
                    no   (Normal)
                    -    (Reset)

                The "reset" will cause the weight to be reset to the
                prevailing default, i.e. no longer override the
                prevailing font.

Example:

        "This text is [[fo ,+6,bo]]big and bold,[[fo ,-,-]] but this text is
        normal again"

becomes

"This text is big and bold, but this text is normal again"

(this may only work in the RTF version of this document)
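
The three forms of the font size field (absolute, relative and reset) can be illustrated with a short Python sketch. The function resolve_font_size and its parameters are hypothetical, and treating an empty field as "leave the size unchanged" is a simplification rather than documented behaviour.

```python
def resolve_font_size(spec, current_size, default_size):
    """Work out the new size in points from an FO size field.

    "4"  -> absolute size of 4pt
    "+4" -> 4pt larger than the surrounding text
    "-2" -> 2pt smaller than the surrounding text
    "-"  -> reset to the prevailing default size
    """
    spec = spec.strip()
    if spec == "":
        return current_size  # assumption: an empty field changes nothing
    if spec == "-":
        return default_size
    if spec[0] in "+-":
        return current_size + float(spec)
    return float(spec)
```

In the example above, the field "+6" applied to 10pt text would give 16pt, and the later "-" would return the text to the default size.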

See also Scope for font tags


Pre-processor command: FRACTION

This in-line tag implements a fraction

Syntax:

[[FRACTION <expression>]]

where

<expression>   This is the fraction expression which should contain
               a slash ("/") separating the numerator and denominator.
               Both values must be present.

So for example

The fractions [[FRACTION 5/16]] and 1[[FRACTION 1/2]].

becomes

The fractions ⁵⁄₁₆ and 1½.


Pre-processor command: GOTO

New in version 2.0
This in-line tag adds a hyperlink to the named section heading.

Syntax:

[[GOTO <Heading_name>]]

where

<Heading_name>   Name of a heading elsewhere in the file.
                 The text used must match exactly for this tag
                 to work (case insensitive though).

It creates a hyperlink to the named section heading. The heading must match the text exactly, and be in the same file. It must also have been recognised by AscToRTF as a heading.

If making RTF WinHelp source files, see also the POPUP command.


Pre-processor command: POPUP

New in version 2.0
This in-line tag adds a hyperlink to the named section heading.

Syntax:

[[POPUP <Heading_name>]]

This behaves in an identical manner to the GOTO unless you are creating an RTF file for use as a Windows Help file, in which case the hyperlink becomes a pop-up link, instead of a full "go to" link.


Pre-processor command: SUPER and SUB

These in-line tags implement superscripts and subscripts

Syntax:

[[SUPER <expression>]]
[[SUB <expression>]]

So for example

        This[[SUPER superscript]] and that[[SUB subscript]]

becomes

Thissuperscript and thatsubscript

(with "superscript" raised and "subscript" lowered in the formatted output)


Pre-processor command: IGNORE_THIS

This is an in-line tag whose contents are ignored. It could be used for comments.

Syntax:

[[IGNORE_THIS <anything_you_like>]]

This tag is ignored. It is replaced by a single space in the output stream. It could be used to add a brief comment to your source that would not appear in the output.
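
The replacement by a single space can be sketched with a regular expression. This Python is illustrative only: drop_inline_comments is a hypothetical name, and matching the keyword case-insensitively is an assumption.

```python
import re

# Matches one whole [[IGNORE_THIS ...]] tag on a single line.
IGNORE_THIS = re.compile(r"\[\[\s*IGNORE_THIS\b.*?\]\]", re.IGNORECASE)


def drop_inline_comments(line):
    """Replace each IGNORE_THIS tag with a single space."""
    return IGNORE_THIS.sub(" ", line)
```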

See also the IGNORE command

Pre-processor command: INCLUDE

You can include one source file in another by using the include command as follows:-

        $_$_INCLUDE filename

Make sure the file is accessible from wherever AscToRTF is run, or in the same directory as the original source file. AscToRTF will read the file on each pass, treating its contents as part of the main file for both analysis and conversion purposes.

Note that the include file should be plain text, which will be converted as normal for the document. It may include other pre-processor commands, including further INCLUDE commands up to a limit of 9 levels. Be careful not to set up include loops (i.e. a includes b, which includes c, which includes a, etc).
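
Nested includes, the depth limit and the danger of loops can be pictured with the Python sketch below. It is not how AscToRTF is implemented: expand_includes is a hypothetical name, the files argument is an in-memory dictionary standing in for files on disk, and raising an error on a loop or on excessive nesting is an assumption, since the manual does not say how the program reacts.

```python
MAX_INCLUDE_DEPTH = 9  # the limit quoted in the manual


def expand_includes(name, files, depth=0, active=()):
    """Return the lines of files[name] with INCLUDE directives expanded.

    files  -- dict mapping a file name to its list of lines (stand-in for disk)
    active -- names of the files currently being expanded, used to spot loops
    """
    if name in active:
        raise ValueError("include loop: " + " -> ".join(active + (name,)))
    if depth > MAX_INCLUDE_DEPTH:
        raise ValueError("includes nested deeper than %d levels" % MAX_INCLUDE_DEPTH)
    expanded = []
    for line in files[name]:
        if line.startswith("$_$_INCLUDE"):
            target = line[len("$_$_INCLUDE"):].strip()
            expanded.extend(
                expand_includes(target, files, depth + 1, active + (name,))
            )
        else:
            expanded.append(line)
    return expanded
```

Tracking the chain of files currently being expanded means a loop is reported as soon as a file reappears in its own chain, rather than only when the depth limit is reached.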

Include files like this can be a useful way of embedding standard disclaimers etc, and complement the use of headers and footers.

Pre-processor command: PAGE

New in version 2.0
The syntax is

$_$_PAGE

This signals a page boundary. In RTF generation a page break will be generated at this point. In HTML the concept of page boundaries isn't really supported, so a horizontal rule <HR> is put out instead.

Pre-processor command: VERSION

This in-line tag adds a description of the program name/version used to convert the files (e.g. "AscToRTF 2.1")

Syntax:

[[VERSION]]

Outputs the name and version of the conversion program into the output file. For example "AscToHTM 4.2 beta".



Converted from a single text file by AscToHTM
© 1997-2004 John A Fotheringham