Documentation for the AscToPDF conversion utility

The latest version of these files is available online at http://www.jafsoft.com/doco/docindex.html



Using the pre-processor

The pre-processor allows authors to add special lines to the source document to customise the conversion. This is most useful when you intend to generate PDF regularly from a master text document.

The pre-processor is described more fully in a separate document, the Tag manual.

The pre-processor works by giving the software hints and instructions on how to process the text. During the analysis process the software reads the source files line by line. The pre-processor recognises special keywords in two ways: as directives and as in-line tags, both described below.

In both cases the tag or directive cannot be split over multiple lines. That is, directives must be on a line by themselves, and in-line tags must be wholly contained on a single line.

Contents of this section

Pre-processor Directives
Pre-processor In-line tags
Document commands
  Pre-processor command: DESCRIPTION
  Pre-processor command: KEYWORDS
  Pre-processor command: TITLE
Section delimiters
  Pre-processor command: ALLOW and DISALLOW
  Pre-processor command: CONTENTS
  Pre-processor command: CODE
  Pre-processor command: COMMA_DELIMITED_TABLE
  Pre-processor command: DELIMITED_TABLE
  Pre-processor command: DIAGRAM
  Pre-processor command: IGNORE
  Pre-processor command: PRE
  Pre-processor command: TABLE
  Pre-processor command: SECTION
Other commands
  Pre-processor command: BR
  Pre-processor command: CHANGE_POLICY
  Pre-processor command: FILENAME
  Pre-processor command: IGNORE_THIS
  Pre-processor command: INCLUDE
  Pre-processor command: PAGE
  Pre-processor command: VERSION

Pre-processor Directives

"Directives" consist of a single line in the source file beginning with the string "$_$_" followed by a recognised keyword and any additional "attributes" that the directive supports.

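As an illustration only, the following Python sketch shows one way a source line could be tested against this rule. It is not AscToPDF's own code: the function name parse_directive is hypothetical, and the sketch assumes that leading white space before "$_$_" is tolerated and that the keyword follows the prefix immediately.

```python
DIRECTIVE_PREFIX = "$_$_"


def parse_directive(line):
    """Return (keyword, attributes) for a directive line, or None otherwise.

    Hypothetical helper for illustration; not part of AscToPDF.
    """
    # Assumption: white space around the line is not significant.
    stripped = line.strip()
    if not stripped.startswith(DIRECTIVE_PREFIX):
        return None
    body = stripped[len(DIRECTIVE_PREFIX):]
    # Assumption: the keyword must follow the prefix with no gap.
    if body[:1].isspace():
        return None
    parts = body.split(None, 1)
    if not parts:
        return None
    keyword = parts[0]
    attributes = parts[1].strip() if len(parts) > 1 else ""
    return keyword, attributes


if __name__ == "__main__":
    print(parse_directive("$_$_TITLE My document"))
    print(parse_directive("An ordinary line of text"))
```

The sketch only splits a line into a keyword and its attributes. Deciding whether the keyword is one that AscToPDF recognises is left out.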

Pre-processor In-line tags

In-line tags, as the name implies, can occur anywhere in the source lines. They are enclosed between the special strings "[[" and "]]". Between these strings the tag consists of a keyword and then any attributes that tag supports.

Note:
At this time not many of the JafSoft pre-processor in-line tags offered by our other conversion programs are supported by AscToPDF.
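The shape of an in-line tag can be sketched with a regular expression. This Python fragment is an illustration under assumptions, not the program's implementation: the name find_inline_tags is hypothetical, and the pattern assumes that keywords consist of letters, digits and underscores and that tags do not nest.

```python
import re

# "[[", a keyword, optional attributes, then "]]", all on one line.
# Assumption: keywords are made of letters, digits and underscores.
TAG_PATTERN = re.compile(r"\[\[(\w+)(?:\s+(.*?))?\]\]")


def find_inline_tags(line):
    """Return a list of (keyword, attributes) pairs found in one line.

    Hypothetical helper for illustration; not part of AscToPDF.
    """
    return [
        (match.group(1), (match.group(2) or "").strip())
        for match in TAG_PATTERN.finditer(line)
    ]


if __name__ == "__main__":
    print(find_inline_tags("Converted from [[FILENAME]] by [[VERSION]]"))
```

Because the pattern is applied to one line at a time, a tag split across two lines is never matched, which mirrors the single-line restriction described earlier.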

Document commands

These commands are used to control tags placed in the document information section of the created PDF page(s).

Not yet implemented in this release.

  DESCRIPTION   Add a description in the document properties
  KEYWORDS      Add keywords to the document properties
  TITLE         Add a Title to the PDF pages

Pre-processor command: DESCRIPTION

You can specify a description of your page to be added to the document information portion of your file by adding a line of the form

        $_$_DESCRIPTION <rest of line is used as a description>

This takes precedence over any description added via a policy file.

Not yet implemented

Pre-processor command: KEYWORDS

You can specify keywords that describe the contents of your page to be added to the document information section of your file by adding a line of the form

        $_$_KEYWORDS <rest of line is used as a list of keywords>

This takes precedence over any keywords added via a policy file.

Not yet implemented


Pre-processor command: TITLE

You can specify the TITLE to be added to your PDF document in the document information section, by adding a line of the form

        $_$_TITLE <rest of line is used as a title>

This title takes precedence over any title added via a policy file.

Not yet implemented

Section delimiters

These commands mark the start and end of various sections in your document

  ALLOW and DISALLOW      Enable and disable certain types of detection.
  CONTENTS                Mark a section as the contents list.
  CODE                    Mark a section as a C-like code sample.
  DIAGRAM                 Mark a section as a diagram or ASCII art.
  IGNORE                  Ignore a section of the document.
  PRE                     Mark a section as pre-formatted text.
  TABLE                   Mark a section as a table.
  COMMA_DELIMITED_TABLE   Mark a section as a comma-delimited data table.
  DELIMITED_TABLE         Mark a section as a tab-delimited data table.
  SECTION                 Mark the start of a user-specified section.

Pre-processor command: ALLOW and DISALLOW

AscToPDF will automatically try to detect various typographical features. You can turn this behaviour on and off in different sections by using the ALLOW and DISALLOW commands. This can be used, for example, to prevent a numbered list being wrongly detected as a numbered heading and vice versa.

The syntax for both commands is the same, namely

        ALLOW/DISALLOW  <comma-separated list of keywords>

Where the recognised keywords are as follows

  Headings   Enables/disables the search for lines that could be treated as headings.
  Lists      Enables/disables the search for lines that could be regarded as list items (either unordered bullets, or alphabetic or numeric list points).
  All        Set (enable) all of the above.
  Reset      Reset (disable) all of the above.

In each case the tag will simply add or subtract from the current list of allowable features. To aid control, two special keywords "all" and "reset" are available for inclusion in the list. "Reset" will disable all options, thus

        $_$_ALLOW reset, Headings

will have the effect of disabling everything (the "reset") and then adding "Headings" to the allowed list. In this respect "ALLOW all" and "DISALLOW reset" are identical commands.

Below is an example in which the DISALLOW tag is used to prevent numbered lines being regarded as lists or headings. The ALLOW tag at the end switches back to the default behaviour, so any lists or numbered headings elsewhere in the document will still be detected.

        $_$_DISALLOW headings
        ...
        1. Whatever this line is, it isn't a heading
        ...
        $_$_DISALLOW headings,lists
        ...
        2. Whatever this line is, it isn't a heading or a list item
        ...
        $_$_ALLOW reset

Pre-processor command: CONTENTS

You can mark up a section of your document as a contents list. To do this use matching BEGIN_CONTENTS and END_CONTENTS commands as follows:

        $_$_BEGIN_CONTENTS
        ...
        $_$_END_CONTENTS

AscToPDF will then attempt to treat the enclosed text as a contents list.

See comments on contents policies

Pre-processor command: CODE

You can mark up a section of your document as being a piece of sample C-like code. To do this use matching BEGIN_CODE and END_CODE commands as follows:

        $_$_BEGIN_CODE
        ...
        $_$_END_CODE

AscToPDF will then mark up the enclosed text in fixed width fonts.

See comments on pre-formatted text

Pre-processor command: COMMA_DELIMITED_TABLE

These commands delimit a table of comma-delimited data

Syntax:

        $_$_BEGIN_COMMA_DELIMITED_TABLE
        ...
        (lines of comma-delimited data)
        ...
        $_$_END_COMMA_DELIMITED_TABLE

The BEGIN_COMMA_DELIMITED_TABLE ... END_COMMA_DELIMITED_TABLE directives can be used to delimit a series of comma-delimited data values that should be interpreted as a table (e.g. data originally exported from a spreadsheet such as Excel)

See comments in Pre-processor command: TABLE


Pre-processor command: DELIMITED_TABLE

These directives delimit a table of delimited data

Syntax:

        $_$_BEGIN_DELIMITED_TABLE [<delimiter>]
        ...
        (lines of delimited data)
        ...
        $_$_END_DELIMITED_TABLE

where

  <delimiter>   The delimiter character to use. If omitted, the default is
                tab-delimited. The delimiter can be any character except a
                comma. For comma-delimited tables use the
                COMMA_DELIMITED_TABLE command instead.

The BEGIN_DELIMITED_TABLE ... END_DELIMITED_TABLE directives can be used to delimit a series of delimited data values that should be interpreted as a table (e.g. data originally exported from a spreadsheet such as Excel)
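To make the effect of the optional delimiter concrete, here is a small Python sketch that collects the rows of each marked-up table. It is a sketch under stated assumptions rather than a description of AscToPDF's internals: the function name extract_delimited_tables is hypothetical, the delimiter is assumed to be written literally after the BEGIN directive, and blank lines inside a table are assumed to be skipped.

```python
BEGIN = "$_$_BEGIN_DELIMITED_TABLE"
END = "$_$_END_DELIMITED_TABLE"


def extract_delimited_tables(lines):
    """Return one list of rows (lists of cells) per marked-up table.

    Hypothetical helper for illustration; not part of AscToPDF.
    """
    tables = []
    delimiter = None  # None means "not currently inside a table"
    rows = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith(BEGIN):
            # Assumption: the delimiter is given literally after the keyword.
            attribute = stripped[len(BEGIN):].strip()
            # With no attribute the data is treated as tab-delimited.
            delimiter = attribute if attribute else "\t"
            rows = []
        elif stripped == END:
            if delimiter is not None:
                tables.append(rows)
            delimiter = None
        elif delimiter is not None and stripped:
            cells = line.rstrip("\n").split(delimiter)
            rows.append([cell.strip() for cell in cells])
    return tables


if __name__ == "__main__":
    source = [
        "$_$_BEGIN_DELIMITED_TABLE |",
        "Name|Quantity",
        "Apples|3",
        "$_$_END_DELIMITED_TABLE",
    ]
    print(extract_delimited_tables(source))
```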

See comments in Pre-processor command: TABLE


Pre-processor command: DIAGRAM

You can mark up a section of your document as being a diagram or a piece of ASCII art. To do this use matching BEGIN_DIAGRAM and END_DIAGRAM commands as follows:

        $_$_BEGIN_DIAGRAM
        ...
        $_$_END_DIAGRAM

AscToPDF will then mark up the enclosed text in fixed width fonts.

See comments on pre-formatted text


Pre-processor command: IGNORE

You can mark up a section in your document that you want ignored in the output. This can be used to store change history information or whatever you want.

Syntax:

      $_$_BEGIN_IGNORE
      ...
      (text to be ignored)
      ...
      $_$_END_IGNORE

This markup can be used to delimit a section to be wholly ignored. Any markup and tags in the ignored section will have no effect.
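The effect of an ignored section can be pictured with the following Python sketch, which drops every line between the two directives. It is illustrative only: the function name drop_ignored_sections is hypothetical, and the sketch assumes that ignored sections do not nest.

```python
def drop_ignored_sections(lines):
    """Return the lines that fall outside BEGIN_IGNORE ... END_IGNORE.

    Hypothetical helper for illustration; not part of AscToPDF.
    Assumption: ignored sections are not nested.
    """
    kept = []
    ignoring = False
    for line in lines:
        stripped = line.strip()
        if stripped == "$_$_BEGIN_IGNORE":
            ignoring = True
        elif stripped == "$_$_END_IGNORE":
            ignoring = False
        elif not ignoring:
            # Directives inside an ignored section never reach this
            # branch, so they have no effect on the output.
            kept.append(line)
    return kept


if __name__ == "__main__":
    source = [
        "Visible text",
        "$_$_BEGIN_IGNORE",
        "Change history: first draft",
        "$_$_END_IGNORE",
        "More visible text",
    ]
    print(drop_ignored_sections(source))
```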


Pre-processor command: PRE

You can mark up a section of your document as being pre-formatted text. To do this use matching BEGIN_PRE and END_PRE commands as follows:

        $_$_BEGIN_PRE
        ...
        $_$_END_PRE

AscToPDF will then mark up the enclosed text in fixed width fonts.

See comments on pre-formatted text


Pre-processor command: TABLE

You can mark up a section of your document as being a text table. To do this use matching BEGIN_TABLE and END_TABLE commands as follows:

        $_$_BEGIN_TABLE
        ...
        $_$_END_TABLE

Note:
At present AscToPDF doesn't have the ability to fully lay out your table in the way that other JafSoft utilities such as AscToHTM and AscToRTF do. For the time being, "tables" will simply be output using a fixed font. However, it is hoped that a later version will fully implement table creation.

General comments on marking up tables

AscToPDF has some ability to auto-detect tables (see comments on pre-formatted text), but this can be error prone. Marking up tables removes a lot of the ambiguity and so can give better results.

For tables of delimited data (as opposed to plain text tables) you should use the DELIMITED_TABLE and COMMA_DELIMITED_TABLE commands.

Note that in each case the presence of these directives overrides any value set in the policy file, as those policy values only refer to the auto-detection of tables. Placing markup in the source forces the text to be treated as a table.

Pre-processor command: SECTION

You can mark up sections of your document as being named sections. By default text belongs to a section called "all".

To do so, insert a SECTION command at the start of each section as follows:

        $_$_SECTION <name>
        ...

All following text will be marked as belonging to the named section until another SECTION command is encountered. AscToPDF will only copy across those sections named in the allowable sections policy, and any text in "all" sections. In this way you can generate variants of your document for different audiences (e.g. Internet and Intranet).

If you want the rest of your document to be included in all conversions, insert an "all" SECTION command as follows:

        $_$_SECTION all
        ...
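The selection rule for named sections can be sketched in Python as follows. This is an illustration, not the program's code: the function name select_sections is hypothetical, the allowed_sections argument stands in for the allowable sections policy, and section names are assumed to be compared exactly as written.

```python
def select_sections(lines, allowed_sections):
    """Keep text from the "all" section and from any allowed section.

    Hypothetical helper for illustration; not part of AscToPDF.
    Assumption: section names are matched exactly, including case.
    """
    allowed = set(allowed_sections) | {"all"}
    current = "all"  # text belongs to "all" until a SECTION command says otherwise
    kept = []
    for line in lines:
        parts = line.strip().split(None, 1)
        if parts and parts[0] == "$_$_SECTION":
            # Assumption: a SECTION command with no name returns to "all".
            current = parts[1].strip() if len(parts) > 1 else "all"
            continue
        if current in allowed:
            kept.append(line)
    return kept


if __name__ == "__main__":
    source = [
        "Shared introduction",
        "$_$_SECTION internet",
        "Text for the public site",
        "$_$_SECTION intranet",
        "Text for staff only",
        "$_$_SECTION all",
        "Shared footer",
    ]
    print(select_sections(source, ["internet"]))
```

Running the same source with a different list of allowed sections yields the variant for another audience, while the "all" text appears in every variant.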

Other commands

  BR              Insert a line break
  CHANGE_POLICY   Dynamically vary policies through the input file
  FILENAME        Output the original filename
  IGNORE_THIS     Ignore some text in the source
  INCLUDE         Include an external file into the source
  PAGE            Create a page boundary at this location
  VERSION         Output the program version used in this conversion


Pre-processor command: BR

This command tells the software to output a line break at this point. Usually the default is to let all lines flow together to form a paragraph. This command can be used, for example, in address blocks to make sure each line is placed on a new line.


Pre-processor command: CHANGE_POLICY

This option allows you to embed policy lines in the source document. This can be used to avoid the need for separate policy files, or to change the policy at different locations within the document (although the effects can sometimes be unpredictable).

The syntax is

        $_$_CHANGE_POLICY <policy line as in policy file>

For example placing

        $_$_CHANGE_POLICY Convert mailto links : yes

would make all subsequent email addresses be converted into working hyperlinks. By adding several lines of this type you can toggle this behaviour on and off, controlling which email links become hyperlinks and which do not.
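A CHANGE_POLICY line can be thought of as a policy name and a value carried inside a directive. The Python sketch below splits such a line into those two parts. It is a sketch only: the function name parse_change_policy is hypothetical, and the "name : value" layout is inferred from the single example above rather than from a full description of the policy file format.

```python
PREFIX = "$_$_CHANGE_POLICY"


def parse_change_policy(line):
    """Return (policy_name, value) for a CHANGE_POLICY line, else None.

    Hypothetical helper for illustration; not part of AscToPDF.
    Assumption: a policy line has the form "name : value".
    """
    stripped = line.strip()
    if not stripped.startswith(PREFIX):
        return None
    # Split at the first colon only, so values may contain colons.
    name, separator, value = stripped[len(PREFIX):].partition(":")
    if not separator or not name.strip():
        return None
    return name.strip(), value.strip()


if __name__ == "__main__":
    print(parse_change_policy("$_$_CHANGE_POLICY Convert mailto links : yes"))
```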

Pre-processor command: FILENAME

This in-line tag substitutes the name of the file being converted.

Syntax:

[[FILENAME]]

The tag will be replaced by the name of the file being converted. This facilitates the construction of sentences like

"This file was converted from [[FILENAME]] at [[TIMESTAMP]] "

which becomes

"This file was converted from asctopdf.txt at 17-Apr-2006"


Pre-processor command: IGNORE_THIS

This is an in-line tag whose contents are ignored. It could be used for comments.

Syntax:

[[IGNORE_THIS <anything_you_like>]]

This tag is ignored. It is replaced by a single space in the output stream. It could be used to add a brief comment to your source that would not appear in the output.

See also the IGNORE command
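The replacement of an IGNORE_THIS tag by a single space can be sketched in Python as below. The function name strip_ignore_this is hypothetical, and the sketch assumes the tag's contents never contain "]]".

```python
import re

# Matches "[[IGNORE_THIS ...]]" on a single line.
# Assumption: the ignored text does not itself contain "]]".
IGNORE_THIS_PATTERN = re.compile(r"\[\[IGNORE_THIS\b.*?\]\]")


def strip_ignore_this(line):
    """Replace each IGNORE_THIS tag in the line with a single space.

    Hypothetical helper for illustration; not part of AscToPDF.
    """
    return IGNORE_THIS_PATTERN.sub(" ", line)


if __name__ == "__main__":
    print(strip_ignore_this("Total:[[IGNORE_THIS check this figure]]42"))
```

Because the tag becomes a space rather than nothing, words on either side of a comment do not run together in the output.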

Pre-processor command: INCLUDE

You can include one source file in another by using the INCLUDE command as follows:

        $_$_INCLUDE filename

Make sure the file is accessible from wherever AscToPDF is run, or in the same directory as the original source file. AscToPDF will read the file on each pass, treating its contents as part of the main file for both analysis and conversion purposes.

Note that the include file should be plain text, which will be converted as normal for the document. It may include other pre-processor commands, including further INCLUDE commands up to a limit of 9 levels. Be careful not to set up include loops (i.e. a includes b, b includes c, c includes a, etc.).

Include files like this can be a useful way of embedding standard disclaimers etc., and complement the use of headers and footers.
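Nested includes and the depth limit can be illustrated with a short recursive Python sketch. It is not how AscToPDF is known to work internally: the names expand_includes and read_file are hypothetical, files are supplied through a caller-provided function so the example stays self-contained, and raising an error at the limit is an assumption, since the text above does not say what the program does when the limit is exceeded.

```python
MAX_INCLUDE_DEPTH = 9  # the nesting limit quoted in the text above


def expand_includes(lines, read_file, depth=0):
    """Return the lines with every INCLUDE replaced by the file's lines.

    Hypothetical helper for illustration; not part of AscToPDF.
    read_file is any function mapping a filename to a list of lines.
    """
    expanded = []
    for line in lines:
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0] == "$_$_INCLUDE":
            # Assumption: exceeding the limit is reported as an error.
            # An include loop also ends up here instead of recursing forever.
            if depth >= MAX_INCLUDE_DEPTH:
                raise ValueError("INCLUDE nesting exceeds 9 levels")
            included = read_file(parts[1])
            expanded.extend(expand_includes(included, read_file, depth + 1))
        else:
            expanded.append(line)
    return expanded


if __name__ == "__main__":
    # A dictionary stands in for files on disk in this example.
    files = {
        "disclaimer.txt": ["Standard disclaimer", "$_$_INCLUDE contact.txt"],
        "contact.txt": ["Contact: the author"],
    }
    source = ["Introduction", "$_$_INCLUDE disclaimer.txt", "The end"]
    print(expand_includes(source, files.__getitem__))
```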

Pre-processor command: PAGE

The syntax is

$_$_PAGE

This signals a page boundary. In PDF generation a page break will be generated at this point.

Pre-processor command: VERSION

This in-line tag adds a description of the program name/version used to convert the file (e.g. "AscToPDF 2.1").

Syntax:

[[VERSION]]

Outputs the name and version of the conversion program into the output file, for example "AscToPDF 2.1".




Converted from a single text file by AscToHTM
© 2006 John A Fotheringham