Jade - James' DSSSL Engine

What is Jade?

Jade is an implementation of the DSSSL style language. The current version is 1.2.1.

For general information about DSSSL, see my DSSSL page.

You can discuss Jade and get help from other users on DSSSList, the DSSSL Users' Mailing List. I will announce new versions of Jade on DSSSList. To subscribe send mail to majordomo@mulberrytech.com with "subscribe dssslist" as the body of your message. To subscribe to the digest instead, the body should be "subscribe dssslist-digest". The list is archived.

Jade includes the following components:

Jade Copyright

Jade is licensed under the same terms as SP. This imposes almost no restrictions even for commercial use.

If you do use Jade in a commercial product, I would ask you, as a courtesy, to let me know about it and acknowledge the use of Jade.

Getting Jade

If you're using Windows 95 or Windows NT, then all you need is in the binary distribution.

Otherwise you will need to build it yourself from source. The Jade sources are available in two forms:

Windows distribution
This is a ZIP file. The sources use CR/LF delimited lines. After getting this, create a new directory and unpack the archive into it. You must use an unzip that preserves long filenames, such as WinZip 6.1. You should also make sure that your unzip preserves the case of filenames (this requires a -U option with some versions of unzip) and that it preserves the directories. If you want to use this on Unix, you must unpack using unzip -a, since the sources in the Windows distribution use CR/LF delimited lines.
Unix distribution
This is a gzipped tar file. The sources use LF delimited lines. Unpack this using gunzip and tar.

The distributions include the sources for a compatible version of SP (which may be different from the latest released version of SP).

Building Jade

Win32

Only Microsoft Visual C++ 6.0 is supported. To build using the Visual Studio GUI, open the workspace jade.dsw and build the Win32 Release configuration of the all project. To build on the command line, ensure that the directory containing msdev is in your path, typically by executing the command:

path C:\Program Files\Microsoft Visual Studio\Common\MSDev98\Bin;%path%

then run the command:

msdev jade.dsw /make "all - Win32 Release"

Unix

The following compilers should work:

Only the first has been tested by me.

If you use gcc 2.7.2 with -O on an x86 processor you must use -fno-strength-reduce. gcc 2.7.2.1 fixes this problem.

Edit Makefile, then build with make. Note that you must use -DSP_MULTI_BYTE. If you plan to do any development, also do make depend.

Alternatively you can build using the experimental autoconf support.

Using Jade

Add the directory containing the jade binary to your path, change directory to the dsssl directory, and do

jade demo.sgm

If everything is working, this creates a well-formed XML file, demo.fot.

The system identifier of the document to be processed is specified as an argument to Jade. If this is omitted, standard input will be read.

Jade determines the system identifier for the DSSSL specification as follows:

  1. If the -d option is specified, it will use the argument as the system identifier.
  2. Otherwise, it will look for processing instructions in the prolog of the document. Two kinds of processing instruction are recognized:
    <?stylesheet href="sysid" type="text/dsssl">
    The system data of the processing instruction is parsed like an SGML start-tag. It will be parsed using the reference concrete syntax whatever the actual concrete syntax of the document. The name that starts the processing instruction can be either stylesheet, xml-stylesheet or xml:stylesheet. The processing instruction will be ignored unless the value of the type attribute is one of text/dsssl, text/x-dsssl, application/dsssl, or application/x-dsssl. The value of href attribute is the system identifier of the DSSSL specification.
    <?dsssl sysid>
    The system identifier is the portion of the system data of the processing instruction following the initial name and any whitespace.

    Although the processing instruction is only recognized in the prolog, it need not occur in the document entity. For example, it could occur in a DTD. The system identifier will be interpreted relative to where the processing instruction occurs.

  3. Otherwise, it will use the system identifier of the document with any extension changed to .dsl.

A DSSSL specification document can contain more than one style-specification. If the system identifier of the DSSSL specification is followed by #id, then jade will use the style-specification whose unique identifier is id. This is allowed both with the -d option and with the processing instructions.
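
As a rough illustration of the lookup order and the #id suffix described above, here is a minimal Python sketch. It is not Jade's code. The function names, the regular expression used in place of real SGML start-tag parsing, and the choice to take the first matching processing instruction are all assumptions made for the example.

```python
import re
from pathlib import PurePosixPath

PI_NAMES = {"stylesheet", "xml-stylesheet", "xml:stylesheet"}
DSSSL_TYPES = {"text/dsssl", "text/x-dsssl",
               "application/dsssl", "application/x-dsssl"}
# Simplification: only quoted attribute values are handled here.
ATTR_RE = re.compile(r"([\w:.-]+)\s*=\s*(?:\"([^\"]*)\"|'([^']*)')")


def spec_from_pi(pi_data):
    """Return the system identifier named by one PI's system data, or None."""
    parts = pi_data.split(None, 1)
    if not parts:
        return None
    name = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    if name == "dsssl":
        # Everything after the initial name and whitespace.
        return rest.strip() or None
    if name in PI_NAMES:
        attrs = {}
        for m in ATTR_RE.finditer(rest):
            attrs[m.group(1)] = m.group(2) if m.group(2) is not None else m.group(3)
        if attrs.get("type") in DSSSL_TYPES:
            return attrs.get("href")
    return None


def resolve_spec(document_sysid, d_option=None, prolog_pis=()):
    """Return (system identifier, style-specification id or None)."""
    if d_option is not None:                      # rule 1: -d wins
        sysid = d_option
    else:                                         # rule 2: PIs in the prolog
        sysid = next((s for s in map(spec_from_pi, prolog_pis) if s), None)
    if sysid is None:                             # rule 3: swap the extension
        return str(PurePosixPath(document_sysid).with_suffix(".dsl")), None
    base, _, spec_id = sysid.partition("#")       # optional "#id" suffix
    return base, (spec_id or None)


print(resolve_spec("demo.sgm"))
print(resolve_spec("demo.sgm", d_option="print.dsl#html"))
```

The sketch ignores the rule that a system identifier in a processing instruction is interpreted relative to where the instruction occurs.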

The DSSSL specification must be an SGML document conforming to the DSSSL architecture. For an example, see dsssl/demo.dsl.

Jade supports the following options in addition to the normal SP options:

-d dsssl_spec
This specifies that dsssl_spec is the system identifier of the DSSSL specification to be used.
-G
Debug mode. When an error occurs in the evaluation of an expression, Jade will display a stack trace. Note that this disables tail-call optimization.
-t output_type
output_type specifies the type of output as follows:
fot
An XML representation of the flow object tree
rtf rtf-95
Microsoft's Rich Text Format. rtf-95 produces output optimized for Word 95 rather than Word 97.
tex
TeX
sgml
SGML (used for SGML-to-SGML transformations)
xml
XML (used for SGML-to-XML transformations)
-o output_file
Write output to output_file instead of the default. The default filename is the name of the last input file with its extension replaced by the name of the type of output. If there is no input filename, then the extension is added onto jade-out.
-V variable
This is equivalent to doing
(define variable #t)
except that this definition will take priority over any definition of variable in a style-sheet.
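
The default output filename rule given under -o can be sketched in Python as follows. This only illustrates the rule as stated. The function name is hypothetical, and the sketch assumes the extension is exactly the output type name, which the text above does not confirm for rtf-95.

```python
from pathlib import PurePosixPath


def default_output_name(input_files, output_type):
    """Illustrative only: derive the default output filename.

    input_files is the list of input filenames given on the command line
    (possibly empty); output_type is a name such as "fot", "rtf" or "tex".
    """
    if not input_files:
        # No input filename: the extension is added onto "jade-out".
        return "jade-out." + output_type
    # Otherwise replace the extension of the last input file.
    return str(PurePosixPath(input_files[-1]).with_suffix("." + output_type))


print(default_output_name(["demo.sgm"], "fot"))
print(default_output_name([], "rtf"))
```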

Jade ignores the SP_CHARSET_FIXED and SP_SYSTEM_CHARSET environment variables and always uses Unicode as its internal character set, as if SP_CHARSET_FIXED was 1 and SP_SYSTEM_CHARSET was unset. Thus only the SP_ENCODING environment variable is relevant to Jade's handling of character sets.

Jade Extensions

The following external procedures are available. These external procedures are defined by a prototype in the same manner as in the standard. To use one of these external procedures, you must make use of the standard external-procedure procedure, using a public identifier of "UNREGISTERED::James Clark//Procedure::name" where name is the name given here, typically by including the following in the DSSSL specification:

(define name
  (external-procedure "UNREGISTERED::James Clark//Procedure::name"))

Note that external-procedure returns #f if it doesn't know about the specified public identifier. You can use this to enable your DSSSL specifications to work gracefully with other implementations which do not support these extensions.

Debugging

(debug obj)

Generates a message including the value of obj and then returns obj.

Simple-page-sequence header/footer control

(if-first-page sosofo1 sosofo2)

This can be used only in the specification of the value of one of the header/footer characteristics of simple-page-sequence. It returns a sosofo that will display as sosofo1 if the page is the first page of the simple-page-sequence and as sosofo2 otherwise.

(if-front-page sosofo1 sosofo2)

This can be used only in the specification of the value of one of the header/footer characteristics of simple-page-sequence. It returns a sosofo that will display as sosofo1 if the page is a front (i.e. recto, odd-numbered) page and as sosofo2 if it is a back (i.e. verso, even-numbered) page.

Numbering

(all-element-number)
(all-element-number osnl)

This is the same as element-number except that it counts elements with any generic identifier. If osnl is not an element, it returns #f; otherwise it returns 1 plus the number of elements that started before osnl. This provides an efficient way of creating a unique identifier for any element in a document.
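
To make the numbering concrete, the following Python sketch computes the same kind of number over an XML tree using the standard xml.etree.ElementTree module. It illustrates the counting rule only. It says nothing about how Jade computes the value, and the function name is made up for the example.

```python
import xml.etree.ElementTree as ET


def all_element_numbers(root):
    """Map each element to 1 plus the number of elements that started before it.

    root.iter() visits elements in document order (the order in which their
    start-tags occur), beginning with root itself.
    """
    return {element: n for n, element in enumerate(root.iter(), start=1)}


doc = ET.fromstring("<doc><sec><p/></sec><sec><p/><p/></sec></doc>")
numbers = all_element_numbers(doc)
second_sec = doc[1]
print(numbers[doc], numbers[second_sec])  # root is 1; three elements start before the second sec
```

Because every element starts at a different point in the document, the numbers are distinct, which is what makes them usable as unique identifiers.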

External entity access

(read-entity string)

This returns a string containing the contents of the external entity with system identifier string. This should be used only for textual entities (CDATA and SDATA), and not for binary entities (NDATA).

Current Jade Limitations

This section describes the limitations of the front-end (the general-purpose DSSSL engine): each backend also has its own limitations.

Only the DSSSL Online subset of DSSSL is implemented with the following additions (all part of full DSSSL)

Note that only inherited characteristics that are applicable to some supported flow object can be specified.

Jade supports only a single, fixed grove plan which comprises the following modules:

Character/glyph handling

It only supports a single pre-defined character repertoire. A character name of the form U-XXXX, where XXXX are four upper-case hexadecimal digits, is recognized as referring to the Unicode character with that code. For many characters, it is also possible to use the ISO/IEC 10646 name in lower-case with words separated by hyphens.

Some common SDATA entity names from the ISO entity sets are recognized and mapped to characters. In addition an SDATA entity name of the form U-XXXX, where XXXX are four upper-case hexadecimal digits, is mapped to the Unicode character with that code.
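
The U-XXXX convention used for both character names and SDATA entity names can be sketched in Python like this. The function name is hypothetical and the code only illustrates the naming rule as stated, with exactly four upper-case hexadecimal digits required.

```python
import re

# "U-" followed by exactly four upper-case hexadecimal digits.
_U_NAME = re.compile(r"U-([0-9A-F]{4})")


def char_from_u_name(name):
    """Return the character for a U-XXXX name, or None if the name does not fit."""
    match = _U_NAME.fullmatch(name)
    if match is None:
        return None
    return chr(int(match.group(1), 16))


print(char_from_u_name("U-0041"))  # the letter A
print(char_from_u_name("U-00e9"))  # None: lower-case digits do not match
```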

Jade does not make use of any of the declaration architectural forms related to characters and glyphs.

The following style language declarations (as well as the non-DSSSL Online declarations) are ignored:

declare-char-characteristic+property
declare-char-property
add-char-properties
define-language
declare-default-language

Validation

Several things that it would be desirable to have checked aren't checked:

Other limitations

The following primitives are just stubs:

char-script-case
Always returns last argument.
char-property
Always returns #f or specified default value.
address-visited?
Always returns #f.

Backends

RTF backend

Only the following flow object classes are implemented:

sequence
character
paragraph
paragraph-break
line-field
Only at the beginning of a paragraph.
display-group
simple-page-sequence
score
Only type after and through
rule
Only horizontal orientation. Rules only show up in Page Layout View.
box
Changing indentation inside a box will not work.
leader
The content of the flow object is ignored: a dotted leader will always be used. The specified length is ignored: it always fills out the line.
external-graphic
On Windows platforms, this can be used to embed OLE objects, by making the value of notation-system-id: a formal system identifier whose storage manager is CLSID and whose storage object identifier is the COM CLSID (including surrounding braces). The system identifier may also be just <CLSID> (that is, the storage object identifier may be empty); in this case, the OLE default CLSID for the file (usually chosen based on the file's extension) will be used.
link
Only destinations that are single elements in the same RTF output file.
table
table-part
table-column
The table-auto-width feature isn't properly supported: it's not really possible in RTF.
table-row
table-cell
table-border
math flow objects
These are converted into Word EQ fields. The Word EQ feature is not powerful enough to do a really good job. The mark flow object class is particularly problematic, because there is no way to compute automatically an appropriate position for the contents of the over-mark and under-mark areas; however, some additional characteristics are provided that allow the positioning to be explicitly specified.

Many DSSSL characteristics cannot be implemented in RTF. The backend does the best it can.

In order to get correct page numbers in Microsoft Word, type the following after opening the document:

  1. CTRL+END
  2. CTRL+A
  3. F9

In Word Viewer 97, you must instead do:

  1. CTRL+END
  2. ALT
  3. V
  4. N
  5. ALT
  6. V
  7. P

Page numbers also get updated automatically when you print.

The RTF backend supports some additional characteristics. To use a characteristic named here as C, declare it using declare-characteristic with the public identifier:

"UNREGISTERED::James Clark//Characteristic::C"

heading-level
Value is an integer. It applies to paragraph flow objects. If the value is between 1 and 9, then the paragraph is output as a heading of this level, otherwise it is output as body text. Using this characteristic allows Word to provide useful outline views and a document map. (Note that Word's handling of document maps for RTF documents is buggy: if you load an RTF document, and the previous document was using the Online Layout view, then Word will attempt to guess which paragraphs are headings, which it will almost always do wrong. To avoid this, switch to the Normal view before loading an RTF document.) The initial value is 0.
page-number-format
Value is a string as for format-number procedure. This controls the format of the number used by page-number-sosofo and current-page-number-sosofo for references to pages in the simple-page-sequence. The initial value is "1". It applies to simple-page-sequence flow objects.
page-number-restart?
Value is a boolean. If true, then for the purposes of page-number-sosofo and current-page-number-sosofo, the page numbers for this simple-page-sequence will restart from 1. The initial value is #f. It applies to simple-page-sequence flow objects.
page-n-columns
Value is a strictly positive integer, specifying the number of columns. The initial value is 1. It applies to simple-page-sequence flow objects.
page-column-sep
Value is a length, specifying the separation between columns. The initial value is .5in. It applies to simple-page-sequence flow objects.
page-balance-columns?
Value is a boolean. If true, the columns on the final page of the page-sequence should be balanced. The initial value is #f. It applies to simple-page-sequence flow objects.
superscript-height
Value is a length. Specifies the height of the baseline of a superscript above its parent's baseline. It applies to superscript and script flow objects.
subscript-depth
Value is a length. Specifies the depth of the baseline of a subscript below its parent's baseline. It applies to subscript and script flow objects.
over-mark-height
Value is a length. Specifies the height of the baseline of the contents of the over-mark area of a mark flow object above the baseline of the contents of the main area. It also controls the height of the contents of the mid-sup area of the script flow object. It applies to mark and script flow objects.
under-mark-depth
Value is a length. Specifies the depth of the baseline of the contents of the under-mark area of a mark flow object below the baseline of the contents of the main area. It also controls the depth of the contents of the mid-sub area of the script flow object. It applies to mark and script flow objects.
grid-row-sep
Value is a length. Specifies the separation between rows of a grid flow object.
grid-column-sep
Value is a length. Specifies the separation between columns of a grid flow object.

XML Flow Object Tree backend

The DTD for the XML generated is in dsssl/fot.dtd.

Characteristics declared with declare-characteristic are not reported.

Flow objects of classes declared with declare-flow-object are not reported.

Reading the Jade Sources

Start with the following headers:

grove/Node.h
style/FOTBuilder.h
style/StyleEngine.h

Reporting Bugs in Jade

If you find a bug in Jade, please take the time to report it. If Jade crashes on any input whatever, that's a bug and I want to hear about it. If Jade fails to process a specification that conforms to the DSSSL standard in the manner required by the DSSSL standard in a way that is not documented here as a current limitation nor is documented as an error in the proposed Technical Corrigendum to the DSSSL standard, that's a bug and I want to hear about it.

Please report bugs by email to me, jjc@jclark.com. Do not post them to comp.text.sgml nor to the sp-prog mailing list nor to the DSSSList mailing list.

I do not want to get bug reports about documented limitations, so please read the list of limitations carefully. However, feel free to let me know which of the current limitations you would most like to see addressed.

I also at this stage do not want to hear about bugs in your C++ compiler that prevent it from compiling Jade: if your compiler refuses to compile Jade, I want to hear about it only if

Before reporting a bug, please check that your version is current.

The most important thing in reporting a bug is to include a complete set of files on which I can run jade and reproduce the problem. Include all DTDs that you use, whether or not they are standard. Also tell me what command line I should use, and what is incorrect about the behaviour of jade. If the files are large package them up as a tar or zip file and upload them to ftp://ftp.jclark.com/incoming.

It is useful if you have a fix for a bug, but please don't delay sending in the bug while you work on a fix and don't send in a fix without giving me the files to reproduce the bug it fixes.

Contributing to Jade

Here are some ways you can contribute to Jade:

James Clark