Stream-based Style sheet Proposal
Other proposals
Before I can introduce my own syntax, I will have to dismiss :-)
earlier proposals. (Also see HTML Style Sheets.)
- Robert Raisch's proposal of June 1993. Except for minor details,
this would meet my requirements, but I think the syntax can be
improved.
- Joe English's proposal using SGML syntax. (Already withdrawn by the
author.) Everything can be expressed in SGML syntax, but SGML isn't
necessarily the best format for everything.
- DSSSL & DSSSL Lite by James Clark et al. I'm sure DSSSL (or DSSSL
Lite) will find its place in the formatting of complex SGML, despite
its shortcomings, since there simply isn't anything else that's
standardized. But for on-the-fly (stream-based) formatting, it's
unusable. As long as the on-screen representation of a document
maintains (more or less) the order of the elements in the SGML file,
there is no need for the goal-based model of DSSSL.
- Haakon Lie's cascading style sheets. This proposal again follows the
same route as the first two proposals above and thereby meets the
requirements I set. The syntax is very close to what I will use below.
The ability for the client to override style sheets in a controlled
manner is very appealing, but unfortunately not as simple as this
proposal supposes. Nevertheless, context-dependent or even weighted
overrides are worth investigating further.
- `HTML to the Max' by C.M. Sperberg-McQueen and Robert Goldstein.
This `manifesto' contains many sensible words about the deployment of
general SGML on the Web. For a style sheet language to be practical,
the SGML document must be in the canonical form towards which the
authors have made a start. I guess this defines the border between
documents requiring DSSSL and those that can be formatted `on the
fly': the SGML must be in canonical form and the elements must be in
the right order.
Syntax
First a note about the syntax I adopted: it is simply the syntax
used by X resource files. The advantages are as follows:
- At least under X, all the routines for parsing and
string-to-whatever conversions are already there.
- It supports the addressing of elements as direct children or
descendants of other elements.
- Except for a few subtleties with inheritance and priorities among
general and specific specs, the syntax is straightforward, very
readable and easy to write by hand.
The subtleties mentioned above involve specifications with
wild cards. There are well-defined rules for them, but in a few cases
they may be somewhat surprising. By inheritance I mean the rule that
elements in the SGML document inherit most of the style properties of
the enclosing element. This must be distinguished from the properties
that are shared because they have a wild card in them.
In general all element names are in uppercase, attribute names are
also in uppercase and preceded by `!', IDs are preceded by an `@',
and property names consist of lowercase letters. In certain contexts,
property names can be preceded by `$'. Example:
HTML.BODY.DISPLAY.P.EM.slant.
All of the properties can have an explicit value, an attribute
reference (with `!'), or a reference to another property (indicated
by an initial `$'). An attribute or property reference is replaced by
its value. For a property, this is the value of the property in the
enclosing element (otherwise the order of evaluation would matter). If
a property has an illegal value, the property is regarded as not being
present.
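
The three kinds of value can be illustrated with a small Python
sketch. This is only one possible reading of the paragraph above: the
function names, the dictionaries and the table of legal values are
hypothetical and not part of the proposal.

```python
def resolve(value, attributes, enclosing):
    """Resolve one raw property value for an element.

    value      -- raw value from the style sheet: "left", "!ALIGN", "$size"
    attributes -- the element's own SGML attributes (upper-case names)
    enclosing  -- already-resolved properties of the enclosing element
    Returns the resolved string, or None if a reference has no value.
    """
    if value.startswith("!"):
        # Attribute reference: look at the element's own attributes.
        return attributes.get(value[1:])
    if value.startswith("$"):
        # Property reference: always taken from the enclosing element,
        # so the order in which properties are evaluated does not matter.
        return enclosing.get(value[1:])
    return value  # explicit value


def legal_or_absent(name, value, legal_values):
    """Return value if it is legal for property `name`, else None.

    None stands for "the property is regarded as not being present".
    Properties missing from legal_values accept any value (assumption).
    """
    allowed = legal_values.get(name)
    if value is None or (allowed is not None and value not in allowed):
        return None
    return value


# Hypothetical data for illustration only.
LEGAL = {"justify": {"left", "right", "full", "center"}}
attrs = {"ALIGN": "center"}
parent = {"justify": "full", "textcolor": "black"}

justify = legal_or_absent("justify", resolve("!ALIGN", attrs, parent), LEGAL)
colour = resolve("$textcolor", attrs, parent)
```

Here `justify' resolves to `center' via the ALIGN attribute, and
`colour' takes the enclosing element's `textcolor'.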
The value may also be a built-in function. The only function
defined so far is `@ifmatch(A, B, C, D)', where A is the name of an
attribute (with `!') or property (with `$'), B is a regular
expression, C is the value of the property if A matches B, and D is
the value otherwise. D may be omitted.
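
A minimal Python sketch of how `@ifmatch' might be evaluated follows.
The proposal does not say whether B must match the whole value or only
part of it, nor what happens when A is absent; this sketch assumes a
whole-value match, treats an absent A as a non-match, and returns None
when D is omitted. The function signature is hypothetical.

```python
import re


def ifmatch(a, b, c, d=None, *, attributes, enclosing):
    """Evaluate @ifmatch(A, B, C, D) for one element.

    a          -- "!NAME" (attribute) or "$name" (property of the
                  enclosing element)
    b          -- regular expression (Python syntax is an assumption)
    c, d       -- value if A matches B / value otherwise (d is optional)
    """
    if a.startswith("!"):
        subject = attributes.get(a[1:])
    elif a.startswith("$"):
        subject = enclosing.get(a[1:])
    else:
        raise ValueError("A must start with '!' or '$'")
    # Assumption: B has to match the complete value of A.
    if subject is not None and re.fullmatch(b, subject):
        return c
    return d


# The <IMG> rule from the example style sheet, applied to two elements.
with_map = ifmatch("!ISMAP", "ISMAP", "true", "false",
                   attributes={"ISMAP": "ISMAP"}, enclosing={})
without_map = ifmatch("!ISMAP", "ISMAP", "true", "false",
                      attributes={}, enclosing={})
```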
Properties
Below is a list of style properties. I've tried to make the list as
comprehensive as possible, but I'm sure there are omissions or things
that are better expressed in different ways. For each property there
is a short explanation of the semantics and the way it is inherited in
sub-elements. If a property is inherited, it means that the final
value of the property is inherited. E.g., if a property is given as
`*FOO.justify: !ALIGN' and attribute ALIGN of FOO has the value
`left', then all sub-elements of FOO are left aligned; it doesn't
matter if they have an ALIGN attribute of their own.
Vertical space is, unless explicitly stated otherwise, expressed
in units equal to the normal line height of the default (initial)
font. Thus, these heights are independent of the actual font size or
leading. For X, this line height would be the sum of the font's
overall ascent and descent.
Horizontal space is measured in ems of the default font. For an X
font, this would be the value of the QUAD_WIDTH property (if indeed
the default font has one, otherwise a suitable estimate).
empty
- (Boolean) Indicates that the element has no
end tag. The formatter will in effect open the element and then
immediately insert a virtual end tag and execute that. Not inherited
(obviously:-)).
size
- Font size, absolute or as an increment to
the inherited size; the two are distinguished by the presence/absence
of an initial `+' or `-'. Example: assume initial size is 0; if
<SUP> has property `*SUP.size: -2', it would be -2 after <SUP>; after
another <SUP> it would be -4. Of course, browsers only have
a limited set of sizes and -4 may well turn out the same as -2. This
property is not inherited (but the resulting cumulative size is!).
family
- Font family. There are four families:
normal, alt, tt, and sym. `normal' is useful for running text. `alt',
if different from `normal', is suitable for titles. `tt' is a fixed
width font. `sym' is for special symbols that are not in any character
set (called WWW-icons, such as folder and audio). Inherited.
familyname
- Font family. This is the name of a
specific family, such as Bembo, Gill Sans, Univers, Garamond, etc.
Takes precedence over `family', but only if the browser is able to
provide the font. Inherited.
emphasis
- A number selecting the level of
emphasis. Browsers are free to implement this any way they like, but
they must support at least levels 0 (of course!), 1, and 2. Inherited.
slant
- (Boolean) Select oblique or italic font.
Overrides `emphasis'. Note: there is no way to select slanted rather
than italic, as in TeX; is this needed? Inherited.
bold
- (Boolean) Select bold font. Overrides
`emphasis'. Inherited.
underscore
- A number giving the number of lines
under the text. Overrides `emphasis'. Inherited.
strikeout
- (Boolean) a line through the text.
Inherited.
textcolor
- Colour of foreground, using X
specifications. Inherited.
textbackground
- Colour of background. The value
`transparent' is also allowed. Inherited.
leading
- Extra vertical space to put between
lines, relative to the default line height of the actual font in the
element. Thus, 1.0 means double-spaced lines, whatever the current
font size. Inherited.
raise
- Amount to raise (or lower, if negative) the
text above the baseline. The value can be ..., -2, -1, 0, 1, 2, ..., to
indicate the positions. (The exact positions in pixels are a property
of the font.) This property automatically selects an appropriately
smaller font size, so no `size' is needed (unless one wants an even
smaller font). Inherited.
prebreak
- (Floating point number) Minimum amount
of whitespace above the element. (Implies a paragraph break.) The
whitespace is not added to the vertical space already there, but the
space becomes the maximum of the existing space and the new. Example:
`*OL.prebreak: 1.0' and `*LI.prebreak: 0.5' would cause the first
<LI> to be 1.0 line (maximum of 1.0 and 0.5) below the previous
paragraph. Not inherited.
postbreak
- Minimum amount of whitespace below the
element. Not inherited.
rulebefore
- (Floating point number) The presence
of this property causes a horizontal rule to be inserted above the
element, followed by the given amount of whitespace. The rule appears
after any `prebreak'. Not inherited.
ruleafter
- Insert the given amount of whitespace
and a horizontal rule below the element, but before any `postbreak'
(implies a paragraph break). Not inherited.
rulethickness
- Thickness of the rule, in line
heights. Only meaningful if `rulebefore' and/or `ruleafter' are
present. Inherited.
leftindent
- (Floating point number) Increment to
the current left margin. Implies a paragraph break. Not inherited (but
the cumulative margin is!).
rightindent
- Increment to the right margin. Not
inherited.
textwidth
- Width of the paragraph in ems. This
implies a paragraph break and overrides `rightindent'. Not inherited.
justify
- Alignment mode for the paragraph. (The
presence of this property implies a paragraph break.) Possible values
are `left', `right', `full' and `center'. Default value is `full'.
Inherited (but not the implied paragraph break).
track
- There are three tracks: the main one, one
on the left and one on the right. Text (or images) can be `floated'
against the left or right margins, causing the main track to flow
around it. Values for this property are: left, right, normal. Implies
a paragraph break. Inherited (except for the implied paragraph break).
parindent
- (Floating point) Indentation of first
line. Inherited.
noindent
- (Boolean) Suppresses `parindent' on the
next paragraph after this element. Not inherited.
label
- Start a new paragraph with a label sticking
out into the left margin. Values are: A, a, 1, I, i, bullet, square,
-, *, names of symbols (resp. auto-numbering uppercase letters,
lowercase letters, Arabic numbers, Roman numerals, lowercase Roman
numerals, bullets, squares, dashes, asterisks, WWW-icons). Not
inherited.
hide
- (Boolean). Text of element is not displayed.
Inherited.
title
- (Boolean). Text of element is added to
document's title. Inherited.
inline
- URL of something to display in-line at the
start of the element. Not inherited.
vmargin
- Extra space to add above and below an inline
object. Only useful in combination with inline. Inherited.
hmargin
- Extra space to add left and right of an
inline object. Only useful in combination with inline. Inherited.
ismap
- (Boolean) A click on this element should
generate a URL that includes the coordinates of the click, relative to
the upper left corner of the in-line illustration. Only useful in
combination with inline and anchor. Inherited.
valign
- Vertical alignment of an inline object.
Possible values: top, bottom, middle. Not inherited.
height
- Height of an inline object in pixels.
Takes priority over whatever the inline object itself suggests for a
height. Not inherited.
depth
- Depth below the baseline of an inline
object, in pixels. This overrides `valign'. Not inherited.
width
- Width of an inline object in pixels. Only
meaningful in combination with `inline'. Not inherited.
minimized
- (Boolean) The element is replaced by a
marker or button, which, when pressed, unfolds into the contents of
the element. Depending on the browser, this may be implemented as a
pop-up box, a new window, or expanding text. Not inherited.
caption
- Possible values: top, bottom, left,
right. The contents of the element should be rendered as a caption and
positioned in the indicated position. The object which it is a caption
for can be found by looking for an `inline' or `table' property on the
element itself or its ancestors. Not inherited.
id
- The (unique) ID of this element. This value
will nearly always be an attribute reference, such as `!ID'. Presence
of this property will cause the formatter to look for additional
property specifications starting with this ID. Example: if
`*id: !ID' works out, after resolving the attribute, as `*id: p101',
the formatter will look for `@p101.size', `@p101.family', etc. Not
inherited.
obeyspaces
- (Boolean). Spaces and line breaks are
not removed. Spaces have fixed width, REs (record ends) cause a line
break in the text. If the current value of `justify' is `full', it is
treated as `left' instead.
nowrap
- (Boolean) Don't break lines. If combined
with `obeyspaces', REs are treated as single spaces. Inherited.
insertbefore
- Text to insert just before the text
of the element, after any `parindent'. Not inherited.
insertafter
- Text to insert after the text of the
element. Not inherited.
flush
- Causes enough vertical space to be inserted
to end up below any floating figure in the left, right, or both side
tracks. Implies a paragraph break. Possible values: left, right, full.
Not inherited.
anchor
- Contains the URL of which this element is
the source anchor. Inherited.
anchorshape
- (Only meaningful in combination with
`anchor') Indicates that the anchor is a hotspot in the inline image
of the element or one of its ancestors. Possible values: rectangle,
circle, polygon. Not inherited.
anchorcoords
- Coordinates of the hotspot. Only
useful in combination with `anchorshape' and the interpretation
depends on the value of that property. Not inherited.
table
- (Boolean) Indicates that the element
constitutes a table. Not inherited.
tablerow
- (Boolean) Indicates that the element is
a table row. It is possible for an element to be both a `table' and a
`tablerow'. If neither the element itself nor one of its ancestors has
the `table' property, `tablerow' has no meaning. Not inherited.
tablecell
- (Boolean) Indicates that the element
forms one or more cells in a table. An element can be both a
`tablecell' and a `tablerow'. If neither the element itself nor one of
its ancestors has the `tablerow' property, `tablecell' has no meaning.
Not inherited.
rowspan
- The table cell spans the indicated number
of rows. Only meaningful in combination with `tablecell'. If not
present, 1 is assumed. Not inherited.
colspan
- The table cell spans the indicated number
of columns. Only meaningful in combination with `tablecell'. If not
present, 1 is assumed. Not inherited.
frame
- Lines around the table cell. Possible
values: any sequence of zero or more words from `left', `right',
`top', `bottom', `border'. `border' is equivalent to `left right top
bottom'. Inherited (but only results in any visible effect in elements
with the `tablecell' property). Default is no frame.
hyphenate
- (Boolean) the text in this element may
be hyphenated at line breaks. Initial value is true. Inherited.
language
- (ISO code for a language) This may
influence the hyphenation and maybe some other things (such as the
style sheet itself). Default is US English. Inherited.
target
- Target of a hyperlink (the fragment-ID of
a URL). Not inherited.
stylesheet
- The contents of the element are a
style sheet, applicable to the rest of the document. Possible values
are: `merge', `replace' and `override'. Any other value means that the
contents are not a style sheet. `Merge' means that the style rules are
added to the existing style, with the existing style taking precedence
in case of conflict. `Override' also adds the two together, but gives
precedence to the new rules. `Replace' completely removes the old
style sheet and uses the new one instead. Inherited.
math??
-
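
Two of the arithmetic rules in the list above, the cumulative `size'
and the `prebreak' maximum, can be sketched in Python. The function
names are hypothetical; the numbers are the ones used in the examples
under `size' and `prebreak'.

```python
def apply_size(inherited_size, spec):
    """Return the cumulative font size after applying a `size' value.

    A leading '+' or '-' marks an increment to the inherited size;
    any other value is taken as an absolute size.
    """
    spec = spec.strip()
    if spec.startswith(("+", "-")):
        return inherited_size + int(spec)
    return int(spec)


def pending_space(existing, prebreak):
    """Vertical space before an element: breaks are not added together,
    the larger of the existing space and the new `prebreak' wins."""
    return max(existing, prebreak)


# *SUP.size: -2, applied to a <SUP> nested inside another <SUP>.
size = 0
size = apply_size(size, "-2")    # inside the first <SUP>
size = apply_size(size, "-2")    # inside the nested <SUP>

# *OL.prebreak: 1.0 followed immediately by *LI.prebreak: 0.5.
space = pending_space(0.0, 1.0)  # <OL> opens
space = pending_space(space, 0.5)  # first <LI> opens
```

After these steps `size' is -4 and `space' is 1.0. Mapping the
cumulative size onto the limited set of fonts a browser actually has
is left out of the sketch.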
Example
Here is part of a style sheet for HTML:
! Style sheet for HTML documents
! -----------------------------
*id: !ID
*language: !LANG
*target: !ID
! Create a default margin of 3 em
!
HTML.leftindent: 3.0
HTML.justify: full
! <H1> is bold, centered, 2 sizes larger than the surrounding text,
! with lines above and below.
!
*H1.size: +2
*H1.bold: true
*H1.justify: center
*H1.rulebefore: 1.0
*H1.ruleafter: 1.0
*H1.prebreak: 2.0
*H1.postbreak: 1.0
*H1.noindent: true
! <H2> is bold, 1 size larger, sticking out into the margin,
! not justified and it has a line above it
!
*H2.size: +1
*H2.bold: true
*H2.justify: left
*H2.rulebefore: 0.3
*H2.prebreak: 1.0
*H2.postbreak: 1.0
*H2.leftindent: -3.0
*H2.parindent: 0.0
*H2.noindent: true
! <H3> is italic, 1 size larger, in left margin
!
*H3.size: +1
*H3.slant: true
*H3.justify: left
*H3.prebreak: 1.0
*H3.postbreak: 1.0
*H3.leftindent: -3.0
*H3.parindent: 0.0
*H3.noindent: true
! <H4> is bold
!
*H4.bold: true
*H4.justify: left
*H4.prebreak: 1.0
*H4.postbreak: 0.5
*H4.parindent: 0.0
*H4.noindent: true
! <H5> is bold and run-in (i.e., no postbreak)
!
*H5.bold: true
*H5.prebreak: 1.0
*H5.parindent: 0.0
! <H6> is italic and run-in
!
*H6.slant: true
*H6.prebreak: 1.0
*H6.parindent: 0.0
! <P>. Note absence of prebreak: because H5 and H6 are run-in
!
*P.parindent: 2.0
*P.postbreak: 0.0
! Various character-level elements
!
*U.underscore: 1
*S.strikeout: true
*TT.family: tt
*B.bold: true
*I.slant: true
*BIG.size: +1
*SMALL.size: -1
*EM.emphasis: 1
*STRONG.emphasis: 2
*CODE.family: tt
*SAMP.family: tt
*KBD.family: tt
*KBD.underscore: 1
*VAR.slant: true
*CITE.slant: true
*Q.insertbefore: `
*Q.insertafter: '
! <BR> forces a line break. Flush as in Netscape
!
*BR.empty: true
*BR.prebreak: 0.0
*BR.flush: !CLEAR
*BR.noindent: true
! <WBR> is another un-SGML-like element from Netscape
! (&sbsp; doesn't exist, I've made it up, cf. &shy;)
!
*WBR.empty: true
*WBR.insertbefore: &sbsp;
! <A> is a hyperlink
!
*A.textcolor: blue
*A.underscore: 1
*A.anchor: !HREF
*A.target: !NAME
! <IMG> in-line or floating illustration (Note double use of ALIGN)
!
*IMG.empty: true
*IMG.inline: !SRC
*IMG.valign: !ALIGN
*IMG.track: !ALIGN
*IMG.ismap: @ifmatch(!ISMAP, "ISMAP", true, false)
*IMG.width: !WIDTH
*IMG.height: !HEIGHT
! if images are not displayed:
! *IMG.insertbefore: !ALT
! <HR> horizontal rule
!
*HR.empty: true
*HR.prebreak: 0.5
*HR.rulebefore: 0.0
*HR.postbreak: 0.0
! <PRE> preformatted text
!
*PRE.prebreak: 0.5
*PRE.postbreak: 0.5
*PRE.family: tt
*PRE.justify: left
*PRE.width: !WIDTH
*PRE.obeyspaces: true
! <DL>
!
*DL.prebreak: 1.0
*DL.postbreak: 1.0
*DL.leftindent: 2.0
! <DT>
!
*DT.prebreak: 0.5
*DT.parindent: -2.0
*DT.bold: true
*DT.insertafter: \
! <DD>
!
*DD.postbreak: 0.5
! <OL>
!
*OL.prebreak: 1.0
*OL.postbreak: 1.0
*OL.leftindent: 2.0
! <UL>
!
*UL.prebreak: 1.0
*UL.postbreak: 1.0
*UL.leftindent: 2.0
! <OL> has numbered items <LI>: 1.A.I, bullets elsewhere
!
*LI.prebreak: 0.5
*LI.postbreak: 0.5
*OL.LI.label: 1
*OL*OL.LI.label: A
*OL*OL*OL.LI.label: I
*LI.label: bullet
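
How a formatter might choose among the `label' rules above can be
sketched in Python. This is a simplified stand-in for the X resource
matching rules, not a reimplementation of them: a `.' binds a name to
the very next level, a `*' may skip levels, and among the matching
rules the one naming the most levels wins (with ties going to the
earlier rule, which is an assumption). All function names are
hypothetical.

```python
import re


def parse_key(key):
    """Split 'A*B.C.prop' into [(loose, name), ...].

    loose is True when the name is preceded by '*'. A key that starts
    with a name is anchored at the document root.
    """
    return [("*" in sep, name)
            for sep, name in re.findall(r"([.*]*)([^.*]+)", key.strip())]


def matches(components, path):
    """True if every component can be aligned with the path, in order,
    ending exactly at the last entry of the path (the property name)."""
    def walk(ci, pi):
        if ci == len(components):
            return pi == len(path)
        loose, name = components[ci]
        # Tight binding: only the next level. Loose: any later level.
        stop = len(path) if loose else min(pi + 1, len(path))
        return any(path[i] == name and walk(ci + 1, i + 1)
                   for i in range(pi, stop))
    return walk(0, 0)


def lookup(rules, elements, prop):
    """Value of `prop' for the element at the end of `elements'
    (names from the root down), or None if no rule matches."""
    path = list(elements) + [prop]
    best = None
    for key, value in rules:
        components = parse_key(key)
        if matches(components, path):
            if best is None or len(components) > best[0]:
                best = (len(components), value)
    return None if best is None else best[1]


RULES = [
    ("HTML.leftindent", "3.0"),
    ("*OL.LI.label", "1"),
    ("*OL*OL.LI.label", "A"),
    ("*OL*OL*OL.LI.label", "I"),
    ("*LI.label", "bullet"),
]

in_ul = lookup(RULES, ["HTML", "BODY", "UL", "LI"], "label")
in_ol = lookup(RULES, ["HTML", "BODY", "OL", "LI"], "label")
in_ol_ol = lookup(RULES, ["HTML", "BODY", "OL", "LI", "OL", "LI"], "label")
```

With these rules an <LI> in a <UL> gets `bullet', an <LI> in an <OL>
gets `1', and an <LI> in a nested <OL> gets `A'.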
Discussion
Linking to style sheets
Some problems remain: how are style sheets associated with
documents? This is outside the scope of this article, but I would like
to offer some possibilities:
- In the LINK tag of HTML. This is unsatisfactory for several
reasons: (1) it is too late, the document has already started
before the link is found; (2) it doesn't work for non-HTML.
- In a new header line of the HTTP protocol. This is better,
but it relies on HTTP being used.
- As part of a MIME/multipart document.
- In the URL. A bad idea, not only because the style doesn't
really `belong' to the document, but also because the URL would
become too long.
- The other way round: a hyperlink contains not the URL of the
document, but of its style sheet, which in turn references the
document (in a new `document' property).
- As an attribute of A: <A HREF="doc.html" STYLE="doc.sty">
Conditional overrides
As said earlier, there must be some way for the user to selectively
override part of a style sheet. To keep the rest of the style sheet
intact, it may be necessary to introduce conditionals into the syntax:
! Effect of `delay image loading' on FIG element
!
*FIG.inline: !SRC
*FIG.hide: true
delay_images*FIG.inline:
delay_images*FIG.hide: false
!
! Indentation and justification made dependent on window width
!
*DL.leftindent: 3.0
narrow*DL.leftindent: 1.0
wide*DL.leftindent: 5.0
!
narrow*P.justify: left
!
! Use colours only if on colour screen
!
*A.textcolor: red
*A.textbackground: yellow
b&w*A.textcolor: white
b&w*A.textbackground: black
monochrome*A.textcolor: black
monochrome*A.textbackground: gray80
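
One way a browser might evaluate such conditional rules is sketched
below in Python. The sketch assumes that the name before the first `*'
is a condition, that the browser maintains a set of currently active
conditions, and that a rule with an active condition takes precedence
over the unconditional rule; none of this is fixed by the proposal,
and the function name is hypothetical.

```python
_UNSET = object()  # distinguishes "no rule" from an empty value


def select(rules, active, selector):
    """Value for `selector' (e.g. '*DL.leftindent') given the set of
    active condition names, or None if no rule applies.

    Only keys of the form '*X.prop' or 'condition*X.prop' are handled.
    """
    default = override = _UNSET
    for key, value in rules:
        condition, _, rest = key.partition("*")
        if "*" + rest != selector:
            continue
        if condition == "":
            default = value       # unconditional rule
        elif condition in active:
            override = value      # rule whose condition currently holds
    result = override if override is not _UNSET else default
    return None if result is _UNSET else result


RULES = [
    ("*FIG.inline", "!SRC"),
    ("delay_images*FIG.inline", ""),
    ("*DL.leftindent", "3.0"),
    ("narrow*DL.leftindent", "1.0"),
    ("wide*DL.leftindent", "5.0"),
]

normal = select(RULES, set(), "*DL.leftindent")
narrow = select(RULES, {"narrow"}, "*DL.leftindent")
delayed = select(RULES, {"delay_images"}, "*FIG.inline")
```

In a normal window the indent is 3.0, in a narrow one 1.0, and with
delayed image loading the `inline' value of FIG becomes empty.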
Other output devices
A somewhat related question is: how can the user (or the browser)
select the right style for a particular purpose, such as printing,
on-line display, or speech synthesis? These different purposes could be
encoded in the style language (allowing all style variants to be in
the same style sheet) or they could be handled externally, possibly in
the way the style sheet is linked from a document.
Restricted SGML
If a style sheet is to work as intended, the browser must ensure
that the proper nesting of elements is maintained. The parser must
insert missing elements. A formatter that doesn't maintain a parsing
context based on the HTML DTD, and only formats the elements that are
named explicitly in the text, will give strange results.
If the document is not coded in HTML, but in some other SGML
format, the formatter has to assume that all tags are present, except
perhaps for a few closing tags that can be inferred from the rule that
elements must be properly nested. This is the only rule that the
parser can use.
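
Inferring closing tags from the nesting rule alone can be sketched in
Python as follows. The event format and the function name are
hypothetical, and two choices are assumptions of the sketch: an end
tag for an element that is not open is ignored, and everything still
open at the end of the input is closed. Elements with the `empty'
property are not handled here.

```python
def close_missing(events):
    """Turn a stream of ("open", NAME) / ("close", NAME) events into a
    properly nested one by inserting the end tags that were left out."""
    stack = []  # names of the currently open elements, outermost first
    for kind, name in events:
        if kind == "open":
            stack.append(name)
            yield (kind, name)
        elif name in stack:
            # Close everything opened since NAME, innermost first,
            # and finally NAME itself.
            while True:
                top = stack.pop()
                yield ("close", top)
                if top == name:
                    break
        # else: end tag without a matching open element; ignored.
    # End of input: close whatever is still open.
    while stack:
        yield ("close", stack.pop())


# The end tag of PARA is missing; it is inferred when SEC closes.
events = [("open", "DOC"), ("open", "SEC"), ("open", "PARA"),
          ("close", "SEC"), ("close", "DOC")]
nested = list(close_missing(events))
```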
Explicit lengths
Dimensions have implicit units (ems, line heights, pixels)
depending on context. Maybe a notation with explicit units should be
allowed as well.
Missing properties
There are still some properties missing:
- background colour/pattern, not just behind the text, but behind an
element
- change bars and other vertical lines
- language-dependent layout (hyphenation, quotes)
- overlapping elements, overlays