Rtfspeci
RTF Syntax
The Rich Text Format (RTF) standard is a method of encoding formatted text and graphics for easy transfer
between applications. Currently, users depend on special translation software to move word processing documents
between different DOS applications, and between DOS applications and Apple Macintosh applications.
The RTF standard provides a standard format for text and graphics interchange that can be used with different
output devices, operating environments, and operating systems. RTF uses the ANSI, Macintosh, or IBM PC
character set to control the representation and formatting of a document, both on the screen and in print. With the
RTF standard, documents composed under different operating systems and with different software applications
can be transferred between those operating systems and applications.
An RTF file consists of unformatted text, "control words," "control symbols," and "groups." A standard RTF file
consists of only 7-bit ASCII characters for ease of transport.
A "control word" is a specially formatted command that RTF uses to mark printer control codes and information
that applications use to manage documents. A control word consists of a backslash followed by an alphabetic
string and a delimiter, as shown in the following example:
\rtf1…
ABC
The delimiter can be a space or one or more nonalphabetic characters. If a numeric parameter immediately
follows the control word, this parameter is the delimiter, and is itself followed by a delimiter, also consisting of a
space or one or more nonalphabetic characters.
Rich Text Format 2
A "control symbol" consists of a backslash followed by a single, nonalphabetic character. For example, \~
represents a nonbreaking space. Control symbols take no delimiters.
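The two token shapes described above can be sketched with one regular expression. This is a minimal illustration, not a complete RTF reader: the names TOKEN and control_tokens are hypothetical, and accepting a leading minus sign on the numeric parameter is an assumption that the text above does not state.

```python
import re

# Control word: backslash, alphabetic string, optional numeric parameter,
# and at most one space consumed as the delimiter.
# Control symbol: backslash plus a single nonalphabetic character, no delimiter.
# The optional "-" before the digits is an assumption (signed parameters).
TOKEN = re.compile(
    r"\\(?P<word>[A-Za-z]+)(?P<param>-?\d+)? ?"
    r"|\\(?P<symbol>[^A-Za-z])"
)

def control_tokens(rtf):
    """Yield ('word', name, param) or ('symbol', char, None) in file order."""
    for m in TOKEN.finditer(rtf):
        if m.group("word"):
            param = m.group("param")
            yield ("word", m.group("word"),
                   int(param) if param is not None else None)
        else:
            yield ("symbol", m.group("symbol"), None)

sample = r"{\rtf1\pard\plain\b Bold\b0\~text}"
tokens = list(control_tokens(sample))
```

Plain text and braces are skipped by this sketch; a real reader would also have to track groups and destinations.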
A "group" consists of text and control words or control symbols enclosed in braces ({}). Formatting specified
within a group affects only the text within the group. Generally, text within a group inherits any formatting of the
text preceding the group. However, Microsoft implementations of RTF assume that the footnote, header/footer,
and annotation groups (described later in this document) do not inherit formatting of the preceding text.
Therefore, to ensure that these groups will always be formatted correctly, you should set the formatting within
these groups to the default with the \sectd, \pard, and \plain control words, and then add any desired
formatting.
Any other characters in the file are plain text. As mentioned above, the backslash (\) and braces ({}) have special
meaning in RTF. To use these characters as text, precede them with a backslash.
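A writer could apply the escaping rule above with a helper along these lines. The function name escape_rtf_text is hypothetical, and the sketch only covers the three special characters named in the text; it does nothing for characters outside 7-bit ASCII.

```python
def escape_rtf_text(text):
    """Precede each backslash and brace with a backslash."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)

escaped = escape_rtf_text("set {a, b} \\ c")
```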
Software that takes a formatted file and turns it into an RTF file is called a "writer." Software that translates an
RTF file into a formatted file is called a "reader." An RTF writer separates the application's control information
from the plain text and writes a new file containing the plain text and the RTF groups associated with that text. An
RTF reader does the converse of this procedure.
An entire RTF file is considered a group and must be enclosed in braces. The control word \rtfn must follow
the first open brace. The numeric parameter identifies the version of the RTF standard used. The RTF standard
described in this document corresponds to version 1.
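A reader might check the file header before doing anything else. The sketch below is a loose check under stated assumptions: the name rtf_version is hypothetical, surrounding white space is tolerated, and the test for the closing brace does not verify that the braces in between are balanced.

```python
import re

def rtf_version(data):
    # Return the numeric parameter of the \rtfN control word that follows
    # the first open brace, or None if the text does not look like an
    # RTF group at all.
    stripped = data.strip()
    m = re.match(r"\{\\rtf(\d+)", stripped)
    if m is None or not stripped.endswith("}"):
        return None
    return int(m.group(1))

version = rtf_version(r"{\rtf1 Hello}")
```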
The order of groups within an RTF file is important. Each group specifies the part of the document affected by the
group and the different attributes of that text. An RTF file must begin with the following two control words in the
following order:
• RTF version
• Character set
The RTF file can also include groups for fonts, styles, screen color, pictures, footnotes, annotations, headers and
footers, summary information, fields, and bookmarks, as well as document, section, paragraph, and character
formatting properties. If the font, style, screen color, and summary information groups and document formatting
properties are included, they must precede the first plain text character in the document. If included, the group for
fonts should precede the group for styles.
The groups are discussed in the following sections. If a group isn't used, it can be omitted.
Certain groups, referred to as "destinations," mark the beginning of a collection of related text. An example of this
is the \footnote group, where the footnote text follows the control word. Destinations added after the RTF
specification published in the March 1987 Microsoft Systems Journal may be preceded by the control symbol \*.
This control symbol identifies destinations whose related text should be ignored if the RTF reader does not
recognize the destination. RTF writers should follow this convention when adding new control words.
Destinations whose related text should be inserted into the document even if the destination is not recognized
should not use \*. In this document, all destinations that use \* will be shown with \* as part of the control
word.
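The \* convention gives a reader a simple rule for unknown destinations. The sketch below shows only the decision, assuming the reader has already isolated the text of one group; the names STARRED, starred_destination, and should_skip are hypothetical.

```python
import re

# Matches the start of a group of the form {\*\name ...
STARRED = re.compile(r"\{\\\*\\([A-Za-z]+)")

def starred_destination(group):
    """Return the destination name if the group opens with \\*, else None."""
    m = STARRED.match(group)
    return m.group(1) if m else None

def should_skip(group, known):
    """True when the group is a starred destination the reader does not know."""
    name = starred_destination(group)
    return name is not None and name not in known

known_destinations = {"annotation"}
skip_atnid = should_skip(r"{\*\atnid JD}", known_destinations)
skip_footnote = should_skip(r"{\footnote Some text}", known_destinations)
```

An unrecognized destination without \* is not skipped, so its text still reaches the document, as the paragraph above requires.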
A font is defined by its name, a font number, and a font family, as shown in the following example. Semicolons
are used as delimiters between fonts.
ABC D
A Control word
B Font number
C Font family
D Font name
The font numbers represent the full font definitions in the group, and vary with each document. The font families
are listed below:
If an RTF file uses a default font, the default font number is specified with the \deffn control word, which must
precede the font table group. The RTF writer supplies the default font number used in the creation of the
document as the numeric argument. The RTF reader then translates this number through the font table into the
most similar font available on the reader's system.
In some applications, styles are based on, or are the basis for, other styles. In these cases, two other control words
can be used:
An example of an RTF style sheet and styles is shown in the following example. In this example, Postscript is
declared but not used. Some of the control words in this example are discussed in the following sections.
\widowctrl\ftnbj\ftnrestart\sectd\linex0\endnhere
\pard\plain\fs20 This is Normal style.
\par\pard\plain\s1
B—This is right justified. I call this style FLUSHRIGHT.
\par\pard\plain\s2
This is an indented paragraph. I call this style IND. It produces a hanging
indent.
\par}
A Style sheet
B Styles applied to text
Screen colors, character colors, and other color information are contained in the color table group. The control
word \colortbl begins this group. Values for red, green, blue, and the foreground and background colors are
shown in the following list. These parameter values correspond to the color indexes used by Microsoft Windows
(0-255). Each color table entry is defined by the amount of red, green, and blue it has. For more information on
color setup, see your Windows documentation.
Each definition must be delimited by a semicolon, even if the definition is omitted. If a color definition is omitted,
the RTF reader uses its default color. In the example below, three colors are defined. The first color is omitted, as
shown by the semicolon following the \colortbl control word.
{\colortbl;\red0\green0\blue0;\red0\green0\blue255;}
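The semicolon rule can be illustrated with a small parser for the color table group. This is a sketch under stated assumptions: the name parse_colortbl is hypothetical, an omitted definition is reported as None so the caller can substitute its default color, and a component missing from a partial definition is treated as 0.

```python
import re

def parse_colortbl(group):
    """Return one (red, green, blue) tuple, or None, per color table entry."""
    m = re.fullmatch(r"\{\\colortbl(.*)\}", group.strip(), re.DOTALL)
    if m is None:
        raise ValueError("not a \\colortbl group")
    # Every definition ends with a semicolon, so the piece after the
    # last semicolon is empty and is dropped.
    entries = m.group(1).split(";")[:-1]
    colors = []
    for entry in entries:
        parts = dict(re.findall(r"\\(red|green|blue)(\d+)", entry))
        if not parts:
            colors.append(None)  # omitted definition: reader's default color
        else:
            colors.append(tuple(int(parts.get(k, 0))
                                for k in ("red", "green", "blue")))
    return colors

table = parse_colortbl(r"{\colortbl;\red0\green0\blue0;\red0\green0\blue255;}")
# table == [None, (0, 0, 0), (0, 0, 255)]
```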
The following example defines a block of text in color (where supported). Note that the cf/cb index is the index of
an entry in the color table, which represents a red/green/blue color combination.
If the file is translated by software that does not display color, this group is ignored.
Pictures
An RTF file can include picture files composed with other applications. These files are in hexadecimal (default)
or binary format. The control word \pict begins this group. Control words that define and describe the picture
parameters follow the \pict control word.
These control words are listed in the table that follows. Some measurements in this table are in twips; a twip is
one-twentieth of a printer's point. The control words for picture border patterns (\brdrs, \brdrdb, \brdrth,
\brdrsh, \brdrdot, and \brdrhair) are ignored when translated into Microsoft Word for the Macintosh,
which uses character properties to make borders.
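The twip definition above translates directly into conversion helpers. The function names are hypothetical, and the figure of 72 points per inch is an assumption that the text does not state.

```python
TWIPS_PER_POINT = 20   # from the definition: a twip is 1/20 of a point
POINTS_PER_INCH = 72   # assumption: the usual printer's point

def twips_to_points(twips):
    return twips / TWIPS_PER_POINT

def points_to_twips(points):
    return round(points * TWIPS_PER_POINT)

def twips_to_inches(twips):
    return twips / (TWIPS_PER_POINT * POINTS_PER_INCH)

one_inch_in_points = twips_to_points(1440)
```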
The \wbitmap control word is optional; if neither \wmetafile nor \macpict is specified, the picture is
assumed to be a Windows bitmap.
Be careful with spaces following control words when dealing with pictures in binary format. When reading files,
RTF considers the first space the delimiter and subsequent spaces part of the document text. Therefore, any extra
white space is attached to the picture, with unpredictable results.
RTF writers should not use the carriage-return-line-feed (CRLF) combination to break up pictures in binary
format. In this case, the CRLF will be treated as literal text and considered part of the picture data.
The picture in hexadecimal or binary format follows the picture group control words. The following example
illustrates the group format and the result.
{\pict\wbitmap0\picw170\pich77\wbmbitspixel1\wbmplanes1
\wbmwidthbytes22\picwgoal505
\pichgoal221
\picscalex172
\picscaley172
4912000000000273023d1101a030
3901000a000000000273023d98
0048000200000275
0240000200010275023e000000000
273023d000002b90002b90002
b90002b90002b9
0002b90002b90002b90002b90002b90002
b92222b90002b90002b90
002b90002b9
D002b90002b90002b90002b9000
A Source
B Width
C Height
D Bits per pixel
E Bitmap planes
F Width of picture in bytes
G Desired picture width
H Desired picture height
I Horizontal scaling value
J Vertical scaling value
K Hexadecimal data
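Hexadecimal picture data can be turned back into bytes with a few lines. This sketch assumes that white space and line breaks inside hexadecimal data are insignificant, as the multi-line example above suggests; the name decode_hex_picture is hypothetical, and the sample digits are only an excerpt-sized illustration, not a complete bitmap.

```python
def decode_hex_picture(hex_text):
    """Decode hexadecimal picture data, ignoring any white space."""
    digits = "".join(hex_text.split())
    # bytes.fromhex raises ValueError for a stray non-hex character
    # or an odd number of digits, which flags damaged data early.
    return bytes.fromhex(digits)

data = decode_hex_picture("4912 0000\r\n0273")
```

The same approach does not apply to binary picture data, where every byte after the delimiter belongs to the picture.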
Footnotes
The group containing footnote text begins with the control word \footnote. Footnotes are anchored to the
character that immediately precedes the footnote group. If automatic footnoting is defined, the group can be
preceded by a footnote reference character, identified by the control word \chftn.
Mead's landmark study has been amply annotated.1 It was her work in America during the Second World War,
however, that forms the basis for this paper. As others have noted,2 this period was a turning point for Margaret
Mead.
A Footnotes
See "Section Formatting Properties," "Document Formatting Properties," and "Special Characters" later in this
document for other control words relating to footnotes.
Annotations
The group containing annotation text begins with the control word \*\annotation. Annotations are anchored
to the character that immediately precedes the annotation group. The group must be preceded by an annotation
reference character, identified by the control word \chatn, which itself must be preceded by a group that begins
with the control word \*\atnid, and contains the identification text for the author of the annotation.
Headers and footers can be defined for each section. If none is defined for a given section, the headers and footers
from the previous section (if any) are used.
The control words \header and \footer can be replaced by the following control words, as appropriate:
Information
The RTF file can also contain an information group, which is translated but not displayed with the text. This
information can include the title, author, key words, comments, and other information specific to the file. This
information can be used when a document management utility is available.
This group begins with the control word \info. Some applications, such as Word, ask a user to type this
document information when saving the document in native format. When the document is then saved or translated
into RTF, the RTF writer specifies this information using the following control words. These control words are
destinations, and should be enclosed in braces ({}).
The RTF writer may automatically enter other control words, as shown in the following list:
Entries without the n parameter have the \yr \mo \dy \hr \min format. An example of an information
group follows.
Fields
The field group contains the text of Word fields. The field group begins with the control word \field.
The following control words can follow the \field control word:
Two subgroups are available within the \field group. They must be enclosed in braces ({}) and begin with the
following control words:
The \fldrslt control word should be included even if no result has been calculated. This simplifies the RTF
reader's task, because even readers that do not recognize fields can generally include the value of the \fldrslt
group in the document.
Index Entries
The index entry group begins with the control word \xe. Following this control word is the text of the index entry
and other, optional control words that further define the index entry.
If the text of the index entry is not formatted as hidden text with the \v control word (see "Character Formatting
Properties," later in this document), the text is put into the document as well as into the index. Similarly, the text
of the \txe subgroup, described later, becomes part of the document if it is not formatted as hidden text.
The following control words are destinations within the \xe group and are followed by text arguments. These
control words and their arguments must be enclosed in braces ({}):
As with index entries, text that is not formatted as hidden with the \v character formatting control word should be
put into the document.
Bookmarks
This group contains two control words: \*\bkmkstart, to indicate the start of the specified bookmark, and
\*\bkmkend, to indicate the end of the specified bookmark. A bookmark is shown in the following example:
…
\pard\plain \fs20 Kuhn believes that science, rather than discovering in
experience certain structured relationships, actually creates (or already
participates in) a presupposed structure to which it fits the data.
{\*\bkmkstart paradigm}Kuhn calls such a presupposed structure a
paradigm.{\*\bkmkend paradigm}
…
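A writer could wrap a run of text in a bookmark with a helper like the following. The name bookmark is hypothetical; the sketch uses the \*\bkmkstart and \*\bkmkend control words named above and assumes the body has already been escaped for RTF.

```python
def bookmark(name, body):
    """Surround already-escaped RTF text with bookmark start and end groups."""
    if any(ch in "\\{}" for ch in name):
        raise ValueError("bookmark name must not contain \\, { or }")
    return ("{\\*\\bkmkstart " + name + "}"
            + body
            + "{\\*\\bkmkend " + name + "}")

marked = bookmark("paradigm", "Kuhn calls such a presupposed structure a paradigm.")
```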
The following control words specify document formatting. If you omit a control word, RTF uses the default value
shown in parentheses. Measurements are in twips.
Absolute-Positioned Objects
These paragraph formatting control words specify the location of the paragraph on the
page.
…
\par \pard \pvpg\phpg\posxc\posyt\absw5040\dxfrtext173 abs pos para1
\par \pard \phmrg\posxo\posyc \dxfrtext1152 abs pos para2
…
A Text to be positioned
Tables
A table is a collection of paragraphs. A table row is a continuous sequence of paragraphs partitioned into cells.
The last paragraph of a cell is terminated by a cell mark (the \cell control word), and the row is terminated by a
row mark (the \row control word). There is no RTF table group; the \intbl paragraph formatting control word
identifies the paragraph as part of a table.
…
\par\trowd\trqc\trgaph108\trrh280\trleft36
\clbrdrt\brdrth\clbrdrl\brdrth\clbrdrb\brdrdb
\clbrdrr\brdrdb\cellx3636\clbrdrt\brdrth
\clbrdrl\brdrdb \clbrdrb\brdrdb \clbrdrr\brdrdb
\cellx7236\clbrdrt\brdrth\clbrdrl\brdrdb
\clbrdrb\brdrdb\clbrdrr\brdrdb\cellx10836\pard \intbl
\cell \pard \intbl \cell \pard \intbl \cell \pard \intbl \row
\trowd \trqc\trgaph108\trrh280\trleft36 \clbrdrt\brdrdb
\clbrdrl\brdrth \clbrdrb \brdrsh\brdrs \clbrdrr\brdrdb
\cellx3636\clbrdrt\brdrdb \clbrdrl\brdrdb
\clbrdrb\brdrsh\brdrs \clbrdrr\brdrdb
\cellx7236\clbrdrt \brdrdb \clbrdrl \brdrdb
\clbrdrb\brdrsh\brdrs \clbrdrr\brdrdb \cellx10836\pard
\intbl \cell \pard \intbl \cell \pard \intbl \cell \pard
\intbl \row \pard
…
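Stripped of borders and alignment, the row structure in the example reduces to a row definition followed by one \intbl paragraph per cell. The sketch below builds such a row. The name table_row is hypothetical, cell text is assumed to be escaped already, and reading \cellxN as the right edge of a cell in twips is an assumption taken from the published RTF specification rather than from the text above.

```python
def table_row(cells, right_edges):
    """Build one table row: row defaults, cell edges, cell paragraphs, row mark."""
    if len(cells) != len(right_edges):
        raise ValueError("need exactly one right edge per cell")
    parts = ["\\trowd"]
    # One \cellxN per cell (assumed: N is the cell's right edge in twips).
    parts += ["\\cellx%d" % edge for edge in right_edges]
    for text in cells:
        # Each cell paragraph is marked \intbl and ends with a cell mark.
        parts.append("\\pard\\intbl " + text + "\\cell")
    # The row ends with a row mark, as in the example.
    parts.append("\\pard\\intbl\\row")
    return "".join(parts)

row = table_row(["A", "B"], [3636, 7236])
```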
The last group controls character formatting properties. A control word preceding plain text turns on the specified
attribute. Some control words (indicated by an asterisk following the description) can be turned off by the control
word followed by a zero (0). For example, \b turns bold on, while \b0 turns bold off.
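The on/off convention can be sketched as a pair of small helpers. Both names are hypothetical. The trailing space after the closing control word is there as a delimiter, so that following text beginning with a letter or digit is not read as part of the control word.

```python
def toggle(word, on):
    """Return the control word that turns an attribute on, or off with 0."""
    return "\\" + word + ("" if on else "0")

def bold_run(text):
    """Wrap already-escaped text so that only this run is bold."""
    return toggle("b", True) + " " + text + toggle("b", False) + " "

run = bold_run("Warning")
```

This only applies to control words that the specification marks as accepting the zero form.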
Special Characters
Special RTF characters are listed below. If a character is not recognized by the RTF reader, it is ignored and the
text following it is considered plain text. The RTF specification is flexible enough to allow new characters to be
added for interchange with other software.
An ASCII 9 will be accepted as a tab character. The code \<ASCII10> (line feed) or \<ASCII13> (carriage
return) is treated as the control word \par. You must include the backslashes or RTF will ignore the control
word. You may also want to insert a carriage-return-line-feed pair (without backslashes) at least every 255
characters for better text transmission over communication lines.