2 - A comparison of printed and WWW documents
2.2 - A feature by feature discussion
Here we identify the major properties of both media and describe how WebMaker handles each of them.
Superficially, the structure of paper documents is significantly different from that of WWW documents. Printed documents are deemed to be linear while hypertext is non-linear. Despite this, few printed documents are read in a linear fashion. This necessitates the provision of non-linear navigation aids such as tables of contents and alphabetic keyword indices and suggests that such documents are well suited to benefit from a hypertext environment.
Some printed documents are in fact essentially linear and should be read in a linear fashion, for example a novel read in a comfortable chair. However, many conventional printed documents, such as reference manuals, newspapers, and indexed works such as dictionaries and encyclopedias, could be much more usable in a hypertext format.
WebMaker can read configuration information to decide how to break up the text into the different HTML nodes. Both linear and hierarchical navigation aids may be supplied automatically, for example links to the linear next and previous pages, to the topmost node, to the parent node, to tables of contents, indices and so on. Non-hierarchical hypertext links to the same Web document or to other resources may also be included.
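The navigation aids described above can be sketched in a few lines. This is only an illustration of the idea, not WebMaker's implementation: the `Node` class, the file names and the order of the labels in the panel are assumptions made for the example.

```python
class Node:
    """One HTML node (a single file) of a generated Web document (hypothetical model)."""

    def __init__(self, title, filename):
        self.title = title
        self.filename = filename
        self.parent = None
        self.children = []

    def add(self, child):
        """Attach a child node and return it, so that trees can be built inline."""
        child.parent = self
        self.children.append(child)
        return child


def linear_order(root):
    """Return all nodes in reading order: each parent before its children."""
    order = [root]
    for child in root.children:
        order.extend(linear_order(child))
    return order


def navigation_links(root, node):
    """Map each navigation label to the file it should point to for this node."""
    order = linear_order(root)
    position = order.index(node)
    links = {}
    if position + 1 < len(order):
        links["Next"] = order[position + 1].filename
    if position > 0:
        links["Prev"] = order[position - 1].filename
    if node.parent is not None:
        links["Up"] = node.parent.filename
    links["Top"] = root.filename
    return links


def navigation_panel(root, node):
    """Render the links of one node as a line of HTML anchors."""
    return " ".join(
        f'<A HREF="{target}">[{label}]</A>'
        for label, target in navigation_links(root, node).items()
    )
```

The linear links follow the reading order of the paper document, while the Up and Top links follow its hierarchy, so both kinds of aid come from the same tree.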
The basic language building blocks conveying information, i.e. the words, must provide the common ground between the two forms of a document. This is not to say that there are no subtle differences in wording between ideal paper and WWW representations of a document. However, without the restriction that the textual body is the same in both formats, a comparison of the two media is meaningless and converters between them could not exist.
WebMaker translates text word for word. For the minor cases where some difference in the text is desirable, as for example in the wording of this cross reference to References, a simple solution is provided: such differences may be controlled by marking text as conditional for paper or conditional for WWW.
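The effect of conditional text can be illustrated with a small sketch. The condition names `"paper"` and `"www"` and the list-of-pairs representation are assumptions made for this example; they are not FrameMaker's or WebMaker's own data model.

```python
def render(segments, target):
    """Join the segments that are unconditional or conditional on `target`.

    `segments` is a list of (condition, text) pairs; a condition of None
    marks text that is common to both media.
    """
    return "".join(
        text
        for condition, text in segments
        if condition is None or condition == target
    )


# A cross reference whose wording differs between the two media.
sentence = [
    (None, "See "),
    ("paper", "the References section at the end of this paper"),
    ("www", "the References node"),
    (None, " for details."),
]
```

Rendering `sentence` with `target="paper"` keeps the first variant and rendering it with `target="www"` keeps the second, while the shared words are untouched.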
The character set available for printing is wide; with most systems additional typefaces may be incorporated as needed. The character set available in HTML is limited to ISO Latin 1, requiring special software for other sets, for example Japanese.
This limitation will become considerably less relevant, if it does not disappear altogether, with the implementation of HTML+ [3]. Until then, WebMaker includes in the generated Web files an image representation of any FrameMaker characters that are not in ISO Latin 1.
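The fallback to images can be sketched as follows. The sketch assumes that the text is available as a Python string and that each character outside ISO Latin 1 already has an image file named after its code point; the `char_XXXX.gif` naming scheme is hypothetical and is not taken from WebMaker.

```python
from html import escape


def is_latin1(ch):
    """ISO Latin 1 (ISO 8859-1) covers exactly the code points 0 to 255."""
    return ord(ch) <= 0xFF


def split_runs(text):
    """Group the text into (run, representable) pairs of consecutive characters."""
    runs = []
    for ch in text:
        ok = is_latin1(ch)
        if runs and runs[-1][1] == ok:
            runs[-1] = (runs[-1][0] + ch, ok)
        else:
            runs.append((ch, ok))
    return runs


def to_html(text):
    """Emit Latin 1 runs as text and every other character as an inline image."""
    parts = []
    for run, ok in split_runs(text):
        if ok:
            parts.append(escape(run))
        else:
            # Hypothetical naming scheme: one pre-rendered GIF per character.
            parts.extend(f'<IMG SRC="char_{ord(ch):04x}.gif">' for ch in run)
    return "".join(parts)
```

Characters within ISO Latin 1, accented letters included, pass through as text; only the remainder is replaced by images.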
WWW documents may reference information sources including other documents and active data sources such as on-line news conferences, mail and databases. References are not limited to the scope of a single document or file system. This is made possible by the use of Uniform Resource Locators, or URLs, each of which describes a data object and contains sufficient information for its retrieval. This is the fundamental WWW principle.
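To make concrete what a URL contains, the following sketch takes a few URLs apart with Python's standard `urllib.parse` module. The example addresses are placeholders on reserved example domains, not resources mentioned in this paper.

```python
from urllib.parse import urlsplit


def describe(url):
    """Return the parts of a URL that tell a client how to retrieve the object."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # access method, e.g. http, news, mailto
        "host": parts.netloc,        # server holding the object, if any
        "path": parts.path,          # name of the object for that access method
        "fragment": parts.fragment,  # location inside the retrieved document
    }


examples = [
    "http://www.example.org/docs/manual.html#intro",
    "news:comp.infosystems.www",
    "mailto:someone@example.org",
]
```

The scheme selects the access method, which is what allows a single reference syntax to cover documents, news groups and mail alike.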
In comparison, references in paper documents redirect the reader to other marked (numerically or otherwise) locations in a document. References in WWW are more convenient as they do not require the reader to perform any action to retrieve referenced data other than a click of the mouse.
WebMaker translates all cross references within the same FrameMaker document into hypertext links. HTML hypertext links may be inserted anywhere in the text by a simple system that is completely independent of the hypertext facilities in Frame. Two special FrameMaker markers (Type 25) are inserted to delineate the anchor text, with the first one also containing the URL of the destination.
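The marker convention can be illustrated with a short sketch. It models a paragraph as a sequence of plain strings and marker objects, which is an assumption made for the example; the `Marker` class below is a stand-in and does not reflect how FrameMaker or MIF actually store markers.

```python
from html import escape

LINK_MARKER_TYPE = 25  # marker type named in the text above


class Marker:
    """Stand-in for a FrameMaker marker; `text` is the marker's own content."""

    def __init__(self, marker_type, text=""):
        self.marker_type = marker_type
        self.text = text


def paragraph_to_html(items):
    """Turn each pair of link markers into an HTML anchor around the text between them."""
    out = []
    inside_anchor = False
    for item in items:
        if isinstance(item, Marker):
            if item.marker_type != LINK_MARKER_TYPE:
                continue  # other marker types are ignored in this sketch
            if inside_anchor:
                out.append("</A>")
            else:
                # The opening marker carries the URL of the destination.
                out.append(f'<A HREF="{escape(item.text, quote=True)}">')
            inside_anchor = not inside_anchor
        else:
            out.append(escape(item))
    if inside_anchor:
        raise ValueError("unterminated link: the closing marker is missing")
    return "".join(out)


paragraph = [
    "See the ",
    Marker(LINK_MARKER_TYPE, "http://www.example.org/"),
    "project page",
    Marker(LINK_MARKER_TYPE),
    ".",
]
```

Because the two markers only delimit the anchor text, the words themselves stay in the master document and appear unchanged on paper.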
Cross references between FrameMaker documents are not currently being translated.
Graphics in FrameMaker may be native, imported by copying into the file, or imported by reference. Some WWW browsers can display images, normally in GIF format, in much the same way as they display character entities within text.
WebMaker adopts the method implemented by Jon Stephenson von Tetzchner in the fm2html converter to translate graphics. All three FrameMaker graphics cases are translated by first extracting each one into a separate MIF file, then translating to PostScript, cropping and translating again to GIF (default format) or to another format specified by the user, per graphic, in the Frame master file. An image may also be specified to be the destination of hypertext links, in which case a clickable iconised image that points to the external image is inserted into the text. Other configurable actions regarding graphics are the format of external image files and the sizes, in terms of the original, of the generated icon and of the actual image.
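The two ways of placing a translated graphic in a node can be sketched as below. The file naming (`fig3.gif`, `fig3_icon.gif`), the function names and the expression of sizes as percentages are assumptions made for the illustration; the conversion from MIF through PostScript to GIF is performed by external tools and is not shown.

```python
def scaled(size, percent):
    """Scale a (width, height) pair to a percentage of the original, at least 1 pixel."""
    width, height = size
    return (
        max(1, round(width * percent / 100)),
        max(1, round(height * percent / 100)),
    )


def graphic_html(basename, external=False, fmt="gif"):
    """Return the HTML for one graphic.

    Inline graphics are shown directly in the text. External graphics are
    represented by a clickable icon that points to the full image file.
    """
    image = f"{basename}.{fmt}"
    if not external:
        return f'<IMG SRC="{image}">'
    icon = f"{basename}_icon.gif"  # hypothetical naming of the generated icon
    return f'<A HREF="{image}"><IMG SRC="{icon}"></A>'
```

For example, `graphic_html("fig3", external=True, fmt="ps")` links a GIF icon to a PostScript file, which keeps the node light while the full image remains one click away.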
WebMaker also allows non-textual WWW objects, such as graphics and animation that contain hypertext links, sound and interactive forms, to be included in the FrameMaker master. Any text in the FrameMaker master tagged with the HTML conditional text tag is translated as raw HTML. It is not printed if the text conditional under HTML is hidden.
As HTML does not support mathematics, tables and figures, these cannot be translated directly. WebMaker adopts a solution similar to the one first introduced by Nikos Drakos in the LaTeX2HTML converter [4]. The solution is to treat highly formatted objects for which there is no equivalent in HTML as pictures. Each one is therefore isolated into a separate MIF file, printed to a PostScript file, cropped and translated to GIF, and then put back in the generated Web document.
The implementation of HTML+ will allow a real translation of these objects.
WWW documents may offer powerful indexing features, which are especially useful for reference material that is normally consulted for very particular information. The ISINDEX tag enables users to send a string of characters to a server for interpretation. The actions taken by the server depend on the external program that is selected to run on the server. The implementation of forms in HTML+ significantly enhances this feature.
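As an illustration of what is sent to the server, the sketch below builds the kind of query that a browser issues for a searchable index: the keywords are appended to the document's URL after a question mark and separated by plus signs. The function names and the example URL are hypothetical, and the interpretation of the keywords is left to whatever program runs on the server.

```python
from urllib.parse import quote, unquote


def isindex_query_url(document_url, search_text):
    """Append the search keywords to the URL of a searchable index document."""
    keywords = search_text.split()
    # Each keyword is percent-encoded so that a literal "+" is not read as a separator.
    return document_url + "?" + "+".join(quote(word, safe="") for word in keywords)


def keywords_from_query(query):
    """Recover the keywords on the server side from the part after the '?'."""
    return [unquote(part) for part in query.split("+")]
```

A search for `hypertext links` on `http://www.example.org/index.html` therefore requests `http://www.example.org/index.html?hypertext+links`.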
An author may specify whether a generated Web document is to be a searchable index.
Paper documents are defined by physical size. HTML provides no means to mark the perimeter of a document, or to indicate whether a node (a single HTML file) is part of a given Web document.
WebMaker uses such paper intrinsic information to organise the generated Webs by inserting a navigation panel in each node. This helps a reader to know exactly which page of the document and which document are being browsed.
Page layout information belongs to the realm of the printed paper. This may include information for headers, footers, page numbering, text columns and so on. This information is ignored by WebMaker.
WWW documents may continually change without causing the need for a new copy, i.e. the document would still be accessed by the same URL. This implies that a WWW document may contain general information about a given subject that is constant as well as continuously changing information that can always be kept up to date.
Printed documents are frozen; modifications will necessitate a new physical copy.
This property of WWW does not translate into any WebMaker requirement. However, the maintenance of a regularly updated document is considerably simplified if the master is kept in FrameMaker and the Web version is regenerated with WebMaker.
First International WWW Conference, May 1994
BR, MR - CERN PTG - 31 May 1994