This is a proposal for a VRML design based on the Open Inventor file format. It has been heavily influenced by the discussions of what VRML should be that have taken place on the VRML mailing list; please visit the VRML home page to get up-to-date.
This design is a subset of the Inventor file format, with compatible extensions. Because saying "the Open Inventor ASCII file format" is annoying, I will just use the phrase "Inventor" in this proposal; however, please remember that the Inventor ASCII file format and the Inventor programming interface are separate entities.
Inventor has taken many programmer-years to design and implement, and is a fairly large, very general system. VRML must be much smaller to become a success; otherwise, implementations will be either incompatible or will take too long to produce. Therefore, only the most commonly used subset of Inventor is proposed here as the basis for VRML.
I have tried to make this proposal readable; I started to write a detailed design spec and quickly got bogged down in the details of field syntax. Let's argue about the bigger issues; I will create a document describing Inventor's syntax precisely later, if it is felt necessary.
Issue: I have tried to anticipate criticisms of Inventor's design; you will see paragraphs marked Nit: throughout this document, where I point out parts of Inventor that are easy to nit-pick. I am going to assume that compatibility with Inventor, warts and all, is a desirable goal, and that minor incompatibilities should not be introduced just to make the design a bit more elegant or logical. I hope we can agree to that, and avoid nit-picking these minor details. Paragraphs marked with Issue: are larger issues that I think need to be discussed.
The Inventor group at Silicon Graphics has committed to separating the ASCII file-reading code from the rest of the Inventor library, repackaging it, and putting it in the public domain as the start of a VRML toolkit to make implementing VRML easier. The file reader will be C++ code that produces a hierarchical structure of C++ classes; a VRML implementor would need to define appropriate render() methods for these classes to implement rendering, a pick() method to implement picking, etc. Alternatively, an implementor could traverse these classes and create a completely different internal representation for the scene.
John Barrus is organizing an effort to summarize "The Inventor Mentor"; if you are unfamiliar with Inventor you should read it to get an idea of what Inventor is.
This notion of order in the scene graph may be the most controversial feature of Inventor. Most other systems attempt to attach properties to objects, with the properties affecting only that one object. In fact, an early prototype of Inventor was written that way. However, treating properties differently from geometry resulted in several problems. First, if a shape has several properties associated with it, you must still define an order in which the properties are applied. Second, there are some objects, such as lights and cameras, that act as both shapes (things that have a position in the world) and properties (things that affect the way other things look). Getting rid of the distinction between shapes and properties simplified both the implementation and the use of the library.
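For example, in this minimal sketch (using nodes described later in this document), the first Material affects only the Cube because the second Material replaces it before the Sphere is traversed:

Separator {
    Material { diffuseColor 1 0 0 }   # red
    Cube { }                          # drawn red
    Material { diffuseColor 0 0 1 }   # blue
    Sphere { }                        # drawn blue; the red Material no longer applies
}

The pieces of information that make up each node are described next.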
What kind of object it is. A node might be a cube, a sphere, a texture map, a transformation, etc.
The parameters that distinguish this node from other nodes of the same type. For example, each Sphere node might have a different radius, and different texture map nodes will certainly contain different images to use as texture maps. These parameters are called 'fields'. A node can have zero or more fields.
A name to identify this node. Being able to name nodes and refer to them elsewhere is very powerful; it allows a scene's author to give hints to applications using the scene about what is in the scene, and creates possibilities for very powerful scripting extensions. Nodes do not have to be named, but if they are named, they have only one name.
Child nodes. Object hierarchy is implemented by allowing nodes to contain other nodes. Parent nodes traverse their children in order during rendering. Nodes that may have children are referred to as "group nodes". Group nodes can have zero or more children.
The syntax chosen to represent these pieces of information is straightforward:
DEF objectname objecttype { fields children }

Only the objecttype and curly braces are required; nodes may or may not have a name, fields, and children.
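For example, a hypothetical named group node with fields and children (the node types used here are described in the following sections; the name "Snowman" is just illustrative):

DEF Snowman Separator {                  # a named group node
    Sphere { radius 2 }                  # a child whose "radius" field is set
    Translation { translation 0 2.5 0 }
    Sphere { radius 1.5 }
}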
The following sections describe the types of objects I think should be the basis of VRML, and describe details of this basic syntax.
Issue: Text3 is a much more compact representation for 3D text than IndexedFaceSet. I'd like to hear from implementors on whether they would be willing to implement it; there are also cross-platform issues if VRML allows different fonts, revolving around which fonts are available on various systems, what they are named, etc.
IndexedFaceSet supports overall, per-face, and per-vertex materials and normals. IndexedFaceSet will automatically generate normals if the user doesn't specify normals. Faces with fewer than 3 vertices will be ignored.
Here is a simple example of two IndexedFaceSets, showing some of their more advanced features (per-vertex coloring, for example):
#Inventor V2.0 ascii

# Two IndexedFaceSets each describing a cube.
# Normals are per polygon.  The first has OVERALL material binding, and
# appears all one color.
# The second has colors indexed per vertex.  This allows the colors
# to be defined in any order and then randomly accessed for each vertex.

Separator {
    Coordinate3 {
        point [ -1  1  1,  -1 -1  1,   1 -1  1,   1  1  1,
                -1  1 -1,  -1 -1 -1,   1 -1 -1,   1  1 -1 ]
    }
    Material {
        diffuseColor [ 1 0 0,  0 1 0,  0 0 1,  1 1 0 ]
    }                                       # indices 0,1,2,3
    Normal {
        vector [  0.0  0.0  1.0,   1.0  0.0  0.0,    # front and right faces
                  0.0  0.0 -1.0,  -1.0  0.0  0.0,    # back and left faces
                  0.0  1.0  0.0,   0.0 -1.0  0.0 ]   # top and bottom faces
    }
    NormalBinding   { value PER_FACE_INDEXED }
    MaterialBinding { value OVERALL }
    IndexedFaceSet {
        coordIndex [ 0, 1, 2, 3, -1,   3, 2, 6, 7, -1,    # front and right faces
                     7, 6, 5, 4, -1,   4, 5, 1, 0, -1,    # back and left faces
                     0, 3, 7, 4, -1,   1, 5, 6, 2, -1 ]   # top and bottom faces
        normalIndex [ 0, 1, 2, 3, 4, 5 ]   # Apply normals to faces, in order
    }
    Translation { translation 3 0 0 }
    MaterialBinding { value PER_VERTEX_INDEXED }
    IndexedFaceSet {
        coordIndex [ 0, 1, 2, 3, -1,   3, 2, 6, 7, -1,    # front and right faces
                     7, 6, 5, 4, -1,   4, 5, 1, 0, -1,    # back and left faces
                     0, 3, 7, 4, -1,   1, 5, 6, 2, -1 ]   # top and bottom faces
        materialIndex [ 0, 0, 1, 1, -1,    # red/green front
                        2, 2, 3, 3, -1,    # blue/yellow right
                        0, 0, 1, 1, -1,    # red/green back
                        2, 2, 3, 3, -1,    # blue/yellow left
                        0, 0, 0, 0, -1,    # red top
                        2, 2, 2, 2, -1 ]   # blue bottom
    }
}

Each face of an IndexedFaceSet is assumed to be convex by default. A ShapeHints node (see below) can be used to change this assumption to allow concave faces. However, all faces must be simple (they must not self-intersect).
If not enough normals are specified to satisfy the current normal binding, normals will automatically be generated based on the IndexedFaceSet's geometry.
If explicit texture coordinates are not specified using a TextureCoordinate2 node, then default texture coordinates will be automatically generated. A simple planar projection along one of the primary axes is used, mapping the width of the texture image onto the longest dimension of the IndexedFaceSet's bounding box, with the height of the texture image going in the direction of the next-longest dimension of its bounding box.
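For illustration, explicit texture coordinates might be given like this (a minimal sketch; it assumes a Coordinate3 with four points appears earlier in the scene, and that the face is a single quadrilateral):

TextureCoordinate2 {
    point [ 0 0,  1 0,  1 1,  0 1 ]           # S,T coordinates
}
IndexedFaceSet {
    coordIndex        [ 0, 1, 2, 3, -1 ]
    textureCoordIndex [ 0, 1, 2, 3, -1 ]      # which texture coordinate to use at each vertex
}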
ShapeHints also has a creaseAngle field used during normal generation; it is a hint to the normal generator about where sharp creases between polygons should be created (if two faces sharing an edge have a dihedral angle less than the creaseAngle, the normals will be smoothed across the edge; otherwise, the edge will appear as a sharp crease).
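For example, a ShapeHints node that allows concave faces and asks for fairly smooth automatically-generated normals might look like this sketch (field values are illustrative):

ShapeHints {
    faceType    UNKNOWN_FACE_TYPE    # faces may be concave
    creaseAngle 0.5                  # smooth edges whose dihedral angle is less than 0.5 radians (about 28 degrees)
}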
If a binary format for VRML is developed, it will be worthwhile to specify low-bandwidth alternatives to the standard Inventor Coordinate3 and Normal nodes, which store each coordinate or normal as three floating-point numbers. Lighting is usually good enough even with byte-sized normals; a ByteNormal with normal XYZ vectors with components from -127 to 127 would save a significant amount of network bandwidth. Similarly, a ShortCoordinate3 that specified vertices in the range of -32767 to 32767 (the model would need an appropriate Scale to make it reasonably sized, of course) could also save significant network bandwidth. Note that in the ASCII file format, new nodes aren't necessary-- you can just limit the precision of the ASCII numbers in your scene to a few digits of accuracy. For example, instead of specifying a normal as: [.7071067811865 .7071067811865 0], specify it as [.707 .707 0] to save bandwidth.
Nit: Yeah, I think eight is too many bindings, too. However, implementing all of the bindings is easy, since most shapes really have only two bindings (OVERALL or PER_PART), and all of the bindings used by IndexedFaceSet are useful.
Nit: PER_FACE or PER_VERTEX bindings can be done using appropriate indices and PER_FACE_INDEXED or PER_VERTEX_INDEXED bindings. I'm hesitant to get rid of them, though, because PER_FACE is more common and requiring all those indices will increase file sizes.
Inventor has a TextureCoordinateBinding node with DEFAULT, PER_VERTEX, and PER_VERTEX_INDEXED values. Because binding texture coordinates PER_VERTEX is very rare (PER_VERTEX_INDEXED is infinitely more common), I don't think this node should be part of VRML.
Separator {
    Coordinate3 {
        point [ 0 0 0,  1 0 0,  0 1 0,                   # Triangle vertices
                2 0 0,  3 0 1,  4 0 0,  5 0 1,  6 0 0 ]  # Zig-zag vertices
    }
    IndexedLineSet {
        coordIndex [ 0, 1, 2, 0, -1,
                     3, 4, 5, 6, 7 ]
    }
}

Unlike IndexedFaceSet, an IndexedLineSet will be drawn with lighting turned off if normals are not specified. Lines with fewer than 2 vertices are ignored.
Like IndexedLineSet, if normals are not specified then the points will be drawn unlighted.
Note: An IndexedPointSet primitive isn't terribly useful, because coordinates used for a PointSet aren't typically shared (unlike polygons and lines, where several polygons or line segments may meet at a vertex).
Issue: If scenes will contain large numbers of these primitives, CubeSet/SphereSet/CylinderSet/ConeSet primitives should be defined to reduce the network bandwidth of (for example) sending "Separator { Translation { translation x y z } Cube { } }" over and over.
Nit: yes, Cubes aren't really Cubes if they have different widths, heights, and depths. But a non-uniform scale can also make a sphere not a sphere.
Issue: Inventor has a Complexity node with a 0.0 to 1.0 value that can be used to control the quality of these shapes. I think complexity control should be left to the browser, which could control the complexity to get good interactive performance.
Inventor's Separator has several fields to control its caching (whether or not it should build a display list when rendering) and culling (whether or not it should draw its children, based on whether or not it is in the view volume) behavior. I propose that VRML require only the renderCulling field, since the caching fields are specific to APIs like GL that have a notion of display lists (and the default, Inventor's AUTO caching, works very well).
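For instance, a minimal sketch of a Separator that asks the browser to always perform view-volume culling on its children (renderCulling can be AUTO, ON, or OFF):

Separator {
    renderCulling ON    # skip this subgraph when it falls outside the view volume
    ...                 # children
}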
Another group that is very useful is TransformSeparator, which separates the effects of transformations inside it from the rest of the scene, but allows other properties to "leak" out. This node wasn't implemented to improve performance over Separator (on a well-implemented system Separator should do a lazy push/pop of attributes, only saving/restoring attributes that matter), but was done to allow transformations to transform lights and cameras without affecting the objects that the camera is viewing or the lights are illuminating.
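For example, something like the following sketch positions a light without affecting the geometry that comes after it; the light itself still "leaks" out and illuminates that geometry:

TransformSeparator {
    Translation { translation 0 10 0 }   # moves only the light
    PointLight { }
}
Cube { }    # lit by the light above, but not translated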
The Switch node traverses none, one, or all of its children based on its whichChild field. It is most useful in programs (for example, a scene may contain two representations of a world, with a named Switch used to switch between them), but it can be very useful for "commenting out" part of the scene.
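For example, this sketch "comments out" a subgraph; an application (or an edited file) could set whichChild to 0 to turn it back on:

Switch {
    whichChild -1    # traverse no children (the default)
    Cube { }
}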
LevelOfDetail {
    screenArea [ 10000, 100 ]
    Sphere { }    # Highest level of detail
    Cube { }      # Next level of detail
    Info { }      # Lowest level of detail
}

Issue: Will implementing this be too hard? I wouldn't mind a much simpler node that just chose a child based on how far away it is from the eye (called "DistanceSwitch", perhaps). DistanceSwitch could either switch based on the distance of the center of its children's bounding box from the eye (but then that forces implementors to be able to figure out bounding boxes for objects), or could just switch based on the distance of the point (0,0,0) in object space from the eye (this assumes objects are modelled around (0,0,0) and translated into position).
Issue: The Inventor LevelOfDetail node and Inventor's primitive shapes (Cube/Sphere/Cone/Cylinder) pay attention to the current complexity value, stored in the Complexity node (a lower complexity value causes LevelOfDetail to choose simpler levels of detail). I think it is OK for VRML to leave complexity as a global value controlled by the browser.
Issue: Inventor has two other material nodes; BaseColor is equivalent to a Material except that it only sets the diffuseColor for subsequent shapes. PackedColor is a compact form of BaseColor, with diffuse colors and transparency specified as 32-bit unsigned long values; the red, green, blue and alpha components are specified with 8 bits of precision. I don't think BaseColor adds enough functionality to justify its inclusion in VRML. However, PackedColor does use significantly less bandwidth than Material, and I think it should be included.
Inventor's Texture2 node has a 'model' field which controls how the texture image and the object's lighted color are combined. BLEND is used with greyscale and greyscale+alpha images, and uses the intensity of the texture image to control how much of the object's color is used and how much of a constant blending color (also specified in the Texture2 node) is used.

Issue: For VRML, the filename field should take a URL. What image formats should be supported? The same ones as HTML (is it just GIF?)? The SFImage field is an uncompressed, 8-bit-per-component format; should it be eliminated from VRML as too much of a bandwidth hog?
The Texture2Transform node can be used to modify a shape's texture coordinates. A Texture2Transform is a 2D version of the Transform node that transforms texture coordinates instead of vertex coordinates. It has fields that specify a 2D translation, 2D rotation, 2D scale, and a 2D center about which the transformations will be applied. Texture coordinates are either specified explicitly in a TextureCoordinate2 node or are implicitly generated by shapes. The cumulative texture transformation is applied to the texture coordinates, and the transformed texture coordinates are used to find the appropriate texel in the texture image. Note that, like regular transformations, Texture2Transform nodes have a cumulative effect.
Texture2Transforms allow the default mapping of textures onto primitive shapes to be changed. For example, you might build a house out of Cube primitives (if you didn't really care about performance!) and change the Texture2Transform so that a wallpaper texture was repeated across the walls, instead of the default mapping of the texture being repeated once across the faces of the cube.
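A sketch of that wallpaper case (the filename "wallpaper.rgb" is hypothetical, and the texture's wrap modes are assumed to be left at their REPEAT defaults):

Texture2 { filename "wallpaper.rgb" }
Texture2Transform {
    scaleFactor 4 4    # scale texture coordinates so the image repeats 4 times in S and T
}
Cube { }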
Translation has a single field that specifies an XYZ translation for subsequent objects. Note that all transformations are relative; for example:
Translation { translation 1 0 0 }
Translation { translation 3.5 2 1 }
Cube { }

will result in the cube having a total translation of (4.5, 2, 1).
Scale has a single field which specifies a relative scale. The scale will be non-uniform if the X, Y, and Z components of scaleFactor are not all the same.
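For example:

Scale { scaleFactor 2 1 1 }    # non-uniform: stretch along X only
Sphere { }                     # drawn as an ellipsoid twice as wide as it is tall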
Rotation has a single field that specifies an axis to rotate about and an angle (in radians) specifying how much right-hand rotation about that axis to apply. Nit: yes, it would have been more convenient if the angle was specified in degrees instead of radians.
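For example:

Rotation { rotation 0 1 0  1.5708 }    # about 90 degrees (pi/2 radians) about the Y axis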
MatrixTransform has a single field containing an arbitrary 4 by 4 transformation matrix, to be combined with previous transformations and applied to subsequent objects.
The Transform node combines several common transformation tasks into one convenient node. It has fields specifying a translation, rotation and scaleFactor, along with scaleOrientation and center fields for specifying what coordinate axes the scale should be applied along and about which point the scale and rotation should occur.
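For example, a single Transform might be used instead of separate Translation, Rotation and Scale nodes (a sketch; the values are illustrative):

Transform {
    translation 1 0 0
    rotation    0 0 1  0.785    # about 45 degrees about the Z axis
    scaleFactor 2 2 2
    center      0 1 0           # rotate and scale about the point (0,1,0)
}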
I don't think that the viewportMapping, nearDistance, farDistance, or aspectRatio fields need to be part of VRML. viewportMapping is almost always left at its default value of ADJUST_CAMERA. The near and far clipping planes distances are best calculated by the VRML browser and adjusted automatically. And we should assume the aspectRatio will match the window; authors that want their scenes to look squished can insert non-uniform scales.
OrthographicCamera is exactly like PerspectiveCamera, only instead of a heightAngle field to control the field-of-view, it has a height field that specifies how tall the viewing volume is, in world-space coordinates.
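Sketches of the two camera types (field values are illustrative):

PerspectiveCamera {
    position    0 0 10
    heightAngle 0.785     # vertical field of view, in radians
}
OrthographicCamera {
    position 0 0 10
    height   4            # height of the viewing volume, in world-space units
}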
Issue: This spec doesn't define any way of specifying a recommended viewing paradigm-- walk-through versus fly-through versus looking at a single object. I think the most common paradigms will be a single object (you just want to move around the object and look at it from all sides) and an immersive "room" or environment (you want to walk or fly or crawl or hop around it exploring). Smart browsers should be able to distinguish between these two cases pretty easily (using position of camera versus rest of scene, plus viewer size (focalDistance) versus rest of scene).
Issue: Are SpotLight and PointLight too hard to implement on non-GL platforms?
Info {
    string "Created by Thad Beier.
Slightly ill-behaved model: has some clockwise polygons.
Public domain."
}

Note that newlines are allowed in string fields, allowing one Info node to contain several lines of information.
Issue: Should conventions for the information inside Info nodes be established to allow browsers to interpret that information? For example, the convention for author information could be a line of the form "Author: author_name".
WWWInline {
    name "http://www.sgi.com/FreeStuff/CoolScene.vrml"
    bboxCenter 0 0 4
    bboxSize 10.5 4.5 8
}

The name field is an SFString containing the URL for the file. A 'smart' implementation can delay the retrieval of the file until it is actually rendered, instead of reading it right away; combined with LevelOfDetail, this provides an automatic mechanism for delayed loading of complicated scenes.
The bboxCenter and bboxSize fields allow an author to specify the bounding box for this WWWInline. Specifying a bounding box this way allows a browser to decide whether or not the WWWInline can be seen from the current camera location without reading the WWWInline's contents. If a bounding box is not specified, the contents of the WWWInline do have to be read to determine the WWWInline's bounding box.
WWWAnchor looks very much like WWWInline, except that it is a group node and can have children:
WWWAnchor {
    name "http://www.sgi.com/FreeStuff/CoolScene.vrml"
    Separator {
        Material { diffuseColor 0 0 .8 }
        Cube { }
    }
}

WWWAnchor is a strange node; it must somehow communicate with the browser and cause the browser to load the scene specified in its name field when a child of the WWWAnchor is picked, replacing the "current" scene that the WWWAnchor is part of. Specifying how that happens is up to the browser and implementor of WWWAnchor, as is implementing the picking code.
Issue: What happens when you nest WWWAnchors (you have WWWAnchors as children of WWWAnchors)? Suggestion: the "lowest" WWWAnchor wins.
WWWAnchor also has a "map" field that adds the object-space point on the object the user picked to the URL in the name field. This is like the image-map feature of HTML, and allows scripts to do different things based on exactly what part of an object is picked. For example, given this WWWAnchor:
WWWAnchor {
    name "http://www.foo.com/cgi-bin/pickMapper"
    map POINT
    Cube { }
}

Picking on the top of the Cube might produce the URL "http://www.foo.com/cgi-bin/pickMapper?.211,1.0,-.56".
Issue: Is this the best way of doing this?
Separator {
    Units { units FEET }
    DEF FootCube Cube { }
    Separator {
        Units { units METERS }
        DEF MeterCube Cube { }
    }
}

Applications that try to be smart about rearranging the object hierarchy will have trouble figuring out exactly what effect the second Units node will have, since its effect will change if it is moved out from under the first Units node. The rules are much simpler if a simple Scale node is used instead.
Issue: RotationXYZ allows rotation about one of the primary axes. I prefer the generality of Rotation, which allows rotation about an arbitrary axis. However, it might make sense to replace Rotation with RotationXYZ, since the general Transform node can also be used to rotate about an arbitrary axis.
For example, a Sphere has a single "radius" field which contains a single floating point value, and is written as:
Sphere { radius 1.0 }
Each node defines reasonable default values for its fields, which are used if the field does not appear as part of the node's definition.
Some fields can contain multiple values. The syntax for a multiple-valued field is a superset of the syntax for single-valued fields. The values are all enclosed in square brackets ("[]") and are separated by commas. The final value may optionally be followed by an extra comma. If a multiple-valued field has only one value, the brackets and commas may be omitted, resulting in the same syntax as single-valued fields. A multiple-valued field may also contain zero values, in which case just a set of empty brackets appears.
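For example, all of the following are legal uses of Material's multiple-valued diffuseColor field:

Material { diffuseColor [ 1 0 0, 0 1 0, 0 0 1, ] }   # three values; the trailing comma is optional
Material { diffuseColor 1 0 0 }                      # a single value may omit the brackets
Material { diffuseColor [ ] }                        # zero values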
Nit: Some of the types for fields are tied to the C programming language ("Float"; Inventor also has "Long" and "Short") and will be misleading on some machines (an Inventor SFFloat is a 32-bit floating point number, even though floats are larger or smaller on different machines).
For example, to give the name "SquareHead" to a cube:
DEF SquareHead Cube {}
Names must not start with a digit (0-9), and must not contain ASCII control characters, whitespace, or the following characters: +\'"{}
Note: The "+" character is illegal for compatibility with Inventor programs, where the characters after the "+" are used to disambiguate multiple nodes with the same name. For example, a user of an Inventor program may give two nodes the name "Joe"; when written, these might appear as "Joe+0" and "Joe+1". The other characters are illegal to make parsing easier and to leave room for future format extensions.
The DEF keyword both defines a named node, and creates a single instance of it. The USE keyword indicates that the most recently defined instance should be used again. If several nodes were given the same name, then the last DEF encountered during parsing "wins". DEF/USE is limited to a single file; there is no mechanism for USE'ing nodes that are DEF'ed in other files.
For example, rendering this scene will result in three spheres being drawn. Both Sphere nodes are named 'Joe'; the second (smaller) sphere is drawn twice:
Separator {
    DEF Joe Sphere { }
    Translation { translation 2 0 0 }
    DEF Joe Sphere { radius .2 }
    Translation { translation 2 0 0 }
    USE Joe
}
Objects that are not built-in write out a description of themselves first, which allows them to be read in and ignored by applications that don't understand them.
This description is written just after the opening curly-brace for the node, and consists of the keyword 'fields' followed by a list of the types and names of fields used by that node (to save space, fields with default values that won't be written also will not have their descriptions written). For example, if Cube was not built into the core library, it would be written like this:
Cube {
    fields [ SFFloat width, SFFloat height, SFFloat depth ]
    width 10
    height 4
    depth 3
}

By describing the types and names of the cube's fields, a parser can correctly read the new node. Field-to-field connections and engines (which I do not think should be part of VRML 1.0; see the last section of this document on Futures) require that the parser know the names and types of fields in unknown nodes; it isn't good enough to just search for matching curly-braces outside of strings and store unknown node contents as an unparsed string.
The other feature that allows easy extensibility is the ability to supply an alternate representation for objects. This is done by adding a special field named 'alternateRep' of type 'SFNode' to your new nodes. For example, if I wanted to implement a new kind of material that supported indexOfRefraction, I could also supply a regular Material as an alternate representation for applications that do not understand my RefracMaterial. In the file format, it would look like:
RefracMaterial {
    fields [ SFNode alternateRep, MFFloat indexOfRefraction, MFColor diffuseColor ]
    indexOfRefraction 0.2
    diffuseColor 0.9 0.0 0.2
    alternateRep Material { diffuseColor 0.9 0.0 0.2 }
}

Inventor uses DSOs (dynamic shared objects; DLLs, dynamic link libraries, on the Windows NT port of Inventor) to support run-time loading of the code for a new node; I can give you a RefracMaterial.so with an implementation (written in C++) of the new RefracMaterial, and existing Inventor applications will then recognize the new node and NOT use the alternateRep. However, I think it is beyond the scope of VRML to try to define a method for the dynamic loading of platform-independent code across the network, and that issue is completely independent of VRML.
Note: It would be a little more convenient if VRML shared the same identifying header as Inventor ("#Inventor V2.0 ascii"). However, in the long run I think there will be many fewer problems if it is easy to distinguish VRML files from Inventor files. It should be trivial to write a VRML to Inventor translator, and only moderately difficult to write an Inventor to VRML translator that tessellated any primitives that were not part of VRML (e.g. NURBS) into IndexedFaceSets.
Note: Comments and whitespace may not be preserved; in particular, a VRML document server may strip comments and extraneous whitespace from a VRML file before transmitting it. Info nodes should be used for persistent information like copyrights or author information.
Note: Inventor allows a series of root nodes to be parsed from a single file. This causes problems for filters that operate on Inventor files (instancing between the nodes under different roots tends to get broken as each root is worked on independently), and doesn't really add any functionality.
Issue: The consensus on the mailing list was +X right, +Y into the screen, and +Z up. Is it worth having VRML be incompatible with Inventor (it is easy for a translator to add a Rotation node...)?
DEF RecommendedViews Switch {
    whichChild 0    # Use first camera by default
    DEF DefaultView PerspectiveCamera { ... }
    DEF UnderSofaView PerspectiveCamera { ... }
    DEF TopOfChimneyView PerspectiveCamera { ... }
}

The browser could then present this list of recommended views to the user, and change the Switch value to change between them.
DEF WackyCube Cube { }
MyScriptingNode {
    fields [ SFString script ]
    script "if (....) WackyCube.width += 3;"
}

The ability to put arbitrary nodes with arbitrary fields in the file, plus the ability for them to refer to other nodes, gives the needed flexibility.
Inventor is missing a good set of engines for doing simple keyframed animated behaviors of objects. It is also missing some simple interactive nodes, such as buttons (like the WWWAnchor node, only more general). The Inventor team at Silicon Graphics will be designing and implementing these kinds of nodes and engines in the near future.
As an example of what standard Inventor can do now, see the FunWithDraggers PostScript document, which is an article Paul Isaacs wrote for the Silicon Graphics developer newsletter on how to build an interactive 3D scene using only standard Inventor nodes and the Inventor ASCII file format (no programming necessary). Here is TrackLight.iv, the file containing the interesting stuff (draggers, connections). If you're curious, here are AllRoom.iv (the main file, which references Room.iv and TrackLight.iv), and Room.iv (which is just the walls of the room).