http://www.sgi.com/tech/ImageVision/ILTech.html (Silicon Surf Promotional CD, 01/1995)
ImageVision Technical Report
Silicon Graphics Computer Systems
The ImageVision Library
Table of Contents
Section 1 Introduction
1.1 Abstract
Section 2 Using the ImageVision Library
Section 3 Architecture
Section 4 Extending the Library
The ImageVision Library™ (IL) from Silicon Graphics® is an object-oriented, extensible toolkit for creating, processing, and displaying images on all Silicon Graphics workstations. The library provides image processing application developers with a robust framework for managing and manipulating images.
Today's rapidly advancing hardware technology often requires software developers to rewrite their code for each new release. IL is explicitly designed to provide a constant software interface to changing hardware. Applications you write today will run on all future generations of machines, requiring few or no changes. With IL, Silicon Graphics can continue to bring you the benefits of next-generation hardware while giving you all the advantages of a stable but growing software environment.
IL consists of a library developed in C++. Interfaces for C and FORTRAN programmers are included in supporting libraries.
The current release of IL contains a core set of more than seventy image processing functions common to most disciplines; future releases will support many more. Since developers must work with such a broad range of image processing operators, IL has been designed to allow you to extend this set to suit your specific needs. Silicon Graphics provides a set of data abstractions and access functions that make it easy to augment IL's image operators and design new ones.
Similarly, image data sets come in a bewildering variety of formats. IL accommodates developers' needs by allowing new file formats to be integrated seamlessly into the library. IL currently supports three standard formats: TIFF and its extensions, Silicon Graphics, and FIT, a simple tiled format.
IL provides an efficient model for the manipulation of image data and image attributes. IL's image model includes a configurable cache to allow access to and processing of the very large images common in many disciplines. It provides a common interface for image manipulations, while requiring little or no programmer knowledge of the image's internal structure or format.
In addition, IL supports the ability to chain a series of image operations together. The chained operations are then executed on demand, processing data from the requested area only. Furthermore, no intermediate results need to be stored between each step in the chain. IL's demand-driven execution model, in conjunction with its ability to chain operators, provides a powerful environment for building image processing applications.
IL's execution model also employs the full power of the hardware. IL transparently supports the parallel processing and graphics features of Silicon Graphics workstations. The multi-threaded implementation of IL allows single-processor platforms to perform look-ahead operations. On multi-processor architectures, IL will further utilize multi-threading to execute processing requests automatically in parallel.
In summary, Silicon Graphics's IL provides the following features:
- an image processing programmer interface common across all Silicon Graphics workstations
- a core set of general-purpose image operators and an easy way to add new ones
- a framework for chaining a sequence of operators together
- an efficient method of augmenting the set of supported image file formats
- an optimized memory model for handling very large images
- an efficient method for pre-fetching image data
- transparent support for execution on single processors and multiple, parallel processors
- an architecture that supports general image types
IL organizes the diverse I/O, computation, and display requirements of an image processing application into a set of coordinated programming models. These models are outlined in the following sections.
The foundation and unifying concept of IL is the image object. All image types are derived from a common object definition. These image types include:
- a memory image, implemented as a contiguous array
- a cached image that resides in a file and is buffered in memory
- a displayed image, resident in the frame buffer
- an operator image that implements an image processing algorithm
By sharing common mechanisms for manipulating data and attributes, all image types can be handled in a consistent fashion. This results in a streamlined programming model that greatly simplifies application development.
The file I/O model abstracts how images are accessed from disk. File images use a look-ahead buffer or cache to minimize delays caused by disk I/O. The common object definition allows an application to transparently support the different file formats provided by IL. You can easily integrate new file formats into IL without modifying the application.
IL implements a pull-driven, or demand-driven, model, such that data is processed only on demand. The model is based on the same cached-image model as file images. This technique enables an application to process just the area of interest providing significant benefits in terms of reduced I/O and improved system performance. The pull execution model also enables IL to efficiently support chaining or linking of image processing operators. This allows a sequence of operations to be performed on a subregion without creating temporary images to hold intermediate results.
In contrast, traditional, or push execution models require that each image operator process the entire image in sequence and also require extra buffers to store the intermediate results. Figure 1 contrasts the push and pull execution models.
FIGURE 1 Push Execution Model versus ImageVision Library
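The difference between the two models can be sketched in a few lines of C++. This is a minimal illustration, not IL code: PullImage, makeSource, scaleValues, addOffset, and the read counter are hypothetical names invented here, and the operators are simple per-pixel functions chosen only to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-ins for illustration; these are not IL classes.
struct Rect { int x, y, w, h; };

// A "pull" image computes a pixel only when it is asked for.
struct PullImage {
    int width, height;
    std::function<float(int, int)> pixelAt;
};

// Counts how many source pixels are actually read.
static std::size_t g_sourceReads = 0;

PullImage makeSource(int w, int h) {
    return {w, h, [](int x, int y) -> float {
                ++g_sourceReads;
                return static_cast<float>(x + y);
            }};
}

// Chaining wraps the input image; no intermediate buffer is allocated.
PullImage scaleValues(PullImage in, float gain) {
    return {in.width, in.height,
            [in, gain](int x, int y) -> float { return gain * in.pixelAt(x, y); }};
}

PullImage addOffset(PullImage in, float offset) {
    return {in.width, in.height,
            [in, offset](int x, int y) -> float { return in.pixelAt(x, y) + offset; }};
}

// Pull model: evaluate only the requested rectangle, through the whole chain.
std::vector<float> getTile(const PullImage& img, Rect r) {
    assert(r.x >= 0 && r.y >= 0);
    assert(r.x + r.w <= img.width && r.y + r.h <= img.height);
    std::vector<float> out;
    out.reserve(static_cast<std::size_t>(r.w) * r.h);
    for (int y = r.y; y < r.y + r.h; ++y)
        for (int x = r.x; x < r.x + r.w; ++x)
            out.push_back(img.pixelAt(x, y));
    return out;
}

// Push model, for comparison: every stage processes the full image
// into its own buffer, so the work is width * height * stages.
std::size_t pushPixelsProcessed(int w, int h, int stages) {
    return static_cast<std::size_t>(w) * h * stages;
}
```

Requesting a 4 x 4 tile from a two-operator chain over a 1024 x 1024 source reads 16 source pixels in this sketch, whereas the push computation above would process 2,097,152 pixels across the two stages.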
The display model defines how images are displayed on the screen. It provides an abstraction of an X window and can display one or more images in subregions of the window, called views (shown in Figure 2). The display model provides methods to move, resize, reorder, and select views to be operated on by display operators such as split, wipe, and translate. This model allows the application to conveniently specify how images are to be displayed while automatically drawing only the portions of each image that are exposed.
FIGURE 2 An Example Using the ImageVision Library Display Model.
IL is based on a single object abstraction of the way in which images are manipulated, ilImage. This abstraction presents a common interface for manipulating images while hiding the actual data representation. The interface provides functions for setting and retrieving information about image size (rows, columns, depth), channels or bands, data type, and color interpretation.
Most importantly, ilImage provides a common interface for storing and retrieving image data. These image data-access functions are used to store and retrieve an arbitrary subset of the channels for an arbitrary rectangle or tile of pixels. Convenience methods allow access to a single pixel value.
Figure 3 depicts IL abstractions for a set of distinct image types: memory images, disk file images, display images, and processed images (operator images).
FIGURE 3 Image Object Abstraction Hierarchy
Because all image types share a common model, a programmer need only access images and image data as an ilImage type. The following sections fully describe the attributes and access methods for all image types.
IL supports three main image dimensions: x-dimension or number of columns, y-dimension or number of rows, and z-dimension or image depth.
Each element in this array, pixel or voxel, can consist of multiple channels or bands. All pixels in a given image must have the same number of channels and the same scalar data type. There is no prescribed limit to the size of the x, y, z, or channel dimensions.
Each channel of a pixel has a scalar data type that can have one of the following values (using the standard C interpretation):
- bit
- char (signed or unsigned)
- short (signed or unsigned)
- long (signed or unsigned)
- float
- double
To provide flexible support for different image formats, IL supports image data ordering as a configurable image attribute. Figure 4 is an example for a color (or three-channel) image; in this case, image data can be retrieved or archived in one of three orders.
FIGURE 4 Image Data Ordering
1. Interleaved ordering clusters the pixel components together. For example, a three-channel (RGB) image would be stored as RGBRGBRGB... and so on.
2. Sequential ordering clusters the individual pixel components on a per-line basis. In the RGB image example above, a corresponding tile of data would contain a line's worth of red values followed by the same line's green values and then the blue values for that line, before continuing with the data for the next line.
3. Separate ordering stores each channel component in a separate contiguous piece.
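The three orderings differ only in how a (column, row, channel) triple maps to a position in a tile buffer. The function below is a sketch of that mapping for a single two-dimensional tile; the enum and function names are hypothetical and are not IL's own ilOrder values.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration; not IL's ilOrder constants.
enum class Order { Interleaved, Sequential, Separate };

// Offset, in elements, of channel c of pixel (x, y) inside a tile that is
// w pixels wide and h pixels high and has nc channels.
std::size_t elementOffset(Order order,
                          std::size_t x, std::size_t y, std::size_t c,
                          std::size_t w, std::size_t h, std::size_t nc) {
    assert(x < w && y < h && c < nc);
    switch (order) {
    case Order::Interleaved:
        return (y * w + x) * nc + c;   // RGBRGBRGB...
    case Order::Sequential:
        return (y * nc + c) * w + x;   // a line of R, then G, then B
    case Order::Separate:
        return (c * h + y) * w + x;    // whole R plane, then G, then B
    }
    return 0;  // unreachable for valid enum values
}
```

For a 4 x 2 RGB tile, the green component of pixel (1, 0) sits at offset 4 in interleaved order, 5 in sequential order, and 9 in separate order.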
Image data can be interpreted using several different color semantics. The programmer can define or query an image's color model and, if relevant, its color map. IL supports the following color models:
- grey scale with minimum value black
- grey scale with minimum value white
- color palette or pseudo-color
- RGB
- RGB plus alpha (transparency value)
- HSV (hue, saturation, and value)
- CMYK
- multi-spectral
IL provides access functions for reading and writing image data. Data can be read or written as a rectangular region (tile) or as an individual pixel. A tile can have an arbitrary size and need not match the underlying storage format of the image being accessed. Image data can be accessed in a data type and ordering independent from that of its source; IL performs the necessary conversion.
Programmers often work with only a small portion of an image. To reduce the image's processing region and boost performance, IL provides an abstraction for defining and applying a mask or region of interest (ROI) to images, ilRoiImg. This results in substantial computational efficiencies during image access or storage since only data inside the ROI is affected.
Another class, ilSubImg, allows a rectangular portion of a parent image to be treated as if it were an independent ilImage. This is a simplified version of ROI that is useful when selecting a small portion of a larger image as input to an image operator.
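The sub-image idea amounts to a coordinate translation in front of the parent image. The sketch below shows one way this could look; Image, MemoryImage, and SubImage are hypothetical classes written for this example and do not reproduce the ilImage or ilSubImg interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal image interface; not the ilImage API.
class Image {
public:
    virtual ~Image() = default;
    virtual int width() const = 0;
    virtual int height() const = 0;
    virtual float getPixel(int x, int y) const = 0;
};

// A contiguous in-memory image.
class MemoryImage : public Image {
public:
    MemoryImage(int w, int h)
        : w_(w), h_(h), data_(static_cast<std::size_t>(w) * h, 0.0f) {}
    int width() const override { return w_; }
    int height() const override { return h_; }
    float getPixel(int x, int y) const override { return data_[index(x, y)]; }
    void setPixel(int x, int y, float v) { data_[index(x, y)] = v; }

private:
    std::size_t index(int x, int y) const {
        assert(x >= 0 && x < w_ && y >= 0 && y < h_);
        return static_cast<std::size_t>(y) * w_ + x;
    }
    int w_, h_;
    std::vector<float> data_;
};

// A rectangular window onto a parent image. It stores no pixels of its
// own; every access is translated into the parent's coordinates.
class SubImage : public Image {
public:
    SubImage(const Image& parent, int x0, int y0, int w, int h)
        : parent_(parent), x0_(x0), y0_(y0), w_(w), h_(h) {
        assert(x0 >= 0 && y0 >= 0);
        assert(x0 + w <= parent.width() && y0 + h <= parent.height());
    }
    int width() const override { return w_; }
    int height() const override { return h_; }
    float getPixel(int x, int y) const override {
        assert(x >= 0 && x < w_ && y >= 0 && y < h_);
        return parent_.getPixel(x0_ + x, y0_ + y);
    }

private:
    const Image& parent_;
    int x0_, y0_, w_, h_;
};
```

Because SubImage derives from the same base class as MemoryImage, any code written against Image can take a sub-image as input without knowing it is a window onto a larger parent.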
Using IL is easy. With image objects, it is possible to write many effective, versatile applications. IL allows you to read from multiple sources, construct both simple and complex chains of operators, and write the results to a display or file. In addition, you can interactively modify operator parameters while simultaneously displaying results.
IL represents image processing functions as image operator objects. Each image operator specializes in processing one or more images using a particular algorithm, while allowing user access to the resulting image data through a common programming interface. All image data can be stored or retrieved using IL's setTile, copyTile or getTile functions. Additionally, IL transparently supports the ability to chain image operators together.
Figure 5 shows how elements of an IL application might be connected. IL manages the data caching to appropriately process images from anywhere in the chain.
FIGURE 5 Elements of an IL Application
Programming with IL is easy and intuitive. For example, the following simple steps allow you to obtain an image from a file, sharpen it, and write the result back to disk:
1. Open the input image file
2. Construct a sharpen operator image using this file as input
3. Create the output image file
4. Copy the sharpen image to the output image file
Even better, the following C++ code shows how each of the outlined steps maps easily into a single call to the IL:
int main()
{
    // Step 1. Open the input image file
    ilFileImg* inimg = ilOpenImgFile("image.tif", "r");
    if (inimg == NULL)
        return 1;   // file missing or format unsupported

    // Step 2. Construct a sharpen operator image using
    // this file as input, sharpen the image by .5
    ilSharpenImg* sharpen = new ilSharpenImg(inimg, .5);

    // Step 3. Create the output image file, use the file attributes
    // from the sharpened result.
    ilType dtype = sharpen->getDataType();
    ilOrder order = sharpen->getOrder();
    ilSize outsize;
    sharpen->getSize(outsize);
    ilFileImg* outimg = ilCreateImgFile("output.tif", outsize, dtype,
                                        order);

    // Step 4. Copy the sharpened image to the file
    outimg->copyTile(0, 0, outsize.x, outsize.y, sharpen, 0, 0);

    // Have IL flush buffers and close the file automatically.
    delete outimg;
    return 0;
}
Image application developers must often perform more than one operation in sequence. In the following example, the input image is scaled, rotated, sharpened, and then radiometrically adjusted using a look-up table. Here is the new sequence:
1. Open the input image file
2. Construct a scale and rotate operator image using the input image file as input
3. Construct a sharpen image using the scale and rotate image as input
4. Construct a LUT image using the sharpen image as input
5. Create the output image file
6. Copy the LUT image to the output image file
IL treats all image operators as images, so operators can be given as inputs to other image operators. The ability to specify image operators as inputs enables IL to manage a sequence of operations as a chain transparently. Developing a simple or complex chain in the IL can be as easy as performing single operations. The next C++ code fragment demonstrates the versatility of IL:
int main()
{
    // Step 1. Open the input image file
    ilFileImg* inimg = ilOpenImgFile("input.tif", "r");
    if (inimg == NULL)
        return 1;   // file missing or format unsupported

    // Step 2. Construct a scale & rotate operator, rotate by 30
    // degrees and scale it by 0.5 using bilinear resampling
    ilRotZoomImg* rotate = new ilRotZoomImg(inimg, 30, .5, .5,
                                            ilNoFlip, ilBiLinear);

    // Step 3. Construct a sharpen operator,
    // sharpen the rotated image by .5
    ilSharpenImg* sharpen = new ilSharpenImg(rotate, .5);

    // Step 4. Construct a look-up operator, use the standard SGI
    // lookup table to pass the sharpened image through.
    ilLut* sgitable = ilSGICmapLut(ilSgiDefault);
    ilLutImg* lutimg = new ilLutImg(sharpen, *sgitable);

    // Step 5. Create the output image file, use the file attributes
    // from the final operator result.
    ilType dtype = lutimg->getDataType();
    ilOrder order = lutimg->getOrder();
    ilSize outsize;
    lutimg->getSize(outsize);
    ilFileImg* outimg = ilCreateImgFile("output.tif", outsize, dtype,
                                        order);

    // Step 6. Copy the final image to the file
    outimg->copyTile(0, 0, outsize.x, outsize.y, lutimg, 0, 0);

    // All done, remove and close the output image to
    // flush all buffers.
    delete outimg;
    return 0;
}
Using IL, you can display your results as easily as writing them to an image file. To display your results directly, replace steps 5 and 6 above with the following sequence:
5. Create a display object
6. Invoke the desired display operator
IL provides an efficient model for displaying one or more images within a window. Image rendering with IL can be done under either the X Window System or the GL™ windowing system:
// Step 5. create a window to display the image results,
// save the window id to pass onto IL
ilDisplay display(window-id);
display.addView(lutimg);
// In your event loop, when you get a redraw event notify IL
// that it must update the images on the display
display.redraw();
This code fragment merely shows you the minimal set of features provided by IL. The IL display capabilities also include operations such as roam, wipe, and translate.
IL allows you to change the parameters of an existing chain. For example, to alter the sharpness of the image created by the rotate and scale operator in the previous example, take the following steps:
7. Call the sharpen operator's built-in method to alter its sharpness
8. Invoke the redraw display operator
You can repeat these steps as often as desired in an interactive loop by adjusting each image operator's values as shown in Figure 6.
FIGURE 6 Interaction Within an ImageVision Application
Chains do not have to be linear. Individual operators can fork to several operator chains. In the previous example, the results of the sharpen operation can be channeled into one display view while the results of the LUT operation can be directed to a second view. The split display operator can then be used to compare the two outputs.
IL is independent of the User Interface (UI) and therefore can be used with any UI toolkit. IL operators provide public member functions that make them easy to control interactively. When these attributes are altered, any cached data in the operator image is discarded. In this way, an operator can have its behavior modified without having to be destroyed and re-created. When an image in the chain is affected by parameter modifications, the succeeding images are informed that their results must be updated.
IL can be controlled with keyboard input, mouse position, Motif™ widgets, or any combination of these or any other toolkits. For example, ilCompassImg performs directional edge enhancement. IL provides member functions to specify the direction as an x-weight and y-weight, a floating-point angle, or a predefined direction such as north or south-west. Mouse position could be used to control the x-weight and y-weight. The direction could also be specified as an angle from the keyboard, or it could be selected by providing Motif push buttons labeled North, South, East, and West. This simple yet flexible control makes it easy to develop interactive image processing applications.
In addition to standard image processing operations, IL provides display operators such as roam, wipe, and split view. Again, any combination of UI widgets can be used. For example, mouse position could be used to control roaming through a large image. Alternatively, a Motif scroll-bar could be used to control the wipe between two images being compared.
For example, when using a Motif scroll-bar, the callback routine for a "value changed" event could determine the new position of the scroll bar and then use this information to set the zoom factor on ilRotZoomImg. Similarly, the callback routine for a Motif push button labeled "2:1" could set the zoom factor to 2 on ilRotZoomImg. These two widgets, used together, provide the ability to step to a desired zoom factor and then zoom interactively relative to the current zoom setting.
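The two callbacks described above can be sketched independently of any toolkit. ZoomOperator and ZoomControl are hypothetical stand-ins written for this example (they are neither ilRotZoomImg nor Motif types), and the scroll range of -100 to 100, mapped to a relative factor between 1/2 and 2, is an assumption made only to have concrete numbers.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for an operator with an adjustable zoom factor.
struct ZoomOperator {
    double zoom = 1.0;
    void setZoom(double z) {
        assert(z > 0.0);
        zoom = z;
    }
};

// Combines a preset (push button) with a relative adjustment (scroll bar).
class ZoomControl {
public:
    explicit ZoomControl(ZoomOperator& op) : op_(op) {}

    // Push-button callback: step to a preset factor, e.g. 2.0 for "2:1".
    void onPresetButton(double factor) {
        base_ = factor;
        apply();
    }

    // Scroll-bar "value changed" callback. The position is assumed to lie
    // in [-100, 100] and scales the preset by 2^(position / 100).
    void onScrollValueChanged(int position) {
        assert(position >= -100 && position <= 100);
        relative_ = std::pow(2.0, position / 100.0);
        apply();
    }

private:
    void apply() { op_.setZoom(base_ * relative_); }

    ZoomOperator& op_;
    double base_ = 1.0;      // set by the push button
    double relative_ = 1.0;  // set by the scroll bar
};
```

In a real application each callback would be registered with the UI toolkit, and the display would be redrawn after the operator's parameter changes.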
The X event queue can deliver mouse-position events, mouse-button events, or keyboard events. To control roaming with the mouse, you can use the MOUSEX and MOUSEY events to specify the delta-x and delta-y for the translate operator.
IL's fundamental building blocks include its caching mechanism, broad variety of image operators, efficient execution model, multiple file types, and versatile techniques for displaying results.
Data movement can be a critical bottleneck in the implementation of an image processing library. To avoid this bottleneck, IL processes data in pieces small enough to fit in working memory. If every data request were serviced directly from the disk, performance would suffer, since the computation would need to wait for each new piece of data to be read or written. To avoid delays caused by disk I/O and re-computation, the data read from disk and the processed results in memory must be cached.
The cache holds two different types of image data.
1. When reading image files from disk, the cache stores raw, or uncompressed, data. If the image is stored by rows, the cache will be allocated by rows matching those in the file. If it is stored in rectangular chunks, the cache will be allocated in pieces that are the same size and shape.
2. For operator images, the cache stores the processed results.
The cache stores the image data in fixed-size rectangles called pages. Image pages play a similar role to "traditional" pages in a virtual memory system. The most important difference is the two-dimensional or array-like structure of an image page. This allows for very efficient image oriented memory management.
Arbitrary reference to image data is provided through tiles. The image access methods getTile and setTile allow access to any contiguous rectangle of image data, as shown in Figure 7. Because these accesses often do not match the underlying storage format of the image file, the cached image object provides any necessary reformatting of image data.
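Since a tile need not line up with page boundaries, a tile request first has to be resolved into the set of pages it overlaps. The helper below is a sketch of that step under the assumption of a regular grid of equally sized pages; Rect, PageIndex, and pagesForTile are hypothetical names, not IL functions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical types for illustration; not part of IL.
struct Rect { int x, y, w, h; };
struct PageIndex { int col, row; };

// Returns the grid indices of every page that a tile overlaps, assuming
// pages of pageW x pageH pixels laid out from the image origin.
std::vector<PageIndex> pagesForTile(Rect tile, int pageW, int pageH) {
    assert(tile.x >= 0 && tile.y >= 0 && tile.w > 0 && tile.h > 0);
    assert(pageW > 0 && pageH > 0);

    const int firstCol = tile.x / pageW;
    const int lastCol = (tile.x + tile.w - 1) / pageW;
    const int firstRow = tile.y / pageH;
    const int lastRow = (tile.y + tile.h - 1) / pageH;

    std::vector<PageIndex> pages;
    for (int row = firstRow; row <= lastRow; ++row)
        for (int col = firstCol; col <= lastCol; ++col)
            pages.push_back(PageIndex{col, row});
    return pages;
}
```

With 128 x 128 pages, a 100 x 100 tile whose origin is at (100, 50) straddles four pages, while a 128 x 128 tile at the origin touches exactly one.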
The cache supports two fundamental data orders: interleaved/sequential and separate. With interleaved or sequential data, all channels of a given pixel are stored adjacently; each page therefore holds all channels for that page's rectangular area. With separate data, each channel is stored separately; there is a separate page for each channel in a given page's rectangular area. As with data-type conversion, IL defers page reformatting until it performs each individual access, and the data in the cache is stored in the raw page format of the underlying image file.
If getTile requests a single channel and the image file is stored in a separate format, only the pages containing that channel will be accessed. For multi-spectral operations, this can result in significantly reduced I/O overhead. The format of choice for multi-spectral image data, therefore, is separate channels. Similarly, color data is usually stored interleaved, because all channels are accessed together, and the interleaved format matches the storage in the frame buffer. In many cases, this data order eliminates the need to reformat during display operations.
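The I/O saving can be made concrete with a small calculation. The function below is a back-of-the-envelope sketch, assuming that pages cover the same pixel area in both layouts and that whole pages are always fetched; the names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration; not IL's own constants.
enum class PageOrder { Interleaved, Separate };

// Number of channel elements fetched to satisfy a request that covers
// `pagesCovered` page areas of `pixelsPerPage` pixels each.
std::size_t elementsFetched(PageOrder order,
                            std::size_t pagesCovered,
                            std::size_t pixelsPerPage,
                            std::size_t channelsInImage,
                            std::size_t channelsRequested) {
    assert(channelsRequested <= channelsInImage);
    if (order == PageOrder::Separate) {
        // One page per channel: only the requested channels are read.
        return pagesCovered * pixelsPerPage * channelsRequested;
    }
    // Interleaved pages carry every channel, wanted or not.
    return pagesCovered * pixelsPerPage * channelsInImage;
}
```

For a seven-channel image and a request covering four 128 x 128 page areas, reading one channel fetches 65,536 elements from a separate layout and 458,752 from an interleaved one.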
FIGURE 7 Image Storage Model
In addition to enabling IL to access a specific set of data, tiling allows it to break up that set into multiple requests. This feature has the dual benefit of allowing data to be manipulated more efficiently in memory and permitting the requests to be distributed to all available processors.
Because images can have widely varying page sizes, the allocation of memory for the image cache can present problems with fragmentation of heap memory. To avoid this problem, IL uses a sophisticated memory allocation scheme that controls fragmentation. Allocated memory is broken up into pools, one for each active page size. Each pool is maintained in its own address region to avoid interactions between different page sizes. Within a pool, compaction of holes in the allocated memory is straightforward since all of the allocations are the same size. When large amounts of memory are freed in a given pool, the memory is returned to the operating system for use by other applications. IL automatically performs this compaction and reclamation whenever fragmentation exceeds a user-controllable threshold.
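A much simplified sketch of the pool-per-page-size idea follows. It is not IL's allocator: PagePool and PageAllocator are hypothetical classes, each pool is a plain free list rather than a dedicated address region, compaction is not modeled, and the trim threshold is an invented parameter standing in for the user-controllable threshold.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// One pool holds pages of a single size, so any free page fits any request.
class PagePool {
public:
    explicit PagePool(std::size_t pageBytes) : pageBytes_(pageBytes) {}

    // Reuse a free page if one exists; otherwise grow the pool.
    unsigned char* allocate() {
        if (free_.empty())
            free_.push_back(
                std::unique_ptr<unsigned char[]>(new unsigned char[pageBytes_]));
        unsigned char* raw = free_.back().release();
        free_.pop_back();
        ++inUse_;
        return raw;
    }

    // Hand a page back to the pool for later reuse.
    void release(unsigned char* page) {
        assert(inUse_ > 0);
        --inUse_;
        free_.push_back(std::unique_ptr<unsigned char[]>(page));
    }

    // Give free pages back to the system once they exceed the given
    // fraction of the pool (assumed threshold semantics).
    void trim(double maxFreeFraction) {
        const std::size_t total = inUse_ + free_.size();
        if (total > 0 &&
            static_cast<double>(free_.size()) / total > maxFreeFraction)
            free_.clear();
    }

    std::size_t freePages() const { return free_.size(); }
    std::size_t pagesInUse() const { return inUse_; }

private:
    std::size_t pageBytes_;
    std::size_t inUse_ = 0;
    std::vector<std::unique_ptr<unsigned char[]>> free_;
};

// Routes each request to the pool for its page size, creating pools lazily.
class PageAllocator {
public:
    unsigned char* allocate(std::size_t pageBytes) {
        return poolFor(pageBytes).allocate();
    }
    void release(unsigned char* page, std::size_t pageBytes) {
        poolFor(pageBytes).release(page);
    }
    PagePool& poolFor(std::size_t pageBytes) {
        auto it = pools_.find(pageBytes);
        if (it == pools_.end())
            it = pools_.emplace(pageBytes, PagePool(pageBytes)).first;
        return it->second;
    }
    std::size_t poolCount() const { return pools_.size(); }

private:
    std::map<std::size_t, PagePool> pools_;
};
```

Keeping sizes apart means a freed page can always satisfy the next request in its pool exactly, which is the property that makes holes easy to fill.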
Operators implement image processing algorithms. They are the building blocks of IL's execution model. IL includes over 70 functions, which are listed in the Appendix of this report. Future releases will provide additional functionality. Users may easily extend the available set of operators to encompass special needs as described in Section 4.2. Current IL operators include the following:
- color conversion
- arithmetic functions
- radiometric transforms
- geometric transforms
- statistics
- non-spatial domain transforms
- spatial domain transforms
- edge, line, and spot-detection
Operator images extend the object-oriented ilImage to the actual image processing algorithms. The operator image is a read-only image that contains the result of the image processing operation it performs. Operator images can be used as an input to another operator image, an ilFileImg object, or to an ilDisplay object. Since IL also provides a unified programming model, several operator images can also be chained together to execute a sequence of operations.
The operator image is a simple model. As shown in Figure 8, when the image is accessed with getTile, the cache is searched. If the requested tile does not exist in the cache, an image page fault occurs. The resulting call to getPage causes the input data to be requested from the input image with getTile. The returned data is processed as specified by the ilOperatorImg and the resulting page is stored in the cache. Using this technique, an ilOperatorImg only processes the part of the image necessary to provide the requested tile.
FIGURE 8 Operator Image Data Flow
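The data flow in Figure 8 can be approximated with a small page cache placed in front of a per-page compute routine. The sketch below is illustrative only: Source and NegateOperator are hypothetical classes, pages are square, the cache never evicts, and the "operator" merely negates its input.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

struct Rect { int x, y, w, h; };

// Hypothetical input image: a procedural source with no storage.
class Source {
public:
    float value(int x, int y) const { return static_cast<float>(y * 1000 + x); }
};

// Hypothetical operator image that negates its input, one page at a time.
class NegateOperator {
public:
    NegateOperator(const Source& in, int pageSize)
        : in_(in), pageSize_(pageSize) {
        assert(pageSize > 0);
    }

    // getTile: serve the request from cached pages, faulting pages in
    // as needed. (Per-pixel lookup keeps the sketch short, not fast.)
    std::vector<float> getTile(Rect r) {
        assert(r.x >= 0 && r.y >= 0 && r.w > 0 && r.h > 0);
        std::vector<float> out;
        out.reserve(static_cast<std::size_t>(r.w) * r.h);
        for (int y = r.y; y < r.y + r.h; ++y) {
            for (int x = r.x; x < r.x + r.w; ++x) {
                const Page& page = pageFor(x / pageSize_, y / pageSize_);
                out.push_back(page[(y % pageSize_) * pageSize_ + (x % pageSize_)]);
            }
        }
        return out;
    }

    int pageFaults() const { return faults_; }

private:
    using Page = std::vector<float>;

    const Page& pageFor(int col, int row) {
        const std::pair<int, int> key(col, row);
        auto it = cache_.find(key);
        if (it == cache_.end()) {  // image page fault
            ++faults_;
            it = cache_.emplace(key, getPage(col, row)).first;
        }
        return it->second;
    }

    // getPage: pull the input data for one page and process it.
    Page getPage(int col, int row) const {
        Page page(static_cast<std::size_t>(pageSize_) * pageSize_);
        for (int j = 0; j < pageSize_; ++j)
            for (int i = 0; i < pageSize_; ++i)
                page[j * pageSize_ + i] =
                    -in_.value(col * pageSize_ + i, row * pageSize_ + j);
        return page;
    }

    const Source& in_;
    int pageSize_;
    int faults_ = 0;
    std::map<std::pair<int, int>, Page> cache_;
};
```

An 8 x 8 tile that straddles a page corner triggers four page faults on first access; a later request that falls inside one of those pages is served from the cache without recomputation.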
Because IL considers data read with getPage to be clean until modified by setTile, and because operator images are read-only, the data in the cache is simply discarded when the page-replacement algorithm is invoked. Unlike file-based image classes, no backing store is required to hold the complete processed image. To prevent data from being recomputed, the cache must be large enough so that overlapping fetches do not read pages that have already been discarded. To save the result of an operation, an image file of the appropriate type or format (for example, ilTIFFImg) must be created and the operator image copied to it.
The potentially increased memory requirements associated with a larger image cache are offset by this model's reduced number of in-memory data shuffles and its minimized I/O overhead, both of which significantly boost operator performance.
Because operators in varying data types may access image data in the cache, that data is cached in its raw data type (after being decompressed, if necessary). Individual data accesses can explicitly recast it into a more desirable working data type. If an operator is performed predominantly on a particular data type, you can code that case to prevent the data from being converted to another type.
Upon creation, operator images define their input image(s) and control parameters. Although control parameters are normally supplied at construction time, IL also provides methods to change these control parameters after the image operator is created. By default, operator images inherit many attributes from their input image(s), such as data type, image size, and page size; methods to change these attributes are also provided by IL. When these attributes or control parameters are altered, any cached data in the operator image is discarded. In this way, an operator can have its behavior modified without having to be destroyed and re-created.
IL implements a pull execution model of image processing in which data is processed only on demand. Relative to push execution and virtual memory models, a pull execution model:
- eliminates the need to read and process excess data
- obviates the need for users to manage the results of intermediate processing
- decreases the start-up time normally associated with roaming on a large image
- provides an excellent foundation for parallel processing architectures
- reduces I/O and memory overhead
Typically, more than one image operator must be applied in order to produce a final image. These operators are implicitly chained together by IL's execution model. Because images can be applied to many image operators and the latter can require multiple input sources, IL supports multiple forward and backward linking. When an image in the chain is affected by parameter modifications, the succeeding images are informed that their results must be updated. Figure 9 illustrates the execution model.
The actual chaining is done implicitly as operator images are created. In Figure 9, for example, a link is established between TIFFImg and rotateImg during the creation of rotateImg. Specifically, rotateImg becomes a child of TIFFImg and TIFFImg becomes the parent of rotateImg. IL provides support for chaining multiple parents and multiple children within an image, and ilImage also supports the management of children and parent relations.
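The implicit linking and the update notification can be sketched with a small node class. This is an assumption-laden illustration rather than IL's mechanism: Node is a hypothetical class, each node has at most one parent (IL allows several), and "computing" a node only marks its cache as valid.

```cpp
#include <cassert>
#include <vector>

// Hypothetical chain node; not the ilImage class.
class Node {
public:
    // The link is established at construction: this node registers
    // itself as a child of its input (its parent).
    explicit Node(Node* parent = nullptr) : parent_(parent) {
        if (parent_)
            parent_->children_.push_back(this);
    }
    Node(const Node&) = delete;
    Node& operator=(const Node&) = delete;

    bool cacheValid() const { return valid_; }
    double parameter() const { return parameter_; }

    // Demand-driven evaluation: bring the parent up to date first.
    void compute() {
        if (parent_ && !parent_->valid_)
            parent_->compute();
        valid_ = true;
    }

    // Changing a parameter discards this node's cached result and
    // informs every succeeding node that its result is stale.
    void setParameter(double value) {
        parameter_ = value;
        invalidate();
    }

private:
    void invalidate() {
        valid_ = false;
        for (Node* child : children_)
            child->invalidate();
    }

    Node* parent_;
    std::vector<Node*> children_;
    bool valid_ = false;
    double parameter_ = 0.0;
};
```

Changing a parameter in the middle of a chain leaves the preceding images untouched and marks only that operator and its descendants for recomputation, including every branch when the chain forks.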
A conventional operator execution model would read the input image into a buffer, process the data, then write the result to an output image. IL's operator image model effectively eliminates the need to write out the contents of the buffer by allowing the data to be processed directly into the image cache.
In the case of a single operator that writes its result directly to a file-based image, an operator-image model provides little benefit because data must still be copied to the output image. However, when operations are chained together, each operator in the chain eliminates the requirement for another buffer move.
A chain of operations on a large image (larger than can be cached in memory) using the conventional operator model would require each intermediate stage to be flushed to disk and read back in for the next stage. In the pull model of chained operator images, no intermediate files are generated. Instead, the data for a piece of the final image is pulled all the way through the chain and, provided the caches are large enough to handle overlaps, will only be computed once for any given area of any operator image in the chain. Thus the pull execution model completely eliminates I/O overhead for intermediate processing steps, regardless of an image's size.
Another advantage of this model becomes apparent when roaming on a large operator image. Because only the portion being viewed must be processed, start-up time is significantly reduced. Those portions of the image that are not visited need never be processed at all.
FIGURE 9 Execution Model
IL's demand-driven, page-oriented execution model leads to a natural coarse-grained parallel processing implementation. IL executes multiple getPage calls simultaneously on multi-processor platforms. When a copyTile involves multiple pages, the latter are processed by separate threads of execution running on separate processors. The library thus takes advantage of parallel processing without requiring any recoding of getPage methods written for user-defined operators. Existing applications thus become "parallelized" without having to be modified in any way. A further benefit of the multi-threaded extension is the capability to perform look-ahead I/O, both automatically and under explicit control using the seekTile method.
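Page-level parallelism of this kind can be sketched with standard C++ threads. The code is a simplified illustration, not IL's scheduler: computePage is a hypothetical stand-in for a user-written getPage, pages are assigned to threads round-robin, and the thread count is passed in explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

using Page = std::vector<float>;

// Hypothetical stand-in for a user-defined getPage: it fills page `index`
// with a value derived from the index. Note that it contains no threading
// code of its own.
Page computePage(std::size_t index, std::size_t pixelsPerPage) {
    return Page(pixelsPerPage, static_cast<float>(index));
}

// Computes all pages of a request, spreading them over `threadCount`
// threads. Each thread handles pages t, t + threadCount, and so on, so
// no two threads ever write to the same slot.
std::vector<Page> computePagesInParallel(std::size_t pageCount,
                                         std::size_t pixelsPerPage,
                                         unsigned threadCount) {
    assert(threadCount > 0);
    std::vector<Page> pages(pageCount);
    std::vector<std::thread> workers;
    workers.reserve(threadCount);
    for (unsigned t = 0; t < threadCount; ++t) {
        workers.emplace_back([&pages, pageCount, pixelsPerPage, threadCount, t] {
            for (std::size_t i = t; i < pageCount; i += threadCount)
                pages[i] = computePage(i, pixelsPerPage);
        });
    }
    for (std::thread& worker : workers)
        worker.join();
    return pages;
}
```

Because pages are independent units of work, the per-page routine needs no changes to run in parallel; only the code that dispatches the pages knows about threads.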
Images accessed from disk are encoded in various formats, each requiring special handling. IL supports five file formats: TIFF, GIF, PhotoCD, the original SGI RGB file, and a sample format from Silicon Graphics (FIT). Each file format has its own object class: ilTIFFImg, ilGIFImg, ilPCDImg, ilSGIImg, and ilFITImg, respectively. The placement of these objects in the image object hierarchy is shown in Figure 10. IL also provides function wrappers to resolve image file types, simplifying the process of adding file types (see "Extending the Library" later in this report). The following sections describe in more detail how IL distinguishes and applies image file formats.
FIGURE 10 File Types in the Image Object Hierarchy
ilFileImg abstracts the concept of disk media as an I/O source for accessing or storing image data. ilFileImg is a descendant of ilImage and thus inherits the external interface used for depicting image data on disks. To fully support large images and allow for image data pre-fetching, the ilFileImg abstraction is derived from IL's in-memory model, ilMemCacheImg, and thus inherits the features of the cache image. In addition to supplying the standard ilImage interface and ilMemCacheImg's memory-management support, ilFileImg introduces the concept of a UNIX file and access mode.
Although the actual implementation for retrieving (getTile) and storing (setTile) image data is provided by the ilMemCacheImg class, it is the responsibility of classes derived from ilFileImg to populate the cache pages, performing any compression or decompression of the image data as it is stored or retrieved.
IL provides a wrapper function to be used for instantiating the appropriate ilFileImg type. Given a filename, ilOpenImgFile will return a pointer to one of the supported ilFileImg types (ilTIFFImg, ilGIFImg, ilPCDImg, ilSGIImg, or ilFITImg), or a NULL pointer if the file does not exist or if the image file format is unsupported. Similarly, IL provides a wrapper function for creating a new image file. Given a filename, ilCreateImgFile will return a pointer to the desired file format. Although ilCreateImgFile will create a TIFF image file by default, IL can create files in other supported formats as well.
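How such a wrapper might tell formats apart is not described in this report. One common approach, sketched below, is to inspect the first bytes of the file. The function and enum names are hypothetical, only the well-known TIFF, GIF, and SGI signatures are checked, and the Unknown result plays the role of the NULL return described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical names for illustration; not IL's own types.
enum class FileFormat { Unknown, TIFF, GIF, SGI };

// Guess a file's format from its leading bytes.
FileFormat detectFormat(const unsigned char* header, std::size_t length) {
    // TIFF: byte-order mark "II" or "MM" followed by the number 42.
    static const unsigned char tiffLittle[4] = {'I', 'I', 0x2A, 0x00};
    static const unsigned char tiffBig[4] = {'M', 'M', 0x00, 0x2A};
    if (length >= 4 && (std::memcmp(header, tiffLittle, 4) == 0 ||
                        std::memcmp(header, tiffBig, 4) == 0))
        return FileFormat::TIFF;

    // GIF: the ASCII signature "GIF87a" or "GIF89a".
    if (length >= 6 && (std::memcmp(header, "GIF87a", 6) == 0 ||
                        std::memcmp(header, "GIF89a", 6) == 0))
        return FileFormat::GIF;

    // SGI image file: big-endian magic number 474 (0x01DA).
    if (length >= 2 && header[0] == 0x01 && header[1] == 0xDA)
        return FileFormat::SGI;

    return FileFormat::Unknown;  // caller treats this as "unsupported"
}
```

A factory built on this function would construct the matching file-image class for each recognized format and return a null pointer for Unknown.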
TIFF is a tag-based image file format introduced by Aldus® and Microsoft®. The current release of TIFF, Version 6.0, provides support for storing one or more images in a file. This extensible image file format currently supports images stored with Lempel-Ziv, CCITT Group 3 or 4 facsimile, or PackBits compression, or with no compression at all. For further details of TIFF intrinsics, refer to the TIFF Developer's Toolkit (Version 5.0) or the TIFF 6.0 Draft manuals published by Aldus. IL fully supports the TIFF file format, and newly created images default to TIFF.
The ilTIFFImg class defines support for TIFF images. ilTIFFImg implements the constructors and reader and writer modules for encoding and decoding TIFF images properly. IL also employs some TIFF 6.0 extensions to provide support for tiled images and various TIFF data types. It extends the TIFF format to support the full range of IL data types better and to allow both multi-spectral and 3D images.
GIF is the CompuServe® Graphics Interchange Format. It is commonly used on computer bulletin boards. It is essentially a color-index format with a simple compression scheme. It thus stores files in a reduced space, with some degradation of image quality when full-color images are reduced to a color index.
The ilGIFImg class defines support for GIF images. ilGIFImg only supports a constructor for the reading of existing GIF images. ilGIFImg does not support the creation of new GIF images in this release.
The Kodak PhotoCD image pack format stores several representations of a single photographic image, making the image accessible in up to six different resolutions. These images are normally stored on a CD-ROM.
The ilPCDImg class defines support for PhotoCD images. Like GIF, only the reading of PhotoCD format images is supported. ilPCDImg implements the decompression of data from the DCT-based compression used in PhotoCD images. The PhotoCD images are stored in the YCC color model. The ilColorImg class provides the necessary color conversion to RGB for proper processing and display of such images.
The SGI file format is a simple file format used to store RGB or color-indexed image data. It supports run-length encoding and verbatim (no encoding) modes. SGI image pixel data is always composed of one or three bands; the three bands are always interpreted as RGB. For further details of SGI file-format intrinsics, refer to the ilSGIImg-related information in the Appendix of this report.
The ilSGIImg class defines support for SGI images. ilSGIImg implements the constructors and reader (getPage) and writer (setPage) modules for properly encoding and decoding SGI images.
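To make the run-length mode concrete, here is a minimal sketch of a decoder for one encoded scanline. The report itself does not spell out the encoding; the sketch assumes the scheme documented in the published SGI image file format specification for one-byte channel values, in which the low seven bits of a control byte give a count, a set high bit means "copy that many literal bytes", a clear high bit means "repeat the next byte that many times", and a count of zero ends the row. The function name decodeSgiRleRow is hypothetical and is not part of ilSGIImg.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Decode one run-length-encoded scanline of one-byte channel values.
// Truncated input stops decoding early instead of reading out of bounds.
std::vector<std::uint8_t> decodeSgiRleRow(const std::vector<std::uint8_t>& in) {
    std::vector<std::uint8_t> out;
    std::size_t i = 0;
    while (i < in.size()) {
        const std::uint8_t control = in[i++];
        const std::size_t count = control & 0x7F;
        if (count == 0) break;                      // end-of-row marker
        if (control & 0x80) {                       // literal run
            if (i + count > in.size()) break;
            for (std::size_t k = 0; k < count; ++k) out.push_back(in[i + k]);
            i += count;
        } else {                                    // repeated value
            if (i >= in.size()) break;
            const std::uint8_t value = in[i++];
            out.insert(out.end(), count, value);
        }
    }
    return out;
}
```

For example, the byte sequence 03 07 82 01 02 00 expands to the five values 7, 7, 7, 1, 2: a run of three sevens followed by two literal bytes and the end-of-row marker.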
Tiled image files are supported by TIFF. Some tags have been added and/or extended to store the number of channels and the size of the Z dimension.
In addition, IL supports a simple tiled file format called FIT. This format supports the full range of IL capabilities and is intended primarily as an example. The source code for this format is supplied as a basis for implementing user-defined file formats.
Tiling provides many benefits to an image processing application in which you will roam through a large image. Most significantly, tiled images allow you to avoid unnecessary disk I/O by requesting flexible, rectangular memory pages. The immediate results are reduced memory overhead and processing time and maximized application throughput.
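The saving in disk I/O is easy to quantify. The following sketch is an illustration rather than IL code; PageRangeSketch, pagesForRegion, and pagesInImage are hypothetical names. It computes which pages a requested rectangle touches and how many pages the whole image holds.

```cpp
#include <cassert>

// Inclusive range of page indices touched by a region.
struct PageRangeSketch {
    int firstX, lastX, firstY, lastY;
    int count() const { return (lastX - firstX + 1) * (lastY - firstY + 1); }
};

// Pages intersecting the rectangle [x, x+w) x [y, y+h).
PageRangeSketch pagesForRegion(int x, int y, int w, int h, int pageW, int pageH) {
    assert(x >= 0 && y >= 0 && w > 0 && h > 0 && pageW > 0 && pageH > 0);
    PageRangeSketch r;
    r.firstX = x / pageW;
    r.lastX = (x + w - 1) / pageW;
    r.firstY = y / pageH;
    r.lastY = (y + h - 1) / pageH;
    return r;
}

// Total number of pages needed to cover the whole image.
int pagesInImage(int width, int height, int pageW, int pageH) {
    assert(width > 0 && height > 0 && pageW > 0 && pageH > 0);
    return ((width + pageW - 1) / pageW) * ((height + pageH - 1) / pageH);
}
```

With 128x128 pages, a 300x200 region at (100, 50) touches 4 x 2 = 8 pages, whereas a 4096x4096 image holds 1024 pages in total, so roaming through that region reads under one percent of the file.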
IL's display capability plays an important role in the creation of an interactive image processing application.
An ilDisplay object defines what will be displayed in a GL or an X window. It manages a set of ilViews and provides methods to manipulate the set. When an ilImage is added, an ilView is created for it and added to the set. An existing ilView can be resized, moved, or deleted from the set.
The available ilDisplay methods include:
- push/pop Changes the stacking order of the selected views.
- setBorders Draws or erases borders on the selected views.
- findView Determines the view visible at a given location on the window.
- redraw Redraws the window.
- display Sets the size and position of an ilView.
- split Displays a split view of active ilViews. It modifies each ilView to accommodate the needs of the split view, using a default window partition method.
- align Allows views to be aligned to one another. Options are provided for left, right, top, bottom, and center alignment.
- wipe Performs a wipe of any combination of view edges, using the selected ilViews, and updates the ilViews accordingly.
- moveView Shifts the position of an ilView on the window by the desired amount in x and y.
- moveImg Shifts all the images in the selected ilViews by the desired amount in x and y, and updates the corresponding ilViews.
- update Allows the view size, view position, and image position in an ilView to be updated in one call.
An ilView is an object that associates an ilImage with a rectangular sub-area of a window, called a view. The ilView also defines what position in the ilImage is mapped to the origin of the view. Many of the methods of ilDisplay are also defined on ilView as a convenience (such as wipe, moveView, moveImg, and update).
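The relationship between a display, its stack of views, and the image position mapped to each view origin can be sketched as follows. This is a simplified model under stated assumptions, not the IL classes: ViewSketch, DisplaySketch, and windowToImage are hypothetical names, views are kept in a vector whose last element is the top of the stack, and the y axis is treated the same way in window and image coordinates.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct ViewSketch {
    int id;
    int x, y, width, height;  // view rectangle in window coordinates
    int imgX, imgY;           // image position mapped to the view origin
    bool contains(int wx, int wy) const {
        return wx >= x && wx < x + width && wy >= y && wy < y + height;
    }
};

class DisplaySketch {
public:
    void addView(const ViewSketch& v) {
        assert(v.width > 0 && v.height > 0);
        views_.push_back(v);  // a new view starts on top
    }
    // Topmost view covering the given window location, or nullptr.
    const ViewSketch* findView(int wx, int wy) const {
        for (std::vector<ViewSketch>::const_reverse_iterator it = views_.rbegin();
             it != views_.rend(); ++it)
            if (it->contains(wx, wy)) return &*it;
        return nullptr;
    }
    // Raise a view to the top of the stack.
    bool pop(int id) {
        std::vector<ViewSketch>::iterator it = locate(id);
        if (it == views_.end()) return false;
        std::rotate(it, it + 1, views_.end());
        return true;
    }
    // Push a view to the back of the stack.
    bool push(int id) {
        std::vector<ViewSketch>::iterator it = locate(id);
        if (it == views_.end()) return false;
        std::rotate(views_.begin(), it, it + 1);
        return true;
    }

private:
    std::vector<ViewSketch>::iterator locate(int id) {
        return std::find_if(views_.begin(), views_.end(),
                            [id](const ViewSketch& v) { return v.id == id; });
    }
    std::vector<ViewSketch> views_;
};

// Map a window location inside a view to the image pixel shown there.
bool windowToImage(const ViewSketch& v, int wx, int wy, int& ix, int& iy) {
    if (!v.contains(wx, wy)) return false;
    ix = v.imgX + (wx - v.x);
    iy = v.imgY + (wy - v.y);
    return true;
}
```

In this model, moving an image within its view amounts to changing imgX and imgY, while moving the view changes x and y and leaves the image mapping untouched.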
An ilDisplayImg is an abstract class that provides access to a GL or an X window. ilDisplayImg is derived from ilImage and provides data access through getTile and setTile. There are two classes of display image derived from ilDisplayImg:
- ilGLDisplayImg for pure GL or mixed model applications
- ilXDisplayImg for applications that want to use pure X rendering (for instance, to display on an X-terminal)
FIGURE 11 Display Image Abstraction Hierarchy
An ilGLDisplayImg or ilXDisplayImg is automatically created for a user invoking ilDisplay. On Silicon Graphics platforms, the default behavior is to use GL rendering because of the performance advantages it offers:
- Copying between ilDisplayImgs uses hardware-supported rectcopy.
- Copying from an ilCacheImg or ilMemoryImg to an ilDisplayImg uses hardware-supported functions such as lrectwrite, with pixmode setting any necessary stride so that the data is picked directly out of the cache pages or memory array.
- Copying from an ilDisplayImg to an ilCacheImg or ilMemoryImg is implemented with lrectread using pixmode.
- Operator chains that end in an ilGLDisplayImg may be subject to acceleration using the graphics hardware.
When displaying on Silicon Graphics platforms, IL automatically takes advantage of the graphics hardware features that can accelerate the requested operations. The operations supported on the available graphics options are shown in Table 1.
IL automatically evaluates any operator chain that terminates in an ilGLDisplayImg; operations at the end of the chain that can be collapsed together and executed in the graphics pipeline will be performed there. The RealityEngine™ graphics subsystem includes special support that allows operator chains requiring multiple passes through the graphics pipeline to be accelerated. An auxiliary buffer is allocated in the frame buffer by IL to hold intermediate results between passes. This buffer is independent of the windowing system, so results are not affected by partially obscured windows or edge effects on neighborhood operations. This buffer also allows processing chains that do not end in a displayed result to be accelerated and read back to the CPU.
As with the multi-processing support, graphics hardware support is integrated transparently to the application program. If, for some reason, it is desirable to perform an operation in the CPU, the hardware acceleration can be overridden on individual operators with a simple procedure call. This can be useful for debugging or, in some cases, to achieve a more precise result.
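One way to picture the evaluation of a chain is as a search for the longest run of operators at the end of the chain that the graphics hardware can execute and that the application has not forced onto the CPU. The sketch below illustrates that idea only; OpSketch, its fields, and firstAcceleratedOp are hypothetical names, and the actual IL logic for collapsing operators is not described in this report.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct OpSketch {
    std::string name;
    bool hwCapable;  // the graphics option in use can execute this operator
    bool forceCpu;   // the application has overridden hardware acceleration
};

// Index of the first operator of the trailing run that can execute in the
// graphics pipeline. Operators before that index run on the CPU. Returns
// chain.size() when nothing at the end of the chain can be accelerated.
std::size_t firstAcceleratedOp(const std::vector<OpSketch>& chain) {
    std::size_t i = chain.size();
    while (i > 0 && chain[i - 1].hwCapable && !chain[i - 1].forceCpu) --i;
    assert(i <= chain.size());
    return i;
}
```

For a chain of Fourier transform, scale, lookup table, and zoom in which only the last three are hardware capable, the split falls after the Fourier transform. Forcing the lookup table onto the CPU moves the split so that only the zoom runs in the graphics pipeline, because everything upstream of a CPU operator must also be computed on the CPU.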
TABLE 1 Supported Operations by Platform
-----------------------------------------------------------------------
Operation                               PI        GT     VGX    Reality
                                        Indigo    GTX           Engine
                                        Starter
                                        Elan
-----------------------------------------------------------------------
Integer Zoom (1)                        x         x      x      x
Continuous Zoom (1)                                      x      x
Blend                                             x      x      x
Logic Ops                                         x      x      x
Zoom, Rotation & Polynomial Warp (2)                            x
Convolution (3)                                                 x
Min/Max & Histogram                                             x
Scale and Bias                                                  x
Lookup Table                                                    x
Color Model Conversions                                         x
Data Type Conversions                                           x
-----------------------------------------------------------------------
Notes:
1. Nearest neighbor resampling only
2. Nearest neighbor, bilinear or bicubic resampling
3. 3x3, 5x5, 7x7 general and separable convolution kernels
IL has been designed to be very flexible in order to accommodate the wealth of file formats, algorithms, and data sources with which image processing application developers must work. Figure 12 shows how user-defined extensions fit into the image object hierarchy.
FIGURE 12 Library Extensions
ilFileImg provides the internal interfaces for programmers to develop their own formats. To support a specific image file format, the following functions must be fully implemented:
- Constructors The constructor must open or create the given file. If the access mode is read-only, then the file header and information must be retrieved. Information about image size, data type, order, page size, and color model must be initialized.
- Reading from file The getPage function provides the external interface for reading raw image data into the cache. getPage must be implemented to read and, if necessary, decompress a tile of data from the file.
- Writing to file The setPage function provides the external interface for writing image data from the cache to the file. The image data passed by setPage is provided in raw format and must therefore be compressed, if the format requires it, before it is written out to the file.
- Opening a file This is a function used to open an existing image file or create a new image file. The function's responsibility is to validate the named file's format and to return a pointer to the object. The file is only created when you have specified size, data type, and order.
IL determines the supported file formats by searching for dynamic shared objects (DSOs) that contain the code to read and write a particular format. IL scans a standard directory for file format DSOs and, optionally, a user-defined directory. All of the file formats found are maintained in an internal list of registered file routines. The ilOpenImgFile and ilCreateImgFile functions are simple iterators that loop through the list of supported formats when attempting to instantiate the appropriate image file type. Thus, by creating a new DSO with the code to read a user-defined format and placing it in a directory where IL will find it, new file types are integrated automatically and seamlessly. An existing IL-based application will automatically be able to read and write files of the new format without recompiling or even relinking.
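The list of registered file routines can be modeled as a table of entries, each pairing a format check with a factory, which the open function walks until one entry accepts the file. The sketch below is an in-process illustration under stated assumptions: it does not load DSOs, it identifies formats by leading bytes only, and FileImageSketch, FormatEntry, FormatRegistry, and magicFormat are hypothetical names rather than IL interfaces. The "MYF1" signature is invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct FileImageSketch {
    virtual ~FileImageSketch() {}
    virtual std::string formatName() const = 0;
};

class NamedImageSketch : public FileImageSketch {
public:
    explicit NamedImageSketch(const std::string& n) : name_(n) {}
    std::string formatName() const override { return name_; }
private:
    std::string name_;
};

struct FormatEntry {
    std::string name;
    std::function<bool(const std::vector<std::uint8_t>&)> matches;
    std::function<std::unique_ptr<FileImageSketch>()> create;
};

class FormatRegistry {
public:
    void registerFormat(FormatEntry e) {
        assert(e.matches && e.create);
        entries_.push_back(std::move(e));
    }
    // Walk the registered formats; return the first that accepts the header,
    // or a null pointer when the format is unsupported.
    std::unique_ptr<FileImageSketch> open(const std::vector<std::uint8_t>& header) const {
        for (std::size_t i = 0; i < entries_.size(); ++i)
            if (entries_[i].matches(header)) return entries_[i].create();
        return std::unique_ptr<FileImageSketch>();
    }
private:
    std::vector<FormatEntry> entries_;
};

// Build an entry that recognizes a format by its leading bytes.
FormatEntry magicFormat(const std::string& name, const std::string& magic) {
    FormatEntry e;
    e.name = name;
    e.matches = [magic](const std::vector<std::uint8_t>& h) {
        return h.size() >= magic.size() &&
               std::equal(magic.begin(), magic.end(), h.begin(),
                          [](char c, std::uint8_t b) {
                              return static_cast<std::uint8_t>(c) == b;
                          });
    };
    e.create = [name]() {
        return std::unique_ptr<FileImageSketch>(new NamedImageSketch(name));
    };
    return e;
}
```

Registering one more entry is all it takes for the same open call to start recognizing a new format, which mirrors the way a newly installed DSO extends an application that was built before the format existed.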
IL makes it easy to add new operators. IL provides multiple base classes from which new operators can be derived:
- ilOpImg This is the most general class. It is derived from ilCacheImg and, like ilFileImg, the method for getPage must be supplied. For an operator, this method is responsible for reading from the input image(s) and performing the desired operation, leaving the result in the operator image's cache. If the operation computes parameters based on the attributes of its input(s), the programmer must define the resetOp method to compute these parameters when the input(s) are changed.
- ilSpatialImg This class is used as a base for spatial or neighborhood operators. It is derived from ilOpImg and automatically fetches the input data with sufficient padding around the page being computed to support the neighborhood operation. The programmer defines the calcPage method responsible for performing the operation from an input buffer into an output buffer (the operator's cache).
- ilWarpImg This class defines a generalized warp algorithm. It provides all the mechanics of fetching and resampling the image data. The programmer supplies an address generation method that maps the output image addresses to the input image.
- ilMonadicImg This class makes it easy to add simple single-input point-processing algorithms to the library. Like ilSpatialImg, it performs all I/O for the programmer, who is only responsible for the calcPage method, which is passed an input buffer full of data and an output buffer to receive the result.
- ilDyadicImg This class is identical to ilMonadicImg except that two input images are provided and two input buffers are passed to calcPage.
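The pattern shared by the single-input and two-input base classes is that the framework owns the I/O and the programmer writes only calcPage. The following sketch shows that pattern in isolation. It is not the IL class interface: MonadicOpSketch, DyadicOpSketch, ThresholdSketch, AddSketch, and the apply method are hypothetical names, and pages are reduced to flat float buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Single-input point operator: apply() plays the role of the framework,
// calcPage() is the only method a new operator has to supply.
class MonadicOpSketch {
public:
    virtual ~MonadicOpSketch() {}
    std::vector<float> apply(const std::vector<float>& in) const {
        std::vector<float> out(in.size());
        calcPage(in.data(), out.data(), in.size());
        return out;
    }
protected:
    virtual void calcPage(const float* in, float* out, std::size_t n) const = 0;
};

// Two-input variant: calcPage() receives two input buffers.
class DyadicOpSketch {
public:
    virtual ~DyadicOpSketch() {}
    std::vector<float> apply(const std::vector<float>& a,
                             const std::vector<float>& b) const {
        assert(a.size() == b.size());
        std::vector<float> out(a.size());
        calcPage(a.data(), b.data(), out.data(), a.size());
        return out;
    }
protected:
    virtual void calcPage(const float* in1, const float* in2, float* out,
                          std::size_t n) const = 0;
};

// Example operator: pixels at or above the threshold become 1, others 0.
class ThresholdSketch : public MonadicOpSketch {
public:
    explicit ThresholdSketch(float t) : threshold_(t) {}
protected:
    void calcPage(const float* in, float* out, std::size_t n) const override {
        for (std::size_t i = 0; i < n; ++i)
            out[i] = in[i] >= threshold_ ? 1.0f : 0.0f;
    }
private:
    float threshold_;
};

// Example operator: pixelwise sum of two pages.
class AddSketch : public DyadicOpSketch {
protected:
    void calcPage(const float* in1, const float* in2, float* out,
                  std::size_t n) const override {
        for (std::size_t i = 0; i < n; ++i) out[i] = in1[i] + in2[i];
    }
};
```

A new point operator written this way contains no file, cache, or tiling logic at all, which is the property the base classes are meant to provide.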
IL has the flexibility to support the acquisition of images from various sources.
For example, applications requiring analysis of video information that is or can be converted to digital format can use IL for image processing. IL currently provides the protocol required to pass data through image operators for processing. To support access to data from frame digitizers, an object class, ilVideoImg, derived from ilImage, could be implemented. The class definition would implement how data is accessed from the frame grabber, namely, initialization (constructor), getTile, and setTile methods:
- Constructors Must initialize or perform the necessary operations to establish communication to that source. Upon creation, ilImage attributes such as image size, data type, and organization must be initialized.
- getTile Must retrieve a specified block of data in the given configuration (channel order, data ordering, and data type). The main responsibility of getTile is to retrieve the corresponding block of data from its source. IL provides an object, ilConverter, to perform required conversions.
- setTile Must store a given block of data to the image source. Data, specified in a given configuration, must be converted (if necessary) before writing it to the source.
These functions must invoke a set of control functions provided by the frame-grabber device driver or library. The constructor must execute the required control sequences to establish communication between the process and the frame grabber. Similarly, the reader (getTile) and writer (setTile) must invoke the corresponding set of request functions for reading and writing data to the frame grabber.
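A rough outline of such a class is sketched below. Since ilVideoImg is only proposed in this report, everything here is an assumption: FrameGrabberSketch stands in for a device driver or library (its open, readPixel, and writePixel calls are invented for the example), VideoImageSketch stands in for the image class, and data is limited to one 8-bit channel so that no conversion step is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical frame-grabber driver: an in-memory frame with simple
// control and request functions.
class FrameGrabberSketch {
public:
    FrameGrabberSketch(int w, int h)
        : w_(w), h_(h), opened_(false), frame_(static_cast<std::size_t>(w) * h, 0) {}
    bool open() { opened_ = true; return true; }
    bool isOpen() const { return opened_; }
    int width() const { return w_; }
    int height() const { return h_; }
    std::uint8_t readPixel(int x, int y) const {
        return frame_[static_cast<std::size_t>(y) * w_ + x];
    }
    void writePixel(int x, int y, std::uint8_t v) {
        frame_[static_cast<std::size_t>(y) * w_ + x] = v;
    }
private:
    int w_, h_;
    bool opened_;
    std::vector<std::uint8_t> frame_;
};

// Hypothetical image class that exposes the grabber through tile access.
class VideoImageSketch {
public:
    // Constructor: establish communication and initialize image attributes.
    explicit VideoImageSketch(FrameGrabberSketch& g) : grabber_(g) {
        const bool ok = grabber_.open();
        assert(ok);
        (void)ok;
        width_ = grabber_.width();
        height_ = grabber_.height();
    }
    // getTile: read the block [x, x+w) x [y, y+h) from the source.
    std::vector<std::uint8_t> getTile(int x, int y, int w, int h) const {
        assert(inBounds(x, y, w, h));
        std::vector<std::uint8_t> out;
        out.reserve(static_cast<std::size_t>(w) * h);
        for (int j = 0; j < h; ++j)
            for (int i = 0; i < w; ++i)
                out.push_back(grabber_.readPixel(x + i, y + j));
        return out;
    }
    // setTile: write a row-major block of data back to the source.
    void setTile(int x, int y, int w, int h, const std::vector<std::uint8_t>& data) {
        assert(inBounds(x, y, w, h));
        assert(data.size() == static_cast<std::size_t>(w) * h);
        for (int j = 0; j < h; ++j)
            for (int i = 0; i < w; ++i)
                grabber_.writePixel(x + i, y + j,
                                    data[static_cast<std::size_t>(j) * w + i]);
    }
    int width() const { return width_; }
    int height() const { return height_; }
private:
    bool inBounds(int x, int y, int w, int h) const {
        return x >= 0 && y >= 0 && w > 0 && h > 0 &&
               x + w <= width_ && y + h <= height_;
    }
    FrameGrabberSketch& grabber_;
    int width_;
    int height_;
};
```

Because the class presents the same tile-level access as any other image, operators downstream need no knowledge that the pixels originate in a frame grabber rather than a file.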
APPENDIX List of Operators
A.1 Image Support Functions
ilCombineImg Combines two images based on an ROI mask.
ilCreateImgFile Creates an image file and returns an ilFileImg handle.
ilMemoryImg Accesses a contiguous array of data as an ilImage.
ilMergeImg Merges several images into one.
ilNopImg Performs page size, data type, order, and coordinate space conversions on an image.
ilOpenImgFile Opens an image file and returns an ilFileImg handle.
ilRoiImg Masks accesses to an image by an ROI mask.
ilSubImg Selects a rectangular subregion and/or channel subset of an image.
ilSwitchImg Implements a switch construct in an image operator chain.
A.2 Color Conversion
ilColorImg Provides support for generic color conversion.
ilFalseColorImg Performs false coloring of multispectral images.
ilBGRImg Performs BGR color conversion.
ilRGBImg Performs RGB color conversion.
ilCMYKImg Performs CMYK color conversion.
ilHSVImg Performs HSV color conversion.
ilGrayImg Performs grayscale color conversion.
ilABGRImg Converts an image to GL's 24-bit ABGR lrectwrite format.
ilSGIPaletteImg Performs color conversion to the system default GL color map and rectwrite format.
ilSaturateImg Performs color saturation of an image.
A.3 Arithmetic Functions
ilAbsImg Computes the pixelwise absolute value on an image.
ilAddImg Performs a pixelwise addition of two images.
ilAndImg Performs a pixelwise logical-and of two images.
ilArithLutImg Provides basic support for arithmetic operations implemented with a look-up table.
ilDivImg Performs a pixelwise division of two images.
ilDyadicImg Provides basic support for two-image operators.
ilExpImg Computes the pixelwise exponent of an image.
ilInvertImg Computes the complement of an image.
ilLogImg Computes the pixelwise log on an image.
ilMaxImg Performs a pixelwise maximum or ">" of two images.
ilMinImg Performs a pixelwise minimum or "<" of two images.
ilMonadicImg Provides basic support for single-image operators.
ilMultiplyImg Performs a pixelwise multiplication of two images.
ilNegImg Computes the pixelwise negation on an image.
ilOrImg Performs a pixelwise logical-or of two images.
ilPowerImg Performs a power law scaling of an image.
ilSqRootImg Computes the pixelwise square root of an image.
ilSquareImg Computes the pixelwise square of an image.
ilSubtractImg Performs a pixelwise subtraction of two images.
ilXorImg Performs a pixelwise exclusive-or of two images.
A.4 Radiometric Transforms
ilHistEqImg Performs histogram equalization of an image.
ilHistNormImg Transforms an image to a desired mean and standard deviation.
ilHistScaleImg Performs histogram scaling of an image.
ilScaleImg Performs a linear scale of an image.
ilThreshImg Performs thresholding of an image.
A.5 Geometric Transforms
ilRotZoomImg Rotates, zooms, and translates an image with optional flip.
ilWarpImg Performs generalized warp on an image.
ilPolyWarpImg Performs up to a seventh-order polynomial warp on an image.
ilTieWarpImg Performs a two-dimensional warp on an image.
A.6 Statistics
ilImgStat Computes the histogram, mean, and standard deviation pixel values of an image.
A.7 Non-spatial Domain Transforms
ilRFFTfImg Performs forward Fourier transform on an image.
ilRFFTiImg Performs an inverse Fourier transform on an image.
ilFFTAvg Computes the average power spectrum of an image.
ilFConjImg Computes the conjugate of an image and normalizes the complex value by a real factor.
ilFCrCorrImg Computes the cross correlation of two images.
ilFDivImg Divides two FFT images.
ilFDyadicImg Provides support for dual-input Fourier domain operators.
ilFFiltImg Provides support for Fourier filter operators.
ilFExpFiltImg Applies a Fourier domain filter to a Fourier domain image.
ilFGaussFiltImg Applies a Gaussian domain filter to a Fourier domain image.
ilFMagImg Accesses only the magnitude values of an FFT image.
ilFMergeImg Merges two single-band images and accesses them as an FFT image.
ilFMonadicImg Provides support for single-input Fourier domain operators.
ilFMultImg Multiplies two FFT images.
ilFPhaseImg Accesses only the phase values of an FFT image.
ilFRaisePwrImg Raises the Fourier coefficients of an FFT image by a power in the log domain.
A.8 Spatial Domain Transforms
ilBlurImg Blurs an image.
ilGBlurImg Performs a Gaussian blur on an image.
ilConvImg Convolves an image with specified kernel.
ilDilateImg Performs a morphological dilation on an image.
ilErodeImg Performs a morphological erosion on an image.
ilMaxFltImg Performs a maximum filter on an image.
ilMedFltImg Performs a median filter on an image.
ilMinFltImg Performs a minimum filter on an image.
ilRankFiltImg Performs a rank filter on an image.
ilSepConvImg Convolves an image with a separable kernel.
ilSharpenImg Sharpens an image.
ilSpatialImg Provides the basis for generalized spatial or neighborhood operators.
A.9 Edge, Line, & Spot Detection
ilCompassImg Performs a 2D, 3x3 gradient transform on an image.
ilLaplaceImg Performs a 2D, 3x3 Laplace transform on an image.
ilRobertsImg Computes the gradient vector of an image by performing two 2D spatial convolutions in the x and y directions, using the 3x3 Roberts kernels.
ilSobelImg Computes the gradient vector of an image by performing two 2D spatial convolutions in the x and y directions, using the 3x3 Sobel kernels.
A.10 Image File Support
ilTIFFImg Reads and writes TIFF formatted image files.
ilGIFImg Reads GIF formatted image files.
ilPCDImg Reads Kodak PhotoCD formatted image files.
ilSGIImg Reads and writes SGI RGB formatted image files.
ilFITImg Reads and writes SGI simple tiled formatted image files.
ilFileImg Provides the basis for reading image data from a file.
A.11 Miscellaneous
ilLutImg Performs look-up table manipulation of an image.
ilBlendImg Blends two images.
ilCombineImg Combines two images based on an ROI mask.
ilMergeImg Merges several images into one.
ilRoiImg Masks accesses to an image by an ROI mask.
ilSubImg Selects a rectangular subregion and/or channel subset of an image.
A.12 Display
push Pushes a view to the back of the stack of views.
pop Raises a view to the top of the stack of views.
setBorders Draws or erases borders on the selected views.
findView Determines the view visible at a given location on the window.
redraw Redraws the window.
display Sets the size and position of an ilView.
split Displays a split view of active ilViews. It modifies each ilView to accommodate the needs of the split view, using a default window partition method.
align Allows views to be aligned to each other. Options are provided for left, right, top, bottom, and center alignment.
wipe Performs a wipe of any combination of view edges, using the selected ilViews, and updates the ilViews accordingly.
moveView Shifts the position of an ilView on the window by the desired amount in x and y.
moveImg Shifts all the images in the selected ilViews by the desired amount in x and y and updates the corresponding ilViews.
update Allows the view size, view position and image position in an ilView to be updated in one call.