Path: uni-muenster.de!news.dfn.de!xlink.net!howland.reston.ans.net!spool.mu.edu!sgiblab!a2i!flash.us.com!britt.pax.tpa.com.au!not-for-mail
From: dclunie@britt.pax.tpa.com.au (David Clunie)
Newsgroups: alt.image.medical,sci.med.radiology,comp.protocols.dicom
Subject: Revised draft FAQ on image formats
Followup-To: alt.image.medical
Date: 24 Jun 1994 12:13:57 +0300
Organization: Her Master's Voice
Lines: 1770
Distribution: world
Message-ID: <2ue84l$un@britt.pax.tpa.com.au>
Reply-To: dclunie@flash.us.com
NNTP-Posting-Host: britt.ksapax
Xref: uni-muenster.de sci.med.radiology:359 comp.protocols.dicom:221

This is not yet a real FAQ, just a draft on which I am actively working
that gives some idea of the direction in which I am progressing. I would
be very grateful if those who know about this sort of stuff would give me
as much help as they can. I have all the GE stuff ready to include when
I get through figuring out a compact way to describe it, and a little of the
SPI and Siemens information, as well as sections on ACR/NEMA, DICOM, and
Interfile.

I desperately need more information from non-GE vendors though, as well as
manufacturers of other modalities, and in particular email addresses of
helpful people would be nice.

david (dclunie@flash.us.com)

---------------------------------------------------------------------------
~Newsgroups: alt.image.medical,comp.protocols.dicom,sci.data.formats,
            sci.med.radiology,alt.answers,comp.answers,sci.answers,
            news.answers
~Subject: Medical Image Format Frequently Asked Questions (FAQ)
~From: dclunie@flash.us.com (David A. Clunie)
Followup-To: alt.image.medical
~Reply-To: dclunie@flash.us.com (David A. Clunie)
Summary: This posting contains answers to the most Frequently Asked
         Question on alt.image.medical - how do I convert from image
         format X from vendor Y to something I can use ? In addition
         it contains information about various standard formats.

Archive-name: medicalimage-faq
Posting-Frequency: monthly
Last-modified: Thu Jun 23 19:20:46 GMT+0300 1994
Version: 1.0

This message is automatically posted once a month to help readers looking
for information about medical image formats. If you don't want to see this
posting every month, please add the subject line to your kill file.

Many FAQs, including this Listing, are available on the archive site
pit-manager.mit.edu (alias rtfm.mit.edu) [18.172.1.27] in the directory
pub/usenet/news.answers.  The name under which a FAQ is archived appears
in the Archive-name line at the top of the article.

There's a mail server on that machine. You send an e-mail message to
mail-server@pit-manager.mit.edu  containing the keyword "help" (without
quotes!) in the message body.

Changes are marked with a preceding "|".  You can skip to them
by typing g^| in (most) newsreaders.

Changes this issue: 
    none.

Note: this FAQ has been formatted as a digest.  Many newsreaders
can skip to each of the major subsections by pressing ^G.

Please direct comments or questions and especially contributions to

    "dclunie@flash.us.com"

or reply to this article.

--------
~Subject: Index

1.  Introduction
    1.1 Objective
    1.2 Types of Formats
    1.3 In Desperation - Quick & Dirty Tricks
2.  Standard Formats
    2.1 ACR/NEMA 1.0 and 2.0
    2.2 ACR/NEMA DICOM 3.0
    2.3 Papyrus
    2.4 Interfile V3.3
3.  Proprietary Formats
    3.1 General
        3.1.1 SPI (Standard Product Interconnect)
    3.2 CT
        3.2.1 General Electric
              3.2.1.1 CT 9800
                      3.2.1.1.1 Image data
                      3.2.1.1.2 Tape format
                      3.2.1.1.3 Raw data
              3.2.1.2 CT Advantage - Genesis
                      3.2.1.2.1 Image data
                      3.2.1.2.2 Archive format
                      3.2.1.2.3 Raw data
              3.2.1.3 Scitec/Pace
        3.2.2 Siemens
        3.2.3 Philips
        3.2.4 Picker
        3.2.5 Toshiba
        3.2.6 Hitachi
        3.2.7 Shimadzu
        3.2.8 Elscint
    3.3 MR
        3.3.1 General Electric
              3.3.1.1 Signa 3X and 4X
                      3.3.1.1.1 Image data
                      3.3.1.1.2 Tape format
                      3.3.1.1.3 Raw data
              3.3.1.2 Signa 5X - Genesis
                      3.3.1.2.1 Image data
                      3.3.1.2.2 Tape format
                      3.3.1.2.3 Raw data
              3.3.1.3 Vectra
        3.3.2 Siemens
              3.3.2.1 GBS I/II
              3.3.2.2 SP/Vision
              3.3.2.3 Impact
        3.3.3 Philips
              3.3.3.1 pre-ACS
              3.3.3.2 ACS
              3.3.3.3 T5
              3.3.3.4 NT5 & NT15
        3.3.4 Picker
        3.3.5 Toshiba
        3.3.6 Hitachi
        3.3.7 Shimadzu
        3.3.8 Elscint

4.  Host Machines
    4.1 Data General
        4.1.1 Data
              4.1.1.1 Integers
              4.1.1.2 Floating Point
        4.1.2 Operating System
    4.2 Vax
        4.2.1 Data
              4.2.1.1 Integers
              4.2.1.2 Floating Point
        4.2.2 Operating System
    4.3 Sun4 - Sparc
        4.3.1 Data
              4.3.1.1 Integers
              4.3.1.2 Floating Point
        4.3.2 Operating System

5.  Compression Schemes
    5.1 Reversible
    5.2 Irreversible
        5.2.1 Perimeter Encoding

6.  Getting Connected
    6.1 Tapes
    6.2 Ethernet
    6.3 Serial Ports

7.  Sources of Information
    7.1 Vendor Contacts
    7.2 Relevant FAQ's
    7.3 Source Code
    7.4 Commercial Offerings
    7.5 FTP sites
    7.6 Mailservers
    7.7 References

--------
~Subject: Introduction

1.  Introduction

    1.1 Objective

        The goal of this FAQ is to facilitate access to medical images stored 
on digital imaging modalities such as CT and MR scanners, and their 
accompanying descriptive information. The document is designed primarily for 
those who do not have access to the necessary proprietary tools or 
descriptions, particularly in those moments when inspiration strikes and one 
just can't wait for the local sales person to track down the necessary 
authority and go through the cycle of correspondence necessary to get a 
non-disclosure agreement in place, by which time interest in the project has 
usually faded, and another great research opportunity has passed ! It may also 
be helpful for those keen to experiment with home-grown PACS-like systems using 
their existing equipment, and also for those whose equipment is still useful 
but so old that even the host computer vendor doesn't support it any 
more !

        There is of course no substitute for the genuine tools or descriptions 
from the equipment vendors themselves, and pointers to helpful individuals in 
various organizations, as well as names and catalog numbers of various useful 
documents, are included here where known.

        In addition there are several small companies that specialize in such 
connectivity problems that have a good reputation and are well known. Contact 
information is provided for them, though I personally have no experience with 
their products and am not endorsing them.

        Finally, great care has been taken not to include any information that 
has been released under non-disclosure agreements. What is included here is the 
result of either information freely released by vendors, handy hints from 
others working in the field, or in many cases close scrutiny of hex dumps and 
experimentation with scanner parameters and study of the effects on the image 
files. The intent is to spread hard-earned knowledge gained over many years 
amongst those new to the field or a particular piece of equipment, not to 
threaten anyone's proprietary interests, or to substitute for the technical 
support available from vendors that ranges from free to extortionate, and 
excellent to abysmal, depending on who you are dealing with and where in the 
world you are located !

         Please use this information in the spirit in which it is intended, and 
where possible contribute whatever you know in order to expand the information 
to cover more vendors and equipment.


    1.2 Types of Formats

        Later sections will deal with the problems of getting the image files 
from the modality to the workstation, but for the moment assume the files are 
there and need to be deciphered.

        Three types of information are generally present in these files:

            - image data, which may be unmodified or compressed,
            - patient identification and demographics,
            - technique information about the exam, series, and slice/image.

        Extracting the image information alone is usually straightforward and 
is described in 1.3. Dealing with the descriptive information, for example to 
make use of the data for dissemination in a PACS environment, or to extract 
geometry details in order to combine images into 3D datasets, is more difficult 
and requires deeper understanding of how the files are constructed.

        There are three basic families of formats that are in popular use:

            - fixed format, where layout is identical in each file,
            - block format, where the header contains pointers to information,
            - tag based format, where each item contains its own length.

        The block format is one of the most popular, though in most cases, the 
early part of the header contains only a limited number of pointers to large 
blocks, the blocks are almost always in the same place and a constant length, 
for standard rather than reformatted images at least, and if one doesn't know 
the specifics of the layout one can get by assuming a fixed format. I presume 
this reflects the intent of the designers to handle future expansion and 
revision of the format.

         The example par excellence of the tag based format is the ACR/NEMA 
style of data stream, which, though never intended as a file format per se has 
proven useful as a model. See for example the sections dealing with the ACR/NEMA 
standards as well as DICOM (whose creators are about to vote on a media 
interchange format after all this time) and Papyrus. ACR/NEMA style tags are 
described in more detail elsewhere, but each is self-contained and 
self-describing (at least if you have the appropriate data dictionary) and 
contains its own length, so if you can't interpret it you can skip it ! Very 
convenient. Most file formats based on this scheme are just concatenated series 
of tags, and apart from having to guess the byte order, which is not specified 
(unlike TIFF which is a similar deal for those in the "real" imaging world), 
and sometimes skip a fixed length but short header, are dead easy to handle.
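
        A minimal sketch of the skip-by-length idea, assuming a plain 
concatenation of elements laid out as a 16 bit group, a 16 bit element and a 
32 bit length followed by the value. The function names are my own inventions, 
and the byte order guess is only a heuristic (try both and keep the one that 
walks cleanly to the end of the data):

```python
import struct


def walk_elements(buf, byte_order):
    """Yield (group, element, value) for each element in buf.

    byte_order is "<" (little-endian) or ">" (big-endian), as in struct.
    Raises ValueError if an element runs past the end of the buffer.
    """
    offset = 0
    while offset < len(buf):
        if offset + 8 > len(buf):
            raise ValueError("truncated element header")
        group, element, length = struct.unpack_from(
            byte_order + "HHI", buf, offset)
        offset += 8
        if offset + length > len(buf):
            raise ValueError("element length runs past end of data")
        yield group, element, buf[offset:offset + length]
        # An element we cannot interpret is skipped purely by its length.
        offset += length


def guess_byte_order(buf):
    """Return the byte order under which the whole buffer parses cleanly."""
    for byte_order in ("<", ">"):
        try:
            for _ in walk_elements(buf, byte_order):
                pass
            return byte_order
        except ValueError:
            continue
    raise ValueError("not a plain concatenation of tags in either byte order")
```

        The wrong byte order usually turns a small length into an enormous 
one, so the walk fails on the first element or two. Any fixed length header 
in front of the tags has to be stripped before calling either function.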

         To identify such a file just do a "strings |
            ______________ ______________ ______________ ______________
           |XXXXXXXXXXXXXX|              |              |              |
           |______________|______________|______________|______________|
            15          12 11           8 7            4 3            0

        ---------------------------

        Bits Allocated = 16
        Bits Stored    = 12
        High Bit       = 15

           |<------------------ pixel ----------------->|
            ______________ ______________ ______________ ______________
           |              |              |              |XXXXXXXXXXXXXX|
           |______________|______________|______________|______________|
            15          12 11           8 7            4 3            0

        ---------------------------

        Bits Allocated = 12
        Bits Stored    = 12
        High Bit       = 11

           ------ 2 ----->|<------------------ pixel 1 --------------->|
            ______________ ______________ ______________ ______________
           |              |              |              |              |
           |______________|______________|______________|______________|
            15          12 11           8 7            4 3            0


           -------------- 3 ------------>|<------------ 2 --------------
            ______________ ______________ ______________ ______________
           |              |              |              |              |
           |______________|______________|______________|______________|
            15          12 11           8 7            4 3            0


           |<------------------ pixel 4 --------------->|<----- 3 ------
            ______________ ______________ ______________ ______________
           |              |              |              |              |
           |______________|______________|______________|______________|
            15          12 11           8 7            4 3            0

        ---------------------------

        And so on ... refer to the standard itself for more detail.
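
        The layouts drawn above can be decoded along the following lines. 
This is only a sketch: the function names are hypothetical, the words are 
assumed to have already been read as 16 bit integers in the correct byte 
order, and the packed case assumes exactly the arrangement in the diagrams 
(pixel 1 in bits 0 to 11 of the first word, pixel 2 starting at bit 12):

```python
def extract_pixel(word, bits_stored, high_bit):
    """Return the pixel value held in one 16 bit allocated word."""
    low_bit = high_bit - bits_stored + 1
    return (word >> low_bit) & ((1 << bits_stored) - 1)


def unpack_12bit(words):
    """Unpack 12 bit pixels packed end to end into 16 bit words."""
    pixels = []
    accumulator = 0   # bits not yet consumed, least significant first
    bit_count = 0     # number of valid bits in the accumulator
    for word in words:
        accumulator |= (word & 0xFFFF) << bit_count
        bit_count += 16
        while bit_count >= 12:
            pixels.append(accumulator & 0xFFF)
            accumulator >>= 12
            bit_count -= 12
    return pixels
```

        For example extract_pixel(word, 12, 15) picks the pixel out of the 
top twelve bits, and unpack_12bit() turns every three words into four pixels.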

    2.2 ACR/NEMA DICOM 3.0

        ACR/NEMA Standards Publications

            No. PS 3.1-1992   <- DICOM 3 - Introduction & Overview
            No. PS 3.8-1992   <- DICOM 3 - Network Communication Support

            No. PS 3.2-1993   <- DICOM 3 - Conformance
            No. PS 3.3-1993   <- DICOM 3 - Information Object Definitions
            No. PS 3.4-1993   <- DICOM 3 - Service Class Specifications
            No. PS 3.5-1993   <- DICOM 3 - Data Structures & Encoding
            No. PS 3.6-1993   <- DICOM 3 - Data Dictionary
            No. PS 3.7-1993   <- DICOM 3 - Message Exchange
            No. PS 3.9-1993   <- DICOM 3 - Point-to-Point Communication

            No. PS 3.10-????  <- DICOM 3 - Media Storage & File Format
            No. PS 3.11-????  <- DICOM 3 - Media Storage Application Profiles
            No. PS 3.12-????  <- DICOM 3 - Media Formats & Physical Media

        DICOM (Digital Imaging and Communications in Medicine) standards are of 
course the hot topic at every radiological trade show. Unlike previous attempts 
at developing a standard, this one seems to have the potential to actually 
achieve its objective, which in a nutshell, is to allow vendors to produce a 
piece of equipment or software that has a high probability of communicating 
with devices from other vendors.

        Where DICOM differs substantially from other attempts, is in defining 
so called Service-Object Pairs. For instance if a vendor's MR DICOM conformance 
statement says that it supports an MR Storage Class as a Service Class 
Provider, and another vendor's workstation says that it supports an MR Storage 
Class as a Service Class User, and both can connect via TCP/IP over Ethernet, 
then the two devices will almost certainly be able to talk to each other once 
they are set up with each other's network addresses and so on.

        The keys to the success of DICOM are the use of standard network 
facilities for interconnection (TCP/IP and ISO-OSI), a mechanism of association 
establishment that allows for negotiation of how messages are to be 
transferred, and an object-oriented specification of Information Objects (ie. 
data sets) and Service Classes.

        Of course all this makes for a huge and difficult to read standard, but 
once the basic concepts are grasped, the standard itself just provides a 
detailed reference. From the users' and equipment purchasers' points of view 
the important thing is to be able to read and match up the Conformance 
Statements from each vendor to see if two pieces of equipment will talk.

        Just being able to communicate and transfer information is of course 
not sufficient - these are only tools to help construct a total system with 
useful functionality. Because a workstation can pull an image off an MRI 
scanner doesn't mean it knows when to do it, when the image has become 
available, to which patient it belongs, and where it is subsequently archived, 
not to mention notifying the Radiology or Hospital Information System (RIS/HIS) 
when such a task has been performed. In other words DICOM Conformance does not 
guarantee functionality, it only facilitates connectivity.

        So don't get too carried away with espousing the virtues of 
DICOM, demanding it from vendors, and expecting it to be the panacea to create 
a useful multi-vendor environment.

        Fred Prior (prior@xray.hmc.psu.edu) has come up with the concept of a 
User Conformance Statement to be generated by purchasers and to be satisfied by 
vendors. The idea is that one describes what one expects and hence gives the 
vendor a chance to realistically satisfy the buyer ! Of course each such 
statement must be tailored to the user's needs, and simply stapling a copy of 
Fred's statement to a Request For Proposals is not going to achieve the desired 
objective. Caveat emptor.

        To get more information about DICOM:

            - Purchase the standards from NEMA (address below) when they
              become available around July 1994.

            - Ftp the final versions of the drafts in electronic form from
              one of the sites described below.

            - Follow the Usenet group comp.protocols.dicom.

            - Get a copy of "Understanding DICOM 3.0" $12.50 from Kodak.

            - Insist that your existing and potential vendors supply you
              with DICOM conformance statements before you upgrade or
              purchase, and don't buy until you know what they mean. Don't
              take no for an answer !!!!

        What is all this doing in an FAQ about medical image formats you ask ? 
Well first of all, in many ways DICOM 3.0 will solve future connectivity 
problems, if not provide functional solutions to common problems. Hence 
actually getting the images from point A to B is going to be easier if everyone 
conforms. Furthermore, for those of us with old equipment, interfacing it to 
new DICOM conforming equipment is going to be a problem. In other words old 
network solutions and file formats are going to have to be transformed if they 
are going to communicate unidirectionally or bidirectionally with DICOM 3.0 
nodes. One is still faced with the same old questions of how does one move the 
data and how does one interpret it.

         The specifics of the DICOM message format are very similar to the 
previous versions of ACR/NEMA on which it is based. The data dictionary is 
greatly extended, and certain data elements have been "retired" but can be 
ignored gracefully if present. The message itself can now be transmitted as a 
byte stream over networks, rather than using a point-to-point paradigm 
exclusively (though the old point-to-point interface is available). This message 
can be encoded in various different Transfer Syntaxes for transmission. When 
two devices ("Application Entities" or AE) begin to establish an "Association", 
they negotiate an appropriate transfer syntax. They may choose an Explicit 
Big-Endian Transfer Syntax in which integers are encoded as big-endian and 
where each data element includes a specific field that says "I am an unsigned 
16 bit integer" or "I am an ascii floating-point number", or alternatively they 
can fall back on the default transfer syntax which every AE must support, the 
Implicit Little-Endian Transfer Syntax which is just the same as an old 
ACR/NEMA message with the byte order defined once and for all.

        This is all very well if you are using DICOM as it was originally 
envisaged - talking over a network, negotiating an association, and determining 
what Transfer Syntax to use. What if one wants to store a DICOM message in a 
file though ? Who is to say which transfer syntax one will use to encode it 
offline ? One approach, used for example by the Central Test Node software 
produced by Mallinckrodt and used in the RSNA Inforad demonstrations, is just to 
store it in the default little-endian implicit syntax and be done with it. This 
is obviously not good enough if one is going to be mailing tapes, floppies and 
optical disks between sites and vendors though, and hence the DICOM group 
decided to define a "Media Storage & File Format" part of the standard, the new 
Chapter 10 which is about to be or has just been voted on.

        Amongst other things, this new part defines a generic DICOM file format 
that contains a brief header, the "DICOM File Meta Information Header" which 
contains a 128 byte preamble (that the user can fill with anything), a 4 byte 
DICOM prefix "DICM", then a short DICOM format message that contains newly 
defined elements of group 0002 in the default Implicit Little Endian Transfer 
Syntax, which uniquely identify the data set as well as specifying the Transfer 
Syntax for the rest of the file. The rest of the message must specify a single 
SOP instance which can of course contain multiple images as folders if 
necessary. The length of the brief message in the Meta Header is specified in 
the first data element as usual, the group length.
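
        Recognizing such a file is then a matter of looking past the preamble 
for the prefix. A minimal sketch, in which the function name is my own and 
nothing is assumed about the group 0002 elements that follow:

```python
def has_dicom_prefix(data):
    """True if data holds a 128 byte preamble followed by b"DICM"."""
    return len(data) >= 132 and data[128:132] == b"DICM"
```

        Reading the first 132 bytes of a file in binary mode and passing 
them to this function is enough. The Meta Header elements start at offset 
132, and a file that fails the test may still be a bare data stream stored 
without any header at all.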

        So what choices of Transfer Syntax does one have and why all the fuss ? 
Well the biggest distinction is between implicit and explicit value 
representation. The explicit form allows for multiple possible representations 
for a single element, in theory at least, and perhaps allows one to make more 
of an unknown data element than one otherwise could. Some purists (and 
Interfile people) would 
argue that the element should be identified descriptively, and there is nothing 
to stop someone from defining their own private Transfer Syntax that does just 
that (what a heretical thought, wash my mouth out with soap). With regard to 
the little vs. big endian debate I can't see what the fuss is about, as it 
can't really be a serious performance issue.

        Perhaps more importantly in the long run, the Transfer Syntax mechanism 
provides a means for encapsulating compressed data streams, without having to 
deal with the vagaries and mechanics of compression in the standard itself. For 
example, in DICOM version 3.0, in addition to the "normal" Transfer Syntaxes, a 
series are defined to correspond to each of the Joint Photographic Experts 
Group (JPEG) processes. Each one of these Transfer Syntaxes encodes data 
elements in the normal way, except for the image pixel data, which is defined 
to be encoded as a valid and self-contained JPEG byte stream. Both reversible 
and irreversible processes of various types are provided for, without having to 
mess with the intricacies of encoding the various tables and parameters that 
JPEG processes require. Presumably a display application that supports such a 
Transfer Syntax will just chop out the byte stream, pass it to the relevant 
JPEG decoder, and get an uncompressed image back. More importantly, an archive 
server can store the image and retrieve it without ever having to know anything 
about how the image pixel data is encoded. Contrast this approach with that 
taken by those defining the TIFF (Tagged Image File Format) for general imaging 
and page layout applications. In their version 6.0 standard they attempted to 
disassemble the JPEG stream into its various components and assign each to a 
specific tag. Unfortunately this proved to be unworkable after the standard was 
disseminated and they have gone back to the drawing board.

        Now one may not like the JPEG standard, but one cannot argue with the 
fact that the scheme is workable, and a readily available means of reversible 
compression has been incorporated painlessly. How effective a compression 
scheme this is remains to be determined, and whether or not the irreversible 
modes gain wide acceptance will be dictated by the usual medico-legal paranoia 
that prevails in the United States, but the option is there for those who want 
to take it up. There is of course no reason why private compression schemes 
cannot be readily incorporated using this "encapsulation" mechanism, and to 
preserve bandwidth this will undoubtedly occur. This will not compromise 
compatibility though, as one can always fall back to a default, uncompressed 
Transfer Syntax. The DICOM Working Group on compression will undoubtedly bring 
out new possibilities.

        In order to identify all these various syntaxes, information objects, 
and so on, DICOM has adopted the ISO concept of the Unique Identifier (UID) 
which is a text string of numbers and periods with a unique root for each 
organization that is registered with ISO and various organizations that in turn 
register others in a hierarchical fashion. For example 1.2.840.10008.1.2 is 
defined as the Implicit VR Little Endian Transfer Syntax. The 1 identifies ISO, 
the 2 is the ISO member body branch, the 840 is the specific member body 
country code, in this case ANSI, and the 10008 is registered by ANSI to NEMA 
for DICOM.  UID's are also used to uniquely identify non-DICOM specific things, 
such as information objects. These are constructed from a prefix registered to 
the supplier or vendor or site, and a unique suffix that may be generated from 
say a date and time stamp (which is not to be parsed). For example an instance 
of a CT information object might have a UID of 
1.2.840.123456.002.999999.940623.170717 where a (presumably US) vendor 
registered 123456, and the modality generated a unique suffix based on its 
device number, patient hospital id, date and time, which have no other 
significance other than to create a unique suffix.
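
        Gluing such a UID together might look like the sketch below. The 
function name is hypothetical, the root 1.2.840.123456 is the made-up vendor 
from the example above, and the 64 character ceiling comes from the 
published standard rather than from anything said here:

```python
def make_uid(root, *suffix_parts):
    """Join a registered root and suffix components into a UID string."""
    uid = ".".join([root] + [str(part) for part in suffix_parts])
    if any(component == "" for component in uid.split(".")):
        raise ValueError("empty component in UID")
    if any(ch not in "0123456789." for ch in uid):
        raise ValueError("UID may contain only digits and periods")
    if len(uid) > 64:
        raise ValueError("UID is longer than 64 characters")
    return uid
```

        The suffix is treated as opaque: whatever goes in (device number, 
patient id, date, time) only has to make the result unique and should never 
be parsed back out by a receiver.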

        The other important new concept that DICOM introduced was the concept 
of Information Objects. In the previous ACR/NEMA standard, though modalities 
were identified by a specific data element, and though there were rules about 
which data elements were mandatory, conditional or optional in certain 
settings, the concept was relatively loosely defined. Presumably in order to 
provide a mechanism to allow conformance to be specified and hence ensure 
interoperability, various Information Objects are defined that are composed of 
sets of Modules, each module containing a specific set of data elements that 
are present or absent according to specific rules. For example, a CT Image 
Information Object contains amongst others, a Patient module, a General 
Equipment module, a CT Image module, and an Image Pixel module. An MR Image 
Information Object would contain all of these except the CT Image module which 
would be replaced by an MR Image module. Clearly one needs descriptive 
information about a CT image that is different from an MR image, yet the 
commonality of the image pixel data and the patient information is recognized 
by this model.
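
        The commonality is easy to see if the two objects are written down 
as lists of modules. This is purely illustrative: only the modules named 
above are listed (the real objects contain others), and the variable and 
function names are my own:

```python
# Incomplete module lists, limited to the modules mentioned in the text.
CT_IMAGE_OBJECT = ["Patient", "General Equipment", "CT Image", "Image Pixel"]
MR_IMAGE_OBJECT = ["Patient", "General Equipment", "MR Image", "Image Pixel"]


def shared_modules(first, second):
    """Return the modules common to both objects, in the order of the first."""
    return [module for module in first if module in second]


def specific_modules(first, second):
    """Return the modules of the first object that the second lacks."""
    return [module for module in first if module not in second]
```

        Code that handles the shared modules, such as extracting the pixel 
data or the patient details, can then be written once and reused for both 
modalities.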

        Hence, as described earlier, one can define pairs of Information 
Objects and Services that operate on such objects (Storage, Query/Retrieve, 
etc.) and one gets SOP classes and instances. All very object oriented and 
initially confusing perhaps, but it provides a mechanism for specifying 
conformance. From the point of view of an interpreter of a DICOM compatible 
data stream this means that for a certain instance of an Information Object, 
certain information is guaranteed to be in there, which is nice. As a creator 
of such a data stream, one must ensure that one follows all the rules to make 
sure that all the data elements from all the necessary modules are present. 
Having done so one then just throws all the data elements together, sorts them 
into ascending order by group and element number, and pumps them out. It is a 
shame that the data stream itself doesn't reflect the underlying order in the 
Information Objects, but I guess they had to maintain backward compatibility, 
hence this little bit of ugliness. This gets worse when one considers how to 
put more than one object in a folder inside another object.
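
        The "throw together and sort" step can be sketched as follows, 
assuming each module has already been reduced to a dictionary keyed by 
(group, element). The function name and the dictionary layout are my own 
assumptions, not anything the standard prescribes:

```python
def build_data_stream(modules):
    """Merge the elements of all modules and order them by (group, element).

    modules maps a module name to a dict of {(group, element): value}.
    Returns a list of ((group, element), value) in ascending tag order.
    """
    merged = {}
    for elements in modules.values():
        merged.update(elements)
    # Tuples compare by group first and then by element, which is the
    # order in which the elements have to be written out.
    return sorted(merged.items())
```

        Note that the module boundaries vanish in the result, which is 
exactly the loss of structure complained about above.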

        At this point I am tempted to include more details of various different 
modules, data elements and transfer syntaxes, as well as the TCP/IP mechanism 
for connection. However all this information is in the standard itself which is 
readily available electronically from the ftp sites, and in the interests of 
brevity I will not succumb to temptation at this time.

    2.3 Papyrus

        Papyrus is an image file format based on ACR/NEMA version 2.0. I don't 
have much information about it yet, but what I do know, gleaned from Usenet and 
a presentation at SCAR 94 is:

        - it is from Switzerland,
        - there is a library of tools available for handling it,
        - it allows multiple images/file,
        - it has something to do with the European RACE Telemed project,
        - it stores 16 bit integers as big-endian,

        and that is all for the moment ! Someone is sending me more information 
Real Soon Now so stay tuned.

    2.4 Interfile V3.3

        Interfile is a "file format for the exchange of nuclear medicine image 
data" created I gather under the auspices of the American Association of 
Physicists in Medicine (AAPM) for the purpose of transfer of images of quality 
control phantoms, and has been subsequently used for clinical work (please 
correct me if I am wrong Trevor).

        It specifies a file format composed of ascii "key-value" pairs and a 
data dictionary of keys. The binary image data may be contained in the same 
file as the "administrative information", or in a separate file pointed to by a 
"name of data file" key. Image data may be binary integers, IEEE floating point 
values, or ascii and the byte order is specified by a key "imagedata byte 
order". The order of keys is defined by the Interfile syntax which is more 
sophisticated than a simple list of keys, allowing for groups, conditionals and 
loops to dictate the order of key-value pairs.

        Conformance to the Interfile standard is informally described in terms 
of which image data types, pixel types, multiple windows, special 
Interfile features including curves, and restriction to various maximum 
recommended limits.

        Interfile is specifically NOT a communications protocol and strictly 
deals with offline files. There are efforts to extend Interfile to include 
modalities other than nuclear medicine, as well as to keep ACR/NEMA and 
Interfile data dictionaries in some kind of harmony.

        A sample list of Interfile 3.3 key-value pairs is shown here to give 
you some idea of the flavor of the format. The example is culled from part of a 
Static study in the Interfile standard document and is not complete:

                !INTERFILE :=
                !imaging modality :=nucmed 
                !version of keys :=3.3
                data description :=static
                patient name :=joe doe
                !patient ID  :=12345
                patient dob :=1968:08:21
                patient sex :=M
                !study ID :=test
                exam type :=test
                data compression :=none
                !image number :=1
                !matrix size [1] :=64
                !matrix size [2] :=64
                !number format :=signed integer
                !number of bytes per pixel :=2
                !image duration (sec) :=100
                image start time :=10:20: 0
                total counts :=8512
                !END OF INTERFILE :=

        One can see how easy such a format would be to extend, as well as how 
it is readable and almost useable without reference to any standard document or 
data dictionary.
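
        To show just how little is needed, here is a rough sketch of a 
reader for a flat list of pairs like the one above. It is not a conforming 
Interfile parser: the function name is my own, and it ignores the groups, 
conditionals and loops that the real syntax allows:

```python
def parse_interfile_header(text):
    """Return a dict of lower-cased keys to string values."""
    pairs = {}
    for line in text.splitlines():
        if ":=" not in line:
            continue
        key, value = line.split(":=", 1)
        # The leading "!" is dropped so that keys can be looked up
        # whether or not it is present.
        key = key.strip().lstrip("!").strip().lower()
        pairs[key] = value.strip()
    return pairs
```

        From the result one can work out, for instance, how many bytes of 
image data to expect: "matrix size [1]" times "matrix size [2]" times 
"number of bytes per pixel", which is 64 * 64 * 2 = 8192 for the example.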

        Undoubtedly ACR/NEMA DICOM 3.0 to Interfile translators will soon 
proliferate in view of the fact that many Nuclear Medicine vendors supply 
Interfile translators at present.

        To get hold of the Interfile 3.3 standard by ftp, see the sources and 
contacts listed later in this document.

--------
~Subject: Proprietary Formats

3.  Proprietary Formats
    3.1 General
        3.1.1 SPI (Standard Product Interconnect)

        SPI is a standard based on the old ACR/NEMA standard, devised I gather 
by Siemens and Philips, for use in a PACS environment. Who currently maintains 
it and whether or not Sienet PACS systems are based on it, I am not certain. 
Many machines in the workplace use it in some shape or form, or can export 
files in SPI format. I gather it has been around since 1987 or so, but I do not 
yet have access to the reference documents, nor permission to disclose their 
contents, so much of the following is guesswork or hearsay from Usenet.

        Like the ACR/NEMA standard, SPI is designed to define interconnections 
between pieces of equipment from the physical level through to the application 
level. Where appropriate it utilizes relevant parts of ACR/NEMA. Unlike 
ACR/NEMA, I gather that SPI is aware of the concept of networks, objects 
containing information, the need to uniquely identify instances of objects, and 
defines an offline file format. Thus in many ways it sounds like the missing 
link between ACR/NEMA 2.0 and DICOM 3.0.

       SPI makes use of ACR/NEMA data elements and groups, and in addition 
provides "shadow" private odd-numbered groups as dictated by the ACR/NEMA 
standard for the purpose of storing additional items of information, including 
a means of uniquely identifying objects, as well as allowing for enumerated 
values for elements beyond those defined by ACR/NEMA. SPI also defines a byte 
order for offline storage of data streams. Integers are stored in little endian 
format (least significant byte first).
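
       A sketch in C++ of what the last two points mean in practice, assuming 
only what is said above (least significant byte first, private groups 
odd-numbered). The function names are hypothetical and nothing here is taken 
from the SPI documents themselves.

```cpp
#include <cstdint>

// Assemble a 16 bit unsigned integer from two bytes stored
// least significant byte first.
static std::uint16_t readLE16(const unsigned char* p)
{
	return (std::uint16_t)(p[0] | (p[1] << 8));
}

// The same for a 32 bit unsigned integer stored in four bytes.
static std::uint32_t readLE32(const unsigned char* p)
{
	return  (std::uint32_t)p[0]        |
	       ((std::uint32_t)p[1] <<  8) |
	       ((std::uint32_t)p[2] << 16) |
	       ((std::uint32_t)p[3] << 24);
}

// Odd-numbered groups are the private ("shadow") ones.
static bool isShadowGroup(std::uint16_t group)
{
	return (group & 1) != 0;
}
```

       Building the value from individual bytes, rather than casting a pointer, 
gives the same result whatever the byte order of the machine doing the reading.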

       Needless to say this section needs expanding dramatically so please send 
more information !

    3.2 CT
        3.2.1 General Electric

              Now we get to the meaty part. After years of being faced with the 
problem of either a) hours of detective work, or b) tediously tracking down the 
name of the responsible person and exercising a non-disclosure agreement, 
General Electric (or Generous Electric as I heard them described the other day) 
now really are being generous, as well as sensible, and are making their image 
format description documents freely available. For details see the contact 
section later on. In the meantime, for historical completeness, for educational 
purposes, and for those who can't wait for the documents to come in the mail, a 
summary of the relevant formats and decompression algorithms is provided here.

              3.2.1.1 CT 9800
                      3.2.1.1.1 Image data

                                - "block format" header
                                - perimeter encoding
                                - optional DPCM compression
                                - Data General host (various)
                                - RDOS (yuck !)

                                Almost everyone in this field has at some stage 
encountered the dreaded CT 9800 format. The world is divided into two groups of 
people ... those who have seen the documents or the critical piece of code in 
another program or have been given a handy hint, and those who will never 
figure out the format themselves.

                                Essentially the format fits into the "block 
format" described earlier, with pointers to each of the major header 
components. Rarely, if ever, does one encounter a file that doesn't have the 
same size blocks in the same place, so most people treat it as a fixed layout. 
I believe that reformatted images may have another header stored in there, but 
I have never tested for it.

                                The data itself is stored in one of two forms 
depending on whether compression is selected or not during archival. In the 
uncompressed form, a type of perimeter encoding is used (see later section) in 
which for an essentially circular object, the outer parts of a rectangular 
image are discarded (and expected to be filled in with a background pixel value 
during reconstitution of the image). In the case of the CT9800 then, the image 
pixel data is interpreted using a map, which contains an entry for each row of 
the image (either 256, 320 or 512 entries) which specifies the length of the 
row that is actually stored, centered about the midline of the image. This 
obviously saves a lot of space.
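
                                The reconstitution step can be sketched in C++ 
as follows. This is an illustration only: it ASSUMES that each map entry is 
the full number of pixels stored for that row and that the run is centered 
exactly, which may not match the real units or rounding used by the CT 9800. 
The name expandPerimeter and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Rebuild a square resolution x resolution image from perimeter encoded rows.
// ASSUMPTION: map[row] is the full count of stored pixels in that row.
static std::vector<std::uint16_t>
expandPerimeter(const std::vector<std::uint16_t>& stored,
		const std::vector<std::uint16_t>& map,
		unsigned resolution, std::uint16_t background)
{
	assert(map.size() == resolution);
	// Start with every pixel set to the background value.
	std::vector<std::uint16_t> image(
		(std::size_t)resolution * resolution, background);
	std::size_t next = 0;		// index of next stored pixel
	for (unsigned row = 0; row < resolution; ++row) {
		unsigned length = map[row];
		assert(length <= resolution);
		assert(next + length <= stored.size());
		// Center the stored run about the midline of the row.
		unsigned start = (resolution - length) / 2;
		for (unsigned col = 0; col < length; ++col)
			image[(std::size_t)row * resolution + start + col] =
				stored[next++];
	}
	return image;
}
```

                                With a resolution of 4 and a map of 2, 4, 4, 2, 
twelve stored pixels expand to sixteen, with the background value in the two 
outer positions of the first and last rows.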

                                 If compression is selected on one of the later 
model machines, then a form of Differential Pulse Code Modulation is used, in 
which advantage is taken of the fact that not all the bits of a 16 bit word are 
needed to store a CT value. I gather only 12 bits of data are actually 
significant, but one can theoretically represent 15 using this scheme. 
Essentially, the first 16 bit word is read and used as is. Then another byte is 
read. If its most significant bit is set, then the remaining 7 bits represent a 
signed difference value relative to the previous pixel. If its most significant 
bit is not set, then the difference must have exceeded the range of 7 bits, and 
hence the next byte is read to complete a valid 16 bit word (15 bits really) 
which is the actual pixel value. The really neat thing about this scheme is 
that the same algorithm can be used for compressed or uncompressed data as an 
uncompressed stream of words will never have the most significant bit set !
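
                                 The decoding loop just described can be 
sketched as a stand-alone C++ function. This is written from the description 
above and is not the code of any particular translator. It ASSUMES that the 7 
bit difference is in two's complement form and that full values are stored 
high byte first (as one would expect from a Big Endian Data General host). The 
name decodeDpcm is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Decode up to `count` pixels from a DPCM compressed (or uncompressed)
// byte stream. ASSUMPTIONS: two's complement 7 bit differences, and
// full 16 bit values stored high byte first.
static std::vector<std::int16_t>
decodeDpcm(const std::vector<unsigned char>& in, std::size_t count)
{
	std::vector<std::int16_t> out;
	std::int16_t last = 0;
	std::size_t pos = 0;
	while (out.size() < count && pos < in.size()) {
		unsigned char byte = in[pos++];
		if (byte & 0x80) {
			// Difference: low 7 bits, sign extended from bit 6.
			int delta = byte & 0x7f;
			if (delta & 0x40) delta -= 0x80;
			last = (std::int16_t)(last + delta);
		}
		else {
			// Actual value: this is the high byte of a word.
			if (pos >= in.size()) break;	// truncated input
			last = (std::int16_t)((byte << 8) | in[pos++]);
		}
		out.push_back(last);
	}
	return out;
}
```

                                 For example the bytes 01 00 81 FF (hex) decode 
to 256, then 256+1 = 257, then 257-1 = 256. Differences are limited to the 
range -64 to +63 under the two's complement assumption.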

                                  The following piece of C++ code pulled out of 
a CT9800 to DICOM translator will give you the general idea. Note that the 
perimeter encoding map has already been read in. Note in particular the need to 
deal with sign extension of the difference value. Also note that the code 
doesn't handle the first pixel specially because its high bit will not be set.

static void
copy9800image(ifstream& instream,DC3ofstream& outstream,
	      Uint16 resolution,Uint16 *map)
{
	unsigned i;
	Int16 last_pixel;

	last_pixel=0;
	for (i=0; i
[...]

           |Sign|<------ Exponent ------>|<--------- Mantissa -------->|
            ______________ ______________ ______________ ______________
           |              |              |              |              |
           |______________|______________|______________|______________|
            31          28 27          24 23          20 19          16
           |<----------------------- Mantissa ------------------------>|
            ______________ ______________ ______________ ______________
           |              |              |              |              |
           |______________|______________|______________|______________|
            15          12 11           8 7            4 3            0


                      Here is a little piece of C++ code that should run on 
anything and convert Data General floats to whatever the host's floating point 
format is.

		// requires <cmath> for pow() and <iostream> for cerr

		double		value;
		unsigned char	sign;
		Uint16		exponent;
		Uint32		mantissa;

		unsigned char buffer[4];
		instream.read((char *)buffer,4);
		if (instream) {
			// DataGeneral is a Big Endian machine, so buffer[0]
			// holds bits 31-24. The fields are extracted with
			// shifts and masks rather than bitfields, because
			// bitfield layout and byte order vary between hosts.
			sign     = buffer[0] >> 7;
			exponent = buffer[0] & 0x7f;
			mantissa = ((Uint32)buffer[1] << 16) |
				   ((Uint32)buffer[2] <<  8) |
				    (Uint32)buffer[3];

			// mantissa is a 24 bit fraction, exponent is a
			// power of 16 in excess 64 form
			value = (double) mantissa / (1L << 24) *
				pow (16.0, (double)exponent - 64.0);
			value = (sign == 0) ? value : -value;
		}
		else {
			cerr << "read failed\n" << flush;
			value=0;
		}
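
                      For checking an implementation, the same conversion can 
be wrapped in a function and fed a few byte patterns whose values are easy to 
work out by hand. This is a sketch only. The name dgFloatToDouble is 
hypothetical, and the layout (1 bit sign, 7 bit excess 64 exponent of 16, 24 
bit fractional mantissa) is taken from the diagram and code above rather than 
from Data General documentation.

```cpp
#include <cmath>

// Convert four bytes, in Data General (Big Endian) order, to a double.
// Layout assumed: sign (1 bit), exponent (7 bits, power of 16,
// excess 64), mantissa (24 bits, treated as a fraction).
static double dgFloatToDouble(const unsigned char b[4])
{
	unsigned      sign     = b[0] >> 7;
	int           exponent = b[0] & 0x7f;
	unsigned long mantissa = ((unsigned long)b[1] << 16) |
				 ((unsigned long)b[2] <<  8) |
				  (unsigned long)b[3];
	double value = (double)mantissa / 16777216.0	// 2 to the power 24
		     * std::pow(16.0, (double)(exponent - 64));
	return sign ? -value : value;
}
```

                      Worked by hand, the bytes 41 10 00 00 (hex) have an 
exponent of 65 and a mantissa of 1/16, giving (1/16) * 16 = 1.0. Setting the 
top bit (C1 10 00 00) gives -1.0, and 42 64 00 00 gives (100/256) * 256 = 100.0.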

        4.1.2 Operating System

              4.1.2.1 RDOS

                      Used on the GE CT 9800 family. Severely primitive and not 
multitasking. Documentation is no longer available from Data General (I tried)  
and was not supplied with the scanner by GE, so if anyone knows where I can 
find it let me know. Here is a brief command summary culled from a nifty pocket 
book from GE for SunOS/Genesis users that compares commands:

                 CHATR  - file attributes
                 CRAND  - create randomly organized file
                 CDIR   - create directory
                 DELETE - files or directories
                 DIR    - change directory
                 DISK   - free space
                 FILCOM - compare files
                 GDIR   - show working directory name
                 GTOD   - show date and time
                 LINK   - files (symbolic)
                 LIST   - directory contents
                 MOVE   - a file
                 RENAME - a file
                 SDAT   - set date
                 STOD   - set time
                 SDUMP  - write files to a device
                 SLOAD  - read dumped files
                 SPEED  - text editor
                 TYPE   - contents of file
                 XFER   - copy a file

                 wildcards: '-' is series, '*' is single character

              4.1.2.2 AOS/VS

                      Used on the GE Signa 3X and 4X family. Quite a nice 
operating system with multi-tasking and hierarchical directories. Here is a 
brief command summary again culled from a nifty pocket book from GE for 
SunOS/Genesis users that compares commands:

                 ACL         - access control list (ownership)
                 BYE         - exit command process
                 COPY        - a file
                 CREATE      - a text file
                 CREATE/DIR  - a directory
                 CREATE/LINK - link files
                 DELETE      - files & directories
                 DIR         - display or change working directory
                 DUMP        - to peripheral
                 F/AS/S      - directory listing with file status
                 DATE        - show or set
                 HELP
                 LOAD        - DUMPed files
                 MOVE        - a file
                 RENAME      - a file
                 PATH        - show pathname of a file
                 PAUSE       - the command line interpreter
                 SUPERU ON   - enable superuser
                 SED         - text editor
                 TIME        - show or set
                 TYPE        - contents of text file
                 ?           - list processes running

                 wildcards: '+' is series, '*' is single character

                      Other useful hints include the use of "^" to refer to the 
next directory up (like ".." in Unix) in DIR commands. Command options follow 
the command name without any spaces and are indicated by a slash. COPY 
operations specify the destination name first and then the source name. Devices 
like the mag tape are indicated by "@", for example "@MTB0" is tape drive zero. 
Files on the tape can be referred to as "@MTB0:nn" which is very handy. For 
example to read a file off a CT 9800 tape under AOS/VS:

                COPY/V/IMTRSIZE=8192 B038040101.YP @MTB0:18

                      Perhaps most importantly, there is an extensive online 
help system ... use the HELP command.

    4.2 Vax
        4.2.1 Data
              4.2.1.1 Integers
              4.2.1.2 Floating Point
              4.2.1.3 Strings
        4.2.2 Operating System

    4.3 Sun4 - Sparc
        4.3.1 Data
              4.3.1.1 Integers
              4.3.1.2 Floating Point
              4.3.1.3 Strings
        4.3.2 Operating System

--------
~Subject: Compression Schemes

5.  Compression Schemes
    5.1 Reversible
    5.2 Irreversible
        5.2.1 Perimeter Encoding


--------
~Subject: Getting Connected

6.  Getting Connected
    6.1 Tapes
    6.2 Ethernet
    6.3 Serial Ports


--------
~Subject: Sources of Information

7.  Sources of Information
    7.1 Vendor Contacts

        ACR/NEMA and DICOM standards:

                NEMA Publication Sales
                2101 L St. NW, Suite 300
                Washington DC 20037-1526
                phone (202) 457-8474

        DICOM standards comments and working group information:

                David Snavely, Staff Executive
                NEMA
                2101 L St. NW, Suite 300
                Washington DC 20037-1526
                phone (202) 457-1965

                Gordon Bass
                American College of Radiology
                1891 Preston White Drive
                Reston VA 22091
                phone (703) 648-8900

        Kodak "Understanding DICOM 3.0" for $US 12.50 (no credit cards):

                Angie Helms
                Kodak Health Imaging Systems
                18325 Waterview Parkway
                Dallas TX 75252
                phone 1-800-767-3448

        Independent JPEG Group (IJG):

                Tom Lane                 (tgl@netcom.com)

        Interfile:

                Trevor Cradduck          (cradduck@irus.rri.uwo.ca)
                Andrew Todd-Pokropek     (a.todd@ucl.ac.uk)

        General Electric, for image format information:

                John Meissner
                Networks Technical Leader
                GE Medical Systems
                N25 W23255 Paul Road
                Pewaukee WI 53072
                phone (414) 896-2707
                email "meissnerj@med.ge.com"


    7.2 Relevant FAQ's

        Archive-name: graphics/resources-list/part[1-3]
        Archive-name: graphics/faq
        Archive-name: pixutils-faq
        Archive-name: image-processing/Macintosh
        Archive-name: sci-data-formats

        DICOM FAQ - maintained by dsc@xray.hmc.psu.edu (David S. Channin)
                  - periodically posted to comp.protocols.dicom

        med.volviz.faq - maintained by mhaveri@phoenix.oulu.fi (Matti Haveri)
                       - occasionally posted to alt.image.medical
                       - discusses medical volume visualization

        FITS basics and information (periodic posting)
                - FITS (Flexible Image Transport System)
                - for astronomical data
                - periodically posted by
                      bschlesinger@nssdca.gsfc.nasa.gov (BARRY M. SCHLESINGER)
                - in sci.astro.fits,sci.data.formats

    7.3 Source Code
        7.3.X JPEG

              PVRG-JPEG CODEC:
                havefun.stanford.edu[36.2.0.35]:/pub/jpeg/JPEGv1.2.tar.Z

                Supports
                  - sequential DCT baseline,
                  - lossless modes.
              IJG:
                ftp.uu.net[137.39.1.9]:/graphics/jpeg/jpegsrc.v4.tar.Z

                Supports
                  - sequential DCT baseline,
                  - 12 bit DCT modes.

    7.4 Commercial Offerings

        Interfaces between vendors' equipment and DICOM 3.0:

                DeJarnette Research Systems Inc.
                Suite 700
                401 Washington Avenue
                Towson, Maryland 21204
                USA
                phone 410-583-0694

    7.5 FTP sites

        DICOM draft standards and demonstration software:

                ftp.xray.hmc.psu.edu:/dicom_docs

                        /dicom_docs/dicom_3.0/postcript  postscript
                        /dicom_docs/dicom_3.0/frame      FrameMaker
                        /dicom_docs/dicom_3.0/word_hqx   Microsoft Word

                ftp.xray.hmc.psu.edu:/dicom_software

                        /dicom_software/Mallinckrodt     Mallinckrodt RSNA '93
                        /dicom_software/European         European CEN/TC251/WG4

                rsna.org

                wuerlim.wustl.edu:/pub/dicom

                        /pub/dicom/images/version3       sample images

                ftp.uni-oldenburg.de:/pub/dicom

                        /pub/dicom/dicom_docs
                        /pub/dicom/dicom_software


        ACR/NEMA (dicom) viewer for MAC (haven't tried this yet):

                ftp.u.washington.edu:/public/razz

        Interfile: (site maintained by cradduck@irus.rri.uwo.ca)

                uwovax.uwo.ca:pub:[000000.nucmed]

        Various sample medical images (may be out of date):

                ftp:fokus.uke.uni-hamburg.de:/Voxelman/images
                ftp:rwja.umdnj.edu:/pub

                gopher://gopher.austin.unimelb.edu.au/11/images/petimages/

        CT reconstruction software:

                peipa.essex.ac.uk:/ipa/src/process

                        /ipa/src/process/ct.tar.Z
                        /ipa/src/process/snark77.tar.Z

        3DVIEWNIX (University of Pennsylvania):

                ftp:mipgsun.mipg.upenn.edu:/3DVIEWNIX1.0/BINARIES
                http://mipgsun.mipg.upenn.edu

        FITS (Flexible Image Transport System) for astronomical data:

                ftp:fits.cv.nrao.edu:/fits
                ftp:nssdca.gsfc.nasa.gov:/FITS

        Medical Informatics standards, including HL7:

                ftp:dumccss.mc.duke.edu:/standards

                        /standards/read-me.txt
                        /standards/HL7/pubs/version2.2/ballot1.zip

    7.6 Mailservers
    7.7 References

-- 
David A. Clunie (dclunie@flash.us.com)

"I must see your DICOM 3 conformance statement before I buy."