Z39.50 in a Nutshell

(An Introduction to Z39.50)


John A. Kunze & R. P. C. Rodgers
Lister Hill National Center for Biomedical Communications
National Library of Medicine
July 1995


Table of Contents


What It Is


Primary Goal


Ancillary Goals


Associated Organizations


Related Protocols


Underlying Network Environment


History


Important Terms

PDU: complete network message or Protocol Data Unit, specified in ASN.1

ASN.1: Abstract Syntax Notation, high-level data structuring language

Example: a definition of the Present Request PDU

  PresentRequest ::= SEQUENCE {
      referenceId                 [2]   IMPLICIT OCTET STRING OPTIONAL,
      resultSetId                 [31]  IMPLICIT VisibleString,
      resultSetStartPoint         [30]  IMPLICIT INTEGER,
      numberOfRecordsRequested    [29]  IMPLICIT INTEGER,
      elementSetNames                   ElementSetNames OPTIONAL,
      preferredRecordSyntax       [104] IMPLICIT OBJECT IDENTIFIER OPTIONAL
  }
  

BER: Basic Encoding Rules for serializing ASN.1 structures

Example: encoded Present Request PDU

Encoded PDU:
\270\020\237\037\007default\236\001\001\235\001\012
What's behind it:
presentRequest = {                       \270\020
    resultSetId = "default"                  \237\037\007default
    resultSetStartPoint = 1                  \236\001\001
    numberOfRecordsRequested = 10            \235\001\012
}
One more view of the same thing (see ISO 8825 for details):
tag=24, Class=2, form=1, count=3
    tag=31, Class=2, form=0, count=7
        data=default
             6666767
             45615C4
    tag=30, Class=2, form=0, count=1
        data=.
             0
             1
    tag=29, Class=2, form=0, count=1
        data=.
             0
             A
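
Example (sketch): decoding the same PDU programmatically

The tag/class/form view above can be reproduced with a few lines of Python. This is only a minimal sketch, not a full BER implementation: it handles definite lengths only, the helper names read_tlv and dump are made up for this illustration, and it prints each element's content length in octets.

```python
def read_tlv(buf, pos=0):
    """Decode one BER tag-length-value triple starting at buf[pos]."""
    first = buf[pos]
    pos += 1
    tag_class = first >> 6            # 2 = context-specific
    constructed = (first >> 5) & 1    # 1 = value holds nested elements
    tag = first & 0x1F
    if tag == 0x1F:                   # high-tag-number form: base-128 digits follow
        tag = 0
        while True:
            octet = buf[pos]
            pos += 1
            tag = (tag << 7) | (octet & 0x7F)
            if not octet & 0x80:
                break
    length = buf[pos]
    pos += 1
    if length & 0x80:                 # long form: low 7 bits count the length octets
        n = length & 0x7F
        length = int.from_bytes(buf[pos:pos + n], "big")
        pos += n
    value = buf[pos:pos + length]
    return tag_class, constructed, tag, value, pos + length


def dump(buf, depth=0):
    """Return (depth, class, form, tag, value) rows for every element in buf."""
    rows = []
    pos = 0
    while pos < len(buf):
        tag_class, constructed, tag, value, pos = read_tlv(buf, pos)
        rows.append((depth, tag_class, constructed, tag, value))
        if constructed:
            rows.extend(dump(value, depth + 1))
    return rows


# The encoded Present Request from the example, using the same octal escapes.
pdu = b"\270\020\237\037\007default\236\001\001\235\001\012"

for depth, tag_class, form, tag, value in dump(pdu):
    shown = "" if form else f", data={value!r}"
    print(f"{'    ' * depth}tag={tag}, class={tag_class}, form={form}, "
          f"length={len(value)}{shown}")
```

The outer element (tag 24) is constructed, so the sketch recurses into its value and finds the three inner elements with tags 31, 30, and 29.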

Attribute: Type/Value integer pair, such as 1/1003 = Use/Author or 3/1 = Position/First_in_Field

Attribute List: list of Type/Value integer pairs qualifying a search term

Example: an attribute list and its meaning

  1/1003  4/1  5/100  3/2  6/1  2/3

  Type/Value   Meaning
  1/1003       author
  4/1          with words structured as a phrase
  5/100        no truncation
  3/2          compared with the first item in the subfield
  6/1          not necessarily consuming an entire subfield
  2/3          and compared for equality
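
Example (sketch): splitting an attribute list into typed pairs

A client might hold an attribute list as plain integer pairs and attach the attribute type names only for display. The sketch below assumes the whitespace-separated "type/value" notation used in this document (it is not a wire format), and the function name parse_attribute_list is hypothetical.

```python
# Bib-1 attribute type numbers and their conventional names.
ATTRIBUTE_TYPES = {
    1: "Use",
    2: "Relation",
    3: "Position",
    4: "Structure",
    5: "Truncation",
    6: "Completeness",
}


def parse_attribute_list(text):
    """Split 'type/value' tokens into (type, value) integer pairs."""
    pairs = []
    for token in text.split():
        attr_type, attr_value = token.split("/")
        pairs.append((int(attr_type), int(attr_value)))
    return pairs


pairs = parse_attribute_list("1/1003  4/1  5/100  3/2  6/1  2/3")
for attr_type, attr_value in sorted(pairs):
    print(f"{ATTRIBUTE_TYPES[attr_type]:<12} = {attr_value}")
```

The order of pairs within a list carries no meaning, which is why the sketch is free to sort them by type before printing.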
  
Attribute Set: published table defining attribute semantics for given search domain

Result Set: records resulting from search, stored on server, fetched on demand

RPNQuery: hierarchical search expression, subexpressions connected by AND, OR, and AND-NOT; also known as the Reverse Polish Notation query for historical reasons

OID: Object IDentifier, global hierarchical name (eg, for attribute sets)

Type-0 Query: used only when the server and client have a priori agreement outside of the standard

Type-1 Query: main Z39.50 query (of five types total), consists of RPNQuery plus attribute set OID

Example: a query of Type-1 with a single boolean AND

-> Begin Type-1 query
query = {
   type-1 = {
-> Specify Bib-1 attribute set
      attributeSet = OID 1.2.840.10003.3.1
-> Begin RPN query
      rpn = {
         rpnRpnOp = {
            rpn1 = {
               op = {
                  attrTerm = {
                     attributes = {
                        use = Body of text
                        position = Any position in subfield
                        structure = Word list
                        completeness = Incomplete subfield
                     }
-> User's first search term
                     term = {
                        general = "heart"
                     }
                  }
               }
            }
            rpn2 = {
               op = {
                  attrTerm = {
                     attributes = {
                        use = Body of text
                        position = Any position in subfield
                        structure = Word list
                        completeness = Incomplete subfield
                     }
-> User's second search term
                     term = {
                        general = "depression"
                     }
                  }
               }
            }
-> Boolean AND connecting user's first and second search terms
            op = {
               and = and
            }
         }
      }
   }
}
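
Example (sketch): the same query as a tree, flattened to postfix order

The nesting above (rpn1, rpn2, then op) lists both operands before the operator that joins them, which is the "reverse Polish" ordering. The sketch below models the tree with two hypothetical Python classes, Term and Op, and leaves out the attribute lists for brevity. It illustrates the ordering only and does not encode a real PDU.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Term:
    """Leaf of the query tree: a search term (attributes omitted here)."""
    text: str


@dataclass
class Op:
    """Interior node: two subexpressions joined by a boolean operator."""
    operator: str               # "AND", "OR", or "AND-NOT"
    left: "Node"
    right: "Node"


Node = Union[Term, Op]


def to_postfix(node):
    """List operands first and the operator last, as in rpn1, rpn2, op."""
    if isinstance(node, Term):
        return [node.text]
    return to_postfix(node.left) + to_postfix(node.right) + [node.operator]


query = Op("AND", Term("heart"), Term("depression"))
print(" ".join(to_postfix(query)))   # heart depression AND
```

Because either operand of an Op may itself be an Op, arbitrarily deep boolean expressions fit the same shape.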

Type-2 Query: ISO 8777 type query (the ISO version of Z39.58; see Type-100 query).

Type-100 Query: Z39.58 type query (Common Command Language, a standardized ASCII query language)

Type-101 Query: extended Type-1 query (allows proximity searching and restriction of result sets by attributes)

Bib-1: initial (and principal) Z39.50 attribute set, bibliographic in origin

Stas-1: alternate attribute set, scientific/technical in origin, subsumes Bib-1

Element: record part identified for purposes of retrieval

Element Set: named subset (eg, "F" for full) of record elements

Record Syntax: semantics and bit-level layout of retrieved record

MARC: common bibliographic record syntax (MAchine-Readable Cataloging)

SUTRS: Simple Unstructured Text Record Syntax, for plain text records

GRS: Generic Record Syntax, for hierarchical, tagged elements


OSI-Specific Terms

Association: Z39.50 session, originally conceived over OSI stack but now deployed over TCP

Origin: client

Target: server

Abstract Syntax: semantic part of record syntax specification

Transfer Syntax: bit-level layout part of record syntax specification

Presentation Context: pairings of abstract syntaxes with transfer syntaxes


Facilities/Functions

Access Control
Server request for authentication information (eg, a password) after suspending a client request. Client response determines whether suspended operation will be resumed.

Accounting/Resource Control Facility (Three Services)
Resource-Control
Server request for acknowledgement of current or projected resource usage (eg, how much money has been spent) after suspending a client request. Client response determines whether suspended operation will be resumed.

Trigger-Resource-Control
Client request for server to cancel current operation or to ask that a resource control request be sent to the client. No paired server response except for above side-effect and possibly early return from current operation.

Resource-Report
Client request for resource report. Response contains resource usage so far (eg, money spent).

Browse/Scan (Added in Version 3)
Client request to fetch index terms. Response is list of terms suitable for searching.

Explain (Added in Version 3)
Used to discover information about a server, including available databases, supported attribute combinations, record syntaxes, and element specifications. Also includes human readable information such as database descriptions and hours of operation.

Extended Services (Added in Version 3)
Provides access to services outside the protocol. The seven currently defined features include persistent result sets, periodic query, item order, and database update.

Initialization (Init)
Client request to set up Z39.50 session on top of existing TCP connection; proposes buffer sizes, optional features, and versions, and optionally provides user authentication information. Server response includes acceptance or rejection, actual buffer sizes, and features.

Result-Set-Delete
Client request to delete one or more result sets. Response contains operation status.

Retrieve (Two Services)
Present
Client request for specific records from named result set. Response contains records, not necessarily all that were requested.

Segment (Added in Version 3)
Client requests via Present that records be returned in multiple server-generated Segment PDUs. Allows continuous transmission of many records inside PDU buffers of limited size.

Search
Client request to search, creating result set of records on server. Response includes number of records found and optionally an initial range of records.

Sort (Added in Version 3)
Client request to sort a result set. Response contains operation status.

Termination (Added in Version 3)
Client or server request to close session. May be received at any time.
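
Example (sketch): how an Initialization exchange might settle options and sizes

Looking back at the Initialization facility, the client proposes features and buffer sizes and the server answers with what will actually be used. One plausible server policy is sketched below; the function name negotiate and the size-capping rule are assumptions made for this illustration, since the text only says that the server decides. The option names and numbers are those of the annotated sample session later in this document.

```python
def negotiate(proposed_options, supported_options,
              proposed_message_size, server_max_message_size):
    """Return the options and message size a server might put in its response."""
    # Only features that both sides know about can be used in the session.
    agreed_options = proposed_options & supported_options
    # The server has the last word on sizes; this sketch simply caps the proposal.
    agreed_size = min(proposed_message_size, server_max_message_size)
    return agreed_options, agreed_size


client_options = {"search", "present", "delSet", "resourceReport",
                  "triggerResourceCtrl", "resourceCtrl", "accessCtrl"}
server_options = {"search", "present", "delSet"}

options, size = negotiate(client_options, server_options, 4096, 8192)
print(sorted(options), size)
```

On the wire the options travel as a bit string rather than as names; Python sets are used here only to make the intersection obvious.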

Miscellaneous Other Features


Current Developments


Perceived Strengths


Perceived Weaknesses


Examples of Envisaged Order of Use of Facilities within a Session

Simple Version 2 Session
PDU 1: Client sends initialization request
PDU 2: Server responds to initialization request
PDU 3: Client sends search request
PDU 4: Server responds to search request
PDU 5: Client sends present request
PDU 6: Server responds to present request

Complex Version 3 Session Using All Facilities
PDU 1: Client sends initialization request (initialization request put on hold while server initiates following suboperation)
PDU 2: Server sends access control request (requesting user account and password)
PDU 3: Client sends access control response (user account and password)
PDU 4: Server sends initialization response
PDU 5: Client sends explain request asking for server databases available
PDU 6: Server sends explain response with list of databases
PDU 7: Client sends search request (asking for a particularly common term in a selected database)
PDU 8: Server sends resource-control request asking client permission to proceed with potentially expensive search
PDU 9: Client sends resource-control response denying permission to proceed
PDU 10: Server sends search response indicating that search was cancelled
PDU 11: Client sends scan request asking for list of alphabetic neighbors of the search term
PDU 12: Server sends scan response with a list of neighboring terms
PDU 13: Client sends search request using an alternate term
PDU 14: Server sends search response indicating that it found 120 matching records, storing pointers to them in a result set on the server
PDU 15: Client sends sort request to order records by publication date
PDU 16: Server sends sort response indicating operation was successful
PDU 17: Client sends present request asking for the entire result set, delivered in as many segments as needed given buffer size constraints
PDU 18: Server sends segment 1 containing records 1-50
PDU 19: Server sends segment 2 containing records 51-100
PDU 20: Server sends present response including records 101-120
PDU 21: Client sends another search request; this time the server appears to hang
PDU 22: Client sends trigger-resource-control request to find out why
PDU 23: Server sends resource-control request asking client permission to proceed (there is no such thing as a trigger-resource-control response)
PDU 24: Client sends resource-control response granting permission to proceed; server still appears to hang
PDU 25: Client sends another trigger-resource-control request, asking that search be cancelled
PDU 26: Server sends search response indicating that search was cancelled
PDU 27: Client sends resource-report request
PDU 28: Server sends resource-report response with current charges and list of result sets in use
PDU 29: Client sends extended services request ordering paper copy of the manuscript referred to in record 85 of the 120 record result set
PDU 30: Server sends extended services response indicating that manuscript will be mailed
PDU 31: Client sends result-set-delete request removing the result set
PDU 32: Server sends result-set-delete response indicating that result set was successfully removed
PDU 33: Client sends close request (Termination)
PDU 34: Server sends close response
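
Example (sketch): splitting a Present into segments

PDUs 17 through 20 of the complex session deliver 120 records as two Segment PDUs followed by the final Present response. The sketch below reproduces that split under a simplifying assumption: every message holds a fixed number of records (50 here), whereas a real server would pack records according to the negotiated buffer sizes in bytes. The function name plan_segments is hypothetical.

```python
def plan_segments(first, last, records_per_message):
    """Split the record range first..last into consecutive (start, end) chunks."""
    chunks = []
    start = first
    while start <= last:
        end = min(start + records_per_message - 1, last)
        chunks.append((start, end))
        start = end + 1
    return chunks


chunks = plan_segments(1, 120, 50)
*segments, final = chunks   # the last chunk travels in the Present response
for number, (start, end) in enumerate(segments, 1):
    print(f"segment {number}: records {start}-{end}")
print(f"present response: records {final[0]}-{final[1]}")
```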

Annotated Sample Session

Here is a sample session taken from the log files of an experimental Z39.50 Version 2 server at the NLM. In it a client application running on a host named myhost.Berkeley.EDU requests a simple search for the term "heart" as a Subject in the MEDLINE database, for which the server produces a result set of 7927 records. The client then requests delivery of record number 5555 from the result set.

The actual communication between server and client is encoded in the BER binary format. What appears below is an annotated translation of a binary log of a Z39.50 session, as produced by a pretty-printer utility.

Note:
Client requests below are marked by lines beginning with C:
Server responses and actions are marked by lines beginning with S:
Lines beginning with -> are descriptive comments

-> Server senses an incoming TCP connection
S:Association with myhost.Berkeley.EDU

-> PDU 1: Client sends an initialization request
C:initRequest = {
-> Client states versions it supports
C:    protocolVersion = {
C:        version-1(0)
C:        version-2(1)
C:    }
C:    options = {
-> Client states facilities & features it would use
C:        search(0)
C:        present(1)
C:        delSet(2)
C:        resourceReport(3)
C:        triggerResourceCtrl(4)
C:        resourceCtrl(5)
C:        accessCtrl(6)
C:    }
-> Client proposes some buffer sizes
C:    preferredMessageSize = 4096
C:    exceptionalRecordSize = 8192
-> Client software announces its identity
C:    implementationId = "1991"
C:    implementationName = "UCB Info Client"
C:    implementationVersion = "0.2"
C:}

-> PDU 2: Server responds to initialization request
S:initResponse = {
-> Server states versions it supports
S:    protocolVersion = {
S:        version-1(0)
S:        version-2(1)
S:    }
-> Server states features it supports
S:    options = {
S:        search(0)
S:        present(1)
S:        delSet(2)
S:    }
-> Server agrees with buffer sizes (the server always has the last word)
S:    preferredMessageSize = 4096
S:    exceptionalRecordSize = 8192
-> Z39.50 (not TCP) session "accepted"
S:    result = true
-> Server software announces its identity
S:    implementationId = "1994"
S:    implementationName = "NLM MEDLARS Server"
S:    implementationVersion = "0.3"
-> Server sends information that client may optionally display
S:    userInformationField = {
S:        data = "
S:Welcome to the NLM Z39.50 server (beta)
S:
S:"
S:    }
S:}

-> PDU 3: Client sends a search request
C:searchRequest = {
-> Three numbers saying the client wants only one record back, and only if the total count < 100
C:    smallSetUpperBound = 1
C:    largeSetLowerBound = 100
C:    mediumSetPresentNumber = 1
-> Replace any existing result set of the same name
C:    replaceIndicator = true
-> Client sends the name of the result set
C:    resultSetName = "default"
C:    databaseNames = {
C:         "medline"
C:    }
-> Begin Type-1 query
C:    query = {
C:        type-1 = {
-> Specify Bib-1 attribute set
C:            attributeSet = OID 1.2.840.10003.3.1
-> Begin RPN query
C:            rpn = {
C:                op = {
C:                    attrTerm = {
C:                        attributes = {
C:                                relation = Equal
C:                                     use = Subject
C:                        }
-> User's search term
C:                        term = {
C:                            general = "heart"
C:                        }
C:                    }
C:                }
C:            }
C:        }
C:    }
C:}

-> PDU 4: Server responds to search request
S:searchResponse = {
S:    resultCount = 7927
S:    numberOfRecordsReturned = 0
S:    nextResultSetPosition = 1
S:    searchStatus = true
S:    presentStatus = success(0)
S:}

-> PDU 5: Client sends a present request
C:presentRequest = {
C:    resultSetId = "default"
-> Send one record starting at record 5555 within result set 
C:    resultSetStartPoint = 5555
C:    numberOfRecordsRequested = 1
C:}

-> PDU 6: Server responds to present request
S:presentResponse = {
S:    numberOfRecordsReturned = 1
-> Position of the record following the last one delivered
S:    nextResultSetPosition = 5556
S:    presentStatus = success(0)
S:    records = {
S:        responseRecords = {
S:            {
-> The requested record
S:                record = {
S:                    retrievalRecord = {
-> Record is in USMARC format
S:                        id = OID 1.2.840.10003.5.10
-> Beginning of the record itself (truncated)
S:                        data = "02729naa  2200336z  450000100[...]"
-> [Rest of record deleted in this example]
S:                    }
S:                }
S:            }
S:        }
S:    }
S:}

-> Server senses that client has closed TCP connection (no PDU sent to client)
S:Association closed: release or abort.
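
Example (sketch): why the search response above carried no records

The three numbers in the search request of PDU 3 classify the result set as small, medium, or large, and that classification decides how many records accompany the search response. The sketch below is one reading of that rule; the function name records_to_return is hypothetical and element set handling is ignored.

```python
def records_to_return(result_count, small_set_upper_bound,
                      large_set_lower_bound, medium_set_present_number):
    """How many records a search response should carry along with the count."""
    if result_count <= small_set_upper_bound:
        return result_count                    # small set: send everything
    if result_count >= large_set_lower_bound:
        return 0                               # large set: send nothing
    # medium set: send up to the requested number
    return min(result_count, medium_set_present_number)


print(records_to_return(7927, 1, 100, 1))   # 0
```

With 7927 hits the set is "large", so the server returned only the count (numberOfRecordsReturned = 0) and the client had to follow up with a present request.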

Further Information

