Sender: Let's Go Gopherin' <GOPHERN@UBVM.cc.buffalo.edu>
From: richard smith <rjs@lis.pitt.edu>
Subject: #25 The Limits of Gopher
To: Multiple recipients of list GOPHERN <GOPHERN@UBVM.cc.buffalo.edu>

NAVIGATING THE INTERNET: LET'S GO GOPHERIN'
Richard J. Smith and Jim Gerland

As promised (2 days late), here is our guest lecturer.

-------------------------------------------------------------------
Christinger Tomer is Assistant Professor, School of Library and Information Science, University of Pittsburgh. Before joining the faculty at Pittsburgh, he taught at several other institutions, mainly Case Western Reserve University. He holds a bachelor's degree from the College of Wooster and Master's and Ph.D. degrees from CWRU. His interests include the application of information technologies to library services.
--------------------------------------------------------------------

THE LIMITS OF GOPHER

Of the applications developed in recent years to support resource discovery and information retrieval over the Internet, the University of Minnesota's Internet Gopher is arguably the most important. Part of its importance is owed to the scope of its deployment; a recent estimate put the number of active Gopher servers worldwide well in excess of 1200. But the larger reason for its importance is the more obvious one -- Gopher has made the Internet both accessible and usable for large numbers of users, many of them new users who otherwise lack the means to make extensive use of the resources available to them.

Yet, as significant as it has been and remains today, Gopher is in many ways already outmoded. Designed primarily as a document delivery system, it lacks the finer granularity that many users require. Where users were once satisfied, say, to identify the machines on which the latest version of the manual for the Elm mail user agent resides, today they want to be able to query an array of servers and retrieve the relevant sections of the manual.
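
The granularity point can be made concrete by looking at what a Gopher menu actually carries. What follows is a minimal sketch, not a full client, assuming the menu line layout of the basic Gopher protocol: one type character, then a display string, selector, host, and port separated by tabs, with a lone period ending the listing. The host name gopher.example.edu, the selectors, and the names MenuItem and parse_menu are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class MenuItem(NamedTuple):
    item_type: str  # one character, e.g. "0" text file, "1" directory, "7" search
    display: str    # the string shown to the user
    selector: str   # opaque string sent back to the server to fetch the item
    host: str
    port: int


def parse_menu(response: str) -> List[MenuItem]:
    """Parse the text of a Gopher menu into a list of items."""
    items = []
    for line in response.splitlines():
        if line == ".":  # a lone period marks the end of the listing
            break
        if not line:
            continue
        fields = line[1:].split("\t")
        if len(fields) < 4:
            continue  # skip malformed lines
        # Only the first four fields are used here; a server may send
        # further tab-delimited fields, which this sketch ignores.
        display, selector, host, port = fields[:4]
        items.append(MenuItem(line[0], display, selector, host, int(port)))
    return items


# Hypothetical menu text, made up for this example.
SAMPLE = (
    "0Elm manual\t/docs/elm-manual.txt\tgopher.example.edu\t70\r\n"
    "1Mail software\t/mail\tgopher.example.edu\t70\r\n"
    "7Search the manuals\t/search\tgopher.example.edu\t70\t+\r\n"
    ".\r\n"
)

if __name__ == "__main__":
    for item in parse_menu(SAMPLE):
        print(item.item_type, item.display, "->", item.host, item.selector)
```

Each parsed entry names a whole file, directory, or search service on one server. Nothing in the menu line addresses a section inside a document, which is the document-level granularity described above.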
The availability of the search engine known as Veronica has helped to some degree, but the main problem is that Gopher's designers did not outfit their system with native mechanisms for more sophisticated forms of searching or for processing more complex document types. (Although release of the software to the Internet community clearly implied a desire for deployment beyond the University of Minnesota system, the fact that the system is based on a simple, hierarchical file system suggests that the designers of the original system did not envision supporting a network of well over a thousand file servers scattered across a global network.) The "Gopher+" enhancements, which rely on transmitting tab-delimited fields beyond those specified by the first generation of Gopher servers and clients, support the retrieval and display of pictures, sounds, and motion video, but the basic Gopher mechanisms remain fairly primitive and inflexible, with the bookmark feature being the only significant option for customization at the client level.

NCSA MOSAIC AND THE NEXT GENERATION OF RESOURCE DISCOVERY TOOLS

However, the next generation of tools is already at hand. Perhaps the most interesting of them is the National Center for Supercomputing Applications' Mosaic. Based on the so-called "WorldWideWeb" technologies developed at CERN in Switzerland, Mosaic is described by its developers as "a distributed hypermedia system designed for information discovery and retrieval over the global Internet." (Marc Andreessen, "Getting Started with NCSA Mosaic," Unpublished paper, National Center for Supercomputing Applications.) Using the X Window system as its interface, NCSA Mosaic unifies access to various protocols, data formats, and archives, and provides interfaces to external viewers designed to handle display formats other than the X bitmap, e.g., JPEG, TIFF, DVI, MPEG, and PostScript.
For example, within the framework provided by a single interface, a user may run a Gopher session, instruct an Archie client to run a search, or retrieve images from the Library of Congress's Vatican exhibit.

Mosaic's hypermedia capabilities are derived from the use of the HyperText Markup Language (HTML). Based on the Standard Generalized Markup Language (SGML), the ISO standard for internal document description, HTML uses tags to indicate formatting or structural information. One of the structures HTML tags may specify is a link to another document, which may be situated on the same server or located somewhere else on the network. Based on a single directive known in the context of HTML as an "anchor," the tag points to a specific file and provides the basis for a traversable link between the anchor and the file to which the link points.

The operational significance of the embedded "anchors" is that, at least in principle, files located anywhere on the Internet may be linked, and that links may be added or deleted in accord with the requirements of either document designers or end users. As a result, Mosaic is capable of supporting several modes of asynchronous collaboration, including document annotation, document crosslinking, and document revision control. In addition, NCSA Mosaic can communicate directly with Collage, which is NCSA's synchronous collaboration tool intended mainly for use in scientific data analysis and manipulation, and with NCSA's Data Management Facility, which is a relational database system designed especially for scientific data.

(One of the threads connecting Mosaic, the WorldWideWeb, and the Internet Gopher is a scheme for document naming known as the Uniform Resource Locator (URL).
The URL has been described as "a networked extension of the standard filename concept: not only can you point to a file in a directory, but that file and that directory can exist on any machine on the network, can be served via any of several different methods, and might not even be something as simple as a file: URLs can also point to queries, documents stored deep within databases, the results of a finger or archie command, or whatever." Perhaps more to the point, the use of URLs and the deployment of a similar scheme for resource naming represent key factors in further regularizing the processes supported by tools like Gopher, WWW, and Mosaic.)

THE NEAR FUTURE

In the near term, we can expect that the Gopher system will be superseded, albeit slowly, by Mosaic and similar applications. Already there are Mosaic clients -- in effect, "proof-of-concept" applications -- that will run successfully under Microsoft Windows 3.1 and Macintosh System 7. The speed of this transition will depend in large measure upon the capabilities of the local area networks from which clients are launched and the processing capabilities of the computers upon which those clients run. For example, so-called "fast Ethernet" will support transfer rates of up to 100 megabits per second. With fast networks of this kind coupled to the next generation of desktop computers, which are expected to be RISC machines or the equivalent thereof, available network bandwidth and local processing power should be great enough to support a generation of robust resource discovery/retrieval tools based on or emulating the X Window interface. The more difficult question is how long it will be necessary to support the several generations of machines built on the PC AT bus and running versions of MS-DOS.
However, as long as those machines remain a significant factor (and given their numbers, the state of the general economy, and the nature of end-user computing, it seems at this point that they will for at least another five years), the Internet Gopher and other essentially low-end systems will remain a potent force in this area of network computing.

Thanks to Slippery Rock University's library and computer center staff, and the State University of New York at Buffalo's School of Information & Library Studies faculty, for their assistance in helping me continue the course while on the road. --Rich

Richard J. Smith                      smithr@clp2.clpgh.org
The Carnegie Library of Pittsburgh    or rjs@lis.pitt.edu

Jim Gerland                           gerland@ubvms.cc.buffalo.edu
State University of New York at Buffalo
Academic Services, Computing and Information Technology