
Digital Mapping Techniques '02 -- Workshop Proceedings
U.S. Geological Survey Open-File Report 02-370

Using NADM in a Distributed Framework

By Eric Boisvert, Annie Morin, and Martin Anctil

Geological Survey of Canada, Québec Division
880 Chemin Ste-Foy
Québec City, Québec, Canada G1V 4C7
Telephone: (418) 654-3705
Fax: (418) 654-2615
e-mail: eboisver@nrcan.gc.ca, amorin@nrcan.gc.ca, manctil@nrcan.gc.ca

INTRODUCTION

The goal of the Canadian Geoscience Knowledge Network (CGKN, http://www.cgkn.net/) is to establish a framework "which would link all of the government geological surveys and could potentially include knowledge held within academic institutions and the private sector. The resulting 'single window' access will facilitate national and international access to Canadian geoscience knowledge and incorporate Canadian geoscience data into the Canadian Geospatial Data Infrastructure." The approach chosen by CGKN is to establish links between data sources without imposing a formal structure, yet still link the information dynamically and seamlessly. GIS interoperability is well served by several functional tools that allow the display of maps from various sources without prior conversion. The emergence of Web mapping technology now allows a user to dynamically merge map data from remote locations through the Web. Although these technologies are functional for the geometric aspect of GIS, the content of the maps is far from easy to integrate. Organizations use their own classification frameworks, often conflicting, based on historical usage or on particular institutional interests or mandates, or they impose no rules at all. The result is a gigantic patchwork of classifications that must be integrated into a single coherent database. In a previous paper (Boisvert and others, 2001) we discussed how we experimented with a small system that extracts information from several databases and presents the results in a single report without having to physically link the underlying databases. In this paper we present new developments and describe how some of our earlier ideas have been implemented.

DO YOU SPEAK NADM?

GEOMDB Prototype

GEOMDB (GEOscience Multiple DataBase), presented in Boisvert and others (2001), was our first attempt to connect various databases through a single interface. Its architecture was relatively simple and used the NADM-Cord (Brodaric and others, 1999a) Conceptual Object Archive (COA) structure to index the information from one database to the other. (The NADM-Cord COA has evolved from the original meaning of COA, as documented in v4.3 of the model, Compound Object Archive, but the differences are purely academic for our usage. In this paper we use COA and concept interchangeably.) The only part shared between the various databases was a unique identifier related to a concept (the COA) in the local database. The list of globally known concepts was kept in a central registry, and each local database had the responsibility to maintain a correlation between these global concepts and their local counterparts. When a query was issued at the central registry, the request was cascaded to all local databases, each of which converted the global identifier into its local identifier to search for the information. When the information was found in a local database, the result was sent back to the central registry as a small part of a Web page. The registry combined it with the positive results from all other local databases, reassembled them into a single page, and sent that page back to the client. The whole system depends on the fact that almost everything in NADM-Cord is tied in one way or another to a COA. This feature was the backbone for the interoperability of the various NADM-Cord implementations that we tested. With a COA reference, it is possible to extract information from any related map, such as a single map legend element or descriptive attributes (including images and text), all of which can be attached to a COA. The COA really acts as feature-level metadata and can be used as a point of contact between databases.
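The cascading query described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the in-memory dictionaries, and the sample identifiers are hypothetical and do not reproduce the actual GEOMDB code, which returned fragments of Web pages from real databases.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class LocalDatabase:
    """A participating database with its own identifiers (hypothetical)."""
    name: str
    # Correlation maintained locally: global COA id -> local id.
    global_to_local: Dict[int, str]
    # Local content keyed by local id (stands in for a page fragment).
    records: Dict[str, str]

    def query(self, global_coa_id: int) -> Optional[str]:
        """Convert the global id to the local id, then search locally."""
        local_id = self.global_to_local.get(global_coa_id)
        if local_id is None:
            return None  # this database does not know the concept
        return self.records.get(local_id)


@dataclass
class CentralRegistry:
    """Keeps the list of globally known concepts and cascades queries."""
    concepts: Dict[int, str]
    databases: List[LocalDatabase] = field(default_factory=list)

    def query(self, global_coa_id: int) -> List[Tuple[str, str]]:
        """Ask every local database and keep only the positive results."""
        if global_coa_id not in self.concepts:
            raise KeyError(f"unknown concept {global_coa_id}")
        results = []
        for db in self.databases:
            fragment = db.query(global_coa_id)
            if fragment is not None:
                results.append((db.name, fragment))
        return results


# Made-up sample data: one database knows concept 55, the other does not.
registry = CentralRegistry(concepts={55: "Talwar Formation"})
registry.databases.append(
    LocalDatabase("survey_a", {55: "UNIT-12"}, {"UNIT-12": "Legend entry Dst"})
)
registry.databases.append(
    LocalDatabase("survey_b", {}, {"X1": "Unrelated record"})
)
report = registry.query(55)
```

The only thing the two databases share in this sketch is the global identifier, which mirrors the role the COA plays as a point of contact.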

IMPROVEMENTS

Two major changes, one to the data model and the other to the GEOMDB framework, have put the idea discussed above in a new perspective. The first change was to adopt an idea put forward by Brodaric and Hastings (2001) in the U.S. National Geologic Map Database's Kentucky prototype (an object-oriented version of NADM; see documentation at the Web site, http://geology.usgs.gov/dm/steering/teams/design/). The idea is to eliminate what are called lookup tables, or independent lists of terms that are used to populate various areas of the data model, and instead concentrate all terms in the concept domain of the data model (as COAs). This means that all keywords or terms become concepts in their own right. The immediate impact is to simplify the management of terms in the database, and the long-term impact is to establish a complex network of interrelationships between concepts. For example, a rock unit concept is related to a stratigraphic age concept to define the unit's age; this age concept can in turn be linked to another piece of information (for example, a piece of text, or another concept, like a geochron age) that was not foreseen by the person loading the rock unit in the database.

This technique has been used by Davenport (this volume) to build an emerging encoded science language. He established that certain geological concepts are best described by the union of several concepts. For example, a rock type can be described by a conjunction of material, genesis, and texture/fabric. A map unit can in turn be described by a collection of rock types and an age. The cascading effect of linking one concept to other concepts allows a map unit to be linked with genesis (and genesis with environment, and so forth). This technique seems intuitively closer to the way geological information is structured. The approach used in the GEOMDB prototype seems profitable because this small improvement in the data model opens a realm of possibilities. One is dynamic reclassification (for example, units into ages: because the unit concepts are related to age concepts, it is possible to reclassify units as ages). Another is the possibility to query the database about relationships between concepts even if the relationship is not encoded in a single database. The map unit-age relationship could lie in one database, and the map unit-genesis relationship could reside in another. Joining the results of both databases, a user could find where a particular genesis is found at a specific time, even if this information is not explicitly coded in one database instance.
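The two possibilities above, dynamic reclassification and joining relationships held in separate databases, can be illustrated with a minimal Python sketch. The concept names and the list-of-pairs representation are made up for the example; a real implementation would work on COA identifiers returned by each database.

```python
from collections import defaultdict


def reclassify_by_age(unit_age_pairs):
    """Group map units under the age concept they are related to."""
    groups = defaultdict(list)
    for unit, age in unit_age_pairs:
        groups[age].append(unit)
    return dict(groups)


def genesis_by_age(unit_age_pairs, unit_genesis_pairs):
    """Join two relationship sets on the shared unit concept.

    Returns {(genesis, age): {units}}, a relationship that neither
    source database encodes on its own.
    """
    ages_of = defaultdict(set)
    for unit, age in unit_age_pairs:
        ages_of[unit].add(age)
    joined = defaultdict(set)
    for unit, genesis in unit_genesis_pairs:
        for age in ages_of.get(unit, ()):
            joined[(genesis, age)].add(unit)
    return dict(joined)


# Hypothetical content: database A holds unit-age relationships,
# database B holds unit-genesis relationships.
db_a = [("unit_1", "devonian"), ("unit_2", "devonian"), ("unit_3", "silurian")]
db_b = [("unit_1", "marine"), ("unit_2", "fluvial"), ("unit_3", "marine")]

units_per_age = reclassify_by_age(db_a)
genesis_age = genesis_by_age(db_a, db_b)
```

The join works only because both databases refer to the same unit concepts, which is the reason for concentrating all terms in the concept domain.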

Boisvert and others (2001) noted "we are of course toying with the idea of using XML as an exchange mechanism," and, in this year's work, we did indeed. The first prototype exchanged specially formatted HTML pages (actually, snippets of pages). Converting this information to XML added a new dimension to the project: the possibility to process the result rather than simply display it. The original approach used only a single mediator (the piece of software that translates the database content into an HTML page), and the resulting set of pages was merely reassembled and displayed. Using XML, we can now ask another mediator to receive the series of XML responses, process them, and do something useful with the result. Using XML also allows software other than browsers to use the server response.

This opens another set of possibilities in database interoperability. This approach relies on a translation mechanism that brings information stored in structure A to a portable format that can be translated back to structure B with another translator. This "lingua franca" method is already used by software like FME (http://www.safe.com/) where a common format (based on SAIF) is used as a launch point toward another format. Our goal for geological information is to use NADM-Cord as a lingua franca between database structures. Because the goal for CGKN is to share database content, the goal for a specific agency participating in this exchange is to provide a mechanism to translate its local structure into NADM-Cord concepts and constructs (Figure 1). This is done usually through the creation of mediators, which are pieces of software that translate back and forth between NADM-Cord and local structure and content. Developments made during the past year advanced further the concept of connecting distributed databases using an emerging concept called Web services.
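A minimal sketch of the lingua franca idea follows, assuming each agency's mediator is nothing more than a field mapping between its local structure and a shared vocabulary. The `Mediator` class, the local field names, and the common field names are all hypothetical; real NADM-Cord mediators translate structure and content, not just field names.

```python
from typing import Dict

Record = Dict[str, str]


class Mediator:
    """Translates between one local structure and the common structure."""

    def __init__(self, name: str, local_to_common: Dict[str, str]):
        self.name = name
        self.local_to_common = dict(local_to_common)
        self.common_to_local = {c: l for l, c in local_to_common.items()}

    def export(self, local_record: Record) -> Record:
        """Local structure -> common structure (unmapped fields are dropped)."""
        return {self.local_to_common[k]: v
                for k, v in local_record.items() if k in self.local_to_common}

    def load(self, common_record: Record) -> Record:
        """Common structure -> local structure."""
        return {self.common_to_local[k]: v
                for k, v in common_record.items() if k in self.common_to_local}


def translate(record: Record, source: Mediator, target: Mediator) -> Record:
    """Go from structure A to structure B through the common structure."""
    return target.load(source.export(record))


def links_needed(n_structures: int) -> Dict[str, int]:
    """Compare pairwise two-way translators with one mediator per structure."""
    return {
        "pairwise": n_structures * (n_structures - 1) // 2,
        "lingua_franca": n_structures,
    }


# Hypothetical agencies with different local field names.
agency_a = Mediator("agency_a", {"NOM_UNITE": "class_name", "ETIQ": "class_label"})
agency_b = Mediator("agency_b", {"unit_name": "class_name", "label": "class_label"})

converted = translate({"NOM_UNITE": "Talwar Formation", "ETIQ": "Dst"},
                      agency_a, agency_b)
```

With five participating structures, `links_needed(5)` gives 10 pairwise translators against 5 mediators, which is the contrast between the brute-force solution and the shared interchange language.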

Figure 1. NADM-Cord as an interchange language between various database implementations. The top part of the figure shows a brute-force solution for interoperability; the bottom part shows our vision.

Web Services

People working in the information technology domain are aware of the Web service revolution. Simply stated, a Web service works similarly to the standard Web page server we all are familiar with, except that the Web page is formatted in such a way that another machine can read it and process it. It is called a service simply because it offers a small piece of information or processing logic to whatever client might want to use it; it is not exclusive to a given platform or software. Many emerging standards address this concept, but the basic mechanism is simple: a software application somewhere uses a URL (Uniform Resource Locator) to locate a machine where a piece of information is stored and requests the information (like a browser would request a Web page); the server then generates a specially formatted page (using XML) that is parsed by the calling software. The calling client can itself be a server for another application. The technique has many benefits, principally that it is relatively easy to implement because you need only a Web server and a scripting language to generate pages dynamically. (More sophisticated approaches are available.) The new .NET platform from Microsoft makes Web services even easier to deploy. This technology is ideal for implementing our idea of NADM-Cord as a lingua franca, and we are experimenting with it.
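The request-and-parse cycle described above can be sketched with the Python standard library. The response layout (`CONCEPTS` and `COA` elements) and the example.org address are assumptions made for illustration, not the format of any actual CGKN service. The `opener` parameter is also an addition of this sketch so that the example runs without a network; it defaults to the real `urllib.request.urlopen`.

```python
import io
import urllib.request
import xml.etree.ElementTree as ET


def call_service(url, opener=urllib.request.urlopen):
    """Request a URL like a browser would, but parse the XML that comes back."""
    with opener(url) as response:
        return ET.fromstring(response.read())


def concept_names(root):
    """Turn the (hypothetical) response into a {COA id: name} dictionary."""
    return {int(el.get("COA_ID")): (el.text or "").strip()
            for el in root.iter("COA")}


def fake_opener(url):
    """Stand-in for a Web server: returns a canned XML page for any URL."""
    page = b'<CONCEPTS><COA COA_ID="55">Talwar Fmt</COA></CONCEPTS>'
    return io.BytesIO(page)


names = concept_names(
    call_service("http://example.org/coa_service", opener=fake_opener)
)
```

Because the client only depends on a URL and an XML layout, it could equally be a desktop application, a script, or another server.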

PROTOTYPE SERVICES

The COA Service

The first problem we had to resolve in the distributed database project was how to store a single copy of the COA tree (see Boisvert and others, 2001, for the rationale) and give access to several users at once. In other words, we needed to determine how to remotely manipulate a list of COAs (the database incarnation of concepts). A service has been created on our Web server to permit manipulation of a COA tree; operations such as "Create a new COA," "Move a branch," and "Get a copy of a branch" are all possible through a series of ColdFusion pages specially formatted to be parsed by Geomatter (Boisvert and others, 2000; Brodaric and others, 1999b).

Figure 2 shows schematically how this service can be used to synchronize information between a central database and a local database. The call to the service is made through a regular URL to a specific page, which activates a ColdFusion script. These scripts operate on the database and generate responses formatted in XML, which Geomatter parses, making the necessary adjustments to the user interface (that is, creating a visual representation of the information). This is a good example in which the client of the service is not a Web browser. The user working in Geomatter never sees the XML and is not aware that a conversation is being held between Geomatter and the Web server; the user is shielded by a user interface that displays familiar Windows controls.


Figure 2. Geomatter as a client of a service. Geomatter calls the concept "manipulation service" and can manipulate the content of the central registry through a series of operations. The server side is encoded using ColdFusion scripts that interact with the database and return results to Geomatter using XML formatted pages.
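The three operations named for the COA service ("Create a new COA," "Move a branch," and "Get a copy of a branch") can be sketched as follows. This is an in-memory stand-in written in Python, not the ColdFusion scripts used on the server; the class, its method names, and the nested `COA` element layout of the XML are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


class CoaTree:
    """A single shared copy of the concept tree (hypothetical structure)."""

    def __init__(self):
        self._names = {}    # COA id -> concept name
        self._parent = {}   # COA id -> parent COA id, or None for a root
        self._next_id = 1

    def create(self, name, parent_id=None):
        """'Create a new COA': add a concept under an optional parent."""
        if parent_id is not None and parent_id not in self._names:
            raise KeyError(parent_id)
        coa_id = self._next_id
        self._next_id += 1
        self._names[coa_id] = name
        self._parent[coa_id] = parent_id
        return coa_id

    def _is_in_branch(self, coa_id, branch_root_id):
        """True if coa_id is branch_root_id or one of its descendants."""
        while coa_id is not None:
            if coa_id == branch_root_id:
                return True
            coa_id = self._parent[coa_id]
        return False

    def move_branch(self, coa_id, new_parent_id):
        """'Move a branch': re-attach a concept and everything below it."""
        if coa_id not in self._names:
            raise KeyError(coa_id)
        if new_parent_id is not None:
            if new_parent_id not in self._names:
                raise KeyError(new_parent_id)
            if self._is_in_branch(new_parent_id, coa_id):
                raise ValueError("cannot move a branch under itself")
        self._parent[coa_id] = new_parent_id

    def branch_xml(self, coa_id):
        """'Get a copy of a branch': serialize the branch as nested XML."""
        def build(node_id):
            element = ET.Element("COA", COA_ID=str(node_id))
            element.text = self._names[node_id]
            for child, parent in self._parent.items():
                if parent == node_id:
                    element.append(build(child))
            return element
        return ET.tostring(build(coa_id), encoding="unicode")


tree = CoaTree()
rock_unit = tree.create("Rock unit")
talwar = tree.create("Talwar Formation", rock_unit)
age = tree.create("Devonian age")       # created at the top level...
tree.move_branch(age, talwar)           # ...then moved under the formation
branch = tree.branch_xml(rock_unit)
```

A client such as Geomatter would receive a string like `branch` over HTTP and rebuild its own visual tree from it.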

Map Availability Service

This service is a direct improvement on the HTML-only solution presented in Boisvert and others (2001) because XML is now used. Each server is asked to return a list of sources (maps) that contain or display a specific geological concept. This list of sources is formatted in XML (see Example 1) so it can be further processed. The merged list is then displayed by the central portal, which shows the results from a set of local databases as if they came from one seamless database.

In Figure 3, the central portal receives a request to find maps that contain a certain geological concept. This request is passed to all local databases, which return an XML segment as a response. The central database can then easily merge XML segments and present a single response to the issuer of the request. The XML (Example 1) contains all necessary information to locate the map through a standard Web Mapping Service (WMS) call. The WMS is an Open GIS Consortium (OGC) standard to request maps over the Internet (OGC-WMS, 2000); this technology is gaining wide acceptance among GIS vendors.
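The merging step performed by the central portal can be sketched as below, assuming each local database returns a `MAPLIST` segment shaped like Example 1. Two behaviours in this sketch are assumptions rather than documented features of the portal: maps with the same `SRC` are kept only once, and a segment that fails to parse is skipped. The example.org addresses are placeholders.

```python
import xml.etree.ElementTree as ET


def merge_maplists(segments, client="SIMPLE_MAPSERVICE"):
    """Combine the MAP elements of several MAPLIST responses into one."""
    merged = ET.Element("MAPLIST", CLIENT=client)
    seen_sources = set()
    for segment in segments:
        try:
            root = ET.fromstring(segment)
        except ET.ParseError:
            continue  # assumption: ignore a malformed response
        for map_element in root.iter("MAP"):
            source = map_element.get("SRC")
            if source in seen_sources:
                continue  # assumption: the same map is listed once
            seen_sources.add(source)
            merged.append(map_element)
    return merged


# Hypothetical responses from two local databases.
response_a = """<MAPLIST CLIENT="SIMPLE_MAPSERVICE">
  <MAP SRC="http://example.org/wms?map=surficial.map">
    <NAME>Surficial geology</NAME><TYPE>WMS</TYPE>
  </MAP>
</MAPLIST>"""
response_b = """<MAPLIST CLIENT="SIMPLE_MAPSERVICE">
  <MAP SRC="http://example.org/wms?map=bedrock.map">
    <NAME>Bedrock geology</NAME><TYPE>WMS</TYPE>
  </MAP>
  <MAP SRC="http://example.org/wms?map=surficial.map">
    <NAME>Surficial geology</NAME><TYPE>WMS</TYPE>
  </MAP>
</MAPLIST>"""

merged_list = merge_maplists([response_a, response_b, "<MAPLIST"])
map_names = [m.findtext("NAME") for m in merged_list.findall("MAP")]
```

The merged element can be serialized with `ET.tostring(merged_list, encoding="unicode")` and returned to the issuer of the request as a single response.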

Figure 3. Web site as a client of a service. The content of this page is the result of a series of calls made to various services. The application is interacting with a server that merges information from various servers using Web services.

Example 1. Sample of an XML response from the server.

<?xml version="1.0"?>
<MAPLIST CLIENT="SIMPLE_MAPSERVICE">
<!-- this is where on the Web the map is -->
<MAP SRC="http://www.cgq-qgc.ca/cgi-bin/mapserv_35s.exe?map=d:/webcgq/hydrolink/data/maps/production/english/surf_wms.map">
	Surficial geology of Canada
	<!-- this is its full name -->
	<NAME>Surficial geology of Canada</NAME>
	<!-- this tells us that the map can be accessed using the WMS protocol -->
	<TYPE>WMS</TYPE>
	<!-- the projection of the source map, using EPSG codes -->
	<PROJ SRS="EPSG:4269" />
	<!-- the limits of the map -->
	<BBOX XMIN="-140.0" YMIN="40.0" XMAX="-40.0" YMAX="85.0" />
		<!-- and a list of layers composing the map -->
		<COVERS>
		<COVER NAME="10002">Surficial Geology of Canada</COVER>
		</COVERS>
</MAP>
<!-- and we continue with the next map -->
<MAP SRC="http://www.cgq-qgc.ca/cgi-bin/mapserv_35s.exe?map=d:/webcgq/hydrolink/data/maps/production/english/piedmond_wms.map">
	Carte des formations de surface de Portneuf
	<NAME>Carte des formations de surface de Portneuf</NAME>
	<TYPE>WMS</TYPE>
	<PROJ SRS="EPSG:4269" />
	<BBOX XMIN="45.5" YMIN="-71.5" XMAX="45.0" YMAX="-70.5" />
		<COVERS>
		<COVER NAME="10006">Unité geologie de surface</COVER>
		<COVER NAME="10009">Forages</COVER>
		<COVER NAME="10031">Station GIMS</COVER>
		</COVERS>
. . .continues. . .

This XML segment can be consumed by any application that can parse XML tags. We coded a simple consumer (a consumer is a service user, to extend the business metaphor; this term is widely used in Web service literature) that can merge a selected list of maps into a single view (Figure 3), but because this has been established as a service, other applications can use it. For instance, someone writing software in Visual Basic might want to call this service from within their code; the service is totally independent of which client is using it. The map is not trapped in this Web page: other applications can use the service and extract the map. For instance, we wrote a small application that reads the response of this service and builds a composite view of maps aggregated from a single query to the central registry.
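As an illustration of such a consumer, the sketch below reads one `MAP` element shaped like Example 1 and builds a WMS map request from it. The request parameters follow the WMS 1.1.1 GetMap convention, which is an assumption of this sketch: the paper cites the 2000 specification, whose parameter names differ. The image size, the image format, and the example.org address are also placeholders.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def getmap_url(map_element, width=600, height=400, image_format="image/png"):
    """Build a GetMap URL from the SRC, PROJ, BBOX and COVER information."""
    source = map_element.get("SRC")
    bbox = map_element.find("BBOX")
    layers = [cover.get("NAME") for cover in map_element.iter("COVER")]
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",            # assumed protocol version
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",
        "SRS": map_element.find("PROJ").get("SRS"),
        "BBOX": ",".join(bbox.get(k) for k in ("XMIN", "YMIN", "XMAX", "YMAX")),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    # The SRC in Example 1 already carries a query string (?map=...).
    separator = "&" if "?" in source else "?"
    return source + separator + urlencode(params, safe=",:/")


# Hypothetical map entry with made-up extent and layer names.
sample = ET.fromstring("""
<MAP SRC="http://example.org/cgi-bin/mapserv?map=demo.map">
  <NAME>Demo map</NAME>
  <TYPE>WMS</TYPE>
  <PROJ SRS="EPSG:4269" />
  <BBOX XMIN="-72.0" YMIN="46.5" XMAX="-71.5" YMAX="47.0" />
  <COVERS>
    <COVER NAME="10006">Surface units</COVER>
    <COVER NAME="10009">Boreholes</COVER>
  </COVERS>
</MAP>
""")

request_url = getmap_url(sample)
```

A viewer would fetch `request_url` for each selected map and overlay the returned images to obtain a composite view.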

OGC-Related Services

Because OGC is becoming an important part of Canadian spatial infrastructure, we developed, with the help of Compusult Inc., in Newfoundland, Canada, the first step of WMS compliance for NADM-Cord. This service delivers standard WMS GetCapabilities and GetFeatureInfo responses with information extracted from the NADM-Cord framework, achieving for NADM-Cord a primary aspect of WMS interoperability. GetCapabilities is a standard WMS call to identify what is available from a server (which maps, layers, projections, metadata, etc.). GetFeatureInfo allows users to query a specific feature from a map and extract its attributes. The response format is not specified in the WMS specification, so we had to create one that would fit the NADM-Cord requirements. A typical result is presented in Example 2.

Example 2. Response for a GetFeatureInfo.

<?xml version='1.0' encoding="UTF-8" standalone="no" ?>
<NADM VERSION="5.2" xmlns="http://www.cgkn.net/NADM">
<!-- A feature_block is created for every spatial object in the list -->

<FEATURE_INFO>
<DBSOURCE xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" xlink:href="http://www.cordlink1.org">

<REQUEST MODE="SELECT"> 
 <SOURCE SOURCE_ID="24">Geological Map Of Canada</SOURCE>
 <NADM_DATASET DATASET_ID="56">Geology
 <SPATIAL_OBJECT_ID ID="100"/>
 </NADM_DATASET>
 <NADM_SERVICE URL="http://www.cordlinkg.org"/>
</REQUEST>

<CLASSIFICATION SCHEME_ID = "125" CLASS_OBJ_ID="1265">
 <CLASS_LABEL>Dst</CLASS_LABEL>
 <CLASS_NAME>Talwar Formation</CLASS_NAME>
 <COA COA_ID = "55">Talwar Fmt
 <COA_ATT DESC_TYPE="IMAGE" DESC_ID="25"/>
 <COA_ATT DESC_TYPE="IMAGE" DESC_ID="56"/>
 <COA_ATT DESC_TYPE="TEXT" DESC_ID="1123"/>
 </COA>
 <COA_REL COA_REL_TYPE="ROCK COMPOSITION" COA_ID = "225">Calcareous limestone</COA_REL> 
 interbedded with <COA_REL COA_REL_TYPE="ROCK COMPOSITION" COA_ID = "123">minor 
 shales</COA_REL> of <COA_REL COA_REL_TYPE="AGE" COA_ID = "1234">devonian age</COA_REL> 
. . .continues. . .
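A client could read such a response as sketched below. Because Example 2 is truncated, the sample used here is a shortened version that this sketch closes by hand; where the real response places its closing tags is an assumption, and so are the function name and the shape of the returned dictionary. The default namespace declared on the `NADM` element must be passed to the parser's search functions.

```python
import xml.etree.ElementTree as ET

NS = {"n": "http://www.cgkn.net/NADM"}

# Shortened, hand-closed version of Example 2 (assumed closing tags).
SAMPLE = """<?xml version='1.0' encoding="UTF-8" standalone="no" ?>
<NADM VERSION="5.2" xmlns="http://www.cgkn.net/NADM">
<FEATURE_INFO>
<CLASSIFICATION SCHEME_ID="125" CLASS_OBJ_ID="1265">
 <CLASS_LABEL>Dst</CLASS_LABEL>
 <CLASS_NAME>Talwar Formation</CLASS_NAME>
 <COA COA_ID="55">Talwar Fmt</COA>
 <COA_REL COA_REL_TYPE="ROCK COMPOSITION" COA_ID="225">Calcareous limestone</COA_REL>
 interbedded with <COA_REL COA_REL_TYPE="ROCK COMPOSITION" COA_ID="123">minor shales</COA_REL>
 of <COA_REL COA_REL_TYPE="AGE" COA_ID="1234">devonian age</COA_REL>
</CLASSIFICATION>
</FEATURE_INFO>
</NADM>"""


def read_feature_info(xml_text):
    """Extract the class name, label and related concepts of each feature."""
    # Encode to bytes: ElementTree rejects str input that carries an
    # encoding declaration.
    root = ET.fromstring(xml_text.encode("utf-8"))
    features = []
    for classification in root.iterfind(".//n:CLASSIFICATION", NS):
        relations = {}
        for rel in classification.findall("n:COA_REL", NS):
            text = " ".join((rel.text or "").split())
            relations.setdefault(rel.get("COA_REL_TYPE"), []).append(
                (int(rel.get("COA_ID")), text)
            )
        features.append({
            "label": classification.findtext("n:CLASS_LABEL", namespaces=NS),
            "name": classification.findtext("n:CLASS_NAME", namespaces=NS),
            "relations": relations,
        })
    return features


features = read_feature_info(SAMPLE)
```

Each related concept keeps its COA identifier, so a client can follow it to another service, for instance to ask which other maps display the same age concept.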

CONCLUSION

This work has given us an opportunity to experiment and crystallize our vision about how distributed databases can work. Since we started this project, the Web service paradigm has flourished, and large companies (e.g., Microsoft) are adopting it. There are now more solid standards, such as SOAP (Box and others, 2000) and WSDL (Christensen and others, 2001), that we can use to implement the ideas we tested. The new .NET platform makes service creation and consumption extremely easy to implement -- it's a matter of adding a keyword to a function. We are now looking at these tools to redesign our interoperability platform and publish these services for the outside world to use.

ACKNOWLEDGMENTS

As always, Peter Davenport and Andrew Moore of the Geological Survey of Canada and David Soller of the U.S. Geological Survey were kind enough to review the manuscript.

REFERENCES

Boisvert, E., 1999, NADM variant fact sheet: U.S. Geological Survey, http://geology.usgs.gov/dm/steering/teams/design/variant/CORDLink_Variant_Description.htm.

Boisvert, E., Desjardins, V., Brodaric, B., Berdusco, B., Johnson, B., and Lauzière, K., 2000, Geomatter II, a progress report, in Soller, D.R., ed., Digital Mapping Techniques '00 -- Workshop Proceedings: U.S. Geological Survey Open-File Report 00-325, p. 87-95, http://pubs.usgs.gov/of/2000/of00-325/boisvert.html.

Boisvert, E., Morin, A., Lauzière, K., and Lebel, D., 2001, Using the proposed North American Data Model in a distributed database environment, in Soller, D.R., ed., Digital Mapping Techniques '01 -- Workshop Proceedings: U.S. Geological Survey Open-File Report 01-223, p. 35-43, http://pubs.usgs.gov/openfile/2001/of01-223/boisvert.html.

Box, D., and others, 2000, Simple Object Access Protocol (SOAP): W3C Note 08 May 2000: World Wide Web Consortium, http://www.w3.org/TR/SOAP/.

Brodaric, B., and Hastings, J., 2001, Evolution of an object-oriented, NADM-based Data Model Prototype for the USGS National Geologic Map Database Project [abs.]: Annual Conference of the International Association for Mathematical Geology, IAMG2001, Cancun, Mexico, http://www.kgs.ku.edu/Conferences/IAMG/Sessions/I/brodaric.html.

Brodaric, B., Journeay, M., Talwar, S., Boisvert, E., and others, 1999a, CordLink Digital Library Geologic Map Data Model Version 5.2, http://cordlink.gsc.nrcan.gc.ca/cordlink1/info_pages/English/dm52.pdf.

Brodaric, B., Boisvert, E., and Lauzière, K., 1999b, Geomatter: A map-oriented software tool for attributing geologic map information according to the proposed U.S. National Digital Geologic Map Data Model, in Soller, D.R., ed., Digital Mapping Techniques '99 -- Workshop Proceedings: U.S. Geological Survey Open-File Report 99-386, p. 101-106, http://pubs.usgs.gov/openfile/of99-386/brodaric2.html.

Christensen, E., Curbera, F., Meredith, G., and Weerawarana, S., 2001, Web Services Description Language 1.1 (WSDL): W3C Note 15 March 2001: World Wide Web Consortium, http://www.w3.org/TR/wsdl.html.

Open GIS Consortium-Web Mapping Service (OGC-WMS), 2000, Web Mapping Service specification documentation, http://www.opengis.org/techno/specs/00-028.pdf.

