Digital Mapping Techniques '02 -- Workshop Proceedings
U.S. Geological Survey Open-File Report 02-370

Distributed Spatial Databases--The MIDCARB Carbon Sequestration Project

By Gerald A. Weisenfluh,1 Nathan K. Eaton,2 and Ken Nelson3

1Kentucky Geological Survey
228 Mining and Mineral Resources Bldg.
University of Kentucky
Lexington, KY 40506
Telephone: (859) 257-5500
Fax: (859) 257-1147
e-mail: jerryw@kgs.mm.uky.edu

2Indiana Geological Survey
611 North Walnut Grove
Bloomington, IN 47405
Telephone: (812) 855-7636
Fax: (812) 855-2862
e-mail: neaton@indiana.edu

3Kansas Geological Survey
1930 Constant Ave.
Lawrence, KS 66047
Telephone: (785) 864-3965
Fax: (785) 864-5317
e-mail: nelson@kgs.ku.edu

INTRODUCTION

The MIDCARB Project

The state geological surveys of Illinois, Indiana, Kansas, Kentucky, and Ohio have formed a consortium to investigate the potential for sequestering carbon dioxide from significant emitters within their regions. The multi-year Midcontinent Interactive Digital Carbon Atlas and Relational DataBase (MIDCARB) project is funded by the U.S. Department of Energy's National Energy Technology Laboratory. The goals of MIDCARB are to (1) develop and organize scientific information related to CO2 sources (primarily electricity-generating facilities) and potential sequestration sites in the five-state area; (2) develop the information technology needed to access, query, analyze, display, and disseminate natural-resource data related to carbon management; and (3) make this information accessible to users of the World Wide Web. Each of the participating states has petroleum- and coal-fired electricity-generating facilities that produce CO2, as well as potential for sequestration in petroleum reservoirs, unmineable coal beds, and saline aquifers.

Project Strategy

Each state survey has well-established relational and spatial databases that can be used to evaluate potential sequestration sites. These databases differ in their data elements and design characteristics, however, and each is continuously updated by its state organization. It was clear from the outset of the MIDCARB project that compiling information from each state into a centralized database would present serious maintenance issues, especially because each database is frequently updated. The project strategy was, therefore, to use the existing distributed database structure for tabular data, to create a similar distributed system for spatial information, and to develop software tools to integrate and interact with the data in a Web environment. Figure 1 shows the spectrum of individual state geographic databases that are being used in the Arc Internet Map Server (ArcIMS) project.

Figure 1. Categories and types of geographic databases used for the MIDCARB ArcIMS project.

Tabular databases from the participating organizations currently use either Oracle or SQL Server relational database management systems (RDBMS). Environmental Systems Research Institute's (ESRI) Spatial Database Engine (ArcSDE) software was chosen so that the spatial data sets could also be managed in an RDBMS. ArcSDE software is installed on each state survey's RDBMS and acts as an interface between GIS software and the underlying relational database (e.g., Oracle). ArcSDE allows GIS data to be managed by a traditional enterprise-scale relational database and manages requests for information from a variety of ESRI applications, including ArcMap, ArcCatalog, and ArcIMS. Project staff elected to use this software solution because each of the organizations was using some or all of the ESRI products and because of ESRI's diverse off-the-shelf application support. ArcIMS was chosen as the platform for integrating all the distributed information in a single Web application. Figure 2 shows the logical architecture for the MIDCARB data integration system.

Figure 2. Architecture of the MIDCARB distributed database system. Gray objects show existing configuration that uses a Web browser to interact with the data through ArcIMS. White objects show alternative data pathways where ESRI clients (e.g., ArcView or ArcExplorer) could access spatial data directly from an SDE database.

ARCSDE DATABASE DESIGN

The Kentucky Spatial Database

The MIDCARB project coincided with efforts at the Kentucky Geological Survey (KYGS) to construct its own ArcSDE database. Several database design considerations evolved during this development process that relate to data storage and user access.

The KYGS maintains numerous large spatial data sets that cover a diverse spectrum of natural resource themes. These include general-purpose GIS data, such as base maps and geologic maps, as well as specialized themes such as the carbon-sequestration layers developed specifically for the MIDCARB project. The KYGS decided to maintain all its spatial data in a single ArcSDE database, using sub-tables for organizing the data thematically, rather than creating separate ArcSDE databases (Figure 3). This approach spares users from making the many database connections required by a multiple-database scenario. Data layers were prepared in a single coordinate system and datum (NAD83, decimal degrees) to simplify data integration. All data were added to ArcSDE as simple, unregistered feature classes--no complex geodatabase functionality, such as feature editing, custom object behaviors, or versioning, has yet been enabled.

Figure 3. Configuration of the Kentucky ArcSDE database, showing feature-class naming scheme. Thematic tables represent the sub-databases within ArcSDE. Theme names are formatted to easily identify feature classes and their characteristics within an application's database connection dialog (e.g., ArcView).

The KYGS point databases (oil, gas, coal, and water well/sample locations) that are maintained in a relational database as tabular datasets were spatially enabled by adding their location and basic descriptive information to ArcSDE feature classes. This greatly simplifies the task of using these data in mapping applications (e.g., eliminating the need to add data files to ArcView for event theme creation). Attribute data for well locations are still managed by the relational database system; however, updating location information is more problematic. At the present time, there is no convenient procedure to update ArcSDE locations that are maintained in a relational table. The best solution would probably be to manage location information in ArcSDE and create a method for updating the coordinate attributes stored in the relational database.

Using a single ArcSDE database presents logistical challenges for user access. The large number of ArcSDE themes listed in “Add Theme” dialogs (e.g., in ArcView) can confuse the user. One solution to this problem is to manage groups of feature classes with permissions. For example, all coal (and some related) themes could be made accessible only to a default “coal” user. Such users would not see other unrelated feature classes. KYGS implemented a table-naming scheme as an alternative solution to this problem. Thematic groupings are added to ArcSDE tables named with a four-character code (e.g., COAL, GEOL). Feature classes within the groups are named with a two-letter state prefix, followed by a three-character scale integer, and then a meaningful theme name (Figure 3). This format results in a connection list that is sorted first by theme type, and then by data scale and name, with each designation nearly in vertical alignment. Most relational databases limit such table names to approximately 32 characters.
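
The naming scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the underscore separator follows the dataset name shown in Example 2, the scale code is assumed to be the map scale in thousands padded to three digits, the 32-character limit is taken from the approximate figure quoted above, and the function name and the theme names are hypothetical.

```python
MAX_NAME_LENGTH = 32  # assumption: the approximate limit quoted in the text


def feature_class_name(state: str, scale: int, theme: str) -> str:
    """Build a feature-class name: state prefix, 3-character scale, theme name."""
    if len(state) != 2 or not state.isalpha():
        raise ValueError("state prefix must be exactly two letters")
    if not 0 <= scale <= 999:
        raise ValueError("scale code must fit in three characters")
    # Replace whitespace in the theme with underscores and upper-case it.
    theme_part = "_".join(theme.upper().split())
    name = f"{state.upper()}_{scale:03d}_{theme_part}"
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"name exceeds {MAX_NAME_LENGTH} characters: {name}")
    return name


# Hypothetical themes; sorting groups them by state, then scale, then name.
names = sorted([
    feature_class_name("KY", 500, "coal fields"),
    feature_class_name("KY", 24, "geologic contacts"),
    feature_class_name("KY", 24, "faults"),
])
print("\n".join(names))
```

Because the prefix and scale code have fixed widths, a plain alphabetical sort of the names yields the vertically aligned connection list described above.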

Another challenge for managing ArcSDE data relates to the potentially large size of statewide databases. For example, the KYGS 1:24,000-scale geologic map database, when complete, will comprise 707 detailed vector data sets that have been edge matched and joined into statewide feature themes. Although ArcSDE does support spatial queries using tiling methods, they will not be effective with such databases because many of the merged features can cover as much as 30 percent of the state. ArcSDE's tile methods return all features that touch the tiles within the current view extent. The KYGS staff decided to pre-intersect these complex feature classes with commonly queried geographic extents (i.e., county and quadrangle outlines). This not only facilitates faster queries, but simplifies the process of preparing finished map layouts by eliminating the need for clipping for common map extents.
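
The pre-intersection idea can be illustrated with a toy example. The sketch below is not how ArcSDE or any GIS overlay tool works internally: axis-aligned rectangles stand in for real polygon geometry, and all feature names, extent names, and coordinates are hypothetical. It shows only the principle that each large feature is cut once, ahead of time, into pieces that fit the commonly requested extents.

```python
from typing import Dict, NamedTuple, Optional


class Box(NamedTuple):
    """Axis-aligned rectangle used as a stand-in for a polygon."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Return the overlapping rectangle of a and b, or None if they do not overlap."""
    xmin, ymin = max(a.xmin, b.xmin), max(a.ymin, b.ymin)
    xmax, ymax = min(a.xmax, b.xmax), min(a.ymax, b.ymax)
    if xmin >= xmax or ymin >= ymax:
        return None
    return Box(xmin, ymin, xmax, ymax)


def pre_intersect(features: Dict[str, Box],
                  extents: Dict[str, Box]) -> Dict[str, Dict[str, Box]]:
    """For each named extent, store the clipped piece of every overlapping feature."""
    clipped: Dict[str, Dict[str, Box]] = {}
    for extent_name, extent in extents.items():
        pieces = {}
        for feature_name, shape in features.items():
            piece = intersect(shape, extent)
            if piece is not None:
                pieces[feature_name] = piece
        clipped[extent_name] = pieces
    return clipped


# Hypothetical data: one very large map unit and one small one.
features = {"unit_A": Box(0, 0, 60, 50), "unit_B": Box(70, 10, 80, 20)}
extents = {"county_1": Box(50, 40, 70, 60)}
print(pre_intersect(features, extents))
```

A later request for "county_1" can then be answered from the stored pieces without returning the whole of the large feature.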

Future ArcSDE Work

Constructing maps of custom areas using ArcSDE query methods is effective, but if many themes are involved, the query needs to be issued separately for each theme. This can be a tedious process. To simplify the ArcSDE query process, a tool could be constructed for each ESRI application (e.g., ArcMap, ArcView) that collects the query criteria from the user, then iterates through selected themes. Large, seamless databases, such as the Kentucky 7.5-minute geologic map formations, present challenges for feature symbolization because the number of distinct map units is very large. Moreover, each of the ESRI applications uses a different method for storing symbol styles. Such maps should be rendered in a standard way, irrespective of the application, and map legends should be constructed so that they contain only styles for features in the current selection. This calls for a database solution that stores symbol definitions in a generic format (e.g., RGB or CMYK), with functions that obtain the required symbols for a selected feature set and construct a custom legend.
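
A minimal sketch of the symbol-lookup idea follows, assuming symbols are stored as RGB triples keyed by map-unit code. The unit codes and colors are hypothetical placeholders, and a Python dictionary stands in for the database table that would hold the definitions.

```python
from typing import Dict, Iterable, List, Tuple

RGB = Tuple[int, int, int]

# Hypothetical symbol table; in practice this would be a database table.
SYMBOLS: Dict[str, RGB] = {
    "Qal": (255, 255, 190),
    "Mbf": (120, 160, 220),
    "Ob": (200, 120, 160),
}


def legend_for_selection(selected_units: Iterable[str],
                         symbols: Dict[str, RGB]) -> List[Tuple[str, str]]:
    """Return (unit, hex color) pairs for the distinct units in the selection only."""
    legend = []
    for unit in sorted(set(selected_units)):
        if unit not in symbols:
            raise KeyError(f"no symbol defined for map unit {unit!r}")
        red, green, blue = symbols[unit]
        legend.append((unit, f"#{red:02X}{green:02X}{blue:02X}"))
    return legend


# Units attached to the currently selected features (duplicates are expected).
print(legend_for_selection(["Qal", "Ob", "Qal"], SYMBOLS))
```

Each application would then translate the generic color values into its own style format, so the same legend content could be rendered regardless of the client.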

Finally, most of the current MIDCARB feature themes are relatively simple--point locations and simple geographic outlines. Serving complex and large spatial databases from distributed locations will require extensive testing for efficiency and development of methods for filtering the data that are returned to the user.

DATA INTEGRATION

The MIDCARB Web Site

The MIDCARB databases are integrated using a single ArcIMS service at the Kansas Geological Survey. This map service is accessed through the MIDCARB Web site. Site development is based on a standard HTML template customized with additional HTML and JavaScript code. Spatial data on the Web page are integrated in the ArcIMS AXL file. Connections to each remote ArcSDE database are made with a WORKSPACE reference (example 1 below), and attachment to a feature theme is specified with the DATASET reference (example 2 below).

Example 1. Example workspace reference in ArcIMS AXL file used to specify a connection to a remote data server.

<WORKSPACES>
<SDEWORKSPACE name="sde_ws-48" server="kgsdata" instance="port:5151" database="" user="jerryw_sde" encrypted="true" password="PKOTJKSWGTKNGMKR" geoindexdir="c:\tmp\" />
</WORKSPACES>

Example 2. Example dataset reference in ArcIMS AXL file used to specify a feature layer from a remote database connection.

<DATASET name="SEQUESTER.DBO.IB_500_SPRINGFIELD_COAL_OVR" type="polygon" workspace="sde_ws-48" />
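
Because each DATASET is tied to its connection only through the workspace name, a mistyped name leaves the layer without a data source. The following Python sketch checks that every DATASET refers to a defined SDEWORKSPACE. It is an illustration only: the enclosing MAP and LAYER elements are simplified, the server names, workspace names, and dataset names are hypothetical, and credentials are omitted.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical AXL fragment (not the MIDCARB configuration).
AXL_FRAGMENT = """
<MAP>
  <WORKSPACES>
    <SDEWORKSPACE name="sde_ws-ky" server="ky.example.org" instance="port:5151" database="" />
    <SDEWORKSPACE name="sde_ws-in" server="in.example.org" instance="port:5151" database="" />
  </WORKSPACES>
  <LAYER type="featureclass" name="Kentucky example layer" id="0">
    <DATASET name="EXAMPLE.DBO.KY_500_PLACEHOLDER" type="polygon" workspace="sde_ws-ky" />
  </LAYER>
  <LAYER type="featureclass" name="Indiana example layer" id="1">
    <DATASET name="EXAMPLE.DBO.IN_500_PLACEHOLDER" type="polygon" workspace="sde_ws-in" />
  </LAYER>
  <LAYER type="featureclass" name="Ohio example layer" id="2">
    <DATASET name="EXAMPLE.DBO.OH_500_PLACEHOLDER" type="polygon" workspace="sde_ws-oh" />
  </LAYER>
</MAP>
"""


def unresolved_datasets(axl_text: str) -> list:
    """Return names of DATASET elements whose workspace is not defined."""
    root = ET.fromstring(axl_text.strip())
    defined = {ws.get("name") for ws in root.iter("SDEWORKSPACE")}
    return [ds.get("name") for ds in root.iter("DATASET")
            if ds.get("workspace") not in defined]


# The Ohio layer points at a workspace that was never declared.
print(unresolved_datasets(AXL_FRAGMENT))
```

A check of this kind could be run before a map service is refreshed, since one file holds connections to several remote servers.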

The user cannot tell, in terms of either design or performance, that map layers are being loaded from more than one location. All data layers relevant to the five-state area are viewable from a single ArcIMS page. Figure 4 shows an example of a custom map view that integrates Illinois and Indiana oil and gas fields. Users control the themes to be displayed by activating layers in the ArcIMS Web page table of contents. The amount of data returned to the Web page from the various servers also can be limited by use of the zoom function, which uses ArcSDE's database tiling capabilities. All themes can be queried for attribute information using tools provided in the standard template. Because of the large number of themes provided in the MIDCARB map service, the standard table of contents required customization to clarify and simplify the user legend.

Figure 4. Example view of the MIDCARB ArcIMS site, showing oil and gas fields accessed from the Illinois and Indiana ArcSDE databases.

Table of Contents Customization

The first solution for legend simplification was to group themes by subject categories. This was accomplished by adding subject headings at the top of the table of contents with hyperlinks to appropriate parts of the legend (Figure 5A). Clicking on a subject hyperlinks to the list of themes related to that subject. To further simplify the user interface, the legend view (symbolization of feature types) was combined with the table of contents (list of features). These interface elements are typically shown separately in the standard ArcIMS template. Combining these functions saves screen space that can be used for the map view. Legends are displayed only for active themes selected by the user. The legends for this map service were preformatted as GIF images and are inserted dynamically when a theme is activated. This provides custom control over the appearance of the legend that is more readable than dynamic legends generated by the ArcIMS application (Figure 5B).

Figure 5. ArcIMS legend customization. A. Subject categories (Select Map Layers) provide hyperlinks to respective parts of the legend. Clicking on CO2 Sources results in scrolling to that part of legend. B. Active feature classes (checked items in legend) expand to custom explanations that are inserted as GIF images.

Tabular Data Integration

In addition to integrating spatial data, the MIDCARB site has linked tabular databases related to the map themes from each of the states' repositories. The standard ArcIMS template displays feature attributes (using the identify tool) in a horizontal frame at the base of the map. If the number of attributes is large, or field values are long, this frame must be scrolled to view the data. For the MIDCARB site, customized reports were prepared to summarize attribute information more clearly. Macromedia's ColdFusion software is being used to interpret browser requests for data, retrieve those data from the appropriate state databases, and return a formatted report to the user. The ColdFusion processor resides at the same location as the ArcIMS server; however, this is not required. Figure 6 shows the architecture of the data pathways used to prepare a ColdFusion report of Illinois database information for a hypothetical user request. This request is initiated by the ArcIMS hotlink tool when a user clicks on a feature. ColdFusion reports are returned to the browser in HTML format. Some data, like power plant emissions, are more clearly represented in graphical form. In these cases, the ColdFusion server passes the database information to a Java program for graph preparation. Both of these functions are processed in a matter of seconds.

Figure 6. Data pathways for a typical tabular data request. 1. A data request is sent from the Web browser to the ArcIMS server, which passes it to the ColdFusion server. 2. The ColdFusion server sends a formatted SQL request to the appropriate databases. 3. The RDBMS returns the requested data to ColdFusion. 4. The ColdFusion server prepares a formatted HTML report and returns it to the user's browser. 5. For some requests, data are transferred to a Java program that prepares a formatted graph and returns it to the browser.
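
Steps 2 through 4 of this pathway can be sketched in Python. This is a stand-in, not the ColdFusion code used on the MIDCARB site: an in-memory SQLite database takes the place of a state RDBMS, and the table, its columns, and the single record are made-up placeholders.

```python
import sqlite3
from html import escape


def make_demo_db() -> sqlite3.Connection:
    """Create a placeholder database standing in for a state RDBMS."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE power_plants ("
        "plant_id INTEGER PRIMARY KEY, name TEXT, state TEXT, co2_tons INTEGER)"
    )
    # Made-up record for illustration only.
    conn.execute("INSERT INTO power_plants VALUES (1, 'Example Plant', 'IL', 1000000)")
    conn.commit()
    return conn


def feature_report(conn: sqlite3.Connection, plant_id: int) -> str:
    """Query one feature's attributes and format them as an HTML table."""
    row = conn.execute(
        "SELECT name, state, co2_tons FROM power_plants WHERE plant_id = ?",
        (plant_id,),
    ).fetchone()
    if row is None:
        return "<p>No record found.</p>"
    labels = ("Name", "State", "CO2 (tons per year)")
    cells = "".join(
        f"<tr><th>{escape(label)}</th><td>{escape(str(value))}</td></tr>"
        for label, value in zip(labels, row)
    )
    return f"<table>{cells}</table>"


conn = make_demo_db()
print(feature_report(conn, 1))
```

The feature identifier is passed as a query parameter rather than pasted into the SQL text, which matters when the value originates from a browser request.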

Future ArcIMS Development

The distributed nature of the data in MIDCARB is largely transparent to the user. One exception is that related themes coming from different feature classes (on different servers) remain visibly separate, because each database theme has its own entry in the ArcIMS table of contents (e.g., each state's oil and gas wells are displayed as a separate legend item). Showing only one legend item for the collection of related feature themes from each state would cause less confusion for the user. One approach to this problem would use custom programming to consolidate the legend. An alternative would be to create an ArcSDE database view of the related themes so that ArcIMS would only need to connect to a single database. The latter method may also remedy a problem related to multiple ArcSDE connections: ArcIMS may malfunction if one or more attached ArcSDE services become unavailable. Consolidating remote ArcSDE databases into a single database view may prevent these malfunctions, but could also affect performance.
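
The database-view alternative can be illustrated with a small Python and SQLite sketch. It is an analogy only: the per-state tables here live in one in-memory database, whereas the real tables reside on separate servers and would need whatever remote-access mechanism the RDBMS provides. The table names, column names, and rows are hypothetical.

```python
import sqlite3
from typing import Dict


def create_union_view(conn: sqlite3.Connection, view_name: str,
                      state_tables: Dict[str, str]) -> None:
    """Create one view that stacks the per-state well tables and tags each row."""
    # Identifiers cannot be bound as SQL parameters, so validate them first.
    for identifier in [view_name, *state_tables.values()]:
        if not identifier.isidentifier():
            raise ValueError(f"unsafe identifier: {identifier!r}")
    selects = []
    for state, table in state_tables.items():
        if not (state.isalpha() and len(state) == 2):
            raise ValueError(f"bad state code: {state!r}")
        selects.append(f"SELECT '{state}' AS state, well_id, lon, lat FROM {table}")
    conn.execute(f"CREATE VIEW {view_name} AS " + " UNION ALL ".join(selects))


conn = sqlite3.connect(":memory:")
for table in ("il_wells", "in_wells"):
    conn.execute(f"CREATE TABLE {table} (well_id TEXT, lon REAL, lat REAL)")
# Placeholder rows, not real well locations.
conn.executemany("INSERT INTO il_wells VALUES (?, ?, ?)",
                 [("IL-1", -89.0, 39.0), ("IL-2", -88.5, 38.5)])
conn.execute("INSERT INTO in_wells VALUES ('IN-1', -86.5, 39.2)")

create_union_view(conn, "all_wells", {"IL": "il_wells", "IN": "in_wells"})
rows = conn.execute(
    "SELECT state, COUNT(*) FROM all_wells GROUP BY state ORDER BY state"
).fetchall()
print(rows)
```

A map service pointed at the single view would show one legend entry for wells, with the state column still available for queries and symbolization.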

The number of potential themes available to MIDCARB users is large and continues to grow. Much work needs to be done to simplify the user interface so that only desired themes are shown. This simplification could be accomplished with a query interface to collect information from users about what they want to see. The results of each query would be used to construct a customized view of the MIDCARB data.

The current implementation of the MIDCARB ArcIMS service uses version 3.1 software. New capabilities for accessing metadata from an ArcSDE database will greatly enhance the functionality of the service when version 4.0 is implemented.

CONCLUSIONS

The integration of spatial and tabular information in a Web environment has clear advantages for organizations that are collaborating with other institutions in research and public service programs. The principal benefits are that each agency can continue to maintain its own data and that the Web service always provides up-to-date information. ArcSDE and ArcIMS appear to be efficient environments for achieving this goal. The MIDCARB project has also demonstrated interesting opportunities for institutional data sharing using direct connections among ArcSDE databases with other ESRI applications.

The technological challenges for implementing a distributed data site are considerable, but the greatest challenge is designing an interface that clearly communicates the function of the site and how to use it.


U.S. Department of the Interior, U.S. Geological Survey
URL: http://pubs.usgs.gov/of/2002/of02-370/weisenfluh.html
Maintained by David R. Soller
Last modified: 03:33:43 Fri 11 Jan 2013