Kansas Geological Survey
Campus West, University of Kansas
1930 Constant Avenue
Lawrence, KS 66047
Telephone: (785) 864-3965
Fax: (785) 864-5317
OBJECTIVES AND MOTIVATION
The primary goal of the nomenclature database project is the organization and enhancement of information from the Kansas Lexicon into a form that permits easy access to descriptive information on rock units based on a wide and flexible range of user needs and selection criteria. One geologist may be interested in the current nomenclature of the formations within the Admire Group (Upper Pennsylvanian, Virgilian Series), along with the locations, descriptions, and available images of their type sections. Another geologist's interest may focus on the accepted nomenclature for the Admire Group prior to 1938 (then considered Permian) and its relationship to the previously defined "Admire shales." Someone studying the history of geologic research in Kansas might want a list of all rock units first described in publications having R. C. Moore (a former state geologist) as principal author.
Numerous other goals are tied to flexible access to rock unit information. It is clearly desirable to link descriptive text files for a specific rock unit to corresponding digital map objects, photographs, or document images. On-line edit capabilities permit easy correction of errors found within the nomenclature database, with immediate display of corrections to users. A capability for publication on-demand directly from the database permits prompt, cost-effective publication of enhanced or specialized lexicons as information within the database is improved.
Geologic mapping, broadly defined, is the fundamental data collection and information management activity of the geologist. With the rapid pace of digital geologic map data development in Kansas, the nomenclature database is needed for direct support of mapping and related publication activities. The importance of the nomenclature database for implementation of the NGMDB Project's proposed geologic map data model standards became apparent in the early stages of the project, with recognition that the proposed standards had much to offer toward design of the nomenclature database. Both efforts are viewed as a step toward development of a general model for all geologic data, as suggested by Richard (1998).
Many of these objectives arose in direct response to shortcomings of the Lexicon, which was developed in a word processing environment. The Lexicon's digital text files lack an organized database structure and are not publicly accessible. By its nature, a lexicon of stratigraphic nomenclature is characterized by repetitive use of a limited set of information types. Contributions to the Lexicon from numerous stratigraphers, combined with the large number of included names, led to inconsistent style, format, and information content for named units. Use of abstracts from earlier lexicons perpetuated previously published errors. This practice also resulted in frequent occurrences of citations with embedded references to sources described by author, date, and page with no further identifying information to be found anywhere in the Lexicon. In its printed form, the Lexicon provides no visualization of type sections or the geographic extent of named units. Direct support of digital mapping activities is not practical with the structure of the Lexicon's text files. In an effort to conserve space and limit the publication to a manageable size, a considerable amount of useful information was excluded from the Lexicon.
As development proceeded, it became apparent that similar data structures were appropriate for management of digital geologic map data and for management of historical information about the names of geologic rock units. Information on digital geologic map data models, available through the NGMDB Project web site (http://ncgmp.usgs.gov/ngmdbproject/), simplified the development process. That information provided a clear starting point, identifying critical tables, data fields, and relations.
An iterative process was implemented to achieve a balance between the additional effort of working around more problems in the full text files and the additional gains from further identification of useful text segments prior to parsing and loading into the relational database. Once that balance was reached and parsing routines were thoroughly tested, the information from the enhanced and reformatted Lexicon files was parsed and loaded into the relational database management system. A large portion of the information went into three tables of the new NDB: the SOURCES, CITATIONS, and ROCK_UNITS tables (Figure 1). There are one-to-many links from SOURCES to CITATIONS (each source may have citations relating to many different rock units) and many-to-one links from CITATIONS to ROCK_UNITS (where citations from many sources may define a particular rock unit). Further parsing, as needed, will be done within the relational database management system.
Figure 1. Primary tables in the NDB, reflecting the interface of information sources and defined rock units with specific citations.
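The three-table arrangement described above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The column names, the sample rows, and the helper functions build_primary_tables and citations_for_unit are hypothetical and greatly simplified; the actual NDB tables carry many more fields and are not reproduced here.

```python
import sqlite3

# Hypothetical, simplified columns; not the actual NDB field list.
SCHEMA = """
CREATE TABLE sources (
    source_id INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    year      INTEGER
);
CREATE TABLE rock_units (
    unit_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
-- Each citation belongs to exactly one source and one rock unit.
CREATE TABLE citations (
    citation_id INTEGER PRIMARY KEY,
    source_id   INTEGER NOT NULL REFERENCES sources(source_id),
    unit_id     INTEGER NOT NULL REFERENCES rock_units(unit_id),
    location    TEXT,  -- where in the source the information appears
    comment     TEXT
);
"""


def build_primary_tables():
    """Create an in-memory database holding placeholder records."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO sources VALUES (?, ?, ?)",
        [(1, "Hypothetical report A", 1936),
         (2, "Hypothetical report B", 1951)],
    )
    conn.executemany(
        "INSERT INTO rock_units VALUES (?, ?)",
        [(1, "Unit X"), (2, "Unit Y")],
    )
    conn.executemany(
        "INSERT INTO citations (source_id, unit_id, location, comment) "
        "VALUES (?, ?, ?, ?)",
        [(1, 1, "p. 10", "first description"),
         (1, 2, "p. 12", "first description"),
         (2, 1, "p. 40", "revised description")],
    )
    return conn


def citations_for_unit(conn, unit_name):
    """Return (title, year, location, comment) rows for one unit, oldest first."""
    return conn.execute(
        """
        SELECT s.title, s.year, c.location, c.comment
        FROM citations c
        JOIN sources s ON s.source_id = c.source_id
        JOIN rock_units r ON r.unit_id = c.unit_id
        WHERE r.name = ?
        ORDER BY s.year
        """,
        (unit_name,),
    ).fetchall()
```

In this sketch one source contributes citations to two units, and one unit is cited by two sources, so the CITATIONS table is what ties a source to a rock unit.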
The SOURCES table contains a separate record for each unique information source. There are 914 sources currently identified in the NDB. Records contain basic bibliographic information, source format (book, journal, note, map, etc.), and (where appropriate for specific geologic reports) information on the geographic extent of the study. A recursive source relationship ("contained in") is built into the SOURCES table. One source may contain many other sources. For example, an issue of a journal may contain many articles. Currently, 186 sources are identified as "contained in" 97 of the other sources.
Records within the ROCK_UNITS table provide the basic identifying information for each recognized geologic rock unit name. This includes name, name origin, lithostratigraphic or chronostratigraphic rank, the names of each unit of higher rank containing the original unit, text statements of geographic extent, and pointers to map objects for visualization of geographic extent. There are 1820 unit names in the database, including about 250 chronostratigraphic names and 1570 lithostratigraphic names. Fewer than 500 of the lithostratigraphic names are currently accepted as formal names in Kansas. A recursive unit relationship ("current_usage") is built into the ROCK_UNITS table to link abandoned unit names to the currently accepted nomenclature for the corresponding unit.
Each record of the CITATIONS table, linking SOURCES and ROCK_UNITS, contains descriptions or comments regarding a specific rock unit, obtained from a specific source, with reference to the specific location of the information within that source. There are 5179 separate citations, an average of 2.8 citations per named rock unit.
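A recursive relationship such as "current_usage" amounts to a column that points back into the same table. The sketch below, again using Python's sqlite3 module, is one way such a link might be followed from an abandoned name to an accepted one. The column layout, the placeholder unit names, and the functions build_units and resolve_current_name are hypothetical, and the assumption that links may chain through more than one abandoned name is ours, not a statement about the NDB.

```python
import sqlite3


def build_units():
    """Create a rock_units table with a self-referencing current_usage column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE rock_units (
            unit_id       INTEGER PRIMARY KEY,
            name          TEXT NOT NULL,
            current_usage INTEGER REFERENCES rock_units(unit_id)
        )
        """
    )
    # Placeholder names only; NULL current_usage marks an accepted name.
    conn.executemany(
        "INSERT INTO rock_units VALUES (?, ?, ?)",
        [(1, "Example Formation", None),
         (2, "Example beds", 1),
         (3, "Example shales", 2)],
    )
    return conn


def resolve_current_name(conn, name):
    """Follow current_usage links until a unit with no further link is found."""
    query = "SELECT unit_id, name, current_usage FROM rock_units WHERE {} = ?"
    row = conn.execute(query.format("name"), (name,)).fetchone()
    if row is None:
        return None
    seen = set()  # guards against accidental cycles in the links
    while row[2] is not None and row[0] not in seen:
        seen.add(row[0])
        next_row = conn.execute(query.format("unit_id"), (row[2],)).fetchone()
        if next_row is None:  # dangling link: stop at the last known unit
            break
        row = next_row
    return row[1]
```

The "contained in" relationship in the SOURCES table could be handled the same way, with a column that points from an article to the journal issue that contains it.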
Figure 2. Generalized structure of Version 4.3 of the digital geologic map data model (Johnson, Brodaric, Raines, Hastings, and Wahl, 1998).
Figure 3. Metadata tables in the NDB, describing the origins and nature of available information.
A separate AUTHORS table has been added to facilitate access to work by particular authors in the NDB. An intersection table (X_AUTHORS_SOURCES) links authors to each of their publications, identifies their sequence in a list of contributing authors, and links the author to their employing organization for that publication. Sources are linked to publishing and funding organizations through a separate intersection table (X_SOURCE_SUPPORT). This permits many-to-many relationships between funding agencies and information sources, and between publishers and information sources, to be handled as compound one-to-many relationships. The SOURCES_RELATIONSHIPS table links sources within the SOURCES table through relationships such as "complies with [the specified standard]" or "digitized from [the specified source]" as defined in a data dictionary. The PROJECTIONS table provides the additional information unique to information sources with map formats.
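As an illustration of how an intersection table supports a request such as "all sources with a given principal author," the following sketch uses Python's sqlite3 module. The column names (including author_sequence), the placeholder authors and reports, and the functions build_author_tables and principal_author_sources are hypothetical; the link to an employing organization described above is omitted for brevity.

```python
import sqlite3


def build_author_tables():
    """Create authors, sources, and a linking table with placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE authors (
            author_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL
        );
        CREATE TABLE sources (
            source_id INTEGER PRIMARY KEY,
            title     TEXT NOT NULL
        );
        -- One row per (author, source) pair; the many-to-many relationship
        -- becomes two one-to-many relationships through this table.
        CREATE TABLE x_authors_sources (
            author_id       INTEGER NOT NULL REFERENCES authors(author_id),
            source_id       INTEGER NOT NULL REFERENCES sources(source_id),
            author_sequence INTEGER NOT NULL,  -- 1 = principal author
            PRIMARY KEY (author_id, source_id)
        );
        """
    )
    conn.executemany("INSERT INTO authors VALUES (?, ?)",
                     [(1, "Author A"), (2, "Author B")])
    conn.executemany("INSERT INTO sources VALUES (?, ?)",
                     [(1, "Report 1"), (2, "Report 2")])
    conn.executemany(
        "INSERT INTO x_authors_sources VALUES (?, ?, ?)",
        [(1, 1, 1),   # Author A is principal author of Report 1
         (2, 1, 2),   # Author B is second author of Report 1
         (2, 2, 1),   # Author B is principal author of Report 2
         (1, 2, 2)],  # Author A is second author of Report 2
    )
    return conn


def principal_author_sources(conn, author_name):
    """Return titles of sources on which the author is listed first."""
    rows = conn.execute(
        """
        SELECT s.title
        FROM x_authors_sources x
        JOIN authors a ON a.author_id = x.author_id
        JOIN sources s ON s.source_id = x.source_id
        WHERE a.name = ? AND x.author_sequence = 1
        ORDER BY s.title
        """,
        (author_name,),
    ).fetchall()
    return [title for (title,) in rows]
```

Storing the author's position in the linking table, rather than in AUTHORS or SOURCES, lets the same person be principal author of one publication and a contributing author of another.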
Figure 4. Rock unit tables in the NDB, providing descriptions and relationships of rock units.
STATUS
The design of the Kansas Geologic Names Database is consistent with the corresponding elements of the proposed geologic map data model standards, and represents a major step toward full implementation of those standards. Functions defined in tables in the LEGEND portion of Version 4.3 of the MDM (see Figure 2) control production of visualizations of geologic map data. Similar functions are defined for the NDB using commercial report writer software (available either as components of the relational database management system, or as separate systems) to control report generation for a complete and current lexicon of geologic names in Kansas or selected subsets. For example, a lexicon could be extracted from the database for geologic names used by Moore, Jewett, and O'Connor (1951) in their geologic map of Chase County, Kansas.
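Extraction of a subset lexicon can be pictured as a two-step selection: find the units cited by the source of interest, then list every citation for those units. The sketch below does this with Python's sqlite3 module in place of the commercial report writer software mentioned above. The table columns, placeholder records, and the functions build_report_tables and subset_lexicon are hypothetical and stand in for whatever the actual report definitions contain.

```python
import sqlite3


def build_report_tables():
    """Create simplified tables with placeholder sources, units, and citations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sources (source_id INTEGER PRIMARY KEY, label TEXT NOT NULL);
        CREATE TABLE rock_units (unit_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE citations (
            source_id INTEGER NOT NULL REFERENCES sources(source_id),
            unit_id   INTEGER NOT NULL REFERENCES rock_units(unit_id),
            comment   TEXT
        );
        INSERT INTO sources VALUES (1, 'Map source'), (2, 'Other source');
        INSERT INTO rock_units VALUES (1, 'Unit A'), (2, 'Unit B'), (3, 'Unit C');
        INSERT INTO citations VALUES
            (1, 1, 'mapped'),
            (1, 2, 'mapped'),
            (2, 1, 'original description'),
            (2, 3, 'original description');
        """
    )
    return conn


def subset_lexicon(conn, source_label):
    """Format all citations for the units that the given source cites."""
    rows = conn.execute(
        """
        SELECT r.name, s.label, c.comment
        FROM citations c
        JOIN rock_units r ON r.unit_id = c.unit_id
        JOIN sources s ON s.source_id = c.source_id
        WHERE c.unit_id IN (
            SELECT c2.unit_id
            FROM citations c2
            JOIN sources s2 ON s2.source_id = c2.source_id
            WHERE s2.label = ?
        )
        ORDER BY r.name, s.label
        """,
        (source_label,),
    ).fetchall()
    lines = []
    current_unit = None
    for name, label, comment in rows:
        if name != current_unit:  # start a new lexicon entry
            lines.append(name)
            current_unit = name
        lines.append(f"    {label}: {comment}")
    return "\n".join(lines)
```

Calling subset_lexicon(build_report_tables(), "Map source") yields entries only for the units that source cites, each followed by its citations from all sources; units cited only elsewhere are left out.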
Universal web access is now under development. A revised lexicon of geologic names in Kansas will be published on-line at the Kansas Geological Survey's web site http://www.kgs.ukans.edu/. The on-line publication will accompany a web conference site of the Kansas Nomenclature Committee for discussion of nomenclature issues, contributions of new information, and reporting of errors within the database. This will be similar to the web conference site used for discussion of the geologic map data model standards by the AASG/USGS Geologic Map Data Model Working Group at http://geology.usgs.gov/dm/.
Merging the NDB with the MDM will result in a geologic data model with sources (for particular nomenclature citations) as attributes of spatial objects used to represent specific rock units within a particular visualization of regional geology. The concept of a geologic names database can be broadened to include the historical development of accepted names for specific occurrences of structures or other geologic features in addition to rock units.
(2) Consistent formats throughout large printed volumes (almost impossible to achieve as a manual process and still not easily obtained using word processing software) become feasible in a relational database environment.
(3) Books, including lexicons, are just like published geologic maps -- you always discover important omissions and uncorrected errors after they go to press. The larger the press run, it seems, the more numerous and significant the errors.
(4) On-demand publication and distribution from relational databases provide a more efficient and cost-effective method for geological surveys to maintain formal lexicons and provide access to information on geologic nomenclature.
(5) Universal on-line access provides strong incentives for geologists to participate in the contribution of new information or identification of errors within the database by limiting their involvement to productive activities and providing rapid incorporation of contributions into the public domain.
Johnson, B. R., B. Brodaric, and G. L. Raines, 1998, Digital geologic map data model; version 4.2: AASG/USGS Geologic Map Data Model Working Group, http://ncgmp.usgs.gov/ngmdbproject/standards/datamodel/model42.pdf.
Johnson, B. R., B. Brodaric, G. L. Raines, J. T. Hastings, and R. Wahl, 1998, Digital geologic map data model, Addendum to Chapter 2; version 4.3: AASG/USGS Geologic Map Data Model Working Group, http://geology.usgs.gov/dm/model/Chapter2add.pdf.
Moore, R. C., J. M. Jewett, and H. G. O'Connor, 1951, Areal Geology of Chase County, Kansas; in, Geology, Mineral Resources, and Groundwater Resources of Chase County, Kansas: Kansas Geological Survey, Volume 11.
Richard, S. M., 1998, Digital Geologic Database Model: Arizona Geological Survey, http://www.azgs.state.az.us/GeoData_model.pdf.
This site is http://pubs.usgs.gov/openfile/of99-386/collins.html