U.S. Geological Survey
906 National Center
Reston, VA 20192
Telephone: (703) 648-6533
Fax: (703) 648-6560
e-mail: pschweitzer@usgs.gov
Throughout the 1980s and early 1990s, the improving capability of desktop computers to carry out complex analyses increased the popularity of geographic information systems (GIS). As they have become familiar with GIS technology, people at all levels of government, in industry, and in academia have been calling for better access to publicly available geospatial information and for more general use of standard terms of reference and of standard formats for the exchange of geospatial data and information. Answering this need is the goal of the National Spatial Data Infrastructure (NSDI), a government-wide coordination effort initiated at the Federal level through Executive Order 12906, which was signed by President Clinton in April of 1994.
A key component of NSDI is the development of a National Geospatial Data Clearinghouse, a general source of information about geospatial data that are available to the public. With the Clearinghouse a user can determine whether geospatial data on a region of interest exist and are appropriate for solving the problem at hand. The Clearinghouse is a distributed network of internet sites that provide metadata (information about geospatial data) to users in a consistent way. Its success depends on the overall consistency of the metadata that are made available, because users are expected to evaluate metadata from numerous sources in order to determine which data meet their needs.
To promote consistency in metadata, the Federal Geographic Data Committee (FGDC), an interagency council charged with coordinating the Federal implementation of NSDI, has produced the Content Standards for Digital Geospatial Metadata (CSDGM). That document provides standard terms describing elements common to most geospatial data, and encourages people who document geospatial data sets to use these terms. The CSDGM not only describes the terms of reference but also specifies the relationships among those terms. The relationships, many of which are hierarchical, are complex, and a formal syntax is provided to specify them.
Because the syntax of the standard is complex and the number of descriptive elements is fairly large (335), creating metadata that conform to the standard is not an easy task. In addition to the problem of assembling the information needed to properly describe the subject data sets, data producers must arrange that information using the terms given in the standard and arrange the terms using the syntactical rules given in the standard. The resulting metadata are formally structured and use standard terms of reference, hence the term "formal metadata" in the title of this report.
The chief advantages of formal metadata are (1) the ability of computer software to process the information meaningfully and (2) the ability of users to locate and recognize within a record the topical components of the information. For these purposes it is important to be able to say with confidence that metadata conform to the structure of the standard. Human review is still required--no software can determine whether metadata are accurate--but human review of the content is easier to do if the syntactical structure is predictable and in accord with the standard.
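As a rough illustration of the first advantage, the sketch below checks a record's structure against a small hierarchy of required elements. It is a minimal example under stated assumptions, not the CSDGM's formal syntax or any existing tool: the element names are written in the style of the standard but form only an illustrative subset, and the nested-dictionary record format and the function name missing_elements are hypothetical.

```python
# Illustrative subset of a hierarchical metadata structure. The element
# names are in the style of the CSDGM, but this is NOT the full standard.
REQUIRED = {
    "Identification_Information": {
        "Citation": {},
        "Description": {"Abstract": {}, "Purpose": {}},
    },
    "Metadata_Reference_Information": {"Metadata_Date": {}},
}


def missing_elements(record, required=REQUIRED, path=""):
    """Return the paths of required elements absent from a nested-dict record.

    Only structure is checked. Whether the content is accurate still
    requires human review.
    """
    missing = []
    for name, children in required.items():
        here = f"{path}/{name}" if path else name
        value = record.get(name) if isinstance(record, dict) else None
        if value is None:
            # Report the missing parent only, not each of its children.
            missing.append(here)
        elif children:
            missing.extend(missing_elements(value, children, here))
    return missing


# Hypothetical record with two structural gaps.
record = {
    "Identification_Information": {
        "Citation": "Example geologic map",
        "Description": {"Abstract": "Bedrock geology of an example area."},
    }
}

for element_path in missing_elements(record):
    print("missing:", element_path)
```

A reviewer could run such a check first and then concentrate on whether the answers themselves are correct.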
Our Nation's digital geologic map data form a fundamental part of its geoscience data infrastructure; making these data more widely known and used is clearly a worthwhile goal for both the national and state geological surveys. Recognizing the importance of consistent metadata for digital geologic map data, the National Geologic Map Database (NGMDB), a joint project of the USGS and the Association of American State Geologists (AASG), formed a Metadata working group to study the implementation of metadata for digital geologic maps. Members of the working group are Peter Schweitzer (USGS, chair), Dan Nelson (Illinois), Greg Herman (New Jersey), Kate Barrett (Wisconsin), and Ron Wahl (USGS). The working group was asked to: (1) look at the Content Standards for Digital Geospatial Metadata for adequacy; (2) examine implementing metadata in a standard format for geologic maps; (3) establish guidelines as to what the metadata elements mean to a geologist; (4) determine a process for facilitating input from state geological surveys not represented at this (1995) meeting; and (5) format a specific set of fields that must be filled out for the NGMDB map catalog.
The working group's report is online at http://ncgmp.usgs.gov/ngmdbproject/standards/metadata/metaWG.html. Briefly, the working group found: (1) The CSDGM works with a highly diverse range of thematic data; geologic maps fit naturally into this range. Additional metadata elements may be helpful, especially for geologic ages. (2) Technology, training, and work-flow strategies have been developed through discussions within the larger geospatial data community; these apply as well to geologic maps as to any other form of geospatial data. (3) The meaning of metadata to a geologist rarely differs from its meaning to anyone else. Terminology used in the standard is not in every case the same as research geologists use, but the concepts apply directly. (4) The geologic mapping community is invited and encouraged to collaborate, communicate, and participate with other geospatial data producers in the NSDI. This Nation needs the wisdom of the geologic mapping community at least as much as it needs that of other scientific and technical disciplines. (5) The catalog schema, as already defined by the NGMDB, is acceptable. Records of the National Geologic Map Catalog are a brief subset of metadata because the emphasis of the Catalog is on all published maps, most of which are printed and not available in digital form. For digital map products, metadata must be more detailed because these products are more readily used in digital spatial analysis.
The National Geospatial Data Clearinghouse has come a long way since its inception in January of 1995. At that time the Clearinghouse consisted of a disparate set of web sites, not searchable by a single protocol or through a single gateway, providing metadata that varied substantially in structure, format, quality, and appearance. Since then, the community, aided in no small way by the financial support and coordination of the FGDC, has developed software tools, training materials, and a more comprehensive understanding of the work-flow issues involved. As a result, the Clearinghouse is now a centrally searchable source of mostly high-quality metadata consistent in structure and format. Much work remains to be done to enhance the usability of the Clearinghouse, but it is already evident that investments made in creating formal metadata are beneficial and will retain their value well into the future.
Organizations contemplating the task of producing metadata should be aware that many of the questions they ponder have been considered by other organizations, both similar to and different from them. It is not a painless process by anyone's measure, but much information and informed opinion is available on the internet. From a business perspective, it makes sense to devote time and energy where the value gained is greatest. The value of metadata depends on: (1) the value of the data to the producers (cost to make and support the product, as well as the benefits, if any, gained by other organizations' use of it); (2) the transience of the workforce, meaning the potential cost to the producing organization if the people who understand how and why the data were produced leave; (3) the goals of the organization overall and its purpose in making the data available; and (4) the quality of the metadata themselves.
One of the difficulties that hinder implementation of metadata among those who are new to the process is the technical jargon within the CSDGM. The jargon tends to focus the attention of both metadata producers and reviewers on details. Details matter, of course, but it is crucial to ensure that the metadata answer satisfactorily and clearly the broadest questions about a data set that one might have.
With this perspective in mind, I have rephrased most of the CSDGM as a series of plain-language questions arranged in a hierarchy. My intent is to provide managers, novice metadata producers, and metadata reviewers with a general framework within which they can judge fairly the information requested or provided by a metadata record. The hierarchy extends, at the finest level of detail, to the element names and structure by which the answers are encoded in a record. That level of detail is not presented here; it is best provided in a hypertext medium. The presentation here is an attempt to specify the information contained in a metadata record in a manner independent of the precise form in which the information will be stored. In the hypertext version these questions lead to specific instructions for encoding the answers in a metadata record. The hypertext version is online at http://geology.usgs.gov/tools/metadata/.
For further information, please consult the information resources available at the web site of the National Geologic Map Database Project, http://ncgmp.usgs.gov/ngmdbproject/.
U.S. Department of the Interior, U.S. Geological Survey
<https://pubs.usgs.gov/openfile/of98-487/schweitz.html>
Maintained by Dave Soller
Last updated 10.06.98