USGS OFR 98-487 -- Haugerud

Digital Mapping Techniques '98 -- Workshop Proceedings
U.S. Geological Survey Open-File Report 98-487

Geologic Maps, Spatial Databases, and Standards

By Ralph Haugerud

U.S. Geological Survey
Box 351310, Seattle, WA 98195
Telephone: (206) 553-5542
Fax: (206) 553-8350
e-mail: rhaugerud@usgs.gov

"May you live in exciting times."
Ancient Chinese blessing and curse

Makers and users of geologic maps live in exciting times. The cartographic world is undergoing a revolution perhaps comparable only to the invention of maps themselves. No longer tied to paper or mylar, spatial data are now most usefully stored in digital form and manipulated by computer and plotter, not cartographer and pen. Geologic maps are not immune to this change. There are many efforts to bring the world of digital cartography, GIS, and spatial databases to the map-making geologist, and vice versa. But how should we do this--what form should digital geologic spatial data take?

Maps and the discipline of mapping, developed by two centuries of practice with paper and sharp pencil, are integral to geology as we know it. As our spatial data become digital, we are obviously concerned that we do not lose the benefits that paper and pencil bring us. Most discussion to date on digitization has been over how to put paper maps into the computer. But we should think further: Digital spatial databases aren't just a faster and cheaper way to produce and deliver geologic maps. As our spatial data become digital, we will change and improve the way we do geology.

In this essay I discuss what a geologic map is, how geologic maps differ from other maps, and what a geologic spatial database is and how it differs from a map. My hope is to encourage clarity on these topics as we discuss standards for digital geologic spatial data. I then propose several years of explicit experimentation before national standards for digital geologic spatial data are adopted. As a contribution towards this discussion and experimentation, I conclude with some suggestions towards better databases.

My qualifications are limited. I have no formal training in cartography, GIS theory, or database design. I am a geologist with nearly two decades of experience mapping in the humbling terrane of northwest Washington (e.g. Tabor and others, 1994; Haugerud and others, 1994). I have the good fortune to have worked closely for much of this time with an excellent mapper and to have been exposed to many of the other wise men and women within the USGS. I have used Arc/Info for several years. A recent project in regional map compilation (Haugerud, 1997) has turned me into a consumer of digital geologic spatial data. And I have been a peripheral participant in two of the ongoing standards-generation efforts (Raines and others, 1997; Fitzgibbon and Wentworth, 1991).

WHAT IS A GEOLOGIC MAP?

"There isn't a unique, correct, geologic map."
E.H. Brown to field camp students, 1974

"Plot a symbol, some symbol, to show each outcrop that you visit."
Eric Cheney to field camp students, circa 1985

"A geologic map is an expression of an hypothesis.
(Geologists who make blob maps have blobby hypotheses.)"
E-an Zen, 1986

"Plot as you plod, or you won't know where to go next."
Rowland Tabor (paraphrased), many times

A geologic map is (1) a record of observations located in space, (2) an expression of an hypothesis about Earth history, and (3) a tool for analysing Earth history. It is NOT a simple, or even an abstract, description of some part of the Earth.

The Map as Record of Observations

First of all, our maps are records of our observations. A bedding attitude here, a sample locality there. But many, even most, of the observations shown on a geologic map are not simple observed facts. How many times have you explained to an assistant why your notes don't mention the red color, or the extensive fracturing, or the (fill in the blank) of some outcrop? We do not simply record what can be seen. Our field "data" are already filtered through a large body of geologic theory. Sometimes our observations are no closer to being facts than the hypotheses we attempt to test with these observations. This is the major reason that advances in structural, stratigraphic, or petrologic theory commonly necessitate remapping of regions. The point is driven home when you discover that you and your colleague, with over five decades of field experience between you, cannot agree on the orientation of bedding at the last outcrop.

The Map as Expression of an Hypothesis

Most geologic mappers see only a fraction of a map area, but possess powerful theoretical tools (Steno's Laws, expectations of facies patterns, and so on) that, with an Earth history inferred from these limited observations, allow extrapolation of geologic relations to the rest of the map sheet, even the areas that are covered. A geologic map is commonly more prediction than depiction. The map is an illustration of an inferred Earth history.

The amalgam of observation and inference on a map can be uneasy, as is discovered by the geologist who gets a helicopter ride to an inaccessible ridge only to discover that the foliation attitude plotted by a revered predecessor was wishful thinking. Unfortunately, paper usually does not allow the density and detail of symbolization that would be needed to fully distinguish between what we saw, what we think we saw, and what we think we would have seen if we could have seen.

The Map as Analytic Device

A geologic map is also an analytic device. This is true after it is made -- how many classes have we collectively taken, or taught, in which we were asked to unravel some aspect of Earth history from a map? But the making of a map is also an analytical tool. The mapper knows well the ritual: daily plotting of field observations on the office map; inking that which is well-known; erasing; coloring -- sometimes tentatively, sometimes with conviction -- of areas that have been mapped; erasing; poring over the uncompleted map, field sheets, and air photos; and long hours of discussion if one is fortunate enough to work with a colleague, always asking "What is the stratigraphy? What is the structure? Where do I traverse tomorrow to fill in the map, maximize outcrop, minimize travel costs, and best resolve the major unknowns in the geologic story?" Putting one's observations on mylar at 1:24,000 or 1:100,000 scale turns incomprehensible mountainsides into simple geologic relations that can be held in the hand -- doing so is a tool that allows a flea to see elephants. Putting a station in every square inch of the map enforces a completeness to one's analysis. And mapping all the unconsolidated deposits keeps one honest about what bedrock history can be known and what cannot. (And vice versa.)

HOW DO GEOLOGIC MAPS DIFFER FROM OTHER MAPS?

Geologic maps differ from many other maps in important ways: (1) Our maps are commonly of entities that are observed with difficulty. Identification of a municipal boundary, or an interstate highway, is easier than the identification of many (most?) geologic map units. Uncertainties about classification of the mapped object are greater than with other types of maps. Misclassification errors are correspondingly more common. (2) Our maps are more sensitive to scale than other types of maps. The nature of a contact on a political map does not change with map scale: it is commonly defined, and observed, with much greater precision than it is plotted at a wide range of scales. Geologic objects are commonly defined and observed with a resolution that is near the intrinsic resolution of the map -- indeed, we choose our map scale or our observation method so that this is true, and we then use symbols to denote whether an object is located as well as or more poorly than can be depicted at that scale (e.g. continuous or dashed contacts). (3) Our maps are more complex. Whereas some cartographic dogma proclaims 'one theme, one map', geologic maps commonly display multiple themes (e.g. lithostratigraphy and topography) so that the user can observe the interplay between themes and use this interplay to make useful inferences. (4) Many geologic maps have text and correlation charts that describe rich and complex relations between the various units shown on the map. (5) A well-made geologic map has benefited from a great deal of thought about its symbolization. Colors, patterns, line-weights, symbol sizes, unit tags, and type fonts are all carefully chosen to reinforce explicitly recognized relations and to hint at others. (6) Because geologic maps are largely inference, their authorship is important.

WHAT IS A GEOLOGIC SPATIAL DATABASE AND HOW DOES IT DIFFER FROM A MAP?

Most of us understand, at some level, that a geologic spatial database is a set of spatial geologic data in the computer that can be queried, whereas a map is a graphic on a piece of paper or a computer screen. Nonetheless, we often use the terms interchangeably, presuming that a digital map is a database. But there are essential differences between databases and maps.

Databases lack cartography. The meanings (attributes) of objects in a database can, and should, be expressed without symbolization. The lack of symbolization, and the corresponding ability to prescribe different symbolization of the data for different purposes, give digital geology much of its power. The price we pay for this power is the need to explicitly describe some of the meanings that are implicit in much present map symbolization. What is most important? What least? What map units are like, or unlike, other map units?

A database has no intrinsic scale. A paper map does. One consequence is that the richness of attribute information in a database is effectively unlimited, whereas maps are limited by printing technology and the resolving power of the human eye. If a database is to have a scale, or resolution, it must be specified explicitly. (Raster databases are exceptions: in them, cell size defines resolution.)

Databases commonly lack base maps. Reynolds and others (in U.S. Geological Survey, 1995), following USGS tradition, suggested that any geologic map worthy of the name must have a base, as the location of geologic features is always relative to the location of physiographic or cultural features. Yet any number of digital geologic spatial databases are available without an associated base. Such databases are more compact. Many users want to plot geologic data on a base map of their own choosing. And in many cases an adequate digital representation of the appropriate base map simply hasn't been available.

Databases can be updated more easily than paper maps. Producing, printing, and distributing a paper map is an expensive and time-consuming proposition. This had the effect of stabilizing our geologic understanding: it simply cost too much for advances in knowledge to be quickly and widely disseminated.

Some of these differences became clear to me when I began a project to produce a large composite geologic spatial database and straightforward methods to produce many maps from this database (see Haugerud, 1997). I have one database: it is a collection of computer files that describe the geology of the Pacific Northwest in (1) terms that the software can manipulate and in (2) symbolic abstractions that are similar to those embodied in a traditional geologic map -- units and contacts. If I add to this database a geographic outline, a projection, a scale, and a set of rules for generalizing and visualizing the content of the database, I get a map. With a different outline, projection, scale, or set of generalization and visualization rules, I get a different map. The database contains spatial information sufficient to produce many maps. It does so without the symbolization or scale dependence of a map. Each map contains information (scale, symbolization, projection, extent) that is either absent from the database or different from what the database holds. There is not a one-to-one relation between database and map.
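The relation between one database and many maps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Feature, MapSpec, and make_map are hypothetical, the database is reduced to labeled points, and the "rules" are reduced to an extent test and a color lookup. It is not the procedure of Haugerud (1997).

```python
from dataclasses import dataclass

@dataclass
class Feature:
    """One element of the (hypothetical) database: a unit label at a location."""
    unit: str    # map-unit label, e.g. "MzTv"
    x: float     # easting, in meters
    y: float     # northing, in meters

@dataclass
class MapSpec:
    """Everything a map has that the database lacks."""
    extent: tuple      # geographic outline as (xmin, ymin, xmax, ymax)
    projection: str    # carried along as a label only in this sketch
    scale: int         # scale denominator, e.g. 100000 for 1:100,000
    colors: dict       # visualization rule: unit label -> color name

def make_map(database, spec):
    """Apply one map specification to the database and return what would be drawn."""
    xmin, ymin, xmax, ymax = spec.extent
    drawn = []
    for feature in database:
        inside = xmin <= feature.x <= xmax and ymin <= feature.y <= ymax
        if inside:
            # Units without an assigned color fall back to gray.
            drawn.append((feature.unit, spec.colors.get(feature.unit, "gray")))
    return {"projection": spec.projection, "scale": spec.scale, "features": drawn}

# One database, two specifications, two different maps.
database = [Feature("MzTv", 1000.0, 2000.0), Feature("pTb", 9000.0, 2000.0)]
regional = MapSpec((0, 0, 10000, 10000), "assumed projection A", 100000,
                   {"MzTv": "green", "pTb": "purple"})
detail = MapSpec((0, 0, 5000, 5000), "assumed projection A", 24000,
                 {"MzTv": "green"})
regional_map = make_map(database, regional)
detail_map = make_map(database, detail)
```

The database is never modified; only the specification changes from one map to the next.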

STANDARDS FOR DIGITAL GEOLOGIC SPATIAL DATA

Executive Order 12906 (establishing the National Spatial Data Infrastructure) and the National Cooperative Geologic Mapping Act effectively mandate a national standard for digital geologic map data. I believe we do not know what this standard should be. There is a large body of knowledge regarding the computerization of other kinds of spatial data, but much of it is irrelevant. Geologic maps and geologic spatial data differ from other maps and spatial data in significant ways. We need many experiments with data structures and archival and retrieval mechanisms because we presently lack the empirical knowledge to prescribe an optimal standard.

But one of the great advantages of digital storage of spatial data is the ease with which such data can be transferred, transformed, and merged with other data sets. This ease is obtained only if data conform to common standards for format and content. We are thus faced with the need to encourage experimentation while creating some measure of standardization, not an easy task.

I suggest that those who would propose standards offer an interim minimum standard to provide translatability between data systems and a measure of completeness. This has in part been done (see http://ncgmp.usgs.gov/ngmdbproject/standards/dataexch/dataexchinterim.txt). Beyond this, we should wait some period -- perhaps five years -- during which we support multiple experiments with data models, formats, and transfer standards. At the end of this period, the experiments should be reviewed and one or two best options adopted by the geologic mapping community.

SOME SUGGESTIONS FOR BETTER DATABASES

What should a geologic spatial database look like? I don't know, but in the hope of encouraging open discussion I make a few suggestions here. A common theme among these suggestions is that they are controversial: each is in opposition to recent practice by thoughtful geologists.

I begin with data quality. Quality of our "data" has two aspects, locational accuracy and attribute accuracy. Paper maps have described data quality explicitly (approximate versus exact contacts, queried faults and unit identifications, reliability diagrams) and implicitly (via map scale, map-series, e.g. Open-file versus MI- and GQ-map, and authorship). Limitations of printing technology and the human eye don't allow much more, but I believe that both geologists who make and re-make maps and users of these maps would benefit from further information. Digital media allow us to provide such information.

Data Quality 1: Explicit Statement of Precision (Locational Accuracy) for Each Data Particle

Digital spatial databases lack the intrinsic scale of paper maps. Yet as noted above, the geologic information recorded in the database is scale-sensitive. Some geologists have suggested that labelling digital spatial databases with the scale of the equivalent paper map is sufficient, but I find a scale statement inadequate on five counts:

(1) Geologists may know how precisely the contacts on a 1:100,000-scale geologic map are located, but the county planner who uses our database most certainly does not!
(2) A contact is attributed as "approximately located" at 1:24,000 scale. Is it precisely located at 1:62,500 scale? At 1:100,000? Are all "approximately located" features equally well located? No. As the scale at which a database is plotted changes, how should the symbolization of each of the contacts and faults in it change?
(3) The precision of our maps is intrinsically heterogeneous. Some contacts are diffuse boundaries seen on air photos, some are located to a pin-point 6 paces NNE of a survey mark, and sometimes I straddle a contact but am in heavy timber and don't exactly know where I am. On a paper map (or a database where precision is denoted by a single scale) all precision of location more exact than that accommodated by the working scale is thrown away. Differing precisions less exact than defined by the working scale are not distinguished by our conventional symbology -- all are "approximately located."
(4) I have spent a decade mapping on 1:24,000-scale base maps for publication at 1:100,000 scale. Doing so is not significantly more expensive than mapping on a 1:100,000 base. Yet if a geologist of the next millennium wants to know where I was more precisely than the circa 100 meters allowed by the publication scale, she will have to repeat my expensive traverses. Is this a responsible use of our mapping funds? I think not, yet as long as we were restricted to paper maps we had no other choice. With the richness allowed by digital storage of spatial data, we can do better.
(5) Users of our digital geology will plot it at larger scales and overlay it on more precisely located features. We cannot prevent this. We can provide information for this to be done intelligently. For example:

Database

arc no.	LineType	EstimatedPrecision (meters)
36	fault	22
37	contact, intrusive	100

Plotting instructions

Set MapScale = 24000
Set MapUnits = meters
Set MapPrecision = MapScale / 1000
...
Select arcs with EstimatedPrecision <= MapPrecision
Plot these arcs as continuous lines
Select arcs with (EstimatedPrecision > MapPrecision)
and (EstimatedPrecision <= MapPrecision * 25)
Plot these arcs as dashed lines
Select arcs with EstimatedPrecision > MapPrecision * 25
Plot these arcs with question marks
...
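The plotting instructions above can be sketched in Python. This is a minimal illustration, not a real GIS macro: the function names and the dictionary layout of the arcs are assumptions, while the divisor of 1000, the factor of 25, and the two example arcs are taken from the text.

```python
def map_precision(map_scale, divisor=1000):
    """Precision that a plot at 1:map_scale can show, in map units (meters here)."""
    return map_scale / divisor

def line_symbol(estimated_precision, map_scale, factor=25):
    """Choose a line symbol from an arc's estimated precision and the plot scale."""
    precision = map_precision(map_scale)
    if estimated_precision <= precision:
        return "continuous"
    if estimated_precision <= precision * factor:
        return "dashed"
    return "queried"   # plotted with question marks

# The two arcs from the example database; precision is in meters.
arcs = [
    {"arc_no": 36, "line_type": "fault", "estimated_precision": 22},
    {"arc_no": 37, "line_type": "contact, intrusive", "estimated_precision": 100},
]

# The same database plotted at two scales gets two symbolizations.
symbols = {
    scale: {arc["arc_no"]: line_symbol(arc["estimated_precision"], scale) for arc in arcs}
    for scale in (24000, 100000)
}
```

At 1:24,000 the fault (22 meters) plots as a continuous line and the intrusive contact (100 meters) as a dashed line; at 1:100,000 both plot as continuous lines. Nothing in the database changes between the two plots.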

Data Quality 2: Record Data Lineage as a Proxy for Attribute Accuracy

One could ask for clear distinction between what the geologist saw, what the geologist thought he saw, and what the geologist thought he would have seen if he could have seen. But as noted above, these distinctions are arbitrary divisions along a continuum of confidence in "data." Communicating the position on this continuum of a given element of data is difficult, for we do not have the necessary vocabulary. By this, I mean that we cannot quantify, even crudely, our answers to questions such as: How confident are we that a certain contact exists? How certain are we that it is a fault and not an unconformity? What is the likelihood that the structural-stratigraphic hypothesis embodied in this map is fundamentally flawed? Even if it is fundamentally flawed, the lithologic and structural information on the map may be useful; how useful is it?

In the absence of the necessary vocabulary, I suggest we record the lineage of each data element as a crude proxy for attribute accuracy. Let me explain: Geologic spatial data on the 1:100,000-scale geologic maps that I know well come from a variety of sources: previous maps, new field work by many geologists and assistants, interpretation of different kinds of imagery (topography, aerial photographs, aeromagnetic maps, etc.), and inference. Our standard working procedure has been to bring all data -- structural attitudes, fragments of observed contact or fault, spots of color on a field sheet to record an outcrop confidently assigned to a geologic unit -- together on a single sheet of mylar and then ponder them. As we work -- extending and finalizing contacts, inferring faults necessary to make sense of the distribution of geologic units, drawing cross-sections -- it is useful to know where each datum comes from. Perhaps a particular strike-and-dip observation is at odds with the surrounding structure. If it is my observation, maybe it should be ignored, as at times I confuse dip directions. If it was made by a trusted colleague, maybe it is correct and my understanding of the surrounding structure needs changing.

I hope that some users of our geologic spatial databases also think this hard about the data within them. Such users will appreciate knowing what we, as creators, know about the lineage of the pieces that comprise the database. I suggest that each element (contact segment, polygon label, structural measurement, fossil age call) carry a lineage attribute. Example values for this attribute, from a 30'x60' compilation I am working on, are

attribute value	explanation
RAH 24K field sheet	from RA Haugerud field sheet
WADGER OFR 89-3	from McGroder et al., 1989, Washington Division of Geology and Earth Resources Open-file Report 1989-3
RWT 24K field sheet	from RW Tabor field sheet
RAH 30m DEM	interpreted by RA Haugerud from 30m digital elevation model
RWT 616080B 571-34	interpreted by RW Tabor from aerial photography, with project, roll, and frame number
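A per-element lineage attribute of this kind might be used as in the following sketch. The record layout and the function names are hypothetical; only the lineage values come from the table above.

```python
from collections import Counter

# Hypothetical database elements, each carrying a lineage attribute.
elements = [
    {"id": 1, "kind": "contact segment", "lineage": "RAH 24K field sheet"},
    {"id": 2, "kind": "structural measurement", "lineage": "RWT 24K field sheet"},
    {"id": 3, "kind": "contact segment", "lineage": "RAH 30m DEM"},
    {"id": 4, "kind": "contact segment", "lineage": "WADGER OFR 89-3"},
    {"id": 5, "kind": "polygon label", "lineage": ""},
]

def by_lineage(elements, lineage):
    """Return the elements that come from one named source."""
    return [e for e in elements if e["lineage"] == lineage]

def lineage_summary(elements):
    """Count elements per source so a user can see what the database is built from."""
    return Counter(e["lineage"] for e in elements if e["lineage"])

def missing_lineage(elements):
    """Return the ids of elements whose lineage was never recorded."""
    return [e["id"] for e in elements if not e["lineage"]]

summary = lineage_summary(elements)
from_dem = by_lineage(elements, "RAH 30m DEM")
unrecorded = missing_lineage(elements)
```

A user who doubts a particular contact can then check whether it was walked in the field or interpreted from imagery.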

Short Text Attributes; Limited Numbers of Values

Attributes of lines, points, and polygons that correspond to contacts, faults, and geologic units should be relatively short and should have a limited number of values. Short because a short attribute can always be written to a longer field when merging data into another database; longer attributes may have to be truncated or translated. Text (character) attributes are preferable to binary numeric attributes because they are immediately readable by humans; integers are OK. Attribute length must be a compromise: shorter attributes take less storage space yet longer attributes are more easily remembered and read by people. I suggest line-type attributes no more than 35 characters long and polygon-type attributes of no more than 10 characters.

Storage of long, frequently-repeated attribute values in a related table is one possible solution. Unfortunately, some experience suggests that as a community we are commonly not sufficiently sophisticated to recognize the existence of the related tables.

If the range of permitted values for a polygon-type or line-type is large, you should question whether you are making a geologic database -- much of our art lies in simplifying the Earth into a small number of entities (contact, thrust fault, Quaternary alluvium, etc.). If your geologic spatial database for a 7.5' quadrangle has 800 different values for the polygon identifier, perhaps you have made a set of digital field notes -- a useful thing to do, but not helpful for those who wish to operate on the database with the rules developed for geologic maps, or merge it with other databases.
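These suggestions are easy to check mechanically. The following is a small sketch: the limits of 35 and 10 characters are the ones suggested above, while the function names and the sample values are assumptions made for illustration.

```python
LINE_TYPE_MAX = 35      # suggested limit for line-type attributes
POLYGON_TYPE_MAX = 10   # suggested limit for polygon-type attributes

def too_long(values, max_len):
    """Return the distinct attribute values longer than max_len, sorted."""
    return sorted({v for v in values if len(v) > max_len})

def distinct_count(values):
    """Number of different values in use; a very large count deserves a second look."""
    return len(set(values))

# Hypothetical attribute values from a small database.
line_types = ["contact", "thrust fault", "contact", "fault"]
polygon_types = ["MzTv", "pTb", "MzTv", "Quaternary alluvium"]

long_line_types = too_long(line_types, LINE_TYPE_MAX)
long_polygon_types = too_long(polygon_types, POLYGON_TYPE_MAX)
n_polygon_types = distinct_count(polygon_types)
```

Here "Quaternary alluvium" is flagged as too long for a polygon-type attribute; a short tag with a definition stored elsewhere would pass.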

Keep Faults with Unit-Polygons

Some GIS practitioners insist that polygon layers should not contain dangling (unjoined) lines, or lines that separate polygons with identical attributes. The presence of such lines becomes an obvious indication that a polygon layer is corrupt. Faults that terminate within a unit and faults that separate polygons of the same unit are thus not allowed in the unit-polygon layer. Several groups have followed this practice and produced databases with faults in a different layer than polygons. Viljoen (1997) presented arguments that this practice speeds cartography and generalization.

Nonetheless, I suggest that all faults be contained within the geologic-unit polygon coverage. (1) Faults are given meaning by the geologic units that they bound (cut). Removing faults from unit polygons destroys any possibility of automating the analysis of faults. (2) If there are reasons to remove faults from a polygon coverage, this is easily done. It is not always easy to restore faults. (3) Updating a dual-coverage database is likely to be incomplete or lead to inconsistencies.

Define Attribute Values and Conventions

One of the interesting things I have learned downloading geologic spatial data off the Web is that many databases are not comprehensible without reference to the equivalent published paper map. Geologic units are not defined! A polygon is attributed as MzTv, but one has to go to the map library to learn that MzTv means Mesozoic-Tertiary volcanic rocks.

All values of attributes for which the meaning is not readily evident to the non-specialist should be defined. These definitions can take the form of a simple text file, such as

...
MzTv	Mesozoic-Tertiary volcanic rocks	[description]
pTb	Pre-Tertiary ultrabasic rocks	[description]
...

Such text files should be formatted so that they can be read into a database file.
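As one possible reading of "formatted so that they can be read into a database file," the sketch below assumes a tab-delimited file with three columns (unit label, unit name, description) and loads it with Python's standard csv module. The column layout and the function names are assumptions; the two sample rows are the ones shown above.

```python
import csv
import io

# Stand-in for the definitions file: tab-delimited, one unit per line.
SAMPLE = (
    "MzTv\tMesozoic-Tertiary volcanic rocks\t[description]\n"
    "pTb\tPre-Tertiary ultrabasic rocks\t[description]\n"
)

def read_definitions(handle):
    """Read unit definitions into a dictionary keyed by unit label."""
    table = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        label, name = row[0].strip(), row[1].strip()
        description = row[2].strip() if len(row) > 2 else ""
        table[label] = {"name": name, "description": description}
    return table

def undefined_units(units, definitions):
    """Return the unit labels used in the database that have no definition."""
    return sorted(set(units) - set(definitions))

definitions = read_definitions(io.StringIO(SAMPLE))
missing = undefined_units(["MzTv", "pTb", "Qa"], definitions)
```

The second function gives a quick completeness test before release: any label it returns (the hypothetical "Qa" here) would send the user to the map library.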

Various conventions need to be specified: are strikes and trends of structural symbols measured from geographic or grid north? Or from theta = 0 at grid azimuth = 90 of the GIS software? Are azimuths reported in the clockwise degrees of the geologist, or the anti-clockwise degrees of the GIS software? In units other than degrees? Are you using the right-hand rule (dip to the right when facing in the strike direction)? If not, how is dip direction specified? What do you mean by foliation? Is "bedding" with or without an independent, at-the-site observation of facing direction? Are precision estimates 1 sigma, alpha95, or something else?
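Two of these conventions are simple to state in code once they have been chosen. The sketch below assumes angles in degrees, a geologist's azimuth measured clockwise from north, a software angle measured anticlockwise from theta = 0 at azimuth 90, and the right-hand rule as defined above. The function names are hypothetical, and the difference between geographic and grid north is deliberately left out.

```python
def azimuth_to_math_angle(azimuth_deg):
    """Clockwise-from-north azimuth to anticlockwise-from-east angle, in degrees."""
    return (90.0 - azimuth_deg) % 360.0

def math_angle_to_azimuth(theta_deg):
    """Anticlockwise-from-east angle back to clockwise-from-north azimuth."""
    return (90.0 - theta_deg) % 360.0

def dip_azimuth_right_hand(strike_deg):
    """Dip direction under the right-hand rule: 90 degrees clockwise from strike."""
    return (strike_deg + 90.0) % 360.0

north_as_theta = azimuth_to_math_angle(0.0)       # north is theta = 90
south_as_theta = azimuth_to_math_angle(180.0)     # south is theta = 270
round_trip = math_angle_to_azimuth(azimuth_to_math_angle(315.0))
dip_direction = dip_azimuth_right_hand(350.0)     # strike 350 dips toward 080
```

The same formula serves in both directions, which is exactly why an undocumented convention is dangerous: a value of 45 reads as northeast under either convention, while a value of 30 is N30E in one and N60E in the other.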

Authorship

A cartographer I know suggests that if you turn someone else's analog map into digital form, your role is akin to that of the "cartographer who spends years on a map and gets mentioned in 8-point type down in the bottom corner." I suggest that a translator, who gets second billing, is a better analog. In either case you are not the author. The user of the resulting geologic spatial database deserves to know who is responsible for the geology and who is responsible for whatever improvement or degradation has been introduced in the process of digitization. Both pieces of information should be readily evident from even the most cursory catalog listing.

For previously published maps, I suggest a title and authorship statement in the form

Digital version of <PAPER MAPNAME> by <PAPERMAP AUTHORS>, digital transcription by <DIGITAL AUTHORS>

This map previously published by <PAPER MAP PUBLISHER> as <PAPER MAP SERIES NAME/NUMBER>, <PAPERMAP PUBLICATION DATE>

Burying authorship of the geology within an accompanying descriptive file, or in a dry metadata statement, is not helpful. New maps may be titled

<GEOLOGY OF QUADRANGLE XYZ> by <GEOLOGISTS>, digitization by <DIGITIZERS>

if agency policy allows. If the geologist and digitizer are the same, it is then appropriate to use

These recommendations pass muster with the librarians with whom I have discussed them and are similar to those of Reynolds and others (in U.S. Geological Survey, 1995).

ACKNOWLEDGMENTS

I thank Dave Soller for his comments on an earlier draft of this paper.

REFERENCES

Fitzgibbon, T.T., and Wentworth, C.M., 1991, ALACARTE user interface--AML code and demonstration maps: U.S. Geological Survey Open-File Report 91-587, 10 p.

Haugerud, R.A., 1997, Flexible map production from a composite geologic-map database: Geological Society of America Abstracts with Programs, v. 27, p. A-306.

Haugerud, R.A., Brown, E.H., Tabor, R.W., Kriens, B.J., and McGroder, M.F., 1994, Late Cretaceous and early Tertiary orogeny in the North Cascades, in Swanson, D.A., and Haugerud, R.A., editors, Geologic field trips in the Pacific Northwest: Geological Society of America Annual Meeting field trip guide, Dept. of Geological Sciences, University of Washington, Seattle, p. 2E-1-51.

Raines, G.L., Brodaric, B., and Johnson, B.R., 1997, Progress report--digital geologic map data model, in Soller, D.R., editor, Proceedings of a workshop on digital mapping techniques: Methods for geologic map data capture, management, and publication: U.S. Geological Survey Open-File Report 97-269, p. 43-46. See also http://ncgmp.usgs.gov/ngmdbproject/standards/datamodel/model42.pdf

Soller, D.R., editor, 1997, Proceedings of a workshop on digital mapping techniques: Methods for geologic map data capture, management, and publication: U.S. Geological Survey Open-File Report 97-269, 120 p.

Tabor, R.W., Haugerud, R.A., Booth, D.B., and Brown, E.H., 1994, Preliminary geologic map of the Mount Baker 30- by 60-minute quadrangle, Washington: U.S. Geological Survey Open-File Report 94-403.

U.S. Geological Survey, 1995, Draft cartographic and digital standard for geologic map information: USGS Open-File Report 95-525.

Viljoen, David, 1997, Topological and thematic layering of geologic map information: Improving efficiency of digital data capture and management, in Soller, D.R., editor, Proceedings of a workshop on digital mapping techniques: Methods for geologic map data capture, management, and publication: U.S. Geological Survey Open-File Report 97-269, p. 15-21.


U.S. Department of the Interior, U.S. Geological Survey
<https://pubs.usgs.gov/openfile/of98-487/haug1.html>
Maintained by Dave Soller
Last updated 10.07.98
