What is a digital geologic map?
A digital geologic map is any geologic map whose geographic details and explanatory data are recorded in a digital format that is readable by computer. There are two fundamentally different conceptual uses for digital geologic maps: cartography and analysis. Cartographers use digital systems to produce geologic maps for a number of reasons. Although there are differences of opinion about whether digital methods are faster or more efficient for the initial production of geologic maps, nearly all agree that digital maps are much faster and more efficient to update. Digital geologic maps are also much more likely to be re-used for purposes beyond their original goals. They can easily be re-drawn at a different scale or projection than the original, and features on the maps can be easily added, deleted, or modified. Thus, the original map does not become obsolete just because the needs or purposes of the cartographers change. Cartographers are generally concerned with using the digital representation of the geologic map to produce one or more published geologic maps, usually on paper.
Analysts, on the other hand, are usually more interested in the representation of the geology in its digital form. Their interest is in combining the digital geology with other types of digital data in an attempt to model natural systems or to solve problems related to natural systems. Analysts also need to present their products on paper, so the analyst's needs include those of the cartographer. Consequently, the analyst's perspective is taken in this model development.
Modeling is involved because we are attempting to define, for computing purposes, how people think about and use geologic maps. With this understanding we can design a database that will meet the needs of both the cartographer and the analyst.
The purpose of a data model for digital geologic maps is to provide a structure for the organization, storage, and use of geologic map data in a computer. The data model formally defines the grammar of digital geologic maps. This grammar is independent of the vocabulary of geologic maps. For a digital geologic map to be truly powerful, both the grammar and the vocabulary must be addressed. The primary objective of this effort is to develop a digital data model (grammar) for geologic map information. Time will tell how much of the vocabulary will be addressed.
The model is being developed in two forms: a conceptual and a relational database presentation. Because technology is rapidly evolving, we are attempting to be forward looking in developing a conceptual data model. Consequently, some aspects of this conceptual model cannot be easily implemented in common relational database GIS software. The relational presentation of the model is an attempt to translate the more general conceptual model into a more easily implemented relational database form. This relational database form is the starting point for implementing the data model in a GIS such as Arc/Info. We intend to implement this model in Arc/Info and ArcView 3 with prototype tools to facilitate data entry and analysis.
Over the last decade we have gained much experience in using various simple approaches to digital geologic map data models. Interestingly, most of these simple data models have been developed independently by separate groups, but they have been similar. The important feature lacking from most of these approaches, however, is the recognition that the text information in a map legend, explanation, or associated book report contains essential information needed to apply the geologic map to solving problems. In order to use the geologic map for spatial analysis, this text information needs to be organized in a form that can be analyzed by computer. Geologic maps can be extremely complex with many different types of information. Most geologic maps include a background of polygonal areas, which represent geologic units or materials that cover the geologic units such as water, ice, etc. The lines that separate the polygons also have significance; they represent differing types of contacts. Overlaying this background are usually numerous linear features such as faults, folds, dikes, veins, etc., and several different types of point features such as structural symbols and sample location symbols.
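As a minimal sketch of what organizing legend text "in a form that can be analyzed by computer" might look like, the Python fragment below stores each legend entry as a structured record and links polygons to it by map-unit label. All class names, field names, and sample units here are hypothetical illustrations, not part of the data model described in this report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapUnit:
    """Structured form of one legend entry (field names are hypothetical)."""
    label: str      # map-unit symbol printed on the map, e.g. "Tb"
    name: str       # unit name from the legend text
    age: str        # geologic age stated in the legend text
    lithology: str  # dominant rock type stated in the legend text


@dataclass(frozen=True)
class Polygon:
    """One polygonal area on the map, linked to its legend entry by label."""
    polygon_id: int
    unit_label: str


# Invented sample data standing in for a digitized legend and map.
UNITS = {
    "Tb": MapUnit("Tb", "Basalt flows", "Tertiary", "basalt"),
    "Qal": MapUnit("Qal", "Alluvium", "Quaternary", "unconsolidated sediment"),
    "Kg": MapUnit("Kg", "Granodiorite", "Cretaceous", "granodiorite"),
}
POLYGONS = [Polygon(1, "Tb"), Polygon(2, "Qal"), Polygon(3, "Tb"), Polygon(4, "Kg")]


def polygons_with_lithology(polygons, units, lithology):
    """Return the ids of polygons whose map unit has the given lithology."""
    return [p.polygon_id for p in polygons
            if units[p.unit_label].lithology == lithology]


print(polygons_with_lithology(POLYGONS, UNITS, "basalt"))
```

Once the legend text is held as attributes rather than free text, a question such as "which areas are underlain by basalt?" becomes a simple selection instead of a manual reading of the explanation.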
Additional complexity is introduced by the lack of symbolization standards for geologic maps. Although some general colors are often used for the same general types of rock units, there is no convention in common use for assigning a particular rock unit the same color on all maps. The same is true to a lesser degree for line patterns and point symbols. A pattern that may represent a dike on one map may be used to represent a fault or a vein on an adjacent map.
In the use of such complex information, there are cartographic and analytical considerations, each making its own demands. Dealing with all of these issues is a complex task requiring a complex data model. There is also a competing need for simplicity: the task of getting information from a digital geologic map into a data model must be efficient. Considering all of these diverse and complex requirements, users of simple data models have concluded that those models are not adequate, and have identified a long list of problems that arise in using them. A more complete data model is needed, along with a formal analysis of the goals of such a model. Development of this more complete model has evolved from efforts of the USGS, the GSC, and U.S. state and Canadian provincial geological surveys.
Figure 1. Summary diagram of the components of the data model.
In designing this more complete model, the point of view considered is that of the user of geologic maps. Figure 1 presents the user's perspective on digital geologic maps. The real world is composed of many entities, including geologic objects. Geologic objects include such things as structural measurements, faults, map units such as formations, and other geologic features commonly represented on geologic maps. These real-world geologic objects are of two types, interpreted and observed. Observed geologic objects consist of things that are actually observed or measured in the field, such as structural measurements, fault traces in outcrop, or characterizations of individual samples or outcrops. Interpreted geologic objects consist of the interpretation, grouping, or classification of multiple observed geologic objects, such as map units defined by observations of outcrops or fault traces defined from evidence observed in several outcrops. Representations of both interpreted and observed geologic objects are stored in a geologic object data archive, which requires a GIS to deal with the geometric and spatial aspects. A map legend establishes an association between objects in the data archive and their geometric, spatial, semantic, and symbolic representation on a particular map. Thus, a map is a representation of selected geologic objects symbolized and described for some specific purpose. Symbolization is defined by scale and purpose. Any geologic object from the object archive could be represented, for example, as a point, line, polygon, or volume, depending on the scale and purpose of the map. The major point of this concept is to separate symbolization from data description.
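The separation of symbolization from data description can be illustrated with a short Python sketch. It assumes a single archive of geologic objects and two legends that select and symbolize those objects differently. The class names, identifiers, and sample values are all hypothetical and are not taken from the model itself.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    """Whether an object was observed in the field or interpreted."""
    OBSERVED = "observed"
    INTERPRETED = "interpreted"


@dataclass(frozen=True)
class GeologicObject:
    """Data description only: no geometry type, color, or symbol."""
    object_id: str
    kind: str          # e.g. "fault", "map unit", "structural measurement"
    origin: Origin
    description: str


@dataclass(frozen=True)
class LegendEntry:
    """How one archived object is represented on one particular map."""
    object_id: str
    geometry: str      # "point", "line", "polygon", or "volume"
    symbol: str


# Hypothetical archive shared by every map.
ARCHIVE = {
    "F1": GeologicObject("F1", "fault", Origin.INTERPRETED,
                         "Normal fault inferred from several outcrops"),
    "U7": GeologicObject("U7", "map unit", Origin.INTERPRETED, "Basalt flows"),
    "S3": GeologicObject("S3", "structural measurement", Origin.OBSERVED,
                         "Bedding attitude at an outcrop"),
}

# Two legends over the same archive: a detailed map and a regional map.
DETAILED_LEGEND = [
    LegendEntry("F1", "line", "heavy black line"),
    LegendEntry("U7", "polygon", "red fill"),
    LegendEntry("S3", "point", "strike-and-dip symbol"),
]
REGIONAL_LEGEND = [
    LegendEntry("F1", "line", "thin black line"),
    LegendEntry("U7", "point", "red dot"),  # unit too small to draw as an area
]


def render(archive, legend):
    """Return (description, geometry, symbol) for each object the legend selects."""
    return [(archive[e.object_id].description, e.geometry, e.symbol)
            for e in legend]


for row in render(ARCHIVE, REGIONAL_LEGEND):
    print(row)
```

In this sketch the basalt unit is a polygon on the detailed map and a point on the regional map, while its description in the archive is unchanged. The structural measurement is simply not selected by the regional legend.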
These concepts lead to a large number of tables with complex linkages. The complexity of such data structures is managed through user interface tools. With proper tools, the complexity of the data model is transparent to the user. The critical tools needed are computerized data entry forms and standardized queries that can be packaged with a geologic map visualization tool such as Arc/Info or ArcView 3. Such tools are currently being developed as prototypes alongside the data model in order to test and demonstrate its use. A complete description of the data model and a demonstration data set are planned for early summer, 1997.
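To give a sense of how linked tables can be hidden behind a standardized query, the following Python sketch uses the standard-library sqlite3 module with an in-memory database. The three tables, their columns, and the sample rows are assumptions made for illustration only. The actual model has many more tables and is intended for a GIS such as Arc/Info rather than SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, heavily simplified schema: an object archive, a list of maps,
# and a legend table that links the two and carries the symbolization.
conn.executescript("""
CREATE TABLE geologic_object (
    object_id   INTEGER PRIMARY KEY,
    kind        TEXT NOT NULL,
    description TEXT NOT NULL
);
CREATE TABLE geologic_map (
    map_id INTEGER PRIMARY KEY,
    title  TEXT NOT NULL
);
CREATE TABLE legend_entry (
    map_id    INTEGER NOT NULL REFERENCES geologic_map(map_id),
    object_id INTEGER NOT NULL REFERENCES geologic_object(object_id),
    symbol    TEXT NOT NULL,
    PRIMARY KEY (map_id, object_id)
);
""")

conn.executemany(
    "INSERT INTO geologic_object VALUES (?, ?, ?)",
    [(1, "fault", "Normal fault"), (2, "map unit", "Basalt flows")],
)
conn.executemany(
    "INSERT INTO geologic_map VALUES (?, ?)",
    [(1, "Detailed map"), (2, "Regional map")],
)
conn.executemany(
    "INSERT INTO legend_entry VALUES (?, ?, ?)",
    [(1, 1, "heavy black line"), (1, 2, "red fill"), (2, 1, "thin black line")],
)


def objects_on_map(connection, map_id):
    """Standardized query: the objects shown on one map, with their symbols."""
    cursor = connection.execute(
        """
        SELECT g.kind, g.description, l.symbol
        FROM legend_entry AS l
        JOIN geologic_object AS g ON g.object_id = l.object_id
        WHERE l.map_id = ?
        ORDER BY g.object_id
        """,
        (map_id,),
    )
    return cursor.fetchall()


print(objects_on_map(conn, 1))
print(objects_on_map(conn, 2))
```

A user calling objects_on_map never sees the join. This is the sense in which packaged queries and entry forms can keep the underlying table structure transparent.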
A number of design criteria have been identified that guide the development of the data model. Those criteria are the following:
Why define a data model?
Modeling is a complex task that attempts to capture the intricacies of real-world situations, including the characteristics of real-world objects, events, and object-event interrelations. Modeling by its very nature is from a particular point of view, often a combination of the view of an expert in the system being described (the earth, wildlife, surface geology, etc.) and an expert in implementing models on computer systems (databases, forward modeling, etc.). Thus, the modeling process occurs at many levels of abstraction. In the domain of geological mapping, the real world objects modeled typically range widely from the details of individual observations, to their interconnections, and to their synthesis into explanatory structures.
When the modeling process is intended to produce a data manipulation framework, the conceptual setting in which it occurs is typically called a data model. A data model is formally defined as a set of fundamental conceptual objects and mathematical and logical rules that govern their behavior. The rules are usually expressed in terms of how and why objects may exist, and what interactions are permitted (Codd 1980).
The formal objects and operators of a data model are generally abstract in nature and form a language in which real world situations may be expressed. Generally such languages are intended to be mapped into computing constructs, easing the transition from the real world, to the abstract, and finally to the computer. This process requires the identification of key concepts within a specific real-world domain (as seen through the eyes of an expert) and an expression of their interactions using the data model's conceptual objects and operators. In this sense, a data model may be seen as a tool kit composed of concepts, operators, and their rules of behavior, all used to describe some real world phenomenon for computing purposes. In its most abstract sense a data model provides the logical framework in which the real world may be described for computing.
However, there exist many possible levels of examination in this process. At one level it can be described as a rigorous, abstract notation for describing some real world domain (e.g., geologic maps). On another level, it can be seen as a way of organizing and manipulating data pertaining to the real world domain at the physical level of the computer, in terms of bytes, records, and files. Both perspectives are commonly referred to as a type of data model. Hence, the term data model is often used to describe the product of a modeling process, usually a database design for a particular real world domain, as well as the method and rules of abstraction used to generate such representations of reality. For instance, it is not uncommon to speak of geometric data models or geological data models -- these are each abstractions containing domain-specific concepts and rules. In another sense, however, the computing paradigm in which the models are formed, be it relational, object-oriented, or some other, is also a data model (of a data model), as it describes how the domain models are created and how their architecture behaves.
In some cases the domain specific model is called a database model (Burrough 1992; Teory 1988), as database design is the ultimate purpose at hand. This notion of the model being directly expressed as a database design may be attributed to the seminal work of Codd on relational databases (Codd 1970), which has caused data modeling to become linked with database design. As a result, the relational data model has become the standard example of a data model.
This initial notion of a data model providing a conceptual framework as well as a logical mapping into computing constructs has been under review for some time. Computationally driven frameworks that are expressed as an algebra with mathematical operators, as in the relational model, are seen as generally insufficient for expressing many semantic relationships between data. Because of this, conceptual models use semantically richer, often non-mathematical operators. However, this makes their translation to computing environments more complex, and in some cases impossible to implement with commercial systems.
The results of a modeling process must ultimately be applied in a computing environment, be it spatial (GIS -- Geographic Information System), or non-spatial (RDBMS -- Relational Database Management Systems, Object-Oriented Database Systems), or both. Before this can occur a model must first describe, minutely and exactly, the type and behavior of the information to be managed by the database. This process usually involves requirements analysis and database modeling, resulting in a particular database design for a given subject area and set of data. The needs are identified during the requirements analysis, and these in turn lead to the identification of critical concepts, their interactions, and other implementation criteria, all of which constitute the database model. It is important to design and populate the database for optimum querying, in terms of both conceptual completeness and performance efficiency: a bad database design can result in slow, incomplete, or incorrect responses. Database models are usually described at three levels: conceptual, logical, and physical (Frank 1992). Once a design is formulated it can be programmed within a database system, the resultant database can be populated with data, and finally, the database becomes useful for thematic querying.
Burrough, P.A., 1992, Are GIS data structures too simple-minded?: Computers and Geosciences, v. 18, no. 4, p. 395-400.
Codd, E.F., 1970, A relational model of data for large shared data banks: Communications of the ACM, v. 13, no. 6, p. 377-387.
Codd, E.F., 1980, Data models in database management: Proc. Workshop on Data Abstraction, Databases, and Conceptual Modeling, Pingree Park, Colorado, June 1980.
Frank, A.U., 1992, Spatial concepts, geometric data models, and geometric data structures: Computers and Geosciences, v. 18, no. 4, p. 409-417.
Teory, T.J., 1988, Database modeling and design -- the fundamental principles, second edition: Morgan Kaufmann, San Francisco, CA, 277 p.