Digital Mapping Techniques '97
U.S. Geological Survey Open-File Report 97-269

Semi-Automated Data Capture For Vectorizing
Geologic Quadrangle Maps In Kentucky

By Warren H. Anderson, Lance G. Morris, and Thomas N. Sparks

Kentucky Geological Survey

228 Mining and Mineral Resources Building

University of Kentucky

Lexington, KY 40506

Telephone: (606) 257-5500



The Kentucky Geological Survey is using a semi-automated data capture technique to vectorize data from existing hard-copy geologic maps. A multi-step procedure is being developed to collect accurate vector data. The vector data will facilitate the use of geologic information in geographic information systems (GIS). The primary objectives of this program are to collect accurate vectorized geologic data at a scale of 1:24,000, create electronic 7.5-minute geologic quadrangle maps, and compile electronic 1:100,000-scale maps from the 1:24,000-scale maps.

Conventional geologic mapping of Kentucky at a scale of 1:24,000 was completed in 1978. In 1996, as part of the STATEMAP component of the U.S. Geological Survey's National Cooperative Geologic Mapping program, the Kentucky Geological Survey initiated a project to convert published 1:24,000-scale geologic quadrangle maps to digital format. Some published geologic quadrangle maps are out of print, and there are no plans to reprint them. The demand for digital information and the utility of digital data focused our attention on converting these published maps to digital format. The 24 geologic quadrangle maps in the Hazard 30-minute by 1-degree 1:100,000-scale quadrangle in the Kentucky River Basin were chosen to begin the digitizing process.

Vectorizing existing geologic maps will create digital geologic data that can be used in computer applications and GIS. In a GIS, vectorized geologic contacts make it possible to manipulate data to obtain area and volume measurements, answer spatial queries, locate wells, and perform other geologic and mineral resource manipulations and searches.
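As a minimal illustration of the kind of area measurement that vectorized contacts make possible, the sketch below applies the shoelace formula to a closed polygon. The coordinates are hypothetical and assumed to be in meters in a planar projection; this is not the routine Arc/Info itself uses.

```python
def polygon_area(vertices):
    """Return the unsigned area of a simple polygon given as (x, y) pairs."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical outcrop polygon: a 300 m by 200 m rectangle.
outcrop = [(0.0, 0.0), (300.0, 0.0), (300.0, 200.0), (0.0, 200.0)]
area_m2 = polygon_area(outcrop)
print(area_m2 / 10_000.0, "hectares")
```

The same function works for any simple (non-self-intersecting) polygon, regardless of whether the vertices are listed clockwise or counterclockwise.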


The STATEMAP component of the USGS National Cooperative Geologic Mapping program mandates five types of data for inclusion in electronic form: geological, geochemical, geophysical, geochronological, and paleontological. As part of this program, the Kentucky Geological Survey (KGS) is currently collecting the following information in digital form: geologic contacts, major coal beds, structural contours, faults, paleontological sites, and point data (outcrops or other features).

Under the STATEMAP program for 1996, KGS will complete the vectorization of 24 1:24,000-scale, 7.5-minute geologic quadrangle maps and will compile them into a 1:100,000-scale geologic map by edgematching each boundary. In the process, a database of appropriate metadata will be created. The Barcreek geologic quadrangle map was the first to be vectorized using the semi-automated data capture techniques (fig. 1).

Figure 1. Draft geologic map of the Barcreek Quadrangle, Clay County, Kentucky. [200 K GIF, 1000 x 1138 pixels]

Kentucky Geology as it relates to digital mapping

The geology in Kentucky consists predominantly of flat-lying sedimentary rocks. The Cincinnati Arch, a large north-trending anticlinal structure in central Kentucky, divides the State into two basins: the Appalachian Basin in the east and the Illinois Basin in the west. Faulting is extensive in parts of Kentucky, but igneous intrusions are rare. Coal, limestone, sand, gravel, clay, oil, and natural gas are important mineral resources in Kentucky. Industries that extract these resources require resource assessments and geologic analyses. Converting paper maps to digital products will provide needed digital geologic information to mineral industries, planning agencies, and other public and private interests.

Computer Systems, Software, and Personnel

The Kentucky Geological Survey is using two 160-megahertz Pentium PCs, one Digital Equipment Corporation Alpha workstation, a Hewlett-Packard 650-C color plotter, and an Eagle black-and-white scanner for the project. One Pentium PC is used to drive the scanner and capture the raster image, and the Alpha workstation is used for vectorization and map compilation.

Arc/Info (ArcScan) software is used to capture vector files from raster images. Experience with Arc/Info has shown that a period of 2 to 4 months is required to obtain minimum proficiency. Once proficient in the software, a geologist can complete the digital conversion of a 1:24,000-scale geologic quadrangle map in approximately 2 to 3 weeks.

Personnel for the KGS digital geologic mapping project include a principal investigator, two individuals with expertise in both GIS and geology, and two technicians. In addition, the project receives significant support from the KGS Computer Services Section. This project will also enlist the aid of the Survey's GIS coordinator, who will help organize the metadata files. The KGS Database Manager will incorporate the digital geologic mapping files from this project into the KGS database. Several members of other KGS sections (Coal and Minerals, and Geologic Mapping and Hydrocarbon Resources) will also be available to help in their areas of expertise.

Procedures and Methods

A multi-step process for digital conversion of geologic maps has been developed for this project to obtain accurate and reliable data (fig. 2).

Figure 2. A process for digital conversion of geologic maps. [50 K GIF, 500 x 665 pixels]

Mylar Preparation

To convert published geologic maps into digital vector data, a stable-base Mylar composite of the geologic data is used. Paper maps are not dimensionally stable and should be used only when no Mylar copies are available. The composite is created by photo-enhancing the original geologic map to create a film positive that contains all geologic data but none of the topographic contours. The topographic contours cause significant problems during auto-vectorization because of the frequent line intersections encountered during the digitizing process. Later, a digital elevation model (DEM) and/or a digital raster graphic (DRG) will be used to add topography to the map.

Scan Parameters

Scanner accuracy has been an issue during our conversion process. Potential problems included stretching of the Mylar during scanning, the scanner's roller control, and the scanner's camera alignment and calibration. These three problems can be controlled by calibrating the equipment according to the manufacturer's specifications. Each scan is checked to ensure that roller slippage or medium stretching (particularly with paper) has not changed the dimensions of the map. We have resolved these issues by calibrating and recalibrating as necessary to maintain high standards of accuracy.

The scanner software permits scanning parameters to be adjusted for each scan to obtain best results. Contrast control is adjusted for each quadrangle to obtain high-contrast raster images. Speckle removal is not used because it removes some important data. Experience has shown that a resolution of 400 to 600 dots per inch gives the best results; this resolution avoids line coalescence while still resolving very thin lines that might not be totally captured at lower resolutions.
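To put the scan resolutions above in ground terms, the following sketch converts dots per inch to the ground distance covered by one pixel on a 1:24,000-scale map. It is simple arithmetic under the assumption of an undistorted scan; the function name is ours, not part of any scanner software.

```python
METERS_PER_INCH = 0.0254  # exact by definition


def ground_pixel_size_m(dpi, scale_denominator):
    """Ground distance, in meters, covered by one scanned pixel."""
    map_inches_on_ground = scale_denominator / dpi  # inches of ground per pixel
    return map_inches_on_ground * METERS_PER_INCH


for dpi in (400, 600):
    size = ground_pixel_size_m(dpi, 24_000)
    print(dpi, "dpi:", round(size, 3), "m per pixel")
```

At 400 dots per inch one pixel spans 60 inches (about 1.5 m) on the ground, and at 600 dots per inch it spans 40 inches (about 1.0 m).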

Registration and Rectification

Once a Mylar has been scanned and saved as a raster image (in TIFF format), the image is registered to a blank vector coverage based on the four known corner coordinates of the original Mylar. These four points can be expressed in either digitizer inches or real-world coordinates. The registered corner points serve as the georeferenced links between the original raster image and the vector coverage. Rectification corrects any skewness for a particular quadrangle.
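The idea of registering an image to four corner points can be sketched as a least-squares affine fit from pixel coordinates to map coordinates. This is an illustrative assumption about the transformation, not a description of the Arc/Info registration commands; the corner values below are hypothetical (a slightly skewed 400-dots-per-inch scan of an 18 by 22 inch sheet, in digitizer inches).

```python
def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):  # back substitution
        s = a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / a[r][r]
    return x


def fit_affine(src, dst):
    """Least-squares affine transform taking src (col, row) to dst (x, y)."""
    rows = [(x, y, 1.0) for x, y in src]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):  # solve once for x, once for y
        atb = [sum(r[i] * d[k] for r, d in zip(rows, dst)) for i in range(3)]
        params.append(solve3(ata, atb))
    return params  # [[a, b, c], [d, e, f]]


def apply_affine(params, pt):
    """Apply x' = a*x + b*y + c, y' = d*x + e*y + f to one point."""
    (a, b, c), (d, e, f) = params
    x, y = pt
    return (a * x + b * y + c, d * x + e * y + f)


def rms_error(params, src, dst):
    """Root-mean-square residual of the fit at the control points."""
    sq = 0.0
    for s, d in zip(src, dst):
        px, py = apply_affine(params, s)
        sq += (px - d[0]) ** 2 + (py - d[1]) ** 2
    return (sq / len(src)) ** 0.5


# Hypothetical corner tics: pixel (col, row) and digitizer inches (x, y).
pixel_corners = [(10, 5), (7210, 12), (7204, 8812), (4, 8805)]
map_corners = [(0.0, 22.0), (18.0, 22.0), (18.0, 0.0), (0.0, 0.0)]

params = fit_affine(pixel_corners, map_corners)
print("RMS residual (inches):", rms_error(params, pixel_corners, map_corners))
```

With four control points and six unknowns the fit is overdetermined, so the RMS residual gives a quick check on how much skew or stretch remains after an affine correction.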

Image to Grid

A new rectified TIFF image is then converted into an Arc/Info grid that precisely overlies the blank vector coverage. This grid is used as a raster background during vectorization. Some Quaternary deposits are manually digitized in AutoCAD because of problems scanning the Quaternary contacts. Corner coordinates are also created as tic marks in AutoCAD, then converted to DXF format and imported into Arc/Info. This method was chosen to maintain procedural consistency in registration because of the operator's familiarity with AutoCAD. Future tic marks will be selected from the USGS master tic files.

Semi-Automated Vectorization

Once the registered and rectified raster image is ready, it is entered into ArcScan. This software package provides a semi-automated, operator-assisted method for rapid vectorization of raster data. The vector arrow traces the raster image line until it intersects another line, whereupon the operator is prompted for directions on which way to proceed. Because the operator must direct the vectorization at each intersection, this is the slowest step in the conversion. Vectorizing a complete 1:24,000-scale quadrangle took an experienced operator two weeks.
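The trace-until-intersection behavior described above can be illustrated with a toy line follower on a binary raster. This is only a sketch of the general idea, assuming a one-pixel-wide, 4-connected line; it is not ArcScan's algorithm, and the function name is ours.

```python
def trace_line(grid, start):
    """Follow a one-pixel-wide line from start until it ends or branches.

    Returns (path, choices). choices is empty at a dead end; at a junction
    it lists the candidate next pixels among which an operator must choose.
    """
    rows, cols = len(grid), len(grid[0])
    path = [start]
    visited = {start}
    while True:
        r, c = path[-1]
        candidates = []
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            in_bounds = 0 <= nr < rows and 0 <= nc < cols
            if in_bounds and grid[nr][nc] and (nr, nc) not in visited:
                candidates.append((nr, nc))
        if len(candidates) != 1:
            return path, candidates  # dead end (0) or junction (2 or more)
        path.append(candidates[0])
        visited.add(candidates[0])


# Tiny raster: a horizontal contact crossed by a vertical line at column 3.
raster = [
    [0, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
]
path, choices = trace_line(raster, (1, 0))
print("traced:", path)
print("operator must choose among:", choices)
```

Tracing stops at the crossing pixel and hands back three possible directions, which mirrors the point at which the operator is prompted.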

Once vectorization of the quadrangle is completed, the resulting polygons, arcs, and points are built in Arc/Info to create topology. This establishes the coverage and attribute tables, which can later be used for analysis. The information in these coverages is made up of three Arc/Info feature classes: arcs, points, and polygons. Arcs and polygons define geologic contacts, coal outcrops, formations, structural contours, and faults. Points define selected outcrops, well locations, and fossil locations.

Attributes for arcs and points are stored in subfiles with .aat and .pat file extensions, using the following fields: Formation Name (FMNAME), Formation Code (FMCODE), Geologic Quadrangle Number (GQNUM), and Hazard 1:100,000 Code (HZ_FMCODE). Coding examples are shown in the table below.













Matching and Joining Map Boundaries

Once several quadrangles have been vectorized, their boundaries must be matched and joined. This involves establishing a "snap environment" where geologic features are connected and a seamless boundary is created. This process is important for maintaining geologic integrity and cartographic smoothness, because most boundaries are not perfectly aligned. Once the quadrangles are joined, they are imported into ArcView, where the final map is produced.
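The snapping step can be sketched as moving each arc endpoint on one quadrangle's edge to the nearest endpoint on the adjoining quadrangle when the two lie within a tolerance. The coordinates and the 5-unit tolerance below are hypothetical, and this is a simplified stand-in for the Arc/Info snap environment, not its implementation.

```python
import math


def snap_endpoints(moving, fixed, tolerance):
    """Snap each point in moving to the nearest point in fixed within tolerance.

    Returns (snapped_points, unmatched_indices). Points with no partner
    within the tolerance are left in place and reported for manual review.
    Assumes fixed is not empty.
    """
    snapped, unmatched = [], []
    for i, p in enumerate(moving):
        nearest = min(fixed, key=lambda q: math.dist(p, q))
        if math.dist(p, nearest) <= tolerance:
            snapped.append(nearest)
        else:
            snapped.append(p)
            unmatched.append(i)
    return snapped, unmatched


# Hypothetical contact endpoints along a shared quadrangle boundary.
west_quad_edge = [(1000.0, 200.0), (1000.0, 450.0)]
east_quad_edge = [(1001.5, 198.0), (999.0, 452.0), (1000.0, 900.0)]

snapped, unmatched = snap_endpoints(east_quad_edge, west_quad_edge, 5.0)
print(snapped)
print("needs review:", unmatched)
```

Endpoints that fail to snap are the ones where the geology on the two published maps genuinely disagrees or a contact is missing, so they are worth listing rather than forcing together.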

Final Map Product

A preliminary draft of a 1:24,000-scale map is shown in Figure 1. The legend, scale, and titles are added in the layout windows of ArcView. In the future, we plan to include a raster image of the original geologic quadrangle map as a part of the digital product, perhaps in the metadata file.

The 1:100,000-scale digital geologic map of the Hazard quadrangle will include a stratigraphic column, topographic base, and legend. The map is being created as follows:

  1. Map data were compiled from Arc/Info coverages in ArcView, where the layout, title, authors, scale, corner labels, and legend were established.
  2. The stratigraphic column was created in AutoCAD, converted to DXF format, and imported into ArcView. The cross section was created partly from a Terrastation computer plot of subsurface data, which was supplemented with near-surface data. It was compiled in AutoCAD, converted to DXF format, and imported into ArcView.
  3. The text for the stratigraphic-column lithologic descriptions, economic geology, and references sections was compiled in Microsoft Word, imported into ArcView, and converted into a "text with line breaks" format. These data were manipulated in the Text Layout window of ArcView to achieve the final map product.


An integral part of digital mapping is establishing a mapping database that links the graphic information with its geologic components. Of the five major types of data being captured in the STATEMAP program, geology has the most subcategories. It is important to be able to search digital files for geologic subcategories such as quadrangle, lithology, formation, faults, fossils, minerals, engineering properties, environmental properties, imagery, and drill holes.

In addition to the new mapping database, we are working toward establishing dynamic links to the principal KGS databases on petroleum, coal, water, and minerals. This will allow us to produce custom digital geologic maps, on which options such as locations of oil and gas wells, coal data, water data, and mineral data can be plotted. The ultimate goal is to integrate the digital geologic mapping spatial database with the point locations and attributes in the KGS relational database.
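One way to picture the integration of point locations with the spatial database is a point-in-polygon join that tags each well with the formation polygon containing it. The sketch below uses a ray-casting test; the well identifiers, coordinates, and FMCODE values are placeholders invented for illustration and do not come from the KGS databases.

```python
def point_in_polygon(pt, polygon):
    """Ray-casting test: True if pt lies inside the polygon ring."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def tag_points(points, polygons):
    """Attach the FMCODE of the containing polygon (or None) to each point."""
    tagged = []
    for rec in points:
        code = None
        for poly in polygons:
            if point_in_polygon((rec["x"], rec["y"]), poly["ring"]):
                code = poly["FMCODE"]
                break
        tagged.append({**rec, "FMCODE": code})
    return tagged


# Placeholder formation polygons and well records (all values hypothetical).
formations = [
    {"FMCODE": "UNIT_A", "ring": [(0, 0), (100, 0), (100, 100), (0, 100)]},
    {"FMCODE": "UNIT_B", "ring": [(100, 0), (200, 0), (200, 100), (100, 100)]},
]
wells = [
    {"well_id": "W-1", "x": 50, "y": 50},
    {"well_id": "W-2", "x": 150, "y": 20},
    {"well_id": "W-3", "x": 300, "y": 50},
]

for rec in tag_points(wells, formations):
    print(rec["well_id"], rec["FMCODE"])
```

A well that falls outside every polygon is tagged None, which flags records whose coordinates may need checking before they are plotted on a custom map.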


Several new software products on the market combine a true automatic vectorization with optical character recognition to vectorize all the data for a quadrangle. Programs with the ability to recognize text and labels and the ability to distinguish between the various line widths are a major advance in automatic vectorization. These programs still require operator control and cleanup after vectorization, but the future is promising.


Maintained by Dave Soller
Last updated 10.06.97