Semi-Automated Methods to Develop a Unified Geographic Information System Dataset

Techniques and Methods 11-C10
National Land Imaging Program
By: J.L. Shapiro and D.I. Donato

Abstract

Geospatial data describing the topography, natural features, human-built features, and land uses of a particular area or region can come from independent data providers and, therefore, vary in format, data encoding, and geographic coverage. Because of the complexity of the processes and procedures required for unifying these heterogeneous data into a dataset with consistent format, encoding, and coverage, fully automated procedures for data unification do not exist. However, a combination of manual and automated procedures—semi-automated methods—can substantially reduce the time required for data unification while improving accuracy. This report presents three semi-automated data-unification methods in detail. Although these methods are not new in principle, their details are the result of original development work, and they serve as examples that can be reused, adapted, or generalized to provide head starts to future data-unification projects. The format of this report can be used and refined to encourage the publication of future reports and more widespread sharing of semi-automated methods.

Suggested Citation

Shapiro, J.L., and Donato, D.I., 2024, Semi-automated methods to develop a unified geographic information system dataset: U.S. Geological Survey Techniques and Methods, book 11, chap. C10, 32 p., https://doi.org/10.3133/tm11C10.

ISSN: 2328-7055 (online)

Table of Contents

  • Abstract
  • Introduction
  • Background
  • Methods for Removing Data Inconsistencies
  • Method 1: Unifying Data-File Names and Attribute Tables
  • Method 2: Eliminating Data Overlaps
  • Method 3: Merging Spatial Extents
  • The Benefits of Semi-Automation
  • Summary
  • Acknowledgments
  • References Cited
  • Glossary
  • Appendix 1. Python Scripts

Publication type: Report
Publication subtype: USGS Numbered Series
Title: Semi-automated methods to develop a unified geographic information system dataset
Series title: Techniques and Methods
Series number: 11-C10
DOI: 10.3133/tm11C10
Year published: 2024
Language: English
Publisher: U.S. Geological Survey
Publisher location: Reston, VA
Contributing office(s): Office of the Director USGS
Description: iv, 32 p.
Online only (Y/N): Y
Additional online files (Y/N): N