Techniques and Methods 7-A1
Published 2008 |
Fast, Inclusive Searches for Geographic Names Using Digraphs
By David I. Donato
Chapter 1 of |
Abstract |
An algorithm specifies how to quickly identify names that approximately match any specified name when searching a list or database of geographic names. Based on comparisons of the digraphs (ordered letter pairs) contained in geographic names, this algorithmic technique identifies approximately matching names by applying an artificial but useful measure of name similarity. A digraph index enables computer name searches that are carried out using this technique to be fast enough for deployment in a Web application. This technique, which is a member of the class of n-gram algorithms, is related to, but distinct from, the soundex, PHONIX, and metaphone phonetic algorithms. Despite this technique's tendency to return some counterintuitive approximate matches, it is an effective aid for fast, inclusive searches for geographic names when the exact name sought, or its correct spelling, is unknown. |
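The following is a minimal Python sketch of the general idea described in the abstract: extracting digraphs from names, building a digraph index, and ranking candidate names by digraph overlap. It is not the report's algorithm. The function names (`digraphs`, `build_index`, `search`), the use of the Dice coefficient as the similarity measure, the 0.5 threshold, and the choice to ignore spaces and punctuation are all assumptions made for illustration; the report defines its own measure of name similarity.

```python
from collections import defaultdict


def digraphs(name):
    """Return the set of ordered letter pairs in a name.

    Assumption: case is folded and non-letters (spaces, punctuation)
    are dropped before pairing, so pairs may span word boundaries.
    """
    letters = [c for c in name.lower() if c.isalpha()]
    return {a + b for a, b in zip(letters, letters[1:])}


def build_index(names):
    """Map each digraph to the set of names that contain it."""
    index = defaultdict(set)
    for name in names:
        for dg in digraphs(name):
            index[dg].add(name)
    return index


def search(query, index, threshold=0.5):
    """Return indexed names ranked by digraph overlap with the query.

    Assumption: similarity is the Dice coefficient of the two digraph
    sets; this is a stand-in, not the measure defined in the report.
    """
    q = digraphs(query)
    shared = defaultdict(int)
    # Only names sharing at least one digraph with the query are visited,
    # which is what makes the index faster than scanning every name.
    for dg in q:
        for name in index.get(dg, ()):
            shared[name] += 1
    results = []
    for name, n in shared.items():
        score = 2 * n / (len(q) + len(digraphs(name)))
        if score >= threshold:
            results.append((score, name))
    # Best score first; ties broken alphabetically.
    return [name for score, name in sorted(results, key=lambda r: (-r[0], r[1]))]
```

With the names Pittsburgh, Pittsburg, Petersburg, and Denver indexed, the misspelled query "Pitsburgh" returns Pittsburgh and then Pittsburg at the default threshold. Lowering the threshold to 0.4 also admits Petersburg, which illustrates how an inclusive setting can return less intuitive approximate matches.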
Report |
Download the report in Portable Document Format (PDF): Techniques and Methods 7-A1 (203 KB). |
Suggested Citation |
Donato, D.I., 2008, Fast, inclusive searches for geographic names using digraphs: U.S. Geological Survey, Techniques and Methods, book 7, chap. A1, 6 p. |
Contact |
For scientific questions or comments, please send inquiries to David I. Donato (E-mail address: didonato@usgs.gov). |