U.S. Geological Survey Data Series 182, usSEABED: Pacific Coast Offshore Surficial-Sediment Data Release, version 1.0
dbSEABED is an information-processing system that can perform statistical and individual tests of accuracy across the range of output parameters.
Issues of accuracy and reliability become apparent as soon as data are integrated. Tools for monitoring the integration process are required, with feedback to the input data, so that improvements can be made in the system.
All incoming data carry basic uncertainties that cannot be reduced, and integrative systems cannot proceed past them. Parallel studies in dbSEABED have determined, on the basis of replicate analyses, that analytical data such as grain-size analyses (Syvitski and others, 1991) have 1-sigma uncertainties on the order of 4 percent of the total parameter range, or 0.8 phi. With good maintenance of the data, the outputs from dbSEABED approach those levels of reliability.
For the thousands of samples where both analytical and descriptive data exist, a statistical comparison can be made between the EXT and PRS data outputs. The results of this calibration give an overall guide to the accuracy of the regional mappings and highlight areas and issues in the data where improvements can be made. Those improvements involve both the analytical and descriptive raw input data. Examples include grain-size analyses that appear to be of the whole sediment but are really only of the sand fraction, and analyses from which gravel or shell has been omitted.
The EXT and PRS outputs are imported into Microsoft Access® and links are created between the two files (usually based on the SampleKey). Entries with null values (-99) in either EXT or PRS are eliminated through a query. The query result is brought into Microsoft Excel, where the frequency distribution of the deviations (signed and absolute) is calculated and plotted for inspection. Percentile statistics are calculated from the absolute deviations at the 50th (median absolute deviation, MAD), 68th (1σ), and 95th (2σ) percentiles. Examples of the outputs are shown in the description of usSEABED. For most datasets the percentile statistics are 0.4, 0.8, and 4 phi for the 50, 68, and 95 percent levels, which may be acceptable over such a diverse set of input datasets but can be improved. An example of this analysis is shown in the figure below, for a dataset that is under improvement.
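The comparison described above can also be sketched outside Access and Excel. The following Python fragment is a minimal illustration only, not the procedure used by dbSEABED: it assumes each output has already been read into a dictionary keyed by SampleKey, and the names compare_ext_prs and percentile, as well as the linear-interpolation percentile rule, are assumptions made for this sketch. The MAD here is the median of the absolute EXT-PRS deviations, as in the text.

```python
NULL = -99  # null-value flag used in the EXT and PRS outputs


def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in 0..100) of a sorted, non-empty list."""
    if not sorted_vals:
        raise ValueError("no linked, non-null samples to compare")
    k = (len(sorted_vals) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def compare_ext_prs(ext, prs):
    """Link EXT and PRS values on SampleKey and summarize their deviations.

    ext, prs: dicts mapping SampleKey -> parameter value (for example, phi).
    Returns the signed deviations (PRS minus EXT) and the 50th, 68th, and
    95th percentiles of the absolute deviations.
    """
    deviations = []
    for key, ext_val in ext.items():
        prs_val = prs.get(key)
        # Skip samples missing from PRS or null (-99) in either output.
        if prs_val is None or ext_val == NULL or prs_val == NULL:
            continue
        deviations.append(prs_val - ext_val)
    abs_dev = sorted(abs(d) for d in deviations)
    stats = {p: percentile(abs_dev, p) for p in (50, 68, 95)}
    return deviations, stats
```

The signed deviations can be binned to give the frequency distribution for plotting, and the three percentiles correspond to the MAD, 1σ, and 2σ levels discussed in the text.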
A second way of statistically evaluating the results uses a cross-plot of the EXT and PRS output data, as shown in the figure below. This type of plot highlights some of the issues that may reduce the accuracy of dbSEABED with incoming datasets. At the locations A-D, these common issues are identified in populations of points:
The programs of dbSEABED have been equipped to detect problematic data, whether by values falling outside plausible limits or by mismatches between EXT and PRS results. These tools normally do not prevent the problem values from being output, but they do report detections to a diagnostics file that is particularly useful in the preparation and cleaning of incoming datasets. The statistical data shown in figure 3 are used to set the filters, usually at the 68 percent (1σ) level. The original data can then be revisited, checked for issues such as those shown in figure 4, and corrected, deactivated, or left alone as appropriate.
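The detection logic described above can be illustrated with a short Python sketch. It is not the dbSEABED code: the function name flag_problems, the plausible limits of -10 to 10 phi, and the message wording are hypothetical, and the default mismatch threshold of 0.8 phi is simply the 68 percent statistic quoted earlier for most datasets. As in the text, flagged values are reported rather than removed.

```python
def flag_problems(ext, prs, limits=(-10.0, 10.0), max_abs_dev=0.8, null=-99):
    """Return diagnostic messages for implausible values and EXT/PRS mismatches.

    ext, prs: dicts mapping SampleKey -> parameter value (for example, phi).
    limits and max_abs_dev are illustrative settings, not dbSEABED defaults.
    """
    report = []
    for key in sorted(set(ext) | set(prs)):
        e = ext.get(key, null)
        p = prs.get(key, null)
        # Check each non-null value against the plausible limits.
        for name, value in (("EXT", e), ("PRS", p)):
            if value != null and not (limits[0] <= value <= limits[1]):
                report.append(
                    f"{key}: {name} value {value} outside plausible limits {limits}"
                )
        # Check the EXT/PRS mismatch where both values are present.
        if e != null and p != null and abs(p - e) > max_abs_dev:
            report.append(
                f"{key}: EXT/PRS mismatch {p - e:+.2f} phi exceeds {max_abs_dev}"
            )
    return report
```

Writing the returned messages to a text file gives a simple diagnostics listing that can be reviewed alongside the original data.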
Any use of trade names is for descriptive purposes only and does not imply endorsement by the U.S. Government.