Contaminated Sediments Database for the Gulf of Maine, OFR 02-403

Database Construction
The methods used to construct the database included: locating data and references, defining data types, screening and entry of data, editing and validation of data, placement of data into a geographic and physiographic context, and transfer of information to users of the data. A summary of the database structure and content is given in the section "Content Overview." Please read the following text for details.

Collaboration
This database of existing data on chemical contaminant concentrations in sediment for the Gulf of Maine region was compiled with the collaboration and cooperation of many scientists, agencies, and institutions. The participation of the research and regulatory community in defining communal goals, in determining what measurements were important to record, and in assessing how to judge the quality of the rescued data results in products that meet the needs of the Gulf of Maine community. A listing of parameters to include in the database was agreed on and training in data screening and entry was provided to the participants. Principal collaborators, and their assistants or students, were responsible for locating references and entering data within their geographic or topic area. The compiled entries were reviewed as a batch by USGS staff for completeness and quality using iterative validation and screening methods (Manheim and Hathaway, 1991; Manheim et al., 1998). Entries that were identified by the validation process as questionable, data that needed repair, and samples with sparse documentation of quality criteria were reviewed again and appropriate comments were made in the database about these samples. Each collaborator was familiar with the content and structure of the database and could serve as a resource for others in the region on how to utilize searches, graphical displays, and comments to select and use data for specific needs.

Data and references
Data contained in the database originated from many
sources (Table
1). The
USGS completed searches of existing bibliographies and electronic searches
of the American Geological Institute's Geoscience Database (GEOREF),
the Aquatic Sciences and Fisheries Abstracts (ASFA), and the National
Technical Information Service (NTIS) listings. The ASFA and GEOREF searches
identified most of the papers in the peer-reviewed literature that contained
significant amounts of data. The NTIS search identified many governmental
agency documents that have limited distribution. Such in-house and consultant
reports are commonly referred to as "gray literature". Keywords
used for the searches included major locations, elements or compounds,
and likely general terms. Existing bibliographies, records held by
funding agencies, institutions, and libraries, and individual contact with
scientists and regulators working in marine sciences throughout the
Gulf of Maine were used to identify additional documents likely to contain
data on contaminants in sediments. Bibliographies reviewed for documents include those of the Regional Association for Research on the Gulf of Maine (RARGOM, 1997), Massachusetts Bays (Massachusetts Institute of Technology (MIT) Sea Grant for Coastal Resources, n.d.), Great Bay (Ward and Pope, 1994), and Bay of Fundy collections (Conservation Council of New Brunswick, 1993). When an existing compilation of historical
data was available (e.g., Metcalf & Eddy,
1984; Cahill and Imbalzano, 1991), the data was transferred electronically
and verified with the original data source when possible. Data held
in agency databases was also transferred electronically, and associated
information about data quality was acquired from published documents
and discussions with scientists at these agencies. Agency databases
that were utilized include the NOAA
Status and Trends Program (NOAA, 1988),
the Massachusetts
Water Resources Authority's Monitoring Program, and the U.S.
Army Corps of Engineers' permit and dredging programs (New England District,
Concord, MA; Buchholtz
ten Brink and others, 1992). Documents
containing data included in the database were cited in each data table
(under "Source of Information or Reference") and full bibliographic
references are given in the References. The
database contains linked information to aid users in locating original
data sources and paper copies are archived at the U.S. Geological Survey
in Woods Hole, MA. The compiled bibliographic information also includes
related references that did not contain original data on contaminants
in sediments.

Measurements of major elements, trace elements, metals, or organic contaminant compounds on whole sediments within the Gulf of Maine were compiled. Measurements on sediment fractions, waters, pore waters, or biota were not. The geographic area for sample inclusion is the marine region bounded on the south by Cape Cod, MA, on the east by Georges Bank, on the north by Nova Scotia, and on the west by coastal New England. Some references containing samples in contiguous wetlands, river estuaries, Georges Bank, and the Bay of Fundy were collected; however, not all samples from these peripheral areas were entered in this edition of the database, nor was the literature scrutinized for data from these areas.

The Database of Contaminated Sediments for the Gulf of Maine (Vol. 1) has attempted to comprehensively retrieve analytical data for sediment samples collected from 1950 through 1995; some omissions are inevitable. Data sets for more recent samples (some through 1998) that could be transferred electronically are included; however, newer documents that require hand-entry into the database are not in the current compilation. We maintain a listing of potential data sources and we ask that omissions, mistakes, supplementary information, and new data be brought to our attention.

Ancillary data
In addition to discrete contaminant measurements,
the database includes documentation about sample collection, analytical
methods, and other information that is required to assess the quality
of the reported data. The heterogeneity of the data sources has resulted
in a wide range of accuracy and precision for the data that is compiled.
Scientific editing of the data (see Data Validation section, below)
has identified some clerical or omission problems and permitted many
of them to be repaired. Commentary and qualifier information is provided
throughout the database to assist users in deciding which data are appropriate
for their specific application.

Database Structure
The Contaminated Sediment Database has a flat-file (spreadsheet) structure, with samples in the vertical dimension and properties in the horizontal dimension. The database is subdivided into six data tables in order to accommodate more than 800 fields without exceeding spreadsheet limitations. Each sample in the database occupies a record (row). Each sample record is linked across the tables by a unique identification number (Sample ID) that is assigned when the data is entered, and by a citation to the original source. This structure is flexible. It allowed unlimited addition of fields as new data types were encountered. It also provided a single structure for data entry, for data processing, and for data output in a format suited for immediate data plotting and evaluation using widely accessible commercial software. Requirements for special database management skills were minimized. The flat-file structure maximizes flexibility and transportability at the expense of compactness and structured query capabilities. Because software and data manipulation capabilities are changing rapidly, the database is provided in a structure that can be imported into the database management software of the user's choice.

Data Dictionary and Database Tables

Data Dictionary
The Data Dictionary defines the parameters that are in each data field included in the six data tables (Table 2). These tables contain information about the sample location and collection, measurements in sediments of inorganic chemicals, general organic compounds, polychlorinated biphenyls (PCBs) and pesticides, polyaromatic hydrocarbons (PAHs), and grain size. These linked tables are supplemented by separate glossary and reference tables (see Content Overview). The glossary includes abbreviations, methods and devices, and other lists compiled during the construction of the database.
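
The linkage of sample records across tables by Sample ID, described under Database Structure, can be illustrated with a minimal Python sketch. It assumes two of the tables have been exported and read into lists of dictionaries. The field names (`sample_id`, `latitude`, `longitude`, `pb_ppm`), the identifiers, and all values are invented for illustration and are not taken from the database.

```python
# Hypothetical rows from two of the six tables; field names and values are invented.
station_rows = [
    {"sample_id": "S0001", "latitude": 42.35, "longitude": -70.95},
    {"sample_id": "S0002", "latitude": 43.65, "longitude": -70.20},
]
inorganic_rows = [
    {"sample_id": "S0001", "pb_ppm": 85.0},
    {"sample_id": "S0003", "pb_ppm": 12.0},  # no matching station record
]


def link_by_sample_id(left_rows, right_rows, key="sample_id"):
    """Merge rows from two tables that share the same identification number."""
    # Index the right-hand table by its identifier for direct lookup.
    right_index = {row[key]: row for row in right_rows}
    linked = []
    for row in left_rows:
        match = right_index.get(row[key])
        if match is not None:
            # Combine the fields of both records into one wide record.
            linked.append({**row, **match})
    return linked


linked = link_by_sample_id(station_rows, inorganic_rows)
print(linked)
```

Only samples present in both tables appear in the result. A user who wants to keep unmatched station records could append `row` unchanged when no match is found.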
The full Data Dictionary, in vertical format, provides field names for each parameter in three columns: a short field name (10 characters), a medium field name (25 characters), and a definition of the field. This choice of format is provided to accommodate restrictions that may be imposed by the variety of software types used in the community. The fields within each table, and their full definitions in the Data Dictionary, are organized by subcategory and then alphabetically within subcategories. The Data Dictionary is a working and evolving document that provides detailed definitions of parameter fields, codes, and abbreviations. It is suggested that the user print these files and keep them handy while inspecting or extracting data.

Table 2. Organization of the Data Tables in the Contaminated Sediments Database

Information preservation
This compilation aims to preserve the information
that is reported in the original references yet make it homogeneous enough
to compile and manipulate. Most text fields in the database accept unrestricted
entry (except for text length) and there are numerous fields throughout
the tables for qualifiers and comments about the data and the sample.
The Working Dictionary and the
Glossary (the alphabetized Working
Dictionary) are metadata for the Data Dictionary. They were used
to record abbreviations, types of methods or devices used, new parameters,
data-entry logs, codes, and similar tables about the descriptive information
entered into the database during compilation. Entries were assigned
for a limited number of interpretive and coded fields in order to aid
in comparing heterogeneous data. For example, "collection depth"
separates "surface samples", which are defined as having more
than 80% of their length above 6 cm in depth, from subsurface samples
and samples with unknown depth. All available information was used to
assign coded fields: geographic location (Area Code), depth in sediment
(Depth Code), sampling device (Core or Grab), type of analysis recorded
in the Database (Metals & Other Inorganics, Organic Contaminants,
Grain Sizes) and availability of related data (Bioassay Data, Other
Analysis, Other references). The "row number" field, which
is present at the beginning of each table, is used for organization
and sorting and can be changed by the user.

The contents of most fields in the Database are suggested by their names, and all fields are fully defined in the Data Dictionary. The following comments focus on selected fields in the tables that are especially important or need explanation.

Station data: sample identity, location, and documented source

Analytical data: common features
The data tables (Inorganic, General organics, PCB and pesticides, PAHs, and Texture) follow the Station Table and have a common format. The "Unique ID#" and "Source of Information or Reference" fields are at the beginning of each table of analytical data. Next follows specific laboratory and analytical method information that pertains to all or many of the chemical entities reported in the table for a given source. Both instruments and procedures are noted, and quality data for groups of compounds may be consolidated here. Last are the analytical data reported for each sample and each parameter's qualifier fields. Chemical fields usually have a field for concentration values and specific units, a field for detection limit for the method and component, and a qualifier field that may contain quality or other annotations. Qualifiers include notes on measurements that fell below detection limits, reported detection limits, duplicate measurements, corrected measurements, original reported units, questionable values, editorial or data quality notes, and explanatory comments. Associating quality-control data with analytical values decreases the likelihood that information about data quality will be lost or ignored during data retrieval. Measurements that were made but could not be quantified (values were below limit of detection) were entered as zero. Cells were left blank where no data was available.
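
Because zero and blank carry different meanings in the analytical tables, code that reads the data should keep them apart instead of treating both as missing or both as numeric zero. The following Python sketch shows one way to do that for a single cell read as text from an exported table. The function name and the returned status labels are hypothetical, and it assumes the detection limit, when reported, is available as text from the companion field.

```python
def interpret_cell(value_text, detection_limit_text=""):
    """Classify one concentration cell read as text from an exported table.

    Returns a (status, value) tuple where status is 'no_data',
    'below_detection', or 'measured'. For 'below_detection', value is the
    reported detection limit, or None when no limit was recorded.
    """
    text = value_text.strip()
    if text == "":
        # Blank cell: no data was available for this sample and parameter.
        return ("no_data", None)
    value = float(text)
    if value == 0.0:
        # Zero marks a measurement that was made but fell below detection.
        limit = detection_limit_text.strip()
        return ("below_detection", float(limit) if limit else None)
    return ("measured", value)


print(interpret_cell(""))
print(interpret_cell("0", "0.5"))
print(interpret_cell("12.5"))
```

How a below-detection result is then treated (for example, excluded, or replaced by some fraction of the detection limit) is a choice for the data user and should be made after consulting the qualifier fields.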

Inorganic data: major and trace elements, and other inorganic properties
Some parameters listed in the Data Dictionary have no entries in the Database, e.g., surface area, resistivity, pH, acid volatile sulfides, and radiochemical and isotopic data. These properties can affect the fate and transport of contaminants in marine sediments, but the data were not identified in the compiled references. Such supporting analyses may have been measured as part of a project but reported in a different reference that was not available at the time of data entry.

Organic data: changing methods, bulk organic properties, and organic contaminants
Improvements in analytical methods for organic contaminants over time have resulted in a decrease of broad-scope measurements like "total PCBs" and an increase in analysis for specific organic compounds. The names of organic compounds, such as those reported in the tables of polyaromatic hydrocarbons (PAHs), polychlorinated biphenyls (PCBs), and pesticides, are those cited in the original data and are arranged categorically and alphabetically. Microbial contaminants and organotins are also recorded in this table, but total and organic carbon are recorded in the inorganic data table. Many organic contaminants are known and reported by more than one name; however, the Chemical Abstract Registry Number (CAS #) is also given for compounds whenever possible. Naming protocols may be confusing: specific organic compounds may be reported as totals, as sums of certain groups, or with names that differ slightly from those listed here. For example, "Fluorene", "C1-Fluorene", "C2-Fluorene", and "Fluorenes" are different measurements. In this database, results are separated where there is ambiguity about their equivalence. Data users should carefully consider information recorded in the methodology and qualifier sections, consult original sources if necessary, and use caution when comparing organic contaminant data from differing sources and years.
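
One cautious way to compare organic compounds across sources is to group results by CAS number where one is given, and to keep results without a CAS number separate under their reported names. The Python sketch below illustrates this. The record layout, the source labels, and the spelling variants are invented for illustration, and the empty CAS entry for "C1-Fluorene" is an assumption, not a statement about how the database records it.

```python
from collections import defaultdict

# Hypothetical records: (reported name, CAS number, source label).
records = [
    ("Benzo(a)pyrene", "50-32-8", "source A"),
    ("Benzo[a]pyrene", "50-32-8", "source B"),
    ("Fluorene", "86-73-7", "source A"),
    ("C1-Fluorene", "", "source B"),  # no CAS number: kept separate by name
]


def group_by_compound(rows):
    """Group results by CAS number when present, otherwise by reported name."""
    groups = defaultdict(list)
    for name, cas, source in rows:
        # Tag the key so a name can never collide with a CAS number.
        key = ("cas", cas) if cas else ("name", name)
        groups[key].append((name, source))
    return dict(groups)


groups = group_by_compound(records)
for key, members in groups.items():
    print(key, members)
```

Grouping this way merges spelling variants of the same compound while leaving "Fluorene" and "C1-Fluorene" as distinct measurements.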

Texture data: sediment grain size and lithology
Information in the texture table can be used to better understand the geologic context in which contaminants are found and the impact that they might have in situ. Sediment grain size (texture) data were originally generated by a variety of methods (Poppe et al., 2000) that can result in non-equivalent units for grain-size measurements. When the breakdowns were not reported in the source documents, the percentages of sediment in the gravel, sand, silt, and clay size classes were calculated from sieve-size information according to standard geological boundaries (if data allowed). Straightforward conversions between geological grain-size norms and those used for many engineering applications are not possible. Users should consider information recorded in methodology and qualifier sections for samples prior to use of data, consult original sources if necessary, and use caution when comparing grain size data from differing sources.

References for the Contaminated Sediments Database
Reference Tables provide full bibliographic citations for: 1) sources of compiled data; 2) other references reviewed for data content; 3) documents and bibliographies pertinent to Contaminants in Gulf of Maine Sediments; and 4) references cited in this publication. The Gulf of Maine Database Bibliography lists documents from which data was compiled. The tabular (Excel) file contains both the full citation and the short citation, which is given in the data tables under "Source of Information or Reference". The List of Additional References Reviewed for Data lists additional references that do not contain samples entered in the database but were reviewed for measurements of contaminants for whole sediments from the Gulf of Maine.
These include documents that contained: measurements of related parameters but had no contaminant data; measurements of contaminants in biota, waters, or fractionated sediments; samples outside the study area; synthesized data that was previously reported elsewhere; and new reports. Extensive documentation about sub-areas in the Gulf of Maine is available from a number of libraries in the region. Documents that are referred to in the text of this publication, "Contaminated Sediments Database for the Gulf of Maine", are given in the List of Citations in this Publication. The paper-trail information that is in the station table, reference tables, and data tables may be useful for selecting and evaluating data and for locating the original sources.

The data user is STRONGLY ENCOURAGED to review the contents of the data qualifier fields for every parameter and sample that is extracted from the Contaminated Sediments Database prior to its use so that the validity of data for a specific purpose is considered carefully.

Database access and data utilization techniques
This web site provides the description
of the Gulf of Maine database project, descriptive plots and maps of
compiled data, and access for viewing and downloading the data tables.
The CD-ROM, "Contaminated Sediments Database for the Gulf of Maine"
provides an edition copy of the web document and data tables. All of
the data compilation was accomplished with spreadsheet software (usually
Excel or Quattro Pro) on both PC and Macintosh platforms. Commercially
available database programs, such as PARADOX, 4th Dimension,
FoxPro, and ACCESS, were tested or used at various times to determine
if they provided significant advantages for data manipulation and to
ensure that the data were compatible with a variety of database structures.
The plots and maps used for data validation were also generated by an
assortment of programs and platforms: Kaleidagraph, Deltagraph, and
Sigmaplot; MAPINFO, ARCINFO/VIEW, Grapher, and others. Bibliographic
information was also compiled in and converted between various formats.
All of the data access and manipulation tasks can be accomplished with
minimal investment of software or hardware. The site can be viewed with
most common browsers, and is constructed with compatibility for Netscape
Communicator Version 4.5 and Internet Explorer Version 4.0. The data
dictionary and data tables, which are provided in Microsoft Excel 4.0,
can be imported to most word processors, spreadsheet, database, and
data manipulation programs. Tables can be viewed, downloaded and manipulated
on any computer platform that has appropriate software installed and
sufficient memory to open the data tables.
These compiled data are intended to
be a resource for researchers and managers in the Gulf of Maine. Potential
applications are numerous. They include mapping surficial sediment concentrations
to identify potentially toxic areas, assessing the thoroughness of data
reporting in regional literature, identifying areas that have a paucity
of measurements, determining the scale of necessary monitoring, quantifying
changes in environmental conditions over time, locating specific historical
samples, selecting indicator parameters, and others. Selective sorting,
plotting, or mapping of the data compiled in the data tables provides
a means to accomplish these tasks (see Examples).
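
As a minimal sketch of the selective sorting mentioned above, the following Python example selects surface samples and ranks them by the concentration of one parameter. The field names (`depth_code`, `cu_ppm`), the code values, and all numbers are invented for illustration. Actual field names and codes should be taken from the Data Dictionary, and blank cells are assumed to have been read as `None`.

```python
# Hypothetical records; field names, codes, and values are invented.
records = [
    {"sample_id": "S0001", "depth_code": "surface", "cu_ppm": 40.0},
    {"sample_id": "S0002", "depth_code": "subsurface", "cu_ppm": 95.0},
    {"sample_id": "S0003", "depth_code": "surface", "cu_ppm": None},  # blank cell
    {"sample_id": "S0004", "depth_code": "surface", "cu_ppm": 120.0},
]


def rank_surface_samples(rows, field):
    """Return surface samples that have a value for `field`, highest first."""
    selected = [
        row for row in rows
        if row["depth_code"] == "surface" and row[field] is not None
    ]
    return sorted(selected, key=lambda row: row[field], reverse=True)


ranked = rank_surface_samples(records, "cu_ppm")
print([row["sample_id"] for row in ranked])
```

The same selection could be made in a spreadsheet with a filter and a descending sort. The point of the sketch is only that samples with blank cells are set aside before ranking.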