Review and Revision of the Environmental Sample Data
The first step in preparing the data for analysis was to review all
the records and make any necessary revisions. The initial review involved
checking data codes to ensure that the sample type was properly identified
for each record. The primary types for nutrient data are environmental
samples, replicate samples, and blank samples. Replicates and blanks
are quality-control (QC) samples. Nutrient QC data were described and
analyzed by Mueller and Titus (2005). A few cases of inconsistent or
obviously erroneous sample coding were noted and corrected. For example,
25 records with nutrient concentration data were coded as plant or animal
tissue samples, and 8 obvious environmental samples were incorrectly
coded as various QC sample types. Also, three samples coded as blanks
were found to be switched with related samples that were coded as environmental.
Data for some environmental samples were stored in more than one record
in the database. These records were combined into a single record for
each sample. In most of these cases, one type of data, such as organic
carbon, was stored in one record, and the remaining data for the sample,
such as physical measurements and nutrients, were stored in a second record.
At two National Water-Quality Assessment Program sites, samples were
collected in more than one location, and the individual records were
combined to provide total streamflow and flow-weighted average concentrations.
These sites were the Apalachicola River near Sumatra, Florida, which
was sampled both in the main channel and on the flood plain during
high flow, and the Rio Grande at San Marcial, New Mexico, which was
sampled in the historical channel and a bypass canal.
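The combination described for these two sites can be sketched as follows. This is a minimal illustration, assuming each location contributes one streamflow value and one concentration. The function name `combine_locations` and the numeric values are hypothetical and are not taken from the data set.

```python
def combine_locations(parts):
    """Combine per-location measurements into one record.

    parts: list of (streamflow, concentration) pairs, one per sampled
    location (for example, main channel and flood plain).
    Returns (total streamflow, flow-weighted average concentration).
    """
    total_flow = sum(q for q, _ in parts)
    if total_flow <= 0:
        raise ValueError("total streamflow must be positive")
    # Weight each concentration by the share of flow it represents.
    weighted_conc = sum(q * c for q, c in parts) / total_flow
    return total_flow, weighted_conc


# Hypothetical values: (streamflow, concentration) for two locations.
total_q, avg_c = combine_locations([(300.0, 1.0), (100.0, 2.0)])
print(total_q, avg_c)
```

The location that carries more of the flow contributes proportionally more to the combined concentration, so the result approximates a single sample of the whole cross section.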
The resultant environmental samples data set contains more than 28,000
samples from 500 sites collected during water years 1992–2001.
About 200 of these samples contained known or obvious data errors.
Known errors included sampling mistakes (for example, analyses
of total nutrient concentrations made on filtered water samples) and
laboratory errors. These were deleted from the data set. Extremely
anomalous values were identified by comparing samples over time and
ranges of streamflow and by checking for consistency among constituent
concentrations and physical measures. About 175 of these values were
deleted. Also, about 25 anomalous “less than” remark codes
(<) were deleted, but the values were retained. In about 35 samples,
anomalous values were replaced by values from a replicate sample, or “dissolved” concentrations
were substituted for “total” values. In addition, about
40 other data values were obvious decimal errors, which were corrected.
All revisions are recorded in the environmental data file in the column
labeled “Nutrient National Synthesis Team Comments.”
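One of the revision types above, deleting an anomalous "less than" remark code while retaining the value and recording the change, might be sketched as follows. The `Result` class, its field names, and the comment wording are assumptions made for illustration. The actual layout of the environmental data file is not described here.

```python
from dataclasses import dataclass


@dataclass
class Result:
    value: float
    remark: str = ""   # "<" marks a "less than" (censored) value
    comment: str = ""  # hypothetical stand-in for the comments column


def drop_remark(result, reason):
    """Delete a '<' remark code, keep the value, and log the revision."""
    if result.remark != "<":
        return result  # nothing to revise
    return Result(
        value=result.value,
        remark="",
        comment=f"Remark code '<' deleted: {reason}",
    )


# Hypothetical record and reason.
revised = drop_remark(Result(0.05, "<"), "inconsistent with related samples")
print(revised)
```

Returning a new record rather than editing in place keeps the original available for comparison, which suits a workflow in which every revision is documented.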