In Reply Refer To:                                   September 8, 1981
EGS-Mail Stop 412

QUALITY OF WATER BRANCH TECHNICAL MEMORANDUM NO. 81.22

Subject: PROGRAMS AND PLANS--WATSTORE, Remark Coded Data

Quality of Water Branch Technical Memorandum 81.18 described modifications to be made to the water-quality data stored in WATSTORE. The principal effect of these changes will be a greatly increased number of data values being qualified by the "less than" (<) remark code. The purposes of this memorandum are to define the "less than" and ND remarks, review the implications of remark codes, and outline the options available to the user relative to WATSTORE, SAS, and statistical procedures in general.

The ND, "not detected", remark means that an analysis was made, that the detection limit of the analytical technique was unknown, and that the constituent was not detected. The data value stored in such a record is 0.0. This value has no quantitative validity and is merely a holding place in storage until the detection limit of the analytical technique is determined, at which time the ND remark can be replaced with a "less than" remark and the value 0.0 replaced with the appropriate value for the detection limit.

The <, "less than", remark means that the analysis was made, that the detection limit was known, and that any readout from the analytical device was within two standard deviation units of background noise, meaning that there was not sufficient confidence in the readout to assign a real, numerical value to the concentration.

In any given data-analysis application, the decision to use or not use remark-coded data must be based on the statistical analysis method being used. Furthermore, for a number of chemical constituents, the analytical results stored in the Water Quality File were obtained using several different methods, each of which had a different detection limit. The presence of multiple detection limits for a given constituent presents additional challenges to interpretive analysis.
For difficult data-analysis situations such as these, the analyst should consult a recognized statistical expert early in the planning stage.

The remark codes presently available for use in WATSTORE are described in Volume 3, Chapter II, Section A, Page 16 of the WATSTORE manual. Each parameter in a record of the Water-Quality File is described by three data fields as follows:

     BYTE POSITION     DATA TYPE          IDENTIFIER
     137-140           Fixed bin (31)     Parameter code
     141-145           Float (15)         Value
     140-149           Char (4)           Remark

When a data value is qualified by a remark code, the value itself is not modified in any way; only the four-byte field assigned to the remark code is filled. Therefore, unless remarked data are specifically excluded from a retrieval, the value in the data field is passed to an application program without its qualifier; that is, <200 becomes 200 in the application program. Analysis of a data set containing such data can lead to serious interpretive errors.

The WATSTORE retrieval program (E771) described in Volume 3, Chapter III, Section A of the WATSTORE manual has two options for the exclusion of remarked data from the output data set.

     A. Code a nonblank character in column 24 of the "M" card to exclude an entire analysis from the output data set if any parameter value in the analysis is qualified by a remark.

     B. Code a nonblank character in column 25 of the "M" card to exclude a given parameter code, value, and remark from the output data set if that parameter value is qualified by a remark. All other parameters in the analysis will be written to the output data set.

A SAS user routine (a "macro") written by WRD called QWSAS, described in Volume 3, Chapter IV, Section P of the WATSTORE manual, converts data retrieved from the QW File in backfile format to a SAS data set. Each record in a QWSAS SAS data set, as in the QW File backfile format, contains all the parameters for a given analysis.
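
The effect of an unchecked remark field, and the difference between the two E771 exclusion options described above, can be illustrated with a minimal Python sketch. This is not WATSTORE, E771, or SAS code. The Parameter record, the function names, and the parameter codes and values are all hypothetical and chosen only for illustration; the sketch assumes an unremarked value carries a blank remark field.

```python
from typing import NamedTuple


class Parameter(NamedTuple):
    """Hypothetical stand-in for one parameter of an analysis."""
    code: int      # parameter code (made-up numbers below, not real codes)
    value: float   # stored value, never altered by the remark
    remark: str    # remark code; assumed blank when the value is unqualified


def is_remarked(p: Parameter) -> bool:
    """A value is remarked when its remark field is not blank."""
    return bool(p.remark.strip())


def exclude_remarked_analyses(analyses):
    """Analogue of option A: drop a whole analysis if any value is remarked."""
    return [a for a in analyses if not any(is_remarked(p) for p in a)]


def exclude_remarked_parameters(analyses):
    """Analogue of option B: drop only the remarked parameters."""
    return [[p for p in a if not is_remarked(p)] for a in analyses]


# Illustrative data only.
analyses = [
    [Parameter(1, 7.2, ""), Parameter(2, 200.0, "<")],   # stored "<200"
    [Parameter(1, 6.9, ""), Parameter(2, 340.0, "")],
    [Parameter(1, 7.0, ""), Parameter(2, 0.0, "ND")],    # 0.0 is a placeholder
]

# Ignoring the remark passes 200.0 and 0.0 along as if they were measurements.
naive_values = [p.value for a in analyses for p in a if p.code == 2]

option_a = exclude_remarked_analyses(analyses)
option_b = exclude_remarked_parameters(analyses)
option_b_values = [p.value for a in option_b for p in a if p.code == 2]
```

In this sketch the option A analogue keeps only the one analysis with no remarked values, while the option B analogue keeps all three analyses but drops the two remarked values of the second parameter. Which behavior is appropriate depends on the statistical method to be applied.
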
In the SAS records each parameter code is converted to a label for the parameter in question, and the remark code, at the option of the user, can be made into an additional variable in the record. The remark code variable can then be used in a SAS DATA statement to either exclude or otherwise qualify a particular data value for use in subsequent processing steps.

Another SAS macro, QWINPUT, available for use with data retrieved from the Water-Quality File, creates a data set with a slightly different format from that created by QWSAS. QWINPUT produces as many records as there are data values in the Quality of Water file output data set. Each record contains a single parameter value (labeled VALUE), a parameter code (labeled PARMCODE), and a remark code (labeled REMARK). As in QWSAS, the remark code variable can be used in a SAS DATA statement to exclude or qualify the parameter value for use in subsequent processing steps.

Methods for using remark-coded data have been the subject of widespread discussion within the Division, and a number are being tested at the present time. We welcome suggestions, questions, and comments on this subject and will update current information as appropriate.

                                        R. J. Pickering
                                        Chief, Quality of Water Branch

This memo supplements Quality of Water Branch Technical Memorandum No. 81.18.

Key Words: WATSTORE, Remark codes, Statistics, SAS, Water Quality

Distribution: A, B, S, FO, PO