U.S. Geological Survey Open-File Report 2005–1423
Radon in Soils of Parts of Cameron, Hidalgo, and Willacy Counties, Texas
Discriminant Analysis
This section presents the results of a principal components analysis and a subsequent discriminant analysis of the distributions of density, potassium (K), uranium (eU), and thorium (eTh) measured for the soil samples. The data were grouped based on the classification results of the airborne data from Duval (2005), as discussed in the section on analysis of variance. The purpose of this analysis is to determine how well the groups characterize the soil samples and to identify samples that may be misclassified.
The principal components analysis yielded the eigenvalues listed in Table 12. Principal components 1 and 2 account for more than 90 percent of the variance. Figure 19 shows a plot of the eigenvalues versus the principal component number. Table 13 lists the component loadings for the first two components after varimax rotation, and Figure 20 shows a graph of the loadings for principal component 1 (PC1) versus those for principal component 2 (PC2).
The similar component loadings for uranium (eU) and thorium (eTh) suggest that the two are closely linked. Table 14 lists the Pearson's correlation values for the variables used in the principal components analysis, and the value of 0.95 for uranium and thorium confirms a strong linear relationship between them. A likely explanation for this relationship is that uranium and thorium commonly occur together in minerals such as zircon and monazite, and such minerals are presumed to be present in the sediments.
The principal component loadings were used to calculate PC1 and PC2 for all of the samples. Discriminant analysis was then performed using PC1 and PC2 as the variables. Table 15 lists the numbers of samples classified into the four groups corresponding to recent alluvium (Qal), Beaumont Formation (Qb), Lissie Formation and/or dune sand (Ql), and Goliad Formation (Tg).
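
The principal components step described above can be illustrated with a minimal sketch. It assumes the analysis is run on the correlation matrix of the four standardized variables, which the report does not state. The data are synthetic stand-ins, not the report's measurements, and the varimax rotation applied in the report is omitted here. All variable names are illustrative.

```python
import numpy as np

# Synthetic stand-in data (NOT the report's measurements); 60 hypothetical samples.
rng = np.random.default_rng(0)
n = 60
eTh = rng.normal(8.0, 2.0, n)
eU = 0.3 * eTh + rng.normal(0.0, 0.15, n)  # built to correlate strongly with eTh
K = rng.normal(1.5, 0.4, n)
density = rng.normal(1.6, 0.1, n)
X = np.column_stack([density, K, eU, eTh])
names = ["density", "K", "eU", "eTh"]

# Standardize each variable (assumption: PCA on the correlation matrix).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(X, rowvar=False)  # Pearson correlation matrix

# eigh returns eigenvalues in ascending order, so reorder to descending.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Fraction of total variance carried by each component.
explained = eigvals / eigvals.sum()

# Unrotated loadings and sample scores for the first two components.
loadings = eigvecs[:, :2] * np.sqrt(eigvals[:2])
scores = Z @ eigvecs[:, :2]

for k, frac in enumerate(explained, start=1):
    print(f"PC{k}: eigenvalue={eigvals[k - 1]:.3f}, variance share={frac:.1%}")
```

The `scores` array holds PC1 and PC2 for every sample and is the kind of input a discriminant analysis would take. With real data, the cumulative sum of `explained` is what would be compared against the 90 percent figure quoted in the text.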
Prior to the analysis, the four samples designated as Alt were assigned to groups according to the maximum likelihood values, as discussed in the section on analysis of variance. Table 15 also lists the numbers of samples identified as misclassified. The classification accuracy is reasonably good for Qal and Tg but is less than 50 percent for Qb and Ql.
To determine whether some of the samples might be classified incorrectly relative to the classification of Duval (2005), the Mahalanobis distances between the sample values and the group means were calculated to identify the misclassified samples. The location of each sample so identified was carefully compared to the classification map of Duval (2005). In some cases the sample locations were found to lie very near the boundaries between different classes. Because the classification map is calculated using grids derived from flightline data, and approximately 75 percent of the grid cells are interpolated between flightlines, those samples near the boundaries were reassigned to a different group as appropriate. A total of 14 samples were reassigned. The discriminant analysis was then repeated, and Table 16 lists the results. The classification accuracies improved such that all of the values are greater than 65 percent. A spreadsheet listing the reassigned samples is available for download with the online version of this report.
Table 12. Listing of eigenvalues from the principal components analysis.
Figure 19. Eigenvalues plotted versus the principal component number.
Table 13. Listing of the component loadings for the first two principal components after varimax rotation and the approximate percentage of the total variance for each component.
Figure 20. Graph of principal component loadings for component 1 (PC1) versus component 2 (PC2) plotted as red dots with labels that identify the variable.
Table 14. Listing of Pearson's correlation values for the variables used in the principal component analysis.
Table 15. Listing of the numbers of samples within each grouping showing the number of samples misclassified by the discriminant analysis. The groups are designated as recent alluvium (Qal), Beaumont Formation (Qb), Lissie Formation and/or sand dunes (Ql), and Goliad Formation (Tg).
Table 16. Results of discriminant analysis after some of the samples were reassigned to different groups based upon their locations near class boundaries.
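
The misclassification check described in the text, which compares each sample with the group means by Mahalanobis distance, can be sketched as follows. This is an illustration under stated assumptions, not the report's procedure. The (PC1, PC2) scores are invented, only two of the four groups are shown, and each group uses its own sample covariance matrix, whereas the report does not say whether a pooled or per-group covariance was used. The helper names `group_stats`, `mahalanobis_sq`, and `nearest_group` are hypothetical.

```python
from statistics import mean

# Invented (PC1, PC2) scores for two of the report's group labels (NOT real data).
groups = {
    "Qal": [(1.2, 0.3), (0.9, -0.1), (1.5, 0.4), (1.1, 0.2), (0.8, 0.0)],
    "Tg": [(-1.0, 0.5), (-1.3, 0.1), (-0.8, 0.7), (-1.1, 0.2), (-1.4, 0.6)],
}

def group_stats(points):
    """Return the mean and sample covariance terms (sxx, sxy, syy) of 2-D points."""
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    n = len(points)
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(point, centre, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 matrix inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - centre[0], point[1] - centre[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

stats = {name: group_stats(pts) for name, pts in groups.items()}

def nearest_group(point):
    """Name of the group whose mean is closest in Mahalanobis distance."""
    return min(stats, key=lambda g: mahalanobis_sq(point, *stats[g]))

# A sample labelled Qal whose scores sit among the Tg samples is a
# candidate for the kind of map-based review described in the text.
label, sample = "Qal", (-1.2, 0.3)
if nearest_group(sample) != label:
    print(f"Sample labelled {label} lies closest to {nearest_group(sample)}")
```

A flag raised this way is only a prompt for further checking. In the report, flagged samples were reassigned only after their locations were compared with the classification map and found to lie near class boundaries.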