Appendix: Log-contrast principal components analysis and factor interpretation--a description of the method and tables summarizing two experiments

As a supplement to the above-described data analysis, a combination of principal components and factor analysis was used to search for relationships among chemical elements that might be useful in petrologic interpretation. Principal components and factor analysis are described by Reyment and Savazzi (1999) and Cooley and Lohnes (1962), respectively. A number of experiments using various methods of principal components and factor analysis were conducted on chemical data on sandstones of the Minturn and Sangre de Cristo Formations. It is important to realize that, in factor analysis, multiple methods, solutions, and interpretations are possible. Choices among methods, number of principal components to rotate, and alternative factor interpretations affect the outcome. However, after repeated experimentation, some consensus is generally possible.

The following discussion summarizes two experiments that used the method of log-contrast principal components analysis (Reyment and Savazzi, 1999), followed by rotation of principal components and factor interpretation. The results are considered to be representative of, although by no means identical to, those of the other experiments conducted. The first experiment presented here utilized data on 50 sandstone samples, for which 17 quantitatively determined chemical constituents (major oxides, loss on ignition, and trace metals) were available. The second experiment utilized the same data, but added data on 12 semi-quantitatively determined trace elements, for a total of 29 variables. In both experiments, the steps in the analysis were:

  1. Data preparation--Values below the limit of analytical detection were replaced and a centered log-ratio correlation matrix was calculated.
  2. Principal components analysis--A principal components analysis was performed on a centered log-ratio correlation matrix with unities in the diagonal. The correlation matrix was solved for its latent roots (eigenvalues) and a new matrix specifying uncorrelated vectors (principal component axes) was calculated.
  3. Selection of principal components--In the first experiment, the first six principal components were selected for rotation, based on eigenvalues > 1, maximum communalities for log-ratios, and ease of interpretation. In the second experiment, six principal components were selected for rotation to allow comparison with the first experiment.
  4. Rotation of principal components--Principal component axes were rotated to new, orthogonal (uncorrelated) axes using the Kaiser Varimax criterion.
  5. Factor interpretation--Rotated principal components were interpreted as petrologic processes, based on factor loadings, and supported by analysis of mineral-major oxide correlation and stratigraphic distribution.
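
Step 4 can be illustrated with a short Python sketch of a varimax rotation. This is a minimal version of the widely used SVD-based iteration, not the routine used in Statview; the function name `varimax`, the tolerance, and the small loading matrix `demo` are hypothetical, and Kaiser row normalization is omitted for brevity.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a (variables x factors) loading matrix by the varimax criterion.

    Returns the rotated loadings and the orthogonal rotation matrix.
    Kaiser row normalization is not applied in this sketch.
    """
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = L @ R
        # Gradient of the varimax criterion with respect to the rotation
        grad = L.T @ (rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        R = u @ vt                      # nearest orthogonal matrix to the gradient
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break                       # no further meaningful improvement
        criterion = new_criterion
    return L @ R, R

# Hypothetical unrotated loadings for four variables on two components
demo = np.array([
    [0.7,  0.5],
    [0.8,  0.4],
    [0.6, -0.6],
    [0.5, -0.7],
])
rotated, R = varimax(demo)
print(np.round(rotated, 3))
```

Because the rotation matrix is orthogonal, the rotated axes remain uncorrelated and the communality of each variable (row sum of squared loadings) is unchanged by the rotation.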

Because the data are compositional, the method of log-contrast principal components analysis (Reyment and Savazzi, 1999) was used instead of standard R-mode principal components (factor) analysis (Cooley and Lohnes, 1962). Although I have found standard R-mode factor analysis of compositional data to be a useful guide to identifying petrologic processes, it is not always satisfactory because the theoretical sum of values in each row (sample) of the data matrix is 100 percent. This property, referred to as the "constant sum problem" in geochemistry (Chayes, 1960), constrains the values of correlation coefficients: correlations are not free to range between -1 and +1, and correlations for major constituents are forced toward negative values. Moreover, when the number of constituents is reduced and the remainder is recalculated to 100 percent, as is done, for example, during conversion to a volatile-free basis, the correlations change. Solutions to the constant sum problem have been proposed, only to be declared invalid upon subsequent investigation, and the problem does not seem to be completely resolved. The method employed here, termed "log-contrast principal components analysis" (Reyment and Savazzi, 1999), is a newly proposed solution for the constant sum problem.
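
The constant sum problem can be demonstrated with a small Python sketch. The data below are synthetic and purely illustrative (three independent, hypothetical constituents, not the sandstone analyses); the helper name `close` is an assumption. The sketch compares correlations before closure, after recalculation to 100 percent, and after one constituent is dropped and the remainder is recalculated to 100 percent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, independent "abundances" for three hypothetical constituents
# (not the sandstone data); the first is the dominant constituent.
raw = np.column_stack([
    rng.lognormal(mean=4.0, sigma=0.3, size=500),
    rng.lognormal(mean=2.0, sigma=0.3, size=500),
    rng.lognormal(mean=1.0, sigma=0.3, size=500),
])

def close(x, total=100.0):
    """Rescale each row so that it sums to `total` (closure)."""
    return total * x / x.sum(axis=1, keepdims=True)

r_raw = np.corrcoef(raw, rowvar=False)
r_closed = np.corrcoef(close(raw), rowvar=False)
# Drop the first constituent and recalculate the remaining two to 100 percent
r_sub = np.corrcoef(close(raw[:, 1:]), rowvar=False)

print("raw r(1,2):            ", round(r_raw[0, 1], 2))
print("closed r(1,2):         ", round(r_closed[0, 1], 2))
print("closed r(2,3):         ", round(r_closed[1, 2], 2))
print("subcomposition r(2,3): ", round(r_sub[0, 1], 2))
```

In this sketch the dominant constituent acquires a strong negative correlation with the others purely as a result of closure, and the correlation between the second and third constituents changes when the composition is reduced to two parts, where it is necessarily -1.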

Although Reyment and Savazzi (1999) provide a convenient DOS program that performs a log-contrast principal components analysis and a traditional (R-mode) principal components analysis for comparison, I also used the program "Statview" to permit analysis of large data matrices and to extend the analysis to rotation of principal components. Using either program, a new matrix of log-ratios is calculated by dividing each cell of a row by the geometric mean of that row and converting the result to its logarithm. A new covariance matrix, termed the "centered log-ratio covariance matrix," is calculated from the log-ratio matrix, and a corresponding centered log-ratio correlation matrix is computed (Table 4). The new correlation matrix is a measure of proportionality between the original variables (columns). Raw data may be expressed as percent, as parts per million, or as a mixture of the two, provided that units are consistent within each column. Finally, in principal components analysis, the centered log-ratio correlation matrix is solved for its latent roots (eigenvalues). Centered log-ratio correlation coefficients are not comparable to correlation coefficients calculated from raw data, but the results of the principal components analyses (latent roots and vectors that specify factor loadings) can be inspected for similarities and differences. In the present analysis, I have relied exclusively on the method of log-contrast analysis.
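
The calculation described above can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the DOS program of Reyment and Savazzi (1999) or Statview: the function names `clr` and `log_contrast_pca` are hypothetical, the input is synthetic, and base-10 logarithms are assumed (the choice of base does not affect the correlation matrix).

```python
import numpy as np

def clr(data):
    """Centered log-ratio: log of each cell divided by its row geometric mean."""
    logs = np.log10(np.asarray(data, dtype=float))
    # log(x / g) = log(x) - mean(log(x)), where g is the row geometric mean
    return logs - logs.mean(axis=1, keepdims=True)

def log_contrast_pca(data):
    """Eigen-decomposition of the centered log-ratio correlation matrix."""
    z = clr(data)
    corr = np.corrcoef(z, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]           # largest root first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Unrotated loadings: eigenvectors scaled by the square root of each root
    loadings = eigenvectors * np.sqrt(np.clip(eigenvalues, 0.0, None))
    return eigenvalues, loadings, corr

rng = np.random.default_rng(1)
# Hypothetical data matrix: 50 samples (rows) by 5 constituents (columns)
demo = rng.lognormal(mean=1.0, sigma=0.5, size=(50, 5))
eigenvalues, loadings, corr = log_contrast_pca(demo)

# Re-expressing one column in different units (for example, percent to ppm)
# shifts its log-ratios by a constant, so the correlation matrix is unchanged.
rescaled = demo.copy()
rescaled[:, 0] *= 1.0e4
corr_rescaled = log_contrast_pca(rescaled)[2]
print(np.allclose(corr, corr_rescaled))
```

Because each row of log-ratios sums to zero, the smallest root of the correlation matrix is zero to within rounding, and the roots sum to the number of variables.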

Selecting the number of principal components to retain for rotation and factor interpretation is not always straightforward, and it was not in the present case. Criteria for selection are discussed by Jackson (1993). The number of principal components selected for rotation can be based on (1) eigenvalue magnitude greater than one, (2) the point on the eigenvalue distribution curve where an obvious change in magnitude occurs (root curve method), (3) comparison of the eigenvalue distribution curve with eigenvalues calculated from random data (broken-stick distribution), (4) maximum communalities under various rotation scenarios, and (5) trial interpretation of rotated factors. Criteria (1), (4), and (5) were used here.
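
Criteria (1) and (3) lend themselves to a short calculation. The Python sketch below applies them to the roots listed in Table 5. It is illustrative only: criterion (3) was not among those used in this study, and the broken-stick expectation is written here in its common form, b_k = (1/p) * sum of 1/i for i = k to p, where p is the number of variables.

```python
import numpy as np

# First eight roots from Table 5; the matrix has 17 variables in total.
eigenvalues = np.array([5.157, 2.610, 2.041, 1.691, 1.178, 1.010, 0.707, 0.599])
n_variables = 17

# Criterion 1: eigenvalue magnitude greater than one
n_kaiser = int((eigenvalues > 1.0).sum())

# Criterion 3: broken-stick expectation for the k-th largest proportion,
# b_k = (1/p) * sum_{i=k}^{p} 1/i
reciprocals = 1.0 / np.arange(1, n_variables + 1)
broken_stick = reciprocals[::-1].cumsum()[::-1] / n_variables

observed = eigenvalues / n_variables
exceeds = observed > broken_stick[: len(eigenvalues)]
# Count the leading components whose observed share exceeds the expectation
n_broken_stick = int(np.argmin(exceeds)) if not exceeds.all() else len(exceeds)

print("eigenvalue > 1:", n_kaiser)
print("broken stick:  ", n_broken_stick)
```

Different criteria need not agree on the number of components, which is one reason trial rotation and interpretation were also used.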

Six principal components were selected for rotation and interpretation. Selection of such a large number of principal components risks interpreting random effects (Jackson, 1993) but, in the present case, solutions involving fewer components yielded low communalities and posed difficulties for interpretation. (The communality of a variable, in this case a log-ratio, is the sum of its squared loadings on the retained factors, and is a measure of the degree to which a particular solution accounts for the variance of that variable.) The first two eigenvalues account for almost half of the total variance (Table 5) and, if low communalities for many log-ratios were ignored, a simple two-factor solution could be considered. The eigenvalue curve (not shown) becomes flat after the fourth root, at which point about 68 pct of the total variance is included (Table 5), but communalities of some log-ratios remain low. In contrast, the first six eigenvalues account for more than 80 pct of total variance (Table 5) and all communalities are about 0.70 or greater (Table 6). Results of rotating six principal components are summarized in Table 7.
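
As a worked check on these definitions, the following Python sketch recomputes two communalities from the rotated loadings in Table 7 and the cumulative variance proportion from the roots in Table 5. The values are copied from the tables; small differences in the third decimal place relative to the tabulated figures reflect rounding.

```python
import numpy as np

# Rotated loadings for two log-ratios, copied from Table 7 (Factors 1-6)
loadings = {
    "Log (SiO2/X)": [0.007, 0.958, 0.125, -0.049, 0.125, -0.004],
    "Log (Cu/X)": [-0.269, -0.354, -0.163, -0.360, -0.499, -0.308],
}

# Communality: sum of squared loadings across the retained factors
communality = {name: float(np.sum(np.square(row))) for name, row in loadings.items()}

# Cumulative proportion of total variance from the first six roots in Table 5;
# the total variance of a 17-variable correlation matrix is 17.
eigenvalues = np.array([5.157, 2.610, 2.041, 1.691, 1.178, 1.010])
cumulative = np.cumsum(eigenvalues) / 17

for name, h2 in communality.items():
    print(f"{name}: h2 = {h2:.3f}")
print("cumulative proportion:", np.round(cumulative, 3))
```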

Results of a second experiment, which utilized 12 semi-quantitatively analyzed trace elements in addition to the variables included in the first experiment, are summarized in Tables 8-11.

Table 4.--17X17 centered log-ratio correlation matrix, log ratio data, Minturn and Sangre de Cristo Formations. N = 50; X, row geometric mean of original data matrix. (Table available as a separate Excel file.)

 

Table 5.--Eigenvalues (roots) of 17X17 log-ratio correlation matrix.

 

           Magnitude    Variance Proportion    Cumulative Variance Proportion
Value 1    5.157        .303                   .303
Value 2    2.610        .154                   .457
Value 3    2.041        .120                   .577
Value 4    1.691        .099                   .676
Value 5    1.178        .069                   .745
Value 6    1.010        .059                   .804
Value 7     .707        .042                   .846
Value 8     .599        .035                   .881

 

Table 6.--Communalities for orthogonal rotations, 6 principal components, 17 variables.

Proportion        h2
Log (SiO2/X)      .951
Log (Al2O3/X)     .951
Log (FeTO3/X)     .801
Log (MgO/X)       .814
Log (CaO/X)       .778
Log (Na2O/X)      .829
Log (K2O/X)       .861
Log (TiO2/X)      .795
Log (P2O5/X)      .725
Log (MnO/X)       .832
Log (LOI/X)       .880
Log (Cu/X)        .698
Log (Mo/X)        .779
Log (Pb/X)        .725
Log (Th/X)        .725
Log (U/X)         .839
Log (Zn/X)        .702

 

Table 7.--Six-factor orthogonal solution, 17 variables, Varimax rotation.

Proportion        Factor 1  Factor 2  Factor 3  Factor 4  Factor 5  Factor 6
Log (SiO2/X)       .007      .958      .125     -.049      .125     -.004
Log (Al2O3/X)      .173      .856      .069     -.077      .405     -.119
Log (FeTO3/X)      .786      .211      .157      .064      .168      .286
Log (MgO/X)        .359      .056      .730     -.109      .363      .075
Log (CaO/X)       -.463     -.506     -.267     -.010      .392     -.288
Log (Na2O/X)      -.046     -.078     -.619     -.170      .364     -.526
Log (K2O/X)       -.037      .900      .092      .192     -.041      .057
Log (TiO2/X)       .841     -.109      .185      .199      .048      .010
Log (P2O5/X)       .605     -.402      .032      .096      .248     -.356
Log (MnO/X)       -.441     -.234     -.327      .555      .349      .216
Log (LOI/X)        .061      .239      .802      .403      .042      .110
Log (Cu/X)        -.269     -.354     -.163     -.360     -.499     -.308
Log (Mo/X)        -.255     -.016     -.182     -.819     -.048      .082
Log (Pb/X)        -.126     -.178     -.080      .084     -.815     -.025
Log (Th/X)         .650      .416     -.053      .168      .019      .313
Log (U/X)          .172     -.058      .153      .115      .124      .868
Log (Zn/X)         .123      .092      .099      .770     -.180      .209

Factor 1--Iron-titanium oxides (detrital grains)--FeTO3, TiO2, P2O5, and Th--placer concentration (provenance).

Factor 2--Potassium feldspar and mica (detrital grains)--SiO2, Al2O3, K2O--chemical maturity (provenance).

Factor 3--Chlorite and/or mica (detrital grains or matrix) versus plagioclase (detrital grains)--MgO, LOI vs Na2O--if detrital mica vs plagioclase destruction, then provenance; if matrix vs plagioclase destruction, then greenschist metamorphism or burial diagenesis.

Factor 4--Manganese oxides vs molybdenite--MnO and Zn vs Mo--weathering of mineralized rock.

Factor 5--Pb--Uninterpreted unique factor.

Factor 6--U--Uninterpreted unique factor.
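
A first pass at reading a loading table can be sketched in Python by listing, for each factor, the variables whose loadings are large in absolute value. The sketch below uses a subset of rows copied from Table 7. The cutoff of 0.5 and the function name `salient` are assumptions made for illustration; the interpretations given above also drew on mineral-major oxide correlations and stratigraphic distribution rather than on a fixed cutoff.

```python
import numpy as np

variables = ["SiO2", "Al2O3", "FeTO3", "K2O", "TiO2", "Mo", "U", "Zn"]
# Rows copied from Table 7 (Factors 1-6); only a subset of the 17 log-ratios
loadings = np.array([
    [ 0.007,  0.958,  0.125, -0.049,  0.125, -0.004],
    [ 0.173,  0.856,  0.069, -0.077,  0.405, -0.119],
    [ 0.786,  0.211,  0.157,  0.064,  0.168,  0.286],
    [-0.037,  0.900,  0.092,  0.192, -0.041,  0.057],
    [ 0.841, -0.109,  0.185,  0.199,  0.048,  0.010],
    [-0.255, -0.016, -0.182, -0.819, -0.048,  0.082],
    [ 0.172, -0.058,  0.153,  0.115,  0.124,  0.868],
    [ 0.123,  0.092,  0.099,  0.770, -0.180,  0.209],
])

CUTOFF = 0.5  # assumed threshold for a "salient" loading; not from the source

def salient(loadings, names, cutoff=CUTOFF):
    """Map factor number -> (positive pole, negative pole) variable lists."""
    result = {}
    for j in range(loadings.shape[1]):
        column = loadings[:, j]
        positive = [n for n, v in zip(names, column) if v >= cutoff]
        negative = [n for n, v in zip(names, column) if v <= -cutoff]
        result[j + 1] = (positive, negative)
    return result

poles = salient(loadings, variables)
for factor, (positive, negative) in poles.items():
    print(f"Factor {factor}: + {positive}  - {negative}")
```

Variables listed at opposite poles of a factor correspond to the "vs" contrasts in the interpretations, for example Zn vs Mo on Factor 4.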

 

Table 8.--29X29 centered log-ratio correlation matrix, log ratio data, Minturn and Sangre de Cristo Formations. N = 50; X, row geometric mean of original data matrix. (Table available as a separate Excel file.)

 

Table 9.--Eigenvalues (roots) of 29X29 log-ratio correlation matrix.

 

            Magnitude    Variance Proportion    Cumulative Variance Proportion
Value 1     5.491        .189                   .189
Value 2     4.780        .165                   .354
Value 3     3.852        .133                   .487
Value 4     2.520        .087                   .574
Value 5     1.945        .067                   .641
Value 6     1.832        .063                   .704
Value 7     1.340        .046                   .750
Value 8      .972        .034                   .784
Value 9      .880        .030                   .814
Value 10     .784        .027                   .841
Value 11     .715        .025                   .866
Value 12     .596        .021                   .887
Value 13     .509        .018                   .905
Value 14     .461        .016                   .921

 

Table 10.--Communalities for orthogonal rotation, 5-7 principal components, 29 variables. h2(5), communality for rotation of five components.

Proportion        h2(5)     h2(6)     h2(7)
Log (SiO2/X)      .914      .915      .917
Log (Al2O3/X)     .723      .811      .875
Log (FeTO3/X)     .605      .613      .762
Log (MgO/X)       .731      .813      .832
Log (CaO/X)       .725      .799      .813
Log (Na2O/X)      .641      .737      .800
Log (K2O/X)       .823      .829      .830
Log (TiO2/X)      .761      .842      .848
Log (P2O5/X)      .441      .735      .764
Log (MnO/X)       .671      .694      .696
Log (LOI/X)       .661      .692      .772
Log (B/X)         .635      .678      .684
Log (Ba/X)        .693      .700      .718
Log (Be/X)        .413      .437      .839
Log (Co/X)        .596      .629      .727
Log (Cr/X)        .725      .726      .782
Log (Cu/X)        .660      .696      .706
Log (La/X)        .669      .669      .675
Log (Mo/X)        .466      .598      .599
Log (Ni/X)        .742      .743      .771
Log (Pb/X)        .486      .603      .645
Log (Sc/X)        .581      .591      .591
Log (Sr/X)        .566      .795      .797
Log (Th/X)        .623      .634      .696
Log (U/X)         .459      .666      .734
Log (V/X)         .714      .723      .727
Log (Y/X)         .631      .761      .766
Log (Zn/X)        .611      .612      .617
Log (Zr/X)        .621      .679      .777

 

Table 11.--Six-factor orthogonal solution, 29 variables, Varimax rotation.

Proportion        Factor 1  Factor 2  Factor 3  Factor 4  Factor 5  Factor 6
Log (SiO2/X)       .071      .901     -.143     -.218      .148      .087
Log (Al2O3/X)     -.027      .863      .086     -.076      .153     -.173
Log (FeTO3/X)      .114      .015      .666      .005      .188      .348
Log (MgO/X)        .428     -.251      .215     -.116      .711      .033
Log (CaO/X)       -.154     -.350     -.208      .414     -.382     -.540
Log (Na2O/X)      -.374     -.005      .117      .088     -.446     -.615
Log (K2O/X)       -.088      .832     -.184     -.003      .173      .258
Log (TiO2/X)      -.136     -.076      .869      .138      .120      .170
Log (P2O5/X)      -.359     -.169      .643      .266      .119     -.279
Log (MnO/X)       -.151     -.157     -.209      .747     -.209     -.038
Log (LOI/X)       -.018      .196      .085      .311      .642      .369
Log (B/X)          .366     -.049     -.005      .060      .715      .166
Log (Ba/X)         .360      .697     -.243     -.049     -.070     -.135
Log (Be/X)        -.095      .179     -.615      .133      .024      .010
Log (Co/X)         .110     -.664     -.103     -.335      .230      .014
Log (Cr/X)         .685     -.431      .007     -.099      .224      .107
Log (Cu/X)        -.660     -.308     -.093     -.304     -.159     -.196
Log (La/X)         .118      .703      .003      .082     -.393     -.024
Log (Mo/X)        -.253     -.093     -.190     -.579     -.350     -.175
Log (Ni/X)         .459     -.241     -.069     -.396      .555      .065
Log (Pb/X)        -.697     -.192     -.106     -.089      .002      .248
Log (Sc/X)         .483     -.254      .431     -.290     -.088     -.124
Log (Sr/X)         .166      .072     -.294      .134     -.154     -.797
Log (Th/X)         .064      .324      .518      .041     -.022      .504
Log (U/X)          .020     -.212      .007      .285     -.084      .730
Log (V/X)          .771      .155     -.182     -.257      .059      .045
Log (Y/X)          .311     -.117     -.049      .040     -.799      .088
Log (Zn/X)        -.250      .107      .188      .465      .182      .503
Log (Zr/X)         .050      .290      .543     -.344     -.219      .361

Factor 1--Mafic vs base trace metals--Cr and V vs Cu and Pb--uninterpreted factor, possibly representing mineralization.

Factor 2--Potassium feldspar and mica (detrital grains)--SiO2, Al2O3, K2O, Ba, and La vs Co--chemical maturity (provenance).

Factor 3--Iron-titanium oxides and zircon (detrital grains)--FeTO3, TiO2, P2O5, Th, and Zr vs Be--near-source placer concentration (provenance).

Factor 4--Manganese oxides vs molybdenite--MnO vs Mo--weathering of mineralized rock.

Factor 5--Chlorite and/or mica--MgO, LOI, B, and Ni vs Y--if detrital mica, then provenance; if matrix, then greenschist metamorphism or burial diagenesis.

Factor 6--Th-U mineral vs plagioclase--Th, U, and Zn vs CaO, Na2O, and Sr--destruction of plagioclase (provenance or metamorphism).