Data report: X-ray fluorescence core scanning of IODP Site U1474 sediments, Natal Valley, Southwest Indian Ocean, Expedition 361

X-ray fluorescence (XRF) core scanning was conducted on core sections from International Ocean Discovery Program Site U1474, located in the Natal Valley off the coast of South Africa. The data were collected at 2 mm resolution along the 255 m length of the splice, but this setting resulted in noisy data. This problem was addressed by applying a 10 point running sum to the XRF spectra before converting peak areas to element intensities. This effectively integrates 10 measurements into 1, representing an average over 2 cm, and significantly reduces noise in the data. Using 25 calibration samples, whose element concentrations were measured by inductively coupled plasma–optical emission spectrometry, the XRF measurements were converted to concentrations using a univariate log-ratio calibration method. The resulting concentrations of terrigenously derived major elements (Al, Si, K, Ti, and Fe) are anticorrelated with Ca concentrations, indicating that the main control on sediment chemistry is the variable proportion of terrigenous to in situ produced carbonate material.

Babin, Daniel P., Franzese, Allison M., Hemming, Sidney R., Hall, Ian R., LeVay, Leah J., Barker, Stephen, Tejeda, Luis and Simon, Margit H. 2020. Data report: X-ray fluorescence core scanning of IODP Site U1474 sediments, Natal Valley, southwest Indian Ocean, Expedition 361. This version is being made available in accordance with publisher policies. See http://orca.cf.ac.uk/policies.html for usage policies. Copyright and moral rights for publications made available in ORCA are retained by the copyright holders.

Introduction
X-ray fluorescence (XRF) core scanning is widely used in paleoceanography and provides qualitative, high-resolution, direct geochemical measurements along the length of a sediment or rock core. Although log ratios of raw counts capture the variability in the data well, there is a clear advantage to calibrating XRF data (Weltje, 2002; Weltje et al., 2015; Weltje and Tjallingii, 2008): calibration enables direct comparison to geochemical data gathered with other methods. The calibration process involves subsampling along the length of the core, determining absolute elemental concentrations using a different geochemical method such as inductively coupled plasma mass spectrometry, and comparing these values to the counts derived by XRF.
Site U1474 of International Ocean Discovery Program (IODP) Expedition 361 was drilled 88 nmi southwest of Durban, South Africa, in the Natal Valley at 31°13.00ʹS, 31°32.71ʹE and 3045 m below sea level and produced a record back to at least 6 Ma (see the Site U1474 chapter [Hall et al., 2017]) (Figure F1). The site is located in the path of the Agulhas Current, a large western boundary current (70 Sv) flowing south along Africa's southeast margin (Lutjeharms, 2006). The record was obtained with the intention of documenting changes in both Agulhas Current variability and southern African paleoclimate. The composition of sediment at Site U1474 is largely terrigenous (55%-65%), and clay abundance is high (see the Site U1474 chapter [Hall et al., 2017]). Likely terrigenous sediment contributors include river systems proximal to the core site such as the Tugela, Mfolozi, and Mkomazi Rivers, which drain the nearby Drakensberg Mountains (see the Site U1474 chapter [Hall et al., 2017]; Simon et al., 2015) (Figure F1). The Natal Valley also receives sedimentation from more distant sources such as the Limpopo and Zambezi Rivers, inferred from the high ⁸⁷Sr/⁸⁶Sr ratio of the >63 μm fraction of Natal Valley sediment, which reflects the Archean age of the Kaapvaal and Zimbabwe Cratons that are eroded by the Zambezi and Limpopo Rivers (Franzese et al., 2006, 2009) (Figure F1). Sediment from these rivers is presumably transported to the Natal Valley via the Mozambique Current, estimated today to have a 15 Sv flux (Ridderinkhof et al., 2010) (Figure F1), and ultimately the Agulhas Current.
Work conducted by Simon et al. (2015) on a piston core collected near Site U1474 for the pre-site survey (CD-154-10-06P; 31°10.36ʹS, 032°08.91ʹE; 3076 m water depth) used the Fe/K ratio of bulk sediment derived from XRF core scanning to monitor the changing character of the terrigenous fraction. Because K is more mobile during weathering than Fe (Kossoff et al., 2012) and the spatial distribution of Fe and K in surface marine sediments tends to reflect the wetness or dryness of proximal sediment sources (Govin et al., 2012), Simon et al. (2015) argued that the changing Fe/K ratio indicated changes in chemical weathering as a consequence of changing hydroclimatic conditions. They also found precessional variation in the intensity of chemical weathering of sediment consistent throughout the 270 ky length of the record. A similar application of XRF core scanning on the 6 My record at Site U1474 could be useful in monitoring hydroclimate on longer timescales.

IODP Proceedings, Volume 361

Methods and materials
In an effort to promote transparency and reproducibility in the geosciences, especially with large and cumbersome data sets such as XRF scans, the entirety of the process detailed in this work is documented on GitHub, a free online software version control platform commonly used for open source projects. This includes all of the raw data files, the XRF spectra for each scan, element signal intensities from the inductively coupled plasma-optical emission spectrometer (ICP-OES), and all the code used to organize the data, calibrate the XRF scans, and produce figures. Each process is presented in a Jupyter Notebook, a web-based interactive computational environment. The notebooks can be examined by any reader in a web browser or downloaded and rerun by Python users. Each notebook can be viewed at https://github.com/danielbabin/U1474_XRF_Data_Report under Notebooks.

XRF core scans
The ITRAX Core Scanner (Croudace et al., 2006) in the Core Repository at Lamont-Doherty Earth Observatory was used to scan 213 archive-half sections composing the Site U1474 splice. Each section was removed from refrigeration 30 min before measurement to allow the core to warm to room temperature. Surface roughness was removed from the core with a plastic card, and the core was covered with 4 μm thick Ultralene plastic film (SPEX CertiPrep, Inc.) before being placed on the scanner. These steps are essential to minimize variation in parameter S (measurement geometry and specimen inhomogeneity) in the XRF calibration (Weltje and Tjallingii, 2008). Warming the core before adding the film helps prevent condensation from forming on the film; condensation absorbs X-rays and particularly biases light elements (Kido et al., 2006). The X-ray illumination area was set at 2 mm in the downcore direction and 2 cm in the crosscore direction, and the scan was run down the center of the split core section half. Measurement spacing was set at 2 mm, voltage at 30 kV, current at 55 mA, and exposure time at 2 s using a Cr tube.

Calibration samples
Discrete 10 cm³ samples were taken from the working halves for (nondestructive) shipboard measurements of moisture and density (MAD). Because water content is one of the key variables that affects measured XRF intensities, we used a subset of the MAD sample residues for this calibration. After XRF scanning was complete, we selected 25 of the MAD residues to capture the range of variability seen in the XRF scan data. Because the MAD samples were taken from Hole U1474A, not the splice, we used the magnetic susceptibility records to tie each MAD sample to a precise depth interval on the archive halves that were used for scanning (Table T1). This correlation for each MAD sample is documented on GitHub in the notebook mad_correlation.ipynb. The 25 discrete samples were digested using a standard lithium metaborate flux-fusion method along with a set of six powdered rock standards and six procedural blanks. Element concentrations were measured using the Agilent 720 axial ICP-OES at the Lamont-Doherty Earth Observatory and American Museum of Natural History inductively coupled plasma-mass spectrometry (ICP-MS) laboratory, for which routine precision is 1%-2%. Signal intensities were corrected for blanks and instrumental drift, then converted to concentrations using the accepted values for the six rock standards (Table T2) (Flanagan, 1984; Imai et al., 1996; McLennan and Taylor, 1980; Pretorius et al., 2006). The raw data and data reduction process for calculating elemental concentrations for the calibration samples are documented on GitHub in the notebook xrf_calibration_samples_reduction.ipynb.
Figure F1. Drainage configuration and oceanography of southern Africa. The Zambezi and Limpopo are Africa's largest rivers draining into the Indian Ocean, discharging 44 and 33 Mt of sediment annually, respectively (Milliman and Meade, 1983). Sediment from those rivers could be transported to Site U1474 first by the Mozambique Channel (MC) eddies transporting 15 Sv (Ridderinkhof et al., 2010), and ultimately by the strong Agulhas Current (AC) (Lutjeharms, 2006). The Tugela River and other smaller rivers drain the Drakensberg Mountains, flanking the southeastern African margin. The mountains are composed of dominantly sedimentary rock of Paleozoic age called the Karoo Supergroup (Johnson et al., 1996). These sediments could all be deposited into the southwest-northeast-striking Natal Valley (NV), the location of Site U1474. Yellow rings = MC eddies. SEC = South Equatorial Current. SEMC = South East Madagascar Current.
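The conversion of ICP-OES signal intensities to concentrations described in this section can be sketched for a single element as follows. This is a minimal illustration, not the procedure in xrf_calibration_samples_reduction.ipynb: the function name and arguments are hypothetical, the standards are assumed to define a straight calibration line fitted by ordinary least squares, and the instrumental drift correction is omitted.

```python
import numpy as np

def calibrate_intensities(std_intensity, std_accepted, blank_intensity, sample_intensity):
    """Convert ICP-OES signal intensities to concentrations for one element.

    Hypothetical helper: subtracts the mean procedural blank, fits a line
    through the rock standards (accepted concentration vs. blank-corrected
    intensity), and applies that line to the samples.
    Drift correction is NOT included in this sketch.
    """
    blank = np.mean(np.asarray(blank_intensity, dtype=float))
    x_std = np.asarray(std_intensity, dtype=float) - blank
    y_std = np.asarray(std_accepted, dtype=float)

    # Assumption: a linear response over the range spanned by the standards.
    slope, intercept = np.polyfit(x_std, y_std, 1)

    x_smp = np.asarray(sample_intensity, dtype=float) - blank
    return slope * x_smp + intercept
```

In practice the fit would be repeated for each element, using the accepted values listed in Table T2.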

Calcium carbonate percent
An aliquot of the bulk moisture and density samples was powdered and measured for percent CaCO₃ using a UIC CM5012 CO₂ coulometer at Lamont-Doherty Earth Observatory's Core Repository. Precision is determined to be <4% by replicate measurements every 10 samples. These measurements of percent CaCO₃ are combined with measurements of CaCO₃ performed aboard R/V JOIDES Resolution. These data are listed in Table T3.

Preprocessing: running sum spectra
For this study, we focus only on the major elements detectable with the ITRAX: Al, Si, K, Ca, Ti, and Fe. The results of our XRF core scanning at 2 mm resolution can be found in Table T4. Unfortunately, even for the major elements, the low dwell time chosen (2 s) yielded noisy records, especially for light elements like Al (Figure F2). The minimum recommended dwell time with the ITRAX core scanner is 2 s. We chose this short dwell time, applied at a high resolution (2 mm), because we had a large volume of core to analyze and limited funds for analysis time. Fortunately, our high-resolution scanning allowed us to integrate the raw spectral data across a wider depth interval, increasing the counts in each spectrum and improving the signal-to-noise ratio.
To accomplish this, we accessed the raw spectral data, stored as an .spe file created with each measurement. Each file contains a single vector of intensities at different energies. We imported the spectral intensities from every .spe file in a section (~750 for 1.5 m of core) into a single data structure and took a 10 measurement running sum at each energy level across that data structure. Because the measurements were collected every 2 mm with a 2 s exposure, the 10 measurement running sum simulates a 20 s exposure over 2 cm. The section "offset" assigned to each integrated measurement is the depth of the median (fifth) measurement of the 10 exposures integrated. This process was repeated for each of the 213 sections scanned. Some sections had measurements with very low total spectral intensities (the sum of every value in the spectrum). These resulted from measuring air at the tops or bottoms of sections or at cracks in the core. All measurements with total spectral intensities lower than 300,000 were dropped for this reason. The Python code implementing these steps is available on GitHub in the notebook xrf_preprocessing.ipynb.
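The running sum described above can be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the code in xrf_preprocessing.ipynb: the function name and array layout are hypothetical, reading the .spe files is left out because their layout is not described here, and the 300,000 threshold is assumed to apply to the integrated spectra.

```python
import numpy as np

def running_sum_spectra(spectra, offsets, window=10, min_total=300_000):
    """Integrate consecutive XRF spectra from one section with a running sum.

    spectra : array (n_measurements, n_channels), raw counts per energy channel
    offsets : array (n_measurements,), section offset of each measurement
    Returns the integrated spectra and the offset assigned to each of them.
    """
    spectra = np.asarray(spectra, dtype=float)
    offsets = np.asarray(offsets, dtype=float)
    n_meas, n_chan = spectra.shape
    if n_meas < window:
        return np.empty((0, n_chan)), np.empty(0)

    # A cumulative sum down the section turns each window sum into a difference.
    csum = np.vstack([np.zeros((1, n_chan)), np.cumsum(spectra, axis=0)])
    summed = csum[window:] - csum[:-window]  # (n_meas - window + 1, n_chan)

    # Each window takes the offset of its fifth measurement when window == 10.
    mid = window // 2 - 1
    summed_offsets = offsets[mid:mid + summed.shape[0]]

    # Assumption: low-count windows (air, cracks) are dropped after integration.
    keep = summed.sum(axis=1) >= min_total
    return summed[keep], summed_offsets[keep]
```

With ten 2 mm, 2 s measurements per window, each output row corresponds to a simulated 20 s exposure over 2 cm.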
For the ITRAX core scanner, converting a spectrum of energy intensities to element intensities involves fitting the observed spectrum from a measurement with a model and computing the relevant area under the spectral peaks. This is done using the software Q-Spec, provided by ITRAX and Cox Analytical Systems for XRF data reduction (Croudace et al., 2006). To optimize the fit for several measurements (for example, along the length of a core section), the ITRAX software sums the spectra from each measurement in a section and outputs a "sumspectra" .spe file. To optimize the model fit for our 213 sections, we averaged the intensities of our 213 sumspectra files. This average sumspectra was then fit using the Q-Spec software. We chose elements for the spectral model that minimized mean-squared error (MSE). The parameters keV/channel, energy offset, full width at half maximum (FWHM) slope, FWHM offset, and FWHM cutoff were also optimized to minimize MSE. We found that this method provides the most continuous record of element intensity in depth space, minimizing jumps between sections. The MSE parameter provides a continuous assessment of how well the model fits each measurement. The results of this data processing can be found in Table T5. The Q-Spec settings file (.dfl file type) that we used for our data reduction can be found in the GitHub repository for this paper under the subdirectory Settings.
Section positions were first converted into core depth below seafloor, Method A (CSF-A), and then into the CCSF scale used for the splice. No preprocessing steps, outlier removal, or filtering were
Although we only focus on major elements here (Al, Si, K, Ca, Ti, and Fe), an extensive suite of elements not suitable for calibration is available in the full XRF data tables (T4, T5). Table T4 contains the 2 mm, 2 s XRF data. Table T5 contains the 2 cm, 20 s integrated XRF data set. The XRF data are shown by depth in Figures F2 and F3. Figure F2 shows individual element intensities for both the 2 cm, 20 s data set (full color) and the 2 mm, 2 s data set (paler colors). Figure F3 presents element intensities from both data sets normalized to Ca intensity. These figures clearly show the noise reduction offered by the 2 cm integration of the raw spectra.

XRF core scanning data calibration to element concentrations
The relative abundances provided by the XRF in "counts" were converted to concentrations with a calibration data set of 25 samples (Table T6; Figures F4 and F5). The conversion was made using the univariate log-ratio calibration (ULC) of Weltje and Tjallingii (2008) and used Ca counts and concentrations in the denominator of each element ratio. We chose the simpler ULC over the more sophisticated multivariate log-ratio calibration of Weltje et al. (2015) because of the relatively sparse spacing of our calibration samples compared to the total length of our core. Concentrations are reported as weight percent element oxides. The Python code used for the calibration process, including the calculation of the α and β parameters, is available on GitHub in the notebook xrf_calibration.ipynb. Model parameters α and β are available in Table T7.
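For one element, the log-ratio calibration with Ca in the denominator takes the form ln(W_el/W_Ca) = α ln(I_el/I_Ca) + β, where W is concentration and I is XRF intensity. The sketch below estimates α and β from calibration samples. It assumes ordinary least squares as the estimator, which may differ from the one used in xrf_calibration.ipynb, and the function names are hypothetical.

```python
import numpy as np

def fit_ulc(counts_el, counts_ca, conc_el, conc_ca):
    """Estimate alpha and beta in ln(W_el/W_Ca) = alpha * ln(I_el/I_Ca) + beta.

    counts_* : XRF intensities at the calibration-sample depths
    conc_*   : concentrations measured on the calibration samples
    Assumption: ordinary least squares on the log ratios.
    """
    x = np.log(np.asarray(counts_el, dtype=float) / np.asarray(counts_ca, dtype=float))
    y = np.log(np.asarray(conc_el, dtype=float) / np.asarray(conc_ca, dtype=float))
    alpha, beta = np.polyfit(x, y, 1)
    return alpha, beta

def predict_log_ratio(counts_el, counts_ca, alpha, beta):
    """Apply the fitted calibration to every scanned measurement."""
    x = np.log(np.asarray(counts_el, dtype=float) / np.asarray(counts_ca, dtype=float))
    return alpha * x + beta
```

The fit is repeated once per element (Al, Si, K, Ti, and Fe), each time against Ca.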
The ULC is designed to calibrate element ratios, not necessarily individual element concentrations. However, if the elements in the collection of ratios are assumed to compose the entire sample, the relative concentrations of each element can be calculated. To ensure that the resulting concentrations derived from the XRF core scanning data reflect the concentrations of our calibration samples as closely as possible, we normalized the sum of each calibrated XRF measurement to 96.2%, the average sum of Al₂O₃ + SiO₂ + K₂O + CaO + TiO₂ + Fe₂O₃ in the 25 calibration samples. This simply reflects the fact that the XRF did not measure Na₂O and MgO, which sum to 3.8% on average in the calibration data set.
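The step from calibrated ratios to individual concentrations can be sketched as follows. The function name and column layout are hypothetical; the only inputs taken from the text are the Ca denominator and the 96.2% total.

```python
import numpy as np

def ratios_to_concentrations(log_ratios, total=96.2):
    """Turn calibrated ln(W_j / W_Ca) values into wt% oxide concentrations.

    log_ratios : array (n_measurements, n_elements), one column per non-Ca element
    Returns an array (n_measurements, n_elements + 1); the last column is CaO.
    Each row sums to `total`.
    """
    ratios = np.exp(np.asarray(log_ratios, dtype=float))
    # Ca relative to itself is 1, so append a column of ones for CaO.
    ratios = np.hstack([ratios, np.ones((ratios.shape[0], 1))])
    # Rescale each row so the measured oxides sum to the chosen total.
    return total * ratios / ratios.sum(axis=1, keepdims=True)
```

Because every ratio shares the same denominator, dividing by the row sum recovers each oxide's share of the total without needing the absolute Ca concentration.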
The results of the calibration are shown by depth in Figures F4 and F5. Figure F4 shows individual element concentrations. Figure F5 presents element concentrations normalized to Ca concentration. Solid circles along each depth series are the concentrations and element ratios from the calibration samples.
Figure F2 shows individual element intensities and indicates a few sections with unusually low counts, highlighted by shaded bars (Sections 361-U1474D-4H-1, 8H-3, 11H-4, and 11H-5). We attribute these anomalous measurements to malfunctions with the ITRAX scanner. The sections cannot be rescanned because these anomalous data were not discovered until after the cores were shipped back to Kochi Core Center in Japan. Counts of individual elements are affected by this malfunction, whereas ratios appear less affected (Figures F3, F6). Figure F6 shows, on the left, signal intensity for each major element across a section with a scanning error and low counts (Section 8H-3; section signal colored orange). The right panels show how, when plotted as element ratios, the ratios from the adjacent core sections join together, making the element ratio-depth series continuous. As ratios are the typical application of XRF data, we argue these sections may be used with caution. Figure F7 compares the concentrations measured on the calibration samples with the XRF measurements at the appropriate depth and includes the ratios of the raw counts (Figure F7A), the log ratios of the calibration sample concentrations and the XRF counts (Figure F7B), and the concentration ratios of the calibration samples and the calibrated XRF ratios (Figure F7C). For each element ratio, there is a strong correlation with an intercept at zero, within bootstrapped uncertainty.
Figure F3. Raw XRF data by depth for each element normalized to Ca counts: (A) Al/Ca, (B) Si/Ca, (C) K/Ca, (D) Ti/Ca, (E) Fe/Ca. Pale colors = 2 mm, 2 s data set. Dark colors = integrated 2 cm, 20 s data set. Shaded vertical bars = anomalously low counts (which do not stand out when presented in ratio form; see Figure F6).

Data quality
As mentioned previously, the ULC works best for ratios, but individual element concentrations can be derived if the collection of ratios is assumed to compose the entire sample, summed and normalized. Figure F8 shows how effectively our calibration yielded individual element concentrations. Most elements have high-quality calibrations with R² coefficients greater than 0.53 and intercepts at or near zero within bootstrapped uncertainty intervals. Only titanium has a lower R² coefficient. Titanium counts have a strong relationship with concentrations from the calibration samples; however, the relationship with XRF-derived concentrations is weaker.
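Whether an intercept is at zero within bootstrapped uncertainty can be checked by refitting the regression to resampled calibration points. The sketch below is a standalone NumPy illustration with a hypothetical function name. It reports percentile intervals on the slope and intercept, which is related to but not the same as the confidence band drawn around the regression line in the figures.

```python
import numpy as np

def bootstrap_line(x, y, n_boot=1000, ci=95, seed=0):
    """Percentile bootstrap intervals for the slope and intercept of y = m*x + b.

    Pairs (x, y) are resampled with replacement n_boot times and a
    least-squares line is fitted to each resample.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    n = x.size

    fits = np.empty((n_boot, 2))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)         # resample sample indices
        fits[b] = np.polyfit(x[idx], y[idx], 1)  # [slope, intercept]

    tail = (100 - ci) / 2
    lo, hi = np.percentile(fits, [tail, 100 - tail], axis=0)
    return {"slope": (lo[0], hi[0]), "intercept": (lo[1], hi[1])}
```

An intercept is consistent with zero at the chosen level when the returned interval spans zero.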

Description of results
For marine sediment, CaO concentration at Site U1474 is low, ranging from 13 to 41 wt% (Figure F4D). CaO concentrations derived from CaCO₃ coulometry (Table T3) are plotted alongside the XRF data as open squares (Figure F4D), and the CaO concentrations from both methods at identical depths are crossplotted in Figure F9. The results indicate that most of the Ca is borne within CaCO₃. By calculating the difference between total CaO and CaCO₃-borne CaO for each sample, we estimate an average terrigenous CaO content of approximately 5%. CaO abundance is lowest between 100 and 170 m CCSF. There is a large excursion toward higher CaO deposition between 10 and 20 m CCSF.
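The conversion behind this comparison is a molar-mass scaling: each unit of CaCO₃ by weight carries about 0.56 units of CaO. A minimal sketch follows, with hypothetical function names and standard atomic masses.

```python
# Standard atomic masses (g/mol)
M_CA, M_C, M_O = 40.078, 12.011, 15.999
M_CAO = M_CA + M_O              # about 56.08 g/mol
M_CACO3 = M_CA + M_C + 3 * M_O  # about 100.09 g/mol

def caco3_to_cao(caco3_wt_pct):
    """CaO (wt%) carried by a given CaCO3 content (wt%)."""
    return caco3_wt_pct * M_CAO / M_CACO3

def terrigenous_cao(total_cao_wt_pct, caco3_wt_pct):
    """CaO (wt%) not accounted for by carbonate, attributed to terrigenous phases."""
    return total_cao_wt_pct - caco3_to_cao(caco3_wt_pct)
```

Applying terrigenous_cao to each sample that has both an XRF-derived CaO value and a coulometric CaCO₃ value, then averaging, gives the kind of estimate quoted above.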
Elements derived from the continental crust, borne in terrigenous phases, vary inversely with Ca content (Figure F4). The element with the greatest variability in the core scan data set is K (Figure F4C), varying by more than a factor of 2 through the core. From 250 to 175 m CCSF, K content is low, about 1.25 wt%, compared to the rest of the record (Figure F4C). From 170 to 100 m CCSF, K content increases to ~1.75 wt%, overprinted by two low-frequency excursions to even higher values. From 100 to 20 m CCSF, K concentration declines gradually to a minimum of ~0.75 wt%. Above this interval, K content increases again. Long-term trends in variability of the other terrigenously derived elements (Al, Si, Ti, and Fe) largely follow the pattern for K described above (Figure F4A, F4B, F4E, F4F). The covariance of the terrigenously derived elements and their anticorrelation with calcium carbonate indicate that the main control on the chemical composition of sediment at Site U1474 is the variable input of terrigenous material.
Figure F6. The XRF scanner experienced problems during the scans of four core sections: 361-U1474D-4H-1, 8H-3, 11H-4, and 11H-5. All panels: orange = signal from Section 8H-3, blue = adjacent sections. Left: when only the signal intensity is plotted, Section 8H-3 has low counts that do not fit the depth series. Right: element ratios are contiguous with the adjacent sections in depth series.

Figure F6 panels (Section 361-U1474D-8H-3): Al, Si, and K counts; Al/Ca, Si/Ca, K/Ca, and Ti/Ca ratios (counts).
Figure F7. Comparison of counts derived from the XRF spectra to the concentration ratios from the calibration samples after each step in the XRF data reduction procedure provides a quality check. Shaded intervals = 95% confidence interval of the regression model computed by bootstrapping the data 1000 times. A. 2 cm, 20 s data set, unprocessed. B. Log ratios of both concentrations and counts. C. Calibrated ratios. Generated using the Python package Seaborn's regplot function.
Figure F8. Comparison of XRF data to calibration samples using individual element concentrations. Shaded intervals = 95% confidence interval of the regression model computed by bootstrapping the data 1000 times. A. Before calibration. B. After calibration. Generated using the Python package Seaborn's regplot function.