Evaluating machine learning approaches to identify and predict oil and gas produced water lithium concentrations
Abstract
Demand for battery-grade lithium has increased substantially in recent years, largely because of electrification of the transportation sector. The search for new lithium sources has turned to produced waters (frequently brines), a large-volume wastewater by-product of oil and gas extraction. Geochemical analysis indicates varying concentrations of lithium in produced water samples collected across the United States and represented in the U.S. Geological Survey’s National Produced Water Geochemical Database, as well as in mixtures of Marcellus Shale produced water included in the Pennsylvania Department of Environmental Protection’s Oil and Gas Well Waste Reports. We first examined whether the geochemical signature of lithium-bearing produced waters is sufficiently distinct for machine learning (ML) to correctly classify samples by formation of origin. The produced water sample data used to assess classification accuracy were from oil and gas wells in the Marcellus Shale, the Utica Shale and Point Pleasant Formation (Utica), and the Smackover Formation. Further, we evaluated the potential for ML to accurately classify Marcellus Shale produced water spatially (i.e., northeast versus southwest Pennsylvania). We then investigated whether ML algorithms applied to a suite of geochemical concentration data (i.e., Ba, Br, Cl, K, Mg, Sr) may be used to predict the lithium concentration of an unknown sample. Finally, we applied an estimated economic lithium grade cutoff of 150 milligrams per liter (mg/L) and assessed the utility of ML to predict whether a produced water sample would fall above or below the grade cutoff based on the suite of geochemical parameters. Four machine learning algorithms were assessed: Random Forest (RF), Gradient Boosting Trees (GBT), Extreme Gradient Boosting (XGBoost), and Deep Neural Networks (DNN).
This study demonstrates that all four machine learning methods can precisely and accurately estimate lithium concentrations and classify samples by geologic formation of origin. The products of this study contribute to the growing body of knowledge aimed at expanding the lithium resource base within the United States.
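The grade-cutoff step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or workflow: the function names, the strict "greater than" reading of the cutoff, and all sample values are assumptions made for illustration. It takes measured and model-predicted lithium concentrations, labels each as above or below the 150 mg/L cutoff, and scores how often the labels agree.

```python
# Illustrative sketch only. The concentrations below are invented for the
# example and are NOT data from the study or from any USGS database.

CUTOFF_MG_PER_L = 150.0  # economic grade cutoff quoted in the abstract


def above_cutoff(li_mg_per_l, cutoff=CUTOFF_MG_PER_L):
    """Return True when a lithium concentration exceeds the cutoff.

    Treating a value exactly at the cutoff as "below" is an assumption.
    """
    return li_mg_per_l > cutoff


def score_cutoff_predictions(measured, predicted, cutoff=CUTOFF_MG_PER_L):
    """Compare above/below-cutoff labels from measured and predicted Li."""
    tp = fp = tn = fn = 0
    for m, p in zip(measured, predicted):
        actual = above_cutoff(m, cutoff)
        guess = above_cutoff(p, cutoff)
        if actual and guess:
            tp += 1  # correctly flagged as above the cutoff
        elif guess:
            fp += 1  # predicted above, measured below
        elif actual:
            fn += 1  # predicted below, measured above
        else:
            tn += 1  # correctly flagged as below the cutoff
    total = tp + fp + tn + fn
    nan = float("nan")
    return {
        "confusion": {"tp": tp, "fp": fp, "tn": tn, "fn": fn},
        "accuracy": (tp + tn) / total if total else nan,
        "precision": tp / (tp + fp) if (tp + fp) else nan,
        "recall": tp / (tp + fn) if (tp + fn) else nan,
    }


# Hypothetical measured and predicted lithium concentrations in mg/L.
measured = [40.0, 95.0, 160.0, 210.0, 148.0, 305.0]
predicted = [55.0, 110.0, 140.0, 230.0, 152.0, 290.0]

scores = score_cutoff_predictions(measured, predicted)
print(scores)
```

In this made-up example the two samples close to 150 mg/L are the ones that are mislabeled, which is why reporting precision and recall alongside overall accuracy can be informative for a cutoff-based screen.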
Suggested Citation
Attanasi, E., McDevitt, B., Freeman, P., Coburn, T., 2026, Evaluating machine learning approaches to identify and predict oil and gas produced water lithium concentrations: Data Science in Science, v. 5, no. 1, 2624195, 18 p., https://doi.org/10.1080/26941899.2026.2624195.
| Publication type | Article |
|---|---|
| Publication Subtype | Journal Article |
| Title | Evaluating machine learning approaches to identify and predict oil and gas produced water lithium concentrations |
| Series title | Data Science in Science |
| DOI | 10.1080/26941899.2026.2624195 |
| Volume | 5 |
| Issue | 1 |
| Publication Date | February 06, 2026 |
| Year Published | 2026 |
| Language | English |
| Publisher | Taylor & Francis |
| Contributing office(s) | Geology, Energy & Minerals Science Center |
| Description | 2624195, 18 p. |
| Country | United States |
| State | Alabama, Arkansas, Florida, Georgia, Louisiana, Mississippi, Oklahoma, South Carolina, Texas |