U.S. Geological Survey Data Strategy 2023–33

Circular 1517
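By: V.B. Hutchison, T.E. Burley, K.W. Blasch, P.E. Exter, G.L. Gunther, A.J. Shipman, C.M. Kelley, and C.A. Morris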

Acknowledgments

Data Advisory Board members, who represent the breadth of the U.S. Geological Survey (USGS), were instrumental in developing the USGS Data Strategy. The authors thank many groups across the USGS, representing all aspects of our data community, for their contributions to this USGS Data Strategy.

The USGS recognizes the National Oceanic and Atmospheric Administration for their leadership in developing a comprehensive data strategy, one that inspired the USGS in our efforts to develop our own.

Abstract

The U.S. Geological Survey (USGS) has long recognized the strategic importance and value of well-managed data assets as an integral component of scientific integrity and foundational to the advancement of scientific research, decision making, and public safety. The USGS investment in the science lifecycle, including collection of unbiased data assets, interpretation, peer review, interpretive publications, and data release, ultimately contributes to the transparency and availability of science. Foundational Government directives and laws, such as the Foundations for Evidence-Based Policymaking Act of 2018 (Public Law 115–435, 132 Stat. 5529) as well as Executive Order 13642, “Making Open and Machine Readable the New Default for Government Information,” provide a framework for addressing strategic data management. The USGS Data Strategy builds on that framework by outlining high-level goals and objectives that serve as a long-term, decadal guide toward achieving a broad, data-focused vision.

Benefits of the USGS Data Strategy are many. The USGS will contribute to open science by increasing efficiencies in the consistent management of valuable data assets; driving innovation that results in modernized capabilities to ensure data are analysis ready; increasing data skills across the Bureau to enhance workforce data literacy; broadening capacity to understand and address needs of stakeholders; and measuring progress in producing findable, accessible, interoperable, and reusable (FAIR) data products.

The major goals and objectives of the USGS Data Strategy are to maximize the utility of USGS data based on stakeholder needs, promote data innovation, coordinate common data practices, modernize the USGS enterprise data architecture, and enhance the Bureau’s data-centric culture. The goals and objectives in the strategy align with other Bureau strategic plans and with guidance and directives from the Department of the Interior and the Federal Government. This strategy is a key component in strengthening the Bureau’s data ecosystem to ensure a relevant, long-term capacity that supports internal needs and helps the Bureau achieve its scientific mission in the most efficient and effective manner.

Background and Purpose

A global movement is underway to increase access to data to enable greater transparency, utility, and integration capabilities to expand the scope and scale of science. The increased access to scientific data has resulted in a need to maximize investment in Federal data through adoption of findable, accessible, interoperable, and reusable (FAIR) principles (Wilkinson and others, 2016). Federal science agencies are recognizing the importance of open and reusable science and are establishing mechanisms and policies to support scientists in the release of scientific data (Holdren, 2013; National Oceanic and Atmospheric Administration, 2020; U.S. Department of State, 2021; Nelson, 2022; U.S. Department of Commerce, 2022; U.S. Geological Survey [USGS], 2023).

The USGS provides science about the natural hazards that threaten lives and livelihoods; water, energy, minerals, and other natural resources that humans rely on; health of ecosystems and environment; and impacts of climate and land-use change. Created by an act of Congress in 1879, the USGS is the primary water, earth, and biological science and civilian mapping bureau in the Department of the Interior (USGS, 2021b). The USGS provides stakeholders, scientists, and the public with timely, accurate data needed for decision making, from critical emergencies to land management planning. The “U.S. Geological Survey 21st-Century Science Strategy 2020–2030” (USGS, 2021b) describes the increasing complexity of scientific challenges and the importance of evolving all aspects of Bureau capacity, including our management of all data assets, to support our mission to provide integrative, predictive, multidisciplinary science. Data assets that are appropriately coupled with each other, managed, and of high quality represent the outputs of a workforce that is knowledgeable in data and research activities and possesses the skills to deliver value from those data. The USGS prides itself on its scientific integrity and its ability to conduct unbiased science and data collection.

With data as its primary currency, the USGS has long recognized the strategic importance and value of well-managed data assets as an integral component of scientific integrity. The USGS invests in the science lifecycle, including collection of unbiased data assets, interpretation, peer review, interpretive publications, and data release, thereby contributing to the overall advancement and transparency of scientific research used in decision making and for public safety. In 2016, the USGS published a set of four data policies as a part of Fundamental Science Practices (FSP) that provide guidance to review, release, and preserve scientific data. In FSP, the term “data” is defined as “Observations or measurements (unprocessed or processed) represented as text, numbers, or multimedia” (USGS, undated a). Data are considered noninterpretive information (USGS, 2017). Within the USGS Data Strategy, scientific data are recognized in the broadest sense to include the variable(s) collected, standard metadata, transformations and subsequent interpretations of the data, quality assurance and quality control, constructs for the data such as projections and datums, calibration data associated with the collection, and more. This broad understanding pertains to all fields of science and processes for data collection and analysis from sources such as laboratory, field, drone, and satellite.

Scientific data management and release applications, such as trusted digital repositories (USGS, undated b) and tools supporting metadata and identifiers, in addition to best practices and trainings, enable scientists and other USGS personnel to efficiently apply the data lifecycle (fig. 1; Faundeen and others, 2013; USGS, 2021a). As a result, scientists have made thousands of datasets available to the public. Foundational Government directives and laws, such as the Foundations for Evidence-Based Policymaking Act of 2018 (Public Law 115–435, 132 Stat. 5529) as well as Executive Order 13642, “Making Open and Machine Readable the New Default for Government Information” (Obama, 2013), provide a framework for addressing strategic data management. The USGS Data Strategy builds on that framework by outlining high-level goals and objectives that serve as a long-term, decadal guide toward achieving a broad, data-focused vision. Each goal will have its own implementation plans that are carefully aligned and coordinated.

Figure 1. Diagram showing U.S. Geological Survey Science Data Lifecycle Model (modified from Faundeen and others, 2013). The model has the following steps: plan, acquire, process, analyze, preserve, and publish or share.
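For readers who track datasets programmatically, the six lifecycle steps named in figure 1 can be represented as an ordered enumeration. The following Python sketch is illustrative only. The names LifecycleStep, next_step, and remaining_steps are hypothetical, and treating the steps as a strictly linear sequence is a simplifying assumption rather than a statement of how the model must be applied.

```python
from enum import IntEnum
from typing import List, Optional


class LifecycleStep(IntEnum):
    """The six steps listed for figure 1, in the order given there."""
    PLAN = 1
    ACQUIRE = 2
    PROCESS = 3
    ANALYZE = 4
    PRESERVE = 5
    PUBLISH_OR_SHARE = 6


def next_step(step: LifecycleStep) -> Optional[LifecycleStep]:
    """Return the step after `step`, or None once the last step is reached.

    Assumption: steps are visited in a simple linear order.
    """
    if step is LifecycleStep.PUBLISH_OR_SHARE:
        return None
    return LifecycleStep(step + 1)


def remaining_steps(step: LifecycleStep) -> List[LifecycleStep]:
    """List every step that still lies ahead of `step`."""
    return [s for s in LifecycleStep if s > step]


if __name__ == "__main__":
    current = LifecycleStep.ANALYZE
    print("Next:", next_step(current).name)
    print("Remaining:", [s.name for s in remaining_steps(current)])
```

An ordered enumeration keeps the step names in one place, so a hypothetical tracking tool could record a dataset's current step without relying on free-text labels.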

The USGS Data Strategy serves to provide a consistent approach to managing the lifecycle of USGS data such that their maximum value can be realized. These data include both public and internal scientific and nonscientific data (for example, business operations). The strategy maintains a focus specifically on data, while not excluding the critical importance of the analysis and interpretation of those data and resulting products by USGS scientists and personnel that result in high-impact scientific conclusions. With common data goals, the USGS can reach common data successes.

The USGS Data Strategy includes the following:

  • Continuing to ensure scientific integrity based on USGS FSP;

  • Providing leadership and support to ensure consistency in data practices and policies across the Bureau;

  • Delivering high-quality, peer-reviewed metadata to support the proper use and interpretation of our data products by any user;

  • Modernizing the USGS data infrastructure to increase interoperability, scalability, and timeliness of data availability;

  • Understanding the needs of our stakeholders through co-production and participatory design where possible;

  • Delivering analysis-ready, integratable data;

  • Continuing to build a workforce of the future emphasizing a strong data-centric culture.

Implementation of the USGS Data Strategy across the Bureau will benefit the USGS by

  • Contributing to open science by increasing efficiencies in the consistent management of valuable data assets throughout the data lifecycle, enabling increased access, transparency, and usefulness of USGS data products for current and future generations;

  • Measuring our progress in producing FAIR data products;

  • Driving innovation that results in modernized capabilities to ensure data are analysis ready;

  • Increasing data skills across the Bureau to enhance workforce data literacy;

  • Continuing to streamline the data release and delivery process through increased automation and updated data management tools;

  • Broadening our capacity to understand and deliver actionable data to support the needs of our stakeholders;

  • Increasing our ability to accelerate our contributions to open science and the advancement of scientific discovery.

The goals and objectives of the USGS Data Strategy will be interwoven into all aspects of USGS science, information systems, and business operations to achieve greater organizational efficiency in fulfilling the USGS mission. For example, the Bureau will work to implement FAIR practices and see an increase in the data-empowered workforce through education and training to expand data consistency and dependability (Lightsom and others, 2022). Leveraging and updating data management tools and procedures for each of the mission areas, regions, and science support functions will improve data collection and data integration activities. Furthermore, USGS datasets and delivery mechanisms that have achieved FAIR success should be identified and showcased as examples for the execution of the strategy. Achievement of these goals will require engagement with senior leadership, coordination across USGS offices, and participation by all employees.

USGS Data Strategy Goals and Objectives

Data Strategy Vision: Broaden the USGS data-centric culture to increase the production of unbiased, accessible, high-quality, and interoperable data, managed as a strategic asset and offered at scales and timeframes relevant to society’s needs.

The USGS Data Strategy focuses on a series of goals and objectives that are intended to be implemented over a decadal period and build on existing efforts wherever possible. Each of the goals and objectives in table 1 is designed to provide building blocks toward continuous improvement in the management and delivery of USGS data to realize their maximum potential.

Table 1. Strategic goals and objectives.

[EDGE, Equipment Development Grade Evaluation; FAIR, findable, accessible, interoperable, and reusable; FSP, Fundamental Science Practices; USGS, U.S. Geological Survey]

Goal Objectives
Goal 1: Maximize the utility of USGS data by addressing stakeholder needs and openly sharing analysis-ready data based on FAIR practices. This goal will ensure USGS data are collected, prepared, and made available to a diverse audience of decision makers, members of the scientific community, and the public in the most efficient, FAIR, and usable formats for those audiences.
1.1. Evaluate, monitor, and gather feedback from stakeholders (internal and external) about improvements needed to maximize the value of USGS data.
1.2. Enable processes and policies that ensure USGS data are shared appropriately inside the USGS, with other Federal agencies, with partners (government and nongovernment), and with the public.
1.3. Evaluate USGS data to determine how well data and databases meet FAIR principles (for example, the USGS State of the Data Assessment, assessing business operations information [Hutchison and others, 2023]).
1.4. Ensure all USGS data are identified, discoverable, and shared appropriately to maximize their utility.
1.5. Increase digital access to and management of legacy data as appropriate.
1.6. Ensure all USGS science data adhere to recognized national, international, departmental, and bureau standards for collection, quality assurance and control, transformation, sharing, and archival.
Goal 2: Promote and encourage innovation in data, technology, and quality improvements to provide a comprehensive, evolving, and flexible data ecosystem. This goal will elevate innovations in data techniques that effectively allow sharing and leveraging data within and outside of the USGS, ensuring the USGS remains on the cutting edge of technical and nontechnical solutions for data understanding, use, and delivery, supported by modern technical architectures.
2.1. Automate and innovate established processes to support modern data needs of stakeholders.
2.2. Promote, support, and reward innovative data solutions developed within the USGS that exploit the utility of data, add value, increase equity, and can be leveraged and operationalized across the Bureau and the Department of the Interior.
2.3. Embrace innovation and flexibility in data collection, presentation, preservation, and infrastructure for evolving data needs by leveraging the breadth of USGS community of practice approaches (for example, the Community for Data Integration) and staffing capabilities (for example, the EDGE program).
2.4. Innovate and modernize approaches to collection devices, equitable data sharing, hosting, and communication capabilities.
Goal 3: Establish common data policies, methods, and standards for managing and accessing data assets in accordance with FSP. This goal will align tools, processes, services, policies, and standards to support the scientific goals and mission of the USGS. It will provide a path to address common data management needs and provide practicable solutions at all scales, for both science and science support, in accordance with FSP.
3.1. Develop and implement a unifying Bureau data governance policy, including roles of data stewardship as defined by the Federal Data Strategy (Federal Data Strategy Development Team, 2021).
3.2. Use FAIR data and information principles throughout the Bureau to effectively share scientific information, eliminate database redundancies, increase efficiency and quality of data calls, and promote consistent understanding and use of USGS data.
3.3. Provide clear, consistent procedures for data and information management to ensure world-class scientific data quality based on USGS FSP and quality management systems requirements and realize the utility of well-managed internal business operations systems.
3.4. Establish a coordinated approach across the Bureau to collect, transform, archive, and deliver data intended for multiple and differing partners and stakeholder purposes (external and internal).
Goal 4: Provide a robust, scalable, and stable enterprise data architecture designed and maintained to enable advanced capabilities of stewardship, use, and delivery of USGS data. This goal will ensure that the USGS has the capabilities, technology, and processes in place to create an efficient, compliant, and usable data environment for the Bureau and its stakeholders.
4.1. Define requirements based on infrastructure needs of scientific and operational data and associated human capital for storing, sharing, manipulating, visualizing, and analyzing data (for example, dedicated science networks) in analysis-ready formats (for example, machine-readable or tidy data).
4.2. Advance open science and data interoperability by achieving broader interconnectivity of USGS repositories and data stores to improve data access across the Bureau.
4.3. Develop efficient approaches to better communicate, deliver, and share data and information with external partners through such capabilities as linking USGS infrastructure to national data networks.
4.4. Ensure enterprise and local data systems meet USGS policy requirements (for example, security, review and approval, open access data, and section 508 of the Rehabilitation Act of 1973 [29 U.S.C. § 794d]) and data best practices (for example, FAIR, targeted system integration, and data lifecycles).
4.5. Ensure USGS data architectures meet security expectations while allowing advances and innovation in scientific and business operation data utility.
Goal 5: Enhance the data-centric culture by extending data literacy to all areas of the USGS workforce, emphasizing skill-building activities essential for achieving the USGS mission.¹ This goal will further enhance the USGS data expertise and data-centric culture through training, strategic hires, and workforce development.
5.1. Invest in data expertise through workforce planning, hiring, position descriptions focused on data skills, staff development, training, and outreach consistent with current data and technology practices.
5.2. Foster broader workforce skills to develop methods that support and enhance USGS science and business data operations, emphasizing continuous improvement, usability, workforce adoption, and widespread use.
5.3. Implement the “U.S. Geological Survey 21st-Century Science Strategy 2020–2030” (USGS, 2021b) core value describing a diverse and highly skilled workforce to increase the efficiency of mission success.
¹Achievement of this goal will rely on key partnerships with the Department of the Interior’s Office of Human Capital and Office of Employee Development.
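Objective 1.3 in table 1 calls for evaluating how well data and databases meet FAIR principles. As a rough illustration of what a per-dataset check could look like, the Python sketch below scores a record against four yes-or-no criteria, one for each FAIR principle. This is not the USGS State of the Data rubric. The DatasetRecord fields, the single criterion chosen for each principle, and the equal weighting are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class DatasetRecord:
    """Hypothetical description of one dataset; all fields are illustrative."""
    title: str
    has_persistent_identifier: bool   # stands in for "findable"
    is_publicly_retrievable: bool     # stands in for "accessible"
    uses_open_format: bool            # stands in for "interoperable"
    has_license_and_metadata: bool    # stands in for "reusable"


def fair_checklist(record: DatasetRecord) -> Dict[str, bool]:
    """Map each FAIR principle to a single pass/fail check (an assumption)."""
    return {
        "findable": record.has_persistent_identifier,
        "accessible": record.is_publicly_retrievable,
        "interoperable": record.uses_open_format,
        "reusable": record.has_license_and_metadata,
    }


def fair_score(record: DatasetRecord) -> float:
    """Return the fraction of the four checks that pass, from 0.0 to 1.0."""
    checks = fair_checklist(record)
    return sum(checks.values()) / len(checks)


if __name__ == "__main__":
    example = DatasetRecord(
        title="Example streamflow measurements",  # made-up title
        has_persistent_identifier=True,
        is_publicly_retrievable=True,
        uses_open_format=True,
        has_license_and_metadata=False,
    )
    failed = [name for name, ok in fair_checklist(example).items() if not ok]
    print(f"Score: {fair_score(example):.2f}; needs work on: {failed}")
```

A real assessment would likely use several graded criteria for each principle. The point of the sketch is only that a consistent, machine-checkable rubric makes progress measurable across many datasets.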

Conclusion

The U.S. Geological Survey Data Strategy serves as an overarching guide to improve data governance and inspire innovation, awareness, and new understanding of the potential and impact of U.S. Geological Survey data. The strategy focuses on improved data access and the ability to analyze, reuse, and integrate data in new and innovative ways, ultimately expanding the scope and scale of our science. It provides a foundational direction and a framework to promote a data-centric culture and enhance the skillsets of Bureau staff to harness the power of U.S. Geological Survey data more effectively. The goals and objectives in the strategy align with other Bureau strategic plans and with guidance and directives from the Department of the Interior and the Federal Government. The strategy relies on the entire Bureau for successful implementation, with key partnerships in the Associate Chief Information Office, Office of Human Capital, and Office of Employee Development. The strategy is a key component in strengthening the Bureau’s data ecosystem to ensure relevant, long-term capacity that supports internal needs and helps the Bureau achieve its scientific mission in the most efficient and effective manner.

References Cited

Faundeen, J.L., Burley, T.E., Carlino, J.A., Govoni, D.L., Henkel, H.S., Holl, S.L., Hutchison, V.B., Martín, E., Montgomery, E.T., Ladino, C.C., Tessler, S., and Zolly, L.S., 2013, The United States Geological Survey science data lifecycle model: U.S. Geological Survey Open-File Report 2013–1265, 4 p., accessed June 2022 at https://doi.org/10.3133/ofr20131265.

Federal Data Strategy Development Team, 2021, Federal Data Strategy 2021 action plan: Federal Data Strategy Development Team, 24 p., accessed June 2022 at https://strategy.data.gov/2021/action-plan/.

Holdren, J.P., 2013, Increasing access to the results of federally funded scientific research: Executive Office of the President memorandum, 6 p., accessed January 18, 2021, at https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf.

Hutchison, V.B., Zolly, L.S., Norkin, T., Hsu, L., and Hou, C.-Y., 2023, USGS State of the Data Project—Rubric and assessment data: U.S. Geological Survey data release, https://doi.org/10.5066/P97V4XA4.

Lightsom, F.L., Hutchison, V.B., Bishop, B., Debrewer, L.M., Govoni, D.L., Latysh, N., and Stall, S., 2022, Opportunities to improve alignment with the FAIR principles for U.S. Geological Survey data: U.S. Geological Survey Open-File Report 2022–1043, 23 p., accessed June 2022 at https://doi.org/10.3133/ofr20221043.

National Institute of Standards and Technology [NIST], 2023, Glossary of key information security terms: National Institute of Standards and Technology Computer Security Resource Center website, accessed September 2023 at https://csrc.nist.gov/glossary.

National Library of Medicine, 2023, Data glossary: National Library of Medicine website, accessed November 2023 at https://www.nnlm.gov/guides/data-glossary.

National Oceanic and Atmospheric Administration, 2020, NOAA data strategy—Maximizing the value of NOAA data: National Oceanic and Atmospheric Administration, 12 p., accessed June 2022 at https://www.glerl.noaa.gov/review2021/documents/NOAA_DataStrategy.pdf.

Nelson, A., 2022, Ensuring free, immediate, and equitable access to federally funded research: Executive Office of the President memorandum, 8 p., accessed October 16, 2022, at https://www.whitehouse.gov/wp-content/uploads/2022/08/08-2022-OSTP-Public-Access-Memo.pdf.

Obama, B.H., 2013, Executive order 13642—Making open and machine readable the new default for government information: Federal Register, v. 78, no. 93, p. 28111–28113, accessed February 8, 2021, at https://www.govinfo.gov/content/pkg/FR-2013-05-14/pdf/2013-11533.pdf.

U.S. Department of Commerce, 2022, Department of Commerce data strategic action plan: U.S. Department of Commerce, 14 p., accessed June 2022 at https://www.commerce.gov/sites/default/files/2022-01/US-Dept-of-Commerce-Data-Strategic-Action-Plan-FY21-22.pdf.

U.S. Department of State, 2021, Enterprise data strategy—Empowering data informed diplomacy: U.S. Department of State, 18 p., accessed June 2022 at https://www.state.gov/the-department-unveils-its-first-ever-enterprise-data-strategy/.

U.S. Geological Survey [USGS], 2017, Fundamental science practices—Review and approval of scientific data for release: U.S. Geological Survey Manual, chap. 502.8, accessed November 2023 at https://www.usgs.gov/survey-manual/5028-fundamental-science-practices-review-and-approval-scientific-data-release.

U.S. Geological Survey [USGS], 2021a, Data management: U.S. Geological Survey data management website, accessed November 2023 at https://www.usgs.gov/data-management.

U.S. Geological Survey [USGS], 2021b, U.S. Geological Survey 21st-century science strategy 2020–2030: U.S. Geological Survey Circular 1476, 20 p., accessed June 2022 at https://doi.org/10.3133/cir1476.

U.S. Geological Survey [USGS], 2023, Public access to results of federally funded research at the U.S. Geological Survey—Scholarly publications and digital data (ver. 2.0): U.S. Geological Survey, 24 p., accessed May 2023 at https://www.usgs.gov/media/files/public-access-results-federally-funded-research-usgs.

U.S. Geological Survey [USGS], [undated] a, Fundamental Science Practices (FSP) and related policy directives: U.S. Geological Survey web page, accessed September 23, 2022, at https://www.usgs.gov/about/organization/science-support/office-science-quality-and-integrity/policy-directives.

U.S. Geological Survey [USGS], [undated] b, Fundamental Science Practices (FSP) procedures and guidelines: U.S. Geological Survey web page, accessed November 2023 at https://www.usgs.gov/office-of-science-quality-and-integrity/usgs-trusted-digital-repositories-tdr.

Wilkinson, M.D., Dumontier, M., Aalbersberg, I.J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L.B., Bourne, P.E., Bouwman, J., Brookes, A.J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C.T., Finkers, R., Gonzalez-Beltran, A., Gray, A.J.G., Groth, P., Goble, C., Grethe, J.S., Heringa, J., ’t Hoen, P.A.C., Hooft, R., Kuhn, T., Kok, R., Kok, J., Lusher, S.J., Martone, M.E., Mons, A., Packer, A.L., Persson, B., Rocca-Serra, P., Roos, M., van Schaik, R., Sansone, S.-A., Schultes, E., Sengstag, T., Slater, T., Strawn, G., Swertz, M.A., Thompson, M., van der Lei, J., van Mulligen, E., Velterop, J., Waagmeester, A., Wittenburg, P., Wolstencroft, K., Zhao, J., and Mons, B., 2016, The FAIR guiding principles for scientific data management and stewardship: Scientific Data, v. 3, article 160018, 9 p., accessed September 2022 at https://doi.org/10.1038/sdata.2016.18.

Glossary

analysis-ready data

Scientific data available in performant infrastructure and in optimized formats well suited for analytical workflows and visualization.

data asset

“Any entity that is comprised of data. For example, a database is a data asset that is comprised of data records. A data asset may be a system or application output file, database, document, or web page. A data asset also includes a service that may be provided to access data from an application. For example, a service that returns individual records from a database would be a data asset. Similarly, a web site [sic] that returns data in response to specific queries (e.g., www.weather.com) would be a data asset” (National Institute of Standards and Technology [NIST], 2023).

data-centric culture

An organizational philosophy that recognizes accessible quality data as the foundation for science, research, and business operations, champions data roles and skill growth, and promotes data best practices.

data ecosystem

“The complex and interconnected relationships among entities involved in creating or deploying systems, products, or services or any components that process data” (NIST, 2023).

enterprise data architecture

“Fundamental concepts or properties related to a system in its environment embodied in its elements, relationships, and in the principles of its design and evolution” (NIST, 2023).

interoperable

Data interoperability refers to the ways in which data are formatted that allow diverse datasets to be merged or aggregated in meaningful ways (National Library of Medicine, 2023).

trusted digital repository

“. . . one whose mission is to provide reliable, long-term access to managed digital resources to its customers, now and in the future” (USGS, undated b).

Abbreviations

FAIR

findable, accessible, interoperable, and reusable

FSP

Fundamental Science Practices

NIST

National Institute of Standards and Technology

USGS

U.S. Geological Survey

Publishing support provided by the Science Publishing Network,

Denver and Lafayette Publishing Service Centers

For more information concerning the research in this report, contact the

Center Director, USGS Science Analytics and Synthesis

P.O. Box 25046, Mail Stop 302

Denver, CO 80225

(303) 202–4774

Or visit the Science Analytics and Synthesis website at

https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/

Disclaimers

Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.

Although this information product, for the most part, is in the public domain, it also may contain copyrighted materials as noted in the text. Permission to reproduce copyrighted items must be secured from the copyright owner.

Suggested Citation

Hutchison, V.B., Burley, T.E., Blasch, K.W., Exter, P.E., Gunther, G.L., Shipman, A.J., Kelley, C.M., and Morris, C.A., 2024, U.S. Geological Survey data strategy 2023–33: U.S. Geological Survey Circular 1517, 7 p., https://doi.org/10.3133/cir1517.

ISSN: 2330-5703 (online)

Publication type: Report
Publication subtype: USGS Numbered Series
Title: U.S. Geological Survey data strategy 2023–33
Series title: Circular
Series number: 1517
DOI: 10.3133/cir1517
Year published: 2024
Language: English
Publisher: U.S. Geological Survey
Publisher location: Reston, VA
Contributing office(s): Office of Administration, Northwest Regional Director's Office, Science Analytics and Synthesis, Oklahoma-Texas Water Science Center, Office of the Associate Chief Information Officer
Description: v, 7 p.
Online only (Y/N): Y