Content Metadata Standards for Marine Science: A Case Study, USGS Open-File Report 2004-1002

Challenges in Cataloguing the Marine Realms

Organizing and aggregating information about marine environments is necessary but not trivial. Six specific challenges were encountered during the MRIB's development. Several of these challenges arose from conscious decisions made by the MRIB developers and would not necessarily be faced by every digital library project. Information about the difficulty of responding to each challenge may prove useful in making decisions about the scope and audience of other digital library projects.

Perhaps the most difficult challenge is posed by the MRIB's broad potential audience: to meet the needs of many groups and individuals, the MRIB developers strove to organize information in ways that would anticipate and meet many users' search strategies. Because no search strategy could ever be sufficiently universal to assist every possible user, the broad-audience challenge mandated many avenues to the same information. It should be noted that the MRIB does not attempt to meet every possible strategy, but only to provide a large set of possible strategies. This concern for audience strongly shaped how the other challenges were considered and approached.

The second challenge is that much research on the Earth is firmly located in specific places and times, and research on Earth's water bodies is no exception. Even so simple a measurement as water depth may vary greatly within a few lateral feet and a few hours, so spatial and temporal placement of data is crucial to underwater studies. Thus it is important that any catalogue dealing with the marine environments allow classification and searching of information resources by location and time.
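As a rough illustration of searching by location and time, the sketch below stores a bounding box and a date range with each record and returns the records whose footprint overlaps a query footprint. It is a minimal sketch and not the MRIB's implementation: the `Footprint` class, its field names, and the sample records are all hypothetical, and boxes that cross the 180-degree meridian are not handled.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Footprint:
    """Hypothetical spatial and temporal extent of an information resource."""
    west: float   # degrees longitude
    south: float  # degrees latitude
    east: float
    north: float
    start: date
    end: date

    def overlaps(self, other: "Footprint") -> bool:
        # Two boxes overlap when their ranges intersect on both axes.
        spatial = (self.west <= other.east and other.west <= self.east
                   and self.south <= other.north and other.south <= self.north)
        # Two date ranges overlap when each one starts before the other ends.
        temporal = self.start <= other.end and other.start <= self.end
        return spatial and temporal


def search(records, query):
    """Return titles of records whose footprint overlaps the query footprint."""
    return [title for title, footprint in records if footprint.overlaps(query)]


# Invented sample records, for illustration only.
records = [
    ("Hypothetical bathymetry survey",
     Footprint(-71.0, 41.0, -69.5, 42.5, date(1998, 5, 1), date(1998, 9, 30))),
    ("Hypothetical lake sediment cores",
     Footprint(-88.0, 42.0, -86.0, 46.0, date(2001, 6, 1), date(2001, 8, 15))),
]

query = Footprint(-70.5, 41.5, -70.0, 42.0, date(1998, 7, 1), date(1998, 7, 31))
print(search(records, query))
```

Here the query covers a small area during July 1998, so only the first record matches; the second record lies in a different region and a different year.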

A third challenge is that "marine science" encompasses an exceptionally vast range of academic disciplines: all the natural, physical, and social sciences are concerned with standing-water environments. The humanities, the practical fields (such as education and law), and the arts also have something to say about the marine environments. Thus, a truly inclusive ontology (categorization system) for marine science must appeal to the disparate mindsets and internal organizations of many disciplines. Developers and users of this ontology must remain aware that terms from one discipline may represent wholly different concepts in another discipline. The MRIB has simplified the problem of excessive disciplinary scope by limiting its content to the scientific and educational disciplines for the near future.

Because the MRIB's ambition is to serve users who are not necessarily scientifically trained, the ontology and term lists must not be overloaded with jargon (this is the fourth challenge). Avenues must exist for the scientist who maps the seafloor, the student who is reporting on cephalopods, and the concerned citizen whose home is in a flood zone to each find the information she or he requires. Balancing the conflicting needs of the MRIB ontology is an ongoing process; those needs are 1) precision of terminology; 2) consideration of conflicting term connotations between academics and lay people, as well as between academic disciplines; and 3) avoidance of excessive jargon.

Although not specifically associated with cataloguing the ocean and lake environments, a further challenge in developing a marine science ontology is the need to moderate the number of search fields it offers. One limitation of traditional electronic library catalogues is that they rely on simple lists of authors, subjects, and titles, which provide little or no contextual information about related subjects and items. The MRIB provides this information in a more organized set of hierarchical topics, so it requires more than those three basic metadata fields. However, too many fields, or fields that overlap too greatly, could overwhelm the user and render an intuitive interface impossible.
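To illustrate how hierarchical topics can supply the context that a flat subject list lacks, the sketch below lets a search on a broad topic retrieve records catalogued under any narrower topic beneath it. The topic names, the `PARENT` table, and the sample records are invented for illustration and are not drawn from the MRIB's actual term lists.

```python
# Hypothetical topic hierarchy: each topic maps to its broader parent topic.
PARENT = {
    "cephalopods": "marine biology",
    "marine biology": "life sciences",
    "seafloor mapping": "marine geology",
    "marine geology": "earth sciences",
}

# Invented records, each catalogued under a single topic.
RECORDS = {
    "Hypothetical squid fact sheet": "cephalopods",
    "Hypothetical sidescan sonar atlas": "seafloor mapping",
}


def ancestors(topic):
    """Return the chain of broader topics above the given topic."""
    chain = []
    while topic in PARENT:
        topic = PARENT[topic]
        chain.append(topic)
    return chain


def search_by_topic(query_topic):
    """Return titles catalogued under the query topic or any narrower topic."""
    return sorted(
        title
        for title, topic in RECORDS.items()
        if topic == query_topic or query_topic in ancestors(topic)
    )


print(search_by_topic("life sciences"))
print(search_by_topic("seafloor mapping"))
```

A student searching the broad topic "life sciences" and a specialist searching "cephalopods" would both reach the same record, which is one way a single field can support several search strategies without adding more fields.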

Those who are perhaps best able to precisely and elaborately catalogue documents are the authors and maintainers of those documents. Often these people are already accustomed to negotiating the murky waters of interdisciplinary and educational communication. Although an MRIB record about a document may be created by someone uninvolved in the document's production, the MRIB categorization scheme is intended to encourage authors to create their own bibliographic records. Self-cataloguing also has the benefit of involving authors in the development of the scheme, where their suggested terms may be vital. Designing an ontology that encourages author generation of metadata records was a final challenge to MRIB development.

In summary, the most significant challenges posed in creating a categorization scheme for a widely usable digital library of the ocean, lake, and marine environments are:

  • Accommodate the geospatial and temporal "footprint" of information;
  • Integrate information from a broad spectrum of academic disciplines, while minimizing discord from different term usage across disciplines;
  • Organize information so that a variety of searching strategies can succeed;
  • Minimize jargon;
  • Use enough metadata to provide the above, while remaining cautious about the total number of searchable metadata fields; and
  • Encourage composition of metadata records by document authors.

