Specialized cataloging communities have extensive requirements for cataloging their materials, and these requirements bear directly on the development of BIBFRAME. As a particularly complicated example, the BIBFRAME AV Modeling Study laid out some of the challenges presented by audio-visual materials: the time-based and event-based nature of the material, multiple creators and contributors playing different roles, the proliferation of aggregations and collections, and the appearance of the same content on different media types, including preservation copies, all with different dates.
Should BIBFRAME aim to accommodate these special needs with vocabulary extensions? Or should it stay as simple as possible and have specialized communities rely on external vocabularies to extend it?
The MARC format has been expanded significantly by proposals from specialized cataloging committees. Many communities have identified their needs and submitted proposals for new MARC fields and subfields, which are then integrated into the MARC format and available for use by all. Recent audio and visual examples include the 344 (Sound Characteristics), 345 (Projection Characteristics of Moving Image), 346 (Video Characteristics), 347 (Digital File Characteristics), and 382 (Medium of Performance).
Having specialized fields in MARC offers several benefits. Cataloging committees have the opportunity to put forward their own needs and potential solutions to a centralized, standardized body. The fields make it possible for catalogers to provide granular data that can be specifically targeted by catalogs and users.
However, there are also drawbacks to our current practice. The MARC format has grown to be seen as too complicated, and the high level of granularity is often not leveraged by end-user systems. The data is not always interoperable with other data, and there are often two (or more) places to put the same information (e.g., one note field and one coded or authorized field). Additionally, MARC proposals may be intimidating for small communities with specialized needs.
BIBFRAME’s current status
The current BIBFRAME vocabulary is fairly concise. Many specialized MARC fields do not (yet?) have BIBFRAME equivalents. To take just one example, the properties represented by the subfields of the 344 (Sound Characteristics) field have no counterpart: Type of recording, Recording medium, Playing speed, Groove characteristic, Track configuration, Tape configuration, Configuration of playback channels, and Special playback characteristics. These data elements would be very important for certain types of audio cataloging.
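To make the level of granularity concrete, the 344 field can be sketched as a small lookup table. This is only an illustration: the labels are the ones listed above, the subfield codes ($a through $h) are our assumption based on the MARC 21 format, and the sample values are invented rather than taken from a real record.

```python
# Subfield labels for MARC 344 (Sound Characteristics). The labels come from
# the list above; the subfield codes are an assumption based on MARC 21.
SOUND_CHARACTERISTICS_344 = {
    "a": "Type of recording",
    "b": "Recording medium",
    "c": "Playing speed",
    "d": "Groove characteristic",
    "e": "Track configuration",
    "f": "Tape configuration",
    "g": "Configuration of playback channels",
    "h": "Special playback characteristics",
}


def label_subfields(subfields):
    """Pair each (code, value) subfield with its human-readable label."""
    labeled = []
    for code, value in subfields:
        label = SOUND_CHARACTERISTICS_344.get(code, "Unmapped subfield $" + code)
        labeled.append((label, value))
    return labeled


# Invented sample data for an analog disc, not taken from a real record.
sample_344 = [("a", "analog"), ("c", "33 1/3 rpm"), ("g", "stereo")]

for label, value in label_subfields(sample_344):
    print(f"{label}: {value}")
```

Each of the eight labels is a distinct data element that a BIBFRAME description would need some property, internal or external, to carry.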
It is unclear whether BIBFRAME’s developers intend to keep adding specialized terms to the BIBFRAME vocabulary, and public opinion on the topic seems to vary as well.
The 2012 Library of Congress document Bibliographic Framework as a Web of Data outlined BIBFRAME as a model that might have the same depth as MARC:
It is important to remember that this model, like MARC, must be able to accommodate any number of content models and specific implementations, but still enable data exchange between libraries. (Page 5)
The goal of the Bibliographic Framework Initiative is to develop a model to which various content models can be mapped. This recognizes that different communities may have different views of their resources and thus different needs for resource descriptions. (Page 15)
The AV Modeling Study likewise argued for a robust vocabulary, while emphasizing the need for external vocabularies:
Rather than be forced to compromise descriptive detail by making it fit into a model that is not aligned with the content type, or use vocabularies built for a specific, unrelated purpose, the cataloger should be able to easily describe any given resource using a shared model. (Page 25)
While [BIBFRAME] will not be all things to all content, it has the potential to offer a logical but flexible data model, and a strong core set of vocabularies that are extensible as needed. Modeled in RDF, BIBFRAME provides organizations the opportunity to utilize other namespaces in order to add more extensive description required in specific contexts, such as technical, preservation, and rights metadata. (Pages 25-26)
In response to the AV Modeling Study, Stanford’s Phil Schreur argued on the BIBFRAME email list for keeping BIBFRAME’s vocabulary simple:
I'd prefer seeing the BIBFRAME vocabulary remain as simple as possible and, when extensions are needed, to make use of other established vocabularies as opposed to incorporating them as part of the BIBFRAME vocabulary. If they were absorbed, the resultant behemoth would soon become impossible to manage and keep in synch [sic] with whatever vocabulary it was derived from.
In general, the Linked Data community ethos seems to encourage as much reuse of vocabulary as possible. From the Ontology Best Practices on the Open Semantic Framework:
Reuse structure and vocabularies as much as possible. This best practice refers to leveraging non-ontological content such as existing relational database schema, taxonomies, controlled vocabularies, MDM directories, industry specifications, and spreadsheets and informal lists. Practitioners within domains have been looking at the questions of relationships, structure, language and meaning for decades. Effort has already been expended to codify many of these understandings. Good practice therefore leverages these existing structural and vocabulary assets (of any nature), and relies on known design patterns.
As we move forward, we could try to continue in the tradition of MARC, including specialized vocabulary in BIBFRAME, or we could keep BIBFRAME simple and encourage the use of outside vocabularies.
Having everything in one place would be a more familiar method and point of entry to BIBFRAME for those accustomed to working in the MARC environment. There would need to be clear and accessible ways for communities to propose new vocabulary, as well as procedures to keep all vocabulary current. However, the specialized vocabulary might compromise BIBFRAME's neutrality toward different content models.
If BIBFRAME is kept simple, communities could develop their own BIBFRAME-compatible vocabularies more freely. This could lead to less standardization within libraries (e.g., different communities using different vocabularies for the same resource types), but it could also lead to more standardization with outside communities and systems.
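A minimal sketch of the second option may help show what mixing a simple core with an external vocabulary looks like in practice. Everything named here is an assumption made for illustration: the BIBFRAME namespace URI is assumed, the `example.org` AV namespace is hypothetical, and the property names are illustrative rather than taken from any published vocabulary.

```python
# Namespace URIs. BF is assumed here; AV is a purely hypothetical
# community vocabulary invented for this illustration.
BF = "http://bibframe.org/vocab/"
AV = "http://example.org/av-vocab/"

# Hypothetical resource being described.
recording = "http://example.org/instance/1"

# One description, expressed as (subject, predicate, literal value) triples.
# Core statements and community-specific statements sit side by side.
description = [
    (recording, BF + "title", "Field recording, side A"),
    (recording, AV + "playingSpeed", "33 1/3 rpm"),
    (recording, AV + "grooveCharacteristic", "microgroove"),
]


def in_namespace(triples, namespace):
    """Keep only the triples whose predicate belongs to the given namespace."""
    return [t for t in triples if t[1].startswith(namespace)]


def to_ntriples(triples):
    """Serialize triples with plain literal objects as N-Triples-style lines."""
    lines = []
    for subject, predicate, value in triples:
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject}> <{predicate}> "{escaped}" .')
    return "\n".join(lines)


# A general-purpose system can read only the core statements...
print(to_ntriples(in_namespace(description, BF)))
# ...while a specialized AV system can also use the extension statements.
print(to_ntriples(in_namespace(description, AV)))
```

The point of the sketch is that a consumer that understands only the core namespace can ignore the extension statements without losing the core description, while the specialized data stays available to systems that know the external vocabulary.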
The work that has gone into the MARC format is incredibly useful and should be preserved in some way.
We recommend that all vocabularies that currently exist in MARC but not in BIBFRAME should be at least evaluated for inclusion in BIBFRAME. Some are probably appropriate and desirable for BIBFRAME; others might be moved into new, standalone vocabularies that could be used in conjunction with BIBFRAME.
If a sufficient external vocabulary for a particular domain already exists (e.g., the Art and Architecture Thesaurus, the RBMS Controlled Vocabularies), that vocabulary should be used directly rather than incorporated into BIBFRAME.
If ideal vocabularies do not exist either in MARC or in another domain, BIBFRAME should allow proposals to be made for new BIBFRAME vocabulary (e.g., some of the AV-specific properties like sampling rate and frame rate).
It may be helpful to use nested vocabulary: generalized concepts could be incorporated into the BIBFRAME vocabulary (e.g., keyDate) and then refined by more specific terms (e.g., rereleaseDate, firstrecordingDate). This would make BIBFRAME more extensible, allowing other vocabularies to capture more fully the information particular to a specialized field.
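The nesting idea can be sketched in a few lines. The hierarchy below is hypothetical: it uses the example property names from the paragraph above, treats the link between a specific and a general property as comparable to rdfs:subPropertyOf in RDF, and uses invented dates and titles.

```python
# Hypothetical property hierarchy: each specific property points to the
# general property it refines (comparable to rdfs:subPropertyOf in RDF).
SUBPROPERTY_OF = {
    "rereleaseDate": "keyDate",
    "firstrecordingDate": "keyDate",
}


def generalize(prop):
    """Follow the hierarchy upward to the most general property."""
    seen = set()
    while prop in SUBPROPERTY_OF and prop not in seen:
        seen.add(prop)  # guard against accidental cycles in the hierarchy
        prop = SUBPROPERTY_OF[prop]
    return prop


def values_for(statements, general_prop):
    """Collect values stated with a property or any of its refinements."""
    return [value for prop, value in statements if generalize(prop) == general_prop]


# Invented statements about a single resource.
statements = [
    ("firstrecordingDate", "1959"),
    ("rereleaseDate", "1997"),
    ("title", "Example album"),
]

print(values_for(statements, "keyDate"))  # prints ['1959', '1997']
```

A system that knows only keyDate can still retrieve both dates, while a specialized system can distinguish the rerelease from the first recording.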