Comparative Study and Expansion of Metadata Standards for Historic Fashion Collections
Abstract
This research seeks to contribute to efforts to standardize metadata across the costume and fashion domain by adding new metadata elements and controlled vocabularies to Costume Core. Expanding the metadata schema could increase the searchability and discoverability of fashion collections. To expand Costume Core, we used the vocabulary of pre-trained Natural Language Processing (NLP) models to identify potential new descriptors in the conceptual latent space provided by word embeddings. We also drew on controlled vocabularies shared by other fashion collection personnel across the United States through online surveys.
The NLP techniques involved using a language model pre-trained on the Google News dataset to identify terms similar to those in Costume Core. MOCHA, a Model Output Confirmative Helper Application, was developed to facilitate the review of potential descriptors. The NLP analysis showed a difference between the generated descriptors predicted to be accurate and those confirmed as accurate by a fashion domain expert. Nevertheless, using machine learning models for metadata expansion is justifiable given the accuracy of the generated descriptors and the time-saving potential, as the NLP analysis allowed for selection from a wider array of descriptors.
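The similarity search described above can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors, the word list, and the function names `cosine_similarity` and `suggest_descriptors` are all hypothetical, chosen only to show how terms close to an existing descriptor in an embedding space might be ranked as candidates. In practice the vectors would come from a pre-trained model rather than being written by hand.

```python
import math

# Toy 3-dimensional vectors standing in for pre-trained word embeddings.
# The values are invented for illustration; real embeddings have hundreds
# of dimensions and are learned from a large corpus.
TOY_EMBEDDINGS = {
    "gown": [0.90, 0.10, 0.05],
    "dress": [0.85, 0.15, 0.10],
    "frock": [0.80, 0.20, 0.05],
    "bodice": [0.60, 0.40, 0.10],
    "tractor": [0.05, 0.10, 0.95],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def suggest_descriptors(term, embeddings, known_terms, top_n=3):
    """Rank words not already in the vocabulary by similarity to `term`."""
    query = embeddings[term]
    candidates = [
        (word, cosine_similarity(query, vector))
        for word, vector in embeddings.items()
        if word not in known_terms
    ]
    # Highest similarity first.
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates[:top_n]


# Hypothetical existing vocabulary; candidates are drawn from everything else.
known = {"gown", "dress"}
suggestions = suggest_descriptors("gown", TOY_EMBEDDINGS, known, top_n=2)
for word, score in suggestions:
    print(f"{word}: {score:.3f}")
```

A ranked list of this kind is only a set of candidates. As the abstract notes, a domain expert still has to confirm which suggestions are accurate descriptors.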
The revision process also identified 528 potential new descriptors. The survey data indicated high variability in collection cataloging systems, the resources used to determine accurate vocabulary for cataloging artifacts, the controlled vocabularies used, and how vocabularies were categorized, reflecting a lack of standardization in the field. However, by crowdsourcing controlled vocabularies, we discovered 48 new vocabularies that may be used to expand the metadata schema.
In addition, the study provided insight into adding metadata elements in the form of fields or columns, such as those relating to medium (for example, fiber and fabric structure) and to color (including hue, value, and intensity). Adding such metadata elements could enrich the schema and promote greater standardization of metadata across fashion collections.
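As a rough illustration of what such additional fields might look like in a catalog record, the sketch below models them as optional attributes. The class names, field names, and sample values are assumptions made for this example and are not taken from Costume Core itself.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Color:
    """Hypothetical color element with the three attributes named above."""
    hue: str
    value: Optional[str] = None      # lightness or darkness
    intensity: Optional[str] = None  # brightness or dullness


@dataclass
class GarmentRecord:
    """Hypothetical catalog record; field names are illustrative only."""
    identifier: str
    fiber: Optional[str] = None
    fabric_structure: Optional[str] = None
    color: Optional[Color] = None


# Sample values are invented for illustration.
record = GarmentRecord(
    identifier="example-001",
    fiber="silk",
    fabric_structure="satin weave",
    color=Color(hue="red", value="dark", intensity="bright"),
)

# Flatten to a plain dictionary, e.g. for export as a row or JSON object.
row = asdict(record)
print(row)
```

Keeping the new elements optional means that existing records without fiber or color information would remain valid under this sketch.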
Copyright (c) 2023 Dina Smith-Glaviana, Wen Nie Ng, Caleb McIrvin, Chreston Miller, Julia Spencer
This work is licensed under a Creative Commons Attribution 4.0 International License.
The VRAB does not require copyright transfer, only permission to publish and archive the article. Copyright holders retain copyright ownership and grant a nonexclusive license to the journal and OJS to publish the article, meaning that the author may also publish it elsewhere. Before submitting an article to the journal, please be sure that all necessary permissions have been cleared for any third-party material.