Simplifying Subject Indexing: A Python-Powered Approach in KBR, the National Library of Belgium
Authors
Lowagie, Hannes
Van Woensel, Julie
Discipline
Computer and information sciences
Subject
Python
Subject Indexing
Cataloguing
Audience
Scientific
Date
2024-10-07
Description
This paper details the National Library of Belgium’s (KBR) exploration of automating the subject indexing process for its extensive collection using Python scripts. The initial exploration involved creating a reference dataset and automating the classification process using MARCXML files. The focus is on demonstrating the practicality, adaptability, and user-friendliness of the Python-based solution. The authors introduce their unique approach, which emphasizes the role of semantically significant words in subject determination. The paper outlines the Python workflow, from creating the reference dataset to generating enriched bibliographic records. Criteria for an optimal workflow, including ease of creation and maintenance of the dataset, transparency, and correctness of suggestions, are discussed. The paper highlights the promising results of the Python-powered approach, showcasing two specific scripts that create a reference dataset and automate subject indexing. The flexibility and user-friendliness of the Python solution are emphasized, making it a compelling choice for libraries seeking efficient and maintainable solutions for subject indexing projects.
Citation
Lowagie, Hannes; Van Woensel, Julie (2024-10-07). Simplifying Subject Indexing: A Python-Powered Approach in KBR, the National Library of Belgium. Code4lib Journal, Issue 59.
Type
Article
Peer-Review
Yes
Language
eng