Fennica metadata conversions: statistical monitoring and analysis
1 Preface chapter
This bookdown project is a work in progress. Additional fields will be added to the dataset as they are cleaned and harmonized.
Within the DARIAH-FI and FIN-CLARIAH Metadata Harmonization and Analysis work package, we use the Finnish national bibliography (FNB) Fennica dataset to develop a harmonized dataset that serves research purposes and lays the groundwork for further iterations of the infrastructure. The outcomes will support the DHL-FI project funded by the Research Council of Finland. The FNB contains metadata for over a million documents, including books, newspapers, and maps, with records spanning from 1488 to the present. For more details about Fennica, visit the National Library of Finland website.
The bookdown project currently comprises a few chapters. Harmonization is carried out through two pipelines: the Complete FNB Pipeline and the 1809 to 1917 Period Pipeline. The chapters are dedicated to specific metadata categories.
1.1 Data collection and transformation scripts
init.R: contains instructions for downloading the data and lists the packages that need to be installed.
The following scripts are tailored to the specific needs of this research and can be modified for other research questions:
priority_fields.R: collects priority fields from the raw data.
leader.R: collects the type of record (leader/06) and the bibliographic level (leader/07).
008_field.R: collects date of entry (008/00-05), publication status (008/06), publication time (008/07-14) and genre (008/33) for the specific type of record and bibliographical level.
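The position notation used above (leader/06, 008/07-14) follows the MARC 21 convention of 0-based character offsets within a fixed-length field. As a minimal sketch of how such positions map to R's 1-based substr(), consider the following. This is not the code of leader.R or 008_field.R: the helper marc_substr() and the sample values are hypothetical and are not taken from the Fennica data.

```r
# Hypothetical sample values for illustration only (not from Fennica).
# A MARC 21 leader is 24 characters long; field 008 is 40 characters long.
leader <- "00000nam a22000004i 4500"
field_008 <- paste0(
  "850101",        # 008/00-05: date of entry
  "s",             # 008/06:    publication status
  "1985",          # 008/07-10: first date
  "    ",          # 008/11-14: second date (blank here)
  "fi ",           # 008/15-17: place of publication
  strrep(" ", 15), # 008/18-32: left blank in this example
  "1",             # 008/33:    genre (literary form)
  " ",             # 008/34
  "fin",           # 008/35-37: language
  "  "             # 008/38-39
)

# Hypothetical helper: MARC positions are 0-based, substr() is 1-based,
# so both bounds are shifted by one.
marc_substr <- function(x, from, to = from) {
  substr(x, from + 1, to + 1)
}

record_type <- marc_substr(leader, 6)         # leader/06
bib_level   <- marc_substr(leader, 7)         # leader/07
entry_date  <- marc_substr(field_008, 0, 5)   # 008/00-05
pub_status  <- marc_substr(field_008, 6)      # 008/06
pub_time    <- marc_substr(field_008, 7, 14)  # 008/07-14
genre       <- marc_substr(field_008, 33)     # 008/33
```

Because substr() is vectorized over its first argument, the same helper can be applied to a whole column of leader or 008 strings. Note that the meaning of 008/18-34 depends on the material type, which is why the genre position is read together with the record type and bibliographic level.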