3 Language
MARC: 041a
Polish language field: language.R
Harmonize language data function: polish_languages.R
Render language field: language.qmd
3.1 Complete Dataset Overview
- 170 unique languages
- 148 unique primary languages
- 1115444 single-language documents (93.91%)
- 72369 multilingual documents (6.09%)
- Conversions from raw to preprocessed language entries
- 50515 documents (4.25%) with unrecognized language
Language codes are from MARC; new custom abbreviations can be added in this table.
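The harmonization and the summary counts above can be illustrated with a minimal sketch. It is not the code in polish_languages.R: the table excerpt, the function names `harmonize_language` and `summarize`, the rule that unrecognized codes are dropped, and the reading of "primary language" as the first listed language are all assumptions made for illustration.

```python
# Hypothetical excerpt of a MARC language-code table; the real table is larger
# and can be extended with custom abbreviations.
MARC_LANGUAGES = {
    "fin": "Finnish",
    "swe": "Swedish",
    "ger": "German",
    "lat": "Latin",
    "rus": "Russian",
    "fre": "French",
    "eng": "English",
    "und": "Undetermined",
}


def harmonize_language(codes, table=MARC_LANGUAGES):
    """Map raw 041a codes to a ';'-joined string of language names.

    Assumption: codes missing from the table are dropped, and a record with
    no recognized code at all is labeled 'Undetermined'.
    """
    names = []
    for code in codes:
        name = table.get(code.strip().lower())
        if name is not None and name not in names:
            names.append(name)
    return ";".join(names) if names else "Undetermined"


def summarize(records):
    """Count single-language, multilingual and undetermined records."""
    harmonized = [harmonize_language(codes) for codes in records]
    total = len(harmonized)
    multi = sum(1 for h in harmonized if ";" in h)
    return {
        "total": total,
        "single": total - multi,
        "multi": multi,
        "undetermined": sum(1 for h in harmonized if h == "Undetermined"),
        # Assumption: the primary language is the first one listed.
        "unique_primary": len({h.split(";")[0] for h in harmonized}),
    }


if __name__ == "__main__":
    demo = [["fin"], ["swe", "fin"], ["xx"], ["fin", "swe"]]
    print(summarize(demo))
```

Keeping the order of the codes means that Finnish;Swedish and Swedish;Finnish stay distinct combinations, which matches how they appear as separate rows in the top-language table.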
3.2 Subset Analysis: 1809-1917
- 54 unique languages
- 60632 single-language documents (94%)
- 3872 multilingual documents (6%)
- 929 documents (1.44%) with unrecognized language
3.2.1 Top languages for 1809-1917
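Number of titles assigned to each language (top 10). For a complete list, see accepted languages. The fractions are relative to the complete dataset (1187813 documents), not to the 1809-1917 subset (64504 documents).
Language | Documents (n) | Fraction of complete dataset (%) |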
---|---|---|
Finnish | 33674 | 2.8 |
Swedish | 19107 | 1.6 |
Finnish;Swedish | 2021 | 0.2 |
German | 1988 | 0.2 |
Latin | 1662 | 0.1 |
Russian | 1587 | 0.1 |
Undetermined | 929 | 0.1 |
French | 628 | 0.1 |
English | 425 | 0 |
Swedish;Finnish | 170 | 0 |
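The percentages in the table can be checked against the document counts reported earlier. The sketch below is only a consistency check on the published numbers, not part of the original pipeline; the helper name `percent` is an assumption. It computes each share against both possible denominators, the complete dataset and the 1809-1917 subset, so the two can be compared with the table.

```python
# Totals derived from the counts reported in this section.
FULL_TOTAL = 1115444 + 72369   # single-language + multilingual, complete dataset
SUBSET_TOTAL = 60632 + 3872    # single-language + multilingual, 1809-1917


def percent(n, total, digits=1):
    """Share of n in total, as a percentage rounded to `digits` decimals."""
    return round(100 * n / total, digits)


# First three rows of the top-language table.
counts = {"Finnish": 33674, "Swedish": 19107, "Finnish;Swedish": 2021}

if __name__ == "__main__":
    for language, n in counts.items():
        print(language, percent(n, FULL_TOTAL), percent(n, SUBSET_TOTAL))
```

With the complete dataset as denominator the values agree with the table (for example 2.8 for Finnish); with the subset as denominator Finnish would account for roughly half of the subset's documents.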
Title count per language (including multi-language documents; note the log10 scale):