OCR Quality Exploration on Impresso Corpus
Historical Media Analysis and Enrichment
Impresso - Media Monitoring of the Past is an interdisciplinary research project using machine learning to transform how historical media are processed, enriched, explored, and studied across modalities, languages, time periods, and national borders.
We develop the Impresso Web App and the Impresso Datalab, providing access to a large multilingual corpus of historical newspapers and radio broadcasts.
Impresso gratefully acknowledges the continued support of its cultural heritage partners, as well as funding from the SNSF (Grant Nos. CRSII5_173719 and CRSII5_213585) and the FNR (Grant No. 17498891).
OCR Quality Exploration on Impresso Corpus
Explore yearly ad and non-ad distributions in Impresso
Multilingual Named Entity Recognition in Historical Data
Multilingual Entity Linking for Historical Data
Search for similar words using word embeddings