The workshop took place on Friday, June 7, 2019, from 11 a.m. to 5 p.m. at DFKI Saarbrücken, Saarland Informatics Campus D 3_2.
Invited keynote speakers:
Prof. Dr. Christa Womser-Hacker
Which language does the internet speak?
Prof. Gareth Jones
(Dublin City University):
Reconsidering Domain-Specific Cross-Language Information Access in the Age of Distributional Semantics
Prof. Daniela Petrelli
(Sheffield Hallam University):
A designerly approach to interactive cross-language information retrieval
Assoc. Prof. Pavel Pecina
(Charles University, Prague):
Breaking the language barrier in health-related web search
Antoine Isaac, PhD
Multilingual challenges and ongoing work to tackle them at Europeana
and the CLuBS project members.
Does overcoming the language barrier contribute to the global advancement of science, or has English become the lingua franca of science?
Research has shown that results in non-English languages are less available and less often referenced than results published in English. This may lead to situations where information published in languages other than English is lost or, in practice, non-existent for individual researchers or even the scientific community as a whole. The aim of the project Cross-Lingual Bibliographic Search (CLuBS) is to investigate strategies to address these problems in psychology. The use case is PubPsych, an open access multilingual search engine for psychological literature, tests, treatment programs and research data with metadata in four languages: English, French, German, and Spanish.
To make bibliographic metadata more accessible to non-native speakers, the retrieval performance of query translation versus complete record translation was empirically tested and evaluated. The following strategies proved successful despite the small number of query terms and the resulting scarcity of context information:
- lexicon (thesaurus) based query translation together with some simple translation rules;
- neural machine translation for content translation to cover those language pairs with little in-domain parallel data.
The resulting system is a combination of human expert translation, multilingual thesaurus mappings and neural machine translation.
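As a rough illustration of the first strategy, lexicon-based query translation can be sketched as a longest-match lookup in a bilingual term list. This is a minimal sketch and not the CLuBS implementation: the thesaurus entries, the function name `translate_query`, and the fallback rule (keep terms without a lexicon entry unchanged) are assumptions made for the example.

```python
# Hypothetical toy thesaurus: German term -> English term.
# The entries are illustrative and not taken from the PubPsych thesaurus.
THESAURUS_DE_EN = {
    "angststörung": "anxiety disorder",
    "soziale angst": "social anxiety",
    "gedächtnis": "memory",
    "kinder": "children",
}


def translate_query(query, lexicon, max_phrase_len=3):
    """Translate a query by greedy longest-match lookup in a lexicon.

    Assumed rule: tokens with no lexicon entry are kept unchanged,
    so names and shared terminology still reach the search engine.
    """
    tokens = query.lower().split()
    translated = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_phrase_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in lexicon:
                translated.append(lexicon[phrase])
                i += n
                break
        else:
            # No entry found: pass the original token through.
            translated.append(tokens[i])
            i += 1
    return " ".join(translated)
```

Matching multi-word entries before single words lets a short query such as "soziale Angst" map to one thesaurus concept instead of being translated word by word, which matters when the query offers almost no context.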
The objective of this workshop is to present and discuss the CLuBS project results, with a main focus on information retrieval and machine translation, and to demonstrate the improved PubPsych search engine. Participants are invited from the fields of
- Library and Information Science,
- Computer Science,
- and other disciplines interested in multilingual systems.
By sharing experience and knowledge, this contribution to overcoming the language barrier should benefit all researchers.