CLUBS ♣ Project documentation

Published reports

M1.1 — Cross-lingual Information Retrieval, Usage Scenarios and PubPsych Structure

Andreas Lüschow

This document describes our motivation and the initial situation at the beginning of the CLUBS project. We explain common issues of multilingual systems, why a multilingual approach for the psychological search engine PubPsych is useful and promising, and which challenges cross-lingual systems face. Characteristic user inputs and probable usage scenarios are presented. We describe the structure of PubPsych, including its current problems and the adaptations to the system that are necessary during the project.

Last update: January 2019

M1.2 — Corpora for the Machine Translation Engines

Cristina España-Bonet, Juliane Stiller & Sophie Henning

This document describes the corpora used for training and evaluating the baseline MT engines used within the CLUBS project.

Last update: July 2018

M1.3.1 — Evaluation plan for CLUBS project

Juliane Stiller & Vivien Petras

This document describes the different evaluation studies which will be executed during the course of the project. The studies assess the performance of different MT approaches for cross-lingual retrieval in the bibliographic search engine PubPsych.

Last update: July 2017

M1.3.2 — CLUBS - Testing retrieval performance

Juliane Stiller & Vivien Petras

This document details the experiment that will be conducted to determine which of the five approaches yields the best retrieval performance.

Last update: March 2018

M1.4 — MT Approaches towards Cross-lingual IR in PubPsych

Cristina España-Bonet

This document describes the architecture options and final choices for implementing the machine translation (MT) system intended to translate article titles and abstracts. The alternatives are presented, and a comparison between the two most promising architectures, Statistical MT and Neural MT, is given.

Last update: August 2018

M1.5 — Mapping Approaches for Vocabularies and Queries in PubPsych

Cristina España-Bonet & Roland Ramthun

This document describes the in-domain vocabularies available to the project and how their multilingual counterparts are built from them. It also proposes several public resources we can use to complement and extend this data with general-domain vocabulary. Finally, it sketches how the multilingual lexicons are applied to controlled terms and query translation.

Last update: March 2018

M3.1 — Cross-lingual Thesaurus and Controlled Term Translation

Cristina España-Bonet & Roland Ramthun

This document describes the data, resources, methodology and software developed to translate the controlled terms and related text available as metadata in the PubPsych database.

Last update: March 2018

M5.3 — Final Evaluation

Juliane Stiller, Vivien Petras & Andreas Lüschow

This document presents the evaluation results from the CLUBS project. We describe the design of the evaluation experiment that allows us to compare the different systems that were developed. After presenting the outcomes (e.g., inter-annotator agreement) of a pilot experiment conducted before the final user evaluation, we give a detailed report on the different systems and their individual evaluation results. The system that makes use of translated metadata performs best.

Last update: September 2019

Software

MeSHMerger version 1.0

Roland Ramthun

Documentation

Online integration of the query translation module into PubPsych