Publications

2022

Amir Hazem, Mérième Bouhandi, Florian Boudin and Béatrice Daille (2022). Cross-lingual and Cross-domain Transfer Learning for Automatic Term Extraction from Low Resource Data. In Proceedings, 13th Language Resources and Evaluation Conference (LREC), Marseille, France.
Mérième Bouhandi, Emmanuel Morin and Thierry Hamon (2022). Adaptation au domaine de modèles de langue à l'aide de réseaux à base de graphes. In Actes, 29ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN), Avignon, France. [pdf]
Mérième Bouhandi, Emmanuel Morin and Thierry Hamon (2022). Graph Neural Networks for Adapting Off-the-shelf General Domain Language Models to Low-Resource Specialised Domains. In Proceedings, NAACL 2022 Workshop on Deep Learning on Graphs for Natural Language Processing, Seattle, Washington, US.
Olivier Ferret (2022). Décontextualiser des plongements contextuels pour construire des thésaurus distributionnels. In Actes, 29ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN), Avignon, France.
Olivier Ferret (2022). Building Static Embeddings from Contextual Ones: Is It Useful for Building Distributional Thesauri? In Proceedings, 13th Language Resources and Evaluation Conference (LREC), Marseille, France.
Martin Laville, Emmanuel Morin and Philippe Langlais (2022). About Evaluating Bilingual Lexicon Induction. In Proceedings, 15th Workshop on Building and Using Comparable Corpora (BUCC), Marseille, France.
Reinhard Rapp, Pierre Zweigenbaum and Serge Sharoff (2022). Proceedings of the BUCC Workshop within LREC 2022, Marseille, France.
Omar Adjali, Emmanuel Morin and Pierre Zweigenbaum (2022). Building Comparable Corpora for Assessing Multi-Word Term Alignment. In Proceedings, 13th Language Resources and Evaluation Conference (LREC), Marseille, France.
Hicham El Boukkouri, Olivier Ferret, Thomas Lavergne and Pierre Zweigenbaum (2022). Re-train or Train from Scratch? Comparing Pre-training Strategies of BERT in the Medical Domain. In Proceedings, 13th Language Resources and Evaluation Conference (LREC), Marseille, France.
Omar Adjali, Emmanuel Morin, Serge Sharoff, Reinhard Rapp and Pierre Zweigenbaum (2022). Overview of the 2022 BUCC Shared Task: Bilingual Alignment in Comparable Specialized Corpora. In Proceedings, 15th Workshop on Building and Using Comparable Corpora (BUCC), Marseille, France.

2021

Olivier Ferret (2021). Using Distributional Principles for the Semantic Study of Contextual Language Models. In 35th Pacific Asia Conference on Language, Information and Computation (PACLIC), Shanghai, China (Online).
Olivier Ferret (2021). Exploration des relations sémantiques sous-jacentes aux plongements contextuels de mots. In 28ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN), 26–36, Lille, France (Online). [pdf]
Yizhe Wang, Béatrice Daille and Nabil Hathout (2021). In Actes, 28ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN), Lille, France.
Lucie Gianola, Hicham El Boukkouri, Cyril Grouin, Thomas Lavergne, Patrick Paroubek and Pierre Zweigenbaum (2021). Differential Evaluation: a Qualitative Analysis of Natural Language Processing System Behavior Based Upon Data Resistance to Processing. In Proceedings, 2nd Workshop on Evaluation and Comparison of NLP Systems, Punta Cana, Dominican Republic.

2020

Ludovic Tanguy, Cécile Fabre and Yoann Bard (2020). Impact de la structure logique des documents sur les modèles distributionnels : expérimentations sur le corpus TALN. In Actes, 27ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN), 122–135, Nancy, France. [pdf]
Martin Laville, Amir Hazem, Emmanuel Morin and Philippe Langlais (2020). Data Selection for Bilingual Lexicon Induction from Specialized Comparable Corpora. In Proceedings of the 28th International Conference on Computational Linguistics (COLING), Barcelona, Spain.
Yizhe Wang, Béatrice Daille and Nabil Hathout (2020). A study of semantic projection from single word terms to multi-word terms in the environment domain. In Proceedings of the 6th International Workshop on Computational Terminology, 50–54, Marseille, France. [pdf]
Hicham El Boukkouri, Olivier Ferret, Thomas Lavergne, Hiroshi Noji, Pierre Zweigenbaum and Junichi Tsujii (2020). CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters. arXiv preprint arXiv:2010.10392. [pdf] [code]
Hicham El Boukkouri (2020). Ré-entraîner ou entraîner soi-même ? Stratégies de pré-entraînement de BERT en domaine médical. In Actes, 22ème Rencontres des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL), 29–42, Nancy. [pdf] [code]
Pauline Brunet, Olivier Ferret and Ludovic Tanguy (2020). Which Dependency Parser to Use for Distributional Semantics in a Specialized Domain? In Proceedings, 6th International Workshop on Computational Terminology (COMPUTERM), 26–36, Marseille, France. [pdf]
Ludovic Tanguy, Pauline Brunet and Olivier Ferret (2020). Extrinsic Evaluation of French Dependency Parsers on a Specialized Corpus: Comparison of Distributional Thesauri. In Proceedings, 12th Language Resources and Evaluation Conference (LREC), 5822–5830, Marseille, France. [pdf]
Martin Laville, Amir Hazem and Emmanuel Morin (2020). TALN/LS2N Participation at the BUCC Shared Task: Bilingual Dictionary Induction from Comparable Corpora. In Proceedings, 13th Workshop on Building and Using Comparable Corpora (BUCC), Marseille, France.
Martin Laville, Mérième Bouhandi, Emmanuel Morin and Philippe Langlais (2020). Seed Lexicons, Word Representations, Mapping Procedure, and Evaluation Lists: What Matters in Bilingual Lexicon Induction from Comparable Corpora? In Proceedings, 33rd Canadian Conference on Artificial Intelligence (CAIAC), Quebec, Canada.
Béatrice Daille, Kyo Kageura and Ayla Rigouts Terryn, editors (2020). 6th International Workshop on Computational Terminology (COMPUTERM 2020), associated with the Twelfth International Language Resources and Evaluation Conference (LREC 2020).
Amir Hazem, Mérième Bouhandi, Florian Boudin and Béatrice Daille (2020). TermEval 2020: TALN-LS2N System for Automatic Term Extraction. In Proceedings, 6th International Workshop on Computational Terminology (COMPUTERM 2020) at LREC 2020, 95–100, Marseille, France, 11–16 May 2020.

2019

Hicham El Boukkouri, Olivier Ferret, Thomas Lavergne and Pierre Zweigenbaum (2019). Embedding Strategies for Specialized Domains: Application to Clinical Entity Recognition. In Proceedings, 57th Conference of the Association for Computational Linguistics (ACL) student research workshop, 295–301, Florence, Italy. [pdf] [code]
Mérième Bouhandi (2019). Apport des termes complexes pour enrichir l’analyse distributionnelle en domaine spécialisé. In Actes, 21ème Rencontres des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL), 473–486, Toulouse, France. [pdf]
Ludovic Tanguy, Pauline Brunet and Olivier Ferret (2019). Comparaison qualitative et extrinsèque d'analyseurs syntaxiques du français : confrontation de modèles distributionnels sur un corpus spécialisé. In Actes, 26ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN), 39–53, Toulouse, France. [pdf]
Mohamadou Ba, Robert Bossy, Pauline Brunet, Louise Deléger, Hicham El Boukkouri, Olivier Ferret, Arnaud Ferré, Thomas Lavergne, Claire Nédellec and Pierre Zweigenbaum (2019). Combining string-based and embeddings-based methods for medical concept normalization: LIMSI-CEA-INRA@n2c2 2019. In Özlem Uzuner, Yanshan Wang, Feichen Shen and Anna Rumshisky, editors, 2019 n2c2/OHNLP Shared Task on Challenges in Natural Language Processing for Clinical Data.

2018

Olivier Ferret (2018). Using pseudo-senses for improving the extraction of synonyms from word embeddings. In Proceedings, 56th Annual Meeting of the Association for Computational Linguistics (ACL), short paper session, 351–357, Melbourne, Australia. [pdf]
Olivier Ferret (2018). Des pseudo-sens pour améliorer l'extraction de synonymes à partir de plongements lexicaux. In Actes, 25e Conférence sur le Traitement Automatique des Langues Naturelles (CORIA-TALN-RJC), session articles courts, 365–373, Rennes, France. [pdf]
Amir Hazem and Emmanuel Morin (2018). Leveraging Meta-Embeddings for Bilingual Lexicon Extraction from Specialized Comparable Corpora. In Proceedings, 27th International Conference on Computational Linguistics (COLING), 937–949, Santa Fe, New Mexico, USA. [pdf]
