TRAN Thi Hong Hanh

NLP Engineer | Open-source Contributor

[email protected]

Paris, France

Vietnamese (Native), English (C1), French (B2), Spanish (A1)

Google Scholar GitHub LinkedIn Website

Work Experience

NLP / ML Research Engineer

Arkhn • Paris, France

2024 – Present

Designed and deployed an assistant for hospitals, enabling clinicians to query patient tracking data through a chatbot and auto-generate reports.
Built robust data pipelines integrating OCR, document parsing, and information extraction across heterogeneous medical document formats.
Implemented real-time monitoring and evaluation to track system performance.

Founder

Tihado • Paris, France

2024 – Present

Vidzly - 1st Prize for Creativity at Agents & MCP Hackathon 2025
EatAble - 3rd Prize at AMD Robotics Hackathon 2025
Zootopi - a knowledge-sharing platform featuring tutorials, research projects, and blog articles.

Ph.D.

La Rochelle University (France) & Jožef Stefan Institute (Slovenia)

2021 – 2024

Conducted applied research on cross-lingual and cross-domain terminology extraction for scientific and institutional users under several funded projects: RSDO, TERMITRAID, PROTEUS (BI-FR/23-24-PROTEUS006), CANDAS (P2-0103), KOBOS (J6-3131).
Developed end-to-end ML pipelines deployed in public research infrastructures (e.g. the Slovenian Terminology Portal) and prototyped solutions for named entity recognition in Slavic languages (Slovenian, Macedonian, Serbian, Bosnian, Croatian) under the KLIPING project.

Data Scientist / ML Engineer

3T JSC • Hanoi, Vietnam

2020 – 2021

Designed and deployed credit scoring systems used in banks and fintech companies.
Built data pipelines and evaluation metrics for large-scale call-center analytics.

Data Scientist

Samsung SDS • Hanoi, Vietnam

2019 – 2020

Developed ML pipelines for legal document classification and entity extraction, supporting compliance with Vietnamese law.

Teaching Experience

VietAI, Vietnam

2021 – Present

Pre-Machine Learning
Foundation of Deep Learning
ChatGPT for everyone
Build applications with OpenAI

Python Trainer • MCI, Vietnam

2021-2022

Python for Data Analysis
Python for Machine Learning & Deep Learning

Program Manager • BeeCode, Vietnam

2016-2017

Computer Science classes
Summer schools

Education

Ph.D.

La Rochelle University, France & Jožef Stefan Institute, Slovenia

2021 – 2024

MSc

University of Montpellier, France

2019 – 2020

BSc

University of Science and Technology of Hanoi, Vietnam

2014 – 2017

Awards & Certificates

1st Place, Creativity Award

MCP 1st Birthday Hackathon

Anthropic, HuggingFace, Gradio (December 2025)

3rd Place

AMD Open Robotics Hackathon

AMD, HuggingFace, WowRobo, Data Monsters (December 2025)

Certification

Associate Cloud Engineer

June 2025

Open-source & Community Contributions

Open-source contributor: robotics-course, agents-course, mcp-course, data-science-specialization
Developed tutorials and educational material for ML practitioners and researchers
Conference roles: Programme committee member, reviewer, and co-chair (LREC, EMNLP, COLING, ECAI, ICPR, SemEval, ESSLLI)
Speaker at Women in AI & Women Techmakers events

Main Publications

1. 2025 • Conference

SEKE: Specialised Experts for Keyword Extraction

Matej Martinc, Hanh Thi Hong Tran, et al.

In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 14191–14205, Suzhou, China.

2. Conference

Ar-Q-former: Historical Newspaper Article Separation based on Multimodal Transformer Structure

Wenjun Sun, Nancy Girdhar, Hanh Thi Hong Tran, et al.

In International Conference on Document Analysis and Recognition (pp. 476-492). Cham: Springer Nature Switzerland.

3. 2025 • Journal

LlamATE: Automated terminology extraction using large-scale generative language models

Hanh Thi Hong Tran, et al.

Terminology: International Journal of Theoretical and Applied Issues in Specialized Communication, 2025, vol. 31, no. 1, pp. 5-36.

4. 2024 • Conference

Is Prompting What Term Extraction Needs?

Hanh Thi Hong Tran, et al.

In International Conference on Text, Speech, and Dialogue. Cham: Springer Nature Switzerland, 2024.

5. Best Paper Award • 2024 • Conference

Leveraging Open Large Language Models for Historical Named Entity Recognition

Carlos-Emiliano González-Gallardo, Hanh Thi Hong Tran, et al.

International Conference on Theory and Practice of Digital Libraries. Cham: Springer Nature Switzerland, 2024.

6. 2024 • Journal

Can cross-domain term extraction benefit from cross-lingual transfer and nested term labeling?

Hanh Thi Hong Tran, et al.

Machine Learning, 2024, pp. 1-30.

Full list available on Google Scholar