
TRAN Thi Hong Hanh
NLP Engineer | Open-source Contributor
Work Experience
NLP / ML Research Engineer
Arkhn • Paris, France
- Designed and deployed an assistant for hospitals that lets clinicians query patient tracking data through a chatbot and auto-generate reports.
- Built robust data pipelines integrating OCR, document parsing, and information extraction across heterogeneous medical document formats.
- Implemented real-time monitoring and evaluation to trace system performance.
Founder
Tihado • Paris, France
- Vidzly - 1st Prize for Creativity at Agents & MCP Hackathon 2025
- EatAble - 3rd Prize at AMD Robotics Hackathon 2025
- Zootopi - a knowledge-sharing platform featuring tutorials, research projects, and blog articles.
Ph.D.
La Rochelle University (France) & Jožef Stefan Institute (Slovenia)
- Conducted applied research on cross-lingual and cross-domain terminology extraction for scientific and institutional users under several funded projects: RSDO, TERMITRAID, PROTEUS (BI-FR/23-24-PROTEUS006), CANDAS (P2-0103), and KOBOS (J6-3131).
- Developed end-to-end ML pipelines deployed in public research infrastructures (e.g., the Slovenian Terminology Portal) and prototyped named entity recognition solutions for Slavic languages (Slovenian, Macedonian, Serbian, Bosnian, Croatian) under the KLIPING project.
Data Scientist / ML Engineer
3T JSC • Hanoi, Vietnam
- Designed and deployed credit scoring systems used in banks and fintech companies.
- Built data pipelines and evaluation metrics for large-scale call-center analytics.
Data Scientist
Samsung SDS • Hanoi, Vietnam
- Developed ML pipelines for legal document classification and entity extraction, supporting compliance with Vietnamese law.
Teaching Experience
VietAI, Vietnam
2021-Now
- Pre-Machine Learning
- Foundation of Deep Learning
- ChatGPT for everyone
- Build applications with OpenAI
Python Trainer
MCI, Vietnam
2021-2022
- Python for Data Analysis
- Python for Machine Learning & Deep Learning
Program Manager
BeeCode, Vietnam
2016-2017
- Computer Science classes
- Summer schools
Education
Ph.D.
La Rochelle University, France & Jožef Stefan Institute, Slovenia
MSc
University of Montpellier, France
BSc
University of Science and Technology of Hanoi, Vietnam
Awards & Certificates
MCP 1st Birthday Hackathon
Anthropic, HuggingFace, Gradio (December 2025)
AMD Open Robotics Hackathon
AMD, HuggingFace, WowRobo, Data Monsters (December 2025)
Associate Cloud Engineer
June 2025
Open-source & Community Contributions
- Open-source contributor: robotics-course, agents-course, mcp-course, data-science-specialization
- Developed tutorials and educational material for ML practitioners and researchers
- Conference roles: Programme committee member, reviewer, and co-chair (LREC, EMNLP, COLING, ECAI, ICPR, SemEval, ESSLLI)
- Speaker at Women in AI & Women Techmakers events
Main Publications
SEKE: Specialised Experts for Keyword Extraction
Matej Martinc, Hanh Thi Hong Tran, et al.
In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 14191–14205, Suzhou, China.
Ar-Q-former: Historical Newspaper Article Separation based on Multimodal Transformer Structure
Wenjun Sun, Nancy Girdhar, Hanh Thi Hong Tran, et al.
In International Conference on Document Analysis and Recognition, pages 476–492. Cham: Springer Nature Switzerland.
LlamATE: Automated terminology extraction using large-scale generative language models
Hanh Thi Hong Tran, et al.
Terminology: International Journal of Theoretical and Applied Issues in Specialized Communication, 2025, vol. 31, no. 1, pp. 5–36.
Is Prompting What Term Extraction Needs?
Hanh Thi Hong Tran, et al.
In International Conference on Text, Speech, and Dialogue. Cham: Springer Nature Switzerland, 2024.
Leveraging Open Large Language Models for Historical Named Entity Recognition
Carlos-Emiliano González-Gallardo, Hanh Thi Hong Tran, et al.
International Conference on Theory and Practice of Digital Libraries. Cham: Springer Nature Switzerland, 2024.
Can cross-domain term extraction benefit from cross-lingual transfer and nested term labeling?
Hanh Thi Hong Tran, et al.
Machine Learning, 2024, pp. 1–30.
Full list available on Google Scholar