Job opportunities

Associated Inria centres

Contract type

Fixed-term contract (CDD)

Context

This position is part of NeuroKnowAI, a deep-tech startup project that grew out of research and is currently in the Inria Startup Studio acceleration program. NeuroKnowAI is a privacy-first intelligent document processing platform with domain knowledge across industries.
The objective is to develop and integrate AI models and document processing pipelines for multi-industry use (insurance, healthcare, legal, finance, media, HR, marketing, real estate) on a privacy-first architecture.
No regular travel is foreseen for this position. Work is primarily on-site, with some remote days available.

Assignment

Assignments: With the help of the NeuroKnowAI technical team, the recruited person will design, develop, and optimize machine learning models for intelligent document processing, including Transformer models, Named Entity Recognition (NER) models, and differential privacy algorithms.
Collaboration: The recruited person will work with the R&D team that develops the NeuroDoc, NeuroShield, and NeuroGuard products, to ensure that ML models are integrated into the production infrastructure.
Responsibilities: The recruited person is responsible for designing and implementing industry-specific ML models and is expected to take the initiative in improving the performance, accuracy, and efficiency of the document processing pipelines.
Steering/Management: The recruited person will be responsible for documenting technical developments and will contribute to ML architecture decisions.

Main activities

Main activities:
1. Develop and train Transformer models for multi-modal document processing (OCR, speech-to-text, text analysis)
2. Design industry-specific NER models (healthcare, legal, finance, insurance, etc.)
3. Implement differential privacy algorithms for NeuroShield
4. Optimize ML pipelines for high-performance processing (multi-GPU, mixed precision computation)
5. Integrate models into semantic search infrastructure

Complementary activities:
1. Write technical documentation and performance reports
2. Test, modify, and validate models before production deployment
3. Present work progress to partners and the team

Skills

Technical skills and level required:
- Python: Expert
- PyTorch or TensorFlow: Advanced
- Hugging Face Transformers: Advanced
- NLP and document processing: Advanced
- OCR and multi-modal processing: Intermediate to Advanced
- GPU optimization (CUDA, mixed precision): Intermediate
- MLOps (Docker, CI/CD, model deployment): Intermediate
- Git and version control: Advanced

Languages:
- English: Fluent (technical documentation, team communication)
- French: Appreciated but not mandatory

Relational skills:
- Ability to communicate complex technical concepts clearly
- Team spirit and collaboration
- Autonomy and initiative
- Adaptability in a fast-evolving environment

Other valued assets:
- Experience with differential privacy techniques
- Knowledge of data protection regulations (GDPR, HIPAA)
- Experience in industry-specific document processing (healthcare, legal, finance)
- Open-source contributions or scientific publications

Reference

2025-09639

Theme

Data and Knowledge Representation and Processing

Field of activity

Software engineering

AI/Machine Learning Engineer (F/M)