Job opportunities
Associated Inria centres
Contract type
Context
<p>This position is part of NeuroKnowAI, a deep-tech startup project that stems from research and is currently in the Inria Startup Studio acceleration program. NeuroKnowAI is a privacy-first intelligent document processing platform with domain knowledge across industries.</p>
<p>The objective is to develop and integrate <strong>AI models and document processing pipelines</strong> for intelligent multi-industry document processing (insurance, healthcare, legal, finance, media, HR, marketing, real estate) with a privacy-first architecture.</p>
<p>No regular travel is foreseen for this post. Work is primarily on-site (some remote days are available).</p>
Assigned mission
<p><strong>Assignments:</strong><br />With the help of the NeuroKnowAI technical team, the recruited person will design, develop, and optimize machine learning models for intelligent document processing, including Transformer models, Named Entity Recognition (NER), and differential privacy algorithms.</p>
<p><strong>Collaboration:</strong><br />The recruited person will work with the R&D team that develops the NeuroDoc, NeuroShield, and NeuroGuard products to ensure that ML models are integrated into the production infrastructure.</p>
<p><strong>Responsibilities:</strong><br />The recruited person is responsible for designing and implementing industry-specific ML models and will take the initiative to improve the performance, accuracy, and efficiency of document processing pipelines.</p>
<p><strong>Steering/Management:</strong><br />The person recruited will be responsible for documenting technical developments and contributing to ML architectural decisions.</p>
Main activities
<p><strong>Main activities:</strong><br />1. Develop and train Transformer models for multi-modal document processing (OCR, speech-to-text, text analysis)<br />2. Design industry-specific NER models (healthcare, legal, finance, insurance, etc.)<br />3. Implement differential privacy algorithms for NeuroShield<br />4. Optimize ML pipelines for high-performance processing (multi-GPU, mixed precision computation)<br />5. Integrate models into semantic search infrastructure</p>
<p><strong>Complementary activities:</strong><br />1. Write technical documentation and performance reports<br />2. Test, modify, and validate models before production deployment<br />3. Present work progress to partners and the team</p>
Skills
<p><strong>Technical skills and level required:</strong><br />- Python: Expert<br />- PyTorch or TensorFlow: Advanced<br />- Hugging Face Transformers: Advanced<br />- NLP and document processing: Advanced<br />- OCR and multi-modal processing: Intermediate to Advanced<br />- GPU optimization (CUDA, mixed precision): Intermediate<br />- MLOps (Docker, CI/CD, model deployment): Intermediate<br />- Git and version control: Advanced</p>
<p><strong>Languages:</strong><br />- English: Fluent (technical documentation, team communication)<br />- French: Appreciated but not mandatory</p>
<p><strong>Relational skills:</strong><br />- Ability to communicate complex technical concepts clearly<br />- Team spirit and collaboration<br />- Autonomy and initiative<br />- Adaptability in a fast-evolving environment</p>
<p><strong>Other appreciated assets:</strong><br />- Experience with differential privacy techniques<br />- Knowledge of data protection regulations (GDPR, HIPAA)<br />- Experience in industry-specific document processing (healthcare, legal, finance)<br />- Open-source contributions or scientific publications</p>
Reference
2025-09639
Field of activity