At JetBrains, code is our passion. Ever since we started back in 2000, we have been striving to make the world’s most robust and effective developer tools. By automating routine checks and corrections, our tools speed up production, freeing developers to grow, discover, and create.
We are working on an ambitious new platform that provides AI capabilities to all JetBrains products. The platform is built on models developed in-house for writing and coding assistance, as well as on integrations with models from our strategic partners.
We are looking for an experienced and passionate ML Inference Engineer to join our team at JetBrains AI. In this role, you’ll play a crucial part in deploying and optimizing ML models, ensuring they run efficiently and reliably in production environments.
We value engineers who:
- Work well both independently and as part of a collaborative team.
- Communicate effectively, both in writing and verbally, with technical and non-technical stakeholders.
- Take ownership of projects and drive them to completion with consistently high quality.
In this role, you will:
- Optimize and run ML models for inference in high-load production environments.
- Collaborate with ML engineers to develop scalable and efficient pipelines.
- Monitor and troubleshoot ML models in production, ensuring their reliability and performance.
- Contribute to the development of best practices for ML deployment and inference.
We’ll be happy to have you on our team if you have:
- Proven experience in deploying ML models to production, including experience with model optimization techniques (e.g., quantization, knowledge distillation).
- Strong programming skills in Python and familiarity with other languages (e.g., C++, Java).
- Proficiency with ML frameworks such as PyTorch and common libraries for NLP.
- Experience with cloud platforms like AWS, Google Cloud, or Azure for ML deployment.
- Familiarity with containerization and orchestration tools like Docker and Kubernetes.
We’d be especially thrilled if you have:
- Advanced knowledge of model serving frameworks such as vLLM, TensorRT-LLM, llama.cpp, OpenVINO, or similar.
- Experience profiling ML and MLOps code to identify bottlenecks and improve performance.
- Experience with MLOps tools and practices, including CI/CD for ML.
- Expertise in optimizing models for latency and throughput in production environments.
- Previous work experience with NLP and LLMs.
To develop JetBrains AI, we use:
- Git for source control management.
- AWS and GCP for cloud computing infrastructure.
- Python, PyTorch, and Hugging Face as our ML development stack.
- TeamCity as our CI automation system.