At JetBrains, code is our passion. Ever since we started back in 2000, we have been striving to make the world’s most robust and effective developer tools. By automating routine checks and corrections, our tools speed up production, freeing developers to grow, discover, and create.
We are working on an ambitious new platform that provides AI capabilities to all JetBrains products. The platform is built on models developed in-house for writing and coding assistance, as well as on integrations with models from our strategic partners.
We are looking for an experienced and passionate ML Inference Engineer to join our team at JetBrains AI. In this role, you’ll play a crucial part in deploying and optimizing ML models, ensuring they run efficiently and reliably in production environments.
We value engineers who:
- Work well both independently and as part of a collaborative team.
- Communicate effectively, both in writing and verbally, with technical and non-technical stakeholders.
- Take ownership of projects and drive them to completion with consistently high quality.
In this role, you will:
- Optimize and run ML models for inference in high-load production environments.
- Collaborate with ML engineers to develop scalable and efficient pipelines.
- Monitor and troubleshoot ML models in production, ensuring their reliability and performance.
- Contribute to the development of best practices for ML deployment and inference.
We’ll be happy to have you on our team if you have:
- Proven experience in deploying ML models to production, including experience with model optimization techniques (e.g., quantization, knowledge distillation).
- Strong programming skills in Python and familiarity with other languages (e.g., C++, Java).
- Proficiency with ML frameworks such as PyTorch and common libraries for NLP.
- Experience with cloud platforms like AWS, Google Cloud, or Azure for ML deployment.
- Familiarity with containerization and orchestration tools like Docker and Kubernetes.
We’d be especially thrilled if you have:
- Advanced knowledge of model serving frameworks such as vLLM, TensorRT-LLM, llama.cpp, OpenVINO, or similar.
- Experience profiling ML and MLOps code to identify bottlenecks and improve performance.
- Experience with MLOps tools and practices, including CI/CD for ML.
- Expertise in optimizing models for latency and throughput in production environments.
- Previous work experience with NLP and LLMs.
To develop JetBrains AI, we use:
- Git for source control management.
- AWS and GCP for cloud computing infrastructure.
- Python, PyTorch, and Hugging Face as our ML development stack.
- TeamCity as our CI automation system.