GPU kernels engineer

Ljubljana, Slovenia

About the role

At Soniox, we build cutting-edge real-time AI systems — and we push every layer of the stack to its limit. As a GPU kernels engineer, you’ll be at the heart of this effort: writing custom high-performance kernels that unlock the full potential of modern hardware and fuel massive-scale training and inference workloads.

You’ll work across platform, research, and systems teams to accelerate our most demanding training jobs and shape model architectures around the physical realities of GPUs. If you care deeply about register pressure, tensor core utilization, warp shuffle efficiency, and squeezing every last FLOP out of the compute units and every last byte out of memory bandwidth, this role is for you.

In this role, you will:

  • Design and implement custom GPU/CPU kernels to maximize hardware throughput and efficiency.
  • Optimize for HBM utilization, instruction issue rate, cache locality, and memory bandwidth.
  • Collaborate with platform and infra teams to integrate and deploy kernels at scale.
  • Develop low-precision kernels and quantization-aware techniques to reduce compute without compromising ML accuracy.
  • Partner with ML engineers to co-design model architectures optimized for real-time training and inference.
  • Work directly with hardware vendors to advise on architecture direction and co-design opportunities.

You might thrive in this role if you:

  • Write excellent C/C++ and Python code and enjoy building fast, clean low-level systems.
  • Have a deep understanding of GPU (especially CUDA), CPU, or AI accelerator architectures.
  • Know how to optimize every part of a compute kernel — from memory layout to instruction scheduling.
  • Have experience working on large-scale ML training infrastructure, ideally for LLMs or real-time AI models.
  • Are skilled in quantization and low-precision computation for modern ML workloads.
  • Thrive on performance benchmarks and obsess over every percentage point of speedup.
  • Have 3+ years of experience in high-performance computing, ML infra, or systems-level optimization.

Why Soniox

You’ll help build one of the most technically advanced AI platforms in the world — and shape how it reaches and supports users globally.

You’ll work directly with a world-class team of engineers and researchers solving frontier problems in speech and language AI.

You’ll have a voice in how our company grows, how our customers succeed, and how AI transforms human communication.

Ready to join Soniox? Apply now