Taalas

Software Engineer – Inference Serving

📍 Location
Toronto, ON
⏰ Job Type
Full-time
📅 Posted
May 27, 2026

Job Description

At Taalas we believe that fundamental progress is achieved by those who are willing to understand and assail a problem end-to-end, without regard for commonly accepted abstractions and boundaries. We are building a team of hands‑on technologists who dislike overspecialization and seek to excel in both depth and breadth. In this position the successful candidate will build software infrastructure for an inference serving cluster built around Taalas hardcore AI model chips.

Job Responsibilities

Adapt open‑source inference servers like vLLM and Punica to interface with Taalas’ hardcore AI models

Implement a highly efficient LoRA swapping solution for multi-{tenant,LoRA} environments

Build and test a scalable inference serving cluster using Kubernetes (K8s) and Traefik or similar

Qualifications

Bachelor’s or higher degree in Computer Science...
