Job Description
Role: Staff Principal Performance Engineer (Remote)
Responsibilities:
● Performance Leadership:
○ Define and implement performance engineering strategies for our Generative AI full stack, including services, applications, LLMs, RAG pipelines, and related infrastructure.
○ Lead performance testing, profiling, and analysis efforts to identify and resolve performance bottlenecks.
○ Establish and maintain performance benchmarks and SLAs for critical AI services.
○ Provide technical leadership and mentorship to performance engineering team members.
● LLM Capacity and Tuning:
○ Analyze and improve LLM inference performance, including latency, throughput, and resource utilization.
○ Develop and implement strategies for LLM capacity planning and scaling.
○ Collaborate with AI researchers to evaluate and improve LLM model architectures and training techniques for perform...