AI Inference Engineer

HPC AI Technology Pte. Ltd. · Singapore

Location

Singapore

Job Type

Full-time

Posted

June 11, 2026

Job Description

We are looking for a highly skilled engineer to build, optimize, and maintain high-performance inference services for large language models (LLMs) and multimodal models. You will work closely with algorithm, systems, and product teams to deliver best-in-class performance, stability, and efficiency in production environments, ensuring low-latency, highly available AI services for tens of millions of users.

Key Responsibilities

High-Performance Computing & Kernel Optimization

Perform deep GPU/CUDA kernel optimization, including memory access pattern tuning, instruction-level parallelism, and warp-level optimization, to fully utilize hardware capabilities.

Develop and optimize custom high-performance operators using advanced DSLs or compiler frameworks such as Triton and TVM.

Identify and resolve performance bottlenecks in scenarios such as operator fusion and quantization.
