Location
ottawa
Job Type
Full-time
Posted
May 20, 2026
Job Description
HPC Specialist – Role Overview
Arca dion is seeking an HPC (High-Performance Computing) Specialist responsible for the design, deployment, optimization, and management of high-performance computing systems, often used in scientific research, engineering simulations, AI/ML workloads, and large-scale data analytics.
Core Responsibilities
- Architecture & System Design
- Design scalable HPC clusters (on-prem, cloud, or hybrid)
- Choose appropriate CPUs, GPUs, interconnects (e.g., InfiniBand), and storage
- Configure Slurm, PBS, or OpenHPC job schedulers
- Cluster Deployment & Maintenance
- Install and manage Linux-based compute nodes
- Maintain job schedulers and resource managers
- Integrate monitoring tools (Prometheus, Grafana, Nagios)
- Performance Tuning & Optimization
- Benchmark workloads and tune for performance (e.g., MPI, CUDA, OpenMP)
- ...