Our client is a premier Service Center exclusively offering various functions to the Group of diverse Global Companies it owns. This is a place where things are often changing and growing, so you won't be bored!
As an HPC (High-Performance Computing) Data Center/Private Cloud Architect, you will be responsible for designing and implementing state-of-the-art data center/private cloud infrastructure solutions that support high-performance computing and scientific research. You will collaborate with cross-functional teams, including researchers, system administrators, network engineers, and data scientists, to understand their requirements and create efficient and scalable data center architectures.
Your expertise in HPC technologies, data center design principles, and emerging trends will be instrumental in driving innovation and optimizing performance within the data center/private cloud environment, as well as in possible integration with the public cloud (such as AWS or Azure).
Key Responsibilities:
- Data Center Architecture Design: Develop and refine data center architecture blueprints and guidelines, considering performance, scalability, security, and efficiency aspects. Design and implement solutions for compute, storage, networking, and cooling infrastructure that align with HPC requirements.
- HPC Infrastructure Optimization: Continuously evaluate and enhance the data center infrastructure to maximize HPC performance and resource utilization. Identify and address potential bottlenecks and performance gaps, employing industry best practices and cutting-edge technologies.
- System Integration and Deployment: Collaborate with system administrators and engineers to ensure seamless integration and deployment of HPC systems within the data center. Oversee hardware and software installation, configuration, and testing activities.
- Research and Evaluation: Stay up to date with emerging HPC technologies, tools, and methodologies. Conduct research and feasibility studies on new hardware and software solutions to enhance capabilities and support new customer use cases. Evaluate vendor offerings and provide recommendations for procurement.
- Performance Monitoring and Troubleshooting: Monitor and analyze performance metrics to identify issues and implement necessary optimizations. Troubleshoot complex system problems, working closely with technical teams to ensure efficient resolution and minimal impact on operations.
- Security and Compliance: Collaborate with security teams to design and implement robust security measures within the data center infrastructure. Ensure compliance with relevant industry standards and regulations, such as HIPAA or GDPR, in data handling and storage.
- Documentation and Reporting: Create comprehensive technical documentation, including architectural diagrams, standard operating procedures, and configuration guidelines. Prepare regular reports on data center performance, capacity planning, and future infrastructure requirements.
- Team Collaboration and Leadership: Collaborate effectively with cross-functional teams, fostering a culture of knowledge sharing and innovation. Provide technical leadership and mentorship to junior team members, guiding them in adopting best practices and enhancing their skill sets.
Qualifications and Skills:
- Advanced degree in computer science, engineering, or a related field.
- Proven experience as an HPC architect or in a similar role, with a strong focus on multi-tenant data center infrastructure design and performance optimization.
- In-depth knowledge of HPC technologies, including parallel/cluster computing, distributed storage systems, InfiniBand, RDMA and Ethernet networking, GPU acceleration, and job scheduling frameworks.
- Experience with CFD (Computational Fluid Dynamics) workloads, tooling (e.g., OpenFOAM, Ansys, Altair, Siemens Simcenter, Pointwise), and associated HPC optimization is a major plus.
- Proficiency in data center design principles, including power and cooling considerations, space planning, physical security, and network topologies (preferably in a multi-tenant environment).
- Familiarity with industry-standard tools and software used in HPC environments, such as SLURM, PBS Pro, Lustre, GPFS, OpenStack, cluster managers and containerization technologies (e.g., Docker, Kubernetes).
- Experience architecting hybrid private (on-prem/data center) and public cloud (AWS or Azure) infrastructure is a plus.
- Strong problem-solving and analytical skills, with the ability to identify and resolve complex technical issues.
- Excellent communication and interpersonal skills, with the ability to collaborate effectively with diverse teams and stakeholders.
- Detail-oriented mindset with a strong focus on documentation and adherence to standards.
- Familiarity with security protocols and compliance requirements in the context of data center operations.
- Ability to adapt to a fast-paced and rapidly evolving technological landscape.
Join our team as an HPC Data Center Architect and contribute to the advancement of scientific research and innovation by designing and optimizing cutting-edge data center infrastructure.