- Vast experience with High Performance Computing environments on Linux systems
- Subject-matter expert (SME) in the Linux/Unix domain, serving as the highest-level technical escalation point within the team
- Understanding of infrastructure technologies including server, storage, network, database, and virtualization
- Experience managing technical workstation blades (Windows and Linux), including building and rebuilding them from customized OS images
- Plan, prepare, and execute required security patching and software upgrades on time during scheduled maintenance periods
- Hands-on experience writing Ansible playbooks
- Develop and deploy scripts that automate repetitive tasks and improve support of the HPC systems
- Support HPC end users and continually improve their user experience
- Experience with GPUs and graphics cards
- Ability to work in complex and international remote teams and handle multiple tasks & projects simultaneously
- Strong troubleshooting and problem-solving skills for tackling highly complex, large-scale technical problems
- Experience with Microsoft Azure Pipelines and branching workflows
- Experience with system management and monitoring/alerting tools (e.g., Zabbix, Splunk)
- Ability to quantify and analyze system and communication network issues, determine their root cause, resolve them, and develop preventive actions
- Understanding of Agile methodology