High-Performance Computing Engineer

Looking to hire your next High-Performance Computing Engineer? Here’s a full job description template to use as a guide.

About Vintti

Vintti provides cost-effective staffing solutions for US businesses. By connecting American companies with Latin American professionals, we offer a way to reduce operational costs without sacrificing quality. Our approach lets businesses reinvest the savings into core areas, supporting growth and market competitiveness.

Description

A High-Performance Computing (HPC) Engineer is a specialist responsible for designing, implementing, and managing high-speed computing systems and networks that handle large-scale data processing and complex computational tasks. They optimize hardware and software configurations to improve computational efficiency and performance, ensuring the infrastructure meets the demands of scientific, engineering, and research applications. The role involves collaborating with researchers and developers to tailor HPC solutions to specific project needs, keeping up to date with technological advancements, and continually improving system scalability and robustness.

Requirements

- Bachelor's degree in Computer Science, Information Technology, or related field.
- Proven experience in designing, deploying, and maintaining HPC clusters and infrastructure.
- Strong proficiency in system administration, including Linux/Unix-based systems.
- Experience with scripting languages such as Python, Bash, or Perl for automation and system management.
- Familiarity with job scheduling and resource management tools such as Slurm, PBS, or Torque.
- Knowledge of HPC software, libraries, and scientific applications.
- Expertise in performance analysis, benchmarking, and optimization techniques.
- Understanding of network configuration and high-speed interconnects such as InfiniBand.
- Experience with data storage solutions, backup, and recovery operations.
- Strong problem-solving skills and the ability to troubleshoot and resolve complex issues.
- Knowledge of cybersecurity practices and experience in applying security patches and conducting vulnerability assessments.
- Excellent communication skills and the ability to collaborate effectively with researchers, scientists, and engineers.
- Ability to provide technical support and training to end-users with varying levels of expertise.
- Proficiency in documenting system configurations, procedures, and troubleshooting techniques.
- Experience coordinating with hardware and software vendors for system upgrades and patches.
- Willingness to participate in on-call rotation for critical system support and maintenance tasks.
- Demonstrated ability to stay updated with the latest HPC technologies, trends, and advancements.
- Experience working in multidisciplinary project teams to support computational research and development.

Responsibilities

- Design, deploy, and maintain HPC clusters and infrastructure.
- Perform system administration, including monitoring, tuning, troubleshooting, and optimizing HPC systems.
- Develop and implement custom scripts and automation tools.
- Collaborate with researchers, scientists, and engineers to understand and address computational requirements.
- Implement and manage scheduling systems and job queues.
- Install, configure, and maintain HPC software, libraries, and scientific applications.
- Conduct performance analyses and benchmarking to resolve bottlenecks.
- Ensure security and compliance of HPC systems through patching, vulnerability assessments, and best practices.
- Provide technical support and training to end-users.
- Manage data storage solutions and perform backup and recovery operations.
- Document system configurations, procedures, and troubleshooting guides.
- Coordinate with hardware and software vendors for system upgrades and patches.
- Participate in on-call rotation for critical system support and maintenance.
- Stay updated with HPC technologies, trends, and advancements.
- Participate in multidisciplinary project teams to support computational research and development.

Ideal Candidate

The ideal candidate for the High-Performance Computing Engineer role will possess a Bachelor's degree in Computer Science, Information Technology, or a related field, and have extensive experience in designing, deploying, and maintaining HPC clusters and infrastructure. They will demonstrate strong proficiency in Linux/Unix-based system administration, coupled with a deep understanding of job scheduling and resource management tools such as Slurm, PBS, or Torque. Proficiency in scripting languages like Python, Bash, or Perl for automation and system management is crucial.

The candidate will have a proven track record in performance analysis, benchmarking, and optimization techniques, alongside a robust understanding of network configuration and high-speed interconnects like InfiniBand. They will exhibit excellent problem-solving abilities, capable of troubleshooting and resolving complex issues, and will have experience in data storage solutions, including backup and recovery operations. Knowledge of HPC software, libraries, and scientific applications, as well as cybersecurity practices and experience in applying security patches and conducting vulnerability assessments, is essential.

The candidate will be an effective communicator, both verbally and in writing, with the ability to collaborate seamlessly with researchers, scientists, and engineers. They will have a strong customer service orientation and a passion for supporting scientific research and computational development. Enthusiastic about learning and adopting new technologies, the ideal candidate will be self-motivated, highly organized, and adept at managing multiple tasks and projects concurrently. With a collaborative mindset, they will thrive in a team environment, demonstrating high ethical standards, integrity, and a commitment to continuous professional development. The ability to work under pressure, meet tight deadlines, and participate in an on-call rotation for critical system support will round out this candidate's profile, making them the perfect fit for advancing our HPC infrastructure and supporting our multidisciplinary research initiatives.

On a typical day, you will...

- Design, deploy, and maintain high-performance computing (HPC) clusters and infrastructure.
- Perform system administration tasks, including monitoring, tuning, troubleshooting, and optimizing HPC systems for performance and reliability.
- Develop and implement custom scripts and automation tools to improve workflow efficiency and system management.
- Collaborate with researchers, scientists, and engineers to understand computational requirements and provide tailored HPC solutions.
- Implement and manage scheduling systems and job queues to ensure optimal resource utilization.
- Install, configure, and maintain HPC software, libraries, and scientific applications.
- Conduct performance analyses and benchmarking to identify and resolve bottlenecks.
- Ensure the security and compliance of HPC systems by applying patches, conducting vulnerability assessments, and implementing best practices.
- Provide technical support and training to end-users on the effective use of HPC resources and software.
- Manage data storage solutions and perform data backup and recovery operations.
- Document system configurations, procedures, and troubleshooting guides for internal use and knowledge sharing.
- Coordinate with hardware and software vendors for system upgrades, patches, and troubleshooting.
- Participate in on-call rotation for critical system support and maintenance tasks.
- Stay updated with the latest HPC technologies, trends, and advancements to propose and implement improvements.
- Participate in multidisciplinary project teams to support and advance computational research and development initiatives.

What we are looking for

- Strong analytical and problem-solving skills
- Excellent attention to detail
- High level of technical aptitude and curiosity
- Effective communicator, both verbally and in writing
- Collaborative mindset and ability to work effectively in a team environment
- Enthusiastic about learning and adopting new technologies
- Self-motivated and highly organized
- Ability to manage multiple tasks and projects simultaneously
- Strong customer service orientation
- Innovative thinker with a proactive approach to system improvement
- Adaptability to rapidly changing environments
- Capability to work under pressure and meet tight deadlines
- Strong ethical standards and integrity
- Commitment to continuous professional development
- Passionate about supporting scientific research and computational development initiatives

What you can expect (benefits)

- Competitive salary range based on experience and qualifications
- Comprehensive health, dental, and vision insurance plans
- Retirement savings plan with company matching contributions
- Flexible working hours and remote work options
- Generous paid time off (vacation, sick leave, and holidays)
- Career development opportunities, including professional training and certifications
- Tuition reimbursement for continued education
- Access to state-of-the-art HPC facilities and resources
- Opportunity to work on cutting-edge research and innovative projects
- Collaborative and inclusive work environment
- Life and disability insurance coverage
- Employee assistance program (EAP)
- Wellness programs and gym membership discounts
- Relocation assistance for qualified candidates
- Employee recognition and reward programs
- Opportunities for participation in conferences and industry events
- Support for publishing and presenting research findings in academic journals and conferences
