Senior HPC AI Engineer

5 months ago


Falun, Sweden NVIDIA Full time

NVIDIA is looking for an experienced HPC Engineer to join the E2E software verification HPC/AI Infrastructure team. We are focused on building supercomputers and HPC clusters based on groundbreaking technologies. We are looking for an outstanding architect for a senior HPC role who will be a key player in the most exciting computing hardware and software, contributing to the latest breakthroughs in artificial intelligence and GPU computing. You will provide insights on at-scale system design and tuning mechanisms for large-scale compute runs. You will work with the latest accelerated computing and deep learning software and hardware platforms, and with many scientific researchers, developers, and customers, to craft improved workflows and develop new, leading, differentiated solutions. You will interact with HPC, OS, GPU compute, and systems specialists to architect, develop, and bring up large-scale performance platforms.

What you will be doing:

  • Design, implement and maintain large-scale HPC/AI clusters with monitoring, logging and alerting

  • Manage Linux job/workload schedulers and orchestration tools

  • Develop and maintain continuous integration and delivery pipelines

  • Develop tooling to automate deployment and management of large-scale infrastructure environments, to automate operational monitoring and alerting, and to enable self-service consumption of resources

  • Deploy monitoring solutions for the servers, network and storage

  • Troubleshoot from the bottom up, from bare metal through the operating system and software stack to the application level

  • Serve as a technical resource: develop, redefine and document standard methodologies to share with internal teams

  • Support Research & Development activities and engage in POCs/POVs for future improvements

What we need to see:

  • A degree in Computer Science, Engineering, or a related field and 5+ years of experience

  • Knowledge of HPC and AI solution technologies, from CPUs and GPUs to high-speed interconnects and supporting software

  • Experience with job scheduling and workload orchestration tools such as Slurm and K8s

  • Excellent knowledge of Windows and Linux (Red Hat/CentOS and Ubuntu) networking (sockets, firewalld, iptables, Wireshark, etc.) and internals, ACLs, OS-level security protection, and common protocols such as TCP, DHCP and DNS

  • Experience with multiple storage solutions such as Lustre, GPFS, ZFS and XFS. Familiarity with newer and emerging storage technologies.

  • Python programming and bash scripting experience.

  • Comfortable with automation and configuration management tools such as Jenkins, Ansible and Puppet/Chef

  • Deep knowledge of networking technologies such as InfiniBand and Ethernet

  • Deep understanding and experience with virtual systems (for example VMware, Hyper-V, KVM, or Citrix)

  • Familiarity with cloud computing platforms (e.g. AWS, Azure, Google Cloud)

Ways to stand out from the crowd:

  • Knowledge of CPU and/or GPU architecture

  • Knowledge of Kubernetes and container-related microservice technologies

  • Experience with GPU-focused hardware/software (DGX, CUDA)

  • Background with RDMA (InfiniBand or RoCE) fabrics

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.


