Senior HPC AI Infrastructure Architect

3 weeks ago


Falun, Dalarna, Sweden NVIDIA Full time

NVIDIA is seeking an experienced HPC Engineer to join the E2E software verification HPC/AI Infrastructure team. The team is focused on building supercomputers and HPC clusters based on groundbreaking technologies. We are looking for an outstanding architect for a senior HPC role who will play a key part in working with the most exciting computing hardware and software and contribute to the latest breakthroughs in artificial intelligence and GPU computing. The successful candidate will provide insights on at-scale system design and tuning mechanisms for large-scale compute runs. They will work with the latest accelerated computing and deep learning software and hardware platforms, and with many scientific researchers, developers, and customers, to craft improved workflows and develop new, differentiated solutions. The architect will interact with HPC, OS, GPU compute, and systems specialists to architect, develop, and bring up large-scale performance platforms.

Key Responsibilities:

  • Design, implement, and maintain large-scale HPC/AI clusters with monitoring, logging, and alerting.
  • Manage Linux job/workload schedulers and orchestration tools.
  • Develop and maintain continuous integration and delivery pipelines.
  • Develop tooling to automate deployment and management of large-scale infrastructure environments, to automate operational monitoring and alerting, and to enable self-service consumption of resources.
  • Deploy monitoring solutions for the servers, network, and storage.
  • Perform troubleshooting from bare metal to application level.
  • Develop, redefine, and document standard methodologies to share with internal teams.
  • Support Research & Development activities and engage in POCs/POVs for future improvements.

Requirements:

  • A degree in Computer Science, Engineering, or a related field and 5+ years of experience.
  • Knowledge of HPC and AI solution technologies, from CPUs and GPUs to high-speed interconnects and supporting software.
  • Experience with job scheduling and workload orchestration tools such as Slurm and Kubernetes (K8s).
  • Excellent knowledge of Windows and Linux (Red Hat/CentOS and Ubuntu) networking (sockets, firewalld, iptables, Wireshark, etc.) and internals, ACLs, OS-level security protection, and common protocols such as TCP, DHCP, and DNS.
  • Experience with multiple storage solutions such as Lustre, GPFS, ZFS, and XFS. Familiarity with newer and emerging storage technologies.
  • Python programming and Bash scripting experience.
  • Comfortable with automation and configuration management tools such as Jenkins, Ansible, and Puppet/Chef.
  • Deep knowledge of networking protocols such as InfiniBand and Ethernet.
  • Deep understanding of and experience with virtual systems (for example VMware, Hyper-V, KVM, or Citrix).
  • Familiarity with cloud computing platforms (e.g. AWS, Azure, Google Cloud).

Desirable Skills:

  • Knowledge of CPU and/or GPU architecture.
  • Knowledge of Kubernetes and container-related microservice technologies.
  • Experience with GPU-focused hardware/software (DGX, CUDA).
  • Background with RDMA (InfiniBand or RoCE) fabrics.

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.


