Senior Software Engineer – AI Middleware
Company: Cornelis Networks, Inc.
Location: Austin
Posted on: April 1, 2026
Job Description:
Cornelis Networks delivers the world’s highest-performance
scale-out networking solutions for AI and HPC datacenters. Our
differentiated architecture seamlessly integrates hardware,
software, and system-level technologies to maximize the efficiency
of GPU, CPU, and accelerator-based compute clusters at any scale.
Our solutions drive breakthroughs in AI and HPC workloads, empowering
our customers to push the boundaries of innovation. Backed by
top-tier venture capital and strategic investors, we are committed
to innovation, performance, and scalability, solving the world’s
most demanding computational challenges with our next-generation
networking solutions. We are a fast-growing, forward-thinking team
of architects, engineers, and business professionals with a proven
track record of building successful products and companies. As a
global organization, our team spans multiple U.S. states and six
countries, and we continue to expand with exceptional talent in
onsite, hybrid, and fully remote roles.

We are seeking a highly experienced Senior Software Engineer to
design, develop, and upstream-enable Cornelis Networks’ AI
communication middleware. This role focuses on distributed AI
workloads and enabling/optimizing collective communication libraries
(e.g., NCCL/RCCL) over Cornelis Networks’ interconnects.

Key Responsibilities

Design and implement performance-critical features for CCL enablement on Cornelis Networks’ fabrics.
Optimize distributed training performance across multi-node, multi-GPU configurations.
Improve GPU communication paths including GPU-direct transfers, IPC, and CPU/GPU synchronization.
Profile distributed AI workloads and identify bottlenecks across the software and hardware stack.
Tune AI frameworks such as PyTorch Distributed, TensorFlow/XLA, JAX, DeepSpeed, and Megatron-LM.
Develop benchmarks and microbenchmarks aligned with real model performance.
Contribute upstream to AI communication and distributed training projects.
Participate in design reviews, code reviews, CI, and long-term maintenance.
Prototype and validate Ultra Ethernet capabilities for AI collective communication.
Provide technical input for deployment considerations and performance validation.
Collaborate with kernel/driver, switch, performance, and systems teams.
Support advanced escalations by analyzing traces and providing robust fixes.

Minimum Qualifications

8 years of experience in high-performance systems programming in C/C++ on Linux.
Strong experience with GPU communication stacks including CUDA/ROCm and NCCL/RCCL.
Ability to optimize distributed training performance using profiling and tracing.
Understanding of collective communication concepts and topology awareness.
Experience delivering production-quality code.
Open-source contributions in relevant areas.

Preferred Qualifications

Experience with AI frameworks such as PyTorch Distributed, DeepSpeed, and Megatron-LM.
Familiarity with libfabric/OFI, UCX, and RDMA concepts.
Experience with RoCEv2 and Ultra Ethernet.
Experience building cluster-scale performance test infrastructure.

Location: This is a remote position for employees residing within
the United States.

We offer a competitive compensation package that includes equity,
cash, and incentives, along with health and retirement benefits. Our
dynamic, flexible work environment provides the opportunity to
collaborate with some of the most influential names in the
semiconductor industry.

At Cornelis Networks, your base salary is only one component of your
comprehensive total rewards package. Your base pay will be
determined by factors such as your skills, qualifications,
experience, and location relative to the hiring range for the
position. Depending on your role, you may also be eligible for
performance-based incentives, including an annual bonus or sales
incentives. In addition to your base pay, you’ll have access to a
broad range of benefits, including medical, dental, and vision
coverage, as well as disability and life insurance, a dependent
care flexible spending account, accidental injury insurance, and
pet insurance. We also offer generous paid holidays, 401(k) with
company match, and Open Time Off (OTO) for regular full-time exempt
employees. Other paid time off benefits include sick time, bonding
leave, and pregnancy disability leave.

Cornelis Networks does not accept unsolicited resumes from
headhunters, recruitment agencies, or fee-based recruitment services.

Cornelis Networks is an equal opportunity employer, and all
qualified applicants will receive consideration for employment
without regard to race, color, religion, sex, sexual orientation,
gender identity or expression, pregnancy, age, national origin,
disability status, genetic information, protected veteran status, or
any other characteristic protected by law. We encourage applications
from all qualified candidates and will accommodate applicants’ needs
under the respective laws throughout all stages of the recruitment
and selection process.