Senior AI/ML Engineer

principal it • United States
Visa Sponsorship
AI Summary

Join a fast-growing AI company in Austin, TX, as a Senior AI/ML Engineer to build production-grade AI/ML systems at a deep technical level. The ideal candidate will have a deep understanding of AI/ML systems in production and experience with custom models, model fine-tuning, and retraining. This is a hands-on engineering position with a focus on performance, infrastructure, and model-level challenges.

Key Highlights
Build production-grade AI/ML systems
Work with LLM architecture and custom models
Fine-tune, retrain, and adapt models for real-world deployment
Key Responsibilities
Build and deploy production AI/ML systems
Work with LLM architecture and model trade-offs
Fine-tune, retrain, or adapt custom models
Optimize inference and model performance
Build systems for real-time decisioning
Technical Skills Required
Python
LLMs/Large Language Models
LLM architecture
Custom models
Model fine-tuning, retraining, or adaptation
Production ML deployment
Production inference
C/C++
CUDA
GPU programming
Low-latency systems
Real-time decisioning
GCP
AWS/Azure
Cloud deployment
AI/ML infrastructure
MLOps
Model serving
Distributed systems
High-performance systems
Algorithms and data structures
Systems architecture
Benefits & Perks
$180,000-$200,000 base salary
Hybrid work model with 4 days onsite and 1 day remote
Visa sponsorship available
Nice to Have
C/C++
CUDA
GPU programming
GPU acceleration
High-performance computing
Low-latency systems
Real-time decisioning
Production inference systems
Distributed systems
High-throughput systems
Model serving
MLOps
GCP
AWS/Azure

Job Description


Senior AI / Machine Learning Engineer

Austin, TX | Hybrid | $180k-$200k base


Many current AI roles offer similar responsibilities:


"Build with LLMs"

"Work on cutting-edge AI"

"Join a fast-growing team"


This one is different.


This role is ideal for engineers who want to work directly with AI models, systems, performance constraints, deployment layers, and real-time decision-making.


The team is looking for candidates with a deep understanding of AI/ML systems in production, not candidates whose experience is limited to integrating third-party LLM APIs.


THE ROLE

You'll join an Austin-based AI company focused on building production-grade AI/ML systems at a deep technical level.

The work sits across:


- LLM architecture

- Custom model work

- Model fine-tuning and retraining

- Production ML

- Deployment and inference systems

- GPU-scale compute

- Low-latency decisioning

- Cloud-based AI infrastructure

- High-performance software engineering


This is a hands-on engineering position. While seniority is valued, the role requires active technical involvement rather than remote management.


The ideal candidate will be comfortable building, debugging, optimizing, and deploying complex AI systems from concept to production.


WHAT YOU'LL BE WORKING ON

You'll contribute to building AI/ML systems designed for real-world deployment, not just demonstrations.

That could include:


- Building and deploying production AI/ML systems

- Working with LLM architecture and model trade-offs

- Fine-tuning, retraining, or adapting custom models

- Optimizing inference and model performance

- Building systems for real-time decisioning

- Working in latency-sensitive environments where milliseconds matter

- Using Python for AI/ML engineering

- Working with C/C++ or CUDA where performance requires it

- Scaling AI/ML systems in cloud environments, primarily GCP

- Taking prototypes or research ideas into commercial production


WHAT MAKES THIS INTERESTING

This position requires AI experience that extends beyond prompt engineering.

You'll work on systems where performance, architecture, scalability, and deployment are critical.

The ideal person will be able to talk clearly about:


- Models they have worked with

- How those models were deployed

- How inference was handled

- What performance constraints existed

- What trade-offs were made

- What broke in production

- How they fixed it


If you enjoy solving complex technical challenges, this role will be engaging.


CORE REQUIREMENTS

You'll need:


- 5+ years of relevant software engineering, AI engineering, or ML engineering experience

- Strong computer science fundamentals

- Commercial experience building or deploying AI/ML systems

- Production ML deployment experience

- Understanding of LLM architecture beyond API usage

- Experience with custom models, model adaptation, fine-tuning, or retraining

- Strong Python engineering experience

- Experience with ML frameworks such as PyTorch or similar

- Cloud deployment experience

- Ability to build reliable systems, not just prototypes

- Comfort working in a startup-style environment

- Ability to work onsite in Austin 4 days per week


HIGHLY VALUABLE EXPERIENCE

These qualifications are not required, but will help your application:


- C / C++

- CUDA

- GPU programming

- GPU acceleration

- High-performance computing

- Low-latency systems

- Real-time decisioning

- Production inference systems

- Distributed systems

- High-throughput systems

- Model serving

- MLOps

- GCP

- AWS or Azure

- Experience moving research or data science work into production

- Experience in transaction-heavy, regulated, robotics, autonomous systems, defence, trading, infrastructure, or other performance-sensitive environments


TECH STACK


- Python

- LLMs / Large Language Models

- LLM architecture

- Custom models

- Model fine-tuning, retraining, or adaptation

- Production ML deployment

- Production inference

- C / C++

- CUDA

- GPU programming

- Low-latency systems

- Real-time decisioning

- GCP

- AWS / Azure

- Cloud deployment

- AI/ML infrastructure

- MLOps

- Model serving

- Distributed systems

- High-performance systems

- Algorithms and data structures

- Systems architecture


WORKING MODEL

This position is based in Austin and follows a hybrid working model:


- 4 days per week onsite

- 1 remote deep-work day per week


The remote day is intended for focused technical work, including algorithms, coding, optimisation, architecture, and complex problem-solving.


COMPENSATION

Base salary is expected to be in the range of $180,000-$200,000.


Compensation may be flexible for exceptional candidates with deep AI/ML expertise, production deployment experience, low-level engineering skills, and performance optimisation capabilities.


Visa sponsorship may be available.


WHO THIS WILL SUIT


This role is well-suited for candidates who:


- Want to work on deeper AI/ML engineering problems

- Have built systems that reached production

- Understand the difference between research, prototype, and commercial deployment

- Enjoy performance, infrastructure, and model-level challenges

- Are still hands-on

- Can operate in a startup-style environment

- Like building from zero to one


WHO THIS MAY NOT SUIT


This position may not be suitable if you:


- Have only built basic LLM wrappers

- Mainly use third-party APIs without deeper model or system knowledge

- Have no production ML deployment experience

- Want a purely research-only role

- Want a purely management role

- Need a fully remote setup

- Prefer heavily structured corporate environments

