Build Production ML Infrastructure with TensorOps
TensorOps is an applied machine learning studio that helps organizations worldwide plan, design, train, and deploy production-grade ML systems. Our clients range from NASDAQ-listed enterprises to seed-stage startups, and our projects range from small proofs of concept to multi-year strategic initiatives.
What We're Working On:
- ML Infrastructure at Scale: Building and optimizing ML pipelines across cloud platforms
- Generative AI Deployment: Production-ready chatbots, agents, and LLM applications
- Traditional ML Systems: Time series forecasting, AdTech, and computer vision in production
- Platform Engineering: CI/CD for ML, model serving infrastructure, and observability systems
Core Stack:
Our stack varies from client to client, but we often use:
Cloud Platforms (Primary Focus):
- GCP: Vertex AI, Cloud Run, GKE, BigQuery, Cloud Storage, Cloud Build
- AWS: SageMaker, Bedrock, EKS, S3, Lambda, Step Functions, ECR
Infrastructure & Orchestration:
- IaC: Terraform, CloudFormation
- Containers: Docker, Kubernetes (EKS, GKE)
- Workflow Orchestration: Airflow, Kubeflow Pipelines, Vertex AI Pipelines, SageMaker Pipelines
ML Tools & Frameworks:
- Model Training: PyTorch, Hugging Face, LightGBM, CatBoost
- Model Serving: FastAPI, TorchServe, TensorFlow Serving
- LLM Frameworks: LangChain, LangGraph
Observability & Monitoring:
- MLflow, Weights & Biases, Langfuse
- Cloud-native monitoring (CloudWatch, Cloud Monitoring)
- Prometheus, Grafana
Data Engineering:
- Pandas, Polars, DuckDB
- BigQuery, Redshift, Athena
The Role:
We're looking for an MLOps Engineer to help us build and scale ML infrastructure for our diverse client base. You'll report to and be mentored by a senior team member while working on cloud-native ML systems that serve real users. This is a hands-on role from day one, where you'll architect pipelines, automate deployments, and ensure reliability at scale.
Required Qualifications:
- BSc in Computer Science, Software Engineering, or equivalent practical experience
- Demonstrable experience with GCP and/or AWS in production environments
Required Skills:
- Cloud Expertise: Strong working knowledge of GCP and AWS ML/AI services (Vertex AI, SageMaker, Bedrock, etc.)
- DevOps Fundamentals: CI/CD pipelines, infrastructure-as-code (Terraform preferred), containerization
- MLOps Practices: Experience designing and maintaining ML pipelines, model versioning, automated retraining
- Python Proficiency: Strong Python skills with a focus on production-ready code
- System Design: Understanding of distributed systems, scalability patterns, and reliability engineering
- Excellent English communication skills
Nice to Have:
- Kubernetes expertise (EKS, GKE administration)
- Experience with model monitoring and observability platforms
- Knowledge of LLM deployment patterns (RAG systems, agent architectures)
- Contributions to ML infrastructure tooling or open-source projects
- Multi-cloud architecture experience
- Certifications: GCP Professional ML Engineer, AWS Machine Learning Specialty, or CKA
Why TensorOps?
- Fully remote (legal residence in Spain required)
- Real-world infrastructure challenges with immediate impact
- Work across cutting-edge cloud technologies and ML frameworks
- Mentorship from engineers who have built ML platforms at scale
- Competitive compensation, with growth tied to ownership and performance rather than to periodic reviews (which we still hold)
Compensation & Perks:
- Yearly salary: €50,000 to €65,000 (adjusted for MLOps focus)
- Travel expenses allowance
- Urban Sports Club membership
- Professional development budget for certifications and training