Data Automation Engineer

chatgpt jobs • United States
Remote
Job Description


Location: Washington, DC

  • Remote (fully remote, with potential quarterly travel to the Gaithersburg, MD / Washington, DC metro area)

Clearance: Public Trust (or willingness to obtain; must be a U.S. Citizen)

Note: NOT OPEN TO C2C OR W2 REFERRALS AT THIS TIME

Seeking a Data Automation Engineer to design and implement innovative, AI-driven automation solutions across AWS and Azure hybrid environments. Responsible for building intelligent, scalable data pipelines and automations integrating cloud services, enterprise tools, and Generative AI for mission-critical analytics, reporting, and customer engagement platforms.

Key Responsibilities

  • Design and maintain data pipelines in AWS using S3, RDS/SQL Server, Glue, Lambda, EMR, DynamoDB, and Step Functions
  • Develop ETL/ELT processes between DynamoDB, SQL Server (AWS), and AWS ↔ Azure SQL systems
  • Integrate AWS Connect CRM data into enterprise data pipelines for analytics and reporting
  • Engineer ingestion pipelines with Apache Spark, Flume, and Kafka for real-time and batch processing into Apache Solr and AWS OpenSearch
  • Leverage Generative AI services (AWS Bedrock, Amazon Q, Azure OpenAI, Hugging Face, LangChain) for:
    • Vector generation and embeddings from unstructured data
    • Automated data quality checks, metadata tagging, and lineage tracking
    • LLM-assisted transformation and anomaly detection in ETL
    • Conversational BI interfaces for natural language access to Solr and SQL data
    • AI-powered copilots for pipeline monitoring and troubleshooting
  • Implement SQL Server stored procedures, indexing, query optimization, and performance tuning
  • Apply CI/CD best practices using GitHub, Jenkins, or Azure DevOps
  • Ensure security and compliance via IAM, KMS encryption, VPC isolation, RBAC, firewalls
  • Support Agile DevOps processes with sprint-based delivery

Required Qualifications

  • BS in Computer Science or a related field with 2+ years of data engineering/automation experience
  • Hands-on experience with SQL, SSIS, Python, Spark, Bash, PowerShell, AWS/Azure CLIs
  • Experience with AWS services (S3, RDS/SQL Server, Glue, Lambda, EMR, DynamoDB)
  • Familiarity with Apache Flume, Kafka, Solr for large-scale data ingestion and search
  • Familiarity with LLM/Gen AI frameworks (AWS Bedrock, Azure OpenAI, or open-source platforms/tools)
  • Experience integrating REST API calls in data pipelines and workflows
  • Familiarity with JIRA, GitHub / Azure DevOps / Jenkins for SDLC and CI/CD automation
  • Strong troubleshooting and performance optimization skills in SQL, Spark or other data engineering solutions
  • Experience operationalizing Generative AI (GenAI Ops) pipelines, including model deployment, monitoring, retraining, and lifecycle management
  • Good communication and presentation skills
  • Ability to obtain Federal government Public Trust clearance

Preferred Qualifications (Plus)

  • Certifications: AWS Data Engineer, AWS AI/ML Specialty, Azure AI Engineer, Databricks Certified Data Engineer
  • Experience implementing RAG pipelines, embeddings, and vector search with Solr, OpenSearch, FAISS, Pinecone, or Pgvector/SQL Server vector types
  • Experience with GenAI-powered coding tools (Claude Code, OpenAI Codex, VS Code)
  • Experience with multi-cloud data integration (AWS ↔ Azure SQL)
  • Familiarity with Microsoft BizTalk and SSIS for SQL Server ETL workflows
  • Knowledge of data lineage/governance tools (Purview, Unity Catalog, AWS Glue Catalog)
  • Familiarity with Infrastructure-as-Code (Terraform/CloudFormation, Bicep) for automated deployments
  • Experience with compliance frameworks (FedRAMP, PCI-DSS, HIPAA)
