PwC India
Line of Service
Advisory
Industry/Sector
Not Applicable
Specialism
Data, Analytics & AI
Management Level
Senior Manager
Job Description & Summary
At PwC, our people in data and analytics focus on leveraging data to drive insights and make informed business decisions. They utilise advanced analytics techniques to help clients optimise their operations and achieve their strategic goals.
In business intelligence at PwC, you will focus on leveraging data and analytics to provide strategic insights and drive informed decision-making for clients. You will develop and implement innovative solutions to optimise business performance and enhance competitive advantage.
Why PwC
At PwC, you will be part of a vibrant community of solvers that leads with trust and creates distinctive outcomes for our clients and communities. This purpose-led and values-driven work, powered by technology in an environment that drives innovation, will enable you to make a tangible impact in the real world. We reward your contributions, support your wellbeing, and offer inclusive benefits, flexibility programmes and mentorship that will help you thrive in work and life. Together, we grow, learn, care, collaborate, and create a future of infinite experiences for each other. Learn more about us.
At PwC, we believe in providing equal employment opportunities, without any discrimination on the grounds of gender, ethnic background, age, disability, marital status, sexual orientation, pregnancy, gender identity or expression, religion or other beliefs, perceived differences and status protected by law. We strive to create an environment where each one of our people can bring their true selves and contribute to their personal growth and the firm’s growth. To enable this, we have zero tolerance for any discrimination and harassment based on the above considerations.
Job Description & Summary: Lead the enterprise design, build, and governance of the Databricks Lakehouse platform across cloud providers (AWS/Azure/GCP). Own architecture standards, platform reliability, cost efficiency, security/compliance, and enablement for data engineering, analytics, AI/ML, and streaming workloads. Manage a team of architects/engineers and partner with product, security, and business domains to deliver value at scale.
Responsibilities:
· Strategy and architecture
  o Define the enterprise Lakehouse strategy, reference architectures, and roadmap aligned to business objectives and data domain needs (data mesh principles, product-oriented delivery).
  o Architect scalable, secure, and cost-efficient Databricks workspaces, clusters/SQL warehouses, Unity Catalog, and Delta Lake across environments.
  o Establish medallion (bronze/silver/gold) and CDC patterns; standardize batch and streaming pipelines (Structured Streaming, DLT/Delta Live Tables).
· Platform engineering and operations
  o Own the landing zone architecture.
  o Implement cluster policies, serverless compute, job scheduling/orchestration, secret scopes (Key Vault/Secrets Manager), credential passthrough, BYOK/KMS, SCIM provisioning, and SSO (SAML/OIDC).
  o Drive CI/CD and IaC (Azure DevOps, Terraform Databricks provider), environment promotion, release management, and automation standards.
  o Build observability: audit logs to SIEM (e.g., Splunk), job and query monitoring, data pipeline SLAs, lineage, and usage telemetry.
· Data governance, security, and compliance
  o Operationalize Unity Catalog for catalogs/schemas/tables, RBAC/ABAC, resource- and data-level permissions, row/column masking, and lineage.
  o Partner with InfoSec to meet GDPR/CCPA/HIPAA/SOX/SOC 2/ISO requirements, encryption, data retention, PII handling, and incident playbooks.
  o Integrate enterprise data catalogs (e.g., Purview/UC/Alation) and policies; establish stewardship and quality SLAs.
· Performance, reliability, and FinOps
  o Optimize performance: Photon, partitioning, Z-ORDER, OPTIMIZE/auto-compaction, caching, file layout, streaming watermarking/state store tuning.
  o Establish reliability standards: SLAs, SLOs, error budgets, graceful retries, checkpointing, backfills, hotfix playbooks.
  o Own FinOps practices: DBU tracking, tagging, budgets/alerts, cluster sizing, spot instances, right-sizing SQL warehouses, workload consolidation.
· AI/ML architecture and enablement
  o Standardize the ML lifecycle with MLflow (experiments, model registry), feature store, model serving/endpoints, and MLOps pipelines.
  o Guide teams on feature engineering at scale, governance for ML artifacts, and responsible AI practices.
· Stakeholder leadership and team management
  o Manage and develop a team of solution/data architects and platform engineers; hire, mentor, and set career paths.
  o Translate business goals into technical roadmaps; run architecture reviews; communicate to executives with clear outcomes and metrics.
  o Handle vendor management, licensing/SOWs, and cross-functional coordination with data platform, analytics, and application teams.
· Enablement and best practices
  o Create standards, design patterns, playbooks, and reusable components; run training and a community of practice.
  o Lead migrations to Unity Catalog and Delta Lake; deprecate legacy stacks and consolidate tools via Partner Connect/Delta Sharing.
Core competencies:
· Enterprise architecture and systems thinking
· Leadership, coaching, and stakeholder management
· Security-first mindset and compliance fluency
· Data/ML reliability engineering and performance tuning
· Clear written/spoken communication
Tools and technologies:
· Databricks: Workspaces, Clusters, SQL Warehouses, Unity Catalog, DLT, Jobs, MLflow, Feature Store, Serverless endpoints
· Languages: Python, SQL, Scala
· Cloud: AWS/Azure/GCP core services, IAM, KMS/Key Vault/Cloud KMS
· Orchestration/DevOps: Terraform, GitHub/GitLab/Azure DevOps, Jenkins, Airflow/ADF
· Streaming/Integration: Kafka/Event Hubs/Pub/Sub, REST, Delta Sharing
· Observability/Security: CloudWatch/Log Analytics/Stackdriver, Splunk, Databricks observability (Unity Catalog)
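The FinOps responsibilities above (DBU tracking, cost-centre tagging, budget alerts) can be sketched in plain Python. This is a minimal illustration only: the record shape, tag names, and budget figures are invented, and in practice the usage data would come from Databricks billable-usage logs or system tables rather than hard-coded dictionaries.

```python
# Illustrative FinOps sketch: aggregate hypothetical DBU usage by cost-centre
# tag and flag cost centres approaching or exceeding their budget.
from collections import defaultdict


def dbu_spend_by_tag(usage_records):
    """Sum DBU consumption per cost-centre tag; untagged usage is bucketed."""
    totals = defaultdict(float)
    for rec in usage_records:
        totals[rec["tags"].get("cost_centre", "untagged")] += rec["dbus"]
    return dict(totals)


def budget_alerts(spend, budgets, threshold=0.8):
    """Return tags whose spend reaches `threshold` of their budget."""
    return sorted(
        tag for tag, used in spend.items()
        if used >= threshold * budgets.get(tag, float("inf"))
    )


# Hypothetical usage records and budgets, for illustration only.
usage = [
    {"tags": {"cost_centre": "analytics"}, "dbus": 420.0},
    {"tags": {"cost_centre": "ml"}, "dbus": 95.0},
    {"tags": {}, "dbus": 30.0},
    {"tags": {"cost_centre": "analytics"}, "dbus": 80.0},
]
budgets = {"analytics": 500.0, "ml": 400.0}

spend = dbu_spend_by_tag(usage)
print(spend)                          # {'analytics': 500.0, 'ml': 95.0, 'untagged': 30.0}
print(budget_alerts(spend, budgets))  # ['analytics']
```

In a real deployment the same aggregation would typically feed dashboards and alerting rather than printed output; the point here is only the tagging/budget pattern the role is expected to own.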
Mandatory skill sets:
· 10+ years in data platforms/architecture; 5+ years hands-on with Databricks and Apache Spark at enterprise scale.
· 3+ years in people management leading architects/engineers.
· Deep expertise in:
  o Databricks Lakehouse: Delta Lake, Unity Catalog, SQL Warehouses, Jobs, DLT, Structured Streaming, MLflow, Feature Store, Delta Sharing.
  o Programming and query languages: Python, SQL; Scala/Java familiarity for Spark.
  o Cloud services: one or more of AWS (S3, IAM, KMS, EMR, Glue, Lambda), Azure (ADLS Gen2, AAD, Key Vault, Event Hubs, ADF), GCP (GCS, IAM, Pub/Sub, Dataflow).
  o Networking/security: VPC/VNet design, PrivateLink/Private Endpoints, routing, firewalls, SSO/SCIM, secrets management, encryption, data masking.
  o DevOps/MLOps: GitHub/GitLab/Azure DevOps, Jenkins, Terraform (Databricks provider), containerization, CI/CD for data/ML.
· Proven delivery of large-scale data engineering, analytics, and ML programs with measurable business outcomes.
· Strong communication with executives and technical teams; ability to create clear architecture artifacts and standards.
Preferred skill sets:
· Databricks certifications: Data Engineer Professional, Machine Learning Professional, Lakehouse Fundamentals.
· Cloud architect certifications (AWS/Azure/GCP).
· Experience with data governance tools (Purview/Collibra/Alation), BI tools (Power BI/Tableau/Looker), and orchestration (Airflow/ADF/Step Functions).
· Experience with message streaming (Kafka/Event Hubs/Pub/Sub) and data quality frameworks (Great Expectations/Deequ).
Years of experience required: 12 to 16 years
Education qualification: Graduate Engineer or Management Graduate
Education (if blank, degree and/or field of study not specified)
Degrees/Field of Study required: Bachelor of Engineering, Master of Engineering
Degrees/Field of Study preferred:
Certifications (if blank, certifications not specified)
Required Skills
Databricks Platform
Optional Skills
Accepting Feedback, Active Listening, Analytical Thinking, Applied Macroeconomics, Business Case Development, Business Data Analytics, Business Intelligence and Reporting Tools (BIRT), Business Intelligence Development Studio, Coaching and Feedback, Communication, Competitive Advantage, Continuous Process Improvement, Creativity, Data Analysis and Interpretation, Data Architecture, Database Management System (DBMS), Data Collection, Data Pipeline, Data Quality, Data Science, Data Visualization, Embracing Change, Emotional Regulation, Empathy {+ 32 more}
Desired Languages (If blank, desired languages not specified)
Travel Requirements
Available for Work Visa Sponsorship?
Government Clearance Required?
Job Posting End Date