Senior DevOps Engineer - GCP
WitnessAI
Location: Bay Area
Employment Type: Full time
Location Type: Hybrid
Department: Engineering
Job Title: Senior DevOps Engineer – GCP
Location: SF Bay Area
Type: Full-time
Team: Platform Engineering
Reports To: Head of Platform Engineering
About Us
WitnessAI is a fast-growing SaaS startup on a mission to enable enterprises to adopt AI, safely. We're building a product that provides security and governance guardrails for public and private LLMs.
About the Role
We’re looking for a Sr. DevOps Engineer who will take ownership of designing, securing, and scaling the cloud backbone of our AI security platform. You’ll be responsible for infrastructure architecture in Google Cloud Platform (GCP), with a deep focus on networking, VPC security, interconnectivity, and service reliability.
You’ll work closely with ML engineers, security researchers, and backend developers to build highly reliable, secure, and performant environments for running AI workloads and security tooling.
What You’ll Do
Design, implement, and maintain GCP-based infrastructure for secure AI workloads and APIs
Build and manage scalable, low-latency Internet and cloud networking strategies, e.g. Anycast, route optimization, private VPC peering, VPC networks, private service access, and firewall configurations
Develop secure ingress/egress patterns, service meshes, and zero-trust networking topologies
Automate infrastructure provisioning using Terraform, Helm, and CI/CD workflows
Collaborate on platform observability: logging, monitoring, alerting, and incident response
Harden cloud infrastructure against threats using IAM best practices, organization policies, and GCP security controls
Work cross-functionally with engineering, data science, and security teams to optimize environment reliability and cost
Help establish infrastructure SLAs, SLOs, and runbooks as the platform scales
What We’re Looking For
7+ years of experience in infrastructure engineering, site reliability, or DevOps
Deep expertise in Google Cloud Platform, including VPC networking, Cloud NAT, private services, and inter-project connectivity
Strong knowledge of Terraform and infrastructure-as-code practices
Proficiency with container orchestration (Kubernetes / GKE preferred)
Experience designing for high availability, scalability, and secure access across cloud environments
Familiarity with service mesh tools (Istio, Linkerd, etc.) and API gateways
Solid understanding of Linux, DNS, TLS, load balancing, and network security principles
Comfort working in a fast-paced startup environment with ownership and autonomy
Bonus: Experience in regulated or high-security environments (SOC 2, FedRAMP, HIPAA, etc.)
Bonus: Exposure to ML infrastructure, GPU workloads, or data pipelines
Demonstrated automation experience using Bash, Go, Python, or other languages
Benefits:
Hybrid work environment
Competitive salary, health, dental, and vision insurance
401(k) plan
Opportunities for professional development and growth
Generous vacation policy
Salary range:
$168K-$225K (Bay Area)