Senior Site Reliability Engineer - Observability
Bellevue, Washington | Full-Time | Senior | DevOps
Key Responsibilities
- Automated Infrastructure: Design, build, and maintain scalable observability infrastructure using tools like Terraform.
- Splunk Engineering: Optimize the collection, processing, and storage of log data to ensure high reliability and low latency of our Splunk services.
- Incident Response: Participate in on-call rotations and lead post-incident reviews to drive systemic improvements and "observability-driven development."
- Automation: Eliminate "toil" by automating the deployment and scaling of observability agents and collectors.
- Log Management: 5+ years of experience managing Splunk Cloud at scale (1000+ SVCs), including Workload Management (WLM) and HEC optimization.
- Visualization: Expertise in creating intuitive, actionable Splunk dashboards that correlate data across multiple sources.
- SRE Mindset: 3+ years of experience in an SRE, DevOps, or Systems Engineering role with a focus on high-availability systems.
- Programming Proficiency: Strong coding skills in SPL and Go for building internal tools and automating workflows.
- Distributed Systems: Deep understanding of Linux internals, networking (TCP/IP, DNS, Load Balancing), and container orchestration (Kubernetes/EKS).
- Problem Solving: A data-driven approach to debugging complex, cross-service performance bottlenecks.
Bonus Skills (The "Nice-to-Haves")
- Telemetry Standards: Hands-on experience with OpenTelemetry (OTel), Vector, or similar frameworks for instrumenting applications.
- Charge-back app: Experience implementing the Splunk charge-back app for usage reporting.
Additional requirements
- This position requires the ability to access federal environments and/or protected federal data. As a condition of employment, the successful candidate must be able to submit documentation establishing U.S. Person status (e.g., a U.S. Citizen, National, Lawful Permanent Resident, Refugee, or Asylee; see 22 CFR 120.15) upon hire.
- The successful candidate must attend in-person onboarding at our San Francisco office during the first week of employment.
- Below is the annual base salary range for candidates located in California (excluding San Francisco Bay Area), Colorado, Illinois, New York and Washington. Your actual base salary will depend on factors such as your skills, qualifications, experience, and work location. In addition, Okta offers equity (where applicable), bonus, and benefits, including health, dental and vision insurance, 401(k), flexible spending account, and paid leave (including PTO and parental leave) in accordance with our applicable plans and policies. To learn more about our Total Rewards program please visit: https://rewards.okta.com/us.
What you can look forward to as a Full-Time Okta employee!
- Amazing Benefits
- Making Social Impact
- Developing Talent and Fostering Connection + Community at Okta
- Okta cultivates a dynamic work environment, providing the best tools, technology and benefits to empower our employees to work productively in a setting that best and uniquely suits their needs. Each organization is unique in the degree of flexibility and mobility in which they work so that all employees are enabled to be their most creative and successful versions of themselves, regardless of where they live. Find your place at Okta today! https://www.okta.com/company/careers/
- Some roles may require travel to one of our office locations for in-person onboarding.
