👋 Hi, I’m Sahil Gothoskar
Site Reliability Engineer · Cloud & Platform Reliability · Automation-Driven ⚙️
I’m a Site Reliability Engineer with ~4 years of experience building, operating, and scaling reliable distributed systems across cloud and Kubernetes environments.
I enjoy turning operational pain into automation, strengthening system resilience, and building platforms engineers can trust. My work has consistently focused on reducing outages, improving MTTR, and eliminating manual toil through better observability, automation, and system design.
🚀 I’ve worked across AWS, OCI, Azure, and Kubernetes, leading cloud migrations, modernizing monitoring stacks, and building CI/CD pipelines that help teams ship faster without compromising reliability.
🧠 How I Think About Reliability
- Reliability is a feature, not an afterthought
- Automate what’s frequent and well understood
- Keep humans in the loop when risk or ambiguity is high
- Treat incidents as feedback to improve systems, not people
My goal as an SRE is simple:
👉 Enable engineers to move fast, safely, and with confidence.
📬 Get in Touch
🛠️ Technical Skills
Containerization & Orchestration
OpenShift · Docker · Docker Swarm · Rancher
Operating Systems
Red Hat Linux · Ubuntu · CentOS · Windows
Cloud (AWS / OCI / Azure)
IAM · EC2 · ELB · ALB · CloudFront
Observability & Reliability
Prometheus · Grafana · Splunk · Nagios · PagerDuty
Automation & CI/CD
Terraform · Ansible · Puppet · Jenkins · Git
Data & Messaging
Kafka · ZooKeeper · Elasticsearch
Frameworks & Web
Flask · Django
Programming Languages
Python · Bash · Java · C · SQL · R
🎓 Certifications & Credentials
🏅 View Certifications
✨ What I’m Looking For
I’m excited by roles where I can:
- Take long-term ownership of reliability
- Work on complex, large-scale, or safety-critical systems
- Build infrastructure with real-world impact
- Learn from strong engineers and grow as an SRE
Always curious. Always learning. Always improving systems. 🚀
