DevOps and SRE interviews have converged in 2026. Whether the role is titled "DevOps Engineer," "SRE," or "Platform Engineer," the interview content is similar: Linux/OS fundamentals, infrastructure as code, Kubernetes, observability, and incident response.
The DevOps/SRE Interview Loop
- Technical screen: Linux/scripting + one infrastructure question (45 min)
- Loop (4-5 rounds):
  - Linux and OS fundamentals
  - Coding (Python/Go/Bash scripting)
  - Infrastructure design / system design
  - Kubernetes and container orchestration
  - Incident response and observability
  - Behavioral / on-call culture fit
Linux & OS Fundamentals
Every DevOps/SRE interview tests Linux knowledge. Must-know topics:
Process management:
- `ps`, `top`, `htop`, `kill`, `nice`, `renice`
- Process states: running, sleeping, zombie, stopped
- `fork()` vs `exec()`: how processes are created
- File descriptors and `lsof`
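The `fork()`/`exec()` split can be shown in a few lines. The following is a minimal sketch, assuming a POSIX system (`os.fork` is not available on Windows); `run_child` is a hypothetical helper name, not a standard function.

```python
import os
import sys


def run_child(argv):
    """Fork, exec argv in the child, and return the child's exit code."""
    pid = os.fork()  # duplicate the current process
    if pid == 0:
        # Child: replace this process image with the new program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; never fall back into parent code.
            os._exit(127)
    # Parent: reap the child so it does not linger as a zombie.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # negative signal number if killed by a signal


if __name__ == "__main__":
    code = run_child([sys.executable, "-c", "import sys; sys.exit(3)"])
    print("child exit code:", code)
```

If the parent skipped `waitpid`, the exited child would stay in the process table as a zombie until the parent itself exited, which is a common follow-up question.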
Networking:
- TCP three-way handshake, FIN sequence
- `netstat`, `ss`, `tcpdump`, `curl -v`
- DNS resolution chain: `/etc/hosts` → local resolver → recursive → authoritative
- `iptables`/`nftables` basics, NAT, port forwarding
- Load balancing: L4 vs L7, connection vs request distribution
Storage:
- Filesystem types: ext4, XFS, tmpfs
- inodes, hard links vs symlinks
- `df`, `du`, `iostat`, `lsblk`
- LVM basics, RAID level trade-offs
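The difference between hard links and symlinks is easiest to see by comparing inode numbers. This is a small sketch, assuming a POSIX filesystem that supports both link types; the file names and the `link_demo` helper are made up for illustration.

```python
import os
import tempfile


def link_demo():
    """Create a file, a hard link, and a symlink; report how they relate."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "data.txt")
        hard = os.path.join(d, "hard.txt")
        soft = os.path.join(d, "soft.txt")
        with open(target, "w") as f:
            f.write("hello")
        os.link(target, hard)     # second name for the same inode
        os.symlink(target, soft)  # separate inode that stores a path

        same_inode = os.stat(target).st_ino == os.stat(hard).st_ino
        link_count = os.stat(target).st_nlink
        symlink_own_inode = os.lstat(soft).st_ino != os.stat(target).st_ino

        os.remove(target)         # drop the original name
        hard_survives = os.path.exists(hard)
        symlink_dangles = os.path.islink(soft) and not os.path.exists(soft)
        return same_inode, link_count, symlink_own_inode, hard_survives, symlink_dangles


print(link_demo())
```

Removing the original name leaves the data reachable through the hard link, while the symlink is left pointing at a path that no longer exists.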
Common interview questions:
- "A server's load average is very high but CPU is not. What's happening?" (Usually I/O wait: on Linux, processes blocked in uninterruptible sleep still count toward the load average.)
- "How does DNS resolution work step by step?"
- "What happens when you run `ssh user@host`?"
- "How would you find which process is using port 8080?"
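For the port question, the quick answers are `ss -ltnp` or `lsof -i :8080`. It can also help to know where that data comes from. The sketch below parses text in the `/proc/net/tcp` format to find listening sockets; the sample rows are invented for illustration, and `listening_sockets` is a hypothetical helper.

```python
def listening_sockets(proc_net_tcp_text):
    """Return (port, inode) pairs for sockets in the LISTEN state."""
    results = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 10:
            continue
        local_address, state, inode = fields[1], fields[3], fields[9]
        if state != "0A":  # 0A is the hex code for TCP LISTEN
            continue
        port = int(local_address.rsplit(":", 1)[1], 16)  # port is hex-encoded
        results.append((port, int(inode)))
    return results


# Invented sample in /proc/net/tcp layout: one listener, one established socket.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1000        0 12345 1 0000000000000000 100 0 0 10 0
   1: 0100007F:0016 0100007F:9C40 01 00000000:00000000 00:00000000 00000000     0        0 23456 1 0000000000000000 20 4 30 10 -1
"""

print(listening_sockets(SAMPLE))  # [(8080, 12345)]
```

Mapping the inode back to a process means scanning `/proc/<pid>/fd` for a link that reads `socket:[12345]`, which is roughly the lookup that tools like `ss -p` and `lsof` perform for you.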
Kubernetes
Kubernetes is now table stakes for DevOps/SRE roles:
Core concepts:
- Pod, Deployment, ReplicaSet, DaemonSet, StatefulSet: when to use each
- Services: ClusterIP, NodePort, LoadBalancer, Headless
- ConfigMaps and Secrets: how they're injected, security implications
- PersistentVolumes and PersistentVolumeClaims
- Namespaces and RBAC
Advanced topics (senior roles):
- HPA vs VPA vs KEDA: horizontal vs vertical vs event-driven autoscaling
- Ingress controllers: NGINX, Traefik; routing rules, TLS termination
- Network policies: pod-to-pod traffic control
- Custom Resource Definitions (CRDs) and operators
- Helm: chart structure, templating, release management
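For the autoscaling topic, interviewers often ask how the HPA decides on a replica count. The sketch below follows the formula given in the Kubernetes documentation, `ceil(currentReplicas * currentMetric / targetMetric)`, with the default 10% tolerance. It deliberately ignores min/max bounds, stabilization windows, and not-ready pods, so treat it as an approximation.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Approximate the HPA scaling decision for a single metric."""
    ratio = current_metric / target_metric
    # Within tolerance of the target, the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


print(desired_replicas(4, 90, 60))  # 90% CPU against a 60% target -> 6
print(desired_replicas(4, 63, 60))  # within tolerance -> stays at 4
print(desired_replicas(4, 30, 60))  # half the target -> 2
```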
Kubernetes troubleshooting questions:
- "A pod is in `CrashLoopBackOff`. How do you debug it?"
- "A deployment rollout is stuck. What commands do you run?"
- "How do you do zero-downtime deployments in Kubernetes?"
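One way to rehearse the troubleshooting questions is to write down the first commands you would reach for. The sketch below only builds `kubectl` argument lists and does not run anything; the pod name, namespace, and helper names are hypothetical, and the ordering is one reasonable first pass, not a fixed procedure.

```python
import shlex


def crashloop_triage(pod, namespace="default"):
    """Return kubectl argv lists for a first pass on a crash-looping pod."""
    base = ["kubectl", "-n", namespace]
    return [
        base + ["describe", "pod", pod],    # events, last exit code, probe failures
        base + ["logs", pod, "--previous"],  # output of the container that just crashed
        base + ["get", "events", "--sort-by=.lastTimestamp"],
    ]


def rollout_triage(deployment, namespace="default"):
    """Return kubectl argv lists for inspecting a stuck rollout."""
    base = ["kubectl", "-n", namespace]
    return [
        base + ["rollout", "status", f"deployment/{deployment}"],
        base + ["describe", "deployment", deployment],
        base + ["rollout", "undo", f"deployment/{deployment}"],  # revert if the new revision is bad
    ]


for argv in crashloop_triage("api-7d9f", "prod"):
    print(shlex.join(argv))
```

In an interview, narrating why each command comes next (exit code first, then previous logs, then cluster events) tends to matter more than the exact flags.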
Infrastructure as Code
- Terraform: `plan`, `apply`, `state`, modules, remote state, `terraform import`
- Ansible: playbooks, roles, inventory, idempotency
- Pulumi: IaC with real programming languages, increasingly common
- GitOps: ArgoCD, Flux; declarative infrastructure from Git
Common question: "What are the trade-offs between Terraform and Ansible? When would you use each?"
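A useful way to frame an answer is the declarative model behind plan/apply and idempotency: compare desired state with actual state, act only on the difference, and expect a second run to be a no-op. The toy sketch below illustrates that idea only; it is not how Terraform or Ansible are implemented, and the resource names are invented.

```python
def plan(desired, actual):
    """Diff desired vs. actual resources into create/update/delete actions."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return sorted(actions)


def apply(desired, actual):
    """Converge on desired state; a repeated run then has nothing to do."""
    return {name: dict(spec) for name, spec in desired.items()}


desired = {"web": {"size": "small"}, "db": {"size": "large"}}
actual = {"web": {"size": "tiny"}, "cache": {"size": "small"}}

print(plan(desired, actual))   # [('create', 'db'), ('delete', 'cache'), ('update', 'web')]
actual = apply(desired, actual)
print(plan(desired, actual))   # [] because the second run is a no-op
```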
CI/CD
- Pipeline design: build → test → security scan → staging deploy → prod deploy
- GitHub Actions, GitLab CI, Jenkins, CircleCI: know at least one deeply
- Deployment strategies: rolling, blue-green, canary; implement each in Kubernetes
- Artifact management: Docker registry, Nexus, Artifactory
- Secret management in pipelines: Vault, AWS Secrets Manager, sealed secrets
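As a starting point for the rolling strategy, the sketch below builds a Deployment manifest as a Python dict and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The image name, `/healthz` path, and port 8080 are placeholders, and `rolling_deployment` is a hypothetical helper.

```python
import json


def rolling_deployment(name, image, replicas=3):
    """Build a Deployment that never drops below the requested replica count."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "strategy": {
                "type": "RollingUpdate",
                # Start one extra pod first; remove an old one only once it is ready.
                "rollingUpdate": {"maxUnavailable": 0, "maxSurge": 1},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        # Placeholder probe: traffic is routed only to ready pods.
                        "readinessProbe": {
                            "httpGet": {"path": "/healthz", "port": 8080},
                            "periodSeconds": 5,
                        },
                    }],
                },
            },
        },
    }


print(json.dumps(rolling_deployment("web", "registry.example.com/web:1.2.3"), indent=2))
```

Setting `maxUnavailable` to 0 together with a readiness probe is what keeps capacity constant during the rollout. Blue-green and canary need additional pieces, such as a second Deployment and a Service or ingress switch.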
Observability
The three pillars: metrics, logs, traces.
- Metrics: Prometheus + Grafana, alerting rules, recording rules, SLO dashboards
- Logging: ELK/EFK stack, Loki, structured logging, log aggregation patterns
- Tracing: Jaeger, Zipkin, OpenTelemetry; distributed trace propagation
- SLIs, SLOs, SLAs: Define them, measure them, set error budgets
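Error budgets come down to simple arithmetic, and being able to do it on the spot helps. The sketch below assumes an availability-style SLO expressed as a fraction (0.999 for 99.9%) and a 30-day window; the function names are illustrative.

```python
def error_budget_minutes(slo, window_days=30):
    """Minutes of full downtime the SLO allows over the window."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction between 0 and 1")
    return (1.0 - slo) * window_days * 24 * 60


def budget_remaining(slo, total_requests, failed_requests):
    """Fraction of the request-based error budget still unspent."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction between 0 and 1")
    allowed_failures = (1.0 - slo) * total_requests
    return 1.0 - failed_requests / allowed_failures


print(round(error_budget_minutes(0.999), 1))             # 43.2 minutes per 30 days
print(round(budget_remaining(0.999, 1_000_000, 250), 2))  # 0.75 of the budget left
```

A negative result from `budget_remaining` means the budget is already overspent for the window.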
Incident response questions:
- "Walk me through how you'd investigate a 50% increase in 5xx errors."
- "How do you write a post-mortem? What makes a good one?"
- "What's your on-call philosophy? How do you prevent alert fatigue?"
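For the 5xx question, a common first step is to check whether the errors are concentrated on one route or spread across the service. The sketch below computes a per-route 5xx rate from JSON log lines. The `route` and `status` field names are assumptions about the log schema, and the sample lines are invented.

```python
import json
from collections import Counter


def error_rate_by_route(log_lines):
    """Return {route: fraction of requests that returned 5xx}."""
    totals, errors = Counter(), Counter()
    for line in log_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not structured logs
        if not isinstance(event, dict):
            continue
        route = event.get("route", "unknown")  # assumed field name
        status = event.get("status", 0)        # assumed integer field
        totals[route] += 1
        if isinstance(status, int) and 500 <= status <= 599:
            errors[route] += 1
    return {route: errors[route] / totals[route] for route in totals}


# Invented sample log lines.
sample = [
    '{"route": "/checkout", "status": 502}',
    '{"route": "/checkout", "status": 200}',
    '{"route": "/home", "status": 200}',
    'not json at all',
]
print(error_rate_by_route(sample))  # {'/checkout': 0.5, '/home': 0.0}
```

From there the investigation usually narrows by correlating the affected route with recent deploys, dependency health, and traces.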
6-Week DevOps/SRE Prep Plan
| Week | Focus |
|------|-------|
| 1 | Linux fundamentals + 20 Linux interview questions |
| 2 | Kubernetes: core + advanced topics, kubectl fluency |
| 3 | Terraform + CI/CD pipeline design |
| 4 | Observability stack: Prometheus, Grafana, distributed tracing |
| 5 | Infrastructure system design: 3 full designs |
| 6 | Mock loops + incident response scenarios |
Practice explaining your infrastructure decisions out loud with CareerLift.ai. DevOps interviews reward candidates who can clearly communicate their operational reasoning and trade-off thinking.