Advanced Concepts

6 min read
Rapid overview

Kubernetes Advanced Concepts

Custom Resource Definitions (CRDs)

CRDs extend the Kubernetes API with custom resources.

Creating a CRD

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.mycompany.com
spec:
  group: mycompany.com
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            required:
            - engine
            - version
            properties:
              engine:
                type: string
                enum: ["postgres", "mysql", "mongodb"]
              version:
                type: string
              replicas:
                type: integer
                minimum: 1
                maximum: 10
                default: 3
              storage:
                type: string
                pattern: '^[0-9]+Gi$'
          status:
            type: object
            properties:
              state:
                type: string
              message:
                type: string
    subresources:
      status: {}
    additionalPrinterColumns:
    - name: Engine
      type: string
      jsonPath: .spec.engine
    - name: Version
      type: string
      jsonPath: .spec.version
    - name: Status
      type: string
      jsonPath: .status.state
  scope: Namespaced
  names:
    plural: databases
    singular: database
    kind: Database
    shortNames:
    - db

Using the Custom Resource

apiVersion: mycompany.com/v1
kind: Database
metadata:
  name: production-db
spec:
  engine: postgres
  version: "15.2"
  replicas: 3
  storage: 100Gi
kubectl get databases
kubectl get db
kubectl describe database production-db

Operators

Operators are software extensions that use custom resources to manage applications.

Operator Pattern Components

  1. Custom Resource Definition (CRD): Defines the desired state schema
  2. Controller: Watches resources and reconciles desired vs actual state
  3. Reconciliation Loop: Continuously ensures desired state is achieved

Operator SDK Example (Go)

func (r *DatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    log := log.FromContext(ctx)

    // Fetch the Database instance
    database := &mycompanyv1.Database{}
    err := r.Get(ctx, req.NamespacedName, database)
    if err != nil {
        if errors.IsNotFound(err) {
            return ctrl.Result{}, nil
        }
        return ctrl.Result{}, err
    }

    // Check if StatefulSet exists
    found := &appsv1.StatefulSet{}
    err = r.Get(ctx, types.NamespacedName{Name: database.Name, Namespace: database.Namespace}, found)
    if err != nil && errors.IsNotFound(err) {
        // Create StatefulSet
        sts := r.statefulSetForDatabase(database)
        log.Info("Creating StatefulSet", "name", sts.Name)
        err = r.Create(ctx, sts)
        if err != nil {
            return ctrl.Result{}, err
        }
        return ctrl.Result{Requeue: true}, nil
    }

    // Update status
    database.Status.State = "Running"
    r.Status().Update(ctx, database)

    return ctrl.Result{}, nil
}

Common Operators

OperatorPurpose
Prometheus OperatorManages Prometheus instances
Cert-ManagerAutomates TLS certificate management
StrimziManages Apache Kafka clusters
PostgreSQL OperatorManages PostgreSQL clusters
Argo CDGitOps continuous delivery

Service Mesh

Istio Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                Control Plane                 β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚  β”‚  Pilot  β”‚  β”‚ Citadel β”‚  β”‚   Galley    β”‚  β”‚
β”‚  β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”˜
        β”‚            β”‚               β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                 Data Plane                   β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”             β”‚
β”‚  β”‚   Pod   β”‚        β”‚   Pod   β”‚             β”‚
β”‚  β”‚ β”Œβ”€β”€β”€β”€β”€β” │◄──────►│ β”Œβ”€β”€β”€β”€β”€β” β”‚             β”‚
β”‚  β”‚ β”‚Envoyβ”‚ β”‚        β”‚ β”‚Envoyβ”‚ β”‚             β”‚
β”‚  β”‚ β””β”€β”€β”€β”€β”€β”˜ β”‚        β”‚ β””β”€β”€β”€β”€β”€β”˜ β”‚             β”‚
β”‚  β”‚ β”Œβ”€β”€β”€β”€β”€β” β”‚        β”‚ β”Œβ”€β”€β”€β”€β”€β” β”‚             β”‚
β”‚  β”‚ β”‚ App β”‚ β”‚        β”‚ β”‚ App β”‚ β”‚             β”‚
β”‚  β”‚ β””β”€β”€β”€β”€β”€β”˜ β”‚        β”‚ β””β”€β”€β”€β”€β”€β”˜ β”‚             β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜        β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜             β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Traffic Management

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - match:
    - headers:
        end-user:
          exact: jason
    route:
    - destination:
        host: reviews
        subset: v2
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90
    - destination:
        host: reviews
        subset: v2
      weight: 10
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        h2UpgradePolicy: UPGRADE
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 10s
      baseEjectionTime: 30s
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2

Circuit Breaker Pattern

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: httpbin
spec:
  host: httpbin
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 1
      http:
        http1MaxPendingRequests: 1
        maxRequestsPerConnection: 1
    outlierDetection:
      consecutive5xxErrors: 1
      interval: 1s
      baseEjectionTime: 3m
      maxEjectionPercent: 100

Multi-Cluster Kubernetes

Federation Patterns

  1. Cluster API: Declarative cluster lifecycle management
  2. Kubefed: Federated resources across clusters
  3. Admiral: Service mesh federation
  4. Submariner: Cross-cluster networking

Cluster API Example

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: production-cluster
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
    services:
      cidrBlocks: ["10.96.0.0/12"]
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: production-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AWSCluster
    name: production-aws

Multi-Cluster Service Discovery

# With Istio multi-cluster
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: external-svc
spec:
  hosts:
  - api.cluster2.local
  location: MESH_INTERNAL
  ports:
  - number: 80
    name: http
    protocol: HTTP
  resolution: DNS
  endpoints:
  - address: <cluster2-ingress-ip>
    ports:
      http: 15443

Advanced Scheduling

Pod Topology Spread Constraints

spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: web
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app: web

Priority and Preemption

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
globalDefault: false
preemptionPolicy: PreemptLowerPriority
description: "Critical production workloads"
---
apiVersion: v1
kind: Pod
metadata:
  name: critical-pod
spec:
  priorityClassName: high-priority
  containers:
  - name: app
    image: app:latest

Pod Disruption Budget

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2  # or use maxUnavailable: 1
  selector:
    matchLabels:
      app: web

Security Advanced

Pod Security Standards

apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted

Security Context

spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop:
        - ALL

RBAC Advanced

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: ServiceAccount
  name: my-service-account
  namespace: default
roleRef:
  kind: ClusterRole
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io

Autoscaling

Horizontal Pod Autoscaler v2

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      target:
        type: AverageValue
        averageValue: 1k
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max

Vertical Pod Autoscaler

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  updatePolicy:
    updateMode: "Auto"  # Off, Initial, Recreate, Auto
  resourcePolicy:
    containerPolicies:
    - containerName: "*"
      minAllowed:
        cpu: 100m
        memory: 50Mi
      maxAllowed:
        cpu: 1
        memory: 500Mi

KEDA (Kubernetes Event-Driven Autoscaling)

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer
spec:
  scaleTargetRef:
    name: kafka-consumer
  pollingInterval: 30
  cooldownPeriod: 300
  minReplicaCount: 1
  maxReplicaCount: 100
  triggers:
  - type: kafka
    metadata:
      bootstrapServers: kafka:9092
      consumerGroup: my-group
      topic: events
      lagThreshold: "50"

Interview Deep Dives

1. Explain Kubernetes garbage collection

  • Owner References: Resources have ownerReferences linking to parent
  • Cascading Delete: Foreground (wait for dependents) vs Background (immediate)
  • Finalizers: Prevent deletion until cleanup is complete
  • Orphan Policy: Delete owner but keep dependents

2. How does kube-proxy work?

  • iptables mode: Creates iptables rules for service routing
  • IPVS mode: Uses Linux IPVS for load balancing, better performance at scale
  • userspace mode: Legacy, proxies through userspace process

3. Explain Kubernetes API aggregation

  • API server can proxy requests to extension API servers
  • Used by metrics-server, custom API servers
  • Registered via APIService resources

4. How do admission controllers work?

  • Mutating: Modify resources before persistence
  • Validating: Accept/reject resources
  • Webhook-based: External HTTP callbacks
  • Order: Mutating β†’ Schema validation β†’ Validating