1. Nodes - Machines that host containers in the k8s cluster (also known as minions)
  2. Masters - Machines that run coordinating software that schedules containers on Nodes
  3. Cluster - A collection of masters and nodes
  4. Pods - Collections of containers and volumes that are bundled and scheduled together because they share a common resource
  5. Shared Fate - if one container in a pod dies, they all die
  6. Labels - key/value pairs assigned to a k8s object to identify and group it; labels need not be unique (see the manifest sketch after this list)
  7. Annotations - useful information about a k8s object
  8. Replicas - multiple copies of a pod
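    A minimal pod manifest ties several of these terms together. This is only a sketch: the pod name, image and the label/annotation values are hypothetical, not taken from any real cluster.
    ```yaml
    # Hypothetical pod manifest illustrating the glossary terms above.
    apiVersion: v1
    kind: Pod
    metadata:
      name: game-frontend                # the pod, scheduled onto a node
      labels:                            # key/value pairs used to identify and select the pod
        tier: frontend
        game: super-shooter-2
      annotations:                       # non-identifying, informational metadata
        build-date: "2016-03-01"
    spec:
      containers:                        # containers bundled into this pod (shared fate)
        - name: web
          image: nginx:1.9               # placeholder image
    ```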
  1. The Facts:
    • Containers are lightweight -> flexible and fast
    • Designed to be short-lived and fragile
    • VMs are like passenger jets - super resilient and have the full weight of an entire OS behind them
    • Containers are like the Lunar Module - light, heavily optimised and designed to work in very specific and orchestrated conditions
  2. The Problem:
    • Distribution of containers across machines (to minimise the effect of a single machine failure)
    • Detection of container failure and immediate replacement
    • Running containers in managed clusters and heavily orchestrating them
    • Management of overall network and memory resources for the cluster
  3. Pillars:
    1. Isolation - a failure of one computing unit cannot take down another
    2. Orchestration - resources should be well-balanced geographically (load distribution)
    3. Scheduling - detection of failures and their replacement should be near instant
  4. Applications vs Services:
    • Service:
      • Designed to do a small number of things
      • Has no UI and is invoked solely via some kind of API
      • When designing, try to make services as light and limited in purpose as possible (this aids rearranging them later in response to changes in load)
    • Application:
      • Has a UI (even if it is a CLI)
      • Performs a number of different tasks
      • Can possibly expose an API (but doesn't have to)
  5. Masters:
    • API Server:
      • Handles API calls from nearly all components of the master or the nodes
    • Etcd:
      • Service
      • Keeps and replicates the current configuration and run state of the cluster
      • Lightweight distributed key-value store
      • Originally developed by the CoreOS project
    • Scheduler and Controller Manager:
      • Schedule pods onto target nodes
      • Ensure that the correct number of pods is running at any given point in time
  6. Nodes:
    • Kubelet:
      • Daemon that runs on each node
      • Responds to commands from the master to create, destroy and monitor the containers
    • Proxy:
      • Simple proxy
      • Separates the IP address of the target container from the name of the service it provides (see the Service sketch after this list)
    • cAdvisor:
      • Optional
      • Daemon that runs on a node
      • Collects, aggregates, processes and exports information about running containers
      • E.g.: resource isolation parameters, historical usage and key network stats
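    To make the proxy's name-vs-IP decoupling concrete, here is a hedged sketch of a Service definition. The service name and ports are assumptions; the tier label reuses the example values from the selector notes further down.
    ```yaml
    # Sketch of a Service: clients use the stable service name/IP while the
    # node-local proxy forwards traffic to whichever pods match the selector.
    apiVersion: v1
    kind: Service
    metadata:
      name: game-frontend
    spec:
      selector:
        tier: frontend            # pods carrying this label receive the traffic
      ports:
        - port: 80                # port exposed under the service name/IP
          targetPort: 8080        # port the target container actually listens on
    ```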
  7. Pods:
    • In Docker every container gets its own IP address
    • In k8s a shared IP is assigned to the pod
    • Containers in the same pod can communicate with one another via localhost
    • Scheduling happens at the pod level, not the container level (see the sidecar sketch below)
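    A hedged sketch of the shared-IP behaviour: two containers in one pod, where the sidecar reaches the web server over localhost. The images and the probe command are placeholders.
    ```yaml
    # Both containers share the pod's IP and network namespace,
    # so the sidecar can reach the web server at localhost:80.
    apiVersion: v1
    kind: Pod
    metadata:
      name: web-with-sidecar
    spec:
      containers:
        - name: web
          image: nginx:1.9               # placeholder image
          ports:
            - containerPort: 80
        - name: sidecar
          image: busybox                 # placeholder image
          command: ["sh", "-c", "while true; do wget -qO- http://localhost:80 > /dev/null; sleep 10; done"]
    ```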
  8. Reasons to put services in a pod but not in a single container:
    • Management Transparency:
      • You assume responsibility for monitoring and managing how much of the resources each service inside the container uses
      • If one rogue process starves the others, it will be up to you to detect and fix that
      • K8s can manage this for you if the services are split into multiple containers in the same pod (see the resource-limits sketch after this list)
    • Deployment and Maintenance:
      • Individual containers can be rebuilt and redeployed whenever needed
      • Decoupling of deployments allows for faster iteration and testing
      • Simplifies rollbacks
    • Focus:
      • If k8s is handling all the monitoring and resource management, the containers can be lighter and contain only the business logic
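    As a sketch of the management-transparency point above: once the services live in separate containers, k8s can enforce per-container resource limits so a rogue process cannot starve its neighbour. The images and limit values below are purely illustrative.
    ```yaml
    # One pod, two containers, each with its own resource limits enforced by k8s.
    apiVersion: v1
    kind: Pod
    metadata:
      name: split-services
    spec:
      containers:
        - name: api
          image: example/api:1.0         # placeholder image
          resources:
            limits:
              cpu: "500m"                # at most half a CPU core
              memory: "256Mi"
        - name: worker
          image: example/worker:1.0      # placeholder image
          resources:
            limits:
              cpu: "250m"
              memory: "128Mi"
    ```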
  9. Lack of Durability:
    • Pods are not durable
    • From time to time the master may choose to evict the pod from its host
    • Pod eviction - kill it and start a new one somewhere else
    • Preservation of the application state is on the developer (you)
    • Use a shared data store like Redis, Memcached or Cassandra
  10. Volumes:
    • Docker volume - a virtual FS that the container can see and use
    • K8s volumes are defined at the pod level, which solves the following problems:
      • Durability:
        1. Containers die and are reborn all the time
        2. A volume tied to a container will die with it, along with any data written to it
      • Communication:
        1. Any container in the pod has access to the volume
        2. Easy to move temporary data across containers
    • It's important to be clear which kind of volume is in question (a container volume or a pod volume)
  11. Volume Types:
    • EmptyDir:
      • Most commonly used
      • Bound to the pod and always empty when first created (see the emptyDir sketch after this list)
      • Exists for the life of the pod
      • When the pod is evicted, the volume is destroyed and all data is lost
      • For the life of the pod, every container on it can read and write data to the volume
      • Ephemeral volume
    • NFS:
      • Mounting NFS volume at the pod level
      • Persistent volume
    • GCEPersistentDisk:
      • Google Cloud persistent disk that can be mounted as a volume on a pod
      • PD is like a managed NFS service
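    A hedged sketch of an EmptyDir volume shared between two containers in the same pod; the volume is created empty with the pod and disappears when the pod is evicted. Names, images and commands are hypothetical.
    ```yaml
    # emptyDir volume used to pass temporary data between containers in one pod.
    apiVersion: v1
    kind: Pod
    metadata:
      name: emptydir-example
    spec:
      volumes:
        - name: scratch
          emptyDir: {}                   # created empty when the pod starts
      containers:
        - name: writer
          image: busybox                 # placeholder image
          command: ["sh", "-c", "while true; do date >> /data/out.log; sleep 5; done"]
          volumeMounts:
            - name: scratch
              mountPath: /data           # writer's view of the shared volume
        - name: reader
          image: busybox                 # placeholder image
          command: ["sh", "-c", "touch /data/out.log; tail -f /data/out.log"]
          volumeMounts:
            - name: scratch
              mountPath: /data           # reader sees the same files
    ```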
  1. K8s provides basic ways to annotate and document the infrastructure:
    • Labels:
      • Decorates a pod and is a part of the pod.yaml file
      • Keys consist of an optional prefix followed by a "/" and a name, e.g. "application.game/tier" vs plain "tier"
      • Labels must conform to the rules of DNS entries (DNS Labels)
      • Prefixes are one or more DNS Labels separated by the "." character
      • Max size for prefix is 253 characters
      • Values follow the same rules but cannot be longer than 63 characters
      • Neither keys nor values can contain spaces
      • -- Is there a way to statically check your labels to avoid missed routes? --
    • Label Selectors:
      • Central means of routing and orchestration
      • Equality-based label selectors:
        1. Exact match
        2. Can be combined, e.g. tier != frontend, game = super-shooter-2
      • Set-based:
        1. "In/Not in" kind of query
        2. Checks the label against a set of acceptable values
        3. Can be combined, e.g. environment in (production, test), tier notin (frontend, back-end), partition (see the kubectl sketch after this list)
    • Annotations:
      • Key/value pairs whose keys follow the same rules as label keys
      • Store arbitrary, non-identifying information about an object, e.g. a build date or a URL to more information
      • Can be read back through the API, but unlike labels they are not used by selectors for routing or orchestration
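    A hedged sketch of what those selector queries look like on the command line; the label keys and values are the hypothetical ones used in the notes above, so the queries may well match nothing in a real cluster.
    ```sh
    # Equality-based selector: exact matches, combinable with commas.
    kubectl get pods -l 'tier!=frontend,game=super-shooter-2'

    # Set-based selector: membership tests plus a bare key existence check.
    kubectl get pods -l 'environment in (production, test),tier notin (frontend, back-end),partition'
    ```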
  2. Replication Controllers:
    • Replicas exist to provide scale and fault-tolerance to the cluster
    • Replication Controller ensures that a correct number of replicas is running at all times
    • Therefore it is a component that focuses on availability of the services in the pods
    • The controller will create and destroy replicas as it needs to
    • Controller operates following a set of rules defined in a pod template
    • Pod template:
      • Definition of the desired state of the cluster
      • Specifies which images are to be used to create each pod
      • Specifies how many replicas should exist (sketched below)
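    A hedged sketch of a replication controller with its pod template; the replica count, labels and image are the hypothetical example values used earlier in these notes.
    ```yaml
    # ReplicationController keeping three copies of the frontend pod running.
    apiVersion: v1
    kind: ReplicationController
    metadata:
      name: frontend-rc
    spec:
      replicas: 3                        # desired number of replicas
      selector:
        tier: frontend                   # label selector identifying the pods it manages
      template:                          # pod template: desired state for each replica
        metadata:
          labels:
            tier: frontend               # must match the selector above
            game: super-shooter-2
        spec:
          containers:
            - name: web
              image: nginx:1.9           # placeholder image
    ```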
  3. Replication scheme in k8s is loosely coupled:
    • Replication controller operates (and identifies pods) using a label selector
    • If the controller is killed, then replicas managed by it are unaffected
    • Benefits of such loose coupling:
      • Removing a pod for debugging is as simple as changing its label (see the relabelling sketch after this list)
      • Changing the label selector changes which replicas are controlled in real time
      • Controllers can be swapped and pods will be unaffected by the change
    • TIP: Account for possible failures in replication controllers as well!
      • Run at least two controllers to avoid having a single point of failure
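    A hedged sketch of the relabelling trick: overwrite the pod's label so it stops matching the controller's selector; the controller then starts a replacement, and the original pod is left alone for debugging. The pod name and the debug label value are hypothetical.
    ```sh
    # Pull a pod out from under its replication controller by changing its label.
    kubectl label pods frontend-pod-1 tier=debug --overwrite

    # The controller (selector tier=frontend) no longer sees the pod and spins up
    # a fresh replica; the relabelled pod stays around for inspection.
    kubectl get pods -l 'tier=debug'
    ```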
  4. On Scheduling and Scaling: