9 February 2022

Perpetual list of stuff to procrastinate about

This will hopefully be a somewhat up-to-date dump of what my brain is working through over time.

Operational and Development Stuff

Developer Responsibility

  • Pushing responsibility left
    • SLOs as part of the developer KPIs
    • Keeping the product inside the creating team
    • Long term ownership
    • The job is not over until it’s deleted
  • Confidence
    • Context sharing difficulties
    • Compatibility guarantees
    • Beyoncé rule
    • Infrastructure upgrade responsibility
  • Business Goals
    • Queries Per Cost
    • New functionality vs Reliability vs Priorities
  • SLOs as conversation tools
    • Using the SLO to decide what to work on
    • Feature vs reliability
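
One way to picture "using the SLO to decide what to work on" is an error budget check. This is a minimal sketch, assuming a request-based availability SLO; the function names and the 99.9% target in the usage note are illustrative and not taken from any particular tool.

```python
def error_budget_remaining(slo_target: float, total: int, failed: int) -> float:
    """Fraction of the error budget still unspent (negative when overspent)."""
    if total == 0:
        return 1.0  # no traffic, nothing spent
    allowed_failures = (1.0 - slo_target) * total
    if allowed_failures == 0:
        # A 100% target leaves no budget at all.
        return 1.0 if failed == 0 else float("-inf")
    return 1.0 - failed / allowed_failures


def what_to_work_on(slo_target: float, total: int, failed: int) -> str:
    """Hypothetical policy: budget left means features, otherwise reliability."""
    remaining = error_budget_remaining(slo_target, total, failed)
    return "features" if remaining > 0 else "reliability"
```

With a 99.9% target and 1,000,000 requests, roughly 1,000 failures are allowed, so 250 failures leaves about three quarters of the budget and the conversation leans towards features.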

Services as a graph

  • Dependency modeling
    • I depend on these services, which depend on those
    • These services depend on me
    • Who just changed?
  • Oops I forgot that service
  • Who owns this? Why is it me!?
  • Where are the docs?
  • What’s in the Box?
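
The dependency modeling questions above map onto a small directed graph. A rough sketch, assuming a hand-written adjacency map; the service names are made up for illustration.

```python
from collections import deque

# Hypothetical services: each key depends on the services in its list.
DEPENDS_ON = {
    "web": ["api"],
    "api": ["auth", "db"],
    "auth": ["db"],
    "db": [],
}


def reachable(graph, start):
    """Every node reachable from start by following edges (breadth first)."""
    seen = set()
    queue = deque(graph.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph.get(node, []))
    return seen


def invert(graph):
    """Flip the edges so the map answers 'who depends on me?'."""
    inverted = {name: [] for name in graph}
    for name, deps in graph.items():
        for dep in deps:
            inverted.setdefault(dep, []).append(name)
    return inverted
```

`reachable(DEPENDS_ON, "web")` answers "I depend on these services which depend on those", and `reachable(invert(DEPENDS_ON), "db")` answers "these services depend on me".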

ADR tools

Honeycomb.io for mystery investigations

  • The dangers of nanoservices/microservices
  • Cardinality
  • Long term vs short term investigation/planning
  • Most queries are boring: sample (h/t Honeycomb people)
    • Discarding boring stuff
    • Higher rates for errors
    • Higher rates for “that’s weird”
  • Comparison
  • The importance of speed
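
The sampling bullets above (discard boring stuff, keep more errors and more "that's weird") could be sketched like this. It is an illustration of the idea only, not Honeycomb's implementation; the rates, thresholds and field names are assumptions.

```python
import random

# Assumed rates: keep 1 in N events of each kind.
SAMPLE_RATES = {"error": 1, "weird": 5, "boring": 100}


def classify(event):
    """Hypothetical rules for what counts as an error or as weird."""
    if event.get("status", 200) >= 500:
        return "error"
    if event.get("duration_ms", 0) > 1000:
        return "weird"
    return "boring"


def sample(event, rng=random):
    """Return the event tagged with its sample rate, or None if dropped."""
    rate = SAMPLE_RATES[classify(event)]
    if rng.random() < 1.0 / rate:
        # Recording the rate lets later queries re-weight counts.
        return {**event, "sample_rate": rate}
    return None
```

Every error is kept, slow requests are kept one time in five, and the boring majority is mostly discarded; multiplying by `sample_rate` at query time approximates the original counts.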

IAP tool roundup

  • pomerium
  • oauth2-proxy
  • nginx auth lookup


Pipeline 1 to 10

  • Code delivery concept
  • Benefits of common paths
  • Repeatability of change
    • To support migrations of infrastructure
  • Optimising for frequency
  • Dynamic pipelines
    • Building a pipeline that changes based on the environment
  • Fan out and in
    • Tiers of environments
    • Not just a linear path
    • Waiting on eventual success
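
A sketch of "fan out and in" across tiers of environments: deploy to every environment in a tier in parallel, wait for all of them, and only then move to the next tier. The tier layout and the `deploy` callable are hypothetical stand-ins for whatever the real pipeline does.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical tiers of environments, promoted in order.
TIERS = [
    ["dev"],
    ["staging-eu", "staging-us"],
    ["prod-eu", "prod-us", "prod-ap"],
]


def run_pipeline(tiers, deploy):
    """Fan out within a tier, fan in before the next; stop on any failure.

    deploy(env) is assumed to return True on success.
    Returns the list of tiers that completed fully.
    """
    completed = []
    with ThreadPoolExecutor() as pool:
        for tier in tiers:
            # Fan out: one deploy per environment; list() waits for all of them.
            results = list(pool.map(deploy, tier))
            if not all(results):
                break  # do not promote past a failed tier
            completed.append(tier)
    return completed
```

For example, `run_pipeline(TIERS, lambda env: env != "staging-us")` completes only the `dev` tier, because one staging environment failed and nothing is promoted to prod.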


  • Why deployment velocity matters
    • Value only comes when people use it
    • Quicker delivery == smaller experiments
  • Frequent deployments
    • Build process understanding
    • Standardise deployment process with incident process
    • Consistent expectations
    • Run book memorisation or removal
  • Validation
    • Constant validation of deployment tooling
    • Validation of function and performance as part of your deploy automation
  • Org wide goals
    • Standardise on deployment expectations
    • Maintain ability to experiment

Path to Prod

  • Building a smooth path to prod for changes
  • Onboarding for new services
  • Deploying safely
    • Experimenting in Prod

CD Round up

Coordination is important as stuff runs in multiple environments/countries/regions. Big bang releases are generally the worst possible outcome due to the substantial risk involved.

  • Coordination vs Independence
  • The mistakes of manual pipelines
  • GitOps
    • Argo
    • Flux
  • CI Repurposed
    • GitHub Actions
    • GitLab CI
    • Jenkins
    • Buildkite
  • SaaS solutions
    • CircleCI

Faster builds by being lazy

  • If it’s important
  • Monorepos in 5 minutes
  • Bazel
  • Speed
    • Why Action Cache
    • Test caching and Merkle tree
    • Slow tests are hated
    • Parallelised builds
    • Remote Builds
      • Cross compile
  • Explicit dependency trees
    • Enables build reuse
    • Action Graph
    • Clear dependency context
    • Cross Service testing
  • Multi-language support
    • Bundling to provide a smoother onboarding
    • Only testing what you care about
      • Test when my dependencies change
      • Trigger others when I change
  • Downstream Triggers
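
The "test caching and Merkle tree" and "test when my dependencies change" ideas can be illustrated with a toy hash over an explicit dependency tree. This is not how Bazel is implemented, only a sketch of the principle; the target names and the assumption that both snapshots contain the same targets are mine.

```python
import hashlib


def target_hash(name, sources, deps, memo):
    """Hash of a target's own content plus the hashes of its dependencies."""
    if name in memo:
        return memo[name]
    digest = hashlib.sha256(sources[name].encode())
    for dep in sorted(deps.get(name, [])):
        digest.update(target_hash(dep, sources, deps, memo).encode())
    memo[name] = digest.hexdigest()
    return memo[name]


def changed_targets(old_sources, new_sources, deps):
    """Targets whose hash differs between two snapshots of the sources."""
    old_memo, new_memo = {}, {}
    return {
        name
        for name in new_sources
        if target_hash(name, old_sources, deps, old_memo)
        != target_hash(name, new_sources, deps, new_memo)
    }
```

Changing a library changes the hash of everything that depends on it, so those tests rerun, while unrelated targets keep their old hash and can be served from cache.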

GKE thread

  • Cluster maintenance requirements
  • Benefits of Spinnaker (feed off CD roundup?)
  • Dynamic deployment pipelines
    • Detecting new clusters
    • Deploying to a new cluster without making everyone hate you

Dev Life

  • Blueprinted deployment patterns
    • Consistent 80/20 plan
    • generic config vs specific customisation vs overriding
  • Service mesh or mess
  • Are you ready?
    • Diving into status check patterns
    • Readiness vs liveness with a dash of startup
    • Deep ready checks
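
A small sketch of how the three status checks differ, written as plain functions returning HTTP-style status codes rather than a real server. The `Health` class and its dependency checks are hypothetical.

```python
class Health:
    """Toy model of startup, liveness and readiness checks."""

    def __init__(self, dependency_checks=None):
        self.started = False
        # name -> callable returning True when the dependency is reachable
        self.dependency_checks = dependency_checks or {}

    def startup(self) -> int:
        # Startup: has initialisation finished yet?
        return 200 if self.started else 503

    def liveness(self) -> int:
        # Liveness: the process can answer at all; failing this means restart.
        return 200

    def readiness(self, deep: bool = False) -> int:
        # Readiness: should traffic be routed here right now?
        if not self.started:
            return 503
        if deep and not all(check() for check in self.dependency_checks.values()):
            return 503  # deep check: a dependency is down
        return 200
```

The deep variant is the risky one: if a shared dependency is down, every instance reports not ready at once, which is worth weighing before wiring dependencies into readiness.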

Multi Cluster Operations

  • Many Clusters vs Multi Cluster vs BIG Clusters
  • Benefits of disposable clusters
  • Multi Cluster -> Multi Region
    • Maybe even multi provider
    • Cluster configuration as a Pull operation
  • Submariner to connect across clusters?
  • State options
    • Ohno.gif
  • Admiralty for deploy federation

Cluster-API is a magic wtf.

Canarying your traffic

  • The test environment is prod? Always has been
    • Differences
      • Scale
      • Traffic load and patterns of load
      • Do you know how long your service has been running?
      • Concurrent load and sustained load
      • Spikiness
    • Hyrum’s Law
    • Longer term testing
  • Canary tools
    • Flagger
    • Spinnaker
    • ??
  • Pre Merge Canarying

Knative and Gloo

  • Minimising the width of the interface contract
  • API gateway vs mesh
  • 80/20 implementations

Krustlet investigation

  • What is Krustlet
  • Wasm vs WASI
  • Rust for great justice

API Interfaces

RPC vs Queue solutions

  • Conclusion
  • Queuing solutions
    • What you get
      • Buffering
      • Asynchronous
    • What you need to build
      • Error mechanism
      • Response channel
      • Back pressure
    • What do you still need to configure
      • topic information
      • Queue location
      • message specification
    • What you need to worry about
      • Scaling
        • Queues don’t make it magic
        • Kafka Partitions
        • Subscriber locking
      • Many single points of failure
      • Merging of control plane and data plane
      • Idempotency
  • RPC solutions
    • What you get
      • Response channel
      • Error mechanism
    • What you need to build
      • Asynchronous patterns
      • Back Pressure
      • Idempotency
    • What do you still need to configure
      • Endpoint discovery
      • I specification
    • What you need to worry about
      • Scaling
      • Control plane solutions
      • Idempotency
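
Idempotency shows up under both queues and RPC above, because redelivered messages and retried calls look the same to the receiver. A minimal sketch, assuming the caller supplies an idempotency key; the in-memory dictionary stands in for whatever durable store a real service would need.

```python
class IdempotentHandler:
    """Wrap a handler so repeated deliveries with the same key run it once."""

    def __init__(self, handler):
        self._handler = handler
        self._results = {}  # assumption: a real system needs durable storage

    def handle(self, key, payload):
        # First delivery: do the work and remember the result.
        if key not in self._results:
            self._results[key] = self._handler(payload)
        # Retries and redeliveries get the stored result back.
        return self._results[key]
```

The same wrapper works whether the duplicate comes from a queue redelivery or an RPC client retrying after a timeout.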

gRPC for programmatic conversion

  • One IDL
  • IDL First API creation
  • Forward and backward versioning
  • Multiple External Protocols
    • SOAP proxy
    • REST
    • GraphQL

Book thingies to write

  • Non fiction
    • SWE at Google
    • gopl.io
    • Seeking SRE
    • SRE and SRW books
    • Philosophy of software design

Overly complex and not sensible in any way webcam solution

  • Route audio and camera feed to multiple computers in a local network
  • Restrictions:
    • Due to driver signing on macOS, the video source MUST appear as a UVC webcam to support Slack etc.
    • To reduce duplicate speakers, audio out from client computers should route to the source box for mixing
    • Software installation is not possible on some devices
  • Raspberry Pi 4s running as a USB gadget
    • Composite USB device configured to be:
      • a sound card providing:
        • microphone in
        • speaker out
      • UVC webcam
    • Speaker out routed to source/speaker box with PulseAudio
    • Video streamed to pi with RTMP
  • One overly complex solution