Curriculum Vitæ

Kevin Collas-Arundell

Summary (TL;DR)

Linux Ops, Go, Kubernetes and Cloudy person.

I’ve been in Operations roles for over a decade. In those roles, success has depended on communicating frequently across teams. What started as identifying and solving requirements for other teams at RMIT led to working directly with development teams to build new systems at Hitwise and Aconex, and then to being a cloud team member at LIFX with duties split between software development and operations. At ME Bank, Momenton and Annalise the scope of my duties has grown to include mentoring, training, design, planning and implementation. In my career I have supported everyone from sole traders to universities with tens of thousands of students, and worked as an Operations specialist everywhere from startups to giant multinationals, banks and more. I believe I can bring a pragmatic understanding of operations and product development with a dash of mild disappointment in network reliability.

Core Values:

  • It’s not done until it’s in prod
  • Faster, Smaller, Safer, Sooner
  • Document the why
  • On call doesn’t have to suck

Experience

April 2021 - Current, Senior DevOps Engineer - Annalise.ai

Operations team member with a focus on product availability, resilience and Continual Delivery principles

  • Part of the team building and maintaining the infrastructure of a rapidly iterating and growing startup
  • Implemented additional dynamic node tooling to improve resource availability while balancing cost
  • Researched solutions to provide scale to zero support for some services
  • Software currency and maintenance
  • Internal advocacy for continual delivery practices
  • Maintained provisioning tools for on premises based installations
  • Worked with dev teams on feature and product design
  • Discovered the “fun” of TensorFlow Serving
  • Gradual reduction in outstanding technical debt

July 2020 - March 2021, Senior DevOps Engineer - Momenton, Melbourne, Victoria

Consultant with a specialisation around Operations, Kubernetes and Deployment practices

  • Led an initiative to reduce complexity in development and deploy flows and move to a single branch model with continual integration and automated deployment
  • Created Kubernetes resources for the products developed and their deployment pipelines
  • Worked with the team to investigate Kafka operational patterns and develop plans for replication
  • Trained and instructed team members on day 2 Kubernetes operation and troubleshooting
  • Delivered internal brownbag on Bazel
  • Instructed team members on metric design for Prometheus-style collection

January 2019 - July 2020, Senior Platform Engineer - ME Bank, Melbourne, Victoria

Operations, cloud, service mesh and Kubernetes specialist

  • Participate in on-call process
    • Evaluate tool options
    • Assist in prototyping integration
  • Maintain CI platform
  • Work with service teams to build required features and support
  • Analyse and implement an out-of-hours scaling controller to cut costs
  • Upgrade and improve internal Helm charts
  • Provide operators to solve developer requirements
  • Deliver training on Istio, gRPC and Kubernetes
  • Mentor staff
  • Develop design proposals for internal tools
  • Improve log processing tooling
  • Develop a GitOps pattern using Flux
    • GitOps was implemented to give us isolated clusters with a more regular and regimented deployment process and reduce defects in deployments
  • Implement automated policy control systems using Open Policy Agent
    • To minimise defects in deployments, OPA was used to provide significant automatic protections and compliance systems
  • Participate in the quarterly PI planning process
    • Discuss priorities with platform consumers
    • Identify capacity
    • Ensure value of delivered features while maintaining slack for operational duties

September 2016 - January 2019, Cloud Engineer - LIFX, Melbourne, Victoria

Operations specialist focusing on scaling, resilience and maintaining the Go codebase.

  • Participated in On-Call rotation
  • Maintained production and staging environments
  • Rebuilt build infrastructure as internal demand changed
    • Migrated to Packer-created images on a static number of nodes
      • This allowed us to reduce build node contention and update our build nodes
    • Migrated to build containers hosted in GKE
    • Migrated to dynamically scaling build hosts running on GCP nodes
      • These two steps were taken to improve build performance and to reduce build node contention and cost
  • Designed, developed and implemented custom deployment tooling for Mesos to improve the deployment process
    • This deployment process reduced the risk and smoothed the impact of deploying the significant products at LIFX, allowing us to deploy much more frequently with smaller changes
  • Led the migration from Mesos & Marathon to Google Kubernetes Engine
    • Moving to GKE allowed the team to maintain high rates of growth and improve focus on internal value
  • Determined internal deployment processes for services on Kubernetes
    • Deployment improvements on Kubernetes let us deploy frequently and safely
    • Each service was deployed in an on-demand pattern with little risk due to the methods used
  • Maintained highly concurrent systems in Go
    • Rebuilt multiple internal services to address scaling issues
  • Debugged production services to identify, triage and resolve faults
  • Refactored internal services for higher resilience and performance
  • Reduced median internal messaging latency from 100ms to ~1ms
  • Maintained a high-volume NATS messaging layer
  • Created software design specifications
  • Implemented solutions per design specifications
  • Initial investigation into gRPC client-side balancing
  • Investigated and assisted in the implementation of gRPC internal APIs

December 2012 - September 2016, Linux Systems Engineer - Aconex, Melbourne, Victoria

Operations team member focusing on platform migrations and remediation

  • Was part of an on-call rotation for global support and escalation requests
  • Responded to critical incidents in major production environments
  • Built out infrastructure to deploy new production and QA instances on AWS
  • Identified significant cost savings in deployments
  • Trained new members on infrastructure and ops procedures
  • Worked with development teams to build and deploy new products
  • Maintained and improved Puppet infrastructure globally
  • Investigated and implemented Packer for automated Linux and Windows image building
  • Senior member of the team responsible for 6 data centre moves. Assisted new teams in 2 other moves
  • Internal Puppet SME
  • Migrated IMAP testing into a performant Go utility
  • Internal presentation to spread best practices in Puppet
  • Internal presentation discussing new QA environments
  • Developed new isolated QA environments for use by Engineering teams
  • Performed production releases
  • Discussed and planned improvements to QA and development environments with other teams

July 2011 - December 2012, Systems Administrator - Experian Hitwise, Melbourne, Victoria

Platform ops role with a focus on escalation and iterative improvements.

  • Involved in maintenance and growth of Puppet modules
  • Developed in-house Puppet modules to manage Jenkins build nodes
  • Managed provisioning of Xen and KVM virtual machines with Cobbler and Koan
  • Supported CentOS, Fedora, Debian and Ubuntu hosts
  • Implemented Puppet-managed, Linux-based Selenium testing infrastructure
  • Developed custom scripts to automate the provisioning and deployment of production systems with the AWS API
  • Assisted in the development of requirements and deployment of a production Hadoop environment
  • Responded to escalation requests from our operations support team

July 2008 - July 2011, IT Support Specialist - RMIT University, Melbourne, Victoria

  • Managed imaging back-end for multi-boot labs.
  • Developed and implemented pilot method to multicast clone many Apple products.
  • Primary Support Staff for Linux and UNIX machines.
  • Assisted in implementing initial multi-boot lab at our RMIT Vietnam Campus.
  • Assisted in implementing and support of cost-saving storage systems for research users.
  • Led college involvement in the implementation of improved authentication and directory mounting solutions for Linux and UNIX clients.
  • Led the pilot roll-out of the Puppet configuration management system for UNIX, Linux and OS X machines.
  • Trained new staff on IT support.
  • Determine requirements in a complex environment and develop non-trivial solutions.
  • Assisted in the implementation and planning of a new video-on-demand training tool.
  • Primary support for inter-uni video conferencing using the Access Grid system.
  • Documented software installations.
  • Trained other IT staff on the Access Grid system and developed training systems for them.
  • Ensured that software compliance, including licensing, was correct and current.
  • Respond to support calls from users in various areas of the University.
  • Organise the scheduled replacement of hardware with users.
  • Identify and resolve faults with hardware and software installations in varying environments.
  • Coordinate with external support agents to resolve hardware and software issues.
  • Support Linux, OS X and Windows systems throughout the University.
  • Supply consumables to users in supported lab environments.
  • Arrange suitable replacements for historical equipment for specialist hardware.
  • Assist in the creation of images for various areas throughout RMIT.
  • Assist in the development of software and hardware requirements.

Older roles

  • 2006 - June 2008, MIS Operator, Customer Support Operator - Lasseters Online Casino, Alice Springs
  • 2005 - 2006, Field Service Technician - Bizcom NT, Alice Springs
  • 2003 - 2004, Technical Support Consultant - Micropower NT, Alice Springs

Further information about my time in these roles may be found at https://kca.id.au/cv

Speaking

  • 2019, A brief introduction to gRPC, Internal presentation
  • 2018, Should you go, Internal presentation at Mondo
  • 2018, Kubernetes at LIFX v2, Infracoders Melbourne Meetup. Details about the Mesos to Kubernetes Migration at LIFX
  • 2018, Kubernetes at LIFX, REA DevOps Guild. Details about the Mesos to Kubernetes Migration at LIFX
  • 2018, Exploring gRPC balancing, Golang-Melbourne Meetup. Exploring the various types and patterns of load balancing and service discovery with gRPC
  • 2016, Traefik, Consul, Docker and you, Infracoders Melbourne Meetup. A look at service discovery options with Docker and Traefik

Community

  • 2022, Continuous Delivery Foundation Ambassador
  • October 2019, Assistant - Introduction to Programming in Go workshop, GopherCon Au, Sydney, NSW
  • 2013 - 2018, Sysadmin - Server Team, PAX Australia, Melbourne, Victoria
    • Help attendees with questions of a wide variety
    • Bump in and out (unload, set up and cable 150+ PCs, 10+ servers and kilometres of network cable)
    • Set up Windows game boxes and Linux and Windows game servers
    • Operation and implementation of infrastructure services
    • Work with the competitive team to run servers and maintain the competition schedule
    • Assist other areas
  • July 2015, Volunteer - Packer and Vagrant Workshop with Norton Truter, Infracoders Melbourne
    • Helped test the workshop plan
    • Helped attendees as they came across issues and discussed use of Packer

Recognition

  • 2012, High Performer of the year, Experian Hitwise
  • 2012, High flyer Award, Experian Hitwise GPD

Education

  • 2012, Graduate Certificate of Information Technology, Swinburne University

Interests

  • Running: Slowly
  • Cycling: Not as slowly
  • Public Speaking: Infrequently

References

Available on request via email