container-linux-update-operator

by coreos

A Kubernetes operator to manage updates of Container Linux by CoreOS

Latest release: v0.7.0. License: Apache License 2.0.

Container Linux Update Operator

Container Linux Update Operator is a node reboot controller for Kubernetes running Container Linux images. When a reboot is needed after updating the system via update_engine, the operator will drain the node before rebooting it.

Container Linux Update Operator fulfills the same purpose as locksmith, but has better integration with Kubernetes by explicitly marking a node as unschedulable and deleting pods on the node before rebooting.
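
As a rough illustration of what "marking a node as unschedulable and deleting pods" means, the manual equivalent with kubectl might look like the sketch below. This is not what the operator itself executes. The function names and the KUBECTL override are hypothetical, added only so the commands can be previewed with KUBECTL=echo.

```shell
# KUBECTL is a hypothetical override (not part of the project); set
# KUBECTL=echo to print the commands instead of running them.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: prepare one node for a reboot by hand.
drain_for_reboot() {
    node="$1"
    # Mark the node unschedulable so no new pods are placed on it.
    $KUBECTL cordon "$node"
    # Evict the pods currently running on it. DaemonSet-managed pods are
    # skipped, since their controller would recreate them immediately.
    $KUBECTL drain "$node" --ignore-daemonsets
}

# Hypothetical helper: make the node schedulable again after it is back up.
restore_after_reboot() {
    $KUBECTL uncordon "$1"
}
```

Running `drain_for_reboot <node-name>` before a manual reboot and `restore_after_reboot <node-name>` afterwards approximates the sequence the operator automates. Note that `kubectl drain` already cordons the node; the explicit `cordon` step is kept only to mirror the two actions named above.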

Design

Original proposal

Container Linux Update Operator is divided into two parts: update-operator and update-agent.

update-agent runs as a DaemonSet on each node, waiting for an UPDATE_STATUS_UPDATED_NEED_REBOOT signal via D-Bus from update_engine. When it receives the signal, it indicates via node annotations that the node needs a reboot.

update-operator runs as a Deployment, watches for changes to node annotations, and reboots nodes as needed. It coordinates the reboots of multiple nodes in the cluster, ensuring that not too many are rebooting at once.

Currently, update-operator only reboots one node at a time.
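
The one-node-at-a-time rule can be modelled with a small sketch. This is a simplified illustration, not the operator's actual code: the function name pick_next_node and the state words "needs-reboot" and "rebooting" are hypothetical stand-ins for the node annotations the real components use.

```shell
# Hypothetical model of the coordination rule. Reads lines of the form
# "NODE STATE" on stdin and prints at most one node that may reboot next.
pick_next_node() {
    awk '
        # Any node already rebooting blocks all further reboots.
        $2 == "rebooting" { busy = 1 }
        # Remember the first node that has asked for a reboot.
        $2 == "needs-reboot" && candidate == "" { candidate = $1 }
        # Release a node only when nothing else is rebooting.
        END { if (!busy && candidate != "") print candidate }
    '
}
```

Given three nodes where two are waiting, `printf 'a ok\nb needs-reboot\nc needs-reboot\n' | pick_next_node` selects only the first waiting node. If any line reports a node as rebooting, nothing is selected until that reboot finishes.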

Requirements

  • A Kubernetes cluster (>= 1.6) running on Container Linux
  • The update-engine.service systemd unit on each machine should be unmasked, enabled and started in systemd
  • The locksmithd.service systemd unit on each machine should be masked and stopped in systemd

To unmask a service, run systemctl unmask <service>. To enable a service, run systemctl enable <service>. To start or stop a service, run systemctl start <service> or systemctl stop <service> respectively.
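
Applied to the two units listed in the requirements, those commands might be combined as in the sketch below. The function name prepare_node and the SYSTEMCTL override are hypothetical, not part of the project; the override exists only so the commands can be previewed with SYSTEMCTL=echo before running them as root on a node.

```shell
# SYSTEMCTL is a hypothetical override (not part of the project); set
# SYSTEMCTL=echo to print the commands instead of running them.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Hypothetical helper: put one machine into the state the requirements describe.
prepare_node() {
    # update-engine.service must be unmasked, enabled and started.
    $SYSTEMCTL unmask update-engine.service
    $SYSTEMCTL enable update-engine.service
    $SYSTEMCTL start update-engine.service

    # locksmithd.service must be stopped and masked.
    $SYSTEMCTL stop locksmithd.service
    $SYSTEMCTL mask locksmithd.service
}
```

Calling `prepare_node` on each machine is one way to satisfy both service requirements in a single step; provisioning tools can achieve the same result declaratively.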

Usage

Create the update-operator deployment and update-agent daemonset:

kubectl apply -f examples/deploy -R

Test

To test that it is working, you can SSH to a node and trigger an update check by running update_engine_client -check_for_update, or simulate that a reboot is needed by running locksmithctl send-need-reboot.