This document describes the high-level architecture of kube-rs.
This is intended for contributors or people interested in architecture.
The kube-rs repository contains 5 main crates, plus examples and tests.
The main crate that users generally import is `kube`, and it is a straight facade crate that re-exports from the four other crates:

- `kube_core` -> re-exported as `kube::core`
- `kube_client` -> re-exported as `kube::client`
- `kube_derive` -> re-exported as `kube::CustomResource`
- `kube_runtime` -> re-exported as `kube::runtime`
In terms of dependencies between these 4:

- `kube_core` is used by `kube_client` and `kube_runtime`
- `kube_client` is used by `kube_runtime`
- `kube_runtime` is the highest level abstraction

The extra indirection crate `kube` is there to avoid cyclic dependencies between the client and the runtime (if the client re-exported the runtime then the two crates would be cyclically dependent).
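As a rough illustration of the facade (assuming the `client`, `derive` and `runtime` features of `kube` are enabled), items originating in all four sub-crates are imported through `kube` alone:

```rust
#[allow(unused_imports)]
use kube::{
    core::ObjectMeta,    // originates in kube_core
    Api, Client,         // originate in kube_client
    CustomResource,      // originates in kube_derive
    runtime::watcher,    // originates in kube_runtime
};

fn main() {}
```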
NB: We refer to these crates by their crates.io names, which use underscores as separators, but the folders in the repository use dashes as separators.

When working on features/issues with kube-rs you will generally work inside one of these crates at a time, so we will focus on them in isolation, but talk about possible overlaps at the end.
## Kubernetes Ecosystem Considerations
The Rust ecosystem does not exist in a vacuum; we take heavy inspiration from the popular Go ecosystem. In particular:

- the `core` module contains invariants from apimachinery that are preserved across individual apis
- `client::Client` is a re-envisioning of a generic client-go
- the `runtime::Controller` abstraction follows conventions in controller-runtime
- the `derive::CustomResource` derive macro for CRDs is loosely inspired by kubebuilder's annotations
We do occasionally diverge on matters where following the Go side would be worse for the Rust language, but when it comes to choosing names and finding out where some modules / functionality should reside, a precedent in kubebuilder goes a long way.
We do not maintain the Kubernetes types generated from the swagger.json or the protos at present, and we do not handle client-side validation of fields relating to these types (that is left to the api-server).
We generally use k8s-openapi's Rust bindings for Kubernetes' builtin types; see k8s-openapi.
We also maintain an experimental set of Protobuf bindings, see k8s-pb.
## kube-core

This crate only contains types relevant to the Kubernetes API, abstractions analogous to what you will find inside apimachinery, and extra Rust traits that help us with generics further down the stack.
Starting out with the basic type modules first:

- `metadata`: the various metadata types
- `subresource`: a sans-IO style http interface for the API
- `watch`: a generic enum and behaviour for the watch api
- `params`: generic parameters passed to the sans-IO request interface (e.g. `ListParams`, `PostParams`)

Then there are the trait modules:

- `crd`: a versioned `CustomResourceExt` trait
- `object`: generic conveniences for iterating over typed lists of objects, and objects following spec/status conventions
- `resource`: the `Resource` trait used by the `Api`, plus a convenience `ResourceExt` trait for users
The most important export here is the `Resource` trait and its impls. It is a pretty complex trait, with an associated type called `DynamicType` (that is empty by default). Every `ObjectMeta`-using type that comes from k8s-openapi gets a blanket impl of `Resource` so we can use them generically.
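A minimal sketch of what the blanket impl buys us (the `name_any`/`namespace` helpers come from `ResourceExt` on recent releases; this assumes a `k8s-openapi` dependency with a version feature enabled):

```rust
use k8s_openapi::api::{apps::v1::Deployment, core::v1::Pod};
use kube::core::{Resource, ResourceExt};

// Any ObjectMeta-carrying type from k8s-openapi has Resource<DynamicType = ()>,
// so one generic function can describe all of them.
fn describe<K: Resource<DynamicType = ()>>(obj: &K) -> String {
    format!("{} '{}' in {:?}", K::kind(&()), obj.name_any(), obj.namespace())
}

fn main() {
    println!("{}", describe(&Pod::default()));
    println!("{}", describe(&Deployment::default()));
}
```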
Two further modules deal with dynamic api information:

- `discovery`: types returned by the discovery api; capabilities, verbs, scopes, key info
- `gvk`: partial type information to infer api types
The main type from these two modules is `ApiResource`, because it can also be used to construct a `kube_client::Api` instance without compile-time type information (dynamic types like `DynamicObject` have `Resource` impls where `DynamicType = ApiResource`).
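For example, a hedged sketch of driving a type-erased `Api` purely from runtime type information (assumes a reachable cluster and `tokio`):

```rust
use kube::{
    api::{Api, DynamicObject, ListParams},
    core::{ApiResource, GroupVersionKind},
    Client, ResourceExt,
};

#[tokio::main]
async fn main() -> kube::Result<()> {
    let client = Client::try_default().await?;

    // No compile-time type for the resource: describe it with an ApiResource instead
    let gvk = GroupVersionKind::gvk("apps", "v1", "Deployment");
    let ar = ApiResource::from_gvk(&gvk);

    let api: Api<DynamicObject> = Api::default_namespaced_with(client, &ar);
    for item in api.list(&ListParams::default()).await?.items {
        println!("{}", item.name_any());
    }
    Ok(())
}
```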
## kube-client

### config

- `Config` is the source-agnostic type (with all the information needed by our `Client`)
- `Kubeconfig` is for loading from `~/.kube/config` or from any number of kubeconfig-like files set by the `KUBECONFIG` evar
- `Config::from_cluster_env` reads environment variables that are injected when running inside a pod

In general this module has similar functionality to the upstream client-go/clientcmd module.
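A hedged sketch of the main entry points (error handling via `anyhow` just for brevity):

```rust
use kube::{
    config::{KubeConfigOptions, Kubeconfig},
    Config,
};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Highest-level helper: figure out the Config from kubeconfig files or the in-cluster env
    let _inferred = Config::infer().await?;

    // Or load kubeconfig file(s) explicitly (respects KUBECONFIG / ~/.kube/config) ...
    let kubeconfig = Kubeconfig::read()?;
    // ... and resolve a context from it into a source-agnostic Config
    let _explicit = Config::from_custom_kubeconfig(kubeconfig, &KubeConfigOptions::default()).await?;
    Ok(())
}
```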
### client

The `Client` is one of the most complicated parts of kube-rs, because it has the most generic interface. People can mock the `Client`, people can replace individual components and force-inject headers, people can choose their own tls stack, and - in theory - use whatever http clients they want.

A `Client` is created from the properties of a `Config`, building a particular `hyper::Client` with a pre-configured stack of `tower::Layer`s (see `TryFrom<Config> for Client`), but users can also pass in an arbitrary `tower::Service` (to fully customise or to mock). The signature restrictions on `Client::new` are commensurately large.
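The common path is the `TryFrom<Config>` conversion; a minimal sketch (mocking would instead pass a `tower_test` mock service to `Client::new`, which is how the crate's own tests avoid real IO):

```rust
use kube::{Client, Config};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Build the full tower layer stack described by the Config
    let config = Config::infer().await?;
    let client = Client::try_from(config)?;

    // Shorthand for the two lines above
    let _same = Client::try_default().await?;

    println!("default namespace: {}", client.default_namespace());
    Ok(())
}
```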
The `tls` module contains the `rustls` interfaces that let users pick their tls stacks. The connectors created in that module are passed to the `hyper::Client` based on feature selection.
The layers are configured using the properties in the `Config`. Some of our layers come straight from tower-http:

- `tower_http::DecompressionLayer` to deal with gzip compression
- `tower_http::TraceLayer` to propagate http request information onto tracing spans
- `tower_http::AddAuthorizationLayer` to set bearer tokens / basic auth (when needed)

but we also have our own layers in the `middleware` module:

- `AsyncFilterLayer<RefreshableToken>`, used depending on the authentication method in the kubeconfig. Similar to `AddAuthorizationLayer`, but with a token that is refreshed when necessary.
The `middleware` module is kept small to avoid mixing business logic (e.g. the openid connect / oauth provider logic in `client::auth`) with the tower layering glue.
The exported layers and tls connectors are mainly exposed through the `ConfigExt` trait, which is only implemented for `Config` (because the config has all the properties needed for this in general, and it helps minimise our api surface).
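A hedged sketch of a hand-rolled client along the lines of the upstream `custom_client` example (assumes the `rustls-tls` feature and a hyper 0.14-era release; newer releases build the inner service via hyper-util's legacy client instead, so treat this as illustrative only):

```rust
use kube::{client::ConfigExt, Client, Config};
use tower::ServiceBuilder;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let config = Config::infer().await?;

    // ConfigExt exposes the connector + the layers that TryFrom<Config> would otherwise assemble
    let https = config.rustls_https_connector()?;
    let service = ServiceBuilder::new()
        .layer(config.base_uri_layer())     // rewrites relative paths to the cluster url
        .option_layer(config.auth_layer()?) // bearer/basic/refreshable auth, if configured
        .service(hyper::Client::builder().build(https));

    let _client = Client::new(service, config.default_namespace);
    Ok(())
}
```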
The `Client` manages other key aspects of the IO protocol, such as:

- `Client::connect` performs an HTTP Upgrade for specialised verbs
- `Client::request` handles 90% of all requests
- handling the `Either<T, Status>` responses from kubernetes
### api

The `Api` type and its methods. It builds on top of the `Response` interface in `kube_core` by parametrising over a generic type `K` that implements `Resource` (plus whatever else is needed). The `Api` absorbs a `Client` on construction and is then configured with its `Scope` (through its constructors).
For dynamic types (like `DynamicObject`) it has slightly more complicated constructors (the `_with` suffixed variants) which also take the `DynamicType` (e.g. an `ApiResource`).

The `core_methods` and most `subresource` methods generally follow this recipe (sketched below):

- store the kubernetes verb and parameters in an http request via the sans-IO `Request` builder
- call the request with the `Client` and tell it what type(s) to deserialize into
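A rough illustration of the same recipe using the public pieces directly (the typed methods do this internally):

```rust
use k8s_openapi::api::core::v1::Pod;
use kube::{api::ListParams, core::{ObjectList, Request}, Client};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client = Client::try_default().await?;

    // 1) sans-IO (kube_core): encode the verb + params into a plain http::Request
    let req = Request::new("/api/v1/namespaces/default/pods").list(&ListParams::default())?;

    // 2) IO + typing (kube_client): execute it and say what to deserialize the body into
    let pods: ObjectList<Pod> = client.request(req).await?;
    println!("{} pods", pods.items.len());
    Ok(())
}
```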
Some subresource methods (behind the `ws` feature) use the `AttachedProcess` interface expecting a duplex stream to deal with specialised websocket verbs (such as `attach`), and call `Client::connect` first to get that stream.
### discovery

Deals with dynamic discovery of what apis are available on the api-server. Normally this is used to discover custom resources, but also certain standard resources that vary between providers.

The `Discovery` client can be used to do a full recursive sweep of api-groups into all api resources (through `run`), and users can then periodically re-`run` it to keep the cache up to date (as kubernetes is being upgraded behind the scenes).

The `discovery` module also contains a way to run smaller queries through the `oneshot` module; e.g. resolving a resource name from a group version kind, resolving every resource within one specific group, or even one group at a pinned version.

The equivalent Go logic is found in client-go/discovery.
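For instance, a hedged sketch of the full sweep, mirroring the upstream dynamic api example:

```rust
use kube::{discovery::Discovery, Client};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client = Client::try_default().await?;

    // Recursive sweep of api-groups into all api resources; re-`run` periodically to refresh
    let discovery = Discovery::new(client).run().await?;
    for group in discovery.groups() {
        for (ar, _caps) in group.recommended_resources() {
            println!("{}/{} {}", group.name(), ar.version, ar.kind);
        }
    }
    Ok(())
}
```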
## kube-derive

The smallest crate. A simple derive proc_macro that generates Kubernetes wrapper structs and trait impls around a data struct.

It uses darling to parse `#[kube(attrs...)]`, then uses quote to produce a suitable syntax tree based on the attributes requested.

It ultimately contains a lot of ugly json coercing from attributes into serialization code, but this is code that everyone working with custom resources needs.

It has hooks into schemars when using `JsonSchema` to ensure the correct type of CRD schema is attached to the right part of the generated custom resource definition.
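A small sketch of the user-facing side (`Document`/`DocumentSpec` and the group are made-up names; `serde_yaml` is only used for printing):

```rust
use kube::{CustomResource, CustomResourceExt};
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};

// Generates a `Document` wrapper struct (metadata + spec), its `Resource` impl,
// and a `CustomResourceExt` impl carrying the schemars-derived schema.
#[derive(CustomResource, Serialize, Deserialize, Clone, Debug, JsonSchema)]
#[kube(group = "example.kube.rs", version = "v1", kind = "Document", namespaced)]
pub struct DocumentSpec {
    pub title: String,
    pub hide: bool,
}

fn main() {
    // Print the generated CustomResourceDefinition to apply to a cluster
    println!("{}", serde_yaml::to_string(&Document::crd()).unwrap());
}
```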
## kube-runtime

The highest level crate, dealing with the highest level abstractions (such as controllers/watchers/reflectors) and specific Kubernetes apis that need common care (finalizers, waiting for conditions, event publishing).
### watcher

The `watcher` module contains state machine wrappers around `Api::watch` that will watch and auto-recover on allowable failures.

The `watcher` fn is the general purpose one, similar to informers in Go land, and will watch a collection of objects. The `watch_object` fn is a specialised version of it that watches a single object.
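A hedged usage sketch (the second argument is a `watcher::Config` on recent releases; older ones took a `ListParams`):

```rust
use futures::TryStreamExt;
use k8s_openapi::api::core::v1::Pod;
use kube::{
    runtime::{watcher, WatchStreamExt},
    Api, Client, ResourceExt,
};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client = Client::try_default().await?;
    let pods: Api<Pod> = Api::default_namespaced(client);

    // watcher wraps Api::watch and auto-recovers from desyncs/timeouts
    watcher(pods, watcher::Config::default())
        .applied_objects()
        .try_for_each(|p| async move {
            println!("saw {}", p.name_any());
            Ok(())
        })
        .await?;
    Ok(())
}
```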
### reflector

The `reflector` module contains wrappers around `watcher` that will cache objects in memory.

The `reflector` fn wraps a `watcher` and a state `Store` that is updated on every event emitted by the watcher.

The reason for the difference between `watcher::Event` (created by `watcher`) and `kube::api::WatchEvent` (created by `Api::watch`) is that `watcher` deals with desync errors and does a full relist whose result is then propagated as a single event, ensuring the `reflector` can do a single, atomic update to its state.
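A hedged sketch of wiring the two halves together (the `Store` reader can be cloned and handed to other tasks; `wait_until_ready` exists on recent releases):

```rust
use futures::StreamExt;
use k8s_openapi::api::core::v1::ConfigMap;
use kube::{
    runtime::{reflector, watcher, WatchStreamExt},
    Api, Client,
};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client = Client::try_default().await?;
    let cms: Api<ConfigMap> = Api::default_namespaced(client);

    // reader: the in-memory Store handed to consumers; writer: fed by the reflector
    let (reader, writer) = reflector::store::<ConfigMap>();
    let stream = reflector(writer, watcher(cms, watcher::Config::default()));

    // The stream must be polled for the Store to stay up to date
    tokio::spawn(stream.applied_objects().for_each(|_| futures::future::ready(())));

    reader.wait_until_ready().await?;
    println!("cached {} configmaps", reader.state().len());
    Ok(())
}
```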
### controller

The `controller` module contains the `Controller` type and its associated definitions.

A `Controller` is configured to watch one root object (configured via `::new`) and several owned objects (via `::owns`), and - once `::run` is called - it will hit a user's `reconcile` function for every change to the root object or any of its child objects (internally it will traverse up the object tree - usually through owner references - to find the affected root object).

The user is then meant to provide an idempotent `reconcile` fn, which does not know what underlying object was changed, to ensure the state configured in its crd is what can be seen in the world.
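A hedged skeleton of the user-facing side, following the upstream examples (a real controller would normally use a custom resource as the root object and add `.owns()`/`.watches()` relations; `thiserror`/`anyhow` are used for brevity, and recent releases take a `watcher::Config` where older ones took a `ListParams`):

```rust
use std::{sync::Arc, time::Duration};

use futures::StreamExt;
use k8s_openapi::api::core::v1::ConfigMap;
use kube::{
    runtime::{controller::Action, watcher, Controller},
    Api, Client, ResourceExt,
};

#[derive(thiserror::Error, Debug)]
#[error("reconcile failed")]
struct Error;

// Idempotent: only told *which* root object is affected, never what changed
async fn reconcile(obj: Arc<ConfigMap>, _ctx: Arc<Client>) -> Result<Action, Error> {
    println!("reconciling {}", obj.name_any());
    Ok(Action::requeue(Duration::from_secs(300)))
}

fn error_policy(_obj: Arc<ConfigMap>, _err: &Error, _ctx: Arc<Client>) -> Action {
    Action::requeue(Duration::from_secs(5))
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client = Client::try_default().await?;
    let cms: Api<ConfigMap> = Api::default_namespaced(client.clone());

    Controller::new(cms, watcher::Config::default())
        .run(reconcile, error_policy, Arc::new(client))
        .for_each(|res| async move {
            if let Err(e) = res {
                eprintln!("reconcile failed: {e:?}");
            }
        })
        .await;
    Ok(())
}
```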
To manage this, a vector of watchers is converted into a set of streams of the same type by mapping the watchers so they have the same output type. This is why `owns` looks up owner references for you, whereas `watches` needs you to define the relation yourself with a mapper. The mappers we support are `trigger_owners`, `trigger_self`, and custom mappers via `trigger_with`.
Once we have combined the stream of streams we essentially have a flattened super-stream with events from multiple watchers that will act as our input events. With this, the `applier` can start running its fairly complex machinery:

- new input events get sent to the scheduler
- scheduled events are then passed through a `Runner` that prevents duplicate parallel requests for the same object
- when running, we send the affected object to the user's `reconciler` fn and await that future
- a) on success, prepare the user's `Action` (generally a slow requeue several minutes from now)
- b) on failure, prepare an `Action` based on the user's error policy (generally a backoff'd requeue with a shorter initial delay)
- map the resulting `Action`s into requeue requests through an ad-hoc channel
- requeue requests sent through the channel are picked up at the top of the `applier` and merged with the input events in step 1

Ideally, the process runs forever, and it minimises unnecessary reconcile calls (e.g. when users change more than one related object while a reconcile is already happening).
See controller internals for some more information on this.
### finalizer

Contains a helper wrapper `finalizer` for a `reconcile` fn used by a `Controller` when a user is using finalizers to handle garbage collection.

This lets the user focus on selecting the type of behaviour they would like based on whether the object is being deleted or just being regularly reconciled (through enum matching on `finalizer::Event`), and lets them avoid checking for deletion timestamps and managing the state machinery of `metadata.finalizers` through json patching themselves.
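A hedged sketch of what this looks like from inside a reconciler (the finalizer name and `MyError` type are made up; `thiserror` is used for brevity):

```rust
use std::{sync::Arc, time::Duration};

use k8s_openapi::api::core::v1::ConfigMap;
use kube::{
    runtime::{
        controller::Action,
        finalizer::{finalizer, Error as FinalizerError, Event},
    },
    Api, Client, ResourceExt,
};

#[derive(thiserror::Error, Debug)]
#[error("reconcile failed")]
struct MyError;

async fn reconcile(obj: Arc<ConfigMap>, client: Client) -> Result<Action, FinalizerError<MyError>> {
    let ns = obj.namespace().unwrap_or_else(|| "default".into());
    let api: Api<ConfigMap> = Api::namespaced(client, &ns);

    // The helper adds/removes the finalizer via json patches and dispatches to the right branch
    finalizer(&api, "example.kube.rs/cleanup", obj, |event| async {
        match event {
            Event::Apply(_cm) => Ok::<_, MyError>(Action::requeue(Duration::from_secs(300))),
            Event::Cleanup(_cm) => Ok(Action::await_change()),
        }
    })
    .await
}
```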
### wait

Contains helpers for waiting for `conditions`, or for objects to be fully removed (i.e. waiting for finalizers post-delete).

These build upon `watch_object` with specific mappers.
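For example, a hedged sketch of waiting for a hypothetical CRD named `documents.example.kube.rs` to become established:

```rust
use k8s_openapi::apiextensions_apiserver::pkg::apis::apiextensions::v1::CustomResourceDefinition;
use kube::{
    runtime::wait::{await_condition, conditions},
    Api, Client,
};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client = Client::try_default().await?;
    let crds: Api<CustomResourceDefinition> = Api::all(client);

    // Wait (bounded by a timeout) until the CRD reports the Established condition
    let establish = await_condition(crds, "documents.example.kube.rs", conditions::is_crd_established());
    let _crd = tokio::time::timeout(std::time::Duration::from_secs(10), establish).await??;
    Ok(())
}
```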
### events

Contains an event `Recorder` ala client-go/events that controllers can hook into to publish events related to their reconciliations.
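A hedged sketch using the long-standing `Recorder::new(client, reporter, object_ref)` form (recent releases moved the object reference to `publish` instead, so check the version you are on; the controller name is made up):

```rust
use k8s_openapi::api::core::v1::ConfigMap;
use kube::{
    core::Resource,
    runtime::events::{Event, EventType, Recorder, Reporter},
    Client,
};

async fn publish_reconciled(client: Client, cm: &ConfigMap) -> anyhow::Result<()> {
    let reporter = Reporter {
        controller: "my-controller".into(),
        instance: None,
    };
    // The recorder is bound to the object the events are about
    let recorder = Recorder::new(client, reporter, cm.object_ref(&()));
    recorder
        .publish(Event {
            type_: EventType::Normal,
            reason: "Reconciled".into(),
            note: Some("object reconciled".into()),
            action: "Reconciling".into(),
            secondary: None,
        })
        .await?;
    Ok(())
}
```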
## Crate Delineation and Overlaps

When working on the client machinery, it is important to realise that there are effectively 5 layers involved:
1. Sans-IO request builder (in `kube_core`)
2. IO (in `kube_client`'s `Client`)
3. Typing (in `kube_client`'s `Api`)
4. Helpers for using the API correctly (e.g. `kube_runtime`'s `watcher`)
5. High-level abstractions for specific tasks (e.g. `kube_runtime`'s `Controller`)

At level 3, we essentially have what the K8s team calls a basic client. As a consequence, new methods/subresources typically cross 2 crate boundaries (`kube_core` and `kube_client`), and need to touch 3 main modules.
Similarly, there are also the traits and types that define what an api means, in `kube_core`.
If modifying these, then changes to kube-derive are likely necessary, as it needs to directly implement these for users.
These types of cross-crate dependencies are why we expose
kube as a single versioned facade crate that users can upgrade atomically (without being caught in the middle of a publish cycle). This also gives us better compatibility with