The Beginner’s Guide to the CNCF Landscape

26-10-2018 / CloudOps

The cloud native ecosystem can be complicated and confusing. Its myriad of open source projects are supported by the constant contributions of a vibrant and expansive community. The Cloud Native Computing Foundation (CNCF>) has a landscape map that shows the full extent of cloud native solutions, many of which are under their umbrella.

As a CNCF ambassador, I am actively engaged in promoting community efforts and cloud native education throughout Canada. At CloudOps I lead workshops on Docker and Kubernetes that provide an introduction to cloud native technologies and help DevOps teams operate their applications.

I also organize Kubernetes and Cloud Native meetups that bring in speakers from around the world and represent a variety of projects. They are run quarterly inMontreal,Ottawa, Toronto, Kitchener-Waterloo, and Quebec City. Reach out to me @archyufa or email CloudOps to learn more about becoming cloud native.

In the meantime, I have written a beginners guide to the cloud native landscape. I hope that it will help you understand the landscape and give you a better sense of how to navigate it.

CNCF

The History of the CNCF

In 2014 Google open sourced an internal project called Borg that they had been using to orchestrate containers. Not having a place to land the project, Google partnered with the Linux Foundation to create the Cloud Native Computing Foundation (CNCF), which would encourage the development and collaboration of Kubernetes and other cloud native solutions. Borg implementation was rewritten in Go, renamed to Kubernetes and donated as the incepting project. It became clear early on that Kubernetes was just the beginning and that a swarm of new projects would join the CNCF, extending the functionality of Kubernetes.

The CNCF Mission

The CNCF fosters this landscape of open source projects by helping provide end-user communities with viable options for building cloud native applications. By encouraging projects to collaborate with each other, the CNCF hopes to enable fully-fledged technology stacks comprised solely of CNCF member projects. This is one way that organizations can own their destinies in the cloud.

CNCF Processes

A total of twenty-five projects have followed Kubernetes and been adopted by the CNCF. In order to join, projects must be selected and then elected with a supermajority by the Technical Oversight Committee (TOC). The voting process is aided by a healthy community of TOC contributors, which are representatives from CNCF member companies, including myself. Member projects will join the Sandbox, Incubation, or Graduation phase depending on their level of code maturity.

Sandbox projects are in a very early stage and require significant code maturity and community involvement before being deployed in production. They are adopted because they offer unrealized potential. The CNCF’s guidelines state that the CNCF helps encourage the public visibility of sandbox projects and facilitate their alignment with existing projects. Sandbox projects receive minimal funding and marketing support from the CNCF and are subject to review and possible removal every twelve months.

Projects enters the Incubation when they meet all sandbox criteria as well as demonstrate certain growth and maturity characteristics. They must be in production usage by at least three companies, maintain healthy team that approves and accepts a healthy flow of contributions that include new features and code from the community.
Once Incubation projects have reached a tipping point in production use, they can be voted by the TOC to have reached Graduation phase. Graduated projects have to demonstrate thriving adoption rates and meet all Incubation criteria. They must also have committers from at least two organizations, have documented and structured governance processes, and meet the Linux Foundation Core Infrastructure Initiative’s Best Practices Badge. So far, only Kubernetes and Prometheus have graduated.

The Projects Themselves

Below I’ve grouped projects into twelve categories: orchestration, app development, monitoring, logging, tracing, container registries, storage and databases, runtimes, service discovery, service meshes, service proxy, security, and streaming and messaging. And provided information that can be helpful for companies or individuals to evaluate what each project does, how project integrates with other CNCF projects and understand its evolution and current state.

Orchestrations

Kubernetes (graduated) Kubernetes automates the deployment, scaling, and management of containerised applications, emphasising automation and declarative configuration. It means helmsman in ancient Greek. Kubernetes orchestrates containers, which are packages of portable and modular microservices. Kubernetes adds a layer of abstraction, grouping containers into pods. Kubernetes helps engineers schedule workloads and allows containers to be deployed at scale over multi-cloud environments. Having graduated, Kubernetes has reached a critical mass of adoption. In a recent CNCF survey, over 40% of respondents from enterprise companies are running Kubernetes in production.

App Development

Helm Helm (Incubating) Helm is an application package manager that allows users to find, share, install, and upgrade Kubernetes applications (aka charts) with ease. It helps end users deploy existing applications (including MySQL, Jenkins, Artifactory and etc.) using KubeApps Hub, which display charts from stable and incubator repositories maintained by the Kubernetes community. With Helm you can install all other CNCF projects that run on top of Kubernetes. Helm can also let organizations create and then deploy custom applications or microservices to Kubernetes. This involves creating YAML manifests with numerical values not suitable for deployment in different environments or CI/CD pipelines. Helm creates single charts that can be versioned based on application or configuration changes, deployed in various environments, and shared across organizations.

Helm originated at Deis from an attempt to create a ‘homebrew’ experience for Kubernetes users. Helm V2 consisted of the client-side of what is currently the Helm Project. The server-side ‘tiller’, or Helm V2, was added by Deis in collaboration with Google at around the same time that Kubernetes 1.2 was released. This was how Helm became the standard way of deploying applications on top of Kubernetes.

Helm is currently making a series of changes and updates in preparation for the release of Helm V3, which is expected to happen by the end of the year. Companies that rely on Helm for their daily CI/CD development, including Reddit, Ubisoft, and Nike, have suggested improvements for the redesign.

Telepresence (Sandbox) It can be challenging to develop containerized applications on Kubernetes. Popular tools for local development include Docker Compose and Minikube. Unfortunately, most cloud native applications today are resource intensive and involve multiple databases, services, and dependencies. Moreover, it can be complicated to mimic cloud dependencies, such as messaging systems and databases in Compose and Minikube. An alternative approach is to use fully remote Kubernetes clusters, but this precludes you from developing with your local tools (e.g., IDE, debugger) and creates slow developer “inner loops” that make developers wait for CI to test changes.

Telepresence, which was developed by Datawire, offers the best of both worlds. It allows the developer to ‘live code’ by running single microservices locally for development purposes while remaining connected to remote Kubernetes clusters that run the rest of their application. Telepresence deploys pods that contain two-way network proxies on remote Kubernetes clusters. This connects local machines to proxies. Telepresence implements realistic development/test environments without freezing local tools for coding, debugging, and editing.

Monitoring

Prometheus (Graduated) Following in the footsteps of Kubernetes, Prometheus was the second project to join the CNCF and the second (and so far last) project to have graduated. It’s a monitoring solution that is suitable for dynamic cloud and container environments. It was inspired by Google’s monitoring system, Borgman. Prometheus is a pull-based system – its configurations decide when and what to scrape. This is unlike other monitoring systems using push-based approach where monitoring agent running on nodes. Prometheus stores scrapped metrics in a TSDB. Prometheus allows you to create meaningful graphs inside the Grafana dashboard with powerful query languages, such as PromQL. You can also generate and send alerts to various destinations, such as slack and email, using the built-in Alert Manager.

Hugely successful, Prometheus has become the de facto standard in cloud native metric monitoring. With Prometheus one can monitor VMs, Kubernetes clusters, and microservices being run anywhere, especially in dynamic systems like Kubernetes. Prometheus’ metrics also automate scaling decisions by leveraging Kubernetes’ features including HPA, VPA, and Cluster Autoscaling. Prometheus can monitor other CNCF projects such as Rook, Vitesse, Envoy, Linkerd, CoreDNS, Fluentd, and NATS. Prometheus’ exporters integrate with many other applications and distributed systems. Use Prometheus’ official Helm Chart to start.

OpenMetrics(Sandbox) OpenMetrics creates neutral standards for an application’s metric exposition format. Its modern metric standard enables users to transmit metrics at scale. OpenMetrics is based on the popular Prometheus exposition format, which has over 300 existing exporters and is based on operational experience from Borgmon. Borgman enables ‘white-box monitoring’ and mass data collection with low overheads. The monitoring landscape before OpenMetrics was largely based on outdated standards and techniques (such as SNMP) that use proprietary formats and place minimal focus on metrics. OpenMetrics builds on the Prometheus exposition format, but has a tighter, cleaner, and more enhanced syntax. While OpenMetrics is only in the Sandbox phase, it is already being used in production by companies including AppOptics, Cortex, Datadog, Google, InfluxData, OpenCensus, Prometheus, Sysdig, and Uber.

Cortex Cortex (Sandbox) Operational simplicity has always been a primary design objective of Prometheus. Consequently, Prometheus itself can only be run without clustering (as single nodes or container) and can only use local storage that is not designed to be durable or long-term. Clustering and distributed storage come with additional operational complexity that Prometheus forgoed in favour of simplicity. Cortex is a horizontally scalable, multi-tenant, long-term storage solution that can complement Prometheus. It allows large enterprises to use Prometheus while maintaining access to HA (High Availability) and long-term storage. There are currently other projects in this space that are gaining community interest, such as Thanos, Timbala, and M3DB. However, Cortex has already been battle-tested as a SaaS offering at both GrafanaLabs and Weaveworks and is also deployed on prem by both EA and StorageOS.

Logging and Tracing

Fluentd Fluentd (Incubator) Fluentd collects, interprets, and transmits application logging data. It unifies data collection and consumption so you can better use and understand your data. Fluentd structures data as JSON and brings together the collecting, filtering, buffering, and outputting of logs across multiple sources and destinations. Fluentd can collect logs from VMs and traditional applications, however it really shines in cloud native environments that run microservices on top of Kubernetes, where applications are created in a dynamic fashion.

Fluentd runs in Kubernetes as a daemonset (workload that runs on each node). Not only does it collects logs from all applications being run as containers (including CNCF ones) and emits logs to STDOUT. Fluentd also parses and buffers incoming log entries and sends formatted logs to configured destinations, such as Elasticsearch, Hadoop, and Mongo, for further processing.

Fluentd was initially written in Ruby and takes over 50Mb in memory at runtime, making it unsuitable for running alongside containers in sidecar patterns. Fluentbit is being developed alongside Fluentd as a solution. Fluentbit is written in C and only uses a few Kb in memory at runtime. Fluentd is more efficient in CPU and memory usage, but has more limited features than Fluentd. Fluentd was originally developed by Treasuredata as an open source project.

Fluentd is available as a Kubernetes plugin and can be deployed as version 0.12, an older and more stable version that currently is widely deployed in production. The new version (Version 1.X) was recently developed and has many improvements, including new plugin APIs, nanosecond resolution, and windows support. Fluentd is becoming the standard for log collection in the cloud native space and is a solid candidate for CNCF Graduation.

OpenTracing (Incubator) Do not underestimate the importance of distributed tracing for building microservices at scale. Developers must be able to view each transaction and understand the behaviour of their microservices. However, distributed tracing can be challenging because the instrumentation must propagate the tracing context both within and between the processes that exist throughout services, packages, and application-specific code. OpenTracing allows developers of application code, OSS packages, and OSS services to instrument their own code without locking into any particular tracing vendor. OpenTracing provides a distributed tracing standard for applications and OSS packages with vendor-neutral APIs with libraries available in nine languages. These enforce distributed tracing, making OpenTracing ideal for service meshes and distributed systems. OpenTracing itself is not a tracing system that runs traces to analyze spans from within the UI. It is an API that works with application business logic, frameworks, and existing instrumentation to create, propagate, and tag spans. It integrates with both open source (e.g. Jaeger, Zipkin) or commercial (e.g Instana, Datadog) tracing solutions, and create traces that are either stored in a backend or spanned into a UI format. Click here to try a tutorial or start instrumenting your own system with Jaeger, a compatible tracing solution.

Jaeger Jaeger (Incubator) Jaeger is a distributed tracing system solution that is compatible with OpenTracing and was originally developed and battle tested by Uber. Its name is pronounced yā′gər and means hunter. It was inspired by Dapper, Google’s internal tracing system, and Zipkin, an alternative open source tracing system that was written by Twitter but built with the OpenTracing’s standard in mind. Zipkin has limited OpenTracing integration support, but Jaeger does provide backwards-compatibility with Zipkin by accepting spans in Zipkin formats over HTTP. Jaeger’s use cases monitor and troubleshoot microservices-based distributions, providing distributed context propagation, distributed transaction monitoring, root cause analysis, service dependency analysis, and performance and latency optimization. Jaeger’s data model and instrumentation libraries are compatible with OpenTracing. Its Modern Web UI is built with React/Javascript and has multiple supports for its backend. This includes Cassandra, Elasticsearch, and memory. Jaeger integrates with service meshes including Istio and Linkerd, making tracing instrumentation much easier.

Jaeger has observatibility because it exposes Prometheus metrics by default and integrates with Fluentd for log shipping. Start deploying Jaeger to Kubernetes using a Helm chart or the recently developed Jaeger Operator. Most contributions to the Jaeger codebase come from Uber and RedHat, but there are hundreds of companies adopting Jaeger for cloud native, microservices-based, distributed tracing.

Container Registries

Harbor Harbor (Sandbox) Harbor is an open source trusted container registry that stores, signs, and scans docker images. It provides free-of-charge, enhanced docker registry features and capabilities. These include a web interface with role-based access control (RBAC) and LDAP support. It integrates with Clair, an open source project developed by CoreOS, for vulnerability scanning and with Notary, a CNCF Incubation project described below, for content trust. Harbor provides activity auditing, Helm chart management and replicates images from one Harbor instance to another for HA and DR. Harbor was originally developed by VMWare as an open source solution. It is now being used by companies of many sizes, including TrendMicro, Rancher, Pivotal, and AXA.

Storage and Databases

Rook Rook (Incubator) Rook is an open source cloud native storage orchestrator for Kubernetes. With Rook, ops teams can run Software Distributed Systems (SDS) (such as Ceph) on top of Kubernetes. Developers can then use that storage to dynamically create Persistent Volumes (PV) in Kubernetes to deploy applications, such as Jenkins, WordPress and any other app that requires state. Ceph is a popular open-source SDS that can provide many popular types of storage systems, such as Object, Block and File System and runs on top of commodity hardware. While it is possible to run Ceph clusters outside of Kubernetes and connect it to Kubernetes using the CSI plugin, deploying and then operating Ceph clusters on hardware is a challenging task, reducing the popularity of the system. Rook deploys and integrates Ceph inside Kubernetes as a first class object using Custom Resource Definition (CRDs) and turns it into a self-managing, self-scaling, and self-healing storage service using the Operator Framework. The goal of Operators in Kubernetes is to encode human operational knowledge into software that is more easily packaged and shared with end users. In comparison to Helm that focuses on packaging and deploying Kubernetes applications, Operator can deploy and manage the life cycles of complex applications. In the case of Ceph, Rook Operator automates storage administrator tasks, such as deployment, bootstrapping, configuration, provisioning, horizontal scaling, healing, upgrading, backups, disaster recovery and monitoring. Initially, Rook Operator’s implementation supported Ceph only. As of version 0.8, Ceph support has been moved to Beta. Project Rook later announced Rook Framework for storage providers, which extends Rook as a general purpose cloud native storage orchestrator that supports multiple storage solutions with reusable specs, logic, policies and testing. Currently Rook supports CockroachDB, Minio, NFS all in alpha and in future Cassandra, Nexenta, and Alluxio. The list of companies using Rook Operator with Ceph in production is growing, especially for companies deploying on Prem, amongst them CENGN, Gini, RPR and many in the evaluation stage.

Vitess Vitess (Incubator) Vitess is a middleware for databases. It employs generalized sharding to distribute data across MySQL instances. It scales horizontally and can scale indefinitely without affecting your application. When your shards reach full capacity, Vitess will reshard your underlying database with zero downtime and good observativability. Vitess solves many problems associated with transactional data, which is continuing to grow.

TiKV TiKV (Sandbox) TiKV is a transactional key-value database that offers simplified scheduling and auto-balancing. It acts as a distributed storage layer that supports strong data consistency, distributed transactions, and horizontal scalability. TiKV was inspired by the design of Google Spanner and HBase, but has the advantage of not having a distributed file system. TiKV was developed by PingCAP and currently has contributors from Samsung, Tencent Cloud, and UCloud.

Runtimes

RKT RKT (Incubator) RKT (read as Rocket) is an application container runtime that was originally developed at CoreOS. Back when Docker was the default runtime for Kubernetes and was baked into kubelet, the Kubernetes and Docker communities had challenges working with each other. Docker Inc., the company behind the development of Docker as an open source software, had its own roadmap and was adding complexity to Docker. For example, they were adding swarm-mode or changing filesystem from AUFS to overlay2 without providing notice. These changes were generally not well coordinated with the Kubernetes community and complicated roadmap planning and release dates. At the end of the day, Kubernetes users need a simple runtime that can start and stop containers and provide functionalities for scaling, upgrading, and uptimes. With RKT, CoreOS intended to create an alternative runtime to Docker that was purposely built to run with Kubernetes. This eventually led to the SIG-Node team of Kubernetes developing a Container Runtime Interface (CRI) for Kubernetes that can connect any type of container and remove Docker code from its core. RKT can consume both OCI Images and Docker format Images. While RKT had a positive impact on the Kubernetes ecosystem, this project was never adopted by end users, specifically by developers who are used to docker cli and don’t want to learn alternatives for packaging applications. Additionally, due to the popularity of Kubernetes, there are a sprawl of container solutions competing for this niche. Projects like gvisor and cri-o (based on OCI) are gaining popularity these days while RKT is losing its position. This makes RKT a potential candidate for removal from the CNCF Incubator.

Containerd (Incubator) Containerd is a container runtime that emphasises simplicity, robustness and portability. In contrast to RKT, Containerd is designed to be embedded into a larger system, rather than being used directly by developers or end-users. Similar to RKT containerd can consume both OCI and Docker Image formats. Containerd was donated to the CNCF by the Docker project. Back in the days, Docker’s platform was a monolithic application. However, with time, it became a complex system due to the addition of features, such as swarm mode. The growing complexity made Docker increasingly hard to manage, and its complex features were redundant if you were using docker with systems like Kubernetes that required simplicity. As a result, Kubernetes started looking for alternative runtimes, such as RKT, to replace docker as the default container runtime. Docker project then decided to break itself up into loosely coupled components and adopt a more modular architecture. This was formerly known as Moby Project, where containerd was used as the core runtime functionality. Since Moby Project, Containerd was later integrated to Kubernetes via a CRI interface known as cri-containerd. However cri-containerd is not required anymore because containerd comes with a built-in CRI plugin that is enabled by default starting from Kubernetes 1.10 and can avoid any extra grpc hop. While containerd has its place in the Kubernetes ecosystem, projects like cri-o (based on OCI) and gvisor are gaining popularity these days and containerd is losing its community interest. However, it is still an integral part of the Docker Platform.

Service Discovery

CoreDNS CoreDNS (Incubator) CoreDNS is a DNS server that provides service discovery in cloud native deployments. CoreDNS is a default Cluster DNS in Kubernetes starting from its version 1.12 release. Prior to that, Kubernetes used SkyDNS, which was itself a fork of Caddy and later KubeDNS. SkyDNS – a dynamic DNS-based service discovery solution – had an inflexible architecture that made it difficult to add new functionalities or extensions. Kubernetes later used KubeDNS, which was running as 3 containers (kube-dns, dnsmasq, sidecar), was prone to dnsmasq vulnerabilities, and had similar issues extending the DNS system with new functionalities. On the other hand, CoreDNS was re-written in Go from scratch and is a flexible plugin-based, extensible DNS solution. It runs inside Kubernetes as one container vs. KubeDNS, which runs with three. It has no issues with vulnerabilities and can update its configuration dynamically using ConfigMaps. Additionally, CoreDNS fixed a lot of KubeDNS issues that it had introduced due to its rigid design (e.g. Verified Pod Records). CoreDNS’ architecture allows you to add or remove functionalities using plugins. Currently, CoreDNS has over thirty plugins and over twenty external plugins. By chaining plugins, you can enable monitoring with Prometheus, tracing with Jaeger, logging with Fluentd, configuration with K8s’ API or etcd, as well as enable advanced dns features and integrations.

Service Meshes

Linkerd Linkerd (Incubator) – Linkerd is an open source network proxy designed to be deployed as a service mesh, which is a dedicated layer for managing, controlling, and monitoring service-to-service communication within an application. Linkerd helps developers run microservices at scale by improving an application’s fault tolerance via the programmable configuration of circuit braking, rate limiting, timeouts and retries without application code change. It also provides visibility into microservices via distributed tracing with Zipkin. Finally, it provides advanced traffic control instrumentation to enable Canaries, Staging, Blue-green deployments. SecOps teams will appreciate the capability of Linkerd to transparently encrypt all cross-node communication in a Kubernetes cluster via TLS. Linkerd is built on top of Twitter’s Finagle project, which has extensive production usage and attracts the interest of many companies exploring Service Meshes. Today Linkerd can be used with Kubernetes, DC/OS and AWS/ECS. The Linkerd service mesh is deployed on Kubernetes as a DaemonSet, meaning it is running one Linkerd pod on each node of the cluster.

Recent changes in the service mesh ecosystem (i.e. the introduction of the Istio project which closely integrates with Kubernetes and uses the lightweight proxy Envoy as a sidecar to run side by side with each microservice) can provide more capabilities than Linkerd and have considerably slowed down its popularity. Some are even questioning the existence of Linkerd. To regain community interest and support a large base of existing customers, Buoyant (the company behind Linkerd) announced project Conduit with the idea of allowing DaemonSetts to use the sidecar approached used by Istio and rewriting dataplane in Rust and Control plane in Go. This enables many possible features that can use the sidecar approach. Not so long ago project Conduit was renamed Linkerd 2.0 and recently announced GA, signaling its readiness for production use. Service Meshes continue to evolve at a fast pace and projects like Istio and Linkerd2 will be at its core.

Service Proxies

Envoy Envoy (Incubator) Envoy is a modern edge and service proxy designed for cloud native applications. It is a vendor agnostic, high performance, lightweight (written in C++) production grade proxy that was developed and battle tested at Lyft. Envoy is now a CNCF incubating project. Envoy provides fault tolerance capabilities for microservices (timeouts, security, retries, circuit breaking) without having to change any lines of existing application code. It provides automatic visibility into what’s happening between microservice via integration with Prometheus, Fluentd, Jaeger and Kiali. Envoy can be also used as an edge proxy (e.g. L7 Ingress Controller for Kubernetes) due to its capabilities performing traffic routing and splitting as well as zone-aware load balancing with failovers.

While the service proxy landscape already has many options, Envoy is a great addition that has sparked a lot of interest and revolutionary ideas around service meshes and modern load-balancing. Heptio announced project Contour, an Ingress controller for Kubernetes that works by deploying the Envoy proxy as a reverse proxy and load balancer. Contour supports dynamic configuration updates and multi-team Kubernetes clusters with the ability to limit the Namespaces that may configure virtual hosts and TLS credentials as well as provide advanced load balancing strategies. Another project that uses Envoy at its core is Datawires Ambassador a powerful Kubernetes-native API Gateway. Since Envoy was written in C++, it is a super lightweight and perfect candidate to run in a sidecar pattern inside Kubernetes and, in combination with its API-driven config update style, has become a perfect candidate for service mesh dataplanes. First, the service mesh Istio announced Envoy to be the default service proxy for its dataplane, where envoy proxies are deployed alongside each instance inside Kubernetes using a sidecar pattern. It creates a transparent service mesh that is controlled and configured by Istio’s Control Plane. This approach compares to the DaemonSet pattern used in Linkerd v1 that provides visibility to each service as well as the ability to create a secure TLS for each service inside Kubernetes or even Hybrid Cloud scenarios. Recently Hashicorp announced that its open source project Consul Connect will use Envoy to establish secure TLS connections between microservices.

Today Envoy has large and active open source community that is not driven by any vendor or commercial project behind it. If you want to start using Envoy, try Istio, Ambassador or Contour or join the Envoy community at Kubecon (Seattle, WA) on December 10th 2018 for the very first EnvoyCon.

Security

Sysdig Falco Falco (Sandbox) Falco is an open source runtime security tool developed by Sysdig. It was designed to detect anomalous activity and intrusions in Kubernetes-orchestrated systems. Falco is more an auditing tool than an enforcement tool (such as SecComp or AppArmor). It is run in user space with the help of a Sysdig kernel module that retrieves system calls.

Falco is run inside Kubernetes as a DaemonSet with a preconfigured set of rules that define the behaviours and events to watch out for. Based on those rules, Falco detects and adds alerts to any behaviour that makes Linux system calls (such as shell runs inside containers or binaries making outbound network connections). These events can be captured at STDERR via Fluentd and then sent to ElasticSearch for filtering or Slack. This can help organizations quickly respond to security incidents, such as container exploits and breaches and minimize the financial penalties posed by such incidents.

With the addition of Falco to the CNCF sandbox, we hope that there will be closer integrations with other CNCF projects in the future. To start using Falco, find an official Helm Chart.

Spiffe Spiffe (Sandbox) Spiffe provides a secure production identity framework. It enables communication between workloads by verifying identities. It’s policy-driven, API-driven, and can be entirely automated. It’s a cloud native solution to the complex problem of establishing trust between workloads, which becomes difficult and even dangerous as workloads scale elastically and become dynamically scheduled. Spiffe is a relatively new project, but it was designed to integrate closely with Spire.

Spire Spire (Sandbox) Spire is Spiffe’s runtime environment. It’s a set of software components that can be integrated into cloud providers and middleware layers. Spire has a modular architecture that supports a wide variety of platforms. In particular, the communities around Spiffe and Spire are growing very quickly. HashiCorp just announced support for Spiffe IDs in Vault, so it can be used for key material and rotation. Spiffe and Spire are both currently in the sandbox.

Tuf Tuf (Incubator) Tuf is short for ‘The Update Framework’. It is a framework that is used for trusted content distribution. Tuf helps solve content trust problems, which can be a major security problem. It helps validate the provenance of software and verify that it only the latest version is being used. TUF project play many very important roles within the Notary project that is described below. It is also used in production by many companies that include Docker, DigitalOcean, Flynn, Cloudflare, and VMware to build their internal tooling and products.

Notary Notary (Incubator) Notary is a secure software distribution implementation. In essence, Notary is based on TUF and ensures that all pulled docker images are signed, correct and untampered version of an image at any stage of you CI/CD workflow, which is one of the major security concerns for Docker-based deployments in Kubernetes systems. Notary publishes and manages trusted collections of content. It allows DevOps engineers to approve trusted data that has been published and create signed collections. This is similar to the software repository management tools present in modern Linux systems, but for Docker images. Some of Notary’s goals include guaranteeing image freshness (always having up-to-date content so vulnerabilities are avoided), trust delegation between users or trusted distribution over untrusted mirrors or transport channels. While Tuf and Notary are generally not used by end users, their solutions integrate into various commercial products or open source projects for content signing or image signing of trusted distributions, such as Harbor, Docker Enterprise Registry, Quay Enterprise, Aqua. Another interesting open-source project in this space Grafeas is an open source API for metadata, which can be used to store “attestations” or image signatures, which can then be checked as part of admission control and used in products such as Container Analysis API and binary authorization at GCP, as well products of JFrog and AquaSec.

Open Policy Agent (Sandbox) By enforcing policies to be specified declaratively, Open Policy Agent (OPA) allows different kinds of policies to be distributed across a technology stack and have updates enforced automatically without being recompiled or redeployed. Living at the application and platform layers, OPA runs by sending queries from services to inform policy decisions. It integrates well with Docker, Kubernetes, Istio, and many more.

Streaming and Messaging

NATS NATS (Incubator) NATS is a messaging service that focuses on middleware, allowing infrastructures to send and receive messages between distributed systems. Its clustering and auto-healing technologies are HA, and its log-based streaming has guaranteed delivery for replaying historical data and receiving all messages. NATS has a relatively straightforward API and supports a diversity of technical use cases, including messaging in the cloud (general messaging, microservices transport, control planes, and service discovery), and IoT messaging. Unlike the solutions for logging, monitoring, and tracing listed above, NATS works at the application layer.

gRPC gRPC (Incubator) A high-performance RPC framework, gRPC allows communication between libraries, clients and servers in multiple platforms. It can run in any environment and provide support for proxies, such as Envoy and Nginx. gRPC efficiently connects services with pluggable support for load balancing, tracing, health checking, and authentication. Connecting devices, applications, and browsers with back-end services, gRPC is an application level tool that facilitates messaging.

CloudEvents (Sandbox) CloudEvents provides developers with a common way to describe events that happen across multi-cloud environments. By providing a specification for describing event data, CloudEvents simplifies event declaration and delivery across services and platforms. Still in Sandbox phase, CloudEvents should greatly increases the portability and productivity of an application.

What’s Next?

The cloud native ecosystem is continuing to grow at a fast pace. More projects will be adopted into the Sandbox in the close future, giving them chances of gaining community interest and awareness. That said, we hope that infrastructure-related projects like Vitess, NATs, and Rook will continuously get attention and support from CNCF as they will be important enablers of Cloud Native deployments on Prem. Another area that we hope the CNCF will continue to place focus on is Cloud Native Continuous Delivery where there is currently a gap in the ecosystem.

While the CNCF accepts and graduates new projects, it is also important to have a working mechanism of removal of projects that have lost community interest because they cease to have value or are replaced other, more relevant projects. While project submission process is open to anybody, I hope that the TOC committee will continue to only sponsor the best candidates, making the CNCF a diverse ecosystem of projects that work well with each other.

As a CNCF ambassador, I hope to teach people how to use these technologies. At CloudOps I lead workshops on Docker and Kubernetes that provide an introduction to cloud native technologies and help DevOps teams operate their applications. I also organize Kubernetes and Cloud Native meetups that bring in speakers from around the world and represent a variety of projects. They are run quarterly in Montreal, Ottawa, Toronto, Kitchener-Waterloo, and Quebec City.I would also encourage people to join the Ambassador team atCloudNativeCon North America 2018 on December 10th. Reach out to me @archyufa or email CloudOps to learn more about becoming cloud native.

Ayrat Khayretdinov

Ayrat Khayretdinov is a Solutions Architect at CloudOps and a Kubernetes evangelist dedicated to driving community growth. He is both a CNCF ambassador and a member of CNCF Technical Oversight Committee (TOC). Ayrat is passionate about promoting community efforts for the cloud native ecosystem.

Photo by Maximilian Weisbecker

The Beginner’s Guide to the CNCF Landscape

The History of the CNCF

The CNCF Mission

CNCF Processes