
Ceph RBD Bare-Metal Install for Kubernetes (on Packet)

In this guide, we’ll create a bare-metal Ceph RBD cluster which may provide persistent volume support for a Kubernetes environment.

Our Ceph RBD cluster will be composed of a single Ceph monitor (MON) and two Ceph Object Storage Daemon (OSD) nodes. In Ceph, MON nodes track the state of the cluster, and OSD nodes hold the data to be persisted.

Kubernetes and the Need for Ceph RBD

Creating a Kubernetes cluster environment equivalent to hosted solutions (GKE) and turn-key solutions (Kubernetes on AWS or GCE) requires 3 things:

  1. The Kubernetes System Itself: Masters and Nodes (Setup Guide)
  2. Persistent Storage (This Guide)
  3. External Load-Balancers (Future Guide)

Use this guide to implement persistent storage and make a Kubernetes environment suitable for hosting stateful applications. When the Pod of a stateful application such as a database fails, Kubernetes reschedules the Pod from one Node to another. The data is often left behind and lost… unless it was stored on a Kubernetes Volume backed by a network block device. In that case, Kubernetes automatically migrates the Volume mount along with the Pod to the new destination Node.

Ceph RBD may be used to create a redundant, highly available storage cluster that provides network-mountable block storage devices, similar to Rackspace Cloud Block Storage and Amazon's EBS (minus the API). Ceph itself is a large project which also provides network-mountable POSIX filesystems (CephFS) and network object storage like S3 or Swift. However, for Kubernetes we only need to deploy a subset of Ceph's functionality… Ceph RBD.

This guide should work on any machines, whether bare-metal, virtual, or cloud, so long as the following criteria are met:

  • All Ceph cluster machines are running Ubuntu 14.04 LTS
    • CentOS 7 may work with yum-plugin-priorities disabled.
  • All machines are network reachable, without restriction
    • i.e. open iptables, open security groups
  • Root access through password-less SSH is enabled
    • i.e. configured /root/.ssh/authorized_keys

Because not many of us have multiple bare-metal machines lying around, we'll rent them from Packet. If you already have machines provisioned, you may skip the provisioning section below.

Step Summary

  • Setup a Development Environment
  • Provision Bare-Metal Machines (on Packet)
  • Configure and Run the ceph/ceph-ansible scripts
  • Test the Cluster by Creating Storage Volumes
  • Configure Remote Kubernetes Nodes to Use Ceph RBD Volumes

Setup a Development Environment

The easiest method is to instantiate a vagrant machine with all of the necessary tools. Feel free to do this manually. The vagrant environment includes the following:

  • Docker
  • Kubectl
  • Ansible
  • Hashicorp Terraform

Fetch the vagrant machine

If you are on OS X, first install the host dependencies.

Create and connect to the vagrant environment

* From now on, do everything from within this vagrant machine.
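The fetch/connect steps above can be sketched as follows. The repository URL is a placeholder for the Vagrant environment this guide links to; substitute the real one.

```shell
# Placeholder repo: substitute the Vagrantfile repository referenced above.
git clone https://github.com/example/ceph-k8s-devenv.git
cd ceph-k8s-devenv

vagrant up    # provisions Docker, kubectl, Ansible, and Terraform
vagrant ssh   # connect; run everything that follows from inside this shell
```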

Provision Bare-Metal Machines

The ceph/ceph-ansible scripts
work on Ubuntu 14.04, but fail on Centos 7 unless yum-plugin-priorities are disabled.

If you enjoy clicking around web UIs, follow the manual instructions below. Otherwise, the only automated provisioning method Packet supports is Hashicorp Terraform; a CLI client does not yet exist.

Manual WebUI Instructions

  • Log in to the Packet web UI
  • Create a project
  • Set an SSH key (which will be provisioned on new servers)
  • Create 3 servers with Ubuntu 14.04
    • ceph-mon-0001
    • ceph-osd-0001
    • ceph-osd-0002
  • Follow the Create and Attach Disks section below
  • Note the names and IPs of the newly created machines

Semi-Automatic Instructions (Using Terraform)

Instantiating servers via Hashicorp Terraform must happen in two steps if a “packet_project” has not yet been created.

This is because the “packet_device” (aka. bare metal machine) definitions require a “project_id” upon execution. However, one does not know the “project_id” until after the “packet_project” has been created. One might resolve this issue in the future by having the packet_device require a project_name instead of a project_id. Then we could “terraform apply” this file in one go.

See the Appendix for curl commands that will help you discover project_ids from the API.

Step 1: Create the Project

First, create your auth token via the UI and note it down.

Create a Terraform file, and tweak the {variables} according to your needs.

Run Terraform to create the project.
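A minimal sketch of the project definition and apply step. The file name project.tf and the project name are assumptions; the "packet_project" resource comes from the Packet Terraform provider, and {auth_token} is a placeholder for the token created above.

```shell
# project.tf is an assumed file name; {auth_token} is your Packet API token.
cat > project.tf <<'EOF'
provider "packet" {
  auth_token = "{auth_token}"
}

resource "packet_project" "ceph" {
  name = "ceph-rbd"
}
EOF

terraform apply
```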

Step 2: Create the Machines

Locate your PACKET_PROJECT_ID, so that you may embed it in the Terraform file.
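One way to find the project ID is to query the Packet API directly. The endpoint and X-Auth-Token header below follow the public Packet API; the token variable is a placeholder.

```shell
# List projects (including their IDs); $PACKET_AUTH_TOKEN is your API token.
curl -s -H "X-Auth-Token: $PACKET_AUTH_TOKEN" \
  https://api.packet.net/projects | python -m json.tool
```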

Append the machine definitions to the Terraform file, and tweak the {variables} according to your needs. As of May 2016, only the ewr1 facility supports disk creation. Because of this, ensure all machines and disks are instantiated in facility ewr1.

Run Terraform
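A hedged sketch of the device definitions and apply step. The plan and OS slugs ("baremetal_0", "ubuntu_14_04") are assumptions drawn from Packet's catalog at the time; {PACKET_PROJECT_ID} is the ID located above, and project.tf is the assumed file name.

```shell
cat >> project.tf <<'EOF'
resource "packet_device" "ceph" {
  count            = 3
  hostname         = "${element(split(",", "ceph-mon-0001,ceph-osd-0001,ceph-osd-0002"), count.index)}"
  plan             = "baremetal_0"
  facility         = "ewr1"
  operating_system = "ubuntu_14_04"
  billing_cycle    = "hourly"
  project_id       = "{PACKET_PROJECT_ID}"
}
EOF

terraform apply
```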

Note the names and IPs of the newly created machines

Create an ~/.ssh/config file with the hostname to IP mappings. This allows you to refer to the machines via hostnames from your Dev box, as well as Ansible inventory definitions.
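For example (the IPs below are placeholders; use the ones noted from the Terraform output):

```shell
mkdir -p ~/.ssh
cat >> ~/.ssh/config <<'EOF'
Host ceph-mon-0001
    HostName 147.75.0.11
    User root
Host ceph-osd-0001
    HostName 147.75.0.12
    User root
Host ceph-osd-0002
    HostName 147.75.0.13
    User root
EOF
```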

Step 3: Create and Attach Disks

This part is manual for now, using the WebUI.

For each OSD node

  • Using the WebUI, create a 100GB disk and name it the same as the respective OSD node (e.g. ceph-osd-0001)
  • Using the WebUI, attach the disk to the respective OSD node
  • Using SSH, connect to the OSD node and run the on-machine attach command

  • Using SSH, connect to the OSD node and rename the attached volume to be a consistent name. The Ansible scripts we run later will expect the same volume names (e.g. /dev/mapper/data01) and the same number of volumes across all of the OSD nodes.

You may add additional disks to each OSD machine, so long as the number and naming of disks are consistent across all of the OSD machines.
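The per-node steps above might look like the following. The attach helper name is an assumption (Packet historically shipped a packet-block-storage-attach script; use whatever on-machine command the portal shows you), and volume-abc123 is a placeholder for the multipath device name the attach step creates.

```shell
# Attach the volume on the OSD node (command shown in the Packet portal):
ssh ceph-osd-0001 packet-block-storage-attach

# Rename the multipath device to the consistent name the Ansible scripts
# expect, e.g. /dev/mapper/data01 (dmsetup rename <old> <new>):
ssh ceph-osd-0001 'dmsetup rename volume-abc123 data01'
```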

Configure and Run the ceph/ceph-ansible scripts

Fetch the ceph/ceph-ansible code
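```shell
git clone https://github.com/ceph/ceph-ansible.git
cd ceph-ansible
```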

Create an Ansible inventory file name “inventory”, and use the hosts and IPs recorded prior.
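For example (the [mons] and [osds] group names are the ones ceph-ansible's playbooks target; the IPs are placeholders for the machines created earlier):

```shell
cat > inventory <<'EOF'
[mons]
ceph-mon-0001 ansible_ssh_host=147.75.0.11

[osds]
ceph-osd-0001 ansible_ssh_host=147.75.0.12
ceph-osd-0002 ansible_ssh_host=147.75.0.13
EOF
```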

Test the Ansible connection, and accept host keys into your ~/.ssh/known_hosts
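```shell
ansible all -i inventory -m ping -u root
```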

Configure and edit group_vars/all. You may add the following
settings, or replace/uncomment existing lines:
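A hedged example of the group_vars/all settings. The variable names (ceph_stable, monitor_interface, public_network, cluster_network, journal_size) existed in ceph-ansible at the time, but verify them against your checkout; the interface name and 10.80.0.0/16 network are placeholders for Packet's private network.

```shell
cat >> group_vars/all <<'EOF'
ceph_stable: true
monitor_interface: bond0
public_network: 10.80.0.0/16
cluster_network: 10.80.0.0/16
journal_size: 5120
EOF
```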

Configure and edit group_vars/osds. You may add the following
settings, or replace/uncomment existing lines:
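A hedged example for group_vars/osds, listing the consistently named disks created earlier. The "devices" and "journal_collocation" variable names are drawn from ceph-ansible of that era; verify against your checkout.

```shell
cat >> group_vars/osds <<'EOF'
devices:
  - /dev/mapper/data01
journal_collocation: true
EOF
```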

Check if any “inventory” machine targets are running Centos 7. Disable yum-plugin-priorities, or else the ceph-ansible scripts will fail. If enabled, the scripts will fail to locate packages and dependencies because other repositories with the older versions of packages will take precedence over the ceph repository.
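On each CentOS 7 target (skip this for Ubuntu), the plugin can be disabled like so. The config path is the standard one; adjust the sed pattern if your file uses different spacing.

```shell
ssh ceph-osd-0001 \
  "sed -i 's/enabled *= *1/enabled = 0/' /etc/yum/pluginconf.d/priorities.conf"
```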

Run the ansible setup script to install Ceph on all machines
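For example (site.yml.sample is the name ceph-ansible ships its top-level playbook under; copy it if your checkout has not already):

```shell
cp site.yml.sample site.yml   # if not already present
ansible-playbook -i inventory site.yml
```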

* If it fails partway, run it again until it completes with zero errors. It's idempotent!

Test the Cluster by Creating Storage Volumes

Ensure a Healthy Ceph RBD Cluster

Your local vagrant machine should now have the Ceph CLI tools installed and configured. If for some reason the tools are not working, try the following from the Ceph MON machine.

Query Ceph cluster status
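```shell
ceph status    # or: ceph -s
# At this point, expect the cluster to report HEALTH_WARN (explained below).
```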

You can see that the cluster is working, but in a WARN state. Let us debug this.

List all osd pools, and find only the default “rbd” pool.
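```shell
ceph osd lspools
# On a fresh cluster this typically shows only the default pool, e.g.: 0 rbd,
```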

List the parameters of the default “rbd” osd pool.
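```shell
ceph osd dump | grep 'pool'
# Or query a single parameter:
ceph osd pool get rbd size
```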

Notice that size == 3 which means that a healthy state requires 3 replicas for all placement groups within the “rbd” pool. However, we only have 2 OSDs, and thus are running in a degraded HEALTH_WARN state. If we decrease the number of required replicas, the cluster will return to a healthy state.
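Reducing the replica count to match our two OSDs (the min_size setting is an assumption: it allows I/O to continue with a single surviving replica):

```shell
ceph osd pool set rbd size 2
ceph osd pool set rbd min_size 1
```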

Check that Ceph cluster status has returned to HEALTH_OK. This may take a few seconds.

Create a Volume for Testing

We’ll test the Ceph cluster from the OSD nodes as Ceph clients. According to the earlier configuration:

public_network == cluster_network ==

Ceph will only listen for requests on this network, a private network shared by machines within the same project. (Note: this may not work as expected. It seems the ceph-mon process binds to the IP on the public network, which must then be used in the Kubernetes Volume definitions.)

Copy the credentials from the Ceph MON to the Ceph OSDs, so that they will be able to access the Ceph cluster as clients
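A sketch of the copy, run from the dev box (the default /etc/ceph paths are assumed):

```shell
for host in ceph-osd-0001 ceph-osd-0002; do
  scp ceph-mon-0001:/etc/ceph/ceph.conf "$host":/etc/ceph/
  scp ceph-mon-0001:/etc/ceph/ceph.client.admin.keyring "$host":/etc/ceph/
done
```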

From the first OSD node, create a volume, mount it, add data, and unmount the volume.
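For example, on ceph-osd-0001 (the /dev/rbd0 device name is what rbd map typically prints for the first mapped image; use whatever it prints on your machine):

```shell
rbd create vol01 --size 10240      # 10GB image in the default "rbd" pool
rbd map vol01                      # prints the mapped device, e.g. /dev/rbd0
mkfs.ext4 /dev/rbd0
mkdir -p /mnt/vol01
mount /dev/rbd0 /mnt/vol01
echo "hello from ceph-osd-0001" > /mnt/vol01/testfile
umount /mnt/vol01
rbd unmap /dev/rbd0
```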

From the second OSD node, mount the volume that the first node created, read data, and unmount the volume.
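And on ceph-osd-0002 (again, confirm the device name rbd map prints):

```shell
rbd map vol01
mkdir -p /mnt/vol01
mount /dev/rbd0 /mnt/vol01
cat /mnt/vol01/testfile            # read back whatever the first node wrote
umount /mnt/vol01
rbd unmap /dev/rbd0
```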

Configure a Kubernetes Container to Mount a Ceph RBD Volume

The rest of this guide assumes that you have a working Kubernetes environment created by following this (Setup Guide).

Ensure that your Kubernetes environment is working.
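```shell
kubectl cluster-info
kubectl get nodes
```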

First, base64-encode the ceph client key.
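For example, on the MON (the default admin keyring path is assumed; in the keyring's "key = AQ..." line, the third field is the key itself):

```shell
grep 'key = ' /etc/ceph/ceph.client.admin.keyring | awk '{print $3}' | base64
```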

Create a file called “ceph-test.yaml”, which will contain definitions of the Secret, Volume, VolumeClaim, and ReplicationController. We will mount the “rbd/vol01” test volume created in the prior step into any test Container (in our case, nginx).
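A hedged sketch of ceph-test.yaml. BASE64_ENCODED_CEPH_KEY and MON_IP are placeholders (use the public MON IP, per the note above), and the sizes and mount path are assumptions; the rbd volume fields (monitors, pool, image, user, secretRef, fsType) follow the Kubernetes v1 API.

```shell
cat > ceph-test.yaml <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret
data:
  key: BASE64_ENCODED_CEPH_KEY
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: vol01
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  rbd:
    monitors:
      - MON_IP:6789
    pool: rbd
    image: vol01
    user: admin
    secretRef:
      name: ceph-secret
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vol01-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          volumeMounts:
            - name: vol01
              mountPath: /mnt/vol01
      volumes:
        - name: vol01
          persistentVolumeClaim:
            claimName: vol01-claim
EOF
```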

Create the resources defined within “ceph-test.yaml”

Check that the nginx Pod/Container is running, and locate its name to use in the next command

Verify that the earlier-created files are readable within the "rbd/vol01" mount inside of the Pod/Container
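The create/check/verify steps above might look like this (nginx-xxxxx is a placeholder for the Pod name kubectl reports, and /mnt/vol01 is the assumed mount path from the manifest):

```shell
kubectl create -f ceph-test.yaml
kubectl get pods                            # note the nginx Pod name
kubectl exec nginx-xxxxx -- ls /mnt/vol01   # substitute the real Pod name
```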

We now have a functioning Ceph cluster that is mountable by a Kubernetes environment.

If the Pod is destroyed and migrated to a different Kubernetes node, the volume mount will also follow.