Running an OpenStack platform is a never-ending story of debugging and searching for the best possible components, including the storage subsystem. It is naive to think that OpenStack is just a one-time installation followed by running GUI or API interfaces.
One of the challenges we faced was replicating or triplicating storage across multiple datacenters with the lowest possible latency and the highest throughput. In the beginning we, like most other providers, ran Ceph. Unfortunately, that turned out to be a huge mistake which, at the end of the day, cost us a lot of wasted effort. However, this article is not about Ceph storage…
In 2016 we entered into close cooperation with a company called Linbit, the people who invented the well-known storage technology DRBD. DRBD has been an “old school” part of the Linux world for many years. What it does is essentially replicate block devices over the network; version 9 supports up to 32 replicated copies of a single resource. But if you want to use it in a cloud environment, much more is necessary – orchestration on multiple levels.
So what is BlackStor?
Simply put, BlackStor is an orchestrator with added value which utilizes multiple technologies, including DRBD. It is a simple interface to a complex problem: managing virtual drives, with or without replication, and providing them to the service’s consumers (virtual servers, bare-metal servers…).
Another key technology used by BlackStor is ZFS. We decided to go with ZFS because of its critical features and proven stability.
ZFS offers compression, deduplication, quotas, QoS and high performance (with the right settings). Thanks to compression and deduplication we are able to make the most of the disk space, and thanks to quotas we can prevent unforeseen exhaustion of that space. QoS lets us guarantee good performance for individual virtual disks. In combination with a correctly configured ZFS Intent Log (ZIL), ARC and L2ARC, we can get truly incredible performance out of ZFS, whether measured in I/O operations per second or in throughput.
BlackStor includes multiple components:
- API server (for communicating with clients)
- Worker(s) – providing the orchestration itself
- GUI – web-based interface to the API
- CLI – command line interface
In order to make BlackStor universally usable, we decided to go forward with a JSON-based API. All communication between the client and BlackStor takes place via HTTP requests to the API server.
The API server then creates and assigns tasks to Workers, manages queues, etc.
The API can run in fully synchronous or asynchronous mode. (We recommend a synchronous setup; however, some applications cannot wait for API responses, and for those asynchronous mode runs everything in the background.)
Example of the CLI interface
When we would like to create a volume in BlackStor, we simply call POST /v1/volumes and include information about the volume; the rest of the work is up to BlackStor.
Each volume has a size and a name, and we must specify the folder it belongs to and the policy with which to create it.
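The volume-creation call described above can be sketched as follows. The field names in the request body (`name`, `size_gb`, `folder`, `policy`) are illustrative assumptions, not the documented BlackStor schema:

```python
import json

def build_volume_request(name, size_gb, folder, policy):
    """Build the JSON body for a hypothetical POST /v1/volumes call.

    Field names are assumptions for illustration only.
    """
    if size_gb <= 0:
        raise ValueError("volume size must be positive")
    return {
        "name": name,
        "size_gb": size_gb,
        "folder": folder,   # virtual folder the volume lives in
        "policy": policy,   # must be one of the folder's allowed policies
    }

# The resulting dict would be serialized and sent to the API server.
body = build_volume_request("db-data", 100, "/tenants/acme", "ssd-3-replicas")
print(json.dumps(body))
```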
A volume is made up of resources existing in pools. A pool is a ZFS pool on any node attached to the system; a node can have zero, one or multiple pools, and nodes without any pools are called diskless clients.
Resources are ZFS volumes with DRBD running on top. They are connected to the desired destination as a block device / volume.
All volumes are saved into folders. Folders are virtual, similar to the folders you know from standard filesystems. They are used for better categorisation and management (e.g. if you wish to create a multi-tenant environment).
A Policy is the definition of properties for each volume. For example:
- Number of replicas
- Block size
- Placement policy
- QoS parameters
- Backup policy
Policies are created by operators and are always connected with a volume. A policy must also be one of the “allowed policies” configured for the folder in which you would like to create the volume.
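Put together, a policy covering the properties listed above might look like this. The keys and values are assumptions for illustration, not the exact BlackStor schema:

```python
# An illustrative policy: three SSD replicas, each in a different
# datacenter, with QoS and backup settings. All keys are assumed names.
policy = {
    "name": "ssd-3-replicas",
    "replicas": 3,
    "block_size": "8K",
    "placement": {
        "DISK_TYPE": {"constraint": "required", "value": "SSD"},
        "DC": {"constraint": "uniq"},
    },
    "qos": {"iops_limit": 5000},
    "backup": {"schedule": "daily", "retain": 7},
}
print(policy["name"])
```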
Placement policy / Constraints
Each pool can have assigned labels. Based on these labels, BlackStor decides where to create resources.
Let’s imagine that we have 3 datacenters, each equipped with two storage servers. One server has SSD drives and the other standard spindle drives. On each server we create a pool, giving us 6 pools in total.
We assign two labels to each pool – one describing its physical placement (DC=DC1, DC=DC2, etc.) and one describing its drive type (DISK_TYPE=SSD, DISK_TYPE=SPINDLE).
Now we would like to create a volume which has 3 replicas and runs on SSD drives, with each replica in a different datacenter. To do that, we define the “required” policy for the “DISK_TYPE” label (DISK_TYPE=SSD) and the “uniq” (unique) policy for the “DC” label. Based on these constraints, BlackStor creates the resources for such a volume in 3 different datacenters, on pools with SSD drives.
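A minimal sketch of how such “required” and “uniq” constraints could be evaluated against labelled pools – an illustrative re-implementation of the idea, not BlackStor’s actual scheduler:

```python
def place_replicas(pools, required, uniq_labels, replicas):
    """Pick `replicas` pools matching all `required` labels, with a
    distinct value for every label listed in `uniq_labels`."""
    chosen = []
    seen = {label: set() for label in uniq_labels}
    for name, labels in pools.items():
        if any(labels.get(k) != v for k, v in required.items()):
            continue  # fails a "required" constraint
        if any(labels.get(l) in seen[l] for l in uniq_labels):
            continue  # would repeat a value that must stay unique
        chosen.append(name)
        for l in uniq_labels:
            seen[l].add(labels.get(l))
        if len(chosen) == replicas:
            return chosen
    raise RuntimeError("not enough pools satisfy the placement policy")

# The six pools from the example: two per datacenter, SSD and spindle.
pools = {
    f"{dc}-{dt}": {"DC": dc, "DISK_TYPE": dt}
    for dc in ("DC1", "DC2", "DC3")
    for dt in ("SSD", "SPINDLE")
}
print(place_replicas(pools, {"DISK_TYPE": "SSD"}, ["DC"], 3))
# → ['DC1-SSD', 'DC2-SSD', 'DC3-SSD']
```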
BlackStor supports the well-known operations on top of volumes, such as snapshots and clones. During snapshotting and cloning, consistency is guaranteed across all replicas, even while you are writing intensively to the original volume.
BlackStor offers an effective new, cloud-aware way of orchestrating the best tried-and-true storage approaches of the Linux world.
Follow us to learn more about BlackStor (and the other cloud, network, media and IoT technologies we’ve been working on), or contact us for an easy online demo.