Object Storage Benchmarking

When shifting from EFK to LokiStack in OpenShift, the architecture takes a fundamental turn as logs are no longer stored using block storage in Elasticsearch, but instead are written to object storage.

However, not everyone (typically those on premises) has object storage readily available. For those without it, the guidance has been to use ODF's MCG (NooBaa) with a pv-pool backing store. In essence, this means deploying NooBaa on top of existing storage to provide an object storage API for Loki to consume. But how well does that actually work in practice, especially at scale and when compared to other object storage solutions?
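
For context, a pv-pool backing store is a NooBaa BackingStore resource that gets filled with PVCs provisioned from an existing StorageClass. Below is a minimal sketch of what that looks like; the name, volume count, size, and StorageClass are placeholders for illustration, not the configuration used in these tests.

$ cat <<EOF | oc apply -f -
apiVersion: noobaa.io/v1alpha1
kind: BackingStore
metadata:
  name: loki-pv-pool            # placeholder name
  namespace: openshift-storage
spec:
  type: pv-pool
  pvPool:
    numVolumes: 3               # example values only
    storageClass: my-block-sc   # whatever StorageClass you already have
    resources:
      requests:
        storage: 500Gi
EOF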

Now, it is expected that querying logs will be slower in LokiStack than in EFK; that comes with the change in architecture, since log chunks may have to be pulled down from object storage before they can be read and presented to the user, and this is especially noticeable with large amounts of data. So, to a degree, longer response times for historic and more complex queries have to be accepted.

So, Why Benchmark?

Instead of relying solely on feel, I wanted to understand how various S3 backends perform when used with LokiStack, particularly under read-heavy operations, which is essentially what log queries are. Quite simply, slow reads can severely impact the user experience when troubleshooting or searching logs, which is something you obviously want to avoid.

To find the best option available in the environment I was working in, I decided to benchmark a few setups using MinIO’s Warp tool. The idea wasn’t to compare vendors per se, but rather to explore the relative differences between the options I had available, and in particular to compare those options against MCG (which is where I was seeing slow responsiveness in large clusters).

Setup & Scenarios

I ran benchmarks against the following scenarios:

  • MCG out of the box using a pv-pool backing store
  • Dell ECS S3
  • MCG using the same Dell ECS S3 as a backing store

The focus of my investigation was on read-heavy operations (GET, LIST, STAT), since those are likely to be dominant for Loki when querying logs.

Warp is very easy to use; at its simplest, you can just run something like the following:

$ ./warp mixed --host=xxxx --access-key=xxxx --secret-key=xxxx --bucket=xxxx --tls
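
Since my focus was on read-heavy behaviour, it is also worth noting that Warp can benchmark individual operation types. The commands below are illustrative only; the duration, object count/size, and concurrency values are example flags, not the settings behind the results in this post.

$ # example flag values only - tune for your own environment
$ ./warp get --host=xxxx --access-key=xxxx --secret-key=xxxx --bucket=xxxx --tls \
    --duration=5m --objects=2500 --obj.size=10MiB --concurrent=20
$ ./warp stat --host=xxxx --access-key=xxxx --secret-key=xxxx --bucket=xxxx --tls
$ ./warp list --host=xxxx --access-key=xxxx --secret-key=xxxx --bucket=xxxx --tls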

Please note that the below are just example results; as explained later in the article, there are many factors to consider here, so there is no one-size-fits-all answer.

Key Results (Mixed Operation Test)

| Metric | MCG (pv-pool) | MCG (Dell backed) | Dell ECS |
| --- | --- | --- | --- |
| Total Throughput | 121.85 MiB/s, 20.33 obj/s | 50.58 MiB/s, 8.39 obj/s | 127.21 MiB/s, 21.16 obj/s |
| GET Avg Throughput | 91.33 MiB/s, 9.13 obj/s | 37.8 MiB/s, 3.78 obj/s | 95.71 MiB/s, 9.57 obj/s |
| GET Avg Latency | 1373.7 ms | 3369.2 ms | 1594.6 ms |
| GET TTFB (Avg) | 415 ms | 2.69 s | 96 ms |
| PUT Avg Throughput | 30.52 MiB/s, 3.05 obj/s | 12.79 MiB/s, 1.28 obj/s | 31.50 MiB/s, 3.15 obj/s |
| PUT Avg Latency | 2126.4 ms | 4711.0 ms | 1415.1 ms |
| DELETE Latency | 188.4 ms | 477.5 ms | 54.3 ms |
| STAT Latency | 130.2 ms | 365.6 ms | 30.5 ms |

Warp is a useful utility that also allows multiple benchmark runs to be saved, merged, and compared against each other. This was a really handy feature that let us compare the performance of the different storage systems quickly and easily.
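
Each run writes its raw measurements to a compressed data file that can be re-analyzed or merged later; the filenames below are placeholders, so use whatever your own runs produced.

$ warp analyze warp_results_mixed.zst
$ warp merge warp_results_run1.zst warp_results_run2.zst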

For example, to compare MCG with a pv-pool backing store against the Dell ECS array:

$ warp cmp dell/warp_results_dell_mixed.zst mcg_pvpool_backed/warp_results_mcg_mixed.zst
-------------------
Operation: DELETE
Operations: 628 -> 610
* Average: -2.83% (-0.1) obj/s
* Requests: Avg: +149.1ms (+247%), P50: +103.4ms (+263%), P99: +397.2ms (+184%), Best: +5.3ms (+65%), Worst: +989.2ms (+166%) StdDev: +109.9ms (+209%)
* Fastest: +10.73% (+0.8) obj/s
* 50% Median: -5.50% (-0.1) obj/s
* Slowest: NaN% (0.0) obj/s
-------------------
Operation: GET
Operations: 2877 -> 2742
* Average: -4.67% (-4.5 MiB/s) throughput, -4.67% (-0.4) obj/s
* Requests: Avg: -173.1ms (-14%), P50: -306.5ms (-20%), P99: +1.1293s (+39%), Best: -163.7ms (-43%), Worst: +5.1521s (+137%) StdDev: +259.6ms (+57%)
* TTFB: Avg: +319ms (+331%), P50: +226.372968ms (+284%), P99: +1.320459019s (+377%), Best: -16.350656ms (-40%), Worst: +2.78712417s (+376%) StdDev: +298.862985ms (+532%)
* Fastest: +24.63% (+27.6 MiB/s) throughput, +24.66% (+2.8) obj/s
* 50% Median: -2.85% (-2.8 MiB/s) throughput, -2.79% (-0.3) obj/s
* Slowest: -54.60% (-37.7 MiB/s) throughput, -54.64% (-3.8) obj/s
-------------------
Operation: PUT
Operations: 948 -> 919
Duration: 5m1s -> 5m2s
* Average: -3.35% (-1.1 MiB/s) throughput, -3.35% (-0.1) obj/s
* Requests: Avg: +751.4ms (+50%), P50: +726ms (+53%), P99: +2.256s (+97%), Best: -18.9ms (-3%), Worst: +6.441s (+163%) StdDev: +508.5ms (+146%)
-------------------
Operation: STAT
Operations: 1912 -> 1835
* Average: -3.97% (-0.3) obj/s
* Requests: Avg: +112.9ms (+327%), P50: +74.5ms (+433%), P99: +394.2ms (+207%), Best: +2ms (+31%), Worst: +904.7ms (+183%) StdDev: +88.1ms (+262%)

List Operation Test

| Metric | MCG (pv-pool) | MCG (Dell backed) | Dell ECS |
| --- | --- | --- | --- |
| Obj/s (Avg) | 16,650 | 16,007 | 40,649 |
| Req Latency (Avg) | 628.8 ms | 643.4 ms | 247.4 ms |
| Req Latency (P99) | 1571.0 ms | 1624.9 ms | 422.3 ms |
| TTFB (Avg) | 125 ms | 130 ms | 49 ms |
| TTFB (P99) | 469 ms | 676 ms | 143 ms |
| Slowest Op | 7036.9 ms | 12549.8 ms | 600.6 ms |
| Throughput (Median) | 17,892 obj/s | 17,066 obj/s | 41,594 obj/s |
| Throughput (Lowest) | 5,645 obj/s | 6,138 obj/s | 23,229 obj/s |

Again, to compare MCG with a pv-pool backing store against the Dell ECS array, this time for the LIST test:

$ warp cmp dell/warp_results_dell_list.zst mcg_pvpool_backed/warp_results_mcg_list.zst 
-------------------
Operation: LIST
Operations: 24377 -> 9986
* Average: -59.06% (-23984.4) obj/s
* Requests: Avg: +389.2ms (+154%), P50: +361.6ms (+151%), P99: +1.1487s (+272%), Best: -28.7ms (-24%), Worst: +6.4363s (+1072%) StdDev: +149.9ms (+242%)
* TTFB: Avg: +75.4ms (+153%), P50: +64.288416ms (+161%), P99: +325.356144ms (+227%), Best: -1.404854ms (-9%), Worst: +1.62851957s (+634%) StdDev: +56.835209ms (+204%)

Findings & Observations

In my tests, some elements of the results were fairly comparable; however, for our workload and use case we know which operation types matter most to us. The warp tool really helps here when doing comparative analysis, letting you put together the numbers you need to make an informed decision.

GET Operations

  • MCG had 4.67% lower throughput
  • TTFB (Time to First Byte) was up to 377% worse
  • Latency showed high variance, with MCG having better best-case but dramatically worse worst-case times

STAT Operations

  • MCG was dramatically slower, with 327% higher average latency
  • P50 and P99 request times were also significantly worse

LIST Operations

  • MCG had 59% fewer objects/second
  • Average latency was 154% higher, with TTFB also much slower

Summary of Results

In my case, MCG consistently underperformed Dell ECS, especially in latency-sensitive tasks like GET, STAT, and LIST, which are likely critical to Loki’s querying responsiveness. While raw throughput was sometimes comparable, MCG’s high and unpredictable latency was the real issue, especially in Time to First Byte (TTFB) and worst-case response times.

  • GET: MCG had better best-case throughput but worse worst-case latency, with TTFB up to 377% slower.
  • STAT: Showed the largest latency gap, with MCG 3x slower on average.
  • LIST: MCG was 59% slower in object throughput.

Even when backed by Dell ECS, MCG introduced overhead that negatively impacted consistency. This suggests that MCG adds variability, even when the raw performance of the backing store is strong.

I did consider whether NooBaa’s compression or deduplication features might be affecting performance, but going under the covers and disabling them didn’t help; in fact, I found it made things worse. Unfortunately, tuning options appear limited, especially via the CRDs exposed in OpenShift.

The bottom line for me is that MCG, in the environment I was in and especially with pv-pool backing, introduces unpredictable performance, and that is critical for Loki, where fast, consistent reads matter.

Considerations

Log size

Additionally, log size will matter. Out-of-the-box logging in OpenShift tends to produce large log events/entries with lots of metadata that might not be useful in practice. These bulkier logs mean more data must be pulled from object storage during queries, potentially amplifying any performance pain.

So it may be worth asking yourself: do you really need all those fields? Is it worth investigating filters? Smaller logs will naturally have a smaller footprint and be quicker to download, which should speed up queries to a degree; however, depending on how logs are indexed, this is not a definitive statement and would require further investigation.
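
As a purely illustrative sketch of the kind of filtering I mean, newer OpenShift Logging releases expose prune filters on the ClusterLogForwarder resource. The API version, field paths, and names below are assumptions, so check the schema of the logging version you actually run before trying anything like this.

$ cat <<EOF | oc apply -f -
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  filters:
    - name: drop-noisy-metadata   # hypothetical filter pruning example fields
      type: prune
      prune:
        in:
          - .kubernetes.annotations   # example field only - decide what you can live without
  pipelines:
    - name: application-logs
      inputRefs:
        - application
      filterRefs:
        - drop-noisy-metadata
      outputRefs:
        - default                 # the default LokiStack log store
EOF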

This Isn’t a “NooBaa is Bad” Post

This is not a hit piece on NooBaa/MCG. This benchmark reflects a point-in-time test using specific (older) versions, configurations, and hardware. Performance can and does vary widely depending on:

  • Backend disk performance (especially for pv-pool); see the quick disk check sketched after this list
  • Whether the backing pv-pool StorageClass is doing any replication, etc.
  • NooBaa deployment and number of volumes
  • Object lifecycle/retention periods
  • Your infrastructure!
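
On the first point, one quick way to sanity-check the raw disks behind a pv-pool is a simple fio random-read run from a node or pod that mounts the same storage; the parameters and path below are illustrative only.

$ # illustrative parameters and path - adjust for your environment
$ fio --name=randread --directory=/mnt/pvpool-test --rw=randread --bs=4k \
    --size=1G --numjobs=4 --iodepth=32 --runtime=60 --time_based \
    --ioengine=libaio --direct=1 --group_reporting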

That said, in full ODF deployments RGW (RADOS Gateway) seems to be much more performant than MCG (NooBaa) when comparing the two, which appears to make sense.

If you’re deploying LokiStack and relying on MCG, especially in pv-pool mode, it’s worth testing your object storage performance before problems show up in production. While MCG may work fine for small-scale setups, larger clusters or longer log retention periods may demand more consistent, lower-latency backends. So please, run your own benchmarks to weigh up your options.