Apache Kafka is leaving the disk behind, and exposing the real cloud bill
Cloud-native Kafka increasingly looks like a cost-controlled platform, not just a fast disk-backed log.📷 AI-generated image / TECH&SPACE
- ★Tiered storage separates hot local data from cheaper object storage and changes Kafka retention economics.
- ★FinOps telemetry becomes critical because cloud Kafka costs come from compute, storage, network traffic and consumer patterns.
- ★Diskless Kafka proposals reduce dependence on local disks, but require careful design around latency, recovery and availability.
Apache Kafka has long been an architectural constant: a distributed commit log, local disks, replicas, partitions and enough operational discipline to keep it from becoming an expensive traffic junction. Cloud infrastructure has pushed that model in a different direction. In his analysis for InfoQ, Viquar Khan describes Kafka’s move toward a cloud-native architecture where adding more brokers and disks is no longer the whole answer. The platform now has to expose who is spending, where data cools down, how consumers scale and how much isolation each tenant really gets.
The first important shift is tiered storage in Apache Kafka. The idea is simple, but the consequences are large: a broker no longer has to keep the full historical log on local disk. Hot data stays close to compute, while older segments can move into cheaper object storage. Retention is therefore no longer bounded only by local SSD capacity and broker count. For organizations that keep long event streams for audit, replay or analytics, that changes the economics of the platform.
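To make the retention argument concrete, here is a minimal back-of-the-envelope sketch in Python. It is not taken from Khan's article: the names `TopicProfile` and `storage_footprint_gib` are hypothetical, the numbers are illustrative, and the model assumes the remote tier holds one copy of the full retention window while ignoring compression, indexes and the active segment. The hot window corresponds roughly to what a topic's `local.retention.ms` setting controls, and the total window to `retention.ms`.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class TopicProfile:
    ingest_bytes_per_sec: float   # average produce rate into the topic
    retention_hours: float        # total retention window
    local_retention_hours: float  # hot window kept on broker disks
    replication_factor: int       # copies kept on broker disks


def storage_footprint_gib(p: TopicProfile, tiered: bool) -> dict:
    """Approximate local and remote storage for one topic, in GiB.

    Assumption: with tiering, brokers keep only the hot window (times the
    replication factor) and object storage holds the full window once.
    """
    bytes_per_hour = p.ingest_bytes_per_sec * 3600
    if tiered:
        local_hours = min(p.local_retention_hours, p.retention_hours)
        local = bytes_per_hour * local_hours * p.replication_factor
        remote = bytes_per_hour * p.retention_hours
    else:
        local = bytes_per_hour * p.retention_hours * p.replication_factor
        remote = 0.0
    return {"local_gib": local / GIB, "remote_gib": remote / GIB}


# Illustrative profile: 1 MiB/s ingest, 7 days of retention, 6 hours kept hot.
profile = TopicProfile(
    ingest_bytes_per_sec=1024 ** 2,
    retention_hours=168,
    local_retention_hours=6,
    replication_factor=3,
)
for tiered in (False, True):
    print("tiered" if tiered else "local-only", storage_footprint_gib(profile, tiered))
```

Under these assumptions the local disk requirement depends on the hot window rather than the full retention period, which is the shift in economics the paragraph above describes. Real sizing would also need actual storage prices and request costs, which are deliberately left out here.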
The second layer is less flashy but operationally decisive: FinOps telemetry. Storage is only one line on a cloud Kafka bill. The rest comes from compute instances, network traffic, replication, cross-zone data movement, consumer lag and poorly sized workloads. A cloud-native Kafka platform therefore has to expose usable cost and utilization signals by team, topic, tenant or application flow. Without that, the platform looks like a shared service but behaves like an opaque cost center.
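One simple way to turn such signals into a chargeback view is to split a shared bill in proportion to the bytes each topic moves. The sketch below is only an illustration of that idea, not a method described in the article: `allocate_cost`, the topic names, the owners and the figures are all hypothetical, and a real model would likely weight storage, cross-zone traffic and compute separately.

```python
from collections import defaultdict


def allocate_cost(total_cost: float,
                  bytes_by_topic: dict,
                  owner_by_topic: dict) -> dict:
    """Split a shared bill across owners in proportion to bytes moved.

    Topics without a known owner are grouped under "unattributed" so the
    gap in ownership metadata stays visible instead of being hidden.
    """
    total_bytes = sum(bytes_by_topic.values())
    if total_bytes == 0:
        return {}
    shares = defaultdict(float)
    for topic, moved in bytes_by_topic.items():
        owner = owner_by_topic.get(topic, "unattributed")
        shares[owner] += total_cost * moved / total_bytes
    return dict(shares)


# Hypothetical monthly figures, for illustration only.
usage = {"orders": 600, "clicks": 300, "tmp-debug": 100}
owners = {"orders": "payments", "clicks": "web"}
print(allocate_cost(1000.0, usage, owners))
```

The "unattributed" bucket is a deliberate design choice: a cost report that silently drops unowned topics reproduces exactly the opacity the telemetry is meant to remove.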
Tiered storage, FinOps telemetry, virtual clusters and Share Groups are pushing event streaming toward a more elastic, but more complex, infrastructure model.
The diskless approach separates broker compute from durable storage, with new trade-offs.
Khan’s article also underlines that elasticity is not only about broker count. Consumers often determine the practical value of event streaming: if they fall behind, the system may be technically healthy while the business process is late. Elastic consumer scaling therefore becomes part of the architecture, not an afterthought. By the same logic, virtual clusters try to give tenants the experience of separate Kafka environments without physically duplicating the full infrastructure. That is attractive for platform teams, but it demands precise control over quotas, isolation and access rights.
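As a rough illustration of lag-driven consumer scaling, the following Python sketch estimates how many consumers a classic consumer group would need to keep up with producers and drain a backlog within a target time. The function name `desired_consumers` and all parameters are hypothetical, and the model assumes each consumer processes records at a steady, known rate. The cap at the partition count reflects the fact that, in a classic consumer group, consumers beyond the number of partitions sit idle.

```python
import math


def desired_consumers(lag_records: int,
                      produce_rate: float,
                      per_consumer_rate: float,
                      partitions: int,
                      drain_seconds: float,
                      min_consumers: int = 1) -> int:
    """Estimate a consumer count that keeps up and drains the backlog.

    Rates are in records per second. The result is clamped between
    min_consumers and the partition count.
    """
    if per_consumer_rate <= 0 or drain_seconds <= 0:
        raise ValueError("per_consumer_rate and drain_seconds must be positive")
    # Throughput needed to match producers and clear the lag in time.
    required_rate = produce_rate + lag_records / drain_seconds
    wanted = math.ceil(required_rate / per_consumer_rate)
    return max(min_consumers, min(wanted, partitions))


# Illustrative numbers: 600k records behind, 5-minute drain target.
print(desired_consumers(lag_records=600_000, produce_rate=1000,
                        per_consumer_rate=500, partitions=12,
                        drain_seconds=300))
```

An autoscaler built on this idea would also need damping to avoid triggering a rebalance on every small change in lag, which this sketch does not attempt.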
The newest part of the discussion concerns Share Groups, a mechanism being developed in the Kafka community through KIP-932. The goal is to bring Kafka closer to queue-style processing, where multiple consumers can share work without the rigidity of classic consumer group partitioning, in which each partition is read by at most one consumer in the group and parallelism is capped by the partition count. If that model matures, Kafka could cover work queues and event streaming patterns with less application-level contortion.
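The difference can be shown with a toy model in Python. This is a deliberately simplified illustration, not the KIP-932 protocol: the functions `classic_assignment` and `share_style_dispatch` are hypothetical, round-robin is just one possible assignment strategy, and the queue-style half omits acknowledgement, redelivery and ordering concerns entirely.

```python
from collections import deque


def classic_assignment(partitions: list, consumers: list) -> dict:
    """Classic group model: each partition has exactly one owner."""
    owners = {c: [] for c in consumers}
    for i, partition in enumerate(partitions):
        owners[consumers[i % len(consumers)]].append(partition)
    return owners


def share_style_dispatch(records: list, consumers: list) -> dict:
    """Queue-style model: records are handed out one at a time,
    so any consumer can take work regardless of partition count."""
    queue = deque(records)
    taken = {c: [] for c in consumers}
    while queue:
        for c in consumers:
            if not queue:
                break
            taken[c].append(queue.popleft())
    return taken


consumers = ["c1", "c2", "c3", "c4", "c5"]

classic = classic_assignment(["p0", "p1", "p2"], consumers)
idle = [c for c, owned in classic.items() if not owned]
print("classic group, idle consumers:", idle)

shared = share_style_dispatch(list(range(10)), consumers)
print("queue-style, records per consumer:",
      {c: len(taken) for c, taken in shared.items()})
```

With three partitions and five consumers, the classic model leaves some consumers without work, while the queue-style model spreads records across all of them. The price of that flexibility is that per-partition ordering is no longer something a consumer can take for granted.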
The diskless future is the most radical point. Instead of treating the Kafka broker and local disk as an inseparable pair, diskless proposals push toward a model where durable storage is separated and brokers become a more elastic compute layer. That sounds natural for cloud infrastructure, but it is not free: every dependency on remote storage raises questions about latency, availability, recovery and behavior under heavy replay load. Kafka can move away from the disk, but it cannot escape the physics of distributed systems.
The real conclusion is more sober than a slogan about a “diskless” future. Cloud-native Kafka is not a single feature, but a bundle of architectural decisions: Apache Kafka has to preserve the reliability of the log, improve storage economics, expose costs through telemetry and give tenants stronger isolation at the same time. That is a serious infrastructure turn, not a cosmetic modernization.

