MQTT to Kafka: Streaming 1M Sensor Events per Second
MQTT handles device ingestion; Kafka handles backpressure, replay, and fan-out. The architecture for bridging OT sensor networks to IT analytics at industrial scale.
Audi ingests connected car telemetry in real time. Deutsche Bahn runs train information systems across Germany. Quarterhill adjusts toll pricing with sub-second latency from sensor to decision. All three run the same core architecture: MQTT handles the last mile to constrained devices on unreliable networks, and Kafka handles everything after that. The pattern works because neither protocol tries to do the other's job.
8-Bit Microcontroller Meets Enterprise Data Platform
MQTT was designed for constrained devices and unreliable networks. A client can run on an 8-bit microcontroller, pick a QoS level per message to trade overhead against delivery guarantees, and cope with sensors that go offline and reconnect. Kafka was built for enterprise data streaming: high throughput, durability, exactly-once semantics, and integration with the broader data platform.
Kafka was not built for IoT communication at the edge, which is why the two fit together so well: combining Apache Kafka with MQTT yields a scalable, reliable, and secure IoT infrastructure.
Trying to use Kafka clients directly on constrained devices is impractical — they're resource-intensive and assume reliable connectivity that IoT environments can't guarantee. The protocols are complementary, not competing.
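To make the delivery guarantee concrete, here is a minimal simulation of at-least-once delivery in the spirit of MQTT QoS 1. It is a sketch under stated assumptions, not the wire protocol: `LossyLink`, its scripted outcomes, and `publish_qos1` are hypothetical names invented for illustration. The point it shows is that a sender that retries until it sees an acknowledgement never loses a reading, but the receiver may see the same reading twice.

```python
class LossyLink:
    """Hypothetical unreliable link with a scripted outcome per attempt.

    Outcomes: "lost"     -> the message never reaches the broker
              "ack_lost" -> the broker gets it, the acknowledgement is dropped
              "ok"       -> the broker gets it and the sender sees the ack
    """

    def __init__(self, outcomes):
        self.outcomes = list(outcomes)
        self.received = []  # everything the broker side actually saw

    def send(self, packet_id, payload):
        """Return True only if the sender observed an acknowledgement."""
        outcome = self.outcomes.pop(0) if self.outcomes else "ok"
        if outcome == "lost":
            return False
        self.received.append((packet_id, payload))
        return outcome == "ok"


def publish_qos1(link, packet_id, payload, max_attempts=5):
    """Resend until acknowledged; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if link.send(packet_id, payload):
            return attempt
    raise TimeoutError(f"no acknowledgement for packet {packet_id}")


link = LossyLink(["lost", "ack_lost"])
attempts = publish_qos1(link, packet_id=1, payload=b"21.5")
```

In this run the first attempt is lost, the second arrives but its acknowledgement does not, and the third succeeds, so the broker side holds two copies of the same packet. Downstream consumers therefore need to tolerate or deduplicate repeats.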
Five-Layer Architecture: Edge to Analytics
IoT Streaming Stack
- Edge Layer: Sensors publish to an MQTT broker (EMQX, Mosquitto, HiveMQ)
- Bridge Layer: MQTT-Kafka connector subscribes and produces to Kafka topics
- Processing Layer: Kafka Streams or Apache Flink for real-time analytics
- Storage Layer: Time-series database (TimescaleDB, InfluxDB) for hot data
- Analytics Layer: Data warehouse (BigQuery, Snowflake) for historical analysis
The MQTT broker handles the complexity of device connections—authentication, session state, last-will messages for detecting disconnected sensors. The Kafka connector creates a clean interface between edge chaos and enterprise order.
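The bridge's job can be sketched as a pure mapping from one MQTT message to one Kafka record. The example below is illustrative only: `KafkaRecord`, `to_kafka_record`, and the JSON envelope fields are assumptions, not the API of any real connector. The topic name `sensor_data` and the choice of client ID as record key echo the EMQX configuration shown in this article.

```python
import json
from typing import NamedTuple


class KafkaRecord(NamedTuple):
    """Hypothetical stand-in for what a producer would send."""
    topic: str
    key: bytes
    value: bytes


def to_kafka_record(mqtt_topic: str, payload: bytes, client_id: str,
                    kafka_topic: str = "sensor_data") -> KafkaRecord:
    """Wrap one MQTT message in a JSON envelope keyed by device."""
    envelope = {
        "mqtt_topic": mqtt_topic,   # keep the original topic for routing later
        "client_id": client_id,
        "payload": payload.decode("utf-8", errors="replace"),
    }
    return KafkaRecord(
        topic=kafka_topic,
        key=client_id.encode("utf-8"),  # same device -> same key
        value=json.dumps(envelope, sort_keys=True).encode("utf-8"),
    )


record = to_kafka_record("plant1/press7/temperature", b"21.5", "press7")
```

Keying by client ID means all records from one device share a key, which is what lets Kafka keep them in the same partition and preserve their relative order.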
Production Deployments: Automotive, Rail, Energy, Steel
The MQTT-Kafka pattern runs in production across industries at massive scale:
Notable Deployments
- Audi — Connected car infrastructure for real-time ingestion and analysis
- Deutsche Bahn — Real-time train information systems across Germany
- E.ON — IoT cloud platform for smart homes and energy grids
- Bosch Power Tools — Real-time alerting dashboards for industrial equipment
- Severstal — Edge analytics for predictive maintenance in steel production
Quarterhill's intelligent traffic system demonstrates sub-second decision-making: adjusting toll rates based on real-time congestion is only possible with data streaming that maintains sub-second latency from sensor to pricing engine.
OT/IT Convergence: Event-Driven Replaces Polling
Traditional OT (Operational Technology) middleware — vendor-locked, polling-based systems — is giving way to event-driven architectures built on Kafka, MQTT, and OPC-UA. This is an integration pattern, not just a technology upgrade.
Kafka serves as the central event backbone, MQTT enables lightweight device communication, and OPC-UA ensures secure industrial data exchange. Together, they allow organizations to scale dynamically without vendor lock-in.
Zero Data Loss Through Network Partitions
Kafka's advantage in IoT isn't just throughput; it's resilience. If connectivity between edge and cloud is interrupted, records already written to Kafka's durable, replicated log are not lost, and they are delivered once the connection is reestablished.
For seismic sensor networks, where remote stations may lose satellite connectivity during storms, this durability is essential. Sensors buffer locally, the edge gateway buffers to disk, and when connectivity returns, everything flows through to the central platform without data loss.
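The gateway's "buffer to disk, forward on reconnect" behaviour can be sketched with a small outbox table. This is a minimal illustration using SQLite from the Python standard library; `DiskBuffer` and the `send` callback are hypothetical, and a real gateway would pass a file path rather than the in-memory default used here. The key design choice is that a row is deleted only after the uplink confirms it.

```python
import sqlite3
from typing import Callable


class DiskBuffer:
    """Outbox that deletes a row only after a confirmed send."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS outbox "
            "(id INTEGER PRIMARY KEY AUTOINCREMENT, payload BLOB NOT NULL)"
        )
        self.db.commit()

    def enqueue(self, payload: bytes) -> None:
        """Persist a reading before any attempt to forward it."""
        self.db.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))
        self.db.commit()

    def flush(self, send: Callable[[bytes], bool]) -> int:
        """Forward buffered rows oldest-first; stop at the first failure."""
        rows = self.db.execute(
            "SELECT id, payload FROM outbox ORDER BY id"
        ).fetchall()
        sent = 0
        for row_id, payload in rows:
            if not send(payload):
                break  # uplink still down: keep this row and everything after
            self.db.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
            self.db.commit()
            sent += 1
        return sent

    def pending(self) -> int:
        return self.db.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]


buf = DiskBuffer()
for i in range(3):
    buf.enqueue(f"reading-{i}".encode())

sent_while_offline = buf.flush(lambda payload: False)  # uplink down

received = []
def uplink(payload: bytes) -> bool:
    received.append(payload)
    return True

sent_after_reconnect = buf.flush(uplink)
```

If the process crashes between a successful send and the delete, that row is sent again on restart. The buffer is therefore at-least-once, which matches the guarantee the rest of the pipeline has to handle anyway.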
EMQX: Millions of Concurrent Connections, Native Kafka Bridge
# EMQX Kafka integration configuration
bridges:
  kafka:
    servers: "kafka:9092"
    topic: sensor_data
    message_key: "${clientid}"
    value_encoder: json
    ssl:
      enable: true
      cacertfile: /etc/emqx/certs/ca.crt

EMQX has emerged as the enterprise MQTT broker of choice, offering native Kafka integration without custom connectors. Its clustering capability handles millions of concurrent connections, making it suitable for large-scale IoT deployments.
Topic Sprawl and Operational Cost
Kafka has limitations for IoT-specific patterns. Managing a large number of topics (common when each device has multiple data streams) creates overhead. Design topic hierarchies carefully — one topic per sensor type rather than per device.
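The difference between the two topic designs is easy to quantify. The sketch below assumes a hypothetical three-level MQTT topic layout of `site/device/sensor_type` and a made-up `sensors.<type>` naming scheme for Kafka; neither comes from a specific deployment. It routes each stream to a per-type topic, moves the device identity into the record key, and counts the topics each design would need.

```python
def kafka_topic_for(mqtt_topic: str) -> tuple[str, str]:
    """Map 'site/device/sensor_type' to (kafka_topic, record_key)."""
    site, device, sensor_type = mqtt_topic.split("/")
    return f"sensors.{sensor_type}", f"{site}/{device}"


def topic_counts(mqtt_topics: list[str]) -> tuple[int, int]:
    """Return (topics with one per device stream, topics with one per type)."""
    per_stream = len(set(mqtt_topics))
    per_type = len({kafka_topic_for(t)[0] for t in mqtt_topics})
    return per_stream, per_type


# Hypothetical fleet: 1,000 devices, three streams each.
fleet = [
    f"plant1/dev{i}/{sensor}"
    for i in range(1000)
    for sensor in ("temperature", "vibration", "pressure")
]
counts = topic_counts(fleet)
example = kafka_topic_for("plant1/dev7/vibration")
```

For this fleet the per-stream design needs 3,000 Kafka topics while the per-type design needs 3, and the per-type count stays flat as devices are added.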
Cost is the other constraint. Kafka clusters aren't cheap, especially managed cloud offerings. For smaller deployments, simpler alternatives (Redis Streams, NATS) deserve evaluation before committing to the full Kafka ecosystem.
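A back-of-envelope estimate shows why cost matters at the headline rate of 1M events per second. Every input except the event rate is an assumption chosen for illustration: a 200-byte average event, a replication factor of 3, and 24 hours of retention. Compression and index overhead are ignored, so treat the result as an order of magnitude rather than a sizing recommendation.

```python
def cluster_footprint(events_per_sec: int, avg_event_bytes: int,
                      replication_factor: int = 3,
                      retention_hours: int = 24) -> tuple[float, float]:
    """Return (ingress in MB/s, retained data in TB) under the given assumptions."""
    ingress_mb_per_sec = events_per_sec * avg_event_bytes / 1e6
    retained_tb = (
        ingress_mb_per_sec * 3600 * retention_hours * replication_factor / 1e6
    )
    return ingress_mb_per_sec, retained_tb


# Assumed workload: 1M events/s at 200 bytes each (payload size is hypothetical).
ingress, retained = cluster_footprint(1_000_000, 200)
```

Under these assumptions the cluster takes in 200 MB/s and holds roughly 52 TB for a single day of retention, which is the kind of figure that makes simpler alternatives worth evaluating for small deployments.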
Under 100 Sensors? Skip Kafka.
The MQTT-Kafka pattern is the default architecture for IoT projects at scale. The separation of concerns is clean: MQTT handles the messiness of device communication; Kafka provides the enterprise integration layer.
For projects with fewer than a hundred sensors and straightforward analytics requirements, simpler stacks (MQTT direct to TimescaleDB, for example) are more appropriate. Kafka shines at scale; at smaller scales, it's overhead without proportional benefit.
References & Further Reading
Kafka for IoT: Key Capabilities and Top Use Cases in 2025
Instaclustr guide to Kafka for IoT applications
https://www.instaclustr.com/education/apache-kafka/kafka-for-iot-4-key-capabilities-and-top-use-cases-in-2025/
IoT and Event Streaming at Scale with Kafka & MQTT
Confluent's architecture guide for IoT streaming
https://www.confluent.io/blog/iot-with-kafka-connect-mqtt-and-rest-proxy/
MQTT to Kafka: Benefits, Use Cases & Quick Guide
EMQX integration guide for MQTT-Kafka bridging
https://www.emqx.com/en/blog/mqtt-and-kafka
IoT (MQTT) and Data Streaming for Tolling Traffic System
Real-world case study of Kafka for intelligent transportation
https://www.kai-waehner.de/blog/2024/11/01/iot-and-data-streaming-with-kafka-for-a-tolling-traffic-system-with-dynamic-pricing/
Modernizing OT Middleware: The Shift to Open Industrial IoT Architectures
Kai Waehner's analysis of industrial IoT evolution
https://www.kai-waehner.de/blog/2025/03/17/modernizing-ot-middleware-the-shift-to-open-industrial-iot-architectures-with-data-streaming/