Event-driven analytics pipeline

ETL with Kafka, Debezium, Apache Flink, and Python. Real-time streaming into ClickHouse and MongoDB.

  • Kafka
  • Flink
  • ClickHouse
  • Microservices

Problem

Conicle needed real-time analytics over its operational databases without adding query load to production systems or maintaining one-off ETL jobs. Data had to flow into analytics storage (ClickHouse and MongoDB) with low latency and clear ownership.

System design

  • CDC: Debezium captured row-level changes from the source databases and published them to Kafka topics (connector registration sketched below).
  • Stream processing: Apache Flink jobs consumed those topics, transformed and enriched the events, and wrote the results to ClickHouse and MongoDB (see the PyFlink sketch below).
  • Python: supporting services and glue code, chosen for flexibility and collaboration with the data team.
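
A minimal sketch of the CDC setup, assuming Kafka Connect's REST API on a hypothetical kafka-connect host and a MySQL source; the connector name, hosts, credentials, and table list are placeholders, and the field names follow Debezium 2.x:

    import requests

    CONNECT_URL = "http://kafka-connect:8083/connectors"  # assumed Kafka Connect endpoint

    connector = {
        "name": "orders-cdc",  # hypothetical connector name
        "config": {
            "connector.class": "io.debezium.connector.mysql.MySqlConnector",
            "database.hostname": "orders-db",          # placeholder source database
            "database.port": "3306",
            "database.user": "debezium",
            "database.password": "********",
            "database.server.id": "184054",
            "topic.prefix": "cdc.orders",              # topics become cdc.orders.<db>.<table>
            "table.include.list": "orders.purchases",  # stream only what analytics needs
            "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
            "schema.history.internal.kafka.topic": "schema-changes.orders",
        },
    }

    resp = requests.post(CONNECT_URL, json=connector, timeout=10)
    resp.raise_for_status()
    print("connector registered:", resp.json()["name"])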
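
And a minimal PyFlink Table API sketch of one such job: it reads the Debezium changelog from Kafka, applies a light projection and filter, and writes to ClickHouse through a JDBC sink. Topic, table, and host names are illustrative, the Kafka and JDBC connector JARs plus a ClickHouse JDBC driver are assumed to be on the Flink classpath, and JDBC is only one of several possible ClickHouse sinks:

    from pyflink.table import EnvironmentSettings, TableEnvironment

    t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

    # Source: the Debezium changelog topic, interpreted as a changelog stream.
    t_env.execute_sql("""
        CREATE TABLE purchases_cdc (
            id BIGINT,
            tenant_id STRING,
            amount DECIMAL(12, 2),
            created_at TIMESTAMP(3)
        ) WITH (
            'connector' = 'kafka',
            'topic' = 'cdc.orders.orders.purchases',
            'properties.bootstrap.servers' = 'kafka:9092',
            'properties.group.id' = 'analytics-pipeline',
            'scan.startup.mode' = 'earliest-offset',
            'format' = 'debezium-json'
        )
    """)

    # Sink: an analytics table in ClickHouse, reached via JDBC.
    t_env.execute_sql("""
        CREATE TABLE purchases_analytics (
            id BIGINT,
            tenant_id STRING,
            amount DECIMAL(12, 2),
            created_at TIMESTAMP(3),
            PRIMARY KEY (id) NOT ENFORCED
        ) WITH (
            'connector' = 'jdbc',
            'url' = 'jdbc:clickhouse://clickhouse:8123/analytics',
            'table-name' = 'purchases',
            'driver' = 'com.clickhouse.jdbc.ClickHouseDriver'
        )
    """)

    # Transform: kept trivial here; real jobs also enrich and aggregate.
    t_env.execute_sql("""
        INSERT INTO purchases_analytics
        SELECT id, tenant_id, amount, created_at
        FROM purchases_cdc
        WHERE amount >= 0
    """)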

Architecture

  • Microservices owned their events; a central pipeline consumed them and routed each stream by use case.
  • Multi-tenant isolation via tenant IDs carried in every payload and partitioned sinks (see the producer sketch below).
  • Monitoring and alerting on consumer lag, throughput, and error rates (a lag check is sketched below).
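
A sketch of what event ownership and tenant tagging could look like at the producing microservice: the tenant ID travels in the payload and doubles as the Kafka message key, so partitions and downstream partitioned sinks stay aligned per tenant. The topic, event shape, and confluent-kafka client are illustrative choices, not the production schema:

    import json
    from datetime import datetime, timezone

    from confluent_kafka import Producer

    producer = Producer({"bootstrap.servers": "kafka:9092"})

    def publish_course_completed(tenant_id: str, user_id: str, course_id: str) -> None:
        event = {
            "tenant_id": tenant_id,  # tenant ID always travels in the payload
            "user_id": user_id,
            "course_id": course_id,
            "occurred_at": datetime.now(timezone.utc).isoformat(),
        }
        producer.produce(
            topic="learning.course-completed",  # hypothetical event topic
            key=tenant_id,                      # keying by tenant keeps a tenant's events together
            value=json.dumps(event).encode("utf-8"),
        )
        producer.flush()

    publish_course_completed("tenant-42", "user-7", "course-101")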
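
And a sketch of the kind of lag check behind the alerting, assuming the confluent-kafka client: it compares a consumer group's committed offset on each partition against the broker's high watermark. Group, topic, and threshold are placeholders:

    from confluent_kafka import Consumer, TopicPartition

    LAG_ALERT_THRESHOLD = 10_000  # assumed alert threshold, tuned per topic

    consumer = Consumer({
        "bootstrap.servers": "kafka:9092",
        "group.id": "analytics-pipeline",  # group whose lag is being checked
        "enable.auto.commit": False,
    })

    def partition_lag(topic: str, partition: int) -> int:
        tp = TopicPartition(topic, partition)
        committed = consumer.committed([tp], timeout=10)[0]
        _low, high = consumer.get_watermark_offsets(tp, timeout=10)
        # If nothing has been committed yet, treat the whole log as lag.
        current = committed.offset if committed.offset >= 0 else 0
        return high - current

    lag = partition_lag("cdc.orders.orders.purchases", 0)
    if lag > LAG_ALERT_THRESHOLD:
        print(f"ALERT: consumer lag {lag} exceeds threshold")
    consumer.close()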

Impact

  • Single pipeline for multiple analytics consumers; reduced duplicate ETL and ad-hoc scripts.
  • Near real-time dashboards and reporting.
  • Clear ownership and independent scaling per topic and sink.