Event Pipeline
Real-time event processing system for a Cairo-based startup
The Context
The company was relying on a nightly batch ETL process to move user events into the analytics and machine-learning feature stores. By the time the data was available, it was already 6+ hours stale. Marketing couldn't execute same-day campaigns, and the recommendation models were always training on yesterday's behavior. We needed a system that could validate, clean, and distribute events as they happened, without dropping data.
Architecture & Execution
The pipeline is built around Apache Kafka as the resilient backbone. The ingestion API does minimal structural validation and immediately pushes to Kafka to keep throughput high and latency low. The stream processors are independent Python workers that pull from Kafka, apply business logic, deduplicate events with a sliding-window Bloom filter in Redis (chosen to keep the memory footprint small), and route the cleaned payloads to the appropriate sinks.
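Below is a minimal sketch of one such worker, assuming kafka-python and redis-py; the topic name, consumer group, and sink functions are illustrative placeholders, not the production code.

    # Sketch of a stream-processor worker: consume, dedupe, enrich, route.
    import json

    from kafka import KafkaConsumer
    import redis

    r = redis.Redis(host="localhost", port=6379)

    consumer = KafkaConsumer(
        "user-events",                       # hypothetical topic name
        bootstrap_servers="localhost:9092",
        group_id="stream-processors",        # hypothetical consumer group
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
        enable_auto_commit=False,
    )

    def is_duplicate(event_id: str) -> bool:
        # Placeholder for the Bloom-filter check described above; assumes the
        # RedisBloom module. See the rotating-filter sketch further below.
        return bool(r.execute_command("BF.EXISTS", "events:bloom", event_id))

    def enrich(event: dict) -> dict:
        # Placeholder for the PostgreSQL enrichment step.
        event["enriched"] = True
        return event

    def route(event: dict) -> None:
        # Placeholder for multi-sink distribution (warehouse, feature store, ...).
        print("routing", event["id"])

    for message in consumer:
        event = message.value
        if is_duplicate(event["id"]):
            continue
        route(enrich(event))
        consumer.commit()  # commit offsets only after the event reached its sinks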
Post-Mortem Lessons
Getting the number of Kafka partitions right is a dark art. We started with 12, hit a bottleneck, jumped to 48, and eventually settled on 24 after properly profiling how messages were distributed across partitions.
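For reference, one rough way to profile that distribution is to compare beginning and end offsets per partition; the sketch below assumes kafka-python, and the topic name is illustrative.

    # Approximate per-partition message counts to spot skew before resizing.
    from kafka import KafkaConsumer, TopicPartition

    consumer = KafkaConsumer(bootstrap_servers="localhost:9092")
    partitions = consumer.partitions_for_topic("user-events") or set()
    tps = [TopicPartition("user-events", p) for p in sorted(partitions)]

    begin = consumer.beginning_offsets(tps)
    end = consumer.end_offsets(tps)

    for tp in tps:
        # end - begin = retained messages in this partition
        print(f"partition {tp.partition}: {end[tp] - begin[tp]} messages")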
The biggest incident we had wasn't a crash; it was a schema change that passed ingestion validation but completely broke the parsing logic of three downstream consumers. Schema registries are not optional.
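To illustrate the kind of check a registry enforces, here is a producer-side validation sketch using the jsonschema package as a stand-in for a proper registry (in practice you would expect something like Avro with a hosted schema registry); the schema and its fields are illustrative.

    # Validate events against a versioned schema before they enter the pipeline.
    from jsonschema import validate, ValidationError

    EVENT_SCHEMA_V2 = {
        "type": "object",
        "required": ["id", "user_id", "event_type", "timestamp"],
        "properties": {
            "id": {"type": "string"},
            "user_id": {"type": "string"},
            "event_type": {"type": "string"},
            "timestamp": {"type": "number"},
        },
        "additionalProperties": False,  # reject fields consumers don't expect
    }

    def is_valid_event(event: dict) -> bool:
        try:
            validate(instance=event, schema=EVENT_SCHEMA_V2)
            return True
        except ValidationError:
            return False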
Bloom filters for deduplication are magical until you need to periodically clear them to prevent false positive rates from climbing. We implemented rotating daily filters with a 24-hour overlap buffer.
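A sketch of that rotation is below, assuming Redis with the RedisBloom module (BF.RESERVE / BF.ADD / BF.EXISTS); key names, capacity, and error rate are illustrative.

    # Rotating daily Bloom filters: write to today's filter, check today's and
    # yesterday's, and let expiry handle cleanup instead of an explicit clear job.
    from datetime import datetime, timedelta

    import redis

    r = redis.Redis(host="localhost", port=6379)

    def _filter_key(day: datetime) -> str:
        return f"dedupe:bloom:{day:%Y-%m-%d}"

    def seen_before(event_id: str, now: datetime) -> bool:
        today = _filter_key(now)
        yesterday = _filter_key(now - timedelta(days=1))

        # Create today's filter on first use; a ~48h TTL keeps yesterday's
        # filter alive alongside it, giving the 24-hour overlap buffer.
        if not r.exists(today):
            r.execute_command("BF.RESERVE", today, 0.001, 10_000_000)
            r.expire(today, 48 * 3600)

        if r.execute_command("BF.EXISTS", today, event_id) or \
           r.execute_command("BF.EXISTS", yesterday, event_id):
            return True

        r.execute_command("BF.ADD", today, event_id)
        return False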
System Diagram
Mobile/Web Clients
↓
Ingestion API (Validation)
↓
Kafka Topic (Partitioned)
↓
Stream Processors
├─ Deduplication (Redis)
└─ Enrichment (PostgreSQL)
↓
Multi-Sink Distribution