Senior Backend Engineer · Java · Kafka · AI Infrastructure
Ravi Kant Shukla — 9 years shipping distributed Java systems. Currently: production RAG assistant, document intelligence extraction engine, event-driven Kafka architecture.
Years in production Java
AI systems shipped
PCB files auto-processed
Fewer support tickets via RAG
Now
Studying for the Confluent Kafka Streams cert. Drafting a piece on agentic tool-calling. Reading Designing Data-Intensive Applications, again.
Updated 2 days ago
Try It Live
Paste a document and ask a question. Watch in real time as the RAG pipeline chunks your text, embeds it, retrieves relevant sections, and generates an answer, all animated step by step.
Document
RAG systems combine retrieval and generation to provide grounded answers. The pipeline chunks documents, embeds them, retrieves relevant sections...
Question
How does RAG improve answer accuracy?
4 Chunks
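For readers who prefer code to animation, the chunk, embed, retrieve, and generate steps can be sketched roughly as below. This is a minimal illustration, not the code behind the demo: the class name RagSketch, the fixed-size word chunking, and the term-count "embedding" are all assumptions made for the example. A production pipeline would call a real embedding model and an LLM instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RagSketch {

    // Chunk: split text into pieces of at most `size` words (assumed fixed-size strategy).
    static List<String> chunk(String text, int size) {
        List<String> chunks = new ArrayList<>();
        if (text.trim().isEmpty()) return chunks;
        String[] words = text.trim().split("\\s+");
        for (int i = 0; i < words.length; i += size) {
            int end = Math.min(i + size, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, i, end)));
        }
        return chunks;
    }

    // Embed: toy term-count vector; a real pipeline would call an embedding model here.
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    // Cosine similarity between two sparse term-count vectors.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
            normA += e.getValue() * e.getValue();
        }
        for (int v : b.values()) normB += v * v;
        return (normA == 0 || normB == 0) ? 0 : dot / Math.sqrt(normA * normB);
    }

    // Retrieve: the k chunks most similar to the question, best first.
    static List<String> retrieve(List<String> chunks, String question, int k) {
        Map<String, Integer> q = embed(question);
        List<String> ranked = new ArrayList<>(chunks);
        ranked.sort(Comparator.comparingDouble((String c) -> cosine(embed(c), q)).reversed());
        return new ArrayList<>(ranked.subList(0, Math.min(k, ranked.size())));
    }

    // Generate: only builds the grounded prompt; the LLM call itself is left out.
    static String buildPrompt(List<String> context, String question) {
        return "Answer using only this context:\n" + String.join("\n", context)
                + "\nQuestion: " + question;
    }

    public static void main(String[] args) {
        String document = "RAG systems combine retrieval and generation to provide grounded answers. "
                + "The pipeline chunks documents, embeds them, retrieves relevant sections";
        String question = "How does RAG improve answer accuracy?";
        List<String> chunks = chunk(document, 6);
        System.out.println(buildPrompt(retrieve(chunks, question, 2), question));
    }
}
```

The term-count vectors only match exact words, which is why real systems use learned embeddings; the ranking and prompt-assembly logic stays the same shape either way.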
§ Audience
Three paths. Pick the one that matches why you’re here.
01
Senior Java + AI engineer. Production RAG, Kafka Streams, distributed document processing at scale. Background page has the fast summary.
Read my background
02
Deep technical writing — Kafka internals, RAG architecture, Spring Boot performance, AWS patterns. One new post every 2–3 weeks. No fluff.
Read the blog
03
Two AWS certifications. Three production AI systems shipped. 9+ years Java backend. Resume is one click away.
Download Resume
§ Work
Production systems. All metrics are real.
Problem
Manual review of PCB manufacturing files was a bottleneck at 150+ uploads per day. Engineers were spending hours on structured data extraction that could be automated.
Approach
Built an async Java pipeline using Spring Batch — OCR extraction, structured normalization, validation rules, and S3-backed storage. Zero manual review for standard file formats.
Outcome
150+ PCB files auto-processed daily
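A rough sketch of the staged, asynchronous shape of that pipeline, using plain CompletableFuture rather than Spring Batch so it runs without dependencies. Everything specific here is an assumption for illustration: the class name PcbPipelineSketch, the "key: value" input standing in for OCR output, and the field names board_name and layer_count. S3 storage is omitted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PcbPipelineSketch {

    // Stage 1 (stand-in for OCR extraction): parse "key: value" lines from raw text.
    static Map<String, String> extract(String rawText) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : rawText.split("\\R")) {
            int sep = line.indexOf(':');
            if (sep > 0) fields.put(line.substring(0, sep), line.substring(sep + 1));
        }
        return fields;
    }

    // Stage 2: normalize keys to snake_case and trim values.
    static Map<String, String> normalize(Map<String, String> fields) {
        Map<String, String> out = new LinkedHashMap<>();
        fields.forEach((k, v) -> out.put(k.trim().toLowerCase().replace(' ', '_'), v.trim()));
        return out;
    }

    // Stage 3: hypothetical validation rules; an empty list means no manual review is needed.
    static List<String> validate(Map<String, String> fields) {
        List<String> errors = new ArrayList<>();
        if (!fields.containsKey("board_name")) errors.add("missing board_name");
        String layers = fields.get("layer_count");
        if (layers == null || !layers.matches("[1-9]\\d*")) {
            errors.add("layer_count must be a positive integer");
        }
        return errors;
    }

    // Run all files through the three stages concurrently; results keep the input order.
    static List<List<String>> processAll(List<String> files) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<CompletableFuture<List<String>>> pending = new ArrayList<>();
            for (String raw : files) {
                pending.add(CompletableFuture.supplyAsync(() -> extract(raw), pool)
                        .thenApply(PcbPipelineSketch::normalize)
                        .thenApply(PcbPipelineSketch::validate));
            }
            List<List<String>> results = new ArrayList<>();
            for (CompletableFuture<List<String>> f : pending) results.add(f.join());
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> files = List.of("Board Name: demo\nLayer Count: 4", "Layer Count: zero");
        System.out.println(processAll(files));
    }
}
```

Files whose error list comes back empty skip review entirely; anything else would be routed to an engineer.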
Problem
Support team fielding repetitive engineering queries. High escalation rate creating load on senior engineers.
Approach
LangChain4j RAG pipeline with vector store retrieval and agentic tool-calling. Answers grounded in internal documentation with fallback to human escalation.
Outcome
40% reduction in support escalations
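The fallback to human escalation can be pictured as a simple routing rule. The sketch below is an assumption about how such a rule might look, not the production logic: the class EscalationSketch, the Route enum, and the 0.35 score cutoff are all hypothetical, and it does not use the LangChain4j API.

```java
class EscalationSketch {

    // Hypothetical cutoff; a real system would tune this against labelled queries.
    static final double MIN_RETRIEVAL_SCORE = 0.35;

    enum Route { ANSWER, ESCALATE_TO_HUMAN }

    // Escalate when retrieval found nothing convincing or the model produced no answer.
    static Route route(double bestRetrievalScore, String draftAnswer) {
        boolean weakGrounding = bestRetrievalScore < MIN_RETRIEVAL_SCORE;
        boolean emptyAnswer = draftAnswer == null || draftAnswer.trim().isEmpty();
        return (weakGrounding || emptyAnswer) ? Route.ESCALATE_TO_HUMAN : Route.ANSWER;
    }

    public static void main(String[] args) {
        System.out.println(route(0.82, "Restart the ingest worker.")); // prints ANSWER
        System.out.println(route(0.10, "Not sure."));                  // prints ESCALATE_TO_HUMAN
    }
}
```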
Problem
Monolithic deployment pipeline causing 2-hour deployment windows. Tight service coupling blocking independent team velocity.
Approach
Migrated to Kafka Streams with Schema Registry for event-driven service communication. GitHub Actions CI/CD pipeline built from scratch alongside the migration.
Outcome
Deploy time: 2 hours → 12 minutes
Technologies I specialize in
Core Backend
Messaging
Cloud · AWS
AI · LLM
Data
Observability
DevOps
Join 500+ developers and startup founders. No spam — just practical AWS, system design, and AI/ML content straight to your inbox.
Unsubscribe anytime. Your email is safe — no selling, ever.
Deep dives into system design, AWS, and AI/ML deployment.
A practical, production-minded guide to building a CI/CD pipeline for Spring Boot microservices, covering tests, quality gates, Docker image tagging, environment promotion, blue-green and canary deployments, rollbacks, and secrets.
FastAPI has become the go-to Python framework for serving ML models in production. Here's how to build async inference endpoints, stream LLM responses, and deploy them reliably on AWS.
Master the journey from code commit to production deployment. Learn pipeline design, quality gates, blue-green deployments, and secrets management for Spring Boot services.