RS
Ravi Shukla

Senior Java + AI engineer. Kafka, RAG, distributed systems.

© 2026 Ravi Kant Shukla. All rights reserved.

Writing

System design, AWS, microservices, AI/ML deployment — practical posts for engineers who want to build production-grade systems.

devops

CI/CD for Spring Boot Microservices: From Commit to Production

A practical, production-minded guide to building a CI/CD pipeline for Spring Boot microservices, covering tests, quality gates, Docker image tagging, environment promotion, blue-green and canary deployments, rollbacks, and secrets.

DevOps · CI/CD · Spring Boot
Dec 1, 2026 · 19 min read
ai-ml

Serving ML Models in Production with FastAPI: Async Inference, Streaming, and Deployment

FastAPI has become the go-to Python framework for serving ML models in production. Here's how to build async inference endpoints, stream LLM responses, and deploy them reliably on AWS.

FastAPI · Machine Learning · Python
May 25, 2026 · 20 min read
devops

CI/CD for Spring Boot Microservices: From Commit to Production

Master the journey from code commit to production deployment. Learn pipeline design, quality gates, blue-green deployments, and secrets management for Spring Boot services.

CI/CD · GitHub-Actions · DevOps
May 25, 2026 · 19 min read
system-design

Designing for Scale: From 0 to 1 Million Requests/Day

A practical system design walkthrough for scaling a product from a single server to 1 million requests per day, covering load balancing, caching, database bottlenecks, queues, observability, and operational trade-offs.

System Design · Scalability · Load Balancing
May 11, 2026 · 18 min read
ai-ml

Building a Production RAG Pipeline with LangChain4j + Spring Boot

A complete guide to building retrieval-augmented generation (RAG) systems in Java using LangChain4j. Learn chunking strategies, embedding pipelines, vector store integration, and how to ship RAG to production.

LangChain4j · RAG · Spring Boot
May 5, 2026 · 18 min read
system-design

Designing a URL Shortener on AWS: From Zero to Production

A complete walkthrough of designing a production-ready URL shortener on AWS — covering hashing strategies, database selection, caching, and scaling to billions of redirects.

AWS · System Design · DynamoDB
Mar 10, 2024 · 12 min read
ai-ml

Deploy Your ML Model on AWS Lambda: The Complete Production Guide

Step-by-step guide to packaging a scikit-learn or PyTorch model as a Lambda function — covering cold starts, container images, model versioning, and A/B testing on AWS.

AWS Lambda · ML Deployment · Docker
Feb 20, 2024 · 14 min read