From Stock Market CD to Scalable API
Earlier this year, I got my hands on a “CSE data CD” from the Colombo Stock Exchange (CSE): a library of historical market data packed into a bunch of Excel files. As a developer, I couldn’t resist the opportunity to build something useful out of it.
The files were messy: different formats, multiple sheets, and inconsistent data across years (newer files carried more detail than older ones). But the potential was clear: a clean, fast, and shareable API that others could use to access historical stock market data.
So, I rolled up my sleeves and started turning this offline dataset into a real-time, rate-limited API using Java, Redis, and SparkJava.
The Data Pipeline
At its core, this project is a data pipeline that takes stock data from Excel sheets and makes it queryable over HTTP. Here’s how it works:
1. Ingesting Excel Data
The first challenge was reading all those Excel files. I used Apache POI to parse them, building a flexible Excel reader that could handle multiple formats and structures. Each row was turned into a simple key-value map, where headers became keys and cell values became data points.
My first focus was the “Data 24” files — the daily share prices of all public companies for the year 2024.
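The reader itself can be sketched roughly like this. This is a minimal version assuming Apache POI is on the classpath, a header row sits at the top of the first sheet, and data rows follow it; the exact layout of the CSE files isn’t shown in the post, so treat the structure here as illustrative.

```java
import org.apache.poi.ss.usermodel.*;
import java.io.FileInputStream;
import java.util.*;

public class ExcelIngest {

    // Read one sheet and turn each data row into a header -> value map.
    public static List<Map<String, String>> readRows(String path) throws Exception {
        List<Map<String, String>> rows = new ArrayList<>();
        try (Workbook wb = WorkbookFactory.create(new FileInputStream(path))) {
            Sheet sheet = wb.getSheetAt(0);
            Row headerRow = sheet.getRow(sheet.getFirstRowNum());

            // DataFormatter renders every cell as the text you'd see in Excel,
            // which sidesteps numeric-vs-string cell-type headaches.
            DataFormatter fmt = new DataFormatter();
            List<String> headers = new ArrayList<>();
            for (Cell c : headerRow) headers.add(fmt.formatCellValue(c));

            for (int i = headerRow.getRowNum() + 1; i <= sheet.getLastRowNum(); i++) {
                Row row = sheet.getRow(i);
                if (row == null) continue; // skip fully empty rows
                Map<String, String> record = new LinkedHashMap<>();
                for (int j = 0; j < headers.size(); j++) {
                    // formatCellValue handles null cells by returning ""
                    record.put(headers.get(j), fmt.formatCellValue(row.getCell(j)));
                }
                rows.add(record);
            }
        }
        return rows;
    }
}
```

`WorkbookFactory.create` detects both `.xls` and `.xlsx`, which helps when the same dataset mixes old and new Excel formats.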
2. Storing in Redis with RedisJSON
After parsing, I needed a fast and flexible store — and Redis was a perfect fit.
- I serialized each row into JSON using Gson.
- Then stored them in Redis using the RedisJSON module, which allows not only fast access but also partial reads and updates (great for querying just a few fields).
Redis also makes it easy to manage TTLs and perform atomic operations, which became useful later in the rate-limiting phase.
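Putting those pieces together, the storage step looks roughly like this. The key scheme (`stock:{symbol}:{date}`) and the field names are my assumptions, not from the post; note that Jedis’s default JSON mapper is Gson-based, so handing it the parsed map gives the same Gson serialization the post describes.

```java
import redis.clients.jedis.UnifiedJedis;
import redis.clients.jedis.json.Path2;
import java.util.Map;

public class RedisStore {
    public static void main(String[] args) {
        try (UnifiedJedis jedis = new UnifiedJedis("redis://localhost:6379")) {
            // One parsed row from the Excel reader (field names are illustrative).
            Map<String, Object> row = Map.of(
                    "symbol", "LOLC.N0000",
                    "date", "2024-03-15",
                    "close", 512.25);

            String key = "stock:LOLC.N0000:2024-03-15"; // assumed key scheme

            // JSON.SET at the document root; Jedis serializes the map with Gson.
            jedis.jsonSet(key, Path2.ROOT_PATH, row);

            // Partial read: fetch just the closing price with a JSONPath,
            // without pulling the whole document.
            Object close = jedis.jsonGet(key, Path2.of("$.close"));
            System.out.println(close);

            // Optional TTL on the key; the same mechanism powers
            // the rate-limiting counters later on.
            jedis.expire(key, 24 * 60 * 60);
        }
    }
}
```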
3. Exposing Data via REST API
With the data safely in Redis, I spun up a REST API using SparkJava, a lightweight Java web framework.
The endpoint (e.g., /stock) accepts parameters like stock symbol and date, and returns the corresponding data from Redis. Simple, fast, and clean.
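A minimal sketch of that endpoint, wiring SparkJava to the Redis documents stored above. The query-parameter names, port, and key scheme are assumptions for illustration:

```java
import static spark.Spark.*;
import redis.clients.jedis.UnifiedJedis;

public class StockApi {
    private static final UnifiedJedis jedis =
            new UnifiedJedis("redis://localhost:6379");

    public static void main(String[] args) {
        port(8080);

        // GET /stock?symbol=LOLC.N0000&date=2024-03-15
        get("/stock", (req, res) -> {
            String symbol = req.queryParams("symbol");
            String date = req.queryParams("date");
            if (symbol == null || date == null) {
                res.status(400);
                return "{\"error\":\"symbol and date are required\"}";
            }

            // Look up the pre-loaded JSON document (key scheme is an assumption).
            Object doc = jedis.jsonGet("stock:" + symbol + ":" + date);
            if (doc == null) {
                res.status(404);
                return "{\"error\":\"no data for that symbol/date\"}";
            }
            res.type("application/json");
            return doc; // Spark renders the returned object via toString()
        });
    }
}
```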
Rate Limiting the API
I wanted the API to be open — but not too open. To prevent abuse and ensure fair usage, I implemented rate limiting as middleware.
Redis again came in handy here. I implemented multiple algorithms to experiment and see which worked best:
- Fixed Window
- Sliding Window Log
- Sliding Window Counter
- Token Bucket
- Leaky Bucket
These algorithms use Redis’s atomic operations (INCR, SET, EXPIRE) to efficiently track and control request counts.
Requests within the allowed limit are forwarded to the API logic. If a client exceeds their quota, the API returns a 429 Too Many Requests response — a friendly nudge to slow down.
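To make the simplest of these concrete, here is the fixed-window algorithm as an in-memory sketch. In the real service the counter lives in Redis so it is shared across instances (INCR bumps the count, EXPIRE resets the window); here a plain map and an injected clock stand in for those two operations, purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class FixedWindowLimiter {
    private final int limit;          // max requests allowed per window
    private final long windowMillis;  // window length in milliseconds
    private final Map<String, Integer> counts = new HashMap<>();
    private final Map<String, Long> windowStart = new HashMap<>();

    public FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    // Returns true if the request is allowed; false means the caller
    // should respond with 429 Too Many Requests.
    public synchronized boolean allow(String clientId, long nowMillis) {
        long start = windowStart.getOrDefault(clientId, -1L);
        if (start < 0 || nowMillis - start >= windowMillis) {
            // Window rolled over: reset the counter.
            // This is what EXPIRE achieves in the Redis version.
            windowStart.put(clientId, nowMillis);
            counts.put(clientId, 0);
        }
        int count = counts.merge(clientId, 1, Integer::sum); // Redis INCR
        return count <= limit;
    }
}
```

The trade-off, and the reason the post tries several algorithms: a fixed window allows up to 2× the limit in a burst straddling a window boundary, which the sliding-window and bucket variants smooth out.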
Why Redis?
Redis wasn’t just a cache here — it became the core of the system:
- Blazing-fast in-memory storage made lookups and writes feel instant.
- RedisJSON added structure and flexibility to the data.
- Atomic counters made rate limiting across distributed clients safe and efficient.
Final Thoughts
This project was a rewarding mix of real-world data wrangling and backend engineering. It started with dusty data files and ended up as a rate-limited, production-ready API.
Whether you’re building something similar or just exploring how to glue different technologies together, I hope this gives you ideas — and maybe a shortcut or two.
👉 You can check out the full code on GitHub and try it out yourself.
Tech Stack
- Java
- Apache POI
- Redis + RedisJSON
- SparkJava