From Stock Market CD to Scalable API

Earlier this year, I got my hands on a “CSE data CD” from the Colombo Stock Exchange (CSE) — a library of historical market data packed into a bunch of Excel files. As a developer, I couldn’t resist the opportunity to build something useful out of it.

The files were messy — different formats, multiple sheets, and inconsistent data across years (newer files carried more detail, older ones less). But the potential was clear: a clean, fast, and shareable API that others could use to access historical stock market data.

So, I rolled up my sleeves and started turning this offline dataset into a real-time, rate-limited API using Java, Redis, and SparkJava.


The Data Pipeline

At its core, this project is a data pipeline that takes stock data from Excel sheets and makes it queryable over HTTP. Here’s how it works:

1. Ingesting Excel Data

The first challenge was reading all those Excel files. I used Apache POI to parse them, building a flexible Excel reader that could handle multiple formats and structures. Each row was turned into a simple key-value map, where headers became keys and cell values became data points.

My first focus was the “Data 24” files — the daily share prices of all public companies for the year 2024.
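The heart of that reader is the header-to-value mapping. The sketch below shows just that step in plain Java; in the real pipeline, Apache POI (via a Workbook, Row, and DataFormatter) would produce the raw string lists, and the class and method names here are my own illustration, not the project's actual code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of the row-to-map step: headers become keys, cells become values.
// Apache POI would supply the two string lists; this class only shows the
// mapping logic, which tolerates short rows (a common quirk in old sheets).
public class ExcelRowMapper {

    /** Pairs each header with the cell in the same column; missing cells map to "". */
    public static Map<String, String> rowToMap(List<String> headers, List<String> cells) {
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < headers.size(); i++) {
            String value = i < cells.size() ? cells.get(i) : "";
            row.put(headers.get(i).trim(), value.trim());
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, String> row = rowToMap(
                List.of("Symbol", "Date", "Close"),
                List.of("JKH.N0000", "2024-01-02", "195.50"));
        System.out.println(row);
    }
}
```

Keeping rows as plain string maps means one reader can serve every file format, old or new — sheets with extra columns just produce bigger maps.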

2. Storing in Redis with RedisJSON

After parsing, I needed a fast and flexible store — and Redis was a perfect fit.

  • I serialized each row into JSON using Gson.
  • Then stored them in Redis using the RedisJSON module, which allows not only fast access but also partial reads and updates (great for querying just a few fields).

Redis also makes it easy to manage TTLs and perform atomic operations, which became useful later in the rate-limiting phase.
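Concretely, each row needs a Redis key and a JSON body. The sketch below shows one plausible shape for that step; the "stock:<symbol>:<date>" key scheme is my own assumption, the hand-rolled serializer stands in for Gson's toJson, and the actual write would go through RedisJSON's JSON.SET command.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the storage step: derive a Redis key from the row, then
// serialize the row to JSON. The key layout "stock:<symbol>:<date>" is an
// illustrative guess; in the real pipeline Gson handles serialization and
// the document is written with RedisJSON's JSON.SET.
public class StockDocument {

    public static String keyFor(Map<String, String> row) {
        return "stock:" + row.get("Symbol") + ":" + row.get("Date");
    }

    /** Minimal JSON serialization of a flat string map (stand-in for Gson). */
    public static String toJson(Map<String, String> row) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : row.entrySet()) {
            if (!first) sb.append(",");
            sb.append("\"").append(e.getKey()).append("\":\"").append(e.getValue()).append("\"");
            first = false;
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("Symbol", "JKH.N0000");
        row.put("Date", "2024-01-02");
        row.put("Close", "195.50");
        System.out.println(keyFor(row));
        System.out.println(toJson(row));
        // With a Redis client, the write is effectively: JSON.SET <key> $ <json>
    }
}
```

A per-symbol, per-date key makes lookups O(1), and because the value is a RedisJSON document rather than an opaque string, individual fields (say, just the closing price) can be read without deserializing the whole row.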

3. Exposing Data via REST API

With the data safely in Redis, I spun up a REST API using SparkJava, a lightweight Java web framework.

The endpoint (e.g., /stock) accepts parameters like stock symbol and date, and returns the corresponding data from Redis. Simple, fast, and clean.
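In SparkJava, a route is a one-liner — get("/stock", (req, res) -> ...). Framework aside, the handler logic looks roughly like the sketch below; the parameter names (symbol, date), the key scheme, and the error bodies are illustrative assumptions, and a plain function stands in for the Redis lookup so the sketch runs on its own.

```java
import java.util.Map;
import java.util.function.Function;

// Framework-free sketch of the /stock handler. In SparkJava this body would
// sit inside get("/stock", (req, res) -> ...); here the query parameters
// arrive as a map and `store` stands in for the Redis read (JSON.GET).
public class StockHandler {

    public record Response(int status, String body) {}

    public static Response handle(Map<String, String> query, Function<String, String> store) {
        String symbol = query.get("symbol");
        String date = query.get("date");
        if (symbol == null || date == null) {
            return new Response(400, "{\"error\":\"symbol and date are required\"}");
        }
        String json = store.apply("stock:" + symbol + ":" + date); // Redis lookup
        if (json == null) {
            return new Response(404, "{\"error\":\"not found\"}");
        }
        return new Response(200, json);
    }

    public static void main(String[] args) {
        // Fake single-document store, standing in for Redis.
        Function<String, String> store =
                key -> key.equals("stock:JKH.N0000:2024-01-02") ? "{\"Close\":\"195.50\"}" : null;
        System.out.println(handle(Map.of("symbol", "JKH.N0000", "date", "2024-01-02"), store).status());
        System.out.println(handle(Map.of("symbol", "JKH.N0000"), store).status());
    }
}
```

Because the stored value is already JSON, the handler can return it verbatim — no re-serialization on the hot path.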


Rate Limiting the API

I wanted the API to be open — but not too open. To prevent abuse and ensure fair usage, I implemented rate limiting as middleware.

Redis again came in handy here. I implemented multiple algorithms to experiment and see which worked best:

  • Fixed Window
  • Sliding Window Log
  • Sliding Window Counter
  • Token Bucket
  • Leaky Bucket

These algorithms use Redis’s atomic operations (INCR, SET, EXPIRE) to efficiently track and control request counts.

Requests within the allowed limit are forwarded to the API logic. If a client exceeds their quota, the API returns a 429 Too Many Requests response — a friendly nudge to slow down.
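As one concrete example, the fixed-window variant maps directly onto Redis: INCR a per-client, per-window counter and EXPIRE it on first increment. The sketch below uses an in-memory map in place of Redis so it is self-contained, and the limit and window sizes are arbitrary, not the API's real quotas.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Fixed-window rate limiter. Each client gets a counter per time window;
// the first request in a new window resets it. In Redis this is INCR on a
// key like "rl:<client>:<window>" plus EXPIRE on the first increment; a
// ConcurrentHashMap stands in here so the sketch runs on its own.
public class FixedWindowLimiter {

    private final int limit;          // max requests per window
    private final long windowMillis;  // window length
    private final Map<String, long[]> counters = new ConcurrentHashMap<>(); // {window, count}

    public FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    /** Returns true if the request is allowed, false if the client should get a 429. */
    public synchronized boolean allow(String clientId, long nowMillis) {
        long window = nowMillis / windowMillis;
        long[] state = counters.get(clientId);
        if (state == null || state[0] != window) {
            state = new long[] {window, 0};  // new window: reset the counter
            counters.put(clientId, state);
        }
        state[1]++;
        return state[1] <= limit;
    }

    public static void main(String[] args) {
        FixedWindowLimiter limiter = new FixedWindowLimiter(3, 60_000);
        for (int i = 1; i <= 4; i++) {
            System.out.println("request " + i + " allowed: " + limiter.allow("client-1", 0));
        }
        // The fourth request would be answered with 429 Too Many Requests.
    }
}
```

Fixed windows are the simplest of the five algorithms but allow bursts at window boundaries — which is exactly why comparing them against the sliding-window and bucket variants is worthwhile.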


Why Redis?

Redis wasn’t just a cache here — it became the core of the system:

  • Blazing-fast in-memory storage made lookups and writes feel instant.
  • RedisJSON added structure and flexibility to the data.
  • Atomic counters made rate limiting across distributed clients safe and efficient.

Final Thoughts

This project was a rewarding mix of real-world data wrangling and backend engineering. It started with dusty data files and ended up as a rate-limited, production-ready API.

Whether you’re building something similar or just exploring how to glue different technologies together, I hope this gives you ideas — and maybe a shortcut or two.

👉 You can check out the full code on GitHub and try it out yourself.


Tech Stack

  • Java
  • Apache POI
  • Redis + RedisJSON
  • SparkJava