Zerodha’s Infrastructure Upgrades

July 29, 2024

We recently sent SEBI a routine summary of significant infrastructure and software upgrades and performance improvements that we have made in the last 18 months. Here’s a copy of the document. It does not include product features and improvements, only major upgrades in the backend.

Hardware

Our OMS installations and exchange connectivity run on independent physical racks across data centers.

Pre-covid (March 2020):

- ~2 million clients
- 1 OMS (OmneNest, formerly Refinitiv) setup in Mumbai on a physical DC rack.
- 13 servers running the OMS and related services.
- 14 exchange leased lines (NSE, BSE, MCX primary + backup).

Post-covid (as of July 2024):

- ~14 million clients
- 8 independent OMS setups across two regions in Mumbai and Chennai (DR site). 6 fully live, 2 in stages of testing. Each independent setup is a “silo”, an architecture we pioneered during the rapid market growth during Covid, which is now an industry standard.
- 58 servers running independent instances of OMSes and related services.
- 44 exchange leased lines (NSE, BSE, MCX primary + backup).

Our user-facing applications that handle massive traffic run on the AWS public cloud across 3 different “availability zones” within Mumbai. We are in the advanced stages of extending this to the recently launched Hyderabad region. The cloud infrastructure auto-scales in real time based on traffic and requirements.

Software and technology improvements in the last ~18 months

This is a list of the major (and risky) software, R&D, and technology upgrades and improvements we have made over the last 18 months. It does not include user-facing feature and product improvements, nor the numerous minor and incremental improvements made in the same period.

Major behind-the-scenes upgrades and improvements in the last 18 months

Each item below is followed by its approximate R&D + testing + go-live time.

A large number of continuous, ongoing, regular updates to operating systems, databases, and numerous other software dependencies and infrastructure. Some are low-risk; many are tested and deployed over weeks and months based on risk and complexity.

NA

Induction of Parquet (open-source columnar, binary file format) across backend systems for exchanging large data dumps (formerly CSV). This has significantly reduced disk usage, brought type-safety to data, and delivered performance increases topping ~1000% across various systems. This was a high-risk exercise done in phases over many months.

12 months
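
To illustrate the kind of switch described above, here is a minimal, hedged sketch of moving an exchange dump from CSV to typed, compressed Parquet with pandas; the file names, column names, and dtypes are hypothetical.

```python
import pandas as pd

# Hypothetical end-of-day dump: read the CSV with explicit types instead of
# letting every field land as untyped text.
df = pd.read_csv(
    "eod_trades.csv",
    dtype={"client_id": "string", "symbol": "string", "qty": "int64"},
    parse_dates=["trade_date"],
)

# Write a compressed, columnar, type-safe Parquet file (requires pyarrow).
df.to_parquet("eod_trades.parquet", compression="zstd", index=False)

# Downstream consumers read only the columns they need, already typed.
subset = pd.read_parquet("eod_trades.parquet", columns=["client_id", "qty"])
```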

Moved 20TB+ of backoffice financial data from Postgres to ClickHouse, a modern, highly performant, and scalable database technology. This switch reduced data storage from 20TB to 4TB while storing hundreds of billions of financial records. Query performance has increased up to 1000x in many scenarios.

2 years
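
As a rough illustration of why such a move shrinks storage and speeds up queries (this is not Zerodha's actual schema; the table, columns, and host are made up, and the clickhouse-connect Python client is assumed), a columnar MergeTree table for immutable ledger rows might look like this:

```python
import clickhouse_connect  # pip install clickhouse-connect

client = clickhouse_connect.get_client(host="localhost")  # hypothetical host

# Columnar MergeTree table ordered by (client, date) so per-client ledger
# queries scan only a small, well-compressed slice of the data.
client.command("""
    CREATE TABLE IF NOT EXISTS ledger_entries (
        client_id  String,
        entry_date Date,
        voucher    String,
        debit      Decimal(18, 2),
        credit     Decimal(18, 2)
    )
    ENGINE = MergeTree
    ORDER BY (client_id, entry_date)
""")

result = client.query(
    "SELECT entry_date, debit, credit FROM ledger_entries WHERE client_id = %(cid)s",
    parameters={"cid": "AB1234"},
)
print(result.result_rows[:5])
```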

Rebuilt the end-of-day P&L and trade processes that consume end-of-day reports from various MIIs by inducting new open-source technologies (DuckDB, Parquet, ClickHouse). The processing time reduced from ~9 hours to ~4 hours, a ~120% improvement in performance.

6 months
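
A minimal sketch, with made-up file and column names, of how DuckDB can aggregate large end-of-day Parquet dumps straight off disk without loading everything into memory:

```python
import duckdb

con = duckdb.connect()  # in-memory DuckDB session

# Compute per-client realised P&L directly over the Parquet files on disk.
pnl = con.execute("""
    SELECT client_id,
           sum(sell_value - buy_value) AS realised_pnl
    FROM read_parquet('eod_trades_*.parquet')
    GROUP BY client_id
    ORDER BY realised_pnl DESC
""").df()

print(pnl.head())
```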

Refactored the DP process for handling corporate actions, reducing discrepancies that would go up to ~1 lakh items to just ~1,000.

Developed and deployed a new version of the distributed database querying and reporting system that serves billions of financial reports to customers instantly. We have open-sourced this system: Dung Beetle.

6 months

Set up a distributed, fault-tolerant message queue system (the open-source Kafka) for keeping track of real-time margin allocations (with NSCCL), significantly improving robustness and reducing the risk of this critical intraday process.

3 months
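
As an illustration of the durable-queue pattern described above (the broker address, topic name, and payload are hypothetical, and the actual setup may use a different client), a producer that records each margin-allocation event to Kafka could look like:

```python
import json
from kafka import KafkaProducer  # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",                 # hypothetical broker
    value_serializer=lambda v: json.dumps(v).encode(),
    acks="all",                                         # wait for full replication
)

# Each allocation event lands in the replicated log before the process moves
# on, so a crash cannot silently drop it.
producer.send("margin-allocations", {"client_id": "AB1234", "amount": 150000.0})
producer.flush()
```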

Refactored the depository ear-marking program to be real-time, adding fault-tolerance and support for automatic intraday reconciliation in case of errors. This is a critical intraday process.

1 month

Migration of immutable financial and ledger data to ClickHouse (away from MySQL and Postgres databases) for significant performance gains.

12 months

Generation of critical daily backend reports scaled to a ~250% performance improvement by inducting new open-source technologies (Pandas, Dask, DuckDB, and ClickHouse).

2 months

Transitioned numerous legacy workflows to Apache Airflow (open-source workflow management system) for robust, ops-team-friendly workflow orchestration, significantly reducing risk and handing control over to dedicated ops teams.

6 months
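
A minimal sketch of what one such workflow looks like once moved to Airflow (the DAG id, schedule, and commands are hypothetical; the schedule argument assumes Airflow 2.4+):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="eod_backoffice_reports",     # hypothetical workflow
    start_date=datetime(2024, 1, 1),
    schedule="30 18 * * 1-5",            # weekday evenings
    catchup=False,
) as dag:
    # Each step is visible, retryable, and re-runnable by the ops team from the UI.
    fetch = BashOperator(task_id="fetch_mii_files", bash_command="echo fetch")
    process = BashOperator(task_id="process_reports", bash_command="echo process")
    publish = BashOperator(task_id="publish_reports", bash_command="echo publish")

    fetch >> process >> publish
```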

Beginning-of-day margin limit computation and generation processes were refactored to finish in 1 hour (down from 2.5 hours, a performance gain of 150%) by rebuilding them using Pandas.

2 weeks

Beginning-of-day generation of ~90 million equity holdings records for IBTs (Kite); time reduced from 2 hours to 30 minutes (~300% performance gain) by rebuilding on new technologies: Airflow, DuckDB, and Pandas.

3 months

Beginning-of-day generation of client balances for IBTs reduced from 1 hour to 5 minutes (~1100% performance gain) by rebuilding the process to use ClickHouse.

6 months

End-of-day positions reversal process rewritten, reducing the time taken from around 30 minutes to 4 minutes (~650% performance gain).

3 weeks

All MII file import processes rebuilt while incorporating the new SEBI-mandated UDIFF format, adding a ~150% performance gain.

3 weeks

Developed a new, highly scalable, distributed system for generating and sending contract notes. We now compute, generate, digitally sign, and e-mail 1.5+ million contract note PDFs in ~25 minutes (down from 8 hours earlier, an ~1800% performance gain). This breakthrough has been published as a technical blog here. It is built on top of robust open-source technologies, including Typst, Haraka, and Nomad, among others. In the process, we developed and integrated a distributed job processing library, which we have open-sourced as Tasqueue.

12 months

Migrated to a horizontally auto-scaling SMTP setup (using the open source server, Haraka), significantly speeding up the sending of various kinds of e-mails to clients. This setup sends ~1.5 million e-mails with PDF attachments every night in ~25 minutes.

1 month
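
The production setup uses Haraka as the SMTP server; purely to illustrate the per-message work involved, here is a minimal Python sketch (the addresses, file name, and SMTP host are made up) of sending one e-mail with a PDF attachment:

```python
import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "reports@example.com"        # hypothetical sender
msg["To"] = "client@example.com"           # hypothetical recipient
msg["Subject"] = "Contract note"
msg.set_content("Please find your contract note attached.")

with open("contract_note.pdf", "rb") as f:
    msg.add_attachment(
        f.read(), maintype="application", subtype="pdf", filename="contract_note.pdf"
    )

# Hand the message to the (horizontally scaled) SMTP tier for delivery.
with smtplib.SMTP("localhost", 25) as smtp:
    smtp.send_message(msg)
```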

Rebuilt margin statement generation on the same distributed system, reducing the time from 6 hours to 30 minutes, a ~1100% performance gain.

3 months

Upgraded the old logging infrastructure to a new ClickHouse-based backend for streaming application logs for real-time monitoring.

2 weeks

Integrated (and upgraded in phases) the distributed open-source runtime environment and orchestration system, Nomad, across application groups for high availability and uniformity. This project has been done in phases over several years owing to risk.

3 years

Refactored Pigeon (in-house messaging system that handles billions of customer e-mails, SMS, WhatsApp, mobile notifications etc.) to use a new robust, distributed, horizontally scalable open-source database technology, ScyllaDB.

1 month

Ongoing migration in phases to a new open-source zero-trust network, VPN, and auth technology, Netbird.

1 year

Refactored XIRR/P&L computation over hundreds of billions of rows to a new ClickHouse-based sharded database architecture, allowing XIRR numbers to be served instantaneously on client request.

12 months
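
The upgrade above concerns where the cashflow rows live and how they are sharded; the XIRR figure itself is simply the rate r at which the net present value of the cashflows, the sum of cf_i / (1 + r)^(days_i / 365), becomes zero. A small, self-contained sketch of that computation with made-up cashflows:

```python
from datetime import date

def xirr(cashflows, lo=-0.99, hi=10.0, tol=1e-7):
    """Solve sum(cf / (1 + r) ** (days / 365)) = 0 for r by bisection."""
    t0 = cashflows[0][0]

    def npv(rate):
        return sum(cf / (1 + rate) ** ((d - t0).days / 365.0) for d, cf in cashflows)

    for _ in range(200):
        mid = (lo + hi) / 2
        if npv(lo) * npv(mid) <= 0:   # root lies in the lower half
            hi = mid
        else:                         # root lies in the upper half
            lo = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2

# Hypothetical flows: invest 1,00,000 and get back 1,12,000 a year later.
flows = [(date(2023, 1, 1), -100000.0), (date(2024, 1, 1), 112000.0)]
print(round(xirr(flows), 4))  # ~0.12, i.e. 12% annualised
```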

Migrated ~3TB of the master client database, legacy data structures, and management system from Postgres 11 to Postgres 16 (open-source database system), including numerous performance improvements.

6 months

Rebuilt the live market “candlestick” processing, storage, and serving system to use a distributed queue (Kafka) and ClickHouse as the backend database. The migration from the PG-Timescale (open source) database to ClickHouse reduced disk usage from ~1TB to ~120 GB (~90% reduction). The new database backend and the rewritten system increased the performance manifold, and brought in better fault tolerance and high availability. This system serves terabytes of data to clients daily at low latency.

12 months

Refactored the corporate action re-adjustment process on historical candlestick data from ~22 minutes to ~50 seconds (5900% performance gain) by integrating open-source technologies, including DuckDB, Parquet, and ClickHouse.

1 month
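
A minimal sketch, with a hypothetical symbol, split ratio, and column names, of the kind of bulk re-adjustment this process performs: scaling every candle before a split's ex-date using DuckDB over Parquet data.

```python
import duckdb

con = duckdb.connect()

# Load historical candles (hypothetical files) into a working table.
con.execute("CREATE TABLE candles AS SELECT * FROM read_parquet('candles_*.parquet')")

# Apply a hypothetical 1:5 split: prices before the ex-date are divided by 5.
con.execute("""
    UPDATE candles
    SET open = open / 5, high = high / 5, low = low / 5, close = close / 5
    WHERE symbol = 'EXAMPLE' AND ts < TIMESTAMP '2024-01-15 00:00:00'
""")

# Write the adjusted history back out as Parquet.
con.execute("COPY candles TO 'candles_adjusted.parquet' (FORMAT PARQUET)")
```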

Increased the number of daily automated test suites that run live tests against trading platforms, including new beginning-of-day, real-time, and end-of-day tests with early detection and warning. The system now incorporates a new open-source technology, playwright-py, that allows ops and risk management teams to write their own tests. The live test suite now comprises 860+ tests.

1 month
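
The style of live check that playwright-py lets ops and risk teams write, as a minimal sketch with a hypothetical URL and a deliberately simple assertion:

```python
from playwright.sync_api import sync_playwright

def test_login_page_is_up():
    # A basic liveness check: the page loads and renders a non-empty title.
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto("https://kite.example.com/login")   # hypothetical endpoint
        assert page.title() != ""
        browser.close()

if __name__ == "__main__":
    test_login_page_is_up()
```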

Finished the implementation of a new client-side caching system (which we have open-sourced as indexed-cache) for charting and backoffice applications, resulting in a snappier UX and a significant reduction in bandwidth usage and network requests for both our backend and end users.

3 months

Transitioned several MII processes to use MII APIs, moving away from legacy CSV file exchange systems.

2 months

Refactored the mutual fund SIP system that would distribute orders over ~10 hours before market open to take only ~40 minutes, a 1400% performance gain.

3 weeks

Significant updates and optimizations to the internal payment processing system and gateway, reducing intra-day payment errors. The system now processes ~2.5 million payouts with zero reconciliation errors on quarterly settlement days.

6 months

Refactored the new ClickHouse-based ledger and payments system, reducing the resource usage of certain large, critical queries from 1 GB of memory to 50 MB, a ~1900% performance gain.

12 months

Migration of historical, immutable order logs (~2TB) from 2019 to new ClickHouse shards with better data structure, resulting in ~60% disk savings and significantly improving “cold storage” lookup performance.

1 day

40%+ speed improvement, through a refactor, to the Metabase (open-source system) dashboard setup used for real-time intraday monitoring of order and risk parameters.

1 day

Migration and switching of IBT’s (Kite) primary, real-time data stores to the distributed, horizontally scalable open-source ScyllaDB system, an extremely high-risk exercise done in phases over many months.

12 months

Migration and switching of IBT (Kite) services and dependencies to a self-contained Nomad (open source run time environment and orchestration system) cluster, an extremely high-risk exercise.

12 months

New internal PKI infrastructure for intranet mTLS certificate-based communication.

12 months
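
A minimal sketch of what intranet mTLS looks like from the client side in Python (the certificate paths and internal host are hypothetical): the client trusts only the internal CA and presents its own certificate, and the server verifies it in turn.

```python
import ssl
import urllib.request

# Trust only the internal CA and present this service's own certificate.
ctx = ssl.create_default_context(cafile="/etc/pki/internal-ca.pem")   # hypothetical path
ctx.load_cert_chain(certfile="/etc/pki/service.crt", keyfile="/etc/pki/service.key")

# The server demands and verifies the client certificate before responding.
with urllib.request.urlopen("https://internal.example.net/health", context=ctx) as resp:
    print(resp.status)
```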

Full refactoring and migration of critical intraday, beginning-of-day, and end-of-day IBT workflows used by ops and risk management teams to a new architecture in the latest version of RunDeck (an open-source job orchestration system). A high-risk exercise done in phases over many months.

6 months

Complete, high-risk rewrite of legacy OMS (Order Management System) and real-time datastream connectors for significant tech-debt reduction and performance gains. A new Kafka (open-source distributed message queue) setup was introduced for intraday data persistence. We also developed a new system (open-sourced as kaf-relay) to aggregate realtime data from 8 different independent “silos” (physical DC racks) into the cloud environment. An extremely high-risk project that was tested and taken live in phases over many months.

8 months

7th full rewrite (in 9 years) of the “ticker” program that delivers real-time stock ticks intraday to millions of clients via our web and mobile IBTs. The new system uses significantly lower CPU and RAM resources and has been benchmarked to handle ~500k connections on a single process, a 900% improvement in capacity. On 4th June 2024 (election results day), the ticker served ~175 million market ticks/second to clients at peak. The new system streams more ticks to end users at lower latency.

6 months
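
The production ticker is a purpose-built system; purely to illustrate the single-process fan-out idea, here is a minimal asyncio sketch (the port, instrument token, and message format are made up; the websockets library, version 10.1+, is assumed) where one process holds many client connections and broadcasts every tick to all of them:

```python
import asyncio
import json

import websockets  # pip install websockets

CONNECTIONS = set()

async def handler(ws):
    # Register the client and keep the connection open until it drops.
    CONNECTIONS.add(ws)
    try:
        await ws.wait_closed()
    finally:
        CONNECTIONS.discard(ws)

async def broadcast_ticks():
    seq = 0
    while True:
        tick = json.dumps({"token": 256265, "ltp": 24000 + seq % 50})  # fake tick
        websockets.broadcast(CONNECTIONS, tick)  # fan out to every open socket
        seq += 1
        await asyncio.sleep(0.001)

async def main():
    async with websockets.serve(handler, "0.0.0.0", 8765):
        await broadcast_ticks()

if __name__ == "__main__":
    asyncio.run(main())
```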

Refactored the exchange data feed aggregator to use multiple exchange sources and auto-detect and auto-switch between feeds when a lag is detected, significantly improving the end-user experience. On volatile days, certain exchange sources witness a slowdown in real-time ticks in the derivatives segment.

2 months

Veto and Nudge: Nudge was moved to Nomad, removing the Docker dependence. Support was added for more fields and meta fields; fields from Kite exposed to Veto/Nudge now come from CMR via new meta fields, reducing duplication, centralizing the source of truth, and reducing KYC/KRA/compliance-related overhead. Reduced build and deployment times allow instant blocking of orders and management of RMS duties while informing the specifically affected clients. Rule scoring based on rule complexity reduces latency. CDSL TPIN auth is now handled at the Veto level.

2 years

Commissioning of MPLS + P2P lines between 8 independent on-prem DC racks (“silos”) and AWS cloud for private networking.

12 months

We are in the advanced stages of deploying the AWS Hyderabad (region) infra setup and conducting R&D on private networking with the AWS Mumbai region for intraday high availability and disaster recovery of IBTs.

12 months

Significant upgrades to our real-time monitoring and alerting stack (using the open-source technologies Grafana and VictoriaMetrics) to improve the performance of real-time monitoring and alerting across our entire infrastructure, servers, and software. At any given moment, this system has about 2.5 trillion metric data points available for analysis.

12 months



22 comments
  1. Ashu says:

    Zerodha, when will you get the feature of direct buy and sell on the TradingView chart? This is very useful for scalping; we can move our stoploss immediately.
    The chart is in ChartIQ, but multiple charts do not open in the same window. All brokers are offering it but you people are not offering it. I have been using it for the last 5 years, but now I am having problems with scalping, so please get it quickly so that I can continue my journey with full dedication. Zerodha has a big technical team, still why does this feature take so much time? Please bring it soon.

  2. Narayanan S says:

    It is interesting to know that you are using open-source software to develop a software product for trading. It clearly demonstrates your ability to leverage the technical human capital available in India. I would like to highlight another

  3. Ankit says:

    Hello Zerodha Team,

    I’ve noticed that our project uses both Rundeck and Apache Airflow as workflow systems. Given their similar functionalities, I’m curious about the rationale behind this decision. Could anyone explain the specific reasons for not consolidating on either one for all our use cases?

    Thanks in advance!

  4. raak says:

    “Bahot der ho gayi” (it is already quite late): Buch calls for mandatory T+0 settlement by qualified stock brokers.

    It’s great to see that Zerodha collected about Rs 1,500 crores in STT.

    I would like to know how many of your clients contributed to this amount.

    Kindly respond, Nithin ji…

  5. Nihar Ranjan Dhar says:

    Your service is good. Investment in stocks through the Zerodha app is easy, but MF investment through Coin is not that easy. Moreover, one needs separate fund arrangements. Can we view the entire portfolio, stocks and MF holdings, together?

    • Shubham says:

      Hi Nihar,

      1. As per regulations, funds in your trading account cannot be used for investing in mutual funds; payment for MF investments has to be made directly from your bank account.
      2. You can view your stock holdings as well as MF holdings together on Console.

  6. Satishkumar Amin says:

    I would like to invest in Mutual Funds. I have found the GROWW app very simple and easy for investing in Mutual Funds. Please let me know whether ZERODHA has a similar app for investing in Mutual Funds.

  7. RAAK says:

    At least the qualified stock brokers should be mandated to offer T+0 settlement, said the Securities and Exchange Board of India (Sebi) chairperson Madhabi Puri Buch at an event on July 30.

    Talking about the delay in making T+0 mandatory, Buch said, “Bahot der ho gayi na?” (it is already quite late, isn’t it?). She also said that the regulator has given enough time to the market players to adjust, and now Sebi should put up a proposal where the qualified stock brokers at least offer mandatory T+0 settlement. “Then it is the investors’ choice to opt for T+0 or not,” said the SEBI chairperson during the launch of a report on capital markets in Mumbai.

    An entity would be designated as a qualified stock broker based on the total number of active clients, available total assets of clients with the stock broker, trading volumes of the stock broker, and the end of day margin obligations of all clients of a stock broker.
    T+0 settlement refers to a settlement cycle where transactions are settled on the same day as the trade date, without any delay. The T+0 facility is in addition to the existing T+1 settlement cycle in the secondary markets for the equity cash segment.

    Sebi had launched a beta version of the T+0 settlement on an optional basis on March 28, 2024. Sebi had said that it will review the progress of the beta version at the end of three months and six months from the date of implementation and also determine further steps.

    Until now, the Indian securities markets have been operating on a T+1 settlement cycle. The regulator had shortened the settlement cycle to T+3 from T+5 in 2002 and subsequently to T+2 in 2003. It introduced T+1 in 2021 and implemented it in phases, with the final phase completed in January 2023.

    I request you to provide an option to download customer care call recordings made by clients, as it currently requires a support code. It would greatly help in resolving technical glitches and show how well Zerodha customer care responds when a glitch happens.

  8. Prashant says:

    It’s amazing to know how much hardware, infrastructure, and money is required to run such fantastic, high-speed, flawless services. Proud to be a part of it. Thanks for sharing!

  9. Raak says:

    Request you to provide an option to download customer care calls.

  10. Sridhar says:

    @Rekha Debnath – The bandwidth of the leased lines matters, not the number of leased lines. From a network perspective, for 7x the customers, the number of leased lines need not be proportional. The system, by design, has the ability to process all incoming requests.

  11. Shantanu says:

    Very good job Zerodha team. Keep it up

  12. Manish says:

    Any update on the app UI in the pipeline?

  13. Rekha Debnath says:

    Your customer base increased by 7 times, but your servers and exchange leased lines didn’t increase by 7 times. That may be a reason for your frequent outages during peak hours.

  14. V Rama Raju says:

    This is why we are using Zerodha irrespective of the propaganda going on across social media platforms. Great efforts by the team, thank you 👍👍

  15. Piyush says:

    I see that you guys have started using Kafka while you were big advocates of NATS in the past. Are there any advantages or pitfalls you found in using Kafka over NATS?

  16. Abdul Aziz Mazumder says:

    Thank you Zerodha team, it is very good. But please make one update: when we search for and add a company to the watchlist, it goes to the bottom of the list; kindly update it so that it stays at the top.

  17. Vishnu says:

    Religiously adopting and giving back to open source ❤️. You are amazing and incredibly great minds at work. Keep going 🤞