How to Design a Rate Limiter API | Learn System Design
Last Updated : 16 Mar, 2023
A Rate Limiter API is a tool that developers can use to define rules that specify how many requests can be made in a given time period and what actions should be taken when these limits are exceeded.
Rate limiting is an essential technique used in software systems to control the rate of incoming requests. It helps to prevent the overloading of servers by limiting the number of requests that can be made in a given time frame.
In this article, we will discuss the design of a rate limiter API, including its requirements, high-level design, and the algorithms used for rate limiting.
Why is rate limiting used?
- Avoid resource starvation due to a Denial of Service (DoS) attack.
- Ensure that servers are not overburdened. Rate limiting per user ensures fair and reasonable use without harming other users.
- Control the flow of information, for example, prevent a single worker from accumulating a backlog of unprocessed items while other workers are idle.
Requirements to Design a Rate Limiter API
The requirements of a rate limiter API can be classified into two categories: functional and non-functional.
Functional requirements to Design a Rate Limiter API:
- The API should allow the definition of multiple rate-limiting rules.
- The API should provide the ability to customize the response to clients when rate limits are exceeded.
- The API should allow for the storage and retrieval of rate-limit data.
- The API should implement proper error handling: when the request threshold is crossed for a single server or across a combination of servers, the client should receive a clear error message.
Non-functional requirements to Design a Rate Limiter API:
- The API should be highly available and scalable. Availability is the main pillar in the case of request fetching APIs.
- The API should be secure and protected against malicious attacks.
- The API should be easy to integrate with existing systems.
- The rate limiter should add as little latency as possible to the system, since performance is a key factor in any system.
High Level Design (HLD) to Design a Rate Limiter API
Where to place the Rate Limiter - Client Side or Server Side?
A rate limiter should generally be implemented on the server side rather than on the client side. This is because of the following points:
- Positional Advantage: The server is in a better position to enforce rate limits across all clients, whereas client-side rate limiting would require every client to implement their own rate limiter, which would be difficult to coordinate and enforce consistently.
- Security: Implementing rate limiting on the server side also provides better security, as it allows the server to prevent malicious clients from overwhelming the system with a large number of requests. If rate limiting were implemented on the client side, it would be easier for attackers to bypass the rate limit by just modifying or disabling the client-side code.
- Flexibility: Server-side rate limiting allows more flexibility in adjusting the rate limits and managing resources. The server can dynamically adjust the rate limits based on traffic patterns and resource availability, and can also prioritize certain types of requests or clients over others. This leads to better utilization of available resources and helps keep performance good.
HLD of Rate Limiter API - rate limiter placed at server side
The overall basic structure of a rate limiter is relatively simple. We just need a counter associated with each user to track how many requests are submitted in a particular timeframe. The request is rejected if the counter value hits the limit.
Memory Structure/Approximation
Now let's think about the data structure that might help us. Since we need fast retrieval of the counter values associated with each user, we can use a hash table of key-value pairs. The key would contain the hash value of each user ID, and the corresponding value would be a pair or structure holding the counter and the startTime, e.g.,
UserId -> {counter, startTime}
Let's say each UserId takes 8 bytes (long long) and the counter takes 2 bytes (a 16-bit unsigned integer counts up to 65,535, which covers a limit of around 50k). If we store only the minute and second for the time, it will also take 2 bytes. So in total, we would need 12 bytes to store each user's data.
Assuming an overhead of 10 bytes for each record in our hash table, and that we need to track at least 5 million users at any time, the total memory needed would be:
(12+10)bytes*5 million = 110 MB
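The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the constant and function names are illustrative, and the 10-byte overhead is the assumption stated in the text, not a measured figure.

```python
# Back-of-the-envelope memory estimate using the figures from the text.
USER_ID_BYTES = 8      # 64-bit user id
COUNTER_BYTES = 2      # 16-bit request counter
START_TIME_BYTES = 2   # minute and second of the window start
OVERHEAD_BYTES = 10    # assumed per-record hash-table overhead
USERS = 5_000_000      # users tracked at any one time


def estimate_memory_bytes(users: int = USERS) -> int:
    """Return the estimated total memory in bytes for `users` records."""
    record = USER_ID_BYTES + COUNTER_BYTES + START_TIME_BYTES + OVERHEAD_BYTES
    return record * users


total = estimate_memory_bytes()
print(f"{total / 1_000_000:.0f} MB")  # prints "110 MB"
```

Real hash-table overhead varies by language and implementation, so the result is best read as an order-of-magnitude figure.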
Key Components in the Rate Limiter
- Define the rate limiting policy: The first step is to determine the policy for rate limiting. This policy should include the maximum number of requests allowed per unit of time, the time window for measuring requests, and the actions to be taken when a limit is exceeded (e.g., return an error code or delay the request).
- Store request counts: The rate limiter API should keep track of the number of requests made by each client. One way to do this is to use a database, such as Redis or Cassandra, to store the request counts.
- Identify the client: The API must identify each client that makes a request. This can be done using a unique identifier such as an IP address or an API key.
- Handle incoming requests: When a client makes a request, the API should first check if the client has exceeded their request limit within the specified time window. If the limit has been reached, the API can take the action specified in the rate-limiting policy (e.g., return an error code). If the limit has not been reached, the API should update the request count for the client and allow the request to proceed.
- Set headers: When a request is allowed, the API should set appropriate headers in the response to indicate the remaining number of requests that the client can make within the time window, as well as the time at which the limit will be reset.
- Expose an endpoint: Finally, the rate limiter API should expose an endpoint for clients to check their current rate limit status. This endpoint can return the number of requests remaining within the time window, as well as the time at which the limit will be reset.
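As a rough illustration of the policy, request handling, and header steps above, the sketch below builds the status code and headers for one request. The `Policy` class and `build_response` function are hypothetical names. The `X-RateLimit-*` header names are a widely used convention rather than something this article specifies, while status 429 (Too Many Requests) and `Retry-After` are standard HTTP.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    """Hypothetical rate-limiting policy: `limit` requests per window."""
    limit: int
    window_seconds: int


def build_response(policy, used, window_start, now):
    """Return (status, headers) for a request.

    `used` is the request count in the current window, including this
    request. `window_start` and `now` are timestamps in seconds.
    """
    reset_at = window_start + policy.window_seconds
    remaining = max(0, policy.limit - used)
    headers = {
        "X-RateLimit-Limit": str(policy.limit),
        "X-RateLimit-Remaining": str(remaining),
        "X-RateLimit-Reset": str(int(reset_at)),
    }
    if used > policy.limit:
        # Limit exceeded: tell the client how long to wait before retrying.
        headers["Retry-After"] = str(max(0, int(reset_at - now)))
        return 429, headers
    return 200, headers
```

A status endpoint could return the same three values as a response body instead of headers.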
Where should we keep the counters?
Because disk-based database operations are slow, a traditional database is not a good choice for the counters. An in-memory cache such as Redis handles this well: it is fast and has built-in support for time-based expiration.
We can rely on two commands offered by in-memory stores such as Redis:
- INCR: This is used for increasing the stored counter by 1.
- EXPIRE: This is used for setting the timeout on the stored counter. This counter is automatically deleted from the storage when the timeout expires.
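The sketch below shows how these two commands can combine into a simple fixed-window counter. To stay self-contained it does not talk to a real Redis server: `InMemoryStore` is a hypothetical stand-in that mimics only the INCR and EXPIRE behaviour described above, and `allow_request` and the key format are illustrative names.

```python
import time


class InMemoryStore:
    """Hypothetical stand-in mimicking INCR and EXPIRE semantics."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> [value, expires_at or None]

    def _purge(self, key):
        # Delete the key if its timeout has passed.
        entry = self._data.get(key)
        if entry and entry[1] is not None and self._clock() >= entry[1]:
            del self._data[key]

    def incr(self, key):
        """Increase the counter by 1 (creating it at 0) and return it."""
        self._purge(key)
        entry = self._data.setdefault(key, [0, None])
        entry[0] += 1
        return entry[0]

    def expire(self, key, seconds):
        """Set a timeout on an existing key; return whether it existed."""
        self._purge(key)
        if key in self._data:
            self._data[key][1] = self._clock() + seconds
            return True
        return False


def allow_request(store, user_id, limit, window_seconds):
    """Fixed-window check: count the request, then compare to the limit."""
    key = f"rate:{user_id}"
    count = store.incr(key)
    if count == 1:
        # First request of the window starts the expiry timer.
        store.expire(key, window_seconds)
    return count <= limit
```

With a real server, the increment and the expiry are usually sent together atomically (for example in a transaction or a script), so that a failure between the two commands cannot leave a counter that never expires.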
In this design, client requests pass through a rate limiter middleware, which checks against the configured rate limits. The rate limiter module stores and retrieves rate limit data from a backend storage system. If a client exceeds a rate limit, the rate limiter module returns an appropriate response to the client.
Algorithms to Design a Rate Limiter API
Several algorithms are used for rate limiting, including:
- Token bucket
- Leaky bucket
- Sliding window logs
- Sliding window counters
Let's discuss each algorithm in detail:
Token Bucket
The token bucket algorithm is a simple algorithm that uses a fixed-size token bucket to limit the rate of incoming requests. The token bucket is filled with tokens at a fixed rate, and each request requires a token to be processed. If the bucket is empty, the request is rejected.
The token bucket algorithm can be implemented using the following steps:
- Initialize the token bucket with a fixed number of tokens.
- For each request, remove a token from the bucket.
- If there are no tokens left in the bucket, reject the request.
- Add tokens to the bucket at a fixed rate.
Thus, by allocating a bucket with a predetermined number of tokens for each user, we limit the number of requests per user per time unit. When the token count for a user drops to 0, we know that the user has reached the maximum number of requests for that timeframe. The bucket is refilled automatically when the new timeframe starts.
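The steps above can be sketched as a small class. This is a minimal sketch, not a production implementation: the `TokenBucket` name and its parameters are illustrative, and it assumes tokens are refilled lazily in proportion to elapsed time, which is one common way to "add tokens at a fixed rate" without a background timer.

```python
import time


class TokenBucket:
    """Token bucket holding up to `capacity` tokens."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self._clock = clock
        self.tokens = float(capacity)   # start with a full bucket
        self._last = clock()

    def allow(self):
        """Consume one token if available; return whether the request passes."""
        now = self._clock()
        elapsed = now - self._last
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self._last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Example: roughly 3 requests per minute for one user.
bucket = TokenBucket(capacity=3, refill_rate=3 / 60)
```

To limit per user, one bucket can be kept per user ID, for example in a dictionary keyed by the ID.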
Token bucket example with initial bucket token count of 3 for each user in one minute
Leaky Bucket
This algorithm is based on the idea that if the average rate at which water is poured into a bucket exceeds the rate at which the bucket leaks, the bucket will overflow.
The leaky bucket algorithm is similar to the token bucket algorithm, but instead of using a fixed-size token bucket, it uses a leaky bucket that empties at a fixed rate. Each incoming request adds to the bucket's depth, and if the bucket overflows, the request is rejected.
One way to implement this is using a queue, which corresponds to the bucket that will contain the incoming requests. Whenever a new request is made, it is added to the queue's end. If the queue is full at any time, then the additional requests are discarded.
The leaky bucket algorithm can be separated into the following concepts:
- Initialize the leaky bucket with a fixed depth and a rate at which it leaks.
- For each request, add to the bucket's depth.
- If the bucket's depth exceeds its capacity, reject the request.
- Leak the bucket at a fixed rate.
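A queue-based version of these concepts might look like the sketch below. The `LeakyBucket` class and its parameters are illustrative names, and the sketch assumes the leak is applied lazily whenever a request arrives rather than by a separate worker that processes queued requests.

```python
import time
from collections import deque


class LeakyBucket:
    """Queue-backed leaky bucket holding at most `capacity` requests."""

    def __init__(self, capacity, leak_rate, clock=time.monotonic):
        self.capacity = capacity
        self.leak_rate = leak_rate  # requests drained per second
        self._clock = clock
        self._queue = deque()
        self._last_leak = clock()

    def _leak(self):
        """Drain the requests that have leaked out since the last call."""
        now = self._clock()
        leaked = int((now - self._last_leak) * self.leak_rate)
        for _ in range(min(leaked, len(self._queue))):
            self._queue.popleft()
        if self._queue:
            # Keep any fractional leak time for the next call.
            self._last_leak += leaked / self.leak_rate
        else:
            # An empty bucket has nothing left to leak.
            self._last_leak = now

    def add(self, request):
        """Queue the request, or return False if the bucket is full."""
        self._leak()
        if len(self._queue) >= self.capacity:
            return False
        self._queue.append(request)
        return True
```

In a real system the drained requests would be handed to the backend at the leak rate, which is what smooths bursts into a steady flow.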
Leaky bucket example where the token count per user per minute is 3, which is the queue size.
Sliding Window Logs
Another approach to rate limiting is to use sliding window logs. This data structure involves a "window" of fixed size that slides along a timeline of events, storing information about the events that fall within the window at any given time.
The window can be thought of as a buffer of limited size that holds the most recent events or changes that have occurred. As new events or changes occur, they are added to the buffer, and old events that fall outside of the window are removed. This ensures that the buffer stays within its fixed size, and only contains the most recent events.
This rate-limiting approach keeps track of each client's requests in a time-stamped log. These logs are normally stored in a time-sorted hash set or table.
The sliding window logs algorithm can be implemented using the following steps:
- A time-sorted queue or hash table of timestamps within the time range of the most recent window is maintained for each client making the requests.
- Whenever a new request comes in (or, alternatively, when the queue reaches a certain length or after a certain number of minutes), timestamps older than the current window are looked up and removed.
- The timestamp of the incoming request is added to the queue. If the number of elements in the queue does not exceed the authorised count, the request proceeds; otherwise an exception is triggered.
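The steps above can be sketched with one double-ended queue of timestamps per client. The `SlidingWindowLog` class is an illustrative name, and for simplicity the sketch returns False instead of raising an exception when the limit is exceeded.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLog:
    """Keeps a time-sorted log of request timestamps per client."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self._clock = clock
        self._logs = defaultdict(deque)  # client_id -> timestamps, oldest first

    def allow(self, client_id):
        now = self._clock()
        log = self._logs[client_id]
        # Drop timestamps that have fallen out of the current window.
        while log and log[0] <= now - self.window:
            log.popleft()
        # Record this request, then compare the log size to the limit.
        log.append(now)
        return len(log) <= self.limit
```

Because the timestamp is logged before the check, as in the steps above, rejected requests also count toward the window. A variant that logs only accepted requests is also common and uses less memory under heavy rejection.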
Sliding window logs in a timeframe of 1 minute
Sliding Window Counters
The sliding window counter algorithm is an optimization over sliding window logs. As we saw in the previous approach, memory usage is high: to manage many users or large window timeframes, every request timestamp must be kept for the whole window, which eventually uses a huge amount of memory. Removing many timestamps older than a particular timeframe also adds to the time complexity.
To reduce surges of traffic, this algorithm accounts for a weighted value of the previous window's request based on timeframe. If we have a one-minute rate limit, we can record the counter for each second and calculate the sum of all counters in the previous minute whenever we get a new request to determine the throttling limit.
The sliding window counters can be separated into the following concepts:
- Remove all counters which are more than 1 minute old.
- If a request comes which falls in the current bucket, the counter is increased.
- If a request comes when the current bucket has reached its throttle limit, the request is blocked.
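A minimal sketch of the per-second counter variant described above follows. The `SlidingWindowCounter` class and its parameters are illustrative names. The sketch assumes the throttle limit applies to the sum of all counters in the window, and it does not implement the weighted previous-window variant.

```python
import time


class SlidingWindowCounter:
    """Per-client counters bucketed by time, summed over a sliding window."""

    def __init__(self, limit, window_seconds=60, bucket_seconds=1, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.bucket = bucket_seconds
        self._clock = clock
        self._counters = {}  # client_id -> {bucket_start: count}

    def allow(self, client_id):
        now = self._clock()
        current = int(now // self.bucket) * self.bucket
        buckets = self._counters.setdefault(client_id, {})
        # Remove counters that are older than the window.
        for start in [s for s in buckets if s <= current - self.window]:
            del buckets[start]
        # Block the request if the window total has reached the limit.
        if sum(buckets.values()) >= self.limit:
            return False
        # Otherwise increase the counter of the current bucket.
        buckets[current] = buckets.get(current, 0) + 1
        return True
```

Each client needs at most `window_seconds / bucket_seconds` counters, regardless of how many requests arrive, which is where the memory saving over sliding window logs comes from.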
Sliding window counters with a timeframe of 20 seconds
Examples of Rate Limiting APIs used worldwide
- Google Cloud Endpoints: It is a platform for building APIs that includes a built-in rate limiter to help prevent excessive API usage.
- AWS API Gateway: Amazon Web Services (AWS) API Gateway includes a feature called Usage Plans that allows for rate limiting and throttling of API requests.
- Akamai API Gateway: Akamai API Gateway is a cloud-based platform that includes a rate limiter feature for controlling API requests.
- Cloudflare Rate Limiting: Cloudflare's Rate Limiting feature helps prevent DDoS attacks and other types of abusive traffic by limiting the number of requests that can be made to an API.
- Redis: It is an in-memory data structure store that can be used as a database, cache, and message broker. It includes several features that make it useful for implementing a rate limiter, such as its ability to store data in memory for fast access and its support for atomic operations.