API Rate Limiter Design
1. Overview
What is a rate limiter?
- because resources are limited, and because we also want to shut down abusive behavior, we need some kind of throttling or rate limiting mechanism so that only a bounded number of requests reach our service and we can respond to all of them
- a rate limiter caps the number of events an entity can perform within a particular time window, and blocks further requests once that cap is reached
why do we need rate limiting?
Protect services against abusive behavior targeting the application layer, such as
- DoS attacks
- brute-force attempts against credit card transactions
why do we need such protection?
- these attacks are a barrage of HTTP/S requests that may look like they come from real users but are actually generated by machines
- such attacks are therefore harder to detect and can more easily bring down a service, application, or API
make a service and its APIs more reliable
- e.g. protect against misbehaving clients/scripts
security
- e.g. limit second-factor authentication attempts
prevent abusive behavior and bad design practices
keep costs and resource usage under control
revenue
- some revenue models are based on rate limits (e.g. higher limits for paid tiers)
eliminate spikiness in traffic
2. Requirements and Goals
- functional
- limit the number of requests an entity can send to an API within a time window
- APIs are accessible through a cluster, so the rate limit should be considered across different servers
- users should get an error message whenever the defined threshold is crossed within a single server or across a combination of servers
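As a concrete illustration of the last requirement, a throttled call is often answered with HTTP 429 (Too Many Requests) plus a hint about when to retry. The function and field names below are illustrative assumptions, not part of the original requirements; a minimal sketch in Python:

```python
from typing import Any, Dict


def build_throttle_response(retry_after_seconds: int) -> Dict[str, Any]:
    """Illustrative payload for a request rejected by the rate limiter.

    HTTP 429 (Too Many Requests) with a Retry-After header is a common
    convention; the exact shape of the body is an assumption here.
    """
    return {
        "status": 429,
        "headers": {"Retry-After": str(retry_after_seconds)},
        "body": {
            "error": "rate_limit_exceeded",
            "message": "Too many requests. Please retry later.",
        },
    }
```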
3. Thoughts
we could keep a table that stores request information;
every entry would look like
- userId
- apiName
- accessTime
- apiParameters
we could then query this table, sorted by accessTime, to get the information we need
there should also be a caching layer for the hot data. The most important piece is: for a specific user and a specific API with a throttle limit of n, the time at which that user made their nth most recent request. Tracking that single timestamp tells us right away whether a new request still falls inside the window (see the sketch below).
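To make this concrete, here is a minimal Python sketch of such a log entry and of the cached check, assuming Unix timestamps in seconds and a 60-second window; `RequestLogEntry` and `should_throttle` are illustrative names, not part of any existing library.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class RequestLogEntry:
    """One row of the (hypothetical) request-log table described above."""
    user_id: str
    api_name: str
    access_time: float                      # Unix timestamp in seconds
    api_parameters: Dict[str, Any] = field(default_factory=dict)


def should_throttle(recent_access_times: List[float],
                    limit: int,
                    window_seconds: float = 60.0,
                    now: Optional[float] = None) -> bool:
    """Decide whether a new request should be throttled.

    `recent_access_times` is the cached list of this user's most recent
    access times for this API, newest first. If the user has already made
    `limit` requests and the oldest of those (the nth most recent) is still
    inside the window, the new request is rejected.
    """
    now = time.time() if now is None else now
    if len(recent_access_times) < limit:
        return False
    nth_most_recent = recent_access_times[limit - 1]
    return (now - nth_most_recent) < window_seconds
```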
4. Design
4.1 High Level
- Clients make calls to our web servers
- when a request comes in, the web server first checks with the rate limiter service to decide whether the request will be served or throttled
- if the rate limiter says the request should not be blocked, the web server then forwards it to the API servers
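A minimal sketch of this web-server-side flow, assuming a `rate_limiter_client` with an `is_allowed` method and an `api_backend` with a `call` method; both interfaces are illustrative assumptions, not part of the original design.

```python
from typing import Any, Dict


def handle_request(rate_limiter_client, api_backend,
                   user_id: str, api_name: str,
                   payload: Dict[str, Any]) -> Dict[str, Any]:
    """Web-server-side flow: consult the rate limiter, then forward or reject."""
    if not rate_limiter_client.is_allowed(user_id, api_name):
        # Threshold crossed: surface an error instead of calling the API servers.
        return {"status": 429, "body": {"error": "rate_limit_exceeded"}}
    # Allowed: hand the request off to the API servers.
    return api_backend.call(api_name, user_id=user_id, payload=payload)
```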
4.2 Basic System Design and Algo
target
- limit the number of requests per user
- keep a count representing how many requests the user has made
- keep a timestamp of when we started counting the requests
use a hashtable to store the info (a fixed window counter; see the sketch after this list)
- key: userId
- value: count + startTime
- the count alone would be enough if we don't need detailed metrics, or if something else is tracking them
- based on the count and the configured limit, we either increment the count, reset the count, or reset the start time
- one issue with storing only the start time is that it can let through up to twice the configured number of requests around a window boundary: a burst at full capacity at the end of one window, followed immediately by another full-capacity burst at the start of the next
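A minimal sketch of this fixed-window approach, assuming a single-process, in-memory store; the class and method names are illustrative, and a real deployment would keep this state in a shared store so the limit holds across servers.

```python
import time
from dataclasses import dataclass
from typing import Dict


@dataclass
class WindowState:
    count: int
    start_time: float


class FixedWindowRateLimiter:
    """Minimal in-memory fixed-window counter (single process, not thread safe)."""

    def __init__(self, limit: int, window_seconds: float = 60.0):
        self.limit = limit
        self.window_seconds = window_seconds
        self._windows: Dict[str, WindowState] = {}

    def allow(self, user_id: str) -> bool:
        now = time.time()
        state = self._windows.get(user_id)
        if state is None or now - state.start_time >= self.window_seconds:
            # New user or expired window: reset the start time and the count.
            self._windows[user_id] = WindowState(count=1, start_time=now)
            return True
        if state.count < self.limit:
            state.count += 1
            return True
        return False          # cap reached inside the current window
```

The window-boundary issue described above is inherent to this scheme: the counter records how many requests fell in the window but says nothing about how they were distributed across it.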
- to fix the boundary problem, we need to store the timestamp of each request so that we can keep a sliding window
- use a Redis sorted set (one per user, or per user + API)
- steps when a new request comes in (see the sketch after this list)
  - remove from the sorted set all timestamps older than currentTime - 1 min (supposing that is the configured window)
  - count the total number of elements left in the sorted set
  - reject the request if that count has already reached the throttling limit
  - otherwise insert the current time into the sorted set and accept the request
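A minimal sketch of this sliding-window check using the redis-py client; the key scheme and the `is_allowed` wrapper are assumptions for illustration, and a production version would typically run these steps atomically (e.g. in a Lua script or a MULTI/EXEC transaction).

```python
import time

import redis  # redis-py client; assumes a reachable Redis instance

WINDOW_SECONDS = 60          # assumed window
LIMIT = 100                  # assumed per-user throttle limit

r = redis.Redis(host="localhost", port=6379)


def is_allowed(user_id: str, api_name: str) -> bool:
    """Sliding-window check backed by a Redis sorted set of request timestamps."""
    key = f"rl:{user_id}:{api_name}"      # illustrative key scheme
    now = time.time()

    # 1. Drop timestamps that have fallen out of the window.
    r.zremrangebyscore(key, 0, now - WINDOW_SECONDS)

    # 2. Count what is left in the window.
    current_count = r.zcard(key)

    # 3. Reject if the limit is already reached.
    if current_count >= LIMIT:
        return False

    # 4. Record this request and accept it.
    #    (A unique request id as the member would avoid collisions when two
    #    requests land on the exact same timestamp.)
    r.zadd(key, {str(now): now})
    r.expire(key, WINDOW_SECONDS)         # let idle keys age out
    return True
```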