There are many ways to design a rate limiting system for an API. In this article I'll talk about one of them, the Leaky Bucket algorithm.
In the leaky bucket algorithm, we picture incoming traffic as water pouring into a bucket that has a hole in it, so the water leaks out at a constant rate. If the bucket is full, any additional incoming water spills over and is ignored. This way the leaky bucket algorithm ensures a constant, limited and smooth flow of requests and puts a rate limit on API calls.
In the implementation we introduce a constant that denotes the capacity of the bucket, and a counter that stores how many requests are currently open. With every incoming request the counter is incremented, and with every served request it is decremented. When the number of open requests reaches the bucket capacity, we ignore new incoming requests until an open one has been served.
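The counter approach described above could be sketched roughly as follows in Python. This is only a minimal illustration, not a production implementation: the class name `LeakyBucket` and the methods `try_acquire` and `release` are hypothetical names chosen for this example, and the lock is an assumption added so the counter stays consistent if several threads handle requests at once. In this sketch the "leak" is simply the rate at which requests finish being served.

```python
import threading


class LeakyBucket:
    """Counter-based bucket: admits a request only while there is room."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be a positive integer")
        self.capacity = capacity        # constant: size of the bucket
        self._open = 0                  # counter: requests currently open
        self._lock = threading.Lock()   # assumption: guards the counter across threads

    def try_acquire(self) -> bool:
        """Called for every incoming request. Returns False if the bucket is full."""
        with self._lock:
            if self._open >= self.capacity:
                return False            # bucket full: the request is ignored
            self._open += 1
            return True

    def release(self) -> None:
        """Called when a request has been served, freeing one slot."""
        with self._lock:
            if self._open > 0:
                self._open -= 1

    @property
    def open_requests(self) -> int:
        with self._lock:
            return self._open


# Usage: a bucket of size 2 accepts two requests and ignores the third.
bucket = LeakyBucket(capacity=2)
results = [bucket.try_acquire() for _ in range(3)]  # [True, True, False]
bucket.release()                                    # one request was served
accepted_again = bucket.try_acquire()               # True: a slot is free again
```

A caller would typically wrap request handling in `try_acquire()` and a `finally: release()` so that a slot is always returned, and answer rejected requests with an error such as HTTP 429.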