
22Nov/10

Weekend project: A rate limiting HTTP proxy in node.js and Redis

One day last week a couple of co-workers asked how one would go about implementing an API limiter for a "global scale" RESTful webservice. I thought about it briefly and assumed that most proxies would already have implemented this. Lo and behold, it's true: nginx, HAProxy, and Squid all have ways to put limits on HTTP traffic. The only big problem with these is one of shared state. In other words, between multiple instances of a proxy, there isn't any.

So, having never worked with node.js before and having only experimented with Redis, I figured it was time to try my hand at both and implement a rate-limiting HTTP proxy. The result, node-rate-limiter-proxy, is up on GitHub. Fork it or let me know if it's useful. If you are on Rails/Rack, you can also have a look at rack-throttle, which has a few nice features and configuration options.

Of course, this implementation doesn't quite solve the "global scale" requirement. Doing so would require some fun multi-master replication, since the "global scale" webservice we're talking about is currently deployed to 3 data centers around the world. Let's tackle that requirement next weekend! (Or just assume that a user stays in the same data center for the duration of his session, or at least of the rate-limiting window, which is probably also reasonable to start with.)

Redis Implementation Notes

Due to the way Redis < 2.1.3 treats volatile keys (any write to a key with an expiry set simply deletes the key, so a volatile counter cannot be incremented in place), several steps are needed when dealing with the rate limiting. Here is a rundown of the steps that are taken against Redis < 2.1.3 to achieve incrementing volatile keys. This example uses redis-cli as an illustration tool so that you can test the semantics yourself or re-use them elsewhere.

    $ redis-cli

The first step is to use two keys, X and Y, representing X requests in Y seconds. In this example, we'll use 60 seconds as the rate period and 10 as the maximum number of requests allowed in those 60 seconds. Because of the Redis limitation on volatile keys, the key prefixed with "Y:" will be the timing/expiration key only and will not be used to store the number of accesses in this timeframe. For this key the value is irrelevant, since it will never be read or changed. The key prefixed with "X:" will keep track of the number of requests in the time period. We'll assume neither is yet set and get both values from Redis. This is best done in a transaction so we can send just one request to Redis and have guarantees about consistency between the two values.

    redis> MULTI
    OK
    redis> GET X:foo
    QUEUED
    redis> TTL Y:foo
    QUEUED
    redis> EXEC
    1. (nil)
    2. (integer) -1

Since we've never set either of these keys, it's no wonder that the GET returns nil and the TTL is -1. The next step, of course, is to start the X counter and set the Y key counting down.

    redis> MULTI
    OK
    redis> SET X:foo 1
    QUEUED
    redis> SETEX Y:foo 60 0
    QUEUED
    redis> EXEC
    1. OK
    2. OK

Some introspection on the keys in Redis shows us their current values.

    redis> MGET X:foo Y:foo
    1. "1"
    2. "0"
    redis> TTL Y:foo
    (integer) 56 <-- this is the number of seconds left for the key to "live" so will vary slightly depending on when you call TTL

Now that we have both keys X and Y for resource "foo" we can let the request through.

Let's now keep going and imagine what subsequent requests would look like. Normally we wouldn't actually do a GET on the X value but instead would be optimistic and increment the value while we check the TTL of key Y.

    redis> MULTI
    OK
    redis> INCR X:foo
    QUEUED
    redis> TTL Y:foo
    QUEUED
    redis> EXEC
    1. (integer) 5
    2. (integer) -1

Here we have the case where the resource has been accessed 4 times before (this would be the 5th access), but the Y key has expired, so its TTL comes back as -1. In this case, we simply have to reset everything before letting the request through.

    redis> MULTI
    OK
    redis> SET X:foo 1
    QUEUED
    redis> SETEX Y:foo 60 0
    QUEUED
    redis> EXEC
    1. OK
    2. OK

Should the expiry time not be reached, we need to do something else.

    redis> MULTI
    OK
    redis> INCR X:foo
    QUEUED
    redis> TTL Y:foo
    QUEUED
    redis> EXEC
    1. (integer) 11
    2. (integer) 30

Okay, now it's clear that if this request were to be let through we would exceed the maximum of 10 requests per 60 seconds. In this case, we just deny the incoming request. We don't need to decrement the key, since the next request would still result in a denial: 12 > 10. In reality, this actually gives us some useful insight: before we reset this key, we can record its value to disk somewhere for statistics purposes. This will give us an idea of how far beyond their allowable limit people are attempting to go.

Of course, the downside is that you continue to increment this value and could, in theory, reach the maximum integer value. If incrementing is not for you, you can always be pessimistic and do a check before allowing the request and incrementing the counter.

    redis> GET X:foo
    "9"
    redis> INCR X:foo
    (integer) 10

This does require two calls for every successful request, but it will avoid running out of integers if you were to increment forever. (Note that the separate GET and INCR are not atomic, so a burst of concurrent requests could still nudge the counter past the limit.)
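
To tie the transcripts together, here is a minimal sketch of the optimistic flow in node.js, assuming the node_redis client. The isAllowed helper and the key names are just illustrative; this is not the actual code from node-rate-limiter-proxy.

    var redis = require('redis');
    var client = redis.createClient();

    var MAX_REQUESTS = 10; // X: maximum requests allowed...
    var PERIOD = 60;       // Y: ...per this many seconds

    // Optimistically INCR the counter while checking the timing key's TTL,
    // mirroring the MULTI/EXEC transcripts above.
    function isAllowed(resource, callback) {
      client.multi()
        .incr('X:' + resource)
        .ttl('Y:' + resource)
        .exec(function (err, replies) {
          if (err) return callback(err);
          var count = replies[0];
          var ttl = replies[1];
          if (ttl === -1) {
            // Timing key expired (or never existed): reset both keys
            // and let this request through.
            client.multi()
              .set('X:' + resource, 1)
              .setex('Y:' + resource, PERIOD, 0)
              .exec(function (err) {
                callback(err, true);
              });
          } else {
            // Timing key still alive: allow only while within the limit.
            callback(null, count <= MAX_REQUESTS);
          }
        });
    }

Each proxied request would then call isAllowed('foo', function (err, ok) { ... }) and reject the request whenever ok is false.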

Of course, once Redis >= 2.1.3 is stable and released, this algorithm becomes a bit simpler: since writes will no longer destroy volatile keys, only one key needs to be operated on.
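
As a rough sketch of what that might look like (assuming the new >= 2.1.3 semantics, where a write no longer destroys a volatile key), the whole dance collapses to an INCR, plus an EXPIRE on the first hit:

    redis> INCR X:foo
    (integer) 1
    redis> EXPIRE X:foo 60
    (integer) 1
    redis> INCR X:foo
    (integer) 2

When the INCR returns 1, set the expiry; when it returns more than the allowed maximum, deny the request. The separate "Y:" timing key, and the TTL check that went with it, disappear entirely.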
