How To Write Code That Doesn't Suck: a simple canonical port numbering scheme for web services

My work in the past few years has involved producing quite a few web services. Some are public facing, but many are middle-tier services accessed by other services, or REST back-ends for client apps. Almost all have leveraged some sort of framework such as Rails or Django. Each such service must be bound to a port to be accessed, and since binding to the standard HTTP port (80) requires root privileges, most frameworks have some higher default port they run on during development. For example, Rails defaults to 3000, Django to 8000, Dropwizard to 8080, etc. This works well enough when throwing together a single service, but what if you need a bunch of these running at once? Each framework lets you configure alternate ports, but what values should you actually use?

The answer to that question is really a function of how the services will be used and maintained. If you're part of a team developing multiple services that need to work together, you will have to agree on some sort of standard, or pick arbitrary numbers and simply record them in a central place. In practice that "standard" may amount to little more than "we've been using Rails, so our first service is 3000, our next is 3001, etc.". For a variety of reasons, including preserving your sanity, I suggest a somewhat more formal approach: a canonical port numbering scheme based on the name of your service.

A service's name might not be particularly well defined in all cases, but usually there is some simple short name that a team refers to a service by, and that's the one you should start with. The name of the project, app, or source code repository are all reasonable candidates. There are a variety of ways to turn that name into a number (e.g. a hash like CRC-16), but I've come to prefer simply interpreting the first few characters of the service name as a base-32 encoded number (base-32 is like hex but goes from 0 to v instead of just 0 to f). This approach typically ensures unique ports as long as you pick service names that don't start with the same three letters. It also allows you to guess the service from the port. For example, assume you have a service named api. Interpreted as a base-32 number, that's 11058. With a couple of tweaks to account for the valid port range, restricted ports, and characters that aren't valid base-32 digits, you have a complete solution. Here's an example algorithm in JavaScript:
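What follows is a minimal sketch of such an algorithm, worked out from the example ports listed in this post rather than taken from an original listing. The examples are consistent with three rules: right pad short names with 0, clamp letters past v down to v, and add 1024 to clear the restricted ports. The names `servicePort`, `toBase32Digit`, `PORT_OFFSET`, and `PREFIX_LENGTH` are illustrative. Lowercasing the name and treating any other character (such as a hyphen) as 0 are assumptions of this sketch.

```javascript
// Sketch only: inferred from the example table, not an original listing.
const PORT_OFFSET = 1024;  // skip the restricted ports below 1024
const PREFIX_LENGTH = 3;   // three base-32 digits: at most 32767 + 1024 = 33791

// Map a single character onto a valid base-32 digit (0-9, a-v).
function toBase32Digit(ch) {
  if (/[0-9a-v]/.test(ch)) return ch;  // already a valid digit
  if (/[w-z]/.test(ch)) return 'v';    // clamp w..z to the largest digit
  return '0';                          // assumption: anything else counts as 0
}

// Turn a service name into its canonical port number.
function servicePort(name) {
  const digits = name
    .toLowerCase()                     // assumption: names are case-insensitive
    .slice(0, PREFIX_LENGTH)           // only the first three characters matter
    .padEnd(PREFIX_LENGTH, '0')        // right pad short names with 0
    .split('')
    .map(toBase32Digit)
    .join('');
  return parseInt(digits, 32) + PORT_OFFSET;
}

console.log(servicePort('api'));    // 12082
console.log(servicePort('db'));     // 14688
console.log(servicePort('zebra'));  // 33227
```

Because only three base-32 digits are used, the result always falls between 1024 and 33791, comfortably inside the valid port range.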

Here are a few examples of service names and the ports they yield, and what you get from converting the port number back to a string:

api         => 12082 => api...
db          => 14688 => db0... right padded with 0
foo-bar     => 17176 => foo...
aardvark    => 11611 => aar...
alligator   => 11957 => all... note that lexical sort is preserved
bee         => 12750 => bee...
cat         => 13661 => cat...
caterpillar => 13661 => cat... collides with cat
dog         => 15120 => dog...
elephant    => 16046 => ele...
vole        => 33557 => vol...
zebra       => 33227 => veb... z isn't a valid digit
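Going the other way, from port back to name prefix, can be sketched as below. It assumes the same 1024 offset implied by the table above (11058 for api becomes port 12082). The function name `portToPrefix` and the range check are illustrative additions.

```javascript
// Sketch only: reverses the name-to-port mapping implied by the table above.
const PORT_OFFSET = 1024;            // offset that skips the restricted ports
const MAX_PORT = 32767 + PORT_OFFSET; // 'vvv' in base 32, plus the offset

// Convert a port back into the three-character base-32 prefix it encodes.
function portToPrefix(port) {
  if (!Number.isInteger(port) || port < PORT_OFFSET || port > MAX_PORT) {
    throw new RangeError('port outside the range this scheme produces: ' + port);
  }
  // toString(32) uses the digits 0-9 and a-v; pad so small values keep 3 digits.
  return (port - PORT_OFFSET).toString(32).padStart(3, '0');
}

console.log(portToPrefix(12082));  // api
console.log(portToPrefix(14688));  // db0
console.log(portToPrefix(33227));  // veb
```

The decoded prefix is only a hint: db comes back as db0 and zebra as veb, so a person still has to match it against the list of known services.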

Turns out this scheme easily avoids most of the common "well-known" ports, as they mostly decode to strings you'd never think to start a service name with. Some examples:

                6379 => 57b... Redis
               27017 => pc9... MongoDB
                3000 => 1to... Rails default
                8000 => 6q0... Django default
                8080 => 6sg... Dropwizard et al
                2195 => 14j... Apple push notifications

While just having a sane and consistent way of picking port numbers may be its own reward, this approach is actually an important component of automated service deployment tools I've been working with for the past year. I hope to explore these further in an upcoming post, but for example, consider that well-chosen service names can be part of a domain name. You can leverage this into a "config-less" approach to inter-service communication, e.g. if the api service is running on api.cluster-1.example.com, it could assume the database service it should talk to lives at db.cluster-1.example.com and will be found at port 14688. Initially the two services could be deployed on one box pointed to by both domain names. If the load later increases and the db service is moved to another box, there's no need to update the config for the api service.
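To make the "config-less" idea concrete, here is a rough sketch of how a service might work out where a peer lives from nothing but its own hostname. The helper `peerAddress` is hypothetical, and it assumes every hostname has the form service.cluster.domain as in the example above. `servicePort` is a compact version of the name-to-port mapping, with the same assumptions (lowercased names, w..z clamped to v, other characters treated as 0, offset of 1024).

```javascript
// Sketch only: compact name-to-port mapping matching the examples in this post.
function servicePort(name) {
  const digits = name
    .toLowerCase()
    .slice(0, 3)
    .padEnd(3, '0')               // right pad short names with 0
    .replace(/[w-z]/g, 'v')       // clamp letters past v
    .replace(/[^0-9a-v]/g, '0');  // assumption: other characters count as 0
  return parseInt(digits, 32) + 1024;
}

// Hypothetical helper: swap our own service label for the peer's name and
// append the peer's canonical port. Assumes ownHost is "<service>.<cluster domain>".
function peerAddress(ownHost, peerName) {
  const clusterDomain = ownHost.split('.').slice(1).join('.');
  return `${peerName}.${clusterDomain}:${servicePort(peerName)}`;
}

console.log(peerAddress('api.cluster-1.example.com', 'db'));
// db.cluster-1.example.com:14688
```

Nothing in the api service's own configuration mentions the database host or port. Moving db to another box only requires a DNS change.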

2014-01-22
