UDP & TCP Ports – Should Port Numbers Be Represented as Short or Integer?

design-patterns, java

I am currently designing some networking code that expects the caller to supply a hostname and a port to connect to.

Since I am still in the development phase, I can still change many aspects of the code as advantages and disadvantages become apparent. One question remains open, though: what kind of number should I pass to the underlying socket creation code?

I now have to decide which version to keep:

Version 1:

int port = 80;
ServerSocket socket = new ServerSocket(port);

Advantages:

  • Almost all libraries in Java expect integers instead of shorts, including InetSocketAddress.

Disadvantages:

  • Using integers instead of shorts has caused security bugs in the past, in security systems that did not expect the underlying code to discard the upper 16 bits.

Version 2:

short port = 80;
ServerSocket socket = new ServerSocket(port);

Advantages:

  • It follows the underlying system convention from C of using 16-bit numbers.

Disadvantages:

  • There must be a reason why Java uses integers, right?

My question: should I design my program to accept an int or a short?

Best Answer

Always use integers, unless you have a good reason to do otherwise. Here, we seem to have a good reason: port numbers must be in the range 0 to 2^16 - 1 (i.e. they fit into an unsigned 16-bit number).

Unfortunately, Java's short is a signed 16-bit number with the range -2^15 to 2^15 - 1. If you use shorts in your API, port 65535 would have to be specified as -1. This might be understood correctly by a C API that just casts the number to an unsigned interpretation, but within Java this will lead to loads of unnecessary pain. In particular, -1 != 65535, and you can't easily test whether a given port is inside some port range.
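
To make the pain concrete, here is a small sketch of what happens when a port above 32767 is stored in a short. The class and method names (ShortPortDemo, toPort, naiveInRange) are made up for illustration; Short.toUnsignedInt is part of the JDK since Java 8.

```java
// Sketch: what happens when a port is squeezed into Java's signed short.
final class ShortPortDemo {
    // Hypothetical helper: reinterpret a signed short as the unsigned port it encodes.
    static int toPort(short raw) {
        return Short.toUnsignedInt(raw); // available since Java 8
    }

    // Naive range check on the raw short; it breaks for ports above 32767.
    static boolean naiveInRange(short port, int low, int high) {
        return low <= port && port <= high;
    }

    public static void main(String[] args) {
        short raw = (short) 65535;        // the narrowing cast keeps only the low 16 bits
        System.out.println(raw);          // prints -1
        System.out.println(toPort(raw));  // prints 65535
        // Port 65535 is in the dynamic range 49152..65535, yet the check fails:
        System.out.println(naiveInRange(raw, 49152, 65535)); // prints false
    }
}
```

Every comparison, log statement and equality check would need a conversion like toPort first, which is easy to forget.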

If you are worried about correctness, then take steps to ensure correctness. We can model a concept such as a “valid port number” as a new type:

class ValidPortNumber {
  private static final int MIN_PORT = 0;
  private static final int MAX_PORT = 65535;
  private final int port;

  public ValidPortNumber(int port) {
    // Reject anything outside the unsigned 16-bit range.
    if (!(MIN_PORT <= port && port <= MAX_PORT))
      throw new IllegalArgumentException("port out of range: " + port);
    this.port = port;
  }

  public int get() { return port; }
}

Alternatively, you can do this validation in your socket class.
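
As a rough sketch of that alternative, the check can live in the constructor of whatever class wraps the connection details. The class name Endpoint and its methods are hypothetical; only InetSocketAddress.createUnresolved comes from the JDK.

```java
import java.net.InetSocketAddress;

// Hypothetical wrapper around host and port; the names are illustrative only.
final class Endpoint {
    private static final int MIN_PORT = 0;
    private static final int MAX_PORT = 65535;

    private final String host;
    private final int port;

    Endpoint(String host, int port) {
        if (host == null || host.isEmpty()) {
            throw new IllegalArgumentException("host must not be empty");
        }
        if (port < MIN_PORT || port > MAX_PORT) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        this.host = host;
        this.port = port;
    }

    // Builds the address without a DNS lookup; the caller decides when to connect.
    InetSocketAddress toUnresolvedAddress() {
        return InetSocketAddress.createUnresolved(host, port);
    }

    int port() {
        return port;
    }
}
```

With this design, callers still pass a plain int, but an invalid value fails at construction time rather than somewhere deep inside the connection logic.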

Why can't we rely on the value range of primitive types instead of performing validation? When mapping concepts (such as port numbers) to an implementation (such as numeric types), there can always be a mismatch:

  • Some instances of the concept cannot be mapped to the implementation, since the implementation is too restricted. This is what would have happened had you chosen short, since the port in new ServerSocket(65535) cannot be expressed as a positive short. For a correct implementation, it is necessary that no such mismatch exists.

  • The implementation can represent values that are not instances of the concept. This would happen if you just use ints without any validation, since your API would accept a call such as connect(65536) without complaint (Java's own ServerSocket catches such a value only at run time, with an IllegalArgumentException). Eliminating this mismatch is not strictly necessary for a correct implementation, but it is necessary to detect bogus input and make your implementation resilient against usage bugs.
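
The second kind of mismatch is visible in the JDK itself: its socket classes take an int and validate at run time. The sketch below probes this with the InetSocketAddress(int) constructor, which does not open a socket; the class IntPortDemo and the method jdkAccepts are made-up names.

```java
import java.net.InetSocketAddress;

final class IntPortDemo {
    // Returns true if the JDK accepts the value as a port, false if it rejects it.
    static boolean jdkAccepts(int port) {
        try {
            new InetSocketAddress(port); // wildcard address, no socket is opened
            return true;
        } catch (IllegalArgumentException e) {
            return false; // thrown for values outside 0..65535
        }
    }

    public static void main(String[] args) {
        System.out.println(jdkAccepts(65535)); // prints true
        System.out.println(jdkAccepts(65536)); // prints false
        System.out.println(jdkAccepts(-1));    // prints false
    }
}
```

The compiler is happy with all three calls; only the explicit range check tells them apart.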

So remember: every port number is an integer, but not every integer is a port number. Therefore, you will have to write validation code. You cannot use shorts since not every port number is a signed short, and not every signed short is a port number.

Exercises


  • Can I represent the concept “Address” as the type “String”?

    Yes, every address can be represented as a string. However, I should create a custom type with validation since some strings such as "92hhef92fdff#as" aren't usable addresses.

  • Can I represent the concept “Name” as the type “String”?

    Yes, every name can be written in a Unicode string. Since it is not possible to validate whether a name is “correct”, it is not useful to add a validation type. While some names cannot be expressed in script, this is outside of the scope of any reasonable model of names.

  • Can I represent the concept “Telephone Number” as the type “int”?

    No, for a variety of reasons: leading zeroes might be significant (e.g. for international calls), or because digits might be grouped with spaces, parens, hyphens, … for better readability. Some numbers might use letters as mnemonics for numbers. However, every telephone number can be represented as a sequence of digits (plus perhaps the + and # chars?). We should therefore define a custom type to provide validation and a correct concept of equality, but we could use a string to store the digit sequence and possibly any user-supplied formatting.
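
One possible shape for such a telephone number type is sketched below. It is deliberately simplified and rests on assumptions not made in the answer: only whitespace, parentheses, hyphens and dots count as formatting, an optional leading + is allowed, and letter mnemonics and # are rejected. The class name PhoneNumber is hypothetical.

```java
// Sketch of a custom type with validation and its own notion of equality.
final class PhoneNumber {
    private final String raw;        // what the user typed, kept for display
    private final String normalized; // digits plus optional leading '+', used for equality

    PhoneNumber(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("number must not be null");
        }
        // Assumption: these are the only formatting characters we strip.
        String stripped = raw.replaceAll("[\\s().-]", "");
        if (!stripped.matches("\\+?[0-9]+")) {
            throw new IllegalArgumentException("not a telephone number: " + raw);
        }
        this.raw = raw;
        this.normalized = stripped;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof PhoneNumber
            && normalized.equals(((PhoneNumber) o).normalized);
    }

    @Override
    public int hashCode() {
        return normalized.hashCode();
    }

    @Override
    public String toString() {
        return raw; // preserves the user-supplied formatting
    }
}
```

Because the digits are stored as a string, leading zeroes survive, and two differently formatted spellings of the same digit sequence compare equal.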
