Java – Guessing a phone number’s country code

java

This is the problem I'm working with: given a phone number from anywhere in the world and some location information (state, province, possibly country name if I'm lucky, etc.), return the ISO country code for that number.

For the purposes of this question, I will not focus on the location information, since it offers an alternative way to determine the country code that doesn't need the phone number at all (though it would be useful for validation).

When I first started working on the problem, I was hoping there was a deterministic way to figure this out because there was some sort of international standard out there. It became immediately apparent that one does not exist for phone numbers. There are standards within countries, between countries (NANP for example), but no unified international standard.

After playing around with libphonenumber for a few days, it seems to validate a phone number accurately if I'm given a country code (e.g. CA for Canada, GB for the United Kingdom, etc.).

I'm relying on two of the library's methods: isPossibleNumber and isValidNumberForRegion. This is the code I'm using:

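import com.google.i18n.phonenumbers.NumberParseException;
import com.google.i18n.phonenumbers.PhoneNumberUtil;
import com.google.i18n.phonenumbers.Phonenumber.PhoneNumber;

boolean isValid;
PhoneNumber number;
PhoneNumberUtil util = PhoneNumberUtil.getInstance();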

String numStr = "(123) 456-7890";
for (String r : util.getSupportedRegions())
{
    try {
        // check if it's a possible number
        isValid = util.isPossibleNumber(numStr, r);
        if (isValid)
        {
            number = util.parse(numStr, r);

            // check if it's a valid number for the given region
            isValid = util.isValidNumberForRegion(number, r);
            if (isValid)
                System.out.println(r + ": " + number.getCountryCode() + ", " + number.getNationalNumber());
        }
    } catch (NumberParseException e)
    {
        e.printStackTrace();
    }
}

So, for example, if I took an arbitrary phone number like +44 20 7930 4832 and ran it through this code, I would get the following output:

GB: 44, 2079304832

Now, that's assuming I'm given the dialing code (sometimes it's there). If I weren't given the dialing code, I might just get something like 20 7930 4832, and the results are not as pretty:

DE: 49, 2079304832
US: 1, 2079304832
GB: 44, 2079304832
FI: 358, 2079304832
AX: 358, 2079304832
RS: 381, 2079304832
CN: 86, 2079304832
NZ: 64, 2079304832
IN: 91, 2079304832
IR: 98, 2079304832
JP: 81, 2079304832

Given a phone number, I can run it through the rules for every country and filter the list down from 244 regions to around 20 or fewer if I'm lucky, but I'm not sure whether there's anything else I could do to try to guess the country.

Best Answer

Telephone switches can figure this out, so there must be some method to the madness. And, in fact, there is:

Country codes are arranged in a tree, where you start at the root and descend into a subtree depending on the value of each digit. When you've reached a leaf node, you're at something that could be a country code. Any digits beyond those you used to reach the leaf node have to be decoded according to the local numbering plan.

For example, the +1 country code is easy: it covers the North American Numbering Plan. Any number beginning with +1 falls into that plan; no other country has a code like +12 or +123. If you want to know whether a +1 number is in the U.S. or Canada, you have to look up the next three digits (the area code) in a table that gives you that information. (Russia, Kazakhstan and a few other territories work the same way on the Russian numbering plan, which occupies the entire +7 branch.)
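As a rough illustration of the area-code step, here is a minimal sketch in plain Java. The class name NanpRegionLookup and the four-entry table are assumptions made for this example; a real table would have to list every assigned NANP area code. It also assumes the input is already in international form, so a leading 1 really is the country code.

```java
import java.util.Map;

class NanpRegionLookup {
    // Tiny illustrative sample (an assumption for this sketch);
    // the real plan has several hundred area codes.
    private static final Map<String, String> AREA_CODE_TO_REGION = Map.of(
        "212", "US", // New York City
        "907", "US", // Alaska
        "416", "CA", // Toronto
        "604", "CA"  // Vancouver
    );

    /**
     * Returns the ISO region for a +1 number, or null if the number is not
     * in the +1 plan or its area code is missing from the sample table.
     */
    static String regionFor(String international) {
        // Keep digits only, so "+1 (416) 555-0123" becomes "14165550123".
        String digits = international.replaceAll("\\D", "");
        if (digits.length() < 4 || digits.charAt(0) != '1') {
            return null;
        }
        // The three digits after the country code are the area code.
        return AREA_CODE_TO_REGION.get(digits.substring(1, 4));
    }

    public static void main(String[] args) {
        System.out.println(regionFor("+1 416 555 0123"));
        System.out.println(regionFor("+1 (212) 555-0123"));
    }
}
```

The same idea would apply to the +7 branch, with a table keyed on whatever leading digits that plan uses to separate its territories.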

The +2 branch has ten sub-branches. All of +20 is Egypt, +27 is South Africa and +28 is unassigned. As above, no countries other than Egypt have a country code that starts with +20. The remaining seven prefixes are divided ten ways each. +21 is subdivided ten ways into +210 through +219 and covers countries like South Sudan, Morocco and Algeria. +22 is divided into +220 through +229 for The Gambia, Senegal, Mauritania, etc.

The tree you need for this can be easily generated from a list of the country codes and their countries.
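A minimal sketch of such a tree in plain Java follows. The class name CallingCodeTrie, its methods, and the short sample table are assumptions for illustration; the table holds only the codes mentioned above plus +44, and a real one would be loaded from a complete list of country codes. It only helps when the number includes its country code.

```java
class CallingCodeTrie {
    private static final class Node {
        final Node[] next = new Node[10]; // one child per digit 0-9
        String label;                     // non-null only where a complete calling code ends
    }

    private final Node root = new Node();

    /** Registers a calling code (digits only) with a label for its country or plan. */
    void add(String code, String label) {
        Node node = root;
        for (char c : code.toCharArray()) {
            int d = c - '0';
            if (node.next[d] == null) {
                node.next[d] = new Node();
            }
            node = node.next[d];
        }
        node.label = label;
    }

    /**
     * Descends one digit at a time until a labelled node is reached.
     * Returns {callingCode, label, remainingDigits}, or null if the digits
     * run out or leave the tree before a calling code is found.
     */
    String[] lookup(String international) {
        String digits = international.replaceAll("\\D", "");
        Node node = root;
        for (int i = 0; i < digits.length(); i++) {
            node = node.next[digits.charAt(i) - '0'];
            if (node == null) {
                return null; // unassigned branch, e.g. +28
            }
            if (node.label != null) {
                return new String[] {
                    digits.substring(0, i + 1), node.label, digits.substring(i + 1)
                };
            }
        }
        return null;
    }

    /** Sample table (an assumption for this sketch, far from complete). */
    static CallingCodeTrie sample() {
        CallingCodeTrie trie = new CallingCodeTrie();
        trie.add("1", "NANP");   // needs an area-code lookup to find the country
        trie.add("7", "RU/KZ");  // shared plan, needs a further lookup
        trie.add("20", "EG");
        trie.add("27", "ZA");
        trie.add("211", "SS");
        trie.add("212", "MA");
        trie.add("213", "DZ");
        trie.add("220", "GM");
        trie.add("221", "SN");
        trie.add("222", "MR");
        trie.add("44", "GB");
        return trie;
    }

    public static void main(String[] args) {
        String[] hit = sample().lookup("+44 20 7930 4832");
        System.out.println(hit[1] + ": " + hit[0] + ", " + hit[2]); // prints "GB: 44, 2079304832"
    }
}
```

Because no country code is a prefix of another, the first labelled node reached is the only possible match, so the walk never needs to backtrack.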