SPF – Is the 10-DNS-Lookup Limit Typically Enforced?

email spam spf

My understanding is that the SPF spec specifies an email receiver shouldn't have to do more than 10 DNS lookups in order to gather all the allowed IPs for a sender. So if an SPF record has include:foo.com include:bar.com include:baz.com and those three domains each have SPF records which also have 3 include entries, now we are up to 3+3+3+3=12 DNS lookups.

  1. is my understanding above correct?

  2. I only use 2 or 3 services for my domain and I am already way past this limit. Is this limit typically (or ever) enforced by major/minor email providers?

Best Answer

Both libspf2 (C) and Mail::SPF::Query (perl, used in sendmail-spf-milter) implement a limit of 10 DNS-causing mechanisms, but the latter does not (AFAICT) apply the MX or PTR limits. libspf2 limits each of mx and ptr to 10 also.

Mail::SPF (perl) has a limit of 10 DNS-causing mechanisms, and a limit of 10 lookups per mechanism, per MX and per PTR. (The two perl packages are commonly, though not by default, used in MIMEDefang.)

pyspf has limits of 10 on all of: "lookups", MX, PTR, CNAME; but it explicitly multiplies MAX_LOOKUPS by 4 during operation. Unless in "strict" mode, it also multiplies MAX_MX and MAX_PTR by 4.

I can't comment on commercial/proprietary implementations, but the above (except pyspf) clearly implement an upper limit of 10 DNS-triggering mechanisms (more on that below), give or take, though in most cases it can be overridden at run-time.

In your specific case you are correct: that is 12 includes, which exceeds the limit of 10. I would expect most SPF software to return "PermError". However, failures will only affect the final "included" provider(s), because the count is kept as a running total: SPF mechanisms are evaluated left-to-right and checks will "early-out" on a pass, so the outcome depends on where in the sequence the sending server appears.
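The running total and the early-out can be sketched in Python. This is only an illustration, not how any of the libraries above are written: the domain names, the in-memory RECORDS table standing in for DNS, and the helper names are all made up, and only include, ip4 and all are handled.

```python
import ipaddress

MAX_LOOKUP_TERMS = 10  # limit on DNS-causing terms per SPF check


class PermError(Exception):
    """Raised when the running total of DNS-causing terms exceeds the limit."""


def build_records():
    # Hypothetical records: 3 includes at the top, each with 3 includes of its own.
    records = {
        "example.org": "v=spf1 include:foo.example include:bar.example "
                       "include:baz.example -all"
    }
    n = 0
    for parent in ("foo.example", "bar.example", "baz.example"):
        children = []
        for i in range(3):
            n += 1
            child = f"s{i}.{parent}"
            records[child] = f"v=spf1 ip4:192.0.2.{n} -all"
            children.append(f"include:{child}")
        records[parent] = "v=spf1 " + " ".join(children) + " -all"
    return records


RECORDS = build_records()  # stands in for real DNS TXT lookups


def check(domain, client_ip, counter):
    """Evaluate terms left to right; return True as soon as one matches."""
    for term in RECORDS[domain].split()[1:]:
        if term.startswith("include:"):
            counter[0] += 1  # one running total shared by the whole check
            if counter[0] > MAX_LOOKUP_TERMS:
                raise PermError("too many DNS-causing terms")
            if check(term[len("include:"):], client_ip, counter):
                return True  # early-out on a pass
        elif term.startswith("ip4:"):
            if ipaddress.ip_address(client_ip) in ipaddress.ip_network(term[4:]):
                return True
        elif term.endswith("all"):
            return False
    return False


def spf_result(client_ip):
    counter = [0]
    try:
        return "pass" if check("example.org", client_ip, counter) else "fail"
    except PermError:
        return "permerror"


for ip in ("192.0.2.1", "192.0.2.7", "192.0.2.8", "198.51.100.1"):
    print(ip, spf_result(ip))
```

In this toy setup a sender listed under the first two providers, or in the first nested include of the third, passes before the count reaches 11; a sender listed later, or not listed at all, ends in permerror.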

The way around this is to use mechanisms which do not trigger DNS lookups, e.g. ip4 and ip6, and then use mx if possible as that gets you up to 10 further names, each of which can have more than one IP.

Since SPF results in arbitrary DNS requests with potentially exponential scaling, it could easily be exploited for DoS/amplification attacks. It has deliberately low limits to prevent this: it does not scale the way you want.


10 mechanisms (strictly mechanisms + the "redirect" modifier) causing DNS look-ups is not exactly the same thing as 10 DNS look-ups though. Even "DNS lookups" is open to interpretation: you don't know in advance how many discrete lookups are required, and you don't know how many discrete lookups your recursive resolver may need to perform (see below).

RFC 4408 §10.1:

SPF implementations MUST limit the number of mechanisms and modifiers that do DNS lookups to at most 10 per SPF check, including any lookups caused by the use of the "include" mechanism or the "redirect" modifier. If this number is exceeded during a check, a PermError MUST be returned. The "include", "a", "mx", "ptr", and "exists" mechanisms as well as the "redirect" modifier do count against this limit. The "all", "ip4", and "ip6" mechanisms do not require DNS lookups and therefore do not count against this limit.

[...]

When evaluating the "mx" and "ptr" mechanisms, or the %{p} macro, there MUST be a limit of no more than 10 MX or PTR RRs looked up and checked.

So you may use up to 10 mechanisms/modifiers which trigger DNS lookups. (The wording here is poor: it seems to state only the upper bound of the limit; a conforming implementation could have a limit of 2.)
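As a rough aid, the list of terms in §10.1 can be turned into a small Python counter. This is a sketch with made-up function names, it does no real parsing or validation, and it only counts the terms in one record's text; terms inside included or redirected records add to the same total during a real check.

```python
# Mechanisms that §10.1 says count against the limit of 10.
LOOKUP_MECHANISMS = {"include", "a", "mx", "ptr", "exists"}


def counts_against_limit(term):
    """True if a single SPF term is one of the DNS-causing kinds."""
    term = term.lstrip("+-~?")  # drop the optional qualifier
    if term.lower().startswith("redirect="):
        return True  # the redirect modifier also counts
    # Strip any ":domain" argument and "/cidr" suffix to get the bare name.
    name = term.split(":", 1)[0].split("/", 1)[0].lower()
    return name in LOOKUP_MECHANISMS


def count_lookup_terms(record):
    """Count DNS-causing terms in the text of one SPF record."""
    return sum(counts_against_limit(t) for t in record.split()[1:])


# Hypothetical record, for illustration only.
record = ("v=spf1 a mx/24 include:foo.example ip4:192.0.2.0/24 "
          "ip6:2001:db8::/32 exists:%{i}.bl.example redirect=other.example")
print(count_lookup_terms(record))  # a, mx, include, exists, redirect -> 5
```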

§5.4 for the mx mechanism, and §5.5 for the ptr mechanism each have a limit of 10 lookups of that kind of name, and that applies to the processing of that mechanism only, e.g.:

To prevent Denial of Service (DoS) attacks, more than 10 MX names MUST NOT be looked up during the evaluation of an "mx" mechanism (see Section 10).

i.e. you may have 10 mx mechanisms, each with up to 10 MX names, so each of those may cause 20 DNS operations (10 MX + 10 A DNS lookups each) for a total of 200. It's similar for ptr or %{p}: you can look up 10 ptr mechanisms, hence 10x10 PTRs, and each PTR also requires an A lookup, again for a total of 200.
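The arithmetic can be written out in a few lines of Python. The function name and parameters are made up for illustration, and the per-mechanism counts are the ones used in the paragraph above, not figures stated by the RFC.

```python
def worst_case_queries(mechanisms=10, names_per_mechanism=10, queries_per_name=2):
    """Worst-case DNS operations under the counting used in the text above."""
    return mechanisms * names_per_mechanism * queries_per_name


# 10 mx mechanisms x 10 MX names x (1 MX + 1 A operation per name)
print(worst_case_queries())
```

The same numbers apply to the ptr case (10 mechanisms, 10 PTRs each, one extra A lookup per PTR).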

This is exactly what the 2009.10 test suite checks, see the "Processing Limits" tests.

There is no clearly stated upper limit on the total number of client DNS lookup operations per-SPF-check; I calculate it as implicitly 210, give or take. There is also a suggestion to limit the volume of DNS data per-SPF-check, though no actual limit is suggested. You can get a rough estimate as SPF records are limited to 450 bytes (which is sadly shared with all other TXT records), but the total could exceed 100 KiB if you're generous. Both those values are clearly open to potential abuse as an amplification attack, which is exactly what §10.1 says you need to avoid.

Empirical evidence suggests a total of 10 lookup mechanisms is commonly implemented in records (check out the SPF for microsoft.com who seem to have gone to some lengths to keep it to exactly 10). It's hard to collect evidence of too-many-lookups failure because the mandated error code is simply "PermError", which covers all manner of problems (DMARC reporting might help with that though).

The OpenSPF FAQ perpetuates the limit of a total of "10 DNS lookups", rather than the more precise "10 DNS causing mechanisms or redirects". This FAQ is arguably wrong since it actually says:

Since there is a limit of 10 DNS lookups per SPF record, specifying an IP address [...]

which is in disagreement with the RFC which imposes the limits on an "SPF check" operation, does not limit DNS lookup operations in this way, and clearly states an SPF record is a single DNS text RR. The FAQ would imply that you restart the count when you process an "include" as that is a new SPF record. What a mess.


DNS Lookups

What is a "DNS lookup" anyway? As a user, I would consider "ping www.microsoft.com" to involve a single DNS "lookup": there's one name that I expect to turn into one IP. Simple? Sadly not.

As an administrator I know that www.microsoft.com might not be a simple A record with a single IP. It might be a CNAME that in turn needs another discrete lookup to obtain an A record, albeit one that my upstream resolver will probably perform rather than the resolver on my desktop. Today, for me, www.microsoft.com is a chain of 3 CNAMEs that finally ends up as an A record on akamaiedge.net; that's (at least) 4 DNS query operations for someone. SPF may see CNAMEs with the "ptr" mechanism; an MX record should not be a CNAME though.
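A toy model of that CNAME chain, in Python, shows where the count of 4 comes from. The ZONE table and every name in it are invented for the example, and it charges one query per link in the chain; a real recursive resolver may return several links in a single answer.

```python
# Hypothetical records: 3 CNAMEs ending in one A record.
ZONE = {
    ("www.example.com", "CNAME"): "www.example.com.cdn.example.net",
    ("www.example.com.cdn.example.net", "CNAME"): "edge.cdn.example.net",
    ("edge.cdn.example.net", "CNAME"): "e1234.edge.example.org",
    ("e1234.edge.example.org", "A"): "192.0.2.10",
}


def resolve_a(name, max_chain=8):
    """Follow CNAMEs to an A record; return (address, queries made)."""
    queries = 0
    for _ in range(max_chain):
        queries += 1  # one query operation per name asked about
        if (name, "A") in ZONE:
            return ZONE[(name, "A")], queries
        if (name, "CNAME") in ZONE:
            name = ZONE[(name, "CNAME")]  # follow the alias and ask again
            continue
        raise LookupError(f"no A or CNAME record for {name}")
    raise LookupError("CNAME chain too long")


print(resolve_a("www.example.com"))  # ('192.0.2.10', 4)
```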

Finally, as a DNS administrator I know that answering (almost) any question involves many discrete DNS operations, i.e. individual question and answer transactions (UDP datagrams). Assuming an empty cache, a recursive resolver needs to start at the DNS root and work its way down: the root (.), then com, then microsoft.com, then www.microsoft.com, asking for specific types of records (NS, A etc.) as required, and dealing with CNAMEs. You can see this in action with dig +trace www.microsoft.com, though you probably won't get the exact same answer due to geolocation trickery. (There's even a little bit more to this complexity, since SPF piggybacks on TXT records, and obsolete limits of 512 bytes on DNS answers might mean retrying queries over TCP.)

So what does SPF consider as a lookup? It's really closest to the administrator point of view: it needs to be aware of the specifics of each type of DNS query (but not to the point where it actually needs to count individual DNS datagrams or connections).