Ubuntu – Exim says DNS lookup succeeded, but also says “host lookup did not complete”

exim, Ubuntu, vagrant, virtualbox

I'm trying to send automatic notification emails to users of my website. A custom daemon sends those emails through Exim 4, whose only role is to relay each message to the mail server associated with the recipient's address. All outgoing emails must be relayed. There are no local emails and no incoming mail.

However, when I try to send emails, my daemon always gets the following response:

com.sun.mail.smtp.SMTPAddressFailedException: 451 Temporary local problem - please try later

In /var/log/exim4/mainlog, I have the following lines:

2014-09-09 22:30:50 no host name found for IP address 10.0.2.2
2014-09-09 22:30:50 H=(lotp-lanbox) [10.0.2.2] F=<noreply@mydomain.com> temporarily rejected RCPT <foobar@romandie.com>: host lookup did not complete

(Note that 10.0.2.2 is the IP address of the host on which the sender daemon is installed.)

That message is strange and lacks detail, so here is the output of an address test I ran with debugging enabled:

user@host:~$ exim4 -bt -d-resolver foobar@romandie.com
Exim version 4.82 uid=0 gid=0 pid=14035 D=fbb95cfd
Berkeley DB: Berkeley DB 5.3.28: (September  9, 2013)
Support for: crypteq iconv() IPv6 GnuTLS move_frozen_messages DKIM
Lookups (built-in): lsearch wildlsearch nwildlsearch iplsearch cdb dbm dbmjz dbmnz dnsdb dsearch nis nis0 passwd
Authenticators: cram_md5 plaintext
Routers: accept dnslookup ipliteral manualroute queryprogram redirect
Transports: appendfile/maildir/mailstore autoreply lmtp pipe smtp
Fixed never_users: 0
Size of off_t: 8
Compiler: GCC [4.8.2]
Library version: GnuTLS: Compile: 2.12.23
                         Runtime: 2.12.23
Library version: PCRE: Compile: 8.31
                       Runtime: 8.31 2012-07-06
Total 13 lookups
WHITELIST_D_MACROS: "OUTGOING"
TRUSTED_CONFIG_LIST: "/etc/exim4/trusted_configs"
changed uid/gid: forcing real = effective
  uid=0 gid=0 pid=14035
  auxiliary group list: <none>
seeking password data for user "uucp": cache not available
getpwnam() succeeded uid=10 gid=10
changed uid/gid: calling tls_validate_require_cipher
  uid=109 gid=116 pid=14036
  auxiliary group list: <none>
tls_validate_require_cipher child 14036 ended: status=0x0
configuration file is /var/lib/exim4/config.autogenerated
log selectors = 00000ffc 00632001
trusted user
admin user
seeking password data for user "mail": cache not available
getpwnam() succeeded uid=8 gid=8
user name "root" extracted from gecos field "root"
originator: uid=0 gid=0 login=root name=root
sender address = root@dev
Address testing: uid=0 gid=116 euid=0 egid=116
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Testing foobar@romandie.com
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Considering foobar@romandie.com
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
routing foobar@romandie.com
--------> hubbed_hosts router <--------
local_part=foobar domain=romandie.com
checking domains
expansion of "${if exists{/etc/exim4/hubbed_hosts}{partial-lsearch;/etc/exim4/hubbed_hosts}fail}" forced failure: assume not in this list
hubbed_hosts router skipped: domains mismatch
--------> dnslookup_relay_to_domains router <--------
local_part=foobar domain=romandie.com
checking domains
romandie.com in "@:localhost"? no (end of list)
romandie.com in "*"? yes (matched "*")
romandie.com in "! +local_domains : +relay_to_domains"? yes (matched "+relay_to_domains")
R: dnslookup_relay_to_domains for foobar@romandie.com
calling dnslookup_relay_to_domains router
dnslookup_relay_to_domains router called for foobar@romandie.com
  domain = romandie.com
DNS lookup of romandie.com (MX) succeeded
dnslookup_relay_to_domains router: defer for foobar@romandie.com
  message: host lookup did not complete
foobar@romandie.com cannot be resolved at this time: host lookup did not complete
search_tidyup called
>>>>>>>>>>>>>>>> Exim pid=14035 terminating with rc=1 >>>>>>>>>>>>>>>>

Here's the extract that looks particularly weird to me (from the end of that output):

dnslookup_relay_to_domains router called for foobar@romandie.com
  domain = romandie.com
DNS lookup of romandie.com (MX) succeeded
dnslookup_relay_to_domains router: defer for foobar@romandie.com
  message: host lookup did not complete
foobar@romandie.com cannot be resolved at this time: host lookup did not complete

How is it possible for the DNS lookup to both succeed and not complete? What am I doing wrong?

I've tried doing a DNS lookup with dig from the machine on which exim4 is installed, and the results look fine to me:

user@host:~$ dig mx romandie.com
;; Warning: Message parser reports malformed message packet.

; <<>> DiG 9.9.5-3-Ubuntu <<>> mx romandie.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36151
;; flags: qr aa rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: Message has 1 extra bytes at end

;; QUESTION SECTION:
;romandie.com.                  IN      MX

;; ANSWER SECTION:
romandie.com.           3600    IN      A       37.35.105.169
romandie.com.           3600    IN      A       37.35.105.166

;; Query time: 19 msec
;; SERVER: 10.0.2.3#53(10.0.2.3)
;; WHEN: Tue Sep 09 23:14:45 UTC 2014
;; MSG SIZE  rcvd: 63

The lookup itself looks fine.

Why is exim saying that it is both succeeding and failing at the same time?

Best Answer

The recursive DNS resolver you are using (the one on 10.0.2.3) is severely broken. Your dig command asks it for the MX record, but the answer it sends back contains two A records instead. This is not because the domain lacks an MX record: I just checked, and the domain does have one. Moreover, dig warns you that the reply packet is malformed: “WARNING: Message has 1 extra bytes at end”.
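
The two symptoms visible in the dig output (answer records whose type does not match the question, and parser warnings) can be checked mechanically. What follows is a minimal Python sketch, not something from the original question or answer. The function name check_dig_output and both regular expressions are illustrative assumptions, and they only cover the plain dig layout pasted in the question.

```python
import re

# A resource-record line from dig, e.g.
# "romandie.com.   3600   IN   A   37.35.105.169"
RECORD = re.compile(r"^(\S+)\s+\d+\s+IN\s+(\S+)\s+")
# The question line, e.g. ";romandie.com.   IN   MX"
QUESTION = re.compile(r"^;(\S+)\s+IN\s+(\S+)\s*$")


def check_dig_output(text):
    """Return (asked_type, answer_types, problems) for one dig run."""
    asked = None
    answers = []
    problems = []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            section = None                  # a blank line ends a section
        elif line.lower().startswith(";; warning"):
            problems.append(line[3:])       # keep dig's own warning text
        elif line.startswith(";; QUESTION SECTION"):
            section = "question"
        elif line.startswith(";; ANSWER SECTION"):
            section = "answer"
        elif line.startswith(";;"):
            section = None                  # any other dig comment line
        elif section == "question":
            match = QUESTION.match(line)
            if match:
                asked = match.group(2)
        elif section == "answer":
            match = RECORD.match(line)
            if match:
                answers.append(match.group(2))
    if asked is not None:
        # CNAME and RRSIG records can legitimately accompany an answer;
        # anything else of a different type is treated as suspicious here.
        unexpected = sorted(set(answers) - {asked, "CNAME", "RRSIG"})
        if unexpected:
            problems.append(
                "asked for %s but answer section contains %s"
                % (asked, ", ".join(unexpected))
            )
    return asked, answers, problems


# Excerpt of the dig output from the question.
SAMPLE = """
;; Warning: Message parser reports malformed message packet.
;; WARNING: Message has 1 extra bytes at end

;; QUESTION SECTION:
;romandie.com.                  IN      MX

;; ANSWER SECTION:
romandie.com.           3600    IN      A       37.35.105.169
romandie.com.           3600    IN      A       37.35.105.166
"""

for problem in check_dig_output(SAMPLE)[2]:
    print(problem)
```

Run on the excerpt from the question, this reports both dig warnings plus the MX/A mismatch. The type check is deliberately simplistic (for instance, it would misreport a query of type ANY), so treat it as a quick sanity check only.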

I recommend you stop using that faulty DNS server. Try putting another DNS resolver in /etc/resolv.conf. I have had good experience with 8.8.8.8.
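
Before editing /etc/resolv.conf, it can help to confirm that another resolver returns the right record type for the same question. The rough sketch below uses only the Python standard library. The helper names (build_query, skip_name, answer_types, query_types) are hypothetical. It speaks plain DNS over UDP with no retries or truncation handling, and a badly malformed reply may simply raise an exception, so it is meant for a quick comparison rather than as a replacement for dig.

```python
import socket
import struct

# Record type codes from the DNS specification (RFC 1035).
A, CNAME, MX = 1, 5, 15


def build_query(name, qtype, query_id=0x1234):
    """Build a single-question DNS query with recursion desired."""
    # Header: id, flags (RD bit set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class IN


def skip_name(packet, offset):
    """Return the offset just past a (possibly compressed) domain name."""
    while True:
        length = packet[offset]
        if length == 0:
            return offset + 1
        if length & 0xC0 == 0xC0:       # compression pointer: two bytes
            return offset + 2
        offset += 1 + length


def answer_types(packet):
    """Return the record type codes found in the answer section."""
    qdcount, ancount = struct.unpack("!HH", packet[4:8])
    offset = 12                          # fixed header size
    for _ in range(qdcount):
        offset = skip_name(packet, offset) + 4   # QTYPE + QCLASS
    types = []
    for _ in range(ancount):
        offset = skip_name(packet, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack(
            "!HHIH", packet[offset:offset + 10]
        )
        types.append(rtype)
        offset += 10 + rdlength
    return types


def query_types(resolver, name, qtype, timeout=3.0):
    """Send one UDP query to `resolver` and return the answer type codes."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qtype), (resolver, 53))
        packet, _addr = sock.recvfrom(4096)
    return answer_types(packet)
```

For example, the results of query_types("10.0.2.3", "romandie.com", MX) and query_types("8.8.8.8", "romandie.com", MX) can be compared side by side. A healthy resolver should include type 15 (MX) in the list, whereas a list made up only of type 1 (A) reproduces the problem seen in the dig output.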