Smtp sending via Exchange some mails do not complete but still send – why

exchangesmtp

I have an application that sends smtp email via a local Exchange server at a customer. All the email sends ok (it gets to the recipient) but some email gets sent multiple times because my application never receives the 221 respose code to say that it has worked ok and so re-tries. We could remve the retries but often this is needed for genuine problems.

It happens more commonly on certain desination email addresses and will repeatedly happen on that email once it has.

I have captured the TCP stream of one that does work and one that doesn't. Obviously names changed to protect the guilty.

Did work:
220 SBS4S1.example.local Microsoft ESMTP MAIL Service ready at Thu, 16 Apr 2015 11:51:16 +0100
EHLO example-PC
250-SBS4S1.example.local Hello [192.168.111.15]
250-SIZE 10485760
250-PIPELINING
250-DSN
250-ENHANCEDSTATUSCODES
250-STARTTLS
250-AUTH
250-8BITMIME
250-BINARYMIME
250-CHUNKING
250-XEXCH50
250 XSHADOW
RSET
250 2.0.0 Resetting
MAIL FROM:<yyyy@example.com>
250 2.1.0 Sender OK
RCPT TO:<zzz@example.com>
250 2.1.5 Recipient OK
DATA
354 Start mail input; end with <CRLF>.<CRLF>
--DATA Here, identical other than MIME/Content IDs in both emails--
.
250 2.6.0 <ac8dcf01-0d61-4e76-a52e-3946e68bf3a1@SBS4S1.example.local> [InternalId=26602] Queued mail for delivery
QUIT
221 2.0.0 Service closing transmission channel

.

    Did NOT work:
    220 SBS4S1.example.local Microsoft ESMTP MAIL Service ready at Thu, 16 Apr 2015 11:43:39 +0100
    EHLO example-PC
    250-SBS4S1.example.local Hello [192.168.111.15]
    250-SIZE 10485760
    250-PIPELINING
    250-DSN
    250-ENHANCEDSTATUSCODES
    250-STARTTLS
    250-AUTH
    250-8BITMIME
    250-BINARYMIME
    250-CHUNKING
    250-XEXCH50
    250 XSHADOW
    RSET
    250 2.0.0 Resetting
    MAIL FROM:<yyyy@example.com>
    250 2.1.0 Sender OK
    RCPT TO:<zzz@example.com>
    250 2.1.5 Recipient OK
    DATA
    354 Start mail input; end with <CRLF>.<CRLF>
    --DATA Here, identical other than MIME/Content IDs in both emails--
    .
    QUIT
    250 2.6.0 <96ac271e-5525-4d89-aefa-a7469bae04df@SBS4S1.example.local> [InternalId=26573] Queued mail for delivery


--This then times out but never gets the 221--

The only difference I can see other than the missing 221 is that the 250 and the QUIT are the other way around in the message that "fails". Is this a known Exchange quirk we have to work around, something we could fix or should cope with in our application or something we could ask to be changed on the mail server? We don't want to break anything for the normal, common situation.

Update: I thought we wouldn't be able to get hold of the Exchange logs (the admins didn't even know where they were) but I have! So the relevent line is:

Tarpit for '0.00:00:32.619' due to 'DelayedAck',Expired;Timeout

It looks like it might be due to:

https://mikecrowley.wordpress.com/2010/07/24/delayed-smtp-acknowledgement/

Got to try and get it tested first.

Best Answer

It seems that sometimes your application sends QUIT command before receiving reply for the "the final dot" of the message (250 for OK).
It seems that the MTA (MS Exchange) ignores commands received before it sends the reply.

Suggested Fixes:
1) Increase timeout for waiting for the reply, do not send QUIT before receiving it (even when PIPELINING extension is advertised).
2) Do not resend if you receive 250 reply for "the final dot". In such case MTA (MS Exchange) takes over responsibility for delivering the message.