SSL Handshake fails after clienthello

networkingopenssl

EDIT: I'll leave this as a good example of debugging SSL.

Final analysis: We had a networking issue in which one of our routers was misconfigured for a totally different application causing that router to be running borderline on CPU usage. The first few handshakes didn't pin it… then a subsequent one did and the connection was dropped as the router became overloaded. This presented as an SSL problem when it was really something else entirely.

Take-away: When SSL drops completely in mid connection make sure to check load level of routers in your control as well.


I've been at this for a while so hopefully I can provide an accurate picture.

We have SVN and Git repositories at a third party provider. We noticed that each would hang intermittently with no error given to the screen. In SVN's case the process had to be kill -9ed in Git a ctrl-C would suffice.

Looking into I found that the SSL handshake negotiation was failing. In SVN we would get to the handshake part and … nothing.

Now, I know we use these repos elsewhere without known issues, so that is the first rabbithole I go down.

  1. This only happens on one datacenter. Not in our whoel network. These repos are fine everywhere, else, but at this one datacenter about 1 in 3ish requests hangs on the SSL handshake.

  2. These same machines can access SSL handshakes without issue everywhere else except this third party provider.

So, I delved into the SSL handshake, and finally landed at:

openssl s_client -msg -debug -state -connect DOMAIN.DOM:443 which stops here:

CONNECTED(00000003)
SSL_connect:before/connect initialization
write to 0x20b94e0 [0x20b9560] (317 bytes => 317 (0x13D))
0000 - 16 03 01 01 38 01 00 01-34 03 03 88 0a 00 73 97   ....8...4.....s.
0010 - 12 69 c9 00 65 29 10 f6-57 00 57 44 e8 0e 3f cf   .i..e)..W.WD..?.
0020 - af a0 f9 80 e2 20 98 f0-d2 79 8c 00 00 9e c0 30   ..... ...y.....0
0030 - c0 2c c0 28 c0 24 c0 14-c0 0a c0 22 c0 21 00 a3   .,.(.$.....".!..
0040 - 00 9f 00 6b 00 6a 00 39-00 38 00 88 00 87 c0 32   ...k.j.9.8.....2
0050 - c0 2e c0 2a c0 26 c0 0f-c0 05 00 9d 00 3d 00 35   ...*.&.......=.5
0060 - 00 84 c0 12 c0 08 c0 1c-c0 1b 00 16 00 13 c0 0d   ................
0070 - c0 03 00 0a c0 2f c0 2b-c0 27 c0 23 c0 13 c0 09   ...../.+.'.#....
0080 - c0 1f c0 1e 00 a2 00 9e-00 67 00 40 00 33 00 32   .........g.@.3.2
0090 - 00 9a 00 99 00 45 00 44-c0 31 c0 2d c0 29 c0 25   .....E.D.1.-.).%
00a0 - c0 0e c0 04 00 9c 00 3c-00 2f 00 96 00 41 c0 11   .......<./...A..
00b0 - c0 07 c0 0c c0 02 00 05-00 04 00 15 00 12 00 09   ................
00c0 - 00 14 00 11 00 08 00 06-00 03 00 ff 01 00 00 6d   ...............m
00d0 - 00 0b 00 04 03 00 01 02-00 0a 00 34 00 32 00 0e   ...........4.2..
00e0 - 00 0d 00 19 00 0b 00 0c-00 18 00 09 00 0a 00 16   ................
00f0 - 00 17 00 08 00 06 00 07-00 14 00 15 00 04 00 05   ................
0100 - 00 12 00 13 00 01 00 02-00 03 00 0f 00 10 00 11   ................
0110 - 00 23 00 00 00 0d 00 20-00 1e 06 01 06 02 06 03   .#..... ........
0120 - 05 01 05 02 05 03 04 01-04 02 04 03 03 01 03 02   ................
0130 - 03 03 02 01 02 02 02 03-00 0f 00 01 01            .............
>>> TLS 1.2 Handshake [length 0138], ClientHello
01 00 01 34 03 03 88 0a 00 73 97 12 69 c9 00 65
29 10 f6 57 00 57 44 e8 0e 3f cf af a0 f9 80 e2
20 98 f0 d2 79 8c 00 00 9e c0 30 c0 2c c0 28 c0
24 c0 14 c0 0a c0 22 c0 21 00 a3 00 9f 00 6b 00
6a 00 39 00 38 00 88 00 87 c0 32 c0 2e c0 2a c0
26 c0 0f c0 05 00 9d 00 3d 00 35 00 84 c0 12 c0
08 c0 1c c0 1b 00 16 00 13 c0 0d c0 03 00 0a c0
2f c0 2b c0 27 c0 23 c0 13 c0 09 c0 1f c0 1e 00
a2 00 9e 00 67 00 40 00 33 00 32 00 9a 00 99 00
45 00 44 c0 31 c0 2d c0 29 c0 25 c0 0e c0 04 00
9c 00 3c 00 2f 00 96 00 41 c0 11 c0 07 c0 0c c0
02 00 05 00 04 00 15 00 12 00 09 00 14 00 11 00
08 00 06 00 03 00 ff 01 00 00 6d 00 0b 00 04 03
00 01 02 00 0a 00 34 00 32 00 0e 00 0d 00 19 00
0b 00 0c 00 18 00 09 00 0a 00 16 00 17 00 08 00
06 00 07 00 14 00 15 00 04 00 05 00 12 00 13 00
01 00 02 00 03 00 0f 00 10 00 11 00 23 00 00 00
0d 00 20 00 1e 06 01 06 02 06 03 05 01 05 02 05
03 04 01 04 02 04 03 03 01 03 02 03 03 02 01 02
02 02 03 00 0f 00 01 01
SSL_connect:unknown state

And it hangs there. On successful connections the next lines to that debug output should be

read from 0x1735590 [0x173ab70] (7 bytes => 7 (0x7))
0000 - 16 03 03 00 42 02                                 ....B.
0007 - <SPACES/NULS>
read from 0x1735590 [0x173ab7a] (64 bytes => 64 (0x40))
0000 - 00 3e 03 03 52 fd 41 af-09 0b 96 d6 c4 01 c2 1b   .>..R.A.........
0010 - eb e9 23 71 93 a6 1b 78-df 67 17 fe 86 c4 f9 82   ..#q...x.g......
0020 - 53 4f 36 09 00 c0 30 00-00 16 ff 01 00 01 00 00   SO6...0.........
0030 - 0b 00 04 03 00 01 02 00-23 00 00 00 0f 00 01 01   ........#.......
<<< TLS 1.2 Handshake [length 0042], ServerHello

02 00 00 3e 03 03 52 fd 41 af 09 0b 96 d6 c4 01
c2 1b eb e9 23 71 93 a6 1b 78 df 67 17 fe 86 c4
f9 82 53 4f 36 09 00 c0 30 00 00 16 ff 01 00 01
00 00 0b 00 04 03 00 01 02 00 23 00 00 00 0f 00
01 01
SSL_connect:SSLv3 read server hello A

So, as you can see this is way before the handshake is close to done.

From what I can tell, our clients initiate the handshake (cleinthello) and sometimes get silence on the wire.

I've already tried upgrading openssl, with no change. And again, this is only connecting to this one provider out of this one datacenter.

I'm down to some sort of networking issue filtering outgoing SSL traffic in some way, but I have no idea why that would be happening.

Any thoughts on where to go next in the troubleshooting process would be much appreciated. Thanks.

edit: Tried multiple ciphers and can reproduce with all, leading me again to possible network issue.

edit: Similar results with gnutls:

ifx14:/home/cadre/stresler# gnutls-cli -d 9 DOMAIN.DOM
Resolving 'DOMAIN.DOM'...
Connecting to 'X.X.X.X:443'...
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: DHE_RSA_AES_128_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: DHE_RSA_CAMELLIA_128_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: DHE_RSA_AES_256_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: DHE_RSA_CAMELLIA_256_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: DHE_RSA_3DES_EDE_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: DHE_DSS_AES_128_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: DHE_DSS_CAMELLIA_128_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: DHE_DSS_AES_256_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: DHE_DSS_CAMELLIA_256_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: DHE_DSS_3DES_EDE_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: DHE_DSS_ARCFOUR_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: DHE_PSK_SHA_AES_128_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: DHE_PSK_SHA_AES_256_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: DHE_PSK_SHA_3DES_EDE_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: DHE_PSK_SHA_ARCFOUR_SHA1
|<3>| HSK[0x1a11a60]: Removing ciphersuite: SRP_SHA_RSA_AES_128_CBC_SHA1
|<3>| HSK[0x1a11a60]: Removing ciphersuite: SRP_SHA_RSA_AES_256_CBC_SHA1
|<3>| HSK[0x1a11a60]: Removing ciphersuite: SRP_SHA_RSA_3DES_EDE_CBC_SHA1
|<3>| HSK[0x1a11a60]: Removing ciphersuite: SRP_SHA_DSS_AES_128_CBC_SHA1
|<3>| HSK[0x1a11a60]: Removing ciphersuite: SRP_SHA_DSS_AES_256_CBC_SHA1
|<3>| HSK[0x1a11a60]: Removing ciphersuite: SRP_SHA_DSS_3DES_EDE_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: RSA_AES_128_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: RSA_CAMELLIA_128_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: RSA_AES_256_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: RSA_CAMELLIA_256_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: RSA_3DES_EDE_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: RSA_ARCFOUR_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: RSA_ARCFOUR_MD5
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: PSK_SHA_AES_128_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: PSK_SHA_AES_256_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: PSK_SHA_3DES_EDE_CBC_SHA1
|<3>| HSK[0x1a11a60]: Keeping ciphersuite: PSK_SHA_ARCFOUR_SHA1
|<3>| HSK[0x1a11a60]: Removing ciphersuite: SRP_SHA_AES_128_CBC_SHA1
|<3>| HSK[0x1a11a60]: Removing ciphersuite: SRP_SHA_AES_256_CBC_SHA1
|<3>| HSK[0x1a11a60]: Removing ciphersuite: SRP_SHA_3DES_EDE_CBC_SHA1
|<2>| EXT[0x1a11a60]: Sending extension CERT_TYPE
|<2>| EXT[0x1a11a60]: Sending extension SERVER_NAME
|<3>| HSK[0x1a11a60]: CLIENT HELLO was send [131 bytes]
|<4>| REC[0x1a11a60]: Sending Packet[0] Handshake(22) with length: 131
|<2>| ASSERT: gnutls_cipher.c:204
|<4>| REC[0x1a11a60]: Sent Packet[1] Handshake(22) with length: 136

Best Answer

This is old and already answered, but we suffered the same exact issue and the cause was different.

The key was to sniff traffic on our edge router, where we saw ICMP messages to the server (GitHub.com) asking for fragmentation. This was messing the connection, with retransmissions, duplicated ACKs and so.

enter image description here

The ICMP packet had a field, MTU of next hop with a weird value, 1450. The usual value is 1500.

enter image description here

We checked our router and one of the interfaces (an Ethernet tunnel) had this value as MTU, so the router was taking the minumum MTU of all interfaces as next hop. As soon as we removed this interface (it was unused), the SSL handshake started to work again.

Related Topic