C# – .Net C# TcpClient / Socket HTTP Client Performance / Efficiency

chttpnetsocketstcpclient

I'm writing an HTTP client using the .Net TcpClient / Sockets.

So far, the client handles both Content-Length and chunked responses by iterating through the NetworkStream response (after writing a GET request to the TcpClient), parsing the headers and retrieving the relevant message body bytes / chunked bytes. To do this it uses the NetworkStream ReadByte method.

This all works fine, but performance is a key consideration of the application so I would like to make it as quick and efficient as possible.

Initially this will involve swapping ReadByte for Read for the message body (based on Content-Length) or chunked message body byte retrieval into an appropriately sized buffer, using ReadByte in all other areas (such as reading the Headers, Chunk sizes etc).

I'm interested to know thoughts on better / different ways to do this to achieve optimum performance? Obviously the main problem with HTTP is not knowing the length of the response stream unless it is parsed as it is retrieved.

There a specific reasons why I'm not using more abstract classes (eg HttpWebRequest) for this (I need better control at the socket level).

Many Thanks,

Chris

Best Answer

I suggest using a process with a medium sized buffer. Repeatedly fill the buffer until the response stream ends. When the buffer is full, or the stream ends, attach that buffer content onto the string (or whatever you're using to store the message).

If you want to read an important bit of information early in the stream, read just enough of the stream to see that. (In other words, you don't need to fill the buffer on the first pass if you don't want to.)

You should also consider using an event system to signal the presence of new data, which has been shaped in such a way that the main part of your process doesn't need to know anything about where the data came from or how you are buffering it.

Edit

In response to your comment question, if you have one connection that you are trying to reuse for multiple requests, you would create a thread that reads from it over and over. When it finds data, it uses the event to push it out for the main part of your program to handle. I don't have a sample handy, but you should be able to find several with a few bing or google searches.

Related Solutions

HTTP vs HTTPS performance

There's a very simple answer to this: Profile the performance of your web server to see what the performance penalty is for your particular situation. There are several tools out there to compare the performance of an HTTP vs HTTPS server (JMeter and Visual Studio come to mind) and they are quite easy to use.

No one can give you a meaningful answer without some information about the nature of your web site, hardware, software, and network configuration.

As others have said, there will be some level of overhead due to encryption, but it is highly dependent on:

Hardware
Server software
Ratio of dynamic vs static content
Client distance to server
Typical session length
Etc (my personal favorite)
Caching behavior of clients

In my experience, servers that are heavy on dynamic content tend to be impacted less by HTTPS because the time spent encrypting (SSL-overhead) is insignificant compared to content generation time.

Servers that are heavy on serving a fairly small set of static pages that can easily be cached in memory suffer from a much higher overhead (in one case, throughput was havled on an "intranet").

Edit: One point that has been brought up by several others is that SSL handshaking is the major cost of HTTPS. That is correct, which is why "typical session length" and "caching behavior of clients" are important.

Many, very short sessions means that handshaking time will overwhelm any other performance factors. Longer sessions will mean the handshaking cost will be incurred at the start of the session, but subsequent requests will have relatively low overhead.

Client caching can be done at several steps, anywhere from a large-scale proxy server down to the individual browser cache. Generally HTTPS content will not be cached in a shared cache (though a few proxy servers can exploit a man-in-the-middle type behavior to achieve this). Many browsers cache HTTPS content for the current session and often times across sessions. The impact the not-caching or less caching means clients will retrieve the same content more frequently. This results in more requests and bandwidth to service the same number of users.

C# – How to receive HTTP messages using Socket

I suggest that you don't implement this yourself - the HTTP 1.1 protocol is sufficiently complex to make this a project of several man-months.

The question is, is there a HTTP request protocol parser for .NET? This question has been asked on SO, and in the answers you'll see several suggestions, including source code for handling HTTP streams.

Converting Raw HTTP Request into HTTPWebRequest Object

EDIT: The rotor code is reasonably complex, and difficult to read/navigate as webpages. But still, the implementaiton effort to add SOCKS supports is much lower than implementing the entire HTTP protocol yourself. You will have something working within a few days at most that you can depend upon, that is based on a tried and tested implementation.

The request and response are read from/written to to a NetworkStream, m_Transport, in the Connection class. This is used in these methods:

internal int Read(byte[] buffer, int offset, int size) 
//and
private static void ReadCallback(IAsyncResult asyncResult)

both in http://www.123aspx.com/Rotor/RotorSrc.aspx?rot=42903

The socket is created in

private void StartConnectionCallback(object state, bool wasSignalled)

So you could modify this method to create a Socket to your socks server, and do the necessary handshake to obtain the external connection. The rest of the code can remain the same.

I gammered this info in about 30 mins looking on the pages on the web. This should go much faster if you load these files into an IDE. It may seem like a burden to have to read through this code - after all, reading code is far harder than writing it, but you are making just small changes to an already established, working system.

To be sure the changes work in all cases, it will be wise to also test when the connection is broken, to ensure that the client reconnects using the same method , and so re-establishes the SOCKS connection and sends the SOCKS request.

Best Answer

Related Solutions

HTTP vs HTTPS performance

C# – How to receive HTTP messages using Socket

Related Topic