Is moving Entity Framework objects over a web service really the best way?

entity-framework wcf web-services

I've inherited a .NET project that has close to two thousand clients out in the field that need to push data periodically up to a central repository. The clients wake up and attempt to push the data up via a series of WCF web services, passing each Entity Framework entity as a parameter. Once the service receives an object, it performs some business logic on the data and then inserts it into its own database, which mirrors the database on the client machines.
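To give an idea of the shape of the setup (the names below are made up for illustration, not the actual code), the contracts essentially take the EF entity class itself as the operation parameter:

```csharp
using System.ServiceModel;

// Illustrative sketch only - these names are made up, but the shape matches:
// the WCF operation takes the Entity Framework entity directly as its parameter.
public class MeterReading            // stands in for a generated EF entity class
{
    public int Id { get; set; }
    public int MeterId { get; set; }
    public System.DateTime ReadAtUtc { get; set; }
    public decimal Value { get; set; }
}

[ServiceContract]
public interface IUploadService
{
    [OperationContract]
    void PushReading(MeterReading reading);
}
```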

The trick is that this data is being transmitted over a metered connection, which is very expensive, so optimizing the data is a serious priority. We are already using a custom encoder that compresses the data while it is being transmitted (and decompresses it on the other end), and this reduces the data footprint. However, the amount of data the clients are using still seems ridiculously large, given the amount of information that is actually being transmitted.

It seems to me that Entity Framework itself may be to blame. I suspect the objects are very large when serialized to be sent over the wire, carrying a lot of context information and who knows what else, when all we really need to send is the 'new' inserts.

Is using Entity Framework and WCF services as we have done so far the correct way, architecturally, of approaching this n-tiered, asynchronous, push-only problem? Or is there a different approach that could optimize the data use?

Best Answer

There's no doubt you could optimize this application - any application can be optimized. But before you dive in, are you sure you need to? Is there a problem with the current process - is it too slow, too expensive, is someone complaining? If you're just doing this as an iterative improvement, imagining you'll be a hero if you can reduce data transfer by 30%, the risks are far bigger than the benefits. Rewriting your service contracts means adding transformation code at each end, which means rehydrating EF objects and ensuring they have the correct state to reattach to the data context. It sounds easy, but it's a big change.
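To make the scale of that change concrete, here is a rough sketch of the server-side mapping a DTO-based contract would require. All of the type names (ReadingDto, MeterReading, RepositoryContext) are hypothetical, and this assumes the EF DbContext API:

```csharp
using System;
using System.Data.Entity;               // EF DbContext API (assumed)
using System.Runtime.Serialization;

// Hypothetical wire-level DTO - carries only the fields that actually need to travel.
[DataContract]
public class ReadingDto
{
    [DataMember] public int MeterId { get; set; }
    [DataMember] public DateTime ReadAtUtc { get; set; }
    [DataMember] public decimal Value { get; set; }
}

// Stand-ins for the real EF entity and context.
public class MeterReading
{
    public int Id { get; set; }
    public int MeterId { get; set; }
    public DateTime ReadAtUtc { get; set; }
    public decimal Value { get; set; }
}

public class RepositoryContext : DbContext
{
    public DbSet<MeterReading> Readings { get; set; }
}

public class UploadService
{
    public void SaveReading(ReadingDto dto)
    {
        // "Rehydrate": map the DTO back onto an EF entity.
        var entity = new MeterReading
        {
            MeterId   = dto.MeterId,
            ReadAtUtc = dto.ReadAtUtc,
            Value     = dto.Value
        };

        using (var context = new RepositoryContext())
        {
            // Attach in the correct state - here a new insert - so that
            // SaveChanges generates an INSERT rather than an UPDATE.
            context.Entry(entity).State = EntityState.Added;
            context.SaveChanges();
        }
    }
}
```

Multiply that by every entity type in the contract, on both client and server, and the size of the change becomes clear.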

You should definitely profile how expensive the EF objects are compared to equivalent DTOs. You're running on a hunch at the moment. How much data will you save by making this change?
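A quick way to test that hunch is to measure what each object graph actually serializes to - something like the following, assuming the DataContractSerializer that WCF uses by default (the helper and names are illustrative):

```csharp
using System.IO;
using System.Runtime.Serialization;

static class WireSizeProbe
{
    // Returns the serialized size in bytes of an object graph, using the
    // DataContractSerializer (WCF's default). For EF proxy types you may need
    // to disable proxy creation on the context before measuring.
    public static long SerializedSize(object graph)
    {
        var serializer = new DataContractSerializer(graph.GetType());
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, graph);
            return stream.Length;
        }
    }
}

// Usage (illustrative):
// Console.WriteLine("EF entity: {0} bytes", WireSizeProbe.SerializedSize(entity));
// Console.WriteLine("DTO:       {0} bytes", WireSizeProbe.SerializedSize(dto));
```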

Are there simpler, more obvious improvements? When optimizing WCF services in the past, I've found that Windows Authentication headers were a huge part of the request size - enormous security tokens being passed between client and server - and they can be replaced with a much smaller certificate. Is all the data being sent completely necessary? I assume you're sending binary (net.tcp) rather than text (HTTP); if you're not, that's an obvious improvement.
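As a sketch of the kind of change I mean (not drop-in configuration - your bindings and certificates will differ), net.tcp can keep its binary encoding while swapping Windows credentials for a certificate at the transport level:

```csharp
using System.ServiceModel;

static class BindingFactory
{
    public static NetTcpBinding CreateCertificateSecuredBinding()
    {
        // Binary-encoded net.tcp transport, with a client certificate instead of
        // the (much larger) Windows Authentication tokens on every request.
        var binding = new NetTcpBinding(SecurityMode.Transport);
        binding.Security.Transport.ClientCredentialType = TcpClientCredentialType.Certificate;
        return binding;
    }
}
```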

DTOs are a useful pattern, and they are championed heavily by MVC guys, but not because of any data-saving concern - it's because they provide a service interface, an abstraction from the database. Without a DTO you add a dependency on your database model. That doesn't really apply in your case, because you have the same database model on both ends. The simplest approach is to send the EF objects over the wire and insert them directly, just as you're doing now.

Reducing data traffic will save some money - but how much? Enough to warrant your time developing this solution, plus the additional maintenance time that comes with the increased application complexity?