Zip Code and Distance Calculation – How to Architect in C#

Architecturecentity-framework

I am building an app using MVC 4/Razor and EF 4/ .NET 4 in which it is possible to search items by zip code and thus retrieving items only a certain distance from your current zip code. I have a class that calculates distance using census gazetteer files (zip, lat, and lon) and the Haversine formula.

My specific problem deals with the SQL portion of my code. I have a view that uses a pager (20 results per page) and has hyper links that order the columns in asc and desc order. I also have 3 drop down lists that can further filter the results down: Manufacturer, Category, Distance.

When I am filtering by say 50 miles my class retrieves an IEnumerable of zipcodes that are within the requested distance. At times this can be in the 10,000 – 30,000 area. Each Item object contains an int ZipCode

Relationships

The problem EF generates when I try to grab item objects from the database using a list of acceptable zip codes is One of your queries is too deeply nested. Try breaking them up. I am assuming that it is because you can not have a where clause with thousands of conditions. This makes sense. I had found a link a few days ago that basically dealt with creating a temporary table to use as the list in the where condition but can not find the link anymore. It also applied to raw ADO.NET as opposed to EF.

My question is basically, how can I implement the functionality that I want without having make huge changes?

 using (var context = new JDMExchangeEntities())
        {

            var results = context.Items.Include(P => P.Manufacturer).Include(P => P.Category)
                .Include(P => P.VehicleMake).Include(P => P.VehicleModel).Include(P => P.VehicleYear);



           //Category logic
            if (!string.IsNullOrEmpty(Categories))
            {
                results = results.Where(P => P.Category.Name == Categories);
            }

            //Manufacturers logic
            if (!string.IsNullOrEmpty(Manufacturers))
            {
                results = results.Where(P => P.Manufacturer.Name == Manufacturers);
            }

            //Vehicle logic
            if (Vehicle > 0)
            {
                results = results.Where(P => P.YearId == Vehicle);
            }

            if (acceptableCodes.Count > 0)
            {
                List<int> codes = acceptableCodes.Keys.ToList();
                results = results.Where(P => codes.Contains((int)P.ZipCode));

                    //return _UoW.tblcoursebookingRepo.All
   //.Where(cb => AttendanceIDs.Contains(cb.Attended))
   //.ToList();
            }
            ////Take out items that are not in the ManualQueryResults Dictionary
            //if (!String.IsNullOrEmpty(ManualQuery))
            //{
            //    var all = listB.Where(b => listA.Any(a => a.code == b.code));
            //    results = results.Where(r => ManualQueryResults.Any(a => a == r.id));
            //}

            ////Take out zips not in the closeCodes variable
            //if (!string.IsNullOrEmpty(DistanceLimit))
            //{
            //    results = results.Where(P => closeCodes.Any(z => z.ZipCode.Code == P.ZipCode));
            //}

            switch (sortOrder)
            {
                case "Price_desc":
                    results = results.OrderByDescending(P => P.Price);
                    break;
                case "Price":
                    results = results.OrderBy(P => P.Price);
                    break;
                case "Date_asc":
                    results = results.OrderBy(P => P.PostDate);
                    break;
                default:
                    results = results.OrderByDescending(P => P.PostDate);
                    break;

            }
            int pageSize = 20;
            int pageNumber = (page ?? 1);


            IPagedList<Item> cc = results.ToPagedList(pageNumber, pageSize);

I could call .ToList() somewhere before I hit the .ToPagedList() but then I will be returning back thousands of entries and thus lagging the response time by a few seconds.

After that I could easily make a paged list from the IEnumerable list. I want to try and minimize response time and so I think the area I need to fix is in the SQL code so as to not pull unnecessary data and then filter it out.

Best Answer

Doing any geospatial work in SQL Server without using the geospatial types is, frankly, insane.

Store the postal code if you want for reference, but query based on latitude and longitude using the geography type. It even has an STDistance function that can use a geospatial index in order to return results in constant time.

Yes, that involves what you'd probably call "huge" changes. Do it. I've seen search systems implemented using the approach you're trying to use and they are, quite frankly, horrible. It's so simple to do a true radius search these days, you're doing yourself and your users a major disservice to hack together a half-assed version like this.

Also, zip codes as numeric? Obviously you're only intending this to be used in the USA, but even so... great way to ensure extreme pain if you ever need or want to globalize.

Related Topic