Entity Framework – Can I Update an Attached Object Using a Detached Object?

centity-framework

I retrieve movie data from an external API. In a first phase I will scrape each movie and insert it into my own database. In a second phase I will periodically update my database by using the API's "Changes" API which I can query to see what movies have had their information changed.

My ORM layer is Entity-Framework. The Movie class looks like this:

class Movie
{
    public virtual ICollection<Language> SpokenLanguages { get; set; }
    public virtual ICollection<Genre> Genres { get; set; }
    public virtual ICollection<Keyword> Keywords { get; set; }
}

The problem arises when I have a movie that needs to be updated: my database will think of the object being tracked and the new one that I receive from the update API call as different objects, disregarding .Equals().

This causes a problem because when I now try to update the database with the updated movie, it will insert it instead of updating the existing Movie.

I had this issue before with the languages and my solution was to search for the attached language objects, detach them from the context, move their PK to the updated object and attach that to the context. When SaveChanges() is now executed, it will essentially replace it.

This is a rather smelly approach because if I continue this approach to my Movie object, it means I'll have to detach the movie, the languages, the genres and the keywords, look up each one in the database, transfer their IDs and insert the new objects.

Is there a way to do this more elegantly? Ideally I just want to pass in the updated movie to the context and have that select the correct movie to update based on the Equals() method, update all its fields and for each complex object: use the existing record again based on its own Equals() method and insert if it doesn't exist yet.

I can skip the detaching/attaching by providing .Update() methods on each complex object which I can use in combination of retrieving all the attached objects but this will still require me to retrieve every single existing object to then update it.

Best Answer

I didn't find what I was hoping for but I did find an improvement over the existing select-detach-update-attach sequence.

The extension method AddOrUpdate(this DbSet) allows you to do exactly what I want to do: Insert if it's not there and update if it found an existing value. I didn't realize using this sooner since I've really only seen it be used in the seed() method in combination with Migrations. If there is any reason I shouldn't use this, let me know.

Something useful to note: There is an overload available which allows you to specifically select how equality should be determined. Here I could have used my TMDbId but I instead opted to simply disregard my own ID altogether and instead use a PK on TMDbId combined with DatabaseGeneratedOption.None. I use this approach on each subcollection as well, where appropriate.

Interesting part of the source:

internalSet.InternalContext.Owner.Entry(existing).CurrentValues.SetValues(entity);

which is how the data is actually updated under the hood.

All that is left is calling AddOrUpdate on each object that I want to be affected by this:

public void InsertOrUpdate(Movie movie)
{
    _context.Movies.AddOrUpdate(movie);
    _context.Languages.AddOrUpdate(movie.SpokenLanguages.ToArray());
    // Other objects/collections
    _context.SaveChanges();
}

It isn't as clean as I hoped since I have to manually specify each piece of my object that needs to be updated but it's about as close as it will get.

Related reading: https://stackoverflow.com/questions/15336248/entity-framework-5-updating-a-record


Update:

It turns out my tests weren't rigorous enough. After using this technique I noticed that while the new language was added, it wasn't connected to the movie. in the many-to-many table. This is a known but seemingly low-priority issue and hasn't been fixed as far as I know.

In the end I decided to go for the approach where I have Update(T) methods on each type and follow this sequence of events:

  • Loop over collections in new object
  • For each entry in each collection, look it up in the database
  • If it exists, use the Update() method to update it with the new values
  • If it doesn't exist, add it to the appropriate DbSet
  • Return the attached objects and replace the collections in the root object with the collections of the attached objects
  • Find and update the root object

It's a lot of manual work and it's ugly so it'll go through a few more refactorings but now my tests indicate it should work for more rigorous scenarios.


After cleaning it up further I now use this method:

private IEnumerable<T> InsertOrUpdate<T, TKey>(IEnumerable<T> entities, Func<T, TKey> idExpression) where T : class
{
    foreach (var entity in entities)
    {
        var existingEntity = _context.Set<T>().Find(idExpression(entity));
        if (existingEntity != null)
        {
            _context.Entry(existingEntity).CurrentValues.SetValues(entity);
            yield return existingEntity;
        }
        else
        {
            _context.Set<T>().Add(entity);
            yield return entity;
        }
    }
    _context.SaveChanges();
}

This allows me to call it like this and insert/update the underlying collections:

movie.Genres = new List<Genre>(InsertOrUpdate(movie.Genres, x => x.TmdbId));

Notice how I reassign the retrieved value to the original root object: now it is connected to each attached object. Updating the root object (the movie) is done the same way:

var localMovie = _context.Movies.SingleOrDefault(x => x.TmdbId == movie.TmdbId);
if (localMovie == null)
{
    _context.Movies.Add(movie);
} 
else
{
    _context.Entry(localMovie).CurrentValues.SetValues(movie);
}