LINQ to SQL only supports one-to-one mapping of the database tables, views, stored procedures and functions available in Microsoft SQL Server. It's a great API for quickly building data access against a relatively well-designed SQL Server database. LINQ to SQL was first released with C# 3.0 and .NET Framework 3.5.
LINQ to Entities (ADO.NET Entity Framework) is an ORM (Object Relational Mapper) API which allows for a broad definition of object domain models and their relationships to many different ADO.NET data providers. As such, you can mix and match a number of different database vendors, application servers or protocols to design an aggregated mash-up of objects which are constructed from a variety of tables, sources, services, etc. The ADO.NET Entity Framework was released with .NET Framework 3.5 SP1.
This is a good introductory article on MSDN:
Introducing LINQ to Relational Data
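To make the one-to-one mapping concrete, here is a minimal LINQ to SQL sketch; the Customer class, the Customers table and the connection string are made up for illustration. The attributes tie the class directly to a single SQL Server table, and the DataContext translates the LINQ query into T-SQL:
using System;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

// Each mapped class corresponds to exactly one table in the database.
[Table(Name = "Customers")]
public class Customer
{
    [Column(IsPrimaryKey = true)]
    public int Id { get; set; }

    [Column]
    public string Name { get; set; }
}

public static class Linq2SqlDemo
{
    public static void Main()
    {
        // The DataContext talks directly to a single SQL Server database.
        using (var db = new DataContext(@"Server=.;Database=Shop;Integrated Security=true"))
        {
            var names = db.GetTable<Customer>()
                          .Where(c => c.Name.StartsWith("A"))
                          .Select(c => c.Name)
                          .ToList();   // the query is translated to T-SQL here

            names.ForEach(Console.WriteLine);
        }
    }
}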
Regarding your remark in the comments on your question:
"...SavingChanges (for each
record)..."
That's the worst thing you can do! Calling SaveChanges() for each record slows bulk inserts down dramatically. I would run a few simple tests which will very likely improve the performance:
- Call SaveChanges() once after ALL records.
- Call SaveChanges() after, for example, 100 records.
- Call SaveChanges() after, for example, 100 records, then dispose the context and create a new one.
- Disable change detection.
For bulk inserts I have been working and experimenting with a pattern like this:
using (TransactionScope scope = new TransactionScope())
{
    MyDbContext context = null;
    try
    {
        context = new MyDbContext();
        // Turn off change detection; we only add entities, so EF doesn't
        // need to scan all tracked entities on every Add.
        context.Configuration.AutoDetectChangesEnabled = false;

        int count = 0;
        foreach (var entityToInsert in someCollectionOfEntitiesToInsert)
        {
            ++count;
            context = AddToContext(context, entityToInsert, count, 100, true);
        }

        // Flush the last (possibly incomplete) batch.
        context.SaveChanges();
    }
    finally
    {
        if (context != null)
            context.Dispose();
    }

    scope.Complete();
}
private MyDbContext AddToContext(MyDbContext context,
    Entity entity, int count, int commitCount, bool recreateContext)
{
    context.Set<Entity>().Add(entity);

    if (count % commitCount == 0)
    {
        // Commit the current batch.
        context.SaveChanges();
        if (recreateContext)
        {
            // Replace the context so it doesn't keep tracking all the
            // entities that have already been saved.
            context.Dispose();
            context = new MyDbContext();
            context.Configuration.AutoDetectChangesEnabled = false;
        }
    }

    return context;
}
I have a test program which inserts 560,000 entities (9 scalar properties, no navigation properties) into the DB. With this code it completes in less than 3 minutes.
For the performance it is important to call SaveChanges() after "many" records ("many" being around 100 or 1,000). It also improves performance to dispose the context after SaveChanges and create a new one. This clears the context of all entities; SaveChanges doesn't do that, and the entities are still attached to the context in state Unchanged. It is the growing number of attached entities in the context that slows down insertion step by step. So it is helpful to clear the context after some time.
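You can see this in a small sketch (reusing the MyDbContext and Entity types from the code above; this is illustrative, not part of the measured test): after SaveChanges the entity is still tracked, just in state Unchanged.
using (var context = new MyDbContext())
{
    var entity = new Entity();
    context.Set<Entity>().Add(entity);

    context.SaveChanges();

    // The entity was written to the DB, but the context still tracks it:
    Console.WriteLine(context.Entry(entity).State);        // Unchanged
    Console.WriteLine(context.Set<Entity>().Local.Count);  // 1
}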
Here are a few measurements for my 560,000 entities:
- commitCount = 1, recreateContext = false: many hours (That's your current procedure)
- commitCount = 100, recreateContext = false: more than 20 minutes
- commitCount = 1000, recreateContext = false: 242 sec
- commitCount = 10000, recreateContext = false: 202 sec
- commitCount = 100000, recreateContext = false: 199 sec
- commitCount = 1000000, recreateContext = false: OutOfMemoryException
- commitCount = 1, recreateContext = true: more than 10 minutes
- commitCount = 10, recreateContext = true: 241 sec
- commitCount = 100, recreateContext = true: 164 sec
- commitCount = 1000, recreateContext = true: 191 sec
The behaviour in the first test above is that the performance is very non-linear and degrades extremely over time. ("Many hours" is an estimate; I never finished this test and stopped at 50,000 entities after 20 minutes.) This non-linear behaviour is not as significant in the other tests.
You can get the same behavior with Migrations by using automatic migrations.
In Configuration.cs, in the constructor, make sure automatic migrations and allow data loss are both set to true; see the sketch below.
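A minimal sketch of that constructor, assuming the usual Configuration class generated by enable-migrations (MyDbContext stands in for your actual context type):
using System.Data.Entity.Migrations;

internal sealed class Configuration : DbMigrationsConfiguration<MyDbContext>
{
    public Configuration()
    {
        // Apply model changes automatically on Update-Database.
        AutomaticMigrationsEnabled = true;

        // Also allow automatic changes that can drop data,
        // e.g. removing a column.
        AutomaticMigrationDataLossAllowed = true;
    }
}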
Now whenever your model changes, just execute Update-Database in the Package Manager Console.
The schema will be changed to match the models, and any existing data in the tables will stay there (unless it's in a renamed or removed column). The Seed method runs after the database is updated, so you can further control any changes to the data there.
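And a hedged sketch of a Seed override to match (Entity and its Name property are placeholders): AddOrUpdate keys on the given expression, so repeated updates don't duplicate the seed data.
// Goes inside the Configuration class sketched above.
protected override void Seed(MyDbContext context)
{
    // AddOrUpdate matches existing rows on Name, so running
    // Update-Database repeatedly won't insert duplicate seed rows.
    context.Set<Entity>().AddOrUpdate(
        e => e.Name,
        new Entity { Name = "Default" });
}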