Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EF Core AddRange and entities with duplicate keys #24780

Closed
epatrick opened this issue Apr 27, 2021 · 10 comments
Closed

EF Core AddRange and entities with duplicate keys #24780

epatrick opened this issue Apr 27, 2021 · 10 comments
Labels
closed-no-further-action The issue is closed and no further action is planned. customer-reported

Comments

@epatrick
Copy link

I have a use case where a third party provides an enumeration of items that I wish to merge into a database using EF Core. There are use cases where the third party provides an item with the same key more than once in the enumeration.

Id Account LastPayment
12345 ABC123 1/1/2021
23456 BCD234 2/1/2021
12345 ABC123 2/1/2021

Ideally, I would like BOTH updates to 12345 to take place (we audit history in the data tier).

I get an error when attempting to add 12345 twice to the same context. Code POC is:

[Fact]
public async Task HandlesDuplicateKeys()
{
    var services = new ServiceCollection()
        .AddDbContext<ItemContext>(options => options.UseInMemoryDatabase(Guid.NewGuid().ToString()))
        .BuildServiceProvider();

    var items = new List<ItemA>()
    {
        new ItemA() { Id = 1, A = "Foo" },
        new ItemA() { Id = 1, A = "Bar" }
    };

    using (var context = services.GetRequiredService<ItemContext>())
    {
        context.AList.AddRange(items);
        await context.SaveChangesAsync();
    }
}

public class ItemA
{
    public int Id { get; set; }
    public string? A { get; set; }
}

public class ItemContext : DbContext
{
    public RepositoryContext(DbContextOptions<RepositoryContext> options) : base(options)
    { }

    public DbSet<ItemA> AList { get; set; }
}

yields:

Message:
System.InvalidOperationException : The instance of entity type 'ItemA' cannot be tracked because another instance with the same key value for {'Id'} is already being tracked. When attaching existing entities, ensure that only one entity instance with a given key value is attached. Consider using 'DbContextOptionsBuilder.EnableSensitiveDataLogging' to see the conflicting key values.

What's the appropriate way to manage this use case?

@ajcvickers
Copy link
Contributor

Normally it is common to do identity resolution before tracking and saving. For example:

foreach (var item in items)
{
    var existing = context.AList.Local.FirstOrDefault(e => e.Id == item.Id);
    if (existing != null)
    {
        existing.A = item.A;
    }
    else
    {
        context.Add(item);
    }
}

context.SaveChanges();

However, this will only save the last value. If you want to save the first value and then replace it with subsequent values, then this will require calling SaveChanges more than once. For example:

foreach (var item in items)
{
    var existing = context.AList.Local.FirstOrDefault(e => e.Id == item.Id);
    if (existing != null)
    {
        context.SaveChanges();
        existing.A = item.A;
    }
    else
    {
        context.Add(item);
    }
}

context.SaveChanges();

@epatrick
Copy link
Author

Thanks for the feedback.

Does this imply the only mechanisms EF has to accomplish my goal are:

  • Calling SaveChanges() per changed item as you demonstrate above, or
  • Manually building a SQL statement and using ExecuteSqlRaw along these lines:
UPDATE Items SET A = 'Foo' WHERE Id = 1
UPDATE Items SET A = 'Bar' WHERE Id = 1

I hesitate to call SaveChanges() on a per-item basis for performance reasons. I would think that batching update statements (perhaps for every X items) would be significantly more performant.

@roji
Copy link
Member

roji commented Apr 27, 2021

@epatrick you don't need to call SaveChanges on a per-item basis - note that it's outside of the loop in the above samples. You're indeed correct that batching multiple updates in a single SaveChanges can be important for performance.

@ajcvickers
Copy link
Contributor

@roji You do if you want the same item to be updated more than once in the database.

@epatrick EF Core does batch updates, but it is limited to a single update per entity instance per call to SaveChanges. The way I wrote the code above it batches as much as it can before saving, but it must save before another update to the same instance can be made.

@roji
Copy link
Member

roji commented Apr 27, 2021

My bad, missed the bit where two database updates are desired for the same row.

@stevendarby
Copy link
Contributor

You might want to consider explicitly beginning a transaction over the multiple SaveChanges if you want an error during the process to rollback the saves done up to that point

@epatrick
Copy link
Author

epatrick commented Apr 27, 2021

Excellent feedback folks; thank you!

@stevendarby, agreed on the transaction.

@ajcvickers, sorry I missed that SaveChanges is only called on the "duplicate"; that works.

My next challenge would be to abstract matching existing.A = item.A. I need to replace any non-null properties present on the object. I can certainly do this with reflection; does EF have anything built-in to do this for me? (It would seem to do that under the hood at some point.)

Lastly, can I abstract e.Id == item.Id to handle whatever identity is used (including compound keys)? (This is not such a big deal; can interface it if need be.)

@ajcvickers
Copy link
Contributor

@epatrick For setting values:

context.Entry(existing).CurrentValues.SetValues(item);

#7391 is tracking making working with key values easier. For now, it requires some code. For example:

var keyPropertyNames
    = context.Model.FindEntityType(typeof(ItemA)).FindPrimaryKey().Properties.Select(e => e.Name);

foreach (var item in items)
{
    var existing = context.ChangeTracker
        .Entries<ItemA>()
        .FirstOrDefault(e => AreSameEntity(e, context.Entry(item)));
    
    if (existing != null)
    {
        context.SaveChanges();
        existing.CurrentValues.SetValues(item);
    }
    else
    {
        context.Add(item);
    }
}

context.SaveChanges();

bool AreSameEntity(EntityEntry left, EntityEntry right)
{
    foreach (var propertyName in keyPropertyNames)
    {
        if (!Equals(left.Property(propertyName).CurrentValue, right.Property(propertyName).CurrentValue))
        {
            return false;
        }
    }

    return true;
}

@epatrick
Copy link
Author

Awesome - thanks for the help!

@ajcvickers ajcvickers added the closed-no-further-action The issue is closed and no further action is planned. label Apr 27, 2021
@stevendarby
Copy link
Contributor

stevendarby commented Apr 27, 2021

I think SetValues sets all matching properties and may not meet your “non-null properties” requirement, so some reflection might be required (assuming you’re aiming to code once for multiple types and not just ItemA)

@ajcvickers ajcvickers reopened this Oct 16, 2022
@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Oct 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-no-further-action The issue is closed and no further action is planned. customer-reported
Projects
None yet
Development

No branches or pull requests

4 participants