-
Notifications
You must be signed in to change notification settings - Fork 413
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change diff to evaluate PKs for FK relationships #420
Change diff to evaluate PKs for FK relationships #420
Conversation
elif (field.one_to_one or field.many_to_one) and hasattr(field, "rel_class"): | ||
value = smart_str( | ||
getattr(obj, field.get_attname(), None), strings_only=True | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is the primary fix.
Codecov Report
@@ Coverage Diff @@
## master #420 +/- ##
==========================================
+ Coverage 93.55% 93.66% +0.10%
==========================================
Files 29 29
Lines 869 884 +15
==========================================
+ Hits 813 828 +15
Misses 56 56
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
def test_no_log_entry_created_on_related_object_string_update(self): | ||
t1 = self.test_date | ||
with freezegun.freeze_time(t1): | ||
simple = SimpleModel.objects.create(text="Test Foo") | ||
one_simple = SimpleModel.objects.create(text="Test Bar") | ||
instance = RelatedModel.objects.create( | ||
one_to_one=simple, related=one_simple | ||
) | ||
|
||
t2 = self.test_date + datetime.timedelta(days=20) | ||
with freezegun.freeze_time(t2): | ||
# Order is important. Without special FK handling, the arbitrary in memory | ||
# changes to the (same) related object's signature result in a perceived | ||
# update where no update has occurred. | ||
one_simple.text = "Test Baz" | ||
instance.save() | ||
one_simple.save() | ||
|
||
# Assert that only one log for the instance was created | ||
self.assertEqual(instance.history.all().count(), 1) | ||
# Assert that two logs were created for the parent object | ||
self.assertEqual(one_simple.history.all().count(), 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without the primary fix, this test case fails. Specifically, instance
will have two LogEntry
records: a create and an update. The update is false though because there was no actual change on the record. The only change that occurred was on the SimpleModel.text
field.
auditlog/models.py
Outdated
def _get_changes_display_for_fk_field( | ||
self, field: Union[models.ForeignKey, models.OneToOneField], value: Any | ||
) -> str: | ||
""" | ||
:return: An object display string for a given FK value and the field to which it belongs | ||
""" | ||
# Return "None" if the FK value is "None". | ||
if value == "None": | ||
return value | ||
|
||
# Attempt to convert given value to the PK type for the related model | ||
try: | ||
pk_value = field.related_model._meta.pk.to_python(value) | ||
# ValidationError will handle legacy values where string representations were | ||
# stored rather than PKs. This will also handle cases where the PK type is | ||
# changed between the time the LogEntry is created and this method is called. | ||
except ValidationError: | ||
return value | ||
# Attempt to return the string representation of the object | ||
try: | ||
return smart_str(field.related_model.objects.get(pk=pk_value)) | ||
# ObjectDoesNotExist will be raised if the object was deleted. | ||
except ObjectDoesNotExist: | ||
return f"Deleted '{field.related_model.__name__}' ({value})" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One consequence of this is that the related objects' str representation is based on the CURRENT state of that object, not the historical state. This is not... great when looking at the audit log of object A, it should not tell you the lifecycle of object B.
There are probably ways around this by using the serialized data feature, but this has quite a few drawbacks as serialized data may not be compatible with the current code.
One possible solution is storing a tuple in the changes dict (pk, str)
but this changes the shape in other ways and represents a more complex breaking change.
or maybe having a datetime in a str method is just an antipattern (the root issue, on why we are here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
String representations are usually used as a way to have a human readable display for the object. I do not believe that __str__
is intended to show state. Therefore showing the human readable form of which object was attached to this field on this object may not be so problematic. One could also argue that displaying the current version of the string representation for both the old and new relationships is a feature add because it's easier to compare. That said, I don't think people are likely to change the format of the returned __str__
value that often. (Although losing the string representation when the object is deleted is unfortunate.)
Also... A tuple where we store the PK for comparison and the string representation for display here is the best case scenario. I'd be happy to implement that if we are ok with a (potentially) breaking change.
def test_log_entry_created_if_obj_strings_are_same_for_two_objs(self): | ||
"""FK changes trigger update when the string representation is the same.""" | ||
t1 = self.test_date | ||
with freezegun.freeze_time(t1): | ||
simple = SimpleModel.objects.create(text="Test Foo") | ||
one_simple = SimpleModel.objects.create(text="Twinsies", boolean=True) | ||
two_simple = SimpleModel.objects.create(text="Twinsies", boolean=False) | ||
instance = RelatedModel.objects.create( | ||
one_to_one=simple, related=one_simple | ||
) | ||
|
||
t2 = self.test_date + datetime.timedelta(days=20) | ||
with freezegun.freeze_time(t2): | ||
instance.related = two_simple | ||
instance.save() | ||
|
||
self.assertEqual(instance.history.all().count(), 2) | ||
log_create = instance.history.filter(timestamp=t1).first() | ||
log_update = instance.history.filter(timestamp=t2).first() | ||
self.assertEqual(int(log_create.changes_dict["related"][1]), one_simple.id) | ||
self.assertEqual(int(log_update.changes_dict["related"][1]), two_simple.id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without the primary fix, this test case also fails. This seems pretty likely to be happening in other installations.
auditlog/models.py
Outdated
self, field: Union[models.ForeignKey, models.OneToOneField], value: Any | ||
) -> str: | ||
""" | ||
:return: An object display string for a given FK value and the field to which it belongs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this function does not return an object
I think this change is a breaking change as the patch is changing the representation of the related field. we need to add changelog for that and explain the change and the breaking change. Also, It would be great if @alieh-rymasheuski can take a look. |
@hramezani Sorry for the delay in returning to this. I've added a statement in the change log and updated the docstring as requested. Let me know if there is anything else I can do to help get this patch in the 2.2 release. If it is critical to preserve the object representation in the changes field (instead of the FK), there are options but they involve more complexity and my estimation is that the complexity isn't worth it. |
Thanks @sum-rock for addressing the comments. As I mentioned in #420 (comment) it is a breaking change and we can not include it in |
@sum-rock we are preparing |
The diff method now evaluates the primary keys for changes to determine if a new LogEntry should be created. Previously, the diff method was evaluating the string representation of the object. This was flawed because cases can occur when a parent object has in memory changes to its string string representation and the child related object is saved prior to these in memory changes being persisted. In these cases a new LogEntry object would be created erroneously. This cases is asserted with a test and a regression test will verify the bug. The consequence of these updates is that the ``LogEntry.changes`` field now stores primary keys rather than string representations for related objects. To keep the changes dictionary display unaffected by this update, a method was added to the ``LogEntry`` model. This method looks up the object display string from the stored foreign key. Exceptions were written to handle backwards compatibility.
Because the string representation is not unique for every object, relying on it to determine FK diffs may not capture all changes. This test case shows another type of scenario that is fixed by comparing primary keys rather than object string representations. This is likely occurring fairly regularly but is hard to spot because it is an error of omission.
b219fbe
to
39e0be9
Compare
@hramezani This branch is now refreshed |
I am going to write up an issue to relate to this PR in the next day or two. At the moment, I'm focused on getting this fix merged into our fork to resolve some bloating on our LogEntry table.(Done, see attached issue)Take a look and let me know what you think of this fix. I did look through the current issues and couldn't find anyone else complaining about this so maybe we've got some anti-patterns going on in our code. The issue is only happening in some of our more complex systems but I'm sure we're not that unique. I'll highlight the test case that shows the fix.
After letting this sit for an hour or two I realized this change will fix another more insidious bug. Currently, in cases where there are more than one object with the same string representation (e.g., two or more "John Smith" objects in the
Customer
model) and a related object is updated to a different object with the same string representation (e.g., an update to aPhoneNumber
object "+1555-555-5555" to belong to a different "John Smith"Customer
object) auditlog is not able to capture that update. This is because because thediff
method is comparing object strings rather than primary keys and there is no guarantee that the__str__
method is returning a unique value. Therefore, a portion of updates toForeignKey
fields are likely not being captured by auditlog on current installations.In this commit you can see a test case that demonstrates the flaw and how it is resolved by this change.
The diff method now evaluates the primary keys for changes to determine
if a new LogEntry should be created. Previously, the diff method was
evaluating the string representation of the object. This was flawed
because cases can occur when a parent object has in memory changes to
its string string representation and the child related object is saved
prior to these in memory changes being persisted. In these cases a new
LogEntry object would be created erroneously. This cases is asserted
with a test and a regression test will verify the bug.
The consequence of these updates is that the
LogEntry.changes
fieldnow stores primary keys rather than string representations for related
objects. To keep the changes dictionary display unaffected by this
update, a method was added to the
LogEntry
model. This method looksup the object display string from the stored foreign key. Exceptions
were written to handle backwards compatibility.
Issue