schemadiff: analyze and report foreign key loops/cycles#15062
schemadiff: analyze and report foreign key loops/cycles#15062shlomi-noach merged 5 commits intovitessio:mainfrom
schemadiff: analyze and report foreign key loops/cycles#15062Conversation
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
…possibility as this is already covered by ForeignKeyLoopError Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #15062 +/- ##
===========================================
- Coverage 47.49% 40.88% -6.61%
===========================================
Files 1149 1454 +305
Lines 239475 368246 +128771
===========================================
+ Hits 113730 150567 +36837
- Misses 117138 201488 +84350
- Partials 8607 16191 +7584 ☔ View full report in Codecov by Sentry. |
|
|
||
| type ForeignKeyLoopError struct { | ||
| Table string | ||
| Loop []string |
There was a problem hiding this comment.
Is it useful to have for each step in the loop also the column name? To more exactly identify the specific loop?
There was a problem hiding this comment.
I'm afraid of clutter. The foreign key might reference multiple columns. It is indeed possible that same table child will reference table parent with two different foreign keys, one of which has a cycle, the other does not. In such case it would indeed be beneficial to identify the referenced columns both on parent and child; but again this is an amount of information that is likely to create much background noise. Alternatively, we could use the foreign key name participating in the loop.
Either solution would complicate the loop detection logic only slightly; question is how useful vs. confusing would the information be.
There was a problem hiding this comment.
Just took a look at the code. The current approach (only report table name) is consistent with other schemadiff error reports, that suffice with table name or column name without drilling into specific details. I think this should be fine as it is for now.
go/vt/schemadiff/errors.go
Outdated
| escaped := make([]string, 0, len(e.Loop)) | ||
| for _, t := range e.Loop { | ||
| escaped = append(escaped, sqlescape.EscapeID(t)) |
There was a problem hiding this comment.
Nitty, but I think this is slightly more performant:
escaped := make([]string, len(e.Loop))
for i, t := range e.Loop {
escaped[i] = sqlescape.EscapeID(t)
go/vt/schemadiff/errors.go
Outdated
| } | ||
| } | ||
| if tableIsInsideLoop { | ||
| return fmt.Sprintf("table %s participates in foreign key loop: %s", |
There was a problem hiding this comment.
I think "infinite loop" or "circular foreign key references" might be more descriptive for user facing error messages and docs, etc.
go/vt/schemadiff/schema.go
Outdated
| } | ||
| if seenTable == tableName { | ||
| // This table alreay appears in `seen`. | ||
| // We only return the loop portion of `seen` that starts and ends with this table. |
There was a problem hiding this comment.
I think this comment is a little off, no? Maybe ~ // We only return the tail end of seen which begins with this table.?
There was a problem hiding this comment.
Reworded as:
// We only return the suffix of `seen` that starts (and now ends) with this table.
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Description
See #15061 for details.
schemadiffdoes not support foreign key cycles (with the exception of self referencing tables). In this PR,schemadiffidentifies cycles and reports those cycles as part of the schema validation process. Taking the same example from #15061:As of this PR,
schemadiffwill now report:schemadiffwill further identify tables that reference foreign key loops, i.e. their immediate parent or on of their indirect parents participates in a loop, and will report something like:Computing foreign key loops uses DFS, and for every loop found it caches the tables participating in that loop, so as to never recompute the same table twice. This limits the number of loops found per table to
1, which is sufficient, as we already only report one foreign key error per table anyway (otherwise this can blow up with excessive error messages).Related Issue(s)
schemadiff#15061Checklist
Deployment Notes