Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Schema migration fails due to column order issue due to extra columns #244

Closed
alok87 opened this issue Jul 3, 2021 · 3 comments · Fixed by #245
Closed

Schema migration fails due to column order issue due to extra columns #244

alok87 opened this issue Jul 3, 2021 · 3 comments · Fixed by #245
Labels
bug Something isn't working p2 intermittent issue, not very urgent, but need to be done soon

Comments

@alok87
Copy link
Contributor

alok87 commented Jul 3, 2021

Masking has a feature which allows to add new extra columns in Redshift tables. These columns are only present in Redshift and not at source, as they are computed columns required for aggregate analysis on sensitive columns.

One such feature is length_keys

Recently in one of such table using that feature we saw the new columns were added in source, the change could not be applied in Redshift. Following errors occured was seen in the loader logs:

[sarama] 2021/07/03 12:41:50 kafka: error while consuming loader-09a9e4-ts.customers.inventory/0: Schema migration failed, err: 1 error occurred:
	* table: customers mismatch col: new_col_name, prop: Name, input: new_col_name, target: text_length

(actual table names have been changed above)^
Conflict happens as the source has new_col_name coming before. But the target has an extra columntext_length at that position.

I remember I had solved for this use case #126. But I see it has happened again. Need to relook and fix the problem.

Solution proposed

Need to have extra columns as first columns in the table instead of they being the last columns. When this starts happening we won't get order issues when new columns are added to a table at source. It would also work when a new extra column is added because that would be a recreate of table due to reload.

@alok87 alok87 added the bug Something isn't working label Jul 3, 2021
@alok87
Copy link
Contributor Author

alok87 commented Jul 3, 2021

One issue found

klog.V(5).Infof(
"Missing column: %s, add column will run.\n", inCol.Name)

I0703 13:36:46.282035    4794 redshift.go:1105] Missing column: new_col_name, add column will run.
I0703 13:36:46.282056    4794 redshift.go:1105] Missing column: text_length, add column will run.

ALTER TABLE ADD COLUMNS is being constructed by the migration for the extra columns which are not present in source. This is because it only checks if the column needs to be added based on the index number. It should ignore extra columns instead.

@alok87
Copy link
Contributor Author

alok87 commented Jul 3, 2021

Full table reload is required in this case. Also the full table reload should happen as reload group reload.

@alok87
Copy link
Contributor Author

alok87 commented Jul 12, 2021

Solution should consider keeping tables with extra columns as first columns so that there is no migration problem due to new columns being added at source.

And if any new extra column is added in masking, it will any way recreate the table in reload group, which already works!

@alok87 alok87 added the p2 intermittent issue, not very urgent, but need to be done soon label Jul 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working p2 intermittent issue, not very urgent, but need to be done soon
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant