Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3022 fk delete cascade #3237

Merged
merged 4 commits into from
Jul 1, 2022
Merged

3022 fk delete cascade #3237

merged 4 commits into from
Jul 1, 2022

Conversation

jvail
Copy link
Contributor

@jvail jvail commented Jun 24, 2022

Here are a few statistics for a little test to reduce the database size:

full proj.db:

4 largest tables

USAGE............................................. 514 25.7%
ALIAS_NAME........................................ 281 14.0%
PROJECTED_CRS..................................... 216 10.8%
CONVERSION_TABLE.................................. 215 10.7%

8.2 MB

reduced proj.db:

PRAGMA FOREIGN_KEYS=1;
DELETE FROM ellipsoid WHERE name != 'WGS 84';
VACUUM;

4 largest tables

USAGE............................................. 514 33.7%
ALIAS_NAME........................................ 281 18.4%
CONVERSION_TABLE.................................. 215 14.1%
EXTENT............................................ 169 11.1%

6.3 MB

clean table usage:

DELETE FROM usage WHERE (object_table_name, object_auth_name, object_code) IN (
  SELECT object_table_name, object_auth_name, object_code FROM usage WHERE NOT EXISTS (
     SELECT 1 FROM object_view o WHERE
         o.table_name = object_table_name AND
         o.auth_name = object_auth_name AND
         o.code = object_code));
VACUUM;

4 largest tables

ALIAS_NAME........................................ 240 20.4%
CONVERSION_TABLE.................................. 215 18.3%
EXTENT............................................ 169 14.4%
USAGE............................................. 89 7.6%

4.8 MB

Not sure if 'alias_name' could be cleaned as well (similar to the suggest query from @rouault to clean 'usage')

@rouault
Copy link
Member

rouault commented Jun 24, 2022

It could perhaps be useful to document quickly your above procedure, perhaps under https://proj.org/resource_files.html#proj-db

@kbevers
Copy link
Member

kbevers commented Jun 24, 2022

This is basically a new feature of the database so documenting its intended use would also be great. If it's possible to test the intended use that would also be nice. I'm mostly thinking about a mechanism that helps us put in the cascade keyword when new tables are added.

@jvail
Copy link
Contributor Author

jvail commented Jun 24, 2022

Oh yes, that is what I forgot to ask: If I should add some text. I try to write a proper peace of docu. I'll think about the "cascade testing" - not sure about that one.

@kbevers
Copy link
Member

kbevers commented Jun 24, 2022

I'll think about the "cascade testing" - not sure about that one.

Me neither. A simple test could be to delete everything to do with WGS84 and see if there are any remnants of it in the remaining rows. Not sure how easy that is, though

@jvail
Copy link
Contributor Author

jvail commented Jun 24, 2022

I'll think about the "cascade testing" - not sure about that one.

Me neither. A simple test could be to delete everything to do with WGS84 and see if there are any remnants of it in the remaining rows. Not sure how easy that is, though

I had an idea meanwhile, but I may have misunderstood you:

I thought about having a test case that queries the FK clause (not sure if accessible in sqlite_master etc ...) to check if all have a ON DELETE CASCADE.

@kbevers
Copy link
Member

kbevers commented Jun 24, 2022

I thought about having a test case that queries the FK clause (not sure if accessible in sqlite_master etc ...) to check if all have a ON DELETE CASCADE.

If possible that sounds like a good option. Probably better than my idea.

@rouault
Copy link
Member

rouault commented Jun 24, 2022

I thought about having a test case that queries the FK clause (not sure if accessible in sqlite_master etc ...) to check if all have a ON DELETE CASCADE.

You can access the full SQL definition of a table, but you'd have to parse it yourself. That's not practical at all

A simple test could be to delete everything to do with WGS84 and see if there are any remnants of it in the remaining rows. Not sure how easy that is, though

I'd say that just deleting non-WGS 84 ellipsoids, running "projinfo -s EPSG:4326 -t EPSG:32631" to verify that it still works and checking "echo .dump | sqlite3 proj.db | grep RGF93" (for example... but you may need to purge the alias table too) returns nothing should be enough. Could be done in a new script in test/cli that would work on a copy of proj.db

@jvail
Copy link
Contributor Author

jvail commented Jun 24, 2022

I thought about having a test case that queries the FK clause (not sure if accessible in sqlite_master etc ...) to check if all have a ON DELETE CASCADE.

You can access the full SQL definition of a table, but you'd have to parse it yourself. That's not practical at all

Well, apparently one can do this:

sqlite> PRAGMA foreign_key_list('geodetic_crs');
id|seq|table|from|to|on_update|on_delete|match
0|0|geodetic_datum|datum_auth_name|auth_name|NO ACTION|CASCADE|NONE
0|1|geodetic_datum|datum_code|code|NO ACTION|CASCADE|NONE
1|0|coordinate_system|coordinate_system_auth_name|auth_name|NO ACTION|CASCADE|NONE
1|1|coordinate_system|coordinate_system_code|code|NO ACTION|CASCADE|NONE

@rouault
Copy link
Member

rouault commented Jun 24, 2022

Well, apparently one can do this:

oh, that's neet! I just checked this is available in sqlite 3.11 too, which is our minimal version.

@jvail
Copy link
Contributor Author

jvail commented Jun 24, 2022

Well, apparently one can do this:

oh, that's neet! I just checked this is available in sqlite 3.11 too, which is our minimal version.

Ok - so we agree that I try to create a test case like this:

go over all tables and their foreign keys and if any ''on delete' is not set to 'cascade' then fail.

@jvail
Copy link
Contributor Author

jvail commented Jun 25, 2022

I'd suggest to add the test to commit.sql - seems the right place. Unfortunately pragma_foreign_key_list is only available in sqlite >= 3.16.0. Since it effects only the build system (the function will never end up in a distributed proj.db) it might be acceptable. Is it?

@jvail
Copy link
Contributor Author

jvail commented Jun 25, 2022

Hm, guess the question is answered: Failed on ubuntu with sqlite 3.11 ...

@jvail
Copy link
Contributor Author

jvail commented Jun 26, 2022

I have commented out the test. I think even if someone forgets to add a "CASCADE DELETE" it does no harm. Then the database will just complain that certain records can not be removed. Ok?

@rouault
Copy link
Member

rouault commented Jun 28, 2022

Then the database will just complain that certain records can not be removed. Ok?

yes, but only if verification of foreign keys is explicitly enabled in the database connection with "pragma foreign_keys=1;"

That said this PR looks OK for me as such.

@jvail
Copy link
Contributor Author

jvail commented Jun 28, 2022

Then the database will just complain that certain records can not be removed. Ok?

yes, but only if verification of foreign keys is explicitly enabled in the database connection with "pragma foreign_keys=1;"

Yes, right. But this will happen with or without the "delete cascade". And in the example I have added both the enabling & the foreign key check.

That said this PR looks OK for me as such.

Great! Than I guess it is your turn. Otherwise let me know if anything is missing.

@rouault rouault merged commit 424be5c into OSGeo:master Jul 1, 2022
@rouault rouault added this to the 9.1.0 milestone Jul 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants