Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft -- PostgreSQL db replaces Oracle db #1956

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

zimzoom
Copy link
Contributor

@zimzoom zimzoom commented Aug 17, 2023

This draft PR addresses 2 pending questions: 1) how much work would still remain to move CMR from Oracle to Postgres, following the datastore prototype? (wiki report here) & 2) how should we approach refactoring CMR to allow for running with an external datastore that is different from the official one? (discussion wiki here)

"Why are there still oracle namespaces here?" -- These changes have been made 'in place' as much as possible, except for certain Postgres core aspects, mostly to help address question 2 -- much "oracle" code contains general CMR logic for working with external data stores in general (health check, etc.) or at least RDBMS (jdbc utils), and conversely much of CMR code is actually specific to Oracle (explicit SQL, assumption of receiving blob instead of bytes, etc.)

What's been done so far

  • DB connection in same style, with connection pool and db-spec. (used in prototype and for setup db)
  • cmr setup db -- creation of all users, schemas, data migrations have been converted from Oracle to Postgres
  • Job Scheduler -- the Java dependencies has been translated from Oracle to Postgres, successfully creates most jobs
  • CRUD functionality brought over from prototype work -- confirmed for providers, collections, granules [1]

What remains to be done

  • (EDIT: solved apparently) FIRST PRIORITY BUG -- proper start of DB system component [2] -- (hard)
  • cmr setup db -- the tear-down of users and migrations (e.g. (drop), (drop-user)) etc. (easy)
  • debug some scheduled jobs still fail -- might be helped by solving First Priority Bug above.
  • test bulk update/bulk migration/catalog rest code (dev already started)
  • add timezone-aware functionality to db-timestamp->string
  • debug the find-one sql utility -- currently functional due to redundant first but needs to limit results in the actual sql to reduce load on db.
  • connection pool improvement: add failover and ons functionality as in Oracle, add manual close inside PostgresStore stop
  • misc. DB utilities -- changing of Oracle-specific property names, values etc. to Postgres e.g. DB health checks, ignore-existing-error wrapper, etc.
  • (optional, recommended) - implement AWS Aurora bells and whistles to take advantage of improved monitoring, performance, failover, etc.
  • (optional, recommended) - rename namespaces to just 'postgres' instead of 'aurora-postgres', to future proof, and decrease confusion as this is all specific to Postgres engine [3]

Misc. dev notes

  • Columns that are used to indicate true/false such as 'deleted', etc. have been changed from being type INTEGER (1/0) or VARCHAR(1) ('1'/'0') to being type BOOLEAN. Oracle did not support the boolean data type and sometimes required manual conversion; Postgres does, will convert '1'/'0' to true/false automatically (but won't convert true/false to a char/int, which is how Job Scheduler was working). Most all CMR data objects are using true/false.
  • The temp table created in metadata-db-app migration no.2 does not persist in Postgres as it does in Oracle. Manual table creation (aurora/create-temp-table) has been added where it is used.
  • Postgres uses BYTEA data type instead of Oracle's BLOB; therefore where these columns are parsed, gzip-blob->string becomes gzip-bytes->string
  • CMR's jdbc and clj-time libs are currently marked as deprecated/not actively maintained
  • You may see this in your REPL after a while, will then need to do full restart, still need to debug: org.postgresql.util.ServerErrorMessage "FATAL: remaining connection slots are reserved for non-replication superuser connections"

How to run this locally

Much the same as using the Oracle container. Make sure your profiles.clj content is set as environment variables. Run cmr setup dev. Launch a container using the official postgres image with these configs: (you can add a volume if you want)

docker run \
  --name cmr-postgres-db \
  -e POSTGRES_PASSWORD=admin \
  -e POSTGRES_DB=cmrcdb \
  -d \
  -p 5432:5432 postgres

After the container is 'running', run cmr setup db to create users, schemas, migrations. Upon entering REPL, run (reset :db :external).

Footnotes

  • [0] access control app's (dev-start)'s bootstrap is commented out not just because it errors due to connection error, also because I have an M1 chip and the new Oracle ARM image does not work with this function either so I couldn't even compare for debugging. Also this function might be removed in a different current ticket.
  • [1] search isn't confirmed locally because again, having M1 chip search is broken for my Oracle container as well so I can't compare for debug. Search was confirmed on cloud in prototype work.
  • [2] (EDIT: solved) The expected way that a CMR service launches its DB within its create-system is not working properly here. It appears the lifecycle start doesn't execute, because it returns having :datasource nil. Note that the PostgresStore record itself does successfully start when used in an atom instead of component, as in setup db. This atom method has been added to the needed create-system's as a WORKAROUND ONLY.
  • [3] right now the JDBC driver is the only 'Aurora' thing, but it should be just a wrapper on the Postgres JDBC driver. The connection pool lib, HikariCP, is recommended by AWS Aurora JDBC Wrapper docs, but the quartzite Job Scheduler uses C3P0.
    • This driver will throw 'WARNING: Expecting a dialect that supports a cluster topology' when run locally, that's normal.

Copy link
Contributor

@jceaser jceaser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is much here to consider, ticket is done in my opinion

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants