Postgres support added to Autogen Studio #1429

Closed
wants to merge 3 commits into from

m-carter1

Why are these changes needed?

Currently, Autogen Studio stores its data in a SQLite database file on the local filesystem. This has several limitations; for example, when deploying to a container, the database file may not be persisted.

This PR lets users opt in to a Postgres database, configured via environment variables. If no Postgres configuration is set, the default SQLite database is used.

I changed the DBManager to be an interface so more database types can be added in the future.
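
A minimal sketch of what such an interface and the environment-based selection could look like. This is not the code from this PR: the class names `SQLiteDBManager` and `PostgresDBManager`, the `get_db_manager` helper, and the variable names `AGS_POSTGRES_DSN` and `AGS_SQLITE_PATH` are all hypothetical, chosen only for illustration.

```python
import os
import sqlite3
from abc import ABC, abstractmethod


class DBManager(ABC):
    """Interface that each database backend implements."""

    @abstractmethod
    def query(self, sql: str, params: tuple = ()) -> list:
        """Run a statement and return any rows it produces."""


class SQLiteDBManager(DBManager):
    def __init__(self, path: str = "database.sqlite") -> None:
        self.conn = sqlite3.connect(path)

    def query(self, sql: str, params: tuple = ()) -> list:
        with self.conn:  # commits on success, rolls back on error
            return self.conn.execute(sql, params).fetchall()


class PostgresDBManager(DBManager):
    def __init__(self, dsn: str) -> None:
        self.dsn = dsn
        self.conn = None  # opened lazily on the first query

    def query(self, sql: str, params: tuple = ()) -> list:
        import psycopg2  # third-party driver, imported only when needed

        if self.conn is None:
            self.conn = psycopg2.connect(self.dsn)
        with self.conn, self.conn.cursor() as cur:
            cur.execute(sql, params)
            # cur.description is None for statements that return no rows
            return cur.fetchall() if cur.description else []


def get_db_manager(env=os.environ) -> DBManager:
    """Pick Postgres when a DSN is configured, otherwise fall back to SQLite."""
    dsn = env.get("AGS_POSTGRES_DSN")  # hypothetical variable name
    if dsn:
        return PostgresDBManager(dsn)
    return SQLiteDBManager(env.get("AGS_SQLITE_PATH", "database.sqlite"))
```

Note that the two drivers use different placeholder styles (`?` for sqlite3, `%s` for psycopg2), so SQL strings shared between backends would need to account for that.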

@afourney afourney requested review from victordibia and gagb January 27, 2024 16:08
@afourney afourney added the proj-studio Related to AutoGen Studio. label Jan 27, 2024
@codecov-commenter

codecov-commenter commented Jan 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (1ab2354) 32.48% compared to head (66bdf91) 32.48%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1429   +/-   ##
=======================================
  Coverage   32.48%   32.48%           
=======================================
  Files          41       41           
  Lines        4907     4907           
  Branches     1120     1120           
=======================================
  Hits         1594     1594           
  Misses       3187     3187           
  Partials      126      126           
Flag Coverage Δ
unittests 32.44% <ø> (ø)


Collaborator

@victordibia victordibia left a comment

Hi @m-carter1,

Thanks so much for this! It paves the way to much better db support in AutoGen Studio.
One early comment: I am wondering whether we should try merging this into the autogenstudio branch (the dev/feature branch for AutoGen Studio, which is ahead of main), per the AGS contribution guide.

Would you like to take a pass at this? Otherwise, I can try to do it. Let me know.

@m-carter1
Author

@victordibia I've had a go at merging: PR 1446

@victordibia
Collaborator

@m-carter1 ,

Thanks. However, that merge seems to have a few issues:

  • It seems it is not based on the autogenstudio branch (one way to do this would be to clone the autogenstudio branch and then add/test your changes on top)
  • It seems to be touching many files outside of the samples/autogenstudio folder. In general, we want zero touch points outside of this folder for autogenstudio related PRs.

Given the above, I will review this PR and move it towards merging into main, then handle the integration with the autogenstudio branch :). For future PRs, we can start from the autogenstudio branch!

I'll finish my review and update you shortly!

@m-carter1
Author

@victordibia ah ok, my bad. Yes, I just merged this branch (based off main) with the autogenstudio branch.

I will use the autogenstudio branch for any future changes :)

@ashish31negi

@m-carter1 can you also add connection pooling to make this more robust?
There is also a bug in the timestamp conversion in the get_gallery method; please check the code below.

from datetime import datetime
import json

for row in result:
    # Convert datetime timestamps to ISO 8601 strings before building the item
    if isinstance(row.get("timestamp"), datetime):
        row["timestamp"] = row["timestamp"].isoformat()

    gallery_item = Gallery(
        id=row["id"],
        session=Session(**json.loads(row["session"])),
        messages=[Message(**message) for message in json.loads(row["messages"])],
        tags=json.loads(row["tags"]),
        timestamp=row["timestamp"],
    )
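
The connection pooling requested above could look roughly like the following. This is a sketch using only the standard library, with sqlite3 standing in for a Postgres driver so that it runs anywhere; the `ConnectionPool` class is hypothetical and not part of this PR. For Postgres specifically, psycopg2 ships its own pool classes in `psycopg2.pool`, which would likely be the better starting point.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: connections are created once and then reused."""

    def __init__(self, connect, size: int = 2) -> None:
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout: float = 5.0):
        # Blocks until a connection is free; raises queue.Empty on timeout
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back


# sqlite3 is only a stand-in here; any zero-argument connect callable works
pool = ConnectionPool(
    lambda: sqlite3.connect(":memory:", check_same_thread=False)
)

with pool.connection() as conn:
    row = conn.execute("SELECT 1").fetchone()
```

Borrowing through a context manager means a connection is returned to the pool even when the query raises, which is the main robustness gain over opening a connection per request.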

@victordibia
Collaborator

victordibia commented Mar 14, 2024

Hi @m-carter1, all,

Just revisiting this PR.
First of all, thanks for contributing this PR @m-carter1; it is a step towards improving the AGS backend API.
The ideas here relate to a few other issues that we are all discussing, several of them consolidated in #1694:

  • Link entities in the db for better UX and to enforce protections, etc.
  • Integrate an ORM like SQLAlchemy / SQLModel for broader db backend support
  • Better serialization of data
  • Improved API specs

I'll update this as progress begins.

@ShaneYuTH
Copy link

Hi @m-carter1, I modified AutoGen Studio's code myself to use PostgreSQL instead of SQLite. One thing to note is that PostgreSQL is stricter about JSON structure than SQLite. For instance, in the workflows table, the sender and receiver fields may fail on retrieval if the stored JSON deviates even slightly from valid syntax as it grows more complex (single vs. double quotes, newlines, symbols that are not properly escaped, etc.). This works fine with SQLite but not with PostgreSQL, at least in my case. To fix it, you might have to modify the upsert_ functions in dbutils so that whatever you insert is cleanly formatted, valid JSON. Hope it helps.
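
To illustrate the kind of problem described above: one plausible cause (an assumption, not something confirmed in this thread) is a nested dict being written via its Python string representation rather than through `json.dumps`. The sketch below uses a made-up `sender` value and a hypothetical `is_valid_json` helper to show the difference.

```python
import json

# Made-up example of a nested config with a quote and a newline in it
sender = {
    "type": "userproxy",
    "config": {"name": "user's proxy", "system_message": "line one\nline two"},
}

as_repr = str(sender)         # Python repr: single-quoted keys, not JSON
as_json = json.dumps(sender)  # valid JSON: double quotes, newline escaped


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(as_repr))
print(is_valid_json(as_json))
```

Serializing with `json.dumps` on the way in and `json.loads` on the way out keeps the stored text valid regardless of which backend later parses it.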


gitguardian bot commented Jul 20, 2024

⚠️ GitGuardian has uncovered 96 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard.
Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.

🔎 Detected hardcoded secrets in your pull request
GitGuardian id GitGuardian status Secret Commit Filename
12853598 Triggered Generic High Entropy Secret 79dbb7b test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret e43a86c test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret bdb40d7 test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret 954ca45 test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret 79dbb7b test/oai/test_utils.py View secret
10404662 Triggered Generic CLI Secret eff19ac .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret 06a0a5d .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret 0524c77 .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret d7ea410 .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret e43a86c .github/workflows/dotnet-build.yml View secret
10404662 Triggered Generic CLI Secret 841ed31 .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret 802f099 .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret 9a484d8 .github/workflows/dotnet-build.yml View secret
10404662 Triggered Generic CLI Secret e973ac3 .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret 89650e7 .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret e07b06b .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret abe4c41 .github/workflows/dotnet-build.yml View secret
10404662 Triggered Generic CLI Secret 7362fb9 .github/workflows/dotnet-release.yml View secret
12853599 Triggered Generic High Entropy Secret 79dbb7b test/oai/test_utils.py View secret
10404694 Triggered Generic High Entropy Secret e43a86c test/oai/test_utils.py View secret
10404694 Triggered Generic High Entropy Secret 954ca45 test/oai/test_utils.py View secret
10404694 Triggered Generic High Entropy Secret bdb40d7 test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret abad9ff test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret 954ca45 test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret c7bb588 test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret b97b99d test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret e43a86c test/oai/test_utils.py View secret
12853600 Triggered Generic High Entropy Secret 79dbb7b test/oai/test_utils.py View secret
12853601 Triggered Generic High Entropy Secret 79dbb7b test/oai/test_utils.py View secret
10493810 Triggered Generic Password 49e8053 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 501610b notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 49e8053 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 501610b notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password d422c63 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 97fa339 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 49e8053 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password d422c63 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 97fa339 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password d422c63 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 97fa339 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 501610b notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10404696 Triggered Generic High Entropy Secret 954ca45 test/oai/test_utils.py View secret
10404696 Triggered Generic High Entropy Secret bdb40d7 test/oai/test_utils.py View secret
10404696 Triggered Generic High Entropy Secret 79dbb7b test/oai/test_utils.py View secret
10404696 Triggered Generic High Entropy Secret e43a86c test/oai/test_utils.py View secret
10422482 Triggered Generic High Entropy Secret 79dbb7b test/oai/test_utils.py View secret
10422482 Triggered Generic High Entropy Secret bdb40d7 test/oai/test_utils.py View secret
12853602 Triggered Generic High Entropy Secret 79dbb7b test/oai/test_utils.py View secret
11616921 Triggered Generic High Entropy Secret a86d0fd notebook/agentchat_agentops.ipynb View secret
11616921 Triggered Generic High Entropy Secret 394561b notebook/agentchat_agentops.ipynb View secret
11616921 Triggered Generic High Entropy Secret 3eac646 notebook/agentchat_agentops.ipynb View secret
11616921 Triggered Generic High Entropy Secret f45b553 notebook/agentchat_agentops.ipynb View secret
11616921 Triggered Generic High Entropy Secret 6563248 notebook/agentchat_agentops.ipynb View secret
12853598 Triggered Generic High Entropy Secret 2b3a9ae test/oai/test_utils.py View secret
12853598 Triggered Generic High Entropy Secret c03558f test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret c03558f test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret 2b3a9ae test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret 0a3c6c4 test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret 76f5f5a test/oai/test_utils.py View secret
10404662 Triggered Generic CLI Secret 954ca45 .github/workflows/dotnet-build.yml View secret
12853599 Triggered Generic High Entropy Secret 2b3a9ae test/oai/test_utils.py View secret
12853599 Triggered Generic High Entropy Secret c03558f test/oai/test_utils.py View secret
10404694 Triggered Generic High Entropy Secret 76f5f5a test/oai/test_utils.py View secret
10404694 Triggered Generic High Entropy Secret 0a3c6c4 test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret 3b79cc6 test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret 11baa52 test/oai/test_utils.py View secret
12853600 Triggered Generic High Entropy Secret c03558f test/oai/test_utils.py View secret
12853600 Triggered Generic High Entropy Secret 2b3a9ae test/oai/test_utils.py View secret
12853601 Triggered Generic High Entropy Secret c03558f test/oai/test_utils.py View secret
12853601 Triggered Generic High Entropy Secret 2b3a9ae test/oai/test_utils.py View secret
10493810 Triggered Generic Password 3b79cc6 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 11baa52 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 11baa52 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 3b79cc6 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10404696 Triggered Generic High Entropy Secret 0a3c6c4 test/oai/test_utils.py View secret
10404696 Triggered Generic High Entropy Secret 76f5f5a test/oai/test_utils.py View secret
10404696 Triggered Generic High Entropy Secret c03558f test/oai/test_utils.py View secret
10404696 Triggered Generic High Entropy Secret 2b3a9ae test/oai/test_utils.py View secret
10422482 Triggered Generic High Entropy Secret 2b3a9ae test/oai/test_utils.py View secret
10422482 Triggered Generic High Entropy Secret c03558f test/oai/test_utils.py View secret

and 16 others.

🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secrets safely. Learn here the best practices.
  3. Revoke and rotate these secrets.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.


@gagb gagb closed this Aug 26, 2024
7 participants