The cache-server database contains multiple rows with the same executionCacheKey. The same pipeline is written to the cache multiple times, which increases the number of records in the database.
Environment
How did you deploy Kubeflow Pipelines (KFP)?
K8s on AWS using the KFP chart.
KFP version:
2.2.0
KFP SDK version:
2.11.0
Steps to reproduce
Create a simple pipeline (see pipeline code below)
Run the same pipeline multiple times
Connect to the database and inspect the execution_caches table
Observe that there are duplicate hash entries in the database
Expected result
We expect no duplicate hash entries in the database. If a cache entry with the same executionCacheKey already exists, the pipeline is identical, so its result should be read from the cache rather than written to it again. The following query should return no rows, because each executionCacheKey should be unique.
```sql
SELECT executioncachekey, COUNT(executioncachekey) AS count
FROM execution_caches
GROUP BY executioncachekey
HAVING count > 1;
```
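The expected behavior can be sketched with an in-memory SQLite table standing in for the MySQL `execution_caches` table. This is a minimal illustration, not the cache-server's actual code: the `write_cache_entry` helper, the placeholder key value, and the reduced two-column schema are assumptions; only the table and column names come from this report.

```python
import sqlite3

# In-memory SQLite stands in for the cache-server's MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE execution_caches ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " executioncachekey TEXT NOT NULL)"
)


def write_cache_entry(conn, key):
    """Hypothetical helper: insert only when the key is not cached yet.

    Returns True if a new row was written, False on a cache hit.
    """
    hit = conn.execute(
        "SELECT 1 FROM execution_caches WHERE executioncachekey = ? LIMIT 1",
        (key,),
    ).fetchone()
    if hit is not None:
        return False
    conn.execute(
        "INSERT INTO execution_caches (executioncachekey) VALUES (?)", (key,)
    )
    return True


# Five identical runs all produce the same key (placeholder value).
results = [write_cache_entry(conn, "example-key") for _ in range(5)]

# Same duplicate check as in the report, written without the column alias.
duplicates = conn.execute(
    "SELECT executioncachekey, COUNT(*) AS n FROM execution_caches"
    " GROUP BY executioncachekey HAVING COUNT(*) > 1"
).fetchall()
print(results, duplicates)
```

With a lookup before every write, only the first run adds a row and the duplicate check comes back empty, which is the outcome this report expects from the real database.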
Materials and Reference
Impacted by this bug? Give it a 👍.
harrisonfritz changed the title from "Kubeflow Cache-Server Storing Duplicate ExecutionCacheIDs on Identical Runs" to "[bug] Kubeflow Cache-Server Storing Duplicate ExecutionCacheIDs on Identical Runs" on Feb 11, 2025
harrisonfritz changed the title from "[bug] Kubeflow Cache-Server Storing Duplicate ExecutionCacheIDs on Identical Runs" to "[backend] Kubeflow Cache-Server Storing Duplicate ExecutionCacheIDs on Identical Runs" on Feb 11, 2025
Steps to reproduce
KFP code which creates 5 pipeline runs
SSH into a node in the cluster.
If prompted for a secret to connect to the MySQL database, you can get it using the following kubectl command:
Install MariaDB on the node if not already installed
SQL commands
Sample Output
Observe that there are duplicate hash entries in the database
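Once the executioncachekey column has been fetched from the database, the duplicate check in the last step can also be done client-side. The sketch below is only an illustration: `find_duplicate_keys` is a hypothetical helper, and the keys are made-up placeholders, not real output from the cluster.

```python
from collections import Counter


def find_duplicate_keys(keys):
    """Return {key: count} for every key that appears more than once."""
    counts = Counter(keys)
    return {key: n for key, n in counts.items() if n > 1}


# Made-up placeholder keys, not real rows from execution_caches.
sample_keys = ["key-a", "key-a", "key-a", "key-b"]
duplicate_keys = find_duplicate_keys(sample_keys)
print(duplicate_keys)
```

An empty dictionary would mean every executionCacheKey is unique. Any entry in the result corresponds to the duplication described in this report.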