[backend] Kubeflow Cache-Server Storing Duplicate ExecutionCacheIDs on Identical Runs #11614

Open
harrisonfritz opened this issue Feb 11, 2025 · 0 comments

The cache-server database contains multiple rows with the same executionCacheKey. The same pipeline is written to the cache multiple times, which needlessly increases the number of records in the database.

Environment

How did you deploy Kubeflow Pipelines (KFP)?

  • K8s on AWS using the KFP chart.

KFP version:

  • 2.2.0

KFP SDK version:

  • 2.11.0

Steps to reproduce

  1. Create a simple pipeline (see pipeline code below)
  2. Run the same pipeline multiple times
  3. Connect to the DB and observe the following (see below)

KFP code which creates 5 pipeline runs

import kfp

client = kfp.Client()

base_image = "kubeflownotebookswg/jupyter:v1.9.1" 

def hello_world(mystr: str) -> str:
    new_str = f"hello world ({mystr})"
    print(new_str)
    return new_str

hello_world_comp = kfp.components.create_component_from_func(
    func = hello_world,
    base_image = base_image
)

@kfp.dsl.pipeline(name="Test caching")
def test_caching_pipeline(string: str):

    hello_world_step1 = hello_world_comp(string)

    hello_world_step2 = hello_world_comp(hello_world_step1.output)
    
    hello_world_step3 = hello_world_comp(hello_world_step2.output)
    
    hello_world_step4 = hello_world_comp(hello_world_step3.output)
    
    
for i in range(0,5):
    run = client.create_run_from_pipeline_func(
        pipeline_func = test_caching_pipeline,
        arguments = {'string': 'hello world'}
    )

SSH into a node in the cluster:

ssh -i /path/to/your-key.pem ec2-user@your-ec2-instance-public-dns

If prompted for a password when connecting to the MySQL database, you can retrieve it with the following kubectl command:

kubectl get secret -o yaml mysql-secret | grep -i password | awk '{print $2}' | base64 --decode
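
For anyone scripting this step, here is a minimal Python sketch of the same extraction that the grep, awk, and base64 pipeline performs. The `extract_password` helper and the sample secret content are hypothetical and only illustrate the decoding; the real secret's field names and values may differ.

```python
import base64


def extract_password(secret_yaml: str) -> str:
    """Decode the first password-like field in `kubectl get secret -o yaml` output."""
    for line in secret_yaml.splitlines():
        key, sep, value = line.strip().partition(":")
        # Mirror `grep -i password`: case-insensitive match on the field name.
        if sep and "password" in key.lower() and value.strip():
            return base64.b64decode(value.strip()).decode("utf-8")
    raise KeyError("no password field found in secret")


# Hypothetical secret content, built here so the example is self-contained.
encoded = base64.b64encode(b"example-password").decode("ascii")
sample_yaml = (
    "apiVersion: v1\n"
    "kind: Secret\n"
    "data:\n"
    f"  password: {encoded}\n"
)

print(extract_password(sample_yaml))
```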

Install the MariaDB client on the node if it is not already installed:

sudo yum install mariadb

SQL commands

use cachedb;
SELECT executioncachekey, COUNT(executioncachekey) AS count FROM execution_caches GROUP BY executioncachekey HAVING count > 1;

Sample Output

+------------------------------------------------------------------+-------+
| executioncachekey                                                | count |
+------------------------------------------------------------------+-------+
| 0001523383413dcf3254c7470201dceda0107ec1a9c664291c72943cf6c08d45 |     4 |
| 00015d7d7773bd7d03c2aa92497d41c3e31f6f908c1b0e97c54f2a442578d373 |     3 |
| 00046c50536db58d5b9c927f3434c6d287a1640143989fa3e723379363075596 |     3 |
| 000df86ffab21b5e761c97ba45dc2b53da6f8c36d488b727f2bbacaad074ac3b |    28 |
| 00226af1ff3294a9507dd9e77faf1a89f56ee7e49bc6bdbf6b4436af3ad731aa |    18 |

Observe that there are duplicate hash entries in the database.
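
To put a number on the overhead, the following small sketch counts the redundant rows in the five sample keys shown above. It assumes that exactly one row per key is needed, and it covers only the rows visible in the sample output, not the whole table.

```python
# Counts copied from the sample output above, one per executioncachekey.
sample_counts = [4, 3, 3, 28, 18]

total_rows = sum(sample_counts)
# Assumption: one row per key is enough, so every extra row is redundant.
redundant_rows = sum(count - 1 for count in sample_counts)

print(f"{redundant_rows} of {total_rows} rows are redundant "
      f"across {len(sample_counts)} keys")
```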

Expected result

We expect no duplicate hash entries in the database. If a new execution has the same executionCacheKey as an existing cache entry, the pipeline is identical, so its result should be read from the cache rather than written to it again. The following query should return no rows, because each executionCacheKey should be unique.

SELECT executioncachekey, COUNT(executioncachekey) AS count FROM execution_caches GROUP BY executioncachekey HAVING count > 1;
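
The expected behavior can be illustrated with a minimal sketch, assuming a lookup-before-write cache. It uses an in-memory SQLite table as a hypothetical, simplified stand-in for `execution_caches`; it is not the cache server's actual implementation or schema, and the `get_or_create` helper is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for the execution_caches table.
conn.execute(
    "CREATE TABLE execution_caches ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " executioncachekey TEXT NOT NULL,"
    " executionoutput TEXT NOT NULL)"
)


def get_or_create(conn, key, compute_output):
    """Return (output, cache_hit); write a row only on a cache miss."""
    row = conn.execute(
        "SELECT executionoutput FROM execution_caches"
        " WHERE executioncachekey = ?",
        (key,),
    ).fetchone()
    if row is not None:
        return row[0], True  # cache hit: read, do not write
    output = compute_output()
    conn.execute(
        "INSERT INTO execution_caches (executioncachekey, executionoutput)"
        " VALUES (?, ?)",
        (key, output),
    )
    return output, False


# Five identical runs, mirroring the five pipeline runs in the repro.
results = [
    get_or_create(conn, "example-key", lambda: "hello world (hello world)")
    for _ in range(5)
]

# Same duplicate check as the query above, written with COUNT(*) in HAVING.
duplicates = conn.execute(
    "SELECT executioncachekey, COUNT(*) FROM execution_caches"
    " GROUP BY executioncachekey HAVING COUNT(*) > 1"
).fetchall()
row_count = conn.execute("SELECT COUNT(*) FROM execution_caches").fetchone()[0]
print(duplicates, row_count)
```

With this behavior, only the first run writes a row and the duplicate check returns nothing. A unique index on the key column would be another way to enforce the same invariant at the database level, though whether that suits the cache server's schema is an open question.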

Materials and Reference

Impacted by this bug? Give it a 👍.

@harrisonfritz harrisonfritz changed the title Kubeflow Cache-Server Storing Duplicate ExecutionCacheIDs on Identical Runs [bug] Kubeflow Cache-Server Storing Duplicate ExecutionCacheIDs on Identical Runs Feb 11, 2025
@harrisonfritz harrisonfritz changed the title [bug] Kubeflow Cache-Server Storing Duplicate ExecutionCacheIDs on Identical Runs [backend] Kubeflow Cache-Server Storing Duplicate ExecutionCacheIDs on Identical Runs Feb 11, 2025