Retain generated corpus and pretranslation files for a build #468

Open
ddaspit opened this issue Aug 27, 2024 · 6 comments
Assignees
Labels
enhancement New feature or request

Comments

@ddaspit
Contributor

ddaspit commented Aug 27, 2024

Currently, Serval deletes corpus and pretranslation files once a build has finished. This makes it difficult to debug issues and to perform testing. Instead, Serval should retain the files after a build has finished. The files should be deleted after a predetermined amount of time (maybe 30 days).

@ddaspit ddaspit added the enhancement New feature or request label Aug 27, 2024
@ddaspit
Contributor Author

ddaspit commented Aug 27, 2024

This is needed to support gray box testing by the test team.

@johnml1135
Collaborator

Do we want this for QA only? Do we want this controllable by a flag?

@ddaspit
Contributor Author

ddaspit commented Aug 28, 2024

I think it would be good to do this on production as well, since it will be helpful for debugging.

@johnml1135 johnml1135 self-assigned this Aug 29, 2024
johnml1135 added a commit that referenced this issue Sep 9, 2024
johnml1135 added a commit that referenced this issue Sep 10, 2024
johnml1135 added a commit that referenced this issue Sep 10, 2024
johnml1135 added a commit that referenced this issue Sep 10, 2024
* Fix #464 - add lock lifetime for all
* Add HTTP timeout
* Make adjustable through options
* Will need to delete all locks from MongoDB - otherwise will endlessly loop for startup
* Fix some ci-e2e issues

Only use locking when accessing SMT model

Fix unit tests

Update to latest version of Machine

Fix bug where wrong id is used when starting a build

Remove reference to Serval.Shared in Serval.Machine.Shared

* Preserve fix for #468
@ddaspit
Contributor Author

ddaspit commented Sep 10, 2024

We still need to implement a method for cleaning up the files.

@ddaspit ddaspit reopened this Sep 10, 2024
@johnml1135 johnml1135 assigned Enkidu93 and unassigned johnml1135 Oct 7, 2024
@Enkidu93
Collaborator

I know you were asking about an approach in the PR above, @ddaspit. I'm thinking I'll just add a DateFinished to the machine build model, which we can set in BuildJobService.BuildJobFinishedAsync, and then add another RecurrentTask, similar to the ModelCleanupService, to delete the build artifacts after a certain amount of time (is 30 days long enough?). Does that sound like a good approach? For builds that already exist, there are two options. We could wait until 30 days after the code has been pushed to production and then push another change that deletes build artifacts for translation engines that don't have a DateFinished property set, so that we avoid indefinitely storing the build artifacts from builds built recently. Or we could set the existing builds to have the current date at the time of the push (probably simpler).
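
To make the proposal above concrete, here is a minimal, language-neutral sketch in Python of what one pass of such a recurring cleanup might look like, together with the simpler backfill option. It is not Serval code: the `Build` class, its fields, and both function names are hypothetical stand-ins, and the 30-day value is just the figure floated in this thread.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

RETENTION = timedelta(days=30)  # assumed retention period from the discussion


@dataclass
class Build:
    # Hypothetical stand-in for the machine build model.
    id: str
    date_finished: Optional[datetime] = None  # would be set when the build job finishes
    artifacts_deleted: bool = False


def backfill_date_finished(builds, now):
    """One-time migration: stamp builds that lack a finish date with the deploy time.

    Assumes every build passed in has already completed; real code would
    need to skip builds that are still running.
    """
    for build in builds:
        if build.date_finished is None:
            build.date_finished = now


def cleanup_expired_artifacts(builds, now, retention=RETENTION):
    """One pass of the recurring task; returns the ids whose artifacts were removed."""
    deleted = []
    for build in builds:
        if build.artifacts_deleted or build.date_finished is None:
            continue  # already cleaned up, or no finish date recorded yet
        if now - build.date_finished >= retention:
            build.artifacts_deleted = True  # real code would delete the files here
            deleted.append(build.id)
    return deleted
```

With the backfill applied at deploy time, artifacts from pre-existing builds would expire 30 days after the push, so no second deployment is needed.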

@johnml1135
Collaborator

The longest that I could imagine that a job could take would be 1 day. If we set the expiration for 30 days and key off of DateCreated (or whatever the field is), I think it could be simpler to implement and not have an old vs. new data issue.
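
As a rough illustration of this alternative, the expiry check could depend only on the creation timestamp, which every existing build already has. This is a sketch in Python rather than Serval code, the function name is hypothetical, and `date_created` stands in for "whatever the field is".

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # assumed expiration window from the discussion


def artifacts_expired(date_created, now, retention=RETENTION):
    """True once a build's files are at least `retention` old, measured from creation."""
    return now - date_created >= retention


# Usage: a build created 31 days ago is eligible for cleanup.
now = datetime(2024, 10, 7, tzinfo=timezone.utc)
eligible = artifacts_expired(now - timedelta(days=31), now)
```

If a job really takes at most one day, keying off the creation date still leaves the files available for at least about 29 days after the build finishes.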

Projects
Status: 🔖 Ready