Proposed AQUA integration #410

Open
johnml1135 opened this issue Jun 11, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@johnml1135
Collaborator

Here is a potential way to integrate AQUA:

  • AQUA assessments run in k8s in a separate Docker container
  • Pre- and post-processing done using the Serval repo; the main AQUA algorithm lives in a separate repo, as well as the deployment scripts
  • Required alignments are computed in Machine

A bit more in depth:

  • Machine changes:

    • Merge machine-engine and machine-job, now that the SMT jobs run through ClearML
    • Expose alignments in Machine, served the same way as pretranslations
  • Serval changes:

    • Allow engines to have "parent" and "child" links (or similar) that connect them to other engines.
    • Add a new state when building an engine called "check preconditions".
    • When a child engine finishes building, it checks each of its parents to see whether all of that parent's preconditions are now met as a result.
  • Assessment:

    • An assessment engine specifies multiple alignment engines that must be completed at a specific revision as "preconditions" before its build starts.
    • The flow would be: set up all alignment engines and the assessment engine. When starting the build, specify the minimum build revision of each engine to be checked in the preconditions.
    • When the assessment starts, it already has everything it needs to complete the assessment without interacting with Machine again.
    • When an updated assessment is requested, Serval first asks the child engine for the alignment and then asks the assessment engine for the updated assessment.
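
The parent/child links and the "check preconditions" state described above could be sketched roughly as follows. This is only an illustration of the idea in Python, not Serval code: the names `Engine`, `BuildState`, `add_child`, `start_build`, and `finish_build` are all hypothetical, and it assumes an engine's revision increases by one each time a build finishes.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class BuildState(Enum):
    PENDING = "pending"
    CHECK_PRECONDITIONS = "check_preconditions"  # the proposed new state
    ACTIVE = "active"


@dataclass
class Engine:
    """Hypothetical engine record with parent/child links."""

    name: str
    revision: int = 0  # assumed: revision of the last finished build
    state: BuildState = BuildState.PENDING
    parents: list[Engine] = field(default_factory=list)
    children: dict[str, Engine] = field(default_factory=dict)
    # child engine name -> minimum build revision required before building
    preconditions: dict[str, int] = field(default_factory=dict)

    def add_child(self, child: Engine, min_revision: int) -> None:
        self.children[child.name] = child
        self.preconditions[child.name] = min_revision
        child.parents.append(self)

    def preconditions_met(self) -> bool:
        return all(
            self.children[name].revision >= minimum
            for name, minimum in self.preconditions.items()
        )

    def start_build(self) -> None:
        # The build waits in "check preconditions" until every child is ready.
        self.state = BuildState.CHECK_PRECONDITIONS
        self._try_activate()

    def finish_build(self) -> None:
        # A finished child notifies its parents so they can re-check.
        self.revision += 1
        for parent in self.parents:
            parent._try_activate()

    def _try_activate(self) -> None:
        if self.state is BuildState.CHECK_PRECONDITIONS and self.preconditions_met():
            self.state = BuildState.ACTIVE


# Usage: one assessment engine that depends on two alignment engines.
align_a = Engine("alignment-a")
align_b = Engine("alignment-b")
assessment = Engine("assessment")
assessment.add_child(align_a, min_revision=1)
assessment.add_child(align_b, min_revision=1)

assessment.start_build()  # stays in CHECK_PRECONDITIONS for now
align_a.finish_build()    # one precondition met, still waiting
align_b.finish_build()    # all preconditions met, assessment becomes ACTIVE
```

In this sketch the child pushes the notification to its parents rather than the parent polling, which matches the "child checks any parents" wording above; a real implementation would also need to persist the links and handle failed or cancelled builds.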

@ddaspit, what do you think?

johnml1135 added the enhancement (New feature or request) label Jun 11, 2024
@johnml1135
Collaborator Author

Planning updates:

Updates to code:

  • Machine.py job code moved to Serval
  • Code added to Serval to call AQUA
    • Serval has extra poetry AQUA dependency referencing git tag: { git = "AQUA.git", tag = "v1.2.3" }
  • AQUA updated to be callable with these functions:
    • aqua = AQUA(config)
    • aqua.run(parallelCorpus, listOfReferenceAssessments)
    • assessments = aqua.get_assessment()
    • path = aqua.save_model()
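
To make the proposed call sequence concrete, here is a minimal stand-in with the four entry points listed above. Only the method names come from the list; everything else is an assumption for illustration: the config is taken to be a plain dict with an optional `model_dir` key, the parallel corpus is taken to be a list of `(source, target)` string pairs, and the "assessment" computed here is a placeholder word-count ratio, not the actual AQUA algorithm.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path


class AQUA:
    """Hypothetical stand-in exposing the proposed interface."""

    def __init__(self, config: dict) -> None:
        self.config = config
        self._assessments: list[dict] = []
        self._reference_count = 0

    def run(self, parallel_corpus, reference_assessments) -> None:
        # Placeholder logic only: target/source word-count ratio per pair.
        self._assessments = [
            {
                "index": i,
                "length_ratio": len(tgt.split()) / max(len(src.split()), 1),
            }
            for i, (src, tgt) in enumerate(parallel_corpus)
        ]
        self._reference_count = len(list(reference_assessments))

    def get_assessment(self) -> list[dict]:
        return list(self._assessments)

    def save_model(self) -> str:
        # Assumed: "model_dir" is an optional config key; default to a temp dir.
        model_dir = Path(self.config.get("model_dir") or tempfile.mkdtemp())
        model_dir.mkdir(parents=True, exist_ok=True)
        path = model_dir / "aqua_model.json"
        payload = {
            "assessments": self._assessments,
            "reference_count": self._reference_count,
        }
        path.write_text(json.dumps(payload), encoding="utf-8")
        return str(path)


# Usage, mirroring the call order proposed above.
aqua = AQUA({})
aqua.run([("in the beginning", "en el principio")], [])
assessments = aqua.get_assessment()
path = aqua.save_model()
```

Keeping `run` separate from `get_assessment` and `save_model` lets the caller in Serval decide when to collect results and when to persist the model.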

Phase 2: live inference

  • The saved model will be on the S3 bucket
  • Using Python.NET, the Serval/AQUA .NET code will call the AQUA code to run the incremental algorithm, using the saved models and any other required assessments.
