[formrecognizer] reconcile storage of forms across languages #10973

@kristapratico

Python currently creates a storage account and a Form Recognizer account dynamically for more complicated operations such as training, then tears down all of the resources at the end. This strategy was recommended by Mike H. In the future we may switch to a fixed storage account with a container that holds all the training forms we use across all languages.

With a fixed storage account we could remove the test dependency on storage, but we would need environment variables set for the container SAS URLs and/or the storage credentials. We would also need to maintain these values so that they don't expire or change and cause tests to fail.
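One way to soften the expiry risk is to check the SAS URL before the tests use it, so that an expired URL produces a clear message rather than an obscure training failure. The following is a minimal sketch using only the standard library. It assumes the SAS URL carries its expiry in the `se` (signed expiry) query parameter as an ISO 8601 UTC timestamp without fractional seconds; the function names, the one-hour safety margin, and the example URL are illustrative and not part of any SDK.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def sas_expiry(sas_url: str) -> datetime:
    """Return the expiry time stored in the `se` query parameter of a SAS URL."""
    query = parse_qs(urlsplit(sas_url).query)
    if "se" not in query:
        raise ValueError("URL has no 'se' (signed expiry) parameter")
    raw = query["se"][0]
    # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+,
    # so normalize it to an explicit UTC offset first.
    parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Date-only values carry no offset; SAS expiry times are in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def sas_is_usable(sas_url: str, now: datetime = None,
                  margin: timedelta = timedelta(hours=1)) -> bool:
    """True if the SAS URL stays valid for longer than `margin` from `now`."""
    now = now or datetime.now(timezone.utc)
    return sas_expiry(sas_url) - now > margin


# Hypothetical container SAS URL; the signature is a placeholder.
example_url = (
    "https://example.blob.core.windows.net/training"
    "?sv=2019-12-12&se=2030-01-01T00%3A00%3A00Z&sr=c&sp=rl&sig=placeholder"
)
```

A test fixture could call `sas_is_usable(...)` on the value read from the environment and skip or fail early with an explicit "SAS URL expired" message.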

My thoughts on this. Ways to test training (the first option is what we currently do):

1) Have training/labeled files in the repo, upload them to blob storage, create a container SAS URL, then train.
  • PRO: create everything on the fly, tear down everything at the end, no environment variables. Current recommendation by Mike H.
  • CON: test dependency on storage, training/labeled files committed to the repo (~5MB per set), lots to do before we can actually test the training.
2) Training/labeled files are already uploaded to a shared storage account. Just create a SAS URL and train.
  • PRO: ensures that our SAS URL doesn't expire on us, training files don't exist in the repo.
  • CON: we have to maintain a shared storage account and keep its credentials as environment variables.
3) Training files are already uploaded to a shared storage account, with the container SAS URL read from an environment variable.
  • PRO: no test dependency on storage, training files don't exist in the repos.
  • CON: need to maintain a shared storage account with all the files, and maintain the container SAS URL so that it doesn't expire and fail tests.
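The three options differ mainly in what the test environment has to provide, so a test helper could pick between them from the environment variables that are present. This is only a sketch of that idea: the environment variable names and the `TrainingSource` enum are hypothetical and do not exist in the repo or any SDK.

```python
import os
from enum import Enum


class TrainingSource(Enum):
    UPLOAD_FROM_REPO = 1        # option 1: upload repo files, create SAS URL on the fly
    SHARED_ACCOUNT_NEW_SAS = 2  # option 2: shared account, create SAS URL from credentials
    SAS_URL_FROM_ENV = 3        # option 3: ready-made container SAS URL


# Hypothetical environment variable names, chosen for illustration only.
SAS_URL_VAR = "FORM_TRAINING_CONTAINER_SAS_URL"
ACCOUNT_NAME_VAR = "FORM_TRAINING_STORAGE_NAME"
ACCOUNT_KEY_VAR = "FORM_TRAINING_STORAGE_KEY"


def choose_training_source(env=None) -> TrainingSource:
    """Pick the training-data strategy that the given environment supports."""
    env = os.environ if env is None else env
    if env.get(SAS_URL_VAR):
        # A pre-made SAS URL needs nothing else, so prefer it when present.
        return TrainingSource.SAS_URL_FROM_ENV
    if env.get(ACCOUNT_NAME_VAR) and env.get(ACCOUNT_KEY_VAR):
        # Shared-account credentials allow creating a fresh SAS URL per run.
        return TrainingSource.SHARED_ACCOUNT_NEW_SAS
    # Nothing configured: fall back to creating everything on the fly.
    return TrainingSource.UPLOAD_FROM_REPO
```

Falling back to option 1 when nothing is set keeps the current behavior as the default, so a contributor without access to a shared account can still run the tests.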
