[OpenAI] Whisper support #36693
Merged
Changes from 58 commits
73 commits
721a035
Early code generation from topic branch for whisper
jpalvarezl bce544c
Added simplest test
jpalvarezl 98b0587
Regened with correct paths
jpalvarezl 3e7c30f
Fixed name of method in the test
jpalvarezl 35384fb
Added test file for translations
jpalvarezl 8c0087e
[OpenAI] BYO Multipart form request support (#36621)
jpalvarezl bfd2bd8
Code regen and adjustments to new methods
jpalvarezl bd7f7b8
Using latest commit
jpalvarezl 9f141ec
plain text works
jpalvarezl 6a09621
Code gen works
jpalvarezl 042a688
Code regen with looser types, no hooks for content-type nor length
jpalvarezl 372fbc9
Migrated multiform implementation over from the strongly typed branch
jpalvarezl 430ce3e
Added headers
jpalvarezl f86b515
Added classes
jpalvarezl 5246435
reran code gen
jpalvarezl 05d703e
Compiles with modded tsp definition, including content-type
jpalvarezl b338c6e
Corrected wrong value passed for content-length
jpalvarezl b4fce21
It works!
jpalvarezl dd075cd
Removed pattern instanceof for older compatibility version
jpalvarezl cc9736c
Refactored the MultipartHelper to be testable
jpalvarezl d5be9d4
Added test definition for MultipartDataHelper class
jpalvarezl f19d690
Added happy path test and model to the list to be serialized
jpalvarezl b21db2b
Added tests for the MultipartDataHelper class
jpalvarezl d78ebb3
Refactored audio translation tests to use testRunners
jpalvarezl 1b0b188
Added tests for misused formats
jpalvarezl e5b3612
Added more negative tests for wrong formats
jpalvarezl 1417214
Renamed tests
jpalvarezl 6ba8260
Finished Azure OAI sync test suite
jpalvarezl c9c4a51
Added support for nonAzure translations
jpalvarezl 2860ce9
Added Async translation methods
jpalvarezl d1a7aea
Added tests and async functionality for translations
jpalvarezl 262cdde
Async translation tests for non-Azure
jpalvarezl 8d6bfc6
Extracted audioTranscription assertion statements to method
jpalvarezl 9b842de
Added sync transcription functionality and AOAI tests
jpalvarezl 5f49b14
Added license to source files
jpalvarezl a5ad09a
Added todo markers where docs are missing
jpalvarezl 15ddabd
Added async implementation and minimal testing for transcription
jpalvarezl f299a3a
Added tests for nonAzure OAI
jpalvarezl 1f7161b
Code regen
jpalvarezl 1974d1e
Corrected content type for bodyParam nonAzure
jpalvarezl 846cd6c
Added remaining transcription tests for AOAI sync case
jpalvarezl 4d8eea3
Added tests for async AOAI
jpalvarezl a6af7ea
Added transcription tests for nonAzure OAI sync API
jpalvarezl fac7601
Added tests for nonAzure OAI async API
jpalvarezl 5e39c7d
Committed whisper session-record changes
jpalvarezl 70bbc20
Inlined methods
jpalvarezl 5b75b7e
Added documentation to sync/async client for translation and transcri…
jpalvarezl 88406cd
Added documentation to multipart helper classes
jpalvarezl fbbaea7
Replaced star imports with single class imports
jpalvarezl 38078fb
Simplified tests and added logger to async client
jpalvarezl 2bc1976
Added missing asset
jpalvarezl 1c48b63
Added recordings for nonAzure tests
jpalvarezl 4383d7a
Style checks
jpalvarezl 3578e4c
Style check
jpalvarezl 76f640e
Style check
jpalvarezl 691080d
Style check done
jpalvarezl 398b702
Changelog update and static bug analysis issues addressed
jpalvarezl ae71931
Last 2 replacement of monoError
jpalvarezl 47e345b
[OpenAI] Added sample and updated READMEs (#36806)
mssfang 6306060
suppression spotbugs for allowing external mutation on the bytep[ (#3…
mssfang 9b6fdd1
Merge branch 'main' into jpalvarezl/whisper_support_looser_types
mssfang 40dcaff
fixed unknown cspell error, 'mpga'
mssfang 4af1ba5
fixed sample broken links
mssfang 65e4ee4
regenerated, no changes but only indents alignment
mssfang c1fb921
Hardcoded boundary value for multipart requests
jpalvarezl a5f32d9
Updated test records for nonAzure
jpalvarezl bcb6fae
Most test passing with latest service version
jpalvarezl 04a3201
Rolled back test records for regressed tests
jpalvarezl 8773677
Removed unused import
jpalvarezl 6bafc2e
Re-ordered method parameters. Options bag is last
jpalvarezl 8f4e36a
Readme update
jpalvarezl 5a9b295
[OpenAI] Improve JavaDoc and compatible with JDK 21 (#36846)
mssfang 8b6fca8
removed export implementation/model
mssfang
548 changes: 545 additions & 3 deletions
sdk/openai/azure-ai-openai/src/main/java/com/azure/ai/openai/OpenAIAsyncClient.java
Large diffs are not rendered by default.
537 changes: 534 additions & 3 deletions
sdk/openai/azure-ai-openai/src/main/java/com/azure/ai/openai/OpenAIClient.java
Large diffs are not rendered by default.
17 changes: 17 additions & 0 deletions
...i-openai/src/main/java/com/azure/ai/openai/implementation/MultipartBoundaryGenerator.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,17 @@ | ||
| // Copyright (c) Microsoft Corporation. All rights reserved. | ||
| // Licensed under the MIT License. | ||
|
|
||
| package com.azure.ai.openai.implementation; | ||
|
|
||
| /** | ||
| * Interface implemented by classes that generate a boundary string for use in multipart requests. | ||
| * Its main purpose is to allow the boundary to be mocked in tests. | ||
| */ | ||
| public interface MultipartBoundaryGenerator { | ||
|
|
||
| /** | ||
| * Generates a new multipart boundary value each time the method is called | ||
| * @return a {@link String} value containing a boundary to be used in HTTP multipart requests | ||
| */ | ||
| String generateBoundary(); | ||
| } | ||
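The Javadoc above says the interface exists so that tests can control the boundary. A minimal sketch of that idea follows. The nested interface mirrors the one in the diff so the example compiles on its own; the class name `FixedBoundarySketch` and the helpers `fixed` and `random` are hypothetical and are not part of this PR.

```java
import java.util.UUID;

final class FixedBoundarySketch {

    // Mirrors MultipartBoundaryGenerator from the diff so this sketch is self-contained.
    interface MultipartBoundaryGenerator {
        String generateBoundary();
    }

    // Hypothetical test double: returns the same boundary on every call,
    // which makes the serialized request body predictable in assertions.
    static MultipartBoundaryGenerator fixed(String boundary) {
        return () -> boundary;
    }

    // Same expression the default MultipartDataHelper constructor in this PR uses:
    // the first 16 characters of a random UUID string.
    static MultipartBoundaryGenerator random() {
        return () -> UUID.randomUUID().toString().substring(0, 16);
    }

    public static void main(String[] args) {
        System.out.println(fixed("test-boundary").generateBoundary());
        System.out.println(random().generateBoundary());
    }
}
```

Because the interface has a single abstract method, a lambda is enough to supply either variant.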
203 changes: 203 additions & 0 deletions
...azure-ai-openai/src/main/java/com/azure/ai/openai/implementation/MultipartDataHelper.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,203 @@ | ||
| // Copyright (c) Microsoft Corporation. All rights reserved. | ||
| // Licensed under the MIT License. | ||
|
|
||
| package com.azure.ai.openai.implementation; | ||
|
||
|
|
||
| import com.azure.ai.openai.models.AudioTranscriptionOptions; | ||
| import com.azure.ai.openai.models.AudioTranslationOptions; | ||
| import com.azure.core.util.BinaryData; | ||
|
|
||
| import java.io.ByteArrayOutputStream; | ||
| import java.io.IOException; | ||
| import java.nio.charset.Charset; | ||
| import java.nio.charset.StandardCharsets; | ||
| import java.util.ArrayList; | ||
| import java.util.List; | ||
| import java.util.UUID; | ||
|
|
||
| /** | ||
| * Helper class for marshaling {@link AudioTranscriptionOptions} and {@link AudioTranslationOptions} objects to be used | ||
| * in multipart HTTP requests according to RFC 7578. | ||
| */ | ||
| public class MultipartDataHelper { | ||
|
|
||
| /** | ||
| * Value to be used as part of the divider for the multipart requests. | ||
| */ | ||
| private final String boundary; | ||
|
|
||
| /** | ||
| * The actual part separator in the request. This is obtained by prepending "--" to the "boundary". | ||
| */ | ||
| private final String partSeparator; | ||
|
|
||
| /** | ||
| * The marker for the end of a multipart request. This is obtained by appending "--" to the "partSeparator". | ||
| */ | ||
| private final String endMarker; | ||
|
|
||
| /** | ||
| * Charset used for encoding the multipart HTTP request. | ||
| */ | ||
| private final Charset encoderCharset = StandardCharsets.UTF_8; | ||
|
|
||
| /** | ||
| * Line separator for the multipart HTTP request. | ||
| */ | ||
| private static final String CRLF = "\r\n"; | ||
|
|
||
| /** | ||
| * Default constructor used in the code. The boundary is a random value. | ||
| */ | ||
| public MultipartDataHelper() { | ||
| this(() -> UUID.randomUUID().toString().substring(0, 16)); | ||
| } | ||
|
|
||
| /** | ||
| * Constructor accepting a boundary generator. Used for testing. | ||
| * @param boundaryGenerator Generates the value for "boundary". | ||
| */ | ||
| public MultipartDataHelper(MultipartBoundaryGenerator boundaryGenerator) { | ||
| this.boundary = boundaryGenerator.generateBoundary(); | ||
| partSeparator = "--" + boundary; | ||
| endMarker = partSeparator + "--"; | ||
| } | ||
|
|
||
| /** | ||
| * | ||
| * @return the "boundary" value. | ||
| */ | ||
| public String getBoundary() { | ||
| return boundary; | ||
| } | ||
|
|
||
| /** | ||
| * This method marshals the passed request into a form that is ready to be sent | ||
| * @param requestOptions object to be marshalled for the multipart HTTP request | ||
| * @param fileName the name of the file that is being sent as a part of this request | ||
| * @return the marshalled data and its length | ||
| * @param <T> {@link AudioTranscriptionOptions} and {@link AudioTranslationOptions} are the only types supported. | ||
| * This represents the type information of the request object. | ||
| */ | ||
| public <T> MultipartDataSerializationResult serializeRequest(T requestOptions, String fileName) { | ||
| if (requestOptions instanceof AudioTranslationOptions) { | ||
| AudioTranslationOptions audioTranslationOptions = (AudioTranslationOptions) requestOptions; | ||
| byte[] file = audioTranslationOptions.getFile(); | ||
| List<MultipartField> fields = formatAudioTranslationOptions(audioTranslationOptions); | ||
| return serializeRequestFields(file, fields, fileName); | ||
| } else if (requestOptions instanceof AudioTranscriptionOptions) { | ||
| AudioTranscriptionOptions audioTranscriptionOptions = (AudioTranscriptionOptions) requestOptions; | ||
| byte[] file = audioTranscriptionOptions.getFile(); | ||
| List<MultipartField> fields = formatAudioTranscriptionOptions(audioTranscriptionOptions); | ||
| return serializeRequestFields(file, fields, fileName); | ||
| } else { | ||
| throw new IllegalArgumentException("Only AudioTranslationOptions and AudioTranscriptionOptions currently supported"); | ||
| } | ||
| } | ||
|
|
||
| /** | ||
| * | ||
| * @param file is the byte[] representation of the file in the request object. | ||
| * @param fields a list of the members other than the file in the request object. | ||
| * @param fileName the name of the file passed in the "file" field of the request object. | ||
| * @return a structure containing the marshalled data and its length. | ||
| */ | ||
| private MultipartDataSerializationResult serializeRequestFields(byte[] file, List<MultipartField> fields, String fileName) { | ||
| ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream(); | ||
|
|
||
| // Multipart preamble | ||
| String fileFieldPreamble = partSeparator | ||
| + CRLF + "Content-Disposition: form-data; name=\"file\"; filename=\"" | ||
| + fileName + "\"" | ||
| + CRLF + "Content-Type: application/octet-stream" + CRLF + CRLF; | ||
| try { | ||
| // Writing the file into the request as a byte stream | ||
| byteArrayOutputStream.write(fileFieldPreamble.getBytes(encoderCharset)); | ||
| byteArrayOutputStream.write(file); | ||
|
|
||
| // Adding other fields to the request | ||
| for (MultipartField field : fields) { | ||
| byteArrayOutputStream.write(serializeField(field)); | ||
| } | ||
| byteArrayOutputStream.write((CRLF + endMarker).getBytes(encoderCharset)); | ||
| } catch (IOException e) { | ||
| throw new RuntimeException(e); | ||
| } | ||
|
|
||
| byte[] totalData = byteArrayOutputStream.toByteArray(); | ||
| return new MultipartDataSerializationResult(BinaryData.fromBytes(totalData), totalData.length); | ||
| } | ||
|
|
||
| /** | ||
| * Adds member fields apart from the file to the multipart HTTP request | ||
| * @param audioTranslationOptions request object | ||
| * @return a list of the fields in the request (except for "file") | ||
| */ | ||
| private List<MultipartField> formatAudioTranslationOptions(AudioTranslationOptions audioTranslationOptions) { | ||
| List<MultipartField> fields = new ArrayList<>(); | ||
| if (audioTranslationOptions.getResponseFormat() != null) { | ||
| fields.add(new MultipartField( | ||
| "response_format", | ||
| audioTranslationOptions.getResponseFormat().toString())); | ||
| } | ||
| if (audioTranslationOptions.getModel() != null) { | ||
| fields.add(new MultipartField("model", | ||
| audioTranslationOptions.getModel() | ||
| )); | ||
| } | ||
| if (audioTranslationOptions.getPrompt() != null) { | ||
| fields.add(new MultipartField("prompt", | ||
| audioTranslationOptions.getPrompt())); | ||
| } | ||
| if (audioTranslationOptions.getTemperature() != null) { | ||
| fields.add(new MultipartField("temperature", | ||
| String.valueOf(audioTranslationOptions.getTemperature()))); | ||
| } | ||
| return fields; | ||
| } | ||
|
|
||
| /** | ||
| * Adds member fields apart from the file to the multipart HTTP request | ||
| * @param audioTranscriptionOptions request object | ||
| * @return a list of the fields in the request (except for "file") | ||
| */ | ||
| private List<MultipartField> formatAudioTranscriptionOptions(AudioTranscriptionOptions audioTranscriptionOptions) { | ||
| List<MultipartField> fields = new ArrayList<>(); | ||
| if (audioTranscriptionOptions.getResponseFormat() != null) { | ||
| fields.add(new MultipartField("response_format", | ||
| audioTranscriptionOptions.getResponseFormat().toString())); | ||
| } | ||
| if (audioTranscriptionOptions.getModel() != null) { | ||
| fields.add(new MultipartField("model", | ||
| audioTranscriptionOptions.getModel() | ||
| )); | ||
| } | ||
| if (audioTranscriptionOptions.getPrompt() != null) { | ||
| fields.add(new MultipartField("prompt", | ||
| audioTranscriptionOptions.getPrompt())); | ||
| } | ||
| if (audioTranscriptionOptions.getTemperature() != null) { | ||
| fields.add(new MultipartField("temperature", | ||
| String.valueOf(audioTranscriptionOptions.getTemperature()))); | ||
| } | ||
| if (audioTranscriptionOptions.getLanguage() != null) { | ||
| fields.add(new MultipartField("language", | ||
| audioTranscriptionOptions.getLanguage())); | ||
| } | ||
| return fields; | ||
| } | ||
|
|
||
| /** | ||
| * This method formats a field for a multipart HTTP request and returns its byte[] representation | ||
| * @param field the field of the request to be marshalled | ||
| * @return byte[] representation of a field for a multipart HTTP request | ||
| */ | ||
| private byte[] serializeField(MultipartField field) { | ||
| String serialized = CRLF + partSeparator | ||
| + CRLF + "Content-Disposition: form-data; name=\"" | ||
| + field.getWireName() + "\"" + CRLF + CRLF | ||
| + field.getValue(); | ||
|
|
||
| return serialized.getBytes(encoderCharset); | ||
| } | ||
| } | ||
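To make the resulting wire layout easier to see, here is a standalone sketch that follows the same steps as `serializeRequestFields` and `serializeField` above: file part first, then each field prefixed by CRLF and the part separator, then CRLF and the end marker. It does not use the SDK types. The class name `MultipartLayoutSketch`, the `Map`-based field list, and the sample values (`audio.wav`, `whisper-1`, the three-byte payload) are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class MultipartLayoutSketch {

    private static final String CRLF = "\r\n";

    // Builds a multipart/form-data body with the same ordering as the helper in the diff.
    static byte[] serialize(String boundary, String fileName, byte[] file, Map<String, String> fields) {
        String partSeparator = "--" + boundary;
        String endMarker = partSeparator + "--";
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // File part: headers, blank line, then the raw bytes.
        String filePreamble = partSeparator + CRLF
            + "Content-Disposition: form-data; name=\"file\"; filename=\"" + fileName + "\"" + CRLF
            + "Content-Type: application/octet-stream" + CRLF + CRLF;
        append(out, filePreamble.getBytes(StandardCharsets.UTF_8));
        append(out, file);

        // Each remaining field starts with CRLF so it closes the previous part's content.
        for (Map.Entry<String, String> field : fields.entrySet()) {
            String part = CRLF + partSeparator + CRLF
                + "Content-Disposition: form-data; name=\"" + field.getKey() + "\"" + CRLF + CRLF
                + field.getValue();
            append(out, part.getBytes(StandardCharsets.UTF_8));
        }

        append(out, (CRLF + endMarker).getBytes(StandardCharsets.UTF_8));
        return out.toByteArray();
    }

    // The three-argument write on ByteArrayOutputStream does not declare IOException.
    private static void append(ByteArrayOutputStream out, byte[] bytes) {
        out.write(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("model", "whisper-1"); // illustrative value only
        byte[] body = serialize("test-boundary", "audio.wav",
            "ABC".getBytes(StandardCharsets.UTF_8), fields);
        System.out.println(new String(body, StandardCharsets.UTF_8));
    }
}
```

With a fixed boundary the whole body becomes deterministic, which is what makes byte-for-byte assertions in tests practical.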
50 changes: 50 additions & 0 deletions
...ai/src/main/java/com/azure/ai/openai/implementation/MultipartDataSerializationResult.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,50 @@ | ||
| // Copyright (c) Microsoft Corporation. All rights reserved. | ||
| // Licensed under the MIT License. | ||
|
|
||
| package com.azure.ai.openai.implementation; | ||
|
||
|
|
||
| import com.azure.core.util.BinaryData; | ||
|
|
||
| /** | ||
| * This class is used as a stand-in representation of marshalled data to be used in an HTTP multipart request. | ||
| */ | ||
| public class MultipartDataSerializationResult { | ||
|
|
||
| /** | ||
| * Represents the length of the content of this request. The value is to be used for the "Content-Length" header | ||
| * of the HTTP request | ||
| */ | ||
| private final long dataLength; | ||
|
|
||
| /** | ||
| * The multipart form data of the request. | ||
| */ | ||
| private final BinaryData data; | ||
|
|
||
| /** | ||
| * Constructor bundling both data and its length | ||
| * @param data the multipart form data of the request | ||
| * @param contentLength the length of the multipart form data of the request | ||
| */ | ||
| public MultipartDataSerializationResult(BinaryData data, long contentLength) { | ||
| this.dataLength = contentLength; | ||
| this.data = data; | ||
| } | ||
|
|
||
| /** | ||
| * | ||
| * @return the result of marshaling a multipart HTTP request | ||
| */ | ||
| public BinaryData getData() { | ||
| return data; | ||
| } | ||
|
|
||
| /** | ||
| * | ||
| * @return the length of a multipart HTTP request data | ||
| */ | ||
| public long getDataLength() { | ||
| return dataLength; | ||
| } | ||
|
|
||
| } | ||
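The client changes that consume this result are not rendered on this page, so the following is only a sketch of how the two values could map onto request headers. It assumes the standard `multipart/form-data; boundary=...` form of the Content-Type header and uses a plain `Map` rather than any azure-core type. The class name `MultipartHeadersSketch` and the method `headersFor` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MultipartHeadersSketch {

    // Derives the two headers a multipart request needs from the boundary
    // (see MultipartDataHelper#getBoundary) and the serialized data length
    // (see MultipartDataSerializationResult#getDataLength).
    static Map<String, String> headersFor(String boundary, long dataLength) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Content-Type", "multipart/form-data; boundary=" + boundary);
        headers.put("Content-Length", String.valueOf(dataLength));
        return headers;
    }

    public static void main(String[] args) {
        System.out.println(headersFor("test-boundary", 1024L));
    }
}
```

The boundary in the header must be the bare value, without the leading "--" that the part separator adds inside the body.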
46 changes: 46 additions & 0 deletions
...enai/azure-ai-openai/src/main/java/com/azure/ai/openai/implementation/MultipartField.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,46 @@ | ||
| // Copyright (c) Microsoft Corporation. All rights reserved. | ||
| // Licensed under the MIT License. | ||
|
|
||
| package com.azure.ai.openai.implementation; | ||
|
||
|
|
||
| /** | ||
| * A single field of a multipart HTTP request. | ||
| */ | ||
| public class MultipartField { | ||
|
|
||
| /** | ||
| * The wire name (form field name) of this field. | ||
| */ | ||
| private final String wireName; | ||
|
|
||
| /** | ||
| * The string value of this field. | ||
| */ | ||
| private final String value; | ||
|
|
||
| /** | ||
| * | ||
| * @param wireName The wire name (form field name) of this field. | ||
| * @param value The string value of this field. | ||
| */ | ||
| public MultipartField(String wireName, String value) { | ||
| this.wireName = wireName; | ||
| this.value = value; | ||
| } | ||
|
|
||
| /** | ||
| * | ||
| * @return The wire name (form field name) of this field. | ||
| */ | ||
| public String getWireName() { | ||
| return wireName; | ||
| } | ||
|
|
||
| /** | ||
| * | ||
| * @return The string value of this field. | ||
| */ | ||
| public String getValue() { | ||
| return value; | ||
| } | ||
| } | ||
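As a rough illustration of how such fields are collected, the sketch below follows the pattern used by `formatAudioTranslationOptions` in `MultipartDataHelper`: an optional value becomes a field only when it is non-null, and non-string values are converted with `String.valueOf`. The nested `Field` class stands in for `MultipartField`; the class name `OptionalFieldsSketch`, the method `collect`, and its flat parameter list are assumptions made to keep the example free of SDK types.

```java
import java.util.ArrayList;
import java.util.List;

final class OptionalFieldsSketch {

    // Stand-in for MultipartField from the diff: a wire name paired with a string value.
    static final class Field {
        final String wireName;
        final String value;

        Field(String wireName, String value) {
            this.wireName = wireName;
            this.value = value;
        }
    }

    // Adds a field for each option that was actually set; unset (null) options are skipped
    // so they never appear in the multipart body.
    static List<Field> collect(String model, String prompt, Double temperature) {
        List<Field> fields = new ArrayList<>();
        if (model != null) {
            fields.add(new Field("model", model));
        }
        if (prompt != null) {
            fields.add(new Field("prompt", prompt));
        }
        if (temperature != null) {
            fields.add(new Field("temperature", String.valueOf(temperature)));
        }
        return fields;
    }

    public static void main(String[] args) {
        for (Field field : collect("whisper-1", null, 0.2)) {
            System.out.println(field.wireName + "=" + field.value);
        }
    }
}
```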