Extend Images API to support Edits and Variations #62

Merged (5 commits) on Nov 10, 2023
README.md (79 additions, 0 deletions)
@@ -18,6 +18,9 @@ This repository contains Swift community-maintained implementation over [OpenAI]
- [Completions](#completions)
- [Chats](#chats)
- [Images](#images)
- [Create Image](#create-image)
- [Create Image Edit](#create-image-edit)
- [Create Image Variation](#create-image-variation)
- [Audio](#audio)
- [Audio Transcriptions](#audio-transcriptions)
- [Audio Translations](#audio-translations)
@@ -252,6 +255,8 @@ Given a prompt and/or an input image, the model will generate a new image.

As Artificial Intelligence continues to develop, so too does the intriguing concept of Dall-E. Developed by OpenAI, a research lab for artificial intelligence purposes, Dall-E has been classified as an AI system that can generate images based on descriptions provided by humans. With its potential applications spanning from animation and illustration to design and engineering - not to mention the endless possibilities in between - it's easy to see why there is such excitement over this new technology.

### Create Image

**Request**

```swift
@@ -276,6 +281,7 @@ struct ImagesResult: Codable, Equatable {
public let data: [URLResult]
}
```

**Example**

```swift
@@ -300,6 +306,79 @@ let result = try await openAI.images(query: query)

![Generated Image](https://user-images.githubusercontent.com/1411778/213134082-ba988a72-fca0-4213-8805-63e5f8324cab.png)

### Create Image Edit

Creates an edited or extended image given an original image and a prompt.

**Request**

```swift
public struct ImageEditsQuery: Codable {
/// The image to edit. Must be a valid PNG file, less than 4MB, and square. If mask is not provided, image must have transparency, which will be used as the mask.
public let image: Data
public let fileName: String
/// An additional image whose fully transparent areas (e.g. where alpha is zero) indicate where image should be edited. Must be a valid PNG file, less than 4MB, and have the same dimensions as image.
public let mask: Data?
public let maskFileName: String?
/// A text description of the desired image(s). The maximum length is 1000 characters.
public let prompt: String
/// The number of images to generate. Must be between 1 and 10.
public let n: Int?
/// The size of the generated images. Must be one of 256x256, 512x512, or 1024x1024.
public let size: String?
}
```

**Response**

Returns the same `ImagesResult` type as `ImagesQuery`.

**Example**

```swift
guard let data = image.pngData() else { return }
let query = ImageEditsQuery(image: data, fileName: "whitecat.png", prompt: "White cat with heterochromia sitting on the kitchen table with a bowl of food", n: 1, size: "1024x1024")
openAI.imageEdits(query: query) { result in
    // Handle result here
}
// or
let result = try await openAI.imageEdits(query: query)
```
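Under the hood, the `async` overload wraps the completion-handler call with `withCheckedThrowingContinuation`. A self-contained sketch of that bridging pattern (the `fetchImageURL` names here are hypothetical stand-ins, not part of the library):

```swift
import Foundation

// Hypothetical completion-handler API, standing in for imageEdits(query:completion:).
func fetchImageURL(completion: @escaping (Result<String, Error>) -> Void) {
    completion(.success("https://example.com/edited.png"))
}

// Bridging to async/await, mirroring the pattern used for imageEdits/imageVariations.
func fetchImageURL() async throws -> String {
    try await withCheckedThrowingContinuation { continuation in
        fetchImageURL { result in
            switch result {
            case let .success(url):
                continuation.resume(returning: url)
            case let .failure(error):
                continuation.resume(throwing: error)
            }
        }
    }
}
```

Resuming the continuation exactly once on every path is what makes the bridge safe; `withCheckedThrowingContinuation` reports a runtime error if it is resumed twice or never.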

### Create Image Variation

Creates a variation of a given image.

**Request**

```swift
public struct ImageVariationsQuery: Codable {
/// The image to use as the basis for the variation(s). Must be a valid PNG file, less than 4MB, and square.
public let image: Data
public let fileName: String
/// The number of images to generate. Must be between 1 and 10.
public let n: Int?
/// The size of the generated images. Must be one of 256x256, 512x512, or 1024x1024.
public let size: String?
}
```

**Response**

Returns the same `ImagesResult` type as `ImagesQuery`.

**Example**

```swift
guard let data = image.pngData() else { return }
let query = ImageVariationsQuery(image: data, fileName: "whitecat.png", n: 1, size: "1024x1024")
openAI.imageVariations(query: query) { result in
    // Handle result here
}
// or
let result = try await openAI.imageVariations(query: query)
```
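The completion-handler and async forms above also have Combine counterparts. A self-contained sketch of the underlying `Future`-wrapping pattern (the `fakeImageEdits` stub is hypothetical, used so the snippet runs without a network call):

```swift
import Combine
import Foundation

// Hypothetical stub standing in for imageEdits(query:completion:).
func fakeImageEdits(completion: @escaping (Result<[String], Error>) -> Void) {
    completion(.success(["https://example.com/edited.png"]))
}

// Wrapping a completion-handler API in a Combine publisher,
// mirroring the pattern used by the library's Combine extensions.
func imageEditsPublisher() -> AnyPublisher<[String], Error> {
    Future<[String], Error> { promise in
        fakeImageEdits(completion: promise)
    }
    .eraseToAnyPublisher()
}

var cancellables = Set<AnyCancellable>()
imageEditsPublisher()
    .sink(receiveCompletion: { completion in
        if case let .failure(error) = completion { print("failed: \(error)") }
    }, receiveValue: { urls in
        print(urls)
    })
    .store(in: &cancellables)
```

Because the stub resolves its promise synchronously, the `Future` delivers its value as soon as `sink` subscribes; with a real network call the value arrives whenever the completion handler fires.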

Review [Images Documentation](https://platform.openai.com/docs/api-reference/images) for more info.

### Audio
Sources/OpenAI/OpenAI.swift (12 additions, 1 deletion)
@@ -63,6 +63,14 @@ final public class OpenAI: OpenAIProtocol {
performRequest(request: JSONRequest<ImagesResult>(body: query, url: buildURL(path: .images)), completion: completion)
}

public func imageEdits(query: ImageEditsQuery, completion: @escaping (Result<ImagesResult, Error>) -> Void) {
performRequest(request: MultipartFormDataRequest<ImagesResult>(body: query, url: buildURL(path: .imageEdits)), completion: completion)
}

public func imageVariations(query: ImageVariationsQuery, completion: @escaping (Result<ImagesResult, Error>) -> Void) {
performRequest(request: MultipartFormDataRequest<ImagesResult>(body: query, url: buildURL(path: .imageVariations)), completion: completion)
}

public func embeddings(query: EmbeddingsQuery, completion: @escaping (Result<EmbeddingsResult, Error>) -> Void) {
performRequest(request: JSONRequest<EmbeddingsResult>(body: query, url: buildURL(path: .embeddings)), completion: completion)
}
@@ -151,7 +159,6 @@ typealias APIPath = String
extension APIPath {

static let completions = "/v1/completions"
static let images = "/v1/images/generations"
static let embeddings = "/v1/embeddings"
static let chats = "/v1/chat/completions"
static let edits = "/v1/edits"
@@ -161,6 +168,10 @@ extension APIPath {
static let audioTranscriptions = "/v1/audio/transcriptions"
static let audioTranslations = "/v1/audio/translations"

static let images = "/v1/images/generations"
static let imageEdits = "/v1/images/edits"
static let imageVariations = "/v1/images/variations"

func withPath(_ path: String) -> String {
self + "/" + path
}
Sources/OpenAI/Private/MultipartFormDataBodyBuilder.swift (7 additions, 5 deletions)
@@ -32,11 +32,13 @@ private extension MultipartFormDataEntry {
var body = Data()
switch self {
case .file(let paramName, let fileName, let fileData, let contentType):
body.append("--\(boundary)\r\n")
body.append("Content-Disposition: form-data; name=\"\(paramName)\"; filename=\"\(fileName)\"\r\n")
body.append("Content-Type: \(contentType)\r\n\r\n")
body.append(fileData)
body.append("\r\n")
if let fileName, let fileData {
body.append("--\(boundary)\r\n")
body.append("Content-Disposition: form-data; name=\"\(paramName)\"; filename=\"\(fileName)\"\r\n")
body.append("Content-Type: \(contentType)\r\n\r\n")
body.append(fileData)
body.append("\r\n")
}
case .string(let paramName, let value):
if let value {
body.append("--\(boundary)\r\n")
Sources/OpenAI/Private/MultipartFormDataEntry.swift (1 addition, 1 deletion)
@@ -9,6 +9,6 @@ import Foundation

enum MultipartFormDataEntry {

case file(paramName: String, fileName: String, fileData: Data, contentType: String),
case file(paramName: String, fileName: String?, fileData: Data?, contentType: String),
string(paramName: String, value: Any?)
}
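The multipart changes above make the file entry's name and data optional so the optional `mask` part can simply be skipped when absent. For reference, a self-contained sketch of the `multipart/form-data` layout these requests put on the wire (this function is illustrative only, not the library's builder):

```swift
import Foundation

// Builds a minimal multipart/form-data body with one file part and one string part,
// the same shape an image-edit request takes (image + prompt).
func makeMultipartBody(boundary: String, imageData: Data, fileName: String, prompt: String) -> Data {
    var body = Data()
    // File part: the PNG to edit.
    body.append(Data("--\(boundary)\r\n".utf8))
    body.append(Data("Content-Disposition: form-data; name=\"image\"; filename=\"\(fileName)\"\r\n".utf8))
    body.append(Data("Content-Type: image/png\r\n\r\n".utf8))
    body.append(imageData)
    body.append(Data("\r\n".utf8))
    // String part: the prompt.
    body.append(Data("--\(boundary)\r\n".utf8))
    body.append(Data("Content-Disposition: form-data; name=\"prompt\"\r\n\r\n".utf8))
    body.append(Data("\(prompt)\r\n".utf8))
    // Closing boundary terminates the body.
    body.append(Data("--\(boundary)--\r\n".utf8))
    return body
}

let sampleBody = makeMultipartBody(boundary: "Boundary-abc123",
                                   imageData: Data([0x89, 0x50, 0x4E, 0x47]),
                                   fileName: "whitecat.png",
                                   prompt: "White cat")
print(String(decoding: sampleBody, as: UTF8.self))
```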
Sources/OpenAI/Public/Models/ImageEditsQuery.swift (46 additions, new file)
@@ -0,0 +1,46 @@
//
// ImageEditsQuery.swift
//
//
// Created by Aled Samuel on 24/04/2023.
//

import Foundation

public struct ImageEditsQuery: Codable {
/// The image to edit. Must be a valid PNG file, less than 4MB, and square. If mask is not provided, image must have transparency, which will be used as the mask.
public let image: Data
public let fileName: String
/// An additional image whose fully transparent areas (e.g. where alpha is zero) indicate where image should be edited. Must be a valid PNG file, less than 4MB, and have the same dimensions as image.
public let mask: Data?
public let maskFileName: String?
/// A text description of the desired image(s). The maximum length is 1000 characters.
public let prompt: String
/// The number of images to generate. Must be between 1 and 10.
public let n: Int?
/// The size of the generated images. Must be one of 256x256, 512x512, or 1024x1024.
public let size: String?

public init(image: Data, fileName: String, mask: Data? = nil, maskFileName: String? = nil, prompt: String, n: Int? = nil, size: String? = nil) {
self.image = image
self.fileName = fileName
self.mask = mask
self.maskFileName = maskFileName
self.prompt = prompt
self.n = n
self.size = size
}
}

extension ImageEditsQuery: MultipartFormDataBodyEncodable {
func encode(boundary: String) -> Data {
let bodyBuilder = MultipartFormDataBodyBuilder(boundary: boundary, entries: [
.file(paramName: "image", fileName: fileName, fileData: image, contentType: "image/png"),
.file(paramName: "mask", fileName: maskFileName, fileData: mask, contentType: "image/png"),
.string(paramName: "prompt", value: prompt),
.string(paramName: "n", value: n),
.string(paramName: "size", value: size)
])
return bodyBuilder.build()
}
}
Sources/OpenAI/Public/Models/ImageVariationsQuery.swift (36 additions, new file)
@@ -0,0 +1,36 @@
//
// ImageVariationsQuery.swift
//
//
// Created by Aled Samuel on 24/04/2023.
//

import Foundation

public struct ImageVariationsQuery: Codable {
/// The image to edit. Must be a valid PNG file, less than 4MB, and square.
public let image: Data
public let fileName: String
/// The number of images to generate. Must be between 1 and 10.
public let n: Int?
/// The size of the generated images. Must be one of 256x256, 512x512, or 1024x1024.
public let size: String?

public init(image: Data, fileName: String, n: Int? = nil, size: String? = nil) {
self.image = image
self.fileName = fileName
self.n = n
self.size = size
}
}

extension ImageVariationsQuery: MultipartFormDataBodyEncodable {
func encode(boundary: String) -> Data {
let bodyBuilder = MultipartFormDataBodyBuilder(boundary: boundary, entries: [
.file(paramName: "image", fileName: fileName, fileData: image, contentType: "image/png"),
.string(paramName: "n", value: n),
.string(paramName: "size", value: size)
])
return bodyBuilder.build()
}
}
Sources/OpenAI/Public/Protocols/OpenAIProtocol+Async.swift (30 additions, 0 deletions)
@@ -41,6 +41,36 @@ public extension OpenAIProtocol {
}
}
}

func imageEdits(
query: ImageEditsQuery
) async throws -> ImagesResult {
try await withCheckedThrowingContinuation { continuation in
imageEdits(query: query) { result in
switch result {
case let .success(success):
return continuation.resume(returning: success)
case let .failure(failure):
return continuation.resume(throwing: failure)
}
}
}
}

func imageVariations(
query: ImageVariationsQuery
) async throws -> ImagesResult {
try await withCheckedThrowingContinuation { continuation in
imageVariations(query: query) { result in
switch result {
case let .success(success):
return continuation.resume(returning: success)
case let .failure(failure):
return continuation.resume(throwing: failure)
}
}
}
}

func embeddings(
query: EmbeddingsQuery
Sources/OpenAI/Public/Protocols/OpenAIProtocol+Combine.swift (14 additions, 0 deletions)
@@ -28,6 +28,14 @@ public extension OpenAIProtocol {
}
.eraseToAnyPublisher()
}

func imageEdits(query: ImageEditsQuery) -> AnyPublisher<ImagesResult, Error> {
Future<ImagesResult, Error> {
imageEdits(query: query, completion: $0)
}
.eraseToAnyPublisher()
}

func imageVariations(query: ImageVariationsQuery) -> AnyPublisher<ImagesResult, Error> {
Future<ImagesResult, Error> {
imageVariations(query: query, completion: $0)
}
.eraseToAnyPublisher()
}

func embeddings(query: EmbeddingsQuery) -> AnyPublisher<EmbeddingsResult, Error> {
Future<EmbeddingsResult, Error> {
Sources/OpenAI/Public/Protocols/OpenAIProtocol.swift (35 additions, 1 deletion)
@@ -38,11 +38,45 @@ public protocol OpenAIProtocol {
```

- Parameters:
- query: An `ImagesQuery` object containing the input parameters for the API request. This includes the query parameters such as the model, text prompt, image size, and other settings.
- query: An `ImagesQuery` object containing the input parameters for the API request. This includes the query parameters such as the text prompt, image size, and other settings.
- completion: A closure which receives the result when the API request finishes. The closure's parameter, `Result<ImagesResult, Error>`, will contain either the `ImagesResult` object with the generated images, or an error if the request failed.
**/
func images(query: ImagesQuery, completion: @escaping (Result<ImagesResult, Error>) -> Void)

/**
This function sends an image edit query to the OpenAI API and retrieves generated images in response. The Images Edit API enables you to edit images or graphics using OpenAI's powerful deep learning models.

Example:
```
let query = ImageEditsQuery(image: imageData, fileName: "whitecat.png", prompt: "White cat with heterochromia sitting on the kitchen table with a bowl of food", n: 1, size: "1024x1024")
openAI.imageEdits(query: query) { result in
    // Handle result here
}
```

- Parameters:
- query: An `ImageEditsQuery` object containing the input parameters for the API request. This includes the query parameters such as the image to be edited, an image to be used as a mask if applicable, the text prompt, image size, and other settings.
- completion: A closure which receives the result when the API request finishes. The closure's parameter, `Result<ImagesResult, Error>`, will contain either the `ImagesResult` object with the generated images, or an error if the request failed.
**/
func imageEdits(query: ImageEditsQuery, completion: @escaping (Result<ImagesResult, Error>) -> Void)

/**
This function sends an image variation query to the OpenAI API and retrieves generated images in response. The Images Variations API enables you to create a variation of a given image using OpenAI's powerful deep learning models.

Example:
```
let query = ImageVariationsQuery(image: imageData, fileName: "whitecat.png", n: 1, size: "1024x1024")
openAI.imageVariations(query: query) { result in
    // Handle result here
}
```

- Parameters:
- query: An `ImageVariationsQuery` object containing the input parameters for the API request. This includes the query parameters such as the image to use as a basis for the variation(s), image size, and other settings.
- completion: A closure which receives the result when the API request finishes. The closure's parameter, `Result<ImagesResult, Error>`, will contain either the `ImagesResult` object with the generated images, or an error if the request failed.
**/
func imageVariations(query: ImageVariationsQuery, completion: @escaping (Result<ImagesResult, Error>) -> Void)

/**
This function sends an embeddings query to the OpenAI API and retrieves embeddings in response. The Embeddings API enables you to generate high-dimensional vector representations of texts, which can be used for various natural language processing tasks such as semantic similarity, clustering, and classification.
