release: 10/12/2024 (#2230)
* Fix: Add support for handling multiple schemas from a comma-separated list in schema-loader-utils

- Updated `readSchemas` in `schema-loader-utils.ts` to correctly parse and handle multiple schema files passed as a comma-separated list.
- Added a test case to validate functionality for multiple schemas in `schema-loader-utils.test.ts`.

Addresses issue #2087.
[Bug] Fix issue #2087: Handle multiple schema files in gen-schema-views script

* chore(fs-bq-schema-views): bump package version (#2225)

* feat(firestore-translate-text): optionally allow genkit for translations (#2228)

* feat(firestore-translate-text): optionally allow the use of Gemini 1.5 Pro for translations

* chore(firestore-translate-text): bump extension version

* chore(firestore-translate-text): add JSDoc comments to new code

---------

Co-authored-by: Gustolandia <Gustavo.ricou@gmail.com>
cabljac and Gustolandia authored Dec 10, 2024

1 parent ce22524 commit 0a64f98
Showing 15 changed files with 4,060 additions and 510 deletions.

Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
{
"name": "@firebaseextensions/fs-bq-schema-views",
"version": "0.4.8",
"version": "0.4.9",
"description": "Generate strongly-typed BigQuery Views based on raw JSON",
"main": "./lib/index.js",
"repository": {
@@ -76,4 +76,19 @@ describe("filesystem schema loading", () => {
const schemas = schema_loader_utils.readSchemas([globPattern]);
expect(Object.keys(schemas)).to.have.members(results);
});
it("should load schemas from a comma-separated list of file paths", () => {
const schemaFiles = `${schemaDir}/full-directory/schema-1.json,${schemaDir}/full-directory/schema-2.json`;
const schemas = Object.keys(schema_loader_utils.readSchemas([schemaFiles]));
expect(schemas.length).to.equal(2);
expect(schemas).to.include(
schema_loader_utils.filePathToSchemaName(
`${schemaDir}/full-directory/schema-1.json`
)
);
expect(schemas).to.include(
schema_loader_utils.filePathToSchemaName(
`${schemaDir}/full-directory/schema-2.json`
)
);
});
});
@@ -174,6 +174,7 @@ async function parseConfig(): Promise<CliConfig> {
program.outputHelp();
process.exit(1);
}

return {
projectId: program.project,
bigQueryProjectId: program.bigQueryProject,
@@ -63,8 +63,10 @@ function resolveFilePath(filePath: string): string {

function expandGlobs(globs: string[]): string[] {
let results = [];
for (var i = 0; i < globs.length; i++) {
const globResults = glob.sync(globs[i]);
// Split any comma-separated globs into individual paths
const expandedGlobs = globs.flatMap((g) => g.split(",").map((s) => s.trim()));
for (const globPath of expandedGlobs) {
const globResults = glob.sync(globPath);
results = results.concat(globResults);
}
return results;
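
The comma-splitting step in this hunk can be sketched in isolation. The helper below is hypothetical (the name `splitCommaSeparated` does not appear in `schema-loader-utils.ts`), and dropping empty segments is an assumption added for illustration; the diff itself only splits and trims before passing each path to `glob.sync`.

```typescript
// Hypothetical helper, not part of schema-loader-utils.ts.
// Splits every entry on commas, trims whitespace, and (as an added
// assumption) skips empty segments such as those left by a trailing comma.
function splitCommaSeparated(globs: string[]): string[] {
  const paths: string[] = [];
  for (const entry of globs) {
    for (const part of entry.split(",")) {
      const trimmed = part.trim();
      if (trimmed.length > 0) {
        paths.push(trimmed);
      }
    }
  }
  return paths;
}

// Yields the three paths a/schema-1.json, a/schema-2.json and b/*.json.
const example = splitCommaSeparated([
  "a/schema-1.json, a/schema-2.json",
  "b/*.json",
]);
console.log(example);
```

One trade-off worth keeping in mind: splitting on every comma also splits brace patterns such as `{a,b}.json`, so a caller relying on brace expansion would need a different delimiter or an escape rule.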
4 changes: 4 additions & 0 deletions firestore-translate-text/CHANGELOG.md
@@ -1,3 +1,7 @@
## Version 0.1.20

feat - add optional Gemini translations powered by Firebase Genkit

## Version 0.1.19

fixed - bump dependencies, fix vulnerabilities
18 changes: 18 additions & 0 deletions firestore-translate-text/PREINSTALL.md
@@ -33,6 +33,24 @@ of languages, such as `en,fr,de`. See the [supported languages list](https://clo

Before installing this extension, make sure that you've [set up a Cloud Firestore database](https://firebase.google.com/docs/firestore/quickstart) in your Firebase project.

#### Optional Genkit Integration

This extension optionally supports Genkit as an alternative to the Google Cloud Translation API for performing translations. With Genkit, you can leverage large language models such as Google AI Gemini or Vertex AI Gemini to generate translations.

##### How it works:
The Genkit integration lets you use the Gemini 1.5 Pro model for translations. When enabled, the extension uses the selected Genkit provider to perform translations instead of the default Cloud Translation API.

You can choose between:

- Google AI: Uses the googleai plugin with an API key.
- Vertex AI: Uses the vertexai plugin and connects to your Google Cloud Vertex AI endpoint.

In theory, a large language model like Gemini 1.5 Pro may have more contextual understanding. For example, in the sentence `I left my keys in the bank`, the model may use the surrounding context to decide whether `bank` refers to a financial institution or a riverbank, and so may provide a more accurate translation.

##### Notes:
- Using Genkit may incur additional charges based on your model provider (Google AI or Vertex AI).
- If you do not wish to use Genkit, the extension defaults to the Cloud Translation API.
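
The choice between the Cloud Translation API and the two Genkit providers can be pictured as a small decision function. This is a minimal sketch under stated assumptions: `Backend`, `TranslationSettings`, and `chooseBackend` are hypothetical names for illustration and are not the extension's actual code.

```typescript
// Hypothetical names throughout; this is not the extension's actual code.
type Backend = "cloud-translation" | "genkit-googleai" | "genkit-vertexai";

interface TranslationSettings {
  useGenkit: boolean;
  provider?: string; // expected values: "googleai" or "vertexai"
  googleAiApiKey?: string;
}

function chooseBackend(settings: TranslationSettings): Backend {
  // Without Genkit, the Cloud Translation API remains the default.
  if (!settings.useGenkit) {
    return "cloud-translation";
  }
  if (settings.provider === "googleai") {
    // The Google AI plugin is used with an API key.
    if (!settings.googleAiApiKey) {
      throw new Error("A Google AI API key is required for the googleai provider");
    }
    return "genkit-googleai";
  }
  if (settings.provider === "vertexai") {
    return "genkit-vertexai";
  }
  throw new Error("Unknown Genkit provider: " + String(settings.provider));
}

console.log(chooseBackend({ useGenkit: false }));
console.log(chooseBackend({ useGenkit: true, provider: "vertexai" }));
```

Failing fast on a missing key or an unknown provider is a design choice of this sketch; it surfaces misconfiguration at startup rather than on the first translation request.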

#### Billing
To install an extension, your project must be on the [Blaze (pay as you go) plan](https://firebase.google.com/pricing)

31 changes: 31 additions & 0 deletions firestore-translate-text/README.md
@@ -41,6 +41,24 @@ of languages, such as `en,fr,de`. See the [supported languages list](https://clo

Before installing this extension, make sure that you've [set up a Cloud Firestore database](https://firebase.google.com/docs/firestore/quickstart) in your Firebase project.

#### Optional Genkit Integration

This extension optionally supports Genkit as an alternative to the Google Cloud Translation API for performing translations. With Genkit, you can leverage large language models such as Google AI Gemini or Vertex AI Gemini to generate translations.

##### How it works:
The Genkit integration lets you use the Gemini 1.5 Pro model for translations. When enabled, the extension uses the selected Genkit provider to perform translations instead of the default Cloud Translation API.

You can choose between:

- Google AI: Uses the googleai plugin with an API key.
- Vertex AI: Uses the vertexai plugin and connects to your Google Cloud Vertex AI endpoint.

In theory, a large language model like Gemini 1.5 Pro may have more contextual understanding. For example, in the sentence `I left my keys in the bank`, the model may use the surrounding context to decide whether `bank` refers to a financial institution or a riverbank, and so may provide a more accurate translation.

##### Notes:
- Using Genkit may incur additional charges based on your model provider (Google AI or Vertex AI).
- If you do not wish to use Genkit, the extension defaults to the Cloud Translation API.

#### Billing
To install an extension, your project must be on the [Blaze (pay as you go) plan](https://firebase.google.com/pricing)

@@ -70,6 +88,15 @@ To install an extension, your project must be on the [Blaze (pay as you go) plan
* Languages field name: What is the name of the field that contains the languages that you want to translate into? This field is optional. If you don't specify it, the extension will use the languages specified in the LANGUAGES parameter.


* Use Genkit for translations?: If you want to use Genkit to perform translations, select "Yes" and provide the necessary configuration parameters. If you select "No", the extension will use Google Cloud Translation API.


* Genkit Gemini provider: If you chose to use Genkit to perform translations, select the Gemini provider (Google AI or Vertex AI) you want to use.


* Google AI API key: If you chose to use Genkit with Google AI to perform translations, provide a Google AI API key.




**Cloud Functions:**
@@ -84,6 +111,8 @@ To install an extension, your project must be on the [Blaze (pay as you go) plan

* translate.googleapis.com (Reason: To use Google Translate to translate strings into your specified target languages.)

* aiplatform.googleapis.com (Reason: To use Vertex AI Gemini models for translations, if configured to do so.)



**Access Required**:
@@ -93,3 +122,5 @@ To install an extension, your project must be on the [Blaze (pay as you go) plan
This extension will operate with the following project IAM roles:

* datastore.user (Reason: Allows the extension to write translated strings to Cloud Firestore.)

* aiplatform.user (Reason: Allows the extension to call Vertex AI Gemini models for translations, if configured to do so.)
46 changes: 45 additions & 1 deletion firestore-translate-text/extension.yaml
@@ -13,7 +13,7 @@
# limitations under the License.

name: firestore-translate-text
version: 0.1.19
version: 0.1.20
specVersion: v1beta

tags: [ai]
@@ -47,11 +47,20 @@ apis:
reason:
To use Google Translate to translate strings into your specified target
languages.
- apiName: aiplatform.googleapis.com
reason:
To use Vertex AI Gemini models for translations, if configured to do so.

roles:
- role: datastore.user
reason: Allows the extension to write translated strings to Cloud Firestore.

- role: aiplatform.user
reason: >-
Allows the extension to call Vertex AI Gemini models for translations, if
configured to do so.
resources:
- name: fstranslate
type: firebaseextensions.v1beta.function
@@ -130,6 +139,41 @@ params:
default: languages
required: false

- param: USE_GENKIT
label: Use Genkit for translations?
description: >
If you want to use Genkit to perform translations, select "Yes" and
provide the necessary configuration parameters. If you select "No", the
extension will use Google Cloud Translation API.
type: select
required: true
options:
- label: Yes
value: true
- label: No
value: false

- param: GEMINI_PROVIDER
label: Genkit Gemini provider
description: >
If you chose to use Genkit to perform translations, select the Gemini
provider (Google AI or Vertex AI) you want to use.
type: select
required: false
options:
- label: Google AI
value: googleai
- label: Vertex AI
value: vertexai

- param: GOOGLE_AI_API_KEY
label: Google AI API key
description: >
If you chose to use Genkit with Google AI to perform translations,
provide a Google AI API key.
type: secret
required: false

# - param: DO_BACKFILL
# label: Translate existing documents?
# description: >
3,949 changes: 3,586 additions & 363 deletions firestore-translate-text/functions/package-lock.json

Large diffs are not rendered by default.

8 changes: 6 additions & 2 deletions firestore-translate-text/functions/package.json
@@ -12,20 +12,24 @@
"generate-readme": "firebase ext:info .. --markdown > ../README.md"
},
"dependencies": {
"@genkit-ai/googleai": "^0.9.7",
"@genkit-ai/vertexai": "^0.9.7",
"@google-cloud/translate": "^8.2.0",
"@google-cloud/vertexai": "^1.9.2",
"@types/express-serve-static-core": "4.19.0",
"@types/node": "^20.10.3",
"firebase-admin": "^12.1.0",
"firebase-functions": "^4.9.0",
"genkit": "^0.9.7",
"rimraf": "^2.6.3",
"typescript": "^4.8.4"
},
"devDependencies": {
"@types/jest": "29.5.0",
"firebase-functions-test": "3.2.0",
"jest": "29.5.0",
"js-yaml": "^3.13.1",
"mocked-env": "^1.3.1",
"@types/jest": "29.5.0",
"jest": "29.5.0",
"ts-jest": "29.1.2"
},
"private": true
3 changes: 3 additions & 0 deletions firestore-translate-text/functions/src/config.ts
@@ -21,4 +21,7 @@ export default {
inputFieldName: process.env.INPUT_FIELD_NAME,
outputFieldName: process.env.OUTPUT_FIELD_NAME,
languagesFieldName: process.env.LANGUAGES_FIELD_NAME,
useGenkit: process.env.USE_GENKIT === "true",
geminiProvider: process.env.GEMINI_PROVIDER,
googleAIAPIKey: process.env.GOOGLE_AI_API_KEY,
};
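
As a sketch of how these three settings could be read and typed, the function below takes the environment as an argument so it can be exercised without `process.env`. The parameter names `USE_GENKIT`, `GEMINI_PROVIDER`, and `GOOGLE_AI_API_KEY` are the ones declared in `extension.yaml` in this commit; `Env`, `GenkitConfig`, and `readGenkitConfig` are hypothetical names used only for illustration.

```typescript
// Hypothetical types and function; only the parameter names come from
// extension.yaml in this commit.
type Env = Record<string, string | undefined>;

interface GenkitConfig {
  useGenkit: boolean;
  geminiProvider?: string;
  googleAIAPIKey?: string;
}

function readGenkitConfig(env: Env): GenkitConfig {
  return {
    // Environment values are strings, so compare against "true" explicitly;
    // a plain truthiness check would treat the string "false" as enabled.
    useGenkit: env["USE_GENKIT"] === "true",
    geminiProvider: env["GEMINI_PROVIDER"],
    googleAIAPIKey: env["GOOGLE_AI_API_KEY"],
  };
}

const sample = readGenkitConfig({
  USE_GENKIT: "false",
  GEMINI_PROVIDER: "googleai",
});
console.log(sample.useGenkit); // false, despite "false" being a non-empty string
```

Passing the environment in, rather than reading a global, is a choice made here to keep the sketch easy to test.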
297 changes: 234 additions & 63 deletions firestore-translate-text/functions/src/translate/common.ts
@@ -3,79 +3,250 @@ import * as logs from "../logs";
import * as events from "../events";
import * as admin from "firebase-admin";
import config from "../config";
import { genkit, Genkit, z, ModelReference } from "genkit";
import vertexAI, {
gemini15Pro as gemini15ProVertex,
} from "@genkit-ai/vertexai";
import {
gemini15Pro as gemini15ProGoogleAI,
googleAI,
} from "@genkit-ai/googleai";

/**
* Represents a translation result with target language and translated text
*/
export type Translation = {
language: string;
output: string;
};

export const translate = new v2.Translate({
projectId: process.env.PROJECT_ID,
});

export const translateString = async (
string: string,
targetLanguage: string
): Promise<string> => {
try {
const [translatedString] = await translate.translate(
string,
targetLanguage
);
logs.translateStringComplete(string, targetLanguage, translatedString);
return translatedString;
} catch (err) {
logs.translateStringError(string, targetLanguage, err);
await events.recordErrorEvent(err as Error);
throw err;
/**
* Interface defining the contract for translator implementations
*/
interface ITranslator {
/**
* Translates text to a target language
* @param text - The text to translate
* @param targetLanguage - The language code to translate to
* @returns A promise resolving to the translated text
*/
translate(text: string, targetLanguage: string): Promise<string>;
}

/**
* Implementation of ITranslator using Google Cloud Translation API v2
*/
export class GoogleTranslator implements ITranslator {
private client: v2.Translate;

/**
* Creates a new instance of GoogleTranslator
* @param projectId - The Google Cloud project ID
*/
constructor(projectId: string) {
this.client = new v2.Translate({ projectId });
}
};

export const extractInput = (
snapshot: admin.firestore.DocumentSnapshot
): any => {
return snapshot.get(config.inputFieldName);
};
/**
* Translates text using Google Cloud Translation API
* @param text - The text to translate
* @param targetLanguage - The language code to translate to
* @returns A promise resolving to the translated text
* @throws Will throw an error if translation fails
*/
async translate(text: string, targetLanguage: string): Promise<string> {
try {
const [translatedString] = await this.client.translate(
text,
targetLanguage
);
logs.translateStringComplete(text, targetLanguage, translatedString);
return translatedString;
} catch (err) {
logs.translateStringError(text, targetLanguage, err);
await events.recordErrorEvent(err as Error);
throw err;
}
}
}

export const extractOutput = (
snapshot: admin.firestore.DocumentSnapshot
): any => {
return snapshot.get(config.outputFieldName);
};
/**
* Implementation of ITranslator using Genkit with either Vertex AI or Google AI
*/
export class GenkitTranslator implements ITranslator {
private client: Genkit;
plugin: "vertexai" | "googleai";
model: ModelReference<any>;

export const extractLanguages = (
snapshot: admin.firestore.DocumentSnapshot
): string[] => {
if (!config.languagesFieldName) return config.languages;
return snapshot.get(config.languagesFieldName) || config.languages;
};
/**
* Creates a new instance of GenkitTranslator
* @param options - Configuration options for the translator
* @param options.plugin - The AI service provider to use ("vertexai" or "googleai")
* @throws Will throw an error if required API keys are missing
*/
constructor({ plugin }: { plugin: "vertexai" | "googleai" }) {
this.plugin = plugin;
if (plugin === "googleai" && !config.googleAIAPIKey) {
throw new Error(
"Google AI API key is required for Genkit Google AI translations"
);
}

export const filterLanguagesFn = (
existingTranslations: Record<string, any>
): ((targetLanguage: string) => boolean) => {
return (targetLanguage: string) => {
if (existingTranslations[targetLanguage] != undefined) {
logs.skippingLanguage(targetLanguage);
return false;
this.model =
plugin === "vertexai" ? gemini15ProVertex : gemini15ProGoogleAI;

const plugins =
plugin === "vertexai"
? [vertexAI({ location: process.env.LOCATION! })]
: [googleAI({ apiKey: config.googleAIAPIKey })];

this.client = genkit({
plugins,
});
}

/**
* Translates text using Genkit with either Vertex AI or Google AI
* @param text - The text to translate
* @param targetLanguage - The language code to translate to
* @returns A promise resolving to the translated text
* @throws Will throw an error if translation fails or no output is returned
*/
async translate(text: string, targetLanguage: string): Promise<string> {
try {
const prompt =
"Translate the following text to " + targetLanguage + ":\n" + text;

const response = await this.client.generate({
model: this.model,
output: {
format: "json",
schema: z.object({
translation: z.string(),
}),
},
prompt: prompt,
});

if (!response.output) {
throw new Error("No translation returned from Gemini 1.5 Pro");
}

logs.translateStringComplete(text, targetLanguage, response.output.translation);
return response.output.translation;
} catch (err) {
logs.translateStringError(text, targetLanguage, err);
await events.recordErrorEvent(err as Error);
throw err;
}
return true;
};
};
}
}

export const updateTranslations = async (
snapshot: admin.firestore.DocumentSnapshot,
translations: any
): Promise<void> => {
logs.updateDocument(snapshot.ref.path);
// Wrapping in transaction to allow for automatic retries (#48)
await admin.firestore().runTransaction((transaction) => {
transaction.update(snapshot.ref, config.outputFieldName, translations);
return Promise.resolve();
});

logs.updateDocumentComplete(snapshot.ref.path);
await events.recordSuccessEvent({
subject: snapshot.ref.path,
data: { outputFieldName: config.outputFieldName, translations },
});
};
/**
* Service class that orchestrates translation operations using the provided translator
*/
export class TranslationService {
/**
* Creates a new instance of TranslationService
* @param translator - The translator implementation to use
*/
constructor(private translator: ITranslator) {}

/**
* Translates a string to the specified target language
* @param text - The text to translate
* @param targetLanguage - The language code to translate to
* @returns A promise resolving to the translated text
*/
async translateString(text: string, targetLanguage: string): Promise<string> {
return this.translator.translate(text, targetLanguage);
}

/**
* Extracts input field value from a Firestore document
* @param snapshot - The Firestore document snapshot
* @returns The value of the configured input field
*/
extractInput(snapshot: admin.firestore.DocumentSnapshot): any {
return snapshot.get(config.inputFieldName);
}

/**
* Extracts output field value from a Firestore document
* @param snapshot - The Firestore document snapshot
* @returns The value of the configured output field
*/
extractOutput(snapshot: admin.firestore.DocumentSnapshot): any {
return snapshot.get(config.outputFieldName);
}

/**
* Extracts target languages from a Firestore document or returns default languages
* @param snapshot - The Firestore document snapshot
* @returns Array of language codes to translate to
*/
extractLanguages(snapshot: admin.firestore.DocumentSnapshot): string[] {
if (!config.languagesFieldName) return config.languages;
return snapshot.get(config.languagesFieldName) || config.languages;
}

/**
* Creates a filter function to skip already translated languages
* @param existingTranslations - Record of existing translations
* @returns A function that returns true for languages that need translation
*/
filterLanguagesFn(
existingTranslations: Record<string, any>
): (targetLanguage: string) => boolean {
return (targetLanguage: string) => {
if (existingTranslations[targetLanguage] != undefined) {
logs.skippingLanguage(targetLanguage);
return false;
}
return true;
};
}

/**
* Updates translations in a Firestore document
* @param snapshot - The Firestore document snapshot
* @param translations - The translations to update
* @returns A promise that resolves when the update is complete
*/
async updateTranslations(
snapshot: admin.firestore.DocumentSnapshot,
translations: any
): Promise<void> {
logs.updateDocument(snapshot.ref.path);

await admin.firestore().runTransaction((transaction) => {
transaction.update(snapshot.ref, config.outputFieldName, translations);
return Promise.resolve();
});

logs.updateDocumentComplete(snapshot.ref.path);
await events.recordSuccessEvent({
subject: snapshot.ref.path,
data: { outputFieldName: config.outputFieldName, translations },
});
}
}

// Initialize the translation service based on configuration
const translationService = config.useGenkit
? new TranslationService(new GenkitTranslator({ plugin: config.geminiProvider === "googleai" ? "googleai" : "vertexai" }))
: new TranslationService(new GoogleTranslator(process.env.PROJECT_ID));

// Export bound methods for convenience
export const translateString =
translationService.translateString.bind(translationService);
export const extractInput =
translationService.extractInput.bind(translationService);
export const extractOutput =
translationService.extractOutput.bind(translationService);
export const extractLanguages =
translationService.extractLanguages.bind(translationService);
export const filterLanguagesFn =
translationService.filterLanguagesFn.bind(translationService);
export const updateTranslations =
translationService.updateTranslations.bind(translationService);
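
The prompt construction and structured-output check in `GenkitTranslator.translate` can be sketched without a model call. The helpers below are hypothetical (`buildPrompt` and `parseTranslation` do not exist in `common.ts`), and the hand-written shape check stands in for the `z.object({ translation: z.string() })` schema used in the diff; the Genkit client call itself is deliberately left out.

```typescript
// Hypothetical helpers; the actual model call is intentionally omitted.

// Builds the same prompt string that the diff concatenates inline.
function buildPrompt(text: string, targetLanguage: string): string {
  return "Translate the following text to " + targetLanguage + ":\n" + text;
}

// Parses a JSON reply and checks that it has the shape { translation: string }.
// This manual check is an assumption standing in for schema validation.
function parseTranslation(raw: string): string {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Model reply is not a JSON object");
  }
  const translation = (parsed as { translation?: unknown }).translation;
  if (typeof translation !== "string") {
    throw new Error("Model reply has no string 'translation' field");
  }
  return translation;
}

console.log(buildPrompt("Hello", "fr"));
console.log(parseTranslation('{"translation":"Bonjour"}')); // Bonjour
```

Requesting a fixed JSON shape and validating it before use means a malformed or chatty model reply fails loudly instead of being written to Firestore as if it were a translation.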
3 changes: 2 additions & 1 deletion firestore-translate-text/functions/tsconfig.json
@@ -5,7 +5,8 @@
"module": "commonjs",
"noImplicitReturns": true,
"sourceMap": false,
"outDir": "lib"
"outDir": "lib",
"skipLibCheck": true
},
"compileOnSave": true,
"include": ["src"]
183 changes: 108 additions & 75 deletions package-lock.json

Large diffs are not rendered by default.
