diff --git a/clients/client-glue/README.md b/clients/client-glue/README.md
index d52cc2f7dd1d1..92974bb731a14 100644
--- a/clients/client-glue/README.md
+++ b/clients/client-glue/README.md
@@ -324,6 +324,14 @@ BatchGetWorkflows
[Command API Reference](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/client/glue/command/BatchGetWorkflowsCommand/) / [Input](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/BatchGetWorkflowsCommandInput/) / [Output](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/BatchGetWorkflowsCommandOutput/)
+
+BatchPutDataQualityStatisticAnnotation
+
+
+[Command API Reference](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/client/glue/command/BatchPutDataQualityStatisticAnnotationCommand/) / [Input](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/BatchPutDataQualityStatisticAnnotationCommandInput/) / [Output](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/BatchPutDataQualityStatisticAnnotationCommandOutput/)
+
@@ -940,6 +948,22 @@ GetDataflowGraph
[Command API Reference](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/client/glue/command/GetDataflowGraphCommand/) / [Input](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/GetDataflowGraphCommandInput/) / [Output](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/GetDataflowGraphCommandOutput/)
+
+GetDataQualityModel
+
+
+[Command API Reference](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/client/glue/command/GetDataQualityModelCommand/) / [Input](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/GetDataQualityModelCommandInput/) / [Output](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/GetDataQualityModelCommandOutput/)
+
+
+GetDataQualityModelResult
+
+
+[Command API Reference](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/client/glue/command/GetDataQualityModelResultCommand/) / [Input](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/GetDataQualityModelResultCommandInput/) / [Output](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/GetDataQualityModelResultCommandOutput/)
+
@@ -1412,6 +1436,22 @@ ListDataQualityRulesets
[Command API Reference](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/client/glue/command/ListDataQualityRulesetsCommand/) / [Input](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/ListDataQualityRulesetsCommandInput/) / [Output](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/ListDataQualityRulesetsCommandOutput/)
+
+ListDataQualityStatisticAnnotations
+
+
+[Command API Reference](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/client/glue/command/ListDataQualityStatisticAnnotationsCommand/) / [Input](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/ListDataQualityStatisticAnnotationsCommandInput/) / [Output](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/ListDataQualityStatisticAnnotationsCommandOutput/)
+
+
+ListDataQualityStatistics
+
+
+[Command API Reference](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/client/glue/command/ListDataQualityStatisticsCommand/) / [Input](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/ListDataQualityStatisticsCommandInput/) / [Output](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/ListDataQualityStatisticsCommandOutput/)
+
@@ -1516,6 +1556,14 @@ PutDataCatalogEncryptionSettings
[Command API Reference](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/client/glue/command/PutDataCatalogEncryptionSettingsCommand/) / [Input](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/PutDataCatalogEncryptionSettingsCommandInput/) / [Output](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/PutDataCatalogEncryptionSettingsCommandOutput/)
+
+PutDataQualityProfileAnnotation
+
+
+[Command API Reference](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/client/glue/command/PutDataQualityProfileAnnotationCommand/) / [Input](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/PutDataQualityProfileAnnotationCommandInput/) / [Output](https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-glue/Interface/PutDataQualityProfileAnnotationCommandOutput/)
+
diff --git a/clients/client-glue/src/Glue.ts b/clients/client-glue/src/Glue.ts
index 13b7a4bdc30a9..350857eb8bbfd 100644
--- a/clients/client-glue/src/Glue.ts
+++ b/clients/client-glue/src/Glue.ts
@@ -77,6 +77,11 @@ import {
BatchGetWorkflowsCommandInput,
BatchGetWorkflowsCommandOutput,
} from "./commands/BatchGetWorkflowsCommand";
+import {
+ BatchPutDataQualityStatisticAnnotationCommand,
+ BatchPutDataQualityStatisticAnnotationCommandInput,
+ BatchPutDataQualityStatisticAnnotationCommandOutput,
+} from "./commands/BatchPutDataQualityStatisticAnnotationCommand";
import {
BatchStopJobRunCommand,
BatchStopJobRunCommandInput,
@@ -434,6 +439,16 @@ import {
GetDataflowGraphCommandInput,
GetDataflowGraphCommandOutput,
} from "./commands/GetDataflowGraphCommand";
+import {
+ GetDataQualityModelCommand,
+ GetDataQualityModelCommandInput,
+ GetDataQualityModelCommandOutput,
+} from "./commands/GetDataQualityModelCommand";
+import {
+ GetDataQualityModelResultCommand,
+ GetDataQualityModelResultCommandInput,
+ GetDataQualityModelResultCommandOutput,
+} from "./commands/GetDataQualityModelResultCommand";
import {
GetDataQualityResultCommand,
GetDataQualityResultCommandInput,
@@ -665,6 +680,16 @@ import {
ListDataQualityRulesetsCommandInput,
ListDataQualityRulesetsCommandOutput,
} from "./commands/ListDataQualityRulesetsCommand";
+import {
+ ListDataQualityStatisticAnnotationsCommand,
+ ListDataQualityStatisticAnnotationsCommandInput,
+ ListDataQualityStatisticAnnotationsCommandOutput,
+} from "./commands/ListDataQualityStatisticAnnotationsCommand";
+import {
+ ListDataQualityStatisticsCommand,
+ ListDataQualityStatisticsCommandInput,
+ ListDataQualityStatisticsCommandOutput,
+} from "./commands/ListDataQualityStatisticsCommand";
import {
ListDevEndpointsCommand,
ListDevEndpointsCommandInput,
@@ -722,6 +747,11 @@ import {
PutDataCatalogEncryptionSettingsCommandInput,
PutDataCatalogEncryptionSettingsCommandOutput,
} from "./commands/PutDataCatalogEncryptionSettingsCommand";
+import {
+ PutDataQualityProfileAnnotationCommand,
+ PutDataQualityProfileAnnotationCommandInput,
+ PutDataQualityProfileAnnotationCommandOutput,
+} from "./commands/PutDataQualityProfileAnnotationCommand";
import {
PutResourcePolicyCommand,
PutResourcePolicyCommandInput,
@@ -982,6 +1012,7 @@ const commands = {
BatchGetTableOptimizerCommand,
BatchGetTriggersCommand,
BatchGetWorkflowsCommand,
+ BatchPutDataQualityStatisticAnnotationCommand,
BatchStopJobRunCommand,
BatchUpdatePartitionCommand,
CancelDataQualityRuleRecommendationRunCommand,
@@ -1059,6 +1090,8 @@ const commands = {
GetDatabasesCommand,
GetDataCatalogEncryptionSettingsCommand,
GetDataflowGraphCommand,
+ GetDataQualityModelCommand,
+ GetDataQualityModelResultCommand,
GetDataQualityResultCommand,
GetDataQualityRuleRecommendationRunCommand,
GetDataQualityRulesetCommand,
@@ -1118,6 +1151,8 @@ const commands = {
ListDataQualityRuleRecommendationRunsCommand,
ListDataQualityRulesetEvaluationRunsCommand,
ListDataQualityRulesetsCommand,
+ ListDataQualityStatisticAnnotationsCommand,
+ ListDataQualityStatisticsCommand,
ListDevEndpointsCommand,
ListJobsCommand,
ListMLTransformsCommand,
@@ -1131,6 +1166,7 @@ const commands = {
ListUsageProfilesCommand,
ListWorkflowsCommand,
PutDataCatalogEncryptionSettingsCommand,
+ PutDataQualityProfileAnnotationCommand,
PutResourcePolicyCommand,
PutSchemaVersionMetadataCommand,
PutWorkflowRunPropertiesCommand,
@@ -1437,6 +1473,23 @@ export interface Glue {
cb: (err: any, data?: BatchGetWorkflowsCommandOutput) => void
): void;
+ /**
+ * @see {@link BatchPutDataQualityStatisticAnnotationCommand}
+ */
+ batchPutDataQualityStatisticAnnotation(
+ args: BatchPutDataQualityStatisticAnnotationCommandInput,
+ options?: __HttpHandlerOptions
+  ): Promise<BatchPutDataQualityStatisticAnnotationCommandOutput>;
+  batchPutDataQualityStatisticAnnotation(
+    args: BatchPutDataQualityStatisticAnnotationCommandInput,
+    cb: (err: any, data?: BatchPutDataQualityStatisticAnnotationCommandOutput) => void
+  ): void;
+  batchPutDataQualityStatisticAnnotation(
+    args: BatchPutDataQualityStatisticAnnotationCommandInput,
+    options: __HttpHandlerOptions,
+    cb: (err: any, data?: BatchPutDataQualityStatisticAnnotationCommandOutput) => void
+  ): void;
DataQualityMetricValues
representing the analysis of the data quality metric value.DatapointInclusionAnnotation
's.AnnotationError
's.CreateClassifier
to create.OpenCSVSerDe
, LazySimpleSerDe
, and None
. You can specify the None
value when you want the crawler to do the detection.grok
classifier for CreateClassifier
- * to create.CreateClassifier
to create.JsonPath
string defining the JSON data for the classifier to classify.
- * Glue supports a subset of JsonPath, as described in Writing JsonPath Custom Classifiers.CreateClassifier
to create.OpenCSVSerDe
, LazySimpleSerDe
, and None
. You can specify the None
value when you want the crawler to do the detection.grok
classifier for CreateClassifier
+ * to create.CreateClassifier
to create.JsonPath
string defining the JSON data for the classifier to classify.
+ * Glue supports a subset of JsonPath, as described in Writing JsonPath Custom Classifiers.CreateClassifier
to create.evaluationContext
can differentiate the nodes.DataQualityRuleResult
objects representing the results for each rule. DataQualityAnalyzerResult
objects representing the results for each analyzer. DataQualityObservation
objects representing the observations generated after evaluating the rules and analyzers. StatisticModelResult
+ * G.1X
workers to be used in the run. The default is 5.TIMEOUT
status. The default is 2,880 minutes (48 hours).evaluationContext
can differentiate the nodes.DataQualityRuleResult
objects representing the results for each rule. DataQualityAnalyzerResult
objects representing the results for each analyzer. DataQualityObservation
objects representing the observations generated after evaluating the rules and analyzers. G.1X
workers to be used in the run. The default is 5.TIMEOUT
status. The default is 2,880 minutes (48 hours).TransformParameters
object. You can use parameters to tune (customize) the
- * behavior of the machine learning transform by specifying what data it learns from and your
- * preference on various tradeoffs (such as precious vs. recall, or accuracy vs. cost).EvaluationMetrics
object. Evaluation metrics provide an estimate of the quality of your machine learning transform.
- *
- * @public
- */
- Role?: string;
-
- /**
- * MaxCapacity
is a mutually exclusive option with NumberOfWorkers
and WorkerType
.
- *
- * NumberOfWorkers
or WorkerType
is set, then MaxCapacity
cannot be set.MaxCapacity
is set then neither NumberOfWorkers
or WorkerType
can be set.WorkerType
is set, then NumberOfWorkers
is required (and vice versa).MaxCapacity
and NumberOfWorkers
must both be at least 1.WorkerType
field is set to a value other than Standard
, the MaxCapacity
field is set automatically and becomes read-only.
- *
- * Standard
worker type, each worker provides 4 vCPU, 16 GB of memory and a 50GB disk, and 2 executors per worker.G.1X
worker type, each worker provides 4 vCPU, 16 GB of memory and a 64GB disk, and 1 executor per worker.G.2X
worker type, each worker provides 8 vCPU, 32 GB of memory and a 128GB disk, and 1 executor per worker.MaxCapacity
is a mutually exclusive option with NumberOfWorkers
and WorkerType
.
- *
- * @public
- */
- WorkerType?: WorkerType;
-
- /**
- * NumberOfWorkers
or WorkerType
is set, then MaxCapacity
cannot be set.MaxCapacity
is set then neither NumberOfWorkers
or WorkerType
can be set.WorkerType
is set, then NumberOfWorkers
is required (and vice versa).MaxCapacity
and NumberOfWorkers
must both be at least 1.workerType
that are allocated when a task of the transform runs.WorkerType
is set, then NumberOfWorkers
is required (and vice versa).MLTaskRun
of the machine
- * learning transform fails.Partition
- * object.
- *
- * @public
- */
-export interface BackfillError {
- /**
- * KeySchemaElement
structures, for the partition index.
- *
- * @public
- */
- IndexStatus: PartitionIndexStatus | undefined;
-
- /**
- * SegmentNumber
values range from 0 through 3.WHERE
filter clause. The
- * SQL statement parser JSQLParser parses the expression. Expression
API call:
- *
- *
- *
- * string
- * date
- * timestamp
- * int
- * bigint
- * long
- * tinyint
- * smallint
- * decimal
- * partitionKey
type is created as a STRING
, to be compatible with the catalog
- * partitions. TransactionId
.TransformParameters
object. You can use parameters to tune (customize) the
+ * behavior of the machine learning transform by specifying what data it learns from and your
+ * preference on various tradeoffs (such as precious vs. recall, or accuracy vs. cost).EvaluationMetrics
object. Evaluation metrics provide an estimate of the quality of your machine learning transform.
+ *
* @public
*/
- Location?: Location;
+ Role?: string;
/**
- * MaxCapacity
is a mutually exclusive option with NumberOfWorkers
and WorkerType
.
*
NumberOfWorkers
or WorkerType
is set, then MaxCapacity
cannot be set.MaxCapacity
is set then neither NumberOfWorkers
or WorkerType
can be set.WorkerType
is set, then NumberOfWorkers
is required (and vice versa).inferSchema
 —  Specifies whether to set inferSchema
to true or false for the default script generated by an Glue job. For example, to set inferSchema
to true, pass the following key value pair:MaxCapacity
and NumberOfWorkers
must both be at least 1.
When the WorkerType
field is set to a value other than Standard
, the MaxCapacity
field is set automatically and becomes read-only.
The type of predefined worker that is allocated when a task of this transform runs. Accepts a value of Standard, G.1X, or G.2X.
+ *For the Standard
worker type, each worker provides 4 vCPU, 16 GB of memory and a 50GB disk, and 2 executors per worker.
For the G.1X
worker type, each worker provides 4 vCPU, 16 GB of memory and a 64GB disk, and 1 executor per worker.
For the G.2X
worker type, each worker provides 8 vCPU, 32 GB of memory and a 128GB disk, and 1 executor per worker.
+ * MaxCapacity
is a mutually exclusive option with NumberOfWorkers
and WorkerType
.
If either NumberOfWorkers
or WorkerType
is set, then MaxCapacity
cannot be set.
If MaxCapacity
is set then neither NumberOfWorkers
or WorkerType
can be set.
If WorkerType
is set, then NumberOfWorkers
is required (and vice versa).
- * --additional-plan-options-map '\{"inferSchema":"true"\}'
- *
MaxCapacity
and NumberOfWorkers
must both be at least 1.
* A Python script to perform the mapping.
+ *The number of workers of a defined workerType
that are allocated when a task of the transform runs.
If WorkerType
is set, then NumberOfWorkers
is required (and vice versa).
The Scala code to perform the mapping.
+ *The timeout in minutes of the machine learning transform.
* @public */ - ScalaCode?: string; -} + Timeout?: number; -/** - * @public - */ -export interface GetRegistryInput { /** - *This is a wrapper structure that may contain the registry name and Amazon Resource Name (ARN).
+ *The maximum number of times to retry after an MLTaskRun
of the machine
+ * learning transform fails.
The encryption-at-rest settings of the transform that apply to accessing user data. Machine learning transforms can access user data encrypted in Amazon S3 using KMS.
+ * @public + */ + TransformEncryption?: TransformEncryption; } /** * @public */ -export interface GetRegistryResponse { +export interface GetMLTransformsResponse { /** - *The name of the registry.
+ *A list of machine learning transforms.
* @public */ - RegistryName?: string; + Transforms: MLTransform[] | undefined; /** - *The Amazon Resource Name (ARN) of the registry.
+ *A pagination token, if more results are available.
* @public */ - RegistryArn?: string; - + NextToken?: string; +} + +/** + * @public + */ +export interface GetPartitionRequest { /** - *A description of the registry.
+ *The ID of the Data Catalog where the partition in question resides. If none is provided, + * the Amazon Web Services account ID is used by default.
* @public */ - Description?: string; + CatalogId?: string; /** - *The status of the registry.
+ *The name of the catalog database where the partition resides.
* @public */ - Status?: RegistryStatus; + DatabaseName: string | undefined; /** - *The date and time the registry was created.
+ *The name of the partition's table.
* @public */ - CreatedTime?: string; + TableName: string | undefined; /** - *The date and time the registry was updated.
+ *The values that define the partition.
* @public */ - UpdatedTime?: string; + PartitionValues: string[] | undefined; } /** * @public */ -export interface GetResourcePoliciesRequest { - /** - *A continuation token, if this is a continuation request.
- * @public - */ - NextToken?: string; - +export interface GetPartitionResponse { /** - *The maximum size of a list to return.
+ *The requested information, in the form of a Partition
+ * object.
A structure for returning a resource policy.
* @public */ -export interface GluePolicy { +export interface GetPartitionIndexesRequest { /** - *Contains the requested policy document, in JSON format.
+ *The catalog ID where the table resides.
* @public */ - PolicyInJson?: string; + CatalogId?: string; /** - *Contains the hash value associated with this policy.
+ *Specifies the name of a database from which you want to retrieve partition indexes.
* @public */ - PolicyHash?: string; + DatabaseName: string | undefined; /** - *The date and time at which the policy was created.
+ *Specifies the name of a table for which you want to retrieve the partition indexes.
* @public */ - CreateTime?: Date; + TableName: string | undefined; /** - *The date and time at which the policy was last updated.
+ *A continuation token, included if this is a continuation call.
* @public */ - UpdateTime?: Date; + NextToken?: string; } /** * @public + * @enum */ -export interface GetResourcePoliciesResponse { - /** - *A list of the individual resource policies and the account-level resource policy.
- * @public - */ - GetResourcePoliciesResponseList?: GluePolicy[]; - - /** - *A continuation token, if the returned list does not contain the last resource policy available.
- * @public - */ - NextToken?: string; -} +export const BackfillErrorCode = { + ENCRYPTED_PARTITION_ERROR: "ENCRYPTED_PARTITION_ERROR", + INTERNAL_ERROR: "INTERNAL_ERROR", + INVALID_PARTITION_TYPE_DATA_ERROR: "INVALID_PARTITION_TYPE_DATA_ERROR", + MISSING_PARTITION_VALUE_ERROR: "MISSING_PARTITION_VALUE_ERROR", + UNSUPPORTED_PARTITION_CHARACTER_ERROR: "UNSUPPORTED_PARTITION_CHARACTER_ERROR", +} as const; /** * @public */ -export interface GetResourcePolicyRequest { - /** - *The ARN of the Glue resource for which to retrieve the resource policy. If not
- * supplied, the Data Catalog resource policy is returned. Use GetResourcePolicies
- * to view all existing resource policies. For more information see Specifying Glue Resource ARNs.
- *
A list of errors that can occur when registering partition indexes for an existing table.
+ *These errors give the details about why an index registration failed and provide a limited number of partitions in the response, so that you can fix the partitions at fault and try registering the index again. The most common set of errors that can occur are categorized as follows:
+ *EncryptedPartitionError: The partitions are encrypted.
+ *InvalidPartitionTypeDataError: The partition value doesn't match the data type for that partition column.
+ *MissingPartitionValueError: The partitions are encrypted.
+ *UnsupportedPartitionCharacterError: Characters inside the partition value are not supported. For example: U+0000 , U+0001, U+0002.
+ *InternalError: Any error which does not belong to other error codes.
+ *Contains the requested policy document, in JSON format.
- * @public - */ - PolicyInJson?: string; - - /** - *Contains the hash value associated with this policy.
- * @public - */ - PolicyHash?: string; - +export interface BackfillError { /** - *The date and time at which the policy was created.
+ *The error code for an error that occurred when registering partition indexes for an existing table.
* @public */ - CreateTime?: Date; + Code?: BackfillErrorCode; /** - *The date and time at which the policy was last updated.
+ *A list of a limited number of partitions in the response.
* @public */ - UpdateTime?: Date; + Partitions?: PartitionValueList[]; } /** * @public + * @enum */ -export interface GetSchemaInput { - /** - *This is a wrapper structure to contain schema identity fields. The structure contains:
- *SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. Either SchemaArn
or SchemaName
and RegistryName
has to be provided.
SchemaId$SchemaName: The name of the schema. Either SchemaArn
or SchemaName
and RegistryName
has to be provided.
The name of the registry.
- * @public - */ - RegistryName?: string; - - /** - *The Amazon Resource Name (ARN) of the registry.
- * @public - */ - RegistryArn?: string; - - /** - *The name of the schema.
- * @public - */ - SchemaName?: string; - - /** - *The Amazon Resource Name (ARN) of the schema.
- * @public - */ - SchemaArn?: string; - - /** - *A description of schema if specified when created
- * @public - */ - Description?: string; - - /** - *The data format of the schema definition. Currently AVRO
, JSON
and PROTOBUF
are supported.
The compatibility mode of the schema.
- * @public - */ - Compatibility?: Compatibility; - - /** - *The version number of the checkpoint (the last time the compatibility mode was changed).
- * @public - */ - SchemaCheckpoint?: number; - - /** - *The latest version of the schema associated with the returned schema definition.
- * @public - */ - LatestSchemaVersion?: number; +export type PartitionIndexStatus = (typeof PartitionIndexStatus)[keyof typeof PartitionIndexStatus]; +/** + *A partition key pair consisting of a name and a type.
+ * @public + */ +export interface KeySchemaElement { /** - *The next version of the schema associated with the returned schema definition.
+ *The name of a partition key.
* @public */ - NextSchemaVersion?: number; + Name: string | undefined; /** - *The status of the schema.
+ *The type of a partition key.
* @public */ - SchemaStatus?: SchemaStatus; + Type: string | undefined; +} +/** + *A descriptor for a partition index in a table.
+ * @public + */ +export interface PartitionIndexDescriptor { /** - *The date and time the schema was created.
+ *The name of the partition index.
* @public */ - CreatedTime?: string; + IndexName: string | undefined; /** - *The date and time the schema was updated.
+ *A list of one or more keys, as KeySchemaElement
structures, for the partition index.
This is a wrapper structure to contain schema identity fields. The structure contains:
+ *The status of the partition index.
+ *The possible statuses are:
*SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. One of SchemaArn
or SchemaName
has to be provided.
CREATING: The index is being created. When an index is in a CREATING state, the index or its table cannot be deleted.
*SchemaId$SchemaName: The name of the schema. One of SchemaArn
or SchemaName
has to be provided.
ACTIVE: The index creation succeeds.
+ *FAILED: The index creation fails.
+ *DELETING: The index is deleted from the list of indexes.
*The definition of the schema for which schema details are required.
+ *A list of errors that can occur when registering partition indexes for an existing table.
* @public */ - SchemaDefinition: string | undefined; + BackfillErrors?: BackfillError[]; } /** * @public */ -export interface GetSchemaByDefinitionResponse { - /** - *The schema ID of the schema version.
- * @public - */ - SchemaVersionId?: string; - - /** - *The Amazon Resource Name (ARN) of the schema.
- * @public - */ - SchemaArn?: string; - - /** - *The data format of the schema definition. Currently AVRO
, JSON
and PROTOBUF
are supported.
The status of the schema version.
+ *A list of index descriptors.
* @public */ - Status?: SchemaVersionStatus; + PartitionIndexDescriptorList?: PartitionIndexDescriptor[]; /** - *The date and time the schema was created.
+ *A continuation token, present if the current list segment is not the last.
* @public */ - CreatedTime?: string; + NextToken?: string; } /** - *A structure containing the schema version information.
+ *Defines a non-overlapping region of a table's partitions, allowing + * multiple requests to be run in parallel.
* @public */ -export interface SchemaVersionNumber { +export interface Segment { /** - *The latest version available for the schema.
+ *The zero-based index number of the segment. For example, if the total number of segments
+ * is 4, SegmentNumber
values range from 0 through 3.
The version number of the schema.
+ *The total number of segments.
* @public */ - VersionNumber?: number; + TotalSegments: number | undefined; } /** * @public */ -export interface GetSchemaVersionInput { - /** - *This is a wrapper structure to contain schema identity fields. The structure contains:
- *SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. Either SchemaArn
or SchemaName
and RegistryName
has to be provided.
SchemaId$SchemaName: The name of the schema. Either SchemaArn
or SchemaName
and RegistryName
has to be provided.
The SchemaVersionId
of the schema version. This field is required for fetching by schema ID. Either this or the SchemaId
wrapper has to be provided.
The ID of the Data Catalog where the partitions in question reside. If none is provided, + * the Amazon Web Services account ID is used by default.
* @public */ - SchemaVersionId?: string; + CatalogId?: string; /** - *The version number of the schema.
+ *The name of the catalog database where the partitions reside.
* @public */ - SchemaVersionNumber?: SchemaVersionNumber; -} + DatabaseName: string | undefined; -/** - * @public - */ -export interface GetSchemaVersionResponse { /** - *The SchemaVersionId
of the schema version.
The name of the partitions' table.
* @public */ - SchemaVersionId?: string; + TableName: string | undefined; /** - *The schema definition for the schema ID.
- * @public - */ - SchemaDefinition?: string; - - /** - *The data format of the schema definition. Currently AVRO
, JSON
and PROTOBUF
are supported.
The Amazon Resource Name (ARN) of the schema.
- * @public - */ - SchemaArn?: string; - - /** - *The version number of the schema.
- * @public - */ - VersionNumber?: number; - - /** - *The status of the schema version.
- * @public - */ - Status?: SchemaVersionStatus; - - /** - *The date and time the schema version was created.
- * @public - */ - CreatedTime?: string; -} - -/** - * @public - * @enum - */ -export const SchemaDiffType = { - SYNTAX_DIFF: "SYNTAX_DIFF", -} as const; - -/** - * @public - */ -export type SchemaDiffType = (typeof SchemaDiffType)[keyof typeof SchemaDiffType]; - -/** - * @public - */ -export interface GetSchemaVersionsDiffInput { - /** - *This is a wrapper structure to contain schema identity fields. The structure contains:
+ *An expression that filters the partitions to be returned.
+ *The expression uses SQL syntax similar to the SQL WHERE
filter clause. The
+ * SQL statement parser JSQLParser parses the expression.
+ * Operators: The following are the operators that you can use in the
+ * Expression
API call:
Checks whether the values of the two operands are equal; if yes, then the condition becomes + * true.
+ *Example: Assume 'variable a' holds 10 and 'variable b' holds 20.
+ *(a = b) is not true.
+ *Checks whether the values of two operands are equal; if the values are not equal, + * then the condition becomes true.
+ *Example: (a < > b) is true.
+ *Checks whether the value of the left operand is greater than the value of the right + * operand; if yes, then the condition becomes true.
+ *Example: (a > b) is not true.
+ *Checks whether the value of the left operand is less than the value of the right + * operand; if yes, then the condition becomes true.
+ *Example: (a < b) is true.
+ *Checks whether the value of the left operand is greater than or equal to the value + * of the right operand; if yes, then the condition becomes true.
+ *Example: (a >= b) is not true.
+ *Checks whether the value of the left operand is less than or equal to the value of + * the right operand; if yes, then the condition becomes true.
+ *Example: (a <= b) is true.
+ *Logical operators.
+ *+ * Supported Partition Key Types: The following are the supported + * partition keys.
*SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. One of SchemaArn
or SchemaName
has to be provided.
+ * string
+ *
SchemaId$SchemaName: The name of the schema. One of SchemaArn
or SchemaName
has to be provided.
+ * date
+ *
+ * timestamp
+ *
+ * int
+ *
+ * bigint
+ *
+ * long
+ *
+ * tinyint
+ *
+ * smallint
+ *
+ * decimal
+ *
If an type is encountered that is not valid, an exception is thrown.
+ *The following list shows the valid operators on each type. When you define a crawler, the
+ * partitionKey
type is created as a STRING
, to be compatible with the catalog
+ * partitions.
+ * Sample API Call:
* @public */ - SchemaId: SchemaId | undefined; - - /** - *The first of the two schema versions to be compared.
- * @public - */ - FirstSchemaVersionNumber: SchemaVersionNumber | undefined; - - /** - *The second of the two schema versions to be compared.
- * @public - */ - SecondSchemaVersionNumber: SchemaVersionNumber | undefined; - - /** - *Refers to SYNTAX_DIFF
, which is the currently supported diff type.
The difference between schemas as a string in JsonPatch format.
+ *A continuation token, if this is not the first call to retrieve + * these partitions.
* @public */ - Diff?: string; -} + NextToken?: string; -/** - * @public - */ -export interface GetSecurityConfigurationRequest { /** - *The name of the security configuration to retrieve.
+ *The segment of the table's partitions to scan in this request.
* @public */ - Name: string | undefined; -} + Segment?: Segment; -/** - *Specifies a security configuration.
- * @public - */ -export interface SecurityConfiguration { /** - *The name of the security configuration.
+ *The maximum number of partitions to return in a single response.
* @public */ - Name?: string; + MaxResults?: number; /** - *The time at which this security configuration was created.
+ *When true, specifies not returning the partition column schema. Useful when you are interested only in other partition attributes such as partition values or location. This approach avoids the problem of a large response by not returning duplicate data.
* @public */ - CreatedTimeStamp?: Date; + ExcludeColumnSchema?: boolean; /** - *The encryption configuration associated with this security configuration.
+ *The transaction ID at which to read the partition contents.
* @public */ - EncryptionConfiguration?: EncryptionConfiguration; -} + TransactionId?: string; -/** - * @public - */ -export interface GetSecurityConfigurationResponse { /** - *The requested security configuration.
+ *The time as of when to read the partition contents. If not set, the most recent transaction commit time will be used. Cannot be specified along with TransactionId
.
The maximum number of results to return.
+ *A list of requested partitions.
* @public */ - MaxResults?: number; + Partitions?: Partition[]; /** - *A continuation token, if this is a continuation call.
+ *A continuation token, if the returned list of partitions does not include the last + * one.
* @public */ NextToken?: string; @@ -949,2798 +743,2214 @@ export interface GetSecurityConfigurationsRequest { /** * @public */ -export interface GetSecurityConfigurationsResponse { - /** - *A list of security configurations.
- * @public - */ - SecurityConfigurations?: SecurityConfiguration[]; - +export interface GetPlanRequest { /** - *A continuation token, if there are more security - * configurations to return.
+ *The list of mappings from a source table to target tables.
* @public */ - NextToken?: string; -} + Mapping: MappingEntry[] | undefined; -/** - * @public - */ -export interface GetSessionRequest { /** - *The ID of the session.
+ *The source table.
* @public */ - Id: string | undefined; + Source: CatalogEntry | undefined; /** - *The origin of the request.
+ *The target tables.
* @public */ - RequestOrigin?: string; -} - -/** - * @public - */ -export interface GetSessionResponse { - /** - *The session object is returned in the response.
- * @public - */ - Session?: Session; -} - -/** - * @public - */ -export interface GetStatementRequest { - /** - *The Session ID of the statement.
- * @public - */ - SessionId: string | undefined; - - /** - *The Id of the statement.
- * @public - */ - Id: number | undefined; - - /** - *The origin of the request.
- * @public - */ - RequestOrigin?: string; -} - -/** - *The code execution output in JSON format.
- * @public - */ -export interface StatementOutputData { - /** - *The code execution output in text format.
- * @public - */ - TextPlain?: string; -} - -/** - * @public - * @enum - */ -export const StatementState = { - AVAILABLE: "AVAILABLE", - CANCELLED: "CANCELLED", - CANCELLING: "CANCELLING", - ERROR: "ERROR", - RUNNING: "RUNNING", - WAITING: "WAITING", -} as const; - -/** - * @public - */ -export type StatementState = (typeof StatementState)[keyof typeof StatementState]; - -/** - *The code execution output in JSON format.
- * @public - */ -export interface StatementOutput { - /** - *The code execution output.
- * @public - */ - Data?: StatementOutputData; - - /** - *The execution count of the output.
- * @public - */ - ExecutionCount?: number; - - /** - *The status of the code execution output.
- * @public - */ - Status?: StatementState; - - /** - *The name of the error in the output.
- * @public - */ - ErrorName?: string; - - /** - *The error value of the output.
- * @public - */ - ErrorValue?: string; - - /** - *The traceback of the output.
- * @public - */ - Traceback?: string[]; -} - -/** - *The statement or request for a particular action to occur in a session.
- * @public - */ -export interface Statement { - /** - *The ID of the statement.
- * @public - */ - Id?: number; - - /** - *The execution code of the statement.
- * @public - */ - Code?: string; - - /** - *The state while request is actioned.
- * @public - */ - State?: StatementState; - - /** - *The output in JSON.
- * @public - */ - Output?: StatementOutput; - - /** - *The code execution progress.
- * @public - */ - Progress?: number; - - /** - *The unix time and date that the job definition was started.
- * @public - */ - StartedOn?: number; - - /** - *The unix time and date that the job definition was completed.
- * @public - */ - CompletedOn?: number; -} - -/** - * @public - */ -export interface GetStatementResponse { - /** - *Returns the statement.
- * @public - */ - Statement?: Statement; -} - -/** - * @public - */ -export interface GetTableRequest { - /** - *The ID of the Data Catalog where the table resides. If none is provided, the Amazon Web Services account - * ID is used by default.
- * @public - */ - CatalogId?: string; - - /** - *The name of the database in the catalog in which the table resides. - * For Hive compatibility, this name is entirely lowercase.
- * @public - */ - DatabaseName: string | undefined; - - /** - *The name of the table for which to retrieve the definition. For Hive - * compatibility, this name is entirely lowercase.
- * @public - */ - Name: string | undefined; - - /** - *The transaction ID at which to read the table contents.
- * @public - */ - TransactionId?: string; - - /** - *The time as of when to read the table contents. If not set, the most recent transaction commit time will be used. Cannot be specified along with TransactionId
.
A table that points to an entity outside the Glue Data Catalog.
- * @public - */ -export interface FederatedTable { - /** - *A unique identifier for the federated table.
- * @public - */ - Identifier?: string; - - /** - *A unique identifier for the federated database.
- * @public - */ - DatabaseIdentifier?: string; - - /** - *The name of the connection to the external metastore.
- * @public - */ - ConnectionName?: string; -} - -/** - *A structure that contains the dialect of the view, and the query that defines the view.
- * @public - */ -export interface ViewRepresentation { - /** - *The dialect of the query engine.
- * @public - */ - Dialect?: ViewDialect; - - /** - *The version of the dialect of the query engine. For example, 3.0.0.
- * @public - */ - DialectVersion?: string; - - /** - *The SELECT
query provided by the customer during CREATE VIEW DDL
. This SQL is not used during a query on a view (ViewExpandedText
is used instead). ViewOriginalText
is used for cases like SHOW CREATE VIEW
where users want to see the original DDL command that created the view.
The expanded SQL for the view. This SQL is used by engines while processing a query on a view. Engines may perform operations during view creation to transform ViewOriginalText
to ViewExpandedText
. For example:
Fully qualified identifiers: SELECT * from table1 -> SELECT * from db1.table1
- *
The name of the connection to be used to validate the specific representation of the view.
- * @public - */ - ValidationConnection?: string; - - /** - *Dialects marked as stale are no longer valid and must be updated before they can be queried in their respective query engines.
- * @public - */ - IsStale?: boolean; -} - -/** - *A structure containing details for representations.
- * @public - */ -export interface ViewDefinition { - /** - *You can set this flag as true to instruct the engine not to push user-provided operations into the logical plan of the view during query planning. However, setting this flag does not guarantee that the engine will comply. Refer to the engine's documentation to understand the guarantees provided, if any.
- * @public - */ - IsProtected?: boolean; - - /** - *The definer of a view in SQL.
- * @public - */ - Definer?: string; - - /** - *A list of table Amazon Resource Names (ARNs).
- * @public - */ - SubObjects?: string[]; - - /** - *A list of representations.
- * @public - */ - Representations?: ViewRepresentation[]; -} - -/** - *Represents a collection of related data organized in columns and rows.
- * @public - */ -export interface Table { - /** - *The table name. For Hive compatibility, this must be entirely - * lowercase.
- * @public - */ - Name: string | undefined; - - /** - *The name of the database where the table metadata resides. - * For Hive compatibility, this must be all lowercase.
- * @public - */ - DatabaseName?: string; - - /** - *A description of the table.
- * @public - */ - Description?: string; - - /** - *The owner of the table.
- * @public - */ - Owner?: string; - - /** - *The time when the table definition was created in the Data Catalog.
- * @public - */ - CreateTime?: Date; - - /** - *The last time that the table was updated.
- * @public - */ - UpdateTime?: Date; - - /** - *The last time that the table was accessed. This is usually taken from HDFS, and might not - * be reliable.
- * @public - */ - LastAccessTime?: Date; - - /** - *The last time that column statistics were computed for this table.
- * @public - */ - LastAnalyzedTime?: Date; - - /** - *The retention time for this table.
- * @public - */ - Retention?: number; - - /** - *A storage descriptor containing information about the physical storage - * of this table.
- * @public - */ - StorageDescriptor?: StorageDescriptor; - - /** - *A list of columns by which the table is partitioned. Only primitive - * types are supported as partition keys.
- *When you create a table used by Amazon Athena, and you do not specify any partitionKeys, you must at least set the value of partitionKeys to an empty list. For example: "PartitionKeys": []
- *
Included for Apache Hive compatibility. Not used in the normal course of Glue operations.
- * If the table is a VIRTUAL_VIEW
, certain Athena configuration encoded in base64.
Included for Apache Hive compatibility. Not used in the normal course of Glue operations.
- * @public - */ - ViewExpandedText?: string; - - /** - *The type of this table.
- * Glue will create tables with the EXTERNAL_TABLE
type.
- * Other services, such as Athena, may create tables with additional table types.
- *
Glue related table types:
- *Hive compatible attribute - indicates a non-Hive managed table.
- *Used by Lake Formation.
- * The Glue Data Catalog understands GOVERNED
.
These key-value pairs define properties associated with the table.
- * @public - */ - Parameters?: RecordThe person or entity who created the table.
- * @public - */ - CreatedBy?: string; - - /** - *Indicates whether the table has been registered with Lake Formation.
- * @public - */ - IsRegisteredWithLakeFormation?: boolean; - - /** - *A TableIdentifier
structure that describes a target table for resource linking.
The ID of the Data Catalog in which the table resides.
- * @public - */ - CatalogId?: string; - - /** - *The ID of the table version.
- * @public - */ - VersionId?: string; - - /** - *A FederatedTable
structure that references an entity outside the Glue Data Catalog.
A structure that contains all the information that defines the view, including the dialect or dialects for the view, and the query.
- * @public - */ - ViewDefinition?: ViewDefinition; - - /** - *Specifies whether the view supports the SQL dialects of one or more different query engines and can therefore be read by those engines.
- * @public - */ - IsMultiDialectView?: boolean; -} - -/** - * @public - */ -export interface GetTableResponse { - /** - *The Table
object that defines the specified table.
The Catalog ID of the table.
- * @public - */ - CatalogId: string | undefined; - - /** - *The name of the database in the catalog in which the table resides.
- * @public - */ - DatabaseName: string | undefined; - - /** - *The name of the table.
- * @public - */ - TableName: string | undefined; + Sinks?: CatalogEntry[]; /** - *The type of table optimizer.
+ *The parameters for the mapping.
* @public */ - Type: TableOptimizerType | undefined; -} + Location?: Location; -/** - * @public - */ -export interface GetTableOptimizerResponse { /** - *The Catalog ID of the table.
+ *The programming language of the code to perform the mapping.
* @public */ - CatalogId?: string; + Language?: Language; /** - *The name of the database in the catalog in which the table resides.
+ *A map to hold additional optional key-value parameters.
+ *Currently, these key-value pairs are supported:
+ *
+ * inferSchema
 —  Specifies whether to set inferSchema
to true or false for the default script generated by a Glue job. For example, to set inferSchema
to true, pass the following key value pair:
+ * --additional-plan-options-map '\{"inferSchema":"true"\}'
+ *
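As a rough sketch (database, table, and mapping names below are hypothetical), the inferSchema option described above maps to the AdditionalPlanOptionsMap parameter when the plan is requested through the SDK rather than the CLI:

```ts
import { GlueClient, GetPlanCommand } from "@aws-sdk/client-glue";

const client = new GlueClient({});

async function generateScript() {
  // Hypothetical source/sink tables; Mapping describes how source columns map to target columns.
  const { PythonScript } = await client.send(
    new GetPlanCommand({
      Mapping: [
        { SourceTable: "src_orders", SourcePath: "order_id", SourceType: "string",
          TargetTable: "dst_orders", TargetPath: "order_id", TargetType: "string" },
      ],
      Source: { DatabaseName: "sales_db", TableName: "src_orders" },
      Sinks: [{ DatabaseName: "sales_db", TableName: "dst_orders" }],
      Language: "PYTHON",
      // SDK equivalent of --additional-plan-options-map '{"inferSchema":"true"}' on the CLI.
      AdditionalPlanOptionsMap: { inferSchema: "true" },
    })
  );
  console.log(PythonScript);
}

generateScript().catch(console.error);
```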
The name of the table.
+ *A Python script to perform the mapping.
* @public */ - TableName?: string; + PythonScript?: string; /** - *The optimizer associated with the specified table.
+ *The Scala code to perform the mapping.
* @public */ - TableOptimizer?: TableOptimizer; + ScalaCode?: string; } /** * @public */ -export interface GetTablesRequest { +export interface GetRegistryInput { /** - *The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account - * ID is used by default.
+ *This is a wrapper structure that may contain the registry name and Amazon Resource Name (ARN).
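A minimal sketch of fetching registry details through the RegistryId wrapper; the registry name is hypothetical:

```ts
import { GlueClient, GetRegistryCommand } from "@aws-sdk/client-glue";

const client = new GlueClient({});

async function describeRegistry() {
  // RegistryId accepts either RegistryName or RegistryArn.
  const registry = await client.send(
    new GetRegistryCommand({ RegistryId: { RegistryName: "my-registry" } })
  );
  console.log(registry.RegistryArn, registry.Status, registry.CreatedTime);
}

describeRegistry().catch(console.error);
```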
* @public */ - CatalogId?: string; + RegistryId: RegistryId | undefined; +} +/** + * @public + */ +export interface GetRegistryResponse { /** - *The database in the catalog whose tables to list. For Hive - * compatibility, this name is entirely lowercase.
+ *The name of the registry.
* @public */ - DatabaseName: string | undefined; + RegistryName?: string; /** - *A regular expression pattern. If present, only those tables - * whose names match the pattern are returned.
+ *The Amazon Resource Name (ARN) of the registry.
* @public */ - Expression?: string; + RegistryArn?: string; /** - *A continuation token, included if this is a continuation call.
+ *A description of the registry.
* @public */ - NextToken?: string; + Description?: string; /** - *The maximum number of tables to return in a single response.
+ *The status of the registry.
* @public */ - MaxResults?: number; + Status?: RegistryStatus; /** - *The transaction ID at which to read the table contents.
+ *The date and time the registry was created.
* @public */ - TransactionId?: string; + CreatedTime?: string; /** - *The time as of when to read the table contents. If not set, the most recent transaction commit time will be used. Cannot be specified along with TransactionId
.
The date and time the registry was updated.
* @public */ - QueryAsOfTime?: Date; + UpdatedTime?: string; } /** * @public */ -export interface GetTablesResponse { +export interface GetResourcePoliciesRequest { /** - *A list of the requested Table
objects.
A continuation token, if this is a continuation request.
* @public */ - TableList?: Table[]; + NextToken?: string; /** - *A continuation token, present if the current list segment is - * not the last.
+ *The maximum size of a list to return.
* @public */ - NextToken?: string; + MaxResults?: number; } /** + *A structure for returning a resource policy.
* @public */ -export interface GetTableVersionRequest { +export interface GluePolicy { /** - *The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account - * ID is used by default.
+ *Contains the requested policy document, in JSON format.
* @public */ - CatalogId?: string; + PolicyInJson?: string; /** - *The database in the catalog in which the table resides. For Hive - * compatibility, this name is entirely lowercase.
+ *Contains the hash value associated with this policy.
* @public */ - DatabaseName: string | undefined; + PolicyHash?: string; /** - *The name of the table. For Hive compatibility, - * this name is entirely lowercase.
+ *The date and time at which the policy was created.
* @public */ - TableName: string | undefined; + CreateTime?: Date; /** - *The ID value of the table version to be retrieved. A VersionID
is a string representation of an integer. Each version is incremented by 1.
The date and time at which the policy was last updated.
* @public */ - VersionId?: string; + UpdateTime?: Date; } /** - *Specifies a version of a table.
* @public */ -export interface TableVersion { +export interface GetResourcePoliciesResponse { /** - *The table in question.
+ *A list of the individual resource policies and the account-level resource policy.
* @public */ - Table?: Table; + GetResourcePoliciesResponseList?: GluePolicy[]; /** - *The ID value that identifies this table version. A VersionId
is a string representation of an integer. Each version is incremented by 1.
A continuation token, if the returned list does not contain the last resource policy available.
* @public */ - VersionId?: string; + NextToken?: string; } /** * @public */ -export interface GetTableVersionResponse { +export interface GetResourcePolicyRequest { /** - *The requested table version.
+ *The ARN of the Glue resource for which to retrieve the resource policy. If not
+ * supplied, the Data Catalog resource policy is returned. Use GetResourcePolicies
+ * to view all existing resource policies. For more information see Specifying Glue Resource ARNs.
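A minimal sketch, assuming the caller wants the account-level Data Catalog policy (no ResourceArn supplied):

```ts
import { GlueClient, GetResourcePolicyCommand } from "@aws-sdk/client-glue";

const client = new GlueClient({});

async function showCatalogPolicy() {
  // With no ResourceArn, the Data Catalog resource policy for the account is returned.
  const { PolicyInJson, PolicyHash, UpdateTime } = await client.send(
    new GetResourcePolicyCommand({})
  );
  console.log(PolicyHash, UpdateTime);
  if (PolicyInJson) console.log(JSON.parse(PolicyInJson));
}

showCatalogPolicy().catch(console.error);
```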
+ *
The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account - * ID is used by default.
- * @public - */ - CatalogId?: string; - - /** - *The database in the catalog in which the table resides. For Hive - * compatibility, this name is entirely lowercase.
- * @public - */ - DatabaseName: string | undefined; - - /** - *The name of the table. For Hive - * compatibility, this name is entirely lowercase.
- * @public - */ - TableName: string | undefined; - +export interface GetResourcePolicyResponse { /** - *A continuation token, if this is not the first call.
+ *Contains the requested policy document, in JSON format.
* @public */ - NextToken?: string; + PolicyInJson?: string; /** - *The maximum number of table versions to return in one response.
+ *Contains the hash value associated with this policy.
* @public */ - MaxResults?: number; -} + PolicyHash?: string; -/** - * @public - */ -export interface GetTableVersionsResponse { /** - *A list of strings identifying available versions of the - * specified table.
+ *The date and time at which the policy was created.
* @public */ - TableVersions?: TableVersion[]; + CreateTime?: Date; /** - *A continuation token, if the list of available versions does - * not include the last one.
+ *The date and time at which the policy was last updated.
* @public */ - NextToken?: string; + UpdateTime?: Date; } /** * @public */ -export interface GetTagsRequest { +export interface GetSchemaInput { /** - *The Amazon Resource Name (ARN) of the resource for which to retrieve tags.
+ *This is a wrapper structure to contain schema identity fields. The structure contains:
+ *SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. Either SchemaArn
or SchemaName
and RegistryName
has to be provided.
SchemaId$SchemaName: The name of the schema. Either SchemaArn
or SchemaName
and RegistryName
has to be provided.
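A minimal sketch of identifying a schema via the SchemaId wrapper (registry and schema names are hypothetical):

```ts
import { GlueClient, GetSchemaCommand } from "@aws-sdk/client-glue";

const client = new GlueClient({});

async function describeSchema() {
  // Either SchemaArn, or SchemaName together with RegistryName, identifies the schema.
  const schema = await client.send(
    new GetSchemaCommand({
      SchemaId: { RegistryName: "my-registry", SchemaName: "orders-schema" },
    })
  );
  console.log(schema.DataFormat, schema.Compatibility, schema.LatestSchemaVersion);
}

describeSchema().catch(console.error);
```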
The requested tags.
+ *The name of the registry.
* @public */ - Tags?: RecordThe name of the trigger to retrieve.
+ *The Amazon Resource Name (ARN) of the registry.
* @public */ - Name: string | undefined; -} + RegistryArn?: string; -/** - * @public - */ -export interface GetTriggerResponse { /** - *The requested trigger definition.
+ *The name of the schema.
* @public */ - Trigger?: Trigger; -} + SchemaName?: string; -/** - * @public - */ -export interface GetTriggersRequest { /** - *A continuation token, if this is a continuation call.
+ *The Amazon Resource Name (ARN) of the schema.
* @public */ - NextToken?: string; + SchemaArn?: string; /** - *The name of the job to retrieve triggers for. The trigger that can start this job is - * returned, and if there is no such trigger, all triggers are returned.
+ *A description of the schema, if one was specified when the schema was created.
* @public */ - DependentJobName?: string; + Description?: string; /** - *The maximum size of the response.
+ *The data format of the schema definition. Currently AVRO
, JSON
and PROTOBUF
are supported.
A list of triggers for the specified job.
+ *The compatibility mode of the schema.
* @public */ - Triggers?: Trigger[]; + Compatibility?: Compatibility; /** - *A continuation token, if not all the requested triggers - * have yet been returned.
+ *The version number of the checkpoint (the last time the compatibility mode was changed).
* @public */ - NextToken?: string; -} + SchemaCheckpoint?: number; -/** - *A structure used as a protocol between query engines and Lake Formation or Glue. Contains both a Lake Formation generated authorization identifier and information from the request's authorization context.
- * @public - */ -export interface QuerySessionContext { /** - *A unique identifier generated by the query engine for the query.
+ *The latest version of the schema associated with the returned schema definition.
* @public */ - QueryId?: string; + LatestSchemaVersion?: number; /** - *A timestamp provided by the query engine for when the query started.
+ *The next version of the schema associated with the returned schema definition.
* @public */ - QueryStartTime?: Date; + NextSchemaVersion?: number; /** - *An identifier string for the consumer cluster.
+ *The status of the schema.
* @public */ - ClusterId?: string; + SchemaStatus?: SchemaStatus; /** - *A cryptographically generated query identifier generated by Glue or Lake Formation.
+ *The date and time the schema was created.
* @public */ - QueryAuthorizationId?: string; + CreatedTime?: string; /** - *An opaque string-string map passed by the query engine.
+ *The date and time the schema was updated.
* @public */ - AdditionalContext?: RecordThis is a wrapper structure to contain schema identity fields. The structure contains:
+ *SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. One of SchemaArn
or SchemaName
has to be provided.
SchemaId$SchemaName: The name of the schema. One of SchemaArn
or SchemaName
has to be provided.
Specified only if the base tables belong to a different Amazon Web Services Region.
+ *The definition of the schema for which schema details are required.
* @public */ - Region?: string; + SchemaDefinition: string | undefined; +} +/** + * @public + */ +export interface GetSchemaByDefinitionResponse { /** - *The catalog ID where the partition resides.
+ *The schema ID of the schema version.
* @public */ - CatalogId: string | undefined; + SchemaVersionId?: string; /** - *(Required) Specifies the name of a database that contains the partition.
+ *The Amazon Resource Name (ARN) of the schema.
* @public */ - DatabaseName: string | undefined; + SchemaArn?: string; /** - *(Required) Specifies the name of a table that contains the partition.
+ *The data format of the schema definition. Currently AVRO
, JSON
and PROTOBUF
are supported.
(Required) A list of partition key values.
+ *The status of the schema version.
* @public */ - PartitionValues: string[] | undefined; + Status?: SchemaVersionStatus; /** - *A structure containing Lake Formation audit context information.
+ *The date and time the schema was created.
* @public */ - AuditContext?: AuditContext; + CreatedTime?: string; +} +/** + *A structure containing the schema version information.
+ * @public + */ +export interface SchemaVersionNumber { /** - *(Required) A list of supported permission types.
+ *The latest version available for the schema.
* @public */ - SupportedPermissionTypes: PermissionType[] | undefined; + LatestVersion?: boolean; /** - *A structure used as a protocol between query engines and Lake Formation or Glue. Contains both a Lake Formation generated authorization identifier and information from the request's authorization context.
+ *The version number of the schema.
* @public */ - QuerySessionContext?: QuerySessionContext; + VersionNumber?: number; } /** * @public */ -export interface GetUnfilteredPartitionMetadataResponse { +export interface GetSchemaVersionInput { /** - *A Partition object containing the partition metadata.
+ *This is a wrapper structure to contain schema identity fields. The structure contains:
+ *SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. Either SchemaArn
or SchemaName
and RegistryName
has to be provided.
SchemaId$SchemaName: The name of the schema. Either SchemaArn
or SchemaName
and RegistryName
has to be provided.
A list of column names that the user has been granted access to.
+ *The SchemaVersionId
of the schema version. This field is required for fetching by schema ID. Either this or the SchemaId
wrapper has to be provided.
A Boolean value that indicates whether the partition location is registered - * with Lake Formation.
+ *The version number of the schema.
* @public */ - IsRegisteredWithLakeFormation?: boolean; + SchemaVersionNumber?: SchemaVersionNumber; } /** - *The operation timed out.
* @public */ -export class PermissionTypeMismatchException extends __BaseException { - readonly name: "PermissionTypeMismatchException" = "PermissionTypeMismatchException"; - readonly $fault: "client" = "client"; +export interface GetSchemaVersionResponse { /** - *There is a mismatch between the SupportedPermissionType used in the query request - * and the permissions defined on the target table.
+ *The SchemaVersionId
of the schema version.
The schema definition for the schema ID.
+ * @public */ - constructor(opts: __ExceptionOptionTypeSpecified only if the base tables belong to a different Amazon Web Services Region.
+ *The data format of the schema definition. Currently AVRO
, JSON
and PROTOBUF
are supported.
The ID of the Data Catalog where the partitions in question reside. If none is provided, - * the AWS account ID is used by default.
+ *The Amazon Resource Name (ARN) of the schema.
* @public */ - CatalogId: string | undefined; + SchemaArn?: string; /** - *The name of the catalog database where the partitions reside.
+ *The version number of the schema.
* @public */ - DatabaseName: string | undefined; + VersionNumber?: number; /** - *The name of the table that contains the partition.
+ *The status of the schema version.
* @public */ - TableName: string | undefined; + Status?: SchemaVersionStatus; /** - *An expression that filters the partitions to be returned.
- *The expression uses SQL syntax similar to the SQL WHERE
filter clause. The
- * SQL statement parser JSQLParser parses the expression.
- * Operators: The following are the operators that you can use in the
- * Expression
API call:
Checks whether the values of the two operands are equal; if yes, then the condition becomes - * true.
- *Example: Assume 'variable a' holds 10 and 'variable b' holds 20.
- *(a = b) is not true.
- *Checks whether the values of two operands are equal; if the values are not equal, - * then the condition becomes true.
- *Example: (a < > b) is true.
- *Checks whether the value of the left operand is greater than the value of the right - * operand; if yes, then the condition becomes true.
- *Example: (a > b) is not true.
- *Checks whether the value of the left operand is less than the value of the right - * operand; if yes, then the condition becomes true.
- *Example: (a < b) is true.
- *Checks whether the value of the left operand is greater than or equal to the value - * of the right operand; if yes, then the condition becomes true.
- *Example: (a >= b) is not true.
- *Checks whether the value of the left operand is less than or equal to the value of - * the right operand; if yes, then the condition becomes true.
- *Example: (a <= b) is true.
- *Logical operators.
- *- * Supported Partition Key Types: The following are the supported - * partition keys.
- *
- * string
- *
- * date
- *
- * timestamp
- *
- * int
- *
- * bigint
- *
- * long
- *
- * tinyint
- *
The date and time the schema version was created.
+ * @public + */ + CreatedTime?: string; +} + +/** + * @public + * @enum + */ +export const SchemaDiffType = { + SYNTAX_DIFF: "SYNTAX_DIFF", +} as const; + +/** + * @public + */ +export type SchemaDiffType = (typeof SchemaDiffType)[keyof typeof SchemaDiffType]; + +/** + * @public + */ +export interface GetSchemaVersionsDiffInput { + /** + *This is a wrapper structure to contain schema identity fields. The structure contains:
+ *
- * smallint
- *
SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. One of SchemaArn
or SchemaName
has to be provided.
- * decimal
- *
SchemaId$SchemaName: The name of the schema. One of SchemaArn
or SchemaName
has to be provided.
If an type is encountered that is not valid, an exception is thrown.
- * @public - */ - Expression?: string; - - /** - *A structure containing Lake Formation audit context information.
* @public */ - AuditContext?: AuditContext; + SchemaId: SchemaId | undefined; /** - *A list of supported permission types.
+ *The first of the two schema versions to be compared.
* @public */ - SupportedPermissionTypes: PermissionType[] | undefined; + FirstSchemaVersionNumber: SchemaVersionNumber | undefined; /** - *A continuation token, if this is not the first call to retrieve - * these partitions.
+ *The second of the two schema versions to be compared.
* @public */ - NextToken?: string; + SecondSchemaVersionNumber: SchemaVersionNumber | undefined; /** - *The segment of the table's partitions to scan in this request.
+ *Refers to SYNTAX_DIFF
, which is the currently supported diff type.
The maximum number of partitions to return in a single response.
+ *The difference between schemas as a string in JsonPatch format.
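A minimal sketch of comparing two versions of a hypothetical schema and reading the JsonPatch diff:

```ts
import { GlueClient, GetSchemaVersionsDiffCommand } from "@aws-sdk/client-glue";

const client = new GlueClient({});

async function diffSchemaVersions() {
  // Compares two numbered versions; SYNTAX_DIFF is the only supported diff type today.
  const { Diff } = await client.send(
    new GetSchemaVersionsDiffCommand({
      SchemaId: { RegistryName: "my-registry", SchemaName: "orders-schema" },
      FirstSchemaVersionNumber: { VersionNumber: 1 },
      SecondSchemaVersionNumber: { VersionNumber: 2 },
      SchemaDiffType: "SYNTAX_DIFF",
    })
  );
  // Diff is a JsonPatch document serialized as a string.
  if (Diff) console.log(JSON.parse(Diff));
}

diffSchemaVersions().catch(console.error);
```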
* @public */ - MaxResults?: number; + Diff?: string; +} +/** + * @public + */ +export interface GetSecurityConfigurationRequest { /** - *A structure used as a protocol between query engines and Lake Formation or Glue. Contains both a Lake Formation generated authorization identifier and information from the request's authorization context.
+ *The name of the security configuration to retrieve.
* @public */ - QuerySessionContext?: QuerySessionContext; + Name: string | undefined; } /** - *A partition that contains unfiltered metadata.
+ *Specifies a security configuration.
* @public */ -export interface UnfilteredPartition { +export interface SecurityConfiguration { /** - *The partition object.
+ *The name of the security configuration.
* @public */ - Partition?: Partition; + Name?: string; /** - *The list of columns the user has permissions to access.
+ *The time at which this security configuration was created.
* @public */ - AuthorizedColumns?: string[]; + CreatedTimeStamp?: Date; /** - *A Boolean value indicating that the partition location is registered with Lake Formation.
+ *The encryption configuration associated with this security configuration.
* @public */ - IsRegisteredWithLakeFormation?: boolean; + EncryptionConfiguration?: EncryptionConfiguration; } /** * @public */ -export interface GetUnfilteredPartitionsMetadataResponse { - /** - *A list of requested partitions.
- * @public - */ - UnfilteredPartitions?: UnfilteredPartition[]; - +export interface GetSecurityConfigurationResponse { /** - *A continuation token, if the returned list of partitions does not include the last - * one.
+ *The requested security configuration.
* @public */ - NextToken?: string; + SecurityConfiguration?: SecurityConfiguration; } /** - *A structure specifying the dialect and dialect version used by the query engine.
* @public */ -export interface SupportedDialect { +export interface GetSecurityConfigurationsRequest { /** - *The dialect of the query engine.
+ *The maximum number of results to return.
* @public */ - Dialect?: ViewDialect; + MaxResults?: number; /** - *The version of the dialect of the query engine. For example, 3.0.0.
+ *A continuation token, if this is a continuation call.
* @public */ - DialectVersion?: string; + NextToken?: string; } /** * @public */ -export interface GetUnfilteredTableMetadataRequest { +export interface GetSecurityConfigurationsResponse { /** - *Specified only if the base tables belong to a different Amazon Web Services Region.
+ *A list of security configurations.
* @public */ - Region?: string; + SecurityConfigurations?: SecurityConfiguration[]; /** - *The catalog ID where the table resides.
+ *A continuation token, if there are more security + * configurations to return.
* @public */ - CatalogId: string | undefined; + NextToken?: string; +} +/** + * @public + */ +export interface GetSessionRequest { /** - *(Required) Specifies the name of a database that contains the table.
+ *The ID of the session.
* @public */ - DatabaseName: string | undefined; + Id: string | undefined; /** - *(Required) Specifies the name of a table for which you are requesting metadata.
+ *The origin of the request.
* @public */ - Name: string | undefined; + RequestOrigin?: string; +} +/** + * @public + */ +export interface GetSessionResponse { /** - *A structure containing Lake Formation audit context information.
+ *The session object is returned in the response.
* @public */ - AuditContext?: AuditContext; + Session?: Session; +} +/** + * @public + */ +export interface GetStatementRequest { /** - *Indicates the level of filtering a third-party analytical engine is capable of enforcing when calling the GetUnfilteredTableMetadata
API operation. Accepted values are:
- * COLUMN_PERMISSION
- Column permissions ensure that users can access only specific columns in the table. If particular columns contain sensitive data, data lake administrators can define column filters that exclude access to those columns.
- * CELL_FILTER_PERMISSION
- Cell-level filtering combines column filtering (include or exclude columns) and row filter expressions to restrict access to individual elements in the table.
- * NESTED_PERMISSION
- Nested permissions combines cell-level filtering and nested column filtering to restrict access to columns and/or nested columns in specific rows based on row filter expressions.
- * NESTED_CELL_PERMISSION
- Nested cell permissions combines nested permission with nested cell-level filtering. This allows different subsets of nested columns to be restricted based on an array of row filter expressions.
Note: Each of these permission types follows a hierarchical order where each subsequent permission type includes all permissions of the previous type.
- *Important: If you provide a supported permission type that doesn't match the user's level of permissions on the table, then Lake Formation raises an exception. For example, if the third-party engine calling the GetUnfilteredTableMetadata
operation can enforce only column-level filtering, and the user has nested cell filtering applied on the table, Lake Formation throws an exception, and will not return unfiltered table metadata and data access credentials.
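A minimal sketch of a call that declares column-level filtering as the strongest permission type the engine can enforce (account, database, and table names are hypothetical):

```ts
import { GlueClient, GetUnfilteredTableMetadataCommand } from "@aws-sdk/client-glue";

const client = new GlueClient({});

async function fetchUnfilteredTable() {
  // The engine declares the strongest filtering it can enforce; a mismatch with the
  // user's Lake Formation permissions makes the call fail, as described above.
  const { Table, AuthorizedColumns, CellFilters, IsRegisteredWithLakeFormation } = await client.send(
    new GetUnfilteredTableMetadataCommand({
      CatalogId: "111122223333",   // hypothetical account ID
      DatabaseName: "sales_db",    // hypothetical database
      Name: "orders",              // hypothetical table
      SupportedPermissionTypes: ["COLUMN_PERMISSION"],
    })
  );
  console.log(Table?.Name, AuthorizedColumns, CellFilters, IsRegisteredWithLakeFormation);
}

fetchUnfilteredTable().catch(console.error);
```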
The Session ID of the statement.
* @public */ - SupportedPermissionTypes: PermissionType[] | undefined; + SessionId: string | undefined; /** - *The resource ARN of the view.
+ *The Id of the statement.
* @public */ - ParentResourceArn?: string; + Id: number | undefined; /** - *The resource ARN of the root view in a chain of nested views.
+ *The origin of the request.
* @public */ - RootResourceArn?: string; + RequestOrigin?: string; +} +/** + *The code execution output in JSON format.
+ * @public + */ +export interface StatementOutputData { /** - *A structure specifying the dialect and dialect version used by the query engine.
+ *The code execution output in text format.
* @public */ - SupportedDialect?: SupportedDialect; + TextPlain?: string; +} - /** - *The Lake Formation data permissions of the caller on the table. Used to authorize the call when no view context is found.
- * @public - */ - Permissions?: Permission[]; +/** + * @public + * @enum + */ +export const StatementState = { + AVAILABLE: "AVAILABLE", + CANCELLED: "CANCELLED", + CANCELLING: "CANCELLING", + ERROR: "ERROR", + RUNNING: "RUNNING", + WAITING: "WAITING", +} as const; - /** - *A structure used as a protocol between query engines and Lake Formation or Glue. Contains both a Lake Formation generated authorization identifier and information from the request's authorization context.
- * @public - */ - QuerySessionContext?: QuerySessionContext; -} +/** + * @public + */ +export type StatementState = (typeof StatementState)[keyof typeof StatementState]; /** - *A filter that uses both column-level and row-level filtering.
+ *The code execution output in JSON format.
* @public */ -export interface ColumnRowFilter { +export interface StatementOutput { /** - *A string containing the name of the column.
+ *The code execution output.
* @public */ - ColumnName?: string; + Data?: StatementOutputData; /** - *A string containing the row-level filter expression.
+ *The execution count of the output.
* @public */ - RowFilterExpression?: string; -} + ExecutionCount?: number; -/** - * @public - */ -export interface GetUnfilteredTableMetadataResponse { /** - *A Table object containing the table metadata.
+ *The status of the code execution output.
* @public */ - Table?: Table; + Status?: StatementState; /** - *A list of column names that the user has been granted access to.
+ *The name of the error in the output.
* @public */ - AuthorizedColumns?: string[]; + ErrorName?: string; /** - *A Boolean value that indicates whether the partition location is registered - * with Lake Formation.
+ *The error value of the output.
* @public */ - IsRegisteredWithLakeFormation?: boolean; + ErrorValue?: string; /** - *A list of column row filters.
+ *The traceback of the output.
+ * @public + */ + Traceback?: string[]; +} + +/** + *The statement or request for a particular action to occur in a session.
+ * @public + */ +export interface Statement { + /** + *The ID of the statement.
* @public */ - CellFilters?: ColumnRowFilter[]; + Id?: number; /** - *A cryptographically generated query identifier generated by Glue or Lake Formation.
+ *The execution code of the statement.
* @public */ - QueryAuthorizationId?: string; + Code?: string; /** - *Specifies whether the view supports the SQL dialects of one or more different query engines and can therefore be read by those engines.
+ *The state of the request while it is actioned.
* @public */ - IsMultiDialectView?: boolean; + State?: StatementState; /** - *The resource ARN of the parent resource extracted from the request.
+ *The output in JSON.
* @public */ - ResourceArn?: string; + Output?: StatementOutput; /** - *A flag that instructs the engine not to push user-provided operations into the logical plan of the view during query planning. However, if set this flag does not guarantee that the engine will comply. Refer to the engine's documentation to understand the guarantees provided, if any.
+ *The code execution progress.
* @public */ - IsProtected?: boolean; + Progress?: number; /** - *The Lake Formation data permissions of the caller on the table. Used to authorize the call when no view context is found.
+ *The unix time and date that the job definition was started.
* @public */ - Permissions?: Permission[]; + StartedOn?: number; /** - *The filter that applies to the table. For example when applying the filter in SQL, it would go in the WHERE
clause and can be evaluated by using an AND
operator with any other predicates applied by the user querying the table.
The unix time and date that the job definition was completed.
* @public */ - RowFilter?: string; + CompletedOn?: number; } /** * @public */ -export interface GetUsageProfileRequest { +export interface GetStatementResponse { /** - *The name of the usage profile to retrieve.
+ *Returns the statement.
* @public */ - Name: string | undefined; + Statement?: Statement; } /** * @public */ -export interface GetUsageProfileResponse { +export interface GetTableRequest { /** - *The name of the usage profile.
+ *The ID of the Data Catalog where the table resides. If none is provided, the Amazon Web Services account + * ID is used by default.
* @public */ - Name?: string; + CatalogId?: string; /** - *A description of the usage profile.
+ *The name of the database in the catalog in which the table resides. + * For Hive compatibility, this name is entirely lowercase.
* @public */ - Description?: string; + DatabaseName: string | undefined; /** - *A ProfileConfiguration
object specifying the job and session values for the profile.
The name of the table for which to retrieve the definition. For Hive + * compatibility, this name is entirely lowercase.
* @public */ - Configuration?: ProfileConfiguration; + Name: string | undefined; /** - *The date and time when the usage profile was created.
+ *The transaction ID at which to read the table contents.
* @public */ - CreatedOn?: Date; + TransactionId?: string; /** - *The date and time when the usage profile was last modified.
+ *The time as of when to read the table contents. If not set, the most recent transaction commit time will be used. Cannot be specified along with TransactionId
.
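A minimal sketch of retrieving a table definition with these request fields (database and table names are hypothetical):

```ts
import { GlueClient, GetTableCommand } from "@aws-sdk/client-glue";

const client = new GlueClient({});

async function describeTable() {
  // CatalogId defaults to the caller's account; database and table names are hypothetical.
  const { Table } = await client.send(
    new GetTableCommand({ DatabaseName: "sales_db", Name: "orders" })
  );
  console.log(Table?.TableType, Table?.StorageDescriptor?.Location, Table?.PartitionKeys);
}

describeTable().catch(console.error);
```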
A table that points to an entity outside the Glue Data Catalog.
* @public */ -export interface GetUserDefinedFunctionRequest { +export interface FederatedTable { /** - *The ID of the Data Catalog where the function to be retrieved is located. If none is - * provided, the Amazon Web Services account ID is used by default.
+ *A unique identifier for the federated table.
* @public */ - CatalogId?: string; + Identifier?: string; /** - *The name of the catalog database where the function is located.
+ *A unique identifier for the federated database.
* @public */ - DatabaseName: string | undefined; + DatabaseIdentifier?: string; /** - *The name of the function.
+ *The name of the connection to the external metastore.
* @public */ - FunctionName: string | undefined; + ConnectionName?: string; } /** - *Represents the equivalent of a Hive user-defined function
- * (UDF
) definition.
A structure that contains the dialect of the view, and the query that defines the view.
* @public */ -export interface UserDefinedFunction { +export interface ViewRepresentation { /** - *The name of the function.
+ *The dialect of the query engine.
* @public */ - FunctionName?: string; + Dialect?: ViewDialect; /** - *The name of the catalog database that contains the function.
+ *The version of the dialect of the query engine. For example, 3.0.0.
* @public */ - DatabaseName?: string; + DialectVersion?: string; /** - *The Java class that contains the function code.
+ *The SELECT
query provided by the customer during CREATE VIEW DDL
. This SQL is not used during a query on a view (ViewExpandedText
is used instead). ViewOriginalText
is used for cases like SHOW CREATE VIEW
where users want to see the original DDL command that created the view.
The owner of the function.
+ *The expanded SQL for the view. This SQL is used by engines while processing a query on a view. Engines may perform operations during view creation to transform ViewOriginalText
to ViewExpandedText
. For example:
Fully qualified identifiers: SELECT * from table1 -> SELECT * from db1.table1
+ *
The owner type.
+ *The name of the connection to be used to validate the specific representation of the view.
* @public */ - OwnerType?: PrincipalType; + ValidationConnection?: string; /** - *The time at which the function was created.
+ *Dialects marked as stale are no longer valid and must be updated before they can be queried in their respective query engines.
* @public */ - CreateTime?: Date; + IsStale?: boolean; +} +/** + *A structure containing details for representations.
+ * @public + */ +export interface ViewDefinition { /** - *The resource URIs for the function.
+ *You can set this flag as true to instruct the engine not to push user-provided operations into the logical plan of the view during query planning. However, setting this flag does not guarantee that the engine will comply. Refer to the engine's documentation to understand the guarantees provided, if any.
* @public */ - ResourceUris?: ResourceUri[]; + IsProtected?: boolean; /** - *The ID of the Data Catalog in which the function resides.
+ *The definer of a view in SQL.
* @public */ - CatalogId?: string; -} + Definer?: string; -/** - * @public - */ -export interface GetUserDefinedFunctionResponse { /** - *The requested function definition.
+ *A list of table Amazon Resource Names (ARNs).
* @public */ - UserDefinedFunction?: UserDefinedFunction; + SubObjects?: string[]; + + /** + *A list of representations.
+ * @public + */ + Representations?: ViewRepresentation[]; } /** + *Represents a collection of related data organized in columns and rows.
* @public */ -export interface GetUserDefinedFunctionsRequest { +export interface Table { /** - *The ID of the Data Catalog where the functions to be retrieved are located. If none is - * provided, the Amazon Web Services account ID is used by default.
+ *The table name. For Hive compatibility, this must be entirely + * lowercase.
* @public */ - CatalogId?: string; + Name: string | undefined; /** - *The name of the catalog database where the functions are located. If none is provided, functions from all the - * databases across the catalog will be returned.
+ *The name of the database where the table metadata resides. + * For Hive compatibility, this must be all lowercase.
* @public */ DatabaseName?: string; /** - *An optional function-name pattern string that filters the function - * definitions returned.
+ *A description of the table.
* @public */ - Pattern: string | undefined; + Description?: string; /** - *A continuation token, if this is a continuation call.
+ *The owner of the table.
* @public */ - NextToken?: string; + Owner?: string; /** - *The maximum number of functions to return in one response.
+ *The time when the table definition was created in the Data Catalog.
* @public */ - MaxResults?: number; -} + CreateTime?: Date; -/** - * @public - */ -export interface GetUserDefinedFunctionsResponse { /** - *A list of requested function definitions.
+ *The last time that the table was updated.
* @public */ - UserDefinedFunctions?: UserDefinedFunction[]; + UpdateTime?: Date; /** - *A continuation token, if the list of functions returned does - * not include the last requested function.
+ *The last time that the table was accessed. This is usually taken from HDFS, and might not + * be reliable.
* @public */ - NextToken?: string; -} + LastAccessTime?: Date; -/** - * @public - */ -export interface GetWorkflowRequest { /** - *The name of the workflow to retrieve.
+ *The last time that column statistics were computed for this table.
* @public */ - Name: string | undefined; + LastAnalyzedTime?: Date; /** - *Specifies whether to include a graph when returning the workflow resource metadata.
+ *The retention time for this table.
* @public */ - IncludeGraph?: boolean; -} + Retention?: number; -/** - * @public - */ -export interface GetWorkflowResponse { /** - *The resource metadata for the workflow.
+ *A storage descriptor containing information about the physical storage + * of this table.
* @public */ - Workflow?: Workflow; -} + StorageDescriptor?: StorageDescriptor; -/** - * @public - */ -export interface GetWorkflowRunRequest { /** - *Name of the workflow being run.
+ *A list of columns by which the table is partitioned. Only primitive + * types are supported as partition keys.
+ *When you create a table used by Amazon Athena, and you do not specify any
+ * partitionKeys
, you must at least set the value of partitionKeys
to
+ * an empty list. For example:
+ * "PartitionKeys": []
+ *
The ID of the workflow run.
+ *Included for Apache Hive compatibility. Not used in the normal course of Glue operations.
+ * If the table is a VIRTUAL_VIEW
, certain Athena configuration encoded in base64.
Specifies whether to include the workflow graph in response or not.
+ *Included for Apache Hive compatibility. Not used in the normal course of Glue operations.
* @public */ - IncludeGraph?: boolean; -} + ViewExpandedText?: string; -/** - * @public - */ -export interface GetWorkflowRunResponse { /** - *The requested workflow run metadata.
+ *The type of this table.
+ * Glue will create tables with the EXTERNAL_TABLE
type.
+ * Other services, such as Athena, may create tables with additional table types.
+ *
Glue related table types:
+ *Hive compatible attribute - indicates a non-Hive managed table.
+ *Used by Lake Formation.
+ * The Glue Data Catalog understands GOVERNED
.
Name of the workflow which was run.
+ *These key-value pairs define properties associated with the table.
* @public */ - Name: string | undefined; + Parameters?: RecordThe ID of the workflow run whose run properties should be returned.
+ *The person or entity who created the table.
* @public */ - RunId: string | undefined; -} + CreatedBy?: string; -/** - * @public - */ -export interface GetWorkflowRunPropertiesResponse { /** - *The workflow run properties which were set during the specified run.
+ *Indicates whether the table has been registered with Lake Formation.
* @public */ - RunProperties?: RecordName of the workflow whose metadata of runs should be returned.
+ *A TableIdentifier
structure that describes a target table for resource linking.
Specifies whether to include the workflow graph in response or not.
+ *The ID of the Data Catalog in which the table resides.
* @public */ - IncludeGraph?: boolean; + CatalogId?: string; /** - *The maximum size of the response.
+ *The ID of the table version.
* @public */ - NextToken?: string; + VersionId?: string; /** - *The maximum number of workflow runs to be included in the response.
+ *A FederatedTable
structure that references an entity outside the Glue Data Catalog.
A list of workflow run metadata objects.
+ *A structure that contains all the information that defines the view, including the dialect or dialects for the view, and the query.
* @public */ - Runs?: WorkflowRun[]; + ViewDefinition?: ViewDefinition; /** - *A continuation token, if not all requested workflow runs have been returned.
+ *Specifies whether the view supports the SQL dialects of one or more different query engines and can therefore be read by those engines.
* @public */ - NextToken?: string; + IsMultiDialectView?: boolean; } /** * @public */ -export interface ImportCatalogToGlueRequest { +export interface GetTableResponse { /** - *The ID of the catalog to import. Currently, this should be the Amazon Web Services account ID.
+ *The Table
object that defines the specified table.
The Catalog ID of the table.
+ * @public + */ + CatalogId: string | undefined; -/** - * @public - */ -export interface ListBlueprintsRequest { /** - *A continuation token, if this is a continuation request.
+ *The name of the database in the catalog in which the table resides.
* @public */ - NextToken?: string; + DatabaseName: string | undefined; /** - *The maximum size of a list to return.
+ *The name of the table.
* @public */ - MaxResults?: number; + TableName: string | undefined; /** - *Filters the list by an Amazon Web Services resource tag.
+ *The type of table optimizer.
* @public */ - Tags?: RecordList of names of blueprints in the account.
+ *The Catalog ID of the table.
* @public */ - Blueprints?: string[]; + CatalogId?: string; /** - *A continuation token, if not all blueprint names have been returned.
+ *The name of the database in the catalog in which the table resides.
* @public */ - NextToken?: string; -} + DatabaseName?: string; -/** - * @public - */ -export interface ListColumnStatisticsTaskRunsRequest { /** - *The maximum size of the response.
+ *The name of the table.
* @public */ - MaxResults?: number; + TableName?: string; /** - *A continuation token, if this is a continuation call.
+ *The optimizer associated with the specified table.
* @public */ - NextToken?: string; + TableOptimizer?: TableOptimizer; } /** * @public */ -export interface ListColumnStatisticsTaskRunsResponse { +export interface GetTablesRequest { /** - *A list of column statistics task run IDs.
+ *The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account + * ID is used by default.
* @public */ - ColumnStatisticsTaskRunIds?: string[]; + CatalogId?: string; /** - *A continuation token, if not all task run IDs have yet been returned.
+ *The database in the catalog whose tables to list. For Hive + * compatibility, this name is entirely lowercase.
* @public */ - NextToken?: string; -} + DatabaseName: string | undefined; -/** - * @public - */ -export interface ListCrawlersRequest { /** - *The maximum size of a list to return.
+ *A regular expression pattern. If present, only those tables + * whose names match the pattern are returned.
* @public */ - MaxResults?: number; + Expression?: string; /** - *A continuation token, if this is a continuation request.
+ *A continuation token, included if this is a continuation call.
* @public */ NextToken?: string; /** - *Specifies to return only these tagged resources.
+ *The maximum number of tables to return in a single response.
* @public */ - Tags?: RecordThe names of all crawlers in the account, or the crawlers with the specified tags.
+ *The transaction ID at which to read the table contents.
* @public */ - CrawlerNames?: string[]; + TransactionId?: string; /** - *A continuation token, if the returned list does not contain the - * last metric available.
+ *The time as of when to read the table contents. If not set, the most recent transaction commit time will be used. Cannot be specified along with TransactionId
.
A list of fields, comparators and value that you can use to filter the crawler runs for a specified crawler.
* @public */ -export interface CrawlsFilter { - /** - *A key used to filter the crawler runs for a specified crawler. Valid values for each of the field names are:
- *
- * CRAWL_ID
: A string representing the UUID identifier for a crawl.
- * STATE
: A string representing the state of the crawl.
- * START_TIME
and END_TIME
: The epoch timestamp in milliseconds.
- * DPU_HOUR
: The number of data processing unit (DPU) hours used for the crawl.
A defined comparator that operates on the value. The available operators are:
- *
- * GT
: Greater than.
- * GE
: Greater than or equal to.
- * LT
: Less than.
- * LE
: Less than or equal to.
- * EQ
: Equal to.
- * NE
: Not equal to.
A list of the requested Table
objects.
The value provided for comparison on the crawl field.
+ *A continuation token, present if the current list segment is + * not the last.
* @public */ - FieldValue?: string; + NextToken?: string; } /** * @public */ -export interface ListCrawlsRequest { +export interface GetTableVersionRequest { /** - *The name of the crawler whose runs you want to retrieve.
+ *The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account + * ID is used by default.
* @public */ - CrawlerName: string | undefined; + CatalogId?: string; /** - *The maximum number of results to return. The default is 20, and maximum is 100.
+ *The database in the catalog in which the table resides. For Hive + * compatibility, this name is entirely lowercase.
* @public */ - MaxResults?: number; + DatabaseName: string | undefined; /** - *Filters the crawls by the criteria you specify in a list of CrawlsFilter
objects.
The name of the table. For Hive compatibility, + * this name is entirely lowercase.
* @public */ - Filters?: CrawlsFilter[]; + TableName: string | undefined; /** - *A continuation token, if this is a continuation call.
+ *The ID value of the table version to be retrieved. A VersionID
is a string representation of an integer. Each version is incremented by 1.
Specifies a version of a table.
* @public - * @enum */ -export const CrawlerHistoryState = { - COMPLETED: "COMPLETED", - FAILED: "FAILED", - RUNNING: "RUNNING", - STOPPED: "STOPPED", -} as const; +export interface TableVersion { + /** + *The table in question.
+ * @public + */ + Table?: Table; -/** - * @public - */ -export type CrawlerHistoryState = (typeof CrawlerHistoryState)[keyof typeof CrawlerHistoryState]; + /** + *The ID value that identifies this table version. A VersionId
is a string representation of an integer. Each version is incremented by 1.
Contains the information for a run of a crawler.
* @public */ -export interface CrawlerHistory { +export interface GetTableVersionResponse { /** - *A UUID identifier for each crawl.
+ *The requested table version.
* @public */ - CrawlId?: string; + TableVersion?: TableVersion; +} +/** + * @public + */ +export interface GetTableVersionsRequest { /** - *The state of the crawl.
+ *The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account + * ID is used by default.
* @public */ - State?: CrawlerHistoryState; + CatalogId?: string; /** - *The date and time on which the crawl started.
+ *The database in the catalog in which the table resides. For Hive + * compatibility, this name is entirely lowercase.
* @public */ - StartTime?: Date; + DatabaseName: string | undefined; /** - *The date and time on which the crawl ended.
+ *The name of the table. For Hive + * compatibility, this name is entirely lowercase.
* @public */ - EndTime?: Date; + TableName: string | undefined; /** - *A run summary for the specific crawl in JSON. Contains the catalog tables and partitions that were added, updated, or deleted.
+ *A continuation token, if this is not the first call.
* @public */ - Summary?: string; + NextToken?: string; /** - *If an error occurred, the error message associated with the crawl.
+ *The maximum number of table versions to return in one response.
* @public */ - ErrorMessage?: string; + MaxResults?: number; +} +/** + * @public + */ +export interface GetTableVersionsResponse { /** - *The log group associated with the crawl.
+ *A list of strings identifying available versions of the + * specified table.
* @public */ - LogGroup?: string; + TableVersions?: TableVersion[]; /** - *The log stream associated with the crawl.
+ *A continuation token, if the list of available versions does + * not include the last one.
* @public */ - LogStream?: string; + NextToken?: string; +} +/** + * @public + */ +export interface GetTagsRequest { /** - *The prefix for a CloudWatch message about this crawl.
+ *The Amazon Resource Name (ARN) of the resource for which to retrieve tags.
* @public */ - MessagePrefix?: string; + ResourceArn: string | undefined; +} +/** + * @public + */ +export interface GetTagsResponse { /** - *The number of data processing units (DPU) used in hours for the crawl.
+ *The requested tags.
* @public */ - DPUHour?: number; + Tags?: RecordA list of CrawlerHistory
objects representing the crawl runs that meet your criteria.
The name of the trigger to retrieve.
* @public */ - Crawls?: CrawlerHistory[]; + Name: string | undefined; +} +/** + * @public + */ +export interface GetTriggerResponse { /** - *A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.
+ *The requested trigger definition.
* @public */ - NextToken?: string; + Trigger?: Trigger; } /** * @public */ -export interface ListCustomEntityTypesRequest { +export interface GetTriggersRequest { /** - *A paginated token to offset the results.
+ *A continuation token, if this is a continuation call.
* @public */ NextToken?: string; /** - *The maximum number of results to return.
+ *The name of the job to retrieve triggers for. The trigger that can start this job is + * returned, and if there is no such trigger, all triggers are returned.
* @public */ - MaxResults?: number; + DependentJobName?: string; /** - *A list of key-value pair tags.
+ *The maximum size of the response.
* @public */ - Tags?: RecordA list of CustomEntityType
objects representing custom patterns.
A list of triggers for the specified job.
* @public */ - CustomEntityTypes?: CustomEntityType[]; + Triggers?: Trigger[]; /** - *A pagination token, if more results are available.
+ *A continuation token, if not all the requested triggers + * have yet been returned.
* @public */ NextToken?: string; } /** - *Criteria used to return data quality results.
+ *A structure used as a protocol between query engines and Lake Formation or Glue. Contains both a Lake Formation generated authorization identifier and information from the request's authorization context.
* @public */ -export interface DataQualityResultFilterCriteria { +export interface QuerySessionContext { /** - *Filter results by the specified data source. For example, retrieving all results for an Glue table.
+ *A unique identifier generated by the query engine for the query.
* @public */ - DataSource?: DataSource; + QueryId?: string; /** - *Filter results by the specified job name.
+ *A timestamp provided by the query engine for when the query started.
* @public */ - JobName?: string; + QueryStartTime?: Date; /** - *Filter results by the specified job run ID.
+ *An identifier string for the consumer cluster.
* @public */ - JobRunId?: string; + ClusterId?: string; /** - *Filter results by runs that started after this time.
+ *A cryptographically generated query identifier generated by Glue or Lake Formation.
* @public */ - StartedAfter?: Date; + QueryAuthorizationId?: string; /** - *Filter results by runs that started before this time.
+ *An opaque string-string map passed by the query engine.
* @public */ - StartedBefore?: Date; + AdditionalContext?: RecordThe filter criteria.
+ *Specified only if the base tables belong to a different Amazon Web Services Region.
* @public */ - Filter?: DataQualityResultFilterCriteria; + Region?: string; /** - *A paginated token to offset the results.
+ *The catalog ID where the partition resides.
* @public */ - NextToken?: string; + CatalogId: string | undefined; /** - *The maximum number of results to return.
+ *(Required) Specifies the name of a database that contains the partition.
* @public */ - MaxResults?: number; -} + DatabaseName: string | undefined; -/** - *Describes a data quality result.
- * @public - */ -export interface DataQualityResultDescription { /** - *The unique result ID for this data quality result.
+ *(Required) Specifies the name of a table that contains the partition.
* @public */ - ResultId?: string; + TableName: string | undefined; /** - *The table name associated with the data quality result.
+ *(Required) A list of partition key values.
* @public */ - DataSource?: DataSource; + PartitionValues: string[] | undefined; /** - *The job name associated with the data quality result.
+ *A structure containing Lake Formation audit context information.
* @public */ - JobName?: string; + AuditContext?: AuditContext; /** - *The job run ID associated with the data quality result.
+ *(Required) A list of supported permission types.
* @public */ - JobRunId?: string; + SupportedPermissionTypes: PermissionType[] | undefined; /** - *The time that the run started for this data quality result.
+ *A structure used as a protocol between query engines and Lake Formation or Glue. Contains both a Lake Formation generated authorization identifier and information from the request's authorization context.
* @public */ - StartedOn?: Date; + QuerySessionContext?: QuerySessionContext; } /** * @public */ -export interface ListDataQualityResultsResponse { +export interface GetUnfilteredPartitionMetadataResponse { /** - *A list of DataQualityResultDescription
objects.
A Partition object containing the partition metadata.
* @public */ - Results: DataQualityResultDescription[] | undefined; + Partition?: Partition; /** - *A pagination token, if more results are available.
+ *A list of column names that the user has been granted access to.
* @public */ - NextToken?: string; + AuthorizedColumns?: string[]; + + /** + *A Boolean value that indicates whether the partition location is registered + * with Lake Formation.
+ * @public + */ + IsRegisteredWithLakeFormation?: boolean; +} + +/** + *The operation timed out.
+ * @public + */ +export class PermissionTypeMismatchException extends __BaseException { + readonly name: "PermissionTypeMismatchException" = "PermissionTypeMismatchException"; + readonly $fault: "client" = "client"; + /** + *There is a mismatch between the SupportedPermissionType used in the query request + * and the permissions defined on the target table.
+ * @public + */ + Message?: string; + /** + * @internal + */ + constructor(opts: __ExceptionOptionTypeA filter for listing data quality recommendation runs.
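The shapes above cover fetching Lake Formation-filtered metadata for a single partition, with `PermissionTypeMismatchException` signalling that the declared permission type does not match the permissions defined on the table. Below is a minimal, hypothetical sketch of how they fit together; it assumes the `GetUnfilteredPartitionMetadataCommand` wrapper that accompanies these generated shapes, and all identifiers are placeholders.

```ts
import {
  GlueClient,
  GetUnfilteredPartitionMetadataCommand,
  PermissionTypeMismatchException,
} from "@aws-sdk/client-glue";

const client = new GlueClient({});

try {
  // Placeholder catalog, database, table, and partition key values.
  const out = await client.send(
    new GetUnfilteredPartitionMetadataCommand({
      CatalogId: "123456789012",
      DatabaseName: "sales_db",
      TableName: "orders",
      PartitionValues: ["2024", "07"],
      SupportedPermissionTypes: ["COLUMN_PERMISSION"],
    })
  );
  console.log(out.Partition, out.AuthorizedColumns, out.IsRegisteredWithLakeFormation);
} catch (err) {
  if (err instanceof PermissionTypeMismatchException) {
    // The declared permission type does not match the permissions defined on the table.
    console.error(err.Message);
  } else {
    throw err;
  }
}
```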
* @public */ -export interface DataQualityRuleRecommendationRunFilter { - /** - *Filter based on a specified data source (Glue table).
- * @public - */ - DataSource: DataSource | undefined; - +export interface GetUnfilteredPartitionsMetadataRequest { /** - *Filter based on time for results started before provided time.
+ *Specified only if the base tables belong to a different Amazon Web Services Region.
* @public */ - StartedBefore?: Date; + Region?: string; /** - *Filter based on time for results started after provided time.
+ *The ID of the Data Catalog where the partitions in question reside. If none is provided, + * the AWS account ID is used by default.
* @public */ - StartedAfter?: Date; -} + CatalogId: string | undefined; -/** - * @public - */ -export interface ListDataQualityRuleRecommendationRunsRequest { /** - *The filter criteria.
+ *The name of the catalog database where the partitions reside.
* @public */ - Filter?: DataQualityRuleRecommendationRunFilter; + DatabaseName: string | undefined; /** - *A paginated token to offset the results.
+ *The name of the table that contains the partition.
* @public */ - NextToken?: string; + TableName: string | undefined; /** - *The maximum number of results to return.
+ *An expression that filters the partitions to be returned.
+ *The expression uses SQL syntax similar to the SQL WHERE
filter clause. The
+ * SQL statement parser JSQLParser parses the expression.
+ * Operators: The following are the operators that you can use in the
+ * Expression
API call:
Checks whether the values of the two operands are equal; if yes, then the condition becomes + * true.
+ *Example: Assume 'variable a' holds 10 and 'variable b' holds 20.
+ *(a = b) is not true.
+ *Checks whether the values of two operands are equal; if the values are not equal, + * then the condition becomes true.
+ *Example: (a < > b) is true.
+ *Checks whether the value of the left operand is greater than the value of the right + * operand; if yes, then the condition becomes true.
+ *Example: (a > b) is not true.
+ *Checks whether the value of the left operand is less than the value of the right + * operand; if yes, then the condition becomes true.
+ *Example: (a < b) is true.
+ *Checks whether the value of the left operand is greater than or equal to the value + * of the right operand; if yes, then the condition becomes true.
+ *Example: (a >= b) is not true.
+ *Checks whether the value of the left operand is less than or equal to the value of + * the right operand; if yes, then the condition becomes true.
+ *Example: (a <= b) is true.
+ *Logical operators.
+ *+ * Supported Partition Key Types: The following are the supported + * partition keys.
+ *
+ * string
+ *
+ * date
+ *
+ * timestamp
+ *
+ * int
+ *
+ * bigint
+ *
+ * long
+ *
+ * tinyint
+ *
+ * smallint
+ *
+ * decimal
+ *
If a type is encountered that is not valid, an exception is thrown.
* @public */ - MaxResults?: number; -} + Expression?: string; -/** - *Describes the result of a data quality rule recommendation run.
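To make the expression syntax documented above concrete, here is a hedged sketch that pages through the partitions whose keys satisfy a simple predicate. It assumes the `GetUnfilteredPartitionsMetadataCommand` wrapper generated for these shapes; the identifiers and the predicate are placeholders.

```ts
import { GlueClient, GetUnfilteredPartitionsMetadataCommand } from "@aws-sdk/client-glue";

const client = new GlueClient({});

let NextToken: string | undefined;
do {
  const page = await client.send(
    new GetUnfilteredPartitionsMetadataCommand({
      CatalogId: "123456789012",
      DatabaseName: "sales_db",
      TableName: "orders",
      // Uses the SQL-like operators and supported partition key types listed above.
      Expression: "year = '2024' AND month >= '07'",
      SupportedPermissionTypes: ["COLUMN_PERMISSION"],
      NextToken,
    })
  );
  for (const p of page.UnfilteredPartitions ?? []) {
    console.log(p.Partition, p.AuthorizedColumns, p.IsRegisteredWithLakeFormation);
  }
  NextToken = page.NextToken;
} while (NextToken);
```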
- * @public - */ -export interface DataQualityRuleRecommendationRunDescription { /** - *The unique run identifier associated with this run.
+ *A structure containing Lake Formation audit context information.
* @public */ - RunId?: string; + AuditContext?: AuditContext; /** - *The status for this run.
+ *A list of supported permission types.
* @public */ - Status?: TaskStatusType; + SupportedPermissionTypes: PermissionType[] | undefined; /** - *The date and time when this run started.
+ *A continuation token, if this is not the first call to retrieve + * these partitions.
* @public */ - StartedOn?: Date; + NextToken?: string; /** - *The data source (Glue table) associated with the recommendation run.
+ *The segment of the table's partitions to scan in this request.
* @public */ - DataSource?: DataSource; -} + Segment?: Segment; -/** - * @public - */ -export interface ListDataQualityRuleRecommendationRunsResponse { /** - *A list of DataQualityRuleRecommendationRunDescription
objects.
The maximum number of partitions to return in a single response.
* @public */ - Runs?: DataQualityRuleRecommendationRunDescription[]; + MaxResults?: number; /** - *A pagination token, if more results are available.
+ *A structure used as a protocol between query engines and Lake Formation or Glue. Contains both a Lake Formation generated authorization identifier and information from the request's authorization context.
* @public */ - NextToken?: string; + QuerySessionContext?: QuerySessionContext; } /** - *The filter criteria.
+ *A partition that contains unfiltered metadata.
* @public */ -export interface DataQualityRulesetEvaluationRunFilter { +export interface UnfilteredPartition { /** - *Filter based on a data source (an Glue table) associated with the run.
+ *The partition object.
* @public */ - DataSource: DataSource | undefined; + Partition?: Partition; /** - *Filter results by runs that started before this time.
+ *The list of columns the user has permissions to access.
* @public */ - StartedBefore?: Date; + AuthorizedColumns?: string[]; /** - *Filter results by runs that started after this time.
+ *A Boolean value indicating that the partition location is registered with Lake Formation.
* @public */ - StartedAfter?: Date; + IsRegisteredWithLakeFormation?: boolean; } /** * @public */ -export interface ListDataQualityRulesetEvaluationRunsRequest { +export interface GetUnfilteredPartitionsMetadataResponse { /** - *The filter criteria.
+ *A list of requested partitions.
* @public */ - Filter?: DataQualityRulesetEvaluationRunFilter; + UnfilteredPartitions?: UnfilteredPartition[]; /** - *A paginated token to offset the results.
+ *A continuation token, if the returned list of partitions does not include the last + * one.
* @public */ NextToken?: string; +} +/** + *A structure specifying the dialect and dialect version used by the query engine.
+ * @public + */ +export interface SupportedDialect { /** - *The maximum number of results to return.
+ *The dialect of the query engine.
* @public */ - MaxResults?: number; + Dialect?: ViewDialect; + + /** + *The version of the dialect of the query engine. For example, 3.0.0.
+ * @public + */ + DialectVersion?: string; } /** - *Describes the result of a data quality ruleset evaluation run.
* @public */ -export interface DataQualityRulesetEvaluationRunDescription { +export interface GetUnfilteredTableMetadataRequest { /** - *The unique run identifier associated with this run.
+ *Specified only if the base tables belong to a different Amazon Web Services Region.
* @public */ - RunId?: string; + Region?: string; /** - *The status for this run.
+ *The catalog ID where the table resides.
* @public */ - Status?: TaskStatusType; + CatalogId: string | undefined; /** - *The date and time when the run started.
+ *(Required) Specifies the name of a database that contains the table.
* @public */ - StartedOn?: Date; + DatabaseName: string | undefined; /** - *The data source (an Glue table) associated with the run.
+ *(Required) Specifies the name of a table for which you are requesting metadata.
* @public */ - DataSource?: DataSource; -} + Name: string | undefined; -/** - * @public - */ -export interface ListDataQualityRulesetEvaluationRunsResponse { /** - *A list of DataQualityRulesetEvaluationRunDescription
objects representing data quality ruleset runs.
A structure containing Lake Formation audit context information.
* @public */ - Runs?: DataQualityRulesetEvaluationRunDescription[]; + AuditContext?: AuditContext; /** - *A pagination token, if more results are available.
+ *Indicates the level of filtering a third-party analytical engine is capable of enforcing when calling the GetUnfilteredTableMetadata
API operation. Accepted values are:
+ * COLUMN_PERMISSION
- Column permissions ensure that users can access only specific columns in the table. If particular columns contain sensitive data, data lake administrators can define column filters that exclude access to specific columns.
+ * CELL_FILTER_PERMISSION
- Cell-level filtering combines column filtering (include or exclude columns) and row filter expressions to restrict access to individual elements in the table.
+ * NESTED_PERMISSION
- Nested permissions combine cell-level filtering and nested column filtering to restrict access to columns and/or nested columns in specific rows based on row filter expressions.
+ * NESTED_CELL_PERMISSION
- Nested cell permissions combine nested permissions with nested cell-level filtering. This allows different subsets of nested columns to be restricted based on an array of row filter expressions.
Note: Each of these permission types follows a hierarchical order where each subsequent permission type includes all permissions of the previous type.
+ *Important: If you provide a supported permission type that doesn't match the user's level of permissions on the table, then Lake Formation raises an exception. For example, if the third-party engine calling the GetUnfilteredTableMetadata
operation can enforce only column-level filtering, and the user has nested cell filtering applied on the table, Lake Formation throws an exception, and will not return unfiltered table metadata and data access credentials.
The criteria used to filter data quality rulesets.
- * @public - */ -export interface DataQualityRulesetFilterCriteria { /** - *The name of the ruleset filter criteria.
+ *The resource ARN of the view.
* @public */ - Name?: string; + ParentResourceArn?: string; /** - *The description of the ruleset filter criteria.
+ *The resource ARN of the root view in a chain of nested views.
* @public */ - Description?: string; + RootResourceArn?: string; /** - *Filter on rulesets created before this date.
+ *A structure specifying the dialect and dialect version used by the query engine.
* @public */ - CreatedBefore?: Date; + SupportedDialect?: SupportedDialect; /** - *Filter on rulesets created after this date.
+ *The Lake Formation data permissions of the caller on the table. Used to authorize the call when no view context is found.
* @public */ - CreatedAfter?: Date; + Permissions?: Permission[]; /** - *Filter on rulesets last modified before this date.
+ *A structure used as a protocol between query engines and Lake Formation or Glue. Contains both a Lake Formation generated authorization identifier and information from the request's authorization context.
* @public */ - LastModifiedBefore?: Date; + QuerySessionContext?: QuerySessionContext; +} +/** + *A filter that uses both column-level and row-level filtering.
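As a usage note for the permission-type hierarchy described above, the sketch below shows a third-party engine declaring what it can enforce and then reading back the column and cell filters it must apply. It assumes the `GetUnfilteredTableMetadataCommand` wrapper for these shapes; names and IDs are placeholders.

```ts
import { GlueClient, GetUnfilteredTableMetadataCommand } from "@aws-sdk/client-glue";

const client = new GlueClient({});

const out = await client.send(
  new GetUnfilteredTableMetadataCommand({
    CatalogId: "123456789012",
    DatabaseName: "sales_db",
    Name: "orders",
    // Declare the strongest level of filtering this engine can enforce (see the hierarchy above).
    SupportedPermissionTypes: ["COLUMN_PERMISSION", "CELL_FILTER_PERMISSION"],
  })
);

// Columns the caller may read, and the row-level expressions to apply per column.
console.log(out.AuthorizedColumns);
for (const f of out.CellFilters ?? []) {
  console.log(f.ColumnName, f.RowFilterExpression);
}
console.log(out.IsRegisteredWithLakeFormation, out.IsMultiDialectView);
```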
+ * @public + */ +export interface ColumnRowFilter { /** - *Filter on rulesets last modified after this date.
+ *A string containing the name of the column.
* @public */ - LastModifiedAfter?: Date; + ColumnName?: string; /** - *The name and database name of the target table.
+ *A string containing the row-level filter expression.
* @public */ - TargetTable?: DataQualityTargetTable; + RowFilterExpression?: string; } /** * @public */ -export interface ListDataQualityRulesetsRequest { - /** - *A paginated token to offset the results.
- * @public - */ - NextToken?: string; - +export interface GetUnfilteredTableMetadataResponse { /** - *The maximum number of results to return.
+ *A Table object containing the table metadata.
* @public */ - MaxResults?: number; + Table?: Table; /** - *The filter criteria.
+ *A list of column names that the user has been granted access to.
* @public */ - Filter?: DataQualityRulesetFilterCriteria; + AuthorizedColumns?: string[]; /** - *A list of key-value pair tags.
+ *A Boolean value that indicates whether the partition location is registered + * with Lake Formation.
* @public */ - Tags?: RecordDescribes a data quality ruleset returned by GetDataQualityRuleset
.
The name of the data quality ruleset.
+ *A list of column row filters.
* @public */ - Name?: string; + CellFilters?: ColumnRowFilter[]; /** - *A description of the data quality ruleset.
+ *A cryptographically generated query identifier generated by Glue or Lake Formation.
* @public */ - Description?: string; + QueryAuthorizationId?: string; /** - *The date and time the data quality ruleset was created.
+ *Specifies whether the view supports the SQL dialects of one or more different query engines and can therefore be read by those engines.
* @public */ - CreatedOn?: Date; + IsMultiDialectView?: boolean; /** - *The date and time the data quality ruleset was last modified.
+ *The resource ARN of the parent resource extracted from the request.
* @public */ - LastModifiedOn?: Date; + ResourceArn?: string; /** - *An object representing an Glue table.
+ *A flag that instructs the engine not to push user-provided operations into the logical plan of the view during query planning. However, if set, this flag does not guarantee that the engine will comply. Refer to the engine's documentation to understand the guarantees provided, if any.
* @public */ - TargetTable?: DataQualityTargetTable; + IsProtected?: boolean; /** - *When a ruleset was created from a recommendation run, this run ID is generated to link the two together.
+ *The Lake Formation data permissions of the caller on the table. Used to authorize the call when no view context is found.
* @public */ - RecommendationRunId?: string; + Permissions?: Permission[]; /** - *The number of rules in the ruleset.
+ *The filter that applies to the table. For example when applying the filter in SQL, it would go in the WHERE
clause and can be evaluated by using an AND
operator with any other predicates applied by the user querying the table.
A paginated list of rulesets for the specified list of Glue tables.
- * @public - */ - Rulesets?: DataQualityRulesetListDetails[]; - +export interface GetUsageProfileRequest { /** - *A pagination token, if more results are available.
+ *The name of the usage profile to retrieve.
* @public */ - NextToken?: string; + Name: string | undefined; } /** * @public */ -export interface ListDevEndpointsRequest { +export interface GetUsageProfileResponse { /** - *A continuation token, if this is a continuation request.
+ *The name of the usage profile.
* @public */ - NextToken?: string; + Name?: string; /** - *The maximum size of a list to return.
+ *A description of the usage profile.
* @public */ - MaxResults?: number; + Description?: string; /** - *Specifies to return only these tagged resources.
+ *A ProfileConfiguration
object specifying the job and session values for the profile.
The names of all the DevEndpoint
s in the account, or the
- * DevEndpoint
s with the specified tags.
The date and time when the usage profile was created.
* @public */ - DevEndpointNames?: string[]; + CreatedOn?: Date; /** - *A continuation token, if the returned list does not contain the - * last metric available.
+ *The date and time when the usage profile was last modified.
* @public */ - NextToken?: string; + LastModifiedOn?: Date; } /** * @public */ -export interface ListJobsRequest { +export interface GetUserDefinedFunctionRequest { /** - *A continuation token, if this is a continuation request.
+ *The ID of the Data Catalog where the function to be retrieved is located. If none is + * provided, the Amazon Web Services account ID is used by default.
* @public */ - NextToken?: string; + CatalogId?: string; /** - *The maximum size of a list to return.
+ *The name of the catalog database where the function is located.
* @public */ - MaxResults?: number; + DatabaseName: string | undefined; /** - *Specifies to return only these tagged resources.
+ *The name of the function.
* @public */ - Tags?: RecordRepresents the equivalent of a Hive user-defined function
+ * (UDF
) definition.
The names of all jobs in the account, or the jobs with the specified tags.
- * @public - */ - JobNames?: string[]; - +export interface UserDefinedFunction { /** - *A continuation token, if the returned list does not contain the - * last metric available.
+ *The name of the function.
* @public */ - NextToken?: string; -} + FunctionName?: string; -/** - * @public - */ -export interface ListMLTransformsRequest { /** - *A continuation token, if this is a continuation request.
+ *The name of the catalog database that contains the function.
* @public */ - NextToken?: string; + DatabaseName?: string; /** - *The maximum size of a list to return.
+ *The Java class that contains the function code.
* @public */ - MaxResults?: number; + ClassName?: string; /** - *A TransformFilterCriteria
used to filter the machine learning transforms.
The owner of the function.
* @public */ - Filter?: TransformFilterCriteria; + OwnerName?: string; /** - *A TransformSortCriteria
used to sort the machine learning transforms.
The owner type.
* @public */ - Sort?: TransformSortCriteria; + OwnerType?: PrincipalType; /** - *Specifies to return only these tagged resources.
+ *The time at which the function was created.
* @public */ - Tags?: RecordThe identifiers of all the machine learning transforms in the account, or the - * machine learning transforms with the specified tags.
+ *The resource URIs for the function.
* @public */ - TransformIds: string[] | undefined; + ResourceUris?: ResourceUri[]; /** - *A continuation token, if the returned list does not contain the - * last metric available.
+ *The ID of the Data Catalog in which the function resides.
* @public */ - NextToken?: string; + CatalogId?: string; } /** * @public */ -export interface ListRegistriesInput { - /** - *Maximum number of results required per page. If the value is not supplied, this will be defaulted to 25 per page.
- * @public - */ - MaxResults?: number; - +export interface GetUserDefinedFunctionResponse { /** - *A continuation token, if this is a continuation call.
+ *The requested function definition.
* @public */ - NextToken?: string; + UserDefinedFunction?: UserDefinedFunction; } /** - *A structure containing the details for a registry.
* @public */ -export interface RegistryListItem { - /** - *The name of the registry.
- * @public - */ - RegistryName?: string; - +export interface GetUserDefinedFunctionsRequest { /** - *The Amazon Resource Name (ARN) of the registry.
+ *The ID of the Data Catalog where the functions to be retrieved are located. If none is + * provided, the Amazon Web Services account ID is used by default.
* @public */ - RegistryArn?: string; + CatalogId?: string; /** - *A description of the registry.
+ *The name of the catalog database where the functions are located. If none is provided, functions from all the + * databases across the catalog will be returned.
* @public */ - Description?: string; + DatabaseName?: string; /** - *The status of the registry.
+ *An optional function-name pattern string that filters the function + * definitions returned.
* @public */ - Status?: RegistryStatus; + Pattern: string | undefined; /** - *The data the registry was created.
+ *A continuation token, if this is a continuation call.
* @public */ - CreatedTime?: string; + NextToken?: string; /** - *The date the registry was updated.
+ *The maximum number of functions to return in one response.
* @public */ - UpdatedTime?: string; + MaxResults?: number; } /** * @public */ -export interface ListRegistriesResponse { +export interface GetUserDefinedFunctionsResponse { /** - *An array of RegistryDetailedListItem
objects containing minimal details of each registry.
A list of requested function definitions.
* @public */ - Registries?: RegistryListItem[]; + UserDefinedFunctions?: UserDefinedFunction[]; /** - *A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.
+ *A continuation token, if the list of functions returned does + * not include the last requested function.
* @public */ NextToken?: string; @@ -3749,170 +2959,190 @@ export interface ListRegistriesResponse { /** * @public */ -export interface ListSchemasInput { +export interface GetWorkflowRequest { /** - *A wrapper structure that may contain the registry name and Amazon Resource Name (ARN).
+ *The name of the workflow to retrieve.
* @public */ - RegistryId?: RegistryId; + Name: string | undefined; /** - *Maximum number of results required per page. If the value is not supplied, this will be defaulted to 25 per page.
+ *Specifies whether to include a graph when returning the workflow resource metadata.
* @public */ - MaxResults?: number; + IncludeGraph?: boolean; +} +/** + * @public + */ +export interface GetWorkflowResponse { /** - *A continuation token, if this is a continuation call.
+ *The resource metadata for the workflow.
* @public */ - NextToken?: string; + Workflow?: Workflow; } /** - *An object that contains minimal details for a schema.
* @public */ -export interface SchemaListItem { +export interface GetWorkflowRunRequest { /** - *the name of the registry where the schema resides.
+ *Name of the workflow being run.
* @public */ - RegistryName?: string; + Name: string | undefined; /** - *The name of the schema.
+ *The ID of the workflow run.
* @public */ - SchemaName?: string; + RunId: string | undefined; /** - *The Amazon Resource Name (ARN) for the schema.
+ *Specifies whether to include the workflow graph in response or not.
* @public */ - SchemaArn?: string; + IncludeGraph?: boolean; +} +/** + * @public + */ +export interface GetWorkflowRunResponse { /** - *A description for the schema.
+ *The requested workflow run metadata.
* @public */ - Description?: string; + Run?: WorkflowRun; +} +/** + * @public + */ +export interface GetWorkflowRunPropertiesRequest { /** - *The status of the schema.
+ *Name of the workflow which was run.
* @public */ - SchemaStatus?: SchemaStatus; + Name: string | undefined; /** - *The date and time that a schema was created.
+ *The ID of the workflow run whose run properties should be returned.
* @public */ - CreatedTime?: string; + RunId: string | undefined; +} +/** + * @public + */ +export interface GetWorkflowRunPropertiesResponse { /** - *The date and time that a schema was updated.
+ *The workflow run properties which were set during the specified run.
* @public */ - UpdatedTime?: string; + RunProperties?: RecordAn array of SchemaListItem
objects containing details of each schema.
Name of the workflow whose metadata of runs should be returned.
* @public */ - Schemas?: SchemaListItem[]; + Name: string | undefined; /** - *A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.
+ *Specifies whether to include the workflow graph in response or not.
* @public */ - NextToken?: string; -} + IncludeGraph?: boolean; -/** - * @public - */ -export interface ListSchemaVersionsInput { /** - *This is a wrapper structure to contain schema identity fields. The structure contains:
- *SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. Either SchemaArn
or SchemaName
and RegistryName
has to be provided.
SchemaId$SchemaName: The name of the schema. Either SchemaArn
or SchemaName
and RegistryName
has to be provided.
The maximum size of the response.
* @public */ - SchemaId: SchemaId | undefined; + NextToken?: string; /** - *Maximum number of results required per page. If the value is not supplied, this will be defaulted to 25 per page.
+ *The maximum number of workflow runs to be included in the response.
* @public */ MaxResults?: number; +} + +/** + * @public + */ +export interface GetWorkflowRunsResponse { + /** + *A list of workflow run metadata objects.
+ * @public + */ + Runs?: WorkflowRun[]; /** - *A continuation token, if this is a continuation call.
+ *A continuation token, if not all requested workflow runs have been returned.
* @public */ NextToken?: string; } /** - *An object containing the details about a schema version.
* @public */ -export interface SchemaVersionListItem { +export interface ImportCatalogToGlueRequest { /** - *The Amazon Resource Name (ARN) of the schema.
+ *The ID of the catalog to import. Currently, this should be the Amazon Web Services account ID.
* @public */ - SchemaArn?: string; + CatalogId?: string; +} - /** - *The unique identifier of the schema version.
- * @public - */ - SchemaVersionId?: string; +/** + * @public + */ +export interface ImportCatalogToGlueResponse {} +/** + * @public + */ +export interface ListBlueprintsRequest { /** - *The version number of the schema.
+ *A continuation token, if this is a continuation request.
* @public */ - VersionNumber?: number; + NextToken?: string; /** - *The status of the schema version.
+ *The maximum size of a list to return.
* @public */ - Status?: SchemaVersionStatus; + MaxResults?: number; /** - *The date and time the schema version was created.
+ *Filters the list by an Amazon Web Services resource tag.
* @public */ - CreatedTime?: string; + Tags?: RecordAn array of SchemaVersionList
objects containing details of each schema version.
List of names of blueprints in the account.
* @public */ - Schemas?: SchemaVersionListItem[]; + Blueprints?: string[]; /** - *A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.
+ *A continuation token, if not all blueprint names have been returned.
* @public */ NextToken?: string; @@ -3921,50 +3151,32 @@ export interface ListSchemaVersionsResponse { /** * @public */ -export interface ListSessionsRequest { - /** - *The token for the next set of results, or null if there are no more result.
- * @public - */ - NextToken?: string; - +export interface ListColumnStatisticsTaskRunsRequest { /** - *The maximum number of results.
+ *The maximum size of the response.
* @public */ MaxResults?: number; /** - *Tags belonging to the session.
- * @public - */ - Tags?: RecordThe origin of the request.
+ *A continuation token, if this is a continuation call.
* @public */ - RequestOrigin?: string; + NextToken?: string; } /** * @public */ -export interface ListSessionsResponse { - /** - *Returns the ID of the session.
- * @public - */ - Ids?: string[]; - +export interface ListColumnStatisticsTaskRunsResponse { /** - *Returns the session object.
+ *A list of column statistics task run IDs.
* @public */ - Sessions?: Session[]; + ColumnStatisticsTaskRunIds?: string[]; /** - *The token for the next set of results, or null if there are no more result.
+ *A continuation token, if not all task run IDs have yet been returned.
* @public */ NextToken?: string; @@ -3973,38 +3185,39 @@ export interface ListSessionsResponse { /** * @public */ -export interface ListStatementsRequest { +export interface ListCrawlersRequest { /** - *The Session ID of the statements.
+ *The maximum size of a list to return.
* @public */ - SessionId: string | undefined; + MaxResults?: number; /** - *The origin of the request to list statements.
+ *A continuation token, if this is a continuation request.
* @public */ - RequestOrigin?: string; + NextToken?: string; /** - *A continuation token, if this is a continuation call.
+ *Specifies to return only these tagged resources.
* @public */ - NextToken?: string; + Tags?: RecordReturns the list of statements.
+ *The names of all crawlers in the account, or the crawlers with the specified tags.
* @public */ - Statements?: Statement[]; + CrawlerNames?: string[]; /** - *A continuation token, if not all statements have yet been returned.
+ *A continuation token, if the returned list does not contain the + * last metric available.
* @public */ NextToken?: string; @@ -4012,187 +3225,230 @@ export interface ListStatementsResponse { /** * @public + * @enum */ -export interface ListTableOptimizerRunsRequest { - /** - *The Catalog ID of the table.
- * @public - */ - CatalogId: string | undefined; +export const FieldName = { + CRAWL_ID: "CRAWL_ID", + DPU_HOUR: "DPU_HOUR", + END_TIME: "END_TIME", + START_TIME: "START_TIME", + STATE: "STATE", +} as const; - /** - *The name of the database in the catalog in which the table resides.
- * @public - */ - DatabaseName: string | undefined; +/** + * @public + */ +export type FieldName = (typeof FieldName)[keyof typeof FieldName]; - /** - *The name of the table.
- * @public - */ - TableName: string | undefined; +/** + * @public + * @enum + */ +export const FilterOperator = { + EQ: "EQ", + GE: "GE", + GT: "GT", + LE: "LE", + LT: "LT", + NE: "NE", +} as const; + +/** + * @public + */ +export type FilterOperator = (typeof FilterOperator)[keyof typeof FilterOperator]; +/** + *A list of fields, comparators and value that you can use to filter the crawler runs for a specified crawler.
+ * @public + */ +export interface CrawlsFilter { /** - *The type of table optimizer. Currently, the only valid value is compaction
.
A key used to filter the crawler runs for a specified crawler. Valid values for each of the field names are:
+ *
+ * CRAWL_ID
: A string representing the UUID identifier for a crawl.
+ * STATE
: A string representing the state of the crawl.
+ * START_TIME
and END_TIME
: The epoch timestamp in milliseconds.
+ * DPU_HOUR
: The number of data processing unit (DPU) hours used for the crawl.
The maximum number of optimizer runs to return on each call.
+ *A defined comparator that operates on the value. The available operators are:
+ *
+ * GT
: Greater than.
+ * GE
: Greater than or equal to.
+ * LT
: Less than.
+ * LE
: Less than or equal to.
+ * EQ
: Equal to.
+ * NE
: Not equal to.
A continuation token, if this is a continuation call.
+ *The value provided for comparison on the crawl field.
* @public */ - NextToken?: string; + FieldValue?: string; } /** * @public */ -export interface ListTableOptimizerRunsResponse { +export interface ListCrawlsRequest { /** - *The Catalog ID of the table.
+ *The name of the crawler whose runs you want to retrieve.
* @public */ - CatalogId?: string; + CrawlerName: string | undefined; /** - *The name of the database in the catalog in which the table resides.
+ *The maximum number of results to return. The default is 20, and maximum is 100.
* @public */ - DatabaseName?: string; + MaxResults?: number; /** - *The name of the table.
+ *Filters the crawls by the criteria you specify in a list of CrawlsFilter
objects.
A continuation token for paginating the returned list of optimizer runs, returned if the current segment of the list is not the last.
+ *A continuation token, if this is a continuation call.
* @public */ NextToken?: string; - - /** - *A list of the optimizer runs associated with a table.
- * @public - */ - TableOptimizerRuns?: TableOptimizerRun[]; } /** * @public + * @enum */ -export interface ListTriggersRequest { - /** - *A continuation token, if this is a continuation request.
- * @public - */ - NextToken?: string; +export const CrawlerHistoryState = { + COMPLETED: "COMPLETED", + FAILED: "FAILED", + RUNNING: "RUNNING", + STOPPED: "STOPPED", +} as const; - /** - *The name of the job for which to retrieve triggers. The trigger that can start this job - * is returned. If there is no such trigger, all triggers are returned.
- * @public - */ - DependentJobName?: string; +/** + * @public + */ +export type CrawlerHistoryState = (typeof CrawlerHistoryState)[keyof typeof CrawlerHistoryState]; +/** + *Contains the information for a run of a crawler.
+ * @public + */ +export interface CrawlerHistory { /** - *The maximum size of a list to return.
+ *A UUID identifier for each crawl.
* @public */ - MaxResults?: number; + CrawlId?: string; /** - *Specifies to return only these tagged resources.
+ *The state of the crawl.
* @public */ - Tags?: RecordThe names of all triggers in the account, or the triggers with the specified tags.
+ *The date and time on which the crawl started.
* @public */ - TriggerNames?: string[]; + StartTime?: Date; /** - *A continuation token, if the returned list does not contain the - * last metric available.
+ *The date and time on which the crawl ended.
* @public */ - NextToken?: string; -} + EndTime?: Date; -/** - * @public - */ -export interface ListUsageProfilesRequest { /** - *A continuation token, included if this is a continuation call.
+ *A run summary for the specific crawl in JSON. Contains the catalog tables and partitions that were added, updated, or deleted.
* @public */ - NextToken?: string; + Summary?: string; /** - *The maximum number of usage profiles to return in a single response.
+ *If an error occurred, the error message associated with the crawl.
* @public */ - MaxResults?: number; -} + ErrorMessage?: string; -/** - *Describes an Glue usage profile.
- * @public - */ -export interface UsageProfileDefinition { /** - *The name of the usage profile.
+ *The log group associated with the crawl.
* @public */ - Name?: string; + LogGroup?: string; /** - *A description of the usage profile.
+ *The log stream associated with the crawl.
* @public */ - Description?: string; + LogStream?: string; /** - *The date and time when the usage profile was created.
+ *The prefix for a CloudWatch message about this crawl.
* @public */ - CreatedOn?: Date; + MessagePrefix?: string; /** - *The date and time when the usage profile was last modified.
+ *The number of data processing units (DPU) used in hours for the crawl.
* @public */ - LastModifiedOn?: Date; + DPUHour?: number; } /** * @public */ -export interface ListUsageProfilesResponse { +export interface ListCrawlsResponse { /** - *A list of usage profile (UsageProfileDefinition
) objects.
A list of CrawlerHistory
objects representing the crawl runs that meet your criteria.
A continuation token, present if the current list segment is not the last.
+ *A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.
* @public */ NextToken?: string; @@ -4201,3924 +3457,4037 @@ export interface ListUsageProfilesResponse { /** * @public */ -export interface ListWorkflowsRequest { +export interface ListCustomEntityTypesRequest { /** - *A continuation token, if this is a continuation request.
+ *A paginated token to offset the results.
* @public */ NextToken?: string; /** - *The maximum size of a list to return.
+ *The maximum number of results to return.
* @public */ MaxResults?: number; -} - -/** - * @public - */ -export interface ListWorkflowsResponse { - /** - *List of names of workflows in the account.
- * @public - */ - Workflows?: string[]; /** - *A continuation token, if not all workflow names have been returned.
+ *A list of key-value pair tags.
* @public */ - NextToken?: string; + Tags?: RecordThe ID of the Data Catalog to set the security configuration for. If none is provided, the - * Amazon Web Services account ID is used by default.
+ *A list of CustomEntityType
objects representing custom patterns.
The security configuration to set.
+ *A pagination token, if more results are available.
* @public */ - DataCatalogEncryptionSettings: DataCatalogEncryptionSettings | undefined; + NextToken?: string; } /** + *Criteria used to return data quality results.
* @public */ -export interface PutDataCatalogEncryptionSettingsResponse {} - -/** - * @public - * @enum - */ -export const EnableHybridValues = { - FALSE: "FALSE", - TRUE: "TRUE", -} as const; - -/** - * @public - */ -export type EnableHybridValues = (typeof EnableHybridValues)[keyof typeof EnableHybridValues]; - -/** - * @public - * @enum - */ -export const ExistCondition = { - MUST_EXIST: "MUST_EXIST", - NONE: "NONE", - NOT_EXIST: "NOT_EXIST", -} as const; - -/** - * @public - */ -export type ExistCondition = (typeof ExistCondition)[keyof typeof ExistCondition]; - -/** - * @public - */ -export interface PutResourcePolicyRequest { +export interface DataQualityResultFilterCriteria { /** - *Contains the policy document to set, in JSON format.
+ *Filter results by the specified data source. For example, retrieving all results for an Glue table.
* @public */ - PolicyInJson: string | undefined; + DataSource?: DataSource; /** - *Do not use. For internal use only.
+ *Filter results by the specified job name.
* @public */ - ResourceArn?: string; + JobName?: string; /** - *The hash value returned when the previous policy was set using
- * PutResourcePolicy
. Its purpose is to prevent concurrent modifications of a
- * policy. Do not use this parameter if no previous policy has been set.
Filter results by the specified job run ID.
* @public */ - PolicyHashCondition?: string; + JobRunId?: string; /** - *A value of MUST_EXIST
is used to update a policy. A value of
- * NOT_EXIST
is used to create a new policy. If a value of NONE
or a
- * null value is used, the call does not depend on the existence of a policy.
Filter results by runs that started after this time.
* @public */ - PolicyExistsCondition?: ExistCondition; + StartedAfter?: Date; /** - *If 'TRUE'
, indicates that you are using both methods to grant cross-account
- * access to Data Catalog resources:
By directly updating the resource policy with PutResourePolicy
- *
By using the Grant permissions command on the Amazon Web Services Management Console.
- *Must be set to 'TRUE'
if you have already used the Management Console to
- * grant cross-account access, otherwise the call fails. Default is 'FALSE'.
Filter results by runs that started before this time.
* @public */ - EnableHybrid?: EnableHybridValues; + StartedBefore?: Date; } /** * @public */ -export interface PutResourcePolicyResponse { +export interface ListDataQualityResultsRequest { /** - *A hash of the policy that has just been set. This must - * be included in a subsequent call that overwrites or updates - * this policy.
+ *The filter criteria.
* @public */ - PolicyHash?: string; -} + Filter?: DataQualityResultFilterCriteria; -/** - *A structure containing a key value pair for metadata.
- * @public - */ -export interface MetadataKeyValuePair { /** - *A metadata key.
+ *A paginated token to offset the results.
* @public */ - MetadataKey?: string; + NextToken?: string; /** - *A metadata key’s corresponding value.
+ *The maximum number of results to return.
* @public */ - MetadataValue?: string; + MaxResults?: number; } /** + *Describes a data quality result.
* @public */ -export interface PutSchemaVersionMetadataInput { +export interface DataQualityResultDescription { /** - *The unique ID for the schema.
+ *The unique result ID for this data quality result.
* @public */ - SchemaId?: SchemaId; + ResultId?: string; /** - *The version number of the schema.
+ *The table name associated with the data quality result.
* @public */ - SchemaVersionNumber?: SchemaVersionNumber; + DataSource?: DataSource; /** - *The unique version ID of the schema version.
+ *The job name associated with the data quality result.
* @public */ - SchemaVersionId?: string; + JobName?: string; /** - *The metadata key's corresponding value.
+ *The job run ID associated with the data quality result.
* @public */ - MetadataKeyValue: MetadataKeyValuePair | undefined; + JobRunId?: string; + + /** + *The time that the run started for this data quality result.
+ * @public + */ + StartedOn?: Date; } /** * @public */ -export interface PutSchemaVersionMetadataResponse { +export interface ListDataQualityResultsResponse { /** - *The Amazon Resource Name (ARN) for the schema.
+ *A list of DataQualityResultDescription
objects.
The name for the schema.
+ *A pagination token, if more results are available.
* @public */ - SchemaName?: string; + NextToken?: string; +} +/** + *A filter for listing data quality recommendation runs.
+ * @public + */ +export interface DataQualityRuleRecommendationRunFilter { /** - *The name for the registry.
+ *Filter based on a specified data source (Glue table).
* @public */ - RegistryName?: string; + DataSource: DataSource | undefined; /** - *The latest version of the schema.
+ *Filter based on time for results started before provided time.
* @public */ - LatestVersion?: boolean; + StartedBefore?: Date; /** - *The version number of the schema.
+ *Filter based on time for results started after provided time.
* @public */ - VersionNumber?: number; + StartedAfter?: Date; +} +/** + * @public + */ +export interface ListDataQualityRuleRecommendationRunsRequest { /** - *The unique version ID of the schema version.
+ *The filter criteria.
* @public */ - SchemaVersionId?: string; + Filter?: DataQualityRuleRecommendationRunFilter; /** - *The metadata key.
+ *A paginated token to offset the results.
* @public */ - MetadataKey?: string; + NextToken?: string; /** - *The value of the metadata key.
+ *The maximum number of results to return.
* @public */ - MetadataValue?: string; + MaxResults?: number; } /** + *Describes the result of a data quality rule recommendation run.
* @public */ -export interface PutWorkflowRunPropertiesRequest { +export interface DataQualityRuleRecommendationRunDescription { /** - *Name of the workflow which was run.
+ *The unique run identifier associated with this run.
* @public */ - Name: string | undefined; + RunId?: string; /** - *The ID of the workflow run for which the run properties should be updated.
+ *The status for this run.
* @public */ - RunId: string | undefined; + Status?: TaskStatusType; /** - *The properties to put for the specified run.
+ *The date and time when this run started.
* @public */ - RunProperties: RecordA wrapper structure that may contain the schema name and Amazon Resource Name (ARN).
+ *The data source (Glue table) associated with the recommendation run.
* @public */ - SchemaId?: SchemaId; + DataSource?: DataSource; +} +/** + * @public + */ +export interface ListDataQualityRuleRecommendationRunsResponse { /** - *The version number of the schema.
+ *A list of DataQualityRuleRecommendationRunDescription
objects.
The unique version ID of the schema version.
+ *A pagination token, if more results are available.
* @public */ - SchemaVersionId?: string; + NextToken?: string; +} +/** + *The filter criteria.
+ * @public + */ +export interface DataQualityRulesetEvaluationRunFilter { /** - *Search key-value pairs for metadata, if they are not provided all the metadata information will be fetched.
+ *Filter based on a data source (an Glue table) associated with the run.
* @public */ - MetadataList?: MetadataKeyValuePair[]; + DataSource: DataSource | undefined; /** - *Maximum number of results required per page. If the value is not supplied, this will be defaulted to 25 per page.
+ *Filter results by runs that started before this time.
* @public */ - MaxResults?: number; + StartedBefore?: Date; /** - *A continuation token, if this is a continuation call.
+ *Filter results by runs that started after this time.
* @public */ - NextToken?: string; + StartedAfter?: Date; } /** - *A structure containing other metadata for a schema version belonging to the same metadata key.
* @public */ -export interface OtherMetadataValueListItem { +export interface ListDataQualityRulesetEvaluationRunsRequest { /** - *The metadata key’s corresponding value for the other metadata belonging to the same metadata key.
+ *The filter criteria.
* @public */ - MetadataValue?: string; + Filter?: DataQualityRulesetEvaluationRunFilter; /** - *The time at which the entry was created.
+ *A paginated token to offset the results.
* @public */ - CreatedTime?: string; + NextToken?: string; + + /** + *The maximum number of results to return.
+ * @public + */ + MaxResults?: number; } /** - *A structure containing metadata information for a schema version.
+ *Describes the result of a data quality ruleset evaluation run.
* @public */ -export interface MetadataInfo { +export interface DataQualityRulesetEvaluationRunDescription { /** - *The metadata key’s corresponding value.
+ *The unique run identifier associated with this run.
* @public */ - MetadataValue?: string; + RunId?: string; /** - *The time at which the entry was created.
+ *The status for this run.
* @public */ - CreatedTime?: string; + Status?: TaskStatusType; /** - *Other metadata belonging to the same metadata key.
+ *The date and time when the run started.
* @public */ - OtherMetadataValueList?: OtherMetadataValueListItem[]; -} + StartedOn?: Date; -/** - * @public - */ -export interface QuerySchemaVersionMetadataResponse { /** - *A map of a metadata key and associated values.
+ *The data source (an Glue table) associated with the run.
* @public */ - MetadataInfoMap?: RecordThe unique version ID of the schema version.
+ *A list of DataQualityRulesetEvaluationRunDescription
objects representing data quality ruleset runs.
A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.
+ *A pagination token, if more results are available.
* @public */ NextToken?: string; } /** + *The criteria used to filter data quality rulesets.
* @public */ -export interface RegisterSchemaVersionInput { +export interface DataQualityRulesetFilterCriteria { /** - *This is a wrapper structure to contain schema identity fields. The structure contains:
- *SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. Either SchemaArn
or SchemaName
and RegistryName
has to be provided.
SchemaId$SchemaName: The name of the schema. Either SchemaArn
or SchemaName
and RegistryName
has to be provided.
The name of the ruleset filter criteria.
+ * @public + */ + Name?: string; + + /** + *The description of the ruleset filter criteria.
* @public */ - SchemaId: SchemaId | undefined; + Description?: string; /** - *The schema definition using the DataFormat
setting for the SchemaName
.
Filter on rulesets created before this date.
* @public */ - SchemaDefinition: string | undefined; -} + CreatedBefore?: Date; -/** - * @public - */ -export interface RegisterSchemaVersionResponse { /** - *The unique ID that represents the version of this schema.
+ *Filter on rulesets created after this date.
* @public */ - SchemaVersionId?: string; + CreatedAfter?: Date; /** - *The version of this schema (for sync flow only, in case this is the first version).
+ *Filter on rulesets last modified before this date.
* @public */ - VersionNumber?: number; + LastModifiedBefore?: Date; /** - *The status of the schema version.
+ *Filter on rulesets last modified after this date.
* @public */ - Status?: SchemaVersionStatus; + LastModifiedAfter?: Date; + + /** + *The name and database name of the target table.
+ * @public + */ + TargetTable?: DataQualityTargetTable; } /** * @public */ -export interface RemoveSchemaVersionMetadataInput { +export interface ListDataQualityRulesetsRequest { /** - *A wrapper structure that may contain the schema name and Amazon Resource Name (ARN).
+ *A paginated token to offset the results.
* @public */ - SchemaId?: SchemaId; + NextToken?: string; /** - *The version number of the schema.
+ *The maximum number of results to return.
* @public */ - SchemaVersionNumber?: SchemaVersionNumber; + MaxResults?: number; /** - *The unique version ID of the schema version.
+ *The filter criteria.
* @public */ - SchemaVersionId?: string; + Filter?: DataQualityRulesetFilterCriteria; /** - *The value of the metadata key.
+ *A list of key-value pair tags.
* @public */ - MetadataKeyValue: MetadataKeyValuePair | undefined; + Tags?: RecordDescribes a data quality ruleset returned by GetDataQualityRuleset
.
The Amazon Resource Name (ARN) of the schema.
- * @public - */ - SchemaArn?: string; - +export interface DataQualityRulesetListDetails { /** - *The name of the schema.
+ *The name of the data quality ruleset.
* @public */ - SchemaName?: string; + Name?: string; /** - *The name of the registry.
+ *A description of the data quality ruleset.
* @public */ - RegistryName?: string; + Description?: string; /** - *The latest version of the schema.
+ *The date and time the data quality ruleset was created.
* @public */ - LatestVersion?: boolean; + CreatedOn?: Date; /** - *The version number of the schema.
+ *The date and time the data quality ruleset was last modified.
* @public */ - VersionNumber?: number; + LastModifiedOn?: Date; /** - *The version ID for the schema version.
+ *An object representing an Glue table.
* @public */ - SchemaVersionId?: string; + TargetTable?: DataQualityTargetTable; /** - *The metadata key.
+ *When a ruleset was created from a recommendation run, this run ID is generated to link the two together.
* @public */ - MetadataKey?: string; + RecommendationRunId?: string; /** - *The value of the metadata key.
+ *The number of rules in the ruleset.
* @public */ - MetadataValue?: string; + RuleCount?: number; } /** * @public */ -export interface ResetJobBookmarkRequest { +export interface ListDataQualityRulesetsResponse { /** - *The name of the job in question.
+ *A paginated list of rulesets for the specified list of Glue tables.
* @public */ - JobName: string | undefined; + Rulesets?: DataQualityRulesetListDetails[]; /** - *The unique run identifier associated with this job run.
+ *A pagination token, if more results are available.
* @public */ - RunId?: string; + NextToken?: string; } /** + *A timestamp filter.
* @public */ -export interface ResetJobBookmarkResponse { +export interface TimestampFilter { /** - *The reset bookmark entry.
+ *The timestamp before which statistics should be included in the results.
* @public */ - JobBookmarkEntry?: JobBookmarkEntry; -} + RecordedBefore?: Date; -/** - *Too many jobs are being run concurrently.
- * @public - */ -export class ConcurrentRunsExceededException extends __BaseException { - readonly name: "ConcurrentRunsExceededException" = "ConcurrentRunsExceededException"; - readonly $fault: "client" = "client"; /** - *A message describing the problem.
+ *The timestamp after which statistics should be included in the results.
* @public */ - Message?: string; - /** - * @internal - */ - constructor(opts: __ExceptionOptionTypeThe workflow is in an invalid state to perform a requested operation.
* @public */ -export class IllegalWorkflowStateException extends __BaseException { - readonly name: "IllegalWorkflowStateException" = "IllegalWorkflowStateException"; - readonly $fault: "client" = "client"; +export interface ListDataQualityStatisticAnnotationsRequest { /** - *A message describing the problem.
+ *The Statistic ID.
* @public */ - Message?: string; + StatisticId?: string; + /** - * @internal + *The Profile ID.
+ * @public */ - constructor(opts: __ExceptionOptionTypeThe name of the workflow to resume.
+ *A timestamp filter.
* @public */ - Name: string | undefined; + TimestampFilter?: TimestampFilter; /** - *The ID of the workflow run to resume.
+ *The maximum number of results to return in this request.
* @public */ - RunId: string | undefined; + MaxResults?: number; /** - *A list of the node IDs for the nodes you want to restart. The nodes that are to be restarted must have a run attempt in the original run.
+ *A pagination token to retrieve the next set of results.
* @public */ - NodeIds: string[] | undefined; + NextToken?: string; } /** * @public */ -export interface ResumeWorkflowRunResponse { +export interface ListDataQualityStatisticAnnotationsResponse { /** - *The new ID assigned to the resumed workflow run. Each resume of a workflow run will have a new run ID.
+ *A list of StatisticAnnotation
applied to the Statistic
A list of the node IDs for the nodes that were actually restarted.
+ *A pagination token to retrieve the next set of results.
* @public */ - NodeIds?: string[]; + NextToken?: string; } /** * @public */ -export interface RunStatementRequest { +export interface ListDataQualityStatisticsRequest { /** - *The Session Id of the statement to be run.
+ *The Statistic ID.
* @public */ - SessionId: string | undefined; + StatisticId?: string; /** - *The statement code to be run.
+ *The Profile ID.
* @public */ - Code: string | undefined; + ProfileId?: string; /** - *The origin of the request.
+ *A timestamp filter.
* @public */ - RequestOrigin?: string; -} + TimestampFilter?: TimestampFilter; -/** - * @public - */ -export interface RunStatementResponse { /** - *Returns the Id of the statement that was run.
+ *The maximum number of results to return in this request.
* @public */ - Id?: number; + MaxResults?: number; + + /** + *A pagination token to request the next page of results.
+ * @public + */ + NextToken?: string; } /** * @public * @enum */ -export const Comparator = { - EQUALS: "EQUALS", - GREATER_THAN: "GREATER_THAN", - GREATER_THAN_EQUALS: "GREATER_THAN_EQUALS", - LESS_THAN: "LESS_THAN", - LESS_THAN_EQUALS: "LESS_THAN_EQUALS", +export const StatisticEvaluationLevel = { + COLUMN: "Column", + DATASET: "Dataset", + MULTICOLUMN: "Multicolumn", } as const; /** * @public */ -export type Comparator = (typeof Comparator)[keyof typeof Comparator]; +export type StatisticEvaluationLevel = (typeof StatisticEvaluationLevel)[keyof typeof StatisticEvaluationLevel]; /** - *Defines a property predicate.
+ *A run identifier.
* @public */ -export interface PropertyPredicate { - /** - *The key of the property.
- * @public - */ - Key?: string; - +export interface RunIdentifier { /** - *The value of the property.
+ *The Run ID.
* @public */ - Value?: string; + RunId?: string; /** - *The comparator used to compare this property to others.
+ *The Job Run ID.
* @public */ - Comparator?: Comparator; + JobRunId?: string; } /** + *Summary information about a statistic.
* @public - * @enum */ -export const Sort = { - ASCENDING: "ASC", - DESCENDING: "DESC", -} as const; +export interface StatisticSummary { + /** + *The Statistic ID.
+ * @public + */ + StatisticId?: string; -/** - * @public - */ -export type Sort = (typeof Sort)[keyof typeof Sort]; + /** + *The Profile ID.
+ * @public + */ + ProfileId?: string; -/** - *Specifies a field to sort by and a sort order.
- * @public - */ -export interface SortCriterion { /** - *The name of the field on which to sort.
+ *The Run Identifier
* @public */ - FieldName?: string; + RunIdentifier?: RunIdentifier; /** - *An ascending or descending sort.
+ *The name of the statistic.
* @public */ - Sort?: Sort; + StatisticName?: string; + + /** + *The value of the statistic.
+ * @public + */ + DoubleValue?: number; + + /** + *The evaluation level of the statistic. Possible values: Dataset
, Column
, Multicolumn
.
The list of columns referenced by the statistic.
+ * @public + */ + ColumnsReferenced?: string[]; + + /** + *The list of datasets referenced by the statistic.
+ * @public + */ + ReferencedDatasets?: string[]; + + /** + *A StatisticPropertiesMap
, which contains a NameString
and DescriptionString
+ *
The timestamp when the statistic was recorded.
+ * @public + */ + RecordedOn?: Date; + + /** + *The inclusion annotation for the statistic.
+ * @public + */ + InclusionAnnotation?: TimestampedInclusionAnnotation; } /** * @public */ -export interface SearchTablesRequest { - /** - *A unique identifier, consisting of
- * account_id
- *
.
A StatisticSummaryList
.
A continuation token, included if this is a continuation call.
+ *A pagination token to request the next page of results.
* @public */ NextToken?: string; +} +/** + * @public + */ +export interface ListDevEndpointsRequest { /** - *A list of key-value pairs, and a comparator used to filter the search results. Returns all entities matching the predicate.
- *The Comparator
member of the PropertyPredicate
struct is used only for time fields, and can be omitted for other field types. Also, when comparing string values, such as when Key=Name
, a fuzzy match algorithm is used. The Key
field (for example, the value of the Name
field) is split on certain punctuation characters, for example, -, :, #, etc. into tokens. Then each token is exact-match compared with the Value
member of PropertyPredicate
. For example, if Key=Name
and Value=link
, tables named customer-link
and xx-link-yy
are returned, but xxlinkyy
is not returned.
A continuation token, if this is a continuation request.
* @public */ - Filters?: PropertyPredicate[]; + NextToken?: string; /** - *A string used for a text search.
- *Specifying a value in quotes filters based on an exact match to the value.
+ *The maximum size of a list to return.
* @public */ - SearchText?: string; + MaxResults?: number; /** - *A list of criteria for sorting the results by a field name, in an ascending or descending order.
+ *Specifies to return only these tagged resources.
* @public */ - SortCriteria?: SortCriterion[]; + Tags?: RecordThe maximum number of tables to return in a single response.
+ *The names of all the DevEndpoint
s in the account, or the
+ * DevEndpoint
s with the specified tags.
Allows you to specify that you want to search the tables shared with your account. The allowable values are FOREIGN
or ALL
.
If set to FOREIGN
, will search the tables shared with your account.
If set to ALL
, will search the tables shared with your account, as well as the tables in yor local account.
A continuation token, if the returned list does not contain the + * last metric available.
* @public */ - ResourceShareType?: ResourceShareType; + NextToken?: string; } /** * @public */ -export interface SearchTablesResponse { +export interface ListJobsRequest { /** - *A continuation token, present if the current list segment is not the last.
+ *A continuation token, if this is a continuation request.
* @public */ NextToken?: string; /** - *A list of the requested Table
objects. The SearchTables
response returns only the tables that you have access to.
The maximum size of a list to return.
* @public */ - TableList?: Table[]; + MaxResults?: number; + + /** + *Specifies to return only these tagged resources.
+ * @public + */ + Tags?: RecordThe blueprint is in an invalid state to perform a requested operation.
* @public */ -export class IllegalBlueprintStateException extends __BaseException { - readonly name: "IllegalBlueprintStateException" = "IllegalBlueprintStateException"; - readonly $fault: "client" = "client"; +export interface ListJobsResponse { /** - *A message describing the problem.
+ *The names of all jobs in the account, or the jobs with the specified tags.
* @public */ - Message?: string; + JobNames?: string[]; + /** - * @internal + *A continuation token, if the returned list does not contain the + * last metric available.
+ * @public */ - constructor(opts: __ExceptionOptionTypeThe name of the blueprint.
+ *A continuation token, if this is a continuation request.
* @public */ - BlueprintName: string | undefined; + NextToken?: string; /** - *Specifies the parameters as a BlueprintParameters
object.
The maximum size of a list to return.
* @public */ - Parameters?: string; + MaxResults?: number; /** - *Specifies the IAM role used to create the workflow.
+ *A TransformFilterCriteria
used to filter the machine learning transforms.
The run ID for this blueprint run.
+ *A TransformSortCriteria
used to sort the machine learning transforms.
Specifies to return only these tagged resources.
+ * @public + */ + Tags?: RecordAn exception thrown when you try to start another job while running a column stats generation job.
* @public */ -export class ColumnStatisticsTaskRunningException extends __BaseException { - readonly name: "ColumnStatisticsTaskRunningException" = "ColumnStatisticsTaskRunningException"; - readonly $fault: "client" = "client"; +export interface ListMLTransformsResponse { /** - *A message describing the problem.
+ *The identifiers of all the machine learning transforms in the account, or the + * machine learning transforms with the specified tags.
* @public */ - Message?: string; + TransformIds: string[] | undefined; + /** - * @internal + *A continuation token, if the returned list does not contain the + * last metric available.
+ * @public */ - constructor(opts: __ExceptionOptionTypeThe name of the database where the table resides.
+ *Maximum number of results required per page. If the value is not supplied, this will be defaulted to 25 per page.
* @public */ - DatabaseName: string | undefined; + MaxResults?: number; /** - *The name of the table to generate statistics.
+ *A continuation token, if this is a continuation call.
* @public */ - TableName: string | undefined; + NextToken?: string; +} +/** + *A structure containing the details for a registry.
+ * @public + */ +export interface RegistryListItem { /** - *A list of the column names to generate statistics. If none is supplied, all column names for the table will be used by default.
+ *The name of the registry.
* @public */ - ColumnNameList?: string[]; + RegistryName?: string; /** - *The IAM role that the service assumes to generate statistics.
+ *The Amazon Resource Name (ARN) of the registry.
* @public */ - Role: string | undefined; + RegistryArn?: string; /** - *The percentage of rows used to generate statistics. If none is supplied, the entire table will be used to generate stats.
+ *A description of the registry.
* @public */ - SampleSize?: number; + Description?: string; /** - *The ID of the Data Catalog where the table reside. If none is supplied, the Amazon Web Services account ID is used by default.
+ *The status of the registry.
* @public */ - CatalogID?: string; + Status?: RegistryStatus; /** - *Name of the security configuration that is used to encrypt CloudWatch logs for the column stats task run.
+ *The data the registry was created.
* @public */ - SecurityConfiguration?: string; -} + CreatedTime?: string; -/** - * @public - */ -export interface StartColumnStatisticsTaskRunResponse { /** - *The identifier for the column statistics task run.
+ *The date the registry was updated.
* @public */ - ColumnStatisticsTaskRunId?: string; + UpdatedTime?: string; } /** * @public */ -export interface StartCrawlerRequest { +export interface ListRegistriesResponse { /** - *Name of the crawler to start.
+ *An array of RegistryDetailedListItem
objects containing minimal details of each registry.
There is no applicable schedule.
- * @public - */ -export class NoScheduleException extends __BaseException { - readonly name: "NoScheduleException" = "NoScheduleException"; - readonly $fault: "client" = "client"; /** - *A message describing the problem.
+ *A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.
* @public */ - Message?: string; - /** - * @internal - */ - constructor(opts: __ExceptionOptionTypeThe specified scheduler is already running.
* @public */ -export class SchedulerRunningException extends __BaseException { - readonly name: "SchedulerRunningException" = "SchedulerRunningException"; - readonly $fault: "client" = "client"; +export interface ListSchemasInput { /** - *A message describing the problem.
+ *A wrapper structure that may contain the registry name and Amazon Resource Name (ARN).
* @public */ - Message?: string; + RegistryId?: RegistryId; + /** - * @internal + *Maximum number of results required per page. If the value is not supplied, this will be defaulted to 25 per page.
+ * @public */ - constructor(opts: __ExceptionOptionTypeName of the crawler to schedule.
+ *A continuation token, if this is a continuation call.
* @public */ - CrawlerName: string | undefined; + NextToken?: string; } /** + *An object that contains minimal details for a schema.
* @public */ -export interface StartCrawlerScheduleResponse {} +export interface SchemaListItem { + /** + *the name of the registry where the schema resides.
+ * @public + */ + RegistryName?: string; -/** - * @public - */ -export interface StartDataQualityRuleRecommendationRunRequest { /** - *The data source (Glue table) associated with this run.
+ *The name of the schema.
* @public */ - DataSource: DataSource | undefined; + SchemaName?: string; /** - *An IAM role supplied to encrypt the results of the run.
+ *The Amazon Resource Name (ARN) for the schema.
* @public */ - Role: string | undefined; + SchemaArn?: string; /** - *The number of G.1X
workers to be used in the run. The default is 5.
A description for the schema.
* @public */ - NumberOfWorkers?: number; + Description?: string; /** - *The timeout for a run in minutes. This is the maximum time that a run can consume resources before it is terminated and enters TIMEOUT
status. The default is 2,880 minutes (48 hours).
The status of the schema.
* @public */ - Timeout?: number; + SchemaStatus?: SchemaStatus; /** - *A name for the ruleset.
+ *The date and time that a schema was created.
* @public */ - CreatedRulesetName?: string; + CreatedTime?: string; /** - *Used for idempotency and is recommended to be set to a random ID (such as a UUID) to avoid creating or starting multiple instances of the same resource.
+ *The date and time that a schema was updated.
* @public */ - ClientToken?: string; + UpdatedTime?: string; } /** * @public */ -export interface StartDataQualityRuleRecommendationRunResponse { +export interface ListSchemasResponse { /** - *The unique run identifier associated with this run.
+ *An array of SchemaListItem
objects containing details of each schema.
The data source (Glue table) associated with this run.
+ *A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.
* @public */ - DataSource: DataSource | undefined; + NextToken?: string; +} +/** + * @public + */ +export interface ListSchemaVersionsInput { /** - *An IAM role supplied to encrypt the results of the run.
+ *This is a wrapper structure to contain schema identity fields. The structure contains:
+ *SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. Either SchemaArn
or SchemaName
and RegistryName
has to be provided.
SchemaId$SchemaName: The name of the schema. Either SchemaArn
or SchemaName
and RegistryName
has to be provided.
The number of G.1X
workers to be used in the run. The default is 5.
Maximum number of results required per page. If the value is not supplied, this will be defaulted to 25 per page.
* @public */ - NumberOfWorkers?: number; + MaxResults?: number; /** - *The timeout for a run in minutes. This is the maximum time that a run can consume resources before it is terminated and enters TIMEOUT
status. The default is 2,880 minutes (48 hours).
A continuation token, if this is a continuation call.
* @public */ - Timeout?: number; + NextToken?: string; +} +/** + *An object containing the details about a schema version.
+ * @public + */ +export interface SchemaVersionListItem { /** - *Used for idempotency and is recommended to be set to a random ID (such as a UUID) to avoid creating or starting multiple instances of the same resource.
+ *The Amazon Resource Name (ARN) of the schema.
* @public */ - ClientToken?: string; + SchemaArn?: string; /** - *Additional run options you can specify for an evaluation run.
+ *The unique identifier of the schema version.
* @public */ - AdditionalRunOptions?: DataQualityEvaluationRunAdditionalRunOptions; + SchemaVersionId?: string; /** - *A list of ruleset names.
+ *The version number of the schema.
* @public */ - RulesetNames: string[] | undefined; + VersionNumber?: number; /** - *A map of reference strings to additional data sources you can specify for an evaluation run.
+ *The status of the schema version.
* @public */ - AdditionalDataSources?: RecordThe unique run identifier associated with this run.
+ *The date and time the schema version was created.
* @public */ - RunId?: string; + CreatedTime?: string; } /** * @public */ -export interface StartExportLabelsTaskRunRequest { - /** - *The unique identifier of the machine learning transform.
- * @public - */ - TransformId: string | undefined; - +export interface ListSchemaVersionsResponse { /** - *The Amazon S3 path where you export the labels.
+ *An array of SchemaVersionList
objects containing details of each schema version.
The unique identifier for the task run.
+ *A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.
* @public */ - TaskRunId?: string; + NextToken?: string; } /** * @public */ -export interface StartImportLabelsTaskRunRequest { +export interface ListSessionsRequest { /** - *The unique identifier of the machine learning transform.
+ *The token for the next set of results, or null if there are no more result.
* @public */ - TransformId: string | undefined; + NextToken?: string; /** - *The Amazon Simple Storage Service (Amazon S3) path from where you import the - * labels.
+ *The maximum number of results.
* @public */ - InputS3Path: string | undefined; + MaxResults?: number; /** - *Indicates whether to overwrite your existing labels.
+ *Tags belonging to the session.
* @public */ - ReplaceAllLabels?: boolean; -} + Tags?: RecordThe unique identifier for the task run.
+ *The origin of the request.
* @public */ - TaskRunId?: string; + RequestOrigin?: string; } /** * @public */ -export interface StartJobRunRequest { +export interface ListSessionsResponse { /** - *The name of the job definition to use.
+ *Returns the ID of the session.
* @public */ - JobName: string | undefined; + Ids?: string[]; /** - *The ID of a previous JobRun
to retry.
Returns the session object.
* @public */ - JobRunId?: string; + Sessions?: Session[]; /** - *The job arguments associated with this run. For this job run, they replace the default - * arguments set in the job definition itself.
- *You can specify arguments here that your own job-execution script - * consumes, as well as arguments that Glue itself consumes.
- *Job arguments may be logged. Do not pass plaintext secrets as arguments. Retrieve secrets - * from a Glue Connection, Secrets Manager or other secret management - * mechanism if you intend to keep them within the Job.
- *For information about how to specify and consume your own Job arguments, see the Calling Glue APIs in Python topic in the developer guide.
- *For information about the arguments you can provide to this field when configuring Spark jobs, - * see the Special Parameters Used by Glue topic in the developer guide.
- *For information about the arguments you can provide to this field when configuring Ray - * jobs, see Using - * job parameters in Ray jobs in the developer guide.
+ *The token for the next set of results, or null if there are no more result.
* @public */ - Arguments?: RecordThis field is deprecated. Use MaxCapacity
instead.
The number of Glue data processing units (DPUs) to allocate to this JobRun. - * You can allocate a minimum of 2 DPUs; the default is 10. A DPU is a relative measure - * of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. - * For more information, see the Glue - * pricing page.
+ *The Session ID of the statements.
* @public */ - AllocatedCapacity?: number; + SessionId: string | undefined; /** - *The JobRun
timeout in minutes. This is the maximum time that a job run can
- * consume resources before it is terminated and enters TIMEOUT
status. This value overrides the timeout value set in the parent job.
Streaming jobs must have timeout values less than 7 days or 10080 minutes. When the value is left blank, the job will be restarted after 7 days based if you have not setup a maintenance window. If you have setup maintenance window, it will be restarted during the maintenance window after 7 days.
+ *The origin of the request to list statements.
* @public */ - Timeout?: number; + RequestOrigin?: string; /** - *For Glue version 1.0 or earlier jobs, using the standard worker type, the number of - * Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is - * a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB - * of memory. For more information, see the - * Glue pricing page.
- *For Glue version 2.0+ jobs, you cannot specify a Maximum capacity
.
- * Instead, you should specify a Worker type
and the Number of workers
.
Do not set MaxCapacity
if using WorkerType
and NumberOfWorkers
.
The value that can be allocated for MaxCapacity
depends on whether you are
- * running a Python shell job, an Apache Spark ETL job, or an Apache Spark streaming ETL
- * job:
When you specify a Python shell job (JobCommand.Name
="pythonshell"), you can
- * allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.
When you specify an Apache Spark ETL job (JobCommand.Name
="glueetl") or Apache
- * Spark streaming ETL job (JobCommand.Name
="gluestreaming"), you can allocate from 2 to 100 DPUs.
- * The default is 10 DPUs. This job type cannot have a fractional DPU allocation.
A continuation token, if this is a continuation call.
* @public */ - MaxCapacity?: number; + NextToken?: string; +} +/** + * @public + */ +export interface ListStatementsResponse { /** - *The name of the SecurityConfiguration
structure to be used with this job
- * run.
Returns the list of statements.
* @public */ - SecurityConfiguration?: string; + Statements?: Statement[]; /** - *Specifies configuration properties of a job run notification.
+ *A continuation token, if not all statements have yet been returned.
* @public - */ - NotificationProperty?: NotificationProperty; - - /** - *The type of predefined worker that is allocated when a job runs. Accepts a value of - * G.1X, G.2X, G.4X, G.8X or G.025X for Spark jobs. Accepts the value Z.2X for Ray jobs.
- *For the G.1X
worker type, each worker maps to 1 DPU (4 vCPUs, 16 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, to offers a scalable and cost effective way to run most jobs.
For the G.2X
worker type, each worker maps to 2 DPU (8 vCPUs, 32 GB of memory) with 128GB disk (approximately 77GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, to offers a scalable and cost effective way to run most jobs.
For the G.4X
worker type, each worker maps to 4 DPU (16 vCPUs, 64 GB of memory) with 256GB disk (approximately 235GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs in the following Amazon Web Services Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm).
For the G.8X
worker type, each worker maps to 8 DPU (32 vCPUs, 128 GB of memory) with 512GB disk (approximately 487GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs, in the same Amazon Web Services Regions as supported for the G.4X
worker type.
For the G.025X
worker type, each worker maps to 0.25 DPU (2 vCPUs, 4 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for low volume streaming jobs. This worker type is only available for Glue version 3.0 streaming jobs.
For the Z.2X
worker type, each worker maps to 2 M-DPU (8 vCPUs, 64 GB of memory) with 128 GB disk (approximately 120GB free), and provides up to 8 Ray workers based on the autoscaler.
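The job-run parameters described above (run-time arguments, worker type, worker count, timeout) are all passed on the `StartJobRun` request. A minimal sketch with `StartJobRunCommand`; the job name and argument values are placeholders and not part of this change:

```ts
import { GlueClient, StartJobRunCommand } from "@aws-sdk/client-glue";

const client = new GlueClient({});

// Arguments supplied here override the default arguments set on the job itself.
const { JobRunId } = await client.send(
  new StartJobRunCommand({
    JobName: "my-etl-job",                      // placeholder job name
    Arguments: { "--enable-metrics": "true" },  // placeholder job argument
    WorkerType: "G.1X",
    NumberOfWorkers: 10,
    Timeout: 120, // minutes
  })
);
console.log(JobRunId);
```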
The Catalog ID of the table.
* @public */ - WorkerType?: WorkerType; + CatalogId: string | undefined; /** - *The number of workers of a defined workerType
that are allocated when a job runs.
The name of the database in the catalog in which the table resides.
* @public */ - NumberOfWorkers?: number; + DatabaseName: string | undefined; /** - *Indicates whether the job is run with a standard or flexible execution class. The standard execution-class is ideal for time-sensitive workloads that require fast job startup and dedicated resources.
- *The flexible execution class is appropriate for time-insensitive jobs whose start and completion times may vary.
- *Only jobs with Glue version 3.0 and above and command type glueetl
will be allowed to set ExecutionClass
to FLEX
. The flexible execution class is available for Spark jobs.
The name of the table.
* @public */ - ExecutionClass?: ExecutionClass; -} + TableName: string | undefined; -/** - * @public - */ -export interface StartJobRunResponse { /** - *The ID assigned to this job run.
+ *The type of table optimizer. Currently, the only valid value is compaction
.
The machine learning transform is not ready to run.
- * @public - */ -export class MLTransformNotReadyException extends __BaseException { - readonly name: "MLTransformNotReadyException" = "MLTransformNotReadyException"; - readonly $fault: "client" = "client"; /** - *A message describing the problem.
+ *The maximum number of optimizer runs to return on each call.
* @public */ - Message?: string; + MaxResults?: number; + /** - * @internal + *A continuation token, if this is a continuation call.
+ * @public */ - constructor(opts: __ExceptionOptionTypeThe unique identifier of the machine learning transform.
+ *The Catalog ID of the table.
* @public */ - TransformId: string | undefined; -} + CatalogId?: string; -/** - * @public - */ -export interface StartMLEvaluationTaskRunResponse { /** - *The unique identifier associated with this run.
+ *The name of the database in the catalog in which the table resides.
* @public */ - TaskRunId?: string; -} + DatabaseName?: string; -/** - * @public - */ -export interface StartMLLabelingSetGenerationTaskRunRequest { /** - *The unique identifier of the machine learning transform.
+ *The name of the table.
* @public */ - TransformId: string | undefined; + TableName?: string; /** - *The Amazon Simple Storage Service (Amazon S3) path where you generate the labeling - * set.
+ *A continuation token for paginating the returned list of optimizer runs, returned if the current segment of the list is not the last.
* @public */ - OutputS3Path: string | undefined; -} + NextToken?: string; -/** - * @public - */ -export interface StartMLLabelingSetGenerationTaskRunResponse { /** - *The unique run identifier that is associated with this task run.
+ *A list of the optimizer runs associated with a table.
* @public */ - TaskRunId?: string; + TableOptimizerRuns?: TableOptimizerRun[]; } /** * @public */ -export interface StartTriggerRequest { +export interface ListTriggersRequest { /** - *The name of the trigger to start.
+ *A continuation token, if this is a continuation request.
* @public */ - Name: string | undefined; -} + NextToken?: string; -/** - * @public - */ -export interface StartTriggerResponse { /** - *The name of the trigger that was started.
+ *The name of the job for which to retrieve triggers. The trigger that can start this job + * is returned. If there is no such trigger, all triggers are returned.
* @public */ - Name?: string; -} + DependentJobName?: string; -/** - * @public - */ -export interface StartWorkflowRunRequest { /** - *The name of the workflow to start.
+ *The maximum size of a list to return.
* @public */ - Name: string | undefined; + MaxResults?: number; /** - *The workflow run properties for the new workflow run.
+ *Specifies to return only these tagged resources.
* @public */ - RunProperties?: RecordAn Id for the new run.
+ *The names of all triggers in the account, or the triggers with the specified tags.
* @public */ - RunId?: string; + TriggerNames?: string[]; + + /** + *A continuation token, if the returned list does not contain the + * last metric available.
+ * @public + */ + NextToken?: string; } /** - *An exception thrown when you try to stop a task run when there is no task running.
* @public */ -export class ColumnStatisticsTaskNotRunningException extends __BaseException { - readonly name: "ColumnStatisticsTaskNotRunningException" = "ColumnStatisticsTaskNotRunningException"; - readonly $fault: "client" = "client"; +export interface ListUsageProfilesRequest { /** - *A message describing the problem.
+ *A continuation token, included if this is a continuation call.
* @public */ - Message?: string; + NextToken?: string; + /** - * @internal + *The maximum number of usage profiles to return in a single response.
+ * @public */ - constructor(opts: __ExceptionOptionTypeAn exception thrown when you try to stop a task run.
+ *Describes an Glue usage profile.
* @public */ -export class ColumnStatisticsTaskStoppingException extends __BaseException { - readonly name: "ColumnStatisticsTaskStoppingException" = "ColumnStatisticsTaskStoppingException"; - readonly $fault: "client" = "client"; +export interface UsageProfileDefinition { /** - *A message describing the problem.
+ *The name of the usage profile.
* @public */ - Message?: string; + Name?: string; + /** - * @internal + *A description of the usage profile.
+ * @public */ - constructor(opts: __ExceptionOptionTypeThe name of the database where the table resides.
+ *The date and time when the usage profile was created.
* @public */ - DatabaseName: string | undefined; + CreatedOn?: Date; /** - *The name of the table.
+ *The date and time when the usage profile was last modified.
* @public */ - TableName: string | undefined; + LastModifiedOn?: Date; } /** * @public */ -export interface StopColumnStatisticsTaskRunResponse {} +export interface ListUsageProfilesResponse { + /** + *A list of usage profile (UsageProfileDefinition
) objects.
A continuation token, present if the current list segment is not the last.
+ * @public + */ + NextToken?: string; +} /** - *The specified crawler is not running.
* @public */ -export class CrawlerNotRunningException extends __BaseException { - readonly name: "CrawlerNotRunningException" = "CrawlerNotRunningException"; - readonly $fault: "client" = "client"; +export interface ListWorkflowsRequest { /** - *A message describing the problem.
+ *A continuation token, if this is a continuation request.
* @public */ - Message?: string; + NextToken?: string; + /** - * @internal + *The maximum size of a list to return.
+ * @public */ - constructor(opts: __ExceptionOptionTypeThe specified crawler is stopping.
* @public */ -export class CrawlerStoppingException extends __BaseException { - readonly name: "CrawlerStoppingException" = "CrawlerStoppingException"; - readonly $fault: "client" = "client"; +export interface ListWorkflowsResponse { /** - *A message describing the problem.
+ *List of names of workflows in the account.
* @public */ - Message?: string; + Workflows?: string[]; + /** - * @internal - */ - constructor(opts: __ExceptionOptionTypeA continuation token, if not all workflow names have been returned.
+ * @public + */ + NextToken?: string; } /** * @public */ -export interface StopCrawlerRequest { +export interface PutDataCatalogEncryptionSettingsRequest { /** - *Name of the crawler to stop.
+ *The ID of the Data Catalog to set the security configuration for. If none is provided, the + * Amazon Web Services account ID is used by default.
* @public */ - Name: string | undefined; + CatalogId?: string; + + /** + *The security configuration to set.
+ * @public + */ + DataCatalogEncryptionSettings: DataCatalogEncryptionSettings | undefined; } /** * @public */ -export interface StopCrawlerResponse {} +export interface PutDataCatalogEncryptionSettingsResponse {} /** - *The specified scheduler is not running.
* @public */ -export class SchedulerNotRunningException extends __BaseException { - readonly name: "SchedulerNotRunningException" = "SchedulerNotRunningException"; - readonly $fault: "client" = "client"; +export interface PutDataQualityProfileAnnotationRequest { /** - *A message describing the problem.
+ *The ID of the data quality monitoring profile to annotate.
* @public */ - Message?: string; + ProfileId: string | undefined; + /** - * @internal + *The inclusion annotation value to apply to the profile.
+ * @public */ - constructor(opts: __ExceptionOptionTypeLeft blank.
* @public */ -export interface StopCrawlerScheduleRequest { - /** - *Name of the crawler whose schedule state to set.
- * @public - */ - CrawlerName: string | undefined; -} +export interface PutDataQualityProfileAnnotationResponse {} /** * @public + * @enum */ -export interface StopCrawlerScheduleResponse {} +export const EnableHybridValues = { + FALSE: "FALSE", + TRUE: "TRUE", +} as const; /** * @public */ -export interface StopSessionRequest { +export type EnableHybridValues = (typeof EnableHybridValues)[keyof typeof EnableHybridValues]; + +/** + * @public + * @enum + */ +export const ExistCondition = { + MUST_EXIST: "MUST_EXIST", + NONE: "NONE", + NOT_EXIST: "NOT_EXIST", +} as const; + +/** + * @public + */ +export type ExistCondition = (typeof ExistCondition)[keyof typeof ExistCondition]; + +/** + * @public + */ +export interface PutResourcePolicyRequest { /** - *The ID of the session to be stopped.
+ *Contains the policy document to set, in JSON format.
* @public */ - Id: string | undefined; + PolicyInJson: string | undefined; /** - *The origin of the request.
+ *Do not use. For internal use only.
* @public */ - RequestOrigin?: string; -} + ResourceArn?: string; -/** - * @public - */ -export interface StopSessionResponse { /** - *Returns the Id of the stopped session.
+ *The hash value returned when the previous policy was set using
+ * PutResourcePolicy
. Its purpose is to prevent concurrent modifications of a
+ * policy. Do not use this parameter if no previous policy has been set.
The name of the trigger to stop.
+ *A value of MUST_EXIST
is used to update a policy. A value of
+ * NOT_EXIST
is used to create a new policy. If a value of NONE
or a
+ * null value is used, the call does not depend on the existence of a policy.
If 'TRUE'
, indicates that you are using both methods to grant cross-account
+ * access to Data Catalog resources:
By directly updating the resource policy with PutResourePolicy
+ *
By using the Grant permissions command on the Amazon Web Services Management Console.
+ *Must be set to 'TRUE'
if you have already used the Management Console to
+ * grant cross-account access, otherwise the call fails. Default is 'FALSE'.
The name of the trigger that was stopped.
+ *A hash of the policy that has just been set. This must + * be included in a subsequent call that overwrites or updates + * this policy.
* @public */ - Name?: string; + PolicyHash?: string; } /** + *A structure containing a key value pair for metadata.
* @public */ -export interface StopWorkflowRunRequest { +export interface MetadataKeyValuePair { /** - *The name of the workflow to stop.
+ *A metadata key.
* @public */ - Name: string | undefined; + MetadataKey?: string; /** - *The ID of the workflow run to stop.
+ *A metadata key’s corresponding value.
* @public */ - RunId: string | undefined; + MetadataValue?: string; } /** * @public */ -export interface StopWorkflowRunResponse {} +export interface PutSchemaVersionMetadataInput { + /** + *The unique ID for the schema.
+ * @public + */ + SchemaId?: SchemaId; -/** - * @public - */ -export interface TagResourceRequest { /** - *The ARN of the Glue resource to which to add the tags. For more - * information about Glue resource ARNs, see the Glue ARN string pattern.
+ *The version number of the schema.
* @public */ - ResourceArn: string | undefined; + SchemaVersionNumber?: SchemaVersionNumber; /** - *Tags to add to this resource.
+ *The unique version ID of the schema version.
* @public */ - TagsToAdd: RecordThe metadata key's corresponding value.
+ * @public + */ + MetadataKeyValue: MetadataKeyValuePair | undefined; } /** * @public */ -export interface TagResourceResponse {} +export interface PutSchemaVersionMetadataResponse { + /** + *The Amazon Resource Name (ARN) for the schema.
+ * @public + */ + SchemaArn?: string; -/** - * @public - */ -export interface UntagResourceRequest { /** - *The Amazon Resource Name (ARN) of the resource from which to remove the tags.
+ *The name for the schema.
* @public */ - ResourceArn: string | undefined; + SchemaName?: string; /** - *Tags to remove from this resource.
+ *The name for the registry.
* @public */ - TagsToRemove: string[] | undefined; -} + RegistryName?: string; -/** - * @public - */ -export interface UntagResourceResponse {} + /** + *The latest version of the schema.
+ * @public + */ + LatestVersion?: boolean; -/** - * @public - */ -export interface UpdateBlueprintRequest { /** - *The name of the blueprint.
+ *The version number of the schema.
* @public */ - Name: string | undefined; + VersionNumber?: number; /** - *A description of the blueprint.
+ *The unique version ID of the schema version.
* @public */ - Description?: string; + SchemaVersionId?: string; /** - *Specifies a path in Amazon S3 where the blueprint is published.
+ *The metadata key.
* @public */ - BlueprintLocation: string | undefined; + MetadataKey?: string; + + /** + *The value of the metadata key.
+ * @public + */ + MetadataValue?: string; } /** * @public */ -export interface UpdateBlueprintResponse { +export interface PutWorkflowRunPropertiesRequest { /** - *Returns the name of the blueprint that was updated.
+ *Name of the workflow which was run.
* @public */ - Name?: string; + Name: string | undefined; + + /** + *The ID of the workflow run for which the run properties should be updated.
+ * @public + */ + RunId: string | undefined; + + /** + *The properties to put for the specified run.
+ * @public + */ + RunProperties: RecordSpecifies a custom CSV classifier to be updated.
* @public */ -export interface UpdateCsvClassifierRequest { +export interface PutWorkflowRunPropertiesResponse {} + +/** + * @public + */ +export interface QuerySchemaVersionMetadataInput { /** - *The name of the classifier.
+ *A wrapper structure that may contain the schema name and Amazon Resource Name (ARN).
* @public */ - Name: string | undefined; + SchemaId?: SchemaId; /** - *A custom symbol to denote what separates each column entry in the row.
+ *The version number of the schema.
* @public */ - Delimiter?: string; + SchemaVersionNumber?: SchemaVersionNumber; /** - *A custom symbol to denote what combines content into a single column value. It must be - * different from the column delimiter.
+ *The unique version ID of the schema version.
+ * @public + */ + SchemaVersionId?: string; + + /** + *Search key-value pairs for metadata, if they are not provided all the metadata information will be fetched.
* @public */ - QuoteSymbol?: string; + MetadataList?: MetadataKeyValuePair[]; /** - *Indicates whether the CSV file contains a header.
+ *Maximum number of results required per page. If the value is not supplied, this will be defaulted to 25 per page.
* @public */ - ContainsHeader?: CsvHeaderOption; + MaxResults?: number; /** - *A list of strings representing column names.
+ *A continuation token, if this is a continuation call.
* @public */ - Header?: string[]; + NextToken?: string; +} +/** + *A structure containing other metadata for a schema version belonging to the same metadata key.
+ * @public + */ +export interface OtherMetadataValueListItem { /** - *Specifies not to trim values before identifying the type of column values. The default value is true.
+ *The metadata key’s corresponding value for the other metadata belonging to the same metadata key.
* @public */ - DisableValueTrimming?: boolean; + MetadataValue?: string; /** - *Enables the processing of files that contain only one column.
+ *The time at which the entry was created.
* @public */ - AllowSingleColumn?: boolean; + CreatedTime?: string; +} +/** + *A structure containing metadata information for a schema version.
+ * @public + */ +export interface MetadataInfo { /** - *Specifies the configuration of custom datatypes.
+ *The metadata key’s corresponding value.
* @public */ - CustomDatatypeConfigured?: boolean; + MetadataValue?: string; /** - *Specifies a list of supported custom datatypes.
+ *The time at which the entry was created.
* @public */ - CustomDatatypes?: string[]; + CreatedTime?: string; /** - *Sets the SerDe for processing CSV in the classifier, which will be applied in the Data Catalog. Valid values are OpenCSVSerDe
, LazySimpleSerDe
, and None
. You can specify the None
value when you want the crawler to do the detection.
Other metadata belonging to the same metadata key.
* @public */ - Serde?: CsvSerdeOption; + OtherMetadataValueList?: OtherMetadataValueListItem[]; } /** - *Specifies a grok classifier to update when passed to
- * UpdateClassifier
.
The name of the GrokClassifier
.
An identifier of the data format that the classifier matches, such as Twitter, JSON, Omniture logs, - * Amazon CloudWatch Logs, and so on.
+ *A map of a metadata key and associated values.
* @public */ - Classification?: string; + MetadataInfoMap?: RecordThe grok pattern used by this classifier.
+ *The unique version ID of the schema version.
* @public */ - GrokPattern?: string; + SchemaVersionId?: string; /** - *Optional custom grok patterns used by this classifier.
+ *A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.
* @public */ - CustomPatterns?: string; + NextToken?: string; } /** - *Specifies a JSON classifier to be updated.
* @public */ -export interface UpdateJsonClassifierRequest { +export interface RegisterSchemaVersionInput { /** - *The name of the classifier.
+ *This is a wrapper structure to contain schema identity fields. The structure contains:
+ *SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. Either SchemaArn
or SchemaName
and RegistryName
has to be provided.
SchemaId$SchemaName: The name of the schema. Either SchemaArn
or SchemaName
and RegistryName
has to be provided.
A JsonPath
string defining the JSON data for the classifier to classify.
- * Glue supports a subset of JsonPath, as described in Writing JsonPath Custom Classifiers.
The schema definition using the DataFormat
setting for the SchemaName
.
Specifies an XML classifier to be updated.
* @public */ -export interface UpdateXMLClassifierRequest { +export interface RegisterSchemaVersionResponse { /** - *The name of the classifier.
+ *The unique ID that represents the version of this schema.
* @public */ - Name: string | undefined; + SchemaVersionId?: string; /** - *An identifier of the data format that the classifier matches.
+ *The version of this schema (for sync flow only, in case this is the first version).
* @public */ - Classification?: string; + VersionNumber?: number; /** - *The XML tag designating the element that contains each record in an XML document being
- * parsed. This cannot identify a self-closing element (closed by />
). An empty
- * row element that contains only attributes can be parsed as long as it ends with a closing tag
- * (for example,
is okay, but
- *
is not).
The status of the schema version.
* @public */ - RowTag?: string; + Status?: SchemaVersionStatus; } /** * @public */ -export interface UpdateClassifierRequest { +export interface RemoveSchemaVersionMetadataInput { /** - *A GrokClassifier
object with updated fields.
A wrapper structure that may contain the schema name and Amazon Resource Name (ARN).
* @public */ - GrokClassifier?: UpdateGrokClassifierRequest; + SchemaId?: SchemaId; /** - *An XMLClassifier
object with updated fields.
The version number of the schema.
* @public */ - XMLClassifier?: UpdateXMLClassifierRequest; + SchemaVersionNumber?: SchemaVersionNumber; /** - *A JsonClassifier
object with updated fields.
The unique version ID of the schema version.
* @public */ - JsonClassifier?: UpdateJsonClassifierRequest; + SchemaVersionId?: string; /** - *A CsvClassifier
object with updated fields.
The value of the metadata key.
* @public */ - CsvClassifier?: UpdateCsvClassifierRequest; + MetadataKeyValue: MetadataKeyValuePair | undefined; } /** * @public */ -export interface UpdateClassifierResponse {} +export interface RemoveSchemaVersionMetadataResponse { + /** + *The Amazon Resource Name (ARN) of the schema.
+ * @public + */ + SchemaArn?: string; -/** - *There was a version conflict.
- * @public - */ -export class VersionMismatchException extends __BaseException { - readonly name: "VersionMismatchException" = "VersionMismatchException"; - readonly $fault: "client" = "client"; /** - *A message describing the problem.
+ *The name of the schema.
* @public */ - Message?: string; + SchemaName?: string; + /** - * @internal + *The name of the registry.
+ * @public */ - constructor(opts: __ExceptionOptionTypeThe ID of the Data Catalog where the partitions in question reside. - * If none is supplied, the Amazon Web Services account ID is used by default.
+ *The latest version of the schema.
* @public */ - CatalogId?: string; + LatestVersion?: boolean; /** - *The name of the catalog database where the partitions reside.
+ *The version number of the schema.
* @public */ - DatabaseName: string | undefined; + VersionNumber?: number; /** - *The name of the partitions' table.
+ *The version ID for the schema version.
* @public */ - TableName: string | undefined; + SchemaVersionId?: string; /** - *A list of partition values identifying the partition.
+ *The metadata key.
* @public */ - PartitionValues: string[] | undefined; + MetadataKey?: string; /** - *A list of the column statistics.
+ *The value of the metadata key.
* @public */ - ColumnStatisticsList: ColumnStatistics[] | undefined; + MetadataValue?: string; } /** - *Encapsulates a ColumnStatistics
object that failed and the reason for failure.
The ColumnStatistics
of the column.
The name of the job in question.
* @public */ - ColumnStatistics?: ColumnStatistics; + JobName: string | undefined; /** - *An error message with the reason for the failure of an operation.
+ *The unique run identifier associated with this job run.
* @public */ - Error?: ErrorDetail; + RunId?: string; } /** * @public */ -export interface UpdateColumnStatisticsForPartitionResponse { +export interface ResetJobBookmarkResponse { /** - *Error occurred during updating column statistics data.
+ *The reset bookmark entry.
* @public */ - Errors?: ColumnStatisticsError[]; + JobBookmarkEntry?: JobBookmarkEntry; } /** + *Too many jobs are being run concurrently.
* @public */ -export interface UpdateColumnStatisticsForTableRequest { +export class ConcurrentRunsExceededException extends __BaseException { + readonly name: "ConcurrentRunsExceededException" = "ConcurrentRunsExceededException"; + readonly $fault: "client" = "client"; /** - *The ID of the Data Catalog where the partitions in question reside. - * If none is supplied, the Amazon Web Services account ID is used by default.
+ *A message describing the problem.
* @public */ - CatalogId?: string; + Message?: string; + /** + * @internal + */ + constructor(opts: __ExceptionOptionTypeThe workflow is in an invalid state to perform a requested operation.
+ * @public + */ +export class IllegalWorkflowStateException extends __BaseException { + readonly name: "IllegalWorkflowStateException" = "IllegalWorkflowStateException"; + readonly $fault: "client" = "client"; + /** + *A message describing the problem.
+ * @public + */ + Message?: string; + /** + * @internal + */ + constructor(opts: __ExceptionOptionTypeThe name of the catalog database where the partitions reside.
+ *The name of the workflow to resume.
* @public */ - DatabaseName: string | undefined; + Name: string | undefined; /** - *The name of the partitions' table.
+ *The ID of the workflow run to resume.
* @public */ - TableName: string | undefined; + RunId: string | undefined; /** - *A list of the column statistics.
+ *A list of the node IDs for the nodes you want to restart. The nodes that are to be restarted must have a run attempt in the original run.
* @public */ - ColumnStatisticsList: ColumnStatistics[] | undefined; + NodeIds: string[] | undefined; } /** * @public */ -export interface UpdateColumnStatisticsForTableResponse { +export interface ResumeWorkflowRunResponse { /** - *List of ColumnStatisticsErrors.
+ *The new ID assigned to the resumed workflow run. Each resume of a workflow run will have a new run ID.
* @public */ - Errors?: ColumnStatisticsError[]; + RunId?: string; + + /** + *A list of the node IDs for the nodes that were actually restarted.
+ * @public + */ + NodeIds?: string[]; } /** * @public */ -export interface UpdateConnectionRequest { +export interface RunStatementRequest { /** - *The ID of the Data Catalog in which the connection resides. If none is provided, the Amazon Web Services - * account ID is used by default.
+ *The Session Id of the statement to be run.
* @public */ - CatalogId?: string; + SessionId: string | undefined; /** - *The name of the connection definition to update.
+ *The statement code to be run.
* @public */ - Name: string | undefined; + Code: string | undefined; /** - *A ConnectionInput
object that redefines the connection
- * in question.
The origin of the request.
* @public */ - ConnectionInput: ConnectionInput | undefined; + RequestOrigin?: string; } /** * @public */ -export interface UpdateConnectionResponse {} +export interface RunStatementResponse { + /** + *Returns the Id of the statement that was run.
+ * @public + */ + Id?: number; +} /** * @public + * @enum */ -export interface UpdateCrawlerRequest { - /** - *Name of the new crawler.
- * @public - */ - Name: string | undefined; +export const Comparator = { + EQUALS: "EQUALS", + GREATER_THAN: "GREATER_THAN", + GREATER_THAN_EQUALS: "GREATER_THAN_EQUALS", + LESS_THAN: "LESS_THAN", + LESS_THAN_EQUALS: "LESS_THAN_EQUALS", +} as const; - /** - *The IAM role or Amazon Resource Name (ARN) of an IAM role that is used by the new crawler - * to access customer resources.
- * @public - */ - Role?: string; +/** + * @public + */ +export type Comparator = (typeof Comparator)[keyof typeof Comparator]; +/** + *Defines a property predicate.
+ * @public + */ +export interface PropertyPredicate { /** - *The Glue database where results are stored, such as:
- * arn:aws:daylight:us-east-1::database/sometable/*
.
The key of the property.
* @public */ - DatabaseName?: string; + Key?: string; /** - *A description of the new crawler.
+ *The value of the property.
* @public */ - Description?: string; + Value?: string; /** - *A list of targets to crawl.
+ *The comparator used to compare this property to others.
* @public */ - Targets?: CrawlerTargets; + Comparator?: Comparator; +} + +/** + * @public + * @enum + */ +export const Sort = { + ASCENDING: "ASC", + DESCENDING: "DESC", +} as const; +/** + * @public + */ +export type Sort = (typeof Sort)[keyof typeof Sort]; + +/** + *Specifies a field to sort by and a sort order.
+ * @public + */ +export interface SortCriterion { /** - *A cron
expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers. For example, to run
- * something every day at 12:15 UTC, you would specify:
- * cron(15 12 * * ? *)
.
The name of the field on which to sort.
* @public */ - Schedule?: string; + FieldName?: string; /** - *A list of custom classifiers that the user - * has registered. By default, all built-in classifiers are included in a crawl, - * but these custom classifiers always override the default classifiers - * for a given classification.
+ *An ascending or descending sort.
* @public */ - Classifiers?: string[]; + Sort?: Sort; +} +/** + * @public + */ +export interface SearchTablesRequest { /** - *The table prefix used for catalog tables that are created.
+ *A unique identifier, consisting of
+ * account_id
+ *
.
The policy for the crawler's update and deletion behavior.
+ *A continuation token, included if this is a continuation call.
* @public */ - SchemaChangePolicy?: SchemaChangePolicy; + NextToken?: string; /** - *A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
+ *A list of key-value pairs, and a comparator used to filter the search results. Returns all entities matching the predicate.
+ *The Comparator
member of the PropertyPredicate
struct is used only for time fields, and can be omitted for other field types. Also, when comparing string values, such as when Key=Name
, a fuzzy match algorithm is used. The Key
field (for example, the value of the Name
field) is split on certain punctuation characters, for example, -, :, #, etc. into tokens. Then each token is exact-match compared with the Value
member of PropertyPredicate
. For example, if Key=Name
and Value=link
, tables named customer-link
and xx-link-yy
are returned, but xxlinkyy
is not returned.
Specifies data lineage configuration settings for the crawler.
+ *A string used for a text search.
+ *Specifying a value in quotes filters based on an exact match to the value.
* @public */ - LineageConfiguration?: LineageConfiguration; + SearchText?: string; /** - *Specifies Lake Formation configuration settings for the crawler.
+ *A list of criteria for sorting the results by a field name, in an ascending or descending order.
* @public */ - LakeFormationConfiguration?: LakeFormationConfiguration; + SortCriteria?: SortCriterion[]; /** - *Crawler configuration information. This versioned JSON string allows users - * to specify aspects of a crawler's behavior. - * For more information, see Setting crawler configuration options.
+ *The maximum number of tables to return in a single response.
* @public */ - Configuration?: string; + MaxResults?: number; /** - *The name of the SecurityConfiguration
structure to be used by this
- * crawler.
Allows you to specify that you want to search the tables shared with your account. The allowable values are FOREIGN
or ALL
.
If set to FOREIGN
, will search the tables shared with your account.
If set to ALL
, will search the tables shared with your account, as well as the tables in yor local account.
The name of the crawler whose schedule to update.
+ *A continuation token, present if the current list segment is not the last.
* @public */ - CrawlerName: string | undefined; + NextToken?: string; /** - *The updated cron
expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers. For example, to run
- * something every day at 12:15 UTC, you would specify:
- * cron(15 12 * * ? *)
.
A list of the requested Table
objects. The SearchTables
response returns only the tables that you have access to.
The blueprint is in an invalid state to perform a requested operation.
* @public */ -export interface UpdateCrawlerScheduleResponse {} - -/** - * @public - */ -export interface UpdateDatabaseRequest { - /** - *The ID of the Data Catalog in which the metadata database resides. If none is provided, - * the Amazon Web Services account ID is used by default.
- * @public - */ - CatalogId?: string; - +export class IllegalBlueprintStateException extends __BaseException { + readonly name: "IllegalBlueprintStateException" = "IllegalBlueprintStateException"; + readonly $fault: "client" = "client"; /** - *The name of the database to update in the catalog. For Hive - * compatibility, this is folded to lowercase.
+ *A message describing the problem.
* @public */ - Name: string | undefined; - + Message?: string; /** - *A DatabaseInput
object specifying the new definition
- * of the metadata database in the catalog.
The name of the data quality ruleset.
+ *The name of the blueprint.
* @public */ - Name: string | undefined; + BlueprintName: string | undefined; /** - *A description of the ruleset.
+ *Specifies the parameters as a BlueprintParameters
object.
A Data Quality Definition Language (DQDL) ruleset. For more information, see the Glue developer guide.
+ *Specifies the IAM role used to create the workflow.
* @public */ - Ruleset?: string; + RoleArn: string | undefined; } /** * @public */ -export interface UpdateDataQualityRulesetResponse { - /** - *The name of the data quality ruleset.
- * @public - */ - Name?: string; - - /** - *A description of the ruleset.
- * @public - */ - Description?: string; - +export interface StartBlueprintRunResponse { /** - *A Data Quality Definition Language (DQDL) ruleset. For more information, see the Glue developer guide.
+ *The run ID for this blueprint run.
* @public */ - Ruleset?: string; + RunId?: string; } /** - *Custom libraries to be loaded into a development endpoint.
+ *An exception thrown when you try to start another job while running a column stats generation job.
* @public */ -export interface DevEndpointCustomLibraries { +export class ColumnStatisticsTaskRunningException extends __BaseException { + readonly name: "ColumnStatisticsTaskRunningException" = "ColumnStatisticsTaskRunningException"; + readonly $fault: "client" = "client"; /** - *The paths to one or more Python libraries in an Amazon Simple Storage Service (Amazon S3)
- * bucket that should be loaded in your DevEndpoint
. Multiple values must be
- * complete paths separated by a comma.
You can only use pure Python libraries with a DevEndpoint
. Libraries that rely on
- * C extensions, such as the pandas Python data
- * analysis library, are not currently supported.
A message describing the problem.
* @public */ - ExtraPythonLibsS3Path?: string; - + Message?: string; /** - *The path to one or more Java .jar
files in an S3 bucket that should be loaded
- * in your DevEndpoint
.
You can only use pure Java/Scala libraries with a DevEndpoint
.
The name of the DevEndpoint
to be updated.
The name of the database where the table resides.
* @public */ - EndpointName: string | undefined; + DatabaseName: string | undefined; /** - *The public key for the DevEndpoint
to use.
The name of the table to generate statistics.
* @public */ - PublicKey?: string; + TableName: string | undefined; /** - *The list of public keys for the DevEndpoint
to use.
A list of the column names to generate statistics. If none is supplied, all column names for the table will be used by default.
* @public */ - AddPublicKeys?: string[]; + ColumnNameList?: string[]; /** - *The list of public keys to be deleted from the DevEndpoint
.
The IAM role that the service assumes to generate statistics.
* @public */ - DeletePublicKeys?: string[]; + Role: string | undefined; /** - *Custom Python or Java libraries to be loaded in the DevEndpoint
.
The percentage of rows used to generate statistics. If none is supplied, the entire table will be used to generate stats.
* @public */ - CustomLibraries?: DevEndpointCustomLibraries; + SampleSize?: number; /** - *
- * True
if the list of custom libraries to be loaded in the development endpoint
- * needs to be updated, or False
if otherwise.
The ID of the Data Catalog where the table reside. If none is supplied, the Amazon Web Services account ID is used by default.
* @public */ - UpdateEtlLibraries?: boolean; + CatalogID?: string; /** - *The list of argument keys to be deleted from the map of arguments used to configure the
- * DevEndpoint
.
Name of the security configuration that is used to encrypt CloudWatch logs for the column stats task run.
* @public */ - DeleteArguments?: string[]; + SecurityConfiguration?: string; +} +/** + * @public + */ +export interface StartColumnStatisticsTaskRunResponse { /** - *The map of arguments to add the map of arguments used to configure the
- * DevEndpoint
.
Valid arguments are:
- *
- * "--enable-glue-datacatalog": ""
- *
You can specify a version of Python support for development endpoints by using the Arguments
parameter in the CreateDevEndpoint
or UpdateDevEndpoint
APIs. If no arguments are provided, the version defaults to Python 2.
The identifier for the column statistics task run.
* @public */ - AddArguments?: RecordName of the crawler to start.
+ * @public + */ + Name: string | undefined; +} /** * @public */ -export interface UpdateJobResponse { +export interface StartCrawlerResponse {} + +/** + *There is no applicable schedule.
+ * @public + */ +export class NoScheduleException extends __BaseException { + readonly name: "NoScheduleException" = "NoScheduleException"; + readonly $fault: "client" = "client"; /** - *Returns the name of the updated job definition.
+ *A message describing the problem.
* @public */ - JobName?: string; + Message?: string; + /** + * @internal + */ + constructor(opts: __ExceptionOptionTypeThe specified scheduler is already running.
* @public */ -export interface UpdateJobFromSourceControlRequest { +export class SchedulerRunningException extends __BaseException { + readonly name: "SchedulerRunningException" = "SchedulerRunningException"; + readonly $fault: "client" = "client"; /** - *The name of the Glue job to be synchronized to or from the remote repository.
+ *A message describing the problem.
* @public */ - JobName?: string; + Message?: string; + /** + * @internal + */ + constructor(opts: __ExceptionOptionType- * The provider for the remote repository. Possible values: GITHUB, AWS_CODE_COMMIT, GITLAB, BITBUCKET. - *
+ *Name of the crawler to schedule.
* @public */ - Provider?: SourceControlProvider; + CrawlerName: string | undefined; +} + +/** + * @public + */ +export interface StartCrawlerScheduleResponse {} +/** + * @public + */ +export interface StartDataQualityRuleRecommendationRunRequest { /** - *The name of the remote repository that contains the job artifacts.
- * For BitBucket providers, RepositoryName
should include WorkspaceName
.
- * Use the format
.
- *
The data source (Glue table) associated with this run.
* @public */ - RepositoryName?: string; + DataSource: DataSource | undefined; /** - *The owner of the remote repository that contains the job artifacts.
+ *An IAM role supplied to encrypt the results of the run.
* @public */ - RepositoryOwner?: string; + Role: string | undefined; /** - *An optional branch in the remote repository.
+ *The number of G.1X
workers to be used in the run. The default is 5.
An optional folder in the remote repository.
+ *The timeout for a run in minutes. This is the maximum time that a run can consume resources before it is terminated and enters TIMEOUT
status. The default is 2,880 minutes (48 hours).
A commit ID for a commit in the remote repository.
+ *A name for the ruleset.
* @public */ - CommitId?: string; + CreatedRulesetName?: string; /** - *The type of authentication, which can be an authentication token stored in Amazon Web Services Secrets Manager, or a personal access token.
+ *The name of the security configuration created with the data quality encryption option.
* @public */ - AuthStrategy?: SourceControlAuthStrategy; + DataQualitySecurityConfiguration?: string; /** - *The value of the authorization token.
+ *Used for idempotency and is recommended to be set to a random ID (such as a UUID) to avoid creating or starting multiple instances of the same resource.
* @public */ - AuthToken?: string; + ClientToken?: string; } /** * @public */ -export interface UpdateJobFromSourceControlResponse { +export interface StartDataQualityRuleRecommendationRunResponse { /** - *The name of the Glue job.
+ *The unique run identifier associated with this run.
* @public */ - JobName?: string; + RunId?: string; } /** * @public */ -export interface UpdateMLTransformRequest { +export interface StartDataQualityRulesetEvaluationRunRequest { /** - *A unique identifier that was generated when the transform was created.
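As a usage sketch for the recommendation-run request documented above (the database, table, role ARN and ruleset name are placeholders), the ClientToken field can be filled with a random UUID to make retries idempotent, as the field description recommends:

```ts
import { GlueClient, StartDataQualityRuleRecommendationRunCommand } from "@aws-sdk/client-glue";
import { randomUUID } from "node:crypto";

const glue = new GlueClient({});

// Hypothetical table and IAM role; replace with real resources.
async function recommendRules() {
  const { RunId } = await glue.send(
    new StartDataQualityRuleRecommendationRunCommand({
      DataSource: { GlueTable: { DatabaseName: "sales_db", TableName: "orders" } },
      Role: "arn:aws:iam::123456789012:role/GlueDataQualityRole",
      NumberOfWorkers: 5, // default per the field documentation above
      CreatedRulesetName: "orders-recommended-rules",
      ClientToken: randomUUID(), // idempotency token
    })
  );
  return RunId;
}
```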
+ *The data source (Glue table) associated with this run.
* @public */ - TransformId: string | undefined; + DataSource: DataSource | undefined; /** - *The unique name that you gave the transform when you created it.
+ *An IAM role supplied to encrypt the results of the run.
* @public */ - Name?: string; + Role: string | undefined; /** - *A description of the transform. The default is an empty string.
+ *The number of G.1X
workers to be used in the run. The default is 5.
The configuration parameters that are specific to the transform type (algorithm) used. - * Conditionally dependent on the transform type.
+ *The timeout for a run in minutes. This is the maximum time that a run can consume resources before it is terminated and enters TIMEOUT
status. The default is 2,880 minutes (48 hours).
The name or Amazon Resource Name (ARN) of the IAM role with the required - * permissions.
+ *Used for idempotency and is recommended to be set to a random ID (such as a UUID) to avoid creating or starting multiple instances of the same resource.
* @public */ - Role?: string; + ClientToken?: string; /** - *This value determines which version of Glue this machine learning transform is compatible with. Glue 1.0 is recommended for most customers. If the value is not set, the Glue compatibility defaults to Glue 0.9. For more information, see Glue Versions in the developer guide.
+ *Additional run options you can specify for an evaluation run.
* @public */ - GlueVersion?: string; + AdditionalRunOptions?: DataQualityEvaluationRunAdditionalRunOptions; /** - *The number of Glue data processing units (DPUs) that are allocated to task runs for this transform. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of - * processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more - * information, see the Glue pricing - * page.
- *When the WorkerType
field is set to a value other than Standard
, the MaxCapacity
field is set automatically and becomes read-only.
A list of ruleset names.
* @public */ - MaxCapacity?: number; + RulesetNames: string[] | undefined; /** - *The type of predefined worker that is allocated when this task runs. Accepts a value of Standard, G.1X, or G.2X.
- *For the Standard
worker type, each worker provides 4 vCPU, 16 GB of memory and a 50GB disk, and 2 executors per worker.
For the G.1X
worker type, each worker provides 4 vCPU, 16 GB of memory and a 64GB disk, and 1 executor per worker.
For the G.2X
worker type, each worker provides 8 vCPU, 32 GB of memory and a 128GB disk, and 1 executor per worker.
A map of reference strings to additional data sources you can specify for an evaluation run.
* @public */ - WorkerType?: WorkerType; + AdditionalDataSources?: RecordThe number of workers of a defined workerType
that are allocated when this task runs.
The unique run identifier associated with this run.
* @public */ - NumberOfWorkers?: number; + RunId?: string; +} +/** + * @public + */ +export interface StartExportLabelsTaskRunRequest { /** - *The timeout for a task run for this transform in minutes. This is the maximum time that a task run for this transform can consume resources before it is terminated and enters TIMEOUT
status. The default is 2,880 minutes (48 hours).
The unique identifier of the machine learning transform.
* @public */ - Timeout?: number; + TransformId: string | undefined; /** - *The maximum number of times to retry a task for this transform after a task run fails.
+ *The Amazon S3 path where you export the labels.
* @public */ - MaxRetries?: number; + OutputS3Path: string | undefined; } /** * @public */ -export interface UpdateMLTransformResponse { +export interface StartExportLabelsTaskRunResponse { /** - *The unique identifier for the transform that was updated.
+ *The unique identifier for the task run.
* @public */ - TransformId?: string; + TaskRunId?: string; } /** * @public */ -export interface UpdatePartitionRequest { - /** - *The ID of the Data Catalog where the partition to be updated resides. If none is provided, - * the Amazon Web Services account ID is used by default.
- * @public - */ - CatalogId?: string; - +export interface StartImportLabelsTaskRunRequest { /** - *The name of the catalog database in which the table in question - * resides.
+ *The unique identifier of the machine learning transform.
* @public */ - DatabaseName: string | undefined; + TransformId: string | undefined; /** - *The name of the table in which the partition to be updated is located.
+ *The Amazon Simple Storage Service (Amazon S3) path from where you import the + * labels.
* @public */ - TableName: string | undefined; + InputS3Path: string | undefined; /** - *List of partition key values that define the partition to update.
+ *Indicates whether to overwrite your existing labels.
* @public */ - PartitionValueList: string[] | undefined; + ReplaceAllLabels?: boolean; +} +/** + * @public + */ +export interface StartImportLabelsTaskRunResponse { /** - *The new partition object to update the partition to.
- *The Values
property can't be changed. If you want to change the partition key values for a partition, delete and recreate the partition.
The unique identifier for the task run.
* @public */ - PartitionInput: PartitionInput | undefined; + TaskRunId?: string; } /** * @public */ -export interface UpdatePartitionResponse {} +export interface StartJobRunRequest { + /** + *The name of the job definition to use.
+ * @public + */ + JobName: string | undefined; -/** - * @public - */ -export interface UpdateRegistryInput { /** - *This is a wrapper structure that may contain the registry name and Amazon Resource Name (ARN).
+ *The ID of a previous JobRun
to retry.
A description of the registry. If description is not provided, this field will not be updated.
+ *The job arguments associated with this run. For this job run, they replace the default + * arguments set in the job definition itself.
+ *You can specify arguments here that your own job-execution script + * consumes, as well as arguments that Glue itself consumes.
+ *Job arguments may be logged. Do not pass plaintext secrets as arguments. Retrieve secrets + * from a Glue Connection, Secrets Manager or other secret management + * mechanism if you intend to keep them within the Job.
+ *For information about how to specify and consume your own Job arguments, see the Calling Glue APIs in Python topic in the developer guide.
+ *For information about the arguments you can provide to this field when configuring Spark jobs, + * see the Special Parameters Used by Glue topic in the developer guide.
+ *For information about the arguments you can provide to this field when configuring Ray + * jobs, see Using + * job parameters in Ray jobs in the developer guide.
* @public */ - Description: string | undefined; -} + Arguments?: RecordThe name of the updated registry.
+ * @deprecated + * + *This field is deprecated. Use MaxCapacity
instead.
The number of Glue data processing units (DPUs) to allocate to this JobRun. + * You can allocate a minimum of 2 DPUs; the default is 10. A DPU is a relative measure + * of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. + * For more information, see the Glue + * pricing page.
* @public */ - RegistryName?: string; + AllocatedCapacity?: number; /** - *The Amazon Resource name (ARN) of the updated registry.
+ *The JobRun
timeout in minutes. This is the maximum time that a job run can
+ * consume resources before it is terminated and enters TIMEOUT
status. This value overrides the timeout value set in the parent job.
Streaming jobs must have timeout values less than 7 days or 10080 minutes. When the value is left blank, the job will be restarted after 7 days based if you have not setup a maintenance window. If you have setup maintenance window, it will be restarted during the maintenance window after 7 days.
* @public */ - RegistryArn?: string; -} + Timeout?: number; -/** - * @public - */ -export interface UpdateSchemaInput { /** - *This is a wrapper structure to contain schema identity fields. The structure contains:
+ *For Glue version 1.0 or earlier jobs, using the standard worker type, the number of + * Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is + * a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB + * of memory. For more information, see the + * Glue pricing page.
+ *For Glue version 2.0+ jobs, you cannot specify a Maximum capacity
.
+ * Instead, you should specify a Worker type
and the Number of workers
.
Do not set MaxCapacity
if using WorkerType
and NumberOfWorkers
.
The value that can be allocated for MaxCapacity
depends on whether you are
+ * running a Python shell job, an Apache Spark ETL job, or an Apache Spark streaming ETL
+ * job:
SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. One of SchemaArn
or SchemaName
has to be provided.
When you specify a Python shell job (JobCommand.Name
="pythonshell"), you can
+ * allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.
SchemaId$SchemaName: The name of the schema. One of SchemaArn
or SchemaName
has to be provided.
When you specify an Apache Spark ETL job (JobCommand.Name
="glueetl") or Apache
+ * Spark streaming ETL job (JobCommand.Name
="gluestreaming"), you can allocate from 2 to 100 DPUs.
+ * The default is 10 DPUs. This job type cannot have a fractional DPU allocation.
Version number required for check pointing. One of VersionNumber
or Compatibility
has to be provided.
The new compatibility setting for the schema.
+ *The name of the SecurityConfiguration
structure to be used with this job
+ * run.
The new description for the schema.
+ *Specifies configuration properties of a job run notification.
* @public */ - Description?: string; -} + NotificationProperty?: NotificationProperty; -/** - * @public - */ -export interface UpdateSchemaResponse { /** - *The Amazon Resource Name (ARN) of the schema.
+ *The type of predefined worker that is allocated when a job runs. Accepts a value of + * G.1X, G.2X, G.4X, G.8X or G.025X for Spark jobs. Accepts the value Z.2X for Ray jobs.
+ *For the G.1X
worker type, each worker maps to 1 DPU (4 vCPUs, 16 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, to offers a scalable and cost effective way to run most jobs.
For the G.2X
worker type, each worker maps to 2 DPU (8 vCPUs, 32 GB of memory) with 128GB disk (approximately 77GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, to offers a scalable and cost effective way to run most jobs.
For the G.4X
worker type, each worker maps to 4 DPU (16 vCPUs, 64 GB of memory) with 256GB disk (approximately 235GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs in the following Amazon Web Services Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm).
For the G.8X
worker type, each worker maps to 8 DPU (32 vCPUs, 128 GB of memory) with 512GB disk (approximately 487GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs, in the same Amazon Web Services Regions as supported for the G.4X
worker type.
For the G.025X
worker type, each worker maps to 0.25 DPU (2 vCPUs, 4 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for low volume streaming jobs. This worker type is only available for Glue version 3.0 streaming jobs.
For the Z.2X
worker type, each worker maps to 2 M-DPU (8vCPUs, 64 GB of memory) with 128 GB disk (approximately 120GB free), and provides up to 8 Ray workers based on the autoscaler.
The name of the schema.
+ *The number of workers of a defined workerType
that are allocated when a job runs.
The name of the registry that contains the schema.
+ *Indicates whether the job is run with a standard or flexible execution class. The standard execution-class is ideal for time-sensitive workloads that require fast job startup and dedicated resources.
+ *The flexible execution class is appropriate for time-insensitive jobs whose start and completion times may vary.
+ *Only jobs with Glue version 3.0 and above and command type glueetl
will be allowed to set ExecutionClass
to FLEX
. The flexible execution class is available for Spark jobs.
The name of the Glue job to be synchronized to or from the remote repository.
- * @public - */ - JobName?: string; - - /** - *- * The provider for the remote repository. Possible values: GITHUB, AWS_CODE_COMMIT, GITLAB, BITBUCKET. - *
- * @public - */ - Provider?: SourceControlProvider; - +export interface StartJobRunResponse { /** - *The name of the remote repository that contains the job artifacts.
- * For BitBucket providers, RepositoryName
should include WorkspaceName
.
- * Use the format
.
- *
The ID assigned to this job run.
* @public */ - RepositoryName?: string; + JobRunId?: string; +} +/** + *The machine learning transform is not ready to run.
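A minimal sketch of submitting a job run with the request shape documented above; the job name, argument key and sizing values are placeholders. Per the MaxCapacity notes, WorkerType/NumberOfWorkers are used instead of MaxCapacity:

```ts
import { GlueClient, StartJobRunCommand } from "@aws-sdk/client-glue";

const glue = new GlueClient({});

// "nightly-etl" and "--target_date" are hypothetical; Arguments override the
// job's default arguments for this run only.
async function runJob() {
  const { JobRunId } = await glue.send(
    new StartJobRunCommand({
      JobName: "nightly-etl",
      Arguments: { "--target_date": "2024-06-01" },
      WorkerType: "G.1X",
      NumberOfWorkers: 10, // do not combine with MaxCapacity
      Timeout: 120, // minutes; overrides the parent job's timeout
    })
  );
  return JobRunId;
}
```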
+ * @public + */ +export class MLTransformNotReadyException extends __BaseException { + readonly name: "MLTransformNotReadyException" = "MLTransformNotReadyException"; + readonly $fault: "client" = "client"; /** - *The owner of the remote repository that contains the job artifacts.
+ *A message describing the problem.
* @public */ - RepositoryOwner?: string; - + Message?: string; /** - *An optional branch in the remote repository.
- * @public + * @internal */ - BranchName?: string; + constructor(opts: __ExceptionOptionTypeAn optional folder in the remote repository.
+ *The unique identifier of the machine learning transform.
* @public */ - Folder?: string; + TransformId: string | undefined; +} +/** + * @public + */ +export interface StartMLEvaluationTaskRunResponse { /** - *A commit ID for a commit in the remote repository.
+ *The unique identifier associated with this run.
* @public */ - CommitId?: string; + TaskRunId?: string; +} +/** + * @public + */ +export interface StartMLLabelingSetGenerationTaskRunRequest { /** - *The type of authentication, which can be an authentication token stored in Amazon Web Services Secrets Manager, or a personal access token.
+ *The unique identifier of the machine learning transform.
* @public */ - AuthStrategy?: SourceControlAuthStrategy; + TransformId: string | undefined; /** - *The value of the authorization token.
+ *The Amazon Simple Storage Service (Amazon S3) path where you generate the labeling + * set.
* @public */ - AuthToken?: string; + OutputS3Path: string | undefined; } /** * @public */ -export interface UpdateSourceControlFromJobResponse { +export interface StartMLLabelingSetGenerationTaskRunResponse { /** - *The name of the Glue job.
+ *The unique run identifier that is associated with this task run.
* @public */ - JobName?: string; + TaskRunId?: string; } /** * @public - * @enum */ -export const ViewUpdateAction = { - ADD: "ADD", - ADD_OR_REPLACE: "ADD_OR_REPLACE", - DROP: "DROP", - REPLACE: "REPLACE", -} as const; +export interface StartTriggerRequest { + /** + *The name of the trigger to start.
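A hedged sketch of how the two ML task-run requests above are typically sequenced: generate a labeling set, label and import it out of band, then start an evaluation run. The transform ID and S3 path are placeholders.

```ts
import {
  GlueClient,
  StartMLLabelingSetGenerationTaskRunCommand,
  StartMLEvaluationTaskRunCommand,
} from "@aws-sdk/client-glue";

const glue = new GlueClient({});

// TransformId refers to an existing ML transform; the bucket is hypothetical.
async function generateLabelsThenEvaluate(transformId: string) {
  const labeling = await glue.send(
    new StartMLLabelingSetGenerationTaskRunCommand({
      TransformId: transformId,
      OutputS3Path: "s3://my-labels-bucket/labeling-sets/",
    })
  );
  // ...label the generated set and import it before evaluating...
  const evaluation = await glue.send(
    new StartMLEvaluationTaskRunCommand({ TransformId: transformId })
  );
  return { labelingTaskRunId: labeling.TaskRunId, evaluationTaskRunId: evaluation.TaskRunId };
}
```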
+ * @public + */ + Name: string | undefined; +} /** * @public */ -export type ViewUpdateAction = (typeof ViewUpdateAction)[keyof typeof ViewUpdateAction]; +export interface StartTriggerResponse { + /** + *The name of the trigger that was started.
+ * @public + */ + Name?: string; +} /** * @public */ -export interface UpdateTableRequest { +export interface StartWorkflowRunRequest { /** - *The ID of the Data Catalog where the table resides. If none is provided, the Amazon Web Services account - * ID is used by default.
+ *The name of the workflow to start.
* @public */ - CatalogId?: string; + Name: string | undefined; /** - *The name of the catalog database in which the table resides. For Hive - * compatibility, this name is entirely lowercase.
+ *The workflow run properties for the new workflow run.
* @public */ - DatabaseName: string | undefined; + RunProperties?: RecordAn updated TableInput
object to define the metadata table
- * in the catalog.
An Id for the new run.
* @public */ - TableInput: TableInput | undefined; + RunId?: string; +} +/** + *An exception thrown when you try to stop a task run when there is no task running.
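A short sketch of starting a workflow run with the request/response pair above; the workflow name and run property are placeholders. RunProperties seeds the new run's property map.

```ts
import { GlueClient, StartWorkflowRunCommand } from "@aws-sdk/client-glue";

const glue = new GlueClient({});

// "order-pipeline" and "snapshot_date" are hypothetical names.
async function startWorkflow() {
  const { RunId } = await glue.send(
    new StartWorkflowRunCommand({
      Name: "order-pipeline",
      RunProperties: { snapshot_date: "2024-06-01" },
    })
  );
  return RunId;
}
```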
+ * @public + */ +export class ColumnStatisticsTaskNotRunningException extends __BaseException { + readonly name: "ColumnStatisticsTaskNotRunningException" = "ColumnStatisticsTaskNotRunningException"; + readonly $fault: "client" = "client"; /** - *By default, UpdateTable
always creates an archived version of the table
- * before updating it. However, if skipArchive
is set to true,
- * UpdateTable
does not create the archived version.
A message describing the problem.
* @public */ - SkipArchive?: boolean; - + Message?: string; /** - *The transaction ID at which to update the table contents.
- * @public + * @internal */ - TransactionId?: string; + constructor(opts: __ExceptionOptionTypeAn exception thrown when you try to stop a task run.
+ * @public + */ +export class ColumnStatisticsTaskStoppingException extends __BaseException { + readonly name: "ColumnStatisticsTaskStoppingException" = "ColumnStatisticsTaskStoppingException"; + readonly $fault: "client" = "client"; /** - *The version ID at which to update the table contents.
+ *A message describing the problem.
* @public */ - VersionId?: string; + Message?: string; + /** + * @internal + */ + constructor(opts: __ExceptionOptionTypeThe operation to be performed when updating the view.
+ *The name of the database where the table resides.
* @public */ - ViewUpdateAction?: ViewUpdateAction; + DatabaseName: string | undefined; /** - *A flag that can be set to true to ignore matching storage descriptor and subobject matching requirements.
+ *The name of the table.
* @public */ - Force?: boolean; + TableName: string | undefined; } /** * @public */ -export interface UpdateTableResponse {} +export interface StopColumnStatisticsTaskRunResponse {} /** + *The specified crawler is not running.
* @public */ -export interface UpdateTableOptimizerRequest { +export class CrawlerNotRunningException extends __BaseException { + readonly name: "CrawlerNotRunningException" = "CrawlerNotRunningException"; + readonly $fault: "client" = "client"; /** - *The Catalog ID of the table.
+ *A message describing the problem.
* @public */ - CatalogId: string | undefined; - + Message?: string; /** - *The name of the database in the catalog in which the table resides.
- * @public + * @internal */ - DatabaseName: string | undefined; + constructor(opts: __ExceptionOptionTypeThe specified crawler is stopping.
+ * @public + */ +export class CrawlerStoppingException extends __BaseException { + readonly name: "CrawlerStoppingException" = "CrawlerStoppingException"; + readonly $fault: "client" = "client"; /** - *The name of the table.
+ *A message describing the problem.
* @public */ - TableName: string | undefined; - + Message?: string; /** - *The type of table optimizer. Currently, the only valid value is compaction
.
A TableOptimizerConfiguration
object representing the configuration of a table optimizer.
Name of the crawler to stop.
* @public */ - TableOptimizerConfiguration: TableOptimizerConfiguration | undefined; + Name: string | undefined; } /** * @public */ -export interface UpdateTableOptimizerResponse {} +export interface StopCrawlerResponse {} /** - *A structure used to provide information used to update a trigger. This object updates the - * previous trigger definition by overwriting it completely.
+ *The specified scheduler is not running.
* @public */ -export interface TriggerUpdate { +export class SchedulerNotRunningException extends __BaseException { + readonly name: "SchedulerNotRunningException" = "SchedulerNotRunningException"; + readonly $fault: "client" = "client"; /** - *Reserved for future use.
+ *A message describing the problem.
* @public */ - Name?: string; - + Message?: string; /** - *A description of this trigger.
- * @public + * @internal */ - Description?: string; + constructor(opts: __ExceptionOptionTypeA cron
expression used to specify the schedule (see Time-Based
- * Schedules for Jobs and Crawlers. For example, to run
- * something every day at 12:15 UTC, you would specify:
- * cron(15 12 * * ? *)
.
Name of the crawler whose schedule state to set.
* @public */ - Schedule?: string; + CrawlerName: string | undefined; +} + +/** + * @public + */ +export interface StopCrawlerScheduleResponse {} +/** + * @public + */ +export interface StopSessionRequest { /** - *The actions initiated by this trigger.
+ *The ID of the session to be stopped.
* @public */ - Actions?: Action[]; + Id: string | undefined; /** - *The predicate of this trigger, which defines when it will fire.
+ *The origin of the request.
* @public */ - Predicate?: Predicate; + RequestOrigin?: string; +} +/** + * @public + */ +export interface StopSessionResponse { /** - *Batch condition that must be met (specified number of events received or batch time window expired) - * before EventBridge event trigger fires.
+ *Returns the Id of the stopped session.
* @public */ - EventBatchingCondition?: EventBatchingCondition; + Id?: string; } /** * @public */ -export interface UpdateTriggerRequest { +export interface StopTriggerRequest { /** - *The name of the trigger to update.
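For the StopSession pair above, a minimal sketch (the session ID is whatever CreateSession returned earlier; nothing here is specific to a real account):

```ts
import { GlueClient, StopSessionCommand } from "@aws-sdk/client-glue";

const glue = new GlueClient({});

// Stops an interactive session by ID and returns the echoed ID.
async function stopSession(sessionId: string) {
  const { Id } = await glue.send(new StopSessionCommand({ Id: sessionId }));
  return Id;
}
```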
+ *The name of the trigger to stop.
* @public */ Name: string | undefined; - - /** - *The new values with which to update the trigger.
- * @public - */ - TriggerUpdate: TriggerUpdate | undefined; } /** * @public */ -export interface UpdateTriggerResponse { +export interface StopTriggerResponse { /** - *The resulting trigger definition.
+ *The name of the trigger that was stopped.
* @public */ - Trigger?: Trigger; + Name?: string; } /** * @public */ -export interface UpdateUsageProfileRequest { +export interface StopWorkflowRunRequest { /** - *The name of the usage profile.
+ *The name of the workflow to stop.
* @public */ Name: string | undefined; /** - *A description of the usage profile.
- * @public - */ - Description?: string; - - /** - *A ProfileConfiguration
object specifying the job and session values for the profile.
The ID of the workflow run to stop.
* @public */ - Configuration: ProfileConfiguration | undefined; + RunId: string | undefined; } /** * @public */ -export interface UpdateUsageProfileResponse { - /** - *The name of the usage profile that was updated.
- * @public - */ - Name?: string; -} +export interface StopWorkflowRunResponse {} /** * @public */ -export interface UpdateUserDefinedFunctionRequest { +export interface TagResourceRequest { /** - *The ID of the Data Catalog where the function to be updated is located. If none is - * provided, the Amazon Web Services account ID is used by default.
+ *The ARN of the Glue resource to which to add the tags. For more + * information about Glue resource ARNs, see the Glue ARN string pattern.
* @public */ - CatalogId?: string; + ResourceArn: string | undefined; /** - *The name of the catalog database where the function to be updated is - * located.
+ *Tags to add to this resource.
* @public */ - DatabaseName: string | undefined; + TagsToAdd: RecordThe name of the function.
+ *The Amazon Resource Name (ARN) of the resource from which to remove the tags.
* @public */ - FunctionName: string | undefined; + ResourceArn: string | undefined; /** - *A FunctionInput
object that redefines the function in the Data
- * Catalog.
Tags to remove from this resource.
* @public */ - FunctionInput: UserDefinedFunctionInput | undefined; + TagsToRemove: string[] | undefined; } /** * @public */ -export interface UpdateUserDefinedFunctionResponse {} +export interface UntagResourceResponse {} /** * @public */ -export interface UpdateWorkflowRequest { +export interface UpdateBlueprintRequest { /** - *Name of the workflow to be updated.
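TagResource and UntagResource act on Glue resource ARNs, as described above. A small sketch, with a hypothetical job ARN and tag keys:

```ts
import { GlueClient, TagResourceCommand, UntagResourceCommand } from "@aws-sdk/client-glue";

const glue = new GlueClient({});

// Placeholder ARN; Glue ARNs follow arn:aws:glue:<region>:<account-id>:<type>/<name>.
const jobArn = "arn:aws:glue:us-east-1:123456789012:job/nightly-etl";

async function retagJob() {
  await glue.send(
    new TagResourceCommand({ ResourceArn: jobArn, TagsToAdd: { team: "data-eng", env: "prod" } })
  );
  await glue.send(new UntagResourceCommand({ ResourceArn: jobArn, TagsToRemove: ["env"] }));
}
```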
+ *The name of the blueprint.
* @public */ Name: string | undefined; /** - *The description of the workflow.
+ *A description of the blueprint.
* @public */ Description?: string; /** - *A collection of properties to be used as part of each execution of the workflow.
- * @public - */ - DefaultRunProperties?: RecordYou can use this parameter to prevent unwanted multiple updates to data, to control costs, or in some cases, to prevent exceeding the maximum number of concurrent runs of any of the component jobs. If you leave this parameter blank, there is no limit to the number of concurrent workflow runs.
+ *Specifies a path in Amazon S3 where the blueprint is published.
* @public */ - MaxConcurrentRuns?: number; + BlueprintLocation: string | undefined; } /** * @public */ -export interface UpdateWorkflowResponse { +export interface UpdateBlueprintResponse { /** - *The name of the workflow which was specified in input.
+ *Returns the name of the blueprint that was updated.
* @public */ Name?: string; } /** - *Specifies the mapping of data property keys.
+ *Specifies a custom CSV classifier to be updated.
* @public */ -export interface Mapping { - /** - *After the apply mapping, what the name of the column should be. Can be the same as FromPath
.
The table or column to be modified.
- * @public - */ - FromPath?: string[]; - +export interface UpdateCsvClassifierRequest { /** - *The type of the data to be modified.
+ *The name of the classifier.
* @public */ - FromType?: string; + Name: string | undefined; /** - *The data type that the data is to be modified to.
+ *A custom symbol to denote what separates each column entry in the row.
* @public */ - ToType?: string; + Delimiter?: string; /** - *If true, then the column is removed.
+ *A custom symbol to denote what combines content into a single column value. It must be + * different from the column delimiter.
* @public */ - Dropped?: boolean; + QuoteSymbol?: string; /** - *Only applicable to nested data structures. If you want to change the parent structure, but also one of its children, you can fill out this data strucutre. It is also Mapping
, but its FromPath
will be the parent's FromPath
plus the FromPath
from this structure.
For the children part, suppose you have the structure:
- *
- * \{
- * "FromPath": "OuterStructure",
- * "ToKey": "OuterStructure",
- * "ToType": "Struct",
- * "Dropped": false,
- * "Chidlren": [\{
- * "FromPath": "inner",
- * "ToKey": "inner",
- * "ToType": "Double",
- * "Dropped": false,
- * \}]
- * \}
- *
You can specify a Mapping
that looks like:
- * \{
- * "FromPath": "OuterStructure",
- * "ToKey": "OuterStructure",
- * "ToType": "Struct",
- * "Dropped": false,
- * "Chidlren": [\{
- * "FromPath": "inner",
- * "ToKey": "inner",
- * "ToType": "Double",
- * "Dropped": false,
- * \}]
- * \}
- *
Indicates whether the CSV file contains a header.
* @public */ - Children?: Mapping[]; -} + ContainsHeader?: CsvHeaderOption; -/** - *Specifies a transform that maps data property keys in the data source to data property keys in the data target. You can rename keys, modify the data types for keys, and choose which keys to drop from the dataset.
- * @public - */ -export interface ApplyMapping { /** - *The name of the transform node.
+ *A list of strings representing column names.
* @public */ - Name: string | undefined; + Header?: string[]; /** - *The data inputs identified by their node names.
+ *Specifies not to trim values before identifying the type of column values. The default value is true.
* @public */ - Inputs: string[] | undefined; + DisableValueTrimming?: boolean; /** - *Specifies the mapping of data property keys in the data source to data property keys in the data target.
+ *Enables the processing of files that contain only one column.
* @public */ - Mapping: Mapping[] | undefined; -} + AllowSingleColumn?: boolean; -/** - *
- * CodeGenConfigurationNode
enumerates all valid Node types. One and only one of its member variables can be populated.
Specifies a connector to an Amazon Athena data source.
+ *Specifies the configuration of custom datatypes.
* @public */ - AthenaConnectorSource?: AthenaConnectorSource; + CustomDatatypeConfigured?: boolean; /** - *Specifies a connector to a JDBC data source.
+ *Specifies a list of supported custom datatypes.
* @public */ - JDBCConnectorSource?: JDBCConnectorSource; + CustomDatatypes?: string[]; /** - *Specifies a connector to an Apache Spark data source.
+ *Sets the SerDe for processing CSV in the classifier, which will be applied in the Data Catalog. Valid values are OpenCSVSerDe
, LazySimpleSerDe
, and None
. You can specify the None
value when you want the crawler to do the detection.
Specifies a grok classifier to update when passed to
+ * UpdateClassifier
.
Specifies a data store in the Glue Data Catalog.
+ *The name of the GrokClassifier
.
Specifies an Amazon Redshift data store.
+ *An identifier of the data format that the classifier matches, such as Twitter, JSON, Omniture logs, + * Amazon CloudWatch Logs, and so on.
* @public */ - RedshiftSource?: RedshiftSource; + Classification?: string; /** - *Specifies an Amazon S3 data store in the Glue Data Catalog.
+ *The grok pattern used by this classifier.
* @public */ - S3CatalogSource?: S3CatalogSource; + GrokPattern?: string; /** - *Specifies a command-separated value (CSV) data store stored in Amazon S3.
+ *Optional custom grok patterns used by this classifier.
* @public */ - S3CsvSource?: S3CsvSource; + CustomPatterns?: string; +} +/** + *Specifies a JSON classifier to be updated.
+ * @public + */ +export interface UpdateJsonClassifierRequest { /** - *Specifies a JSON data store stored in Amazon S3.
+ *The name of the classifier.
* @public */ - S3JsonSource?: S3JsonSource; + Name: string | undefined; /** - *Specifies an Apache Parquet data store stored in Amazon S3.
+ *A JsonPath
string defining the JSON data for the classifier to classify.
+ * Glue supports a subset of JsonPath, as described in Writing JsonPath Custom Classifiers.
Specifies an XML classifier to be updated.
+ * @public + */ +export interface UpdateXMLClassifierRequest { /** - *Specifies a relational catalog data store in the Glue Data Catalog.
+ *The name of the classifier.
* @public */ - RelationalCatalogSource?: RelationalCatalogSource; + Name: string | undefined; /** - *Specifies a DynamoDBC Catalog data store in the Glue Data Catalog.
+ *An identifier of the data format that the classifier matches.
* @public */ - DynamoDBCatalogSource?: DynamoDBCatalogSource; + Classification?: string; /** - *Specifies a data target that writes to Amazon S3 in Apache Parquet columnar storage.
+ *The XML tag designating the element that contains each record in an XML document being
+ * parsed. This cannot identify a self-closing element (closed by />
). An empty
+ * row element that contains only attributes can be parsed as long as it ends with a closing tag
+ * (for example,
is okay, but
+ *
is not).
Specifies a target that uses an Apache Spark connector.
+ *A GrokClassifier
object with updated fields.
Specifies a target that uses a Glue Data Catalog table.
+ *An XMLClassifier
object with updated fields.
Specifies a target that uses Amazon Redshift.
+ *A JsonClassifier
object with updated fields.
Specifies a data target that writes to Amazon S3 using the Glue Data Catalog.
+ *A CsvClassifier
object with updated fields.
Specifies a data target that writes to Amazon S3 in Apache Parquet columnar storage.
- * @public - */ - S3GlueParquetTarget?: S3GlueParquetTarget; +/** + * @public + */ +export interface UpdateClassifierResponse {} +/** + *There was a version conflict.
+ * @public + */ +export class VersionMismatchException extends __BaseException { + readonly name: "VersionMismatchException" = "VersionMismatchException"; + readonly $fault: "client" = "client"; /** - *Specifies a data target that writes to Amazon S3.
+ *A message describing the problem.
* @public */ - S3DirectTarget?: S3DirectTarget; - + Message?: string; /** - *Specifies a transform that maps data property keys in the data source to data property keys in the data target. You can rename keys, modify the data types for keys, and choose which keys to drop from the dataset.
- * @public + * @internal */ - ApplyMapping?: ApplyMapping; + constructor(opts: __ExceptionOptionTypeSpecifies a transform that chooses the data property keys that you want to keep.
+ *The ID of the Data Catalog where the partitions in question reside. + * If none is supplied, the Amazon Web Services account ID is used by default.
* @public */ - SelectFields?: SelectFields; + CatalogId?: string; /** - *Specifies a transform that chooses the data property keys that you want to drop.
+ *The name of the catalog database where the partitions reside.
* @public */ - DropFields?: DropFields; + DatabaseName: string | undefined; /** - *Specifies a transform that renames a single data property key.
+ *The name of the partitions' table.
* @public */ - RenameField?: RenameField; + TableName: string | undefined; /** - *Specifies a transform that writes samples of the data to an Amazon S3 bucket.
+ *A list of partition values identifying the partition.
* @public */ - Spigot?: Spigot; + PartitionValues: string[] | undefined; /** - *Specifies a transform that joins two datasets into one dataset using a comparison phrase on the specified data property keys. You can use inner, outer, left, right, left semi, and left anti joins.
+ *A list of the column statistics.
* @public */ - Join?: Join; + ColumnStatisticsList: ColumnStatistics[] | undefined; +} +/** + *Encapsulates a ColumnStatistics
object that failed and the reason for failure.
Specifies a transform that splits data property keys into two DynamicFrames
. The output is a collection of DynamicFrames
: one with selected data property keys, and one with the remaining data property keys.
The ColumnStatistics
of the column.
Specifies a transform that chooses one DynamicFrame
from a collection of DynamicFrames
. The output is the selected DynamicFrame
- *
An error message with the reason for the failure of an operation.
* @public */ - SelectFromCollection?: SelectFromCollection; + Error?: ErrorDetail; +} +/** + * @public + */ +export interface UpdateColumnStatisticsForPartitionResponse { /** - *Specifies a transform that locates records in the dataset that have missing values and adds a new field with a value determined by imputation. The input data set is used to train the machine learning model that determines what the missing value should be.
+ *Error occurred during updating column statistics data.
* @public */ - FillMissingValues?: FillMissingValues; + Errors?: ColumnStatisticsError[]; +} +/** + * @public + */ +export interface UpdateColumnStatisticsForTableRequest { /** - *Specifies a transform that splits a dataset into two, based on a filter condition.
+ *The ID of the Data Catalog where the partitions in question reside. + * If none is supplied, the Amazon Web Services account ID is used by default.
* @public */ - Filter?: Filter; + CatalogId?: string; /** - *Specifies a transform that uses custom code you provide to perform the data transformation. The output is a collection of DynamicFrames.
+ *The name of the catalog database where the partitions reside.
* @public */ - CustomCode?: CustomCode; + DatabaseName: string | undefined; /** - *Specifies a transform where you enter a SQL query using Spark SQL syntax to transform the data. The output is a single DynamicFrame
.
The name of the partitions' table.
* @public */ - SparkSQL?: SparkSQL; + TableName: string | undefined; /** - *Specifies a direct Amazon Kinesis data source.
+ *A list of the column statistics.
* @public */ - DirectKinesisSource?: DirectKinesisSource; + ColumnStatisticsList: ColumnStatistics[] | undefined; +} +/** + * @public + */ +export interface UpdateColumnStatisticsForTableResponse { /** - *Specifies an Apache Kafka data store.
+ *List of ColumnStatisticsErrors.
* @public */ - DirectKafkaSource?: DirectKafkaSource; + Errors?: ColumnStatisticsError[]; +} +/** + * @public + */ +export interface UpdateConnectionRequest { /** - *Specifies a Kinesis data source in the Glue Data Catalog.
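Per-column failures come back in the Errors list rather than as a thrown exception, so a caller usually inspects the response. A sketch, assuming a hypothetical table and made-up statistic values:

```ts
import { GlueClient, UpdateColumnStatisticsForTableCommand } from "@aws-sdk/client-glue";

const glue = new GlueClient({});

// Database, table and numbers below are placeholders.
async function writeColumnStats() {
  const { Errors } = await glue.send(
    new UpdateColumnStatisticsForTableCommand({
      DatabaseName: "sales_db",
      TableName: "orders",
      ColumnStatisticsList: [
        {
          ColumnName: "amount",
          ColumnType: "bigint",
          AnalyzedTime: new Date(),
          StatisticsData: {
            Type: "LONG",
            LongColumnStatisticsData: {
              MinimumValue: 0,
              MaximumValue: 150_000,
              NumberOfNulls: 0,
              NumberOfDistinctValues: 4_200,
            },
          },
        },
      ],
    })
  );
  if (Errors?.length) console.warn("some columns failed:", Errors);
}
```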
+ *The ID of the Data Catalog in which the connection resides. If none is provided, the Amazon Web Services + * account ID is used by default.
* @public */ - CatalogKinesisSource?: CatalogKinesisSource; + CatalogId?: string; /** - *Specifies an Apache Kafka data store in the Data Catalog.
+ *The name of the connection definition to update.
* @public */ - CatalogKafkaSource?: CatalogKafkaSource; + Name: string | undefined; /** - *Specifies a transform that removes columns from the dataset if all values in the column are 'null'. By default, Glue Studio will recognize null objects, but some values such as empty strings, strings that are "null", -1 integers or other placeholders such as zeros, are not automatically recognized as nulls.
+ *A ConnectionInput
object that redefines the connection
+ * in question.
Specifies a transform that merges a DynamicFrame
with a staging DynamicFrame
based on the specified primary keys to identify records. Duplicate records (records with the same primary keys) are not de-duplicated.
Specifies a transform that combines the rows from two or more datasets into a single result.
+ *Name of the new crawler.
* @public */ - Union?: Union; + Name: string | undefined; /** - *Specifies a transform that identifies, removes or masks PII data.
+ *The IAM role or Amazon Resource Name (ARN) of an IAM role that is used by the new crawler + * to access customer resources.
* @public */ - PIIDetection?: PIIDetection; + Role?: string; /** - *Specifies a transform that groups rows by chosen fields and computes the aggregated value by specified function.
+ *The Glue database where results are stored, such as:
+ * arn:aws:daylight:us-east-1::database/sometable/*
.
Specifies a transform that removes rows of repeating data from a data set.
+ *A description of the new crawler.
* @public */ - DropDuplicates?: DropDuplicates; + Description?: string; /** - *Specifies a data target that writes to a goverened catalog.
+ *A list of targets to crawl.
* @public */ - GovernedCatalogTarget?: GovernedCatalogTarget; + Targets?: CrawlerTargets; /** - *Specifies a data source in a goverened Data Catalog.
+ *A cron
expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers. For example, to run
+ * something every day at 12:15 UTC, you would specify:
+ * cron(15 12 * * ? *)
.
Specifies a Microsoft SQL server data source in the Glue Data Catalog.
+ *A list of custom classifiers that the user + * has registered. By default, all built-in classifiers are included in a crawl, + * but these custom classifiers always override the default classifiers + * for a given classification.
* @public */ - MicrosoftSQLServerCatalogSource?: MicrosoftSQLServerCatalogSource; + Classifiers?: string[]; /** - *Specifies a MySQL data source in the Glue Data Catalog.
+ *The table prefix used for catalog tables that are created.
* @public */ - MySQLCatalogSource?: MySQLCatalogSource; + TablePrefix?: string; /** - *Specifies an Oracle data source in the Glue Data Catalog.
+ *The policy for the crawler's update and deletion behavior.
* @public */ - OracleSQLCatalogSource?: OracleSQLCatalogSource; + SchemaChangePolicy?: SchemaChangePolicy; /** - *Specifies a PostgresSQL data source in the Glue Data Catalog.
+ *A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
* @public */ - PostgreSQLCatalogSource?: PostgreSQLCatalogSource; + RecrawlPolicy?: RecrawlPolicy; /** - *Specifies a target that uses Microsoft SQL.
+ *Specifies data lineage configuration settings for the crawler.
* @public */ - MicrosoftSQLServerCatalogTarget?: MicrosoftSQLServerCatalogTarget; + LineageConfiguration?: LineageConfiguration; /** - *Specifies a target that uses MySQL.
+ *Specifies Lake Formation configuration settings for the crawler.
* @public */ - MySQLCatalogTarget?: MySQLCatalogTarget; + LakeFormationConfiguration?: LakeFormationConfiguration; /** - *Specifies a target that uses Oracle SQL.
+ *Crawler configuration information. This versioned JSON string allows users + * to specify aspects of a crawler's behavior. + * For more information, see Setting crawler configuration options.
* @public */ - OracleSQLCatalogTarget?: OracleSQLCatalogTarget; + Configuration?: string; /** - *Specifies a target that uses Postgres SQL.
+ *The name of the SecurityConfiguration
structure to be used by this
+ * crawler.
Specifies a custom visual transform created by a user.
+ *The name of the crawler whose schedule to update.
* @public */ - DynamicTransform?: DynamicTransform; + CrawlerName: string | undefined; /** - *Specifies your data quality evaluation criteria.
+ *The updated cron
expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers. For example, to run
+ * something every day at 12:15 UTC, you would specify:
+ * cron(15 12 * * ? *)
.
Specifies a Hudi data source that is registered in the Glue Data Catalog. The data source must be stored in Amazon S3.
+ *The ID of the Data Catalog in which the metadata database resides. If none is provided, + * the Amazon Web Services account ID is used by default.
* @public */ - S3CatalogHudiSource?: S3CatalogHudiSource; + CatalogId?: string; /** - *Specifies a Hudi data source that is registered in the Glue Data Catalog.
+ *The name of the database to update in the catalog. For Hive + * compatibility, this is folded to lowercase.
* @public */ - CatalogHudiSource?: CatalogHudiSource; + Name: string | undefined; /** - *Specifies a Hudi data source stored in Amazon S3.
+ *A DatabaseInput
object specifying the new definition
+ * of the metadata database in the catalog.
Specifies a target that writes to a Hudi data source in the Glue Data Catalog.
+ *The name of the data quality ruleset.
* @public */ - S3HudiCatalogTarget?: S3HudiCatalogTarget; + Name: string | undefined; /** - *Specifies a target that writes to a Hudi data source in Amazon S3.
+ *A description of the ruleset.
* @public */ - S3HudiDirectTarget?: S3HudiDirectTarget; + Description?: string; /** - *Specifies the direct JDBC source connection.
+ *A Data Quality Definition Language (DQDL) ruleset. For more information, see the Glue developer guide.
* @public */ - DirectJDBCSource?: DirectJDBCSource; + Ruleset?: string; +} +/** + * @public + */ +export interface UpdateDataQualityRulesetResponse { /** - *Specifies a Delta Lake data source that is registered in the Glue Data Catalog. The data source must be stored in Amazon S3.
+ *The name of the data quality ruleset.
* @public */ - S3CatalogDeltaSource?: S3CatalogDeltaSource; + Name?: string; /** - *Specifies a Delta Lake data source that is registered in the Glue Data Catalog.
+ *A description of the ruleset.
* @public */ - CatalogDeltaSource?: CatalogDeltaSource; + Description?: string; /** - *Specifies a Delta Lake data source stored in Amazon S3.
+ *A Data Quality Definition Language (DQDL) ruleset. For more information, see the Glue developer guide.
* @public */ - S3DeltaSource?: S3DeltaSource; + Ruleset?: string; +} +/** + *Custom libraries to be loaded into a development endpoint.
+ * @public + */ +export interface DevEndpointCustomLibraries { /** - *Specifies a target that writes to a Delta Lake data source in the Glue Data Catalog.
+ *The paths to one or more Python libraries in an Amazon Simple Storage Service (Amazon S3)
+ * bucket that should be loaded in your DevEndpoint
. Multiple values must be
+ * complete paths separated by a comma.
You can only use pure Python libraries with a DevEndpoint
. Libraries that rely on
+ * C extensions, such as the pandas Python data
+ * analysis library, are not currently supported.
Specifies a target that writes to a Delta Lake data source in Amazon S3.
+ *The path to one or more Java .jar
files in an S3 bucket that should be loaded
+ * in your DevEndpoint
.
You can only use pure Java/Scala libraries with a DevEndpoint
.
Specifies a target that writes to a data source in Amazon Redshift.
+ *The name of the DevEndpoint
to be updated.
Specifies a target that writes to a data target in Amazon Redshift.
+ *The public key for the DevEndpoint
to use.
Specifies your data quality evaluation criteria. Allows multiple input data and returns a collection of Dynamic Frames.
+ *The list of public keys for the DevEndpoint
to use.
Specifies a Glue DataBrew recipe node.
+ *The list of public keys to be deleted from the DevEndpoint
.
Specifies a Snowflake data source.
+ *Custom Python or Java libraries to be loaded in the DevEndpoint
.
Specifies a target that writes to a Snowflake data source.
+ *
+ * True
if the list of custom libraries to be loaded in the development endpoint
+ * needs to be updated, or False
if otherwise.
Specifies a source generated with standard connection options.
+ *The list of argument keys to be deleted from the map of arguments used to configure the
+ * DevEndpoint
.
Specifies a target generated with standard connection options.
+ *The map of arguments to add the map of arguments used to configure the
+ * DevEndpoint
.
Valid arguments are:
+ *
+ * "--enable-glue-datacatalog": ""
+ *
You can specify a version of Python support for development endpoints by using the Arguments
parameter in the CreateDevEndpoint
or UpdateDevEndpoint
APIs. If no arguments are provided, the version defaults to Python 2.
The name you assign to this job definition. It must be unique in your account.
- * @public - */ - Name: string | undefined; +export interface UpdateDevEndpointResponse {} +/** + * @public + */ +export interface UpdateJobResponse { /** - *A mode that describes how a job was created. Valid values are:
- *
- * SCRIPT
- The job was created using the Glue Studio script editor.
- * VISUAL
- The job was created using the Glue Studio visual editor.
- * NOTEBOOK
- The job was created using an interactive sessions notebook.
When the JobMode
field is missing or null, SCRIPT
is assigned as the default value.
Returns the name of the updated job definition.
* @public */ - JobMode?: JobMode; + JobName?: string; +} +/** + * @public + */ +export interface UpdateJobFromSourceControlRequest { /** - *Description of the job being defined.
+ *The name of the Glue job to be synchronized to or from the remote repository.
* @public */ - Description?: string; + JobName?: string; /** - *This field is reserved for future use.
+ *+ * The provider for the remote repository. Possible values: GITHUB, AWS_CODE_COMMIT, GITLAB, BITBUCKET. + *
* @public */ - LogUri?: string; + Provider?: SourceControlProvider; /** - *The name or Amazon Resource Name (ARN) of the IAM role associated with this job.
+ *The name of the remote repository that contains the job artifacts.
+ * For BitBucket providers, RepositoryName
should include WorkspaceName
.
+ * Use the format
.
+ *
An ExecutionProperty
specifying the maximum number of concurrent runs allowed
- * for this job.
The owner of the remote repository that contains the job artifacts.
* @public */ - ExecutionProperty?: ExecutionProperty; + RepositoryOwner?: string; /** - *The JobCommand
that runs this job.
An optional branch in the remote repository.
* @public */ - Command: JobCommand | undefined; + BranchName?: string; /** - *The default arguments for every run of this job, specified as name-value pairs.
- *You can specify arguments here that your own job-execution script - * consumes, as well as arguments that Glue itself consumes.
- *Job arguments may be logged. Do not pass plaintext secrets as arguments. Retrieve secrets - * from a Glue Connection, Secrets Manager or other secret management - * mechanism if you intend to keep them within the Job.
- *For information about how to specify and consume your own Job arguments, see the Calling Glue APIs in Python topic in the developer guide.
- *For information about the arguments you can provide to this field when configuring Spark jobs, - * see the Special Parameters Used by Glue topic in the developer guide.
- *For information about the arguments you can provide to this field when configuring Ray - * jobs, see Using - * job parameters in Ray jobs in the developer guide.
+ *An optional folder in the remote repository.
* @public */ - DefaultArguments?: RecordArguments for this job that are not overridden when providing job arguments - * in a job run, specified as name-value pairs.
+ *A commit ID for a commit in the remote repository.
* @public */ - NonOverridableArguments?: RecordThe connections used for this job.
+ *The type of authentication, which can be an authentication token stored in Amazon Web Services Secrets Manager, or a personal access token.
* @public */ - Connections?: ConnectionsList; + AuthStrategy?: SourceControlAuthStrategy; /** - *The maximum number of times to retry this job if it fails.
+ *The value of the authorization token.
* @public */ - MaxRetries?: number; + AuthToken?: string; +} +/** + * @public + */ +export interface UpdateJobFromSourceControlResponse { /** - * @deprecated - * - *This parameter is deprecated. Use MaxCapacity
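A hedged sketch of synchronizing a job from a remote repository with the request above. The job, organization, repository, branch, folder and secret names are all placeholders, and the example assumes the token is kept in Secrets Manager (the exact AuthToken value you pass depends on your setup) rather than hardcoding a personal access token:

```ts
import { GlueClient, UpdateJobFromSourceControlCommand } from "@aws-sdk/client-glue";

const glue = new GlueClient({});

// All identifiers below are hypothetical.
async function pullJobFromGit() {
  await glue.send(
    new UpdateJobFromSourceControlCommand({
      JobName: "nightly-etl",
      Provider: "GITHUB",
      RepositoryName: "glue-jobs",
      RepositoryOwner: "example-org",
      BranchName: "main",
      Folder: "jobs/nightly-etl",
      AuthStrategy: "AWS_SECRETS_MANAGER",
      AuthToken: "glue/github-token", // hypothetical secret reference, not a plaintext token
    })
  );
}
```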
instead.
The number of Glue data processing units (DPUs) to allocate to this Job. You can - * allocate a minimum of 2 DPUs; the default is 10. A DPU is a relative measure of processing - * power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, - * see the Glue pricing - * page.
+ *The name of the Glue job.
* @public */ - AllocatedCapacity?: number; + JobName?: string; +} +/** + * @public + */ +export interface UpdateMLTransformRequest { /** - *The job timeout in minutes. This is the maximum time that a job run
- * can consume resources before it is terminated and enters TIMEOUT
- * status. The default is 2,880 minutes (48 hours) for batch jobs.
Streaming jobs must have timeout values less than 7 days or 10080 minutes. When the value is left blank, the job will be restarted after 7 days based if you have not setup a maintenance window. If you have setup maintenance window, it will be restarted during the maintenance window after 7 days.
+ *A unique identifier that was generated when the transform was created.
* @public */ - Timeout?: number; + TransformId: string | undefined; /** - *For Glue version 1.0 or earlier jobs, using the standard worker type, the number of - * Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is - * a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB - * of memory. For more information, see the - * Glue pricing page.
- *For Glue version 2.0+ jobs, you cannot specify a Maximum capacity
.
- * Instead, you should specify a Worker type
and the Number of workers
.
Do not set MaxCapacity
if using WorkerType
and NumberOfWorkers
.
The value that can be allocated for MaxCapacity
depends on whether you are
- * running a Python shell job, an Apache Spark ETL job, or an Apache Spark streaming ETL
- * job:
When you specify a Python shell job (JobCommand.Name
="pythonshell"), you can
- * allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.
When you specify an Apache Spark ETL job (JobCommand.Name
="glueetl") or Apache
- * Spark streaming ETL job (JobCommand.Name
="gluestreaming"), you can allocate from 2 to 100 DPUs.
- * The default is 10 DPUs. This job type cannot have a fractional DPU allocation.
The unique name that you gave the transform when you created it.
* @public */ - MaxCapacity?: number; + Name?: string; /** - *The name of the SecurityConfiguration
structure to be used with this
- * job.
A description of the transform. The default is an empty string.
* @public */ - SecurityConfiguration?: string; + Description?: string; /** - *The tags to use with this job. You may use tags to limit access to the job. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.
+ *The configuration parameters that are specific to the transform type (algorithm) used. + * Conditionally dependent on the transform type.
* @public */ - Tags?: RecordSpecifies configuration properties of a job notification.
+ *The name or Amazon Resource Name (ARN) of the IAM role with the required + * permissions.
* @public */ - NotificationProperty?: NotificationProperty; + Role?: string; /** - *In Spark jobs, GlueVersion
determines the versions of Apache Spark and Python
- * that Glue available in a job. The Python version indicates the version
- * supported for jobs of type Spark.
Ray jobs should set GlueVersion
to 4.0
or greater. However,
- * the versions of Ray, Python and additional libraries available in your Ray job are determined
- * by the Runtime
parameter of the Job command.
For more information about the available Glue versions and corresponding - * Spark and Python versions, see Glue version in the developer - * guide.
- *Jobs that are created without specifying a Glue version default to Glue 0.9.
+ *This value determines which version of Glue this machine learning transform is compatible with. Glue 1.0 is recommended for most customers. If the value is not set, the Glue compatibility defaults to Glue 0.9. For more information, see Glue Versions in the developer guide.
* @public */ GlueVersion?: string; /** - *The number of workers of a defined workerType
that are allocated when a job runs.
The number of Glue data processing units (DPUs) that are allocated to task runs for this transform. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of + * processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more + * information, see the Glue pricing + * page.
+ *When the WorkerType
field is set to a value other than Standard
, the MaxCapacity
field is set automatically and becomes read-only.
The type of predefined worker that is allocated when a job runs. Accepts a value of - * G.1X, G.2X, G.4X, G.8X or G.025X for Spark jobs. Accepts the value Z.2X for Ray jobs.
+ *The type of predefined worker that is allocated when this task runs. Accepts a value of Standard, G.1X, or G.2X.
*For the G.1X
worker type, each worker maps to 1 DPU (4 vCPUs, 16 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, to offers a scalable and cost effective way to run most jobs.
For the G.2X
worker type, each worker maps to 2 DPU (8 vCPUs, 32 GB of memory) with 128GB disk (approximately 77GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, to offers a scalable and cost effective way to run most jobs.
For the G.4X
worker type, each worker maps to 4 DPU (16 vCPUs, 64 GB of memory) with 256GB disk (approximately 235GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs in the following Amazon Web Services Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm).
For the G.8X
worker type, each worker maps to 8 DPU (32 vCPUs, 128 GB of memory) with 512GB disk (approximately 487GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs, in the same Amazon Web Services Regions as supported for the G.4X
worker type.
For the Standard
worker type, each worker provides 4 vCPU, 16 GB of memory and a 50GB disk, and 2 executors per worker.
For the G.025X
worker type, each worker maps to 0.25 DPU (2 vCPUs, 4 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for low volume streaming jobs. This worker type is only available for Glue version 3.0 streaming jobs.
For the G.1X
worker type, each worker provides 4 vCPU, 16 GB of memory and a 64GB disk, and 1 executor per worker.
For the Z.2X
worker type, each worker maps to 2 M-DPU (8vCPUs, 64 GB of memory) with 128 GB disk (approximately 120GB free), and provides up to 8 Ray workers based on the autoscaler.
For the G.2X
worker type, each worker provides 8 vCPU, 32 GB of memory and a 128GB disk, and 1 executor per worker.
The representation of a directed acyclic graph on which both the Glue Studio visual component and Glue Studio code generation is based.
+ *The number of workers of a defined workerType
that are allocated when this task runs.
Indicates whether the job is run with a standard or flexible execution class. The standard execution class is ideal for time-sensitive workloads that require fast job startup and dedicated resources.
- *The flexible execution class is appropriate for time-insensitive jobs whose start and completion times may vary.
- *Only jobs with Glue version 3.0 and above and command type glueetl
will be allowed to set ExecutionClass
to FLEX
. The flexible execution class is available for Spark jobs.
The timeout for a task run for this transform in minutes. This is the maximum time that a task run for this transform can consume resources before it is terminated and enters TIMEOUT
status. The default is 2,880 minutes (48 hours).
The details for a source control configuration for a job, allowing synchronization of job artifacts to or from a remote repository.
+ *The maximum number of times to retry a task for this transform after a task run fails.
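Taken together, the request fields above map directly onto an UpdateMLTransform call. The following is a minimal sketch using the AWS SDK for JavaScript v3; the transform ID, role ARN, and capacity values are placeholders, not values from this changeset.

```ts
import { GlueClient, UpdateMLTransformCommand } from "@aws-sdk/client-glue";

const client = new GlueClient({ region: "us-east-1" });

// Placeholder TransformId and role ARN; GlueVersion defaults to 0.9 when omitted.
const { TransformId } = await client.send(
  new UpdateMLTransformCommand({
    TransformId: "tfm-0123456789abcdef0",
    Role: "arn:aws:iam::123456789012:role/GlueMLTransformRole",
    GlueVersion: "1.0",
    MaxCapacity: 10, // 2-100 DPUs; leave unset when using WorkerType/NumberOfWorkers
    Timeout: 2880,
    MaxRetries: 3,
  })
);
console.log("Updated transform:", TransformId);
```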
* @public */ - SourceControlDetails?: SourceControlDetails; + MaxRetries?: number; +} +/** + * @public + */ +export interface UpdateMLTransformResponse { /** - *This field specifies a day of the week and hour for a maintenance window for streaming jobs. Glue periodically performs maintenance activities. During these maintenance windows, Glue will need to restart your streaming jobs.
- *Glue will restart the job within 3 hours of the specified maintenance window. For instance, if you set up the maintenance window for Monday at 10:00AM GMT, your jobs will be restarted between 10:00AM GMT to 1:00PM GMT.
+ *The unique identifier for the transform that was updated.
* @public */ - MaintenanceWindow?: string; + TransformId?: string; } /** - *Specifies a job definition.
* @public */ -export interface Job { +export interface UpdatePartitionRequest { /** - *The name you assign to this job definition.
+ *The ID of the Data Catalog where the partition to be updated resides. If none is provided, + * the Amazon Web Services account ID is used by default.
* @public */ - Name?: string; + CatalogId?: string; /** - *A mode that describes how a job was created. Valid values are:
- *
- * SCRIPT
- The job was created using the Glue Studio script editor.
- * VISUAL
- The job was created using the Glue Studio visual editor.
- * NOTEBOOK
- The job was created using an interactive sessions notebook.
When the JobMode
field is missing or null, SCRIPT
is assigned as the default value.
The name of the catalog database in which the table in question + * resides.
* @public */ - JobMode?: JobMode; + DatabaseName: string | undefined; /** - *A description of the job.
+ *The name of the table in which the partition to be updated is located.
* @public */ - Description?: string; + TableName: string | undefined; /** - *This field is reserved for future use.
+ *List of partition key values that define the partition to update.
* @public */ - LogUri?: string; + PartitionValueList: string[] | undefined; /** - *The name or Amazon Resource Name (ARN) of the IAM role associated with this job.
+ *The new partition object to update the partition to.
+ *The Values
property can't be changed. If you want to change the partition key values for a partition, delete and recreate the partition.
The time and date that this job definition was created.
- * @public - */ - CreatedOn?: Date; +/** + * @public + */ +export interface UpdatePartitionResponse {} +/** + * @public + */ +export interface UpdateRegistryInput { /** - *The last point in time when this job definition was modified.
+ *This is a wrapper structure that may contain the registry name and Amazon Resource Name (ARN).
* @public */ - LastModifiedOn?: Date; + RegistryId: RegistryId | undefined; /** - *An ExecutionProperty
specifying the maximum number of concurrent runs allowed
- * for this job.
A description of the registry. If description is not provided, this field will not be updated.
* @public */ - ExecutionProperty?: ExecutionProperty; + Description: string | undefined; +} +/** + * @public + */ +export interface UpdateRegistryResponse { /** - *The JobCommand
that runs this job.
The name of the updated registry.
* @public */ - Command?: JobCommand; + RegistryName?: string; /** - *The default arguments for every run of this job, specified as name-value pairs.
- *You can specify arguments here that your own job-execution script - * consumes, as well as arguments that Glue itself consumes.
- *Job arguments may be logged. Do not pass plaintext secrets as arguments. Retrieve secrets - * from a Glue Connection, Secrets Manager or other secret management - * mechanism if you intend to keep them within the Job.
- *For information about how to specify and consume your own Job arguments, see the Calling Glue APIs in Python topic in the developer guide.
- *For information about the arguments you can provide to this field when configuring Spark jobs, - * see the Special Parameters Used by Glue topic in the developer guide.
- *For information about the arguments you can provide to this field when configuring Ray - * jobs, see Using - * job parameters in Ray jobs in the developer guide.
+ *The Amazon Resource name (ARN) of the updated registry.
* @public */ - DefaultArguments?: RecordArguments for this job that are not overridden when providing job arguments - * in a job run, specified as name-value pairs.
+ *This is a wrapper structure to contain schema identity fields. The structure contains:
+ *SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. One of SchemaArn
or SchemaName
has to be provided.
SchemaId$SchemaName: The name of the schema. One of SchemaArn
or SchemaName
has to be provided.
The connections used for this job.
+ *Version number required for check pointing. One of VersionNumber
or Compatibility
has to be provided.
The maximum number of times to retry this job after a JobRun fails.
+ *The new compatibility setting for the schema.
* @public */ - MaxRetries?: number; + Compatibility?: Compatibility; /** - * @deprecated - * - *This field is deprecated. Use MaxCapacity
instead.
The number of Glue data processing units (DPUs) allocated to runs of this job. You can - * allocate a minimum of 2 DPUs; the default is 10. A DPU is a relative measure of processing - * power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, - * see the Glue pricing - * page.
- * + *The new description for the schema.
* @public */ - AllocatedCapacity?: number; + Description?: string; +} +/** + * @public + */ +export interface UpdateSchemaResponse { /** - *The job timeout in minutes. This is the maximum time that a job run
- * can consume resources before it is terminated and enters TIMEOUT
- * status. The default is 2,880 minutes (48 hours) for batch jobs.
Streaming jobs must have timeout values less than 7 days or 10080 minutes. When the value is left blank, the job will be restarted after 7 days based if you have not setup a maintenance window. If you have setup maintenance window, it will be restarted during the maintenance window after 7 days.
+ *The Amazon Resource Name (ARN) of the schema.
* @public */ - Timeout?: number; + SchemaArn?: string; /** - *For Glue version 1.0 or earlier jobs, using the standard worker type, the number of - * Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is - * a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB - * of memory. For more information, see the - * Glue pricing page.
- *For Glue version 2.0 or later jobs, you cannot specify a Maximum capacity
.
- * Instead, you should specify a Worker type
and the Number of workers
.
Do not set MaxCapacity
if using WorkerType
and NumberOfWorkers
.
The value that can be allocated for MaxCapacity
depends on whether you are
- * running a Python shell job, an Apache Spark ETL job, or an Apache Spark streaming ETL
- * job:
When you specify a Python shell job (JobCommand.Name
="pythonshell"), you can
- * allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.
When you specify an Apache Spark ETL job (JobCommand.Name
="glueetl") or Apache
- * Spark streaming ETL job (JobCommand.Name
="gluestreaming"), you can allocate from 2 to 100 DPUs.
- * The default is 10 DPUs. This job type cannot have a fractional DPU allocation.
The name of the schema.
* @public */ - MaxCapacity?: number; + SchemaName?: string; /** - *The type of predefined worker that is allocated when a job runs. Accepts a value of - * G.1X, G.2X, G.4X, G.8X or G.025X for Spark jobs. Accepts the value Z.2X for Ray jobs.
- *For the G.1X
worker type, each worker maps to 1 DPU (4 vCPUs, 16 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, to offers a scalable and cost effective way to run most jobs.
For the G.2X
worker type, each worker maps to 2 DPU (8 vCPUs, 32 GB of memory) with 128GB disk (approximately 77GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, to offers a scalable and cost effective way to run most jobs.
For the G.4X
worker type, each worker maps to 4 DPU (16 vCPUs, 64 GB of memory) with 256GB disk (approximately 235GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs in the following Amazon Web Services Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm).
For the G.8X
worker type, each worker maps to 8 DPU (32 vCPUs, 128 GB of memory) with 512GB disk (approximately 487GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs, in the same Amazon Web Services Regions as supported for the G.4X
worker type.
For the G.025X
worker type, each worker maps to 0.25 DPU (2 vCPUs, 4 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for low volume streaming jobs. This worker type is only available for Glue version 3.0 streaming jobs.
For the Z.2X
worker type, each worker maps to 2 M-DPU (8vCPUs, 64 GB of memory) with 128 GB disk (approximately 120GB free), and provides up to 8 Ray workers based on the autoscaler.
The name of the registry that contains the schema.
* @public */ - WorkerType?: WorkerType; + RegistryName?: string; +} +/** + * @public + */ +export interface UpdateSourceControlFromJobRequest { /** - *The number of workers of a defined workerType
that are allocated when a job runs.
The name of the Glue job to be synchronized to or from the remote repository.
* @public */ - NumberOfWorkers?: number; + JobName?: string; /** - *The name of the SecurityConfiguration
structure to be used with this
- * job.
+ * The provider for the remote repository. Possible values: GITHUB, AWS_CODE_COMMIT, GITLAB, BITBUCKET. + *
* @public */ - SecurityConfiguration?: string; + Provider?: SourceControlProvider; /** - *Specifies configuration properties of a job notification.
+ *The name of the remote repository that contains the job artifacts.
+ * For BitBucket providers, RepositoryName
should include WorkspaceName
.
+ * Use the format
.
+ *
In Spark jobs, GlueVersion
determines the versions of Apache Spark and Python
- * that Glue available in a job. The Python version indicates the version
- * supported for jobs of type Spark.
Ray jobs should set GlueVersion
to 4.0
or greater. However,
- * the versions of Ray, Python and additional libraries available in your Ray job are determined
- * by the Runtime
parameter of the Job command.
For more information about the available Glue versions and corresponding - * Spark and Python versions, see Glue version in the developer - * guide.
- *Jobs that are created without specifying a Glue version default to Glue 0.9.
+ *The owner of the remote repository that contains the job artifacts.
* @public */ - GlueVersion?: string; + RepositoryOwner?: string; /** - *The representation of a directed acyclic graph on which both the Glue Studio visual component and Glue Studio code generation is based.
+ *An optional branch in the remote repository.
* @public */ - CodeGenConfigurationNodes?: RecordIndicates whether the job is run with a standard or flexible execution class. The standard execution class is ideal for time-sensitive workloads that require fast job startup and dedicated resources.
- *The flexible execution class is appropriate for time-insensitive jobs whose start and completion times may vary.
- *Only jobs with Glue version 3.0 and above and command type glueetl
will be allowed to set ExecutionClass
to FLEX
. The flexible execution class is available for Spark jobs.
An optional folder in the remote repository.
* @public */ - ExecutionClass?: ExecutionClass; + Folder?: string; /** - *The details for a source control configuration for a job, allowing synchronization of job artifacts to or from a remote repository.
+ *A commit ID for a commit in the remote repository.
* @public */ - SourceControlDetails?: SourceControlDetails; + CommitId?: string; /** - *This field specifies a day of the week and hour for a maintenance window for streaming jobs. Glue periodically performs maintenance activities. During these maintenance windows, Glue will need to restart your streaming jobs.
- *Glue will restart the job within 3 hours of the specified maintenance window. For instance, if you set up the maintenance window for Monday at 10:00AM GMT, your jobs will be restarted between 10:00AM GMT to 1:00PM GMT.
+ *The type of authentication, which can be an authentication token stored in Amazon Web Services Secrets Manager, or a personal access token.
* @public */ - MaintenanceWindow?: string; + AuthStrategy?: SourceControlAuthStrategy; /** - *The name of an Glue usage profile associated with the job.
+ *The value of the authorization token.
* @public */ - ProfileName?: string; + AuthToken?: string; } /** - *Specifies information used to update an existing job definition. The previous job - * definition is completely overwritten by this information.
* @public */ -export interface JobUpdate { +export interface UpdateSourceControlFromJobResponse { /** - *A mode that describes how a job was created. Valid values are:
- *
- * SCRIPT
- The job was created using the Glue Studio script editor.
- * VISUAL
- The job was created using the Glue Studio visual editor.
- * NOTEBOOK
- The job was created using an interactive sessions notebook.
When the JobMode
field is missing or null, SCRIPT
is assigned as the default value.
The name of the Glue job.
* @public */ - JobMode?: JobMode; + JobName?: string; +} - /** - *Description of the job being defined.
- * @public - */ - Description?: string; +/** + * @public + * @enum + */ +export const ViewUpdateAction = { + ADD: "ADD", + ADD_OR_REPLACE: "ADD_OR_REPLACE", + DROP: "DROP", + REPLACE: "REPLACE", +} as const; - /** - *This field is reserved for future use.
- * @public - */ - LogUri?: string; +/** + * @public + */ +export type ViewUpdateAction = (typeof ViewUpdateAction)[keyof typeof ViewUpdateAction]; +/** + * @public + */ +export interface UpdateTableRequest { /** - *The name or Amazon Resource Name (ARN) of the IAM role associated with this job - * (required).
+ *The ID of the Data Catalog where the table resides. If none is provided, the Amazon Web Services account + * ID is used by default.
* @public */ - Role?: string; + CatalogId?: string; /** - *An ExecutionProperty
specifying the maximum number of concurrent runs allowed
- * for this job.
The name of the catalog database in which the table resides. For Hive + * compatibility, this name is entirely lowercase.
* @public */ - ExecutionProperty?: ExecutionProperty; + DatabaseName: string | undefined; /** - *The JobCommand
that runs this job (required).
An updated TableInput
object to define the metadata table
+ * in the catalog.
The default arguments for every run of this job, specified as name-value pairs.
- *You can specify arguments here that your own job-execution script - * consumes, as well as arguments that Glue itself consumes.
- *Job arguments may be logged. Do not pass plaintext secrets as arguments. Retrieve secrets - * from a Glue Connection, Secrets Manager or other secret management - * mechanism if you intend to keep them within the Job.
- *For information about how to specify and consume your own Job arguments, see the Calling Glue APIs in Python topic in the developer guide.
- *For information about the arguments you can provide to this field when configuring Spark jobs, - * see the Special Parameters Used by Glue topic in the developer guide.
- *For information about the arguments you can provide to this field when configuring Ray - * jobs, see Using - * job parameters in Ray jobs in the developer guide.
+ *By default, UpdateTable
always creates an archived version of the table
+ * before updating it. However, if skipArchive
is set to true,
+ * UpdateTable
does not create the archived version.
Arguments for this job that are not overridden when providing job arguments - * in a job run, specified as name-value pairs.
+ *The transaction ID at which to update the table contents.
* @public */ - NonOverridableArguments?: RecordThe connections used for this job.
+ *The version ID at which to update the table contents.
* @public */ - Connections?: ConnectionsList; + VersionId?: string; /** - *The maximum number of times to retry this job if it fails.
+ *The operation to be performed when updating the view.
* @public */ - MaxRetries?: number; + ViewUpdateAction?: ViewUpdateAction; /** - * @deprecated - * - *This field is deprecated. Use MaxCapacity
instead.
The number of Glue data processing units (DPUs) to allocate to this job. You can - * allocate a minimum of 2 DPUs; the default is 10. A DPU is a relative measure of processing - * power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, - * see the Glue pricing - * page.
+ *A flag that can be set to true to ignore matching storage descriptor and subobject matching requirements.
* @public */ - AllocatedCapacity?: number; + Force?: boolean; +} - /** - *The job timeout in minutes. This is the maximum time that a job run
- * can consume resources before it is terminated and enters TIMEOUT
- * status. The default is 2,880 minutes (48 hours) for batch jobs.
Streaming jobs must have timeout values less than 7 days or 10080 minutes. When the value is left blank, the job will be restarted after 7 days based if you have not setup a maintenance window. If you have setup maintenance window, it will be restarted during the maintenance window after 7 days.
- * @public - */ - Timeout?: number; +/** + * @public + */ +export interface UpdateTableResponse {} +/** + * @public + */ +export interface UpdateTableOptimizerRequest { /** - *For Glue version 1.0 or earlier jobs, using the standard worker type, the number of - * Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is - * a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB - * of memory. For more information, see the - * Glue pricing page.
- *For Glue version 2.0+ jobs, you cannot specify a Maximum capacity
.
- * Instead, you should specify a Worker type
and the Number of workers
.
Do not set MaxCapacity
if using WorkerType
and NumberOfWorkers
.
The value that can be allocated for MaxCapacity
depends on whether you are
- * running a Python shell job, an Apache Spark ETL job, or an Apache Spark streaming ETL
- * job:
When you specify a Python shell job (JobCommand.Name
="pythonshell"), you can
- * allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.
When you specify an Apache Spark ETL job (JobCommand.Name
="glueetl") or Apache
- * Spark streaming ETL job (JobCommand.Name
="gluestreaming"), you can allocate from 2 to 100 DPUs.
- * The default is 10 DPUs. This job type cannot have a fractional DPU allocation.
The Catalog ID of the table.
* @public */ - MaxCapacity?: number; + CatalogId: string | undefined; /** - *The type of predefined worker that is allocated when a job runs. Accepts a value of - * G.1X, G.2X, G.4X, G.8X or G.025X for Spark jobs. Accepts the value Z.2X for Ray jobs.
- *For the G.1X
worker type, each worker maps to 1 DPU (4 vCPUs, 16 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, to offers a scalable and cost effective way to run most jobs.
For the G.2X
worker type, each worker maps to 2 DPU (8 vCPUs, 32 GB of memory) with 128GB disk (approximately 77GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, to offers a scalable and cost effective way to run most jobs.
For the G.4X
worker type, each worker maps to 4 DPU (16 vCPUs, 64 GB of memory) with 256GB disk (approximately 235GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs in the following Amazon Web Services Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm).
For the G.8X
worker type, each worker maps to 8 DPU (32 vCPUs, 128 GB of memory) with 512GB disk (approximately 487GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs, in the same Amazon Web Services Regions as supported for the G.4X
worker type.
For the G.025X
worker type, each worker maps to 0.25 DPU (2 vCPUs, 4 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for low volume streaming jobs. This worker type is only available for Glue version 3.0 streaming jobs.
For the Z.2X
worker type, each worker maps to 2 M-DPU (8vCPUs, 64 GB of memory) with 128 GB disk (approximately 120GB free), and provides up to 8 Ray workers based on the autoscaler.
The name of the database in the catalog in which the table resides.
* @public */ - WorkerType?: WorkerType; + DatabaseName: string | undefined; /** - *The number of workers of a defined workerType
that are allocated when a job runs.
The name of the table.
* @public */ - NumberOfWorkers?: number; + TableName: string | undefined; /** - *The name of the SecurityConfiguration
structure to be used with this
- * job.
The type of table optimizer. Currently, the only valid value is compaction
.
Specifies the configuration properties of a job notification.
+ *A TableOptimizerConfiguration
object representing the configuration of a table optimizer.
A structure used to provide information used to update a trigger. This object updates the + * previous trigger definition by overwriting it completely.
+ * @public + */ +export interface TriggerUpdate { /** - *In Spark jobs, GlueVersion
determines the versions of Apache Spark and Python
- * that Glue available in a job. The Python version indicates the version
- * supported for jobs of type Spark.
Ray jobs should set GlueVersion
to 4.0
or greater. However,
- * the versions of Ray, Python and additional libraries available in your Ray job are determined
- * by the Runtime
parameter of the Job command.
For more information about the available Glue versions and corresponding - * Spark and Python versions, see Glue version in the developer - * guide.
- *Jobs that are created without specifying a Glue version default to Glue 0.9.
+ *Reserved for future use.
* @public */ - GlueVersion?: string; + Name?: string; /** - *The representation of a directed acyclic graph on which both the Glue Studio visual component and Glue Studio code generation is based.
+ *A description of this trigger.
* @public */ - CodeGenConfigurationNodes?: RecordIndicates whether the job is run with a standard or flexible execution class. The standard execution-class is ideal for time-sensitive workloads that require fast job startup and dedicated resources.
- *The flexible execution class is appropriate for time-insensitive jobs whose start and completion times may vary.
- *Only jobs with Glue version 3.0 and above and command type glueetl
will be allowed to set ExecutionClass
to FLEX
. The flexible execution class is available for Spark jobs.
A cron
expression used to specify the schedule (see Time-Based
+ * Schedules for Jobs and Crawlers. For example, to run
+ * something every day at 12:15 UTC, you would specify:
+ * cron(15 12 * * ? *)
.
The details for a source control configuration for a job, allowing synchronization of job artifacts to or from a remote repository.
+ *The actions initiated by this trigger.
* @public */ - SourceControlDetails?: SourceControlDetails; + Actions?: Action[]; /** - *This field specifies a day of the week and hour for a maintenance window for streaming jobs. Glue periodically performs maintenance activities. During these maintenance windows, Glue will need to restart your streaming jobs.
- *Glue will restart the job within 3 hours of the specified maintenance window. For instance, if you set up the maintenance window for Monday at 10:00AM GMT, your jobs will be restarted between 10:00AM GMT to 1:00PM GMT.
+ *The predicate of this trigger, which defines when it will fire.
* @public */ - MaintenanceWindow?: string; -} + Predicate?: Predicate; -/** - * @public - */ -export interface GetJobResponse { /** - *The requested job definition.
+ *Batch condition that must be met (specified number of events received or batch time window expired) + * before EventBridge event trigger fires.
* @public */ - Job?: Job; + EventBatchingCondition?: EventBatchingCondition; } /** * @public */ -export interface UpdateJobRequest { +export interface UpdateTriggerRequest { /** - *The name of the job definition to update.
+ *The name of the trigger to update.
* @public */ - JobName: string | undefined; + Name: string | undefined; /** - *Specifies the values with which to update the job definition. Unspecified configuration is removed or reset to default values.
+ *The new values with which to update the trigger.
* @public */ - JobUpdate: JobUpdate | undefined; + TriggerUpdate: TriggerUpdate | undefined; } /** * @public */ -export interface BatchGetJobsResponse { - /** - *A list of job definitions.
- * @public - */ - Jobs?: Job[]; - +export interface UpdateTriggerResponse { /** - *A list of names of jobs not found.
+ *The resulting trigger definition.
* @public */ - JobsNotFound?: string[]; + Trigger?: Trigger; } /** * @public */ -export interface GetJobsResponse { +export interface UpdateUsageProfileRequest { /** - *A list of job definitions.
+ *The name of the usage profile.
* @public */ - Jobs?: Job[]; + Name: string | undefined; /** - *A continuation token, if not all job definitions have yet been returned.
+ *A description of the usage profile.
* @public */ - NextToken?: string; -} - -/** - * @internal - */ -export const CreateJobRequestFilterSensitiveLog = (obj: CreateJobRequest): any => ({ - ...obj, - ...(obj.CodeGenConfigurationNodes && { CodeGenConfigurationNodes: SENSITIVE_STRING }), -}); - -/** - * @internal - */ -export const JobFilterSensitiveLog = (obj: Job): any => ({ - ...obj, - ...(obj.CodeGenConfigurationNodes && { CodeGenConfigurationNodes: SENSITIVE_STRING }), -}); - -/** - * @internal - */ -export const JobUpdateFilterSensitiveLog = (obj: JobUpdate): any => ({ - ...obj, - ...(obj.CodeGenConfigurationNodes && { CodeGenConfigurationNodes: SENSITIVE_STRING }), -}); - -/** - * @internal - */ -export const GetJobResponseFilterSensitiveLog = (obj: GetJobResponse): any => ({ - ...obj, - ...(obj.Job && { Job: JobFilterSensitiveLog(obj.Job) }), -}); + Description?: string; -/** - * @internal - */ -export const UpdateJobRequestFilterSensitiveLog = (obj: UpdateJobRequest): any => ({ - ...obj, - ...(obj.JobUpdate && { JobUpdate: JobUpdateFilterSensitiveLog(obj.JobUpdate) }), -}); + /** + *A ProfileConfiguration
object specifying the job and session values for the profile.
The name of the usage profile that was updated.
+ * @public + */ + Name?: string; +} + +/** + * @public + */ +export interface UpdateUserDefinedFunctionRequest { + /** + *The ID of the Data Catalog where the function to be updated is located. If none is + * provided, the Amazon Web Services account ID is used by default.
+ * @public + */ + CatalogId?: string; + + /** + *The name of the catalog database where the function to be updated is + * located.
+ * @public + */ + DatabaseName: string | undefined; + + /** + *The name of the function.
+ * @public + */ + FunctionName: string | undefined; + + /** + *A FunctionInput
object that redefines the function in the Data
+ * Catalog.
Name of the workflow to be updated.
+ * @public + */ + Name: string | undefined; + + /** + *The description of the workflow.
+ * @public + */ + Description?: string; + + /** + *A collection of properties to be used as part of each execution of the workflow.
+ * @public + */ + DefaultRunProperties?: RecordYou can use this parameter to prevent unwanted multiple updates to data, to control costs, or in some cases, to prevent exceeding the maximum number of concurrent runs of any of the component jobs. If you leave this parameter blank, there is no limit to the number of concurrent workflow runs.
+ * @public + */ + MaxConcurrentRuns?: number; +} + +/** + * @public + */ +export interface UpdateWorkflowResponse { + /** + *The name of the workflow which was specified in input.
+ * @public + */ + Name?: string; +} + +/** + *Specifies the mapping of data property keys.
+ * @public + */ +export interface Mapping { + /** + *After the apply mapping, what the name of the column should be. Can be the same as FromPath
.
The table or column to be modified.
+ * @public + */ + FromPath?: string[]; + + /** + *The type of the data to be modified.
+ * @public + */ + FromType?: string; + + /** + *The data type that the data is to be modified to.
+ * @public + */ + ToType?: string; + + /** + *If true, then the column is removed.
+ * @public + */ + Dropped?: boolean; + + /** + *Only applicable to nested data structures. If you want to change the parent structure, but also one of its children, you can fill out this data strucutre. It is also Mapping
, but its FromPath
will be the parent's FromPath
plus the FromPath
from this structure.
For the children part, suppose you have the structure:
+ *
+ * \{
+ * "FromPath": "OuterStructure",
+ * "ToKey": "OuterStructure",
+ * "ToType": "Struct",
+ * "Dropped": false,
+ * "Chidlren": [\{
+ * "FromPath": "inner",
+ * "ToKey": "inner",
+ * "ToType": "Double",
+ * "Dropped": false,
+ * \}]
+ * \}
+ *
You can specify a Mapping
that looks like:
+ * \{
+ * "FromPath": "OuterStructure",
+ * "ToKey": "OuterStructure",
+ * "ToType": "Struct",
+ * "Dropped": false,
+ * "Chidlren": [\{
+ * "FromPath": "inner",
+ * "ToKey": "inner",
+ * "ToType": "Double",
+ * "Dropped": false,
+ * \}]
+ * \}
+ *
Specifies a transform that maps data property keys in the data source to data property keys in the data target. You can rename keys, modify the data types for keys, and choose which keys to drop from the dataset.
+ * @public + */ +export interface ApplyMapping { + /** + *The name of the transform node.
+ * @public + */ + Name: string | undefined; + + /** + *The data inputs identified by their node names.
+ * @public + */ + Inputs: string[] | undefined; + + /** + *Specifies the mapping of data property keys in the data source to data property keys in the data target.
+ * @public + */ + Mapping: Mapping[] | undefined; +} + +/** + *
+ * CodeGenConfigurationNode
enumerates all valid Node types. One and only one of its member variables can be populated.
Specifies a connector to an Amazon Athena data source.
+ * @public + */ + AthenaConnectorSource?: AthenaConnectorSource; + + /** + *Specifies a connector to a JDBC data source.
+ * @public + */ + JDBCConnectorSource?: JDBCConnectorSource; + + /** + *Specifies a connector to an Apache Spark data source.
+ * @public + */ + SparkConnectorSource?: SparkConnectorSource; + + /** + *Specifies a data store in the Glue Data Catalog.
+ * @public + */ + CatalogSource?: CatalogSource; + + /** + *Specifies an Amazon Redshift data store.
+ * @public + */ + RedshiftSource?: RedshiftSource; + + /** + *Specifies an Amazon S3 data store in the Glue Data Catalog.
+ * @public + */ + S3CatalogSource?: S3CatalogSource; + + /** + *Specifies a command-separated value (CSV) data store stored in Amazon S3.
+ * @public + */ + S3CsvSource?: S3CsvSource; + + /** + *Specifies a JSON data store stored in Amazon S3.
+ * @public + */ + S3JsonSource?: S3JsonSource; + + /** + *Specifies an Apache Parquet data store stored in Amazon S3.
+ * @public + */ + S3ParquetSource?: S3ParquetSource; + + /** + *Specifies a relational catalog data store in the Glue Data Catalog.
+ * @public + */ + RelationalCatalogSource?: RelationalCatalogSource; + + /** + *Specifies a DynamoDBC Catalog data store in the Glue Data Catalog.
+ * @public + */ + DynamoDBCatalogSource?: DynamoDBCatalogSource; + + /** + *Specifies a data target that writes to Amazon S3 in Apache Parquet columnar storage.
+ * @public + */ + JDBCConnectorTarget?: JDBCConnectorTarget; + + /** + *Specifies a target that uses an Apache Spark connector.
+ * @public + */ + SparkConnectorTarget?: SparkConnectorTarget; + + /** + *Specifies a target that uses a Glue Data Catalog table.
+ * @public + */ + CatalogTarget?: BasicCatalogTarget; + + /** + *Specifies a target that uses Amazon Redshift.
+ * @public + */ + RedshiftTarget?: RedshiftTarget; + + /** + *Specifies a data target that writes to Amazon S3 using the Glue Data Catalog.
+ * @public + */ + S3CatalogTarget?: S3CatalogTarget; + + /** + *Specifies a data target that writes to Amazon S3 in Apache Parquet columnar storage.
+ * @public + */ + S3GlueParquetTarget?: S3GlueParquetTarget; + + /** + *Specifies a data target that writes to Amazon S3.
+ * @public + */ + S3DirectTarget?: S3DirectTarget; + + /** + *Specifies a transform that maps data property keys in the data source to data property keys in the data target. You can rename keys, modify the data types for keys, and choose which keys to drop from the dataset.
+ * @public + */ + ApplyMapping?: ApplyMapping; + + /** + *Specifies a transform that chooses the data property keys that you want to keep.
+ * @public + */ + SelectFields?: SelectFields; + + /** + *Specifies a transform that chooses the data property keys that you want to drop.
+ * @public + */ + DropFields?: DropFields; + + /** + *Specifies a transform that renames a single data property key.
+ * @public + */ + RenameField?: RenameField; + + /** + *Specifies a transform that writes samples of the data to an Amazon S3 bucket.
+ * @public + */ + Spigot?: Spigot; + + /** + *Specifies a transform that joins two datasets into one dataset using a comparison phrase on the specified data property keys. You can use inner, outer, left, right, left semi, and left anti joins.
+ * @public + */ + Join?: Join; + + /** + *Specifies a transform that splits data property keys into two DynamicFrames
. The output is a collection of DynamicFrames
: one with selected data property keys, and one with the remaining data property keys.
Specifies a transform that chooses one DynamicFrame
from a collection of DynamicFrames
. The output is the selected DynamicFrame
+ *
Specifies a transform that locates records in the dataset that have missing values and adds a new field with a value determined by imputation. The input data set is used to train the machine learning model that determines what the missing value should be.
+ * @public + */ + FillMissingValues?: FillMissingValues; + + /** + *Specifies a transform that splits a dataset into two, based on a filter condition.
+ * @public + */ + Filter?: Filter; + + /** + *Specifies a transform that uses custom code you provide to perform the data transformation. The output is a collection of DynamicFrames.
+ * @public + */ + CustomCode?: CustomCode; + + /** + *Specifies a transform where you enter a SQL query using Spark SQL syntax to transform the data. The output is a single DynamicFrame
.
Specifies a direct Amazon Kinesis data source.
+ * @public + */ + DirectKinesisSource?: DirectKinesisSource; + + /** + *Specifies an Apache Kafka data store.
+ * @public + */ + DirectKafkaSource?: DirectKafkaSource; + + /** + *Specifies a Kinesis data source in the Glue Data Catalog.
+ * @public + */ + CatalogKinesisSource?: CatalogKinesisSource; + + /** + *Specifies an Apache Kafka data store in the Data Catalog.
+ * @public + */ + CatalogKafkaSource?: CatalogKafkaSource; + + /** + *Specifies a transform that removes columns from the dataset if all values in the column are 'null'. By default, Glue Studio will recognize null objects, but some values such as empty strings, strings that are "null", -1 integers or other placeholders such as zeros, are not automatically recognized as nulls.
+ * @public + */ + DropNullFields?: DropNullFields; + + /** + *Specifies a transform that merges a DynamicFrame
with a staging DynamicFrame
based on the specified primary keys to identify records. Duplicate records (records with the same primary keys) are not de-duplicated.
Specifies a transform that combines the rows from two or more datasets into a single result.
+ * @public + */ + Union?: Union; + + /** + *Specifies a transform that identifies, removes or masks PII data.
+ * @public + */ + PIIDetection?: PIIDetection; + + /** + *Specifies a transform that groups rows by chosen fields and computes the aggregated value by specified function.
+ * @public + */ + Aggregate?: Aggregate; + + /** + *Specifies a transform that removes rows of repeating data from a data set.
+ * @public + */ + DropDuplicates?: DropDuplicates; + + /** + *Specifies a data target that writes to a goverened catalog.
+ * @public + */ + GovernedCatalogTarget?: GovernedCatalogTarget; + + /** + *Specifies a data source in a goverened Data Catalog.
+ * @public + */ + GovernedCatalogSource?: GovernedCatalogSource; + + /** + *Specifies a Microsoft SQL server data source in the Glue Data Catalog.
+ * @public + */ + MicrosoftSQLServerCatalogSource?: MicrosoftSQLServerCatalogSource; + + /** + *Specifies a MySQL data source in the Glue Data Catalog.
+ * @public + */ + MySQLCatalogSource?: MySQLCatalogSource; + + /** + *Specifies an Oracle data source in the Glue Data Catalog.
+ * @public + */ + OracleSQLCatalogSource?: OracleSQLCatalogSource; + + /** + *Specifies a PostgresSQL data source in the Glue Data Catalog.
+ * @public + */ + PostgreSQLCatalogSource?: PostgreSQLCatalogSource; + + /** + *Specifies a target that uses Microsoft SQL.
+ * @public + */ + MicrosoftSQLServerCatalogTarget?: MicrosoftSQLServerCatalogTarget; + + /** + *Specifies a target that uses MySQL.
+ * @public + */ + MySQLCatalogTarget?: MySQLCatalogTarget; + + /** + *Specifies a target that uses Oracle SQL.
+ * @public + */ + OracleSQLCatalogTarget?: OracleSQLCatalogTarget; + + /** + *Specifies a target that uses Postgres SQL.
+ * @public + */ + PostgreSQLCatalogTarget?: PostgreSQLCatalogTarget; + + /** + *Specifies a custom visual transform created by a user.
+ * @public + */ + DynamicTransform?: DynamicTransform; + + /** + *Specifies your data quality evaluation criteria.
+ * @public + */ + EvaluateDataQuality?: EvaluateDataQuality; + + /** + *Specifies a Hudi data source that is registered in the Glue Data Catalog. The data source must be stored in Amazon S3.
+ * @public + */ + S3CatalogHudiSource?: S3CatalogHudiSource; + + /** + *Specifies a Hudi data source that is registered in the Glue Data Catalog.
+ * @public + */ + CatalogHudiSource?: CatalogHudiSource; + + /** + *Specifies a Hudi data source stored in Amazon S3.
+ * @public + */ + S3HudiSource?: S3HudiSource; + + /** + *Specifies a target that writes to a Hudi data source in the Glue Data Catalog.
+ * @public + */ + S3HudiCatalogTarget?: S3HudiCatalogTarget; + + /** + *Specifies a target that writes to a Hudi data source in Amazon S3.
+ * @public + */ + S3HudiDirectTarget?: S3HudiDirectTarget; + + /** + *Specifies the direct JDBC source connection.
+ * @public + */ + DirectJDBCSource?: DirectJDBCSource; + + /** + *Specifies a Delta Lake data source that is registered in the Glue Data Catalog. The data source must be stored in Amazon S3.
+ * @public + */ + S3CatalogDeltaSource?: S3CatalogDeltaSource; + + /** + *Specifies a Delta Lake data source that is registered in the Glue Data Catalog.
+ * @public + */ + CatalogDeltaSource?: CatalogDeltaSource; + + /** + *Specifies a Delta Lake data source stored in Amazon S3.
+ * @public + */ + S3DeltaSource?: S3DeltaSource; + + /** + *Specifies a target that writes to a Delta Lake data source in the Glue Data Catalog.
+ * @public + */ + S3DeltaCatalogTarget?: S3DeltaCatalogTarget; + + /** + *Specifies a target that writes to a Delta Lake data source in Amazon S3.
+ * @public + */ + S3DeltaDirectTarget?: S3DeltaDirectTarget; + + /** + *Specifies a target that writes to a data source in Amazon Redshift.
+ * @public + */ + AmazonRedshiftSource?: AmazonRedshiftSource; + + /** + *Specifies a target that writes to a data target in Amazon Redshift.
+ * @public + */ + AmazonRedshiftTarget?: AmazonRedshiftTarget; + + /** + *Specifies your data quality evaluation criteria. Allows multiple input data and returns a collection of Dynamic Frames.
+ * @public + */ + EvaluateDataQualityMultiFrame?: EvaluateDataQualityMultiFrame; + + /** + *Specifies a Glue DataBrew recipe node.
+ * @public + */ + Recipe?: Recipe; + + /** + *Specifies a Snowflake data source.
+ * @public + */ + SnowflakeSource?: SnowflakeSource; + + /** + *Specifies a target that writes to a Snowflake data source.
+ * @public + */ + SnowflakeTarget?: SnowflakeTarget; + + /** + *Specifies a source generated with standard connection options.
+ * @public + */ + ConnectorDataSource?: ConnectorDataSource; + + /** + *Specifies a target generated with standard connection options.
+ * @public + */ + ConnectorDataTarget?: ConnectorDataTarget; +} + +/** + * @public + */ +export interface CreateJobRequest { + /** + *The name you assign to this job definition. It must be unique in your account.
+ * @public + */ + Name: string | undefined; + + /** + *A mode that describes how a job was created. Valid values are:
+ *
+ * SCRIPT
- The job was created using the Glue Studio script editor.
+ * VISUAL
- The job was created using the Glue Studio visual editor.
+ * NOTEBOOK
- The job was created using an interactive sessions notebook.
When the JobMode
field is missing or null, SCRIPT
is assigned as the default value.
Description of the job being defined.
+ * @public + */ + Description?: string; + + /** + *This field is reserved for future use.
+ * @public + */ + LogUri?: string; + + /** + *The name or Amazon Resource Name (ARN) of the IAM role associated with this job.
+ * @public + */ + Role: string | undefined; + + /** + *An ExecutionProperty
specifying the maximum number of concurrent runs allowed
+ * for this job.
The JobCommand
that runs this job.
The default arguments for every run of this job, specified as name-value pairs.
+ *You can specify arguments here that your own job-execution script + * consumes, as well as arguments that Glue itself consumes.
+ *Job arguments may be logged. Do not pass plaintext secrets as arguments. Retrieve secrets + * from a Glue Connection, Secrets Manager or other secret management + * mechanism if you intend to keep them within the Job.
+ *For information about how to specify and consume your own Job arguments, see the Calling Glue APIs in Python topic in the developer guide.
+ *For information about the arguments you can provide to this field when configuring Spark jobs, + * see the Special Parameters Used by Glue topic in the developer guide.
+ *For information about the arguments you can provide to this field when configuring Ray + * jobs, see Using + * job parameters in Ray jobs in the developer guide.
+ * @public + */ + DefaultArguments?: RecordArguments for this job that are not overridden when providing job arguments + * in a job run, specified as name-value pairs.
+ * @public + */ + NonOverridableArguments?: RecordThe connections used for this job.
+ * @public + */ + Connections?: ConnectionsList; + + /** + *The maximum number of times to retry this job if it fails.
+ * @public + */ + MaxRetries?: number; + + /** + * @deprecated + * + *This parameter is deprecated. Use MaxCapacity
instead.
The number of Glue data processing units (DPUs) to allocate to this Job. You can + * allocate a minimum of 2 DPUs; the default is 10. A DPU is a relative measure of processing + * power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, + * see the Glue pricing + * page.
+ * @public + */ + AllocatedCapacity?: number; + + /** + *The job timeout in minutes. This is the maximum time that a job run
+ * can consume resources before it is terminated and enters TIMEOUT
+ * status. The default is 2,880 minutes (48 hours) for batch jobs.
Streaming jobs must have timeout values less than 7 days or 10080 minutes. When the value is left blank, the job will be restarted after 7 days based if you have not setup a maintenance window. If you have setup maintenance window, it will be restarted during the maintenance window after 7 days.
+ * @public + */ + Timeout?: number; + + /** + *For Glue version 1.0 or earlier jobs, using the standard worker type, the number of + * Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is + * a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB + * of memory. For more information, see the + * Glue pricing page.
+ *For Glue version 2.0+ jobs, you cannot specify a Maximum capacity
.
+ * Instead, you should specify a Worker type
and the Number of workers
.
Do not set MaxCapacity
if using WorkerType
and NumberOfWorkers
.
The value that can be allocated for MaxCapacity
depends on whether you are
+ * running a Python shell job, an Apache Spark ETL job, or an Apache Spark streaming ETL
+ * job:
When you specify a Python shell job (JobCommand.Name
="pythonshell"), you can
+ * allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.
When you specify an Apache Spark ETL job (JobCommand.Name
="glueetl") or Apache
+ * Spark streaming ETL job (JobCommand.Name
="gluestreaming"), you can allocate from 2 to 100 DPUs.
+ * The default is 10 DPUs. This job type cannot have a fractional DPU allocation.
The name of the SecurityConfiguration
structure to be used with this
+ * job.
The tags to use with this job. You may use tags to limit access to the job. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.
+ * @public + */ + Tags?: RecordSpecifies configuration properties of a job notification.
+ * @public + */ + NotificationProperty?: NotificationProperty; + + /** + *In Spark jobs, GlueVersion
determines the versions of Apache Spark and Python
+ * that Glue available in a job. The Python version indicates the version
+ * supported for jobs of type Spark.
Ray jobs should set GlueVersion
to 4.0
or greater. However,
+ * the versions of Ray, Python and additional libraries available in your Ray job are determined
+ * by the Runtime
parameter of the Job command.
For more information about the available Glue versions and corresponding + * Spark and Python versions, see Glue version in the developer + * guide.
+ *Jobs that are created without specifying a Glue version default to Glue 0.9.
+ * @public + */ + GlueVersion?: string; + + /** + *The number of workers of a defined workerType
that are allocated when a job runs.
The type of predefined worker that is allocated when a job runs. Accepts a value of + * G.1X, G.2X, G.4X, G.8X or G.025X for Spark jobs. Accepts the value Z.2X for Ray jobs.
+ *For the G.1X
worker type, each worker maps to 1 DPU (4 vCPUs, 16 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, to offers a scalable and cost effective way to run most jobs.
For the G.2X
worker type, each worker maps to 2 DPU (8 vCPUs, 32 GB of memory) with 128GB disk (approximately 77GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, to offers a scalable and cost effective way to run most jobs.
For the G.4X
worker type, each worker maps to 4 DPU (16 vCPUs, 64 GB of memory) with 256GB disk (approximately 235GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs in the following Amazon Web Services Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm).
For the G.8X
worker type, each worker maps to 8 DPU (32 vCPUs, 128 GB of memory) with 512GB disk (approximately 487GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs, in the same Amazon Web Services Regions as supported for the G.4X
worker type.
For the G.025X
worker type, each worker maps to 0.25 DPU (2 vCPUs, 4 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for low volume streaming jobs. This worker type is only available for Glue version 3.0 streaming jobs.
For the Z.2X
worker type, each worker maps to 2 M-DPU (8vCPUs, 64 GB of memory) with 128 GB disk (approximately 120GB free), and provides up to 8 Ray workers based on the autoscaler.
The representation of a directed acyclic graph on which both the Glue Studio visual component and Glue Studio code generation is based.
+ * @public + */ + CodeGenConfigurationNodes?: RecordIndicates whether the job is run with a standard or flexible execution class. The standard execution-class is ideal for time-sensitive workloads that require fast job startup and dedicated resources.
+ *The flexible execution class is appropriate for time-insensitive jobs whose start and completion times may vary.
+ *Only jobs with Glue version 3.0 and above and command type glueetl
will be allowed to set ExecutionClass
to FLEX
. The flexible execution class is available for Spark jobs.
The details for a source control configuration for a job, allowing synchronization of job artifacts to or from a remote repository.
+ * @public + */ + SourceControlDetails?: SourceControlDetails; + + /** + *This field specifies a day of the week and hour for a maintenance window for streaming jobs. Glue periodically performs maintenance activities. During these maintenance windows, Glue will need to restart your streaming jobs.
+ *Glue will restart the job within 3 hours of the specified maintenance window. For instance, if you set up the maintenance window for Monday at 10:00AM GMT, your jobs will be restarted between 10:00AM GMT to 1:00PM GMT.
+ * @public + */ + MaintenanceWindow?: string; +} + +/** + *Specifies a job definition.
+ * @public + */ +export interface Job { + /** + *The name you assign to this job definition.
+ * @public + */ + Name?: string; + + /** + *A mode that describes how a job was created. Valid values are:
+ *
+ * SCRIPT
- The job was created using the Glue Studio script editor.
+ * VISUAL
- The job was created using the Glue Studio visual editor.
+ * NOTEBOOK
- The job was created using an interactive sessions notebook.
When the JobMode
field is missing or null, SCRIPT
is assigned as the default value.
A description of the job.
+ * @public + */ + Description?: string; + + /** + *This field is reserved for future use.
+ * @public + */ + LogUri?: string; + + /** + *The name or Amazon Resource Name (ARN) of the IAM role associated with this job.
+ * @public + */ + Role?: string; + + /** + *The time and date that this job definition was created.
+ * @public + */ + CreatedOn?: Date; + + /** + *The last point in time when this job definition was modified.
+ * @public + */ + LastModifiedOn?: Date; + + /** + *An ExecutionProperty
specifying the maximum number of concurrent runs allowed
+ * for this job.
The JobCommand
that runs this job.
The default arguments for every run of this job, specified as name-value pairs.
+ *You can specify arguments here that your own job-execution script + * consumes, as well as arguments that Glue itself consumes.
+ *Job arguments may be logged. Do not pass plaintext secrets as arguments. Retrieve secrets + * from a Glue Connection, Secrets Manager or other secret management + * mechanism if you intend to keep them within the Job.
+ *For information about how to specify and consume your own Job arguments, see the Calling Glue APIs in Python topic in the developer guide.
+ *For information about the arguments you can provide to this field when configuring Spark jobs, + * see the Special Parameters Used by Glue topic in the developer guide.
+ *For information about the arguments you can provide to this field when configuring Ray + * jobs, see Using + * job parameters in Ray jobs in the developer guide.
+ * @public + */ + DefaultArguments?: RecordArguments for this job that are not overridden when providing job arguments + * in a job run, specified as name-value pairs.
+ * @public + */ + NonOverridableArguments?: RecordThe connections used for this job.
+ * @public + */ + Connections?: ConnectionsList; + + /** + *The maximum number of times to retry this job after a JobRun fails.
+ * @public + */ + MaxRetries?: number; + + /** + * @deprecated + * + *This field is deprecated. Use MaxCapacity
instead.
The number of Glue data processing units (DPUs) allocated to runs of this job. You can + * allocate a minimum of 2 DPUs; the default is 10. A DPU is a relative measure of processing + * power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, + * see the Glue pricing + * page.
+ * + * @public + */ + AllocatedCapacity?: number; + + /** + *The job timeout in minutes. This is the maximum time that a job run
+ * can consume resources before it is terminated and enters TIMEOUT
+ * status. The default is 2,880 minutes (48 hours) for batch jobs.
Streaming jobs must have timeout values less than 7 days or 10080 minutes. When the value is left blank, the job will be restarted after 7 days based if you have not setup a maintenance window. If you have setup maintenance window, it will be restarted during the maintenance window after 7 days.
+ * @public + */ + Timeout?: number; + + /** + *For Glue version 1.0 or earlier jobs, using the standard worker type, the number of + * Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is + * a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB + * of memory. For more information, see the + * Glue pricing page.
+ *For Glue version 2.0 or later jobs, you cannot specify a Maximum capacity
.
+ * Instead, you should specify a Worker type
and the Number of workers
.
Do not set MaxCapacity
if using WorkerType
and NumberOfWorkers
.
The value that can be allocated for MaxCapacity
depends on whether you are
+ * running a Python shell job, an Apache Spark ETL job, or an Apache Spark streaming ETL
+ * job:
When you specify a Python shell job (JobCommand.Name
="pythonshell"), you can
+ * allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.
When you specify an Apache Spark ETL job (JobCommand.Name
="glueetl") or Apache
+ * Spark streaming ETL job (JobCommand.Name
="gluestreaming"), you can allocate from 2 to 100 DPUs.
+ * The default is 10 DPUs. This job type cannot have a fractional DPU allocation.
The type of predefined worker that is allocated when a job runs. Accepts a value of + * G.1X, G.2X, G.4X, G.8X or G.025X for Spark jobs. Accepts the value Z.2X for Ray jobs.
+ *For the G.1X
worker type, each worker maps to 1 DPU (4 vCPUs, 16 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, as it offers a scalable and cost-effective way to run most jobs.
For the G.2X
worker type, each worker maps to 2 DPU (8 vCPUs, 32 GB of memory) with 128GB disk (approximately 77GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, as it offers a scalable and cost-effective way to run most jobs.
For the G.4X
worker type, each worker maps to 4 DPU (16 vCPUs, 64 GB of memory) with 256GB disk (approximately 235GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs in the following Amazon Web Services Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm).
For the G.8X
worker type, each worker maps to 8 DPU (32 vCPUs, 128 GB of memory) with 512GB disk (approximately 487GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs, in the same Amazon Web Services Regions as supported for the G.4X
worker type.
For the G.025X
worker type, each worker maps to 0.25 DPU (2 vCPUs, 4 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for low volume streaming jobs. This worker type is only available for Glue version 3.0 streaming jobs.
For the Z.2X
worker type, each worker maps to 2 M-DPU (8 vCPUs, 64 GB of memory) with 128 GB disk (approximately 120GB free), and provides up to 8 Ray workers based on the autoscaler.
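Because MaxCapacity must not be combined with WorkerType and NumberOfWorkers, the two capacity styles look like this in practice (an illustrative fragment only; the numbers are arbitrary examples, not recommendations):

```ts
// Spark jobs on Glue 2.0+: size the job with workers and leave MaxCapacity unset.
const workerBasedCapacity = {
  WorkerType: "G.1X",   // 1 DPU (4 vCPUs, 16 GB) per worker
  NumberOfWorkers: 10,
};

// Python shell jobs: size the job with MaxCapacity instead.
const dpuBasedCapacity = {
  MaxCapacity: 0.0625,  // pythonshell accepts 0.0625 or 1 DPU
};
```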
The number of workers of a defined workerType
that are allocated when a job runs.
The name of the SecurityConfiguration
structure to be used with this
+ * job.
Specifies configuration properties of a job notification.
+ * @public + */ + NotificationProperty?: NotificationProperty; + + /** + *In Spark jobs, GlueVersion
determines the versions of Apache Spark and Python
+ * that Glue makes available in a job. The Python version indicates the version
+ * supported for jobs of type Spark.
Ray jobs should set GlueVersion
to 4.0
or greater. However,
+ * the versions of Ray, Python and additional libraries available in your Ray job are determined
+ * by the Runtime
parameter of the Job command.
For more information about the available Glue versions and corresponding + * Spark and Python versions, see Glue version in the developer + * guide.
+ *Jobs that are created without specifying a Glue version default to Glue 0.9.
+ * @public + */ + GlueVersion?: string; + + /** + *The representation of a directed acyclic graph on which both the Glue Studio visual component and Glue Studio code generation is based.
+ * @public + */ + CodeGenConfigurationNodes?: Record<string, CodeGenConfigurationNode>; + + /** + *Indicates whether the job is run with a standard or flexible execution class. The standard execution class is ideal for time-sensitive workloads that require fast job startup and dedicated resources.
+ *The flexible execution class is appropriate for time-insensitive jobs whose start and completion times may vary.
+ *Only jobs with Glue version 3.0 and above and command type glueetl
will be allowed to set ExecutionClass
to FLEX
. The flexible execution class is available for Spark jobs.
The details for a source control configuration for a job, allowing synchronization of job artifacts to or from a remote repository.
+ * @public + */ + SourceControlDetails?: SourceControlDetails; + + /** + *This field specifies a day of the week and hour for a maintenance window for streaming jobs. Glue periodically performs maintenance activities. During these maintenance windows, Glue will need to restart your streaming jobs.
Glue will restart the job within 3 hours of the specified maintenance window. For instance, if you set up the maintenance window for Monday at 10:00AM GMT, your jobs will be restarted between 10:00AM GMT and 1:00PM GMT.
+ * @public + */ + MaintenanceWindow?: string; + + /** + *The name of an Glue usage profile associated with the job.
+ * @public + */ + ProfileName?: string; +} + +/** + *Specifies information used to update an existing job definition. The previous job + * definition is completely overwritten by this information.
+ * @public + */ +export interface JobUpdate { + /** + *A mode that describes how a job was created. Valid values are:
+ * SCRIPT - The job was created using the Glue Studio script editor.
+ * VISUAL - The job was created using the Glue Studio visual editor.
+ * NOTEBOOK - The job was created using an interactive sessions notebook.
+ * When the JobMode field is missing or null, SCRIPT is assigned as the default value.
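A sketch of sending a JobUpdate through UpdateJobCommand; the job name, role ARN, and script location are placeholders. Because the previous definition is completely overwritten, a real call should include every setting the job still needs:

```ts
import { GlueClient, UpdateJobCommand } from "@aws-sdk/client-glue";

const client = new GlueClient({});

await client.send(
  new UpdateJobCommand({
    JobName: "my-etl-job", // placeholder
    JobUpdate: {
      Role: "arn:aws:iam::123456789012:role/MyGlueJobRole", // placeholder
      Command: { Name: "glueetl", ScriptLocation: "s3://my-bucket/scripts/job.py" },
      GlueVersion: "4.0",
      WorkerType: "G.1X",
      NumberOfWorkers: 10,
      DefaultArguments: { "--enable-metrics": "true" },
    },
  })
);
```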
Description of the job being defined.
+ * @public + */ + Description?: string; + + /** + *This field is reserved for future use.
+ * @public + */ + LogUri?: string; + + /** + *The name or Amazon Resource Name (ARN) of the IAM role associated with this job + * (required).
+ * @public + */ + Role?: string; + + /** + *An ExecutionProperty
specifying the maximum number of concurrent runs allowed
+ * for this job.
The JobCommand
that runs this job (required).
The default arguments for every run of this job, specified as name-value pairs.
+ *You can specify arguments here that your own job-execution script + * consumes, as well as arguments that Glue itself consumes.
+ *Job arguments may be logged. Do not pass plaintext secrets as arguments. Retrieve secrets + * from a Glue Connection, Secrets Manager or other secret management + * mechanism if you intend to keep them within the Job.
+ *For information about how to specify and consume your own Job arguments, see the Calling Glue APIs in Python topic in the developer guide.
+ *For information about the arguments you can provide to this field when configuring Spark jobs, + * see the Special Parameters Used by Glue topic in the developer guide.
+ *For information about the arguments you can provide to this field when configuring Ray + * jobs, see Using + * job parameters in Ray jobs in the developer guide.
+ * @public + */ + DefaultArguments?: Record<string, string>; + + /** + *Arguments for this job that are not overridden when providing job arguments + * in a job run, specified as name-value pairs.
+ * @public + */ + NonOverridableArguments?: Record<string, string>; + + /** + *The connections used for this job.
+ * @public + */ + Connections?: ConnectionsList; + + /** + *The maximum number of times to retry this job if it fails.
+ * @public + */ + MaxRetries?: number; + + /** + * @deprecated + * + *This field is deprecated. Use MaxCapacity
instead.
The number of Glue data processing units (DPUs) to allocate to this job. You can + * allocate a minimum of 2 DPUs; the default is 10. A DPU is a relative measure of processing + * power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, + * see the Glue pricing + * page.
+ * @public + */ + AllocatedCapacity?: number; + + /** + *The job timeout in minutes. This is the maximum time that a job run
+ * can consume resources before it is terminated and enters TIMEOUT
+ * status. The default is 2,880 minutes (48 hours) for batch jobs.
Streaming jobs must have timeout values less than 7 days or 10080 minutes. When the value is left blank, the job will be restarted after 7 days if you have not set up a maintenance window. If you have set up a maintenance window, it will be restarted during the maintenance window after 7 days.
+ * @public + */ + Timeout?: number; + + /** + *For Glue version 1.0 or earlier jobs, using the standard worker type, the number of + * Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is + * a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB + * of memory. For more information, see the + * Glue pricing page.
+ *For Glue version 2.0+ jobs, you cannot specify a Maximum capacity
.
+ * Instead, you should specify a Worker type
and the Number of workers
.
Do not set MaxCapacity
if using WorkerType
and NumberOfWorkers
.
The value that can be allocated for MaxCapacity
depends on whether you are
+ * running a Python shell job, an Apache Spark ETL job, or an Apache Spark streaming ETL
+ * job:
When you specify a Python shell job (JobCommand.Name
="pythonshell"), you can
+ * allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.
When you specify an Apache Spark ETL job (JobCommand.Name
="glueetl") or Apache
+ * Spark streaming ETL job (JobCommand.Name
="gluestreaming"), you can allocate from 2 to 100 DPUs.
+ * The default is 10 DPUs. This job type cannot have a fractional DPU allocation.
The type of predefined worker that is allocated when a job runs. Accepts a value of + * G.1X, G.2X, G.4X, G.8X or G.025X for Spark jobs. Accepts the value Z.2X for Ray jobs.
+ *For the G.1X
worker type, each worker maps to 1 DPU (4 vCPUs, 16 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, as it offers a scalable and cost-effective way to run most jobs.
For the G.2X
worker type, each worker maps to 2 DPU (8 vCPUs, 32 GB of memory) with 128GB disk (approximately 77GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, as it offers a scalable and cost-effective way to run most jobs.
For the G.4X
worker type, each worker maps to 4 DPU (16 vCPUs, 64 GB of memory) with 256GB disk (approximately 235GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs in the following Amazon Web Services Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm).
For the G.8X
worker type, each worker maps to 8 DPU (32 vCPUs, 128 GB of memory) with 512GB disk (approximately 487GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs, in the same Amazon Web Services Regions as supported for the G.4X
worker type.
For the G.025X
worker type, each worker maps to 0.25 DPU (2 vCPUs, 4 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for low volume streaming jobs. This worker type is only available for Glue version 3.0 streaming jobs.
For the Z.2X
worker type, each worker maps to 2 M-DPU (8 vCPUs, 64 GB of memory) with 128 GB disk (approximately 120GB free), and provides up to 8 Ray workers based on the autoscaler.
The number of workers of a defined workerType
that are allocated when a job runs.
The name of the SecurityConfiguration
structure to be used with this
+ * job.
Specifies the configuration properties of a job notification.
+ * @public + */ + NotificationProperty?: NotificationProperty; + + /** + *In Spark jobs, GlueVersion
determines the versions of Apache Spark and Python
+ * that Glue makes available in a job. The Python version indicates the version
+ * supported for jobs of type Spark.
Ray jobs should set GlueVersion
to 4.0
or greater. However,
+ * the versions of Ray, Python and additional libraries available in your Ray job are determined
+ * by the Runtime
parameter of the Job command.
For more information about the available Glue versions and corresponding + * Spark and Python versions, see Glue version in the developer + * guide.
+ *Jobs that are created without specifying a Glue version default to Glue 0.9.
+ * @public + */ + GlueVersion?: string; + + /** + *The representation of a directed acyclic graph on which both the Glue Studio visual component and Glue Studio code generation is based.
+ * @public + */ + CodeGenConfigurationNodes?: Record<string, CodeGenConfigurationNode>; + + /** + *Indicates whether the job is run with a standard or flexible execution class. The standard execution class is ideal for time-sensitive workloads that require fast job startup and dedicated resources.
+ *The flexible execution class is appropriate for time-insensitive jobs whose start and completion times may vary.
+ *Only jobs with Glue version 3.0 and above and command type glueetl
will be allowed to set ExecutionClass
to FLEX
. The flexible execution class is available for Spark jobs.
The details for a source control configuration for a job, allowing synchronization of job artifacts to or from a remote repository.
+ * @public + */ + SourceControlDetails?: SourceControlDetails; + + /** + *This field specifies a day of the week and hour for a maintenance window for streaming jobs. Glue periodically performs maintenance activities. During these maintenance windows, Glue will need to restart your streaming jobs.
Glue will restart the job within 3 hours of the specified maintenance window. For instance, if you set up the maintenance window for Monday at 10:00AM GMT, your jobs will be restarted between 10:00AM GMT and 1:00PM GMT.
+ * @public + */ + MaintenanceWindow?: string; +} + +/** + * @public + */ +export interface GetJobResponse { + /** + *The requested job definition.
+ * @public + */ + Job?: Job; +} + +/** + * @public + */ +export interface UpdateJobRequest { + /** + *The name of the job definition to update.
+ * @public + */ + JobName: string | undefined; + + /** + *Specifies the values with which to update the job definition. Unspecified configuration is removed or reset to default values.
+ * @public + */ + JobUpdate: JobUpdate | undefined; +} + +/** + * @public + */ +export interface BatchGetJobsResponse { + /** + *A list of job definitions.
+ * @public + */ + Jobs?: Job[]; + + /** + *A list of names of jobs not found.
+ * @public + */ + JobsNotFound?: string[]; +} + +/** + * @public + */ +export interface GetJobsResponse { + /** + *A list of job definitions.
+ * @public + */ + Jobs?: Job[]; + + /** + *A continuation token, if not all job definitions have yet been returned.
+ * @public + */ + NextToken?: string; +} + +/** + * @internal + */ +export const CreateJobRequestFilterSensitiveLog = (obj: CreateJobRequest): any => ({ + ...obj, + ...(obj.CodeGenConfigurationNodes && { CodeGenConfigurationNodes: SENSITIVE_STRING }), +}); + +/** + * @internal + */ +export const JobFilterSensitiveLog = (obj: Job): any => ({ + ...obj, + ...(obj.CodeGenConfigurationNodes && { CodeGenConfigurationNodes: SENSITIVE_STRING }), +}); + +/** + * @internal + */ +export const JobUpdateFilterSensitiveLog = (obj: JobUpdate): any => ({ + ...obj, + ...(obj.CodeGenConfigurationNodes && { CodeGenConfigurationNodes: SENSITIVE_STRING }), +}); + +/** + * @internal + */ +export const GetJobResponseFilterSensitiveLog = (obj: GetJobResponse): any => ({ + ...obj, + ...(obj.Job && { Job: JobFilterSensitiveLog(obj.Job) }), +}); + +/** + * @internal + */ +export const UpdateJobRequestFilterSensitiveLog = (obj: UpdateJobRequest): any => ({ + ...obj, + ...(obj.JobUpdate && { JobUpdate: JobUpdateFilterSensitiveLog(obj.JobUpdate) }), +}); + +/** + * @internal + */ +export const BatchGetJobsResponseFilterSensitiveLog = (obj: BatchGetJobsResponse): any => ({ + ...obj, + ...(obj.Jobs && { Jobs: obj.Jobs.map((item) => JobFilterSensitiveLog(item)) }), +}); + +/** + * @internal + */ +export const GetJobsResponseFilterSensitiveLog = (obj: GetJobsResponse): any => ({ + ...obj, + ...(obj.Jobs && { Jobs: obj.Jobs.map((item) => JobFilterSensitiveLog(item)) }), +}); diff --git a/clients/client-glue/src/protocols/Aws_json1_1.ts b/clients/client-glue/src/protocols/Aws_json1_1.ts index f3bf4c93fddac..5175dda3e70c5 100644 --- a/clients/client-glue/src/protocols/Aws_json1_1.ts +++ b/clients/client-glue/src/protocols/Aws_json1_1.ts @@ -65,6 +65,10 @@ import { } from "../commands/BatchGetTableOptimizerCommand"; import { BatchGetTriggersCommandInput, BatchGetTriggersCommandOutput } from "../commands/BatchGetTriggersCommand"; import { BatchGetWorkflowsCommandInput, BatchGetWorkflowsCommandOutput } from "../commands/BatchGetWorkflowsCommand"; +import { + BatchPutDataQualityStatisticAnnotationCommandInput, + BatchPutDataQualityStatisticAnnotationCommandOutput, +} from "../commands/BatchPutDataQualityStatisticAnnotationCommand"; import { BatchStopJobRunCommandInput, BatchStopJobRunCommandOutput } from "../commands/BatchStopJobRunCommand"; import { BatchUpdatePartitionCommandInput, @@ -223,6 +227,14 @@ import { GetDataCatalogEncryptionSettingsCommandOutput, } from "../commands/GetDataCatalogEncryptionSettingsCommand"; import { GetDataflowGraphCommandInput, GetDataflowGraphCommandOutput } from "../commands/GetDataflowGraphCommand"; +import { + GetDataQualityModelCommandInput, + GetDataQualityModelCommandOutput, +} from "../commands/GetDataQualityModelCommand"; +import { + GetDataQualityModelResultCommandInput, + GetDataQualityModelResultCommandOutput, +} from "../commands/GetDataQualityModelResultCommand"; import { GetDataQualityResultCommandInput, GetDataQualityResultCommandOutput, @@ -351,6 +363,14 @@ import { ListDataQualityRulesetsCommandInput, ListDataQualityRulesetsCommandOutput, } from "../commands/ListDataQualityRulesetsCommand"; +import { + ListDataQualityStatisticAnnotationsCommandInput, + ListDataQualityStatisticAnnotationsCommandOutput, +} from "../commands/ListDataQualityStatisticAnnotationsCommand"; +import { + ListDataQualityStatisticsCommandInput, + ListDataQualityStatisticsCommandOutput, +} from "../commands/ListDataQualityStatisticsCommand"; import { ListDevEndpointsCommandInput, ListDevEndpointsCommandOutput 
} from "../commands/ListDevEndpointsCommand"; import { ListJobsCommandInput, ListJobsCommandOutput } from "../commands/ListJobsCommand"; import { ListMLTransformsCommandInput, ListMLTransformsCommandOutput } from "../commands/ListMLTransformsCommand"; @@ -370,6 +390,10 @@ import { PutDataCatalogEncryptionSettingsCommandInput, PutDataCatalogEncryptionSettingsCommandOutput, } from "../commands/PutDataCatalogEncryptionSettingsCommand"; +import { + PutDataQualityProfileAnnotationCommandInput, + PutDataQualityProfileAnnotationCommandOutput, +} from "../commands/PutDataQualityProfileAnnotationCommand"; import { PutResourcePolicyCommandInput, PutResourcePolicyCommandOutput } from "../commands/PutResourcePolicyCommand"; import { PutSchemaVersionMetadataCommandInput, @@ -533,6 +557,7 @@ import { BatchGetTriggersRequest, BatchGetWorkflowsRequest, BatchGetWorkflowsResponse, + BatchPutDataQualityStatisticAnnotationRequest, BatchStopJobRunRequest, BatchTableOptimizer, BatchUpdatePartitionRequest, @@ -560,11 +585,8 @@ import { Crawler, CrawlerNodeDetails, CrawlerTargets, - CreateBlueprintRequest, - CreateCsvClassifierRequest, - CreateGrokClassifierRequest, - CreateJsonClassifierRequest, CustomCode, + DatapointInclusionAnnotation, DataQualityAnalyzerResult, DataQualityMetricValues, DataQualityObservation, @@ -698,11 +720,13 @@ import { Spigot, SplitFields, SqlAlias, + StatisticAnnotation, StorageDescriptor, StreamingDataPreviewOptions, TableOptimizer, TableOptimizerConfiguration, TableOptimizerRun, + TimestampedInclusionAnnotation, TransformConfigParameter, Union, UpsertRedshiftTargetOptions, @@ -735,14 +759,18 @@ import { ConnectionPropertyKey, CrawlerMetrics, CrawlerRunningException, + CreateBlueprintRequest, CreateClassifierRequest, CreateConnectionRequest, CreateCrawlerRequest, + CreateCsvClassifierRequest, CreateCustomEntityTypeRequest, CreateDatabaseRequest, CreateDataQualityRulesetRequest, CreateDevEndpointRequest, CreateDevEndpointResponse, + CreateGrokClassifierRequest, + CreateJsonClassifierRequest, CreateMLTransformRequest, CreatePartitionIndexRequest, CreatePartitionRequest, @@ -845,6 +873,10 @@ import { GetDatabasesResponse, GetDataCatalogEncryptionSettingsRequest, GetDataflowGraphRequest, + GetDataQualityModelRequest, + GetDataQualityModelResponse, + GetDataQualityModelResultRequest, + GetDataQualityModelResultResponse, GetDataQualityResultRequest, GetDataQualityResultResponse, GetDataQualityRuleRecommendationRunRequest, @@ -871,11 +903,6 @@ import { GetMLTaskRunsResponse, GetMLTransformRequest, GetMLTransformResponse, - GetMLTransformsRequest, - GetMLTransformsResponse, - GetPartitionIndexesRequest, - GetPartitionRequest, - GetPartitionResponse, GrokClassifier, IcebergInput, IdempotentParameterMismatchException, @@ -884,7 +911,6 @@ import { Location, LongColumnStatisticsData, MappingEntry, - MLTransform, MLUserDataEncryption, OpenTableFormatInput, OperationNotSupportedException, @@ -898,9 +924,9 @@ import { S3Encryption, SchedulerTransitioningException, SchemaColumn, - Segment, Session, SessionCommand, + StatisticModelResult, StringColumnStatisticsData, TableIdentifier, TableInput, @@ -918,9 +944,6 @@ import { XMLClassifier, } from "../models/models_1"; import { - ApplyMapping, - BatchGetJobsResponse, - CodeGenConfigurationNode, ColumnStatisticsError, ColumnStatisticsTaskNotRunningException, ColumnStatisticsTaskRunningException, @@ -930,7 +953,6 @@ import { CrawlerNotRunningException, CrawlerStoppingException, CrawlsFilter, - CreateJobRequest, DataQualityResultDescription, 
DataQualityResultFilterCriteria, DataQualityRuleRecommendationRunDescription, @@ -940,8 +962,11 @@ import { DataQualityRulesetFilterCriteria, DataQualityRulesetListDetails, DevEndpointCustomLibraries, - GetJobResponse, - GetJobsResponse, + GetMLTransformsRequest, + GetMLTransformsResponse, + GetPartitionIndexesRequest, + GetPartitionRequest, + GetPartitionResponse, GetPartitionsRequest, GetPartitionsResponse, GetPlanRequest, @@ -998,8 +1023,6 @@ import { IllegalBlueprintStateException, IllegalWorkflowStateException, ImportCatalogToGlueRequest, - Job, - JobUpdate, ListBlueprintsRequest, ListColumnStatisticsTaskRunsRequest, ListCrawlersRequest, @@ -1014,6 +1037,10 @@ import { ListDataQualityRulesetEvaluationRunsResponse, ListDataQualityRulesetsRequest, ListDataQualityRulesetsResponse, + ListDataQualityStatisticAnnotationsRequest, + ListDataQualityStatisticAnnotationsResponse, + ListDataQualityStatisticsRequest, + ListDataQualityStatisticsResponse, ListDevEndpointsRequest, ListJobsRequest, ListMLTransformsRequest, @@ -1030,14 +1057,15 @@ import { ListUsageProfilesRequest, ListUsageProfilesResponse, ListWorkflowsRequest, - Mapping, MetadataKeyValuePair, + MLTransform, MLTransformNotReadyException, NoScheduleException, PermissionType, PermissionTypeMismatchException, PropertyPredicate, PutDataCatalogEncryptionSettingsRequest, + PutDataQualityProfileAnnotationRequest, PutResourcePolicyRequest, PutSchemaVersionMetadataInput, PutWorkflowRunPropertiesRequest, @@ -1054,6 +1082,7 @@ import { SearchTablesRequest, SearchTablesResponse, SecurityConfiguration, + Segment, SortCriterion, StartBlueprintRunRequest, StartColumnStatisticsTaskRunRequest, @@ -1069,6 +1098,7 @@ import { StartTriggerRequest, StartWorkflowRunRequest, Statement, + StatisticSummary, StopColumnStatisticsTaskRunRequest, StopCrawlerRequest, StopCrawlerScheduleRequest, @@ -1079,6 +1109,7 @@ import { Table, TableVersion, TagResourceRequest, + TimestampFilter, TriggerUpdate, UnfilteredPartition, UntagResourceRequest, @@ -1097,7 +1128,6 @@ import { UpdateDevEndpointRequest, UpdateGrokClassifierRequest, UpdateJobFromSourceControlRequest, - UpdateJobRequest, UpdateJsonClassifierRequest, UpdateMLTransformRequest, UpdatePartitionRequest, @@ -1108,13 +1138,25 @@ import { UpdateTableRequest, UpdateTriggerRequest, UpdateUsageProfileRequest, - UpdateUserDefinedFunctionRequest, - UpdateWorkflowRequest, UpdateXMLClassifierRequest, UsageProfileDefinition, UserDefinedFunction, VersionMismatchException, } from "../models/models_2"; +import { + ApplyMapping, + BatchGetJobsResponse, + CodeGenConfigurationNode, + CreateJobRequest, + GetJobResponse, + GetJobsResponse, + Job, + JobUpdate, + Mapping, + UpdateJobRequest, + UpdateUserDefinedFunctionRequest, + UpdateWorkflowRequest, +} from "../models/models_3"; /** * serializeAws_json1_1BatchCreatePartitionCommand @@ -1311,6 +1353,19 @@ export const se_BatchGetWorkflowsCommand = async ( return buildHttpRpcRequest(context, headers, "/", undefined, body); }; +/** + * serializeAws_json1_1BatchPutDataQualityStatisticAnnotationCommand + */ +export const se_BatchPutDataQualityStatisticAnnotationCommand = async ( + input: BatchPutDataQualityStatisticAnnotationCommandInput, + context: __SerdeContext +): Promise<__HttpRequest> => { + const headers: __HeaderBag = sharedHeaders("BatchPutDataQualityStatisticAnnotation"); + let body: any; + body = JSON.stringify(_json(input)); + return buildHttpRpcRequest(context, headers, "/", undefined, body); +}; + /** * serializeAws_json1_1BatchStopJobRunCommand */ @@ -2312,6 +2367,32 
@@ export const se_GetDataflowGraphCommand = async ( return buildHttpRpcRequest(context, headers, "/", undefined, body); }; +/** + * serializeAws_json1_1GetDataQualityModelCommand + */ +export const se_GetDataQualityModelCommand = async ( + input: GetDataQualityModelCommandInput, + context: __SerdeContext +): Promise<__HttpRequest> => { + const headers: __HeaderBag = sharedHeaders("GetDataQualityModel"); + let body: any; + body = JSON.stringify(_json(input)); + return buildHttpRpcRequest(context, headers, "/", undefined, body); +}; + +/** + * serializeAws_json1_1GetDataQualityModelResultCommand + */ +export const se_GetDataQualityModelResultCommand = async ( + input: GetDataQualityModelResultCommandInput, + context: __SerdeContext +): Promise<__HttpRequest> => { + const headers: __HeaderBag = sharedHeaders("GetDataQualityModelResult"); + let body: any; + body = JSON.stringify(_json(input)); + return buildHttpRpcRequest(context, headers, "/", undefined, body); +}; + /** * serializeAws_json1_1GetDataQualityResultCommand */ @@ -3076,6 +3157,32 @@ export const se_ListDataQualityRulesetsCommand = async ( return buildHttpRpcRequest(context, headers, "/", undefined, body); }; +/** + * serializeAws_json1_1ListDataQualityStatisticAnnotationsCommand + */ +export const se_ListDataQualityStatisticAnnotationsCommand = async ( + input: ListDataQualityStatisticAnnotationsCommandInput, + context: __SerdeContext +): Promise<__HttpRequest> => { + const headers: __HeaderBag = sharedHeaders("ListDataQualityStatisticAnnotations"); + let body: any; + body = JSON.stringify(se_ListDataQualityStatisticAnnotationsRequest(input, context)); + return buildHttpRpcRequest(context, headers, "/", undefined, body); +}; + +/** + * serializeAws_json1_1ListDataQualityStatisticsCommand + */ +export const se_ListDataQualityStatisticsCommand = async ( + input: ListDataQualityStatisticsCommandInput, + context: __SerdeContext +): Promise<__HttpRequest> => { + const headers: __HeaderBag = sharedHeaders("ListDataQualityStatistics"); + let body: any; + body = JSON.stringify(se_ListDataQualityStatisticsRequest(input, context)); + return buildHttpRpcRequest(context, headers, "/", undefined, body); +}; + /** * serializeAws_json1_1ListDevEndpointsCommand */ @@ -3245,6 +3352,19 @@ export const se_PutDataCatalogEncryptionSettingsCommand = async ( return buildHttpRpcRequest(context, headers, "/", undefined, body); }; +/** + * serializeAws_json1_1PutDataQualityProfileAnnotationCommand + */ +export const se_PutDataQualityProfileAnnotationCommand = async ( + input: PutDataQualityProfileAnnotationCommandInput, + context: __SerdeContext +): Promise<__HttpRequest> => { + const headers: __HeaderBag = sharedHeaders("PutDataQualityProfileAnnotation"); + let body: any; + body = JSON.stringify(_json(input)); + return buildHttpRpcRequest(context, headers, "/", undefined, body); +}; + /** * serializeAws_json1_1PutResourcePolicyCommand */ @@ -4247,6 +4367,26 @@ export const de_BatchGetWorkflowsCommand = async ( return response; }; +/** + * deserializeAws_json1_1BatchPutDataQualityStatisticAnnotationCommand + */ +export const de_BatchPutDataQualityStatisticAnnotationCommand = async ( + output: __HttpResponse, + context: __SerdeContext +): PromiseSpecifies an Amazon Redshift target.
" } }, + "com.amazonaws.glue#AnnotationError": { + "type": "structure", + "members": { + "ProfileId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Profile ID for the failed annotation.
" + } + }, + "StatisticId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Statistic ID for the failed annotation.
" + } + }, + "FailureReason": { + "target": "com.amazonaws.glue#DescriptionString", + "traits": { + "smithy.api#documentation": "The reason why the annotation failed.
" + } + } + }, + "traits": { + "smithy.api#documentation": "A failed annotation.
" + } + }, + "com.amazonaws.glue#AnnotationErrorList": { + "type": "list", + "member": { + "target": "com.amazonaws.glue#AnnotationError" + } + }, + "com.amazonaws.glue#AnnotationList": { + "type": "list", + "member": { + "target": "com.amazonaws.glue#StatisticAnnotation" + } + }, "com.amazonaws.glue#ApplyMapping": { "type": "structure", "members": { @@ -3724,6 +3780,67 @@ "smithy.api#output": {} } }, + "com.amazonaws.glue#BatchPutDataQualityStatisticAnnotation": { + "type": "operation", + "input": { + "target": "com.amazonaws.glue#BatchPutDataQualityStatisticAnnotationRequest" + }, + "output": { + "target": "com.amazonaws.glue#BatchPutDataQualityStatisticAnnotationResponse" + }, + "errors": [ + { + "target": "com.amazonaws.glue#EntityNotFoundException" + }, + { + "target": "com.amazonaws.glue#InternalServiceException" + }, + { + "target": "com.amazonaws.glue#InvalidInputException" + }, + { + "target": "com.amazonaws.glue#ResourceNumberLimitExceededException" + } + ], + "traits": { + "smithy.api#documentation": "Annotate datapoints over time for a specific data quality statistic.
" + } + }, + "com.amazonaws.glue#BatchPutDataQualityStatisticAnnotationRequest": { + "type": "structure", + "members": { + "InclusionAnnotations": { + "target": "com.amazonaws.glue#InclusionAnnotationList", + "traits": { + "smithy.api#documentation": "A list of DatapointInclusionAnnotation
's.
Client Token.
" + } + } + }, + "traits": { + "smithy.api#input": {} + } + }, + "com.amazonaws.glue#BatchPutDataQualityStatisticAnnotationResponse": { + "type": "structure", + "members": { + "FailedInclusionAnnotations": { + "target": "com.amazonaws.glue#AnnotationErrorList", + "traits": { + "smithy.api#documentation": "A list of AnnotationError
's.
A target table associated with the data quality ruleset.
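A usage sketch for the new BatchPutDataQualityStatisticAnnotationCommand defined above; the profile and statistic IDs are placeholders:

```ts
import { GlueClient, BatchPutDataQualityStatisticAnnotationCommand } from "@aws-sdk/client-glue";

const client = new GlueClient({});

const { FailedInclusionAnnotations } = await client.send(
  new BatchPutDataQualityStatisticAnnotationCommand({
    InclusionAnnotations: [
      // Placeholder IDs; INCLUDE or EXCLUDE is the annotation applied to the datapoint.
      { ProfileId: "dq-profile-id", StatisticId: "dq-statistic-id", InclusionAnnotation: "EXCLUDE" },
    ],
  })
);

// Entries that could not be annotated come back with a FailureReason.
console.log(FailedInclusionAnnotations);
```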
" } }, + "DataQualitySecurityConfiguration": { + "target": "com.amazonaws.glue#NameString", + "traits": { + "smithy.api#documentation": "The name of the security configuration created with the data quality encryption option.
" + } + }, "ClientToken": { "target": "com.amazonaws.glue#HashString", "traits": { @@ -11121,6 +11244,29 @@ "smithy.api#documentation": "Describes the data quality metric value according to the analysis of historical data.
" } }, + "com.amazonaws.glue#DataQualityModelStatus": { + "type": "enum", + "members": { + "RUNNING": { + "target": "smithy.api#Unit", + "traits": { + "smithy.api#enumValue": "RUNNING" + } + }, + "SUCCEEDED": { + "target": "smithy.api#Unit", + "traits": { + "smithy.api#enumValue": "SUCCEEDED" + } + }, + "FAILED": { + "target": "smithy.api#Unit", + "traits": { + "smithy.api#enumValue": "FAILED" + } + } + } + }, "com.amazonaws.glue#DataQualityObservation": { "type": "structure", "members": { @@ -11148,7 +11294,8 @@ "min": 0, "max": 2048 }, - "smithy.api#pattern": "^[\\u0020-\\uD7FF\\uE000-\\uFFFD\\uD800\\uDC00-\\uDBFF\\uDFFF\\r\\n\\t]*$" + "smithy.api#pattern": "^[\\u0020-\\uD7FF\\uE000-\\uFFFD\\uD800\\uDC00-\\uDBFF\\uDFFF\\r\\n\\t]*$", + "smithy.api#sensitive": {} } }, "com.amazonaws.glue#DataQualityObservations": { @@ -11172,6 +11319,12 @@ "smithy.api#documentation": "A unique result ID for the data quality result.
" } }, + "ProfileId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Profile ID for the data quality result.
" + } + }, "Score": { "target": "com.amazonaws.glue#GenericBoundedDouble", "traits": { @@ -11458,6 +11611,12 @@ "traits": { "smithy.api#documentation": "A map of metrics associated with the evaluation of the rule.
" } + }, + "EvaluatedRule": { + "target": "com.amazonaws.glue#DataQualityRuleResultDescription", + "traits": { + "smithy.api#documentation": "The evaluated rule.
" + } } }, "traits": { @@ -11471,7 +11630,8 @@ "min": 0, "max": 2048 }, - "smithy.api#pattern": "^[\\u0020-\\uD7FF\\uE000-\\uFFFD\\uD800\\uDC00-\\uDBFF\\uDFFF\\r\\n\\t]*$" + "smithy.api#pattern": "^[\\u0020-\\uD7FF\\uE000-\\uFFFD\\uD800\\uDC00-\\uDBFF\\uDFFF\\r\\n\\t]*$", + "smithy.api#sensitive": {} } }, "com.amazonaws.glue#DataQualityRuleResultStatus": { @@ -11926,6 +12086,32 @@ } } }, + "com.amazonaws.glue#DatapointInclusionAnnotation": { + "type": "structure", + "members": { + "ProfileId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The ID of the data quality profile the statistic belongs to.
" + } + }, + "StatisticId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Statistic ID.
" + } + }, + "InclusionAnnotation": { + "target": "com.amazonaws.glue#InclusionAnnotationValue", + "traits": { + "smithy.api#documentation": "The inclusion annotation value to apply to the statistic.
" + } + } + }, + "traits": { + "smithy.api#documentation": "An Inclusion Annotation.
" + } + }, "com.amazonaws.glue#Datatype": { "type": "structure", "members": { @@ -14686,6 +14872,9 @@ }, "value": { "target": "com.amazonaws.glue#NullableDouble" + }, + "traits": { + "smithy.api#sensitive": {} } }, "com.amazonaws.glue#EvaluationMetrics": { @@ -16618,6 +16807,153 @@ "smithy.api#output": {} } }, + "com.amazonaws.glue#GetDataQualityModel": { + "type": "operation", + "input": { + "target": "com.amazonaws.glue#GetDataQualityModelRequest" + }, + "output": { + "target": "com.amazonaws.glue#GetDataQualityModelResponse" + }, + "errors": [ + { + "target": "com.amazonaws.glue#EntityNotFoundException" + }, + { + "target": "com.amazonaws.glue#InternalServiceException" + }, + { + "target": "com.amazonaws.glue#InvalidInputException" + }, + { + "target": "com.amazonaws.glue#OperationTimeoutException" + } + ], + "traits": { + "smithy.api#documentation": "Retrieve the training status of the model along with more information (CompletedOn, StartedOn, FailureReason).
" + } + }, + "com.amazonaws.glue#GetDataQualityModelRequest": { + "type": "structure", + "members": { + "StatisticId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Statistic ID.
" + } + }, + "ProfileId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Profile ID.
", + "smithy.api#required": {} + } + } + }, + "traits": { + "smithy.api#input": {} + } + }, + "com.amazonaws.glue#GetDataQualityModelResponse": { + "type": "structure", + "members": { + "Status": { + "target": "com.amazonaws.glue#DataQualityModelStatus", + "traits": { + "smithy.api#documentation": "The training status of the data quality model.
" + } + }, + "StartedOn": { + "target": "com.amazonaws.glue#Timestamp", + "traits": { + "smithy.api#documentation": "The timestamp when the data quality model training started.
" + } + }, + "CompletedOn": { + "target": "com.amazonaws.glue#Timestamp", + "traits": { + "smithy.api#documentation": "The timestamp when the data quality model training completed.
" + } + }, + "FailureReason": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The training failure reason.
" + } + } + }, + "traits": { + "smithy.api#output": {} + } + }, + "com.amazonaws.glue#GetDataQualityModelResult": { + "type": "operation", + "input": { + "target": "com.amazonaws.glue#GetDataQualityModelResultRequest" + }, + "output": { + "target": "com.amazonaws.glue#GetDataQualityModelResultResponse" + }, + "errors": [ + { + "target": "com.amazonaws.glue#EntityNotFoundException" + }, + { + "target": "com.amazonaws.glue#InternalServiceException" + }, + { + "target": "com.amazonaws.glue#InvalidInputException" + }, + { + "target": "com.amazonaws.glue#OperationTimeoutException" + } + ], + "traits": { + "smithy.api#documentation": "Retrieve a statistic's predictions for a given Profile ID.
" + } + }, + "com.amazonaws.glue#GetDataQualityModelResultRequest": { + "type": "structure", + "members": { + "StatisticId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Statistic ID.
", + "smithy.api#required": {} + } + }, + "ProfileId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Profile ID.
", + "smithy.api#required": {} + } + } + }, + "traits": { + "smithy.api#input": {} + } + }, + "com.amazonaws.glue#GetDataQualityModelResultResponse": { + "type": "structure", + "members": { + "CompletedOn": { + "target": "com.amazonaws.glue#Timestamp", + "traits": { + "smithy.api#documentation": "The timestamp when the data quality model training completed.
" + } + }, + "Model": { + "target": "com.amazonaws.glue#StatisticModelResults", + "traits": { + "smithy.api#documentation": "A list of StatisticModelResult
\n
A unique result ID for the data quality result.
" } }, + "ProfileId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Profile ID for the data quality result.
" + } + }, "Score": { "target": "com.amazonaws.glue#GenericBoundedDouble", "traits": { @@ -16867,6 +17209,12 @@ "traits": { "smithy.api#documentation": "The name of the ruleset that was created by the run.
" } + }, + "DataQualitySecurityConfiguration": { + "target": "com.amazonaws.glue#NameString", + "traits": { + "smithy.api#documentation": "The name of the security configuration created with the data quality encryption option.
" + } } }, "traits": { @@ -17098,6 +17446,12 @@ "traits": { "smithy.api#documentation": "When a ruleset was created from a recommendation run, this run ID is generated to link the two together.
" } + }, + "DataQualitySecurityConfiguration": { + "target": "com.amazonaws.glue#NameString", + "traits": { + "smithy.api#documentation": "The name of the security configuration created with the data quality encryption option.
" + } } }, "traits": { @@ -21925,6 +22279,29 @@ "smithy.api#documentation": "Specifies configuration properties for an importing labels task run.
" } }, + "com.amazonaws.glue#InclusionAnnotationList": { + "type": "list", + "member": { + "target": "com.amazonaws.glue#DatapointInclusionAnnotation" + } + }, + "com.amazonaws.glue#InclusionAnnotationValue": { + "type": "enum", + "members": { + "INCLUDE": { + "target": "smithy.api#Unit", + "traits": { + "smithy.api#enumValue": "INCLUDE" + } + }, + "EXCLUDE": { + "target": "smithy.api#Unit", + "traits": { + "smithy.api#enumValue": "EXCLUDE" + } + } + } + }, "com.amazonaws.glue#Integer": { "type": "integer", "traits": { @@ -24590,6 +24967,165 @@ "smithy.api#output": {} } }, + "com.amazonaws.glue#ListDataQualityStatisticAnnotations": { + "type": "operation", + "input": { + "target": "com.amazonaws.glue#ListDataQualityStatisticAnnotationsRequest" + }, + "output": { + "target": "com.amazonaws.glue#ListDataQualityStatisticAnnotationsResponse" + }, + "errors": [ + { + "target": "com.amazonaws.glue#InternalServiceException" + }, + { + "target": "com.amazonaws.glue#InvalidInputException" + } + ], + "traits": { + "smithy.api#documentation": "Retrieve annotations for a data quality statistic.
" + } + }, + "com.amazonaws.glue#ListDataQualityStatisticAnnotationsRequest": { + "type": "structure", + "members": { + "StatisticId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Statistic ID.
" + } + }, + "ProfileId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Profile ID.
" + } + }, + "TimestampFilter": { + "target": "com.amazonaws.glue#TimestampFilter", + "traits": { + "smithy.api#documentation": "A timestamp filter.
" + } + }, + "MaxResults": { + "target": "com.amazonaws.glue#PageSize", + "traits": { + "smithy.api#documentation": "The maximum number of results to return in this request.
" + } + }, + "NextToken": { + "target": "com.amazonaws.glue#PaginationToken", + "traits": { + "smithy.api#documentation": "A pagination token to retrieve the next set of results.
" + } + } + }, + "traits": { + "smithy.api#input": {} + } + }, + "com.amazonaws.glue#ListDataQualityStatisticAnnotationsResponse": { + "type": "structure", + "members": { + "Annotations": { + "target": "com.amazonaws.glue#AnnotationList", + "traits": { + "smithy.api#documentation": "A list of StatisticAnnotation
applied to the Statistic
A pagination token to retrieve the next set of results.
" + } + } + }, + "traits": { + "smithy.api#output": {} + } + }, + "com.amazonaws.glue#ListDataQualityStatistics": { + "type": "operation", + "input": { + "target": "com.amazonaws.glue#ListDataQualityStatisticsRequest" + }, + "output": { + "target": "com.amazonaws.glue#ListDataQualityStatisticsResponse" + }, + "errors": [ + { + "target": "com.amazonaws.glue#EntityNotFoundException" + }, + { + "target": "com.amazonaws.glue#InternalServiceException" + }, + { + "target": "com.amazonaws.glue#InvalidInputException" + } + ], + "traits": { + "smithy.api#documentation": "Retrieves a list of data quality statistics.
" + } + }, + "com.amazonaws.glue#ListDataQualityStatisticsRequest": { + "type": "structure", + "members": { + "StatisticId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Statistic ID.
" + } + }, + "ProfileId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Profile ID.
" + } + }, + "TimestampFilter": { + "target": "com.amazonaws.glue#TimestampFilter", + "traits": { + "smithy.api#documentation": "A timestamp filter.
" + } + }, + "MaxResults": { + "target": "com.amazonaws.glue#PageSize", + "traits": { + "smithy.api#documentation": "The maximum number of results to return in this request.
" + } + }, + "NextToken": { + "target": "com.amazonaws.glue#PaginationToken", + "traits": { + "smithy.api#documentation": "A pagination token to request the next page of results.
" + } + } + }, + "traits": { + "smithy.api#input": {} + } + }, + "com.amazonaws.glue#ListDataQualityStatisticsResponse": { + "type": "structure", + "members": { + "Statistics": { + "target": "com.amazonaws.glue#StatisticSummaryList", + "traits": { + "smithy.api#documentation": "A StatisticSummaryList
.
A pagination token to request the next page of results.
" + } + } + }, + "traits": { + "smithy.api#output": {} + } + }, "com.amazonaws.glue#ListDevEndpoints": { "type": "operation", "input": { @@ -26240,6 +26776,12 @@ "smithy.api#documentation": "The name of the data quality metric used for generating the observation.
" } }, + "StatisticId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Statistic ID.
" + } + }, "MetricValues": { "target": "com.amazonaws.glue#DataQualityMetricValues", "traits": { @@ -28090,6 +28632,59 @@ "smithy.api#output": {} } }, + "com.amazonaws.glue#PutDataQualityProfileAnnotation": { + "type": "operation", + "input": { + "target": "com.amazonaws.glue#PutDataQualityProfileAnnotationRequest" + }, + "output": { + "target": "com.amazonaws.glue#PutDataQualityProfileAnnotationResponse" + }, + "errors": [ + { + "target": "com.amazonaws.glue#EntityNotFoundException" + }, + { + "target": "com.amazonaws.glue#InternalServiceException" + }, + { + "target": "com.amazonaws.glue#InvalidInputException" + } + ], + "traits": { + "smithy.api#documentation": "Annotate all datapoints for a Profile.
" + } + }, + "com.amazonaws.glue#PutDataQualityProfileAnnotationRequest": { + "type": "structure", + "members": { + "ProfileId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The ID of the data quality monitoring profile to annotate.
", + "smithy.api#required": {} + } + }, + "InclusionAnnotation": { + "target": "com.amazonaws.glue#InclusionAnnotationValue", + "traits": { + "smithy.api#documentation": "The inclusion annotation value to apply to the profile.
", + "smithy.api#required": {} + } + } + }, + "traits": { + "smithy.api#input": {} + } + }, + "com.amazonaws.glue#PutDataQualityProfileAnnotationResponse": { + "type": "structure", + "members": {}, + "traits": { + "smithy.api#documentation": "Left blank.
", + "smithy.api#output": {} + } + }, "com.amazonaws.glue#PutResourcePolicy": { "type": "operation", "input": { @@ -28798,6 +29393,12 @@ "smithy.api#documentation": "Specifies a target that uses Amazon Redshift.
" } }, + "com.amazonaws.glue#ReferenceDatasetsList": { + "type": "list", + "member": { + "target": "com.amazonaws.glue#NameString" + } + }, "com.amazonaws.glue#RegisterSchemaVersion": { "type": "operation", "input": { @@ -29429,6 +30030,26 @@ "com.amazonaws.glue#RunId": { "type": "string" }, + "com.amazonaws.glue#RunIdentifier": { + "type": "structure", + "members": { + "RunId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Run ID.
" + } + }, + "JobRunId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Job Run ID.
" + } + } + }, + "traits": { + "smithy.api#documentation": "A run identifier.
" + } + }, "com.amazonaws.glue#RunMetrics": { "type": "structure", "members": { @@ -32520,6 +33141,12 @@ "smithy.api#documentation": "A name for the ruleset.
" } }, + "DataQualitySecurityConfiguration": { + "target": "com.amazonaws.glue#NameString", + "traits": { + "smithy.api#documentation": "The name of the security configuration created with the data quality encryption option.
" + } + }, "ClientToken": { "target": "com.amazonaws.glue#HashString", "traits": { @@ -33370,6 +33997,217 @@ } } }, + "com.amazonaws.glue#StatisticAnnotation": { + "type": "structure", + "members": { + "ProfileId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Profile ID.
" + } + }, + "StatisticId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Statistic ID.
" + } + }, + "StatisticRecordedOn": { + "target": "com.amazonaws.glue#Timestamp", + "traits": { + "smithy.api#documentation": "The timestamp when the annotated statistic was recorded.
" + } + }, + "InclusionAnnotation": { + "target": "com.amazonaws.glue#TimestampedInclusionAnnotation", + "traits": { + "smithy.api#documentation": "The inclusion annotation applied to the statistic.
" + } + } + }, + "traits": { + "smithy.api#documentation": "A Statistic Annotation.
" + } + }, + "com.amazonaws.glue#StatisticEvaluationLevel": { + "type": "enum", + "members": { + "DATASET": { + "target": "smithy.api#Unit", + "traits": { + "smithy.api#enumValue": "Dataset" + } + }, + "COLUMN": { + "target": "smithy.api#Unit", + "traits": { + "smithy.api#enumValue": "Column" + } + }, + "MULTICOLUMN": { + "target": "smithy.api#Unit", + "traits": { + "smithy.api#enumValue": "Multicolumn" + } + } + } + }, + "com.amazonaws.glue#StatisticModelResult": { + "type": "structure", + "members": { + "LowerBound": { + "target": "com.amazonaws.glue#NullableDouble", + "traits": { + "smithy.api#documentation": "The lower bound.
" + } + }, + "UpperBound": { + "target": "com.amazonaws.glue#NullableDouble", + "traits": { + "smithy.api#documentation": "The upper bound.
" + } + }, + "PredictedValue": { + "target": "com.amazonaws.glue#NullableDouble", + "traits": { + "smithy.api#documentation": "The predicted value.
" + } + }, + "ActualValue": { + "target": "com.amazonaws.glue#NullableDouble", + "traits": { + "smithy.api#documentation": "The actual value.
" + } + }, + "Date": { + "target": "com.amazonaws.glue#Timestamp", + "traits": { + "smithy.api#documentation": "The date.
" + } + }, + "InclusionAnnotation": { + "target": "com.amazonaws.glue#InclusionAnnotationValue", + "traits": { + "smithy.api#documentation": "The inclusion annotation.
" + } + } + }, + "traits": { + "smithy.api#documentation": "The statistic model result.
" + } + }, + "com.amazonaws.glue#StatisticModelResults": { + "type": "list", + "member": { + "target": "com.amazonaws.glue#StatisticModelResult" + } + }, + "com.amazonaws.glue#StatisticNameString": { + "type": "string", + "traits": { + "smithy.api#length": { + "min": 1, + "max": 255 + }, + "smithy.api#pattern": "^[A-Z][A-Za-z\\.]+$" + } + }, + "com.amazonaws.glue#StatisticPropertiesMap": { + "type": "map", + "key": { + "target": "com.amazonaws.glue#NameString" + }, + "value": { + "target": "com.amazonaws.glue#DescriptionString" + }, + "traits": { + "smithy.api#sensitive": {} + } + }, + "com.amazonaws.glue#StatisticSummary": { + "type": "structure", + "members": { + "StatisticId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Statistic ID.
" + } + }, + "ProfileId": { + "target": "com.amazonaws.glue#HashString", + "traits": { + "smithy.api#documentation": "The Profile ID.
" + } + }, + "RunIdentifier": { + "target": "com.amazonaws.glue#RunIdentifier", + "traits": { + "smithy.api#documentation": "The Run Identifier
" + } + }, + "StatisticName": { + "target": "com.amazonaws.glue#StatisticNameString", + "traits": { + "smithy.api#documentation": "The name of the statistic.
" + } + }, + "DoubleValue": { + "target": "com.amazonaws.glue#Double", + "traits": { + "smithy.api#default": 0, + "smithy.api#documentation": "The value of the statistic.
" + } + }, + "EvaluationLevel": { + "target": "com.amazonaws.glue#StatisticEvaluationLevel", + "traits": { + "smithy.api#documentation": "The evaluation level of the statistic. Possible values: Dataset
, Column
, Multicolumn
.
The list of columns referenced by the statistic.
" + } + }, + "ReferencedDatasets": { + "target": "com.amazonaws.glue#ReferenceDatasetsList", + "traits": { + "smithy.api#documentation": "The list of datasets referenced by the statistic.
" + } + }, + "StatisticProperties": { + "target": "com.amazonaws.glue#StatisticPropertiesMap", + "traits": { + "smithy.api#documentation": "A StatisticPropertiesMap
, which contains a NameString
and DescriptionString
\n
The timestamp when the statistic was recorded.
" + } + }, + "InclusionAnnotation": { + "target": "com.amazonaws.glue#TimestampedInclusionAnnotation", + "traits": { + "smithy.api#documentation": "The inclusion annotation for the statistic.
" + } + } + }, + "traits": { + "smithy.api#documentation": "Summary information about a statistic.
" + } + }, + "com.amazonaws.glue#StatisticSummaryList": { + "type": "list", + "member": { + "target": "com.amazonaws.glue#StatisticSummary" + }, + "traits": { + "smithy.api#documentation": "A list of StatisticSummary
.
The timestamp before which statistics should be included in the results.
" + } + }, + "RecordedAfter": { + "target": "com.amazonaws.glue#Timestamp", + "traits": { + "smithy.api#documentation": "The timestamp after which statistics should be included in the results.
" + } + } + }, + "traits": { + "smithy.api#documentation": "A timestamp filter.
" + } + }, "com.amazonaws.glue#TimestampValue": { "type": "timestamp" }, + "com.amazonaws.glue#TimestampedInclusionAnnotation": { + "type": "structure", + "members": { + "Value": { + "target": "com.amazonaws.glue#InclusionAnnotationValue", + "traits": { + "smithy.api#documentation": "The inclusion annotation value.
" + } + }, + "LastModifiedOn": { + "target": "com.amazonaws.glue#Timestamp", + "traits": { + "smithy.api#documentation": "The timestamp when the inclusion annotation was last modified.
" + } + } + }, + "traits": { + "smithy.api#documentation": "A timestamped inclusion annotation.
" + } + }, "com.amazonaws.glue#Token": { "type": "string" },