com.amazonaws.services.sagemaker.sparksdk
The instance type used to run the model container.
The minimum number of instances used to host the model.
Serializes a Row to an Array of Bytes.
Deserializes an Array of Bytes to a series of Rows.
An endpoint name.
A Docker image URI.
The S3 location where a successfully completed SageMaker Training Job stored its model output.
The environment variables that SageMaker will set on the model container during execution.
The IAM Role used by SageMaker when running the hosted Model and to download model data from S3.
Whether the endpoint is created upon SageMakerModel construction, transformation, or not at all.
Amazon SageMaker client. Used to send CreateTrainingJob, CreateModel, and CreateEndpoint requests.
Whether the transformation result should also include the input Rows. If true, each output Row is the input Row concatenated with the corresponding Row that responseRowDeserializer produces from the SageMaker invocation response. If false, each output Row is just the Row produced by responseRowDeserializer.
The NamePolicy to use when naming SageMaker entities created during usage of this Model.
The unique identifier of this Model. Used to represent this stage in Spark ML pipelines.
Creates a copy of the SageMakerModel instance with the same instance variables and extra params.
Sets the EndpointCreationPolicy to EndpointCreationPolicy.DO_NOT_CREATE regardless of the original EndpointCreationPolicy value so that copies do not make Endpoints.
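The copy rule above can be illustrated with a minimal sketch. It does not use the SDK: `ModelSketch` and `copyForPipeline` are hypothetical stand-ins, and the policy values other than `DO_NOT_CREATE` are assumed names inferred from "construction, transformation, or not at all".

```scala
object CopySketch {
  // Hypothetical stand-in for the SDK's EndpointCreationPolicy values.
  // Only DO_NOT_CREATE is named in this documentation; the other two are assumed.
  sealed trait EndpointCreationPolicy
  case object CREATE_ON_CONSTRUCT extends EndpointCreationPolicy
  case object CREATE_ON_TRANSFORM extends EndpointCreationPolicy
  case object DO_NOT_CREATE extends EndpointCreationPolicy

  // Heavily simplified model holding only the fields needed to show copy semantics.
  final case class ModelSketch(
      uid: String,
      endpointName: Option[String],
      endpointCreationPolicy: EndpointCreationPolicy) {

    // Keeps every instance variable but forces DO_NOT_CREATE,
    // so the copy never creates an Endpoint of its own.
    def copyForPipeline(): ModelSketch =
      copy(endpointCreationPolicy = DO_NOT_CREATE)
  }
}
```

In this sketch the original instance keeps its policy; only the copy is restricted.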
Params to be applied to the new instance
The copy of SageMakerModel
Whether the endpoint is created upon SageMakerModel construction, transformation, or not at all.
The minimum number of instances used to host the model.
The instance type used to run the model container.
An endpoint name if it exists or None otherwise.
An endpoint name.
Gets the resources that may have been created during operation of the SageMakerModel.
Resources that may have been created.
The environment variables that SageMaker will set on the model container during execution.
The IAM Role used by SageMaker when running the hosted Model and to download model data from S3.
A Docker image URI.
The S3 location where a successfully completed SageMaker Training Job stored its model output.
The NamePolicy to use when naming SageMaker entities created during usage of this Model.
Whether the transformation result should also include the input Rows. If true, each output Row is the input Row concatenated with the corresponding Row that responseRowDeserializer produces from the SageMaker invocation response. If false, each output Row is just the Row produced by responseRowDeserializer.
Serializes a Row to an Array of Bytes.
Deserializes an Array of Bytes to a series of Rows.
Amazon SageMaker client. Used to send CreateTrainingJob, CreateModel, and CreateEndpoint requests.
Transforms the input dataset.
Transforms Dataset to DataFrame by repeated, distributed SageMaker Endpoint invocation using RequestBatchIterator. Creates all necessary SageMaker entities if specified by the EndpointCreationPolicy and if an Endpoint doesn't exist yet.
An input dataset.
Transformed dataset.
Checks transform validity of the input schema and provides the output schema.
Validates the input schema against RequestRowSerializer and returns ResponseRowDeserializer schema. Prepends the output with the input schema if required by the SageMakerModel.
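The schema rule above can be sketched without Spark. `Field` and `Schema` here are hypothetical simplifications of Spark's schema types, and `validate` stands in for the serializer's schema check; none of these names come from the SDK.

```scala
object SchemaSketch {
  // Hypothetical, simplified stand-ins for a Spark schema and its fields.
  final case class Field(name: String, dataType: String)
  type Schema = Seq[Field]

  // Validates the input schema, then returns the response schema,
  // prefixed with the input schema when prependInputRows is true.
  def transformSchema(
      input: Schema,
      validate: Schema => Unit, // expected to throw on an unsupported schema
      responseSchema: Schema,
      prependInputRows: Boolean): Schema = {
    validate(input)
    if (prependInputRows) input ++ responseSchema else responseSchema
  }
}
```

Validation runs before the output schema is built, so an unsupported input schema fails early rather than during Endpoint invocation.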
Input schema to be validated and transformed.
Output schema
The unique identifier of this Model. Used to represent this stage in Spark ML pipelines.
A Model implementation that transforms a DataFrame by making requests to a SageMaker Endpoint. Manages the life cycle of all necessary SageMaker entities, including Model, EndpointConfig, and Endpoint.
This Model transforms one DataFrame to another by repeated, distributed SageMaker Endpoint invocation. Each invocation request body is formed by concatenating input DataFrame Rows serialized to Byte Arrays by the specified serializer. The request's content type is set from the serializer's contentType, and its accepted response type is set from the deserializer's accepts.
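How a request body is assembled from serialized Rows can be sketched as follows. This is only an illustration under simplifying assumptions: `Row` is modeled as a plain `Seq[Any]`, `csvLine` is a hypothetical serializer, and batching limits are not modeled.

```scala
import java.io.ByteArrayOutputStream
import java.nio.charset.StandardCharsets.UTF_8

object RequestBodySketch {
  // Hypothetical stand-in for a Spark Row.
  type Row = Seq[Any]

  // Hypothetical serializer: one comma-separated line per Row.
  def csvLine(row: Row): Array[Byte] =
    (row.mkString(",") + "\n").getBytes(UTF_8)

  // Concatenates the serialized bytes of each Row into one request body.
  def buildRequestBody(rows: Iterator[Row], serialize: Row => Array[Byte]): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    rows.foreach(row => out.write(serialize(row)))
    out.toByteArray
  }
}
```

Because the body is a plain concatenation, the serializer itself has to emit any record delimiter the Endpoint expects, such as the newline used here.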
The transformed DataFrame is produced by deserializing each invocation response body into a series of Rows. Row deserialization is delegated to the specified deserializer, which converts an Array of Bytes to an Iterator[Row]. If prependInputRows is false, the transformed DataFrame will contain just these Rows. If prependInputRows is true, then each transformed Row is the input Row concatenated with the corresponding Row deserialized from the SageMaker invocation response.
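The effect of prependInputRows on the output can be sketched in plain Scala. `Row` is again modeled as `Seq[Any]`, and `deserializeLines` is a hypothetical deserializer that reads one numeric prediction per line; the pairing of input and response Rows by position is an assumption of this sketch.

```scala
import java.nio.charset.StandardCharsets.UTF_8

object ResponseSketch {
  // Hypothetical stand-in for a Spark Row.
  type Row = Seq[Any]

  // Hypothetical deserializer: one Double per non-empty line of the response body.
  def deserializeLines(body: Array[Byte]): Iterator[Row] =
    new String(body, UTF_8).split("\n").iterator
      .filter(_.nonEmpty)
      .map(line => Seq(line.toDouble))

  // Builds output Rows from one response body, optionally prefixing each
  // response Row with the input Row at the same position.
  def toOutputRows(
      inputs: Seq[Row],
      responseBody: Array[Byte],
      deserialize: Array[Byte] => Iterator[Row],
      prependInputRows: Boolean): Seq[Row] = {
    val responses = deserialize(responseBody).toSeq
    if (prependInputRows) inputs.zip(responses).map { case (in, out) => in ++ out }
    else responses
  }
}
```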
Each invocation of transform passes the Dataset.schema of the input DataFrame to requestRowSerializer by invoking setSchema.
The specified serializer also controls the validity of input Row Schemas for this Model. Schema validation is carried out on each call to transformSchema, which invokes validateSchema.
Adapting this SageMaker model to the data format and type of a specific Endpoint is achieved by sub-classing RequestRowSerializer and ResponseRowDeserializer. Examples of a Serializer and Deserializer are LibSVMRequestRowSerializer and LibSVMResponseRowDeserializer respectively.
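The shape of such a serializer and deserializer pair might look like the sketch below. The two traits are hypothetical simplifications, not the SDK's actual RequestRowSerializer and ResponseRowDeserializer: the method names `serializeRow` and `deserializeResponse` are assumed for illustration, the schema is reduced to a list of column names, and `Row` is a plain `Seq[Any]`.

```scala
import java.nio.charset.StandardCharsets.UTF_8

object SerializerSketch {
  type Row = Seq[Any]      // hypothetical stand-in for a Spark Row
  type Schema = Seq[String] // column names only, for illustration

  // Simplified stand-in for the request serializer contract.
  trait RequestRowSerializerSketch {
    def contentType: String
    def validateSchema(schema: Schema): Unit
    def serializeRow(row: Row): Array[Byte]
  }

  // Simplified stand-in for the response deserializer contract.
  trait ResponseRowDeserializerSketch {
    def accepts: String
    def deserializeResponse(body: Array[Byte]): Iterator[Row]
  }

  // Adapts Rows to a CSV-speaking Endpoint: one comma-separated line per Row.
  class CsvSerializer extends RequestRowSerializerSketch {
    val contentType = "text/csv"
    def validateSchema(schema: Schema): Unit =
      require(schema.nonEmpty, "schema must have at least one column")
    def serializeRow(row: Row): Array[Byte] =
      (row.mkString(",") + "\n").getBytes(UTF_8)
  }

  // Reads a CSV response back into Rows of String values.
  class CsvDeserializer extends ResponseRowDeserializerSketch {
    val accepts = "text/csv"
    def deserializeResponse(body: Array[Byte]): Iterator[Row] =
      new String(body, UTF_8).split("\n").iterator
        .filter(_.nonEmpty)
        .map(line => line.split(",").toSeq)
  }
}
```

A real implementation would extend the SDK's own classes and work with Spark Row and schema types; the CSV format here is only a convenient example of a format-specific pair.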