com.amazonaws.services.sagemaker.sparksdk

SageMakerModel

class SageMakerModel extends Model[SageMakerModel]

A Model implementation that transforms a DataFrame by making requests to a SageMaker Endpoint. It manages the life cycle of all necessary SageMaker entities, including Model, EndpointConfig, and Endpoint.

This Model transforms one DataFrame to another by repeated, distributed SageMaker Endpoint invocation. Each invocation request body is formed by concatenating input DataFrame Rows serialized to Byte Arrays by the specified serializer. The invocation request content-type property is set from contentType. The invocation request accepts property is set from the deserializer's accepts.
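
As a rough illustration of how a request body is assembled, the sketch below concatenates per-row byte arrays in input order. It is a simplified stand-in, not the SDK's code: `SimpleRow`, `serializeRow`, and `buildRequestBody` are hypothetical names, rows are plain sequences of strings rather than Spark Rows, and the comma-separated format is only an assumed example.

```scala
import java.nio.charset.StandardCharsets.UTF_8

// Simplified stand-in: a "row" is a sequence of strings, not a Spark Row.
object RequestBodySketch {
  type SimpleRow = Seq[String]

  // Hypothetical serializer: one comma-separated, newline-terminated line per row.
  def serializeRow(row: SimpleRow): Array[Byte] =
    (row.mkString(",") + "\n").getBytes(UTF_8)

  // The request body is the serialized rows concatenated in input order.
  def buildRequestBody(rows: Seq[SimpleRow]): Array[Byte] =
    Array.concat(rows.map(serializeRow): _*)
}
```

Under these assumptions, two rows `("1", "2")` and `("3", "4")` would yield a single body containing two newline-terminated lines.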

The transformed DataFrame is produced by deserializing each invocation response body into a series of Rows. Row deserialization is delegated to the specified deserializer, which converts an Array of Bytes to an Iterator[Row]. If prependResultRows is false, the transformed DataFrame contains just these Rows. If prependResultRows is true, each transformed Row is the concatenation of the input Row with its corresponding deserialized Row from the SageMaker invocation.
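
The effect of the prepend flag (named prependResultRows in the constructor signature) can be pictured with a small model. This is only a sketch under simplifying assumptions: rows are plain sequences, input and result rows are assumed to line up one-to-one, and `combine` is a hypothetical helper rather than an SDK method.

```scala
// Simplified stand-in: rows are plain sequences of values, not Spark Rows.
object PrependRowsSketch {
  type SimpleRow = Seq[Any]

  // When prepend is true, each output row is the input row followed by the
  // corresponding result row; otherwise only the result rows are returned.
  def combine(
      inputRows: Seq[SimpleRow],
      resultRows: Seq[SimpleRow],
      prepend: Boolean): Seq[SimpleRow] =
    if (prepend) inputRows.zip(resultRows).map { case (in, out) => in ++ out }
    else resultRows
}
```

With the flag enabled, the output keeps the original columns on the left and appends the endpoint's columns on the right, which is usually what a downstream pipeline stage needs for evaluation.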

Each invocation of transform passes the Dataset.schema of the input DataFrame to requestRowSerializer by invoking setSchema.

The specified serializer also controls the validity of input Row Schemas for this Model. Schema validation is carried out on each call to transformSchema, which invokes validateSchema.
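
A minimal model of this validation step, combined with the output-schema rule stated in the transformSchema documentation (the response schema, optionally prefixed by the input schema), might look as follows. The names are hypothetical and a schema is reduced to a list of column names; the real method works on Spark's StructType.

```scala
object TransformSchemaSketch {
  // Simplified stand-in: a schema is just an ordered list of column names.
  type SimpleSchema = Seq[String]

  def transformSchema(
      input: SimpleSchema,
      validate: SimpleSchema => Unit, // plays the role of the serializer's validateSchema
      responseSchema: SimpleSchema,   // plays the role of the deserializer's schema
      prepend: Boolean): SimpleSchema = {
    validate(input) // expected to throw if the serializer cannot handle the input
    if (prepend) input ++ responseSchema else responseSchema
  }
}
```

Because validation runs before any output schema is produced, an unsupported input schema fails fast, before any endpoint is invoked.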

Adapting this SageMaker model to the data format and type of a specific Endpoint is achieved by sub-classing RequestRowSerializer and ResponseRowDeserializer. Examples of a Serializer and Deserializer are LibSVMRequestRowSerializer and LibSVMResponseRowDeserializer respectively.
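
To show the shape of such an adapter pair without depending on the SDK, the sketch below defines two hypothetical traits that mirror the serializer and deserializer roles, plus an assumed CSV implementation of each. The trait names, the `text/csv` media type, and the use of string sequences in place of Spark Rows are all illustrative assumptions; a real adapter would extend the SDK's own classes.

```scala
import java.nio.charset.StandardCharsets.UTF_8

object CsvAdapterSketch {
  type SimpleRow = Seq[String]
  type SimpleSchema = Seq[String] // column names only

  // Hypothetical stand-ins for the serializer and deserializer roles.
  trait RowSerializerSketch {
    def contentType: String
    def validateSchema(schema: SimpleSchema): Unit
    def serializeRow(row: SimpleRow): Array[Byte]
  }
  trait ResponseDeserializerSketch {
    def accepts: String
    def deserializeResponse(body: Array[Byte]): Iterator[SimpleRow]
  }

  object CsvSerializer extends RowSerializerSketch {
    val contentType = "text/csv"
    def validateSchema(schema: SimpleSchema): Unit =
      require(schema.nonEmpty, "schema must have at least one column")
    def serializeRow(row: SimpleRow): Array[Byte] =
      (row.mkString(",") + "\n").getBytes(UTF_8)
  }

  object CsvDeserializer extends ResponseDeserializerSketch {
    val accepts = "text/csv"
    // Splits the response body into lines, then each line into fields.
    def deserializeResponse(body: Array[Byte]): Iterator[SimpleRow] =
      new String(body, UTF_8).split("\n").iterator
        .filter(_.nonEmpty)
        .map(_.split(",").toSeq)
  }
}
```

Keeping the wire format in these two small classes means the model itself stays unchanged when an Endpoint expects a different payload.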

Linear Supertypes
Model[SageMakerModel], Transformer, PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Instance Constructors

  1. new SageMakerModel(endpointInstanceType: Option[String], endpointInitialInstanceCount: Option[Int], requestRowSerializer: RequestRowSerializer, responseRowDeserializer: ResponseRowDeserializer, existingEndpointName: Option[String] = Option.empty, modelImage: Option[String] = Option.empty, modelPath: Option[S3DataPath] = Option.empty, modelEnvironmentVariables: Map[String, String] = Map[String, String](), modelExecutionRoleARN: Option[String] = Option.empty, endpointCreationPolicy: EndpointCreationPolicy = ..., sagemakerClient: AmazonSageMaker = ..., prependResultRows: Boolean = true, namePolicy: NamePolicy = new RandomNamePolicy(), uid: String = Identifiable.randomUID("sagemaker"))

    endpointInstanceType

    The instance type used to run the model container.

    endpointInitialInstanceCount

    The minimum number of instances used to host the model.

    requestRowSerializer

    Serializes a Row to an Array of Bytes.

    responseRowDeserializer

    Deserializes an Array of Bytes to a series of Rows.

    existingEndpointName

    An endpoint name.

    modelImage

    A Docker image URI.

    modelPath

    An S3 location that a successfully completed SageMaker Training Job has stored its model output to.

    modelEnvironmentVariables

    The environment variables that SageMaker will set on the model container during execution.

    modelExecutionRoleARN

    The IAM Role used by SageMaker when running the hosted Model and to download model data from S3.

    endpointCreationPolicy

    Whether the endpoint is created upon SageMakerModel construction, transformation, or not at all.

    sagemakerClient

    Amazon SageMaker client. Used to send CreateTrainingJob, CreateModel, and CreateEndpoint requests.

    prependResultRows

    Whether the transformation result should also include the input Rows. If true, each output Row is the concatenation of the input Row with the corresponding Row that responseRowDeserializer produces from the SageMaker invocation response. If false, each output Row is taken from responseRowDeserializer alone.

    namePolicy

    The NamePolicy to use when naming SageMaker entities created during usage of this Model.

    uid

    The unique identifier of this Model. Used to represent this stage in Spark ML pipelines.
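
The three outcomes described for endpointCreationPolicy (create on construction, create on transformation, or never create) can be modelled with a small state machine. This is a sketch under stated assumptions, not the SDK's logic: the policy case objects, `ModelSketch`, and the endpoint names are hypothetical, and "creating" an endpoint is reduced to recording a name.

```scala
object EndpointPolicySketch {
  // Hypothetical stand-ins for the three policy outcomes.
  sealed trait Policy
  case object CreateOnConstruct extends Policy
  case object CreateOnTransform extends Policy
  case object DoNotCreate extends Policy

  class ModelSketch(policy: Policy, existingEndpointName: Option[String]) {
    private var created: Option[String] = None

    // Eager creation happens here, during construction.
    if (policy == CreateOnConstruct && existingEndpointName.isEmpty) create()

    private def create(): Unit = created = Some("sketch-endpoint")

    // An existing endpoint takes precedence over one created by this sketch.
    def endpointName: Option[String] = existingEndpointName.orElse(created)

    // Lazy creation happens on first use; returns the endpoint that would be invoked.
    def transform(): String = {
      if (endpointName.isEmpty && policy == CreateOnTransform) create()
      endpointName.getOrElse(
        throw new IllegalStateException("no endpoint and policy forbids creating one"))
    }
  }
}
```

In this model, supplying an existing endpoint name makes the policy irrelevant, while the never-create policy without an existing endpoint leaves nothing to invoke.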

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T

    Attributes
    protected
    Definition Classes
    Params
  4. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  5. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  6. final def clear(param: Param[_]): SageMakerModel.this.type

    Definition Classes
    Params
  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. def copy(extra: ParamMap): SageMakerModel

    Creates a copy of the SageMakerModel instance with the same instance variables and extra params.

    Sets the EndpointCreationPolicy to EndpointCreationPolicy.DO_NOT_CREATE regardless of the original EndpointCreationPolicy value so that copies do not make Endpoints.

    extra

    Params to be applied to the new instance

    returns

    The copy of SageMakerModel

    Definition Classes
    SageMakerModel → Model → Transformer → PipelineStage → Params
  9. def copyValues[T <: Params](to: T, extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  10. final def defaultCopy[T <: Params](extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  11. val endpointCreationPolicy: EndpointCreationPolicy

    Whether the endpoint is created upon SageMakerModel construction, transformation, or not at all.

  12. val endpointInitialInstanceCount: Option[Int]

    The minimum number of instances used to host the model.

  13. val endpointInstanceType: Option[String]

    The instance type used to run the model container.

  14. def endpointName: Option[String]

    An endpoint name if it exists or None otherwise.

    returns

    An endpoint name.

  15. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  16. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  17. val existingEndpointName: Option[String]

    An endpoint name.

  18. def explainParam(param: Param[_]): String

    Definition Classes
    Params
  19. def explainParams(): String

    Definition Classes
    Params
  20. final def extractParamMap(): ParamMap

    Definition Classes
    Params
  21. final def extractParamMap(extra: ParamMap): ParamMap

    Definition Classes
    Params
  22. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  23. final def get[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  24. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  25. def getCreatedResources: CreatedResources

    Gets potentially created resources during operation of the SageMakerModel.

    returns

    Resources that may have been created.

  26. final def getDefault[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  27. final def getOrDefault[T](param: Param[T]): T

    Definition Classes
    Params
  28. def getParam(paramName: String): Param[Any]

    Definition Classes
    Params
  29. final def hasDefault[T](param: Param[T]): Boolean

    Definition Classes
    Params
  30. def hasParam(paramName: String): Boolean

    Definition Classes
    Params
  31. def hasParent: Boolean

    Definition Classes
    Model
  32. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  33. def initializeLogIfNecessary(isInterpreter: Boolean): Unit

    Attributes
    protected
    Definition Classes
    Logging
  34. final def isDefined(param: Param[_]): Boolean

    Definition Classes
    Params
  35. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  36. final def isSet(param: Param[_]): Boolean

    Definition Classes
    Params
  37. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  38. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  39. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  40. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  41. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  42. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  43. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  44. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  45. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  46. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  47. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  48. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  49. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  50. val modelEnvironmentVariables: Map[String, String]

    The environment variables that SageMaker will set on the model container during execution.

  51. val modelExecutionRoleARN: Option[String]

    The IAM Role used by SageMaker when running the hosted Model and to download model data from S3.

  52. val modelImage: Option[String]

    A Docker image URI.

  53. val modelPath: Option[S3DataPath]

    An S3 location that a successfully completed SageMaker Training Job has stored its model output to.

  54. val namePolicy: NamePolicy

    The NamePolicy to use when naming SageMaker entities created during usage of this Model.

  55. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  56. final def notify(): Unit

    Definition Classes
    AnyRef
  57. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  58. lazy val params: Array[Param[_]]

    Definition Classes
    Params
  59. var parent: Estimator[SageMakerModel]

    Definition Classes
    Model
  60. val prependResultRows: Boolean

    Whether the transformation result should also include the input Rows. If true, each output Row is the concatenation of the input Row with the corresponding Row that responseRowDeserializer produces from the SageMaker invocation response. If false, each output Row is taken from responseRowDeserializer alone.

  61. val requestRowSerializer: RequestRowSerializer

    Serializes a Row to an Array of Bytes.

  62. val responseRowDeserializer: ResponseRowDeserializer

    Deserializes an Array of Bytes to a series of Rows.

  63. val sagemakerClient: AmazonSageMaker

    Amazon SageMaker client. Used to send CreateTrainingJob, CreateModel, and CreateEndpoint requests.

  64. final def set(paramPair: ParamPair[_]): SageMakerModel.this.type

    Attributes
    protected
    Definition Classes
    Params
  65. final def set(param: String, value: Any): SageMakerModel.this.type

    Attributes
    protected
    Definition Classes
    Params
  66. final def set[T](param: Param[T], value: T): SageMakerModel.this.type

    Definition Classes
    Params
  67. final def setDefault(paramPairs: ParamPair[_]*): SageMakerModel.this.type

    Attributes
    protected
    Definition Classes
    Params
  68. final def setDefault[T](param: Param[T], value: T): SageMakerModel.this.type

    Attributes
    protected
    Definition Classes
    Params
  69. def setParent(parent: Estimator[SageMakerModel]): SageMakerModel

    Definition Classes
    Model
  70. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  71. def toString(): String

    Definition Classes
    Identifiable → AnyRef → Any
  72. def transform(dataset: Dataset[_]): DataFrame

    Transforms the input dataset.

    Transforms Dataset to DataFrame by repeated, distributed SageMaker Endpoint invocation using RequestBatchIterator. Creates all necessary SageMaker entities if specified by the EndpointCreationPolicy and if an Endpoint doesn't exist yet.

    dataset

    An input dataset.

    returns

    Transformed dataset.

    Definition Classes
    SageMakerModel → Transformer
  73. def transform(dataset: Dataset[_], paramMap: ParamMap): DataFrame

    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" )
  74. def transform(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): DataFrame

    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" ) @varargs()
  75. def transformSchema(schema: StructType): StructType

    Checks transform validity of the input schema and provides the output schema.

    Validates the input schema against RequestRowSerializer and returns ResponseRowDeserializer schema. Prepends the output with the input schema if required by the SageMakerModel.

    schema

    Input schema to be validated and transformed.

    returns

    Output schema

    Definition Classes
    SageMakerModel → PipelineStage
    Annotations
    @DeveloperApi()
  76. def transformSchema(schema: StructType, logging: Boolean): StructType

    Attributes
    protected
    Definition Classes
    PipelineStage
    Annotations
    @DeveloperApi()
  77. val uid: String

    The unique identifier of this Model. Used to represent this stage in Spark ML pipelines.

    Definition Classes
    SageMakerModel → Identifiable
  78. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  79. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  80. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
