com.amazonaws.services.sagemaker.sparksdk.algorithms
The SageMaker TrainingJob and Hosting IAM Role. Used by SageMaker to access S3 and ECR resources. SageMaker hosted Endpoint instances launched by this Estimator run with this role.
The SageMaker TrainingJob Instance Type to use.
The number of instances of instanceType to run a SageMaker Training Job with.
The SageMaker Endpoint Config instance type.
The SageMaker Endpoint Config minimum number of instances that can be used to host modelImage.
Serializes Spark DataFrame Rows for transformation by Models built from this Estimator.
Deserializes an Endpoint response into a series of Rows.
An S3 location to upload SageMaker Training Job input data to.
An S3 location for SageMaker to store Training Job output data to.
The EBS volume size, in gigabytes, of each instance.
The columns to project from the Dataset being fit before training. If an Optional.empty is passed then no specific projection will occur and all columns will be serialized.
The SageMaker Channel name to which serialized Dataset fit input is uploaded.
The MIME type of the training data.
The SageMaker Training Job S3 data distribution scheme.
The Spark Data Format name used to serialize the Dataset being fit for input to SageMaker.
The Spark Data Format Options used during serialization of the Dataset being fit.
The SageMaker Training Job Channel input mode.
The type of compression to use when serializing the Dataset being fit for input to SageMaker.
A SageMaker Training Job Termination Condition MaxRuntimeInHours.
A KMS key ID for the Output Data Source.
The environment variables that SageMaker will set on the model container during execution.
Defines how a SageMaker Endpoint referenced by a SageMakerModel is created.
Amazon SageMaker client. Used to send CreateTrainingJob, CreateModel, and CreateEndpoint requests.
The region in which to run the algorithm. If not specified, gets the region from the DefaultAwsRegionProviderChain.
AmazonS3. Used to create a bucket for staging SageMaker Training Job input and/or output if either are set to S3AutoCreatePath.
AmazonSTS. Used to resolve the account number when creating staging input / output buckets.
Whether the transformation result on Models built by this Estimator should also include the input Rows. If true, each output Row is formed by a concatenation of the input Row with the corresponding Row produced by SageMaker Endpoint invocation, produced by responseRowDeserializer. If false, each output Row is just taken from responseRowDeserializer.
Whether to remove the training data from S3 after the training job completes or fails.
The NamePolicyFactory to use when naming SageMaker entities created during fit.
The unique identifier of this Estimator. Used to represent this stage in Spark ML pipelines.
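The parameters above are supplied when the estimator is constructed. A minimal construction sketch in Scala, assuming defaults for everything not shown; the role ARN and instance settings are placeholders, not recommendations:

    import com.amazonaws.services.sagemaker.sparksdk.IAMRole
    import com.amazonaws.services.sagemaker.sparksdk.algorithms.XGBoostSageMakerEstimator

    // Placeholder role ARN; substitute a role that can access S3 and ECR.
    val estimator = new XGBoostSageMakerEstimator(
      sagemakerRole = IAMRole("arn:aws:iam::123456789012:role/SageMakerRole"),
      trainingInstanceType = "ml.m4.xlarge",
      trainingInstanceCount = 1,
      endpointInstanceType = "ml.m4.xlarge",
      endpointInitialInstanceCount = 1)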
L1 regularization term on weights. Increasing this value makes the model more conservative. Default = 0
The initial prediction score of all instances, global bias. Default = 0.5
Which booster to use. Can be gbtree, gblinear or dart. The gbtree and dart values use a tree-based model, while gblinear uses a linear function. Default = gbtree
Subsample ratio of columns for each split, in each level. Must be in (0, 1]. Default = 1
Subsample ratio of columns when constructing each tree. Must be in (0, 1]. Default = 1
Whether to remove the training data from S3 after the training job completes or fails.
Defines how a SageMaker Endpoint referenced by a SageMakerModel is created.
The SageMaker Endpoint Config minimum number of instances that can be used to host modelImage.
The SageMaker Endpoint Config instance type.
Step size shrinkage used in update to prevent overfitting. After each boosting step, we can directly get the weights of new features, and eta shrinks the feature weights to make the boosting process more conservative. Must be in [0, 1]. Default = 0.3
Evaluation metrics for validation data. A default metric is assigned according to the objective: rmse for regression, error for classification, and map for ranking.
Fits a SageMakerModel on dataSet by running a SageMaker training job.
Minimum loss reduction required to make a further partition on a leaf node of the tree. The larger the value, the more conservative the algorithm will be. Must be >= 0. Default = 0
Controls the way that new nodes are added to the tree. Can be "depthwise" or "lossguide". Currently supported only if tree_method is set to hist. Default = "depthwise"
A map from hyperParameter names to their respective values for training.
L2 regularization term on weights. Increasing this value makes the model more conservative. Default = 1
L2 regularization term on bias. Must be in [0, 1]. Default = 0.0
Maximum number of discrete bins to bucket continuous features. Used only if tree_method=hist. Default = 256
Maximum delta step allowed for each tree's weight estimation. A positive integer makes the update step more conservative; it is most useful in logistic regression, where setting it to a value from 1 to 10 can help control the update. Must be >= 0. Default = 0
Maximum depth of a tree. Increasing this value makes the model more complex (and more likely to overfit). 0 indicates no limit; a limit is required when grow_policy=depthwise. Must be >= 0. Default = 6
Maximum number of nodes to be added. Relevant only if grow_policy = lossguide. Must be >= 0. Default = 0
Minimum sum of instance weight (hessian) needed in a child. If the tree partition step results in a leaf node with a sum of instance weight less than min_child_weight, the building process gives up further partitioning. In linear regression mode, this simply corresponds to the minimum number of instances needed in each node. The larger the value, the more conservative the algorithm will be. Must be >= 0. Default = 1
The environment variables that SageMaker will set on the model container during execution.
A SageMaker Model hosting Docker image URI.
Whether the transformation result on Models built by this Estimator should also include the input Rows. If true, each output Row is formed by a concatenation of the input Row with the corresponding Row produced by SageMaker Endpoint invocation, produced by responseRowDeserializer. If false, each output Row is just taken from responseRowDeserializer.
Number of parallel threads used to run xgboost. Must be >= 1. Defaults to the maximum number of threads available.
The NamePolicyFactory to use when naming SageMaker entities created during fit.
Type of normalization algorithm. Can be "tree" or "forest". Default = "tree"
The number of classes. Used for softmax multiclass classification; no default.
Number of rounds for gradient boosting. Must be >= 1. Required (see the setter sketch following this list).
Specifies the learning task and the corresponding learning objective. Default: "reg:linear"
Whether to drop at least one tree during the dropout. Default = 0
The type of boosting process to run. Can be default or update. Default = "default"
Dropout rate (a fraction of previous trees to drop during the dropout). Must be in [0, 1]. Default = 0.0
A parameter of the 'refresh' updater plugin. When set to 1, tree leaves as well as tree node stats are updated; when set to 0, only tree node stats are updated. Default = 1
The region in which to run the algorithm. If not specified, gets the region from the DefaultAwsRegionProviderChain.
Serializes Spark DataFrame Rows for transformation by Models built from this Estimator.
Deserializes an Endpoint response into a series of Rows.
AmazonS3. Used to create a bucket for staging SageMaker Training Job input and/or output if either are set to S3AutoCreatePath.
Amazon SageMaker client. Used to send CreateTrainingJob, CreateModel, and CreateEndpoint requests.
The SageMaker TrainingJob and Hosting IAM Role. Used by SageMaker to access S3 and ECR resources. SageMaker hosted Endpoint instances launched by this Estimator run with this role.
Type of sampling algorithm. Can be "uniform" or "weighted". Default = "uniform"
Controls the balance of positive and negative weights. It's useful for unbalanced classes. A typical value to consider: sum(negative cases) / sum(positive cases). Default = 1
Random number seed. Default = 0
Whether in silent mode. Can be 0 or 1; 0 means printing running messages, while 1 means silent mode. Default = 0
Used only for the approximate greedy algorithm. Translates into O(1 / sketch_eps) number of bins (for example, sketch_eps = 0.03 yields roughly 33 bins). Compared to directly selecting the number of bins, this comes with a theoretical guarantee of sketch accuracy. Must be in (0, 1). Default = 0.03
Probability of skipping the dropout procedure during a boosting iteration. Must be in [0, 1]. Default: 0
AmazonSTS. Used to resolve the account number when creating staging input / output buckets.
Subsample ratio of the training instances. Setting it to 0.5 means that XGBoost randomly samples half of the data instances to grow trees, which helps prevent overfitting. Must be in (0, 1]. Default = 1
The SageMaker Channel name to which serialized Dataset fit input is uploaded.
The type of compression to use when serializing the Dataset being fit for input to SageMaker.
The MIME type of the training data.
A SageMaker Training Job Algorithm Specification Training Image Docker image URI.
The SageMaker Training Job Channel input mode.
An S3 location to upload SageMaker Training Job input data to.
The number of instances of instanceType to run a SageMaker Training Job with.
The SageMaker TrainingJob Instance Type to use.
The EBS volume size, in gigabytes, of each instance.
A KMS key ID for the Output Data Source.
A SageMaker Training Job Termination Condition MaxRuntimeInHours.
An S3 location for SageMaker to store Training Job output data to.
The columns to project from the Dataset being fit before training. If an Optional.empty is passed then no specific projection will occur and all columns will be serialized.
The SageMaker Training Job S3 data distribution scheme.
The Spark Data Format name used to serialize the Dataset being fit for input to SageMaker.
The Spark Data Format Options used during serialization of the Dataset being fit.
The tree construction algorithm used in XGBoost. Can be auto, exact, approx, hist. Default = "auto"
Parameter that controls the variance of the Tweedie distribution. Must be in (1, 2). Default = 1.5
The unique identifier of this Estimator. Used to represent this stage in Spark ML pipelines.
A comma-separated string that defines the sequence of tree updaters to run. This provides a modular way to construct and modify the trees. Default = "grow_colmaker,prune"
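The hyperparameters above are exposed on the estimator as Spark ML Params with corresponding setters. A brief sketch, reusing the estimator constructed earlier; the setter names follow the library's convention, and the values are purely illustrative:

    estimator.setNumRound(100)         // required: number of boosting rounds
    estimator.setObjective("reg:linear")
    estimator.setEta(0.2)              // step size shrinkage, in [0, 1]
    estimator.setMaxDepth(6)           // deeper trees -> more complex model
    estimator.setSubsample(0.8)        // row subsample ratio, in (0, 1]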
A SageMakerEstimator that runs an XGBoost training job in SageMaker and returns a SageMakerModel that can be used to transform a DataFrame using the hosted XGBoost model. XGBoost is an open-source distributed gradient boosting library that has been adapted to run on Amazon SageMaker.
XGBoost trains and infers on LibSVM-formatted data. XGBoostSageMakerEstimator uses Spark's LibSVMFileFormat to write the training DataFrame to S3 and serializes Rows to LibSVM for inference, selecting by default the column named "features", which is expected to contain a Vector of Doubles.
Inferences made against an Endpoint hosting an XGBoost model contain a "prediction" field appended to the input DataFrame as a column of Doubles, containing the prediction corresponding to the given Vector of features.
See https://github.com/dmlc/xgboost for more on XGBoost.
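A short end-to-end sketch, reusing the estimator from the earlier snippets; the S3 paths and feature count are placeholders:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder.getOrCreate()

    // LibSVM-formatted training data, as the estimator expects.
    val trainingData = spark.read.format("libsvm")
      .option("numFeatures", "784")
      .load("s3://my-bucket/train/")

    // Runs a SageMaker training job, then hosts the resulting model.
    val model = estimator.fit(trainingData)

    // Transforming invokes the hosted endpoint and appends a "prediction"
    // column of Doubles to the input rows.
    val testData = spark.read.format("libsvm")
      .option("numFeatures", "784")
      .load("s3://my-bucket/test/")
    model.transform(testData).select("features", "prediction").show()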