A Spark FileFormat for serializing DataFrames of labeled vectors to the Amazon Record protobuf file format encoded in RecordIO.
Writes rows of labeled vectors to Amazon protobuf Records encoded in RecordIO.
By default, writes a label column of Doubles named "label" and a features column of Vector[Double]s named "features" to protobuf.
These column names can be reassigned with the options "labelColumnName" and "featuresColumnName".
To write records from a DataFrame in this file format, run:
dataframe.write
  .format("sagemaker")
  .option("labelColumnName", "myLabelColumn")
  .option("featuresColumnName", "myFeaturesColumn")
  .save("my_output_path")
See https://mxnet.incubator.apache.org/architecture/note_data_loading.html for more information on RecordIO.
See https://aws.amazon.com/sagemaker/latest/dg/cdf-training.html/ for more information on the Amazon Record data format.
Utility functions that convert to and from the Amazon Record protobuf data format and encode Records in RecordIO.
See https://mxnet.incubator.apache.org/architecture/note_data_loading.html for more information on RecordIO.
See https://aws.amazon.com/sagemaker/latest/dg/cdf-training.html/ for more information on the Amazon Record data format.
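To make the RecordIO encoding step concrete, the following is a minimal sketch of how a single serialized record might be framed. It assumes the layout described in the RecordIO note linked above: a 4-byte magic number (0xced7230a), a 4-byte length whose lower 29 bits hold the payload size, the payload, and zero padding to a 4-byte boundary, all little-endian. The object name RecordIOSketch and its encode and decode methods are hypothetical and are not part of this library's API.

```scala
import java.nio.{ByteBuffer, ByteOrder}
import scala.collection.mutable.ListBuffer

// Hypothetical helper for illustration only; not the library's own utility class.
object RecordIOSketch {
  // Magic number that starts every RecordIO record.
  val Magic: Int = 0xced7230a

  /** Frames one payload as: magic (4 bytes), length (4 bytes), payload, zero padding to 4 bytes. */
  def encode(payload: Array[Byte]): Array[Byte] = {
    val padding = (4 - payload.length % 4) % 4
    val buffer = ByteBuffer
      .allocate(8 + payload.length + padding)
      .order(ByteOrder.LITTLE_ENDIAN)
    buffer.putInt(Magic)
    // Assumes the payload is shorter than 2^29 bytes, so the upper flag bits stay 0.
    buffer.putInt(payload.length)
    buffer.put(payload)
    buffer.put(new Array[Byte](padding))
    buffer.array()
  }

  /** Reads back the payloads of consecutive records produced by `encode`. */
  def decode(bytes: Array[Byte]): List[Array[Byte]] = {
    val buffer = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN)
    val records = ListBuffer.empty[Array[Byte]]
    while (buffer.remaining() >= 8) {
      require(buffer.getInt() == Magic, "bad RecordIO magic number")
      val length = buffer.getInt() & 0x1fffffff // lower 29 bits hold the length
      val payload = new Array[Byte](length)
      buffer.get(payload)
      // Skip the zero padding that aligns the next record to 4 bytes.
      buffer.position(buffer.position() + (4 - length % 4) % 4)
      records += payload
    }
    records.toList
  }
}
```

In this sketch the payload would be the bytes of one serialized Amazon Record protobuf message. Multi-part records, which use the upper bits of the length field, are deliberately left out.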