| title | hide_title | sidebar_label | description |
| --- | --- | --- | --- |
| R setup | true | R setup | R setup and example for SynapseML |
Requirements: You will need to have R and devtools installed on your machine.
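Before installing, it can help to confirm that the prerequisite is actually present. The following is a minimal sketch using only base R; `missing_packages` is a hypothetical helper name introduced here for illustration, not part of SynapseML or devtools.

```r
# Hypothetical helper: returns the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("devtools"))
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install from CRAN
}
```

The install call is left commented out so the check has no side effects beyond loading namespaces that already exist.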
To install the current SynapseML package for R, use:
It will take some time to install all dependencies. Then, run:
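library(sparklyr)
library(dplyr)
# Ask sparklyr to pull the SynapseML package when the session starts
config <- spark_config()
config$sparklyr.defaultPackages <- "com.microsoft.azure:synapseml_2.12:0.10.2"
sc <- spark_connect(master = "local", config = config)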
This creates a Spark context on your local machine.
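The package string passed to `sparklyr.defaultPackages` is a Maven coordinate that encodes both the Scala version and the SynapseML version. As a rough sketch, it can be assembled from its parts so the version is changed in one place; `synapseml_coordinate` is a hypothetical helper introduced here, and the `2.12` default simply mirrors the coordinate used above.

```r
# Hypothetical helper: assembles the Maven coordinate for a SynapseML release.
synapseml_coordinate <- function(synapseml_version, scala_version = "2.12") {
  sprintf("com.microsoft.azure:synapseml_%s:%s", scala_version, synapseml_version)
}

coordinate <- synapseml_coordinate("0.10.2")
print(coordinate)
# [1] "com.microsoft.azure:synapseml_2.12:0.10.2"
```

The resulting string can then be assigned to `config$sparklyr.defaultPackages` in place of the literal.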
We will then need to import the R wrappers:
We can use the faithful dataset in R:
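faithful_df <- copy_to(sc, faithful)
# Assumption: the data frame is passed as x, and only.model = TRUE returns the fitted model
cmd_model <- ml_clean_missing_data(
  x = faithful_df,
  inputCols = c("eruptions", "waiting"),
  outputCols = c("eruptions_output", "waiting_output"),
  only.model = TRUE)
sdf_transform(cmd_model, faithful_df)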
You should see the output:
# Source: table<sparklyr_tmp_17d66a9d490c> [?? x 4]
# Database: spark_connection
eruptions waiting eruptions_output waiting_output
<dbl> <dbl> <dbl> <dbl>
1 3.600 79 3.600 79
2 1.800 54 1.800 54
3 3.333 74 3.333 74
4 2.283 62 2.283 62
5 4.533 85 4.533 85
6 2.883 55 2.883 55
7 4.700 88 4.700 88
8 3.600 85 3.600 85
9 1.950 51 1.950 51
10 4.350 85 4.350 85
# ... with more rows
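The output columns are identical to the inputs because `faithful` contains no missing values, so there is nothing to clean. To see what the transformation is for, here is a base-R sketch of mean imputation on a copy of the data with two values removed. It assumes that mean replacement is the cleaning mode in effect; `impute_mean` is a hypothetical helper and does not call SynapseML.

```r
# Hypothetical helper: replaces NA entries with the mean of the observed values.
impute_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

df <- faithful                 # built-in dataset, no missing values
df$waiting[c(2, 5)] <- NA      # knock out two values for illustration
df$waiting_output <- impute_mean(df$waiting)

head(df)
```

Rows 2 and 5 of `waiting_output` now hold the mean of the remaining `waiting` values, while every other row is unchanged.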
In Azure Databricks, you can install devtools and the Spark package from a URL, then call spark_connect with method = "databricks":
sc <- spark_connect(method = "databricks")
faithful_df <- copy_to(sc, faithful)
unfit_model = ml_light_gbmregressor(sc, maxDepth=20, featuresCol="waiting", labelCol="eruptions", numIterations=10, unfit.model=TRUE)
ml_train_regressor(faithful_df, labelCol="eruptions", unfit_model)
Our R bindings are built as part of the normal build process. To get a quick build, start at the root of the synapseml directory and run:
./runme TESTS=NONE
unzip ./BuildArtifacts/packages/R/synapseml-0.0.zip
You can then run R in a terminal and install the above files directly: