Hannah Frick
Tidymodels team, RStudio
Date and time
-

Are you a data scientist or statistician looking to do some machine learning? Do you already know why you want to split your data into training and test sets? Do you know which models you want to try out, but don't want to memorize the syntax details for each one? Are you aware of scikit-learn but would prefer to work in R?

This workshop offers an introductory tour of tidymodels, a framework for modeling and machine learning using tidyverse principles. It lets you build up your workflow in clear steps with consistency, flexibility, and sensible defaults. We'll walk through an example case study to show how you can specify a range of models, bundle preprocessing and model fitting to avoid data leakage, resample your data, and tune your models to avoid overfitting.
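To give a flavor of these steps, here is a minimal sketch rather than the workshop's actual case study. The `ames` housing data from modeldata, the three predictors, the decision tree model, and the grid size are all assumptions chosen for illustration.

```r
library(tidymodels)

set.seed(123)

# Illustrative data only: the workshop's case study may use something else
data(ames, package = "modeldata")

# Split into training and test sets
ames_split <- initial_split(ames, prop = 0.8)
ames_train <- training(ames_split)

# Specify a model; tune() marks a parameter to be tuned later
tree_spec <- decision_tree(cost_complexity = tune()) %>%
  set_engine("rpart") %>%
  set_mode("regression")

# Preprocessing, estimated from the training data only
ames_rec <- recipe(Sale_Price ~ Gr_Liv_Area + Year_Built + Bldg_Type,
                   data = ames_train) %>%
  step_dummy(all_nominal_predictors()) %>%
  step_zv(all_predictors())

# Bundle preprocessing and model so they are always fitted together,
# which keeps information from held-out data out of the preprocessing
ames_wflow <- workflow() %>%
  add_recipe(ames_rec) %>%
  add_model(tree_spec)

# Resample the training set and tune over a small grid of candidates
ames_folds <- vfold_cv(ames_train, v = 5)
tree_res <- tune_grid(ames_wflow, resamples = ames_folds, grid = 5)

# Finalize with the best candidate and evaluate once on the test set
best_params <- select_best(tree_res, metric = "rmse")
final_fit <- ames_wflow %>%
  finalize_workflow(best_params) %>%
  last_fit(ames_split)

collect_metrics(final_fit)
```

The test set is touched only once, in `last_fit()`; every tuning decision is based on the resamples of the training set.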

Participants are asked to install the following software and packages before the workshop:

- A recent version of R (>= 4.0.0), which is available for free at https://cran.r-project.org/

- A recent version of RStudio Desktop (RStudio Desktop Open Source License, at least v2022.02), available for free at https://www.rstudio.com/download

- The following R packages, which you can install from the R console: install.packages(c("tidyverse", "tidymodels", "modeldata", "kknn", "ranger", "rpart", "rpart.plot", "rattle", "vip", "partykit", "vetiver", "xgboost"))
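
Once installation has finished, a quick check along the following lines can confirm that every package is available. This is a sketch using base R only; the variable names are illustrative.

```r
# Packages requested for the workshop
pkgs <- c("tidyverse", "tidymodels", "modeldata", "kknn", "ranger", "rpart",
          "rpart.plot", "rattle", "vip", "partykit", "vetiver", "xgboost")

# requireNamespace() returns TRUE if a package can be loaded, without attaching it
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All workshop packages are installed.")
}
```

Any package reported as missing can be installed again with `install.packages()`.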