Predictive modeling
modeling
Predictive modeling comparison in R using caret and tidymodels for preprocessing, resampling, training, and model evaluation.
Published
June 5, 2022
Overview
This post compares the Caret and TidyModels R packages. Data modelling with R runs from data preprocessing and parameter assessment through to predicting an outcome. Both sets of packages can be used to achieve the same results, with the aim of finding the best predictive performance for data-specific models.
Predictive modeling - TreeMap
The Caret package is the starting point for understanding how to manage models and produce unbiased predictions with R. Like the TidyModels meta package, it provides a way to construct a multivariate model syntax for managing several models applied to the same set of data. TidyModels allows the use of a set of chained functions, in partnership with the TidyVerse grammar, to build a structural model base that blends different models into one global model.
The following is an attempt at a comparison between the two predictive modeling structures.
Caret package
The most important functions in this package, grouped by modeling step, are:
Preprocessing (data cleaning/wrangling)
preProcess()
Data splitting and resampling
createDataPartition()
createResample()
createTimeSlices()
Model fit and prediction
train()
predict()
Model comparison
confusionMatrix()
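The steps above can be sketched end to end as follows. This is a minimal illustration, not the post's own analysis: the built-in iris data, the 80/20 split, the k-nearest-neighbours method and the 5-fold cross-validation are all assumptions chosen to keep the example small and free of extra package dependencies.

```r
library(caret)

set.seed(123)

# Data splitting: stratified 80/20 split on the outcome (assumed proportions)
train_idx <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train_df  <- iris[train_idx, ]
test_df   <- iris[-train_idx, ]

# Preprocessing: centering and scaling are estimated on the training set only;
# non-numeric columns such as Species are ignored by preProcess()
pp       <- preProcess(train_df, method = c("center", "scale"))
train_pp <- predict(pp, train_df)
test_pp  <- predict(pp, test_df)

# Other resampling helpers: bootstrap indices and rolling time-series windows
boots  <- createResample(train_df$Species, times = 5)
slices <- createTimeSlices(1:20, initialWindow = 10, horizon = 2)

# Model fit: k-nearest neighbours tuned with 5-fold cross-validation
fit <- train(
  Species ~ .,
  data      = train_pp,
  method    = "knn",
  trControl = trainControl(method = "cv", number = 5)
)

# Prediction and evaluation on the held-out data
pred <- predict(fit, newdata = test_pp)
cm   <- confusionMatrix(pred, test_pp$Species)
cm$overall["Accuracy"]
```

Note that predict() appears twice with different roles: applied to the preProcess object it transforms data, and applied to the train object it returns class predictions.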
TidyModels meta package
This "meta package" is made up of a set of packages for modeling, supported by other well-known packages for data manipulation and visualization such as broom, dplyr, ggplot2, purrr, infer, modeldata, and tibble; it includes:
recipes (a preprocessor)
rsample (for resampling)
parsnip (model syntax)
tune and dials (optimization of hyperparameters)
workflows and workflowsets (combine pre-processing steps and models)
yardstick (for evaluating models)
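As a rough sketch of how these packages fit together, the example below chains one piece from each into a single pipeline. The iris data, the decision tree with the rpart engine, the 5-level tuning grid and the accuracy metric are assumptions made for illustration only; they are not taken from the post.

```r
library(tidymodels)

set.seed(123)

# rsample: stratified train/test split and 5-fold cross-validation
split    <- initial_split(iris, prop = 0.8, strata = Species)
train_df <- training(split)
test_df  <- testing(split)
folds    <- vfold_cv(train_df, v = 5, strata = Species)

# recipes: preprocessing specification (normalise numeric predictors)
rec <- recipe(Species ~ ., data = train_df) %>%
  step_normalize(all_numeric_predictors())

# parsnip: model specification, with one hyperparameter marked for tuning
tree_spec <- decision_tree(cost_complexity = tune()) %>%
  set_engine("rpart") %>%
  set_mode("classification")

# workflows: bundle the recipe and the model
wf <- workflow() %>%
  add_recipe(rec) %>%
  add_model(tree_spec)

# dials + tune: build a small grid and evaluate it over the folds
grid  <- grid_regular(cost_complexity(), levels = 5)
tuned <- tune_grid(wf, resamples = folds, grid = grid,
                   metrics = metric_set(accuracy))

# Finalise the workflow with the best value and refit on the training set
best      <- select_best(tuned, metric = "accuracy")
final_fit <- finalize_workflow(wf, best) %>%
  fit(data = train_df)

# yardstick: evaluate predictions on the held-out data
preds <- predict(final_fit, new_data = test_df) %>%
  bind_cols(test_df %>% select(Species))
acc <- accuracy(preds, truth = Species, estimate = .pred_class)
acc
```

The workflow object plays roughly the role that train() plays in Caret: it keeps preprocessing and model together so that both are re-estimated inside each resample.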
The most important functions in this meta package, grouped by modeling step, are: