Package: sl3 1.4.5

Jeremy Coyle

sl3: Pipelines for Machine Learning and Super Learning

A modern implementation of the Super Learner prediction algorithm, coupled with a general purpose framework for composing arbitrary pipelines for machine learning tasks.

Authors: Jeremy Coyle [aut, cre, cph], Nima Hejazi [aut], Oleg Sofrygin [aut], Ivana Malenica [aut], Rachael Phillips [aut], Weixin Cai [ctb], Yulun Wu [ctb], Hugh Jiang [ctb]

sl3_1.4.5.tar.gz
sl3_1.4.5.zip (r-4.5), sl3_1.4.5.zip (r-4.4), sl3_1.4.5.zip (r-4.3)
sl3_1.4.5.tgz (r-4.5-any), sl3_1.4.5.tgz (r-4.4-any), sl3_1.4.5.tgz (r-4.3-any)
sl3_1.4.5.tar.gz (r-4.5-noble), sl3_1.4.5.tar.gz (r-4.4-noble)
sl3_1.4.5.tgz (r-4.4-emscripten), sl3_1.4.5.tgz (r-4.3-emscripten)
sl3.pdf | sl3.html
sl3/json (API)
NEWS

# Install 'sl3' in R:

install.packages('sl3', repos = c('https://tlverse.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/tlverse/sl3/issues

Pkgdown site: https://tlverse.org

Datasets:

bsds - Bicycle sharing time series dataset
cpp - Subset of growth data from the collaborative perinatal project
cpp_1yr - Subset of growth data from the collaborative perinatal project
cpp_imputed - Subset of growth data from the collaborative perinatal project
density_dat - Simulated data with continuous exposure
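
The bundled datasets can be used to try the package end to end. The following is a minimal sketch rather than an official example: it assumes the `cpp_imputed` data contains the covariate columns named below and the outcome `haz`, and that the optional `nnls` package is installed for the `Lrnr_nnls` metalearner. The choice of base learners is purely illustrative.

```r
library(sl3)

# Load one of the datasets shipped with the package.
data(cpp_imputed)

# Assumed column names for the covariates and outcome in cpp_imputed.
covars <- c("apgar1", "apgar5", "parity", "gagebrth",
            "mage", "meducyrs", "sexn")

# Define the prediction task: which data, which covariates, which outcome.
task <- make_sl3_Task(
  data = cpp_imputed,
  covariates = covars,
  outcome = "haz"
)

# Two simple base learners: a GLM and an intercept-only (mean) model.
lrnr_glm <- make_learner(Lrnr_glm)
lrnr_mean <- make_learner(Lrnr_mean)

# Stack the base learners and combine them with a non-negative
# least squares metalearner (requires the 'nnls' package).
stack <- make_learner(Stack, lrnr_glm, lrnr_mean)
sl <- Lrnr_sl$new(learners = stack, metalearner = make_learner(Lrnr_nnls))

# Train the Super Learner on the task and predict on the training data.
sl_fit <- sl$train(task)
preds <- sl_fit$predict()
head(preds)
```

Swapping in other learners from the Exports list follows the same pattern, though many of them depend on additional packages that are not installed with sl3 by default.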

On CRAN:

data-science ensemble-learning ensemble-model machine-learning model-selection regression stacking statistics

9.94 score 100 stars 7 packages 748 scripts 129 exports 122 dependencies

Last updated 4 months ago from: 4bb008e655 (on fix-tests). Checks: 1 OK, 7 ERROR. Indexed: yes.

Target	Result	Latest binary
Doc / Vignettes	OK	Feb 12 2025
R-4.5-win	ERROR	Feb 12 2025
R-4.5-mac	ERROR	Feb 12 2025
R-4.5-linux	ERROR	Feb 12 2025
R-4.4-win	ERROR	Feb 12 2025
R-4.4-mac	ERROR	Feb 12 2025
R-4.3-win	ERROR	Feb 12 2025
R-4.3-mac	ERROR	Feb 12 2025

Exports: args_to_list Custom_chain custom_ROCR_risk customize_chain cv_sl debug_predict debug_train debugonce_predict debugonce_train define_h2o_X delayed_learner_fit_chain delayed_learner_fit_predict delayed_learner_process_formula delayed_learner_subset_covariates delayed_learner_train delayed_make_learner dt_expand_factors factor_to_indicators importance importance_plot inverse_sample learner_fit_chain learner_fit_predict learner_process_formula learner_subset_covariates learner_train loss_loglik_binomial loss_loglik_multinomial loss_loglik_true_cat loss_squared_error loss_squared_error_multivariate Lrnr_arima Lrnr_bartMachine Lrnr_base Lrnr_bayesglm Lrnr_bound Lrnr_caret Lrnr_cv Lrnr_cv_selector Lrnr_dbarts Lrnr_define_interactions Lrnr_density_discretize Lrnr_density_hse Lrnr_density_semiparametric Lrnr_earth Lrnr_expSmooth Lrnr_ga Lrnr_gam Lrnr_gbm Lrnr_glm Lrnr_glm_fast Lrnr_glm_semiparametric Lrnr_glmnet Lrnr_glmtree Lrnr_grf Lrnr_grfcate Lrnr_gru_keras Lrnr_h2o_classifier Lrnr_h2o_glm Lrnr_h2o_grid Lrnr_h2o_mutator Lrnr_hal9001 Lrnr_haldensify Lrnr_HarmonicReg Lrnr_independent_binomial Lrnr_lightgbm Lrnr_lstm_keras Lrnr_mean Lrnr_multiple_ts Lrnr_multivariate Lrnr_nnet Lrnr_nnls Lrnr_optim Lrnr_pca Lrnr_pkg_SuperLearner Lrnr_pkg_SuperLearner_method Lrnr_pkg_SuperLearner_screener Lrnr_polspline Lrnr_pooled_hazards Lrnr_randomForest Lrnr_ranger Lrnr_revere_task Lrnr_rpart Lrnr_rugarch Lrnr_screener_augment Lrnr_screener_coefs Lrnr_screener_correlation Lrnr_screener_importance Lrnr_sl Lrnr_solnp Lrnr_solnp_density Lrnr_stratified Lrnr_subset_covariates Lrnr_svm Lrnr_ts_weights Lrnr_tsDyn Lrnr_xgboost make_learner make_learner_stack make_sl3_Task metalearner_linear metalearner_linear_multinomial metalearner_linear_multivariate metalearner_logistic_binomial normalize_rows pack_predictions Pipeline pooled_hazard_task predict_classes prediction_plot process_data risk safe_dim Shared_Data sl3_debug_mode sl3_list_learners sl3_list_properties sl3_revere_Task sl3_Task sl3Options Stack subset_folds train_task undebug_learner unpack_predictions validation_task variable_type Variable_Type write_learner_template

Dependencies: abind assertthat backports base64enc BBmisc bitops bslib cachem caret caTools checkmate class cli clock codetools colorspace cpp11 crayon data.table delayed diagram digest dplyr e1071 evaluate fansi farver fastmap fontawesome foreach fs future future.apply generics ggplot2 globals glue gower gplots gtable gtools hardhat highr hms htmltools htmlwidgets igraph ipred isoband iterators jquerylib jsonlite KernSmooth knitr labeling lattice lava lifecycle listenv lubridate magrittr MASS Matrix memoise mgcv mime ModelMetrics munsell nlme nnet numDeriv origami parallelly pillar pkgconfig plyr prettyunits pROC prodlim progress progressr proxy purrr R.methodsS3 R.oo R.utils R6 rappdirs rbibutils RColorBrewer Rcpp Rdpack recipes reshape2 rlang rmarkdown ROCR rpart rstackdeque sass scales shape sparsevctrs SQUAREM stringi stringr survival tibble tidyr tidyselect timechange timeDate tinytex tzdb utf8 uuid vctrs viridisLite visNetwork withr xfun yaml

Defining New sl3 Learners

Jeremy Coyle, Nima Hejazi, Ivana Malenica, Oleg Sofrygin

Rendered from custom_lrnrs.Rmd using knitr::rmarkdown on Feb 12 2025.

Last update: 2024-09-23
Started: 2017-08-13

Modern Machine Learning in R

Jeremy Coyle, Nima Hejazi, Ivana Malenica, Oleg Sofrygin

Rendered from intro_sl3.Rmd using knitr::rmarkdown on Feb 12 2025.

Last update: 2024-09-23
Started: 2017-08-13

Readme and manuals

Help Manual

Help page	Topics
Get all arguments of parent call (both specified and defaults) as list	args_to_list
Bicycle sharing time series dataset	bsds
Subset of growth data from the collaborative perinatal project (CPP)	cpp cpp_imputed
Subset of growth data from the collaborative perinatal project (CPP)	cpp_1yr
Customize chaining for a learner	customize_chain Custom_chain
Cross-validated Risk Estimation	cv_risk
Cross-validated Super Learner	cv_sl
Helper functions to debug sl3 Learners	debugonce_predict debugonce_train debug_predict debug_train sl3_debug_mode undebug_learner
Automatically Defined Metalearner	default_metalearner
h2o Model Definition	define_h2o_X Lrnr_h2o_glm
Learner helpers	delayed_learner_fit_chain delayed_learner_fit_predict delayed_learner_process_formula delayed_learner_subset_covariates delayed_learner_train delayed_make_learner learner_fit_chain learner_fit_predict learner_process_formula learner_subset_covariates learner_train
Simulated data with continuous exposure	density_dat
Convert Factors to indicators	dt_expand_factors factor_to_indicators
Importance: extract variable importance measures produced by 'randomForest' and order them in decreasing order of importance.	importance
Variable Importance Plot	importance_plot
Inverse CDF Sampling	inverse_sample
Loss Function Definitions	loss_functions loss_loglik_binomial loss_loglik_multinomial loss_loglik_true_cat loss_squared_error loss_squared_error_multivariate
Univariate ARIMA Models	Lrnr_arima
bartMachine: Bayesian Additive Regression Trees (BART)	Lrnr_bartMachine
Base Class for all sl3 Learners	Lrnr_base make_learner
Bayesian Generalized Linear Models	Lrnr_bayesglm
Bound Predictions	Lrnr_bound
Caret (Classification and Regression) Training	Lrnr_caret
Fit/Predict a learner with Cross Validation	Lrnr_cv
Cross-Validated Selector	Lrnr_cv_selector
Discrete Bayesian Additive Regression Tree sampler	Lrnr_dbarts
Define interactions terms	Lrnr_define_interactions
Density from Classification	Lrnr_density_discretize
Density Estimation With Mean Model and Homoscedastic Errors	Lrnr_density_hse
Density Estimation With Mean Model and Homoscedastic Errors	Lrnr_density_semiparametric
Earth: Multivariate Adaptive Regression Splines	Lrnr_earth
Exponential Smoothing state space model	Lrnr_expSmooth
Nonlinear Optimization via Genetic Algorithm (GA)	Lrnr_ga
GAM: Generalized Additive Models	Lrnr_gam
GBM: Generalized Boosted Regression Models	Lrnr_gbm
Generalized Linear Models	Lrnr_glm
Computationally Efficient Generalized Linear Model (GLM) Fitting	Lrnr_glm_fast
Semiparametric Generalized Linear Models	Lrnr_glm_semiparametric
GLMs with Elastic Net Regularization	Lrnr_glmnet
Generalized Linear Model Trees	Lrnr_glmtree
Generalized Random Forests Learner	Lrnr_grf
Generalized Random Forests for Conditional Average Treatment Effects	Lrnr_grfcate
Recurrent Neural Network with Gated Recurrent Unit (GRU) with Keras	Lrnr_gru_keras
Grid Search Models with h2o	Lrnr_h2o_classifier Lrnr_h2o_grid Lrnr_h2o_mutator
Scalable Highly Adaptive Lasso (HAL)	Lrnr_hal9001
Conditional Density Estimation with the Highly Adaptive LASSO	Lrnr_haldensify
Harmonic Regression	Lrnr_HarmonicReg
Classification from Binomial Regression	Lrnr_independent_binomial
LightGBM: Light Gradient Boosting Machine	Lrnr_lightgbm
Long short-term memory Recurrent Neural Network (LSTM) with Keras	Lrnr_lstm_keras
Fitting Intercept Models	Lrnr_mean
Stratify univariable time-series learners by time-series	Lrnr_multiple_ts
Multivariate Learner	Lrnr_multivariate
Feed-Forward Neural Networks and Multinomial Log-Linear Models	Lrnr_nnet
Non-negative Linear Least Squares	Lrnr_nnls
Optimize Metalearner according to Loss Function using optim	Lrnr_optim
Principal Component Analysis and Regression	Lrnr_pca
Use SuperLearner Wrappers, Screeners, and Methods, in sl3	Lrnr_pkg_SuperLearner Lrnr_pkg_SuperLearner_method Lrnr_pkg_SuperLearner_screener
Polyspline - multivariate adaptive polynomial spline regression (polymars) and polychotomous regression and multiple classification (polyclass)	Lrnr_polspline
Classification from Pooled Hazards	Lrnr_pooled_hazards
Random Forests	Lrnr_randomForest
Ranger: Fast(er) Random Forests	Lrnr_ranger
Learner that chains into a revere task	Lrnr_revere_task
Learner for Recursive Partitioning and Regression Trees	Lrnr_rpart
Univariate GARCH Models	Lrnr_rugarch
Augmented Covariate Screener	Lrnr_screener_augment
Coefficient Magnitude Screener	Lrnr_screener_coefs
Correlation Screening Procedures	Lrnr_screener_correlation
Variable Importance Screener	Lrnr_screener_importance
The Super Learner Algorithm	Lrnr_sl
Nonlinear Optimization via Augmented Lagrange	Lrnr_solnp
Nonlinear Optimization via Augmented Lagrange	Lrnr_solnp_density
Stratify learner fits by a single variable	Lrnr_stratified
Learner with Covariate Subsetting	Lrnr_subset_covariates
Support Vector Machines	Lrnr_svm
Time-specific weighting of prediction losses	Lrnr_ts_weights
Nonlinear Time Series Analysis	Lrnr_tsDyn
xgboost: eXtreme Gradient Boosting	Lrnr_xgboost
Make a stack of sl3 learners	make_learner_stack
Combine predictions from multiple learners	metalearners metalearner_linear metalearner_linear_multinomial metalearner_linear_multivariate metalearner_logistic_binomial
Pack multidimensional predictions into a vector (and unpack again)	normalize_rows pack_predictions print.packed_predictions unpack_predictions
Pipeline (chain) of learners.	Pipeline
Generate A Pooled Hazards Task from a Failure Time (or Categorical) Task	pooled_hazard_task
Predict Class from Predicted Probabilities	predict_classes
Plot predicted and true values for diagnostic purposes	prediction_plot
Process Data	process_data
Risk Estimation	risk
Factory Risk Function for ROCR Performance Measures with Binary Outcomes	custom_ROCR_risk risk_functions
dim that works for vectors too	safe_dim
Container Class for data.table Shared Between Tasks	Shared_Data
List sl3 Learners	sl3_list_learners sl3_list_properties
Revere (SplitSpecific) Task	sl3_revere_Task
Define a Machine Learning Task	make_sl3_Task sl3_Task
Querying/setting a single 'sl3' option	sl3Options
Learner Stacking	Stack
Make folds work on subset of data	subset_folds
Subset Tasks for CV. The functions use origami folds to subset tasks and are used by Lrnr_cv (and therefore by other learners that use Lrnr_cv). So that nested CV works properly, the subsetted task objects currently do not have fold structures of their own, and so generate them from defaults if nested CV is requested.	train_task validation_task
Undocumented Learner	undocumented_learner
Specify Variable Type	Variable_Type variable_type
Generate a file containing a template 'sl3' Learner	write_learner_template