The R package sparsediscrim provides a collection of sparse and regularized discriminant analysis classifiers that are especially useful when applied to small-sample, high-dimensional data sets.
The package was archived in 2018 and was re-released in 2021. The package code was forked from John Ramey’s repo and subsequently modified.
You can install the stable version from CRAN:
install.packages('sparsediscrim', dependencies = TRUE)
If you prefer the latest development version from GitHub, type instead:
library(devtools)
install_github('topepo/sparsediscrim')
The formula and non-formula interfaces can be used:
library(sparsediscrim)
data(parabolic, package = "modeldata")

# Formula interface
qda_mod <- qda_shrink_mean(class ~ ., data = parabolic)
# or the non-formula (x/y) interface
qda_mod <- qda_shrink_mean(x = parabolic[, 1:2], y = parabolic$class)

qda_mod
#> Shrinkage-Mean-Based Diagonal QDA
#>
#> Sample Size: 500
#> Number of Features: 2
#>
#> Classes and Prior Probabilities:
#> Class1 (48.8%), Class2 (51.2%)
# Prediction uses the `type` argument:
parabolic_grid <-
  expand.grid(X1 = seq(-5, 5, length = 100),
              X2 = seq(-5, 5, length = 100))

# Predicted probability of Class1 at every grid point
parabolic_grid$qda <- predict(qda_mod, parabolic_grid, type = "prob")$Class1
library(ggplot2)
ggplot(parabolic, aes(x = X1, y = X2)) +
geom_point(aes(col = class), alpha = .5) +
geom_contour(data = parabolic_grid, aes(z = qda), col = "black", breaks = .5) +
theme_bw() +
theme(legend.position = "top") +
coord_equal()
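Besides class probabilities, hard class labels are often what is needed for evaluation. The following is a minimal sketch, assuming that `predict()` also accepts `type = "class"` and returns one predicted label per row; check `?predict.qda_shrink_mean` in your installed version before relying on it. The object name `pred_class` is illustrative only.

```r
library(sparsediscrim)
data(parabolic, package = "modeldata")

# Refit the model from the example above using the x/y interface
qda_mod <- qda_shrink_mean(x = parabolic[, 1:2], y = parabolic$class)

# Assumption: type = "class" returns the predicted class label for each row
pred_class <- predict(qda_mod, parabolic[, 1:2], type = "class")

# Confusion table and resubstitution accuracy on the training data.
# This estimate is optimistic; use a held-out set or resampling in practice.
table(predicted = pred_class, observed = parabolic$class)
mean(pred_class == parabolic$class)
```

Because the accuracy is computed on the same rows used for fitting, it should be read as an upper bound on out-of-sample performance rather than an honest estimate.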
The sparsediscrim package features the following classifier (the R function is included within parentheses):

- High-Dimensional Regularized Discriminant Analysis (rda_high_dim()) from Ramey et al. (2015)

The sparsediscrim package also includes a variety of additional classifiers intended for small-sample, high-dimensional data sets. These include:
| Classifier | Author | R Function |
|---|---|---|
| Diagonal Linear Discriminant Analysis | Dudoit et al. (2002) | lda_diag() |
| Diagonal Quadratic Discriminant Analysis | Dudoit et al. (2002) | qda_diag() |
| Shrinkage-based Diagonal Linear Discriminant Analysis | Pang et al. (2009) | lda_shrink_cov() |
| Shrinkage-based Diagonal Quadratic Discriminant Analysis | Pang et al. (2009) | qda_shrink_cov() |
| Shrinkage-mean-based Diagonal Linear Discriminant Analysis | Tong et al. (2012) | lda_shrink_mean() |
| Shrinkage-mean-based Diagonal Quadratic Discriminant Analysis | Tong et al. (2012) | qda_shrink_mean() |
| Minimum Distance Empirical Bayesian Estimator (MDEB) | Srivastava and Kubokawa (2007) | lda_emp_bayes() |
| Minimum Distance Rule using Modified Empirical Bayes (MDMEB) | Srivastava and Kubokawa (2007) | lda_emp_bayes_eigen() |
| Minimum Distance Rule using Moore-Penrose Inverse (MDMP) | Srivastava and Kubokawa (2007) | lda_eigen() |
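To make the "diagonal" idea behind the first rows of the table concrete, here is a base-R sketch of the textbook diagonal LDA rule: each feature gets a pooled within-class variance, covariances between features are ignored, and an observation is assigned to the class with the smallest variance-scaled squared distance to the class mean, adjusted by the log prior. The helper names `dlda_fit()` and `dlda_predict()` are hypothetical and this is not the package's implementation of lda_diag().

```r
# Fit: class means, pooled per-feature variances, and empirical priors
dlda_fit <- function(x, y) {
  x <- as.matrix(x)
  y <- as.factor(y)
  n <- nrow(x)
  k <- nlevels(y)
  counts <- as.numeric(table(y))
  # Class means: one row per class (in level order), one column per feature
  means <- rowsum(x, y) / counts
  # Pooled within-class variance per feature; off-diagonal terms are ignored
  centered <- x - means[as.integer(y), , drop = FALSE]
  pooled_var <- colSums(centered^2) / (n - k)
  list(means = means, pooled_var = pooled_var,
       priors = counts / n, levels = levels(y))
}

# Predict: pick the class with the smallest discriminant score
dlda_predict <- function(fit, newdata) {
  newdata <- as.matrix(newdata)
  scores <- vapply(seq_along(fit$levels), function(i) {
    d <- sweep(newdata, 2, fit$means[i, ])   # subtract the class mean
    colSums(t(d^2) / fit$pooled_var) - 2 * log(fit$priors[i])
  }, numeric(nrow(newdata)))
  scores <- matrix(scores, nrow = nrow(newdata))  # keep a matrix for one row
  factor(fit$levels[max.col(-scores, ties.method = "first")],
         levels = fit$levels)
}

# Usage on a built-in data set
fit <- dlda_fit(iris[, 1:4], iris$Species)
pred <- dlda_predict(fit, iris[, 1:4])
mean(pred == iris$Species)
```

Because only per-feature variances are estimated, the rule stays well defined even when the number of features exceeds the number of observations, which is the setting these classifiers target.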
We also include modifications to Linear Discriminant Analysis (LDA) with regularized covariance-matrix estimators:

- Moore-Penrose Pseudo-Inverse (lda_pseudo())
- Schafer-Strimmer estimator (lda_schafer())
- Thomaz-Kitani-Gillies estimator (lda_thomaz())
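The need for such estimators can be illustrated with a short base-R sketch: when there are more features than observations, the sample covariance matrix is rank deficient and cannot be inverted, whereas a Moore-Penrose pseudo-inverse always exists. The example uses simulated data and `MASS::ginv()`; it shows the motivation only and makes no claim about how lda_pseudo() is implemented internally.

```r
library(MASS)  # provides ginv(); MASS ships with standard R distributions

set.seed(1)
n <- 10   # observations
p <- 50   # features, so p > n
x <- matrix(rnorm(n * p), nrow = n)

# The p x p sample covariance has rank at most n - 1, so it is singular
S <- cov(x)
qr(S)$rank

# solve(S) would fail here, but the pseudo-inverse is still defined
S_pinv <- ginv(S)

# Defining property of the Moore-Penrose inverse: S %*% S_pinv %*% S equals S
isTRUE(all.equal(S %*% S_pinv %*% S, S))
```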