% Generated by roxygen2 (4.1.1): do not edit by hand
% Please edit documentation in R/Information.R
\docType{package}
\name{Information}
\alias{Information}
\alias{Information-package}
\title{Data exploration with information theory (weight-of-evidence and information value)}
\description{
The Information package performs exploratory data analysis and variable screening for binary
classification models using information theory (WOE and IV).

The package also supports exploratory analysis and variable screening for uplift models, using net weight of evidence (NWOE) and net information value (NIV).

The only two functions you need are create_infotables() and plot_infotables():

  - create_infotables() creates WOE or NWOE tables and outputs a variable-strength summary data.frame (IV or NIV)

  - plot_infotables() creates WOE or NWOE bar charts for one or more variables
}
\details{
Given a data.frame with a set of predictive variables and a binary response variable,
create_infotables() will cycle through all variables and create
NWOE or WOE tables. It will also rank all variables by their respective IV or NIV values.
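
For reference, the standard textbook definitions can be sketched as follows. The notation is illustrative and not taken from the package source: \eqn{B_i} denotes the i-th bin of a predictor \eqn{X}, and \eqn{Y} is the binary response. See the vignette for the package's exact conventions.

\deqn{WOE_i = \log \frac{P(X \in B_i \mid Y = 1)}{P(X \in B_i \mid Y = 0)}}{WOE_i = log( P(X in B_i | Y = 1) / P(X in B_i | Y = 0) )}

\deqn{IV = \sum_i \left( P(X \in B_i \mid Y = 1) - P(X \in B_i \mid Y = 0) \right) WOE_i}{IV = sum_i ( P(X in B_i | Y = 1) - P(X in B_i | Y = 0) ) * WOE_i}

For uplift models, the NWOE of a bin is the difference between the WOE computed in the treatment group and the WOE computed in the control group.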

If requested, calculations can be distributed across multiple cores.

NWOE analysis applies only to uplift models. It therefore requires a treatment group and a control group, identified by a binary treatment indicator.
For regular WOE analysis, all you need is a binary response variable (dependent variable).

You can cross-validate your IV or NIV values by supplying a validation dataset, which produces penalized IV/NIV values.

To learn more about the Information package, start with the vignette:
\code{browseVignettes(package = "Information")}
}
\examples{
##------------------------------------------------------------
## WOE analysis, no validation
##------------------------------------------------------------
library(Information)

data(train, package="Information")
train <- subset(train, TREATMENT==1)
IV <- Information::create_infotables(data=train, y="PURCHASE", parallel=FALSE)

print(head(IV$Summary), row.names=FALSE)
print(IV$Tables$N_OPEN_REV_ACTS, row.names=FALSE)

# Plotting a single variable
Information::plot_infotables(IV, "N_OPEN_REV_ACTS")

# Plotting multiple variables
Information::plot_infotables(IV, IV$Summary$Variable[1:4], same_scale=TRUE)

# If the goal is to plot multiple variables individually, as opposed to a comparison grid, we can
# loop through the variable names and create individual plots
# var_names <- names(IV$Tables)
# plots <- list()
# for (i in seq_along(var_names)){
#    plots[[i]] <- plot_infotables(IV, var_names[i])
# }
# Showing the top 18 variables
# plots[1:18]
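
##------------------------------------------------------------
## NWOE analysis with validation (illustrative sketch)
##------------------------------------------------------------
# Assumptions, not confirmed by the text above: the validation data is passed
# through an argument named valid, the treatment indicator through an argument
# named trt, and the package ships a validation dataset called valid.
# Check ?create_infotables before relying on these names.
data(train, package="Information")   # reload: train was subset to TREATMENT==1 above
data(valid, package="Information")
NIV <- Information::create_infotables(data=train, valid=valid, y="PURCHASE",
                                      trt="TREATMENT", parallel=FALSE)

# Variable-strength summary for the uplift analysis
print(head(NIV$Summary), row.names=FALSE)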

closeAllConnections()
}
\author{
Kim Larsen (kblarsen4 at gmail.com)
}
\keyword{IV}
\keyword{NIV}
\keyword{NWOE}
\keyword{WOE}
\keyword{classification}
\keyword{uplift}
\keyword{weight-of-evidence}

