% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/relabel.R
\name{relabel}
\alias{relabel}
\title{Relabel Factor Levels by Updating the Metadata}
\usage{
relabel(file, root = ".", change)
}
\arguments{
\item{file}{The name of the git2rdata object. Git2rdata objects cannot
have dots in their name. The name may include a relative path. \code{file} is a
path relative to the \code{root} and must point to a location within \code{root}.}

\item{root}{The root of a project. Can be a file path or a \code{git-repository}.
Defaults to the current working directory (\code{"."}).}

\item{change}{Either a \code{list} or a \code{data.frame}. In case of a \code{list}, it is a
named \code{list} with named \code{vectors}. The names of the list elements must match the
names of the variables. The names of the vector elements must match the
existing factor labels. The values represent the new factor labels. In case
of a \code{data.frame}, it needs to have the variables \code{factor} (name of the
factor), \code{old} (the old factor label) and \code{new} (the new factor label).
\code{relabel()} ignores all other columns.}
}
\value{
Invisible \code{NULL}.
}
\description{
Imagine the situation where we have a dataframe with a factor variable and we
have stored it with \code{write_vc(optimize = TRUE)}. The raw data file contains
the factor indices and the metadata contains the link between the factor
index and the corresponding label. See
\code{vignette("version_control", package = "git2rdata")}. In such a case,
relabelling a factor is fast and lightweight because it only updates the metadata.
}
\examples{

# initialise a git repo using git2r
repo_path <- tempfile("git2rdata-repo-")
dir.create(repo_path)
repo <- git2r::init(repo_path)
git2r::config(repo, user.name = "Alice", user.email = "alice@example.org")

# Create a dataframe and store it as an optimized git2rdata object.
# Note that write_vc() uses optimization by default.
# Stage and commit the git2rdata object.
ds <- data.frame(
  a = c("a1", "a2"),
  b = c("b2", "b1"),
  stringsAsFactors = TRUE
)
junk <- write_vc(ds, "relabel", repo, sorting = "b", stage = TRUE)
cm <- commit(repo, "initial commit")
# check that the workspace is clean
status(repo)

# Define new labels as a list and apply them to the git2rdata object.
new_labels <- list(
  a = list(a2 = "a3")
)
relabel("relabel", repo, new_labels)
# check the changes
read_vc("relabel", repo)
# relabel() changed the metadata, not the raw data
status(repo)
git2r::add(repo, "relabel.*")
cm <- commit(repo, "relabel using a list")

# Define new labels as a dataframe and apply them to the git2rdata object
change <- data.frame(
  factor = c("a", "a", "b"),
  old = c("a3", "a1", "b2"),
  new = c("c2", "c1", "b3"),
  stringsAsFactors = TRUE
)
relabel("relabel", repo, change)
# check the changes
read_vc("relabel", repo)
# relabel() changed the metadata, not the raw data
status(repo)
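
# A minimal base R sketch of why updating only the metadata is sufficient.
# This illustrates the principle only; it is an assumption about the idea,
# not the code that git2rdata uses internally. The names `f`, `idx`,
# `labels` and `relabelled` are hypothetical.
f <- factor(c("a1", "a2", "a1"))
idx <- as.integer(f)   # the integer indices, comparable to the raw data
labels <- levels(f)    # the index-to-label link, comparable to the metadata
# relabelling only touches the labels; the indices stay unchanged
labels[labels == "a2"] <- "a3"
relabelled <- factor(idx, levels = seq_along(labels), labels = labels)
relabelled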

# clean up
junk <- file.remove(
  rev(list.files(repo_path, full.names = TRUE, recursive = TRUE,
                 include.dirs = TRUE, all.files = TRUE)),
  repo_path)
}
\seealso{
Other storage: \code{\link{list_data}},
  \code{\link{prune_meta}}, \code{\link{read_vc}},
  \code{\link{rm_data}}, \code{\link{write_vc}}
}
\concept{storage}
