% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/forward.R
\name{fast_HMM}
\alias{fast_HMM}
\title{Fits an HMM to a genotype dataset by calling fastPHASE}
\usage{
fast_HMM(X, out_path = NULL, X_filename = NULL,
  fp_path = "bin/fastPHASE", n_state = 12, n_iter = 25)
}
\arguments{
\item{X}{genotype matrix, with one row per individual and one column per SNP}

\item{out_path}{prefix for the filenames of the fitted parameters. If
\code{NULL}, the files are saved in a temporary directory.}

\item{X_filename}{filename for the fastPHASE-formatted genotype file. If
\code{NULL}, the file is created in a temporary directory.}

\item{fp_path}{path to the fastPHASE executable}

\item{n_state}{number of latent states of the HMM (the dimensionality of the
latent space)}

\item{n_iter}{number of iterations for the EM algorithm}
}
\value{
Fitted parameters of the fastPHASE HMM, grouped in a list with the
  following fields: \code{pInit}, the initial marginal distribution;
  \code{Q}, a three-dimensional array of transition probabilities; and
  \code{pEmit}, a three-dimensional array of emission probabilities.
}
\description{
Fits the fastPHASE hidden Markov model (HMM) to genotype data using the EM
algorithm. The fastPHASE executable is required to run \code{fast_HMM}; it
can be downloaded from \url{http://scheet.org/software.html}
}
\details{
The cost of the forward algorithm grows quadratically with the number of
latent states \code{n_state}, so we recommend setting this parameter to 12.
Choosing a larger value does not substantially improve performance. For the
EM algorithm, we recommend between 20 and 25 iterations.
}
\examples{
\donttest{
p <- 50
n <- 100
genotypes <- matrix((runif(n * p, min = 0, max = 1) < 0.5) +
            (runif(n * p, min = 0, max = 1) < 0.5),
            nrow = n, dimnames = list(NULL, paste0("SNP_", seq_len(p))))
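
# Quick check of the simulated data (illustrative only): each entry is the
# sum of two independent Bernoulli(0.5) draws, so it can only be 0, 1 or 2.
table(genotypes)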

hmm <- fast_HMM(genotypes, fp_path = "/path/to/fastPHASE",
                n_state = 4, n_iter = 10)
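
# A minimal sketch of inspecting the fitted parameters. The field names are
# those listed in the Value section; the exact array dimensions depend on
# the data and on n_state, so they are printed here rather than assumed.
str(hmm$pInit)   # initial marginal distribution
dim(hmm$Q)       # transition probabilities (three-dimensional array)
dim(hmm$pEmit)   # emission probabilities (three-dimensional array)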
}

}
\references{
Scheet, P., & Stephens, M. (2006). A fast and flexible
statistical model for large-scale population genotype data: applications
to inferring missing genotypes and haplotypic phase. American Journal of
Human Genetics, 78(4), 629–644.
}
