% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/tar_format.R
\name{tar_format}
\alias{tar_format}
\title{Define a custom target storage format.}
\usage{
tar_format(
  read = NULL,
  write = NULL,
  marshal = NULL,
  unmarshal = NULL,
  convert = NULL,
  copy = NULL,
  repository = NULL
)
}
\arguments{
\item{read}{A function with a single argument named \code{path}.
This function should read and return the target stored
in the file at \code{path}. It should have no side effects.
See the "Format functions" section for specific requirements.
If \code{NULL}, the \code{read} argument defaults to \code{readRDS()}.}

\item{write}{A function with two arguments: \code{object} and \code{path},
in that order. This function should save the R object \code{object}
to the file path at \code{path} and have no other side effects.
The return value does not matter.
See the "Format functions" section for specific requirements.
If \code{NULL}, the \code{write} argument defaults to \code{saveRDS()}
with \code{version = 3}.}

\item{marshal}{A function with a single argument named \code{object}.
This function should marshal the R object and return
an in-memory object that can be exported to remote parallel workers.
It should not read or write any persistent files.
See the Marshalling section for details.
See the "Format functions" section for specific requirements.
If \code{NULL}, the \code{marshal} argument defaults to just
returning the original object without any modifications.}

\item{unmarshal}{A function with a single argument named \code{object}.
This function should unmarshal the (marshalled) R object and return
an in-memory object that is appropriate and valid for use
on a parallel worker. It should not read or write any persistent files.
See the Marshalling section for details.
See the "Format functions" section for specific requirements.
If \code{NULL}, the \code{unmarshal} argument defaults to just
returning the original object without any modifications.}

\item{convert}{The \code{convert} argument is a function
that accepts the object returned by the command of the target
and changes it into an acceptable format (e.g. one that can be
saved with the \code{write} function). The \code{convert}
function ensures the in-memory copy
of an object during the running pipeline session
is the same as the copy of the object that is saved
to disk. The function should be idempotent, and it should
handle edge cases like \code{NULL} values (especially for
\code{error = "null"} in \code{\link[=tar_target]{tar_target()}} or \code{\link[=tar_option_set]{tar_option_set()}}).
If \code{NULL}, the \code{convert} argument defaults to just
returning the original object without any modifications.}

\item{copy}{The \code{copy} argument is a function
that accepts the object returned by the command of the target
and makes a deep copy in memory. This method is relevant
to objects like \code{data.table}s that support in-place modification
which could cause unpredictable side effects from target
to target. In cases like these, the target should be deep-copied
before a downstream target attempts to use it (in the case of
\code{data.table} objects, using \code{data.table::copy()}).
If \code{NULL}, the \code{copy} argument defaults to just
returning the original object without any modifications.}

\item{repository}{Deprecated. Use the \code{repository} argument of
\code{\link[=tar_target]{tar_target()}} or \code{\link[=tar_option_set]{tar_option_set()}} instead.}
}
\value{
A character string of length 1 encoding the custom format.
You can supply this string directly to the \code{format}
argument of \code{\link[=tar_target]{tar_target()}} or \code{\link[=tar_option_set]{tar_option_set()}}.
}
\description{
Define a custom target storage format for the
\code{format} argument of \code{\link[=tar_target]{tar_target()}} or \code{\link[=tar_option_set]{tar_option_set()}}.
}
\details{
It is good practice to write formats that correctly handle
\code{NULL} objects if you are planning to set \code{error = "null"}
in \code{\link[=tar_option_set]{tar_option_set()}}.
}
\section{Marshalling}{

If an object can only be used in the R session
where it was created, it is called "non-exportable".
Examples of non-exportable R objects are Keras models,
Torch objects, \code{xgboost} matrices, \code{xml2} documents,
\code{rstan} model objects, \code{sparklyr} data objects, and
database connection objects. These objects cannot be
exported to parallel workers (e.g. for \code{\link[=tar_make_future]{tar_make_future()}})
without special treatment. To send a non-exportable
object to a parallel worker, the object must be marshalled:
converted into a form that can be exported safely
(similar to serialization but not always the same).
Then, the worker must unmarshal the object: convert it
into a form that is usable and valid in the current R session.
Arguments \code{marshal} and \code{unmarshal} of \code{tar_format()}
let you control how marshalling and unmarshalling happen.
}

\section{Format functions}{

In \code{tar_format()}, functions like \code{read}, \code{write},
\code{marshal}, and \code{unmarshal} must be perfectly pure
and perfectly self-sufficient.
They must load or namespace all their own packages,
and they must not depend on any custom user-defined
functions or objects in the global environment of your pipeline.
\code{targets} converts each function to and from text,
so no function may rely on any data in its closure.
This disqualifies functions produced by \code{Vectorize()},
for example.

The functions to read and write the object
should not do any conversions on the object. That is the job
of the \code{convert} argument. The \code{convert} argument is a function
that accepts the object returned by the command of the target
and changes it into an acceptable format (e.g. one that can be
saved with the \code{write} function). Working with the \code{convert}
function is best because it ensures the in-memory copy
of an object during the running pipeline session
is the same as the copy of the object that is saved
to disk.
}

\examples{
# The following target is equivalent to the superseded
# tar_target(name, command(), format = "keras").
# An improved version of this would supply a `convert` argument
# to handle NULL objects, which are returned by the target if it
# errors and the error argument of tar_target() is "null".
tar_target(
  name = keras_target,
  command = your_function(),
  format = tar_format(
    read = function(path) {
      keras::load_model_hdf5(path)
    },
    write = function(object, path) {
      keras::save_model_hdf5(object = object, filepath = path)
    },
    marshal = function(object) {
      keras::serialize_model(object)
    },
    unmarshal = function(object) {
      keras::unserialize_model(object)
    }
  )
)
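# A minimal sketch, not taken from the documentation above, of a format
# that uses the `convert` and `copy` arguments for data.table objects.
# It keeps the default readRDS()/saveRDS() storage by leaving `read`
# and `write` as NULL. The target name and your_function() are
# placeholders, and the NULL checks are one possible way to stay
# compatible with error = "null".
tar_target(
  name = data_table_target,
  command = your_function(),
  format = tar_format(
    convert = function(object) {
      # Pass NULL through unchanged so a failed target can still be stored.
      if (is.null(object)) {
        return(NULL)
      }
      # Idempotent: converting a data.table again leaves the data as-is.
      data.table::as.data.table(object)
    },
    copy = function(object) {
      # NULL has nothing to copy.
      if (is.null(object)) {
        return(NULL)
      }
      # Deep copy so in-place changes downstream do not leak upstream.
      data.table::copy(object)
    }
  )
)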
# And the following is equivalent to the superseded
# tar_target(name, torch::torch_tensor(seq_len(4)), format = "torch").
# As with the keras example, an improved version would supply
# a `convert` argument to handle cases when `NULL` is returned
# (e.g. if the target errors out and the `error` argument
# is "null" in tar_target() or tar_option_set()).
tar_target(
  name = torch_target,
  command = torch::torch_tensor(seq_len(4)),
  format = tar_format(
    read = function(path) {
      torch::torch_load(path)
    },
    write = function(object, path) {
      torch::torch_save(obj = object, path = path)
    },
    marshal = function(object) {
      con <- rawConnection(raw(), open = "wr")
      on.exit(close(con))
      torch::torch_save(object, con)
      rawConnectionValue(con)
    },
    unmarshal = function(object) {
      con <- rawConnection(object, open = "r")
      on.exit(close(con))
      torch::torch_load(con)
    }
  )
)
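# The "Format functions" section says targets converts each function to
# and from text, so data in a closure is lost. The lines below are a rough
# base R imitation of such a round trip (an assumption for illustration,
# not how targets itself is implemented). All names here are hypothetical.
round_trip <- function(fun) {
  # Deparse the function to text, then parse and evaluate that text again.
  eval(parse(text = deparse(fun)), envir = globalenv())
}
# Self-sufficient: everything the function needs is in its body.
write_good <- function(object, path) {
  saveRDS(object, path, version = 3)
}
# Not self-sufficient: `compression` exists only in the closure.
make_writer <- function(compression) {
  function(object, path) {
    saveRDS(object, path, compress = compression)
  }
}
write_bad <- make_writer("xz")
# The restored copy of write_good() still works.
path <- tempfile()
round_trip(write_good)(list(a = 1), path)
restored <- readRDS(path)
unlink(path)
# The restored copy of write_bad() can no longer find `compression`,
# so the call fails and tryCatch() returns the error condition.
outcome <- tryCatch(
  round_trip(write_bad)(list(a = 1), tempfile()),
  error = function(condition) condition
)
inherits(outcome, "error")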
}
\seealso{
Other targets: 
\code{\link{tar_cue}()},
\code{\link{tar_target_raw}()},
\code{\link{tar_target}()}
}
\concept{targets}
