
CHANGES IN VERSION 0.5.0

BUG FIXES

  - Severe bug: `greedy` and `select_greedy` can give incorrect results.  There
    was a bug in `greedy` (also affecting `select_greedy`) where keeping track
    of the number of times records from the second data set are linked had an
    error. This has been fixed.

CHANGES IN VERSION 0.4.0

NEW FEATURES
 
  - Additional arguments to `compare_vars` are passed on to the comparison
    function.

  - Added `inplace` argument to `predict`.

  - The `greedy` function is now exported.

  - The `identical` function has been removed. It was already deprecated. The
    identical `cmp_identical` can be used instead.

BUG FIXES

  - `identical` called itself resulting in a stack overflow. This has been
    fixed.

  - In extreme cases where the one of the m-probabilities converged to zero
    `problink_em` gave an error. The algorithm will now converge to a small
    value and give a warning (as this usually means that the algorithm didn't
    converge to a valid solution.

CHANGES IN VERSION 0.3.2


NEW FEATURES

  - The `select_unique` function has been added. This function filters pairs and
    removes pairs that have been matched more than once. For pairs that are
    matched more than once (with a high enough link quality) it is not possible
    to decide which link is the correct one. In some linkage scenarios (with a
    focus on reducing false matches) it is then preferable to remove both. 

  - The `score_simple` function has been added which calculates a weighted sum
    of the comparison vectors. Different weights are allowed for agreement, 
    non-agreement and missing values for each of the variables in the comparison
    vector.

  - The `merge_pairs` function can combine different sets of pairs for the same
    two data sets. For example, combine two sets of pairs generated with
    different blocking variables.

  - The functions `identical`, `jaro_winkler`, `lcs` and `jaccard` are
    deprecated and will be removed in future versions of the package. Instead
    use the functions `cmp_identical`, `cmp_jarowinkler`, `cmp_lcs` and
    `cmp_jaccard`. The function `identical` caused a conflict with a function in
    base R. Also this, hopefully, stresses that these functions return a
    function.

  - `select_greedy` has `n` and `m` arguments that allow the user to specify
    what type of linkage is allows (1-to-1, n-to-1, 1-to-m, n-to-m).

  - `select_greedy` has an `include_ties` option that will include multiple pairs
    for a record when these pairs have an equal weight/score.

  - `pair_minsim` and `cluster_pair_minsim` now have an `on_blocking` option
    that functions similar to the `on` argument of `pair_blocking`: pairs are
    only generated when the agree exactly on the blocking keys. 

  - When selecting pairs using a threshold. Pairs are now selected when their
    score is above or *equal to* the threshold. This has an effect for all of 
    the `select_` functions.

BUG FIXES

  - `select_greedy` gave an error when the set of pairs is empty. 

  - The `by_x` and `by_y` arguments were not handled correctly. Fixed.

  - The result from `cluster_collect` can now be used without introducing 
    a warning from `data.table`.

CHANGES IN VERSION 0.2.0

NEW FEATURES:

  - `merge_pairs` combines two sets of pairs into one (like `rbind`). Especially 
    useful in combination with `pair_blocking` to generate pairs that are blocked
    on multiple keys (e.g. they have match on either `postcode` or `lastname`).

BUG FIXES:

  - `pairs_minsim` gave an error when the number of records squared is larger 
    the maximum integer value.

  - `pairs_minsim` did not generate pairs when one of the keys contained 
    missing values.

