Commit

going to submit to CRAN
tonyfischetti committed Jun 26, 2015
1 parent fc480bb commit 6465ab7
Showing 24 changed files with 573 additions and 104 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,7 +1,7 @@
Package: assertr
Type: Package
Title: Assertive Programming for R Analysis Pipelines
Version: 0.9.7
Version: 1.0.0
Authors@R: person("Tony", "Fischetti", email="[email protected]",
role = c("aut", "cre"))
Maintainer: Tony Fischetti <[email protected]>
2 changes: 1 addition & 1 deletion NAMESPACE
@@ -1,4 +1,4 @@
# Generated by roxygen2 (4.1.0): do not edit by hand
# Generated by roxygen2 (4.1.1): do not edit by hand

export(assert)
export(assert_)
8 changes: 8 additions & 0 deletions NEWS
@@ -1,3 +1,11 @@
# assertr 1.0.0

* added row reduction functions like mahalanobis distance

* added assert_rows and insist_rows assert verbs

* bug fixes

# assertr 0.5.7

* added within_n_mads predicate generator
3 changes: 2 additions & 1 deletion R/assertr.R
@@ -16,7 +16,7 @@
#' \item \code{\link{insist_rows}}
#' \item \code{\link{not_na}}
#' \item \code{\link{in_set}}
#' \item \code{\link{num_rows_NAs}}
#' \item \code{\link{num_row_NAs}}
#' \item \code{\link{maha_dist}}
#' \item \code{\link{within_bounds}}
#' \item \code{\link{within_n_sds}}
@@ -26,6 +26,7 @@
#'
#' @examples
#' library(magrittr) # for the piping operator
#' library(dplyr)
#'
#' # this confirms that
#' # - that the dataset contains more than 10 observations
8 changes: 4 additions & 4 deletions R/row-redux.R
@@ -6,12 +6,12 @@
#' Computes mahalanobis distance for each row of data frame
#'
#' This function will return a vector, with the same length as the number
#' of rows of the provided data frame, corresponding to the mahalanobis
#' distances of each row.
#' of rows of the provided data frame, corresponding to the average
#' mahalanobis distances of each row from the whole data set.
#'
#' This is useful for finding anomalous row-wise observations.
#' This is useful for finding anomalous observations, row-wise.
#'
#' It will convert strings into numerics.
#' It will convert any categorical variables in the data frame into numerics.
#'
#' @param data A data frame
#' @param keep.NA Ensure that every row with missing data remains NA in
11 changes: 6 additions & 5 deletions README.md
@@ -12,7 +12,7 @@ The assertr package supplies a suite of functions designed to verify
assumptions about data early in an analysis pipeline so that
data errors are spotted early and can be addressed quickly.

This package in no way needs to be used with the magrittr/dplyr piping
This package does not need to be used with the magrittr/dplyr piping
mechanism but the examples in this README use them for clarity.

### Installation
Expand All @@ -31,7 +31,7 @@ This package offers five assertion functions, `assert`, `verify`,
`insist`, `assert_rows`, and `insist_rows`, that are designed to be used
shortly after data-loading in an analysis pipeline...

Let’s say, for example, that the R’s built-in car dataset, mtcars, was not
Let’s say, for example, that the R’s built-in car dataset, `mtcars`, was not
built-in but rather procured from an external source that was known for making
errors in data entry or coding. Pretend we wanted to find the average
miles per gallon for each number of engine cylinders. We might want to first,
@@ -44,7 +44,7 @@ that is outside 4 standard deviations from its mean, and
respectively) contain 0s and 1s only
- each row contains at most 2 NAs
- each row's mahalanobis distance is within 10 median absolute deviations of
all the distance (for outlier detection)
all the distances (for outlier detection)


This could be written (in order) using `assertr` like this:
@@ -102,7 +102,8 @@ missing values in each row. Internally, the `assert_rows` function uses
`dplyr`'s `select` function to extract the columns to test the predicate
function on.

- `insist_rows` - takes a data frame, a row reduction function, a predicate
- `insist_rows` - takes a data frame, a row reduction function, a
predicate-generating
function, and an arbitrary number of columns to apply the predicate function
to. The row reduction function is applied to the data frame, and returns a value
for each row. The predicate-generating function is then applied to the vector
@@ -136,7 +137,7 @@ and `insist_rows`:

- `num_row_NAs` - counts number of missing values in each row
- `maha_dist` - computes the mahalanobis distance of each row (for outlier
detection)
detection). It will coerce categorical variables into numerics if it needs to.

Finally, each assertion function has a counterpart that uses standard
evaluation. The counterpart functions are postfixed by "_" (an underscore).
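
The row-reduction idea described in this README hunk (reduce each row to a single value, then test that value with a predicate) can be sketched in base R. This is only an illustration of the concept, not assertr's implementation: the `toy` data frame and the one-NA threshold are made up for the example, and `rowSums(is.na(...))` stands in for what `num_row_NAs` is described as doing.

```r
# Toy data frame (hypothetical, for illustration only)
toy <- data.frame(a = c(1, NA, 3),
                  b = c(NA, NA, 6),
                  c = c(7, 8, 9))

# Row reduction: one value per row (here, the count of missing values)
na_counts <- rowSums(is.na(toy))

# Predicate applied to the reduced vector: allow at most 1 NA per row
# (the threshold is an assumption chosen for this example)
ok <- na_counts <= 1

# Rows that violate the assumption; row 2 has two NAs
violations <- which(!ok)
print(unname(violations))
```

An assertion verb would presumably stop the pipeline when `violations` is non-empty and pass the data through unchanged otherwise.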
23 changes: 12 additions & 11 deletions cran-comments.md
@@ -1,18 +1,19 @@
## Test environments
* local OS X Yosemite 10.10.2 install, R 3.1.3
* ubuntu (on travis-ci), R 3.1.2
* local OS X Yosemite 10.10.2 install, R 3.2.1
* ubuntu (on travis-ci), R 3.2.1
* win-builder (devel and release)

## R CMD check results

There were no ERRORs, WARNINGs or NOTEs
when checked locally with --no-manual
There were no ERRORs or WARNINGs, but 1 NOTE,
when checked locally with --as-cran and --no-manual

I got an email from [email protected] saying that assertr v0.4
(which was just accepted into CRAN a few days ago) failed with the
oldrelease (3.0.3). I was told to either fix or declare a proper version
dependency.
The NOTE said:
* checking CRAN incoming feasibility ... NOTE
Maintainer: 'Tony Fischetti <[email protected]>'

I fixed it, slightly incremented the version number and I am submitting it
here. This is the proper thing to do, right? Please excuse my ignorance, as
this is my first package.
License components with restrictions and base license permitting such:
MIT + file LICENSE
File 'LICENSE':
YEAR: 2015
COPYRIGHT HOLDER: Tony Fischetti
7 changes: 7 additions & 0 deletions inst/doc/assertr.R
@@ -20,3 +20,10 @@ not.empty.p <- function(x) if(x=="") return(FALSE)
## ------------------------------------------------------------------------
seven.digit.p <- function(x) nchar(x)==7

## ----perl=FALSE----------------------------------------------------------
example.data <- data.frame(x=c(8, 9, 6, 5, 9, 5, 6, 7,
8, 9, 6, 5, 5, 6, 7),
y=c(82, 91, 61, 49, 40, 49, 57,
74, 78, 90, 61, 49, 51, 62, 68))
(example.data)
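
As a rough sketch of why this data set suits row-wise outlier detection, the squared Mahalanobis distance of each row can be computed with base R's `stats::mahalanobis`. This is an assumption about how the data might be used, not code from the commit, and it is not necessarily how `maha_dist` is implemented internally.

```r
# Same data as in the vignette chunk above
example.data <- data.frame(x = c(8, 9, 6, 5, 9, 5, 6, 7,
                                 8, 9, 6, 5, 5, 6, 7),
                           y = c(82, 91, 61, 49, 40, 49, 57,
                                 74, 78, 90, 61, 49, 51, 62, 68))

# Base R's mahalanobis() returns *squared* distances of each row
# from the column means, scaled by the sample covariance matrix
m  <- as.matrix(example.data)
d2 <- mahalanobis(m, center = colMeans(m), cov = cov(m))

# The row with the largest distance is the most anomalous one;
# here it is row 5 (x = 9, y = 40), which breaks the x/y trend
most.anomalous <- which.max(d2)
print(unname(most.anomalous))
```

Neither `x = 9` nor `y = 40` is unusual for its own column, so a column-wise check would miss this row; only the combination is unusual.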
