Check for variable agreement within units of assignment — design_data

Useful when debugging to confirm that variables in the Design agree with the corresponding variables in data.

Usage

design_data_concordance(design, data, by = NULL, warn_on_nonexistence = TRUE)

Arguments

design: a Design object
data: a new data set, presumably not the one used to create design.
by: optional; named vector or list connecting names of variables in design to variables in data. Names represent variables in design; values represent variables in data. Only needed if variable names differ.
warn_on_nonexistence: default TRUE. If TRUE, a warning is raised when a variable in design does not exist in data. If FALSE, such variables are skipped silently.
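
The direction of the by mapping is easy to get backwards, so here is a minimal base R sketch of how such a named vector can be read. It does not call the package and is not its internal implementation; the variable names "uoa", "b1" and "block_id" are hypothetical.

```r
# Hypothetical names: the design knows the block as "b1",
# while the new data calls the same variable "block_id".
by <- c(b1 = "block_id")

# Variables as they are named in the design (assumed for illustration).
design_vars <- c("uoa", "b1")

# Look up each design variable in `by`; fall back to the same name
# when no mapping was supplied for it.
data_vars <- ifelse(design_vars %in% names(by), by[design_vars], design_vars)
names(data_vars) <- design_vars

data_vars
# "uoa" maps to "uoa", "b1" maps to "block_id"
```

Names on the left refer to the design and values on the right refer to data, so a call along the lines of design_data_concordance(design, data, by = c(b1 = "block_id")) would compare "b1" in the design against "block_id" in data.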

Value

Invisibly returns TRUE if no warnings are produced and FALSE otherwise.

Details

Consider the following scenario: A Design is generated from some dataset, "data1", which includes a block variable "b1". Within each unique unit of assignment/unitid/cluster of "data1", it must be the case that "b1" is constant. (Otherwise the creation of the Design will fail.)

Next, a model is fit on a different dataset, "data2", using weights generated from the Design. The block variable "b1" also exists in "data2" but, due to some issue with data cleaning, does not agree with "b1" in "data1".

This could either cause errors directly (via actual error messages) or silently produce nonsense results. design_data_concordance() is designed to help debug these scenarios by reporting whether variables in the data used to create design ("data1" in the above example) are inconsistent with the same variables in the new dataset, data ("data2" in the above example).
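
The scenario above can be reproduced with a small base R sketch. It only illustrates the kind of disagreement being checked for and is not the package's implementation; the helper discordant_units() and all data values are hypothetical.

```r
# Data used to build the design: one row per unit of assignment,
# with its block membership (hypothetical values).
data1 <- data.frame(uoa = c(1, 2, 3, 4),
                    b1  = c("a", "a", "b", "b"),
                    stringsAsFactors = FALSE)

# New data at a finer level; one row of unit 3 was miscoded during cleaning.
data2 <- data.frame(uoa = c(1, 1, 2, 3, 3, 4),
                    b1  = c("a", "a", "a", "a", "b", "b"),
                    stringsAsFactors = FALSE)

# Hypothetical helper, not part of the package: returns the units whose
# values of `var` in `new` disagree with the value recorded in `ref`.
discordant_units <- function(ref, new, unit, var) {
  # Join on the unit identifier; the shared column `var` gets suffixes.
  merged <- merge(new, ref, by = unit, suffixes = c(".new", ".ref"))
  bad <- merged[[paste0(var, ".new")]] != merged[[paste0(var, ".ref")]]
  sort(unique(merged[[unit]][bad]))
}

discordant_units(data1, data2, unit = "uoa", var = "b1")
# returns 3
```

Here unit 3 is flagged because one of its rows in "data2" carries block "a" while "data1" recorded block "b".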