Check for variable agreement within units of assignment
Source: R/Design.R
design_data_concordance.Rd
Useful for debugging purposes to ensure that there is concordance between variables in the Design and data.
Arguments
- design
  a Design object
- data
  a new data set, presumably not the same used to create design
- by
  optional; named vector or list connecting names of variables in design to variables in data. Names represent variables in design; values represent variables in data. Only needed if variable names differ.
- warn_on_nonexistence
  default TRUE. If a variable does not exist in data, should this be flagged? If FALSE, silently move on if a variable doesn't exist in data.
Details
Consider the following scenario: A Design is generated from some dataset, "data1", which includes a block variable "b1". Within each unique unit of assignment/unitid/cluster of "data1", it must be the case that "b1" is constant. (Otherwise the creation of the Design will fail.)
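The constancy requirement can be illustrated with a small base R sketch. The data frame, the column name uoa, and the helper is_constant_within() are hypothetical and written only for this illustration; they are not part of the package and do not show how the Design constructor performs its own check.

```r
# Toy stand-in for "data1": two rows per unit of assignment.
# The column names `uoa` and `b1` are assumptions for this example.
data1 <- data.frame(
  uoa = c(1, 1, 2, 2, 3, 3),
  b1  = c("A", "A", "A", "A", "B", "B")
)

# Hypothetical helper (not a package function): TRUE if `var` takes a
# single value within every level of `unit`.
is_constant_within <- function(data, var, unit) {
  n_values <- tapply(data[[var]], data[[unit]],
                     function(x) length(unique(x)))
  all(n_values == 1)
}

# "b1" is constant within each unit here.
is_constant_within(data1, "b1", "uoa")

# Change one row so that unit 2 carries two different block values.
bad <- data1
bad$b1[4] <- "B"
is_constant_within(bad, "b1", "uoa")
```

The first call returns TRUE and the second FALSE; the second case is the kind of data for which, per the text above, creating the Design would be expected to fail.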
Next, a model is fit which includes weights generated from the Design, but on dataset "data2". In "data2", the block variable "b1" also exists, but due to some issue with data cleaning, does not agree with "b1" in "data1".
This could cause problems, either directly (via actual error messages) or indirectly (by silently producing nonsense results). design_data_concordance() is designed to help debug these scenarios by reporting whether variables present in both the data used to create design ("data1" in the above example) and some new dataset, data ("data2" in the above example), have any inconsistencies.
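To make the "data1" versus "data2" scenario concrete, the following base R sketch flags units whose block value in a new data set disagrees with the value recorded at the unit level. It is only an illustration of the idea under stated assumptions: design_units, the column names, and the helper find_discordant_units() are hypothetical, and nothing here reflects the actual implementation or output of design_data_concordance(). The by vector mimics the documented convention that names refer to design variables and values to variables in data.

```r
# Unit-level values as they might have been recorded from "data1"
# (hypothetical stand-in for the information stored in a Design).
design_units <- data.frame(
  uoa = c(1, 2, 3),
  b1  = c("A", "A", "B"),
  stringsAsFactors = FALSE
)

# Toy "data2": the unit identifier has a different name, and "b1" for
# unit 3 was altered during data cleaning.
data2 <- data.frame(
  cluster = c(1, 1, 2, 2, 3, 3),
  b1      = c("A", "A", "A", "A", "A", "A"),
  stringsAsFactors = FALSE
)

# Names are design variables; values are variables in the new data.
by <- c(uoa = "cluster")

# Hypothetical helper (not a package function): returns the units in
# `data` whose value of `var` differs from the unit-level value.
find_discordant_units <- function(design_units, data, var, by) {
  unit_design <- names(by)[1]
  unit_data   <- unname(by[1])
  # Locate each row's unit in the unit-level table.
  idx      <- match(data[[unit_data]], design_units[[unit_design]])
  expected <- design_units[[var]][idx]
  # Rows whose unit is absent from the unit-level table are skipped.
  mismatch <- !is.na(idx) & data[[var]] != expected
  unique(data[[unit_data]][mismatch])
}

find_discordant_units(design_units, data2, "b1", by)
```

In this toy example only unit 3 is returned, since its "b1" value was "B" at the unit level but is "A" in the new data.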