iNZight Plot Summary and Inference — getPlotSummary • iNZightPlots

Generate summary or inference information for an iNZight plot

getPlotSummary(
  x,
  y = NULL,
  g1 = NULL,
  g1.level = NULL,
  g2 = NULL,
  g2.level = NULL,
  varnames = list(),
  colby = NULL,
  sizeby = NULL,
  data = NULL,
  design = NULL,
  freq = NULL,
  missing.info = TRUE,
  inzpars = inzpar(),
  summary.type = "summary",
  table.direction = c("horizontal", "vertical"),
  hypothesis.value = 0,
  hypothesis.alt = c("two.sided", "less", "greater"),
  hypothesis.var.equal = FALSE,
  hypothesis.use.exact = FALSE,
  hypothesis.test = c("default", "t.test", "anova", "chi2", "proportion"),
  hypothesis.simulated.p.value = FALSE,
  hypothesis = list(value = hypothesis.value, alternative = match.arg(hypothesis.alt),
    var.equal = hypothesis.var.equal, use.exact = hypothesis.use.exact, test =
    match.arg(hypothesis.test), simulated.p.value = hypothesis.simulated.p.value),
  survey.options = list(),
  width = 100,
  epi.out = FALSE,
  privacy_controls = NULL,
  html = FALSE,
  ...,
  env = parent.frame()
)

Arguments

x: a vector (numeric or factor), or the name of a column in the supplied data or design object
y: a vector (numeric or factor), or the name of a column in the supplied data or design object
g1: a vector (numeric or factor), or the name of a column in the supplied data or design object. This variable acts as a subsetting variable.
g1.level: the name (or numeric position) of the level of g1 that will be used instead of the entire data set
g2: a vector (numeric or factor), or the name of a column in the supplied data or design object. This variable acts as a subsetting variable, similar to g1
g2.level: same as g1.level, but also accepts the value "_MULTI", which produces a matrix of g1 by g2
varnames: a list of variable names, with the list named using the appropriate arguments (i.e., list(x = "height", g1 = "gender"))
colby: the name of a variable (numeric or factor) to colour points by. In the case of a numeric variable, a continuous colour scale is used, otherwise each level of the factor is assigned a colour
sizeby: the name of a (numeric) variable, which controls the size of points
data: the name of a data set
design: the name of a survey object, obtained from the survey package
freq: the name of a frequency variable if the data are frequencies
missing.info: logical, if TRUE, information regarding missingness is displayed in the plot
inzpars: allows specification of iNZight plotting parameters over multiple plots
summary.type: one of "summary" or "inference"
table.direction: one of 'horizontal' (default) or 'vertical' (useful for many categories)
hypothesis.value: H0 value for hypothesis test
hypothesis.alt: alternative hypothesis (!=, <, >)
hypothesis.var.equal: use equal variance assumption for t-test?
hypothesis.use.exact: logical, if TRUE the exact p-value will be calculated (if applicable)
hypothesis.test: the test to perform; in some cases (currently just two-sample comparisons) more than one test is available (t-test or ANOVA)
hypothesis.simulated.p.value: also calculate (where available) the simulated p-value
hypothesis: either NULL for no test, or missing (in which case above arguments are used)
survey.options: additional options passed to survey methods
width: width for the output, default is 100 characters
epi.out: logical, if TRUE, then odds/rate ratios and rate differences are printed when appropriate (y with 2 levels)
privacy_controls: optional, pass in confidentialisation and privacy controls (e.g., random rounding, suppression) for microdata
html: logical, if TRUE, output will be returned as an HTML page (if supported)
...: additional arguments, see inzpar
env: compatibility argument
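
As a rough illustration of how some of these arguments combine, the sketch below uses only arguments listed above, plus inference.type, which appears in the examples on this page. It assumes g1 can be given as an unquoted column name in the same way as x; the object names are purely illustrative.

```r
library(iNZightPlots)

# Restrict the summary to a single level of a subsetting variable
setosa_summary <- getPlotSummary(Sepal.Length,
    g1 = Species, g1.level = "setosa",
    data = iris)
setosa_summary

# One-sided, one-sample test of H0: mean = 5 against H1: mean > 5
one_sided <- getPlotSummary(Sepal.Length, data = iris,
    summary.type = "inference", inference.type = "conf",
    hypothesis.value = 5, hypothesis.alt = "greater")
one_sided

# Passing hypothesis = NULL requests inference output without a test
no_test <- getPlotSummary(Sepal.Length, data = iris,
    summary.type = "inference", inference.type = "conf",
    hypothesis = NULL)
no_test
```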

Value

an inzight.plotsummary object with a print method
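
Because the result is an ordinary R object with a print method, it can be stored and printed later, or its printed text can be captured. A minimal sketch follows; the variable names are illustrative, and capture.output() is base R rather than part of iNZightPlots.

```r
library(iNZightPlots)

# Store the summary instead of printing it straight away
smry <- getPlotSummary(Species, data = iris)

# Check the class described above
inherits(smry, "inzight.plotsummary")

# Printing dispatches to the object's print method
print(smry)

# Capture the printed lines as a character vector, e.g. to write to a file
smry_lines <- capture.output(print(smry))
head(smry_lines)
```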

Details

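Works much the same as iNZightPlot, taking the same variable arguments, but returns summary or inference information as text rather than drawing a plot.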

Author

Tom Elliott

Examples

getPlotSummary(Species, data = iris)
#> ====================================================================================================
#>                                           iNZight Summary
#> ----------------------------------------------------------------------------------------------------
#>    Primary variable of interest: Species (categorical)
#>                                  
#>    Total number of observations: 150
#> ====================================================================================================
#> 
#> Summary of the distribution of Species:
#> ---------------------------------------
#> 
#>              setosa   versicolor   virginica   Total
#>      Count       50           50          50     150
#>    Percent   33.33%       33.33%      33.33%    100%
#> 
#> 
#> ====================================================================================================
#> 
#> 
getPlotSummary(Species, data = iris,
    summary.type = "inference", inference.type = "conf")
#> ====================================================================================================
#>                                iNZight Inference using Normal Theory
#> ----------------------------------------------------------------------------------------------------
#>    Primary variable of interest: Species (categorical)
#>                                  
#>    Total number of observations: 150
#> ====================================================================================================
#> 
#> Inference of the distribution of Species:
#> -----------------------------------------
#> 
#> Estimated Proportions with 95% Confidence Interval
#> 
#>                 Estimate   Lower   Upper
#>        setosa      0.333   0.258   0.409
#>    versicolor      0.333   0.258   0.409
#>     virginica      0.333   0.258   0.409
#> 
#> Chi-square test for equal proportions
#> 
#>    X^2 = 0, df = 2, p-value = 1
#> 
#>           Null Hypothesis: true proportions in each category are equal
#>    Alternative Hypothesis: true proportions in each category are not equal
#> 
#> 
#> ### Difference in proportions of Species
#>     with 95% Confidence Intervals (adjusted for multiple comparisons)
#> 
#>                            Estimate     Lower    Upper
#>  -----------------------------------------------------
#>      setosa - versicolor          0   -0.1596   0.1596
#>      setosa - virginica           0   -0.1596   0.1596
#> 
#>  versicolor - virginica           0   -0.1596   0.1596
#> 
#> 
#> 
#> ====================================================================================================
#> 
#> 
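
The estimates and the chi-square test above can be cross-checked with base R. This sketch assumes the intervals are standard normal-theory (Wald) intervals, which is consistent with the printed numbers but is not confirmed here as the package's internal method.

```r
# Observed counts and sample proportions for each species
counts <- table(iris$Species)
n <- sum(counts)
p_hat <- as.numeric(counts) / n

# Normal-theory (Wald) 95% interval: p +/- z * sqrt(p * (1 - p) / n)
se <- sqrt(p_hat * (1 - p_hat) / n)
z <- qnorm(0.975)
ci <- cbind(Estimate = p_hat, Lower = p_hat - z * se, Upper = p_hat + z * se)
rownames(ci) <- names(counts)
round(ci, 3)

# Chi-square test for equal proportions across the three categories
chi <- chisq.test(counts)
chi
```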

# perform hypothesis testing
getPlotSummary(Sepal.Length, data = iris,
    summary.type = "inference", inference.type = "conf",
    hypothesis.value = 5)
#> ====================================================================================================
#>                                iNZight Inference using Normal Theory
#> ----------------------------------------------------------------------------------------------------
#>    Primary variable of interest: Sepal.Length (numeric)
#>                                  
#>    Total number of observations: 150
#> ====================================================================================================
#> 
#> Inference of Sepal.Length:
#> --------------------------
#> 
#> Mean with 95% Confidence Interval
#> 
#>    Estimate   Lower   Upper
#>       5.843    5.71   5.977
#> 
#> One Sample t-test
#> 
#>    t = 12.473, df = 149, p-value < 2.22e-16
#> 
#>           Null Hypothesis: true mean is equal to 5
#>    Alternative Hypothesis: true mean is not equal to 5
#> 
#> 
#> ====================================================================================================
#> 
#> 
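
The one-sample result above agrees with base R's t.test(), which offers a quick way to sanity-check the output. The variable name tt is illustrative.

```r
# One-sample t-test of H0: mean = 5 (two-sided by default)
tt <- t.test(iris$Sepal.Length, mu = 5)

round(unname(tt$statistic), 3)  # t statistic
unname(tt$parameter)            # degrees of freedom
round(tt$conf.int, 3)           # 95% confidence interval for the mean
```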

# if you prefer a formula interface
inzsummary(Sepal.Length ~ Species, data = iris)
#> ====================================================================================================
#>                                           iNZight Summary
#> ----------------------------------------------------------------------------------------------------
#>    Primary variable of interest: Sepal.Length (numeric)
#>              Secondary variable: Species (categorical)
#>                                  
#>    Total number of observations: 150
#> ====================================================================================================
#> 
#> Summary of Sepal.Length by Species:
#> -----------------------------------
#> 
#> Estimates
#> 
#>                 Min     25%   Median   75%   Max    Mean       SD   Sample Size
#>        setosa   4.3   4.800      5.0   5.2   5.8   5.006   0.3525            50
#>    versicolor   4.9   5.600      5.9   6.3   7.0   5.936   0.5162            50
#>     virginica   4.9   6.225      6.5   6.9   7.9   6.588   0.6359            50
#> 
#> 
#> ====================================================================================================
#> 
#> 
inzinference(Sepal.Length ~ Species, data = iris)
#> ====================================================================================================
#>                                iNZight Inference using Normal Theory
#> ----------------------------------------------------------------------------------------------------
#>    Primary variable of interest: Sepal.Length (numeric)
#>              Secondary variable: Species (categorical)
#>                                  
#>    Total number of observations: 150
#> ====================================================================================================
#> 
#> Inference of Sepal.Length by Species:
#> -------------------------------------
#> 
#> Group Means with 95% Confidence Intervals
#> 
#>                 Estimate   Lower   Upper
#>        setosa      5.006   4.906   5.106
#>    versicolor      5.936   5.789   6.083
#>     virginica      6.588   6.407   6.769
#> 
#> One-way Analysis of Variance (ANOVA F-test)
#> 
#>    F = 119.26, df = 2 and 147, p-value < 2.22e-16
#> 
#>           Null Hypothesis: true group means are all equal
#>    Alternative Hypothesis: true group means are not all equal
#> 
#> Pairwise differences in group means with 95% Confidence Intervals and P-values
#> (The CIs and P-values have been adjusted for multiple comparisons)
#> 
#>                            Estimate     Lower     Upper      P-value
#>  -------------------------------------------------------------------
#>      setosa - versicolor     -0.930   -1.1738   -0.6862   < 2.22e-16
#>      setosa - virginica      -1.582   -1.8258   -1.3382   < 2.22e-16
#> 
#>  versicolor - virginica      -0.652   -0.8958   -0.4082   < 2.22e-16
#> 
#> 
#>           Null Hypothesis: true difference in group means is zero
#>    Alternative Hypothesis: true difference in group means is not zero
#> 
#> 
#> ====================================================================================================
#> 
#> 
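
The ANOVA F-test reported above can be reproduced with base R's oneway.test() under the equal-variance assumption. This is only a cross-check of the F statistic and degrees of freedom; it does not reproduce the multiple-comparison adjustment applied to the pairwise intervals.

```r
# Classical one-way ANOVA F-test (equal variances assumed)
fit <- oneway.test(Sepal.Length ~ Species, data = iris, var.equal = TRUE)

round(unname(fit$statistic), 2)  # F statistic
unname(fit$parameter)            # numerator and denominator df
fit$p.value
```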

## confidentialisation and privacy controls
# random rounding and suppression:
HairEyeColor_df <- as.data.frame(HairEyeColor)
inzsummary(Hair ~ Eye, data = HairEyeColor_df, freq = Freq)
#> ====================================================================================================
#>                                           iNZight Summary
#> ----------------------------------------------------------------------------------------------------
#>    Primary variable of interest: Hair (categorical)
#>              Secondary variable: Eye (categorical)
#>                                  
#>    Total number of observations: 32
#> ====================================================================================================
#> 
#> Summary of the distribution of Hair (columns) by Eye (rows):
#> ------------------------------------------------------------
#> 
#> Table of Counts:
#> 
#>            Black   Brown   Red   Blond   Row Total
#>    Brown      68     119    26       7         220
#>     Blue      20      84    17      94         215
#>    Hazel      15      54    14      10          93
#>    Green       5      29    14      16          64
#> 
#> Table of Percentages (within categories of Eye):
#> 
#>             Black    Brown      Red    Blond   Total   Row N
#>    Brown   30.91%   54.09%   11.82%    3.18%    100%     220
#>     Blue    9.30%   39.07%    7.91%   43.72%    100%     215
#>    Hazel   16.13%   58.06%   15.05%   10.75%    100%      93
#>    Green    7.81%   45.31%   21.88%   25.00%    100%      64
#> 
#> 
#> ====================================================================================================
#> 
#> 
inzsummary(Hair ~ Eye, data = HairEyeColor_df, freq = Freq,
    privacy_controls = list(
        rounding = "RR3",
        suppression = 10
    )
)
#> ====================================================================================================
#>                                           iNZight Summary
#> ----------------------------------------------------------------------------------------------------
#>    Primary variable of interest: Hair (categorical)
#>              Secondary variable: Eye (categorical)
#>                                  
#>    Total number of observations: 32
#> ====================================================================================================
#> 
#> Privacy and confidentialisation information
#> -------------------------------------------
#> 
#>   * counts are rounded using RR3 (random rounding to base 3)
#>   * suppression of counts smaller than 10, indicated by S, with secondary suppression where necessary
#>   * suppression of totals and means where underlying unrounded count < 10
#> 
#> NOTE: this feature is still experimental, and all output should be manually
#> checked before being made public. This is simply to aid that process.
#> 
#> ====================================================================================================
#> 
#> Summary of the distribution of Hair (columns) by Eye (rows):
#> ------------------------------------------------------------
#> 
#> Table of Counts:
#> 
#>            Black   Brown   Red   Blond   Row Total
#>    Brown      69     120    27       S           S
#>     Blue      21      84    18      96         219
#>    Hazel      15      54    15      12          96
#>    Green       S      30    15      15           S
#> 
#> Table of Percentages (within categories of Eye):
#> 
#>             Black    Brown      Red    Blond   Total   Row N
#>    Brown        S        S        S        S       S       S
#>     Blue    9.59%   38.36%    8.22%   43.84%    100%     219
#>    Hazel   15.62%   56.25%   15.62%   12.50%    100%      96
#>    Green        S        S        S        S       S       S
#> 
#> 
#> ====================================================================================================
#> 
#>
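
To give a feel for what the privacy controls do, here is a small base R sketch of random rounding to base 3 and primary suppression. It follows the usual definition of RR3 (multiples of 3 are unchanged; other counts move to the nearer multiple with probability 2/3 and to the farther one with probability 1/3) and is not taken from the package's implementation. The helpers rr3 and suppress are hypothetical, suppression is assumed to apply to the unrounded counts, and secondary suppression is not attempted.

```r
# Hypothetical helper: random rounding to base 3
rr3 <- function(counts) {
    counts <- as.integer(counts)
    remainder <- counts %% 3L
    down <- counts - remainder
    up <- down + 3L
    # Round up with probability remainder / 3:
    # 0 for exact multiples, 1/3 for remainder 1, 2/3 for remainder 2
    round_up <- runif(length(counts)) < remainder / 3
    ifelse(round_up, up, down)
}

# Hypothetical helper: replace small counts by "S" (primary suppression only)
suppress <- function(rounded, unrounded, threshold = 10) {
    ifelse(unrounded < threshold, "S", as.character(rounded))
}

# Counts for brown-eyed people from the table above (Black, Brown, Red, Blond)
brown_eye <- c(Black = 68, Brown = 119, Red = 26, Blond = 7)

set.seed(1)  # random rounding differs between runs unless a seed is set
rounded <- rr3(brown_eye)
protected <- suppress(rounded, brown_eye)
names(protected) <- names(brown_eye)
protected
```

Because the rounding is random, repeated calls on the same table can give different published counts, which is why the rounded totals in the output above need not equal the sum of the rounded cells.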