An essential first step in any analysis is to become
familiar with the data. While total costs, length of stay, and birthweight
are mentioned in the problem statement and are of primary interest,
it is also valuable to provide a descriptive analysis for the other
variables. This gives the analyst additional insight for interpreting
the linear regressions in the problem context. For example, what
is the distribution of severity of illness or the type of payment?
Figure 12.4 (Descriptive Statistics for Other Variables) shows descriptive
statistics for the other variables in the data set. These tables
were created with the Tabulate function with the variables grouped
by categories such as demographics, admission, diagnosis, and payment.
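
For readers working outside the tool used here, the same kind of grouped summary can be approximated in Python with pandas. The following is only a minimal sketch, not the method used to produce Figure 12.4: the data frame is a made-up stand-in, and the column names, the group assignments, and the helper `describe_groups` are all hypothetical.

```python
import pandas as pd

# Toy stand-in for the newborn data; every value here is made up.
df = pd.DataFrame({
    "Gender": ["F", "M", "F", "M"],
    "Severity of Illness": ["Minor", "Minor", "Moderate", "Minor"],
    "Payment Type": ["Medicaid", "Private", "Medicaid", "Medicaid"],
})

# Hypothetical assignment of variables to descriptive categories.
groups = {
    "Demographics": ["Gender"],
    "Diagnosis": ["Severity of Illness"],
    "Payment": ["Payment Type"],
}

def describe_groups(data, groups):
    """Return {group: {variable: table of counts and percents}}."""
    out = {}
    for group, cols in groups.items():
        out[group] = {}
        for col in cols:
            # Count each level, keeping missing values as their own level.
            counts = data[col].value_counts(dropna=False)
            out[group][col] = pd.DataFrame({
                "N": counts,
                "% of Total": 100 * counts / len(data),
            })
    return out

tables = describe_groups(df, groups)
for group, by_var in tables.items():
    print(group)
    for col, table in by_var.items():
        print(table, end="\n\n")
```

Keeping the group-to-variable mapping in one dictionary makes it easy to regroup variables without touching the counting logic.
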
For some variables, all observations have the same value. In some cases
this is due to the subset being analyzed (e.g., the Hospital County is
Clinton for all observations because we are looking at only one hospital).
In other cases, such as Emergency Department Indicator, variation is
possible, but for CVPH in 2014 no newborns were admitted through the
Emergency Department. It is tempting to omit the description of Emergency
Department Indicator because there is no variation, but reporting it shows
that no newborns arrived through the Emergency Department, which is useful
information in the problem context.
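
Variables that take a single value can also be listed programmatically so that they are reported deliberately rather than overlooked. The sketch below assumes Python with pandas; the data are made up, the column names are only illustrative, and `constant_columns` is a hypothetical helper.

```python
import pandas as pd

# Toy stand-in for the one-hospital subset; every value here is made up.
df = pd.DataFrame({
    "Hospital County": ["Clinton", "Clinton", "Clinton", "Clinton"],
    "Emergency Department Indicator": ["N", "N", "N", "N"],
    "Payment Type": ["Medicaid", "Private", "Medicaid", "Medicaid"],
})

def constant_columns(data):
    """Map each column with exactly one distinct value to that value."""
    return {
        col: data[col].iloc[0]
        for col in data.columns
        # dropna=False counts a missing value as a level of its own.
        if data[col].nunique(dropna=False) == 1
    }

constant = constant_columns(df)
for col, value in constant.items():
    print(f"{col}: all observations equal {value!r}")
```

A list like this does not say why a variable is constant. Whether the cause is the subset chosen or simply what happened in the data still has to be judged from the problem context.
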