The
exploratory analysis presented in this case illustrates the use of
visualization to become familiar with the variability of each data
element, examine outliers, identify variables that are not candidates
for further analysis (at least initially), and view relationships
between variables. Such exploratory analysis is indispensable at the
beginning of an inquiry and forms the foundation for subsequent analyses.
In data sets with more
than a few variables, the number of possible combinations of variables
is very large, and considerable time and resources may be needed to
examine a large number of them. An understanding of the problem
domain in conjunction with insight gained from basic descriptive analysis
can guide the development of an analysis strategy that will meet the
needs of stakeholders. Gaining deeper understanding through consulting
subject matter experts and pertinent references helps in selecting
and interpreting meaningful analyses.
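
To give a rough sense of how quickly the number of variable combinations grows, the following minimal Python sketch counts the distinct pairs and triples that could be formed from a given number of variables. The variable counts used here are illustrative assumptions, not the number of fields in the SPARCS data, and the function name `subset_counts` is hypothetical.

```python
import math

def subset_counts(n_variables):
    """Count distinct unordered pairs and triples among n_variables."""
    return {
        "pairs": math.comb(n_variables, 2),    # n choose 2
        "triples": math.comb(n_variables, 3),  # n choose 3
    }

# Illustrative variable counts only (assumed, not taken from the case data).
for n in (5, 10, 20, 40):
    counts = subset_counts(n)
    print(f"{n:>2} variables: {counts['pairs']:>4} pairs, "
          f"{counts['triples']:>5} triples")
```

With 10 variables there are 45 possible pairs and 120 possible triples; with 40 variables there are 780 pairs and 9,880 triples, which is why domain knowledge is needed to decide which combinations are worth examining.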
With large data sets,
such as those available from SPARCS, significance testing is often of
little value, because even very small differences will be statistically
significant without being of practical significance. In the next two cases, we will
continue our analysis of the Adirondack newborns data by developing
predictive models through regression analysis.
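
The point about significance testing in large data sets can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses a simple two-sample z-test with equal group sizes and a common standard deviation, and the summary values (a difference in means of 0.01 with a standard deviation of 3.0) are hypothetical, not results from the Adirondack newborns data. The function name `two_sample_z` is likewise hypothetical.

```python
import math

def two_sample_z(mean_diff, sd, n_per_group):
    """Two-sample z-test for equal-sized groups with a common SD.

    Returns (z statistic, two-sided p-value, Cohen's d).
    """
    se = sd * math.sqrt(2.0 / n_per_group)   # standard error of the difference
    z = mean_diff / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal tail probability
    d = mean_diff / sd                       # standardized effect size
    return z, p, d

# Hypothetical summary values: the same tiny difference at two sample sizes.
for n in (1_000, 1_000_000):
    z, p, d = two_sample_z(mean_diff=0.01, sd=3.0, n_per_group=n)
    print(f"n per group = {n:>9,}: z = {z:.2f}, p = {p:.4f}, d = {d:.4f}")
```

The effect size is identical in both runs, about 0.003 standard deviations, yet the p-value falls below 0.05 only at the larger sample size. Reporting an effect size alongside any p-value is one way to keep practical significance in view.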