This case has illustrated the
basics of multiple regression with two continuous predictors. The
multiple regression equation that predicts total costs from length
of stay and birthweight showed both independent variables to be significant.
However, adding birthweight produced only a very slight improvement
in the goodness-of-fit measures. With an RMSE of $1596, the regression
model seems insufficiently precise to be useful in practice. Residual
analysis suggests that there are other factors that should be included
in the regression model.
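
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis performed in the case: the data are synthetic, the variable names (los, bw, cost) and the generating coefficients are assumptions chosen only for demonstration, and RMSE is computed here as the square root of the residual sum of squares divided by the residual degrees of freedom, which is one common convention.

```python
import numpy as np

def fit_ols(X, y):
    """Fit ordinary least squares with an intercept.

    Returns the coefficient vector (intercept first) and the RMSE,
    computed as sqrt(SSE / (n - number of fitted coefficients)).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X.reshape(-1, 1)
    A = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares solution
    resid = y - A @ beta
    dof = len(y) - A.shape[1]                      # residual degrees of freedom
    rmse = float(np.sqrt(resid @ resid / dof))
    return beta, rmse

# Synthetic data (assumption): NOT the SPARCS records used in the case.
rng = np.random.default_rng(0)
n = 200
los = rng.integers(1, 6, size=n).astype(float)     # hypothetical length of stay
bw = rng.normal(3.3, 0.5, size=n)                  # hypothetical birthweight
cost = 500 + 1100 * los + 180 * bw + rng.normal(0, 1500, size=n)

# Simple regression (length of stay only) versus multiple regression.
beta_simple, rmse_simple = fit_ols(los, cost)
beta_multi, rmse_multi = fit_ols(np.column_stack([los, bw]), cost)

print("simple   coefficients:", beta_simple, "RMSE:", rmse_simple)
print("multiple coefficients:", beta_multi, "RMSE:", rmse_multi)
```

When the second predictor explains little beyond the first, the two RMSE values will be close, which is the pattern reported for birthweight in this case.
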
Regression coefficients should
always be assessed for reasonableness in both direction and magnitude.
The coefficient for length of stay seems reasonable in the multiple
regression, just as it was in the simple regression. The coefficient
for birthweight is $179. At first it seems counterintuitive that
a heavier newborn should incur additional costs. Bear in mind,
however, that CVPH is a Level 1 perinatal center: it does not have a
neonatal intensive care unit and handles only normal and low-risk births.
According to Medline, larger-birthweight babies are at risk for injury
during delivery and for problems with blood sugar.
As a next step, incorporating procedures performed and birth
complications into the regression model is indicated. The All Patient
Refined Diagnosis Related Groups (APR-DRGs), APR severity, and APR risk
of mortality data elements could be used; these classify patients based
on their reason for admission, severity of illness, and risk of mortality.
Any analysis that uses APR-DRGs relies on complete and accurate
coding by health care providers.
Finally, the data used
in this analysis was limited in detail so that individuals
could not be identified. The full SPARCS data set has additional
information that may improve the model. The length of stay of a newborn
is often related to the length of stay of the mother. However, this
data set does not allow us to link the records of the newborn and
mother.
Multiple linear regression
is one of a number of possible methods to obtain predictive models.
Model building is often an iterative process requiring the analyst
to experiment with different combinations of predictor variables.
When selecting among several potential predictive models, all things
being equal, it is advisable to select the simplest model (i.e., the
one with the fewest predictors). This is referred to as the principle
of parsimony. In other words, choose the simplest model that meets the
precision required for its application.
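
The principle of parsimony can be expressed as a simple selection rule, sketched below. The candidate models, their predictor counts, their RMSE values, and the precision threshold are all hypothetical and are not results from this case; the helper name most_parsimonious is likewise an assumption made for illustration.

```python
from typing import NamedTuple, Optional, Sequence

class Candidate(NamedTuple):
    name: str
    n_predictors: int
    rmse: float

def most_parsimonious(candidates: Sequence[Candidate],
                      max_rmse: float) -> Optional[Candidate]:
    """Return the candidate with the fewest predictors whose RMSE does not
    exceed max_rmse. Ties on predictor count are broken by lower RMSE.
    Returns None if no candidate is precise enough."""
    adequate = [c for c in candidates if c.rmse <= max_rmse]
    if not adequate:
        return None
    return min(adequate, key=lambda c: (c.n_predictors, c.rmse))

# Hypothetical candidates (illustrative numbers only).
candidates = [
    Candidate("model_a", n_predictors=1, rmse=1600.0),
    Candidate("model_b", n_predictors=2, rmse=1590.0),
    Candidate("model_c", n_predictors=5, rmse=900.0),
    Candidate("model_d", n_predictors=8, rmse=850.0),
]

# Hypothetical requirement: RMSE no larger than 1000.
choice = most_parsimonious(candidates, max_rmse=1000.0)
print(choice)
```

In practice the threshold comes from the application: the analyst first decides how precise a prediction must be to be useful, then prefers the smallest model that achieves it over a larger model that is only marginally better.
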
A good understanding
of the problem domain can assist the analyst when searching for good
predictors and critically evaluating candidate models. Additional
insights can be obtained by consulting the literature or subject matter
experts.