Discussion Questions and Problems

Discussion Questions

  1. 4-1 What is the meaning of least squares in a regression model?

  2. 4-2 Discuss the use of dummy variables in regression analysis.

  3. 4-3 Discuss how the coefficient of determination and the coefficient of correlation are related and how they are used in regression analysis.

  4. 4-4 Explain how a scatter diagram can be used to identify the type of regression to use.

  5. 4-5 Explain how the adjusted $r^2$ value is used in developing a regression model.

  6. 4-6 Explain what information is provided by the F test.

  7. 4-7 What is the SSE? How is this related to the SST and the SSR?

  8. 4-8 Explain how a plot of the residuals can be used in developing a regression model.

Problems

  1. 4-9 John Smith has developed the following forecasting model:

    $\hat{Y} = 36 + 4.3X_1$

    where

    $\hat{Y}$ = demand for K10 air conditioners
    $X_1$ = the outside temperature (°F)

    1. Forecast the demand for K10 when the temperature is 70°F.

    2. What is the demand for a temperature of 80°F?

    3. What is the demand for a temperature of 90°F?
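
The forecasts in Problem 4-9 come from substituting each temperature into the model. A minimal sketch, assuming only the coefficients given in the problem (the function name `forecast_demand` is illustrative, not from the text):

```python
def forecast_demand(temp_f: float) -> float:
    """Forecast K10 demand from the outside temperature in degrees F.

    Uses the model stated in Problem 4-9: Y-hat = 36 + 4.3 * X1.
    """
    return 36 + 4.3 * temp_f


# Evaluate the model at the three temperatures asked about in the problem.
for temp in (70, 80, 90):
    print(f"{temp} F -> forecast demand of about {forecast_demand(temp):.0f} units")
```

Each 10°F rise adds 4.3 × 10 = 43 units to the forecast, which is what the slope coefficient means.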

  2. 4-10 The operations manager of a musical instrument distributor feels that demand for a particular type of guitar may be related to the number of YouTube views for a music video by the popular rock group Marble Pumpkins during the preceding month. The manager has collected the data shown in the following table:

    YouTube VIEWS (1,000s) GUITAR SALES
    30 8
    40 11
    70 12
    60 10
    80 15
    50 13
    1. Graph these data to see whether a linear equation might describe the relationship between the views on YouTube and guitar sales.

    2. Using the equations presented in this chapter, compute the SST, SSE, and SSR. Find the least-squares regression line for these data.

    3. Using the regression equation, predict guitar sales if there were 40,000 views last month.
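
The hand calculations asked for in Problem 4-10 can be checked with a short script. This is a sketch using the standard least-squares formulas (slope = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)², intercept = Ȳ − slope·X̄); the function names are illustrative and the data are copied from the table above.

```python
views = [30, 40, 70, 60, 80, 50]   # YouTube views (1,000s)
sales = [8, 11, 12, 10, 15, 13]    # guitar sales


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line for y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sums_of_squares(x, y, intercept, slope):
    """Return (SST, SSE, SSR) for the fitted line."""
    y_bar = sum(y) / len(y)
    y_hat = [intercept + slope * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation
    return sst, sse, ssr


b0, b1 = fit_line(views, sales)
sst, sse, ssr = sums_of_squares(views, sales, b0, b1)
prediction = b0 + b1 * 40   # 40,000 views, expressed in 1,000s

print(f"Y-hat = {b0:.2f} + {b1:.2f} X")
print(f"SST = {sst:.2f}, SSE = {sse:.2f}, SSR = {ssr:.2f}")
print(f"Predicted sales at 40,000 views: {prediction:.1f}")
```

For these data the line works out to Ŷ = 6 + 0.1X, with SST = 29.5, SSE = 12, and SSR = 17.5, so SST = SSR + SSE holds as expected.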

  3. 4-11 Using the data in Problem 4-10, test to see if there is a statistically significant relationship between sales and YouTube views at the 0.05 level of significance. Use the formulas in this chapter and Appendix D.
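
The F test in Problem 4-11 compares MSR = SSR/k with MSE = SSE/(n − k − 1). The sketch below recomputes both from the Problem 4-10 data. The critical value 7.71 for 1 and 4 degrees of freedom at the 0.05 level is taken from a standard F table and is an assumption here; check it against Appendix D.

```python
views = [30, 40, 70, 60, 80, 50]   # YouTube views (1,000s)
sales = [8, 11, 12, 10, 15, 13]    # guitar sales

n = len(views)   # number of observations
k = 1            # number of independent variables

x_bar = sum(views) / n
y_bar = sum(sales) / n

# Least-squares slope and intercept.
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(views, sales))
      / sum((x - x_bar) ** 2 for x in views))
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * x for x in views]

ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
sse = sum((y - yh) ** 2 for y, yh in zip(sales, y_hat))

msr = ssr / k               # mean square regression
mse = sse / (n - k - 1)     # mean square error
f_stat = msr / mse

# Assumed table value for alpha = 0.05 with df1 = 1 and df2 = 4.
F_CRITICAL = 7.71
significant = f_stat > F_CRITICAL

print(f"F = {f_stat:.2f}, critical value = {F_CRITICAL}")
print("Reject H0: significant relationship" if significant
      else "Fail to reject H0: relationship not significant at 0.05")
```

With only six observations, the calculated F (about 5.83) falls short of the table value, so the sample alone does not establish significance at the 0.05 level.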

  4. 4-12 Using computer software, find the least-squares regression line for the data in Problem 4-10. Based on the F test, is there a statistically significant relationship between the demand for guitars and the number of YouTube views?

  5. 4-13 Students in a management science class have just received their grades on the first test. The instructor has provided information about the first test grades in some previous classes, as well as the final averages for the same students. Some of these grades have been sampled and are as follows:

    STUDENT 1 2 3 4 5 6 7 8 9
    1st test grade 98 77 88 80 96 61 66 95 69
    Final average 93 78 84 73 84 64 64 95 76
    1. Develop a regression model that could be used to predict the final average in the course based on the first test grade.

    2. Predict the final average of a student who made an 83 on the first test.

    3. Give the values of $r$ and $r^2$ for this model. Interpret the value of $r^2$ in the context of this problem.

  6. 4-14 Using the data in Problem 4-13, test to see if there is a statistically significant relationship between the grade on the first test and the final average at the 0.05 level of significance. Use the formulas in this chapter and Appendix D.

  7. 4-15 Using computer software, find the least-squares regression line for the data in Problem 4-13. Based on the F test, is there a statistically significant relationship between the first test grade and the final average in the course?

  8. 4-16 Steve Caples, a real estate appraiser in Lake Charles, Louisiana, has developed a regression model to help appraise residential housing in the Lake Charles area. The model was developed using recently sold homes in a particular neighborhood. The price (Y) of the house is based on the square footage (X) of the house. The model is

    $\hat{Y} = 33{,}478 + 62.4X$

    The coefficient of correlation for the model is 0.63.

    1. Use the model to predict the selling price of a house that is 1,860 square feet.

    2. A house with 1,860 square feet recently sold for $165,000. Explain why this is not what the model predicted.

    3. If you were going to use multiple regression to develop an appraisal model, what other quantitative variables might be included in the model?

    4. What is the coefficient of determination for this model?
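
Parts 1, 2, and 4 of Problem 4-16 involve simple arithmetic on the stated model. The sketch below uses only the numbers given in the problem; the function and variable names are illustrative.

```python
def predict_price(square_feet: float) -> float:
    """Predicted selling price from the model Y-hat = 33,478 + 62.4 * X."""
    return 33_478 + 62.4 * square_feet


predicted = predict_price(1_860)
actual = 165_000
residual = actual - predicted   # how far the actual sale was from the prediction

r = 0.63                        # coefficient of correlation given in the problem
r_squared = r ** 2              # coefficient of determination

print(f"Predicted price: ${predicted:,.0f}")
print(f"Residual for the $165,000 sale: ${residual:,.0f}")
print(f"r^2 = {r_squared:.4f}")
```

An $r^2$ of roughly 0.40 means square footage accounts for only about 40% of the variation in price, so a sizable gap between an actual sale and the prediction is not surprising.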

  9. 4-17 Accountants at the firm Walker and Walker believed that several traveling executives submit unusually high travel vouchers when they return from business trips. The accountants took a sample of 200 vouchers submitted from the past year; they then developed the following multiple regression equation relating expected travel cost (Y) to number of days on the road (X1) and distance traveled (X2) in miles:

    $\hat{Y} = \$90.00 + \$48.50X_1 + \$0.40X_2$

    The coefficient of correlation computed was 0.68.

    1. If Thomas Williams returns from a 300-mile trip that took him out of town for 5 days, what is the expected amount that he should claim as expenses?

    2. Williams submitted a reimbursement request for $685; what should the accountant do?

    3. Comment on the validity of this model. Should any other variables be included? Which ones? Why?
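
Parts 1 and 2 of Problem 4-17 can be checked by plugging the trip details into the multiple regression equation. A minimal sketch using only the coefficients and figures stated in the problem (names are illustrative):

```python
def expected_travel_cost(days: float, miles: float) -> float:
    """Expected cost from Y-hat = 90.00 + 48.50 * X1 + 0.40 * X2."""
    return 90.00 + 48.50 * days + 0.40 * miles


expected = expected_travel_cost(days=5, miles=300)
claimed = 685.00
excess = claimed - expected     # amount claimed above the model's expectation

r = 0.68                        # coefficient of correlation given in the problem
r_squared = r ** 2

print(f"Expected claim: ${expected:,.2f}")
print(f"Claim exceeds expectation by: ${excess:,.2f}")
print(f"r^2 = {r_squared:.4f}")
```

Because $r^2$ is below 0.5, more than half of the variation in travel cost is left unexplained by days and distance, which matters when judging whether a single claim is unusual.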

  10. 4-18 Thirteen students entered the undergraduate business program at Rollins College 2 years ago. The following table indicates what their grade-point averages (GPAs) were after being in the program for 2 years and what each student scored on the SAT exam (maximum 2400) when he or she was in high school. Is there a meaningful relationship between grades and SAT scores? If a student scores a 1200 on the SAT, what do you think his or her GPA will be? What about a student who scores 2400?

    STUDENT SAT SCORE GPA STUDENT SAT SCORE GPA
    A 1263 2.90 H 1443 2.53
    B 1131 2.93 I 2187 3.22
    C 1755 3.00 J 1503 1.99
    D 2070 3.45 K 1839 2.75
    E 1824 3.66 L 2127 3.90
    F 1170 2.88 M 1098 1.60
    G 1245 2.15
  11. 4-19 Bus and subway ridership in Washington, D.C., during the summer months is believed to be heavily tied to the number of tourists visiting the city. During the past 12 years, the following data have been obtained:

    YEAR NUMBER OF TOURISTS (1,000,000s) RIDERSHIP (100,000s)
    1 7 15
    2 2 10
    3 6 13
    4 4 15
    5 14 25
    6 15 27
    7 16 24
    8 12 20
    9 14 27
    10 20 44
    11 15 34
    12 7 17
    1. Plot these data and determine whether a linear model is reasonable.

    2. Develop a regression model.

    3. What is expected ridership if 10 million tourists visit the city?

    4. If there are no tourists at all, explain the predicted ridership.

  12. 4-20 Use computer software to develop a regression model for the data in Problem 4-19. Explain what this output indicates about the usefulness of this model.

  13. 4-21 The following data give the starting salary for students who recently graduated from a local university and accepted jobs soon after graduation. The starting salary, grade-point average (GPA), and major (business or other) are provided.

    SALARY GPA MAJOR
    $29,500 3.1 Other
    $46,000 3.5 Business
    $39,800 3.8 Business
    $36,500 2.9 Other
    $42,000 3.4 Business
    $31,500 2.1 Other
    $36,200 2.5 Business
    1. Using a computer, develop a regression model that could be used to predict starting salary based on GPA and major.

    2. Use this model to predict the starting salary for a business major with a GPA of 3.0.

    3. What does the model say about the starting salary for a business major compared to a nonbusiness major?

    4. Do you believe this model is useful in predicting the starting salary? Justify your answer, using information provided in the computer output.
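
Problem 4-21 needs a dummy variable for major. The sketch below shows one way to code it (1 = business, 0 = other, an assumed coding) and fits the model by solving the normal equations (XᵀX)b = Xᵀy with plain Python. The helper names `solve` and `least_squares` are illustrative. In practice the problem expects statistical software, which also reports $r^2$ and significance levels.

```python
# Data from the table in Problem 4-21.
salary = [29_500, 46_000, 39_800, 36_500, 42_000, 31_500, 36_200]
gpa = [3.1, 3.5, 3.8, 2.9, 3.4, 2.1, 2.5]
major = ["Other", "Business", "Business", "Other", "Business", "Other", "Business"]

# Dummy variable (assumed coding): 1 for a business major, 0 otherwise.
business = [1 if m == "Business" else 0 for m in major]

# Design matrix: intercept column, GPA, and the dummy variable.
X = [[1.0, g, float(b)] for g, b in zip(gpa, business)]


def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                     # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def least_squares(X, y):
    """Return the coefficients that solve the normal equations (X'X) b = X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


b0, b_gpa, b_business = least_squares(X, salary)

# Part 2: business major (dummy = 1) with a GPA of 3.0.
predicted = b0 + b_gpa * 3.0 + b_business * 1

print(f"Y-hat = {b0:,.0f} + {b_gpa:,.0f} * GPA + {b_business:,.0f} * Business")
print(f"Predicted salary, business major with GPA 3.0: ${predicted:,.0f}")
```

The coefficient on the dummy variable is the estimated salary difference between a business major and a nonbusiness major with the same GPA, which is what part 3 asks about.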

  14. 4-22 The following data give the selling price, square footage, number of bedrooms, and age of houses that have sold in a neighborhood in the past 6 months. Develop three regression models to predict the selling price based upon each of the other factors individually. Which of these is best?

    SELLING PRICE ($) SQUARE FOOTAGE BEDROOMS AGE (YEARS)
    84,000 1,670 2 30
    79,000 1,339 2 25
    91,500 1,712 3 30
    120,000 1,840 3 40
    127,500 2,300 3 18
    132,500 2,234 3 30
    145,000 2,311 3 19
    164,000 2,377 3 7
    155,000 2,736 4 10
    168,000 2,500 3 1
    172,500 2,500 4 3
    174,000 2,479 3 3
    175,000 2,400 3 1
    177,500 3,124 4 0
    184,000 2,500 3 2
    195,500 4,062 4 10
    195,000 2,854 3 3

  15. 4-23 Use the data in Problem 4-22 and develop a regression model to predict selling price based on the square footage and number of bedrooms. Use this to predict the selling price of a 2,000-square-foot house with three bedrooms. Compare this model with the models in Problem 4-22. Should the number of bedrooms be included in the model? Why or why not?

  16. 4-24 Use the data in Problem 4-22 and develop a regression model to predict selling price based on the square footage, number of bedrooms, and age. Use this to predict the selling price of a 10-year-old, 2,000-square-foot house with three bedrooms.

  17. 4-25 The total expenses of a hospital are related to many factors. Two of these factors are the number of beds in the hospital and the number of admissions. Data were collected on 14 hospitals, as shown in the following table:

    HOSPITAL NUMBER OF BEDS ADMISSIONS (100s) TOTAL EXPENSES ($1,000,000s)
    1 215 77 57
    2 336 160 127
    3 520 230 157
    4 135 43 24
    5 35 9 14
    6 210 155 93
    7 140 53 45
    8 90 6 6
    9 410 159 99
    10 50 18 12
    11 65 16 11
    12 42 29 15
    13 110 28 21
    14 305 98 63

    Find the best regression model to predict the total expenses of a hospital. Discuss the accuracy of this model. Should both variables be included in the model? Why or why not?

  18. 4-26 A sample of 20 automobiles was taken, and the miles per gallon (MPG), horsepower, and total weight were recorded. Develop a linear regression model to predict MPG, using horsepower as the only independent variable. Develop another model with weight as the independent variable. Which of these two models is better? Explain.

    MPG HORSEPOWER WEIGHT
    44 67 1,844
    44 50 1,998
    40 62 1,752
    37 69 1,980
    37 66 1,797
    34 63 2,199
    35 90 2,404
    32 99 2,611
    30 63 3,236
    28 91 2,606
    26 94 2,580
    26 88 2,507
    25 124 2,922
    22 97 2,434
    20 114 3,248
    21 102 2,812
    18 114 3,382
    18 142 3,197
    16 153 4,380
    16 139 4,036
  19. 4-27 Use the data in Problem 4-26 to develop a multiple linear regression model. How does this compare with each of the models in Problem 4-26?

  20. 4-28 Use the data in Problem 4-26 to find the best quadratic regression model. (There is more than one to consider.) How does this compare to the models in Problems 4-26 and 4-27?

  21. 4-29 A sample of nine public universities and nine private universities was taken. The total cost for the year (including room and board) and the median SAT score (maximum total is 2400) at each school were recorded. It was felt that schools with higher median SAT scores would have a better reputation and would charge more tuition as a result of that. The data are in the following table. Use regression to help answer the following questions based on the sample data. Do schools with higher SAT scores charge more in tuition and fees? Are private schools more expensive than public schools when SAT scores are taken into consideration? Discuss how accurate you believe these results are, using information related to the regression models.

    CATEGORY TOTAL COST ($) MEDIAN SAT
    Public 21,700 1990
    Public 15,600 1620
    Public 16,900 1810
    Public 15,400 1540
    Public 23,100 1540
    Public 21,400 1600
    Public 16,500 1560
    Public 23,500 1890
    Public 20,200 1620
    Private 30,400 1630
    Private 41,500 1840
    Private 36,100 1980
    Private 42,100 1930
    Private 27,100 2130
    Private 34,800 2010
    Private 32,100 1590
    Private 31,800 1720
    Private 32,100 1770

  22. 4-30 In 2012, the total payroll for the New York Yankees was almost $200 million, while the total payroll for the Oakland Athletics (a team known for using baseball analytics or sabermetrics) was about $55 million, less than one-third of the Yankees’ payroll. In the following table, you will see the payrolls (in millions) and the total number of victories for the baseball teams in the American League in the 2012 season. Develop a regression model to predict the total number of victories based on the payroll. Use the model to predict the number of victories for a team with a payroll of $79 million. Based on the results of the computer output, discuss the relationship between payroll and victories.

    TEAM PAYROLL ($1,000,000s) NUMBER OF VICTORIES
    Baltimore Orioles 81.4 93
    Boston Red Sox 173.2 69
    Chicago White Sox 96.9 85
    Cleveland Indians 78.4 68
    Detroit Tigers 132.3 88
    Kansas City Royals 60.9 72
    Los Angeles Angels 154.5 89
    Minnesota Twins 94.1 66
    New York Yankees 198.0 95
    Oakland Athletics 55.4 94
    Seattle Mariners 82.0 75
    Tampa Bay Rays 64.2 90
    Texas Rangers 120.5 93
    Toronto Blue Jays 75.5 73
  23. 4-31 The number of victories (W), earned run average (ERA), runs scored (R), batting average (AVG), and on-base percentage (OBP) for each team in the American League in the 2012 season are provided in the following table. The ERA is one measure of the effectiveness of the pitching staff, and a lower number is better. The other statistics are measures of the effectiveness of the hitters, and a higher number is better for each of these.

    TEAM W ERA R AVG OBP
    Baltimore Orioles 93 3.90 712 0.247 0.311
    Boston Red Sox 69 4.70 734 0.260 0.315
    Chicago White Sox 85 4.02 748 0.255 0.318
    Cleveland Indians 68 4.78 667 0.251 0.324
    Detroit Tigers 88 3.75 726 0.268 0.335
    Kansas City Royals 72 4.30 676 0.265 0.317
    Los Angeles Angels 89 4.02 767 0.274 0.332
    Minnesota Twins 66 4.77 701 0.260 0.325
    New York Yankees 95 3.85 804 0.265 0.337
    Oakland Athletics 94 3.48 713 0.238 0.310
    Seattle Mariners 75 3.76 619 0.234 0.296
    Tampa Bay Rays 90 3.19 697 0.240 0.317
    Texas Rangers 93 3.99 808 0.273 0.334
    Toronto Blue Jays 73 4.64 716 0.245 0.309
    1. Develop a regression model that could be used to predict the number of victories based on the ERA.

    2. Develop a regression model that could be used to predict the number of victories based on the runs scored.

    3. Develop a regression model that could be used to predict the number of victories based on the batting average.

    4. Develop a regression model that could be used to predict the number of victories based on the on-base percentage.

    5. Which of the four models is better for predicting the number of victories?

    6. Find the best multiple regression model to predict the number of wins. Use any combination of the variables to find the best model.

  24. 4-32 The closing stock price for each of two stocks was recorded over a 12-month period. The closing price for the Dow Jones Industrial Average (DJIA) was also recorded over this same time period. These values are shown in the following table:

    MONTH DJIA STOCK 1 STOCK 2
    1 11,168 48.5 32.4
    2 11,150 48.2 31.7
    3 11,186 44.5 31.9
    4 11,381 44.7 36.6
    5 11,679 49.3 36.7
    6 12,081 49.3 38.7
    7 12,222 46.1 39.5
    8 12,463 46.2 41.2
    9 12,622 47.7 43.3
    10 12,269 48.3 39.4
    11 12,354 47.0 40.1
    12 13,063 47.9 42.1
    13 13,326 47.8 45.2

    1. Develop a regression model to predict the price of stock 1 based on the Dow Jones Industrial Average.

    2. Develop a regression model to predict the price of stock 2 based on the Dow Jones Industrial Average.

    3. Which of the two stocks is most highly correlated to the Dow Jones Industrial Average over this time period?

  25. 4-33 Annual rainfall plays an important role in corn agriculture. The drought of 2011 affected corn prices for years. Given the following data, build a model and predict the harvest for 2016 given that the total rainfall was 6.45 inches. Critique your prediction.

    YEAR RAIN (INCHES) REAP (TONS)
    2011 2.06 325
    2012 5.11 408
    2013 7.43 609
    2014 6.12 512
    2015 7.14 544

Note: In the original text, an icon beside each problem indicates whether it may be solved with QM for Windows, with Excel QM, or with either; the icons are not reproduced here.

Case Study North–South Airline

In January 2012, Northern Airlines merged with Southeast Airlines to create the fourth largest U.S. carrier. The new North–South Airline inherited both an aging fleet of Boeing 727-300 aircraft and Stephen Ruth. Stephen was a tough former Secretary of the Navy who stepped in as new president and chairman of the board.

Stephen’s first concern in creating a financially solid company was maintenance costs. It was commonly surmised in the airline industry that maintenance costs rise with the age of the aircraft. He quickly noticed that historically there had been a significant difference in the reported B727-300 maintenance costs (from ATA Form 41s) in both the airframe and the engine areas between Northern Airlines and Southeast Airlines, with Southeast having the newer fleet.

On February 12, 2012, Peg Jones, vice president for operations and maintenance, was called into Stephen’s office and asked to study the issue. Specifically, Stephen wanted to know whether the average fleet age was correlated to direct airframe maintenance costs and whether there was a relationship between average fleet age and direct engine maintenance costs. Peg was to report back by February 26 with the answer, along with quantitative and graphical descriptions of the relationship.

Peg’s first step was to have her staff construct the average age of the Northern and Southeast B727-300 fleets, by quarter, since the introduction of that aircraft to service by each airline in late 1993 and early 1994. The average age of each fleet was calculated by first multiplying the total number of calendar days each aircraft had been in service at the pertinent point in time by the average daily utilization of the respective fleet to determine the total fleet hours flown. The total fleet hours flown was then divided by the number of aircraft in service at that time, giving the age of the “average” aircraft in the fleet.
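
The average-age calculation described above can be sketched as follows. The days-in-service figures are hypothetical and chosen only to illustrate the arithmetic; the 8.7 hours per day is the Northern utilization figure reported in the case. The function name is illustrative.

```python
def average_fleet_age_hours(days_in_service, avg_daily_utilization):
    """Average aircraft age in flight hours.

    Total fleet hours = (sum of calendar days in service) * (average daily
    utilization); dividing by the number of aircraft gives the age of the
    "average" aircraft.
    """
    total_fleet_hours = sum(days_in_service) * avg_daily_utilization
    return total_fleet_hours / len(days_in_service)


# Hypothetical fleet of three aircraft (calendar days in service).
days = [900, 1_000, 1_100]

# Northern's average utilization from the case: 8.7 hours per day.
age = average_fleet_age_hours(days, avg_daily_utilization=8.7)
print(f"Average fleet age: {age:,.0f} hours")
```

For this hypothetical fleet, 3,000 aircraft-days at 8.7 hours per day give 26,100 fleet hours, or 8,700 hours per aircraft.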

The average utilization was found by taking the actual total fleet hours flown on September 30, 2011, from Northern and Southeast data, and dividing by the total days in service for all aircraft at that time. The average utilization for Southeast was 8.3 hours per day, and the average utilization for Northern was 8.7 hours per day. Because the available cost data were calculated for each yearly period ending at the end of the first quarter, average fleet age was calculated at the same points in time. The fleet data are shown in the following table. Airframe cost data and engine cost data are both shown paired with fleet average age in that table.

Discussion Questions

  1. Prepare Peg Jones’s response to Stephen Ruth.

Note: Dates and names of airlines and individuals have been changed in this case to maintain confidentiality. The data and issues described here are real.

North–South Airline Data for Boeing 727-300 Jets

        NORTHERN AIRLINES DATA                         SOUTHEAST AIRLINES DATA
        (costs are per aircraft, in $; average age is in hours)
YEAR    AIRFRAME COST   ENGINE COST   AVERAGE AGE      AIRFRAME COST   ENGINE COST   AVERAGE AGE
2001    51.80           43.49         6,512            13.29           18.86         5,107
2002    54.92           38.58         8,404            25.15           31.55         8,145
2003    69.70           51.48         11,077           32.18           40.43         7,360
2004    68.90           58.72         11,717           31.78           22.10         5,773
2005    63.72           45.47         13,275           25.34           19.69         7,150
2006    84.73           50.26         15,215           32.78           32.58         9,364
2007    78.74           79.60         18,390           35.56           38.07         8,259