An oil company has decided to statistically analyze the relationship between the performance of recently drilled wells and their age. The company collects data on 45 wells drilled in the last 3 years. The variables considered are:
ONLINE: How long ago the well was drilled, in years.
FLOW: Current rate of oil flow, in thousands of barrels per day.
One possibility is to cross-tabulate the two variables and perform a chi-square test of independence. The cross-tabulation (pivot table of counts) below is on the “Cross-tab” sheet of the workbook.
| ONLINE (years) | FLOW 1.7-2.7 | FLOW 2.7-3.7 |
| --- | --- | --- |
| 0-1 | 10 | 5 |
| 1-2 | 1 | 14 |
| 2-3 | 9 | 6 |
A. (5 points) Describe the relationship between the variables, if any, that is suggested by the above cross-tabulation.
B. (15 points) Run the chi-square test on the cross-tabulation, using the macro available from the worksheet kmacros.xls. Based on the test, is there a statistically significant relationship between the variables? Explain.
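For readers without access to the kmacros.xls macro, the same test can be sketched in plain Python. This is only an illustration, not the macro's implementation: the function name `chi_square_independence` is hypothetical, and the counts are copied from the cross-tabulation above. The p-value shortcut used here is valid only because this table has 2 degrees of freedom.

```python
import math

# Observed counts from the cross-tabulation:
# rows are ONLINE 0-1, 1-2, 2-3; columns are FLOW 1.7-2.7, 2.7-3.7.
observed = [
    [10, 5],
    [1, 14],
    [9, 6],
]


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


stat, df = chi_square_independence(observed)

# With exactly 2 degrees of freedom, the upper-tail chi-square probability
# has the closed form exp(-stat / 2). Other df values need a statistics library.
p_value = math.exp(-stat / 2) if df == 2 else None

print(f"chi-square = {stat:.2f}, df = {df}, p-value = {p_value}")
```

A small p-value (conventionally below 0.05) would indicate that the observed counts are unlikely under independence.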
C. (5 points) Another possibility is to run a simple regression of FLOW on ONLINE. This regression results in the following output:
Making the usual assumptions of the Simple Linear Model, what would you conclude about the effect of ONLINE on FLOW based on this regression? Explain.
D. (15 points) Run the above simple regression yourself, saving the residuals and generating a residual plot, putting the output on a sheet labeled “regr1.” Also produce a histogram of the residuals on this sheet. Are the residuals of this regression consistent with the assumptions of the Simple Linear Model? Explain.
E. (15 points) Add a new variable ONLINE2 to the data sheet, with ONLINE2 equal to the square of ONLINE (so if ONLINE=2 then ONLINE2=4, etc.). Run a multiple regression of FLOW on ONLINE and ONLINE2, saving the residuals and generating residual plots, putting the output on a sheet labeled “regr2.” Also produce a histogram of the residuals on this sheet. How do measures of fit (R-squared, standard error of estimate) and the residuals compare with the original simple regression? Based on this regression, what would you conclude about the relationship between the original variables (ONLINE and FLOW)?
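Outside the spreadsheet, the regression of FLOW on ONLINE and ONLINE2 can be sketched as an ordinary least-squares fit. The helper below is a hypothetical illustration, not the workbook's regression tool: `fit_quadratic` and its arguments are assumed names, and the well data must be supplied by the reader as two equal-length lists.

```python
def fit_quadratic(online, flow):
    """Least-squares fit of flow = b0 + b1*online + b2*online**2.

    Returns (coefficients, residuals, r_squared, standard_error).
    """
    # Design matrix columns: intercept, ONLINE, ONLINE2 (the squared term).
    rows = [[1.0, x, x * x] for x in online]
    k = 3
    # Normal equations (X'X) b = X'y, stored as an augmented 3x4 matrix.
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, flow))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    coefs = [a[i][k] / a[i][i] for i in range(k)]

    fitted = [sum(c * v for c, v in zip(coefs, r)) for r in rows]
    residuals = [y - f for y, f in zip(flow, fitted)]
    mean_y = sum(flow) / len(flow)
    ss_res = sum(e * e for e in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in flow)
    r_squared = 1.0 - ss_res / ss_tot
    # Standard error of estimate uses n - 3 degrees of freedom
    # (three estimated coefficients).
    std_err = (ss_res / (len(flow) - k)) ** 0.5
    return coefs, residuals, r_squared, std_err
```

The returned residuals can be plotted against ONLINE and in a histogram, and `r_squared` and `std_err` compared with the corresponding values from the simple regression.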
F. (5 points) Give a simple physical interpretation for the intercept in the second regression.
G. (5 points) Is there a simple interpretation for the slope coefficient of ONLINE in the second regression? Explain.
H. (5 points) Use the second regression to predict the flow of a well that was drilled 4 years ago. Is there any reason to be skeptical about the accuracy of this forecast? Explain.
I. (10 points) Suppose that the 45 wells in the data set are a random sample taken from a much larger population of wells drilled in the past 3 years. Use the data to construct a 95% confidence interval for the true mean flow among all the wells in the larger population.
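The interval in part I has the usual form: sample mean plus or minus a t critical value times the standard error of the mean. A minimal sketch follows, assuming the FLOW values are available as a Python list; `mean_confidence_interval` is a hypothetical helper, and the critical value is passed in by the caller (for 45 observations, 44 degrees of freedom, a t table gives roughly 2.015 at the 95% level).

```python
import statistics


def mean_confidence_interval(values, t_crit):
    """Return (lower, upper) bounds of a confidence interval for the mean.

    t_crit is the two-sided t critical value for len(values) - 1
    degrees of freedom, looked up by the caller.
    """
    n = len(values)
    mean = statistics.mean(values)
    # Sample standard deviation (n - 1 denominator) over sqrt(n).
    std_error = statistics.stdev(values) / n ** 0.5
    margin = t_crit * std_error
    return mean - margin, mean + margin
```

Because the sample is described as drawn from a much larger population, no finite-population correction is applied in this sketch.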