STATS1900 Business Statistics
Semester 2, 2014
Date Due: Refer to Unit Description
Total Marks: 40 marks Worth: 20% of final assessment
This assignment also requires a certain amount of computer work and written comment. You may need to seek guidance from your tutor along the way. Do not leave things until too late! Each question carefully describes what you are required to do so please follow these carefully.
In this assignment you will examine monthly data obtained from a health service provider whose hospital network may have up to 17,000 patients at any one time. Patient falls in hospital can delay other treatments, leading to additional costs all round, not to mention the extra pain and suffering of the patient. The health care company has introduced new procedures in the hope that the numbers of falls can be reduced and that overall health outcomes might be improved. The data is contained in the file FallsData.xls and contains the following columns (variables):
Id Identification number from original file
Falls Number of falls
P Number of patients across the entire network
#/1000 Number of month falls per 1000 patients
PTR Number of physiotherapy referrals in the given month
PTR/1000P Number of physiotherapy referrals per 1000 patients
Random Sample: Before you begin your analysis you are required to take a random sample of size 18 from the 40 cases in the file. Again, use the file Sample-Generator.xls to do this. (Your tutor can show you how to do this in EXCEL.) Your answers to the assignment tasks below are to be based on your sample of 18. Make sure to keep a safe copy of your sample since you cannot use Sample-Generator to reproduce the first sample.
To prepare your data file ready for analysis you must follow the following steps:
1. Use the Sample-Generator file to generate a random sample of 18 from the file Accident Data.xls.
2. After you have generated your sample you need to sort your data by the column headed Id â€“ remember to highlight all of your sample data before sorting.
3. The second column headed Mth is empty. Fill in this column with the numbers 1 to 18. These numbers represent individual months in 1.5 years of monthly data.
4. Use appropriate EXCEL formulae to calculate entries for the 5th and 7th columns.
5. Copy your sample file to another spreadsheet to work on your assignment. Make sure you save your random sample.
6. Now you are ready to use your sample data to complete the assignment tasks listed below
Provide a printout of the data in your sample, with Id numbers in ascending order.
1. Historically this hospital chain believed that their average number of patient falls was relatively stable at 53 falls per month. In recent years they have introduced new procedures, including assessing risk at the time of admission, greater patient monitoring, provision of mobility aids such as canes and walkers, etc.
Determine the mean, standard deviation and standard error of the number of falls, for your 1.5 years of monthly data. Use this information to carry out a t-test to see if the average number of monthly falls for the last 1.5 years is significantly less than 53.
2. Obtain a scatter-plot of the monthly number of falls vs physiotherapy referrals per month. Think carefully about which variable should go on the vertical axis â€“ remember, it is the independent variable that goes on the horizontal axis (i.e. the x-axis). Make sure you label your axes properly and your graph has an appropriate title. Briefly describe the nature of the relationship between these two variables.
3. Using Tools Data Analysis in XL carry out a regression analysis on these two variables. Copy the output into your assignment and use it to determine the following:
o Write down the regression equation
o State the R-Square value and the Standard Error and explain what they mean with respect to this data.
o Write down the value of the gradient of the regression line and explain what it means in this case.
o Is the constant or intercept value significant in this case? How do you know this?
o Briefly explain why you think this regression model is, or is not, a good model.
4. Obtain a time series plot of the number of falls versus Time (month). Describe the nature of this time series and identify which of the four time series components are present in this case.
Compare the first half of the time series to the second half (i.e. the first nine months with the last nine months), and briefly describe any differences.
By right clicking on one of the data points, add a trend line to the time series, making sure you include the regression equation and the R-Square value. Interpret the meaning of the slope coefficient in this case and state whether or not this is good news for the health company.
Use the regression equation to de-trend the data placing the de-trended data in another column. Obtain a time series plot of the de-trended data and explain what it shows.
5. Using information from your analyses write a short concluding paragraph about whether or not the health care provider has been successful in reducing the number of monthly falls amongst its patients.