This report examines two datasets on properties rented by students in Australia. It uses primary and secondary data to identify some determinants of the rent students pay: the chosen suburb, the number of bedrooms (as a proxy for size), the type of property (flat or house) and the bond amount paid for the property. Because we are working with a sample, the usefulness of the results is limited by the sampling technique and by the data collected.
Section 1
We first consider a primary dataset, obtained by interviewing students about their weekly rent. This dataset is problematic as
The other dataset is secondary data with 500 observations. It is rich in additional information and can be used to make campus-wise estimates or to determine association among variables. It can also be used in regression to determine which factors affect rents significantly. Such conclusions can help students search for the right property based on their own requirements. A snapshot of this dataset is as follows:
Bond Amount | Weekly Rent | Dwelling Type | Number Bedrooms | Postcode | Suburb
$2,700 | $675 | Flat | 3 | 2031 | RANDWICK
$3,000 | $750 | Flat | 2 | 2031 | RANDWICK
$1,540 | $385 | Flat | 2 | 2144 | AUBURN
$2,360 | $590 | House | 3 | 2144 | AUBURN
$2,600 | $650 | Flat | 1 | 2000 | SYDNEY
Section 2
We now use the primary data to give a snapshot of weekly rent, with numerical and visual summaries.
Statistic | Weekly Rent
Mean | 160
Standard Error | 24.2899156
Median | 150
Mode | #N/A
Standard Deviation | 54.31390246
Sample Variance | 2950
Kurtosis | -1.952887101
Skewness | 0.327662152
Range | 130
Minimum | 100
Maximum | 230
Sum | 800
Count | 5
It can be seen that the highest rent is $230, while the lowest is only $100, less than half of the maximum. The mean rent is $160 while the median is $150. The sample is small (five observations), but it is slightly skewed to the right (skewness of about 0.33).
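The entries in the summary table are linked by simple identities, so they can be cross-checked without the raw interview responses. The sketch below is a minimal consistency check that uses only the figures reported in the table; the variable names are illustrative and not part of the original analysis.

```python
import math

# Figures copied from the weekly-rent summary table (primary data).
count = 5
total = 800
std_dev = 54.31390246
variance = 2950
minimum, maximum = 100, 230

mean = total / count                      # mean = sum / count
std_error = std_dev / math.sqrt(count)    # standard error = SD / sqrt(n)
value_range = maximum - minimum           # range = max - min

print(f"mean           = {mean:.2f}")
print(f"standard error = {std_error:.4f}")
print(f"range          = {value_range}")
print(f"sqrt(variance) = {math.sqrt(variance):.4f}")
```

Each printed value can be compared with the corresponding row of the table; agreement suggests the summary is internally consistent, although it says nothing about how representative five responses are.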
Section 3
Next we consider dataset 2 and look at the variable Dwelling Type. The counts of flats and houses in each suburb are as follows:
SUBURB | Flat | House | Grand Total
AUBURN | 60 | 13 | 73
PARRAMATTA | 147 | 9 | 156
RANDWICK | 101 | 3 | 104
SYDNEY | 166 | 1 | 167
Grand Total | 474 | 26 | 500
We now test the hypothesis that houses are chosen by less than 10% of students. The sample proportion of students in houses is p̂ = 26/500 = 0.052.
Ho: p = 0.1
H1: p < 0.1 (left-tail test)
Z test value = (0.052 - 0.1)/SE, where SE = (0.052 × 0.948/500)^0.5 = 0.00993
Test value = -0.048/0.00993 = -4.834. At the 5% significance level, the critical z value is -1.645. The p-value is P(z < -4.834) ≈ 0; as the p-value is below 0.05, we reject the null hypothesis. There is statistical evidence that the proportion of houses is less than 0.1. (Using the null value 0.1 in the standard error instead gives z ≈ -3.58, and the conclusion is unchanged.)
This conclusion supports, in a statistical sense, the pattern shown in the table above. The high share of flats is unlikely to be a matter of luck or a sampling problem. The low share of houses appears systematic, and may have deeper reasons which we are unable to examine in this assignment.
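The one-proportion z test above can be reproduced in a few lines. The following is a minimal sketch, assuming the counts from the dwelling-type table (26 houses out of 500) and the null value 0.1. It computes the statistic two ways: with the standard error based on the sample proportion, as in the text, and with the standard error based on the null value, which is the more common textbook convention. The helper `normal_cdf` is written here for illustration and is not taken from the original analysis.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

houses, n = 26, 500      # counts from the dwelling-type table
p0 = 0.10                # proportion stated in the null hypothesis
p_hat = houses / n       # sample proportion

# Variant 1: standard error from the sample proportion (as used in the text).
se_sample = math.sqrt(p_hat * (1 - p_hat) / n)
z_sample = (p_hat - p0) / se_sample

# Variant 2: standard error from the null value p0 (textbook convention).
se_null = math.sqrt(p0 * (1 - p0) / n)
z_null = (p_hat - p0) / se_null

critical_z = -1.645      # left-tail critical value at the 5% level

for label, z in [("SE from sample proportion", z_sample), ("SE from null value", z_null)]:
    p_value = normal_cdf(z)          # left-tail p-value
    reject = z < critical_z
    print(f"{label}: z = {z:.3f}, p-value = {p_value:.6f}, reject Ho: {reject}")
```

Both variants fall well below the critical value, so the decision to reject the null hypothesis does not depend on which standard error is used.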
Section 4
Next we consider only dwellings with 2 bedrooms, flats and houses alike. To compare them we group them by suburb and use the mean weekly rent as the comparison metric. The table shows that Auburn has the lowest average rent of $404.67, while Sydney is at the other extreme at $838.04. A visual description is shown for easier comparison.
Suburb | Average Weekly Rent ($)
AUBURN | 404.67
PARRAMATTA | 461.31
RANDWICK | 618.04
SYDNEY | 838.04
Once again, as we tested the statistical significance of the proportion of houses, we can check whether the differences in average rent are a matter of luck or sample design, or are systematic. For this we use a one-way ANOVA test.
The null hypothesis is
Ho: µ1 = µ2 = µ3 = µ4 (1, 2, 3, 4 refer to the suburbs)
The alternative hypothesis is
H1: at least one of µ1, µ2, µ3, µ4 differs from the others
Using the ANOVA function in Excel we get the following table. From the F test, the p-value is approximately zero, as P(F > 456.9) ≈ 0. This implies that the differences in mean rent across suburbs are statistically significant. Accordingly, a student choosing a suburb should treat these differences in mean rent as real rather than as sampling noise.
Source of Variation | SS | df | MS | F
Between Groups | 7640126.16 | 3 | 2546709 | 456.9565
Within Groups | 1616227.32 | 290 | 5573.198 |
Total | 9256353.49 | 293 | |
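The mean squares and the F statistic in the ANOVA table follow directly from the sums of squares and degrees of freedom. The sketch below recomputes them from the reported values only, assuming a one-way layout with the four suburbs as groups; it does not use the underlying rent observations, and the variable names are illustrative.

```python
# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_between, df_between = 7640126.16, 3
ss_within, df_within = 1616227.32, 290

# Mean square = sum of squares / degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F statistic = between-group mean square / within-group mean square.
f_stat = ms_between / ms_within

# In a one-way ANOVA, groups = df_between + 1 and observations = df_total + 1.
groups = df_between + 1
observations = df_between + df_within + 1

print(f"MS between = {ms_between:.2f}")
print(f"MS within  = {ms_within:.3f}")
print(f"F          = {f_stat:.4f}")
print(f"groups = {groups}, observations = {observations}")
```

Under the one-way assumption, the degrees of freedom imply four groups and 294 two-bedroom dwellings in the comparison.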
Section 5
Lastly, we consider the relation between two quantitative variables, Weekly Rent and Bond Amount, in a scatterplot.
We can see that most data points lie on the regression line (red line) or very close to it. This indicates a very strong association, with virtually no outliers. The coefficient of determination is 0.985, so the correlation coefficient is 0.985^0.5 = 0.992, which is extremely high. The bond amount can therefore act as a very good and reliable guide to the rent: properties with lower bond amounts are likely to have lower rents.
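For a simple linear regression with one predictor, the correlation coefficient is the square root of the coefficient of determination, with the sign of the slope. The sketch below applies this to the reported value of 0.985, assuming a positive slope since lower bonds go with lower rents. It also computes the bond-to-rent ratio for the five snapshot rows shown in Section 1; those five rows are only a small extract, so the ratio is illustrative and is not a claim about all 500 observations.

```python
import math

# Coefficient of determination reported for Weekly Rent vs Bond Amount.
r_squared = 0.985

# With a single predictor, |r| = sqrt(R^2); the sign is assumed positive here.
r = math.sqrt(r_squared)
print(f"correlation coefficient = {r:.4f}")

# (bond amount, weekly rent) pairs from the five-row snapshot in Section 1.
snapshot = [
    (2700, 675),
    (3000, 750),
    (1540, 385),
    (2360, 590),
    (2600, 650),
]

# Ratio of bond to weekly rent for each snapshot row.
ratios = [bond / rent for bond, rent in snapshot]
for (bond, rent), ratio in zip(snapshot, ratios):
    print(f"bond ${bond}, rent ${rent}: bond / rent = {ratio:.2f}")
```

In these five rows the bond equals exactly four weeks of rent, which is consistent with the near-linear pattern described above.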
Section 6
To conclude, the secondary data is richer and more useful. It can still be validated using the primary data. However, we need more details on the primary data to make it more comparable with the secondary data.
The data can be improved by including more parameters that affect rents; the size of the dwelling in square feet, shared or single occupancy, and provision of a kitchen are some examples.