STATISTICAL INFERENCE
In this part we will use the Summerton sales as a basis
for predicting property values in that suburb.
See analyzing and presenting data
Predicted sale prices are used here not so much as a valuation method
(as this is not acceptable to the courts or industry) but as a
description of possible trends in this locality. That is, as in the
previous part, sales data can be analyzed and used to predict general
values in, for example, an "investment report".
CONFIDENCE INTERVALS
The standard deviation can be used to indicate what percentage of the
sample of a population may be expected to fall within selected
confidence intervals. As the diagram below shows, about 68.26% of the
sample of the population will generally fall within plus or minus one
standard deviation from the mean, assuming that the data approximates a
normal distribution.
Normally at least 30 random sales are required to confidently state
that the sample is representative of the population. For Summerton we
have only 19 sales, which are skewed, but for the purposes of this part
we will assume that the sample approximates a normal curve and is
representative of the population (Summerton houses).
Assuming the sales data in Summerton approximates a normal
distribution, about 68.26% of the sales will fall between the mean - 1
standard deviation and the mean + 1 standard deviation, about 95.44%
should fall within 2 standard deviations either side of the mean and
about 99.74% should fall within 3 standard deviations either side of
the mean (see diagram above).
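The percentages quoted above can be checked numerically. The following is a minimal sketch, assuming an exactly normal distribution; the function name `coverage_within` is illustrative only. The computed values agree with the table-based figures in the text to within about 0.01 of a percentage point.

```python
import math


def coverage_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    # For a normal variable, P(|X - mean| <= k * SD) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} SD: {coverage_within(k) * 100:.2f}%")
```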
STATISTICAL INFERENCE
Past sale prices and rents can be used to predict future prices, rents,
and values.
EXAMPLE
What percentage of sales fall within the range of 10 (000) on either
side of the mean (598.4) for the Summerton sales?
Using the Z score formula:
Z = R/STD = 10/325.5 = 0.0307
Where:
R = required range (10)
STD = standard deviation
The Z score shows that 608.4 and 588.4 each deviate from the mean by
0.0307 standard deviations. The percentage is found by referring to the
diagram below, which shows a value of about 0.012. Therefore, about
1.2% of sales lie between the mean and 608.4, and about 2.4% lie
between 588.4 and 608.4.
PROBABILITY USING Z VALUES
The probability of a selected sale falling within a given range can be
found with the above formula. For a range of 5(000) either side of the
mean:
Z = 5/325.5 = 0.0154
See Z values – table
The Z value table shows that a Z value of 0.0154 corresponds to about
0.006 (by interpolation). Therefore, there is about a 0.6% chance that
the sale will fall within the range 5(000) above the mean, or about a
1.2% chance that it will fall between 593.4 and 603.4.
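The two worked examples above (ranges of 10 and 5 either side of the mean) can be reproduced without a printed Z table. This is a minimal sketch, assuming the sales are normally distributed with the standard deviation given in the text; the helper name `share_within` is illustrative only.

```python
from statistics import NormalDist

STD = 325.5  # Summerton standard deviation in thousands, from the text


def share_within(r: float, std: float = STD) -> float:
    """Share of a normal population within r either side of the mean."""
    z = r / std  # Z = R / STD
    # Area between the mean and z, doubled to cover both sides of the mean.
    return 2 * (NormalDist().cdf(z) - 0.5)


for r in (10, 5):
    print(f"R = {r}: Z = {r / STD:.4f}, share = {share_within(r) * 100:.2f}%")
```

The results are close to the roughly 2.4% and 1.2% read from the table; small differences come from rounding the table values.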
CONFIDENCE LEVELS
For a number of statistical analyses a 95% confidence level is
required. From the previous calculations we can state with a 95% degree
of confidence that a sale will fall within 1.96 standard deviations
either side of the mean (see diagram above and Z value table).
That is, 1.96 * 325.5 = 638 either side of the mean. However, such
statements depend on how accurately the estimated mean represents the
population mean. Regardless of the size of the population there is a
specific sample size that will permit a certain level of confidence in
the estimated mean.
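The 1.96 multiplier and the resulting range can be derived rather than looked up. The following is a minimal sketch, assuming a normal distribution and a two-sided 95% level; the variable names are illustrative only.

```python
from statistics import NormalDist

STD = 325.5        # Summerton standard deviation in thousands, from the text
confidence = 0.95  # two-sided confidence level

# Two-sided critical value: leave (1 - confidence) / 2 in each tail.
z = NormalDist().inv_cdf(0.5 + confidence / 2)
half_width = z * STD

print(f"z = {z:.2f}, range = {half_width:.0f} either side of the mean")
```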
NECESSARY SAMPLE SIZE
The necessary sample size can be calculated with the following formula:
n = (Z^2 * STD^2)/e^2
Where:
n = the sample size required
Z = Z value at the required degree of confidence, e.g. 95%
STD = standard deviation
e = range from the mean.
EXAMPLE
Determine the sample size required from Summerton for the valuer to be
95% confident that the true mean is within +/-10(000) of the estimated
mean of 598.4(000). That is, between 588.4 and 608.4(000):
n = (1.96^2 * 325.5^2)/10^2 = (3.842 * 105950)/100 = 407061/100 = 4071
Therefore, with only 19 sales, the Summerton sample is well short of
the number required for the valuer to be 95% confident that the sample
mean lies within 10(000) of the population mean. Note that this
confidence limit is at variance with standard valuation practice,
where commonly a few comparable sales meeting the rigorous standards of
the willing buyer-willing seller theory will provide extremely reliable
evidence of market value.
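The sample size calculation above can be sketched as a small function. This assumes the formula n = (Z^2 * STD^2)/e^2 from the text, with the result rounded up to a whole number of sales; the function name `required_sample_size` is illustrative only.

```python
import math
from statistics import NormalDist


def required_sample_size(std: float, e: float, confidence: float = 0.95) -> int:
    """Sales needed for the true mean to lie within +/- e at the given confidence."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    # n = (Z^2 * STD^2) / e^2, rounded up because n must be a whole number.
    return math.ceil((z * std / e) ** 2)


print(required_sample_size(std=325.5, e=10))  # 4071
```

Because e is squared in the denominator, halving the acceptable range roughly quadruples the number of sales required.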
SCATTERPLOTS
Scatterplots are useful devices for determining relationships between
variables. In valuation work there are a number of variables which
affect the value of real estate and which can be shown to correlate
with market value.
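A scatterplot is usually read alongside a correlation coefficient. The following is a minimal sketch of Pearson's r; the distances and prices are hypothetical figures chosen for illustration, not the Summerton sales, and the function name `pearson_r` is an assumption.

```python
import math

# Hypothetical illustration only: these are NOT the Summerton sales.
distance_km = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
price_000 = [720, 690, 640, 610, 560, 530]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(distance_km, price_000)
print(f"r = {r:.3f}")
```

A value of r near -1 indicates a strong inverse relationship, a value near +1 a strong direct relationship, and a value near 0 no linear relationship.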
EXAMPLE
The 19 sales in
Summerton are plotted against distance from the local railway station.
The following scatterplot results:
SALE PRICE VERSUS DISTANCE FROM RAILWAY STATION
A visual inspection of the scatterplot above shows a reasonable inverse
correlation between sale prices and distance from the local railway
station. On the other hand, the scatterplot below shows no discernible
pattern, and there would appear to be no correlation between sale price
and distance from the railway station:
OUTLIERS
There are 2 or 3 outliers shown on the scatterplot. Outliers are most
important: they either show an error in the sample or its application,
or indicate an interesting new variable which should be examined. Valid
outliers require further investigation.
Upon investigation it is found that the prices of the outliers had held
up so well, despite the distance from the local railway station,
because they fall inside the commuting area of the neighbouring railway
station. Therefore, the plot would support the hypothesis that sale
price falls with distance from a railway station.
TIME SERIES
Values and rents can be traced over time to ascertain a trend and for
prediction. Although cyclical theory has been discredited for land
values, the "boom bust" pattern can be discerned over time.
The above time series shows office rents for a particular type of
office block in Sydney over a period of 5 years. The series can be made
into a "control chart" by including "upper" and "lower" control limits,
which are usually set 2 standard deviations either side of the mean.
These are shown on the Z value table, and those values outside the
control limits are treated as outliers.
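The control limits described above can be sketched as follows. The rent figures are hypothetical and are not the Sydney series. Using the sample standard deviation (`statistics.stdev`) is an assumption, since the text does not say which estimator is used. The function names are illustrative only.

```python
from statistics import mean, stdev

# Hypothetical rents for illustration only; NOT the Sydney office series.
rents = [300, 305, 298, 310, 302, 307, 299, 304, 380, 301]


def control_limits(values, k=2.0):
    """Lower and upper control limits at k standard deviations from the mean."""
    centre = mean(values)
    sd = stdev(values)  # assumption: sample standard deviation
    return centre - k * sd, centre + k * sd


def outside_limits(values, k=2.0):
    """(index, value) pairs that fall outside the control limits."""
    lower, upper = control_limits(values, k)
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


print(control_limits(rents))
print(outside_limits(rents))  # [(8, 380)]
```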
Often such plots need smoothing to ascertain some underlying trend.
This can be done, for example, by using a running mean of 3, which
means each data point is replaced by the mean of that point and its two
neighbouring points.
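The smoothing described above, in which each point becomes the mean of itself and its two neighbours, can be sketched as below. Leaving the first and last points unchanged is an assumption, since the text does not say how end points are treated. The function name `running_mean_3` is illustrative only.

```python
def running_mean_3(values):
    """Smooth a series with a centred running mean over 3 points."""
    if len(values) < 3:
        return list(values)
    smoothed = [values[0]]  # assumption: end points are left as they are
    for i in range(1, len(values) - 1):
        smoothed.append((values[i - 1] + values[i] + values[i + 1]) / 3)
    smoothed.append(values[-1])
    return smoothed


print(running_mean_3([3, 6, 9, 3, 6]))  # [3, 6.0, 6.0, 6.0, 6]
```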
"t" DISTRIBUTION
As sample sizes decrease, the sampling distribution of their means
becomes flatter in the middle and has relatively more area in its
tails than the normal curve. Such a distribution is known as the "t"
distribution or "Student's" distribution. The diagram below compares
the normal curve A with two "t" distributions, B and C:
THE "t" DISTRIBUTION VERSUS THE NORMAL CURVE
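The extra area in the tails can be illustrated numerically. This is a minimal sketch comparing the density of the "t" distribution with the standard normal curve at 3 standard units from the centre. The choice of 4 degrees of freedom is an arbitrary assumption; 18 corresponds to a sample of 19 sales (n - 1). The function names are illustrative only.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def normal_pdf(x: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


x = 3.0
print(f"normal:    {normal_pdf(x):.4f}")
for df in (18, 4):
    print(f"t, df={df:>2}: {t_pdf(x, df):.4f}")
```

The fewer the degrees of freedom, the higher the density in the tail, which is why small samples call for wider confidence limits than the Z values suggest.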