THE SIGN TEST

Where the data do not meet the basic parametric assumptions of the t test, we can use the sign test. The sign test simply counts the matched pairs in which the member of the first group exceeds its partner and compares this with the number of pairs in which the member of the second group exceeds its partner.

EXAMPLE

Cottage values in two suburbs were subdivided into 10 different architectural styles, and the average value for each style was matched between the 2 suburbs as follows. After adjusting each suburb's housing stock according to style, is there a difference in value between the 2 suburbs?

AVERAGE VALUE PER ARCHITECTURAL STYLE

SUBURB A	SUBURB B	SIGN (A-B)
21	16	+
10	14	-
14	8	+
21	13	+
28	10	+
19	19	0
14	17	-
12	11	+
11	13	-
18	18	0
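
The sign column can be checked mechanically. The following is a minimal sketch, assuming the two columns are typed in as Python lists; the names suburb_a, suburb_b and sign are illustrative and not part of the original material.

```python
# Average cottage values per architectural style, copied from the table above.
suburb_a = [21, 10, 14, 21, 28, 19, 14, 12, 11, 18]
suburb_b = [16, 14, 8, 13, 10, 19, 17, 11, 13, 18]


def sign(a, b):
    """Return '+', '-' or '0' for the difference a - b."""
    if a > b:
        return "+"
    if a < b:
        return "-"
    return "0"


# One sign per matched pair, in table order.
signs = [sign(a, b) for a, b in zip(suburb_a, suburb_b)]

pluses = signs.count("+")
minuses = signs.count("-")
zeros = signs.count("0")

print(signs)
print(pluses, minuses, zeros)  # 5 3 2
```

Only the counts of pluses and minuses are carried forward into the test; the size of each difference is ignored.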

If the two groups did not differ, the pluses and minuses would be randomly distributed around a median difference of 0. The null hypothesis is therefore that the median difference = 0. If there are considerably more of one sign than the other, the differences are unlikely to be random and the hypothesis of no difference between the 2 groups must be rejected.

TESTING WITH <=10 PAIRS

When there are 10 or fewer cases, H0 is tested by use of the binomial expansion with a probability of 0.5 and n equal to the number of non-zero pairs observed. In the above table there are 10 matched pairs, 5 of which are pluses, 3 minuses and 2 zeros. The zeros are disregarded, so n=8. By chance we would expect 4 pluses and 4 minuses from the 8 non-zero pairs. We find the probability of getting 5 pluses in a binomial expansion by reading the values from Pascal's triangle.

See Pascal's triangle below.

The line in Pascal's triangle where n=8 reads 1,8,28,56,70,56,28,8,1, which sums to 256 equally likely outcomes. The middle value (70) corresponds to 4 pluses out of 8, the result expected when p=.50. Moving to the right, 56 is the number of times out of 256 that we would expect to get 5 pluses out of 8, 28 times we would get 6/8, 8 times we would get 7/8 and 1 time we would get 8/8.

Therefore, the probability of getting 5 or more pluses out of 8 = (56+28+8+1)/256 = 93/256 = 0.36.

At the .05 level, 0.36 is not significant. We therefore cannot reject the null hypothesis: the 5 pluses out of 8 could easily have occurred by chance alone.
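
The same tail probability can be computed directly rather than read from the triangle. This is a minimal sketch using Python's standard math.comb; the helper name upper_tail is hypothetical, and it gives the one-tailed probability of x or more pluses only.

```python
from math import comb


def upper_tail(x, n):
    """P(X >= x) when X counts pluses among n non-zero pairs and p = 0.5."""
    # Sum the Pascal's-triangle entries for x, x+1, ..., n pluses.
    favourable = sum(comb(n, k) for k in range(x, n + 1))
    # Every one of the 2**n sign patterns is equally likely under H0.
    return favourable / 2 ** n


p_one_tailed = upper_tail(5, 8)
print(round(p_one_tailed, 2))  # 0.36
```

Because the distribution is symmetric at p = 0.5, a 2-tailed probability can be obtained by doubling this value and capping the result at 1.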

PASCAL'S TRIANGLE
n	FREQUENCIES OF COMBINATIONS	SUM
1	1 1	2
2	1 2 1	4
3	1 3 3 1	8
4	1 4 6 4 1	16
5	1 5 10 10 5 1	32
6	1 6 15 20 15 6 1	64
7	1 7 21 35 35 21 7 1	128
8	1 8 28 56 70 56 28 8 1	256
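
For values of n not listed, each row can be built from the one before it, since every inner entry is the sum of the two entries above it. The sketch below assumes nothing beyond that rule; the function name pascal_row is illustrative.

```python
def pascal_row(n):
    """Return row n of Pascal's triangle as a list of n + 1 frequencies."""
    row = [1]  # row 0
    for _ in range(n):
        # Each inner entry is the sum of the two adjacent entries above it.
        row = [1] + [a + b for a, b in zip(row, row[1:])] + [1]
    return row


for n in range(1, 9):
    row = pascal_row(n)
    print(n, row, sum(row))
```

The final column printed is the row sum, which equals 2 to the power n and serves as the denominator when converting frequencies to probabilities.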

TESTING WITH >10 PAIRS

For more than 10 pairs the normal curve can be used as an approximation of the probabilities. The necessary Z scores are found with a mean of 0.5n and a standard deviation of the square root of n(0.25), where n is the number of non-zero pairs. The Z score for a given number of pluses (X) is:

Z = ((X ± 0.5) - 0.5n) / sqrt(n(0.25))

If the number of pluses > 0.5n, use X-0.5; if the number of pluses < 0.5n, use X+0.5 in computing Z. This continuity correction allows for the fact that the counts are discrete while the normal curve is continuous. Having computed the Z score, the valuer can consult the table of areas under the normal curve (see Z value table) to determine the proportion of samples that would have more pluses than our obtained number. Either a 1- or 2-tailed test may be applied, depending on the nature of the hypothesis.
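
The Z calculation can be sketched as follows. The figures of 15 pluses out of 20 non-zero pairs are hypothetical, chosen only to illustrate a case with more than 10 pairs; they do not come from the suburb example. The function name sign_test_z is likewise illustrative, and statistics.NormalDist from the Python standard library stands in for the printed table of areas.

```python
from math import sqrt
from statistics import NormalDist


def sign_test_z(x, n):
    """Z score for x pluses among n non-zero pairs, continuity-corrected."""
    mean = 0.5 * n
    sd = sqrt(0.25 * n)
    if x > mean:
        corrected = x - 0.5  # more pluses than expected: move toward the mean
    elif x < mean:
        corrected = x + 0.5  # fewer pluses than expected: move toward the mean
    else:
        corrected = x        # exactly at the mean: no correction applied
    return (corrected - mean) / sd


# Hypothetical illustration: 15 pluses among 20 non-zero pairs.
z = sign_test_z(15, 20)
p_one = 1 - NormalDist().cdf(z)             # area above z (1-tailed)
p_two = 2 * (1 - NormalDist().cdf(abs(z)))  # both tails (2-tailed)

print(round(z, 2))  # 2.01
print(p_one, p_two)
```

The 1-tailed value answers whether there are more pluses than chance would give; the 2-tailed value answers whether the counts differ in either direction.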