Statistical Applications

Ranks are useful for investigating the distribution of values for a variable. Dividing the ranks by n or n+1 gives fractional ranks in the range 0 to 1, which estimate the cumulative distribution function. You can apply inverse cumulative distribution functions to these fractional ranks to obtain probability quantile scores, which you can compare to the original values to judge the fit to the distribution.

For example, if a set of data has a normal distribution, the normal scores should be a linear function of the original values, and a plot of scores versus original values should be a straight line.
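The normal-score check described above can be sketched in a few lines of Python. This is an illustration only, not PROC RANK itself: the function names `normal_scores` and `pearson` are hypothetical, the sample data is made up, and the fractional rank uses the Blom adjustment (rank - 3/8) / (n + 1/4), which is an assumption about which formula to use. The sketch assumes there are no ties and no missing values.

```python
from statistics import NormalDist

def normal_scores(values):
    """Blom-type normal scores; assumes no ties and no missing values."""
    n = len(values)
    # Indices of the observations, ordered from smallest to largest value.
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # Assumed Blom adjustment keeps the fractional rank strictly in (0, 1).
        fractional_rank = (rank - 0.375) / (n + 0.25)
        # Inverse normal CDF turns the fractional rank into a quantile score.
        scores[i] = NormalDist().inv_cdf(fractional_rank)
    return scores

def pearson(x, y):
    """Plain Pearson correlation, used here as a rough linearity measure."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Made-up sample that is roughly bell shaped.
data = [4.1, 5.3, 4.8, 6.0, 5.1, 4.5, 5.6, 4.9]
scores = normal_scores(data)
r = pearson(data, scores)
print(f"correlation between values and normal scores: {r:.3f}")
```

A correlation close to 1 corresponds to a nearly straight line in the plot of scores versus original values. A visibly curved plot, or a much lower correlation, suggests a departure from normality.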

Many nonparametric methods are based on analyzing ranks of a variable: A two-sample t-test applied to the ranks is equivalent to a Wilcoxon rank sum test using the t approximation for the significance level. If you apply the t-test to the normal scores rather than to the ranks, the test is equivalent to the van der Waerden test.

A one-way analysis of variance applied to ranks is equivalent to the Kruskal-Wallis k-sample test; the F-test generated by the parametric procedure applied to the ranks is often better than the chi-square approximation used by Kruskal-Wallis.
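One way to see the equivalence is that, without ties, the F statistic from a one-way analysis of variance on the ranks is a fixed increasing function of the Kruskal-Wallis statistic H: F = (H / (k - 1)) / ((N - 1 - H) / (N - k)), where N is the total sample size and k is the number of groups. The following Python sketch checks this identity numerically. The helper names and the data are hypothetical, and the code assumes no tied values.

```python
def pooled_ranks(groups):
    """Rank all observations together (1 = smallest); assumes no ties."""
    pooled = [x for g in groups for x in g]
    order = sorted(range(len(pooled)), key=lambda i: pooled[i])
    ranks = [0] * len(pooled)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    # Split the ranks back into the original groups.
    out, start = [], 0
    for g in groups:
        out.append(ranks[start:start + len(g)])
        start += len(g)
    return out

def kruskal_wallis_h(rank_groups):
    """Kruskal-Wallis H computed from group rank sums (no tie correction)."""
    n = sum(len(g) for g in rank_groups)
    total = sum(sum(g) ** 2 / len(g) for g in rank_groups)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)

def anova_f(rank_groups):
    """Ordinary one-way ANOVA F statistic, applied to the ranks."""
    n = sum(len(g) for g in rank_groups)
    k = len(rank_groups)
    grand_mean = sum(sum(g) for g in rank_groups) / n
    means = [sum(g) / len(g) for g in rank_groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(rank_groups, means))
    ss_within = sum((r - m) ** 2
                    for g, m in zip(rank_groups, means) for r in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Made-up data: three groups, all values distinct.
groups = [[2.1, 3.4, 1.9, 4.0], [5.2, 4.8, 6.1, 3.9], [7.0, 6.5, 5.9, 8.2]]
rank_groups = pooled_ranks(groups)
n = sum(len(g) for g in groups)
k = len(groups)
h = kruskal_wallis_h(rank_groups)
f = anova_f(rank_groups)
f_from_h = (h / (k - 1)) / ((n - 1 - h) / (n - k))
print(h, f, f_from_h)
```

Because F increases with H, the two tests order samples in the same way. They differ only in the reference distribution used to obtain the significance level.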

This test can be extended to other rank scores (Quade).

You can investigate regression relationships by using rank transformations with a method described by Iman and Conover.

Because tied values (equal values of an analysis variable within a BY group) are indistinguishable, and there is usually no further obvious information on which the ranks can reasonably be based, PROC RANK does not assign different ranks to them.

Tied values could be arbitrarily assigned different ranks. But in statistical applications such as nonparametric statistical tests employing ranks, it is conventional to assign the same rank to tied values. These statistical tests commonly assume that the data is from a continuous distribution, in which the probability of a tie is theoretically zero.

In practice, whether because of inaccuracies in measurement, the finite accuracy of representation within a digital computer, or other reasons, tied values often occur. It is also conventional in these statistical tests to assign the average rank to a group of tied values.

Assignment of the average rank is preferred because it preserves the sum of the ranks and, therefore, does not distort the estimate of the cumulative distribution function.
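The sum-preserving property is easy to check on a small example. The sketch below is plain Python, not PROC RANK. The function `rank_with_ties` and its `ties` argument are hypothetical names chosen for illustration, and the data is made up.

```python
def rank_with_ties(values, ties="mean"):
    """Rank values (1 = smallest), giving every tied value the same rank.

    ties="low" uses the smallest ordinal in a tied group, ties="high" the
    largest, and ties="mean" the average of the group's ordinals.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        # Find the end of the run of values equal to the one at pos.
        end = pos
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        low, high = pos + 1, end + 1  # ordinals covered by this tied group
        if ties == "low":
            rank = low
        elif ties == "high":
            rank = high
        else:
            rank = (low + high) / 2
        for j in range(pos, end + 1):
            ranks[order[j]] = rank
        pos = end + 1
    return ranks

data = [10, 20, 20, 30, 30, 30]
n = len(data)
mean_ranks = rank_with_ties(data, "mean")
low_ranks = rank_with_ties(data, "low")
high_ranks = rank_with_ties(data, "high")
print(mean_ranks, sum(mean_ranks), n * (n + 1) / 2)
print(low_ranks, sum(low_ranks))
print(high_ranks, sum(high_ranks))
```

With average ranks the sum equals n(n+1)/2, the same total as for untied data. Taking the lowest or highest rank in each tied group shifts the total down or up.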

The TIES= option of the PROC RANK statement specifies which rank to report for tied values. The default value for this option depends on the specified ranking or scoring method, which you also specify with options of the PROC RANK statement.

When ties are resolved to the low, high, or mean value, the ranking and scoring methods all begin by sorting the values of the analysis variable within a BY group, and then assigning to each nonmissing value an ordinal number that indicates its position in the sequence.

PROC RANK then obtains the rank from this ordinal through one or more further transformations such as scaling, translation, and truncation. When scores are computed, PROC RANK resolves tied values by selecting the minimum, selecting the maximum, or calculating the average of all scores within a tied group.

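With dense ranking, however, a group of tied values is treated as a single value. The ordinal assigned to the group is exactly one more than the ordinal assigned to the value just before the group, if there is one, and exactly one less than the ordinal assigned to the value just after the group, if there is one.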

Therefore, the smallest ordinal within a BY group is 1, and the largest ordinal is the number of unique, nonmissing values in the BY group. After the ordinals are assigned, PROC RANK calculates ranks and scores using the number of unique, nonmissing values instead of the number of nonmissing values for scaling.

Because of its tendency to distort the cumulative distribution function estimate, dense ranking is not generally acceptable for use in nonparametric statistical tests.
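The dense ordinals and the resulting distortion can be illustrated with a short Python sketch. The function `dense_ordinals` is a hypothetical helper, `None` stands in for a missing value, and dividing by the number of distinct values is only one simple reading of the scaling described above, not a statement of the exact PROC RANK formula.

```python
def dense_ordinals(values):
    """Dense ordinals: each distinct nonmissing value gets the next integer.

    None is treated as missing and is left unranked.
    """
    distinct = sorted({v for v in values if v is not None})
    ordinal = {v: i for i, v in enumerate(distinct, start=1)}
    return [ordinal[v] if v is not None else None for v in values]

data = [10, 20, 20, None, 30, 30, 30]
dense = dense_ordinals(data)

nonmissing = [v for v in data if v is not None]
n_unique = len(set(nonmissing))

# Dense fractional rank: ordinal scaled by the number of distinct values.
dense_fraction = [None if d is None else d / n_unique for d in dense]

# Empirical CDF: proportion of nonmissing observations <= each value.
ecdf = [None if v is None
        else sum(1 for w in nonmissing if w <= v) / len(nonmissing)
        for v in data]

print(dense)
print(dense_fraction)
print(ecdf)
```

In this example the value 20 has a dense fractional rank of 2/3, although only half of the nonmissing observations are less than or equal to 20. Dense ranking ignores how many observations share each value, which is the distortion of the cumulative distribution function estimate noted above.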
