When creating a new evaluation round in Evaluation > Rounds, you can turn on the Enable score normalization option. Normalization applies a mathematical formula so that every entry is scored against the same standard, regardless of how many judges evaluate the program submissions. It relies on two statistical measures of a judge's scores: the mean and the standard deviation. Each judge's scores are first rescaled to have a mean of 0 and a standard deviation of 1.
Let's assume a judge has given scores with values x1, x2, x3, …, xn. The system starts the score normalization by calculating the mean of that judge's votes:
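Assuming this is the ordinary arithmetic mean (the symbol μ is chosen here for illustration), it can be written as:

```latex
\mu = \frac{1}{n} \sum_{i=1}^{n} x_i
```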
It then calculates the standard deviation of the same votes:
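A common form is the population standard deviation shown here, where μ is the judge's mean score. This is an assumption: the text does not say whether the system divides by n or by n - 1 (the sample form), and the symbol σ is chosen for illustration.

```latex
\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2}
```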
Then it rescales each vote with the following formula:
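Rescaling to a mean of 0 and a standard deviation of 1 corresponds to the standard score (z-score). Writing μ for the judge's mean and σ for the judge's standard deviation (symbol names assumed here), each vote x_i becomes:

```latex
z_i = \frac{x_i - \mu}{\sigma}
```

For example, a judge whose votes are 70, 80 and 90 has a mean of 80 and a population standard deviation of about 8.16, so the vote of 90 becomes roughly (90 - 80) / 8.16 ≈ 1.22.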
Once the scores are normalized, the system calculates the mean normalized score for each application. These means, however, fall outside the expected scoring range (for example, 0 to 100). To solve this, we take the minimum and maximum mean scores across all applications before normalization, then shift and scale the normalized mean scores so that they retain the same minimum and maximum.
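The procedure described so far can be sketched in Python. This is an illustrative sketch, not the system's actual implementation: the function names, the data layout (a dictionary of judges mapping to their per-application scores), the use of the population standard deviation, and the handling of the zero-spread cases are all assumptions.

```python
from statistics import mean, pstdev


def normalize_judge(scores):
    """Rescale one judge's scores to mean 0 and standard deviation 1."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # Assumption: identical scores carry no ranking signal, so map to 0.
        return {app: 0.0 for app in scores}
    return {app: (x - mu) / sigma for app, x in scores.items()}


def mean_per_application(votes):
    """Average the scores each application received across its judges."""
    collected = {}
    for judge_scores in votes.values():
        for app, x in judge_scores.items():
            collected.setdefault(app, []).append(x)
    return {app: mean(xs) for app, xs in collected.items()}


def fit_to_original_range(normalized_means, original_means):
    """Shift and scale normalized means to the original min and max means."""
    s_min, s_max = min(original_means.values()), max(original_means.values())
    n_min, n_max = min(normalized_means.values()), max(normalized_means.values())
    if n_max == n_min:
        # Assumption: with no spread to rescale, keep the original means.
        return dict(original_means)
    return {
        app: (x - n_min) / (n_max - n_min) * (s_max - s_min) + s_min
        for app, x in normalized_means.items()
    }


def normalize_round(votes):
    """Run the whole procedure: normalize per judge, average, refit."""
    original_means = mean_per_application(votes)
    normalized_votes = {j: normalize_judge(s) for j, s in votes.items()}
    normalized_means = mean_per_application(normalized_votes)
    return fit_to_original_range(normalized_means, original_means)


# Hypothetical data: a lenient judge and a strict judge with a narrow range.
votes = {
    "judge_a": {"app1": 90, "app2": 80, "app3": 70},
    "judge_b": {"app1": 40, "app2": 50, "app3": 45},
}
final = normalize_round(votes)
print({app: round(score, 2) for app, score in final.items()})
# → {'app1': 61.25, 'app2': 65.0, 'app3': 57.5}
```

In this made-up example, app1 and app2 both have an original mean of 65. After normalization app2 ranks higher, because the strict judge's preference for it now counts as much as the lenient judge's preference for app1. The final scores still span the original range of 57.5 to 65.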
Let S be the set of original mean scores and N the set of normalized mean scores. We denote their minimum and maximum values as follows:
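One way to write these four values, using subscripts for the extremes (the subscript notation is an assumption made here for illustration):

```latex
S_{\min} = \min(S), \quad S_{\max} = \max(S), \quad
N_{\min} = \min(N), \quad N_{\max} = \max(N)
```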
Finally, we apply the following fit function to the normalized mean scores:
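A linear map that sends the smallest normalized mean to the smallest original mean and the largest to the largest has the following form. It is reconstructed from the description above rather than taken from the system itself. Here x is a normalized mean score, and S_min, S_max, N_min and N_max are the minimum and maximum of S and N (assumed notation):

```latex
f(x) = \frac{x - N_{\min}}{N_{\max} - N_{\min}} \, (S_{\max} - S_{\min}) + S_{\min}
```

As a check, f(N_min) = S_min and f(N_max) = S_max, so the final scores span the same range as the original mean scores. The formula is undefined when N_max = N_min, that is, when every application has the same normalized mean.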