January 30th 2019

Scores rebalanced

As of today, existing scores for all reports have been updated to be more representative. Here you can learn exactly what has changed and why.

At a glance

  • Average scores are almost unchanged
  • The range of scores is now much broader, i.e. sites can score both higher and lower than before
  • You don’t need to re-run anything to see the new scores

Improving distribution

Here you can see how often a website was awarded a given overall score:

Distribution of old scores

As you can see, the range of scores is quite narrow. As we’ve added more and more checks to Silktide, each individual issue has been increasingly crowded out by the others, so websites have tended to score more and more similarly.

With our new update, website scores are much more diverse:

Distribution of new scores

Note that although the distribution is broader, the average overall score is almost identical (previously 60%, now 59%).

How scoring has changed

There are several significant changes to how scoring works:

  1. Summary scores are now linear (previously they were fitted to a curve, meaning higher scores were exponentially harder to reach).
  2. The first 50% of the summary score for Content, SEO, and UX is ignored – essentially a score of 50% is equivalent to zero. For Accessibility, the bottom 65% is ignored. The remaining score is stretched to fill 100%, so for Content the final score is essentially (score – 50) * 2 (see the first sketch after this list).
  3. Previously, many page checks were scored based on the percentage of pages with an issue, e.g. the percentage of pages with at least one spelling error. For many checks, we now also consider the number of issues on a page, e.g. having 5 spelling errors on a page is now worse than having just one (see the second sketch after this list). This gives much more gradual scoring, with a more representative range.
  4. We have revised the weights and severities of many tests, with the above criteria in mind.
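To make change 2 concrete, here is a minimal sketch of the linear stretch in Python. The function name and the clamping to zero are illustrative assumptions; only the 50% and 65% baselines come from the change itself:

    def stretch_summary_score(raw_score, baseline):
        # Illustrative sketch only: deduct the ignored baseline and
        # stretch the remainder to fill 0-100. A raw score at or below
        # the baseline becomes 0.
        return max(0.0, (raw_score - baseline) * 100.0 / (100.0 - baseline))

    # Content, SEO, and UX ignore the bottom 50%, i.e. (score - 50) * 2:
    stretch_summary_score(75, baseline=50)    # 50.0
    # Accessibility ignores the bottom 65%:
    stretch_summary_score(82.5, baseline=65)  # 50.0

And here is a sketch of the idea behind change 3. Instead of a page counting all-or-nothing, each page loses credit in proportion to how many issues it has. The cap of 5 issues per page and the exact formula are assumptions for illustration, not our production weights:

    def check_score(issues_per_page, cap=5):
        # Illustrative sketch only: a page with 5 spelling errors now
        # hurts the score more than a page with 1, rather than both
        # counting the same.
        if not issues_per_page:
            return 100.0
        penalties = [min(n, cap) / cap for n in issues_per_page]
        return 100.0 * (1 - sum(penalties) / len(issues_per_page))

    check_score([0, 1, 5])  # 60.0: a clean page, one error, five errors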

Why do we ignore the bottom 50-65% of a given summary score? Having tested millions of websites, we found that no website ever fails more than that proportion of checks – so every site was effectively guaranteed at least half the points, compressing the range. For example, Accessibility includes checks for Ruby tags, which fewer than 0.1% of websites even use. As a result, over 99.9% of websites gain points for a check they couldn’t possibly have failed. Deducting this baseline fixes the problem.

How this affects you

All existing reports that were updated in January now use the new scoring, which has been applied retroactively. There’s no need to re-run your reports to see the changes.

Reports which haven’t been run since January need to be updated before they can take advantage of the new scoring.

Questions? Click on the Chat to us button and we’ll be happy to help.
