For the past few years, Coverfly has helped screenplay contests and coverage providers behind the scenes as a platform for automating and streamlining administration. Today, Coverfly boasts one of the most comprehensive script databases ever seen, with millions of pages of scripts, feedback on those scripts, and evaluation data across every genre and format, from many of the top screenwriting competitions and screenplay coverage services. This database includes over 25,000 amateur and professional scripts, from writers around the world.
Now that writers will have the option of making their data available to the industry, we knew it was important to design a “quality metric.” After meticulously analyzing our database, surveying the industry on what they’d want in such a metric, and looking at the design of scoring/ranking algorithms used by technology platforms in other industries, we came up with our very own: the Coverfly Score.
Before we dive into the details of the algorithm, it’s important to note that the Coverfly Score is not a metric of quality; it’s a metric of confidence in quality, which increases with more strong evaluations.
In designing the Coverfly Score, we required that it satisfy the following criteria:
- A high Coverfly Score requires multiple, high-marking evaluations. High marks alone aren’t enough: one evaluation from a single reader shouldn’t garner a high Coverfly Score. In other words, ten evaluations of 8/10 should rank higher than one evaluation of 10/10. A project needs at least 5 evaluations (from qualifying contest readers or professional coverage services) to reach an acceptable level of confidence in its score, since more evaluations, drawn from a wider range of professional industry readers, paint a clearer picture of how the industry will receive your script. The more evaluations, the better!
- Coverfly Scores never go down. Yup, Coverfly Scores can’t go down. This was a design requirement primarily because many screenplays are works in progress, and we don’t want to discourage writers from submitting their screenplays for feedback for fear of reducing their Coverfly Score. Rather, we wanted to incentivize the opposite: more evaluations on a project, good or bad, help us predict the project’s quality at a higher level of confidence. However, a strong Coverfly Score must be a metric of quality, not quantity, so the next point is critical.
- Attaining a high Coverfly Score is very difficult. An obvious risk that comes with Coverfly Scores that don’t decrease is an entire database of undifferentiated, highly rated scripts, which would defeat the purpose of a quality metric in the first place.
- Coverfly Scores are insulated from reader bias. The average scores given by readers can vary widely from reader to reader, just like movie reviews. The metric design should take that into account and normalize for reader bias.
- Coverfly Scores weigh different scores differently. The more prestigious the contest, the more influence its evaluations will have on a Coverfly Score. Winning The Nicholl Fellowship should boost your Coverfly Score a lot more than winning Joe Schmo’s Weekly Logline Competition.
Here’s a deeper (more mathematical) look at how a Coverfly Score is calculated:
Each project’s Coverfly Score is recalculated any time a new evaluation is entered into the system. Evaluations are typically submitted by readers/judges, who score the project in several categories (e.g. plot, dialogue, voice, concept) on a scale of 0 to 10.
Each category’s score is compared to the reader’s overall average and standard deviation for that category, and the score is shifted to fit a more normally distributed “bell-curve”.
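As a rough illustration of this kind of reader normalization, the sketch below converts a raw category score into a z-score against the reader's own history and re-maps it onto a common scale. It is not Coverfly's actual code: the function name, the target mean of 5, the target standard deviation of 2, and the clamping to 0-10 are all assumptions made for the example.

```python
from statistics import mean, pstdev

def normalize_score(raw, reader_history, target_mean=5.0, target_std=2.0):
    """Re-express one category score relative to the reader's own scoring habits.

    target_mean and target_std are hypothetical values chosen for illustration.
    """
    mu = mean(reader_history)
    sigma = pstdev(reader_history)
    if sigma == 0:
        # The reader always gives the same mark, so there is no spread to compare against.
        return target_mean
    z = (raw - mu) / sigma                # how unusual this mark is for this reader
    shifted = target_mean + z * target_std
    return max(0.0, min(10.0, shifted))   # keep the result on the 0-10 scale

harsh_reader = [3, 4, 5, 4, 4]       # made-up history for this category
generous_reader = [8, 9, 9, 10, 9]   # made-up history for this category

print(normalize_score(5, harsh_reader))     # a 5 from a harsh reader
print(normalize_score(8, generous_reader))  # an 8 from a generous reader
```

With these made-up histories, a 5 from the harsh reader ends up above an 8 from the generous reader, which is the kind of bias correction described above.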
Next, a composite score is determined using the weighted sum model, which applies contest-specific weights to each scoring category. This process yields a single, reader-normalized score between 0 and 10 for the evaluation.
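A minimal sketch of a weighted sum model over scoring categories might look like the following. The category names come from the examples above, but the weights are invented, and dividing by the total weight (so the result stays between 0 and 10) is an assumption about how the weights are scaled.

```python
def composite_score(category_scores, category_weights):
    """Weighted sum model: combine per-category scores into one 0-10 score.

    Weights are divided by their total, which is an assumption for this sketch.
    """
    total_weight = sum(category_weights[c] for c in category_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(category_scores[c] * category_weights[c] for c in category_scores)
    return weighted / total_weight

# Reader-normalized scores for one evaluation (illustrative values).
scores = {"plot": 8.0, "dialogue": 6.0, "voice": 7.0, "concept": 9.0}
# Hypothetical contest-specific weights.
weights = {"plot": 3, "dialogue": 2, "voice": 2, "concept": 3}

print(composite_score(scores, weights))  # (24 + 12 + 14 + 27) / 10, about 7.7
```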
Next, we take every evaluation for a given project and again use the weighted sum model to apply weights to each evaluation based on the quality of the competition that the evaluation came from. We plug this composite score into the following formula:
$$\text{Score} = \frac{x\,p}{10} + x\left(1 - e^{-q/Q}\right)$$
Where p is the composite score (after competition weights have been applied), q is the weighted count of evaluations (the sum of each evaluation’s competition weight), x is the number of evaluations (capped at 5), and Q is a constant we assign based on the importance of “quantity” in our calculation.
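To make the formula concrete, here is a small sketch that evaluates it for two hypothetical projects. Several things here are assumptions rather than published details: the value Q = 5, the use of equal competition weights of 1.0, the reading of q as the sum of evaluation weights, and the reading of p as the weight-averaged composite score. The exponent is read as −q/Q.

```python
import math

def raw_coverfly_score(evaluations, Q=5.0):
    """Evaluate Score = x*p/10 + x*(1 - e^(-q/Q)) before any cap or scaling.

    evaluations: list of (composite_score_0_to_10, competition_weight) pairs.
    Q=5.0 is a made-up constant; the real value is not published in the text.
    """
    if not evaluations:
        return 0.0
    q = sum(w for _, w in evaluations)            # weighted count of evaluations
    if q <= 0:
        raise ValueError("competition weights must sum to a positive value")
    p = sum(s * w for s, w in evaluations) / q    # weight-averaged composite score
    x = min(len(evaluations), 5)                  # number of evaluations, capped at 5
    return x * p / 10 + x * (1 - math.exp(-q / Q))

one_perfect = [(10.0, 1.0)]          # a single 10/10 evaluation
ten_strong = [(8.0, 1.0)] * 10       # ten 8/10 evaluations

print(round(raw_coverfly_score(one_perfect), 2))  # about 1.18
print(round(raw_coverfly_score(ten_strong), 2))   # about 8.32
```

Under these assumptions the formula tops out at 10 (x = 5, p = 10, and a very large q), and the ten 8/10 evaluations comfortably outrank the single 10/10, matching the first design criterion.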
We then cap the score depending on how many evaluations the project has. The cap rises with the number of evaluations and is removed altogether once the project has 5 evaluations. Finally, we multiply the score by a constant to create a wider range of scores and to avoid the false association of a 60 with an F and a 90 with an A that a 1-100 scale would invite.
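The capping and scaling step could be sketched as follows. The cap values and the scaling constant are not given in the text, so the numbers below are purely hypothetical placeholders, and the sketch assumes the cap disappears at exactly 5 evaluations.

```python
# Hypothetical caps on the pre-scaling score, keyed by number of evaluations.
# The real cap values are not published; these only show the increasing pattern.
HYPOTHETICAL_CAPS = {1: 1.0, 2: 2.5, 3: 4.5, 4: 7.0}

# Hypothetical multiplier used to spread scores over a wider range.
HYPOTHETICAL_SCALE = 100.0

def cap_and_scale(raw_score, n_evaluations):
    """Apply the evaluation-count cap, then stretch the score by a constant."""
    if n_evaluations <= 0:
        return 0.0
    cap = HYPOTHETICAL_CAPS.get(n_evaluations)   # None once there are 5 or more
    capped = raw_score if cap is None else min(raw_score, cap)
    return capped * HYPOTHETICAL_SCALE

print(cap_and_scale(1.5, 1))   # held down by the one-evaluation cap
print(cap_and_scale(8.0, 5))   # no cap applies at 5 evaluations
```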
If this new score is lower than the most recent score before the recalculation, then we ignore it (so as to satisfy the non-decreasing score requirement).
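In code, the non-decreasing rule amounts to keeping the larger of the stored score and the recalculated one. The function name and the sample numbers below are illustrative only.

```python
def update_score(previous_score, recalculated_score):
    """Keep the stored score unless the recalculation is higher."""
    return max(previous_score, recalculated_score)

current = 0.0
history = []
# Made-up recalculation results after four successive evaluations.
for recalculated in [120.0, 310.0, 280.0, 450.0]:
    current = update_score(current, recalculated)
    history.append(current)

print(history)  # [120.0, 310.0, 310.0, 450.0]
```

The third recalculation (280.0) is lower than the stored 310.0, so it is ignored and the score stays put.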
If you’re still following along, you’ll notice a few implications of this design:
- A Coverfly Score won’t reach its full potential until it has received at least 5 evaluations.
- A single evaluation with an extremely high composite score isn’t enough to put the Coverfly Score into the upper echelon of scores. In fact, one strong score alone isn’t even enough to get it out of the basement of Coverfly Scores.
Thus, the Coverfly Score is not necessarily a perfect reflection of a script’s quality – rather, it is a reflection of quality and confidence of quality simultaneously. High Coverfly Scores require multiple, high-marking evaluations from prestigious competitions and coverage providers – one or the other (quality/quantity) simply isn’t enough to determine a script’s quality confidently.
Check out our top Coverfly Scores on THE RED LIST.