Tutorperch Research · No. 2

Fairness in tutor rankings

If you're a top-rated tutor, you stay at the top. If you're not, you'll likely move up. Here's the research.

Written by Robert Smith · Reviewed by Fiona Hennessy · Last reviewed 27 May 2026

Page-1 reach: 99% of tutors reach page 1 within 30 days

Bottom 20% of tutors: +83% more page-1 impressions for the least-shown tutors under Tutorperch's sort

Top-rated tutors: unchanged. Mean impressions for tutors with 6+ reviews are essentially identical under both sorts

When a parent searches for a tutor on Tutorperch, the directory has to put someone first. That choice matters. Tutors near the top of the page are seen by more visitors; tutors further down are seen by fewer.

The fairness question is what the directory should do when several tutors are essentially indistinguishable on the evidence available. A tutor with fifty positive reviews ranks above one with five for an obvious reason. But two tutors with the same record, in the same subject, at the same price: which one goes first, and why?

This page walks through what fair ranking means in a tutor marketplace, the approaches we considered, the simulation we built to compare them, and what we chose. The animation below is a 30-day simulation of one search and shows the chosen approach in action.

Example scenario · 30-day daily rotation on page 1

Card types shown in the animation: Top-rated tutor, Score-tied tutor, Baseline tutor, New to Tutorperch (reserved slot)

Slots 10, 15, and 20 are reserved for tutors who joined recently (carrying the "New to Tutorperch" label). The other 21 page-1 slots fill from the ranked order, with tied tutors taking turns each day. Watch the "views" counter on each card: page-1 tutors climb steadily while off-page tutors barely move.

Section one

What fair ranking means here

Two principles, in tension:

The order should reflect what we know about each tutor. A tutor with a strong record of positive reviews has earned a higher slot than one without.
The order should distribute attention fairly when the evidence is the same. Reviews take time to accumulate, and a tutor who has not yet been given the chance to collect them is not the same thing as a tutor who has been given the chance and failed.

A simple sort breaks the second principle. Suppose the directory orders tutors by their score, then by who joined most recently when scores tie. Two tutors with identical evidence will always appear in the same order, with the more recent joiner always above the other. Across enough searches, the older tutor ends up invisible without ever having done anything wrong. They are simply on the wrong side of a deterministic tiebreaker.

The literature calls this an exposure-fairness problem, and there is a growing body of academic and industry work on it. Airbnb's 2020 paper on the cold-start gap in their search quantifies the cost. Wang and Joachims' ICTIR 2021 paper on two-sided market fairness formalises the principle that exposure should be proportional to relevance when relevance is observed, and equal when it is not.

That second half is what most of this study is about. The first half, ranking by evidence, is the easy part: it is a weighted average of reviews. The hard part is what to do when many tutors have the same average. Our directory needs a tiebreaker, and the choice of tiebreaker is the most consequential design decision in the sort.
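A minimal sketch of why a weighted average produces ties. It uses a Bayesian average, one common way to blend reviews with a prior. The prior mean of 4.0 and prior weight of 5 are illustrative assumptions, not Tutorperch's production values.

```python
def bayesian_score(ratings, prior_mean=4.0, prior_weight=5):
    """Blend observed star ratings with a prior (both prior values are assumed)."""
    return (prior_weight * prior_mean + sum(ratings)) / (prior_weight + len(ratings))

# Tutors with no reviews all collapse to the prior, which is where ties come from.
no_reviews_a = bayesian_score([])
no_reviews_b = bayesian_score([])

# A longer record of positive reviews pulls the score further from the prior.
five_reviews = bayesian_score([5] * 5)
fifty_reviews = bayesian_score([5] * 50)
```

Every tutor without reviews lands on exactly the same score, so the tiebreaker decides their order.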

Section two

Five approaches we compared

We surveyed what comparable marketplaces do and pulled out five distinct mechanisms. The simplest is a recency tiebreaker: when scores tie, show the more recent joiner first. Etsy and Airbnb use a different shape: a time-limited boost for new listings, designed to give new sellers a brief data-collection window before normal ranking takes over. Reddit's old comment sort uses a confidence interval rather than an average, pushing items with thin data lower regardless of whether their few ratings are positive or negative.
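For reference, the confidence-interval idea can be sketched with the Wilson score lower bound. This is the standard formula rather than any marketplace's actual code, and the 95% level (z = 1.96) is an assumption.

```python
import math

def wilson_lower_bound(positive, total, z=1.96):
    """Lower bound of the Wilson score interval for a share of positive ratings."""
    if total == 0:
        return 0.0  # no data at all: rank at the bottom
    p = positive / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * total)) / total)
    return (centre - margin) / denom

# Same 100% positive share, very different amounts of evidence.
thin = wilson_lower_bound(5, 5)
thick = wilson_lower_bound(50, 50)
```

Here `thin` comes out well below `thick` even though both tutors have only positive ratings, which is the "thin data ranks lower" behaviour described above.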

The approach we built is the simplest one that gets the fairness property right: rotate the order of tied tutors each day, deterministically. Every tutor whose score matches a peer's takes turns at the top. Within any one day the order is stable so pagination works correctly. Across many days the position evens out.
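One way to sketch a deterministic daily rotation is below. The function name, the use of the calendar date's ordinal as the rotation offset, and the two-decimal rounding default are assumptions for illustration. The page does not state how the production rotation is computed.

```python
from datetime import date
from itertools import groupby

def rank_for_day(tutors, day, decimals=2):
    """tutors: list of (tutor_id, score). Returns tutor ids in display order for `day`."""
    def rounded(t):
        return round(t[1], decimals)

    # Highest rounded score first; tutor id gives tied tutors a stable base order.
    keyed = sorted(tutors, key=lambda t: (-rounded(t), t[0]))
    ordered = []
    for _, group in groupby(keyed, key=rounded):
        ids = [tutor_id for tutor_id, _ in group]
        shift = day.toordinal() % len(ids)  # identical for every request that day
        ordered.extend(ids[shift:] + ids[:shift])
    return ordered

tutors = [("a", 4.0), ("b", 4.0), ("c", 4.0), ("top", 4.9)]
today = date(2026, 5, 1)
order_today = rank_for_day(tutors, today)
```

Because the offset depends only on the date, every visitor sees the same order on a given day, and each tutor in a tied group of three leads the group once every three days.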

Six ranking ideas, side by side

Each approach is listed with its never-seen rate and average top position. The fairness score is drawn as a bar in the chart and its numeric value is not reproduced here.

Newest-first sort (current): Bayesian score, then newest-published first. Never seen: 3.1%. Top pos.: 4.9.
Tutorperch sort (proposed): Rotate tied tutors with score-rounding for close scores. Never seen: 1.5%. Top pos.: 5.5.
30-day new-tutor boost: Daily rotation plus a brief boost for new joiners. Never seen: 2.9%. Top pos.: 5.8.
60-day new-tutor boost: Daily rotation plus a longer boost for new joiners. Never seen: 1.7%. Top pos.: 5.7.
Confidence-bound sort: Wilson lower-bound on positive ratings. Never seen: 3.1%. Top pos.: 4.9.

Fairness is a 0-to-100 score derived from the Gini coefficient of page-1 impression share; higher means impressions are spread more evenly across tutors. Never seen is the share of eligible tutors who did not appear on page 1 even once over the 30-day simulation window; lower is better. Top pos. is the average page-1 position of the highest-rated tutors, where lower numbers are higher on the page. This last column is where the cost of fairness shows up.

Section three

How we tested them

We built a simulation. It generates a population of 500 synthetic tutors with realistic distributions of subjects, levels, hourly rates, cities, and verification status (drawn from the patterns we documented in our earlier UK Private Tutoring Rate Report). Each tutor has a profile, a calendar, and a publish date.

The simulation then runs 54,141 visits to the directory over 30 days, with realistic mixes of what parents search for. Most visits include a subject. Many include a level. Some include a city. For each visit we record which tutors appeared in the top 12 positions, the page-1 cohort.

We ran the simulation under all six ranking approaches against the same population and the same visits, so the only thing changing between runs was the sort. That lets us compare directly what happens to each tutor under each approach.

We also ran the simulation under a projected future state where reviews have accumulated, to confirm that the chosen approach continues to behave correctly when there is real reputation data to work from. It does: once tutors have distinct scores the rotation has nothing to do, and the highest-rated tutors sit at the top of the page in the order their reviews would suggest.

Section four

What the simulation found

Under the naive newest-first tiebreaker, 15 of the 481 eligible tutors in the 500-tutor population (3.1%) went the entire 30 days without ever appearing on page 1. Under Tutorperch's sort, that drops to 7 (1.5%). The difference is the structural advantage the rotation gives to tutors who would otherwise be permanently below the fold.

Chart: Tutors shown on page 1 at least once, by day (Newest-first sort vs Tutorperch sort). Of 481 eligible tutors, the share who had appeared on page 1 at least once by day N. Higher and earlier is fairer.

The fairness question is sharpest at the bottom of the distribution. Under the naive sort, the bottom fifth of tutors are shown about 6 times in 30 days. The top fifth are shown 371 times. That is a 38-to-1 gap between the most-shown and least-shown tutors. Tutorperch's sort narrows that gap to about 23-to-1. Tutors at the bottom see their impressions rise by about 83%. Tutors at the top give up about 10%.

Average page-1 impressions per tutor, by quintile, over 30 days (Newest-first sort vs Tutorperch sort)

Bottom 20%: +83%
Lower-middle 20%: +33%
Middle 20%: +17%
Upper-middle 20%: 114 (the only bar value labelled), +18%
Top 20%: 371 vs 334, -10%

Quintiles formed by ranking tutors by their page-1 impression count over the simulation window. The bars are scaled to the same absolute axis to show the gap between top and bottom; the % change column on the right shows the relative shift for each group.
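The quintile comparison can be sketched as follows. The function names are hypothetical and the bucket boundaries are one reasonable choice, since the page does not specify how uneven group sizes are handled.

```python
def quintile_means(impressions):
    """Mean impressions for each fifth of tutors, from least-shown to most-shown."""
    xs = sorted(impressions)
    n = len(xs)  # assumes at least 5 tutors
    bounds = [round(i * n / 5) for i in range(6)]
    return [sum(xs[a:b]) / (b - a) for a, b in zip(bounds, bounds[1:])]

def pct_change(before, after):
    """Relative shift between two sorts, as shown in the % change column."""
    return 100 * (after - before) / before

example = quintile_means(list(range(1, 11)))
top_shift = pct_change(371, 334)  # top-quintile figures from the chart
```

Applied to the top-quintile figures in the chart, `pct_change(371, 334)` rounds to -10, matching the value shown.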

A busy subject: Maths

The whole-directory numbers above average over every kind of query. Many searches filter to a small candidate pool (a particular subject in a particular city) where every matching tutor fits on page 1 regardless of sort. The fairness question is muted in those searches because there is no real competition for top slots.

The interesting case is a busy subject. In the simulation, 264 of 500 tutors offer Maths, and Maths is searched for around 693 times over the 30-day window. With a 12-slot page and 264 potential candidates per query, the sort has a real choice to make.

Here the difference between the two sorts is stark. Under the naive newest-first sort, 77 of 264 Maths tutors (29%) did not appear on a single Maths page-1 over 30 days. Under Tutorperch's sort, that drops to 44 of 264 (17%). The same tutors are competing under both sorts. The difference is whether the rotation gives the score-tied tutors their turn at the top.

Average page-1 impressions per Maths tutor, by quintile, over 30 days (Newest-first sort vs Tutorperch sort)

Bottom 20%: n/a
Lower-middle 20%: +100%
Middle 20%: +25%
Upper-middle 20%: +111%
Top 20%: 127 vs 115, -9%

The same effect would show up in any other busy subject where the candidate pool exceeds the page size. Maths is shown here because it is the busiest in our population and the cleanest illustration. English, the sciences, and language tutors at popular levels would all show a similar pattern.

A floor for new tutors

Three slots held open for new tutors

The rotation gives every score-tied tutor an equal share of being seen, but in a busy subject the tied baseline cohort can be large. Equal share of a small slice is still a small slice. To give new tutors a predictable visibility floor while their first reviews accumulate, three page-1 slots are held open as a promotion from off-page for tutors who joined in the last 30 days, clearly labelled as new.

This is explicit rather than implicit. We don't promote new tutors within the ranking, which would say a new tutor is somehow better than an established one with the same evidence; we surface them in a separate, labelled position on the page. Yelp uses the same structural pattern for its "Hot and New Businesses" surface, for the same reason: it gives newcomers visibility without distorting the ranking of everyone else.

A new tutor whose ranked score already puts them on page 1 keeps their natural position and carries no special label. They earned their spot. The three slots only fill from new tutors who would otherwise have been off-page. If fewer than three need that lift, or none match the query, the slots fall back to ranked tutors and the page fills normally. The top 4 positions are never touched. The animation higher up the page shows the slot mechanic and the rotation together.
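The slot mechanic described above can be sketched as follows. The function and parameter names are hypothetical. The 24-slot page and the reserved positions 10, 15, and 20 follow the animation's description, and everything else is an assumption about how the pieces fit together.

```python
from collections import deque

def build_page(ranked, new_ids, page_size=24, reserved=(10, 15, 20)):
    """Return [(tutor_id, labelled_new), ...] for page 1.

    ranked  : tutor ids matching the query, best first
    new_ids : ids of tutors who joined in the last 30 days
    """
    # Only new tutors who would otherwise be off-page qualify for a lift.
    lifts = deque([t for t in ranked[page_size:] if t in new_ids][:len(reserved)])
    lifted = set(lifts)
    rest = deque(t for t in ranked if t not in lifted)

    page = []
    for position in range(1, page_size + 1):
        if position in reserved and lifts:
            page.append((lifts.popleft(), True))   # labelled "New to Tutorperch"
        elif rest:
            page.append((rest.popleft(), False))   # ranked fill, or fallback
    return page

ranked = [f"t{i}" for i in range(1, 41)]
page = build_page(ranked, new_ids={"t3", "t30", "t35"})
```

In this example `t3` is new but already on page 1, so it keeps its natural position without a label. Only `t30` and `t35` are lifted, and the third reserved slot falls back to the next ranked tutor.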

Section five

What this means for tutors on the platform

If your position in the directory has moved since you last looked, that is most likely the rotation rather than a change in your profile. Within any one tied group, position rotates each day. Your absolute score has not changed and your position will rotate back.

The rotation only affects tutors whose scores match. As soon as a tutor accumulates enough reviews to distinguish them from the pack, their position is determined by their score, not by the rotation. The most reliable way to step out of the tiebreaker is to ask the people you teach to leave you a review on Tutorperch once they've unlocked your contact details. Reviews are the signal the sort listens to when it has one.

The order on any one day is the same for every visitor. The rotation is deterministic per day, not random per request. So if a parent shows your profile to a partner later the same day, you appear in the same place. The day-to-day movement is what gives every tutor a fair rotation through the top slots over a week or month.

Section six

What we'll learn from real data

The simulation runs on synthetic distributions because that is what a study of ranking design needs to produce a fair comparison across alternatives. The populations of tutors and the patterns of parent searches are calibrated against UK tutor-market data but they are not Tutorperch's actual logged behaviour.

As Tutorperch accumulates real impression and unlock data over time, we plan to revisit this study with live numbers and either confirm or revise our choice. When we re-run the analysis it will be against the same metrics so anyone following along can see how the predictions held up.

Methodology & sources

How the simulation was built

The synthetic tutor population was generated with distributions calibrated against UK tutor-marketplace patterns documented in our earlier UK Private Tutoring Rate Report: log-normal hourly rates with a median around £35, a London-heavy geographic head with a long regional tail, a Zipf-1 distribution of subjects with Maths, English, and the sciences dominating, and a 25% safeguarding-verified baseline. The simulation runs against 500 tutors over 30 days at 200 pageviews per day, with all six ranking approaches seeing the same population and the same queries.
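A rough sketch of how a population with these distributions could be generated. The subject list, the log-normal spread `sigma`, and the one-subject-per-tutor simplification are assumptions. Only the £35 median, the Zipf exponent of 1, and the 25% verified share come from the description above.

```python
import math
import random

SUBJECTS = ["Maths", "English", "Physics", "Chemistry", "Biology", "French"]  # assumed

def make_tutors(n=500, seed=0, median_rate=35.0, sigma=0.35, verified_share=0.25):
    """Generate n synthetic tutors; reproducible for a given seed."""
    rng = random.Random(seed)
    # Zipf with exponent 1: weight of the k-th most popular subject is 1/k.
    weights = [1 / rank for rank in range(1, len(SUBJECTS) + 1)]
    tutors = []
    for tutor_id in range(n):
        tutors.append({
            "id": tutor_id,
            # A log-normal with mu = ln(median) has the requested median.
            "rate": round(rng.lognormvariate(math.log(median_rate), sigma), 2),
            "subject": rng.choices(SUBJECTS, weights=weights, k=1)[0],
            "verified": rng.random() < verified_share,
        })
    return tutors

population = make_tutors()
```

Fixing the seed is what lets every ranking approach be run against the same population.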

Fairness on this page is measured as one minus the Gini coefficient of page-1 impressions per tutor. Never-seen rate is the share of eligible tutors who didn't appear on page 1 even once during the 30-day window. Other metrics in the simulation include time-to-first-impression and position drift for top-rated tutors.
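The two headline metrics can be sketched as below, using the standard sorted-values formula for the Gini coefficient. The function names are hypothetical, and scaling the result to the 0-to-100 range is left out.

```python
def gini(values):
    """Gini coefficient of non-negative values: 0 is perfectly even."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n

def fairness(impressions):
    """One minus Gini of page-1 impressions per tutor (higher is more even)."""
    return 1 - gini(impressions)

def never_seen_rate(impressions):
    """Share of eligible tutors with zero page-1 impressions."""
    return sum(1 for x in impressions if x == 0) / len(impressions)

even = fairness([5, 5, 5, 5])
skewed = fairness([0, 0, 0, 10])
```

With four tutors, a perfectly even spread gives a fairness of 1.0, while one tutor taking every impression gives 0.25.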

A note on rounding. The simulation in this study tested score-rounding to two decimal places (the Tutorperch sort in the comparison above). The version that shipped to production rounds to one decimal place instead, to match the precision of the star rating displayed on tutor cards. The safeguarding lift is applied after rounding, so a verified tutor sits just above a non-verified tutor with the same rounded score rather than being promoted across a whole rounded tier. The qualitative finding stands: tied tutors rotate, top-rated tutors are unchanged, and the least-shown tutors see meaningfully more page-1 impressions.
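The ordering of rounding and the safeguarding lift can be illustrated with a sort key. The function name and example scores are hypothetical. The sketch shows only the "round first, then lift" ordering, not the rotation within tied groups.

```python
def sort_key(score, verified, decimals=1):
    """Rounded score decides the tier; verification only breaks ties inside a tier."""
    return (-round(score, decimals), 0 if verified else 1)

tutors = [
    ("unverified_4.62", 4.62, False),  # rounds to 4.6
    ("verified_4.46", 4.46, True),     # rounds to 4.5
    ("unverified_4.54", 4.54, False),  # rounds to 4.5
]
ordered = [name for name, score, verified in
           sorted(tutors, key=lambda t: sort_key(t[1], t[2]))]
```

The verified tutor moves above the unverified tutor in the same 4.5 tier but stays below the unverified tutor in the 4.6 tier.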

The approach we took follows the tradition of two-sided marketplace fairness research. Haldar et al., Improving Deep Learning for Airbnb Search (KDD 2020), quantifies the cold-start gap on Airbnb and describes their fix. Wang & Joachims, User Fairness, Item Fairness, and Diversity for Rankings in Two-Sided Markets (ICTIR 2021), gives the academic framing of exposure-fairness as a ranking constraint. Etsy's How Etsy Search Works and Wyzant's algorithm explainer are the two clearest examples of marketplaces that publish their ranking mechanisms in plain language; both informed how we have written this page.