
Test-Time Search & Re-Ranking: Why One Candidate Is Usually Not Enough
Why best-of-N search outperforms one-shot mutation proposals when the search space is wide and the judge is stable.
