Allen Institute launches GENIE, a leaderboard for human-in-the-loop language model benchmarking

Recent years have seen an explosion of natural language processing (NLP) datasets geared toward testing various AI capabilities. Many of these datasets have accompanying leaderboards, which provide a means of ranking and comparing models. But the adoption of leaderboards …