Neural Crossword Solver Outperforms Humans for the First Time

Crossword puzzles are one of humanity's most popular pastimes. Each consists of a grid of interlocking words, and each word is the answer to a clue. The objective is to fill in the grid. Crosswords are also one area where humans have long outperformed machines by a wide margin. The most prestigious competition is, without a doubt, the American Crossword Puzzle Tournament, which is held every year and had always been won by a person.

Dr Fill, a crossword-solving program, placed 11th out of over 600 competitors in 2017, but no machine had ever won the American Crossword Puzzle Tournament. Until 2021.

The Berkeley Crossword Solver is an automatic crossword-solving program created by Eric Wallace and colleagues in the natural language processing group at the University of California, Berkeley, together with Matt Ginsberg, the original inventor of the Dr Fill software.

First Place

In the 2021 American Crossword Puzzle Tournament, the new software outscored all human opponents. “[This] marks the first time that a computer program has surpassed human performance at this event,” say the researchers. “Our system won first place against 1,100 top human solvers.”

Crossword puzzles are tough for machines because they contain several different sorts of clues, each requiring a different approach. Knowledge-based clues, for example, need an understanding of history, pop culture, or other sorts of trivia. The team gives this example: CLUE: Book after Song of Solomon, A: ISAIAH.

Wordplay clues require players to consider anagrams, puns, and words with similar meanings. For example: CLUE: One followed by nothing, A: TEN.

Common-sense clues, by contrast, require a genuine understanding of the real world. For example: CLUE: Cause of a smudge, A: WETINK.

Programs also struggle with answers that contain more than one word, because the number of possible solutions skyrockets.

Definition clues, on the other hand, are often easy for a machine to answer automatically using data mining. For instance: CLUE: Tusked savanna inhabitant, A: WARTHOG.

The Berkeley Crossword Solver takes a straightforward approach. It starts with a neural question-answering system that generates candidate answers for every clue, each with a score indicating how good an answer it is likely to be.
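To make the idea of scored candidates concrete, here is a minimal Python sketch. It is not the team's neural model: it simply ranks answers from a tiny, made-up list of past clues by how many words they share with the new clue. The `HISTORY` list and the `tokens` and `candidates` names are assumptions for illustration only.

```python
# Hypothetical mini-database of past clue-answer pairs (illustration only).
HISTORY = [
    ("Tusked savanna animal", "WARTHOG"),
    ("Tusked Arctic animal", "WALRUS"),
    ("Savanna grazer", "GAZELLE"),
    ("Cause of a smudge", "WETINK"),
]


def tokens(text):
    """Lower-case a clue and split it into a set of words."""
    return set(text.lower().split())


def candidates(clue, length):
    """Rank past answers of the right length by Jaccard word overlap with the clue."""
    query = tokens(clue)
    scored = {}
    for past_clue, answer in HISTORY:
        if len(answer) != length:
            continue  # the answer must fit the slot in the grid
        past = tokens(past_clue)
        similarity = len(query & past) / len(query | past)
        scored[answer] = max(similarity, scored.get(answer, 0.0))
    # Highest-scoring candidates first.
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


print(candidates("Tusked savanna inhabitant", 7))
```

A neural system replaces the word-overlap score with a learned one, but the output has the same shape: a ranked list of answers that fit the slot, each with a score.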

The question-answering system was trained on a massive database of over six million clue-answer pairs from historical crossword puzzles going back 70 years. The solver also makes use of GPT-2, an open-source natural language AI, which helps with word segmentation.
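Word segmentation matters because grid entries such as WETINK are written without spaces. The Python sketch below shows the task itself. It uses a plain dictionary lookup with dynamic programming, not GPT-2, and the `VOCAB` set and `segment` function are hypothetical names chosen for illustration.

```python
# Hypothetical vocabulary (illustration only); a language model would
# score splits instead of relying on a fixed word list.
VOCAB = {"WET", "INK", "WE", "TIN", "WART", "HOG", "WARTHOG"}


def segment(answer, vocab=VOCAB):
    """Split an unspaced answer into vocabulary words, preferring fewer words.

    Returns a list of words, or None if no split into known words exists.
    """
    n = len(answer)
    best = [None] * (n + 1)  # best[i] = shortest word list covering answer[:i]
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            piece = answer[start:end]
            if best[start] is None or piece not in vocab:
                continue
            split = best[start] + [piece]
            if best[end] is None or len(split) < len(best[end]):
                best[end] = split
    return best[n]


print(segment("WETINK"))
print(segment("WARTHOG"))
```

Preferring the split with the fewest words is a crude stand-in for a language model's judgement of which reading is most natural.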

The grid is then filled with these candidate answers. Without a flawless set of candidates, this step is difficult: crossing answers often conflict, so the solver must prefer one candidate over another.

To tackle this, the Berkeley team used a technique known as belief propagation. Rather than simply choosing the answers with the highest scores from the question-answering step, this seeks the fill with the largest expected overlap with the correct solution.
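As a toy illustration of preferring overall agreement to individual scores, the Python sketch below fills a 2x2 grid in two ways: greedily, by taking the top-scoring across answer in each row, and jointly, by picking the most probable letter for each cell once the crossing answers are taken into account. The candidate tables, the `FLOOR` value and all function names are made up for illustration. Exhaustive enumeration stands in here for belief propagation, which approximates the same per-cell letter probabilities on full-size grids where enumeration is infeasible.

```python
from itertools import product

# Hypothetical candidates per slot: answer -> score from a clue-answering model.
ACROSS = [{"AT": 0.6, "IT": 0.4}, {"NO": 0.7, "SO": 0.3}]  # rows 0 and 1
DOWN = [{"IS": 0.9, "AN": 0.1}, {"TO": 1.0}]               # columns 0 and 1
FLOOR = 1e-6  # assumed score for a crossing word that no candidate list contains


def fill_weight(rows):
    """Unnormalised weight of a full fill: product of every across and down score."""
    weight = 1.0
    for cands, word in zip(ACROSS, rows):
        weight *= cands[word]
    for col, cands in enumerate(DOWN):
        word = "".join(row[col] for row in rows)
        weight *= cands.get(word, FLOOR)
    return weight


def greedy_fill():
    """Take the top-scoring across answer in each row, ignoring the crossings."""
    return [max(cands, key=cands.get) for cands in ACROSS]


def max_marginal_fill():
    """Choose, for each cell, the letter with the largest total weight over all fills."""
    marginals = {}  # (row, col) -> {letter: summed weight}
    for rows in product(*ACROSS):
        weight = fill_weight(rows)
        for r, word in enumerate(rows):
            for c, letter in enumerate(word):
                cell = marginals.setdefault((r, c), {})
                cell[letter] = cell.get(letter, 0.0) + weight
    return [
        "".join(max(marginals[(r, c)], key=marginals[(r, c)].get)
                for c in range(len(DOWN)))
        for r in range(len(ACROSS))
    ]


print("greedy:", greedy_fill())
print("joint: ", max_marginal_fill())
```

In this example the greedy fill takes AT and NO, whose first letters spell the weak down answer AN. The joint fill gives up the top across answers in favour of IT and SO, because together they produce the strongly supported down answer IS.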

This often produces a solution that is close to perfect but has a few minor flaws. The last stage is therefore a second pass over the puzzle, in which the software looks for alternative solutions that are only a few small changes away. The algorithm then scores these alternatives and repeats the process until no more improvements can be made.
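The second pass can be pictured as a simple hill climb. The Python sketch below is an assumption-laden toy, not the team's code: it tries every single-letter change to a 2x2 grid, keeps the best one if it raises the score, and stops when nothing helps. The candidate tables and the `score`, `neighbours` and `local_search` names are hypothetical, and the fixed score tables stand in for whatever model rescores the grid in the real system.

```python
import string

# Hypothetical candidates per slot: answer -> score (illustration only).
ACROSS = [{"AT": 0.6, "IT": 0.4}, {"NO": 0.7, "SO": 0.3}]  # rows 0 and 1
DOWN = [{"IS": 0.9, "AN": 0.1}, {"TO": 1.0}]               # columns 0 and 1


def score(grid):
    """Sum the scores of every across and down word; unknown words score zero."""
    total = 0.0
    for cands, word in zip(ACROSS, grid):
        total += cands.get(word, 0.0)
    for cands, column in zip(DOWN, zip(*grid)):
        total += cands.get("".join(column), 0.0)
    return total


def neighbours(grid):
    """Yield every grid that differs from `grid` in exactly one letter."""
    for r, row in enumerate(grid):
        for c, old in enumerate(row):
            for letter in string.ascii_uppercase:
                if letter != old:
                    new_row = row[:c] + letter + row[c + 1:]
                    yield grid[:r] + [new_row] + grid[r + 1:]


def local_search(grid):
    """Apply the best single-letter change repeatedly until the score stops rising."""
    current, current_score = grid, score(grid)
    while True:
        best = max(neighbours(current), key=score)
        best_score = score(best)
        if best_score <= current_score:
            return current
        current, current_score = best, best_score


print(local_search(["IT", "NO"]))
```

Starting from the flawed fill IT over NO, whose first column spells the unsupported word IN, a single change of N to S yields IT over SO and makes every across and down word a listed candidate.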

Capable Solver

The result is a powerful crossword-solving program. “Our system outperforms even the best human solvers and can solve puzzles from a wide range of domains with perfect accuracy,” the researchers explain.

They are quick to point out, however, that crossword puzzles are not yet "solved." The Berkeley Crossword Solver is tailored to the particular types of puzzles popular in the United States, such as the venerable New York Times crossword. “Compared to existing approaches, our system improves exact puzzle accuracy from 57% to 82% on crosswords from The New York Times,” the team says.

However, it does not work well with other kinds. In particular, the program cannot solve cryptic crosswords of the kind common in the United Kingdom. “Cryptic crosswords involve a different set of conventions and challenges, e.g., more metalinguistic reasoning clues such as anagrams, and likely require different methods from those we propose,” the Berkeley team admits.

This is an intriguing piece of work that shows substantial progress over earlier approaches. It also shows that humans no longer rule the roost in American-style crossword solving; how much longer they will hold on in cryptic crosswords is difficult to predict.