Google AI has released Natural Questions, a new large-scale corpus for training and evaluating question-answering systems. According to Google, the idea behind Natural Questions (NQ) is to provide a corpus of naturally occurring questions whose answers require a larger amount of context: the questions in NQ can typically be answered only by reading an entire Wikipedia page, rather than a single sentence or short paragraph.
To build the corpus, researchers at Google AI collected real user queries issued to Google Search. Human annotators were then asked to read entire Wikipedia pages and mark the content that answers each question: a long answer (typically a paragraph) and, where one exists, a short answer (one or more entities within it).
The search queries were anonymized and aggregated, yielding more than 300,000 "natural" questions. In addition, 16,000 questions were answered by multiple annotators, providing a basis for measuring both the performance of QA systems and the quality of the annotations themselves.
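To give a feel for how such annotated examples can be consumed, here is a minimal sketch that parses one record in a JSONL-style layout and recovers the long and short answer spans from token offsets. The field names (`question_text`, `document_text`, `annotations`, `long_answer`, `short_answers`) and the sample record itself are assumptions for illustration, not quoted from the article or guaranteed to match the official release exactly.

```python
import json

# Hypothetical NQ-style record (field names and content are illustrative
# assumptions, not taken from the official data release).
record_json = json.dumps({
    "example_id": 1,
    "question_text": "when was the natural questions dataset released",
    "document_text": "Natural Questions was released by Google AI in January 2019 .",
    "annotations": [{
        # Long answer: a token span covering the whole relevant paragraph.
        "long_answer": {"start_token": 0, "end_token": 11},
        # Short answer: a narrower entity span inside the long answer.
        "short_answers": [{"start_token": 8, "end_token": 10}],
    }],
})

def extract_answers(line):
    """Return (question, long-answer text, list of short-answer texts)."""
    ex = json.loads(line)
    tokens = ex["document_text"].split(" ")
    ann = ex["annotations"][0]
    la = ann["long_answer"]
    long_text = " ".join(tokens[la["start_token"]:la["end_token"]])
    shorts = [" ".join(tokens[sa["start_token"]:sa["end_token"]])
              for sa in ann["short_answers"]]
    return ex["question_text"], long_text, shorts

q, long_ans, short_ans = extract_answers(record_json)
print(q)          # the natural-language question
print(long_ans)   # paragraph-level answer span
print(short_ans)  # entity-level answer span(s)
```

The token-offset representation is what lets a single page annotation carry both a paragraph-level and an entity-level answer for the same question.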
In parallel with the release of the Natural Questions dataset, Google also announced a challenge. The competition uses a blind test set of 7,842 examples in exactly the same format as the released development set.
In the accompanying paper, the researchers from Google AI present experiments validating the quality of the data; the annotations in the NQ corpus were measured at 90% accuracy. Examples of questions and answers from the dataset, along with the annotation process, can be seen on the visualization page.