This is an open shared task for language modelling.
The task is to assign a score to each sentence based on its quality. The dataset contains 10,000 sentences to be scored, arranged in 5,000 pairs, each consisting of one correct and one incorrect sentence. The two sentences of a pair are kept together in the dataset, but whether the correct sentence comes first or second is chosen at random.
The sentences come from two sources. Half of the sentences are from Wikipedia, and the incorrect versions are generated by randomly switching two words in the sentence. The rest of the sentences are from essays written by language learners, and the correct versions have been manually created by annotators.
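The word-switching corruption used for the Wikipedia half can be illustrated with a small sketch. This is not the script that produced the dataset. The function name `swap_two_words`, the whitespace tokenisation and the seeded random generator are all assumptions made for illustration.

```python
import random


def swap_two_words(sentence, rng):
    """Return a copy of `sentence` with two randomly chosen word positions exchanged.

    Hypothetical helper: tokenisation is plain whitespace splitting, which may
    differ from whatever was used to build the actual dataset.
    """
    words = sentence.split()
    if len(words) < 2:
        # Nothing to swap in a one-word (or empty) sentence.
        return sentence
    # Pick two distinct positions and exchange the words at those positions.
    i, j = rng.sample(range(len(words)), 2)
    words[i], words[j] = words[j], words[i]
    return " ".join(words)


rng = random.Random(0)  # fixed seed so the example is repeatable
corrupted = swap_two_words("the cat sat on the mat", rng)
print(corrupted)
```

Note that if the two chosen positions hold the same word (for example the two occurrences of "the" above), the output is identical to the input, so a real generation script would probably need to resample in that case.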
The system needs to assign a score to each sentence, with the goal of giving the correct sentence in each pair a higher score than the incorrect one. Submissions are evaluated using accuracy: the proportion of pairs in which the system assigned a higher score to the correct sentence. When both sentences in a pair receive the same score, the pair is counted as incorrect.
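The accuracy measure described above can be sketched as follows. This is an illustration, not the evaluation code used by the submission system. The gold labels are not distributed in this form: the `correct_first` list (one boolean per pair, true when the first sentence of the pair is the correct one) is a hypothetical format chosen for the example.

```python
def pairwise_accuracy(scores, correct_first):
    """Fraction of pairs where the correct sentence has the strictly higher score.

    scores        -- flat list of scores, two consecutive entries per pair
    correct_first -- hypothetical gold labels: True if the first sentence
                     of the pair is the correct one, False otherwise
    """
    if len(scores) != 2 * len(correct_first):
        raise ValueError("expected exactly two scores per pair")
    hits = 0
    for k, first_is_correct in enumerate(correct_first):
        first, second = scores[2 * k], scores[2 * k + 1]
        if first_is_correct:
            hits += first > second   # a tie is not a hit
        else:
            hits += second > first   # a tie is not a hit
    return hits / len(correct_first)


# Four pairs: two ranked correctly, one tie, one ranked the wrong way round.
scores = [0.9, 0.1, 0.2, 0.7, 0.5, 0.5, 0.3, 0.6]
correct_first = [True, False, True, True]
print(pairwise_accuracy(scores, correct_first))  # 0.5
```

The strict comparison is what makes tied pairs count as incorrect, so a system that outputs the same score for every sentence gets an accuracy of 0 rather than 0.5.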
The submission file needs to contain one score per line, and the score on each line corresponds to the sentence on the same line of the input file.
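A submission file in this one-score-per-line layout could be written with something like the sketch below. The helper names `write_submission` and `read_submission` and the file name are hypothetical, and the number formatting accepted by the submission system is not specified here, so plain decimal output is an assumption.

```python
def write_submission(scores, path):
    """Write one score per line, in the same order as the input sentences."""
    with open(path, "w", encoding="utf-8") as f:
        for score in scores:
            f.write(f"{score}\n")


def read_submission(path):
    """Read the scores back as floats, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [float(line) for line in f if line.strip()]


if __name__ == "__main__":
    # "scores.txt" is a placeholder file name.
    write_submission([-12.5, -14.25, -3.0, -2.75], "scores.txt")
    print(read_submission("scores.txt"))
```

Reading the file back and checking that the number of scores equals the number of input sentences is a cheap way to catch misaligned output before uploading.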
When you upload your file, you can set a name and password, and the system will register your name. You can then update your results next time using the same password. Only the submission with the highest score is saved.
Dataset
Download from here: lm-task-dataset.tar.gz
Submit
Results
Name | Accuracy | Date |
test | 0.81 | 27 June 2015 |
kin | 0.8038 | 29 June 2015 |
Laimis Dalke | 0.7944 | 4 May 2015 |
O | 0.7938 | 25 June 2015 |
M | 0.7884 | 18 May 2015 |
badmodel | 0.7632 | 28 January 2016 |
Aleksandr Tkatsenko | 0.7492 | 3 May 2015 |
Marek | 0.733 | 5 April 2015 |
Sue | 0.7114 | 20 June 2018 |
Dmytro bigrams train + de | 0.6938 | 4 May 2015 |
Dmytro | 0.69 | 5 May 2015 |
Karl-Oskar | 0.6072 | 24 April 2015 |