So here I get quite close results, even though it is obvious that the second sentence is wrong by all means.
I am a man. 50.63967
I is an man. 230.10565
Is there any other way to calculate whether a sentence is correct? Because these results are quite close.
Maybe fine-tune T5 on examples, if there is a training set?
I have built some huge 3-gram and 4-gram models and they seem to be useless: even though I used around 800 GB of text, I can't tell whether a sentence is good or not.
Although I cannot vouch for their quality, there are a number of grammar correction models on the Model Hub: Models - Hugging Face
They seem to fine-tune T5 or GPT, as you mentioned. However, there will never be a guarantee that the model output is 100% grammatically correct. I think a rule-based approach suits grammar the most, since it mostly follows well-defined rules.
The task you are referring to is one of the subtasks in the GLUE benchmark (which is an important benchmark in NLP): the CoLA dataset (CoLA is short for Corpus of Linguistic Acceptability). This is a simple binary classification task: given a sentence, the model needs to determine whether the sentence is grammatically correct or not.
Hence, you can use a BERT model (or one of its variants, such as RoBERTa, DistilBERT, etc.) fine-tuned on this dataset. Such models are already available on the Hub, for example this one.
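A minimal sketch of using such a CoLA-fine-tuned model as an acceptability classifier. The checkpoint name here is one example from the Hub, and the label mapping is an assumption: in the CoLA dataset, label index 1 means "acceptable", which this checkpoint is assumed to preserve.

```python
# Sketch: binary acceptability classification with a BERT model
# fine-tuned on CoLA. Checkpoint name and label mapping are assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "textattack/bert-base-uncased-CoLA"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)
model.eval()

def acceptability(sentence: str) -> float:
    """Probability that the sentence is grammatically acceptable
    (assuming label index 1 = acceptable, as in the CoLA dataset)."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

print(acceptability("I am a man."))
print(acceptability("I is an man."))
```

Unlike raw perplexity, this gives a calibrated probability per sentence, so a simple 0.5 threshold becomes meaningful.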
Hi, if you’re looking for a model that predicts whether a given sentence is correct or not, you can go with Gramformer; it also has a corrector.
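If you prefer to stay in plain transformers, the corrector can be driven through the T5 checkpoint that Gramformer wraps. This is a sketch under assumptions: the checkpoint name and the `gec: ` input prefix are taken from the Gramformer library and may change between versions.

```python
# Sketch: grammar correction with the seq2seq checkpoint behind Gramformer.
# The model name and the "gec: " prefix are assumptions based on that library.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "prithivida/grammar_error_correcter_v1"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)
model.eval()

def correct(sentence: str) -> str:
    # The library prepends a task prefix before generation.
    enc = tokenizer("gec: " + sentence, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(**enc, max_length=64, num_beams=4)
    return tokenizer.decode(out[0], skip_special_tokens=True)

print(correct("I is an man."))
```

This turns the detection problem into a correction problem: if the corrected output differs from the input, the input was likely ungrammatical.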