Model or Dataset available for classifying a grammatical sentence?

I want to classify whether an input text is a complete sentence or not.

  • The closest accurate definition of ‘being complete’ is that the text is a grammatical sentence.
  • Whether a sentence is ‘complete’ can also depend on its context, but for now I want to focus on a sentence-like text as input.

Examples of complete sentences:

  • “You can write using one of the following styles”
  • “You can write”
  • “He writes code”

Examples of incomplete sentences:

  • “You can write using”
  • “You can write using one”
  • “He writes code for”

I found this package for grammar checking which I am going to try:

I am wondering if there is an ML/DL solution for this problem. Do you know of a dataset or an available model for this?

Hi @emadg,

I don’t think LanguageTool is the best way to go here, because it only checks grammar, and a grammar check does not predict whether a sentence is complete or not. Here are the rules that are implemented in LanguageTool; you can check whether any of them would help you classify a sentence as complete or not.
https://community.languagetool.org/rule/list?sort=category&order=asc

For an ML approach, I think you can try using a language model. You can look at tokens that signal whether a sentence ends, such as punctuation and conjunctions, and find their probability at the end of the text. A lower probability of a sentence-ending token means there is very little chance that the sentence ends there.
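To make the idea concrete, here is a toy sketch, not a tested solution. It uses a tiny bigram count model instead of a real pretrained language model, and it only estimates the probability that an end-of-sentence marker follows the last word of the input. The corpus, the `EOS` marker, and the function names `train_bigram_counts` and `end_probability` are all made up for illustration.

```python
from collections import Counter, defaultdict

EOS = "</s>"  # hypothetical end-of-sentence marker

# Hypothetical toy corpus; a real system would need far more text.
corpus = [
    "you can write using one of the following styles",
    "you can write",
    "he writes code",
    "she writes code for a living",
    "you can write using a pen",
    "he can write code",
    "they write code for fun",
]


def train_bigram_counts(sentences):
    """Count how often each word is followed by each next token, EOS included."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split() + [EOS]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def end_probability(counts, text, alpha=1.0):
    """Estimate P(EOS | last word of text) with add-alpha smoothing."""
    words = text.lower().split()
    if not words:
        return 0.0
    # Every token that can appear in the "next" position, plus EOS.
    vocab = {EOS} | {nxt for following in counts.values() for nxt in following}
    following = counts.get(words[-1], Counter())
    total = sum(following.values())
    return (following[EOS] + alpha) / (total + alpha * len(vocab))


counts = train_bigram_counts(corpus)

for text in ["He writes code", "He writes code for"]:
    print(f"{text!r}: P(end) = {end_probability(counts, text):.3f}")
```

On this toy corpus, “He writes code” gets a higher end probability than “He writes code for”. With a real pretrained language model you would instead compare the probability of a period or end-of-text token after the full input, and the cutoff between complete and incomplete would have to be tuned on labeled examples.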

PS: I will add to this if I find a concrete method to solve it.
