Is there any model for document prioritization?

I have a bunch of documents and a set of criteria. Based on those criteria, the model should prioritize the documents and list them in order.

Classification models?


Yes, there are models on Hugging Face that can be used for document classification. Here are some candidate models and a brief explanation of their suitability:

  1. ProsusAI/finbert [2]: This model classifies the sentiment of financial text (positive, negative, or neutral). It can be adapted for document classification if your documents are finance-related.

  2. jinaai/jina-reranker-v2-base-multilingual [2]: This model is a multilingual reranker rather than a classifier. It scores how relevant each document is to a query, so you can write a criterion as the query and sort the documents by score. It is useful if your documents are in different languages.

  3. distilbert/distilbert-base-uncased-finetuned-sst-2-english [2]: This is a sentiment analysis model. While it may not directly address document classification, it can be fine-tuned for this task if the classification criteria are related to sentiment.

  4. cardiffnlp/twitter-roberta-base-sentiment-latest [2]: Another sentiment analysis model, similar to the above. It can be adapted for document classification if sentiment is a relevant criterion.

  5. MilaNLProc/xlm-emo-t [2]: This model is designed for emotion classification in multilingual texts. It can be useful if your classification criteria are related to the emotional tone of the documents.

  6. unitary/toxic-bert [2]: A toxicity detection model. It can be used if you need to classify documents based on toxicity or harmful content.

  7. SamLowe/roberta-base-go_emotions [2]: A model for emotion classification, trained on the GoEmotions dataset. It can be relevant if your criteria involve the emotional content of the documents.

  8. textdetox/xlmr-large-toxicity-classifier [2]: Another toxicity detection model, similar to the unitary/toxic-bert model.

  9. mixedbread-ai/mxbai-rerank-base-v1 [2]: This model is a reranker: given a query and a set of documents, it scores each document's relevance to the query. That maps directly onto prioritization if each criterion can be written as a short query.

  10. LayoutLM-based models [4][5]: LayoutLM is designed for understanding scanned or visually rich documents. It combines the text with layout (position) information and is commonly fine-tuned for document classification. It can be a strong candidate if your documents have a structured format, such as forms or invoices.
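As a rough illustration of how the rerankers in the list could drive prioritization, here is a minimal sketch: write the criterion as a query, score every document against it, and sort. The helper names (`prioritize`, `keyword_overlap_scorer`, `cross_encoder_scorer`) and the sample documents are made up for this example. The keyword scorer is only a stand-in so the sketch runs without downloading anything; `cross_encoder_scorer` assumes the `sentence-transformers` package and its `CrossEncoder.predict` method.

```python
from typing import Callable, List, Sequence, Tuple

# A scorer takes (criterion, document) pairs and returns one score per pair.
Scorer = Callable[[Sequence[Tuple[str, str]]], List[float]]


def keyword_overlap_scorer(pairs: Sequence[Tuple[str, str]]) -> List[float]:
    """Stand-in scorer: fraction of criterion words found in the document."""
    scores = []
    for criterion, document in pairs:
        wanted = set(criterion.lower().split())
        found = set(document.lower().split())
        scores.append(len(wanted & found) / len(wanted) if wanted else 0.0)
    return scores


def cross_encoder_scorer(
    model_name: str = "mixedbread-ai/mxbai-rerank-base-v1",
) -> Scorer:
    """Build a scorer backed by a reranker model.

    Assumes `pip install sentence-transformers`; the model is downloaded
    the first time this function is called.
    """
    from sentence_transformers import CrossEncoder  # lazy import

    model = CrossEncoder(model_name)
    return lambda pairs: [float(s) for s in model.predict([list(p) for p in pairs])]


def prioritize(
    criterion: str,
    documents: Sequence[str],
    scorer: Scorer = keyword_overlap_scorer,
) -> List[Tuple[str, float]]:
    """Return (document, score) pairs, highest score first."""
    scores = scorer([(criterion, doc) for doc in documents])
    return sorted(zip(documents, scores), key=lambda item: item[1], reverse=True)


# Hypothetical sample data, using the stand-in scorer.
docs = [
    "Quarterly invoice overdue payment reminder",
    "Team lunch menu for Friday",
    "Payment received, thank you",
]
ranked = prioritize("overdue payment", docs)
for doc, score in ranked:
    print(f"{score:.2f}  {doc}")
```

To try a real model, pass `scorer=cross_encoder_scorer()` to `prioritize`. The scores will then come from the reranker instead of keyword overlap, and their scale depends on the model.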

Recommendation:

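For general document classification with criteria-based prioritization, LayoutLM-based models [4][5] are a strong choice when the documents are scanned or layout-heavy, because they use both the text and the structure of the page. For plain-text documents, a reranker such as mixedbread-ai/mxbai-rerank-base-v1 [2] can score each document against a criterion directly. For more domain-specific tasks, consider fine-tuning ProsusAI/finbert [2] or another domain-specific model.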
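If you end up with several criteria, each scored by a different classifier, one simple way to turn them into a single priority is a weighted average. The sketch below assumes each classifier already produced a score between 0 and 1 per document. The criterion names, weights, file names, and scores are hypothetical placeholders, not output from any of the models above.

```python
from typing import Dict, List, Tuple


def priority_score(criterion_scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-criterion scores (each assumed to lie in [0, 1]).

    A criterion missing from `criterion_scores` counts as 0.
    """
    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted = sum(w * criterion_scores.get(name, 0.0) for name, w in weights.items())
    return weighted / total_weight


def rank_documents(
    scored_docs: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Return (document id, priority) pairs, highest priority first."""
    ranked = [(doc_id, priority_score(s, weights)) for doc_id, s in scored_docs.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical weights: toxicity matters three times as much as negative sentiment.
weights = {"toxicity": 3.0, "negative_sentiment": 1.0}

# Hypothetical per-document classifier outputs.
scored_docs = {
    "report.txt": {"toxicity": 0.9, "negative_sentiment": 0.1},
    "memo.txt": {"toxicity": 0.2, "negative_sentiment": 0.8},
    "note.txt": {"toxicity": 0.5, "negative_sentiment": 0.5},
}

ranking = rank_documents(scored_docs, weights)
for doc_id, score in ranking:
    print(f"{score:.2f}  {doc_id}")
```

The weights encode how much each criterion matters to you, so they are a judgment call and not something a model supplies.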