| Integrated gradients for explainability of VLMs | 0 | 86 | February 3, 2025 |
| Understanding where model weights are stored for research project on AI openness | 3 | 239 | January 31, 2025 |
| Retrieving Meta Data on Models for Innovation Research | 0 | 84 | December 1, 2024 |
| Generating Synthetic Data for Machine Translation of Dialects | 2 | 1503 | October 2, 2024 |
| Theme Extraction from Text | 1 | 1821 | December 29, 2023 |
| Interest in Contributing PEFT Educational Resources - Seeking Community Input | 2 | 47 | December 15, 2024 |
| Best practices for estimating FLOPs-per-token with real datasets? | 1 | 1818 | September 20, 2022 |
| Deepseek v3 analysis | 0 | 81 | February 18, 2025 |
| Have you submitted feedback about ChatGPT? | 4 | 615 | June 27, 2023 |
| Zero shot classification for automated electrocardiogram reports | 3 | 1211 | August 26, 2022 |
| Model or Dataset available for classifying a grammatical sentence? | 1 | 1677 | February 3, 2021 |
| How does the vocabulary size count towards total parameter size of a model? | 0 | 2309 | January 18, 2022 |
| How to download all the docs? | 4 | 1004 | August 23, 2023 |
| Why are embedding / pooler layers excluded from pruning comparisons? | 7 | 789 | February 16, 2021 |
| Print All Tokens Over a Certain Probability Threshold | 3 | 1106 | July 21, 2020 |
| Is causal language modeling (CLM) vs masked language modeling (MLM) a common distinction in NLP research? | 0 | 2177 | April 21, 2021 |
| Train from scratch vs further pretraining/fine tuning with MLM and NSP | 1 | 1525 | August 28, 2023 |
| LayoutLM for extraction of information from tables | 1 | 1519 | September 29, 2022 |
| Conversational Search and Analysis of Collections of Letters and Comments | 3 | 596 | February 3, 2024 |
| Large Language Models and Conversational User Interfaces for Interactive Fiction and other Videogames | 2 | 670 | September 24, 2024 |
| Classification Heads in BERT and DistilBERT for Sequence Classification | 2 | 1177 | May 13, 2021 |
| Detection Transformer (DETR) for text detection in documents | 0 | 2026 | September 29, 2021 |
| A novel approach for training LLM models that suppress hallucinations and possess memory capabilities (without using RAG) | 3 | 178 | May 1, 2025 |
| Building a custom Squad 2.0 style dataset, is it worth it? | 3 | 999 | July 20, 2020 |
| Relative Position Representation/Encoding for Transformer | 0 | 1923 | February 22, 2022 |
| Fine tuning gpt-neo via ppo | 1 | 1353 | June 11, 2023 |
| Language model gradients sensitive to target value/length | 0 | 339 | June 16, 2023 |
| Integration with Public-sector Data Portals | 0 | 339 | May 16, 2023 |
| Text to Text Transformer - T5 | 2 | 1100 | January 4, 2021 |
| Feeding a Knowledge Base into Transformer model | 1 | 1319 | May 2, 2023 |
| How do i choose a optimal LLM for Pentesting | 2 | 1064 | December 13, 2023 |
| Resume Training / Finetune a language model and further finetune a classifier | 1 | 1262 | October 19, 2020 |
| Debugging the RAG question encoder | 2 | 573 | February 10, 2021 |
| [Call for Participation] GermEval2024 GerMS-Detect - Sexism Detection in German Online News Fora @Konvens 2024 | 0 | 313 | April 19, 2024 |
| Finetuning German BERT for QA on biomedical domain | 2 | 1016 | January 30, 2022 |
| Inference optimization with HPC | 2 | 571 | January 8, 2024 |
| Address extraction and formated using Places API (Google Maps API) | 0 | 1724 | July 4, 2021 |
| Fine Tuning LLM | 0 | 1710 | August 16, 2023 |
| LayoutLMV3 information extraction from invoice | 2 | 980 | September 22, 2024 |
| What does the datacenter infrastructure of HF look like? | 0 | 293 | March 28, 2024 |
| A Survey to Understand Challenges of Deploying Text Classification | 2 | 943 | February 8, 2022 |
| How to add your paper to your models or datasets metadata? | 2 | 896 | October 30, 2023 |
| How find idea for academic thesis? | 2 | 879 | February 19, 2022 |
| What is Q* algorithm? | 0 | 269 | November 25, 2023 |
| Do We Still Need Dimensionality Reduction for LLM Text Embeddings? | 1 | 1064 | August 20, 2024 |
| BART question, it seems that pretraining is not work for a small model? | 6 | 563 | August 3, 2020 |
| Prompt Theory: A Framework | 8 | 159 | June 7, 2025 |
| Open-Source LLM Models for Data Extraction Tasks | 0 | 262 | September 24, 2024 |
| Domain-specific word similarity problem | 2 | 846 | July 19, 2023 |
| AANN: Agents As Neural Networks | 0 | 45 | March 8, 2025 |