Implementing a custom Attention Transformer | 5 | 3175 | September 6, 2021
LLM for autism research | 4 | 1053 | July 13, 2024
Improving performance of Wav2Vec2 fine tuning with word piece vocabulary | 5 | 2984 | October 27, 2021
Request for comments: Simple Universal Prompting System | 4 | 101 | June 12, 2025
LayoutLMv3 paper review and fine tuning code | 0 | 1229 | June 23, 2022
A complete survey on ChatGPT: One Small Step for Generative AI, One Giant Leap for AGI | 0 | 1192 | April 5, 2023
Wake word detection | 6 | 141 | April 5, 2025
A criticism of instruction fine-tuning datasets | 2 | 2091 | June 20, 2023
Masked Language Model Scoring | 5 | 2578 | June 15, 2023
Introducing The AGI Framework: Open-Source Modular Architecture for Artificial General Intelligence Development | 0 | 198 | January 28, 2025
Language model to search an answer in a huge collection of (unrelated) paragraphs | 4 | 1510 | July 6, 2021
Task-specific fine-tuning of GPT2 | 0 | 1045 | April 22, 2021
Is there an easy way to apply layer-wise decaying learning rate in huggingface trainer for RobertaMaskedForLM? | 3 | 2928 | April 5, 2022
Attention mask and token ids | 1 | 2261 | October 18, 2022
EFFECTIVE PROMPTING - ReACT & GRAPHS | 2 | 300 | October 2, 2024
Free Access for Masters Dissertation | 1 | 649 | February 2, 2024
Ionic vs. React Native vs. Flutter | 4 | 402 | June 6, 2025
Open API standard for open-source LLMs | 0 | 887 | July 1, 2023
Error while trying to Load the "deepseek-ai/DeepSeek-V3" model | 3 | 425 | April 14, 2025
About the encoder and generator used in the RAG model | 2 | 860 | December 25, 2020
GRPO Trainer for VLM? | 2 | 260 | March 11, 2025
Different response from different UI's | 3 | 219 | March 6, 2025
Profiling all layers of a model | 0 | 756 | January 26, 2024
Debiasing models by HEX projection | 1 | 521 | July 28, 2020
Incorporating structural information in a Transformer? | 0 | 718 | April 6, 2022
Can you use both copy mechanism and BPE for a NMT task? | 0 | 712 | April 6, 2022
Meta Persona an abstract adaptive neural construct | 0 | 712 | November 25, 2020
Transformer for Abstractive Summarization for Chats Based on Performance | 3 | 1950 | October 9, 2020
Improving Key-Value Pair Extraction with LayoutLM and LiLT on Custom OCR Dataset | 2 | 216 | February 21, 2025
AI Memory : The Simplest System That Beats Every Complex Solution | 6 | 142 | May 25, 2025
Resources on interpretability of wav2vec-style speech models | 0 | 636 | September 12, 2022
Analysis of attention map | 2 | 199 | October 24, 2024
Looking for help to use AI (LLM) for a Systematic Literature Review in Soil Biodiversity/agroecology Research | 4 | 153 | February 1, 2025
Extracting Training Data from GPT-2 (+ Differential Privacy) | 2 | 1922 | November 9, 2023
Web Search Implementation with LLM | 1 | 1284 | October 11, 2024
Collaborative Training Experiment Round 2 with Yandex and HuggingFace | 0 | 565 | September 1, 2021
(Research/Personal) Projects Ideas | 2 | 1817 | November 29, 2024
[Help needed] Extending Trainer for Meta learning | 3 | 1571 | October 19, 2021
Best way to infer continuously with Transformer? | 0 | 557 | July 26, 2021
AgentLite Is A Lightweight Framework for Building AI Agents | 0 | 99 | October 2, 2024
Why do the commit histories of Hugging Face's datasets and models appear recent? Weren't these datasets and models uploaded a while ago? | 2 | 983 | April 8, 2022
Best way to deploy a SLM/LLM model. Best library and approach? | 6 | 641 | March 11, 2025
Can we access attention component and feed-forward component of a Bert layer? | 2 | 973 | September 23, 2024
Seeking Hugging Face Users for Casual Chat About AI Model Openness | 5 | 122 | March 4, 2025
LLM for analysing JSON data | 1 | 369 | December 12, 2024
Finetuning for fp16 compatibility | 2 | 1693 | June 17, 2021
Is there a way to split a news article into subtopic | 4 | 1272 | September 22, 2022
Token merging for fast LLM inference | 0 | 492 | April 17, 2024
Multilingual token, phrase and sentence representations for text similarity | 0 | 490 | January 13, 2021
Integrated gradients for explainability of VLMs | 0 | 86 | February 3, 2025