| Topic | Replies | Views | Date |
|---|---|---|---|
| 🚧 ReTool: PyTorch Implementation of Strategic Tool Use in LLMs (Seeking Collaborators) | 0 | 22 | June 1, 2025 |
| This Python class offers a multiprocessing-powered Pool for efficiently collecting and managing experience replay data in reinforcement learning | 0 | 12 | June 2, 2025 |
| Why can padding tokens attend to other tokens in masked self attention? | 0 | 67 | November 4, 2024 |
| Questions about classification models | 0 | 31 | November 4, 2024 |
| How to use Transformers ViTs with different resolutions like in timm? | 0 | 66 | November 14, 2024 |
| BERT token classification / regression question | 0 | 34 | November 5, 2024 |
| An idea about LLMs | 0 | 76 | November 3, 2024 |
| Low Accuracy in BERT Ensemble Despite Strong Individual Model Performance | 0 | 10 | May 22, 2025 |
| Which VLM is best for defect detection in images | 0 | 316 | November 6, 2024 |
| RAG performance | 0 | 84 | November 6, 2024 |
| What should I study in fine-tuning | 0 | 18 | November 6, 2024 |
| [Network Access Inquiry] Confirming GET-only Requests to huggingface.co for Firewall Whitelisting | 0 | 17 | June 4, 2025 |
| Diffusers load custom embedding | 0 | 45 | November 7, 2024 |
| Saving Manually Resized Embeddings for a Pretrained Bert Model (I believe I am asking this correctly) | 0 | 101 | November 7, 2024 |
| Confusion regarding when to use dict-styled chat dialogue vs. when to format using chat template | 0 | 42 | November 6, 2024 |
| Is there any way to fine tuning model with existing embedding? | 0 | 15 | November 7, 2024 |
| Guidance on Optimizing Text Similarity and Reporting with Transformers and Advanced NLP Techniques | 0 | 33 | November 7, 2024 |
| AOTInductor with Llama-3.2-3B-Instruct | 0 | 89 | November 14, 2024 |
| Stateful PEFT adapter | 0 | 10 | June 5, 2025 |
| Problem with finetuning model whisper | 0 | 85 | November 7, 2024 |
| Zero-shot finetuning a model for translation | 0 | 41 | November 7, 2024 |
| Model type: chatglm - unexpected keyword argument 'padding_side' | 0 | 387 | November 7, 2024 |
| Fully local chatpdf | 0 | 194 | November 7, 2024 |
| LLama2-7b QA gives unwanted characters in text_output during inference | 0 | 9 | November 7, 2024 |
| GPT Memory Structure Experiment — How Did GPT Recognize Me Without Any Stored Memory? | 0 | 24 | June 4, 2025 |
| A lightweight utility for training multiple Keras models in parallel and comparing their final loss and last-epoch time | 0 | 7 | June 5, 2025 |
| List out of range when using boundings boxes in object detection | 0 | 20 | November 7, 2024 |
| Creating a custom Multi Task model using a custom config | 0 | 15 | November 7, 2024 |
| HighNoon LLM: Revolutionizing Sequence Processing with Hierarchical Spatial Neural Memory for Scalable and Ethical NLP | 0 | 38 | June 3, 2025 |
| WandB does not log train loss | 0 | 56 | November 7, 2024 |