I am interested in using HuggingFace models & datasets for a Reinforcement Learning use case. For my purposes, I would need to implement a replay buffer.
I considered using HF Datasets because it (1) couples easily with HF models and (2) is efficient thanks to zero-copy reads via memory-mapping the whole dataset. However, I do not see any functionality for (efficiently) augmenting the dataset. Does this functionality exist?
Additionally, I need the other replay-buffer functionality: sampling based on priorities, unloading the buffer, etc.
Do you think I should customize HF Datasets for my use case, or would it be better to couple some other replay buffer (e.g. RLlib, Stable Baselines) with HF models?
Hi Blazej – I agree. Are there any structural blockers to using Datasets for this? I guess the challenge is how to handle the FIFO/LIFO nature of a replay buffer. I wonder if it would be interesting to just keep all of the data and have a wrapper that only exposes the N most recent items.
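Very roughly, I'm picturing something like the sketch below (the wrapper, its names, and the schema are made up for illustration; I haven't checked whether `add_item`/`select` stay cheap at replay-buffer scale):

```python
from datasets import Dataset

# Rough idea: keep appending to the underlying Dataset, but only ever
# expose the N most recent rows, emulating a bounded FIFO buffer.
# (Hypothetical wrapper, not an existing datasets feature.)
class RecentWindow:
    def __init__(self, dataset: Dataset, max_size: int):
        self.dataset = dataset
        self.max_size = max_size

    def add(self, item: dict) -> None:
        # add_item returns a new Dataset with the row appended
        self.dataset = self.dataset.add_item(item)

    def view(self) -> Dataset:
        # Only the last `max_size` rows count as the "live" buffer
        start = max(0, len(self.dataset) - self.max_size)
        return self.dataset.select(range(start, len(self.dataset)))
```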
What do you mean by augmenting?
I think there are some discussions around this with another collaborator; let me follow up internally on this too.
Hi Nathan, thanks for the response.
Indeed, FIFO/LIFO sampling and removal is functionality that I need. Additionally, sampling proportional to an item's priority is desired. Would something like this be possible with Datasets while retaining its efficiency?
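Concretely, I imagine something along these lines (a naive sketch, keeping the priorities in a side NumPy array; whether this stays efficient on a memory-mapped dataset is exactly my question):

```python
import numpy as np
from datasets import Dataset

def sample_by_priority(buffer: Dataset, priorities: np.ndarray, batch_size: int) -> Dataset:
    # Draw indices with probability proportional to each item's priority
    probs = priorities / priorities.sum()
    idx = np.random.choice(len(buffer), size=batch_size, p=probs)
    return buffer.select(idx.tolist())
```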
By augmenting the dataset, I mean adding new items to the buffer (in an efficient manner).
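For example, something like this (just to illustrate what I mean, with a made-up transition schema; I don't know whether doing this many times inside a training loop keeps the zero-copy / memory-mapping benefits):

```python
from datasets import Dataset, concatenate_datasets

# Start the buffer with a single dummy transition so the features are defined
buffer = Dataset.from_dict({"obs": [[0.0, 0.0]], "action": [0], "reward": [0.0]})

# Append a single transition; add_item returns a new Dataset
buffer = buffer.add_item({"obs": [0.1, 0.2], "action": 1, "reward": 0.5})

# Append a whole batch of transitions at once
batch = Dataset.from_dict({"obs": [[0.3, 0.4]], "action": [1], "reward": [1.0]})
buffer = concatenate_datasets([buffer, batch])
```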
Having such functionality would definitely push HF forward as a place for RL experiments.