Are there any recommended models or solutions that can connect click data to object detection?
For example, I have a GUI and I want to know how users are interacting with it.
Would I need to generate a custom dataset from the GUI and then fine-tune something like DETR? Or is there a way to use the click data to automatically generate bounding boxes?
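To make the second option concrete, here is a rough sketch of what I have in mind, assuming the click log is just a list of (x, y) pixel coordinates. The function name `clicks_to_boxes` and the `radius` and `pad` parameters are made up for illustration and are not from any library. It greedily groups nearby clicks and uses each group's padded extent as a candidate box.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def clicks_to_boxes(clicks: List[Point], radius: float = 30.0, pad: float = 5.0) -> List[Box]:
    """Hypothetical helper: group clicks into clusters, return one box per cluster.

    A click joins the first cluster whose running centroid is within `radius`
    pixels; otherwise it starts a new cluster. Both defaults are guesses.
    """
    clusters: List[List[Point]] = []
    for x, y in clicks:
        for cluster in clusters:
            cx = sum(p[0] for p in cluster) / len(cluster)
            cy = sum(p[1] for p in cluster) / len(cluster)
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                cluster.append((x, y))
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([(x, y)])

    boxes: List[Box] = []
    for cluster in clusters:
        xs = [p[0] for p in cluster]
        ys = [p[1] for p in cluster]
        # Pad the extent of the clicks, since clicks rarely reach a widget's edges.
        boxes.append((min(xs) - pad, min(ys) - pad, max(xs) + pad, max(ys) + pad))
    return boxes


if __name__ == "__main__":
    sample_clicks = [(10, 10), (12, 14), (200, 200), (205, 198)]
    for box in clicks_to_boxes(sample_clicks):
        print(box)
```

The obvious limitation is that these boxes only cover where people actually clicked, not the true extent of each GUI element, so I suspect they would be weak labels at best.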
Thanks in advance