Cost-Effective LLM for Extracting Web Selectors from E-Commerce HTML

:wave: Hey everyone! I’m looking to train an LLM to automatically detect CSS/XPath selectors for specific elements on e-commerce sites (e.g., product name, price) from the site’s raw HTML.
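To make the task concrete, here is a minimal sketch of the input/output pair I have in mind and a simple check that a predicted selector actually resolves. The snippet, the field names, and the `extract` / `to_training_record` helpers are all hypothetical and only for illustration; it uses Python's standard-library `xml.etree.ElementTree`, which only parses well-formed markup and supports a limited XPath subset, so real pages would likely need a proper HTML parser.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, well-formed product snippet (real pages are far messier).
HTML = """<div class="product">
  <h1 class="product-title">Acme Kettle</h1>
  <span class="price">$24.99</span>
</div>"""

# Hypothetical target output the model would be trained to emit for this page.
TARGET = {
    "product_name": ".//h1[@class='product-title']",
    "price": ".//span[@class='price']",
}


def extract(html: str, selectors: dict) -> dict:
    """Apply each XPath to the snippet and return the matched text (or None)."""
    root = ET.fromstring(html)
    result = {}
    for field, xpath in selectors.items():
        node = root.find(xpath)
        has_text = node is not None and node.text
        result[field] = node.text.strip() if has_text else None
    return result


def to_training_record(html: str, selectors: dict) -> str:
    """Serialize one (HTML, selectors) pair as a JSON line for fine-tuning."""
    return json.dumps({"prompt": html, "completion": json.dumps(selectors)})


if __name__ == "__main__":
    print(extract(HTML, TARGET))
    print(to_training_record(HTML, TARGET))
```

A check like `extract` could also double as an automatic metric: a predicted selector counts as correct if it pulls out the same text as the reference selector.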

GPT-4o handles this task well, but I’m looking for a more cost-effective solution. My main questions are:
:one: Which open-source LLM is best for fine-tuning on this task?
:two: Would full fine-tuning be the best approach, or should I explore prompting or a parameter-efficient method like LoRA instead?

Any recommendations on models, datasets, or training setups would be greatly appreciated! :rocket:
