Feature extraction for regression/classification vs fine-tuning

This is sort of a general question, but I’ve been fine-tuning some models on a regression task using GPU instances on AWS, and I can already see that the cost is going to be rather astronomical. So I’m wondering whether, as a (hopefully) cheaper option, I should just extract the pooled output and run plain old regression models on a distributed architecture like Spark. Does anyone have experience comparing these two options in terms of cost and performance? Thanks!

Hi @thecity2, your dataset must be huge if you’re considering running Spark jobs on the model outputs :exploding_head: .

I’ve never done this exact comparison (Spark vs GPU), but can’t you get a rough estimate by running the fine-tuning vs feature-extraction comparison on a subset of the dataset? That would also tell you whether the accuracy (or whatever metric you’re measuring) is good enough with the feature-based approach - in some cases, I’ve seen massive drops compared to fine-tuning.
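For the feature-based route, a minimal sketch of the second stage might look like the following. The random vectors here are just a stand-in for the model’s pooled outputs (the commented-out `transformers` lines show one common way to get them, assuming a model that exposes `pooler_output`); the sizes and `Ridge` regressor are illustrative assumptions, not a recommendation.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# In practice the features would come from the pretrained model, e.g.:
#   from transformers import AutoTokenizer, AutoModel
#   tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
#   model = AutoModel.from_pretrained("bert-base-uncased")
#   batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
#   features = model(**batch).pooler_output.detach().numpy()
# Here we use random vectors so the sketch runs without a GPU or download.
rng = np.random.default_rng(0)
n_samples, hidden_size = 500, 64  # real pooled outputs are often 768-dim
features = rng.normal(size=(n_samples, hidden_size))

# Synthetic regression target: a linear function of the features plus noise.
true_weights = rng.normal(size=hidden_size)
targets = features @ true_weights + rng.normal(scale=0.1, size=n_samples)

# Plain old (cheap, CPU-friendly) regression on the frozen features.
X_train, X_test, y_train, y_test = train_test_split(
    features, targets, test_size=0.2, random_state=0
)
reg = Ridge(alpha=1.0).fit(X_train, y_train)
print(f"held-out R^2: {r2_score(y_test, reg.predict(X_test)):.3f}")
```

Since the features are computed once and frozen, this second stage is embarrassingly parallel, which is what makes the Spark (or any CPU cluster) option attractive compared to keeping GPUs up for fine-tuning.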
