The crux of the question: is there a way to train a sequence generation model to have some degree of numerical inference?
Problem setup: a user is creating an advertisement campaign described by a set of attributes (e.g., the campaign should target individuals in San Francisco). The user provides two numerical values as constraints on the campaign, and the model generates a sequence of attributes describing a campaign.
Initial approach: the naive approach is to convert the numerical inputs into string representations (2.0 → two point zero). A custom tokenizer is trained on the dataset. After the number-to-string conversion, the numerical values, along with the target sequence, are used to fine-tune OpenAI's GPT-2 model.
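For concreteness, a minimal sketch of the conversion step, assuming a digit-by-digit spelling scheme (the post only gives the single example 2.0 → two point zero, so the exact scheme is an assumption):

```python
# Assumed digit-by-digit number-to-string conversion; the original post
# only specifies one example (2.0 -> "two point zero").
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
    ".": "point",
}

def number_to_words(value: float) -> str:
    """Spell out a numeric value digit by digit, e.g. 2.0 -> 'two point zero'."""
    return " ".join(DIGIT_WORDS[ch] for ch in str(value))
```

Note that under this scheme, 2.0 and 2.1 differ only in their final word, yet after tokenization the model has no built-in notion that the two strings denote nearby quantities.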
Results: as expected, the model does very well at generating campaigns that make sense; however, the campaigns do not make sense in the context of the numerical inputs. Numerical inputs of 2.0 and 2.1 yield very different campaigns when, in fact, they should yield similar sequences of attributes.
Has there been work done on passing numerical information through a transformer model?
Example data: Given a budget of 1200 and a CPM target of 2.1, we recommend the following targets. Channel: mobile, connected tv; Location: US; … (and so on).
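To illustrate, one plausible serialization of a training example matching the sample data above (the template, field names, and separator are assumptions; the post does not specify the exact fine-tuning format):

```python
# Hypothetical serialization of one fine-tuning example. The spelled-out
# numbers follow the conversion step described earlier; the "=>" separator
# and attribute ordering are illustrative assumptions.
def build_example(budget: str, cpm: str, targets: dict) -> str:
    attrs = "; ".join(f"{key}: {value}" for key, value in targets.items())
    return f"Budget: {budget}, CPM target: {cpm} => {attrs}"

example = build_example(
    "one two zero zero",  # 1200, spelled digit by digit
    "two point one",      # 2.1
    {"Channel": "mobile, connected tv", "Location": "US"},
)
```

The model is then trained to generate everything after the `=>` conditioned on the text before it, which is where the lack of numerical grounding shows up.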