Any numbers-to-text example?

thehemen · May 8, 2022, 5:23pm

Hi.

I have a table of integer customer parameters (age, activity, postal code) and, for each customer, a list of text descriptions of articles they would like to buy.

So, is there any example that generates some text based on numerical information?

I’ve learned about text, audio and image information retrieval, but that’s not what I’m looking for. I’m also familiar with the Table Question Answering topic. It’s actually used to select specific columns of a table, not to answer a general question.

Thanks for the answers if any.

merve · May 8, 2022, 10:08pm

Hello

Are these texts fixed, or do they have to be generated? If they have to be generated, maybe you can encode the tabular information and decode the text conditioned on the attributes entered. Encoder-decoder RNN models used to do essentially this: the encoder produces an encoded representation and passes it to the decoder, which decodes it into text, so you can train the model on paired data. (The encoder doesn’t necessarily have to be an RNN, given that you don’t have sequential input.) This is just the intuition I get from your use case.

thehemen · May 9, 2022, 4:36am

Thanks for your answer. I’ll try this approach…

The output texts differ, but sometimes they are almost identical, with small differences in color, pattern, or material.

merve · May 9, 2022, 9:54am

I thought of modelling this as a different process.
For example, you could extract the features from the text separately, with color as a column, pattern as a column, and material as a column, and then formulate it as any tabular data problem. Decoding can be quite random, so I’d avoid generating text if I can just classify attributes.
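
A minimal sketch of the attribute-extraction idea described above, assuming a simple keyword lookup is good enough. The vocabularies, the function name `extract_attributes`, and the sample descriptions are all hypothetical; a real catalogue would need larger lists or a trained tagger.

```python
import re

# Hypothetical vocabularies; a real catalogue would need far larger lists.
ATTRIBUTE_VOCAB = {
    "color": ["black", "white", "red", "blue", "green"],
    "pattern": ["striped", "floral", "checked", "solid"],
    "material": ["cotton", "wool", "denim", "linen"],
}


def extract_attributes(description):
    """Return the first vocabulary word found for each attribute, or None."""
    tokens = set(re.findall(r"[a-z]+", description.lower()))
    row = {}
    for attribute, words in ATTRIBUTE_VOCAB.items():
        row[attribute] = next((w for w in words if w in tokens), None)
    return row


# Made-up descriptions standing in for the article texts.
descriptions = [
    "Striped cotton shirt in blue",
    "Black wool sweater",
]
rows = [extract_attributes(d) for d in descriptions]
for description, row in zip(descriptions, rows):
    print(description, "->", row)
```

Each resulting column (color, pattern, material) can then be treated as a separate classification target predicted from the customer features.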

prithivida · May 9, 2022, 4:02pm

@thehemen - You have to share more details on your use case, dataset & your goal. With the current info here are some thoughts.

Frame1: Generating text from structured data (numeric or not) is very possible; it is a task called “data to text generation”. But please keep in mind that it is used to directly translate structured data into fluent NL text, NOT exactly what you are asking for. Here is a good code example. To get a feel for what “data to text” can do, look at the gifs in the above post. If you are sure of your problem frame, perhaps try reframing it as “data to text”? But we are not sure about your goals.
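
Separately from the example linked above, here is a minimal sketch of one common way data-to-text setups prepare their input: each table row is flattened into a string that a sequence-to-sequence model can read. The `key: value | key: value` format, the function name `linearize_row`, and the sample values are assumptions for illustration only.

```python
def linearize_row(row):
    """Flatten one table row into a 'key: value | key: value' string."""
    return " | ".join(f"{key}: {value}" for key, value in row.items())


# Hypothetical customer row and target description.
customer = {"age": 34, "activity": 2, "postal_code": 10115}
target = "blue striped cotton shirt"

# One (source, target) pair of the kind a seq2seq model could be trained on.
training_pair = (linearize_row(customer), target)
print(training_pair)
```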

Frame2: Recommendation. But you need customer features and article features along with explicit or implicit feedback. Then you can do matrix factorization. Given a user, the recommendation model returns the top-N article ids, and from each article id you can always retrieve the text description from a lookup.
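
A rough sketch of the recommendation frame described above, assuming implicit feedback (purchase counts) and a tiny dense matrix. It uses NumPy only and a plain alternating-least-squares factorization with confidence weights; the interaction matrix, the `ARTICLES` lookup, and all hyperparameters are made up for illustration and are not tuned.

```python
import numpy as np


def als_implicit(R, factors=2, alpha=40.0, reg=0.1, iters=10, seed=0):
    """Factorize an implicit-feedback matrix R (users x items) with ALS."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = (R > 0).astype(float)      # binary preference
    C = 1.0 + alpha * R            # confidence in each observation
    X = rng.normal(scale=0.1, size=(n_users, factors))
    Y = rng.normal(scale=0.1, size=(n_items, factors))
    ridge = reg * np.eye(factors)
    for _ in range(iters):
        # Solve a weighted ridge regression per user, items held fixed.
        for u in range(n_users):
            weighted = Y.T * C[u]
            X[u] = np.linalg.solve(weighted @ Y + ridge, weighted @ P[u])
        # Solve a weighted ridge regression per item, users held fixed.
        for i in range(n_items):
            weighted = X.T * C[:, i]
            Y[i] = np.linalg.solve(weighted @ X + ridge, weighted @ P[:, i])
    return X, Y


def recommend(user, R, X, Y, n=2):
    """Return the ids of the n best-scoring items the user has not bought."""
    scores = Y @ X[user]
    scores[R[user] > 0] = -np.inf  # never recommend already-bought items
    return [int(i) for i in np.argsort(-scores)[:n]]


# Hypothetical purchases: rows are customers, columns are article ids.
R = np.array([
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
], dtype=float)

# Hypothetical lookup from article id to text description.
ARTICLES = {
    0: "blue striped cotton shirt",
    1: "black wool sweater",
    2: "white linen shirt",
    3: "red floral dress",
    4: "green denim jacket",
    5: "checked wool skirt",
}

X, Y = als_implicit(R)
top = recommend(0, R, X, Y, n=2)
print([ARTICLES[i] for i in top])
```

The model only ever outputs article ids; the text comes from the lookup, so nothing has to be generated.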

Again, you can get better frames/solutions from the community if you share more information on the use case.

prithivida · May 9, 2022, 4:06pm


I wrote a “text to number” generator using a plain LSTM-based encoder-decoder a few years ago; you could reverse it for your needs if you’d like.


thehemen · May 9, 2022, 4:58pm

OK, thanks.
You’re right, it’s about building a recommendation system. AFAIK, the usual solution is to use boosting like LGBMRanker. But it requires a huge amount of memory, so I’ve decided to search for deep learning techniques.
Here is the complete description: H&M Personalized Fashion Recommendations | Kaggle
The competition is almost over now.

prithivida · May 9, 2022, 6:06pm

Dude Sure - Before the solution, a couple of things, please read through.

First - Respect the time of the community, we are all well-meaning people and want to help you, you have taken a problem statement and presented your own frame of it without adequate information. This at best receives no or bad answers and at worst wastes our time. I can totally understand if for some reason you didn’t want to mention the competition or the fact that you are participating, but please don’t make us guess what you are trying to solve. Make it easy, write the problem statements as-is, and share your attempts and NOT the way you think the problem can be solved.

Second - Because of your frame you might have thought this topic is pertinent to HF, but AFAIK it is not pertinent to HF as of this post (maybe in the future). The exception is that you could extract customer features and article features using some DL models, but that’s only needed when you have to index and do ANN search, not to build a recommendation system.

Third - Boosting like LGBMRanker is not the only way to build recommenders. Take a look at the implicit library on GitHub; you can choose ALS (recommended) or LSA-style MF techniques for building the model. While serving, you could use ANN algorithms like NMSLIB’s HNSW, ANNOY, or FAISS (built into implicit) to get ANNs.
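
For the serving step mentioned above, a small sketch of the exact search that ANN indexes such as HNSW, ANNOY, or FAISS approximate. It does a brute-force cosine top-k with NumPy rather than calling any of those libraries; the random `item_vectors` stand in for learned item factors, and the function name `top_k_similar` is hypothetical.

```python
import numpy as np


def top_k_similar(query, item_vectors, k=3):
    """Exact cosine top-k; ANN indexes return an approximation of this faster."""
    q = query / np.linalg.norm(query)
    items = item_vectors / np.linalg.norm(item_vectors, axis=1, keepdims=True)
    scores = items @ q
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]


rng = np.random.default_rng(0)
item_vectors = rng.normal(size=(1000, 16))  # stand-in for learned item factors
query = item_vectors[42]                    # look for items similar to item 42
neighbours = top_k_similar(query, item_vectors, k=3)
print(neighbours)
```

Brute force scans every item per query, which is fine for small catalogues; an ANN index trades a little accuracy for much faster lookups on large ones.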

Wish you the best.

thehemen · May 9, 2022, 7:32pm

I’m sorry. That was a mistake. I’d better not stay here…

merve · May 11, 2022, 2:06pm

Hello Prithiviraj,

While we appreciate your help in our Forum, we do not tolerate the attitude you’ve shown towards the community member. Hugging Face is a place where we want to enable free discussion, so there was actually no problem with what @thehemen was stating. It’s also fine to have other discussions that are not within Hugging Face but asking for advice, as we see Hugging Face as home for machine learning where we have more libraries and domains, so we don’t see this as out of scope.

prithivida · May 11, 2022, 2:36pm

@thehemen @merve - Rank Apology. My intent wasn’t to make the original author feel bad about the line of questioning. I saw they have recently joined (if I am not wrong), so I genuinely shared, in a respectful way, how a better frame of question can elicit better answers. I wasn’t pointing out that it’s out of scope; I just said that the wrong frame can conflate the scope of the solution. I fully agree the forum can have questions from a vast scope; in that spirit I did help them with the right frame and possible answers they can pursue.

Thank you!

merve · May 11, 2022, 2:50pm

Hello again,

It wasn’t about the content of the post you were sharing in general but more about the attitude and the tone used to warn someone else in the community. We do not want to spread negativity and would like to discourage this language in general to keep the community as a place where people feel welcome and do not shy away from asking questions.

All the best,
Merve

prithivida · May 11, 2022, 3:20pm

Apology again, if I came across like that, but that wasn’t my intention. I promise to work with you all to make it better and contribute better.
