How to pre-train or fine-tune an LLM on a structured dataset so that it can reason about the relationships between data objects

I want to train an LLM on a structured dataset, such as a database with multiple tables. Here is a simple example:

id name
1 Jason
2 Eric
3 David


id title
1 Gone with The Wind
2 Brave New World
3 Native Son


id name
1 The Book Nook
2 The Literary Loft
3 Wordsmith Books


user_id bookstore_id book_id
1 2 1
1 2 2
2 3 3

After training, the LLM should be able to reason over this relational database. For example, when I ask:
"How many books did Jason buy at The Literary Loft bookstore?" it should answer: "2 books, Gone with The Wind and Brave New World."

How can I prepare the corpus from the database and train the LLM?

I have searched for solutions, such as asking the LLM to translate prompts into SQL queries, but that's not what I want.

What about table question answering? See "What is Table Question Answering?" on Hugging Face.

Table Question Answering seems to handle only one table and can't reason about relationships between multiple tables. Are there any other suggestions?

There is at least one model for multi-table QA. Will this work for your data?
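
If you do go the fine-tuning route instead, one possible way to prepare a corpus is to resolve the foreign keys yourself and write out question/answer pairs as text. The following is only a minimal sketch under several assumptions: the table names (users, books, bookstores, purchases) are guesses, since the post does not name the tables, and the prompt/completion JSON layout is just one commonly used convention, not something a particular trainer requires.

```python
import json
from collections import defaultdict

# Example tables from the question. The table names (users, books,
# bookstores, purchases) are assumed; the post does not name them.
users = {1: "Jason", 2: "Eric", 3: "David"}
books = {1: "Gone with The Wind", 2: "Brave New World", 3: "Native Son"}
bookstores = {1: "The Book Nook", 2: "The Literary Loft", 3: "Wordsmith Books"}
purchases = [(1, 2, 1), (1, 2, 2), (2, 3, 3)]  # (user_id, bookstore_id, book_id)


def build_qa_pairs():
    """Resolve foreign keys and emit one Q/A pair per (user, bookstore)."""
    # Group purchased book titles by the (user, bookstore) they belong to.
    grouped = defaultdict(list)
    for user_id, bookstore_id, book_id in purchases:
        grouped[(user_id, bookstore_id)].append(books[book_id])

    pairs = []
    for (user_id, bookstore_id), titles in grouped.items():
        question = (
            f"How many books did {users[user_id]} buy at "
            f"{bookstores[bookstore_id]}?"
        )
        noun = "book" if len(titles) == 1 else "books"
        answer = f"{len(titles)} {noun}: {', '.join(titles)}."
        # The prompt/completion keys are a hypothetical layout for illustration.
        pairs.append({"prompt": question, "completion": answer})
    return pairs


qa_pairs = build_qa_pairs()
for pair in qa_pairs:
    print(json.dumps(pair))  # one JSON object per line (JSONL style)
```

A real corpus would likely need many more question templates (per user, per book, per store, negative cases such as "David bought nothing") for the model to generalize. Note also that a model trained this way memorizes a snapshot of the data, so it would need retraining whenever the database changes.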