Conversational Budget Analytics

I recently had an idea that seems like it could be useful, and I would like to share it with the Hugging Face R&D Community.

Last year, I did some volunteer work aimed at encouraging AI-enhanced budget navigation and analytics. One thought was that the general public, accountants, and auditors could each navigate public-sector budgetary data conversationally using dialogue systems or chatbots. Furthermore, these dialogue systems could be multimodal, producing data visualizations and analytics alongside their natural-language responses.

More recently, large language models and chatbots have become quite popular. Contemporary dialogue systems can answer natural-language questions while citing the documents that their answers draw on. What about dialogue systems that could answer questions about large-scale budgets, spreadsheets, tables, and other database data?

Approaches to connecting dialogue systems to budgetary data include, but are not limited to:

  1. Software data adapters.
  2. Automatically generating abundant “virtual documents” with “pass-through data provenance”, meaning that each document can be traced back to the data resources used to generate it.

Expanding on point 2, the idea that I would like to share today is that, for large-scale budgetary datasets, software tools could generate a very large number of “virtual documents”, each of which uses natural language (and perhaps multimodal data visualizations) to answer an automatically generated question.
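
To make this more concrete, here is a minimal sketch of how such generation might look, assuming the budget data is already available as rows of a table. Everything named here (the `VirtualDocument` and `Provenance` classes, the `generate_virtual_documents` function, the file name `budget_2023.csv`, and the sample figures) is hypothetical and only for illustration. A real tool would need many more question templates and richer provenance.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Provenance:
    """Pointer back to the backing data (hypothetical structure)."""
    source: str   # file or table the value came from
    row: int      # zero-based row index within that source
    column: str   # column holding the cited value


@dataclass(frozen=True)
class VirtualDocument:
    """One generated question/answer pair plus where its facts came from."""
    doc_id: str
    question: str
    answer: str
    provenance: Tuple[Provenance, ...]


# Made-up sample rows; not real budget figures.
BUDGET_ROWS = [
    {"agency": "Parks", "year": 2023, "amount": 1200000},
    {"agency": "Transit", "year": 2023, "amount": 5400000},
]


def generate_virtual_documents(rows: List[Dict], source: str) -> List[VirtualDocument]:
    """Generate one templated question/answer document per budget row."""
    docs = []
    for i, row in enumerate(rows):
        question = f"How much was budgeted for {row['agency']} in {row['year']}?"
        answer = f"{row['agency']} was budgeted ${row['amount']:,} in {row['year']}."
        docs.append(
            VirtualDocument(
                doc_id=f"vdoc-{i}",
                question=question,
                answer=answer,
                # Record exactly which cell the answer was derived from.
                provenance=(Provenance(source, i, "amount"),),
            )
        )
    return docs


docs = generate_virtual_documents(BUDGET_ROWS, "budget_2023.csv")
for doc in docs:
    print(doc.doc_id, "|", doc.question, "|", doc.answer)
```

The point of the sketch is only that each generated document carries its provenance with it, so that anything trained on or retrieving from these documents has something to point back to.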

Large language models could then be trained on large-scale corpora of these “virtual documents”. When providing sources, the models could dereference or redirect through the “virtual documents” back to the actual data (spreadsheets, tables, budget-related files). In this way, answers accompanied by hyperlinks would refer end-users to the actual data by way of the “virtual documents”.

That is, the hyperlinks accompanying an answer would “pass through” or “redirect through” the “virtual documents” (which could be, but needn’t be, stored after training), allowing end-users to interact with budgetary data conversationally and to navigate into (views of) the backing data.
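
As a rough sketch of the “pass-through” step, assume that each virtual document ID is kept in an index alongside the source, row, and column it was generated from. A citation to a virtual document can then be resolved to the underlying cell without the document text itself being stored. The index layout, the `resolve_citation` function, the file name, the sample figures, and the `#row=…&col=…` link format below are all hypothetical.

```python
import csv
import io

# Stand-in for a real budget file; the figures are made up.
BUDGET_CSV = """agency,year,amount
Parks,2023,1200000
Transit,2023,5400000
"""

# Hypothetical store of backing data, keyed by source name.
SOURCES = {"budget_2023.csv": BUDGET_CSV}

# Hypothetical provenance index: virtual document ID -> (source, row, column).
# Only this index needs to be kept; the virtual documents themselves
# could be discarded after training.
PROVENANCE_INDEX = {
    "vdoc-0": ("budget_2023.csv", 0, "amount"),
    "vdoc-1": ("budget_2023.csv", 1, "amount"),
}


def resolve_citation(doc_id: str) -> dict:
    """Follow a virtual-document citation back to the backing data cell."""
    source, row_index, column = PROVENANCE_INDEX[doc_id]
    rows = list(csv.DictReader(io.StringIO(SOURCES[source])))
    row = rows[row_index]
    return {
        "source": source,
        "row": row_index,
        "column": column,
        "value": row[column],
        # Illustrative deep link into a view of the backing data.
        "link": f"{source}#row={row_index}&col={column}",
    }


print(resolve_citation("vdoc-1"))
```

In a real system the link would presumably point at a Web view of the spreadsheet or table rather than at a fragment on a file name, but the redirection logic would be much the same.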

I wanted to broach these topics with the Hugging Face R&D Community and would be very much interested in discussing these and any other ideas towards delivering conversational budget analytics to end-users. Thank you.

To clarify, the pertinent technologies include: (1) AI-enhanced business intelligence for public-sector accountants, auditors, analysts, and comptrollers, and (2) AI-enhanced Web-based UI/UX that lets the public, journalists, and government watchdog organizations better access and interact with the same data.

Today, in the United States of America, relevant websites include, but are not limited to: data.gov, usaspending.gov, and performance.gov.

Also, there has been an exciting development since I wrote the earlier post. Here is an example of the new state of the art, Copilot for Excel: https://www.youtube.com/watch?v=I-waFp6rLc0