How to ensure a new instance of a Language Model (LLM) agent is created, or simply that a specific function is executed, with every refresh of a web application, as demonstrated in the provided Python code

bill11 · August 26, 2023, 12:08pm

Problem

I am developing a Langchain application and plan to deploy it in Docker containers. However, I’ve encountered a specific challenge: the memory (conversation state) is the same for every user and device, regardless of their individual interactions. My objective is to ensure that a new instance of the Language Model (LLM) agent is created every time the web application is refreshed or reloaded.

Reproduce

To replicate the issue, run the code below:

import gradio as gr
from fastapi import FastAPI
from fastapi.templating import Jinja2Templates
import uvicorn

app = FastAPI()

templates = Jinja2Templates(directory="templates")

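# Module-level counter: shared by every user and every browser tab.
state = 0

def reset_state():
    # Intended to run on page refresh, but nothing calls it yet.
    global state
    state = 0

def get_completion_from_messages(prompt, history):
    # Stand-in for the LLM call: reports how many messages have been seen.
    global state
    state = state + 1
    model_response = f"State/messages count ---> <b>state = {state}</b>"

    history.append((prompt, model_response))
    # Clear the textbox and return the updated chat history.
    return "", history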

with gr.Blocks() as iface:
    chatbot = gr.Chatbot(height = 620)
    msg = gr.Textbox(label = "Prompt")
    btn = gr.Button("Submit")
    clear = gr.ClearButton(components=[msg, chatbot], value="Clear console")
    print('blocks run')

    btn.click(get_completion_from_messages, inputs=[msg, chatbot], outputs=[msg, chatbot])
    msg.submit(get_completion_from_messages, inputs=[msg, chatbot], outputs=[msg, chatbot])

app = gr.mount_gradio_app(app, iface, path="/gradio")

if __name__ == "__main__":
    uvicorn.run(app, host="127.0.0.1", port=8000)

The program runs at http://127.0.0.1:8000/gradio/

The code above is a simplified version. The primary concern is the ‘state’ variable, which keeps increasing as conversations progress. The goal is to reset ‘state’ to zero every time the web page is refreshed or the user clicks the “Clear console” button; to be precise, the reset_state function should run whenever the web page is refreshed.
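For illustration, here is a minimal sketch of one possible approach. It assumes Gradio's gr.State component (which keeps a separate value per browser session) and the load event of gr.Blocks (which fires when the page is loaded in the browser). The names respond, reset_count and build_demo are hypothetical and not part of the code above, and the (prompt, reply) tuple history follows the format used in the reproduction code. This has not been tested against the real application.

```python
def respond(prompt, history, count):
    """Stand-in for the LLM call; the counter arrives as an argument, not a global."""
    count = count + 1
    reply = f"State/messages count ---> <b>state = {count}</b>"
    history = history + [(prompt, reply)]
    # Clear the textbox, update the chat, and hand the new counter back.
    return "", history, count


def reset_count():
    """Value written to the per-session counter on page load or clear."""
    return 0


def build_demo():
    # Imported lazily so the two handlers above can be exercised without Gradio.
    import gradio as gr

    with gr.Blocks() as demo:
        count = gr.State(0)  # assumption: one independent copy per browser session
        chatbot = gr.Chatbot(height=620)
        msg = gr.Textbox(label="Prompt")
        btn = gr.Button("Submit")
        clear = gr.ClearButton(components=[msg, chatbot], value="Clear console")

        io = dict(inputs=[msg, chatbot, count], outputs=[msg, chatbot, count])
        btn.click(respond, **io)
        msg.submit(respond, **io)

        # Reset the counter when "Clear console" is clicked ...
        clear.click(reset_count, outputs=count)
        # ... and whenever the page is (re)loaded in the browser.
        demo.load(reset_count, outputs=count)
    return demo
```

The result of build_demo() could then be passed to gr.mount_gradio_app in place of iface. The same pattern might hold an agent object instead of an integer, provided the object can be stored in gr.State.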

By addressing this challenge, the application would be able to offer users a fresh LLM agent instance with each refresh, effectively managing memory-related issues.

I’m seeking advice on how to implement the desired functionality. Any guidance on resolving this matter would be highly appreciated.

Possible Solution

I tried using a JavaScript function that detects when the web app is reloaded, but I have not been able to get it to trigger at http://127.0.0.1:8000/gradio/.

<!DOCTYPE html>
<html>
<head>
    <title></title>
    <script>
        window.onbeforeunload = function () {
            fetch('/refresh', {
                method: 'POST',
                headers: {
                    'Content-Type': 'application/json',
                },
                body: JSON.stringify({ 'refresh': true }), // Use boolean value instead of a string
            });
        };
    </script>
</head>
<body>
</body>
</html>
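The script above posts to /refresh, but the Python code shown earlier does not define that route. A minimal sketch of what the server side might look like follows. It assumes the same module-level state counter as the reproduction code; create_app is a hypothetical helper, and reset_state returns the new value here only to make it easy to check. Note that this resets the counter for every user at once, and, as far as the code above shows, no route serves this HTML page, so the script would not be part of the page Gradio renders at /gradio.

```python
state = 0  # same module-level counter as in the reproduction code


def reset_state():
    """Reset the shared counter and return its new value."""
    global state
    state = 0
    return state


def create_app():
    # Imported lazily so reset_state can be exercised without FastAPI installed.
    from fastapi import FastAPI

    app = FastAPI()

    @app.post("/refresh")
    async def refresh():
        # Called by the fetch('/refresh', {method: 'POST', ...}) in the script above.
        return {"reset": True, "state": reset_state()}

    return app
```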
