Problem
I am developing a LangChain application that I plan to deploy in Docker containers. However, I have run into a specific problem: the memory (conversation state) is shared across all users and devices, regardless of who is interacting with the app. My goal is to start a new LLM agent instance every time the web application is refreshed or reloaded.
Reproduce
To replicate the issue, run the code below:
import gradio as gr
import openai
from fastapi import FastAPI, Request
from fastapi.templating import Jinja2Templates
from fastapi.responses import JSONResponse
from pydantic import BaseModel
import uvicorn

app = FastAPI()
templates = Jinja2Templates(directory="templates")

# Module-level counter: shared by every user and browser tab.
state = 0

def reset_state():
    global state
    state = 0

def get_completion_from_messages(prompt, history):
    global state
    state = state + 1
    model_response = f"State/messages count ---> <b>state = {str(state)}</b>"
    history.append((prompt, model_response))
    # Clear the textbox and return the updated chat history.
    return "", history

with gr.Blocks() as iface:
    chatbot = gr.Chatbot(height = 620)
    msg = gr.Textbox(label = "Prompt")
    btn = gr.Button("Submit")
    clear = gr.ClearButton(components=[msg, chatbot], value="Clear console")
    print('blocks run')
    btn.click(get_completion_from_messages, inputs=[msg, chatbot], outputs=[msg, chatbot])
    msg.submit(get_completion_from_messages, inputs=[msg, chatbot], outputs=[msg, chatbot])

app = gr.mount_gradio_app(app, iface, path="/gradio")

if __name__ == "__main__":
    uvicorn.run(app, host="127.0.0.1", port=8000)
The program runs at http://127.0.0.1:8000/gradio/
The code above is a simplified version. The main concern is the ‘state’ variable, which keeps accumulating as conversations progress. The goal is to reset ‘state’ to zero every time the web page is refreshed or the user clicks the “Clear console” button; to be precise, the reset_state function should run whenever the web page is refreshed.
By addressing this challenge, the application would be able to offer users a fresh LLM agent instance with each refresh, effectively managing memory-related issues.
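To make the goal concrete, here is a minimal sketch of the behaviour I am after, assuming Gradio's per-session gr.State component and the Blocks load event can replace the global variable. The names respond, reset_count and build_ui are hypothetical and not part of my real code, and I have not confirmed that this carries over to a full LLM agent.

```python
def respond(prompt, history, count):
    """Handle one message using a counter that belongs to the current session."""
    count = count + 1
    reply = f"State/messages count ---> <b>state = {count}</b>"
    # Build a new history list instead of mutating the one passed in.
    return "", (history or []) + [(prompt, reply)], count


def reset_count():
    """Return the initial counter value (used on page load and on clear)."""
    return 0


def build_ui():
    # Imported lazily so the two handlers above can be tested without Gradio.
    import gradio as gr

    with gr.Blocks() as iface:
        chatbot = gr.Chatbot(height=620)
        msg = gr.Textbox(label="Prompt")
        btn = gr.Button("Submit")
        clear = gr.ClearButton(components=[msg, chatbot], value="Clear console")
        # Assumption: gr.State keeps a separate value for each browser session.
        count = gr.State(0)

        btn.click(respond, inputs=[msg, chatbot, count], outputs=[msg, chatbot, count])
        msg.submit(respond, inputs=[msg, chatbot, count], outputs=[msg, chatbot, count])
        # Reset the counter when the console is cleared and on every page load.
        clear.click(reset_count, inputs=None, outputs=count)
        iface.load(reset_count, inputs=None, outputs=count)
    return iface
```

With this shape, the object returned by build_ui() would be passed to gr.mount_gradio_app in place of iface, and the counter would travel through the handler's inputs and outputs rather than living in a module-level global.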
I’m seeking advice on how to implement the desired functionality. Any guidance on resolving this matter would be highly appreciated.
Possible Solution
I tried using a JavaScript function that detects when the web app is reloaded, but I could not get it to trigger at http://127.0.0.1:8000/gradio/.
<!DOCTYPE html>
<html>
<head>
    <title></title>
    <script>
        window.onbeforeunload = function () {
            fetch('/refresh', {
                method: 'POST',
                headers: {
                    'Content-Type': 'application/json',
                },
                body: JSON.stringify({ 'refresh': true }), // Use boolean value instead of a string
            });
        };
    </script>
</head>
<body>
</body>
</html>
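For completeness, the script above expects a server-side route at /refresh. A minimal sketch of what that handler could look like follows, assuming the counter is kept in a small dictionary rather than the global variable from the code above. The names counters, handle_refresh and register_refresh_route are hypothetical. Note that a handler like this resets one shared counter, so it would affect every connected user, not only the one who reloaded the page.

```python
# Hypothetical shared store standing in for the global `state` variable.
counters = {"state": 0}


def handle_refresh(payload: dict) -> dict:
    """Reset the counter only when the payload carries refresh == True."""
    if payload.get("refresh") is True:
        counters["state"] = 0
        return {"reset": True, "state": counters["state"]}
    return {"reset": False, "state": counters["state"]}


def register_refresh_route(app):
    """Attach POST /refresh to an existing FastAPI app (assumed argument)."""
    # Imported lazily so handle_refresh can be tested without FastAPI.
    from fastapi import Body

    @app.post("/refresh")
    async def refresh(payload: dict = Body(...)):
        return handle_refresh(payload)

    return refresh
```

Calling register_refresh_route(app) before gr.mount_gradio_app would expose the route on the same FastAPI instance that serves the Gradio page.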