Monitoring ML and LLM models in production for drift, trust, and safety

Hi all —

I wanted to share a quick look at what we’ve been working on at InsightFinder AI, and get feedback from anyone who’s solving similar problems.

Over the past year we’ve seen more teams deploying not just traditional ML models but also LLMs into production systems. The same issues keep coming up:

  • Data drift and model drift (input and output) degrading performance over time (see the sketch after this list for the kind of check we mean).
  • No clear way to evaluate LLM outputs for hallucinations, bias, or sensitive data leakage.
  • Trouble pinpointing why something went wrong when anomalies happen.
  • Lack of visibility into costs and performance metrics across models.
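
To be concrete about the drift point, here’s a minimal, generic sketch (not our actual implementation): compare a production sample of one feature against a training-time reference sample with a two-sample Kolmogorov–Smirnov test. All values and thresholds below are synthetic.

```python
# Minimal, generic input-drift check: flag a feature whose live distribution
# has shifted away from its training-time reference distribution.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
reference = rng.normal(loc=0.0, scale=1.0, size=5_000)   # stand-in for training-time feature values
production = rng.normal(loc=0.3, scale=1.1, size=5_000)  # stand-in for the last day of live traffic

stat, p_value = ks_2samp(reference, production)
if p_value < 0.01:  # alert threshold is arbitrary; tune per feature
    print(f"Possible drift: KS statistic={stat:.3f}, p={p_value:.2e}")
else:
    print("No significant drift detected")
```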

We set out to solve these issues, and this demo walks through the current version:
:link: (https://youtu.be/7aPwvO94fXg)

Main highlights:

We built a platform that tries to address those by making it easy to:

:white_check_mark: Onboard a model with its metadata and data sources (Snowflake, Elastic, etc.).
:white_check_mark: Set up monitors for specific use cases (data quality, drift, hallucinations, etc.); a toy example of the kind of output check involved is sketched after this list.
:white_check_mark: Dig into issues with a diagnostic “workbench” for root cause analysis.
:white_check_mark: See dashboards of costs, failed evaluations, and overall model health.
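
To give a flavor of what one of those output monitors looks for (purely illustrative, not our actual implementation), here’s a toy scan of an LLM response for patterns that resemble sensitive data. The patterns and the flag_response helper are hypothetical.

```python
# Toy output check: scan an LLM response for strings that look like
# sensitive data before it reaches users.
import re

SENSITIVE_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def flag_response(text: str) -> list[str]:
    """Return the names of any sensitive-data patterns found in the text."""
    return [name for name, pattern in SENSITIVE_PATTERNS.items() if pattern.search(text)]

hits = flag_response("Sure, you can reach John at john.doe@example.com.")
print(hits)  # ['email']
```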

We’re still actively improving it, and it’d be really helpful to hear what you’d want to see or what doesn’t resonate. Happy to answer questions about how it works, or to share more details about the underlying implementation if anyone’s curious.

Thanks for taking a look.