I’m looking for advice on how to efficiently set up a workflow in which a low-parameter model, running locally, redacts sensitive information from user queries before they are passed to ChatGPT. The system would then use the redacted query, together with ChatGPT’s response, as high-quality context to construct the final answer to the user’s original query.
My goals are:
- Efficient Local Processing: Use a lightweight model that can run locally to identify and redact sensitive data in user inputs, ideally requiring only a CPU.
- Seamless Redaction and Response Integration: Store the redacted data securely and reintegrate it with the ChatGPT response to produce a coherent final answer.
- Privacy and Security: Ensure that sensitive data is not exposed during the process.
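To make the flow concrete, here is a minimal sketch of the local redaction step. A simple regex stands in for the lightweight model (in practice I’d hope to swap in a CPU-friendly NER model, e.g. a Hugging Face token-classification pipeline); the point is that the placeholder-to-value mapping never leaves the local machine. All names and patterns here are illustrative:

```python
import re

# Regex patterns stand in for a real NER/PII model; a Hugging Face
# token-classification pipeline would slot in at the same point.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.\w+"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text: str) -> tuple[str, dict[str, str]]:
    """Replace each detected PII span with a placeholder.

    Returns the redacted text plus a placeholder -> original-value
    mapping, which is kept locally and never sent to the remote API.
    """
    mapping: dict[str, str] = {}
    for label, pattern in PII_PATTERNS.items():
        for i, match in enumerate(pattern.findall(text)):
            placeholder = f"[{label}_{i}]"
            mapping[placeholder] = match
            text = text.replace(match, placeholder)
    return text, mapping

query = "Email jane@example.com or call 555-123-4567."
safe_query, mapping = redact(query)
# safe_query -> "Email [EMAIL_0] or call [PHONE_0]."
# Only safe_query is sent to ChatGPT; mapping stays local.
```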
Key Questions:
- What would be a good starting point for identifying lightweight models that excel in entity recognition or sensitive data redaction?
- Are there Hugging Face models or pipelines pre-trained for tasks like PII (Personally Identifiable Information) redaction?
- What are some best practices for integrating redaction and response-generation workflows efficiently?
- How can I ensure that sensitive data storage is secure and its reintegration into final outputs is seamless?
- Are there frameworks or tools within the Hugging Face ecosystem that can simplify this setup?
Any pointers to relevant models, libraries, or architectural patterns would be highly appreciated. As a beginner in this space, I’m particularly interested in practical examples, tutorials, or code snippets to get started.
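For reference, here is a rough sketch of the reintegration step I have in mind: placeholders in the remote model’s reply are swapped back for the original values held in the local mapping. The mapping and reply text below are made up for illustration:

```python
def reintegrate(response: str, mapping: dict[str, str]) -> str:
    """Substitute the original sensitive values back into the
    remote model's response before showing it to the user."""
    for placeholder, original in mapping.items():
        response = response.replace(placeholder, original)
    return response

# Mapping produced earlier by the local redaction step (illustrative).
mapping = {"[EMAIL_0]": "jane@example.com", "[PHONE_0]": "555-123-4567"}
reply = "Draft sent to [EMAIL_0]; follow up at [PHONE_0]."
final = reintegrate(reply, mapping)
# final -> "Draft sent to jane@example.com; follow up at 555-123-4567."
```

I’m unsure whether simple string substitution like this is robust enough (e.g. if the remote model rewrites or drops a placeholder), which is part of why I’m asking about best practices.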