Noob asking for code review and advice: LangChain and translation with TowerInstruct and Python

Complete noob in AI, deep learning, machine learning, and everything with “intelligent something” in it.

I would love some advice to start understanding how this works, and to understand my mistakes.

I started to write code for a very simple task:

  • I have a text file in Spanish (but Spanish is not important), and there is not necessarily a relationship between the lines, meaning that for now I do not need to handle context (maybe later!)

  • I read it line by line

  • I build a prompt asking a TowerInstruct model to translate it

  • Then I print the result.

To be honest, the model's behavior seems very strange to me. At first it works (the first few lines), but after a few lines it starts adding text of its own, such as "The translation you entered is as follows: ", “Translation in English” or "Spanish: ". I tried adding a system prompt, without significant success.
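
One thing I have been wondering about, but have not properly tested, is whether passing stop sequences to LlamaCpp would cut off that extra text. A minimal sketch of what I mean (it reuses the LlamaCpp setup from the code below; the stop values are just a guess on my part):

# Untested variant of the LlamaCpp setup from the code below:
# ask llama.cpp to stop generating at a newline or at the ChatML
# end-of-turn token, so the model cannot ramble past the translation.
LLM = LlamaCpp(
	model_path=MODEL,
	temperature=0.5,
	max_tokens=500,
	top_p=1,
	stop=["\n", "<|im_end|>"],  # these stop sequences are my guess, not tested
	callback_manager=CALLBACK_MANAGER,
	verbose=False,
)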

Here is my dumb code. Any comment would be very helpful to me!

import sys
import os
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.prompts import PromptTemplate
from langchain_community.llms import LlamaCpp

# Path to the local TowerInstruct GGUF model
MODEL = "/home/dani/AI-models/towerinstruct-7b-v0.1.Q8_0.gguf"

# ChatML-style prompt template expected by TowerInstruct
TEMPLATE = """
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
"""

PROMPT = PromptTemplate(
	input_variables=["prompt", "system_message"],
	template=TEMPLATE,
)
SYSTEM_MESSAGE = ""
# Stream tokens to stdout as they are generated
CALLBACK_MANAGER = CallbackManager([StreamingStdOutCallbackHandler()])
# Local llama.cpp wrapper around the GGUF model
LLM = LlamaCpp(
	model_path=MODEL,
	temperature=0.5,
	max_tokens=500,
	top_p=1,
	callback_manager=CALLBACK_MANAGER,
	verbose=False,
)

def prompt_tr(txt, in_lang='Spanish', out_lang='English'):
	"""Build the translation instruction that goes into the user turn."""
	return "Translate the following text from {lang1} into {lang2}.\n{lang1}: {prompt}\n{lang2}:".format(
		lang1=in_lang,
		lang2=out_lang,
		prompt=txt
	)

def translate_sp_en(txt):
	"""Format the full ChatML prompt for one line and send it to the model."""
	text = prompt_tr(txt)
	#print(PROMPT.format(prompt=text, system_message=SYSTEM_MESSAGE))
	output = LLM.invoke(PROMPT.format(prompt=text, system_message=SYSTEM_MESSAGE))
	print(output)

def usage():
	print("Usage: {} @filepath".format(sys.argv[0]))

if __name__ == '__main__':
	if len(sys.argv) < 2:
		usage()
		sys.exit(1)
	if not os.path.isfile(sys.argv[1]):
		print("Wrong path '{}'".format(sys.argv[1]))
		usage()
		sys.exit(2)
	with open(sys.argv[1], 'r', encoding='utf-8') as f:
		for line in f:
			translate_sp_en(line.rstrip())
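
For reference, this is roughly what the formatted prompt (the commented-out print in translate_sp_en) looks like for one line; the Spanish sentence here is just a made-up example:

<|im_start|>system
<|im_end|>
<|im_start|>user
Translate the following text from Spanish into English.
Spanish: Hola, ¿cómo estás?
English:<|im_end|>
<|im_start|>assistant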