It looks as if it is not possible to use a model from CTransformers with the ChatPromptTemplate and a RAG chain. The only thing I could find on the internet is using it with the PromptTemplate from langchain.prompts.
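The llm used in the snippets below is the CTransformers model loaded through LangChain. A minimal sketch of how it could be created (the model name and config values are just example assumptions):

from langchain_community.llms import CTransformers

# example GGML model; replace with the model you actually use
llm = CTransformers(
    model="TheBloke/Llama-2-7B-Chat-GGML",
    model_type="llama",
    config={"max_new_tokens": 256, "temperature": 0.1},
)

With that llm, the PromptTemplate example looks like this: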
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

template = """Question: {question}
Answer:"""

prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = LLMChain(prompt=prompt, llm=llm)  # llm is the CTransformers model from above

response = llm_chain.run("What is AI?")
Reference
If you want to use a CTransformers model in a RAG setup, you could use a FAISS index or ChromaDB as vector store and an SBERT model for document/text embeddings. You would then search the vector store with the SBERT model and retrieve documents, which you pass on to your llm.
init embedding model
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer, Pipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # first element of model_output contains all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)

# Inference pipeline for the embedding model
class EmbeddingPipeline(Pipeline):
    def _sanitize_parameters(self, **kwargs):
        preprocess_kwargs = {}
        return preprocess_kwargs, {}, {}

    def preprocess(self, text):
        encoded_text = self.tokenizer(text, padding=True, truncation=True, return_tensors="pt").to(device)
        return encoded_text

    def _forward(self, model_inputs):
        outputs = self.model(**model_inputs)
        return {"outputs": outputs, "attention_mask": model_inputs["attention_mask"]}

    def postprocess(self, model_outputs):
        sentence_embeddings = mean_pooling(model_outputs["outputs"], model_outputs["attention_mask"])
        sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1)  # L2-normalize for inner-product search
        return sentence_embeddings[0].numpy()

model_id = "sentence-transformers/all-MiniLM-L6-v2"
model = AutoModel.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
encoder = EmbeddingPipeline(model=model, tokenizer=tokenizer, device=device)
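As a quick sanity check (not part of the original snippet), you can embed a short string; with all-MiniLM-L6-v2 the result should be a 384-dimensional, L2-normalized vector:

# embed a single string and inspect the output shape
vec = encoder("What is AI?")
print(vec.shape)  # (384,)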
setup chromadb index
import chromadb

chroma_client = chromadb.Client()  # in-memory client; use chromadb.PersistentClient(path=...) to persist the index
collection = chroma_client.create_collection(name="squad_v2", metadata={"hnsw:space": "ip"})

# embed a document and add it to the collection
document = "Some document text..."
embedding_vector = encoder(document).tolist()

collection.add(
    embeddings=[embedding_vector],
    documents=[document],
    ids=["1"]  # string ids, one per document
)
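To index more than one document, you can pass lists of the same length (a sketch with example data):

# index several documents at once
documents = ["First document ...", "Second document ...", "Third document ..."]

collection.add(
    embeddings=[encoder(doc).tolist() for doc in documents],
    documents=documents,
    ids=[str(i) for i in range(len(documents))],
)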
query the vector store
question = "Some question..."
embedded_question = encoder(question).tolist()

result = collection.query(
    query_embeddings=[embedded_question],  # pass the embedded question, not the raw string
    n_results=5  # getting the 5 best results
)

# result["documents"] holds one list of documents per query
contexts = "\n".join(result["documents"][0])
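The result also contains the distances of the matches, which you could use to drop weak hits before building the prompt (a sketch; the 0.5 cutoff is an arbitrary example, and with "hnsw:space": "ip" a smaller distance means a better match):

# optionally keep only documents that are close enough to the question
filtered = [
    doc
    for doc, dist in zip(result["documents"][0], result["distances"][0])
    if dist < 0.5  # arbitrary example threshold
]
contexts = "\n".join(filtered)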
use llm to generate answer
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""

prompt = PromptTemplate(template=template, input_variables=["context", "question"])
llm_chain = LLMChain(prompt=prompt, llm=llm)

response = llm_chain.run(context=contexts, question="What is AI?")
If llm_chain.run does not accept multiple inputs like this in your LangChain version, you could use a simple function to create your prompts and call the llm directly.
def get_prompt(question, context):
    return f"""Answer the question based only on the following context:
{context}

Question: {question}"""

response = llm(get_prompt(question, contexts))
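Putting the pieces together, the whole retrieval-and-answer flow from the snippets above could be wrapped in one function (a sketch, not a polished implementation):

def answer_question(question, n_results=5):
    # 1. embed the question and retrieve the most similar documents
    embedded_question = encoder(question).tolist()
    result = collection.query(query_embeddings=[embedded_question], n_results=n_results)
    contexts = "\n".join(result["documents"][0])

    # 2. build the prompt and let the CTransformers llm generate the answer
    return llm(get_prompt(question, contexts))

print(answer_question("What is AI?"))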
Here are some notebooks I implemented when I learned about RAG (definitely not best practices):
hybrid search - just embedding model tests
Notebooks with different tests