Deploy model with prompt-tuned adapter saved in S3

lcleary · March 21, 2024, 12:13am

I prompt-tuned an adapter for LLaMA 7B and, after training, saved only the adapter to S3 without merging it into the base model. I have not pushed it to the Hub yet. I want to deploy a model that uses this adapter on SageMaker with HuggingFaceModel, but I’m not sure how to do this. Would I need to merge it into the base model separately first, or is there a way to merge it within HuggingFaceModel?

I was thinking I might be able to write a script that merges the adapter into the base model and pass that script in through entry_point. Would this work? How would I go about writing such a script?
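
For reference, here is a rough sketch of what such an entry_point script might look like. It is not a confirmed solution. It assumes the SageMaker Hugging Face inference toolkit picks up model_fn and predict_fn overrides from the script, that the adapter files written by PEFT's save_pretrained() (including adapter_config.json) are unpacked into model_dir, and that peft is installed in the container (for example through a requirements.txt next to the script). BASE_MODEL_ID is a hypothetical environment variable name, and resolve_base_model_id is a helper invented for this illustration. As far as I know, merge_and_unload() applies to LoRA-style adapters that change weights, while a prompt-tuned adapter only adds learned virtual-token embeddings, so the sketch attaches the adapter to the base model at load time instead of merging.

```python
import json
import os


def read_adapter_config(model_dir):
    """Load the adapter_config.json that PEFT's save_pretrained() writes."""
    with open(os.path.join(model_dir, "adapter_config.json")) as f:
        return json.load(f)


def resolve_base_model_id(model_dir):
    """Pick the base model: env var override first, then the adapter config.

    BASE_MODEL_ID is a hypothetical variable name chosen for this sketch.
    """
    override = os.environ.get("BASE_MODEL_ID")
    if override:
        return override
    return read_adapter_config(model_dir)["base_model_name_or_path"]


def model_fn(model_dir):
    """Load the base model and attach the adapter found in model_dir."""
    # Heavy imports stay local so the helpers above work without a GPU stack.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base_id = resolve_base_model_id(model_dir)
    use_cuda = torch.cuda.is_available()
    device = "cuda" if use_cuda else "cpu"
    dtype = torch.float16 if use_cuda else torch.float32

    # A gated base model (such as LLaMA) would also need a Hub token here.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=dtype)

    # Attach the prompt-tuned adapter; no merge step is performed.
    model = PeftModel.from_pretrained(base, model_dir)
    model.to(device)
    model.eval()
    return model, tokenizer


def predict_fn(data, model_and_tokenizer):
    """Generate text for a request shaped like {"inputs": ..., "parameters": {...}}."""
    import torch

    model, tokenizer = model_and_tokenizer
    device = next(model.parameters()).device
    inputs = tokenizer(data["inputs"], return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, **data.get("parameters", {}))
    text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return [{"generated_text": text}]
```

If this approach fits, the adapter directory would be packaged as a model.tar.gz in S3 and passed as model_data to HuggingFaceModel, with this file given as entry_point (and its folder as source_dir). Whether a 7B base model downloads within the endpoint's startup time limit is something to check on your instance type.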
