I’d love to be able to do 2 things:
- export models from Hugging Face into a custom directory that I can “back up” and also load from a variety of other programming languages
- specifically, load a Hugging Face model in Go
So far I have saved a model in TensorFlow SavedModel format:
from transformers import AutoTokenizer, TFAutoModel
# Load the model
model_name = "sentence-transformers/all-MiniLM-L6-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TFAutoModel.from_pretrained(model_name, from_pt=True)
# Define an example input
inputs = tokenizer("Hello, world!", return_tensors="tf")
# Perform inference on the input
outputs = model(inputs)
# Save the model in SavedModel format for use in TensorFlow Serving
model.save("all-MiniLM-L6-v1", save_format="tf")
…based on the instructions in the galeone/tfgo GitHub repo ("Tensorflow + Go, the gopher way"), which is the most up-to-date one I could find (~4 months).
Specifically I am trying to load the all-MiniLM-L6-v1 sentence embedding model, and I’ll just show my best attempt so far below:
package main

import (
	tf "github.com/galeone/tensorflow/tensorflow/go"
	tg "github.com/galeone/tfgo"
)

func main() {
	model := tg.LoadModel("all-MiniLM-L6-v1", []string{"serve"}, nil)

	// Get the input and output layers
	inputLayer := model.Op("serving_default_input_ids", 0)
	outputLayer := model.Op("StatefulPartitionedCall", 0)

	// Create placeholder test tensors (not real tokenizer output)
	input, _ := tf.NewTensor([]int32{5, 3, 0, 1, 1, 3, 53, 12, 5, 300, 22})
	tokenTypeIds, _ := tf.NewTensor([][]int32{{0, 0, 0}, {1, 1, 1}})
	attentionMask, _ := tf.NewTensor([][]int32{{3, 3, 3}, {8, 8, 8}})

	// Run the model with the input tensors
	results := model.Exec([]tf.Output{outputLayer}, map[tf.Output]*tf.Tensor{
		inputLayer: input,
		model.Op("serving_default_token_type_ids", 0): tokenTypeIds,
		model.Op("serving_default_attention_mask", 0): attentionMask,
	})

	// Get the output vector embedding
	embedding := results[0].Value().([][]float32)[0]
	_ = embedding // unused in this snippet; silences the "declared and not used" compile error
}
Although I don’t want this post to be specific to this issue I’m having, I’ll post the error I’m getting below anyway in case it helps:
2023-02-20 09:28:37.735722: I tensorflow/cc/saved_model/reader.cc:45] Reading SavedModel from: all-MiniLM-L6-v1
2023-02-20 09:28:37.761113: I tensorflow/cc/saved_model/reader.cc:89] Reading meta graph with tags { serve }
2023-02-20 09:28:37.761140: I tensorflow/cc/saved_model/reader.cc:130] Reading SavedModel debug info (if present) from: all-MiniLM-L6-v1
2023-02-20 09:28:37.761187: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-20 09:28:37.831780: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:357] MLIR V1 optimization pass is not enabled
2023-02-20 09:28:37.855819: I tensorflow/cc/saved_model/loader.cc:229] Restoring SavedModel bundle.
2023-02-20 09:28:38.104320: I tensorflow/cc/saved_model/loader.cc:213] Running initialization op on SavedModel bundle at path: all-MiniLM-L6-v1
2023-02-20 09:28:38.223155: I tensorflow/cc/saved_model/loader.cc:305] SavedModel load for tags { serve }; Status: success: OK. Took 487444 microseconds.
2023-02-20 09:28:38.747692: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at reduction_ops_common.h:147 : INVALID_ARGUMENT: Invalid reduction dimension (1 for input with 1 dimension(s)
panic: Invalid reduction dimension (1 for input with 1 dimension(s)
[[{{function_node __inference__wrapped_model_4818}}{{node tf_bert_model/bert/embeddings/assert_less/All}}]]
goroutine 1 [running]:
github.com/galeone/tfgo.(*Model).Exec(0x62e180?, {0xc0007ebe50?, 0x1e?, 0x0?}, 0x1?)
/home/oxpsi/pkg/mod/github.com/galeone/tfgo@v0.0.0-20230214145115-56cedbc50978/model.go:87 +0x65
/home/oxpsi/code/x-load/main.go:26 +0x47a
In Python I can just feed the model a string (via the tokenizer) and it gives me an embedding vector, but loading it in Go requires me to tokenize the input myself and pass the model tensors with the correct dimensions. I have yet to figure out how to tokenize the input properly, so in that code I’m just using test tensors for troubleshooting.
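To make the dimension requirement concrete, here is a minimal sketch of what I understand the model to expect: three rank-2 integer inputs (input IDs, attention mask, token type IDs) that all share the same [batch, sequence length] shape, with the mask set to 1 for real tokens and 0 for padding. The `Batch` type and `BuildBatch` helper are hypothetical names of my own, the padding ID of 0 is an assumption based on the usual BERT vocabulary, and the token IDs are made up rather than produced by the real tokenizer. It uses only the standard library, so it does not touch tfgo at all.

```go
package main

import (
	"errors"
	"fmt"
)

// Batch holds the three model inputs, each shaped [batch, seqLen].
// (Hypothetical helper type, not part of tfgo or TensorFlow.)
type Batch struct {
	InputIDs      [][]int32
	AttentionMask [][]int32
	TokenTypeIDs  [][]int32
}

// BuildBatch pads every token-ID sequence to the length of the longest one,
// so that all three outputs are rectangular and share the same shape.
// padID is an assumption: check the exported tokenizer files for the real value.
func BuildBatch(seqs [][]int32, padID int32) (Batch, error) {
	if len(seqs) == 0 {
		return Batch{}, errors.New("empty batch")
	}
	maxLen := 0
	for _, s := range seqs {
		if len(s) > maxLen {
			maxLen = len(s)
		}
	}
	if maxLen == 0 {
		return Batch{}, errors.New("all sequences are empty")
	}

	b := Batch{
		InputIDs:      make([][]int32, len(seqs)),
		AttentionMask: make([][]int32, len(seqs)),
		TokenTypeIDs:  make([][]int32, len(seqs)),
	}
	for i, s := range seqs {
		ids := make([]int32, maxLen)
		mask := make([]int32, maxLen)
		for j := range ids {
			if j < len(s) {
				ids[j] = s[j]
				mask[j] = 1 // real token
			} else {
				ids[j] = padID // padding; mask stays 0
			}
		}
		b.InputIDs[i] = ids
		b.AttentionMask[i] = mask
		// Single-sentence inputs: every position belongs to segment 0.
		b.TokenTypeIDs[i] = make([]int32, maxLen)
	}
	return b, nil
}

func main() {
	// Made-up token IDs; real ones must come from the model's tokenizer.
	b, err := BuildBatch([][]int32{{101, 7592, 102}, {101, 102}}, 0)
	if err != nil {
		panic(err)
	}
	fmt.Println(b.InputIDs)
	fmt.Println(b.AttentionMask)
	fmt.Println(b.TokenTypeIDs)
}
```

Each of the three fields is a rectangular [][]int32, which is the same kind of value the snippet above already passes to tf.NewTensor for the mask and token type IDs. This only addresses the shapes; producing correct token IDs in Go is still an open question for me.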
In any case, I have no idea what I’m doing, and instead of blindly continuing to troubleshoot my errors, I’m doing a sanity check here to see whether anyone can recommend a better approach in general.