Load Hugging Face models into Golang?

I’d love to be able to do 2 things:

  1. Export models from Hugging Face into a custom directory that I can back up and also load from a variety of other programming languages.
  2. Specifically, load a Hugging Face model into Golang.

So far I have saved a model in TensorFlow's SavedModel format:

from transformers import AutoTokenizer, TFAutoModel

# Load the model
model_name = "sentence-transformers/all-MiniLM-L6-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TFAutoModel.from_pretrained(model_name, from_pt=True)

# Define an example input
inputs = tokenizer("Hello, world!", return_tensors="tf")

# Perform inference on the input
outputs = model(inputs)

print(outputs)

# Save the model in SavedModel format for use in TensorFlow Serving
model.save("all-MiniLM-L6-v1", save_format="tf")

…based on the instructions in the most up-to-date GitHub repo I could find (last updated about 4 months ago): galeone/tfgo ("Tensorflow + Go, the gopher way").

Specifically, I am trying to load the all-MiniLM-L6-v1 sentence-embedding model. My best attempt so far is below:

package main

import (
  "fmt"
  tg "github.com/galeone/tfgo"
  tf "github.com/galeone/tensorflow/tensorflow/go"
)

func main() {

  model := tg.LoadModel("all-MiniLM-L6-v1", []string{"serve"}, nil)

  // Get the input and output layers
  inputLayer := model.Op("serving_default_input_ids", 0)
  outputLayer := model.Op("StatefulPartitionedCall", 0)

  // Placeholder token IDs for troubleshooting (not real tokenizer output)
  input, _ := tf.NewTensor([]int32{5, 3, 0, 1, 1, 3, 53, 12, 5, 300, 22})

  tokenTypeIds, _ := tf.NewTensor([][]int32{{0, 0, 0}, {1, 1, 1}})
  attentionMask, _ := tf.NewTensor([][]int32{{3, 3, 3}, {8, 8, 8}})

  // Run the model with the input Tensor
  results := model.Exec([]tf.Output{outputLayer}, map[tf.Output]*tf.Tensor{
    inputLayer: input,
    model.Op("serving_default_token_type_ids", 0): tokenTypeIds,
    model.Op("serving_default_attention_mask", 0): attentionMask,
  })

  // Get the output vector embedding
  embedding := results[0].Value().([][]float32)[0]
  fmt.Println(embedding)

}

Although I don't want this post to be only about this particular issue, here is the error I'm getting, in case it helps:

2023-02-20 09:28:37.735722: I tensorflow/cc/saved_model/reader.cc:45] Reading SavedModel from: all-MiniLM-L6-v1
2023-02-20 09:28:37.761113: I tensorflow/cc/saved_model/reader.cc:89] Reading meta graph with tags { serve }
2023-02-20 09:28:37.761140: I tensorflow/cc/saved_model/reader.cc:130] Reading SavedModel debug info (if present) from: all-MiniLM-L6-v1
2023-02-20 09:28:37.761187: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-20 09:28:37.831780: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:357] MLIR V1 optimization pass is not enabled
2023-02-20 09:28:37.855819: I tensorflow/cc/saved_model/loader.cc:229] Restoring SavedModel bundle.
2023-02-20 09:28:38.104320: I tensorflow/cc/saved_model/loader.cc:213] Running initialization op on SavedModel bundle at path: all-MiniLM-L6-v1
2023-02-20 09:28:38.223155: I tensorflow/cc/saved_model/loader.cc:305] SavedModel load for tags { serve }; Status: success: OK. Took 487444 microseconds.
2023-02-20 09:28:38.747692: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at reduction_ops_common.h:147 : INVALID_ARGUMENT: Invalid reduction dimension (1 for input with 1 dimension(s)
panic: Invalid reduction dimension (1 for input with 1 dimension(s)
	 [[{{function_node __inference__wrapped_model_4818}}{{node tf_bert_model/bert/embeddings/assert_less/All}}]]

goroutine 1 [running]:
github.com/galeone/tfgo.(*Model).Exec(0x62e180?, {0xc0007ebe50?, 0x1e?, 0x0?}, 0x1?)
	/home/oxpsi/pkg/mod/github.com/galeone/tfgo@v0.0.0-20230214145115-56cedbc50978/model.go:87 +0x65
main.main()
	/home/oxpsi/code/x-load/main.go:26 +0x47a

In Python I can just feed the model a string and it gives me an embedding vector, but loading it in Go requires me to tokenize the text myself and pass tensors with the correct dimensions to the model. I have yet to figure out how to properly tokenize the input, so in the code above I'm just using a test tensor for troubleshooting.
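As far as I can tell, BERT-style models expect input_ids, attention_mask and token_type_ids to all share one [batch, sequence length] shape, which my 1-D test tensor does not. Below is a minimal sketch of building matching inputs in plain Go. It assumes a padding ID of 0 and a single text segment, the helper name padBatch is hypothetical (it is not part of tfgo or TensorFlow), and the token IDs are placeholders rather than real tokenizer output.

```go
package main

import "fmt"

// padBatch is a hypothetical helper, not part of tfgo or TensorFlow.
// It right-pads every token-ID sequence to the longest length in the batch
// and returns three [batch][seqLen] slices that all share the same shape.
func padBatch(seqs [][]int32, padID int32) (inputIDs, attentionMask, tokenTypeIDs [][]int32) {
	// Find the longest sequence in the batch.
	maxLen := 0
	for _, s := range seqs {
		if len(s) > maxLen {
			maxLen = len(s)
		}
	}
	for _, s := range seqs {
		ids := make([]int32, maxLen)
		mask := make([]int32, maxLen)
		for i := range ids {
			if i < len(s) {
				ids[i] = s[i]
				mask[i] = 1 // real token: attend to it
			} else {
				ids[i] = padID // padding: mask stays 0
			}
		}
		inputIDs = append(inputIDs, ids)
		attentionMask = append(attentionMask, mask)
		// Assumption: one segment per input, so every type ID is 0.
		tokenTypeIDs = append(tokenTypeIDs, make([]int32, maxLen))
	}
	return inputIDs, attentionMask, tokenTypeIDs
}

func main() {
	// Placeholder IDs, not real tokenizer output.
	batch := [][]int32{{5, 3, 53, 12}, {5, 300}}
	ids, mask, types := padBatch(batch, 0)
	fmt.Println(ids)   // [[5 3 53 12] [5 300 0 0]]
	fmt.Println(mask)  // [[1 1 1 1] [1 1 0 0]]
	fmt.Println(types) // [[0 0 0 0] [0 0 0 0]]
}
```

Each of the three results is a [][]int32, which tf.NewTensor accepts directly, so a single sentence would be passed as shape [1, seqLen] rather than [seqLen].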

In any case, I have no idea what I'm doing, and rather than blindly continuing to troubleshoot my errors, I'm doing a sanity check here to see whether anyone can recommend a better approach in general.