Load Hugging Face models into Golang?

I’d love to be able to do two things:

  1. export models from Hugging Face into a custom directory that I can back up and also load into a variety of other programming languages
  2. specifically, load a Hugging Face model into Golang

So far I have saved a model in tensorflow format:

from transformers import AutoTokenizer, TFAutoModel

# Load the tokenizer and model, converting the PyTorch weights to TensorFlow
model_name = "sentence-transformers/all-MiniLM-L6-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TFAutoModel.from_pretrained(model_name, from_pt=True)

# Define an example input
inputs = tokenizer("Hello, world!", return_tensors="tf")

# Perform inference on the input
outputs = model(inputs)

print(outputs)

# Save the model in SavedModel format for use in TensorFlow Serving
model.save("all-MiniLM-L6-v1", save_format="tf")

…based on the instructions from the GitHub repo, which is the most up-to-date one I could find (last updated ~4 months ago): GitHub - galeone/tfgo: Tensorflow + Go, the gopher way
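As a side note: the exact tensor names the Go code has to reference (like serving_default_input_ids) come from the SavedModel's serving signature, which can be inspected with TensorFlow's saved_model_cli tool (installed alongside TensorFlow) against the exported directory:

```shell
# Print the inputs/outputs of the exported model's default serving signature,
# including each tensor's name, dtype, and expected shape
saved_model_cli show --dir all-MiniLM-L6-v1 --tag_set serve --signature_def serving_default
```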

Specifically I am trying to load the all-MiniLM-L6-v1 sentence embedding model, and I’ll just show my best attempt so far below:

package main

import (
  "fmt"

  tf "github.com/galeone/tensorflow/tensorflow/go"
  tg "github.com/galeone/tfgo"
)

func main() {

  model := tg.LoadModel("all-MiniLM-L6-v1", []string{"serve"}, nil)

  // Get the input and output layers
  inputLayer := model.Op("serving_default_input_ids", 0)
  outputLayer := model.Op("StatefulPartitionedCall", 0)

  // Create the input tensors. The model expects 2-D tensors of shape
  // [batch_size, sequence_length], and all three inputs must share the
  // same shape. The token IDs here are still just placeholders for
  // troubleshooting; proper tokenization is a separate step.
  input, _ := tf.NewTensor([][]int32{{5, 3, 0, 1, 1, 3, 53, 12, 5, 300, 22}})
  tokenTypeIds, _ := tf.NewTensor([][]int32{{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}})
  attentionMask, _ := tf.NewTensor([][]int32{{1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}})

  // Run the model with the input tensors
  results := model.Exec([]tf.Output{outputLayer}, map[tf.Output]*tf.Tensor{
    inputLayer: input,
    model.Op("serving_default_token_type_ids", 0): tokenTypeIds,
    model.Op("serving_default_attention_mask", 0): attentionMask,
  })

  // Assuming the signature's first output is last_hidden_state, the result
  // has shape [batch_size, sequence_length, hidden_size]; take the token
  // embeddings for the first (and only) sequence
  embedding := results[0].Value().([][][]float32)[0]
  fmt.Println(embedding)

}

Although I don’t want this post to be specific to this particular issue, I’ll post the error I’m getting below anyway in case it helps:

2023-02-20 09:28:37.735722: I tensorflow/cc/saved_model/reader.cc:45] Reading SavedModel from: all-MiniLM-L6-v1
2023-02-20 09:28:37.761113: I tensorflow/cc/saved_model/reader.cc:89] Reading meta graph with tags { serve }
2023-02-20 09:28:37.761140: I tensorflow/cc/saved_model/reader.cc:130] Reading SavedModel debug info (if present) from: all-MiniLM-L6-v1
2023-02-20 09:28:37.761187: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-20 09:28:37.831780: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:357] MLIR V1 optimization pass is not enabled
2023-02-20 09:28:37.855819: I tensorflow/cc/saved_model/loader.cc:229] Restoring SavedModel bundle.
2023-02-20 09:28:38.104320: I tensorflow/cc/saved_model/loader.cc:213] Running initialization op on SavedModel bundle at path: all-MiniLM-L6-v1
2023-02-20 09:28:38.223155: I tensorflow/cc/saved_model/loader.cc:305] SavedModel load for tags { serve }; Status: success: OK. Took 487444 microseconds.
2023-02-20 09:28:38.747692: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at reduction_ops_common.h:147 : INVALID_ARGUMENT: Invalid reduction dimension (1 for input with 1 dimension(s)
panic: Invalid reduction dimension (1 for input with 1 dimension(s)
	 [[{{function_node __inference__wrapped_model_4818}}{{node tf_bert_model/bert/embeddings/assert_less/All}}]]

goroutine 1 [running]:
github.com/galeone/tfgo.(*Model).Exec(0x62e180?, {0xc0007ebe50?, 0x1e?, 0x0?}, 0x1?)
	/home/oxpsi/pkg/mod/github.com/galeone/tfgo@v0.0.0-20230214145115-56cedbc50978/model.go:87 +0x65
main.main()
	/home/oxpsi/code/x-load/main.go:26 +0x47a

In Python I can just feed the model a string and it gives me an embedding vector, but loading it in Go requires me to tokenize the input myself and pass tensors with the correct dimensions to the model. I still have yet to figure out how to properly tokenize the input, so in that code I’m just using a test tensor for troubleshooting.
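For what it's worth, the reason Python hands you a sentence vector directly is that sentence-transformers wraps the raw BERT model with a pooling step (mean pooling over the token embeddings, masked by the attention mask, for the all-MiniLM models). If you pull the raw last_hidden_state out of the SavedModel in Go, you'd need to reproduce that pooling yourself. A minimal sketch in plain Go, assuming one sequence of shape [sequence_length, hidden_size] and a 0/1 attention mask (the toy numbers here are illustrative, not real model output):

```go
package main

import "fmt"

// meanPool averages token embeddings element-wise, skipping positions where
// the attention mask is 0. This mirrors the pooling step sentence-transformers
// applies on top of the raw transformer output to get one vector per sentence.
func meanPool(tokenEmbeddings [][]float32, attentionMask []int32) []float32 {
	hidden := len(tokenEmbeddings[0])
	sum := make([]float32, hidden)
	var count float32
	for i, emb := range tokenEmbeddings {
		if attentionMask[i] == 0 {
			continue // padding position, ignore
		}
		count++
		for j, v := range emb {
			sum[j] += v
		}
	}
	for j := range sum {
		sum[j] /= count
	}
	return sum
}

func main() {
	// Toy example: 3 tokens (the last one is padding), hidden size 2.
	emb := [][]float32{{1, 2}, {3, 4}, {100, 100}}
	mask := []int32{1, 1, 0}
	fmt.Println(meanPool(emb, mask)) // averages only the two unmasked tokens
}
```

The all-MiniLM models also L2-normalize the pooled vector afterwards, so you'd want to divide by the vector's norm if you need cosine-similarity-ready embeddings.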

In any case, I have no idea what I’m doing, and instead of blindly continuing to troubleshoot my errors, I’m doing a sanity check here to see if anyone recommends a better approach in general.


Hi, were you able to have any success using this with Golang? I’m coming from a Go background as well and would also prefer to use HF models from Go.


Hi both, I had the same use case, in particular wanting to run all-MiniLM-L6-v2 locally on a large amount of input data. We ended up releasing Hugot to solve this: it integrates the tokenizers, ONNX Runtime bindings, and postprocessing required to run a subset of pipeline types in Go. Note that you naturally need to use ONNX versions of the Hugging Face models, but these are typically available. It’s very early days for the project, but let me know if it works for your use case!
