Hi
I am new to ML. I would like to convert meta-llama/Llama-3.2-1B to a .mar file (so I can later import it into the GCP model registry). I have seen the docs for torch-model-archiver,
but I am still not sure how that should be done. This is the model structure:
├── LICENSE.txt
├── README.md
├── USE_POLICY.md
├── config.json
├── generation_config.json
├── model.safetensors
├── original
│ ├── consolidated.00.pth
│ ├── params.json
│ └── tokenizer.model
├── special_tokens_map.json
├── tokenizer.json
└── tokenizer_config.json
I am not sure what I should supply for `--model-file`
and `--handler`.
Thanks
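For reference, a minimal `torch-model-archiver` invocation for a checkpoint like this might look as follows. This is only a sketch, not a tested recipe: `llama_handler.py` is a hypothetical custom handler you would write yourself (e.g. one extending TorchServe's `BaseHandler` that loads the model with `AutoModelForCausalLM.from_pretrained` in `initialize()`), and the paths assume you run the command from inside the model directory:

```shell
# Sketch only: package the Hugging Face checkpoint into a .mar file.
# llama_handler.py is a hypothetical custom handler, not shipped with TorchServe.
torch-model-archiver \
  --model-name llama-3.2-1b \
  --version 1.0 \
  --serialized-file model.safetensors \
  --handler llama_handler.py \
  --extra-files "config.json,generation_config.json,tokenizer.json,tokenizer_config.json,special_tokens_map.json"
```

Note that `--model-file` is only required for eager-mode models defined by a Python model class; if the handler loads the checkpoint itself (e.g. via `from_pretrained`), it can usually be omitted.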
It seems to be an unsolved issue. It looks like it could be used as it is…
# [Examples showcasing TorchServe Features and Integrations](#torchserve-internals)
## Security Changes
TorchServe now enforces token authorization and model API control by default. This change will impact the current examples so please check the following documentation for more information: [Token Authorization](https://github.com/pytorch/serve/blob/master/docs/token_authorization_api.md), [Model API control](https://github.com/pytorch/serve/blob/master/docs/model_api_control.md)
## TorchServe Internals
* [Creating mar file for an eager mode model](#creating-mar-file-for-eager-mode-model)
* [Creating mar file for torchscript mode model](#creating-mar-file-for-torchscript-mode-model)
* [Serving custom model with custom service handler](#serving-custom-model-with-custom-service-handler)
* [Serving model using Docker Container](image_classifier/mnist/Docker.md)
* [Creating a Workflow](Workflows/dog_breed_classification)
* [Custom Metrics](custom_metrics)
* [Dynamic Batch Processing](image_classifier/resnet_152_batch)
* [Dynamic Batched Async Requests](image_classifier/near_real_time_video)
## TorchServe Integrations
### Kubernetes <img src="images/k8s.png" width="30" title="K8s" style="float:right; padding:10px" />
GitHub issue, opened 07:01PM - 12 Nov 23 UTC · labels: triaged, security
### 📚 The doc issue
In examples/Huggingface_Transformers/README.md, the `torch-model-archiver` commands specify `--serialized-file Transformer_model/pytorch_model.bin` option-arguments. However, the `Download_Transformer_models.py` script in the same directory currently results in a downloaded model in the safetensors format:
```
$ python Download_Transformer_models.py
...
Save model and tokenizer/ Torchscript model based on the setting from setup_config bert-base-uncased in directory ./Transformer_model
$ ls -1 Transformer_model/
config.json
model.safetensors
special_tokens_map.json
tokenizer_config.json
tokenizer.json
vocab.txt
```
### Suggest a potential alternative/fix
Perhaps update the examples to accommodate both formats.
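One way the examples could accommodate both formats is to have the archiving step pick whichever weights file actually exists in the download directory. The helper below is a hypothetical sketch (the function name and layout are illustrative, not part of the TorchServe repo):

```python
# Hypothetical helper: choose the weights file to pass to
# torch-model-archiver's --serialized-file, whichever format
# Download_Transformer_models.py produced.
from pathlib import Path


def find_serialized_file(model_dir: str) -> str:
    """Return the path of the serialized weights file in model_dir.

    Checks the legacy pickle format first, then safetensors.
    """
    for name in ("pytorch_model.bin", "model.safetensors"):
        candidate = Path(model_dir) / name
        if candidate.exists():
            return str(candidate)
    raise FileNotFoundError(
        f"No pytorch_model.bin or model.safetensors found in {model_dir}"
    )
```

The archiver command in the README could then be built with this value instead of hard-coding `Transformer_model/pytorch_model.bin`.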