I have been trying to set up my endpoint through AWS. I tested handler.py locally and it works fine, but the deployment keeps failing with a FileNotFoundError. Why can't it read the weight files on the endpoint?
My repo structure:
├── floortrans
│   ├── loaders ...
│   ├── losses ...
│   └── models
│       ├── __init__.py
│       ├── hg_furukawa_original.py
│       ├── model_1427.pth  # model weights 1, git-lfs
│       └── model_1427.py
├── model_best_val_loss_var.pkl  # model weights 2, git-lfs
├── requirements.txt
└── handler.py
Error message from the logs:
dbnn7 2023-03-18T23:43:30.638Z File "/repository/floortrans/models/hg_furukawa_original.py", line 238, in init_weights
dbnn7 2023-03-18T23:43:30.638Z File "/repository/handler.py", line 74, in __init__
dbnn7 2023-03-18T23:43:30.638Z FileNotFoundError: [Errno 2] No such file or directory: './floortrans/models/model_1427.pth'
dbnn7 2023-03-18T23:43:30.638Z File "/opt/conda/lib/python3.9/site-packages/torch/serialization.py", line 594, in load
dbnn7 2023-03-18T23:43:30.638Z Traceback (most recent call last):
dbnn7 2023-03-18T23:43:30.638Z super(_open_file, self).__init__(open(name, mode))
dbnn7 2023-03-18T23:43:30.638Z File "/opt/conda/lib/python3.9/site-packages/torch/serialization.py", line 230, in _open_file_like
dbnn7 2023-03-18T23:43:30.638Z File "/app/./webservice_starlette.py", line 56, in some_startup_task
dbnn7 2023-03-18T23:43:30.638Z await self._router.startup()
dbnn7 2023-03-18T23:43:30.638Z File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 671, in lifespan
dbnn7 2023-03-18T23:43:30.638Z return _open_file(name_or_buffer, mode)
dbnn7 2023-03-18T23:43:30.638Z custom_pipeline = handler.EndpointHandler(model_dir)
dbnn7 2023-03-18T23:43:30.638Z File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 566, in __aenter__
dbnn7 2023-03-18T23:43:30.639Z Application startup failed. Exiting.
handler.py:
from typing import Any, Dict, List

# load_model and get_Image_Seg are defined or imported elsewhere in handler.py (not shown)

class EndpointHandler:
    def __init__(self, path="."):
        # load the model
        device, split, model = load_model()
        self.model = model
        self.device = device
        self.split = split

    def __call__(self, data: Any) -> List[List[Dict[str, float]]]:
        inputs = data.pop("inputs", data)
        coors_info = get_Image_Seg(inputs, self.device, self.split, self.model)
        return coors_info
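One thing the traceback suggests: the web service is started from /app (webservice_starlette.py), while the repository lives under /repository, so a relative path such as './floortrans/models/model_1427.pth' would be resolved against the process's working directory rather than the repo root. Below is a minimal sketch of building the path from the directory that is passed to EndpointHandler instead. resolve_weights is a hypothetical helper, not something from the repo, and whether the weights file is actually present on the endpoint (for example, whether the git-lfs file was pulled) is a separate assumption that this only helps to diagnose.

```python
from pathlib import Path


def resolve_weights(model_dir: str, relative_path: str) -> Path:
    """Build an absolute path to a weights file inside model_dir.

    Hypothetical helper: model_dir is assumed to be the directory passed to
    EndpointHandler.__init__ (the repository root on the endpoint).
    """
    candidate = (Path(model_dir) / relative_path).resolve()
    if not candidate.is_file():
        # Report the working directory too, since a mismatch between it and
        # the repo root is the suspected cause of the original error.
        raise FileNotFoundError(
            f"{candidate} not found (current working directory: {Path.cwd()})"
        )
    return candidate
```

Inside __init__ this could be used as resolve_weights(path, "floortrans/models/model_1427.pth"), with the result handed to whatever calls torch.load. That would require load_model (and init_weights in hg_furukawa_original.py) to accept a path argument, which is an assumption about code not shown here.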
Another question: is there any way to restart my endpoint so that I can test my model changes, or do I have to delete it and create a new one each time?