Endpoint failed to start: FileNotFoundError even though the file exists

I have been trying to set up my endpoint on AWS. I tested handler.py locally and it works just fine, but the deployment keeps failing with a FileNotFoundError. Why can’t it read the files correctly here?

My repo structure:

├── floortrans
│   ├── loaders      ...
│   ├── losses       ...
│   └── models
│       ├── __init__.py
│       ├── hg_furukawa_original.py
│       ├── model_1427.pth          # model weights 1, git-lfs
│       └── model_1427.py
├── model_best_val_loss_var.pkl   # model weights 2, git-lfs
├── requirements.txt
└── handler.py

Error message from the logs:

dbnn7 2023-03-18T23:43:30.638Z   File "/repository/floortrans/models/hg_furukawa_original.py", line 238, in init_weights
dbnn7 2023-03-18T23:43:30.638Z   File "/repository/handler.py", line 74, in __init__
dbnn7 2023-03-18T23:43:30.638Z FileNotFoundError: [Errno 2] No such file or directory: './floortrans/models/model_1427.pth'
dbnn7 2023-03-18T23:43:30.638Z   File "/opt/conda/lib/python3.9/site-packages/torch/serialization.py", line 594, in load
dbnn7 2023-03-18T23:43:30.638Z Traceback (most recent call last):
dbnn7 2023-03-18T23:43:30.638Z     super(_open_file, self).__init__(open(name, mode))
dbnn7 2023-03-18T23:43:30.638Z   File "/opt/conda/lib/python3.9/site-packages/torch/serialization.py", line 230, in _open_file_like
dbnn7 2023-03-18T23:43:30.638Z   File "/app/./webservice_starlette.py", line 56, in some_startup_task
dbnn7 2023-03-18T23:43:30.638Z     await self._router.startup()
dbnn7 2023-03-18T23:43:30.638Z   File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 671, in lifespan
dbnn7 2023-03-18T23:43:30.638Z     return _open_file(name_or_buffer, mode)
dbnn7 2023-03-18T23:43:30.638Z     custom_pipeline = handler.EndpointHandler(model_dir)
dbnn7 2023-03-18T23:43:30.638Z   File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 566, in __aenter__
dbnn7 2023-03-18T23:43:30.639Z Application startup failed. Exiting.

handler.py:

from typing import Any, Dict, List

# load_model and get_Image_Seg are defined elsewhere (not shown)

class EndpointHandler:
    def __init__(self, path="."):
        # load the model
        device, split, model = load_model()
        self.model = model
        self.device = device
        self.split = split

    def __call__(self, data: Any) -> List[List[Dict[str, float]]]:
        # use the "inputs" field if present, otherwise the whole payload
        inputs = data.pop("inputs", data)
        coors_info = get_Image_Seg(inputs, self.device, self.split, self.model)
        return coors_info

Another question: is there any way to restart my endpoint so that I can test my model changes, or do I have to delete it and create a new one each time?

I think I replied to your email. It’s hard to tell exactly why your endpoint fails, but it looks like load_model() is not using the correct path to load the model weights.
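To illustrate the path issue: the traceback shows the weights being opened as './floortrans/models/model_1427.pth', a path relative to the process working directory, while handler.py sits under /repository and the web service file is under /app. If the server is started from a directory other than the repository, that relative path will not resolve. Below is a minimal sketch of one way around it, assuming the weights path is built from the directory passed to EndpointHandler. The names resolve_weights and demo are hypothetical and not part of the original repository; the demo only simulates the situation with temporary directories.

```python
import os
import tempfile
from pathlib import Path

# Location of the weights inside the repository, as seen in the traceback.
WEIGHTS_RELATIVE = Path("floortrans") / "models" / "model_1427.pth"


def resolve_weights(model_dir: str) -> Path:
    """Hypothetical helper: build an absolute weights path from the repository directory."""
    weights = Path(model_dir).resolve() / WEIGHTS_RELATIVE
    if not weights.is_file():
        raise FileNotFoundError(f"Weights not found at {weights}")
    return weights


def demo() -> bool:
    """Simulate a server whose working directory is not the repository."""
    with tempfile.TemporaryDirectory() as repo, tempfile.TemporaryDirectory() as elsewhere:
        target = Path(repo) / WEIGHTS_RELATIVE
        target.parent.mkdir(parents=True)
        target.write_bytes(b"placeholder")  # stand-in for the real weights file

        previous = os.getcwd()
        os.chdir(elsewhere)  # comparable to a server started outside the repository
        try:
            # The relative path is looked up in the working directory and is missing.
            relative_found = (Path(".") / WEIGHTS_RELATIVE).is_file()
            # The path anchored at the repository directory is found.
            absolute_found = resolve_weights(repo).is_file()
        finally:
            os.chdir(previous)
        return (not relative_found) and absolute_found
```

In the handler, this would mean passing the `path` argument of `__init__` down to the loading code (for example a `load_model(path)` signature, which is an assumption about code not shown here) instead of relying on './floortrans/...'. Anchoring on `Path(__file__).resolve().parent` inside handler.py is a similar option.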