Exit code: 1. Reason: s/selective_scan_interface.py:231: FutureWarning: `torch.cuda.amp.custom_bwd(args...)` is deprecated. Please use `torch.amp.custom_bwd(args..., device_type='cuda')` instead.\n @custom_bwd\n/opt/conda/lib/python3.11/site-packages/mamba_ssm/ops/triton/layernorm.py:507: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.\n @custom_fwd\n/opt/conda/lib/python3.11/site-packages/mamba_ssm/ops/triton/layernorm.py:566: FutureWarning: `torch.cuda.amp.custom_bwd(args...)` is deprecated. Please use `torch.amp.custom_bwd(args..., device_type='cuda')` instead.\n @custom_bwd\n/opt/conda/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:79: FutureWarning: You are using a Backend <class 'text_generation_server.utils.dist.FakeGroup'> as a ProcessGroup. This usage is deprecated since PyTorch 2.0. Please use a public API of PyTorch Distributed instead.\n return func(*args, **kwargs)"},"target":"text_generation_launcher","span":{"rank":0,"name":"shard-manager"},"spans":[{"rank":0,"name":"shard-manager"}]}
{"timestamp":"2024-12-08T02:46:24.695372Z","level":"ERROR","fields":{"message":"Shard 0 crashed"},"target":"text_generation_launcher"}
{"timestamp":"2024-12-08T02:46:24.695409Z","level":"INFO","fields":{"message":"Terminating webserver"},"target":"text_generation_launcher"}
{"timestamp":"2024-12-08T02:46:24.695438Z","level":"INFO","fields":{"message":"Waiting for webserver to gracefully shutdown"},"target":"text_generation_launcher"}
{"timestamp":"2024-12-08T02:46:24.695556Z","level":"INFO","message":"signal received, starting graceful shutdown","target":"text_generation_router::server","filename":"router/src/server.rs","line_number":2485}
{"timestamp":"2024-12-08T02:46:24.995806Z","level":"INFO","fields":{"message":"webserver terminated"},"target":"text_generation_launcher"}
{"timestamp":"2024-12-08T02:46:24.995837Z","level":"INFO","fields":{"message":"Shutting down shards"},"target":"text_generation_launcher"}
Error: ShardFailed
My first deployment yesterday worked. I set it to scale to zero after 15 minutes, but when I accessed it again today, I initialized it and it failed. I then tried Google's deployment again, and it failed again. I am deploying a fine-tuned Llama 3.1 model.