How to debug whether loading a model caused my PC to shut down?

I use Ubuntu on:
Intel® Core™ i7-8565U CPU @ 1.80GHz × 8
16 GB RAM
GeForce MX150 GPU

I have a model that I already tried in Google Colab. I know Colab provides a much better environment than my PC, but when I load it on my PC I use 4-bit quantization, so it should be much lighter to load, and my PC doesn't seem to lag while loading it.
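My back-of-envelope reasoning (just my own estimate, assuming roughly half a byte per parameter for the 4-bit weights and ignoring activations and loading overhead):

# Rough estimate of the 4-bit weight footprint for a 7B model
params = 7e9
bytes_per_param = 0.5  # 4 bits per weight
print(f"~{params * bytes_per_param / 1024**3:.1f} GB")  # ~3.3 GB

So the weights alone should fit comfortably in 16 GB of RAM.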

My code is as follows:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, pipeline
from langchain.llms import HuggingFacePipeline

# The bitsandbytes options go in a BitsAndBytesConfig; passing them as loose
# kwargs to from_pretrained is not supported
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    llm_int8_enable_fp32_cpu_offload=True,
)

tokenizer = AutoTokenizer.from_pretrained(
    "Yellow-AI-NLP/komodo-7b-base",
    cache_dir="./huggingface_cache/",
    trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "Yellow-AI-NLP/komodo-7b-base",
    cache_dir="./huggingface_cache/",
    quantization_config=bnb_config,
    device_map="auto",  # accelerate places the weights; no model.to(device) afterwards
    trust_remote_code=True,
)

# Renamed so it no longer shadows transformers.pipeline
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=1280,
)

local_llm = HuggingFacePipeline(pipeline=pipe)

It seems normal at first. Everything runs smoothly until it starts loading the checkpoint shards:

Loading checkpoint shards: 0%| | 0/6

This also runs pretty smoothly, but then VS Code suddenly closed. I tried again and my PC showed a black screen for a while and then restarted. I don't know what's going on; no error shows up in the VS Code logs or anywhere else. Does anyone know why this happens, or how I can debug it?
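In case it helps, this is roughly how I plan to watch memory while the shards load (a minimal sketch; it assumes psutil is installed, which is not part of my script above):

import threading, time
import psutil

def log_memory(interval=1.0):
    # Print this process's RSS and system-wide available RAM once per interval
    proc = psutil.Process()
    while True:
        rss_gb = proc.memory_info().rss / 1024**3
        avail_gb = psutil.virtual_memory().available / 1024**3
        print(f"RSS: {rss_gb:.2f} GB | free RAM: {avail_gb:.2f} GB", flush=True)
        time.sleep(interval)

# Start the logger before from_pretrained so it keeps printing while the
# checkpoint shards load
threading.Thread(target=log_memory, daemon=True).start()

I would run the script as python load_model.py | tee load.log so the last readings survive if the machine reboots. Is this a reasonable approach, or is there a better way?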