AttributeError: 'FalconModel' object has no attribute 'model'

Hi,

After training and saving falcon-7b-instruct, I am attempting to load the model for inference using accelerate's init_empty_weights and load_checkpoint_and_dispatch. When load_checkpoint_and_dispatch() is called, AttributeError: 'FalconModel' object has no attribute 'model' is raised (specifically from torch/nn/modules/module.py).

As a sanity check, the error does not manifest when using the original foundation falcon-7b-instruct model snapshot. Further, the derived fine-tuned model runs correctly when the accelerate init_empty_weights and load_checkpoint_and_dispatch functions are not invoked.
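
For reference, the plain load path that works for both checkpoints looks roughly like this (a minimal sketch; the exact arguments in my script may differ slightly):

import transformers

model = transformers.AutoModelForCausalLM.from_pretrained(
	model_path,                #local fine-tuned checkpoint directory
	torch_dtype='auto',
	trust_remote_code=True,
	local_files_only=True)
model = model.to('cuda')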

After fine-tuning, the model is saved both as
accelerator.save_model(model, out_path, max_shard_size="1GB") and as unwrapped_model.save_pretrained(out_path, …);
both saved checkpoints run correctly at inference when the empty-weights process is not applied.
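
In code, the two save paths look roughly like this (a sketch of the calls above; obtaining unwrapped_model via accelerator.unwrap_model is an assumption and may differ from the actual script, and the remaining save_pretrained arguments are elided as above):

accelerator.save_model(model, out_path, max_shard_size='1GB')

unwrapped_model = accelerator.unwrap_model(model)   #assumption: how unwrapped_model is obtained
unwrapped_model.save_pretrained(out_path, ...)      #additional kwargs elided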

The resulting config.json from accelerator.save_model() is as follows:
{
  "_name_or_path": "/home/jellybean/.cache/huggingface/hub/models--tiiuae-falcon-7b-instruct-machine_500/accelerate",
  "alibi": false,
  "apply_residual_connection_post_layernorm": false,
  "architectures": [
    "FalconForCausalLM"
  ],
  "attention_dropout": 0.0,
  "auto_map": {
    "AutoModel": "tiiuae/falcon-7b-instruct--modeling_falcon.FalconModel",
    "AutoModelForCausalLM": "tiiuae/falcon-7b-instruct--modeling_falcon.FalconForCausalLM"
  },
  "bias": false,
  "bos_token_id": 50256,
  "do_sample": true,
  "eos_token_id": 50256,
  "hidden_dropout": 0.0,
  "hidden_size": 4544,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "falcon",
  "multi_query": true,
  "new_decoder_architecture": false,
  "num_attention_heads": 71,
  "num_hidden_layers": 32,
  "num_kv_heads": 71,
  "parallel_attn": true,
  "top_k": 0,
  "top_p": 0.75,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.34.1",
  "use_cache": true,
  "vocab_size": 65024
}
I have been trying to determine the cause of this attribute error, to no avail. Any help will be greatly appreciated.

Best,
Boris

Hi @Borell, can you provide a reproducer? My hunch is that you are not initializing the correct model architecture under init_empty_weights.
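
For example, you could check which class actually gets instantiated under init_empty_weights and compare it to the checkpoint's config (a rough sketch, assuming the same trust_remote_code loading as in your script):

from accelerate import init_empty_weights
import transformers

config = transformers.AutoConfig.from_pretrained(model_path, trust_remote_code=True)
print(config.architectures)          #expected: ['FalconForCausalLM']

with init_empty_weights():
	model = transformers.AutoModelForCausalLM.from_config(config, trust_remote_code=True)
print(type(model).__name__)          #expected: FalconForCausalLM, not FalconModel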

Hi @marcsun13,

Thanks for the reply. This is the code used to instantiate the model, with the parameters set as follows:

model_path = /home/jellybean/.cache/huggingface/hub/models--tiiuae-falcon-7b-instruct-machine_500/accelerate

config = FalconConfig {
  "_name_or_path": "/home/jellybean/.cache/huggingface/hub/models--tiiuae-falcon-7b-instruct-machine_500/accelerate",
  "alibi": false,
  "apply_residual_connection_post_layernorm": false,
  "architectures": [
    "FalconForCausalLM"
  ],
  "attention_dropout": 0.0,
  "auto_map": {
    "AutoModel": "tiiuae/falcon-7b-instruct--modeling_falcon.FalconModel",
    "AutoModelForCausalLM": "tiiuae/falcon-7b-instruct--modeling_falcon.FalconForCausalLM"
  },
  "bias": false,
  "bos_token_id": 50256,
  "do_sample": true,
  "eos_token_id": 50256,
  "hidden_dropout": 0.0,
  "hidden_size": 4544,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "max_position_embeddings": 2048,
  "model_type": "falcon",
  "multi_query": true,
  "new_decoder_architecture": false,
  "num_attention_heads": 71,
  "num_hidden_layers": 32,
  "num_kv_heads": 71,
  "parallel_attn": true,
  "rope_scaling": null,
  "rope_theta": 10000.0,
  "top_k": 0,
  "top_p": 0.75,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.34.1",
  "use_cache": true,
  "vocab_size": 65024
}
with torch_dtype and init_device being added in the code as shown below.

def getModel(self, model_path, config):
	
	import transformers
	from accelerate import init_empty_weights, load_checkpoint_and_dispatch, infer_auto_device_map
	
	#no split modules from: modeling_falcon.py
	_no_split_modules = ['FalconDecoderLayer']
	config.torch_dtype = 'auto'
	config.init_device = 'cuda'
		
	with init_empty_weights():
		logger_level = transformers.utils.logging.get_verbosity()  #remember current verbosity so it can be restored after loading
		transformers.utils.logging.set_verbosity(transformers.logging.CRITICAL)
		model = transformers.AutoModelForCausalLM.from_pretrained(model_path, 
														  config=config,
														  trust_remote_code=True,
														  local_files_only=True)
																  
	#Warning: weights are not tied; use `tie_weights` method before `infer_auto_device` function	
	model.tie_weights()
	device_map = infer_auto_device_map(model, no_split_module_classes=_no_split_modules)

	#ValueError: base_model.model.transformer.word_embeddings.weight doesn't have any device set
	device_map.update({'transformer.word_embeddings': 0})
	
	#TODO: Works with foundation, but not with fine-tuned models... 
	#... AttributeError: 'FalconModel' object has no attribute 'model'
	model = load_checkpoint_and_dispatch(model,
									 checkpoint=model_path,
									 device_map=device_map,
									 no_split_module_classes=_no_split_modules)
	if self.verbose > 1: print(f"\nDevice Map:\n{model.hf_device_map}")
	transformers.utils.logging.set_verbosity(logger_level)
	
	return model

The error occurs at the #TODO: annotation. Let me know if this satisfies your request.
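
For reference, below is the kind of check I can run to compare the tensor names stored in the checkpoint with the parameter names of the instantiated model (a rough sketch; the safetensors index file name is an assumption and may differ in my output directory):

import json, os

index_path = os.path.join(model_path, 'model.safetensors.index.json')  #assumed shard index name
with open(index_path) as f:
	checkpoint_keys = set(json.load(f)['weight_map'].keys())

model_keys = set(model.state_dict().keys())
print('in checkpoint but not in model:', sorted(checkpoint_keys - model_keys)[:10])
print('in model but not in checkpoint:', sorted(model_keys - checkpoint_keys)[:10])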

Best,
Borell

Hi @marcsun13,

I was wondering if there are any further suggestions for the issue I presented here.

Thanks,
Borell