
Fail to load the clip model #18

Open

cocoshe opened this issue Jun 23, 2024 · 1 comment

cocoshe commented Jun 23, 2024

I tried to run the inference part, following the command here:

python test/DS_SmartEdit_test.py --is_understanding_scenes True --model_name_or_path "./checkpoints/vicuna-13b-v1-1" --LLaVA_model_path "./checkpoints/LLaVA-13B-v1" --save_dir './checkpoints/SmartEdit-13B/Understand-15000' --steps 15000 --total_dir "./checkpoints/SmartEdit-13B" --sd_qformer_version "v1.1-13b" --resize_resolution 256

The output in the terminal is:

/home/SmartEdit/test/InstructPix2PixSD_SM.py:35: FutureWarning: Importing `DiffusionPipeline` or `ImagePipelineOutput` from diffusers.pipeline_utils is deprecated. Please import from diffusers.pipelines.pipeline_utils instead.
  from diffusers.pipeline_utils import DiffusionPipeline
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:26<00:00,  8.73s/it]
> /home/SmartEdit/model/DS_SmartEdit_model.py(169)init_visual_features_extractor()
-> LLaVA_model = LlavaLlamaForCausalLM.from_pretrained(LLaVA_model_path)
(Pdb) c
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:31<00:00, 10.51s/it]
Traceback (most recent call last):
  File "/home/SmartEdit/SmartEdit/test/DS_SmartEdit_test.py", line 652, in <module>
    main()
  File "/home/SmartEdit/SmartEdit/test/DS_SmartEdit_test.py", line 215, in main
    model_.init_visual_features_extractor(LLaVA_model_path=LLaVA_model_path, sd_qformer_version=sd_qformer_version)
  File "/home/SmartEdit/SmartEdit/model/DS_SmartEdit_model.py", line 172, in init_visual_features_extractor
    self.vision_tower.load_model()
AttributeError: 'NoneType' object has no attribute 'load_model'

I read the code, and maybe initialize_vision_modules is somehow not called in the inference path?

def initialize_vision_modules(self, model_args, fsdp=None):
    vision_tower = model_args.vision_tower
    mm_vision_select_layer = model_args.mm_vision_select_layer
    mm_vision_select_feature = model_args.mm_vision_select_feature
    pretrain_mm_mlp_adapter = model_args.pretrain_mm_mlp_adapter
    self.config.mm_vision_tower = vision_tower
    vision_tower = build_vision_tower(model_args)
    if fsdp is not None and len(fsdp) > 0:
        self.vision_tower = [vision_tower]
    else:
        self.vision_tower = vision_tower
    self.config.use_mm_proj = True
    self.config.mm_hidden_size = vision_tower.hidden_size
    self.config.mm_vision_select_layer = mm_vision_select_layer
    self.config.mm_vision_select_feature = mm_vision_select_feature
    if not hasattr(self, 'mm_projector'):
        self.mm_projector = nn.Linear(self.config.mm_hidden_size, self.config.hidden_size)
    if pretrain_mm_mlp_adapter is not None:
        mm_projector_weights = torch.load(pretrain_mm_mlp_adapter, map_location='cpu')

        def get_w(weights, keyword):
            return {k.split(keyword + '.')[1]: v for k, v in weights.items() if keyword in k}

        self.mm_projector.load_state_dict(get_w(mm_projector_weights, 'mm_projector'))
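To illustrate the failure mode I suspect: `vision_tower` stays `None` until `initialize_vision_modules` assigns it, so calling `load_model()` before that raises exactly the `AttributeError` in my traceback. A minimal, self-contained sketch (the class and method names below are simplified stand-ins, not the actual SmartEdit/LLaVA code) with a guard that would surface a clearer error:

```python
# Minimal sketch of the lazy-init pattern behind the AttributeError.
# VisionTower and Model are hypothetical stand-ins for the real classes.

class VisionTower:
    def __init__(self):
        self.loaded = False

    def load_model(self):
        self.loaded = True


class Model:
    def __init__(self):
        self.vision_tower = None  # not built until initialize_vision_modules runs

    def initialize_vision_modules(self):
        # stands in for build_vision_tower(model_args) + self.vision_tower = ...
        self.vision_tower = VisionTower()

    def init_visual_features_extractor(self):
        if self.vision_tower is None:
            # guard that turns the bare AttributeError into an actionable message
            raise RuntimeError(
                "vision_tower is None; was initialize_vision_modules() called, "
                "and does the checkpoint config set mm_vision_tower?"
            )
        self.vision_tower.load_model()


m = Model()
try:
    m.init_visual_features_extractor()  # fails: vision modules never initialized
except RuntimeError as e:
    print("guard fired:", e)

m.initialize_vision_modules()
m.init_visual_features_extractor()  # now succeeds
print("loaded:", m.vision_tower.loaded)
```

This is only to show where the `None` comes from; the real fix presumably depends on how the inference script is meant to build the vision tower.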

@yuzhou914
Collaborator

Hi, maybe you have the wrong package versions and that is causing the problem. However, since I have never encountered this issue, I can't tell which package is responsible. I suggest you first run the LLaVA dialogue to make sure your LLaVA setup is correct. You can take a look at the LLaVA instructions.
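As a starting point for the version check suggested above, here is a small snippet that prints the installed versions of the packages most likely involved, so they can be compared against the repo's requirements. The package list is an assumption based on the imports visible in the traceback (torch, transformers, diffusers, accelerate), not an official list:

```python
# Print installed versions of the packages assumed relevant, for comparison
# with SmartEdit's requirements. Uses only the standard library.
from importlib.metadata import version, PackageNotFoundError

for pkg in ("torch", "transformers", "diffusers", "accelerate"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "not installed")
```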
