
[Bug]: execution of "benchmark.exe -m ssd.xml -d cpu" reporting "invalid broadcast" error with openvino static library #24963

Open
feixuedudiao opened this issue Jun 12, 2024 · 11 comments

@feixuedudiao

OpenVINO Version

2024.01/2024.1.0

Operating System

Windows System

Device used for inference

CPU

Framework

None

Model used

ssd

Issue description

When running the benchmark test of the SSD model with OpenVINO compiled as a static library, the error "Exception from src\inference\src\dev\plugin.cpp:54: invalid broadcast" is reported on both CPU and GPU devices. Strangely, when the OpenVINO library is built as a DLL, the benchmark runs normally on both CPU and GPU devices.
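A minimal C++ sketch of the failing step (an illustration only, assuming the standard OpenVINO 2.0 runtime API that benchmark_app uses internally; the model path and device name are taken from the command lines below):

#include <openvino/openvino.hpp>
#include <iostream>

int main() {
    try {
        ov::Core core;
        // Step 4 equivalent: read the IR
        auto model = core.read_model("ssd.xml");
        // Step 7 equivalent: compile for the target device;
        // with the static build, this is the call that throws "invalid broadcast"
        auto compiled = core.compile_model(model, "CPU");
        std::cout << "Model compiled successfully" << std::endl;
    } catch (const ov::Exception& e) {
        std::cerr << e.what() << std::endl;  // prints the broadcast exception
        return 1;
    }
    return 0;
}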

Step-by-step reproduction

No response

Relevant log output

the GPU log:
λ benchmark_app.exe -m ssd.xml ssd.bin -d GPU
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.2.0-15316-22dcf50ce01
[ INFO ]
[ INFO ] Device info:
[ INFO ] GPU
[ INFO ] Build ................................. 2024.2.0-15316-22dcf50ce01
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(GPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 22.72 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Network inputs:
[ INFO ]     input_0 (node: input_0) : f32 / [...] / [1,3,300,300]
[ INFO ] Network outputs:
[ INFO ]     scores (node: scores) : f32 / [...] / [1,3000,81]
[ INFO ]     boxes (node: boxes) : f32 / [...] / [1,3000,4]
[Step 5/11] Resizing model to match image sizes and given batch
[ WARNING ] input_0: layout is not set explicitly, so it is defaulted to NCHW. It is STRONGLY recommended to set layout manually to avoid further issues.
[Step 6/11] Configuring input of the model
[ INFO ] Model batch size: 1
[ INFO ] Network inputs:
[ INFO ]     input_0 (node: input_0) : u8 / [N,C,H,W] / [1,3,300,300]
[ INFO ] Network outputs:
[ INFO ]     scores (node: scores) : f32 / [...] / [1,3000,81]
[ INFO ]     boxes (node: boxes) : f32 / [...] / [1,3000,4]
[Step 7/11] Loading the model to the device
[ ERROR ] Exception from src\inference\src\cpp\core.cpp:107:
Exception from src\inference\src\dev\plugin.cpp:54:
invalid broadcast

the CPU log:
λ benchmark_app.exe -m ssd.xml -d CPU
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.2.0-15316-22dcf50ce01
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.2.0-15316-22dcf50ce01
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(CPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 428.26 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Network inputs:
[ INFO ]     input_0 (node: input_0) : f32 / [...] / [1,3,300,300]
[ INFO ] Network outputs:
[ INFO ]     scores (node: scores) : f32 / [...] / [1,3000,81]
[ INFO ]     boxes (node: boxes) : f32 / [...] / [1,3000,4]
[Step 5/11] Resizing model to match image sizes and given batch
[ WARNING ] input_0: layout is not set explicitly, so it is defaulted to NCHW. It is STRONGLY recommended to set layout manually to avoid further issues.
[Step 6/11] Configuring input of the model
[ INFO ] Model batch size: 1
[ INFO ] Network inputs:
[ INFO ]     input_0 (node: input_0) : u8 / [N,C,H,W] / [1,3,300,300]
[ INFO ] Network outputs:
[ INFO ]     scores (node: scores) : f32 / [...] / [1,3000,81]
[ INFO ]     boxes (node: boxes) : f32 / [...] / [1,3000,4]
[Step 7/11] Loading the model to the device
[ ERROR ] Exception from src\inference\src\cpp\core.cpp:107:
Exception from src\inference\src\dev\plugin.cpp:54:
invalid broadcast

Issue submission checklist

  • I'm reporting an issue. It's not a question.
  • I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
  • There is reproducer code and related data files such as images, videos, models, etc.
@feixuedudiao feixuedudiao added bug Something isn't working support_request labels Jun 12, 2024
@ilya-lavrenov ilya-lavrenov added the category: GPU OpenVINO GPU plugin label Jun 12, 2024
@avitial avitial removed the bug Something isn't working label Jun 14, 2024
@Wan-Intel Wan-Intel self-assigned this Jun 15, 2024
@Wan-Intel

Wan-Intel commented Jun 16, 2024

I've validated ssd_mobilenet_v1_coco with the Benchmark C++ Tool against the OpenVINO™ GitHub Master branch.

Could you please re-build the OpenVINO™ GitHub Master branch and see if the issue is resolved? Documentation on building OpenVINO™ static libraries and building OpenVINO™ from source is as follows:
https://github.com/openvinotoolkit/openvino/blob/master/docs/dev/static_libaries.md
https://github.com/openvinotoolkit/openvino/blob/master/docs/dev/build.md

benchmark_app.exe -m ssd_mobilenet_v1_coco.xml
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.3.0-15702-78fcf9de187
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.3.0-15702-78fcf9de187
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(CPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 72.02 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Network inputs:
[ INFO ] image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]
[ INFO ] Network outputs:
[ INFO ] detection_boxes , detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]
[Step 5/11] Resizing model to match image sizes and given batch
[Step 6/11] Configuring input of the model
[ INFO ] Model batch size: 1
[ INFO ] Network inputs:
[ INFO ] image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]
[ INFO ] Network outputs:
[ INFO ] detection_boxes , detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 491.62 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ] NETWORK_NAME: ssd_mobilenet_v1_coco
[ INFO ] OPTIMAL_NUMBER_OF_INFER_REQUESTS: 4
[ INFO ] NUM_STREAMS: 4
[ INFO ] INFERENCE_NUM_THREADS: 8
[ INFO ] PERF_COUNT: NO
[ INFO ] INFERENCE_PRECISION_HINT: f32
[ INFO ] PERFORMANCE_HINT: THROUGHPUT
[ INFO ] EXECUTION_MODE_HINT: PERFORMANCE
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] ENABLE_CPU_PINNING: NO
[ INFO ] SCHEDULING_CORE_TYPE: ANY_CORE
[ INFO ] MODEL_DISTRIBUTION_POLICY:
[ INFO ] ENABLE_HYPER_THREADING: YES
[ INFO ] EXECUTION_DEVICES: CPU
[ INFO ] CPU_DENORMALS_OPTIMIZATION: NO
[ INFO ] LOG_LEVEL: LOG_NONE
[ INFO ] CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1
[ INFO ] DYNAMIC_QUANTIZATION_GROUP_SIZE: 0
[ INFO ] KV_CACHE_PRECISION: f16
[ INFO ] AFFINITY: NONE
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Test Config 0
[ INFO ] image_tensor ([N,H,W,C], u8, [1,300,300,3], static): random (image/numpy array is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests, limits: 60000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 26.09 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices: [ CPU ]
[ INFO ] Count: 2096 iterations
[ INFO ] Duration: 60136.14 ms
[ INFO ] Latency:
[ INFO ] Median: 82.18 ms
[ INFO ] Average: 114.70 ms
[ INFO ] Min: 41.48 ms
[ INFO ] Max: 1093.93 ms
[ INFO ] Throughput: 34.85 FPS

@feixuedudiao (Author)

feixuedudiao commented Jun 17, 2024

Thanks, I will check it. But this problem also occurs on the GPU device; can you check it on a GPU device?

@Wan-Intel

Did you encounter the same issue after building the OpenVINO™ GitHub Master branch? The latest build version is 2024.3.0-15718-808a908ea92.

Meanwhile, inference of ssd_mobilenet_v1_coco with Benchmark C++ Tool using the OpenVINO™ GitHub Master branch GPU plugin is shown as follows:

benchmark_app.exe -m ssd_mobilenet_v1_coco.xml -t 1 -d GPU
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.3.0-15718-808a908ea92
[ INFO ]
[ INFO ] Device info:
[ INFO ] GPU
[ INFO ] Build ................................. 2024.3.0-15718-808a908ea92
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(GPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 20.39 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Network inputs:
[ INFO ] image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]
[ INFO ] Network outputs:
[ INFO ] detection_boxes , detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]
[Step 5/11] Resizing model to match image sizes and given batch
[Step 6/11] Configuring input of the model
[ INFO ] Model batch size: 1
[ INFO ] Network inputs:
[ INFO ] image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]
[ INFO ] Network outputs:
[ INFO ] detection_boxes , detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 5701.54 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ] NETWORK_NAME: ssd_mobilenet_v1_coco
[ INFO ] OPTIMAL_NUMBER_OF_INFER_REQUESTS: 4
[ INFO ] PERF_COUNT: NO
[ INFO ] ENABLE_CPU_PINNING: NO
[ INFO ] MODEL_PRIORITY: MEDIUM
[ INFO ] GPU_HOST_TASK_PRIORITY: MEDIUM
[ INFO ] GPU_QUEUE_PRIORITY: MEDIUM
[ INFO ] GPU_QUEUE_THROTTLE: MEDIUM
[ INFO ] GPU_ENABLE_LOOP_UNROLLING: YES
[ INFO ] GPU_DISABLE_WINOGRAD_CONVOLUTION: NO
[ INFO ] CACHE_DIR:
[ INFO ] CACHE_MODE: optimize_speed
[ INFO ] PERFORMANCE_HINT: THROUGHPUT
[ INFO ] EXECUTION_MODE_HINT: PERFORMANCE
[ INFO ] COMPILATION_NUM_THREADS: 8
[ INFO ] NUM_STREAMS: 2
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] INFERENCE_PRECISION_HINT: f16
[ INFO ] DEVICE_ID: 0
[ INFO ] EXECUTION_DEVICES: GPU.0
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Test Config 0
[ INFO ] image_tensor ([N,H,W,C], u8, [1,300,300,3], static): random (image/numpy array is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests, limits: 1000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 16.93 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices: [ GPU.0 ]
[ INFO ] Count: 124 iterations
[ INFO ] Duration: 1041.24 ms
[ INFO ] Latency:
[ INFO ] Median: 31.61 ms
[ INFO ] Average: 33.14 ms
[ INFO ] Min: 20.64 ms
[ INFO ] Max: 46.92 ms
[ INFO ] Throughput: 119.09 FPS

@feixuedudiao (Author)

> Did you encounter the same issue after building the OpenVINO™ GitHub Master branch? The latest build version is 2024.3.0-15718-808a908ea92.
> […]

Thanks, yes, the problem is the same as yours. Can I build the latest version? In the branch, I can't find version 2024.3.0-15718-808a908ea92.
[screenshot attached]

@feixuedudiao (Author)

@Wan-Intel I rebuilt and verified it from the master branch, and found that it still does not work. The specific log information is as follows. Also, where can the new version 2024.3.0-15718-808a908ea92 be obtained?
benchmark_app.exe -m ssd.xml -t 1 -d GPU
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.3.0-15743-15257f1bac1
[ INFO ]
[ INFO ] Device info:
[ INFO ] GPU
[ INFO ] Build ................................. 2024.3.0-15743-15257f1bac1
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(GPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 388.11 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Network inputs:
[ INFO ] input_0 (node: input_0) : f32 / [...] / [1,3,300,300]
[ INFO ] Network outputs:
[ INFO ] scores (node: scores) : f32 / [...] / [1,3000,81]
[ INFO ] boxes (node: boxes) : f32 / [...] / [1,3000,4]
[Step 5/11] Resizing model to match image sizes and given batch
[ WARNING ] input_0: layout is not set explicitly, so it is defaulted to NCHW. It is STRONGLY recommended to set layout manually to avoid further issues.
[Step 6/11] Configuring input of the model
[ INFO ] Model batch size: 1
[ INFO ] Network inputs:
[ INFO ] input_0 (node: input_0) : u8 / [N,C,H,W] / [1,3,300,300]
[ INFO ] Network outputs:
[ INFO ] scores (node: scores) : f32 / [...] / [1,3000,81]
[ INFO ] boxes (node: boxes) : f32 / [...] / [1,3000,4]
[Step 7/11] Loading the model to the device
[ ERROR ] Exception from src\inference\src\cpp\core.cpp:107:
Exception from src\inference\src\dev\plugin.cpp:53:
bad combination

@Wan-Intel

Wan-Intel commented Jun 20, 2024

I built the OpenVINO™ GitHub Master branch via the following commands:

git clone https://github.com/openvinotoolkit/openvino.git
cd openvino
git submodule update --init
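
Equivalently, a single recursive clone should fetch the submodules in one step:

git clone --recursive https://github.com/openvinotoolkit/openvino.git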

Did you encounter the "bad combination" error when running inference with the CPU plugin as well? Could you please provide the following information?

  • Hardware Specification
  • Host Operating System

@feixuedudiao (Author)

feixuedudiao commented Jun 20, 2024

@Wan-Intel Thanks.
Hardware specification: "Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz, and 16.0 GB RAM"
Host operating system: "Windows 10 Enterprise 20H2"
I built the OpenVINO static library with the following command:
cmake -G "Visual Studio 16 2019" -DCMAKE_BUILD_TYPE=release -DENABLE_OV_IR_FRONTEND=ON -DBUILD_SHARED_LIBS=OFF -DENABLE_TEMPLATE=OFF -DENABLE_HETERO=OFF -DENABLE_MULTI=OFF -DENABLE_AUTO_BATCH=OFF -DENABLE_INTEL_NPU=OFF -DENABLE_JS=OFF -DENABLE_PYTHON=OFF -DENABLE_WHEEL=OFF -DENABLE_OV_ONNX_FRONTEND=OFF -DENABLE_OV_PADDLE_FRONTEND=OFF -DENABLE_OV_TF_FRONTEND=OFF -DENABLE_OV_TF_LITE_FRONTEND=OFF -DENABLE_OV_PYTORCH_FRONTEND=OFF -DENABLE_MLAS_FOR_CPU=ON -DENABLE_SYSTEM_OPENCL=OFF -DENABLE_SYSTEM_FLATBUFFERS=OFF

@Wan-Intel

Wan-Intel commented Jun 22, 2024

Hi, I noticed that your CMake command did not specify the <path/to/openvino> source directory.

I specified the <path/to/openvino> in your CMake command and built OpenVINO™ from source successfully on a Windows 10 machine.

The Benchmark C++ Tool ran successfully with the Intel® CPU and Intel® GPU plugins. The inference results are as follows:

benchmark_app.exe -m "ssd_mobilenet_v1_coco.xml" -t 1 -d CPU
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.3.0-15771-6a7c44220f0
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.3.0-15771-6a7c44220f0
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(CPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 28.32 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Network inputs:
[ INFO ] image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]
[ INFO ] Network outputs:
[ INFO ] detection_boxes , detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]
[Step 5/11] Resizing model to match image sizes and given batch
[Step 6/11] Configuring input of the model
[ INFO ] Model batch size: 1
[ INFO ] Network inputs:
[ INFO ] image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]
[ INFO ] Network outputs:
[ INFO ] detection_boxes , detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 384.69 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ] NETWORK_NAME: ssd_mobilenet_v1_coco
[ INFO ] OPTIMAL_NUMBER_OF_INFER_REQUESTS: 4
[ INFO ] NUM_STREAMS: 4
[ INFO ] INFERENCE_NUM_THREADS: 8
[ INFO ] PERF_COUNT: NO
[ INFO ] INFERENCE_PRECISION_HINT: f32
[ INFO ] PERFORMANCE_HINT: THROUGHPUT
[ INFO ] EXECUTION_MODE_HINT: PERFORMANCE
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] ENABLE_CPU_PINNING: NO
[ INFO ] SCHEDULING_CORE_TYPE: ANY_CORE
[ INFO ] MODEL_DISTRIBUTION_POLICY:
[ INFO ] ENABLE_HYPER_THREADING: YES
[ INFO ] EXECUTION_DEVICES: CPU
[ INFO ] CPU_DENORMALS_OPTIMIZATION: NO
[ INFO ] LOG_LEVEL: LOG_NONE
[ INFO ] CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1
[ INFO ] DYNAMIC_QUANTIZATION_GROUP_SIZE: 0
[ INFO ] KV_CACHE_PRECISION: f16
[ INFO ] AFFINITY: NONE
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Test Config 0
[ INFO ] image_tensor ([N,H,W,C], u8, [1,300,300,3], static): random (image/numpy array is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests, limits: 1000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 37.48 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices: [ CPU ]
[ INFO ] Count: 44 iterations
[ INFO ] Duration: 1131.67 ms
[ INFO ] Latency:
[ INFO ] Median: 79.52 ms
[ INFO ] Average: 101.08 ms
[ INFO ] Min: 56.44 ms
[ INFO ] Max: 244.95 ms
[ INFO ] Throughput: 38.88 FPS

benchmark_app.exe -m "ssd_mobilenet_v1_coco.xml" -t 1 -d GPU
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.3.0-15771-6a7c44220f0
[ INFO ]
[ INFO ] Device info:
[ INFO ] GPU
[ INFO ] Build ................................. 2024.3.0-15771-6a7c44220f0
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(GPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 17.82 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Network inputs:
[ INFO ] image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]
[ INFO ] Network outputs:
[ INFO ] detection_boxes , detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]
[Step 5/11] Resizing model to match image sizes and given batch
[Step 6/11] Configuring input of the model
[ INFO ] Model batch size: 1
[ INFO ] Network inputs:
[ INFO ] image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]
[ INFO ] Network outputs:
[ INFO ] detection_boxes , detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 5600.81 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ] NETWORK_NAME: ssd_mobilenet_v1_coco
[ INFO ] OPTIMAL_NUMBER_OF_INFER_REQUESTS: 4
[ INFO ] PERF_COUNT: NO
[ INFO ] ENABLE_CPU_PINNING: NO
[ INFO ] MODEL_PRIORITY: MEDIUM
[ INFO ] GPU_HOST_TASK_PRIORITY: MEDIUM
[ INFO ] GPU_QUEUE_PRIORITY: MEDIUM
[ INFO ] GPU_QUEUE_THROTTLE: MEDIUM
[ INFO ] GPU_ENABLE_LOOP_UNROLLING: YES
[ INFO ] GPU_DISABLE_WINOGRAD_CONVOLUTION: NO
[ INFO ] CACHE_DIR:
[ INFO ] CACHE_MODE: optimize_speed
[ INFO ] PERFORMANCE_HINT: THROUGHPUT
[ INFO ] EXECUTION_MODE_HINT: PERFORMANCE
[ INFO ] COMPILATION_NUM_THREADS: 8
[ INFO ] NUM_STREAMS: 2
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] INFERENCE_PRECISION_HINT: f16
[ INFO ] DEVICE_ID: 0
[ INFO ] EXECUTION_DEVICES: GPU.0
[ INFO ] DYNAMIC_QUANTIZATION_GROUP_SIZE: 0
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Test Config 0
[ INFO ] image_tensor ([N,H,W,C], u8, [1,300,300,3], static): random (image/numpy array is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests, limits: 1000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 43.21 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices: [ GPU.0 ]
[ INFO ] Count: 136 iterations
[ INFO ] Duration: 1060.05 ms
[ INFO ] Latency:
[ INFO ] Median: 31.04 ms
[ INFO ] Average: 30.75 ms
[ INFO ] Min: 21.18 ms
[ INFO ] Max: 49.56 ms
[ INFO ] Throughput: 128.30 FPS

Could you please re-build OpenVINO™ from source, specifying the <path/to/openvino> in your CMake command, and see if the issue is resolved?

@wenjiew wenjiew changed the title [Bug]: executing benchmark.exe -m ssd.xml -d cpu reports "invalid broadcast" with the openvino static library [Bug]: execution of "benchmark.exe -m ssd.xml -d cpu" reporting error in openvino static library with "invalid broadcast" Jun 24, 2024
@wenjiew wenjiew changed the title [Bug]: execution of "benchmark.exe -m ssd.xml -d cpu" reporting error in openvino static library with "invalid broadcast" [Bug]: execution of "benchmark.exe -m ssd.xml -d cpu" reporting "invalid broadcast" error with openvino static library Jun 24, 2024
@feixuedudiao (Author)

@Wan-Intel
Thanks. Do you mean that I didn't specify the <path/to/openvino> via the "CMAKE_PREFIX_PATH" variable? The default path is bin\intel64. Can you tell me what yours is? I will try it.

@Wan-Intel

Wan-Intel commented Jun 25, 2024

You may specify the path to the location of the OpenVINO™ folder as follows:

cmake -G "Visual Studio 16 2019" -DCMAKE_BUILD_TYPE=release -DENABLE_OV_IR_FRONTEND=ON -DBUILD_SHARED_LIBS=OFF -DENABLE_TEMPLATE=OFF -DENABLE_HETERO=OFF -DENABLE_MULTI=OFF -DENABLE_AUTO_BATCH=OFF -DENABLE_INTEL_NPU=OFF -DENABLE_JS=OFF -DENABLE_PYTHON=OFF -DENABLE_WHEEL=OFF -DENABLE_OV_ONNX_FRONTEND=OFF -DENABLE_OV_PADDLE_FRONTEND=OFF -DENABLE_OV_TF_FRONTEND=OFF -DENABLE_OV_TF_LITE_FRONTEND=OFF -DENABLE_OV_PYTORCH_FRONTEND=OFF -DENABLE_MLAS_FOR_CPU=ON -DENABLE_SYSTEM_OPENCL=OFF -DENABLE_SYSTEM_FLATBUFFERS=OFF "C:\Users\myusername\Downloads\openvino"

You may proceed to use the build command as shown in the following link:
https://github.com/openvinotoolkit/openvino/blob/master/docs/dev/static_libaries.md#build-static-openvino-libraries
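
For reference, the generic CMake build step after configuring (a sketch only, assuming the Release configuration matching -DCMAKE_BUILD_TYPE=release above):

cmake --build . --config Release --parallel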

Please get back to us if the issue persists.

@feixuedudiao (Author)

@Wan-Intel OK, thanks. I will rebuild OpenVINO the way you did.
