
[Bug]: execution of "benchmark.exe -m ssd.xml -d cpu" reporting "invalid broadcast" error with openvino static library #24963

Open
feixuedudiao opened this issue Jun 12, 2024 · 11 comments

@feixuedudiao

OpenVINO Version

2024.01/2024.1.0

Operating System

Windows System

Device used for inference

CPU

Framework

None

Model used

ssd

Issue description

When running the benchmark test of the SSD model with OpenVINO compiled as a static library, the error "Exception from src\inference\src\dev\plugin.cpp:54: invalid broadcast" is reported on both CPU and GPU devices. Strangely, when the OpenVINO library is built as a DLL, the benchmark runs normally on both CPU and GPU devices.
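A minimal C++ sketch of the failing step (an illustration only, assuming the standard OpenVINO 2.0 runtime API that benchmark_app uses internally; the model path and device name are taken from the command lines below):

#include <openvino/openvino.hpp>
#include <iostream>

int main() {
    try {
        ov::Core core;
        // Step 4 equivalent: read the IR
        auto model = core.read_model("ssd.xml");
        // Step 7 equivalent: compile for the target device;
        // with the static build, this is the call that throws "invalid broadcast"
        auto compiled = core.compile_model(model, "CPU");
        std::cout << "Model compiled successfully" << std::endl;
    } catch (const ov::Exception& e) {
        std::cerr << e.what() << std::endl;  // prints the broadcast exception
        return 1;
    }
    return 0;
}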

Step-by-step reproduction

No response

Relevant log output

the GPU log:
λ benchmark_app.exe -m ssd.xml ssd.bin -d GPU
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.2.0-15316-22dcf50ce01
[ INFO ]
[ INFO ] Device info:
[ INFO ] GPU
[ INFO ] Build ................................. 2024.2.0-15316-22dcf50ce01
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(GPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 22.72 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Network inputs:
[ INFO ]     input_0 (node: input_0) : f32 / [...] / [1,3,300,300]
[ INFO ] Network outputs:
[ INFO ]     scores (node: scores) : f32 / [...] / [1,3000,81]
[ INFO ]     boxes (node: boxes) : f32 / [...] / [1,3000,4]
[Step 5/11] Resizing model to match image sizes and given batch
[ WARNING ] input_0: layout is not set explicitly, so it is defaulted to NCHW. It is STRONGLY recommended to set layout manually to avoid further issues.
[Step 6/11] Configuring input of the model
[ INFO ] Model batch size: 1
[ INFO ] Network inputs:
[ INFO ]     input_0 (node: input_0) : u8 / [N,C,H,W] / [1,3,300,300]
[ INFO ] Network outputs:
[ INFO ]     scores (node: scores) : f32 / [...] / [1,3000,81]
[ INFO ]     boxes (node: boxes) : f32 / [...] / [1,3000,4]
[Step 7/11] Loading the model to the device
[ ERROR ] Exception from src\inference\src\cpp\core.cpp:107:
Exception from src\inference\src\dev\plugin.cpp:54:
invalid broadcast

the CPU log:
λ benchmark_app.exe -m ssd.xml -d CPU
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.2.0-15316-22dcf50ce01
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.2.0-15316-22dcf50ce01
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(CPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 428.26 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Network inputs:
[ INFO ]     input_0 (node: input_0) : f32 / [...] / [1,3,300,300]
[ INFO ] Network outputs:
[ INFO ]     scores (node: scores) : f32 / [...] / [1,3000,81]
[ INFO ]     boxes (node: boxes) : f32 / [...] / [1,3000,4]
[Step 5/11] Resizing model to match image sizes and given batch
[ WARNING ] input_0: layout is not set explicitly, so it is defaulted to NCHW. It is STRONGLY recommended to set layout manually to avoid further issues.
[Step 6/11] Configuring input of the model
[ INFO ] Model batch size: 1
[ INFO ] Network inputs:
[ INFO ]     input_0 (node: input_0) : u8 / [N,C,H,W] / [1,3,300,300]
[ INFO ] Network outputs:
[ INFO ]     scores (node: scores) : f32 / [...] / [1,3000,81]
[ INFO ]     boxes (node: boxes) : f32 / [...] / [1,3000,4]
[Step 7/11] Loading the model to the device
[ ERROR ] Exception from src\inference\src\cpp\core.cpp:107:
Exception from src\inference\src\dev\plugin.cpp:54:
invalid broadcast

Issue submission checklist

  • I'm reporting an issue. It's not a question.
  • I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
  • There is reproducer code and related data files such as images, videos, models, etc.
@feixuedudiao feixuedudiao added bug Something isn't working support_request labels Jun 12, 2024
@ilya-lavrenov ilya-lavrenov added the category: GPU OpenVINO GPU plugin label Jun 12, 2024
@avitial avitial removed the bug Something isn't working label Jun 14, 2024
@Wan-Intel Wan-Intel self-assigned this Jun 15, 2024
@Wan-Intel

Wan-Intel commented Jun 16, 2024

I've validated ssd_mobilenet_v1_coco with the Benchmark C++ Tool against the OpenVINO™ GitHub Master branch.

Could you please re-build the OpenVINO™ GitHub Master branch and see if the issue is resolved? Documentation on building OpenVINO™ static libraries and building OpenVINO™ from source is as follows:
https://github.com/openvinotoolkit/openvino/blob/master/docs/dev/static_libaries.md
https://github.com/openvinotoolkit/openvino/blob/master/docs/dev/build.md

benchmark_app.exe -m ssd_mobilenet_v1_coco.xml
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.3.0-15702-78fcf9de187
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.3.0-15702-78fcf9de187
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(CPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 72.02 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Network inputs:
[ INFO ] image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]
[ INFO ] Network outputs:
[ INFO ] detection_boxes , detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]
[Step 5/11] Resizing model to match image sizes and given batch
[Step 6/11] Configuring input of the model
[ INFO ] Model batch size: 1
[ INFO ] Network inputs:
[ INFO ] image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]
[ INFO ] Network outputs:
[ INFO ] detection_boxes , detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 491.62 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ] NETWORK_NAME: ssd_mobilenet_v1_coco
[ INFO ] OPTIMAL_NUMBER_OF_INFER_REQUESTS: 4
[ INFO ] NUM_STREAMS: 4
[ INFO ] INFERENCE_NUM_THREADS: 8
[ INFO ] PERF_COUNT: NO
[ INFO ] INFERENCE_PRECISION_HINT: f32
[ INFO ] PERFORMANCE_HINT: THROUGHPUT
[ INFO ] EXECUTION_MODE_HINT: PERFORMANCE
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] ENABLE_CPU_PINNING: NO
[ INFO ] SCHEDULING_CORE_TYPE: ANY_CORE
[ INFO ] MODEL_DISTRIBUTION_POLICY:
[ INFO ] ENABLE_HYPER_THREADING: YES
[ INFO ] EXECUTION_DEVICES: CPU
[ INFO ] CPU_DENORMALS_OPTIMIZATION: NO
[ INFO ] LOG_LEVEL: LOG_NONE
[ INFO ] CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1
[ INFO ] DYNAMIC_QUANTIZATION_GROUP_SIZE: 0
[ INFO ] KV_CACHE_PRECISION: f16
[ INFO ] AFFINITY: NONE
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Test Config 0
[ INFO ] image_tensor ([N,H,W,C], u8, [1,300,300,3], static): random (image/numpy array is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests, limits: 60000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 26.09 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices: [ CPU ]
[ INFO ] Count: 2096 iterations
[ INFO ] Duration: 60136.14 ms
[ INFO ] Latency:
[ INFO ] Median: 82.18 ms
[ INFO ] Average: 114.70 ms
[ INFO ] Min: 41.48 ms
[ INFO ] Max: 1093.93 ms
[ INFO ] Throughput: 34.85 FPS

@feixuedudiao (Author)

feixuedudiao commented Jun 17, 2024

Thanks, I will check it. But this problem also occurs on the GPU device; can you check it on a GPU device?

@Wan-Intel

Did you encounter the same issue after building the OpenVINO™ GitHub Master branch? The latest build version is 2024.3.0-15718-808a908ea92.

Meanwhile, inference of ssd_mobilenet_v1_coco with Benchmark C++ Tool using the OpenVINO™ GitHub Master branch GPU plugin is shown as follows:

benchmark_app.exe -m ssd_mobilenet_v1_coco.xml -t 1 -d GPU
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.3.0-15718-808a908ea92
[ INFO ]
[ INFO ] Device info:
[ INFO ] GPU
[ INFO ] Build ................................. 2024.3.0-15718-808a908ea92
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(GPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 20.39 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Network inputs:
[ INFO ] image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]
[ INFO ] Network outputs:
[ INFO ] detection_boxes , detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]
[Step 5/11] Resizing model to match image sizes and given batch
[Step 6/11] Configuring input of the model
[ INFO ] Model batch size: 1
[ INFO ] Network inputs:
[ INFO ] image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]
[ INFO ] Network outputs:
[ INFO ] detection_boxes , detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 5701.54 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ] NETWORK_NAME: ssd_mobilenet_v1_coco
[ INFO ] OPTIMAL_NUMBER_OF_INFER_REQUESTS: 4
[ INFO ] PERF_COUNT: NO
[ INFO ] ENABLE_CPU_PINNING: NO
[ INFO ] MODEL_PRIORITY: MEDIUM
[ INFO ] GPU_HOST_TASK_PRIORITY: MEDIUM
[ INFO ] GPU_QUEUE_PRIORITY: MEDIUM
[ INFO ] GPU_QUEUE_THROTTLE: MEDIUM
[ INFO ] GPU_ENABLE_LOOP_UNROLLING: YES
[ INFO ] GPU_DISABLE_WINOGRAD_CONVOLUTION: NO
[ INFO ] CACHE_DIR:
[ INFO ] CACHE_MODE: optimize_speed
[ INFO ] PERFORMANCE_HINT: THROUGHPUT
[ INFO ] EXECUTION_MODE_HINT: PERFORMANCE
[ INFO ] COMPILATION_NUM_THREADS: 8
[ INFO ] NUM_STREAMS: 2
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] INFERENCE_PRECISION_HINT: f16
[ INFO ] DEVICE_ID: 0
[ INFO ] EXECUTION_DEVICES: GPU.0
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Test Config 0
[ INFO ] image_tensor ([N,H,W,C], u8, [1,300,300,3], static): random (image/numpy array is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests, limits: 1000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 16.93 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices: [ GPU.0 ]
[ INFO ] Count: 124 iterations
[ INFO ] Duration: 1041.24 ms
[ INFO ] Latency:
[ INFO ] Median: 31.61 ms
[ INFO ] Average: 33.14 ms
[ INFO ] Min: 20.64 ms
[ INFO ] Max: 46.92 ms
[ INFO ] Throughput: 119.09 FPS

@feixuedudiao (Author)

> Did you encounter the same issue after building the OpenVINO™ GitHub Master branch? The latest build version is 2024.3.0-15718-808a908ea92.
> […]

Thanks, yes, the problem is the same as yours. Can I build the latest version? In the branch, I can't find version 2024.3.0-15718-808a908ea92.
[screenshot attached]

@feixuedudiao (Author)

@Wan-Intel I rebuilt and verified it from the master branch, and found that it still does not work. The specific log information is as follows. Also, where can the new version 2024.3.0-15718-808a908ea92 be obtained?
benchmark_app.exe -m ssd.xml -t 1 -d GPU
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.3.0-15743-15257f1bac1
[ INFO ]
[ INFO ] Device info:
[ INFO ] GPU
[ INFO ] Build ................................. 2024.3.0-15743-15257f1bac1
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(GPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 388.11 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Network inputs:
[ INFO ] input_0 (node: input_0) : f32 / [...] / [1,3,300,300]
[ INFO ] Network outputs:
[ INFO ] scores (node: scores) : f32 / [...] / [1,3000,81]
[ INFO ] boxes (node: boxes) : f32 / [...] / [1,3000,4]
[Step 5/11] Resizing model to match image sizes and given batch
[ WARNING ] input_0: layout is not set explicitly, so it is defaulted to NCHW. It is STRONGLY recommended to set layout manually to avoid further issues.
[Step 6/11] Configuring input of the model
[ INFO ] Model batch size: 1
[ INFO ] Network inputs:
[ INFO ] input_0 (node: input_0) : u8 / [N,C,H,W] / [1,3,300,300]
[ INFO ] Network outputs:
[ INFO ] scores (node: scores) : f32 / [...] / [1,3000,81]
[ INFO ] boxes (node: boxes) : f32 / [...] / [1,3000,4]
[Step 7/11] Loading the model to the device
[ ERROR ] Exception from src\inference\src\cpp\core.cpp:107:
Exception from src\inference\src\dev\plugin.cpp:53:
bad combination

@Wan-Intel

Wan-Intel commented Jun 20, 2024

I built the OpenVINO™ GitHub Master branch via the following commands:

git clone https://github.com/openvinotoolkit/openvino.git
cd openvino
git submodule update --init
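
Equivalently, a single recursive clone should fetch the submodules in one step:

git clone --recursive https://github.com/openvinotoolkit/openvino.git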

Did you encounter the "bad combination" error when running inference with the CPU plugin as well? Could you please provide the following information?

  • Hardware Specification
  • Host Operating System

@feixuedudiao (Author)

feixuedudiao commented Jun 20, 2024

@Wan-Intel Thanks.
Hardware specification: "Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz, and 16.0 GB RAM"
Host operating system: "Windows 10 Enterprise 20H2"
I built the OpenVINO static library with the following command:
cmake -G "Visual Studio 16 2019" -DCMAKE_BUILD_TYPE=release -DENABLE_OV_IR_FRONTEND=ON -DBUILD_SHARED_LIBS=OFF -DENABLE_TEMPLATE=OFF -DENABLE_HETERO=OFF -DENABLE_MULTI=OFF -DENABLE_AUTO_BATCH=OFF -DENABLE_INTEL_NPU=OFF -DENABLE_JS=OFF -DENABLE_PYTHON=OFF -DENABLE_WHEEL=OFF -DENABLE_OV_ONNX_FRONTEND=OFF -DENABLE_OV_PADDLE_FRONTEND=OFF -DENABLE_OV_TF_FRONTEND=OFF -DENABLE_OV_TF_LITE_FRONTEND=OFF -DENABLE_OV_PYTORCH_FRONTEND=OFF -DENABLE_MLAS_FOR_CPU=ON -DENABLE_SYSTEM_OPENCL=OFF -DENABLE_SYSTEM_FLATBUFFERS=OFF

@Wan-Intel

Wan-Intel commented Jun 22, 2024

Hi, I noticed that your CMake command did not specify the <path/to/openvino> source directory.

I specified the <path/to/openvino> in your CMake command and built OpenVINO™ from source successfully on a Windows 10 machine.

The Benchmark C++ Tool ran successfully with the Intel® CPU and Intel® GPU plugins. The inference results are as follows:

benchmark_app.exe -m "ssd_mobilenet_v1_coco.xml" -t 1 -d CPU
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.3.0-15771-6a7c44220f0
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.3.0-15771-6a7c44220f0
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(CPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 28.32 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Network inputs:
[ INFO ] image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]
[ INFO ] Network outputs:
[ INFO ] detection_boxes , detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]
[Step 5/11] Resizing model to match image sizes and given batch
[Step 6/11] Configuring input of the model
[ INFO ] Model batch size: 1
[ INFO ] Network inputs:
[ INFO ] image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]
[ INFO ] Network outputs:
[ INFO ] detection_boxes , detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 384.69 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ] NETWORK_NAME: ssd_mobilenet_v1_coco
[ INFO ] OPTIMAL_NUMBER_OF_INFER_REQUESTS: 4
[ INFO ] NUM_STREAMS: 4
[ INFO ] INFERENCE_NUM_THREADS: 8
[ INFO ] PERF_COUNT: NO
[ INFO ] INFERENCE_PRECISION_HINT: f32
[ INFO ] PERFORMANCE_HINT: THROUGHPUT
[ INFO ] EXECUTION_MODE_HINT: PERFORMANCE
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] ENABLE_CPU_PINNING: NO
[ INFO ] SCHEDULING_CORE_TYPE: ANY_CORE
[ INFO ] MODEL_DISTRIBUTION_POLICY:
[ INFO ] ENABLE_HYPER_THREADING: YES
[ INFO ] EXECUTION_DEVICES: CPU
[ INFO ] CPU_DENORMALS_OPTIMIZATION: NO
[ INFO ] LOG_LEVEL: LOG_NONE
[ INFO ] CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1
[ INFO ] DYNAMIC_QUANTIZATION_GROUP_SIZE: 0
[ INFO ] KV_CACHE_PRECISION: f16
[ INFO ] AFFINITY: NONE
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Test Config 0
[ INFO ] image_tensor ([N,H,W,C], u8, [1,300,300,3], static): random (image/numpy array is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests, limits: 1000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 37.48 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices: [ CPU ]
[ INFO ] Count: 44 iterations
[ INFO ] Duration: 1131.67 ms
[ INFO ] Latency:
[ INFO ] Median: 79.52 ms
[ INFO ] Average: 101.08 ms
[ INFO ] Min: 56.44 ms
[ INFO ] Max: 244.95 ms
[ INFO ] Throughput: 38.88 FPS

benchmark_app.exe -m "ssd_mobilenet_v1_coco.xml" -t 1 -d GPU
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.3.0-15771-6a7c44220f0
[ INFO ]
[ INFO ] Device info:
[ INFO ] GPU
[ INFO ] Build ................................. 2024.3.0-15771-6a7c44220f0
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(GPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 17.82 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Network inputs:
[ INFO ] image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]
[ INFO ] Network outputs:
[ INFO ] detection_boxes , detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]
[Step 5/11] Resizing model to match image sizes and given batch
[Step 6/11] Configuring input of the model
[ INFO ] Model batch size: 1
[ INFO ] Network inputs:
[ INFO ] image_tensor , image_tensor:0 (node: image_tensor) : u8 / [N,H,W,C] / [1,300,300,3]
[ INFO ] Network outputs:
[ INFO ] detection_boxes , detection_boxes:0 (node: DetectionOutput) : f32 / [...] / [1,1,100,7]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 5600.81 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ] NETWORK_NAME: ssd_mobilenet_v1_coco
[ INFO ] OPTIMAL_NUMBER_OF_INFER_REQUESTS: 4
[ INFO ] PERF_COUNT: NO
[ INFO ] ENABLE_CPU_PINNING: NO
[ INFO ] MODEL_PRIORITY: MEDIUM
[ INFO ] GPU_HOST_TASK_PRIORITY: MEDIUM
[ INFO ] GPU_QUEUE_PRIORITY: MEDIUM
[ INFO ] GPU_QUEUE_THROTTLE: MEDIUM
[ INFO ] GPU_ENABLE_LOOP_UNROLLING: YES
[ INFO ] GPU_DISABLE_WINOGRAD_CONVOLUTION: NO
[ INFO ] CACHE_DIR:
[ INFO ] CACHE_MODE: optimize_speed
[ INFO ] PERFORMANCE_HINT: THROUGHPUT
[ INFO ] EXECUTION_MODE_HINT: PERFORMANCE
[ INFO ] COMPILATION_NUM_THREADS: 8
[ INFO ] NUM_STREAMS: 2
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] INFERENCE_PRECISION_HINT: f16
[ INFO ] DEVICE_ID: 0
[ INFO ] EXECUTION_DEVICES: GPU.0
[ INFO ] DYNAMIC_QUANTIZATION_GROUP_SIZE: 0
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Test Config 0
[ INFO ] image_tensor ([N,H,W,C], u8, [1,300,300,3], static): random (image/numpy array is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests, limits: 1000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 43.21 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices: [ GPU.0 ]
[ INFO ] Count: 136 iterations
[ INFO ] Duration: 1060.05 ms
[ INFO ] Latency:
[ INFO ] Median: 31.04 ms
[ INFO ] Average: 30.75 ms
[ INFO ] Min: 21.18 ms
[ INFO ] Max: 49.56 ms
[ INFO ] Throughput: 128.30 FPS

Could you please re-build OpenVINO™ from source, specifying the <path/to/openvino> in your CMake command, and see if the issue is resolved?

@wenjiew wenjiew changed the title [Bug]: executing benchmark.exe -m ssd.xml -d cpu reports "invalid broadcast" with the openvino static library [Bug]: execution of "benchmark.exe -m ssd.xml -d cpu" reporting error in openvino static library with "invalid broadcast" Jun 24, 2024
@wenjiew wenjiew changed the title [Bug]: execution of "benchmark.exe -m ssd.xml -d cpu" reporting error in openvino static library with "invalid broadcast" [Bug]: execution of "benchmark.exe -m ssd.xml -d cpu" reporting "invalid broadcast" error with openvino static library Jun 24, 2024
@feixuedudiao (Author)

@Wan-Intel
Thanks. Do you mean that I didn't specify the <path/to/openvino> via the "CMAKE_PREFIX_PATH" variable? The default path is bin\intel64. Can you tell me what yours is? I will try it.

@Wan-Intel

Wan-Intel commented Jun 25, 2024

You may specify the path to the location of the OpenVINO™ folder as follows:

cmake -G "Visual Studio 16 2019" -DCMAKE_BUILD_TYPE=release -DENABLE_OV_IR_FRONTEND=ON -DBUILD_SHARED_LIBS=OFF -DENABLE_TEMPLATE=OFF -DENABLE_HETERO=OFF -DENABLE_MULTI=OFF -DENABLE_AUTO_BATCH=OFF -DENABLE_INTEL_NPU=OFF -DENABLE_JS=OFF -DENABLE_PYTHON=OFF -DENABLE_WHEEL=OFF -DENABLE_OV_ONNX_FRONTEND=OFF -DENABLE_OV_PADDLE_FRONTEND=OFF -DENABLE_OV_TF_FRONTEND=OFF -DENABLE_OV_TF_LITE_FRONTEND=OFF -DENABLE_OV_PYTORCH_FRONTEND=OFF -DENABLE_MLAS_FOR_CPU=ON -DENABLE_SYSTEM_OPENCL=OFF -DENABLE_SYSTEM_FLATBUFFERS=OFF "C:\Users\myusername\Downloads\openvino"

You may proceed to use the build command as shown in the following link:
https://github.com/openvinotoolkit/openvino/blob/master/docs/dev/static_libaries.md#build-static-openvino-libraries
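
For reference, the generic CMake build step after configuring (a sketch only, assuming the Release configuration matching -DCMAKE_BUILD_TYPE=release above):

cmake --build . --config Release --parallel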

Please get back to us if the issue persists.

@feixuedudiao (Author)

@Wan-Intel OK, thanks. I will rebuild OpenVINO the way you did.
