ImageAnalyzer running VNRecognizeTextRequest in the background? #3

spencer-hong · 2022-12-14T03:02:51Z

Using VisionKit and Vision, there are two main ways to get text from images: ImageAnalyzer and VNRecognizeTextRequest.

Based on some OCR tests, I'm seeing that the outputs from these two methods are different. Initially, I thought ImageAnalyzer was running VNRequestTextRecognitionLevel.fast because it's for Live Text, but the outputs from ImageAnalyzer are sometimes better than VNRequestTextRecognitionLevel.accurate.
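For reference, here is a minimal sketch of the ImageAnalyzer path, assuming iOS 16 / macOS 13 or later. The function name `liveTextTranscript` is made up for illustration. Note that the public API does not expose a recognition level, so nothing here reveals which model runs underneath.

```swift
import VisionKit
import CoreGraphics
import ImageIO

/// Returns the full-text transcript that ImageAnalyzer produces for an image,
/// or nil when Live Text analysis is not supported on this device.
/// (Hypothetical helper for this sketch, not part of VisionKit.)
@available(iOS 16.0, macOS 13.0, *)
func liveTextTranscript(of image: CGImage) async throws -> String? {
    guard ImageAnalyzer.isSupported else { return nil }

    let analyzer = ImageAnalyzer()
    // Ask only for text; machine-readable codes are a separate analysis type.
    let configuration = ImageAnalyzer.Configuration([.text])

    let analysis = try await analyzer.analyze(image,
                                              orientation: .up,
                                              configuration: configuration)
    return analysis.transcript
}
```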

VNRecognizeTextRequest does have more options, including language correction and custom words.
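A minimal sketch of the Vision path with those options set (the class is spelled `VNRecognizeTextRequest`). The function name `recognizeText` and its `customWords` parameter are illustrative, not part of Vision.

```swift
import Vision
import CoreGraphics

/// Runs Vision's text recognizer on an image and returns one string per
/// recognized text observation. (Hypothetical helper for this sketch.)
func recognizeText(in image: CGImage, customWords: [String] = []) throws -> [String] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate      // the alternative is .fast
    request.usesLanguageCorrection = true
    // Vision ignores customWords when usesLanguageCorrection is false.
    request.customWords = customWords

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])            // synchronous call

    // Keep only the top-ranked candidate string for each observation.
    let observations = request.results ?? []
    return observations.compactMap { $0.topCandidates(1).first?.string }
}
```

Running this and the ImageAnalyzer transcript over the same images is one way to reproduce the per-case differences described here.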

Do you know what ImageAnalyzer is calling in the background? Is it essentially running VNRecognizeTextRequest, or is it a separate model/pipeline? This naturally raises the question of which one is better for OCR. My initial tests show pretty similar performance in aggregate between ImageAnalyzer and VNRequestTextRecognitionLevel.accurate, but the results for individual test cases can vary widely between the two.

For documentation, and in case this is outside the scope of your expertise, I've also asked the same question on the Apple Developer Forums.

freedmand · 2022-12-14T17:00:12Z

I'm also seeing a slight disparity between the two. The results are very close, so I do wonder whether it's just particular settings of VNRecognizeTextRequest or a whole new pipeline. In any case, anecdotally, ImageAnalyzer seems better at picking up small bits of text that VNRecognizeTextRequest sometimes misses. To make matters more confusing, the Live Text interface allows selecting individual words, and these appear to differ from both ImageAnalyzer's full text output and VNRecognizeTextRequest's.

I've started a thread of my own as well, to see whether it's possible to get bounding boxes from ImageAnalyzer. The additional options from VNRecognizeTextRequest are nice and could be useful to expose as options in textra in the future.

Currently I'm implementing a feature to get positional text using VNRecognizeTextRequest, with the caveat that the returned positional text may differ from ImageAnalyzer's output (this may change if we hear back on the threads).
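As a rough sketch of what positional text from Vision can look like (not necessarily how textra implements it): each observation carries a normalized bounding box, which `VNImageRectForNormalizedRect` converts to pixel coordinates. The `PositionedText` type and the function name are made up for illustration.

```swift
import Vision
import CoreGraphics

/// A recognized string plus its location in the image.
/// (Hypothetical type for this sketch.)
struct PositionedText {
    let text: String
    /// Pixel coordinates with the origin at the image's lower-left corner,
    /// which is the convention Vision uses.
    let rect: CGRect
}

func recognizePositionedText(in image: CGImage) throws -> [PositionedText] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    return (request.results ?? []).compactMap { observation -> PositionedText? in
        guard let candidate = observation.topCandidates(1).first else { return nil }
        // boundingBox is normalized to 0...1; scale it to the image size.
        let rect = VNImageRectForNormalizedRect(observation.boundingBox,
                                                image.width,
                                                image.height)
        return PositionedText(text: candidate.string, rect: rect)
    }
}
```

For boxes finer than a whole observation, `VNRecognizedText` also offers `boundingBox(for:)`, which takes a range of the candidate string.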

aehlke · 2024-01-31T23:37:46Z

@freedmand did you discover a way to get bounding boxes from ImageAnalyzer? The only possibility I can think of is to use VNRecognizeTextRequest to first get bounding boxes of text, then crop those regions out of the image and pass each crop to ImageAnalyzer to get enhanced results within a known box of text (including attributed strings), but I'm not sure that would really work.
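An untested sketch of that two-pass idea, assuming iOS 16 / macOS 13 or later: Vision supplies the boxes, and each cropped region goes through ImageAnalyzer. The function name is hypothetical, and whether tight crops help or hurt recognition quality is an open question.

```swift
import Vision
import VisionKit
import CoreGraphics
import ImageIO

/// Pass 1: locate text with Vision. Pass 2: re-read each region with ImageAnalyzer.
/// (Hypothetical helper; returned rects use a top-left origin, in pixels.)
@available(iOS 16.0, macOS 13.0, *)
func transcriptsPerRegion(of image: CGImage) async throws -> [(rect: CGRect, transcript: String)] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    let analyzer = ImageAnalyzer()
    let configuration = ImageAnalyzer.Configuration([.text])
    var results: [(rect: CGRect, transcript: String)] = []

    for observation in request.results ?? [] {
        var rect = VNImageRectForNormalizedRect(observation.boundingBox,
                                                image.width,
                                                image.height)
        // Vision's origin is lower-left; CGImage cropping uses upper-left, so flip y.
        rect.origin.y = CGFloat(image.height) - rect.maxY

        guard let crop = image.cropping(to: rect.integral) else { continue }
        let analysis = try await analyzer.analyze(crop,
                                                  orientation: .up,
                                                  configuration: configuration)
        results.append((rect: rect, transcript: analysis.transcript))
    }
    return results
}
```

Padding each rect by a few pixels before cropping may be worth trying, since tight crops can clip ascenders and descenders.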
