Hi
We receive a large number of documents that need to be scanned and stored in SharePoint, and they need to be searchable. My idea was to use the Computer Vision API, but I can't see a way to do this.
This is how I visualise the flow working:
Email received in shared mailbox
PDF attachment copied to SharePoint
Computer Vision API scans document
Document tags
saved in SharePoint
Document is searchable
Is this possible?
Thank you in advance.
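For the "Computer Vision API scans document" step, here is a minimal Python sketch of turning an OCR result into plain text that could then be stored with the document. It assumes the result JSON has the shape returned by the Computer Vision Read operation (v3.2): a `status` field and `analyzeResult.readResults[].lines[].text`. The function name `extract_text` and the sample payload are made up for illustration; calling the API and writing back to SharePoint are not shown.

```python
from __future__ import annotations

from typing import Any


def extract_text(read_result: dict[str, Any]) -> str:
    """Flatten a Read-style OCR result into plain text, one output line per recognised line.

    Assumes the v3.2 Read result shape; raises if the operation has not succeeded.
    """
    status = read_result.get("status")
    if status != "succeeded":
        raise ValueError(f"OCR result not ready: status={status!r}")
    pages = read_result.get("analyzeResult", {}).get("readResults", [])
    # Walk pages in order and collect the text of every recognised line.
    lines = [line["text"] for page in pages for line in page.get("lines", [])]
    return "\n".join(lines)


# Hypothetical sample payload, trimmed to the fields the sketch reads.
sample_result = {
    "status": "succeeded",
    "analyzeResult": {
        "readResults": [
            {"page": 1, "lines": [{"text": "Invoice 1042"}, {"text": "Total: 15.00"}]},
            {"page": 2, "lines": [{"text": "Page two"}]},
        ]
    },
}

if __name__ == "__main__":
    print(extract_text(sample_result))
```

The extracted text only makes the document findable if it ends up somewhere SharePoint indexes, for example a text column on the item or a text layer in the PDF itself.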
The easiest solution is to move your PDF archive to Google Workspace, which makes uploaded PDFs searchable by default.
[edit]I'm posting this answer here in hopes that Microsoft gets a clue...[/edit]
Hi,
You need to perform OCR on the PDF documents. This post details how to automatically OCR PDF documents added to a SharePoint document library: https://blog.encodian.com/2019/10/automatically-ocr-pdf-documents-added-to-a-sharepoint-library/
You may also wish to use the 'Get PDF Document Information' action to check whether the PDF document already has a text layer. If it doesn't, perform OCR; if it does, do nothing.
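The check above can be sketched in Python as a small decision function. This is only an illustration of the logic, not the Encodian action: it assumes you already have the text extracted from each page (by whatever tool you use), and the name `needs_ocr` and the `min_chars_per_page` threshold are hypothetical.

```python
from typing import Iterable


def needs_ocr(page_texts: Iterable[str], min_chars_per_page: int = 20) -> bool:
    """Return True if any page looks like it has no usable text layer.

    A page counts as "no text layer" when its extracted text has fewer than
    min_chars_per_page non-whitespace characters. A document with no pages
    returns False because there is nothing to OCR.
    """
    for text in page_texts:
        visible = "".join(text.split())  # drop all whitespace before counting
        if len(visible) < min_chars_per_page:
            return True
    return False


if __name__ == "__main__":
    scanned = ["", "  \n"]
    digital = ["This page already has a proper text layer."]
    print(needs_ocr(scanned), needs_ocr(digital))
```

Checking per page rather than per document catches mixed PDFs where only some pages are scanned images.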
Reference:
OCR a PDF Document: https://support.encodian.com/hc/en-gb/articles/360012686653-OCR-a-PDF-Document
Get PDF Document Information: https://support.encodian.com/hc/en-gb/articles/360002949358-Get-PDF-Document-Information
HTH
Jay
The problem is that some of the PDFs are scanned images of documents.
If I understood correctly, you are able to save PDFs in SharePoint using Flow.
Is the problem that the documents are not searchable when you view them in SharePoint?
Have you tried uploading a PDF manually and checking whether that is searchable? Please share a screenshot of it.