Extract Excel data from a video/cast screen

My complete use case: I am using Microsoft Windows Connect to cast/project one device's screen to the device that has UiPath. I am projecting an Excel file from the host device, and I have a way to scroll down the Excel file on the host screen at regular intervals. Now I need to extract the data being projected onto the device that has UiPath. I tried screen scraping, but since it behaves like a video with frequently changing frames, I am not able to get the data.
A simpler version of my use case would be: extract the data displayed in any video file (mp4) to Excel.

OCR screen scraping is the only way. Which OCR engine did you try?

Also, a workaround could be to take a screenshot of the screen and then apply OCR to that saved image using the Omni or Google OCR engine.

Or an even crazier approach: take the screenshot, convert it into a PDF, and use Document Understanding to extract the table.

Interesting use case. I would give the second approach a try.

Hope this helps!

Thanks for the reply. The “crazier” way you mentioned seems to be a really good one, and it gave me an idea:
1) I can use “ffmpeg”, a tool that can extract all the frames from the video and store them as images. I can even do that using VLC player.
2) Once I have the images, running OCR on each image to extract the data would be easy.
Thanks :slight_smile:
But it would be difficult for the real-time streaming/projecting situation. Yes, I can do screen recording or take screenshots at a particular interval and then do the OCR.
But I would like to know if there is any way to apply an OCR engine to video files directly, not to images.
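
For the frame-extraction step, here is a minimal sketch in Python, assuming ffmpeg is installed and on the PATH. The function names, the file names, the output pattern, and the 5-second interval are placeholders of mine, not something tested against the real cast screen. The `fps` filter keeps one frame per interval, which avoids running OCR on every near-identical frame.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, out_dir, seconds_between_frames=5):
    """Build an ffmpeg call that saves one PNG every N seconds of video."""
    pattern = str(Path(out_dir) / "frame_%04d.png")
    return [
        "ffmpeg",
        "-i", str(video_path),
        # fps=1/N emits one frame per N seconds of input
        "-vf", f"fps=1/{seconds_between_frames}",
        pattern,
    ]


def extract_frames(video_path, out_dir, seconds_between_frames=5):
    """Run ffmpeg; assumes the ffmpeg binary is available on the PATH."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    cmd = build_ffmpeg_command(video_path, out_dir, seconds_between_frames)
    subprocess.run(cmd, check=True)
    return sorted(Path(out_dir).glob("frame_*.png"))
```

Something like `extract_frames("capture.mp4", "frames")` would then give a list of PNG files that can each be passed to the OCR engine. If the scroll interval on the host is known, setting the frame interval to match it should keep the number of images small.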

Glad you got some insights

Regarding OCR on videos: I am not sure. The OCR target needs to be still in order to extract the data, so I doubt your requirement is achievable.

Let’s see if someone has any other thoughts on this.


I wanted to share my suggestions on the use case you are discussing.

I am not sure we can do this with UiPath, but I did some research and found that there are a lot of free web applications like feed.io, though I am not sure how secure they are. On these websites we input a video and it is automatically transcribed to text.

I have also come across the combination of a vision OCR API and OpenCV to get text from real-time streaming. Please refer to the link below as well.

My thought is: why capture the video frame by frame and apply OCR to each frame, when we could convert the video to text directly? It would save time and help with real-time screening.

Just sharing my thoughts on the idea you shared. Thanks.

Hi Kiran,
Thanks for the link.
But unfortunately I think we can’t extract data/text from the video itself. If you check the code in the link you provided, its overall structure is something like this:
1) Read the video file/livestream using OpenCV
2) Extract each frame from this video (through cv2.VideoCapture(0).read())
3) Store this frame as a png/jpg image
4) Pass this image to the Google Cloud Vision API
5) The API returns the text present in the image
So here we are doing image OCR, not video OCR.
Internally, a video is a sequence of changing frames, so nearly all text extraction/object detection algorithms work on images (frames) extracted from the video in order to detect the text/object.
I will search for and check feed.io, thanks :slight_smile:
Overall this seems like a difficult task that is out of scope for RPA. It seems to depend more on the OCR/text detection algorithms being used, with Python, TensorFlow and other libraries.
I will keep searching and let you all know once I find an approach to do video OCR.
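
The five steps listed above can be sketched roughly as follows. This is not the code from the linked page, only my own outline of the same structure. It assumes the `opencv-python` package is installed; `ocr_image` is a hypothetical callable standing in for whatever OCR engine is used (Google Cloud Vision, Tesseract, and so on), and `every_n` is an illustrative parameter. It makes the same point as above: the OCR call only ever sees single frames.

```python
def read_frames(source=0):
    """Yield frames from a video file path or camera index via OpenCV."""
    import cv2  # assumption: opencv-python is installed

    cap = cv2.VideoCapture(source)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:  # end of file or stream dropped
                break
            yield frame
    finally:
        cap.release()


def ocr_every_nth(frames, ocr_image, every_n=30):
    """Run OCR on every n-th frame and skip consecutive duplicate results."""
    texts = []
    for index, frame in enumerate(frames):
        if index % every_n != 0:
            continue
        text = ocr_image(frame)
        # keep the text only if it differs from the previous kept result
        if text and (not texts or text != texts[-1]):
            texts.append(text)
    return texts
```

A call such as `ocr_every_nth(read_frames("capture.mp4"), ocr_image=my_ocr)` would tie the two together, where `my_ocr` is whatever function wraps the chosen OCR engine. Skipping frames and dropping repeated results is my own addition, since a slowly scrolling Excel sheet produces many identical frames in a row.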

Nice to see your message, and good thinking on your side. As you correctly mentioned, this is not an easy task for RPA and depends fully on advanced OCR techniques.

Sorry, I made a spelling mistake. Please refer to the veed.io site; it is a website where we can upload a video and get text as output.

Please share with us if you find a good solution for video OCR. Thanks.
