I am trying to extract text that only appears inside a canvas

I was trying to automate this webpage: Challenging DOM.

However, I cannot access the inner text. This text is generated by JavaScript, which transforms the text into pixels to form an image.

How can I retrieve this text?

Hi @rafael.hist ,

You can use a Computer Vision activity to extract the text from the image. Please refer to this: https://docs.uipath.com/activities/other/latest/ui-automation/cv-get-text-with-descriptor

@rafael.hist

If you are talking about the answer field specifically, it's a canvas, so you can use Get Text with the scraping method set to OCR. That way it targets the canvas element and extracts the full text.

cheers

Hello my friends @Karan_Rautela @Anil_G,

You both provided me with great answers. I really appreciate it. For now, this is how I will proceed. Thank you very much.

However, is there any way to do this without using computer vision? In the future, I would like to make this process fully headless. Is that possible?

Hey @rafael.hist ,

It is possible to extract the data without computer vision by injecting JavaScript code that extracts the required data, provided the data is rendered by JavaScript. Then you can just run the browser in headless mode.

Please refer to this for how to inject JS code into a UiPath workflow: https://docs.uipath.com/activities/other/latest/ui-automation/inject-js-script

@Karan_Rautela, thank you again. I will study this documentation to see what I can do. For now, I have marked your response as the solution.

@rafael.hist

The method I gave is not Computer Vision. It is the OCR scraping method with proper selectors, with no CV identification involved.

Inject JS might also not work, because Inject JS works when the element holds its data in tags, not when the data is drawn on a canvas.

cheers

@Anil_G ,

Thank you for the explanation. I had a conceptual misunderstanding.

So, can I run this in headless mode using OCR with Get Text?

I thought it was only possible in foreground mode.

@rafael.hist

Probably not with either CV or OCR, as both depend on an image.

Since your canvas element is an image, we need a GUI for it to render, so you might not be able to do this in headless mode.

cheers

Hi @rafael.hist

You can’t read it as text because it’s drawn on a canvas (image).

Use OCR (screenshot → OCR) or
Read the API/network response if available

DOM methods won’t work.
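For the second option, here is a minimal sketch in plain JavaScript. It assumes the value appears literally in the HTML response, for example inside an inline script that draws it with a call like strokeText('Answer: 12345', ...). The names extractAnswerFromHtml and fetchAnswer, the url parameter, and the sample HTML are all hypothetical and only for illustration.

```javascript
// Hypothetical helper: return the digits that follow "Answer:" in raw HTML,
// or an empty string when the pattern is not found.
function extractAnswerFromHtml(html) {
  const match = /Answer:\s*(\d+)/i.exec(html);
  return match ? match[1] : "";
}

// Hypothetical wrapper: download the page and run the helper on the body.
// Relies on the global fetch available in browsers and in Node 18+.
async function fetchAnswer(url) {
  const response = await fetch(url);
  return extractAnswerFromHtml(await response.text());
}

// Made-up sample of what such a response might contain.
const sampleHtml = `
<canvas id="canvas" width="599" height="200"></canvas>
<script>
  var canvas = document.getElementById('canvas').getContext('2d');
  canvas.strokeText('Answer: 12345', 90, 112);
</script>`;

console.log(extractAnswerFromHtml(sampleHtml)); // prints "12345"
```

No browser window is needed for this, since nothing has to be rendered. It only works if the text really is present in the response body.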

Hey @rafael.hist,

You can use the Inject JS Script activity to extract the data if it is present in the DOM, like this:

Paste this in the Script Code input:
"function (element, input) {" & vbCrLf &
"  try {" & vbCrLf &
"    var scripts = document.getElementsByTagName('script');" & vbCrLf &
"    var all = '';" & vbCrLf &
"    for (var i = 0; i < scripts.length; i++) {" & vbCrLf &
"      all += (scripts[i].text || scripts[i].textContent || '') + '\n';" & vbCrLf &
"    }" & vbCrLf &
"    // Match inside the strokeText call too (handles quotes)" & vbCrLf &
"    var m = /Answer:\s*(\d+)/i.exec(all);" & vbCrLf &
"    return m ? m[1] : '';" & vbCrLf &
"  } catch (e) {" & vbCrLf &
"    return '';" & vbCrLf &
"  }" & vbCrLf &
"}"

You can run this in headless mode.
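For readability, the same logic can be written as plain JavaScript without the string concatenation. This is only a sketch: the helper names collectScriptText, findAnswer and injected are hypothetical, and the (element, input) signature is copied from the script above.

```javascript
// Joins the source text of every script element passed in.
function collectScriptText(scripts) {
  let all = "";
  for (let i = 0; i < scripts.length; i++) {
    all += (scripts[i].text || scripts[i].textContent || "") + "\n";
  }
  return all;
}

// Returns the digits after "Answer:", or "" when nothing matches.
function findAnswer(source) {
  const m = /Answer:\s*(\d+)/i.exec(source);
  return m ? m[1] : "";
}

// Same shape as the injected function: (element, input) -> string.
// Any error, such as a missing document object, yields an empty string.
function injected(element, input) {
  try {
    const scripts = document.getElementsByTagName("script");
    return findAnswer(collectScriptText(scripts));
  } catch (e) {
    return "";
  }
}
```

The script never touches the canvas pixels. It reads the page's script source, so it returns an empty string whenever the number is not written there as text.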