Selecting a specific element from an ARRAY of JSON within a JSON

Dear community,
I’m using Microsoft’s Cognitive Services for my OCR and receive a JSON from the service. I have used the Deserialize JSON activity, but am now left wondering how to navigate the JSON. The JSON returned by Microsoft is a single JSON file that contains several nested arrays of JSON (one for each page, one for each line, one for each word; see the example below).

I have watched KB Tutorial’s great video on JSON parsing, but my real application will be an OCRed transaction log and I will need specific values within it to process the transactions. Does anybody know, for example, how to reference only the word “OCR” in the following example?

{
“status”: “Succeeded”,
“recognitionResults”: [{
“page”: 1,
“clockwiseOrientation”: 0.1,
“width”: 640,
“height”: 480,
“unit”: “pixel”,
“lines”: [{
“boundingBox”: [35, 89, 582, 90, 581, 120, 34, 119],
“text”: “This is a lot of 12 point text to test the”,
“words”: [{
“boundingBox”: [35, 90, 97, 90, 97, 119, 35, 117],
“text”: “This”
}, {
“boundingBox”: [100, 90, 129, 90, 130, 119, 101, 119],
“text”: “is”
}, {
“boundingBox”: [137, 90, 155, 90, 155, 120, 137, 119],
“text”: “a”
}, {
“boundingBox”: [160, 90, 206, 90, 206, 120, 161, 120],
“text”: “lot”
}, {
“boundingBox”: [211, 90, 242, 90, 242, 121, 211, 120],
“text”: “of”
}, {
“boundingBox”: [248, 90, 282, 90, 282, 121, 248, 121],
“text”: “12”
}, {
“boundingBox”: [293, 90, 369, 90, 369, 121, 293, 121],
“text”: “point”
}, {
“boundingBox”: [369, 90, 431, 90, 431, 120, 369, 121],
“text”: “text”
}, {
“boundingBox”: [431, 90, 464, 91, 464, 120, 431, 120],
“text”: “to”
}, {
“boundingBox”: [469, 91, 531, 91, 531, 119, 469, 120],
“text”: “test”
}, {
“boundingBox”: [531, 91, 580, 91, 580, 118, 531, 119],
“text”: “the”
}]
}, {
“boundingBox”: [33, 121, 619, 124, 619, 157, 32, 154],
“text”: “ocr code and see if it works on all types”,
“words”: [{
“boundingBox”: [35, 128, 82, 126, 83, 152, 36, 152],
“text”: “ocr”
}, {
“boundingBox”: [90, 126, 158, 125, 159, 152, 91, 152],
“text”: “code”
}, {
“boundingBox”: [169, 125, 222, 124, 223, 152, 170, 152],
“text”: “and”
}, {
“boundingBox”: [233, 124, 283, 123, 284, 152, 234, 152],
“text”: “see”
}, {
“boundingBox”: [292, 123, 316, 123, 316, 152, 292, 152],
“text”: “if”
}, {
“boundingBox”: [317, 123, 342, 123, 342, 153, 318, 152],
“text”: “it”
}, {
“boundingBox”: [346, 123, 432, 124, 433, 154, 347, 153],
“text”: “works”
}, {
“boundingBox”: [443, 124, 474, 124, 474, 154, 444, 154],
“text”: “on”
}, {
“boundingBox”: [496, 125, 535, 125, 536, 155, 497, 155],
“text”: “all”
}, {
“boundingBox”: [537, 125, 619, 127, 620, 157, 537, 155],
“text”: “types”
}]
}, {
“boundingBox”: [35, 156, 225, 157, 224, 186, 34, 185],
“text”: “of file format.”,
“words”: [{
“boundingBox”: [35, 157, 68, 157, 68, 186, 35, 186],
“text”: “of”
}, {
“boundingBox”: [68, 157, 112, 157, 112, 187, 68, 186],
“text”: “file”
}, {
“boundingBox”: [118, 157, 229, 158, 230, 187, 118, 187],
“text”: “format.”
}]
}, {
“boundingBox”: [34, 191, 585, 189, 586, 226, 35, 227],
“text”: “The quick brown dog jumped over the”,
“words”: [{
“boundingBox”: [35, 192, 91, 192, 91, 227, 35, 226],
“text”: “The”
}, {
“boundingBox”: [98, 191, 177, 191, 176, 227, 98, 227],
“text”: “quick”
}, {
“boundingBox”: [181, 191, 275, 191, 275, 227, 181, 227],
“text”: “brown”
}, {
“boundingBox”: [282, 191, 340, 192, 340, 227, 282, 227],
“text”: “dog”
}, {
“boundingBox”: [343, 192, 457, 192, 457, 225, 343, 227],
“text”: “jumped”
}, {
“boundingBox”: [466, 192, 536, 192, 536, 223, 466, 225],
“text”: “over”
}, {
“boundingBox”: [534, 192, 585, 193, 585, 222, 533, 223],
“text”: “the”
}]
}, {
“boundingBox”: [33, 225, 585, 226, 584, 260, 32, 259],
“text”: “lazy fox. The quick brown dog jumped”,
“words”: [{
“boundingBox”: [29, 227, 94, 226, 93, 259, 28, 259],
“text”: “lazy”
}, {
“boundingBox”: [98, 226, 163, 226, 162, 260, 97, 259],
“text”: “fox.”
}, {
“boundingBox”: [165, 226, 221, 226, 221, 260, 164, 260],
“text”: “The”
}, {
“boundingBox”: [228, 226, 307, 226, 307, 260, 227, 260],
“text”: “quick”
}, {
“boundingBox”: [313, 226, 403, 226, 403, 260, 313, 260],
“text”: “brown”
}, {
“boundingBox”: [414, 226, 470, 226, 470, 260, 413, 260],
“text”: “dog”
}, {
“boundingBox”: [472, 226, 587, 227, 587, 259, 472, 260],
“text”: “jumped”
}]
}, {
“boundingBox”: [34, 259, 599, 260, 598, 294, 33, 293],
“text”: “over the lazy fox. The quick brown dog”,
“words”: [{
“boundingBox”: [35, 263, 100, 262, 100, 290, 36, 288],
“text”: “over”
}, {
“boundingBox”: [105, 262, 151, 261, 151, 292, 105, 291],
“text”: “the”
}, {
“boundingBox”: [159, 261, 218, 260, 219, 293, 159, 292],
“text”: “lazy”
}, {
“boundingBox”: [226, 260, 287, 260, 288, 294, 227, 293],
“text”: “fox.”
}, {
“boundingBox”: [294, 260, 345, 260, 346, 294, 295, 294],
“text”: “The”
}, {
“boundingBox”: [358, 260, 433, 260, 433, 294, 359, 294],
“text”: “quick”
}, {
“boundingBox”: [442, 260, 530, 261, 530, 293, 443, 294],
“text”: “brown”
}, {
“boundingBox”: [543, 261, 596, 262, 596, 291, 544, 292],
“text”: “dog”
}]
}, {
“boundingBox”: [41, 292, 563, 293, 562, 328, 40, 327],
“text”: “jumped over the lazy fox. The quick”,
“words”: [{
“boundingBox”: [37, 294, 153, 293, 153, 327, 37, 328],
“text”: “jumped”
}, {
“boundingBox”: [159, 293, 228, 293, 228, 327, 159, 327],
“text”: “over”
}, {
“boundingBox”: [228, 293, 280, 293, 279, 328, 228, 327],
“text”: “the”
}, {
“boundingBox”: [282, 293, 349, 293, 348, 328, 282, 328],
“text”: “lazy”
}, {
“boundingBox”: [351, 293, 416, 293, 415, 328, 351, 328],
“text”: “fox.”
}, {
“boundingBox”: [418, 293, 476, 294, 475, 328, 417, 328],
“text”: “The”
}, {
“boundingBox”: [481, 294, 563, 294, 562, 329, 480, 328],
“text”: “quick”
}]
}, {
“boundingBox”: [36, 328, 562, 327, 563, 360, 37, 361],
“text”: “brown dog jumped over the lazy fox.”,
“words”: [{
“boundingBox”: [33, 330, 121, 329, 121, 361, 33, 360],
“text”: “brown”
}, {
“boundingBox”: [131, 329, 188, 329, 188, 361, 131, 361],
“text”: “dog”
}, {
“boundingBox”: [192, 329, 306, 328, 306, 361, 192, 361],
“text”: “jumped”
}, {
“boundingBox”: [314, 328, 382, 328, 382, 361, 314, 361],
“text”: “over”
}, {
“boundingBox”: [382, 328, 433, 329, 432, 361, 382, 361],
“text”: “the”
}, {
“boundingBox”: [437, 329, 501, 329, 500, 361, 436, 361],
“text”: “lazy”
}, {
“boundingBox”: [505, 329, 569, 329, 568, 361, 504, 361],
“text”: “fox.”
}]
}]
}]
}

Thank you very much!

Have a nice day,
Philippe

You can use Excel for a much more interactive experience when parsing JSON files. I explained how to do this to another user yesterday. Couple it with a refresh macro and you can have a dynamic web data connection using UiPath.

Thanks ronanpeter!

Ideally, I would simply want to assign the values of several elements in the JSON to variables that I can then upload to a SQL server and not use Excel. It would be, for example, for finding the date in a standardized form and the transaction item. They would always be in the same place in the JSON, but might be 3 arrays of JSON deep into the file. Any idea how I can do this? Thanks!

@Phil

If you provide a sample JSON file I can have a go for you to parse.

You can use this site to validate it. The data you provided earlier is not recognised as a valid JSON file.
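
One likely reason the pasted snippet fails validation is that the forum turned the straight double quotes into typographic quotes (“ ”), which JSON does not accept. As a rough illustration, here is a minimal Python sketch of checking a string before working with it; `is_valid_json` is a hypothetical helper name, and the two sample strings are made up for the comparison.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Same content twice: once with straight quotes, once with typographic quotes.
straight = '{"status": "Succeeded"}'
curly = '{\u201cstatus\u201d: \u201cSucceeded\u201d}'

print(is_valid_json(straight))
print(is_valid_json(curly))
```

If the raw API response passes a check like this, the problem is in how the text was pasted, not in the service output.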

Hi Ronanpeter,
See attached the output from the Microsoft Cognitive Services API. It should be valid according to the JSON validator. Thanks!

Have a nice day,
Phil

Testfile.json (7.2 KB)

Dear community,
I seem to have cracked it: JSON arrays can be referenced with their index.

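For example, to get the word “ocr” in the example above, I put this expression in a message box: outputJSON("recognitionResults")(0)("lines")(1)("words")(0)("text").ToString. The indexes are zero-based, so (0) is the first page, (1) the second line and (0) the first word of that line.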
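
For anyone working outside UiPath, the same index path can be sketched in Python with the standard `json` module. This is only an illustration, not the UiPath workflow itself: the sample is a trimmed copy of the response above (bounding boxes dropped, only the first two lines and a couple of words kept), and the variable names are my own.

```python
import json

# Trimmed copy of the OCR response above. Only the parts needed to show the
# index path are kept, so the positions match the full response.
raw = """
{
  "status": "Succeeded",
  "recognitionResults": [{
    "page": 1,
    "lines": [{
      "text": "This is a lot of 12 point text to test the",
      "words": [{"text": "This"}, {"text": "is"}]
    }, {
      "text": "ocr code and see if it works on all types",
      "words": [{"text": "ocr"}, {"text": "code"}]
    }]
  }]
}
"""

output_json = json.loads(raw)

# Objects are indexed by key, arrays by zero-based position:
# first page -> second line -> first word -> its text.
word = output_json["recognitionResults"][0]["lines"][1]["words"][0]["text"]

# The full text of the same line sits one level higher.
line_text = output_json["recognitionResults"][0]["lines"][1]["text"]

print(word)
print(line_text)
```

Because the values always sit at the same position, each one can be assigned to its own variable this way and then passed on, for example to a SQL insert.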

Thanks for the help!

Have a nice day,
Phil
