Need to create a datatable from text

Output.txt (8.9 KB)
Above is the input.
How can I build this using regex? I tried, but it is not working.
I need the output as a datatable.

For YEAR ENDING JUNE 2024:

Period Duration Start Date End Date
Period 1 JUN - JUL 26-Jun-2023 30-Jul-2023
Period 2 AUG 31-Jul-2023 27-Aug-2023
Period 3 AUG - SEPT 28-Aug-2023 24-Sept-2023
Period 4 SEPT - OCT 25-Sept-2023 29-Oct-2023
Period 5 OCT - NOV 30-Oct-2023 26-Nov-2023
Period 6 NOV - DEC - JAN 27-Nov-2023 31-Dec-2023
Period 7 JAN 1-Jan-2024 28-Jan-2024
Period 8 JAN - FEB 29-Jan-2024 25-Feb-2024
Period 9 FEB - MAR 26-Feb-2024 24-Mar-2024
Period 10 MAR - APR 25-Mar-2024 28-Apr-2024
Period 11 MAY 29-Apr-2024 26-May-2024
Period 12 MAY - JUN 27-May-2024 30-Jun-2024

For YEAR ENDING JUNE 2025:

Period Duration Start Date End Date
Period 1 JUL - AUG 1-Jul-2024 4-Aug-2024
Period 2 AUG - SEPT 5-Aug-2024 1-Sept-2024
Period 3 SEPT 2-Sept-2024 29-Sept-2024
Period 4 SEPT - OCT - NOV 30-Sept-2024 3-Nov-2024
Period 5 NOV - DEC 4-Nov-2024 1-Dec-2024
Period 6 DEC - JAN 2-Dec-2024 5-Jan-2025
Period 7 JAN - FEB 6-Jan-2025 2-Feb-2025
Period 8 FEB - MAR 3-Feb-2025 2-Mar-2025
Period 9 MAR 3-Mar-2025 30-Mar-2025
Period 10 MAR - APR - MAY 31-Mar-2025 4-May-2025
Period 11 MAY - JUN 5-May-2025 1-Jun-2025
Period 12 JUN 2-Jun-2025 29-Jun-2025

Hi @Meghana_Bonu

You can save this file as a CSV file. To see how to convert it to CSV format, refer to the video below:
https://m.youtube.com/watch?v=JvNMtKuulhA

Then use the Read CSV activity in UiPath and save the result in a datatable.

Regards
Sonali

Hey @Meghana_Bonu, can you try the Generate Data Table From Text activity? It converts your text into a datatable.

cheers


This is the input, which is in a PDF.

Hi @Meghana_Bonu

In that case, you can use the “Read PDF With OCR” activity to read the contents of the PDF.

Then use regex on that output string to fetch the details you need.

Regards
Sonali

Can you please elaborate on the start and end date logic?
Update: if possible, attach the original PDF file.

I am able to read the content from the PDF, but when I tried regex, it was unable to identify the values.


In the first box, Period is 1, then Start Date is 26-Jun-2024 and End Date is 30-Jul-2024.
In the second box, Period is 2, then Start Date is 31-Jul-2024 and End Date is 27-Aug-2024.
In the third box, Period is 3, then Start Date is 28-Aug-2024 and End Date is 24-Sep-2024, and so on.

@Meghana_Bonu

Try the regex below (enable the Multiline option so that ^ and $ match at the start and end of each line):

^Period\s+(\d+)\s+([\w\s-]+)\s+(\d{1,2}-[A-Za-z]{3,4}-\d{4})\s+(\d{1,2}-[A-Za-z]{3,4}-\d{4})$
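
To show what the regex above captures, here is a minimal sketch in Python (not UiPath) that applies it line by line and collects the matches as rows. The function name `parse_periods`, the row field names, and the extra pattern for the "For YEAR ENDING" headings are assumptions made for illustration; the sample text is a few lines copied from the question. Each line is stripped before matching, so a stray carriage return from the PDF extraction should not stop `$` from matching.

```python
import re

# A few lines copied from the extracted text in the question.
TEXT = """For YEAR ENDING JUNE 2024:

Period Duration Start Date End Date
Period 1 JUN - JUL 26-Jun-2023 30-Jul-2023
Period 2 AUG 31-Jul-2023 27-Aug-2023
Period 6 NOV - DEC - JAN 27-Nov-2023 31-Dec-2023

For YEAR ENDING JUNE 2025:

Period Duration Start Date End Date
Period 3 SEPT 2-Sept-2024 29-Sept-2024
"""

# Assumed helper pattern (not from the thread): captures e.g. "JUNE 2024".
YEAR_RE = re.compile(r"^For YEAR ENDING (\w+ \d{4}):$")

# The regex suggested in the thread, unchanged.
ROW_RE = re.compile(
    r"^Period\s+(\d+)\s+([\w\s-]+)\s+"
    r"(\d{1,2}-[A-Za-z]{3,4}-\d{4})\s+(\d{1,2}-[A-Za-z]{3,4}-\d{4})$"
)


def parse_periods(text):
    """Return one dict per 'Period N ...' line, tagged with its year heading."""
    rows = []
    year = None
    for raw_line in text.splitlines():
        line = raw_line.strip()  # drops trailing spaces and any '\r'
        year_match = YEAR_RE.match(line)
        if year_match:
            year = year_match.group(1)
            continue
        row_match = ROW_RE.match(line)
        if row_match:  # the "Period Duration ..." header line does not match
            period, duration, start, end = row_match.groups()
            rows.append({
                "year": year,
                "period": int(period),
                "duration": duration,
                "start": start,  # kept as text; "Sept" is not a standard abbreviation
                "end": end,
            })
    return rows


for row in parse_periods(TEXT):
    print(row)
```

The dates are left as strings because the four-letter "Sept" would need normalising before any date parsing. In UiPath, the same four capture groups would map to the columns of the datatable.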