Extracting a table from a txt file

I have a text file and I want to extract a table from it, like this one:
Transaction List
No. Booking Date Debit Credit Opening / Closing Balance Transaction Customer Bank Detail Information
(Value Date) Type Reference Reference
Here is my text file:

MUFGCA.txt (2.5 KB)

Hi @sayali_rokade,

For this, you will need to extract the information from the txt file using either regex or text matching.

That way you should be able to get everything you need.

Thanks,
Shubham

Yes, right. I'm searching for a regex pattern to extract the info. If you have any ideas, please share them with me.

What are the fields you are looking to extract?

You can try the following link to test your regex:
https://regex101.com/

Hope this helps :slight_smile:

@Shubham_Varshney
What are the fields you are expecting to be extracted into the data table?

test1.pdf (16.6 KB)
I need the data from "No." through to "PSFC INTEREST".

Hi @sayali_rokade,

Why don’t you extract the table directly from the PDF file? That should get the data out more quickly for you :slight_smile:

Steps:

  1. Open the PDF in Chrome
  2. Extract the data table

Hope this helps :slight_smile:

Can you please share an example or code, if you have any? That will help me.

This won’t work for the shared PDF; I just tried this approach.

Since you are looking to extract only the PSFC Interest, is there anything that always appears at the start of it and could be used to identify it? That would be used to build the regex.

Hi @sayali_rokade

I opened the PDF and tried to get the text using the Get Full Text activity.

I got the result below:

Transaction List
No.
Booking Date
(Value Date)
Debit
Credit
Opening / Closing Balance
Transaction
Type
Customer
Reference
Bank
Reference
Detail Information
1
2021.04.09
670,673.29
Opening Balance
2
2021.04.09
(2021.04.09)
1,500.00
Miscellaneous
COMMISSION
TTS-122235-00
3
2021.04.09
(2021.04.09)
4,026.24
Miscellaneous
GST AND FCC
TTS-122235-00
4
2021.04.09
5,526.24
0.00
665,147.05
Closing Balance
5
2021.04.19
665,147.05
Opening Balance
6
2021.04.19
(2021.04.19)
23,014.00
Miscellaneous
INTTRFRFD 5834
FDR-005834-00
7
2021.04.19
0.00
23,014.00
688,161.05
Closing Balance
8
2021.04.23
688,161.05
Opening Balance
9
2021.04.23
(2021.04.23)
1,500.00
Miscellaneous
IMPORT PAYMENT
TTS-010541-00
10
2021.04.23
(2021.04.23)
5,177.86
Miscellaneous
GST FCC IMP PAY
TTS-010541-00
11
2021.04.23
6,677.86
0.00
681,483.19
Closing Balance
12
2021.04.26
681,483.19
Opening Balance
13
2021.04.26
(2021.04.26)
39,123.00
Miscellaneous
NETINTONFD 5835
FDR-005835-00
14
2021.04.26
0.00
39,123.00
720,606.19
Closing Balance
15
2021.04.30
720,606.19
Opening Balance
16
2021.04.30
(2021.04.30)
18,725.00
Miscellaneous
FBD-009133-00
3 PSFC INTEREST REFUND

You can use this string to create a data table.

Yup, this is a great way to capture everything :slight_smile:
Once @sayali_rokade has obtained the DT, you can do a lot with it as per your requirement!

Yes, but how do you then put this information into Excel? I am trying regex. If you find a solution to get the table data and store it in Excel, please tell me; that would really help me.

Yes, I'm trying regex, and then I have to save this table data in Excel. That is my task.

Follow this procedure:

  1. Find the line containing 1; each line before it is a column name.
  2. Look for the next number, and keep adding the lines in between to the columns in order.
  3. When the next number appears, add blank data to the columns not yet populated.
  4. Add the remaining rows to the datatable in the same fashion.

Hope this helps.

Limitations:
The data will get mixed up if any row has a blank value in a column, because the values that follow shift into the wrong columns.
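
The procedure above could be sketched roughly as follows. This is a minimal Python illustration only, not UiPath code: the function name `rows_from_cell_lines` and the shortened `SAMPLE` string are assumptions made for the example, and a new row is detected by matching the running row number (1, 2, 3, ...), which is one possible way to "look for the next number".

```python
def rows_from_cell_lines(text):
    """Group one-value-per-line text into rows, splitting on the running row number."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    start = lines.index("1")              # step 1: everything before "1" is a column name
    header, body = lines[:start], lines[start:]
    rows, expected = [], 1
    for ln in body:
        if ln == str(expected):           # steps 2-3: the next row number starts a new row
            rows.append([ln])
            expected += 1
        else:
            rows[-1].append(ln)           # otherwise the value belongs to the current row
    width = max(len(r) for r in rows)
    # pad short rows with blanks so every row has the same number of cells
    return header, [r + [""] * (width - len(r)) for r in rows]


# Shortened sample in the same shape as the Get Full Text output (hypothetical subset).
SAMPLE = """No.
Booking Date
(Value Date)
Debit
1
2021.04.09
670,673.29
Opening Balance
2
2021.04.09
(2021.04.09)
1,500.00
Miscellaneous
COMMISSION
TTS-122235-00
"""

header, rows = rows_from_cell_lines(SAMPLE)
for row in rows:
    print(row)
```

Note that row 1 has no value date, so its balance ends up in the third cell. This is the limitation described above: values are placed by position only, so a blank cell shifts everything after it.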

Hey, I have got this far.

I am still working on the rest. If you have any ideas, please tell me.

Hi @sayali_rokade

Alternative way

I read the PDF file using the Read PDF Text activity, with the PreserveFormatting property set to True.

After that I got the output below, in string format:

                                                                                                              Transaction                    List

Date / Date Range: 2021.04.01 - 2021.04.30
Debit / Credit : All
Account Details
Bank Name MUFG Bank Account Type CURRENT ACCOUNT
Branch Name Mumbai Branch Account No. 009482
SWIFT BIC Account Name BHARAT FORGE
Currency INR IBAN

                                                                                                                                                                                                                                                  Page:           1  /  2

Sort by: Booking Date *: Intraday transaction red: Canceled
Transaction List
No. Booking Date Debit Credit Opening / Closing Balance Transaction Customer Bank Detail Information
(Value Date) Type Reference Reference
1 2021.04.09 670,673.29 Opening Balance
2 2021.04.09 1,500.00 Miscellaneous COMMISSION TTS-122235-00
(2021.04.09)
3 2021.04.09 4,026.24 Miscellaneous GST AND FCC TTS-122235-00
(2021.04.09)
4 2021.04.09 5,526.24 0.00 665,147.05 Closing Balance
5 2021.04.19 665,147.05 Opening Balance
6 2021.04.19 23,014.00 Miscellaneous INTTRFRFD 5834 FDR-005834-00
(2021.04.19)
7 2021.04.19 0.00 23,014.00 688,161.05 Closing Balance
8 2021.04.23 688,161.05 Opening Balance
9 2021.04.23 1,500.00 Miscellaneous IMPORT PAYMENT TTS-010541-00
(2021.04.23)
10 2021.04.23 5,177.86 Miscellaneous GST FCC IMP PAY TTS-010541-00
(2021.04.23)
11 2021.04.23 6,677.86 0.00 681,483.19 Closing Balance
12 2021.04.26 681,483.19 Opening Balance
13 2021.04.26 39,123.00 Miscellaneous NETINTONFD 5835 FDR-005835-00
(2021.04.26)
14 2021.04.26 0.00 39,123.00 720,606.19 Closing Balance
15 2021.04.30 720,606.19 Opening Balance
16 2021.04.30 18,725.00 Miscellaneous FBD-009133-00 3 PSFC INTEREST REFUND
(2021.04.30)

Although MUFG Bank (the “Bank”) shall make every effort to ensure that entries are updated and accurate, the Bank shall not be liable in any way for any loss or damage arising from or occasioned by any error, inac
curacy, delay or omission of data. The Bank further reserves the right to amend any entry without prior notice.

GCMS Plus User ID (as of): SAMEER 2021.05.14 14:26:56 [IND] Transaction List

                                                                                                                                                                                                                                    Page:          2 /  2

Sort by: Booking Date *: Intraday transaction red: Canceled
Transaction List
No. Booking Date Debit Credit Opening / Closing Balance Transaction Customer Bank Detail Information
(Value Date) Type Reference Reference
17 2021.04.30 0.00 18,725.00 739,331.19 Closing Balance

Although MUFG Bank (the “Bank”) shall make every effort to ensure that entries are updated and accurate, the Bank shall not be liable in any way for any loss or damage arising from or occasioned by any error, inac
curacy, delay or omission of data. The Bank further reserves the right to amend any entry without prior notice.

GCMS Plus User ID (as of): SAMEER 2021.05.14 14:26:56 [IND]

Then I used the Generate Data Table activity, and I was able to get the data in the required format.
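
For anyone who prefers the regex route on this formatted text, here is a minimal Python sketch of the idea (it is not the workflow used above, and not UiPath code). The names `parse_transactions`, `to_csv` and `SAMPLE` are assumptions for illustration. Because the text alone does not show whether a single amount is a debit, a credit or a balance, the sketch keeps all leading amounts together in one "Amounts" field instead of guessing. The result is written as CSV text, which Excel can open once saved to a `.csv` file.

```python
import csv
import io
import re

ROW = re.compile(r"^(\d+) (\d{4}\.\d{2}\.\d{2}) (.+)$")    # "2 2021.04.09 1,500.00 ..."
VALUE_DATE = re.compile(r"^\((\d{4}\.\d{2}\.\d{2})\)$")     # "(2021.04.09)" on its own line
AMOUNT = re.compile(r"^[\d,]+\.\d{2}$")                     # "5,526.24"
FIELDS = ["No.", "Booking Date", "Value Date", "Amounts", "Details"]


def parse_transactions(text):
    """Return one dict per numbered transaction line; other lines are ignored."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        m = ROW.match(line)
        if m:
            no, booking, rest = m.groups()
            tokens = rest.split(" ")
            amounts = []
            while tokens and AMOUNT.match(tokens[0]):       # collect the leading amounts
                amounts.append(tokens.pop(0))
            rows.append({"No.": no, "Booking Date": booking, "Value Date": "",
                         "Amounts": " | ".join(amounts), "Details": " ".join(tokens)})
            continue
        m = VALUE_DATE.match(line)
        if m and rows:                                      # value date follows its row
            rows[-1]["Value Date"] = m.group(1)
    return rows


def to_csv(rows):
    """Serialize the parsed rows as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# A few lines copied from the output above.
SAMPLE = """No. Booking Date Debit Credit Opening / Closing Balance Transaction Customer Bank Detail Information
(Value Date) Type Reference Reference
1 2021.04.09 670,673.29 Opening Balance
2 2021.04.09 1,500.00 Miscellaneous COMMISSION TTS-122235-00
(2021.04.09)
4 2021.04.09 5,526.24 0.00 665,147.05 Closing Balance
"""

rows = parse_transactions(SAMPLE)
print(to_csv(rows))
```

Header, footer and page lines never start with a row number followed by a date, so they fall through both patterns and are skipped.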

Hey, thanks for your efforts, Prashant. If you have the code, please share it and I will try it on my side.

Hey @Shubham_Varshney and @PRASHANT_GABHANE, I am done with the regex. I tried it in my XAML, but it returns null.
MUFGCA_toExcel.xaml (13.4 KB)
Please check and guide me.

@sayali_rokade - Based on the screenshot, I see the Multiline (gm) option is selected here…

As always, please test your pattern once with .NET Regex Tester - Regex Storm, since UiPath uses the .NET regex engine. I personally always test in both before going to UiPath.
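
To show why the Multiline option matters, here is a small Python sketch. Python's `re.MULTILINE` behaves like the `m` flag on regex101 and like `RegexOptions.Multiline` in .NET: it lets `^` and `$` match at every line, not only at the start and end of the whole string. The two-line `text` is copied from the output earlier in the thread; the pattern is only an example, not the one from the attached XAML.

```python
import re

text = (
    "1 2021.04.09 670,673.29 Opening Balance\n"
    "2 2021.04.09 1,500.00 Miscellaneous COMMISSION TTS-122235-00"
)
pattern = r"^\d+ \d{4}\.\d{2}\.\d{2}"   # row number and booking date at the start of a line

# Without the flag, ^ only matches at the very start of the string.
without_flag = re.findall(pattern, text)

# With the flag, ^ matches at the start of every line.
with_flag = re.findall(pattern, text, flags=re.MULTILINE)

print(without_flag)   # ['1 2021.04.09']
print(with_flag)      # ['1 2021.04.09', '2 2021.04.09']
```

So a pattern that works on regex101 with `gm` selected may find little or nothing in UiPath unless the Multiline option is also enabled there. One more thing worth checking in a .NET tester: under Multiline, .NET's `$` matches just before `\n`, so text with Windows line endings (`\r\n`) can leave a stray `\r` at the end of each line.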