How to split one text file into multiple text files according to text content?

I cannot upload the text file for reference, so I have attached a screenshot instead.
How can I split this one text file into multiple text files, each named after a company?
For example, split to:
ABC PTE LTD.txt
DEF PTE LTD.txt

Hi,

Is there any rule to identify the company name, such as “the line right after Co. REG No. :”?

Regards,

Hi Yoichi,

The Co. registration number is actually my own company's registration number, not the vendor's. Is it possible to search for the vendor name (ABC PTE LTD) and create a new text file named after it, i.e. ABC PTE LTD.txt? Sometimes this vendor's details are too long, spanning not only Page 1 but also Page 2, 3, or more.

Hi,

If the Co. registration number always appears on the line just before the company name, that’s OK, even if the number is your own company’s.

I just wrote a workflow for this case, attached below. Can you check whether it works for you?

Sample20221117-4.zip (2.7 KB)
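
For readers who cannot open the attachment, the rule described above (the company name is the line right after the "Co. REG No." line) can be sketched outside UiPath. This is a minimal Python sketch, not the attached workflow; the function name `split_by_company`, the sample text, and the assumption that the whole following line is the company name are all hypothetical.

```python
def split_by_company(text, marker="Co. REG No."):
    """Return (company_name, section_text) pairs, one per marker line found.

    Assumption: the line right after each marker line holds only the
    company name, and a section runs until the next marker line.
    """
    lines = text.splitlines()
    # Indices of every line that contains the marker text.
    starts = [i for i, line in enumerate(lines) if marker in line]
    sections = []
    for n, start in enumerate(starts):
        if start + 1 >= len(lines):
            break  # marker on the last line: no company name follows
        end = starts[n + 1] if n + 1 < len(starts) else len(lines)
        name = lines[start + 1].strip()
        sections.append((name, "\n".join(lines[start:end])))
    return sections


# Hypothetical sample data, not the poster's real file.
sample = (
    "Co. REG No. : 123456\n"
    "ABC PTE LTD\n"
    "Invoice 1001   100.00\n"
    "Co. REG No. : 123456\n"
    "DEF PTE LTD\n"
    "Invoice 2001   250.00\n"
)

for name, section in split_by_company(sample):
    print(name + ".txt")
```

Each pair gives a file name (the company name plus ".txt") and the text that would go into that file.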

“Sometimes this vendor's details are too long, spanning not only Page 1 but also Page 2, 3, or more.”

Can you share a screenshot or similar for this case?

Regards,

Hi Yoichi,

You can refer to my screenshot: when the details are too long, they continue on Page 2 onwards.

I cannot see any output text files after I run the workflow.

Hi,

The above sample only shows the file name and content in a MessageBox.

How about merging the sections when the same company name appears two or more times, as in the following?
This sample actually writes the text files to the project folder.

Sample20221117-4v2.zip (3.1 KB)
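
As a rough illustration of the merge idea (same company name on several pages ends up in one file), here is a minimal Python sketch. It is not the attached workflow; `merge_by_name`, `write_files`, and the sample pages are hypothetical, and the demo writes into a temporary folder rather than a project folder.

```python
import tempfile
from pathlib import Path


def merge_by_name(sections):
    """Join the text of all sections that share a company name, in order."""
    merged = {}
    for name, body in sections:
        merged.setdefault(name, []).append(body)
    return {name: "\n".join(parts) for name, parts in merged.items()}


def write_files(merged, out_dir):
    """Write one '<company name>.txt' per entry and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, content in merged.items():
        path = out_dir / (name + ".txt")
        path.write_text(content, encoding="utf-8")
        paths.append(path)
    return paths


# Hypothetical pages: ABC PTE LTD appears twice (Page 1 and Page 2).
pages = [
    ("ABC PTE LTD", "Page 1\nInvoice 1001"),
    ("DEF PTE LTD", "Page 1\nInvoice 2001"),
    ("ABC PTE LTD", "Page 2\nInvoice 1002"),
]

demo_dir = tempfile.mkdtemp()  # temporary folder for the demo only
written = write_files(merge_by_name(pages), demo_dir)
```

With this input, only two files are written, and the ABC PTE LTD file contains both pages in their original order.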

Regards,

Hi Yoichi,

Yes. For example, ABC PTE LTD has 2 pages. They should be merged into a single text file named ABC PTE LTD.txt. When I open ABC PTE LTD.txt, I should see both Page 1 and Page 2.

Hi,

Have you tried the above sample? It outputs the merged text as a file, as in the following image.

Regards,

Hi, Yoichi.

Your data.txt file:

I tried your sample file and it is working.

But when I replace your data.txt with my actual text, no split files are produced.
My actual data.txt file:


Is it because the wording “page 1” and “page 2” affects the run? Your sample file does not have “page 1” or “page 2”.

Hi,

Can you share your txt file? Dummy data is fine.

Regards,

abc.txt (2.8 KB)
I have attached my text file so you can test whether it works.

Hi,

Can you try the following sample? I’ve modified the regex pattern for your actual data (deleted the extra white space from the previous pattern).

Sample20221117-4v3.zip (4.1 KB)

Regards,

Hi, Yoichi,

I noticed the “Page 1” wording was moved to the bottom for ABC PTE LTD:

But for DEF PTE LTD, the wording “Page 1” is missing.

I have attached my output files for your reference.
Sample20221117-4v3.zip (1.4 KB)

Hi,

Alright. I modified the pattern to find Page and Co.REG No. Can you try the following sample?

System.Text.RegularExpressions.Regex.Matches(strData,".*Page.*\r?\n\r?\n.*Co\.REG.No\..*\n(?<COMPANYNAME>.*?)\s[^\S\n]+.*\n[\s\S]+?(?=.*?Page|$)")

Sample20221117-4v4.zip (4.1 KB)
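
To see what the pattern above captures, here is a minimal Python sketch that runs the same expression on made-up text. Python writes named groups as `(?P<NAME>...)` instead of .NET's `(?<NAME>...)`; everything else is copied unchanged. The sample layout (a "Page" line, a blank line, a "Co.REG No." line, then the company name followed by at least two spaces and more text) is an assumption inferred from the pattern itself, not the poster's real file.

```python
import re

# Same pattern as above, with the named group in Python syntax.
PATTERN = re.compile(
    r".*Page.*\r?\n\r?\n.*Co\.REG.No\..*\n"
    r"(?P<COMPANYNAME>.*?)\s[^\S\n]+.*\n"
    r"[\s\S]+?(?=.*?Page|$)"
)

# Hypothetical data shaped to fit the pattern.
sample = (
    "Statement            Page 1\n"
    "\n"
    "Co.REG No.: 123456\n"
    "ABC PTE LTD   Vendor 001\n"
    "Invoice 1001   100.00\n"
    "Statement            Page 1\n"
    "\n"
    "Co.REG No.: 123456\n"
    "DEF PTE LTD   Vendor 002\n"
    "Invoice 2001   250.00\n"
)

matches = list(PATTERN.finditer(sample))
for m in matches:
    # The group is the file name; the whole match is the file content.
    print(m.group("COMPANYNAME") + ".txt", len(m.group(0)), "characters")
```

Each match starts at a line containing "Page" and stops just before the next line containing "Page" (or at the end of the text), so the "Page" line stays at the top of its own section.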

Regards,

Hi, Yoichi.

Great~
Your sample file is working:


Can these output files be saved into individual folders by date?

Hi,

How do we get the date?

Regards,

Hi, Yoichi.

From here:

Hi,

Can you try the following sample? We can extract the date using regex.

Sample20221117-4v5.zip (4.5 KB)
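
The idea of pulling the date out with regex and using it as a folder name can be sketched as follows. This is a minimal Python sketch, not the attached workflow. The date format in the real file is only visible in a screenshot, so the `dd/mm/yyyy` format, the `dated_folder` helper, and the "YYYY MON" / "YYYYMMDD" folder layout are all assumptions for illustration.

```python
import re
from datetime import datetime
from pathlib import Path

# Assumption: the date appears in the text as dd/mm/yyyy.
DATE_PATTERN = re.compile(r"\b(\d{2})/(\d{2})/(\d{4})\b")

MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")


def dated_folder(text, root):
    """Build root/'YYYY MON'/'YYYYMMDD' from the first date found in text.

    Returns None when no date is found.
    """
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    day, month, year = (int(part) for part in match.groups())
    date = datetime(year, month, day)  # raises ValueError on impossible dates
    return Path(root) / f"{year} {MONTHS[month - 1]}" / date.strftime("%Y%m%d")


folder = dated_folder("Statement Date : 15/11/2022", "output")
# To create the folder before writing files into it:
# folder.mkdir(parents=True, exist_ok=True)
```

The `root` argument is the base output folder, so changing where the files go means changing only that one value.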

Regards,

Hi, Yoichi.

May I know how I should edit the folder path for the output files?

For example, what if I want to save the output text files to this folder path?
C:\Desktop\2022 NOV\20221115