Remove specific duplicated rows

Areto_taco · March 27, 2023, 9:38am

Hi, I have a PDF file with multiple pages containing the following data:

Page 1
Officers
Name
Tom
Jerry
Ben

Page 2
Officers
Name
Mary
Jack

Because of the page breaks, when I use the Read PDF Text activity, the "Officers" and "Name" headers are repeated throughout the extracted text.

How can I remove the repeated headers after the first occurrence, so that my text file becomes:

Officers
Name
Tom
Jerry
Ben
Mary
Jack

Anil_G · March 27, 2023, 9:54am

@Areto_taco

Welcome to the community.

Try this:

"Officers" + Environment.NewLine + "Name" + str.Replace("Officers" + Environment.NewLine + "Name","")

cheers
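
For readers who want to see the logic of the expression above outside UiPath, here is a minimal sketch in Python (not UiPath syntax). It strips every copy of the header block and then puts a single copy back at the top. The function name `keep_single_header` and the sample text are made up for illustration, and unlike the expression above it also removes the newline that follows "Name", so no empty line is left behind at the page break. It assumes the headers appear exactly as "Officers" followed by "Name" with no extra whitespace.

```python
# Header block as it is assumed to appear in the extracted text.
HEADER = "Officers\nName\n"


def keep_single_header(text, header=HEADER):
    """Remove every copy of the header block, then prepend one copy."""
    # Normalise Windows line endings so the header matches either way.
    text = text.replace("\r\n", "\n")
    body = text.replace(header, "")
    return header + body


# Hypothetical sample mimicking the two pages from the question.
pdf_text = "Officers\nName\nTom\nJerry\nBen\nOfficers\nName\nMary\nJack\n"
print(keep_single_header(pdf_text))
```

Because the match is on the whole two-line block, officer names that repeat are not touched.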

zaqq · March 27, 2023, 9:59am

Hi,

You can also use yourString.Split(Environment.NewLine.ToCharArray).Distinct().ToArray()
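
As a rough illustration of what a distinct-lines approach does, here is a small Python sketch (again not UiPath syntax; `distinct_lines` and the sample text are hypothetical). It keeps the first occurrence of every line and drops all later repeats, whatever the line contains.

```python
def distinct_lines(text):
    """Return the lines of text with later duplicates removed, order kept."""
    # splitlines() handles both \n and \r\n; dict.fromkeys keeps first-seen order.
    return list(dict.fromkeys(text.splitlines()))


# Hypothetical sample where the name "Tom" appears on both pages.
sample = "Officers\nName\nTom\nJerry\nOfficers\nName\nTom\nJack"
print(distinct_lines(sample))
```

In this sample the second "Tom" disappears along with the repeated headers, so this only fits when no officer name can legitimately appear twice.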

Areto_taco · March 27, 2023, 10:01am

Is it possible to remove only the duplicated header rows? I may have duplicated officer names as well, and I want to keep those.

vishal.kp · March 27, 2023, 10:47am

@Areto_taco ,

Give this a try:
Areto.xaml (6.7 KB)

zaqq · March 27, 2023, 11:27am

Yes, it is possible. You can do it this way:

strOfficers = System.Text.RegularExpressions.Regex.Replace(yourString, "[\n]", " ")
then
strOfficers.Replace("Officers", "").Replace("Name", "")

Regards
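
One more way to think about the requirement (drop only the repeated header rows, keep repeated officer names) is to go through the text line by line. The Python sketch below is only an illustration of that logic, not UiPath syntax. The names `HEADER_LINES` and `drop_repeated_headers` are made up, and it assumes each header sits on its own line and that no officer is literally called "Officers" or "Name".

```python
# Lines treated as headers; assumed to appear alone on their own line.
HEADER_LINES = ("Officers", "Name")


def drop_repeated_headers(text, header_lines=HEADER_LINES):
    """Keep the first occurrence of each header line and every other line."""
    seen = set()
    kept = []
    for line in text.splitlines():
        key = line.strip()
        if key in header_lines:
            if key in seen:
                continue  # a header we already kept once: skip it
            seen.add(key)
        kept.append(line)  # non-header lines are always kept, duplicates included
    return "\n".join(kept)


# Hypothetical sample: "Tom" appears on both pages and must survive twice.
pages = "Officers\nName\nTom\nJerry\nBen\nOfficers\nName\nTom\nJack"
print(drop_repeated_headers(pages))
```

The same loop could be built in a workflow with a For Each over the split lines and an If on the current line.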
