I want to read a PDF file and write the data to Excel, but I am facing an issue with a date:
After the Read PDF activity, I used a regex to get the date along with the time.
Then I split the data to get only the date, which is “12-Oct-2018”. Now I want to print the date in “12102018” format.
I used this code to
Yes, even when I try it with the dashes, it still does not fetch the date.
I am trying to get data from a PDF in which the date is in bold. Can you help me change it to a normal font?
So is your issue in extracting the text and parsing it with Regex, or in converting the text string into a DateTime object?
The fact that the original text in the PDF is bold should not matter if it is being extracted correctly. In either case, I would check what the value of ddd is and whether the format you are passing to ParseExact() matches the actual format of the string.
To help you further, we’d need a sample of what the value of ddd is after you’ve extracted and split the data.
Hi,
I got the text from the PDF using regex, and now I want to convert the string to a date.
I cannot share the sample, but for your reference I can explain the scenario:
I have a PDF with the invoice number and date at the top, both in bold.
After extracting the date, I convert the string to a DateTime and at the same time assign it to a string in my format (using the code I shared initially).
However, the code works for hardcoded values but not for the value extracted from the PDF.
Can you share a small screenshot of the date from the PDF, the content of the ddd Assign activity, a sample of the value before it is assigned to ddd, and the value after it’s assigned?
From the sound of it, the content of ddd is not the expected string or not in the expected format. Without more insight (logs, screenshots, etc.) we’d only be guessing, and I can only point out that your original post has a format conflict between “12-Oct-2018” and “ddMMMyyyy”: the string contains dashes and the format does not.
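To make the format conflict concrete, here is a minimal sketch of the same idea in Python rather than as a UiPath expression. It assumes an English locale for the month abbreviation; the helper name `reformat` is made up for illustration. The pattern `%d%b%Y` plays the role of “ddMMMyyyy” and `%d-%b-%Y` plays the role of a dashed pattern such as “dd-MMM-yyyy”.

```python
from datetime import datetime

raw = "12-Oct-2018"  # the date string quoted in the original post


def reformat(text, pattern):
    """Parse text strictly against pattern and return it as day-month-year digits."""
    return datetime.strptime(text, pattern).strftime("%d%m%Y")


try:
    # Pattern without dashes, analogous to "ddMMMyyyy": does not match the string.
    reformat(raw, "%d%b%Y")
except ValueError as err:
    print("pattern without dashes fails:", err)

# Pattern with dashes, analogous to "dd-MMM-yyyy": matches the string.
print(reformat(raw, "%d-%b-%Y"))  # 12102018
```

A strict parser only succeeds when the separators in the pattern line up with the separators in the text, which is why the dashes matter.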
Convert.ToDateTime("30-Jul-2018 16:28 PM").ToString("ddMMyyyy") <-- put this in a Message Box and see if it works. (Use uppercase MM for the month; lowercase mm means minutes.)
You can replace the "30-Jul-2018 16:28 PM" string with the string from your PDF.
Based on what I can examine, it doesn’t look like your regex subgroups are what you think they are. Can you verify that pdocdate(0) is “Invoice Date: 30-Jul-2018 16:26 PM” before you Split it on the space?
Okay, well, dump your variables and check what the content actually is.
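When dumping a variable, it helps to print it in a form that exposes hidden characters such as leading spaces. A rough sketch in Python (the function name `describe` and the sample value are hypothetical, not taken from the workflow):

```python
def describe(value):
    """Return a description of value that makes hidden characters visible."""
    codes = " ".join(f"U+{ord(ch):04X}" for ch in value)
    return f"{value!r} (length {len(value)}): {codes}"


# Hypothetical extracted value with a leading space in front of the date.
print(describe(" 30-Jul-2018"))
```

The quoted form and the length show immediately whether the string carries extra whitespace that a strict date format would reject.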
Are you writing the full Date/Time to Excel and Excel is formatting the display to be Date only?
Swapping your PDF out for a text file with the same content…
splitdata =
string[1]
{
" 30-Jul-2018 16:26 PM"
}
So Trim(splitdata(1).ToString) is not valid, because the array has only one element. Changing it to an index of 0 will get you a step further, but the format in
does not match the content 30-Jul-2018 16:26 PM which I pointed out in an earlier post.
Based on the test.xaml file you provided, you NEED to change either 1) the Regex or 2) the date format that you are expecting, along with a few index changes based on your current Regex.
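The index and format points can be sketched together. This is an illustration in Python under the assumption of an English locale, starting from the array content dumped above; it is not the workflow's own code. The pattern `%d-%b-%Y %H:%M %p` is chosen to match the full string including time and AM/PM marker.

```python
from datetime import datetime

# Content of the split result as dumped from the workflow: a single element.
splitdata = [" 30-Jul-2018 16:26 PM"]

# Only index 0 exists, so reading index 1 would raise an IndexError.
assert len(splitdata) == 1

# Trim the leading space, then parse with a pattern covering date and time.
text = splitdata[0].strip()
parsed = datetime.strptime(text, "%d-%b-%Y %H:%M %p")

print(parsed.strftime("%d%m%Y"))  # 30072018
```

Either the pattern has to cover the time portion, as here, or the regex has to be narrowed so that only the date part reaches the conversion.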
Swapped PDF for TXT, Inserted Log Message activities