Not accepting the correct date format

Hi,

I want to read a PDF file and write the data to Excel, but I am facing an issue with the date:
From the PDF read activity, I used regex to get the date along with the time.
Then I split the data to get the date only, which is “12-Oct-2018”. Now I want to print the date in “12102018” format.
I used this code to

I am getting this error:

Please help

Hi @nsharma

Based on your description (bolded in the quote), the string 12-Oct-2018 does not match the format ddMMMyyyy in your screenshot, because that format is missing the dashes.

I created a string variable ddd="12-Oct-2018" and passed it into the following code:

DateTime.ParseExact(ddd, "dd-MMM-yyyy", CultureInfo.InvariantCulture).ToString("ddMMyyyy")

It outputs 12102018 as expected.
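
A minimal console sketch of the same expression, for trying it outside Studio. The module name and the `parsed` and `formatted` variables are illustrative only, and it assumes the input is exactly `12-Oct-2018` with no surrounding whitespace.

```vb
Imports System
Imports System.Globalization

Module ParseDateSketch
    Sub Main()
        ' Hard-coded sample; in the workflow this would come from the PDF text.
        Dim ddd As String = "12-Oct-2018"

        ' The format string must mirror the input exactly, dashes included.
        Dim parsed As DateTime = DateTime.ParseExact(ddd, "dd-MMM-yyyy", CultureInfo.InvariantCulture)

        ' "MM" is the month number; lowercase "mm" would mean minutes.
        Dim formatted As String = parsed.ToString("ddMMyyyy", CultureInfo.InvariantCulture)
        Console.WriteLine(formatted)
    End Sub
End Module
```

If `ddd` contains anything beyond those eleven characters, `ParseExact` throws a `FormatException` rather than guessing.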

Cheers,
Tim

Hello,

Can you check whether the final variable in which you are storing the date is of type System.DateTime?

Thank you.
Anusha

Yes, but even when I try it with the dashes, it still does not work.
Since I am getting the data from a PDF in which the date is in bold, can you help me change it to a normal font?

So is your issue in extracting the text and parsing it with regex, or in converting the text string into a DateTime object?

The fact that the original text in the PDF is bold should not matter if it is being extracted correctly. In either case, I would check what the value of ddd is, and whether the format you are passing to ParseExact() matches the actual format of that string.

To help you further, we’d need a sample of what the value of ddd is after you’ve extracted and split the data.
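
One way to inspect the value is to log it between visible delimiters, together with its length and character codes, which exposes stray spaces or look-alike characters that sometimes appear in extracted text. This is only a rough sketch: the sample value, with a leading space and a trailing non-breaking space, is invented for illustration and is not taken from the actual PDF.

```vb
Imports System
Imports System.Linq

Module InspectValueSketch
    Sub Main()
        ' Hypothetical extracted value: leading space plus trailing non-breaking space (U+00A0).
        Dim ddd As String = " 12-Oct-2018" & Convert.ToChar(160)

        ' Brackets make leading and trailing whitespace visible in the log.
        Console.WriteLine("[" & ddd & "] length=" & ddd.Length.ToString())

        ' Character codes reveal anything that merely looks like a space or a dash.
        Console.WriteLine(String.Join(" ", ddd.Select(Function(c) Convert.ToInt32(c).ToString())))
    End Sub
End Module
```

In a workflow, the same two expressions could go into Log Message activities placed right after the Assign that sets ddd.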

Hi,
I got the text from the PDF using regex, and now I want to convert the string into a date.
I cannot share the sample, but for your reference I can explain the scenario:
I have a PDF with the invoice number and date at the top, and both are in bold.
After extracting the date, I convert the date string into a DateTime and, in the same expression, assign it to a string in my format (using the code I shared initially).

However, the code works for hardcoded values, just not for the value extracted from the PDF.

Can you share a small screenshot of the date from the PDF, the content of the ddd Assign activity, the sample before it is assigned to ddd, and the value after it’s assigned?

From the sounds of it, the content of ddd is not in the expected string or format. Without more insight (logs, screenshots, etc.) we’d only be guessing, and I can only point out that your original post has a format conflict between “12-Oct-2018” and “ddMMMyyyy”.

Hi,

This is the sample of pdf:

pdf

and here the full code:

test.xaml (16.5 KB)

Please refer to this test code.

I used the same format, but it was not working, so I tried again after removing the dashes.

Convert.ToDateTime("30-Jul-2018 16:28 PM").ToString("ddMMyyyy")
Put this in a message box and see if it works.
You can replace the "30-Jul-2018 16:28 PM" string with the string from your PDF.

@nsharma

Okay so I’m going to make some assumptions based on the test code as I don’t have a PDF to read.

1. pdocdate="(?<=(Invoice.Date:))(.*)"
2. splitdata=Split(pdocdate(0).tostring, " ")
3. ddd=Trim(splitdata(1).ToString)
4. dd=DateTime.ParseExact(ddd, "dd-MMM-yyyy",CultureInfo.InvariantCulture).ToString("ddMMyyyy")

Based on what I can examine, your regex subgroups don’t look like what you think they are. Can you verify that pdocdate(0) is “Invoice Date: 30-Jul-2018 16:26 PM” before you Split it on the space?

As you’re using sub/capturing groups in your regex, it might be simpler to Trim your pdocdate group 2 and ParseExact using “dd-MMM-yyyy HH:mm tt”

I’m not sure what flavour of regex .NET uses, but I imagine it supports non-capturing groups and named capturing groups.

(?<=(?:Invoice.Date:))(?<invoiceDate>.*). Make the first group non-capturing with ?: and name the second group invoiceDate to simplify what you are referencing.
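
.NET’s regex engine does support both non-capturing and named groups, so the pattern can be tried as-is. Here is a small sketch using the sample line quoted earlier in the thread; the module and variable names are illustrative, and the real input would be the text read from the PDF.

```vb
Imports System
Imports System.Text.RegularExpressions

Module NamedGroupSketch
    Sub Main()
        ' Sample line taken from the thread, standing in for the PDF text.
        Dim source As String = "Invoice Date: 30-Jul-2018 16:26 PM"

        ' Lookbehind with a non-capturing group, then a named capturing group.
        Dim m As Match = Regex.Match(source, "(?<=(?:Invoice.Date:))(?<invoiceDate>.*)")

        If m.Success Then
            ' Trim removes the space left after the colon.
            Dim invoiceDate As String = m.Groups("invoiceDate").Value.Trim()
            Console.WriteLine("[" & invoiceDate & "]")
        Else
            Console.WriteLine("No match")
        End If
    End Sub
End Module
```

Referencing the group by name avoids having to work out which numeric index the date ended up in.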

But the regex is completely fine; it is writing the correct data to Excel.

This is showing me the error: disallows implicit conversions from 'String' to 'System.IFormatProvider'

And this: Convert.ToDateTime("30-Jul-2018 16:28 PM").ToString("ddMMyyyy")?

Okay, well dump your variables and see what the content actually is to validate it.
Are you writing the full Date/Time to Excel and Excel is formatting the display to be Date only?

Swapping your PDF out for a text file with the same content…

splitdata =

string[1] 
{
  " 30-Jul-2018 16:26 PM" 
}

So Trim(splitdata(1).ToString) is not valid, because the array has only one element. Changing it to an index of 0 will get you a step further, but the format in

DateTime.ParseExact(ddd, "dd-MMM-yyyy",CultureInfo.InvariantCulture).ToString("ddMMyyyy")

does not match the content 30-Jul-2018 16:26 PM, which I pointed out in an earlier post.
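
The two ways out of that mismatch can be sketched as follows, assuming the captured value is exactly the string shown in the dump above. Either describe the whole string, time included, using the “dd-MMM-yyyy HH:mm tt” format suggested earlier, or keep only the date token and parse that with “dd-MMM-yyyy”. The names `raw`, `dateOnly`, `parsed`, and `ok` are illustrative. `TryParseExact` is used for the first option so that a format mismatch is reported rather than thrown.

```vb
Imports System
Imports System.Globalization

Module ParseDateTimeSketch
    Sub Main()
        ' Captured text as shown in the dump, including the leading space.
        Dim raw As String = " 30-Jul-2018 16:26 PM"
        Dim ddd As String = raw.Trim()

        ' Option 1: a format that describes the whole string, time included.
        Dim parsed As DateTime
        Dim ok As Boolean = DateTime.TryParseExact(ddd, "dd-MMM-yyyy HH:mm tt",
                                                   CultureInfo.InvariantCulture,
                                                   DateTimeStyles.None, parsed)
        If ok Then
            Console.WriteLine(parsed.ToString("ddMMyyyy", CultureInfo.InvariantCulture))
        Else
            Console.WriteLine("Full date/time format did not match: [" & ddd & "]")
        End If

        ' Option 2: keep only the date token and parse it with the date-only format.
        Dim dateOnly As String = ddd.Split(" "c)(0)
        Dim dd As String = DateTime.ParseExact(dateOnly, "dd-MMM-yyyy",
                                               CultureInfo.InvariantCulture).ToString("ddMMyyyy", CultureInfo.InvariantCulture)
        Console.WriteLine(dd)
    End Sub
End Module
```

Option 2 leaves the regex untouched and only changes how the captured text is split, so it is probably the smaller change to the existing workflow.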

Based on the test.xaml file you provided, you need to change either 1) the regex or 2) the date format you are expecting, along with a few index changes based on your current regex.

Swapped PDF for TXT, Inserted Log Message activities

Fixed the out of bounds due to the original Regex expected capture

Updated date format to match the captured date/time text