I have an automation that is set up to use MS Excel as a database using an ODBC connector. The data is read from the Excel file into a data table. There is an error with the SQL query when it encounters integers in the Excel file. Tracing back through the code and looking at values during execution it seems that all works fine when the first value in the required column in the Excel data (Column A in example below)is presented as a string:
Example data from Excel sheet (all is ok, all values are read correctly)
-Column A
bill
apple
car
But when the first value in the column (Column A in example below) is an integer, the SQL reads the integer values but then reads the string values as nothing (output shows integers, but blank field for the strings, line is still in data table but it’s empty
Example data from Excel sheet (only reads integer values but leaves strings blank)
-Column A
123
456
bill
apple
Example output of the data table made from above data after SQL query (in memory):
-Column A
123
456 blank line when output to logs blank line when output to logs
I’ve tried to force the SQL statement to read data as strings through various methods (Build DT activity then write data into it, using Cstr, Cast, Convert, etc in the SQL statement) but nothing has been successful so far. There is also the possibility that I did not implement the conversion methods correctly as well
SQL query: Select Column A FROM Sheet 1
Note: My actual SQL statement is selecting multiple columns, I’ve trimmed it down to simplify the problem/example.
Any help making the SQL statement read all values (Integers and Strings) as strings and properly put them into the data table is greatly appreciated.
Hello @supermanPunch,
Thank you for the prompt reply and sorry for my delayed response. I’ve attempted to put in what you have shown but I was unsuccessful. My original connection string is not quite formatted like yours.
My original connection string:
Dsc=ExcelAsDB;dpq={0};defaultdir={1};driverid=1046;fil=excel 12.0;maxbuffersize=4096;pagetimeout=5
What I tried (unsuccessfully):
Dsc=ExcelAsDB;dpq={0};defaultdir={1};driverid=1046;fil=excel 12.0;maxbuffersize=4096;pagetimeout=5;Extended Properties=“IMEX=1”;
Notes:
{0} and {1} are formatted with a path to a .xlsx file and a path to the directory location of the same file (no actual file, just the folder path to the file)
“ExcelAsDB” refers to a System DSN I created in the ODBC Data Source Administrator (32-bit) software
My apologies if this information is now what you were looking for. It’s been a while since this was configured and only now has data come through that has required another look at this setup. This is also my first time setting up Excel as a database.
Any suggestions on what needs to be changed? Or is more information needed?
When I ran the query with the modified connection string the output was the same as before I modified the Connection String, no errors or anything, just the same output.
Example output of the data table made from above data after SQL query (in memory):
-Column A
123
456 blank line when output to logs blank line when output to logs
Is there anything like the “Mixed Types” parameter that can be used with ODBC? From searching it looks like it’s not possible. Just curious if anyone has any inputs.
I did try and change the whole automation over to use OleDb as outlined earlier, I was able to get the data output as needed, but there were other issues that were created. I have not been successful because there are several areas where the connection is used to read information from the files, then the data is manipulated and a different sequence is called which uses “Write Range” activity to write the manipulated data back to the same workbook.
I’ve tried to close the database connection before each Write Range then re-open afterwards but I’m getting a “Only one of the connections can be used” error. I looked at the properties of my “Run Query” activity and only “Existing connection” and “Provider Name” are populated in the “Connection Configuration” section. This error occurs after the first connect and disconnect from the OleDB connection. The “Run query” after the second “Connect to database” activity throws the error
As a final update from my end. I was able to get the data read correctly from Excel with the OleDb suggestion from @supermanPunch. For writing the data back to Excel, I wasn’t able to get the OleDb version to work correctly so I just kept the ODBC version from the original. I’m reading data with OleDb and writing with ODBC. Thank you for the help @supermanPunch, very much appreciated.