Connection String Parameters for Excel Data Sources
In the previous article, I discussed how we can treat Excel and text files as if they were a database using DAO, and how we can open them without linking. Because they do not use ODBC drivers, their connection string will be formatted quite differently from what you might be accustomed to seeing for an ODBC connection string. There’s a dearth of documentation on the Excel connection string parameters. This is a best-effort to cover some of the gaps and discuss the ramifications of parameters.
Excel connection string parameters
Even though we have 3 different data source “types”:
Excel 8.0
: 97-2003 xls filesExcel 12.0
: xlsb filesExcel 12.0 Xml
: xlsx files
They all use the same parameters.
Here are the list of parameters:
HDR
parameter: Header row
YES
: The first row is the header and should become the column names for the “table”/”recordset”
NO
: The first row is not treated differently and is just a data. All column names will be named “FN” where “N” is a number starting with 1
IMEX
parameter: Import/Export Behavior
This governs how the column data types should be defined, based on the content:
1
: If the column contain different data types, treat it as a string. Otherwise, match the column to the best data type.
2
: Always match the column to a certain data type based on the sample. That may cause an error in reading when we read a row that contains data that does not match the expected data type.
ACCDB
parameter: Indicates that Access is using ACCDB file format?
By default, this is always set ACCDB=YES in an accdb file format. However, omitting it or setting it to NO seems to do nothing. It’s a bit of a mystery. If anyone can share what effect this parameter, post in comment and I’ll update the blog.
DATABASE: Path to the Excel workbook
The parameter should contain a fully qualified path, including the workbook’s name.
Minimum working connection string
Note that the DATABASE is the only mandatory parameter in addition to the data type source keyword. Therefore a minimum working connection string can be:
Excel 8.0;DATABASE=C:\Links\Products.xls
Specifying sheet or range in connection string
In the previous sample, you saw that a sheet represented a “DAO.TableDef
“. However, worksheets aren’t the only thing that can be a “Tabledef
“. If the Excel spreadsheet contains a named range, the named range will be reported as a “Tabledef
” as well. Additionally, we can “query” an arbitrary block in the sheet using cell address. For example:
Dim db As DAO.Database Dim rs As DAO.Recordset Set db = DBEngine.OpenDatabase(vbNullString, False, False, "Excel 12.0 Xml;HDR=YES;IMEX=2;ACCDB=YES;DATABASE=C:\Links\Products.xlsx") Set rs = db.OpenRecordsset("Sheet$1A1:A3") Debug.Print rs.Name, rs.Fields.Count
It’s important to note that the cell addresses cannot exceed the sheet’s used range. For example, the Products.xlsx
only actually has contents in A1:B3, that means if you open a recordset using Sheet1$A1:D5, you still get only 2 for fields count and 3 for record count. The extra blank columns/rows are simply ignored. On the flip side, if you dirtied a cell somewhere outside the A1:B3
, the sheet’s UsedRange
will be now as bigger and querying would then include blank columns and rows.
Therefore those are valid names to use in a query on a Excel “database”:
Sheet1$
– Entire used range of a worksheet.Sheet1$A1:B4
– Only 2 columns and 3 rows (not counting header), providing that the contents are filled. Otherwise columns or rows may be less than requested.ProductsRange
– the named range with that name.
I find it much nicer to use named ranges where practical as this ensure that you are not hard-coding the addresses in your code especially if the range gets moved around due to user inserting new columns or rows but not altering the contents of the named range. However it is not always practical, especially if you are receiving spreadsheets from a 3rd party and therefore have no control over its contents or formats. In this case, writing a SQL query can work, too.
Querying Excel data source
Suppose we can’t control the format and we don’t want to rely on absolute address even though we are confident that certain columns and rows will be in fact present. In that situation, the best thing to do is to query. Here’s an example that select only one row:
Dim db As DAO.Database Set db = DBEngine.OpenDatabase(vbNullString, False, False, "Excel 12.0 Xml;HDR=YES;IMEX=2;ACCDB=YES;DATABASE=C:\Links\Products.xlsx") Dim rs As DAO.Recordset Set rs = db.OpenRecordset("SELECT d.[Count] FROM [Sheet1$] AS d WHERE d.[Products] = 'Bananas';") Debug.Print rs.Fields(0).Value
Hopefully you can see this is much easier than iterating over each rows to find which one has “Bananas” and then reading the column to right to get the count. In this case, querying beats out automating Excel.
Conclusion
You’ve seen that DAO makes it very easy for us to work with Excel data source and pretend as if it was a relational data source and use our favorite querying language and familiar DAO objects instead of writing bunch of VBA code automating Excel to find the data we want. The connection string parameters are fairly straightforward and as long you have the path, you’re good for linking or opening an Excel spreadsheet.
In the next article we will look at text file connection parameters.
Thanks Ben, a csv! Brilliant but simple, why didn’t I think of that.
One of the reasons for using an Excel file is that you can put it in Dropbox and it should synch. I haven’t tried it yet because of the Read/Write issue. and it provides an adequate interface for the external users without having to distribute and Access application.
I have one potential project where I had used an Access Web app a few years ago for a similar project and it worked well for sharing relatively small amounts of data with multiple outside consultants but allas that option is no longer available.
I was also looking at an Azure VM with Parallels RAS but that is way over the top for this project.
Not sure if a csv will give enough functionality (freeze panes for the column heading, no drop down lists etc) but it may just be enough. But now that Access is included with every 365 business plan I may be able to convince them to use a local Access db as the UI. I’d get too much push back if I asked them to install the Access Runtime.
This method, as does linking, provides a read only connection. I gather this is a constraint left over from a patent lawsuit that MS lost back in 1994 (circa). I assume that that patent has now expired. Does MS have any plan to reactivate that ability to read write to Excel from Access?
So what do you see as the easiest method we currently have to update data in a work sheet – Automation or CopyFromRecordset?
Correct. However, you can workaround this by targeting a CSV file instead, then converting it to an Excel spreadsheet. My biggest issue with Excel automation is that it’s too easy to get mired down in plumbing details — making sure you are in right cell, formatting everything, moving things around, whatever. All those should not be a part of the code because that increases the maintenance burden. What if someone wanted to change the format? Nope, can’t do that because it’s in VBA code, and adding a column could suddenly mess up the formulas or VBA code writing values to certain cells that aren’t in their place anymore. To avoid getting into this, I’m leaning toward the idea that we want to have an Excel spreadsheet template that makes use of a Table (aka
ListObject
in VBA). The important point is that aListObject
, aPivotTable
, or aPivotChart
can have a data connection which can then point to an Access backend or to a SQL Server backend. With a live connection, it’s easy to refresh the data connection to get the current data, and it’s still relatively easy to break the data connection and convert the data into a ordinary range. Therefore, if live connection works for your scenario, you avoid the need to generate an Excel spreadsheet every time. But even if you don’t want a live connection, using aListObject
/PivotTable
/PivotChart
object to get the data helps ensure that you get the data, in the format you want with minimal programming. The user would be free to customize the format of the report (to some extent but much more than if we were automating).CopyFromRecordset
function is fine for simple scenarios but breaks down if you need to add subtotals or totals or provide some formats that depend on the data’s contents. I hope that helps!