Effect of joins in a recordset
In our sixth and final article of the ODBC tracing series, we are going to take a look at how Access will handle joins in Access queries. In the previous article, you saw how filters are handled by Access. Depending on the expression, Access may choose to parameterize it away or may be forced to evaluate it itself by downloading all the input data then performing the evaluations locally. In this article, we will focus on joins. When you think about it, joins are actually a special kind of filter. Therefore, in theory, Access ought to remote as much as possible even with joins. You might typically see joins written in the following pseudo-SQL:
FROM a INNER JOIN b ON a.ID = b.ID
JOIN
syntax.FROM a, b WHERE a.ID = b.ID
JOIN..ON
, Access is free to treat it as a WHERE
which is helpful in situations where Access cannot fully remote the query. But here’s the rub… when does Access decide to remote the joins? Let’s try a simple join query:
SELECT c.CityID ,c.StateProvinceID ,c.CityName ,s.StateProvinceName FROM Cities AS c INNER JOIN StateProvinces AS s ON c.StateProvinceID = s.StateProvinceID;
SQLExecDirect: SELECT "c"."CityID" ,"s"."StateProvinceID" FROM "Application"."Cities" "c", "Application"."StateProvinces" "s" WHERE ("c"."StateProvinceID" = "s"."StateProvinceID" ) SQLPrepare: SELECT "CityID" ,"CityName" ,"StateProvinceID" FROM "Application"."Cities" WHERE "CityID" = ? SQLExecute: (GOTO BOOKMARK) SQLPrepare: SELECT "StateProvinceID" ,"StateProvinceName" FROM "Application"."StateProvinces" WHERE "StateProvinceID" = ? SQLExecute: (GOTO BOOKMARK) SQLPrepare: SELECT "StateProvinceID" ,"StateProvinceName" FROM "Application"."StateProvinces" WHERE "StateProvinceID" = ? OR "StateProvinceID" = ? OR "StateProvinceID" = ? OR "StateProvinceID" = ? OR "StateProvinceID" = ? OR "StateProvinceID" = ? OR "StateProvinceID" = ? OR "StateProvinceID" = ? OR "StateProvinceID" = ? OR "StateProvinceID" = ? SQLExecute: (MULTI-ROW FETCH) SQLPrepare: SELECT "CityID" ,"CityName" ,"StateProvinceID" FROM "Application"."Cities" WHERE "CityID" = ? OR "CityID" = ? OR "CityID" = ? OR "CityID" = ? OR "CityID" = ? OR "CityID" = ? OR "CityID" = ? OR "CityID" = ? OR "CityID" = ? OR "CityID" = ? SQLExecute: (MULTI-ROW FETCH) SQLExecute: (MULTI-ROW FETCH) SQLExecute: (MULTI-ROW FETCH) SQLExecute: (MULTI-ROW FETCH) SQLExecute: (MULTI-ROW FETCH) SQLExecute: (MULTI-ROW FETCH) SQLExecute: (MULTI-ROW FETCH) SQLExecute: (MULTI-ROW FETCH) SQLExecute: (MULTI-ROW FETCH)
If we change the same Access query to be a snapshot-type instead of the default dynaset-type, we get:
SQLExecDirect: SELECT "c"."CityID" ,"c"."CityName" ,"c"."StateProvinceID" ,"s"."StateProvinceName" FROM "Application"."Cities" "c", "Application"."StateProvinces" "s" WHERE ("c"."StateProvinceID" = "s"."StateProvinceID" )
Such query allows updating to both columns. That is not actually expressible in SQL but such action is legal for user to perform. Therefore, to execute that update, Access would submit the following ODBC SQL:
SQLExecDirect: UPDATE "Application"."StateProvinces" SET "StateProvinceName"=? WHERE "StateProvinceID" = ? AND "StateProvinceName" = ? SQLExecDirect: UPDATE "Application"."Cities" SET "CityName"=? WHERE "CityID" = ? AND "CityName" = ? AND "StateProvinceID" = ?
To prove this, we will create a SQL view and link it to Access:
CREATE VIEW dbo.vwCitiesAndStates AS SELECT c.CityID ,c.StateProvinceID ,c.CityName ,s.StateProvinceName FROM Application.Cities AS c INNER JOIN Application.StateProvinces AS s ON c.StateProvinceID = s.StateProvinceID;
SELECT c.CityID ,c.StateProvinceID ,c.CityName ,c.StateProvinceName FROM vwCitiesAndStates AS c;
SQLExecDirect: SELECT "c"."CityID" FROM "dbo"."vwCitiesAndStates" "c" SQLPrepare: SELECT "CityID" ,"StateProvinceID" ,"CityName" ,"StateProvinceName" FROM "dbo"."vwCitiesAndStates" WHERE "CityID" = ? SQLExecute: (GOTO BOOKMARK) SQLPrepare: SELECT "CityID" ,"StateProvinceID" ,"CityName" ,"StateProvinceName" FROM "dbo"."vwCitiesAndStates" WHERE "CityID" = ? OR "CityID" = ? OR "CityID" = ? OR "CityID" = ? OR "CityID" = ? OR "CityID" = ? OR "CityID" = ? OR "CityID" = ? OR "CityID" = ? OR "CityID" = ? SQLExecute: (MULTI-ROW FETCH) SQLExecute: (MULTI-ROW FETCH) SQLExecute: (MULTI-ROW FETCH) SQLExecute: (MULTI-ROW FETCH) SQLExecute: (MULTI-ROW FETCH) SQLExecute: (GOTO BOOKMARK) SQLExecDirect: UPDATE "dbo"."vwCitiesAndStates" SET "CityName"=?, "StateProvinceName"=? WHERE "CityID" = ? AND "StateProvinceID" = ? AND "CityName" = ? AND "StateProvinceName" = ?
UPDATE
on a single source whereas in the original example, Access was actually secretly issuing two separate UPDATE
statements on each individual table. Hopefully that helps make the case that you should avoid doing joins in Access queries/recordsources/rowsources especially when they need to be updatable. If they don’t, use snapshot where feasible.
A quick note regarding heterogeneous joins
We need to comment regarding joins between two linked tables that comes from two different ODBC data sources. Such joins are “heterogeneous” because Access must handle the joins locally since each data sources are assumed to not know of each other. Regardless of whether you specify a dynaset-type or snapshot-type recordsets, Access must fetch the full set of keys from each data source and resolve the joins by sending separate parameterized queries to each data source. If updating is allowed, Access will formulate a separate UPDATE
query to each data source that needs to be updated. It is also important to note that a join between two linked tables that comes from two different databases each is still considered by Access as heterogeneous. That is still true even if the two databases are on the same server and you have no problem doing cross database queries. In this scenario, a SQL view can help cut down on the additional chatter by hiding the cross database joins from Access similar to what we saw already in this article.
Outer join syntax difference
As long as the outer joins do not affect the updatability of the Access query, Access will handle it similarly to how it handled the inner join version. If we modify the same query we used to be an left join, the traced ODBC SQL will output the key population query like so:
SQLExecDirect: SELECT "c"."CityID" ,"s"."StateProvinceID" FROM {oj "Application"."Cities" "c" LEFT OUTER JOIN "Application"."StateProvinces" "s" ON ("c"."StateProvinceID" = "s"."StateProvinceID" ) }
{oj ...}
expression. For more details on that syntax, consult the documentation. For our purpose, we can just disregard the {oj
and the closing }
as noise.
Conclusions
We saw that joins are treated as if they are a kind of filter and Access will try to remote the joins where it is allowed to. One particular area to pay close attention is the fact that by default we use dynaset type recordsets and Access will make no assumptions about whether we want to allow modifying so and so columns in the recordset and goes out of it way to make it possible for us to update to two tables which actually is not easily expressed in standard SQL. As a consequence, Access will do much more work to support updatability for a query that contains joins which can negatively impact the performance.
We can help avoid the penalty by using SQL views in place of joins expressed in an Access query. The tradeoff is that we are then subject to the updatability rules of a SQL view; we might not be allowed to update two tables at the same time. Usually because a well-designed Access form will only represent a single table to update, that is not much of a restriction and is a good discipline to follow.
With that, the current series is done. However, the learning that the series hopefully sparks should not be done. I sincerely hope you found the series useful and look forward to hearing about new insights you gained from using tools to help analyze and address performance issues with Access applications using ODBC data sources. Feel free to leave comments or request for more information and thanks for reading together!
For further assistance with anything Microsoft Access related, call our experts on 773-809-5456 or email us at sales@itimpact.com.