Excel/Access ADO

L

Loane Sharp

Hi there

I'm a beginning ADO user ... please help!

Is there a more efficient and less time-consuming way to achieve the
following effect (pulling Access data into Excel) ...?

I have two different *.mdb files with identical field etc. structures
(really a single database that is bigger than 2GB and that I've split into
two files). A given piece of analysis in Excel requires records from both
files (however I split the database). In each iteration (over 200,000 loops)
I open a connection to the first database and create a recordset and,
thereafter, open a connection to the second database and create a new
recordset. The two recordsets are then placed in a contiguous range on an
Excel sheet and I proceed with the analysis from there ...

I've truncated my code below.

Please help

Loane

Dim rsData As ADODB.Recordset
Dim strConnect As String
Dim strSQL As String

For i = 1 to NoRecords

For j = 1 to 2

If j = 1 Then
strConnect = "...\DataBase1.mdb;"
strSQL = "SELECT Field1 FROM Table1 WHERE Index = i"
ElseIf j = 2 Then
strConnect = "...\DataBase2.mdb;"
strSQL = "SELECT Field1 FROM Table2 WHERE Index = i"
End If

rsData.Open strSQL, strConnect, adOpenForwardOnly, adLockReadOnly,
adCmdText

Cells(i, 1).CopyFromRecordset rsData

rsData.Close

Next j

[Some code representing Excel-based analysis goes in here. The analysis
currently requires that the data in the two recordsets (i.e. drawn from
Table1 and Table 2, respectively) be deposited in a contiguous range in a
worksheet.]

Next i

Set rsData = Nothing
 
J

Jake Marx

Hi Loane,

I'm sure there is a more efficient way of doing this, but not knowing the
data, it's hard to give specifics. Here are a few general
comments/suggestions:

1) Only hit each database once:

strSQL = "SELECT Index1, Field1 FROM Table1 " _
& "WHERE Index<=" & CStr(NoRecords)

....and do the same for Table2 from database #2.

2) Use CopyFromRecordset to copy each recordset to a range (obviously,
Table2 will have to go below Table1).

3) Use Excel's built-in sort (Sort method) to sort the resulting data so it
is in the order you want (from what I understand of your data, you'll
probably sort on Index and then Field1). If you don't have a good secondary
key to determine which table the row came from, you can add that key to a
column to the right of your data (a 1 for Table1 and a 2 for Table2, for
instance) and use that as your second sorting column.

If you really have 100,000 records to return, you'll have to split them over
2 or more worksheets, as each worksheet can hold only ~65k rows.


Another option is to put all the data for each Index on the same row (if you
want that):

1) See step 1 above.

2) Use CopyFromRecordset to copy each recordset to its own worksheet.

3) On the second worksheet (Table2 data), use the VLOOKUP worksheet function
in a third column to "pull" the data from worksheet 1, column 2 based on the
Index #.


A final option (one that I'm not sure will work because of possible
limitations in Access):

1) Set up a linked table in one of your databases that points to the other
database. For this example, let's assume that in DataBase1.mdb, you set up
a link to Table2 in DataBase2.mdb and name the linked table "Table2". Now,
you can use a single SELECT statement to get all the data you need:

strSQL = "SELECT a.Index, a.Field1 As Table1Field1, b.Field1 As
Table2Field1 " _
& "FROM Table1 a INNER JOIN Table2 b ON a.Index=b.Index " _
& "WHERE a.Index<=" & CStr(NoRecords)

2) Use CopyFromRecordset to copy each recordset to the desired range.


Hopefully, this makes sense. If you have any further questions, please
reply to this post and we'll try to help out further.

--
Regards,

Jake Marx
MS MVP - Excel
www.longhead.com

[please keep replies in the newsgroup - email address unmonitored]


Loane said:
Hi there

I'm a beginning ADO user ... please help!

Is there a more efficient and less time-consuming way to achieve the
following effect (pulling Access data into Excel) ...?

I have two different *.mdb files with identical field etc. structures
(really a single database that is bigger than 2GB and that I've split
into two files). A given piece of analysis in Excel requires records
from both files (however I split the database). In each iteration
(over 200,000 loops) I open a connection to the first database and
create a recordset and, thereafter, open a connection to the second
database and create a new recordset. The two recordsets are then
placed in a contiguous range on an Excel sheet and I proceed with the
analysis from there ...

I've truncated my code below.

Please help

Loane

Dim rsData As ADODB.Recordset
Dim strConnect As String
Dim strSQL As String

For i = 1 to NoRecords

For j = 1 to 2

If j = 1 Then
strConnect = "...\DataBase1.mdb;"
strSQL = "SELECT Field1 FROM Table1 WHERE Index = i"
ElseIf j = 2 Then
strConnect = "...\DataBase2.mdb;"
strSQL = "SELECT Field1 FROM Table2 WHERE Index = i"
End If

rsData.Open strSQL, strConnect, adOpenForwardOnly, adLockReadOnly,
adCmdText

Cells(i, 1).CopyFromRecordset rsData

rsData.Close

Next j

[Some code representing Excel-based analysis goes in here. The
analysis currently requires that the data in the two recordsets (i.e.
drawn from Table1 and Table 2, respectively) be deposited in a
contiguous range in a worksheet.]

Next i

Set rsData = Nothing
 
L

Loane Sharp

Hi Jake
Thanks for your help. I've always maintained that, if it can't be done in
Excel, it's probably not worth doing. However I didn't quite expect the
ramp-up in technique required to hold this position!

Best regards
Loane




Jake Marx said:
Hi Loane,

I'm sure there is a more efficient way of doing this, but not knowing the
data, it's hard to give specifics. Here are a few general
comments/suggestions:

1) Only hit each database once:

strSQL = "SELECT Index1, Field1 FROM Table1 " _
& "WHERE Index<=" & CStr(NoRecords)

...and do the same for Table2 from database #2.

2) Use CopyFromRecordset to copy each recordset to a range (obviously,
Table2 will have to go below Table1).

3) Use Excel's built-in sort (Sort method) to sort the resulting data so it
is in the order you want (from what I understand of your data, you'll
probably sort on Index and then Field1). If you don't have a good secondary
key to determine which table the row came from, you can add that key to a
column to the right of your data (a 1 for Table1 and a 2 for Table2, for
instance) and use that as your second sorting column.

If you really have 100,000 records to return, you'll have to split them over
2 or more worksheets, as each worksheet can hold only ~65k rows.


Another option is to put all the data for each Index on the same row (if you
want that):

1) See step 1 above.

2) Use CopyFromRecordset to copy each recordset to its own worksheet.

3) On the second worksheet (Table2 data), use the VLOOKUP worksheet function
in a third column to "pull" the data from worksheet 1, column 2 based on the
Index #.


A final option (one that I'm not sure will work because of possible
limitations in Access):

1) Set up a linked table in one of your databases that points to the other
database. For this example, let's assume that in DataBase1.mdb, you set up
a link to Table2 in DataBase2.mdb and name the linked table "Table2". Now,
you can use a single SELECT statement to get all the data you need:

strSQL = "SELECT a.Index, a.Field1 As Table1Field1, b.Field1 As
Table2Field1 " _
& "FROM Table1 a INNER JOIN Table2 b ON a.Index=b.Index " _
& "WHERE a.Index<=" & CStr(NoRecords)

2) Use CopyFromRecordset to copy each recordset to the desired range.


Hopefully, this makes sense. If you have any further questions, please
reply to this post and we'll try to help out further.

--
Regards,

Jake Marx
MS MVP - Excel
www.longhead.com

[please keep replies in the newsgroup - email address unmonitored]


Loane said:
Hi there

I'm a beginning ADO user ... please help!

Is there a more efficient and less time-consuming way to achieve the
following effect (pulling Access data into Excel) ...?

I have two different *.mdb files with identical field etc. structures
(really a single database that is bigger than 2GB and that I've split
into two files). A given piece of analysis in Excel requires records
from both files (however I split the database). In each iteration
(over 200,000 loops) I open a connection to the first database and
create a recordset and, thereafter, open a connection to the second
database and create a new recordset. The two recordsets are then
placed in a contiguous range on an Excel sheet and I proceed with the
analysis from there ...

I've truncated my code below.

Please help

Loane

Dim rsData As ADODB.Recordset
Dim strConnect As String
Dim strSQL As String

For i = 1 to NoRecords

For j = 1 to 2

If j = 1 Then
strConnect = "...\DataBase1.mdb;"
strSQL = "SELECT Field1 FROM Table1 WHERE Index = i"
ElseIf j = 2 Then
strConnect = "...\DataBase2.mdb;"
strSQL = "SELECT Field1 FROM Table2 WHERE Index = i"
End If

rsData.Open strSQL, strConnect, adOpenForwardOnly, adLockReadOnly,
adCmdText

Cells(i, 1).CopyFromRecordset rsData

rsData.Close

Next j

[Some code representing Excel-based analysis goes in here. The
analysis currently requires that the data in the two recordsets (i.e.
drawn from Table1 and Table 2, respectively) be deposited in a
contiguous range in a worksheet.]

Next i

Set rsData = Nothing
 
J

Jamie Collins

Jake Marx said:
I'm sure there is a more efficient way of doing this, but not knowing the
data, it's hard to give specifics.

Another option is to use a UNION query to create a single recordset
and do the sort within the query (must use the column's ordinal
position e.g. ORDER BY 1). I haven't tested but I'm pretty sure it
will be more efficient to get the provider to do the sorting. For
example (note only the path of the non-connected database is required;
both are shown for clarity):

SELECT
Field1
FROM
[Database=C:\DataBase1.mdb;].Table1
WHERE
Index = 1
UNION ALL
SELECT
Field1
FROM
[Database=C:\DataBase2.mdb;].Table1
WHERE
Index = 1
ORDER BY
1
;

Jamie.

--
 
L

Loane Sharp

Hi Jamie

Thanks for your suggestions.

I pulled the data into an Excel workbook this weekend ... As a reminder, I'm
working with a Human Resources database comprising 3 Access database files
each measuring about 1GB, with 23 tables in total, about 250 fields and
107288 records (each record relates to an employee, either work history,
academic history, on-the-job performance, static information, etc. spread
across the 3 database files).

The Excel file (incorporating bits and pieces extracted and summarized from
the 3 Access database files) measures about 110MB, so there's a fair amount
of data summarizing going on.

The real pain is that it took 38 hours to pull the data into Excel
record-by-record.

I broke down my quite lengthy code into little pieces, to see where the
inefficiency is ... At first it seemed that I'd found the problem: turns out
I was instantiating a new Connection and Recordset object each time in the
For (i = 1 to 107288) ... Next loop. However, when I instantiate the
connection right at the beginning of the procedure (i.e. so that connections
to each of the 3 databases are established once and thereafter maintained
throughout the procedure), and when I instantiate the recordset object
outside the loop, the procedure doesn't run noticeably faster. It seems that
the "hog" is the [Recordset].Open method, which must be given for each of
the 107288 records.

Do you think the following will be a fruitful line of further enquiry? ...
I'm going to move my procedure to another (empty) Excel workbook, and this
workbook will be the only instance running. I.e. I'm going to SELECT from
the Access databases (Access not running) and INSERT INTO an Excel workbook
(workbook not open).

Best regards
Loane

Jamie Collins said:
Jake Marx said:
I'm sure there is a more efficient way of doing this, but not knowing the
data, it's hard to give specifics.

Another option is to use a UNION query to create a single recordset
and do the sort within the query (must use the column's ordinal
position e.g. ORDER BY 1). I haven't tested but I'm pretty sure it
will be more efficient to get the provider to do the sorting. For
example (note only the path of the non-connected database is required;
both are shown for clarity):

SELECT
Field1
FROM
[Database=C:\DataBase1.mdb;].Table1
WHERE
Index = 1
UNION ALL
SELECT
Field1
FROM
[Database=C:\DataBase2.mdb;].Table1
WHERE
Index = 1
ORDER BY
1
;

Jamie.

--
 
J

Jamie Collins

Loane Sharp said:
The real pain is that it took 38 hours to pull the data into Excel
record-by-record

I don't understand why you are reading the data to Excel row by row.
It seems that
the "hog" is the [Recordset].Open method, which must be given
for each of the 107288 records.

Opening a recordset for each of your 100K rows definitely sounds
wrong.

I think you should aim to use sql to get one recordset of all the rows
you need in Excel. What are you doing in your For (i = 1 to
107288)...Next loop that you can't do in sql?

Jamie.

--
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top