programming with Access 2007?


jasonshohet

I know ASP.NET fairly well (getting up to speed with 2.0).

Here's my problem: Sometimes someone comes up to me with a large flat
file from a mainframe (and it ain't XML) and says "loop through all the
rows, make all X's into Y's (or whatever) and do some other reporting
on this file".

oooookkkkkk.... Problem is, I'm on an IBM laptop at a client, without
access to Oracle or SQL Server. How do I work with a LARGE file with
100,000+ records? I can get the file on my laptop, but then what?

1. Let Excel 2007 beta look at it - oops - too many ROWS! NO GOOD.
2. Import it into Access - but I don't know VB/VBA to interrogate
each row (looping through the file). Trying to import it into an Access
table chokes Access and I have to shut down.
3. Import into MySQL - the import works much better here - but it's
still time consuming and I don't have an easy way of interrogating the
data in a procedural (non-SQL) format. I wish there was a .NET way to
work with MySQL without creating a whole .NET application / website.
4. Create an ASP.NET application and use C# in a code-behind to read
the file line by line and work with the data - time consuming because
I have to create and compile this whole big .NET application.

I guess what I'm looking for is SAS, or ACL, for .NET. I've seen
people do some cool things with Access - using Access to read an
external file (not pulling it into the Access db, which would choke) and
using VBA to do stuff on each row. But learning VBA with Access
seems like going way backwards for me, and I don't even know whether it's
going away with Access 2007. Is there another tool / beta of a tool I should
look at instead of Access to do this kind of stuff? It has to be great
with all kinds of data extracts, and I need to have it work on my
laptop.

Thanks for any advice
Jason Shohet
 

Douglas J. Steele

If you're linking to an external data source, you don't have to use VBA to
manipulate it: you can use a query in many cases.

Also, have you tried the 100,000+ row table in Excel 2007? The limits have
increased significantly...
 

Albert D. Kallal

oooookkkkkk.... Problem is, I'm on an IBM laptop at a client, without
access to Oracle or SQL Server. How do I work with a LARGE file with
100,000+ records? I can get the file on my laptop, but then what?

First of all, 100,000 records is nothing for ms-access. You do realize that
ms-access with a Jet-based file is going to be 50%-200% FASTER than Oracle
or SQL Server... right?

Here is an example:

Let's assume a typical products database and we want to keep
inventory.

Let's assume 500 products in the product table = 500 records.

Let's assume that we had 5 stock-taking days where we added stock to EACH
product 5 TIMES this year.
(So, each stock item was re-stocked 5 times during the year. Note that the
given sample design allows for price changes as new stock arrives.)

That now means that our Inventory table has 2500 records.

Let's also assume that each Inventory item has 50 orders in the invoices
(order details) table on average.

That now means our Order Details table has 50 * 2500 = 125,000 records.

So, what we want to do is calculate quantity on hand.

So, we've got 125,000 detail records and 2500 inventory items (and 500
products).


Remember, in a modern system, we do NOT store the quantity on hand, but must
calculate it on the fly.

The beauty of this approach is that I can then simply delete, add, or
modify records, and the totals for inventory are always 100% correct.


Further, I can just paint away with the
forms designer and build subforms etc. for invoice details where users
can enter the quantity ordered. Again, no special code is needed to
update the inventory stocks, since I *CALCULATE* it on the fly
when needed.

That means the SQL has to join up all records that belong to each
product and sum them, and then subtract the quantities in the
invoice details.


Time to total up all of the in-stock records (that is, running SQL
statements to total all Inventory additions less all those 125,000 order
details to come up with a total for EACH product)???

On an average PC today, ms-access will total up and generate the quantity on
hand from those 125,000 detail records in LESS than 1 second. (This time
includes the matching, adding, and subtracting of all orders in the system.)

Again: LESS THAN ONE second on a bottom-end PC today. So, we are talking
about ms-access in the above.
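
To give a rough idea of what that kind of totalling looks like from code, here is a
minimal sketch in C# against a Jet .mdb. The file path, table names and column names
below are placeholders made up for illustration - they are NOT the names used in my
actual sample file (download link below):

    using System;
    using System.Collections.Generic;
    using System.Data.OleDb;

    class QtyOnHand
    {
        // Run a "key, total" aggregate query and load it into a dictionary.
        static Dictionary<int, double> Totals(OleDbConnection cn, string sql)
        {
            Dictionary<int, double> result = new Dictionary<int, double>();
            using (OleDbCommand cmd = new OleDbCommand(sql, cn))
            using (OleDbDataReader rdr = cmd.ExecuteReader())
                while (rdr.Read())
                    result[Convert.ToInt32(rdr[0])] = Convert.ToDouble(rdr[1]);
            return result;
        }

        static void Main()
        {
            // Placeholder path and names - adjust to your own mdb.
            string connStr = @"Provider=Microsoft.Jet.OLEDB.4.0;" +
                             @"Data Source=C:\data\inventory.mdb";
            using (OleDbConnection cn = new OleDbConnection(connStr))
            {
                cn.Open();

                // Total stock received per product.
                Dictionary<int, double> stockIn = Totals(cn,
                    "SELECT ProductID, SUM(QtyIn) AS T " +
                    "FROM Inventory GROUP BY ProductID");

                // Total quantity ordered per product across the 125,000 detail rows.
                Dictionary<int, double> ordered = Totals(cn,
                    "SELECT i.ProductID, SUM(d.QtyOrdered) AS T " +
                    "FROM Inventory AS i INNER JOIN OrderDetails AS d " +
                    "ON i.InventoryID = d.InventoryID " +
                    "GROUP BY i.ProductID");

                // Quantity on hand = stock received - quantity ordered.
                foreach (KeyValuePair<int, double> item in stockIn)
                {
                    double sold;
                    ordered.TryGetValue(item.Key, out sold);
                    Console.WriteLine("Product {0}: {1} on hand",
                                      item.Key, item.Value - sold);
                }
            }
        }
    }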

You can download the sample file that processes the 125,000 records here:
http://www.members.shaw.ca/AlbertKallal/msaccess/msaccess.html

So, if you want to use the command prompt, or the query builder in ms-access, to
massage and mangle the data into some format for use, I don't think
ms-access can be beat for this task. Especially if your files are going to
be smallish - 100,000 or 150,000 records in size. For the most part, these types
of files can be processed in seconds, and they are a sweet spot in terms
of size for ms-access.
 

David F Cox

What is the format of this file that chokes Access 2007? How big is it in
megabytes?

The app I deal with most has 680,000 records in its main table, and it is
handled with ease.
 

jasonshohet

Hi guys, I'll take a look at the file this week and let you know the
exact size. I think the largest one was 20 million records (a bit more
than 100k, I know).
Access 2007 definitely choked: after all that work mapping the fields
from the extraction to what they would be in an Access table, I told
Access to start importing and it just froze and died. Even keeping them
as linked files was a problem, but I can't recall why - it didn't
work. Access 2007 choked several times; it was embarrassing. Not to
hype MySQL, but MySQL was able to handle it no problem and begin importing
the data.

The reason I can't continue down the MySQL approach is that #1, the import
process is slow. And #2, the client has a requirement that I cannot
leave with the data on my laptop. So after I had 20,000 records
imported, I stopped - just to see if I could easily get the data off
the laptop. I tried the MySQL backup utility; it took like 10 minutes
for that measly 20k records. Now for 20 million, it's gonna take a
large chunk of time. I tried simply copying the MySQL database over to an
external drive at the client (which I could then use when I go back),
but that caused corruption of my MySQL database. I don't have this
worry with Access - since Access is integrated as a single file. I looked
into setting MySQL up on a USB 2.0 external hard drive (instead of my
local HD) so I could leave that drive at the client, but I'm no expert
with MySQL and wasn't able to get it set up on the drive last time.

So anyhow - a guy who knows VBA with Access was able to write a little
script and read in the file as plain text. Not relying on Access 2007's
apparently handicapped database-import functionality, we were able to
at least interrogate the data from the extraction, line by line,
without reading it into a SQL table. (It had me thinking: I wonder why
VBA is used in Access and not .NET - I would be more comfortable if I
could somehow write C# in Access to do that.) Of course with this
method, I can't do fancy data analysis because there's no table to run
any SQL on... but at least it was something.

So the above outlines the frustration I've had with Access - as a
database - for reading large data extractions. MySQL is great - but then you
lose the ability to just copy a single file somewhere, like you can with an
Access file. You have to do a backup, it seems, which takes forever etc...
 

Douglas J. Steele

What are you doing using Access 2007 with clients? It's only in beta: why
on earth would you use beta software with paying customers?
 

Albert D. Kallal

Hi guys, I'll take a look at the file this week and let you know the
exact size. I think the largest one was 20 million records (a bit more
than 100k, I know).

Yes, that difference is a lot. As I said, a file with 150,000 records
is NOTHING.

Access 2007 definitely choked after all that work mapping the fields
from the extraction to what they would be in an Access table

I don't think I would use a linked table. I would import the data into a
table for processing. The text file might have extra characters, or
misaligned data. Worse, why are you using a beta? You should not even
install that on a production machine, let alone use it for any serious data
processing.
Access to start importing - it just froze and died.

I'm not sure exactly how you did the import. I just did an import of a
comma-delimited file into ms-access. It was a small file, 155,000 records.

The import took 6 seconds flat. The resulting file was 40 megs in size -- 6
seconds!!!

As I said, files in the 150,000-record range are a sweet spot. 20 million is
definitely not.
Even keeping them
as linked files was a problem but I can't recall why - but it didn't
work.
Access 2007 choked several times, it was embarrassing.


I think perhaps you are very new to computers, or perhaps misinformed. I
would NEVER install a beta onto ANY machine that is going to be used for
production work. You do realize that Virtual PC from Microsoft is free? So,
if you want to test some compatibility issues with old versions of ms-access,
or play with some of the new features in 2007, then Virtual PC is a great
use for this. Remember, I'm not even talking about using a2007 for production
data, I am talking about INSTALLING it on a production machine -- never ever
do that. Do not install beta programs on production machines...shame
shame...
The reason I can't continue down the mySQL approach, is that the import
process is slow #1.

How is the data being imported? I was able to import 155,000 records in 6
seconds (and that was on my notebook, which has a considerably slower hard
disk than a desktop machine).

If you have a comma-delimited or fixed-width text file that has just been
copied to your local PC, then you should be able to import 100,000 records
in about 5 seconds. I am using a2003 with the service packs installed.

I would suggest that you import into a clean, blank mdb file for starters.
Then, if you need some field re-mapping etc., use an append query into an
existing table (something along the lines of the sketch below).
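
Just to illustrate the idea (the folder, file, table and column names here are
made-up placeholders, not anything from your extract): a single Jet append query
can pull a delimited text file straight into an existing table and re-map the
columns as it goes, and the same statement runs from the Access query window or
from a few lines of C# via the Jet OLE DB provider:

    using System.Data.OleDb;

    class AppendImport
    {
        static void Main()
        {
            // Placeholder mdb path.
            string connStr = @"Provider=Microsoft.Jet.OLEDB.4.0;" +
                             @"Data Source=C:\data\work.mdb";

            // One set-based append: read the delimited file through the Jet
            // text ISAM (HDR=No gives columns F1, F2, ...) and map the ones
            // you want onto the existing table's fields.
            string sql =
                "INSERT INTO Extract (AcctNo, Amount) " +
                "SELECT F1, F3 " +
                "FROM [Text;DATABASE=C:\\data;HDR=No].[extract.csv]";

            using (OleDbConnection cn = new OleDbConnection(connStr))
            using (OleDbCommand cmd = new OleDbCommand(sql, cn))
            {
                cn.Open();
                cmd.ExecuteNonQuery();   // the whole import runs as one statement
            }
        }
    }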

I tried the MySQL backup utility, took like 10 minutes
for that measly 20k records.

Well, the 155,000 records I just imported can be copied in about 2
seconds.
Further, even a text export of those 155,000 records to a text file
only took 20 seconds (again, on my notebook).
So anyhow - a guy who knows vba w/ Access was able to write a little
script and read in the file as plain text. Not relying on Access 2007's
apparently handicapped database-import functionality, we were able to
at least interrogate the data from the extraction, line by line,
without reading it into a SQL table.

Well, I'm still wondering why that Access person did not use a production
version of ms-access, and one with all of the bug fixes (service packs)
installed?
(It had me thinking, I wonder why
vba is used in access and not .NET

Well, the above shows your misunderstanding of how ms-access works. Ms-access
is a development tool. It is like VB or any other tool. Ms-access IS NOT a
database. You can use ms-access as a front end to MySQL, Oracle, or even
the JET database engine that ships with Windows. So, you don't need to, nor
have to, use ms-access to hold this data you are working with. The *default*
data engine used with ms-access is JET (at least this is the case for
current production versions of ms-access - a2007 changes this somewhat).
What this means is that any Windows box can read and write the "mdb"
file format. You DO NOT have to install ms-access on a Windows XP box to
read mdb files. So, if you realized this, then you would simply write your
code in .NET, and use the mdb file format that ms-access also uses. That
"JET" database can even be driven from a Windows batch file (script) if
you are really poor. (That is right, you can write a Windows batch file with
SQL to operate on that mdb file.)

So, nothing is stopping you from using .NET here (nothing but a lack of knowledge).
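
For example, here's a rough sketch (the file, table and column names are made-up
placeholders) of the original "make all X's into Y's" job done from plain C# against
an mdb file, with no Access installed at all - just the Jet OLE DB provider that
comes with Windows:

    using System;
    using System.Data.OleDb;

    class JetUpdate
    {
        static void Main()
        {
            // Placeholder path, table and column names.
            string connStr = @"Provider=Microsoft.Jet.OLEDB.4.0;" +
                             @"Data Source=C:\data\work.mdb";

            using (OleDbConnection cn = new OleDbConnection(connStr))
            using (OleDbCommand cmd = new OleDbCommand(
                "UPDATE Extract SET StatusCode = 'Y' WHERE StatusCode = 'X'", cn))
            {
                cn.Open();
                int changed = cmd.ExecuteNonQuery();  // whole fix-up runs as one statement
                Console.WriteLine("{0} rows updated", changed);
            }
        }
    }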

Further, you can use ms-access as a COM object in .NET anyway, *if* for some
reason you need ms-access features for working with the data.
- I would be more comfortable if I
could somehow write c#.net in Access to do that.) Of course with this
method, I can't do fancy data analysis because there's no table to do
any sql on... but at least it was something.

Why not? The JET database engine is shipped with every copy of Windows. As I
said, you can write a Windows batch file to read and write these JET
(ms-access format) files on a computer. You DO NOT need to install ms-access
on a computer to use, read, and write an mdb file.

Anyway, I've given you some numbers, and 6 seconds to import 155,000 records is
not a lot of time.
 

jasonshohet

Sorry, it wasn't 100k - it was, I recall, about 20 million records, and
Access 2007 definitely choked where MySQL did not. I would rather Access
had been able to handle it, though.
 

Granny Spitz via AccessMonster.com

Access 2007 definitely choked after all that work mapping the fields
from the extraction to what they would be in an Access table - and told
Access to start importing - it just froze and died.

Are you trying to stuff 10 pounds of junk into a 5 pound bag? An Access 2007
database holds just under 2 GB of data. If your records are a uniform size,
it will hold 20,000,000 records of about 90-95 bytes each. Your remark about
*all that work mapping the fields* sounds like you have plenty of columns and
are exceeding what the table can hold.
Even keeping them
as linked files was a problem but I can't recall why - but it didn't
work.

The ISAM driver won't let you edit linked text files. But Access 2007 is
still in beta, so there might still be bugs in just reading the text files in
a linked table. Did you try it with a stable version of Access?
Not to
hype mySQL, but mySQL was able to do it no problem and begin importing
the data.

MySQL tables are about 4 GB by default. (Use SHOW TABLE STATUS in MySQL or
myisamchk -dv TableName from the command line to find out how big yours is.)
This can be enlarged to whatever size the disk space will allow and the
operating system can handle. ACE/Jet is a desktop database that can hold
almost 2 GB of data. MySQL isn't a desktop database. You're comparing an
orange with a watermelon and saying, "the watermelon holds more juice." That
may well be, but most of us prefer a glass of orange juice in the morning,
not a glass of watermelon juice, even if there's a lot more of it. Access and
MySQL don't have the same capabilities, which makes one better suited
than the other in certain circumstances.
The reason I can't continue down the mySQL approach, is that the import
process is slow #1.

You haven't optimized your laptop for top performance as a database server.
And #2 - the client has a requirement that I could
not leave with the data on my laptop.

Let me guess. They watch you delete the file from your laptop and the
recycle bin and happily let you leave without wiping your laptop's hard drive.
In which case, you can go back to your office and use an undelete utility and
continue to experiment with how you can transform the data from the mainframe.

Tried simply copying mySQL database over to an
external drive at the client (which i could then use when I go back)
but that caused corruption of my mySQL database. I don't have this
worry with Access - since Access is integrated as a file.

That's ironic. People complain they can't use Access because it corrupts
their data.

The reason you can just copy an MDB file from one drive to another and have
it work is that it's a desktop database (one of whose features is
portability), not a client/server database like MySQL. I haven't had an
opportunity to migrate any MySQL databases, but I've migrated SQL Server
databases. You have to detach the data file from the server and attach it
again at the destination, or else use backup/restore. MySQL has similar
functions.
(It had me thinking, I wonder why
vba is used in access and not .NET - I would be more comfortable if I
could somehow write c#.net in Access to do that.)

VBA is the proprietary language of all Office applications. .Net was
invented more than a decade after Office was. Some Office applications have
been retrofitted to use the .Net framework, but Access is a lot more complex
than all the other applications. It will be at least two versions behind
Excel and Word when it's integrated into the .Net framework and you'll be
able to write C# in Access's code window. *If* that ever happens.
So the above outlines the frustration I've had with Access - as a
database - for reading large data extractions. mySQL is great - but it
loses the ability to just copy an Access file somewhere. You have to
do a backup it seems, which takes forever etc...

Hon, I don't mean to be rude, but you don't take a knife to a gunfight. Your
client needs a database specialist and you're a .Net programmer. You're
trying to wing it because you have some experience working with databases.
You're using the wrong tools, the wrong software and the wrong hardware
because you don't know their capabilities and limitations or what you
*should* be using. You embarrassed yourself by picking the wrong tools and
by using adequate tools in the wrong way, and then you blamed your failures
on Access 2007 beta. Don't let this happen again. People die of
embarrassment. Look what happened to Elvis.

What you need to do is take an experienced database specialist with you to
the client next time. That, or get the training you need to become a
database specialist. You're thinking like a programmer, not a database guy.

Since the data files were large and SQL Server and Oracle weren't available
at the client's office, a database guy would have imported the data into one
of the free client/server databases, SQL Express, Oracle Express, or DB2
Express-C because if your laptop is capable of running Access 2007 beta, it
can run any of these. SQL Express and Oracle Express hold up to 4 GB and DB2
Express-C is only limited by your available disk space. I don't know about
DB2 Express-C, but SQL Express and Oracle Express have separate managers you
can download that will make importing/exporting and backup/restore faster
than in MySQL. And with SQL Express, you could have used your .Net skills to
do your ETL task if you aren't familiar with T-SQL. None of that *backwards*
VBA (as you call it) necessary.

Download one of these express versions and the manager application. Get some
training. Start seeing how the other half lives. You'll be thinking like a
database guy in no time.
 
J

jasonshohet

I read your email, and I know more than you think. I know the history
of VBA, and I've been using .NET since it was a beta years ago. I'm
familiar with RDBMSs from Oracle to DB2 to Sybase. Nevertheless, for
data analysis on someone else's mainframe extracts, Access 2007 is
still best. The fully integrated single file that can query external
data (links etc.) and run VBA, all in one 'file' - very, very powerful.
I was using Access 2007 this past week (with those large files, reading
them as text), and VBA is just archaic compared to C#. I hate it.
But why would I use SQL Express to link to someone else's data? Then
when I have to leave, I need to remove it from my laptop and back it
up somewhere. Pain in the *SS. I just will not go through that
process. With Access, I just copy the file. So Access is MUCH more
portable and useful for quick analysis of external data. It's a single
file that - much like the Swiss Army knife - can do almost anything.
I wish I could write C# in Access modules... such a shame.
 

David F Cox

Unfortunately "choked" is only a tad more informative than "It don't work".

with 20 million records does "choked" mean hit file size limits?

If you are letting Access parse the data it is probably going to pick
maximum sizes and precisions for all fields. Have you tried setting up the
table structure to match the data and importing the data into that?
 

Granny Spitz via AccessMonster.com

Nevertheless - For
data analysis on someone else's mainframe extracts, Access 2007 is
still best.

Access 2007 is *still* in beta for crying out loud! It was originally
scheduled for release two years ago but it's still too buggy to be released
to retail customers. Can't you use a stable version?
I was using Access 2007 this past week (with those large files, reading
them as text)

Line by line instead of an import? That's sooo slow for 20 million records
when you're examining and massaging the values in each column.
But why would I use SQL Express to link to someone else's data?

A programmer might try to link to the data with SQL Express, but a database
guy would do a bulk insert with logging off and use T-SQL (and maybe C#) to
massage the data. Much quicker than using Access and way quicker than
reading/massaging records in large files line by line with VBA.
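
A bare-bones sketch of that approach in C# (the file layout, offsets, column names,
table name and server instance below are assumptions for illustration): parse the
extract and push it into SQL Express through the bulk-copy API.

    using System;
    using System.Data;
    using System.Data.SqlClient;
    using System.IO;

    class BulkLoad
    {
        static void Main()
        {
            // Staging table shape - assumed to match a table created beforehand.
            DataTable table = new DataTable();
            table.Columns.Add("AcctNo", typeof(string));
            table.Columns.Add("Amount", typeof(decimal));

            // Parse the fixed-width extract (made-up path and offsets).
            using (StreamReader reader = new StreamReader(@"C:\data\extract.txt"))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    if (line.Length < 42)
                        continue;                            // skip short/ragged lines
                    table.Rows.Add(line.Substring(0, 10).Trim(),
                                   decimal.Parse(line.Substring(32, 10)));
                }
            }

            // Bulk-load into the local SQL Express instance.
            string connStr = @"Data Source=.\SQLEXPRESS;Initial Catalog=Work;" +
                             "Integrated Security=True";
            using (SqlBulkCopy bulk = new SqlBulkCopy(connStr))
            {
                bulk.DestinationTableName = "Extract";
                bulk.BatchSize = 10000;
                bulk.WriteToServer(table);
            }
            Console.WriteLine("Loaded {0} rows", table.Rows.Count);
        }
    }

For a 20-million-row file you wouldn't hold the whole thing in one DataTable; you'd
feed WriteToServer in batches instead, but the shape of the job is the same.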
Then
when I have to leave - I need to remove it from my laptop and back it
up somewhere.

The amount of time you save by using a client/server database engine
correctly, and a programming language you are very familiar with,
outweighs using Access and VBA, which you aren't familiar with, even if you
do have to move the data files off your laptop before you leave.
I just will not go through that
process.

You get paid by the hour, don't you? That would explain why you tried Excel
first and want to use VBA for ETL tasks, not a database engine.
I wish I could write C# into Access modules.... such a shame.

You'll have to wait for the next version of Access after 2007 ... or the one
after that ... or longer. But there's nothing stopping you from writing a C#
application to do the ETL work since that's your preference.
 

jasonshohet

Granny, you're pretty funny - seriously!
Access 2007 is *still* in beta for crying out loud! It was originally
scheduled for release two years ago but it's still too buggy to be released
to retail customers. Can't you use a stable version?
Nah.


Line by line instead of an import? That's sooo slow for 20 million records
when you're examining and massaging the values in each column.

Actually the VBA was somewhat fast, about 20-30 minutes - and that's
interrogating each fixed-width line looking for something, say, 32
characters over and 10 characters long, etc.
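
(For what it's worth, that same kind of scan is only a few lines in C# too - this is
just a sketch with a made-up file name, offsets and test, not the actual extract
layout:)

    using System;
    using System.IO;

    class FixedWidthScan
    {
        static void Main()
        {
            int hits = 0;
            using (StreamReader reader = new StreamReader(@"C:\data\extract.txt"))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    if (line.Length < 42)
                        continue;                              // skip short/ragged lines
                    string field = line.Substring(32, 10);     // 32 characters over, 10 long
                    if (field.TrimEnd() == "X")                // whatever test is needed
                        hits++;
                }
            }
            Console.WriteLine("Matching rows: {0}", hits);
        }
    }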

A programmer might try to link to the data with SQL Express, but a database
guy would do a bulk insert with logging off and use T-SQL (and maybe C#) to
massage the data. Much quicker than using Access and way quicker than
reading/massaging records in large files line by line with VBA.

I am an RDBMS guy - but not for this - because of the limitation that I
can't host the data on my laptop. It has to stay on the USB 2.0
external hard drive. I wonder if SQL Express can create a database on
a removable USB 2.0 drive and hook back up to it on the fly the next
time I'm back at the client. See, this is where Access is great!
Linked tables, VBA... the whole thing is just so portable and easy to
do.

The amount of time you save with using a client/server database engine
correctly and with a programming language you are very familiar with
outweighs using Access and VBA, which you aren't familiar with, even if you
do have to move the data files from your laptop before you leave.

The problem is, I cannot figure out how to move the data files from
the laptop to the USB 2.0 HD easily - the export takes way too long.
One database table is - if I recall - 20 million records (that's the
one I went through with the VBA). Others are 1-2 million. But the
export just takes too long. I'd be sitting there for over an hour just
exporting things. And then when I come back - what - I have to bring
it back again?? Once again, the linked tables (for the smaller ones)
and the VBA (for the extra-large text file) just seem easier. If .NET
had a way to do linked tables, I'd GLADLY do it in .NET. I think
that's the crux here. Reading text with VBA - yeah, I can do that with
C# no big deal. It's those linked tables in Access which save so
much time - I don't need to import things into SQL Server or whatever.


You get paid by the hour, don't you? That would explain why you tried Excel
first and want to use VBA for ETL tasks, not a database engine.

Hehe - funny - I liked this one. I guess because I had the linked
tables, I stayed with VBA since I was already in Access for the linked
tables. Kind of like the Swiss Army knife - why do things with
multiple tools when you have one that does it all, even if it's not as
efficient on the ETL / text reading.


You'll have to wait for the next version of Access after 2007 ... or the one
after that ... or longer. But there's nothing stopping you from writing a C#
application to do the ETL work since that's your preference.

I don't want to be tied down by a regular RDBMS when a single Access
file does both the ETL (not as nicely, of course) and the linked tables.
It's the Swiss Army knife that I like, even though individually a
particular 'knife' isn't as sharp as C# :) I guess that's an
analogy of sorts. When dealing with dozens of mainframe extracts -
fixed-length text/data files - I'm looking for a simple way to read files as
text or use the linked-tables approach, and to avoid moving data onto my
laptop and taking it off again, etc. Too many moving parts, too many things
going on that could cause a mistake... Granny, I WANT to believe .NET &
SQL Express would be better for this, but it's not convincing yet,
unless someone tells me there's an easy way to do the linked tables.

Regards
 

Granny Spitz via AccessMonster.com


It's ok to experiment with your own data but when it's a paying customer,
you're inviting disaster if you use beta software. If you keep this up, some
day you're going to hear the song "Granny said there'll be days like this ...
there'll be days like this Granny said."
Actually the vba was somewhat fast, about 20-30 minutes - and thats
interrogating each fixed width line looking for something say - 32
characters over, 10 characters long etc.

With a client/server database and a 20 million record text file on a 100 Mb
Ethernet network you could expect 2-10 minutes for an ETL task, depending on
the hardware, the size of the file and how many computer operations it takes
to transform each record into the right format. While VBA is going to
execute much slower than say a DTS package, I think your bottleneck is going
to be file I/O to that USB external drive.
I wonder if SQL Express can create a database on
a removeable usb 2.0 drive, and hook back up to it on the fly the next
time I'm back at the client.

If your OS can find the data file, SQL Express can too. It's not quite *on
the fly* though. To do it correctly you'd have to run an SP to detach the
data file before you disconnect the external drive, then run another SP to
reattach the data file when you reconnect to the external drive. Takes a few
seconds but you can set it up as an automated task in case you don't remember
to do it manually.
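
Something along these lines, say, where the instance name, database name and file
paths are all assumptions: detach before you unplug the drive, attach again once
it's back.

    using System.Data.SqlClient;

    class AttachDetach
    {
        static void Exec(string sql)
        {
            using (SqlConnection cn = new SqlConnection(
                @"Data Source=.\SQLEXPRESS;Initial Catalog=master;Integrated Security=True"))
            using (SqlCommand cmd = new SqlCommand(sql, cn))
            {
                cn.Open();
                cmd.ExecuteNonQuery();
            }
        }

        static void Main(string[] args)
        {
            if (args.Length > 0 && args[0] == "detach")
            {
                // Run before unplugging the USB drive (no open connections allowed).
                Exec("EXEC sp_detach_db 'ClientWork'");
            }
            else
            {
                // Run after plugging the drive back in.
                Exec(@"CREATE DATABASE ClientWork " +
                     @"ON (FILENAME = 'E:\ClientWork.mdf'), " +
                     @"(FILENAME = 'E:\ClientWork_log.ldf') " +
                     @"FOR ATTACH");
            }
        }
    }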
If .NET
had a way to do linked tables - I'd GLADLY do it in .NET. I think
thats the crux here. Reading text as vba - yea I can do that with
c#.net no big deal. Its those linked tables in Access which saves so
much time - I don't need to import things in to SQL Server or whatever.

I'll tell you a little secret. Microsoft is offering free online
mini-courses for administering and programming SQL Server 2005 (one course
includes .Net). SQL Express is a scaled-down version of SQL Server 2005, so
much of what's learned in these courses is going to apply to SQL Express and
its free tools. The free price has been extended and now expires at the end of
November. (That's for sign-up, not to finish the 6-hour courses. You have
90 days to finish each one.) If you take the time to study for and pass
these courses, not only will you have new skills to add to your resume,
you'll also figure out how to get the job done without Access and VBA.
https://www.microsoftelearning.com/sqlserver2005/

Good luck to you, hon.
 

Tony Toews

Granny Spitz via AccessMonster.com said:
Access 2007 is *still* in beta for crying out loud! It was originally
scheduled for release two years ago

Access (version after 2003) was scheduled for release two years ago?
News to me.

Tony
--
Tony Toews, Microsoft Access MVP
Please respond only in the newsgroups so that others can
read the entire thread of messages.
Microsoft Access Links, Hints, Tips & Accounting Systems at
http://www.granite.ab.ca/accsmstr.htm
 

Granny Spitz via AccessMonster.com

Tony said:
Access (version after 2003) was scheduled for release two years ago?
News to me.

Microsoft *always* forgets to send you those important memos, don't they hon?
<g>

Office 12, which Access 2007 is a part of, was originally planned to *only*
run on Longhorn to help boost the sagging sales of Office. Office 2003
wasn't the commercial success it was expected to be. Previous versions were
often deemed good enough and customers had few compelling reasons to upgrade.
The Longhorn client OS was originally slated to ship in 2004 and Office 12 in
2005, in keeping with the 2-3 year planned release schedule between versions
to lure Software Assurance customers. When Longhorn's release was delayed to
2005, then 2006 and later (remember WinFS? Avalon? Indigo? The delay from
Windows XP SP2?), Office 12 had to be redesigned to run on Windows XP SP2 too.


Here's some news written by Paul Thurrott (a respected authority on Windows)
from back in 2004:
http://www.winsupersite.com/showcase/longhorn_preview_2004.asp. And more
news from the middle of 2004 when Microsoft made public announcements of its
plans, like Office 12 would run on Windows XP *and* Windows 2000 (which we
now know it doesn't) and eventually Longhorn when it shipped:
http://www.crn.com/sections/breakingnews/dailyarchives.jhtml?articleId=22104567
 
