Edit/Update results in badly fragmented file

E

esteiner

am scrubbing values of a field in an Access 2003 table and updating the
results in the same table.

The routine runs OK but the size of the database goes to 600+ MB. I must run
the compact and repair database operation each time I run the routine. The
amount of data to process will continue to increase, so I would like to make
this routine as efficient as possible. I currently process about 200K
records, but expect this to grow to 1 million+

I assume that this is caused by the many read/write cycles to the hard disk.
Is there a way to operate on the table/recordset in RAM before writing to
disk? Any help/suggestions to make this more efficient would be appreciated .
Below is the code that runs the routine:

Private Sub cmdSplitMPN_Click()
On Error GoTo E_Handle
Dim rs As Recordset ' Recordset object
Dim strTableName As String ' Name of table where export records originate
Dim strSQL As String ' SQL String
Dim intRecordNum As Long ' Record Number being dealt with
Dim strMPN As String
Dim I As Integer
Dim strPrefix As String
Dim strBasenum As String
Dim blHasNum As Boolean
Set db = DBEngine(0)(0)

' Create the table name from where the export records will orginate
strTableName = "pull_inv_BRE"
strSQL = "SELECT ManfPartNum, CorePartNum FROM " & strTableName & ""
Set rs = db.OpenRecordset(strSQL)
With rs
While Not (.BOF Or .EOF)
strMPN = !ManfPartNum
For I = 1 To Len(strMPN)
blHasNum = False
If IsNumeric(Mid(strMPN, I, 1)) Then
'number found, exit
blHasNum = True
Exit For
End If
Next I

If (blHasNum = True) Then
strBasenum = Right(strMPN, Len(strMPN) - I + 1)
.Edit
!CorePartNum = strBasenum
.Update
End If
.MoveNext
Wend
..Close
End With

sExit:
On Error Resume Next
rs.Close
Set rs = Nothing
Set db = Nothing
Reset
Exit Sub



E_Handle:
MsgBox Err.Description, vbOKOnly + vbCritical, "Error: " & Err.Number
Resume sExit

End Sub

Thank you for your help!!!

-Charlie
 
B

Brendan Reynolds

It looks as though it might be possible to do it with an update query
instead of looping through a recordset. Do I understand the goal correctly -
if the field ManfPartNum contains any digit (0 - 9) then the field
CorePartNum should contain the part of ManfPartNum starting at and including
the first digit? In other words, if ManfPartNum contains the value "ABC123"
then CorePartNum should be assigned the value "123"?
 
E

esteiner

Yes, Brendan. You understand the goal perfectly. Please offer suggestions
on how I can accomplish this with an update query or any other method. As I
mentioned before, the file becomes very badly fragmented using the current
method.

Brendan said:
It looks as though it might be possible to do it with an update query
instead of looping through a recordset. Do I understand the goal correctly -
if the field ManfPartNum contains any digit (0 - 9) then the field
CorePartNum should contain the part of ManfPartNum starting at and including
the first digit? In other words, if ManfPartNum contains the value "ABC123"
then CorePartNum should be assigned the value "123"?
am scrubbing values of a field in an Access 2003 table and updating the
results in the same table.
[quoted text clipped - 71 lines]
 
B

Brendan Reynolds

I think Klatuu has beaten me to it! :) His suggestion looks very similar to
what I had in mind.

--
Brendan Reynolds
Access MVP

esteiner said:
Yes, Brendan. You understand the goal perfectly. Please offer
suggestions
on how I can accomplish this with an update query or any other method.
As I
mentioned before, the file becomes very badly fragmented using the current
method.

Brendan said:
It looks as though it might be possible to do it with an update query
instead of looping through a recordset. Do I understand the goal
correctly -
if the field ManfPartNum contains any digit (0 - 9) then the field
CorePartNum should contain the part of ManfPartNum starting at and
including
the first digit? In other words, if ManfPartNum contains the value
"ABC123"
then CorePartNum should be assigned the value "123"?
am scrubbing values of a field in an Access 2003 table and updating the
results in the same table.
[quoted text clipped - 71 lines]
 
E

esteiner

Hi Brendan. I do not see a response from Klatuu or anyone else on this
thread. Please advise?!?!

Brendan said:
I think Klatuu has beaten me to it! :) His suggestion looks very similar to
what I had in mind.
Yes, Brendan. You understand the goal perfectly. Please offer
suggestions
[quoted text clipped - 18 lines]
 
B

Brendan Reynolds

Here's a link to Klatuu's response ...

http://groups.google.com/group/microsoft.public.access.modulesdaovba/msg/373ae319ec15f849

--
Brendan Reynolds
Access MVP

esteiner said:
Hi Brendan. I do not see a response from Klatuu or anyone else on this
thread. Please advise?!?!

Brendan said:
I think Klatuu has beaten me to it! :) His suggestion looks very similar
to
what I had in mind.
Yes, Brendan. You understand the goal perfectly. Please offer
suggestions
[quoted text clipped - 18 lines]
 
A

Albert D. Kallal

I am not sure or even convinced that changing your code to a update query
will fix the bloat problem.
The routine runs OK but the size of the database goes to 600+ MB.

Yes, but if the original size is 590mb, then what is the problem?

Since you are only editing one field, and "adding" a bit of text, then you
might be able to mitigate this problem.

first, do realize that the JET database is a frame based data engine. What
that means that all records are packed into frames.

frame1
record 1
record 2
record 3

etc. etc. .ect

frame2
record n
record n+1

Lets assume the frame size is 512. If you have 5 records of 100 characters,
then you likely can edit two records, and add up to 12 characters of data
for those 5 records. Lets assume that you add 1 more character to a record.
Well, now, those records CAN NOT fit in the above frame..can they? So, what
happens is another frame is added (note that the database can only add space
by the frame size, and this is done for performance reasons. Worse, since
the database needs to keep the data together, then often the current frame
is *copied* to a new location, and the additional frame is appended.
(remember, we can't delete holes, or compact a database while OTHER users
may be in the file, as that would cause EVEN MORE disk i/o).

The above conceptual view now gives you an idea that adding only a few
characters per record can actually results in a WHOLE NEW FRAME being added
for each of those records. (and, I not sure of the size of frames in
ms-access now..but, they are like larger then 512 characters).

What this means is that if you edit a record, ms-access will OFTEN expand
the file to handle this case. And, after the records expand a bit, then the
chances that *extra* space is available actually increases by a large
amount. (eg: if you just expanded to 513 chars, then you can now expand
records in that frame another 511 characters before ANY GROWTH occurs. What
this means is that growth of a file will subside to a dull roar.

And, another trick you could consider is that if you *knew* the size of code
part num, you could just store 5 text "0", or some other appropriate value.
Then, when you run your edit routines, the reocrd(s) will NOT expanded
beyond their current frame size, and viola...no bloat!!!

Another CRITICAL CRITICAL CRITICAL feature to turn off is row record
locking. ms-access locks records by frames,and if you turn on row locking,
it actually PADS THE RECORDS TO EXPAND to a full frame. so, having row
locking turned on is a huge hit here (I would check this feature, as it also
might be responsible for your bloating).

So, you can keep you locking turned on if need be, but for the most part, I
would make sure that row locking is turned off.

tools->options->advanced.

make sure the
"[ ] open databases using record level locking

The above option needs to be un-checked.

So, check the above, and now that you have an understanding of how/why the
file bloats, you could try padding that codepart number with as many
"holder" characters as needed. Now, do a compact and repair..and then run
your routine...there should be very little, if any bloat...
 
A

admin

The logic of your solution sounds absolutely perfect. I am running
into a problem as Access will not allow a user defined function to be
used within the query.

See this KB article that I found here:
http://support.microsoft.com/kb/q180810/

Any other help would be appreciated as I have reached the limits of my
knowledge.
It would probably be faster if you create an Update query and use the
following function to find the number you want. If the function returns "",
that means it did not find a number in the string, so there should be no
update. That you can handle with an IIf statement in your query. I would
add a calculated field so the query only has to run once per record:

ScrubValue([ManfPartNum]) As NewVal

Then to get it into the field

IIf([NewVal] = "", [CorePartNum], [ManfPartNum])


Public Function ScrubValue (strMPN As String) As String
Dim intI As Integer
Dim blnHasNum As Boolean

For intI = 1 To Len(strMPN)
If IsNumeric(Mid(strMPN, intI, 1)) Then
'number found, exit
blHasNum = True
Exit For
End If
Next intI

If blHasNum Then
ScrubValue = Right(strMPN, Len(strMPN) - intI + 1)
Else
ScrubValue = ""
End If
End Function


esteiner said:
am scrubbing values of a field in an Access 2003 table and updating the
results in the same table.

The routine runs OK but the size of the database goes to 600+ MB. I must run
the compact and repair database operation each time I run the routine. The
amount of data to process will continue to increase, so I would like to make
this routine as efficient as possible. I currently process about 200K
records, but expect this to grow to 1 million+

I assume that this is caused by the many read/write cycles to the hard disk.
Is there a way to operate on the table/recordset in RAM before writing to
disk? Any help/suggestions to make this more efficient would be appreciated .
Below is the code that runs the routine:

Private Sub cmdSplitMPN_Click()
On Error GoTo E_Handle
Dim rs As Recordset ' Recordset object
Dim strTableName As String ' Name of table where export records originate
Dim strSQL As String ' SQL String
Dim intRecordNum As Long ' Record Number being dealt with
Dim strMPN As String
Dim I As Integer
Dim strPrefix As String
Dim strBasenum As String
Dim blHasNum As Boolean
Set db = DBEngine(0)(0)

' Create the table name from where the export records will orginate
strTableName = "pull_inv_BRE"
strSQL = "SELECT ManfPartNum, CorePartNum FROM " & strTableName & ""
Set rs = db.OpenRecordset(strSQL)
With rs
While Not (.BOF Or .EOF)
strMPN = !ManfPartNum
For I = 1 To Len(strMPN)
blHasNum = False
If IsNumeric(Mid(strMPN, I, 1)) Then
'number found, exit
blHasNum = True
Exit For
End If
Next I

If (blHasNum = True) Then
strBasenum = Right(strMPN, Len(strMPN) - I + 1)
.Edit
!CorePartNum = strBasenum
.Update
End If
.MoveNext
Wend
.Close
End With

sExit:
On Error Resume Next
rs.Close
Set rs = Nothing
Set db = Nothing
Reset
Exit Sub



E_Handle:
MsgBox Err.Description, vbOKOnly + vbCritical, "Error: " & Err.Number
Resume sExit

End Sub

Thank you for your help!!!

-Charlie
 
E

esteiner via AccessMonster.com

Thank you for the education and solution. This works beautifully.

Best regards
I am not sure or even convinced that changing your code to a update query
will fix the bloat problem.
The routine runs OK but the size of the database goes to 600+ MB.

Yes, but if the original size is 590mb, then what is the problem?

Since you are only editing one field, and "adding" a bit of text, then you
might be able to mitigate this problem.

first, do realize that the JET database is a frame based data engine. What
that means that all records are packed into frames.

frame1
record 1
record 2
record 3

etc. etc. .ect

frame2
record n
record n+1

Lets assume the frame size is 512. If you have 5 records of 100 characters,
then you likely can edit two records, and add up to 12 characters of data
for those 5 records. Lets assume that you add 1 more character to a record.
Well, now, those records CAN NOT fit in the above frame..can they? So, what
happens is another frame is added (note that the database can only add space
by the frame size, and this is done for performance reasons. Worse, since
the database needs to keep the data together, then often the current frame
is *copied* to a new location, and the additional frame is appended.
(remember, we can't delete holes, or compact a database while OTHER users
may be in the file, as that would cause EVEN MORE disk i/o).

The above conceptual view now gives you an idea that adding only a few
characters per record can actually results in a WHOLE NEW FRAME being added
for each of those records. (and, I not sure of the size of frames in
ms-access now..but, they are like larger then 512 characters).

What this means is that if you edit a record, ms-access will OFTEN expand
the file to handle this case. And, after the records expand a bit, then the
chances that *extra* space is available actually increases by a large
amount. (eg: if you just expanded to 513 chars, then you can now expand
records in that frame another 511 characters before ANY GROWTH occurs. What
this means is that growth of a file will subside to a dull roar.

And, another trick you could consider is that if you *knew* the size of code
part num, you could just store 5 text "0", or some other appropriate value.
Then, when you run your edit routines, the reocrd(s) will NOT expanded
beyond their current frame size, and viola...no bloat!!!

Another CRITICAL CRITICAL CRITICAL feature to turn off is row record
locking. ms-access locks records by frames,and if you turn on row locking,
it actually PADS THE RECORDS TO EXPAND to a full frame. so, having row
locking turned on is a huge hit here (I would check this feature, as it also
might be responsible for your bloating).

So, you can keep you locking turned on if need be, but for the most part, I
would make sure that row locking is turned off.

tools->options->advanced.

make sure the
"[ ] open databases using record level locking

The above option needs to be un-checked.

So, check the above, and now that you have an understanding of how/why the
file bloats, you could try padding that codepart number with as many
"holder" characters as needed. Now, do a compact and repair..and then run
your routine...there should be very little, if any bloat...
 
B

Brendan Reynolds

Access does allow user-defined functions in queries. The KB article is
referring to *other* Office apps, not Access.

<quote emphasis="mine">
You can only refer to Microsoft Jet objects that are only dependent on other
Microsoft Jet objects when using DAO in *other* Microsoft Office programs.
The following steps demonstrate this problem by creating a query in
Microsoft Access that calls a user-defined function and then attempting to
open a recordset on that query using DAO in Microsoft *Excel*.
</quote>

BTW: While the article discusses opening a recordset using DAO, the issue is
not specific to DAO. The same thing would occur if you attempted to open the
recordset using ADO, or from a .NET app using ADO.NET. The article doesn't
mention these newer data access technologies simply because it is an old
article.

The bottom line is that you can use user-defined functions in JET queries
only when those queries are executed within the Microsoft Access
environment.

If you need to execute your query outside of the Microsoft Access
environment, it *might* be possible to replace the user-defined function
using the built-in Choose() or Switch() functions, but I have not tested
that idea.

--
Brendan Reynolds
Access MVP

The logic of your solution sounds absolutely perfect. I am running
into a problem as Access will not allow a user defined function to be
used within the query.

See this KB article that I found here:
http://support.microsoft.com/kb/q180810/

Any other help would be appreciated as I have reached the limits of my
knowledge.
It would probably be faster if you create an Update query and use the
following function to find the number you want. If the function returns
"",
that means it did not find a number in the string, so there should be no
update. That you can handle with an IIf statement in your query. I
would
add a calculated field so the query only has to run once per record:

ScrubValue([ManfPartNum]) As NewVal

Then to get it into the field

IIf([NewVal] = "", [CorePartNum], [ManfPartNum])


Public Function ScrubValue (strMPN As String) As String
Dim intI As Integer
Dim blnHasNum As Boolean

For intI = 1 To Len(strMPN)
If IsNumeric(Mid(strMPN, intI, 1)) Then
'number found, exit
blHasNum = True
Exit For
End If
Next intI

If blHasNum Then
ScrubValue = Right(strMPN, Len(strMPN) - intI + 1)
Else
ScrubValue = ""
End If
End Function


esteiner said:
am scrubbing values of a field in an Access 2003 table and updating
the
results in the same table.

The routine runs OK but the size of the database goes to 600+ MB. I
must run
the compact and repair database operation each time I run the routine.
The
amount of data to process will continue to increase, so I would like to
make
this routine as efficient as possible. I currently process about 200K
records, but expect this to grow to 1 million+

I assume that this is caused by the many read/write cycles to the hard
disk.
Is there a way to operate on the table/recordset in RAM before writing
to
disk? Any help/suggestions to make this more efficient would be
appreciated .
Below is the code that runs the routine:

Private Sub cmdSplitMPN_Click()
On Error GoTo E_Handle
Dim rs As Recordset ' Recordset object
Dim strTableName As String ' Name of table where export records
originate
Dim strSQL As String ' SQL String
Dim intRecordNum As Long ' Record Number being dealt with
Dim strMPN As String
Dim I As Integer
Dim strPrefix As String
Dim strBasenum As String
Dim blHasNum As Boolean
Set db = DBEngine(0)(0)

' Create the table name from where the export records will orginate
strTableName = "pull_inv_BRE"
strSQL = "SELECT ManfPartNum, CorePartNum FROM " & strTableName & ""
Set rs = db.OpenRecordset(strSQL)
With rs
While Not (.BOF Or .EOF)
strMPN = !ManfPartNum
For I = 1 To Len(strMPN)
blHasNum = False
If IsNumeric(Mid(strMPN, I, 1)) Then
'number found, exit
blHasNum = True
Exit For
End If
Next I

If (blHasNum = True) Then
strBasenum = Right(strMPN, Len(strMPN) - I + 1)
.Edit
!CorePartNum = strBasenum
.Update
End If
.MoveNext
Wend
.Close
End With

sExit:
On Error Resume Next
rs.Close
Set rs = Nothing
Set db = Nothing
Reset
Exit Sub



E_Handle:
MsgBox Err.Description, vbOKOnly + vbCritical, "Error: " & Err.Number
Resume sExit

End Sub

Thank you for your help!!!

-Charlie
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top