frequent database corruption

A

adirat

I have read a lot on this subject on newsgroups and other access
related websites on data corruption, but since we are still not able
to isolate the problem – I am posting this detailed explanation of my
problem:

We have a 23 user environment with Windows advanced server and windows
2000 clients with access 2002 running on all clients in a FE/BE
format. The problem is our database gets corrupted almost 3-6 times on
a busy day (lot of data being processed).

Typically the following ways are in which the database gets corrupt:
1. We get the error message "Unrecognizable database format - do you
want to repair...." – this situation is most common and a compress and
repair on the database does the trick for us.
2. The second kind of error is when one record in any table becomes
corrupt – this usually happens with "#error" in each field of a record
or "####" in each field or sometimes "oriental and other garbled
characters" in place of the data in a record. Deleting this record is
sometimes possible, sometimes the rest of the records have to be
copied into a fresh table.

Also the problem faces could start on any client – hence we do not
think it is because of any one particular client.

Here is a brief history which can assist you in helping us dentify the
source of problem:

1. Earlier we had win95 with access97 with BE on Novel Netware. At
that time, for four years, we never faced this problem.
2. The we got new machines for all the nodes with win2000 with
access2000 (still using the netware server). We upgraded the same
access file from 97 to 2000. STILL NO PROBLEM.
3. Then we changed the server to win 2000 advanced server. That is
when we started having the corruption problems.
4. We further changed to access2002 – hoping this would remove the
problem – but it hasn't.

Further we have ensured that:
1. All the nodes have access 2002, SP4 for win2000 and same Jet engine
4.0
2. We have changed the server's network card, server's cable and
switch to ensure it is not because of network hardware devices – still
the problem persists.

Can anybody guide us how do we further isolate the problem and then
how do we resolve it.
Thanks in advance
adirat
 
N

Nikos Yannacopoulos

Adirat,

You say the database is spilt to a FE/BE, but you do not clarify whether
each user has their own copy of the FE, or they are all using a common copy
on the server. In the latter case, the source of your problem is there, and
the solution is to provide each user with a dedicated copy of the FE.

HTH,
Nikos
 
L

Larry Linson

One thing you did not mention: have you separated the back-end (tables,
data, and relationships) from the front-end (queries, forms, reports,
macros, modules, and any local lookup tables) and given each user their own
copy of the front-end to be linked to the tables in the back end? Having
multiple users logged in to the same copy of the front-end or to the
monolithic database greatly increases the chances of corruption.

There's an introductory presentation on Access in a Multiuser Environment
that I did for my user group that you can download from
http://appdevissues.tripod.com. It will identify topics that I thought
worthwhile to discuss, and a bit more. The best collection of detailed
information and links on the subject of Access in the multiuser environment
is at MVP Tony Toews' site, http://www.granite.ab.ca/accsmstr.htm. You'll
likely find the answer to your problem in the references at Tony's site.

Larry Linson
Microsoft Access MVP
 
T

Tony Toews

3. Then we changed the server to win 2000 advanced server. That is
when we started having the corruption problems.

Given this history I strongly suspect the problem is the OpLocks setting on the
server. Corrupt Microsoft Access MDB Causes - OpLocks
http://www.granite.ab.ca/access/corruption/causesoplocks.htm

Tony
--
Tony Toews, Microsoft Access MVP
Please respond only in the newsgroups so that others can
read the entire thread of messages.
Microsoft Access Links, Hints, Tips & Accounting Systems at
http://www.granite.ab.ca/accsmstr.htm
 
T

Tony Toews

Tony Toews said:
Given this history I strongly suspect the problem is the OpLocks setting on the
server. Corrupt Microsoft Access MDB Causes - OpLocks
http://www.granite.ab.ca/access/corruption/causesoplocks.htm

Or, as Larry suggests, you haven't split the MDB into a FE/BE.

Tony
--
Tony Toews, Microsoft Access MVP
Please respond only in the newsgroups so that others can
read the entire thread of messages.
Microsoft Access Links, Hints, Tips & Accounting Systems at
http://www.granite.ab.ca/accsmstr.htm
 
D

David W. Fenton

(e-mail address removed) (adirat) wrote in
Further we have ensured that:
1. All the nodes have access 2002, SP4 for win2000 and same Jet
engine 4.0

Which version of Jet 4.0? SP6 or SP8 are the only usable versions of
Jet 4.0.
2. We have changed the server's network card, server's cable and
switch to ensure it is not because of network hardware devices -
still the problem persists.

Have you investigated the oplocks settings on the server?

I wonder if you are using memo fields in your data tables, and if
those are the tables that exhibit the corrupted records?

If so, one way to make your app more robust is to never edit the
memo with a bound field. Instead, use an unbound textbox for the
memo field, and load the data from the recordsource into the unbound
textbox in the OnCurrent event of the form, and then write to the
field in the textbox's AfterUpdate event.
 
A

adirat

Thanks Nikos, Larry Linson & John for your response.

Indeed each user have their own copy of FE - in fact each user uses an
individual 'mde' file with all the forms, queries, reports; with most
of the tables linked to the BE. The BE consists only of tables.

I have also gone through MVP Tony Toews' site,
http://www.granite.ab.ca/accsmstr.htm (thanks for all the stuff tony)
and tried to implement many things that he suggests - but really my
objective is of isolation of the problem - as there could be so many
reaosns for the corruption. I even tried opLocks - but somehow I could
not find the relevant the registry entry in my server - as suggested
by microsoft - HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\MRXSmb\Parameters\

Can the fact that many of my forms are bound be a problem?

will look up other links suggested by you all. Any more suggestions
welcome.
regards, adirat
 
D

david epsom dot com dot au

not find the relevant the registry entry in my server
- as suggested by microsoft -
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\MRXSmb\Parameters\

1) That is the Client setting - the Server setting is under:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters

2) There is no dummy entry for the default setting: you need
to create a registry value for OplocksDisabled/EnableOplocks
if you want to change the value to something OTHER than default.

The oplocks problem has been listed as fixed: with SP4 on all
the clients you are not supposed to have any problems. We are
all watching to see what you discover :~)

(david)
 
T

Tony Toews

david epsom dot com dot au said:
1) That is the Client setting - the Server setting is under:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters

Good catch. That article is rather confusing for our requirements.
I've requested an update. I wonder if they didn't change it on me a
while back.

I've also updated my page.

Thanks.
The oplocks problem has been listed as fixed: with SP4 on all
the clients you are not supposed to have any problems. We are
all watching to see what you discover :~)

I haven't seen any postings yet on this topic to know if SP4 really
has solved this problem. But then I'm a pessimist. <smile>

Tony
--
Tony Toews, Microsoft Access MVP
Please respond only in the newsgroups so that others can
read the entire thread of messages.
Microsoft Access Links, Hints, Tips & Accounting Systems at
http://www.granite.ab.ca/accsmstr.htm
 
A

adirat

David W. Fenton said:
(e-mail address removed) (adirat) wrote in


Which version of Jet 4.0? SP6 or SP8 are the only usable versions of
Jet 4.0.

Have you investigated the oplocks settings on the server?

I wonder if you are using memo fields in your data tables, and if
those are the tables that exhibit the corrupted records?

If so, one way to make your app more robust is to never edit the
memo with a bound field. Instead, use an unbound textbox for the
memo field, and load the data from the recordsource into the unbound
textbox in the OnCurrent event of the form, and then write to the
field in the textbox's AfterUpdate event.

version of Jet 4.0 is sp8 in all machines
no memo fields at all.

oplocks NOW also set to disable (only on server - do i need to do on
client machines too?)
will watch and post if oplocks has solved the problem

regards
adirat
 
A

adirat

will watch and post if oplocks has solved the problem

Here is the update:
29th june we did the oplocks disable.
Till 8th july no problem - we were about to distribute sweets in the
office when

1. On 9th one table in the BE database had two problems
a) One record had "#ERROR" in every field
b) Another record had oriental characters (like japanese or chinese)
in every field and the id field (which usually has a four digit
number) had a minus 9 digit number.
c) We lost one record.
Also to be able to use the table further, we had to copy all the
uncorrupted records (around 3578) to a fresh table with the same
structure.

Any ideas what could have caused all this and HOW TO PREVENT THIS.

2. Today, 12th July we once again got the unrecognised database - need
to repair error message - and had to run a compact and repair.
Any ideas where we go from here.

thanks ina advance
adirat
 
M

Michael \(michka\) Kaplan [MS]

The hard part about corruption is that it is not a bug, it is the result of
a bug (almost universally to do with hardware/NIC issues). The result is not
so great from a "debugging the problem" standpoint except in the case where
the machines where it happens show patterns that can point to the offenders.

Defensive Access apps work to minimize the ability for such issues to cause
problems; disabling oplocks is one such workitem, but there are many
others -- such as never binding to memo or bound object fields in
forms/reports, for example.


--
MichKa [MS]
NLS Collation/Locale/Keyboard Development
Globalization Infrastructure and Font Technologies
Windows International Division

This posting is provided "AS IS" with
no warranties, and confers no rights.
 
D

David W. Fenton

(e-mail address removed) (adirat) wrote in
1. On 9th one table in the BE database had two problems
a) One record had "#ERROR" in every field
b) Another record had oriental characters (like japanese or
chinese) in every field and the id field (which usually has a four
digit number) had a minus 9 digit number.
c) We lost one record.
Also to be able to use the table further, we had to copy all the
uncorrupted records (around 3578) to a fresh table with the same
structure.

Any ideas what could have caused all this and HOW TO PREVENT THIS.

Are there memo fields in the table that got corrupted?

If so, did you take the advice given here to not edit those in bound
controls?
 
A

adirat

as mentioned in earlier mail
no memo field
no bound objects fields
nowhere in the whole app.

what i am asking for is what next i can do - what else i can look for
-
1.how does one trace a NIC issue (assuming that could be one of the
problems) - since the problem do not seem to originate from one node)
2. Anything i can change in the access database (in the FE or in BE)

any tips most welcome....

adirat
 
D

David W. Fenton

(e-mail address removed) (adirat) wrote in
as mentioned in earlier mail
no memo field
no bound objects fields
nowhere in the whole app.

what i am asking for is what next i can do - what else i can look
for -
1.how does one trace a NIC issue (assuming that could be one of
the problems) - since the problem do not seem to originate from
one node)

Probably you need to start digging into the contents of the LDB
file. I don't know how much information the ADO UserRoster function
returns, but the LDBVIEW utility provides you an interface to all
the information available in an LDB (through a DLL):

http://support.microsoft.com/default.aspx?scid=kb;EN-US;176670
2. Anything i can change in the access database (in the FE or in
BE)

First off, I assume that users are *not* sharing the front end?

If everything is unbound, then it's very odd that you're having
problems.

I forget your original situation. Did this all start happening with
the upgrade to a new version of Access, or was it an upgraded
server?

The other thing you really *must* be certain about is that every
single workstation have the latest service pack for Access and Jet
4.0 service pack 6 or later (though not SP7, which was very buggy,
and was quickly replaced with 8). Even one single workstation still
having a substandard installation can cause problems -- I've seen
it.

I now have all my A2K apps record on user startup what version of
the MSACCESS.EXE file they are using and what version of
MSJET40.DLL. This makes it very easy to tell if any workstations
have the wrong versions, and makes it very easy to identify when a
workstation reverts to older versions (which can happen very easily,
with, say, a repair of Office, or the rebuilding of a workstation or
its replacement with a new PC).

I'd say the problem is unlikely to be a NIC *if* A97 worked reliably
on the same equipment (I may be remembering wrong?). I think it's
much more likely to be a software configuration issue on the server,
especially given that changing the OPLOCKS setting improved the
situation.

Is this server running MS Exchange, by chance? I had a client once
where an Exchange hot fix started causing corruptions. As Exchange
wasn't even in use on the server (and shouldn't have been running
and should not have been patched), we backed out the hot fix and the
corruption never recurred. That was a very long time ago (early
1999) with very different versions of software (and replication
involved, to boot), but it taught me a very important lesson:
corruption can be caused by changes to the software configuration of
a server, something I'd never really considered before that point.
 
T

Tony Toews

Here is the update:
29th june we did the oplocks disable.
Till 8th july no problem - we were about to distribute sweets in the
office when

1. On 9th one table in the BE database had two problems
a) One record had "#ERROR" in every field
b) Another record had oriental characters

This one is a puzzler. One place I personally visited was having
similar problems which turned out to be an old house which was
converted into an office which had severely overloaded circuits and
very flaky power. The PCs and server were on UPSs but the hub
wasn't. Once they put the hub on a UPS all was well.

The point here though is that you'd think the network protocols would
somehow tell the other computer to retransmit clean packets or
completely fail. But that's not what happened. Corrupted packets got
through.
2. Today, 12th July we once again got the unrecognised database - need
to repair error message - and had to run a compact and repair.

Next time it happens try the steps at Determining the workstation
which caused the Microsoft Access MDB corruption
http://www.granite.ab.ca/access/corruption/workstation.htm to see if
you can narrow it down to a specific workstation.

Tony
--
Tony Toews, Microsoft Access MVP
Please respond only in the newsgroups so that others can
read the entire thread of messages.
Microsoft Access Links, Hints, Tips & Accounting Systems at
http://www.granite.ab.ca/accsmstr.htm
 
D

david epsom dot com dot au

1.how does one trace a NIC issue (assuming that could be one of the

I don't even have the hardware to trace NIC issues, I would hire an
expensive network guy who did. But presumably Win Server can at least
count bad packets.
what i am asking for is what next i can do - what else i can look for

Also, remove extra network protocols, and turn off Anti-Virus software

(david)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top