Database Blocking -- Why?

J

Jonathan Wood

My database seems to be getting blocked but I don't understand why. This is
for a desktop application and my code looks something like this:

private void toolStripButton1_Click(object sender, EventArgs e)
{
try
{
string qry = String.Format("SELECT TOP {0} * FROM URLs WHERE Status=0
AND PollCount < 5 ORDER BY URL", total);
using (SqlDataReader rdr = DataHelper.ExecDataReader(qry))
{
while (rdr.Read())
{
try
{

// ...

if (<error condition>)
throw new Exception("Custom Error");

// ...

DataHelper.ExecNonQuery("UPDATE URLs SET Status=1,
Comment=@Comment WHERE ID=@ID",
"@Comment", "Success",
"@ID", (int)rdr["ID"]);
}
catch (Exception ex)
{
DataHelper.ExecNonQuery("UPDATE URLs SET
Status=2,Comment=@Comment,PollCount=PollCount+1 WHERE ID=@ID",
"@Comment", ex.Message,
"@ID", (int)rdr["ID"]);
}
}
}
}
catch (Exception ex)
{
MessageBox.Show(String.Format("Unexpected error : {0}", ex.Message));
}
finally
{
// Clean up
}
}

This code was working just fine all day. But, suddenly, the database query
in the catch block was timing out, causing the second, outter, catch
statement to run.

If I set CommandTimeout to 0, it blocks indefinitely. While it was blocked,
I ran sp_who2 and the results can be seen at
http://www.fileparade.com/DBBlock.jpg.

It shows the UPDATE is blocked by the SELECT. But why does a SELECT block?

Thanks for any tips. I'm stuck!

Jonathan
 
A

Alex Kuznetsov

It shows the UPDATE is blocked by the SELECT. But why does a SELECT block?

Updates need exclusive blocks, so they need to wait until shared
blocks acquired by selects are released.
 
J

Jonathan Wood

Well, I'm still not clear why a read-only operation needs any lock at all. I
want to be able to update records as I'm looping through them. Any ideas on
how this is possible? (Or why it was working fine until now?)

Thanks.

Jonathan

It shows the UPDATE is blocked by the SELECT. But why does a SELECT block?

Updates need exclusive blocks, so they need to wait until shared
blocks acquired by selects are released.
 
A

Andrew J. Kelly

You have to understand that in SQL Server the default isolation level is
Read Committed. Meaning that if any row is being modified you can not read
it until it is done. That ensures you don't get dirty reads. It goes the
other way as well. You should not be able to modify a row that someone is in
the middle of reading either. If you want a dirty read then use NOLOCK with
the read but be aware of the consequences. But I think a bigger issue is
the way you are writing your code. Since you are embedding all the code into
your app and you have not parameterized any of the statements you may not
get plan reuse and worse is that you may not use indexes properly if the
datatypes are interpreted wrong. That could be one reason you are blocking.
You did parameterize the UPDATE but not the select and the select may be
scanning the table or taking a table level lock.
 
J

Jonathan Wood

Hi Andrew,
You have to understand that in SQL Server the default isolation level is
Read Committed. Meaning that if any row is being modified you can not read
it until it is done. That ensures you don't get dirty reads. It goes the
other way as well. You should not be able to modify a row that someone is
in the middle of reading either. If you want a dirty read then use NOLOCK
with the read but be aware of the consequences. But I think a bigger
issue is the way you are writing your code. Since you are embedding all
the code into your app and you have not parameterized any of the
statements you may not get plan reuse and worse is that you may not use
indexes properly if the datatypes are interpreted wrong. That could be one
reason you are blocking.

I can't see why that would cause this, but I did try changing my select as
follows.

string qry = String.Format("SELECT TOP {0} * FROM URLs WHERE Status=@Status
AND PollCount < @PollCount ORDER BY URL", total);
using (SqlDataReader rdr = DataHelper.ExecDataReader(qry, "@Status", 0,
"@PollCount", 5))

It didn't change it though. I even changed the UPDATE query so that every
argument was parameterized.

Note that this is not a performance issue. The database appears blocked
indefinitely.
You did parameterize the UPDATE but not the select and the select may be
scanning the table or taking a table level lock.

I feel I'm a bit over my head here.

Thanks.

Jonathan
 
J

Jonathan Wood

Thanks. I'm reading about it.

Can you suggest a way to rework my code so this would work?

Jonathan
 
J

Jonathan Wood

Thanks. I'm reading about it.

I read a few negative things about SNAPSHOT ISOLATION, and I'm still yet to
find a code example that shows how I'd use it.

Can you suggest any way to restructure my code so this would work?

Jonathan

Well, I'm still not clear why a read-only operation needs any lock at all.
I
want to be able to update records as I'm looping through them. Any ideas
on
how this is possible? (Or why it was working fine until now?)

Explanation here:

http://en.wikipedia.org/wiki/Isolation_(database_systems)

Also you can use snapshot isolation when readers do not block writers.
 
J

Jonathan Wood

If that is the case and your problem is of sufficient importance (or even
if it isn't), I recommend you hire a professional to help address this
problem and also give you some mentoring along the way so you can do
things better and investigate better in the future.

Well, I'm juggling a lot of technologies and I can't become an expert in all
of them. But if I end up having to hire help then that's what I'll do.

I guess I was hoping someone here would be able to see what was wrong and
suggest a correction.

Jonathan
 
J

Jonathan Wood

Thanks. I've looked that over--to the extent I understand it.

Can you tell me this: If you need to iterate through a large dataset, and
you need to update each row that you process, would your solution be to use
SNAPSHOT ISOLATION, or would you change the code some other way?

I may use a Microsoft support incidence on this. But my experience with this
is that they tend to resolve the issue, eventually, but are generally not
that knowledgeable.

Thanks.

Jonathan

Thanks. I'm reading about it.

I read a few negative things about SNAPSHOT ISOLATION, and I'm still yet
to
find a code example that shows how I'd use it.

Can you suggest any way to restructure my code so this would work?

My take at it:

http://www.devx.com/dbzone/Article/32957/1954
 
S

sloan

I'm kinda on this line of thinking as well.

The live IDataReader ... coupled with an inner Update statement just feels
wrong.

If I had to pull some records from the database, and then update,,,I would
put them in something besides doing the update in a live IDataReader.

I would capture the
"@ID", (int)rdr["ID"]);
into a (strong) DataSet or List<int> or something...then run the updates.

............

Is this actual code or a sample?

You can also consider a BULK Update as well. I'm not sure why you're
updating the rows one at a time.

Is your datastore a SQL Server database or something else?





Patrice said:
It looks like to me that you have an exception while reading and try to
update the same table with an active reader. What if you try to clean the
reader before updating the same table again.

You may want also to try to avoid exception if they are part of your
normal execution flow (IMO they should happen only when soemthing really
wrong happens, not if you have some klind of custopn condition that should
trigger a db upodate).

--
Patrice

Jonathan Wood said:
My database seems to be getting blocked but I don't understand why. This
is for a desktop application and my code looks something like this:

private void toolStripButton1_Click(object sender, EventArgs e)
{
try
{
string qry = String.Format("SELECT TOP {0} * FROM URLs WHERE Status=0
AND PollCount < 5 ORDER BY URL", total);
using (SqlDataReader rdr = DataHelper.ExecDataReader(qry))
{
while (rdr.Read())
{
try
{

// ...

if (<error condition>)
throw new Exception("Custom Error");

// ...

DataHelper.ExecNonQuery("UPDATE URLs SET Status=1,
Comment=@Comment WHERE ID=@ID",
"@Comment", "Success",
"@ID", (int)rdr["ID"]);
}
catch (Exception ex)
{
DataHelper.ExecNonQuery("UPDATE URLs SET
Status=2,Comment=@Comment,PollCount=PollCount+1 WHERE ID=@ID",
"@Comment", ex.Message,
"@ID", (int)rdr["ID"]);
}
}
}
}
catch (Exception ex)
{
MessageBox.Show(String.Format("Unexpected error : {0}", ex.Message));
}
finally
{
// Clean up
}
}

This code was working just fine all day. But, suddenly, the database
query in the catch block was timing out, causing the second, outter,
catch statement to run.

If I set CommandTimeout to 0, it blocks indefinitely. While it was
blocked, I ran sp_who2 and the results can be seen at
http://www.fileparade.com/DBBlock.jpg.

It shows the UPDATE is blocked by the SELECT. But why does a SELECT
block?

Thanks for any tips. I'm stuck!

Jonathan
 
J

Jonathan Wood

Hi Patrice,
It looks like to me that you have an exception while reading and try to
update the same table with an active reader. What if you try to clean the
reader before updating the same table again.

At the point the exception occurs, I'm done with reading the current record,
but I still need to maintain my position so that I can Read() the next
record. Is there something I can do to let the database know I'm done with
the current record without losing my position? I'd love to do that but don't
know of anything like that.
You may want also to try to avoid exception if they are part of your
normal execution flow (IMO they should happen only when soemthing really
wrong happens, not if you have some klind of custopn condition that should
trigger a db upodate).

The problem is that I also need to catch those exceptions that result when
something really wrong happens, which includes many possible conditions in
this particular code. I just figured those few conditions that I test for in
my own code may as well be handled with everything else.

Thanks.

Jonathan
 
J

Jonathan Wood

Hi sloan,
The live IDataReader ... coupled with an inner Update statement just feels
wrong.

If I had to pull some records from the database, and then update,,,I would
put them in something besides doing the update in a live IDataReader.

I would capture the
"@ID", (int)rdr["ID"]);
into a (strong) DataSet or List<int> or something...then run the updates.

...........

Is this actual code or a sample?

This is extracted from actual code. The main problem is that there are over
a hundred thousand rows so I was reluctant to load everything into memory.
Also, it takes hours and hours (days) to run. I'd hate for an unexpected
error to happen during the update and then I'd need to redo those hours of
processing again.

But I am perfectly open to changing my code as long as I can make it
addresses my objectives.
You can also consider a BULK Update as well. I'm not sure why you're
updating the rows one at a time.

For each row, I need to poll a URL specified in the record. I have no way to
do that other than row by row.
Is your datastore a SQL Server database or something else?

It's SQL Server. Currently, it's local but it will be moved to a Web server
when the code is working.

Thanks.

Jonathan
 
J

Jonathan Wood

Hi Alex,
I would not make a choice until I have stress tested all the
alternatives:

http://www.simple-talk.com/sql/t-sq...oopholes-stress-test-those-stored-procedures/

I really appreciate your input and I'm really trying to follow along, but
this article seems specific to stored procedures. I don't believe the code
in question is using a single stored procedures. So I feel really lost
reading that article in this context.

I'll understand if you don't have any suggestions, but I'm still really
curious if there is a *general* approach to iterating through a large
dataset and updating each row, one by one, given that the straight forward
approach can produce locking.? Or is it just hit and miss?

Thanks.

Jonathan
 
S

sloan

I see 2 options to try:

Throw a "nolock" in the SELECT


String.Format("SELECT TOP {0} * FROM URLs with (nolock) WHERE Status=0
AND PollCount < 5 ORDER BY URL", total);

OR

Take your values......from the IDataReader.....put them in a List<int> or a
strong dataset (or anything that isn't a IDataReader)...THEN Loop on the
values.

I would DO THE SECOND ONE.
 
J

Jonathan Wood

Hi sloan,
I see 2 options to try:

Throw a "nolock" in the SELECT

String.Format("SELECT TOP {0} * FROM URLs with (nolock) WHERE Status=0
AND PollCount < 5 ORDER BY URL", total);

OR

Take your values......from the IDataReader.....put them in a List<int> or
a strong dataset (or anything that isn't a IDataReader)...THEN Loop on the
values.

I would DO THE SECOND ONE.

Helpful on both counts.

The first one solves the problem, and with all the headaches, I am also
starting to think the second approach might be the best way to make this
code bullet-proof.

Thanks!

Jonathan
 
D

Dan Guzman

Rather than NOLOCK, another alternative to a row-versioning isolation level
or reading the entire result into memory is to create your own snapshot via
a temp table. This way, locks are needed only for the short period of
loading the temp table. Parameterized example:

command.CommandText =
"IF OBJECT_ID('tempdb..#URLsSmapshot', 'U') IS NOT NULL" +
" DROP TABLE #URLsSmapshot;" +
" SELECT TOP (@RowLimit) *" +
" INTO #URLsSmapshot" +
" FROM URLs WHERE Status=0" +
" AND PollCount < 5 ORDER BY URL;
" SELECT * FROM #URLsSmapshot ORDER BY URL;";

BTW, I'm not sure I understand why the TOP clause is used here because I
can't see why you would limit the number of rows updated. If you remove the
TOP, you can also remove the ORDER BY from the SELECT...INTO. Additionally,
you can remove the ORDER BY from the SELECT too unless you need the rows
updated in a particular sequence. This would likely improve both
performance and concurrency.

Another alternative is to alter the database to turn on
READ_COMMITTED_SNAPSHOT. This should not be confused with the SNAPSHOT
isolation, which is multi-statement isolation level that requires
application changes. With the READ_COMMITTED_SNAPSHOT database option
turned on, update statements in the default READ_COMMITTED isolation level
store pre-update row versions in tempdb. These row versions are used by
requests from other connections to return data consistent at the time the
statement started. Select statements do not acquire shared locks in a
row-versioning isolation level.

As mentioned earlier in this thread, row versioning isolation levels
increase tempdb usage so you need to perform due diligence on tempdb sizing
and performance if you perform a lot of updates in a row-versioning
isolation level.

--
Hope this helps.

Dan Guzman
SQL Server MVP
http://weblogs.sqlteam.com/dang/
 
J

Jonathan Wood

Hi Dan,
Rather than NOLOCK, another alternative to a row-versioning isolation
level or reading the entire result into memory is to create your own
snapshot via a temp table. This way, locks are needed only for the short
period of loading the temp table. Parameterized example:

command.CommandText =
"IF OBJECT_ID('tempdb..#URLsSmapshot', 'U') IS NOT NULL" +
" DROP TABLE #URLsSmapshot;" +
" SELECT TOP (@RowLimit) *" +
" INTO #URLsSmapshot" +
" FROM URLs WHERE Status=0" +
" AND PollCount < 5 ORDER BY URL;
" SELECT * FROM #URLsSmapshot ORDER BY URL;";

BTW, I'm not sure I understand why the TOP clause is used here because I
can't see why you would limit the number of rows updated. If you remove
the TOP, you can also remove the ORDER BY from the SELECT...INTO.
Additionally, you can remove the ORDER BY from the SELECT too unless you
need the rows updated in a particular sequence. This would likely improve
both performance and concurrency.

In fact, I've been thinking about this type of approach myself. However, I
have a problem:

The reason I was using TOP was because the entire database could take
several days to poll. So I want to be able to do a chunk at a time--maybe at
nights. So then I started thinking this snapshot table would need a column
to flag whether or not each row had been polled yet. This way, when I needed
to run another chunk, I could select rows that haven't yet been polled. And
when all rows are flagged as polled, I'd know I had completed the entire
table.

The problem here, though, is that I'm right back where I started, trying to
update this flag in the same table I'm reading. I thought about adding the
flag to the original table instead but then I need a join in my select query
and again I'm reading the same table I need to update.
Another alternative is to alter the database to turn on
READ_COMMITTED_SNAPSHOT. This should not be confused with the SNAPSHOT
isolation, which is multi-statement isolation level that requires
application changes. With the READ_COMMITTED_SNAPSHOT database option
turned on, update statements in the default READ_COMMITTED isolation level
store pre-update row versions in tempdb. These row versions are used by
requests from other connections to return data consistent at the time the
statement started. Select statements do not acquire shared locks in a
row-versioning isolation level.

As mentioned earlier in this thread, row versioning isolation levels
increase tempdb usage so you need to perform due diligence on tempdb
sizing and performance if you perform a lot of updates in a row-versioning
isolation level.

Yeah, this gets into areas I'm not very knowledgeable about. If this is a
database-wide setting, then I don't know if there becomes configuration
issues should I move the database ???

At any rate, I'll to a bit of research on READ_COMMITTED_SNAPSHOT.

Thanks!

Jonathan
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top