Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion GroupsGeneralPHPASPPerlColdFusionFlashHTML, CSS, ScriptsBrowsers

Webmaster Forum / ASP / Database Access / October 2006



Tip: Looking for answers? Try searching our database.

Speeding up multiple updates

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
brian.lukoff@gmail.com - 10 Oct 2006 15:15 GMT
I have a table with three columns, ID, Field, and Data.  There are
about 1000 rows with a given ID (and given an ID each row has a unique
Field value).  At some point in a script, I need to update the Data
column of a subset of these rows (somewhere between 1 and maybe 30 of
them), but each updated Data value needs to be able to contain a
different value.  Right now I'm using a sequence of UPDATE statements,
one for each row that I need to update, (e.g., UPDATE [...] SET [Data]
= ... WHERE [Field] = ... AND [ID] = ...) but this is very slow and is
causing users to receive timeouts and SQL deadlock errors.  Is there a
better way for me to do this?

Brian
Anthony Jones - 10 Oct 2006 15:39 GMT
> I have a table with three columns, ID, Field, and Data.  There are
> about 1000 rows with a given ID (and given an ID each row has a unique
[quoted text clipped - 8 lines]
>
> Brian

You didn't specify what DB.
How are you executing these updates? Are they in a transaction?

I can't see updating 30 records in a 1000 record DB taking any measurable
time at all.  (Unless your Data field just happens to be a Large Binary type
field in which you are storing a lot of data).

Do you have triggers on the table that might be taking a long time?
Is the table actively being read as part of other transaction protected
operations?
brian - 10 Oct 2006 18:04 GMT
Hi Anthony,

The table has many records, but for any particular ID there are about
1000 records each.  There are probably about 300 IDs, so about 300,000
records in the database.

The UPDATEs are being sent through the Execute statement of a
connection object.  There aren't any triggers.  The Data field is a
large varchar column (size 5000, I think).

Brian

> > I have a table with three columns, ID, Field, and Data.  There are
> > about 1000 rows with a given ID (and given an ID each row has a unique
[quoted text clipped - 19 lines]
> Is the table actively being read as part of other transaction protected
> operations?
Bob Barrows [MVP] - 10 Oct 2006 18:45 GMT
> I have a table with three columns, ID, Field, and Data.  There are
> about 1000 rows with a given ID (and given an ID each row has a unique
[quoted text clipped - 6 lines]
> causing users to receive timeouts and SQL deadlock errors.  Is there a
> better way for me to do this?

By "SQL", I'm assuming you mean SQL Server ...
Check out Erland's article about handling arrays in T-SQL*:
http://www.sommarskog.se/arrays-in-sql.html

You may be able to improve performance by passing all the updates as a
delimited string to a stored procedure which will parse the string,
inserting the data into a temp table and then using a join to perform
the actual updates.

Bob Barrows
* make sure you check out the other articles on his site. I have learned
from all of them.
Signature

Microsoft MVP -- ASP/ASP.NET
Please reply to the newsgroup. The email account listed in my From
header is my spam trap, so I don't check it very often. You will get a
quicker response by posting to the newsgroup.

brian - 10 Oct 2006 18:56 GMT
Would it be better to redesign the database table so that each user has
only one row and the Data column is just a comma-delimited version of
*all* of their data: e.g.,

4,5,2,a text response,another text response,2,3

and then just parse into and out of this format in code?  The Data
column would have to be a text column then, because it would end up
being very long.  But it would mean that there would only be one row
per user (and one UPDATE statement) instead of many.

Brian

> > I have a table with three columns, ID, Field, and Data.  There are
> > about 1000 rows with a given ID (and given an ID each row has a unique
[quoted text clipped - 24 lines]
> header is my spam trap, so I don't check it very often. You will get a
> quicker response by posting to the newsgroup.
Bob Barrows [MVP] - 10 Oct 2006 19:05 GMT
> Would it be better to redesign the database table so that each user
> has only one row and the Data column is just a comma-delimited
> version of *all* of their data: e.g.,

Absolutely not. It's not the worst thing you could do, but it's close.
Almost worse than creating multiple columns to hold the individual data
values.

> 4,5,2,a text response,another text response,2,3
>
> and then just parse into and out of this format in code?  The Data
> column would have to be a text column then, because it would end up
> being very long.  But it would mean that there would only be one row
> per user (and one UPDATE statement) instead of many.

My solution involves a single UPDATE statement as well. And data
integrity is not compromised.

Signature

Microsoft MVP -- ASP/ASP.NET
Please reply to the newsgroup. The email account listed in my From
header is my spam trap, so I don't check it very often. You will get a
quicker response by posting to the newsgroup.

brian - 10 Oct 2006 20:24 GMT
> Absolutely not. It's not the worst thing you could do, but it's close.
> Almost worse than creating multiple columns to hold the individual data
> values.

> My solution involves a single UPDATE statement as well. And data
> integrity is not compromised.

Why is this so bad?  There is no validation done on the values (the
user enters the data), so I don't think data integrity would be a
problem.  (Unless I'm not understanding what you mean.)  Also, having
one column with delimited values seems simpler and would seem to take
less time on the database end since the ASP code would put together
(and take apart) the delimited list of values.  It's not an elegant
solution, but it seems like it should work and alleviate the problems I
have been experiencing with timeouts, locking, and cascading blocking.

Brian
Bob Barrows [MVP] - 10 Oct 2006 20:35 GMT
>> Absolutely not. It's not the worst thing you could do, but it's
>> close. Almost worse than creating multiple columns to hold the
[quoted text clipped - 4 lines]
>
> Why is this so bad?

I'm not going to attempt to answer this here because:
1. A newsgroup is not the proper forum to teach relational database
design.
2. I am not an expert - I just know enough to realize what a bad idea it
is to put delimited data values in a database table.

I suggest you get a book or do a google search.

It may seem like a good solution for your immediate problem, but I can
almost guarantee it will lead to problems down the road.

Again, my proposed solution involved a single update to your table and
avoided the problems inherent in storing delimited data.

Signature

Microsoft MVP -- ASP/ASP.NET
Please reply to the newsgroup. The email account listed in my From
header is my spam trap, so I don't check it very often. You will get a
quicker response by posting to the newsgroup.

brian - 10 Oct 2006 20:56 GMT
OK, I agree that what I'm proposing is a kludgy solution.

I recently found this (suggesting using the CASE statement to make a
single UPDATE clause)...it sounds like it might help to speed up the
UPDATEs while maintaining the proper database structure:
http://www.sql-server-performance.com/nb_case_statement.asp

Do you think this will work to improve the performance of the script to
the point of being usable (and remove the problems with timeouts,
locking, cascading blocking)?  My understanding is that 30 UPDATE
statements will require 30 passes through the table (which takes
forever with hundreds of thousands or millions of records) while using
the CASE statement will only require 1 pass through the table.

Brian

> >> Absolutely not. It's not the worst thing you could do, but it's
> >> close. Almost worse than creating multiple columns to hold the
[quoted text clipped - 24 lines]
> header is my spam trap, so I don't check it very often. You will get a
> quicker response by posting to the newsgroup.
Bob Barrows [MVP] - 10 Oct 2006 23:25 GMT
> OK, I agree that what I'm proposing is a kludgy solution.
>
[quoted text clipped - 9 lines]
> forever with hundreds of thousands or millions of records) while using
> the CASE statement will only require 1 pass through the table.

That might help ... and it might not. There are too many factors involved
about which we know nothing: indexes, extent of fragmentation, etc. The only
way to answer the question is to try both solutions and see which one.

My issue with this solution is that it cannot be done without dynamic sql. I
would personally prefer a solution that does not involve concatenating a
potentially large sql statement, especially since this will leave you
vulnerable to sql injection, no matter how stringently you validate the user
inputs.
Signature

Microsoft MVP - ASP/ASP.NET
Please reply to the newsgroup. This email account is my spam trap so I
don't check it very often. If you must reply off-line, then remove the
"NO SPAM"

Mike Brind - 10 Oct 2006 20:55 GMT
>> Absolutely not. It's not the worst thing you could do, but it's close.
>> Almost worse than creating multiple columns to hold the individual data
[quoted text clipped - 13 lines]
>
> Brian

Here's one reason:  If at some stage in the future you need to do something
with these values, and the number of values changes over time (more
questions get added), you will spend huge amounts of time producing
server-side code to iterate through variable length delimited lists to get
to the value that you want - if it's there.

Here's another: If the requirement changes such that a stakeholder in the
project wants to know how many people answered "2" to question 3, how will
you go about getting a count easily?

--
Mike Brind
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.