PC Review


Reply
Thread Tools Rate Thread

Design / best practices question

 
 
trullock@googlemail.com
Guest
Posts: n/a
 
      15th Aug 2008
Hi,

Can anyone suggest the best way to go about the following...

I'm tracking clicks (mouse down x,y coordinates) on a web page by
using some javascript to create an XHR which sends the coordinates to
a recording service, such as:

/Record.ashx?X=123&Y=456

In record.ashx, I'm wondering about the best way to record all these
values to the database.

In the end, when selecting the data from the database, all i need to
know is X, Y and a count. so for example:

123,456,10 would mean there were 10 clicks at 123,456.

I'm figuring that since this service is going to be under quite high
load, that making an sqlconnection inside record.ashx for every
request would be a bad idea?

I'm wondering if i should make a static array in the record servce,
which then performs a dump at regular intervals.

I'm also trying to learn LINQ and i'm wondering if there some trickery
i can employ to make this as lightweight as possible.

As an asside, do you think its best to try and maintain and X,Y,Count
table from the outset, or to just dump clicks into a raw table, which
gets periodically parsed into the grouped table?

Thanks

Trull
 
Reply With Quote
 
 
 
 
Cowboy \(Gregory A. Beamer\)
Guest
Posts: n/a
 
      15th Aug 2008

<(E-Mail Removed)> wrote in message
news:ffdc35c1-b868-490d-b4b0-(E-Mail Removed)...
> Hi,
>
> Can anyone suggest the best way to go about the following...
>
> I'm tracking clicks (mouse down x,y coordinates) on a web page by
> using some javascript to create an XHR which sends the coordinates to
> a recording service, such as:
>
> /Record.ashx?X=123&Y=456
>
> In record.ashx, I'm wondering about the best way to record all these
> values to the database.
>
> In the end, when selecting the data from the database, all i need to
> know is X, Y and a count. so for example:
>
> 123,456,10 would mean there were 10 clicks at 123,456.
>
> I'm figuring that since this service is going to be under quite high
> load, that making an sqlconnection inside record.ashx for every
> request would be a bad idea?


First, what is "quite high load"? Are you expecting this service to be a
google?

Second, what do you think the overhead is to instantiate a SQL connection?

What I am challenging here is whether the creation of the SQL connection is
really a problem. Sure, instantiating a SQL connection is an an expensive
endeavor, but you pull from the pool anyway. Under load, the pool will
retain all of the connection objects, so the overhead, per call, is in the
millisecond range at max and more likely to be microseconds. What I am
getting at is there are other ways to scale other than avoiding putting data
in the database. You can scale vertically by getting beefier machines. You
can spread the web work on to multiple machines via a cluster. You can move
the database to its own server.

Back to the overhead to instantiate a connection issue. In ADO.NET, you do
see a lot of overhead when you have a system that only takes a hit once
every couple of minutes. This is due ot the fact the connection pool is
cleared after 30 seconds (default). On heavily used apps, ADO.NET is
constantly pulling from the connection pool, so the instantiation time is
reduced tremendously.

> I'm wondering if i should make a static array in the record servce,
> which then performs a dump at regular intervals.


How important is an accurate count? - a machine failure here means you are
way off, as the array simply dies. At least with a persistant method, you
will be close, even with sudden heart failure on your system.
How are you going to do the dump? - copy the array, clear the original and
then fire another thread to save? Create a new array to service calls, and
fire another thread to save? Use a multi-threaded round robin? All possible,
but add a lot of complexity to your solution. More moving parts means more
places for potential failure.

What you end up doing in situations like this is you just delay your problem
by trading overhead per click (very small) to large overhead later on. There
are certainly times to do this, but you need to ask yourself whether the
extra programming, for safety, is warranted and the likelihood of a sudden
heart failure on your machine(s).

In short, know your risks from both directions. Trading a bit of perf for a
lack of safety? Might be worth it to you; I cannot answer that.

> I'm also trying to learn LINQ and i'm wondering if there some trickery
> i can employ to make this as lightweight as possible.


LINQ is not a data access method, per se. It is also not the best option to
use it as such when you need to disconnect your data, as the data context
desires an open connection all the time.

> As an asside, do you think its best to try and maintain and X,Y,Count
> table from the outset, or to just dump clicks into a raw table, which
> gets periodically parsed into the grouped table?


I assume you are working with a static image (meaning you are collecting
clicks on a single image of fixed size). If so, I see no reason to use a raw
table and parse. Consider this:

CREATE PROCEDURE TestProc
(
@x int
, @y int
, @clicks int
)
AS

UPDATE Table
SET clicks = clicks + @clicks
WHERE x = @x
AND y = @y

This would allow you to do either of the ways you are talking about (count
click by click or offline for later).

--
Gregory A. Beamer
MVP, MCP: +I, SE, SD, DBA

Subscribe to my blog
http://feeds.feedburner.com/GregoryBeamer#

or just read it:
http://feeds.feedburner.com/GregoryBeamer

********************************************
| Think outside the box! |
********************************************

 
Reply With Quote
 
sloan
Guest
Posts: n/a
 
      15th Aug 2008

You might consider using xml and pass down the values you need to track in
"sets of X number" or "after y number of seconds".

http://pratchev.blogspot.com/2007/06...rver-2005.html

http://support.microsoft.com/kb/315968

If you do it like the KB method, pay attention to the "element based"
comment at the top link.

................




<(E-Mail Removed)> wrote in message
news:ffdc35c1-b868-490d-b4b0-(E-Mail Removed)...
> Hi,
>
> Can anyone suggest the best way to go about the following...
>
> I'm tracking clicks (mouse down x,y coordinates) on a web page by
> using some javascript to create an XHR which sends the coordinates to
> a recording service, such as:
>
> /Record.ashx?X=123&Y=456
>
> In record.ashx, I'm wondering about the best way to record all these
> values to the database.
>
> In the end, when selecting the data from the database, all i need to
> know is X, Y and a count. so for example:
>
> 123,456,10 would mean there were 10 clicks at 123,456.
>
> I'm figuring that since this service is going to be under quite high
> load, that making an sqlconnection inside record.ashx for every
> request would be a bad idea?
>
> I'm wondering if i should make a static array in the record servce,
> which then performs a dump at regular intervals.
>
> I'm also trying to learn LINQ and i'm wondering if there some trickery
> i can employ to make this as lightweight as possible.
>
> As an asside, do you think its best to try and maintain and X,Y,Count
> table from the outset, or to just dump clicks into a raw table, which
> gets periodically parsed into the grouped table?
>
> Thanks
>
> Trull



 
Reply With Quote
 
trullock@googlemail.com
Guest
Posts: n/a
 
      15th Aug 2008
On Aug 15, 3:09*pm, "Cowboy \(Gregory A. Beamer\)"
<NoSpamMgbwo...@comcast.netNoSpamM> wrote:
> <trull...@googlemail.com> wrote in message
>
> news:ffdc35c1-b868-490d-b4b0-(E-Mail Removed)...
>
>
>
> > Hi,

>
> > Can anyone suggest the best way to go about the following...

>
> > I'm tracking clicks (mouse down x,y coordinates) on a web page by
> > using some javascript to create an XHR which sends the coordinates to
> > a recording service, such as:

>
> > /Record.ashx?X=123&Y=456

>
> > In record.ashx, I'm wondering about the best way to record all these
> > values to the database.

>
> > In the end, when selecting the data from the database, all i need to
> > know is X, Y and a count. so for example:

>
> > 123,456,10 would mean there were 10 clicks at 123,456.

>
> > I'm figuring that since this service is going to be under quite high
> > load, that making an sqlconnection inside record.ashx for every
> > request would be a bad idea?

>
> First, what is "quite high load"? Are you expecting this service to be a
> google?
>
> Second, what do you think the overhead is to instantiate a SQL connection?
>
> What I am challenging here is whether the creation of the SQL connection is
> really a problem. Sure, instantiating a SQL connection is an an expensive
> endeavor, but you pull from the pool anyway. Under load, the pool will
> retain all of the connection objects, so the overhead, per call, is in the
> millisecond range at max and more likely to be microseconds. What I am
> getting at is there are other ways to scale other than avoiding putting data
> in the database. You can scale vertically by getting beefier machines. You
> can spread the web work on to multiple machines via a cluster. You can move
> the database to its own server.
>
> Back to the overhead to instantiate a connection issue. In ADO.NET, you do
> see a lot of overhead when you have a system that only takes a hit once
> every couple of minutes. This is due ot the fact the connection pool is
> cleared after 30 seconds (default). On heavily used apps, ADO.NET is
> constantly pulling from the connection pool, so the instantiation time is
> reduced tremendously.
>
> > I'm wondering if i should make a static array in the record servce,
> > which then performs a dump at regular intervals.

>
> How important is an accurate count? - a machine failure here means you are
> way off, as the array simply dies. At least with a persistant method, you
> will be close, even with sudden heart failure on your system.
> How are you going to do the dump? - copy the array, clear the original and
> then fire another thread to save? Create a new array to service calls, and
> fire another thread to save? Use a multi-threaded round robin? All possible,
> but add a lot of complexity to your solution. More moving parts means more
> places for potential failure.
>
> What you end up doing in situations like this is you just delay your problem
> by trading overhead per click (very small) to large overhead later on. There
> are certainly times to do this, but you need to ask yourself whether the
> extra programming, for safety, is warranted and the likelihood of a sudden
> heart failure on your machine(s).
>
> In short, know your risks from both directions. Trading a bit of perf fora
> lack of safety? Might be worth it to you; I cannot answer that.
>
> > I'm also trying to learn LINQ and i'm wondering if there some trickery
> > i can employ to make this as lightweight as possible.

>
> LINQ is not a data access method, per se. It is also not the best option to
> use it as such when you need to disconnect your data, as the data context
> desires an open connection all the time.
>
> > As an asside, do you think its best to try and maintain and X,Y,Count
> > table from the outset, or to just dump clicks into a raw table, which
> > gets periodically parsed into the grouped table?

>
> I assume you are working with a static image (meaning you are collecting
> clicks on a single image of fixed size). If so, I see no reason to use a raw
> table and parse. Consider this:
>
> CREATE PROCEDURE TestProc
> (
> * * @x * *int
> * * , @y * *int
> * * , @clicks * *int
> )
> AS
>
> UPDATE Table
> * * SET clicks = clicks + @clicks
> WHERE x = @x
> AND y = @y
>
> This would allow you to do either of the ways you are talking about (count
> click by click or offline for later).
>
> --
> Gregory A. Beamer
> MVP, MCP: +I, SE, SD, DBA
>
> Subscribe to my bloghttp://feeds.feedburner.com/GregoryBeamer#
>
> or just read it:http://feeds.feedburner.com/GregoryBeamer
>
> ********************************************
> | Think outside the box! * * * * * * * * * * * * * * * |
> ********************************************


Hi Gregory,

thanks for your feedback.

You made several relevant points; about vertical scaleability and
count accuracy.

What i'm doing is currently just a proof of concept to see how things
go. If load is getting too high, getting better machines may be an
option along with you other suggestions. However initially, i just
wanted to make sure i'd taken the best route possible in the first
place.

If you don't see a problem with using a connection and command to do a
single insert per click, then thats cool. I'll give it a whirl and see
how it holds up

The image isn't a fixed size, which means i either need to:

a) always insert, then: select x, y, count(1) group by x, y
a) always insert into a holding table, then batch update a table with
a count field periodically, and always select x,y, clicks from the
aggregated table
b) check if it exists, if so, update, else insert, then: select
x,y,clicks

(also there will be multiple images running at the same time, so all
the data has an imageid associated with it.)

I'm just trying to learn where the performance will be won or lost

Thanks for your advice

Trull
http://www.muonlab.com
 
Reply With Quote
 
Cowboy \(Gregory A. Beamer\)
Guest
Posts: n/a
 
      16th Aug 2008

<(E-Mail Removed)> wrote in message
news:9807a9b9-422a-4554-960f-(E-Mail Removed)...
On Aug 15, 3:09 pm, "Cowboy \(Gregory A. Beamer\)"
<NoSpamMgbwo...@comcast.netNoSpamM> wrote:
>
>What i'm doing is currently just a proof of concept to see how things
>go.


Welcome to my world! :-)

>If load is getting too high, getting better machines may be an
>option along with you other suggestions. However initially, i just
>wanted to make sure i'd taken the best route possible in the first
>place.


Best route, at first, is to code the simplest way you can, which is to
insert on each click. You then create a click simulator and pound the
backend and see where the system breaks down. I believe you will find it is
much higher than you think. The next test is to add stress on the web at the
same level, while you hammer the back end. Does not necessarily have to be
clicks through the UI, but you can do this. The idea here is just to see
where the single box breaks, with the added load of pages being served. This
will be lower than the first shot.

Most likely, you will run these tests on your dev box, which sucks compared
to the server(s) (hopefully (s) instead of not) you will put the solution
on, but will give you a worst case benchmark, which is good to have. If the
line is above your final solution, you are golden.

>If you don't see a problem with using a connection and command to do a
>single insert per click, then thats cool. I'll give it a whirl and see
>how it holds up


If the simplest solution does not perform to where you feel you need it,
then you have multiple options.

1. Move to beefier boxes
2. Employ a multi-threaded solution
3. Add a service that handles the clicks off line (safer than merely loading
in a list)
etc.

>The image isn't a fixed size, which means i either need to:


>a) always insert, then: select x, y, count(1) group by x, y
>a) always insert into a holding table, then batch update a table with
>a count field periodically, and always select x,y, clicks from the
>aggregated table
>b) check if it exists, if so, update, else insert, then: select
>x,y,clicks


There are ways to handle this. ONe is to gather metadata from images you
will serve when they are loaded into the application. This will give you the
parameters.

Just out of curiosity, are you working with maps? Does not matter, as the
algorithm is essentially the same, but I do a lot with maps.

>(also there will be multiple images running at the same time, so all
>the data has an imageid associated with it.)


Then the metadata route makes a lot of sense.

>I'm just trying to learn where the performance will be won or lost


You are probably going to hammer your SQL Server box first. But, pulling all
of the work on the web server, where companies generally skimp, does not
work in most cases. It is better to cluster the SQL boxes and/or move the
actual platters to a high speed NAS or SAN (which is fairly standard for
clustering). The end result is the SQL box hums.

--
Gregory A. Beamer
MVP, MCP: +I, SE, SD, DBA

Subscribe to my blog
http://feeds.feedburner.com/GregoryBeamer#

or just read it:
http://feeds.feedburner.com/GregoryBeamer

********************************************
| Think outside the box! |
********************************************

 
Reply With Quote
 
 
 
Reply

Thread Tools
Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Design / Good practices question Jorge Microsoft C# .NET 3 18th Feb 2008 12:28 AM
Design/Practices question... groups.james@gmail.com Microsoft ASP .NET 2 1st Dec 2006 09:44 PM
Best practices question - Data Structure design. Sasha Microsoft VB .NET 0 26th Sep 2003 07:48 AM
Best practices question - Data Structure design. Sasha Microsoft Dot NET 0 26th Sep 2003 07:48 AM
Best practices question - Data Structure design. Sasha Microsoft C# .NET 0 26th Sep 2003 07:48 AM


Features
 

Advertising
 

Newsgroups
 


All times are GMT +1. The time now is 07:40 PM.