Normalized datatables, stack arrays, and lots of data

Guest

I have a 20 MB in-memory stack-based array and I'd like to normalise it in
the client's memory... then add foreign keys, and display the results on a
datagrid. I discovered that when I convert the stack to a DataTable, my
memory utilisation increases to 120 MB (a lot of overhead).

A couple of questions:
1) Is that memory increase typical? All I want to do is bind it to a
DataGrid.

2) Suppose I dump it to a CSV file or a SQL database. Is it possible to
retrieve only a subset of all that data? ...And page through the data as the
user scrolls (given that this is not an ASP application)?

3) Lastly, and **perhaps most importantly**: how do people in the real
world access and update normalised data in large databases like this one
from C#? Is the data joined by a stored procedure, by in-memory DataTable
foreign keys, or by a Transact-SQL command issued by the client?
 
Scottie_do,
I'd ask first why you want to bind the equivalent of a 20 MB array to a
DataGrid.

Realistically, is the user going to need to view all this data in one go?
DataTables carry a lot of extra baggage and features, so yes, you would
expect the memory consumption to be higher.

If you store it in SQL Server, you can of course write a paging algorithm.
There are a number of examples of stored procedures that will return "per
page" resultsets for display in a grid via a DataTable.

Typically, the data in the database is already normalized, so the results
are obtained through a multi-table join query.
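
For example (table and column names invented), the client asks the server
for a joined, display-ready resultset rather than joining DataTables in
memory:

    using System.Data;
    using System.Data.SqlClient;

    // The server joins the normalized tables; the client receives only the
    // flattened rows it is about to display.
    DataTable LoadOrderView(string connString)
    {
        const string joinSql =
            @"SELECT o.OrderId, c.CustomerName, o.OrderDate
              FROM   Orders o
              JOIN   Customers c ON c.CustomerId = o.CustomerId";
        DataTable view = new DataTable();
        new SqlDataAdapter(joinSql, connString).Fill(view);
        return view;
    }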

Peter
 
I'm analysing data at runtime and need to provide a few filtered views. I
don't have to bind all the tables to the grid, since I'm planning on tossing
irrelevant data or exporting it for future analysis.

Most of the paging solutions I've seen target a Web-based model. Since I
want my app to be more of an Avalon/interactive app, I'd like to ship it as
an .exe.

The target database hasn't been created yet; this aspect of the project is
still in design. I'm just trying to come up with a solution that will allow
several real-time feeds to write to the database, normalise the data (to
save space), obtain unique keys, and let clients instantly see the
aggregated data.

I'm thinking I could do the normalisation on the "stream reader" client,
which would feed a JIT display engine that discards unnecessary data, or
fork off a stream to the SQL database. Should the forked stream be
normalised by the client/stream reader, or should it be processed by the
SQL engine? This can be a lot of data, so I'm trying to offload as much as
possible.
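
A rough sketch of that fork, with invented types (Record, the relevance
test, and the queue consumers are all assumptions): the reader pushes every
record toward the database, while only relevant records reach the display
engine.

    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;

    class Record { public int Key; public double Value; } // stand-in feed row

    class FeedFork
    {
        // Drained by a background writer (e.g. SqlBulkCopy into a staging table).
        public readonly ConcurrentQueue<Record> DbQueue =
            new ConcurrentQueue<Record>();
        // Drained by the JIT display engine; irrelevant rows never get here.
        public readonly ConcurrentQueue<Record> DisplayQueue =
            new ConcurrentQueue<Record>();

        public void Pump(IEnumerable<Record> feed, Predicate<Record> isRelevant)
        {
            foreach (Record r in feed)
            {
                DbQueue.Enqueue(r);          // everything is persisted
                if (isRelevant(r))
                    DisplayQueue.Enqueue(r); // only what the grid needs now
            }
        }
    }

One common pattern for the client-vs-server question is to bulk-load the
raw rows into a staging table and let set-based SQL on the server do the
normalisation, since that's exactly what the engine is good at.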

-Chris
 
Scottie_do said:
I have a 20 MB in-memory stack-based array and I'd like to normalise it
in the client's memory... then add foreign keys, and display the results
on a datagrid. I discovered that when I convert the stack to a DataTable,
my memory utilisation increases to 120 MB (a lot of overhead).

A couple of questions:
1) Is that memory increase typical? All I want to do is bind it to a
DataGrid.

Yes, although the data in a DataTable itself isn't stored with much
overhead; binding it to a grid is what's expensive.
2) Suppose I dump it to a CSV file or a SQL database. Is it possible to
retrieve only a subset of all that data? ...And page through the data as
the user scrolls (given that this is not an ASP application)?

Yes, paging is easily done on every database out there, except MS Access
(Access doesn't offer 'server-side' paging). You typically retrieve a page
of data from the db, bind that to a grid, the user works on that, the user
clicks a 'next' button or whatever, and the next page is retrieved. This is
of course also doable in a .exe.

Most databases offer very easy paging code; SQL Server has problems in
this area though, except SQL Server 2005, which offers a slightly better
way of doing paging (though it still isn't optimal, as you have to specify
a sort column for ROW_NUMBER()). If reading is your primary concern, I'd
take this into account when selecting the db.
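
For example, the SQL Server 2005 pattern looks like this (Orders and
OrderId are placeholders); note the mandatory sort column inside OVER(...):

    using System.Data;
    using System.Data.SqlClient;

    // Page rows on the server with ROW_NUMBER(); only rows @first..@last
    // ever cross the wire.
    DataTable GetPage(string connString, int pageNumber, int pageSize)
    {
        const string sql =
            @"SELECT *
              FROM (SELECT ROW_NUMBER() OVER (ORDER BY OrderId) AS RowNum, *
                    FROM Orders) AS numbered
              WHERE RowNum BETWEEN @first AND @last";
        DataTable page = new DataTable();
        using (SqlConnection conn = new SqlConnection(connString))
        using (SqlCommand cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@first", (pageNumber - 1) * pageSize + 1);
            cmd.Parameters.AddWithValue("@last", pageNumber * pageSize);
            new SqlDataAdapter(cmd).Fill(page);
        }
        return page;
    }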
3) Lastly, and **perhaps most importantly**: how do people in the real
world access and update normalised data in large databases like this one
from C#? Is the data joined by a stored procedure, by in-memory DataTable
foreign keys, or by a Transact-SQL command issued by the client?

Large databases are treated like small databases: retrieve the data you
want to work with at time T, and only that data, not the data you'll want
to work with at T+t. Use the RDBMS features available to retrieve the data
you want to work with; don't do things in memory, as that's always slower
and you always have to keep more data around. So if you have to process
10,000 rows, no one will load them all up front and bind them to a grid.
Instead, load 50 up front, show them in the grid, and let the user proceed
from there.
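
One minimal way to do that from an .exe (a sketch, again assuming an
invented Orders/OrderId schema): use TOP plus a keyset bookmark, so each
"next" asks the server for just the rows after the last one shown.

    using System.Data;
    using System.Data.SqlClient;

    // Return the next 50 rows after the last key the user has seen.
    DataTable NextRows(string connString, int lastKeySeen)
    {
        const string sql =
            @"SELECT TOP 50 *
              FROM Orders
              WHERE OrderId > @lastKey
              ORDER BY OrderId";
        DataTable rows = new DataTable();
        using (SqlConnection conn = new SqlConnection(connString))
        using (SqlCommand cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@lastKey", lastKeySeen);
            new SqlDataAdapter(cmd).Fill(rows);
        }
        return rows;
    }

    // First page: NextRows(connString, 0); then pass the last OrderId shown.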

FB

 