Hi,
I have several gigabytes of CSV files (comma-separated values, in text).
I find that converting the text to numbers is a bit slow; converting to double or int seems about the same. Using Split seems to take longer than Substring, and there seems to be no alternative to Parse/TryParse, although I haven't tested to see which of those is faster.
It reads the file at a rate of about 1 million lines a second, which is probably as fast as the system will allow, but it only converts about 100k lines a second with 5 numbers in each line. That seems rather slow; it must be many thousands of instructions per conversion.
I'm running on a 64-bit AMD at ~2 GHz with 3 GB RAM. I do this in assembler on my PIC MCU in a lot less, lol.
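For what it's worth, the two parsing approaches I'm comparing look roughly like this (the field count and names are just for illustration, not my actual code):

```csharp
using System;

class CsvParseSketch
{
    // Split-based: allocates a string[] plus one string per field, every line.
    static double[] ParseWithSplit(string line)
    {
        string[] parts = line.Split(',');
        double[] vals = new double[parts.Length];
        for (int i = 0; i < parts.Length; i++)
            vals[i] = double.Parse(parts[i]);
        return vals;
    }

    // Substring-based: skips the intermediate array, but still makes
    // one string per field for Parse to consume.
    static double[] ParseWithSubstring(string line, int fields)
    {
        double[] vals = new double[fields];
        int start = 0;
        for (int i = 0; i < fields; i++)
        {
            int comma = line.IndexOf(',', start);
            int len = (comma < 0 ? line.Length : comma) - start;
            vals[i] = double.Parse(line.Substring(start, len));
            start = comma + 1;
        }
        return vals;
    }

    static void Main()
    {
        double[] v = ParseWithSubstring("1.5,2.5,3.5,4.5,5.5", 5);
        Console.WriteLine(v[0] + " " + v[4]); // 1.5 5.5
    }
}
```

Both still hand a freshly allocated string to Parse for every field; I don't see a way around that short of parsing the digits myself.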
The number crunching, which is quite involved, can also do about 100k lines a second. I have separate threads for each step, so I can delay the start of the next thread to see how fast each step is; otherwise it's processed and displayed as it comes in.
I could do with speeding up the number crunching too. I found it considerably faster to use structs instead of classes to organize the data, which is a couple of complex numbers per point, plus a DateTime. The data is stored in chunks in a List<>, but there are lots of function calls, as it's quite structured, with a complex-number class I copied from somewhere.
How come C# doesn't have its own complex class, or even a complex variable type?
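The complex class I copied is roughly equivalent to a struct like this (my own sketch, not the actual code I'm using):

```csharp
using System;

// Minimal complex-number struct of the kind described. As a struct it
// lives inline in the array/List storage, so there is no heap
// allocation per value, which matters at millions of points.
struct Complex
{
    public double Re, Im;

    public Complex(double re, double im) { Re = re; Im = im; }

    public static Complex operator +(Complex a, Complex b)
    {
        return new Complex(a.Re + b.Re, a.Im + b.Im);
    }

    public static Complex operator *(Complex a, Complex b)
    {
        return new Complex(a.Re * b.Re - a.Im * b.Im,
                           a.Re * b.Im + a.Im * b.Re);
    }

    public double MagnitudeSquared { get { return Re * Re + Im * Im; } }
}

// Per-point record as in my data: two complex values plus a timestamp.
struct DataPoint
{
    public Complex A, B;
    public DateTime Time;
}
```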
Do simple functions get inlined? Or is that not happening in debug builds? I read that the JIT makes up its own mind about inlining. Structs get passed by value on the stack, though. I tried using ref, but got into trouble, as then you can't pass the return values of function calls directly.
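This is the sort of thing I mean with ref (names made up for the example):

```csharp
using System;

class RefSketch
{
    struct Big { public double A, B, C, D; }

    // By value: the whole 32-byte struct is copied for the call.
    static double SumByValue(Big b) { return b.A + b.B + b.C + b.D; }

    // By ref: only a reference is passed, but it needs a variable.
    static double SumByRef(ref Big b) { return b.A + b.B + b.C + b.D; }

    static Big Make() { Big b; b.A = 1; b.B = 2; b.C = 3; b.D = 4; return b; }

    static void Main()
    {
        Console.WriteLine(SumByValue(Make()));    // fine: pass the result directly
        // Console.WriteLine(SumByRef(ref Make())); // won't compile: ref wants a variable
        Big tmp = Make();                          // workaround: a local first
        Console.WriteLine(SumByRef(ref tmp));
    }
}
```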
I also run into memory problems. I need to keep all the data I read in, as it takes a long time to convert, then do statistical noise reduction and more than one FFT, and be able to change parameters and see the results quickly.
I store the files in 1-hour chunks, which is about 150k records. I found it slightly better to use an array[] and then resize the array when it is finished; changing the initial size of the array, and the increment when it's too small, considerably affects the maximum memory used. I managed to get the memory down to about 115% of the size of the data I have, although I only have a few days of data atm. With a year's worth I'm going to run into problems, I think; it's probably going to be more than 4 GB.
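The grow-then-trim pattern I'm using looks something like this (the initial size and increment are the numbers I'm tuning, and DataPoint stands in for my real struct):

```csharp
using System;

struct DataPoint
{
    public double Re1, Im1, Re2, Im2;
    public DateTime Time;
}

class ChunkBuffer
{
    // Initial size: a bit over one 1-hour chunk (~150k records).
    DataPoint[] items = new DataPoint[160000];
    int count;

    public void Add(DataPoint p)
    {
        if (count == items.Length)
            Array.Resize(ref items, items.Length + 20000); // increment when too small
        items[count++] = p;
    }

    // Trim to the exact count once the chunk is finished,
    // so no slack capacity is kept for the lifetime of the data.
    public DataPoint[] Finish()
    {
        Array.Resize(ref items, count);
        return items;
    }
}
```

Array.Resize copies the whole array each time, so a too-small increment costs time while a too-large one costs peak memory; that trade-off is exactly what I'm seeing.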
Maybe I should store the input data as binary numbers, or is there a way to speed this up? I would need to store both. Maybe I just thought of an idea... have a binary cache. Could I simply map the file into memory and treat it as an array of structs? I've done something like this before with C++. I guess it would need to be unsafe code. Is there much scope for speeding things up with unsafe code?
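The binary cache idea would be something like this (a sketch only, I haven't written it). I don't think safe C# can map a file straight onto an array of structs the way C++ can, so this just streams the doubles through BinaryWriter/BinaryReader; reloading then skips the text parsing entirely:

```csharp
using System;
using System.IO;

// Binary cache: after parsing a CSV chunk once, save the raw doubles
// to a .bin file next to it; later runs load the binary instead.
class BinaryCache
{
    public static void Save(string path, double[] values)
    {
        using (BinaryWriter w = new BinaryWriter(File.Create(path)))
        {
            w.Write(values.Length);            // record count header
            foreach (double v in values)
                w.Write(v);                    // 8 bytes per value, no text
        }
    }

    public static double[] Load(string path)
    {
        using (BinaryReader r = new BinaryReader(File.OpenRead(path)))
        {
            double[] values = new double[r.ReadInt32()];
            for (int i = 0; i < values.Length; i++)
                values[i] = r.ReadDouble();
            return values;
        }
    }
}
```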
I'm also wondering if it's worth going for a 64-bit version of WinXP. Has anyone been down this route with C# who knows if it's that much better, or has any disadvantages? I know some things aren't compatible.
thanks
Colin =^.^=