Generating very large string in C#


Dukkov

Hi Folks,

I need to generate a very large string (1 MB or so) in my C# code, so I
can test the code.

What is the most elegant way to do so?

Thanks!

Dim
 
Is this really elegant?

What if I use directly StringBuilder str = new
StringBuilder(UInt32.Max)? or smaller capacity?

Thanks!
 
Is this really elegant?

What if I use directly StringBuilder str = new
StringBuilder(UInt32.Max)? or smaller capacity?

This won't compile - for one, StringBuilder takes an Int32 argument, and
for another, it's UInt32.MaxValue. If you pass Int32.MaxValue, it should
work on a 64-bit system, although it might be slow if there isn't enough
memory.
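To make the point about the constructor argument concrete, here is a minimal sketch of pre-sizing a StringBuilder for about 1 MB of text. The class and method names (BuilderCapacityDemo, CreateBuilder) are made up for illustration; only StringBuilder(int) and Append(char, int) come from the framework.

```csharp
using System;
using System.Text;

public static class BuilderCapacityDemo
{
    // Hypothetical helper: returns a StringBuilder pre-sized to hold
    // roughly targetBytes of UTF-16 text (2 bytes per char).
    public static StringBuilder CreateBuilder(int targetBytes)
    {
        int capacityInChars = targetBytes / sizeof(char);
        return new StringBuilder(capacityInChars);
    }

    public static void Main()
    {
        // new StringBuilder(uint.MaxValue) would not compile:
        // the capacity parameter is an int, and uint does not convert implicitly.
        int targetBytes = 1024 * 1024;
        StringBuilder sb = CreateBuilder(targetBytes);

        // Fill the builder with a repeated character.
        sb.Append('x', targetBytes / sizeof(char));
        Console.WriteLine(sb.Length); // 524288
    }
}
```

The capacity is only a hint about how much to allocate up front; the builder still grows if more is appended.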

The OP specified on the order of 1MB, so talk of such large cases as 4GB
(int.MaxValue * sizeof(char)) is fairly irrelevant.

If the task is generating a string in an output fashion only (i.e. no
random access, replacements, insertions etc. needed), where it has to
scale far beyond 1MB or so, writing to a file will be more efficient,
but then the string won't be in memory, and so the source code itself
that works with the string may be less elegant.
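As a rough sketch of the output-only case, the data can be streamed to a file instead of being held in one string. The names here (StreamedOutputDemo, WriteRepeated) and the sample line are assumptions for illustration, not anything from the original question.

```csharp
using System;
using System.IO;

public static class StreamedOutputDemo
{
    // Hypothetical helper: writes `line` to `path` `count` times and
    // returns the resulting file size in bytes.
    public static long WriteRepeated(string path, string line, int count)
    {
        using (var writer = new StreamWriter(path))
        {
            for (int i = 0; i < count; i++)
            {
                writer.WriteLine(line);
            }
        }
        return new FileInfo(path).Length;
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            long bytes = WriteRepeated(path, "some test data", 100000);
            Console.WriteLine(bytes + " bytes written");
        }
        finally
        {
            File.Delete(path); // clean up the temporary file
        }
    }
}
```

Only one line is in memory at a time, so this scales well past the sizes where a single string becomes impractical.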

-- Barry
 
I need to generate a very large string (1 MB or so) in my C# code, so I
can test the code.

What is the most elegant way to do so?

string large = new string('?', 512*1024);

Should take roughly 1 MB.


Mattias
 
I need to generate a very large string (1 MB or so) in my C# code, so I
can test the code.

What is the most elegant way to do so?

string x = new string ('x', 1024*1024/2);

(The division by two is to take account of each character being 2
bytes.)
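The arithmetic behind both one-liners can be spelled out in a small helper. This is only a sketch; LargeStringDemo and MakeFilledString are hypothetical names, and the size counts character data only, not the object header.

```csharp
using System;

public static class LargeStringDemo
{
    // Hypothetical helper: builds a string whose character data occupies
    // about targetBytes, given that a .NET char is 2 bytes (UTF-16).
    public static string MakeFilledString(int targetBytes, char fill)
    {
        return new string(fill, targetBytes / sizeof(char));
    }

    public static void Main()
    {
        string large = MakeFilledString(1024 * 1024, 'x');
        Console.WriteLine(large.Length);                // 524288 characters
        Console.WriteLine(large.Length * sizeof(char)); // 1048576 bytes
    }
}
```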
 
| (e-mail address removed) wrote:
|
| > Is this really elegant?
| >
| > What if I use directly StringBuilder str = new
| > StringBuilder(UInt32.Max)? or smaller capacity?
|
| This won't compile - for one, StringBuilder takes an Int32 argument, and
| for another, it's UInt32.MaxValue. If you pass Int32.MaxValue, it should
| work on a 64-bit system, although it might be slow if there isn't enough
| memory.
|

It won't work on a 64-bit system either; a string (like any other CLR
object) is limited to 2GB irrespective of the OS version.
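A quick calculation shows why a string of Int32.MaxValue characters cannot fit under a 2GB object limit. The names below (ObjectSizeLimitDemo, Utf16PayloadBytes, TwoGigabytes) are made up for this sketch, and the figure ignores the object header.

```csharp
using System;

public static class ObjectSizeLimitDemo
{
    // 2GB expressed in bytes, used here as the assumed per-object limit.
    public const long TwoGigabytes = 2L * 1024 * 1024 * 1024;

    // Bytes of UTF-16 character data needed for charCount characters.
    public static long Utf16PayloadBytes(int charCount)
    {
        return (long)charCount * sizeof(char);
    }

    public static void Main()
    {
        long needed = Utf16PayloadBytes(int.MaxValue);
        Console.WriteLine(needed);                // 4294967294
        Console.WriteLine(needed > TwoGigabytes); // True
    }
}
```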

Willy.
 
Willy Denoyette said:
It won't work on a 64-bit system either; a string (like any other CLR
object) is limited to 2GB irrespective of the OS version.

I have not had the luxury or necessity to test on a 64-bit system
yet... :)

-- Barry
 
|
| > It won't work on a 64-bit system either; a string (like any other CLR
| > object) is limited to 2GB irrespective of the OS version.
|
| I have not had the luxury or necessity to test on a 64-bit system
| yet... :)
|

If you can't test it yourself, just read this snippet:
<First some background; in the 2.0 version of the .Net runtime (CLR) we made
a conscious design decision to keep the maximum object size allowed in the
GC Heap at 2GB, even on the 64-bit version of the runtime.>

from: http://blogs.msdn.com/joshwil/archive/2005/08/10/450202.aspx

Willy.
 
Willy Denoyette said:
If you can't test it yourself, just read this snippet:

Yes. Now that you mentioned it, I do dimly recall this. However, I
usually test everything I post on the newsgroup here, so I would have
tested it anyway.

-- Barry
 
Still not a good idea if the string is to be built over several steps,
rather than just one assignment. StringBuilder would be better.
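For the multi-step case, a minimal sketch of building a string with StringBuilder might look like the following. StepwiseBuildDemo and BuildInSteps are hypothetical names, and the "line N" content is just placeholder test data.

```csharp
using System;
using System.Text;

public static class StepwiseBuildDemo
{
    // Hypothetical helper: builds the text piece by piece. Each Append
    // writes into the builder's buffer instead of creating a new string.
    public static string BuildInSteps(int lineCount)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < lineCount; i++)
        {
            sb.Append("line ").Append(i).Append('\n');
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string text = BuildInSteps(100000);
        Console.WriteLine(text.Length + " characters");
    }
}
```

Doing the same with repeated string concatenation would copy the whole accumulated text on every step, which is the case where StringBuilder pays off.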
 
Scott M. said:
Still not a good idea if the string is to be built over several steps,
rather than just one assignment. StringBuilder would be better.

Why? I don't see anything in the OP's original post to indicate that
anything beyond a large string is required. Using that constructor is
the simplest way of generating such a string, IMO. In what way would
StringBuilder be better?
 
Willy said:
|
| > It won't work on a 64-bit system either; a string (like any other CLR
| > object) is limited to 2GB irrespective of the OS version.
|
| I have not had the luxury or necessity to test on a 64-bit system
| yet... :)
|

If you can't test it yourself, just read this snippet:
<First some background; in the 2.0 version of the .Net runtime (CLR) we made
a conscious design decision to keep the maximum object size allowed in the
GC Heap at 2GB, even on the 64-bit version of the runtime.>

from: http://blogs.msdn.com/joshwil/archive/2005/08/10/450202.aspx

Willy.

Another "who would ever need more than 640 kb" decision.... ;)
 
As stated, a StringBuilder would be better *if* the string is to be built up
over several expressions.
 
Scott M. said:
As stated, a StringBuilder would be better *if* the string is to be built up
over several expressions.

Better than a concatenation, yes. I don't think anyone was suggesting
that though.
 
Jon said:
Better than a concatenation, yes. I don't think anyone was suggesting
that though.

No, the only suggested alternative was using a string constructor that
creates a huge string filled with the same character.

That fulfills the only requirements that the OP had, i.e. that the
string should be large, and that the way to create it should be elegant.

Whether that solution really is sufficient for the test that the OP is
going to do depends on information about the test that was not revealed
in the original post.

Perhaps time (i.e. the OP) will tell... ;)
 
| Willy Denoyette [MVP] wrote:
| > | > It won't work on a 64-bit system either; a string (like any other CLR
| > | > object) is limited to 2GB irrespective of the OS version.
| > |
| > | I have not had the luxury or necessity to test on a 64-bit system
| > | yet... :)
| > |
| >
| > If you can't test it yourself, just read this snippet:
| > <First some background; in the 2.0 version of the .Net runtime (CLR) we made
| > a conscious design decision to keep the maximum object size allowed in the
| > GC Heap at 2GB, even on the 64-bit version of the runtime.>
| >
| > from: http://blogs.msdn.com/joshwil/archive/2005/08/10/450202.aspx
| >
| > Willy.
| >
|
| Another "who would ever need more than 640 kb" decision.... ;)

I don't see the link. Anyway, remember that reference type instances are
'moveable' unless they are larger than roughly 85 KB. Strings above that
size are stored on the large object heap, which never gets compacted,
simply because it's too costly to move such large blocks of memory (ever
thought about what it would take to move 2GB objects in memory?).
Note also that .NET is not a silver bullet: if you really need such large
contiguous strings, you'll have to allocate them from the process heap using
unmanaged code.
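One way to observe the large object heap indirectly is to ask the GC which generation an object belongs to. This is a sketch under the assumption that the runtime reports large-object-heap objects as the highest generation, which is how the CLR implementations I know of behave; LargeObjectHeapDemo and GenerationOf are hypothetical names.

```csharp
using System;

public static class LargeObjectHeapDemo
{
    // Thin wrapper over GC.GetGeneration, kept separate for clarity.
    public static int GenerationOf(object o)
    {
        return GC.GetGeneration(o);
    }

    public static void Main()
    {
        string small = new string('x', 16);         // well under the threshold
        string large = new string('x', 512 * 1024); // about 1 MB of char data

        Console.WriteLine("small: generation " + GenerationOf(small));
        Console.WriteLine("large: generation " + GenerationOf(large));
        Console.WriteLine("max generation: " + GC.MaxGeneration);
    }
}
```

If the assumption holds, the large string is reported in the maximum generation immediately after allocation, while the small one starts in a younger generation.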

Willy.
 