maximum string length in c# and .net
#1 |
|
Guest
Posts: n/a
|
Hi,
Having read this article:
http://www.codeproject.com/dotnet/s...0&select=773966

I got curious about the limit of string lengths, so I wrote this program:

    public static void Main(string[] args)
    {
        StringBuilder s = new StringBuilder();
        String adder = "A";
        for (int i = 0; i < 10000; i++)
        {
            adder += "A";
        }
        try
        {
            while (true)
            {
                System.Console.Out.WriteLine(s.Length * 2 + " - " + s.Length);
                s.Append(adder);
            }
        }
        catch (Exception exp)
        {
            System.Console.Out.WriteLine(exp.ToString());
        }
        finally
        {
            System.Console.Out.WriteLine(s.Length * 2 + " - " + s.Length);
            System.Console.In.ReadLine();
        }
    }

The program consistently ends with:

    System.OutOfMemoryException: Exception of type System.OutOfMemoryException was thrown.
    327,732,770 - 163,866,385

Does this mean that the max length of a string is 163,866,385 characters? Isn't it interesting that the length of the string is not a multiple of 10,000?

thanks,
fawce
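One detail of the posted program may account for the odd-looking length: the loop appends 10,000 characters to an initial "A", so `adder` ends up 10,001 characters long rather than 10,000. The sketch below only checks that arithmetic; the class and method names are made up for illustration and are not part of the original program.

```csharp
using System;

public static class AdderLengthCheck
{
    // Rebuilds the "adder" string the same way the posted program does:
    // one initial "A" plus 10,000 appended "A" characters.
    public static string BuildAdder()
    {
        string adder = "A";
        for (int i = 0; i < 10000; i++)
        {
            adder += "A";
        }
        return adder;
    }

    public static void Main()
    {
        string adder = BuildAdder();
        Console.WriteLine(adder.Length);              // 10001
        Console.WriteLine(163866385 % adder.Length);  // 0
        Console.WriteLine(163866385 / adder.Length);  // 16385
    }
}
```

So the reported length is an exact multiple of the chunk that is actually appended (16,385 appends of 10,001 characters each).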
|
|
|
#2 |
|
Guest
Posts: n/a
|
The string builder keeps doubling the string size ...

You happen to be dying during one of those doublings, when it is trying to allocate a new string of size 655465540. It's not that that is the maximum size; it is that it can't re-grow the internal string.

btw, if you put the following line for your constructor you can get further:

    StringBuilder s = new StringBuilder(455465540);

Cheers,
Greg Young
MVP - C#
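The doubling Greg describes can be modelled with a few lines of arithmetic. This is only a sketch of a buffer that doubles when it runs out of room, not the actual StringBuilder implementation (whose internals have changed between .NET versions); `DoublingGrowth` and its methods are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class DoublingGrowth
{
    // Lists the capacities a doubling buffer passes through until it can
    // hold targetLength characters. This models the behaviour described
    // in the reply, not StringBuilder's real growth policy.
    public static List<long> CapacitySteps(long initialCapacity, long targetLength)
    {
        List<long> steps = new List<long>();
        long capacity = initialCapacity;
        steps.Add(capacity);
        while (capacity < targetLength)
        {
            capacity *= 2;
            steps.Add(capacity);
        }
        return steps;
    }

    // While the contents are being copied, the old buffer and the new,
    // doubled buffer are both alive: old + 2 * old characters in total.
    public static long PeakCharsDuringGrow(long oldCapacity)
    {
        return oldCapacity + 2 * oldCapacity;
    }

    public static void Main()
    {
        List<long> steps = CapacitySteps(16, 200000000);
        Console.WriteLine("reallocations: " + (steps.Count - 1));
        Console.WriteLine("final capacity: " + steps[steps.Count - 1]);
        Console.WriteLine("peak chars during last grow: "
            + PeakCharsDuringGrow(steps[steps.Count - 2]));
    }
}
```

Under this model the last growth step needs far more free memory than the string itself occupies, which is consistent with the failure happening well below any hard length limit.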
|
|
|
#3 |
|
Guest
Posts: n/a
|
Greg,
Thanks for your reply.

I modified the test app to this:

    public static void Main(string[] args)
    {
        int size = (int)Math.Pow(2, 29);
        StringBuilder s = new StringBuilder(size);
        String adder = "A";
        for (int i = 0; i < 10000; i++)
        {
            adder += "A";
        }
        try
        {
            while (true)
            {
                System.Console.Out.WriteLine(s.Length * 2 + " - " + s.Length);
                s.Append(adder);
            }
        }
        catch (Exception exp)
        {
            System.Console.Out.WriteLine(exp.ToString());
        }
        finally
        {
            System.Console.Out.WriteLine(s.Length * 2 + " - " + s.Length + " - " + size);
            System.Console.In.ReadLine();
        }
    }

I experimented with the size variable, and I found that 2^29 was the max I could allocate without getting an out-of-memory error when creating the StringBuilder.

The result is now consistently:

    System.OutOfMemoryException: Exception of type System.OutOfMemoryException was thrown.
    1073727362 - 536863681 - 536870912

The VM size in Task Manager is 1.07G.

Using the default of 2^4, the result is:

    System.OutOfMemoryException: Exception of type System.OutOfMemoryException was thrown.
    327732770 - 163866385 - 16

and a VM size of 497M in Task Manager.

So, I guess this all just proves that the max string size is 2^29, but that in practice, the replication of the string data required to grow the StringBuilder can limit the usable string size to something much smaller?

thanks,
fawce
|
|
|
#4 |
|
Guest
Posts: n/a
|
fawcett@gmail.com wrote:
> So, I guess this all just proves that the max string size is 2^29, but
> that in practice, the replication of the string data required to grow
> the stringBuffer can limit the useable string size to something much
> smaller?

The maximum string size depends on a lot more in Win32, specifically memory fragmentation. If you've got lumps of memory pinned (possibly on other threads, for example, or by allocations using VirtualAlloc by unmanaged code), the compacting GC won't be able to make most of the available address space contiguous the way it wants to, so you could end up with a maximum string length a lot less than 0.5 billion characters.

The limits you're hitting are effectively 32-bit architectural limits, rather than .NET limits, although it is true that all GC languages like 2x to 3x more memory for efficient GC behaviour compared to languages using manual allocation. You'll see different results on 64-bit hardware, but it won't be much longer before you start hitting the upper limit of Int32.MaxValue, limiting the Length.

.NET strings aren't designed for this kind of use anyway. When you have data this large, you need to custom-design your algorithms and in-memory (and probably on-disk, for really huge data) structures around what you want to do with it all. Probably the most important thing is changing all algorithms that require whole-data access into ones that process the data linearly, possibly with reference to dictionaries created in earlier passes of the data. Consider the way compilers worked back in the '60s etc.

-- Barry
|
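As an illustration of Barry's suggestion to process data linearly instead of holding it all in one string, here is a minimal sketch that scans a text source in fixed-size blocks. The task (counting one character) and the names `LinearScan` and `CountChar` are invented for the example; the point is only that memory use stays constant regardless of input size.

```csharp
using System;
using System.IO;

public static class LinearScan
{
    // Counts occurrences of a character by reading fixed-size blocks,
    // so only one 8192-char buffer is ever held in memory.
    public static long CountChar(TextReader reader, char target)
    {
        char[] buffer = new char[8192];
        long count = 0;
        int read;
        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                if (buffer[i] == target)
                {
                    count++;
                }
            }
        }
        return count;
    }

    public static void Main()
    {
        // A StringReader stands in for a large input here; with real data
        // a StreamReader over a file or network stream would be passed.
        using (TextReader reader = new StringReader("AAABAA"))
        {
            Console.WriteLine(CountChar(reader, 'A')); // 5
        }
    }
}
```

Because the method takes a `TextReader`, the same code works for a file, a network stream, or an in-memory test string.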
|
|
#5 |
|
Guest
Posts: n/a
|
Interesting !!!
Well, let's look at the facts first. When you declare a string, .NET creates a contiguous chunk of memory for it. When I append characters to a string and the result is longer than the previously declared length, .NET creates a new in-memory string, copies the previous string into it, and appends the new characters. If that is how strings work, then I don't think there is a fixed limit on string length.

But I also wrote a program to check it, and found some very interesting facts.

    String a = new String('a', 10000);
    for (int loop = 0; loop < 10000; loop++)
        a += "a";

    string b = "";
    for (int loop = 0; loop < 10000; loop++)
    {
        b += a;
        Console.WriteLine(b.Length);
    }
    Console.ReadLine();

Results:

On my PC
First run: maximum size was 29340000
Second run: maximum size was 29340000

Then I ran it on another PC
First run: maximum size was 29340000
Second run: maximum size was 29340000

The results above are the same. I still don't think there is a limit on string length; although I got the same results on two PCs, these two PCs have exactly the same configuration. Strings depend entirely on memory chunks.

So, in a nutshell: maximum length of a string = whenever memory gets full!

I would ask you to run my code on your PC and post the results.
|
|
|
#6 |
|
Guest
Posts: n/a
|
This has been discussed a number of times recently in this NG.
The maximum size of all reference type instances (like a string) is limited by the CLR to 2GB, which means that a string can hold a maximum of ~1G characters. While it's possible to reach that limit when running on a 64-bit OS, you will never be able to create such large strings (or arrays) on a 32-bit OS. The reason is that you won't have that amount of "contiguous" address space available to create the backing store (a char array) for the string.

The size of the largest contiguous memory space depends heavily on how modules are mapped into the process address space (see: Win32 and framework DLLs' base addresses). Some modules are laid out in such a way that the largest chunk becomes something like 950,000 KB, and this is before you have even created a single object.

Lesson learned: always be prepared to get some OOM exceptions thrown at you if you don't care about your memory allocation patterns on 32-bit Windows (also true for unmanaged code!).

Willy.
|
|
|
#7 |
|
Guest
Posts: n/a
|
Hi,
Thanks for the explanations about win32 memory, the allocation and doubling effects make sense.

I wanted to point out a quote from the article I referenced in the first post:

<quote>m_stringLength int,
This is the logical length of the string, the one returned by String.Length. Because a number of high bits are used for additional flags to enhance performance, the maximum length of the string is constrained to a limit much smaller than UInt32.Max for 32bit systems. Some of these flags indicate the string contains simple characters such as plain ASCII and will not required invoking complex UNICODE algorithms for sorting and comparison tests.</quote>

In other words, the string object in .net has a member variable for its length (makes sense, keeps length checks fast). This variable imposes a natural limit to the string's length, because you can't track more than what the length variable can hold. This would have no effect whatsoever on a 32 bit system, because the max string length would exceed the max object size.

However, the length variable is also used to hold some control bits. From my test, I believe it is 3 bits, knocking the max string length down to 2^29 chars. So, that would be 2^29 chars x 2 bytes/char / 1024 bytes per KB / 1024 KB per MB / 1024 MB per GB, or 1 Gig, which is almost exactly the result I got from my StringBuilder test.

Even more interesting is that the uninitialized StringBuilder couldn't make it to the 1 Gig limit. From Willy's posting, I postulate that the limit of the uninitialized StringBuilder is capped because the Win32 memory manager can't easily allocate a very large, contiguous block of memory, having run through many allocations in the growth from a 16 char string to a very large string.

So to summarize:
- There is a 2G byte limit for Win32 objects. This is never the limiting factor for a string.
- There is a 2^29 length cap for a string from the construction of the length member variable. This is rarely the limiting factor, because StringBuilders are not often allocated with a size cap.
- There is a practical limit, dependent on the current system conditions, that limits the growth of strings to the largest contiguous block of memory available. This cap is deterministic, but unpredictable.

Conclusion:
- It is a terrible idea to use huge strings in .net or pretty much anywhere else. Data that large should be streamed to and from its source (disk, network, etc).

thanks for all your explanations,
fawce
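The size arithmetic in the post above can be checked mechanically. The sketch below assumes only that .NET strings store two bytes per character (UTF-16 code units); the 2^29 figure is the poster's own inference from his tests, not a documented limit, and `StringSizeMath` is a made-up name.

```csharp
using System;

public static class StringSizeMath
{
    // .NET strings store UTF-16 code units, two bytes each.
    public const long BytesPerChar = 2;

    // Size in bytes of the character data only (object header excluded).
    public static long PayloadBytes(long charCount)
    {
        return charCount * BytesPerChar;
    }

    public static void Main()
    {
        long cap = 1L << 29;             // 536,870,912 chars
        long bytes = PayloadBytes(cap);  // 1,073,741,824 bytes
        Console.WriteLine(bytes / (1024L * 1024L * 1024L)); // 1

        // Lengths reported in the thread, converted to bytes.
        Console.WriteLine(PayloadBytes(536863681)); // 1073727362
        Console.WriteLine(PayloadBytes(163866385)); // 327732770
    }
}
```

Both byte counts match the first column printed by the test programs earlier in the thread, which simply multiply `s.Length` by 2.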
|
|
|
#8 |
|
Guest
Posts: n/a
|
Hi,

"Willy Denoyette [MVP]" <willy.denoyette@telenet.be> wrote in message news:OzPHVxBdGHA.536@TK2MSFTNGP02.phx.gbl...
> This has been discussed a number of times recently in this NG.
> The maximum size of all reference type (like a string) instances is limited
> by the CLR to 2GB, that means that a string can hold a maximum of ~1G
> characters.

Last week was the last time I remember.

Honestly, I do not see why the maximum length of a string is so interesting to so many people. I would have other worries if I had to handle such a huge piece of data.

--
Ignacio Machin,
ignacio.machin AT dot.state.fl.us
Florida Department Of Transportation
|
|
|
#9 |
|
Guest
Posts: n/a
|
See inline...
Willy.

<fawcett@gmail.com> wrote in message news:1147267284.806167.199860@u72g2000cwu.googlegroups.com...
| Hi,
|
| Thanks for the explanations about win32 memory, the allocation and
| doubling effects make sense.
|
| I wanted to point out a quote from the article I referenced in the
| first post:
| <quote>m_stringLength int,
| This is the logical length of the string, the one returned by
| String.Length.
| Because a number of high bits are used for additional flags to enhance
| performance, the maximum length of the string is constrained to a limit
| much smaller than UInt32.Max for 32bit systems. Some of these flags
| indicate the string contains simple characters such as plain ASCII and
| will not required invoking complex UNICODE algorithms for sorting and
| comparison tests.</quote>
|
| In other words, the string object in .net has a member variable for its
| length (makes sense, keeps length checks fast). This variable imposes a
| natural limit to the string's length, because you can't track more than
| what the length variable can hold. This would have no effect whatsoever
| on a 32 bit system, because the max string length would exceed the max
| object size.
|
| However, the length variable is also used to hold some control bits.
| From my test, I believe it is 3 bits, knocking the max string length
| down to 2^29 chars. So, that would be 2^29 chars X 2 bytes/char / 1024
| byte per kb / 1024 kb per M / 1024 M per G
| or
| 1 Gig, which is almost exactly the result I got from my StringBuilder
| test.
|

You mean 1 Gig bytes I guess, that is 500M Char.

| Even more interesting, is that the uninitialized StringBuilder couldn't
| make it to the 1Gig limit. From Willy's posting, I postulate that the
| limit of the uninitialized StringBuilder is capped because the Win32
| memory manager can't easily allocate a very large, contiguous block of
| memory, having run through many allocations in the growth from a 16
| char string to a very large string.
|

The reason is the way the CLR expands the SB: the underlying char array has to be copied to a new contiguous memory block each time the SB expands. That means that in the end you need a free contiguous block of memory two times the size of the last SB object in order to be able to copy the contents to the new, expanded SB.

| So to summarize:
| - There is a 2G byte limit for Win32 objects. This is never the
| limiting factor for a string.

No, there is currently a 2GB limit for all CLR objects, no matter what OS you are running on (32 or 64 bit).

| - There is a 2^29 length cap for a string from the construction of the
| length member variable. This is rarely the limiting factor, because
| stringbuilders are not often allocated with a size cap.

You had better create a StringBuilder with the correct size (or somewhat bigger) if you can; the exponential growth can put high pressure on the GC, especially when using large strings.

| - There is a practical limit, dependent on the current system
| conditions, that limits the growth of strings to the largest
| contiguous block of memory available. This cap is deterministic, but
| unpredictable.
|
| Conclusion:
| - It is a terrible idea to use huge strings in .net or pretty much
| anywhere else. Data that large should be streamed to and from its
| source (disk, network, etc).
|

Yep, but this isn't limited to strings; the same applies to arrays and all other containers using arrays as backing store.

Willy.
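Willy's advice to create the StringBuilder with the correct size up front might look like the following sketch. `PresizedBuilder.Repeat` is a hypothetical helper; it assumes the final length is known before building starts, which is what makes pre-sizing possible.

```csharp
using System;
using System.Text;

public static class PresizedBuilder
{
    // Builds a string of `repeat` copies of `piece`, reserving the full
    // capacity in the constructor so no growth step is needed while
    // appending.
    public static string Repeat(string piece, int repeat)
    {
        // checked(...) turns a silent int overflow into an exception.
        int total = checked(piece.Length * repeat);
        StringBuilder sb = new StringBuilder(total);
        for (int i = 0; i < repeat; i++)
        {
            sb.Append(piece);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string result = Repeat("AB", 1000);
        Console.WriteLine(result.Length); // 2000
    }
}
```

When the final size is only roughly known, passing a somewhat larger estimate to the constructor follows the same idea at the cost of some unused capacity.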
|
|
|
#10 |
|
Guest
Posts: n/a
|
Ignacio,
The reason I find it interesting is that the practical limit for default StringBuilder and string behavior is much smaller than the theoretical limit. So, a 100 MB string with default handling can generate much more massive memory swings. Understanding why the VM size drastically exceeds the size of strings is important.

In .net, if you have a string in the megabyte range, you'll generate a lot of objects that end up on the large object heap as residue as you grow the string.

thanks,
fawce
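The large object heap behaviour mentioned above can be observed with `GC.GetGeneration`. This sketch assumes the commonly documented behaviour of the Microsoft CLR, where objects of roughly 85,000 bytes or more are allocated on the large object heap and reported as belonging to the highest generation; other runtimes or configurations may differ. The class name is made up for the example.

```csharp
using System;

public static class LargeObjectHeapDemo
{
    // Reports the GC generation of a freshly allocated string of the
    // given length (two bytes per character, plus a small header).
    public static int GenerationOfNewString(int length)
    {
        string s = new string('A', length);
        return GC.GetGeneration(s);
    }

    public static void Main()
    {
        // A small string: expected to start in a young generation.
        Console.WriteLine(GenerationOfNewString(100));

        // About 200 KB of character data: above the assumed LOH
        // threshold, so it should report GC.MaxGeneration.
        Console.WriteLine(GenerationOfNewString(100000));
        Console.WriteLine(GC.MaxGeneration);
    }
}
```

Each intermediate copy made while a megabyte-range string grows is itself a large object of this kind, which is the "residue" fawce refers to.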
|