There has to be a better way to work with string chars

Thomas T. Veldhouse

Willy Denoyette said:
| >> > The calls to substring IMO are going to be a little more costly.
| >>
| >> StringBuilder is probably not internally implemented in C#, but rather in
| >> native code; thus, the allocating of memory and copying bytes will be
| >> optimized as opposed to doing it with substrings in C#. Most C/C++ developers
| >> likely understand this implicitly.
| >
| > Very, very little of StringBuilder is implemented directly in native
| > code, as far as I can tell with Reflector.
| >
|
| Reflector as in "reflection". As far as I am concerned, it would only have a
| use in StringBuilder.AppendFormat(...). Otherwise, I expect calls like
| memcpy, wstrcpy, etc (or equivalents that work with BSTR). My point is that
| such manipulation is FAR faster in native code with direct memory access, so
| implementation in this manner makes sense.
|
| --
| Thomas T. Veldhouse
| Key Fingerprint: 2DB9 813F F510 82C2 E1AE 34D0 D69D 1EDC D5EC AED1
|

No, V2 FCL StringBuilder is almost completely implemented in C#, using unsafe
code constructs for performance-critical paths. It doesn't even rely upon
the CRT. All there is are a couple of internal calls (into the runtime), for
instance to allocate the initial string.
But all this is moot, as mscorlib is native code (ngen'd) anyway.

Bingo. If you have access, care to share the code? If there are calls into
mscorlib, I think we are very likely agreeing.
 
Jon Skeet [C# MVP]

Thomas T. Veldhouse said:
I am indicating that the code in StringBuilder that creates the final output
string (ToString()) will use native code (via unsafe code, or perhaps Windows
or C API calls).

You really need to understand the difference between native code and
unsafe code. They're completely different.

Anyway, I see no evidence that StringBuilder.ToString uses native code
any more than calls to String.Substring does. Both use
FastAllocateString eventually, which *is* an external call, but that's
about it as far as I can tell. If you believe there's more to it than
that, it would be nice if you'd present some evidence.
Unsafe means it is not protected from memory issues. Any calls to native
compiled code or C/Windows API will be unsafe by nature.

No, that will be unmanaged. Not unsafe in the normal understanding of
it in the context of .NET.
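
To make the distinction concrete, here is a minimal sketch. The class and method names are made up for illustration, and the P/Invoke target (GetTickCount in kernel32.dll) is just a convenient Windows API to call; none of this is taken from the framework source. The first method is "unsafe" in the C# sense: it uses pointers and must be compiled with /unsafe, but it is still IL running under the CLR. The second declaration is a call into unmanaged code.

```csharp
using System;
using System.Runtime.InteropServices;

public static class UnsafeVersusUnmanaged
{
    // Unsafe code: raw pointer access, yet still compiled to IL and managed
    // by the CLR. Requires the /unsafe compiler switch.
    public static unsafe string LowerFirstUnsafe(string s)
    {
        if (string.IsNullOrEmpty(s))
        {
            return s;
        }
        char[] buffer = s.ToCharArray();
        fixed (char* p = buffer)   // pin the array so the GC cannot move it
        {
            *p = char.ToLowerInvariant(*p);
        }
        return new string(buffer);
    }

    // Unmanaged code: a P/Invoke declaration for a native Windows API.
    [DllImport("kernel32.dll")]
    private static extern uint GetTickCount();

    public static void Main()
    {
        Console.WriteLine(LowerFirstUnsafe("California"));   // prints "california"

        // Only attempt the native call on Windows, where kernel32.dll exists.
        if (Environment.OSVersion.Platform == PlatformID.Win32NT)
        {
            Console.WriteLine(GetTickCount());
        }
    }
}
```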
My reference to C/C++ developers inherently understanding seems to have
stirred a few people up.

Not at all - your incorrect guess at the implementation of
StringBuilder was what stirred me up.
It was NOT a dig at other programmers; it was simply
a statement that C/C++ developers understand programming using memory
management and how to reference memory, free it, allocate it (both stack and heap
allocation) and how to move it around. It is clear what must happen, and it
seems unlikely there would be a lot of gain from using StringBuilder without such
calls made to native code to do this.

That's where you're wrong. The principal reason for using StringBuilder
is to avoid making copies of strings unnecessarily. See
http://www.pobox.com/~skeet/csharp/stringbuilder.html for more of an
explanation. The optimisation is to do with how much copying is
required, not how fast each copy is.
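
A rough illustration of that point, with hypothetical class and method names: both methods below produce the same string, but the first allocates a new string and copies everything accumulated so far on every iteration, while the second appends into a growable buffer.

```csharp
using System;
using System.Text;

public static class ConcatDemo
{
    // Repeated concatenation: each += allocates a new string and copies
    // all the characters accumulated so far.
    public static string JoinWithConcat(string[] parts)
    {
        string result = "";
        foreach (string part in parts)
        {
            result += part;
        }
        return result;
    }

    // StringBuilder appends into a growable buffer, so the earlier
    // characters are not recopied on every append.
    public static string JoinWithBuilder(string[] parts)
    {
        StringBuilder sb = new StringBuilder();
        foreach (string part in parts)
        {
            sb.Append(part);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string[] parts = { "a", "b", "c", "d" };
        Console.WriteLine(JoinWithConcat(parts));    // prints "abcd"
        Console.WriteLine(JoinWithBuilder(parts));   // prints "abcd"
    }
}
```

For a fixed handful of pieces, as in the original question, plain concatenation is fine; the difference only matters when the number of appends grows.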
 
Willy Denoyette [MVP]

Agreed for Replace, which performs an internal call, but not for Append (at
least, I don't see an internal call or a call into native code libraries there).
StringBuilder (like some other methods) was completely rewritten for V2,
favoring safety over speed. That means not relying on native CRT libraries
(using PInvoke), which could compromise this safety, so all you will see are
some internal calls, that's all.

Willy.

| To add more fuel to the fire, the calls that ultimately call unmanaged
| code are Append and Replace. Insert and Remove are completely managed calls
| (ultimately).
|
| --
| - Nicholas Paldino [.NET/C# MVP]
| - (e-mail address removed)
|
| >> > Thomas,
| >> >
| >> > In that case, you would be wrong. Stringbuilder is a combination of
| >> > managed code, unsafe code (still managed) and calls to API functions.
| >> > Depending on the operation, the strings are copied or modified with one
| >> > of these three methods.
| >>
| >> That is exactly what I was saying ... implemented in native code ... API
| >> is native C.
| >
| > Well, which bit are you suggesting is in native code? Looking in
| > Reflector I can't see many calls to native methods... I dare say there
| > may be *some* bits which use native calls, but that's a long way from
| > what you originally said.
| >
| >> > As for the implicit understanding statement. I guess that would be
| >> > subject to review as well =)
| >> >
| >>
| >> No ... I think we agree ... API calls [and generally unsafe code as well]
| >> is going to be compiled native code, such as compiled C [as is the case with
| >> the Windows API].
| >
| > Unsafe code isn't the same as unmanaged code. Normally "unsafe" refers
| > to code which is still IL.
| >
| > --
| > Jon Skeet - <[email protected]>
| > http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
| > If replying to the group, please do not mail me too
|
 
Jon Skeet [C# MVP]

Thomas T. Veldhouse said:
Bingo. If you have access, care to share the code? If there are calls into
mscorlib, I think we are very likely agreeing.

I think you misunderstood Willy's post. mscorlib is the assembly where
String and StringBuilder live - but they're both implemented (almost
entirely) in IL. Now, they're pre-compiled to native code, but that's
really just another type of JIT - it's not like .NET code is ever
interpreted. The majority of the code is still managed code, and could
well be implemented in C#, contrary to your original guess.
 
Willy Denoyette [MVP]

| Bingo. If you have access, care to share the code? If there are calls into
| mscorlib, I think we are very likely agreeing.


StringBuilder does not call into mscorlib; mscorlib is the assembly that
holds StringBuilder (and all the core framework classes). It is compiled to
native code (ngen'd) at framework install time.
So if you want to see the code, take a look at it using Reflector or ildasm.
Note that most assemblies have been pre-compiled (ngen'd) into native code since
V2.

Willy.
 
Willy Denoyette [MVP]

| > > No, V2 FCL StringBuilder is almost completely implemented in C# using unsafe
| > > code constructs for performance critical path's. It doesn't even rely upon
| > > the CRT. All there is is a couple of internal calls (the runtime), for
| > > instance to allocate the initial string.
| > > But all this is moot as mscorlib is native code (ngen'd) anyway.
| >
| > Bingo. If you have access, care to share the code? If there are calls into
| > mscorlib, I think we are very likely agreeing.
|
| I think you misunderstood Willy's post. mscorlib is the assembly where
| String and StringBuilder live - but they're both implemented (almost
| entirely) in IL. Now, they're pre-compiled to native code, but that's
| just really just another type of JIT - it's not like .NET code is ever
| interpreted. The majority of the code is still managed code, and could
| well be implemented in C#, contrary to your original guess.
|
| --

That's correct. I find it amusing when people say "... running managed code
...": at program execution time, everything that gets executed is native
code. We still don't have CPUs that can run IL ;-)

Willy.
 
Kevin Spencer

I am bamboozled that nobody has mentioned using
System.Globalization.TextInfo.ToTitleCase
(http://msdn2.microsoft.com/en-us/library/system.globalization.textinfo.totitlecase.aspx).

--
HTH,

Kevin Spencer
Microsoft MVP
Software Composer

A watched clock never boils.

sklett said:
Wow, I turn my head for 10 minutes and come back to a ton of posts, neat!

Thanks all for the info, nice little thread. It helped me out.
In case anyone's wondering, I'm leaving my code as is for now, but am going
to play with StringBuilder some more to see what other uses I have for it!

sklett said:
I need to take a string, make the first character lowercase and prepend an
underscore character to it. Something like this:
"California"
"_california"

Here is the (ugly) solution I came up with:
Code:
public static string MakeNetSuiteStateName(string normalStateName)
{
    string firstChar = normalStateName.Substring(0, 1);
    string restOfName = normalStateName.Substring(1);
    string newName = "_" + firstChar.ToLower() + restOfName;
    return newName;
}

There must be a better way... is there?
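
One possible tidier version, offered as a sketch rather than the definitive answer: index the first character directly instead of taking a one-character substring. The StateNames class name, the use of ToLowerInvariant instead of ToLower, and returning "_" for null or empty input are all assumptions added for illustration; the original code throws on an empty string.

```csharp
using System;

public static class StateNames
{
    public static string MakeNetSuiteStateName(string normalStateName)
    {
        // Assumption: null or empty input yields just the underscore.
        if (string.IsNullOrEmpty(normalStateName))
        {
            return "_";
        }

        // Lowercase only the first character, then append the rest unchanged.
        return "_" + char.ToLowerInvariant(normalStateName[0])
                   + normalStateName.Substring(1);
    }

    public static void Main()
    {
        Console.WriteLine(MakeNetSuiteStateName("California"));   // prints "_california"
    }
}
```

This still makes one Substring call and one concatenation, so it is a readability change more than a performance one.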
 
Nicholas Paldino [.NET/C# MVP]

Willy,

You are right, I saw the call to string.wstrcpy and I just assumed it
was a P/Invoke call. That being said, of the four main operations, Replace
is the only one I see which is an internal call.
 
Willy Denoyette [MVP]

| Willy,
|
| You are right, I saw the call to string.wstrcpy and I just assumed it
| was a P/Invoke call. That being said, of the four main operations, Replace
| is the only one I see which is an internal call.
|
|

Nicholas ,

True, wstrcpy is one of the methods (there are others) that is a
(simplified) copy of the safe CRT counterpart.

Willy.
 
Dave Sexton

Hi Kevin,

Well, it doesn't address the problem, and anyway the docs say that its behavior is not guaranteed and that it might change in the
future!
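
A quick sketch of why it doesn't fit, assuming the invariant culture (the TitleCaseDemo class and ToTitle helper are made-up names): ToTitleCase capitalizes the first letter of each word, which is the opposite of lowercasing it, and it has no notion of the underscore prefix.

```csharp
using System;
using System.Globalization;

public static class TitleCaseDemo
{
    public static string ToTitle(string s)
    {
        // Assumption: invariant culture; a real application might use
        // the current culture's TextInfo instead.
        return CultureInfo.InvariantCulture.TextInfo.ToTitleCase(s);
    }

    public static void Main()
    {
        Console.WriteLine(ToTitle("california"));   // prints "California"
        Console.WriteLine(ToTitle("California"));   // prints "California"
    }
}
```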

--
Dave Sexton

Kevin Spencer said:
I am bamboozled that nobody has mentioned using System.Globalization.TextInfo.ToTitleCase
(http://msdn2.microsoft.com/en-us/library/system.globalization.textinfo.totitlecase.aspx).

 
Kevin Spencer

Darn! I read most of the messages several times, but what I expected to see
must have affected my perception. Never mind!

--
:p,

Kevin Spencer
Microsoft MVP
Software Composer

A watched clock never boils.
 
Christof Nordiek

Thomas T. Veldhouse said:
Nicholas Paldino said:
Thomas,

In that case, you would be wrong. Stringbuilder is a combination of
managed code, unsafe code (still managed) and calls to API functions.
Depending on the operation, the strings are copied or modified with one of
these three methods.

That is exactly what I was saying ... implemented in native code ... API is
native C.
As for the implicit understanding statement. I guess that would be
subject to review as well =)

No ... I think we agree ... API calls [and generally unsafe code as well]
is going to be compiled native code, such as compiled C [as is the case with
the Windows API].

Managed code will be native code as well, won't it?
 
Jon Skeet [C# MVP]

Christof said:
No ... I think we agree ... API calls [and generally unsafe code as well]
is going to be compiled native code, such as compiled C [as is the case with
the Windows API].
Managed code will be native code as well, won't it?

By the time it's executed, yes.

Jon
 
